A First Course in Spectral Theory [1 ed.] 2022028354, 9781470466565, 9781470471927, 9781470471910

The central topic of this book is the spectral theory of bounded and unbounded self-adjoint operators on Hilbert spaces.


English Pages 472 [492] Year 2022

Table of contents :
Contents
Preface
Chapter 1. Measure theory
1.1. 𝜎-algebras and monotone classes
1.2. Measures and Carathéodory’s theorem
1.3. Borel 𝜎-algebra on the real line and related spaces
1.4. Lebesgue integration
1.5. Lebesgue–Stieltjes measures on ℝ
1.6. Product measures
1.7. Functions on 𝜎-locally compact spaces
1.8. Regularity of measures
1.9. The Riesz–Markov theorem
1.10. Exercises
Chapter 2. Banach spaces
2.1. Norms and Banach spaces
2.2. The Banach space 𝐶(𝐾)
2.3. 𝐿^{𝑝} spaces
2.4. Bounded linear operators and uniform boundedness
2.5. Weak-* convergence and the separable Banach–Alaoglu theorem
2.6. Banach-space valued integration
2.7. Banach-space valued analytic functions
2.8. Exercises
Chapter 3. Hilbert spaces
3.1. Inner products
3.2. Subspaces and orthogonal projections
3.3. Direct sums of Hilbert spaces
3.4. Orthonormal sets and orthonormal bases
3.5. Weak convergence
3.6. Tensor products of Hilbert spaces
3.7. Exercises
Chapter 4. Bounded linear operators
4.1. The 𝐶*-algebra of bounded linear operators on ℋ
4.2. Strong and weak operator convergence
4.3. Invertibility, spectrum, and resolvents
4.4. Polynomials of operators
4.5. Invariant subspaces and direct sums of operators
4.6. Compact operators
4.7. Exercises
Chapter 5. Bounded self-adjoint operators
5.1. A first look at self-adjoint operators
5.2. Spectral theorem for compact self-adjoint operators
5.3. Spectral measures
5.4. Spectral theorem on a cyclic subspace
5.5. Multiplication operators
5.6. Spectral theorem on the entire Hilbert space
5.7. Borel functional calculus
5.8. Spectral theorem for unitary operators
5.9. Exercises
Chapter 6. Measure decompositions
6.1. Pure point and continuous measures
6.2. Singular and absolutely continuous measures
6.3. Hausdorff measures on ℝ
6.4. Matrix-valued measures
6.5. Exercises
Chapter 7. Herglotz functions
7.1. Möbius transformations
7.2. Schur functions and convergence
7.3. Carathéodory functions
7.4. The Herglotz representation
7.5. Growth at infinity and tail of the measure
7.6. Half-plane Poisson kernel and Stieltjes inversion
7.7. Pointwise boundary values
7.8. Meromorphic Herglotz functions
7.9. Exponential Herglotz representation
7.10. The Phragmén–Lindelöf method and asymptotic expansions
7.11. Matrix-valued Herglotz functions
7.12. Weyl matrices and Dirichlet decoupling
7.13. Exercises
Chapter 8. Unbounded self-adjoint operators
8.1. Graphs and adjoints
8.2. Resolvents and self-adjointness
8.3. Unbounded multiplication operators and direct sums
8.4. Spectral measures and the spectral theorem
8.5. Borel functional calculus
8.6. Absolutely continuous functions and derivatives on intervals
8.7. Self-adjoint extensions and symplectic forms
8.8. Exercises
Chapter 9. Consequences of the spectral theorem
9.1. Maximal spectral measure
9.2. Spectral projections
9.3. Spectral type and spectral decompositions
9.4. Ruelle–Amrein–Georgescu–Enss (RAGE) theorem
9.5. Essential and discrete spectrum; the min-max principle
9.6. Spectral multiplicity
9.7. Stone’s theorem
9.8. Fourier transform on ℝ
9.9. Abstract eigenfunction expansions
9.10. Exercises
Chapter 10. Jacobi matrices
10.1. The canonical spectral measure and Favard’s theorem
10.2. Unbounded Jacobi matrices
10.3. Weyl solutions and 𝑚-functions
10.4. Transfer matrices and Weyl disks
10.5. Full-line Jacobi matrices
10.6. Eigenfunction expansion for full-line Jacobi matrices
10.7. The Weyl 𝑀-matrix
10.8. Subordinacy theory
10.9. A Combes–Thomas estimate and Schnol’s theorem
10.10. The periodic discriminant and the Marchenko–Ostrovski map
10.11. Direct spectral theory of periodic Jacobi matrices
10.12. Exercises
Chapter 11. One-dimensional Schrödinger operators
11.1. An initial value problem
11.2. Fundamental solutions and transfer matrices
11.3. Schrödinger operators with two regular endpoints
11.4. Endpoint behavior
11.5. Self-adjointness and separated boundary conditions
11.6. Weyl solutions and Green’s functions
11.7. Weyl solutions and 𝑚-functions
11.8. The half-line eigenfunction expansion
11.9. Weyl disks and applications
11.10. Asymptotic behavior of 𝑚-functions
11.11. The local Borg–Marchenko theorem
11.12. Full-line eigenfunction expansions
11.13. Subordinacy theory
11.14. Potentials bounded below in an 𝐿¹_{𝑙𝑜𝑐} sense
11.15. A Combes–Thomas estimate and Schnol’s theorem
11.16. The periodic discriminant and the Marchenko–Ostrovski map
11.17. Direct spectral theory of periodic Schrödinger operators
11.18. Exercises
Bibliography
Notation Index
Index


Author / Uploaded
Milivoje Lukić


GRADUATE STUDIES IN MATHEMATICS

226

A First Course in Spectral Theory

Milivoje Lukić


EDITORIAL COMMITTEE
Matthew Baker
Marco Gualtieri
Gigliola Staffilani (Chair)
Jeff A. Viaclovsky
Rachel Ward

2020 Mathematics Subject Classification. Primary 47B15, 47B25, 47B02, 47B36, 34L40, 36C05.

For additional information and updates on this book, visit www.ams.org/bookpages/gsm-226

Library of Congress Cataloging-in-Publication Data
Names: Lukić, Milivoje, 1984– author.
Title: A first course in spectral theory / Milivoje Lukić.
Description: Providence, Rhode Island : American Mathematical Society, [2022] | Series: Graduate studies in mathematics, 1065-7339 ; 226 | Includes bibliographical references and index.
Identifiers: LCCN 2022028354 | ISBN 9781470466565 (hardcover) | ISBN 9781470471927 (paperback) | ISBN 9781470471910 (ebook)
Subjects: LCSH: Spectral theory (Mathematics)–Textbooks. | AMS: Operator theory – Special classes of linear operators – Hermitian and normal operators (spectral measures, functional calculus, etc.). | Operator theory – Special classes of linear operators – Symmetric and selfadjoint operators (unbounded). | Operator theory – Special classes of linear operators – Jacobi (tridiagonal) operators (matrices) and generalizations. | Ordinary differential equations – Ordinary differential operators – Particular operators (Dirac, one-dimensional Schrödinger, etc.). | Partial differential equations – Elliptic equations and systems – Schrödinger operator. | Functional analysis – Inner product spaces and their generalizations, Hilbert spaces – Hilbert and pre-Hilbert spaces: geometry and topology (including spaces with semidefinite inner product).
Classification: LCC QC20.7.S64 L85 2022 | DDC 515/.7222–dc23/eng20221013
LC record available at https://lccn.loc.gov/2022028354

Copying and reprinting. Individual readers of this publication, and nonprofit libraries acting for them, are permitted to make fair use of the material, such as to copy select pages for use in teaching or research. Permission is granted to quote brief passages from this publication in reviews, provided the customary acknowledgment of the source is given. Republication, systematic copying, or multiple reproduction of any material in this publication is permitted only under license from the American Mathematical Society. Requests for permission to reuse portions of AMS publication content are handled by the Copyright Clearance Center. For more information, please visit www.ams.org/publications/pubpermissions. Send requests for translation rights and licensed reprints to [email protected].

© 2022 by the author. All rights reserved. The American Mathematical Society retains all rights except those granted to the United States Government. Printed in the United States of America.

∞ The paper used in this book is acid-free and falls within the guidelines established to ensure permanence and durability. Visit the AMS home page at https://www.ams.org/

10 9 8 7 6 5 4 3 2 1    28 27 26 25 24 23

To my teachers and my students


Preface

Spectral theory can be viewed as a generalization of linear algebra with a focus on linear operators on infinite-dimensional spaces. However, it is a branch of mathematical analysis that has its roots in the Fourier decomposition of a periodic function into sines and cosines. Those sines and cosines are solutions of the boundary value problem $-f'' = \lambda f$, $f(0) = f(2\pi)$, $f'(0) = f'(2\pi)$. In modern language, they are eigenvectors of a differential operator (second derivative on an interval with periodic boundary conditions), acting on a suitable space of functions (which is an infinite-dimensional vector space). Modern spectral theory studies classes of recurrence and differential operators which are motivated by mathematical physics, orthogonal polynomials, partial differential equations, and integrable systems.

This text is intended as a first course in spectral theory, with a focus on the general theory of self-adjoint operators on separable Hilbert spaces and the direct spectral theory of Jacobi matrices and one-dimensional Schrödinger operators. It has been written as a textbook for three adjacent purposes: (a) an undergraduate course on bounded self-adjoint operators, (b) a first course for graduate students interested in the spectral theory of bounded and unbounded self-adjoint operators, (c) a topics course on continuum one-dimensional Schrödinger operators.

The intended audience for this text includes beginning graduate students and advanced undergraduates, so the text is written with minimal prerequisites. It is assumed that the reader knows linear algebra and basic analysis, including basic complex analysis. In an effort to keep the text accessible, we avoid unnecessary abstractions and get by without topology.

Measure theory is not assumed as a prerequisite; the required background in measure theory is developed in Chapter 1, including some specialized results needed for our purposes (e.g., a criterion for a subalgebra of bounded Borel functions to be the entire algebra, used for the proof of uniqueness of the Borel functional calculus for self-adjoint operators).

Chapter 2 introduces Banach spaces; these are vector spaces equipped with a norm (a suitable notion of length of vectors) which are complete. This chapter is a nonstandard introduction to functional analysis shaped by a spectral theorist’s needs: it includes a discussion of Banach space valued integrals, Banach space valued analytic functions, and important examples of Banach spaces, without going deep into abstract Banach space theory.

Chapter 3 introduces Hilbert spaces, which are a special case of Banach space equipped with an inner product (an abstract version of a dot product). The chapter includes infinite direct sums of Hilbert spaces (needed for the multiplication operator form of the spectral theorem) and tensor products.

Chapter 4 describes the general structure and properties of bounded linear operators on Hilbert spaces; this provides the basic language for the remainder of the text.

Chapter 5 begins the study of bounded self-adjoint operators. Self-adjoint operators can be viewed as an infinite-dimensional generalization of Hermitian matrices, and this chapter can be viewed as a generalization of diagonalizability of Hermitian matrices. Spectral measures are introduced and two central results are proved; namely, the spectral theorem and the Borel functional calculus.
The spectral theorem is presented in multiplication operator form, which we find more useful and intuitive (we introduce and use spectral projections later in this text, but we do not use integration with respect to projection-valued measures or the historically more common approach via resolution of the identity). The Borel functional calculus is constructed using the spectral theorem.

Chapter 6 presents several measure decompositions (continuous/pure point, absolutely continuous/singular, and decompositions with respect to Hausdorff measures) and pointwise descriptions of these decompositions. This is part of the standard vocabulary of spectral theory, where continuity properties of spectral measures are of great importance.

One of the goals of this text is to present the spectral theory of self-adjoint operators from the ground up as a correspondence of three main objects: self-adjoint operators, their spectral measures (which are measures on $\mathbb{R}$), and Herglotz functions (which are complex-analytic functions from the upper half-plane to itself). Accordingly, Herglotz functions are introduced in Chapter 7. Through an integral representation, they are related to measures on $\mathbb{R}$, and this chapter studies this correspondence. This may seem like a detour from spectral theory, but the truth is quite the opposite: although Chapter 7 doesn’t mention operators, we will see that it contains the hard parts of proofs of important spectral theoretic results.

In Chapter 8, we study unbounded self-adjoint operators, culminating in their spectral theorem and Borel functional calculus. The presentation is independent from the bounded case, although the bounded case serves as a strong motivation. Many techniques from the bounded case have suitable analogues or restatements in the unbounded case, but there are technical complications. This chapter includes the study of symplectic forms over the complex field of scalars and a description of self-adjoint extensions of a symmetric operator.

Chapter 9 can be read as a continuation of Chapter 5 or of Chapter 8. It describes many general consequences of the spectral theorem and the Borel functional calculus, such as spectral type, spectral multiplicity, etc., which are part of the basic language of spectral theory. It contains a study of Stone’s theorem and its applications to constructing diagonalizations of differential operators; for instance, we provide a self-contained introduction to the Fourier transform on $L^2(\mathbb{R})$ through the problem of diagonalizing the derivative $-i\,\frac{d}{dx}$ viewed as an unbounded self-adjoint operator on $\mathbb{R}$. This approach is constructive and based on Stone’s theorem, and it serves as a warm-up for eigenfunction expansions of Schrödinger operators.

Chapter 10 discusses bounded and unbounded Jacobi matrices, which are a well-studied class of self-adjoint operators corresponding to a second-order recurrence relation on $\ell^2(\mathbb{N})$ and $\ell^2(\mathbb{Z})$.
While they can be viewed as an extended example for general spectral theory, their connections to orthogonal polynomials and mathematical physics make them a classical subject of their own; we present some of their general properties and techniques for their study. We emphasize the correspondence with Weyl $m$-functions and use Weyl disks as a robust way of deriving approximation results, such as Carmona’s theorem. The chapter includes subordinacy theory, eigenfunction expansions for full-line Jacobi matrices, and the Weyl $M$-matrix approach. Finally, we present the direct spectral theory of periodic Jacobi matrices, using the Marchenko–Ostrovski map as a central object.

Chapter 11 is dedicated to one-dimensional Schrödinger operators $-\frac{d^2}{dx^2} + V$, considered on a finite or infinite interval, with locally integrable potentials $V$. The chapter starts with self-adjointness and the limit point-limit circle alternative. Eigenfunction expansions are introduced constructively, using Stone’s theorem. Weyl disks are used to derive various approximation results, including Carmona’s formula and continuity of $m$-functions under $L^1_{\mathrm{loc}}$ perturbations of the potential. We also prove the local Borg–Marchenko theorem, asymptotic behavior of the $m$-functions, and Schnol’s theorem. We conclude this chapter with the direct spectral theory of periodic Schrödinger operators studied via the Marchenko–Ostrovski map.

The book can of course be read cover to cover, but various selections of the material are possible. For instance, beyond the introductory chapters, we suggest the following.

• A course on bounded self-adjoint operators can contain Chapters 4 and 5 and Section 10.1. It can continue, time permitting, with Sections 6.1–6.2 and Sections 9.1–9.6.
• A course on unbounded self-adjoint operators can contain Chapter 4, Sections 7.1–7.5, Chapter 8, and a selection of topics from Chapters 9, 10, 11.
• A course on Jacobi or Schrödinger operators can be based on the corresponding Chapter 10 or 11. It requires Chapter 5 or Chapter 8 as a prerequisite; it is also heavily reliant on Chapters 6, 7, 9, which can be studied in preparation or in parallel with Chapter 10 or 11.

Many analytical tools are developed in Chapters 6 and 7, applied in Chapter 9 to self-adjoint operators, then refined in more specialized settings in Chapters 10 and 11. They can be studied by taking cross-sections of different chapters.

I would like to thank Ilia Binder, David Damanik, Ana Djurdjevac, Benjamin Eichinger, Jake Fillman, Fritz Gesztesy, Manuela Girotti, Michael Goldstein, Ethan Gwaltney, Svetlana Jitomirskaya, Ilya Marchenko, Shaan Nagy, Maria Ntekoume, Barry Simon, Selim Sukhtaiev, Chunyi Wang, Xingya Wang, Ronen Wdowinski, Bohan Wu, Chengcheng Yang, Giorgio Young, Peter Yuditskii, and Maxim Zinchenko for helpful discussions and valuable feedback which improved this book.

Chapter 1

Measure theory

The subject of this chapter is the Lebesgue theory of measures and integration. This is one of the foundations of modern analysis; compared to Riemann integration, it includes a much wider class of functions which can be integrated and has better behavior with respect to limits.

It is a classical idea to measure the size of a set by a positive number. A notion of size, such as the number of elements, length, or area, is intuitively expected to be additive for disjoint sets. A key idea in Lebesgue theory is that additivity should also hold for countable disjoint families, i.e.,

$$\mu\left(\bigcup_{n=1}^{\infty} A_n\right) = \sum_{n=1}^{\infty} \mu(A_n)$$

if $A_n \cap A_k = \emptyset$ whenever $n \neq k$. This stronger property, called σ-additivity, will be part of the definition of a measure; it leads to good behavior of measures and integrals with respect to limits of sequences.

Another fundamental question is which sets should be measured; this is captured by the notion of a σ-algebra. We will quickly specialize to the setting of Borel sets and Borel functions on metric spaces. This class is large enough to contain the sets which occur in our work, while avoiding some foundational paradoxes and topological distractions.

Our choice of topics is shaped by the goals of this text; many other texts on measure theory are available [32, 81, 97, 99].

1.1. σ-algebras and monotone classes

Let $X$ be a nonempty set. Our first definition describes classes of subsets of $X$ which are closed under certain set operations. We denote by $\mathcal{P}(X)$ the set of all subsets of $X$, and we denote complements of subsets of $X$ by $A^c = X \setminus A$ when there is no risk of confusion.

Definition 1.1. A σ-algebra on a set $X$ is a family $\mathcal{A} \subset \mathcal{P}(X)$ that obeys
(a) $\emptyset \in \mathcal{A}$;
(b) $A \in \mathcal{A}$ implies $A^c \in \mathcal{A}$;
(c) for any sequence $(A_j)_{j=1}^{\infty}$ such that $A_j \in \mathcal{A}$ for all $j$, $\bigcup_{j=1}^{\infty} A_j \in \mathcal{A}$.

Some authors replace (a) by the condition $X \in \mathcal{A}$; by (b), this is equivalent to our definition, since $\emptyset = X^c$ and $X = \emptyset^c$.

The definition has some easy consequences. By (b) and (c), if $A_j \in \mathcal{A}$ for all $j \in \mathbb{N}$, then $\bigcap_{j=1}^{\infty} A_j = \left(\bigcup_{j=1}^{\infty} A_j^c\right)^c \in \mathcal{A}$. Of course, (c) also holds for finite unions, since we can take some of the $A_j$ to be $\emptyset$. Thus, $A_1, A_2 \in \mathcal{A}$ implies $A_1 \cup A_2 \in \mathcal{A}$ and $A_1 \cap A_2 \in \mathcal{A}$. It also implies $A_1 \setminus A_2 = A_1 \cap A_2^c \in \mathcal{A}$. Informally speaking, $\mathcal{A}$ is closed under finite and countable set operations.

Example 1.2. For any set $X$, $\mathcal{A} = \mathcal{P}(X)$ is a σ-algebra on $X$.

Example 1.3. For any set $X$, $\mathcal{A} = \{\emptyset, X\}$ is a σ-algebra on $X$.

It is common in mathematics to obtain a minimal set with some property by showing that there exists a set with the property, and that intersections of sets with the property also have the property. For example, the closure of a set $B$ in a metric space $X$ can be defined as the intersection of all closed sets in $X$ that contain $B$, because an arbitrary intersection of closed sets is closed, and the whole space is closed. We are about to make an analogous construction for σ-algebras. It is important that the following result holds for the intersection of an arbitrary (not only countable) collection of σ-algebras.

Lemma 1.4. The intersection of an arbitrary nonempty collection of σ-algebras on $X$ is a σ-algebra on $X$.

Proof. Let $\mathcal{A}_\gamma$, $\gamma \in \Gamma$, be σ-algebras on $X$, and let $\mathcal{A} = \bigcap_{\gamma \in \Gamma} \mathcal{A}_\gamma$. Since $\emptyset \in \mathcal{A}_\gamma$ for all $\gamma$, $\emptyset \in \mathcal{A}$. If $A \in \mathcal{A}$, then $A \in \mathcal{A}_\gamma$ for all $\gamma$, so $A^c \in \mathcal{A}_\gamma$ for all $\gamma$, so $A^c \in \mathcal{A}$. If $A_j \in \mathcal{A}$ for all $j \in \mathbb{N}$, then $A_j \in \mathcal{A}_\gamma$ for all $j$ and all $\gamma$, so $\bigcup_{j=1}^{\infty} A_j \in \mathcal{A}_\gamma$ for all $\gamma$, so $\bigcup_{j=1}^{\infty} A_j \in \mathcal{A}$.

Definition 1.5. Let $\mathcal{F} \subset \mathcal{P}(X)$. The σ-algebra generated by $\mathcal{F}$ is the intersection of all σ-algebras on $X$ that contain $\mathcal{F}$.

Since $\mathcal{P}(X)$ is a σ-algebra on $X$ which contains $\mathcal{F}$, the family of σ-algebras which contain $\mathcal{F}$ is not empty, so the intersection of this family is well defined. This intersection is a σ-algebra by Lemma 1.4, and it is the smallest σ-algebra that contains $\mathcal{F}$.
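On a finite set, Definition 1.5 can be made concrete: countable unions of subsets of a finite set reduce to finite unions, so the generated σ-algebra can be computed by repeatedly closing a family under complements and pairwise unions. The following is a minimal sketch under that finiteness assumption; the function name `generated_sigma_algebra` and the sample sets are illustrative and do not come from the text.

```python
from itertools import combinations


def generated_sigma_algebra(X, F):
    """Smallest family containing F, the empty set and X that is closed
    under complement and pairwise union (finite X only)."""
    X = frozenset(X)
    family = {frozenset(), X} | {frozenset(A) for A in F}
    changed = True
    while changed:
        changed = False
        current = list(family)  # snapshot, since we add to `family` below
        for A in current:
            if X - A not in family:
                family.add(X - A)
                changed = True
        for A, B in combinations(current, 2):
            if A | B not in family:
                family.add(A | B)
                changed = True
    return family


X = {1, 2, 3, 4}
sigma = generated_sigma_algebra(X, [{1}, {1, 2}])
trivial = generated_sigma_algebra(X, [])
full = generated_sigma_algebra(X, [{x} for x in X])
```

In this example the family {{1}, {1, 2}} generates the σ-algebra whose minimal nonempty elements are {1}, {2}, {3, 4}, so `sigma` has 2³ = 8 elements. The empty family gives the σ-algebra of Example 1.3, and the family of all singletons gives the power set of Example 1.2.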


In a metric space $X$, the topology generated by the metric $d$ is the family

$$\mathcal{T}_X = \{A \subset X \mid A \text{ is open with respect to the metric } d\}.$$

Not every topology is generated by a metric: topological spaces are a generalization of metric spaces and are studied in their own right. In this text, we only use topologies generated by a metric (so-called metric topologies), even though some of the theory below can be stated more generally.

Definition 1.6. Let $X$ be a metric space. The Borel σ-algebra on $X$, denoted $\mathcal{B}_X$, is the σ-algebra generated by $\mathcal{T}_X$. Elements of the Borel σ-algebra are called Borel sets.

Example 1.7. A metric space $X$ is said to be discrete if every subset of $X$ is open. One example of a discrete metric on any set $X$ is

$$d(x, y) = \begin{cases} 0 & x = y \\ 1 & x \neq y. \end{cases}$$

If $X$ is a discrete metric space, then $\mathcal{T}_X = \mathcal{P}(X)$, so $\mathcal{B}_X = \mathcal{P}(X)$.

Definition 1.8. For spaces $X$, $Y$ with σ-algebras $\mathcal{A}_X$, $\mathcal{A}_Y$, a function $f : X \to Y$ is called measurable if and only if $B \in \mathcal{A}_Y$ implies $f^{-1}(B) \in \mathcal{A}_X$. In particular, if $X$, $Y$ are metric spaces, $f : X \to Y$ is a Borel function if $B \in \mathcal{B}_Y$ implies $f^{-1}(B) \in \mathcal{B}_X$.

Proposition 1.9. If $f : X \to Y$ and $g : Y \to Z$ are Borel functions, then so is their composition $g \circ f : X \to Z$.

Proof. For any $B \in \mathcal{B}_Z$, since $g$ is Borel, $g^{-1}(B) \in \mathcal{B}_Y$. Since $f$ is Borel, $(g \circ f)^{-1}(B) = f^{-1}(g^{-1}(B)) \in \mathcal{B}_X$.

To prove that a function is Borel, we sometimes use the notion of pushforward of a σ-algebra. This is based on the fact that the inverse image $f^{-1}$ commutes with set operations.

Proposition 1.10. Let $f : X \to Y$ and let $\mathcal{A}$ be a σ-algebra on $X$. Then

$$\mathcal{B} = \{B \subset Y \mid f^{-1}(B) \in \mathcal{A}\}$$

is a σ-algebra on $Y$, called the pushforward of $\mathcal{A}$ by $f$.

Proof. From $f^{-1}(\emptyset) = \emptyset \in \mathcal{A}$, we conclude $\emptyset \in \mathcal{B}$. If $B \in \mathcal{B}$, then $f^{-1}(B) \in \mathcal{A}$, so $f^{-1}(Y \setminus B) = X \setminus f^{-1}(B) \in \mathcal{A}$ and $Y \setminus B \in \mathcal{B}$. If $f^{-1}(B_j) \in \mathcal{A}$ for some sets $B_j$, then $f^{-1}\left(\bigcup_{j=1}^{\infty} B_j\right) = \bigcup_{j=1}^{\infty} f^{-1}(B_j) \in \mathcal{A}$, so $\bigcup_{j=1}^{\infty} B_j \in \mathcal{B}$.

Borel sets and Borel functions are meant to be large enough classes to include all sets and functions which we will encounter in our work. The following lemma is a first step in that direction.
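The pushforward construction of Proposition 1.10 can be checked by brute force on finite sets. The sketch below is only an illustration under that finiteness assumption: the helper names (`preimage`, `powerset`, `pushforward`, `is_sigma_algebra`), the map `f`, and the sets `X`, `Y` are all made up for the example.

```python
def preimage(f, X, B):
    """Inverse image f^{-1}(B) = {x in X : f(x) in B}; f is a dict on X."""
    return frozenset(x for x in X if f[x] in B)


def powerset(Y):
    """All subsets of the finite set Y, as frozensets."""
    items = list(Y)
    return [frozenset(y for i, y in enumerate(items) if (mask >> i) & 1)
            for mask in range(1 << len(items))]


def pushforward(f, X, Y, sigma_X):
    """The family {B subset of Y : f^{-1}(B) in sigma_X}."""
    return {B for B in powerset(Y) if preimage(f, X, B) in sigma_X}


def is_sigma_algebra(Y, family):
    """On a finite set, closure under pairwise unions suffices for (c)."""
    Y = frozenset(Y)
    return (frozenset() in family
            and all(Y - B in family for B in family)
            and all(B | C in family for B in family for C in family))


X = {1, 2, 3, 4}
Y = {"a", "b", "c"}
f = {1: "a", 2: "a", 3: "b", 4: "c"}
sigma_X = {frozenset(), frozenset({1, 2}), frozenset({3, 4}), frozenset(X)}
pushed = pushforward(f, X, Y, sigma_X)
```

Here `pushed` consists of ∅, {a}, {b, c}, and Y. The set {b} is excluded because its inverse image {3} does not belong to `sigma_X`.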


Lemma 1.11. Every continuous function is Borel.

Proof. By Proposition 1.10, the set $\mathcal{S} = \{B \mid f^{-1}(B) \in \mathcal{B}_X\}$ is a σ-algebra on $Y$. If $f : X \to Y$ is continuous, and if $B$ is open, then $f^{-1}(B)$ is open, so $f^{-1}(B) \in \mathcal{B}_X$. Thus, the σ-algebra $\mathcal{S}$ contains all open sets in $Y$. Therefore, it contains $\mathcal{B}_Y$, so for any $B \in \mathcal{B}_Y$, $f^{-1}(B) \in \mathcal{B}_X$.

Example 1.12. For any $A \in \mathcal{B}_X$, the characteristic function of the set $A$,

$$\chi_A(x) = \begin{cases} 1 & x \in A \\ 0 & x \in A^c, \end{cases} \tag{1.1}$$

is a Borel function.

Proof. For any set $B$, the inverse image $\chi_A^{-1}(B)$ is equal to one of the sets $\emptyset$, $A$, $A^c$, or $X$. All of these are Borel sets, so $\chi_A$ is a Borel function.

σ-algebras, and in particular Borel σ-algebras, behave naturally with respect to restrictions and inclusions (Exercise 1.1).

To prove that some property holds for all elements of the σ-algebra $\mathcal{A}$ generated by $\mathcal{G}$, we usually introduce the set $\mathcal{S}$ of elements of $\mathcal{A}$ with that property, prove $\mathcal{G} \subset \mathcal{S}$, and prove that $\mathcal{S}$ is a σ-algebra. However, that can be a difficult task. As our last topic in this section, we show an abstract result which reduces that task to proving an easier condition (that $\mathcal{S}$ is closed under increasing and decreasing countable limits), as long as the set $\mathcal{G}$ has certain algebraic properties. We need the following definitions.

Definition 1.13. An algebra on $X$ is a family $\mathcal{G} \subset \mathcal{P}(X)$ that obeys
(a) $\emptyset \in \mathcal{G}$;
(b) $A \in \mathcal{G}$ implies $A^c \in \mathcal{G}$;
(c) $A_1, A_2 \in \mathcal{G}$ implies $A_1 \cap A_2 \in \mathcal{G}$.

This has immediate further corollaries: Any algebra contains $X = \emptyset^c$; any algebra is closed under finite intersections, unions, and differences of sets. Every σ-algebra is an algebra, but not conversely:

Example 1.14. The family $\mathcal{G} = \{A \subset \mathbb{R} \mid A \text{ is finite or } A^c \text{ is finite}\}$ is an algebra, but not a σ-algebra.

Definition 1.15. A monotone class on $X$ is a family $\mathcal{C} \subset \mathcal{P}(X)$ that obeys
(a) if $A_n \in \mathcal{C}$ and $A_n \subset A_{n+1}$ for all $n \in \mathbb{N}$, then $\bigcup_{n \in \mathbb{N}} A_n \in \mathcal{C}$;
(b) if $B_n \in \mathcal{C}$ and $B_{n+1} \subset B_n$ for all $n \in \mathbb{N}$, then $\bigcap_{n \in \mathbb{N}} B_n \in \mathcal{C}$.
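The algebra of finite and cofinite sets from Example 1.14 admits a simple symbolic representation: store a finite set of points together with a flag saying whether the set is that finite set or its complement. The sketch below, with the hypothetical class name `FinCofin`, illustrates why the family is closed under the operations of Definition 1.13; it is an illustration under these representation choices, not a construction from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FinCofin:
    """A subset of R that is finite or has finite complement.

    If `cofinite` is False the set equals `points`;
    otherwise it equals R minus `points`.
    """
    points: frozenset
    cofinite: bool = False

    def complement(self):
        return FinCofin(self.points, not self.cofinite)

    def intersect(self, other):
        if not self.cofinite and not other.cofinite:
            return FinCofin(self.points & other.points)
        if not self.cofinite:   # finite set minus the excluded points
            return FinCofin(self.points - other.points)
        if not other.cofinite:
            return FinCofin(other.points - self.points)
        # cofinite with cofinite: exclude the union of the excluded points
        return FinCofin(self.points | other.points, True)

    def union(self, other):
        # De Morgan: A ∪ B = (A^c ∩ B^c)^c
        return self.complement().intersect(other.complement()).complement()


a = FinCofin(frozenset({1, 2}))                  # the set {1, 2}
b = FinCofin(frozenset({2, 3}), cofinite=True)   # R minus {2, 3}
meet = a.intersect(b)                            # the set {1}
join = a.union(b)                                # R minus {3}
```

Every operation returns another `FinCofin`, which mirrors closure under complements and finite intersections. A countable union such as the set of natural numbers, obtained from singletons, is neither finite nor cofinite and has no such representation, which is the reason the family is not a σ-algebra.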


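Every σ-algebra is a monotone class, but not conversely:

Example 1.16. The family $\mathcal{C} = \{\emptyset, \mathbb{R}\} \cup \{(a, \infty) \mid a \in \mathbb{Z}\}$ is a monotone class, but not a σ-algebra.

An arbitrary intersection of monotone classes is a monotone class, and $\mathcal{P}(X)$ is a monotone class. Thus, for any $\mathcal{E} \subset \mathcal{P}(X)$, there exists a smallest monotone class which contains $\mathcal{E}$, i.e., the monotone class generated by $\mathcal{E}$.

Theorem 1.17 (Monotone class theorem). If $\mathcal{G} \subset \mathcal{P}(X)$ is an algebra, the monotone class generated by $\mathcal{G}$ is equal to the σ-algebra generated by $\mathcal{G}$.

Proof. Denote by $\mathcal{C}$ the monotone class generated by $\mathcal{G}$. The main step is to prove that for all $E, F \in \mathcal{C}$,
\[ E \setminus F,\ F \setminus E,\ E \cap F \in \mathcal{C}. \tag{1.2} \]
Define for $E \in \mathcal{C}$,
\[ \mathcal{C}_E = \{F \in \mathcal{C} \mid E \setminus F,\ F \setminus E,\ E \cap F \in \mathcal{C}\}. \tag{1.3} \]
This is a monotone class, since the expressions $E \setminus F$, $F \setminus E$, $E \cap F$ viewed as functions of $F$ preserve monotonicity and monotone limits. For instance, $F_n \subset F_{n+1}$ implies $E \setminus F_{n+1} \subset E \setminus F_n$ and $E \setminus \big(\bigcup_{n=1}^\infty F_n\big) = \bigcap_{n=1}^\infty (E \setminus F_n)$.

Assume $E \in \mathcal{G}$. Then (1.2) holds for $F \in \mathcal{G}$, since $\mathcal{G}$ is an algebra and $\mathcal{G} \subset \mathcal{C}$. Thus, $\mathcal{G} \subset \mathcal{C}_E$. Thus, $\mathcal{C}_E$ is a monotone class with $\mathcal{G} \subset \mathcal{C}_E \subset \mathcal{C}$, so $\mathcal{C}_E = \mathcal{C}$. Thus, (1.2) holds for all $E \in \mathcal{G}$ and $F \in \mathcal{C}$.

The conditions in (1.3) are symmetric in $E, F$, so (1.2) holds for all $E \in \mathcal{C}$ and $F \in \mathcal{G}$. Now the previous argument can be repeated for any $E \in \mathcal{C}$ and shows $\mathcal{C}_E = \mathcal{C}$. Thus, (1.2) holds for all $E, F \in \mathcal{C}$.

Since $X \in \mathcal{G} \subset \mathcal{C}$, (1.2) implies that for all $E \in \mathcal{C}$, $X \setminus E \in \mathcal{C}$, and that for all $E, F \in \mathcal{C}$, $E \cap F \in \mathcal{C}$, so $\mathcal{C}$ is an algebra. For any $A_n \in \mathcal{C}$, $n \in \mathbb{N}$, consider $B_n = \bigcup_{j=1}^n A_j \in \mathcal{C}$. This is a monotone sequence: $B_n \subset B_{n+1}$ for all $n \in \mathbb{N}$. Since $\mathcal{C}$ is a monotone class,
\[ \bigcup_{n \in \mathbb{N}} A_n = \bigcup_{n \in \mathbb{N}} B_n \in \mathcal{C}, \]
so $\mathcal{C}$ is a σ-algebra. Denoting by $\mathcal{A}$ the σ-algebra generated by $\mathcal{G}$, we conclude $\mathcal{A} \subset \mathcal{C}$. Conversely, $\mathcal{A}$ is a monotone class and $\mathcal{G} \subset \mathcal{A}$, so $\mathcal{C} \subset \mathcal{A}$.

The monotone class theorem will be used twice in this text: in the proof of a uniqueness result for Borel measures on $\mathbb{R}$ in Section 1.5, and in the study of product measures in Section 1.6.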
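The closure arguments of this section can be tried out on a finite set, where countable unions reduce to finite ones. The following is a minimal sketch, not part of the text: the helper names (`generated_sigma_algebra`, `pushforward`, `is_sigma_algebra`) and the sample sets and map are illustrative assumptions. It computes a generated σ-algebra by brute-force closure and then the pushforward σ-algebra in the sense of Proposition 1.10.

```python
from itertools import combinations

def generated_sigma_algebra(X, generators):
    """Close the generators under complement and pairwise union.

    On a finite set X this closure is the generated sigma-algebra,
    because countable unions reduce to finite ones.
    """
    X = frozenset(X)
    family = {frozenset(), X} | {frozenset(g) for g in generators}
    changed = True
    while changed:
        changed = False
        current = list(family)
        for A in current:
            if X - A not in family:
                family.add(X - A)
                changed = True
        for A, B in combinations(current, 2):
            if A | B not in family:
                family.add(A | B)
                changed = True
    return family

def powerset(Y):
    items = list(Y)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def pushforward(f, X, Y, family):
    """All B in Y whose preimage under f belongs to `family`."""
    return {B for B in powerset(Y)
            if frozenset(x for x in X if f(x) in B) in family}

def is_sigma_algebra(Y, family):
    Y = frozenset(Y)
    return (frozenset() in family
            and all(Y - B in family for B in family)
            and all(A | B in family for A in family for B in family))

# Illustrative data (assumed for this sketch).
X = {1, 2, 3, 4}
Y = {"a", "b", "c"}
sigma = generated_sigma_algebra(X, [{1}, {1, 2}])  # atoms {1}, {2}, {3, 4}
f = {1: "a", 2: "b", 3: "b", 4: "c"}.get           # a sample map X -> Y
pushed = pushforward(f, X, Y, sigma)

print(len(sigma), is_sigma_algebra(X, sigma))
print(sorted(sorted(B) for B in pushed), is_sigma_algebra(Y, pushed))
```

Here the set {"b"} is excluded from the pushforward because its preimage {2, 3} is not in the generated σ-algebra, even though both points 2 and 3 belong to sets in that σ-algebra.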


1.2. Measures and Carathéodory's theorem

In this section, we define measures, study their general properties, and give an important method for constructing measures.

Definition 1.18. A measure on a σ-algebra $\mathcal{A}$ is a map $\mu : \mathcal{A} \to [0, \infty]$ with $\mu(\emptyset) = 0$, which is σ-additive, i.e., for any pairwise disjoint sets $A_n \in \mathcal{A}$, $n \in \mathbb{N}$,
\[ \mu\Big(\bigcup_{n=1}^\infty A_n\Big) = \sum_{n=1}^\infty \mu(A_n). \tag{1.4} \]
If $X$ is a metric space, a measure on $\mathcal{B}_X$ is called a Borel measure on $X$. The measure $\mu$ is finite if $\mu(X) < \infty$. It is finite on compacts if $\mu(K) < \infty$ for every compact $K \subset X$.

In (1.4) we are using the natural convention $c + \infty = \infty$ for $c \in [0, \infty]$. Explicitly, if the series in (1.4) is divergent or if any of the terms in the series are infinite, the value of the series is taken to be $+\infty$.

Let us see some easy examples and general properties of measures:

Example 1.19. $\mu \equiv 0$ is the trivial measure on any σ-algebra.

Example 1.20. Fix $x \in X$. The Dirac measure at $x$ is the measure $\delta_x$ on $\mathcal{P}(X)$ defined by
\[ \delta_x(A) = \begin{cases} 1 & x \in A \\ 0 & x \notin A. \end{cases} \]

Example 1.21. Let $\#A$ denote the number of elements of a set $A$ (if $A$ is infinite, we write $\#A = \infty$). For any set $X$, the counting measure on $\mathcal{P}(X)$ is defined by $\mu(A) = \#A$.

Theorem 1.22. Let $\mu$ be a measure on $\mathcal{A}$. Then, for any sets in $\mathcal{A}$, the following hold.
(a) If $n \in \mathbb{N}$ and sets $A_1, \dots, A_n$ are pairwise disjoint, then
\[ \mu\Big(\bigcup_{j=1}^n A_j\Big) = \sum_{j=1}^n \mu(A_j). \]
(b) If $S \subset T$, then $\mu(S) \le \mu(T)$.
(c) If $S \subset T$ and $\mu(S) < \infty$, then $\mu(T \setminus S) = \mu(T) - \mu(S)$.
(d) For any sequence of sets $(B_n)_{n=1}^\infty$ such that $B_n \subset B_{n+1}$ for all $n \in \mathbb{N}$,
\[ \mu\Big(\bigcup_{n=1}^\infty B_n\Big) = \lim_{n \to \infty} \mu(B_n). \]


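(e) For any sequence of sets $(C_n)_{n=1}^\infty$ such that $C_{n+1} \subset C_n$ for all $n \in \mathbb{N}$, if there exists $k \in \mathbb{N}$ such that $\mu(C_k) < \infty$, then
\[ \mu\Big(\bigcap_{n=1}^\infty C_n\Big) = \lim_{n \to \infty} \mu(C_n). \]
(f) For any sequence of sets $(A_n)_{n=1}^\infty$,
\[ \mu\Big(\bigcup_{n=1}^\infty A_n\Big) \le \sum_{n=1}^\infty \mu(A_n). \]

Proof. (a) This follows from σ-additivity with $A_j = \emptyset$ for $j > n$.

(b) This follows by representing $T$ as the disjoint union of $S$ and $T \setminus S$.

(c) This follows from $\mu(T) = \mu(S) + \mu(T \setminus S)$ by subtracting $\mu(S)$.

(d) Denote $A_n = B_n \setminus B_{n-1}$ for $n \ge 2$ and $A_1 = B_1$. The sets $A_n$ are disjoint, so for each $n$, $B_n = \bigcup_{j=1}^n A_j$ implies by (a) that $\mu(B_n) = \sum_{j=1}^n \mu(A_j)$. Since $\bigcup_{j=1}^\infty B_j = \bigcup_{j=1}^\infty A_j$, σ-additivity implies
\[ \mu\Big(\bigcup_{j=1}^\infty B_j\Big) = \mu\Big(\bigcup_{j=1}^\infty A_j\Big) = \sum_{j=1}^\infty \mu(A_j) = \lim_{n \to \infty} \sum_{j=1}^n \mu(A_j) = \lim_{n \to \infty} \mu(B_n). \]

(e) Applying (d) to the increasing sequence of sets $C_k \setminus C_n$ gives
\[ \mu\Big(C_k \setminus \bigcap_{n=1}^\infty C_n\Big) = \mu\Big(\bigcup_{n=1}^\infty (C_k \setminus C_n)\Big) = \lim_{n \to \infty} \mu(C_k \setminus C_n). \]
Subtracting both sides from $\mu(C_k)$ completes the proof.

(f) Consider the increasing sequence of sets $B_n = \bigcup_{j=1}^n A_j$, with $B_0 = \emptyset$. The sets $C_n = B_n \setminus B_{n-1}$ are pairwise disjoint, $C_n \subset A_n$, and $\bigcup_{n=1}^\infty A_n = \bigcup_{n=1}^\infty C_n$. Thus, by σ-additivity and (b),
\[ \mu\Big(\bigcup_{n=1}^\infty A_n\Big) = \mu\Big(\bigcup_{n=1}^\infty C_n\Big) = \sum_{n=1}^\infty \mu(C_n) \le \sum_{n=1}^\infty \mu(A_n). \]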
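Parts (b), (c), and (f) of Theorem 1.22 are easy to check by hand on a finite set. The sketch below is illustrative only: the point masses in `weights` and the sample sets are assumptions chosen for this example, and exact rational arithmetic is used so that the equalities can be tested without rounding.

```python
from fractions import Fraction

# Hypothetical point masses on X = {"a", "b", "c", "d"}.
weights = {"a": Fraction(1, 2), "b": Fraction(1, 3),
           "c": Fraction(0), "d": Fraction(2)}

def mu(A):
    """Measure of A as the sum of its point masses."""
    return sum((weights[x] for x in A), Fraction(0))

def dirac(x):
    """Dirac measure at x (Example 1.20), as a function of sets."""
    return lambda A: 1 if x in A else 0

S, T = {"a"}, {"a", "b", "d"}          # S is a subset of T
A1, A2 = {"a", "b"}, {"b", "d"}        # overlapping sets

monotone = mu(S) <= mu(T)                          # part (b)
difference = mu(T - S) == mu(T) - mu(S)            # part (c), mu(S) finite
subadditive = mu(A1 | A2) <= mu(A1) + mu(A2)       # part (f), two sets
additive = mu({"a"} | {"d"}) == mu({"a"}) + mu({"d"})  # part (a)

print(monotone, difference, subadditive, additive)
print(dirac("a")(S), dirac("b")(S))
```

The inequality in (f) is strict here because the point "b" is counted twice on the right-hand side.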

In this theorem, finiteness appears as an assumption whenever the proof uses subtraction, because we cannot subtract $\infty$. This assumption cannot be removed: for instance, for part (e), if $\mu$ is the counting measure on $\mathbb{N}$ and $A_n = \{k \in \mathbb{N} \mid k \ge n\}$, then $A_{n+1} \subset A_n$ and $\mu(A_n) = \infty$ for all $n$, but $\mu\big(\bigcap_{n \in \mathbb{N}} A_n\big) = \mu(\emptyset) = 0$.

The importance of σ-additivity will be evident; however, when constructing measures, σ-additivity presents a challenge, since constructing finitely additive maps is much easier. We will now present a robust abstract way to


construct measures, which will be used several times in this text. The intermediate step will be an object called an outer measure, which has weaker properties than a measure, but it is defined on all subsets of the space $X$.

Definition 1.23. An outer measure on $X$ is a map $\mu^* : \mathcal{P}(X) \to [0, \infty]$ such that
(a) $\mu^*(\emptyset) = 0$;
(b) $\mu^*(A) \le \mu^*(B)$ if $A \subset B$;
(c) (σ-subadditivity) $\mu^*\big(\bigcup_{n=1}^\infty A_n\big) \le \sum_{n=1}^\infty \mu^*(A_n)$ for all sets $A_n \subset X$.

A cover of $A$ is a family of sets $\{E_\gamma\}_{\gamma \in \Gamma}$ such that $A \subset \bigcup_{\gamma \in \Gamma} E_\gamma$. The cover is called finite or countable if $\Gamma$ is finite or countable, respectively.

To construct an outer measure, let us first choose a fairly arbitrary class of elementary sets $\mathcal{E}$ and a weight $\rho$ on elementary sets (we do not call $\rho$ a measure, because the elementary sets in general do not form a σ-algebra and because $\rho$ is not required to obey any kind of additivity properties), and then define $\mu^*(A)$ as an infimum over countable covers of $A$:

Theorem 1.24. Let $\mathcal{E} \subset \mathcal{P}(X)$ with $\emptyset \in \mathcal{E}$ and $X \in \mathcal{E}$. Let $\rho : \mathcal{E} \to [0, \infty]$ be a map with $\rho(\emptyset) = 0$. Define, for all $A \subset X$,
\[ \mu^*(A) = \inf\Big\{ \sum_{j=1}^\infty \rho(E_j) \;\Big|\; A \subset \bigcup_{j=1}^\infty E_j,\ E_j \in \mathcal{E}\ \ \forall j \in \mathbb{N} \Big\}. \tag{1.5} \]
Then $\mu^*$ is an outer measure on $X$.

Proof. Since $X \in \mathcal{E}$, every set $A$ has a countable cover, so the definition is well posed. The property $\mu^*(\emptyset) = 0$ follows by taking $E_j = \emptyset$ for all $j$. If $A \subset B$, any cover of $B$ is also a cover of $A$, so the infimum defining $\mu^*(A)$ is over a larger set than the infimum defining $\mu^*(B)$, so $\mu^*(A) \le \mu^*(B)$.

If $A = \bigcup_{n=1}^\infty A_n$, for any $\epsilon > 0$ and $n$, there exists a countable cover $\{E_{n,j}\}_{j=1}^\infty$ of $A_n$ such that $\sum_{j=1}^\infty \rho(E_{n,j}) \le \mu^*(A_n) + \epsilon/2^n$. Then the countable collection $\{E_{n,j}\}_{n,j=1}^\infty$ is a cover for $A$ and
\[ \mu^*(A) \le \sum_{n=1}^\infty \sum_{j=1}^\infty \rho(E_{n,j}) \le \sum_{n=1}^\infty \mu^*(A_n) + \epsilon. \]
Since $\epsilon > 0$ is arbitrary, this implies that $\mu^*(A) \le \sum_{n=1}^\infty \mu^*(A_n)$.

For any $E \in \mathcal{E}$, by taking the countable cover $E_1 = E$ and $E_j = \emptyset$ for $j \ge 2$, we conclude $\mu^*(E) \le \rho(E)$. For some choices of weights $\rho$, it can happen that $\mu^*(E) < \rho(E)$ (Exercise 6.8). In several important constructions, we will show manually that $\mu^*(E) = \rho(E)$ for $E \in \mathcal{E}$.
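The infimum in (1.5) can be computed by brute force when the class of elementary sets is finite. The following sketch is only an illustration: the set `X`, the elementary sets, and the weights in `rho` are assumptions made up for this example. Since all weights are nonnegative and the empty set has weight zero, repeating a set in a cover never lowers the sum, so it is enough to search over subfamilies of distinct elementary sets.

```python
from itertools import combinations

X = frozenset({1, 2, 3})

# Hypothetical elementary sets and weights; rho is deliberately not additive.
rho = {
    frozenset(): 0.0,
    frozenset({1}): 1.0,
    frozenset({2}): 1.0,
    frozenset({1, 2}): 3.0,
    X: 5.0,
}

def outer_measure(A):
    """Value of (1.5) for A, by searching all subfamilies of elementary sets."""
    A = frozenset(A)
    elementary = list(rho)
    best = float("inf")
    for r in range(1, len(elementary) + 1):
        for cover in combinations(elementary, r):
            if A <= frozenset().union(*cover):
                best = min(best, sum(rho[E] for E in cover))
    return best

def subsets(S):
    items = list(S)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

for A in subsets(X):
    print(sorted(A), outer_measure(A))
```

In this example the elementary set {1, 2} has weight 3 but can be covered more cheaply by {1} and {2}, which illustrates how the outer measure of an elementary set can be strictly smaller than its weight.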


The core of the outer measure approach is Carathéodory's definition of being “measurable with respect to an outer measure”:

Definition 1.25. The set $A \subset X$ is measurable with respect to $\mu^*$ if
\[ \mu^*(E) = \mu^*(E \cap A) + \mu^*(E \cap A^c) \qquad \forall E \subset X. \tag{1.6} \]

Theorem 1.26 (Carathéodory). Let $\mu^*$ be an outer measure on $X$. The family $\mathcal{A}$ of sets measurable with respect to $\mu^*$ is a σ-algebra, and the restriction $\mu^*|_{\mathcal{A}}$ is a measure on $\mathcal{A}$.

Proof. Using $\mu^*(\emptyset) = 0$, it easily follows that $\emptyset \in \mathcal{A}$. Condition (1.6) is equivalent for $A$ and $A^c$, so $A \in \mathcal{A}$ implies $A^c \in \mathcal{A}$.

Consider an increasing sequence $B_n \subset B_{n+1} \subset X$, $n \in \mathbb{N}$, and its limit $B = \bigcup_{j=1}^\infty B_j$. With the convention $B_0 = \emptyset$, we note $B = \bigcup_{j=1}^\infty (B_j \setminus B_{j-1})$ and conclude by σ-subadditivity of $\mu^*$ that, for any $E \subset X$,
\[ \mu^*(E) \le \mu^*(E \cap B^c) + \mu^*(E \cap B) \le \mu^*(E \cap B^c) + \sum_{j=1}^\infty \mu^*(E \cap (B_j \setminus B_{j-1})). \tag{1.7} \]

Let us prove that these inequalities sometimes turn into equalities. Fix $E \subset X$, let $A_n \in \mathcal{A}$, $n \in \mathbb{N}$, and take $B_n = \bigcup_{j=1}^n A_j$ for $n \in \mathbb{N}$. Measurability of $A_j$ with respect to $\mu^*$ implies
\[ \mu^*(E \cap B_{j-1}^c) = \mu^*(E \cap B_{j-1}^c \cap A_j^c) + \mu^*(E \cap B_{j-1}^c \cap A_j), \]
which we rewrite as
\[ \mu^*(E \cap B_{j-1}^c) = \mu^*(E \cap B_j^c) + \mu^*(E \cap (B_j \setminus B_{j-1})). \]
By induction in $n$, this gives
\[ \mu^*(E) = \mu^*(E \cap B_n^c) + \sum_{j=1}^n \mu^*(E \cap (B_j \setminus B_{j-1})). \]
By monotonicity of the outer measure, $\mu^*(E \cap B_n^c) \ge \mu^*(E \cap B^c)$, so
\[ \mu^*(E) \ge \mu^*(E \cap B^c) + \sum_{j=1}^n \mu^*(E \cap (B_j \setminus B_{j-1})) \]
for any $n$. Taking $n \to \infty$,
\[ \mu^*(E) \ge \mu^*(E \cap B^c) + \sum_{j=1}^\infty \mu^*(E \cap (B_j \setminus B_{j-1})). \tag{1.8} \]
This gives an inequality in the opposite direction compared to (1.7), so it implies that all three quantities are equal,
\[ \mu^*(E) = \mu^*(E \cap B^c) + \mu^*(E \cap B) = \mu^*(E \cap B^c) + \sum_{j=1}^\infty \mu^*(E \cap (B_j \setminus B_{j-1})). \tag{1.9} \]


Since $E$ is arbitrary, the first equality in (1.9) shows that $B \in \mathcal{A}$. Thus, $\mathcal{A}$ is closed under countable unions, so it is a σ-algebra.

If the sets $A_n$ are disjoint, then $B_j \setminus B_{j-1} = A_j$, so the second equality in (1.9), taken for $E = B$, proves that
\[ \mu^*\Big(\bigcup_{j=1}^\infty A_j\Big) = \sum_{j=1}^\infty \mu^*(A_j). \]
Thus, $\mu^*$ is σ-additive on $\mathcal{A}$.
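Carathéodory's condition (1.6) can be tested exhaustively on a small set. The sketch below uses an assumed toy outer measure, not one from the text: on X = {1, 2, 3, 4}, the value of `outer(A)` is the number of blocks of the partition {1, 2}, {3, 4} that A meets. The names `BLOCKS`, `outer`, and `is_measurable` are illustrative.

```python
from itertools import combinations

X = frozenset({1, 2, 3, 4})
BLOCKS = [frozenset({1, 2}), frozenset({3, 4})]

def outer(A):
    """Toy outer measure: the number of blocks that A meets."""
    A = frozenset(A)
    return sum(1 for block in BLOCKS if block & A)

def subsets(S):
    items = list(S)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_measurable(A):
    """Condition (1.6), checked against every test set E in X."""
    A = frozenset(A)
    return all(outer(E) == outer(E & A) + outer(E - A) for E in subsets(X))

measurable = {A for A in subsets(X) if is_measurable(A)}
print(sorted(sorted(A) for A in measurable))
```

The singleton {1} fails the test with E = {1, 2}, because both pieces of E still meet the same block and are counted twice. The sets that pass are exactly the unions of blocks, and the toy outer measure is additive on them.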

1.3. Borel σ-algebra on the real line and related spaces

We now specialize to Borel σ-algebras on some important spaces, starting with the real line $\mathbb{R}$, with the goal of obtaining useful criteria which establish that certain sets and functions are Borel. We use a topological notion:

Definition 1.27. A base $\mathcal{U}$ of $X$ is a family of open sets in $X$ such that, for every open set $V$ and every $x \in V$, there exists $A \in \mathcal{U}$ such that $x \in A \subset V$. The space is said to be second countable if it has a countable base.

Lemma 1.28. If $\mathcal{U}$ is a base of $X$, every open set $V$ in $X$ can be written as a union of elements of $\mathcal{U}$,
\[ V = \bigcup_{\substack{A \in \mathcal{U} \\ A \subset V}} A. \]

Proof. Denote by $U$ the union of all $A \in \mathcal{U}$ with $A \subset V$. Obviously $U \subset V$. For the converse, take any $x \in V$. By the definition of a base, there exists $A \in \mathcal{U}$ with $x \in A \subset V$, so $x \in U$. This shows $V \subset U$.

Lemma 1.29. $\mathcal{U} = \{(a, b) \mid a, b \in \mathbb{Q},\ a < b\}$ is a countable base of $\mathbb{R}$.

Proof. Assume $V \subset \mathbb{R}$ is open and $x \in V$. There exists $\epsilon > 0$ such that $(x - \epsilon, x + \epsilon) \subset V$. By density of $\mathbb{Q}$ in $\mathbb{R}$, there exist rational numbers $a \in (x - \epsilon, x)$, $b \in (x, x + \epsilon)$. Then $x \in (a, b) \subset V$. Thus, $\mathcal{U}$ is a base of $\mathbb{R}$. Its countability follows from countability of $\mathbb{Q}$.

A metric space is called separable if it contains a countable dense subset. In concrete situations it is useful to write down an explicit base, but second countability of a metric space is equivalent to separability (Exercise 1.2). Note how countability of the base is used in the following proof:

Lemma 1.30. The Borel σ-algebra on $\mathbb{R}$ is the σ-algebra generated by the intervals $(a, \infty)$ with $a \in \mathbb{R}$.

Proof. Denote by $\mathcal{A}$ the σ-algebra generated by the family $\{(a, \infty) \mid a \in \mathbb{R}\}$. The sets $(a, \infty)$ are open, thus they are Borel, so $\mathcal{A} \subset \mathcal{B}_{\mathbb{R}}$.


By taking complements, $(-\infty, a] = \mathbb{R} \setminus (a, \infty) \in \mathcal{A}$ for any $a$. For any $b$, $(-\infty, b) = \bigcup_{n \in \mathbb{N}} (-\infty, b - 1/n] \in \mathcal{A}$. Then $(a, b) = (-\infty, b) \cap (a, \infty) \in \mathcal{A}$ for every $a < b$. Using intervals $(a, b)$ with rational endpoints and their countable unions, by Lemma 1.29, every open set is in $\mathcal{A}$. Thus, $\mathcal{B}_{\mathbb{R}} \subset \mathcal{A}$.

Corollary 1.31. $f : X \to \mathbb{R}$ is a Borel function if and only if $f^{-1}((a, \infty)) \in \mathcal{B}_X$ for all $a \in \mathbb{R}$.

Proof. By Proposition 1.10, the set $\{A \subset \mathbb{R} \mid f^{-1}(A) \in \mathcal{B}_X\}$ is a σ-algebra. Thus, it contains $\mathcal{B}_{\mathbb{R}}$ if and only if it contains the sets $(a, \infty)$ for $a \in \mathbb{R}$.

Corollary 1.32. (a) If $f, g : X \to \mathbb{R}$ are Borel, then their pointwise maximum $h(x) = \max\{f(x), g(x)\}$ is Borel.
(b) If $f : X \to \mathbb{R}$ is Borel, then $-f$ is also Borel.

Proof. (a) $h^{-1}((a, \infty)) = f^{-1}((a, \infty)) \cup g^{-1}((a, \infty)) \in \mathcal{B}_X$ for all $a \in \mathbb{R}$.

(b) The function $h(x) = -x$ is continuous, so it is a Borel function from $\mathbb{R}$ to $\mathbb{R}$. Thus $-f = h \circ f$ is Borel as a composition of Borel functions.

We now turn to $\mathbb{R}^n$. If $X, Y$ are metric spaces, let us define a metric on $X \times Y$ by
\[ d((x_1, y_1), (x_2, y_2)) = \max\{d_X(x_1, x_2), d_Y(y_1, y_2)\}. \tag{1.10} \]
By induction, this can be applied to a product of $n$ metric spaces; for instance, this makes $\mathbb{R}^n$ a metric space with metric
\[ d_\infty(x, y) = \max_{j = 1, \dots, n} |x_j - y_j|. \tag{1.11} \]
Although this is not the Euclidean metric on $\mathbb{R}^n$, it induces the same topology on $\mathbb{R}^n$ (Exercise 1.3), so for topological questions, we can use whichever metric is more practical. Metrics that generate the same topology are said to be equivalent; equivalent metrics obviously generate the same Borel σ-algebra. They also give the same notion of convergence, because convergence of sequences in a metric space can be restated in terms of the metric topology (Exercise 1.4). The metric (1.10), or any metric equivalent to it, will be called a product metric for $X \times Y$.

Lemma 1.33. If $\mathcal{U}$ is a base for $X$ and $\mathcal{V}$ is a base for $Y$, the set $\{U \times V \mid U \in \mathcal{U},\ V \in \mathcal{V}\}$ is a base for $X \times Y$.

Proof. For any open set $E \subset X \times Y$ and $(x, y) \in E$, there is an $\epsilon$-ball around $(x, y)$ contained in $E$. By (1.10), this $\epsilon$-ball is of the form $A \times B$ where $A, B$ are $\epsilon$-balls in $X, Y$, respectively. In particular, $A, B$ are open, so


there exist $U \in \mathcal{U}$, $V \in \mathcal{V}$ such that $x \in U \subset A$ and $y \in V \subset B$. It follows that $(x, y) \in U \times V \subset A \times B \subset E$.

Conversely, each set $U \times V$ is open in $X \times Y$. Indeed, $U$ is open in $X$ and $V$ is open in $Y$, so for a fixed $(x, y) \in U \times V$ there exists $\epsilon > 0$ such that $d_X(x, \tilde{x}) < \epsilon$ implies $\tilde{x} \in U$, and $d_Y(y, \tilde{y}) < \epsilon$ implies $\tilde{y} \in V$. Using (1.10), it follows that $U \times V$ contains the $\epsilon$-ball around $(x, y)$. Thus, $U \times V$ is open in $X \times Y$.

Applying this inductively gives a countable base for $\mathbb{R}^n$:

Corollary 1.34. $\mathcal{U} = \big\{ \prod_{j=1}^n (a_j, b_j) \mid a_j, b_j \in \mathbb{Q},\ a_j < b_j\ \forall j \big\}$ is a countable base for $\mathbb{R}^n$.

Corollary 1.35. The Borel σ-algebra on $\mathbb{R}^n$ is the σ-algebra generated by the sets $\prod_{j=1}^n (a_j, b_j)$, where $a_1, \dots, a_n, b_1, \dots, b_n \in \mathbb{R}$.

Proof. Denote by $\mathcal{A}$ the σ-algebra generated by the sets $\prod_{j=1}^n (a_j, b_j)$. Since those sets are open, $\mathcal{A} \subset \mathcal{B}_{\mathbb{R}^n}$. For the converse inclusion, by Corollary 1.34 and Lemma 1.28, any open set $V \subset \mathbb{R}^n$ is a countable union of sets of the form $\prod_{j=1}^n (a_j, b_j)$, so any open set $V$ is in $\mathcal{A}$. Thus, $\mathcal{B}_{\mathbb{R}^n} \subset \mathcal{A}$.

For any vector-valued function $h : X \to \mathbb{R}^n$, denote its components by $h_j = \pi_j \circ h$, where $\pi_j : \mathbb{R}^n \to \mathbb{R}$ denotes the projection to the $j$th coordinate, $\pi_j(x) = x_j$. This is also denoted by $h = (h_1, \dots, h_n)$.

Proposition 1.36. A vector-valued function $h : X \to \mathbb{R}^n$ is Borel if and only if its components $h_j : X \to \mathbb{R}$ are Borel for all $j = 1, 2, \dots, n$.

Proof. Since the projections $\pi_j$ are continuous, if $h$ is Borel, then $h_j = \pi_j \circ h$ is Borel for each $j$. Conversely, assume that $h_1, \dots, h_n : X \to \mathbb{R}$ are Borel functions. For any $a_1, b_1, \dots, a_n, b_n \in \mathbb{R}$,
\[ h^{-1}\Big(\prod_{j=1}^n (a_j, b_j)\Big) = \bigcap_{j=1}^n h_j^{-1}((a_j, b_j)) \in \mathcal{B}_X. \]
Thus, by Corollary 1.35 and Proposition 1.10, $h^{-1}(B) \in \mathcal{B}_X$ for every $B \in \mathcal{B}_{\mathbb{R}^n}$.

Corollary 1.37. A function $f : X \to \mathbb{C}$ is Borel if and only if $\operatorname{Re} f$ and $\operatorname{Im} f$ are Borel functions from $X$ to $\mathbb{R}$.

Proof. In the identification of $\mathbb{C}$ as $\mathbb{R}^2$, the absolute value metric on $\mathbb{C}$ corresponds to the Euclidean metric on $\mathbb{R}^2$, so $\mathcal{B}_{\mathbb{C}} = \mathcal{B}_{\mathbb{R}^2}$. In that interpretation, $\operatorname{Re} f, \operatorname{Im} f$ are the components of $f$, so the claim follows from Proposition 1.36.

Similarly, the next proof uses the identification of $\mathbb{C}^2$ with $\mathbb{R}^4$.


Proposition 1.38. If $f, g : X \to \mathbb{C}$ are Borel, then so are $f + g$ and $fg$.

Proof. The functions $F_1(x, y) = x + y$ and $F_2(x, y) = xy$ are continuous functions from $\mathbb{C}^2$ to $\mathbb{C}$, so they are Borel. Since $f, g$ are Borel functions, so is $h = (f, g) : X \to \mathbb{C}^2$ by Proposition 1.36. Thus, the functions $f + g = F_1 \circ h$ and $fg = F_2 \circ h$ are Borel as compositions of Borel functions.

Our next goal in this section is to consider pointwise limits of sequences of Borel functions. Here the robustness of the Borel condition (compared to Riemann integrability) becomes fully apparent. Since limits can be infinite, general results are naturally formulated on the extended real line
\[ \hat{\mathbb{R}} = \mathbb{R} \cup \{-\infty, +\infty\}. \]
With the order relation extended to $\hat{\mathbb{R}}$ in the obvious way, every nonempty subset $A \subset \hat{\mathbb{R}}$ has a least upper bound (i.e., supremum) in $\hat{\mathbb{R}}$, which can be $\pm\infty$. This matches the common usage of $\sup A = +\infty$ in calculus and allows $\sup\{-\infty\} = -\infty$. In particular, for any sequence of functions $f_n : X \to \hat{\mathbb{R}}$, the following are well defined pointwise as functions from $X$ to $\hat{\mathbb{R}}$:
\[ \sup_{n \in \mathbb{N}} f_n, \qquad \inf_{n \in \mathbb{N}} f_n, \qquad \limsup_{n \to \infty} f_n = \inf_{n \in \mathbb{N}} \sup_{k \ge n} f_k, \qquad \liminf_{n \to \infty} f_n = \sup_{n \in \mathbb{N}} \inf_{k \ge n} f_k. \]

We can construct a metric on $\hat{\mathbb{R}}$ by compressing $\mathbb{R}$ into a bounded interval and measuring distances in the image. Such a metric will correctly capture the notion of convergence to $\pm\infty$ used in calculus. Formally, let $\tau : \mathbb{R} \to (c_-, c_+)$ be a strictly increasing bijection for some $-\infty < c_- < c_+ < \infty$. Then $\tau$ and $\tau^{-1}$ are continuous, so $U \subset \mathbb{R}$ is open if and only if $\tau(U)$ is open. Let us extend $\tau$ to a function $\tau : \hat{\mathbb{R}} \to [c_-, c_+]$ by $\tau(\pm\infty) = c_\pm$ and define a metric on $\hat{\mathbb{R}}$ by
\[ d(x, y) = |\tau(x) - \tau(y)| \qquad \forall x, y \in \hat{\mathbb{R}}. \]
Since $d$ is the pullback of the standard metric from $[c_-, c_+]$ to $\hat{\mathbb{R}}$, by definition, a set $U$ is open in $\hat{\mathbb{R}}$ if and only if $\tau(U)$ is open in $[c_-, c_+]$; in particular, the restriction of $d$ to $\mathbb{R}$ generates the standard topology on $\mathbb{R}$. As in the proof of Lemma 1.29, a countable base for $\hat{\mathbb{R}}$ is given by
\[ \hat{\mathcal{U}} = \{(a, b) \mid a, b \in \mathbb{Q}\} \cup \{[-\infty, b) \mid b \in \mathbb{Q}\} \cup \{(a, +\infty] \mid a \in \mathbb{Q}\}. \]
Using $\hat{\mathcal{U}}$, analogously to the proof of Lemma 1.30:

Lemma 1.39. $\mathcal{B}_{\hat{\mathbb{R}}}$ is the σ-algebra generated by the sets $(a, \infty]$, $a \in \mathbb{R}$.

The inclusion $i : \mathbb{R} \to \hat{\mathbb{R}}$ is continuous, so any Borel function $f : X \to \mathbb{R}$ is also Borel when viewed as a function into $\hat{\mathbb{R}}$, $i \circ f : X \to \hat{\mathbb{R}}$ (for a complete description of $\hat{\mathbb{R}}$-valued Borel functions, see Exercise 1.8). The general result about sequences of Borel functions is:


Lemma 1.40. For any sequence of Borel functions $f_n : X \to \hat{\mathbb{R}}$,
\[ \sup_{n \in \mathbb{N}} f_n, \qquad \inf_{n \in \mathbb{N}} f_n, \qquad \limsup_{n \to \infty} f_n, \qquad \liminf_{n \to \infty} f_n \]
are also Borel functions from $X$ to $\hat{\mathbb{R}}$.

Proof. If $f(x) = \sup_{n \in \mathbb{N}} f_n(x)$, then for any $a \in \mathbb{R}$,
\[ f^{-1}((a, \infty]) = \bigcup_{n=1}^\infty f_n^{-1}((a, \infty]). \]
Thus, $\sup_{n \in \mathbb{N}} f_n$ is Borel. It follows that $\inf_{n \in \mathbb{N}} f_n = -\sup_{n \in \mathbb{N}} (-f_n)$ is also a Borel function. Using those results, $\limsup_{n \to \infty} f_n = \inf_{n \in \mathbb{N}} \sup_{k \ge n} f_k$ and $\liminf_{n \to \infty} f_n = \sup_{n \in \mathbb{N}} \inf_{k \ge n} f_k$ are also Borel functions.

We finish this section with some remarks for Borel measures. Let $E$ be a Borel set in $X$. For a Borel measure on $X$, its restriction to $\mathcal{B}_E$ is a Borel measure on $E$. Conversely, a Borel measure $\mu$ on $E$ generates a Borel measure on $X$ by $\nu(A) = \mu(A \cap E)$. This idea of restricting or extending the space can motivate the following definition.

Definition 1.41. Let $\mu$ be a Borel measure on $X$.
(a) The measure $\mu$ is supported on a set $E \in \mathcal{B}_X$ if $\mu(E^c) = 0$.
(b) The support of $\mu$, denoted $\operatorname{supp} \mu$, is the set of all $x \in X$ such that for every open $V$ containing $x$, $\mu(V) > 0$.

Note a linguistic subtlety: To say that $\mu$ is supported on $E$ is not the same as saying that the support of $\mu$ is $E$. The measure can be supported on many different sets, but its support is uniquely defined. For instance, the Dirac measure $\delta_x$ is supported on any set $E$ that contains $x$, and $\operatorname{supp} \delta_x = \{x\}$. The measure on $\mathbb{R}$ defined by $\mu(A) = \#(A \cap \mathbb{Q})$ is supported on the countable set $\mathbb{Q}$, and $\operatorname{supp} \mu = \mathbb{R}$. The support has a useful characterization:

Lemma 1.42. For any Borel measure $\mu$ on a second-countable space $X$, $\operatorname{supp} \mu$ is the smallest closed set $E \subset X$ such that $\mu(E^c) = 0$.

Proof. Let $\mathcal{U}$ be a countable base for $X$. Taking complements of the definition, $(\operatorname{supp} \mu)^c$ is the set of all $x$ for which there is an open set $V$ such that $x \in V$ and $\mu(V) = 0$. This is equivalent to existence of $A \in \mathcal{U}$ with $x \in A \subset V$ and $\mu(A) = 0$. In other words,
\[ (\operatorname{supp} \mu)^c = \bigcup_{\substack{V \text{ open} \\ \mu(V) = 0}} V = \bigcup_{\substack{A \in \mathcal{U} \\ \mu(A) = 0}} A. \tag{1.12} \]
The second union in (1.12) is countable, so from $\mu(A) = 0$ for all $A$, it follows that $\mu((\operatorname{supp} \mu)^c) = 0$; thus, $(\operatorname{supp} \mu)^c$ is an open set of zero measure. By


the first union in (1.12), $(\operatorname{supp} \mu)^c$ is the largest open set of zero measure. Taking complements completes the proof.
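Returning to the metric on the extended real line constructed earlier in this section: the text leaves the compressing map $\tau$ unspecified, and the sketch below assumes one admissible choice, the arctangent, with $c_\pm = \pm\pi/2$. The function names `tau` and `d` mirror the text; the sample points are illustrative.

```python
import math

def tau(x):
    """Strictly increasing map of the extended reals onto [-pi/2, pi/2].

    arctan is an assumed choice; math.atan sends +/-inf to +/-pi/2,
    which matches the extension tau(+/-infinity) = c_+/-.
    """
    return math.atan(x)

def d(x, y):
    """Pullback of the standard metric under tau."""
    return abs(tau(x) - tau(y))

# Distances from 10^k to +infinity shrink as k grows,
# matching the calculus notion of convergence to +infinity.
distances = [d(10.0 ** k, math.inf) for k in range(6)]
print(distances)
print(d(-math.inf, math.inf))  # the diameter of the extended line in this metric
```

Any other strictly increasing bijection onto a bounded open interval would give an equivalent metric, so the particular choice of arctangent affects the numerical distances but not the topology.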

1.4. Lebesgue integration

In this section, we develop integration with respect to a Borel measure. Any set or function appearing below is implied to be Borel, and where other sets or functions are derived from it, they can be proved to be Borel by the material from the previous sections; we will keep such steps implicit.

Definition 1.43. A function $s : X \to \mathbb{C}$ is called simple if it only takes finitely many values, i.e., the set $s(X) = \{s(x) \mid x \in X\}$ is finite. For a positive simple function $s$, we define the integral of $s$ with respect to $\mu$ as
\[ \int s \, d\mu = \sum_{c \in s(X)} c \, \mu(s^{-1}(\{c\})). \tag{1.13} \]

Like all of integration, this formula is motivated by the area of a rectangle as height times base; the strength of Lebesgue integration can be traced to the fact that the bases of our rectangles are arbitrary Borel sets.

Integration theory uses the convention that $c \cdot \infty = \infty$ for $c > 0$ but $0 \cdot \infty = 0$. In formula (1.13), if $s$ takes the value $c = 0$, its contribution to the integral is zero regardless of whether $\mu(s^{-1}(\{0\}))$ is finite or infinite.

A family of sets $\{A_\alpha \mid \alpha \in I\}$ is called a partition of $X$ if $X = \bigcup_{\alpha \in I} A_\alpha$ and $A_\alpha \cap A_\beta = \emptyset$ whenever $\alpha \neq \beta$. The definition (1.13) can be rephrased by using characteristic functions of sets: if $c_1, \dots, c_n$ are distinct elements of $[0, \infty)$ and $A_1, \dots, A_n$ is a partition of $X$, then the integral of the function
\[ s = \sum_{j=1}^n c_j \chi_{A_j} \tag{1.14} \]
is defined by
\[ \int s \, d\mu = \sum_{j=1}^n c_j \mu(A_j). \tag{1.15} \]
It would be cumbersome to always search for this exact partition of a simple function $s$; fortunately, as we are about to see, this is not necessary.

Lemma 1.44. (a) If $A_1, \dots, A_n$ is a partition of $X$ and $c_1, \dots, c_n \ge 0$ (not necessarily distinct), the integral of the function (1.14) is given by (1.15).
(b) If $s$ is a positive simple function and $\lambda \ge 0$, then $\int (\lambda s) \, d\mu = \lambda \int s \, d\mu$.


(c) If $s, t$ are positive simple functions, then $s + t$ is a positive simple function and $\int (s + t) \, d\mu = \int s \, d\mu + \int t \, d\mu$.
(d) If $s, t$ are simple functions and $0 \le s \le t$ pointwise, then $\int s \, d\mu \le \int t \, d\mu$.
(e) If $A_1, \dots, A_n$ are any Borel subsets of $X$ and $c_1, \dots, c_n \in [0, \infty)$, the integral of the function (1.14) is given by (1.15).

Proof. (a) If $c_j = c_k$ for some $j \neq k$, using $\mu(A_j) + \mu(A_k) = \mu(A_j \cup A_k)$ allows us to merge those two sets in the partition without affecting the sum; after finitely many steps, we will end up at the partition used in (1.13).

(b) This follows immediately from (1.14) and (1.15).

(c) Denote by $c_1, \dots, c_n$ the values of $s$, by $d_1, \dots, d_m$ the values of $t$, and denote $A_j = s^{-1}(\{c_j\})$, $B_k = t^{-1}(\{d_k\})$. Then $\{A_j \cap B_k \mid 1 \le j \le n,\ 1 \le k \le m\}$ is a partition such that $s$, $t$, $s + t$ are constant on each set. Written in that partition, the claim to be proved reduces to the obvious equality
\[ \sum_{j=1}^n \sum_{k=1}^m (c_j + d_k) \mu(A_j \cap B_k) = \sum_{j=1}^n \sum_{k=1}^m c_j \mu(A_j \cap B_k) + \sum_{j=1}^n \sum_{k=1}^m d_k \mu(A_j \cap B_k). \]

(d) The function $t - s$ is positive because $s \le t$ and is simple because $s$ and $t$ are simple. Thus, by (c), $\int t \, d\mu = \int s \, d\mu + \int (t - s) \, d\mu \ge \int s \, d\mu$.

(e) By definition, $\int c_j \chi_{A_j} \, d\mu = c_j \mu(A_j) + 0 \cdot \mu(A_j^c) = c_j \mu(A_j)$. Using (c) to take the sum of these functions completes the proof.

If $f$ is a positive simple function, we can conclude
\[ \int f \, d\mu = \sup_{\substack{s \text{ simple} \\ 0 \le s \le f}} \int s \, d\mu \tag{1.16} \]
because $s \le f$ implies $\int s \, d\mu \le \int f \, d\mu$ and equality holds for $s = f$. Noting that the right-hand side makes sense even if $f$ is not simple, we can use it to generalize the integral.

Definition 1.45. For $f : X \to [0, \infty]$, define $\int f \, d\mu$ by (1.16).

Lemma 1.46. (a) If $0 \le f \le g$, then $\int f \, d\mu \le \int g \, d\mu$.
(b) If $f \ge 0$ and $c \in [0, \infty)$, then $\int cf \, d\mu = c \int f \, d\mu$.

Proof. (a) Any simple function $s$ such that $0 \le s \le f$ also obeys $0 \le s \le g$, so the defining supremum for $\int g \, d\mu$ is over a bigger set than that for $\int f \, d\mu$.

(b) The case $c = 0$ is trivial. For $c > 0$, the simple function $s$ obeys $0 \le s \le f$ if and only if the simple function $cs$ obeys $0 \le cs \le cf$.
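Formulas (1.13) and (1.15) for simple functions can be compared directly on a finite measure space. The sketch below is illustrative: the point masses in `weights`, the sets `A1`, `A2`, and the coefficients are assumptions for this example. The sets overlap on purpose, so the check corresponds to Lemma 1.44(e) rather than to the partition case.

```python
from collections import defaultdict
from fractions import Fraction

X = ["a", "b", "c", "d"]
# Hypothetical point masses.
weights = {"a": Fraction(1, 2), "b": Fraction(1, 3),
           "c": Fraction(1), "d": Fraction(4)}

def mu(A):
    return sum((weights[x] for x in A), Fraction(0))

def integral_simple(s):
    """Formula (1.13): sum over the values c of s of c * mu(s^{-1}({c}))."""
    level_sets = defaultdict(set)
    for x in X:
        level_sets[s(x)].add(x)
    return sum((c * mu(A) for c, A in level_sets.items()), Fraction(0))

def indicator(A):
    return lambda x: 1 if x in A else 0

A1, A2 = {"a", "b"}, {"b", "c"}   # overlapping, not a partition
c1, c2 = 2, 3

def s(x):
    return c1 * indicator(A1)(x) + c2 * indicator(A2)(x)

lhs = integral_simple(s)              # via level sets, as in (1.13)
rhs = c1 * mu(A1) + c2 * mu(A2)       # via coefficients, as in (1.15)
print(lhs, rhs)
```

The level sets of `s` are {a}, {b}, {c}, {d} with values 2, 5, 3, 0, which differ from the sets `A1`, `A2` used to write `s`, yet both computations agree.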


Note that $\int f \, d\mu$ can be infinite. A trivial but often used consequence of Lemma 1.46(a) is that if $0 \le f \le g$ and $\int g \, d\mu < \infty$, then $\int f \, d\mu < \infty$.

The first remarkable result of integration theory is:

Theorem 1.47 (Monotone convergence theorem). For any sequence $f_n : X \to [0, \infty]$ such that $f_n \le f_{n+1}$ for all $n \in \mathbb{N}$,
\[ \lim_{n \to \infty} \int f_n \, d\mu = \int \lim_{n \to \infty} f_n \, d\mu. \]

Lemma 1.48. If $s : X \to [0, \infty)$ is a simple function and $E_n$ are sets such that $E_n \subset E_{n+1}$ and $\bigcup_{n \in \mathbb{N}} E_n = X$, then
\[ \lim_{n \to \infty} \int s \chi_{E_n} \, d\mu = \int s \, d\mu. \tag{1.17} \]

Proof. Write $s$ in the form (1.14) with $m$ terms, $s = \sum_{j=1}^m c_j \chi_{A_j}$. For any $n$, the function $s \chi_{E_n}$ is simple and
\[ \int s \chi_{E_n} \, d\mu = \sum_{j=1}^m c_j \mu(A_j \cap E_n). \tag{1.18} \]
By Theorem 1.22(d), $\mu(A_j \cap E_n) \to \mu(A_j)$ as $n \to \infty$. Applying this to each term of (1.18) gives (1.17).

Proof of Theorem 1.47. Denote the pointwise limit by $f = \lim_{n \to \infty} f_n$. Since the sequence is increasing, $f_n \le f_{n+1} \le f$ for all $n$. This implies that the integrals of $f_n$ have a limit and that
\[ \lim_{n \to \infty} \int f_n \, d\mu \le \int f \, d\mu. \]
To prove the converse inequality, fix $c \in (0, 1)$ and a simple function $0 \le s \le f$. Define $E_n = \{x \in X \mid f_n(x) \ge c s(x)\}$. Then $f_n \ge f_n \chi_{E_n} \ge c s \chi_{E_n}$ implies
\[ \int f_n \, d\mu \ge \int c s \chi_{E_n} \, d\mu. \tag{1.19} \]
Note that $E_n \subset E_{n+1}$ because $f_n \le f_{n+1}$. Moreover, let us show $\bigcup_{n \in \mathbb{N}} E_n = X$ by verifying three cases: If $f(x) = 0$, then $s(x) = 0$ so $x \in E_n$ for all $n$. If $f(x) \in (0, \infty)$, then $\lim_{n \to \infty} f_n(x) = f(x) > c f(x) \ge c s(x)$ so $f_n(x) \ge c s(x)$ for large enough $n$. If $f(x) = \infty$, then $\lim_{n \to \infty} f_n(x) = \infty > c s(x)$, so $f_n(x) \ge c s(x)$ for large enough $n$.

By Lemma 1.48, taking $n \to \infty$ in (1.19),
\[ \lim_{n \to \infty} \int f_n \, d\mu \ge \lim_{n \to \infty} \int c s \chi_{E_n} \, d\mu = \int c s \, d\mu = c \int s \, d\mu. \]
Since $c \in (0, 1)$ is arbitrary, this also implies
\[ \lim_{n \to \infty} \int f_n \, d\mu \ge \int s \, d\mu. \]
Taking the supremum over simple functions $s \le f$ completes the proof.
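Monotone convergence can be watched numerically on a finite measure space. The sketch below is an illustration under stated assumptions: the point masses in `weights`, the function `f`, and the particular increasing sequence (dyadic truncations, which round `f` down to a multiple of $2^{-n}$ where $f < n$ and cap it at $n$ elsewhere) are choices made for this example.

```python
import math
from fractions import Fraction

# Hypothetical point masses and a nonnegative function on three points.
weights = {"a": Fraction(1, 2), "b": Fraction(3), "c": Fraction(1, 4)}
f = {"a": Fraction(1, 3), "b": Fraction(5, 2), "c": Fraction(7)}

def integral(g):
    """Integral of a function on a finite set with point masses."""
    return sum((g[x] * weights[x] for x in weights), Fraction(0))

def staircase(n):
    """Dyadic truncation of f at level n (an increasing sequence in n)."""
    return {
        x: (Fraction(math.floor(2 ** n * v), 2 ** n) if v < n else Fraction(n))
        for x, v in f.items()
    }

integrals = [integral(staircase(n)) for n in range(1, 12)]
for n, value in enumerate(integrals, start=1):
    print(n, float(value))
print("limit:", float(integral(f)))
```

The printed integrals increase with `n` and approach the integral of `f` from below. Once `n` exceeds the largest value of `f`, the only remaining error comes from rounding 1/3 down to a dyadic rational.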


To illustrate this abstract Lebesgue integral, let us show that it includes series with positive terms as a special case, and use that to prove a rearrangement theorem for series.

Example 1.49. Consider a sequence $(a_n)_{n=1}^\infty$ with $a_n \ge 0$ for all $n$.
(a) If $\nu$ denotes the counting measure on $\mathbb{N}$ and $f(n) = a_n$ for all $n$, then
\[ \int f \, d\nu = \sum_{j=1}^\infty a_j. \tag{1.20} \]
(b) If $\pi : \mathbb{N} \to \mathbb{N}$ is a bijection, then $\sum_{j=1}^\infty a_j = \sum_{k=1}^\infty a_{\pi(k)}$.

Proof. (a) Denote $E_n = \{1, 2, \dots, n\}$. For each $n \in \mathbb{N}$, the function $\chi_{E_n} f$ is simple, so $\int \chi_{E_n} f \, d\nu = \sum_{j=1}^n f(j)$. Since $\chi_{E_n} f$ is an increasing sequence of functions converging pointwise to $f$, by the monotone convergence theorem, letting $n \to \infty$ gives (1.20).

(b) Repeating the proof of (a) for the sets $E_n = \{\pi(1), \pi(2), \dots, \pi(n)\}$ gives $\int f \, d\nu = \lim_{n \to \infty} \sum_{k=1}^n a_{\pi(k)}$. By (1.20), this completes the proof.

Instead of the definition of the integral as a supremum, it is often useful to use an explicit sequence of simple functions which monotonically converges to $f$ and combine this with the monotone convergence theorem. Such a sequence is constructed in the next lemma, and will be immediately used to prove additivity of the integral.

Lemma 1.50. If $f : X \to [0, \infty]$, there exist simple functions $s_n : X \to [0, \infty)$ such that $s_n \le s_{n+1}$ and $s_n \to f$ pointwise.

Proof. A sequence satisfying these conditions is given by
\[ s_n(x) = \begin{cases} 2^{-n} \lfloor 2^n f(x) \rfloor & 0 \le f(x) < n \\ n & n \le f(x). \end{cases} \]

Lemma 1.51. If $f, g : X \to [0, \infty]$, then $\int (f + g) \, d\mu = \int f \, d\mu + \int g \, d\mu$.

Proof. Pick increasing sequences of simple functions such that $s_n \to f$, $t_n \to g$. Then $s_n + t_n \to f + g$, so taking the limit as $n \to \infty$ of $\int (s_n + t_n) \, d\mu = \int s_n \, d\mu + \int t_n \, d\mu$, monotone convergence gives $\int (f + g) \, d\mu = \int f \, d\mu + \int g \, d\mu$.

Proposition 1.52. For any sequence of functions $g_n : X \to [0, \infty]$,
\[ \int \sum_{n=1}^\infty g_n \, d\mu = \sum_{n=1}^\infty \int g_n \, d\mu. \]


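Proof. This follows by monotone convergence applied to the increasing sequence $f_n = \sum_{j=1}^n g_j$, because $\int f_n \, d\mu = \sum_{j=1}^n \int g_j \, d\mu$ for each $n$.

When going beyond monotone limits, Fatou's lemma will be useful:

Theorem 1.53 (Fatou's lemma). For any functions $f_n : X \to [0, \infty]$,
\[ \int \liminf_{n \to \infty} f_n \, d\mu \le \liminf_{n \to \infty} \int f_n \, d\mu. \]

Proof. Let $g_n = \inf_{k \ge n} f_k$. Then $g_n$ is an increasing sequence of functions and $\lim_{n \to \infty} g_n = \liminf_{k \to \infty} f_k$. By monotone convergence,
\[ \int \liminf_{n \to \infty} f_n \, d\mu = \int \lim_{n \to \infty} g_n \, d\mu = \lim_{n \to \infty} \int g_n \, d\mu. \]
However, $g_n \le f_n$ implies $\int g_n \, d\mu \le \int f_n \, d\mu$, and therefore
\[ \lim_{n \to \infty} \int g_n \, d\mu = \liminf_{n \to \infty} \int g_n \, d\mu \le \liminf_{n \to \infty} \int f_n \, d\mu. \]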
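The inequality in Fatou's lemma can be strict. The sketch below is an assumed example, not one from the text: on a two-point space with counting measure, $f_n$ is the indicator of the single point $n \bmod 2$, so the mass jumps back and forth. Because the sequence has period 2, the infimum over any tail equals the minimum over one period, which makes the lower limits computable exactly.

```python
X = [0, 1]      # two points, counting measure
PERIOD = 2      # the sequence f_n below repeats with period 2

def f(n, x):
    """f_n is the indicator function of the point n mod 2."""
    return 1 if x == n % 2 else 0

def integral(g):
    """Integral with respect to counting measure on X."""
    return sum(g(x) for x in X)

def liminf_pointwise(x):
    # For a periodic sequence, inf over every tail is the min over one period.
    return min(f(n, x) for n in range(1, PERIOD + 1))

lhs = integral(liminf_pointwise)                    # integral of liminf f_n
rhs = min(integral(lambda x, n=n: f(n, x))          # liminf of the integrals
          for n in range(1, PERIOD + 1))
print(lhs, rhs)
```

Every $f_n$ has integral 1, but each point is missed by infinitely many $f_n$, so the pointwise lower limit is identically zero.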

The integral (1.16) should be thought of as the integral of $f$ over the entire space $X$. For Borel subsets $E \subset X$, we define the integral over $E$ as
\[ \int_E f \, d\mu = \int \chi_E f \, d\mu. \]

The following proposition provides a construction of a new measure from another measure and a multiplicative weight.

Proposition 1.54. For any Borel measure $\mu$ on $X$ and $h : X \to [0, \infty]$, another Borel measure $\nu$ on $X$ is defined by
\[ \nu(E) = \int_E h \, d\mu \qquad \forall E \in \mathcal{B}_X. \]
Moreover, for all $g : X \to [0, \infty]$,
\[ \int g \, d\nu = \int g h \, d\mu. \tag{1.21} \]
This measure is commonly described by saying $d\nu = h \, d\mu$.

Proof. Clearly, $\nu(\emptyset) = 0$. For any disjoint sets $E_n$, $n \in \mathbb{N}$, Proposition 1.52 applied to $g_n = h \chi_{E_n}$ gives $\nu\big(\bigcup_{n=1}^\infty E_n\big) = \sum_{n=1}^\infty \nu(E_n)$. Thus, $\nu$ is a measure.

Equality (1.21) holds for all $g = \chi_E$, so by linear combinations, it holds for all simple functions $g$. For an arbitrary $g : X \to [0, \infty]$, use simple functions $s_n \le s_{n+1}$ such that $s_n \to g$. Then $\int s_n \, d\nu = \int s_n h \, d\mu$ for each $n$. Since $s_n h \le s_{n+1} h$, $s_n h \to gh$, applying monotone convergence to both sides of this equality gives (1.21).

Another very useful construction is the pushforward of a measure:


Lemma 1.55. If $\mu$ is a Borel measure on $X$ and $g : X \to Y$ is a Borel function, the pushforward of $\mu$ by $g$ is the Borel measure $\nu$ on $Y$ defined by $\nu(B) = \mu(g^{-1}(B))$. For any Borel function $f : Y \to [0, \infty]$,
\[ \int f \, d\nu = \int (f \circ g) \, d\mu. \tag{1.22} \]

Proof. If the sets $B_n$, $n \in \mathbb{N}$, are disjoint, so are $g^{-1}(B_n)$, so
\[ \mu\Big(g^{-1}\Big(\bigcup_{n=1}^\infty B_n\Big)\Big) = \mu\Big(\bigcup_{n=1}^\infty g^{-1}(B_n)\Big) = \sum_{n=1}^\infty \mu(g^{-1}(B_n)) \]
implies σ-additivity of $\nu$. Also, $\nu(\emptyset) = \mu(\emptyset) = 0$, so $\nu$ is a measure.

For $f = \chi_E$, (1.22) holds by definition. By linearity, (1.22) holds for all simple functions. For simple functions $s_n$ such that $s_n \le s_{n+1}$ and $s_n \to f$, the functions $s_n \circ g$ are also simple and obey $s_n \circ g \le s_{n+1} \circ g$ and $s_n \circ g \to f \circ g$. Since $\int s_n \, d\nu = \int (s_n \circ g) \, d\mu$, taking $n \to \infty$ and applying monotone convergence on both sides proves (1.22).

So far, we have seen integration theory as derived from measure theory. However, sometimes we use integrals to estimate measures:

Lemma 1.56 (Markov's inequality). For any $f : X \to [0, \infty]$ and $c > 0$,
\[ \mu(\{x \mid f(x) \ge c\}) \le \frac{1}{c} \int f \, d\mu. \]

Proof. This follows from $f \ge c \chi_A$ where $A = \{x \mid f(x) \ge c\}$.
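Both the change of variables formula (1.22) and Markov's inequality can be verified on a finite set with point masses. The sketch below is illustrative: the masses in `weights`, the map `g`, and the functions `f` and `F` are assumptions for this example, and exact rational arithmetic is used.

```python
from fractions import Fraction

# Hypothetical point masses on X = {1, 2, 3, 4}.
weights = {1: Fraction(1, 2), 2: Fraction(1, 4), 3: Fraction(2), 4: Fraction(1)}

def mu(A):
    return sum((weights[x] for x in A), Fraction(0))

def integral(func, masses):
    """Integral of func against a measure given by point masses."""
    return sum((func(x) * m for x, m in masses.items()), Fraction(0))

# Markov's inequality for f(x) = x^2 and c = 4.
def f(x):
    return Fraction(x * x)

c = 4
level_set = {x for x in weights if f(x) >= c}
markov_lhs = mu(level_set)
markov_rhs = integral(f, weights) / c

# Pushforward of mu by g(x) = x mod 2: nu(B) = mu(g^{-1}(B)).
def g(x):
    return x % 2

nu = {}
for x, m in weights.items():
    nu[g(x)] = nu.get(g(x), Fraction(0)) + m

def F(y):
    return Fraction(3 * y + 1)

pushforward_lhs = integral(F, nu)                       # integral of F d(nu)
pushforward_rhs = integral(lambda x: F(g(x)), weights)  # integral of F∘g d(mu)

print(markov_lhs, markov_rhs)
print(pushforward_lhs, pushforward_rhs)
```

The pushforward is built by summing the masses of all points with the same image, which is the finite version of $\nu(B) = \mu(g^{-1}(B))$.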

A property is said to hold $\mu$-almost everywhere (or “$\mu$-a.e.”) if there is a set $A$ such that $\mu(A) = 0$ and the property holds for all $x \in A^c$. Sets of measure 0 are negligible in integration theory:

Proposition 1.57. Let $f, g : X \to [0, \infty]$. Then the following hold.
(a) If $\int f \, d\mu < \infty$, then $f < \infty$ holds $\mu$-a.e.
(b) $\int f \, d\mu = 0$ if and only if $f = 0$ $\mu$-a.e.
(c) If $f = g$ $\mu$-a.e., then $\int f \, d\mu = \int g \, d\mu$.

Proof. (a) By Markov's inequality with $c = k \in \mathbb{N}$,
\[ \mu(\{x \mid f(x) = \infty\}) \le \mu(\{x \mid f(x) \ge k\}) \le \frac{1}{k} \int f \, d\mu. \]
Taking $k \to \infty$ proves $\mu(\{x \mid f(x) = \infty\}) = 0$.

(b) Assume that $f = 0$ $\mu$-a.e. For every simple function $0 \le s \le f$, $s = 0$ $\mu$-a.e. Thus, by definition, $\int s \, d\mu = 0$, so taking the supremum over simple


functions $s \le f$, $\int f \, d\mu = 0$. Conversely, if $\int f \, d\mu = 0$, then by Markov's inequality, for every $k \in \mathbb{N}$, $\mu(\{x \mid f(x) \ge 1/k\}) \le k \int f \, d\mu = 0$, so taking the union over $k \in \mathbb{N}$ shows $\mu(\{x \mid f(x) > 0\}) = 0$.

(c) The set $E = \{x \mid f(x) \neq g(x)\}$ obeys $\mu(E) = 0$. Using the decomposition $f = f \chi_E + f \chi_{E^c}$ and (b), $\int f \, d\mu = \int f \chi_{E^c} \, d\mu$. Analogously, $\int g \, d\mu = \int g \chi_{E^c} \, d\mu$. Since $f \chi_{E^c} = g \chi_{E^c}$, this completes the proof.

Since a countable union of sets of zero measure has zero measure, it is common to impose countably many conditions that hold $\mu$-a.e. and assume that they all hold away from the same set of zero measure, like in the following theorem. In cases when $f(x) = \lim_{n \to \infty} f_n(x)$ exists $\mu$-a.e., it is common to consider $f$ to be defined by that equation and to not explicitly specify the value of $f$ on the remaining zero measure set. For example:

Theorem 1.58 (Monotone convergence theorem, again). If functions $f_n : X \to [0, \infty]$ obey $f_n \le f_{n+1}$ $\mu$-a.e. for all $n \in \mathbb{N}$, and $f_n \to f$ $\mu$-a.e., then
\[ \lim_{n \to \infty} \int f_n \, d\mu = \int f \, d\mu. \]

Proof. Denote by $E$ a set such that all assumptions hold on $E$ and $\mu(E^c) = 0$. By monotone convergence, $\int_E f_n \, d\mu \to \int_E f \, d\mu$. Since $f_n = f_n \chi_E$ and $f = f \chi_E$ $\mu$-a.e., the claim follows by Proposition 1.57.

Let us now extend integration to real-valued Borel functions $h : X \to \hat{\mathbb{R}}$. For such $h$, we denote $h_\pm = \max\{\pm h, 0\}$. Note that $h_\pm \ge 0$, $h = h_+ - h_-$, and $|h| = h_+ + h_-$. If at least one of the integrals $\int h_\pm \, d\mu$ is finite, we define
\[ \int h \, d\mu = \int h_+ \, d\mu - \int h_- \, d\mu. \]
For instance, if $f : X \to [0, \infty]$ and $\int f \, d\mu < \infty$, then $\log f \le f$ implies $(\log f)_+ \le f$, so $\int (\log f)_+ \, d\mu < \infty$. Thus, $\int \log f \, d\mu$ is defined, although its value can be $-\infty$. However, in most situations, we will work in the case when both $\int h_\pm \, d\mu$ are finite, and we call such functions $h$ integrable.

Lemma 1.59. $\int |h| \, d\mu < \infty$ if and only if $\int h_+ \, d\mu < \infty$ and $\int h_- \, d\mu < \infty$.

Proof. One implication follows from $h_\pm \le |h|$ and the other from $|h| \le h_+ + h_-$.

Proposition 1.60. Let $\mu$ be a measure on $X$.
(a) If $c \in \mathbb{R}$ and $f : X \to \mathbb{R}$ is integrable, then $cf$ is integrable and
\[ \int (cf) \, d\mu = c \int f \, d\mu. \tag{1.23} \]


(b) If $f, g : X \to \mathbb{R}$ are integrable, then $f + g$ is integrable and
\[ \int (f+g)\,d\mu = \int f\,d\mu + \int g\,d\mu. \tag{1.24} \]

Proof. (a) Integrability of $cf$ follows from $\int |cf|\,d\mu = |c| \int |f|\,d\mu < \infty$. Equation (1.23) follows from Lemma 1.46 with the observation that for $c \ge 0$, $(cf)_\pm = c f_\pm$, and for $c < 0$, $(cf)_\pm = (-c) f_\mp$.

(b) Integrability of $h = f + g$ follows from the triangle inequality, since
\[ \int |f+g|\,d\mu \le \int (|f| + |g|)\,d\mu = \int |f|\,d\mu + \int |g|\,d\mu < \infty. \tag{1.25} \]
Since $h_+ - h_- = f_+ - f_- + g_+ - g_-$ implies $h_+ + f_- + g_- = h_- + f_+ + g_+$, additivity of integrals of positive functions implies
\[ \int h_+\,d\mu + \int f_-\,d\mu + \int g_-\,d\mu = \int h_-\,d\mu + \int f_+\,d\mu + \int g_+\,d\mu. \]
Regrouping terms gives (1.24).

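We will now further generalize integration to complex-valued functions.

Definition 1.61. We denote by $L^1(X, d\mu)$ the set of $f : X \to \mathbb{C}$ such that $\int |f|\,d\mu < \infty$. Consistently with prior terminology, we call such functions $f$ integrable.

Lemma 1.62. $f$ is integrable if and only if $\operatorname{Re} f$ and $\operatorname{Im} f$ are integrable.

Proof. One direction follows from $|\operatorname{Re} f| \le |f|$ and $|\operatorname{Im} f| \le |f|$, and the other from $|f| \le |\operatorname{Re} f| + |\operatorname{Im} f|$.

For $f \in L^1(X, d\mu)$ we define
\[ \int f\,d\mu = \int \operatorname{Re} f\,d\mu + i \int \operatorname{Im} f\,d\mu. \]

Lemma 1.63. (a) If $c \in \mathbb{C}$ and $f \in L^1(X, d\mu)$, then $cf \in L^1(X, d\mu)$ and (1.23) holds.

(b) If $f, g \in L^1(X, d\mu)$, then $f + g \in L^1(X, d\mu)$ and (1.24) holds.

Proof. (a) Integrability of $cf$ follows from $\int |cf|\,d\mu = |c| \int |f|\,d\mu < \infty$. The equality (1.23) follows from the real-valued case by $\operatorname{Re}(cf) = \operatorname{Re} c \operatorname{Re} f - \operatorname{Im} c \operatorname{Im} f$ and $\operatorname{Im}(cf) = \operatorname{Re} c \operatorname{Im} f + \operatorname{Im} c \operatorname{Re} f$.

(b) Repeating argument (1.25) shows that $f + g \in L^1(X, d\mu)$. Equality (1.24) follows from $\operatorname{Re}(f+g) = \operatorname{Re} f + \operatorname{Re} g$ and $\operatorname{Im}(f+g) = \operatorname{Im} f + \operatorname{Im} g$.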


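Lemma 1.64. If $f \in L^1(X, d\mu)$, then
\[ \left| \int f\,d\mu \right| \le \int |f|\,d\mu. \]

Proof. Pick $\omega \in \mathbb{C}$ such that $|\omega| = 1$ and $\omega \int f\,d\mu = \left| \int f\,d\mu \right|$. Then
\[ \left| \int f\,d\mu \right| = \omega \int f\,d\mu = \operatorname{Re} \int \omega f\,d\mu = \int \operatorname{Re}(\omega f)\,d\mu. \]
Using $\operatorname{Re}(\omega f) \le |\omega f| = |f|$ completes the proof.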

Theorem 1.65 (Dominated convergence theorem). Consider a sequence of $f_n \in L^1(X, d\mu)$ dominated by some $g \in L^1(X, d\mu)$ in the sense that
\[ |f_n(x)| \le g(x) \tag{1.26} \]
for all $n \in \mathbb{N}$ and $\mu$-a.e. $x$. Assume that $f_n$ converge pointwise $\mu$-a.e. to a Borel function $f$. Then $f \in L^1(X, d\mu)$,
\[ \lim_{n\to\infty} \int |f_n - f|\,d\mu = 0 \tag{1.27} \]
and
\[ \lim_{n\to\infty} \int f_n\,d\mu = \int f\,d\mu. \tag{1.28} \]

Proof. From (1.26), by passing to pointwise limits, it follows that $|f| \le g$ $\mu$-a.e., so $f \in L^1(X, d\mu)$. Define $h_n = 2g - |f_n - f| \ge 0$. Since $h_n \to 2g$ $\mu$-a.e., by Fatou's lemma,
\[ \int 2g\,d\mu \le \liminf_{n\to\infty} \int (2g - |f_n - f|)\,d\mu. \]
Since $2g$ and $|f_n - f|$ are integrable, we can subtract the constant $\int 2g\,d\mu$ from both sides and multiply by $-1$ to obtain
\[ \limsup_{n\to\infty} \int |f_n - f|\,d\mu \le 0. \]
By a trivial lower bound, this implies (1.27). Now (1.28) follows from
\[ \left| \int f_n\,d\mu - \int f\,d\mu \right| \le \int |f_n - f|\,d\mu. \]

Lebesgue integration does not include conditionally convergent integrals (note that $f$ is integrable if and only if $|f|$ is integrable), but this is usually not an important limitation.

The connection with series from Example 1.49 motivates:

Definition 1.66. Let $\nu$ denote counting measure on a set $\Gamma$. If $f : \Gamma \to [0,\infty]$ or $f : \Gamma \to \mathbb{C}$ with $\int |f|\,d\nu < \infty$, we define
\[ \sum_{j \in \Gamma} f(j) = \int f\,d\nu. \]


This gives a notion of summation over any set, in a way that does not include conditionally convergent series but is independent of ordering.

The steps in the definition of the integral, from positive to complex functions, are reflected in many proofs in integration theory. For instance:

Proposition 1.67. Let $d\nu = h\,d\mu$ in the notation of Proposition 1.54. For any $f : X \to \mathbb{C}$, $f \in L^1(X, d\nu)$ if and only if $fh \in L^1(X, d\mu)$. If this holds, then
\[ \int f\,d\nu = \int f h\,d\mu. \tag{1.29} \]

Proof. By Proposition 1.54, equation (1.29) holds for positive functions. Thus, applying it to $|f|$ shows $\int |f|\,d\nu = \int |f| h\,d\mu$, which proves the first claim. If $f$ is real-valued, applying (1.29) to $f_\pm$ gives $\int f_\pm\,d\nu = \int f_\pm h\,d\mu$. Using $f = f_+ - f_-$ and subtracting integrals gives $\int f\,d\nu = \int f h\,d\mu$, so (1.29) holds for real-valued functions. Likewise, if $f$ is complex-valued, using $f = \operatorname{Re} f + i \operatorname{Im} f$ and applying (1.29) to $\operatorname{Re} f$, $\operatorname{Im} f$ shows that it holds for $f$.
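The three-step pattern of this proof (positive, then real-valued, then complex-valued functions) can be traced on a finite model. The following is a minimal sketch, assuming a hypothetical four-point space with point masses; the names `mu`, `h`, `re`, `im` and the helper functions are illustrative choices, not notation from the text. Exact rational arithmetic is used so that both sides of (1.29) can be compared with `==`.

```python
from fractions import Fraction

# Hypothetical finite model (not from the text): X = {0, 1, 2, 3},
# mu is a sum of point masses, h >= 0 is a density, and f = re + i*im.
mu = {0: Fraction(1, 2), 1: Fraction(1), 2: Fraction(3, 2), 3: Fraction(2)}
h = {0: Fraction(2), 1: Fraction(0), 2: Fraction(1, 3), 3: Fraction(5)}
re = {0: Fraction(1), 1: Fraction(0), 2: Fraction(-3), 3: Fraction(2)}
im = {0: Fraction(-1), 1: Fraction(4), 2: Fraction(0), 3: Fraction(2)}


def integral_pos(g, weights):
    """Integral of a function g >= 0 against the point masses in `weights`."""
    return sum(g[x] * weights[x] for x in weights)


def integral_real(g, weights):
    """Integral of a real function as the difference of the integrals of g_+ and g_-."""
    g_plus = {x: max(g[x], Fraction(0)) for x in g}
    g_minus = {x: max(-g[x], Fraction(0)) for x in g}
    return integral_pos(g_plus, weights) - integral_pos(g_minus, weights)


def integral_complex(re_part, im_part, weights):
    """Integral of re_part + i*im_part, returned as (real part, imaginary part)."""
    return integral_real(re_part, weights), integral_real(im_part, weights)


# d(nu) = h d(mu): on a finite set this just rescales each point mass.
nu = {x: h[x] * mu[x] for x in mu}

lhs = integral_complex(re, im, nu)
rhs = integral_complex(
    {x: re[x] * h[x] for x in mu},
    {x: im[x] * h[x] for x in mu},
    mu,
)
```

Here `lhs` plays the role of the integral of $f$ with respect to $\nu$ and `rhs` the integral of $fh$ with respect to $\mu$; in this finite setting every function is integrable, so the example only illustrates the equality, not the integrability criterion.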

1.5. Lebesgue–Stieltjes measures on ℝ

In this section, we study measures on $\mathbb{R}$. Since the Borel σ-algebra on $\mathbb{R}$ is generated by intervals, it is natural to try to understand a measure on $\mathbb{R}$ by examining how it acts on intervals. This motivates the following definition:

Definition 1.68. Let $\mu$ be a Borel measure on $\mathbb{R}$. A function $\alpha : \mathbb{R} \to \mathbb{R}$ is called a distribution function of $\mu$ if
\[ \mu((x, y]) = \alpha(y) - \alpha(x) \qquad \forall x, y \in \mathbb{R},\ x < y. \tag{1.30} \]

Example 1.69. $\chi_{[x_0,\infty)}$ is a distribution function for the Dirac measure $\delta_{x_0}$.

If a distribution function exists, it is determined uniquely up to an additive constant. Its existence is considered in the following lemma:

Lemma 1.70. For a Borel measure $\mu$ on $\mathbb{R}$, the following are equivalent:

(a) $\mu$ is finite on compacts;

(b) $\mu((x, y]) < \infty$ for all $x, y \in \mathbb{R}$ with $x < y$;

(c) $\mu$ has a distribution function.

Proof. Every compact set $K \subset \mathbb{R}$ is contained in some interval $(-C, C] \subset [-C, C]$, so (a) and (b) are equivalent. If $\mu$ has a distribution function, then $\mu((x, y]) < \infty$ for all $x, y \in \mathbb{R}$ by (1.30), so (c) implies (b). To prove that (b) implies (c), define
\[ \alpha(x) = \begin{cases} \mu((0, x]) & x > 0 \\ 0 & x = 0 \\ -\mu((x, 0]) & x < 0. \end{cases} \tag{1.31} \]

The property (1.30) follows from (1.31) by additivity of $\mu$, applied on a case-by-case basis depending on the signs of $x, y$.

By (1.30), the distribution function is an increasing function, i.e., $x < y$ implies $\alpha(x) \le \alpha(y)$. We recall some properties of increasing functions:

Lemma 1.71. Any increasing function $\alpha : \mathbb{R} \to \mathbb{R}$ has the following properties.

(a) The function $\alpha$ has one-sided limits
\[ \alpha_+(x) = \lim_{t \downarrow x} \alpha(t), \qquad x \in \mathbb{R} \cup \{-\infty\}, \]
\[ \alpha_-(x) = \lim_{t \uparrow x} \alpha(t), \qquad x \in \mathbb{R} \cup \{+\infty\}, \]
which are themselves increasing functions of $x$.

(b) $\alpha_-(x) \le \alpha(x) \le \alpha_+(x)$ for all $x \in \mathbb{R}$.

(c) $\alpha_+(x) \le \alpha_-(y)$ for all $x, y \in \hat{\mathbb{R}}$ with $x < y$.

(d) $(\alpha_-)_+ = \alpha_+$ and $(\alpha_+)_- = \alpha_-$.

Proof. (a) Let us begin by defining
\[ \alpha_+(x) = \inf_{t > x} \alpha(t), \qquad \alpha_-(x) = \sup_{t < x} \alpha(t). \tag{1.32} \]
For any $c > \alpha_+(x)$ there exists $t > x$ such that $\alpha(t) < c$, so for all $y \in (x, t)$, $\alpha_+(x) \le \alpha(y) < c$. This implies that $\alpha_+(x)$ is the right limit of $\alpha$ at $x$. Since taking the infimum over a larger set can only give a smaller value, $\alpha_+(x) \le \alpha_+(y)$ if $x < y$. The statements for $\alpha_-$ follow analogously.

(b) Since $\alpha(x) \le \alpha(t)$ for all $t > x$, taking the limit as $t \downarrow x$ implies that $\alpha(x) \le \alpha_+(x)$. Similarly, $\alpha_-(x) \le \alpha(x)$.

(c) Picking $t \in (x, y)$ and using (1.32), we obtain $\alpha_+(x) \le \alpha(t) \le \alpha_-(y)$.

(d) The first claim follows from the squeeze theorem applied to $\alpha_+(x) \le \alpha_-(t) \le \alpha(t)$ as $t \downarrow x$. The second claim is proved analogously.


Varying interval endpoints provides additional links between $\mu$ and $\alpha$:

Lemma 1.72. If $\mu$ is a measure on $\mathbb{R}$ with a distribution function $\alpha$, then the following hold.

(a) $\alpha$ is right-continuous, i.e., $\alpha_+(x) = \alpha(x)$ for all $x \in \mathbb{R}$.

(b) For all $x, y \in \mathbb{R}$ with $x < y$, we have $\mu((x, y)) = \alpha_-(y) - \alpha_+(x)$.

(c) For any $x, y \in \mathbb{R}$ with $x \le y$, we have $\mu([x, y]) = \alpha_+(y) - \alpha_-(x)$.

Proof. (a) For any $x \in \mathbb{R}$, by considering a decreasing sequence of intervals,
\[ \lim_{n\to\infty} (\alpha(x + 1/n) - \alpha(x)) = \lim_{n\to\infty} \mu((x, x + 1/n]) = \mu(\emptyset). \]
This gives $\alpha_+(x) - \alpha(x) = 0$, so $\alpha$ is right-continuous.

(b) This is proved similarly by computing $\lim_{n\to\infty} \mu((x + 1/n, y - 1/n])$.

(c) This is proved similarly by computing $\lim_{n\to\infty} \mu((x - 1/n, y + 1/n])$.

This discussion of distribution functions has been merely a warmup; some further calculations of this kind, which compute the measures of other intervals, singletons, and arbitrary open sets $V \subset \mathbb{R}$, are left as Exercises 1.14 and 1.15. We turn instead to the first of two important results in this section: the construction of measures with prescribed distribution functions.

Theorem 1.73. For any increasing right-continuous function $\alpha : \mathbb{R} \to \mathbb{R}$, there exists a Borel measure $\mu_\alpha$ on $\mathbb{R}$ such that
\[ \mu_\alpha((a, b]) = \alpha(b) - \alpha(a) \qquad \forall a, b \in \mathbb{R},\ a < b. \]
This measure is called the Lebesgue–Stieltjes measure corresponding to $\alpha$.

The proof will use Carathéodory's theorem. Consider the family
\[ \mathcal{E} = \{\emptyset\} \cup \{(a, b) \mid a, b \in \hat{\mathbb{R}},\ a < b\}, \tag{1.33} \]
fix an arbitrary increasing function $\alpha : \mathbb{R} \to \mathbb{R}$, and define $\rho : \mathcal{E} \to [0, \infty]$ by
\[ \rho(\emptyset) = 0, \qquad \rho((a, b)) = \alpha_-(b) - \alpha_+(a) \qquad \forall a, b \in \hat{\mathbb{R}},\ a < b. \]
This weight generates an outer measure $\mu^*$ by (1.5). Our goal is to prove that all Borel sets are measurable with respect to $\mu^*$ and that, if $\alpha$ is right-continuous, the resulting Borel measure has distribution function $\alpha$.

The first step is to determine the outer measure of intervals. To pass from countable covers to finite covers, it is useful to first consider the compact case:

Lemma 1.74. For any $p, q \in \mathbb{R}$ with $p \le q$, $\mu^*([p, q]) = \alpha_+(q) - \alpha_-(p)$.


Proof. $[p, q] \subset (p - \epsilon, q + \epsilon)$ implies $\mu^*([p, q]) \le \alpha_-(q + \epsilon) - \alpha_+(p - \epsilon)$ for any $\epsilon > 0$. Letting $\epsilon \to 0$, we get $\mu^*([p, q]) \le \alpha_+(q) - \alpha_-(p)$.

Conversely, consider any countable cover of $[p, q]$ by open intervals $I_j$, $j \in \mathbb{N}$. By compactness, this cover has a finite subcover; among all finite subcovers, consider one with the smallest possible number of intervals, and denote the intervals by $(a_j, b_j)$, $j = 1, \dots, n$. Minimality implies that $a_j \ne a_k$ and $b_j \ne b_k$ for $j \ne k$, otherwise one of the intervals $(a_j, b_j)$, $(a_k, b_k)$ would contain the other and could be removed from the cover. Label the intervals so that $a_1 < a_2 < \dots < a_n$. Minimality further implies that $b_1 < b_2 < \dots < b_n$ (otherwise $(a_{j+1}, b_{j+1}) \subset (a_j, b_j)$ for some $j$). Moreover, $a_1 < p < b_1$ since $p$ is covered and each interval intersects $[p, q]$. Analogously, $a_n < q < b_n$. Finally, $a_{k+1} < b_k$ for $1 \le k \le n - 1$, otherwise the point $b_k$ would not be covered. Thus, $\alpha_+(a_{k+1}) \le \alpha_-(b_k)$, so
\[ \sum_{k=1}^{n-1} (\alpha_-(b_k) - \alpha_+(a_k)) \ge \sum_{k=1}^{n-1} (\alpha_+(a_{k+1}) - \alpha_+(a_k)) = \alpha_+(a_n) - \alpha_+(a_1). \]
Adding $\alpha_-(b_n) - \alpha_+(a_n)$ and using $\alpha_-(b_n) \ge \alpha_+(q)$ and $\alpha_+(a_1) \le \alpha_-(p)$,
\[ \sum_{k=1}^{n} (\alpha_-(b_k) - \alpha_+(a_k)) \ge \alpha_+(q) - \alpha_-(p). \]
Thus, the sum of weights over this finite subcover is bounded below by $\alpha_+(q) - \alpha_-(p)$. This lower bound then also applies to the original countable cover, which was arbitrary, so $\mu^*([p, q]) \ge \alpha_+(q) - \alpha_-(p)$.

Compactness was used crucially to obtain the lower bound for the outer measure. Using the result for compact intervals, it becomes easy to compute outer measures of other intervals:

Lemma 1.75. For any open interval $(a, b) \subset \mathbb{R}$, $\mu^*((a, b)) = \rho((a, b)) = \alpha_-(b) - \alpha_+(a)$.

Proof. The trivial cover of $(a, b)$ by itself shows $\mu^*((a, b)) \le \alpha_-(b) - \alpha_+(a)$. For any compact interval $[p, q] \subset (a, b)$,
\[ \mu^*((a, b)) \ge \mu^*([p, q]) = \alpha_+(q) - \alpha_-(p). \]
Taking limits $p \downarrow a$, $q \uparrow b$ proves $\mu^*((a, b)) \ge \alpha_-(b) - \alpha_+(a)$.

Lemma 1.76. For any half-open interval $(a, c] \subset \mathbb{R}$, $\mu^*((a, c]) = \alpha_+(c) - \alpha_+(a)$.

Proof. For any $b > c$, $(a, c] \subset (a, b)$ implies
\[ \mu^*((a, c]) \le \mu^*((a, b)) = \alpha_-(b) - \alpha_+(a), \]
so taking the limit $b \downarrow c$ gives $\mu^*((a, c]) \le \alpha_+(c) - \alpha_+(a)$.


Conversely, for any $p \in (a, c]$, $[p, c] \subset (a, c]$ implies
\[ \mu^*((a, c]) \ge \mu^*([p, c]) = \alpha_+(c) - \alpha_-(p), \]
and taking the limit $p \downarrow a$ gives $\mu^*((a, c]) \ge \alpha_+(c) - \alpha_+(a)$.

Lemma 1.77. For any $I \in \mathcal{E}$ and any $c \in \mathbb{R}$,
\[ \mu^*(I) = \mu^*(I \cap (-\infty, c]) + \mu^*(I \cap (c, \infty)). \tag{1.34} \]

Proof. If $I \subset (-\infty, c]$ or $I \subset (c, \infty)$, this is trivial. In the case when $I = (a, b)$ intersects both $(-\infty, c]$ and $(c, \infty)$, both sides of (1.34) can be computed by Lemmas 1.75 and 1.76, so (1.34) follows from the trivial identity
\[ \alpha_-(b) - \alpha_+(a) = \alpha_+(c) - \alpha_+(a) + \alpha_-(b) - \alpha_+(c). \]

Lemma 1.78. For any $c \in \mathbb{R}$, $(c, \infty)$ is measurable with respect to $\mu^*$.

Proof. Consider a set $E \subset \mathbb{R}$ and a countable cover of $E$ by elementary sets, $\{I_j\}_{j=1}^\infty$. By Lemmas 1.75 and 1.77,
\[ \sum_{j=1}^\infty \rho(I_j) = \sum_{j=1}^\infty \mu^*(I_j \cap (-\infty, c]) + \sum_{j=1}^\infty \mu^*(I_j \cap (c, \infty)). \]
The sets $I_j \cap (-\infty, c]$ cover $E \cap (-\infty, c]$ and the sets $I_j \cap (c, \infty)$ cover $E \cap (c, \infty)$, so by σ-subadditivity of the outer measure,
\[ \sum_{j=1}^\infty \rho(I_j) \ge \mu^*(E \cap (-\infty, c]) + \mu^*(E \cap (c, \infty)). \]
Taking the infimum over all countable covers $\{I_j\}_{j=1}^\infty$ gives
\[ \mu^*(E) \ge \mu^*(E \cap (-\infty, c]) + \mu^*(E \cap (c, \infty)). \tag{1.35} \]

The opposite inequality holds by subadditivity. Thus, equality holds in (1.35) for any $E \subset \mathbb{R}$, so $(c, \infty)$ is measurable with respect to $\mu^*$.

Proof of Theorem 1.73. By Carathéodory's theorem, the set $\mathcal{A}$ of measurable sets with respect to $\mu^*$ is a σ-algebra and $\mu^*$ is a measure on $\mathcal{A}$. Since $\mathcal{A}$ contains all intervals of the form $(c, \infty)$, it contains $\mathcal{B}_{\mathbb{R}}$. Therefore, the restriction of $\mu^*$ to $\mathcal{B}_{\mathbb{R}}$ is a Borel measure, denoted by $\mu_\alpha$. Right-continuity of $\alpha$ gives $\alpha_+ = \alpha$, so by Lemma 1.76, $\mu_\alpha$ has distribution function $\alpha$.

For the function $\alpha(x) = x$, the corresponding measure is called the (one-dimensional) Lebesgue measure and is denoted by $m$ or $m_1$. We warn the reader that Lebesgue measure of a set is not tightly related to its cardinality (there exist uncountable sets of zero Lebesgue measure, e.g., the middle third Cantor set), or to its topological properties (there exist sets with empty interior but positive Lebesgue measure; Exercise 1.16).
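The interval formulas of Lemma 1.72 can be tried out on a concrete distribution function. The sketch below assumes a hypothetical $\alpha$ with slope 1 and a unit jump at 0 (so the corresponding measure would be Lebesgue measure plus a point mass at 0); the one-sided limits are written out by hand for this particular $\alpha$ rather than computed numerically, and all function names are illustrative.

```python
from fractions import Fraction


def alpha(x):
    """Hypothetical right-continuous distribution function: slope 1, unit jump at 0."""
    return x + 1 if x >= 0 else x


def alpha_minus(x):
    """Left limit of alpha, written out by hand for this particular alpha."""
    return x + 1 if x > 0 else x


def alpha_plus(x):
    """Right limit of alpha; it equals alpha because alpha is right-continuous."""
    return alpha(x)


def mu_half_open(a, b):
    """Measure of (a, b], as in (1.30)."""
    return alpha(b) - alpha(a)


def mu_open(a, b):
    """Measure of (a, b), as in Lemma 1.72(b)."""
    return alpha_minus(b) - alpha_plus(a)


def mu_closed(a, b):
    """Measure of [a, b], as in Lemma 1.72(c)."""
    return alpha_plus(b) - alpha_minus(a)


def mu_point(x):
    """Measure of the singleton {x}, viewed as the closed interval [x, x]."""
    return mu_closed(x, x)


half = Fraction(1, 2)
jump_at_zero = mu_point(0)      # the atom created by the jump of alpha
no_atom_at_half = mu_point(half)  # alpha is continuous at 1/2
```

The jump of $\alpha$ at 0 shows up as the measure of the singleton $\{0\}$, and the half-open interval $(-1, 0]$ splits as the open interval $(-1, 0)$ plus that singleton.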


Integration with respect to Lebesgue measure generalizes Riemann integration (Exercise 1.17), so integration with respect to Lebesgue measure is commonly denoted by $\int_a^b f(x)\,dx := \int_{[a,b]} f\,dm$ for $f \in L^1([a, b], dm)$.

The construction of Lebesgue–Stieltjes measures is complemented by an important uniqueness result:

Theorem 1.79. If two Borel measures on $\mathbb{R}$ have the same distribution function, they are equal.

The usual strategy suggests that, for two such measures $\mu, \nu$, we should prove that $\mathcal{S} = \{E \in \mathcal{B}_{\mathbb{R}} \mid \mu(E) = \nu(E)\}$ is a σ-algebra. However, knowing that $\mu(E_n) = \nu(E_n)$ for some sets $E_n \in \mathcal{B}_{\mathbb{R}}$ does not allow us to compare $\mu\left(\bigcup_{n=1}^\infty E_n\right)$ with $\nu\left(\bigcup_{n=1}^\infty E_n\right)$, since we cannot compute the measures of the unions. We notice the mismatch between the conditions for a σ-algebra, which must be closed under all countable unions, and σ-additivity for a measure, which only says something for disjoint countable unions. This is precisely the kind of obstacle for which monotone classes are needed.

Proof. We use the family of left-open intervals,
\[ \mathcal{J} = \{(a, b] \mid a \in \mathbb{R} \cup \{-\infty\},\ b \in \mathbb{R},\ a < b\} \cup \{(a, +\infty) \mid a \in \mathbb{R} \cup \{-\infty\}\}, \]
and the family of their finite disjoint unions,
\[ \mathcal{G} = \{\emptyset\} \cup \Big\{ \bigcup_{j=1}^n I_j \;\Big|\; n \in \mathbb{N},\ I_j \in \mathcal{J} \text{ for all } j,\ I_j \cap I_k = \emptyset \text{ if } j \ne k \Big\}. \]
Since $\mathcal{G}$ contains all half-lines, it generates the σ-algebra $\mathcal{B}_{\mathbb{R}}$. But unlike the family of half-lines, the family $\mathcal{G}$ is an algebra. To prove this, observe that $I \in \mathcal{J}$ implies $I^c \in \mathcal{G}$ and that $I_1, I_2 \in \mathcal{J}$ implies $I_1 \cap I_2 \in \mathcal{G}$; thus, $E \in \mathcal{G}$ implies $E^c \in \mathcal{G}$ and $E, F \in \mathcal{G}$ implies $E \cap F \in \mathcal{G}$.

Fix $k \in \mathbb{N}$ and consider the set
\[ \mathcal{C}_k = \{E \in \mathcal{B}_{\mathbb{R}} \mid \mu(E \cap (-k, k]) = \nu(E \cap (-k, k])\}. \]
If $E \in \mathcal{G}$, then $E \cap (-k, k]$ is a finite disjoint union of intervals $(a_j, b_j]$ with $-k \le a_j < b_j \le k$. Since $\mu((a_j, b_j]) = \nu((a_j, b_j])$, by additivity, $E \in \mathcal{C}_k$.

Let $(E_n)_{n=1}^\infty$ be an increasing or decreasing sequence in $\mathcal{C}_k$ and let $E$ be its limit, $E = \bigcup E_n$ if $E_n$ is increasing and $E = \bigcap E_n$ if $E_n$ is decreasing. Since $\mu((-k, k]) = \nu((-k, k]) < \infty$, by dominated convergence with the dominating function $\chi_{(-k,k]}$,
\[ \mu(E \cap (-k, k]) = \lim_{n\to\infty} \mu(E_n \cap (-k, k]) = \lim_{n\to\infty} \nu(E_n \cap (-k, k]) = \nu(E \cap (-k, k]), \]
so $E \in \mathcal{C}_k$. Thus, $\mathcal{C}_k$ is a monotone class and $\mathcal{G} \subset \mathcal{C}_k \subset \mathcal{B}_{\mathbb{R}}$. By the monotone class theorem, $\mathcal{C}_k = \mathcal{B}_{\mathbb{R}}$.


Thus, for all $E \in \mathcal{B}_{\mathbb{R}}$ and $k \in \mathbb{N}$, $\mu(E \cap (-k, k]) = \nu(E \cap (-k, k])$. By monotone convergence, the limit $k \to \infty$ gives $\mu(E) = \nu(E)$.

1.6. Product measures

A metric space is called σ-compact if it can be written as a countable union of compact sets. We give a quick construction of product measures on σ-compact product spaces.

Theorem 1.80. If $\mu, \nu$ are measures on σ-compact spaces $X, Y$ and $\mu, \nu$ are finite on compacts, then for every Borel function $f : X \times Y \to [0, \infty]$ the following hold.

(a) For every $y \in Y$, the function $X \to [0, \infty]$, $x \mapsto f(x, y)$ is Borel.

(b) The function $Y \to [0, \infty]$, $y \mapsto \int f(x, y)\,d\mu(x)$ is Borel.

(c) For every $x \in X$, the function $Y \to [0, \infty]$, $y \mapsto f(x, y)$ is Borel.

(d) The function $X \to [0, \infty]$, $x \mapsto \int f(x, y)\,d\nu(y)$ is Borel.

(e) Iterated integrals of $f$ are independent of order of integration, i.e.,
\[ \int \left( \int f(x, y)\,d\mu(x) \right) d\nu(y) = \int \left( \int f(x, y)\,d\nu(y) \right) d\mu(x). \]

Proof. Denote by $\mathcal{M}$ the class of all Borel functions $f : X \times Y \to [0, \infty]$ with the desired properties. If $f, g \in \mathcal{M}$, then $f + g \in \mathcal{M}$ by additivity of integrals. Moreover, if $f_n$ are a pointwise increasing sequence of functions in $\mathcal{M}$, then the pointwise limit $f = \lim_{n\to\infty} f_n$ is also in $\mathcal{M}$ by monotone convergence and because pointwise limits of Borel functions are Borel. We will use these observations repeatedly.

We call a rectangle a set $R = A \times B$ where $A \in \mathcal{B}_X$, $B \in \mathcal{B}_Y$. For any rectangle $R$, $\chi_R \in \mathcal{M}$ by a straightforward verification. Consider the family of finite disjoint unions of rectangles,
\[ \mathcal{G} = \{\emptyset\} \cup \Big\{ \bigcup_{j=1}^n R_j \;\Big|\; R_j \text{ are rectangles},\ R_j \cap R_k = \emptyset \text{ if } j \ne k \Big\}. \]
By additivity, $E \in \mathcal{G}$ implies $\chi_E \in \mathcal{M}$. Note also that $\mathcal{G}$ is an algebra in $X \times Y$ because the intersection of two rectangles is a rectangle and the complement of a rectangle is a disjoint union of rectangles,
\[ (A \times B)^c = (A^c \times B) \cup (A \times B^c) \cup (A^c \times B^c). \]
Fix compacts $K \subset X$ and $L \subset Y$ and denote
\[ \mathcal{C} = \{F \in \mathcal{B}_{X \times Y} \mid \chi_F \chi_{K \times L} \in \mathcal{M}\}. \]


Since $F \in \mathcal{G}$ implies $F \cap (K \times L) \in \mathcal{G}$, it follows that $\mathcal{G} \subset \mathcal{C}$. Note also that $F \in \mathcal{C}$ implies $F^c \in \mathcal{C}$ by subtracting from $\chi_{K \times L}$ (there are no infinities in that subtraction, so this is an algebraic verification).

Let us prove that $\mathcal{C}$ is a monotone class. If $(F_n)_{n=1}^\infty$ is an increasing sequence in $\mathcal{C}$ and $F = \bigcup_{n=1}^\infty F_n$, the pointwise increasing sequence $\chi_{F_n} \chi_{K \times L} \to \chi_F \chi_{K \times L}$ implies that $F \in \mathcal{C}$. For a decreasing sequence $(F_n)_{n=1}^\infty$, passing to complements reduces to increasing sequences.

By the monotone class theorem, $\mathcal{C}$ contains the σ-algebra generated by $\mathcal{G}$. Since compact metric spaces are separable, so are σ-compact metric spaces. Thus, they are second countable (Exercise 1.2). By Lemma 1.33, $X \times Y$ has a countable base consisting of rectangles, so $\mathcal{C}$ contains all open sets, and therefore $\mathcal{B}_{X \times Y}$.

Since $X, Y$ are σ-compact, they have countable covers by compacts $K_n$, $L_n$, respectively. Since finite unions of compact sets are compact, we can assume sequences $K_n$, $L_n$ to be increasing. For any $F \in \mathcal{B}_{X \times Y}$ and $n \in \mathbb{N}$, $\chi_F \chi_{K_n \times L_n} \in \mathcal{M}$. Taking the increasing limit as $n \to \infty$ gives $\chi_F \in \mathcal{M}$.

By additivity, $\mathcal{M}$ contains all simple functions. Any positive Borel function is the pointwise limit of an increasing sequence of simple functions, so $\mathcal{M}$ contains all positive functions.

Definition 1.81. In the setting of the previous theorem, the product measure $\mu \otimes \nu$ is the Borel measure defined by
\[ (\mu \otimes \nu)(E) = \int \left( \int \chi_E(x, y)\,d\mu(x) \right) d\nu(y). \]
This is indeed a measure: it is σ-additive because it is additive (by additivity of integrals) and because monotone convergence can be used to move the limit inside the iterated integrals.

For instance, from Lebesgue measure $m_1 = m$ on $\mathbb{R}$, we inductively define $n$-dimensional Lebesgue measure $m_n = m_{n-1} \otimes m$ on $\mathcal{B}_{\mathbb{R}^n}$.

Theorem 1.82 (Tonelli). Assume that $X, Y$ are σ-compact and $\mu, \nu$ are finite on compacts. For any Borel function $f : X \times Y \to [0, \infty]$,
\[ \int f\,d(\mu \otimes \nu) = \int \left( \int f(x, y)\,d\mu(x) \right) d\nu(y) = \int \left( \int f(x, y)\,d\nu(y) \right) d\mu(x). \tag{1.36} \]

Proof. By Theorem 1.80 and the definition of $\mu \otimes \nu$, (1.36) holds if $f = \chi_E$ for some set $E \in \mathcal{B}_{X \times Y}$. By linearity, (1.36) holds for all simple functions. Taking an increasing limit of simple functions and using monotone convergence, we conclude that (1.36) holds for all positive Borel functions.
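On finite sets with point masses, (1.36) reduces to the statement that a double sum of nonnegative terms can be evaluated in either order or as a single sum over pairs. The following minimal sketch assumes hypothetical two- and three-point spaces; the weights `mu`, `nu` and the function `f` are illustrative choices only.

```python
from fractions import Fraction

# Hypothetical point masses on X = {"a", "b"} and Y = {0, 1, 2}.
mu = {"a": Fraction(1, 2), "b": Fraction(2)}
nu = {0: Fraction(1), 1: Fraction(1, 3), 2: Fraction(4)}


def f(x, y):
    """A nonnegative function on X x Y, chosen only for illustration."""
    return (1 if x == "a" else 3) * (y + 1)


def integrate_x_then_y():
    """Inner integral in x against mu, outer integral in y against nu."""
    return sum(sum(f(x, y) * mu[x] for x in mu) * nu[y] for y in nu)


def integrate_y_then_x():
    """Inner integral in y against nu, outer integral in x against mu."""
    return sum(sum(f(x, y) * nu[y] for y in nu) * mu[x] for x in mu)


def integrate_product():
    """Integral against the product measure, which gives mass mu[x]*nu[y] to (x, y)."""
    return sum(f(x, y) * mu[x] * nu[y] for x in mu for y in nu)
```

All three functions return the same rational number. The content of Tonelli's theorem is that this agreement persists for arbitrary positive Borel functions on σ-compact spaces, where the sums become integrals and measurability of the inner integrals has to be proved.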


For complex-valued $f$, applying Tonelli's theorem to $|f|$ gives
\[ \int |f|\,d(\mu \otimes \nu) = \int \left( \int |f(x, y)|\,d\mu(x) \right) d\nu(y) = \int \left( \int |f(x, y)|\,d\nu(y) \right) d\mu(x), \tag{1.37} \]
so we can check whether $f$ is integrable by computing iterated integrals. This is often checked in order to apply the following theorem:

Theorem 1.83 (Fubini). Assume that $X, Y$ are σ-compact and $\mu, \nu$ are finite on compacts. For any $f \in L^1(X \times Y, d(\mu \otimes \nu))$ the following hold.

(a) For $\nu$-a.e. $y$, the function $x \mapsto f(x, y)$ is in $L^1(X, d\mu)$.

(b) For $\mu$-a.e. $x$, the function $y \mapsto f(x, y)$ is in $L^1(Y, d\nu)$.

(c) Equation (1.36) holds, with the interpretation that the inner integrals are well defined a.e., and ignoring the exceptional zero-measure sets, the outer integrals give the stated value.

Proof. From (1.37), it follows that $\int |f(x, y)|\,d\nu(y) < \infty$ for $\mu$-a.e. $x$ and $\int |f(x, y)|\,d\mu(x) < \infty$ for $\nu$-a.e. $y$. The proof follows from Tonelli's theorem in the usual way, by passing from positive to real-valued and then to complex-valued functions, using linearity of integrals.

1.7. Functions on σ-locally compact spaces

In this section, we begin to use continuous functions as approximants and test functions. The main result of this section is a kind of approximation of bounded Borel functions by continuous functions, which gives a new way of proving that certain statements hold for all bounded Borel functions. In particular, we will use it for the study of the Borel functional calculus for self-adjoint operators.

In order to work with continuous functions on $X$, we impose some topological assumptions on $X$. The following class suffices for our purposes:

Definition 1.84. A metric space $X$ is σ-locally compact if it has compact subsets $L_n \subset X$ such that $L_n \subset \operatorname{int} L_{n+1}$ for all $n \in \mathbb{N}$ and $\bigcup_{n=1}^\infty L_n = X$. Any such sequence $(L_n)_{n=1}^\infty$ is called an exhaustion of $X$ by compact sets.

Not every sequence of compact sets $K_n \subset K_{n+1}$ with $\bigcup_{n=1}^\infty K_n = X$ gives an exhaustion of $X$ by compact sets; a counterexample in $X = \mathbb{R}$ is given by $K_n = [-n, 0] \cup [1/n, n]$. There exist metric spaces which are σ-compact but not σ-locally compact (Exercise 1.20). In fact, σ-local compactness can be seen as a combination of separability and a local condition (Exercise 1.21). However, many common spaces are σ-locally compact:

Example 1.85. Any countable space with the discrete metric is σ-locally compact. To obtain an exhaustion by compact sets, choose an enumeration of the space $X = \{x_n \mid n \in \mathbb{N}\}$ and set $L_n = \{x_1, \dots, x_n\}$.


Example 1.86. For any $k \in \mathbb{N}$, $\mathbb{R}^k$ is σ-locally compact with $L_n = [-n, n]^k$.

Lemma 1.87. On any σ-compact space, the Borel σ-algebra is generated by the family of compact subsets.

Proof. Denote by $\mathcal{A}$ the σ-algebra generated by compact subsets of the space $X$. Compact sets are closed, so they are Borel sets. Thus, $\mathcal{A} \subset \mathcal{B}_X$.

Conversely, let $L_n$ be compact sets such that $X = \bigcup_{n=1}^\infty L_n$. For any closed $F \subset X$, the sets $F \cap L_n$ are compact, so $F \cap L_n \in \mathcal{A}$; thus $F = \bigcup_{n=1}^\infty (F \cap L_n) \in \mathcal{A}$. Thus, $\mathcal{A}$ contains all closed sets; passing to complements, $\mathcal{A}$ contains all open sets, so $\mathcal{B}_X \subset \mathcal{A}$.

To proceed, we need some separation facts which are easily proved in our metric space setting. Distance between points and sets is defined by
\[ d(x, B) = \inf_{y \in B} d(x, y). \]
This is a continuous function of $x$ because $|d(x, y) - d(x', y)| \le d(x, x')$ implies $|d(x, B) - d(x', B)| \le d(x, x')$. Similarly, for $A, B \subset X$, we denote
\[ d(A, B) = \inf_{x \in A} \inf_{y \in B} d(x, y). \]

Lemma 1.88. If $K$ is compact, $V$ open, and $K \subset V$, then $d(K, V^c) > 0$.

Proof. The function $d(x, V^c)$ is continuous in $x$ and strictly positive on the open set $V$, so it has a strictly positive minimum on the compact $K$.

The support of a continuous function $f : X \to \mathbb{C}$ is defined as
\[ \operatorname{supp} f = \overline{\{x \in X \mid f(x) \ne 0\}}. \]
This is a closed set; we denote by $C_c(X)$ the set of continuous $f : X \to \mathbb{C}$ such that $\operatorname{supp} f$ is compact. Note that $f \in C_c(X)$ if and only if there exists a compact $K \subset X$ such that $f(x) = 0$ for all $x \notin K$, and that $C_c(X)$ is a vector space. The following lemma separates sets by a function $f \in C_c(X)$:

Lemma 1.89. In a σ-locally compact metric space $X$ the following hold.

(a) For any compact $K$, there exists $\delta > 0$ such that $\{x \in X \mid d(x, K) \le \delta\}$ is compact.

(b) If $K$ is compact, $V$ open, and $K \subset V$, then there exists $f \in C_c(X)$ such that $\chi_K \le f \le 1$ and $\operatorname{supp} f \subset V$.

Proof. (a) Let $(L_n)_{n=1}^\infty$ be an exhaustion of $X$ by compact sets. Since $L_n \subset \operatorname{int} L_{n+1}$, the sets $\operatorname{int} L_{n+1}$ are an open cover of $K$. There is a finite subcover $\operatorname{int} L_{m_1}, \dots, \operatorname{int} L_{m_k}$, and taking $m = \max\{m_1, \dots, m_k\}$ gives $K \subset \operatorname{int} L_m$. Thus, for $\delta = d(K, (\operatorname{int} L_m)^c) > 0$, the set $\{x \in X \mid d(x, K) \le \delta\}$ is compact as a closed subset of the compact $L_m$.


(b) If $0 < \epsilon < \delta$ and $\epsilon < d(K, V^c)$, the function $f(x) = (1 - \epsilon^{-1} d(x, K))_+$ has compact support and $\operatorname{supp} f \subset V$.

We now consider the family of bounded Borel functions, its algebraic properties, and a useful notion of convergence:

Definition 1.90. Denote by $B_b(X)$ the set of bounded Borel functions from $X$ to $\mathbb{C}$. A subset $\mathcal{M} \subset B_b(X)$ is said to be a subalgebra of $B_b(X)$ if it contains the constant function 1 and is closed under scalar multiplication, pointwise addition, and pointwise multiplication. $\mathcal{M}$ is said to be closed under pointwise convergence of uniformly bounded sequences if, for any sequence of $g_n \in \mathcal{M}$ such that
\[ \sup_{n \in \mathbb{N}} \sup_{x \in X} |g_n(x)| < \infty \]
and the limit $g(x) = \lim_{n\to\infty} g_n(x)$ exists for all $x \in X$, it follows that $g \in \mathcal{M}$.

We emphasize that this notion of convergence does not correspond to any metric, and that we are not working with respect to any measure or any kind of almost-everywhere condition. This makes the current setting different from, say, that in Chapter 2, where some density properties in $L^1(X, d\mu)$ will be considered. This distinction will be essential when there is no a priori distinguished measure that can be used.

Lemma 1.91. If $\mathcal{M}$ is a subalgebra of $B_b(X)$ closed under pointwise convergence of uniformly bounded sequences, $\{A \in \mathcal{B}_X \mid \chi_A \in \mathcal{M}\}$ is a σ-algebra.

Proof. Denote $\mathcal{A} = \{A \in \mathcal{B}_X \mid \chi_A \in \mathcal{M}\}$. Since $\chi_\emptyset = 0 \in \mathcal{M}$, $\emptyset \in \mathcal{A}$. If $A \in \mathcal{A}$, then $\chi_{A^c} = 1 - \chi_A \in \mathcal{M}$, so $A^c \in \mathcal{A}$. If $A, B \in \mathcal{A}$, then $\chi_{A \cap B} = \chi_A \chi_B \in \mathcal{M}$, so $A \cap B \in \mathcal{A}$. Thus, $\mathcal{A}$ is an algebra. For any sequence of sets $A_n \in \mathcal{A}$, the uniformly bounded pointwise limit
\[ \chi_{\bigcup_{j=1}^\infty A_j} = \lim_{n\to\infty} \chi_{\bigcup_{j=1}^n A_j} \]
shows that $\bigcup_{j=1}^\infty A_j \in \mathcal{A}$, so $\mathcal{A}$ is a σ-algebra.

Proposition 1.92. Let $X$ be a σ-locally compact metric space and let $\mathcal{M}$ be a subalgebra of $B_b(X)$. If $\mathcal{M}$ is closed under pointwise convergence of uniformly bounded sequences, the following are equivalent:

(a) $C_c(X) \subset \mathcal{M}$;

(b) $\chi_B \in \mathcal{M}$ for all Borel sets $B \subset X$;

(c) $\mathcal{M} = B_b(X)$.

Proof. (a) $\Rightarrow$ (b): For any compact $K \subset X$, the functions $f_n(x) = (1 - n\,d(x, K))_+$ are uniformly bounded and converge pointwise to $\chi_K$, with $f_n \in C_c(X)$ for large enough $n$; thus, $\chi_K \in \mathcal{M}$. Thus, the σ-algebra $\{A \in \mathcal{B}_X \mid \chi_A \in \mathcal{M}\}$ contains all compact sets, so it contains all Borel sets.

(b) $\Rightarrow$ (c): Since $\mathcal{M}$ is an algebra and contains all characteristic functions of Borel sets, $\mathcal{M}$ contains all simple functions (functions which take finitely many values). Any positive Borel function $f$ bounded by $C \in \mathbb{N}$ is the pointwise limit of the uniformly bounded functions
\[ f_n = \sum_{k=0}^{C 2^n} \frac{k}{2^n}\, \chi_{\{x \mid \frac{k}{2^n} \le f(x) < \frac{k+1}{2^n}\}}, \]
so $\mathcal{M}$ contains all positive bounded Borel functions. By linear combinations, we obtain all complex-valued bounded Borel functions, so $\mathcal{M} = B_b(X)$.

(c) =⇒ (a): This is trivial.
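The dyadic approximation used in the step from (b) to (c) acts on each value of $f$ separately: on the set where $k/2^n \le f(x) < (k+1)/2^n$, the function $f_n$ takes the value $k/2^n$, which is the value of $f(x)$ rounded down to a multiple of $2^{-n}$. A minimal sketch of this rounding, assuming exact rational inputs (the function name `dyadic_approx` and the sample values are illustrative):

```python
from fractions import Fraction
from math import floor


def dyadic_approx(value, n):
    """Value of f_n at a point where f equals `value`: round down to a multiple of 2**-n."""
    return Fraction(floor(value * 2**n), 2**n)


# Hypothetical sample values of a function bounded by C = 3.
samples = [Fraction(0), Fraction(1, 3), Fraction(5, 7), Fraction(2), Fraction(3)]

# For each sample, the approximations for n = 0, ..., 7.
table = {v: [dyadic_approx(v, n) for n in range(8)] for v in samples}
```

Each row of `table` is increasing in $n$ and stays within $2^{-n}$ below the sample value, which is the pointwise (here even uniform) convergence with a uniform bound that the proof relies on.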

Pointwise convergence does not correspond to convergence with respect to a metric, so intuition from metric spaces cannot be applied. The smallest subalgebra of Bb (X), which contains Cc (X) and is closed under pointwise convergence of uniformly bounded sequences, is not the set of limit points of Cc (X). Despite Proposition 1.92, not every bounded Borel function is a pointwise limit of a uniformly bounded sequence of continuous functions (Exercise 1.22).

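1.8. Regularity of measures

Since Borel sets are defined somewhat implicitly, it is of interest to know how well they can be approximated by open and closed sets, and how well their measures can be approximated by integrals of continuous functions.

Theorem 1.93. Let $\mu$ be a finite Borel measure on a metric space $X$. For any Borel set $E$ and $\epsilon > 0$, there exist closed $F$ and open $V$ with $F \subset E \subset V$ such that $\mu(V \setminus F) < \epsilon$.

Proof. We will prove that the family
\[ \mathcal{A} = \{E \in \mathcal{B}_X \mid \forall \epsilon > 0\ \ \exists F \text{ closed}\ \ \exists V \text{ open}\quad F \subset E \subset V,\ \mu(V \setminus F) < \epsilon\} \]
is a σ-algebra. Trivially, $\emptyset \in \mathcal{A}$, by taking $F = V = \emptyset$. If $E \in \mathcal{A}$, then $F \subset E \subset V$ gives $V^c \subset E^c \subset F^c$ with $F^c \setminus V^c = V \setminus F$, so $E^c \in \mathcal{A}$.

Let $E_n \in \mathcal{A}$, and denote $E = \bigcup_{n \in \mathbb{N}} E_n$. For any $\epsilon > 0$, there exist closed $F_n$ and open $V_n$ such that $F_n \subset E_n \subset V_n$ with $\mu(V_n \setminus F_n) < \epsilon / 2^{n+1}$. Thus, by taking $V = \bigcup_{n \in \mathbb{N}} V_n$, $A = \bigcup_{n \in \mathbb{N}} F_n$, we have $A \subset E \subset V$ and
\[ \mu(V \setminus A) \le \sum_{n=1}^\infty \mu(V_n \setminus F_n) < \epsilon / 2. \]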


The set $V$ is open. The set $A$ need not be closed, but it is the increasing limit of the closed sets $\bigcup_{j=1}^n F_j$, so using finiteness of measure, for some $n \in \mathbb{N}$,
\[ \mu\Big( A \setminus \bigcup_{j=1}^n F_j \Big) < \epsilon / 2. \]
Thus, with $F = \bigcup_{j=1}^n F_j$, we have $F \subset E \subset V$ and $\mu(V \setminus F) < \epsilon$, so $E \in \mathcal{A}$.

If $E$ is a closed set, the sets $V_n = \{x \in X \mid d(x, E) < 1/n\}$ obey $E \subset V_{n+1} \subset V_n$ and
\[ \bigcap_{n \in \mathbb{N}} (V_n \setminus E) = \Big( \bigcap_{n \in \mathbb{N}} V_n \Big) \setminus E = \overline{E} \setminus E = \emptyset \]
because $E$ is closed. Since the sets $V_n \setminus E$ are decreasing and have finite measure, this implies $\mu(V_n \setminus E) \to 0$ as $n \to \infty$. Thus, choosing $F = E$ and $V = V_n$ for large enough $n$ implies $E \in \mathcal{A}$.

Thus, $\mathcal{A} \subset \mathcal{B}_X$ is a σ-algebra and contains all closed sets, so $\mathcal{A} = \mathcal{B}_X$.

We often must work with infinite measures; the counting measure on $\mathbb{N}$ and the Lebesgue measure on $\mathbb{R}$ are just two examples. However, infinities on compacts introduce unnatural obstacles (Exercise 1.19), so we define:

Definition 1.94. A Baire measure on a σ-locally compact metric space $X$ is a Borel measure $\mu$ such that $\mu(K) < \infty$ for all compact $K \subset X$.

Baire measures on $\mathbb{R}$ are precisely the Lebesgue–Stieltjes measures, by Theorems 1.73 and 1.79. Baire measures are usually defined on more general spaces, on the σ-algebra generated by compact sets; by Lemma 1.87, in our level of generality, this matches our definition.

Definition 1.95. A Borel measure $\mu$ is said to be inner regular if
\[ \mu(A) = \sup_{\substack{K \subset A \\ K \text{ compact}}} \mu(K) \]
for all Borel sets $A$, and outer regular if
\[ \mu(A) = \inf_{\substack{V \supset A \\ V \text{ open}}} \mu(V) \]
for all Borel sets $A$. If $\mu$ is inner regular and outer regular, it is said to be regular.

Theorem 1.96. On a σ-locally compact metric space, every Baire measure is regular.

Proof. We fix an exhaustion $(L_n)_{n=1}^\infty$ of $X$ by compact sets (see Definition 1.84), the convention $L_0 = \emptyset$, and the finite measures $\mu_n(E) = \mu(E \cap L_n)$.


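We decompose a Borel set $A$ as a disjoint union of sets $A_n = A \cap (\operatorname{int} L_n \setminus \operatorname{int} L_{n-1})$. Fix $\epsilon > 0$. For any $n$, since $\mu_n$ is a finite measure, there exist closed $F_n$ and open $V_n$ such that $F_n \subset A_n \subset V_n$ and $\mu_n(V_n \setminus F_n) < \epsilon / 2^n$. Without loss of generality we can replace $V_n$ by $V_n \cap \operatorname{int} L_n$; then, by the set of inclusions $F_n \subset V_n \subset \operatorname{int} L_n$ and the definition of $\mu_n$, we also conclude
\[ \mu(V_n \setminus F_n) < \frac{\epsilon}{2^n}. \]
Thus, defining $F = \bigcup_{n \in \mathbb{N}} F_n$ and $V = \bigcup_{n \in \mathbb{N}} V_n$ gives $F \subset A \subset V$ and $\mu(V \setminus F) < \epsilon$. This implies $\mu(F) \ge \mu(A) - \epsilon$ and $\mu(V) \le \mu(A) + \epsilon$. Since $\epsilon > 0$ is arbitrary,
\[ \sup_{\substack{F \text{ closed} \\ F \subset A}} \mu(F) \ge \mu(A) \ge \inf_{\substack{V \text{ open} \\ A \subset V}} \mu(V). \]
Since by monotone convergence $\lim_{n\to\infty} \mu(F \cap L_n) = \mu(F)$, using compacts $K = F \cap L_n$ for large enough $n$ shows that
\[ \sup_{\substack{K \text{ compact} \\ K \subset A}} \mu(K) \ge \mu(A) \ge \inf_{\substack{V \text{ open} \\ A \subset V}} \mu(V). \]
The opposite inequalities are trivial since $K \subset A \subset V$ implies $\mu(K) \le \mu(A) \le \mu(V)$.

Lemma 1.97. If $\mu$ is a Baire measure on a σ-locally compact metric space $X$, then $C_c(X) \subset L^1(X, d\mu)$.

Proof. For $f \in C_c(X)$, consider the compact $K = \operatorname{supp} f$ and the maximum $a = \max_{x \in K} |f(x)|$. Then $0 \le |f| \le a \chi_K$, so $\int |f|\,d\mu \le a \mu(K) < \infty$.

An outer regular measure can be recovered from its values on open sets. Further, it is useful to know when a measure can be completely recovered from integrals of functions in $C_c(X)$. For an open set $V$, we define
\[ \mathcal{F}_V = \{f \in C_c(X) \mid 0 \le f \le 1,\ \operatorname{supp} f \subset V\}. \tag{1.38} \]

Proposition 1.98. For any open set $V$,
\[ \mu(V) = \sup_{f \in \mathcal{F}_V} \int f\,d\mu. \tag{1.39} \]

Proof. For any compact $F \subset V$, by Lemma 1.89, there exists $f \in \mathcal{F}_V$ such that $\chi_F \le f \le 1$ and therefore $\mu(F) \le \int f\,d\mu$. Taking the supremum over compacts gives, by inner regularity,
\[ \mu(V) = \sup_{\substack{F \text{ compact} \\ F \subset V}} \mu(F) \le \sup_{f \in \mathcal{F}_V} \int f\,d\mu. \]
The opposite inequality follows from $f \le \chi_V$ for all $f \in \mathcal{F}_V$.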


Thus, the integrals of functions f ∈ Cc (X) determine the measure on open sets by (1.39) and then on all Borel sets by outer regularity.
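Formula (1.39) can be watched at work for Lebesgue measure on the open interval $V = (0, 1)$. The sketch below assumes a hypothetical family of trapezoid functions $f_n \in \mathcal{F}_V$, equal to 0 within distance $1/n$ of the complement of $V$ and to 1 at distance at least $2/n$; the names `f_n` and `integral_f_n` are illustrative. Since each $f_n$ is piecewise linear, the trapezoid rule on its breakpoints gives the exact integral.

```python
from fractions import Fraction


def f_n(x, n):
    """Trapezoid function with values in [0, 1] and support [1/n, 1 - 1/n] inside V = (0, 1)."""
    dist = max(Fraction(0), min(x, 1 - x))  # distance from x to the complement of V
    return min(Fraction(1), max(Fraction(0), n * dist - 1))


def integral_f_n(n):
    """Exact Lebesgue integral of f_n for n >= 4, via the trapezoid rule on its breakpoints."""
    pts = sorted({
        Fraction(0), Fraction(1, n), Fraction(2, n),
        1 - Fraction(2, n), 1 - Fraction(1, n), Fraction(1),
    })
    return sum((b - a) * (f_n(a, n) + f_n(b, n)) / 2 for a, b in zip(pts, pts[1:]))


# Integrals for a few values of n; they increase toward m(V) = 1.
values = [integral_f_n(n) for n in (4, 10, 100, 1000)]
```

For this family the integral works out to $1 - 3/n$, so the supremum in (1.39) is approached but not attained, consistent with the requirement that the support of each $f_n$ lie strictly inside $V$.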

1.9. The Riesz–Markov theorem We have seen constructions of measures that were geometrically motivated by the concept of length on the real line, and area/volume on Rn . In more abstract situations, measures often appear because of how they act on functions rather than sets (this will be the case for spectral measures as well). In other words, instead of a measure, we usually ﬁrst encounter a functional: Deﬁnition 1.99. Let X be a metric space. A positive linear functional on Cc (X) is a linear map Λ : Cc (X) → C such that f ≥ 0 implies Λ(f ) ≥ 0. Any Baire measure μ on X generates a positive linear functional Λ(f ) = f dμ ∀f ∈ Cc (X), (1.40) and the goal of this section is to prove the converse: Theorem 1.100 (Riesz–Markov). Let X be a σ-locally compact metric space. For every positive linear functional Λ on Cc (X), there is a unique Baire measure μ on X such that (1.40) holds. We assume throughout this section that X is a σ-locally compact metric space and Λ is a positive linear functional. Uniqueness of μ follows from outer regularity of μ and from Proposition 1.98. Existence of μ will be proved through a series of lemmas. It uses the outer measure construction, with open sets V as elementary sets, and the weight ρ(V ) = sup Λ(f ), f ∈FV

where $F_V$ is defined by (1.38); of course, $F_\emptyset = \{0\}$ and $\rho(\emptyset) = 0$. We will prove σ-subadditivity of $\rho$, using the following refinement of compactness:

Lemma 1.101 (Continuous partitions of unity). If $K \subset X$ is compact, for any open cover $\mathcal{V}$ of $K$, there exists a finite subcover $V_1, \dots, V_n$ and functions $h_1, \dots, h_n \in C_c(X)$ such that $h_j \ge 0$, $\operatorname{supp} h_j \subset V_j$, and
$$\sum_{j=1}^n h_j(x) = 1 \qquad \forall x \in K. \tag{1.41}$$

Proof. For every $y \in K$, choose $V_y \in \mathcal{V}$ such that $y \in V_y$. By Lemma 1.89 applied to $\{y\} \subset V_y$, for small enough $\epsilon > 0$, the function $g_y(x) = (1 - \epsilon^{-1} d(x, y))_+$ is in $C_c(X)$ and obeys $\operatorname{supp} g_y \subset V_y$.


Since $g_y(y) = 1$, the set $U_y = \{x \mid g_y(x) > 0\}$ is open and contains $y$. Thus, the family $\{U_y \mid y \in K\}$ is an open cover of $K$. By compactness, this cover has a finite subcover $U_{y_1}, \dots, U_{y_n}$. Since $K \subset \bigcup_{j=1}^n U_{y_j}$,
$$\sum_{i=1}^n g_{y_i}(x) > 0 \qquad \forall x \in K.$$
Moreover, let $G(x) = d(x, K)$. Then the functions
$$h_j = \frac{g_{y_j}}{G + \sum_{k=1}^n g_{y_k}}$$
are well defined, $\operatorname{supp} h_j = \operatorname{supp} g_{y_j} \subset V_{y_j}$ for each $j$, and (1.41) holds, so the proof is complete with the finite subcover $\{V_{y_1}, \dots, V_{y_n}\}$ of $\mathcal{V}$.

Lemma 1.102. For any open sets $V_j$,
$$\rho\Bigl(\bigcup_{j=1}^\infty V_j\Bigr) \le \sum_{j=1}^\infty \rho(V_j). \tag{1.42}$$

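Proof. If $f \in F_{\bigcup_{j=1}^\infty V_j}$, let $K = \operatorname{supp} f$. Since $\{V_j\}_{j=1}^\infty$ are an open cover of $K$, by Lemma 1.101, there exist $n \in \mathbb{N}$ and $h_1, \dots, h_n \in C_c(X)$ such that $h_j \ge 0$, $\sum_{j=1}^n h_j = 1$ on $K$, and $\operatorname{supp} h_j \subset V_j$ for each $j$. Then $f = \sum_{j=1}^n f h_j$, so
$$\Lambda(f) = \sum_{j=1}^n \Lambda(f h_j) \le \sum_{j=1}^n \rho(V_j) \le \sum_{j=1}^\infty \rho(V_j).$$
Taking the supremum over all $f \in F_{\bigcup_{j=1}^\infty V_j}$ gives (1.42).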
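The construction in the proof of Lemma 1.101 can be tried out numerically. The following Python sketch is only an illustration under stated assumptions: it takes $K = [0, 1]$ inside $X = \mathbb{R}$, and the centers $y_j$ and the radius `eps` are hypothetical choices made so that the sets $\{g_{y_j} > 0\}$ cover $K$. It builds the tent functions $g_y$, the distance $G(x) = d(x, K)$, and the quotients $h_j$, and then samples $\sum_j h_j$ on $K$.

```python
# Sketch of the partition of unity from the proof of Lemma 1.101,
# for K = [0, 1] inside X = R. Centers and eps are illustrative choices.

def tent(y, eps):
    """g_y(x) = (1 - d(x, y)/eps)_+, a continuous bump supported in [y - eps, y + eps]."""
    return lambda x: max(1.0 - abs(x - y) / eps, 0.0)

def dist_to_K(x, a=0.0, b=1.0):
    """G(x) = d(x, K) for the interval K = [a, b]."""
    return max(a - x, 0.0, x - b)

def partition_of_unity(centers, eps):
    bumps = [tent(y, eps) for y in centers]

    def make_h(j):
        def h(x):
            # The denominator is positive everywhere as long as the sets
            # {g_y > 0} cover K: on K the sum is positive, off K the distance is.
            denom = dist_to_K(x) + sum(g(x) for g in bumps)
            return bumps[j](x) / denom
        return h

    return [make_h(j) for j in range(len(bumps))]

centers = [0.0, 0.25, 0.5, 0.75, 1.0]
eps = 0.2
hs = partition_of_unity(centers, eps)

grid = [i / 1000 for i in range(1001)]            # sample points of K
total_on_K = [sum(h(x) for h in hs) for x in grid]
```

On the sampled points of $K$ the sums in `total_on_K` equal 1 up to rounding, while each `hs[j]` vanishes outside the interval of radius `eps` around its center, mirroring (1.41) and the support condition.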

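Lemma 1.103. For any set $E \subset X$, we define
$$\mu^*(E) = \inf_{\substack{E \subset V \\ V \text{ open}}} \rho(V). \tag{1.43}$$
Then $\mu^*$ is an outer measure on $X$.

Proof. Obviously, $\mu^*(\emptyset) = \rho(\emptyset) = 0$ and $A \subset B$ implies $\mu^*(A) \le \mu^*(B)$. Take any sequence of sets $E_j \subset X$. For any $\epsilon > 0$, there exist open sets $V_j$ such that $E_j \subset V_j$ and $\rho(V_j) \le \mu^*(E_j) + \epsilon/2^j$. Then $V = \bigcup_{j \in \mathbb{N}} V_j$ is open and $\bigcup_{j \in \mathbb{N}} E_j \subset V$, so by σ-subadditivity of $\rho$,
$$\mu^*\Bigl(\bigcup_{j=1}^\infty E_j\Bigr) \le \rho\Bigl(\bigcup_{j=1}^\infty V_j\Bigr) \le \sum_{j=1}^\infty \rho(V_j) \le \sum_{j=1}^\infty \mu^*(E_j) + \epsilon.$$
Since $\epsilon > 0$ is arbitrary, this implies σ-subadditivity of $\mu^*$.

Lemma 1.104. All open sets are measurable with respect to $\mu^*$.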


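Proof. Fix an open set $V$ and arbitrary $E \subset X$. It suffices to prove that
$$\mu^*(E) \ge \mu^*(E \cap V) + \mu^*(E \setminus V) \tag{1.44}$$
because the opposite inequality follows from subadditivity. Moreover, we can assume $\mu^*(E) < \infty$; otherwise, the inequality is trivial. Take an open set $U$ such that $E \subset U$, and let $\epsilon > 0$. Since $U \cap V$ is open, there exists $f \in F_{U \cap V}$ such that $\Lambda(f) \ge \rho(U \cap V) - \epsilon$, and since $U \setminus \operatorname{supp} f$ is open, there exists $g \in F_{U \setminus \operatorname{supp} f}$ such that $\Lambda(g) \ge \rho(U \setminus \operatorname{supp} f) - \epsilon$. Since $\rho(U \cap V) \ge \mu^*(E \cap V)$ and $\rho(U \setminus \operatorname{supp} f) \ge \mu^*(E \setminus V)$, this gives
$$\Lambda(f) + \Lambda(g) \ge \mu^*(E \cap V) + \mu^*(E \setminus V) - 2\epsilon.$$
Note that $\operatorname{supp} f \cap \operatorname{supp} g = \emptyset$, so $f + g \in F_U$. This and additivity of $\Lambda$ imply
$$\rho(U) \ge \Lambda(f + g) \ge \mu^*(E \cap V) + \mu^*(E \setminus V) - 2\epsilon.$$
Since $\epsilon > 0$ is arbitrary and $U$ is arbitrary with $E \subset U$, (1.44) follows.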

By Carathéodory's Theorem 1.26, the restriction of $\mu^*$ to $\mathcal{B}_X$ is a Borel measure on $X$, which we denote $\mu$ from now on.

Lemma 1.105. For any $f \in C_c(X)$, if for some compact $K$ and open $V$, $\chi_K \le f \le \chi_V$, then $\mu(K) \le \Lambda(f) \le \mu(V)$.

Proof. For $t \in (0, 1)$, define $V_t = \{x \mid f(x) > t\}$. Then $g \in F_{V_t}$ implies $tg \le f$, so $\Lambda(g) \le t^{-1}\Lambda(f)$. Taking the supremum over $g \in F_{V_t}$ gives $\mu(V_t) \le t^{-1}\Lambda(f)$. These sets have finite measure; the limit as $t \to 1$ gives $\mu(\{x \mid f(x) \ge 1\}) \le \Lambda(f)$. Therefore, $\mu(K) \le \Lambda(f)$. In particular, $\mu(K) < \infty$.

Let us fix $g \in C_c(X)$ such that $0 \le g \le 1$ and $g = 1$ on $\operatorname{supp} f$. For any $t > 0$, $(f - t)_+ \in F_V$, so $\Lambda((f - t)_+) \le \mu(V)$. Since $f \le tg + (f - t)_+$,
$$\Lambda(f) \le t\Lambda(g) + \Lambda((f - t)_+) \le t\Lambda(g) + \mu(V).$$
Since $t > 0$ is arbitrary, this implies $\Lambda(f) \le \mu(V)$.

Taking $V = X$, we see that $\mu(K) < \infty$ for any compact $K$. Thus, every $f \in C_c(X)$ is integrable with respect to $\mu$, and it remains to prove that the integral is $\Lambda(f)$:

Lemma 1.106. For every $f \in C_c(X)$, $\int f\,d\mu = \Lambda(f)$.

Proof. By linearity of both sides, it suffices to prove this for $f$ such that $0 \le f \le 1$. Fix $n \in \mathbb{N}$ and define sets $A_k = f^{-1}((k/n, \infty))$ and functions
$$g_k = \Bigl(f - \frac{k-1}{n}\Bigr)_+ - \Bigl(f - \frac{k}{n}\Bigr)_+.$$
These functions obey $f = \sum_{k=1}^n g_k$ and $\chi_{A_k} \le n g_k \le \chi_{A_{k-1}}$ for each $k$. We will use this in two ways: integrating in $\mu$ gives $\mu(A_k) \le \int n g_k\,d\mu \le \mu(A_{k-1})$, whereas applying Lemma 1.105 gives $\mu(A_k) \le \Lambda(n g_k) \le \mu(A_{k-1})$. Averaging in $k$ and using linearity of the integral and of $\Lambda$ gives
$$\frac{1}{n}\sum_{k=1}^n \mu(A_k) \le \int f\,d\mu \le \frac{1}{n}\sum_{k=1}^n \mu(A_{k-1}), \tag{1.45}$$
$$\frac{1}{n}\sum_{k=1}^n \mu(A_k) \le \Lambda(f) \le \frac{1}{n}\sum_{k=1}^n \mu(A_{k-1}). \tag{1.46}$$
Note that $A_n = \emptyset$ and $A_0 \subset \operatorname{supp} f$ has finite measure, so (1.45) and (1.46) place the values $\int f\,d\mu$ and $\Lambda(f)$ within the same interval of length
$$\frac{1}{n}\sum_{k=1}^n \mu(A_{k-1}) - \frac{1}{n}\sum_{k=1}^n \mu(A_k) = \frac{\mu(A_0) - \mu(A_n)}{n} = \frac{\mu(A_0)}{n}.$$
Thus, $\bigl|\int f\,d\mu - \Lambda(f)\bigr| \le \mu(A_0)/n$. Taking $n \to \infty$ shows $\int f\,d\mu = \Lambda(f)$.
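The layer decomposition used in the proof of Lemma 1.106 is easy to check pointwise. The sketch below is an illustration only: the tent function `f`, the grid `xs`, and the number of layers `n` are assumptions chosen for the example, not part of the text. It verifies $f = \sum_{k=1}^n g_k$ and the sandwich $\chi_{A_k} \le n g_k \le \chi_{A_{k-1}}$ at sampled points.

```python
# Numerical check of the decomposition used in the proof of Lemma 1.106.
# f is an illustrative function with 0 <= f <= 1; n is the number of layers.

def pos(t):
    """Positive part t_+ = max(t, 0)."""
    return max(t, 0.0)

def g(k, n, fx):
    """g_k = (f - (k-1)/n)_+ - (f - k/n)_+ at a point where f takes the value fx."""
    return pos(fx - (k - 1) / n) - pos(fx - k / n)

def f(x):
    return max(1.0 - abs(x), 0.0)          # tent function with values in [0, 1]

n = 8
xs = [i / 200 - 1.5 for i in range(601)]   # grid on [-1.5, 1.5]

# f should equal the sum of its layers g_1, ..., g_n at every point.
max_sum_error = max(
    abs(f(x) - sum(g(k, n, f(x)) for k in range(1, n + 1))) for x in xs
)

def sandwich_holds(x, k, tol=1e-12):
    fx = f(x)
    lower = 1.0 if fx > k / n else 0.0          # chi_{A_k}(x)
    upper = 1.0 if fx > (k - 1) / n else 0.0    # chi_{A_{k-1}}(x)
    return lower - tol <= n * g(k, n, fx) <= upper + tol

all_sandwiched = all(sandwich_holds(x, k) for x in xs for k in range(1, n + 1))
```

Each $n g_k$ is a continuous function squeezed between two indicator functions, which is exactly the shape of hypothesis that Lemma 1.105 requires.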

1.10. Exercises

1.1. Let $X$ be a metric space. View $E \in \mathcal{B}_X$, $E \neq \emptyset$, as a metric subspace of $X$. Prove that
$$\{A \in \mathcal{B}_X \mid A \subset E\} = \{B \cap E \mid B \in \mathcal{B}_X\} = \mathcal{B}_E.$$
If $f : X \to Y$ is Borel, prove that $f|_E : E \to Y$ is also Borel.

1.2. For any metric space $X$, prove that the following are equivalent: (a) $X$ is separable; (b) $X$ has a countable base; (c) any base $\mathcal{U}$ of $X$ contains a countable base $\mathcal{U}' \subset \mathcal{U}$.

1.3. (a) If $d$, $\tilde d$ are metrics on $X$ and there exist $a, b \in (0, \infty)$ such that
$$a\,d(x, y) \le \tilde d(x, y) \le b\,d(x, y) \qquad \forall x, y \in X,$$
prove that $d$ and $\tilde d$ generate the same topology in $X$. (b) Prove that the metric $d_\infty$ in $\mathbb{R}^n$ defined in (1.11) and the metrics
$$d_p(x, y) = \Bigl(\sum_{j=1}^n |x_j - y_j|^p\Bigr)^{1/p} \tag{1.47}$$
for $p \in [1, \infty)$ obey $d_\infty(x, y) \le d_p(x, y) \le n^{1/p} d_\infty(x, y)$, and conclude that they generate the same topology.


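1.4. For a sequence $(x_n)_{n=1}^\infty$ in a metric space $(X, d)$ and $x \in X$, prove that
$$\forall \epsilon > 0 \ \ \exists N \in \mathbb{N} \ \ \forall n \ge N \quad d(x_n, x) < \epsilon \tag{1.48}$$
if and only if
$$\forall A \in \mathcal{T}_d \quad x \in A \implies (\exists N \in \mathbb{N} \ \ \forall n \ge N \quad x_n \in A). \tag{1.49}$$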

1.5. Prove that $\mathcal{B}_{\mathbb{R}}$ is generated by the sets $[a, \infty)$, $a \in \mathbb{R}$.

1.6. Prove that any increasing function $\alpha : \mathbb{R} \to \mathbb{R}$ is Borel.

1.7. Prove that $\mathcal{B}_{\mathbb{R}^n}$ is the smallest σ-algebra containing all sets of the form $\prod_{j=1}^n (a_j, \infty)$ with $a_1, \dots, a_n \in \mathbb{R}$.

1.8. Prove that a function $f : X \to \hat{\mathbb{R}}$ is Borel if and only if $f^{-1}(\{+\infty\}) \in \mathcal{B}_X$, $f^{-1}(\{-\infty\}) \in \mathcal{B}_X$, and $f|_E : E \to \mathbb{R}$ is Borel, where $E = f^{-1}(\mathbb{R})$.

1.9. (a) Let $f : X \to \mathbb{R} \cup \{+\infty\}$ be lower semicontinuous, i.e., for every $x_0 \in X$, $f(x_0) \le \liminf_{x \to x_0} f(x)$. Prove that $f$ is Borel. Hint: Prove that $f^{-1}((a, \infty])$ is open for any $a \in \mathbb{R}$. (b) Let $f : X \to \mathbb{R} \cup \{-\infty\}$ be upper semicontinuous, i.e., for every $x_0 \in X$, $f(x_0) \ge \limsup_{x \to x_0} f(x)$. Prove that $f$ is Borel.

1.10. Let $\{T_n\}_{n=1}^\infty$ be a partition of $X$ into Borel sets, and let $f_n : T_n \to Y$ be Borel functions for $n \in \mathbb{N}$. Prove that the function $f : X \to Y$, defined by $f(x) = f_n(x)$ for $x \in T_n$ for all $n$, is a Borel function.

1.11. For a sequence of Borel functions $g_n : X \to \mathbb{R}$, let $S$ be the set of points $x \in X$ such that $\lim_{n \to \infty} g_n(x)$ exists and is finite. Prove that $S$ is Borel and the function $g : S \to \mathbb{R}$ defined by $g(x) = \lim_{n \to \infty} g_n(x)$ is Borel.

1.12. If $\mu_j$, $j \in \mathbb{N}$, are Borel measures and $c_j \in [0, \infty)$, prove that $\mu = \sum_{j=1}^\infty c_j \mu_j$ is a Borel measure. Justify any exchanges of limits.

1.13. Let $f \in L^1(X, d\mu)$. Prove that $\bigl|\int f\,d\mu\bigr| = \int |f|\,d\mu$ if and only if there exists $\omega \in \mathbb{C}$ with $|\omega| = 1$ such that $\omega f = |f|$ $\mu$-a.e.

1.14. For a Borel measure $\mu$ on $\mathbb{R}$ with a distribution function $\alpha$, prove the following. (a) For any $x \in \mathbb{R}$, $\mu(\{x\}) = \alpha_+(x) - \alpha_-(x)$. (b) $\alpha$ is continuous at $x$ if and only if $\mu(\{x\}) = 0$. (c) For any real $x < y$, $\mu([x, y)) = \alpha_-(y) - \alpha_-(x)$. (d) $\mu((-\infty, 0]) < \infty$ if and only if $\alpha_+(-\infty)$ is finite.

1.15. Let $\mu$ be a Borel measure on $\mathbb{R}$ with a distribution function $\alpha$. (a) Prove that any open set $V \subset \mathbb{R}$ can be written as a countable disjoint union of open intervals, $V = \bigcup_{j \in J} (a_j, b_j)$. (b) Prove that $\mu(V) = \sum_{j \in J} (\alpha_-(b_j) - \alpha_+(a_j))$.


1.16. Prove that the set
$$B = \bigcup_{n=1}^\infty \bigcup_{k=1}^{2^{n-1}} \Bigl( \frac{2k-1}{2^n} - \frac{1}{2^{2n+1}},\ \frac{2k-1}{2^n} + \frac{1}{2^{2n+1}} \Bigr)$$
obeys $\overline{B} = [0, 1]$, $m(B) < 1$, and $m(B \cap [a, b]) > 0$ for any $0 \le a < b \le 1$.

1.17. Prove the following link between Riemann–Stieltjes and Lebesgue integrals: for increasing right-continuous $\alpha$ and continuous $f : [0, 1] \to \mathbb{R}$,
$$\lim_{n \to \infty} \sum_{k=0}^{n-1} f(k/n)\bigl(\alpha((k+1)/n) - \alpha(k/n)\bigr) = \int_{(0,1]} f\,d\mu_\alpha.$$

1.18. Let $x_n \in \mathbb{R}$ and $c_n > 0$ for $n \in \mathbb{N}$. If $\sum_{n=1}^\infty c_n^\alpha < \infty$ for some $\alpha \in (0, 1)$, prove that $\sum_{n=1}^\infty \frac{c_n}{|x - x_n|} < \infty$ for Lebesgue-a.e. $x \in \mathbb{R}$. Hint: Consider the set of $x \in [-k, k]$ where $\sum_{n=1}^\infty \frac{c_n^\alpha}{|x - x_n|^\alpha} < \infty$.

1.19. Consider the counting measure of $\mathbb{Q}$ viewed as a measure on $\mathbb{R}$, $\mu(A) = \#(A \cap \mathbb{Q})$. Prove that $\mu(V) = \infty$ for every nonempty open set $V$, and that $\int |f|\,d\mu = \infty$ for every continuous function $f$ except for $f = 0$.

1.20. Prove that the metric space $\mathbb{Q}$ is σ-compact, but not σ-locally compact.

1.21. A metric space $X$ is called locally compact if for every $x \in X$ there is an open set $V$ such that $x \in V$ and $\overline{V}$ is compact. Prove that a metric space is σ-locally compact if and only if it is separable and locally compact.

1.22. Let $f_n$ be a uniformly bounded sequence of continuous real-valued functions which converges pointwise to $\chi_B$. Prove that the set $B$ is a $G_\delta$ set, i.e., a countable intersection of open sets.

Chapter 2

Banach spaces

Banach spaces are, simply put, complete metric vector spaces whose metric behaves in a natural way with respect to the vector space operations. We will give the precise definition below. In this chapter, we present the basic properties of Banach spaces and consider some concrete spaces of interest. Further treatments of functional analysis include [76, 81, 97].

In this text, virtually all vector spaces are over the field of scalars $\mathbb{C}$. For the questions we study, this is not a limitation. Just as $\mathbb{R}^n$ can be viewed as a subset of $\mathbb{C}^n$, the objects we study can be viewed as complex valued with no loss of generality (the only exception will be the proof of the Stone–Weierstrass theorem, which is naturally first proved on real-valued functions and then suitably generalized to complex-valued functions).

We will freely use terminology and notation inherited from linear algebra. For instance, in any vector space $V$, a linear combination of vectors in $X \subset V$ is a finite sum $\sum_{j=1}^n \lambda_j x_j$, where $\lambda_j \in \mathbb{C}$ and $x_j \in X$, and the span of $X$, denoted $\operatorname{span} X$, is the set of all linear combinations of vectors in $X$.

2.1. Norms and Banach spaces

Definition 2.1. A seminorm on a vector space $V$ is a map $\|\cdot\| : V \to [0, \infty)$ such that, for all $\lambda \in \mathbb{C}$ and $x, y \in V$,
(a) $\|\lambda x\| = |\lambda| \|x\|$,
(b) $\|x + y\| \le \|x\| + \|y\|$.
A seminorm that obeys $\|x\| \neq 0$ whenever $x \neq 0$ is called a norm.

Since (b) implies $\|x - z\| \le \|x - y\| + \|y - z\|$, any norm induces a metric
$$d(x, y) = \|x - y\|, \tag{2.1}$$


so every normed vector space is a metric space, and when we use metric space terminology this refers to the induced metric (2.1). In particular, a sequence $(x_n)_{n=1}^\infty$ in $V$ is Cauchy if
$$\lim_{N \to \infty} \sup_{n, m \ge N} \|x_m - x_n\| = 0,$$
and it is convergent if for some $x \in V$, $\lim_{n \to \infty} \|x_n - x\| = 0$. We denote convergence as always by $\lim_{n \to \infty} x_n = x$ or $x_n \to x$, $n \to \infty$.

Definition 2.2. A Banach space is a normed vector space which is complete with respect to the induced metric.

Example 2.3. For any $n \in \mathbb{N}$, $\mathbb{C}^n$ is a Banach space with any of the norms
$$\|z\|_1 = \sum_{j=1}^n |z_j|, \qquad \|z\|_2 = \sqrt{\sum_{j=1}^n |z_j|^2}, \qquad \|z\|_\infty = \max_{1 \le j \le n} |z_j|.$$
The proofs for $\|\cdot\|_1$ and $\|\cdot\|_\infty$ are the same as in $\mathbb{R}^n$. The proof for $\|\cdot\|_2$ can be done by adapting the proof from $\mathbb{R}^n$, or more quickly by interpreting $\mathbb{C}^n$ as $\mathbb{R}^{2n}$ in the standard way and noting that $\|\cdot\|_2$ rewrites as the Euclidean norm on $\mathbb{R}^{2n}$ since $|z_j|^2 = (\operatorname{Re} z_j)^2 + (\operatorname{Im} z_j)^2$.

Later in this chapter, Example 2.3 will be vastly generalized in the context of $L^p$ spaces, and further examples will be discussed.

Lemma 2.4. In any normed vector space,
$$\bigl|\, \|x\| - \|y\| \,\bigr| \le \|x - y\|. \tag{2.2}$$

Proof. By the triangle inequality, $\|y\| \le \|x\| + \|y - x\|$ and $\|x\| \le \|y\| + \|x - y\|$. Rearranging and using $\|y - x\| = \|-(x - y)\| = \|x - y\|$, we get $-\|x - y\| \le \|x\| - \|y\| \le \|x - y\|$, which is equivalent to (2.2).

The estimate (2.2) implies one of the basic continuity observations:

Lemma 2.5. For any normed vector space $V$, the following are continuous:
(a) the norm $V \to [0, \infty)$, $x \mapsto \|x\|$;
(b) vector addition $V \times V \to V$, $(x, y) \mapsto x + y$;
(c) scalar multiplication $\mathbb{C} \times V \to V$, $(\lambda, x) \mapsto \lambda x$.

Proof. If $x_n \to x$, then by definition, $\|x_n - x\| \to 0$, so the inequality $\bigl|\, \|x_n\| - \|x\| \,\bigr| \le \|x_n - x\|$ implies $\|x_n\| \to \|x\|$. Thus, the norm is continuous. Continuity of addition follows similarly from
$$\|(x_n + y_n) - (x + y)\| \le \|x_n - x\| + \|y_n - y\|,$$
and continuity of scalar multiplication from
$$\|\lambda_n x_n - \lambda x\| \le \|\lambda_n (x_n - x)\| + \|(\lambda_n - \lambda) x\| = |\lambda_n| \|x_n - x\| + |\lambda_n - \lambda| \|x\|.$$

Series in a Banach space are again defined as a limit of partial sums:

Lemma 2.6 (Weierstrass). For a sequence $(x_n)_{n=1}^\infty$ in a Banach space $V$ such that $\sum_{n=1}^\infty \|x_n\| < \infty$, the series
$$\sum_{n=1}^\infty x_n = \lim_{N \to \infty} \sum_{n=1}^N x_n$$
is convergent in $V$ and
$$\Bigl\| \sum_{n=1}^\infty x_n \Bigr\| \le \sum_{n=1}^\infty \|x_n\|.$$

Proof. Denote $S_n = \sum_{k=1}^n x_k$. For $m < n$, by the triangle inequality,
$$\|S_m - S_n\| = \Bigl\| \sum_{k=m+1}^n x_k \Bigr\| \le \sum_{k=m+1}^n \|x_k\| \le \sum_{k=m+1}^\infty \|x_k\|.$$
Since tails of convergent series converge to 0, this implies that $(S_n)_{n=1}^\infty$ is a Cauchy sequence, so it is convergent. By continuity of the norm,
$$\Bigl\| \lim_{n \to \infty} S_n \Bigr\| = \lim_{n \to \infty} \|S_n\| \le \lim_{n \to \infty} \sum_{k=1}^n \|x_k\| = \sum_{k=1}^\infty \|x_k\|.$$

Definition 2.7. A nonempty set $S \subset V$ is a subspace of a normed vector space $V$ if $x, \tilde x \in S$ implies $x + \tilde x \in S$ and $\lambda \in \mathbb{C}$, $x \in S$ implies $\lambda x \in S$. $S$ is a closed subspace of $V$ if $S$ is a subspace, and $S$ is closed with respect to the induced metric from $V$.

Any subspace of a normed vector space is also a normed vector space with the inherited norm. However, only a closed subspace of a Banach space is a Banach space with the inherited norm. Finite-dimensional subspaces of a Banach space are always closed (Exercise 2.1).

Finally, we describe a general construction for obtaining a normed vector space from a vector space equipped with a seminorm.

Lemma 2.8. Let $V$ be a vector space with a seminorm $\|\cdot\|$. Then:
(a) $V_0 = \{x \in V \mid \|x\| = 0\}$ is a vector subspace of $V$;
(b) if $x - y \in V_0$, then $\|x\| = \|y\|$;
(c) on the quotient vector space $V / V_0$, $\|[x]\| := \|x\|$ defines a norm.


Proof. (a) Let $\lambda \in \mathbb{C}$ and $x, y \in V_0$. Then $\|\lambda x\| = |\lambda| \|x\| = 0$ and $0 \le \|x + y\| \le \|x\| + \|y\| = 0$, so $\lambda x, x + y \in V_0$. (b) follows from (2.2). (c) By (b), $\|[x]\| = \|x\|$ defines a function on $V / V_0$. It inherits seminorm properties from the seminorm on $V$, and $[x] \neq [0]$ implies $x \notin V_0$, so $\|[x]\| = \|x\| \neq 0$.

2.2. The Banach space $C(K)$

If $X$, $Y$ are metric spaces, $C(X, Y)$ denotes the set of continuous functions from $X$ to $Y$. We are particularly interested in the set $C(K) = C(K, \mathbb{C})$ of continuous functions from a compact $K$ to $\mathbb{C}$. Compactness of $K$ implies that every $f \in C(K)$ is bounded and has a maximum absolute value, so
$$\|f\| = \sup_{x \in K} |f(x)| = \max_{x \in K} |f(x)|$$
defines a norm on $C(K)$.

Theorem 2.9. If $K$ is a compact metric space, $C(K)$ is a Banach space.

Proof. Let $(f_n)_{n=1}^\infty$ be a Cauchy sequence in $C(K)$. Since $|f_m(x) - f_n(x)| \le \|f_m - f_n\|$, $(f_n(x))_{n=1}^\infty$ is a Cauchy sequence in $\mathbb{C}$ for every $x \in K$. Thus, $f(x) = \lim_{n \to \infty} f_n(x)$ exists pointwise. For every $\epsilon > 0$ there exists $N$ such that for all $m, n \ge N$ and all $x \in K$, $|f_m(x) - f_n(x)| < \epsilon$. Taking $m \to \infty$ gives
$$\forall \epsilon > 0 \ \ \exists N \in \mathbb{N} \ \ \forall n \ge N \ \ \forall x \in K \quad |f(x) - f_n(x)| \le \epsilon. \tag{2.3}$$

Fix $y \in K$ and $\epsilon > 0$. Use (2.3) to choose $n$ such that $\sup_{x \in K} |f(x) - f_n(x)| \le \epsilon$. Choose $\delta > 0$ such that $d(x, y) < \delta$ implies $|f_n(x) - f_n(y)| < \epsilon$. Then $d(x, y) < \delta$ implies
$$|f(x) - f(y)| \le |f(x) - f_n(x)| + |f_n(x) - f_n(y)| + |f_n(y) - f(y)| < 3\epsilon.$$
Since $\epsilon$ is arbitrary, $f$ is continuous at $y$. Since $y$ is arbitrary, $f \in C(K)$. Now (2.3) says that $\lim_{n \to \infty} \|f - f_n\| = 0$, so $C(K)$ is complete.

Convergence in $C(K)$ is called uniform convergence. Perhaps surprisingly, uniform convergence can be characterized by pointwise convergence:

Lemma 2.10. Let $K$ be a compact metric space and let $f_n \in C(K)$, $f \in C(K)$ be functions with the property that
$$\lim_{n \to \infty} x_n = x \implies \lim_{n \to \infty} f_n(x_n) = f(x). \tag{2.4}$$
Then $f_n$ converge to $f$ uniformly on $K$.


Proof. We define a function $F$ on the compact set $L = K \times (\{0\} \cup \{1/n \mid n \in \mathbb{N}\})$ by $F(x, 1/n) = f_n(x)$ and $F(x, 0) = f(x)$, and we use the fact that most points in $\{0\} \cup \{1/n \mid n \in \mathbb{N}\}$ are isolated. At a point of the form $(x, 1/n)$, continuity of $F$ follows from continuity of $f_n$. At a point of the form $(x, 0)$, continuity of $F$ follows from continuity of $f$ and (2.4). Since $F$ is continuous on the compact set $L$, it is uniformly continuous. Uniform continuity lets us estimate $F(x, 1/n) - F(x, 0)$ uniformly in $x$, which precisely gives uniform convergence of $f_n$ to $f$.

A subset of a metric space is called precompact if its closure is compact. Bounded subsets of $C(K)$ are not in general precompact; to formulate a criterion for precompactness, we need the following notions.

Definition 2.11. Let $X$ be a metric space with metric $d$. A family $F \subset C(X, \mathbb{C})$ is said to be
(a) pointwise bounded if $\sup_{f \in F} |f(x)| < \infty$ for every $x \in X$;
(b) equicontinuous if for every $x \in X$ and $\epsilon > 0$, there is $\delta > 0$ such that $|f(x) - f(y)| < \epsilon$ holds for all $f \in F$ and all $y \in X$ with $d(x, y) < \delta$.

Theorem 2.12 (Arzelà–Ascoli). Let $K$ be a compact metric space. If $F \subset C(K)$ is pointwise bounded and equicontinuous, any sequence in $F$ has a convergent subsequence in $C(K)$.

The proof has several steps, which we separate for independent interest and formulate in a more general setting with other applications in mind.

Lemma 2.13. If $X$ is separable and $F \subset C(X, \mathbb{C})$ is pointwise bounded, then any sequence in $F$ has a subsequence which converges pointwise on a dense subset of $X$.

Proof. Since $X$ is separable, it has a countable dense subset $\{x_k \mid k \in \mathbb{N}\}$. To construct a subsequence of a sequence $(f_n)_{n=1}^\infty$ in $F$, we use a diagonalization argument. Denote $j(0, n) = n$. Inductively in $k \in \mathbb{N}$, since $\sup_{n \in \mathbb{N}} |f_n(x_k)| < \infty$, the sequence $\{j(k-1, n)\}_{n=1}^\infty$ has a subsequence $\{j(k, n)\}_{n=1}^\infty$ such that $\lim_{n \to \infty} f_{j(k,n)}(x_k)$ exists. Since $\{j(n, n)\}_{n=k}^\infty$ is a subsequence of $\{j(k, n)\}_{n=1}^\infty$, the subsequence $\{f_{j(n,n)}\}_{n=1}^\infty$ converges at $x_k$ for every $k$.

Theorem 2.14. Let $f_n : X \to \mathbb{C}$ be an equicontinuous sequence of functions on a metric space $X$, which converges pointwise on a dense set in $X$. Then the sequence converges pointwise everywhere, the pointwise limit $f$ is continuous, and the functions have property (2.4). In particular, $f_n$ converge to $f$ uniformly on compact subsets $K \subset X$.


Proof. We define
$$C(x) = \lim_{N \to \infty} \sup_{m, n \ge N} |f_m(x) - f_n(x)|.$$
By definition, the sequence $(f_n(x))_{n=1}^\infty$ is Cauchy if and only if $C(x) = 0$. For any $x \in X$ and $\epsilon > 0$, by equicontinuity, there exists $\delta > 0$ such that
$$|f_n(x) - f_n(y)| < \epsilon \qquad \forall n \in \mathbb{N} \ \ \forall y \in B_\delta(x), \tag{2.5}$$
where we denote $B_\delta(x) = \{y \in X \mid d(x, y) < \delta\}$. By using
$$|f_m(x) - f_n(x)| \le |f_m(x) - f_m(y)| + |f_m(y) - f_n(y)| + |f_n(y) - f_n(x)|,$$
inequality (2.5) implies $C(x) \le 2\epsilon + C(y)$ for $y \in B_\delta(x)$. Since there is a dense set of $y \in X$ such that $C(y) = 0$, this implies $C(x) \le 2\epsilon$, and since $\epsilon > 0$ is arbitrary, it implies $C(x) = 0$. Thus, $f_n(x)$ converges pointwise. Taking $n \to \infty$ in (2.5) shows continuity of $f(x) = \lim_{n \to \infty} f_n(x)$.

For fixed $x \in X$ and $\epsilon > 0$, choose $N$ so that $|f_n(x) - f(x)| < \epsilon$ for $n \ge N$. Combining this with (2.5) shows that $|f(x) - f_n(y)| < 2\epsilon$ for all $n \ge N$ and all $y \in B_\delta(x)$. Since $\epsilon > 0$ is arbitrary, this proves (2.4). By Lemma 2.10, $f_n$ converge uniformly to $f$ on compact $K \subset X$.

Proof of Theorem 2.12. Since $K$ is compact, it is separable. Thus, by Lemma 2.13, any sequence has a subsequence which converges pointwise on some dense set. By Theorem 2.14, this subsequence converges uniformly on $K$.
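To see what equicontinuity and property (2.4) rule out, it may help to look at a standard counterexample that is not taken from the text: $f_n(x) = n x e^{-nx}$ on $K = [0, 1]$. The sketch below (the grid and the chosen values of $n$ are illustrative assumptions) shows that $f_n \to 0$ pointwise while the sup norms stay at $e^{-1}$, and that along $x_n = 1/n \to 0$ the values $f_n(x_n)$ do not tend to $f(0) = 0$.

```python
import math

def f(n, x):
    """f_n(x) = n x exp(-n x), a pointwise bounded sequence in C([0, 1])."""
    return n * x * math.exp(-n * x)

ns = (10, 100, 1000)
grid = [i / 10000 for i in range(10001)]      # sample points of K = [0, 1]

# Pointwise convergence to 0 at a fixed point x = 0.5 ...
pointwise_at_half = [f(n, 0.5) for n in ns]

# ... but the sup norms do not decay: the maximum e^{-1} is attained at x = 1/n.
sup_norms = [max(f(n, x) for x in grid) for n in ns]

# Property (2.4) fails along x_n = 1/n -> 0, since f_n(x_n) = e^{-1} for all n.
along_xn = [f(n, 1.0 / n) for n in ns]
```

The family is pointwise bounded but not equicontinuous at $x = 0$, and no subsequence converges in $C(K)$: a uniform limit would have to be the pointwise limit $0$, while every $\|f_n\|$ equals $e^{-1}$.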

The remainder of this section is dedicated to an important criterion for density in $C(K)$, called the complex Stone–Weierstrass theorem. The criterion uses additional algebraic structure of $C(K)$: in addition to the vector space structure, $C(K)$ is equipped with the binary operation of pointwise multiplication, because products of continuous functions are continuous. Moreover, pointwise multiplication in $C(K)$ is continuous, since
$$\|f_n g_n - f g\| \le \|f_n\| \|g_n - g\| + \|f_n - f\| \|g\|.$$

Definition 2.15. Let $\mathbb{F} = \mathbb{R}$ or $\mathbb{F} = \mathbb{C}$. A subset $S \subset C(K, \mathbb{F})$ is a subalgebra of $C(K, \mathbb{F})$ if it contains the constant function 1 and is closed under pointwise addition, pointwise multiplication, and scalar multiplication by $\lambda \in \mathbb{F}$. $S$ separates points if for every $x \neq y$, there is $f \in S$ such that $f(x) \neq f(y)$.

Theorem 2.16 (Stone–Weierstrass). Let $S$ be a subalgebra of $C(K, \mathbb{R})$. If $S$ separates points on $K$ and if $1 \in S$, then $S$ is dense in $C(K, \mathbb{R})$.


The proof uses approximation of $|\cdot|$ by polynomials on compact intervals:

Lemma 2.17. For any $a > 0$ and $\delta > 0$, there is a polynomial $P$ such that
$$\max_{y \in [-a, a]} \bigl|\, |y| - P(y) \,\bigr| \le \delta. \tag{2.6}$$

Proof. By the binomial expansion, for $x \in (-1, 1)$,
$$\sqrt{1 - x} = 1 - \sum_{k=1}^\infty c_k x^k, \qquad c_k = (-1)^{k-1} \binom{1/2}{k} = \frac{1 \times 3 \times \cdots \times (2k - 3)}{2^k k!}.$$
Since $c_k > 0$ for all $k$, by monotone convergence,
$$\sum_{k=1}^\infty c_k = \lim_{x \uparrow 1} \sum_{k=1}^\infty c_k x^k = \lim_{x \uparrow 1} (1 - \sqrt{1 - x}) = 1.$$
In particular, the series $\sum_{k=1}^\infty c_k$ is convergent and the binomial expansion holds at $x = 1$ as well, by continuity of both sides. Denote $Q_n(x) = 1 - \sum_{k=1}^n c_k x^k$. Then for all $x \in [0, 1]$,
$$\bigl| \sqrt{1 - x} - Q_n(x) \bigr| = \sum_{k=n+1}^\infty c_k x^k \le \sum_{k=n+1}^\infty c_k.$$
By the decay of tails of convergent series, the polynomials $Q_n$ converge to $\sqrt{1 - x}$ uniformly on $[0, 1]$ as $n \to \infty$. By substituting $x = 1 - y^2/a^2$, polynomials $Q_n(1 - y^2/a^2)$ converge to $|y|/a$ uniformly on $y \in [-a, a]$. Multiplying by $a$ and using $P(y) = a Q_n(1 - y^2/a^2)$ for large enough $n$ gives (2.6).

Separation of points enters through the following lemma, which allows us to arbitrarily prescribe values at two points:

Lemma 2.18. Assume that $S$ separates points on $K$ and $1 \in S$. Then, for any $x, y \in K$ and $f \in C(K, \mathbb{R})$, there exists $h_{x,y} \in S$ such that
$$h_{x,y}(x) = f(x), \qquad h_{x,y}(y) = f(y).$$

Proof. If $x = y$, this is trivial: it suffices to take $h_{x,y}$ a constant function. From now on we assume $x \neq y$. Point evaluations at $x$, $y$ give a map
$$h \mapsto \begin{pmatrix} h(x) \\ h(y) \end{pmatrix}$$
from $S$ to $\mathbb{R}^2$. Since $S$ is a vector space, the image under this map is a vector subspace of $\mathbb{R}^2$. Since $1 \in S$, the image contains $\binom{1}{1}$. There exists $H_{x,y} \in S$ such that $H_{x,y}(x) \neq H_{x,y}(y)$. Thus, the image contains two linearly independent vectors in $\mathbb{R}^2$, so it contains all of $\mathbb{R}^2$. In other words, by choosing a linear combination of $H_{x,y}$ and $1$, denoted $h_{x,y} \in S$, we can ensure that $h_{x,y}(x) = f(x)$ and $h_{x,y}(y) = f(y)$.
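The polynomial approximation of $|y|$ from Lemma 2.17 can be tested numerically. The following sketch is an illustration under stated assumptions (the interval half-width `a = 2.0`, the sample count, and the truncation orders are arbitrary choices): it computes the coefficients $c_k$ through the recurrence $c_{k+1} = c_k (2k - 1)/(2k + 2)$, which follows from the closed formula for $c_k$, forms $P(y) = a\,Q_n(1 - y^2/a^2)$, and records the maximal sampled error on $[-a, a]$.

```python
def coefficients(n):
    """c_1, ..., c_n with c_k = (-1)^(k-1) * binom(1/2, k),
    computed via c_1 = 1/2 and c_{k+1} = c_k * (2k - 1) / (2k + 2)."""
    c = [0.5]
    for k in range(1, n):
        c.append(c[-1] * (2 * k - 1) / (2 * k + 2))
    return c

def Q(x, c):
    """Q_n(x) = 1 - sum_{k=1}^n c_k x^k, where n = len(c)."""
    return 1.0 - sum(ck * x ** k for k, ck in enumerate(c, start=1))

def P(y, a, c):
    """P(y) = a * Q_n(1 - y^2 / a^2), the approximant of |y| on [-a, a]."""
    return a * Q(1.0 - (y / a) ** 2, c)

def max_error(a, n, samples=401):
    c = coefficients(n)
    ys = [-a + 2 * a * i / (samples - 1) for i in range(samples)]
    return max(abs(abs(y) - P(y, a, c)) for y in ys)

# Maximal sampled error for a few truncation orders on [-2, 2].
errors = {n: max_error(2.0, n) for n in (10, 100, 1000)}
```

The sampled errors decrease as $n$ grows, although slowly; the largest error occurs at $y = 0$, where the square root is not smooth, and it equals $a \sum_{k > n} c_k$, the tail bound used in the proof.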


With these ingredients we can complete the proof of the real Stone–Weierstrass theorem:

Proof of Theorem 2.16. We will work with the closed subalgebra $\overline{S}$ and prove that $\overline{S} = C(K, \mathbb{R})$. The proof consists of several steps.

The first step is to prove that $f \in \overline{S}$ implies $|f| \in \overline{S}$. If $f \in \overline{S}$, denoting $a = \|f\|$, by Lemma 2.17 for any $\epsilon > 0$ there is a polynomial $P$ such that
$$\sup_{y \in [-a, a]} \bigl|\, |y| - P(y) \,\bigr| \le \epsilon.$$
Since $f$ takes values in $[-a, a]$,
$$\bigl\|\, |f| - P \circ f \,\bigr\| = \sup_{x \in K} \bigl|\, |f(x)| - P(f(x)) \,\bigr| \le \epsilon.$$
Since $\overline{S}$ is a subalgebra, $f^k \in \overline{S}$ for any $k \in \mathbb{N}$, and then $P \circ f \in \overline{S}$. Since $\epsilon > 0$ is arbitrary, $|f| \in \overline{S}$.

It follows that, for any $f, g \in \overline{S}$,
$$\max(f, g) = \frac{f + g}{2} + \frac{|f - g|}{2} \in \overline{S},$$
and similarly $\min(f, g) \in \overline{S}$. By induction in $n \in \mathbb{N}$, maxima and minima of $n$ functions in $\overline{S}$ are also in $\overline{S}$.

From now on, let us fix $f \in C(K, \mathbb{R})$ and $\epsilon > 0$. For any $x, y \in K$, by Lemma 2.18 there exists $h_{x,y} \in S$ such that $h_{x,y}(x) = f(x)$ and $h_{x,y}(y) = f(y)$. Let us fix $x$ for the moment. By continuity, for every $y \in K$, there is an open neighborhood $U_y$ of $y$ on which $h_{x,y} > f - \epsilon$. By compactness, $K$ has a finite subcover $U_{y_1}, \dots, U_{y_n}$. Choosing $g_x = \max(h_{x,y_1}, \dots, h_{x,y_n})$ gives a function $g_x \in \overline{S}$ such that $g_x(x) = f(x)$ and $g_x > f - \epsilon$ on $K$. By continuity, any $x \in K$ has an open neighborhood $V_x$ on which $g_x < f + \epsilon$. By compactness, $K$ has a finite subcover $V_{x_1}, \dots, V_{x_m}$. The function $F = \min(g_{x_1}, \dots, g_{x_m}) \in \overline{S}$ obeys $f - \epsilon < F < f + \epsilon$ on $K$. Since $\epsilon > 0$ is arbitrary, this implies $f \in \overline{S}$.

Theorem 2.19 (Complex Stone–Weierstrass theorem). Let $K$ be a compact metric space. Let $S$ be a subalgebra of $C(K)$ which separates points, and assume also that for any $f \in S$, its complex conjugate $\bar f$ is also in $S$. Then $S$ is a dense subset of $C(K)$.

Proof. Let $S_{\mathbb{R}} = S \cap C(K, \mathbb{R})$. Clearly, $1 \in S_{\mathbb{R}}$. If $f \in S$, then $\bar f \in S$, so $\operatorname{Re} f = \frac{f + \bar f}{2} \in S_{\mathbb{R}}$. For any $x \neq y$, there exists $h \in S$ such that $h(x) \neq h(y)$.


Then $\operatorname{Re}(e^{i\phi} h) \in S_{\mathbb{R}}$ for any $\phi \in \mathbb{R}$; the choice $\phi = -\arg(h(x) - h(y))$ guarantees that $\operatorname{Re}(e^{i\phi} h(x)) \neq \operatorname{Re}(e^{i\phi} h(y))$, so $S_{\mathbb{R}}$ separates points. By Theorem 2.16, the closure of $S_{\mathbb{R}}$ is $C(K, \mathbb{R})$. It follows that $f = \operatorname{Re} f - i \operatorname{Re}(if) \in \overline{S}$ for any $f \in C(K)$.

This criterion effortlessly recovers some classical approximation results:

Corollary 2.20 (Weierstrass's first theorem). For compact $K \subset \mathbb{R}$, polynomials with complex coefficients are dense in $C(K)$.

Proof. Polynomials with complex coefficients are a subalgebra of $C(K)$, and the complex conjugate of a polynomial is a polynomial. Polynomials separate points because $f(x) = x$ is a polynomial and is injective on $K$. Thus, polynomials are dense in $C(K)$.

If we remove the assumption $K \subset \mathbb{R}$, polynomials may not be dense. An important special case is the unit circle $\partial\mathbb{D} = \{z \in \mathbb{C} \mid |z| = 1\}$. Note that for any $n \in \mathbb{Z}$, $z \mapsto z^n$ is a continuous function from $\partial\mathbb{D}$ to $\mathbb{C}$. Polynomials are not dense in $C(\partial\mathbb{D})$: for instance, the function $1/z$ cannot be approximated by polynomials since for any polynomial $p$,
$$\Bigl\| \frac{1}{z} - p(z) \Bigr\|_{C(\partial\mathbb{D})} = \Bigl\| z \Bigl( \frac{1}{z} - p(z) \Bigr) \Bigr\|_{C(\partial\mathbb{D})} \ge \Bigl| \int_0^{2\pi} \bigl( 1 - e^{ix} p(e^{ix}) \bigr) \frac{dx}{2\pi} \Bigr| = 1.$$
To obtain a dense set, one also includes negative powers of $z$:

Corollary 2.21 (Weierstrass's second theorem). The subspace of Laurent polynomials $\operatorname{span}\{z^n \mid n \in \mathbb{Z}\}$ is dense in $C(\partial\mathbb{D})$.

Proof. $S = \operatorname{span}\{z^n \mid n \in \mathbb{Z}\}$ is a subalgebra of $C(\partial\mathbb{D})$ because $z^m z^n = z^{m+n}$ and $z^0 = 1$. $S$ is closed under complex conjugation because $\overline{z^n} = z^{-n}$ on $\partial\mathbb{D}$. $S$ separates points because $f(z) = z$ is injective on $\partial\mathbb{D}$. Thus, by the complex Stone–Weierstrass theorem, $\operatorname{span}\{z^n \mid n \in \mathbb{Z}\}$ is dense in $C(\partial\mathbb{D})$.

Weierstrass's second theorem is often restated using the substitution $z = e^{it}$, as a statement about density of trigonometric polynomials $\sum_{k=m}^n c_k e^{ikt}$ (where $m, n \in \mathbb{Z}$, $m \le n$) in the space $C(\mathbb{T})$, where $\mathbb{T} = \mathbb{R}/2\pi\mathbb{Z}$.
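The lower bound showing that $1/z$ is not a uniform limit of polynomials on $\partial\mathbb{D}$ can be checked numerically. The sketch below is an illustration only: the polynomial $p$ has randomly chosen coefficients (a hypothetical test case), and the integral over the circle is replaced by an average over $N$ equally spaced points, which is exact for polynomials of degree less than $N$.

```python
import cmath
import random

def poly(coeffs, z):
    """Evaluate p(z) = sum_k coeffs[k] * z**k by Horner's rule."""
    acc = 0j
    for c in reversed(coeffs):
        acc = acc * z + c
    return acc

random.seed(0)
# A random polynomial of degree 5 with complex coefficients (illustrative choice).
coeffs = [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(6)]

N = 64
circle = [cmath.exp(2j * cmath.pi * k / N) for k in range(N)]

# Average of 1 - z p(z) over the circle: every power z^k with 1 <= k < N
# averages to zero, so only the constant term 1 survives.
mean_value = sum(1 - z * poly(coeffs, z) for z in circle) / N

# Sampled sup-distance between 1/z and p on the circle; it is at least |mean_value|.
sup_distance = max(abs(1 / z - poly(coeffs, z)) for z in circle)
```

Whatever coefficients are chosen, `mean_value` stays equal to 1 up to rounding, so `sup_distance` cannot drop below 1; this is the obstruction that the negative powers in Corollary 2.21 remove.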

Some other concrete density results are left to Exercises 2.2, 2.3, and 2.4. Each of these applications implies separability of the corresponding space C(K). For instance, it follows from Weierstrass’s ﬁrst theorem that the set of polynomials with coeﬃcients in Q + iQ is a countable dense subset of C([a, b]), so C([a, b]) is separable. Those are special cases of a general fact: Theorem 2.22. If K is a compact metric space, then C(K) is separable.


Proof. Let $\{x_n\}_{n=1}^\infty$ be a countable dense set in $K$, and denote by $d$ the metric in $K$. Define for $n, m \in \mathbb{N}$
$$f_{n,m}(x) = \max(1 - m\,d(x, x_n), 0).$$
This is a countable set of functions which separates points. The set of all finite products
$$V = \{1\} \cup \{ f_{n_1,m_1} \cdots f_{n_k,m_k} \mid k \in \mathbb{N},\ n_1, \dots, n_k, m_1, \dots, m_k \in \mathbb{N} \}$$
is also countable and is closed under multiplication. Thus, $\operatorname{span} V$ obeys all the assumptions of the Stone–Weierstrass theorem, so $\operatorname{span} V$ is dense in $C(K)$. Any linear combination of elements of $V$ can be approximated by one with coefficients in $\mathbb{Q} + i\mathbb{Q}$. Thus, linear combinations of elements of $V$ with coefficients in $\mathbb{Q} + i\mathbb{Q}$ are dense in $C(K)$; the set of such linear combinations is countable.

2.3. $L^p$ spaces

Fix a measure $\mu$ on $X$ and $p \in [1, \infty]$. For $f : X \to \mathbb{C}$, define
$$\|f\|_p = \Bigl( \int_X |f|^p\,d\mu \Bigr)^{1/p}, \qquad p \in [1, \infty), \tag{2.7}$$
$$\|f\|_\infty = \inf\{ t \in [0, \infty] \mid |f(x)| \le t \text{ for } \mu\text{-a.e. } x \}. \tag{2.8}$$
First, let us prove that the inf in (2.8) is a minimum:

Lemma 2.23. Let $f : X \to \mathbb{C}$. For $\mu$-a.e. $x$, $|f(x)| \le \|f\|_\infty$.

Proof. There exists a sequence of $t_n \ge \|f\|_\infty$ such that $t_n \to \|f\|_\infty$ and $\mu(\{x \mid |f(x)| > t_n\}) = 0$. A countable union of zero measure sets has zero measure, so $\mu(\{x \mid |f(x)| > \|f\|_\infty\}) = 0$.

In other words: $|f| \le t$ holds $\mu$-a.e. if and only if $t \ge \|f\|_\infty$.

Corollary 2.24. $\|f\|_p = 0$ if and only if $f = 0$ holds $\mu$-a.e.

Proof. For $p = \infty$ this follows from Lemma 2.23, and for $p \in [1, \infty)$ from Proposition 1.57 applied to $|f|^p$.

For any $p \in [1, \infty]$, define
$$\mathcal{L}^p(X, d\mu) = \{ f : X \to \mathbb{C} \mid \|f\|_p < \infty \}.$$
We will see that $\|f\|_p$ is a seminorm on $\mathcal{L}^p(X, d\mu)$; passing to a quotient space will then give a normed vector space $L^p(X, d\mu)$. The property $\|\lambda f\|_p = |\lambda| \|f\|_p$ is immediate, so it remains to prove the triangle inequality. The case $p = 1$ follows from
$$\|f + g\|_1 = \int |f + g|\,d\mu \le \int |f|\,d\mu + \int |g|\,d\mu = \|f\|_1 + \|g\|_1.$$


The case $p = \infty$ follows from
$$|f(x) + g(x)| \le |f(x)| + |g(x)| \le \|f\|_\infty + \|g\|_\infty \qquad \text{for } \mu\text{-a.e. } x.$$
Although the cases $p = 1$ and $p = \infty$ are easier, they require separate treatment so we will exclude them in the arguments below, leaving them as an exercise.

If $p, q \in [1, \infty]$ and $\frac{1}{p} + \frac{1}{q} = 1$, then $p$, $q$ are called conjugate exponents.

Lemma 2.25 (Young's inequality). Let $p, q \in (1, \infty)$ be conjugate exponents. For any $x, y \ge 0$,
$$xy \le \frac{x^p}{p} + \frac{y^q}{q}. \tag{2.9}$$

Proof. The exponential function is convex, so for all $u, v \in \mathbb{R}$ and $t \in (0, 1)$, $e^{tu + (1-t)v} \le t e^u + (1 - t) e^v$ (Exercise 2.6). Using $u = \log x^p$, $v = \log y^q$, $t = \frac{1}{p}$, $1 - t = \frac{1}{q}$ proves (2.9) for $x, y > 0$. If $x = 0$ or $y = 0$, the inequality is trivial.

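Theorem 2.26 (Hölder's inequality). If $p, q \in (1, \infty)$ are conjugate exponents and $f \in \mathcal{L}^p(X, d\mu)$, $g \in \mathcal{L}^q(X, d\mu)$, then $\bar g f \in \mathcal{L}^1(X, d\mu)$ and
$$\Bigl| \int \bar g f\,d\mu \Bigr| \le \|f\|_p \|g\|_q. \tag{2.10}$$

Proof. If $\|f\|_p = 0$, then $f = 0$ $\mu$-a.e., so $\bar g f = 0$ $\mu$-a.e., and the statement is trivial. If $\|f\|_p \neq 0$, by dividing by $\|f\|_p$, it suffices to consider the case $\|f\|_p = 1$. Similarly, it suffices to consider $\|g\|_q = 1$. By Young's inequality,
$$|g(x) f(x)| \le \frac{|f(x)|^p}{p} + \frac{|g(x)|^q}{q},$$
so by integrating,
$$\Bigl| \int \bar g f\,d\mu \Bigr| \le \int |g f|\,d\mu \le \frac{\|f\|_p^p}{p} + \frac{\|g\|_q^q}{q} = \frac{1}{p} + \frac{1}{q} = 1 = \|f\|_p \|g\|_q.$$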

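Noting that for any $f$ it is possible to choose $g$ such that equality holds, we obtain the following corollary.

Corollary 2.27. Let $p, q \in (1, \infty)$ obey $\frac{1}{p} + \frac{1}{q} = 1$. For any $f \in \mathcal{L}^p(X, d\mu)$,
$$\|f\|_p = \max_{\substack{g \in \mathcal{L}^q(X, d\mu) \\ \|g\|_q = 1}} \Bigl| \int \bar g f\,d\mu \Bigr|. \tag{2.11}$$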


Proof. By rescaling and using Hölder's inequality, it suffices to show that there exists $g \in \mathcal{L}^q(X, d\mu)$ with $\|g\|_q = 1$ and $\int \bar g f\,d\mu = \|f\|_p = 1$. It is straightforward to verify this for
$$g(x) = \begin{cases} |f(x)|^{p-2} f(x) & f(x) \neq 0, \\ 0 & f(x) = 0. \end{cases}$$

Hölder's inequality holds also for $p = 1$ and $p = \infty$ (see also Exercise 2.7). Corollary 2.27 characterizes $\|f\|_p$ as an extremum over linear expressions in $f$, which is useful for proving subadditivity of the norm:

Theorem 2.28 (Minkowski's inequality). For any $p \in [1, \infty]$, for all $f_1, f_2 \in \mathcal{L}^p(X, d\mu)$,
$$\|f_1 + f_2\|_p \le \|f_1\|_p + \|f_2\|_p. \tag{2.12}$$

Proof. The cases $p = 1$, $p = \infty$ are easy and were proved before. Let $p \in (1, \infty)$ and denote by $q$ the conjugate exponent. For any $f_1, f_2 \in \mathcal{L}^p(X, d\mu)$ and $g \in \mathcal{L}^q(X, d\mu)$, by the triangle inequality,
$$\Bigl| \int_X \bar g (f_1 + f_2)\,d\mu \Bigr| \le \Bigl| \int_X \bar g f_1\,d\mu \Bigr| + \Bigl| \int_X \bar g f_2\,d\mu \Bigr| \le \|g\|_q \|f_1\|_p + \|g\|_q \|f_2\|_p.$$
Taking the supremum over $g$ with $\|g\|_q = 1$ gives (2.12).
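Hölder's inequality, the extremal function from Corollary 2.27, and Minkowski's inequality can all be observed on a finite measure space. The sketch below is an illustration under stated assumptions: $X$ is a set of 50 points, $\mu = \sum_i w_i \delta_i$ with randomly drawn positive weights, and the functions `f1`, `f2` are random complex vectors; none of these choices come from the text.

```python
import random

def norm(f, w, p):
    """||f||_p for the measure mu = sum_i w[i] * delta_i on a finite set."""
    return sum(wi * abs(fi) ** p for fi, wi in zip(f, w)) ** (1.0 / p)

def pairing(g, f, w):
    """Integral of conj(g) * f with respect to mu."""
    return sum(wi * gi.conjugate() * fi for gi, fi, wi in zip(g, f, w))

random.seed(1)
m = 50
w = [random.uniform(0.1, 2.0) for _ in range(m)]
f1 = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(m)]
f2 = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(m)]

p = 3.0
q = p / (p - 1.0)                      # conjugate exponent: 1/p + 1/q = 1

# Hoelder: |int conj(f2) f1 dmu| <= ||f1||_p ||f2||_q, so this gap is nonnegative.
holder_gap = norm(f1, w, p) * norm(f2, w, q) - abs(pairing(f2, f1, w))

# Extremizer from the proof of Corollary 2.27, after normalizing ||f||_p = 1.
nf = norm(f1, w, p)
f = [fi / nf for fi in f1]
g = [abs(fi) ** (p - 2) * fi if fi != 0 else 0j for fi in f]
extremal_norm = norm(g, w, q)          # expected to be 1 up to rounding
extremal_pairing = pairing(g, f, w)    # expected to be 1 up to rounding

# Minkowski: ||f1 + f2||_p <= ||f1||_p + ||f2||_p, so this gap is nonnegative.
minkowski_gap = norm(f1, w, p) + norm(f2, w, p) - norm(
    [a + b for a, b in zip(f1, f2)], w, p
)
```

The identity $|g|^q = |f|^{(p-1)q} = |f|^p$ is what makes `extremal_norm` equal to 1, and $\bar g f = |f|^p$ is what makes `extremal_pairing` equal to $\|f\|_p^p = 1$.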

Collecting the facts, we have proved the following:

Theorem 2.29. For any $p \in [1, \infty]$, $\|\cdot\|_p$ is a seminorm on $\mathcal{L}^p(X, d\mu)$ and $\|f\|_p = 0$ if and only if $f(x) = 0$ for $\mu$-a.e. $x$.

Thus, with the zero norm subspace $V_0 = \{f : X \to \mathbb{C} \mid f = 0 \ \mu\text{-a.e.}\}$, the quotient space construction in Lemma 2.8 gives the normed vector space
$$L^p(X, d\mu) = \mathcal{L}^p(X, d\mu)/V_0 = \{ [f] \mid \|f\|_p < \infty \}.$$
This is commonly phrased in terms of an equivalence relation, $f \sim g$ if and only if $f = g$ $\mu$-a.e. An element of $L^p(X, d\mu)$ is an equivalence class $[f]$ corresponding to a Borel function $f$, but following standard conventions, when considering elements of $L^p(X, d\mu)$, we typically do not distinguish between a Borel function and its equivalence class. For instance, since Corollary 2.27 is not affected by changing $f$ or $g$ on a zero measure set, it can be formulated as a statement for the quotient spaces: for any $f \in L^p(X, d\mu)$,
$$\|f\|_p = \max_{\substack{g \in L^q(X, d\mu) \\ \|g\|_q = 1}} \Bigl| \int \bar g f\,d\mu \Bigr|. \tag{2.13}$$

Let us consider completeness. For $p = \infty$, convergence in $L^\infty(X, d\mu)$ corresponds to a "$\mu$-a.e." version of uniform convergence, so by working away from a zero measure set, the proof of completeness of $C(K)$ also proves that $L^\infty(X, d\mu)$ is a Banach space. It remains to consider $p \in [1, \infty)$.

Theorem 2.30 (Riesz–Fischer). For $p \in [1, \infty)$, $L^p(X, d\mu)$ is a Banach space.

Proof. Let $(f_n)_{n=1}^\infty$ be a Cauchy sequence in $L^p(X, d\mu)$. By general metric space arguments, there is a subsequence $(f_{n_k})_{k=1}^\infty$ such that $\|f_{n_{k+1}} - f_{n_k}\|_p \le \frac{1}{4^k}$. It is notationally convenient to use $f_{n_0} = 0$ and to conclude
$$\sum_{k=1}^\infty \|f_{n_k} - f_{n_{k-1}}\|_p < \infty.$$

Consider h(x) = ∞ k=1 |fnk (x) − fnk−1 (x)|. By monotone convergence, p m p |h| dμ = lim |f (x) − f (x)| dμ, nk nk−1 m→∞ k=1

so taking pth roots and using Minkowski’s inequality (2.12), % % m m % % % % % % %fn − fn % . hp = lim % |fnk − fnk−1 |% ≤ lim k k−1 p m→∞ % m→∞ % k=1

p

k=1

The right-hand side is a convergent series, so h ∈ Lp (X, dμ), and h < ∞ μ-a.e. By the deﬁnition of h, this implies that for μ-a.e. x, the sequence (fnk (x))∞ k=1 is Cauchy and |fnk (x)| ≤ |h(x)| for all k, so the pointwise limit f (x) = limk→∞ fnk (x) exists μ-a.e., and |f | ≤ h. Due to |fn | ≤ h and |f | ≤ h, we estimate |fn − f |p ≤ 2p hp , so by dominated convergence with the dominating function 2p hp , lim |fnk − f |p dμ = lim |fnk − f |p dμ = 0, k→∞

k→∞

so fnk − f p → 0. By general metric space arguments, since the Cauchy sequence (fn )∞ n=1 has a convergent subsequence, it is convergent. This proof yields an additional fact which will be useful. Convergence in p-norm does not imply pointwise convergence, and pointwise convergence does not imply convergence in p-norm; however: p Corollary 2.31. If (fn )∞ n=1 is a sequence such that fn → f in L (X, dμ) and fn → g pointwise μ-a.e., then f = g μ-a.e.

Proof. By the proof of the Riesz–Fischer theorem, there is a subsequence p (fnk )∞ k=1 which converges both in L (X, dμ) and pointwise to the same limit. That limit must be equal μ-a.e. to both f and g, so f = g μ-a.e.
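The two non-implications mentioned before Corollary 2.31 can be seen on standard examples. The setting below is an assumption chosen for illustration: $X = [0, 1]$ with Lebesgue measure and $p = 1$.

```latex
% Illustrative examples (not from the text) on X = [0,1], Lebesgue measure, p = 1.
% (i) Pointwise convergence without norm convergence:
\[
f_n = n\,\chi_{(0,1/n)}, \qquad
f_n(x) \to 0 \ \text{for every } x \in [0,1], \qquad
\|f_n\|_1 = n \cdot \tfrac{1}{n} = 1 \ \text{for all } n .
\]
% (ii) Norm convergence without pointwise convergence ("typewriter" sequence):
\[
g_{m,j} = \chi_{[j 2^{-m},\,(j+1) 2^{-m}]}, \quad 0 \le j < 2^m, \qquad
\|g_{m,j}\|_1 = 2^{-m} \to 0 .
\]
% Enumerating g_{m,j} by increasing m, every x in [0,1] lies in some interval
% for each m and misses some interval for each m >= 2, so the values at x
% are 1 infinitely often and 0 infinitely often.
```

In case (i), Corollary 2.31 shows that $f_n$ cannot converge in $L^1$ to any limit: such a limit would have to be $0$ a.e., which is impossible since $\|f_n\|_1 = 1$.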


We now start using topological properties of $X$ to prove a density statement. We will work in the setting of $\sigma$-locally compact spaces and Baire measures on them, as defined in Section 1.8.

Theorem 2.32. Let $X$ be a $\sigma$-locally compact metric space, let $\mu$ be a Baire measure on $X$, and let $p \in [1, \infty)$. Then $C_c(X)$ is a dense subset of $L^p(X, d\mu)$.

Proof. First, we note that any $f \in C_c(X)$ is in $L^p(X, d\mu)$, because $|f| \le C\chi_K$ for some $C > 0$ and $K$ compact, and because $\mu$ is finite on compacts. Denote by $M$ the closure of $C_c(X)$ in $L^p(X, d\mu)$. Since $C_c(X)$ is a vector subspace of $L^p(X, d\mu)$, so is $M$. We will show $M = L^p(X, d\mu)$.

Consider a Borel set $B \subset X$ with $\mu(B) < \infty$. By regularity of $\mu$, for any $\epsilon > 0$, there exist compact $K$ and open $V$ such that $K \subset B \subset V$ and $\mu(V \setminus K) < \epsilon$. By Lemma 1.89, there exists $f \in C_c(X)$ with $\chi_K \le f \le \chi_V$. It follows that $|f - \chi_B| \le \chi_{V \setminus K}$, so $\|f - \chi_B\|_p^p \le \mu(V \setminus K) < \epsilon$. Since $\epsilon > 0$ is arbitrary, $\chi_B \in M$. Thus, $M$ contains $\chi_B$ for all Borel sets $B$ with $\mu(B) < \infty$. By taking linear combinations, any simple function $s \in L^p(X, d\mu)$ is in $M$.

Any positive function $f \in L^p(X, d\mu)$ can be approximated from below by simple functions $0 \le s_n \le f$, $s_n \to f$, so by dominated convergence with dominating function $|f|^p$, we have $\int |s_n - f|^p\, d\mu \to 0$ and $f \in M$. Since any complex-valued $f$ can be written as a linear combination of four positive functions, $f = (\operatorname{Re} f)_+ - (\operatorname{Re} f)_- + i(\operatorname{Im} f)_+ - i(\operatorname{Im} f)_-$, and those functions are in $L^p(X, d\mu)$ if $f$ is, we conclude that any $f \in L^p(X, d\mu)$ is in $M$.

Combining this density result with Theorem 2.22 gives:

Corollary 2.33. Let $X$ be a $\sigma$-compact metric space, let $\mu$ be a Baire measure, and let $p \in [1, \infty)$. Then $L^p(X, d\mu)$ is a separable Banach space.

Proof. If $X = \bigcup_{n \in \mathbb{N}} L_n$ and the sets $L_n$ are compact, it suffices to take a union of countable dense sets in $C(L_n)$, $n \in \mathbb{N}$.

The density statements above were only for $p \in [1, \infty)$, as the case $p = \infty$ is very different in this regard (Exercises 2.8 and 2.9).

We end by remarking upon two notationally special cases. When $X$ is a subset of $\mathbb{R}^d$ for some $d$, if $\mu$ is chosen to be the restriction of the $d$-dimensional Lebesgue measure to $X$, we denote $L^p(X) = L^p(X, d\mu)$.

For any set $X$, if $\mu$ is chosen to be the counting measure on $X$, we will denote $\ell^p(X) = L^p(X, d\mu)$. When $X$ is countable (most commonly, $X = \mathbb{N}$ or $X = \mathbb{Z}$), it is $\sigma$-locally compact with respect to the discrete metric; thus, it is a special case of the above considerations. With respect to the counting measure on $X$, the only zero measure set is the empty set, so the general quotient space step is not needed here, and $\ell^p(X)$ is exactly the set of sequences (and not equivalence classes of sequences)
$$\ell^p(X) = \Big\{ f : X \to \mathbb{C} \;\Big|\; \|f\|_p = \Big( \sum_{x \in X} |f(x)|^p \Big)^{1/p} < \infty \Big\}.$$

2.4. Bounded linear operators and uniform boundedness

… $0$. Then $\|x\| < r$ implies $\|Tx\| < 1$. Applying this to vectors with $\|x\| = r/2$ and rescaling by $2/r$ shows that for any $x$ with $\|x\| = 1$, we have $\|Tx\| < 2/r$. Thus, $\|T\| \le 2/r$. If $T$ is bounded and linear, then $\|Tx - Ty\| = \|T(x - y)\| \le \|T\| \|x - y\|$, so $T$ is continuous.


For $T \in L(X, Y)$, the kernel and range are defined by
$$\operatorname{Ker} T = \{ x \in X \mid Tx = 0 \}, \qquad \operatorname{Ran} T = \{ Tx \in Y \mid x \in X \}.$$

Lemma 2.37. If $T \in L(X, Y)$, then $\operatorname{Ker} T$ is a closed subspace of $X$.

Proof. Since $T$ is continuous, if $x_n \to x$ and $T x_n = 0$, then
$$Tx = T \lim_{n \to \infty} x_n = \lim_{n \to \infty} T x_n = \lim_{n \to \infty} 0 = 0.$$

It should be noted that the subspace $\operatorname{Ran} T$ is not always closed in $Y$.

Since $Y$ is a vector space, $L(X, Y)$ inherits a vector space structure defined by
$$(S + T)x = Sx + Tx, \qquad (\lambda S)x = \lambda(Sx) \qquad \forall x \in X.$$

Proposition 2.38. If $X$ is a normed vector space and $Y$ is a Banach space, then $L(X, Y)$ is a Banach space with the norm (2.14).

Proof. It is straightforward to verify that $L(X, Y)$ is a vector space and that (2.14) is a norm. To show that $L(X, Y)$ is complete, let $(T_n)_{n=1}^\infty$ be a Cauchy sequence in $L(X, Y)$. For any $x \in X$, the inequality $\|T_n x - T_m x\| \le \|T_n - T_m\| \|x\|$ shows that $(T_n x)_{n=1}^\infty$ is a Cauchy sequence in $Y$, so it is convergent. We define $T : X \to Y$ by
$$Tx = \lim_{n \to \infty} T_n x.$$
Linearity of $T$ follows from linearity of $T_n$. Since $(T_n)_{n=1}^\infty$ is a Cauchy sequence in $L(X, Y)$, $\sup_n \|T_n\| < \infty$, so
$$\|Tx\| = \lim_{n \to \infty} \|T_n x\| \le \limsup_{n \to \infty} \|T_n\| \|x\|.$$
Therefore, $T$ is bounded with norm at most $\limsup_{n \to \infty} \|T_n\|$. Similarly,
$$\|(T_n - T)x\| = \lim_{m \to \infty} \|(T_n - T_m)x\| \le \limsup_{m \to \infty} \|T_n - T_m\| \|x\|$$
implies that
$$\|T_n - T\| \le \limsup_{m \to \infty} \|T_n - T_m\|.$$
Since $(T_n)_{n=1}^\infty$ is a Cauchy sequence, it follows that $\lim_{n \to \infty} \|T_n - T\| = 0$.

Lemma 2.39. Let T1 , T2 ∈ L(X, Y ). If the set {x | T1 x = T2 x} is dense in X, then T1 = T2 . Proof. This set is Ker(T1 − T2 ), which is a closed subspace of X. If it is also dense, then Ker(T1 − T2 ) = X, so T1 = T2 .


Composition of linear operators is denoted by multiplicative notation, as in the following statement.

Lemma 2.40. If $T \in L(X, Y)$ and $S \in L(Y, Z)$, then $ST \in L(X, Z)$ and $\|ST\| \le \|S\| \|T\|$.

Proof. Linearity of $ST$ follows from linearity of $S$ and $T$, and boundedness follows from $\|STx\| \le \|S\| \|Tx\| \le \|S\| \|T\| \|x\|$.

Definition 2.41. The operator $U \in L(X, Y)$ is norm-preserving if
$$\|Ux\| = \|x\| \qquad \forall x \in X. \tag{2.15}$$
$U$ is unitary if it is norm-preserving and $\operatorname{Ran} U = Y$.

Remark 2.42. For readability, even in discussions that involve more than one norm, our notation for norms usually leaves that implicit. For instance, in (2.15), the norm on the left-hand side corresponds to the space $Y$, and the norm on the right-hand side corresponds to the space $X$.

Lemma 2.43. If $X$ is a Banach space, $Y$ is a normed space, and $U \in L(X, Y)$ is norm-preserving, then $\operatorname{Ker} U = \{0\}$ and $\operatorname{Ran} U$ is a closed subspace of $Y$.

Proof. By (2.15), $Ux = 0$ implies $x = 0$, so $\operatorname{Ker} U = \{0\}$. Assume that $y$ is in the closure of $\operatorname{Ran} U$, i.e., there exists a sequence $U x_n = y_n \to y$. The sequence $(U x_n)_{n=1}^\infty$ is convergent, so it is a Cauchy sequence. Since
$$\|U x_m - U x_n\| = \|U(x_m - x_n)\| = \|x_m - x_n\|,$$
the sequence $(x_n)_{n=1}^\infty$ is Cauchy, so it is convergent in $X$. Its limit $x$ obeys
$$Ux = U \lim_{n \to \infty} x_n = \lim_{n \to \infty} U x_n = y,$$
which shows that $y \in \operatorname{Ran} U$.

The importance of unitary maps lies in the fact that they preserve all Banach space operations: since a unitary map is a linear bijection, it preserves notions from linear algebra such as linear independence, and since it preserves distances, it preserves metric space notions such as density. Unitary maps U ∈ L(X, Y ) are also sometimes called isometric isomorphisms, and if such U exists, X and Y are said to be isometrically isomorphic. Throughout this text, we will often work with bounded linear operators, which are initially deﬁned only on a dense subspace of X. In particular, some of the central results in this text are constructions of speciﬁc unitary maps, and such constructions often start with a norm-preserving map on a dense subspace. The following procedure is therefore useful.


Proposition 2.44. Let $X$, $Y$ be Banach spaces, and let $V$ be a dense subspace of $X$. If the linear map $T : V \to Y$ is bounded, then:
(a) $T$ can be uniquely extended to a bounded linear operator $\overline{T} \in L(X, Y)$;
(b) $\|\overline{T}\| = \|T\|$;
(c) $\{ \overline{T} x \mid x \in X \} \subset \overline{\{ Tx \mid x \in V \}}$.

Proof. For any $x \in X$ and any sequence of $x_n \in V$ such that $x_n \to x$, the sequence $T x_n$ is Cauchy because $\|T x_m - T x_n\| \le \|T\| \|x_m - x_n\|$, so it is convergent. Moreover, the limit is independent of the choice of sequence: if $\tilde{x}_n \in V$, $\tilde{x}_n \to x$, then $\|T x_n - T \tilde{x}_n\| \le \|T\| \|x_n - \tilde{x}_n\| \to 0$. Thus, we can define $\overline{T} : X \to Y$ by $\overline{T} x = \lim_{n \to \infty} T x_n$ for any sequence of $x_n \in V$ such that $x_n \to x$. Note that this makes (c) obvious. For $x \in V$, we can take $x_n = x$ for all $n$ to conclude $\overline{T} x = Tx$, so $\overline{T}$ is an extension of $T$. Linearity of $\overline{T}$ follows from the linearity of $T$, and
$$\|\overline{T} x\| = \lim_{n \to \infty} \|T x_n\| \le \lim_{n \to \infty} \|T\| \|x_n\| = \|T\| \|x\|$$
shows that $\|\overline{T}\| \le \|T\|$. The reverse inequality follows from $\overline{T}|_V = T$. If there were two extensions of $T$ in $L(X, Y)$, they would be equal on the dense set $V$, so they would be equal by Lemma 2.39.

The specialization to norm-preserving maps has additional properties:

Proposition 2.45. Let $X$, $Y$ be Banach spaces, and let $V$ be a dense subspace of $X$. If the linear map $U : V \to Y$ is norm-preserving, i.e., obeys (2.15) for all $x \in V$, then:
(a) $U$ can be uniquely extended to a bounded linear operator $\overline{U} \in L(X, Y)$;
(b) this extension is a norm-preserving map $\overline{U} : X \to Y$;
(c) $\{ \overline{U} x \mid x \in X \} = \overline{\{ Ux \mid x \in V \}}$.

Proof. Continuing from the proof of Proposition 2.44, for $x_n \to x$, $x_n \in V$,
$$\|\overline{U} x\| = \lim_{n \to \infty} \|U x_n\| = \lim_{n \to \infty} \|x_n\| = \|x\|,$$
so $\overline{U}$ is norm-preserving. The range of $\overline{U}$ contains that of $U$ and is closed by Lemma 2.43, so it contains $\overline{\{ Ux \mid x \in V \}}$.

Dense sets can also be used to study pointwise convergence of operators.


Lemma 2.46. Let $X$, $Y$ be Banach spaces. Consider a sequence $(T_n)_{n=1}^\infty$ in $L(X, Y)$. If $\sup_n \|T_n\| < \infty$ and $T_n$ converge pointwise on some dense subset of $X$, then $T_n$ converge pointwise on $X$ and
$$Tx = \lim_{n \to \infty} T_n x \tag{2.16}$$
defines some $T \in L(X, Y)$. Moreover, $\|T\| \le \liminf_{n \to \infty} \|T_n\|$.

Proof. Denote $M = \sup_{n \in \mathbb{N}} \|T_n\|$. The operators obey
$$\|T_n x - T_n y\| = \|T_n(x - y)\| \le M \|x - y\|.$$
Viewing this as a Banach-space valued version of equicontinuity, we can carry over an argument from Theorem 2.14. We define
$$C(x) = \lim_{N \to \infty} \sup_{m, n \ge N} \|T_m x - T_n x\|.$$
The sequence $(T_n x)_{n=1}^\infty$ is Cauchy if and only if $C(x) = 0$. For any $x, y \in X$, by using
$$\|T_m x - T_n x\| \le \|T_m x - T_m y\| + \|T_m y - T_n y\| + \|T_n y - T_n x\| \le M \|x - y\| + \|T_m y - T_n y\| + M \|x - y\|,$$
we obtain $C(x) \le 2M \|x - y\| + C(y)$. Since there is a dense set of $y \in X$ such that $C(y) = 0$, this implies $C(x) = 0$ for all $x \in X$. Thus, $T_n x$ converges for every $x \in X$. The map $T$ defined by (2.16) is linear because $T_n$ are linear, and for any $x$,
$$\|Tx\| = \lim_{n \to \infty} \|T_n x\| \le \liminf_{n \to \infty} \|T_n\| \|x\|.$$

Further statements are left as exercises. In particular, Exercise 2.10 describes an abstract completion of a normed vector space to a Banach space, which is based on the metric space completion realized by equivalence classes of Cauchy sequences. Exercise 2.11 shows that this abstract completion is, up to a unitary, the only Banach space which contains $V$ as a dense subset. Accordingly, it is common to call a Banach space $B$ a completion of a normed vector space $X$ if there is a norm-preserving map $i : X \to B$ such that $\operatorname{Ran} i$ is dense in $B$. It is also common to identify $X$ and $\operatorname{Ran} i$ and think of $X$ as a dense subset of $B$. For instance, the spaces $L^p(X, d\mu)$ for $p \in [1, \infty)$ are completions of $C_c(X)$ with respect to the $L^p$-norm on $C_c(X)$.

We have now reached an important general result, known as the uniform boundedness principle or the Banach–Steinhaus theorem.

Theorem 2.47 (Uniform boundedness principle). Let $X$ be a Banach space, and let $Y$ be a normed vector space. If a family $\mathcal{F} \subset L(X, Y)$ is pointwise bounded, i.e.,
$$\forall x \in X, \qquad \sup_{T \in \mathcal{F}} \|Tx\| < \infty,$$
then $\mathcal{F}$ is norm bounded, i.e., $\sup_{T \in \mathcal{F}} \|T\| < \infty$.

The proof uses an upper bound on the operator norm obtained from its values in an arbitrary ball, not necessarily centered at 0.

Lemma 2.48. Let $T : X \to Y$ be a bounded linear operator between normed vector spaces. For any $x \in X$ and any $r > 0$,
$$\sup_{\substack{y \in X \\ \|y - x\| \le r}} \|Ty\| \ge \|T\| r. \tag{2.17}$$

Proof. For all $\|v\| \le r$, by the triangle inequality,
$$\|Tv\| \le \frac{1}{2} \big( \|T(x + v)\| + \|T(x - v)\| \big) \le \sup_{\substack{y \in X \\ \|x - y\| \le r}} \|Ty\|.$$
Taking the supremum over $v$ gives an upper bound for $r \|T\|$.

Proof of Theorem 2.47. Assume that $\mathcal{F}$ is not norm bounded. Then there is a sequence of $T_n \in \mathcal{F}$ such that $\|T_n\| \ge 4^n$. We construct a sequence $(x_n)_{n=1}^\infty$ in $X$ inductively, by setting $x_0 = 0$ and using Lemma 2.48 with $T_n$ to obtain some $x_n$ in a ball of radius $3^{-n}$ around $x_{n-1}$:
$$\|x_n - x_{n-1}\| \le 3^{-n}, \tag{2.18}$$
$$\|T_n x_n\| > \frac{2}{3}\, 3^{-n} \|T_n\|. \tag{2.19}$$
The factor $2/3$ is added because the supremum in (2.17) may not be a maximum. By (2.18), $x_n$ converge to some $x$ and $\|x - x_n\| \le \frac{1}{2} 3^{-n}$. Thus,
$$\|T_n x\| \ge \|T_n x_n\| - \|T_n\| \|x - x_n\| > \frac{1}{6}\, 3^{-n} \|T_n\| \to \infty,$$
which shows that $(T_n)_{n=1}^\infty$ is not pointwise bounded.
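The two numerical steps at the end of this proof can be spelled out as follows; this is only an expansion of the estimates already used, with no new assumptions beyond (2.18), (2.19), and $\|T_n\| \ge 4^n$.

```latex
% Tail of a geometric series, using (2.18):
\[
\|x - x_n\| \le \sum_{k=n+1}^{\infty} \|x_k - x_{k-1}\|
  \le \sum_{k=n+1}^{\infty} 3^{-k}
  = \frac{3^{-(n+1)}}{1 - \tfrac{1}{3}}
  = \frac{1}{2}\, 3^{-n} .
\]
% Combining with (2.19) and the lower bound on the operator norms:
\[
\|T_n x\| > \Big( \frac{2}{3} - \frac{1}{2} \Big) 3^{-n} \|T_n\|
  = \frac{1}{6}\, 3^{-n} \|T_n\|
  \ge \frac{1}{6} \Big( \frac{4}{3} \Big)^{n} \to \infty .
\]
```

This explains the choice of constants: the growth rate $4^n$ of the norms must beat the shrinking rate $3^{-n}$ of the radii.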

The above discussion of L(X, Y ) is quite general and will be applied to various Banach spaces in this text. There are two common special cases. One is the case Y = C. Bounded linear operators from X to C are also called bounded linear functionals of X. The notation X ∗ := L(X, C) is customary, and the space X ∗ is called the dual space of X. Another is the case Y = X. The notation L(X) := L(X, X) is customary, and elements of L(X) are called bounded linear operators on X. In this special case, composition of operators can be viewed as a multiplicative operation on L(X), which provides additional algebraic structure.


2.5. Weak-∗ convergence and the separable Banach–Alaoglu theorem

For any Banach space $B$, its dual space $B^* = L(B, \mathbb{C})$ has the induced (operator) norm which makes it a Banach space and gives a notion of norm convergence in $B^*$. However, a weaker notion of convergence is often useful:

Definition 2.49. A sequence of $\Lambda_n \in B^*$ weak-∗ converges to $\Lambda \in B^*$ if
$$\lim_{n \to \infty} \Lambda_n x = \Lambda x \qquad \forall x \in B.$$
This is often denoted by $\operatorname{w-lim}_{n \to \infty} \Lambda_n = \Lambda$ or $\Lambda_n \xrightarrow{w} \Lambda$ as $n \to \infty$.

Since weak-∗ convergence is not defined as convergence with respect to a metric, the reader is warned not to automatically apply preconceptions about convergence taken from metric spaces. For instance, uniqueness of the weak-∗ limit has to be proved:

Lemma 2.50. If $\Lambda_n \xrightarrow{w} \Lambda$ and $\Lambda_n \xrightarrow{w} \Lambda'$, then $\Lambda = \Lambda'$.

Proof. For all $x \in B$, $\Lambda x = \lim_{n \to \infty} \Lambda_n x = \Lambda' x$, so $\Lambda = \Lambda'$.

Lemma 2.51. $\Lambda_n \to \Lambda$ implies $\Lambda_n \xrightarrow{w} \Lambda$.

Proof. This follows from $|\Lambda_n x - \Lambda x| = |(\Lambda_n - \Lambda)x| \le \|\Lambda_n - \Lambda\| \|x\|$.
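The converse of Lemma 2.51 fails in general. A standard example, chosen here for illustration and not taken from the text, uses point evaluations on $B = C([0, 1])$ with the supremum norm.

```latex
% Illustrative example: B = C([0,1]) with the sup norm, point evaluations.
\[
\Lambda_n f = f(1/n), \qquad \Lambda f = f(0), \qquad
\Lambda_n f \to \Lambda f \ \text{ for every } f \in C([0,1])
\]
% by continuity of f at 0, so \Lambda_n converges weak-* to \Lambda. However,
\[
|(\Lambda_n - \Lambda) f| = |f(1/n) - f(0)| \le 2 \|f\|_\infty ,
\]
% with equality for the test function f_n(x) = \min\{2nx, 2\} - 1, which has
% \|f_n\|_\infty = 1, f_n(0) = -1 and f_n(1/n) = 1. Hence
\[
\|\Lambda_n - \Lambda\| = 2 \quad \text{for all } n ,
\]
% and \Lambda_n does not converge to \Lambda in norm.
```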

The connections between weak-∗ convergence and boundedness are described in the following proposition.

Proposition 2.52. If $\Lambda_n \xrightarrow{w} \Lambda$, then $\sup_{n \in \mathbb{N}} \|\Lambda_n\| < \infty$ and
$$\|\Lambda\| \le \liminf_{n \to \infty} \|\Lambda_n\|.$$

Proof. Pointwise convergence implies pointwise boundedness. Thus, by the uniform boundedness principle, $\Lambda_n$ are uniformly bounded. Moreover,
$$|\Lambda x| = \lim_{n \to \infty} |\Lambda_n x| \le \liminf_{n \to \infty} \|\Lambda_n\| \|x\| \qquad \forall x \in B.$$

Specializing Lemma 2.46 to functionals immediately implies the following criterion for weak-∗ convergence:

Proposition 2.53. If a bounded sequence $(\Lambda_n)_{n=1}^\infty$ in $B^*$ converges on a dense subset of $B$, then it weak-∗ converges to some $\Lambda \in B^*$.

We now reach the main result of this section. Bounded sequences in Banach spaces do not have to have convergent subsequences. However:

Theorem 2.54 (Separable Banach–Alaoglu theorem). Let $B$ be a separable Banach space. Any bounded sequence in $B^*$ has a weak-∗ convergent subsequence.


Proof. Let $(\Lambda_n)_{n=1}^\infty$ be a bounded sequence in $B^*$. For any $x \in B$,
$$\sup_{n \in \mathbb{N}} |\Lambda_n x| \le \sup_{n \in \mathbb{N}} \|\Lambda_n\| \|x\| < \infty,$$
so the sequence $\Lambda_n$ is pointwise bounded. By Lemma 2.13, there is a subsequence which converges pointwise on a dense subset of $B$. By Proposition 2.53, this subsequence is weak-∗ convergent.

This property can be described as weak-∗ sequential compactness. Indeed, the Banach–Alaoglu theorem in its more general form is stated as weak-∗ compactness of the closed ball $\{ \Lambda \in B^* \mid \|\Lambda\| \le r \}$ and, with that topological reformulation, holds also for nonseparable Banach spaces.

As the last abstract topic, we describe an attempt to describe weak-∗ convergence in terms of a metric on $B^*$; this leads to an imperfect but still relevant description. We use a construction from metric space theory. A semimetric on a set $X$ is a symmetric function $d : X \times X \to [0, \infty)$ which obeys the triangle inequality; in a semimetric, $d(x, y) = 0$ does not necessarily imply $x = y$.

Lemma 2.55. If $d_k$, $k \in \mathbb{N}$, are semimetrics on $X$, then so is
$$d(x, y) = \sum_{k=1}^\infty \min\{ 2^{-k}, d_k(x, y) \}. \tag{2.20}$$

Proof. For fixed $k$, let us prove that $\tilde{d}_k(x, y) = \min\{ 2^{-k}, d_k(x, y) \}$ obeys the triangle inequality. If $d_k(x, y) \ge 2^{-k}$ or $d_k(y, z) \ge 2^{-k}$, then
$$\tilde{d}_k(x, z) \le 2^{-k} \le \tilde{d}_k(x, y) + \tilde{d}_k(y, z).$$
If $\tilde{d}_k(x, y) < 2^{-k}$ and $\tilde{d}_k(y, z) < 2^{-k}$, then
$$\tilde{d}_k(x, z) \le d_k(x, z) \le d_k(x, y) + d_k(y, z) = \tilde{d}_k(x, y) + \tilde{d}_k(y, z).$$
Since $\tilde{d}_k$ is also symmetric, it is a semimetric. The sum (2.20) is convergent due to the upper bound by $2^{-k}$, and as a sum of symmetric functions which obey the triangle inequality, it has the same properties.

Theorem 2.56. If $\{x_k\}_{k=1}^\infty$ is a dense sequence in a separable Banach space $B$, then
$$d(\Lambda, \Lambda') = \sum_{k=1}^\infty \min\{ 2^{-k}, |(\Lambda - \Lambda') x_k| \} \tag{2.21}$$
defines a metric on $B^*$. Moreover, consider an arbitrary sequence of $\Lambda_n \in B^*$. This sequence is weak-∗ convergent to $\Lambda \in B^*$ if and only if it is bounded and $d(\Lambda_n, \Lambda) \to 0$ as $n \to \infty$.


Proof. (a) By Lemma 2.55, $d$ is a semimetric. Assume $d(\Lambda, \Lambda') = 0$; then $\Lambda x_k = \Lambda' x_k$ for all $k$, so by density of $\{x_k\}$, $\Lambda = \Lambda'$. Thus, $d$ is a metric.

(b) If $\Lambda_n \xrightarrow{w} \Lambda$, then $\Lambda_n$ is a bounded sequence in $B^*$. Moreover, $(\Lambda_n - \Lambda) x_k \to 0$ for every $k$, so by dominated convergence with dominating sequence $2^{-k}$ applied to the counting measure on $\mathbb{N}$,
$$\lim_{n \to \infty} d(\Lambda_n, \Lambda) = \sum_{k=1}^\infty \lim_{n \to \infty} \min\{ 2^{-k}, |(\Lambda_n - \Lambda) x_k| \} = 0.$$
Conversely, if $d(\Lambda_n, \Lambda) \to 0$, then $\min(2^{-k}, |(\Lambda_n - \Lambda) x_k|) \to 0$ for each $k$, so $(\Lambda_n - \Lambda) x_k \to 0$ for each $k$. Thus, $\Lambda_n$ converge to $\Lambda$ on a dense set. Using Lemma 2.46, since $\Lambda_n$ are uniformly bounded, $\Lambda_n \xrightarrow{w} \Lambda$.

Thus, if we restrict to a bounded subset of $B^*$ such as $\{ \Lambda \in B^* \mid \|\Lambda\| \le r \}$, weak-∗ convergence can be interpreted as convergence with respect to a metric. However, the restriction to a bounded subset was crucial here: on the entire set $B^*$, there is no metric which precisely gives weak-∗ convergence (in topological language: the topology of weak-∗ convergence is not metrizable) unless $B$ is finite dimensional. We will not prove or use that fact.

The notion of weak-∗ convergence on $B^*$ is tied to the original Banach space $B$, so it is imprecise to discuss it without specifying the space $B$. This is nonetheless common practice for some common Banach spaces. It is also common in some cases to refer to this notion as weak convergence, although weak convergence is in general a different concept. A ubiquitous special case is obtained from $B = C(K)$, for $K$ a compact metric space, using the correspondence from the Riesz–Markov theorem:

Definition 2.57. We denote by $\mathcal{M}(K)$ the set of finite positive Borel measures on a compact metric space $K$. The measures $\mu_n \in \mathcal{M}(K)$ converge weakly to $\mu \in \mathcal{M}(K)$ if
$$\int f\, d\mu_n \to \int f\, d\mu \qquad \forall f \in C(K).$$
We denote this by $\mu_n \xrightarrow{w} \mu$ or $\operatorname{w-lim}_{n \to \infty} d\mu_n = d\mu$.

Corollary 2.58. Any sequence of $\mu_n \in \mathcal{M}(K)$ such that $\sup_{n \in \mathbb{N}} \mu_n(K) < \infty$ has a weakly convergent subsequence.

Proof. The measures $\mu_n$ correspond to positive linear functionals $\Lambda_n(f) = \int f\, d\mu_n$ on $C(K)$. Since $\|\Lambda_n\| = \mu_n(K)$ is uniformly bounded and $C(K)$ is separable, by the Banach–Alaoglu theorem there is a subsequence such that $\Lambda_{n_l} \xrightarrow{w} \Lambda$ as $l \to \infty$ for some $\Lambda \in C(K)^*$. When $f \ge 0$, $\Lambda_{n_l} f \ge 0$ for all $l$, so $\Lambda f \ge 0$. Thus $\Lambda$ is also a positive linear functional, so it is of the form $\Lambda f = \int f\, d\mu$ for a positive Borel measure $\mu$, and $\mu_{n_l} \xrightarrow{w} \mu$, by definition.
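A concrete instance of Definition 2.57 may help. The following example is an illustration chosen here, not taken from the text: on $K = [0, 1]$, normalized counting measures on a grid converge weakly to Lebesgue measure $m$.

```latex
% Illustrative example on K = [0,1]: equally weighted point masses on a grid.
\[
\mu_n = \frac{1}{n} \sum_{j=1}^{n} \delta_{j/n}, \qquad \mu_n(K) = 1,
\]
\[
\int f \, d\mu_n = \frac{1}{n} \sum_{j=1}^{n} f\!\left( \frac{j}{n} \right)
  \;\longrightarrow\; \int_0^1 f(x)\, dx
  \qquad \forall f \in C([0,1]),
\]
% since the left-hand side is a Riemann sum of the continuous function f.
% Weak convergence does not give convergence on every Borel set:
\[
\mu_n\big( \mathbb{Q} \cap [0,1] \big) = 1 \ \text{ for all } n,
\qquad m\big( \mathbb{Q} \cap [0,1] \big) = 0 .
\]
```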


2.6. Banach-space valued integration

In this section, we present some basics of Banach-space valued integration. Let $\mathcal{A}$ be a $\sigma$-algebra on $X$, let $\mu$ be a measure on $\mathcal{A}$, and let $B$ be a Banach space. A simple function $s : X \to B$ can be written as
$$s = \sum_{j=1}^n y_j \chi_{A_j} \tag{2.22}$$
with $A_j \in \mathcal{A}$, $y_j \in B$ (taking the sets $A_j$ disjoint and the vectors $y_j$ nonzero). If $\int \|s\|\, d\mu < \infty$, then $\mu(A_j) < \infty$ for each $j$, so we can define
$$\int s\, d\mu = \sum_{j=1}^n \mu(A_j) y_j. \tag{2.23}$$
This integral is additive and
$$\Big\| \int s\, d\mu \Big\| = \Big\| \sum_{j=1}^n \mu(A_j) y_j \Big\| \le \sum_{j=1}^n \mu(A_j) \|y_j\| = \int \|s\|\, d\mu.$$

The Bochner integral is developed for functions which can be approximated by simple functions:

Definition 2.59. A measurable function $f : X \to B$ is said to be Bochner-integrable if there exists a sequence of simple functions $s_n : X \to B$ such that $\int \|s_n\|\, d\mu < \infty$ for every $n$ and
$$\lim_{n \to \infty} \int \|f - s_n\|\, d\mu = 0. \tag{2.24}$$
For any Bochner-integrable function, its Bochner integral is defined as
$$\int f\, d\mu = \lim_{n \to \infty} \int s_n\, d\mu. \tag{2.25}$$

Since
$$\Big\| \int s_m\, d\mu - \int s_n\, d\mu \Big\| \le \int \|s_m - s_n\|\, d\mu \le \int \|f - s_m\|\, d\mu + \int \|f - s_n\|\, d\mu,$$
Equation (2.24) implies that the sequence $\int s_n\, d\mu$ is Cauchy in $B$ and therefore convergent. Similar arguments prove that the limit is independent of the choice of sequence $s_n$, so the Bochner integral (2.25) is well defined.

The setting of Lemma 2.6 can be understood as a Bochner integral with respect to counting measure on $\mathbb{N}$. We note another special case, with respect to Lebesgue measure $m$ on an interval $[a, b]$:


Example 2.60. Any continuous $f : [a, b] \to B$ is Bochner-integrable and, denoting by $\int_a^b f(x)\, dx$ its Bochner integral,
$$\lim_{n \to \infty} \frac{b - a}{n} \sum_{j=0}^{n-1} f\Big( a + \frac{j}{n} (b - a) \Big) = \int_a^b f(x)\, dx. \tag{2.26}$$

Proof. Since $f$ is continuous, by the same proof as in the scalar case, it is uniformly continuous. Thus, the simple functions
$$s_n(x) = f\Big( a + \frac{b - a}{n} \Big\lfloor n \frac{x - a}{b - a} \Big\rfloor \Big)$$
satisfy the condition (2.24), which leads to (2.26).
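To see the Riemann sums of Example 2.60 at work, consider the simplest nontrivial case. The choices $[a, b] = [0, 1]$ and $f(x) = x\,y$ for a fixed vector $y \in B$ are assumptions made only for this illustration.

```latex
% Illustrative case: [a,b] = [0,1], f(x) = x y with y in B fixed.
\[
\frac{1}{n} \sum_{j=0}^{n-1} f\!\left( \frac{j}{n} \right)
  = \frac{1}{n^2} \Big( \sum_{j=0}^{n-1} j \Big) y
  = \frac{1}{n^2} \cdot \frac{n(n-1)}{2}\, y
  = \frac{n-1}{2n}\, y
  \;\longrightarrow\; \frac{1}{2}\, y
  = \int_0^1 x\, y \, dx .
\]
% The error is explicit: the n-th sum differs from the limit by y/(2n),
% whose norm is \|y\|/(2n).
```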

It is often useful to reduce to scalar objects. To do this in the Banach space setting, the key tool is the following property:

Lemma 2.61. For any Banach space $B$, if we denote $B_1^* = \{ \Lambda \in B^* \mid \|\Lambda\| = 1 \}$, then
$$\|x\| = \sup_{\Lambda \in B_1^*} |\Lambda x| \qquad \forall x \in B. \tag{2.27}$$

This includes the fact that functionals separate points: if $\Lambda x = \Lambda y$ for all $\Lambda \in B_1^*$, then by (2.27), $\|x - y\| = 0$, so $x = y$.

The reader can proceed in two ways. One is to note that this is an immediate corollary of the Hahn–Banach theorem. The other, which avoids proving the statement in an abstract Banach space setting, is to observe that in all the cases of interest to us, it is easy to verify the statement manually. To verify (2.27) for some Banach space $B$, it is not even necessary to have a complete description of the dual space $B^*$; every $\Lambda \in B_1^*$ obeys $|\Lambda x| \le \|x\|$, so $\sup_{\Lambda \in B_1^*} |\Lambda x| \le \|x\|$ is trivial and it suffices to show the opposite inequality by using some elements of $B_1^*$.

For instance, for $L^p(X, d\mu)$ it follows from Corollary 2.27 (with the roles of $p$ and $q$ reversed) that any $g \in L^q(X, d\mu)$ induces a $\Lambda \in (L^p(X, d\mu))^*$ by $\Lambda f = \int \bar{g} f\, d\mu$ and that $\|\Lambda\| = \|g\|_q$. Using Corollary 2.27 a second time, it follows that (2.27) holds for $L^p(X, d\mu)$. Other cases will appear later: when $B = \mathcal{H}$ is a Hilbert space, the property (2.27) will follow from the Riesz representation theorem, and when $B = L(\mathcal{H})$, the property (2.27) will follow from Exercise 4.14.


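Since a functional can always be multiplied by $e^{i\phi}$ without changing its norm, (2.27) implies
$$\|x\| = \sup_{\Lambda \in B_1^*} \operatorname{Re} \Lambda x \qquad \forall x \in B. \tag{2.28}$$

Lemma 2.62. If $f : X \to B$ is Bochner-integrable, then
$$\Lambda \int f\, d\mu = \int \Lambda f\, d\mu \qquad \forall \Lambda \in B^*,$$
$$\Big\| \int f\, d\mu \Big\| \le \int \|f\|\, d\mu.$$

Proof. For a simple function (2.22), by linearity of $\Lambda$,
$$\Lambda \int s\, d\mu = \Lambda \sum_{j=1}^n \mu(A_j) y_j = \sum_{j=1}^n \mu(A_j) \Lambda y_j = \int \Lambda s\, d\mu.$$
For a sequence of simple functions $s_n$ obeying (2.24), use
$$\Big| \int \Lambda f\, d\mu - \int \Lambda s_n\, d\mu \Big| \le \int |\Lambda f - \Lambda s_n|\, d\mu = \int |\Lambda(f - s_n)|\, d\mu \le \|\Lambda\| \int \|f - s_n\|\, d\mu,$$
so $\int \Lambda s_n\, d\mu \to \int \Lambda f\, d\mu$, and the continuity of $\Lambda$ implies
$$\Lambda \int f\, d\mu = \Lambda \lim_{n \to \infty} \int s_n\, d\mu = \lim_{n \to \infty} \Lambda \int s_n\, d\mu = \lim_{n \to \infty} \int \Lambda s_n\, d\mu = \int \Lambda f\, d\mu.$$
To prove the second claim, we note that for any $\Lambda \in B_1^*$,
$$\Big| \Lambda \int f\, d\mu \Big| = \Big| \int \Lambda f\, d\mu \Big| \le \int |\Lambda f|\, d\mu \le \int \|f\|\, d\mu.$$
Taking the sup over $\Lambda \in B_1^*$ concludes the proof by (2.27).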

Remark 2.63. For a different approach to integration, assume that $f : X \to B$ is such that there exists $I \in B$ so that for all $\Lambda \in B^*$, $\Lambda f : X \to \mathbb{C}$ is measurable and
$$\Lambda I = \int \Lambda f\, d\mu.$$
Then the value $I$ is called the Pettis integral of $f$. Existence of the Pettis integral is a strictly weaker notion than existence of the Bochner integral.

Definition 2.64. Let $F : I \to B$ where $I \subset \mathbb{R}$ is an interval and $B$ a Banach space. The derivative of $F$ at $x_0 \in \operatorname{int} I$, if it exists, is $F'(x_0) \in B$ such that
$$\lim_{h \to 0} \frac{\|F(x_0 + h) - F(x_0) - h F'(x_0)\|}{|h|} = 0.$$
If $I$ is open and $F$ is differentiable at every point in $I$, then $F'$ is also a function from $I$ to $B$. Based on this notion of derivative, $C^n(I, B)$ for $n \in \mathbb{N} \cup \{\infty\}$ is defined analogously to the scalar case.


Theorem 2.65 (Fundamental theorem of calculus).
(a) If $f \in C([a, b], B)$, then the function $F : [a, b] \to B$ defined by $F(x) = \int_a^x f(t)\, dt$ is continuous on $[a, b]$, differentiable on $(a, b)$, and $F'(x) = f(x)$ for all $x \in (a, b)$.
(b) If $G, f \in C([a, b], B)$ and $G'(x) = f(x)$ for $x \in (a, b)$, then
$$\int_a^b f(t)\, dt = G(b) - G(a).$$

The proof of the fundamental theorem of calculus follows the same steps as in the scalar case, with one additional ingredient. In the scalar case, the following key step follows from the mean value theorem; in the Banach-space valued case, one uses (2.28) to reduce to the scalar case.

Lemma 2.66. If $g \in C([a, b], B)$ and $g' = 0$ on $(a, b)$, then $g(b) = g(a)$.

Proof. For any $\Lambda \in B^*$, $\operatorname{Re}(\Lambda g) \in C([a, b], \mathbb{R})$ and $(\operatorname{Re} \Lambda g)' = \operatorname{Re} \Lambda g' = 0$ for $x \in (a, b)$, so $\operatorname{Re} \Lambda g(b) = \operatorname{Re} \Lambda g(a)$ by the mean value theorem. Since $\operatorname{Re} \Lambda(g(b) - g(a)) = 0$ for all $\Lambda \in B^*$, (2.28) implies $\|g(b) - g(a)\| = 0$.

2.7. Banach-space valued analytic functions

We now consider complex-analytic calculus for Banach-space valued functions. We will use the customary notation in $\mathbb{C}$,
$$D_r(z_0) = \{ z \in \mathbb{C} \mid |z - z_0| < r \}.$$
If $\Omega \subset \mathbb{C}$ is an open set, $\gamma : [a, b] \to \Omega$ is a $C^1$ contour, and $f : \Omega \to B$ is a continuous map, then the contour integral $\oint_\gamma f(z)\, dz$ is defined by
$$\oint_\gamma f(z)\, dz = \int_a^b \gamma'(t) f(\gamma(t))\, dt.$$

Theorem 2.67. Let $\Omega \subset \mathbb{C}$ be an open set, let $B$ be a Banach space, and let $f : \Omega \to B$. The following are equivalent:
(a) Holomorphicity: For every $z_0 \in \Omega$ there is a value of $f'(z_0) \in B$ such that
$$\lim_{z \to z_0} \frac{f(z) - f(z_0) - (z - z_0) f'(z_0)}{z - z_0} = 0; \tag{2.29}$$
(b) Weak analyticity: For every $\Lambda \in B^*$, the function $\Lambda f : \Omega \to \mathbb{C}$ is analytic;
(c) Cauchy's integral formula: $f$ is continuous and, for every disk $\overline{D_r(z_0)} \subset \Omega$ and every $z \in D_r(z_0)$,
$$f(z) = \frac{1}{2\pi i} \oint_{|w - z_0| = r} \frac{f(w)}{w - z}\, dw; \tag{2.30}$$


(d) Local representability by power series: For every $z_0 \in \Omega$, there is a neighborhood $D_r(z_0) \subset \Omega$ and coefficients $F_n \in B$ such that $\sum_{n=0}^\infty r^n \|F_n\| < \infty$ and
$$f(z) = \sum_{n=0}^\infty (z - z_0)^n F_n \qquad \forall z \in D_r(z_0). \tag{2.31}$$

Proof. (a) $\Rightarrow$ (b): Applying arbitrary $\Lambda \in B^*$ to (2.29) shows that $\Lambda f : \Omega \to \mathbb{C}$ is holomorphic with $(\Lambda f)' = \Lambda f'$. Thus, $\Lambda f$ is analytic.

(b) $\Rightarrow$ (c): We begin by proving continuity of $f$ at an arbitrary point $z_0 \in \Omega$. Fix $r > 0$ such that $D_r(z_0) \subset \Omega$. For every $\Lambda \in B^*$, analyticity of $\Lambda f$ implies that $\Lambda f(z) - \Lambda f(z_0)$ …

… $0$, $D_r(z_0) \cap \Omega = D_r(z_0) \setminus \{z_0\}$ and the limit $\lim_{z \to z_0} f(z)$ is convergent. If $f$ has a removable singularity at $z_0$, prove that defining $f$ at $z_0$ by $f(z_0) = \lim_{z \to z_0} f(z)$ gives an analytic function on $\Omega \cup \{z_0\}$.

Chapter 3

Hilbert spaces

The distinguishing feature of Hilbert spaces is the inner product, which is inspired by the dot product in $\mathbb{R}^n$, $x \cdot y = \sum_{j=1}^n x_j y_j$. The dot product takes a central place in Euclidean geometry, and it can be used to recover, among other things, the length of vectors, $\|x\|^2 = x \cdot x$. When generalizing to $\mathbb{C}^n$, to retain the connection with the Euclidean norm, one must conjugate the entries of one of the vectors, leading to the definition of the inner product on $\mathbb{C}^n$, $\langle x, y \rangle = \sum_{j=1}^n \bar{x}_j y_j$ (it is a matter of convention whether complex conjugation is applied on the first or second vector). We will consider Hilbert spaces over the field of scalars $\mathbb{C}$, which arise as an abstract generalization of this inner product on $\mathbb{C}^n$.

The analyst's main interest in Hilbert spaces lies in infinite-dimensional settings, but given the original dot product on $\mathbb{R}^n$, it should not be surprising that a geometric intuition is very useful in the study of Hilbert spaces. The reader should compare the knowledge and intuition from linear algebra with what is presented here, and note the more subtle phenomena which only appear in infinite-dimensional settings.

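3.1. Inner products

In this section, we will start with the general setting of sesquilinear forms and gradually specialize to the setting of inner products on Hilbert spaces.

Definition 3.1. Let $V$ be a vector space. A sesquilinear form on $V$ is a map $\langle \cdot, \cdot \rangle : V \times V \to \mathbb{C}$ with the following properties:
(a) Linearity in the second parameter: For all $x, y, y' \in V$ and $\lambda \in \mathbb{C}$,
$$\langle x, \lambda y \rangle = \lambda \langle x, y \rangle, \qquad \langle x, y + y' \rangle = \langle x, y \rangle + \langle x, y' \rangle.$$
(b) Conjugate-linearity in the first parameter: For all $x, x', y \in V$ and $\lambda \in \mathbb{C}$,
$$\langle \lambda x, y \rangle = \bar{\lambda} \langle x, y \rangle, \qquad \langle x + x', y \rangle = \langle x, y \rangle + \langle x', y \rangle.$$

It is a matter of convention which of the two parameters is linear, since we could exchange the two parameters in all forms throughout the text. A sesquilinear form can be recovered by its values on the diagonal $\langle x, x \rangle$ with $x \in V$:

Lemma 3.2 (Polarization identity). For all $x, y \in V$,
$$\langle x, y \rangle = \frac{1}{4} \sum_{\omega \in \{1, i, -1, -i\}} \omega^{-1} \langle x + \omega y, x + \omega y \rangle. \tag{3.1}$$

Proof. By sesquilinearity, for any $\omega \in \mathbb{C}$ with $|\omega| = 1$,
$$\omega^{-1} \langle x + \omega y, x + \omega y \rangle = \omega^{-1} \langle x, x \rangle + \langle x, y \rangle + \omega^{-2} \langle y, x \rangle + \omega^{-1} \langle y, y \rangle.$$
The proof is completed by summing over $\omega \in \{1, i, -1, -i\}$ using
$$\sum_{\omega \in \{1, i, -1, -i\}} \omega^k = \begin{cases} 4 & k \in 4\mathbb{Z} \\ 0 & k \in \mathbb{Z} \setminus 4\mathbb{Z}. \end{cases}$$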
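The polarization identity can be checked by hand in the simplest setting. The choice $V = \mathbb{C}$ with $\langle x, y \rangle = \bar{x} y$ and the vectors $x = 1$, $y = i$ are assumptions made only for this illustration; here $\langle x, y \rangle = i$.

```latex
% Illustrative check of (3.1): V = C, <x,y> = \bar{x} y, x = 1, y = i.
% Each term is \omega^{-1} |1 + \omega i|^2.
\[
\begin{aligned}
\omega = 1:  &\quad 1 \cdot |1 + i|^2 = 2, \\
\omega = i:  &\quad (-i) \cdot |1 + i \cdot i|^2 = (-i) \cdot 0 = 0, \\
\omega = -1: &\quad (-1) \cdot |1 - i|^2 = -2, \\
\omega = -i: &\quad i \cdot |1 + (-i) \cdot i|^2 = i \cdot |2|^2 = 4i .
\end{aligned}
\]
\[
\frac{1}{4} \big( 2 + 0 - 2 + 4i \big) = i = \langle 1, i \rangle .
\]
```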

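Corollary 3.3. If a sesquilinear form obeys $\langle x, x \rangle \in \mathbb{R}$ for all $x \in V$, then it is conjugate symmetric, i.e.,
$$\langle y, x \rangle = \overline{\langle x, y \rangle} \qquad \forall x, y \in V. \tag{3.2}$$

Proof. Using properties of the sesquilinear form,
$$\langle y + \omega x, y + \omega x \rangle = \langle \omega(x + \omega^{-1} y), \omega(x + \omega^{-1} y) \rangle = \bar{\omega} \omega \langle x + \omega^{-1} y, x + \omega^{-1} y \rangle.$$
Multiplying by $\omega^{-1}$, summing over $\omega \in \{1, i, -1, -i\}$, taking the complex conjugate, and using the polarization identity gives (3.2).

We will now impose a positive definite condition and show that it naturally turns $V$ into a normed vector space:

Definition 3.4. Let $V$ be a vector space. An inner product on $V$ is a sesquilinear map $\langle \cdot, \cdot \rangle : V \times V \to \mathbb{C}$ which is positive definite, i.e.,
$$\langle x, x \rangle > 0 \qquad \forall x \in V \setminus \{0\}. \tag{3.3}$$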


The induced norm of an inner product is defined by
$$\|x\| = \sqrt{\langle x, x \rangle}. \tag{3.4}$$

Soon, we will prove that the induced norm is, indeed, a norm. Condition (3.3) is directly motivated by this. We begin by proving some other general properties of inner products.

Lemma 3.5. For all $x \in V$ and $\lambda \in \mathbb{C}$, $\|\lambda x\| = |\lambda| \|x\|$.

Proof. This follows from $\langle \lambda x, \lambda x \rangle = \bar{\lambda} \lambda \langle x, x \rangle$.

Definition 3.6. Vectors $x_1, x_2 \in V$ are orthogonal if $\langle x_1, x_2 \rangle = 0$; this is denoted $x_1 \perp x_2$. A sequence $(x_j)_{j=1}^n$ in $V$ is called (pairwise) orthogonal if $x_j \perp x_k$ whenever $j \ne k$.

Lemma 3.7 (Pythagorean theorem). If $x_1, \dots, x_n \in V$ are pairwise orthogonal, then
$$\Big\| \sum_{j=1}^n x_j \Big\|^2 = \sum_{j=1}^n \|x_j\|^2.$$

Proof. Using sesquilinearity of the inner product, $\langle x_j, x_k \rangle = 0$ for $j \ne k$, and $\langle x_j, x_j \rangle = \|x_j\|^2$, we compute
$$\Big\| \sum_{j=1}^n x_j \Big\|^2 = \Big\langle \sum_{j=1}^n x_j, \sum_{k=1}^n x_k \Big\rangle = \sum_{j=1}^n \sum_{k=1}^n \langle x_j, x_k \rangle = \sum_{j=1}^n \|x_j\|^2.$$

The Pythagorean theorem has obvious roots in Euclidean geometry. Similarly, the following lemma describes orthogonal projection of a vector $x$ to a vector $y \ne 0$.

Lemma 3.8. For any $x, y \in V$ with $y \ne 0$, there is a unique $\lambda \in \mathbb{C}$ such that $x - \lambda y \perp y$, and it is given by $\lambda = \langle y, x \rangle / \|y\|^2$.

Proof. By linearity of the inner product in the second parameter, the equation $\langle y, x - \lambda y \rangle = 0$ is equivalent to $\langle y, x \rangle - \lambda \langle y, y \rangle = 0$, which has the unique solution $\lambda = \langle y, x \rangle / \|y\|^2$.

Theorem 3.9 (Cauchy–Schwarz inequality). For all $x, y \in V$,
$$|\langle x, y \rangle| \le \|x\| \|y\|. \tag{3.5}$$

Proof. If $y = 0$, (3.5) holds trivially. For $y \ne 0$, we use $\lambda$ from Lemma 3.8 and $z = x - \lambda y$. Since $\langle \lambda y, z \rangle = \bar{\lambda} \langle y, z \rangle = 0$, by the Pythagorean theorem,
$$\|x\|^2 = \|z\|^2 + \|\lambda y\|^2 \ge \|\lambda y\|^2 = |\lambda|^2 \|y\|^2 = \frac{|\langle y, x \rangle|^2}{\|y\|^2}.$$
Multiplying by $\|y\|^2$ completes the proof since $|\langle y, x \rangle| = |\langle x, y \rangle|$.
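The steps of Lemma 3.8 and the Cauchy–Schwarz proof can be traced on a small example. The setting is an assumption chosen for illustration: $V = \mathbb{C}^2$ with $\langle w, z \rangle = \bar{w}_1 z_1 + \bar{w}_2 z_2$, and the vectors $x = (1, i)$, $y = (1, 1)$.

```latex
% Illustrative example in C^2 with <w,z> = \bar{w}_1 z_1 + \bar{w}_2 z_2.
\[
\langle x, y \rangle = 1 - i, \qquad
\langle y, x \rangle = 1 + i, \qquad
\|x\|^2 = 2, \qquad \|y\|^2 = 2 .
\]
% Projection coefficient from Lemma 3.8 and the orthogonal remainder:
\[
\lambda = \frac{\langle y, x \rangle}{\|y\|^2} = \frac{1 + i}{2}, \qquad
z = x - \lambda y = \Big( \frac{1 - i}{2}, \frac{-1 + i}{2} \Big), \qquad
\langle y, z \rangle = \frac{1 - i}{2} + \frac{-1 + i}{2} = 0 .
\]
% Pythagorean theorem and the resulting inequality:
\[
\|z\|^2 + \|\lambda y\|^2 = 1 + 1 = 2 = \|x\|^2, \qquad
|\langle x, y \rangle| = \sqrt{2} \le 2 = \|x\| \, \|y\| .
\]
```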


Proposition 3.10. For all $x, y \in V$, $\|x + y\| \le \|x\| + \|y\|$.

Proof. By the definition of induced norm and conjugate symmetry,
$$\|x + y\|^2 = \|x\|^2 + \langle x, y \rangle + \langle y, x \rangle + \|y\|^2 = \|x\|^2 + 2 \operatorname{Re} \langle x, y \rangle + \|y\|^2.$$
Estimating $\operatorname{Re} \langle x, y \rangle \le |\langle x, y \rangle| \le \|x\| \|y\|$ gives
$$\|x + y\|^2 \le \|x\|^2 + 2 \|x\| \|y\| + \|y\|^2 = (\|x\| + \|y\|)^2.$$

Collecting Lemma 3.5 and Proposition 3.10, we obtain:

Corollary 3.11. The induced norm (3.4) is a norm.

Definition 3.12. A vector space with an inner product is called a Hilbert space if it is complete with respect to the induced norm.

In particular, every Hilbert space is a Banach space with the induced norm. It is common to denote a Hilbert space by $\mathcal{H}$ instead of $V$.

Example 3.13. For any $n \in \mathbb{N}$, $\mathbb{C}^n$ is a Hilbert space with the inner product
$$\langle w, z \rangle = \sum_{j=1}^n \bar{w}_j z_j.$$

Proof. It is trivial that this is an inner product; the induced norm is precisely the norm $\|\cdot\|_2$ encountered in Example 2.3.

Example 3.14. The set of square-summable sequences […]

[…] Now we consider
\[ f(t) = \|x - y - tv\|^2, \qquad t \in \mathbb{R}. \]
For all $t$, $y + tv \in S$ so $f(t) \ge c^2 = f(0)$. Thus, $f$ has a global minimum at $0$. However, by expanding in terms of inner products, we write $f(t) = \|x - y\|^2 - 2 \langle x - y, v \rangle t + \|v\|^2 t^2$ and compute $f'(0) = -2 \langle x - y, v \rangle \neq 0$, which contradicts a minimum at $0$.

Before deriving further properties of orthogonal projections, we will take a detour, using the projection theorem to describe the space of bounded linear functionals on $\mathcal{H}$. Immediately from the definition, for any subspaces $S, T$, $S \subset T$ implies $T^\perp \subset S^\perp$. Some other general properties are given in Exercise 3.9. We prove a criterion for a subspace of $\mathcal{H}$, not necessarily closed, to be dense in $\mathcal{H}$:

Corollary 3.23. A subspace $S$ of $\mathcal{H}$ is dense in $\mathcal{H}$ if and only if $S^\perp = \{0\}$.

Proof. Assume $\overline{S} \neq \mathcal{H}$. Fix $x \in \mathcal{H} \setminus \overline{S}$ and let $y$ be the orthogonal projection of $x$ to $\overline{S}$. Then $y \in \overline{S}$ so $x \neq y$. Thus, $x - y$ is a nonzero element of $(\overline{S})^\perp \subset S^\perp$. Conversely, assume $z \in S^\perp$ and $z \neq 0$. Then for any $y \in S$, $\|y - z\|^2 = \|y\|^2 + \|z\|^2 \ge \|z\|^2$, so $z \notin \overline{S}$.

Recall that $\mathcal{H}^*$ denotes the set of bounded linear functionals on $\mathcal{H}$, i.e., the set of bounded linear operators from $\mathcal{H}$ to $\mathbb{C}$. On a Hilbert space, using the inner product, it is easy to generate many bounded linear functionals:


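Lemma 3.24. For any $y \in \mathcal{H}$,
\[ \Lambda x = \langle y, x \rangle \tag{3.8} \]
defines a bounded linear functional $\Lambda \in \mathcal{H}^*$ with norm $\|\Lambda\| = \|y\|$.

Proof. The map $\Lambda : \mathcal{H} \to \mathbb{C}$ is linear because the inner product is linear in the second argument. By the Cauchy–Schwarz inequality, for any $x \in \mathcal{H}$,
\[ |\Lambda x| = |\langle y, x \rangle| \le \|y\| \|x\|. \tag{3.9} \]
Thus, $\Lambda$ is bounded and $\|\Lambda\| \le \|y\|$. If $y = 0$, this gives $\|\Lambda\| = 0$. If $y \neq 0$, equality holds in (3.9) for $x = y$, so $\|\Lambda\| \ge \|y\|$. The two inequalities combine to give $\|\Lambda\| = \|y\|$.

Remarkably, this construction provides all bounded linear functionals on $\mathcal{H}$:

Theorem 3.25 (Riesz representation theorem). For every $\Lambda \in \mathcal{H}^*$, there is a unique $y \in \mathcal{H}$ such that (3.8) holds for all $x \in \mathcal{H}$.

Proof. Let $\Lambda \in \mathcal{H}^*$. If $\Lambda = 0$, (3.8) holds with $y = 0$. Otherwise, $\operatorname{Ker} \Lambda$ is a closed subspace of $\mathcal{H}$ and $\operatorname{Ker} \Lambda \neq \mathcal{H}$, so by Corollary 3.23, there exists $z \in (\operatorname{Ker} \Lambda)^\perp$, $z \neq 0$. For any $x \in \mathcal{H}$, the calculation
\[ \Lambda \Bigl( x - \frac{\Lambda x}{\Lambda z} z \Bigr) = \Lambda x - \frac{\Lambda x}{\Lambda z} \Lambda z = 0 \]
shows that $x - \frac{\Lambda x}{\Lambda z} z$ is in $\operatorname{Ker} \Lambda$ and therefore orthogonal to $z$. This implies
\[ \langle z, x \rangle - \frac{\Lambda x}{\Lambda z} \langle z, z \rangle = \Bigl\langle z, x - \frac{\Lambda x}{\Lambda z} z \Bigr\rangle = 0. \]
Solving for $\Lambda x$ gives
\[ \Lambda x = \frac{\Lambda z}{\|z\|^2} \langle z, x \rangle = \Bigl\langle \frac{\overline{\Lambda z}}{\|z\|^2} z, x \Bigr\rangle, \]
which is a representation for $\Lambda$ of the desired form. If the same functional $\Lambda$ can also be represented in the form $\Lambda x = \langle \tilde{y}, x \rangle$, subtracting gives the zero functional $(\Lambda - \Lambda)(x) = \langle y - \tilde{y}, x \rangle$. It follows that $\|y - \tilde{y}\| = \|\Lambda - \Lambda\| = 0$ so $y = \tilde{y}$, which proves uniqueness of $y$.

We now return to orthogonal projections; existence and uniqueness allow us to view orthogonal projection as a map on $\mathcal{H}$:

Definition 3.26. If $S$ is a closed subspace of $\mathcal{H}$, the orthogonal projection to $S$ is the map $P : \mathcal{H} \to \mathcal{H}$ defined so that for every $x \in \mathcal{H}$, $Px$ is the unique vector such that $Px \in S$ and $x - Px \in S^\perp$.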

3.2. Subspaces and orthogonal projections


Proposition 3.27. The orthogonal projection $P$ to a closed subspace $S$ of $\mathcal{H}$ has the following properties:
(a) $P$ is a bounded linear operator on $\mathcal{H}$, i.e., $P \in \mathcal{L}(\mathcal{H})$;
(b) $\operatorname{Ker} P = S^\perp$ and $\operatorname{Ran} P = S$;
(c) $\|P\| = 1$ if $S \neq \{0\}$;
(d) $PP = P$;
(e) $\langle x, Py \rangle = \langle Px, y \rangle$ for all $x, y \in \mathcal{H}$.

Proof. For any $x, x' \in \mathcal{H}$, $Px + Px' \in S$ and $(x + x') - (Px + Px') = (x - Px) + (x' - Px') \in S^\perp$, so by uniqueness, $P(x + x') = Px + Px'$. Similarly, $cPx \in S$ and $cx - cPx = c(x - Px) \in S^\perp$ implies $P(cx) = cPx$ for all $c \in \mathbb{C}$. Thus, $P$ is linear. By the Pythagorean theorem,
\[ \|x\|^2 = \|Px\|^2 + \|x - Px\|^2 \ge \|Px\|^2, \]
so $P$ is bounded and $\|P\| \le 1$. In particular, $P \in \mathcal{L}(\mathcal{H})$.

For any $x \in S$, $Px = x$ because $x \in S$ and $x - x \in S^\perp$. Thus, if $S \neq \{0\}$, $\|P\| = 1$. For any $x \in \mathcal{H}$, $Px \in S$ implies $PPx = Px$ by the above. Thus, $PP = P$ and $S \subset \operatorname{Ran} P$. Since $\operatorname{Ran} P \subset S$, we conclude $\operatorname{Ran} P = S$. By definition, $Px = 0$ if and only if $x \in S^\perp$, so $\operatorname{Ker} P = S^\perp$.

For all $x, y \in \mathcal{H}$, $x - Px \in S^\perp$ and $Py \in S$ imply $\langle x - Px, Py \rangle = 0$, so $\langle x, Py \rangle = \langle Px, Py \rangle$. Analogously, $\langle Px, Py \rangle = \langle Px, y \rangle$, so $\langle x, Py \rangle = \langle Px, y \rangle$.

This has a dual point of view, in which orthogonal projections are defined without reference to a subspace as operators with certain properties:

Definition 3.28. An operator $P \in \mathcal{L}(\mathcal{H})$ is called an orthogonal projection if $P^2 = P$ and
\[ \langle u, Pv \rangle = \langle Pu, v \rangle \qquad \forall u, v \in \mathcal{H}. \tag{3.10} \]

This definition is compatible with earlier terminology and completes a correspondence between closed subspaces of $\mathcal{H}$ and orthogonal projections as a family of operators in $\mathcal{L}(\mathcal{H})$:

Proposition 3.29. Let $P \in \mathcal{L}(\mathcal{H})$ be an orthogonal projection. Then $\operatorname{Ran} P$ is a closed subspace of $\mathcal{H}$ and $P$ is the orthogonal projection to $\operatorname{Ran} P$.

Proof. If $x \in \operatorname{Ran} P$, then $x = Py$ for some $y$, so $Px = P^2 y = Py = x$. Conversely, if $Px = x$, then $x \in \operatorname{Ran} P$. This proves that $\operatorname{Ran} P = \operatorname{Ker}(I - P)$, and in particular, $\operatorname{Ran} P$ is closed. Denote $S = \operatorname{Ran} P$. For any $x, y \in \mathcal{H}$, $\langle (I - P)x, Py \rangle = \langle P(I - P)x, y \rangle = \langle (P - P^2)x, y \rangle = 0$, so $(I - P)x \in S^\perp$. Since $Px \in S$ and $x - Px \in S^\perp$, it follows that $P$ is the orthogonal projection to the subspace $S$.


The projection theorem is an existence and uniqueness result, but orthogonal projection can often be computed. By Lemma 3.8, orthogonal projection to a one-dimensional subspace $\operatorname{span}\{y\}$, where $y \neq 0$, is
\[ Px = \frac{\langle y, x \rangle}{\|y\|^2} y. \]

We can view one-dimensional subspaces $S_j = \operatorname{span}\{y_j\}$ as a motivating special case for the following results:

Theorem 3.30. Let $S_1, \dots, S_n$ be mutually orthogonal closed subspaces of $\mathcal{H}$ and let $P_j$ denote orthogonal projection to $S_j$. Then the subspace
\[ S = \operatorname{span} \bigcup_{j=1}^n S_j \]
is closed, and the orthogonal projection to $S$ is given by
\[ Px = \sum_{j=1}^n P_j x. \tag{3.11} \]
Moreover, for all $x \in \mathcal{H}$,
\[ \sum_{j=1}^n \|P_j x\|^2 \le \|x\|^2 \tag{3.12} \]
with equality if and only if $x \in S$; this is known as Bessel's inequality.

Proof. Fix $x \in \mathcal{H}$ and denote $y = \sum_{j=1}^n P_j x \in S$. Fix $k$. For all $j \neq k$, $S_j \subset S_k^\perp$ implies $P_k P_j = 0$, so by properties of $P_k$,
\[ P_k y = \sum_{j=1}^n P_k P_j x = P_k P_k x = P_k x. \]
Thus, $\langle z, x - y \rangle = 0$ if $z \in S_k$ for some $k$. By linearity, $\langle z, x - y \rangle = 0$ for all $z \in S$, and by continuity, $\langle z, x - y \rangle = 0$ for all $z \in \overline{S}$. Thus, $y \in S \subset \overline{S}$ and $x - y \in \overline{S}^\perp$, so $P = \sum_{j=1}^n P_j$ is the orthogonal projection to $\overline{S}$. Moreover, for $x \in \overline{S}$, $x = Px \in S$, so $\overline{S} \subset S$; thus, $S = \overline{S}$ is closed.

The vectors $P_j x \in S_j$ for $j = 1, \dots, n$ and the vector $x - Px \in S^\perp$ are pairwise orthogonal and their sum is $x$, so by the Pythagorean theorem,
\[ \|x\|^2 = \sum_{j=1}^n \|P_j x\|^2 + \|x - Px\|^2. \]
This implies Bessel's inequality, with equality if and only if $x = Px$.
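For one-dimensional subspaces $S_j = \operatorname{span}\{y_j\}$ in $\mathbb{C}^n$, formula (3.11) and Bessel's inequality (3.12) can be illustrated with a few lines of code. This is only a sketch under stated assumptions: the functions `project_onto_line` and `project_onto_span` and the sample vectors are hypothetical, and the vectors `ys` are assumed to be nonzero and pairwise orthogonal.

```python
def inner(u, v):
    """Inner product on C^n, conjugate-linear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

def project_onto_line(y, x):
    """Orthogonal projection of x to span{y}: (<y, x> / ||y||^2) y, with y nonzero."""
    c = inner(y, x) / inner(y, y).real
    return [c * b for b in y]

def project_onto_span(ys, x):
    """Sum of the projections P_j x, as in (3.11); ys must be pairwise orthogonal."""
    total = [0j] * len(x)
    for y in ys:
        p = project_onto_line(y, x)
        total = [t + q for t, q in zip(total, p)]
    return total

# Three pairwise orthogonal vectors in C^4 (illustrative values only).
ys = [[1, 1j, 0, 0], [1, -1j, 0, 0], [0, 0, 2, 0]]
x = [1 + 1j, 2, 3j, 4]

Px = project_onto_span(ys, x)
# Left and right sides of Bessel's inequality (3.12).
bessel_lhs = sum(inner(p, p).real for p in (project_onto_line(y, x) for y in ys))
norm_x_squared = inner(x, x).real
```

Here the span of `ys` consists of vectors whose last coordinate vanishes, so `Px` agrees with `x` in the first three coordinates and `bessel_lhs` falls short of `norm_x_squared` by exactly the squared modulus of the last coordinate.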

Next, we consider the setting where there are infinitely many closed subspaces $S_\gamma$, indexed by an abstract index $\gamma \in \Gamma$.

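Theorem 3.31. Let $S_\gamma$, $\gamma \in \Gamma$, be mutually orthogonal closed subspaces of $\mathcal{H}$, and let $P_\gamma$ denote orthogonal projection to $S_\gamma$. Then:
(a) $S = \overline{\operatorname{span} \bigcup_{\gamma \in \Gamma} S_\gamma}$ is a closed subspace of $\mathcal{H}$.
(b) For any $x \in \mathcal{H}$, the set $\{\gamma \in \Gamma \mid P_\gamma x \neq 0\}$ is countable.
(c) For any $x \in \mathcal{H}$ and any injective map $\sigma : \mathbb{N} \to \Gamma$ such that $\{\gamma \in \Gamma \mid P_\gamma x \neq 0\} \subset \sigma(\mathbb{N})$, the orthogonal projection of $x$ to $S$ is given by
\[ Px = \sum_{j=1}^\infty P_{\sigma(j)} x \tag{3.13} \]
(in particular, the series is convergent in $\mathcal{H}$).
(d) For all $x \in \mathcal{H}$,
\[ \sum_{\gamma \in \Gamma} \|P_\gamma x\|^2 \le \|x\|^2 \tag{3.14} \]
with equality if and only if $x \in S$.

Proof. By Theorem 3.30, for any distinct $\gamma_1, \dots, \gamma_n \in \Gamma$,
\[ \sum_{j=1}^n \|P_{\gamma_j} x\|^2 \le \|x\|^2. \]
Viewing these finite sums as integrals of simple functions on $\Gamma$ with respect to counting measure and taking the supremum over all finite subsets of $\Gamma$,
\[ \sum_{\gamma \in \Gamma} \|P_\gamma x\|^2 \le \|x\|^2. \]
By Markov's inequality (Lemma 1.56), for any $k \in \mathbb{N}$, the set $\{\gamma \in \Gamma \mid \|P_\gamma x\|^2 \ge 1/k\}$ is finite, so the set $\{\gamma \in \Gamma \mid P_\gamma x \neq 0\}$ is countable.

By Lemma 3.16, the series in (3.13) is convergent. Denote by $y$ the value of the series. The vector $y$ is in $S$. Using linearity and continuity of $P_\gamma$,
\[ P_\gamma y = \sum_{j=1}^\infty P_\gamma P_{\sigma(j)} x = \begin{cases} P_\gamma x & \gamma \in \sigma(\mathbb{N}) \\ 0 & \gamma \notin \sigma(\mathbb{N}). \end{cases} \]
Thus, $\langle z, x - y \rangle = 0$ if $z \in S_\gamma$ for some $\gamma$. By linearity, $\langle z, x - y \rangle = 0$ for all $z \in \operatorname{span} \bigcup_{\gamma \in \Gamma} S_\gamma$, and by continuity, $\langle z, x - y \rangle = 0$ for all $z \in S$. Thus, $y \in S$ and $x - y \in S^\perp$. Thus, $y$ is the orthogonal projection of $x$ to $S$.

For any $n \in \mathbb{N}$, by the proof of Theorem 3.30,
\[ \|x\|^2 = \sum_{j=1}^n \|P_{\sigma(j)} x\|^2 + \Bigl\| x - \sum_{j=1}^n P_{\sigma(j)} x \Bigr\|^2. \]
Taking $n \to \infty$ implies
\[ \|x\|^2 = \sum_{j=1}^\infty \|P_{\sigma(j)} x\|^2 + \|x - Px\|^2, \]
so (3.14) holds, with equality if and only if $x = Px$.

Remark 3.32. Theorem 3.31(b) shows that such a map $\sigma$ exists, and part (c) shows that the right-hand side of (3.13) is independent of the choice of map $\sigma$, so we will denote it more concisely by
\[ Px = \sum_{\gamma \in \Gamma} P_\gamma x. \]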

3.3. Direct sums of Hilbert spaces

Any closed subspace $S$ of a Hilbert space $\mathcal{H}$ is also a Hilbert space, since the restriction of the inner product on $\mathcal{H}$ is an inner product on $S$ and since a closed subset of a complete space is complete. By the projection theorem, any vector $v$ can be uniquely decomposed as $v = Pv + (v - Pv)$ with $Pv \in S$, $v - Pv \in S^\perp$. With respect to this decomposition, inner products can be computed as
\[ \langle v, w \rangle = \langle Pv, Pw \rangle + \langle v - Pv, w - Pw \rangle. \]
Thus, the projection theorem can be viewed as a decomposition of the Hilbert space $\mathcal{H}$ into Hilbert spaces $S$ and $S^\perp$.

This motivates a construction which creates, from two Hilbert spaces $\mathcal{H}_1$ and $\mathcal{H}_2$, a new Hilbert space whose vectors are formal sums (or, more formally, ordered pairs) of vectors in $\mathcal{H}_1$ and $\mathcal{H}_2$ and whose inner product is the sum of inner products in $\mathcal{H}_1$ and $\mathcal{H}_2$. The resulting space is called the direct sum of Hilbert spaces $\mathcal{H}_1$ and $\mathcal{H}_2$ and is denoted $\mathcal{H}_1 \oplus \mathcal{H}_2$. This construction could then be iterated or generalized to the construction of a direct sum of $n$ Hilbert spaces, $\mathcal{H}_1 \oplus \mathcal{H}_2 \oplus \cdots \oplus \mathcal{H}_n$. Instead of doing this, we will present a further generalization right away: the direct sum of an arbitrary (possibly infinite) family of Hilbert spaces. We will need this level of generality in order to state the spectral theorem for self-adjoint operators.

Definition 3.33. Given Hilbert spaces $\mathcal{H}_\gamma$, $\gamma \in \Gamma$, we define their direct sum as the space
\[ \bigoplus_{\gamma \in \Gamma} \mathcal{H}_\gamma = \Bigl\{ (v_\gamma)_{\gamma \in \Gamma} \;\Big|\; v_\gamma \in \mathcal{H}_\gamma \text{ for all } \gamma \in \Gamma \text{ and } \sum_{\gamma \in \Gamma} \|v_\gamma\|^2 < \infty \Bigr\} \tag{3.15} \]
with addition and scalar multiplication given by
\[ (u_\gamma)_{\gamma \in \Gamma} + (v_\gamma)_{\gamma \in \Gamma} = (u_\gamma + v_\gamma)_{\gamma \in \Gamma}, \qquad \lambda (u_\gamma)_{\gamma \in \Gamma} = (\lambda u_\gamma)_{\gamma \in \Gamma}, \]
and inner product given by
\[ \langle (u_\gamma)_{\gamma \in \Gamma}, (v_\gamma)_{\gamma \in \Gamma} \rangle = \sum_{\gamma \in \Gamma} \langle u_\gamma, v_\gamma \rangle. \tag{3.16} \]

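Note that if $\Gamma$ is finite, the summability condition in (3.15) is trivial, and $\bigoplus_{\gamma \in \Gamma} \mathcal{H}_\gamma$ consists of arbitrary $N$-tuples with $v_\gamma \in \mathcal{H}_\gamma$. If we view $\mathbb{C}$ as a Hilbert space with inner product $\langle z, w \rangle = \bar{z} w$ and set $\mathcal{H}_\gamma = \mathbb{C}$ for all $\gamma$, as a special case of the above construction we obtain:

Example 3.34. For any set $\Gamma$, $\bigoplus_{\gamma \in \Gamma} \mathbb{C} = \ell^2(\Gamma)$.

In particular, the following theorem contains an independent proof of completeness of $\ell^2(\Gamma)$.

Theorem 3.35. For any family of Hilbert spaces $\mathcal{H}_\gamma$, $\gamma \in \Gamma$:
(a) The direct sum $\bigoplus_{\gamma \in \Gamma} \mathcal{H}_\gamma$ is a Hilbert space.
(b) Vectors with finitely many nonzero entries are dense in $\bigoplus_{\gamma \in \Gamma} \mathcal{H}_\gamma$.
(c) If $\mathcal{H}_\gamma$ are separable spaces and $\Gamma$ is countable, $\bigoplus_{\gamma \in \Gamma} \mathcal{H}_\gamma$ is separable.

Proof. (a) The set $\mathcal{H} = \bigoplus_{\gamma \in \Gamma} \mathcal{H}_\gamma$ is obviously closed under scalar multiplication, and it is closed under addition due to
\[ \sum_{\gamma \in \Gamma} \|u_\gamma + v_\gamma\|^2 \le \sum_{\gamma \in \Gamma} \bigl( 2 \|u_\gamma\|^2 + 2 \|v_\gamma\|^2 \bigr). \]
Thus, $\mathcal{H}$ is a vector space. By the Cauchy–Schwarz inequality,
\[ \sum_{\gamma \in \Gamma} |\langle u_\gamma, v_\gamma \rangle| \le \sum_{\gamma \in \Gamma} \|u_\gamma\| \|v_\gamma\| \le \Bigl( \sum_{\gamma \in \Gamma} \|u_\gamma\|^2 \Bigr)^{1/2} \Bigl( \sum_{\gamma \in \Gamma} \|v_\gamma\|^2 \Bigr)^{1/2} < \infty, \]
so the sum defining the inner product is absolutely convergent; thus, (3.16) is well defined as a map from $\mathcal{H} \times \mathcal{H}$ to $\mathbb{C}$. It then follows that (3.16) is an inner product on $\mathcal{H}$. The induced norm is, of course,
\[ \|(v_\gamma)_{\gamma \in \Gamma}\| = \Bigl( \sum_{\gamma \in \Gamma} \langle v_\gamma, v_\gamma \rangle \Bigr)^{1/2} = \Bigl( \sum_{\gamma \in \Gamma} \|v_\gamma\|^2 \Bigr)^{1/2}. \]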

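The next step is to prove that $\mathcal{H}$ is complete. Consider a Cauchy sequence of vectors $v_n = (v_{n,\gamma})_{\gamma \in \Gamma} \in \mathcal{H}$, $n \in \mathbb{N}$. We will prove that it is convergent. Since the sequence is Cauchy, it is bounded. For any $\gamma \in \Gamma$, it follows from $\|v_{m,\gamma} - v_{n,\gamma}\| \le \|v_m - v_n\|$ that $(v_{n,\gamma})_{n=1}^\infty$ is a Cauchy sequence in $\mathcal{H}_\gamma$, and therefore the limit
\[ w_\gamma = \lim_{n \to \infty} v_{n,\gamma} \tag{3.17} \]
exists for each $\gamma$. By Fatou's lemma applied to the counting measure,
\[ \sum_{\gamma \in \Gamma} \|w_\gamma\|^2 \le \liminf_{n \to \infty} \sum_{\gamma \in \Gamma} \|v_{n,\gamma}\|^2 = \liminf_{n \to \infty} \|v_n\|^2 < \infty, \]
so $w = (w_\gamma)_{\gamma \in \Gamma} \in \mathcal{H}$. Since $(v_n)_{n=1}^\infty$ is Cauchy, for any $\varepsilon > 0$ there exists $n_0$ such that for all $m, n \ge n_0$, $\|v_n - v_m\|^2 < \varepsilon^2$. By Fatou's lemma, for $n \ge n_0$,
\[ \sum_{\gamma \in \Gamma} \|w_\gamma - v_{n,\gamma}\|^2 \le \liminf_{m \to \infty} \sum_{\gamma \in \Gamma} \|v_{m,\gamma} - v_{n,\gamma}\|^2 = \liminf_{m \to \infty} \|v_m - v_n\|^2 \le \varepsilon^2. \]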

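This means that $\lim_{n \to \infty} \|w - v_n\|^2 = 0$, that is, $v_n \to w$ in $\mathcal{H}$.

(b) For any $v = (v_\gamma)_{\gamma \in \Gamma} \in \mathcal{H}$ and any $\varepsilon > 0$, the condition $\sum_{\gamma \in \Gamma} \|v_\gamma\|^2 < \infty$ implies that there is a finite $A \subset \Gamma$ such that
\[ \sum_{\gamma \in \Gamma \setminus A} \|v_\gamma\|^2 < \varepsilon^2. \]
Thus, the vector $w$ with $w_\gamma = v_\gamma$ for $\gamma \in A$ and $w_\gamma = 0$ for $\gamma \notin A$ obeys $\|v - w\| < \varepsilon$.

(c) Let $D_\gamma$ be some dense subsets of $\mathcal{H}_\gamma$. Then, for the vector $w$ from the proof of (b), we can denote $K = \# A$ and find $x \in \mathcal{H}$ such that $x_\gamma \in D_\gamma$ and $\|x_\gamma - w_\gamma\|^2 < \varepsilon^2 / K$ for $\gamma \in A$ and $x_\gamma = 0$ for $\gamma \notin A$. Then $\|x - w\| < \varepsilon$. Thus, the set
\[ D = \bigcup_{\substack{A \subset \Gamma \\ A \text{ finite}}} \{ (x_\gamma)_{\gamma \in \Gamma} \mid x_\gamma \in D_\gamma \text{ for all } \gamma \in A \text{ and } x_\gamma = 0 \text{ for all } \gamma \notin A \} \]
is dense in $\mathcal{H}$. If $\Gamma$ is countable, then the set $\{A \subset \Gamma \mid A \text{ is finite}\}$ is countable. If $\mathcal{H}_\gamma$ are separable, the sets $D_\gamma$ can be chosen to be countable. Since finite Cartesian products of countable sets are countable, their countable union $D$ is a countable dense subset of $\mathcal{H}$.

We originally motivated the direct sum construction through orthogonal subspaces of a single Hilbert space. But we then developed it in the different setup of a sum of Hilbert spaces. We now revisit this construction in the special case of mutually orthogonal closed subspaces $S_\gamma$ of a single Hilbert space, offering a different interpretation up to a natural isomorphism.

Theorem 3.36. Let $S_\gamma$, $\gamma \in \Gamma$, be mutually orthogonal closed subspaces of the Hilbert space $\mathcal{H}$, and let $P_\gamma$ denote orthogonal projection to $S_\gamma$. Then the map
\[ U : \overline{\operatorname{span} \bigcup_{\gamma \in \Gamma} S_\gamma} \to \bigoplus_{\gamma \in \Gamma} S_\gamma \tag{3.18} \]
defined by $Uw = (P_\gamma w)_{\gamma \in \Gamma}$ is unitary, with inverse given by $w = \sum_{\gamma \in \Gamma} P_\gamma w$.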


Proof. Denote by $S$ the closure of the span of $\bigcup_{\gamma \in \Gamma} S_\gamma$. By Theorems 3.30 and 3.31, for any $w \in S$,
\[ \sum_{\gamma \in \Gamma} \|P_\gamma w\|^2 = \|w\|^2, \]
so $U$ is well-defined and norm-preserving. If $w \in S_\beta$ for some $\beta \in \Gamma$, then
\[ (Uw)_\gamma = \begin{cases} w & \gamma = \beta \\ 0 & \gamma \neq \beta, \end{cases} \]
so $\operatorname{Ran} U$ contains all vectors with only one nonzero entry. By linearity, it contains all vectors with finitely many nonzero entries. Since those are dense and $\operatorname{Ran} U$ is closed, it follows that $U$ is surjective.

Due to the natural unitary map between them, the two spaces in (3.18) are often conflated, and $\overline{\operatorname{span} \bigcup_{\gamma \in \Gamma} S_\gamma}$ is often called the direct sum of the mutually orthogonal closed subspaces $S_\gamma$. For instance, in this language, the projection theorem can be concisely restated as follows: for any closed subspace $S$ of $\mathcal{H}$, $S \oplus S^\perp = \mathcal{H}$.

3.4. Orthonormal sets and orthonormal bases

In this section, we develop the notion of an orthonormal basis of a Hilbert space $\mathcal{H}$, which allows a useful representation of arbitrary vectors and a classification of Hilbert spaces up to unitary equivalence. Let us begin by comparing this with the situation from linear algebra. As discussed in Chapter 2, we recall that in any vector space $V$, the span of a subset $X \subset V$ consists of vectors of the form
\[ v = \sum_{j=1}^n c_j x_j, \]
where $n \in \mathbb{N}$, $x_1, \dots, x_n \in X$ and $c_1, \dots, c_n \in \mathbb{C}$. We emphasize that $n$ must be finite: in a general vector space, there is no notion of convergence and therefore no notion of series. Likewise, linear independence of a set $X$ is defined as linear independence of every finite subset of $X$. A Hamel basis of $V$ is defined as a linearly independent set of vectors $X$ such that $\operatorname{span} X = V$.

While it can be proved using Zorn's lemma that every vector space has a Hamel basis, for almost all purposes in analysis, that is not the useful object to consider. Instead, in a Hilbert space $\mathcal{H}$, it is useful to consider sets $X$ such that $\operatorname{span} X$ is dense, and to allow vectors to be represented as infinite linear combinations of basis vectors, addressing issues of convergence as they arise. In another departure from general linear algebra, we will only consider orthonormal bases.


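Definition 3.37. Let $\mathcal{H}$ be a Hilbert space.
(a) A set of vectors $X \subset \mathcal{H}$ is an orthonormal family if $\|x\| = 1$ for all $x \in X$ and $\langle x, x' \rangle = 0$ for all $x, x' \in X$ with $x \neq x'$.
(b) A set of vectors $X \subset \mathcal{H}$ is an orthonormal basis of $\mathcal{H}$ if it is an orthonormal family and $\operatorname{span} X$ is dense in $\mathcal{H}$.

A countable orthonormal family is often enumerated and written as a sequence, and we will alternate between these points of view.

Example 3.38. Define vectors $e_\gamma \in \ell^2(\Gamma)$ by
\[ (e_\gamma)_\beta = \begin{cases} 1 & \beta = \gamma \\ 0 & \beta \neq \gamma. \end{cases} \tag{3.19} \]
Then $(e_\gamma)_{\gamma \in \Gamma}$ is an orthonormal basis of $\ell^2(\Gamma)$.

Proof. These vectors form an orthonormal family. Their span is the set of vectors in $\ell^2(\Gamma)$ with finitely many nonzero entries, which is dense in $\ell^2(\Gamma)$ by Theorem 3.35.

Example 3.39. $\{e^{ikx} \mid k \in \mathbb{Z}\}$ is an orthonormal basis for $L^2([0, 2\pi], \frac{dx}{2\pi})$.

Proof. This is an orthonormal family because
\[ \langle e^{ilx}, e^{ikx} \rangle = \int_0^{2\pi} e^{i(k-l)x} \frac{dx}{2\pi} = \begin{cases} 1 & k - l = 0 \\ 0 & k - l \in \mathbb{Z} \setminus \{0\}. \end{cases} \]
Since Lebesgue measure gives zero weight to boundary points, the map $x \mapsto e^{ix}$ induces an isomorphism from the space $L^2([0, 2\pi], \frac{dx}{2\pi})$ to $L^2(\partial\mathbb{D}, \frac{dx}{2\pi})$, with $\frac{dx}{2\pi}$ now denoting normalized Lebesgue measure on the unit circle $\partial\mathbb{D}$. This space has a dense subspace $C(\partial\mathbb{D})$, so for any $f \in L^2(\partial\mathbb{D}, \frac{dx}{2\pi})$ and $\varepsilon > 0$, there exists $h \in C(\partial\mathbb{D})$ such that $\|f - h\|_2 < \varepsilon / 2$. By Weierstrass's second theorem (Corollary 2.21), for any $h \in C(\partial\mathbb{D})$, there exists a trigonometric polynomial $Q$ such that $\|h - Q\|_\infty < \varepsilon / 2$. Thus,
\[ \|f - Q\|_2 \le \|f - h\|_2 + \|h - Q\|_2 \le \|f - h\|_2 + \|h - Q\|_\infty < \varepsilon. \]
Thus, $\operatorname{span}\{e^{ikx} \mid k \in \mathbb{Z}\}$ is dense in $L^2([0, 2\pi], \frac{dx}{2\pi})$.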

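If a Hilbert space $\mathcal{H}$ has an orthonormal basis $(e_\gamma)_{\gamma \in \Gamma}$, applying Theorem 3.36 with one-dimensional subspaces $S_\gamma = \operatorname{span}\{e_\gamma\}$ describes the structure of the Hilbert space by a unitary correspondence with $\ell^2(\Gamma)$:

Theorem 3.40. Let $(e_\gamma)_{\gamma \in \Gamma}$ be an orthonormal basis for the Hilbert space $\mathcal{H}$. The map $U : \mathcal{H} \to \ell^2(\Gamma)$ defined by
\[ Uw = (\langle e_\gamma, w \rangle)_{\gamma \in \Gamma}, \qquad w \in \mathcal{H}, \]
is unitary and the inverse map is given by
\[ U^{-1} \kappa = \sum_{\gamma \in \Gamma} \kappa_\gamma e_\gamma, \qquad \kappa \in \ell^2(\Gamma). \tag{3.20} \]

Proof. Since projection to $S_\gamma = \operatorname{span}\{e_\gamma\}$ is $P_\gamma w = \langle e_\gamma, w \rangle e_\gamma$ and $\|P_\gamma w\| = |\langle e_\gamma, w \rangle|$, Theorem 3.36 implies that for every $w \in \mathcal{H}$,
\[ \|w\|^2 = \sum_{\gamma \in \Gamma} |\langle e_\gamma, w \rangle|^2, \tag{3.21} \]
and that for any $\kappa \in \ell^2(\Gamma)$, the vector $w = \sum_{\gamma \in \Gamma} \kappa_\gamma e_\gamma$ obeys $Uw = \kappa$.

The norm-preserving property (3.21) is called Parseval's equality. Special cases of Theorem 3.40 give unitary representations of interest, such as the Fourier series expansion:

Example 3.41 (Fourier series expansion). The map $\mathcal{F} : L^2([0, 2\pi], \frac{dx}{2\pi}) \to \ell^2(\mathbb{Z})$, defined by
\[ (\mathcal{F} f)_n = \int_0^{2\pi} e^{-inx} f(x) \frac{dx}{2\pi}, \]
is unitary, and its inverse is given by
\[ (\mathcal{F}^{-1} u)(x) = \sum_{k \in \mathbb{Z}} u_k e^{ikx}, \]
with the series understood as a limit in $L^2([0, 2\pi], \frac{dx}{2\pi})$.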
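As a worked illustration of Parseval's equality (3.21) for the Fourier series expansion, one can take $f(x) = x$ on $[0, 2\pi]$. This particular choice of $f$ is an assumption made here for illustration and does not appear in the text; the coefficients follow from one integration by parts, and comparing both sides of (3.21) recovers the classical value of $\sum 1/n^2$.

```latex
% Worked instance of Parseval's equality (3.21) for f(x) = x on [0, 2\pi],
% with respect to the normalized measure dx / 2\pi.
\begin{align*}
(\mathcal{F} f)_0 &= \int_0^{2\pi} x \,\frac{dx}{2\pi} = \pi, \\
(\mathcal{F} f)_n &= \int_0^{2\pi} e^{-inx} x \,\frac{dx}{2\pi} = \frac{i}{n},
  \qquad n \neq 0, \\
\|f\|^2 &= \int_0^{2\pi} x^2 \,\frac{dx}{2\pi} = \frac{4\pi^2}{3}, \\
\frac{4\pi^2}{3} &= \sum_{n \in \mathbb{Z}} |(\mathcal{F} f)_n|^2
  = \pi^2 + 2 \sum_{n=1}^{\infty} \frac{1}{n^2}
  \quad\Longrightarrow\quad
  \sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6}.
\end{align*}
```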

Hilbert spaces $\mathcal{H}, \mathcal{K}$ are said to be unitarily equivalent if there exists a unitary map $U : \mathcal{H} \to \mathcal{K}$. Theorem 3.40 provides such unitary equivalences, which describe the structure of a Hilbert space. For instance, since spaces $\ell^2(\{1, \dots, n\}) = \mathbb{C}^n$ and $\ell^2(\mathbb{N})$ are separable, it follows that any Hilbert space with a countable orthonormal basis is separable. Conversely, we will prove that every separable Hilbert space has a countable orthonormal basis; then Theorem 3.40 will lead to a classification of separable Hilbert spaces. Parts of the proof are constructive. We will need a formula for orthogonal projection to certain finite-dimensional subspaces:

Corollary 3.42. Let $y_1, \dots, y_n$ be an orthonormal sequence in $\mathcal{H}$. Then the subspace $S = \operatorname{span}\{y_1, \dots, y_n\}$ is closed and the orthogonal projection to $S$ is given by
\[ Px = \sum_{j=1}^n \langle y_j, x \rangle y_j. \tag{3.22} \]
Moreover, for all $x \in \mathcal{H}$,
\[ \sum_{j=1}^n |\langle y_j, x \rangle|^2 \le \|x\|^2, \tag{3.23} \]
with equality if and only if $x \in S$. This is also known as Bessel's inequality.

Proof. This is a special case of Theorem 3.30 with $S_j = \operatorname{span}\{y_j\}$.

This motivates a process for obtaining orthonormal sequences with a given span, known as the Gram–Schmidt process. This procedure can be expressed in several superficially different ways, depending on when one deals with linear dependence and when one treats normalization. Let us address linear dependence in a preliminary step:

Lemma 3.43. Any finite or infinite sequence of vectors has a linearly independent subsequence with the same span.

Proof. Starting from the sequence $(x_n)_{n=1}^N$, we include in the subsequence all the elements $x_n$ such that $x_n \notin \operatorname{span}\{x_j \mid j \le n - 1\}$. By induction in $m$, for every finite $m \le N$, $\operatorname{span}\{x_n \mid n \le m\} = \operatorname{span}\{x_{n_k} \mid n_k \le m\}$.

We now describe the Gram–Schmidt process, formulating it as an existence and uniqueness result with an explicit solution:

Proposition 3.44 (Gram–Schmidt process). Let $(x_n)_{n=1}^N$ be a linearly independent sequence in $\mathcal{H}$, with $N$ finite or $\infty$, and denote $V_0 = \{0\}$ and $V_n = \operatorname{span}\{x_j \mid 1 \le j \le n\}$ for $n \ge 1$. Then there is a unique orthonormal sequence $(y_n)_{n=1}^N$ such that for all $n$,
\[ \operatorname{span}\{y_j \mid 1 \le j \le n\} = V_n, \tag{3.24} \]
and for some scalars $c_n > 0$,
\[ x_n - c_n y_n \in V_{n-1}. \tag{3.25} \]
The sequence is given explicitly by a recursive formula
\[ y_n = \frac{x_n - \sum_{j=1}^{n-1} \langle y_j, x_n \rangle y_j}{\bigl\| x_n - \sum_{j=1}^{n-1} \langle y_j, x_n \rangle y_j \bigr\|}. \tag{3.26} \]
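The recursive formula (3.26) translates directly into code for vectors in $\mathbb{C}^n$. The sketch below is one possible rendering and not a numerically robust implementation: the function name `gram_schmidt` and the tolerance `tol` used to detect linear dependence are assumptions introduced here.

```python
import math

def inner(u, v):
    """Inner product on C^n, conjugate-linear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

def norm(u):
    """Induced norm ||u|| = sqrt(<u, u>)."""
    return math.sqrt(inner(u, u).real)

def gram_schmidt(xs, tol=1e-12):
    """Orthonormalize a linearly independent list xs following (3.26).

    tol is a hypothetical threshold for rejecting (numerically) dependent input.
    """
    ys = []
    for x in xs:
        # Subtract the projection of x onto span{y_1, ..., y_{n-1}}, as in (3.22).
        r = [complex(a) for a in x]
        for y in ys:
            c = inner(y, x)
            r = [a - c * b for a, b in zip(r, y)]
        c_n = norm(r)   # the positive scalar c_n from (3.25)
        if c_n <= tol:
            raise ValueError("input vectors are (numerically) linearly dependent")
        ys.append([a / c_n for a in r])
    return ys

# Illustrative input: three linearly independent vectors in C^3.
xs = [[1, 1, 0], [1, 0, 1], [0, 1, 1j]]
ys = gram_schmidt(xs)
# Gram matrix of the output; it should be close to the identity matrix.
gram = [[inner(a, b) for b in ys] for a in ys]
```

Since $x_n - c_n y_n \in V_{n-1}$ and $y_n \perp V_{n-1}$, the scalar $c_n$ equals $\langle y_n, x_n \rangle$, which gives a quick consistency check on the output.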

Proof. We prove uniqueness by induction in $n$. The basis of induction $n = 0$ is trivial; in the inductive step, we assume that $y_1, \dots, y_{n-1}$ are orthonormal and that $\operatorname{span}\{y_1, \dots, y_{n-1}\} = V_{n-1}$. The orthogonality conditions on $y_n$ imply that $y_n \perp y_j$ for all $j < n$, so by linearity, $y_n \in V_{n-1}^\perp$. Together with $x_n - c_n y_n \in V_{n-1}$, this implies that $x_n - c_n y_n$ is the orthogonal projection of $x_n$ to $V_{n-1}$. By Corollary 3.42, this implies
\[ x_n - c_n y_n = \sum_{j=1}^{n-1} \langle y_j, x_n \rangle y_j. \]
Since $x_n \notin V_{n-1}$, $c_n y_n \neq 0$. Since $y_n$ is normalized and $c_n > 0$, this implies that
\[ c_n = \Bigl\| x_n - \sum_{j=1}^{n-1} \langle y_j, x_n \rangle y_j \Bigr\|, \]
and (3.26) is the unique solution. From $x_n - c_n y_n \in V_{n-1}$ and the inductive assumption, it follows that $\operatorname{span}\{y_1, \dots, y_n\} = V_n$.

Corollary 3.45. Every separable Hilbert space has a countable orthonormal basis. In particular, every separable Hilbert space is unitarily equivalent to $\mathbb{C}^n$ for some $n \in \mathbb{N}$ or to $\ell^2(\mathbb{N})$.

Proof. If $\mathcal{H}$ is separable, it has a countable dense set. By Lemma 3.43, $\mathcal{H}$ has a linearly independent sequence with a dense span. Applying the Gram–Schmidt process gives an orthonormal basis $(y_j)_{j=1}^N$. It follows that there exists a unitary map from $\mathcal{H}$ to $\bigoplus_{j=1}^N \mathbb{C}$.

This is very close to a classification result for separable Hilbert spaces. It remains to prove that different orthonormal bases have the same cardinality. We need two results:

Theorem 3.46. If a Hilbert space $\mathcal{H}$ has a finite orthonormal basis consisting of $n$ vectors, then any orthonormal family in $\mathcal{H}$ has at most $n$ vectors.

Proof. Assume that $(e_k)_{k=1}^n$ is an orthonormal basis of $\mathcal{H}$ and that $X$ is an orthonormal family in $\mathcal{H}$. Using Parseval's equality with respect to the orthonormal basis $(e_k)_{k=1}^n$ and Bessel's inequality with respect to the orthonormal family $X$ gives
\[ \sum_{x \in X} 1 = \sum_{x \in X} \|x\|^2 = \sum_{x \in X} \sum_{k=1}^n |\langle e_k, x \rangle|^2 \le \sum_{k=1}^n \|e_k\|^2 = n, \]
which shows that $X$ has at most $n$ elements.

Theorem 3.47. In a separable Hilbert space, every orthonormal set is countable.

Proof. Let $X$ be an orthonormal set in $\mathcal{H}$. By Parseval's equality with respect to a countable orthonormal basis $(e_\gamma)_{\gamma \in \Gamma}$, for each $x \in X$ there exists $\gamma$ such that $\langle x, e_\gamma \rangle \neq 0$. However, for each $\gamma \in \Gamma$, the set $\{x \in X \mid \langle x, e_\gamma \rangle \neq 0\}$ is countable by Theorem 3.31 applied to the orthogonal projections to $\operatorname{span}\{x\}$, $x \in X$. Taking the union over $\Gamma$ shows that the set $X$ is countable.

Thus, in a separable Hilbert space $\mathcal{H}$, every orthonormal basis has the same cardinality, which is called the dimension of $\mathcal{H}$ and is denoted $\dim \mathcal{H}$. In the finite-dimensional case, this Hilbert space dimension matches the notion of dimension from linear algebra; however, in the infinite-dimensional case, the cardinalities are not the same (Exercise 3.15).

An important special case is obtained by starting with the sequence of monomials with respect to a suitable measure. In the theory of orthogonal polynomials, measures such that $\operatorname{supp} \mu$ is infinite are called nontrivial. This facilitates the following construction:

Example 3.48. Let $\mu$ be a measure on $\mathbb{C}$ with all finite moments, i.e.,
\[ \int |z|^n \, d\mu(z) < \infty \qquad \forall n \in \mathbb{N} \cup \{0\}. \]
Then $z^n \in L^2(\mathbb{C}, d\mu)$ for all $n \in \mathbb{N} \cup \{0\}$ and the following hold:
(a) Monomials $1, z, \dots, z^n$ are linearly independent in $L^2(\mathbb{C}, d\mu)$ if and only if $\operatorname{supp} \mu$ consists of more than $n$ points.
(b) If $\mu$ is nontrivial, there is a unique sequence of polynomials $(p_n(z))_{n=0}^\infty$ such that each $p_n$ is of degree $n$, $p_n$ have positive leading coefficients, and
\[ \langle p_m, p_n \rangle = \int \overline{p_m(x)} p_n(x) \, d\mu(x) = \delta_{m,n}. \tag{3.27} \]
(c) If $\mu$ is nontrivial and $\operatorname{supp} \mu$ is a compact subset of $\mathbb{R}$, then $(p_n(z))_{n=0}^\infty$ is an orthonormal basis in $L^2(\mathbb{R}, d\mu)$.

Proof. (a) Monomials $1, z, \dots, z^n$ are linearly dependent if and only if there exists a nontrivial polynomial $Q$ with $\deg Q \le n$ such that $\|Q\|_2 = 0$, i.e., $Q = 0$ $\mu$-a.e. Such a polynomial exists if and only if $\operatorname{supp} \mu$ consists of at most $n$ points.
(b) The sequence $(p_n)_{n=0}^\infty$ is the sequence obtained by the Gram–Schmidt process from the linearly independent sequence $(z^n)_{n=0}^\infty$ in $L^2(\mathbb{C}, d\mu)$.
(c) By Weierstrass's theorem, if $\operatorname{supp} \mu$ is a compact subset of $\mathbb{R}$, polynomials are dense in $C(\operatorname{supp} \mu)$, which is itself a dense subset of $L^2(\mathbb{R}, d\mu)$.

The polynomials $p_n$ are called the orthonormal polynomials for the measure $\mu$. Orthogonal polynomials for measures supported on $\mathbb{R}$ are closely related to Jacobi matrices, and we will revisit them in Chapter 10; see also [25, 92, 98, 107].
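To make the construction of orthonormal polynomials concrete, the Gram–Schmidt process can be run on monomials for one specific measure. The sketch below assumes $\mu$ is Lebesgue measure on $[-1, 1]$, a choice made here for illustration only; polynomials are stored as real coefficient lists, and the inner product is computed from the moments $\int_{-1}^{1} x^k \, dx$. The names `moment` and `orthonormal_polynomials` are hypothetical.

```python
import math

def moment(k):
    """k-th moment of Lebesgue measure on [-1, 1]: the integral of x^k dx."""
    return 0.0 if k % 2 else 2.0 / (k + 1)

def inner(p, q):
    """<p, q> for real polynomials given as coefficient lists [a0, a1, ...]."""
    return sum(a * b * moment(i + j)
               for i, a in enumerate(p) for j, b in enumerate(q))

def orthonormal_polynomials(count):
    """Apply the recursion (3.26) to the monomials 1, x, x^2, ..."""
    ps = []
    for n in range(count):
        mono = [0.0] * n + [1.0]   # coefficients of the monomial x^n
        r = list(mono)
        for p in ps:
            c = inner(p, mono)
            for i, a in enumerate(p):
                r[i] -= c * a      # subtract the component along p
        scale = math.sqrt(inner(r, r))
        ps.append([a / scale for a in r])
    return ps

ps = orthonormal_polynomials(4)
# Gram matrix with respect to the measure; it should be close to the identity.
gram = [[inner(p, q) for q in ps] for p in ps]
```

For this measure the output consists of normalized Legendre polynomials; for instance, `ps[2]` is proportional to $3x^2 - 1$. The monomial coefficient is never modified in the loop, so each leading coefficient is positive, in line with part (b) of Example 3.48.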


For measures supported on the unit circle ∂D = {z ∈ C | |z| = 1}, orthonormal polynomials do not usually give an orthonormal basis, but are closely related to a basis of trigonometric polynomials (Exercise 3.16). For a systematic study of orthogonal polynomials on the unit circle see [88, 89].

3.5. Weak convergence

The Riesz representation theorem motivates the following definition.

Definition 3.49. A sequence $(x_n)_{n=1}^\infty$ in $\mathcal{H}$ converges weakly to $x \in \mathcal{H}$ if
\[ \langle x_n, v \rangle \to \langle x, v \rangle \qquad \forall v \in \mathcal{H}. \]
This is denoted $\operatorname{w-lim}_{n \to \infty} x_n = x$ or $x_n \xrightarrow{w} x$.

Of course, this is the special case of weak-$*$ convergence (Section 2.5) in the setting of a Hilbert space $\mathcal{H}$, written as a statement about vectors in $\mathcal{H}$ instead of about functionals in $\mathcal{H}^*$. The next few basic properties are mostly specializations of general properties of weak-$*$ convergence:

Lemma 3.50. If $x_n \xrightarrow{w} x$ and $x_n \xrightarrow{w} y$, then $x = y$.

Proof. For all $v \in \mathcal{H}$, $\langle x, v \rangle = \lim_{n \to \infty} \langle x_n, v \rangle = \langle y, v \rangle$. Thus, $\langle x - y, \cdot \rangle$ is the trivial functional on $\mathcal{H}$, so $x = y$.

It is common to refer to convergence with respect to the Hilbert space norm as strong convergence, to distinguish it from the newly defined weak convergence. The two are related:

Lemma 3.51. $x_n \to x$ implies $x_n \xrightarrow{w} x$.

Proof. This follows from $|\langle x_n, v \rangle - \langle x, v \rangle| = |\langle x_n - x, v \rangle| \le \|x_n - x\| \|v\|$.

Lemma 3.52. If $\mathcal{H}$ is finite dimensional, $x_n \xrightarrow{w} x$ implies $x_n \to x$.

Proof. If $\mathcal{H}$ has a finite orthonormal basis $\{e_j\}_{j=1}^N$ with $N < \infty$, then $x_n \xrightarrow{w} x$ implies $\langle e_j, x_n \rangle \to \langle e_j, x \rangle$ for all $j$, so
\[ x_n = \sum_{j=1}^N \langle e_j, x_n \rangle e_j \to \sum_{j=1}^N \langle e_j, x \rangle e_j = x. \]

Accordingly, we will focus on the infinite-dimensional case from now on. In that case, weak convergence does not imply strong convergence:

Example 3.53. Any orthonormal sequence converges weakly to 0, but does not converge strongly.

Proof. Let $(x_n)_{n=1}^\infty$ be an orthonormal sequence. By the Pythagorean theorem, for all $n \neq m$, $\|x_m - x_n\| = \sqrt{2}$. Therefore, $(x_n)_{n=1}^\infty$ is not a Cauchy sequence, so it is not convergent. For any $v \in \mathcal{H}$, by Bessel's inequality,
\[ \sum_{n=1}^\infty |\langle x_n, v \rangle|^2 \le \|v\|^2 < \infty, \]
so $\langle x_n, v \rangle \to 0$ as $n \to \infty$. Thus, $x_n \xrightarrow{w} 0$.
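The behavior in Example 3.53 can be observed numerically for the standard basis vectors of $\ell^2(\mathbb{N})$, truncated to finitely many coordinates. The truncation dimension `K` and the test vector `v` with entries $1/j$ are assumptions of this sketch; a finite truncation can only suggest the limiting behavior, not prove it.

```python
import math

K = 200  # truncation dimension (an assumption of this sketch)

def e(n):
    """Standard basis vector e_n, truncated to K coordinates (1-indexed)."""
    return [1.0 if k == n else 0.0 for k in range(1, K + 1)]

def inner(u, v):
    """Inner product, conjugate-linear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

# A fixed test vector v with entries 1/j (illustrative choice).
v = [1.0 / j for j in range(1, K + 1)]

# <e_n, v> = 1/n shrinks as n grows, consistent with weak convergence to 0.
pairings = [inner(e(n), v).real for n in (1, 10, 100)]

# Distinct basis vectors stay at distance sqrt(2), so the sequence is not Cauchy.
diff = [a - b for a, b in zip(e(3), e(7))]
distance = math.sqrt(inner(diff, diff).real)

# Bessel's inequality: the sum of |<e_n, v>|^2 is bounded by ||v||^2.
bessel_sum = sum(abs(inner(e(n), v)) ** 2 for n in range(1, K + 1))
norm_v_squared = inner(v, v).real
```

In the truncated space the family $e_1, \dots, e_K$ is a full orthonormal basis, so `bessel_sum` coincides with `norm_v_squared` up to rounding; in $\ell^2(\mathbb{N})$ itself the same identity holds as Parseval's equality.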

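This example also shows that $x_n \xrightarrow{w} x$ does not imply convergence of $\|x_n\|$ to $\|x\|$. The connections between weak convergence and boundedness are described in the following proposition.

Proposition 3.54.
(a) If $x_n \xrightarrow{w} x$, then $\sup_{n \in \mathbb{N}} \|x_n\| < \infty$.
(b) If $x_n \xrightarrow{w} x$, then $\|x\| \le \liminf_{n \to \infty} \|x_n\|$.
(c) If $x_n \xrightarrow{w} x$ and $\|x\| \ge \limsup_{n \to \infty} \|x_n\|$, then $x_n \to x$.

Proof. (a) The functionals $\Lambda_n = \langle x_n, \cdot \rangle \in \mathcal{H}^*$ converge pointwise, so they are pointwise bounded. Thus, by the uniform boundedness principle, they are uniformly bounded. Since $\|\Lambda_n\| = \|x_n\|$, the sequence $x_n$ is bounded.

(b) This follows from
\[ \|x\|^2 = \lim_{n \to \infty} |\langle x_n, x \rangle| \le \liminf_{n \to \infty} \|x_n\| \|x\|. \]

(c) Since $\langle x_n, x \rangle \to \langle x, x \rangle = \|x\|^2$, starting from
\[ \|x_n - x\|^2 = \|x_n\|^2 + \|x\|^2 - 2 \operatorname{Re} \langle x_n, x \rangle, \]
we conclude
\[ \limsup_{n \to \infty} \|x_n - x\|^2 = \limsup_{n \to \infty} \|x_n\|^2 - \|x\|^2 \le 0. \]
Thus, $x_n \to x$.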

Weak convergence does not imply convergence of norms, so it does not imply convergence of inner products: to find $x_n \xrightarrow{w} x$ and $y_n \xrightarrow{w} y$ such that $\langle x_n, y_n \rangle$ does not converge to $\langle x, y \rangle$, it suffices to take $y_n = x_n$ to be an orthonormal sequence in $\mathcal{H}$. The next lemma is therefore in some sense optimal.

Lemma 3.55. If $x_n \xrightarrow{w} x$ and $y_n \to y$, then $\langle x_n, y_n \rangle \to \langle x, y \rangle$.

Proof. Since the sequence $(x_n)_{n=1}^\infty$ is bounded, it follows from
\[ |\langle x_n, y_n - y \rangle| \le \|x_n\| \|y_n - y\| \]
and $y_n \to y$ that $\langle x_n, y_n - y \rangle \to 0$. Weak convergence implies $\langle x_n - x, y \rangle \to 0$, so
\[ \langle x_n, y_n \rangle - \langle x, y \rangle = \langle x_n - x, y \rangle + \langle x_n, y_n - y \rangle \to 0. \]

Applying Lemma 2.46 to a sequence in $\mathcal{H}^*$ and using the Riesz representation theorem provides the following criterion for weak convergence.

Lemma 3.56. If $(x_n)_{n=1}^\infty$ is a bounded sequence in $\mathcal{H}$ and there is a dense set $D \subset \mathcal{H}$ such that for all $y \in D$, the limit $\lim_{n \to \infty} \langle x_n, y \rangle$ exists, then the sequence $(x_n)_{n=1}^\infty$ is weakly convergent.

Applying the Banach–Alaoglu theorem to Hilbert spaces gives the following result, often stated as weak compactness of a closed ball in $\mathcal{H}$.

Theorem 3.57. In a separable Hilbert space, every bounded sequence has a weakly convergent subsequence.

Similarly, a small modification of Theorem 2.56 and its proof show:

Theorem 3.58. Let $\mathcal{H}$ be a separable Hilbert space, and let $(e_k)_{k=1}^\infty$ be an orthonormal basis of $\mathcal{H}$. Then
\[ d(x, y) = \sum_{k=1}^\infty \min(2^{-k}, |\langle e_k, x - y \rangle|) \tag{3.28} \]
defines a metric on $\mathcal{H}$. Moreover, let $(x_n)_{n=1}^\infty$ be a sequence in $\mathcal{H}$, and let $x \in \mathcal{H}$. Then $x_n \xrightarrow{w} x$ if and only if
\[ \sup_{n \in \mathbb{N}} \|x_n\| < \infty \qquad \text{and} \qquad \lim_{n \to \infty} d(x_n, x) = 0. \]

Proof. (a) By Lemma 2.55, $d$ is a semimetric. If $d(x, y) = 0$, then $\langle e_k, x - y \rangle = 0$ for all $k$, which implies $x = y$ because $(e_k)_{k=1}^\infty$ is an orthonormal basis. Thus, $d$ is a metric.

(b) If $x_n \xrightarrow{w} x$, then $x_n$ is a bounded sequence in $\mathcal{H}$. Moreover, $\langle e_k, x_n - x \rangle \to 0$ for every $k$, so by dominated convergence with dominating sequence $2^{-k}$ applied to the counting measure on $\mathbb{N}$,
\[ \lim_{n \to \infty} d(x_n, x) = \sum_{k=1}^\infty \lim_{n \to \infty} \min\{2^{-k}, |\langle e_k, x_n - x \rangle|\} = 0. \]
Conversely, if $d(x_n, x) \to 0$, then $\min(2^{-k}, |\langle e_k, x_n - x \rangle|) \to 0$ for each $k$, so $\langle e_k, x_n - x \rangle \to 0$ for each $k$, and then by linearity, $\langle v, x_n \rangle \to \langle v, x \rangle$ for all $v \in \operatorname{span}\{e_k\}_{k=1}^\infty$. Since that set is dense and the sequence $x_n$ is bounded, by Lemma 3.56, $x_n \xrightarrow{w} x$.

In other words, on bounded sets, convergence in this metric is equivalent to weak convergence; this is only true on bounded subsets of $\mathcal{H}$, and not on all of $\mathcal{H}$, as seen in the following example.

Example 3.59. If we take $x_n = n e_n$, then $d(x_n, 0) = 2^{-n} \to 0$, but $x_n$ does not converge weakly to $0$ because for the vector $v = \sum_{j=1}^\infty \frac{1}{j} e_j \in \mathcal{H}$, we have $\langle v, x_n \rangle = 1 \not\to 0$.

The restriction to bounded sets does not mean that this choice of metric is wrong; rather, it turns out there is no metric that would work. If $\dim \mathcal{H} = \infty$, weak convergence on $\mathcal{H}$ is not metrizable, i.e., there is no metric $d$ on $\mathcal{H}$ such that $d(x_n, x) \to 0$ if and only if $x_n \xrightarrow{w} x$. This result will not be proved or needed in this text.
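The two computations in Example 3.59 can be reproduced with a truncated version of the metric (3.28). In this sketch, vectors are represented by dictionaries mapping an index $k$ to the coefficient $\langle e_k, x \rangle$; this representation, the truncation level `K`, and the helper names are assumptions made for illustration.

```python
K = 60  # number of terms kept in the series (3.28); an assumption of this sketch

def d(x, y):
    """Truncation of the metric (3.28); x and y are dicts {k: <e_k, x>}."""
    total = 0.0
    for k in range(1, K + 1):
        coeff = x.get(k, 0.0) - y.get(k, 0.0)   # <e_k, x - y>
        total += min(2.0 ** (-k), abs(coeff))
    return total

def x_n(n):
    """The vector n * e_n from Example 3.59."""
    return {n: float(n)}

def pairing_with_v(x):
    """<v, x> for v = sum_j (1/j) e_j and a finitely supported real vector x."""
    return sum((1.0 / k) * c for k, c in x.items())

zero = {}
distances = [d(x_n(n), zero) for n in (1, 5, 20)]       # equal to 2^{-n}
pairings = [pairing_with_v(x_n(n)) for n in (1, 5, 20)]  # equal to 1 for every n
```

The distances decay geometrically while the pairings with $v$ stay at $1$, which is the failure of weak convergence described in the example; the sequence $x_n$ is unbounded, so this does not conflict with Theorem 3.58.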

3.6. Tensor products of Hilbert spaces The tensor product of Hilbert spaces H, K is a new Hilbert space H ⊗ K obtained by a multiplicative construction: vectors are obtained from formal products of vectors and the inner product is a product of inner products. We describe that construction in this section, as well as a universal property which determines the tensor product uniquely and which can be more transparent than the construction itself. We begin with a motivating example. Example 3.60. Consider the Hilbert space L2 ([0, 1]2 ) = L2 ([0, 1]2 , dm2 ), where m2 denotes two-dimensional Lebesgue measure on [0, 1]2 . For f, g ∈ L2 ([0, 1]) = L2 ([0, 1], dm), we deﬁne the function f ⊗ g ∈ L2 ([0, 1]2 ) by (f ⊗ g)(x, y) = f (x)g(y).

(3.29)

(a) The map $(f, g) \mapsto f \otimes g$ is a bilinear map from $L^2([0,1]) \times L^2([0,1])$ to $L^2([0,1]^2)$, i.e., it is linear in each parameter.

(b) For all $f_1, f_2, g_1, g_2 \in L^2([0,1])$,

$$\langle f_1 \otimes g_1, f_2 \otimes g_2\rangle = \langle f_1, f_2\rangle \langle g_1, g_2\rangle \tag{3.30}$$

with inner products taken in the respective Hilbert spaces.

(c) $\operatorname{span}\{f \otimes g \mid f, g \in L^2([0,1])\}$ is dense in $L^2([0,1]^2)$.

(d) $\operatorname{span}\{f \otimes g \mid f, g \in L^2([0,1])\} \neq L^2([0,1]^2)$.

Proof. (a) Bilinearity follows directly from (3.29), and $f \otimes g \in L^2([0,1]^2)$ from Tonelli's theorem:

$$\int_{[0,1]^2} |f(x) g(y)|^2 \, dm_2(x, y) = \int_0^1 |f(x)|^2 \, dx \int_0^1 |g(y)|^2 \, dy < \infty.$$

(b) By the definition of the inner product on $L^2([0,1]^2)$,

$$\langle f_1 \otimes g_1, f_2 \otimes g_2\rangle = \int_{[0,1]^2} \overline{f_1(x) g_1(y)}\, f_2(x) g_2(y) \, dm_2(x, y).$$

The integrand is in $L^1([0,1]^2)$, so using Fubini's theorem to separate this as a product of single integrals gives (3.30).

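(c) $\operatorname{span}\{e^{2\pi i k x} \otimes e^{2\pi i l y} \mid k, l \in \mathbb{Z}\}$ is a subalgebra of $C([0,1]^2)$ which separates points and is closed under complex conjugation, so it is dense in $C([0,1]^2)$ by the Stone–Weierstrass theorem. Thus, it is dense in $L^2([0,1]^2)$.

(d) The set $\{e^{2\pi i k x} \otimes e^{2\pi i l y} \mid k, l \in \mathbb{Z}\}$ is an orthonormal family by (3.30) and has a dense span by (c), so it is an orthonormal basis for $L^2([0,1]^2)$. For $f, g \in L^2([0,1])$,

$$\sum_{n \in \mathbb{Z}} |\langle e^{2\pi i n x} \otimes e^{2\pi i n y}, f \otimes g\rangle| = \sum_{n \in \mathbb{Z}} |\langle e^{2\pi i n x}, f(x)\rangle \langle e^{2\pi i n y}, g(y)\rangle| \le \|f\| \|g\|$$

by the Cauchy–Schwarz inequality in $\ell^2(\mathbb{Z})$. Thus, for $h = f \otimes g$,

$$\sum_{n \in \mathbb{Z}} |\langle e^{2\pi i n x} \otimes e^{2\pi i n y}, h\rangle| < \infty. \tag{3.31}$$

By linearity, (3.31) then also holds for all $h \in \operatorname{span}\{f \otimes g \mid f, g \in L^2([0,1])\}$. However, there exist $h \in L^2([0,1]^2)$ for which (3.31) fails: it suffices to take

$$h = \sum_{n \in \mathbb{Z}} a_n \, e^{2\pi i n x} \otimes e^{2\pi i n y}$$

with a sequence $a \in \ell^2(\mathbb{Z})$ (so that the sum gives an element of $L^2([0,1]^2)$) and $a \notin \ell^1(\mathbb{Z})$ (so that (3.31) fails). An explicit example is the vector $h = \sum_{n=1}^\infty \frac{1}{n} e^{2\pi i n x} \otimes e^{2\pi i n y}$.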

A description of a tensor product Hilbert space would not have any substance without the description of the accompanying bilinear map $(f, g) \mapsto f \otimes g$. The previous example suggests the following definition.

Definition 3.61. Let $\mathcal{H}$, $\mathcal{K}$, $V$ be Hilbert spaces, and let $i : \mathcal{H} \times \mathcal{K} \to V$ be a map with the following properties.

(a) $i$ is bilinear, i.e., for any $\lambda \in \mathbb{C}$, $x, x' \in \mathcal{H}$, $y, y' \in \mathcal{K}$,

$$\lambda\, i(x, y) = i(\lambda x, y) = i(x, \lambda y), \tag{3.32}$$
$$i(x + x', y) = i(x, y) + i(x', y), \tag{3.33}$$
$$i(x, y + y') = i(x, y) + i(x, y'). \tag{3.34}$$

(b) For all $x, x' \in \mathcal{H}$ and $y, y' \in \mathcal{K}$, $\langle i(x, y), i(x', y')\rangle = \langle x, x'\rangle \langle y, y'\rangle$.

(c) The image of $i$ has a dense span in $V$.

The space $V$ is called the tensor product of Hilbert spaces $\mathcal{H}$, $\mathcal{K}$ and $i$ is called the canonical bilinear map.

We will now prove existence of the tensor product (by an explicit construction) and its uniqueness (up to a unitary map). The construction is different from the algebraic construction of a tensor product of modules: we

rely on the inner product structure throughout, and in order to make $\mathcal{H} \otimes \mathcal{K}$ a Hilbert space, we use an additional Hilbert space completion, which is considered in an abstract setting in Exercise 3.8.

Theorem 3.62. Let $\mathcal{H}$, $\mathcal{K}$ be Hilbert spaces.

(a) There exists a Hilbert space $V_1$ and a bilinear map $i_1 : \mathcal{H} \times \mathcal{K} \to V_1$ which obeys properties (a), (b), and (c) of Definition 3.61.

(b) If there is a Hilbert space $V_2$ and a bilinear map $i_2 : \mathcal{H} \times \mathcal{K} \to V_2$ with the same properties, then there is a unitary map $U : V_1 \to V_2$ such that $U \circ i_1 = i_2$.

Proof. (a) We begin by considering the set of formal linear combinations of pairs $(x, y) \in \mathcal{H} \times \mathcal{K}$,

$$A = \Bigl\{ \sum_{j=1}^n c_j (x_j, y_j) \Bigm| n \in \mathbb{N},\ c_j \in \mathbb{C},\ x_j \in \mathcal{H},\ y_j \in \mathcal{K} \Bigr\}$$

(more formally, this can be presented as the set of all functions $\mathcal{H} \times \mathcal{K} \to \mathbb{C}$ which are equal to $0$ except at finitely many points). The set $A$ is a vector space; on it, we define the sesquilinear form

$$\Bigl\langle \sum_{j=1}^n c_j (x_j, y_j), \sum_{k=1}^m d_k (x_k', y_k') \Bigr\rangle = \sum_{j=1}^n \sum_{k=1}^m \bar{c}_j d_k \langle x_j, x_k'\rangle \langle y_j, y_k'\rangle.$$

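We define the set

$$A_0 = \{ v \in A \mid \langle v, w\rangle = 0 \text{ for all } w \in A \}.$$

If $v - v', w - w' \in A_0$, then $\langle v, w\rangle = \langle v', w\rangle = \langle v', w'\rangle$, so this sesquilinear form induces a sesquilinear form on the quotient vector space $A/A_0$. Denote by $x \otimes y$ the coset of $(x, y)$ in $A/A_0$. It is directly verified that the map $i(x, y) = x \otimes y$ is bilinear. For instance, if we compute

$$\langle \lambda (x, y) - (\lambda x, y), (x', y')\rangle = \bar{\lambda} \langle x, x'\rangle \langle y, y'\rangle - \langle \lambda x, x'\rangle \langle y, y'\rangle = 0,$$

taking linear combinations for the second parameter shows that

$$\langle \lambda (x, y) - (\lambda x, y), w\rangle = 0 \qquad \forall w \in A,$$

and therefore

$$\lambda (x \otimes y) = (\lambda x) \otimes y. \tag{3.35}$$

Similar calculations show that

$$\lambda (x \otimes y) = x \otimes (\lambda y), \tag{3.36}$$
$$x \otimes y + x' \otimes y = (x + x') \otimes y, \tag{3.37}$$
$$x \otimes y + x \otimes y' = x \otimes (y + y'), \tag{3.38}$$
$$\langle x \otimes y, x' \otimes y'\rangle = \langle x, x'\rangle \langle y, y'\rangle. \tag{3.39}$$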

Our next goal is to prove that this sesquilinear form on $A/A_0$ is positive definite. In order to do this, let us first rewrite an arbitrary vector

$$v = \sum_{j=1}^n c_j \, x_j \otimes y_j$$

in $A/A_0$. By the Gram–Schmidt process, there is an orthonormal sequence $x_1', x_2', \dots, x_m'$, $m \le n$, with the same span as $x_1, \dots, x_n$. Writing $x_j$ as linear combinations of $x_k'$ and using (3.35) and (3.37), we can write $v$ as a linear combination of vectors of the form $x_k' \otimes y$ for some $y \in \mathcal{K}$. Grouping terms with the same $k$ by using (3.36) and (3.38), we can finally write

$$v = \sum_{k=1}^m x_k' \otimes y_k'$$

for some $y_1', \dots, y_m' \in \mathcal{K}$. Since $x_k'$ are orthonormal,

$$\langle v, v\rangle = \sum_{j=1}^m \sum_{k=1}^m \langle x_j', x_k'\rangle \langle y_j', y_k'\rangle = \sum_{k=1}^m \|y_k'\|^2 \ge 0.$$

Moreover, $\langle v, v\rangle = 0$ implies $y_k' = 0$ for all $k$, so $v = 0$. In conclusion, we have proved that the sesquilinear form on $A/A_0$ is positive definite. By construction, the span of the range of $i$ is $A/A_0$. The vector space $A/A_0$ is equipped with an inner product, but is not (in general) complete; denoting by $V$ its Hilbert space completion (Exercise 3.8) completes the proof.

(b) Define a map $W : A \to V_2$ by defining $W : (x, y) \mapsto i_2(x, y)$ and extending linearly. The map $W$ preserves inner products, so $\operatorname{Ker} W = A_0$. Thus, $W$ induces a norm-preserving map $U : A/A_0 \to V_2$. Since $A/A_0$ is a dense subset of $V_1$, this extends to a norm-preserving map $U : V_1 \to V_2$. The range of $U$ is $V_2$ because the span of the range of $i_2$ is dense in $V_2$ and $U \circ i_1 = i_2$ by construction.

Due to this existence and uniqueness theorem, it is customary to denote the tensor product $V$ by $\mathcal{H} \otimes \mathcal{K}$ and the values of the canonical bilinear map $i$ by $i(x, y) = x \otimes y$. For example, we can now say that $L^2([0,1]) \otimes L^2([0,1]) = L^2([0,1]^2)$ with the canonical bilinear map (3.29).

In practice, it is easier to check the definition than to trace the explicit construction of the tensor product:

Lemma 3.63. If $(e_j)_{j \in J}$ is an orthonormal basis of $\mathcal{H}$ and $(f_k)_{k \in K}$ is an orthonormal basis of $\mathcal{K}$, then $(e_j \otimes f_k)_{j \in J, k \in K}$ is an orthonormal basis of $\mathcal{H} \otimes \mathcal{K}$. In particular, $\dim(\mathcal{H} \otimes \mathcal{K}) = \dim \mathcal{H} \dim \mathcal{K}$.

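Proof. The set $(e_j \otimes f_k)_{j \in J, k \in K}$ is an orthonormal set because

$$\langle e_j \otimes f_k, e_{j'} \otimes f_{k'}\rangle = \langle e_j, e_{j'}\rangle \langle f_k, f_{k'}\rangle,$$

and this is equal to $1$ if $j = j'$ and $k = k'$ and zero otherwise. It suffices to prove that $M = \overline{\operatorname{span}}\{e_j \otimes f_k \mid j \in J, k \in K\}$ is dense in $\mathcal{H} \otimes \mathcal{K}$. For any $k \in K$, bilinearity of the tensor product implies that

$$\operatorname{span}\{e_j \otimes f_k \mid j \in J\} = \{x \otimes f_k \mid x \in \operatorname{span}\{e_j \mid j \in J\}\},$$

and since $\|x \otimes f_k\| = \|x\| \|f_k\|$, this set is dense in $\{x \otimes f_k \mid x \in \mathcal{H}\}$. Thus, $x \otimes f_k \in M$ for all $x \in \mathcal{H}$ and $k \in K$. Repeating this argument for $\mathcal{K}$, we conclude $x \otimes y \in M$ for all $x \in \mathcal{H}$ and $y \in \mathcal{K}$. Since $M$ is a closed subspace, this implies that $M = \mathcal{H} \otimes \mathcal{K}$.

In particular, it is common to say that $\mathbb{C}^m \otimes \mathbb{C}^n \cong \mathbb{C}^{mn}$, denoting the standard basis of $\mathbb{C}^m$ by $e_1, \dots, e_m$, the standard basis of $\mathbb{C}^n$ by $f_1, \dots, f_n$, and viewing $\mathbb{C}^{mn}$ as the space of $m \times n$ matrices so that $e_j \otimes f_k$ is the matrix with a $1$ in the $jk$ entry and zeros in all other entries.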
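For the finite-dimensional identification $\mathbb{C}^m \otimes \mathbb{C}^n \cong \mathbb{C}^{mn}$, property (b) of Definition 3.61 can be tested directly. The sketch below assumes the inner product is conjugate-linear in the first slot (as in Exercise 3.7) and stores the $m \times n$ matrix of $x \otimes y$ in row-major order; the function names and sample vectors are illustrative choices.

```python
def inner(a, b):
    """<a, b> on C^d, conjugate-linear in the first slot."""
    return sum(s.conjugate() * t for s, t in zip(a, b))

def tensor(x, y):
    """x (tensor) y as the m-by-n matrix (x_j * y_k), flattened row by row."""
    return [xj * yk for xj in x for yk in y]

# Arbitrary sample vectors in C^2 and C^3
x1, y1 = [1 + 2j, -1j], [0.5, 2 - 1j, 3j]
x2, y2 = [2j, 1 - 1j], [1j, 1.0, -2 + 0j]

lhs = inner(tensor(x1, y1), tensor(x2, y2))  # <x1 (x) y1, x2 (x) y2>
rhs = inner(x1, x2) * inner(y1, y2)          # <x1, x2> <y1, y2>

# e_1 (x) f_2 is the matrix unit with a single 1 in the (1, 2) entry
e1, f2 = [1, 0], [0, 1, 0]
unit = tensor(e1, f2)
```

Here `lhs` and `rhs` agree up to rounding, and `unit` has a single nonzero entry, matching the description of $e_j \otimes f_k$ as a matrix unit.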

3.7. Exercises

3.1. Let $S \subset [0, 2\pi)$ be a subset of positive Lebesgue measure. Prove that there exists $C > 0$ such that

$$\int_S |a + b e^{i\theta}|^2 \, d\theta \ge C(|a|^2 + |b|^2) \qquad \forall a, b \in \mathbb{C},$$

and find the optimal constant $C$ as a function of $S$.

3.2. Prove that $|\langle x, y\rangle| = \|x\| \|y\|$ if and only if $x$, $y$ are linearly dependent.

3.3. The Gram matrix of vectors $x_1, \dots, x_n \in \mathcal{H}$ is the $n \times n$ matrix $B$ with entries $b_{jk} = \langle x_j, x_k\rangle$. Prove the following.
(a) $B$ is always positive semidefinite, i.e., $\lambda^* B \lambda \ge 0$ for all $\lambda \in \mathbb{C}^n$.
(b) $B$ is positive definite, i.e., $\lambda^* B \lambda > 0$ for all $\lambda \in \mathbb{C}^n \setminus \{0\}$, if and only if vectors $x_1, \dots, x_n$ are linearly independent.

3.4. Prove that $\|x + y\| = \|x\| + \|y\|$ if and only if $x = 0$ or $y = \lambda x$ for some $\lambda \ge 0$.

3.5. Given a pairwise orthogonal sequence $(x_j)_{j=1}^\infty$ in $\mathcal{H}$ obeying (3.6), for any bijection $\sigma : \mathbb{N} \to \mathbb{N}$, prove that $\sum_{j=1}^\infty x_{\sigma(j)} = \sum_{j=1}^\infty x_j$.

3.6. If $X$ is $\sigma$-locally compact and $\mu$ is a Baire measure on $X$ whose support contains at least two points, prove that for $p \in [1, \infty] \setminus \{2\}$, the $p$-norm on $L^p(X, \mu)$ is not the induced norm of an inner product.

3.7. Let $V$ be a vector space, and let $\langle \cdot, \cdot \rangle : V \times V \to \mathbb{C}$ be a map which is linear in the second parameter, conjugate-symmetric, and for all $x \in V$, $\langle x, x\rangle \ge 0$.
(a) Prove that $\|x\| = \sqrt{\langle x, x\rangle}$ defines a seminorm on $V$.

(b) Let $V_0 = \{x \in V \mid \|x\| = 0\}$. For any $x \in V_0$ and $y \in V$, prove that $\langle x, y\rangle = 0$.
(c) Prove that $\langle \cdot, \cdot \rangle$ induces an inner product on $V/V_0$. Hint: Use Lemma 2.8.

3.8. Let $V$ be a pre-Hilbert space, and let $B$ be its Banach space completion (Exercise 2.10). Explicitly, we assume that $B$ is a Banach space, $i : V \to B$ is a norm-preserving linear map, and $\operatorname{Ran} i$ is dense in $B$.
(a) Prove that the map $\langle \cdot, \cdot \rangle : \operatorname{Ran} i \times \operatorname{Ran} i \to \mathbb{C}$ defined by $\langle i(x), i(y)\rangle = \langle x, y\rangle$ can be extended uniquely to a continuous map $B \times B \to \mathbb{C}$, which we also denote by $\langle \cdot, \cdot \rangle$.
(b) Prove that this extension is an inner product on $B$ and that $\langle x, x\rangle = \|x\|^2$ for all $x \in B$, so the norm induced by $\langle \cdot, \cdot \rangle$ is the norm of $B$.
(c) Conclude that $B$ is a Hilbert space with inner product $\langle \cdot, \cdot \rangle$.

3.9. (a) If $S$ is a subspace of $\mathcal{H}$, prove that $S^\perp = \overline{S}^\perp$.
(b) If $S$ is a closed subspace of $\mathcal{H}$, prove that $(S^\perp)^\perp = S$.
(c) If $S$ is an arbitrary subspace of $\mathcal{H}$, prove that $(S^\perp)^\perp = \overline{S}$.

3.10. If $P$, $Q$ are orthogonal projections on $\mathcal{H}$, prove that $\operatorname{Ran} P \subset \operatorname{Ran} Q$ if and only if $QP = P$.

3.11. A subspace $S$ of $\mathcal{H}$ is finite-dimensional if $S = \operatorname{span}\{v_1, \dots, v_n\}$ for some finite set $v_1, \dots, v_n$. Prove that any finite-dimensional subspace is closed.

3.12. Prove that $(\sqrt{2} \sin(n \pi x))_{n=1}^\infty$ is an orthonormal basis for $L^2([0,1])$.

3.13. For any $n \in \mathbb{N}$, prove that the sequence $(e^{2\pi i k \cdot x})_{k \in \mathbb{Z}^n}$ is an orthonormal basis for $L^2([0,1]^n, dm_n)$ where $m_n$ denotes Lebesgue measure on $[0,1]^n$.

3.14. Let $(e_\gamma)_{\gamma \in \Gamma}$ be an orthonormal basis of a Hilbert space $\mathcal{H}$. Prove that $\sum_{\gamma \in \Gamma} \kappa_\gamma e_\gamma$ can be interpreted as a Bochner integral (Definition 2.59) if and only if $\kappa \in \ell^1(\Gamma)$, but it can be interpreted as a Pettis integral (Remark 2.63) if and only if $\kappa \in \ell^2(\Gamma)$.

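3.15. Let $\mathcal{H}$ be a separable, infinite-dimensional Hilbert space. Prove that any set $X \subset \mathcal{H}$ such that $\operatorname{span} X = \mathcal{H}$ is uncountable. Hint: If $X$ was countable, prove there would exist an orthonormal sequence $(y_j)_{j=1}^\infty$ with $\operatorname{span}\{y_j \mid j \in \mathbb{N}\} = \mathcal{H}$ and consider $v = \sum_{j=1}^\infty j^{-1} y_j$.

3.16. Let $\mu$ be a probability measure on $\partial\mathbb{D} = \{z \in \mathbb{C} \mid |z| = 1\}$. Assume that $\mu$ is nontrivial, i.e., $\operatorname{supp} \mu$ is infinite.
(a) Prove that applying the Gram–Schmidt process to the sequence $1, z, z^{-1}, z^2, z^{-2}, \dots$ gives an orthonormal basis $\{\chi_n\}_{n=0}^\infty$ for $L^2(\partial\mathbb{D}, d\mu)$. This is called the CMV basis following Cantero–Moral–Velázquez [16].
(b) Denote by $(\varphi_n)_{n=0}^\infty$ the result of applying the Gram–Schmidt process to $(z^n)_{n=0}^\infty$. Prove that for all $n \in \mathbb{N} \cup \{0\}$,

$$\chi_{2n}(z) = z^n \overline{\varphi_{2n}(1/\bar{z})}, \qquad \chi_{2n+1}(z) = z^{-n} \varphi_{2n+1}(z).$$

3.17. For any Hilbert space $\mathcal{H}$ and $n \in \mathbb{N}$, construct a canonical bilinear map that justifies the identification $\mathcal{H} \otimes \mathbb{C}^n = \bigoplus_{j=1}^n \mathcal{H}$.

3.18. Consider Hilbert spaces $\mathcal{H}$, $\mathcal{K}$ and their tensor product $\mathcal{H} \otimes \mathcal{K}$. Prove that $\operatorname{span}\{x \otimes y \mid x \in \mathcal{H}, y \in \mathcal{K}\} \neq \mathcal{H} \otimes \mathcal{K}$ if and only if $\mathcal{H}$ and $\mathcal{K}$ are infinite dimensional.

3.19. If $\mathcal{H}_1$, $\mathcal{H}_2$, $\mathcal{H}_3$ are Hilbert spaces, prove that there exists a unitary map $U : (\mathcal{H}_1 \otimes \mathcal{H}_2) \otimes \mathcal{H}_3 \to \mathcal{H}_1 \otimes (\mathcal{H}_2 \otimes \mathcal{H}_3)$ such that $U((x_1 \otimes x_2) \otimes x_3) = x_1 \otimes (x_2 \otimes x_3)$ for all $x_j \in \mathcal{H}_j$, $j = 1, 2, 3$.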

Chapter 4

Bounded linear operators

In this chapter, we study bounded linear operators on a Hilbert space $\mathcal{H}$. It is assumed throughout that $\mathcal{H}$ is separable. Composition of operators on $L(\mathcal{H})$ is viewed as a multiplicative operation, with identity operator $I$ defined by $Ix = x$ for all $x \in \mathcal{H}$; together with the linear structure, this makes $L(\mathcal{H})$ an algebra. The Hilbert space structure on $\mathcal{H}$ induces an additional unary operation, which associates to every operator $A$ its adjoint operator $A^*$ such that for all $u, v \in \mathcal{H}$,

$$\langle u, Av\rangle = \langle A^* u, v\rangle. \tag{4.1}$$

That this is well defined and determines a unique operator $A^* \in L(\mathcal{H})$ will be proved promptly in Proposition 4.2. We will then consider the resulting structure and properties of $L(\mathcal{H})$. Some of the material of this chapter is preparation for a detailed study of self-adjoint operators, i.e., those with $A = A^*$, which will follow in later chapters.

4.1. The $C^*$-algebra of bounded linear operators on $\mathcal{H}$

For operators on a Hilbert space, the norm can be characterized in terms of the inner product:

Lemma 4.1. For any linear map $A : \mathcal{H} \to \mathcal{H}$,

$$\|A\| = \sup_{\substack{u, v \in \mathcal{H} \\ \|u\| = \|v\| = 1}} |\langle u, Av\rangle|.$$

In particular, $A$ is a bounded linear operator if and only if this supremum is finite.

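Proof. By Lemma 3.24, for any $v \in \mathcal{H}$,

$$\|Av\| = \sup_{\substack{u \in \mathcal{H} \\ \|u\| = 1}} |\langle u, Av\rangle|.$$

Taking the supremum over all normalized $v \in \mathcal{H}$ completes the proof.

Proposition 4.2. Let $A \in L(\mathcal{H})$. For any $u \in \mathcal{H}$, there is a unique vector $A^* u \in \mathcal{H}$ such that (4.1) holds for all $v \in \mathcal{H}$. The map $u \mapsto A^* u$ is linear and $\|A^*\| = \|A\|$. In particular, $A^*$ is a bounded linear operator on $\mathcal{H}$.

Proof. For fixed $u \in \mathcal{H}$, consider the linear map $\Lambda_u : \mathcal{H} \to \mathbb{C}$ defined by $\Lambda_u v = \langle u, Av\rangle$. Since for all $v \in \mathcal{H}$,

$$|\Lambda_u v| = |\langle u, Av\rangle| \le \|u\| \|Av\| \le \|u\| \|A\| \|v\|,$$

$\Lambda_u$ is a bounded linear functional on $\mathcal{H}$. By the Riesz representation theorem, it corresponds to a unique vector $A^* u$. Linearity of $A^*$ follows from uniqueness; namely, for any $\lambda \in \mathbb{C}$ and $u, v \in \mathcal{H}$,

$$\langle A^*(\lambda u), v\rangle = \langle \lambda u, Av\rangle = \bar{\lambda} \langle u, Av\rangle = \bar{\lambda} \langle A^* u, v\rangle = \langle \lambda A^* u, v\rangle,$$

so $A^*(\lambda u) = \lambda A^* u$, and similarly, for any $u, u', v \in \mathcal{H}$,

$$\langle A^*(u + u'), v\rangle = \langle u + u', Av\rangle = \langle A^* u, v\rangle + \langle A^* u', v\rangle = \langle A^* u + A^* u', v\rangle,$$

so $A^*(u + u') = A^* u + A^* u'$. Thus, $A^*$ is a linear operator on $\mathcal{H}$. Boundedness of $A^*$ and $\|A^*\| = \|A\|$ follow from Lemma 4.1.

Definition 4.3. The adjoint of $A \in L(\mathcal{H})$ is the unique operator $A^* \in L(\mathcal{H})$ such that (4.1) holds for all $u, v \in \mathcal{H}$.

Example 4.4. Let $A$ be a complex $n \times n$ matrix, viewed as an element of $L(\mathbb{C}^n)$. Its adjoint $A^*$ is the matrix with entries $(A^*)_{ij} = \overline{A_{ji}}$.

Proof. Since $A^* \in L(\mathbb{C}^n)$ is uniquely determined by (4.1), it suffices to compute its matrix elements, which follow from (4.1) by

$$(A^*)_{ij} = \langle \delta_i, A^* \delta_j\rangle = \overline{\langle A^* \delta_j, \delta_i\rangle} = \overline{\langle \delta_j, A \delta_i\rangle} = \overline{A_{ji}}.$$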
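As a quick numerical companion to Example 4.4, the defining identity (4.1) can be checked for a small matrix and its conjugate transpose. This is a minimal sketch with arbitrarily chosen entries; the helper names are illustrative and the inner product is taken conjugate-linear in the first slot.

```python
def inner(a, b):
    """<a, b> on C^n, conjugate-linear in the first slot."""
    return sum(s.conjugate() * t for s, t in zip(a, b))

def matvec(A, v):
    """Matrix-vector product A v."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in A]

def adjoint(A):
    """Conjugate transpose: (A*)_{ij} = conj(A_{ji})."""
    n = len(A)
    return [[A[j][i].conjugate() for j in range(n)] for i in range(n)]

A = [[1 + 1j, 2], [3j, 4 - 2j]]
u = [1 - 1j, 2j]
v = [0.5 + 0j, -1 + 3j]

lhs = inner(u, matvec(A, v))           # <u, A v>
rhs = inner(matvec(adjoint(A), u), v)  # <A* u, v>
```

The two numbers coincide up to rounding, and applying `adjoint` twice returns the original matrix.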

Lemma 4.5. Let $U \in L(\mathcal{H})$. Then:

(a) $U$ is norm-preserving if and only if $U^* U = I$.

(b) $U$ is unitary if and only if $U^* U = U U^* = I$.

Proof. (a) By definition, $U$ is norm-preserving if and only if $\|Uv\| = \|v\|$ for all $v \in \mathcal{H}$. By the polarization identity, this is equivalent to $\langle Uv, Uw\rangle = \langle v, w\rangle$ for all $v, w \in \mathcal{H}$ and therefore equivalent to $\langle v, U^* U w\rangle = \langle v, w\rangle$ for all $v, w \in \mathcal{H}$. By the Riesz representation theorem this is equivalent to $U^* U w = w$ for all $w$ and then to $U^* U = I$.

(b) If $U^* U = U U^* = I$, then $U$ is norm-preserving by (a); moreover, $U U^* = I$ implies $\operatorname{Ran} U = \mathcal{H}$, so $U$ is unitary. Conversely, let $U$ be unitary. Then it is norm-preserving, so $U^* U = I$. Moreover, any $v \in \mathcal{H}$ can be written in the form $v = Uw$ so $U U^* v = U U^* U w = U w = v$. Thus, $U U^* = I$.

Example 4.6. Let $S$ denote the shift operator on $\ell^2(\mathbb{N})$, defined by

$$(Su)_n = u_{n+1}, \qquad u \in \ell^2(\mathbb{N}). \tag{4.2}$$

Its adjoint $S^*$ is the operator

$$(S^* u)_n = \begin{cases} 0 & n = 1 \\ u_{n-1} & n \ge 2. \end{cases} \tag{4.3}$$

Note that $S S^* = I$ but $S^* S \neq I$.

Proof. Note that $S$ is a bounded linear operator with $\|S\| = 1$, because

$$\|Su\|^2 = \sum_{n=1}^\infty |u_{n+1}|^2 = \sum_{k=2}^\infty |u_k|^2 \le \sum_{k=1}^\infty |u_k|^2 = \|u\|^2$$

and equality holds for any vector with $u_1 = 0$. From (4.2), we compute $S \delta_k = \delta_{k-1}$ if $k \ge 2$ and $S \delta_1 = 0$. Thus, for $u \in \ell^2(\mathbb{N})$,

$$(S^* u)_n = \langle \delta_n, S^* u\rangle = \langle S \delta_n, u\rangle$$

and splitting cases gives (4.3). Direct calculations give $S S^* = I$ and $S^* S x = x - x_1 \delta_1$.

Lemma 4.7. For any $A \in L(\mathcal{H})$, $(\operatorname{Ran} A)^\perp = \operatorname{Ker} A^*$.

Proof. $\langle u, Av\rangle = 0$ for all $v \in \mathcal{H}$ is equivalent to $\langle A^* u, v\rangle = 0$ for all $v \in \mathcal{H}$, so it is equivalent to $A^* u = 0$.

Recall that any bounded linear operator is continuous, so it maps convergent sequences to convergent sequences. It also maps weakly convergent sequences to weakly convergent sequences:

Lemma 4.8. Let $A \in L(\mathcal{H})$. If $x_n \xrightarrow{w} x$, then $A x_n \xrightarrow{w} A x$.

Proof. For any $y \in \mathcal{H}$, $\langle y, A x_n\rangle = \langle A^* y, x_n\rangle \to \langle A^* y, x\rangle = \langle y, Ax\rangle$.
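The shift operators of Example 4.6 are easy to experiment with on finitely supported sequences. The sketch below represents such a sequence as a Python list with all later entries understood to be zero; this representation and the function names are illustrative assumptions, not part of the text.

```python
def S(u):
    """Left shift: (S u)_n = u_{n+1}."""
    return u[1:]

def S_star(u):
    """Right shift: (S* u)_1 = 0 and (S* u)_n = u_{n-1} for n >= 2."""
    return [0] + u

def inner(a, b):
    """Real inner product; entries past the shorter list count as zero."""
    return sum(s * t for s, t in zip(a, b))

u = [3, 1, 4, 1, 5]
w = [2, 7, 1, 8]

left_inverse = S(S_star(u))  # S S* u = u
not_identity = S_star(S(u))  # S* S u = u - u_1 * delta_1
pairing_1 = inner(S(u), w)   # <S u, w>
pairing_2 = inner(u, S_star(w))  # <u, S* w>
```

The first composition returns `u` unchanged, the second zeroes out the first entry, and the two pairings agree, which is the adjoint relation (4.1) for this pair of vectors.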

On $L(\mathcal{H})$, as already noted, we interpret the composition of operators as a multiplicative operation. Viewing the adjoint as a unary operation leads to the following structure.

Definition 4.9. Let $X$ be a Banach space equipped with a binary operation denoted multiplicatively and a unary operation $^*$. $X$ is a $C^*$-algebra if it is

a Banach space, a ring, and for all $a, b \in X$ and $z \in \mathbb{C}$, the following hold.

(a) $\|ab\| \le \|a\| \|b\|$.
(b) $(a + b)^* = a^* + b^*$.
(c) $(za)^* = \bar{z} a^*$.
(d) $(ab)^* = b^* a^*$.
(e) $(a^*)^* = a$.
(f) If $a$ is invertible, then so is $a^*$ and $(a^*)^{-1} = (a^{-1})^*$.
(g) $\|a^* a\| = \|a\|^2$.

If $X$ has an identity element for multiplication, it is a $C^*$-algebra with identity. If multiplication in $X$ is commutative, $X$ is a commutative $C^*$-algebra.

Theorem 4.10. $L(\mathcal{H})$ is a $C^*$-algebra with identity.

Proof. $L(\mathcal{H})$ is a Banach space by Proposition 2.38. The algebraic properties are obvious, with the multiplicative identity $I$. The property $\|A^* A\| = \|A\|^2$ is proved by proving two inequalities. By Lemma 4.1,

$$\|A^* A\| \ge \sup_{\substack{u \in \mathcal{H} \\ \|u\| = 1}} |\langle u, A^* A u\rangle| = \sup_{\substack{u \in \mathcal{H} \\ \|u\| = 1}} |\langle Au, Au\rangle| = \sup_{\substack{u \in \mathcal{H} \\ \|u\| = 1}} \|Au\|^2 = \|A\|^2.$$

Conversely, $\|A^* A\| \le \|A^*\| \|A\| = \|A\|^2$, so $\|A^* A\| = \|A\|^2$.

$L(\mathcal{H})$ is the canonical example of a $C^*$-algebra and is the reason why we introduce them here, but we should point out a few additional examples. In all of the following examples, multiplication is pointwise multiplication of functions and the unary operation is complex conjugation, $f^*(x) = \overline{f(x)}$.

Example 4.11. If $K$ is compact, $C(K)$ is a commutative $C^*$-algebra with identity.

Example 4.12. The set $B_b(X)$ of bounded Borel functions on $X$, with the norm $\|f\| = \sup_{x \in X} |f(x)|$, is a commutative $C^*$-algebra with identity.

Example 4.13. $L^\infty(X, d\mu)$ is a commutative $C^*$-algebra with identity.

4.2. Strong and weak operator convergence

In $L(\mathcal{H})$, convergence in the operator norm is sometimes also called uniform convergence. In addition to uniform convergence, in $L(\mathcal{H})$, there are notions of strong operator convergence and weak operator convergence, which are the subject of this section.

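Definition 4.14. A sequence of operators $A_n \in L(\mathcal{H})$ converges strongly to $A \in L(\mathcal{H})$ if for every $v \in \mathcal{H}$, $A_n v \to A v$ as $n \to \infty$. We denote this by $A_n \xrightarrow{s} A$ or $\operatorname*{s-lim}_{n \to \infty} A_n = A$.

Definition 4.15. A sequence of operators $A_n \in L(\mathcal{H})$ converges weakly to $A \in L(\mathcal{H})$ if for every $u, v \in \mathcal{H}$, $\langle u, A_n v\rangle \to \langle u, A v\rangle$ as $n \to \infty$. We denote this by $A_n \xrightarrow{w} A$ or $\operatorname*{w-lim}_{n \to \infty} A_n = A$.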

Similarly to weak convergence in $\mathcal{H}$ (Section 3.5), the reader is warned that strong operator convergence and weak operator convergence in $L(\mathcal{H})$ are not defined with respect to a metric, so intuitively natural properties must be verified. For instance:

Lemma 4.16. If $A_n \xrightarrow{s} A$ and $A_n \xrightarrow{s} B$, then $A = B$.

Proof. This follows from $Av = \lim_{n \to \infty} A_n v = Bv$ for all $v \in \mathcal{H}$.

It is obvious that $\|A_n - A\| \to 0$ implies $A_n \xrightarrow{s} A$, and that $A_n \xrightarrow{s} A$ implies $A_n \xrightarrow{w} A$. Exercise 4.3 shows that all three types of operator convergence are equivalent for $\dim \mathcal{H} < \infty$, and Exercise 4.4 that they are distinct when $\dim \mathcal{H} = \infty$. Moreover, Exercise 4.4 shows that $A_n \xrightarrow{s} A$ does not necessarily imply $A_n^* \xrightarrow{s} A^*$. We will now focus on strong operator convergence, which will be central to our treatment of functional calculus in Chapter 5. We have already encountered strong operator convergence in Theorem 3.31; the series $\sum_{j=1}^\infty P_j$ considered there converges in the sense of strong operator convergence. It usually does not converge in norm (Exercise 4.5). Weak operator convergence will not play an important role in this text, and its properties will be left as exercises.
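One standard way to see that strong convergence is weaker than norm convergence uses powers of the shift $S$ from Example 4.6: $S^n u \to 0$ for each fixed $u \in \ell^2(\mathbb{N})$, while $\|S^n \delta_{n+1}\| = 1$ keeps $\|S^n\| \ge 1$. The sketch below only illustrates this numerically; the test vector $u_k = 1/k$, the truncation length `K`, and the function names are assumptions made for the illustration.

```python
import math

K = 20000  # truncation length (illustrative); entries past K are treated as 0

def shift_power(u, n):
    """Apply the left shift n times to a finitely supported sequence."""
    return u[n:]

def norm(u):
    """Euclidean (l^2) norm of a finite list."""
    return math.sqrt(sum(c * c for c in u))

def delta(k, length):
    """The basis vector delta_k, stored with `length` entries."""
    return [1.0 if j == k else 0.0 for j in range(1, length + 1)]

u = [1.0 / k for k in range(1, K + 1)]

# ||S^n u|| for a fixed vector u: decreases toward 0
norms = [norm(shift_power(u, n)) for n in (0, 10, 100, 1000)]

# ||S^n delta_{n+1}|| = ||delta_1|| = 1 for every n
witness = [norm(shift_power(delta(n + 1, n + 1), n)) for n in (10, 100, 1000)]
```

The list `norms` decays while `witness` stays equal to 1, so no single $n$ makes $S^n$ small on all unit vectors at once.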

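Proposition 4.17. If $A_n \xrightarrow{s} A$, then the sequence $\|A_n\|$ is bounded and

$$\|A\| \le \liminf_{n \to \infty} \|A_n\|. \tag{4.4}$$

Proof. For every $v \in \mathcal{H}$, the sequence $(A_n v)_{n=1}^\infty$ is convergent, so it is bounded. By the uniform boundedness principle, the sequence $(\|A_n\|)_{n=1}^\infty$ is bounded. The bound on $\|A\|$ follows from

$$\|Av\| = \lim_{n \to \infty} \|A_n v\| = \liminf_{n \to \infty} \|A_n v\| \le \liminf_{n \to \infty} \|A_n\| \|v\|.$$

Lemma 4.18. If $A_n \xrightarrow{s} A$ and $B_n \xrightarrow{s} B$, then $A_n B_n \xrightarrow{s} AB$.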

Proof. For any $v \in \mathcal{H}$, write

$$A_n B_n v - A B v = A_n (B_n - B) v + (A_n - A) B v.$$

Since the sequence $\|A_n\|$ is bounded and $B_n v \to B v$, it follows that $A_n (B_n - B) v \to 0$. Since $A_n \xrightarrow{s} A$, it follows that $(A_n - A) B v \to 0$. Adding these two statements completes the proof.

Finally, we show that for separable Hilbert spaces, strong operator convergence is metrizable on bounded subsets of $L(\mathcal{H})$, by revisiting the idea of Theorems 2.56 and 3.58:

Theorem 4.19. Let $\mathcal{H}$ be a separable Hilbert space with orthonormal basis $(e_k)_{k=1}^\infty$. Then

$$d(A, B) = \sum_{k=1}^\infty \min(2^{-k}, \|(A - B) e_k\|)$$

defines a metric on $L(\mathcal{H})$. Moreover, let $(A_n)_{n=1}^\infty$ be a sequence in $L(\mathcal{H})$ and let $A \in L(\mathcal{H})$. Then $A_n \xrightarrow{s} A$ if and only if

$$\sup_{n \in \mathbb{N}} \|A_n\| < \infty \qquad \text{and} \qquad \lim_{n \to \infty} d(A_n, A) = 0.$$

Proof. (a) By Lemma 2.55, $d$ is a semimetric. If $d(A, B) = 0$, then $(A - B) e_k = 0$ for all $k$, so by linearity $(A - B) v = 0$ for all $v \in \operatorname{span}\{e_k \mid k \in \mathbb{N}\}$ and finally by continuity $(A - B) v = 0$ for all $v \in \mathcal{H}$. Thus, $d$ is a metric on $L(\mathcal{H})$.

(b) If $A_n \xrightarrow{s} A$, then $A_n$ is a bounded sequence in $L(\mathcal{H})$ by Proposition 4.17. Moreover, $\|(A_n - A) e_k\| \to 0$ for every $k$, so by dominated convergence with dominating sequence $2^{-k}$ applied to the counting measure on $\mathbb{N}$,

$$\lim_{n \to \infty} d(A_n, A) = \sum_{k=1}^\infty \lim_{n \to \infty} \min\{2^{-k}, \|(A_n - A) e_k\|\} = 0.$$

Conversely, if $d(A_n, A) \to 0$, then $\min(2^{-k}, \|(A_n - A) e_k\|) \to 0$ for each $k$, so $A_n e_k \to A e_k$ for each $k$, and then by linearity, $A_n v \to A v$ for all $v \in \operatorname{span}\{e_k\}_{k=1}^\infty$. Since that set is dense and the sequence $A_n$ is bounded, by Lemma 2.46, $A_n \xrightarrow{s} A$.

As usual, this only makes strong operator convergence metrizable on bounded subsets of $L(\mathcal{H})$ and not on the entire space $L(\mathcal{H})$.

4.3. Invertibility, spectrum, and resolvents

An operator $A \in L(\mathcal{H})$ is called invertible if it has an inverse in $L(\mathcal{H})$, i.e., if there exists a bounded linear operator $A^{-1}$ such that $A A^{-1} = A^{-1} A = I$. Of course, the inverse has all the algebraic properties guaranteed in any ring: if $A$ is invertible, the inverse is unique, and if $A$ and $B$ are invertible, then so is $AB$, and $(AB)^{-1} = B^{-1} A^{-1}$. There is a simple criterion for invertibility:

Lemma 4.20. An operator $A \in L(\mathcal{H})$ is invertible if and only if $\operatorname{Ran} A$ is dense in $\mathcal{H}$ and

$$\inf_{\substack{u \in \mathcal{H} \\ u \neq 0}} \frac{\|Au\|}{\|u\|} > 0. \tag{4.5}$$

While the condition that $\operatorname{Ran} A$ is dense can be viewed as a weakening of surjectivity, (4.5) can be viewed as a strengthening of injectivity; this perspective will be apparent in the proof.

Proof of Lemma 4.20. Denote the infimum in (4.5) by $C$. If $A$ is invertible, then $\operatorname{Ran} A = \mathcal{H}$ is dense. Moreover, for any $u$,

$$\|u\| = \|A^{-1} A u\| \le \|A^{-1}\| \|Au\|,$$

which implies that $C \ge \|A^{-1}\|^{-1} > 0$.

Conversely, assume that $\operatorname{Ran} A$ is dense and (4.5) holds. Since $C > 0$, it follows that $\|Au\| \ge C \|u\| > 0$ whenever $u \neq 0$, so $\operatorname{Ker} A = \{0\}$. This implies injectivity of $A$, since $Au - Av = A(u - v) \neq 0$ whenever $u \neq v$. Moreover, for any convergent sequence $v_n \to v$ with $v_n = A u_n \in \operatorname{Ran} A$,

$$\|u_m - u_n\| \le \frac{1}{C} \|A(u_m - u_n)\| = \frac{1}{C} \|v_m - v_n\|,$$

so $(u_n)_{n=1}^\infty$ is a Cauchy sequence in $\mathcal{H}$. Thus, $(u_n)_{n=1}^\infty$ is convergent in $\mathcal{H}$. Continuity of $A$ implies $v = \lim_{n \to \infty} A u_n = A \lim_{n \to \infty} u_n$, so $v \in \operatorname{Ran} A$. Thus, $\operatorname{Ran} A$ is closed, and since it is dense, $\operatorname{Ran} A = \mathcal{H}$. Therefore, $A$ is a bijection. Finally, $C > 0$ implies that for any $u \in \mathcal{H}$, $\|u\| \le \frac{1}{C} \|Au\|$, so the inverse $A^{-1}$ is bounded.

A lot of information about $A$ can be obtained by considering the invertibility of $A - z = A - zI$ for $z \in \mathbb{C}$.

Definition 4.21. The spectrum of $A \in L(\mathcal{H})$ is the set

$$\sigma(A) = \{z \in \mathbb{C} \mid A - z \text{ is not invertible}\}.$$

Its complement $\mathbb{C} \setminus \sigma(A)$ is called the resolvent set. For $z \in \mathbb{C} \setminus \sigma(A)$, the inverse $R_A(z) = (A - z)^{-1}$ is called the resolvent of $A$ at $z$.

We warn the reader that some sources define the resolvent as the inverse of $z - A$, differing by a minus sign from our convention. Next, we recall some terminology from linear algebra:

Definition 4.22. Eigenvalues of $A$ are values of $z \in \mathbb{C}$ such that $\operatorname{Ker}(A - z) \neq \{0\}$. The subspace $\operatorname{Ker}(A - z)$ is called the eigenspace corresponding to $z$, and its nonzero elements are called eigenvectors.

Of course, since $\operatorname{Ker}(A - z) \neq \{0\}$ prevents invertibility of $A - z$, an eigenvalue of $A$ is always in the spectrum of $A$. For matrices, the converse also holds: it is a standard result in linear algebra that $z$ is an eigenvalue of $A$ if and only if $A - z$ is not invertible.

Example 4.23. If $A \in L(\mathbb{C}^n)$, then $\sigma(A)$ is the set of eigenvalues of $A$.

On infinite-dimensional Hilbert spaces, elements of the spectrum are not necessarily eigenvalues; we will see this in Examples 4.32 and 5.6. We now turn to some general properties of resolvents.

Proposition 4.24 (The first resolvent identity). For any $z, w \notin \sigma(A)$,

$$(A - z)^{-1} - (A - w)^{-1} = (z - w)(A - z)^{-1}(A - w)^{-1}. \tag{4.6}$$

It is easy to motivate the first resolvent identity: it corresponds to a partial fraction decomposition

$$\frac{1}{x - z} - \frac{1}{x - w} = \frac{z - w}{(x - z)(x - w)}.$$

Of course, this is not a proof of (4.6), but it is the first of many indications to come that one can successfully apply scalar functions to operators and expect some properties to carry over. This will be especially true later, when we focus on self-adjoint operators.

Proof of Proposition 4.24. The proof is the calculation

$$R_A(z)(z - w) R_A(w) = R_A(z)[(A - w) - (A - z)] R_A(w) = R_A(z)(A - w) R_A(w) - R_A(z)(A - z) R_A(w) = R_A(z) - R_A(w).$$

Corollary 4.25. Resolvents of $A$ commute, i.e., for any $z, w \notin \sigma(A)$,

$$(A - z)^{-1}(A - w)^{-1} = (A - w)^{-1}(A - z)^{-1}.$$

Proof. For $z = w$, this is trivial; for $z \neq w$ it follows from the first resolvent identity by interchanging $z$ and $w$ and comparing the two equalities.
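The first resolvent identity (4.6) and Corollary 4.25 can be sanity-checked for a $2 \times 2$ matrix, where the resolvent is given by the explicit inverse formula. This is a sketch under illustrative assumptions: the matrix `A` (with eigenvalues 2 and 3), the points `z` and `w` outside its spectrum, and all helper names are chosen for the example.

```python
def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def resolvent(A, z):
    """(A - z)^{-1} for a 2x2 matrix, via the adjugate formula."""
    a, b = A[0][0] - z, A[0][1]
    c, d = A[1][0], A[1][1] - z
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[2, 1], [0, 3]]
z, w = 1j, -1 + 0.5j

Rz, Rw = resolvent(A, z), resolvent(A, w)
product = matmul(Rz, Rw)

lhs = [[Rz[i][j] - Rw[i][j] for j in range(2)] for i in range(2)]
rhs = [[(z - w) * product[i][j] for j in range(2)] for i in range(2)]
swapped = matmul(Rw, Rz)  # should equal `product` by Corollary 4.25
```

Entrywise, `lhs` matches `rhs` and `product` matches `swapped` up to floating-point error.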

Just as composition of operators in $L(\mathcal{H})$ is denoted multiplicatively, nonnegative integer powers of an operator are defined inductively by $B^0 = I$ and $B^k = B B^{k-1}$. This appears in the following expansion, which is merely the geometric series reinvented in the operator setting.

Theorem 4.26 (Neumann series). If $B \in L(\mathcal{H})$ and $\|B\| < 1$, then $I - B$ is invertible and the resolvent is given by the norm-convergent series

$$(I - B)^{-1} = \sum_{k=0}^\infty B^k.$$

Moreover, the norm of the resolvent is bounded by

$$\|(I - B)^{-1}\| \le \frac{1}{1 - \|B\|}.$$

Proof. By Lemma 2.6, since $\|B^k\| \le \|B\|^k$, the Neumann series

$$T = \sum_{k=0}^\infty B^k$$

is norm-convergent and defines an operator $T$ with $\|T\| \le 1/(1 - \|B\|)$. By using telescoping series, both $T(I - B)$ and $(I - B)T$ are computed to be equal to

$$\lim_{n \to \infty} \sum_{k=0}^{n-1} (B^k - B^{k+1}) = \lim_{n \to \infty} (I - B^n) = I,$$

where the last step uses $\|B^n\| \le \|B\|^n \to 0$. Thus, $T = (I - B)^{-1}$.
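To see the Neumann series at work, one can sum its partial sums for a small matrix of norm below 1 and multiply by $I - B$. The following is a minimal sketch: the matrix `B`, the number of terms (60), and the helper names are illustrative choices.

```python
def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matadd(A, B):
    """Entrywise sum of two matrices."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

B = [[0.2, 0.1], [0.0, 0.3]]  # operator norm well below 1
I = identity(2)

T = identity(2)      # partial sum, starting with B^0 = I
power = identity(2)  # current power B^k
for _ in range(60):
    power = matmul(power, B)
    T = matadd(T, power)

I_minus_B = [[I[i][j] - B[i][j] for j in range(2)] for i in range(2)]
product = matmul(I_minus_B, T)  # equals I - B^61, which is close to I
```

The telescoping in the proof is visible here: `product` differs from the identity only by the negligible remainder $B^{61}$.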

This allows us to consider invertibility perturbatively and to start studying $(A - z)^{-1}$ as an $L(\mathcal{H})$-valued function of $z \in \mathbb{C} \setminus \sigma(A)$. We will use the discussion of analytic Banach-space valued functions (Definition 2.68), which relied on the property (2.27); this property holds in any Banach space, but more concretely, it can be manually proved for the Banach space $B = L(\mathcal{H})$ (Exercise 4.14). The next two statements are applications of the Neumann series, one by viewing $A - z$ as a perturbation of $A - z_0$ for $z$ near $z_0 \in \mathbb{C} \setminus \sigma(A)$, and the other by a perturbation around $\infty$.

Proposition 4.27. The spectrum $\sigma(A)$ is a closed set in $\mathbb{C}$, and the resolvent $(A - z)^{-1}$ is an $L(\mathcal{H})$-valued analytic function on $z \in \mathbb{C} \setminus \sigma(A)$.

Proof. Fix $z_0 \in \mathbb{C} \setminus \sigma(A)$ and denote $r = \|R_A(z_0)\|^{-1}$. For any $z \in \mathbb{C}$, we can write

$$A - z = (A - z_0) - (z - z_0) = [I - (z - z_0) R_A(z_0)](A - z_0). \tag{4.7}$$

For $z \in D_r(z_0)$, we have the norm estimate $\|(z - z_0) R_A(z_0)\| < 1$, so (4.7) can be inverted by using the Neumann series,

$$R_A(z) = R_A(z_0)[I - (z - z_0) R_A(z_0)]^{-1} = \sum_{k=0}^\infty (z - z_0)^k R_A(z_0)^{k+1}.$$

This shows that $D_r(z_0) \subset \mathbb{C} \setminus \sigma(A)$ and that the resolvent is locally represented by a convergent power series. Thus, $R_A(z)$ is analytic on $\mathbb{C} \setminus \sigma(A)$.

Proposition 4.28. For any $A \in L(\mathcal{H})$, $\sigma(A)$ is a nonempty compact subset of $\mathbb{C}$ and $z \in \sigma(A)$ implies $|z| \le \|A\|$.

Proof. If $|z| > \|A\|$, then $\|z^{-1} A\| < 1$, so the operator $A - z = -z(I - z^{-1} A)$ can be inverted by applying the Neumann series to $I - z^{-1} A$. Explicitly, this gives the bounded inverse

$$R_A(z) = -z^{-1}(I - z^{-1} A)^{-1} = -\sum_{k=0}^\infty \frac{A^k}{z^{k+1}} \tag{4.8}$$

for any $|z| > \|A\|$, with the norm estimate

$$\|R_A(z)\| \le \frac{1}{|z|(1 - \|z^{-1} A\|)} = \frac{1}{|z| - \|A\|}.$$

Thus, $A - z$ is invertible whenever $|z| > \|A\|$. In particular, $\sigma(A)$ is bounded. The norm estimate also implies that $\|R_A(z)\| \to 0$ as $|z| \to \infty$. If $\sigma(A)$ was the empty set, $R_A(z)$ would be an entire function. Since $\|R_A(z)\| \to 0$ as $|z| \to \infty$, by Liouville's theorem (Proposition 2.70), this would imply $R_A(z) = 0$, which is a contradiction with $R_A(z)(A - z) = I$.

The previous proof can be improved by a closer look at where the Neumann series converges, which leads to Gelfand's spectral radius formula.

Definition 4.29. The spectral radius of $A \in L(\mathcal{H})$ is

$$r(A) = \max_{z \in \sigma(A)} |z|.$$

Theorem 4.30. For any $A \in L(\mathcal{H})$,

$$r(A) = \lim_{n \to \infty} \|A^n\|^{1/n}.$$

The proof of this result requires a lemma about subadditive sequences.

Lemma 4.31. Let $(x_n)_{n \in \mathbb{N}}$ be a sequence in $[-\infty, \infty)$ such that $x_{n+m} \le x_n + x_m$ for all $n, m \in \mathbb{N}$. Then $\lim_{n \to \infty} x_n / n$ exists in $[-\infty, \infty)$ and

$$\lim_{n \to \infty} \frac{x_n}{n} = \inf_{n \in \mathbb{N}} \frac{x_n}{n}.$$

Proof. For any $n \in \mathbb{N}$, by induction in $k$, subadditivity of the sequence implies $x_{kn+r} \le k x_n + x_r$. Thus, for any $0 \le r \le n - 1$,

$$\limsup_{k \to \infty} \frac{x_{kn+r}}{kn + r} \le \limsup_{k \to \infty} \frac{k x_n + x_r}{kn + r} = \frac{x_n}{n}.$$

Combining the subsequences $(x_{kn+r})_{k=0}^\infty$ for $r = 0, 1, \dots, n - 1$ gives

$$\limsup_{m \to \infty} \frac{x_m}{m} \le \frac{x_n}{n}.$$

Since this holds for any $n \in \mathbb{N}$, it follows that

$$\limsup_{m \to \infty} \frac{x_m}{m} \le \inf_{n \in \mathbb{N}} \frac{x_n}{n}.$$

Trivially, $\limsup \ge \liminf \ge \inf$, which completes the proof.
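A concrete subadditive sequence makes the conclusion of Lemma 4.31 visible. The sketch below uses $x_n = n + \sqrt{n}$, an illustrative choice not taken from the text; it is subadditive because $\sqrt{n + m} \le \sqrt{n} + \sqrt{m}$, and $x_n / n = 1 + 1/\sqrt{n}$ decreases to the infimum $1$.

```python
import math

def x(n):
    """Illustrative subadditive sequence x_n = n + sqrt(n)."""
    return n + math.sqrt(n)

# Spot-check subadditivity x_{n+m} <= x_n + x_m on a small grid
# (the tiny tolerance only guards against floating-point rounding)
subadditive = all(x(n + m) <= x(n) + x(m) + 1e-12
                  for n in range(1, 60) for m in range(1, 60))

# x_n / n along a sparse set of indices; approaches inf_n x_n / n = 1
ratios = {n: x(n) / n for n in (1, 10, 100, 10**4, 10**6)}
```

The ratios decrease from 2 toward 1, the infimum, which in this example is approached but never attained.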

Proof of Theorem 4.30. Since $\|A^{m+n}\| \le \|A^m\| \|A^n\|$ for all $m, n \in \mathbb{N}$, the sequence $\log \|A^n\|$ is subadditive, so $\lim_{n \to \infty} \|A^n\|^{1/n} = \inf_{n \in \mathbb{N}} \|A^n\|^{1/n}$. For $|z| > \lim_{n \to \infty} \|A^n\|^{1/n}$, we have

$$\limsup_{n \to \infty} \left\| \frac{A^n}{z^{n+1}} \right\|^{1/n} < 1,$$

so by the root test, the series (4.8) is absolutely convergent and gives the resolvent, so $z \notin \sigma(A)$. Thus,

$$r(A) \le \lim_{n \to \infty} \|A^n\|^{1/n}. \tag{4.9}$$

For the opposite inequality, consider the substitution $z = 1/w$ to expand the resolvent around $\infty$: the resolvent is given by the convergent power series

$$(A - 1/w)^{-1} = -\sum_{k=0}^\infty w^{k+1} A^k$$

in a punctured neighborhood of $w = 0$. This series has a removable singularity at $w = 0$ and, by Corollary 2.69, its radius of convergence is at least $1/r(A)$, which implies

$$\frac{1}{\limsup_{k \to \infty} \|A^k\|^{1/k}} \ge \frac{1}{r(A)}.$$

Combining this with (4.9) completes the proof.
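Gelfand's formula can be watched converging for a non-normal $2 \times 2$ matrix whose norm is much larger than its spectral radius. The sketch below is illustrative only: the matrix `A` (upper triangular, so its only eigenvalue is the diagonal entry 0.5) is an arbitrary choice, and the Frobenius norm is swapped in for the operator norm. In finite dimensions all norms are equivalent, so the limit of $\|A^n\|^{1/n}$ is the same.

```python
import math

def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def frobenius(A):
    """Frobenius norm, used here as a stand-in for the operator norm."""
    return math.sqrt(sum(abs(a) ** 2 for row in A for a in row))

A = [[0.5, 1.0], [0.0, 0.5]]  # spectral radius 0.5, norm larger than 1

estimates = {}
power = [[1.0, 0.0], [0.0, 1.0]]
for n in range(1, 401):
    power = matmul(power, A)  # power == A^n
    if n in (1, 10, 100, 400):
        estimates[n] = frobenius(power) ** (1.0 / n)
```

The estimates start above 1 and decrease slowly toward 0.5; the slow rate reflects the polynomial factor $n$ in the off-diagonal entry of $A^n$.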

Example 4.32. The operators $S$, $S^*$ on $\ell^2(\mathbb{N})$ given by (4.2) and (4.3) obey $\sigma(S) = \sigma(S^*) = \{z \in \mathbb{C} \mid |z| \le 1\}$. The set of eigenvalues of $S$ is $\{z \in \mathbb{C} \mid |z| < 1\}$.

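Proof. Since $\|S\| = \|S^*\| = 1$, $\sigma(S), \sigma(S^*) \subset \{z \in \mathbb{C} \mid |z| \le 1\}$. Note that $Sv = zv$ if and only if $v_n = z^{n-1} v_1$ for all $n$. Such $v \in \ell^2(\mathbb{N})$, $v \neq 0$, exist if and only if $|z| < 1$, so this is the set of eigenvalues of $S$. Since $\sigma(S)$ is closed, it follows that $\sigma(S) = \{z \in \mathbb{C} \mid |z| \le 1\}$. Likewise, for $|z| < 1$, since $\operatorname{Ker}(S - z) = (\operatorname{Ran}(S^* - \bar{z}))^\perp$, it follows that $\overline{\operatorname{Ran}(S^* - \bar{z})} \neq \mathcal{H}$, so $\bar{z} \in \sigma(S^*)$. Since $\sigma(S^*)$ is closed, it follows that $\sigma(S^*) = \{z \in \mathbb{C} \mid |z| \le 1\}$.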

4.4. Polynomials of operators

Let $A \in L(\mathcal{H})$. For nonnegative integers $k$, we have already considered the $k$th power of $A$, which is defined inductively by $A^0 = I$ and $A^k = A A^{k-1}$. The algebra of polynomials with complex coefficients is denoted $\mathbb{C}[x]$; for any $p \in \mathbb{C}[x]$, $p(x) = \sum_{k=0}^n c_k x^k$, we define $p(A) = \sum_{k=0}^n c_k A^k$. The algebraic properties of this notion are apparent. For fixed $A$, the map $p \mapsto p(A)$ is a homomorphism of algebras, i.e., it preserves linear operations, multiplication, and the multiplicative identity. It also obeys $p(A)^* = \bar{p}(A^*)$. What about the spectrum of $p(A)$?

Theorem 4.33 (Spectral mapping theorem for polynomials). For any $A \in L(\mathcal{H})$ and $p \in \mathbb{C}[x]$,

$$\sigma(p(A)) = \{p(\lambda) \mid \lambda \in \sigma(A)\}.$$

Proof. Assume that $\kappa \notin \{p(\lambda) \mid \lambda \in \sigma(A)\}$. This means that the polynomial $p(x) - \kappa$ has no zeros in $\sigma(A)$; thus, its factorization into linear factors is of the form

$$p(x) - \kappa = \alpha \prod_{j=1}^n (x - \lambda_j),$$

where $\alpha \in \mathbb{C} \setminus \{0\}$ and $\lambda_j \in \mathbb{C} \setminus \sigma(A)$ for all $j$. In particular,

$$p(A) - \kappa = \alpha \prod_{j=1}^n (A - \lambda_j).$$

Since all the $A - \lambda_j$ are invertible and $\alpha \neq 0$, their product $p(A) - \kappa$ is invertible. It follows that $\kappa \notin \sigma(p(A))$.

Conversely, assume that $\kappa = p(\lambda)$ for some $\lambda \in \sigma(A)$. Then $p(x) - p(\lambda)$ is divisible by $x - \lambda$, and we will use the polynomial factorizations

$$p(x) - p(\lambda) = q(x)(x - \lambda) = (x - \lambda) q(x).$$

By Lemma 4.20, $\lambda \in \sigma(A)$ implies that $A - \lambda$ is not surjective or

$$\inf_{\substack{u \in \mathcal{H} \\ \|u\| = 1}} \|(A - \lambda) u\| = 0. \tag{4.10}$$

If A−λ is not surjective, then p(A)−p(λ) = (A−λ)q(A) cannot be surjective either. If (4.10) holds, then since q(A) is a bounded operator, inf (p(A) − p(λ))u = inf q(A)(A − λ)u = 0,

u∈H u=1

u∈H u=1

so p(A) − p(λ) is not invertible. In both cases, we have proved p(λ) ∈ σ(p(A)). We say that operators A, B ∈ L(H) commute if AB = BA. If two operators commute, so do their polynomials: Lemma 4.34. If AB = BA, then p(A)q(B) = q(B)p(A) for all p, q ∈ C[x]. Proof. The set M = {T ∈ L(H) | T B = BT } is closed under multiplication because if T1 , T2 ∈ M , then T1 T2 B = T1 BT2 = BT1 T2 . The set M contains the identity operator, and by assumption, it contains A. Thus, by induction, it contains all powers Ak . The set M is closed under linear operations, so it contains all p(A). Thus, AB = BA implies p(A)B = Bp(A) for all p ∈ C[x]. Applying this argument again, p(A)B = Bp(A) implies p(A)q(B) = q(B)p(A) for all q ∈ C[x]. The polynomial functional calculus deﬁned here is very robust, since it allows arbitrary A ∈ L(H). However, it treats polynomials as algebraic objects rather than as functions on the spectrum. This distinction is illustrated in Exercise 4.16. In the next chapter, we will deﬁne a substantial generalization of f (A) to bounded Borel functions f on the spectrum at the cost of specializing to self-adjoint operators A.
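As a finite-dimensional illustration of the spectral mapping theorem for polynomials, the following Python sketch evaluates $p(A)$ for one upper triangular $2 \times 2$ matrix and compares the spectrum of $p(A)$ with the image of $\sigma(A)$ under $p$. The matrix, the polynomial, and the helper names (`matmul`, `poly_of_matrix`, `poly_of_number`) are illustrative assumptions and do not come from the text; the sketch relies on the fact that the spectrum of a triangular matrix is the set of its diagonal entries.

```python
# Sketch: check sigma(p(A)) = p(sigma(A)) for one upper triangular 2x2 matrix.
# The matrix A, the polynomial p, and all helper names are illustrative choices.

def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def poly_of_matrix(coeffs, A):
    """Evaluate p(A) = sum_k c_k A^k, with coeffs = [c_0, c_1, ...]."""
    result = [[0, 0], [0, 0]]
    power = [[1, 0], [0, 1]]              # A^0 = I
    for c in coeffs:
        result = [[result[i][j] + c * power[i][j] for j in range(2)]
                  for i in range(2)]
        power = matmul(A, power)          # A^k = A A^{k-1}
    return result

def poly_of_number(coeffs, x):
    """Evaluate p(x) = sum_k c_k x^k for a scalar x."""
    return sum(c * x**k for k, c in enumerate(coeffs))

A = [[2, 5], [0, -1]]                     # upper triangular, so sigma(A) = {2, -1}
coeffs = [1, 0, 3]                        # p(x) = 1 + 3x^2

pA = poly_of_matrix(coeffs, A)
# p(A) is again upper triangular, so its spectrum is its set of diagonal entries.
spectrum_pA = {pA[0][0], pA[1][1]}
mapped = {poly_of_number(coeffs, lam) for lam in (2, -1)}
print(pA, spectrum_pA, mapped)
```

Integer entries are used so that the comparison of the two sets is exact rather than subject to rounding.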

4.5. Invariant subspaces and direct sums of operators

In this section, we will introduce direct sums of bounded operators on Hilbert spaces and, as a dual point of view, decompositions of some operators into smaller blocks. The constructions considered here generalize the notion of block diagonal matrices from linear algebra; however, we are working with operators on arbitrary Hilbert spaces, and we consider countable direct sums $\bigoplus_{n=1}^N$ with a finite or infinite number of terms ($N$ can be finite or $\infty$, where $\infty$ denotes countably many summands). In the general context of vector spaces, for any linear maps $A_n : \mathcal{H}_n \to \mathcal{K}_n$, one can define a linear map $\bigoplus_{n=1}^N A_n$ by
$$\Bigl(\bigoplus_{n=1}^N A_n\Bigr) (v_n)_{n=1}^N = (A_n v_n)_{n=1}^N, \tag{4.11}$$
as a map from one Cartesian product of vector spaces to another. To make this a map between direct sums of Hilbert spaces, we need to ensure that finiteness of norm is preserved:

Proposition 4.35. Given linear maps $A_n : \mathcal{H}_n \to \mathcal{K}_n$ between Hilbert spaces, $n = 1, \dots, N$, (4.11) defines a bounded linear operator
$$\bigoplus_{n=1}^N A_n : \bigoplus_{n=1}^N \mathcal{H}_n \to \bigoplus_{n=1}^N \mathcal{K}_n$$
if and only if each $A_n$ is bounded and $\sup_n \|A_n\| < \infty$. In this case,
$$\Bigl\| \bigoplus_{n=1}^N A_n \Bigr\| = \sup_n \|A_n\|.$$

Proof. Assume $A_n \in \mathcal{L}(\mathcal{H}_n, \mathcal{K}_n)$ and $\sup_n \|A_n\| < \infty$. For $v \in \bigoplus_{n=1}^N \mathcal{H}_n$,
$$\Bigl\| \Bigl(\bigoplus_{n=1}^N A_n\Bigr) v \Bigr\|^2 = \sum_{n=1}^N \|A_n v_n\|^2 \le \sup_n \|A_n\|^2 \sum_{n=1}^N \|v_n\|^2$$
shows that $\bigoplus_{n=1}^N A_n$ is a map between direct sums of Hilbert spaces with norm $\bigl\|\bigoplus_{n=1}^N A_n\bigr\| \le \sup_n \|A_n\|$. For the converse, write $A = \bigoplus_{n=1}^N A_n$, fix $k$, and note that for all $v$ with $v_n = 0$ for all $n \neq k$, $\|Av\| = \|A_k v_k\|$. Since such $v$ obey $\|v\| = \|v_k\|$, taking the supremum over normalized $v_k \in \mathcal{H}_k$ shows that $\|A\| \ge \|A_k\|$ for all $k$. Since $k$ is arbitrary, we conclude $\|A\| \ge \sup_k \|A_k\|$.

Definition 4.36. The operator $\bigoplus_{n=1}^N A_n$ is called the direct sum of operators $A_n \in \mathcal{L}(\mathcal{H}_n, \mathcal{K}_n)$.

Lemma 4.37. If all the $A_n$ are unitary, then $\bigoplus_{n=1}^N A_n$ is unitary.

Proof. If all $A_n$ are unitary, then $\bigoplus_{n=1}^N A_n$ is norm-preserving by a direct calculation and has the inverse $\bigoplus_{n=1}^N A_n^{-1}$.

We now specialize to the case $\mathcal{K}_n = \mathcal{H}_n$ and describe how direct sums behave with respect to adjoints and invertibility.

Proposition 4.38. Let $A_n \in \mathcal{L}(\mathcal{H}_n)$ for $n = 1, \dots, N$, and let $\sup_n \|A_n\| < \infty$. If $A = \bigoplus_{n=1}^N A_n$, then $A^* = \bigoplus_{n=1}^N A_n^*$. In particular, if all $A_n$ are self-adjoint, their direct sum is self-adjoint.

Proof. This follows from the calculation that, for any $v, w \in \bigoplus_{n=1}^N \mathcal{H}_n$,
$$\langle v, Aw \rangle = \sum_{n=1}^N \langle v_n, A_n w_n \rangle = \sum_{n=1}^N \langle A_n^* v_n, w_n \rangle = \bigl\langle (A_n^* v_n)_{n=1}^N, w \bigr\rangle.$$


Proposition 4.39. Let $A_n \in \mathcal{L}(\mathcal{H}_n)$ for $n = 1, \dots, N$, and let $\sup_n \|A_n\| < \infty$. For the direct sum $A = \bigoplus_{n=1}^N A_n$, $z \notin \sigma(A)$ if and only if $z \notin \sigma(A_n)$ for all $n$ and $\sup_n \|(A_n - z)^{-1}\| < \infty$. For such $z$,
$$(A - z)^{-1} = \bigoplus_{n=1}^N (A_n - z)^{-1}. \tag{4.12}$$

Proof. For $v, w \in \bigoplus_{n=1}^N \mathcal{H}_n$, we have $(A - z)v = w$ if and only if
$$(A_n - z)v_n = w_n \quad \forall n.$$
This system has a unique solution for every $w$ if and only if each $A_n - z$ is a bijection. In this case, the unique solution of the system is $v_n = (A_n - z)^{-1} w_n$. Thus, $A - z$ has a bounded inverse if and only if each $A_n - z$ has a bounded inverse and the linear map $\bigoplus_{n=1}^N (A_n - z)^{-1}$ is bounded; moreover, in this case, (4.12) holds.

In particular, since $\sigma(A_n) \subset \sigma(A)$ for all $n$ and $\sigma(A)$ is closed,
$$\overline{\bigcup_{n=1}^N \sigma(A_n)} \subset \sigma(A).$$
It is left as an exercise to show that this can be a strict inclusion if $N = \infty$.

Recall that direct sums of Hilbert spaces are constructed as new Hilbert spaces but can also be used for a decomposition of a Hilbert space into its subspaces. Similarly, direct sums of operators were defined as a way to construct new operators but are often used to express a decomposition of an operator into blocks. The existence of such a decomposition depends on the existence of so-called invariant subspaces.

Definition 4.40. A subspace $S \subset \mathcal{H}$ is invariant for $A \in \mathcal{L}(\mathcal{H})$ if $v \in S$ implies $Av \in S$.

Lemma 4.41. Let $S$ be a subspace of $\mathcal{H}$ which is invariant for $A \in \mathcal{L}(\mathcal{H})$. Then:
(a) $\overline{S}$ is invariant for $A$;
(b) $S^\perp$ is invariant for $A^*$.

Proof. Any $v \in \overline{S}$ can be written as a limit $v = \lim_{n\to\infty} v_n$ with $v_n \in S$. By continuity of $A$, $Av = \lim_{n\to\infty} Av_n$, so $Av_n \in S$ implies $Av \in \overline{S}$.

Let $w \in S^\perp$. For any $v \in S$, we have $Av \in S$, and therefore $\langle A^* w, v \rangle = \langle w, Av \rangle = 0$. Thus, $A^* w \in S^\perp$.

If $S$ is a closed invariant subspace for $A$, then the restriction of $A$ to $S$, denoted $A|_S$, is a bounded linear operator on the Hilbert space $S$.


Proposition 4.42. If $A \in \mathcal{L}(\mathcal{H})$ and $\mathcal{H}_n$ are closed invariant subspaces for $A$ such that $\mathcal{H} = \bigoplus_{n=1}^N \mathcal{H}_n$, then
$$A = \bigoplus_{n=1}^N (A|_{\mathcal{H}_n}).$$

Proof. The operators $A|_{\mathcal{H}_n}$ are uniformly bounded, so their direct sum is an element of $\mathcal{L}(\mathcal{H})$. By definition, it agrees with $A$ on each $\mathcal{H}_n$, so by linearity and continuity, the two are equal on $\mathcal{H}$.

Finally, the following proposition considers norm and strong convergence for direct sums of operators:

Proposition 4.43. Consider bounded operators $A_{n,k} \in \mathcal{L}(\mathcal{H}_n)$, $1 \le n \le N$, $k \in \mathbb{N} \cup \{\infty\}$. Then the following hold.
(a) If $\sup_n \|A_{n,k} - A_{n,\infty}\| \to 0$ as $k \to \infty$, then $\bigoplus_{n=1}^N A_{n,k} \to \bigoplus_{n=1}^N A_{n,\infty}$.
(b) If $\sup_n \sup_{k \in \mathbb{N}} \|A_{n,k}\| < \infty$ and $A_{n,k} \xrightarrow{s} A_{n,\infty}$ as $k \to \infty$ for all $n$, then $\bigoplus_{n=1}^N A_{n,k} \xrightarrow{s} \bigoplus_{n=1}^N A_{n,\infty}$.

Proof. (a) This follows from
$$\Bigl\| \bigoplus_{n=1}^N A_{n,k} - \bigoplus_{n=1}^N A_{n,\infty} \Bigr\| = \sup_n \|A_{n,k} - A_{n,\infty}\|.$$
(b) The operators are uniformly bounded, so it suffices to prove convergence on a dense set of vectors. If $v = (v_n)_{n=1}^N$ is such that $v_n = 0$ for all but finitely many $n$, then $\bigoplus_{n=1}^N A_{n,k} v \to \bigoplus_{n=1}^N A_{n,\infty} v$ follows from $A_{n,k} v_n \to A_{n,\infty} v_n$.

We say that an operator $A \in \mathcal{L}(\mathcal{H})$ has an orthonormal basis of eigenvectors if there is an orthonormal basis $\{v_n\}_{n=1}^N$ of $\mathcal{H}$ such that each $v_n$ is an eigenvector of $A$. Reformulating, operators with an orthonormal basis of eigenvectors are precisely those that can be represented as a direct sum of multiplication operators on one-dimensional subspaces $\{c v_n \mid c \in \mathbb{C}\}$. We will soon focus on self-adjoint operators, which do not always have eigenvectors; however, the direct sum formalism will still be used to decompose self-adjoint operators into simpler blocks of a standard form.
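The norm identity $\bigl\|\bigoplus_n A_n\bigr\| = \sup_n \|A_n\|$ from Proposition 4.35 can be checked by hand on block diagonal matrices. The Python sketch below is a minimal illustration under stated assumptions: the three real symmetric $2 \times 2$ blocks, the sample vectors, and all function names are hypothetical choices, and the norm of each block is computed from the closed-form eigenvalues of a symmetric $2 \times 2$ matrix.

```python
import math

def sym2_norm(a, b, d):
    """Operator norm of the real symmetric matrix [[a, b], [b, d]]:
    the largest absolute value of its two eigenvalues."""
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    return max(abs(mean + radius), abs(mean - radius))

def apply_block(block, v):
    """Apply the symmetric block (a, b, d) to the vector v in R^2."""
    a, b, d = block
    return (a * v[0] + b * v[1], b * v[0] + d * v[1])

def apply_direct_sum(blocks, vs):
    """Blockwise action (v_1, ..., v_N) -> (A_1 v_1, ..., A_N v_N), as in (4.11)."""
    return [apply_block(B, v) for B, v in zip(blocks, vs)]

def norm(vs):
    """Norm in the direct sum: square root of the sum of squared block norms."""
    return math.sqrt(sum(x * x for v in vs for x in v))

# Illustrative blocks, each stored as (a, b, d) for [[a, b], [b, d]].
blocks = [(1.0, 2.0, 1.0), (0.5, 0.0, -0.25), (0.0, 1.0, 0.0)]
block_norms = [sym2_norm(*B) for B in blocks]
sup_norm = max(block_norms)

# A generic vector obeys ||Av|| <= sup_n ||A_n|| ||v||.
v = [(1.0, -2.0), (3.0, 0.5), (-1.0, 4.0)]
ratio_generic = norm(apply_direct_sum(blocks, v)) / norm(v)

# A vector supported in the first block, along its top eigenvector (1, 1),
# attains the bound, which is the "converse" half of the proof.
w = [(1.0, 1.0), (0.0, 0.0), (0.0, 0.0)]
ratio_extremal = norm(apply_direct_sum(blocks, w)) / norm(w)
print(block_norms, ratio_generic, ratio_extremal)
```

The extremal vector mirrors the argument in the proof: vectors supported in a single summand $\mathcal{H}_k$ already detect $\|A_k\|$.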

4.6. Compact operators

In this section, we consider compact operators, a subclass of bounded operators with properties reminiscent of the finite-dimensional case, and we describe an important class of examples known as compact integral operators. In the Hilbert space setting, compact operators can be defined by the following convergence-improving property:

Definition 4.44. An operator $K \in \mathcal{L}(\mathcal{H})$ is called compact if for every weakly convergent sequence $u_n \xrightarrow{w} u$, $Ku_n \to Ku$.

Recall that, if $\dim \mathcal{H} < \infty$, weak convergence is equivalent to strong convergence, so every bounded operator is compact. If $\mathcal{H}$ is infinite dimensional, this is no longer the case: the identity operator $I$ is not compact, since there exist $u_n \xrightarrow{w} 0$ such that $u_n \not\to 0$.

Compactness of an operator can be characterized in terms of the image of the unit ball:

Proposition 4.45. Denote by $B = \{u \in \mathcal{H} \mid \|u\| \le 1\}$ the closed unit ball in a Hilbert space $\mathcal{H}$. For $K \in \mathcal{L}(\mathcal{H})$, the following are equivalent:
(a) $K$ is a compact operator;
(b) the image $K(B) = \{Ku \mid u \in B\}$ is a precompact subset of $\mathcal{H}$;
(c) the image $K(B) = \{Ku \mid u \in B\}$ is a compact set.

Proof. (a) $\implies$ (c): For any sequence $v_n \in K(B)$, we must prove existence of a convergent subsequence in $K(B)$. Write $v_n = Ku_n$, $u_n \in B$. By Theorem 3.57, the sequence $(u_n)_{n=1}^\infty$ has a weakly convergent subsequence $u_{n_k} \xrightarrow{w} u$ as $k \to \infty$, and from $\|u\| \le \liminf_{k\to\infty} \|u_{n_k}\|$, we conclude $u \in B$. By compactness of $K$, this implies $v_{n_k} = Ku_{n_k} \to Ku \in K(B)$.

(c) $\implies$ (b): This is trivial.

(b) $\implies$ (a): For any weakly convergent sequence $u_n \xrightarrow{w} u$, we must prove $Ku_n \to Ku$. Any weakly convergent sequence is bounded, so by rescaling, we assume $\|u_n\| \le 1$ for all $n$. Since $K$ is bounded, $u_n \xrightarrow{w} u$ implies $Ku_n \xrightarrow{w} Ku$ by Lemma 4.8. Thus, any strongly convergent subsequence of $Ku_n$ must converge to $Ku$; otherwise, it would weakly converge to two different limits. In other words, the sequence $Ku_n \in K(B)$ has only one possible limit point $Ku$ in $\mathcal{H}$. By precompactness of $K(B)$, this implies $Ku_n \to Ku$.

An operator $F \in \mathcal{L}(\mathcal{H})$ is called finite-rank if $\operatorname{Ran} F$ is finite dimensional.

Corollary 4.46. Any finite-rank operator is compact.

Proof. If $F$ is finite-rank, then $F(B)$ is a bounded subset of the finite-dimensional space $\operatorname{Ran} F$, so $\overline{F(B)}$ is compact; thus, $F$ is compact.

Proposition 4.47. The set of compact operators is closed in $\mathcal{L}(\mathcal{H})$.


Proof. Let $K_k$ be compact operators such that $K_k \to K$ as $k \to \infty$. For any weakly convergent sequence $u_n \xrightarrow{w} u$, $K_k u_n \to K_k u$ for any $k$, so from
$$\|Ku_n - Ku\| \le \|Ku_n - K_k u_n\| + \|K_k u_n - K_k u\| + \|K_k u - Ku\|$$
we conclude
$$\limsup_{n\to\infty} \|Ku_n - Ku\| \le 2 \|K_k - K\| \sup_n \|u_n\|.$$
Since $K_k \to K$ as $k \to \infty$, $\|K_k - K\|$ can be made arbitrarily small, so $\limsup_{n\to\infty} \|Ku_n - Ku\| = 0$. Thus, $Ku_n \to Ku$ as $n \to \infty$, so $K$ is compact.

By the previous two statements, if an operator can be approximated in norm by finite-rank operators, then it is compact. We will prove the converse in Theorem 5.25.

Obviously, linear combinations of compact operators are compact. In fact, compact operators form an ideal in $\mathcal{L}(\mathcal{H})$:

Lemma 4.48. Let $K, A \in \mathcal{L}(\mathcal{H})$. If $K$ is compact, then $AK$ and $KA$ are compact.

Proof. If $u_n \xrightarrow{w} u$, then $Ku_n \to Ku$ so $AKu_n \to AKu$. Similarly, if $u_n \xrightarrow{w} u$, then $Au_n \xrightarrow{w} Au$ by Lemma 4.8, so $KAu_n \to KAu$.

Lemma 4.49. For $K \in \mathcal{L}(\mathcal{H})$, $K$ is compact if and only if $K^* K$ is compact.

Proof. If $K$ is compact, then so is $K^* K$ by the previous lemma. Conversely, if $K^* K$ is compact and $u_n \xrightarrow{w} u$, then $u_n - u \xrightarrow{w} 0$ so $K^* K(u_n - u) \to 0$. Since $(u_n)_{n=1}^\infty$ is bounded, this implies
$$\|K(u_n - u)\|^2 = \langle u_n - u, K^* K(u_n - u) \rangle \to 0,$$
i.e., $Ku_n \to Ku$.

Lemma 4.50. For $K \in \mathcal{L}(\mathcal{H})$, $K$ is compact if and only if $K^*$ is compact.

Proof. If $K^*$ is compact, then $K^* K$ is compact, so $K$ is compact by the previous two lemmas. Analogously, if $K$ is compact, then $K^*$ is compact.

Finally, we describe an important family of compact operators, called compact integral operators. Integral operators on $L^2(X, d\mu)$ are defined by an integral kernel, which is a function on $X \times X$, and it is customary to denote both the kernel and the operator by the same letter:

Proposition 4.51. Let $X$ be a $\sigma$-locally compact metric space with a Baire measure $\mu$. Let $K \in L^2(X \times X, d(\mu \otimes \mu))$. Then the integral operator $K$, defined by
$$(Ku)(x) = \int_X K(x, y) u(y) \, d\mu(y),$$
is a compact operator on $L^2(X, d\mu)$. Its adjoint is the integral operator with kernel $K^*(x, y) = \overline{K(y, x)}$.

Proof. By the Cauchy–Schwarz inequality, for any $x \in X$,
$$\left| \int_X K(x, y) u(y) \, d\mu(y) \right| \le \|u\| \left( \int_X |K(x, y)|^2 \, d\mu(y) \right)^{1/2},$$
so squaring and integrating in $x$ gives
$$\|Ku\|^2 \le \|u\|^2 \int_X \int_X |K(x, y)|^2 \, d\mu(y) \, d\mu(x).$$
This shows that $K$ is a bounded operator and
$$\|K\| \le \left( \int_X \int_X |K(x, y)|^2 \, d\mu(y) \, d\mu(x) \right)^{1/2} = \left( \int_{X \times X} |K|^2 \, d(\mu \otimes \mu) \right)^{1/2}. \tag{4.13}$$
Moreover, if $K_n$ is a sequence of integral kernels converging to $K$ in $L^2(X \times X, d(\mu \otimes \mu))$, applying this norm estimate to $K - K_n$ shows norm convergence of operators, $K_n \to K$ in $\mathcal{L}(\mathcal{H})$.

We now use density arguments to approximate $K \in L^2(X \times X, d(\mu \otimes \mu))$. For any fixed $\epsilon > 0$, there exists $\tilde K \in C_c(X \times X)$ such that $\|K - \tilde K\|_2 < \epsilon$. Using Cartesian projections $\pi_j : X \times X \to X$, $j = 1, 2$, we obtain compacts $\pi_j(\operatorname{supp} \tilde K)$. By Stone–Weierstrass, $\tilde K$ can be approximated by linear combinations of $f(x)g(y)$ with $f \in C(\pi_1(\operatorname{supp} \tilde K))$, $g \in C(\pi_2(\operatorname{supp} \tilde K))$. Thus, $K$ can be approximated in $L^2(X \times X, d(\mu \otimes \mu))$ by kernels of the form
$$F(x, y) = \sum_{j=1}^k f_j(x) g_j(y).$$
For any such kernel $F$, the corresponding integral operator is finite rank: $\operatorname{Ran} F \subset \operatorname{span}\{f_j \mid 1 \le j \le k\}$. Thus, the integral operator $K$ can be approximated in operator norm by finite-rank operators, so it is compact.
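The norm estimate (4.13) can be observed numerically on a discretized kernel. The Python sketch below is only an illustration under assumptions that are not in the text: the kernel $K(x, y) = \min(x, y)$ on $[0, 1]$ with Lebesgue measure, a midpoint grid with $n = 40$ points, and power iteration as the method for estimating the operator norm of the resulting symmetric matrix. For this particular kernel the largest eigenvalue of the continuous operator is known to be $4/\pi^2$, while the $L^2$ norm of the kernel is $1/\sqrt{6}$.

```python
import math

n = 40                                         # number of grid points (assumption)
h = 1.0 / n
grid = [(i + 0.5) * h for i in range(n)]       # midpoints of [0, 1]

def kernel(x, y):
    """Illustrative symmetric kernel in L^2([0,1] x [0,1])."""
    return min(x, y)

# Matrix of the discretized operator: (Ku)(x_i) ~ sum_j K(x_i, y_j) u(y_j) h.
M = [[kernel(x, y) * h for y in grid] for x in grid]

# Discretized L^2(X x X) norm of the kernel; this is the Frobenius norm of M.
hs_norm = math.sqrt(sum(entry * entry for row in M for entry in row))

def apply(M, u):
    """Matrix-vector product."""
    return [sum(m * x for m, x in zip(row, u)) for row in M]

def l2(u):
    """Euclidean norm; the uniform weight h cancels in the ratio ||Mu|| / ||u||."""
    return math.sqrt(sum(x * x for x in u))

# Power iteration on the symmetric matrix M, started from a positive vector.
u = [1.0] * n
for _ in range(50):
    w = apply(M, u)
    scale = l2(w)
    u = [x / scale for x in w]
op_norm = l2(apply(M, u))                      # u is normalized at this point
print(op_norm, hs_norm, 4 / math.pi**2, 1 / math.sqrt(6))
```

The estimated operator norm stays below the discretized kernel norm, as (4.13) predicts; the two numbers are close here because this kernel is dominated by its top eigenfunction.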

4.7. Exercises

4.1. For $A \in \mathcal{L}(\mathcal{H})$, prove that
$$\|A\| = \sup_{\substack{u, v \in \mathcal{H} \\ \|u\| = \|v\| = 1}} \operatorname{Re} \langle u, Av \rangle.$$

4.2. If $A_n \xrightarrow{w} A$ and $A_n \xrightarrow{w} B$, prove that $A = B$.

4.3. If $\mathcal{H}$ is a finite-dimensional Hilbert space, prove that in $\mathcal{L}(\mathcal{H})$, norm convergence, strong operator convergence, and weak operator convergence are all mutually equivalent.

4.4. Recall the operators $S$, $S^*$ on $\ell^2(\mathbb{N})$ given by (4.2) and (4.3). Prove that:
(a) $S^n$ converges strongly, but not in norm, to $0$ as $n \to \infty$.
(b) $(S^*)^n$ converges weakly, but not strongly, to $0$ as $n \to \infty$.

4.5. If $P_j$, $j \in \mathbb{N}$, are orthogonal projections to mutually orthogonal subspaces and $\operatorname{Ran} P_j \neq \{0\}$ for all $j$, prove that the series $\sum_{j=1}^\infty P_j$ is not convergent in norm.

4.6. If $A_n \xrightarrow{w} A$, prove that the sequence $A_n$ is bounded and (4.4) holds.

4.7. If $A_n \xrightarrow{w} A$ and $B_n \xrightarrow{s} B$, prove that $A_n B_n \xrightarrow{w} AB$.

4.8. If $A_n \to A$ and $B_n \xrightarrow{w} B$, prove that $A_n B_n \xrightarrow{w} AB$.

4.9. If $\dim \mathcal{H} = \infty$, prove that there exist sequences such that $A_n \xrightarrow{s} A$ and $B_n \xrightarrow{w} B$, but $A_n B_n \not\xrightarrow{w} AB$.

4.10. Construct a sequence $(A_n)_{n=1}^\infty$ in $\mathcal{L}(\mathcal{H})$ that obeys $d(A_n, 0) \to 0$ but is not strongly convergent.

4.11. Let $\mathcal{H}$ be an infinite-dimensional separable Hilbert space with orthonormal basis $(e_j)_{j=1}^\infty$. Define a metric $d$ such that, for any sequence $(A_n)_{n=1}^\infty$ in $\mathcal{L}(\mathcal{H})$ and $A \in \mathcal{L}(\mathcal{H})$, $A_n \xrightarrow{w} A$ if and only if $(A_n)_{n=1}^\infty$ is bounded and $d(A_n, A) \to 0$ as $n \to \infty$.

4.12. Let $\mathcal{H}$ be an infinite-dimensional separable Hilbert space. For each of the following statements, determine whether it is true or false:
(a) Every bounded sequence in $\mathcal{L}(\mathcal{H})$ has a strongly convergent subsequence.
(b) Every bounded sequence in $\mathcal{L}(\mathcal{H})$ has a weakly convergent subsequence.

4.13. Prove the second resolvent identity: if $A, B \in \mathcal{L}(\mathcal{H})$ have resolvents at $z$, then
$$R_A(z) - R_B(z) = R_A(z)(B - A)R_B(z).$$

4.14. (a) Let $\Lambda \in \mathcal{L}(\mathcal{H})^*$ be given by $\Lambda(A) = \langle u, Av \rangle$ for some $u, v \in \mathcal{H}$. Prove that $\|\Lambda\| = \|u\| \|v\|$.
(b) Prove (2.27) for $B = \mathcal{L}(\mathcal{H})$. Hint: Use Lemma 4.1.

4.15. Find all eigenvalues of the operator $S^*$ given by (4.3) on $\ell^2(\mathbb{N})$.

4.16. (a) Assume that $A$ is a diagonalizable $n \times n$ matrix, i.e., there exists a unitary $V$ such that $V^{-1} A V$ is a diagonal matrix. Use this to compute $V^{-1} p(A) V$ for any polynomial $p$, and to prove that if $p(\lambda) = q(\lambda)$ for all $\lambda \in \sigma(A)$, then $p(A) = q(A)$.
(b) For the Jordan block
$$A = \begin{pmatrix} t & 1 \\ 0 & t \end{pmatrix},$$
find a polynomial $p$ such that $p(\lambda) = 0$ for all $\lambda \in \sigma(A)$ but $p(A) \neq 0$.
(c) For any $n \times n$ matrix $A$ which is not diagonalizable, prove that there exists $p \in \mathbb{C}[x]$ such that $p(\lambda) = 0$ for all $\lambda \in \sigma(A)$ but $p(A) \neq 0$.

4.17. Prove that $A \in \mathcal{L}(\mathcal{H})$ is an orthogonal projection if and only if $A = A^*$ and $\sigma(A) \subset \{0, 1\}$.

4.18. Give an example of bounded linear operators $A_n$ on separable Hilbert spaces $\mathcal{H}_n$, $n \in \mathbb{N}$, such that $\sup_n \|A_n\| < \infty$ and
$$\sigma\Bigl( \bigoplus_{n=1}^\infty A_n \Bigr) \not\subset \overline{\bigcup_{n=1}^\infty \sigma(A_n)}.$$

4.19. Prove that there exist operators on Hilbert spaces $\mathcal{H}_n$, $n \in \mathbb{N}$, such that $A_{n,k} \to A_{n,\infty}$ as $k \to \infty$ for all $n$, but $\bigoplus_{n=1}^\infty A_{n,k}$ does not converge in norm to $\bigoplus_{n=1}^\infty A_{n,\infty}$.

Chapter 5

Bounded self-adjoint operators

In this chapter, we consider bounded self-adjoint operators on separable Hilbert spaces:

Definition 5.1. $A \in \mathcal{L}(\mathcal{H})$ is self-adjoint if $A^* = A$.

Self-adjoint operators are the natural generalization of Hermitian matrices, defined in linear algebra as $n \times n$ matrices $A$ such that $A_{ij} = \overline{A_{ji}}$ for all $i, j$ (compare Example 4.4). A central result is that every Hermitian matrix is diagonalizable; i.e., there exists an orthonormal basis $\{v_1, \dots, v_n\}$ of $\mathbb{C}^n$ consisting of eigenvectors of $A$. This is called diagonalizability because if we assemble an $n \times n$ matrix $V$ whose columns are the eigenvectors, $V_{ij} = (v_j)_i$, and a diagonal matrix $D$ out of the corresponding eigenvalues by $D_{jj} = \lambda_j$ and $D_{ij} = 0$ for $i \neq j$, then $V$ is unitary and $AV = VD$. In other words, $A$ is represented in the form $A = V D V^{-1}$. A commonly demonstrated first application in linear algebra is an efficient method for computing high powers of a diagonalizable matrix using $A^n = V D^n V^{-1}$. More fundamentally, diagonalizability leads to a classification of all Hermitian matrices up to unitary equivalence: two Hermitian matrices are unitarily equivalent if and only if they have the same eigenvalues with the same multiplicities.

Our goal in this chapter is a generalization of the above discussion, and much more, on separable Hilbert spaces. We will prove that compact self-adjoint operators have an orthonormal basis of eigenvectors (in particular, this includes diagonalizability of Hermitian matrices), but the bulk of the chapter is dedicated to the general setting, not assuming compactness. The central result is called the spectral theorem for bounded self-adjoint operators.

When $\dim \mathcal{H} = \infty$, we cannot describe every self-adjoint operator $A$ in terms of eigenvalues and eigenvectors, since there may not be any (Example 5.6). Thus, inevitably, the spectral theorem for bounded self-adjoint operators will appear different from the compact case. Instead of individual eigenvectors, we will work with unitary maps, and diagonal matrices will be replaced by multiplication operators and their direct sums. We say that operators $A \in \mathcal{L}(\mathcal{H})$, $B \in \mathcal{L}(\mathcal{K})$ are unitarily equivalent if there exists a unitary $U \in \mathcal{L}(\mathcal{H}, \mathcal{K})$ such that $U A U^{-1} = B$ and denote this by $A \cong B$. In that terminology, we prove that every bounded self-adjoint operator is unitarily equivalent to a direct sum of multiplication operators.

One of the main applications will be to define in a consistent way functions of self-adjoint operators $g(A)$ for bounded Borel functions $g : \sigma(A) \to \mathbb{C}$. This is called the Borel functional calculus for self-adjoint operators. We already know the meaning of $p(A)$ if $p$ is a polynomial, but the Borel functional calculus provides a vast generalization for self-adjoint operators $A$.

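5.1. A first look at self-adjoint operators

We begin the chapter with some general consequences of self-adjointness.

Lemma 5.2. Let $A$ be a self-adjoint operator, and let $z \in \mathbb{C} \setminus \mathbb{R}$. For any $u \in \mathcal{H}$,
$$\|(A - z)u\| \ge |\operatorname{Im} z| \, \|u\|. \tag{5.1}$$

Proof. For $z = x + iy$, $x, y \in \mathbb{R}$, (5.1) follows from the calculation
$$\begin{aligned}
\|(A - z)u\|^2 &= \langle (A - x - iy)u, (A - x - iy)u \rangle \\
&= \langle (A - x)u, (A - x)u \rangle - iy \langle (A - x)u, u \rangle + iy \langle u, (A - x)u \rangle + (-iy)(iy) \langle u, u \rangle \\
&= \|(A - x)u\|^2 + y^2 \|u\|^2.
\end{aligned}$$

For $z \notin \mathbb{R}$, (5.1) is a kind of strong injectivity condition on $A - z$. This leads to a general result about invertibility of $A - z$:

Corollary 5.3. If $A$ is self-adjoint, then $\sigma(A) \subset \mathbb{R}$.

Proof. For $z \in \mathbb{C} \setminus \mathbb{R}$, (5.1) implies that $\operatorname{Ker}(A - z) = \{0\}$. Applying (5.1) also to $\bar z$ and using Proposition 4.7 gives
$$(\operatorname{Ran}(A - z))^\perp = \operatorname{Ker}(A^* - \bar z) = \operatorname{Ker}(A - \bar z) = \{0\},$$
so $\operatorname{Ran}(A - z)$ is dense. Thus, $A - z$ is invertible by Lemma 4.20.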


Corollary 5.3 focuses our remaining interest on invertibility of $A - \lambda$ for $\lambda \in \mathbb{R}$. We refine Lemma 4.20 to the setting of self-adjoint operators:

Lemma 5.4 (Weyl's criterion). Let $A$ be self-adjoint, and let $\lambda \in \mathbb{R}$. The operator $A - \lambda$ is invertible if and only if
$$\inf_{\substack{u \in \mathcal{H} \\ u \neq 0}} \frac{\|(A - \lambda)u\|}{\|u\|} > 0.$$

Proof. If the infimum is strictly positive, then $\operatorname{Ran}(A - \lambda)^\perp = \operatorname{Ker}(A - \lambda) = \{0\}$, so $\operatorname{Ran}(A - \lambda)$ is dense. Thus, $A - \lambda$ is invertible by Lemma 4.20. Conversely, if the infimum is $0$, then $A - \lambda$ is not invertible by Lemma 4.20.

Weyl's criterion is usually restated in the following equivalent form.

Proposition 5.5 (Weyl's criterion). Let $A$ be self-adjoint and let $\lambda \in \mathbb{R}$. Then $\lambda \in \sigma(A)$ if and only if there exists a sequence $(u_n)_{n=1}^\infty$ of normalized vectors such that
$$\lim_{n\to\infty} \|(A - \lambda)u_n\| = 0.$$

The vectors $u_n$ in Weyl's criterion are sometimes described as approximate eigenvectors for $\lambda \in \sigma(A)$. This perspective is useful, because elements of the spectrum are not necessarily eigenvalues:

Example 5.6. Let $A$ be the operator on $L^2([0, 1]) = L^2([0, 1], dx)$ given by $(Af)(x) = x f(x)$. The operator $A$ is self-adjoint and has no eigenvalues, but $\sigma(A) = [0, 1]$.

Proof. The operator $A$ is self-adjoint because for all $f, g \in L^2([0, 1])$,
$$\langle f, Ag \rangle = \int_0^1 \overline{f(x)} \, x g(x) \, dx = \int_0^1 \overline{x f(x)} \, g(x) \, dx = \langle Af, g \rangle.$$
For $z \in \mathbb{C} \setminus [0, 1]$, the function $(x - z)^{-1}$ is bounded on $[0, 1]$, so multiplication by $(x - z)^{-1}$ is a bounded operator on $L^2([0, 1])$: for any $f \in L^2([0, 1])$,
$$\int_0^1 \left| \frac{f(x)}{x - z} \right|^2 dx \le \max_{x \in [0, 1]} \left| \frac{1}{x - z} \right|^2 \int_0^1 |f(x)|^2 \, dx.$$
Thus, multiplication by $(x - z)^{-1}$ is a bounded inverse for $A - z$, so $z \notin \sigma(A)$.

Fix $\lambda \in [0, 1]$. For any $\epsilon > 0$, the characteristic function $f = \chi_{[\lambda - \epsilon, \lambda + \epsilon]}$ is a nonzero element of $L^2([0, 1])$ and, since $|(x - \lambda) f(x)| \le \epsilon |f(x)|$ for all $x$,
$$\|(A - \lambda)f\| \le \epsilon \|f\|.$$
Thus, by Weyl's criterion, $A - \lambda$ does not have a bounded inverse, so $\lambda \in \sigma(A)$. In conclusion, $\sigma(A) = [0, 1]$.

However, $Af = \lambda f$ implies that $x f(x) = \lambda f(x)$ for Lebesgue-a.e. $x$, so $f(x) = 0$ for Lebesgue-a.e. $x \neq \lambda$, and therefore $f = 0$ as an element of $L^2([0, 1])$. Thus, $A$ has no eigenvalues.

If a self-adjoint operator has any eigenvalues and eigenvectors, their properties are analogous to those for Hermitian matrices:

Lemma 5.7. If $z$ is an eigenvalue of a self-adjoint operator $A$, then $z \in \mathbb{R}$.

Proof. If $z$ is an eigenvalue of $A$, then $z \in \sigma(A)$, so $z \in \mathbb{R}$.
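To make the approximate eigenvectors of Example 5.6 concrete, the following Python sketch estimates the ratio $\|(A - \lambda) f\| / \|f\|$ for the characteristic functions $f = \chi_{[\lambda - \epsilon, \lambda + \epsilon]}$ by a midpoint rule. The choice $\lambda = 0.3$, the values of $\epsilon$, the number of quadrature points, and the function name `weyl_ratio` are all illustrative assumptions. When the interval lies inside $[0, 1]$, a direct integration gives the exact ratio $\epsilon / \sqrt{3}$, which is consistent with the bound $\epsilon$ used in the example.

```python
import math

def weyl_ratio(lam, eps, m=2000):
    """Midpoint-rule estimate of ||(A - lam) f|| / ||f|| for f the indicator of
    [lam - eps, lam + eps] intersected with [0, 1], where (A f)(x) = x f(x)."""
    lo, hi = max(0.0, lam - eps), min(1.0, lam + eps)
    step = (hi - lo) / m
    xs = [lo + (i + 0.5) * step for i in range(m)]
    num = sum((x - lam) ** 2 for x in xs) * step    # approximates ||(A - lam) f||^2
    den = hi - lo                                   # ||f||^2 is the interval length
    return math.sqrt(num / den)

lam = 0.3                                           # any point of [0, 1] would do
ratios = {eps: weyl_ratio(lam, eps) for eps in (0.1, 0.01, 0.001)}
for eps, r in ratios.items():
    # Each ratio is below eps and shrinks with eps, although no f is an eigenvector.
    print(eps, r, eps / math.sqrt(3))
```

The ratios tend to $0$ as $\epsilon \to 0$, so after normalization the functions $f$ form a sequence as in Proposition 5.5, even though $A$ has no eigenvalues.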

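Lemma 5.8. If $\lambda \neq \kappa$ are eigenvalues of a self-adjoint operator $A$ and $u, v$ the corresponding eigenvectors, then $u \perp v$.

Proof. Since $\kappa \neq \lambda = \bar\lambda$, this follows from
$$\langle \bar\lambda u, v \rangle = \langle \lambda u, v \rangle = \langle Au, v \rangle = \langle u, Av \rangle = \langle u, \kappa v \rangle = \kappa \langle u, v \rangle.$$
The left-hand side equals $\lambda \langle u, v \rangle$, so $(\lambda - \kappa) \langle u, v \rangle = 0$ and therefore $\langle u, v \rangle = 0$.

We will now describe the spectral radius of a self-adjoint operator,
$$r(A) = \max_{\lambda \in \sigma(A)} |\lambda|.$$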

Proposition 5.9. If $A$ is self-adjoint, then $r(A) = \|A\|$.

We show a proof using the spectral radius formula (Theorem 4.30), and a direct proof using only the Neumann series (Theorem 4.26).

Proof using the spectral radius formula. Since $A$ is self-adjoint and $\mathcal{L}(\mathcal{H})$ is a $C^*$-algebra,
$$\|A^2\| = \|A^* A\| = \|A\|^2.$$
Thus, $\|A^{2^k}\| = \|A\|^{2^k}$ for any $k \in \mathbb{N}$, so by the spectral radius formula,
$$r(A) = \lim_{n\to\infty} \|A^n\|^{1/n} = \lim_{k\to\infty} \|A^{2^k}\|^{1/2^k} = \|A\|.$$

Proof using only the Neumann series. If $|\lambda| > \|A\|$, then $\|\lambda^{-1} A\| < 1$, so by the Neumann series, the operator $A - \lambda = -\lambda(I - \lambda^{-1} A)$ is invertible. Thus, $r(A) \le \|A\|$.

For the converse, it suffices to show that $\|A\|$ or $-\|A\|$ is in the spectrum. Since products of invertible operators are invertible, it suffices to show that $A^2 - \|A\|^2 = (A - \|A\|)(A + \|A\|)$ is not invertible. By self-adjointness and the definition of operator norm, there exist normalized vectors $u_n$ such that
$$\lim_{n\to\infty} \langle u_n, A^2 u_n \rangle = \lim_{n\to\infty} \|Au_n\|^2 = \|A\|^2. \tag{5.2}$$
By elementary estimates,
$$\begin{aligned}
\bigl\| A^2 u_n - \|A\|^2 u_n \bigr\|^2 &= \|A^2 u_n\|^2 - 2 \|A\|^2 \operatorname{Re} \langle u_n, A^2 u_n \rangle + \|A\|^4 \|u_n\|^2 \\
&\le 2 \|A\|^4 \|u_n\|^2 - 2 \|A\|^2 \langle u_n, A^2 u_n \rangle.
\end{aligned}$$
The right-hand side converges to $0$ by (5.2), so
$$\lim_{n\to\infty} \bigl\| A^2 u_n - \|A\|^2 u_n \bigr\|^2 = 0.$$
Thus, by Weyl's criterion, $A^2 - \|A\|^2$ is not invertible.

The previous proof relied on the inner product $\langle u, A^2 u \rangle$ which, due to the square, is easy to rewrite as a square of a norm. However, it is often useful to consider the quantity $\langle u, Au \rangle$. This leads us to a nontrivial improvement of Lemma 4.1 for self-adjoint operators. As a preliminary, we note that by self-adjointness and skew-symmetry of the inner product, $\langle u, Au \rangle = \langle Au, u \rangle = \overline{\langle u, Au \rangle}$, so $\langle u, Au \rangle \in \mathbb{R}$ for all $u \in \mathcal{H}$.

Proposition 5.10. If $A \in \mathcal{L}(\mathcal{H})$ is self-adjoint, then
$$\|A\| = \sup_{\substack{u \in \mathcal{H} \\ \|u\| = 1}} |\langle u, Au \rangle|.$$

Proof. Denote by $C$ the supremum in the statement. By Lemma 4.1, $C \le \|A\|$. For the converse, recall that $\|A\|$ or $-\|A\|$ are in the spectrum of $A$. For a choice of $\pm$ sign for which $\pm\|A\| \in \sigma(A)$, by Weyl's criterion, there exist normalized vectors $u_n$ such that
$$\lim_{n\to\infty} \|(A \mp \|A\|) u_n\| = 0.$$
This implies by the Cauchy–Schwarz inequality that
$$\lim_{n\to\infty} \langle u_n, (A \mp \|A\|) u_n \rangle = 0$$
and finally that
$$\lim_{n\to\infty} \langle u_n, Au_n \rangle = \pm \lim_{n\to\infty} \langle u_n, \|A\| u_n \rangle = \pm \|A\|,$$
which proves that $C \ge \|A\|$.
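As a finite-dimensional sanity check of Proposition 5.10, the Python sketch below samples the quadratic form $\langle u, Au \rangle$ over unit vectors $u = (\cos t, \sin t)$ for one real symmetric $2 \times 2$ matrix and compares the result with the closed-form eigenvalues. The matrix entries, the sampling density, and the function name `quad_form` are illustrative assumptions.

```python
import math

a, b, d = 1.0, 2.0, -2.0            # A = [[a, b], [b, d]], an illustrative choice

# Eigenvalues of a real symmetric 2x2 matrix in closed form.
mean, radius = (a + d) / 2, math.hypot((a - d) / 2, b)
eig_min, eig_max = mean - radius, mean + radius
op_norm = max(abs(eig_min), abs(eig_max))      # ||A|| = r(A) by Proposition 5.9

def quad_form(t):
    """<u, A u> for the unit vector u = (cos t, sin t)."""
    c, s = math.cos(t), math.sin(t)
    return a * c * c + 2 * b * c * s + d * s * s

# u and -u give the same value, so sampling t over [0, pi) covers the unit circle.
samples = [quad_form(math.pi * k / 10000) for k in range(10000)]
sup_abs = max(abs(q) for q in samples)
print(sup_abs, op_norm)
print(min(samples), eig_min, max(samples), eig_max)
```

For this matrix the eigenvalues are $-3$ and $2$, so the supremum of $|\langle u, Au \rangle|$ is attained at the negative end of the spectrum; the sampled minimum and maximum of the form also match the two eigenvalues.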

By shifting the operator by constants, we can remove the absolute value in the previous proposition.


Proposition 5.11. If $A$ is self-adjoint, then
$$\min \sigma(A) = \inf_{\substack{u \in \mathcal{H} \\ \|u\| = 1}} \langle u, Au \rangle, \tag{5.3}$$
$$\max \sigma(A) = \sup_{\substack{u \in \mathcal{H} \\ \|u\| = 1}} \langle u, Au \rangle. \tag{5.4}$$

Proof. Applying Proposition 5.10 to $A - c$ for arbitrary $c \in \mathbb{R}$, we see that
$$\max_{x \in \sigma(A - c)} |x| = \sup_{\|u\| = 1} |\langle u, (A - c)u \rangle| = \sup_{\|u\| = 1} |\langle u, Au \rangle - c|.$$
By the spectral mapping theorem, $\sigma(A - c) = \{\lambda - c \mid \lambda \in \sigma(A)\}$, so this can be rewritten as
$$\max_{\lambda \in \sigma(A)} |\lambda - c| = \sup_{\lambda \in S} |\lambda - c|, \tag{5.5}$$
where $S = \{\langle u, Au \rangle \mid \|u\| = 1\}$. Both $S$ and $\sigma(A)$ are contained in $[-\|A\|, \|A\|]$. Thus, for $c \le -\|A\|$, all expressions $|\lambda - c|$ in (5.5) are equal to $\lambda - c$, and (5.5) implies $\max \sigma(A) = \sup S$, which is (5.4). Similarly, for $c \ge \|A\|$, (5.5) implies (5.3).

The set $S$ from the previous proof is connected as the continuous image of the unit sphere in $\mathcal{H}$, so it cannot be expected to provide any further information about $\sigma(A)$ beyond (5.3) and (5.4). A more sophisticated generalization of Proposition 5.11, called the min-max principle, will be discussed later.

We next define positivity for self-adjoint operators. This generalizes the notion of positive semi-definiteness of matrices, often encountered in the context of the second derivative test of functions of several variables.

Definition 5.12. A self-adjoint operator $A$ is said to be positive if $\langle u, Au \rangle \ge 0$ for all $u \in \mathcal{H}$, and we denote this by $A \ge 0$.

As an immediate corollary of (5.3), we obtain a criterion for positivity of $A$ in terms of $\sigma(A)$.

Corollary 5.13. $A \ge 0$ if and only if $\sigma(A) \subset [0, \infty)$.

This notion of positivity is also used to define a partial order relation:

Definition 5.14. If $A, B$ are self-adjoint operators on $\mathcal{H}$, we say that $A \le B$ if $B - A \ge 0$.

Lemma 5.15. The relation $A \le B$ is a partial order on the set of bounded self-adjoint operators on $\mathcal{H}$.


Proof. If $A = 0$, then $\langle v, Av \rangle = 0$ for all $v \in \mathcal{H}$, so $A \ge 0$. This implies reflexivity. If $A \ge 0$ and $-A \ge 0$, then $\langle v, Av \rangle = 0$ for all $v \in \mathcal{H}$, so $A = 0$ by Proposition 5.10. This implies antisymmetry. If $A \le B$ and $B \le C$, then
$$\langle v, (C - A)v \rangle = \langle v, (C - B)v \rangle + \langle v, (B - A)v \rangle \ge 0$$
for all $v \in \mathcal{H}$, so $A \le C$. This implies transitivity.

Most of this chapter is devoted to the study of a single self-adjoint operator. However, we often study an operator $A$ by approximating it by some "simpler" operators $A_n$, so in the rest of this section, we consider how the spectrum behaves with respect to norm and strong convergence of self-adjoint operators.

The Hausdorff distance between nonempty subsets $S, T$ of a metric space is
$$d_H(S, T) = \max\Bigl\{ \sup_{x \in S} \inf_{y \in T} d(x, y), \ \sup_{y \in T} \inf_{x \in S} d(x, y) \Bigr\}.$$
This defines a metric on nonempty compact subsets of the metric space (Exercise 5.4).

Proposition 5.16. If $A, B \in \mathcal{L}(\mathcal{H})$ are self-adjoint, then
$$d_H(\sigma(A), \sigma(B)) \le \|A - B\|.$$

Proof. For $z \in \sigma(A)$, from (5.16) and
$$\|(B - z)u\| \le \|(A - z)u\| + \|A - B\| \|u\|,$$
we conclude that $\operatorname{dist}(z, \sigma(B)) \le \|A - B\|$. Repeating the argument with the roles of $A$ and $B$ reversed concludes the proof.

Corollary 5.17. If $A_n \to A_\infty$, then $\sigma(A_n) \to \sigma(A_\infty)$ in Hausdorff distance.

Thus, in case of norm convergence, the spectra of $A_n$ uniquely determine the spectrum of $A_\infty$. This does not hold for strong operator convergence (Exercise 5.5), but one inclusion holds:

Proposition 5.18. For self-adjoint operators $A_n$ with $A_n \xrightarrow{s} A_\infty$,
$$\sigma(A_\infty) \subset \{\lambda \in \mathbb{R} \mid \lim_{n\to\infty} d_H(\{\lambda\}, \sigma(A_n)) = 0\}.$$

Proof. Let $\lambda \in \sigma(A_\infty)$. For any $\epsilon > 0$ there exists normalized $v$ with $\|(A_\infty - \lambda)v\| < \epsilon$. By strong operator convergence, for large enough $n$, $\|(A_n - \lambda)v\| < 2\epsilon$, which implies $d(\lambda, \sigma(A_n)) < 2\epsilon$. Since this holds for any $\epsilon > 0$, the claim follows.

This inclusion has an immediate corollary:

Corollary 5.19. Let $F$ be a closed subset of $\mathbb{R}$. For self-adjoint operators $A_n$ with $A_n \xrightarrow{s} A_\infty$, if $\sigma(A_n) \subset F$ for all $n$, then $\sigma(A_\infty) \subset F$.
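The Hausdorff distance and the bound of Proposition 5.16 are easy to compute for finite spectra. The Python sketch below does this for two real symmetric $2 \times 2$ matrices, using closed-form eigenvalues; the matrices and the helper names `eigs` and `hausdorff` are illustrative assumptions, and the sets involved are finite, so every supremum and infimum is a maximum or minimum.

```python
import math

def eigs(a, b, d):
    """Eigenvalues of the real symmetric matrix [[a, b], [b, d]]."""
    mean, radius = (a + d) / 2, math.hypot((a - d) / 2, b)
    return [mean - radius, mean + radius]

def hausdorff(S, T):
    """Hausdorff distance between two finite nonempty subsets of R."""
    def one_sided(P, Q):
        # sup over x in P of the distance from x to Q
        return max(min(abs(x - y) for y in Q) for x in P)
    return max(one_sided(S, T), one_sided(T, S))

A = (1.0, 2.0, -2.0)                 # entries (a, b, d) of [[a, b], [b, d]]
B = (1.5, 1.8, -2.0)                 # a small symmetric perturbation of A

diff = tuple(x - y for x, y in zip(A, B))          # entries of A - B
norm_diff = max(abs(e) for e in eigs(*diff))       # ||A - B|| for a symmetric matrix
d_H = hausdorff(eigs(*A), eigs(*B))
print(eigs(*A), eigs(*B), d_H, norm_diff)
```

In this instance the spectra move by about $0.26$ while $\|A - B\|$ is about $0.57$, so the inequality of Proposition 5.16 holds with room to spare.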


5.2. Spectral theorem for compact self-adjoint operators

In the introduction to this chapter, the study of self-adjoint operators was motivated by the spectral theorem for Hermitian matrices:

Theorem 5.20. If $A$ is an $n \times n$ Hermitian matrix, then there exists an orthonormal basis $\{v_1, \dots, v_n\}$ of $\mathbb{C}^n$ consisting of eigenvectors of $A$.

As promised, we will prove a generalization of that statement:

Theorem 5.21 (Spectral theorem for compact self-adjoint operators). If $K$ is a compact self-adjoint operator on a separable Hilbert space $\mathcal{H}$, then there exists an orthonormal basis $(v_n)_{n=1}^{\dim \mathcal{H}}$ of $\mathcal{H}$ consisting of eigenvectors of $K$. Moreover, if $\dim \mathcal{H} = \infty$ and we denote by $\lambda_n$ the eigenvalues for $v_n$, then $\lim_{n\to\infty} \lambda_n = 0$.

The special case $\mathcal{H} = \mathbb{C}^n$ recovers Theorem 5.20. For the general setting, the key lemma follows.

Lemma 5.22. Let $K$ be a compact self-adjoint operator on a nontrivial Hilbert space $\mathcal{H}$. Then $\|K\|$ or $-\|K\|$ is an eigenvalue.

Proof. By Proposition 5.10, there exists a sequence $(u_n)_{n=1}^\infty$ of normalized vectors such that $|\langle u_n, Ku_n \rangle| \to \|K\|$. By considering $K$ or $-K$, we can assume without loss of generality that $\langle u_n, Ku_n \rangle \to \|K\|$. This sequence has a weakly convergent subsequence, which by relabelling we denote also by $(u_n)_{n=1}^\infty$, and we denote $u = \operatorname{w-lim}_{n\to\infty} u_n$. By compactness of $K$, $Ku_n \to Ku$, so by Lemma 3.55, $\langle u_n, Ku_n \rangle \to \langle u, Ku \rangle$. Thus, $\langle u, Ku \rangle = \|K\|$.

Weak convergence implies $\|u\| \le 1$, so
$$\|K\| = \langle u, Ku \rangle \le \|u\| \|Ku\| \le \|K\| \|u\|^2 \le \|K\|.$$
This implies that $\|u\| = 1$ and $\|Ku\| = \langle u, Ku \rangle = \|K\|$, so
$$\bigl\| Ku - \|K\| u \bigr\|^2 = \|Ku\|^2 - 2 \|K\| \langle u, Ku \rangle + \|K\|^2 \|u\|^2 = 0,$$
so $Ku = \|K\| u$.

Proof of Theorem 5.21. For any $\lambda \in \mathbb{C}$, $\operatorname{Ker}(K - \lambda)$ is a closed subspace of $\mathcal{H}$, so it has an orthonormal basis (which may be empty). Since eigenvectors corresponding to distinct eigenvalues are mutually orthogonal, the union over $\lambda$ of those orthonormal bases is an orthonormal set in $\mathcal{H}$, which we denote by $\{v_n \mid n = 1, \dots, N\}$ with $N$ finite or $\infty$. Consider
$$M = \operatorname{span}\{v_n \mid n = 1, \dots, N\}.$$
In other words, $M$ is the direct sum of subspaces $\operatorname{Ker}(K - \lambda)$ for $\lambda \in \mathbb{C}$; of course, since $\mathcal{H}$ is separable, only countably many of those subspaces can be nontrivial.

Since $v_n$ are eigenvectors, their span is an invariant subspace for $K$, and so is its closure $\overline{M}$. By Lemma 4.41, $M^\perp$ is also an invariant subspace for $K$. The restriction of $K$ to $M^\perp$ is a compact self-adjoint operator on $M^\perp$, because it obeys $\langle u, Kv \rangle = \langle Ku, v \rangle$ for all $u, v \in M^\perp$ and takes weakly convergent sequences to strongly convergent sequences. However, $K|_{M^\perp}$ has no eigenvectors by the construction of $M$. Thus, Lemma 5.22 implies $M^\perp = \{0\}$. Thus, $M$ is a dense subspace of $\mathcal{H}$, so $(v_n)_{n=1}^N$ is an orthonormal basis of $\mathcal{H}$.

If $\mathcal{H}$ is infinite dimensional, $v_n \xrightarrow{w} 0$ implies $\lambda_n v_n = Kv_n \to 0$. Since $\|v_n\| = 1$ for all $n$, this implies $|\lambda_n| = \|\lambda_n v_n\| \to 0$.

Expanding vectors with respect to the orthonormal basis $(v_n)_{n=1}^N$ gives a unitary map $V : \mathcal{H} \to \bigoplus_{n=1}^N \mathbb{C}$ where $Vu = (\langle v_n, u \rangle)_{n=1}^N$ (Theorem 3.40). Since the $v_n$ are eigenvectors, $V K V^{-1}$ has a particularly simple form which generalizes the diagonalization of Hermitian matrices: for any $f \in \bigoplus_{n=1}^N \mathbb{C}$,
$$(V K V^{-1} f)_n = \lambda_n f_n.$$

The remainder of this section is an aside, an application to arbitrary compact operators.

Theorem 5.23 (Singular value decomposition). Any compact operator $K$ on $\mathcal{H}$ can be represented in the form
$$Kv = \sum_{n=1}^N \mu_n \langle e_n, v \rangle f_n, \tag{5.6}$$
where $(e_n)_{n=1}^N$ and $(f_n)_{n=1}^N$ are orthonormal families in $\mathcal{H}$, $\mu_n > 0$ for all $n$, and $\mu_n \to 0$ if $N = \infty$. If $N = \infty$, (5.6) denotes a norm-convergent series.

Before proving this, let us explain the convergence of a series such as (5.6). Lemma 5.24. (a) Let (ej )nj=m , (fj )nj=m be two ﬁnite orthonormal families in H, and let μm ≥ · · · ≥ μn > 0. Deﬁne a ﬁnite rank operator F ∈ L(H) by Fv =

n

μj ej , vfj .

j=m

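Then $\|F\| = \mu_m$.

(b) Let $(e_j)_{j=1}^\infty$, $(f_j)_{j=1}^\infty$ be two orthonormal families in $\mathcal{H}$, and let $(\mu_j)_{j=1}^\infty$ be a decreasing sequence with $\mu_j \to 0$ as $j \to \infty$. Then the series
$$ T = \sum_{j=1}^\infty \mu_j \langle e_j, \cdot \rangle f_j $$
is norm convergent and defines a compact operator $T$ with $\|T\| = \mu_1$.

Proof. (a) For any $v \in \mathcal{H}$, by Bessel's inequality, $\sum_{j=m}^n |\langle e_j, v \rangle|^2 \le \|v\|^2$, so
$$ \|Fv\|^2 = \Big\| \sum_{j=m}^n \mu_j \langle e_j, v \rangle f_j \Big\|^2 = \sum_{j=m}^n \mu_j^2 |\langle e_j, v \rangle|^2 \le \mu_m^2 \|v\|^2. $$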

Moreover, equality holds for $v = e_m$. Thus, $\|F\| = \mu_m$.

(b) Denote $F_n = \sum_{j=1}^n \mu_j \langle e_j, \cdot \rangle f_j$. By (a), $\|F_n - F_m\| = \mu_{m+1}$ for $m < n$, so the $F_n$ form a Cauchy sequence in $\mathcal{L}(\mathcal{H})$. Its limit $K$ (the operator denoted $T$ in the statement) is a bounded operator and, since $\|F_n\| = \mu_1$ for all $n$, $\|K\| = \mu_1$. Moreover, the operators $F_n$ are finite rank and $F_n \to K$, so by Corollary 4.46 and Proposition 4.47, $K$ is compact.

Proof of Theorem 5.23. The operator $K^*K$ is compact and self-adjoint. Moreover, $K^*K \ge 0$, because $\langle v, K^*Kv \rangle = \langle Kv, Kv \rangle \ge 0$ for all $v$. By the spectral theorem for compact self-adjoint operators, $K^*K$ has an orthonormal basis of eigenvectors. If we remove from that basis eigenvectors with zero eigenvalue, we obtain a basis $(e_n)_{n=1}^N$ for the subspace $S = \operatorname{Ker}(K^*K)^\perp$. In particular, $K^*K e_n = \lambda_n e_n$, with $\lambda_n > 0$ and $\lambda_n \to 0$ if $N = \infty$. Denote $\mu_n = \sqrt{\lambda_n}$ and $f_n = \mu_n^{-1} K e_n$. Then
$$ \langle f_m, f_n \rangle = \mu_m^{-1} \mu_n^{-1} \langle K e_m, K e_n \rangle = \mu_m^{-1} \mu_n^{-1} \langle e_m, K^*K e_n \rangle = \delta_{mn}, $$

so $(f_n)_{n=1}^N$ is an orthonormal family. By Lemma 5.24, a bounded operator $B$ is defined by
$$ Bv = \sum_{n=1}^N \mu_n \langle e_n, v \rangle f_n. $$

Note that $K e_n = \mu_n f_n = B e_n$ for each $n$, so $Kv = Bv$ for all $v \in S = \overline{\operatorname{span}\{e_n\}_{n=1}^N}$. If $v \in S^\perp$, then $v \in \operatorname{Ker}(K^*K)$, so $\|Kv\|^2 = \langle v, K^*Kv \rangle = 0$, which implies $Kv = 0$. Thus, $Kv = 0 = Bv$ for $v \in S^\perp$. Since $K$ and $B$ agree on $S$ and $S^\perp$, they are equal.

Theorem 5.25. The set of compact operators in $\mathcal{L}(\mathcal{H})$ is the closure of the set of finite rank operators.


Proof. If $K$ is compact, it can be approximated in operator norm by finite-rank operators by (5.6). Conversely, if the $F_n$ are finite-rank, they are compact by Corollary 4.46, and if $F_n \to K$, then $K$ is compact by Proposition 4.47.
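As a small sanity check of (5.6), the following worked example carries out the construction from the proof of Theorem 5.23 by hand. The matrix is not taken from the text; it is chosen for illustration because it is nilpotent, so it has no basis of eigenvectors, yet the singular value decomposition still applies.

```latex
K = \begin{pmatrix} 0 & 2 \\ 0 & 0 \end{pmatrix}, \qquad
K^*K = \begin{pmatrix} 0 & 0 \\ 0 & 4 \end{pmatrix}.
% The only nonzero eigenvalue of K^*K is \lambda_1 = 4, with eigenvector e_1 = (0,1)^T.
\mu_1 = \sqrt{\lambda_1} = 2, \qquad
f_1 = \mu_1^{-1} K e_1 = \tfrac12 \begin{pmatrix} 2 \\ 0 \end{pmatrix}
    = \begin{pmatrix} 1 \\ 0 \end{pmatrix}.
% Hence N = 1 and, for v = (v_1, v_2)^T,
Kv = \mu_1 \langle e_1, v \rangle f_1 = 2 v_2 \begin{pmatrix} 1 \\ 0 \end{pmatrix},
% which agrees with the direct computation K v = (2 v_2, 0)^T.
```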

5.3. Spectral measures

In this section, we introduce the notion of spectral measure corresponding to a vector. We denote $\mathbb{N}_0 = \{n \in \mathbb{Z} \mid n \ge 0\}$.

Theorem 5.26. Let $A$ be a bounded self-adjoint operator on $\mathcal{H}$, and let $\psi \in \mathcal{H}$. There exists a unique compactly supported Borel measure $\mu_{A,\psi}$ on $\mathbb{R}$ such that for all $k \in \mathbb{N}_0$,
$$ \langle \psi, A^k \psi \rangle = \int x^k \, d\mu_{A,\psi}(x). \tag{5.7} $$
Moreover, $\operatorname{supp} \mu_{A,\psi} \subset \sigma(A)$.

Definition 5.27. The measure $\mu_{A,\psi}$ which obeys (5.7) for all $k \in \mathbb{N}_0$ is called the spectral measure for the vector $\psi$ and the operator $A$. The spectral measure will often be denoted more concisely by $\mu_\psi$.

The reader should be warned that the term "spectral measure" will later also be used with related but different meanings, corresponding to an operator $A$ but perhaps not to any particular vector $\psi \in \mathcal{H}$.

The integrals $c_k = \int x^k \, d\mu(x)$ for $k \in \mathbb{N}_0$ are called the moments of the measure $\mu$. Equation (5.7) precisely specifies the moments of the desired spectral measure. The reader should keep in mind that not every sequence of real numbers is the sequence of moments of a positive measure on $\mathbb{R}$; for instance, $c_0 = \mu(\mathbb{R}) \ge 0$, and for every $t \in \mathbb{R}$, $c_2 - 2tc_1 + t^2 c_0 = \int (x - t)^2 \, d\mu(x) \ge 0$, so $c_1^2 \le c_0 c_2$. Investigating this set of constraints more systematically would lead us to the so-called moment problem. Instead, we proceed directly to the proof of Theorem 5.26, which accounts for those constraints somewhat implicitly.

In the proof, we will use the left-hand side of (5.7) to define a linear functional on polynomials. We will prove that it has a unique extension to a positive linear functional on $C(\sigma(A))$ and will use the Riesz–Markov theorem to obtain the spectral measure. We will use the polynomial spectral mapping theorem from Section 4.4. In a preliminary lemma, we obtain boundedness from positivity, and address a technicality: the distinction between polynomials as algebraic expressions and polynomials as elements of $C(\sigma(A))$ (this distinction is nontrivial if $\sigma(A)$ is a finite set).

We denote by $\mathbb{F}[x]$ the set of polynomials in one variable with coefficients in the field $\mathbb{F}$.
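The step from nonnegativity of the quadratic in $t$ to the moment constraint $c_1^2 \le c_0 c_2$ is short but worth spelling out. The following sketch fills it in; the failing sequence at the end is an illustrative choice, not an example from the text.

```latex
% Assume c_0 \ge 0 and c_2 - 2 t c_1 + t^2 c_0 \ge 0 for all t \in \mathbb{R}.
% Case c_0 > 0: substitute t = c_1 / c_0.
c_2 - 2 \frac{c_1}{c_0} c_1 + \frac{c_1^2}{c_0^2} c_0
  = c_2 - \frac{c_1^2}{c_0} \ge 0
  \quad \Longrightarrow \quad c_1^2 \le c_0 c_2 .
% Case c_0 = 0: the condition reads c_2 - 2 t c_1 \ge 0 for all t,
% which forces c_1 = 0 (and c_2 \ge 0), so again c_1^2 = 0 \le c_0 c_2.
%
% A sequence that cannot be a moment sequence:
(c_0, c_1, c_2) = (1, 2, 1), \qquad c_1^2 = 4 > 1 = c_0 c_2 .
```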


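Lemma 5.28. Let $\Lambda : \mathbb{C}[x] \to \mathbb{C}$ be a linear map, and let $K \subset \mathbb{R}$ be compact. Assume that for every $p \in \mathbb{R}[x]$ such that $p(x) \ge 0$ for all $x \in K$, $\Lambda(p) \ge 0$. Then the following hold.

(a) For all $p \in \mathbb{C}[x]$,
$$ |\Lambda(p)| \le \Lambda(1) \max_{x \in K} |p(x)|. \tag{5.8} $$

(b) If $p_1(x) = p_2(x)$ for all $x \in K$, then $\Lambda(p_1) = \Lambda(p_2)$, so $\Lambda$ gives a bounded linear functional on the subspace $\operatorname{span}\{x^n \mid n \in \mathbb{N}_0\}$ in $C(K)$.

(c) $\Lambda$ has a unique extension to a bounded linear functional on $C(K)$, and that extension is a positive linear functional on $C(K)$.

Proof. (a) Denote $\|p\| = \max_{x \in K} |p(x)|$. For any $p \in \mathbb{R}[x]$, the polynomials $\|p\| \mp p \in \mathbb{R}[x]$ are nonnegative on $K$, so by positivity and linearity, $\|p\| \Lambda(1) \mp \Lambda(p) = \Lambda(\|p\| \mp p) \ge 0$. This can be rewritten as $-\|p\| \Lambda(1) \le \Lambda(p) \le \|p\| \Lambda(1)$. In particular, $\Lambda(p) \in \mathbb{R}$. Any $p \in \mathbb{C}[x]$ can be written as a linear combination $p = \operatorname{Re} p + i \operatorname{Im} p$ by separating real and imaginary parts of coefficients. Since $\Lambda(\operatorname{Re} p), \Lambda(\operatorname{Im} p) \in \mathbb{R}$, from $\Lambda(p) = \Lambda(\operatorname{Re} p) + i \Lambda(\operatorname{Im} p)$ it follows that $\operatorname{Re} \Lambda(p) = \Lambda(\operatorname{Re} p)$. Thus,
$$ |\Lambda(p)| = \sup_{\phi \in \mathbb{R}} \operatorname{Re}(e^{i\phi} \Lambda(p)) = \sup_{\phi \in \mathbb{R}} \operatorname{Re} \Lambda(e^{i\phi} p) \le \sup_{\phi \in \mathbb{R}} \Lambda(1) \|e^{i\phi} p\| \le \Lambda(1) \|p\|. $$

(b) If $p_1 = p_2$ on $K$, applying (5.8) to $p_1 - p_2$ implies that $\Lambda(p_1) = \Lambda(p_2)$.

(c) By Weierstrass's theorem (Corollary 2.20), polynomials form a dense subspace in $C(K)$, so by Proposition 2.44, $\Lambda$ extends uniquely to a bounded linear functional on $C(K)$, which we denote by the same letter, $\Lambda : C(K) \to \mathbb{C}$. Any $f \in C(K)$ can be approximated uniformly by a sequence of polynomials $p_n$. If $f$ is real valued, $p_n$ can also be chosen with real coefficients (by taking their real parts); if $f$ is nonnegative, $p_n$ can also be chosen nonnegative on $K$ (by replacing $p_n$ with the polynomials $p_n - \min(0, \min_{x \in K} p_n(x))$, which still converge uniformly to $f$ because $\min_{x \in K} p_n(x) \ge -\|p_n - f\|$). These nonnegative approximants $p_n$ obey $\Lambda(p_n) \ge 0$, so $f \ge 0$ implies
$$ \Lambda(f) = \lim_{n \to \infty} \Lambda(p_n) \ge 0. $$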

Proof of Theorem 5.26. Let us denote $K = \sigma(A)$ and, for $p \in \mathbb{C}[x]$, $\Lambda(p) = \langle \psi, p(A) \psi \rangle$. This defines a linear map $\Lambda : \mathbb{C}[x] \to \mathbb{C}$. Since $A$ is self-adjoint, $p(A)^* = \bar{p}(A)$, where $\bar{p}$ denotes the polynomial with complex conjugated coefficients. If $p \in \mathbb{R}[x]$, this implies that $p(A)$ is self-adjoint. If, moreover, $p \ge 0$ on $\sigma(A)$, the polynomial spectral mapping theorem (Theorem 4.33) implies $\sigma(p(A)) \subset [0, \infty)$, so $p(A)$ is a positive operator and $\Lambda(p) = \langle \psi, p(A) \psi \rangle \ge 0$. Thus, by Lemma 5.28, $\Lambda$ extends uniquely to a bounded, positive linear functional on $C(K)$. Existence of the spectral measure follows from the Riesz–Markov theorem (Theorem 1.100).

It remains to prove uniqueness. If $\mu_1, \mu_2$ are compactly supported measures and $\int x^k \, d\mu_1(x) = \int x^k \, d\mu_2(x)$ for all $k \in \mathbb{N}_0$, the corresponding bounded linear functionals are equal on polynomials, so by density of polynomials, they are equal on $C(\operatorname{supp} \mu_1 \cup \operatorname{supp} \mu_2)$. Thus, by the Riesz–Markov theorem, $\mu_1 = \mu_2$.

Note that $\mu_{A,\psi}(\mathbb{R}) = \|\psi\|^2$ by (5.7) with $k = 0$. Moreover, the spectral measure of a vector contains a lot of information about the values of the spectral parameter which correspond to that vector. This is illustrated by the following examples. Recall that $\delta_\lambda$ denotes the Dirac measure at $\lambda$.

Example 5.29. Let $v$ be an eigenvector of $A$, and let $\lambda$ be the corresponding eigenvalue. The spectral measure of $v$ is $\mu_{A,v} = \|v\|^2 \delta_\lambda$.

Proof. From $Av = \lambda v$, it follows by induction that $A^k v = \lambda^k v$ for $k \in \mathbb{N}_0$. Thus, by linearity, for any polynomial $p$, $p(A) v = p(\lambda) v$. Thus,
$$ \langle v, p(A) v \rangle = \langle v, p(\lambda) v \rangle = p(\lambda) \|v\|^2. $$

This is equal to $\int p \, d\mu$ for the choice of measure $\mu = \|v\|^2 \delta_\lambda$, so by uniqueness of the spectral measure, $\mu_{A,v} = \|v\|^2 \delta_\lambda$.

Generalizations of this example are considered in Exercises 5.9 and 5.10. We give a different example:

Example 5.30. Let $A$ be the self-adjoint operator from Example 5.6 and let $f \in L^2([0,1], dx)$. Then $d\mu_{A,f}(x) = |f(x)|^2 \, dx$.

Proof. For any $k \in \mathbb{N}_0$, $(A^k f)(x) = x^k f(x)$, so
$$ \langle f, A^k f \rangle = \int_0^1 \overline{f(x)} \, x^k f(x) \, dx = \int_0^1 x^k |f(x)|^2 \, dx. $$
From this, we directly read off $d\mu_{A,f}(x) = |f(x)|^2 \, dx$.
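For a finite-dimensional illustration of (5.7), the moments $\langle \psi, A^k \psi \rangle$ can be compared with the moments of a candidate measure. The sketch below is an illustrative check, not part of the text: the matrix $A$, the vector $\psi$, and the helper names are hypothetical choices. It uses the symmetric matrix with rows $(2, 1)$ and $(1, 2)$, whose eigenvalues are $1$ and $3$ with normalized eigenvectors $(1, -1)/\sqrt{2}$ and $(1, 1)/\sqrt{2}$, so for $\psi = (1, 0)$ the candidate spectral measure is $\tfrac12 \delta_1 + \tfrac12 \delta_3$.

```python
from fractions import Fraction


def matvec(A, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def moments_from_operator(A, psi, kmax):
    """Return [<psi, A^k psi> for k = 0..kmax] for a real symmetric matrix A."""
    out, w = [], list(psi)
    for _ in range(kmax + 1):
        out.append(sum(p * x for p, x in zip(psi, w)))  # <psi, A^k psi>
        w = matvec(A, w)                                # advance to A^(k+1) psi
    return out


def moments_from_measure(atoms, kmax):
    """Moments of a finite sum of point masses, atoms = [(location, weight), ...]."""
    return [sum(wt * loc ** k for loc, wt in atoms) for k in range(kmax + 1)]


A = [[2, 1], [1, 2]]            # illustrative symmetric matrix (assumption)
psi = [1, 0]                    # illustrative vector (assumption)
atoms = [(1, Fraction(1, 2)), (3, Fraction(1, 2))]  # candidate spectral measure

lhs = moments_from_operator(A, psi, 6)
rhs = moments_from_measure(atoms, 6)
print(lhs)
print(lhs == rhs)
```

Since a compactly supported measure is determined by its moments, agreement for all $k$ would identify the measure; the code only checks finitely many moments, which is a consistency check rather than a proof.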

5.4. Spectral theorem on a cyclic subspace

In this section, we show that the spectral measure of the vector $\psi$ describes the behavior of the operator on a certain subspace generated by $\psi$.

Definition 5.31. For a bounded self-adjoint operator $A$ and $\psi \in \mathcal{H}$, we define the cyclic subspace of $\psi$ as
$$ \mathcal{C}_A(\psi) = \overline{\{p(A)\psi \mid p \in \mathbb{C}[x]\}}. $$
The vector $\psi$ is said to be cyclic if $\mathcal{C}_A(\psi) = \mathcal{H}$.


Lemma 5.32. The cyclic subspace $\mathcal{C}_A(\psi)$ is the smallest invariant closed subspace for $A$ that contains $\psi$.

Proof. For any polynomial $p$, $Ap(A)$ is also a polynomial in $A$, so the subspace $\{p(A)\psi \mid p \in \mathbb{C}[x]\}$ is invariant for $A$. Thus, its closure $\mathcal{C}_A(\psi)$ is also invariant for $A$, by Lemma 4.41. Of course, $\psi = A^0 \psi \in \mathcal{C}_A(\psi)$. Let $M$ be a closed invariant subspace for $A$. If $\psi \in M$, it follows by induction that $A^k \psi \in M$ for all $k = 0, 1, 2, \dots$, and then that $p(A)\psi \in M$ for all polynomials $p$. Since $M$ is closed, it follows that $\mathcal{C}_A(\psi) \subset M$.

We will describe the behavior of $A$ on the cyclic subspace $\mathcal{C}_A(\psi)$ by using $L^2(\mathbb{R}, d\mu_{A,\psi})$ as a model space. More precisely, we find a unitary map between these spaces which encodes $A$ by multiplication by the function $x$:

Theorem 5.33. Let $A$ be a bounded self-adjoint operator, and let $\mu_{A,\psi}$ be the spectral measure of a vector $\psi \in \mathcal{H}$. Then the map $U : \mathbb{C}[x] \to \mathcal{C}_A(\psi)$ defined by $Up = p(A)\psi$ extends uniquely to a unitary map $U : L^2(\mathbb{R}, d\mu_{A,\psi}) \to \mathcal{C}_A(\psi)$ such that $U1 = \psi$ and, for all $f \in L^2(\mathbb{R}, d\mu_{A,\psi})$,
$$ (U^{-1} A U f)(x) = x f(x). \tag{5.9} $$

Proof. Let us write $\mu = \mu_{A,\psi}$. For all $k, l \in \mathbb{N}_0$,
$$ \langle A^k \psi, A^l \psi \rangle = \langle \psi, A^{k+l} \psi \rangle = \int x^{k+l} \, d\mu(x) = \int \overline{x^k} \, x^l \, d\mu(x). $$
By sesquilinearity, for all polynomials $p, q$,
$$ \langle p(A)\psi, q(A)\psi \rangle = \int \overline{p(x)} \, q(x) \, d\mu(x). $$
In particular, $\|Up\|_{\mathcal{H}} = \|p(A)\psi\|_{\mathcal{H}} = \|p\|_{L^2(\mathbb{R}, d\mu)}$. Since polynomials are dense in $L^2(\mathbb{R}, d\mu)$ and $U$ is norm-preserving on polynomials, $U$ extends uniquely to a norm-preserving map, denoted by the same letter $U$, from $L^2(\mathbb{R}, d\mu)$ to $\mathcal{H}$. The range of $U$ is the closure of $\{p(A)\psi \mid p \in \mathbb{C}[x]\}$, which is precisely $\mathcal{C}_A(\psi)$.

To prove (5.9), consider first that for any polynomial $p$, $AUp = Ap(A)\psi = U(xp(x))$. Thus, (5.9) holds for all polynomials, so by density and continuity it holds for all $f \in L^2(\mathbb{R}, d\mu)$.

If the operator $A$ has a cyclic vector $\psi$, Theorem 5.33 immediately implies that $A$ is unitarily conjugated into the form of a multiplication operator:


Theorem 5.34 (Spectral theorem for bounded self-adjoint operators with a cyclic vector). If the operator $A$ has a cyclic vector $\psi$, there is a unitary $U : L^2(\mathbb{R}, d\mu_{A,\psi}) \to \mathcal{H}$ such that $U1 = \psi$ and for all $f \in L^2(\mathbb{R}, d\mu_{A,\psi})$,
$$ (U^{-1} A U f)(x) = x f(x). \tag{5.10} $$

This result is already sufficient for some classes of self-adjoint operators, such as half-line Jacobi matrices (see Chapter 10). However, we will also prove a more general version of the spectral theorem, applicable for every self-adjoint operator on a separable Hilbert space. Exercise 5.16 considers an alternative description of the cyclic subspace of $\psi$. When we discuss unbounded operators in Chapter 8, that alternative description will be used as the definition.

5.5. Multiplication operators

By Theorem 5.34, every self-adjoint operator with a cyclic vector is unitarily equivalent to a multiplication operator. Due to this, multiplication operators serve as models for self-adjoint operators, and it is worthwhile to study them systematically. Just as for diagonal matrices, many properties of multiplication operators can be explicitly computed and characterized.

Although Theorem 5.34 involves a finite spectral measure supported on a compact subset of $\mathbb{R}$, with almost no additional effort, we will allow Baire measures on $\mathbb{R}$, i.e., positive Borel measures on $\mathbb{R}$ which are finite on compacts. We recall that these are precisely the Lebesgue–Stieltjes measures on $\mathbb{R}$ (see Chapter 1).

Definition 5.35. Let $\mu$ be a Baire measure on $\mathbb{R}$. Let $g \in L^\infty(\mathbb{R}, d\mu)$. The multiplication operator $T_{g,d\mu}$ on $L^2(\mathbb{R}, d\mu)$ is defined by
$$ (T_{g,d\mu} f)(x) = g(x) f(x). $$

We will also use the notation $T_g$ when the measure is clear from context. Conversely, we will write $T_{g(x),d\mu(x)}$ when that is needed; for instance, the conclusion (5.10) can now be written more concisely as $U^{-1} A U = T_{x, d\mu_{A,\psi}(x)}$.

Lemma 5.36. If $\mu$ is a Baire measure on $\mathbb{R}$ and $g \in L^\infty(\mathbb{R}, d\mu)$, then $T_g$ is a bounded linear operator and $\|T_g\| = \|g\|_\infty$.

Proof. For any $f \in L^2(\mathbb{R}, d\mu)$, $\int |gf|^2 \, d\mu \le \|g\|_\infty^2 \int |f|^2 \, d\mu$, so $T_g$ is bounded and $\|T_g\| \le \|g\|_\infty$. Conversely, for any $C < \|g\|_\infty$, the set $A = \{x \mid |g(x)| > C\}$ has $\mu(A) > 0$. Since $\mu$ is a Baire measure, for some $k \in \mathbb{N}$, $f = \chi_{A \cap [-k,k]}$ is a nonzero element of $L^2(\mathbb{R}, d\mu)$ and $\int |gf|^2 \, d\mu \ge C^2 \int |f|^2 \, d\mu$. It follows that $\|T_g\| \ge C$ for any $C < \|g\|_\infty$, so $\|T_g\| \ge \|g\|_\infty$.

Algebraic properties of multiplication operators are verified trivially:


Lemma 5.37. The map $g \mapsto T_{g,d\mu}$ is a homomorphism of $C^*$-algebras $L^\infty(\mathbb{R}, d\mu) \to \mathcal{L}(L^2(\mathbb{R}, d\mu))$, i.e., for any $g, h \in L^\infty(\mathbb{R}, d\mu)$ and $\lambda \in \mathbb{C}$,
$$ T_{\lambda g} = \lambda T_g, \qquad T_{g+h} = T_g + T_h, \qquad T_{gh} = T_g T_h, \qquad T_g^* = T_{\bar g}, \qquad T_1 = I. $$

Lemma 5.36 tells us, in particular, that $T_g = 0$ if and only if $g = 0$ $\mu$-a.e. Combining this principle with algebraic properties leads to further criteria:

Lemma 5.38. $T_g$ is self-adjoint if and only if $g$ is real-valued $\mu$-a.e.

Proof. $T_g$ is self-adjoint if and only if $T_g - T_g^* = 0$. By the calculation $T_g - T_g^* = T_g - T_{\bar g} = T_{g - \bar g}$, this is equivalent to $\|g - \bar g\|_\infty = 0$.

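Invertibility of $T_g - z$ is related to division by $g - z$, which leads us to consider whether $|g - z|$ has a lower bound that holds $\mu$-a.e. This leads to the notion of essential range of $g$ with respect to $\mu$, which is defined as
$$ \operatorname{Ran}_\mu g = \big\{ z \in \mathbb{C} \mid \mu(\{x \mid |g(x) - z| < \epsilon\}) > 0 \text{ for all } \epsilon > 0 \big\}. $$
Properties of the essential range are collected in the following lemma.

Lemma 5.39. For any Baire measure $\mu$ on $\mathbb{R}$ and Borel function $g$:

(a) $\operatorname{Ran}_\mu g$ is the support of the pushforward of $\mu$ by $g$;

(b) $\operatorname{Ran}_\mu g$ is the smallest closed set $E \subset \mathbb{C}$ such that $g \in E$ holds $\mu$-a.e.;

(c) $\operatorname{Ran}_\mu g$ is bounded if and only if $g \in L^\infty(\mathbb{R}, d\mu)$, and in that case, $\max\{|z| \mid z \in \operatorname{Ran}_\mu g\} = \|g\|_\infty$;

(d) If $g$ is continuous, then $\operatorname{Ran}_\mu g = \overline{\{g(x) \mid x \in \operatorname{supp} \mu\}}$. If, in addition, $\mu$ is compactly supported, then $\operatorname{Ran}_\mu g = \{g(x) \mid x \in \operatorname{supp} \mu\}$.

Proof. (a) Denote by $\nu$ the pushforward of $\mu$ by $g$ (Lemma 1.55). By definition, $z \in \operatorname{Ran}_\mu g$ if and only if $\mu(g^{-1}(D_\epsilon(z))) > 0$ for all $\epsilon > 0$. Equivalently, $z \in \operatorname{Ran}_\mu g$ if and only if $\mu(g^{-1}(U)) > 0$ for all open sets $U$ which contain $z$, i.e., $z \in \operatorname{Ran}_\mu g$ if and only if $z \in \operatorname{supp} \nu$.

(b) This follows from (a) by Lemma 1.42.

(c) If $g \in L^\infty(\mathbb{R}, d\mu)$, then $|g| \le \|g\|_\infty$ holds $\mu$-a.e., so by (b), $\operatorname{Ran}_\mu g \subset \{z \in \mathbb{C} \mid |z| \le \|g\|_\infty\}$. This shows $M \le \|g\|_\infty$ where $M = \max\{|z| \mid z \in \operatorname{Ran}_\mu g\}$. Conversely, since $g \in \operatorname{Ran}_\mu g$ holds $\mu$-a.e., then $|g| \le M$ holds $\mu$-a.e., so $\|g\|_\infty \le M$.

(d) Denote $A = \{g(x) \mid x \in \operatorname{supp} \mu\}$. Since $\overline{A}$ is closed and $g \in \overline{A}$ $\mu$-a.e., $\operatorname{Ran}_\mu g \subset \overline{A}$. Conversely, for any open $U \subset \mathbb{C}$ which intersects $A$, $g^{-1}(U)$ is open and intersects $\operatorname{supp} \mu$ so $\mu(g^{-1}(U)) > 0$. Thus, $A \subset \operatorname{Ran}_\mu g$. Since $A \subset \operatorname{Ran}_\mu g \subset \overline{A}$ and $\operatorname{Ran}_\mu g$ is closed, $\operatorname{Ran}_\mu g = \overline{A}$. If $\mu$ is compactly supported, $A$ is compact as the continuous image of a compact set.

We can now describe the spectrum and resolvent of a multiplication operator.

Proposition 5.40. $\sigma(T_{g,d\mu}) = \operatorname{Ran}_\mu g$, and for $z \notin \sigma(T_{g,d\mu})$, the resolvent of $T_{g,d\mu}$ at $z$ is
$$ (T_{g,d\mu} - z)^{-1} = T_{1/(g-z),d\mu}. $$

Proof. If $\frac{1}{g-z} \in L^\infty(\mathbb{R}, d\mu)$, the operator $T_{1/(g-z),d\mu}$ is bounded and inverse to $T_{g,d\mu} - z = T_{g-z,d\mu}$. Conversely, if $\frac{1}{g-z} \notin L^\infty(\mathbb{R}, d\mu)$, then for every $\epsilon > 0$, the set $A = \{x \mid |g(x) - z| < \epsilon\}$ has $\mu(A) > 0$. Since $\mu$ is a Baire measure, there exists $k \in \mathbb{N}$ such that $f = \chi_{A \cap [-k,k]}$ is a nonzero element of $L^2(\mathbb{R}, d\mu)$ and
$$ \int |(g - z) f|^2 \, d\mu \le \epsilon^2 \int |f|^2 \, d\mu. $$
Thus, $T_g - z$ cannot have a bounded inverse.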
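For a measure consisting of finitely many point masses, the essential range and the resolvent norm can be computed directly, which makes the role of $\mu$-null sets visible. The sketch below is an illustration under that assumption only; the function names, the measure `atoms`, and the function `g` are hypothetical choices, not notation from the text.

```python
def essential_range(g, atoms):
    """Essential range of g for a finitely supported measure.

    atoms maps each location x to its weight mu({x}) >= 0; locations of
    zero weight are mu-null and therefore do not contribute.
    """
    return sorted({g(x) for x, w in atoms.items() if w > 0})


def resolvent_norm(g, atoms, z):
    """Norm of (T_g - z)^{-1} = T_{1/(g - z)}, valid for z outside the essential range.

    For a finitely supported measure the L^infinity norm of 1/(g - z) is the
    maximum of 1/|g(x) - z| over the points of positive mass.
    """
    return max(1 / abs(v - z) for v in essential_range(g, atoms))


def g(x):
    return x * x


# Illustrative measure: point masses at -1 and 2, and a zero-weight point at 0.
atoms = {-1.0: 0.5, 0.0: 0.0, 2.0: 1.5}

print(essential_range(g, atoms))        # values of g on points of positive mass
print(resolvent_norm(g, atoms, 0))      # z = g(0) is allowed: the point 0 is mu-null
print(resolvent_norm(g, atoms, 3.5))    # equals 1 / dist(z, essential range)
```

Note that $z = 0 = g(0)$ lies in the range of $g$ but not in the essential range, so $T_g - 0$ is still invertible; this is the distinction that the definition of $\operatorname{Ran}_\mu g$ is designed to capture.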

We single out an important special case in which we can very explicitly describe the spectrum and compute the norm of the resolvents; this will be used in the next section to compute the norm of the resolvent for an arbitrary self-adjoint operator!

Corollary 5.41. For any Baire measure $\mu$ on $\mathbb{R}$, $\sigma(T_{x,d\mu(x)}) = \operatorname{supp} \mu$ and for $z \in \mathbb{C} \setminus \operatorname{supp} \mu$,
$$ \|(T_{x,d\mu(x)} - z)^{-1}\| = \Big\| \frac{1}{x - z} \Big\|_{L^\infty(d\mu)} = \frac{1}{\operatorname{dist}(z, \operatorname{supp} \mu)}. $$

Proof. As in the previous proof, the resolvent is equal to $T_{(x-z)^{-1},d\mu(x)}$ and exists precisely when $(x - z)^{-1} \in L^\infty(\mathbb{R}, d\mu)$. Moreover, by pushforwards, the function $g(x) = (x - z)^{-1}$ has $\|g\|_\infty = 1/\operatorname{dist}(z, \operatorname{supp} \mu)$.

The following proposition gives criteria for a sequence of multiplication operators to converge in norm or in the sense of strong operator convergence.

Proposition 5.42. Let $\mu$ be a Baire measure on $\mathbb{R}$. Consider functions $g_n \in L^\infty(\mathbb{R}, d\mu)$, $n \in \mathbb{N} \cup \{\infty\}$.

(a) If $g_n \to g_\infty$ in $L^\infty(\mathbb{R}, d\mu)$, then $T_{g_n} \to T_{g_\infty}$.

(b) If $g_n$ are uniformly bounded, i.e.,
$$ \sup_{n \in \mathbb{N}} \|g_n\|_\infty < \infty, \tag{5.11} $$
and $\lim_{n \to \infty} g_n(x) = g_\infty(x)$ for $\mu$-a.e. $x$, then $T_{g_n} \xrightarrow{s} T_{g_\infty}$.


Proof. (a) This is immediate from $\|T_{g_n} - T_{g_\infty}\| = \|g_n - g_\infty\|_\infty$.

(b) Denote by $C$ the supremum in (5.11). For all $f \in L^2(\mathbb{R}, d\mu)$, by dominated convergence with dominating function $4C^2 |f|^2$,
$$ \lim_{n \to \infty} \|T_{g_n} f - T_{g_\infty} f\|^2 = \lim_{n \to \infty} \int |g_n f - g_\infty f|^2 \, d\mu = 0. $$
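The gap between parts (a) and (b) can be seen in a simple case. The following example is an illustration chosen here (Lebesgue measure on $[0,1]$ and indicator functions of shrinking intervals), not one taken from the text.

```latex
% Setting: d\mu = \chi_{[0,1]}(x)\,dx, \quad g_n = \chi_{[0,1/n]}, \quad g_\infty = 0.
% Pointwise: g_n(x) \to 0 for every x \ne 0, hence \mu-a.e., and \|g_n\|_\infty = 1 for all n.
\|T_{g_n} f\|^2 = \int_0^{1/n} |f(x)|^2 \, dx \;\longrightarrow\; 0
  \quad \text{for every } f \in L^2(\mathbb{R}, d\mu),
  \qquad \text{so } T_{g_n} \xrightarrow{s} 0 .
% On the other hand, by Lemma 5.36,
\|T_{g_n} - T_{g_\infty}\| = \|g_n - g_\infty\|_\infty = 1 \quad \text{for all } n,
% so T_{g_n} does not converge to 0 in operator norm.
```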

5.6. Spectral theorem on the entire Hilbert space

In Theorem 5.33, we described the action of $A$ on a cyclic subspace defined by some vector $\psi \in \mathcal{H}$. In this section, we extend that to describe the action of $A$ on the entire Hilbert space. We prove that any bounded self-adjoint operator is unitarily equivalent to a direct sum of multiplication operators, i.e., that there exists a sequence $(\mu_n)_{n=1}^N$ of probability measures such that
$$ A \cong \bigoplus_{n=1}^N T_{x,d\mu_n(x)}. $$

Recall that our notation for direct sums $\bigoplus_{n=1}^N$ allows $N$ finite or $\infty$, and $N = \infty$ denotes a countable sum. The key ingredient in the proof is a decomposition of the Hilbert space as a direct sum of cyclic subspaces.

Definition 5.43. Let $\mathcal{H}$ be a separable Hilbert space and $A$ a self-adjoint operator. A spectral basis for $A$ is a (finite or infinite) sequence $(\psi_n)_{n=1}^N$ of normalized vectors such that
$$ \mathcal{H} = \bigoplus_{n=1}^N \mathcal{C}_A(\psi_n). \tag{5.12} $$

Example 5.44. If $A$ is a compact self-adjoint operator, any orthonormal basis of eigenvectors of $A$ is a spectral basis for $A$.

Lemma 5.45. Any self-adjoint operator on a separable Hilbert space has a spectral basis. This sequence can be chosen to begin with an arbitrary normalized vector $\psi_1$.

To motivate the proof, recall that any cyclic subspace $\mathcal{C}_A(\psi)$ is invariant for $A$ and that $\mathcal{C}_A(\psi)^\perp$ is an invariant subspace for $A^* = A$. This suggests an appealing, naive approach: start with some $\psi_1$, then pick arbitrary $\psi_2 \in \mathcal{C}_A(\psi_1)^\perp$, $\psi_3 \in (\mathcal{C}_A(\psi_1) \oplus \mathcal{C}_A(\psi_2))^\perp$, and so on. However, the sequence chosen in this way may not cover the entire space $\mathcal{H}$. To ensure that it does, we modify the construction so that $\phi_n \in \mathcal{C}_A(\psi_1) \oplus \dots \oplus \mathcal{C}_A(\psi_n)$ for all $n$, where $(\phi_n)_{n=1}^N$ is a fixed orthonormal basis of $\mathcal{H}$.

Proof. Since $\mathcal{H}$ is separable, it has a finite or countable orthonormal basis $(\phi_n)_{n=1}^N$. We will define $(\psi_n)_{n=1}^N$ inductively: let $\psi_1 = \phi_1$ and let $\psi_n$ be the orthogonal projection of $\phi_n$ onto
$$ V_{n-1} = \bigcap_{k=1}^{n-1} \mathcal{C}_A(\psi_k)^\perp. $$

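Since $\psi_n \in \mathcal{C}_A(\psi_k)^\perp$ for $k < n$, Lemma 5.32 implies that $\mathcal{C}_A(\psi_n) \subset \mathcal{C}_A(\psi_k)^\perp$ for $k < n$. Thus, the cyclic subspaces $\mathcal{C}_A(\psi_n)$ are mutually orthogonal. Now the construction implies that
$$ \psi_n - \phi_n \in V_{n-1}^\perp = \bigoplus_{k=1}^{n-1} \mathcal{C}_A(\psi_k), $$
so, by induction in $n$,
$$ \phi_n \in \bigoplus_{k=1}^{n} \mathcal{C}_A(\psi_k). $$
Since $(\phi_n)_{n=1}^N$ is an orthonormal basis of $\mathcal{H}$, it follows that
$$ \mathcal{H} = \bigoplus_{n=1}^N \mathcal{C}_A(\psi_n). $$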

The proof is completed by discarding from the sequence all vectors $\psi_n$ which are equal to $0$ (relabeling the sequence and changing the value of $N$ in the process) and normalizing all the nonzero vectors.

Of course, the spectral basis is not unique: different spectral bases of the same operator may not even be of the same cardinality (Exercise 5.18).

Now that we know that every self-adjoint operator has a spectral basis, we can prove the spectral theorem.

Theorem 5.46 (Spectral theorem for self-adjoint operators). Let $A$ be a bounded self-adjoint operator on a separable Hilbert space $\mathcal{H}$, and let $(\psi_n)_{n=1}^N$ be a spectral basis for $A$. Denoting $\mu_n = \mu_{A,\psi_n}$, there exists a unitary map
$$ U : \bigoplus_{n=1}^N L^2(\mathbb{R}, d\mu_n) \to \mathcal{H} \tag{5.13} $$

such that
$$ U^{-1} A U = \bigoplus_{n=1}^N T_{x,d\mu_n(x)}. \tag{5.14} $$

Proof. Applying Theorem 5.33 to each $\psi_n$, there exist measures $\mu_n$ and unitaries $U_n : L^2(\mathbb{R}, d\mu_n) \to \mathcal{C}_A(\psi_n)$ such that $U_n^{-1} A U_n = T_{x,d\mu_n(x)}$ for all $n$. Since $A$ is the direct sum of its restrictions to $\mathcal{C}_A(\psi_n)$, (5.14) holds for the unitary $U = \bigoplus_{n=1}^N U_n$.
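A small finite-dimensional case shows how the construction of a spectral basis works and why the number of vectors in it is not canonical. The matrix and vectors below are illustrative choices, not taken from the text.

```latex
% Setting: \mathcal{H} = \mathbb{C}^3 with standard basis e_1, e_2, e_3 and
A = \operatorname{diag}(1, 1, 2).
% First choice of spectral basis (N = 2):
\psi_1 = \tfrac{1}{\sqrt 2}(e_1 + e_3), \qquad
A\psi_1 = \tfrac{1}{\sqrt 2}(e_1 + 2 e_3), \qquad
\mathcal{C}_A(\psi_1) = \operatorname{span}\{e_1, e_3\},
\qquad
\psi_2 = e_2, \qquad \mathcal{C}_A(\psi_2) = \operatorname{span}\{e_2\}.
% By Example 5.29 and linearity in the eigen-decomposition of \psi_1,
\mu_{A,\psi_1} = \tfrac12 \delta_1 + \tfrac12 \delta_2, \qquad \mu_{A,\psi_2} = \delta_1,
\qquad
A \cong T_{x, d\mu_{A,\psi_1}(x)} \oplus T_{x, d\mu_{A,\psi_2}(x)} .
% Second choice of spectral basis (N = 3): the eigenvectors e_1, e_2, e_3,
% with spectral measures \delta_1, \delta_1, \delta_2.
```

Both choices represent the same operator; the first packs two distinct eigenvalues into one cyclic subspace, while the repeated eigenvalue $1$ forces at least two cyclic subspaces in any spectral basis.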


The spectral representations (5.13) and (5.14) can be used to study the spectrum and resolvent of $A$. This provides a significant improvement over (5.1) and the first application of the spectral theorem: where before we were only able to give an upper bound on the norm of the resolvent for nonreal $z$, the spectral theorem allows us to compute the norm for any $z \in \mathbb{C} \setminus \sigma(A)$.

Proposition 5.47. If $A$ has the spectral representations (5.13) and (5.14), then
$$ \sigma(A) = \overline{\bigcup_{n=1}^N \operatorname{supp} \mu_n}. $$
Moreover, for all $z \in \mathbb{C} \setminus \sigma(A)$,
$$ U^{-1} (A - z)^{-1} U = \bigoplus_{n=1}^N T_{(x-z)^{-1},d\mu_n(x)} $$
and
$$ \|(A - z)^{-1}\| = \frac{1}{\operatorname{dist}(z, \sigma(A))}. \tag{5.15} $$

Proof. The proof is based on the observation that conjugation by a unitary map does not affect invertibility or the norm of the inverse. Thus, it suffices to study the direct sum of multiplication operators.

By Corollary 5.41, $T_{x,d\mu_n(x)} - z$ is invertible if $z \notin \operatorname{supp} \mu_n$ and, in that case,
$$ \|(T_{x,d\mu_n(x)} - z)^{-1}\| = \frac{1}{\operatorname{dist}(z, \operatorname{supp} \mu_n)}. $$
It follows that
$$ \sup_n \|(T_{x,d\mu_n(x)} - z)^{-1}\| = \sup_n \frac{1}{\operatorname{dist}(z, \operatorname{supp} \mu_n)} = \frac{1}{\operatorname{dist}\big(z, \bigcup_{n=1}^N \operatorname{supp} \mu_n\big)}, $$
and the proof is then completed by Proposition 4.39.

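Corollary 5.48. For any $z \in \mathbb{C}$,
$$ \operatorname{dist}(z, \sigma(A)) = \inf_{\substack{u \in \mathcal{H} \\ \|u\| = 1}} \|(A - z) u\|. \tag{5.16} $$

Proof. For $z \in \sigma(A)$, this is precisely Weyl's criterion. For $z \notin \sigma(A)$, it follows from (5.15).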


5.7. Borel functional calculus

In this section, we present a consistent way of defining operators $g(A)$, where $A \in \mathcal{L}(\mathcal{H})$ is self-adjoint and $g : \sigma(A) \to \mathbb{C}$ is a bounded Borel function. This will vastly generalize the definition of $p(A)$ for polynomials $p$.

We denote by $\mathcal{B}_b(\sigma(A))$ the set of bounded Borel functions on $\sigma(A)$, previously discussed in Section 1.7. This is a $C^*$-algebra, with addition and multiplication defined pointwise, with complex conjugation in the role of taking adjoints, and with the supremum norm. The Borel functional calculus will be a homomorphism of $C^*$-algebras which preserves a certain kind of convergence of sequences.

Theorem 5.49 (Borel functional calculus). For any bounded self-adjoint operator $A$ on $\mathcal{H}$, there is a unique map $\Phi_A : \mathcal{B}_b(\sigma(A)) \to \mathcal{L}(\mathcal{H})$ such that the following hold.

(a) $\Phi_A$ is an algebraic homomorphism, i.e., it is linear, preserves multiplication, $\Phi_A(1) = I$, and $\Phi_A(\bar g) = \Phi_A(g)^*$.

(b) If $g$ is the identity map $g(x) = x$, then $\Phi_A(g) = A$.

(c) If $g_k \to g_\infty$ pointwise and
$$ \sup_{k \in \mathbb{N}} \sup_{x \in \sigma(A)} |g_k(x)| < \infty, $$
then $\Phi_A(g_k) \xrightarrow{s} \Phi_A(g_\infty)$.

Existence is shown by the following construction:

Lemma 5.50. Let $A$ be a self-adjoint operator with a spectral representation (5.13) and (5.14). If the operator $\Phi(g)$ is defined, for $g \in \mathcal{B}_b(\sigma(A))$, by
$$ U^{-1} \Phi(g) U = \bigoplus_{n=1}^N T_{g(x),d\mu_n(x)}, $$
then $\Phi$ has the properties (a), (b), and (c) from Theorem 5.49. Moreover, for any $g \in \mathcal{B}_b(\sigma(A))$,
$$ \|\Phi(g)\| = \sup_n \|g\|_{L^\infty(d\mu_n)}. \tag{5.17} $$

Proof. Property (a) of Theorem 5.49 follows from properties of multiplication operators, and (b) follows from the spectral representation since $\bigoplus_{n=1}^N T_{x,d\mu_n(x)} = U^{-1} A U$.


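By Proposition 5.42, $T_{g_k,d\mu_n} \xrightarrow{s} T_{g_\infty,d\mu_n}$ for each $n$. Since $g_k$ are uniformly bounded on $\sigma(A) \supset \operatorname{supp} \mu_n$, by Proposition 4.43,
$$ \bigoplus_{n=1}^N T_{g_k,d\mu_n} \xrightarrow{s} \bigoplus_{n=1}^N T_{g_\infty,d\mu_n}, $$
and conjugating by $U$ gives $\Phi(g_k) \xrightarrow{s} \Phi(g_\infty)$, which proves (c). Finally, (5.17) follows from $\|\Phi(g)\| = \sup_n \|T_{g(x),d\mu_n(x)}\|$.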

The proof of uniqueness is based on a criterion for a subalgebra of $\mathcal{B}_b(X)$ to be equal to $\mathcal{B}_b(X)$ (Proposition 1.92), refined to a compact $X \subset \mathbb{R}$ as follows:

Proposition 5.51. Let $X \subset \mathbb{R}$ be compact. Let $\mathcal{M}$ be a subalgebra of $\mathcal{B}_b(X)$, i.e., closed under addition, scalar multiplication, multiplication, and $1 \in \mathcal{M}$. If $\mathcal{M}$ is closed under pointwise convergence of uniformly bounded sequences, the following are equivalent:

(a) The function $g(x) = x$ is in $\mathcal{M}$.

(b) $C(X) \subset \mathcal{M}$.

(c) $\chi_B \in \mathcal{M}$ for all Borel sets $B$.

(d) $\mathcal{M} = \mathcal{B}_b(X)$.

Proof. (a) $\implies$ (b): Since $1, x \in \mathcal{M}$ and $\mathcal{M}$ is a subalgebra, $\mathcal{M}$ contains all polynomials. Note that $\mathcal{M}$ is also closed under uniform convergence; thus, by density of polynomials in $C(X)$ (Weierstrass's theorem), $C(X) \subset \mathcal{M}$.

(b) $\implies$ (c) and (c) $\implies$ (d): These follow from Proposition 1.92.

(d) $\implies$ (a): This is trivial.

Proof of Theorem 5.49. Existence was proved in Lemma 5.50. Assume that $\Phi_1, \Phi_2$ obey the properties (a), (b), and (c) of Theorem 5.49 and denote
$$ \mathcal{M} = \{g \in \mathcal{B}_b(\sigma(A)) \mid \Phi_1(g) = \Phi_2(g)\}. $$
This is a subalgebra of $\mathcal{B}_b(\sigma(A))$ which contains the identity map $g(x) = x$ and is closed under pointwise convergence of uniformly bounded sequences, so $\mathcal{M} = \mathcal{B}_b(\sigma(A))$.

Due to the uniqueness of the Borel functional calculus, we will write $g(A)$ instead of $\Phi_A(g)$ from now on. Uniqueness tells us, in particular, that the construction in Lemma 5.50 gives the same operators, regardless of the spectral representation, so we can choose a spectral representation which is convenient for a given argument. This trick will be used below.

We used spectral measures to construct the functional calculus, but functional calculus can also be used to express spectral measures:


Corollary 5.52. Let $A$ be a self-adjoint operator on $\mathcal{H}$. Let $g \in \mathcal{B}_b(\sigma(A))$ and $\psi \in \mathcal{H}$. Then $g(A)\psi \in \mathcal{C}_A(\psi)$ and
$$ \langle \psi, g(A) \psi \rangle = \int g(x) \, d\mu_{A,\psi}(x). $$

Proof. Consider the spectral representation with respect to a spectral basis $(\psi_n)_{n=1}^N$ which has $\psi$ as the first basis vector, i.e., $\psi_1 = \psi$. By the construction of the unitary map $U$, the map $U^{-1}$ maps $\mathcal{C}_A(\psi)$ to the set of vectors $(F_n)_{n=1}^N$ with $F_n = 0$ for all $n \ne 1$. In particular, $U^{-1} \psi = (f_n)_{n=1}^N$ with $f_1 = 1$ and $f_n = 0$ for $n \ne 1$. Thus, $U^{-1} g(A) \psi = (g f_n)_{n=1}^N$, and $g f_n = 0$ for $n \ne 1$ precisely means that $g(A)\psi \in \mathcal{C}_A(\psi)$. Moreover,
$$ \langle \psi, g(A) \psi \rangle = \Big\langle U^{-1} \psi, \bigoplus_{n=1}^N T_{g(x),d\mu_n(x)} U^{-1} \psi \Big\rangle = \langle 1, T_{g(x),d\mu_1(x)} 1 \rangle = \int g(x) \, d\mu_1(x). $$
This completes the proof, since $\mu_1$ is the spectral measure for $\psi_1 = \psi$.

Many further identities follow from properties of the functional calculus. For instance, the functional calculus includes resolvents in a natural way:

Lemma 5.53. For any $z \in \mathbb{C} \setminus \sigma(A)$, the function $g(x) = (x - z)^{-1}$ is contained in $\mathcal{B}_b(\sigma(A))$ and $g(A) = (A - z)^{-1}$.

Proof. Since $|g(x)| \le 1/\operatorname{dist}(z, \sigma(A))$ for all $x \in \sigma(A)$, $g$ is bounded on $\sigma(A)$. Since $g(x)(x - z) = (x - z) g(x) = 1$ for all $x \in \sigma(A)$ and the Borel functional calculus is an algebraic homomorphism,
$$ g(A)(A - z) = (A - z) g(A) = I, $$
so $g(A)$ is the resolvent for $A$ at $z$.

Through the formula (5.17), this gives another way to compute the norm of $(A - z)^{-1}$ as (5.15). Moreover:

Corollary 5.54. For any $z \in \mathbb{C} \setminus \sigma(A)$ and $\psi \in \mathcal{H}$,
$$ \langle \psi, (A - z)^{-1} \psi \rangle = \int \frac{1}{x - z} \, d\mu_{A,\psi}(x). \tag{5.18} $$

Proof. This follows from Corollary 5.52 applied to $g(A) = (A - z)^{-1}$.

The connection (5.18) between resolvents of A and spectral measures has a central place in spectral theory, as will be seen in later chapters.


Since the functional calculus is explicitly constructed in terms of multiplication operators, further properties are straightforward to derive; see, e.g., the spectral mapping theorem for continuous functions (Exercise 5.21), which generalizes that for polynomials.

We discussed uniqueness of the entire Borel functional calculus, but sometimes a single function of $A$ can be uniquely characterized by a natural set of properties. For $A \ge 0$, the function $g(x) = \sqrt{x}$ is defined on $\sigma(A)$, so $g(A) = \sqrt{A}$ is well defined by the Borel functional calculus. The square root lemma (Exercise 5.24) gives a set of properties which describe it uniquely.

The Borel functional calculus can be used to solve the initial value problem for a function $\psi : \mathbb{R} \to \mathcal{H}$ given by
$$ i \partial_t \psi(t) = A \psi(t), \qquad \psi(0) = \psi_0. \tag{5.19} $$

This has the physical interpretation as a Schrödinger equation with a time-independent Hamiltonian $A$. Formally, it resembles the scalar initial value problem $i f' = \lambda f$, $f(0) = f_0$, which has the solution $f(t) = e^{-i\lambda t} f_0$. This motivates:

Lemma 5.55. If $A$ is a bounded self-adjoint operator on $\mathcal{H}$, then $U(t) = e^{-itA}$ are unitary operators for all $t \in \mathbb{R}$. They obey $U(t + s) = U(t) U(s)$ for all $t, s \in \mathbb{R}$ and $U(0) = I$. As a function of $t$, $U(t)$ is norm-differentiable and $iU' = AU$ in the sense that, taking limits in $\mathcal{L}(\mathcal{H})$,
$$ i \lim_{s \to t} \frac{U(s) - U(t)}{s - t} = A U(t). \tag{5.20} $$
In particular, for any $\psi_0 \in \mathcal{H}$, the family $\psi(t) = U(t) \psi_0$ solves (5.19).

Proof. From $\overline{e^{-itx}} = 1/e^{-itx}$, we conclude $U(t)^* = U(t)^{-1}$. Other properties follow from $e^{-i(t+s)x} = e^{-itx} e^{-isx}$ and $e^{-i0x} = 1$. To prove differentiability, fix $t$ and denote
$$ f(s, x) = i \frac{e^{-isx} - e^{-itx}}{s - t} - x e^{-itx} = \frac{x}{|s - t|} \int_{[t,s]} (e^{-iux} - e^{-itx}) \, du. $$
By using a Lipschitz estimate $|e^{-iux} - e^{-itx}| \le |iux - itx|$ and integrating,
$$ |f(s, x)| \le \frac{|x|}{|s - t|} \int_{[t,s]} |iux - itx| \, du \le \frac{|(s - t) x^2|}{2}, $$
so by the functional calculus,
$$ \Big\| i \frac{U(s) - U(t)}{s - t} - A U(t) \Big\| \le \frac{\|A\|^2 |s - t|}{2}, $$
which implies the norm convergence (5.20).
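The properties in Lemma 5.55 can be checked numerically in a $2 \times 2$ case where the functional calculus is explicit. The sketch below is illustrative only: it assumes the symmetric matrix with rows $(2, 1)$ and $(1, 2)$, whose eigenvalues are $1$ and $3$, and builds $e^{-itA} = e^{-it} P_1 + e^{-3it} P_3$ from the eigenprojections $P_1, P_3$. The helper names and the sample values of $t$, $s$, $h$ are hypothetical.

```python
import cmath

# Eigenprojections of A = [[2, 1], [1, 2]] (illustrative choice of operator).
P1 = [[0.5, -0.5], [-0.5, 0.5]]   # projection onto the eigenspace of eigenvalue 1
P3 = [[0.5, 0.5], [0.5, 0.5]]     # projection onto the eigenspace of eigenvalue 3
A = [[2, 1], [1, 2]]
I2 = [[1, 0], [0, 1]]


def U(t):
    """e^{-itA} computed through the functional calculus g(A) = g(1) P1 + g(3) P3."""
    a, b = cmath.exp(-1j * t), cmath.exp(-3j * t)
    return [[a * P1[i][j] + b * P3[i][j] for j in range(2)] for i in range(2)]


def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def adjoint(X):
    return [[X[j][i].conjugate() for j in range(2)] for i in range(2)]


def dist(X, Y):
    """Largest entrywise difference, used here as a simple error measure."""
    return max(abs(X[i][j] - Y[i][j]) for i in range(2) for j in range(2))


t, s, h = 0.7, -1.3, 1e-6
group_error = dist(matmul(U(t), U(s)), U(t + s))        # U(t)U(s) = U(t+s)
unitary_error = dist(matmul(adjoint(U(t)), U(t)), I2)   # U(t)^* U(t) = I
quotient = [[1j * (U(t + h)[i][j] - U(t)[i][j]) / h for j in range(2)]
            for i in range(2)]
derivative_error = dist(quotient, matmul(A, U(t)))      # i U'(t) = A U(t)

print(group_error, unitary_error, derivative_error)
```

The difference quotient is only expected to match $AU(t)$ up to an error of order $\|A\|^2 h / 2$, in line with the estimate in the proof, so the last number is small but not at machine precision.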


5.8. Spectral theorem for unitary operators

So far, unitary maps have appeared mostly as a way to communicate the equivalence of certain objects. In this section, we change the perspective and consider unitary operators $W \in \mathcal{L}(\mathcal{H})$ in their own right. We will describe elements of their spectral theory, with close parallels to self-adjoint operators. Some steps will be left as exercises.

By Lemma 4.5, $W \in \mathcal{L}(\mathcal{H})$ is unitary if and only if $WW^* = W^*W = I$. In other words, $W$ is unitary if and only if it is invertible and $W^{-1} = W^*$. This can be compared with self-adjoint operators, which obey $A = A^*$.

In order to obtain a version of the spectral theorem, we will need a decomposition of $\mathcal{H}$ as a direct sum of cyclic subspaces; for this, we need a notion of cyclic subspace such that both the subspace and its orthogonal complement are invariant for $W$. Equivalently, the cyclic subspace should be invariant for $W$ and $W^* = W^{-1}$, so we allow negative powers of $W$:

Definition 5.56. Let $W \in \mathcal{L}(\mathcal{H})$ be a unitary operator. The cyclic subspace generated by $\psi \in \mathcal{H}$ is
$$ \mathcal{C}_W(\psi) = \overline{\operatorname{span}}\{W^k \psi \mid k \in \mathbb{Z}\}. $$

Another difference from the self-adjoint case is that the role of $\mathbb{R}$ is replaced by the unit circle $\partial\mathbb{D} = \{z \in \mathbb{C} \mid |z| = 1\}$. It is notationally convenient to parametrize $z = e^{i\theta}$. The following theorem examines the notion of spectral measure; here, too, it is natural to include negative powers because trigonometric polynomials are dense in $C(\partial\mathbb{D})$ (Corollary 2.21):

Theorem 5.57. Let $W$ be a unitary operator on $\mathcal{H}$ and let $\psi \in \mathcal{H}$. Then there exists a unique positive Borel measure $\mu$ on $\partial\mathbb{D}$ such that
$$ \langle \psi, W^k \psi \rangle = \int_{\partial\mathbb{D}} e^{ik\theta} \, d\mu(\theta), \qquad \forall k \in \mathbb{Z}. \tag{5.21} $$
The measure satisfies $\mu(\partial\mathbb{D}) = \|\psi\|^2$. Moreover, there exists a unitary operator $U : L^2(\partial\mathbb{D}, d\mu_{W,\psi}) \to \mathcal{C}_W(\psi)$ such that for all $f \in L^2(\partial\mathbb{D}, d\mu)$,
$$ (U^{-1} W U f)(e^{i\theta}) = e^{i\theta} f(e^{i\theta}). \tag{5.22} $$

Definition 5.58. The measure that obeys (5.21) is called the spectral measure for the vector $\psi$ and operator $W$ and is denoted by $\mu_{W,\psi}$ or $\mu_\psi$.

The proof requires the following lemma.

Lemma 5.59 (Fejér–Riesz). Let $f$ be a Laurent polynomial, i.e., $f(z) = \sum_{k=m}^n c_k z^k$ with $m, n \in \mathbb{Z}$. If $f(z) \ge 0$ for all $z \in \partial\mathbb{D}$, then there exists a polynomial $P$ such that
$$ f(z) = P(z) \overline{P(1/\bar z)}. \tag{5.23} $$


Proof. Since $f(z)$ and $\overline{f(1/\bar z)}$ are analytic functions of $z$ which coincide on $\partial\mathbb{D}$, they must be equal, i.e.,
$$ f(z) = \overline{f(1/\bar z)}. \tag{5.24} $$
Writing $f$ in the form $f(z) = z^m Q(z)$ with $Q$ a polynomial, we see that $f$ can be decomposed as a product of linear factors,
$$ f(z) = a z^m \prod_{k=1}^K (z - z_k)^{j_k}, \tag{5.25} $$
where the $z_k$ are distinct and $j_k$ are their multiplicities. Substituting this on both sides of (5.24), we see that for every zero $z_k$ of $f$, $1/\bar z_k$ is a zero of the same multiplicity. Since $f(z)$ has constant sign on $\partial\mathbb{D}$, zeros on $\partial\mathbb{D}$ have even multiplicity. Thus, one can take $P$ to be a constant $b$ times the product of $(z - z_k)^{i_k}$, where
$$ i_k = \begin{cases} j_k, & |z_k| < 1 \\ j_k/2, & |z_k| = 1 \\ 0, & |z_k| > 1. \end{cases} $$
For a suitable choice of $b$, we obtain a polynomial such that (5.23) holds.
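Two small cases illustrate the recipe for $P$ in the proof above: one with a zero on the unit circle and one with a pair of zeros $z_k$, $1/\bar z_k$ on either side of it. Both Laurent polynomials are illustrative choices, not examples from the text.

```latex
% Example 1: f(z) = z^{-1} + 2 + z, so f(e^{i\theta}) = 2 + 2\cos\theta \ge 0.
f(z) = z^{-1} (z + 1)^2 : \quad \text{a double zero at } z_1 = -1 \in \partial\mathbb{D},
\quad i_1 = j_1 / 2 = 1, \quad P(z) = z + 1,
\qquad
P(z)\,\overline{P(1/\bar z)} = (z + 1)(z^{-1} + 1) = z^{-1} + 2 + z .
%
% Example 2: f(z) = 2 z^{-1} + 5 + 2 z, so f(e^{i\theta}) = 5 + 4\cos\theta > 0.
f(z) = 2 z^{-1} (z + 2)\big(z + \tfrac12\big) : \quad
\text{zeros } -\tfrac12 \ (\text{inside}) \text{ and } -2 = 1/\overline{(-\tfrac12)} \ (\text{outside}),
\qquad
P(z) = b \big(z + \tfrac12\big) \text{ with } b = 2, \text{ i.e. } P(z) = 2z + 1,
\qquad
P(z)\,\overline{P(1/\bar z)} = (2z + 1)(2 z^{-1} + 1) = 2 z^{-1} + 5 + 2z .
```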

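Proof of Theorem 5.57. We define a linear functional $\Lambda$ on the vector space $S$ of Laurent polynomials by
$$ \Lambda\Big( \sum_{k=m}^n c_k e^{ik\theta} \Big) = \Big\langle \psi, \sum_{k=m}^n c_k W^k \psi \Big\rangle. $$
If $\sum_{k=m}^n c_k e^{ik\theta} \ge 0$ for all $\theta \in \mathbb{R}$, then by Lemma 5.59, $\sum_{k=m}^n c_k z^k = P(z) \overline{P(1/\bar z)}$. Since $W^* = W^{-1}$, this implies $\sum_{k=m}^n c_k W^k = P(W)^* P(W)$ and
$$ \Lambda\Big( \sum_{k=m}^n c_k e^{ik\theta} \Big) = \langle \psi, P(W)^* P(W) \psi \rangle = \langle P(W) \psi, P(W) \psi \rangle \ge 0. $$
Thus, $\Lambda$ is a positive linear functional on $S$. By Weierstrass's second theorem, $S$ is dense in $C(\partial\mathbb{D})$, so as in the proof of Lemma 5.28, $\Lambda$ extends to a positive linear functional on $C(\partial\mathbb{D})$. By the Riesz–Markov theorem, there exists a unique positive measure $\mu$ such that $\Lambda(f) = \int f \, d\mu$ and, in particular, (5.21) holds. Applying (5.21) with $k = 0$ implies $\mu(\partial\mathbb{D}) = \|\psi\|^2$.

We define $U$ on monomials by $U : z^k \mapsto W^k \psi$ and extend to Laurent polynomials by linearity. Viewing $S$ as a subspace of $L^2(\partial\mathbb{D}, d\mu)$, $U$ is a norm-preserving map from $S$ to $\mathcal{C}_W(\psi)$ because
$$ \Big\| U \sum_{k=m}^n c_k e^{ik\theta} \Big\|^2 = \Big\langle \sum_{k=m}^n c_k W^k \psi, \sum_{k=m}^n c_k W^k \psi \Big\rangle = \Big\langle \psi, \sum_{k=m}^n \bar c_k W^{-k} \sum_{k=m}^n c_k W^k \psi \Big\rangle $$
$$ = \Lambda\Big( \sum_{k=m}^n \bar c_k e^{-ik\theta} \sum_{k=m}^n c_k e^{ik\theta} \Big) = \int \Big| \sum_{k=m}^n c_k e^{ik\theta} \Big|^2 \, d\mu(\theta). $$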

Since $S$ is dense in $L^2(\partial\mathbb{D}, d\mu)$, this means that $U$ can be uniquely extended to a unitary map of $L^2(\partial\mathbb{D}, d\mu)$ onto $C_W(\psi)$. Finally, it is immediate that (5.22) holds for $f(z) = z^k$, so by linearity, density of Laurent polynomials, and boundedness of both sides, (5.22) holds for all $f \in L^2(\partial\mathbb{D}, d\mu)$.

By the same arguments as in the self-adjoint case, for any unitary $W$, the Hilbert space $\mathcal{H}$ can be written as a direct sum of cyclic subspaces $\mathcal{H} = \bigoplus_{n=1}^{N} C_W(\psi_n)$. Thus, starting from Theorem 5.57, the following theorem follows by the same arguments as in the self-adjoint case.

Theorem 5.60 (Spectral theorem for unitary operators). Let $W \in \mathcal{L}(\mathcal{H})$ be unitary. There exists a sequence of probability measures $(d\mu_n)_{n=1}^{N}$ on $\partial\mathbb{D}$ ($N$ may be finite or infinite) and a unitary map
$$U : \bigoplus_{n=1}^{N} L^2(\partial\mathbb{D}, d\mu_n) \to \mathcal{H} \tag{5.26}$$
such that for every $f = (f_n)_{n=1}^{N} \in \bigoplus_{n=1}^{N} L^2(\partial\mathbb{D}, d\mu_n)$,
$$(U^{-1} W U f)_n(e^{i\theta}) = e^{i\theta} f_n(e^{i\theta}). \tag{5.27}$$

A collection of measures μn together with a unitary map U as in (5.26) and (5.27) is called a spectral representation. Some further consequences are left as exercises. In particular, the Borel functional calculus can be introduced for unitary operators (Exercise 5.27).

5.9. Exercises

5.1. Prove that $\le$ is not a total order unless $\dim\mathcal{H} = 1$.

5.2. Let $P, Q$ be orthogonal projections on $\mathcal{H}$. Prove that $P \le Q$ if and only if $\operatorname{Ran} P \subset \operatorname{Ran} Q$.


5.3. If $K_n \in \mathcal{L}(\mathcal{H})$ are positive compact operators, prove that
$$K_n \xrightarrow{s} I \iff K_n^{1/2} \xrightarrow{s} I.$$
Hint: Use $I - K_n = (I + K_n^{1/2})(I - K_n^{1/2})$.

5.4. In a metric space $(X, d)$, denote by $\mathcal{F}(X)$ the set of nonempty compact subsets of $X$. Prove that the Hausdorff distance is a metric on $\mathcal{F}(X)$.

5.5. Construct a sequence of self-adjoint operators $A_n$ on $L^2([0,1], dx)$ such that $\sigma(A_n) = [0,1]$ for every $n$ and $A_n \xrightarrow{s} 0$ as $n \to \infty$.

5.6. If $A_n$, $n \in \mathbb{N} \cup \{\infty\}$, are bounded self-adjoint operators and $A_n \to A_\infty$, prove that $(A_n - z)^{-1} \to (A_\infty - z)^{-1}$ for all $z \in \mathbb{C} \setminus \mathbb{R}$.
Hint: Check and use $A_n - z = ((A_n - A_\infty)(A_\infty - z)^{-1} + I)(A_\infty - z)$.

5.7. Let $A, B$ be compact self-adjoint operators on a separable Hilbert space $\mathcal{H}$. If $A$ and $B$ commute, prove that there exists an orthonormal basis $(v_n)_{n=1}^{\dim\mathcal{H}}$ of $\mathcal{H}$ such that every $v_n$ is an eigenvector of both $A$ and $B$.
Hint: For every eigenvalue $\lambda$ of $A$, prove that $\operatorname{Ker}(A - \lambda)$ is an invariant subspace for $B$.

5.8. An operator $K$ is called normal if $KK^* = K^*K$. Let $K$ be a compact normal operator. Prove that there exists an orthonormal basis $(v_n)_{n=1}^{\dim\mathcal{H}}$ of $\mathcal{H}$ such that every $v_n$ is an eigenvector of $K$. The corresponding eigenvalues $\lambda_n$ can be complex, but if $\dim\mathcal{H} = \infty$, then $\lim_{n\to\infty} \lambda_n = 0$.
Hint: Consider the operators $A = \frac{K + K^*}{2}$, $B = \frac{K - K^*}{2i}$.

5.9. Let $v_1, \dots, v_n$ be normalized eigenvectors of a self-adjoint operator $A$ corresponding to mutually distinct eigenvalues $\lambda_1, \dots, \lambda_n$. Find the spectral measure of their linear combination $v = \sum_{j=1}^{n} \kappa_j v_j$.

5.10. Let $(v_j)_{j=1}^{\infty}$ be normalized eigenvectors of a self-adjoint operator $A$ corresponding to mutually distinct eigenvalues $(\lambda_j)_{j=1}^{\infty}$. Find the spectral measure for $v = \sum_{j=1}^{\infty} \kappa_j v_j$, where $\sum_{j=1}^{\infty} |\kappa_j|^2 < \infty$.

5.11. Let $\psi$ be an eigenvector of $A$. What is the cyclic subspace of $\psi$? What is its dimension?

5.12. Let $A$ be a Hermitian $n \times n$ matrix with $n$ distinct eigenvalues $\lambda_1, \dots, \lambda_n$. Let $v_1, \dots, v_n$ be the corresponding eigenvectors. If $\psi = \sum_{j=1}^{n} \kappa_j v_j$ with all $\kappa_j$ nonzero, prove that $\psi$ is cyclic.

5.13. If $A$ is a Hermitian $n \times n$ matrix with a cyclic vector, prove that $A$ has $n$ distinct eigenvalues.


5.14. Let $A$ be the self-adjoint operator from Examples 5.6 and 5.30. Prove that a vector $v \in L^2([0,1], dx)$ is cyclic if and only if $v(x) \ne 0$ for Lebesgue-a.e. $x \in [0,1]$.

5.15. Let $A$ be a self-adjoint operator and let $u \in \mathcal{H}$. If $v \in C_A(u)$, prove that there exists $f \in L^2(\mathbb{R}, d\mu)$ such that $d\mu_{A,v} = |f|^2\,d\mu_{A,u}$.
Hint: For $f \in L^2(\mathbb{R}, d\mu)$, find the spectral measure of $f$ with respect to $T_{x,d\mu(x)}$.

5.16. Prove that $\overline{\operatorname{span}}\{(A - z)^{-1}\psi \mid z \in \mathbb{C} \setminus \mathbb{R}\} = C_A(\psi)$.
Hint: To prove $\subset$, use Theorem 5.33 to find $(A - z)^{-1}\psi$ in $C_A(\psi)$. To prove $\supset$, use the Neumann series and extract $A^n\psi$ as suitable limits as $z \to \infty$.

5.17. Prove that the multiplication operator $T_{g,d\mu}$ is:
(a) unitary if and only if $|g(x)| = 1$ for $\mu$-a.e. $x$;
(b) a projection if and only if $g(x) \in \{0, 1\}$ for $\mu$-a.e. $x$.

5.18. If $A$ is a $2 \times 2$ matrix with distinct eigenvalues $\lambda_1, \lambda_2$, prove that $A$ has a spectral basis $(v_1, v_2)$ of cardinality 2 and a spectral basis $(v_1 + v_2)$ of cardinality 1.

5.19. If $A$ is a self-adjoint operator with an eigenvalue/eigenvector pair $A\psi = \lambda\psi$, prove that for all $g \in \mathcal{B}_b(\sigma(A))$, $g(A)\psi = g(\lambda)\psi$.

5.20. If $A$ is a self-adjoint operator and $g \in \mathcal{B}_b(\sigma(A))$ is such that $g \ge 0$ on $\sigma(A)$, prove that $g(A) \ge 0$.

5.21. Prove the spectral mapping theorem for continuous functions: If $A$ is a bounded self-adjoint operator and $f \in C(\sigma(A))$, then
$$\sigma(f(A)) = \{f(\lambda) \mid \lambda \in \sigma(A)\}.$$
In particular, for any $z \in \mathbb{C} \setminus \sigma(A)$,
$$\sigma((A - z)^{-1}) = \Bigl\{ \frac{1}{\lambda - z} \Bigm| \lambda \in \sigma(A) \Bigr\}.$$

5.22. Let $A, B$ be bounded self-adjoint operators. If $AB = BA$, prove that for all $g \in \mathcal{B}_b(\sigma(A))$ and $h \in \mathcal{B}_b(\sigma(B))$, $g(A)h(B) = h(B)g(A)$.
Hint: Consider the set of $g \in \mathcal{B}_b(\sigma(A))$ for which $g(A)$ commutes with $B$.


5.23. The following measurability statement is useful when considering ergodic families of operators, such as random operators. Consider self-adjoint operators $A_\omega \in \mathcal{L}(\mathcal{H})$ parametrized by $\omega \in \Omega$ with $\Omega$ a measure space. If
$$M = \sup_{\omega \in \Omega} \|A_\omega\| < \infty$$
and the map $\omega \mapsto \langle u, A_\omega v \rangle$ is measurable for every $u, v \in \mathcal{H}$, prove that the map $\omega \mapsto \langle u, h(A_\omega) v \rangle$ is measurable for every $u, v \in \mathcal{H}$ and every $h \in \mathcal{B}_b([-M, M])$.

5.24. Square root lemma: Let $A \ge 0$. Prove that $\sqrt{A}$ is the only operator $B \in \mathcal{L}(\mathcal{H})$ which obeys $B \ge 0$ and $B^2 = A$.

5.25. For any bounded self-adjoint operator $A$ and $w \in \mathbb{C}$, prove that
$$\lim_{n\to\infty} (I + wA/n)^n = e^{wA}$$
with the limit taken in the sense of norm-convergence.

5.26. Let $W \in \mathcal{L}(\mathcal{H})$ be a unitary operator with a spectral representation (5.26) and (5.27). Prove that $\sigma(W) = \overline{\bigcup_{n=1}^{N} \operatorname{supp} \mu_n}$.

5.27. Borel functional calculus for unitary operators: If $W \in \mathcal{L}(\mathcal{H})$ is unitary, prove that there is a unique map $\Phi_W : \mathcal{B}_b(\partial\mathbb{D}) \to \mathcal{L}(\mathcal{H})$ such that the following hold.
(a) $\Phi_W$ is an algebraic homomorphism, i.e., it is linear, preserves multiplication, $\Phi_W(1) = I$, and $\Phi_W(\bar g) = \Phi_W(g)^*$.
(b) If $g$ is the identity map $g(z) = z$, then $\Phi_W(g) = W$.
(c) If $g_k \to g_\infty$ pointwise and
$$\sup_{k \in \mathbb{N}} \sup_{z \in \partial\mathbb{D}} |g_k(z)| < \infty,$$
then $\Phi_W(g_k) \xrightarrow{s} \Phi_W(g_\infty)$.

5.28. If $W \in \mathcal{L}(\mathcal{H})$ is unitary, prove that the limit $\operatorname{s-lim}_{n\to\infty} \frac{1}{n} \sum_{k=0}^{n-1} W^k$ exists and that it is an orthogonal projection in $\mathcal{H}$. Describe its range.

Chapter 6

Measure decompositions

It is clear that there are qualitative differences between, e.g., Lebesgue measure on $\mathbb{R}$ and the counting measure on $\mathbb{Z}$, $\sum_{n \in \mathbb{Z}} \delta_n$. In this chapter, we consider several such differences, which lead to measure decompositions with an important role in spectral theory.

One of these differences is in how the measure acts on countable sets. To state this, recall (from Definition 1.41) that a measure $\nu$ is said to be supported on some Borel set $S$ if $\nu(S^c) = 0$. Lebesgue measure gives zero measure to singletons $\{x\}$ and therefore to all countable sets, whereas the counting measure on $\mathbb{Z}$ is supported on the countable set $\mathbb{Z}$. We can find the same distinction by comparing the spectral measures in Example 5.6 to those in Example 5.29 and Exercises 5.9 and 5.10. This is considered in Section 6.1.

Another decomposition is based on how the measure acts on sets of zero Lebesgue measure; this is considered in Section 6.2.

Hausdorff measures have an additional parameter $\alpha$ which represents a kind of fractal dimension and allows us to quantify intermediate behaviors between counting measures and Lebesgue measures. Hausdorff measures and decompositions based on them are considered in Section 6.3.

We will sometimes work in the setting of measures on an abstract metric space $X$ but will often focus on Baire measures on $\mathbb{R}$, i.e., Borel measures on $\mathbb{R}$ which are finite on compacts. All measures are assumed to be positive unless otherwise stated, but we will sometimes generalize from finite positive measures to the following class: let us call $\nu : \mathcal{B}_X \to \mathbb{C}$ a complex measure if there exist finite positive measures $\nu_1, \nu_2, \nu_3, \nu_4 : \mathcal{B}_X \to [0, \infty)$ such that
$$\nu = (\nu_1 - \nu_2) + i(\nu_3 - \nu_4). \tag{6.1}$$
Integration is accordingly defined by
$$\int h\,d\nu = \int h\,d\nu_1 - \int h\,d\nu_2 + i\int h\,d\nu_3 - i\int h\,d\nu_4$$
for $h \in \bigcap_{j=1}^{4} L^1(d\nu_j)$. It is more common to define complex measures as $\sigma$-additive maps $\mathcal{B}_X \to \mathbb{C}$; that definition is equivalent to ours (Exercise 6.1). Moreover, the representation (6.1) naturally arises in spectral theory so we can view Exercise 6.1 as an aside.

6.1. Pure point and continuous measures

In this section, we consider the first, and simplest, decomposition of the measure, which is based on how it acts on individual points. In Section 9.3, this decomposition will be linked with the eigenvalues and eigenvectors of a self-adjoint operator.

Definition 6.1. A measure $\mu$ is said to have a point mass at $x$ if $\mu(\{x\}) > 0$. Denote by $P$ the set of point masses of $\mu$. The measure $\mu$ is a pure point measure if $\mu(P^c) = 0$. The measure $\mu$ is a continuous measure if $P = \emptyset$.

Definition 6.2. A Borel measure $\mu$ on $X$ is said to be $\sigma$-finite if there is a sequence of $F_n \in \mathcal{B}_X$ such that $X = \bigcup_{n=1}^{\infty} F_n$ and $\mu(F_n) < \infty$ for each $n$.

Lemma 6.3. Any $\sigma$-finite measure has countably many point masses.

Proof. For any $m, n \in \mathbb{N}$, the set $A_{m,n} = \{x \in F_m \mid \mu(\{x\}) \ge 1/n\}$ is finite because $\# A_{m,n} \le n\mu(F_m) < \infty$. Every point mass of $\mu$ is in $A_{m,n}$ for some $m, n \in \mathbb{N}$, and their countable union is countable.

Lemma 6.4. For a $\sigma$-finite measure $\mu$, the following are equivalent:
(a) $\mu$ is a pure point measure;
(b) $\mu$ is supported on some countable set $S$;
(c) $\mu = \sum_{\lambda \in P} \mu(\{\lambda\})\delta_\lambda$, where $P$ is the set of point masses of $\mu$.

Proof. (a) $\implies$ (b): This is immediate from Lemma 6.3.

(b) $\implies$ (c): Let $S$ denote a countable set such that $\mu(S^c) = 0$. For any set $B$, by countability of $B \cap S$,
$$\mu(B) = \mu(B \cap S) = \sum_{\lambda \in B \cap S} \mu(\{\lambda\}) = \sum_{\lambda \in S} \mu(\{\lambda\})\delta_\lambda(B).$$
In particular, $\mu(\{x\}) > 0$ implies $x \in S$, so $P \subset S$. Moreover, any $\lambda \notin P$ can be removed without affecting the sum, so the result follows.

(c) $\implies$ (a): This is immediate since $\delta_\lambda(P^c) = 0$ for all $\lambda \in P$.

Theorem 6.5. Any $\sigma$-finite measure can be uniquely decomposed as a sum of a pure point measure and a continuous measure.

Proof. Assume that $\mu = \mu_{\mathrm{pp}} + \mu_{\mathrm{cont}}$ with $\mu_{\mathrm{cont}}$ continuous and $\mu_{\mathrm{pp}}$ pure point. Then $\mu_{\mathrm{pp}}(\{x\}) = \mu(\{x\})$ for all $x$. By Lemma 6.4(c), this determines the pure point measure uniquely as
$$\mu_{\mathrm{pp}} = \sum_{\lambda \in P} \mu(\{\lambda\})\delta_\lambda, \tag{6.2}$$
which proves uniqueness. To prove existence, define $\mu_{\mathrm{pp}}$ by (6.2). For any Borel set $B$, by countability of $B \cap P$,
$$\mu(B) \ge \mu(B \cap P) = \sum_{\lambda \in B \cap P} \mu(\{\lambda\}) = \mu_{\mathrm{pp}}(B),$$
so $\mu_{\mathrm{cont}} = \mu - \mu_{\mathrm{pp}}$ is a positive measure. By construction, for any point $x$, $\mu_{\mathrm{cont}}(\{x\}) = \mu(\{x\}) - \mu_{\mathrm{pp}}(\{x\}) = 0$, so $\mu_{\mathrm{cont}}$ is a continuous measure.

The above decomposition can be written in the form
$$d\mu_{\mathrm{cont}} = \chi_{P^c}\,d\mu, \qquad d\mu_{\mathrm{pp}} = \chi_P\,d\mu$$
(this notation was defined in Proposition 1.54), where $P$ is the set of point masses of $\mu$.

The Fourier transform of a finite measure on $\mathbb{R}$ is defined by
$$\hat\mu(k) = \int e^{-ikx}\,d\mu(x)$$
(up to different conventions about factors of $2\pi$). For measures of the form $d\mu(x) = f(x)\,dx$, this corresponds to the Fourier transform of the function $f$. Generally speaking, smoothness of a function or measure is related to the decay of its Fourier transform. In particular, presence of pure points in the measure should be related to a lack of decay of its Fourier transform. It is nonetheless remarkable that pure points of $\mu$ precisely correspond to the Cesàro-averaged limit of $|\hat\mu(k)|^2$:

Theorem 6.6 (Wiener). For any finite Borel measure $\mu$ on $\mathbb{R}$,
$$\lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} |\hat\mu(k)|^2\,dk = \sum_{\lambda \in \mathbb{R}} \mu(\{\lambda\})^2. \tag{6.3}$$
In particular, this limit is zero if and only if $\mu$ is a continuous measure.

With the same effort, we will prove a more general version.


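Theorem 6.7 (Wiener). For any complex Borel measures $\mu, \nu$ on $\mathbb{R}$,
$$\lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} \hat\mu(k)\,\overline{\hat\nu(k)}\,dk = \sum_{\lambda \in \mathbb{R}} \mu(\{\lambda\})\,\overline{\nu(\{\lambda\})}. \tag{6.4}$$

Proof. Let us first assume that $\mu, \nu$ are finite positive measures. Using the definition of $\hat\mu(k)$, we obtain the iterated integral
$$\frac{1}{2T} \int_{-T}^{T} \hat\mu(k)\,\overline{\hat\nu(k)}\,dk = \frac{1}{2T} \int_{-T}^{T} \int_{\mathbb{R}} \int_{\mathbb{R}} e^{-ik(x-y)}\,d\mu(x)\,d\nu(y)\,dk.$$
Since $\mu$ and $\nu$ are finite, using Fubini's theorem and integrating in $k$ gives
$$\frac{1}{2T} \int_{-T}^{T} \hat\mu(k)\,\overline{\hat\nu(k)}\,dk = \int_{\mathbb{R}} \int_{\mathbb{R}} \operatorname{sinc}(T(x-y))\,d\mu(x)\,d\nu(y),$$
with the sinc function defined by $\operatorname{sinc} u = \sin u / u$ for $u \ne 0$ and $\operatorname{sinc} 0 = 1$. By dominated convergence with dominating function 1, we compute
$$\lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} \hat\mu(k)\,\overline{\hat\nu(k)}\,dk = \int_{\mathbb{R}} \int_{\mathbb{R}} \chi_{\{0\}}(x-y)\,d\mu(x)\,d\nu(y),$$
and computing the remaining integrals gives (6.4). For complex measures $\mu, \nu$, writing them as linear combinations of positive measures and using sesquilinearity of (6.4) proves the general case.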
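For a finite sum of point masses, the sinc identity from the proof turns the Cesàro average into a finite double sum, so Wiener's theorem can be observed numerically. The following is a minimal sketch; the atoms and weights are hypothetical values chosen for illustration.

```python
import math

def sinc(u):
    # sinc u = sin(u)/u for u != 0 and sinc 0 = 1, as in the proof of Theorem 6.7.
    return 1.0 if u == 0 else math.sin(u) / u

def cesaro_average(atoms, T):
    """(1/2T) * integral over [-T, T] of |mu_hat(k)|^2 dk for mu = sum_j w_j * delta_{x_j}.

    Expanding |mu_hat(k)|^2 and integrating in k gives the double sum of
    w_j * w_l * sinc(T * (x_j - x_l)) over all pairs of atoms.
    """
    return sum(w1 * w2 * sinc(T * (x1 - x2))
               for x1, w1 in atoms for x2, w2 in atoms)

# Hypothetical pure point measure given as (location, mass) pairs.
atoms = [(0.0, 0.5), (1.0, 0.3), (math.sqrt(2), 0.2)]
limit = sum(w ** 2 for _, w in atoms)          # right-hand side of (6.3)
averages = {T: cesaro_average(atoms, T) for T in (1e1, 1e3, 1e5)}
```

The diagonal terms contribute $\sum_j w_j^2$ for every $T$, while each off-diagonal term is bounded by $w_j w_l / (T|x_j - x_l|)$ and therefore vanishes as $T \to \infty$.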

6.2. Singular and absolutely continuous measures

In this section, we begin to consider decompositions of one measure with respect to another. The important distinction is the following.

Definition 6.8. The measure $\mu$ is singular with respect to $\nu$ if there exists $S$ such that $\mu(S^c) = 0$ and $\nu(S) = 0$. This is denoted $\mu \perp \nu$. The measure $\mu$ is continuous with respect to $\nu$ if for all measurable $A$, $\nu(A) = 0$ implies $\mu(A) = 0$. This is denoted $\mu \ll \nu$.

The question is whether a measure $\mu$ can be decomposed as a sum of an absolutely continuous and a singular measure with respect to $\nu$. We begin with uniqueness:

Lemma 6.9. For any two measures $\mu, \nu$, there is at most one way to decompose $\mu = \mu_{\mathrm{ac}} + \mu_{\mathrm{s}}$ so that $\mu_{\mathrm{ac}} \ll \nu$ and $\mu_{\mathrm{s}} \perp \nu$. If such a decomposition exists, it is necessarily of the form
$$d\mu_{\mathrm{ac}} = \chi_{S^c}\,d\mu, \qquad d\mu_{\mathrm{s}} = \chi_S\,d\mu \tag{6.5}$$
for some measurable set $S$.


Proof. Since $\mu_{\mathrm{s}} \perp \nu$, there exists $S$ such that $\nu(S) = 0$ and $\mu_{\mathrm{s}}(S^c) = 0$. Since $\mu_{\mathrm{ac}} \ll \nu$, $\nu(S) = 0$ implies $\mu_{\mathrm{ac}}(S) = 0$. Thus, for any measurable $A$, $\mu_{\mathrm{s}}(A \cap S^c) = 0$ and $\mu_{\mathrm{ac}}(A \cap S) = 0$, so
$$\mu_{\mathrm{s}}(A) = \mu_{\mathrm{s}}(A \cap S) = \mu(A \cap S), \qquad \mu_{\mathrm{ac}}(A) = \mu_{\mathrm{ac}}(A \cap S^c) = \mu(A \cap S^c).$$
Thus, the decomposition is necessarily of the form (6.5). If there are two such sets $S, T$, then $\nu(S) = \nu(T) = 0$ and $\mu(S \cap T^c) = \mu(T \cap S^c) = 0$; thus, $\mu(S \triangle T) = 0$, and the two sets give the same decomposition.

Existence of such a decomposition can be ensured using finiteness properties of the measures and this will be considered below. To appreciate the result that will follow, note a class of continuous measures:

Example 6.10. A measure $\mu$ is said to be absolutely continuous with respect to $\nu$ if there exists a function $f \ge 0$ such that $d\mu = f\,d\nu$. Every such measure is continuous with respect to $\nu$.

Proof. If $\nu(A) = 0$, then $\chi_A f = 0$ $\nu$-a.e., so $\mu(A) = \int \chi_A f\,d\nu = 0$.

Continuity with respect to $\nu$ does not imply absolute continuity, but it does for finite measures as a consequence of the following theorem:

Theorem 6.11 (Radon–Nikodym). Let $\mu, \nu$ be finite measures on $X$. There exists $f \in L^1(X, d\nu)$, $f \ge 0$, and $\mu_{\mathrm{s}} \perp \nu$ such that
$$d\mu = f\,d\nu + d\mu_{\mathrm{s}}. \tag{6.6}$$

In particular, this is the unique decomposition into a continuous and a singular part with respect to $\nu$. The function $f$ is called the Radon–Nikodym derivative.

Proof. We denote $\eta = \mu + \nu$ and define a linear functional on $L^2(X, d\eta)$ by $\Lambda(h) = \int h\,d\nu$. This functional is bounded, since
$$|\Lambda(h)| \le \int |h|\,d\nu \le \int |h|\,d\eta \le \sqrt{\eta(X)}\,\sqrt{\int |h|^2\,d\eta}.$$
Thus, by Riesz's representation theorem, the functional is of the form
$$\Lambda(h) = \int hg\,d\eta \tag{6.7}$$
for some $g \in L^2(X, d\eta)$. Since the functional is positive, applying it to functions $h = \chi_A$ with $A = A_{m,\pm} = \{x \mid \pm\operatorname{Im} g(x) \ge \frac{1}{m}\}$ gives
$$\frac{1}{m}\,\eta(A_{m,\pm}) \le \pm\operatorname{Im} \int \chi_A g\,d\eta = \pm\operatorname{Im} \Lambda(h) = 0,$$
so $\eta(A_{m,\pm}) = 0$. Taking the union over $m$ and over $\pm$ signs shows that $g$ is real-valued $\eta$-a.e. Moreover, subtracting $\int hg\,d\nu$ from (6.7) gives
$$\int h(1 - g)\,d\nu = \int hg\,d\mu \qquad \forall h \in L^2(X, d\eta). \tag{6.8}$$
Applying this to characteristic functions of the sets $C_m = \{x \mid g(x) \le -1/m\}$ gives
$$0 \le \Bigl(1 + \frac{1}{m}\Bigr)\nu(C_m) \le \int \chi_{C_m}(1 - g)\,d\nu = \int \chi_{C_m} g\,d\mu \le -\frac{1}{m}\,\mu(C_m),$$
so $\mu(C_m) = \nu(C_m) = 0$ for each $m$; thus, $g \ge 0$ $\eta$-a.e. Analogous arguments show that $g \le 1$ $\eta$-a.e. Note that (6.7) already gives $d\nu = g\,d\eta$.

To obtain the decomposition (6.6), we define $f : X \to [0, \infty]$ by $f = g^{-1} - 1$ and write (6.8) as
$$\int h\,\frac{1}{1 + f^{-1}}\,d\nu = \int h\,\frac{1}{1 + f}\,d\mu \qquad \forall h \in L^2(X, d\eta). \tag{6.9}$$
Let us denote $S = \{x \mid f(x) = \infty\}$ and $d\mu_{\mathrm{s}} = \chi_S\,d\mu$. Applying (6.8) to $h = \chi_S$ gives $\nu(S) = 0$, so $\mu_{\mathrm{s}} \perp \nu$.

For any Borel set $B$ and $k \in \mathbb{N}$, denote $B_k = \{x \in B \mid f(x) \le k\}$. Applying (6.9) to $h = (1 + f)\chi_{B_k} \in L^\infty(X, d\eta) \subset L^2(X, d\eta)$ and taking $k \to \infty$ gives, by monotone convergence,
$$\int \chi_{B \cap S^c} f\,d\nu = \int \chi_{B \cap S^c}\,d\mu.$$
Since $\nu(S) = 0$, we can write this as
$$\int \chi_B f\,d\nu = \mu(B \cap S^c) = \mu(B) - \mu_{\mathrm{s}}(B), \tag{6.10}$$
which precisely means that $d\mu - d\mu_{\mathrm{s}} = f\,d\nu$. Applying (6.10) to $B = X$ implies $f \in L^1(X, d\nu)$.

For the following generalization, we denote by $L^1_{\mathrm{loc}}(\mathbb{R}, d\nu)$ the set of locally integrable functions on $\mathbb{R}$,
$$L^1_{\mathrm{loc}}(\mathbb{R}, d\nu) = \{f : \mathbb{R} \to \mathbb{C} \mid \chi_K f \in L^1(\mathbb{R}, d\nu) \text{ for all compacts } K \subset \mathbb{R}\}.$$

Theorem 6.12 (Radon–Nikodym). Let $\mu, \nu$ be two Baire measures on $\mathbb{R}$. There exists $f \in L^1_{\mathrm{loc}}(\mathbb{R}, d\nu)$, $f \ge 0$, and $\mu_{\mathrm{s}} \perp \nu$ such that (6.6) holds. In particular, this is the unique decomposition into a continuous and a singular part with respect to $\nu$. The function $f$ is called the Radon–Nikodym derivative.

Proof. Denote $F_1 = [-1, 1]$, $F_{n+1} = [-n-1, n+1] \setminus [-n, n]$, apply the Radon–Nikodym decomposition to the finite measures $\chi_{F_n}\,d\mu$, $\chi_{F_n}\,d\nu$, and sum in $n$.
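On a finite set, every step of the proof of Theorem 6.11 becomes explicit: $g$ is the pointwise ratio $\nu/(\mu+\nu)$, $f = g^{-1} - 1$, and $S$ is the set of points charged by $\mu$ but not by $\nu$. The sketch below illustrates this under the assumption that measures are stored as dictionaries of point masses; the function name `radon_nikodym` and the sample data are hypothetical.

```python
from fractions import Fraction

def radon_nikodym(mu, nu):
    """Decompose mu = f*nu + mu_s on a finite set, following the proof of Theorem 6.11.

    mu and nu map points to nonnegative masses. Returns (f, mu_s, S), where
    S is the set on which f = 1/g - 1 is infinite.
    """
    f, mu_s, S = {}, {}, set()
    for x in set(mu) | set(nu):
        m, n = Fraction(mu.get(x, 0)), Fraction(nu.get(x, 0))
        eta = m + n                     # eta = mu + nu
        if eta == 0:
            f[x] = Fraction(0)          # g is undetermined on eta-null points; any value works
            continue
        g = n / eta                     # d(nu) = g d(eta), with 0 <= g <= 1
        if g == 0:
            S.add(x)                    # f would be infinite here: x carries the singular part
            f[x] = Fraction(0)          # the value of f on a nu-null set does not affect f d(nu)
            mu_s[x] = m
        else:
            f[x] = 1 / g - 1            # equals m / n
    return f, mu_s, S

# Hypothetical example: "c" is charged only by mu, "d" only by nu.
mu = {"a": 2, "b": 1, "c": 3}
nu = {"a": 4, "b": 1, "d": 5}
f, mu_s, S = radon_nikodym(mu, nu)
```

In this example $S = \{c\}$, the singular part is the point mass of $\mu$ at $c$, and $f$ takes the values $1/2$, $1$, and $0$ at $a$, $b$, and $d$.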


A generalization to $\sigma$-finite measures is considered in Exercise 6.5.

The Radon–Nikodym decomposition with respect to Lebesgue measure $\nu$ gives the Lebesgue decomposition
$$d\mu = d\mu_{\mathrm{ac}} + d\mu_{\mathrm{s}} = f(x)\,dx + d\mu_{\mathrm{s}} \tag{6.11}$$
into the absolutely continuous and singular part of $\mu$. In this case it is common to omit the qualifier "with respect to Lebesgue measure". The importance of this decomposition has led to additional terminology:

Definition 6.13. The set $S_{\mathrm{ac}}$ is said to be an essential support of the absolutely continuous part of $\mu$ if the measure $\chi_{S_{\mathrm{ac}}}(x)\,dx$ is mutually absolutely continuous with $f(x)\,dx$.

An essential support of the absolutely continuous part of $\mu$ can be obtained by $S_{\mathrm{ac}} = \{x \in \mathbb{R} \mid f(x) > 0\}$. Of course, it is not uniquely determined; it is only determined up to symmetric difference with a set of Lebesgue measure zero. An essential support of the absolutely continuous part of $\mu$ determines the topological support of the absolutely continuous part of $\mu$, but the converse is false: $\operatorname{supp} \mu_{\mathrm{ac}}$ does not determine $S_{\mathrm{ac}}$, even up to a set of measure zero (Exercise 6.6). Thus, an essential support of the absolutely continuous part of $\mu$ contains more information about the measure.

The Lebesgue decomposition is often combined with the decomposition into pure point and continuous parts. Since $\mu_{\mathrm{pp}}$ is supported on a countable set, it is part of $\mu_{\mathrm{s}}$, and we obtain the decomposition
$$\mu = \mu_{\mathrm{ac}} + \mu_{\mathrm{sc}} + \mu_{\mathrm{pp}},$$
where $\mu_{\mathrm{sc}} = \mu_{\mathrm{s}} - \mu_{\mathrm{pp}}$ is called the singular continuous part of $\mu$.

Whereas continuity of a measure is equivalent to Cesàro-decay of $|\hat\mu(k)|^2$ by Wiener's Theorem 6.6, Fourier transforms of absolutely continuous measures decay pointwise:

Lemma 6.14 (Riemann–Lebesgue). For any $f \in L^1(\mathbb{R})$,
$$\lim_{k \to \pm\infty} \int e^{-ikx} f(x)\,dx = 0. \tag{6.12}$$


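Proof. By density of $C_c(\mathbb{R})$ in $L^1(\mathbb{R})$, for any $\epsilon > 0$, there exists $g \in C_c(\mathbb{R})$ such that $\|f - g\|_1 < \epsilon$. Using its modulus of continuity $\omega_g$, we can estimate
$$\begin{aligned}
\Bigl| 2 \int e^{-ikx} g(x)\,dx \Bigr| &= \Bigl| \int e^{-ikx} g(x)\,dx + \int e^{-ik(x + \pi/k)} g(x + \pi/k)\,dx \Bigr| \\
&= \Bigl| \int e^{-ikx} \bigl(g(x) - g(x + \pi/k)\bigr)\,dx \Bigr| \\
&\le \omega_g(\pi/k)\,(\operatorname{diam} \operatorname{supp} g + \pi/k).
\end{aligned}$$
Since $g \in C_c(\mathbb{R})$ is uniformly continuous, taking $k \to \pm\infty$ implies
$$\lim_{k \to \pm\infty} \int e^{-ikx} g(x)\,dx = 0.$$
Together with
$$\Bigl| \int e^{-ikx} f(x)\,dx - \int e^{-ikx} g(x)\,dx \Bigr| \le \|f - g\|_1,$$
this implies that
$$\limsup_{k \to \pm\infty} \Bigl| \int e^{-ikx} f(x)\,dx \Bigr| \le \|f - g\|_1 < \epsilon.$$
Since $\epsilon > 0$ is arbitrary, this proves (6.12).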
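The decay in (6.12) can be seen in closed form for simple functions. As a small illustration, the sketch below assumes $f = \chi_{[0,1]}$ (our choice of example), for which $\int e^{-ikx} f(x)\,dx = (1 - e^{-ik})/(ik)$ and hence the modulus is at most $2/|k|$.

```python
import cmath

def fourier_indicator(k):
    """Integral of exp(-i k x) over [0, 1], the transform of the indicator of [0, 1]."""
    if k == 0:
        return 1.0 + 0.0j
    return (1 - cmath.exp(-1j * k)) / (1j * k)

# Moduli of the transform at a few frequencies, including a negative one.
values = {k: abs(fourier_indicator(k)) for k in (1, 10, 100, 1000, -1000)}
```

For a general $f \in L^1(\mathbb{R})$ no rate of decay is guaranteed; only the limit (6.12) holds.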

The proof of the Radon–Nikodym decomposition obtains $f$ by an existence theorem, but $f$ can be recovered by pointwise differentiation:

Theorem 6.15. In the setting of Theorem 6.12, the limit
$$f(x) = \lim_{\epsilon \downarrow 0} \frac{\mu((x - \epsilon, x + \epsilon))}{\nu((x - \epsilon, x + \epsilon))} \tag{6.13}$$
exists for $(\mu + \nu)$-a.e. $x$ and recovers the decomposition (6.6) with $S = f^{-1}(\{+\infty\})$ and $d\mu_{\mathrm{s}} = \chi_S\,d\mu$.

The proof requires some prerequisites. Let $\eta$ be a Baire measure on $\mathbb{R}$ and let $g \in L^1(\mathbb{R}, d\eta)$. For all $x \in \operatorname{supp} \eta$, we can define the maximal function
$$(Mg)(x) = \sup_{r > 0} \frac{\int \chi_{(x-r, x+r)}(t)\,|g(t)|\,d\eta(t)}{\eta((x - r, x + r))}. \tag{6.14}$$

Lemma 6.16 (Croft–Garsia covering lemma). Let $\eta$ be a Baire measure on $\mathbb{R}$, and let $I_1, \dots, I_n$ be a finite family of intervals in $\mathbb{R}$. There is a disjoint subfamily $I_{j_1}, \dots, I_{j_k}$ such that
$$\eta\Bigl(\bigcup_{i=1}^{n} I_i\Bigr) \le 2 \sum_{i=1}^{k} \eta(I_{j_i}).$$


Proof. Note that it suffices to prove the statement for families of intervals such that no interval is contained in the union of all the others. Let us denote $I_j = (a_j, b_j)$. If no interval is contained in the union of all the others, then $a_j \ne a_k$ for $j \ne k$ (otherwise one of the intervals $I_j, I_k$ would be contained in the other), so we can relabel the intervals such that $a_1 < a_2 < \dots < a_n$. Similarly, $b_1 < b_2 < \dots < b_n$ since $I_j \not\subset I_{j \pm 1}$. Moreover, $a_{j+2} \ge b_j$ for all $j$ since $I_{j+1} \not\subset I_j \cup I_{j+2}$. Thus, each of the subfamilies $\{I_j \mid j \text{ odd}\}$ and $\{I_j \mid j \text{ even}\}$ is a disjoint subfamily; at least one of them has total measure at least $\frac{1}{2}\,\eta\bigl(\bigcup_{j=1}^{n} I_j\bigr)$.

Lemma 6.17. Let $\eta$ be a Baire measure on $\mathbb{R}$ and let $g \in L^1(\mathbb{R}, d\eta)$. For any $c > 0$,
$$\eta(\{x \mid (Mg)(x) > c\}) \le \frac{2}{c} \int |g|\,d\eta \tag{6.15}$$
and
$$\eta(\{x \mid (Mg)(x) + |g(x)| > 2c\}) \le \frac{3}{c} \int |g|\,d\eta. \tag{6.16}$$

Proof. Let $K$ be a compact subset of $\{x \mid (Mg)(x) > c\}$. For any $x \in K$, there is an interval $I_x = (x - r_x, x + r_x)$ such that
$$\frac{\int \chi_{I_x}(t)\,|g(t)|\,d\eta(t)}{\eta(I_x)} > c. \tag{6.17}$$
Since $K$ is compact, there is a finite subcover that we denote by $I_1, \dots, I_n$. By Lemma 6.16, the finite subcover contains a disjoint family of intervals $I_{j_1}, \dots, I_{j_k}$ such that
$$\eta(K) \le \eta\Bigl(\bigcup_{j=1}^{n} I_j\Bigr) \le 2 \sum_{l=1}^{k} \eta(I_{j_l}).$$
Using (6.17) and since the intervals are disjoint, this implies
$$\eta(K) \le \frac{2}{c} \sum_{l=1}^{k} \int \chi_{I_{j_l}}(t)\,|g(t)|\,d\eta(t) \le \frac{2}{c} \int |g(t)|\,d\eta(t).$$
Since $\eta$ is inner regular, this implies (6.15). Since $(Mg)(x) + |g(x)| > 2c$ implies $(Mg)(x) > c$ or $|g(x)| > c$, using (6.15) and Markov's inequality applied to $g$ gives (6.16).

Although (6.15) resembles Markov's inequality and is called a weak-$L^1$ property, the maximal function is usually not an $L^1$ function (Exercise 6.4).
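The selection procedure in the proof of Lemma 6.16 is constructive for a finite family: discard intervals contained in the union of the others, sort by left endpoint, and keep the odd-indexed or even-indexed intervals, whichever has larger total measure. The sketch below implements this for open intervals, assuming for simplicity that $\eta$ is Lebesgue measure (interval length); the function names are hypothetical.

```python
def components(intervals):
    """Connected components of a union of open intervals (a, b)."""
    comps = []
    for a, b in sorted(intervals):
        if comps and a < comps[-1][1]:      # strict overlap; intervals that only touch stay separate
            comps[-1][1] = max(comps[-1][1], b)
        else:
            comps.append([a, b])
    return comps

def prune(intervals):
    """Drop intervals contained in the union of the others until none remains."""
    fam = list(intervals)
    i = 0
    while i < len(fam):
        a, b = fam[i]
        rest = fam[:i] + fam[i + 1:]
        if any(c <= a and b <= d for c, d in components(rest)):
            fam, i = rest, 0                # the union is unchanged; rescan from the start
        else:
            i += 1
    return fam

def total_length(family):
    return sum(b - a for a, b in family)

def disjoint_subfamily(intervals):
    """Disjoint subfamily whose total length is at least half the length of the union."""
    fam = sorted(prune(intervals))
    odd, even = fam[0::2], fam[1::2]        # I_1, I_3, ... and I_2, I_4, ...
    return odd if total_length(odd) >= total_length(even) else even

intervals = [(0, 3), (2, 5), (4, 7), (1, 2), (6, 9)]
chosen = disjoint_subfamily(intervals)
union_length = total_length(components(intervals))
```

For the sample family, the interval $(1, 2)$ is discarded, the remaining four intervals are split into $\{(0,3), (4,7)\}$ and $\{(2,5), (6,9)\}$, and either subfamily has total length 6, which is at least half of the union's length 9.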


Theorem 6.18. Let $\eta$ be a compactly supported finite Borel measure on $\mathbb{R}$ and let $g \in L^1(\mathbb{R}, d\eta)$. Then for $\eta$-a.e. $x \in \mathbb{R}$,
$$\lim_{\epsilon \downarrow 0} \frac{\int \chi_{(x-\epsilon, x+\epsilon)}(t)\,|g(t) - g(x)|\,d\eta(t)}{\eta((x - \epsilon, x + \epsilon))} = 0 \tag{6.18}$$
and, in particular,
$$\lim_{\epsilon \downarrow 0} \frac{\int \chi_{(x-\epsilon, x+\epsilon)}(t)\,g(t)\,d\eta(t)}{\eta((x - \epsilon, x + \epsilon))} = g(x). \tag{6.19}$$

Proof. Let us define
$$(Tg)(x) = \limsup_{\epsilon \downarrow 0} \frac{\int \chi_{(x-\epsilon, x+\epsilon)}(t)\,|g(t) - g(x)|\,d\eta(t)}{\eta((x - \epsilon, x + \epsilon))}.$$
This is well defined for all $x \in \operatorname{supp} \eta$. The triangle inequality implies that $(Tg)(x) \le (Mg)(x) + |g(x)|$, and subadditivity of $\limsup$ implies that for any $f, g \in L^1(d\eta)$, $(T(f + g))(x) \le (Tf)(x) + (Tg)(x)$, and therefore
$$(Tf)(x) - (Tg)(x) \le (T(f - g))(x) \le (Tf)(x) + (Tg)(x).$$
In particular, if $f$ is continuous, $(Tf)(x) = 0$ identically so $(Tg)(x) = (T(g - f))(x)$. By (6.16) in Lemma 6.17, for any $c > 0$,
$$\eta(\{x \mid (Tg)(x) > 2c\}) = \eta(\{x \mid (T(g - f))(x) > 2c\}) \le \frac{3}{c} \int |g - f|\,d\eta.$$
However, the right-hand side can be made arbitrarily small by density of continuous functions in $L^1(\mathbb{R}, d\eta)$. Thus, $\eta(\{x \mid (Tg)(x) > 2c\}) = 0$. Since $c > 0$ is arbitrary, for $\eta$-a.e. $x$,
$$\limsup_{\epsilon \downarrow 0} \frac{\int \chi_{(x-\epsilon, x+\epsilon)}(t)\,|g(t) - g(x)|\,d\eta(t)}{\eta((x - \epsilon, x + \epsilon))} \le 0,$$
which implies (6.18) and hence (6.19).

Theorem 6.19. Let $\eta$ be a Baire measure on $\mathbb{R}$ and let $g \in L^1_{\mathrm{loc}}(\mathbb{R}, d\eta)$. Then for $\eta$-a.e. $x \in \mathbb{R}$, equations (6.18) and (6.19) hold.

Proof. Since the condition (6.19) is local, it suffices to apply the previous result to the finite measures $\chi_{[-n,n]}\,d\eta$, $n \in \mathbb{N}$.

In the special case when $\eta$ is Lebesgue measure, points $x$ at which (6.18) holds are called Lebesgue points of the function $g$. In that terminology, the above theorem says that almost every point is a Lebesgue point.


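Proof of Theorem 6.15. Applying Lebesgue's differentiation theorem to $g \in L^1(\mathbb{R}, d\eta)$ and using $d\nu = g\,d\eta$, for $\eta$-a.e. $x$,
$$\lim_{\epsilon \downarrow 0} \frac{\nu((x - \epsilon, x + \epsilon))}{\eta((x - \epsilon, x + \epsilon))} = g(x).$$
By inverting and subtracting 1, we conclude that for $\eta$-a.e. $x$,
$$\lim_{\epsilon \downarrow 0} \frac{\mu((x - \epsilon, x + \epsilon))}{\nu((x - \epsilon, x + \epsilon))} = f(x).$$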
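The limit (6.13) can be observed directly when $\nu$ is Lebesgue measure and $\mu$ is given explicitly. The sketch below assumes the hypothetical measure $d\mu = 2x\,\chi_{[0,1]}(x)\,dx + \delta_{1/2}$: at $x = 1/4$ the ratio approaches the density value $f(1/4) = 1/2$, while at the point mass $x = 1/2$ it diverges, which places that point in $S = f^{-1}(\{+\infty\})$.

```python
def mu(a, b):
    """mu((a, b)) for the assumed measure: density 2x on [0, 1] plus a unit point mass at 1/2."""
    lo, hi = max(a, 0.0), min(b, 1.0)
    absolutely_continuous = hi * hi - lo * lo if lo < hi else 0.0
    point_mass = 1.0 if a < 0.5 < b else 0.0
    return absolutely_continuous + point_mass

def ratio(x, eps):
    # nu is Lebesgue measure, so nu((x - eps, x + eps)) = 2 * eps.
    return mu(x - eps, x + eps) / (2 * eps)

at_quarter = [ratio(0.25, 10.0 ** -n) for n in range(1, 7)]   # stays near f(0.25) = 0.5
at_half = [ratio(0.5, 10.0 ** -n) for n in range(1, 7)]       # grows like 1 / (2 * eps)
```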

6.3. Hausdorff measures on $\mathbb{R}$

With the motivation that $d$-dimensional volume scales as the $d$th power of the length, Hausdorff measures are constructed with scaling by the $\alpha$th power of the length, where $\alpha \ge 0$ is not necessarily an integer and serves as a fractal dimension parameter. Compared with the decompositions seen above, Hausdorff measures allow a more refined look at the singular continuous part of the measure.

Hausdorff measures are studied in geometric measure theory as measures on $\mathbb{R}^d$ [31] and on even more general metric spaces. We will focus on Hausdorff measures on $\mathbb{R}$, which allows some simplification.

The construction of Hausdorff measures uses Carathéodory's Theorem 1.26; aspects of this construction can be compared with the construction of Lebesgue measure in Section 1.5. We denote the diameter of an interval by
$$|I| = b - a, \qquad I = (a, b).$$
For $\alpha < 1$, assigning weight $|I|^\alpha$ to the interval $I$ has undesired effects for large intervals, so a cutoff in allowed interval sizes is needed. Thus, we start by selecting as elementary sets the intervals of length at most $\delta$,
$$E_\delta = \{\emptyset\} \cup \{(a, b) \mid a, b \in \mathbb{R},\ a < b \le a + \delta\}, \tag{6.20}$$
applying to them the weight
$$\rho_\alpha(\emptyset) = 0, \qquad \rho_\alpha(I) = |I|^\alpha,$$
and using countable covers of arbitrary $A \subset \mathbb{R}$ by elements of $E_\delta$ to define
$$h^*_{\alpha,\delta}(A) = \inf\Bigl\{ \sum_{j=1}^{\infty} \rho_\alpha(I_j) \Bigm| A \subset \bigcup_{j=1}^{\infty} I_j,\ I_j \in E_\delta\ \forall j \in \mathbb{N} \Bigr\}. \tag{6.21}$$
For any $\delta > 0$ and $\alpha \ge 0$, $h^*_{\alpha,\delta}$ is an outer measure on $\mathbb{R}$ by Theorem 1.24.

If $\delta_1 \le \delta_2$, we note $E_{\delta_1} \subset E_{\delta_2}$; since the infimum over a smaller set is larger, we observe that $h^*_{\alpha,\delta_1}(A) \ge h^*_{\alpha,\delta_2}(A)$. Thus, we can define
$$h^*_\alpha(A) = \lim_{\delta \downarrow 0} h^*_{\alpha,\delta}(A) = \sup_{\delta > 0} h^*_{\alpha,\delta}(A).$$
For any $\alpha \ge 0$, this is an outer measure by abstract arguments:


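Lemma 6.20. If $\mu^*_\delta$ are outer measures for all $\delta$, then
$$\mu^*(A) = \sup_\delta \mu^*_\delta(A)$$
is also an outer measure.

Proof. Obviously, $\mu^*(\emptyset) = 0$, and $A \subset B$ implies $\mu^*_\delta(A) \le \mu^*_\delta(B)$ for all $\delta$, so it implies $\mu^*(A) \le \mu^*(B)$. For any sets $A_n$, $n \in \mathbb{N}$,
$$\mu^*_\delta\Bigl(\bigcup_{n=1}^{\infty} A_n\Bigr) \le \sum_{n=1}^{\infty} \mu^*_\delta(A_n) \le \sum_{n=1}^{\infty} \mu^*(A_n),$$
so taking the supremum over $\delta$ shows $\sigma$-subadditivity of $\mu^*$.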

In particular, $h^*_\alpha$ is an outer measure. Let us note a special case:

Lemma 6.21. Hausdorff outer measure $h^*_0$ is the counting measure on $\mathbb{R}$.

Proof. If $A$ contains $n$ elements $x_1, \dots, x_n$, denote $\delta_0 = \min\{|x_j - x_k| \mid 1 \le j < k \le n\}$. An interval of length smaller than $\delta_0$ contains at most one of the numbers $x_1, \dots, x_n$, so $h^*_{0,\delta}(A) \ge n$ for all $\delta < \delta_0$. It follows that $h^*_0(A) \ge n$ and therefore $h^*_0(A) \ge \# A$. Conversely, a finite set $A$ with $\# A = n$ can be covered by $n$ intervals of arbitrarily small length, so $h^*_0(A) \le \# A$.

Moreover, $h^*_1$ is the Lebesgue outer measure on $\mathbb{R}$ (Exercise 6.7). For any $\alpha > 1$, $h^*_\alpha$ is identically zero (Exercise 6.8), so we will restrict ourselves to $\alpha \in [0, 1]$ from now on.

In order to apply Carathéodory's Theorem 1.26 and prove that outer measures $h^*_\alpha$ generate Borel measures, we need the following.

Lemma 6.22. For any $c \in \mathbb{R}$, $(c, \infty)$ is measurable with respect to $h^*_\alpha$.

Proof. Since $h^*_0$ is counting measure on $\mathbb{R}$, it suffices to consider the case $\alpha > 0$. Consider a countable cover of a set $E \subset \mathbb{R}$ by intervals $I_j$ of length at most $\delta$. Separating the intervals $I_j$ based on whether they intersect $(-\infty, c]$ or not, we obtain two subfamilies: the first is a cover of $E \cap (-\infty, c]$, and the second is a cover of $E \cap [c + \delta, \infty)$, because the intervals have length at most $\delta$. By adding the interval $(c, c + \delta)$ to the second subfamily, we conclude
$$\sum_{j : I_j \cap (-\infty, c] \ne \emptyset} \rho_\alpha(I_j) \ge h^*_{\alpha,\delta}(E \cap (-\infty, c]),$$
$$\delta^\alpha + \sum_{j : I_j \cap (-\infty, c] = \emptyset} \rho_\alpha(I_j) \ge h^*_{\alpha,\delta}(E \cap (c, \infty)).$$
Adding these two inequalities gives
$$\delta^\alpha + \sum_{j=1}^{\infty} \rho_\alpha(I_j) \ge h^*_{\alpha,\delta}(E \cap (-\infty, c]) + h^*_{\alpha,\delta}(E \cap (c, \infty)),$$
and taking the infimum over all countable covers of $E$ gives
$$\delta^\alpha + h^*_{\alpha,\delta}(E) \ge h^*_{\alpha,\delta}(E \cap (-\infty, c]) + h^*_{\alpha,\delta}(E \cap (c, \infty)).$$
In the limit $\delta \downarrow 0$, this gives
$$h^*_\alpha(E) \ge h^*_\alpha(E \cap (-\infty, c]) + h^*_\alpha(E \cap (c, \infty)). \tag{6.22}$$
The opposite inequality holds by subadditivity. Thus, equality holds in (6.22) for any $E \subset \mathbb{R}$, so $(c, \infty)$ is measurable with respect to $h^*_\alpha$.

Theorem 6.23. For any $\alpha \ge 0$, $h_\alpha = h^*_\alpha|_{\mathcal{B}_{\mathbb{R}}}$ is a Borel measure on $\mathbb{R}$.

Proof. By Carathéodory's Theorem 1.26, the family of measurable sets with respect to $h^*_\alpha$ is a $\sigma$-algebra; since it contains the half-lines $(c, \infty)$, it contains all Borel sets. Thus, the restriction of $h^*_\alpha$ to the Borel $\sigma$-algebra $\mathcal{B}_{\mathbb{R}}$ is a measure.

Hausdorff measures have many applications. As a first application, if we fix a set $A$ and vary $\alpha$, we observe a certain critical value at which the value of $h^*_\alpha(A)$ changes:

Theorem 6.24. Let $A \subset \mathbb{R}$.
(a) If $h^*_\alpha(A) < \infty$ for some $\alpha$, then $h^*_\beta(A) = 0$ for all $\beta > \alpha$.
(b) If $h^*_\beta(A) > 0$ for some $\beta$, then $h^*_\alpha(A) = \infty$ for all $\alpha < \beta$.
(c) There is a unique number $d \in [0, 1]$ such that
$$h^*_\alpha(A) = \infty \quad \forall \alpha \in [0, d), \qquad h^*_\beta(A) = 0 \quad \forall \beta \in (d, 1]. \tag{6.23}$$

Proof. If $0 \le \alpha \le \beta$ and $x \in [0, \delta]$, then $x^\beta \le \delta^{\beta - \alpha} x^\alpha$. Thus, for any interval $I$ of length at most $\delta$, $\rho_\beta(I) \le \delta^{\beta - \alpha} \rho_\alpha(I)$. Applying this to countable covers of $A$ by intervals of length at most $\delta$ gives
$$h^*_{\beta,\delta}(A) \le \delta^{\beta - \alpha}\,h^*_{\alpha,\delta}(A). \tag{6.24}$$
Now (a) follows by taking $\delta \downarrow 0$ in (6.24), and (b) follows by dividing (6.24) by $\delta^{\beta - \alpha}$ and then taking $\delta \downarrow 0$.

(c) The set $\{0\} \cup \{x \in [0, 1] \mid h^*_x(A) = \infty\}$ is nonempty; denote by $d$ its supremum. Applying (a) and (b) concludes the proof.

Definition 6.25. The number $d$ with the property (6.23) is called the Hausdorff dimension of the set $A$ and is denoted by $\dim_H A$.
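The critical-value behavior in Theorem 6.24 is visible already in the cover sums for the middle-thirds Cantor set, which we use here as an assumed example. At step $n$ of its construction the set is covered by $2^n$ intervals of length $3^{-n}$, so the corresponding sum of $|I|^\alpha$ is $(2 \cdot 3^{-\alpha})^n$. The sketch below only evaluates these particular cover sums; they give upper bounds on $h^*_{\alpha,\delta}$ (after enlarging the closed intervals slightly to open ones) and therefore show $h^*_\beta = 0$ for $\beta > \log 2 / \log 3$, but they do not by themselves prove the matching lower bound.

```python
import math

def cover_sum(alpha, n):
    """Sum of |I|^alpha over the 2^n intervals of length 3^-n at step n of the Cantor construction."""
    return (2.0 * 3.0 ** (-alpha)) ** n

critical = math.log(2) / math.log(3)                    # about 0.6309

below = [cover_sum(0.5, n) for n in (1, 10, 40)]        # grows with n
at = [cover_sum(critical, n) for n in (1, 10, 40)]      # stays equal to 1
above = [cover_sum(0.8, n) for n in (1, 10, 40)]        # tends to 0
```

The exponent at which the sums switch from growth to decay, $\log 2 / \log 3$, is the known Hausdorff dimension of the middle-thirds Cantor set.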


Hausdorff dimension provides a fine gradation in the relative thickness of sets; it can distinguish between many Cantor sets with zero Lebesgue measure (Exercise 6.11). Although every countable set has zero Hausdorff dimension, the converse is not true (Exercise 6.10).

Definition 6.26. A set $A$ is said to be $\sigma$-finite with respect to $\nu$ if there exist sets $A_n$ with $A = \bigcup_{n=1}^{\infty} A_n$ and $\nu(A_n) < \infty$ for all $n$.

Theorem 6.24 can be refined to show that for $\alpha < \dim_H A$, the set $A$ is not $\sigma$-finite with respect to $h_\alpha$ (Exercise 6.12).

We now turn to the characterization and decomposition of Baire measures with respect to Hausdorff measures.

Definition 6.27. A Baire measure $\mu$ is said to be $\alpha$-continuous if $\mu \ll h_\alpha$, and it is said to be $\alpha$-singular if $\mu \perp h_\alpha$.

For $\alpha < 1$, the measures $h_\alpha$ are not Baire measures, and they are not $\sigma$-finite. Thus, the Radon–Nikodym theorem does not apply; in order to decompose $\mu$ into an $\alpha$-continuous and an $\alpha$-singular part, we need a different approach. We will use the upper $\alpha$-derivative
$$D^\alpha_\mu(x) = \limsup_{r \downarrow 0} \frac{\mu((x - r, x + r))}{(2r)^\alpha}. \tag{6.25}$$

Lemma 6.28. $D^\alpha_\mu$ is a Borel function.

Proof. For any $r > 0$, the set $S = \{(x, t) \in \mathbb{R}^2 \mid x - r < t < x + r\}$ is Borel. So by Tonelli's theorem, $\mu((x - r, x + r)) = \int \chi_S(x, t)\,d\mu(t)$ is a Borel function of $x$. Since $\mu((x - r, x + r))/(2r)^\alpha$ is left-continuous in $r \in (0, \infty)$, and since any limit can be written as a limit along a sequence,
$$D^\alpha_\mu(x) = \lim_{R \to 0} \sup_{r \in (0, R)} \frac{\mu((x - r, x + r))}{(2r)^\alpha} = \lim_{n \to \infty} \sup_{r \in (0, 1/n) \cap \mathbb{Q}} \frac{\mu((x - r, x + r))}{(2r)^\alpha}.$$
The right-hand side is a Borel function by Lemma 1.40.

The function $D^\alpha_\mu$ allows us to decompose the Baire measure $\mu$ into absolutely continuous and singular parts with respect to $h_\alpha$:

Theorem 6.29 (Rogers–Taylor). For a Baire measure $\mu$ on $\mathbb{R}$, denote $T = \{x \mid D^\alpha_\mu(x) = \infty\}$ and define measures
$$d\mu_{\alpha c} = \chi_{T^c}\,d\mu, \qquad d\mu_{\alpha s} = \chi_T\,d\mu.$$
Then $\mu = \mu_{\alpha c} + \mu_{\alpha s}$, $\mu_{\alpha c} \ll h_\alpha$, and $\mu_{\alpha s} \perp h_\alpha$.


The proof requires a covering theorem which allows infinite families of intervals. We call a family of sets $\mathcal{G}$ disjoint if for any $G_1, G_2 \in \mathcal{G}$, $G_1 \ne G_2$ implies $G_1 \cap G_2 = \emptyset$.

Theorem 6.30 (Vitali). For any family $\mathcal{F}$ of open intervals whose union is bounded, there exists a countable disjoint subfamily $\mathcal{G} \subset \mathcal{F}$ such that
$$\bigcup_{J \in \mathcal{F}} J \subset \bigcup_{(x - r, x + r) \in \mathcal{G}} (x - 5r, x + 5r). \tag{6.26}$$

Proof. Set $\mathcal{F}_1 = \mathcal{F}$. The construction is inductive: If $\mathcal{F}_n \ne \emptyset$, we set $L_n = \sup\{|I| \mid I \in \mathcal{F}_n\}$ and choose $I_n = (x_n - r_n, x_n + r_n) \in \mathcal{F}_n$ with $|I_n| > L_n/2$, then we set $\mathcal{F}_{n+1} = \{I \in \mathcal{F}_n \mid I \cap I_n = \emptyset\}$. If $\mathcal{F}_N = \emptyset$ for some $N$, we write $L_N = 0$ and terminate the construction. Otherwise, we write $N = \infty$. This construction gives a countable subfamily $\mathcal{G} = \{I_n \mid 1 \le n < N\}$. By construction, $I_k \cap I_n = \emptyset$ if $k < n$. For $N = \infty$, boundedness of the union of disjoint intervals gives $\sum_{n=1}^{\infty} |I_n| < \infty$, so $\sum_{n=1}^{\infty} L_n < \infty$; thus, $L_n \to 0$ as $n \to \infty$.

Thus, in both cases, for any $J \in \mathcal{F}$, there exists $n$ such that $L_n < |J|$ and therefore $J \notin \mathcal{F}_n$. Choose the largest $n$ such that $J \in \mathcal{F}_n$. Then $J \cap I_n \ne \emptyset$ and $|J| \le L_n < 2|I_n|$. Thus, $J \subset (x_n - 5r_n, x_n + 5r_n)$.

Proof of the Rogers–Taylor theorem. We will prove that for finite, compactly supported measures $\mu$, $h_\alpha(T) = 0$ and for any $A$,
$$h_\alpha(A) = 0 \implies \mu(A \setminus T) = 0.$$
The general case then follows by $\sigma$-additivity using the measures $\chi_{[-n,n]}\,d\mu$.

Fix $k, \delta \in (0, \infty)$ and denote $T_k = \{x \mid D^\alpha_\mu(x) > k\}$. For any $x \in T_k$, by definition of $D^\alpha_\mu$, there exists $r_x < \delta/10$ such that $\mu((x - r_x, x + r_x)) \ge k(2r_x)^\alpha$. The intervals $(x - r_x, x + r_x)$ cover the set $T_k$, so there is a countable disjoint subfamily $\mathcal{G}$ such that
$$T_k \subset \bigcup_{(x - r_x, x + r_x) \in \mathcal{G}} (x - 5r_x, x + 5r_x).$$

Using the deﬁnition of Hausdorﬀ measures, 5α (10rx )α ≤ hα,δ (Tk ) ≤ k (x−rx ,x+rx )∈G

(x−rx ,x+rx )∈G

μ((x − rx , x + rx )).

174

6. Measure decompositions

Since G is a disjoint family, this implies hα,δ (Tk ) ≤

5α μ(R). k

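Applying this with $k \to \infty$ shows $h_{\alpha,\delta}(T) = 0$ for all $\delta > 0$, so $h_\alpha(T) = 0$.

To prove the other claim, observe that $D_\mu^\alpha(x) < \infty$ implies
\[ \sup_{r \in (0,1]} \frac{\mu((x - r, x + r))}{r^\alpha} < \infty. \]
For any $a, b > 0$, using $r = \max\{a, b\}$ gives
\[ \sup_{a,b \in (0,1]} \frac{\mu((x - a, x + b))}{(a + b)^\alpha} \le \sup_{r \in (0,1]} \frac{\mu((x - r, x + r))}{r^\alpha}. \]
Thus, if we denote
\[ C_\mu^\alpha(x) = \sup_{a,b \in (0,1]} \frac{\mu((x - a, x + b))}{(a + b)^\alpha} \]
and write $E_k = \{x \mid C_\mu^\alpha(x) \le k\}$, then $T^c \subset \bigcup_{k \in \mathbb{N}} E_k$.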

Take $A \subset \mathbb{R}$ such that $h_\alpha(A) = 0$. Consider a countable cover of $A$ by intervals $I_n$ of length $|I_n| \le 1$. For $n$ such that $I_n \cap E_k \neq \emptyset$, using a point $x \in I_n \cap E_k$, $C_\mu^\alpha(x) \le k$ implies $\mu(I_n) \le C_\mu^\alpha(x)|I_n|^\alpha \le k|I_n|^\alpha$. Thus
\[ \mu(A \cap E_k) \le \sum_{n : I_n \cap E_k \neq \emptyset} \mu(I_n) \le \sum_{n=1}^\infty k|I_n|^\alpha. \]
Taking the infimum over covers with $|I_n| \le 1$ for all $n$ gives $\mu(A \cap E_k) \le k\,h_{\alpha,1}(A) = 0$. Taking the union over $k \in \mathbb{N}$ gives $\mu(A \cap T^c) = 0$.

In the special case $\alpha = 1$, the decomposition $\mu = \mu_{\alpha c} + \mu_{\alpha s}$ is the Lebesgue decomposition, because $h_1$ is Lebesgue measure.

These decompositions characterize the behavior of parts of $\mu$ with respect to sets of zero $h_\alpha$-measure. A different kind of decomposition is performed with respect to sets which are $\sigma$-finite with respect to $h_\alpha$:

Definition 6.31. The measure $\mu$ is strongly $\alpha$-continuous if $\mu(A) = 0$ for every set $A$ which is $\sigma$-finite with respect to $h_\alpha$. The measure $\mu$ is almost $\alpha$-singular if there exists $S$ which is $\sigma$-finite with respect to $h_\alpha$ such that $\mu(S^c) = 0$.


A Baire measure $\mu$ has a unique decomposition into a strongly $\alpha$-continuous part and an almost $\alpha$-singular part: uniqueness is proved in Exercise 6.15 and existence in Exercise 6.16. Note that a set is $\sigma$-finite with respect to counting measure if and only if it is countable, so the special case $\alpha = 0$ recovers continuous and pure point measures.

Definition 6.32. Let $\alpha \in (0, 1]$. A finite measure $\mu$ on $\mathbb{R}$ is uniformly $\alpha$-Hölder continuous (abbreviated U$\alpha$H) if there exists $C \in (0, \infty)$ such that for all $x \in \mathbb{R}$ and all $r \in (0, 1/2]$,
\[ \mu((x - r, x + r)) \le C(2r)^\alpha. \]

Theorem 6.33 (Rogers–Taylor). A finite measure $\mu$ on $\mathbb{R}$ is $\alpha$-continuous if and only if for every $\epsilon > 0$, there exists a decomposition $\mu = \mu_1 + \mu_2$ with $\mu_1$ a U$\alpha$H measure and $\mu_2$ a measure with $\mu_2(\mathbb{R}) < \epsilon$.

Proof. Assume that $\mu \ll h_\alpha$. Then in the notation of Theorem 6.29 and its proof, $\mu(T) = 0$. In particular, there exists $k \in \mathbb{N}$ such that $\mu(E_k^c) < \epsilon$. Denote
\[ d\mu_1 = \chi_{E_k}\,d\mu, \qquad d\mu_2 = \chi_{E_k^c}\,d\mu. \]
Then $\mu_2(\mathbb{R}) < \epsilon$ by definition. For any $x \in \mathbb{R}$ and all $r \in (0, 1/2]$, let us prove
\[ \mu_1((x - r, x + r)) = \mu((x - r, x + r) \cap E_k) \le k(2r)^\alpha. \]
If $(x - r, x + r) \cap E_k \neq \emptyset$, choosing a point $y$ in the intersection, we write $(x - r, x + r) = (y - a, y + b)$ with $a, b \in (0, 1]$, so the inequality follows from the definition of $E_k$. If $(x - r, x + r) \cap E_k = \emptyset$, the inequality is trivial.

Conversely, assume that $\mu$ has such a decomposition for every $\epsilon > 0$. Any U$\alpha$H measure is $\alpha$-continuous, so $\mu_1 \ll h_\alpha$. Decomposing $\mu_2$ into an $\alpha$-continuous and an $\alpha$-singular part and adding to $\mu_1$, we get a decomposition of $\mu$; in particular, $\mu_{\alpha s} = (\mu_2)_{\alpha s}$. It follows that $\mu_{\alpha s}(\mathbb{R}) = (\mu_2)_{\alpha s}(\mathbb{R}) < \epsilon$. Since $\epsilon$ is arbitrary, it follows that $\mu_{\alpha s} = 0$, so $\mu \ll h_\alpha$.

Existence of subsets with finite nonzero Hausdorff measure is a nontrivial result, whose proof goes beyond the scope of this text. By results of Besicovitch [10] and Davies [23] (see also [31, Theorem 1.6, Theorem 5.6]):

Theorem 6.34. For every Borel set $A \subset \mathbb{R}$ and $\alpha \in [0, 1]$, if $h_\alpha(A) = \infty$, there exists a compact subset $K \subset A$ such that $0 < h_\alpha(K) < \infty$.

In particular, taking $d\mu = \chi_K\,dh_\alpha$ and applying Theorem 6.33 shows:

Corollary 6.35. For every Borel set $A \subset \mathbb{R}$ and $\alpha \in [0, 1]$, if $h_\alpha(A) > 0$, there exists a U$\alpha$H measure $\nu$ such that $0 < \nu(\mathbb{R}) < \infty$ and $\operatorname{supp} \nu \subset A$.

This can be used to estimate the Hausdorff dimension of sets (Exercise 6.17).


6.4. Matrix-valued measures

We now consider matrix-valued measures $\Omega$, with positivity in the sense of operator order (i.e., the values are positive semidefinite matrices). This generalization will occur naturally in spectral theory when we consider full-line Jacobi matrices and full-line Schrödinger operators. Recall that $\mathcal{L}(\mathbb{C}^d)$ is the set of $d \times d$ matrices, and that inequalities between matrices should be interpreted in the sense of operator order.

Definition 6.36. A map $\Omega : \mathcal{B}_X \to \mathcal{L}(\mathbb{C}^d)$ is a positive $d \times d$-matrix valued measure on $X$ if it is $\sigma$-additive, $\Omega(\emptyset) = 0$, and for every $B \in \mathcal{B}_X$, $\Omega(B) \ge 0$.

Note that $\Omega$ cannot take infinite values, so finiteness of the measure is built into the definition. The following lemma provides a decomposition of matrix-valued measures in the style of Radon–Nikodym:

Lemma 6.37. If $\Omega$ is a positive $d \times d$-matrix valued Borel measure on $X$, then the following hold.
(a) $\operatorname{Tr} \Omega$ is a finite positive measure.
(b) There exists a matrix-valued Borel function $W : X \to \mathcal{L}(\mathbb{C}^d)$ such that
\[ \Omega(B) = \int_B W\,d\operatorname{Tr}\Omega \tag{6.27} \]
for any Borel set $B$.
(c) $\operatorname{Tr}\Omega$-a.e., $\operatorname{Tr} W = 1$ and $W \ge 0$.

Proof. (a) For any $v \in \mathbb{C}^d$, the map $\mu_v(B) = v^* \Omega(B) v$ is $\sigma$-additive. Since $\Omega(B) \ge 0$ implies $v^* \Omega(B) v \ge 0$, $\mu_v$ is a positive measure. Since $\mu_v(X) = v^* \Omega(X) v \in [0, \infty)$, it is a finite positive measure. Thus, so is
\[ \operatorname{Tr}\Omega = \sum_{j=1}^d \mu_{e_j}, \]
where $(e_j)_{j=1}^d$ denotes the standard basis of $\mathbb{C}^d$.

(b) For any set $B$ with $\operatorname{Tr}\Omega(B) = 0$, $\Omega(B) \ge 0$ implies $\Omega(B) = 0$ and then $v^* \Omega(B) v = 0$ for all $v$. Thus, for any $v \in \mathbb{C}^d$, $\mu_v \ll \operatorname{Tr}\Omega$, so by the Radon–Nikodym decomposition, there exists a positive function $f_v \in L^1(X, d\operatorname{Tr}\Omega)$ such that
\[ v^* \Omega(B) v = \int_B f_v\,d\operatorname{Tr}\Omega. \]
By the polarization identity for the sesquilinear form $(u, v) \mapsto u^* \Omega(B) v$,
\[ e_j^* \Omega(B) e_k = \frac{1}{4} \sum_{\omega \in \{1, i, -1, -i\}} \omega^{-1} \int_B f_{e_j + \omega e_k}\,d\operatorname{Tr}\Omega, \]
from which we read off functions
\[ w_{jk} = \frac{1}{4} \sum_{\omega \in \{1, i, -1, -i\}} \omega^{-1} f_{e_j + \omega e_k} \in L^1(X, d\operatorname{Tr}\Omega) \]
such that (6.27) holds. In particular, $\Omega_{jk}$ is a complex measure. Note also that each $w_{jk}$ is a.e. finite so, by changes on zero measure sets, we can assume $W : X \to \mathcal{L}(\mathbb{C}^d)$.

(c) For each $v \in \mathbb{C}^d$, we can express the positive measure $\mu_v$ as $d\mu_v = v^* W v\,d\operatorname{Tr}\Omega$ and conclude that $v^* W v \ge 0$ $\operatorname{Tr}\Omega$-a.e. Applying this to the countable set of $v \in (\mathbb{Q} + i\mathbb{Q})^d$ and taking a countable union of zero measure sets, we obtain a set $E$ with $\operatorname{Tr}\Omega(E) = 0$ such that $v^* W(x) v \ge 0$ for all $v \in (\mathbb{Q} + i\mathbb{Q})^d$ and all $x \in E^c$. Then, by continuity, $v^* W(x) v \ge 0$ for all $v \in \mathbb{C}^d$ and all $x \in E^c$. Thus, $W \ge 0$ a.e. Summing (6.27) on the diagonal gives $d\operatorname{Tr}\Omega = \operatorname{Tr} W\,d\operatorname{Tr}\Omega$, so $\operatorname{Tr} W = 1$ $\operatorname{Tr}\Omega$-a.e.

We proved that any positive $d \times d$ matrix-valued measure is of the form $d\Omega = W\,d\mu$, where $W = (w_{jk})_{j,k=1}^d \ge 0$ is a matrix-valued function with $\operatorname{Tr} W = 1$ and $\mu$ is a finite positive measure. It is sometimes natural to consider objects formally given by $d\Omega = W\,d\mu$ in greater generality, with the following warning. If $\int W\,d\mu = \infty$, the symbol $d\Omega = W\,d\mu$ usually does not define a map $\Omega$ on the original $\sigma$-algebra, since the entries $w_{jk}$ may not be elements of $L^1(X, d\mu)$, and in particular, for $j \neq k$, $\int w_{jk}\,d\mu$ may be undefined. Nonetheless, integration with respect to $W\,d\mu$ is well defined and it is the natural setting for defining vector-valued $L^2$ spaces:

Lemma 6.38. Let $\mu$ be a positive measure on $X$, and let $W$ be a positive matrix-valued function on $X$. Then the following hold.
(a) For vector-valued Borel functions $f : X \to \mathbb{C}^d$,
\[ \|f\| = \left( \int f^* W f\,d\mu \right)^{1/2} \tag{6.28} \]
is well defined (i.e., the integral is nonnegative) and defines a seminorm.
(b) The relation defined by $f \sim g$ if $\|f - g\| = 0$ is an equivalence relation.
(c) The set $L^2(X, \mathbb{C}^d, W\,d\mu)$ of equivalence classes with $\|f\| < \infty$ is a Hilbert space with the inner product
\[ \langle f, g \rangle = \int f^* W g\,d\mu. \tag{6.29} \]


Proof. Denote
\[ \mathcal{L}^2(X, \mathbb{C}^d, W\,d\mu) = \left\{ f : X \to \mathbb{C}^d \;\middle|\; \int f^* W f\,d\mu < \infty \right\}. \]
Writing $W = W^{1/2} W^{1/2}$, we note that
\[ \int f^* W f\,d\mu = \int (W^{1/2} f)^* (W^{1/2} f)\,d\mu, \]
so $f \in \mathcal{L}^2(X, \mathbb{C}^d, W\,d\mu)$ if and only if $W^{1/2} f \in \bigoplus_{j=1}^d L^2(X, d\mu)$. In particular, $\mathcal{L}^2(X, \mathbb{C}^d, W\,d\mu)$ is a vector space, and the map $Tf = W^{1/2} f$,
\[ T : \mathcal{L}^2(X, \mathbb{C}^d, W\,d\mu) \to \bigoplus_{j=1}^d L^2(X, d\mu), \]
is norm-preserving. By pulling back properties from $\bigoplus_{j=1}^d L^2(X, d\mu)$, it follows that (6.28) defines a norm, $\sim$ an equivalence relation, and (6.29) an inner product on the quotient space $L^2(X, \mathbb{C}^d, W\,d\mu)$.

Let $(f_n)_{n=1}^\infty$ be an arbitrary Cauchy sequence in $L^2(X, \mathbb{C}^d, W\,d\mu)$. Then $(W^{1/2} f_n)_{n=1}^\infty$ is a Cauchy sequence in $\bigoplus_{j=1}^d L^2(X, d\mu)$, so by completeness it has a limit $g \in \bigoplus_{j=1}^d L^2(X, d\mu)$ in the sense that
\[ \lim_{n \to \infty} \int (W^{1/2} f_n - g)^* (W^{1/2} f_n - g)\,d\mu = 0. \]
By the Riesz–Fischer theorem, there is a subsequence $(W^{1/2} f_{n_k})_{k=1}^\infty$ which converges to $g$ pointwise a.e. If $\lim_{k \to \infty} W(x)^{1/2} f_{n_k}(x) = g(x)$, then $g(x) \in \operatorname{Ran} W(x)^{1/2}$ (note that $\operatorname{Ran} W(x)^{1/2}$ is closed for each $x$). Choosing $f(x) \in \operatorname{Ran} W(x)$ with $W(x)^{1/2} f(x) = g(x)$ gives a factorization $g = W^{1/2} f$ such that the Cauchy sequence $(f_n)_{n=1}^\infty$ converges to $f$ in $L^2(X, \mathbb{C}^d, W\,d\mu)$.
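To make the objects of this section concrete, here is a small Python sketch for a purely atomic $2 \times 2$ matrix-valued measure. The atoms and matrices are hypothetical example data, and all helper names are choices made for this illustration. The sketch computes the density $W = \Omega(\{x\}) / \operatorname{Tr}\Omega(\{x\})$ at each atom, and evaluates the seminorm (6.28) with $\mu = \operatorname{Tr}\Omega$.

```python
# Hypothetical example: Omega = M0 * delta_{x0} + M1 * delta_{x1},
# with each M_j a positive semidefinite 2x2 matrix.
ATOMS = {
    0.0: ((2.0, 0.0), (0.0, 0.0)),  # rank one
    1.0: ((1.0, 1.0), (1.0, 3.0)),  # positive definite
}


def trace(M):
    return M[0][0] + M[1][1]


def density(M):
    """W = M / Tr M at an atom, so that Omega({x}) = W(x) * Tr Omega({x})."""
    t = trace(M)
    return tuple(tuple(entry / t for entry in row) for row in M)


def quad(M, v):
    """The quadratic form v* M v for a vector v with two components."""
    return sum(
        v[j].conjugate() * M[j][k] * v[k] for j in range(2) for k in range(2)
    )


def seminorm_sq(f):
    """||f||^2 as in (6.28), with mu = Tr Omega and W the density above."""
    return sum(quad(density(M), f(x)) * trace(M) for x, M in ATOMS.items()).real


W1 = density(ATOMS[1.0])
print(trace(W1))                          # Tr W at the atom x = 1
print(seminorm_sq(lambda x: (1.0, 1.0)))  # a constant vector-valued function
# A nonzero function with zero seminorm: it lives in the kernel of W(0).
print(seminorm_sq(lambda x: (0.0, 1.0) if x == 0.0 else (0.0, 0.0)))
```

The last call shows why (6.28) is only a seminorm on functions: at an atom where $W$ is not invertible, a nonzero $f$ can have $\|f\| = 0$, which is what the equivalence relation in Lemma 6.38(b) accounts for.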

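6.5. Exercises

6.1. Let $\mathcal{A}$ be a $\sigma$-algebra on $X$, and let $\mu : \mathcal{A} \to \mathbb{C}$ be a $\sigma$-additive map, i.e.,
\[ \mu\left( \bigcup_{j=1}^\infty A_j \right) = \sum_{j=1}^\infty \mu(A_j) \tag{6.30} \]
for all pairwise disjoint $A_j \in \mathcal{A}$ (in particular, the series is always convergent).
(a) The variation of $\mu$ is defined as
\[ |\mu|(A) = \sup\left\{ \sum_{j=1}^\infty |\mu(A_j)| \;\middle|\; A = \bigcup_{j=1}^\infty A_j,\ A_j \cap A_k = \emptyset \text{ if } j \neq k \right\}. \]
Prove that $|\mu|$ is a positive measure on $\mathcal{A}$.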


(b) If $|\mu|(E) = \infty$ for some set $E$, prove that there is a disjoint decomposition $E = A_1 \cup E_1$ such that $|\mu(A_1)| \ge 1$ and $|\mu|(E_1) = \infty$. Hint: Reduce to the case when $\mu$ takes only real values. In this case, find a partition $(B_n)_{n=1}^\infty$ of $E$ such that $\sum_{n=1}^\infty |\mu(B_n)| > 2 + 2|\mu(E)|$ and group the $B_n$'s by whether $\mu(B_n)$ is positive or negative.
(c) Prove that $|\mu|(X) < \infty$. Hint: Assume the opposite and use (b) inductively.
(d) Prove that there exist (positive) finite measures $\mu_j : \mathcal{A} \to [0, \infty)$ such that $\mu = \mu_1 - \mu_2 + i(\mu_3 - \mu_4)$ and $\mu_j \ll |\mu|$ for all $j$. Hint: If $\mu$ is real valued, use $\mu_\pm(A) = \frac{|\mu|(A) \pm \mu(A)}{2}$.
(e) Prove that there exists $f \in L^1(X, d|\mu|)$ such that $\mu(A) = \int_A f\,d|\mu|$ for all $A \in \mathcal{A}$, and prove that $|f| = 1$ $|\mu|$-a.e.

6.2. For any finite Borel measures $\mu, \nu$ on $\mathbb{R}$, prove that
\[ \lim_{T \to \infty} \frac{1}{T} \int_0^T \hat\mu(k)\,\overline{\hat\nu(k)}\,dk = \sum_{\lambda \in \mathbb{R}} \mu(\{\lambda\})\,\nu(\{\lambda\}). \]

6.3. Prove that measures $\mu, \nu$ on $X$ are mutually singular if there exist sets $S_n \subset X$ such that
\[ \lim_{n \to \infty} \mu(S_n) = 0, \qquad \lim_{n \to \infty} \nu(S_n^c) = 0. \]
Hint: Reduce to the case $\mu(S_n) \le 2^{-n}$ and consider the set $S = \bigcap_{m=1}^\infty \bigcup_{n=m}^\infty S_n$.

6.4. If $\nu$ denotes the Lebesgue measure on $\mathbb{R}$ and $f \in L^1(\mathbb{R}, d\nu)$ is not $0$, prove that the function $Mf$ defined in (6.14) is not in $L^1(\mathbb{R}, d\nu)$.

6.5. (a) Prove that for any $\sigma$-finite measure $\mu$ there exist a finite measure $\tilde\mu$ and a function $0 < w < 1$ such that $d\tilde\mu = w\,d\mu$.
(b) Prove that for any $\sigma$-finite measures $\mu, \nu$, there is a unique decomposition $d\mu = f\,d\nu + d\mu_s$, where $f \ge 0$ and $\mu_s \perp \nu$. Hint: Use (a) to reduce to finite measures $\tilde\mu, \tilde\nu$.

6.6. Consider a Baire measure $\mu$ with decomposition (6.11) and an essential support of the absolutely continuous part of $\mu$, denoted $S_{ac}$.
(a) Prove that the topological support of $f(x)\,dx$ is equal to the essential closure of $S_{ac}$, i.e., the set
\[ \{x \mid m(S_{ac} \cap (x - \epsilon, x + \epsilon)) > 0 \text{ for all } \epsilon > 0\}. \]
(b) Prove that there exists an absolutely continuous measure $\mu$ such that $\operatorname{supp} \mu = [0, 1]$ but $\mu$ is not mutually absolutely continuous with $\chi_{[0,1]}(x)\,dx$ (i.e., does not have $[0, 1]$ as an essential support). Hint: Use Exercise 1.16.


6.7. Prove that for $\alpha = 1$ and any $\delta > 0$, the Hausdorff outer measure $h^*_{1,\delta}$ is the Lebesgue outer measure on $\mathbb{R}$. In particular, so is $h^*_1$.

6.8. For any $\alpha > 1$ and $\delta > 0$, prove that $h^*_{\alpha,\delta}(A) = 0$ for every $A \subset \mathbb{R}$. In particular, $h^*_\alpha(A) = 0$.

6.9. Prove that $h_\alpha(A) = 0$ if and only if there exist open intervals $I_n$ with $\sum_{n=1}^\infty |I_n|^\alpha < \infty$ such that every point $x \in A$ belongs to infinitely many of the intervals $I_n$.

6.10. Prove that the set
\[ A = \left\{ \sum_{n=1}^\infty \frac{a_n}{n!} \;\middle|\; (a_n)_{n=1}^\infty \in \{0, 1\}^{\mathbb{N}} \right\} \]
is uncountable and has zero Hausdorff dimension.

6.11. Fix $\gamma \in (0, 1/2)$. The middle-$(1 - 2\gamma)$ Cantor set is obtained by denoting $f_0(x) = \gamma x$, $f_1(x) = \gamma x + 1 - \gamma$, and recursively defining
\[ C_0 = [0, 1], \qquad C_{n+1} = f_0(C_n) \cup f_1(C_n), \qquad C = \bigcap_{n=0}^\infty C_n \]
(the special case $\gamma = 1/3$ is the middle-third Cantor set). Denote $\alpha = \log 2 / \log(1/\gamma)$.
(a) Prove that for any intervals $I_1, I_2$ with
\[ |I_1|, |I_2| \ge \frac{\gamma}{1 - 2\gamma}\,d(I_1, I_2) \]
there is an interval $I$ such that $I_1 \cup I_2 \subset I$ and $|I_1|^\alpha + |I_2|^\alpha \ge |I|^\alpha$.
(b) For any $\delta, \epsilon > 0$, prove that there exists a finite cover of $C$ by closed intervals $I_1, \dots, I_n$ such that
\[ \sum_{j=1}^n |I_j|^\alpha \le h_{\alpha,\delta}(C) + \epsilon, \]
and any endpoint of any $I_j$ is a boundary point of $C_n$ for some $n$.
(c) Prove that the step in (a) can be applied to the finite cover in (b) iteratively until there is only one interval left, and conclude that $h_{\alpha,\delta}(C) \ge 1$.
(d) Prove that $h_\alpha(C) = 1$ and $\dim_H C = \alpha$.

6.12. (a) Let $A \subset \mathbb{R}$. If $\alpha < \dim_H A$, prove that $A$ is not $\sigma$-finite with respect to $h_\alpha$.
(b) Prove that for any $\alpha < 1$, $h_\alpha$ is not a $\sigma$-finite measure on $\mathbb{R}$.

6.13. Let $\mu$ be a Baire measure on $\mathbb{R}$, and let $S$ be a Borel set with $\mu(S) > 0$. If for some $\alpha \in [0, 1]$, $D_\mu^\alpha(x) < \infty$ for $\mu$-a.e. $x \in S$, prove that $\dim_H(S) \ge \alpha$.


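6.14. Let $\mu$ be a Baire measure on $\mathbb{R}$. Prove that
\[ \alpha_*(x) = \liminf_{r \to 0} \frac{\log \mu((x - r, x + r))}{\log r} \]
(with the convention $\log 0 = -\infty$) is a Borel function of $x \in \mathbb{R}$, and that
\[ D_\mu^\alpha(x) = 0 \quad \forall \alpha < \alpha_*(x), \qquad D_\mu^\alpha(x) = \infty \quad \forall \alpha > \alpha_*(x). \]

6.15. For measures $\mu, \nu$, prove that there is at most one way to decompose $\mu = \mu_1 + \mu_2$ so that both of the following hold.
(a) For any set $A$ which is $\sigma$-finite with respect to $\nu$, $\mu_1(A) = 0$.
(b) There exists a set $S$ which is $\sigma$-finite with respect to $\nu$ and $\mu_2(S^c) = 0$.

6.16. Let $\mu$ be a Baire measure on $\mathbb{R}$, and denote $P_\alpha = \{x \mid D_\mu^\alpha(x) = 0\}$.
(a) For the measures
\[ d\mu_{s\alpha c} = \chi_{P_\alpha}\,d\mu, \qquad d\mu_{a\alpha s} = \chi_{P_\alpha^c}\,d\mu, \]
prove that $\mu_{s\alpha c}$ is strongly $\alpha$-continuous, $\mu_{a\alpha s}$ is almost $\alpha$-singular, and $\mu = \mu_{s\alpha c} + \mu_{a\alpha s}$.
(b) Prove that $\{x \mid 0 < D_\mu^\alpha(x) < \infty\}$ is $\sigma$-finite with respect to $h_\alpha$ and that the measure $\mu_{\alpha c} - \mu_{s\alpha c}$ can be represented in the form $\mu_{\alpha c} - \mu_{s\alpha c} = f\,dh_\alpha$ for some Borel function $f \ge 0$.

6.17. Let $x_n \in \mathbb{R}$ and $c_n > 0$ be such that $\sum_{n=1}^\infty c_n^\alpha < \infty$ for some $\alpha \in (0, 1)$.
(a) Prove that for any $\alpha < \beta$ and any finite U$\beta$H measure $\mu$,
\[ \int \sum_{n=1}^\infty \frac{c_n^\alpha}{|x - x_n|^\alpha}\,d\mu(x) < \infty. \]
(b) Prove that $\sum_{n=1}^\infty \frac{c_n}{|x - x_n|} = \infty$ implies $\sum_{n=1}^\infty \frac{c_n^\alpha}{|x - x_n|^\alpha} = \infty$.
(c) Using Corollary 6.35, prove that the set
\[ S = \left\{ x \in \mathbb{R} \;\middle|\; \sum_{n=1}^\infty \frac{c_n}{|x - x_n|} = \infty \right\} \]
has Hausdorff dimension at most $\alpha$.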

Chapter 7

Herglotz functions

We have already encountered integrals of the form
\[ f(z) = \int \frac{1}{x - z}\,d\mu(x), \tag{7.1} \]
where $\mu$ is a finite Borel measure on $\mathbb{R}$; they appeared in the identity (5.18) relating resolvents of self-adjoint operators with the spectral measures. The function $f$ defined by (7.1) is sometimes called the Stieltjes transform or the Borel transform of the measure $\mu$. Since $\operatorname{Im} \frac{1}{x - z} = \frac{\operatorname{Im} z}{|x - z|^2}$, the function $f(z)$ maps the upper half-plane $\mathbb{C}_+ = \{z \in \mathbb{C} \mid \operatorname{Im} z > 0\}$ to itself. Functions with this property have a central place in the spectral theory of self-adjoint operators, and it is beneficial to consider them starting from a general perspective.

Definition 7.1. A Herglotz function is an analytic function $f : \mathbb{C}_+ \to \mathbb{C}_+$.

Although not all Herglotz functions are of the form (7.1), we will show that every Herglotz function has an integral representation which generalizes (7.1) in a natural way,
\[ f(z) = az + b + \int_{\mathbb{R}} \left( \frac{1}{x - z} - \frac{x}{1 + x^2} \right) d\mu(x) \tag{7.2} \]
with $a \ge 0$, $b \in \mathbb{R}$, and with $\mu$ a positive measure on $\mathbb{R}$ which obeys
\[ \int_{\mathbb{R}} \frac{1}{1 + x^2}\,d\mu(x) < \infty. \tag{7.3} \]
Measures obeying (7.3) are said to be Poisson-finite. To see that (7.2) is a generalization of (7.1), note that if $\mu$ is finite, then the two terms in the integrand in (7.2) are separately integrable, and the second is independent of $z$ so it is simply an additive constant. In this case, $f(z)$ is an affine function of $z$ plus an integral of the form (7.1).

From one perspective, a description of all Herglotz functions is possible because the condition $f(z) \in \mathbb{C}_+$ is a form of boundedness: boundedness away from $-i$. This becomes more transparent when working with bounded domains, so we will begin with brief considerations of Möbius transformations and of certain families of functions on the unit disk $\mathbb{D} = \{z \in \mathbb{C} \mid |z| < 1\}$. We will then prove the representation (7.2) and explore the various relations between the function $f$ and the measure $\mu$.

7.1. Möbius transformations

Recall the equivalence relation $\sim$ on $\mathbb{C}^2 \setminus \{\binom{0}{0}\}$ defined by
\[ u \sim v \quad \text{if and only if} \quad u = \lambda v \text{ for some } \lambda \in \mathbb{C} \setminus \{0\}. \tag{7.4} \]
The quotient space can be identified with the Riemann sphere $\hat{\mathbb{C}} = \mathbb{C} \cup \{\infty\}$ by using the quotient map $\pi : \mathbb{C}^2 \setminus \{\binom{0}{0}\} \to \hat{\mathbb{C}}$ defined by $\pi\binom{w_1}{w_2} = \frac{w_1}{w_2}$. We refer to $\binom{w_1}{w_2} \neq \binom{0}{0}$ as projective coordinates corresponding to $w = \frac{w_1}{w_2} \in \hat{\mathbb{C}}$.

Every invertible $2 \times 2$ matrix $A$ preserves cosets, so it induces a map on $\hat{\mathbb{C}}$. More explicitly, the Möbius transformation $f_A : \hat{\mathbb{C}} \to \hat{\mathbb{C}}$ is uniquely defined by $f_A \circ \pi = \pi \circ A$. This is often written as
\[ \begin{pmatrix} f_A(w) \\ 1 \end{pmatrix} \sim A \begin{pmatrix} w \\ 1 \end{pmatrix} \]
(even if $w = \infty$ or $f_A(w) = \infty$, with the convention $\binom{\infty}{1} \sim \binom{1}{0}$) or as
\[ f_A(w) = \frac{A_{11} w + A_{12}}{A_{21} w + A_{22}}, \]
where $A_{ij}$ denote entries of the matrix $A$.

Lemma 7.2. Let $A, B$ be invertible $2 \times 2$ matrices. Then the following hold.
(a) $f_A = \mathrm{id}$ if and only if $A = \lambda I$ for some $\lambda \in \mathbb{C} \setminus \{0\}$.
(b) $f_{AB} = f_A \circ f_B$.
(c) The map $f_A$ is a bijection of $\hat{\mathbb{C}}$ to itself.

Proof. (a) $f_A = \mathrm{id}$ means that every nonzero vector $w \in \mathbb{C}^2$ is an eigenvector of $A$. This is only possible if $A$ is a multiple of the identity matrix.


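(b) $f_A \circ \pi = \pi \circ A$ and $f_B \circ \pi = \pi \circ B$ imply
\[ f_{AB} \circ \pi = \pi \circ AB = (\pi \circ A) \circ B = f_A \circ \pi \circ B = f_A \circ f_B \circ \pi. \]
Since $\pi$ is surjective, this implies $f_{AB} = f_A \circ f_B$.

(c) follows from (a) and (b) using $f_A \circ f_{A^{-1}} = f_{A^{-1}} \circ f_A = f_I = \mathrm{id}$.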
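The composition rule $f_{AB} = f_A \circ f_B$ is easy to spot-check numerically. The following Python sketch is only an illustration: it works with finite points and assumes that no denominator vanishes, so the point $\infty$ is not handled. The matrices `A`, `B` and the test point `w` are arbitrary hypothetical choices.

```python
def mobius(A, w):
    """Apply f_A to a finite point w, for A = ((A11, A12), (A21, A22)).

    Assumes A21 * w + A22 != 0, i.e. f_A(w) is finite.
    """
    (a, b), (c, d) = A
    return (a * w + b) / (c * w + d)


def matmul(A, B):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


# Arbitrary invertible example matrices and an arbitrary test point.
A = ((1, 2j), (0.5, 1))
B = ((2, -1), (1j, 3))
w = 0.3 + 0.7j

lhs = mobius(matmul(A, B), w)      # f_{AB}(w)
rhs = mobius(A, mobius(B, w))      # f_A(f_B(w))
print(abs(lhs - rhs))              # agrees up to rounding error

# Rescaling the matrix does not change the map: f_{cA} = f_A.
scaled = tuple(tuple(3j * entry for entry in row) for row in A)
print(abs(mobius(scaled, w) - mobius(A, w)))
```

The second check reflects part (a) of the lemma: a matrix and any nonzero scalar multiple of it induce the same Möbius transformation.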

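Due to the first property, when considering Möbius transformations, it is common to normalize the matrix by the condition $\det A = 1$. Recall that $\mathrm{SL}(2, \mathbb{C}) = \{A \in \mathcal{L}(\mathbb{C}^2) \mid \det A = 1\}$. Even with that normalization, there is a remaining ambiguity: matrices $A$ and $-A$ have the same determinant and $f_A = f_{-A}$. We note also that in computational problems, it is convenient not to worry about normalization:

Example 7.3 (Cayley transform). The function
\[ \gamma(z) = \frac{z - i}{z + i} \tag{7.5} \]
is an analytic bijection from $\mathbb{C}_+$ to $\mathbb{D}$. Its inverse is
\[ \gamma^{-1}(w) = \frac{w + 1}{iw - i}. \]

Proof. From the calculation
\[ 1 - |\gamma(z)|^2 = \frac{|z + i|^2 - |z - i|^2}{|z + i|^2} = \frac{4 \operatorname{Im} z}{|z + i|^2}, \]
it follows that $\gamma(z) \in \mathbb{D}$ if and only if $z \in \mathbb{C}_+$. The formula for $\gamma^{-1}$ follows from
\[ \begin{pmatrix} 1 & -i \\ 1 & i \end{pmatrix} \begin{pmatrix} 1 & 1 \\ i & -i \end{pmatrix} = \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix}. \]

Example 7.4. For any $z_0 \in \mathbb{D}$,
\[ \gamma_{z_0}(z) = \frac{z - z_0}{1 - \bar{z}_0 z} \]
is an analytic bijection of $\mathbb{D}$ to itself and $\gamma_{z_0}^{-1} = \gamma_{-z_0}$.

Proof. Similarly to the previous example, this follows from the calculations
\[ 1 - |\gamma_{z_0}(z)|^2 = \frac{(1 - |z_0|^2)(1 - |z|^2)}{|1 - \bar{z}_0 z|^2} \]
and
\[ \begin{pmatrix} 1 & -z_0 \\ -\bar{z}_0 & 1 \end{pmatrix} \begin{pmatrix} 1 & z_0 \\ \bar{z}_0 & 1 \end{pmatrix} = \begin{pmatrix} 1 - |z_0|^2 & 0 \\ 0 & 1 - |z_0|^2 \end{pmatrix}. \]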


These examples hint at a general property of Möbius transformations. Let us call a generalized disk in $\hat{\mathbb{C}}$ any of the following regions: a disk $\{w \in \mathbb{C} \mid |w - c| < r\}$ with $c \in \mathbb{C}$, $r > 0$, the complement of its closure $\{w \in \mathbb{C} \mid |w - c| > r\} \cup \{\infty\}$, or a half-plane $\{w \in \mathbb{C} \mid \operatorname{Re}(e^{-i\phi} w) > t\}$ with $\phi, t \in \mathbb{R}$. The three cases are distinguished by whether $\infty$ is outside, inside, or on the boundary of the generalized disk.

Lemma 7.5. Every generalized disk in $\hat{\mathbb{C}}$ can be represented in projective coordinates in the form
\[ \begin{pmatrix} w_1 \\ w_2 \end{pmatrix}^* M \begin{pmatrix} w_1 \\ w_2 \end{pmatrix} > 0 \tag{7.6} \]
for some Hermitian $2 \times 2$ matrix $M$ with $\det M < 0$, and conversely, every set of this form is a generalized disk.

Proof. The condition $|w - c| < r$ is equivalent to (7.6) with the matrix
\[ M_{c,r} = \begin{pmatrix} -1 & c \\ \bar{c} & r^2 - |c|^2 \end{pmatrix}. \]
The region $|w - c| > r$ can likewise be represented by (7.6) with $M = -M_{c,r}$. The half-plane condition $\operatorname{Re}(e^{-i\phi} w) > t$ is equivalent to (7.6) with the matrix
\[ \tilde{M}_{\phi,t} = \begin{pmatrix} 0 & e^{i\phi} \\ e^{-i\phi} & -2t \end{pmatrix}. \]
The matrices $M_{c,r}$, $-M_{c,r}$, and $\tilde{M}_{\phi,t}$ are $2 \times 2$ Hermitian matrices with strictly negative determinant.

For the converse, assume that $M = M^*$ and $\det M < 0$. Note that multiplying $M$ by a positive scalar does not affect the condition (7.6). By separating cases based on the sign of $M_{11}$, it is straightforward to see that $M$ is a positive multiple of some matrix $M_{c,r}$, $-M_{c,r}$, or $\tilde{M}_{\phi,t}$.

Lemma 7.6. The generalized disk described by (7.6) is a disk in the complex plane if and only if $M_{11} < 0$. In this case, its radius $r$ and center $c$ are given by
\[ r = -\frac{\sqrt{-\det M}}{M_{11}}, \qquad c = -\frac{M_{12}}{M_{11}}. \tag{7.7} \]

Proof. This follows from the previous proof by noting that the formulas (7.7) are invariant under rescaling $M$ by a positive constant and by verifying them for the matrix $M_{c,r}$.

Proposition 7.7. Möbius transformations map every generalized disk bijectively to a generalized disk.


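Proof. For the generalized disk given by $w^* M w > 0$, its image under the Möbius transformation $f_A$ is characterized by the condition $w^* M' w > 0$, where $M' = (A^{-1})^* M A^{-1}$. Since $M'$ is also Hermitian and $\det M' = \det M / |\det A|^2$, the image is also a generalized disk.

An important special case is the upper half-plane:

Example 7.8. Let
\[ \mathcal{J} = \begin{pmatrix} 0 & i \\ -i & 0 \end{pmatrix}. \tag{7.8} \]
A vector $w \in \mathbb{C}^2 \setminus \{\binom{0}{0}\}$ corresponds to a coset $\pi(w) = \frac{w_1}{w_2}$ in $\mathbb{C}_+$ if and only if $w^* \mathcal{J} w > 0$.

Proof. This follows from the calculation
\[ \begin{pmatrix} w_1 \\ w_2 \end{pmatrix}^* \mathcal{J} \begin{pmatrix} w_1 \\ w_2 \end{pmatrix} = -i(w_1 \bar{w}_2 - \bar{w}_1 w_2) = 2 \operatorname{Im}(w_1 \bar{w}_2) = 2|w_2|^2 \operatorname{Im} \frac{w_1}{w_2}. \]

Definition 7.9. A matrix $A$ is called $\mathcal{J}$-expanding if
\[ A^* \mathcal{J} A - \mathcal{J} \ge 0 \tag{7.9} \]
and $\mathcal{J}$-contracting if
\[ \mathcal{J} - A^* \mathcal{J} A \ge 0. \tag{7.10} \]

These notions provide the criterion for a Möbius transformation to map the upper half-plane into (but not necessarily onto) itself.

Lemma 7.10. If $A$ is invertible and $\mathcal{J}$-contracting, then $f_{A^{-1}}(\mathbb{C}_+) \subset \mathbb{C}_+$.

Proof. By (7.10), $(Av)^* \mathcal{J} Av > 0$ implies $v^* \mathcal{J} v \ge v^* A^* \mathcal{J} A v > 0$, so $\pi(Av) \in \mathbb{C}_+$ implies $\pi(v) \in \mathbb{C}_+$.

If $A \in \mathrm{SL}(2, \mathbb{C})$ is $\mathcal{J}$-expanding and $\mathcal{J}$-contracting, it is said to be $\mathcal{J}$-unitary. For our choice of $\mathcal{J}$, $\mathcal{J}$-unitary matrices can be described in terms of entries:

Lemma 7.11. An $\mathrm{SL}(2, \mathbb{C})$ matrix is $\mathcal{J}$-unitary if and only if it has real entries.

Of course, this set is denoted by
\[ \mathrm{SL}(2, \mathbb{R}) = \{A \in \mathrm{SL}(2, \mathbb{C}) \mid A_{jk} \in \mathbb{R} \text{ for all } j, k\}. \]

Proof. The set $G = \{A \in \mathrm{SL}(2, \mathbb{C}) \mid A^* \mathcal{J} A = \mathcal{J}\}$ is a group. If $A \in \mathrm{SL}(2, \mathbb{R})$, a direct calculation shows $A^* \mathcal{J} A = \mathcal{J}$. Thus, $\mathrm{SL}(2, \mathbb{R}) \subset G$.

Conversely, assume that $A \in \mathrm{SL}(2, \mathbb{C})$ is $\mathcal{J}$-unitary. Then $f_{A^{-1}}(\mathbb{C}_+) = \mathbb{C}_+$, so $f_A$ preserves $\mathbb{R} \cup \{\infty\}$. We separate cases based on the value of $f_A(\infty)$.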

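If $f_A(\infty) = \infty$, then $A \binom{1}{0} \sim \binom{1}{0}$, which shows that
\[ A = \begin{pmatrix} a_{11} & a_{12} \\ 0 & a_{22} \end{pmatrix}. \]
Thus, $f_A(z) = (a_{11}/a_{22}) z + (a_{12}/a_{22})$ is an affine map on $\mathbb{C}$. Since it preserves $\mathbb{C}_+$, it must have $a_{11}/a_{22} > 0$ and $a_{12}/a_{22} \in \mathbb{R}$. Together with $\det A = a_{11} a_{22} = 1$, this implies $A \in \mathrm{SL}(2, \mathbb{R})$.

If $f_A(\infty) = \lambda \in \mathbb{R}$, choosing
\[ B = \begin{pmatrix} 0 & 1 \\ -1 & \lambda \end{pmatrix} \in \mathrm{SL}(2, \mathbb{R}) \]
gives $f_B \circ f_A = f_{BA}$, which maps $\infty$ to $\infty$, and $BA \in G$. By the previous case, $BA \in \mathrm{SL}(2, \mathbb{R})$, so $A \in \mathrm{SL}(2, \mathbb{R})$.

Some further properties of $\mathcal{J}$-contracting matrices are left as exercises. While we will only use the notion of a $\mathcal{J}$-contracting matrix for the matrix (7.8) which corresponds to $\mathbb{C}_+$, other choices are also of interest; most notably,
\[ j = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} \]
corresponds to $\mathbb{D}$.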
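The characterization of $\mathcal{J}$-unitary matrices by real entries can be illustrated with a short numerical check. The Python sketch below is not part of the argument; the matrices `A_REAL` and `A_FLIP` are hypothetical examples chosen for this illustration, both with determinant $1$.

```python
J = ((0, 1j), (-1j, 0))
NEG_J = ((0, -1j), (1j, 0))


def matmul(A, B):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def adjoint(A):
    """Conjugate transpose of a 2x2 matrix."""
    return tuple(
        tuple(complex(A[j][i]).conjugate() for j in range(2)) for i in range(2)
    )


def j_form(A):
    """The matrix A* J A appearing in (7.9) and (7.10)."""
    return matmul(adjoint(A), matmul(J, A))


def close(A, B, tol=1e-12):
    return all(abs(A[i][j] - B[i][j]) <= tol for i in range(2) for j in range(2))


def mobius(A, w):
    """Apply f_A to a finite point w (assumes a nonzero denominator)."""
    (a, b), (c, d) = A
    return (a * w + b) / (c * w + d)


A_REAL = ((2, 3), (1, 2))        # real entries, det = 1
A_FLIP = ((1j, 0), (0, -1j))     # det = 1 but not real; f_A(z) = -z

z = 0.3 + 0.7j
print(close(j_form(A_REAL), J), mobius(A_REAL, z).imag > 0)
print(close(j_form(A_FLIP), NEG_J), mobius(A_FLIP, z).imag < 0)
```

For `A_REAL` the form $A^* \mathcal{J} A$ equals $\mathcal{J}$ and the upper half-plane is preserved; for `A_FLIP` the form equals $-\mathcal{J}$ and the upper half-plane is mapped to the lower half-plane.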

7.2. Schur functions and convergence

For analytic functions on a fixed region $\Omega \subset \mathbb{C}$, the natural notion of convergence is uniform convergence on compact subsets of $\Omega$. For instance, since every contour in $\Omega$ has a compact image, this notion of convergence allows us to exchange limits and contour integrals. It is a remarkable fact that uniform convergence on compacts can sometimes be concluded from pointwise convergence. In this section, we will not aim for a general treatment, but we present a self-contained discussion for the following class of functions.

Definition 7.12. A Schur function is an analytic function $g : \mathbb{D} \to \overline{\mathbb{D}}$.

By the maximum principle, if $|g(z)| = 1$ for some $z \in \mathbb{D}$, then $g$ is constant. Thus, any Schur function is either a unimodular constant $g \equiv e^{i\phi}$ or a function $g : \mathbb{D} \to \mathbb{D}$.

Lemma 7.13 (Schwarz lemma). If $g$ is a Schur function and $g(0) = 0$, then $g(z)/z$ is a Schur function.

Proof. Since $g(0) = 0$, the function $h(z) = g(z)/z$ has a removable singularity at $0$. Thus, $h$ is an analytic function on $\mathbb{D}$, and by the maximum principle, for any $r < 1$,
\[ \sup_{z : |z| \le r} |h(z)| = \sup_{z : |z| = r} |h(z)| = \frac{1}{r} \sup_{z : |z| = r} |g(z)| \le \frac{1}{r}. \]
Taking the limit $r \uparrow 1$ completes the proof.


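Theorem 7.14 (Schwarz–Pick theorem for $\mathbb{D}$). Let $g : \mathbb{D} \to \mathbb{D}$ be an analytic function. Then for all $z, w \in \mathbb{D}$,
\[ \left| \frac{g(z) - g(w)}{1 - \overline{g(w)}\,g(z)} \right| \le \left| \frac{z - w}{1 - \bar{w} z} \right|. \tag{7.11} \]

Proof. Using the Möbius transformations from Example 7.4, the inequality (7.11) can be written as
\[ |\gamma_{g(w)}(g(z))| \le |\gamma_w(z)| \]
or, in terms of $h = \gamma_{g(w)} \circ g \circ \gamma_w^{-1}$ and $\zeta = \gamma_w(z)$, as $|h(\zeta)| \le |\zeta|$. Since $h : \mathbb{D} \to \mathbb{D}$ and $h(0) = 0$, this follows from the Schwarz lemma.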
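As a sanity check of (7.11), the following Python sketch evaluates both sides on a grid of points in the unit disk. The function $g(z) = z^2$ is just one convenient example of an analytic map of the disk to itself, and the grid and helper names are hypothetical choices for this illustration.

```python
def pseudo_hyperbolic(z, w):
    """The quantity |z - w| / |1 - conj(w) z| from (7.11)."""
    return abs((z - w) / (1 - w.conjugate() * z))


def g(z):
    """Example analytic self-map of the unit disk."""
    return z * z


# A coarse grid of points with |z| < 1.
grid = [complex(0.1 * a, 0.1 * b)
        for a in range(-9, 10, 3) for b in range(-9, 10, 3)]
points = [p for p in grid if abs(p) < 1]

# Largest violation of (7.11) over all pairs; it should not be positive
# beyond rounding error.
worst = max(
    pseudo_hyperbolic(g(z), g(w)) - pseudo_hyperbolic(z, w)
    for z in points for w in points
)
print(worst <= 1e-12)
```

For this particular $g$ the inequality can also be seen by hand, since the left side of (7.11) equals the right side multiplied by $|z + w| / |1 + \bar{w} z| \le 1$.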

The Schwarz–Pick theorem has an analogue for the half-plane (Exercise 7.5). In the setting of the unit disk, we use the Schwarz–Pick theorem to gain insight about the set of all Schur functions, viewed as a subset of $C(\mathbb{D}, \mathbb{C})$.

Lemma 7.15. The set of all Schur functions is equicontinuous on $\mathbb{D}$.

Proof. For $|z|, |w| < r < 1$, for any Schur function $g$, we claim that
\[ |g(z) - g(w)| \le \frac{2}{1 - r^2} |z - w|. \]
If $g$ is constant, this is trivial; otherwise, it follows from the Schwarz–Pick theorem together with $|1 - \overline{g(w)}\,g(z)| \le 2$ and $|1 - \bar{w} z| \ge 1 - r^2$. This implies equicontinuity at any point $z \in \mathbb{D}$.

Proposition 7.16. If a sequence of Schur functions converges pointwise on a dense set in $\mathbb{D}$, then it converges to a Schur function uniformly on compact subsets of $\mathbb{D}$.

Proof. Denote the sequence by $g_n$. Since the functions $g_n$ are equicontinuous and converge pointwise on a dense set, by Theorem 2.14, they converge uniformly on compact subsets of $\mathbb{D}$ to a continuous function $g_\infty : \mathbb{D} \to \overline{\mathbb{D}}$. Analyticity of $g_\infty$ now follows from Morera's theorem: since any contour $\gamma$ in $\mathbb{D}$ has a compact image, by uniform convergence on the range of $\gamma$,
\[ \int_\gamma \lim_{n \to \infty} g_n(z)\,dz = \lim_{n \to \infty} \int_\gamma g_n(z)\,dz = 0. \]

From the previous results, we prove a special case of Montel's theorem, a more general result in complex analysis.

Theorem 7.17. Every sequence of Schur functions has a subsequence which converges to a Schur function uniformly on compact subsets of $\mathbb{D}$.


Proof. By a diagonalization argument (Lemma 2.13), there is a subsequence which converges pointwise on a countable dense subset of $\mathbb{D}$. By equicontinuity and Theorem 2.14, this subsequence converges uniformly on compact subsets of $\mathbb{D}$.

Note that any Herglotz function $f$ corresponds to a Schur function $g = \gamma \circ f \circ \gamma^{-1}$, where $\gamma$ is the Cayley transform (7.5). This generates all the Schur functions except the unimodular constants $g(z) \in \partial\mathbb{D}$, which correspond to $f(z) = c \in \mathbb{R} \cup \{\infty\}$. By this conjugation, since $\gamma, \gamma^{-1} : \hat{\mathbb{C}} \to \hat{\mathbb{C}}$ are continuous, the previous results immediately extend to Herglotz functions:

Corollary 7.18. If a sequence of Herglotz functions converges (on $\hat{\mathbb{C}}$) pointwise on a dense set in $\mathbb{C}_+$, then it converges uniformly on compact subsets of $\mathbb{C}_+$ to a Herglotz function or to a constant $c \in \mathbb{R} \cup \{\infty\}$.

Corollary 7.19. Every sequence of Herglotz functions has a subsequence which converges uniformly on compact subsets of $\mathbb{C}_+$ to a Herglotz function or to a constant $c \in \mathbb{R} \cup \{\infty\}$.

The set of Schur functions can be equipped with a metric which corresponds to convergence on compacts (Exercise 7.4); with such a metric, Theorem 7.17 tells us that the set of Schur functions is a compact metric space.

7.3. Carathéodory functions

The Cauchy integral formula represents an analytic function inside a region by the values on the boundary. In fact, more is true: an analytic function on $\mathbb{D}$ can, up to an imaginary constant, be reconstructed from the real part of its values on $\partial\mathbb{D}$:

Proposition 7.20 (Schwarz integral formula). Let $F$ be analytic in a neighborhood of $\overline{\mathbb{D}}$. Then for all $z \in \mathbb{D}$,
\[ F(z) = i \operatorname{Im} F(0) + \int \frac{e^{i\theta} + z}{e^{i\theta} - z} \operatorname{Re} F(e^{i\theta})\,\frac{d\theta}{2\pi}. \tag{7.12} \]

We mention Proposition 7.20 purely for context: we will not use or directly prove it, but our goal is to describe a generalization of this formula in which $F$ is not necessarily analytic on the boundary, and $\operatorname{Re} F(e^{i\theta})\,\frac{d\theta}{2\pi}$ is replaced by a positive measure. Proposition 7.20 will follow easily from that generalization.

We will use the set $\mathcal{M}(\partial\mathbb{D})$ of finite positive Borel measures on $\partial\mathbb{D}$ and the notion of weak convergence of measures (Definition 2.57).


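Theorem 7.21. Every analytic function $F : \mathbb{D} \to \{z \in \mathbb{C} \mid \operatorname{Re} z \ge 0\}$ can be written uniquely in the form
\[ F(z) = i\beta + \int \frac{e^{i\theta} + z}{e^{i\theta} - z}\,d\rho(\theta) \tag{7.13} \]
for some $\beta \in \mathbb{R}$ and $\rho \in \mathcal{M}(\partial\mathbb{D})$. The constant $\beta$ and the measure $\rho$ can be obtained from $F$ by $\beta = \operatorname{Im} F(0)$ and
\[ d\rho(\theta) = \operatorname*{w-lim}_{r \uparrow 1} \operatorname{Re} F(re^{i\theta})\,\frac{d\theta}{2\pi}. \]

Remark 7.22. This integral representation is often discussed in the context of Carathéodory functions, which are analytic functions $F : \mathbb{D} \to \{z \in \mathbb{C} \mid \operatorname{Re} z \ge 0\}$ with $F(0) = 1$. Since (7.13) implies $F(0) = i\beta + \rho(\partial\mathbb{D})$, Carathéodory functions correspond to the case when $\beta = 0$ and $\rho$ is a probability measure.

The proof of Theorem 7.21 requires some preliminary statements. Let us begin with the easy direction.

Lemma 7.23. For any $\beta \in \mathbb{R}$ and $\rho \in \mathcal{M}(\partial\mathbb{D})$, (7.13) defines an analytic function $F : \mathbb{D} \to \{z \in \mathbb{C} \mid \operatorname{Re} z \ge 0\}$. If moments of $\rho$ are denoted for $k \in \mathbb{Z}$ by
\[ c_k = \int e^{-ik\theta}\,d\rho(\theta), \]
then the power series representation of $F$ around $0$ is
\[ F(z) = i\beta + c_0 + 2 \sum_{k=1}^\infty c_k z^k. \tag{7.14} \]

Proof. The integral kernel in (7.12) can be expanded as a geometric series as
\[ \frac{e^{i\theta} + z}{e^{i\theta} - z} = \frac{1 + z e^{-i\theta}}{1 - z e^{-i\theta}} = 1 + \sum_{k=1}^\infty 2 z^k e^{-ik\theta}. \tag{7.15} \]
Substituting (7.15) into (7.13) gives
\[ F(z) = i\beta + \int \left( 1 + \sum_{k=1}^\infty 2 z^k e^{-ik\theta} \right) d\rho(\theta). \tag{7.16} \]
The series and integral can be exchanged by Fubini's theorem, since
\[ \int \left( 1 + \sum_{k=1}^\infty |2 z^k e^{-ik\theta}| \right) d\rho(\theta) = \frac{1 + |z|}{1 - |z|}\,\rho(\partial\mathbb{D}) < \infty. \]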


Integrating (7.16) term by term using the moments of $\rho$ gives the power series (7.14). As noted, by Fubini's theorem, the power series is convergent for $z \in \mathbb{D}$; thus, it defines an analytic function on $\mathbb{D}$. It follows from
\[ \operatorname{Re} \frac{e^{i\theta} + z}{e^{i\theta} - z} = \frac{1 - |z|^2}{|e^{i\theta} - z|^2} > 0 \tag{7.17} \]
that $\operatorname{Re} F(z) \ge 0$ for all $z \in \mathbb{D}$.
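For a purely atomic measure, both sides of (7.14) are finite sums or rapidly convergent series, so the lemma can be checked numerically. The Python sketch below assumes a hypothetical measure $\rho$ with three point masses on $\partial\mathbb{D}$ and a hypothetical value of $\beta$; it compares the integral form (7.13) with a truncation of the power series (7.14).

```python
import cmath

# Hypothetical atomic measure on the unit circle: (angle theta, mass) pairs.
ATOMS = [(0.3, 0.5), (2.0, 1.25), (4.5, 0.25)]
BETA = 0.7


def F_integral(z):
    """F(z) from (7.13); the integral is a finite sum for an atomic measure."""
    return 1j * BETA + sum(
        m * (cmath.exp(1j * t) + z) / (cmath.exp(1j * t) - z) for t, m in ATOMS
    )


def moment(k):
    """c_k = integral of exp(-i k theta) with respect to rho."""
    return sum(m * cmath.exp(-1j * k * t) for t, m in ATOMS)


def F_series(z, terms=80):
    """Truncation of the power series (7.14)."""
    return 1j * BETA + moment(0) + 2 * sum(
        moment(k) * z ** k for k in range(1, terms + 1)
    )


z = 0.4 - 0.3j                      # a point with |z| = 0.5
print(abs(F_integral(z) - F_series(z)))
print(F_integral(z).real > 0)
```

With $|z| = 0.5$ and 80 terms, the truncation error of the series is far below floating-point precision, so the two values agree up to rounding.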

Lemma 7.24. If, for a sequence of measures ρn ∈ M(∂D), the limits of moments (7.18) e−ikθ dρn (θ) lim n→∞

are convergent for k ∈ N0 , then ρn converge weakly to some measure ρ∞ ∈ M(∂D). Proof. Observe the corresponding functionals Λn (h) = h dρn on C(∂D). Using (7.18) for k = 0, we conclude that ρn (∂D) converge; as ρn are positive measures, Λn = ρn (∂D), so Λn is a bounded sequence of linear functionals. By the statement (7.18) and its complex conjugate, the limit Λ∞ (h) = lim Λn (h) n→∞

is convergent for h = e−ikθ for all k ∈ Z. By linearity, it is convergent if h is a trigonometric polynomial, and by density and boundedness, it is convergent for every h ∈ C(∂D) by Lemma 2.46. The limit Λ∞ (h) is a positive linear functional on C(∂D), so it corresponds to some ρ∞ ∈ M(∂D) by the Riesz–Markov theorem. Proof of Theorem 7.21. For analytic F : D → {z ∈ C | Re z ≥ 0}, denote by ∞ ak z k F (z) = k=0

its power series centered at 0. For r < 1, consider the measures dθ dρr = Re F (reiθ ) . 2π They are positive measures because Re F ≥ 0. By Cauchy’s integral theorem, the moments of the function F (reiθ ) for k ∈ Z are

2π ak rk k ≥ 0 1 iθ −ikθ dθ −k−1 = F (re )e F (rz)z dz = (7.19) 2π 2πi ∂D 0 k ≤ −1. 0 Complex-conjugating and replacing k by −k gives

2π 0 k≥1 dθ −ikθ = F (reiθ )e −k 2π a−k r k ≤ 0. 0

(7.20)

7.4. The Herglotz representation

193

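Taking the average of (7.19) and (7.20) gives the moments of $\operatorname{Re} F(re^{i\theta})$ as
\[
\int \operatorname{Re} F(re^{i\theta})\, e^{-ik\theta}\, \frac{d\theta}{2\pi} = \begin{cases} \frac12 a_k r^k & k \ge 1 \\ \operatorname{Re} a_0 & k = 0 \\ \frac12 \bar a_{-k} r^{-k} & k \le -1. \end{cases}
\]
By Lemma 7.24, the measures $\rho_r$ converge weakly as $r \uparrow 1$. Also, comparing these moments with Lemma 7.23 and with the Taylor expansion of $F(rz)$ shows that
\[
F(rz) = i \operatorname{Im} F(0) + \int \frac{e^{i\theta} + z}{e^{i\theta} - z}\, \operatorname{Re} F(re^{i\theta})\, \frac{d\theta}{2\pi}.
\]
Since $F(rz) \to F(z)$ as $r \uparrow 1$, weak convergence implies (7.13) with $\beta = \operatorname{Im} F(0)$ and $d\rho = \operatorname{w-lim}_{r \uparrow 1} d\rho_r$.

Beyond the existence and uniqueness of the representation (7.13), we also want to know its continuity properties.

Theorem 7.25. Given analytic functions $F_n : \mathbb{D} \to \{z \in \mathbb{C} \mid \operatorname{Re} z \ge 0\}$, $n \in \mathbb{N} \cup \{\infty\}$, with representations
\[
F_n(z) = i\beta_n + \int \frac{e^{i\theta} + z}{e^{i\theta} - z}\, d\rho_n(\theta),
\]
the following are equivalent:
(a) $F_n(z) \to F_\infty(z)$ for every $z \in \mathbb{D}$.
(b) The sequence $F_n$ converges to $F_\infty$ uniformly on compact subsets of $\mathbb{D}$.
(c) $\beta_n \to \beta_\infty$ and $\rho_n \xrightarrow{w} \rho_\infty$.

Proof. (a) $\implies$ (b): This follows from Proposition 7.16 applied to the Schur functions $f_n(z) = \gamma(iF_n(z))$, where $\gamma$ is the Cayley transform (7.5).
(b) $\implies$ (c): $\beta_n = \operatorname{Im} F_n(0) \to \operatorname{Im} F_\infty(0) = \beta_\infty$. Cauchy's integral formula applied to a circle of radius $r < 1$ implies that Taylor coefficients of $F_n$ converge to those of $F_\infty$. Thus, moments of $\rho_n$ converge to those of $\rho_\infty$, so $\rho_n \xrightarrow{w} \rho_\infty$ by Lemma 7.24.
(c) $\implies$ (a): For every $z \in \mathbb{D}$, the function $e^{i\theta} \mapsto \frac{e^{i\theta} + z}{e^{i\theta} - z}$ is in $C(\partial\mathbb{D})$, so this follows by the definition of weak convergence of measures.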
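The moment formula used in the proof of Theorem 7.21 can be checked numerically. The sketch below uses the sample function $F(z) = (1+z)/(1-z)$, which is an illustrative choice and not taken from the text; for it $a_0 = 1$ and $a_k = 2$ for $k \ge 1$, so the $k$-th moment of $\operatorname{Re} F(re^{i\theta})$ should equal $r^{|k|}$. The helper names are hypothetical.

```python
import cmath
import math

def F(z):
    # Sample analytic function on the unit disk with Re F >= 0 (illustrative choice).
    return (1 + z) / (1 - z)

def moment(r, k, n=512):
    # k-th moment of Re F(r e^{i theta}) with respect to d(theta)/(2 pi),
    # approximated by the trapezoid rule on n equally spaced points.
    total = 0j
    for j in range(n):
        theta = 2 * math.pi * j / n
        total += F(r * cmath.exp(1j * theta)).real * cmath.exp(-1j * k * theta)
    return total / n

r = 0.5
moments = {k: moment(r, k) for k in range(-3, 4)}
for k, m in moments.items():
    print(k, m)
```

For this $F$ the three cases of the moment formula ($k \ge 1$, $k = 0$, $k \le -1$) collapse to the single expression $r^{|k|}$, which is what the computed values should reproduce up to rounding.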

7.4. The Herglotz representation

In this section, we derive an integral representation for Herglotz functions. Note that the Herglotz condition $\operatorname{Im} f(z) > 0$ is an open condition, but for the following result it is more natural to allow the slightly more general case $\operatorname{Im} f(z) \ge 0$. Of course, if $f : \mathbb{C}_+ \to \mathbb{C}_+ \cup \mathbb{R}$ obeys $f(z_0) \in \mathbb{R}$ for some $z_0 \in \mathbb{C}_+$, then $f$ is constant by the maximum principle. Thus, every analytic $f : \mathbb{C}_+ \to \mathbb{C}_+ \cup \mathbb{R}$ is a Herglotz function or a real-valued constant.

Theorem 7.26 (Herglotz representation, first form). Every analytic function $f : \mathbb{C}_+ \to \{z \in \mathbb{C} \mid \operatorname{Im} z \ge 0\}$ has a unique representation of the form
\[
f(z) = az + b + \int_{\mathbb{R}} \frac{1 + xz}{x - z}\, d\nu(x), \tag{7.21}
\]
where $a \ge 0$, $b \in \mathbb{R}$, and $\nu$ is a finite positive measure on $\mathbb{R}$.

Proof. Recall the Cayley transform $\gamma$ defined by (7.5). The function $F(w) = -if(\gamma^{-1}(w))$ maps $\mathbb{D}$ to $\{z \in \mathbb{C} \mid \operatorname{Re} z \ge 0\}$, so there exist $\beta \in \mathbb{R}$ and $\rho \in M(\partial\mathbb{D})$ such that
\[
F(w) = i\beta + \int_{\partial\mathbb{D}} \frac{e^{i\theta} + w}{e^{i\theta} - w}\, d\rho(\theta). \tag{7.22}
\]
Solving for $f$ gives
\[
f(z) = -\beta + i \int_{\partial\mathbb{D}} \frac{e^{i\theta} + \gamma(z)}{e^{i\theta} - \gamma(z)}\, d\rho(\theta).
\]
The measure $\rho$ may have a point mass at $e^{i\theta} = 1$; separating that point mass from the rest of the integral gives
\[
f(z) = -\beta + i\, \frac{1 + \gamma(z)}{1 - \gamma(z)}\, \rho(\{1\}) + i \int_{\partial\mathbb{D} \setminus \{1\}} \frac{e^{i\theta} + \gamma(z)}{e^{i\theta} - \gamma(z)}\, d\rho(\theta).
\]
Since
\[
i\, \frac{1 + \gamma(z)}{1 - \gamma(z)} = z, \qquad i\, \frac{\gamma(x) + \gamma(z)}{\gamma(x) - \gamma(z)} = \frac{1 + xz}{x - z},
\]
the representation (7.21) is obtained by algebraic manipulations after denoting $b = -\beta$, $a = \rho(\{1\})$, and denoting by $\nu$ the measure on $\mathbb{R}$ obtained as the pushforward of the measure $\rho$ on $\partial\mathbb{D} \setminus \{1\}$ under the bijection $\gamma^{-1} : \partial\mathbb{D} \setminus \{1\} \to \mathbb{R}$. Explicitly, for any Borel set $B \subset \mathbb{R}$, $\nu(B) = \rho(\gamma(B))$.

Conversely, starting from the representation (7.21), the above steps can be reversed to represent the function $F(w)$ in the form (7.22) with $\beta = -b$ and $\rho = a\delta_1 + \nu \circ \gamma^{-1}$, where $\delta_1$ denotes the Dirac measure at 1. Since the representation (7.22) is unique, so is the representation (7.21).

While the Herglotz representation in the form (7.21) follows naturally from the Cayley transform, it is more commonly stated in the form given in the introduction to this chapter.

Theorem 7.27 (Herglotz representation, second form). Every analytic function $f : \mathbb{C}_+ \to \{z \in \mathbb{C} \mid \operatorname{Im} z \ge 0\}$ has a unique representation of the form (7.2), with $a \ge 0$, $b \in \mathbb{R}$, and $\mu$ a positive measure on $\mathbb{R}$ which obeys (7.3).


Proof. (7.2) is obtained from (7.21) by
\[
\frac{1}{x - z} - \frac{x}{1 + x^2} = \frac{1 + xz}{(x - z)(1 + x^2)} \tag{7.23}
\]
with $d\mu(x) = (1 + x^2)\, d\nu(x)$.
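The algebraic identities behind Theorems 7.26 and 7.27 are easy to spot-check numerically. The sketch below assumes that the Cayley transform (7.5), which is not restated in this section, has the form $\gamma(z) = (z - i)/(z + i)$; that form is an assumption, chosen because it is consistent with the identity $i(1+\gamma(z))/(1-\gamma(z)) = z$ used in the proof. The function names and sample points are illustrative.

```python
def gamma(z):
    # Assumed form of the Cayley transform (7.5): upper half-plane -> unit disk.
    return (z - 1j) / (z + 1j)

def lhs_723(x, z):
    # Left-hand side of (7.23).
    return 1 / (x - z) - x / (1 + x * x)

def rhs_723(x, z):
    # Right-hand side of (7.23).
    return (1 + x * z) / ((x - z) * (1 + x * x))

z = 0.3 + 0.7j
xs = [-5.0, -1.0, 0.0, 0.4, 2.5]

# Largest deviation in (7.23) over the sample points.
err_723 = max(abs(lhs_723(x, z) - rhs_723(x, z)) for x in xs)
# Deviation in i (1 + gamma(z)) / (1 - gamma(z)) = z.
err_point = abs(1j * (1 + gamma(z)) / (1 - gamma(z)) - z)
# Deviation in i (gamma(x) + gamma(z)) / (gamma(x) - gamma(z)) = (1 + x z) / (x - z).
err_kernel = max(
    abs(1j * (gamma(x) + gamma(z)) / (gamma(x) - gamma(z)) - (1 + x * z) / (x - z))
    for x in xs
)
print(err_723, err_point, err_kernel)
```

All three deviations should be at the level of floating-point rounding.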

While (7.2) is more commonly referred to as the Herglotz representation and is more natural from the perspective of generalizing (7.1) and of the Stieltjes inversion described below, the alternative form (7.21) is more convenient for other purposes; we will use them interchangeably.

We are also interested in the continuity properties of the Herglotz representation. Denote
\[
C_0(\mathbb{R}) = \{h \in C(\mathbb{R}) \mid \lim_{x \to \pm\infty} h(x) = 0\}.
\]

Proposition 7.28. Given analytic functions $f_n : \mathbb{C}_+ \to \{z \in \mathbb{C} \mid \operatorname{Im} z \ge 0\}$, $n \in \mathbb{N} \cup \{\infty\}$, with Herglotz representations
\[
f_n(z) = a_n z + b_n + \int_{\mathbb{R}} \frac{1 + xz}{x - z}\, d\nu_n(x), \tag{7.24}
\]
the following are equivalent:
(a) $f_n(z) \to f_\infty(z)$ for every $z \in \mathbb{C}_+$;
(b) the sequence $f_n$ converges to $f_\infty$ uniformly on compact subsets of $\mathbb{C}_+$;
(c) $b_n \to b_\infty$, $a_n + \nu_n(\mathbb{R}) \to a_\infty + \nu_\infty(\mathbb{R})$, and
\[
\int h\, d\nu_n \to \int h\, d\nu_\infty \qquad \forall h \in C_0(\mathbb{R}). \tag{7.25}
\]

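Proof. By applying Theorem 7.25 to the functions
\[
F_n(w) = -if_n(\gamma^{-1}(w)) = i\beta_n + \int_{\partial\mathbb{D}} \frac{e^{i\theta} + w}{e^{i\theta} - w}\, d\rho_n(\theta),
\]
it follows that (a) and (b) are mutually equivalent and equivalent to the condition that $\beta_n \to \beta_\infty$ (that is, $b_n \to b_\infty$) and
\[
\int g\, d\rho_n \to \int g\, d\rho_\infty \qquad \forall g \in C(\partial\mathbb{D}).
\]
Any $g \in C(\partial\mathbb{D})$ can be uniquely written as a linear combination of the constant function 1 and a function obeying $g(1) = 0$. Convergence for the constant function is equivalent to $a_n + \nu_n(\mathbb{R}) \to a_\infty + \nu_\infty(\mathbb{R})$ and convergence for functions obeying $g(1) = 0$ is equivalent to (7.25) with $h = g \circ \gamma$.

Corollary 7.29. In the setting of Proposition 7.28, if $f_n \to f_\infty$, then
\[
\limsup_{n \to \infty} a_n \le a_\infty. \tag{7.26}
\]

Proof. Fix $c > 0$ and a continuous function $h$ on $\mathbb{R}$ such that $\chi_{[-c,c]} \le h \le \chi_{[-2c,2c]}$. Then
\[
\nu_\infty([-c, c]) \le \int h\, d\nu_\infty = \lim_{n \to \infty} \int h\, d\nu_n \le \liminf_{n \to \infty} \nu_n(\mathbb{R}).
\]
Since this holds for any $c > 0$, it follows that
\[
\nu_\infty(\mathbb{R}) \le \liminf_{n \to \infty} \nu_n(\mathbb{R}).
\]
Subtracting this from $a_n + \nu_n(\mathbb{R}) \to a_\infty + \nu_\infty(\mathbb{R})$ implies (7.26).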

The inequality (7.26) can be strict (Exercise 7.9). Finally, let us note a specialization of Proposition 7.28. In spectral theory, we usually consider approximations of a Herglotz function with $a_\infty = 0$, which leads to a slight simplification.

Corollary 7.30. In the setting of Proposition 7.28, if $a_\infty = 0$, the following are equivalent:
(a) $f_n(z) \to f_\infty(z)$ for every $z \in \mathbb{C}_+$;
(b) the sequence $f_n$ converges to $f_\infty$ uniformly on compact subsets of $\mathbb{C}_+$;
(c) $b_n \to b_\infty$, $a_n \to 0$, $\nu_n(\mathbb{R}) \to \nu_\infty(\mathbb{R})$, and (7.25) holds.

Proof. (a) $\implies$ (b) and (c) $\implies$ (a) follow directly from Proposition 7.28. For (b) $\implies$ (c), note that since $a_\infty = 0$ and $a_n \ge 0$ for all $n$, (7.26) implies $a_n \to 0$. Then $a_n + \nu_n(\mathbb{R}) \to a_\infty + \nu_\infty(\mathbb{R})$ implies $\nu_n(\mathbb{R}) \to \nu_\infty(\mathbb{R})$, and the rest follows.
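The role of the combined quantity $a_n + \nu_n(\mathbb{R})$ in Proposition 7.28 can be seen in a simple family where mass escapes to infinity. The sketch below takes $a_n = 0$, $b_n = 0$ and $\nu_n = \delta_n$ (a unit point mass at $x = n$), so that (7.24) reads $f_n(z) = (1 + nz)/(n - z)$. This family is only an illustration chosen here and is not necessarily the example intended in Exercise 7.9. The limit is $f_\infty(z) = z$, with $a_\infty = 1$ and $\nu_\infty = 0$, so $\limsup a_n = 0 < 1 = a_\infty$.

```python
def f_n(z, n):
    # Herglotz function with a_n = 0, b_n = 0 and nu_n a unit point mass at x = n,
    # so that (7.24) reduces to (1 + n z) / (n - z).
    return (1 + n * z) / (n - z)

z = 0.5 + 2.0j
# Distance from f_n(z) to the limit f_infinity(z) = z for increasing n.
errors = [abs(f_n(z, n) - z) for n in (10, 100, 1000, 10000)]
print(errors)
```

Here $\int h\, d\nu_n = h(n) \to 0$ for every $h \in C_0(\mathbb{R})$ and $a_n + \nu_n(\mathbb{R}) = 1$ for all $n$, in agreement with condition (c) of Proposition 7.28.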

7.5. Growth at infinity and tail of the measure

In this section we express the coefficients $a$ and $b$ in the Herglotz representation in terms of the values of $f$; we will see that the value of $a$ is related to the asymptotic behavior of $f$ at infinity. We also give a necessary and sufficient condition for a Herglotz function to be of the special form (7.1). We begin by noting that
\[
f(i) = ai + b + \int \frac{1 + xi}{x - i}\, d\nu(x) = b + (a + \nu(\mathbb{R}))\, i
\]
so $b = \operatorname{Re} f(i)$, $a + \nu(\mathbb{R}) = \operatorname{Im} f(i)$. However, isolating the value of $a$ requires taking a limit:

Proposition 7.31. If the function $f$ is given by (7.2), then
\[
a = \lim_{y \to \infty} \frac{f(iy)}{iy}.
\]

Proof. Since $\frac{aiy + b}{iy} \to a$, it suffices to prove that
\[
\lim_{y \to \infty} \frac{1}{iy} \int \frac{1 + ixy}{x - iy}\, d\nu(x) = 0 \tag{7.27}
\]
for finite measures $\nu$. The integrand converges to 0 pointwise as $y \to \infty$, so (7.27) follows by dominated convergence with the bound
\[
\left| \frac{1}{iy}\, \frac{1 + ixy}{x - iy} \right| = \frac{1}{y} \sqrt{\frac{1 + x^2 y^2}{x^2 + y^2}} = \sqrt{\frac{1 + x^2 y^2}{y^4 + x^2 y^2}} \le 1,
\]
which is valid for all $x \in \mathbb{R}$ and $y \ge 1$.
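As a quick illustration of how $a$, $b$ and $\nu(\mathbb{R})$ are read off from $f$, the following sketch evaluates the first-form representation (7.21) for a discrete measure. The coefficients $a = 2$, $b = -1$ and the three atoms are arbitrary illustrative values, and the function name is hypothetical.

```python
def f(z, a=2.0, b=-1.0, atoms=((-3.0, 0.5), (0.0, 1.0), (4.0, 0.25))):
    # First-form Herglotz representation (7.21) with the discrete measure
    # nu = sum of weight * delta_x over the pairs (x, weight) in atoms.
    return a * z + b + sum(w * (1 + x * z) / (x - z) for x, w in atoms)

# f(i) = b + (a + nu(R)) i, with nu(R) = 0.5 + 1.0 + 0.25 = 1.75.
value_at_i = f(1j)
print(value_at_i)

# f(iy) / (iy) approaches a as y grows.
ratios = [f(1j * y) / (1j * y) for y in (1e1, 1e3, 1e5)]
print(ratios)
```

The value at $i$ only determines the sum $a + \nu(\mathbb{R})$; separating $a$ from $\nu(\mathbb{R})$ requires the limit along the imaginary axis.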

The limit can also be taken nontangentially (Exercise 7.11).

Proposition 7.32. Let $f$ be a Herglotz function. The function $f$ is of the form (7.1) for some finite positive measure $\mu$ on $\mathbb{R}$ if and only if there exists $C < \infty$ such that
\[
|f(z)| \le \frac{C}{\operatorname{Im} z} \qquad \forall z \in \mathbb{C}_+. \tag{7.28}
\]

Proof. If (7.1) holds and $\mu$ is finite, then $|t - z| \ge \operatorname{Im} z$ implies that
\[
|f(z)| \le \int \frac{1}{|t - z|}\, d\mu(t) \le \frac{\mu(\mathbb{R})}{\operatorname{Im} z},
\]
so (7.28) holds with $C = \mu(\mathbb{R})$.

Conversely, let $f$ be a Herglotz function. If (7.28) holds, by Proposition 7.31,
\[
a = \lim_{y \to \infty} \frac{f(iy)}{iy} = 0.
\]
Since
\[
\operatorname{Im} f(x + iy) = \int \frac{y}{(x - t)^2 + y^2}\, d\mu(t),
\]
monotone convergence implies
\[
\lim_{y \uparrow \infty} y \operatorname{Im} f(x + iy) = \lim_{y \uparrow \infty} \int \frac{y^2}{(x - t)^2 + y^2}\, d\mu(t) = \mu(\mathbb{R}),
\]
so (7.28) implies $\mu(\mathbb{R}) < \infty$. Thus, the two terms in the integrand in (7.2) are separately integrable and $f$ is of the form
\[
f(z) = \beta + \int_{\mathbb{R}} \frac{1}{t - z}\, d\mu(t)
\]
for some $\beta \in \mathbb{R}$. Now $\lim_{y \uparrow \infty} f(iy) = \beta$ by dominated convergence, but that limit is zero by (7.28).

These results relate the behavior of the measure at $\infty$ and the behavior of the Herglotz function at $\infty$, as does the following result:


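Proposition 7.33. Let $f$ be a Herglotz function with the Herglotz representation (7.2). For any $\gamma \in (0, 2)$,
\[
a = 0 \ \text{ and } \int_{\mathbb{R}} \frac{d\mu(x)}{1 + |x|^\gamma} < \infty \iff \int_1^\infty \frac{\operatorname{Im} f(iy)}{y^\gamma}\, dy < \infty. \tag{7.29}
\]

Lemma 7.34. For any $\kappa \in (-1, 1)$,
\[
\int_0^\infty \frac{t^\kappa}{1 + t^2}\, dt = \frac{\pi/2}{\cos(\kappa\pi/2)}. \tag{7.30}
\]

Proof. Consider the meromorphic function $f(z) = \frac{z^\kappa}{1 + z^2}$ on $\mathbb{C} \setminus [0, -i\infty)$ with $\arg(z^\kappa) = \kappa \arg z$ for $\arg z \in (-\pi/2, 3\pi/2)$. Let $0 < r < R < \infty$, and consider the region $\Omega = \{z \in \mathbb{C} \mid r < |z| < R,\ 0 < \arg z < \pi\}$. The function $f$ has a pole at $i$, so by residue calculus, the contour integral of $f$ over $\partial\Omega$ is
\[
\oint_{\partial\Omega} f(z)\, dz = 2\pi i \operatorname{Res}_i(f) = \pi e^{i\kappa\pi/2}.
\]
Parametrizing the contour gives
\[
(1 + e^{i\kappa\pi}) \int_r^R \frac{t^\kappa}{1 + t^2}\, dt + i \int_0^\pi \frac{R^{1+\kappa} e^{i(1+\kappa)\theta}}{1 + (Re^{i\theta})^2}\, d\theta - i \int_0^\pi \frac{r^{1+\kappa} e^{i(1+\kappa)\theta}}{1 + (re^{i\theta})^2}\, d\theta = \pi e^{i\kappa\pi/2}.
\]
Since $\kappa \in (-1, 1)$, letting $r \downarrow 0$ and $R \uparrow \infty$ gives
\[
(1 + e^{i\kappa\pi}) \int_0^\infty \frac{t^\kappa}{1 + t^2}\, dt = \pi e^{i\kappa\pi/2},
\]
which implies (7.30).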
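Formula (7.30) can be confirmed numerically. The sketch below is one possible check, not part of the original argument: it substitutes $t = e^s$, which turns the integrand into $e^{\kappa s}/(2\cosh s)$, a smooth function with exponential decay, and then applies a truncated trapezoid rule. The truncation width and step size are ad hoc choices.

```python
import math

def lhs_730(kappa, half_width=200.0, step=0.05):
    # Integral of t^kappa / (1 + t^2) over (0, infinity) after substituting t = e^s:
    # the integrand becomes e^{kappa s} / (2 cosh s), integrated over [-half_width, half_width].
    n = int(round(2 * half_width / step))
    total = 0.0
    for j in range(n + 1):
        s = -half_width + j * step
        weight = 0.5 if j in (0, n) else 1.0
        total += weight * math.exp(kappa * s) / (2 * math.cosh(s))
    return total * step

def rhs_730(kappa):
    # Right-hand side of (7.30).
    return (math.pi / 2) / math.cos(kappa * math.pi / 2)

for kappa in (-0.8, -0.3, 0.0, 0.5, 0.8):
    print(kappa, lhs_730(kappa), rhs_730(kappa))
```

With $\kappa = 1 - \gamma$ the right-hand side becomes $\frac{\pi/2}{\sin(\gamma\pi/2)}$, which is the constant that appears when (7.30) is applied in the proof of Proposition 7.33.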

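Proof of Proposition 7.33. If $a > 0$, then the right-hand side is also false by Proposition 7.31, so we assume $a = 0$ from now on. If we decompose
\[
f(z) = b_1 + \int_{(-1,1)} \frac{1}{t - z}\, d\mu(t) + \int_{\mathbb{R} \setminus (-1,1)} \left( \frac{1}{t - z} - \frac{t}{1 + t^2} \right) d\mu(t),
\]
where $b_1 = b - \int_{(-1,1)} \frac{t}{1 + t^2}\, d\mu(t)$, by Proposition 7.32 the contribution from the first integral is $O(1/\operatorname{Im} z)$, so it does not affect the equivalence (7.29). Thus, it suffices to prove the equivalence in the case
\[
f(z) = b + \int \left( \frac{1}{t - z} - \frac{t}{1 + t^2} \right) d\mu(t),
\]
where $\operatorname{supp} \mu \subset (-\infty, -1] \cup [1, \infty)$. In that case, $f(iy)$ is bounded for $y \in (0, 1)$ because
\[
\left| \frac{1 + iyt}{t - iy} \right| = \sqrt{\frac{1 + y^2 t^2}{t^2 + y^2}} \le 1 \qquad \forall y \in (0, 1),\ \forall t \in \mathbb{R} \setminus (-1, 1),
\]
so (7.29) is equivalent to the equivalence
\[
\int_0^\infty \frac{\operatorname{Im} f(iy)}{y^\gamma}\, dy < \infty \iff \int_{\mathbb{R}} \frac{1}{|x|^\gamma}\, d\mu(x) < \infty.
\]
This follows from (7.30) by Tonelli's theorem,
\[
\int_0^\infty \frac{\operatorname{Im} f(iy)}{y^\gamma}\, dy = \int_0^\infty \int_{\mathbb{R}} \frac{y^{1-\gamma}}{x^2 + y^2}\, d\mu(x)\, dy = \frac{\pi/2}{\sin(\gamma\pi/2)} \int_{\mathbb{R}} \frac{1}{|x|^\gamma}\, d\mu(x).
\]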

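7.6. Half-plane Poisson kernel and Stieltjes inversion

In this section, we consider ways of recovering the measure in the Herglotz representation from the function $f$. Instead of relying on the Cayley transform and functions on $\mathbb{D}$, it is useful to take a more direct approach.

Lemma 7.35. Fix $a \ge 0$, $b \in \mathbb{R}$, and a measure $\mu$ which obeys (7.3). The right-hand side of (7.2) defines an analytic function on $\mathbb{C} \setminus \operatorname{supp} \mu$ which obeys $f(\bar z) = \overline{f(z)}$.

Proof. The key step is to provide some uniform estimates on the integrand. For any $1 \le R < \infty$, if $\operatorname{dist}(z, \operatorname{supp} \mu) \ge 1/R$ and $|z| \le R$, let us prove
\[
\left| \frac{1}{x - z} - \frac{x}{1 + x^2} \right| \le \frac{4R^3}{1 + x^2} \qquad \forall x \in \operatorname{supp} \mu. \tag{7.31}
\]
Using (7.23), for $|x| > 2R$ this follows from
\[
\left| \frac{1 + xz}{x - z} \right| \le \frac{1 + R|x|}{|x| - |z|} \le \frac{2R|x|}{|x|/2} \le 4R,
\]
and for $x \in \operatorname{supp} \mu \cap [-2R, 2R]$ from
\[
\left| \frac{1 + xz}{x - z} \right| \le \frac{1 + |x||z|}{|x - z|} \le \frac{1 + 2R \cdot R}{R^{-1}} \le 4R^3.
\]
By (7.31) and (7.3), the integral in (7.2) is convergent for each $z \in \mathbb{C} \setminus \operatorname{supp} \mu$, so it defines a function $f$ on $\mathbb{C} \setminus \operatorname{supp} \mu$. By Morera's theorem, it suffices to prove that $f$ has zero integral over any closed null-homotopic contour $\gamma$ in $\mathbb{C} \setminus \operatorname{supp} \mu$. For any such contour $\gamma$, its image $\operatorname{Ran} \gamma$ is a compact subset of $\mathbb{C} \setminus \operatorname{supp} \mu$, so there exists $R$ such that $\operatorname{dist}(z, \operatorname{supp} \mu) \ge 1/R$ and $|z| \le R$ for all $z \in \operatorname{Ran} \gamma$. Thus, (7.31) implies that Fubini's theorem can be applied as follows:
\[
\oint_\gamma f(z)\, dz = \int_{\mathbb{R}} \oint_\gamma \left( \frac{1}{x - z} - \frac{x}{1 + x^2} \right) dz\, d\mu(x) = \int_{\mathbb{R}} 0\, d\mu(x) = 0
\]
because the integrand is holomorphic in $\mathbb{C} \setminus \operatorname{supp} \mu$ (the entire term $az + b$ contributes zero to the contour integral).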


Remark 7.36. Lemma 7.35 will be repeatedly used as part of a method to prove that all Herglotz functions have some property, usually describing the behavior of $f$ near some part of the real line. This property will be obviously additive, i.e., if it holds for two Herglotz functions, it holds for their sum. We will choose a large enough interval $[p, q] \subset \mathbb{R}$ and decompose
\[
f(z) = \int_{[p,q]} \frac{1}{x - z}\, d\mu(x) + g(z),
\]
where $g$ consists of all remaining terms in (7.2):
\[
g(z) = az + b - \int_{[p,q]} \frac{x}{1 + x^2}\, d\mu(x) + \int_{\mathbb{R} \setminus [p,q]} \left( \frac{1}{x - z} - \frac{x}{1 + x^2} \right) d\mu(x).
\]
Since $g$ corresponds to the measure $\chi_{\mathbb{R} \setminus [p,q]}\, d\mu$, by Lemma 7.35 it has an analytic extension to
\[
\mathbb{C} \setminus \operatorname{supp}(\chi_{\mathbb{R} \setminus [p,q]}\, d\mu) \supset \mathbb{C}_+ \cup (p, q) \cup \mathbb{C}_-
\]
which obeys $g(\bar z) = \overline{g(z)}$ and, in particular, has real values on $(p, q)$. Often the desired property is trivial for such functions $g$, and in such cases it remains to consider Herglotz functions of the form $\int \frac{1}{x - z}\, d\tilde\mu(x)$ with a finite, compactly supported measure $d\tilde\mu = \chi_{[p,q]}\, d\mu$.

Similarly to Proposition 7.31, a point mass in $\mu$ can be computed as a normal or nontangential limit. Note that the following nontangential limit includes as a special case the normal limit $z = x_0 + i\epsilon$, $\epsilon \downarrow 0$:

Lemma 7.37. For any Herglotz function $f$, any $x_0 \in \mathbb{R}$, and $\delta > 0$,
\[
\mu(\{x_0\}) = \lim_{\substack{z \to x_0 \\ \delta \le \arg(z - x_0) \le \pi - \delta}} (x_0 - z) f(z). \tag{7.32}
\]

Proof. The property (7.32) is additive, in the sense that if it holds for two Herglotz functions, it holds for their sum. If $x_0 \notin \operatorname{supp} \mu$, then $f$ has an analytic extension at $x_0$ and
\[
\lim_{z \to x_0} (x_0 - z) f(z) = 0 \cdot f(x_0) = 0,
\]
so $f$ has the property (7.32). By Remark 7.36, it therefore suffices to consider the case when $\mu$ is compactly supported and $f$ is of the form (7.1). Thus, it remains to prove
\[
\lim_{\substack{z \to x_0 \\ \delta \le \arg(z - x_0) \le \pi - \delta}} \int \frac{x_0 - z}{x - z}\, d\mu(x) = \mu(\{x_0\}). \tag{7.33}
\]
… 0;


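(c) For any $\delta > 0$,
\[
\lim_{\epsilon \downarrow 0} \int_{\mathbb{R} \setminus (-\delta, \delta)} P_\epsilon(s)\, ds = 0. \tag{7.35}
\]

Proof. (a) This is immediate from the definition.
(b) It is elementary to compute
\[
\int_p^q P_\epsilon(s)\, ds = \frac{1}{\pi} \left( \arctan \frac{q}{\epsilon} - \arctan \frac{p}{\epsilon} \right)
\]
for finite $p$, $q$. By the monotone convergence theorem, this formula holds also for $p = -\infty$ and $q = +\infty$ with the notation $\arctan(\pm\infty) = \pm\pi/2$. Using $\lim_{\epsilon \downarrow 0} \arctan \frac{y}{\epsilon} = \frac{\pi}{2} \operatorname{sgn} y$, we compute for any $-\infty \le p < q \le +\infty$, …

… $(D\mu)(x)$, and fix $\delta > 0$ such that for all $t \in (0, \delta]$,
\[
C_1 \times 2t \le \mu((x - t, x + t)) \le C_2 \times 2t. \tag{7.40}
\]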

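We will prove that
\[
C_1 \le \liminf_{\epsilon \downarrow 0} \int_{\mathbb{R}} P_\epsilon(x - t)\, d\mu(t) \le \limsup_{\epsilon \downarrow 0} \int_{\mathbb{R}} P_\epsilon(x - t)\, d\mu(t) \le C_2. \tag{7.41}
\]
Since $\mu$ is finite, for any $\delta > 0$,
\[
\int_{\mathbb{R} \setminus (x - \delta, x + \delta)} P_\epsilon(x - t)\, d\mu(t) \le P_\epsilon(\delta)\, \mu(\mathbb{R}) \to 0, \qquad \epsilon \downarrow 0,
\]
so it suffices to prove (7.41) with integrals over $(x - \delta, x + \delta)$. Since $P_\epsilon$ is even and decreasing on $[0, \infty)$, the remaining integral can be written as a positive linear combination of the values of $\mu((x - s, x + s))$ with $s \le \delta$: using Tonelli's theorem, we can rewrite
\[
\int_{(x - \delta, x + \delta)} P_\epsilon(x - t)\, d\mu(t) = P_\epsilon(\delta)\, \mu((x - \delta, x + \delta)) + \int_0^\delta \mu((x - t, x + t)) (-P_\epsilon'(t))\, dt
\]
and analogously, with Lebesgue measure instead of $\mu$,
\[
\int_{(x - \delta, x + \delta)} P_\epsilon(x - t)\, dt = P_\epsilon(\delta) \times 2\delta + \int_0^\delta 2t\, (-P_\epsilon'(t))\, dt.
\]
Comparing the right-hand sides by using the inequalities (7.40), and rewriting in terms of left-hand sides, implies
\[
C_1 \le \frac{\int_{(x - \delta, x + \delta)} P_\epsilon(x - t)\, d\mu(t)}{\int_{(x - \delta, x + \delta)} P_\epsilon(x - t)\, dt} \le C_2,
\]
and taking the limit $\epsilon \downarrow 0$ using Lemma 7.39 completes the proof.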

Theorem 7.46. If $f$ is a Herglotz function, then the following hold:
(a) the limit
\[
w(x) = \frac{1}{\pi} \lim_{\epsilon \downarrow 0} \operatorname{Im} f(x + i\epsilon)
\]
exists Lebesgue-a.e. and $\mu$-a.e. with a value in $[0, \infty]$;
(b) $w(x) < \infty$ for Lebesgue-a.e. $x$;
(c) $w(x) > 0$ for $\mu$-a.e. $x$;
(d) the Radon–Nikodym decomposition of $\mu$ with respect to Lebesgue measure is given by $d\mu = w\, dx + d\mu_s$, where $d\mu_s = \chi_S\, d\mu$, $S = w^{-1}(\{\infty\})$.

Proof. This follows from the Radon–Nikodym decomposition of $\mu$ with respect to Lebesgue measure, differentiation of measures, and the fact that wherever the derivative $\lim_{\epsilon \downarrow 0} \frac{\mu((x - \epsilon, x + \epsilon))}{2\epsilon}$ exists, it is equal to the normal boundary value $w(x)$ by Lemma 7.45.

Since the singular part of the measure is supported on the set $S = w^{-1}(\{\infty\})$, this provides a kind of upper bound on the possible singular part. We emphasize that $S \ne \emptyset$ does not guarantee that $\mu_s \ne 0$ (Exercise 7.12). Moreover, the normal boundary limit does not necessarily exist for all $x \in \mathbb{R}$ (Exercise 7.13).
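The decomposition in Theorem 7.46 can be illustrated on a measure whose Poisson integral is explicit. The sketch below uses $d\mu = \chi_{[0,1]}\, dx + \tfrac12 \delta_2$, an illustrative choice not taken from the text, for which $\operatorname{Im} f(x + i\epsilon) = \arctan\frac{1-x}{\epsilon} - \arctan\frac{0-x}{\epsilon} + \frac{\epsilon/2}{(x-2)^2 + \epsilon^2}$. The function name and sample points are hypothetical.

```python
import math

def w_eps(x, eps, mass=0.5, atom=2.0):
    # (1/pi) Im f(x + i eps) for the illustrative measure
    # d(mu) = chi_[0,1] dx + mass * delta_atom.
    ac_part = math.atan((1 - x) / eps) - math.atan((0 - x) / eps)
    pp_part = mass * eps / ((x - atom) ** 2 + eps ** 2)
    return (ac_part + pp_part) / math.pi

eps = 1e-6
inside = w_eps(0.5, eps)    # approximates the density 1 on (0, 1)
outside = w_eps(3.0, eps)   # approximates 0 away from the support
at_atom = w_eps(2.0, eps)   # grows like mass / (pi * eps) at the point mass
print(inside, outside, at_atom)
```

In this example the boundary value $w$ equals the density of the absolutely continuous part on $(0, 1)$, vanishes off the support, and is infinite exactly at the atom, so $S = \{2\}$ carries the singular part.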

Theorem 7.46 has various generalizations; from normal limits, it can be generalized to nontangential limits. Moreover, while the normal limit of $\operatorname{Im} f$ is of special interest, the normal limit of $\operatorname{Re} f$ also exists.

Proposition 7.47. Let $f$ be a Herglotz function. Then the limit
\[
\lim_{\epsilon \downarrow 0} f(x + i\epsilon)
\]
exists and is finite for Lebesgue-a.e. $x \in \mathbb{R}$.

Proof. Since $\sqrt{f}$ and $i\sqrt{f}$ are Herglotz functions, their imaginary parts have finite normal boundary values Lebesgue-a.e. It follows that $\sqrt{f}$ has finite boundary values Lebesgue-a.e. Thus, so does $f$.

7.7. Pointwise boundary values


Corollary 7.48. Let $f$ be a Herglotz function. Then $\lim_{\epsilon \downarrow 0} f(x + i\epsilon) \ne 0$ for Lebesgue-a.e. $x \in \mathbb{R}$.

Proof. Applying Proposition 7.47 to the Herglotz function $-1/f$, we conclude that $\lim_{\epsilon \downarrow 0} (-1/f(x + i\epsilon)) \ne \infty$ for Lebesgue-a.e. $x$.

This allows us to re-express the Lebesgue decomposition of $\mu$ in terms of boundary values of $f$ on the closure of $\mathbb{C}_+$ (viewed as part of the Riemann sphere $\hat{\mathbb{C}}$); this is useful in the context of subordinacy theory.

Corollary 7.49. If $f$ is a Herglotz function, then
(a) as a limit in the Riemann sphere,
\[
f(x + i0) := \lim_{\epsilon \downarrow 0} f(x + i\epsilon)
\]
exists Lebesgue-a.e. and $\mu$-a.e. with a value in $\overline{\mathbb{C}_+} = \mathbb{C}_+ \cup \mathbb{R} \cup \{\infty\}$;
(b) the set $S' = \{x \in \mathbb{R} \mid f(x + i0) = \infty\}$ has zero Lebesgue measure;
(c) the Radon–Nikodym decomposition of $\mu$ with respect to Lebesgue measure is given by
\[
d\mu = \frac{1}{\pi} \operatorname{Im} f(x + i0)\, dx + d\mu_s,
\]
where $d\mu_s = \chi_{S'}\, d\mu$.

Proof. By Proposition 7.47, $m(S') = 0$. Moreover, $\mu_s(S^c) = 0$. Combining these statements gives $\mu(S' \setminus S) = 0$. Since $S \subset S'$, this implies $\chi_{S'}\, d\mu = \chi_S\, d\mu = d\mu_s$. For Lebesgue-a.e. $x \in \mathbb{R}$, $f(x + i0)$ is finite and $w(x) = \frac{1}{\pi} \operatorname{Im} f(x + i0)$, which implies the new characterization of the absolutely continuous part of $\mu$.

Instead of comparison with Lebesgue measure, this can be generalized to a comparison of two measures and used in conjunction with the Radon–Nikodym theorem (Section 6.2):

Theorem 7.50. Let $f$, $g$ be Herglotz functions corresponding to measures $\mu$, $\nu$ on $\mathbb{R}$. Let $w$ be the Radon–Nikodym derivative
\[
w(x) = \lim_{r \downarrow 0} \frac{\mu((x - r, x + r))}{\nu((x - r, x + r))},
\]
which exists $(\mu + \nu)$-a.e. with $w(x) \in [0, \infty]$. Then for $(\mu + \nu)$-a.e. $x \in \mathbb{R}$,
\[
\lim_{\epsilon \downarrow 0} \frac{\operatorname{Im} f(x + i\epsilon)}{\operatorname{Im} g(x + i\epsilon)} = w(x).
\]


Proof. For $(\mu + \nu)$-a.e. $x$, the limit
\[
\lim_{\epsilon \downarrow 0} \operatorname{Im}(f(x + i\epsilon) + g(x + i\epsilon))
\]
exists and is strictly positive. Thus, by Remark 7.36, it suffices to prove the claim for the case when $\mu$, $\nu$ are compactly supported and $f$, $g$ are of the form
\[
f(z) = \int \frac{1}{x - z}\, d\mu(x), \qquad g(z) = \int \frac{1}{x - z}\, d\nu(x).
\]
Moreover, by symmetry, it suffices to prove that for every $x \in \mathbb{R}$,
\[
\limsup_{\epsilon \downarrow 0} \frac{\operatorname{Im} f(x + i\epsilon)}{\operatorname{Im} g(x + i\epsilon)} \le \limsup_{r \downarrow 0} \frac{\mu((x - r, x + r))}{\nu((x - r, x + r))}. \tag{7.42}
\]
Fix a constant
\[
C > \limsup_{r \downarrow 0} \frac{\mu((x - r, x + r))}{\nu((x - r, x + r))},
\]
and fix $\delta > 0$ such that for all $t \in (0, \delta]$,
\[
\mu((x - t, x + t)) \le C \nu((x - t, x + t)). \tag{7.43}
\]
Then as in the proof of Lemma 7.45, expressing Poisson integrals of $\mu$, $\nu$ in terms of values of the measures on intervals, it follows that
\[
\int_{(x - \delta, x + \delta)} P_\epsilon(x - t)\, d\mu(t) \le C \int_{(x - \delta, x + \delta)} P_\epsilon(x - t)\, d\nu(t),
\]
which proves (7.42).

In the second part of this section, we discuss the problem of using pointwise boundary behavior of $f$ to study the $\alpha$-continuous and $\alpha$-singular parts of the measure, following del Rio–Jitomirskaya–Last–Simon [26]. The Rogers–Taylor decomposition uses the upper $\alpha$-derivative
\[
D_\mu^\alpha(x) = \limsup_{r \downarrow 0} \frac{\mu((x - r, x + r))}{(2r)^\alpha}. \tag{7.44}
\]
More precisely, it uses the set of $x$ where $D_\mu^\alpha(x) = \infty$. We characterize this set in terms of the quantities
\[
Q_\mu^\alpha(x) = \limsup_{\epsilon \downarrow 0} \epsilon^{1-\alpha} \operatorname{Im} f(x + i\epsilon), \qquad R_\mu^\alpha(x) = \limsup_{\epsilon \downarrow 0} \epsilon^{1-\alpha} |f(x + i\epsilon)|.
\]

Theorem 7.51. For any Herglotz function $f$ and any $\alpha \in [0, 1)$,
\[
\{x \in \mathbb{R} \mid D_\mu^\alpha(x) = \infty\} = \{x \in \mathbb{R} \mid Q_\mu^\alpha(x) = \infty\} = \{x \in \mathbb{R} \mid R_\mu^\alpha(x) = \infty\}.
\]


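Proof. We will prove this by proving three implications
\[
D_\mu^\alpha(x) = \infty \implies Q_\mu^\alpha(x) = \infty \implies R_\mu^\alpha(x) = \infty \implies D_\mu^\alpha(x) = \infty.
\]
Starting with the inequality
\[
\frac{1}{2\epsilon} \chi_{(x - \epsilon, x + \epsilon)}(t) \le \frac{\epsilon}{(t - x)^2 + \epsilon^2}
\]
and integrating with respect to $d\mu(t)$ gives
\[
\frac{\mu((x - \epsilon, x + \epsilon))}{2\epsilon} \le \operatorname{Im} f(x + i\epsilon).
\]
Multiplying by $\epsilon^{1-\alpha}$ and taking $\epsilon \to 0$ gives $D_\mu^\alpha(x) \le 2 Q_\mu^\alpha(x)$, which proves the first implication.

The trivial observation $\operatorname{Im} f(x + i\epsilon) \le |f(x + i\epsilon)|$ implies $Q_\mu^\alpha(x) \le R_\mu^\alpha(x)$, which proves the second implication.

The third implication is trivial for $\alpha = 0$: by Lemma 7.38, $R_\mu^0(x) < \infty$ for all $x$. Let us assume $\alpha \in (0, 1)$ and let $x$ obey $D_\mu^\alpha(x) < \infty$. Then there exists $C < \infty$ such that
\[
\mu((x - \delta, x + \delta)) \le C\delta^\alpha \qquad \forall \delta \in (0, 1].
\]
By the standard trick (Remark 7.36) we can assume that $\mu$ is supported on $(x - 1, x + 1)$. Then, by Tonelli's theorem,
\[
|f(x + i\epsilon)| \le \int_{(x - 1, x + 1)} \frac{d\mu(t)}{\sqrt{(t - x)^2 + \epsilon^2}} = \int_0^\infty \mu((x - \tau(y), x + \tau(y)))\, dy,
\]
where $\tau(y) = \min\{1, \sqrt{y^{-2} - \epsilon^2}\}$. The important thing is that this integral depends only on the value of the measure on intervals $(x - \delta, x + \delta)$, so we estimate it by comparison with the measure
\[
d\nu(t) = \frac{C\alpha}{2} \chi_{(x - 1, x + 1)}(t)\, |t - x|^{\alpha - 1}\, dt.
\]
This measure has the property $\nu((x - \delta, x + \delta)) = C\delta^\alpha$ for $\delta \in (0, 1]$ so
\[
|f(x + i\epsilon)| \le \int_0^\infty \nu((x - \tau(y), x + \tau(y)))\, dy = \int_{(x - 1, x + 1)} \frac{d\nu(t)}{\sqrt{(t - x)^2 + \epsilon^2}}.
\]
Multiplying by $\epsilon^{1-\alpha}$ and using symmetry and $t = x + \epsilon v$ gives
\[
\epsilon^{1-\alpha} |f(x + i\epsilon)| \le C\alpha\, \epsilon^{1-\alpha} \int_x^{x+1} \frac{|t - x|^{\alpha - 1}}{\sqrt{(t - x)^2 + \epsilon^2}}\, dt = C\alpha \int_0^{1/\epsilon} \frac{v^{\alpha - 1}}{\sqrt{v^2 + 1}}\, dv.
\]
Since $\int_0^\infty \frac{v^{\alpha - 1}}{\sqrt{v^2 + 1}}\, dv < \infty$, it follows that $R_\mu^\alpha(x) < \infty$, which proves the third implication.

The exclusion of $\alpha = 1$ in the previous theorem was necessary; for a Herglotz function, it is possible to have convergence of $\operatorname{Im} f(i\epsilon)$ as $\epsilon \to 0$ and divergence of $|f(i\epsilon)|$ (Exercise 7.14).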


7.8. Meromorphic Herglotz functions

In spectral theory, we often encounter extensions of Herglotz functions to domains larger than $\mathbb{C}_+$, with the reflection symmetry
\[
f(\bar z) = \overline{f(z)}. \tag{7.45}
\]
These extensions can be analytic or even meromorphic, with possible poles on the real line. The domain of such an extension is related to the support of the measure and to the essential support of $\mu$, denoted $\operatorname{ess\,supp} \mu$, defined as the set of nonisolated points of $\operatorname{supp} \mu$.

Proposition 7.52. Let $f : \mathbb{C}_+ \to \mathbb{C}_+$ be the Herglotz function with the Herglotz representation (7.2) for $z \in \mathbb{C}_+$. Then $f$ extends to
(a) an analytic function $f : \mathbb{C} \setminus \operatorname{supp} \mu \to \mathbb{C}$ with the property (7.45);
(b) a meromorphic function $f : \mathbb{C} \setminus \operatorname{ess\,supp} \mu \to \hat{\mathbb{C}}$ with the property (7.45).

Proof. (a) This was already proved as Lemma 7.35.
(b) If $\operatorname{supp} \mu$ has an isolated point $\lambda$, the measure can be decomposed as $d\mu = \mu(\{\lambda\})\delta_\lambda + d\nu$ where $\lambda \notin \operatorname{supp} \nu$. By the Herglotz representation,
\[
f(z) = \mu(\{\lambda\}) \left( \frac{1}{\lambda - z} - \frac{\lambda}{\lambda^2 + 1} \right) + az + b + \int \left( \frac{1}{x - z} - \frac{x}{x^2 + 1} \right) d\nu(x).
\]
The first term has a simple pole at $\lambda$ with residue $-\mu(\{\lambda\})$ and all other terms are analytic in a neighborhood of $\lambda$.

Note that our proof also shows:

Corollary 7.53. Any isolated singularity $\lambda$ of $f : \mathbb{C} \setminus \operatorname{ess\,supp} \mu \to \hat{\mathbb{C}}$ is a simple pole and the residue of $f$ at $\lambda$ is strictly negative.

It is also important to know a kind of converse to Proposition 7.52; namely, that $f$ cannot be extended analytically or meromorphically to any domain not contained in $\mathbb{C} \setminus \operatorname{supp} \mu$ or $\mathbb{C} \setminus \operatorname{ess\,supp} \mu$, respectively:

Lemma 7.54. Let $f$ have the Herglotz representation (7.2) for $z \in \mathbb{C}_+$.
(a) If $f$ extends to an analytic function $f : \mathbb{C}_+ \cup (p, q) \cup \mathbb{C}_- \to \mathbb{C}$ with the property (7.45), then $\operatorname{supp} \mu \cap (p, q) = \emptyset$.
(b) If $f$ extends to a meromorphic function $f : \mathbb{C}_+ \cup (p, q) \cup \mathbb{C}_- \to \hat{\mathbb{C}}$ with the property (7.45), then $\operatorname{ess\,supp} \mu \cap (p, q) = \emptyset$.

Proof. (a) By (7.45), $f$ is real valued on $(p, q)$. By Proposition 7.43, $\mu((p, q)) = 0$.


(b) By (a), the set $\operatorname{supp} \mu \cap (p, q)$ can contain only poles of $f$. Poles of a meromorphic function are isolated, so $\operatorname{supp} \mu \cap (p, q)$ has no accumulation points in $(p, q)$; thus, $\operatorname{ess\,supp} \mu \cap (p, q) = \emptyset$.

In this text, we will usually consider extensions which obey (7.45). This will avoid some complications normally associated with analytic extensions. For instance, the Herglotz function $f(z) = -\sqrt{-z}$, defined on $\mathbb{C}_+$ so that $\arg(-\sqrt{-z}) \in (0, \pi/2)$, has analytic extensions to $\mathbb{C} \setminus [0, \infty)$ and to $\mathbb{C} \setminus (-\infty, 0]$, but only the first of those obeys (7.45).

Definition 7.55. We call the function $f$ given by (7.2) on $\mathbb{C} \setminus \operatorname{supp} \mu$ an analytic Herglotz function, and we call $\mathbb{C} \setminus \operatorname{supp} \mu$ its domain of analyticity. We call the function $f$ given by (7.2) on $\mathbb{C} \setminus \operatorname{ess\,supp} \mu$ a meromorphic Herglotz function, and we call $\mathbb{C} \setminus \operatorname{ess\,supp} \mu$ its domain of meromorphicity.

Proposition 7.56. Let $f : \Omega \to \hat{\mathbb{C}}$ be a meromorphic Herglotz function. At any point $x \in \mathbb{R} \cap \Omega$ which is not a pole of $f$, $f'(x) > 0$.

Proof. If $f(z)$ is a meromorphic Herglotz function on $\Omega$, then so is $f(z) - f(x)$ and so is $g(z) = -1/(f(z) - f(x))$. The function $g$ has an isolated singularity at $x$. By Corollary 7.53, that singularity is a simple pole with strictly negative residue, so $f(z) - f(x)$ has a simple zero at $x$ with strictly positive derivative.

Proposition 7.56 can also be proved by deriving a formula for $f'$ (Exercise 7.15). Together, Proposition 7.56 and Corollary 7.53 describe the behavior of $f$ on intervals $(p, q) \subset \mathbb{R} \cap \Omega$. The function $f$ is strictly increasing, except at poles, where it has vertical asymptotes. Qualitatively, this resembles the graph of the tangent function.

Given two discrete sets $A, B \subset \mathbb{R}$, we say that they strictly interlace if the following conditions hold:
(a) $A \cap B = \emptyset$.
(b) For any $x, y \in A$ with $x < y$, there exists $t \in B \cap (x, y)$.
(c) For any $x, y \in B$ with $x < y$, there exists $t \in A \cap (x, y)$.

Proposition 7.57. Let $f : \Omega \to \hat{\mathbb{C}}$ be a meromorphic Herglotz function. On any interval $I \subset \mathbb{R} \cap \Omega$, the sets $\{x \in I \mid f(x) = u\}$ and $\{x \in I \mid f(x) = v\}$ strictly interlace in $I$ for any $u, v \in \mathbb{R} \cup \{\infty\}$ with $u \ne v$.

Proof. If $v \ne \infty$, by replacing $f$ by $-1/(f - v)$, we reduce to the case of $v = \infty$. Likewise, by then subtracting $u$, we reduce to the case $u = 0$, so it suffices to prove that zeros and poles of $f$ strictly interlace in $I$.

The set of poles of a nonconstant meromorphic function has no accumulation points in the domain. If $p < q$ are two consecutive poles of $f$ on $I$, then $f : (p, q) \to \mathbb{R}$ is strictly increasing. Since poles are simple and have negative residue,
\[
\lim_{x \downarrow p} f(x) = -\infty, \qquad \lim_{x \uparrow q} f(x) = +\infty,
\]
so $f : (p, q) \to \mathbb{R}$ is a bijection and $f$ has a zero in $(p, q)$. Thus, $f$ has a zero between any two poles on $I$. Applying the same argument to the meromorphic Herglotz function $-1/f$ shows that between any two zeros of $f$ there is a pole of $f$, which completes the proof.

In the previous proofs, we used the convenient observation that the maps $z \mapsto -1/z$ and $z \mapsto z - u$ preserve $\mathbb{C}_+$, so they preserve meromorphic Herglotz functions. The natural generality for that observation follows.

Corollary 7.58. If $f : \Omega \to \hat{\mathbb{C}}$ is a meromorphic Herglotz function and $A \in \mathrm{SL}(2, \mathbb{R})$, then the function $g$ defined by
\[
\begin{pmatrix} g(z) \\ 1 \end{pmatrix} \simeq A \begin{pmatrix} f(z) \\ 1 \end{pmatrix}
\]
is also a meromorphic Herglotz function with the same domain $\Omega$. If $A$ is upper triangular, the functions $f$ and $g$ have the same poles; otherwise, on any interval in $\Omega \cap \mathbb{R}$, poles of $f$ and $g$ strictly interlace.

Proof. Since action by $A$ preserves $\mathbb{C}_+$, it maps $f$ to another meromorphic Herglotz function $g$. The result for poles follows from
\[
f(z) = \infty \iff \begin{pmatrix} g(z) \\ 1 \end{pmatrix} \simeq A \begin{pmatrix} 1 \\ 0 \end{pmatrix} \iff g(z) = \frac{a_{11}}{a_{21}}.
\]

In spectral theory, action by a rotation matrix
\[
A = \begin{pmatrix} \cos\phi & -\sin\phi \\ \sin\phi & \cos\phi \end{pmatrix} \in \mathrm{SL}(2, \mathbb{R})
\]
will correspond to a change of boundary condition for a half-line Schrödinger operator and poles correspond to its discrete spectrum.
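The interlacing statements of Proposition 7.57 and Corollary 7.58 can be observed on a rational example. The sketch below takes $f(x) = \sum_j m_j/(p_j - x)$ with positive weights, which is an illustrative choice; the poles, weights, angle $\phi$ and helper names are all hypothetical. On each gap between consecutive poles $f$ increases from $-\infty$ to $+\infty$, so bisection locates the unique solution of $f(x) = u$. For the rotation matrix with angle $\phi$, $g = (\cos\phi\, f - \sin\phi)/(\sin\phi\, f + \cos\phi)$, so the poles of $g$ are the solutions of $f(x) = -\cot\phi$.

```python
import math

POLES = [-2.0, 0.0, 1.5, 4.0]
WEIGHTS = [1.0, 0.5, 2.0, 1.0]

def f(x):
    # Meromorphic Herglotz function sum_j m_j / (p_j - x) with positive weights m_j;
    # its residue at each pole p_j is -m_j < 0. Accepts real or complex x.
    return sum(m / (p - x) for p, m in zip(POLES, WEIGHTS))

def level_point(u, left, right, iterations=200):
    # Bisection for the unique solution of f(x) = u between two consecutive poles,
    # using that f is strictly increasing there.
    lo, hi = left, right
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(mid) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

gaps = list(zip(POLES[:-1], POLES[1:]))
zeros = [level_point(0.0, p, q) for p, q in gaps]

phi = 0.7
u = -math.cos(phi) / math.sin(phi)           # poles of g solve f(x) = -cot(phi)
g_poles = [level_point(u, p, q) for p, q in gaps]

z = 0.3 + 0.2j
g_z = (math.cos(phi) * f(z) - math.sin(phi)) / (math.sin(phi) * f(z) + math.cos(phi))
print(zeros)
print(g_poles)
print(g_z)
```

In each gap there is exactly one zero of $f$ and one pole of $g$, and since $-\cot\phi < 0$ for this $\phi$, the pole of $g$ lies to the left of the zero of $f$; the value $g(z)$ at the sample point in $\mathbb{C}_+$ has positive imaginary part.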

7.9. Exponential Herglotz representation

Let us fix the branch of $\log$ such that $0 < \operatorname{Im} \log z < \pi$ for $z \in \mathbb{C}_+$. Then, if $f$ is a Herglotz function, so is $\log f$. Applying the Herglotz representation to $\log f$ provides a very useful multiplicative representation for $f$.

Theorem 7.59 (Exponential Herglotz representation). Let $f$ be a Herglotz function. Then the limit
\[
\xi(x) = \frac{1}{\pi} \lim_{\epsilon \downarrow 0} \operatorname{Im} \log f(x + i\epsilon) \in [0, 1] \tag{7.46}
\]
exists for Lebesgue-a.e. $x \in \mathbb{R}$, and there exists a constant $k \in \mathbb{R}$ such that
\[
\log f(z) = k + \int_{\mathbb{R}} \left( \frac{1}{x - z} - \frac{x}{1 + x^2} \right) \xi(x)\, dx.
\]

Proof. Since $\log f(z)$ is a Herglotz function, it has a Herglotz representation
\[
\log f(z) = az + b + \int_{\mathbb{R}} \left( \frac{1}{x - z} - \frac{x}{1 + x^2} \right) d\mu(x).
\]
Since $0 < \operatorname{Im} \log f < \pi$,
\[
a = \lim_{y \to \infty} \frac{\operatorname{Im} \log f(iy)}{y} = 0.
\]
For the same reason, for any $c < d$ and $\epsilon > 0$,
\[
\frac{1}{\pi} \int_c^d \operatorname{Im} \log f(x + i\epsilon)\, dx \le d - c,
\]
so by Stieltjes inversion,
\[
\frac{1}{2} \left( \mu((c, d)) + \mu([c, d]) \right) \le d - c.
\]
Taking $d \downarrow c$ implies that $\mu$ has no pure points, so $\mu((c, d)) \le d - c$ for all $c < d$. Denoting Lebesgue measure by $|\cdot|$, this implies that $\mu(A) \le |A|$ for all open intervals, then for all open sets, and finally for all Borel sets (by outer regularity). By the Radon–Nikodym theorem, $d\mu = \xi(x)\, dx$ for some Borel function $\xi$ with $0 \le \xi \le 1$. Finally, $\xi$ is reconstructed from normal boundary values of $\operatorname{Im} \log f$ by Theorem 7.46.

Lemma 7.60. If $k_n \to k$ and $\xi_n \to \xi$ pointwise Lebesgue-a.e., then $f_n \to f$ uniformly on compacts.

Proof. By Proposition 7.28, it suffices to prove that for all $h \in C_0(\mathbb{R})$,
\[
\int h(x) \xi_n(x)\, \frac{dx}{1 + x^2} \to \int h(x) \xi(x)\, \frac{dx}{1 + x^2}.
\]
This follows from dominated convergence with the dominating function $|h(x)|/(1 + x^2)$.


An important special case of Theorem 7.59 is when $\xi$ is piecewise constant. In that case, the piecewise integrals can be computed, by using the elementary calculation
\[
\int_c^d \left( \frac{1}{x - z} - \frac{x}{1 + x^2} \right) dx = \ln \frac{(d - z)/\sqrt{d^2 + 1}}{(c - z)/\sqrt{c^2 + 1}}. \tag{7.47}
\]
Then the integral turns into a sum, and exponentiating turns that into a product formula for $f(z)$. We give some examples and leave others as exercises:

Example 7.61. Let $f$ be a Herglotz function with a meromorphic extension to $\mathbb{C}$ with the symmetry (7.45). Assume that it has zeros $(\lambda_n)_{n=1}^\infty$ and poles $(p_n)_{n=1}^\infty$ such that $p_n < \lambda_n < p_{n+1}$ for all $n \in \mathbb{N}$. Then $f$ is of the form
\[
f(z) = C \prod_{n=1}^\infty \frac{(\lambda_n - z)/\sqrt{\lambda_n^2 + 1}}{(p_n - z)/\sqrt{p_n^2 + 1}} \tag{7.48}
\]
for some $C > 0$.

Proof. The sets of zeros and poles are discrete, so $p_n \to \infty$, $\lambda_n \to \infty$ as $n \to \infty$. The function $f$ is strictly increasing between poles, so $\xi(x) = 1$ if $x \in (p_n, \lambda_n)$ for some $n$ and $\xi(x) = 0$ otherwise. The exponential Herglotz representation can be integrated piecewise to give
\[
\log f(z) = k + \sum_{n=1}^\infty \ln \frac{(\lambda_n - z)/\sqrt{\lambda_n^2 + 1}}{(p_n - z)/\sqrt{p_n^2 + 1}}
\]
and exponentiating gives (7.48) with $C = e^k > 0$.
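The elementary calculation (7.47) can be checked against numerical quadrature. The sketch below uses a composite Simpson rule and the principal branch of the complex logarithm; for $z \in \mathbb{C}_+$ and $c < d$ the ratio $(d - z)/(c - z)$ has argument in $(0, \pi)$, so the principal branch agrees with the branch fixed in this section. The endpoints, the point $z$ and the helper names are illustrative choices.

```python
import cmath
import math

def integrand(x, z):
    # Integrand on the left-hand side of (7.47).
    return 1 / (x - z) - x / (1 + x * x)

def simpson(func, a, b, n=2000):
    # Composite Simpson rule with n (even) subintervals; works for complex values.
    h = (b - a) / n
    total = func(a) + func(b)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * func(a + j * h)
    return total * h / 3

def closed_form(c, d, z):
    # Right-hand side of (7.47), using the principal branch of the logarithm.
    num = (d - z) / math.sqrt(d * d + 1)
    den = (c - z) / math.sqrt(c * c + 1)
    return cmath.log(num / den)

c, d, z = -1.0, 2.0, 0.3 + 0.5j
numeric = simpson(lambda x: integrand(x, z), c, d)
exact = closed_form(c, d, z)
print(numeric, exact)
```

The imaginary part of either value is the angle subtended by $[c, d]$ at $z$, which lies in $(0, \pi)$, consistent with $\xi = 1$ on $(c, d)$.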

The square roots in (7.48) are $z$-independent but their presence ensures a convergent product; compare Exercise 7.19.

Example 7.62. Let $f$ be a Herglotz function with a meromorphic extension to $\mathbb{C}$ with the symmetry (7.45). Assume that it has zeros $(\lambda_n)_{n=1}^\infty$ and poles $(p_n)_{n=1}^\infty$ such that $\lambda_n < p_n < \lambda_{n+1}$ for all $n \in \mathbb{N}$. Then $f$ is of the form
\[
f(z) = -C\, \frac{\lambda_1 - z}{\sqrt{\lambda_1^2 + 1}} \prod_{n=1}^\infty \frac{(\lambda_{n+1} - z)/\sqrt{\lambda_{n+1}^2 + 1}}{(p_n - z)/\sqrt{p_n^2 + 1}}
\]
for some $C > 0$.

Proof. This follows by applying Example 7.61 to the meromorphic Herglotz function $-1/f$ and a telescoping argument to rearrange the infinite product, or directly by computing the exponential Herglotz representation for $f$.


7.10. The Phragmén–Lindelöf method and asymptotic expansions

The Phragmén–Lindelöf method is a technique for bounding the values of an analytic function in an (often unbounded) domain in $\mathbb{C}$ in terms of bounds on its boundary values and growth rates. We present one special case which will have important consequences for Herglotz function asymptotics.

Theorem 7.63 (Phragmén–Lindelöf). Let $\Omega = \{z \in \mathbb{C} \mid \alpha < \arg z < \beta\}$. If $h : \Omega \to \mathbb{C}$ is analytic on $\Omega$, has a continuous extension to $\bar\Omega \subset \mathbb{C}$, and there exist $C_1, C_2 > 0$ and $\eta < \pi/(\beta - \alpha)$ such that
\[
|h(z)| \le C_1 e^{C_2 |z|^\eta}, \tag{7.49}
\]
then $h$ is bounded on $\Omega$ and
\[
\sup_{z \in \Omega} |h(z)| = \sup_{z \in \partial\Omega} |h(z)|.
\]

Proof. By composing with a power $z \mapsto e^{i\phi} z^\kappa$ for suitable $\phi$ and $\kappa > 0$, we can reduce to the case $\eta < 1 < \pi/(\beta - \alpha)$ and $-\alpha = \beta \in (0, \pi/2)$. On that domain, $\operatorname{Re} z \ge |z| \cos\beta > 0$ so for any $\epsilon > 0$, the function $h_\epsilon(z) = h(z) e^{-\epsilon z}$ is analytic on $\Omega$, with a continuous extension to $\bar\Omega$ with
\[
\sup_{z \in \partial\Omega} |h_\epsilon(z)| \le \sup_{z \in \partial\Omega} |h(z)|.
\]
Moreover, by
\[
|e^{-\epsilon z}| = e^{-\epsilon \operatorname{Re} z} \le e^{-\epsilon \cos\beta\, |z|},
\]
(7.49) implies that
\[
\lim_{\substack{z \to \infty \\ z \in \bar\Omega}} h_\epsilon(z) = 0.
\]
Thus, by the maximum principle applied to the closure of $\Omega$ in $\hat{\mathbb{C}}$,
\[
\sup_{z \in \Omega} |h_\epsilon(z)| = \sup_{z \in \partial\Omega} |h_\epsilon(z)| \le \sup_{z \in \partial\Omega} |h(z)|.
\]
As $\epsilon \to 0$, $h_\epsilon(z) \to h(z)$, and the claim follows.

When a Herglotz function has an explicit nontangential asymptotic expansion and a meromorphic continuation through the negative half-line, the Phragmén–Lindelöf method can often be used to extend that expansion through the negative half-line. We formulate a criterion:

Corollary 7.64. Let $f$ be an analytic Herglotz function on $\mathbb{C} \setminus [c, \infty)$, and let $g$ be an analytic function on $\mathbb{C} \setminus [c, \infty)$ with the symmetry $g(\bar z) = \overline{g(z)}$. Assume that $n, \gamma > 0$ are such that for all $\delta > 0$,
\[
g(z) = O(|z|^n), \qquad z \to \infty,\ \arg z \in [\delta, 2\pi - \delta]
\]
and
\[
f(z) = g(z) + O(|z|^{-\gamma}), \qquad z \to \infty,\ \arg z \in [\delta, \pi - \delta].
\]
Then for all $\delta > 0$,
\[
f(z) = g(z) + O(|z|^{-\gamma}), \qquad z \to \infty,\ \arg z \in [\delta, 2\pi - \delta].
\]

Proof. By shifting $z$ by a real constant, we can assume that $c = 1$. Such shifts do not affect the nontangential limits in the hypotheses and conclusions. We assume $\delta \in (0, \pi)$ from now on. By symmetry, since the asymptotic behavior holds for $\arg z \in [\delta, \pi - \delta]$, it holds also for $\arg z \in [\pi + \delta, 2\pi - \delta]$, so it suffices to extend it into the sector $\arg z \in [\pi - \delta, \pi + \delta]$.

The Herglotz representation for $f$ has the form
\[
f(z) = az + b + \int \frac{1 + xz}{x - z}\, \frac{1}{1 + x^2}\, d\mu(x), \qquad \operatorname{supp} \mu \subset [1, \infty).
\]
When $\operatorname{Re} z < 0$ and $x \ge 1$, $x \le |x - z|$, so
\[
|f(z)| \le a|z| + |b| + \int \frac{1 + |z|}{1 + x^2}\, d\mu(x).
\]
In particular, $f(z) = O(|z|)$ as $z \to \infty$, $\arg z \in [\pi - \delta, \pi + \delta]$. Thus, the function $h(z) = z^\gamma (f(z) - g(z))$ obeys all the hypotheses of Theorem 7.63 with $\Omega = \{z \in \mathbb{C} \mid \pi - \delta < \arg z < \pi + \delta\}$: In particular, it has a continuous extension to $\bar\Omega$ with $h(0) = 0$, since $f$, $g$ are bounded at 0. Thus, $h$ is bounded on $\bar\Omega$, which implies $f(z) = g(z) + O(|z|^{-\gamma})$ for $\pi - \delta \le \arg z \le \pi + \delta$.

7.11. Matrix-valued Herglotz functions

In this section, we consider two generalizations. We begin by considering a generalization of the Herglotz representation to complex measures and then study matrix-valued Herglotz functions. Matrix-valued Herglotz functions naturally appear in spectral theory; Exercise 7.21 illustrates this.

Complex measures are linear combinations with complex coefficients of (positive) finite measures. To avoid issues associated with infinite measures, here we will work with the alternative Herglotz representation

$$f(z) = az + b + \int_{\mathbb{R}} \frac{1 + xz}{x - z}\, d\nu(x). \tag{7.50}$$

In this generalization, to be able to recover the measure from the function, the function should be considered on the domain $\mathbb{C} \setminus \mathbb{R}$ instead of $\mathbb{C}_+$.

Lemma 7.65. Let $a, b \in \mathbb{C}$, and let $\nu$ be a complex measure on $\mathbb{R}$. If $f$ is defined on $\mathbb{C} \setminus \mathbb{R}$ by (7.50), then

$$a = \lim_{y \to \infty} \frac{f(iy)}{iy}, \qquad b = \frac{f(i) + f(-i)}{2}, \tag{7.51}$$


for any x0 ∈ R and δ > 0, (1 + x20 )ν({x0 }) =

lim

z→x0 δ 0, a=

lim

z→∞ δ 0}; then $[d\mu_u] = [\chi_S\, d\mu]$. In particular, $\langle u, \chi_{\mathbb{R} \setminus S}(A) u \rangle = \mu_u(\mathbb{R} \setminus S) = 0$, and therefore $u \in \operatorname{Ker} \chi_{\mathbb{R} \setminus S}(A) = \operatorname{Ran} \chi_S(A)$. Pick $w$ such that $\mu_w$ is a maximal spectral measure for $A$ and let

$$v = u + \chi_{\mathbb{R} \setminus S}(A) w. \tag{9.26}$$

Applying $\chi_S(A)$ to (9.26) gives $\chi_S(A) v = u$, so $u \in C_A(v)$. The two summands in (9.26) lie in mutually orthogonal invariant subspaces $\operatorname{Ran} \chi_S(A)$ and $\operatorname{Ran} \chi_{\mathbb{R} \setminus S}(A)$, so for any Borel set $B$,

$$\mu_v(B) = \langle v, \chi_B(A) v \rangle = \langle u, \chi_B(A) u \rangle + \langle w, \chi_{B \setminus S}(A) w \rangle = \mu_u(B) + \mu_w(B \setminus S).$$

This is equal to zero if and only if $\mu(B \cap S) = \mu(B \setminus S) = 0$, so if and only if $\mu(B) = 0$. Thus, $[\mu_v] = [\mu]$.

Proposition 9.34. Let $A \in \mathcal{L}(\mathcal{H})$ be self-adjoint. There exists a spectral basis $(\psi_j)_{j=1}^N$ for $A$ such that $\mu_{\psi_j} \gg \mu_{\psi_{j+1}}$ for all $j < N$.

Proof. This proof relies on the notation and method from the proof of Lemma 5.45. As in that construction, we will start with an orthonormal basis $(\varphi_j)_{j=1}^N$ and define inductively a sequence of vectors $\psi_n$, together with a decreasing sequence of subspaces $V_n$ given by $V_0 = \mathcal{H}$ and

$$V_n = \bigcap_{j=1}^{n} C_A(\psi_j)^{\perp}.$$

However, now the $\psi_n$ are defined differently: For any $n$, we first define $u_n$ as the orthogonal projection of $\varphi_n$ onto $V_{n-1}$. Then, using Lemma 9.33, we define $\psi_n$ to be a vector such that $u_n \in C_A(\psi_n)$ and that $\mu_{\psi_n}$ is a maximal spectral measure for $A|_{V_{n-1}}$.


As in the proof of Lemma 5.45, the cyclic subspaces $C_A(\psi_j)$ are mutually orthogonal by construction, and since $u_n \in C_A(\psi_n)$, it follows that

$$\mathcal{H} = \bigoplus_{j=1}^{N} C_A(\psi_j).$$

Moreover, since $V_n$ are a decreasing sequence of subspaces and $\mu_{\psi_n}$ are maximal spectral measures for $A|_{V_{n-1}}$, it follows that $\mu_{\psi_{n-1}} \gg \mu_{\psi_n}$. Finally, we note that any zero elements can be removed from the sequence $(\psi_n)_{n=1}^N$, and the remaining elements can be normalized.

Lemma 9.35. If $[\mu] = [\nu]$ and $g \in B_b(\mathbb{R})$, then $T_{g,d\mu} \cong T_{g,d\nu}$.

Proof. By Radon–Nikodym, $\nu \ll \mu$ implies that $d\nu = h\, d\mu$ with $h \ge 0$ $\mu$-a.e. This representation implies $\nu(\{x \mid h(x) = 0\}) = 0$, so since $\nu \gg \mu$, we conclude $\mu(\{x \mid h(x) = 0\}) = 0$. Thus, $d\nu = h\, d\mu$ with $h > 0$ $\mu$-a.e. The map $U : L^2(d\mu) \to L^2(d\nu)$ given by $U f = h^{-1/2} f$ is unitary and obeys $U^{-1} T_{g(x),d\nu(x)} U = T_{g(x),d\mu(x)}$.

Lemma 9.36. If a measure $d\nu$ is decomposed as a finite or countable sum

$$d\nu = \sum_{j=1}^{N} d\nu_j \tag{9.27}$$

of mutually singular measures on $\mathbb{R}$, then $T_{g,d\nu} \cong \bigoplus_{j=1}^{N} T_{g,d\nu_j}$.

Proof. The map $U : L^2(d\nu) \to \bigoplus_{j=1}^{N} L^2(d\nu_j)$ given by $U f = (f)_{j=1}^{N}$ is norm-preserving because of (9.27) and onto because the $\nu_j$ are mutually singular, so it is unitary. Moreover, $U T_{g,d\nu} U^{-1} = \bigoplus_{j=1}^{N} T_{g,d\nu_j}$.

Proof of Theorem 9.31(a). Using the spectral basis from Proposition 9.34 and adding trailing zero measures if necessary to make the sequence infinite,

$$A \cong \bigoplus_{j=1}^{\infty} T_{x,d\nu_j(x)}$$

for a sequence of measures obeying $\nu_{j+1} \ll \nu_j$ for all $j$. The condition $\nu_{j+1} \ll \nu_j$ implies existence of Borel sets $M_j$ such that $[d\nu_{j+1}] = [\chi_{M_j}\, d\nu_j]$. Introducing the mutually disjoint sets

$$S_m = \bigcap_{i=1}^{m-1} M_i \setminus M_m, \qquad S_\infty = \bigcap_{j=1}^{\infty} M_j$$


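allows us to represent the decreasing sequence of sets $M_1$, $M_1 \cap M_2$, $M_1 \cap M_2 \cap M_3$, ... as

$$\bigcap_{i=1}^{j-1} M_i = \bigcup_{\substack{m \in \mathbb{N} \cup \{\infty\} \\ m \ge j}} S_m$$

(we use the convention $\bigcap_{i=1}^{0} M_i = \mathbb{R}$). Therefore, the measures $\mu_m = \chi_{S_m}\, d\mu_{\psi_1}$ are mutually singular by construction and

$$[\nu_j] = \Bigl[ \sum_{\substack{m \in \mathbb{N} \cup \{\infty\} \\ m \ge j}} \mu_m \Bigr].$$

Therefore, by Lemma 9.36,

$$A \cong \bigoplus_{j \in \mathbb{N}} \ \bigoplus_{\substack{m \in \mathbb{N} \cup \{\infty\} \\ m \ge j}} T_{x,d\mu_m(x)},$$

which, after changing the order of summation, is the same as (9.25).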
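The bookkeeping with the sets $S_m$ in this proof can be illustrated on a toy model. The sketch below is only an illustration under stated assumptions: the Borel sets $M_j$ are replaced by finitely many subsets of a finite set, the final running intersection plays the role of $S_\infty$, and the function name `multiplicity_sets` is hypothetical (it is not notation from the text).

```python
def multiplicity_sets(universe, M):
    """Toy model of the sets S_m built from M_1, M_2, ... (here a finite list M).

    Returns (S, S_inf), where S[m-1] consists of the points lying in
    M_1, ..., M_{m-1} but not in M_m, and S_inf is the intersection of all M_j.
    """
    running = set(universe)      # empty intersection = whole space (the convention in the proof)
    S = []
    for M_m in M:
        S.append(running - M_m)  # in all earlier sets, but not in M_m
        running &= M_m           # now the intersection of M_1, ..., M_m
    return S, running


# Hypothetical example data: three subsets of {0, ..., 7}.
M = [{0, 1, 2, 3, 4, 5}, {0, 1, 2, 6}, {0, 3, 7}]
S, S_inf = multiplicity_sets(range(8), M)
print(S, S_inf)
```

In this toy model the pieces `S[0], S[1], ...` together with `S_inf` are disjoint and cover the whole set, and each intersection $M_1 \cap \dots \cap M_{j-1}$ is the union of the pieces with index at least $j$, which is the identity used above.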

Proving uniqueness also requires some preliminary lemmas.

Lemma 9.37. If $A \cong B$, then for any Borel set $S$,

$$A|_{\operatorname{Ran} \chi_S(A)} \cong B|_{\operatorname{Ran} \chi_S(B)}.$$

Proof. By the uniqueness of functional calculus, $U \chi_S(A) U^{-1} = \chi_S(B)$. Since $v \in \operatorname{Ran} \chi_S(B)$ if and only if $\chi_S(B) v = v$, this implies that $U$ is a bijection from $\operatorname{Ran} \chi_S(A)$ to $\operatorname{Ran} \chi_S(B)$. Thus, the restriction of $U$ to $\operatorname{Ran} \chi_S(A)$ conjugates $A|_{\operatorname{Ran} \chi_S(A)}$ to $B|_{\operatorname{Ran} \chi_S(B)}$.

This lemma will be applied to multiplication operators; note that if $A = T_{x,d\mu(x)}$, then $\operatorname{Ran} \chi_S(A)$ is naturally identified with $L^2(\chi_S\, d\mu)$ so

$$A|_{\operatorname{Ran} \chi_S(A)} \cong T_{x,\chi_S(x)\, d\mu(x)}.$$

Lemma 9.38. Assume that $\mu, \nu$ are not the zero measure and that

$$\bigoplus_{i=1}^{m} T_{x,d\mu(x)} \cong \bigoplus_{i=1}^{n} T_{x,d\nu(x)}.$$

Then $m = n$ and $[\mu] = [\nu]$.

Proof. These operators have maximal spectral measures $\mu, \nu$ respectively, so their unitary equivalence implies $[\mu] = [\nu]$. Using $T_{x,d\mu(x)} \cong T_{x,d\nu(x)}$,

$$\bigoplus_{i=1}^{m} T_{x,d\mu(x)} \cong \bigoplus_{i=1}^{n} T_{x,d\nu(x)} \cong \bigoplus_{i=1}^{n} T_{x,d\mu(x)}.$$

By symmetry, it remains to show that $m > n$ would lead to a contradiction.


Assume that $m > n$ and that $U : \bigoplus_{i=1}^{m} L^2(d\mu) \to \bigoplus_{i=1}^{n} L^2(d\mu)$ is a unitary map such that

$$U \Bigl( \bigoplus_{i=1}^{m} T_{x,d\mu(x)} \Bigr) U^{-1} = \bigoplus_{i=1}^{n} T_{x,d\mu(x)}.$$

By uniqueness of the Borel functional calculus, for any Borel set $S$,

$$U \Bigl( \bigoplus_{i=1}^{m} T_{\chi_S,d\mu} \Bigr) U^{-1} = \bigoplus_{i=1}^{n} T_{\chi_S,d\mu}.$$

Let $v_1, \dots, v_{n+1}$ be an orthonormal set in $\mathbb{C}^m$. View $v_1, \dots, v_{n+1}$ as constant functions of $x$, so as elements of $\bigoplus_{i=1}^{m} L^2(d\mu)$, and define $f_j = U v_j \in \bigoplus_{i=1}^{n} L^2(d\mu)$. Using unitarity of $U$ and computing inner products in both Hilbert spaces, for any Borel set $S$ and $i, j = 1, \dots, n+1$,

$$\int_S f_i(x)^* f_j(x)\, d\mu(x) = \int_S v_i^* v_j\, d\mu = \delta_{ij}\, \mu(S).$$

It follows that $f_i(x)^* f_j(x) = \delta_{ij}$ for $\mu$-a.e. $x$. Thus, for $\mu$-a.e. $x$, the vectors $f_1(x), \dots, f_{n+1}(x)$ are an orthonormal set in $\mathbb{C}^n$, which is a contradiction.

Proof of Theorem 9.31(b). Since $\mu_m$ are mutually singular measures, there exists a partition of $\mathbb{R}$ into Borel sets $S_m$ such that each $\mu_m$ is supported on $S_m$, and analogously a partition into sets $S'_m$ such that each $\nu_m$ is supported on $S'_m$. By Lemma 9.37 applied to $S_m \cap S'_n$, we obtain

$$\bigoplus_{j=1}^{m} T_{x,\chi_{S'_n}(x)\, d\mu_m(x)} \cong \bigoplus_{j=1}^{n} T_{x,\chi_{S_m}(x)\, d\nu_n(x)}. \tag{9.28}$$

For $m \ne n$, this implies $\chi_{S'_n}\, d\mu_m$ and $\chi_{S_m}\, d\nu_n$ are zero measures. Thus, each $\mu_m$ is supported on $S'_m$, and each $\nu_m$ is supported on $S_m$. Thus, by (9.28) applied to $m = n$,

$$\bigoplus_{j=1}^{m} T_{x,d\mu_m(x)} \cong \bigoplus_{j=1}^{m} T_{x,d\nu_m(x)},$$

which implies $[\mu_m] = [\nu_m]$ for all $m$.

In practice, if an operator is already represented in terms of multiplication operators, we would not retrace the above proofs (constructing a spectral basis, etc.) in order to determine the decomposition into spectral multiplicities. Instead, we may directly manipulate the operator into the form (9.25) and appeal to uniqueness. We illustrate this with two examples. Example 9.39. Let A be a self-adjoint operator on H. If A has a cyclic vector ψ, then it has only multiplicity 1 spectrum: μ1 = μψ and μm = 0 for all m ≥ 2. Conversely, if μm = 0 for all m ≥ 2, then A has a cyclic vector.


Proof. Since $\psi$ is a cyclic vector, $A \cong T_{x,d\mu_\psi(x)}$. This is already in the form (9.25) with $\mu_1 = \mu_\psi$ and $\mu_m = 0$ for all $m \ge 2$, so by uniqueness of this representation, these are the multiplicity $m$ measures. Conversely, if $\mu_m = 0$ for all $m \ge 2$, then (9.25) simplifies to $A \cong T_{x,d\mu_1(x)}$. More precisely, there exists a unitary $U : L^2(\mathbb{R}, d\mu_1) \to \mathcal{H}$ such that $U^{-1} A U = T_{x,d\mu_1(x)}$. Since the constant function $1 \in L^2(\mathbb{R}, d\mu_1)$ is cyclic for $T_{x,d\mu_1(x)}$, $U 1 \in \mathcal{H}$ is cyclic for $A$.

Example 9.40. Denote by $A$ the operator of multiplication by $2 \cos k$ on $L^2([0, 2\pi], \frac{dk}{2\pi})$. Its multiplicity 2 measure is $d\mu_2(x) = \chi_{(-2,2)}(x)\, dx$, and $\mu_n = 0$ for all $n \ne 2$.

Proof. By Lemma 9.36 applied to the decomposition

$$\chi_{[0,2\pi]}(k)\, \frac{dk}{2\pi} = \chi_{[0,\pi]}(k)\, \frac{dk}{2\pi} + \chi_{[\pi,2\pi]}(k)\, \frac{dk}{2\pi},$$

$A$ is unitarily equivalent to the direct sum of multiplications by $2 \cos k$ on $L^2([(n-1)\pi, n\pi], \frac{dk}{2\pi})$, $n = 1, 2$.

On each interval $[(n-1)\pi, n\pi]$, the map $g(k) = 2 \cos k$ is strictly monotone with image $[-2, 2]$, so by a change of variables,

$$\int_{(n-1)\pi}^{n\pi} |f(k)|^2\, \frac{dk}{2\pi} = \frac{1}{2\pi} \int_{-2}^{2} |f(g^{-1}(\lambda))|^2\, |(g^{-1})'(\lambda)|\, d\lambda.$$

Therefore, with the choice of measure

$$d\nu(\lambda) = \frac{\chi_{(-2,2)}(\lambda)}{2\pi}\, |(g^{-1})'(\lambda)|\, d\lambda = \frac{\chi_{(-2,2)}(\lambda)}{2\pi \sqrt{4 - \lambda^2}}\, d\lambda,$$

the maps $U_n : L^2([(n-1)\pi, n\pi], \frac{dk}{2\pi}) \to L^2([-2, 2], d\nu(\lambda))$ given by $U_n f = f \circ g^{-1}$ are unitary.

Thus, the unitary map $U_1 \oplus U_2$ conjugates $A$ to the operator $T_{\lambda,d\nu(\lambda)} \oplus T_{\lambda,d\nu(\lambda)}$. Since $d\nu$ is mutually absolutely continuous with $\chi_{(-2,2)}(x)\, dx$, Lemma 9.35 completes the proof.

Exercise 9.13 demonstrates how this notion of multiplicity generalizes the notion of multiplicity of eigenvalues. Exercise 9.14 characterizes cyclic vectors in the multiplicity 1 case, and Exercise 9.15 generalizes Example 9.39 to the case when there is a finite spectral basis. Exercise 9.16 introduces a decomposition of the Hilbert space into multiplicity $m$ subspaces for $A$, and Exercise 9.17 gives an interpretation of that decomposition in terms of the minimal number of cyclic subspaces needed to cover a subspace of the form $\chi_S(A)$.

We conclude this section by showing how to read off spectral multiplicity for multiplication operators on vector-valued $L^2$ spaces introduced in Lemma 6.38:


Proposition 9.41. Let $W\, d\mu$ be as in Lemma 6.38. The operator $A$ of multiplication by $x$ on $L^2(\mathbb{R}, \mathbb{C}^d, W(x)\, d\mu(x))$ has the following properties:

(a) $A$ has maximal spectral measure $\mu$.

(b) Denoting $S_m = \{x \mid \operatorname{rank} W(x) = m\}$, the multiplicity $m$ measure for $A$ is $d\mu_m = \chi_{S_m}\, d\mu$; in particular, $\mu_m = 0$ for $m > d$.

Proof. Since $W \ge 0$, we can diagonalize $W(x) = U(x)^{-1} D(x) U(x)$ with $U(x)$ unitary and $D(x)$ diagonal matrices,

$$D(x) = \operatorname{diag}(\lambda_1(x), \dots, \lambda_d(x)), \qquad \lambda_1 \ge \dots \ge \lambda_d \ge 0.$$

Moreover, $U(x)$ and $D(x)$ can be chosen as Borel functions of $x$, since $W(x)$ is Borel. Thus, the map $(U f)(x) = U(x) f(x)$ is a unitary map $U : L^2(\mathbb{R}, \mathbb{C}^d, W\, d\mu) \to L^2(\mathbb{R}, \mathbb{C}^d, D\, d\mu)$. Since $D$ is diagonal, viewing $\mathbb{C}^d$-valued functions as vectors of scalar functions gives

$$L^2(\mathbb{R}, \mathbb{C}^d, D\, d\mu) = \bigoplus_{k=1}^{d} L^2(\mathbb{R}, \lambda_k\, d\mu).$$

Therefore

$$A \cong \bigoplus_{k=1}^{d} T_{x,\lambda_k(x)\, d\mu(x)}.$$

This representation is in the form of that in Proposition 9.34, so the claims follow as in the proof of Theorem 9.31, since $\operatorname{rank} W$ is the number of nonzero eigenvalues of $W$.

It is also common to combine the decomposition by multiplicity with the decomposition by spectral type, by decomposing the multiplicity $m$ measures $\mu_m$ instead of the maximal spectral measure. For instance, if we say that the singular spectrum of some operator $A$ has multiplicity 1, we mean that $(\mu_m)_s = 0$ for $m \ge 2$.

9.7. Stone’s theorem

Stone’s theorem expresses spectral projections in terms of resolvents. The proof will be based on functional calculus and calculations related to the Stieltjes inversion formula.


Theorem 9.42 (Stone). Let $A$ be a self-adjoint operator, and let $c < d$ be real numbers. Then

$$\frac{1}{2} \bigl( \chi_{(c,d)}(A) + \chi_{[c,d]}(A) \bigr) = \operatorname*{s-lim}_{\epsilon \downarrow 0} \frac{1}{2\pi i} \int_c^d \bigl( (A - t - i\epsilon)^{-1} - (A - t + i\epsilon)^{-1} \bigr)\, dt.$$

Proof. By the Borel functional calculus,

$$\frac{1}{2\pi i} \int_c^d \bigl( (A - t - i\epsilon)^{-1} - (A - t + i\epsilon)^{-1} \bigr)\, dt = g_\epsilon(A),$$

where $g_\epsilon$ are defined by (7.36). Since, by Theorem 7.40, $0 \le g_\epsilon \le 1$ and $g_\epsilon$ converges pointwise to $\frac{1}{2}(\chi_{(c,d)} + \chi_{[c,d]})$ as $\epsilon \to 0$, the corresponding multiplication operators converge strongly to

$$\frac{1}{2} \bigl( \chi_{(c,d)}(A) + \chi_{[c,d]}(A) \bigr).$$

Stone’s theorem can be improved to norm convergence if we include a test function in $C_c(\mathbb{R})$:

Theorem 9.43 (Stone). If $A$ is self-adjoint and $h \in C_c(\mathbb{R})$, then

$$h(A) = \lim_{\epsilon \downarrow 0} \frac{1}{2\pi i} \int h(t) \bigl( (A - t - i\epsilon)^{-1} - (A - t + i\epsilon)^{-1} \bigr)\, dt.$$

Proof. By the Borel functional calculus,

$$\frac{1}{2\pi i} \int h(t) \bigl( (A - t - i\epsilon)^{-1} - (A - t + i\epsilon)^{-1} \bigr)\, dt = h_\epsilon(A),$$

where $h_\epsilon$ is defined by (7.39). As in the proof of Proposition 7.44, the functions $h_\epsilon$ converge to $h$ uniformly on $\mathbb{R}$, so $h_\epsilon(A) \to h(A)$.

Besides Stone’s formula, other useful identities can be obtained by combining functional calculus and Herglotz functions (Exercise 9.18).
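For a single real number $x$ in place of $A$, the integral in Theorem 9.42 can be evaluated in closed form, since the integrand equals $2i\epsilon / ((x - t)^2 + \epsilon^2)$. The following minimal sketch, assuming only this elementary computation, compares the closed form with a midpoint-rule quadrature of the resolvent difference and then evaluates it for a small $\epsilon$. The function names are hypothetical, and the closed form is derived here directly from the integrand rather than quoted from (7.36).

```python
import math


def g_closed_form(x, c, d, eps):
    """(1/pi) * [atan((d - x)/eps) - atan((c - x)/eps)], the scalar Stone integral."""
    return (math.atan((d - x) / eps) - math.atan((c - x) / eps)) / math.pi


def g_quadrature(x, c, d, eps, n=20000):
    """Midpoint rule for (1/(2*pi*i)) * integral over [c, d] of the resolvent difference."""
    h = (d - c) / n
    total = 0j
    for j in range(n):
        t = c + (j + 0.5) * h
        total += 1 / (x - t - 1j * eps) - 1 / (x - t + 1j * eps)
    return total * h / (2j * math.pi)


# Compare the two evaluations at one point, then look at a small eps.
print(abs(g_quadrature(0.3, 0.0, 1.0, 0.1) - g_closed_form(0.3, 0.0, 1.0, 0.1)))
for x in (0.5, 0.0, 1.0, 2.0):
    print(x, g_closed_form(x, 0.0, 1.0, 1e-9))
```

For small $\epsilon$ the values approach 1 inside $(c, d)$, $\frac{1}{2}$ at the endpoints, and 0 outside $[c, d]$, which is the pointwise limit $\frac{1}{2}(\chi_{(c,d)} + \chi_{[c,d]})$ used in the proof.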

9.8. Fourier transform on R

This section can be seen as a detour and an extended example. In it, we rely on the material of Chapter 8, revisiting the derivative $D = -i \frac{d}{dx}$ defined in (8.27) and (8.28) as a self-adjoint operator on $L^2(\mathbb{R})$. We will show that $D$ is diagonalized, i.e., conjugated to a multiplication operator, by a unitary operator known as the Fourier transform. The first step is to use resolvents and Stone’s theorem to compute further functions of $D$. This line of argument will lead us to a derivation of the Fourier transform. This is not the standard approach to introducing the Fourier transform and proving its unitarity, but it illustrates the techniques which will soon be used for Schrödinger operators.


Lemma 9.44. For $f \in L^1(\mathbb{R})$, the function $\hat f : \mathbb{R} \to \mathbb{C}$ defined by

$$\hat f(k) = \frac{1}{\sqrt{2\pi}} \int e^{-iky} f(y)\, dy$$

is a bounded continuous function of $k \in \mathbb{R}$.

Proof. Boundedness follows from the $k$-independent estimate

$$|\hat f(k)| \le \frac{1}{\sqrt{2\pi}} \int |f(y)|\, dy.$$

Continuity follows from dominated convergence with dominating function $|f|$, since $e^{-iky}$ is continuous in $k$ for each $y$.

Lemma 9.45. For $g \in L^1(\mathbb{R}, dk)$, the function $\check g : \mathbb{R} \to \mathbb{C}$ defined by

$$\check g(x) = \frac{1}{\sqrt{2\pi}} \int e^{ikx} g(k)\, dk$$

is a bounded continuous function of $x \in \mathbb{R}$.

Proof. This follows from the previous lemma by the observation $\check g = \overline{\hat{\bar g}}$.

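Our goal is to prove that “ˆ” and “ˇ” extend to unitary maps which are the Fourier transform and the inverse Fourier transform, respectively. Their relation to the operator $D$ is found in the following key calculation.

Lemma 9.46. For $h \in C_c(\mathbb{R})$ and $f \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$,

$$h(D) f = (h \hat f)^{\vee}.$$

Proof. The right-hand side is well defined, since $h \hat f \in C_c(\mathbb{R})$. By Stone’s theorem,

$$h(D) = \operatorname*{s-lim}_{\epsilon \downarrow 0} h_\epsilon(D),$$

where

$$h_\epsilon(D) = \frac{1}{2\pi i} \int h(k) \bigl[ R_D(k + i\epsilon) - R_D(k - i\epsilon) \bigr]\, dk.$$

If $f \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$, then $h_\epsilon(D) f \to h(D) f$ in $L^2(\mathbb{R})$. However, using the formula for the resolvents, we can evaluate $h_\epsilon(D) f$ pointwise as

$$(h_\epsilon(D) f)(x) = \frac{1}{2\pi} \int h(k) \left( \int_{-\infty}^{x} e^{i(k + i\epsilon)(x - y)} f(y)\, dy + \int_{x}^{+\infty} e^{i(k - i\epsilon)(x - y)} f(y)\, dy \right) dk = \frac{1}{2\pi} \iint h(k)\, e^{ik(x - y) - \epsilon |x - y|} f(y)\, dy\, dk.$$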


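By dominated convergence with dominating function $|h(k) f(y)| \in L^1(dy\, dk)$,

$$\lim_{\epsilon \downarrow 0} (h_\epsilon(D) f)(x) = \frac{1}{2\pi} \iint h(k)\, e^{ik(x - y)} f(y)\, dy\, dk.$$

Integrating in $y$ and then in $k$ gives precisely

$$\lim_{\epsilon \downarrow 0} (h_\epsilon(D) f)(x) = \frac{1}{\sqrt{2\pi}} \int h(k)\, e^{ikx} \hat f(k)\, dk = (h \hat f)^{\vee}(x).$$

Since $h_\epsilon(D) f$ converges to $h(D) f$ in $L^2(\mathbb{R})$ and to $(h \hat f)^{\vee}$ pointwise, the two limits are equal, which concludes the proof.

Theorem 9.47. The map $f \mapsto \hat f$ on $L^1(\mathbb{R}) \cap L^2(\mathbb{R})$ extends to a unitary operator $\mathcal{F} : L^2(\mathbb{R}) \to L^2(\mathbb{R})$ such that $\mathcal{F} D \mathcal{F}^{-1} = T_{k,dk}$. The inverse $\mathcal{F}^{-1}$ is an extension of the map $g \mapsto \check g$.

Proof. Let us first note that for any $f, g \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$, by Fubini’s theorem,

$$\int \overline{f(x)}\, \check g(x)\, dx = \frac{1}{\sqrt{2\pi}} \iint \overline{f(x)}\, e^{ikx} g(k)\, dk\, dx = \int \overline{\hat f(k)}\, g(k)\, dk. \tag{9.29}$$

Combining this with Lemma 9.46, for $h \in C_c(\mathbb{R})$ and $f \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$,

$$\langle f, h(D) f \rangle = \int \overline{\hat f(k)}\, h(k)\, \hat f(k)\, dk.$$

Applying this to a sequence of nonnegative functions $h_n \in C_c(\mathbb{R})$ which is increasing in $n$ and converges pointwise to 1, we obtain

$$\langle f, h_n(D) f \rangle = \int h_n(k)\, |\hat f(k)|^2\, dk$$

for each $n$; taking limits as $n \to \infty$, using $h_n(D) \xrightarrow{s} I$ for the left-hand side and monotone convergence for the right-hand side gives the Plancherel formula

$$\|f\|^2 = \int |\hat f(k)|^2\, dk,$$

so $\hat f \in L^2(\mathbb{R}, dk)$ and the map $f \mapsto \hat f$ is norm-preserving. Thus, this map extends to a norm-preserving map $\mathcal{F} : L^2(\mathbb{R}, dx) \to L^2(\mathbb{R}, dk)$. Using again the observation $\check g = \overline{\hat{\bar g}}$, we conclude that $g \mapsto \check g$ also extends to a norm-preserving map $W : L^2(\mathbb{R}, dk) \to L^2(\mathbb{R}, dx)$,

$$W G = \overline{\mathcal{F}(\bar G)}. \tag{9.30}$$

Using continuity, Lemma 9.46 implies that for all $h \in C_c(\mathbb{R})$, $h(D) = W T_{h(k),dk} \mathcal{F}$.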


Using again the sequence $h_n \uparrow 1$, by strong convergence, we obtain $I = W \mathcal{F}$, so $W$ is onto. By (9.30), this implies that $\mathcal{F}$ is onto, so $\mathcal{F}$ is unitary and $W = \mathcal{F}^{-1}$. Now the earlier conclusions can be stated as

$$h(D) = \mathcal{F}^{-1} T_{h(k),dk} \mathcal{F} = \mathcal{F}^{-1} h(T_{k,dk}) \mathcal{F},$$

for all $h \in C_c(\mathbb{R})$. The set of $h \in B_b(\mathbb{R})$ for which equality holds is a subalgebra closed under pointwise convergence of uniformly bounded sequences, and since it contains $C_c(\mathbb{R})$, it is equal to $B_b(\mathbb{R})$. Thus, $D = \mathcal{F}^{-1} T_{k,dk} \mathcal{F}$, which completes the proof.

For $f \in L^2(\mathbb{R})$, its Fourier transform $\mathcal{F} f$ is not defined pointwise. It is defined as an element of $L^2(\mathbb{R})$, and since $\mathcal{F}$ is a bounded operator and $\chi_{[-n,n]} f \to f$ as $n \to \infty$,

$$\mathcal{F} f = \lim_{n \to \infty} \mathcal{F}(\chi_{[-n,n]} f).$$

Since $\chi_{[-n,n]} f \in L^1(\mathbb{R})$, the Fourier transforms on the right-hand side are defined pointwise, and it is common to write this as

$$(\mathcal{F} f)(k) = \lim_{n \to \infty} \frac{1}{\sqrt{2\pi}} \int_{-n}^{n} e^{-ikx} f(x)\, dx,$$

emphasizing that the limit is taken in the sense of $L^2(\mathbb{R})$-convergence, rather than pointwise.

For $f \in L^2(\mathbb{R}, dx)$, $f \in D(D)$ if and only if $\mathcal{F} f \in D(T_{k,dk})$, so if and only if

$$\int k^2\, |(\mathcal{F} f)(k)|^2\, dk < \infty.$$

For $n \in \mathbb{N}$, this generalizes inductively to $f \in D(D^n)$ if and only if

$$\int k^{2n}\, |(\mathcal{F} f)(k)|^2\, dk < \infty.$$

For the most part, the above proofs kept a clear conceptual difference between $L^2(\mathbb{R}, dx)$ and $L^2(\mathbb{R}, dk)$. However, the two are of course equal as Hilbert spaces up to a notational change, and this symmetry was used in the observation (9.30) to relate the Fourier transform and the inverse Fourier transform. When we construct eigenfunction expansions for Schrödinger operators, we will define “ˆ” and “ˇ” in an operator-dependent way; the symmetry between “ˆ” and “ˇ” will be lost and additional arguments will be needed.
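The normalization of “ˆ” and the Plancherel formula can be sanity-checked numerically on a Gaussian. The sketch below is an illustration only: it uses the well-known fact that $f(x) = e^{-x^2/2}$ satisfies $\hat f(k) = e^{-k^2/2}$ under the $\frac{1}{\sqrt{2\pi}}$ convention of Lemma 9.44, and it truncates all integrals to a bounded interval. The helper names, the truncation length `L` and the grid sizes are assumptions chosen for this example.

```python
import cmath
import math


def fourier_hat(f, k, L=10.0, n=2000):
    """Trapezoid rule for (1/sqrt(2*pi)) * integral of exp(-i*k*y) * f(y) over [-L, L]."""
    h = 2 * L / n
    total = 0j
    for j in range(n + 1):
        y = -L + j * h
        weight = 0.5 if j in (0, n) else 1.0
        total += weight * cmath.exp(-1j * k * y) * f(y)
    return total * h / math.sqrt(2 * math.pi)


def gaussian(x):
    return math.exp(-x * x / 2)


def norm_squared(values, step):
    """Trapezoid rule for the integral of |values|^2 on a uniform grid."""
    sq = [abs(v) ** 2 for v in values]
    return step * (sum(sq) - 0.5 * (sq[0] + sq[-1]))


grid = [-8 + 0.1 * j for j in range(161)]          # same grid used for x and for k
lhs = norm_squared([gaussian(x) for x in grid], 0.1)
rhs = norm_squared([fourier_hat(gaussian, k) for k in grid], 0.1)
print(lhs, rhs, math.sqrt(math.pi))
```

Both sides agree with $\sqrt{\pi} = \int e^{-x^2}\, dx$ up to quadrature error, as the Plancherel formula predicts for this particular $f$.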

9.9. Abstract eigenfunction expansions

Stone’s theorem allows us to compute operators in the Borel functional calculus, and this can be used to find unitary equivalences which conjugate the given self-adjoint operator $A$ to a multiplication operator $B$. We will now describe the abstract portion of this approach, which will be applied later to obtain eigenfunction expansions of Jacobi and Schrödinger operators.

The operator $B$ will be the operator of multiplication by $\lambda$ on a Hilbert space of the form $L^2(\mathbb{R}, \mathbb{C}^n, W(\lambda)\, d\mu(\lambda))$ (see Section 6.4) with $W(\lambda) \ge 0$ and $\operatorname{Tr} W(\lambda) = 1$ for $\mu$-a.e. $\lambda$. In applications to so-called half-line eigenfunction expansions, we will have $n = 1$, $W = 1$. We will also denote

$$L^2_{\mathrm{loc}}(\mathbb{R}, \mathbb{C}^n, W\, d\mu) = \{ g : \mathbb{R} \to \mathbb{C}^n \mid g \chi_{[-k,k]} \in L^2(\mathbb{R}, \mathbb{C}^n, W\, d\mu) \ \forall k \in \mathbb{N} \}.$$

Our goal is to prove:

Theorem 9.48. Let $A$ be a self-adjoint operator on $\mathcal{H}$. Let $\mathcal{H}_0 \subset \mathcal{H}$ be a dense subset of $\mathcal{H}$. Let $\mu$ be a Baire measure on $\mathbb{R}$, and let $W$ be an $n \times n$ matrix-valued function on $\mathbb{R}$ with $W \ge 0$ and $\operatorname{Tr} W = 1$ $\mu$-a.e. Let $f \mapsto \hat f$ be a linear map from $\mathcal{H}_0$ to $L^2_{\mathrm{loc}}(\mathbb{R}, \mathbb{C}^n, W\, d\mu)$ such that for all $f, g \in \mathcal{H}_0$ and all $h \in C_c(\mathbb{R})$,

$$\langle g, h(A) f \rangle = \int h\, \hat g^* W \hat f\, d\mu. \tag{9.31}$$

Denote $\mathcal{K} = L^2(\mathbb{R}, \mathbb{C}^n, W\, d\mu)$. Then the following hold.

(a) The map $f \mapsto \hat f$ extends to a norm-preserving map $U : \mathcal{H} \to \mathcal{K}$.

(b) There exists a linear map $U^* : \mathcal{K} \to \mathcal{H}$ such that

$$\langle U^* g, f \rangle = \langle g, U f \rangle \tag{9.32}$$

for all $f \in \mathcal{H}$ and all $g \in \mathcal{K}$. This map obeys $\|U^*\| \le 1$.

(c) Denote by $B$ the operator of multiplication by $\lambda$ in $\mathcal{K}$. Then

$$h(A) = U^* h(B) U \tag{9.33}$$

for all bounded continuous functions $h$. In particular, $U^* U = I$.

(d) $U U^*$ is an orthogonal projection in $\mathcal{K}$ with $\operatorname{Ran}(U U^*) = \operatorname{Ran} U$ and $\operatorname{Ker}(U U^*) = \operatorname{Ker} U^*$.

(e) $\operatorname{Ker} U^* = (\operatorname{Ran} U)^{\perp}$ is a resolvent-invariant subspace for $B$.

(f) If in addition $\operatorname{Ker} U^* = \{0\}$, then $U$ is unitary, so (9.33) provides a unitary equivalence between $A$ and $B$, (9.33) holds for any bounded Borel function $h$, and $A = U^* B U$.

The desired case $\operatorname{Ker} U^* = \{0\}$ cannot be established by abstract arguments: for instance, if $\mathcal{H}$ is a proper subspace of $\mathcal{K}$ and $U$ is inclusion, then $U U^* \ne I$. In applications, we will always verify by hand that $\operatorname{Ker} U^* = \{0\}$, and that verification will use the fact that $\operatorname{Ker} U^*$ is a resolvent-invariant subspace.


The relation (9.33) is crucial. Viewed as a property of $h$, note that it is not obviously multiplicative. Thus, we will require a formula which uses the functional calculus linearly:

Lemma 9.49. If $B$ is self-adjoint, its resolvents for $z \in \mathbb{C} \setminus \mathbb{R}$ can be expressed as

$$R_B(z) \varphi = i \int_0^{\infty} e^{ikz} e^{-ikB} \varphi\, dk, \qquad z \in \mathbb{C}_+,$$

$$R_B(z) \varphi = -i \int_{-\infty}^{0} e^{ikz} e^{-ikB} \varphi\, dk, \qquad z \in \mathbb{C}_-.$$

Proof. Let $z \in \mathbb{C}_+$. Pointwise convergence of Riemann sums with uniform boundedness implies strong convergence

$$i \int_0^t e^{ikz} e^{-ikB} \varphi\, dk = \lim_{n \to \infty} i\, \frac{t}{n} \sum_{j=1}^{n} e^{iztj/n} e^{-iBtj/n} \varphi = (I - e^{itz} e^{-itB}) R_B(z) \varphi.$$

Moreover, $\|e^{itz} e^{-itB} R_B(z) \varphi\| \le e^{-t \operatorname{Im} z} \|R_B(z) \varphi\|$, which goes to 0 as $t \to +\infty$. This proves the first formula; the second is proved analogously.

Proof of Theorem 9.48. (a) For any $f \in \mathcal{H}_0$,

$$\int h(\lambda)\, \hat f(\lambda)^* W(\lambda) \hat f(\lambda)\, d\mu(\lambda) = \langle f, h(A) f \rangle.$$

Let us apply this to a sequence of $h_n \in C_c(\mathbb{R})$ with $0 \le h_n \le 1$ which monotonically converges to 1 everywhere. Then using monotone convergence on the left-hand side and $h_n(A) \xrightarrow{s} I$ on the right-hand side implies that

$$\int \hat f(\lambda)^* W(\lambda) \hat f(\lambda)\, d\mu(\lambda) = \|f\|^2.$$

In particular, this proves that $\hat f \in \mathcal{K}$. Thus, the map $f \mapsto \hat f$ is norm-preserving from $\mathcal{H}_0$ to $\mathcal{K}$, so it extends continuously to a norm-preserving map $U : \mathcal{H} \to \mathcal{K}$.

(b) For any $g \in \mathcal{K}$, the map $f \mapsto \langle g, U f \rangle$ is a bounded linear functional and

$$|\langle g, U f \rangle| \le \|g\| \|U f\| \le \|g\| \|f\|.$$

By the Riesz representation theorem, there is a unique vector $U^* g$ which obeys (9.32) for all $f \in \mathcal{H}$, and $\|U^* g\| \le \|g\|$. The map $U^*$ is linear since the right-hand side of (9.32) is skew-linear in $g$.

(c) Equation (9.31) can now be rewritten as

$$\langle g, h(A) f \rangle = \langle U g, h(B) U f \rangle \tag{9.34}$$


for $f, g \in \mathcal{H}_0$ and $h \in C_c(\mathbb{R})$. Since $h(A), h(B)$ are bounded operators, by density of $\mathcal{H}_0$ in $\mathcal{H}$, this holds for all $f, g \in \mathcal{H}$ and $h \in C_c(\mathbb{R})$. For any bounded continuous $h : \mathbb{R} \to \mathbb{C}$, there is a sequence of uniformly bounded approximants $h_n \in C_c(\mathbb{R})$ such that $h_n \to h$ pointwise. Then

$$\langle g, h_n(A) f \rangle = \langle U g, h_n(B) U f \rangle,$$

so using strong operator convergence on both sides, we conclude that $h$ also obeys (9.34). Taking $h = 1$ gives $U^* U = I$.

(d) For any $g_1, g_2 \in \mathcal{K}$,

$$\langle g_1, U U^* g_2 \rangle = \langle U^* g_1, U^* g_2 \rangle = \langle U U^* g_1, g_2 \rangle,$$

so $(U U^*)^* = U U^*$. Moreover, $U U^* U U^* = U I U^* = U U^*$. Thus, $U U^*$ is an orthogonal projection. From $\operatorname{Ran}(U U^* U) \subset \operatorname{Ran}(U U^*) \subset \operatorname{Ran} U$ and $U U^* U = U I = U$, we conclude $\operatorname{Ran}(U U^*) = \operatorname{Ran} U$. Since $U U^*$ is an orthogonal projection, $U U^* g = 0$ if and only if $\langle g, U U^* g \rangle = 0$, which is equivalent to $U^* g = 0$.

(e) By (d), $g \in \operatorname{Ran} U = \operatorname{Ran}(U U^*)$ if and only if $\langle g, U U^* g \rangle = \|g\|^2$, so if and only if $\|U^* g\| = \|g\|$. We will use this as a criterion for $\operatorname{Ran} U$. By (c), we have $e^{itA} = U^* e^{itB} U$. Since operators $e^{itA}$ and $e^{itB}$ are unitary on $\mathcal{H}$ and $\mathcal{K}$, for any $f \in \mathcal{H}$,

$$\|f\| = \|e^{itA} f\| = \|U^* e^{itB} U f\| \le \|e^{itB} U f\| = \|f\|.$$

Equality must hold, which implies that $e^{itB} U f \in \operatorname{Ran} U$. This means that $\operatorname{Ran} U$ is invariant for $e^{itB}$ for any $t \in \mathbb{R}$. Thus, $\operatorname{Ker} U^* = (\operatorname{Ran} U)^{\perp}$ is invariant for $e^{-itB} = (e^{itB})^*$ by Lemma 4.41. In other words, $g \in \operatorname{Ker} U^*$ implies $U^* e^{-itB} g = 0$. By Lemma 9.49, for $z \in \mathbb{C}_+$,

$$U^* (B - z)^{-1} g = i \int_0^{\infty} e^{itz}\, U^* e^{-itB} g\, dt = 0,$$

so $\operatorname{Ker} U^*$ is invariant for $(B - z)^{-1}$ for all $z \in \mathbb{C}_+$. The case $z \in \mathbb{C}_-$ is proved analogously, so $\operatorname{Ker} U^*$ is resolvent-invariant for $B$.

(f) If $\operatorname{Ker} U^* = (\operatorname{Ran} U)^{\perp} = \{0\}$, then $\operatorname{Ran} U$ is dense in $\mathcal{K}$. Since $U$ is norm-preserving, this implies that $\operatorname{Ran} U = \mathcal{K}$ and $U$ is unitary. Now (9.33) holds for all bounded Borel functions by Theorem 8.44 and then $A = U^* B U$ holds by Proposition 9.13.
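The resolvent formulas of Lemma 9.49 can be sanity-checked in the scalar case, where $B$ is replaced by a real number $b$ and the claim reduces to $i \int_0^\infty e^{ik(z - b)}\, dk = (b - z)^{-1}$ for $z \in \mathbb{C}_+$ (and the analogous statement for $z \in \mathbb{C}_-$). The sketch below is only an illustration under these assumptions: the integral is truncated at a finite `T` and approximated by the midpoint rule, and the function name is hypothetical.

```python
import cmath


def resolvent_via_propagator(b, z, T=40.0, n=100000):
    """Scalar version of Lemma 9.49: integrate exp(i*k*z) * exp(-i*k*b) in k.

    For Im z > 0 the integral runs over [0, T] with prefactor +i;
    for Im z < 0 it runs over [-T, 0] with prefactor -i.
    T must be large enough that exp(-T * |Im z|) is negligible.
    """
    sign = 1 if z.imag > 0 else -1
    h = T / n
    total = 0j
    for j in range(n):
        k = sign * (j + 0.5) * h          # midpoint of the j-th subinterval
        total += cmath.exp(1j * k * (z - b))
    return sign * 1j * total * h


b = 2.0
for z in (0.5 + 1j, -1.0 - 2j):
    print(z, resolvent_via_propagator(b, z), 1 / (b - z))
```

The two printed values in each row agree up to quadrature and truncation error, matching $R_B(z) = (B - z)^{-1}$ in the scalar case.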

9.10. Exercises

9.1. Let $A$ be self-adjoint. If $S \subset T \subset \mathbb{R}$, prove that $\chi_S(A) \le \chi_T(A)$ in the sense of operator order.

9.2. Let $A$ be self-adjoint. Prove that $\min \sigma(A) = \sup\{E \in \mathbb{R} \mid \chi_{(-\infty,E)}(A) = 0\}$.

9.3. Let $u, v \in \mathcal{H}$. If $\mu_u$ and $\mu_v$ are mutually singular, prove that $u \perp v$. Hint: Use $S$ such that $\mu_{A,u}(S) = 0$ and $\mu_{A,v}(S^c) = 0$, and compute $\langle u, v \rangle = \langle u, \chi_S(A) v \rangle + \langle u, \chi_{S^c}(A) v \rangle$.

9.4. If $A$ is self-adjoint and $w = u + v$, prove that $\mu_{A,w} \ll \mu_{A,u} + \mu_{A,v}$.

9.5. If $\mu$ is a maximal spectral measure for $A$ and $S$ is a Borel set, prove that $\chi_S\, d\mu$ is a maximal spectral measure for the restriction of $A$ to $\operatorname{Ran} \chi_S(A)$.

9.6. Let $A, B$ be unbounded self-adjoint operators. Prove that the following are equivalent:
(a) $e^{ikA} e^{ilB} = e^{ilB} e^{ikA}$ for all $k, l \in \mathbb{R}$.
(b) $R_A(z) R_B(w) = R_B(w) R_A(z)$ for all $z, w \in \mathbb{C} \setminus \mathbb{R}$.
(c) $f(A) g(B) = g(B) f(A)$ for all $f, g \in B_b(\mathbb{R})$.
If these conditions hold, the unbounded operators $A, B$ are said to commute.

9.7. Let $A$ be a self-adjoint operator on $\mathcal{H}$ which has an orthonormal basis of eigenvectors $(v_n)_{n=1}^{\infty}$, with corresponding eigenvalues $(\lambda_n)_{n=1}^{\infty}$.
(a) Prove that $\sigma(A) = \overline{\{\lambda_n \mid n \in \mathbb{N}\}}$.
(b) If $\dim \mathcal{H} = \infty$, construct a self-adjoint operator on $\mathcal{H}$ which has an orthonormal basis of eigenvectors and $\sigma(A) = [0, 1]$.

9.8. If $A$ is self-adjoint, $K \in \mathcal{L}(\mathcal{H})$ relatively compact, and $\psi \in \mathcal{H}_{\mathrm{ac}}(A)$, prove that
$$\lim_{t \to \infty} \|K e^{-itA} \psi\|^2 = 0.$$
Hint: Use the Riemann–Lebesgue lemma and imitate the proof of the RAGE theorem.

9.9. If $A$ is a bounded self-adjoint operator on $\mathcal{H}$ and $\dim \mathcal{H} = \infty$, prove that $\sigma_{\mathrm{ess}}(A) \ne \emptyset$.

9.10. Prove a strengthening of Weyl’s criterion: for any $\lambda \in \mathbb{C}$, if $V$ stands for the set of all orthonormal sequences $v = (v_n)_{n=1}^{\infty}$ in $\mathcal{H}$, or for the set of all normalized sequences with $v_n \xrightarrow{w} 0$, then
$$\operatorname{dist}(\lambda, \sigma_{\mathrm{ess}}(A)) = \inf_{v \in V} \liminf_{n \to \infty} \|(A - \lambda) v_n\|.$$

9.11. Let $A$ be a self-adjoint operator, and let $\{\lambda_1, \dots, \lambda_k\} \subset \mathbb{R}$ be a finite set. Prove that $\sigma_{\mathrm{ess}}(A) \subset \{\lambda_1, \dots, \lambda_k\}$ if and only if the operator $\prod_{j=1}^{k} (A - \lambda_j)$ is compact.

9.12. If $A$ is a self-adjoint operator bounded below and $\lambda_n$ are defined as in Section 9.5, prove that $\min \sigma_{\mathrm{ess}}(A) = \lim_{n \to \infty} \lambda_n(A)$.


9.13. Let $A$ be a self-adjoint operator on $\mathcal{H}$, and let $\mu_m$, $m \in \mathbb{N} \cup \{\infty\}$, be its multiplicity $m$ measures. Let $\lambda$ be an eigenvalue of $A$.
(a) Prove that $\mu_m(\{\lambda\}) > 0$ for exactly one value of $m \in \mathbb{N} \cup \{\infty\}$.
(b) Prove that $\mu_m(\{\lambda\}) > 0$ if and only if $\dim \operatorname{Ker}(A - \lambda) = m$.

9.14. Let $A$ be a self-adjoint operator with multiplicity 1 spectrum (i.e., denoting by $\mu_m$ its multiplicity $m$ measures, $\mu_m = 0$ for all $m \ge 2$). Prove that a vector $\psi$ is cyclic for $A$ if and only if $\mu_{A,\psi}$ is mutually absolutely continuous with $\mu_1$.

9.15. Let $A$ be a self-adjoint operator, and let $\mu_m$ be its multiplicity $m$ measures. For any $n \in \mathbb{N}$, prove that $A$ has a spectral basis with at most $n$ vectors if and only if $\mu_m = 0$ for all $m > n$.

9.16. Let $A$ be a self-adjoint operator on $\mathcal{H}$, and let $\mu_m$, $m \in \mathbb{N} \cup \{\infty\}$, be its multiplicity $m$ measures.
(a) Let $S_m$ be any Borel set such that $\mu_m$ is supported on $S_m$ and $\mu_k(S_m) = 0$ for all $k \ne m$. Prove that the subspace $\mathcal{H}_m(A) = \operatorname{Ran} \chi_{S_m}(A)$ is independent of the choice of $S_m$. This subspace is called the multiplicity $m$ subspace for the operator $A$.
(b) Prove that
$$\mathcal{H} = \bigoplus_{m \in \mathbb{N} \cup \{\infty\}} \mathcal{H}_m(A).$$

9.17. Let $A$ be a self-adjoint operator on $\mathcal{H}$, let $\mu_m$, $m \in \mathbb{N} \cup \{\infty\}$, be its multiplicity $m$ measures, and let $n \in \mathbb{N}$.
(a) Prove that $\bigoplus_{j=1}^{n} \mathcal{H}_j(A)$ can be written as a direct sum of $n$ cyclic subspaces of $A$.
(b) Assume that a Borel set $S$ is such that $\chi_S(A)$ can be written as a direct sum of $n$ cyclic subspaces of $A$. Prove that $\mu_m(S) = 0$ for all $m > n$.

9.18. Let $A$ be a self-adjoint operator, and let $\lambda \in \mathbb{R}$. Prove that
$$\operatorname*{s-lim}_{\epsilon \downarrow 0}\, (-i\epsilon) (A - \lambda - i\epsilon)^{-1} = \chi_{\{\lambda\}}(A).$$

Chapter 10

Jacobi matrices

Jacobi matrices are tridiagonal self-adjoint matrices with real diagonal entries and positive off-diagonal entries. The simplest form is a finite Jacobi matrix,

$$J = \begin{pmatrix} b_1 & a_1 & & & \\ a_1 & b_2 & a_2 & & \\ & a_2 & \ddots & \ddots & \\ & & \ddots & \ddots & a_{d-1} \\ & & & a_{d-1} & b_d \end{pmatrix} \tag{10.1}$$

with $a_1, \dots, a_{d-1} \in (0, \infty)$ and $b_1, \dots, b_d \in \mathbb{R}$. The elements left blank (matrix elements $J_{kl}$ with $|k - l| \ge 2$) are implied to be 0. These are clearly Hermitian matrices, i.e., self-adjoint operators on $\mathbb{C}^d$. Similarly, half-line Jacobi matrices are operators on $\ell^2(\mathbb{N})$ given formally by the tridiagonal matrix expression

$$J = \begin{pmatrix} b_1 & a_1 & & & \\ a_1 & b_2 & a_2 & & \\ & a_2 & b_3 & a_3 & \\ & & a_3 & \ddots & \ddots \\ & & & \ddots & \ddots \end{pmatrix} \tag{10.2}$$

with $a_n > 0$ and $b_n \in \mathbb{R}$ for all $n$. For operators on $\ell^2(\mathbb{N})$, one has to be careful with matrix notation: denoting by $(\delta_n)_{n=1}^{\infty}$ the standard basis of $\ell^2(\mathbb{N})$, every operator $J$ on $\ell^2(\mathbb{N})$ corresponds to an infinite matrix of coefficients $J_{kl} = \langle \delta_k, J \delta_l \rangle$, but not every infinite matrix $(J_{kl})_{k,l=1}^{\infty}$ corresponds to a bounded linear operator.

In Section 10.1 we prove that for $(a_n)_{n=1}^{\infty}, (b_n)_{n=1}^{\infty} \in \ell^{\infty}(\mathbb{N})$, (10.2) defines a bounded self-adjoint operator with $\delta_1$ as a cyclic vector. From there on, we will refer to the spectral measure $\mu = \mu_{\delta_1}$ as the spectral measure corresponding to $J$. We will also describe an orthogonal polynomial construction which starts from $\mu$ and results in a sequence of recursion coefficients $a_n > 0$, $b_n \in \mathbb{R}$, and prove that the two correspondences are mutually inverse. In Section 10.2 we will discuss the case of unbounded Jacobi matrices; this section can be skipped by a reader interested only in the bounded case.

After that, we will consider connections between $J$, $\mu$, and the corresponding Herglotz function

$$m(z) = \langle \delta_1, (J - z)^{-1} \delta_1 \rangle, \tag{10.3}$$

which in this context is called the Weyl $m$-function or simply the $m$-function. Since $\mu$ is the spectral measure for the cyclic vector $\delta_1$,

$$m(z) = \int \frac{1}{x - z}\, d\mu(x), \tag{10.4}$$

and the $m$-function encodes spectral properties of $J$. Sections 10.3 and 10.4 contain useful perspectives on the Weyl function. In Section 10.5 we introduce full-line (or two-sided) Jacobi matrices, defined with the same tridiagonal pattern but acting on $\ell^2(\mathbb{Z})$.
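For a finite Jacobi matrix of the form (10.1), the definition (10.3) can be evaluated directly by solving the tridiagonal system $(J - z) u = \delta_1$ and reading off the first entry of $u$. The sketch below is a minimal illustration under stated assumptions: it uses plain forward elimination and back substitution without pivoting (adequate for nonreal $z$ in this small example), the function name `m_function` is hypothetical, and the coefficient values are made up for the demonstration.

```python
def m_function(a, b, z):
    """m(z) = <delta_1, (J - z)^{-1} delta_1> for the finite Jacobi matrix with
    off-diagonal entries a[0..d-2] > 0 and diagonal entries b[0..d-1] (z nonreal)."""
    d = len(b)
    diag = [complex(bk) - z for bk in b]       # diagonal of J - z
    rhs = [1 + 0j] + [0j] * (d - 1)            # right-hand side delta_1
    for k in range(1, d):                      # forward elimination
        w = a[k - 1] / diag[k - 1]
        diag[k] -= w * a[k - 1]
        rhs[k] -= w * rhs[k - 1]
    u = [0j] * d                               # back substitution
    u[d - 1] = rhs[d - 1] / diag[d - 1]
    for k in range(d - 2, -1, -1):
        u[k] = (rhs[k] - a[k] * u[k + 1]) / diag[k]
    return u[0]                                # first component = <delta_1, u>


# Hypothetical 4 x 4 example.
a = [1.0, 0.5, 2.0]
b = [0.0, 1.0, -1.0, 0.3]
for z in (0.3 + 0.7j, 2.0 + 0.01j, 1e6j):
    print(z, m_function(a, b, z))
```

Since $\mu$ is a probability measure in (10.4), one expects $\operatorname{Im} m(z) > 0$ for $\operatorname{Im} z > 0$ and $z\, m(z) \to -1$ as $z \to \infty$ along the imaginary axis; the printed values are consistent with both.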

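10.1. The canonical spectral measure and Favard’s theorem

In this section we introduce the basic objects associated with bounded half-line Jacobi matrices. We start with the precise definition:

Lemma 10.1. If $a, b \in \ell^{\infty}(\mathbb{N})$ and $a_n > 0$, $b_n \in \mathbb{R}$ for all $n \in \mathbb{N}$, then

$$(J u)_n = \begin{cases} b_1 u_1 + a_1 u_2 & n = 1 \\ a_{n-1} u_{n-1} + b_n u_n + a_n u_{n+1} & n \ge 2 \end{cases} \tag{10.5}$$

defines a bounded self-adjoint operator $J$ on $\ell^2(\mathbb{N})$ and

$$\|J\| \le 2 \sup_{n \in \mathbb{N}} a_n + \sup_{n \in \mathbb{N}} |b_n|. \tag{10.6}$$

Proof. Denote $\alpha = \sup_{n \in \mathbb{N}} a_n$ and $\beta = \sup_{n \in \mathbb{N}} |b_n|$. By the Cauchy–Schwarz inequality, for any $u \in \ell^2(\mathbb{N})$,

$$|a_{n-1} u_{n-1} + b_n u_n + a_n u_{n+1}|^2 \le (\alpha + \beta + \alpha)(\alpha |u_{n-1}|^2 + \beta |u_n|^2 + \alpha |u_{n+1}|^2).$$

Taking the sum over $n$ shows that $\|J u\|^2 \le (2\alpha + \beta)^2 \|u\|^2$ so (10.6) holds. Self-adjointness is the statement that

$$\langle u, J v \rangle = \langle J u, v \rangle \tag{10.7}$$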


holds for all u, v ∈ 2 (N). Since

⎧ ⎪ ⎨bk δk , Jδl = Jδk , δl = amin(k,l) ⎪ ⎩ 0

k=l |k − l| = 1 |k − l| ≥ 2,

(10.7) holds for u, v ∈ {δn | n ∈ N}. By sesquilinearity, (10.7) then holds for all u, v ∈ 2c (N) = span{δn | n ∈ N}. Finally, by continuity, (10.7) holds for all u, v. Another proof of (10.6) consists of decomposing J = A + B + C where each of the operators A, B, C has one nonzero diagonal and their norms are bounded by α, β, α, respectively. The estimate (10.6) has a converse, up to a multiplicative constant (Exercise 10.1). Due to their tridiagonal structure, strong and weak operator convergence of Jacobi matrices can be described very explicitly in terms of their coeﬃcients (Exercise 10.2). Lemma 10.2. For any bounded half-line Jacobi matrix J: (a) δ1 is a cyclic vector for J; (b) the support of its spectral measure μJ,δ1 is an inﬁnite set; (c) the sequence (J n δ1 )∞ n=0 is linearly independent; applying to it the Gram–Schmidt process gives the orthonormal basis (δn )∞ n=1 . Proof. From (10.5), it follows by induction that for all n ∈ N, where cn =

n

J n δ1 − cn δn+1 ∈ span{δk | 1 ≤ k ≤ n},

k=1 ak k

and therefore

span{J δ1 | 0 ≤ k ≤ n} = span{δk | 1 ≤ k ≤ n + 1}, from which it follows that δ1 is cyclic. Reversing this, for each n we have n k δn+1 − c−1 n J δ1 ∈ span{J δ1 | 0 ≤ k ≤ n − 1}.

Since (δn+1 )∞ n=0 is an orthonormal sequence and cn > 0, the claim follows by uniqueness of the Gram–Schmidt process. By the spectral theorem, there is a unitary map from L2 (R, dμδ1 ) to 2 (N), so the two Hilbert spaces have equal dimension. If supp μδ1 was a ﬁnite set, the space L2 (R, dμδ1 ) would be ﬁnite dimensional, leading to a contradiction. Since δ1 is a cyclic vector, the spectral measure μδ1 is a maximal spectral measure for J. From the perspective of general spectral theory, there is no reason to prefer μJ,δ1 over another maximal spectral measure; however, in


the theory of Jacobi matrices, $\mu_{J,\delta_1}$ is regarded as the canonical spectral measure corresponding to $J$. The proof of Lemma 10.2 also applies to finite Jacobi matrices with appropriate restrictions of indices:

Lemma 10.3. For any $d \times d$ Jacobi matrix $J$:
(a) $\delta_1$ is a cyclic vector for $J$;
(b) the support of its spectral measure $\mu_{J,\delta_1}$ has cardinality $d$;
(c) the sequence $(J^n\delta_1)_{n=0}^{d-1}$ is linearly independent; applying to it the Gram–Schmidt process gives the orthonormal basis $(\delta_n)_{n=1}^d$.

The unitary map $U : L^2(\mathbb{R}, d\mu_{J,\delta_1}) \to \ell^2(\mathbb{N})$ provided by the spectral theorem maps the constant function 1 to $\delta_1$ and conjugates $J$ to $T_{\lambda, d\mu_{J,\delta_1}(\lambda)}$. Thus, $U$ maps the monomial $\lambda^n$ to the vector $J^n\delta_1$, and the Gram–Schmidt process in Lemma 10.2(c) is related to the orthonormal polynomial construction in Example 3.48. We will now develop this idea and obtain the inverse of the map $J \mapsto \mu_{J,\delta_1}$.

An index shift by 1 is apparent already in Lemma 10.2(c) and will reappear below. This is an artifact of the standard indexing conventions for half-line Jacobi matrices; it would vanish if we regarded half-line Jacobi matrices as operators on $\ell^2(\mathbb{N} \cup \{0\})$.

For a moment, let us forget about Jacobi matrices and work in the Hilbert space $L^2(\mathbb{R}, d\mu)$ for a compactly supported probability Borel measure $\mu$ on $\mathbb{R}$. Recall that in orthogonal polynomial theory, a measure is said to be nontrivial if $\operatorname{supp}\mu$ is not a finite set. Then every nontrivial polynomial is nonzero in $L^2(\mathbb{R}, d\mu)$ and polynomials are dense in $L^2(\mathbb{R}, d\mu)$; in other words, the sequence $(x^n)_{n=0}^\infty$ is a linearly independent sequence with a dense span (see Example 3.48). Thus, applying to it the Gram–Schmidt process gives a sequence of orthonormal polynomials $p_n(x)$, $\deg p_n = n$,

\[ \langle p_m, p_n \rangle = \int \overline{p_m(x)}\, p_n(x)\, d\mu(x) = \delta_{m,n}, \tag{10.8} \]

which form an orthonormal basis in $L^2(\mathbb{R}, d\mu)$. Since $\mu$ is supported on the real line, $\overline{p_n(x)} = p_n(x)$ follows by induction through the Gram–Schmidt process, so the polynomials $p_n$ have real coefficients. Thus, the complex conjugate in (10.8) can be removed.

Proposition 10.4 (Jacobi recursion). Let $\mu$ be a nontrivial probability Borel measure on $\mathbb{R}$ with finite moments, and let $p_n$ be its orthonormal polynomials. Then there exist sequences $(a_n)_{n=1}^\infty$, $(b_n)_{n=1}^\infty$ with $a_n > 0$, $b_n \in \mathbb{R}$ for all $n \in \mathbb{N}$, such that

\[ x p_n(x) = a_n p_{n-1}(x) + b_{n+1} p_n(x) + a_{n+1} p_{n+1}(x) \tag{10.9} \]

holds for all $n \ge 1$ and

\[ x p_0(x) = b_1 p_0(x) + a_1 p_1(x). \tag{10.10} \]

Moreover, if $\mu$ is compactly supported, then $(a_n)_{n=1}^\infty, (b_n)_{n=1}^\infty \in \ell^\infty(\mathbb{N})$.

Remark 10.5. It is standard to set the convention $p_{-1}(x) = 0$ and claim that (10.9) holds for all $n \ge 0$, with an arbitrary value of $a_0$.

Proof. Since $x p_n(x)$ is a polynomial of degree $n+1$, it is a linear combination of $p_0, \dots, p_{n+1}$. Since $p_n$ is orthogonal to all polynomials of degree $\le n-1$, for $k \le n-2$,

\[ \langle p_k, x p_n \rangle = \langle x p_k, p_n \rangle = 0, \]

so $x p_n(x)$ is a linear combination of $p_{n-1}, p_n, p_{n+1}$; i.e., there exist coefficients $a_{n+1}, b_{n+1}, c_{n+1} \in \mathbb{R}$ such that

\[ x p_n(x) = c_{n+1} p_{n-1}(x) + b_{n+1} p_n(x) + a_{n+1} p_{n+1}(x) \]

(for $n = 0$, the term $c_{n+1} p_{n-1}(x)$ should be omitted). Since $p_n, p_{n+1}$ have positive leading coefficients, it follows that $a_{n+1} > 0$. Moreover, for $n \ge 1$,

\[ c_{n+1} = \langle p_{n-1}, x p_n \rangle = \langle x p_{n-1}, p_n \rangle = a_n. \]

Now assume that $\operatorname{supp}\mu \subset [-C, C]$ for some $C < \infty$. From

\[ a_n = \langle x p_{n-1}, p_n \rangle = \int x\, p_n(x)\, p_{n-1}(x)\, d\mu(x), \]

using $|x| \le C$ and the Cauchy–Schwarz inequality gives $a_n \le C \|p_n\| \|p_{n-1}\| = C$. Similarly, $|b_n| = |\langle p_{n-1}, x p_{n-1} \rangle| \le C$, so the sequences are bounded.

Definition 10.6. The coefficients $(a_n, b_n)_{n=1}^\infty \in ((0,\infty) \times \mathbb{R})^{\mathbb{N}}$ in Proposition 10.4 are called the Jacobi parameters of the measure $\mu$.
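The recursion (10.9) can be turned into a procedure for computing Jacobi parameters. The following is a minimal sketch, not taken from the text: it assumes a discrete measure given by hypothetical lists `points` and `weights`, and it evaluates the orthonormal polynomials only at the support points. Since such a measure has finite support, only the first few parameters exist (`count` must be smaller than the number of points), in line with the finite Jacobi matrix case.

```python
from math import sqrt

def jacobi_parameters(points, weights, count):
    """Return ([a_1..a_count], [b_1..b_count]) for the measure
    sum_i weights[i] * delta_{points[i]} (assumed to be a probability measure)."""
    p_prev = [0.0] * len(points)   # values of p_{n-1}, starting with p_{-1} = 0
    p_cur = [1.0] * len(points)    # values of p_n, starting with p_0 = 1
    a_prev = 0.0                   # a_0 is irrelevant because p_{-1} = 0
    a, b = [], []
    for _ in range(count):
        # b_{n+1} = <p_n, x p_n>
        b_next = sum(w * x * p * p for x, w, p in zip(points, weights, p_cur))
        # a_{n+1} p_{n+1} = (x - b_{n+1}) p_n - a_n p_{n-1}, by (10.9)
        r = [(x - b_next) * pc - a_prev * pp
             for x, pc, pp in zip(points, p_cur, p_prev)]
        # a_{n+1} > 0 is the L^2 norm of the remainder
        a_next = sqrt(sum(w * ri * ri for w, ri in zip(weights, r)))
        a.append(a_next)
        b.append(b_next)
        p_prev, p_cur = p_cur, [ri / a_next for ri in r]
        a_prev = a_next
    return a, b

# Uniform measure on {-1, 0, 1}: a_1 = sqrt(2/3), a_2 = sqrt(1/3), b_1 = b_2 = 0.
a, b = jacobi_parameters([-1.0, 0.0, 1.0], [1 / 3, 1 / 3, 1 / 3], 2)
```

In this example the parameters can be checked by hand: $p_1(x) = x/\sqrt{2/3}$, and the remainder $x p_1 - a_1 p_0$ has squared norm $1/3$.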

Next, we see that Jacobi parameters determine a compactly supported measure uniquely:

Lemma 10.7. If two nontrivial compactly supported probability measures on $\mathbb{R}$ have the same Jacobi parameters, then they are equal.

Proof. For any probability measure, $p_{-1} = 0$ and $p_0 = 1$. If $\mu$ and $\tilde\mu$ have the same Jacobi parameters, then by induction using (10.9), they have the same orthonormal polynomials. For all $n$,

\[ \int p_n\, d\mu = \delta_{n,0} = \int p_n\, d\tilde\mu \]

since this integral can be interpreted as the inner product of $p_n$ with 1. Then, by linearity, $\int P\, d\mu = \int P\, d\tilde\mu$ for any polynomial $P$. Since polynomials are dense in $C(\operatorname{supp}\mu \cup \operatorname{supp}\tilde\mu)$, it follows that $\mu = \tilde\mu$.

Of course, the Jacobi parameters $a_n, b_n$ of a measure $\mu$ can be used to assemble a half-line Jacobi matrix $J$. By Lemma 10.7, this construction from $\mu$ to $J$ is injective. We will now see that it is also surjective. This, and more, follows from the following lemma:

Lemma 10.8. For a bounded half-line Jacobi matrix $J$ with coefficients $(a_n, b_n)_{n=1}^\infty$, the Jacobi parameters of the spectral measure $\mu_{J,\delta_1}$ are precisely $(a_n, b_n)_{n=1}^\infty$.

Proof. By Lemma 10.2, there is a unitary map $U : L^2(\mathbb{R}, d\mu) \to \ell^2(\mathbb{N})$ which maps 1 to $\delta_1$ and conjugates multiplication by $x$ with the operator $J$. It follows that $U(P(x)) = P(J)\delta_1$ for any polynomial $P$. Unitary maps preserve inner products, so they preserve Gram–Schmidt processes. In $L^2(\mathbb{R}, d\mu_{J,\delta_1})$, the Gram–Schmidt process on $(x^n)_{n=0}^\infty$ gives $(p_n(x))_{n=0}^\infty$, so in $\ell^2(\mathbb{N})$, the Gram–Schmidt process on $(J^n\delta_1)_{n=0}^\infty$ gives $(p_n(J)\delta_1)_{n=0}^\infty$. However, by Lemma 10.2 this Gram–Schmidt process in $\ell^2(\mathbb{N})$ gives $(\delta_{n+1})_{n=0}^\infty$, so we conclude that for all $n$,

\[ p_n(J)\delta_1 = \delta_{n+1}. \]

From the definition of the Jacobi matrix,

\[ J\delta_{n+1} = a_n\delta_n + b_{n+1}\delta_{n+1} + a_{n+1}\delta_{n+2}, \]

and we can now rewrite this as

\[ J p_n(J)\delta_1 = a_n p_{n-1}(J)\delta_1 + b_{n+1} p_n(J)\delta_1 + a_{n+1} p_{n+1}(J)\delta_1. \]

Applying $U^{-1}$ to this equality gives (10.9), and concludes the proof.

In summary, the two constructions, from a Jacobi matrix to its spectral measure and from a measure to its Jacobi parameters, are bijections and they are mutually inverse:

Theorem 10.9 (Favard's theorem). The map $J \mapsto \mu_{J,\delta_1}$ is a bijection between the set of bounded half-line Jacobi matrices and the set of compactly supported nontrivial probability measures on $\mathbb{R}$. Its inverse is obtained by taking the Jacobi parameters of a measure and using them as coefficients of the Jacobi matrix.

All the arguments presented here also apply to finite Jacobi matrices, with the appropriate range of indices (see Exercise 10.3). From now on, we will denote $\mu = \mu_{J,\delta_1}$ and always consider $J$ and $\mu$ related as in Favard's theorem.


10.2. Unbounded Jacobi matrices

In this section, we consider how the matrix representation (10.2) leads to unbounded self-adjoint Jacobi matrices in the case when at least one of the sequences $(a_n)_{n=1}^\infty$, $(b_n)_{n=1}^\infty$ is unbounded. This section can be skipped by a reader who is only interested in bounded Jacobi matrices, except for a glance at some terminology (e.g., every bounded Jacobi matrix is limit point, and in later sections we will state various results for Jacobi matrices which are limit point).

We will see that in the unbounded case, the matrix representation may be incomplete; depending on the coefficient sequences $(a_n)_{n=1}^\infty$, $(b_n)_{n=1}^\infty$, it may be necessary to also specify a boundary condition at $\infty$ in order to specify an unbounded self-adjoint Jacobi matrix.

We begin by defining a maximal Jacobi operator on $\ell^2(\mathbb{N})$ with domain

\[ D(J_{\max}) = \Bigl\{ u \in \ell^2(\mathbb{N}) \Bigm| \sum_{n=2}^\infty |a_{n-1}u_{n-1} + b_n u_n + a_n u_{n+1}|^2 < \infty \Bigr\}. \]

Its action on the domain is defined by

\[ (J_{\max}u)_n = \begin{cases} b_1 u_1 + a_1 u_2 & n = 1 \\ a_{n-1}u_{n-1} + b_n u_n + a_n u_{n+1} & n \ge 2. \end{cases} \]

This operator may or may not be self-adjoint. We define the Wronskian of two sequences $u, v$ as the sequence

\[ W_n(u, v) = a_n(u_{n+1}v_n - u_n v_{n+1}). \]

We show that the Wronskian has a limit as $n \to \infty$ and that this limit provides the obstruction to self-adjointness:

Lemma 10.10. For any $u, v \in D(J_{\max})$,

\[ \lim_{n\to\infty} W_n(\bar u, v) = \langle J_{\max}u, v \rangle - \langle u, J_{\max}v \rangle. \]

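Proof. For $n \in \mathbb{N}$, by a direct calculation,

\[ \overline{(J_{\max}u)_n}\, v_n - \bar u_n (J_{\max}v)_n = \begin{cases} W_n(\bar u, v) - W_{n-1}(\bar u, v) & n \ge 2 \\ W_1(\bar u, v) & n = 1, \end{cases} \]

so summing from 1 to $n$ gives

\[ \sum_{j=1}^n \overline{(J_{\max}u)_j}\, v_j - \sum_{j=1}^n \bar u_j (J_{\max}v)_j = W_n(\bar u, v). \]

Since $u, v, J_{\max}u, J_{\max}v \in \ell^2(\mathbb{N})$, by the Cauchy–Schwarz inequality, the left-hand side converges as $n \to \infty$, and taking this limit concludes the proof.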
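The telescoping identity used in this proof can be checked numerically on finite data. The sketch below is only an illustration with arbitrarily chosen coefficients and sequences (all names are hypothetical). It takes the complex conjugate on the first argument, matching an inner product that is conjugate-linear in its first slot, and uses that the coefficients are real, so that conjugating $(J_{\max}u)_j$ is the same as applying the recursion to $\bar u$.

```python
def jacobi_apply(a, b, u, n):
    """(J_max u)_n for 1-based n, where a[k-1] = a_k, b[k-1] = b_k, u[k-1] = u_k."""
    left = a[n - 2] * u[n - 2] if n >= 2 else 0.0
    return left + b[n - 1] * u[n - 1] + a[n - 1] * u[n]

def wronskian(a, u, v, n):
    """W_n(u, v) = a_n (u_{n+1} v_n - u_n v_{n+1})."""
    return a[n - 1] * (u[n] * v[n - 1] - u[n - 1] * v[n])

# Arbitrary test data: a_1..a_4 > 0, real b_1..b_4, and entries u_1..u_5, v_1..v_5.
a = [1.0, 2.0, 0.5, 3.0]
b = [0.3, -1.0, 2.0, 0.0]
u = [1 + 2j, -1j, 0.5 + 0j, 2 - 1j, 1 + 0j]
v = [0.5j, 1 + 0j, -2 + 1j, 3 + 0j, -1j]
u_bar = [x.conjugate() for x in u]

n = 4
# Real coefficients: conj((J u)_j) equals (J u_bar)_j.
lhs = sum(jacobi_apply(a, b, u_bar, j) * v[j - 1]
          - u_bar[j - 1] * jacobi_apply(a, b, v, j)
          for j in range(1, n + 1))
rhs = wronskian(a, u_bar, v, n)
```

Up to rounding, `lhs` and `rhs` agree for every choice of data, since the identity is purely algebraic.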


Accordingly, we define the boundary Wronskian

\[ W_{+\infty}(u, v) = \lim_{n\to\infty} W_n(u, v) \]

for $u, v \in D(J_{\max})$. To search for self-adjoint restrictions of $J_{\max}$, we begin by finding its adjoint. Recall that $\ell^2_c(\mathbb{N})$ denotes the span of $\{\delta_n \mid n \in \mathbb{N}\}$, i.e., the set of sequences with finitely many nonzero entries.

Theorem 10.11. The restriction $J_0$ of $J_{\max}$ to $D(J_0) = \ell^2_c(\mathbb{N})$ obeys $J_0^* = J_{\max}$, and its closure $\overline{J_0}$ is the restriction of $J_{\max}$ to

\[ D(\overline{J_0}) = \{u \in D(J_{\max}) \mid W_{+\infty}(u, v) = 0 \text{ for all } v \in D(J_{\max})\}. \]

Proof. Assume that $u, w \in \ell^2(\mathbb{N})$ obey

\[ \langle u, J_0 v \rangle = \langle w, v \rangle \qquad \forall v \in \ell^2_c(\mathbb{N}). \]

Since any $v$ is a linear combination of $\delta_n$'s, this is equivalent by linearity to

\[ \langle u, J_0\delta_n \rangle = \langle w, \delta_n \rangle \qquad \forall n \in \mathbb{N}. \]

This is equivalent to $w_n = a_{n-1}u_{n-1} + b_n u_n + a_n u_{n+1}$ (with the convention $u_0 = 0$), so it is equivalent to $(u, w) \in \Gamma(J_{\max})$. This proves that $J_0$ is densely defined and $J_0^* = J_{\max}$.

As the adjoint of another operator, $J_{\max}$ is closed. Since $J_0$ is a restriction of $J_{\max}$, so is $\overline{J_0}$. Moreover, $u \in D(\overline{J_0}) = D((J_0^*)^*)$ if and only if $W_{+\infty}(u, v) = 0$ for all $v \in D(J_{\max})$.

We define a self-adjoint half-line Jacobi matrix $J$ as a self-adjoint restriction of $J_{\max}$. By general principles, this is equivalent to $J$ being a self-adjoint extension of $J_0$ and equivalent to

\[ J_0 \subset J \subset J_{\max}, \qquad J^* = J. \]

We will use the framework of Section 8.7 to describe such self-adjoint restrictions.

Lemma 10.12. The quotient vector space $D(J_{\max})/D(\overline{J_0})$ has dimension 0 or 2.

Proof. For every $n$, the map $D(J_{\max}) \to \mathbb{C}^2$, given by

\[ u \mapsto \begin{pmatrix} u_{n+1} \\ a_n u_n \end{pmatrix}, \]

lets us express $W_n$ in terms of a symplectic form on $\mathbb{C}^2$,

\[ W_n(u, v) = \begin{pmatrix} u_{n+1} \\ a_n u_n \end{pmatrix}^{\!\top} \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \begin{pmatrix} v_{n+1} \\ a_n v_n \end{pmatrix}. \]

By Theorem 8.64, $W_n$ obeys the Plücker identity

\[ W_n(v_1, v_2)W_n(v_3, v_4) - W_n(v_1, v_3)W_n(v_2, v_4) + W_n(v_1, v_4)W_n(v_2, v_3) = 0 \]

for all $v_1, v_2, v_3, v_4 \in \ell^2(\mathbb{N})$; alternatively, this follows directly from the identity

\[ \det \begin{pmatrix} (v_1)_{n+1} & (v_2)_{n+1} & (v_3)_{n+1} & (v_4)_{n+1} \\ a_n(v_1)_n & a_n(v_2)_n & a_n(v_3)_n & a_n(v_4)_n \\ (v_1)_{n+1} & (v_2)_{n+1} & (v_3)_{n+1} & (v_4)_{n+1} \\ a_n(v_1)_n & a_n(v_2)_n & a_n(v_3)_n & a_n(v_4)_n \end{pmatrix} = 0. \]

Taking $n \to \infty$, we conclude that for all $v_1, v_2, v_3, v_4 \in D(J_{\max})$,

\[ W_{+\infty}(v_1, v_2)W_{+\infty}(v_3, v_4) - W_{+\infty}(v_1, v_3)W_{+\infty}(v_2, v_4) + W_{+\infty}(v_1, v_4)W_{+\infty}(v_2, v_3) = 0. \]

Thus, by Theorem 8.64, the quotient space $D(J_{\max})/D(\overline{J_0})$ has dimension 0 or 2.

Definition 10.13. The case $\overline{J_0} = J_{\max}$ is called the limit point case. The case $\dim(D(J_{\max})/D(\overline{J_0})) = 2$ is called the limit circle case.

In particular, $J_{\max}$ is self-adjoint if and only if we are in the limit point case. By Lemma 10.1, a bounded Jacobi matrix is self-adjoint, so $W_{+\infty}(u, v) = 0$ for all $u, v \in \ell^2(\mathbb{N})$. Thus, every bounded Jacobi matrix is in the limit point case. More generally:

Lemma 10.14. If $(a_n)_{n=1}^\infty \in \ell^\infty(\mathbb{N})$, we are in the limit point case.

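Proof. By the Cauchy–Schwarz inequality, for all $u, v \in \ell^2(\mathbb{N})$,

\[ \sum_{n=1}^\infty |W_n(u, v)| = \sum_{n=1}^\infty a_n |u_{n+1}v_n - u_n v_{n+1}| \le 2\|a\|_\infty \|u\|_2 \|v\|_2 < \infty. \]

In particular, for all $u, v \in \ell^2(\mathbb{N})$, $W_n(u, v) \to 0$ as $n \to \infty$.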

Jacobi matrices with $a_n = 1$ for all $n$ are also called discrete Schrödinger operators; by Lemma 10.14, discrete Schrödinger operators are always in the limit point case, even if $(b_n)_{n=1}^\infty \notin \ell^\infty(\mathbb{N})$.

In the limit circle case, we see that the matrix representation does not fully describe a self-adjoint Jacobi matrix. Self-adjoint extensions $J$ of $J_0$ are in bijective correspondence with Lagrangian subspaces $D(J) \subset D(J_{\max})$, and they can be parametrized by a single self-adjoint boundary condition by the results of Section 8.7:

Corollary 10.15. In the limit circle case, for any $v \in D(J_{\max}) \setminus D(\overline{J_0})$ such that $W_{+\infty}(\bar v, v) = 0$,

\[ D(J) = \{u \in D(J_{\max}) \mid W_{+\infty}(u, v) = 0\} \]

defines a self-adjoint restriction of $J_{\max}$. Conversely, every self-adjoint restriction of $J_{\max}$ is of this form.

Finally, let us establish the canonical spectral measure for an unbounded self-adjoint Jacobi matrix:

Theorem 10.16. Let $J$ be a self-adjoint half-line Jacobi matrix. The vector $\delta_1$ is cyclic and the spectral measure $d\mu_{J,\delta_1}$ has all finite moments, i.e.,

\[ \int |x|^{2n}\, d\mu_{J,\delta_1}(x) < \infty \qquad \forall n \in \mathbb{N} \cup \{0\}. \]

The Jacobi parameters of $\mu_{J,\delta_1}$ are the coefficients of the Jacobi matrix.

Proof. For any $v \in \ell^2(\mathbb{N})$ and $z \in \mathbb{C} \setminus \mathbb{R}$,

\[ \langle (J - i)^{-1}v, (J - z)^{-1}(J - i)^{-1}v \rangle = \langle v, (J + i)^{-1}(J - z)^{-1}(J - i)^{-1}v \rangle, \]

and by the Borel functional calculus, this can be written as

\[ \int \frac{1}{x - z}\, d\mu_{J,(J-i)^{-1}v}(x) = \int \frac{1}{(x + i)(x - z)(x - i)}\, d\mu_{J,v}(x). \]

As a function of $z$, this determines the measure uniquely, so we conclude

\[ d\mu_{J,(J-i)^{-1}v}(x) = \frac{1}{|x - i|^2}\, d\mu_{J,v}(x). \]

Applying this inductively, using $(J - i)^n\delta_1 \in D(J) = \operatorname{Ran}((J - i)^{-1})$, we conclude that

\[ d\mu_{J,\delta_1}(x) = \frac{1}{|x - i|^{2n}}\, d\mu_{J,(J-i)^n\delta_1}(x). \]

In particular, for any $n$,

\[ \int |x - i|^{2n}\, d\mu_{J,\delta_1}(x) = \int 1\, d\mu_{J,(J-i)^n\delta_1} = \|(J - i)^n\delta_1\|_2^2 < \infty, \]

so $\mu_{J,\delta_1}$ has finite moments.

For any $n, k \in \mathbb{N}$, the functions $h_{n,k}(x) = \min\{k, |x|^n\}\,(\operatorname{sgn} x)^n$ are bounded, so $h_{n,k}(J)\delta_1$ is in the cyclic subspace of $\delta_1$. By dominated convergence with dominating function $x^{2n}$,

\[ \|J^n\delta_1 - h_{n,k}(J)\delta_1\|_2^2 = \int |x^n - h_{n,k}(x)|^2\, d\mu_{J,\delta_1}(x) \to 0, \qquad k \to \infty, \]

so $J^n\delta_1$ is in the cyclic subspace of $\delta_1$. Applying the Gram–Schmidt process as in the proof of Lemma 10.2, this cyclic subspace contains all $\delta_n$, so it is equal to $\ell^2(\mathbb{N})$.


Comparing the Gram–Schmidt process for monomials in $L^2(\mathbb{R}, d\mu_{J,\delta_1})$ with the Gram–Schmidt process for $(J^n\delta_1)_{n=0}^\infty$ in $\ell^2(\mathbb{N})$ shows, as in the bounded case, equality of Jacobi parameters and coefficients of the Jacobi matrix.

Thus, we obtain once again a canonical spectral measure $\mu = \mu_{J,\delta_1}$ and the corresponding Weyl function (10.3) and (10.4). For measures with unbounded support, the Jacobi parameters may or may not uniquely determine the measure. This has a fascinating connection with the limit point/limit circle dichotomy for Jacobi matrices [2], [92, Section 3.8], [25, Section 2.4]. In the rest of this chapter, we will restrict our work to the limit point case.

10.3. Weyl solutions and m-functions

In this section, we will introduce Weyl solutions for half-line Jacobi matrices, and use them to study the Weyl m-function. A sequence $v$ is said to be a (formal) eigensolution at $z \in \mathbb{C}$ if it obeys the Jacobi recursion

\[ a_{n-1}v_{n-1} + b_n v_n + a_n v_{n+1} = z v_n. \tag{10.11} \]

The word “formal” is used to emphasize that the sequence $v$ is not required to be square-summable (and therefore not part of the Hilbert space), but we will usually omit it.

When discussing eigensolutions, we will use the convention to also assume existence of a coefficient $a_0 > 0$ and assume that $v_0$ is also defined. Then (10.11) can be used to express $v_2$ in terms of $v_0$ and $v_1$, and so on, so an eigensolution is uniquely defined by the values of $v_1$ and $a_0 v_0$. In other words, since (10.11) is a second-order recursion relation and $a_n \ne 0$ for all $n$, for any fixed $z \in \mathbb{C}$, the set of eigensolutions is a two-dimensional vector space.

The second-order recursion relation can be rewritten as a first-order system,

\[ \begin{pmatrix} v_{n+1} \\ a_n v_n \end{pmatrix} = A(a_n, b_n; z) \begin{pmatrix} v_n \\ a_{n-1}v_{n-1} \end{pmatrix}, \tag{10.12} \]

where

\[ A(a_n, b_n; z) = \begin{pmatrix} \dfrac{z - b_n}{a_n} & -\dfrac{1}{a_n} \\ a_n & 0 \end{pmatrix} \]

is called the 1-step transfer matrix.

We call an eigensolution nontrivial if it is not identically zero. By (10.12), if $v_{n-1} = v_n = 0$ for some $n$, then $v$ is trivial; in other words, a nontrivial eigensolution can never have two consecutive zeros.

It would have seemed more obvious to rewrite the Jacobi recursion as the first-order system

\[ \begin{pmatrix} v_{n+1} \\ v_n \end{pmatrix} = \begin{pmatrix} \dfrac{z - b_n}{a_n} & -\dfrac{a_{n-1}}{a_n} \\ 1 & 0 \end{pmatrix} \begin{pmatrix} v_n \\ v_{n-1} \end{pmatrix}, \]

but this has the disadvantage that a single transfer matrix depends on two off-diagonal Jacobi parameters. Our transfer matrices also have the useful property that $\det A(a, b; z) = 1$. Note that the Wronskian of two sequences can also be expressed as a determinant,

\[ W_n(u, v) = \det \begin{pmatrix} u_{n+1} & v_{n+1} \\ a_n u_n & a_n v_n \end{pmatrix}. \]

Lemma 10.17. If $u, v$ are eigensolutions at $z$, then their Wronskian is independent of $n$.

Proof. Combining the recursions for $u, v$ as

\[ \begin{pmatrix} u_{n+1} & v_{n+1} \\ a_n u_n & a_n v_n \end{pmatrix} = A(a_n, b_n; z) \begin{pmatrix} u_n & v_n \\ a_{n-1}u_{n-1} & a_{n-1}v_{n-1} \end{pmatrix} \tag{10.13} \]

and taking determinants, the claim follows from $\det A(a_n, b_n; z) = 1$.
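As an illustration of the first-order system (10.12) and of Lemma 10.17, the following sketch propagates two eigensolutions with the 1-step transfer matrices and records their Wronskian at each step. The coefficients, the spectral parameter, and the initial data are arbitrary choices made for this example, and the helper names are hypothetical.

```python
def transfer(a, b, z):
    """1-step transfer matrix A(a, b; z), stored as a pair of rows."""
    return (((z - b) / a, -1.0 / a), (a, 0.0))

def mat_vec(A, vec):
    """Multiply the 2x2 matrix A by the column vector vec."""
    return (A[0][0] * vec[0] + A[0][1] * vec[1],
            A[1][0] * vec[0] + A[1][1] * vec[1])

def det(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

z = 0.7 + 0.4j
a = [1.0, 2.0, 0.5, 3.0]     # a_1, ..., a_4
b = [0.3, -1.0, 2.0, 0.0]    # b_1, ..., b_4

# State vectors (v_{n+1}, a_n v_n), starting at n = 0.
u_state = (1.0, 0.0)         # (u_1, a_0 u_0)
v_state = (0.5, 2.0)         # (v_1, a_0 v_0)

def wronskian_of_states(us, vs):
    """W_n(u, v) as the determinant with columns (u_{n+1}, a_n u_n), (v_{n+1}, a_n v_n)."""
    return us[0] * vs[1] - vs[0] * us[1]

wronskians = [wronskian_of_states(u_state, v_state)]   # W_0 = 2
dets = []
for a_n, b_n in zip(a, b):
    A = transfer(a_n, b_n, z)
    dets.append(det(A))
    u_state, v_state = mat_vec(A, u_state), mat_vec(A, v_state)
    wronskians.append(wronskian_of_states(u_state, v_state))
```

Every entry of `dets` equals 1 and every entry of `wronskians` equals the initial value 2, up to rounding.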

Lemma 10.18. If $u, v$ are eigensolutions at $z$, and $\alpha, \beta \in \mathbb{C}$, the following are equivalent:
(a) $\alpha u + \beta v = 0$.
(b) For one value of $n$,

\[ \begin{pmatrix} u_{n+1} & v_{n+1} \\ a_n u_n & a_n v_n \end{pmatrix} \begin{pmatrix} \alpha \\ \beta \end{pmatrix} = 0. \tag{10.14} \]

(c) For all values of $n$, (10.14) holds.
In particular, $u, v$ are linearly independent if and only if their Wronskian is nonzero.

Proof. (a) $\Rightarrow$ (b) and (c) $\Rightarrow$ (a) are obvious. (b) $\Rightarrow$ (c) is proved by induction, multiplying by $A(a_{n+1}, b_{n+1}; z)$ or by $A(a_n, b_n; z)^{-1}$ from the left to increase or decrease the index by 1. In particular, $u, v$ are linearly independent if and only if (10.14) has no nontrivial solutions, i.e., if and only if the Wronskian is nonzero.

Of particular importance are certain eigensolutions defined by their behavior at $+\infty$:

Definition 10.19. A Weyl solution for $J$ at $z$ is a nontrivial sequence $\psi = (\psi_n)_{n=0}^\infty$ which obeys (10.11) for all $n \in \mathbb{N}$ and $\psi$ is in the domain of $J$.

In the limit point case, this can be restated as: a Weyl solution at $z$ is an eigensolution at $z$ such that $\sum_{n=0}^\infty |\psi_n|^2 < \infty$. Even though $\psi$ is in the domain of $J$ and solves (10.11), in general $(\psi_n)_{n=1}^\infty$ is not really an eigenvector of the operator $J$, and $z$ is not an eigenvalue, unless $\psi_0 = 0$. The Weyl solution should be thought of as a solution which obeys the boundary condition at $+\infty$ but may not obey the boundary condition $\psi_0 = 0$.

Proposition 10.20. For any $z \in \mathbb{C} \setminus \sigma_{\mathrm{ess}}(J)$, the set of Weyl solutions, with the trivial solution added, is one dimensional. Moreover:
(a) if $z \in \sigma_d(J)$, then for any Weyl solution $\psi$, $\psi_0 = 0$;
(b) if $z \in \mathbb{C} \setminus \sigma(J)$, then for any Weyl solution $\psi$, $\psi_0 \ne 0$ and

\[ m(z) = -\frac{\psi_1}{a_0\psi_0}. \tag{10.15} \]

Proof. If $\psi, \tilde\psi$ are both Weyl solutions at $z$, they decay as $n \to \infty$, and so does their Wronskian. Since the Wronskian is independent of $n$, it must be zero, so $\psi, \tilde\psi$ are linearly dependent. Thus, there cannot be two linearly independent Weyl solutions.

If $z \in \sigma_d(J)$, then there is an eigenvector of $J$, i.e., a nontrivial $\psi$ in the domain of $J$ such that $J\psi = z\psi$. This sequence obeys (10.11) for $n \ge 2$ and

\[ b_1\psi_1 + a_1\psi_2 = (J\psi)_1 = z\psi_1, \]

so $\psi$ can be extended to a Weyl solution at $z$ by setting $\psi_0 = 0$.

If $z \in \mathbb{C} \setminus \sigma(J)$, we consider $(\psi_n)_{n=1}^\infty = \psi = (J - z)^{-1}\delta_1$ in the domain of $J$. Since $(J - z)\psi = \delta_1$, this sequence obeys (10.11) for $n \ge 2$ and

\[ b_1\psi_1 + a_1\psi_2 = z\psi_1 + 1. \]

The sequence $\psi$ can therefore be extended to a Weyl solution by setting $\psi_0$ so that $a_0\psi_0 = -1$. Thus, a Weyl solution exists, and

\[ -\frac{\psi_1}{a_0\psi_0} = \psi_1 = \langle \delta_1, \psi \rangle = \langle \delta_1, (J - z)^{-1}\delta_1 \rangle = m(z). \]

We now define a procedure called coefficient stripping. Starting from the Jacobi matrix $J$ with parameters $(a_n, b_n)_{n=1}^\infty$, consider the Jacobi matrix $J_1$ with parameters $(a_n, b_n)_{n=2}^\infty$. In terms of the shift operator $S$ and its adjoint $S^*$ from (4.2) and (4.3), the coefficient stripped Jacobi matrix can be written as $J_1 = SJS^*$. In the unbounded case, let us clarify that $D(J_1) = SD(J)$; in particular, coefficient stripping also inherits a boundary condition at $+\infty$, if any.

Proposition 10.21. If $m_1$ is the m-function corresponding to $J_1$, then for all $z \in \mathbb{C}_+$,

\[ m(z) = \frac{1}{b_1 - z - a_1^2 m_1(z)}. \tag{10.16} \]

Proof. If $\psi$ is a Weyl solution for $J$, then $S\psi$ is a Weyl solution for $J_1$, so $m(z) = -\psi_1/(a_0\psi_0)$ and $m_1(z) = -\psi_2/(a_1\psi_1)$. By a direct calculation,

\[ m(z) = -\frac{\psi_1}{a_0\psi_0} = \frac{\psi_1}{(b_1 - z)\psi_1 + a_1\psi_2} = \frac{1}{b_1 - z + a_1^2 \dfrac{\psi_2}{a_1\psi_1}}, \]

which implies (10.16).
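Formula (10.16) can be iterated: stripping $n$ coefficients expresses $m(z)$ through the m-function of the $n$ times stripped matrix. The sketch below applies the formula backwards from a starting value `m_tail`, which is an arbitrary point of the upper half-plane chosen for this illustration rather than a true m-function. For constant coefficients $a_n = 1$, $b_n = 0$ the iteration is expected to settle at a fixed point of $m = 1/(-z - m)$ with positive imaginary part; at $z = 3i$ that fixed point is $i(\sqrt{13} - 3)/2$.

```python
from math import sqrt

def m_from_stripping(a, b, z, m_tail):
    """Apply (10.16) backwards, starting from the value m_tail assigned to the
    m-function of the matrix with the first len(a) coefficient pairs stripped."""
    m = m_tail
    for a_n, b_n in zip(reversed(a), reversed(b)):
        m = 1.0 / (b_n - z - a_n ** 2 * m)
    return m

z = 3j
n = 40
m = m_from_stripping([1.0] * n, [0.0] * n, z, 1j)

# Fixed point of m = 1/(-z - m) in the upper half-plane, for z = 3i.
expected = 1j * (sqrt(13) - 3) / 2
residual = abs(m - 1.0 / (-z - m))
```

Each step maps the upper half-plane into itself, which is why the choice of `m_tail` has little influence after many steps in this example.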

Example 10.22. The free half-line Jacobi matrix is defined by $a_n = 1$ and $b_n = 0$ for all $n \in \mathbb{N}$. It corresponds to the m-function

\[ m(z) = \frac{-z + \sqrt{z^2 - 4}}{2} \]

with the branch of $\sqrt{z^2 - 4}$ on $\mathbb{C} \setminus [-2, 2]$ which takes positive values on $(2, \infty)$, and to the spectral measure

\[ d\mu(x) = \frac{1}{2\pi}\chi_{(-2,2)}(x)\sqrt{4 - x^2}\, dx. \tag{10.17} \]

In particular, $\sigma(J) = [-2, 2]$ and $J$ has purely absolutely continuous spectrum.

Proof. Since in this case $J = J_1$, coefficient stripping gives

\[ m(z) = \frac{1}{-z - m(z)}. \]

This turns into a quadratic equation for $m(z)$, which has two solutions corresponding to the two branches of square root. Only the branch with positive values on $(2, \infty)$ corresponds to a Herglotz function. Since $\operatorname{Im} m(z)$ extends continuously to $\mathbb{R}$ with values $\frac{1}{2}\chi_{[-2,2]}(x)\sqrt{4 - x^2}$, (10.17) follows from Proposition 7.43.

Exercise 10.4 considers a related example, and Exercise 10.7 indicates the analogue of Weyl solutions for finite Jacobi matrices.

Corollary 10.23. $\sigma_{\mathrm{ess}}(J) = \sigma_{\mathrm{ess}}(J_1)$. On any interval $I \subset \mathbb{R} \setminus \sigma_{\mathrm{ess}}(J)$, the sets $\sigma_d(J)$ and $\sigma_d(J_1)$ strictly interlace.


Proof. Since $J$ is unitarily equivalent to the operator $T_{x,d\mu(x)}$ of multiplication by $x$ on $L^2(\mathbb{R}, d\mu)$, it follows that $\sigma_{\mathrm{ess}}(J) = \operatorname{ess\,supp}\mu$. The complement of this set is the largest domain on which $m(z)$ has a meromorphic extension which obeys $m(\bar z) = \overline{m(z)}$. By (10.16), the functions $m(z)$ and $m_1(z)$ have meromorphic extensions to the same regions, which implies $\sigma_{\mathrm{ess}}(J) = \sigma_{\mathrm{ess}}(J_1)$.

By Proposition 7.57, since $m_1$ is a meromorphic Herglotz function on $\mathbb{C} \setminus \sigma_{\mathrm{ess}}(J)$, the zeros and poles of $m_1(z)$ strictly interlace on $I$. By (10.16), the poles of $m_1(z)$ are precisely the zeros of $m(z)$.

We will combine the exact calculation for the free Jacobi matrix with the result about compact perturbations:

Corollary 10.24. If $J$ is a Jacobi matrix with $a_n \to 1$ and $b_n \to 0$ as $n \to \infty$, then $\sigma_{\mathrm{ess}}(J) = [-2, 2]$.

Proof. Denote by $J_0$ the free Jacobi matrix, which has spectrum $[-2, 2]$; we will show that $J - J_0$ is compact. Denote by $P_n$ the orthogonal projection to $\operatorname{span}\{\delta_1, \dots, \delta_n\}$. Then the operator $P_n(J - J_0)P_n$ is finite rank and, as in the proof of Lemma 10.1,

\[ \|(J - J_0) - P_n(J - J_0)P_n\| \le 2\sup_{k \ge n}|a_k - 1| + \sup_{k > n}|b_k|. \]

This converges to 0 as $n \to \infty$, so $J - J_0$ is the norm limit of finite rank operators. Thus, $J - J_0$ is compact, so $\sigma_{\mathrm{ess}}(J) = \sigma_{\mathrm{ess}}(J_0) = [-2, 2]$.

10.4. Transfer matrices and Weyl disks

Of course, the first-order matrix recursion (10.12) can be iterated: we define an n-step transfer matrix by

\[ T_n(z) = A(a_n, b_n; z) \cdots A(a_1, b_1; z), \tag{10.18} \]

so that for any eigensolution $v$ at $z$,

\[ \begin{pmatrix} v_{n+1} \\ a_n v_n \end{pmatrix} = T_n(z) \begin{pmatrix} v_1 \\ a_0 v_0 \end{pmatrix}. \]

By comparing (10.11) with (10.9), we note that the sequence $v_n = p_{n-1}(z)$ is the eigensolution with $a_0 v_0 = 0$, $v_1 = 1$, and therefore

\[ \begin{pmatrix} p_n(z) \\ a_n p_{n-1}(z) \end{pmatrix} = T_n(z) \begin{pmatrix} 1 \\ 0 \end{pmatrix}. \tag{10.19} \]

Lemma 10.25. The polynomials $p_n$ and $p_{n-1}$ have no common zeros.


Proof. From $\det A(a, b; z) = 1$ we conclude $\det T_n(z) = 1$. In particular, $T_n(z)$ is invertible, so (10.19) implies

\[ \begin{pmatrix} p_n(z) \\ a_n p_{n-1}(z) \end{pmatrix} \ne \begin{pmatrix} 0 \\ 0 \end{pmatrix}. \]

Of course, (10.19) means that the right-hand side is the first column of $T_n(z)$. Considering the second column leads us to introduce the second kind polynomials $q_n(z)$ as the solution of the recursion

\[ z q_n(z) = a_n q_{n-1}(z) + b_{n+1} q_n(z) + a_{n+1} q_{n+1}(z) \]

with $a_0 q_{-1} = -1$, $q_0 = 0$. By induction, for $n \in \mathbb{N}$, $q_n$ is a polynomial in $z$ of degree $n - 1$. Since $v_n = q_{n-1}(z)$ is the eigensolution with $a_0 v_0 = -1$, $v_1 = 0$,

\[ \begin{pmatrix} q_n(z) \\ a_n q_{n-1}(z) \end{pmatrix} = T_n(z) \begin{pmatrix} 0 \\ -1 \end{pmatrix}, \]

so we can finally conclude

\[ T_n = \begin{pmatrix} p_n & -q_n \\ a_n p_{n-1} & -a_n q_{n-1} \end{pmatrix}. \tag{10.20} \]

It is sometimes useful to note that since $\det T_n = 1$, the inverse is

\[ T_n^{-1} = \begin{pmatrix} -a_n q_{n-1} & q_n \\ -a_n p_{n-1} & p_n \end{pmatrix}. \tag{10.21} \]
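The identification of the entries of $T_n$ with first and second kind polynomials can be verified numerically. The following sketch is an illustration only: the coefficients and the point $z$ are arbitrary, and the helper names are hypothetical. It builds $T_n(z)$ as the ordered product (10.18) and compares it with the matrix assembled from $p_n, p_{n-1}, q_n, q_{n-1}$ computed by their three-term recursions.

```python
def transfer(a, b, z):
    """1-step transfer matrix A(a, b; z), stored as a pair of rows."""
    return (((z - b) / a, -1.0 / a), (a, 0.0))

def mat_mul(X, Y):
    """Product of two 2x2 matrices stored as pairs of rows."""
    return tuple(tuple(sum(X[i][k] * Y[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

def first_and_second_kind(a, b, z, n):
    """Return lists [p_0..p_n] and [q_0..q_n] evaluated at z, where
    a[k] = a_{k+1} and b[k] = b_{k+1} (requires 1 <= n <= len(a))."""
    p = [1.0, (z - b[0]) / a[0]]   # p_0 = 1, p_1 from (10.10)
    q = [0.0, 1.0 / a[0]]          # q_0 = 0, q_1 from a_0 q_{-1} = -1
    for k in range(1, n):
        p.append(((z - b[k]) * p[k] - a[k - 1] * p[k - 1]) / a[k])
        q.append(((z - b[k]) * q[k] - a[k - 1] * q[k - 1]) / a[k])
    return p, q

z = 0.7 + 0.4j
a = [1.0, 2.0, 0.5, 3.0]     # a_1, ..., a_4
b = [0.3, -1.0, 2.0, 0.0]    # b_1, ..., b_4
n = 4

T = ((1.0, 0.0), (0.0, 1.0))
for k in range(n):
    T = mat_mul(transfer(a[k], b[k], z), T)   # later factors multiply from the left

p, q = first_and_second_kind(a, b, z, n)
expected = ((p[n], -q[n]), (a[n - 1] * p[n - 1], -a[n - 1] * q[n - 1]))
max_error = max(abs(T[i][j] - expected[i][j]) for i in range(2) for j in range(2))
det_T = T[0][0] * T[1][1] - T[0][1] * T[1][0]
```

Here `max_error` is at the level of rounding error and `det_T` is 1, in agreement with (10.20) and $\det T_n = 1$.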

In terms of the projective relation (7.4) on $\mathbb{C}^2$, the formula (10.15) can be written as

\[ \begin{pmatrix} -m(z) \\ 1 \end{pmatrix} \simeq \begin{pmatrix} \psi_1 \\ a_0\psi_0 \end{pmatrix}. \]

In terms of Möbius transformations, the coefficient stripping formula (10.16) can be rewritten as

\[ A(a_1, b_1; z) \begin{pmatrix} -m(z) \\ 1 \end{pmatrix} \simeq \begin{pmatrix} -m_1(z) \\ 1 \end{pmatrix}. \]

These identities suggest that it would be better to conjugate by the matrix corresponding to the Möbius transformation $w \mapsto -w$ and consider transfer matrices

\[ \hat A(a, b; z) = \begin{pmatrix} \dfrac{z - b}{a} & \dfrac{1}{a} \\ -a & 0 \end{pmatrix} = jA(a, b; z)j, \qquad j = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}. \]

These would encode the Jacobi recursion for an arbitrary eigensolution at $z$ by the formula

\[ \begin{pmatrix} -v_{n+1} \\ a_n v_n \end{pmatrix} = \hat A(a_n, b_n; z) \begin{pmatrix} -v_n \\ a_{n-1}v_{n-1} \end{pmatrix}, \tag{10.22} \]

and their Möbius transformations would precisely correspond to action by coefficient stripping,

\[ \hat A(a_1, b_1; z) \begin{pmatrix} m(z) \\ 1 \end{pmatrix} \simeq \begin{pmatrix} m_1(z) \\ 1 \end{pmatrix}. \]

Unfortunately, this is not the standard convention; we will only use it in this section because it fits the Weyl disk formalism. Likewise, we define

\[ \hat T_n(z) = \hat A(a_n, b_n; z) \cdots \hat A(a_1, b_1; z) = jT_n(z)j. \]

These transfer matrices have the $\mathcal{J}$-contracting property (see Definition 7.9):

Lemma 10.26. Fix $z \in \mathbb{C}_+$.
(a) For any $a > 0$, $b \in \mathbb{R}$, the matrix $\hat A(a, b; z)$ is $\mathcal{J}$-contracting.
(b) For any $n$,

\[ \mathcal{J} - \hat T_n(z)^* \mathcal{J} \hat T_n(z) = 2\operatorname{Im} z \sum_{k=0}^{n-1} \hat T_k(z)^* \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \hat T_k(z). \tag{10.23} \]

Proof. (a) follows from the calculation

\[ \mathcal{J} - \hat A(a, b; z)^* \mathcal{J} \hat A(a, b; z) = \begin{pmatrix} 2\operatorname{Im} z & 0 \\ 0 & 0 \end{pmatrix}. \tag{10.24} \]

Applying this to $a = a_{k+1}$, $b = b_{k+1}$, multiplying from the right by $\hat T_k(z)$ and from the left by $\hat T_k(z)^*$, and summing in $k$ proves (b).

The identity (10.23) shows that the n-step transfer matrix (10.18) is also $\mathcal{J}$-contracting, since the right-hand side is positive (compare Exercise 7.1). Moreover, it provides a $\mathcal{J}$-monotonicity property which will lead to a nesting property below.

Definition 10.27. For $z \in \mathbb{C}_+$ and $n \in \mathbb{N} \cup \{0\}$, Weyl disks are defined by

\[ D_n(z) = \Bigl\{ w \in \hat{\mathbb{C}} \Bigm| \begin{pmatrix} w \\ 1 \end{pmatrix}^{\!*} \hat T_n(z)^* \mathcal{J} \hat T_n(z) \begin{pmatrix} w \\ 1 \end{pmatrix} \ge 0 \Bigr\}. \]

Projectively, recall that

\[ \begin{pmatrix} w \\ 1 \end{pmatrix}^{\!*} \mathcal{J} \begin{pmatrix} w \\ 1 \end{pmatrix} \ge 0 \iff w \in \overline{\mathbb{C}_+}, \]

so $w \in D_n(z)$ if and only if $\hat T_n(z)\binom{w}{1}$ corresponds to a point in $\overline{\mathbb{C}_+}$. Thus, $D_n(z)$ is the inverse image of $\overline{\mathbb{C}_+}$ in the Möbius transformation corresponding to $\hat T_n(z)$. This proves:

Lemma 10.28. For any nontrivial eigensolution $v$ of $J$ at $z$,

\[ -\frac{v_1}{a_0 v_0} \in D_n(z) \iff -\frac{v_{n+1}}{a_n v_n} \in \overline{\mathbb{C}_+}. \tag{10.25} \]


It is more customary in the literature to talk about the Weyl circles $\partial D_n(z)$ rather than the disks themselves; the circles are characterized by

\[ -\frac{v_1}{a_0 v_0} \in \partial D_n(z) \iff -\frac{v_{n+1}}{a_n v_n} \in \mathbb{R} \cup \{\infty\}. \tag{10.26} \]

Lemma 10.29. For every $n \in \mathbb{N}$ and $z \in \mathbb{C}_+$, the Weyl disk $D_n(z)$ is a disk in $\mathbb{C}_+$ and the disks are nested in the sense that for all $n \in \mathbb{N}$,

\[ D_n(z) \subset D_{n-1}(z). \tag{10.27} \]

Proof. From (10.23), we have the identity

\[ \hat T_{n-1}(z)^* \mathcal{J} \hat T_{n-1}(z) - \hat T_n(z)^* \mathcal{J} \hat T_n(z) = 2\operatorname{Im} z\, \hat T_{n-1}(z)^* \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \hat T_{n-1}(z). \]

This implies the inequality

\[ \hat T_{n-1}(z)^* \mathcal{J} \hat T_{n-1}(z) \ge \hat T_n(z)^* \mathcal{J} \hat T_n(z) \]

from which the nesting property is immediate.

Since $D_n(z)$ is the image of a half-plane (generalized disk in $\hat{\mathbb{C}}$) under a Möbius transformation, to prove that it is a disk, it suffices to prove $\infty \notin D_n(z)$. Since $-\frac{v_1}{a_0 v_0} = \infty$ corresponds to the solution with $a_0 v_0 = 0$, $v_1 = 1$, we compute

\[ -\frac{v_2}{a_1 v_1} = -\frac{z - b_1}{a_1^2} \notin \overline{\mathbb{C}_+}. \]

This implies $\infty \notin D_1(z)$, so $\infty \notin D_n(z)$ for all $n$ by the nesting property.

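Since the closed disks $D_n(z)$ are nested, their intersection $\bigcap_{n\in\mathbb{N}} D_n(z)$ is a point or a closed disk in $\mathbb{C}_+$. This dichotomy corresponds to the limit point case in the sense of Definition 10.13:

Proposition 10.30. If $J$ is in the limit point case, for any $z \in \mathbb{C}_+$,

\[ \bigcap_{n\in\mathbb{N}} D_n(z) = \{m(z)\}. \]

Proof. Let $w \in \bigcap_{n\in\mathbb{N}} D_n(z)$ and let $(v_n)_{n=0}^\infty$ be the eigensolution at $z$ obeying $a_0 v_0 = 1$, $v_1 = -w$. Then

\[ \hat T_n(z) \begin{pmatrix} w \\ 1 \end{pmatrix} = \begin{pmatrix} -v_{n+1} \\ a_n v_n \end{pmatrix}, \]

so multiplying (10.23) on the left by $\binom{w}{1}^*$ and on the right by $\binom{w}{1}$, we obtain

\[ \begin{pmatrix} w \\ 1 \end{pmatrix}^{\!*} \mathcal{J} \begin{pmatrix} w \\ 1 \end{pmatrix} - \begin{pmatrix} -v_{n+1} \\ a_n v_n \end{pmatrix}^{\!*} \mathcal{J} \begin{pmatrix} -v_{n+1} \\ a_n v_n \end{pmatrix} = 2\operatorname{Im} z \sum_{k=0}^{n-1} |v_{k+1}|^2. \]

The condition $w \in D_n(z)$ rewrites to

\[ \begin{pmatrix} -v_{n+1} \\ a_n v_n \end{pmatrix}^{\!*} \mathcal{J} \begin{pmatrix} -v_{n+1} \\ a_n v_n \end{pmatrix} \ge 0, \]

so we obtain the inequality

\[ 2\operatorname{Im} z \sum_{k=0}^{n-1} |v_{k+1}|^2 \le \begin{pmatrix} w \\ 1 \end{pmatrix}^{\!*} \mathcal{J} \begin{pmatrix} w \\ 1 \end{pmatrix} = 2\operatorname{Im} w. \]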

The upper bound is independent of $n$, so $v \in \ell^2(\mathbb{N})$. Since $J$ is limit point, this shows that $v$ is a Weyl solution for $J$. It follows that $w = m(z)$, which concludes the proof.

The previous proof shows that for the Weyl solution normalized by $a_0\psi_0 = 1$, we have

\[ 2\operatorname{Im} z \sum_{k=1}^n |\psi_k|^2 = 2\operatorname{Im} m(z) + i\,W_n(\bar\psi, \psi), \]

and the limit point condition guarantees $W_n(\bar\psi, \psi) \to 0$ as $n \to \infty$; thus,

\[ \sum_{k=1}^\infty |\psi_k|^2 = \frac{\operatorname{Im} m(z)}{\operatorname{Im} z}. \]

Exercise 10.9 proves further properties of Weyl solutions and Exercises 10.10 and 10.11 explore some further properties of Weyl disks.

The connection between Weyl disks and the Herglotz function provides a valuable tool for deriving certain approximants of $m(z)$.

Proposition 10.31. If $J$ is in the limit point case, uniformly on compact subsets of $\mathbb{C}_+$,

\[ m(z) = \lim_{n\to\infty} -\frac{q_n(z)}{p_n(z)}. \tag{10.28} \]

Proof. From (10.21), for every $n \in \mathbb{N}$,

\[ \begin{pmatrix} -q_n(z) \\ p_n(z) \end{pmatrix} \simeq \hat T_n(z)^{-1} \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \]

so $0 \in \mathbb{R} \cup \{\infty\}$ implies $-q_n(z)/p_n(z) \in \partial D_n(z)$. In particular, since $D_n(z) \subset \mathbb{C}_+$, we observe that $-q_n(z)/p_n(z)$ is a Herglotz function. By Proposition 10.30, the diameters of $D_n(z)$ shrink to 0 as $n \to \infty$, and the limit (10.28) holds pointwise for every $z \in \mathbb{C}_+$. Since the functions are Herglotz, by Proposition 7.28, pointwise convergence implies uniform convergence on compact subsets of $\mathbb{C}_+$.
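To see the approximants (10.28) at work, the following sketch evaluates $-q_n(z)/p_n(z)$ for the free half-line Jacobi matrix ($a_n = 1$, $b_n = 0$) at the sample point $z = 3i$, where the m-function of Example 10.22 equals $i(\sqrt{13} - 3)/2$. The function name and the choice of $z$ are assumptions made for this illustration.

```python
from math import sqrt

def approximant(a, b, z, n):
    """Return -q_n(z)/p_n(z), where a[k] = a_{k+1}, b[k] = b_{k+1}."""
    ap_prev, p_cur = 0.0, 1.0     # a_0 p_{-1} = 0,  p_0 = 1
    aq_prev, q_cur = -1.0, 0.0    # a_0 q_{-1} = -1, q_0 = 0
    for k in range(n):
        a_next, b_next = a[k], b[k]
        # Three-term recursions for first and second kind polynomials.
        p_next = ((z - b_next) * p_cur - ap_prev) / a_next
        q_next = ((z - b_next) * q_cur - aq_prev) / a_next
        ap_prev, p_cur = a_next * p_cur, p_next
        aq_prev, q_cur = a_next * q_cur, q_next
    return -q_cur / p_cur

z = 3j
N = 30
a = [1.0] * N
b = [0.0] * N

approximations = [approximant(a, b, z, n) for n in (1, 2, 3, N)]
m_free = 1j * (sqrt(13) - 3) / 2   # value of (-z + sqrt(z^2 - 4))/2 at z = 3i
```

The first three approximants are $i/3$, $3i/10$ and $10i/33$, already close to `m_free`, which is approximately $0.3028\,i$.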


This technique is very robust: the main argument was that $0 \in \overline{\mathbb{C}_+}$, so $-q_n(z)/p_n(z) \in D_n(z)$. Other values lead to other approximants, some of which correspond to explicitly computable measures on $\mathbb{R}$. In such cases, the approximations of the m-function lead to approximations of the spectral measure. We describe one such application, known as Carmona's formula. An important feature of Carmona's formula is that it allows the study of $\mu$ through the behavior of $p_n(x)$ as $n \to \infty$ for real values of $x$.

Theorem 10.32 (Carmona). If $J$ is in the limit point case, for every $h \in C_c(\mathbb{R})$,

\[ \int h\, d\mu = \lim_{n\to\infty} \int h(x) \frac{1}{\pi\bigl(p_n^2(x) + a_n^2 p_{n-1}^2(x)\bigr)}\, dx. \tag{10.29} \]

Proof. Define $m^{(n)}(z)$ by

\[ \hat T_n(z) \begin{pmatrix} m^{(n)}(z) \\ 1 \end{pmatrix} \simeq \begin{pmatrix} i \\ 1 \end{pmatrix}. \]

Since $i \in \mathbb{C}_+$, $m^{(n)}(z) \in D_n(z)$ for all $z \in \mathbb{C}_+$, so as $n \to \infty$, $m^{(n)}(z)$ converge pointwise to $m(z)$. Since $m(z)$ has no point mass at infinity, by Proposition 7.28, the measures $\mu^{(n)}$ corresponding to $m^{(n)}(z)$ converge to $\mu$ in the sense that $\int h\, d\mu^{(n)} \to \int h\, d\mu$ for all $h \in C_c(\mathbb{R})$.

It remains to compute the measures $\mu^{(n)}$. In terms of the entries of the transfer matrix, the property $\det \hat T_n = 1$ lets us easily compute $\hat T_n^{-1}$ and explicitly write the Möbius transformation

\[ m^{(n)}(z) = -\frac{q_n(z) + a_n q_{n-1}(z)\, i}{p_n(z) + a_n p_{n-1}(z)\, i}. \]

Since $p_n, p_{n-1}$ have real coefficients and have no common zeros, $m^{(n)}(z)$ extends continuously to $\mathbb{R}$; the imaginary part of the boundary value is

\[ \operatorname{Im} m^{(n)}(x) = -\frac{a_n p_n(x) q_{n-1}(x) - a_n p_{n-1}(x) q_n(x)}{p_n(x)^2 + a_n^2 p_{n-1}(x)^2}. \]

Using again $\det T_n = 1$ gives

\[ \operatorname{Im} m^{(n)}(x) = \frac{1}{p_n^2(x) + a_n^2 p_{n-1}^2(x)}, \]

and therefore

\[ d\mu^{(n)}(x) = \frac{1}{\pi\bigl(p_n^2(x) + a_n^2 p_{n-1}^2(x)\bigr)}\, dx. \]

As already stated, by Proposition 7.28, this completes the proof.
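Carmona's formula (10.29) lends itself to a quick numerical experiment. The sketch below is a rough illustration for the free half-line Jacobi matrix ($a_n = 1$, $b_n = 0$), whose spectral measure is given by (10.17). The test function `hat`, the value $n = 30$, and the midpoint quadrature are choices made for this example and are not part of the text.

```python
from math import pi, sqrt

def p_pair(x, n):
    """Return (p_{n-1}(x), p_n(x)) for the free Jacobi matrix a_k = 1, b_k = 0."""
    p_prev, p_cur = 0.0, 1.0                # p_{-1} = 0, p_0 = 1
    for _ in range(n):
        p_prev, p_cur = p_cur, x * p_cur - p_prev   # p_{k+1} = x p_k - p_{k-1}
    return p_prev, p_cur

def hat(x):
    """Continuous, compactly supported test function on [-1, 1]."""
    return max(0.0, 1.0 - abs(x))

def integrate(f, lo, hi, steps=4000):
    """Composite midpoint rule (adequate for this smooth, bounded integrand)."""
    width = (hi - lo) / steps
    return width * sum(f(lo + (j + 0.5) * width) for j in range(steps))

n = 30

def carmona_density(x):
    """Density 1 / (pi (p_n^2 + a_n^2 p_{n-1}^2)) with a_n = 1."""
    p_prev, p_cur = p_pair(x, n)
    return 1.0 / (pi * (p_cur ** 2 + p_prev ** 2))

approx = integrate(lambda x: hat(x) * carmona_density(x), -1.0, 1.0)
exact = integrate(lambda x: hat(x) * sqrt(4.0 - x * x) / (2.0 * pi), -1.0, 1.0)
```

The approximating density oscillates rapidly around the semicircle density, so pointwise agreement should not be expected in general; the agreement is in the integrals against continuous test functions. At $x = 0$ the two densities happen to coincide, both being $1/\pi$.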

A common variation also known as Carmona's formula is

\[ \int h\, d\mu = \lim_{n\to\infty} \int h(x) \frac{1}{\pi\bigl(a_n^2 p_n^2(x) + p_{n-1}^2(x)\bigr)}\, dx \tag{10.30} \]

(note the different placement of $a_n$ compared to (10.29)); this can be proved by a similar proof (Exercise 10.13).

For some purposes, it is better to use more specialized approximations. For decaying perturbations of the free Jacobi matrix (i.e., Jacobi matrices $J$ with $a_n \to 1$, $b_n \to 0$), we know that $\sigma_{\mathrm{ess}}(J) = [-2, 2]$, so particular attention is focused on determining the spectral type on $[-2, 2]$. In such cases, the denominator in Exercise 10.14 often has better behavior than the one in Carmona's formula, for $x \in (-2, 2)$.

10.5. Full-line Jacobi matrices

We now turn our attention to full line or two-sided Jacobi matrices, which are operators on $\ell^2(\mathbb{Z})$ with a matrix representation

\[ J = \begin{pmatrix} \ddots & \ddots & & & & \\ \ddots & b_{-1} & a_{-1} & & & \\ & a_{-1} & b_0 & a_0 & & \\ & & a_0 & b_1 & a_1 & \\ & & & a_1 & b_2 & a_2 \\ & & & & a_2 & \ddots \end{pmatrix}. \tag{10.31} \]

The sequences of coefficients $a_n \in (0, \infty)$, $b_n \in \mathbb{R}$ are now indexed by $n \in \mathbb{Z}$. If $(a_n)_{n=-\infty}^{+\infty}, (b_n)_{n=-\infty}^{+\infty} \in \ell^\infty(\mathbb{Z})$, then $J$ is a bounded self-adjoint operator defined precisely by

\[ (Ju)_n = a_{n-1}u_{n-1} + b_n u_n + a_n u_{n+1} \]

for any $u \in \ell^2(\mathbb{Z})$, by the same proof as in Lemma 10.1.

If at least one of the sequences is not bounded, $J$ will also be an unbounded operator. To motivate the choice of domain, let us denote by $P_+$ and $P_-$ orthogonal projections to $\operatorname{span}\{\delta_n \mid n > 0\}$ and $\operatorname{span}\{\delta_n \mid n \le 0\}$ and formally decompose with respect to subspaces $\operatorname{Ran} P_\pm$ as follows,

\[ J = \left( \begin{array}{ccc|ccc} \ddots & \ddots & & & & \\ \ddots & b_{-1} & a_{-1} & & & \\ & a_{-1} & b_0 & a_0 & & \\ \hline & & a_0 & b_1 & a_1 & \\ & & & a_1 & b_2 & a_2 \\ & & & & a_2 & \ddots \end{array} \right). \]

The lower right block is a half-line Jacobi matrix $J_+$ on $\operatorname{Ran} P_+$ and the upper left part is, up to a reflection of the real line, a half-line Jacobi matrix $J_-$ on $\operatorname{Ran} P_-$. The remaining $a_0$ entries are collected into a finite rank self-adjoint operator $a_0 F$ where

\[ Fu = \langle \delta_0, u \rangle \delta_1 + \langle \delta_1, u \rangle \delta_0, \]

and we write

\[ J = J_- \oplus J_+ + a_0 F. \tag{10.32} \]

This motivates the following definition:

Definition 10.33. A full-line self-adjoint Jacobi matrix $J$ with separated boundary conditions is defined in terms of two self-adjoint half-line Jacobi matrices $J_\pm$ and a coefficient $a_0 > 0$ as an operator with domain $D(J) = D(J_-) \oplus D(J_+)$, acting on elements of the domain by (10.32).

Since $F$ is bounded self-adjoint,
\[
(J_- \oplus J_+ + a_0 F)^* = (J_- \oplus J_+)^* + a_0 F = J_-^* \oplus J_+^* + a_0 F = J_- \oplus J_+ + a_0 F,
\]
so the operator $J$ defined above is indeed self-adjoint. Since finite rank perturbations do not change the essential spectrum, this decomposition immediately implies
\[
\sigma_{\mathrm{ess}}(J) = \sigma_{\mathrm{ess}}(J_+) \cup \sigma_{\mathrm{ess}}(J_-). \tag{10.33}
\]

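If $J_\pm$ are both in the limit point case, then there are no boundary conditions, so
\[
D(J) = \Big\{ u \in \ell^2(\mathbb{Z}) \;\Big|\; \sum_{n\in\mathbb{Z}} |a_{n-1} u_{n-1} + b_n u_n + a_n u_{n+1}|^2 < \infty \Big\}.
\]
Formal eigensolutions at $z$ are now sequences $u = (u_n)_{n=-\infty}^{+\infty}$ which solve
\[
a_{n-1} u_{n-1} + b_n u_n + a_n u_{n+1} = z u_n \qquad \forall n \in \mathbb{Z}.
\]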

There are now unique, up-to-normalization Weyl solutions $\psi^\pm(z)$ for each half-line $J_\pm$; each Weyl solution can be extended uniquely as an eigensolution on $\mathbb{Z}$.

Example 10.34. The free full-line Jacobi matrix $J$ is defined by $a_n = 1$ and $b_n = 0$ for all $n \in \mathbb{Z}$. For any $z \in \mathbb{C}_+$, Weyl solutions normalized by $\psi_0^\pm(z) = 1$ are given by
\[
\psi_n^\pm(z) = \left( \frac{z - \sqrt{z^2-4}}{2} \right)^{\pm n}, \qquad n \in \mathbb{Z},
\]
with the branch of $\sqrt{z^2-4}$ on $\mathbb{C} \setminus [-2,2]$ which takes positive values on $(2,\infty)$.

Proof. The $m$-function for the half-line free Jacobi matrix is
\[
m(z) = \frac{-z + \sqrt{z^2-4}}{2}.
\]
Coefficient stripping does not affect the free Jacobi matrix, so for all $n$,
\[
-\frac{\psi^+_{n+1}(z)}{\psi^+_n(z)} = -\frac{\psi^+_{n+1}(z)}{a_n \psi^+_n(z)} = m(z).
\]
By induction we find $\psi_n^+(z) = (-m(z))^n$. By a reflection we find $\psi_n^-(z)$.

Weyl solutions exist for $z \notin \sigma_{\mathrm{ess}}(J_\pm)$. Another easy observation is that, for $z \notin \sigma(J)$, the Weyl solutions $\psi^-$ and $\psi^+$ are linearly independent; otherwise, one nontrivial solution would obey the conditions at both endpoints, so it would be an eigenvector of the operator $J$. Denote the Wronskian of $\psi^\pm$ by
\[
W(z) = a_n \left[ \psi^+_{n+1}(z) \psi^-_n(z) - \psi^-_{n+1}(z) \psi^+_n(z) \right],
\]
which is independent of $n$. Denote also the Green's function
\[
G_{m,n}(z) = \frac{\psi^-_{\min(m,n)}(z)\, \psi^+_{\max(m,n)}(z)}{W(z)}.
\]

This turns out to be the integral kernel for the resolvent of $J$ (of course, the integral here is with respect to the counting measure on $\mathbb{Z}$):

Proposition 10.35. For any $u \in \ell^2(\mathbb{Z})$ and $z \in \mathbb{C} \setminus \sigma(J)$,
\[
[(J-z)^{-1} u]_m = \sum_{n\in\mathbb{Z}} G_{n,m}(z) u_n. \tag{10.34}
\]

Proof. Fix $m$ and denote $v = (G_{m,n})_{n=-\infty}^{+\infty}$. For $n \ge m$, $G_{m,n}$ is a fixed multiple of $\psi_n^+$, and for $n \le m$ it is a fixed multiple of $\psi_n^-$. In particular, $v \in D(J)$. The same observation shows
\[
((J-z)v)_n = 0 \qquad \forall n \neq m.
\]
Meanwhile,
\[
((J-z)v)_m = \frac{a_{m-1} \psi^-_{m-1}(z) \psi^+_m(z) + (b_m - z) \psi^-_m(z) \psi^+_m(z) + a_m \psi^-_m(z) \psi^+_{m+1}(z)}{W(z)},
\]
and using the eigenfunction equation for $\psi^-$ turns this into
\[
((J-z)v)_m = \frac{-a_m \psi^-_{m+1}(z) \psi^+_m(z) + a_m \psi^-_m(z) \psi^+_{m+1}(z)}{W(z)} = 1.
\]


Thus, $(J-z)v = \delta_m$, so (10.34) holds for $u = \delta_m$. By linearity, it holds for all $u \in \ell^2_c(\mathbb{Z})$. For arbitrary $u \in \ell^2(\mathbb{Z})$, apply the above to compactly supported vectors $u\chi_{[-N,N]}$. Using $u\chi_{[-N,N]} \to u$ as $N \to \infty$, the left-hand side of (10.34) converges in the $\ell^2$ sense, so it also converges pointwise. The right-hand side also converges for each $m$, so the limits are equal.
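As a numerical sanity check of Example 10.34 and Proposition 10.35, the following sketch evaluates the Green's function of the free full-line Jacobi matrix and confirms that each row solves $(J - z)v = \delta_m$ on a finite window. The function names (`sqrt_branch`, `weyl`, `green`, `residual`) and the sample point $z$ are illustrative choices, not part of the text; the branch of $\sqrt{z^2-4}$ is realized as a product of two principal square roots, which is one way (an assumption of this sketch) to obtain the branch that is positive on $(2,\infty)$.

```python
import cmath

def sqrt_branch(z):
    # sqrt(z^2 - 4), analytic on C \ [-2, 2] and positive on (2, oo),
    # written as a product of two principal square roots.
    return cmath.sqrt(z - 2) * cmath.sqrt(z + 2)

def weyl(z, n, sign):
    # Free Weyl solutions psi^{+-}_n(z) = ((z - sqrt(z^2 - 4)) / 2)^{+-n}.
    r = (z - sqrt_branch(z)) / 2
    return r ** (sign * n)

def green(z, m, n):
    # Wronskian evaluated at index 0 (a_0 = 1); it does not depend on the index.
    W = weyl(z, 1, +1) * weyl(z, 0, -1) - weyl(z, 1, -1) * weyl(z, 0, +1)
    lo, hi = min(m, n), max(m, n)
    return weyl(z, lo, -1) * weyl(z, hi, +1) / W

def residual(z, m, n):
    # ((J - z) v)_n - delta_{m,n} for v = (G_{m,k}(z))_k and the free J.
    val = green(z, m, n - 1) + green(z, m, n + 1) - z * green(z, m, n)
    return val - (1 if n == m else 0)

z = 0.7 + 0.4j
max_res = max(abs(residual(z, 3, n)) for n in range(-10, 11))
diag_error = abs(green(z, 0, 0) + 1 / sqrt_branch(z))
decay = abs(weyl(z, 5, +1))
```

Here `max_res` should be of the order of rounding error, `diag_error` compares $G_{0,0}(z)$ with the closed form $-1/\sqrt{z^2-4}$ that follows from $W(z) = -\sqrt{z^2-4}$ in the free case, and `decay` illustrates that $\psi^+$ is small at $+\infty$.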

10.6. Eigenfunction expansion for full-line Jacobi matrices

In the full-line case, the spectrum can be of multiplicity 2, as already seen in the following example. For this example, the reader should recall the unitary map $\mathcal{F} : L^2([0,2\pi], \tfrac{dk}{2\pi}) \to \ell^2(\mathbb{Z})$ and its inverse $\mathcal{F}^{-1}$ which correspond to the usual Fourier series expansion (Example 3.41).

Example 10.36. The free full-line Jacobi matrix is defined on $\ell^2(\mathbb{Z})$ by
\[
(Ju)_n = u_{n-1} + u_{n+1}.
\]
Then $\mathcal{F}^{-1} J \mathcal{F}$ is the operator of multiplication by $2\cos k$ on $L^2([0,2\pi], \tfrac{dk}{2\pi})$. In particular, $J$ has a purely absolutely continuous spectrum of multiplicity 2.

Proof. Since $\mathcal{F}^{-1} J \mathcal{F}$ and multiplication by $2\cos k$ are bounded operators on $L^2([0,2\pi], \tfrac{dk}{2\pi})$, it suffices to prove that they agree on the dense set of compactly supported sequences. By linearity, it suffices to prove that they agree on the vectors $\delta_n$. For this, use $J\delta_n = \delta_{n-1} + \delta_{n+1}$ to compute
\[
\mathcal{F}^{-1} J \mathcal{F} e^{ink} = e^{i(n-1)k} + e^{i(n+1)k} = 2\cos k \, e^{ink}.
\]
The remaining claims follow from Example 9.40.

Thus, there is no hope in general for a cyclic vector. However, we will see that every $\delta_n$ can be obtained from $\delta_0$ and $\delta_1$ using polynomials of $J$, and this will be a starting point in the canonical construction of a unitary map that diagonalizes the full-line Jacobi matrix. This canonical unitary map will naturally be in terms of a matrix-valued measure, which leads us to use matrix-valued analogues of multiplication operators and Herglotz functions.

Let us consider eigenfunctions $u^0, u^1$ at $x$, which solve
\[
a_{n-1} u_{n-1} + b_n u_n + a_n u_{n+1} = x u_n \qquad \forall n \in \mathbb{Z}, \tag{10.35}
\]
and the initial conditions
\[
u^1_0(x) = 0, \qquad u^1_1(x) = 1, \qquad u^0_0(x) = 1, \qquad u^0_1(x) = 0.
\]
For any $n \in \mathbb{Z}$, $u^0_n(x)$ and $u^1_n(x)$ are polynomials in $x$. The following is the analogue of the formula $\delta_n = p_{n-1}(J)\delta_1$ from the half-line case.


Lemma 10.37. For any $n \in \mathbb{Z}$,
\[
\delta_n = u^1_n(J)\delta_1 + u^0_n(J)\delta_0. \tag{10.36}
\]

Proof. For $n \in \mathbb{Z}$, let us denote $\psi_n = u^1_n(J)\delta_1 + u^0_n(J)\delta_0 \in \ell^2(\mathbb{Z})$. By the Borel functional calculus, since $u^j$ are solutions of (10.35) for $j = 0, 1$,
\[
J u^j_n(J) = a_{n-1} u^j_{n-1}(J) + b_n u^j_n(J) + a_n u^j_{n+1}(J).
\]
Applying this to $\delta_j$ and summing in $j = 0, 1$ implies
\[
J\psi_n = a_{n-1}\psi_{n-1} + b_n \psi_n + a_n \psi_{n+1}.
\]
Since the same holds for the sequence of $\delta_n$, it suffices to note that $\psi_0 = \delta_0$ and $\psi_1 = \delta_1$ and proceed by induction in $\pm n$.

This lemma can easily be used to construct a spectral basis for $J$ with at most two elements starting from $\delta_0, \delta_1$. Already from this, it could be concluded that the spectrum of $J$ can have multiplicity at most 2, but we are about to present a more precise spectral representation.

Corresponding to the Jacobi matrix $J$, we define the $2\times 2$ matrix-valued measure $\Omega = (\Omega_{i,j})_{i,j=0}^1$ by the property that for all Borel sets $B$,
\[
\langle \delta_i, \chi_B(J)\delta_j \rangle = \int \chi_B \, d\Omega_{i,j} \tag{10.37}
\]
(see discussion of matrix-valued measures and corresponding $L^2$ spaces in Section 6.4). For any Borel set $B \subset \mathbb{R}$ and any $\lambda = \binom{\lambda_0}{\lambda_1} \in \mathbb{C}^2$, if we denote $v = \sum_{j=0}^1 \lambda_j \delta_j$, then
\[
\lambda^* \Omega(B) \lambda = \sum_{i,j=0}^1 \bar\lambda_i \langle \delta_i, \chi_B(J)\delta_j\rangle \lambda_j = \langle v, \chi_B(J) v\rangle \ge 0,
\]
so $\Omega$ is a positive matrix-valued measure. In particular, it can be written as $d\Omega = W\,d\mu$, where $\mu = \operatorname{Tr}\Omega$ is a finite positive measure and $W$ is a matrix-valued function with $W \ge 0$ and $\operatorname{Tr} W = 1$ $\mu$-a.e.

We will construct a unitary map $U$ from $\ell^2(\mathbb{Z})$ to the Hilbert space $L^2(\mathbb{R}, \mathbb{C}^2, d\Omega)$ which conjugates $J$ to a multiplication operator. We begin by introducing the map pointwise on a dense subset of $\ell^2(\mathbb{Z})$: For $f \in \ell^2_c(\mathbb{Z})$, we define
\[
\hat f(\lambda) = \sum_{n\in\mathbb{Z}} f_n \begin{pmatrix} u^0_n(\lambda) \\ u^1_n(\lambda) \end{pmatrix}.
\]


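Conversely, for $g \in L^2(\mathbb{R}, \mathbb{C}^2, d\Omega)$, we define
\[
\check g_n = \int \begin{pmatrix} u^0_n \\ u^1_n \end{pmatrix}^{\!*} W g \, d\mu.
\]

Theorem 10.38 (Eigenfunction expansion for full-line Jacobi matrices). Let $J$ be a full-line self-adjoint Jacobi matrix. There is a unitary map $U : \ell^2(\mathbb{Z}) \to L^2(\mathbb{R}, \mathbb{C}^2, d\Omega)$ such that:
(a) $Uf = \hat f$ for all $f \in \ell^2_c(\mathbb{Z})$;
(b) $U^{-1} g = \check g$ for all $g \in L^2_c(\mathbb{R}, \mathbb{C}^2, d\Omega)$;
(c) $U J U^{-1} = T_{\lambda, d\Omega(\lambda)}$.

Proof. By the definition and basic properties of the matrix-valued measure $d\Omega = W\,d\mu$, for any $m, n \in \mathbb{Z}$ and any compactly supported bounded function $h : \mathbb{R} \to \mathbb{C}$,
\begin{align*}
\int h \begin{pmatrix} u^0_m \\ u^1_m \end{pmatrix}^{\!*} W \begin{pmatrix} u^0_n \\ u^1_n \end{pmatrix} d\mu
&= \sum_{i,j=0}^1 \int h\, u^i_m u^j_n W_{i,j} \, d\mu \\
&= \sum_{i,j=0}^1 \langle u^i_m(J)\delta_i, h(J) u^j_n(J)\delta_j \rangle \\
&= \Big\langle \sum_{i=0}^1 u^i_m(J)\delta_i, \; h(J) \sum_{j=0}^1 u^j_n(J)\delta_j \Big\rangle \\
&= \langle \delta_m, h(J)\delta_n \rangle.
\end{align*}
First, we apply this calculation with $m = n$ and $h = \chi_{[-k,k]}$ and use monotone convergence as $k \to \infty$ to conclude that $\hat\delta_n \in L^2(\mathbb{R}, \mathbb{C}^2, d\Omega)$ for each $n$, and then $\hat f \in L^2(\mathbb{R}, \mathbb{C}^2, d\Omega)$ for each $f \in \ell^2_c(\mathbb{Z})$. By sesquilinearity, the same calculation shows that
\[
\int h \hat f_1^{\,*} W \hat f_2 \, d\mu = \langle f_1, h(J) f_2 \rangle \qquad \forall f_1, f_2 \in \ell^2_c(\mathbb{Z}).
\]
This allows us to apply Theorem 9.48 with $A = J$ and $B = T_{\lambda, d\Omega(\lambda)}$; we conclude that the map $f \mapsto \hat f$ extends to a norm-preserving map $U$ and there exists a linear map $U^* : L^2(\mathbb{R}, \mathbb{C}^2, d\Omega) \to \ell^2(\mathbb{Z})$ with $\|U^*\| \le 1$ and $\langle U^* g, f\rangle = \langle g, U f\rangle$ for all $f, g$ in the appropriate Hilbert spaces. For $f \in \ell^2_c(\mathbb{Z})$ and $g \in L^2_c(\mathbb{R}, \mathbb{C}^2, d\Omega)$,
\[
\int g^* W \sum_{n\in\mathbb{Z}} \begin{pmatrix} u^0_n \\ u^1_n \end{pmatrix} f_n \, d\mu = \langle g, \hat f \rangle.
\]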


Formally, by placing the sum inside the integral, this looks like $\langle \check g, f\rangle$; applying this to $f = \check g \chi_{[-n,n]}$ and letting $n \to \infty$ proves by monotone convergence that $\check g \in \ell^2(\mathbb{Z})$. Now the equality $\langle \check g, f\rangle = \langle g, \hat f\rangle$ for a dense set of $f$ implies that $U^* g = \check g$ for all $g \in L^2_c(\mathbb{R}, \mathbb{C}^2, d\Omega)$.

Moreover, Theorem 9.48 says that $\operatorname{Ker} U^* = (\operatorname{Ran} U)^\perp$ is a resolvent-invariant subspace for multiplication by $\lambda$. Fix $g \in \operatorname{Ker} U^*$. Since the subspace is resolvent-invariant, for any $\lambda_1 < \lambda_2$, $\chi_{(\lambda_1,\lambda_2]} g \in \operatorname{Ker} U^*$. Thus,
\[
\int \begin{pmatrix} u^0_n \\ u^1_n \end{pmatrix}^{\!*} W \chi_{(\lambda_1,\lambda_2]} g \, d\mu = 0 \qquad \forall n \in \mathbb{Z}.
\]
Evaluating this at $n = 0$ and $n = 1$ implies that $\int W \chi_{(\lambda_1,\lambda_2]} g \, d\mu = 0$, and since $\lambda_1 < \lambda_2$ are arbitrary, $W g = 0$ $\mu$-a.e. Thus, $g^* W g = 0$ $\mu$-a.e., so $g = 0$ in $L^2(\mathbb{R}, \mathbb{C}^2, d\Omega)$. This proves $\operatorname{Ker} U^* = \{0\}$, which concludes the proof by Theorem 9.48(f).

In the case of bounded self-adjoint Jacobi matrices, we could have argued more directly that $U$ is onto by proving that its image contains every polynomial $\binom{P}{Q}$; this proof would proceed by induction in $\deg P + \deg Q$, using the degrees of $u^0_n$, $u^1_n$.

The eigenfunction expansion provides us with a canonical matrix-valued spectral measure $\Omega$ for the self-adjoint operator $J$.

Corollary 10.39. The measure $\mu = \operatorname{Tr}\Omega$ is a maximal spectral measure for $J$. The spectral multiplicity $n$ measures for $J$ (see Theorem 9.31) are given by $d\mu_n = \chi_{S_n}\,d\mu$, where $S_n = \{\lambda \mid \operatorname{rank} W(\lambda) = n\}$ (in particular, $S_n = \emptyset$ unless $n = 1$ or $2$).

The measure $\mu = \operatorname{Tr}\Omega$ is regarded as the canonical spectral measure for the full-line Jacobi matrix. Note that $\mu = \mu_{J,\delta_0} + \mu_{J,\delta_1}$, which relates to the special role $\delta_0, \delta_1$ had in our construction.

10.7. The Weyl M-matrix

In this section we continue studying the full-line setting from the previous sections. We have already observed that the essential spectrum of $J$ is the union of essential spectra of $J_\pm$. To obtain finer connections between


spectral properties of $J$ and $J_\pm$, we introduce the Weyl M-matrix. This is the Borel transform of the matrix-valued spectral measure,
\[
M(z) = \int \frac{1}{\lambda - z} \, d\Omega(\lambda).
\]
This is a matrix-valued Herglotz function. In particular, $\operatorname{Tr} M(z)$ is the Borel transform of the canonical spectral measure $\mu$. Integrating each entry, we obtain a formula in terms of the Green's function,
\[
M(z) = \begin{pmatrix} G_{0,0}(z) & G_{0,1}(z) \\ G_{1,0}(z) & G_{1,1}(z) \end{pmatrix}.
\]
The Weyl M-matrix can be expressed in terms of the half-line $m$-functions
\[
m_+(z) = \langle \delta_1, (J_+ - z)^{-1}\delta_1\rangle, \qquad m_-(z) = \langle \delta_0, (J_- - z)^{-1}\delta_0\rangle.
\]

Lemma 10.40.
\[
M = \begin{pmatrix} m_-^{-1} & a_0 \\ a_0 & m_+^{-1} \end{pmatrix}^{-1}. \tag{10.38}
\]
In particular, the diagonal Green's function elements are given by
\[
-\frac{1}{G_{1,1}} = -m_+^{-1} + a_0^2 m_-, \tag{10.39}
\]
\[
-\frac{1}{G_{0,0}} = -m_-^{-1} + a_0^2 m_+. \tag{10.40}
\]

Proof. By the second resolvent identity,
\[
(J-z)^{-1} = (J_- \oplus J_+ - z)^{-1} - a_0 (J-z)^{-1} F (J_- \oplus J_+ - z)^{-1}.
\]
Applying this to $\delta_0$ and $\delta_1$ and then taking the inner product with $\delta_0$ and $\delta_1$ gives four equalities:
\begin{align*}
G_{0,0} &= m_- - a_0 G_{0,1} m_-, & G_{0,1} &= -a_0 G_{0,0} m_+, \\
G_{1,0} &= -a_0 G_{1,1} m_-, & G_{1,1} &= m_+ - a_0 G_{1,0} m_+.
\end{align*}
Viewing the first two as a set of linear equations for $G_{0,0}$ and $G_{0,1}$, and the second two as a set of linear equations for $G_{1,0}$ and $G_{1,1}$, we can rewrite the systems in the combined form
\[
\begin{pmatrix} G_{0,0} & G_{0,1} \\ G_{1,0} & G_{1,1} \end{pmatrix}
\begin{pmatrix} m_-^{-1} & a_0 \\ a_0 & m_+^{-1} \end{pmatrix}
= \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix},
\]
which proves (10.38). The formulas for the diagonal Green's function follow easily.
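The formula (10.38) can be checked numerically in the free case, where $a_0 = 1$ and both half-line $m$-functions equal $m(z) = (-z + \sqrt{z^2-4})/2$. The sketch below inverts the $2 \times 2$ matrix by the adjugate formula and compares the entries with the free Green's function values $G_{0,0} = G_{1,1} = -1/\sqrt{z^2-4}$ and $G_{0,1} = m(z)/\sqrt{z^2-4}$, which follow from Example 10.34. The helper names and the sample point are illustrative assumptions.

```python
import cmath

def sqrt_branch(z):
    # Branch of sqrt(z^2 - 4) on C \ [-2, 2] that is positive on (2, oo).
    return cmath.sqrt(z - 2) * cmath.sqrt(z + 2)

def m_free(z):
    # Half-line free m-function m(z) = (-z + sqrt(z^2 - 4)) / 2.
    return (-z + sqrt_branch(z)) / 2

def weyl_m_matrix(m_minus, m_plus, a0):
    # Inverse of [[1/m_-, a0], [a0, 1/m_+]] via the 2x2 adjugate formula.
    p, q = 1 / m_minus, 1 / m_plus
    det = p * q - a0 * a0
    return [[q / det, -a0 / det], [-a0 / det, p / det]]

z = -1.3 + 0.25j
m = m_free(z)
s = sqrt_branch(z)
M = weyl_m_matrix(m, m, 1.0)

err_diag = abs(M[0][0] + 1 / s)       # G_{0,0} = -1/sqrt(z^2 - 4)
err_offdiag = abs(M[0][1] - m / s)    # G_{0,1} = m(z)/sqrt(z^2 - 4)
trace_imag = (M[0][0] + M[1][1]).imag # Herglotz: positive on C_+
```

Both errors should be at rounding level, and the positive imaginary part of the trace is consistent with $\operatorname{Tr} M$ being the Borel transform of a positive measure.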


This allows us to express spectral properties of $J$ in terms of Weyl functions $m_\pm$, which allows us to relate spectral properties of $J$ to the spectral properties of half-line operators $J_\pm$. For instance, it provides a second proof of (10.33). Namely, the functions $m_\pm$ are Herglotz functions with region of meromorphicity $\mathbb{C} \setminus \sigma_{\mathrm{ess}}(J_\pm)$, so the same is true for $-1/m_\pm$. By (10.39) and (10.40), $G_{0,0}$ and $G_{1,1}$ have region of meromorphicity $\mathbb{C} \setminus (\sigma_{\mathrm{ess}}(J_+) \cup \sigma_{\mathrm{ess}}(J_-))$, and so does their sum. Since the Herglotz function $G_{0,0} + G_{1,1}$ corresponds to the maximal spectral measure for $J$, (10.33) follows.

More importantly, (10.38) provides immediate consequences of results proved in an abstract setting in Section 7.12. To relate it to that setting, we write
\[
a_0 M = -\begin{pmatrix} -a_0^{-1} m_-^{-1} & -1 \\ -1 & -a_0^{-1} m_+^{-1} \end{pmatrix}^{-1}
= -\begin{pmatrix} -\tilde m_-^{-1} & -1 \\ -1 & \tilde m_+ \end{pmatrix}^{-1},
\]
where
\[
\tilde m_+ = -a_0^{-1} m_+^{-1}, \qquad \tilde m_- = a_0 m_-.
\]
This puts $a_0 M$ in the setting of Section 7.12, up to an additive constant self-adjoint matrix which does not affect the phenomena studied here.

Corollary 10.41. The absolutely continuous spectrum of $J$ is precisely the sum of absolutely continuous spectra of $J_\pm$, with multiplicities added, i.e.,
\[
J|_{\mathcal{H}_{\mathrm{ac}}(J)} \cong J_+|_{\mathcal{H}_{\mathrm{ac}}(J_+)} \oplus J_-|_{\mathcal{H}_{\mathrm{ac}}(J_-)}.
\]

Corollary 10.42. Let $\mu_s$ denote the singular part of the canonical spectral measure for the full-line Jacobi matrix $J$. For $\mu_s$-a.e. $x \in \mathbb{R}$,
(a) $\operatorname{rank} W(x) = 1$,
(b) $m_\pm$ have normal limits which are values in $\mathbb{R} \cup \{\infty\}$,
(c) there exists $\alpha = \alpha(x) \in [0,\pi)$ such that $(a_0 m_+(x+i0))^{-1} = a_0 m_-(x+i0) = \cot\alpha(x)$ and
\[
W(x) = \begin{pmatrix} \cos^2\alpha(x) & -\cos\alpha(x)\sin\alpha(x) \\ -\cos\alpha(x)\sin\alpha(x) & \sin^2\alpha(x) \end{pmatrix}.
\]
In other words, the singular part of the maximal spectral measure is supported on the set
\[
S = \bigcup_{\alpha\in[0,\pi)} \{ x \mid (a_0 m_+(x+i0))^{-1} = a_0 m_-(x+i0) = \cot\alpha \}.
\]
Moreover, $S$ has Lebesgue measure zero.


10.8. Subordinacy theory

Spectral properties of Jacobi matrices may be studied through the behavior of formal eigensolutions with real spectral parameters. The pure point spectrum of a Jacobi matrix corresponds to eigenvectors, which are simply formal eigensolutions which are square-summable. Carmona's formula allows us to recover the measure, but it is not a pointwise criterion. Subordinacy theory was discovered by Khan–Pearson [54] with important developments by Gilbert–Pearson [41], Gilbert [40], and Jitomirskaya–Last [47, 48]. It uses a pointwise characterization in terms of the behavior of eigensolutions to describe the decomposition into absolutely continuous/singular spectra and, more generally, decomposition into $\alpha$-continuous/$\alpha$-singular spectra. We begin with the half-line setting.

Definition 10.43. Fix $\lambda \in \mathbb{R}$. A nontrivial eigensolution $u$ at $\lambda$ is called subordinate (at $+\infty$) if
\[
\lim_{n\to\infty} \frac{\sum_{j=1}^n |u_j|^2}{\sum_{j=1}^n |v_j|^2} = 0 \tag{10.41}
\]
for some eigensolution $v$ at $\lambda$.

It also helps to consider a continuous interpolation of $\ell^2$-norms: for $L > 0$, define
\[
\|u\|_L^2 = \sum_{k=1}^{\lfloor L \rfloor} |u_k|^2 + (L - \lfloor L \rfloor) |u_{\lfloor L \rfloor + 1}|^2.
\]

Lemma 10.44. (a) If (10.41) holds for some eigensolution $v$, then it holds for every eigensolution $v$ linearly independent with $u$.
(b) If $u$ is subordinate, it is linearly dependent with $\bar u$. Then, $u$ is a constant multiple of a real-valued eigensolution.
(c) (10.41) holds if and only if
\[
\lim_{L\to\infty} \frac{\|u\|_L}{\|v\|_L} = 0. \tag{10.42}
\]

Proof. (a) If $v = Cu$, the limit (10.41) is $1/|C|^2$. Thus, (10.41) implies that $v$ is linearly independent with $u$. Now any eigensolution $w$ can be written as $w = C_1 u + C_2 v$ and, if $w$ is linearly independent with $u$, then $C_2 \neq 0$. By elementary estimates,
\[
|w|^2 \ge \frac{1}{2} |C_2|^2 |v|^2 - |C_1|^2 |u|^2.
\]


This implies
\[
\liminf_{n\to\infty} \frac{\sum_{k=1}^n |w_k|^2}{\sum_{k=1}^n |u_k|^2} \ge \frac{1}{2} |C_2|^2 \liminf_{n\to\infty} \frac{\sum_{k=1}^n |v_k|^2}{\sum_{k=1}^n |u_k|^2} - |C_1|^2 = \infty,
\]
and inverting completes the proof.

(b) If $v = \bar u$, then the limit in (10.41) is equal to 1. Thus, if $u$ is subordinate, $u$ must be linearly dependent with $\bar u$. Thus, vectors $(u_1, u_0)$ and $(\bar u_1, \bar u_0)$ are linearly dependent, so $u_1/u_0 \in \mathbb{R} \cup \{\infty\}$. Thus, by a constant multiplicative factor, one can make $(u_1, u_0) \in \mathbb{R}^2 \setminus \{0\}$.

(c) It is a simple analysis that for any $n$,
\[
\max\left\{ \frac{\|u\|_n^2}{\|v\|_n^2}, \frac{\|u\|_{n+1}^2}{\|v\|_{n+1}^2} \right\} = \max_{t\in[0,1]} \frac{\|u\|_n^2 + t|u_{n+1}|^2}{\|v\|_n^2 + t|v_{n+1}|^2}
\]
since the function of $t$ is monotone. This implies equivalence of (10.41) and (10.42).

This narrows our focus to the question of when some real-valued eigensolution is subordinate. The following inequality relates this to the behavior of the $m$-function:

Lemma 10.45 (Jitomirskaya–Last inequality). For any $L > 0$, define
\[
\epsilon(L) = \frac{1}{2 \|p_{\cdot-1}\|_L \|q_{\cdot-1}\|_L}. \tag{10.43}
\]
Then $\epsilon(L) \in (0,\infty)$ and for all $L > 0$,
\[
\frac{5 - \sqrt{24}}{|m(\lambda + i\epsilon(L))|} \le \frac{\|p_{\cdot-1}\|_L}{\|q_{\cdot-1}\|_L} \le \frac{5 + \sqrt{24}}{|m(\lambda + i\epsilon(L))|}. \tag{10.44}
\]

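Proof. Consider the Weyl solution $\psi_n(z)$ for $z = \lambda + i\epsilon$, normalized by $a_0\psi_0 = 1$ so that $\psi_1 = -m(z)$. We use variation of parameters to compare this to eigensolutions at $\lambda$: we define
\[
v_n = T_n(\lambda)^{-1} \begin{pmatrix} \psi_{n+1} \\ a_n \psi_n \end{pmatrix}
\]
and derive
\[
v_n - v_{n-1} = T_{n-1}(\lambda)^{-1} \left( A(a_n, b_n; \lambda)^{-1} A(a_n, b_n; z) - I \right) T_{n-1}(\lambda) v_{n-1},
\]
which simplifies to
\[
v_n - v_{n-1} = T_{n-1}(\lambda)^{-1} \begin{pmatrix} 0 & 0 \\ -i\epsilon & 0 \end{pmatrix} \begin{pmatrix} \psi_n \\ a_{n-1}\psi_{n-1} \end{pmatrix}.
\]
By telescoping and using $v_0 = \binom{-m(z)}{1}$, we obtain
\[
v_n = \begin{pmatrix} -m(z) \\ 1 \end{pmatrix} - i\epsilon \sum_{k=0}^{n-1} T_k(\lambda)^{-1} \begin{pmatrix} 0 \\ \psi_{k+1} \end{pmatrix}.
\]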


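Multiplying on the left by $(1, 0)\,T_n(\lambda)$ gives
\[
\psi_{n+1} = -q_n(\lambda) - p_n(\lambda) m(z) - i\epsilon \sum_{k=0}^{n-1} \left( p_n(\lambda) q_k(\lambda) - q_n(\lambda) p_k(\lambda) \right) \psi_{k+1}.
\]
Using Cauchy–Schwarz twice on the right-hand side implies
\[
|\psi_{n+1}| \ge |q_n(\lambda) + p_n(\lambda) m(z)| - \epsilon |p_n(\lambda)| \|q_{\cdot-1}\|_L \|\psi\|_L - \epsilon |q_n(\lambda)| \|p_{\cdot-1}\|_L \|\psi\|_L.
\]
Rearranging and using the triangle inequality for $\|\cdot\|_L$ gives
\[
\|q_{\cdot-1}(\lambda) + p_{\cdot-1}(\lambda) m(z)\|_L \le \|\psi\|_L + 2\epsilon \|p_{\cdot-1}\|_L \|q_{\cdot-1}\|_L \|\psi\|_L.
\]
Squaring this, combining with $\|\psi\|_L^2 \le \|\psi\|^2 = \operatorname{Im} m(z)/\epsilon$, and using $\epsilon = \epsilon(L)$, we obtain
\[
\|q_{\cdot-1}(\lambda) + p_{\cdot-1}(\lambda) m(z)\|_L^2 \le 8 \|p_{\cdot-1}\|_L \|q_{\cdot-1}\|_L |m(z)|.
\]
Using the triangle inequality in the left-hand side, this implies
\[
\left( \|q_{\cdot-1}(\lambda)\|_L - |m(z)| \|p_{\cdot-1}(\lambda)\|_L \right)^2 \le 8 \|p_{\cdot-1}\|_L \|q_{\cdot-1}\|_L |m(z)|.
\]
Dividing this by $\|q_{\cdot-1}(\lambda)\|_L^2$ and expanding gives a quadratic inequality for $\kappa = |m(z)| \|p_{\cdot-1}(\lambda)\|_L / \|q_{\cdot-1}(\lambda)\|_L$,
\[
\kappa^2 - 10\kappa + 1 \le 0,
\]
which implies $5 - \sqrt{24} \le \kappa \le 5 + \sqrt{24}$, completing the proof.

Subordinacy of $(p_{n-1}(\lambda))_{n=1}^\infty$ corresponds to infinite normal boundary values of $m$:

Theorem 10.46. Consider a half-line Jacobi matrix $J$ in the limit point case with Weyl function $m(z)$. The solution $(p_{n-1}(\lambda))_{n=1}^\infty$ is subordinate if and only if
\[
\lim_{\epsilon\downarrow 0} m(\lambda + i\epsilon) = \infty. \tag{10.45}
\]

Proof. The function $\epsilon(L)$ defined above is a continuous, strictly decreasing function of $L$. The sequences $(p_{n-1}(\lambda))_{n=1}^\infty$ and $(q_{n-1}(\lambda))_{n=1}^\infty$ are not both square-summable due to the limit point condition, so
\[
\lim_{L\to\infty} \epsilon(L) = 0.
\]
By taking $L \to \infty$ in the Jitomirskaya–Last inequality, we conclude that $(p_{n-1}(\lambda))_{n=1}^\infty$ is subordinate if and only if
\[
\lim_{L\to\infty} |m(\lambda + i\epsilon(L))| = \infty.
\]
By properties of $\epsilon(L)$, this is equivalent to (10.45).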

We will now extend this to a characterization of subordinacy of an arbitrary real solution, using a trick of varying the b1 coeﬃcient in the Jacobi matrix, which is of some independent interest.


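Corollary 10.47. Fix $\lambda \in \mathbb{R}$ and $\alpha \in \mathbb{R}$. The eigensolution at $\lambda$ with $\binom{u_1}{u_0} \simeq \binom{\cos\alpha}{\sin\alpha}$ is subordinate if and only if
\[
\lim_{\epsilon\downarrow 0} a_0 m(\lambda + i\epsilon) = -\cot\alpha.
\]

Proof. We separate cases and reduce each case to Theorem 10.46. The case $\alpha = 0$ is Theorem 10.46, and the case $\alpha = \pi/2$ follows by applying Theorem 10.46 to the coefficient-stripped matrix $J_1 = S J S^*$.

For $\alpha \in \mathbb{R} \setminus \frac{\pi}{2}\mathbb{Z}$, note that an eigensolution $(u_n)_{n=1}^\infty$ obeys
\[
a_0 u_0 + b_1 u_1 + a_1 u_2 = \lambda u_1.
\]
By grouping the first two terms, we conclude that the eigensolution obeys $u_1/u_0 = \cot\alpha$ if and only if
\[
(b_1 + a_0 \tan\alpha) u_1 + a_1 u_2 = \lambda u_1,
\]
so the same eigensolution $(u_n)_{n=1}^\infty$ corresponds to orthonormal polynomials for the modified Jacobi matrix
\[
J^{(\alpha)} = J + a_0 \tan\alpha \, \langle \delta_1, \cdot \rangle \delta_1.
\]
Denote by $m^{(\alpha)}$ its $m$-function. Since $J$ and $J^{(\alpha)}$ have the same coefficient-stripped Jacobi matrix $J_1$, coefficient stripping gives
\[
\begin{pmatrix} \frac{z - b_1}{a_1} & \frac{1}{a_1} \\ -a_1 & 0 \end{pmatrix} \begin{pmatrix} m(z) \\ 1 \end{pmatrix} \simeq \begin{pmatrix} m_1(z) \\ 1 \end{pmatrix}
\]
and
\[
\begin{pmatrix} \frac{z - b_1 - a_0\tan\alpha}{a_1} & \frac{1}{a_1} \\ -a_1 & 0 \end{pmatrix} \begin{pmatrix} m^{(\alpha)}(z) \\ 1 \end{pmatrix} \simeq \begin{pmatrix} m_1(z) \\ 1 \end{pmatrix}.
\]
Combining these equations to express $m^{(\alpha)}$ in terms of $m$, we obtain
\[
\begin{pmatrix} m^{(\alpha)}(z) \\ 1 \end{pmatrix} \simeq \begin{pmatrix} \frac{z - b_1 - a_0\tan\alpha}{a_1} & \frac{1}{a_1} \\ -a_1 & 0 \end{pmatrix}^{-1} \begin{pmatrix} \frac{z - b_1}{a_1} & \frac{1}{a_1} \\ -a_1 & 0 \end{pmatrix} \begin{pmatrix} m(z) \\ 1 \end{pmatrix},
\]
so
\[
m^{(\alpha)}(z) = \frac{m(z)}{a_0 \tan\alpha \, m(z) + 1}.
\]
In particular,
\[
\lim_{\epsilon\downarrow 0} m^{(\alpha)}(z) = \infty \iff \lim_{\epsilon\downarrow 0} a_0 m(z) = -\cot\alpha,
\]
so applying Theorem 10.46 to $J^{(\alpha)}$ concludes the proof.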
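The Möbius relation between $m^{(\alpha)}$ and $m$ is easy to sanity-check numerically. The sketch below uses the scalar form of coefficient stripping, $m(z) = 1/(b_1 - z - a_1^2 m_1(z))$, which is a restatement (made for this sketch) of the projective relation displayed in the proof; the coefficient values and the sample value of $m_1(z)$ are arbitrary illustrative choices.

```python
import math

def m_from_stripped(b1, a1, z, m1, shift=0.0):
    # Scalar coefficient stripping: m = 1 / (b1 + shift - z - a1^2 * m1).
    # shift = a0 * tan(alpha) gives the m-function of the modified matrix.
    return 1 / (b1 + shift - z - a1 ** 2 * m1)

# Illustrative data (hypothetical): coefficients, angle, spectral parameter,
# and an arbitrary value m1 in the upper half-plane.
a0, a1, b1, alpha = 1.5, 0.8, -0.3, 0.7
z, m1 = 0.4 + 0.2j, 0.1 + 0.9j

m = m_from_stripped(b1, a1, z, m1)
m_alpha = m_from_stripped(b1, a1, z, m1, shift=a0 * math.tan(alpha))

# Formula from the proof: m^(alpha) = m / (a0 * tan(alpha) * m + 1).
predicted = m / (a0 * math.tan(alpha) * m + 1)
mismatch = abs(m_alpha - predicted)
```

The value `mismatch` should vanish up to rounding, reflecting that shifting $b_1$ by $a_0\tan\alpha$ changes $1/m$ by the same additive constant.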

Theorem 10.48. Let $J$ be a half-line Jacobi matrix in the limit point case. The singular part of its canonical spectral measure $\mu$ is supported on the set
\[
S = \{\lambda \in \mathbb{R} \mid (p_{n-1}(\lambda))_{n=1}^\infty \text{ is subordinate}\},
\]
and the absolutely continuous part of $\mu$ is mutually absolutely continuous with $\chi_N(\lambda)\,d\lambda$, where
\[
N = \{\lambda \in \mathbb{R} \mid \text{there is no subordinate solution at } \lambda\}.
\]

Proof. The set $S$ is precisely the set on which $m(\lambda + i0) = \infty$. Moreover, $\lambda \in N$ if and only if $m(\lambda + i0) \in \mathbb{C}_+$ or $m(\lambda + i0)$ does not exist; however, the second case happens on a set of Lebesgue measure zero. Thus, the theorem follows from Corollary 7.49.

The most commonly used consequence of this is a criterion for absolutely continuous spectrum in terms of bounded eigensolutions [6, 85, 103]:

Theorem 10.49. Let $J$ be a half-line Jacobi matrix with $\sup a_n < \infty$. Let $\lambda \in \mathbb{R}$. If all eigensolutions at $\lambda$ are bounded, then there is no subordinate solution at $\lambda$. In particular, on the set $S_\infty$ of such $\lambda$, $\chi_{S_\infty}\,d\mu$ is mutually absolutely continuous with $\chi_{S_\infty}(\lambda)\,d\lambda$.

Proof. Let $u, v$ be linearly independent eigensolutions at $\lambda$. Then their Wronskian $W$ is nonzero and
\[
|W| \le |a_n| \left( |u_n||v_{n+1}| + |v_n||u_{n+1}| \right) \le |a_n| \left( |u_n| + |u_{n+1}| \right) \|v\|_\infty.
\]
This implies
\[
|u_n|^2 + |u_{n+1}|^2 \ge \frac{|W|^2}{2|a_n|^2 \|v\|_\infty^2},
\]
so, summing over disjoint pairs of consecutive indices,
\[
\liminf_{n\to\infty} \frac{1}{n} \sum_{k=1}^n |u_k|^2 \ge \frac{|W|^2}{4 \sup_n |a_n|^2 \, \|v\|_\infty^2} > 0.
\]
Since $\limsup_{n\to\infty} \frac{1}{n} \sum_{k=1}^n |v_k|^2 \le \|v\|_\infty^2$, this implies
\[
\liminf_{n\to\infty} \frac{\sum_{k=1}^n |u_k|^2}{\sum_{k=1}^n |v_k|^2} > 0,
\]
so $u$ is not subordinate.
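To see Definition 10.43 and Theorem 10.49 at work, the following sketch computes the ratio in (10.41) for the free half-line recurrence $u_{n+1} = \lambda u_n - u_{n-1}$ (that is, $a_n = 1$, $b_n = 0$, chosen here purely for illustration). At $\lambda = 1$ all eigensolutions are bounded and the ratio stays away from zero, while at $\lambda = 3$ the decaying solution is subordinate. The function names are hypothetical.

```python
def eigensolution(lam, u0, u1, n_max):
    # Solve u_{n+1} = lam * u_n - u_{n-1} (free case a_n = 1, b_n = 0).
    # Returns a list with u[k] = u_k for k = 0, ..., n_max.
    u = [u0, u1]
    for _ in range(n_max - 1):
        u.append(lam * u[-1] - u[-2])
    return u

def ratio(u, v, n):
    # Ratio of partial sums of squares over indices 1..n, as in (10.41).
    num = sum(abs(x) ** 2 for x in u[1:n + 1])
    den = sum(abs(x) ** 2 for x in v[1:n + 1])
    return num / den

# Inside (-2, 2): both solutions are bounded (here periodic with period 6).
u_in = eigensolution(1.0, 0.0, 1.0, 60)
v_in = eigensolution(1.0, 1.0, 0.0, 60)
ratio_inside = ratio(u_in, v_in, 60)

# Outside [-2, 2]: u_n = r^n with r = (3 - sqrt(5)) / 2 decays, v grows.
r = (3 - 5 ** 0.5) / 2
u_out = eigensolution(3.0, 1.0, r, 30)
v_out = eigensolution(3.0, 1.0, 0.0, 30)
ratio_10 = ratio(u_out, v_out, 10)
ratio_30 = ratio(u_out, v_out, 30)
```

In the first case the two partial sums agree over whole periods, so the ratio does not tend to zero; in the second case the ratio shrinks rapidly with $n$, which is the subordinate behavior.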

Further criteria for absolutely continuous spectrum have been proved by Last–Simon [61] with closely related work by Remling [77] (see also [92, Sections 7.3 and 7.4]).

By strengthening the subordinacy assumption, we can characterize spectral decompositions with respect to Hausdorff measures. (Note the lim inf in the following definition.)

Definition 10.50. Fix $\beta \in (0,1]$ and $\lambda \in \mathbb{R}$. A nontrivial eigensolution $u$ at $\lambda$ is called $\beta$-subordinate (at $+\infty$) if
\[
\liminf_{L\to\infty} \frac{\|u\|_L^{2-\beta}}{\|v\|_L^{\beta}} = 0 \tag{10.46}
\]
for some eigensolution $v$ at $\lambda$.

Theorem 10.51. Let $J$ be a half-line Jacobi matrix in the limit point case. Fix $\beta \in (0,1)$. The $\beta$-singular part of its spectral measure $\mu$ is supported on the set
\[
S_\beta = \{\lambda \in \mathbb{R} \mid (p_{n-1}(\lambda))_{n=1}^\infty \text{ is } \beta\text{-subordinate}\},
\]
and the $\beta$-continuous part of $\mu$ is supported on $S_\beta^c$.

Proof. Raising (10.43) to power $1-\beta$ and using that to divide (10.44) gives
\[
\frac{5 - \sqrt{24}}{\epsilon(L)^{1-\beta} |m(\lambda + i\epsilon(L))|} \le 2^{1-\beta} \frac{\|p_{\cdot-1}\|_L^{2-\beta}}{\|q_{\cdot-1}\|_L^{\beta}} \le \frac{5 + \sqrt{24}}{\epsilon(L)^{1-\beta} |m(\lambda + i\epsilon(L))|}.
\]
Taking $L \to \infty$ proves that $(p_{n-1}(\lambda))_{n=1}^\infty$ is $\beta$-subordinate if and only if
\[
\limsup_{\epsilon\downarrow 0} \epsilon^{1-\beta} |m(\lambda + i\epsilon)| = \infty.
\]

Now the claim follows from Theorem 6.29 and Theorem 7.51.

Subordinacy can also be used to study spectra of full-line Jacobi matrices: with obvious modifications, we say a nontrivial eigensolution $u$ at $\lambda$ is subordinate at $-\infty$ if for some eigensolution $v$ at $\lambda$,
\[
\lim_{n\to-\infty} \frac{\sum_{k=n}^{-1} |u_k|^2}{\sum_{k=n}^{-1} |v_k|^2} = 0.
\]
Denote
\[
S_\alpha^\pm = \left\{ \lambda \in \mathbb{R} \;\Big|\; \text{the solution with } \frac{u_1}{u_0} = \cot\alpha \text{ is subordinate at } \pm\infty \right\},
\]
\[
N^\pm = \{\lambda \in \mathbb{R} \mid \text{there is no subordinate eigensolution at } \pm\infty\}.
\]
Note that the set
\[
S = \bigcup_{\alpha\in[0,\pi)} (S_\alpha^- \cap S_\alpha^+)
\]
is precisely the set of $\lambda \in \mathbb{R}$ for which there exists a nontrivial eigensolution which is subordinate at both endpoints $\pm\infty$.

Theorem 10.52. Let $J$ be a full-line Jacobi matrix which is limit point at $\pm\infty$. Let $\mu$ be its canonical spectral measure.
(a) The singular part of $\mu$ is supported on $S$ and has multiplicity 1.
(b) $N^- \cup N^+$ is an essential support for $\mu_{\mathrm{ac}}$, i.e., $\mu_{\mathrm{ac}}$ is mutually absolutely continuous with $\chi_{N^- \cup N^+}(\lambda)\,d\lambda$.
(c) $N^- \cap N^+$ is an essential support for the multiplicity 2 part of $\mu_{\mathrm{ac}}$.

Proof. This follows immediately from Corollary 10.41, Corollary 10.42, and Theorem 10.48.


10.9. A Combes–Thomas estimate and Schnol's theorem

In this section, we specialize to bounded Jacobi matrices and consider two related results about the behavior of eigensolutions on and off the spectrum. The first is that the decay properties of Weyl solutions can be significantly improved, from square-summability to exponential decay; estimates of this type are called Combes–Thomas estimates.

Proposition 10.53. Let $J$ be a bounded half-line Jacobi matrix, and let $\psi$ be a Weyl solution at $z \in \mathbb{C} \setminus \sigma_{\mathrm{ess}}(J)$. There exist $\gamma > 0$ and $C < \infty$ such that for all $n$,
\[
|\psi_n| \le C e^{-\gamma n}.
\]

Proof. Let us first assume that $z \notin \sigma(J)$. The key observation is that $(\psi_n)$ is an eigensolution at $z$ if and only if $(u_n) = (e^{\gamma n}\psi_n)$ is an eigensolution of the operator
\[
(J_\gamma u)_n = a_{n-1} e^{\gamma} u_{n-1} + b_n u_n + a_n e^{-\gamma} u_{n+1}.
\]
It is easy to estimate the operator norm
\[
\|J_\gamma - J\| \le (e^\gamma - e^{-\gamma}) \sup_{n\in\mathbb{N}} a_n,
\]
so for small enough $\gamma > 0$, $\|J_\gamma - J\| < \operatorname{dist}(z, \sigma(J))$, and therefore
\[
\|(J_\gamma - J)(J - z)^{-1}\| < \operatorname{dist}(z, \sigma(J)) \, \|(J-z)^{-1}\| = 1.
\]
By Theorem 4.26, it follows that
\[
J_\gamma - z = (J_\gamma - J) + (J - z) = \left( (J_\gamma - J)(J-z)^{-1} + I \right)(J - z)
\]
is invertible as a product of invertible operators. In particular, taking $u = (J_\gamma - z)^{-1}\delta_1 \in \ell^2(\mathbb{N})$, $u$ is a bounded eigensolution of $J_\gamma$ for $n \ge 2$, so $\psi_n = e^{-\gamma n} u_n$ is an exponentially decaying eigensolution of $J$. Thus, $\psi$ must be a Weyl solution.

If $z \in \sigma(J)$, then $\psi_0 = 0$, so $\psi_1 \neq 0$ since $\psi$ is nontrivial. Therefore, $S\psi$ is a Weyl solution for the coefficient-stripped matrix $J_1$ and $z \notin \sigma(J_1)$. By applying the above argument to $J_1$, we conclude exponential decay of $S\psi$.

The rate of exponential decay can be estimated precisely: $\gamma$ can be made arbitrarily close to the value of the so-called potential theoretic Green's function at $z$. This gives a universal inequality which is one of the foundations of Stahl–Totik regularity [98].

The second result is Schnol's theorem. We begin with the half-line case, which characterizes the spectrum in terms of where $p_n(x)$ grows at most polynomially:


Theorem 10.54 (Schnol). Let $J$ be a bounded half-line Jacobi matrix. Fix $\kappa > 1/2$ and denote
\[
S_\kappa = \{\lambda \in \mathbb{C} \mid |p_n(\lambda)| = O(n^\kappa), \; n \to \infty\}.
\]
(a) $\mu$ is supported on $S_\kappa$.
(b) $S_\kappa \subset \sigma(J)$.
(c) $\overline{S_\kappa} = \sigma(J)$.

Proof. (a) By (10.8),
\[
\int \sum_{n=1}^\infty n^{-2\kappa} |p_n(\lambda)|^2 \, d\mu(\lambda) = \sum_{n=1}^\infty n^{-2\kappa} \int |p_n(\lambda)|^2 \, d\mu(\lambda) = \sum_{n=1}^\infty n^{-2\kappa} < \infty.
\]
By Tonelli's theorem, this implies that $\sum_{n=1}^\infty n^{-2\kappa} |p_n(\lambda)|^2 < \infty$ for $\mu$-a.e. $\lambda$, and convergence of the series implies $p_n(\lambda) = O(n^\kappa)$ as $n \to \infty$.

(b) For $\lambda \notin \sigma(J)$, the Weyl solution $\psi$ obeys $\psi_0 \neq 0$, so it is linearly independent with the eigensolution $u_n = p_{n-1}(\lambda)$. Thus, their Wronskian
\[
w = a_n (u_n \psi_{n+1} - u_{n+1}\psi_n)
\]
is nonzero and independent of $n$. By the triangle inequality,
\[
|w| = |a_n (u_n \psi_{n+1} - u_{n+1}\psi_n)| \le a_n C e^{-\gamma n} (|u_n| + |u_{n+1}|).
\]
This implies that $|u_n| + |u_{n+1}| \ge C' e^{\gamma n}$ for some $C' > 0$ independent of $n$, so $p_n$ is not polynomially bounded.

(c) Since $\mu$ is supported on $S_\kappa$, it is supported on $\overline{S_\kappa} \subset \sigma(J)$. Since $\sigma(J) = \operatorname{supp}\mu$ is the smallest closed set on which $\mu$ is supported, this implies $\overline{S_\kappa} = \sigma(J)$.

The full-line Schnol's theorem characterizes the spectrum of a full-line Jacobi matrix in terms of the set of values of $z$ for which there exists a polynomially bounded eigensolution (polynomially bounded at both $\pm\infty$):

Theorem 10.55. Let $J$ be a bounded full-line Jacobi matrix, and let $\mu$ be a maximal spectral measure for $J$. Fix $\kappa > 1/2$ and denote by $S_\kappa$ the set of $z$ for which there exists a nontrivial eigensolution which obeys $u_n = O(|n|^\kappa)$ as $n \to \pm\infty$. Then:
(a) $\mu$ is supported on $S_\kappa$.
(b) $S_\kappa \subset \sigma(J)$.
(c) $\overline{S_\kappa} = \sigma(J)$.

Proof. (a) Using the orthogonality relations
\[
\int \begin{pmatrix} u^0_m \\ u^1_m \end{pmatrix}^{\!*} W \begin{pmatrix} u^0_n \\ u^1_n \end{pmatrix} d\mu = \langle \delta_m, \delta_n \rangle,
\]


we conclude that
\[
\int \sum_{n\in\mathbb{Z}} (1+n^2)^{-\kappa} \begin{pmatrix} u^0_n \\ u^1_n \end{pmatrix}^{\!*} W \begin{pmatrix} u^0_n \\ u^1_n \end{pmatrix} d\mu < \infty.
\]
Thus, for $\mu$-a.e. $\lambda$,
\[
\sum_{n\in\mathbb{Z}} (1+n^2)^{-\kappa} \begin{pmatrix} u^0_n \\ u^1_n \end{pmatrix}^{\!*} W \begin{pmatrix} u^0_n \\ u^1_n \end{pmatrix} < \infty.
\]
Choosing a nonzero vector $\lambda = \binom{\lambda_0}{\lambda_1} \in \mathbb{C}^2$ such that $W \ge \lambda\lambda^*$, we obtain
\[
\sum_{n\in\mathbb{Z}} (1+n^2)^{-\kappa} \begin{pmatrix} u^0_n \\ u^1_n \end{pmatrix}^{\!*} \lambda\lambda^* \begin{pmatrix} u^0_n \\ u^1_n \end{pmatrix} < \infty,
\]
which implies that the nontrivial solution $u_n = \lambda_0 u^0_n + \lambda_1 u^1_n$ obeys
\[
\sum_{n\in\mathbb{Z}} (1+n^2)^{-\kappa} |u_n|^2 < \infty.
\]
Thus, this solution obeys $u_n = O(|n|^\kappa)$ as $n \to \pm\infty$.

(b) Assume that there exists a nontrivial, polynomially bounded eigensolution $u$ at $z$. For $z \notin \sigma(J)$, there exist Weyl solutions $\psi^\pm$ which are exponentially decaying at $\pm\infty$, respectively. As in the proof of Theorem 10.54, since $u$ is polynomially bounded at $\pm\infty$, it must be linearly dependent with $\psi^\pm$, so it follows that $W(\psi^+, \psi^-) = 0$. This would mean that $z$ is an eigenvalue of $J$, leading to contradiction.

(c) Now it follows by the same argument as in the proof of Theorem 10.54.

Schnol's theorem implies an important criterion for the pure point spectrum of a Jacobi matrix (Exercise 10.17), which is used in proofs of a phenomenon called localization.

10.10. The periodic discriminant and the Marchenko–Ostrovski map

We now consider full-line Jacobi matrices $J$ with $q$-periodic Jacobi coefficients,
\[
a_{n+q} = a_n, \qquad b_{n+q} = b_n \qquad \forall n \in \mathbb{Z}.
\]
The behavior of the eigensolutions will be determined by behavior over one period, encoded in the $q$-step transfer matrix
\[
T_q(z) = A(a_q, b_q; z) \cdots A(a_1, b_1; z),
\]
called in this context the monodromy matrix.


Lemma 10.56. $\eta$ is an eigenvalue of $T_q(z)$ if and only if there exists a nontrivial eigensolution $v$ such that $v_{n+q} = \eta v_n$ for all $n \in \mathbb{Z}$.

Proof. This follows from the very definition of $T_q(z)$ as a transfer matrix. If $\eta$ is an eigenvalue, choose $v_0, v_1$ so that $\binom{v_1}{a_0 v_0}$ is an eigenvector. Then
\[
\begin{pmatrix} v_{q+1} \\ a_q v_q \end{pmatrix} = \eta \begin{pmatrix} v_1 \\ a_0 v_0 \end{pmatrix},
\]
so $v_{n+q} = \eta v_n$ for $n = 0, 1$. By forward and backward induction using $q$-periodicity of the Jacobi parameters, $v_{n+q} = \eta v_n$ for all $n \in \mathbb{Z}$. The converse is similar: if $v$ is a nontrivial eigensolution such that $v_{n+q} = \eta v_n$ for $n = 0, 1$, then $T_q(z) \binom{v_1}{a_0 v_0} = \eta \binom{v_1}{a_0 v_0}$.

We will need a general fact about $2\times 2$ matrices with determinant 1:

Lemma 10.57. Let $A$ be a $2\times 2$ matrix and let $\det A = 1$. Denote $t = \operatorname{Tr} A$. Then the following hold.
(a) If $t \in (-2,2)$, the matrix $A$ has two distinct eigenvalues $\frac{t \pm \sqrt{t^2-4}}{2}$, both of which lie on the unit circle $\partial\mathbb{D}$.
(b) If $t \in \{-2, 2\}$, the matrix $A$ has a single eigenvalue $t/2$ of geometric multiplicity 1 or 2.
(c) If $t \in \mathbb{C} \setminus [-2,2]$, the matrix $A$ has eigenvalues $\frac{t \pm \sqrt{t^2-4}}{2}$, one of which lies in $\mathbb{D} \setminus \{0\}$ and the other in $\mathbb{C} \setminus \overline{\mathbb{D}}$.

Proof. Solving the characteristic equation $\eta^2 - (\operatorname{Tr} A)\eta + \det A = 0$ gives the eigenvalues $\eta = \frac{t \pm \sqrt{t^2-4}}{2}$. From this, (b) follows immediately, and (a) follows by
\[
\left| \frac{t \pm \sqrt{t^2-4}}{2} \right|^2 = \left| \frac{t \pm i\sqrt{4 - t^2}}{2} \right|^2 = \frac{t^2 + 4 - t^2}{4} = 1, \qquad t \in (-2,2).
\]
Conversely, if $A$ has an eigenvalue $\eta$ with $|\eta| = 1$, then by $\det A = 1$, the other zero of its characteristic polynomial is $1/\eta = \bar\eta$, so the trace is $t = \eta + \bar\eta = 2\operatorname{Re}\eta \in [-2,2]$. By contraposition, in the case (c) there are no eigenvalues $\eta$ with $|\eta| = 1$. Thus, by $\det A = 1$, there must be two distinct eigenvalues, one in $\mathbb{D}$ and the other in $\mathbb{C} \setminus \overline{\mathbb{D}}$.

The main fact proved above was that $\eta + 1/\eta \in [-2,2]$ if and only if $\eta \in \partial\mathbb{D}$, which can be recognized as a standard fact about the Zhukovsky map $\eta \mapsto \eta + 1/\eta$. Equivalently, with the substitution $\eta = e^{iw}$, this provides a fact about the cosine as a complex analytic function: $\cos w \in [-1,1]$ if and only if $w \in \mathbb{R}$. This and related basic facts (e.g., for $w \in \mathbb{C}$, $\sin w = 0$ if and only if $w \in \pi\mathbb{Z}$) will be used below.

In our setting, since $\det T_q(z) = 1$, a central object will be the discriminant defined by
\[
\Delta(z) = \operatorname{Tr} T_q(z).
\]
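The following sketch computes the monodromy matrix and discriminant for a small periodic example and checks the trichotomy of Lemma 10.57 numerically. It assumes the one-step transfer matrix has the form $A(a,b;z) = \begin{pmatrix} (z-b)/a & -1/a \\ a & 0 \end{pmatrix}$, which is the form used in the proof of Lemma 10.58; the period-2 coefficients and all function names are illustrative choices.

```python
import cmath

def one_step(a, b, z):
    # Assumed form A(a, b; z) = [[(z - b)/a, -1/a], [a, 0]], with determinant 1.
    return [[(z - b) / a, -1 / a], [a, 0]]

def matmul(X, Y):
    # Product of two 2x2 matrices given as nested lists.
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def monodromy(a, b, z):
    # T_q(z) = A(a_q, b_q; z) ... A(a_1, b_1; z); a, b list (a_1..a_q), (b_1..b_q).
    T = [[1, 0], [0, 1]]
    for ak, bk in zip(a, b):
        T = matmul(one_step(ak, bk, z), T)
    return T

def discriminant(a, b, z):
    T = monodromy(a, b, z)
    return T[0][0] + T[1][1]

def eigenvalues(t):
    # Roots of eta^2 - t*eta + 1 = 0.
    s = cmath.sqrt(t * t - 4)
    return (t + s) / 2, (t - s) / 2

a, b = [1.0, 2.0], [0.5, -0.5]          # hypothetical period q = 2
T = monodromy(a, b, 0.3 + 0.1j)
det_T = T[0][0] * T[1][1] - T[0][1] * T[1][0]

delta_band = discriminant(a, b, 2.0)    # equals z^2/2 - 2.625 at z = 2
moduli_band = [abs(e) for e in eigenvalues(delta_band)]
moduli_gap = sorted(abs(e) for e in eigenvalues(discriminant(a, b, 0.0)))
leading = discriminant(a, b, 1e3) / 1e3 ** 2   # approximates 1/(a_1 a_2)
```

For this example $\Delta(z) = z^2/2 - 2.625$, so $z = 2$ gives $\Delta \in (-2,2)$ with both eigenvalues on the unit circle, while $z = 0$ gives $\Delta < -2$ with one eigenvalue inside and one outside the unit disk.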


The previous two lemmas indicate that the set E = {z ∈ C | Δ(z) ∈ [−2, 2]} is relevant. For z ∈ E, there is a nontrivial eigensolution v such that vn+q = / E, ηvn for some η ∈ ∂D (in particular, a bounded eigensolution). For z ∈ ± there are nontrivial solutions v obeying vn+q = η ±1 vn± with η ∈ D. Thus v ± decays exponentially at ±∞ and grows exponentially at ∓∞. In particular, v ± are linearly independent, and any linear combination C+ v+ +C− v− grows exponentially at +∞ if C+ = 0 and at −∞ if C− = 0. In particular, there is no nontrivial, polynomially bounded eigensolution at z ∈ / E, and it already follows from Schnol’s theorem that σ(J) = E. We will soon reprove this as part of a more detailed study which will describe the set E and the spectral properties of J much more precisely. Some basic properties of the discriminant are read oﬀ from its deﬁnition: Lemma 10.58. The discriminant Δ is a polynomial of degree q with real coeﬃcients and leading coeﬃcient (a1 · · · aq )−1 . Proof. Viewed as a polynomial in z, the 1-step transfer matrix A(a, b; z) is ! ! 1/a 0 −b/a −1/a A(a, b; z) = z+ , 0 0 a 0 so the monodromy matrix is a polynomial of degree q with ! 1 1 0 q z + O(z q−1 ), z → ∞. Tq (z) = a1 · · · aq 0 0 Taking the trace, we conclude that Δ(z) is a polynomial with leading term (a1 · · · aq )−1 z q . The symmetry A(a, b; z)∗ = A(a, b; z¯) implies that Tq (z)∗ = Tq (¯ z ). Therefore Δ(z) = Δ(¯ z ), and Δ has real coeﬃcients. To proceed further, we want to track the z-dependence of Weyl solutions and half-line m-functions. We will do this by using the closely related Marchenko–Ostrovski map. The Marchenko–Ostrovski map is a natural object which can be deﬁned for almost periodic spectral problems [49], for which there is no discriminant. However, the Marchenko–Ostrovski map is not entire: we will start with an unmotivated deﬁnition on C+ and gradually prove the properties of this map and connections with the discriminant. 
Denote by $m_{+,k}$ the $m$-function of the half-line Jacobi matrix with coefficients $(a_{n+k}, b_{n+k})_{n=1}^\infty$, that is, of the $k$ times coefficient-stripped matrix $J_{+,k} = S^k J_+ (S^*)^k$. In particular, $m_{+,0} = m_+$. Taking the branch of $\log$ on $\mathbb{C}_+$ with $\operatorname{Im} \log \in (0,\pi)$, the Marchenko–Ostrovski map $\Theta$ is defined on $\mathbb{C}_+$ by
$$\Theta(z) = -\pi - \frac{i}{q} \sum_{k=0}^{q-1} \log\bigl(a_k m_{+,k}(z)\bigr). \tag{10.47}$$

10.10. The periodic discriminant and the Marchenko–Ostrovski map


Since $\operatorname{Im} \log m_{+,k} \in (0,\pi)$, it follows immediately that
$$-\pi < \operatorname{Re} \Theta(z) < 0 \qquad \forall z \in \mathbb{C}_+.$$

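This definition (10.47) was chosen to be related to the Weyl solution at $+\infty$ and, as we will see, to an eigenvalue of $T_q$:

Lemma 10.59. For all $z \in \mathbb{C}_+$, $\operatorname{Im} \Theta(z) > 0$ and
$$T_q(z) \begin{pmatrix} -m_+(z) \\ 1 \end{pmatrix} = e^{iq\Theta(z)} \begin{pmatrix} -m_+(z) \\ 1 \end{pmatrix}. \tag{10.48}$$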

Proof. Consider the Weyl solution at $+\infty$ at energy $z$, denoted $(\psi_n)_{n \in \mathbb{Z}}$; by $q$-periodicity of the Jacobi matrix, shifting the Weyl solution by $q$ places gives again a Weyl solution. Since the Weyl solution is unique up to normalization, there exists $\eta = \eta(z) \in \mathbb{C}$ such that $\psi_{n+q} = \eta \psi_n$ for all $n \in \mathbb{Z}$. Taking the product of
$$\frac{\psi_{k+1}}{\psi_k} = -a_k m_k = e^{-i\pi + \log(a_k m_k(z))}$$
from $k = 0$ to $q-1$ gives $\psi_q / \psi_0 = e^{iq\Theta}$, which implies $\eta(z) = e^{iq\Theta(z)}$. Since
$$\sum_{n=0}^{q-1} |\psi_{kq+n}|^2 = |\eta|^{2k} \sum_{n=0}^{q-1} |\psi_n|^2,$$
square-summability of $\psi$ at $+\infty$ implies $|\eta| < 1$, i.e., $\operatorname{Im} \Theta(z) > 0$.

Lemma 10.60. For any $z \in \mathbb{C} \setminus \mathbb{R}$, $\Delta(z) \notin [-2,2]$.

Proof. For $z \in \mathbb{C}_+$, the matrix $T_q(z)$ has an eigenvalue $e^{iq\Theta(z)} \in \mathbb{D}$, so by Lemma 10.57, $\Delta(z) \notin [-2,2]$. Since $\overline{\Delta(z)} = \Delta(\bar z)$, the same holds for $z \in \mathbb{C}_-$.

We now relate the discriminant to $\Theta$ on $\mathbb{C}_+$ and use this to construct analytic continuations of $\Theta$ into certain simply connected regions. The proof will use the following basic fact from complex analysis. If $g$ is a nowhere-vanishing analytic function on a simply connected domain $\Omega$, then there exists analytic $h : \Omega \to \mathbb{C}$ such that $g = e^h$. Therefore, there exists an analytic branch of $\sqrt{g} = e^{h/2}$ on $\Omega$.

Lemma 10.61. For all $z \in \mathbb{C}_+$,
$$\Delta(z) = 2\cos(q\Theta(z)). \tag{10.49}$$
For any interval $(c,d) \subset \mathbb{R}$ containing no zeros of $\Delta^2 - 4$, $\Theta$ has an analytic continuation to $\mathbb{C}_+ \cup (c,d) \cup \mathbb{C}_-$ such that (10.49) holds.


Proof. Since $\det T_q = 1$ and one eigenvalue is $e^{iq\Theta}$, the other eigenvalue is $e^{-iq\Theta}$, so (10.49) follows. Then $4 - \Delta(z)^2 = 4\sin^2(q\Theta(z))$, so on $\mathbb{C}_+$ we can fix a branch of $\sqrt{\Delta(z)^2 - 4}$ by setting $\sqrt{\Delta(z)^2 - 4} = -2i\sin(q\Theta(z))$, so that
$$\Theta'(z) = \frac{i\Delta'(z)}{q\sqrt{\Delta(z)^2 - 4}}. \tag{10.50}$$
Since the right-hand side has an analytic continuation to any simply connected subset $\Omega$ of $\mathbb{C} \setminus \{z \mid \Delta(z)^2 = 4\}$, so does $\Theta'$; thus, so does $\Theta$, with the analytic continuation defined by $\Theta(z) = \Theta(z_*) + \int_\gamma \Theta'(w)\,dw$, where $z_*$ is an arbitrary reference point and $\gamma$ is an arbitrary path from $z_*$ to $z$ in $\Omega$.

Lemma 10.62. For any $z \in \mathbb{C}$, if $\Delta(z) \in (-2,2)$, then $\Delta'(z) \neq 0$.

Proof. By Lemma 10.60, $\Delta(z) \in (-2,2)$ is only possible for real $z$. For any interval $(c,d) \subset \mathbb{R}$ on which $\Delta \in (-2,2)$, consider the analytic extension of $\Theta(z)$ to $\mathbb{C}_+ \cup (c,d) \cup \mathbb{C}_-$. For $z \in (c,d)$, $\Delta(z) \in (-2,2)$ implies $q\Theta(z) \in \mathbb{R} \setminus \pi\mathbb{Z}$. In particular, $\Theta$ is an extended Herglotz function, so $\Theta'(z) > 0$ for $z \in (c,d)$. This implies $\Delta'(z) = -2q\sin(q\Theta(z))\Theta'(z) \neq 0$.

The behavior of the discriminant on $\mathbb{R}$ can now be described very explicitly.

Theorem 10.63.
(a) All zeros of the polynomial $\Delta$ are simple and lie in $\mathbb{R}$.
(b) All zeros of $\Delta^2 - 4$ are real and can be listed, with multiplicity, in the form
$$\lambda_1 < \lambda_2 \le \lambda_3 < \cdots < \lambda_{2q-2} \le \lambda_{2q-1} < \lambda_{2q} \tag{10.51}$$
(in particular, $\lambda_{2j-1} < \lambda_{2j}$ for $j = 1, \dots, q$).
(c) Each zero of $\Delta^2 - 4$ is a zero of $\Delta - 2$ or $\Delta + 2$:
$$\Delta(\lambda_n) = \begin{cases} 2 & n \equiv 2q,\ 2q-3 \pmod 4 \\ -2 & n \equiv 2q-1,\ 2q-2 \pmod 4. \end{cases} \tag{10.52}$$
(d) $\Delta'$ has exactly one simple zero $\kappa_j \in [\lambda_{2j}, \lambda_{2j+1}]$ for each $j \in \{1, \dots, q-1\}$ and no other zeros.
(e) For each $j$, either $\lambda_{2j} < \kappa_j < \lambda_{2j+1}$ or $\lambda_{2j} = \kappa_j = \lambda_{2j+1}$.

Proof. (a) All zeros of $\Delta$ are real and simple by Lemmas 10.60 and 10.62, so $\Delta(z)$ has $q$ distinct real zeros $c_1 < \cdots < c_q$. Since $\Delta$ is a polynomial with real coefficients, $\Delta'$ has at least one zero $\kappa_j \in (c_j, c_{j+1})$ for $1 \le j \le q-1$. Since $\deg \Delta' = q-1$, those zeros are simple and $\Delta'$ has no other zeros in $\mathbb{C}$.


(b) For each zero $c_j$ of $\Delta$, define by $\lambda_{2j-1}$ the largest real zero of $\Delta^2 - 4$ smaller than $c_j$, and by $\lambda_{2j}$ the smallest real zero of $\Delta^2 - 4$ larger than $c_j$. This is well defined even in the border cases $j = 1, q$ since $|\Delta(\lambda)| \to \infty$ as $\lambda \to \pm\infty$. Thus, by construction, $\lambda_{2j-1} < c_j < \lambda_{2j}$. Moreover, since $c_j < \kappa_j < c_{j+1}$ and $|\Delta(\kappa_j)| \ge 2$ by (a), it follows that $\lambda_{2j} \le \kappa_j \le \lambda_{2j+1}$ for $j = 1, \dots, q-1$. The only possible case of equality among the $\lambda_k$'s is the case $\lambda_{2j} = \kappa_j = \lambda_{2j+1}$. In this case, since $(\Delta^2 - 4)' = 2\Delta\Delta'$, this shared value is at least a double zero of $\Delta^2 - 4$. Thus, the sequence (10.51) is a list of zeros of $\Delta^2 - 4$ with repetitions no higher than their algebraic multiplicity. Since $\deg(\Delta^2 - 4) = 2q$, we conclude that (10.51) lists all zeros with precisely their algebraic multiplicity. Moreover, if $\lambda_{2j} < \lambda_{2j+1}$, then $\lambda_{2j} < \kappa_j < \lambda_{2j+1}$. Since $\Delta(\lambda) \to +\infty$ as $\lambda \to +\infty$, (10.52) follows by backward induction in $n$.

It is now clear that the set $E$ is given by
$$E = \bigcup_{j=1}^{q} [\lambda_{2j-1}, \lambda_{2j}].$$
The closed intervals $[\lambda_{2j-1}, \lambda_{2j}]$ are called spectral bands. The open intervals $(\lambda_{2j}, \lambda_{2j+1})$ are spectral gaps. The $j$th gap is said to be open if $\lambda_{2j} < \lambda_{2j+1}$ and closed if $\lambda_{2j} = \lambda_{2j+1}$. Since closed gaps are possible, this merely tells us that $E$ is a disjoint union of at most $q$ closed intervals; see Figure 10.1 for an example. If we merely know the set $E$, the location of closed gaps (if any) seems lost; however, we will see below that if $E$ is the spectrum of a periodic Jacobi matrix, the number and placement of closed gaps are uniquely determined by the set $E$.
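To make the band structure concrete, consider a hypothetical 2-periodic example (our choice, not from the text) with $b_1 = b_2 = 0$. Multiplying out $A(a_2,b_2;z)A(a_1,b_1;z)$ by hand gives $\Delta(z) = \bigl((z-b_1)(z-b_2) - a_1^2 - a_2^2\bigr)/(a_1 a_2)$, so the band edges solve $z^2 = (a_1 \pm a_2)^2$. The sketch below locates them by bisection on $\Delta \mp 2$; the bracketing intervals are chosen by inspection for this particular example.

```python
def discriminant_q2(a1, a2, b1, b2, z):
    """Trace of A(a2, b2; z) A(a1, b1; z), multiplied out by hand for q = 2."""
    return ((z - b1) * (z - b2) - a1**2 - a2**2) / (a1 * a2)

def bisect(f, lo, hi, tol=1e-12):
    """Root of f in [lo, hi], assuming f changes sign on the interval."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical coefficients: a = (1, 2), b = (0, 0).
a1, a2, b1, b2 = 1.0, 2.0, 0.0, 0.0

def delta(z):
    return discriminant_q2(a1, a2, b1, b2, z)

# By (10.52) with q = 2: Delta = 2 at lambda_1, lambda_4 and Delta = -2 at lambda_2, lambda_3.
lam1 = bisect(lambda z: delta(z) - 2, -4.0, -2.0)
lam2 = bisect(lambda z: delta(z) + 2, -2.0, -0.5)
lam3 = bisect(lambda z: delta(z) + 2, 0.5, 2.0)
lam4 = bisect(lambda z: delta(z) - 2, 2.0, 4.0)
band_edges = [lam1, lam2, lam3, lam4]

print(band_edges)
```

In this example the bands are $[-(a_1+a_2), -|a_1-a_2|]$ and $[|a_1-a_2|, a_1+a_2]$, and the single gap closes exactly when $a_1 = a_2$, in which case $\Delta + 2$ has a double zero at $0$.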

Figure 10.1. The discriminant on $\mathbb{R}$ for a 4-periodic Jacobi matrix with closed first gap ($\lambda_2 = \kappa_1 = \lambda_3$).


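Factorizing the polynomials $\Delta'$ and $\Delta^2 - 4$ gives the product formulas
$$\Delta(z)^2 - 4 = \frac{1}{(a_1 \cdots a_q)^2} \prod_{j=1}^{q} (z - \lambda_{2j-1})(z - \lambda_{2j}), \qquad \Delta'(z) = \frac{q}{a_1 \cdots a_q} \prod_{j=1}^{q-1} (z - \kappa_j),$$
which imply a product formula for $\Theta'$ by (10.50).

Lemma 10.64. For $z \in \mathbb{C}_+$,
$$\Theta'(z) = \sqrt{\frac{1}{(z - \lambda_1)(\lambda_{2q} - z)} \prod_{j=1}^{q-1} \frac{(z - \kappa_j)^2}{(z - \lambda_{2j})(z - \lambda_{2j+1})}}. \tag{10.53}$$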

This function has an analytic extension to $\mathbb{C} \setminus E$ which obeys $\Theta'(\bar z) = -\overline{\Theta'(z)}$, and the branch of square root is such that $\arg \Theta'(\lambda) = \pi/2$ for $\lambda \in (\lambda_{2q}, \infty)$.

Proof. The product formula for $\Theta'$ follows from those for $\Delta'$ and $\Delta^2 - 4$. The square root $\sqrt{\Delta^2 - 4}$ extends from $\mathbb{C}_+$ continuously with real values on $\mathbb{R} \setminus E$, so by the reflection principle, it has an analytic extension to $\mathbb{C} \setminus E$ which obeys a reflection symmetry. By (10.50) the analytic extension of $\Theta'$ to $\mathbb{C} \setminus E$ follows.

The functions $m_k(z)$ have meromorphic Herglotz extensions with asymptotic behavior $m_k(z) \sim -1/z$ as $z \to \infty$, so by extending both sides of (10.47) analytically through an interval of the form $(C, \infty)$ with large enough $C$, it follows that the extension of $\Theta$ is purely imaginary and $-i\Theta$ is strictly increasing in $(C, \infty)$. Thus, the analytic extension of $\Theta'$ obeys $\arg \Theta' = \pi/2$ on $(C, \infty)$. By (10.50), this also means that our choice of branch of the square root $\sqrt{\Delta^2 - 4}$ extends to $\mathbb{C} \setminus E$ with positive values on $(\lambda_{2q}, \infty)$. We will consistently use that branch in what follows. By counting argument changes, for $\lambda \in (\lambda_{2j-1}, \lambda_{2j})$,
$$\begin{aligned} \operatorname{Im} \lim_{\epsilon \downarrow 0} \sqrt{\Delta(\lambda + i\epsilon)^2 - 4} &> 0 \quad \text{if } j \equiv q \pmod 2, \\ \operatorname{Im} \lim_{\epsilon \downarrow 0} \sqrt{\Delta(\lambda + i\epsilon)^2 - 4} &< 0 \quad \text{if } j \equiv q - 1 \pmod 2. \end{aligned} \tag{10.54}$$
This is illustrated in Figure 10.2. We also note that, by the product formula for $\Delta^2 - 4$ and the choice of branch,
$$\sqrt{\Delta(z)^2 - 4} = \frac{1}{a_1 \cdots a_q} z^q + O(z^{q-1}), \qquad z \to \infty.$$

Figure 10.2. The boundary values on $E$ of $e^{i \arg \sqrt{\Delta^2 - 4}} = \sqrt{\Delta^2 - 4} \,/\, |\sqrt{\Delta^2 - 4}|$ (marked $+i$ or $-i$ on each band). Note the sign change occurs even at a closed gap.

It follows from (10.53) or from Exercise 7.20 that $i\Theta'$ is a Herglotz function and that $i\Theta'(z) = -1/z + O(1/z^2)$ as $z \to \infty$. By Proposition 7.32, its Herglotz representation is of the form
$$i\Theta'(z) = \int \frac{1}{\lambda - z} \, d\nu(\lambda), \tag{10.55}$$
where $\nu$ is a probability measure. By taking boundary values of (10.53), this measure can be explicitly obtained in the form
$$d\nu(\lambda) = \frac{1}{\pi} \chi_E(\lambda) \left| \frac{1}{(\lambda - \lambda_1)(\lambda - \lambda_{2q})} \prod_{j=1}^{q-1} \frac{(\lambda - \kappa_j)^2}{(\lambda - \lambda_{2j})(\lambda - \lambda_{2j+1})} \right|^{1/2} d\lambda.$$
The measure $\nu$ is called the density of states.

Lemma 10.65. On $\mathbb{C}_+$,
$$\Theta(z) = -\frac{i}{q} \log(a_1 \cdots a_q) + i \int \log(z - x) \, d\nu(x). \tag{10.56}$$

Proof. Both sides of the proposed equality are analytic functions in $\mathbb{C}_+$ and, by (10.55), have equal derivatives. Thus, $\Theta(z) = c + i \int \log(z - x) \, d\nu(x)$ for some complex constant $c$. To find the constant $c$, we will compare the asymptotics as $z \to \infty$, using the branch of $\log$ with $-\pi < \operatorname{Im} \log < \pi$. Since $\log(z - x) - \log z = \log(1 - x/z) \to 0$ as $z \to \infty$ uniformly in $x \in E$, it follows that
$$i \int \log(z - x) \, d\nu(x) = i \int \log z \, d\nu(x) + o(1) = i \log z + o(1)$$
as $z \to \infty$, $z \in \mathbb{C}_+$. Since $m_{+,k}(z) = (-1/z)(1 + o(1))$ for each $k$, (10.47) implies that
$$\Theta(z) = -\pi - \frac{i}{q} \sum_{k=0}^{q-1} \log a_k - i \log(-1/z) + o(1) = -\frac{i}{q} \sum_{k=0}^{q-1} \log a_k + i \log z + o(1).$$

Comparing these asymptotics allows us to read off $c$, and concludes the proof.

Proposition 10.66. $\Theta$ extends continuously to $\overline{\mathbb{C}_+}$. This extension obeys the following.
(a) $\operatorname{Im} \Theta = 0$ on $E$.
(b) $\operatorname{Re} \Theta = -\pi(q - j)/q$ on $[\lambda_{2j}, \lambda_{2j+1}]$ for $j = 1, \dots, q-1$.
(c) $\operatorname{Re} \Theta = 0$ on $[\lambda_{2q}, \infty)$.
(d) $\operatorname{Re} \Theta = -\pi$ on $(-\infty, \lambda_1]$.

This proposition describes the image of $\Theta$ on $\mathbb{R}$ as a generalized polygonal curve, with open gaps mapped to vertical line segments traversed up and then down; see Figure 10.3.

Figure 10.3. Image of $\Theta(\mathbb{R})$ for a 4-periodic Jacobi matrix with closed first gap.

Proof. It is already known that $\Theta$ has an analytic extension through any interval $(c,d) \subset \mathbb{R}$ which contains no zeros of $\Delta^2 - 4$. Consider $\lambda_k$, a zero of $\Delta^2 - 4$. By the product formula (10.53),
$$\Theta'(z) = O(|z - \lambda_k|^{-1/2}), \qquad z \to \lambda_k, \ z \in \mathbb{C}_+.$$
By the mean value theorem, this implies that
$$|\Theta(z) - \Theta(w)| = O(\epsilon^{1/2}), \qquad z, w \in D_\epsilon(\lambda_k) \cap \mathbb{C}_+, \quad \epsilon \to 0, \tag{10.57}$$
and by continuity, this holds also for $z, w \in D_\epsilon(\lambda_k) \cap \overline{\mathbb{C}_+} \setminus \{\lambda_k\}$. This implies that $\Theta$ has a limit at $\lambda_k$; namely, for any sequence $z_n \to \lambda_k$ with $z_n \in \overline{\mathbb{C}_+} \setminus \{\lambda_k\}$, (10.57) implies that $\Theta(z_n)$ is a Cauchy sequence, and (10.57) also implies that its limit is independent of the choice of $z_n \to \lambda_k$.

By continuity, the extension obeys $\Delta(z) = 2\cos(q\Theta(z))$. Then $\Delta(z) \in [-2,2]$ on $E$, so $\Theta(z) \in \mathbb{R}$ for $z \in E$. Combining this with the observation that $\Theta'$ is purely imaginary in gaps gives $\Theta(\lambda_{2j+1}) = \Theta(\lambda_{2j})$. Meanwhile, at band edges (and only at band edges), $q\Theta(z) \in \pi\mathbb{Z}$. Since $\Theta' > 0$ on band interiors, this implies that $\Theta(\lambda_{2j}) - \Theta(\lambda_{2j-1}) = \pi/q$ for each $j$. It follows that $\Theta(\lambda_{2q}) - \Theta(\lambda_1) = \pi$, and since $-\pi \le \operatorname{Re} \Theta \le 0$ on $\overline{\mathbb{C}_+}$, it follows that $\Theta(\lambda_{2q}) = 0$. Thus, $\Theta(\lambda_{2j}) = \Theta(\lambda_{2j+1}) = -\pi(q - j)/q$ for all $j$, which completes the proof.

Figure 10.4. Images of analytic extensions of $\Theta$ through $(\lambda_{2q}, \infty)$ and through $(-\infty, \lambda_1)$.

Corollary 10.67. The analytic extension of $\Theta(z)$ through $\mathbb{C}_+ \cup (\lambda_{2j}, \lambda_{2j+1}) \cup \mathbb{C}_-$ obeys
$$\Theta(\bar z) = -\overline{\Theta(z)} - 2\pi \frac{q - j}{q}.$$
See Figure 10.4.

Proof. This follows from the reflection principle since $\operatorname{Re} \Theta(z) = -\frac{q-j}{q}\pi$ for $z \in (\lambda_{2j}, \lambda_{2j+1})$.

Corollary 10.67 shows that analytic extensions of $\Theta$ through different gaps differ, but only by additive real constants. Thus, although $\Theta$ does not have an analytic extension to $\mathbb{C} \setminus E$, its imaginary part has a harmonic extension to $\mathbb{C} \setminus E$:

Corollary 10.68. The function defined on $\mathbb{C}_+$ by $L(z) = \operatorname{Im} \Theta(z)$ has an extension to $\mathbb{C}$ which is a positive harmonic function on $\mathbb{C} \setminus E$, continuous on $\mathbb{C}$, and zero on $E$. Moreover,
$$L(z) = -\frac{1}{q} \log(a_1 \cdots a_q) + \int \log|z - x| \, d\nu(x) \qquad \forall z \in \mathbb{C}, \tag{10.58}$$
and
$$L(z) = \log|z| - \frac{1}{q} \log(a_1 \cdots a_q) + o(1), \qquad z \to \infty.$$

Proof. Harmonicity on $\mathbb{C} \setminus E$ and $L(\bar z) = L(z)$ follow from Corollary 10.67. Continuity on $\mathbb{C}$ follows from Proposition 10.66 and symmetry, and the integral representation for $L$ follows from Lemma 10.65 on $z \in \mathbb{C} \setminus E$. It remains to prove that the integral representation holds also for $z = \lambda \in E$. By the monotone convergence theorem,
$$L(\lambda + i) - L(\lambda + i/n) = \int \log\left| \frac{\lambda + i - x}{\lambda + i/n - x} \right| d\nu(x) \to \int \log\left| \frac{\lambda + i - x}{\lambda - x} \right| d\nu(x)$$
as $n \to \infty$. Subtracting this from $L(\lambda + i)$ and using $L(\lambda + i/n) \to L(\lambda)$ shows that the integral representation holds at $\lambda$. The asymptotic behavior of $L$ follows from that of $\Theta$.
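Since the eigenvalues of $T_q(z)$ are $e^{\pm iq\Theta(z)}$ and $L = \operatorname{Im} \Theta$, the quantity $L(z)$ can be evaluated as $\frac{1}{q}\log$ of the larger eigenvalue modulus of $T_q(z)$. The following sketch, with hypothetical 2-periodic coefficients and helper names of our own, uses this to check three statements of Corollary 10.68: $L = 0$ on $E$, $L > 0$ off $E$, and $L(z) = \log|z| - \frac{1}{q}\log(a_1 \cdots a_q) + o(1)$. The ordering of the transfer matrix product is an assumption.

```python
import cmath
import math

def one_step(a, b, z):
    """1-step transfer matrix A(a, b; z) as a 2x2 nested tuple."""
    return (((z - b) / a, -1.0 / a), (a, 0.0))

def matmul(X, Y):
    """Product X @ Y of two 2x2 matrices."""
    return tuple(
        tuple(X[i][0] * Y[0][j] + X[i][1] * Y[1][j] for j in range(2))
        for i in range(2)
    )

def monodromy(a, b, z):
    """T_q(z) = A(a_q, b_q; z) ... A(a_1, b_1; z); the ordering is an assumption."""
    T = ((1.0, 0.0), (0.0, 1.0))
    for ak, bk in zip(a, b):
        T = matmul(one_step(ak, bk, z), T)
    return T

def lyapunov(a, b, z):
    """L(z) = (1/q) log of the largest eigenvalue modulus of T_q(z)."""
    T = monodromy(a, b, z)
    delta = T[0][0] + T[1][1]
    root = cmath.sqrt(delta * delta - 4)
    # Eigenvalues of a determinant-1 matrix with trace delta.
    eta_1, eta_2 = (delta + root) / 2, (delta - root) / 2
    return math.log(max(abs(eta_1), abs(eta_2))) / len(a)

# Hypothetical 2-periodic coefficients.
a = [1.0, 2.0]
b = [0.3, -0.1]

L_band = lyapunov(a, b, 2.0)   # Delta(2.0) lies in (-2, 2), so 2.0 is in E
L_gap = lyapunov(a, b, 0.1)    # Delta(0.1) < -2, so 0.1 lies in a gap
z_far = 1.0e4
L_far = lyapunov(a, b, z_far)
asymptote = math.log(z_far) - math.log(math.prod(a)) / len(a)

print(L_band, L_gap, L_far - asymptote)
```

Taking the maximum of the two eigenvalue moduli avoids the cancellation that affects the smaller eigenvalue when $|\Delta|$ is large.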

The function $L$ is called the Lyapunov exponent and (10.58) is the Thouless formula. Our definition of $L$ in terms of $\Theta$ implies that for all $z \in \mathbb{C}$, $e^{qL(z)}$ is the modulus of the largest eigenvalue of $T_q(z)$, and that $L(z)$ describes the exponential growth/decay rates of the eigensolutions and the exponential growth rate of transfer matrices (Exercise 10.24). This last property is usually taken as the definition in more general settings.

The distribution function $N(\lambda) = \nu((-\infty, \lambda])$ is called the integrated density of states. It can be proved (Exercise 10.25) that it is, up to an affine substitution, equal to the function $\operatorname{Re} \Theta$ on $\mathbb{R}$ and, as a consequence, that $\nu$ gives equal weight to each band of the spectrum:
$$\nu([\lambda_{2j-1}, \lambda_{2j}]) = \frac{1}{q}, \qquad j = 1, \dots, q. \tag{10.59}$$
Corollary 10.68 implies that $L$ is a subharmonic function on $\mathbb{C}$. Moreover, the properties of the Lyapunov exponent have a remarkable interpretation in the language of potential theory [4, 72]. Without introducing the terminology, we will point out that interpretation here.

Corollary 10.69. The Lyapunov exponent $L$ is equal to the potential theoretic Green's function for the domain $\hat{\mathbb{C}} \setminus E$ with the pole at $\infty$. The measure $\nu$ is the equilibrium measure for $E$. The logarithmic capacity of the set $E$ is $\operatorname{Cap} E = (a_1 \cdots a_q)^{1/q}$.

In particular, this means that the measure $\nu$ is uniquely determined by the set $E$. Writing $E$ as a disjoint union of closed intervals, each of those intervals may contain more than one spectral band, but (10.59) implies that the weight of each interval must be a multiple of $1/q$. This presents constraints for which finite unions of intervals can be $q$-periodic spectra. Moreover, (10.59) then uniquely determines the locations of closed gaps.

Finally, we point out a remarkable interpretation of the Marchenko–Ostrovski map as a conformal map. An analytic map is said to be conformal if it is injective. We recall two facts from complex analysis:

Lemma 10.70. If $f : \mathbb{C}_+ \to \mathbb{C}$ is analytic and $\operatorname{Re} f' > 0$ on $\mathbb{C}_+$, then $f$ is injective.

Proof. For any $z_1, z_2 \in \mathbb{C}_+$, $z_1 \neq z_2$, by the mean value theorem,
$$\operatorname{Re} \frac{f(z_2) - f(z_1)}{z_2 - z_1} = \operatorname{Re} f'(z_1 + t(z_2 - z_1)) > 0$$
for some $t \in (0,1)$, so $f(z_1) \neq f(z_2)$.


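For any injective map $f$ on $\mathbb{C}_+$, let us denote $\Pi = f(\mathbb{C}_+)$. Clearly, $\Pi$ is connected, and the open mapping theorem implies $\Pi$ is an open set.

Lemma 10.71. Assume that a conformal map $f : \mathbb{C}_+ \to \mathbb{C}$ extends to a continuous map on the closure of $\mathbb{C}_+$ in $\hat{\mathbb{C}}$. Denote $\Pi = f(\mathbb{C}_+)$. Then $f(\partial\mathbb{C}_+) = \partial\Pi$.

Proof. It is a general fact about continuous maps and closures that $f(\overline{\mathbb{C}_+}) \subset \overline{f(\mathbb{C}_+)}$. On the other hand, $f(\overline{\mathbb{C}_+})$ is compact as a continuous image of a compact set; in particular, it is closed. Since it contains $f(\mathbb{C}_+)$, we conclude $f(\overline{\mathbb{C}_+}) = \overline{\Pi}$.

Let $z_0 \in \mathbb{C}_+$ and $w_0 = f(z_0)$. Let $\delta > 0$ be such that $D_\delta(z_0) \subset \mathbb{C}_+$. The set $U = f(D_\delta(z_0))$ is open, so it contains some $D_\epsilon(w_0)$. Since $f$ is injective on $\mathbb{C}_+$, it follows that for all $z \in \mathbb{C}_+ \setminus D_\delta(z_0)$, $|f(z) - w_0| \ge \epsilon$. By continuity, the same holds for $z$ in the boundary of $\mathbb{C}_+$. Thus, $f(\partial\mathbb{C}_+) \cap f(\mathbb{C}_+) = \emptyset$. Thus, $f(\partial\mathbb{C}_+) = f(\overline{\mathbb{C}_+}) \setminus f(\mathbb{C}_+) = \overline{\Pi} \setminus \Pi = \partial\Pi$.

Applying these to the Marchenko–Ostrovski map $\Theta$, since we know the description of $\Theta(\mathbb{R})$, we conclude:

Corollary 10.72. The Marchenko–Ostrovski map maps $\mathbb{C}_+$ bijectively to
$$\Pi = \{z \in \mathbb{C} \mid -\pi < \operatorname{Re} z < 0, \ \operatorname{Im} z > 0\} \setminus \bigcup_{j=1}^{q-1} \left\{ -\frac{q-j}{q}\pi + it \ \Big|\ 0 < t \le h_j \right\},$$
where $h_j = L(\kappa_j) = \max\{L(z) \mid z \in (\lambda_{2j}, \lambda_{2j+1})\}$. The region $\Pi$ is called a comb domain; see [30].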

10.11. Direct spectral theory of periodic Jacobi matrices

We now turn to investigating the spectral properties of the Jacobi matrix $J$ and its half-line restrictions $J_\pm$. This requires the study of Dirichlet eigenvalues. Let us denote the entries of the monodromy matrix by
$$T_q = \begin{pmatrix} t_{11} & t_{12} \\ t_{21} & t_{22} \end{pmatrix}. \tag{10.60}$$
By the representation (10.20), $t_{21} = a_q p_{q-1}$ is a polynomial of degree $q-1$ with positive leading coefficient.

Definition 10.73. We say $z \in \mathbb{C}$ is a Dirichlet eigenvalue for the periodic Jacobi matrix $J$ if $t_{21}(z) = 0$.

There are several equivalent characterizations of Dirichlet eigenvalues:


Lemma 10.74. For any $z \in \mathbb{C}$, the following are equivalent:
(a) $z$ is a Dirichlet eigenvalue.
(b) $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$ is an eigenvector of $T_q(z)$.
(c) There is a nontrivial solution of the Jacobi recursion (10.11) such that $v_0 = v_q = 0$.
(d) $z$ is an eigenvalue of the finite $(q-1) \times (q-1)$ Jacobi matrix
$$J_D = \begin{pmatrix} b_1 & a_1 & & & \\ a_1 & b_2 & a_2 & & \\ & a_2 & \ddots & \ddots & \\ & & \ddots & \ddots & a_{q-2} \\ & & & a_{q-2} & b_{q-1} \end{pmatrix}.$$

Proof. (a) $\iff$ (b) follows from the form of the transfer matrix (10.60). (b) $\iff$ (c) follows from
$$T_q \begin{pmatrix} v_1 \\ a_0 v_0 \end{pmatrix} = \begin{pmatrix} v_{q+1} \\ a_q v_q \end{pmatrix}. \tag{10.61}$$
(c) $\iff$ (d) follows from (10.11) for $n = 1, \dots, q-1$ since $a_0, a_{q-1} \neq 0$.

Corollary 10.75. If $z$ is a Dirichlet eigenvalue, then $z \in \mathbb{R}$ and $\Delta(z) \notin (-2,2)$.

Proof. $z$ is real because it is an eigenvalue of the Hermitian matrix $J_D$. Since $T_q(z)$ is upper triangular, $t_{11}(z) t_{22}(z) = \det T_q(z) = 1$, so $|\Delta(z)| = |t_{11}(z) + 1/t_{11}(z)| \ge 2$ by the arithmetic mean–geometric mean inequality.

Theorem 10.76. The $m$-function for $J_+$ is given on $\mathbb{C}_+$ by
$$m_+ = \frac{t_{22} - t_{11} + \sqrt{\Delta^2 - 4}}{2 t_{21}}. \tag{10.62}$$
Moreover, the polynomial $t_{21}$ has $q-1$ distinct simple real zeros $x_1 < \cdots < x_{q-1}$ and $x_j \in [\lambda_{2j}, \lambda_{2j+1}]$ for all $j$.

Proof. Rewriting (10.48) projectively, $m_+$ obeys
$$-m_+ = \frac{-t_{11} m_+ + t_{12}}{-t_{21} m_+ + t_{22}}.$$
This can be rewritten as a quadratic equation for $m_+$, namely $t_{21} m_+^2 + (t_{11} - t_{22}) m_+ - t_{12} = 0$, whose solutions, using $\det T_q = 1$, are found to be
$$\frac{t_{22} - t_{11} \pm \sqrt{\Delta^2 - 4}}{2 t_{21}}.$$
Since $\sqrt{\Delta^2 - 4}$ is nonzero on $\mathbb{C}_+$, analyticity of $m_+$ dictates that the sign $\pm$ be chosen uniformly throughout $\mathbb{C}_+$. We will now show that this choice of sign, and the location of zeros of $t_{21}$, are dictated by the condition that $m_+$ is a Herglotz function.

On every band $(\lambda_{2j-1}, \lambda_{2j})$, the boundary values of $\operatorname{Im} m_+$ are determined by the square root,
$$\lim_{\epsilon \downarrow 0} \operatorname{Im} m_+(\lambda + i\epsilon) = \lim_{\epsilon \downarrow 0} \operatorname{Im} \frac{\pm\sqrt{\Delta^2 - 4}}{2 t_{21}}(\lambda + i\epsilon).$$
By established properties of $\Delta$ and $t_{21}$, these boundary values are nonzero and have constant sign on each band interior $(\lambda_{2j-1}, \lambda_{2j})$. Since $m_+$ is Herglotz, this sign must be positive on each band interior; by (10.54), this means that the sign of $t_{21}$ must change between $(\lambda_{2j-1}, \lambda_{2j})$ and $(\lambda_{2j+1}, \lambda_{2j+2})$. This means precisely that $t_{21}$ has a zero in each interval $[\lambda_{2j}, \lambda_{2j+1}]$, $j = 1, \dots, q-1$. Since $\deg t_{21} = q-1$, it follows that all zeros $x_j \in [\lambda_{2j}, \lambda_{2j+1}]$ are simple, that there is exactly one per gap, and that $t_{21}$ has no other zeros in $\mathbb{C}$. Since $t_{21}$ has positive leading coefficient, it is positive on the rightmost band interior $(\lambda_{2q-1}, \lambda_{2q})$, so another consideration of the sign of $\operatorname{Im} m_+$ there shows that $m_+$ is given by (10.62).

It should be noted that this proof and result hold even in the case of closed gaps. In the closed gap case, of course, $\lambda_{2j} = x_j = \lambda_{2j+1}$, but even in an open gap, it can still happen that $x_j = \lambda_{2j}$ or $x_j = \lambda_{2j+1}$.

Although (10.62) has been proved on $\mathbb{C}_+$, the right-hand side is already in the form of a meromorphic Herglotz function on $\mathbb{C} \setminus E$. It is therefore immediate from our study of meromorphic Herglotz functions that $\sigma_{\mathrm{ess}}(J_+) \subset E$. A closer look at (10.62) will reveal that $J_+$ has purely absolutely continuous spectrum on $E$ and eigenvalues precisely at those Dirichlet eigenvalues which are not at gap edges:

Theorem 10.77. The operator $J_+$ has essential spectrum $\sigma_{\mathrm{ess}}(J_+) = E$ and discrete spectrum
$$\sigma_d(J_+) = \{x_j \mid 1 \le j \le q-1, \ |t_{11}(x_j)| < 1\}.$$
More precisely, the spectral measure $\mu_+$ is given by $d\mu_+(\lambda) = w_+(\lambda)\,d\lambda + \sum_{j=1}^{q-1} \kappa_j \delta_{x_j}$, where
$$w_+(\lambda) = \begin{cases} \dfrac{\sqrt{4 - \Delta(\lambda)^2}}{2\pi |t_{21}(\lambda)|} & \lambda \in (\lambda_{2j-1}, \lambda_{2j}) \text{ for some } j \\ 0 & \text{else,} \end{cases} \tag{10.63}$$
and $\kappa_j > 0$ if and only if $|t_{11}(x_j)| < 1$.


Proof. To describe the spectral measure $\mu_+$, we use (10.62) and consider boundary values of $m_+$, considering separately the interiors of bands and the remaining isolated points which may contain eigenvalues of $J_+$. On band interiors $(\lambda_{2j-1}, \lambda_{2j})$, $\operatorname{Im} m_+$ extends continuously from $\mathbb{C}_+$ with the boundary values
$$\operatorname{Im} m_+(\lambda + i0) = \operatorname{Im} \frac{\sqrt{\Delta(\lambda)^2 - 4}}{2 t_{21}(\lambda)}.$$
Rewriting using a real positive square root and using Proposition 7.43, the spectral measure $\mu_+$ on band interiors is given by
$$\chi_{(\lambda_{2j-1}, \lambda_{2j})}(\lambda)\,d\mu_+(\lambda) = \chi_{(\lambda_{2j-1}, \lambda_{2j})}(\lambda) \frac{\sqrt{4 - \Delta(\lambda)^2}}{2\pi |t_{21}(\lambda)|}\,d\lambda.$$
The only remaining contributions to the spectral measure can be pure points at a gap edge or at poles of the meromorphic Herglotz function $m_+$. While these can be investigated using (10.62), we proceed differently. As in the proof of Lemma 10.59, if $z \in \mathbb{R}$ is an eigenvalue of $J_+$, then the Weyl solution $v$ must obey $v_0 = v_q = 0$, so $z$ must be a Dirichlet eigenvalue. Moreover, if we normalize the Weyl solution by $v_1 = 1$, then $v_{q+1} = t_{11}(z)$ by (10.61). It follows, as in the proof of Lemma 10.59, that $v$ is square-summable if and only if $|t_{11}(z)| < 1$.

The explicit formula (10.63) shows that $w_+$ is continuous and strictly positive on each band interior $(\lambda_{2j-1}, \lambda_{2j})$. In fact, $w_+$ extends continuously and strictly positively through closed gaps; moreover, its asymptotics at all gap edges can be precisely described (Exercise 10.18). The point masses $\kappa_j$ can also be calculated (Exercise 10.19).

Our discussion gave preferential treatment to the positive half-line and the off-diagonal entry $t_{21}$, but the proofs of Lemma 10.59 and Theorem 10.76 can be repeated for the Weyl solution at $-\infty$ to describe the other eigenvalue and eigenvector of $T_q(z)$, which correspond to $m_-$; we leave the details as an exercise.

Lemma 10.78. If $m_-$ denotes the $m$-function corresponding to $J_-$, then for any $z \in \mathbb{C}_+$,
$$T_q(z) \begin{pmatrix} 1 \\ -a_0^2 m_-(z) \end{pmatrix} = e^{-iq\Theta(z)} \begin{pmatrix} 1 \\ -a_0^2 m_-(z) \end{pmatrix}. \tag{10.64}$$

Lemma 10.79. The entry $t_{12}$ is a polynomial of degree $q-1$ with negative leading coefficient. It has simple zeros $y_j \in [\lambda_{2j}, \lambda_{2j+1}]$ for $j = 1, \dots, q-1$ and no other zeros.

In particular, by Lemma 10.78, we have the other solution of the quadratic equation considered in the proof of Theorem 10.76:


Corollary 10.80. For $z \in \mathbb{C}_+$,
$$\frac{t_{22} - t_{11} - \sqrt{\Delta^2 - 4}}{2 t_{21}} = \frac{1}{a_0^2 m_-}.$$

The explicit formulas for $m_\pm$ give an important relation between their boundary values:

Corollary 10.81. For all $\lambda$ in the interior of $E$,
$$\frac{1}{a_0^2 m_-(\lambda + i0)} = \overline{m_+(\lambda + i0)}. \tag{10.65}$$

This means that periodic Jacobi matrices are reflectionless; in general, a full-line Jacobi matrix is said to be reflectionless if (10.65) holds Lebesgue-a.e. on its spectrum.

Theorem 10.82. The Jacobi matrix $J$ has purely absolutely continuous spectrum on $E$ with multiplicity 2, i.e., $J \cong T_{\lambda, \chi_E(\lambda)\,d\lambda} \oplus T_{\lambda, \chi_E(\lambda)\,d\lambda}$.

Proof. From the formulas for $m_\pm$, we calculate
$$G_{0,0} = -\frac{t_{21}}{a_0^2 \sqrt{\Delta^2 - 4}}, \qquad G_{1,1} = \frac{t_{12}}{\sqrt{\Delta^2 - 4}}.$$
These are analytic Herglotz functions on $\mathbb{C} \setminus E$, which again proves that $\sigma(J) \subset E$. Moreover, they have continuous extensions to band interiors $(\lambda_{2j-1}, \lambda_{2j})$ and at most square root singularities at gap edges, so the corresponding measures are purely absolutely continuous on $(\lambda_{2j-1}, \lambda_{2j})$ and there are no point masses at band edges. It follows that the maximal spectral measure for $J$ is absolutely continuous with respect to $\chi_E(\lambda)\,d\lambda$. Thus, by Corollary 10.41 and spectral properties of $J_\pm$, $J \cong T_{\chi_E(\lambda)\,d\lambda} \oplus T_{\chi_E(\lambda)\,d\lambda}$.

At points where $\Delta(z) = \pm 2$, Lemma 10.57 does not provide a way to distinguish whether $T_q(z)$ has an eigenvalue of geometric multiplicity 1 or 2. Of course, geometric multiplicity 2 means that $T_q(z) = \pm I$, and geometric multiplicity 1 means that $T_q(z)$ is similar to $\begin{pmatrix} \pm 1 & 1 \\ 0 & \pm 1 \end{pmatrix}$. Remarkably, this dichotomy at gap edges is precisely linked to the open gap/closed gap dichotomy.

Proposition 10.83. For $\lambda \in \mathbb{C}$, the following are equivalent.
(a) $\lambda$ is a closed gap of $J$, i.e., $\lambda = \lambda_{2j} = \lambda_{2j+1}$ for some $j \in \{1, \dots, q-1\}$.
(b) $\lambda$ is a double root of $\Delta^2 - 4$.
(c) $T_q(\lambda) \in \{+I, -I\}$.


Proof. (a) $\iff$ (b): This is known by Theorem 10.63.

(a) $\implies$ (c): If $\lambda = \lambda_{2j} = \lambda_{2j+1}$, then $x_j = \lambda$ so $t_{21}(\lambda) = 0$. Analogously, $y_j = \lambda$ so $t_{12}(\lambda) = 0$. Thus, $T_q(\lambda)$ is a diagonal matrix. Now $\det T_q(\lambda) = 1$ and $\operatorname{Tr} T_q(\lambda) = \pm 2$ imply that $T_q(\lambda) = \pm I$.

(c) $\implies$ (b): Since $\det T_q(z) = 1$ for all $z$, differentiating gives $(t_{11} t_{22} - t_{21} t_{12})' = 0$. Using the product rule and evaluating at $z = \lambda$ gives $\pm(t_{11}'(\lambda) + t_{22}'(\lambda)) = 0$, which means that $\Delta'(\lambda) = 0$. Thus, $\lambda$ is a double root of $\Delta^2 - 4$.

Knowing that in band interiors, $m_\pm$ have continuous finite nonzero extensions denoted $m_\pm(\lambda + i0)$, we can single out the following formal eigensolutions of $J$.

Definition 10.84. For $\lambda \in \mathbb{R}$ with $\Delta(\lambda) \in (-2,2)$, Floquet solutions $v^\pm$ are the formal eigensolutions at $\lambda$ which obey
$$\frac{v_1^+}{v_0^+} = -a_0 m_+(\lambda + i0), \qquad \frac{v_1^-}{v_0^-} = -\frac{1}{a_0 m_-(\lambda + i0)}. \tag{10.66}$$
In particular, $v_0^\pm \neq 0$ and $v_1^\pm \neq 0$ (see also Exercise 10.21). Floquet solutions have the following skew-periodic property:

Corollary 10.85. The sequence $e^{\mp in\Theta(\lambda)} v_n^\pm$ is $q$-periodic. Here $\Theta(\lambda)$ denotes the value of $\Theta$ obtained from $\mathbb{C}_+$ by analytic continuation.

Proof. By continuity, it follows from (10.48) and (10.64) that
$$\begin{pmatrix} v_{q+1}^\pm \\ a_0 v_q^\pm \end{pmatrix} = e^{\pm iq\Theta(\lambda)} \begin{pmatrix} v_1^\pm \\ a_0 v_0^\pm \end{pmatrix},$$
so $e^{\mp iq\Theta(\lambda)} v_{q+n}^\pm = v_n^\pm$ for $n = 0, 1$. By forward and backward induction in $n$, $q$-periodicity follows.

Floquet solutions are related to a direct integral representation which provides another approach for spectral theoretic properties of the full-line Jacobi matrix J; see Exercises 10.22 and 10.23.
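The formula (10.62) for $m_+$ can be cross-checked against coefficient stripping. From $\psi_{k+1}/\psi_k = -a_k m_k$ and the Jacobi recursion one obtains $m_{k-1} = 1/(b_k - z - a_k^2 m_k)$, and by periodicity $m_{+,q} = m_+$, so $m_+$ must be a fixed point of the $q$-fold stripping map. The sketch below uses hypothetical coefficients and our own helper names; instead of tracking the branch of $\sqrt{\Delta^2 - 4}$, it selects the root of the quadratic from the proof of Theorem 10.76 for which the eigenvalue $t_{22} - t_{21} m$ of the vector $(-m, 1)^\top$ lies in $\mathbb{D}$, as in Lemma 10.59. The ordering of the transfer matrix product is an assumption.

```python
import cmath

def one_step(a, b, z):
    """1-step transfer matrix A(a, b; z) as a 2x2 nested tuple."""
    return (((z - b) / a, -1.0 / a), (a, 0.0))

def matmul(X, Y):
    """Product X @ Y of two 2x2 matrices."""
    return tuple(
        tuple(X[i][0] * Y[0][j] + X[i][1] * Y[1][j] for j in range(2))
        for i in range(2)
    )

def monodromy(a, b, z):
    """T_q(z) = A(a_q, b_q; z) ... A(a_1, b_1; z); the ordering is an assumption."""
    T = ((1.0, 0.0), (0.0, 1.0))
    for ak, bk in zip(a, b):
        T = matmul(one_step(ak, bk, z), T)
    return T

def m_plus(a, b, z):
    """Root m of t21 m^2 + (t11 - t22) m - t12 = 0 whose eigenvalue t22 - t21 m is in D."""
    (t11, t12), (t21, t22) = monodromy(a, b, z)
    delta = t11 + t22
    root = cmath.sqrt(delta * delta - 4)
    candidates = [(t22 - t11 + s * root) / (2 * t21) for s in (1, -1)]
    return min(candidates, key=lambda m: abs(t22 - t21 * m))

def strip_q_times(a, b, z, m):
    """Apply m -> 1 / (b_k - z - a_k^2 m) for k = q, q-1, ..., 1."""
    w = m
    for ak, bk in reversed(list(zip(a, b))):
        w = 1 / (bk - z - ak**2 * w)
    return w

# Hypothetical 3-periodic coefficients and a test point in the upper half-plane.
a = [1.0, 2.0, 0.5]
b = [0.3, -0.4, 0.1]
z = 0.7 + 0.4j

m = m_plus(a, b, z)
fixed_point_defect = abs(strip_q_times(a, b, z, m) - m)

print(m, fixed_point_defect)
```

Both roots of the quadratic are fixed points of the stripping map; the Herglotz property $\operatorname{Im} m_+ > 0$ is what distinguishes the root selected here, and the other root corresponds to $1/(a_0^2 m_-)$ as in Corollary 10.80.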

10.12. Exercises

10.1. If $J$ is a bounded half-line Jacobi matrix, prove that $\sup_{n \in \mathbb{N}} a_n \le \|J\|$ and $\sup_{n \in \mathbb{N}} |b_n| \le \|J\|$.

10.2. This problem describes criteria for strong and weak operator convergence of a sequence of Jacobi matrices in terms of their coefficients.


(a) Consider half-line Jacobi matrices $J_k$, indexed by $k \in \mathbb{N} \cup \{\infty\}$, such that $J_k$ has coefficients $(a_{k,n}, b_{k,n})_{n=1}^\infty$. Prove that $J_k \xrightarrow{s} J_\infty$ as $k \to \infty$ if and only if
$$\sup_{k \in \mathbb{N}} \sup_{n \in \mathbb{N}} (a_{k,n} + |b_{k,n}|) < \infty,$$
and, for each $n \in \mathbb{N}$, $a_{k,n} \to a_{\infty,n}$ and $b_{k,n} \to b_{\infty,n}$ as $k \to \infty$.
(b) Prove that $J_k \xrightarrow{s} J_\infty$ if and only if $J_k \xrightarrow{w} J_\infty$.

10.3. A finite Favard's theorem: Prove that as a map on the set of $d \times d$ Jacobi matrices, $J \mapsto \mu_{J,\delta_1}$ is a bijection with the set of probability measures on $\mathbb{R}$ whose support consists of exactly $d$ points. Let $J$ be the bounded half-line Jacobi matrix (10.2), and let $(p_n)_{n=0}^\infty$ be the corresponding orthonormal polynomials.
(a) Prove that $p_d(z) = 0$ if and only if $z$ is an eigenvalue of the $d \times d$ Jacobi matrix (10.1) (note the same Jacobi parameters from $J$).
(b) Prove that $p_d$ has $d$ distinct real zeros.

10.4. Find the $m$-function and the spectral measure corresponding to the Jacobi parameters $b_n = 0$ and
$$a_n = \begin{cases} 1 & n \ge 2 \\ \sqrt{2} & n = 1. \end{cases}$$
Hint: You may have to apply Proposition 7.43 away from some singularities and apply Lemma 7.37 to check for the presence of point masses at the singularities.

10.5. Fix $c > 0$ and consider the half-line Jacobi matrix with parameters $b_n = 0$ and
$$a_n = \begin{cases} 1 & n \ge 2 \\ c & n = 1. \end{cases}$$
Prove that $\sigma_{\mathrm{ess}}(J) = [-2, 2]$. For which values of $c > 0$ does $J$ have nonempty discrete spectrum?

10.6. Let $\alpha > 0$ and $\beta \in \mathbb{R}$. Find the spectral measure of the half-line Jacobi matrix with $a_n = \alpha$ and $b_n = \beta$ for all $n$.

10.7. If $J$ is a $d \times d$ Jacobi matrix, we say that a Weyl solution at $z \in \mathbb{C}$ is an eigensolution at $z$ such that $\psi_{d+1} = 0$. Define the $m$-function by (10.3) and prove (10.15).

10.8. Let $J$ be a half-line Jacobi matrix, and let $J_n = S^n J (S^*)^n$ be the $n$ times coefficient-stripped Jacobi matrix. If $I \subset \mathbb{R}$ is an interval such that $\sigma(J_n) \cap I$ contains at most $k$ points, prove that $\sigma(J) \cap I$ contains at most $n + k$ points.

10.9. Let $J$ be a half-line Jacobi matrix, and let $\psi(z)$ be a Weyl solution at $z$.


(a) For z, w ∈ C, prove that J − T?n (w)∗ J T?n (z) = −i(z − w)

n−1 j=0

! ∗ 1 0 ? ? Tj (w) T (z). 0 0 j

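(b) For $z, w \notin \sigma_{\mathrm{ess}}(J)$, prove that
$$W_{n-1}(\psi(w), \psi(z)) - W_n(\psi(w), \psi(z)) = (z - w)\psi_n(w)\psi_n(z).$$
Hint: Multiply the result of (a) by $\begin{pmatrix} m(z) \\ 1 \end{pmatrix}$ from the right and $\begin{pmatrix} m(w) \\ 1 \end{pmatrix}^*$ from the left.
(c) For $z, w \notin \sigma(J)$, if Weyl solutions are normalized by $a_0 \psi_0 = -1$, prove that
$$m(z) - m(w) = (z - w) \sum_{n=1}^\infty \psi_n(w)\psi_n(z).$$
(d) For any sequence $z_k \to z \notin \mathbb{R}$, with Weyl solutions normalized by $a_0 \psi_0 = -1$, prove that $\psi(z_k) \to \psi(z)$ in $\ell^2(\mathbb{N})$. Hint: Use (b) to express $\sum_{n=1}^\infty |\psi_n(z_k) - \psi_n(z)|^2$ in terms of the $m$-function.
(e) For any $z \notin \mathbb{R}$, prove that
$$m'(z) = \sum_{n=1}^\infty \psi_n(z)^2.$$
(f) Generalize (c) and (d) to $z \in \mathbb{R} \setminus \sigma(J)$.

10.10. For fixed $n \in \mathbb{N}$ and $z \in \mathbb{C}_+$, do the circles $\partial D_{n-1}(z)$ and $\partial D_n(z)$ have a nonempty intersection?

10.11. Improve Proposition 10.30 by proving that for any $z \in \mathbb{C}_+$,
$$\bigcap_{n \in \mathbb{N}} \operatorname{int} D_n(z) = \{m(z)\}.$$

10.12. Prove that
$$\lim_{n \to \infty} \int \frac{1}{\pi\bigl(p_n^2(x) + a_n^2 p_{n-1}^2(x)\bigr)}\,dx = 1.$$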

10.13. Prove (10.30) by using $m^{(n)}(z) = -\bigl(f_{T_n^{-1}(z)}\bigr)(-i/a_n^2)$.

10.14. In the setting of Carmona's theorem, prove that for $h \in C(\mathbb{R})$ with $\operatorname{supp} h \subset (-2, 2)$,
$$\int h(x)\,d\mu(x) = \lim_{n \to \infty} \frac{1}{2\pi} \int \frac{h(x)\sqrt{4 - x^2}}{p_n^2(x) - x a_n p_n(x) p_{n-1}(x) + a_n^2 p_{n-1}^2(x)}\,dx. \tag{10.67}$$


Hint: Consider the approximations ! ! m(n, z) m0 (z) ? $ Tn (z) , 1 1 √

where $m_0(z) = \frac{-z + \sqrt{z^2 - 4}}{2}$. This result is motivated by recalling that $m_0(z)$ is the $m$-function corresponding to the free Jacobi matrix, so by coefficient stripping, $m(n, z)$ is the $m$-function corresponding to the Jacobi matrix $J^{(n)}$ with coefficients
$$a_k^{(n)} = \begin{cases} a_k & k \le n \\ 1 & k > n, \end{cases} \qquad b_k^{(n)} = \begin{cases} b_k & k \le n \\ 0 & k > n. \end{cases}$$

10.15. Compute the Weyl $M$-matrix for the full-line Jacobi matrix $J$ with coefficients $a_n \equiv 1$, $b_n \equiv 0$ and use its normal limits on the real line to prove that $J \cong T_{[-2,2],dx} \oplus T_{[-2,2],dx}$. This provides a different proof for Example 10.36.

10.16. Let $J$ be a full-line Jacobi matrix. We proved that for $z \in \mathbb{C} \setminus \sigma_{\mathrm{ess}}(J)$, there exist exponentially decaying Weyl solutions at $\pm\infty$. This problem considers the converse. Assume that for some $z \in \mathbb{C}$, there exist nontrivial eigensolutions $v^\pm$ which decay exponentially at $\pm\infty$, i.e., there exist $C, \gamma > 0$ such that $|v^\pm_{\pm n}| \le C e^{-\gamma n}$ for $n \in \mathbb{N}$. Prove that $z$ is an eigenvalue of $J$ if $v^\pm$ are linearly dependent and $z \notin \sigma(J)$ otherwise.

10.17. Let $J$ be a bounded full-line Jacobi matrix with the following property: for any $z \in \mathbb{R}$ and any sequence $u$ such that $Ju = zu$, if $u_n = O(|n|^\kappa)$ for some $\kappa$, then $u \in \ell^2(\mathbb{Z})$. Prove that $J$ has an orthonormal basis of eigenvectors.

10.18. Consider the spectral density $w_+$ of the half-line periodic Jacobi matrix $J_+$.
(a) If $\lambda_{2j} = \lambda_{2j+1}$ is a closed gap, prove that $\lim_{\lambda \to \lambda_{2j},\, \lambda \in \mathbb{R}} w_+(\lambda)$ exists, is finite and nonzero. Accordingly, $w_+$ has a strictly positive continuous extension to the interior of $E$.
(b) At any open gap edge $\lambda_k$ which is also a Dirichlet eigenvalue, prove that $\lim_{\lambda \to \lambda_k,\, \lambda \in E} |\lambda - \lambda_k|^{1/2} w_+(\lambda)$ exists, is finite and nonzero.
(c) At any open gap edge $\lambda_k$ which is not a Dirichlet eigenvalue, prove that $\lim_{\lambda \to \lambda_k,\, \lambda \in E} |\lambda - \lambda_k|^{-1/2} w_+(\lambda)$ exists, is finite and nonzero.

10.19. For the half-line periodic Jacobi matrix $J_+$, if the Dirichlet eigenvalue $x_j$ is an eigenvalue of $J_+$, prove that
$$\mu_+(\{x_j\}) = \frac{t_{22}(x_j) - t_{11}(x_j)}{t_{21}'(x_j)}.$$


10.20. Borg's theorem: If $J$ is a periodic full-line Jacobi matrix with all gaps closed, i.e., such that $\sigma(J)$ is a single closed interval, prove that it has constant coefficients, i.e., there exist $\alpha > 0$ and $\beta \in \mathbb{R}$ such that $a_n = \alpha$ and $b_n = \beta$ for all $n \in \mathbb{Z}$. Hint: Find the half-line spectral measure $\mu_+$ and compare with Exercise 10.6.

10.21. For any $\lambda$ with $\Delta(\lambda) \in (-2, 2)$, prove that the Floquet solutions defined by (10.66) obey $v_n^\pm \neq 0$ for all $n \in \mathbb{Z}$.

10.22. For $t \in \mathbb{R} \setminus \pi\mathbb{Z}$, consider the $q \times q$ matrix
$$J(t) = \begin{pmatrix} b_1 & a_1 & & & e^{-it} a_q \\ a_1 & b_2 & a_2 & & \\ & a_2 & \ddots & \ddots & \\ & & \ddots & \ddots & a_{q-1} \\ e^{it} a_q & & & a_{q-1} & b_q \end{pmatrix}.$$
(a) For $t \in \mathbb{R} \setminus \pi\mathbb{Z}$, prove that $J(t)$ has $q$ distinct real eigenvalues $\rho_1(t) < \cdots < \rho_q(t)$ which are precisely the solutions $\rho$ of $\Delta(\rho) = 2\cos t$.
(b) On the intervals $(0, \pi)$ and $(\pi, 2\pi)$, prove that $\rho_j(t)$ are real analytic functions of $t$ with nonzero derivative.
(c) On the intervals $(0, \pi)$ and $(\pi, 2\pi)$, prove that there is a family of unitary $q \times q$ matrices $U(t)$ which depends real analytically on $t$ and such that $U(t)^{-1} J(t) U(t)$ are diagonal matrices. Hint: Relate $J(t)$ with Floquet solutions.

10.23. Consider the Hilbert space
$$L^2([0, 2\pi], dt)^q = \bigoplus_{j=1}^{q} L^2([0, 2\pi], dt)$$
viewed also as a space of square-integrable vector-valued functions $[0, 2\pi] \to \mathbb{C}^q$.
(a) Prove that the mod $q$ Fourier decomposition $F_q : L^2([0, 2\pi], dt)^q \to \ell^2(\mathbb{Z})$ defined by
$$(F_q f)_{kq+r} = \int_0^{2\pi} f_r(t) e^{-ikt}\,dt, \qquad k \in \mathbb{Z}, \ r = 1, \dots, q,$$
is unitary.
(b) Prove that, with the matrices $J(t)$ from the previous exercise, $(F_q^{-1} J F_q f)(t) = J(t) f(t)$. This is described as the direct integral representation, since the right-hand side can be viewed as a pointwise (in $t$) multiplication


by $J(t)$ on Hilbert spaces $\mathbb{C}^q$, which is similar to a direct sum construction, but is parametrized by $t \in [0, 2\pi]$ with Lebesgue measure instead of a countable sum.
(c) Using the unitaries $U(t)$ from the previous exercise, prove that $J$ is unitarily equivalent to $\bigoplus_{j=1}^{q} T_{\rho_j(t),\, \chi_{[0,2\pi]}(t)\, dt}$.
(d) Prove that
$$T_{\rho_j(t),\, \chi_{[0,2\pi]}(t)\, dt} \cong T_{\chi_{[\lambda_{2j-1}, \lambda_{2j}]}(x)\, dx} \oplus T_{\chi_{[\lambda_{2j-1}, \lambda_{2j}]}(x)\, dx}$$
for each $j$ and that $J \cong T_{\chi_E(x)\, dx} \oplus T_{\chi_E(x)\, dx}$. This provides another proof that $J$ has purely absolutely continuous spectrum of multiplicity 2 with essential support $E$.

10.24. Let $L(z)$ denote the Lyapunov exponent associated to the periodic Jacobi matrix $J$.
(a) If $T_n(z)$ denote the $n$-step transfer matrices associated to $J_+$, prove that $L(z) = \lim_{n \to \infty} \frac{1}{n} \log \|T_n(z)\|$. Hint: Reduce to the case where $n$ is a multiple of $q$ and use the spectral radius of $T_q(z)$.
(b) If $z \in \mathbb{C}_+$ and $v$ is a nontrivial eigensolution at $z$, prove that
$$\lim_{n \to \infty} \frac{1}{n} \log |v_n| = -L(z)$$
if $v$ is a Weyl solution at $+\infty$ and
$$\lim_{n \to \infty} \frac{1}{n} \log |v_n| = L(z)$$
if $v$ is linearly independent from the Weyl solution at $+\infty$.

10.25. If $\Theta$ denotes the Marchenko–Ostrovskii map, prove that for all $\lambda \in \mathbb{R}$,
$$\operatorname{Re} \Theta(\lambda) = -\pi \nu((\lambda, \infty)) = -\pi + \pi \nu((-\infty, \lambda])$$
and that $\nu$ has the property (10.59). Hint: Use (10.56) to compute boundary values of $\operatorname{Re} \Theta$.

Chapter 11

One-dimensional Schrödinger operators

Schrödinger operators are operators given by the formal expression $-\Delta + V$ acting on functions in $L^2(\Omega)$, where $\Omega$ is a region in $\mathbb{R}^d$ (with Lebesgue measure), $\Delta$ denotes the Laplacian, and $V$ stands for pointwise multiplication by a real-valued function $V$ on $\Omega$; the function $V$ is often called the potential. Their name, and part of the motivation for their study, comes from quantum mechanics, in which they correspond to the Hamiltonian of a particle confined to a region $\Omega$ with an external potential $V$. However, their study in the one-dimensional case dates back to the work of Sturm and Liouville in 1836 on the boundary value problem:
$$-f'' + V f = \lambda f, \tag{11.1}$$
$$\cos\alpha\, f(0) + \sin\alpha\, f'(0) = 0, \tag{11.2}$$
$$\cos\beta\, f(1) - \sin\beta\, f'(1) = 0. \tag{11.3}$$

Classically, it was common to assume that $V$ is smooth or at least continuous, but the theory applies with only minor changes to integrable potentials, $V \in L^1([0, 1])$. We will study the differential equation (11.1) in Sections 11.1 and 11.2. In Section 11.3, we will interpret the boundary value problem (11.1), (11.2), and (11.3) as a self-adjoint Schrödinger operator $H$ on $L^2([0, 1])$. In these sections, integrability of $V$ will ensure that solutions $f$ and functions in the domain of $H$ have pointwise values of $f$ and $f'$, which play an important role.


The main scope of this chapter is more general: we study differential operators of the form
$$H = -\frac{d^2}{dx^2} + V$$
on the Hilbert space $L^2(I)$, on an open interval $I = (\ell_-, \ell_+) \subset \mathbb{R}$, which can be finite or infinite. We will assume that $V$ is a real-valued potential on $I$ such that $V \in L^1_{\mathrm{loc}}(I)$, i.e., $V$ is integrable on every compact subinterval of $I$. Even for finite endpoints, this is more general than before, since $V$ may not be integrable in neighborhoods of the endpoints: it is only required to be integrable on compact intervals $[c, d] \subset I$. If an endpoint $\ell_\pm$ obeys
$$\ell_\pm \in \mathbb{R} \quad \text{and} \quad V \in L^1([c, d]) \text{ for } [c, d] \subset I \cup \{\ell_\pm\}, \tag{11.4}$$
it is said to be a regular endpoint (of course, the case of both regular endpoints is precisely the special case of Section 11.3). In general, endpoint behavior can be more varied; this is described by the so-called Weyl limit point–limit circle alternative discussed in Section 11.4, which informs us whether a boundary condition is needed at an endpoint.

After regular endpoints, the most often encountered special case is that of an infinite endpoint $\ell_\pm$ at which the potential is bounded below. Consistent with our choice to consider $L^1_{\mathrm{loc}}$-potentials, we will consider the $L^1_{\mathrm{loc}}$-generalization of this condition and study infinite endpoints at which the negative part of the potential, $V_-(x) = \max(0, -V(x))$, is uniformly locally $L^1$, i.e.,
$$\ell_\pm = \pm\infty \quad \text{and} \quad \limsup_{x \to \ell_\pm} \int_x^{x+1} V_-(t)\, dt < \infty. \tag{11.5}$$
We will see in Section 11.4 that in this case $\ell_\pm$ is a limit point endpoint.

Using the Weyl alternative, in Section 11.5, we will describe self-adjoint Schrödinger operators with separated boundary conditions, which are the central object of the entire chapter. In Section 11.6 we will study their resolvents, introducing Weyl solutions and the Green's function.

From Section 11.7 to Section 11.11, we specialize to the setting of one regular endpoint, often called the half-line setting. We introduce the $m$-function and canonical spectral measure $\mu$ of the operator and construct eigenfunction expansions; these are the canonical unitary maps which diagonalize one-dimensional Schrödinger operators, i.e., conjugate them to multiplication operators. These eigenfunction expansions will connect us to the abstract theory of unbounded self-adjoint operators. We will also introduce Weyl disks, which provide a different perspective on the limit


point–limit circle alternative; they are useful for approximations, including approximations of the spectral measure from formal eigensolutions (Carmona's theorem) and continuous dependence of the $m$-function on the potential. The theory in these sections should be seen as the interplay between three main objects: the Schrödinger operator $H$, the $m$-function, and the spectral measure $\mu$. These three objects determine each other uniquely; we will prove that through a local Borg–Marchenko theorem.

In Section 11.12 we study an arbitrary Schrödinger operator with separated boundary conditions and construct the full-line eigenfunction expansion. Once again this will connect us to the abstract theory of unbounded self-adjoint operators. Through the notion of the Weyl $M$-matrix, properties of a Schrödinger operator $H$ on $(\ell_-, \ell_+)$ will be related to properties of Schrödinger operators with the same potential on $(\ell_-, c)$ and $(c, \ell_+)$, both of which have one regular endpoint at $c$ and are taken with a Dirichlet endpoint at $c$. Thus, the effort in Sections 11.7–11.11 is useful for the general case.

In Section 11.13 we study subordinacy theory, which is a very robust way of characterizing spectral properties of a Schrödinger operator in terms of the behavior of its eigensolutions at real values of the spectral parameter.

In Section 11.14, we specialize to Schrödinger operators for which each endpoint is either regular or of the form (11.5), and explore their finer properties. This starts with semiboundedness of the spectrum and includes important estimates about the pointwise behavior of eigensolutions and their derivatives; for example, this includes the result that the boundedness of eigensolutions implies an absolutely continuous spectrum. This theme is continued in Section 11.15 with a Combes–Thomas estimate and Schnol's theorem.

In Sections 11.16 and 11.17 we study periodic Schrödinger operators. This classical setting can be studied in different ways; we will use the Marchenko–Ostrovskii map as the central object.

Other texts about Schrödinger operators include [8, 19, 65, 73–75, 108, 110, 111]. Sturm–Liouville and Schrödinger operators are sometimes considered under weaker regularity assumptions than those assumed here (see, e.g., [8, 28, 44, 83, 110]), using quasi-derivatives.

11.1. An initial value problem

Our goal in this section is to study the initial value problem
$$-f'' + (V - z) f = g, \qquad f(0) = a, \qquad f'(0) = b, \tag{11.6}$$


where $a, b, z \in \mathbb{C}$ and $g \in L^1([0, 1])$. We will work on the unit interval $[0, 1]$ and that affects some of the estimates in this section; however, all qualitative conclusions extend to an arbitrary interval $[c, d]$ by an affine substitution. By a solution of (11.6), we mean a function $f$ which belongs to the class
$$AC^2([0, 1]) = \{ f \in AC([0, 1]) \mid f' \in AC([0, 1]) \}, \tag{11.7}$$
and we interpret the differential equation in (11.6) as equality of $L^1$ functions, i.e., equality Lebesgue-a.e. Properties of the space $AC^2([0, 1])$ are summarized in the following lemma, whose proof is left as Exercise 11.1.

Lemma 11.1. (a) $AC^2([0, 1])$ is a Banach space with the norm
$$\|f\|_{AC^2([0,1])} = |f(0)| + |f'(0)| + \int_0^1 |f''(x)|\, dx. \tag{11.8}$$
(b) For any $y \in [0, 1]$, the point evaluations $f \mapsto f(y)$ and $f \mapsto f'(y)$ are bounded linear functionals on $AC^2([0, 1])$.
(c) $\|f\|_{C([0,1])} \le \|f\|_{AC^2([0,1])}$ for all $f \in AC^2([0, 1])$.
(d) For any $y \in [0, 1]$, the norm (11.8) is equivalent to the norm
$$\|f\| = |f(y)| + |f'(y)| + \int_0^1 |f''(x)|\, dx.$$
(e) For any $f \in AC^2([0, 1])$, there exists a bounded linear functional $\Lambda$ on $AC^2([0, 1])$ such that $\|\Lambda\| = 1$ and $\Lambda(f) = \|f\|_{AC^2([0,1])}$ (this implies that the space $AC^2([0, 1])$ has the property (2.27)).

For the study of the initial value problem (11.6), it is useful to view $V$ as a perturbation of $-\partial_x^2 - z$, so we start with a brief look at the case $V = 0$, $g = 0$. It will be natural to use the quasi-momentum
$$k = \sqrt{-z}.$$
For now, we use this substitution pointwise, with an arbitrary choice of square root. Later, we will focus on $z \in \mathbb{C} \setminus [0, \infty)$, and it will be beneficial to set the analytic branch of $k$ such that $\operatorname{Re} k > 0$. In particular, $-k$ will be a Herglotz function. We will emphasize analyticity of solutions as Banach-space valued functions (see Section 2.7).
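As a quick numerical illustration of the branch choice, the following Python sketch uses the principal square root from the standard `cmath` module, whose branch cut lies along the negative real axis. Applied to $-z$, this should give the branch with $\operatorname{Re} k > 0$ for $z \in \mathbb{C} \setminus [0, \infty)$. The function name `quasi_momentum` and the sample points are illustrative choices, not notation from the text.

```python
import cmath

def quasi_momentum(z):
    """Branch of k = sqrt(-z) with Re k > 0, for z outside [0, infinity).

    cmath.sqrt is the principal branch (cut along the negative reals),
    so applying it to -z moves the cut to z in [0, infinity).
    """
    return cmath.sqrt(-z)

# Sample points in the upper half-plane (illustrative values only).
for z in (1 + 1j, -3 + 0.5j, 2 + 1e-3j):
    k = quasi_momentum(z)
    # For Im z > 0 we expect Re k > 0 and Im(-k) > 0 (Herglotz property of -k).
    print(z, k, k.real > 0, (-k).imag > 0)
```

For points of the upper half-plane, both printed flags should be true, matching the statement that $-k$ maps the upper half-plane to itself.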


Proposition 11.2. (a) For any $z \in \mathbb{C}$, the functions
$$c(x, k) = \sum_{n=0}^{\infty} \frac{k^{2n} x^{2n}}{(2n)!} = \cosh(kx),$$
$$s(x, k) = \sum_{n=0}^{\infty} \frac{k^{2n} x^{2n+1}}{(2n+1)!} = \begin{cases} \dfrac{\sinh(kx)}{k} & z \neq 0 \\ x & z = 0 \end{cases}$$
obey
$$\partial_x s(x, k) = c(x, k), \qquad \partial_x c(x, k) = k^2 s(x, k),$$
and
$$c(0, k) = 1, \qquad (\partial_x c)(0, k) = 0, \qquad s(0, k) = 0, \qquad (\partial_x s)(0, k) = 1.$$
(b) The maps $z \mapsto s(\cdot, k)$, $z \mapsto c(\cdot, k)$ are entire functions from $\mathbb{C}$ to $AC^2([0, 1])$.
(c) The initial value problem
$$-f'' - z f = 0, \qquad f(0) = a, \qquad f'(0) = b \tag{11.9}$$
has a unique solution $f \in AC^2([0, 1])$, given by $f(x) = a\, c(x, k) + b\, s(x, k)$.

Proof. (a) and (b): The functions $c(x, k)$, $s(x, k)$ are defined as power series with even powers of $k$, so they are power series in $z$. Since
$$\left\| \frac{1}{n!} x^n \right\|_{AC^2([0,1])} = \frac{1}{(n-1)!}$$
for $n \ge 1$, the power series converge in $AC^2([0, 1])$ for all $z$, so they define entire functions. The other properties are trivial calculations.

(c): It follows from (a) that $f = a\, c(\cdot, k) + b\, s(\cdot, k)$ is a solution. If the initial value problem had two solutions, their difference $F$ would obey $-F'' - zF = 0$, $F(0) = F'(0) = 0$. It suffices to prove that this implies $F = 0$, keeping in mind that our notion of solution requires equality $-F'' - zF = 0$ only almost everywhere.

The proof is by a Gronwall-type argument. Define $g = |F|^2 + |F'|^2$ and note that $g \ge 0$ and $g \in AC([0, 1])$. Compute
$$g' = 2 \operatorname{Re}(\overline{F} F' + \overline{F'} F'') = 2 \operatorname{Re}((1 - z) \overline{F'} F) \le C g$$
with $C = |1 - z|$. These calculations and inequalities hold Lebesgue-a.e. Now $h(x) = e^{-Cx} g(x)$ is also absolutely continuous and
$$h'(x) = e^{-Cx} (g'(x) - C g(x)) \le 0.$$


By its definition, an absolutely continuous function $h$ such that $h' \le 0$ a.e. obeys
$$h(x) = h(0) + \int_0^x h'(t)\, dt \le h(0).$$
Thus, $g(x) \le e^{Cx} g(0) = 0$, so $g = 0$ and $F = 0$ identically.

We will now note some important estimates. There will be a duality in our estimates: we want estimates uniform on bounded sets of $z$, but we also want good estimates for large $z$. To present both efficiently, we use the notation
$$|||k||| = \max(1, |k|).$$

Lemma 11.3. For all $z \in \mathbb{C}$ and $x \in [0, 1]$,
$$|c(x, k)| \le e^{|\operatorname{Re} k| x}, \tag{11.10}$$
$$|s(x, k)| \le |||k|||^{-1} e^{|\operatorname{Re} k| x}. \tag{11.11}$$

Proof. By Euler's formula,
$$|c(x, k)| \le e^{|\operatorname{Re} k| x}, \qquad |s(x, k)| \le \frac{e^{|\operatorname{Re} k| x}}{|k|}.$$
Another estimate for $s(\cdot, k)$ follows from $(\partial_x s)(x, k) = c(x, k)$ and
$$|s(x, k)| = \left| \int_0^x c(t, k)\, dt \right| \le x e^{|\operatorname{Re} k| x} \le e^{|\operatorname{Re} k| x}.$$
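The power series of Proposition 11.2 and the bounds (11.10), (11.11) can be spot-checked numerically. The Python sketch below sums the series term by term and compares the results against the stated exponential bounds at a few sample points. The helper names (`c_series`, `s_series`, `triple_norm`, `bounds_hold`), the truncation at 60 terms, and the tolerance are assumptions made for this illustration only.

```python
import math

def c_series(x, k, terms=60):
    """Partial sum of c(x, k) = sum_n k^(2n) x^(2n) / (2n)!."""
    total, term = 0j, 1 + 0j
    for n in range(terms):
        total += term
        # ratio of consecutive terms: (k x)^2 / ((2n+1)(2n+2))
        term *= (k * x) ** 2 / ((2 * n + 1) * (2 * n + 2))
    return total

def s_series(x, k, terms=60):
    """Partial sum of s(x, k) = sum_n k^(2n) x^(2n+1) / (2n+1)!."""
    total, term = 0j, complex(x)
    for n in range(terms):
        total += term
        # ratio of consecutive terms: (k x)^2 / ((2n+2)(2n+3))
        term *= (k * x) ** 2 / ((2 * n + 2) * (2 * n + 3))
    return total

def triple_norm(k):
    """The notation |||k||| = max(1, |k|) from the text."""
    return max(1.0, abs(k))

def bounds_hold(x, k, tol=1e-12):
    """Check (11.10) and (11.11) at one point, up to a rounding tolerance."""
    growth = math.exp(abs(complex(k).real) * x)
    ok_c = abs(c_series(x, k)) <= growth + tol
    ok_s = abs(s_series(x, k)) <= growth / triple_norm(k) + tol
    return ok_c and ok_s

for k in (0, 0.5 + 2j, 3 - 1j):
    for x in (0.0, 0.3, 1.0):
        print(k, x, bounds_hold(x, k))
```

The case $k = 0$, $x = 1$ attains equality in both bounds, which is why the check allows a small tolerance.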

We now return to the initial value problem (11.6) and apply the standard idea to rewrite it as an equivalent integral equation. There is more than one way to do this; in this section, we first add $g$, and then $V$.

Lemma 11.4. For any $g \in L^1([0, 1])$, the function $Tg$ defined by
$$(Tg)(x) = \int_0^x s(x - t, k) g(t)\, dt \tag{11.12}$$
is in $AC^2([0, 1])$ and is the unique solution of the initial value problem
$$-f'' - z f = -g, \qquad f(0) = 0, \qquad f'(0) = 0.$$

Proof. Let $f = Tg$. Using Fubini's theorem, it is straightforward to verify
$$f(x) = \int_0^x \int_0^y c(y - t, k) g(t)\, dt\, dy,$$
which implies that $f \in AC([0, 1])$ and
$$f'(x) = \int_0^x c(x - t, k) g(t)\, dt.$$


Similarly, it is proved that $f' \in AC([0, 1])$ and
$$f''(x) = k^2 \int_0^x s(x - t, k) g(t)\, dt + g(x) = -z f(x) + g(x),$$
which implies that $f \in AC^2([0, 1])$ and $f$ solves the initial value problem. If the initial value problem had two solutions, their difference would obey $F(0) = F'(0) = 0$, $-F'' - zF = 0$, so it would be zero by Proposition 11.2.

We now define a linear operator $A$ from $AC^2([0, 1])$ to itself by
$$Af = T(Vf).$$
We use $A$ to rewrite the initial value problem (11.6) as an integral equation:

Proposition 11.5. Fix $a, b, z \in \mathbb{C}$, $g \in L^1([0, 1])$, and $V \in L^1([0, 1])$. A function $f \in AC^2([0, 1])$ is a solution of (11.6) if and only if it is a solution of the integral equation
$$f - Af = a\, c(\cdot, k) + b\, s(\cdot, k) - Tg. \tag{11.13}$$

Proof. If $f$ solves (11.6), then $h = f - a\, c(\cdot, k) - b\, s(\cdot, k)$ obeys $h(0) = h'(0) = 0$ and
$$-h'' - zh = -f'' - zf = g - Vf,$$
so $h = T(Vf - g) = Af - Tg$ by Lemma 11.4.

Conversely, assume that $f$ obeys (11.13). Since $(Af)(0) = (Af)'(0) = 0$, it follows that $f(0) = a$ and $f'(0) = b$. Moreover, by Lemma 11.4, $-(Af)'' - zAf = -Vf$ and $-(Tg)'' - zTg = -g$, so
$$-f'' - zf = -(Af)'' - zAf + (Tg)'' + zTg = -Vf + g,$$
and $f$ solves (11.6).
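The operator $T$ of Lemma 11.4 is easy to evaluate numerically. The following Python sketch approximates $(Tg)(x)$ from (11.12) by a composite Simpson rule and compares it with the closed form $(\cosh(kx) - 1)/k^2$, which is what the integral evaluates to for the constant function $g \equiv 1$ when $k \neq 0$. The function names, the choice of Simpson's rule, and the number of subintervals are assumptions of this illustration, not part of the text.

```python
import cmath

def s(x, k):
    """Free solution s(x, k) = sinh(kx)/k, with the value x at k = 0."""
    return cmath.sinh(k * x) / k if k != 0 else complex(x)

def T(g, x, k, n=200):
    """Composite Simpson approximation of (Tg)(x) = int_0^x s(x-t, k) g(t) dt.

    n is the (even) number of subintervals; g is a callable on [0, 1].
    """
    if x == 0:
        return 0j
    h = x / n
    total = s(x, k) * g(0.0) + s(0.0, k) * g(x)   # endpoint terms
    for j in range(1, n):
        t = j * h
        total += (4 if j % 2 else 2) * s(x - t, k) * g(t)
    return total * h / 3

z = 2 + 1j                       # sample spectral parameter (illustrative)
k = cmath.sqrt(-z)
x = 0.7
approx = T(lambda t: 1.0, x, k)
exact = (cmath.cosh(k * x) - 1) / k ** 2   # closed form for g = 1
print(abs(approx - exact))
```

Differentiating the closed form twice gives $f'' = \cosh(kx) = k^2 f + 1 = -zf + g$, in agreement with the lemma.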

Since $A$ is a Volterra-type operator, we will solve the integral equation by showing that $\|A^n\| \le C^n / (n - 1)!$ for some $C$, and therefore the operator $I - A$ has an inverse given by the convergent series $\sum_{n=0}^{\infty} A^n$. Bounds on the norm of powers of $A$ can be produced directly in $C([0, 1])$ or $AC^2([0, 1])$. But we will soon also want some sharper estimates which follow the growth rate $e^{|\operatorname{Re} k| x}$ already appearing in (11.10) and (11.11); thus, we have already stated the main estimate in that form.

Lemma 11.6. If $|f(x)| \le M e^{|\operatorname{Re} k| x}$ for all $x \in [0, 1]$, then
$$|(A^n f)(x)| \le \frac{M}{n!} |||k|||^{-n} \|V\|_{L^1}^n e^{|\operatorname{Re} k| x}$$
for all $x \in [0, 1]$.


Proof. Denoting $x = t_0$ and iterating (11.12) gives
$$|(A^n f)(x)| = \left| \int_0^x \int_0^{t_1} \cdots \int_0^{t_{n-1}} \left( \prod_{j=1}^{n} s(t_{j-1} - t_j, k) V(t_j) \right) f(t_n)\, dt_n \cdots dt_2\, dt_1 \right|.$$
Applying estimate (11.11) to all the factors $s$ and multiplying with the bound on $f$ gives
$$|(A^n f)(x)| \le M |||k|||^{-n} e^{|\operatorname{Re} k| x} \int_0^x \int_0^{t_1} \cdots \int_0^{t_{n-1}} \prod_{j=1}^{n} |V(t_j)|\, dt_n \cdots dt_2\, dt_1.$$
By using permutations of $(t_1, \dots, t_n)$ and symmetry, the remaining $n$-fold integral is $\left( \int_0^x |V(t)|\, dt \right)^n / n!$, and the proof is complete.

Theorem 11.7. Fix $a, b, z \in \mathbb{C}$, $g \in L^1([0, 1])$, and $V \in L^1([0, 1])$. The initial value problem (11.6) has the unique solution
$$f = \sum_{n=0}^{\infty} A^n (a\, c(\cdot, k) + b\, s(\cdot, k) - Tg). \tag{11.14}$$

Proof. Applying Lemma 11.6 with $M = \|f\|_{C([0,1])}$ implies that
$$\|A^n f\|_{C([0,1])} \le \frac{e^{|\operatorname{Re} k|}}{n!} |||k|||^{-n} \|V\|_{L^1}^n \|f\|_{C([0,1])}.$$
Now $(Af)'' = Vf - zAf$ with $(Af)(0) = (Af)'(0) = 0$ implies that for $f \in AC^2([0, 1])$ and $n \in \mathbb{N}$,
$$\|A^n f\|_{AC^2([0,1])} \le \|V\|_{L^1} \|A^{n-1} f\|_{C([0,1])} + |z| \|A^n f\|_{C([0,1])}.$$
Since $\|f\|_{C([0,1])} \le \|f\|_{AC^2([0,1])}$, this implies an operator norm estimate of the form
$$\|A^n\|_{\mathcal{L}(AC^2([0,1]))} \le \frac{C^n}{(n-1)!}$$
for some constant $C$ independent of $n \in \mathbb{N}$. Thus, the series $\sum_{n=0}^{\infty} A^n$ is convergent in $\mathcal{L}(AC^2([0, 1]))$, and it is then trivial to verify that it is the inverse of $I - A$. By invertibility of $I - A$, (11.13) has the unique solution $f$ given by the convergent series (11.14).

We also want to acknowledge the analyticity of the solution in certain parameters. The following result is sufficient for our purposes:

Corollary 11.8. Let $a = a(z)$ and $b = b(z)$ be analytic functions from some domain $\Omega \subset \mathbb{C}$ to $\mathbb{C}$. Denote by $f_z$ the solution of
$$-f_z'' + (V - z) f_z = g, \qquad f_z(0) = a(z), \qquad f_z'(0) = b(z).$$
Then $z \mapsto f_z$ is an analytic function from $\Omega$ to $AC^2([0, 1])$.


Proof. The terms of the series (11.14) are analytic in $z$ by Lemma 2.72. Since the series converges uniformly on compact subsets of $\Omega$, it is analytic by Lemma 2.71.

As we worked on the space $AC^2([0, 1])$ from the start, we can extract an immediate corollary. Since for any $y \in [0, 1]$ point evaluations $f \mapsto f(y)$ and $f \mapsto f'(y)$ are bounded linear functionals on $AC^2([0, 1])$, it follows that they too are analytic functions of $z$, in the setting of Corollary 11.8. Analyticity of such point evaluations will be used repeatedly.

We will also need joint continuity of the solution in the potential and initial condition.

Corollary 11.9. Consider convergent sequences $a_j \to a_\infty$ and $b_j \to b_\infty$ in $\mathbb{C}$, $g_j \to g_\infty$ in $L^1([0, 1])$, and $V_j \to V_\infty$ in $L^1([0, 1])$. The corresponding solutions of (11.6) converge: $f_j \to f_\infty$ in $AC^2([0, 1])$.

Proof. We first note that in $AC^2([0, 1])$, $a_j c + b_j s - T g_j \to a_\infty c + b_\infty s - T g_\infty$. Since $T : L^1([0, 1]) \to AC^2([0, 1])$ is a bounded linear operator, for every $n \in \mathbb{N}$,
$$(T V_j)^n (a_j c + b_j s - T g_j) \to (T V_\infty)^n (a_\infty c + b_\infty s - T g_\infty).$$
Thus, each term of the series solution (11.14) converges; the terms are uniformly bounded by the Volterra-type estimates above, so $f_j \to f_\infty$.

As already noted, these results can be rescaled to an arbitrary compact interval instead of $[0, 1]$; alternatively, this material could have been developed on a more general compact interval from the start, with small differences (e.g., in Lemma 11.3).
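The series (11.14) also suggests a simple numerical scheme. The Python sketch below discretizes $[0, 1]$ on a uniform grid, approximates the Volterra operator $A$ by the trapezoid rule, and sums finitely many terms $A^n f_0$ with $f_0 = a\, c(\cdot, k) + b\, s(\cdot, k)$ (so $g = 0$). It is only a rough illustration under stated assumptions: the grid size, the number of terms, the quadrature rule, and all function names are choices made here. The test case is a constant potential $V \equiv v_0$, for which the solution with $a = 1$, $b = 0$ is $\cosh(\kappa x)$ with $\kappa^2 = v_0 - z$.

```python
import cmath

def s(x, k):
    """Free solution s(x, k) = sinh(kx)/k, with the value x at k = 0."""
    return cmath.sinh(k * x) / k if k != 0 else complex(x)

def apply_A(f, V_vals, kernel, h):
    """Trapezoid approximation of (Af)(x_i) = int_0^{x_i} s(x_i - t, k) V(t) f(t) dt.

    kernel[m] holds s(m*h, k), so s(x_i - x_j, k) = kernel[i - j].
    """
    out = []
    for i in range(len(f)):
        acc = 0j
        for j in range(i + 1):
            weight = 0.5 if j in (0, i) else 1.0
            acc += weight * kernel[i - j] * V_vals[j] * f[j]
        out.append(acc * h)
    return out

def neumann_solution(V, z, a, b, N=200, n_terms=30):
    """Approximate the solution of (11.6) with g = 0 by a truncated series (11.14)."""
    k = cmath.sqrt(-z)
    h = 1.0 / N
    xs = [i * h for i in range(N + 1)]
    kernel = [s(x, k) for x in xs]
    V_vals = [V(x) for x in xs]
    term = [a * cmath.cosh(k * x) + b * s(x, k) for x in xs]   # n = 0 term
    total = list(term)
    for _ in range(n_terms):
        term = apply_A(term, V_vals, kernel, h)                # next power of A
        total = [t + u for t, u in zip(total, term)]
    return xs, total

z, v0 = 1 + 1j, 2.0                                            # illustrative values
xs, f = neumann_solution(lambda x: v0, z, 1.0, 0.0)
kappa = cmath.sqrt(v0 - z)
print(max(abs(fi - cmath.cosh(kappa * x)) for x, fi in zip(xs, f)))
```

The factorial decay in Lemma 11.6 is what makes a modest number of terms sufficient here; the remaining discrepancy is dominated by the quadrature error of the trapezoid rule.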

11.2. Fundamental solutions and transfer matrices

Fundamental solutions are defined as solutions $u(x, z)$, $v(x, z)$ of the initial value problems
$$\begin{aligned} -\partial_x^2 u + (V - z) u &= 0, & u(0, z) &= 0, & (\partial_x u)(0, z) &= 1, \\ -\partial_x^2 v + (V - z) v &= 0, & v(0, z) &= 1, & (\partial_x v)(0, z) &= 0. \end{aligned} \tag{11.15}$$
They are the subject of this section; their asymptotic behavior as $z \to \infty$ will be of great importance. We begin by noting their explicit series representations, using the notation
$$\Delta_n(x) = \{ t \in \mathbb{R}^n \mid x \ge t_1 \ge t_2 \ge \cdots \ge t_n \ge 0 \}.$$


Proposition 11.10. Fundamental solutions and their first derivatives are given by the series representations
$$u(x, z) = s(x, k) + \sum_{n=1}^{\infty} \int_{\Delta_n(x)} s(x - t_1, k) \left( \prod_{j=1}^{n-1} V(t_j) s(t_j - t_{j+1}, k) \right) V(t_n) s(t_n, k)\, d^n t, \tag{11.16}$$
$$v(x, z) = c(x, k) + \sum_{n=1}^{\infty} \int_{\Delta_n(x)} s(x - t_1, k) \left( \prod_{j=1}^{n-1} V(t_j) s(t_j - t_{j+1}, k) \right) V(t_n) c(t_n, k)\, d^n t, \tag{11.17}$$
$$(\partial_x u)(x, z) = c(x, k) + \sum_{n=1}^{\infty} \int_{\Delta_n(x)} c(x - t_1, k) \left( \prod_{j=1}^{n-1} V(t_j) s(t_j - t_{j+1}, k) \right) V(t_n) s(t_n, k)\, d^n t, \tag{11.18}$$
$$(\partial_x v)(x, z) = k^2 s(x, k) + \sum_{n=1}^{\infty} \int_{\Delta_n(x)} c(x - t_1, k) \left( \prod_{j=1}^{n-1} V(t_j) s(t_j - t_{j+1}, k) \right) V(t_n) c(t_n, k)\, d^n t. \tag{11.19}$$

Proof. By Proposition 11.5, fundamental solutions solve the integral equations
$$u(x, z) = s(x, k) + \int_0^x s(x - t, k) V(t) u(t, z)\, dt, \tag{11.20}$$
$$v(x, z) = c(x, k) + \int_0^x s(x - t, k) V(t) v(t, z)\, dt, \tag{11.21}$$
and Theorem 11.7 gives series expansions for $u$, $v$, which can be written in the forms (11.16) and (11.17). By the proof of Lemma 11.4, the derivatives of fundamental solutions obey
$$(\partial_x u)(x, z) = c(x, k) + \int_0^x c(x - t, k) V(t) u(t, z)\, dt, \tag{11.22}$$
$$(\partial_x v)(x, z) = k^2 s(x, k) + \int_0^x c(x - t, k) V(t) v(t, z)\, dt, \tag{11.23}$$
so substituting (11.16) and (11.17) gives (11.18) and (11.19).


Proposition 11.11. The series expansions in Proposition 11.10 converge uniformly on $(x, z) \in [0, 1] \times \mathbb{C}$ and uniformly for $V$ in bounded subsets of $L^1([0, 1])$. For all $z = -k^2 \in \mathbb{C}$ and $x \in [0, 1]$,
$$|u(x, z)| \le |||k|||^{-1} e^{|\operatorname{Re} k| x + \|V\|_{L^1}},$$
$$|v(x, z)| \le e^{|\operatorname{Re} k| x + \|V\|_{L^1}},$$
$$|(\partial_x u)(x, z)| \le e^{|\operatorname{Re} k| x + \|V\|_{L^1}},$$
$$|(\partial_x v)(x, z)| \le |||k||| e^{|\operatorname{Re} k| x + \|V\|_{L^1}}.$$

Proof. As in the proof of Lemma 11.6, using Lemma 11.3 and
$$\int_{\Delta_n(x)} \prod_{j=1}^{n} |V(t_j)|\, d^n t \le \frac{1}{n!} \|V\|_{L^1}^n,$$
the $n$th term of (11.16) is bounded above by $\frac{1}{n!} |||k|||^{-(n+1)} \|V\|_{L^1}^n e^{|\operatorname{Re} k| x}$. Summing from $n = 0$ to $\infty$, using $|||k|||^{n+1} \ge |||k|||$, and evaluating an exponential series, the estimate for $u$ follows. The other estimates are proved analogously, with the different powers of $|||k|||$ originating in the application of Lemma 11.3.

While the previous proposition only gives upper bounds on $u$, $v$, the next proposition compares $u$, $v$ to $s$, $c$, by viewing $V$ as a perturbation of $-\partial_x^2 - z$; this point of view is especially effective for large $z$.

Proposition 11.12. For all $z \in \mathbb{C}$ and $x \in [0, 1]$,
$$|u(x, z) - s(x, k)| \le |||k|||^{-2} e^{|\operatorname{Re} k| x + \|V\|_{L^1}},$$
$$|v(x, z) - c(x, k)| \le |||k|||^{-1} e^{|\operatorname{Re} k| x + \|V\|_{L^1}},$$
$$|(\partial_x u)(x, z) - c(x, k)| \le |||k|||^{-1} e^{|\operatorname{Re} k| x + \|V\|_{L^1}},$$
$$\left| (\partial_x v)(x, z) - k^2 s(x, k) \right| \le e^{|\operatorname{Re} k| x + \|V\|_{L^1}}.$$

Proof. This is a modification of the previous proof, estimating only terms for $n$ from 1 to $\infty$ and using
$$\sum_{n=1}^{\infty} \frac{1}{n!} |||k|||^{-n-1} \|V\|^n \le |||k|||^{-2} \sum_{n=1}^{\infty} \frac{1}{n!} \|V\|^n \le |||k|||^{-2} e^{\|V\|_{L^1}}$$
to estimate $|u - s|$. The other estimates are proved analogously.
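For a constant potential the fundamental solutions are explicit, which gives a convenient way to see the perturbative estimates in action. In the Python sketch below, $V \equiv v_0$ is assumed constant, so that $u = \sinh(\kappa x)/\kappa$ and $v = \cosh(\kappa x)$ with $\kappa^2 = v_0 - z$, and $\|V\|_{L^1([0,1])} = |v_0|$. The function name `check_prop_11_12`, the helper `sinhc`, and the sample values are illustrative assumptions.

```python
import cmath
import math

def sinhc(kappa, x):
    """sinh(kappa x)/kappa, with the value x at kappa = 0."""
    return cmath.sinh(kappa * x) / kappa if kappa != 0 else complex(x)

def check_prop_11_12(v0, z, x, tol=1e-12):
    """Check the four bounds of Proposition 11.12 for the constant potential V = v0."""
    k = cmath.sqrt(-z)
    kappa = cmath.sqrt(v0 - z)
    nk = max(1.0, abs(k))                         # |||k|||
    E = math.exp(abs(k.real) * x + abs(v0))       # e^{|Re k| x + ||V||_{L^1}}
    s, c = sinhc(k, x), cmath.cosh(k * x)
    u, v = sinhc(kappa, x), cmath.cosh(kappa * x)
    du, dv = cmath.cosh(kappa * x), kappa * cmath.sinh(kappa * x)
    return (abs(u - s) <= E / nk ** 2 + tol
            and abs(v - c) <= E / nk + tol
            and abs(du - c) <= E / nk + tol
            and abs(dv - k * k * s) <= E + tol)

for v0 in (-3.0, 0.5, 2.0):
    for z in (0, 1 + 1j, -400, 50 + 0.1j):
        print(v0, z, all(check_prop_11_12(v0, z, x) for x in (0.0, 0.5, 1.0)))
```

The samples with large $|z|$ illustrate the gain of one or two powers of $|||k|||$ relative to Proposition 11.11.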

The Wronskian of functions $f, g \in AC^2([0, 1])$ is defined as the absolutely continuous function
$$W(f, g) = f g' - f' g. \tag{11.24}$$


The key property that makes this useful is that
$$W(f, g)' = f g'' - f'' g = (-f'' + Vf) g - (-g'' + Vg) f. \tag{11.25}$$
The Wronskian appears in considerations of self-adjointness; here is our first glimpse of that.

Lemma 11.13. If $f, g \in AC^2([0, 1])$ and $-f'' + Vf, -g'' + Vg \in L^2([0, 1])$, then
$$\langle f, -g'' + Vg \rangle - \langle -f'' + Vf, g \rangle = W(\bar{f}, g)(1) - W(\bar{f}, g)(0). \tag{11.26}$$

Proof. This follows by integrating (11.25), with $f$ replaced by $\bar{f}$.

The Wronskian is also valuable when studying eigensolutions:

Lemma 11.14. If $f, g$ are two solutions of $-y'' + Vy = zy$, their Wronskian is independent of $x$. Moreover, $W(f, g) = 0$ if and only if $f, g$ are linearly dependent.

Proof. Independence of $x$ follows from
$$W(f, g)' = f g'' - f'' g = (-f'' + Vf) g - (-g'' + Vg) f = z f g - z g f = 0.$$
By uniqueness of solutions, a solution of $-h'' + Vh = zh$ is trivial if and only if $h(0) = h'(0) = 0$. Applying this to a linear combination $h = c_1 f + c_2 g$, we conclude that $c_1 f + c_2 g = 0$ if and only if
$$\begin{pmatrix} f'(0) & g'(0) \\ f(0) & g(0) \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}. \tag{11.27}$$
Thus, $f, g$ are linearly independent if and only if (11.27) has only the trivial solution, i.e., if and only if $f'(0) g(0) - f(0) g'(0) \neq 0$.

In particular, since the fundamental solutions $u(\cdot, z)$, $v(\cdot, z)$ are eigensolutions at $z$, their Wronskian is independent of $x$; by evaluating at $x = 0$, we obtain
$$W(v(\cdot, z), u(\cdot, z))(x) = W(v(\cdot, z), u(\cdot, z))(0) = 1 \qquad \forall x. \tag{11.28}$$

These considerations can be written in matrix form. Let us introduce the transfer matrices
$$T(x, z) = \begin{pmatrix} (\partial_x u)(x, z) & (\partial_x v)(x, z) \\ u(x, z) & v(x, z) \end{pmatrix}.$$
This is a $2 \times 2$ matrix-valued function, entire in $z$ and absolutely continuous in $x$. The initial value problems for $u$, $v$ translate to
$$\partial_x T(x, z) = \begin{pmatrix} 0 & V(x) - z \\ 1 & 0 \end{pmatrix} T(x, z), \qquad T(0, z) = I,$$


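and (11.28) becomes $\det T(x, z) = 1$. The basic property of the transfer matrix is that it describes the transfer of values from 0 to $x$ for an arbitrary eigensolution at $z$:

Lemma 11.15. If $-f'' + Vf = zf$, then
$$\begin{pmatrix} f'(x) \\ f(x) \end{pmatrix} = T(x, z) \begin{pmatrix} f'(0) \\ f(0) \end{pmatrix}. \tag{11.29}$$

Proof. Any solution of $-f'' + Vf = zf$ can be written as a linear combination of $u$, $v$ as $f = c_1 u + c_2 v$. This implies $f' = c_1 \partial_x u + c_2 \partial_x v$, so
$$\begin{pmatrix} f'(x) \\ f(x) \end{pmatrix} = c_1 \begin{pmatrix} (\partial_x u)(x, z) \\ u(x, z) \end{pmatrix} + c_2 \begin{pmatrix} (\partial_x v)(x, z) \\ v(x, z) \end{pmatrix} = T(x, z) \begin{pmatrix} c_1 \\ c_2 \end{pmatrix}.$$
Evaluating at $x = 0$ determines constants as
$$\begin{pmatrix} c_1 \\ c_2 \end{pmatrix} = \begin{pmatrix} f'(0) \\ f(0) \end{pmatrix},$$
which finally leads to (11.29).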
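The matrix initial value problem for $T(x, z)$ can be integrated numerically. The Python sketch below uses the classical fourth-order Runge–Kutta method, which is a choice made for this illustration and assumes a continuous potential (the text allows general $V \in L^1$). The matrix is stored as a flat list `[T11, T12, T21, T22]`, i.e., $[\partial_x u, \partial_x v, u, v]$. The function name `rk4_transfer`, the step count, and the sample potential $V(x) = x$ are hypothetical.

```python
import cmath

def rk4_transfer(V, z, x_end=1.0, steps=400):
    """Integrate dT/dx = [[0, V(x) - z], [1, 0]] T with T(0) = I by classical RK4.

    Returns [T11, T12, T21, T22] = [u', v', u, v] evaluated at x_end.
    """
    def rhs(x, T):
        a, b, c, d = T
        q = V(x) - z
        return [q * c, q * d, a, b]        # product of the coefficient matrix with T

    def shifted(T, K, w):
        return [t + w * kk for t, kk in zip(T, K)]

    T = [1 + 0j, 0j, 0j, 1 + 0j]           # identity matrix
    h = x_end / steps
    for i in range(steps):
        x = i * h
        s1 = rhs(x, T)
        s2 = rhs(x + h / 2, shifted(T, s1, h / 2))
        s3 = rhs(x + h / 2, shifted(T, s2, h / 2))
        s4 = rhs(x + h, shifted(T, s3, h))
        T = [t + h / 6 * (p + 2 * q + 2 * r + w)
             for t, p, q, r, w in zip(T, s1, s2, s3, s4)]
    return T

T = rk4_transfer(lambda x: x, 1 + 1j)      # sample potential V(x) = x
det = T[0] * T[3] - T[1] * T[2]
print(abs(det - 1))                        # should be close to 0 by (11.28)
```

Since the coefficient matrix has zero trace, the determinant is conserved by the exact flow; the numerical determinant should therefore stay close to 1.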

Transfer matrices will be indispensable for the Weyl disk formalism and various proofs. For now, note that they help us to derive a variation of parameters formula for the solution of (11.6):

Lemma 11.16. The solution of the initial value problem
$$-f'' + (V - z) f = g, \qquad f(0) = a, \qquad f'(0) = b, \tag{11.30}$$
is given by
$$f(x) = a\, v(x, z) + b\, u(x, z) + \int_0^x (v(x, z) u(t, z) - v(t, z) u(x, z)) g(t)\, dt. \tag{11.31}$$

Proof. In terms of $F = \begin{pmatrix} f' \\ f \end{pmatrix}$, (11.30) can be written as
$$F' = \begin{pmatrix} 0 & V - z \\ 1 & 0 \end{pmatrix} F + \begin{pmatrix} -g \\ 0 \end{pmatrix}, \qquad F(0) = \begin{pmatrix} b \\ a \end{pmatrix}.$$
Using $(T^{-1})' = -T^{-1} T' T^{-1}$, this leads to
$$(T(t, z)^{-1} F(t))' = T(t, z)^{-1} \begin{pmatrix} -g(t) \\ 0 \end{pmatrix}, \qquad T(0, z)^{-1} F(0) = \begin{pmatrix} b \\ a \end{pmatrix}.$$
Integrating from 0 to $x$ gives
$$T(x, z)^{-1} F(x) = \begin{pmatrix} b \\ a \end{pmatrix} + \int_0^x T(t, z)^{-1} \begin{pmatrix} -g(t) \\ 0 \end{pmatrix} dt.$$
Using $\det T(t, z) = 1$ and the standard formula for the $2 \times 2$ matrix inverse, multiplying by $T(x, z)$ and taking the second entry of the resulting identity gives (11.31).
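Formula (11.31) can be checked against a case with a closed-form answer. In the Python sketch below, the potential is assumed constant, $V \equiv v_0$ with $\kappa^2 = v_0 - z \neq 0$, so that $u = \sinh(\kappa x)/\kappa$ and $v = \cosh(\kappa x)$; for $g \equiv 1$ the solution of (11.30) is then $a v + b u + (1 - \cosh(\kappa x))/\kappa^2$. The quadrature rule (composite Simpson), function names, and sample values are assumptions of this illustration.

```python
import cmath

def simpson(func, lo, hi, n=200):
    """Composite Simpson rule with n (even) subintervals."""
    if hi == lo:
        return 0j
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * func(lo + j * h)
    return total * h / 3

def variation_of_parameters(v0, z, a, b, g, x):
    """Evaluate (11.31) for the constant potential V = v0 (assumes v0 != z)."""
    kappa = cmath.sqrt(v0 - z)
    u = lambda t: cmath.sinh(kappa * t) / kappa
    v = lambda t: cmath.cosh(kappa * t)
    integral = simpson(lambda t: (v(x) * u(t) - v(t) * u(x)) * g(t), 0.0, x)
    return a * v(x) + b * u(x) + integral

v0, z, a, b, x = 2.0, 1 + 1j, 0.3, -1.2, 0.8      # illustrative values
kappa = cmath.sqrt(v0 - z)
approx = variation_of_parameters(v0, z, a, b, lambda t: 1.0, x)
exact = (a * cmath.cosh(kappa * x) + b * cmath.sinh(kappa * x) / kappa
         + (1 - cmath.cosh(kappa * x)) / kappa ** 2)
print(abs(approx - exact))
```

In this case the kernel simplifies to $v(x) u(t) - v(t) u(x) = -\sinh(\kappa (x - t))/\kappa$, which for $v_0 = 0$ matches the sign of the term $-Tg$ in (11.13).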


Recalling that $u$, $v$ are entire $AC^2([0, 1])$-valued functions of $z$, it is natural to ask about their derivatives in the same sense.

Proposition 11.17. The $AC^2([0, 1])$-valued derivatives of $u$, $v$ are given by
$$(\partial_z u)(x, z) = \int_0^x (v(x, z) u(t, z) - v(t, z) u(x, z)) u(t, z)\, dt, \tag{11.32}$$
$$(\partial_z v)(x, z) = \int_0^x (v(x, z) u(t, z) - v(t, z) u(x, z)) v(t, z)\, dt. \tag{11.33}$$

Proof. Fix $z \in \mathbb{C}$ and consider $u(\cdot, z + h)$ for $h \in \mathbb{C}$. Since
$$-(\partial_x^2 u)(x, z + h) + (V(x) - z) u(x, z + h) = h u(x, z + h),$$
viewing $u(x, z + h)$ as a solution of this inhomogeneous initial value problem and using (11.31) implies
$$u(x, z + h) = u(x, z) + h \int_0^x (v(x, z) u(t, z) - v(t, z) u(x, z)) u(t, z + h)\, dt.$$
If we rewrite this as
$$\frac{u(x, z + h) - u(x, z)}{h} = \int_0^x (v(x, z) u(t, z) - v(t, z) u(x, z)) u(t, z + h)\, dt$$
and use $\lim_{h \to 0} u(\cdot, z + h) = u(\cdot, z)$, the first formula follows. The second formula is proved analogously. Note that it was convenient to evaluate the derivative pointwise; since we already know the derivative exists in $AC^2([0, 1])$, the two must be equal.

As defined, the transfer matrix depends only on the potential $V$. In later sections, it will sometimes be appropriate to incorporate a boundary condition at 0. Corresponding to the boundary condition (11.2) for some $\alpha \in \mathbb{R}$, we consider the eigensolutions $\varphi(x, z)$, $\theta(x, z)$ at $z$, with initial conditions
$$\begin{pmatrix} \partial_x \varphi(0, z) & \partial_x \theta(0, z) \\ \varphi(0, z) & \theta(0, z) \end{pmatrix} = R_\alpha^{-1}, \qquad R_\alpha = \begin{pmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{pmatrix},$$
(the special case $\alpha = 0$ gives $\varphi = u$, $\theta = v$), and consider the transfer matrices
$$T_\alpha(x, z) = R_\alpha \begin{pmatrix} \partial_x \varphi(x, z) & \partial_x \theta(x, z) \\ \varphi(x, z) & \theta(x, z) \end{pmatrix}.$$
Note that $T_\alpha(x, z) = R_\alpha T(x, z) R_\alpha^{-1}$. These obey the initial value problem
$$\partial_x T_\alpha(x, z) = R_\alpha \begin{pmatrix} 0 & V(x) - z \\ 1 & 0 \end{pmatrix} R_\alpha^{-1} T_\alpha(x, z), \qquad T_\alpha(0, z) = I.$$

We conclude this section with an important lemma about the independence of values at different points. Fix $z \in \mathbb{C}$. For any $h \in L^2([0, 1])$, let $f \in AC^2([0, 1])$ be the unique solution of
$$-f'' + (V - z) f = h, \qquad f(0) = 0, \qquad f'(0) = 0. \tag{11.34}$$


Since the solution is unique and linear in $h$, the map $h \mapsto f$ is a linear operator $B : L^2([0, 1]) \to AC^2([0, 1])$. In particular, the values $f(1)$ and $f'(1)$ and their linear combinations, are linear functionals of $h \in L^2([0, 1])$. The following lemma describes the functions which correspond to these functionals in the sense of Riesz's representation theorem, and we obtain important corollaries from this.

Lemma 11.18. Let $V \in L^1([0, 1])$ and $z \in \mathbb{C}$.
(a) Fix $\alpha, \beta \in \mathbb{C}$ and let $g \in AC^2([0, 1])$ be the solution of
$$-g'' + (V - z) g = 0, \qquad g(1) = \alpha, \qquad g'(1) = \beta.$$
For all $h \in L^2([0, 1])$, the solution $f$ of (11.34) obeys
$$\bar{\beta} f(1) - \bar{\alpha} f'(1) = \langle g, h \rangle. \tag{11.35}$$
(b) For any $\gamma, \delta \in \mathbb{C}$, there exists $h \in L^2([0, 1])$ such that the solution of (11.34) obeys $f(1) = \gamma$, $f'(1) = \delta$.
(c) The set of $h \in L^2([0, 1])$, for which the solution $f$ of (11.34) obeys $f(1) = f'(1) = 0$, is the orthogonal complement $\{ g \mid -g'' + (V - z) g = 0 \}^\perp$.

Proof. (a) This is immediate from (11.26).
(b) The initial value problem (11.34) defines a linear map $L^2([0, 1]) \to \mathbb{C}^2$ by $h \mapsto (f(1), f'(1))$. If this map was not onto, its range would be a proper subspace of $\mathbb{C}^2$, so there would exist a choice of $(\alpha, \beta) \neq (0, 0)$ for which (11.35) is the trivial functional. This is a contradiction, because the vector $g$ corresponding to that functional is nontrivial, as a nontrivial solution of $-g'' + (V - z) g = 0$.
(c) $f(1) = f'(1) = 0$ if and only if (11.35) holds for all $\alpha, \beta \in \mathbb{C}$, so if and only if $h$ is orthogonal to all solutions of $-g'' + (V - z) g = 0$.

Part (b), rescaled from $[0, 1]$ to an arbitrary compact interval $[c, d]$, shows that there are no hidden constraints between values of $f$ at different points, where $f$ is an arbitrary function in the domain of a Schrödinger operator.

11.3. Schrödinger operators with two regular endpoints

In this section, we will assume that $V \in L^1([0, 1])$ is real-valued, and study Schrödinger operators on $L^2([0, 1])$ with boundary conditions (11.2) and (11.3); this is an important special case which already illustrates the use of the methods developed above.


Theorem 11.19. Let $V \in L^1([0, 1])$ be real-valued, $\alpha, \beta \in \mathbb{R}$, and let $H$ be the operator on $L^2([0, 1])$ defined by
$$D(H) = \{ f \in AC^2([0, 1]) \mid -f'' + Vf \in L^2([0, 1]),\ f \text{ obeys (11.2) and (11.3)} \}$$
and $Hf = -f'' + Vf$. Then the following hold.
(a) $H$ is self-adjoint.
(b) $H$ has a complete orthonormal basis of eigenvectors, i.e., there is a sequence of $f_n \in L^2([0, 1])$ such that $(f_n)_{n=1}^{\infty}$ is an orthonormal basis of $L^2([0, 1])$ and $H f_n = \lambda_n f_n$ for some $\lambda_n \in \mathbb{R}$.
(c) All eigenvalues are simple, the set of eigenvalues is discrete, and $\sigma(H) = \{ \lambda_n \mid n \in \mathbb{N} \}$.
(d) For all $z \in \mathbb{C} \setminus \sigma(H)$, $(H - z)^{-1}$ is a compact integral operator.

Symmetry of the operator will follow from Lemma 11.13. We will prove further properties by deriving an explicit formula for the inverse $(H - z)^{-1}$ for $z \notin \sigma(H)$ and using the spectral theorem for compact self-adjoint operators. The description of $(H - z)^{-1}$ requires us to consider nontrivial eigensolutions which obey the boundary conditions at 0 or 1, respectively, so we define $\psi^\pm(x, z)$ as the solutions of
$$-\partial_x^2 \psi + V \psi = z \psi, \tag{11.36}$$
which obey
$$\psi^-(0, z) = -\sin\alpha, \qquad (\partial_x \psi^-)(0, z) = \cos\alpha,$$
$$\psi^+(1, z) = -\sin\beta, \qquad (\partial_x \psi^+)(1, z) = -\cos\beta.$$
By Corollary 11.8, $\psi^\pm(\cdot, z)$ are entire functions of $z$. For $\psi^+(\cdot, z)$, to conclude this from Corollary 11.8, use the linear substitution $x = 1 - t$ to reduce to initial conditions at 0. The notation $\psi^\pm(x, z)$ is useful for emphasizing the essential properties of $\psi^\pm$ as functions of $z$, but it will often be convenient to use the more compact notation $\psi_z^\pm(x) = \psi^\pm(x, z)$.

Lemma 11.20. Let $z \in \mathbb{C}$. The kernel $\operatorname{Ker}(H - z)$ is nontrivial if and only if $W(\psi_z^+, \psi_z^-) = 0$. In this case, $\dim \operatorname{Ker}(H - z) = 1$.

Proof. An eigensolution at $z$ must be a multiple of $\psi_z^-$ in order to obey the boundary condition at 0, and a multiple of $\psi_z^+$ in order to obey the boundary condition at 1. Thus, nontrivial eigensolutions exist if and only if $\psi_z^-$, $\psi_z^+$ are linearly dependent, in which case they are multiples of $\psi_z^\pm$. Thus, $z$ is an eigenvalue if and only if $W(\psi_z^+, \psi_z^-) = 0$, and every eigenspace is one dimensional.
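Lemma 11.20 turns the eigenvalue problem into a root-finding problem for the entire function $z \mapsto W(\psi_z^+, \psi_z^-)$. The Python sketch below illustrates this in the simplest case $V = 0$, where $\psi^-(x) = -\sin\alpha\, c(x, k) + \cos\alpha\, s(x, k)$ and $\psi^+(x) = -\sin\beta\, c(1 - x, k) + \cos\beta\, s(1 - x, k)$ satisfy the stated initial conditions, and the Wronskian is evaluated at $x = 0$. The restriction to $V = 0$, the bisection routine, and all function names are assumptions of this example. For Dirichlet conditions ($\alpha = \beta = 0$) the Wronskian reduces to $\sin(\sqrt{\lambda})/\sqrt{\lambda}$ for $\lambda > 0$, with zeros at $\lambda = n^2 \pi^2$.

```python
import cmath
import math

def c(x, k):
    return cmath.cosh(k * x)

def s(x, k):
    return cmath.sinh(k * x) / k if k != 0 else complex(x)

def wronskian(lam, alpha, beta):
    """W(psi^+, psi^-) for V = 0 at real spectral parameter lam, evaluated at x = 0."""
    k = cmath.sqrt(-lam)
    psi_minus, dpsi_minus = -math.sin(alpha), math.cos(alpha)
    psi_plus = -math.sin(beta) * c(1.0, k) + math.cos(beta) * s(1.0, k)
    dpsi_plus = math.sin(beta) * k * k * s(1.0, k) - math.cos(beta) * c(1.0, k)
    # W(f, g) = f g' - f' g as in (11.24); real for real lam, alpha, beta
    return (psi_plus * dpsi_minus - dpsi_plus * psi_minus).real

def bisect(f, lo, hi, iters=80):
    """Plain bisection; assumes f changes sign on [lo, hi]."""
    flo = f(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        fmid = f(mid)
        if (flo > 0) == (fmid > 0):
            lo, flo = mid, fmid
        else:
            hi = mid
    return 0.5 * (lo + hi)

dirichlet = lambda lam: wronskian(lam, 0.0, 0.0)
print(bisect(dirichlet, 5.0, 15.0), math.pi ** 2)
print(bisect(dirichlet, 30.0, 50.0), 4 * math.pi ** 2)
```

The brackets $[5, 15]$ and $[30, 50]$ were chosen by hand so that each contains exactly one sign change; for a general potential one would first need the solutions $\psi^\pm$ themselves, for instance from a numerical integration of (11.36).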


In particular, if $W(\psi_z^+, \psi_z^-) = 0$, then $z \in \sigma(H)$. Conversely:

Proposition 11.21. If $W(\psi_z^+, \psi_z^-) \neq 0$, then $H - z$ has a bounded inverse. The inverse is the compact integral operator
$$((H - z)^{-1} g)(x) = \int_0^1 G(x, y; z) g(y)\, dy \tag{11.37}$$
with the kernel
$$G(x, y; z) = \frac{1}{W(\psi_z^+, \psi_z^-)} \psi_z^-(\min(x, y)) \psi_z^+(\max(x, y)).$$

The kernel $G$ is called Green's function; the idea of the proof is that $G$ constructed in this way obeys
$$(-\partial_x^2 + V(x) - z) G(x, y; z) = \delta_y(x), \tag{11.38}$$
where $\delta_y$ denotes the Dirac delta function centered at $y$. The proof we will present will not use any distributional calculus, but the Heaviside function will appear.

Proof. Since $G(x, y; z)$ is jointly continuous in $(x, y) \in [0, 1]^2$, the right-hand side of (11.37) is a compact integral operator. To prove (11.37), since $H - z$ is injective, it suffices to prove that, for any $g \in L^2([0, 1])$, the function
$$f(x) = \int_0^1 G(x, y; z) g(y)\, dy \tag{11.39}$$
is in $D(H)$ and that $(H - z) f = g$.

For fixed $y$, $G$ is absolutely continuous in $x$ and
$$\partial_x G(x, y; z) = \frac{1}{W(\psi_z^+, \psi_z^-)} \times \begin{cases} (\psi_z^-)'(x) \psi_z^+(y) & x < y \\ \psi_z^-(y) (\psi_z^+)'(x) & x > y. \end{cases} \tag{11.40}$$
The function $\partial_x G(x, y; z)$ has at $x = y$ a jump of size $-1$ because
$$\lim_{x \downarrow y} \partial_x G(x, y; z) - \lim_{x \uparrow y} \partial_x G(x, y; z) = \frac{\psi_z^-(x) (\psi_z^+)'(x) - (\psi_z^-)'(x) \psi_z^+(x)}{W(\psi_z^+, \psi_z^-)},$$
and the numerator is equal to $-W(\psi_z^+, \psi_z^-)$. However, denoting by $h(x) = \frac{1}{2}(1 + \operatorname{sgn} x)$ the Heaviside function, $\partial_x G(x, y; z) + h(x - y) \in AC([0, 1])$ and
$$\partial_x (\partial_x G(x, y; z) + h(x - y)) = \frac{1}{W(\psi_z^+, \psi_z^-)} \times \begin{cases} (\psi_z^-)''(x) \psi_z^+(y) & x < y \\ \psi_z^-(y) (\psi_z^+)''(x) & x > y, \end{cases}$$
so since $\psi_z^\pm$ are solutions of (11.36),
$$\partial_x (\partial_x G(x, y; z) + h(x - y)) = (V(x) - z) G(x, y; z). \tag{11.41}$$
Of course, this is another way of expressing (11.38).


11. One-dimensional Schrödinger operators

For any $s < t$, multiplying (11.40) by $g(y)$ and integrating in $(x, y) \in [s, t] \times [0, 1]$ shows by Fubini's theorem that
\[ \int_s^t \int_0^1 \partial_x G(x, y; z) g(y)\,dy\,dx = \int_0^1 [G(x, y; z)]_{x=s}^{x=t}\, g(y)\,dy = f(t) - f(s). \]
Since $s < t$ is arbitrary, this implies that $f \in AC([0, 1])$ and
\[ f'(x) = \int_0^1 \partial_x G(x, y; z) g(y)\,dy. \tag{11.42} \]
By the same arguments, from (11.41) we obtain
\begin{align*}
\int_s^t (V(x) - z) f(x)\,dx &= \int_s^t \int_0^1 (V(x) - z) G(x, y; z) g(y)\,dy\,dx \\
&= \int_0^1 [\partial_x G(x, y; z) + h(x - y)]_{x=s}^{x=t}\, g(y)\,dy \\
&= f'(t) - f'(s) + \int_s^t g(y)\,dy,
\end{align*}
so $f' \in AC([0, 1])$ and $f'' = (V - z)f - g$.

Finally, using (11.39) and (11.42),
\[ f(0) = \frac{1}{W(\psi_z^+, \psi_z^-)} \int_0^1 \psi_z^-(0)\, \psi_z^+(y) g(y)\,dy, \qquad f'(0) = \frac{1}{W(\psi_z^+, \psi_z^-)} \int_0^1 (\psi_z^-)'(0)\, \psi_z^+(y) g(y)\,dy, \]
so since $\psi_z^-$ obeys the boundary condition at 0, so does $f$. Analogous calculations show that $f$ obeys the boundary condition at 1, so $f \in D(H)$.

Proof of Theorem 11.19. If $f, g \in D(H)$, then by (11.2), $(\bar f(0), \bar f'(0))$ and $(g(0), g'(0))$ are both multiples of $(-\sin\alpha, \cos\alpha)$ in $\mathbb{C}^2$, so
\[ W(\bar f, g)(0) = \begin{vmatrix} \bar f(0) & g(0) \\ \bar f'(0) & g'(0) \end{vmatrix} = 0. \]
Similarly, $W(\bar f, g)(1) = 0$, so
\[ \langle f, Hg \rangle = \langle Hf, g \rangle \qquad \forall f, g \in D(H), \]

i.e., $H$ is a symmetric operator. Thus, its eigenvalues are real by Lemma 8.16. Since all eigenvalues of $H$ are real, the entire function $z \mapsto W(\psi_z^+, \psi_z^-)$ has zeros only on $\mathbb{R}$; in particular, it is not identically zero and the set of its zeros is discrete. For $z \in \mathbb{C}$ with $W(\psi_z^+, \psi_z^-) \neq 0$, the operator $(H - z)^{-1}$ is compact. In particular, if $z$ is real and $W(\psi_z^+, \psi_z^-) \neq 0$, then $G(x, y; z) = \overline{G(y, x; z)}$, so $(H - z)^{-1}$ is a compact self-adjoint operator. By the spectral theorem for


compact self-adjoint operators, it has an orthonormal basis of eigenfunctions $(v_n)_{n=1}^\infty$,
\[ (H - z)^{-1} v_n = a_n v_n \tag{11.43} \]
(note $a_n \neq 0$ because $(H - z)^{-1}$ is injective). The functions $v_n$ are also eigenfunctions of $H$ because (11.43) is equivalent to $(H - z) v_n = a_n^{-1} v_n$ and $H v_n = (a_n^{-1} + z) v_n$. Self-adjointness of $H$ now follows from Example 8.21 applied to $(H - \lambda)^{-1}$ for some $\lambda \in \mathbb{R} \setminus \sigma(H)$.

The above argument was quite qualitative and relied on compactness; however, eigenvalues can be located much more precisely. Let us study the locations of Dirichlet eigenvalues, that is, eigenvalues of $H = -\frac{d^2}{dx^2} + V$ with Dirichlet boundary conditions $\alpha = \beta = 0$. In this case, using the fundamental solution $u$, we note that $u(1, z) = 0$ if and only if $z$ is a Dirichlet eigenvalue. Thus, we want to think of the entire function $u(1, z)$ as an analogue of the characteristic polynomial. In that analogy, the following lemma guarantees equality of algebraic and geometric multiplicities.

Lemma 11.22. If $V$ is real-valued, the function $u(1, z)$ has only simple zeros.

Proof. Assume $u(1, z) = 0$. Then $z \in \mathbb{R}$. From (11.32) it follows that
\[ (\partial_z u)(1, z) = \int_0^1 (v(1, z) u(t, z) - v(t, z) u(1, z))\, u(t, z)\,dt = v(1, z) \int_0^1 u(t, z)^2\,dt. \]

Now $v(1, z) \neq 0$ because of (11.28), and $\int_0^1 u(t, z)^2\,dt \neq 0$ because $u(\cdot, z)$ is a nontrivial real-valued function. Thus, $(\partial_z u)(1, z) \neq 0$.

Using the characteristic function $u(1, z)$, we can obtain more precise information about the distribution of the eigenvalues. This will be an application of Rouché's theorem, using the special case $V = 0$ for comparison. Since complex analytic techniques count zeros with multiplicity, Lemma 11.22 will be useful.

Example 11.23. For $V = 0$ and Dirichlet boundary conditions at 0 and 1, the spectrum of the Schrödinger operator is the set $\{n^2\pi^2 \mid n \in \mathbb{N}\}$.

Proof. $V = 0$ implies $u(1, z) = s(1, k)$, so it suffices to solve the equation $s(1, k) = 0$. This equation has solutions $k = i\pi n$, $n \in \mathbb{Z} \setminus \{0\}$, so the Dirichlet eigenvalues are $z = -k^2 = n^2\pi^2$.
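Example 11.23 can be checked numerically. The sketch below assumes that the free solution is $s(x, k) = \sinh(kx)/k$, which is consistent with the estimate (11.44) but is defined outside this excerpt; the function names are hypothetical.

```python
import cmath
import math


def s(x, k):
    """Assumed free solution s(x, k) = sinh(kx) / k, with s(0) = 0 and s'(0) = 1."""
    return cmath.sinh(k * x) / k


def dirichlet_eigenvalue_free(n):
    """Eigenvalue z = -k^2 corresponding to the zero k = i * pi * n of s(1, k)."""
    k = 1j * math.pi * n
    return (-(k * k)).real


if __name__ == "__main__":
    for n in range(1, 4):
        k = 1j * math.pi * n
        print(n, abs(s(1.0, k)), dirichlet_eigenvalue_free(n))
```

The printed values of $|s(1, i\pi n)|$ vanish up to rounding, and the second column reproduces $n^2\pi^2$.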


Lemma 11.24 (Counting lemma). Consider the operator $H$ corresponding to Dirichlet boundary conditions at 0 and 1 and a real-valued potential $V \in L^1$. For positive integers $N > e^{\|V\|_{L^1}}$, the operator $H$ has exactly $N$ eigenvalues smaller than $(N + \frac{1}{2})^2\pi^2$.

Proof. We begin by noting that the estimate
\[ 3\,|\sinh\sqrt{-z}| > e^{|\operatorname{Re}\sqrt{-z}|} \tag{11.44} \]
holds for $z$ on some curves in $\mathbb{C}$. It holds on the parabolas
\[ |\operatorname{Im}\sqrt{-z}| = \left(N + \tfrac{1}{2}\right)\pi \tag{11.45} \]
for $N \in \mathbb{N}$, because if $\sqrt{-z} = x \pm i(N + \frac{1}{2})\pi$, then
\[ 2\,|\sinh\sqrt{-z}| = 2\,|\cosh x| = |e^x + e^{-x}| > e^{|x|} = e^{|\operatorname{Re}\sqrt{-z}|}. \]
The estimate (11.44) also holds on the parabolas
\[ |\operatorname{Re}\sqrt{-z}| = C\pi \tag{11.46} \]
for $C \geq 1$, since if $\sqrt{-z} = x + iy$ and $|x| \geq 1$, then

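\[ |\sinh(x + iy)|^2 = \sinh^2 x + \sin^2 y \geq \sinh^2 x > \frac{1}{4}\left(e^{2|x|} - 2\right) > \frac{1}{9} e^{2|x|}. \]

[Figure 11.1: plot in the complex $z$-plane (axes $\operatorname{Re} z$ and $\operatorname{Im} z$) showing the curves $|\operatorname{Im}\sqrt{-z}| = \frac{3}{2}\pi$, $|\operatorname{Im}\sqrt{-z}| = \frac{5}{2}\pi$, and $|\operatorname{Re}\sqrt{-z}| = 1.1\pi$, with the points $\pi^2$, $4\pi^2$, $9\pi^2$ marked on the real axis.]

Figure 11.1. Contours used in the proof of Lemma 11.24 and Dirichlet eigenvalues for V = 0.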
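The estimate (11.44) on the two families of curves (11.45) and (11.46) can be spot-checked numerically. The following is a minimal sketch written in the variable $w = \sqrt{-z}$; the sample grids and function names are arbitrary choices made for illustration.

```python
import cmath
import math


def estimate_holds(w):
    """Return True if 3 |sinh w| > exp(|Re w|), i.e. (11.44) with w = sqrt(-z)."""
    return 3 * abs(cmath.sinh(w)) > math.exp(abs(w.real))


def sample_im_curve(N, xs):
    """Sample points w = x + i (N + 1/2) pi on the curve (11.45)."""
    return [complex(x, (N + 0.5) * math.pi) for x in xs]


def sample_re_curve(C, ys):
    """Sample points w = C pi + i y on the curve (11.46)."""
    return [complex(C * math.pi, y) for y in ys]


if __name__ == "__main__":
    xs = [-5 + 0.1 * j for j in range(101)]
    ys = [0.05 * j for j in range(200)]
    print(all(estimate_holds(w) for w in sample_im_curve(3, xs)))
    print(all(estimate_holds(w) for w in sample_re_curve(1.1, ys)))
    # The estimate fails at a zero of sinh, e.g. w = i * pi:
    print(estimate_holds(complex(0.0, math.pi)))
```

The last line illustrates why the contour has to avoid the points $w = i\pi n$: there $\sinh w = 0$, so (11.44) cannot hold.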

11.4. Endpoint behavior


These two kinds of parabolas are illustrated in Figure 11.1. The parabolas (11.45) and (11.46) intersect at two points and define a closed contour which encloses exactly $N$ zeros of $s(1, z)$ and on which (11.44) holds. If $N, C > e^{\|V\|_{L^1}}$, then on this contour, $|z|^{1/2} > \pi e^{\|V\|_{L^1}}$, so by the basic estimate for $u(x, z)$,
\[ |u(1, z) - s(1, k)| \leq |z|^{-1} e^{|\operatorname{Re}\sqrt{-z}| + \|V\|_{L^1}} < \frac{1}{3} |z|^{-1/2} e^{|\operatorname{Re}\sqrt{-z}|} < |s(1, k)|. \]
All the zeros are real by self-adjointness. Thus, by Rouché's theorem, $u(1, z)$ has exactly $N$ zeros including multiplicity on the interval
\[ (-C^2\pi^2, (N + 1/2)^2\pi^2). \]
Since $C$ can be arbitrarily large, $u(1, z)$ has exactly $N$ zeros including multiplicity on the interval $(-\infty, (N + 1/2)^2\pi^2)$. Since this holds for all large enough $N$, the conclusion follows.

The result can also be restated in the following way.

Corollary 11.25. Consider the operator $H$ corresponding to a real-valued potential $V \in L^1([0, 1])$ and Dirichlet boundary conditions at 0 and 1. The spectrum of $H$ is bounded from below. Arranging its elements in increasing order, $\sigma(H) = \{\lambda_n \mid n \in \mathbb{N}\}$, with $\lambda_n < \lambda_{n+1}$, the eigenvalues obey the asymptotics
\[ \lambda_n = n^2\pi^2 + O(n), \qquad n \to \infty. \]

The asymptotic behavior of Dirichlet eigenvalues and eigenvectors can be studied much more precisely; see [71]. The $n$th eigenvector for the Dirichlet boundary conditions has precisely $n - 1$ zeros in $(0, 1)$; this is a special case of Sturm oscillation theory, see survey [90].
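The counting lemma can be illustrated by a shooting computation. The sketch below integrates $-u'' + Vu = zu$, $u(0) = 0$, $u'(0) = 1$ with a classical Runge–Kutta step and counts sign changes of $z \mapsto u(1, z)$; since the zeros are simple (Lemma 11.22), each sign change corresponds to one Dirichlet eigenvalue, provided the grid is fine enough to separate them. The constant potential $V = 1$, the grid sizes, and the function names are assumptions chosen for illustration; for this potential $\|V\|_{L^1} = 1$, so the lemma applies for $N \geq 3$, and the eigenvalues are known to be $n^2\pi^2 + 1$.

```python
import math


def u_at_1(z, V, steps=200):
    """Integrate -u'' + V u = z u, u(0) = 0, u'(0) = 1 by RK4 and return u(1)."""
    h = 1.0 / steps

    def rhs(x, u, p):
        # First-order system: u' = p, p' = (V(x) - z) u
        return p, (V(x) - z) * u

    x, u, p = 0.0, 0.0, 1.0
    for _ in range(steps):
        k1u, k1p = rhs(x, u, p)
        k2u, k2p = rhs(x + h / 2, u + h / 2 * k1u, p + h / 2 * k1p)
        k3u, k3p = rhs(x + h / 2, u + h / 2 * k2u, p + h / 2 * k2p)
        k4u, k4p = rhs(x + h, u + h * k3u, p + h * k3p)
        u += h / 6 * (k1u + 2 * k2u + 2 * k3u + k4u)
        p += h / 6 * (k1p + 2 * k2p + 2 * k3p + k4p)
        x += h
    return u


def count_eigenvalues_below(bound, V, z_min=0.0, grid=300):
    """Count sign changes of z -> u(1, z) on a uniform grid of [z_min, bound].

    z_min must lie below all Dirichlet eigenvalues (true for z_min = 0 if V >= 0).
    """
    zs = [z_min + (bound - z_min) * j / grid for j in range(grid + 1)]
    vals = [u_at_1(z, V) for z in zs]
    return sum(1 for a, b in zip(vals, vals[1:]) if a * b < 0)


def constant_potential(x):
    """Hypothetical test potential V = 1."""
    return 1.0


if __name__ == "__main__":
    N = 3
    bound = (N + 0.5) ** 2 * math.pi ** 2
    print(count_eigenvalues_below(bound, constant_potential))
```

For $N = 3$ the count agrees with the lemma: the eigenvalues $\pi^2 + 1$, $4\pi^2 + 1$, $9\pi^2 + 1$ lie below $(3.5)^2\pi^2$, and $16\pi^2 + 1$ lies above it.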

11.4. Endpoint behavior

In a more general setting, $V$ may not be integrable on the entire interval or the interval may be infinite. We will assume that $V \in L^1_{\mathrm{loc}}(I)$, that is, $V \in L^1([c, d])$ for every compact subinterval $[c, d] \subset I$. To describe the amount of smoothness required from functions in the domain, we denote
\[ AC^2_{\mathrm{loc}}(I) = \{f \in AC_{\mathrm{loc}}(I) \mid f' \in AC_{\mathrm{loc}}(I)\}. \tag{11.47} \]

Lemma 11.26. Let $x_0 \in I$ and $z, \alpha, \beta \in \mathbb{C}$. If $g \in L^1_{\mathrm{loc}}(I)$, then there exists a unique solution $f \in AC^2_{\mathrm{loc}}(I)$ of the initial value problem
\[ -f'' + (V - z)f = g, \qquad f(x_0) = \alpha, \qquad f'(x_0) = \beta. \tag{11.48} \]


Proof. By using Theorem 11.7 and affine transformations of the interval, there is a unique solution on any compact intervals $[x_0, x_0 + L] \subset I$ and $[x_0 - L, x_0] \subset I$. Solutions on overlapping intervals must match, so by using an increasing sequence of compact intervals whose union is $I$, the result follows.

We introduce the local domain
\[ D_{\mathrm{loc}} = \{f \in AC^2_{\mathrm{loc}}(I) \mid -f'' + Vf \in L^2_{\mathrm{loc}}(I)\}. \]
Under the stronger assumption $V \in L^2_{\mathrm{loc}}(I)$, this would be equivalent to $D_{\mathrm{loc}} = \{f \in AC^2_{\mathrm{loc}}(I) \mid f'' \in L^2_{\mathrm{loc}}(I)\}$, but we are working with the more general condition $V \in L^1_{\mathrm{loc}}(I)$; due to this, the local domain depends on $V$. Functions in $D_{\mathrm{loc}}$ need not have any integrability properties at the endpoints, so we also define
\[ X_- = \{f \in D_{\mathrm{loc}} \mid \exists c > \ell_- : f, -f'' + Vf \in L^2((\ell_-, c))\}, \]
\[ X_+ = \{f \in D_{\mathrm{loc}} \mid \exists c < \ell_+ : f, -f'' + Vf \in L^2((c, \ell_+))\}. \]
Since any $f \in AC^2_{\mathrm{loc}}(I)$ is bounded on compact intervals, the defining conditions in $X_\pm$ are actually independent of the choice of $c \in I$. In particular,
\[ X_- \cap X_+ = \{f \in AC^2_{\mathrm{loc}}(I) \mid f \in L^2(I),\ -f'' + Vf \in L^2(I)\}. \]
The sets $X_\pm$ encode required properties of a function near an endpoint $\ell_\pm$, and the following separation property shows that these are independent of each other.

Lemma 11.27. For every $f_\pm \in X_\pm$, there exists $f \in X_- \cap X_+$ such that $f = f_-$ on some interval $(\ell_-, c)$ and $f = f_+$ on some interval $(d, \ell_+)$.

Proof. Fix $[c, d] \subset I$. By applying Lemma 11.18 on intervals $[c, \frac{c+d}{2}]$ and $[\frac{c+d}{2}, d]$, there exists $f \in AC^2([c, d])$ such that $-f'' + Vf \in L^2([c, d])$ and
\[ \begin{pmatrix} f(c) \\ f'(c) \end{pmatrix} = \begin{pmatrix} f_-(c) \\ f_-'(c) \end{pmatrix}, \qquad \begin{pmatrix} f(\frac{c+d}{2}) \\ f'(\frac{c+d}{2}) \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \qquad \begin{pmatrix} f(d) \\ f'(d) \end{pmatrix} = \begin{pmatrix} f_+(d) \\ f_+'(d) \end{pmatrix}. \]
Extend $f$ to a function on $I$ by setting $f(x) = f_-(x)$ for $x < c$ and $f(x) = f_+(x)$ for $x > d$. Then $f \in AC^2_{\mathrm{loc}}(I)$ and $f$ has all the required properties.

This lemma would be almost trivial if we were working with $V \in L^2_{\mathrm{loc}}(I)$: then we could take any $f \in AC^2([c, d])$ with desired values of $f$ and $f'$ at $c$ and $d$.
The set $X_- \cap X_+$ will be a maximal domain for a Schrödinger operator, but to obtain self-adjoint operators, it may be necessary to restrict this domain by boundary conditions. Here, Wronskians will play a role.


The Wronskian of two functions $f, g \in AC^2_{\mathrm{loc}}(I)$ is the function
\[ W(f, g) = f g' - f' g. \tag{11.49} \]
While functions in $X_\pm$ do not necessarily have boundary values at $\ell_\pm$, their Wronskians do:

Proposition 11.28.
(a) For any $f, g \in X_-$, the limit $W_-(f, g) = \lim_{x \downarrow \ell_-} W(f, g)(x)$ is convergent.
(b) For any $f, g \in X_+$, the limit $W_+(f, g) = \lim_{x \uparrow \ell_+} W(f, g)(x)$ is convergent.
(c) For any $f, g \in X_- \cap X_+$,
\[ \langle -f'' + Vf, g \rangle - \langle f, -g'' + Vg \rangle = W_+(\bar f, g) - W_-(\bar f, g). \tag{11.50} \]

Proof. (a) As in Lemma 11.13, on any compact interval $[c, d] \subset I$,
\[ \int_c^d \overline{(-f'' + Vf)}\, g\,dx - \int_c^d \bar f\, (-g'' + Vg)\,dx = W(\bar f, g)(d) - W(\bar f, g)(c). \tag{11.51} \]
Since $f, g, -f'' + Vf, -g'' + Vg \in L^2((\ell_-, d))$, by the Cauchy–Schwarz inequality and dominated convergence, (11.51) has a finite limit as $c \downarrow \ell_-$.

(b) This is analogous to (a).

(c) Taking the limit of (11.51) as $c \downarrow \ell_-$ and $d \uparrow \ell_+$ gives (11.50).

In equation (11.50), the difference of boundary Wronskians appears as the obstruction to self-adjointness. We have to understand these better in order to describe choices of domain which lead to self-adjoint operators. For either choice of $\pm$ sign, the Wronskian $W_\pm$ is an alternating bilinear map, so the framework of Section 8.7 applies. We denote
\[ X_\pm^* = \{f \in X_\pm \mid W_\pm(f, g) = 0 \ \forall g \in X_\pm\}. \tag{11.52} \]
By Section 8.7, this is a vector subspace of $X_\pm$ and $W_\pm$ induces a symplectic form on the quotient space $X_\pm / X_\pm^*$. The first step is to estimate the dimension of this quotient:

Lemma 11.29. At each endpoint $\ell_\pm$, the quotient vector space $X_\pm / X_\pm^*$ is trivial or two dimensional.


Proof. For every $x \in I$, the Wronskian at $x$ corresponds through the point evaluation
\[ f \mapsto \begin{pmatrix} f(x) \\ f'(x) \end{pmatrix} \]
to a symplectic form on $\mathbb{C}^2$, so it obeys the Plücker identity. This follows from Theorem 8.64 or the following calculation. Starting from
\[ \begin{vmatrix} f_1(x) & f_2(x) & f_3(x) & f_4(x) \\ f_1'(x) & f_2'(x) & f_3'(x) & f_4'(x) \\ f_1(x) & f_2(x) & f_3(x) & f_4(x) \\ f_1'(x) & f_2'(x) & f_3'(x) & f_4'(x) \end{vmatrix} = 0, \]
we obtain
\[ W(f_1, f_2) W(f_3, f_4) - W(f_1, f_3) W(f_2, f_4) + W(f_1, f_4) W(f_2, f_3) = 0. \]
This is true at any $x \in I$, so by taking $x \to \ell_\pm$, we conclude that the boundary Wronskian $W_\pm$ also obeys the Plücker identity. Thus, by Theorem 8.64, the dimension of the quotient space $X_\pm / X_\pm^*$ is 0 or 2.

Definition 11.30. At the endpoint $\ell_\pm$, the potential $V$ is limit point if $\dim(X_\pm / X_\pm^*) = 0$ and limit circle if $\dim(X_\pm / X_\pm^*) = 2$.

The choice of terminology is motivated by Weyl disk formalism, which will be explained later. In the remainder of this section, we present two important special cases and explain how they fit in the limit point–limit circle alternative.

A regular endpoint has been defined by (11.4). Informally speaking, regular endpoints behave just like internal points of the interval. For concreteness, let us work with $\ell_-$; of course, analogous statements hold for $\ell_+$. The endpoint $\ell_-$ is called regular for the potential $V$ if it is a finite endpoint ($\ell_- \neq -\infty$) and $V \in L^1((\ell_-, c))$ for some, and therefore all, $c \in I$.

Proposition 11.31. Let $\ell_-$ be a regular endpoint of $V$.
(a) For every $f \in X_-$, the limits
\[ f(\ell_-) := \lim_{x \downarrow \ell_-} f(x), \qquad f'(\ell_-) := \lim_{x \downarrow \ell_-} f'(x) \]
exist, and $f$ extends to a function on $\{\ell_-\} \cup I$ so that for $d \in I$,
\[ f \in AC^2([\ell_-, d]), \qquad -f'' + Vf \in L^2([\ell_-, d]). \]
(b) For every $f, g \in X_-$, $W_-(f, g) = f(\ell_-) g'(\ell_-) - f'(\ell_-) g(\ell_-)$.


(c) The map $T : X_- \to \mathbb{C}^2$ defined by
\[ T : f \mapsto \begin{pmatrix} f(\ell_-) \\ f'(\ell_-) \end{pmatrix} \]
has $\operatorname{Ran} T = \mathbb{C}^2$ and $\operatorname{Ker} T = X_-^*$.
(d) $V$ is limit circle at $\ell_-$.

Proof. (a) For $f \in X_-$, consider $g = -f'' + Vf \in L^2((\ell_-, c)) \subset L^1((\ell_-, c))$. By an affine transformation, Theorem 11.7 can be applied on the interval $[\ell_-, c]$, and it provides existence of a function $F \in AC^2([\ell_-, c])$ with $-F'' + VF = g$ and $F(c) = f(c)$, $F'(c) = f'(c)$. By uniqueness of solutions, $F = f$ on every interval $[\ell_- + \epsilon, c]$ with $\ell_- < \ell_- + \epsilon < c$, so the extension of $f$ is given by $F$.

(b) This follows from (a) by computing the limit in the definition of $W_-$.

(c) The equation $-h'' + Vh = 0$ has a solution $h \in AC^2([\ell_-, c])$ with any prescribed values of $h(\ell_-) = \alpha$ and $h'(\ell_-) = \beta$, and $h$ extends to a function $h \in X_-$, so $\operatorname{Ran} T = \mathbb{C}^2$. Using (b), it follows that $f \in X_-^*$ if and only if $f(\ell_-) = f'(\ell_-) = 0$.

(d) $\dim(X_- / X_-^*) = \dim(X_- / \operatorname{Ker} T) = \dim \operatorname{Ran} T = 2$.

For an infinite endpoint $\ell_\pm = \pm\infty$, a different perspective is needed. Standard Sobolev estimates give upper bounds on $f'$ in terms of $f$ and $f''$, and as a variation of that idea, we need an upper bound on $f'$ in terms of $f$ and $-f'' + Vf$. We will work under the assumption (11.5) and prove that the endpoint is in the limit point case. To avoid confusion, we state the following proposition for $\ell_+$; analogous statements hold for $\ell_-$.

Proposition 11.32. Assume that $V \in L^1_{\mathrm{loc}}(I)$ and that (11.5) holds at the endpoint $\ell_+ = +\infty$. Then the following hold.
(a) For any $f \in X_+$ and $c \in I$, $\int_c^\infty |f'|^2\,dx < \infty$.
(b) $V$ is limit point at $+\infty$.
(c) For any $f \in X_+$, $\lim_{x \to +\infty} f(x) = 0$.
(d) For any $c \in (\ell_-, +\infty)$, there exists $M < \infty$ such that for all $f \in X_+$ with $f(c) = 0$ or $f'(c) = 0$,
\[ \frac{1}{2} \int_c^\infty |f'|^2\,dx \leq M \int_c^\infty |f|^2\,dx + \operatorname{Re} \int_c^\infty \bar f\, (-f'' + Vf)\,dx. \tag{11.53} \]
The constant $M$ depends only on $\sup_{x \geq c} \int_x^{x+1} V_-(t)\,dt$.

The proof starts with a simple Sobolev estimate:


Lemma 11.33. If $[p, q]$ is an interval of length $\frac{1}{2} \leq q - p \leq 1$, such that $f \in AC([p, q])$ and $f' \in L^2([p, q])$, then for any $\epsilon > 0$,
\[ \sup_{x \in [p, q]} |f(x)|^2 \leq \epsilon \int_p^q |f'(x)|^2\,dx + \left(2 + \frac{1}{\epsilon}\right) \int_p^q |f(x)|^2\,dx. \]

Proof. Since $f^2 \in AC([p, q])$ and $(f^2)' = 2ff'$, for any $x, y \in [p, q]$,
\[ |f(x)^2 - f(y)^2| = \left| \int_{\min(x,y)}^{\max(x,y)} (f^2)'\,dt \right| \leq \int_p^q 2|ff'|\,dt. \]
By the Cauchy–Schwarz inequality, this implies
\[ |f(x)|^2 \leq |f(y)|^2 + \epsilon \int_p^q |f'(t)|^2\,dt + \frac{1}{\epsilon} \int_p^q |f(t)|^2\,dt. \tag{11.54} \]
By the mean value theorem, there exists $y \in [p, q]$ such that $|f(y)|^2 = \frac{1}{q - p} \int_p^q |f(t)|^2\,dt$. Since $\frac{1}{q-p} \leq 2$, using that value of $y$ in (11.54) concludes the proof.

Proof of Proposition 11.32. (a) Let us fix $c \in I$ and denote
\[ C = \sup_{x \geq c} \int_x^{x+1} V_-(t)\,dt. \]

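This is a finite constant since $V \in L^1_{\mathrm{loc}}(I)$ and (11.5) holds. For any $d > c$, integration by parts implies
\begin{align*}
\int_c^d |f'|^2\,dx &= \int_c^d \bar f\, (-f'')\,dx + \left[\bar f f'\right]_c^d \\
&= -\int_c^d V |f|^2\,dx + \int_c^d \bar f\, (-f'' + Vf)\,dx + \left[\bar f f'\right]_c^d. \tag{11.55}
\end{align*}
On any interval $[p, q] \subset [c, \infty)$ of length between $1/2$ and $1$, Lemma 11.33 allows us to estimate
\[ -\int_p^q V |f|^2\,dx \leq \int_p^q V_- |f|^2\,dx \leq C\epsilon \int_p^q |f'|^2\,dx + C\left(2 + \frac{1}{\epsilon}\right) \int_p^q |f|^2\,dx. \tag{11.56} \]
For any $d \geq c + 1$, the interval $[c, d]$ can be partitioned into intervals of length between $1/2$ and $1$. Summing over those intervals shows that (11.56) holds also for $p = c$, $q = d$. Combining with (11.55) implies
\[ (1 - C\epsilon) \int_c^d |f'|^2\,dx \leq C\left(2 + \frac{1}{\epsilon}\right) \int_c^d |f|^2\,dx + \operatorname{Re} \int_c^d \bar f\, (-f'' + Vf)\,dx + \operatorname{Re} \left[\bar f f'\right]_c^d. \tag{11.57} \]
To proceed, we need $\epsilon < 1/C$; in fact, let us fix $\epsilon = 1/(2C)$.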


Let us take $\liminf_{d \to \infty}$ of both sides of (11.57). In fact, many of the terms in (11.57) have a limit as $d \to \infty$. The first term on the right-hand side converges because $f \in L^2((c, \infty))$, and the second by the Cauchy–Schwarz inequality since $f, -f'' + Vf \in L^2((c, \infty))$. Now we prove by contradiction that
\[ \liminf_{x \to \infty} \operatorname{Re} \bar f(x) f'(x) \leq 0. \tag{11.58} \]
If this was false, that would imply $(|f(x)|^2)' \geq \delta > 0$ for all $x$ large enough, and therefore $|f|^2$ would grow at least linearly at $+\infty$, contradicting $f \in L^2((c, \infty))$. Thus, taking $\liminf_{d \to \infty}$ of (11.57) implies that
\[ \frac{1}{2} \liminf_{d \to \infty} \int_c^d |f'|^2\,dx \leq 2C(C + 1) \int_c^\infty |f|^2\,dx + \operatorname{Re} \int_c^\infty \bar f\, (-f'' + Vf)\,dx - \operatorname{Re}(\bar f(c) f'(c)). \tag{11.59} \]
The left-hand side is equal to $\frac{1}{2} \int_c^\infty |f'|^2\,dx$ by nonnegativity of $|f'|^2$ and monotone convergence, and this proves that $f' \in L^2((c, \infty))$.

(b) For any $f \in X_+$, the functions $f$ and $f'$ are square-integrable on $(c, \infty)$. For any $f, g \in X_+$, the Cauchy–Schwarz inequality implies $fg', f'g \in L^1((c, \infty))$. Thus, $W(f, g) \in L^1((c, \infty))$; therefore, the only possible value of the limit $W_+(f, g) = \lim_{x \to \infty} W(f, g)(x)$ is 0.

(c) For any $f \in X_+$, $f, f' \in L^2((c, \infty))$ implies that
\[ \lim_{x \to +\infty} \int_x^{x+1} |f(t)|^2\,dt = \lim_{x \to +\infty} \int_x^{x+1} |f'(t)|^2\,dt = 0, \]
and then Lemma 11.33 implies that $\lim_{x \to +\infty} f(x) = 0$.

(d) This follows immediately from (11.59) using (11.58).

Part (d) is a technical estimate, which will be needed twice below from different perspectives. It will be used as an upper bound for the $L^2$-norm of $f'$ in the proof of a Combes–Thomas estimate. It will also be used as a lower bound on $\langle f, Hf \rangle$, once a self-adjoint choice of $H = -\frac{d^2}{dx^2} + V$ has been fixed, and this will imply a lower bound on the spectrum. The interpretation as a lower bound on $\langle f, Hf \rangle$ is particularly intuitive: if $V$ is bounded below, even in the $L^1$ sense considered here, the self-adjoint expression $-\frac{d^2}{dx^2} + V$ admits a lower bound. Part (d) has been stated for an internal point $c \in I$, but it also holds for $c = \ell_-$ if that is a regular endpoint. More notably, (d) can be stated more elegantly when both endpoints are infinite and obey (11.5) (Exercise 11.5).


Part (c) is another conclusion which was noted for future use. It will help us to conclude that many Schrödinger operators have domains $D(H) \subset L^\infty(I)$, which will be an important technical ingredient in the proof of Schnol's theorem.

11.5. Self-adjointness and separated boundary conditions

We are now ready to turn the differential expression $H = -\frac{d^2}{dx^2} + V$ into a self-adjoint operator in the general setting where $I$ is an interval on $\mathbb{R}$ and $V \in L^1_{\mathrm{loc}}(I)$. We begin by defining the maximal operator $H_{\max}$ on $L^2(I)$ by
\[ D(H_{\max}) = \{f \in D_{\mathrm{loc}} \mid f, -f'' + Vf \in L^2(I)\} \]
and $H_{\max} f = -f'' + Vf$. Equation (11.50) can be written as
\[ \langle H_{\max} f, g \rangle - \langle f, H_{\max} g \rangle = W_+(\bar f, g) - W_-(\bar f, g), \tag{11.60} \]
which can be interpreted as an obstruction to self-adjointness. Indeed, we will prove that $H_{\max}^* \subset H_{\max}$ and that, if both endpoints are limit point, $H_{\max}$ is self-adjoint. Otherwise, we will construct self-adjoint restrictions of $H_{\max}$ by separately restricting the domain at each limit circle endpoint. In order to separate the contributions from different endpoints, we write the domain as
\[ D(H_{\max}) = X_- \cap X_+, \tag{11.61} \]
and we will use the subspaces $X_\pm^*$ defined by (11.52). Recall that we denote by $L^2_c(I)$ the set of compactly supported functions in $L^2(I)$,
\[ L^2_c(I) = \{f \in L^2(I) \mid f \chi_{[c,d]} = f \text{ for some compact } [c, d] \subset I\}. \]

Theorem 11.34. The restriction $H_0$ of $H_{\max}$ to $D(H_0) = D_{\mathrm{loc}} \cap L^2_c(I)$ obeys the following.
(a) $H_0$ is densely defined.
(b) $H_0^* = H_{\max}$.
(c) $\overline{H_0}$ is the restriction of $H_{\max}$ to $X_-^* \cap X_+^*$.

Proof. To prove that $H_0$ is densely defined and to find its adjoint, let us assume that $u, v \in L^2(I)$ obey
\[ \langle u, H_0 f \rangle = \langle v, f \rangle \qquad \forall f \in D(H_0). \tag{11.62} \]


Let us temporarily fix a compact interval $[c, d] \subset I$. Consider any $h \in L^2([c, d])$ which obeys
\[ \int_c^d g h\,dx = 0 \tag{11.63} \]
for all solutions of $-g'' + Vg = 0$. For such $h$, by Lemma 11.18, the solution of
\[ -f'' + Vf = h, \qquad f(c) = f'(c) = 0 \]
obeys $f(d) = f'(d) = 0$. Therefore, extending by $f = 0$ on $I \setminus [c, d]$ gives $f \in D(H_0)$. For such $f$, (11.62) becomes
\[ \int_c^d \bar u\, (-f'' + Vf)\,dx = \int_c^d \bar v f\,dx. \]
Introduce $w \in AC^2_{\mathrm{loc}}(I)$ as any solution of $-w'' + Vw = v$, and use (11.51) to rewrite as
\[ \int_c^d \bar u\, (-f'' + Vf)\,dx = \int_c^d \overline{(-w'' + Vw)}\, f\,dx = \int_c^d \bar w\, (-f'' + Vf)\,dx. \]
Recalling that $h = -f'' + Vf$ can be an arbitrary function in $L^2([c, d])$ which is orthogonal to all solutions of $-g'' + Vg = 0$, we have proved that
\[ \int_c^d \overline{(u - w)}\, h\,dx = 0 \]
for every $h$ which obeys (11.63) for all solutions of $-g'' + Vg = 0$. By Lemma 11.18, it follows that in $L^2([c, d])$,
\[ u - w \in \{g \in AC^2([c, d]) \mid -g'' + Vg = 0\}^{\perp\perp}. \]
Since any finite-dimensional subspace is closed, this becomes
\[ u - w \in \{g \in AC^2([c, d]) \mid -g'' + Vg = 0\}. \]
It follows that $u \in AC^2([c, d])$ and $-u'' + Vu = -w'' + Vw = v$ on $[c, d]$. Since the interval $[c, d] \subset I$ was arbitrary, this implies that $u \in AC^2_{\mathrm{loc}}(I)$ and $-u'' + Vu = v$ on $I$. Since $u, v \in L^2(I)$, we have shown that (11.62) implies $(u, v) \in \Gamma(H_{\max})$. Conversely, for any $(u, v) \in \Gamma(H_{\max})$, (11.62) holds because boundary Wronskians vanish for $f \in X_-^* \cap X_+^*$. This implies that $H_0$ is densely defined and $H_0^* = H_{\max}$.

It now follows from (11.50) that $g \in D(H_{\max}^*)$ if and only if
\[ W_+(g, f) - W_-(g, f) = 0 \qquad \forall f \in D(H_{\max}). \]


If $f \in X_-^* \cap X_+^*$, it is obvious that for all $g \in X_- \cap X_+$, $W_\pm(g, f) = 0$, so $f \in D(H_{\max}^*)$. Conversely, let $f \in D(H_{\max}^*)$. For any $h \in X_+$, use the function $g \in X_- \cap X_+$ from Lemma 11.27 which agrees with $h$ near $\ell_+$ and vanishes near $\ell_-$; then
\[ W_+(h, f) = W_+(g, f) = W_+(g, f) - W_-(g, f) = 0. \tag{11.64} \]
Therefore, $f \in X_+^*$. Analogously, $f \in X_-^*$. Thus, $H_{\max}^*$ is the restriction of $H_{\max}$ to the domain $X_-^* \cap X_+^*$.

In particular, if $V$ is limit point at both endpoints, then $X_\pm^* = X_\pm$, so $\overline{H_0} = H_{\max}$ is self-adjoint. Otherwise, we will look for self-adjoint extensions $H$ of $H_0$, which must obey
\[ \overline{H_0} = H_{\max}^* \subset H^* = H \subset H_{\max}. \]

We will extensively use Section 8.7. By the results of that section, self-adjoint extensions of $H_0$ correspond to Lagrangian subspaces for the skew-symmetric sesquilinear form (11.60). The standard procedure is to pass to the quotient vector space
\[ (X_- \cap X_+) / (X_-^* \cap X_+^*), \]
which turns $W_+ - W_-$ into a symplectic form. In our current setting, by Lemma 11.27, this decomposes into a sum of symplectic forms induced by $W_\pm$ on $X_-/X_-^*$ and $X_+/X_+^*$. Thus, we will be especially interested in self-adjoint restrictions $H$ with separated boundary conditions:

Definition 11.35. A Schrödinger operator on $L^2(I)$ with separated boundary conditions is a Schrödinger operator with domain $D(H) = Y_- \cap Y_+$, where $Y_\pm$ are Lagrangian subspaces of $X_\pm$ with respect to Wronskians $W_\pm$.

The Lagrangian property of $Y_\pm$ is explicitly written as
\[ Y_\pm = \{f \in X_\pm \mid W_\pm(\bar g, f) = 0 \ \forall g \in Y_\pm\}. \tag{11.65} \]
Starting with the easy case, if the endpoint $\ell_\pm$ is limit point, then $X_\pm = X_\pm^*$, so $Y_\pm = X_\pm$. Informally speaking, at a limit point endpoint, we do not impose any boundary conditions.

If the endpoint $\ell_\pm$ is limit circle, then $\dim(X_\pm / X_\pm^*) = 2$, so subspaces obeying (11.65) are one-dimensional modulo $X_\pm^*$ and generated by a suitable vector:

Lemma 11.36. At a limit circle endpoint $\ell_\pm$, Lagrangian subspaces are subspaces of the form
\[ Y_\pm = \{f \in X_\pm \mid W_\pm(\bar v, f) = 0\} \tag{11.66} \]
for some vector $v \in X_\pm \setminus X_\pm^*$ such that $W_\pm(\bar v, v) = 0$.


Proof. Since $\dim(X_\pm / X_\pm^*) = 2$, $Y_\pm / X_\pm^*$ must be one dimensional. Thus, it must be generated by some nontrivial vector $[v] \in X_\pm / X_\pm^*$. In other words, $Y_\pm = \operatorname{span}\{v\} + X_\pm^*$, where $v \in X_\pm \setminus X_\pm^*$. The symplectic complement of $Y_\pm$ with respect to $W_\pm$ is then
\[ Y_\pm^\perp = \{f \in X_\pm \mid W_\pm(\bar v, f) = 0\}. \]
Thus, $Y_\pm = Y_\pm^\perp$ if and only if $W_\pm(\bar v, v) = 0$.

Definition 11.37. Let $\ell_\pm$ be a limit circle endpoint. For any choice of $v \in X_\pm \setminus X_\pm^*$ such that $W_\pm(\bar v, v) = 0$, we will call the equation
\[ W_\pm(\bar v, f) = 0 \tag{11.67} \]
a self-adjoint boundary condition at $\ell_\pm$.

Let us also note that self-adjoint boundary conditions respect an expected complex conjugation symmetry:

Lemma 11.38. For any self-adjoint boundary condition at $\ell_\pm$, $f \in Y_\pm$ if and only if $\bar f \in Y_\pm$.

Proof. By the complex conjugation symmetry of $W_\pm$, $\bar v \in X_\pm \setminus X_\pm^*$. Trivially, $W_\pm(\bar v, \bar v) = 0$, so $\bar v \in Y_\pm$; thus, $\bar v$ can be used instead of $v$ to characterize the Lagrangian subspace as
\[ Y_\pm = \{f \in X_\pm \mid W_\pm(v, f) = 0\}. \]
Using again the symmetry of $W_\pm$, we rewrite this as
\[ Y_\pm = \{f \in X_\pm \mid W_\pm(\bar v, \bar f) = 0\}, \]
and comparing this with (11.66), we see $f \in Y_\pm$ if and only if $\bar f \in Y_\pm$.

Corollary 11.39. For any Schrödinger operator $H$ with separated, self-adjoint boundary conditions, $f \in D(H)$ if and only if $\bar f \in D(H)$.

Self-adjoint boundary conditions can be written more concretely if the endpoint behavior of functions in $X_\pm$ is well understood. Most notably:

Proposition 11.40. Let $\ell_\pm$ be a regular endpoint. Every self-adjoint boundary condition at $\ell_\pm$ is of the form
\[ \cos\varphi\, f(\ell_\pm) + \sin\varphi\, f'(\ell_\pm) = 0 \tag{11.68} \]
for some $\varphi \in \mathbb{R}$.

Proof. At the regular endpoint $\ell_\pm$, functions $f \in D(H_{\max})$ have continuous boundary values $f(\ell_\pm)$ and $f'(\ell_\pm)$, so in the notation of Proposition 11.31, the Wronskian at $\ell_\pm$ can be evaluated as
\[ W_\pm(\bar g, f) = \bar g(\ell_\pm) f'(\ell_\pm) - \bar g'(\ell_\pm) f(\ell_\pm) = (T\bar g)^\top \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} (Tf). \]


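Thus, the condition $v \notin X_\pm^*$ is equivalent to $Tv \neq \begin{pmatrix} 0 \\ 0 \end{pmatrix}$, and the condition
\[ (T\bar v)^\top \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} (Tv) = 0 \]
is equivalent to $\overline{v(\ell_\pm)}\, v'(\ell_\pm) \in \mathbb{R}$. Together, they are equivalent to the existence of $\kappa, \varphi \in \mathbb{R}$ such that $v(\ell_\pm) = e^{i\kappa} \sin\varphi$ and $v'(\ell_\pm) = e^{i\kappa} \cos\varphi$. Accordingly, (11.67) is equivalent to (11.68).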

11.6. Weyl solutions and Green's functions

Let $H$ be a Schrödinger operator with separated boundary conditions, with the domain $D(H) = Y_- \cap Y_+$, as introduced in the previous section. We will introduce the corresponding Weyl solutions and use them to describe the resolvents $(H - z)^{-1}$ for $z \in \mathbb{C} \setminus \sigma(H)$. Informally, as inverses of differential operators, it will not be surprising that resolvents are integral operators; unlike the special case of two regular endpoints in Section 11.3, their integral kernels will often not be in $L^2(I \times I)$, but the integral representations will nonetheless be convergent.

Definition 11.41. A Weyl solution at $z \in \mathbb{C}$ at the endpoint $\ell_\pm$ is a nontrivial solution of $-\psi'' + V\psi = z\psi$ such that $\psi \in Y_\pm$. We will denote Weyl solutions at $z$ by $\psi_z^\pm(x)$ or $\psi^\pm(x, z)$.

If $V$ is limit point at the endpoint $\ell_\pm$, then $Y_\pm = X_\pm$, so Weyl solutions can be characterized simply as nontrivial solutions of $-\psi'' + V\psi = z\psi$ which are square-integrable in a neighborhood of $\ell_\pm$. However, if $V$ is limit circle at the endpoint $\ell_\pm$, the Weyl solution depends not only on $V$, but also on the boundary condition at $\ell_\pm$ through the requirement $\psi \in Y_\pm$. This is consistent with the usage in Section 11.3.

Theorem 11.42. Consider a self-adjoint Schrödinger operator $H$ with separated boundary conditions, $D(H) = Y_- \cap Y_+$, and $z \in \mathbb{C} \setminus \sigma_{\mathrm{ess}}(H)$.
(a) At each endpoint $\ell_\pm$, there exist Weyl solutions $\psi_z^\pm \in Y_\pm$. The set of Weyl solutions, together with the trivial solution, is one dimensional. Moreover, $W_\pm(\overline{\psi_z^\pm}, \psi_z^\pm) = 0$.
(b) $z$ is an eigenvalue of $H$ if and only if $W(\psi_z^+, \psi_z^-) = 0$.

Proof. Fix the $\pm$ sign and assume that $f_1, f_2$ are Weyl solutions at the endpoint $\ell_\pm$. From $f_1, f_2 \in Y_\pm$, it follows that $W_\pm(f_1, f_2) = 0$. Since $f_1, f_2$ solve the same ordinary differential equation $-f'' + Vf = zf$, their Wronskian $W(f_1, f_2) = f_1 f_2' - f_1' f_2$ is independent of $x$. Thus, it is constantly


zero, so $f_1, f_2$ are linearly dependent. Thus, the set of Weyl solutions at each endpoint is at most one dimensional. By Lemma 11.38, $f \in Y_\pm$ implies $\bar f \in Y_\pm$, so $W_\pm(\bar f, f) = 0$.

If $z$ is in the discrete spectrum, it is an eigenvalue, so the corresponding eigenvector $f \in \operatorname{Ker}(H - z) \setminus \{0\}$ is a Weyl solution at both endpoints. Thus, it remains to consider $z \in \mathbb{C} \setminus \sigma(H)$.

Fix $[c, d] \subset I$. For any $g \in L^2(I)$ with $\operatorname{supp} g \subset [c, d]$, consider $f = (H - z)^{-1} g \in Y_- \cap Y_+$ and evaluate at $c$ and $d$ to define linear maps $T_c, T_d : L^2([c, d]) \to \mathbb{C}^2$,
\[ T_c g = \begin{pmatrix} f(c) \\ f'(c) \end{pmatrix}, \qquad T_d g = \begin{pmatrix} f(d) \\ f'(d) \end{pmatrix}. \]
Any nontrivial value of $T_c g \in \mathbb{C}^2$ corresponds to $f \in Y_-$, which is nontrivial on $(\ell_-, c)$ and obeys $-f'' + Vf = zf$ on $(\ell_-, c)$. In other words, on the interval $(\ell_-, c)$, $f$ is an eigensolution, and extending that eigensolution to $I$ gives a Weyl solution at $\ell_-$. Analogously, any nontrivial value of $T_d g \in \mathbb{C}^2$ leads to a Weyl solution at $\ell_+$. The set of Weyl solutions at each endpoint is at most one dimensional, so $\dim \operatorname{Ran} T_c \leq 1$ and $\dim \operatorname{Ran} T_d \leq 1$. Our remaining goal is to show that $\dim \operatorname{Ran} T_c = \dim \operatorname{Ran} T_d = 1$.

If $\operatorname{Ran} T_d = \{0\}$, this would imply that for all $g \in L^2([c, d])$, the solution of the initial value problem
\[ -f'' + (V - z)f = g, \qquad f(d) = f'(d) = 0, \]
has values $(f(c), f'(c)) \in \operatorname{Ran} T_c$. This leads to a contradiction since by Lemma 11.18, by varying $g$ we can produce an arbitrary $(f(c), f'(c)) \in \mathbb{C}^2$. Analogously, $\operatorname{Ran} T_c = \{0\}$ would lead to a contradiction.

If the Weyl solutions $\psi_z^-$ and $\psi_z^+$ were linearly dependent, they would both be in $Y_- \cap Y_+$, so they would be in $\operatorname{Ker}(H - z)$, contradicting invertibility of $H - z$.

In particular, for $z \in \mathbb{C} \setminus \sigma(H)$, there exist Weyl solutions $\psi_z^\pm$ at $\ell_\pm$ and their Wronskian is nonzero, so we can define Green's function
\[ G(x, y; z) = \frac{1}{W(\psi_z^+, \psi_z^-)}\, \psi_z^-(\min(x, y))\, \psi_z^+(\max(x, y)). \tag{11.69} \]
Note that this definition is independent of the normalization of $\psi_z^\pm$. We will prove that this is the integral kernel of the resolvent. This is often formally written using Dirac delta functions as
\[ G(x, y; z) = \langle \delta_x, (H - z)^{-1} \delta_y \rangle \]


or as $(H - z) G(\cdot, y; z) = \delta_y$. We mention this only for motivation; we will not formally use distributions in the proofs. We first collect simple properties of Green's function into the following lemma. Let us denote as before the Heaviside function by
\[ h(t) = \frac{1}{2}(1 + \operatorname{sgn} t). \]

Lemma 11.43. For any $z \in \mathbb{C} \setminus \sigma(H)$, the Green's function $G(x, y; z)$ has the following properties:
(a) $G(x, y; z) = G(y, x; z)$ for all $x, y \in I$.
(b) For any $y \in I$, $\int_I |G(x, y; z)|^2\,dx < \infty$.
(c) For any $y \in I$, as functions of $x$, $G(x, y; z) \in AC_{\mathrm{loc}}(I)$, $\partial_x G(x, y; z) + h(x - y) \in AC_{\mathrm{loc}}(I)$, and
\[ \partial_x (\partial_x G(x, y; z) + h(x - y)) = (V(x) - z) G(x, y; z). \]
(d) The map $y \mapsto G(\cdot, y; z)$ is continuous as a function from $I$ to $L^2(I)$.

Proof. (a) This follows immediately from (11.69).

(b) This follows from the fact that $\psi_z^\pm$ are square-integrable near $\ell_\pm$, respectively.

(c) This is a calculation as in the proof of Proposition 11.21.

(d) For $y_1, y_2 \in I$, assuming without loss of generality that $y_1 < y_2$,
\begin{align*}
\int_I |G(x, y_1; z) - G(x, y_2; z)|^2\,dx &= \frac{1}{|W(\psi_z^+, \psi_z^-)|^2} \int_{\ell_-}^{y_1} |\psi_z^-(x)|^2\, |\psi_z^+(y_1) - \psi_z^+(y_2)|^2\,dx \\
&\quad + \frac{1}{|W(\psi_z^+, \psi_z^-)|^2} \int_{y_1}^{y_2} |\psi_z^-(y_1)\psi_z^+(x) - \psi_z^-(x)\psi_z^+(y_2)|^2\,dx \\
&\quad + \frac{1}{|W(\psi_z^+, \psi_z^-)|^2} \int_{y_2}^{\ell_+} |\psi_z^+(x)|^2\, |\psi_z^-(y_1) - \psi_z^-(y_2)|^2\,dx.
\end{align*}
As $y_1 \uparrow y_2$ or $y_2 \downarrow y_1$, this converges to 0 by the square-integrability of $\psi_z^\pm$ at $\ell_\pm$ and their continuity on $I$. That shows that as an $L^2(I)$-valued function of $y$, $G$ is left- and right-continuous, so it is continuous.


Theorem 11.44. Let $H$ be a Schrödinger operator with separated boundary conditions. For any $z \in \mathbb{C} \setminus \sigma(H)$ and $g \in L^2(I)$, the value $(H - z)^{-1} g$ is given pointwise by
\[ ((H - z)^{-1} g)(x) = \int_I G(x, y; z) g(y) \, dy. \tag{11.70} \]

Proof. We begin by proving (11.70) for compactly supported $g$. Assume that $\operatorname{supp} g \subset [c, d] \subset I$ and denote the right-hand side of (11.70) by $f$. Compact support of $g$ allows us to use Fubini's theorem as in the proof of Proposition 11.21 to conclude $f \in AC^2_{loc}(I)$ and $-f'' + (V - z) f = g$. It remains to prove that $f \in Y_- \cap Y_+$. For this, note that for $x > d$,
\[ f(x) = \frac{1}{W(\psi_z^+, \psi_z^-)} \int_c^d \psi_z^-(y) \psi_z^+(x) g(y) \, dy, \]
which is a fixed multiple of $\psi_z^+(x)$, so $f \in Y_+$ because $\psi_z^+ \in Y_+$. Similarly, in a neighborhood of $\ell_-$, $f$ is found to be a multiple of $\psi_z^-$, so $f \in D(H)$ and $(H - z) f = g$.

Now let $g \in L^2(I)$. By the above, (11.70) holds for the functions $g \chi_{[c,d]}$. In the double limit $c \downarrow \ell_-$, $d \uparrow \ell_+$, the left-hand side of (11.70) converges in the $L^2(I)$ sense. Meanwhile, the right-hand side of (11.70) converges for each $x$, because for any fixed $x \in I$, $g \in L^2(I)$ and
\[ \int_I |G(x, y; z)|^2 \, dy < \infty. \]
Moreover, $G(x, \cdot; z)$ is a multiple of $\psi_z^-(y)$ for $y < x$ and a multiple of $\psi_z^+(y)$ for $y > x$. If a sequence of functions converges both in the $L^2(I)$ sense and pointwise, then the two limits are equal almost everywhere by Corollary 2.31, which shows that (11.70) holds for any $g \in L^2(I)$ as an equality of functions in $L^2(I)$. However, both sides of (11.70) are continuous in $x$: the left-hand side because $(H - z)^{-1} g \in D(H)$ and $D(H)$ consists of continuous functions, and the right-hand side because $y \mapsto G(\cdot, y; z)$ is a continuous map from $I$ to $L^2(I)$ by Lemma 11.43(d). Thus the equality holds pointwise.

This notion of Weyl solution generalizes (and shares notation with) the solutions $\psi_z^\pm$ from Section 11.3, but note a subtle difference between Theorem 11.44 and Proposition 11.21: in Theorem 11.44, $z \notin \sigma(H)$ is an assumption rather than a conclusion. This weakening is necessary because in the general setting, even for $z \in \sigma(H) \setminus \sigma_{ess}(H)$, Weyl solutions may exist; however, in such cases, (11.69) does not have the same interpretation as the integral kernel of the resolvent. Due to the loss of this key interpretation, the term "Weyl solutions" is usually only used for $z \notin \sigma_{ess}(H)$.
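As a sanity check on (11.70), the following sketch tests the kernel formula numerically in one explicitly solvable case. Everything specific here is an assumption made for illustration and is not taken from the text: the interval $I = (0, \infty)$, $V \equiv 0$, a Dirichlet condition at $0$, the spectral parameter $z = -k^2 < 0$, and the test function $g(y) = e^{-2y}$. In that case one can take $\psi_z^-(x) = \sinh(kx)/k$ and $\psi_z^+(x) = e^{-kx}$, whose Wronskian $W(\psi_z^+, \psi_z^-)$ equals $1$, and for $k = 1$ the equation $-f'' + f = g$, $f(0) = 0$, has the decaying solution $f(x) = (e^{-x} - e^{-2x})/3$.

```python
import math

def green_free(x, y, k=1.0):
    """Green's function psi^-(min) * psi^+(max) / W for -f'' = z f on (0, inf),
    Dirichlet at 0, z = -k^2 (illustrative free case, not from the text)."""
    lo, hi = min(x, y), max(x, y)
    return math.sinh(k * lo) / k * math.exp(-k * hi)

def simpson(func, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * func(a + j * h)
    return total * h / 3.0

def resolvent_apply(g, x, k=1.0, cutoff=40.0):
    """Approximate int_0^inf G(x, y; -k^2) g(y) dy.
    The integral is split at the kink y = x and truncated at `cutoff`."""
    left = simpson(lambda y: green_free(x, y, k) * g(y), 0.0, x)
    right = simpson(lambda y: green_free(x, y, k) * g(y), x, cutoff)
    return left + right

def g(y):
    return math.exp(-2.0 * y)

def exact(x):
    # Decaying solution of -f'' + f = exp(-2x) with f(0) = 0.
    return (math.exp(-x) - math.exp(-2.0 * x)) / 3.0

if __name__ == "__main__":
    for x in (0.5, 1.0, 3.0):
        print(x, resolvent_apply(g, x), exact(x))
```

The two printed columns should agree to within quadrature error; splitting the integral at $y = x$ matters because the kernel is only continuous, not differentiable, across the diagonal.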


The explicit form of Green's function allows us to describe certain relatively compact operators relevant for the RAGE theorem (Theorem 9.23):

Lemma 11.45. For any compact $[c, d] \subset I$, the projection $P f = \chi_{[c,d]} f$ on $L^2(I)$ is relatively compact with respect to $H$, i.e., the operator $P (H - i)^{-1}$ is compact.

Proof. The operator $K = P (H - i)^{-1}$ is an integral operator with kernel $K(x, y) = \chi_{[c,d]}(x) G(x, y; i)$. By symmetry, Lemma 11.43 gives $L^2(I)$-continuity of Green's function in $x$, so by continuity and compactness,
\[ \sup_{x \in [c,d]} \int_I |G(x, y; i)|^2 \, dy < \infty. \]
It follows that
\[ \int_I \int_I |K(x, y)|^2 \, dx \, dy = \int_{[c,d]} \int_I |G(x, y; i)|^2 \, dy \, dx < \infty, \]
so $K$ is compact (Proposition 4.51).

Thus, for any increasing sequence of intervals $[c_n, d_n]$ with $\bigcup_n [c_n, d_n] = I$, the analysis of Section 9.4 applies to the projections $P_n f = \chi_{[c_n, d_n]} f$. These results have direct physical interpretations in quantum mechanics where, for instance, $\| P_n e^{-itH} f \|^2$ corresponds to the probability of finding the particle in the region $[c_n, d_n]$ at time $t$. In particular, the RAGE theorem (Theorem 9.23) describes the dynamics of vectors in the pure point and continuous subspaces for $H$, and Exercise 9.8 describes a property of vectors in the absolutely continuous subspace for $H$.

11.7. Weyl solutions and m-functions

In this section we focus on the half-line case characterized by one regular endpoint, and we change the notation a bit. We write the interval as $I = (0, b)$, and assume that $0$ is a regular endpoint with the boundary condition at $0$,
\[ \cos\alpha \, f(0) + \sin\alpha \, f'(0) = 0. \tag{11.71} \]

The following discussion is most commonly used in the case when $b = +\infty$ and the potential is in the limit point case at $b$. However, the endpoint $b$ can be finite or infinite; it can even be a regular endpoint. We fix the behavior at $b$ by fixing a Lagrangian subspace $Y_+ \subset X_+$. As discussed before, this incorporates a self-adjoint boundary condition at $b$ if $H$ is in the limit circle case at $b$.


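Recall that $\phi(x, z) = \phi_z(x)$ and $\theta(x, z) = \theta_z(x)$ are solutions of the ordinary differential equation $-f'' + V f = z f$, satisfying the initial conditions
\[ \begin{pmatrix} \phi_z'(0) & \theta_z'(0) \\ \phi_z(0) & \theta_z(0) \end{pmatrix} = \begin{pmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{pmatrix}^{-1}. \tag{11.72} \]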

The solutions $\theta_z, \phi_z$ are $\alpha$-dependent in order to match the operator; in particular, $\phi_z$ obeys the boundary condition at $0$, so $\phi_z$ is a Weyl solution at the regular endpoint $0$. The solutions also obey the useful relations
\[ W(\theta_z, \theta_z) = 0, \quad W(\theta_z, \phi_z) = 1, \quad W(\phi_z, \theta_z) = -1, \quad W(\phi_z, \phi_z) = 0 \tag{11.73} \]
obtained by evaluating those Wronskians at $0$. Further properties of $\theta_z, \phi_z$ on $[0, c]$ for any $c \in (0, b)$ follow by an affine transformation from earlier results, in particular, by Corollary 11.8:

Corollary 11.46. For any $c \in (0, b)$, $\theta_z$ and $\phi_z$ are entire $AC^2([0, c])$-valued functions of $z$.

For $z \in \mathbb{C} \setminus \sigma_{ess}(H)$, let us denote simply by $\psi_z$ a Weyl solution for $H$ at $b$. Since $\phi_z$ and $\psi_z$ are Weyl solutions at $0$ and $b$, respectively, their Wronskian is nonzero for all $z \notin \sigma(H)$. Thus, we can define:

Definition 11.47. The Weyl m-function associated to $H$ is the map $m : \mathbb{C} \setminus \sigma(H) \to \mathbb{C}$ defined by
\[ m(z) = - \frac{W(\psi_z, \theta_z)}{W(\psi_z, \phi_z)}. \tag{11.74} \]

The Wronskians in this definition are independent of $x$; evaluating them at $0$ gives
\[ m(z) = \frac{\cos\alpha \, \psi_z'(0) - \sin\alpha \, \psi_z(0)}{\sin\alpha \, \psi_z'(0) + \cos\alpha \, \psi_z(0)}, \tag{11.75} \]
which can be written in the notation of Möbius transformations as
\[ \begin{pmatrix} m(z) \\ 1 \end{pmatrix} \simeq \begin{pmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{pmatrix} \begin{pmatrix} \psi_z'(0) \\ \psi_z(0) \end{pmatrix}. \]
However, the seemingly more implicit representation (11.74) in terms of Wronskians is often more convenient. We will now derive various properties of Weyl solutions and m-functions; the key property of Weyl solutions is the following.

Lemma 11.48. For all $z, w \in \mathbb{C} \setminus \sigma_{ess}(H)$, Weyl solutions obey
\[ (z - \bar w) \int_0^b \overline{\psi_w(x)} \, \psi_z(x) \, dx = W_-(\overline{\psi_w}, \psi_z). \tag{11.76} \]


Proof. Since Weyl solutions are in the maximal domain and $H_{max} \psi_z = z \psi_z$, the standard formula (11.60) gives
\[ (\bar w - z) \langle \psi_w, \psi_z \rangle = \langle w \psi_w, \psi_z \rangle - \langle \psi_w, z \psi_z \rangle = W_+(\overline{\psi_w}, \psi_z) - W_-(\overline{\psi_w}, \psi_z). \]
Moreover, Weyl solutions obey the boundary condition (if any) at $b$, i.e., $\psi_z, \overline{\psi_w} \in Y_+$, so $W_+(\overline{\psi_w}, \psi_z) = 0$, and the claim follows.

Until now, the formulas were independent of the normalization of $\psi_z$, but for the remainder of this section, it will be convenient to fix the normalization
\[ W(\psi_z, \phi_z) = 1 \qquad \forall z \in \mathbb{C} \setminus \sigma(H). \tag{11.77} \]

This is possible because $\phi_z$ is a Weyl solution at $0$, so it is linearly independent of $\psi_z$.

Lemma 11.49. With the normalization (11.77),
\[ m(z) = -W(\psi_z, \theta_z) \quad \text{and} \quad \psi_z = \theta_z + m(z) \phi_z. \tag{11.78} \]

Proof. The identity $m(z) = -W(\psi_z, \theta_z)$ follows immediately from (11.74). To prove the second part of (11.78), begin by writing $\psi_z$ as a linear combination, $\psi_z = a \theta_z + b \phi_z$. Viewing the Wronskians $W(\cdot, \theta_z)$ and $W(\cdot, \phi_z)$ as nontrivial linear functionals on the two-dimensional space of eigensolutions of $H$, (11.73) allows us to compute
\[ W(\psi_z, \theta_z) = a W(\theta_z, \theta_z) + b W(\phi_z, \theta_z) = -b \]
and similarly $W(\psi_z, \phi_z) = a$. Since we know the Wronskians, this gives the values $a = 1$ and $b = m(z)$.

Corollary 11.50. For $z, w \in \mathbb{C} \setminus \sigma(H)$, if the Weyl solutions are normalized by (11.77), they obey
\[ \int_0^b \overline{\psi_w(x)} \, \psi_z(x) \, dx = \frac{m(z) - \overline{m(w)}}{z - \bar w}. \tag{11.79} \]

Proof. This is immediate from Lemma 11.48 if we prove that our normalization gives $W_-(\overline{\psi_w}, \psi_z) = m(z) - \overline{m(w)}$. This can be obtained by a brute force calculation using (11.72) or by noting that the reality and $z$-independence of the initial conditions in (11.72) imply that, similarly to (11.73),
\[ W_-(\overline{\theta_w}, \theta_z) = 0, \quad W_-(\overline{\theta_w}, \phi_z) = 1, \quad W_-(\overline{\phi_w}, \theta_z) = -1, \quad W_-(\overline{\phi_w}, \phi_z) = 0, \]
and then using bilinearity of the Wronskian to expand and compute
\[ W_-(\overline{\psi_w}, \psi_z) = W_-\bigl(\overline{\theta_w} + \overline{m(w)} \, \overline{\phi_w}, \; \theta_z + m(z) \phi_z\bigr). \]
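The identity (11.79) can be checked in closed form in the simplest explicit case. The sketch below assumes, for illustration only, $I = (0, \infty)$, $V \equiv 0$, and the Dirichlet condition $\alpha = 0$; then $\theta_z(x) = \cosh(kx)$ and $\phi_z(x) = \sinh(kx)/k$ with $k = \sqrt{-z}$, $\operatorname{Re} k > 0$, the normalized Weyl solution is $\psi_z(x) = e^{-kx}$, and $m(z) = -\sqrt{-z}$. The function names are hypothetical.

```python
import cmath

def m_free(z):
    """m-function of the free Dirichlet half-line (illustrative case).
    cmath.sqrt is the principal branch, so Re sqrt(-z) > 0 off [0, inf)."""
    return -cmath.sqrt(-z)

def weyl_inner_product(w, z):
    """Closed form of int_0^inf conj(psi_w(x)) psi_z(x) dx for psi_z = exp(-k_z x)."""
    k_z = cmath.sqrt(-z)
    k_w = cmath.sqrt(-w)
    return 1.0 / (k_w.conjugate() + k_z)

def herglotz_quotient(w, z):
    """Right-hand side of (11.79): (m(z) - conj(m(w))) / (z - conj(w))."""
    return (m_free(z) - m_free(w).conjugate()) / (z - w.conjugate())

if __name__ == "__main__":
    z, w = 1.0 + 2.0j, -3.0 + 0.5j
    print(weyl_inner_product(w, z), herglotz_quotient(w, z))
    # The special case w = z gives Im m(z) / Im z = int |psi_z|^2 dx.
    print(m_free(z).imag / z.imag, weyl_inner_product(z, z).real)
```

The agreement is an algebraic identity here, since $z - \bar w = (\bar k_w - k_z)(\bar k_w + k_z)$; the code only makes the bookkeeping of complex conjugates concrete.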


Theorem 11.51. The function $m(z)$ is analytic on $\mathbb{C} \setminus \sigma(H)$ and it obeys $\operatorname{sgn} \operatorname{Im} m(z) = \operatorname{sgn} \operatorname{Im} z$. In particular, it is a Herglotz function, and for $z \in \mathbb{C} \setminus \mathbb{R}$, if the Weyl solutions are normalized by (11.77), then
\[ \frac{\operatorname{Im} m(z)}{\operatorname{Im} z} = \int_0^b |\psi_z|^2 \, dx. \tag{11.80} \]

Proof. By (11.79) applied to $z \in \mathbb{C} \setminus \mathbb{R}$ and $w = z$, we obtain (11.80). In particular, $z \in \mathbb{C}_+$ implies $m(z) \in \mathbb{C}_+$.

We now note a symmetry in our eigensolutions. Since $\theta_z, \phi_z$ are defined with real initial conditions, they obey the symmetry $\overline{\theta_z} = \theta_{\bar z}$, $\overline{\phi_z} = \phi_{\bar z}$. Note also that $\overline{\psi_z}$ is an eigensolution at $\bar z$ which is in $Y_+$ since $\psi_z$ is; thus, $\overline{\psi_z}$ is a Weyl solution at $\bar z$. It also follows from (11.77) that $W(\overline{\psi_z}, \overline{\phi_z}) = W(\overline{\psi_z}, \phi_{\bar z}) = 1$, so $\overline{\psi_z}$ obeys the correct normalization and therefore $\overline{\psi_z} = \psi_{\bar z}$. Finally, this implies by (11.78) that
\[ \overline{m(z)} = m(\bar z) \qquad \forall z \in \mathbb{C} \setminus \sigma(H). \]

It remains to prove that $m(z)$ is analytic on $\mathbb{C} \setminus \sigma(H)$. Using the Weyl solutions $\phi_z, \psi_z$ at $0, b$, Green's function for $H$ is given by
\[ G(x, y; z) = \phi_z(\min(x, y)) \psi_z(\max(x, y)) = \phi_z(\min(x, y)) \theta_z(\max(x, y)) + m(z) \phi_z(x) \phi_z(y). \tag{11.81} \]
Analyticity of $(H - z)^{-1}$ implies analyticity, for any $f \in L^2(I)$, of
\[
\begin{aligned}
\langle f, (H - z)^{-1} f \rangle &= \iint \overline{f(x)} \, \phi_z(\min(x, y)) \theta_z(\max(x, y)) f(y) \, dy \, dx \\
&\quad + m(z) \iint \overline{f(x)} \, \phi_z(x) \phi_z(y) f(y) \, dy \, dx.
\end{aligned}
\tag{11.82}
\]
Fix $c \in (0, b)$. Since $\phi_z, \theta_z$ are analytic $AC^2([0, c])$-valued functions, dividing into the cases $x < y$ and $x > y$ and writing the first term on the right-hand side as a sum of two iterated integrals shows that it is entire in $z$ for any $f \in L^2(I)$ with $\operatorname{supp} f \subset [0, c]$. Analyticity of $m(z)$ will therefore follow from analyticity of everything else in (11.82), if we can choose for every $z_0 \in \mathbb{C} \setminus \sigma(H)$ a real-valued function $f \in L^2(I)$ with $\operatorname{supp} f \subset [0, c]$ such that
\[ \iint f(x) \phi_z(x) \phi_z(y) f(y) \, dy \, dx = \left( \int f(x) \phi_z(x) \, dx \right)^2 \neq 0 \]
holds for $z = z_0$ (and therefore, by continuity, in a neighborhood of $z_0$). Since $\phi_z$ and $\phi_z'$ are jointly continuous in $z$ and $x$, it suffices to choose $x_0 \in [0, c)$ such that $\phi_{z_0}(x_0) \neq 0$ and $f = \chi_{[x_0, x_0 + \epsilon]}$ for sufficiently small $\epsilon > 0$.


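We conclude this section by noting some further properties of Weyl solutions and m-functions as functions of $z$.

Theorem 11.52. If Weyl solutions are normalized by (11.77), then they are $L^2(I)$-continuous on $\mathbb{C} \setminus \sigma(H)$, i.e., for any $z \in \mathbb{C} \setminus \sigma(H)$,
\[ \lim_{w \to z} \int_0^b |\psi_w - \psi_z|^2 \, dx = 0. \tag{11.83} \]

Proof. Begin by assuming $z, w \in \mathbb{C} \setminus \mathbb{R}$. Expanding $|\psi_w - \psi_z|^2 = |\psi_z|^2 + |\psi_w|^2 - 2 \operatorname{Re}(\overline{\psi_w} \psi_z)$ and using (11.79) to integrate gives
\[ \int_0^b |\psi_w - \psi_z|^2 \, dx = \frac{m(z) - \overline{m(z)}}{z - \bar z} + \frac{m(w) - \overline{m(w)}}{w - \bar w} - 2 \operatorname{Re} \frac{m(z) - \overline{m(w)}}{z - \bar w}. \tag{11.84} \]
As $w \to z$, this converges to
\[ 2 \, \frac{\operatorname{Im} m(z)}{\operatorname{Im} z} - 2 \operatorname{Re} \frac{m(z) - \overline{m(z)}}{z - \bar z} = 0, \]
which already proves (11.83) for $z \in \mathbb{C} \setminus \mathbb{R}$.

By (11.78), analyticity of the m-function on $\mathbb{C} \setminus \sigma(H)$ implies $AC^2([0, c])$-analyticity of the Weyl solutions for any $c < b$. Thus, by Fatou's lemma, for any $\lambda \in \mathbb{R} \setminus \sigma(H)$ and $z \in \mathbb{C} \setminus \mathbb{R}$,
\[ \int |\psi_z - \psi_\lambda|^2 \, dx \le \liminf_{\substack{w \to \lambda \\ w \in \mathbb{C} \setminus \mathbb{R}}} \int |\psi_z - \psi_w|^2 \, dx. \]
Using (11.84), we can compute this $\liminf$, because $m(w) \to m(\lambda)$ and $\frac{m(w) - \overline{m(w)}}{w - \bar w} \to m'(\lambda)$ by analyticity. Thus, for all $z \in \mathbb{C} \setminus \mathbb{R}$ and $\lambda \in \mathbb{R} \setminus \sigma(H)$,
\[ \int |\psi_z - \psi_\lambda|^2 \, dx \le \frac{m(z) - \overline{m(z)}}{z - \bar z} + m'(\lambda) - 2 \operatorname{Re} \frac{m(z) - m(\lambda)}{z - \lambda}. \tag{11.85} \]
By repeating this trick, if $z \to \kappa$ for some $\kappa \in \mathbb{R} \setminus \sigma(H)$, $\kappa \neq \lambda$, then by Fatou's lemma,
\[ \int |\psi_\kappa - \psi_\lambda|^2 \, dx \le m'(\kappa) + m'(\lambda) - 2 \operatorname{Re} \frac{m(\kappa) - m(\lambda)}{\kappa - \lambda}. \tag{11.86} \]
Taking the limit of (11.85) as $z \to \lambda$ with $z \in \mathbb{C} \setminus \mathbb{R}$ gives
\[ \lim_{\substack{z \to \lambda \\ z \in \mathbb{C} \setminus \mathbb{R}}} \int |\psi_z - \psi_\lambda|^2 \, dx \le m'(\lambda) + m'(\lambda) - 2 m'(\lambda) = 0, \]
and taking the limit of (11.86) as $\kappa \to \lambda$ with $\kappa \in \mathbb{R} \setminus \sigma(H)$ gives
\[ \lim_{\substack{\kappa \to \lambda \\ \kappa \in \mathbb{R} \setminus \sigma(H)}} \int |\psi_\kappa - \psi_\lambda|^2 \, dx \le m'(\lambda) + m'(\lambda) - 2 m'(\lambda) = 0. \]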


Together, these two conclusions show $L^2(I)$-continuity of Weyl solutions at $\lambda \in \mathbb{R} \setminus \sigma(H)$.

The $L^2(I)$-continuity of the Weyl solutions can be used to extract additional consequences. We prove one corollary and leave another as Exercise 11.7.

Corollary 11.53. If Weyl solutions are normalized by (11.77), then for all $z \in \mathbb{C} \setminus \sigma(H)$,
\[ \int_0^b \psi_z(x)^2 \, dx = m'(z). \tag{11.87} \]

Proof. By the Cauchy–Schwarz inequality,
\[ \left| \int_0^b \psi_z(x)^2 \, dx - \int_0^b \psi_w(x) \psi_z(x) \, dx \right| \le \| \psi_z \| \, \| \psi_z - \psi_w \|, \]
so $L^2(I)$-continuity of Weyl solutions implies
\[ \int_0^b \psi_z(x)^2 \, dx = \lim_{w \to z} \int_0^b \psi_w(x) \psi_z(x) \, dx. \]
By the symmetry $\overline{\psi_w} = \psi_{\bar w}$ and Corollary 11.50, this limit can be computed as
\[ \int_0^b \psi_z(x)^2 \, dx = \lim_{w \to z} \int_0^b \psi_w(x) \psi_z(x) \, dx = \lim_{w \to z} \frac{m(z) - m(w)}{z - w} = m'(z). \]

Although we considered the Weyl m-function as a function on $\mathbb{C} \setminus \sigma(H)$, one can also consider its singularities at points in $\sigma_d(H)$ and describe qualitatively and quantitatively the simple poles obtained there (Exercise 11.8). Of course, a different normalization will be needed instead of (11.77), since $W(\psi_z, \phi_z) = 0$ for $z \in \sigma_d(H)$.
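Formula (11.87) involves $\psi_z^2$ rather than $|\psi_z|^2$, which is easy to get wrong. A quick way to see the difference is the free Dirichlet half-line, assumed here purely for illustration ($I = (0, \infty)$, $V \equiv 0$, $\alpha = 0$, so that $\psi_z(x) = e^{-kx}$ and $m(z) = -\sqrt{-z}$ with $k = \sqrt{-z}$, $\operatorname{Re} k > 0$). Then $\int_0^\infty \psi_z^2 \, dx = 1/(2k)$, which the sketch compares with a central difference approximation of $m'(z)$.

```python
import cmath

def m_free(z):
    """m-function of the free Dirichlet half-line (illustrative case)."""
    return -cmath.sqrt(-z)

def m_derivative(z, h=1e-6):
    """Central difference approximation of m'(z); valid away from [0, inf)."""
    return (m_free(z + h) - m_free(z - h)) / (2.0 * h)

def integral_psi_squared(z):
    """Closed form of int_0^inf psi_z(x)^2 dx for psi_z(x) = exp(-k x)."""
    k = cmath.sqrt(-z)
    return 1.0 / (2.0 * k)

def integral_abs_psi_squared(z):
    """Closed form of int_0^inf |psi_z(x)|^2 dx, for comparison with (11.80)."""
    k = cmath.sqrt(-z)
    return 1.0 / (2.0 * k.real)

if __name__ == "__main__":
    for z in (-2.0 + 1.0j, 3.0 + 0.5j, complex(-4.0, 0.0)):
        print(z, m_derivative(z), integral_psi_squared(z), integral_abs_psi_squared(z))
```

For real $z < 0$ the two integrals coincide, matching the fact that $\psi_\lambda$ is real for real $\lambda$ below the spectrum; for non-real $z$ only the first one equals $m'(z)$.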

11.8. The half-line eigenfunction expansion

We continue to work under the assumptions and notation of the previous section. In this section, we will construct eigenfunction expansions for Schrödinger operators with one regular endpoint. The eigenfunction expansion will be an explicit unitary operator, bearing some resemblance to the Fourier transform but based on formal eigenfunctions of $H$. This unitary operator will conjugate $H$ to the operator of multiplication by $\lambda$ with respect to a canonical choice of spectral measure.

We begin by introducing the measure. The Weyl m-function corresponding to $H$ has a Herglotz representation involving a Baire measure $\mu$ on $\mathbb{R}$.


In particular, $\mu$ is given by Stieltjes inversion: for all $h \in C_c(\mathbb{R})$,
\[ \lim_{\epsilon \downarrow 0} \frac{1}{\pi} \int h(\lambda) \operatorname{Im} m(\lambda + i\epsilon) \, d\lambda = \int h(\lambda) \, d\mu(\lambda). \]
It is not a priori clear that this is related to the spectral properties of $H$, but we will see below that the resulting measure $\mu$ is a maximal spectral measure for $H$; in fact, we will consider this the canonical spectral measure for the operator $H$.

The eigenfunction expansion for $H$ will be a unitary map conjugating $H$ to the multiplication operator $T_{\lambda, d\mu(\lambda)}$. In particular, this will be a unitary map from $L^2(I)$ to $L^2(d\mu)$. We will begin by constructing the eigenfunction expansion and its presumed inverse on dense subsets of the Hilbert spaces (note that it is not immediately obvious that these integral transforms even map into the other Hilbert space). Recall that $L^2_c(I)$ denotes the set of compactly supported functions in $L^2(I)$.

Lemma 11.54. For $f \in L^2_c(I)$, the function $\hat f : \mathbb{R} \to \mathbb{C}$ defined by
\[ \hat f(\lambda) = \int \phi_\lambda(x) f(x) \, dx \tag{11.88} \]
is a continuous function of $\lambda \in \mathbb{R}$.

Proof. If $f \in L^2_c(I)$, then $f \in L^1(\operatorname{supp} f)$ by the Cauchy–Schwarz inequality. Since $\phi_\lambda(x)$ is jointly continuous in $\lambda, x$, the integral
\[ \hat f(\lambda) = \int f(x) \phi_\lambda(x) \, dx \]
is uniformly convergent on compacts and defines a continuous function $\hat f$.

Lemma 11.55. For $g \in L^2_c(d\mu)$, the function $\check g : I \to \mathbb{C}$ defined by
\[ \check g(x) = \int \phi_\lambda(x) g(\lambda) \, d\mu(\lambda) \tag{11.89} \]
is in $AC^2([0, d])$ for every $d < b$ and
\[ \check g'(x) = \int \phi_\lambda'(x) g(\lambda) \, d\mu(\lambda), \tag{11.90} \]
\[ \check g''(x) = \int \phi_\lambda''(x) g(\lambda) \, d\mu(\lambda). \tag{11.91} \]

Proof. Since $\phi_\lambda$ is uniformly bounded in the $AC^2([0, d])$ norm on compact sets of $\lambda$, it follows from Fubini's theorem that for any $x_1 < x_2$,
\[ \int_{x_1}^{x_2} \int \phi_\lambda'(x) g(\lambda) \, d\mu(\lambda) \, dx = \int (\phi_\lambda(x_2) - \phi_\lambda(x_1)) g(\lambda) \, d\mu(\lambda) = \check g(x_2) - \check g(x_1), \]


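which proves $\check g \in AC_{loc}([0, b))$ and (11.90). Analogously, computing
\[ \int_{x_1}^{x_2} \int \phi_\lambda''(x) g(\lambda) \, d\mu(\lambda) \, dx = \int (\phi_\lambda'(x_2) - \phi_\lambda'(x_1)) g(\lambda) \, d\mu(\lambda) = \check g'(x_2) - \check g'(x_1) \]
proves $\check g \in AC^2_{loc}([0, b))$ and (11.91).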

We can now state precisely the main result of this section:

Theorem 11.56 (Half-line eigenfunction expansion). There exists a unitary map $U : L^2(I) \to L^2(\mathbb{R}, d\mu(\lambda))$ with the following properties.
(a) $U f = \hat f$ for $f \in L^2_c(I)$.
(b) $U^{-1} g = \check g$ for $g \in L^2_c(d\mu)$.
(c) $U H U^{-1} = T_{\lambda, d\mu(\lambda)}$.

It will follow immediately from (c) that $h(H) = U^{-1} T_{h, d\mu} U$ for any bounded Borel function $h$. In particular, as a special case when (a) and (b) apply, the theorem implies that
\[ h(H) f = (h \hat f)^{\vee} \qquad \forall h \in C_c(\mathbb{R}), \; \forall f \in L^2_c(I). \tag{11.92} \]
However, this logic is backwards, because the first key step in the proof of Theorem 11.56 will be to prove (11.92). This will be Proposition 11.58 below, and it will be proved by using resolvents and Stone's theorem. This will then allow us to use the abstract eigenfunction expansions of Section 9.9.

Accordingly, the first technical ingredient is the behavior of Green's function for values of $z$ approaching the real line. Recall from (11.81) that Green's function for $H$ is
\[ G(x, y; z) = \phi_z(\min(x, y)) \theta_z(\max(x, y)) + m(z) \phi_z(x) \phi_z(y). \tag{11.93} \]
We use the concise notation $f = \tilde o(g)$ if $f = o(g)$ pointwise and $f = O(g)$ uniformly in the given parameters.

Lemma 11.57. For any $d \in (0, b)$ and compact interval $[\lambda_1, \lambda_2] \subset \mathbb{R}$,
\[ \operatorname{Im} G(x, y; \lambda + i\epsilon) = \phi(x, \lambda) \phi(y, \lambda) \operatorname{Im} m(\lambda + i\epsilon) + \tilde o(1), \qquad \epsilon \downarrow 0, \]
uniformly in $(x, y, \lambda) \in (0, d]^2 \times [\lambda_1, \lambda_2]$.

Proof. By $AC^2([0, d])$-analyticity of $\phi_z$ and $\theta_z$,
\[ \phi(t, \lambda + i\epsilon) = \phi(t, \lambda) + i\epsilon (\partial_z \phi)(t, \lambda) + O(\epsilon^2), \qquad \epsilon \downarrow 0, \]
uniformly in $t \in [0, d]$ and $\lambda \in [\lambda_1, \lambda_2]$, and analogously for $\theta_z$. Moreover, since $m(z)$ is Herglotz,
\[ m(\lambda + i\epsilon) = O(\epsilon^{-1}), \qquad \epsilon \downarrow 0, \tag{11.94} \]


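uniformly in $\lambda \in [\lambda_1, \lambda_2]$ by Lemma 7.38. Applying these expansions to (11.93) and using reality of fundamental solutions for real $\lambda$, a short calculation implies
\[ \operatorname{Im} G(x, y; \lambda + i\epsilon) = \phi(x, \lambda) \phi(y, \lambda) \operatorname{Im} m(\lambda + i\epsilon) + \partial_z \bigl( \phi(x, \cdot) \phi(y, \cdot) \bigr) \big|_{z = \lambda} \operatorname{Re}(\epsilon \, m(\lambda + i\epsilon)) + O(\epsilon). \]
Finally, $\operatorname{Re}(\epsilon \, m(\lambda + i\epsilon)) = \tilde o(1)$ as $\epsilon \downarrow 0$ follows from (11.94) and Lemma 7.37, and this concludes the proof.

We can now prove (11.92):

Proposition 11.58. For $h \in C_c(\mathbb{R})$ and $f \in L^2_c(I)$, $h(H) f$ is given by
\[ (h(H) f)(x) = \iint h(\lambda) \phi_\lambda(x) \phi_\lambda(y) f(y) \, dy \, d\mu(\lambda). \tag{11.95} \]

Proof. By Stone's theorem (Theorem 9.43), for any $f \in L^2(I)$,
\[ h(H) f = \lim_{\epsilon \downarrow 0} \frac{1}{2\pi i} \int h(\lambda) \bigl( (H - \lambda - i\epsilon)^{-1} - (H - \lambda + i\epsilon)^{-1} \bigr) f \, d\lambda, \tag{11.96} \]
where the integral is of a continuous compactly supported $L^2(I)$-valued function and the limit is taken in $L^2(I)$. For $f \in L^2_c(I)$, we will evaluate this limit pointwise. Let $d \in (0, b)$ be large enough that $\operatorname{supp} f \subset [0, d]$, and let $[\lambda_1, \lambda_2] \supset \operatorname{supp} h$. Then, as $L^2$ functions of $x$,
\[
\begin{aligned}
\frac{1}{2\pi i} \int h(\lambda) \bigl( (H - \lambda - i\epsilon)^{-1} - (H - \lambda + i\epsilon)^{-1} \bigr) f \, d\lambda
&= \frac{1}{2\pi i} \int_0^d \int_{\lambda_1}^{\lambda_2} h(\lambda) \bigl( G(x, y; \lambda + i\epsilon) - G(x, y; \lambda - i\epsilon) \bigr) f(y) \, d\lambda \, dy \\
&= \frac{1}{\pi} \int_0^d \int_{\lambda_1}^{\lambda_2} h(\lambda) \operatorname{Im} G(x, y; \lambda + i\epsilon) f(y) \, d\lambda \, dy.
\end{aligned}
\tag{11.97}
\]
For any kernel $K$ such that $K(x, y; \lambda, \epsilon) = \tilde o(1)$ as $\epsilon \downarrow 0$ uniformly in $(x, y, \lambda) \in (0, d]^2 \times [\lambda_1, \lambda_2]$, dominated convergence implies that for every $x$,
\[ \lim_{\epsilon \downarrow 0} \frac{1}{\pi} \int_0^d \int_{\lambda_1}^{\lambda_2} h(\lambda) K(x, y; \lambda, \epsilon) f(y) \, d\lambda \, dy = 0. \tag{11.98} \]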


Thus, by Lemma 11.57 and since $h(\lambda) \phi_\lambda(x) \phi_\lambda(y) \in C_c(\mathbb{R})$ (as a function of $\lambda$), for every $x$, Stieltjes inversion implies that
\[
\begin{aligned}
\lim_{\epsilon \downarrow 0} \frac{1}{\pi} \int_0^d \int_{\lambda_1}^{\lambda_2} h(\lambda) \operatorname{Im} G(x, y; \lambda + i\epsilon) f(y) \, d\lambda \, dy
&= \lim_{\epsilon \downarrow 0} \frac{1}{\pi} \int_0^d \int_{\lambda_1}^{\lambda_2} h(\lambda) \phi_\lambda(x) \phi_\lambda(y) f(y) \operatorname{Im} m(\lambda + i\epsilon) \, d\lambda \, dy \\
&= \int_0^d \int_{\lambda_1}^{\lambda_2} h(\lambda) \phi_\lambda(x) \phi_\lambda(y) f(y) \, d\mu(\lambda) \, dy.
\end{aligned}
\tag{11.99}
\]
Thus, we have computed the limit of (11.97) pointwise; by (11.96), this is equal to the $L^2$-limit $h(H) f$, which concludes the proof.

This will allow us to apply the abstract eigenfunction expansion Theorem 9.48: our application is to take $A = H$ on $\mathcal{H} = L^2(I)$ and to denote by $B$ the operator of multiplication by $\lambda$ on $\mathcal{K} = L^2(d\mu(\lambda))$. Theorem 9.48 implies that $\operatorname{Ran} U$ and $\operatorname{Ker} U^*$ are resolvent-invariant for $B$. It will then remain to prove that $\operatorname{Ker} U^* = \{0\}$, and as remarked near Theorem 9.48, this cannot be concluded by abstract arguments. In our setting, $\operatorname{Ker} U^* = \{0\}$ will follow from resolvent-invariance of $\operatorname{Ker} U^*$ together with the following lemma:

Lemma 11.59. If $g \in L^2_c(d\mu)$ and $\check g = 0$ in $L^2(I)$, then
\[ \int g(\lambda) \, d\mu(\lambda) = 0. \tag{11.100} \]

Proof. If $g \in L^2_c(d\mu)$, then $\check g \in AC^2_{loc}([0, b))$, so $\check g = 0$ in the $L^2$ sense implies the pointwise equalities $\check g(x) = \check g'(x) = 0$ for all $x \in [0, b)$. By (11.89) and (11.90),
\[ \check g(0) = -\sin\alpha \int g(\lambda) \, d\mu(\lambda), \qquad \check g'(0) = \cos\alpha \int g(\lambda) \, d\mu(\lambda). \]
Since at least one of $\sin\alpha$, $\cos\alpha$ is nonzero and $\check g(0) = \check g'(0) = 0$, (11.100) follows.

We have now collected all the ingredients for the proof of the half-line eigenfunction expansion:

Proof of Theorem 11.56. By Theorem 9.48, the map $f \mapsto \hat f$ extends to a norm-preserving map $U : L^2(I) \to L^2(d\mu)$, with a map $U^* : L^2(d\mu) \to L^2(I)$ such that
\[ \langle U^* g, f \rangle = \langle g, U f \rangle \tag{11.101} \]


for all $f \in L^2(I)$ and $g \in L^2(d\mu)$, and $h(H) = U^* T_{h, d\mu} U$ for all bounded continuous functions $h$.

For $f \in L^2_c(I)$ and $g \in L^2_c(d\mu)$, consider the double integral
\[ \iint \overline{f(x)} \, \phi_\lambda(x) g(\lambda) \, (dx \otimes d\mu(\lambda)). \]
Fubini's theorem is applicable because $f(x) g(\lambda)$ is integrable and compactly supported in $(x, \lambda)$ and $\phi_\lambda(x)$ is bounded on compacts. Thus, we get equality of iterated integrals, which simplifies using the definitions of $\hat f$ and $\check g$ to
\[ \int \overline{f(x)} \, \check g(x) \, dx = \int \overline{\hat f(\lambda)} \, g(\lambda) \, d\mu(\lambda). \tag{11.102} \]
This holds for all $g \in L^2_c(d\mu)$ and $f \in L^2_c(I)$; in particular, it holds for a dense set of $f \in L^2(I)$. Comparing with (11.101), we see that $U^* g = \check g$ for all $g \in L^2_c(d\mu)$.

Let us prove that $\operatorname{Ker} U^* = \{0\}$. Let $g \in \operatorname{Ker} U^*$. Since $\operatorname{Ker} U^*$ is a resolvent-invariant subspace for $T_{\lambda, d\mu(\lambda)}$, it follows that $\chi_{(\lambda_1, \lambda_2]} g \in \operatorname{Ker} U^*$ for any $\lambda_1 < \lambda_2$. Moreover, $\chi_{(\lambda_1, \lambda_2]} g \in L^2_c(d\mu)$, so by Lemma 11.59,
\[ \int g(\lambda) \chi_{(\lambda_1, \lambda_2]}(\lambda) \, d\mu(\lambda) = 0 \]
for all $\lambda_1 < \lambda_2$. This implies that $g(\lambda) = 0$ for $\mu$-a.e. $\lambda$. Thus, $\operatorname{Ker} U^* = \{0\}$, so by Theorem 9.48(f), $U, U^*$ are mutually inverse unitary maps, $h(H) = U^* T_{h, d\mu} U$ holds for all bounded Borel functions $h$, and $H = U^* T_{\lambda, d\mu(\lambda)} U$.

The eigenfunction expansion provides a multiplication operator representation which is precisely of the form considered abstractly in the spectral theorem for unbounded self-adjoint operators. Thus, Theorem 11.56 allows us to apply abstract spectral theory and obtain several corollaries. The first, immediate corollary of Theorem 11.56 is the following.

Corollary 11.60. $H$ has simple (multiplicity 1) spectrum and $\mu$ is a maximal spectral measure for $H$. In particular, $\sigma(H) = \operatorname{supp} \mu$ and $\sigma_{ess}(H) = \operatorname{ess\,supp} \mu$.

If we emphasize the dependence on the parameter $\alpha$ in the boundary condition at $0$ and write $H_\alpha$, $m_\alpha$, $\mu_\alpha$ for the corresponding $\alpha$-dependent objects, (11.75) can be written in the notation of Möbius transformations as
\[ \begin{pmatrix} m_\alpha(z) \\ 1 \end{pmatrix} \simeq \begin{pmatrix} \cos(\alpha - \beta) & -\sin(\alpha - \beta) \\ \sin(\alpha - \beta) & \cos(\alpha - \beta) \end{pmatrix} \begin{pmatrix} m_\beta(z) \\ 1 \end{pmatrix}. \tag{11.103} \]


Proposition 7.57 has an immediate corollary:

Corollary 11.61. The essential spectrum of $H_\alpha$ is independent of $\alpha$. Moreover, on any interval in $\mathbb{R} \setminus \sigma_{ess}(H_\alpha)$, the discrete spectra of $H_\alpha$ and $H_\beta$ strictly interlace whenever $\alpha - \beta \notin \pi\mathbb{Z}$.

In the special case of two regular endpoints, we can apply this twice to change the boundary condition at each endpoint; thus, the special case of Dirichlet eigenvalues (Corollary 11.25) implies the following.

Corollary 11.62. Consider the operator $H = -\frac{d^2}{dx^2} + V$ with $V \in L^1([0, 1])$ and boundary conditions (11.2) and (11.3). The spectrum of $H$ is bounded from below. Arranging its elements in increasing order, $\sigma(H) = \{\lambda_n \mid n \in \mathbb{N}\}$, with $\lambda_n < \lambda_{n+1}$, the eigenvalues obey the asymptotics
\[ \lambda_n = n^2 \pi^2 + O(n), \qquad n \to \infty. \tag{11.104} \]

Proof. The case $\alpha = \beta = 0$ is Corollary 11.25. By changing the boundary conditions twice, $\sigma(H_{\alpha, \beta})$ strictly interlaces $\sigma(H_{\alpha, 0})$, which strictly interlaces $\sigma(H_{0, 0})$. Since interlacing preserves the property (11.104), the proof is complete.

Returning to the general setting, using pointwise boundary values of Herglotz functions, we can also study the absolutely continuous and singular spectrum. They have very different dependence on the boundary condition at $0$:

Proposition 11.63. For $\alpha - \beta \notin \pi\mathbb{Z}$, the absolutely continuous parts of $\mu_\alpha$ and $\mu_\beta$ are mutually absolutely continuous, i.e., $[(\mu_\alpha)_{ac}] = [(\mu_\beta)_{ac}]$, and the singular parts are mutually singular, i.e., $(\mu_\alpha)_s \perp (\mu_\beta)_s$.

Proof. By Proposition 7.47 and Theorem 7.46, the limit
\[ \lim_{\epsilon \downarrow 0} m_\alpha(\lambda + i\epsilon) \]
exists for Lebesgue-a.e. $\lambda \in \mathbb{R}$, and $(\mu_\alpha)_{ac}$ is mutually absolutely continuous with $\chi_{A_\alpha}(\lambda) \, d\lambda$, where
\[ A_\alpha = \{ \lambda \in \mathbb{R} \mid \lim_{\epsilon \downarrow 0} m_\alpha(\lambda + i\epsilon) \in \mathbb{C}_+ \}. \]
Since (11.103) represents $m_\alpha$ in terms of $m_\beta$ by a Möbius transformation which preserves $\mathbb{C}_+$, it follows that $A_\alpha = A_\beta$.


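Similarly, the singular part of the measure is supported in the set
\[ S_\alpha = \{ \lambda \in \mathbb{R} \mid \lim_{\epsilon \downarrow 0} m_\alpha(\lambda + i\epsilon) = \infty \} \]
and, using (11.103), this can be written as
\[ S_\alpha = \{ \lambda \in \mathbb{R} \mid \lim_{\epsilon \downarrow 0} m_0(\lambda + i\epsilon) = -\cot\alpha \}. \]
It follows that $S_\alpha \cap S_\beta = \emptyset$, so $(\mu_\alpha)_s \perp (\mu_\beta)_s$.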
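The two facts about (11.103) used in this proof, that the rotation by $\alpha - \beta$ preserves $\mathbb{C}_+$ and that $m_\alpha = \infty$ exactly where $m_0 = -\cot\alpha$, are easy to check numerically. The following sketch does so with a hypothetical helper `rotate_m`; the sample value of $m_0$ is taken from the free Dirichlet half-line, $m_0(z) = -\sqrt{-z}$, which is an assumption made only to have a concrete Herglotz value to feed in.

```python
import math
import cmath

def rotate_m(m, angle):
    """Moebius action of the rotation matrix in (11.103):
    m -> (cos(angle) m - sin(angle)) / (sin(angle) m + cos(angle))."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * m - s) / (s * m + c)

def pole_value(alpha):
    """Value of m_0 at which m_alpha = rotate_m(m_0, alpha) becomes infinite."""
    return -1.0 / math.tan(alpha)

if __name__ == "__main__":
    m0 = -cmath.sqrt(-(1.0 + 1.0j))      # sample value in the upper half-plane
    alpha, beta = 0.9, 0.4
    m_alpha = rotate_m(m0, alpha)
    m_beta = rotate_m(m0, beta)
    # Consistency of (11.103): going from beta to alpha uses the angle alpha - beta.
    print(m_alpha, rotate_m(m_beta, alpha - beta))
    # The upper half-plane is preserved.
    print(m0.imag > 0, m_alpha.imag > 0, m_beta.imag > 0)
    # Denominator of m_alpha vanishes when m_0 = -cot(alpha).
    print(math.sin(alpha) * pole_value(alpha) + math.cos(alpha))
```

Since $-\cot\alpha \neq -\cot\beta$ for $\alpha - \beta \notin \pi\mathbb{Z}$, the boundary value of $m_0$ at a given $\lambda$ can make at most one of the denominators vanish, which is the disjointness of $S_\alpha$ and $S_\beta$.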

Spectral properties of $H$, both qualitative and quantitative, can now be studied via the m-function and therefore via the Weyl solutions. For instance, Exercise 11.8 provides a formula for the residue of the m-function at an isolated eigenvalue $\lambda$, which can now also be interpreted in terms of $\mu(\{\lambda\})$. We also note an explicitly computable example (see Exercise 11.9 for a generalization):

Example 11.64. On the interval $I = (0, +\infty)$, the potential $V \equiv 0$ is in the limit circle case at $0$ and in the limit point case at $\infty$. If we set the Dirichlet boundary condition at $0$, the m-function is
\[ m(z) = -\sqrt{-z} \]
with the branch of $\sqrt{-z}$ such that $\operatorname{Re} \sqrt{-z} > 0$ on $\mathbb{C} \setminus [0, \infty)$. The spectrum is $\sigma(H) = [0, \infty)$. The spectrum is purely absolutely continuous and the canonical spectral measure is
\[ d\mu(\lambda) = \frac{1}{\pi} \chi_{(0, \infty)}(\lambda) \sqrt{\lambda} \, d\lambda. \tag{11.105} \]

Proof. For $z \in \mathbb{C} \setminus \mathbb{R}$, let $k = \sqrt{-z}$, with the branch of the square root such that $\operatorname{Re} k > 0$. The equation $-f'' = -k^2 f$ has linearly independent solutions $e^{\pm kx}$. Of those, the square-integrable solution is $e^{-kx}$, so this is the Weyl solution, and the m-function is computed by (11.75) with $\alpha = 0$ as $m(z) = -k$. Since $\operatorname{Im} m(z)$ extends continuously to the closed upper half-plane with values $\chi_{(0, \infty)}(\lambda) \sqrt{\lambda}$ on the real line, by Proposition 7.43, the spectral measure is precisely (11.105).

The unitary map $U$ in the eigenfunction expansion is uniquely determined as the closure of the densely defined map $f \mapsto \hat f$. If $f$ is not compactly supported, $U f$ can still be computed by suitable approximations or test functions. We give one useful example. Informally speaking, naively computing the eigenfunction expansion (11.88) of the Dirac delta function $\delta_y$ would give the function $\hat\delta_y(\lambda) = \phi_\lambda(y)$. Applying $(H - z)^{-1}$ to $\delta_y$ should correspond to multiplying the eigenfunction expansion by $(\lambda - z)^{-1}$, which motivates us to expect that the eigenfunction expansion maps the function $G(\cdot, y; z)$ to the function $\frac{\phi_\lambda(y)}{\lambda - z}$. This is not a rigorous argument, but it motivates the correct formula:


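Proposition 11.65. Fix $z \in \mathbb{C} \setminus \sigma(H)$. For any $y \in (0, b)$, the function $f(x) = G(x, y; z)$ is mapped by $U$ to the function $(U f)(\lambda) = \frac{\phi_\lambda(y)}{\lambda - z}$.

Proof. For any $g \in L^2(d\mu)$,
\[ \langle U f, g \rangle = \langle f, U^{-1} g \rangle = \int_0^b \overline{G(x, y; z)} \, (U^{-1} g)(x) \, dx = ((H - \bar z)^{-1} U^{-1} g)(y). \]
Since $(H - \bar z)^{-1} U^{-1} g = U^{-1} \left[ \frac{g(\lambda)}{\lambda - \bar z} \right]$, if $g \in L^2_c(d\mu)$, the right-hand side can be evaluated pointwise as
\[ \left( U^{-1} \left[ \frac{g(\lambda)}{\lambda - \bar z} \right] \right)(y) = \int \phi_\lambda(y) \frac{g(\lambda)}{\lambda - \bar z} \, d\mu(\lambda) = \int \frac{\phi_\lambda(y)}{\lambda - \bar z} \, g(\lambda) \, d\mu(\lambda). \]
Thus, the equality
\[ \int \overline{(U f)(\lambda)} \, g(\lambda) \, d\mu(\lambda) = \int \frac{\phi_\lambda(y)}{\lambda - \bar z} \, g(\lambda) \, d\mu(\lambda) \]
holds for all $g \in L^2_c(d\mu)$, so it follows that $(U f)(\lambda) = \frac{\phi_\lambda(y)}{\lambda - z}$.

In particular, note that this proves that
\[ \int \frac{|\phi_\lambda(y)|^2}{|\lambda - z|^2} \, d\mu(\lambda) = \int |G(x, y; z)|^2 \, dx < \infty. \]
Additional examples are given in Exercise 11.11.

Sturm oscillation theory [79, 90] counts eigenvalues below the bottom of the essential spectrum in terms of the number of zeros of eigensolutions. Renormalized oscillation theory [33] counts eigenvalues in gaps of $\sigma_{ess}(H)$ (connected components of $\mathbb{R} \setminus \sigma_{ess}(H)$) in terms of the number of zeros of Wronskians of eigensolutions.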
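The norm identity obtained from Proposition 11.65, $\int |\phi_\lambda(y)|^2 |\lambda - z|^{-2} \, d\mu(\lambda) = \int |G(x, y; z)|^2 \, dx$, can be tested numerically in the free case of Example 11.64. The sketch below makes the following assumptions, none of which come from the text beyond that example: $z = -1$, so $G(x, y; -1) = \sinh(\min(x, y)) \, e^{-\max(x, y)}$; $\phi_\lambda(y) = \sin(\sqrt{\lambda} y)/\sqrt{\lambda}$ for $\lambda > 0$; and the substitution $\lambda = s^2$, which turns the spectral side into $\frac{2}{\pi} \int_0^\infty \frac{\sin^2(s y)}{(s^2 + 1)^2} \, ds$. Both integrals are truncated, with cutoffs chosen so that the neglected tails are far below the tolerance.

```python
import math

def simpson(func, a, b, n):
    """Composite Simpson rule with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * func(a + j * h)
    return total * h / 3.0

def spectral_side(y, s_max=200.0, n=200_000):
    """(2/pi) * int_0^s_max sin(s y)^2 / (s^2 + 1)^2 ds, i.e. the integral of
    |phi_lambda(y)|^2 / |lambda + 1|^2 against d mu after lambda = s^2."""
    integrand = lambda s: math.sin(s * y) ** 2 / (s * s + 1.0) ** 2
    return 2.0 / math.pi * simpson(integrand, 0.0, s_max, n)

def green_side(y, x_max=40.0, n=4000):
    """int_0^x_max |G(x, y; -1)|^2 dx, split at the kink x = y."""
    below = simpson(lambda x: (math.sinh(x) * math.exp(-y)) ** 2, 0.0, y, n)
    above = simpson(lambda x: (math.sinh(y) * math.exp(-x)) ** 2, y, x_max, n)
    return below + above

if __name__ == "__main__":
    for y in (0.5, 1.0, 2.0):
        print(y, spectral_side(y), green_side(y))
```

This is only a consistency check in one solvable case, not a substitute for the proof; it does illustrate that the spectral measure $\frac{1}{\pi}\sqrt{\lambda} \, d\lambda$ is exactly the weight that makes the sine-type transform norm-preserving.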

11.9. Weyl disks and applications

In this section we consider another perspective on the limit point–limit circle dichotomy. This is used to generate approximations of the Weyl m-function and compute its asymptotics; already in this section, we will use it to derive the Carmona formula and prove continuity with respect to the potential. As before, we denote our interval by $I = (0, b)$, where $b$ can be finite or $+\infty$, and assume that the real-valued potential $V$ obeys $V \in L^1_{loc}([0, b))$, i.e., $V \in L^1([0, d])$ for all $d < b$. We will study in detail the behavior of eigensolutions for $z \in \mathbb{C}_+$.


Lemma 11.66. Let $z \in \mathbb{C}_+$. For any nontrivial solution of $-f'' + V f = z f$ and any $x \in (0, b)$,
\[ 2 \operatorname{Im} z \int_0^x |f(t)|^2 \, dt = i W(\bar f, f)(x) - i W(\bar f, f)(0). \tag{11.106} \]
In particular, the function
\[ -i W(\bar f, f)(x) = 2 \operatorname{Im}\bigl( \overline{f(x)} f'(x) \bigr) \tag{11.107} \]
is a strictly decreasing, real-valued function of $x$.

Proof. As before, starting from the calculation $\partial_x \bigl( i W(\bar f, f) \bigr) = 2 \operatorname{Im} z \, f \bar f$ and integrating gives (11.106). Since $f$ is a nontrivial eigensolution, if $f(y) = 0$ for some $y$, then $f'(y) \neq 0$, so $f$ has only isolated zeros. In particular, $\partial_x \bigl( i W(\bar f, f) \bigr) = 2 \operatorname{Im} z \, f \bar f > 0$ away from a discrete set, so the function $i W(\bar f, f)$ is strictly increasing.

In terms of the matrix
\[ J = \begin{pmatrix} 0 & i \\ -i & 0 \end{pmatrix}, \]
we can write
\[ -i W(\bar f, f)(x) = \begin{pmatrix} f'(x) \\ f(x) \end{pmatrix}^* J \begin{pmatrix} f'(x) \\ f(x) \end{pmatrix}, \]
which leads to a geometric interpretation: considering $\binom{f'(x)}{f(x)}$ as projective coordinates on $\hat{\mathbb{C}}$ as in Section 7.1 (see Example 7.8), we see that
\[ -i W(\bar f, f)(x) \ge 0 \iff \frac{f'(x)}{f(x)} \in \overline{\mathbb{C}_+} = \mathbb{C}_+ \cup \mathbb{R} \cup \{\infty\}, \]
with equality corresponding to $\mathbb{R} \cup \{\infty\}$.

The Weyl disk formalism will take this geometric perspective further, by linking the sign of $-i W(\bar f, f)(x)$ to the values of $f, f'$ at $0$. This will be accomplished by using Möbius transformations corresponding to transfer matrices; we do this while incorporating the boundary condition at $0$ with $\alpha \in \mathbb{R}$. We recall the transfer matrices
\[ T_\alpha(x, z) = R_\alpha \begin{pmatrix} \partial_x \phi_\alpha(x, z) & \partial_x \theta_\alpha(x, z) \\ \phi_\alpha(x, z) & \theta_\alpha(x, z) \end{pmatrix}, \]
where $\phi_\alpha(x, z), \theta_\alpha(x, z)$ are eigensolutions at $z$ with initial conditions at $0$ chosen so that $T_\alpha(0, z) = I$. Then an arbitrary eigensolution $f$ corresponds to an arbitrary $v \in \mathbb{C}^2$ by
\[ \begin{pmatrix} f'(x) \\ f(x) \end{pmatrix} = R_\alpha^{-1} T_\alpha(x, z) v. \tag{11.108} \]


Definition 11.67. For $z \in \mathbb{C}_+$ and $x$ such that $V \in L^1([0, x])$, the Weyl disks $D_\alpha(x, z)$ are defined by
\[ D_\alpha(x, z) = \left\{ w \in \hat{\mathbb{C}} \;\middle|\; \begin{pmatrix} w \\ 1 \end{pmatrix}^* T_\alpha(x, z)^* J \, T_\alpha(x, z) \begin{pmatrix} w \\ 1 \end{pmatrix} \ge 0 \right\}. \]

Let us note the promised geometric interpretation of Weyl disks:

Lemma 11.68. For any nontrivial eigensolution $f$ at $z$,
\[ \frac{\cos\alpha \, f'(0) - \sin\alpha \, f(0)}{\sin\alpha \, f'(0) + \cos\alpha \, f(0)} \in D_\alpha(x, z) \iff \frac{f'(x)}{f(x)} \in \overline{\mathbb{C}_+}. \]
The left-hand side is on the boundary of $D_\alpha(x, z)$ if and only if the right-hand side is on the boundary of $\mathbb{C}_+$.

Proof. Using (11.108) at $x = 0$ and the definition of the Weyl disk, the left-hand side is equivalent to $v^* T_\alpha(x, z)^* J \, T_\alpha(x, z) v \ge 0$. Substituting $J = R_\alpha J R_\alpha^{-1}$ and using (11.108) again gives
\[ \begin{pmatrix} f'(x) \\ f(x) \end{pmatrix}^* J \begin{pmatrix} f'(x) \\ f(x) \end{pmatrix} \ge 0, \]
which is equivalent to the right-hand side. The cases of equality are equivalent.

The Weyl circle $\partial D_\alpha(x, z)$ (the boundary of the Weyl disk) is naturally parametrized by self-adjoint boundary conditions at $x$:

Example 11.69. For any $\alpha, \beta \in \mathbb{R}$, denote by $m_{\alpha, \beta}$ the Weyl function corresponding to the Schrödinger operator $H = -\frac{d^2}{dx^2} + V$ with boundary conditions
\[ \cos\alpha \, f(0) + \sin\alpha \, f'(0) = 0, \tag{11.109} \]
\[ \cos\beta \, f(x) - \sin\beta \, f'(x) = 0. \tag{11.110} \]
For any $z \in \mathbb{C}_+$, the Weyl circle is parametrized by
\[ \partial D_\alpha(x, z) = \{ m_{\alpha, \beta}(z) \mid \beta \in \mathbb{R} \}. \]

Proof. As stated above,
\[ \frac{\cos\alpha \, f'(0) - \sin\alpha \, f(0)}{\sin\alpha \, f'(0) + \cos\alpha \, f(0)} \in \partial D_\alpha(x, z) \]
if and only if $f'(x)/f(x) \in \mathbb{R} \cup \{\infty\}$. This is equivalent to $f$ being the Weyl solution for some self-adjoint boundary condition at $x$. If $f$ is the Weyl solution for the boundary condition (11.110), then by definition,
\[ m_{\alpha, \beta}(z) = \frac{\cos\alpha \, f'(0) - \sin\alpha \, f(0)}{\sin\alpha \, f'(0) + \cos\alpha \, f(0)}. \]
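To make the Weyl circle concrete, the following sketch computes points of $\partial D_\alpha(x, z)$ in the simplest case and checks the two geometric claims made so far: such points annihilate the quadratic form in Definition 11.67, and they lie in $\mathbb{C}_+$. The setting is an assumption chosen for illustration: $V \equiv 0$ and $\alpha = 0$, so that $R_\alpha = I$, $\phi(x, z) = \sin(\kappa x)/\kappa$ and $\theta(x, z) = \cos(\kappa x)$ with $\kappa = \sqrt{z}$. The boundary condition at $x$ is taken in the form $\cos\beta \, f(x) - \sin\beta \, f'(x) = 0$; any real ratio $f'(x)/f(x)$ would serve equally well. All function names are hypothetical.

```python
import cmath
import math

def transfer_free(x, z):
    """T_0(x, z) = [[phi', theta'], [phi, theta]] for V = 0, alpha = 0."""
    kappa = cmath.sqrt(z)
    c, s = cmath.cos(kappa * x), cmath.sin(kappa * x)
    return ((c, -kappa * s), (s / kappa, c))

def weyl_form(w, x, z):
    """(w, 1)^* T^* J T (w, 1), which equals 2 Im(conj(f(x)) f'(x))
    for the eigensolution f = theta + w * phi."""
    (a, b), (c, d) = transfer_free(x, z)
    f_prime = a * w + b
    f = c * w + d
    return 2.0 * (f.conjugate() * f_prime).imag

def circle_point(beta, x, z):
    """The w for which f = theta + w * phi obeys cos(beta) f(x) - sin(beta) f'(x) = 0."""
    (a, b), (c, d) = transfer_free(x, z)
    cb, sb = math.cos(beta), math.sin(beta)
    return -(cb * d - sb * b) / (cb * c - sb * a)

if __name__ == "__main__":
    z = 1.0 + 1.0j
    for beta in (0.0, 0.7, 1.5, 2.5):
        w_near = circle_point(beta, 1.0, z)
        w_far = circle_point(beta, 2.0, z)
        # Form vanishes on the circle at x = 1; the circle at x = 2 lies strictly inside.
        print(beta, w_near, weyl_form(w_near, 1.0, z), weyl_form(w_far, 1.0, z))
```

The last printed column is strictly positive, which reflects that $-iW(\bar f, f)$ is strictly decreasing in $x$: a solution with real logarithmic derivative at $x = 2$ has logarithmic derivative in $\mathbb{C}_+$ at $x = 1$.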


Statements about arbitrary eigensolutions at $z$ can be turned into statements about transfer matrices at $z$ and then into statements about Weyl disks. Starting from Lemma 11.66, we obtain the following.

Lemma 11.70. Fix $z \in \mathbb{C}_+$.
(a) For any $x > 0$,
\[ J - T_\alpha(x, z)^* J \, T_\alpha(x, z) = 2 \operatorname{Im} z \int_0^x T_\alpha(t, z)^* R_\alpha \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} R_\alpha^* T_\alpha(t, z) \, dt. \tag{11.111} \]
(b) For any $x_1 < x_2$,
\[ T_\alpha(x_1, z)^* J \, T_\alpha(x_1, z) > T_\alpha(x_2, z)^* J \, T_\alpha(x_2, z) \tag{11.112} \]
in the sense of matrix (operator) order.
(c) The sets $D_\alpha(x, z)$ are disks in $\mathbb{C}_+$ for all $x > 0$.
(d) The sets $D_\alpha(x, z)$ are strictly nested, i.e., $D_\alpha(x_2, z) \subset \operatorname{int} D_\alpha(x_1, z)$ whenever $x_1 < x_2$.

Proof. Multiplying (11.111) from the right by an arbitrary $v \in \mathbb{C}^2$ and from the left by $v^*$ reduces it to the correct statement (11.106). By the polarization identity, this is sufficient to conclude the matrix identity (11.111). Moreover, for any $v \in \mathbb{C}^2 \setminus \{ \binom{0}{0} \}$,
\[ v^* \bigl( J - T_\alpha(x, z)^* J \, T_\alpha(x, z) \bigr) v = 2 \operatorname{Im} z \int_0^x |f(t)|^2 \, dt, \]
which is strictly increasing in $x$, implying the strict inequality (11.112).

By general principles, these are generalized disks in $\hat{\mathbb{C}}$. By (11.112), they are strictly nested. Since $T_\alpha(0, z) = I$, a direct calculation shows $D_\alpha(0, z) = \overline{\mathbb{C}_+}$; this implies that $D_\alpha(x, z)$ are Euclidean disks for all $x > 0$ and subsets of $\mathbb{C}_+$.

The statement $J - T_\alpha(x, z)^* J \, T_\alpha(x, z) \ge 0$ is $J$-contractivity of $T_\alpha(x, z)$, and the strict inequality (11.112) is strict $J$-monotonicity of this family of transfer matrices.

Now let us consider the limit $x \to b$. For fixed $z \in \mathbb{C}_+$ and $\alpha \in \mathbb{R}$, for the decreasing family of compact disks $D_\alpha(x, z)$, the intersection
\[ D_\alpha(b, z) := \bigcap_{x \in (0, b)} D_\alpha(x, z) \]
is a point or a disk (for clarity, let us emphasize that a disk has strictly positive radius). This limiting object also has an interpretation:


Lemma 11.71. For any nontrivial eigensolution $f$ at $z$,
\[
\frac{\cos\alpha\, f'(0) - \sin\alpha\, f(0)}{\sin\alpha\, f'(0) + \cos\alpha\, f(0)} \in D_\alpha(b,z)
\iff
f \in X_+ \text{ and } -iW_+(\bar f, f) \ge 0.
\]

Proof. The left-hand side holds if and only if
\[
\frac{\cos\alpha\, f'(0) - \sin\alpha\, f(0)}{\sin\alpha\, f'(0) + \cos\alpha\, f(0)} \in D_\alpha(x,z)
\]
for all $x < b$, i.e., if and only if $-iW(\bar f, f)(x) \ge 0$ for all $x < b$. Since this function is strictly decreasing, this holds if and only if
\[
\lim_{x\to b} -iW(\bar f, f)(x) \ge 0.
\]
On the other hand, by (11.106),
\[
\lim_{x\to b} -iW(\bar f, f)(x) = -iW(\bar f, f)(0) - 2 \operatorname{Im} z \int_0^b |f(t)|^2 \, dt.
\]
Finiteness of this limit implies $f \in L^2((0,b))$, so $f \in X_+$; moreover, in that case, the limit can be interpreted as $-iW_+(\bar f, f)$.

Recall that we defined the Weyl limit point–limit circle dichotomy by whether the boundary Wronskian on $X_+$ is trivial or not. The following result explains that terminology:

Theorem 11.72 (Equivalent characterizations of the limit circle case). For any $z \in \mathbb{C}_+$, the following are equivalent:
(a) The boundary Wronskian $W_+$ is not trivial on $X_+$.
(b) The set of eigensolutions at $z$ which are in $L^2((0,b))$ has dimension 2.
(c) The intersection $D_\alpha(b,z)$ is a disk.

Proof. (c) $\Rightarrow$ (b): If the intersection contains two distinct points, then by Lemma 11.71, there are two linearly independent eigensolutions $f_1, f_2 \in L^2((0,b))$.

(b) $\Rightarrow$ (a): If there are two linearly independent eigensolutions $f_1, f_2 \in L^2((0,b))$, they are both in $X_+$ and by linear independence, $W_+(f_1, f_2) \neq 0$. Thus, $W_+ \neq 0$.

(a) $\Rightarrow$ (b): Since $W_+$ is not trivial, there are two distinct Lagrangian subspaces of $X_+$, denoted $Y_1, Y_2$, and two distinct self-adjoint Schrödinger operators $H_1, H_2$ with the same $V$, $\alpha$. Each of them has a Weyl solution at $b$, denoted $f_1, f_2$; by the resolvent formula, $f_1, f_2$ must be linearly independent. Every eigensolution can be expressed as a linear combination of $f_1, f_2$, so every eigensolution is in $L^2((0,b))$.


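(b) $\Rightarrow$ (c): Let us first characterize the intersection of disks: $w \in \bigcap_x D_\alpha(x,z)$ if and only if
\[
\binom{w}{1}^* T_\alpha(x,z)^* J T_\alpha(x,z) \binom{w}{1} \ge 0
\]
for all $x > 0$, and by monotonicity in $x$, this is true if and only if
\[
\binom{w}{1}^* \Bigl( \lim_{x\to b} T_\alpha(x,z)^* J T_\alpha(x,z) \Bigr) \binom{w}{1} \ge 0.
\]
If all solutions are in $L^2((0,b))$, we can compute this limit: the second row of $R_\alpha^{-1} T_\alpha(x,z)$ consists of functions in $L^2((0,b))$, so the entries of the matrix
\[
T_\alpha(t,z)^* R_\alpha \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} R_\alpha^* T_\alpha(t,z)
= \begin{pmatrix} |\phi_\alpha(t,z)|^2 & \overline{\phi_\alpha(t,z)}\,\theta_\alpha(t,z) \\ \overline{\theta_\alpha(t,z)}\,\phi_\alpha(t,z) & |\theta_\alpha(t,z)|^2 \end{pmatrix}
\]
are in $L^1((0,b))$, and integrating this as in (11.111) gives a convergent limit
\[
J - \lim_{x\to b} \bigl(T_\alpha(x,z)^* J T_\alpha(x,z)\bigr)
= 2 \operatorname{Im} z \int_0^b \begin{pmatrix} |\phi_\alpha(t,z)|^2 & \overline{\phi_\alpha(t,z)}\,\theta_\alpha(t,z) \\ \overline{\theta_\alpha(t,z)}\,\phi_\alpha(t,z) & |\theta_\alpha(t,z)|^2 \end{pmatrix} dt.
\]
The limit is self-adjoint and
\[
\det \lim_{x\to b} \bigl(T_\alpha(x,z)^* J T_\alpha(x,z)\bigr) = \lim_{x\to b} \det\bigl(T_\alpha(x,z)^* J T_\alpha(x,z)\bigr) = -1,
\]
so the inequality
\[
\binom{w}{1}^* \Bigl( \lim_{x\to b} T_\alpha(x,z)^* J T_\alpha(x,z) \Bigr) \binom{w}{1} \ge 0
\]
defines a disk.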

The implication (b) =⇒ (c) can also be proved by explicitly computing the radius (Exercise 11.12). Taking the negation of the statements in Theorem 11.72 gives equivalent characterizations of the limit point case; recall that since there is always some self-adjoint operator and a Weyl solution at z, the set of eigensolutions in L2 ((0, b)) always has dimension at least 1. Theorem 11.73 (Equivalent characterizations of the limit point case). For any z ∈ C+ , the following are equivalent: (a) The boundary Wronskian W+ is trivial on X+ . (b) The set of eigensolutions at z which are in L2 ((0, b)) has dimension 1. (c) The intersection Dα (b, z) is a point. In the limit point case, the sole point in the intersection of Weyl disks is the value of the Weyl m-function:


Proposition 11.74. If $V$ is limit point at $b$, then for any $z \in \mathbb{C}_+$, $D_\alpha(b,z) = \{m_\alpha(z)\}$.

Proof. Let $f$ be a Weyl solution at $b$. Then
\[
m_\alpha(z) = \frac{\cos\alpha\, f'(0) - \sin\alpha\, f(0)}{\sin\alpha\, f'(0) + \cos\alpha\, f(0)}.
\]
The Weyl solution obeys $f \in X_+$ and $W_+(\bar f, f) = 0$, so by Lemma 11.71, $m_\alpha(z) \in D_\alpha(b,z)$. Since we are in the limit point case, this concludes the proof.

In the limit circle case, the limit circle is parametrized by self-adjoint boundary conditions at $b$ (Exercise 11.13).

Weyl disks provide a powerful tool for proving convergence of Herglotz functions; we present two applications in the limit point case. The first application is a theorem of Carmona, which allows us to study spectral measures through the behavior of eigensolutions with real energies.

Theorem 11.75 (Carmona). Assume that $V$ is regular at 0 and limit point at $b$. Fix $\alpha \in \mathbb{R}$. For any $h \in C_c(\mathbb{R})$,
\[
\lim_{x\to b} \int \frac{h(\lambda)}{\pi\bigl((\partial_x\phi_\alpha(x,\lambda))^2 + \phi_\alpha(x,\lambda)^2\bigr)} \, d\lambda = \int h(\lambda)\, d\mu_\alpha(\lambda). \tag{11.113}
\]

Proof. We define functions $m_\alpha(x,z)$ by
\[
\binom{m_\alpha(x,z)}{1} \simeq T_\alpha(x,z)^{-1} R_\alpha \binom{i}{1}. \tag{11.114}
\]

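Since $i \in \mathbb{C}_+$, by the definition of Weyl disks, $m_\alpha(x,z) \in D_\alpha(x,z)$ for all $x$. Since $V$ is limit point at $b$, it follows that $m_\alpha(x,z) \to m_\alpha(z)$ for each $z \in \mathbb{C}_+$. Moreover, $m_\alpha(x,z)$ are Herglotz functions, and (11.114) can be written as
\[
m_\alpha(x,z) = -\frac{i\partial_x\theta_\alpha(x,z) + \theta_\alpha(x,z)}{i\partial_x\phi_\alpha(x,z) + \phi_\alpha(x,z)}
= -\frac{\bigl(i\partial_x\theta_\alpha(x,z) + \theta_\alpha(x,z)\bigr)\bigl(-i\,\overline{\partial_x\phi_\alpha(x,z)} + \overline{\phi_\alpha(x,z)}\bigr)}{|i\partial_x\phi_\alpha(x,z) + \phi_\alpha(x,z)|^2}.
\]
Since $\phi_\alpha(x,z)$ and $\partial_x\phi_\alpha(x,z)$ are entire functions of $z$, real-valued on $\mathbb{R}$, and have no common zeros, the denominator is continuous and nonzero on $\mathbb{C}_+ \cup \mathbb{R}$. Thus, $m_\alpha(x,z)$ extend continuously to $\mathbb{R}$ with boundary values
\[
m_\alpha(x,\lambda) = -\frac{\bigl(i\partial_x\theta_\alpha(x,\lambda) + \theta_\alpha(x,\lambda)\bigr)\bigl(-i\partial_x\phi_\alpha(x,\lambda) + \phi_\alpha(x,\lambda)\bigr)}{(\partial_x\phi_\alpha(x,\lambda))^2 + \phi_\alpha(x,\lambda)^2}.
\]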


Since $W(\theta_\alpha(\cdot,z), \phi_\alpha(\cdot,z)) = 1$, the imaginary part is computed to be
\[
\operatorname{Im} m_\alpha(x,\lambda) = \frac{1}{(\partial_x\phi_\alpha(x,\lambda))^2 + \phi_\alpha(x,\lambda)^2},
\]
so the Herglotz function $m_\alpha(x,z)$ corresponds to the measure
\[
\frac{1}{\pi\bigl((\partial_x\phi_\alpha(x,\lambda))^2 + \phi_\alpha(x,\lambda)^2\bigr)} \, d\lambda.
\]
Since the Herglotz functions $m_\alpha(x,z)$ converge pointwise to the Herglotz function $m_\alpha(z)$, they converge uniformly on compacts, so the corresponding measures converge to $\mu_\alpha$ by Proposition 7.28.

Carmona's formula (11.113) is just one of many possible approximations, corresponding to a specific choice made in (11.114). Other choices are useful in specific situations; a variation useful in the study of decaying potentials $V$ is given as Exercise 11.14.

Our second application of Weyl disks involves continuity of the m-function viewed as a function of the potential. Denote by $m_H$ the m-function corresponding to the operator $H$.

Theorem 11.76. Let $V \in L^1_{\mathrm{loc}}([0,b))$ be regular at 0 and limit point at $b$. Fix $\alpha \in \mathbb{R}$. Let $H = -\frac{d^2}{dx^2} + V$ with boundary condition (11.2) at 0. Let $V_n \in L^1_{\mathrm{loc}}([0,b))$ be such that
\[
\lim_{n\to\infty} \int_0^c |V_n(x) - V(x)| \, dx = 0
\]
for all $c < b$, and let $H_n = -\frac{d^2}{dx^2} + V_n$ with the same boundary condition (11.2) at 0 and (if $V_n$ is limit circle at $b$) an arbitrary self-adjoint boundary condition at $b$. Then $m_{H_n}(z) \to m_H(z)$ uniformly on compact subsets of $z \in \mathbb{C}_+$.

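Proof. By Corollary 11.9, for any real $\beta_n \to \beta$, the solutions $f_{n,\beta_n}$ of
\[
-f_{n,\beta_n}'' + V_n f_{n,\beta_n} = z f_{n,\beta_n}, \qquad f_{n,\beta_n}(c) = \cos\beta_n, \qquad f_{n,\beta_n}'(c) = \sin\beta_n,
\]
converge in $AC^2([0,c])$ to the solution of
\[
-f_\beta'' + V f_\beta = z f_\beta, \qquad f_\beta(c) = \cos\beta, \qquad f_\beta'(c) = \sin\beta,
\]
so in particular,
\[
\lim_{n\to\infty} \frac{\cos\alpha\, f_{n,\beta_n}'(0) - \sin\alpha\, f_{n,\beta_n}(0)}{\sin\alpha\, f_{n,\beta_n}'(0) + \cos\alpha\, f_{n,\beta_n}(0)}
= \frac{\cos\alpha\, f_\beta'(0) - \sin\alpha\, f_\beta(0)}{\sin\alpha\, f_\beta'(0) + \cos\alpha\, f_\beta(0)}.
\]
By Lemma 2.10 this implies uniform convergence in $\beta$,
\[
\lim_{n\to\infty} \frac{\cos\alpha\, f_{n,\beta}'(0) - \sin\alpha\, f_{n,\beta}(0)}{\sin\alpha\, f_{n,\beta}'(0) + \cos\alpha\, f_{n,\beta}(0)}
= \frac{\cos\alpha\, f_\beta'(0) - \sin\alpha\, f_\beta(0)}{\sin\alpha\, f_\beta'(0) + \cos\alpha\, f_\beta(0)},
\]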


so the Weyl circle $\partial D_{V_n}(x,z)$ converges in Hausdorff metric $d_H$ to the Weyl circle $\partial D_V(x,z)$. Since $V$ is limit point at $b$, for any $\epsilon > 0$ there exists $x < b$ such that the diameter of $D_V(x,z)$ is smaller than $\epsilon$. For large enough $n$, $d_H(D_{V_n}(x,z), D_V(x,z)) < \epsilon$, so $|m_{H_n}(z) - m_H(z)| < 2\epsilon$. This proves that $m_{H_n}$ converges to $m_H$ pointwise in $\mathbb{C}_+$; uniform convergence on compacts follows from Corollary 7.18.

Since the Weyl disk formalism is based on the $J$-monotonicity property of transfer matrices, Weyl disks can be studied for any solution of an initial value problem of the form
\[
\partial_x T(x,z) = iJ\bigl(A(x)z + B(x)\bigr)T(x,z),
\]
where $A, B \in L^1_{\mathrm{loc}}([0,\infty))$, $\operatorname{Tr}(AJ) = \operatorname{Tr}(BJ) = 0$, $A(x) \ge 0$, and $B(x)^* = B(x)$; such an initial value problem is sometimes called a Hamiltonian system. This point of view allows us to associate a Weyl function to the family of transfer matrices without reference to a self-adjoint operator (see, e.g., [11]). The special case $B = 0$ is the setting of de Branges canonical systems; it is particularly natural from an inverse spectral theoretic point of view, since the correspondence between trace-normalized canonical systems ($\operatorname{Tr} A = 1$) and their Weyl functions is a bijection to the set of all Herglotz functions, by a deep theorem of de Branges; see [24, 78, 80].
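As a numerical sanity check of Carmona's formula (11.113), one can look at the free case $V = 0$ on $[0,\infty)$ with $\alpha = 0$, where $\phi_0(x,\lambda) = \sin(\sqrt{\lambda}\,x)/\sqrt{\lambda}$ for $\lambda > 0$. The sketch below assumes the standard fact (not derived in this section) that the free Dirichlet spectral measure is $d\mu_0(\lambda) = \pi^{-1}\sqrt{\lambda}\,d\lambda$ on $(0,\infty)$; the test function `h`, the quadrature rule, and all function names are hypothetical choices made for illustration only.

```python
import math

def carmona_density(x, lam):
    """1 / (pi (phi'(x)^2 + phi(x)^2)) for V = 0 and alpha = 0, at energy lam > 0."""
    s = math.sqrt(lam)
    phi = math.sin(s * x) / s   # solves -phi'' = lam * phi, phi(0) = 0, phi'(0) = 1
    dphi = math.cos(s * x)
    return 1.0 / (math.pi * (dphi * dphi + phi * phi))

def h(lam):
    """Continuous test function supported in [1, 4] (an arbitrary illustrative choice)."""
    return (lam - 1.0) ** 2 * (4.0 - lam) ** 2 if 1.0 < lam < 4.0 else 0.0

def midpoint(f, a, b, n):
    """Composite midpoint rule with n subintervals."""
    step = (b - a) / n
    return step * sum(f(a + (i + 0.5) * step) for i in range(n))

def carmona_lhs(x, n=100_000):
    """Left-hand side of (11.113) at a finite x, before taking the limit."""
    return midpoint(lambda lam: h(lam) * carmona_density(x, lam), 1.0, 4.0, n)

def free_rhs(n=100_000):
    """Integral of h against sqrt(lam) / pi d lam (assumed free Dirichlet spectral measure)."""
    return midpoint(lambda lam: h(lam) * math.sqrt(lam) / math.pi, 1.0, 4.0, n)

if __name__ == "__main__":
    for x in (5.0, 50.0, 500.0):
        print(x, carmona_lhs(x), free_rhs())
```

The density oscillates rapidly in $\lambda$ for large $x$, but its average over one period of $\sqrt{\lambda}\,x$ is $\sqrt{\lambda}/\pi$, which is why the two integrals approach each other.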

11.10. Asymptotic behavior of m-functions

We will now investigate the asymptotics of the m-functions as $z \to \infty$. In order to be concise and complete, it will be convenient to use the following convention:

Definition 11.77. Let $P$ be a metric space, and let $\Omega \subset \mathbb{C}$. For two functions $F, G : \Omega \times P \to \hat{\mathbb{C}}$, we use the notation
\[
F = \tilde o(|G|), \qquad z \to \infty,\ z \in \Omega, \text{ uniformly in bounded subsets of } P
\]
to denote that for every bounded subset $Q \subset P$,
\[
\limsup_{\substack{z\to\infty \\ z\in\Omega}} \sup_{p\in Q} \frac{|F(z,p)|}{|G(z,p)|} < \infty
\]
and, for every $p \in P$, $|F(z,p)|/|G(z,p)| \to 0$ as $z \to \infty$, $z \in \Omega$.

Throughout this section, $k = \sqrt{-z}$, with the branch of the square root chosen so that $\operatorname{Re} k > 0$. In particular, $-k$ is a Herglotz function; it is the Weyl function corresponding to the free half-line Schrödinger operator with a Dirichlet boundary condition at 0. The central result is the following:

Theorem 11.78. Let $V \in L^1_{\mathrm{loc}}([0,b))$ be real-valued, $b \ge 1$, and let $H = -\frac{d^2}{dx^2} + V$ have a Dirichlet boundary condition at 0. If $V$ is limit circle at $b$, assume an arbitrary self-adjoint boundary condition at $b$. Then the following hold.
(a) For any $\delta > 0$,
\[
m(z) = -k - \int_0^1 e^{-2kt} V(t)\, dt + \tilde o(|k|^{-1}), \qquad z \to \infty,\ \arg z \in [\delta, \pi - \delta],
\]
uniformly in bounded subsets of $V \in L^1([0,1])$.
(b) If in addition $H$ is semibounded, i.e., $\inf \sigma(H) > -\infty$, then
\[
m(z) = -k - \int_0^1 e^{-2kt} V(t)\, dt + o(|k|^{-1}), \qquad z \to \infty,\ \arg z \in [\delta, 2\pi - \delta].
\]
This is uniform in bounded subsets of $V \in L^1([0,1])$ with $H$ such that $\inf \sigma(H) \ge C$, where $C \in \mathbb{R}$.

The proof of (a) consists of two parts: one is the derivation of the special case of operators on $[0,1]$ with a Dirichlet boundary condition at 1, and the other is an Atkinson argument which uses Weyl disks. We formulate both as lemmas below. The $O(\cdot)$ estimates come from explicit bounds on fundamental solutions and explicit functions like $c(x,k)$ and $s(x,k)$, whereas some of the $o(\cdot)$ estimates will come from the dominated convergence theorem and will therefore be pointwise in $V \in L^1([0,1])$.

The proof of (b) uses the Phragmén–Lindelöf principle to extend to a bigger sector which includes a negative half-line. Poles of $m$ arbitrarily far on the negative half-line would be an obstacle to such asymptotics, so this is only possible if $m$ is analytic on $\mathbb{C}_+ \cup (-\infty, C) \cup \mathbb{C}_-$ for some $C \in \mathbb{R}$, or equivalently, if $H$ is semibounded. In general, the potential $V$ outside $[0,1]$ can be modified to make $\inf \sigma(H)$ arbitrarily small or $-\infty$; nonetheless, (b) is often applicable and useful. By Corollary 11.62, Schrödinger operators


with two regular endpoints are semibounded, and this sufficient criterion for semiboundedness will be generalized in Section 11.14.

Lemma 11.79. For any $\delta > 0$ and $V \in L^1([0,1])$, assuming Dirichlet boundary conditions at 0 and 1,
\[
m(z) = -k - \int_0^1 e^{-2kt} V(t)\, dt + \tilde o(|k|^{-1}), \qquad z \to \infty,\ \arg z \in [\delta, 2\pi - \delta], \tag{11.115}
\]
uniformly in bounded subsets of $V \in L^1([0,1])$.

Proof. By evaluating Wronskians at $x = 0$ and $x = 1$, we can express the m-function as
\[
m(z) = -\frac{W(\psi_z^+, v_z)}{W(\psi_z^+, u_z)} = -\frac{v(1,z)}{u(1,z)}. \tag{11.116}
\]
The key is to revisit Proposition 11.12 and its proof: by estimating only terms for $n$ from 3 to $\infty$ and leaving the terms for $n = 0, 1, 2$ intact, we obtain asymptotic expansions for $u(1,z)$ and $v(1,z)$ with a higher power of $|||k|||^{-1}$. With the notation
\[
A_n = 2k^{n+1} e^{-k} \int_{\Delta_n(1)} s(1 - t_1, k) \prod_{j=1}^{n-1} \bigl( V(t_j)\, s(t_j - t_{j+1}, k) \bigr) V(t_n)\, s(t_n, k) \, d^n t,
\]
\[
B_n = 2k^{n} e^{-k} \int_{\Delta_n(1)} s(1 - t_1, k) \prod_{j=1}^{n-1} \bigl( V(t_j)\, s(t_j - t_{j+1}, k) \bigr) V(t_n)\, c(t_n, k) \, d^n t,
\]
the proof of Proposition 11.12 gives
\[
\Bigl| u(1,z) - s(1,k) - \frac{e^k}{2k^2} A_1 - \frac{e^k}{2k^3} A_2 \Bigr| \le |||k|||^{-4} e^{|\operatorname{Re} k| + \|V\|_{L^1}},
\]
\[
\Bigl| v(1,z) - c(1,k) - \frac{e^k}{2k} B_1 - \frac{e^k}{2k^2} B_2 \Bigr| \le |||k|||^{-3} e^{|\operatorname{Re} k| + \|V\|_{L^1}}.
\]
In the nontangential limit $z \to \infty$, $\arg z \in [\delta, 2\pi - \delta]$, the term $|e^{-k}| = e^{-\operatorname{Re} k} \le e^{-|k| \sin(\delta/2)}$ decays exponentially, so the series expansions for $u(1,z)$, $v(1,z)$ imply
\[
u(1,z) = \frac{e^k}{2k} \Bigl( 1 + \frac{A_1}{k} + \frac{A_2}{k^2} + O(|k|^{-3}) \Bigr),
\]
\[
v(1,z) = \frac{e^k}{2} \Bigl( 1 + \frac{B_1}{k} + \frac{B_2}{k^2} + O(|k|^{-3}) \Bigr).
\]
By (11.116), this implies
\[
m_{0,0}(z) = -k \Bigl( 1 + \frac{B_1 - A_1}{k} + \frac{B_2 - A_2 - A_1(B_1 - A_1)}{k^2} + O(|k|^{-3}) \Bigr). \tag{11.117}
\]


These terms can be further simplified. In particular,
\[
B_1 - A_1 = \int_0^1 \bigl(1 - e^{-2k(1-t)}\bigr) V(t) e^{-2kt} \, dt = \int_0^1 V(t) e^{-2kt} \, dt + O(e^{-2\operatorname{Re} k}). \tag{11.118}
\]
The second term can be rewritten more explicitly (Exercise 11.15), but for our purposes, it suffices to use the bounds
\[
|s(t,k)| \le \frac{e^{t \operatorname{Re} k}}{|k|}, \qquad |c(t,k)| \le e^{t \operatorname{Re} k}
\]
and the pointwise limits
\[
\lim_{\substack{z\to\infty \\ \arg z \in [\delta, 2\pi-\delta]}} \frac{s(t,k)}{e^{tk}/k}
= \lim_{\substack{z\to\infty \\ \arg z \in [\delta, 2\pi-\delta]}} \frac{c(t,k)}{e^{tk}} = \frac{1}{2}
\]
to conclude by dominated convergence that nontangentially,
\[
A_n, B_n \to 2^{-n} \int_{\Delta_n(1)} \prod_{j=1}^{n} V(t_j) \, d^n t.
\]
It immediately follows that for each $V$, $B_2 - A_2 - A_1(B_1 - A_1) \to 0$. Using this and (11.118), the expansion (11.117) improves to (11.115).

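Lemma 11.80. In the setting of Theorem 11.78, the radius $r(z)$ of the Weyl disk $D_0(1,z)$ decays exponentially as $z \to \infty$,
\[
r(z) = \frac{2|z|}{|\operatorname{Im} \sqrt{-z}|} e^{-2 \operatorname{Re} \sqrt{-z}} \bigl(1 + O(|z|^{-1/2})\bigr), \qquad z \to \infty,\ \arg z \in [\delta, \pi - \delta],
\]
uniformly for bounded sets of $V \in L^1([0,1])$ for any $\delta > 0$.

Proof. The Weyl disk is given by
\[
D(1,z) = \Bigl\{ w \Bigm| \binom{w}{1}^* T(1,z)^* J T(1,z) \binom{w}{1} \ge 0 \Bigr\}.
\]
The matrix $M = T(1,z)^* J T(1,z)$ obeys $\det M = -1$, so the radius of the Weyl disk can be computed by Lemma 7.6 as
\[
r(z) = -\frac{1}{M_{11}} = \frac{1}{\bigl| \overline{u_z'(1)}\, u_z(1) - \overline{u_z(1)}\, u_z'(1) \bigr|}. \tag{11.119}
\]
Proposition 11.12 implies
\[
u_z(1) = \frac{e^k}{2k} \bigl(1 + O(|k|^{-1})\bigr), \qquad u_z'(1) = \frac{e^k}{2} \bigl(1 + O(|k|^{-1})\bigr), \tag{11.120}
\]
which together imply
\[
\overline{u_z(1)}\, u_z'(1) = e^{2 \operatorname{Re} k} \Bigl( \frac{1}{4\bar k} + O(|k|^{-2}) \Bigr). \tag{11.121}
\]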


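We insert this into (11.119) and use
\[
\Bigl| \frac{1}{4k} - \frac{1}{4\bar k} \Bigr| = \frac{|\bar k - k|}{4|k|^2} = \frac{|\operatorname{Im} k|}{2|k|^2}.
\]
Since $|k| = O(|\operatorname{Im} k|)$ as $z \to \infty$, $\arg z \in [\delta, \pi - \delta]$, this implies
\[
\frac{1}{r(z)} = \frac{|\operatorname{Im} k|}{2|k|^2} e^{2 \operatorname{Re} k} \bigl(1 + O(|k|^{-1})\bigr),
\]
and inverting this completes the proof.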
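The radius formula (11.119) and the asymptotics of Lemma 11.80 can be checked numerically in the free case $V = 0$, where $u_z(x) = \sinh(kx)/k$. The following is a minimal sketch under that assumption; evaluating the same expression at a general point $x$ (rather than only at $x = 1$) is an extrapolation made here for illustration, and the function names are hypothetical.

```python
import cmath
import math

def k_of(z):
    """k = sqrt(-z); the principal branch of cmath.sqrt gives Re k >= 0."""
    return cmath.sqrt(-z)

def weyl_radius_free(x, z):
    """1 / |conj(u') u - conj(u) u'| at the point x, with u(t) = sinh(k t) / k for V = 0."""
    k = k_of(z)
    u = cmath.sinh(k * x) / k
    du = cmath.cosh(k * x)
    return 1.0 / abs(du.conjugate() * u - u.conjugate() * du)

def weyl_radius_leading(z):
    """Leading term 2 |z| exp(-2 Re k) / |Im k| from Lemma 11.80 (disk at x = 1)."""
    k = k_of(z)
    return 2.0 * abs(z) * math.exp(-2.0 * k.real) / abs(k.imag)

if __name__ == "__main__":
    z = 100j
    # Ratio of the exact free radius to the leading asymptotic term.
    print(weyl_radius_free(1.0, z) / weyl_radius_leading(z))
    # Radii at increasing x: these should decrease, reflecting nested disks.
    print([weyl_radius_free(x, z) for x in (0.25, 0.5, 1.0, 2.0)])
```

For $V = 0$ the radius reduces to $|k|^2 / |\operatorname{Im} k \sinh(2x \operatorname{Re} k) - \operatorname{Re} k \sin(2x \operatorname{Im} k)|$, which makes both the exponential decay in $|k|$ and the monotonicity in $x$ visible.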

Proof of Theorem 11.78. (a) Denote by $m_0(z)$ the m-function which corresponds to a Dirichlet boundary condition at 1. Then $m_0(z) \in \partial D_0(1,z)$ and $m(z) \in D(1,z)$, so $|m(z) - m_0(z)| \le 2r(z)$. Since $r(z)$ decays exponentially as $z \to \infty$ with $\arg z \in [\delta, \pi - \delta]$, the polynomial asymptotics of $m_0(z)$ from Lemma 11.79 apply also to $m(z)$ in the sector $\arg z \in [\delta, \pi - \delta]$.

(b) Since $m$ and $m_0$ are meromorphic Herglotz functions whose sets of poles are bounded below, by a corollary of the Phragmén–Lindelöf method (Corollary 7.64) the asymptotics of $m(z)$ extends to the sector $\arg z \in [\delta, 2\pi - \delta]$.

If we are only interested in the leading asymptotics, we can simplify this:

Corollary 11.81. In the setting of Theorem 11.78,
\[
m(z) = -\sqrt{-z} + \tilde o(1) \tag{11.122}
\]
as $z \to \infty$, with the same uniformity statements as in Theorem 11.78.

Proof. By dominated convergence, for any $V \in L^1([0,1])$,
\[
\lim_{\substack{z\to\infty \\ \arg z \in [\delta, 2\pi - \delta]}} \int_0^1 e^{-2kt} V(t) \, dt = 0,
\]
so the result follows from Theorem 11.78.

While the boundary condition at 1 was shown to have an asymptotically exponentially small contribution, the eﬀect of the boundary condition at 0 is much more interesting; we leave this to Exercise 11.17. Under some continuity assumptions, the m-function asymptotics can be made more precise by a more careful analysis of integrals involving V . The multiplier e−2kt decays as t goes from 0 to 1, so this integral emphasizes the values of V (t) from small t. In particular, with some additional regularity at 0, only the value V (0) matters:


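Corollary 11.82. In the setting of Theorem 11.78, if in addition $V$ has a Lebesgue point at 0, then
\[
m(z) = -k - \frac{V(0)}{2k} + o(|k|^{-1}) \tag{11.123}
\]
as $z \to \infty$ in the same sector as before.

Proof. It suffices to prove that if $V$ has a Lebesgue point at 0, then
\[
\int_0^1 2k e^{-2kt} V(t) \, dt \to V(0), \qquad z \to \infty,\ \arg z \in [\delta, 2\pi - \delta]. \tag{11.124}
\]
Let us denote $f(x) = \int_0^x |V(t) - V(0)| \, dt$ and
\[
\omega(\epsilon) = \sup_{x \in (0,\epsilon]} \frac{f(x)}{x}.
\]
By the Lebesgue point condition, $\lim_{\epsilon \to 0} \omega(\epsilon) = 0$. Now we fix $\epsilon > 0$ and note
\[
\int_\epsilon^1 |2k e^{-2kt}| \, |V(t) - V(0)| \, dt \le 2|k| e^{-2\epsilon \operatorname{Re} k} \|V - V(0)\|_1,
\]
which converges to 0 as $k \to \infty$ in the sector $\arg k \in [\delta/2, \pi - \delta/2]$. For the integral on $[0,\epsilon]$, we use $f \in AC([0,1])$ and integration by parts,
\[
\int_0^\epsilon 2|k| e^{-2 \operatorname{Re} k\, t} |V(t) - V(0)| \, dt
= 2|k| e^{-2 \operatorname{Re} k\, \epsilon} f(\epsilon) + \int_0^\epsilon 4|k| \operatorname{Re} k \, e^{-2 \operatorname{Re} k\, t} f(t) \, dt.
\]
Obviously, $2|k| e^{-2 \operatorname{Re} k\, \epsilon} f(\epsilon) \to 0$, and the remaining integral obeys
\[
\int_0^\epsilon 4|k| \operatorname{Re} k \, e^{-2 \operatorname{Re} k\, t} f(t) \, dt \le \omega(\epsilon) \int_0^\epsilon 4|k| \operatorname{Re} k \, e^{-2 \operatorname{Re} k\, t}\, t \, dt.
\]
Another integration by parts on the right-hand side solves this and gives
\[
\limsup_{\substack{k\to\infty \\ \arg k \in [\delta/2, \pi - \delta/2]}} \int_0^\epsilon 2|k| e^{-2 \operatorname{Re} k\, t} |V(t) - V(0)| \, dt \le \omega(\epsilon).
\]
We conclude that
\[
\limsup_{\substack{k\to\infty \\ \arg k \in [\delta/2, \pi - \delta/2]}} \int_0^1 2|k| e^{-2 \operatorname{Re} k\, t} |V(t) - V(0)| \, dt \le \omega(\epsilon),
\]
which is arbitrarily small since $\epsilon > 0$ is arbitrary. Thus,
\[
\lim_{\substack{k\to\infty \\ \arg k \in [\delta/2, \pi - \delta/2]}} \int_0^1 2|k| e^{-2 \operatorname{Re} k\, t} |V(t) - V(0)| \, dt = 0,
\]
which implies (11.124).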
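The limit (11.124) is easy to observe numerically for a continuous potential. The sketch below approximates $\int_0^1 2k e^{-2kt} V(t)\,dt$ with a composite Simpson rule; the sample potential $V(t) = 1 + t$, the quadrature resolution, and the function name are illustrative assumptions, not part of the text.

```python
import cmath

def laplace_average(V, z, n=20_000):
    """Composite Simpson approximation of int_0^1 2k exp(-2kt) V(t) dt, with k = sqrt(-z).

    n must be even; cmath.sqrt uses the principal branch, so Re k >= 0.
    """
    k = cmath.sqrt(-z)

    def f(t):
        return 2.0 * k * cmath.exp(-2.0 * k * t) * V(t)

    h = 1.0 / n
    total = f(0.0) + f(1.0)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3.0

if __name__ == "__main__":
    V = lambda t: 1.0 + t          # hypothetical sample potential with V(0) = 1
    for size in (1e2, 1e4, 1e6):
        z = complex(0.0, size)     # z -> infinity along the positive imaginary axis
        print(size, laplace_average(V, z))
```

For this particular $V$ the integral equals $1 - 2e^{-2k} + (1 - e^{-2k})/(2k)$, so the deviation from $V(0)$ is of order $|k|^{-1}$, consistent with the weight $2k e^{-2kt}$ concentrating near $t = 0$.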


This concludes the m-function asymptotics we will need in this chapter. We mention that Theorem 11.78 can be strengthened under stronger smoothness assumptions on $V$ (Exercise 11.20). For instance, if $V \in C^n([0,1])$ for some $n \in \mathbb{N}$, there exist coefficients $c_0(V), c_1(V), \dots, c_{n+2}(V)$ such that
\[
m(z) = -\sum_{j=0}^{n+2} c_j k^{1-j} + o(|k|^{-n-1})
\]
as $z \to \infty$ in the appropriate sector. The coefficients are uniform for $V$ in bounded subsets of $C^n([0,1])$. By Corollary 11.82, we already know that
\[
c_0 = 1, \qquad c_1 = 0, \qquad c_2 = V(0)/2.
\]
To compute further terms, rather than following the calculations in the proof, it is more natural to find the coefficients $c_j$ by the following method. We will vary $x$ and use the logarithmic derivative of the Weyl solution
\[
m(x,z) = \frac{(\psi_z^+)'(x)}{\psi_z^+(x)}.
\]
For each $x$, the function $m(x,z)$ obeys an expansion of the same form but with different coefficients,
\[
m(x,z) = -k \sum_{j=0}^{n+2} c_j(x) k^{-j} + o(|k|^{-n-1}), \qquad \arg z \in [\delta, \pi - \delta], \tag{11.125}
\]
and the first few coefficients are already known to be
\[
c_0 = 1, \qquad c_1 = 0, \qquad c_2(x) = \frac{V(x)}{2}. \tag{11.126}
\]

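To obtain an effective way of computing coefficients $c_j(x)$, let us consider their $x$-dependence. This relies on the following:

Lemma 11.83. For any $z \in \mathbb{C}_+$, $m(x,z)$ obeys the Riccati equation
\[
\partial_x m(x,z) = V(x) - z - m(x,z)^2.
\]

Proof. This follows by a calculation from $-(\psi_z^+)'' + V \psi_z^+ = z \psi_z^+$: writing $\psi = \psi_z^+$, we have $\partial_x (\psi'/\psi) = \psi''/\psi - (\psi'/\psi)^2 = (V - z) - m^2$.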
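The Riccati equation can be sanity-checked on an explicit example. The sketch below takes $V = 0$ on $[0,1]$ with a Dirichlet condition at 1, so that $\psi(t) = \sinh(k(1-t))$ is a solution vanishing at the right endpoint and $m(x,z) = -k \coth(k(1-x))$; this choice of example, the finite-difference step, and the function names are assumptions made for illustration.

```python
import cmath

def m_dirichlet(x, z):
    """m(x, z) = psi'(x) / psi(x) for V = 0 and psi(t) = sinh(k (1 - t)), k = sqrt(-z)."""
    k = cmath.sqrt(-z)
    return -k * cmath.cosh(k * (1.0 - x)) / cmath.sinh(k * (1.0 - x))

def riccati_residual(x, z, h=1e-5):
    """Central-difference estimate of d/dx m(x, z) - (V(x) - z - m(x, z)^2) with V = 0."""
    dm = (m_dirichlet(x + h, z) - m_dirichlet(x - h, z)) / (2.0 * h)
    m = m_dirichlet(x, z)
    return dm - (0.0 - z - m * m)

if __name__ == "__main__":
    # The residual should be zero up to finite-difference and rounding error.
    print(abs(riccati_residual(0.3, 1 + 1j)))
```

Analytically, both sides equal $-k^2 / \sinh^2(k(1-x))$ in this example, so the residual measures only discretization error.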

Theorem 11.84. For $V \in C^n([0,1])$, the coefficients $c_j(x)$ are locally absolutely continuous in $x$ and are described by the recursion relation
\[
c_n(x) = \frac{1}{2} c_{n-1}'(x) - \frac{1}{2} \sum_{j=2}^{n-2} c_j(x) c_{n-j}(x), \qquad n \ge 3, \tag{11.127}
\]
together with (11.126).


Proof. It follows from the Riccati equation and (11.125) that $\partial_x m(x,z)$ has an asymptotic expansion of the form
\[
\partial_x m(x,z) = -k \sum_{j=0}^{n+2} d_j(x) k^{-j} + o(|k|^{-n-1})
\]
with $d_0 = d_1 = 0$ and
\[
d_l(x) = \sum_{j=0}^{l+1} c_j(x) c_{l+1-j}(x) \qquad \text{for } l \ge 2.
\]
However, integrating $\partial_x m(x, -k^2)$ from 0 to $a$ and using uniform boundedness of the error implicit in $o(|k|^{-n-1})$ implies that
\[
m(a,z) - m(0,z) = -k \sum_{j=0}^{n+2} \Bigl( \int_0^a d_j(x) \, dx \Bigr) k^{-j} + o(|k|^{-n-1}).
\]
Comparing coefficients with (11.125), it follows that $c_j(x) \in AC([0,a])$ and $c_j' = d_j$. Thus, the previous relations combine into
\[
c_l' = \sum_{j=0}^{l+1} c_j c_{l+1-j} = 2c_{l+1} + \sum_{j=2}^{l-1} c_j c_{l+1-j}.
\]

Using (11.126), this can be rewritten as a formula for $c_{l+1}$ in terms of lower order coefficients, which gives the recursion (11.127).

Now the coefficients $c_n$ are computable recursively using (11.126) and (11.127), e.g.,
\[
c_3(x) = \frac{V'(x)}{4}, \qquad c_4(x) = \frac{V''(x) - V(x)^2}{8}, \qquad \dots.
\]
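The recursion (11.127) is mechanical enough to automate. As a minimal sketch, assume $V$ is a polynomial in $x$, stored as a list of exact rational coefficients (lowest degree first); the helper names below are hypothetical and the polynomial representation is chosen only to keep the example free of external libraries.

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zero coefficients, keeping at least one entry."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def poly_add(p, q):
    n = max(len(p), len(q))
    p = list(p) + [Fraction(0)] * (n - len(p))
    q = list(q) + [Fraction(0)] * (n - len(q))
    return trim([a + b for a, b in zip(p, q)])

def poly_scale(s, p):
    return trim([s * a for a in p])

def poly_mul(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def poly_diff(p):
    """Derivative with respect to x."""
    return trim([i * a for i, a in enumerate(p)][1:] or [Fraction(0)])

def riccati_coefficients(V, n_max):
    """Return [c_0, ..., c_{n_max}] from (11.126) and the recursion (11.127)."""
    half = Fraction(1, 2)
    c = [[Fraction(1)], [Fraction(0)], poly_scale(half, V)]
    for n in range(3, n_max + 1):
        acc = poly_scale(half, poly_diff(c[n - 1]))
        for j in range(2, n - 1):               # j = 2, ..., n - 2
            acc = poly_add(acc, poly_scale(-half, poly_mul(c[j], c[n - j])))
        c.append(acc)
    return c[: n_max + 1]

if __name__ == "__main__":
    V = [Fraction(0), Fraction(0), Fraction(1)]   # V(x) = x^2
    for n, cn in enumerate(riccati_coefficients(V, 5)):
        print(n, cn)
```

For $V(x) = x^2$ this reproduces $c_3 = V'/4 = x/2$ and $c_4 = (V'' - V^2)/8 = 1/4 - x^4/8$, matching the closed forms above.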

Let us return to the general setting of a self-adjoint operator $H = -\frac{d^2}{dx^2} + V$ on $I = (\ell_-, \ell_+)$ with separated boundary conditions. For $x \in I$, denote
\[
m_\pm(x;z) = \pm \frac{(\psi_z^\pm)'(x)}{\psi_z^\pm(x)}.
\]
These are the half-line m-functions corresponding to the Schrödinger operators on $I \cap [x, \infty)$ and $I \cap (-\infty, x]$, respectively, with Dirichlet boundary conditions at $x$. In terms of them, from (11.69), the diagonal Green's function can be written as
\[
G(x,x;z) = -\frac{1}{m_-(x;z) + m_+(x;z)}. \tag{11.128}
\]
This allows us to compute their asymptotics, e.g.,


Lemma 11.85. For any $x \in I$, the diagonal Green's function obeys the nontangential asymptotics
\[
G(x,x;z) = \frac{1}{2\sqrt{-z}} + o(z^{-1}), \qquad z \to \infty,
\]
in the sector $\arg z \in [\delta, \pi - \delta]$ for any $\delta > 0$.

Proof. The asymptotics in $\arg z \in [\delta, \pi - \delta]$ follow from (11.128) since $m_\pm(x;z) = -\sqrt{-z} + o(1)$ in the same sector.

11.11. The local Borg–Marchenko theorem

For a Schrödinger operator with a regular endpoint at 0 and its m-function $m(z)$, we have already seen that the leading asymptotic behavior of $m(z)$ encodes the boundary condition at 0, and that further terms in the expansion encode the behavior of $V$ at 0. In fact, we are about to see that the m-function determines the entire potential uniquely:

Theorem 11.86 (Borg–Marchenko). Let $V \in L^1_{\mathrm{loc}}([0,b))$, $\tilde V \in L^1_{\mathrm{loc}}([0,\tilde b))$. Consider Schrödinger operators $H$, $\tilde H$ with separated boundary conditions and potentials $V$, $\tilde V$, respectively, and denote by $m(z)$, $\tilde m(z)$ the corresponding m-functions. If $m = \tilde m$, then $H = \tilde H$.

This result has a local version. We will present the local version first, following a short proof due to Bennewitz:

Theorem 11.87 (The local Borg–Marchenko theorem). In the setting of Theorem 11.86, denote the boundary conditions at 0 for $H$, $\tilde H$ by
\[
\cos\alpha\, f(0) + \sin\alpha\, f'(0) = 0, \qquad \cos\tilde\alpha\, f(0) + \sin\tilde\alpha\, f'(0) = 0.
\]
For any $d \in (0, \min(b, \tilde b))$, the following are equivalent:
(a) $\alpha = \tilde\alpha$ and $V = \tilde V$ on $(0,d)$.
(b) For every $\epsilon, \delta > 0$,
\[
m(z) - \tilde m(z) = O\bigl(e^{-2(d-\epsilon) \operatorname{Re} \sqrt{-z}}\bigr), \qquad z \to \infty,\ \arg z \in [\delta, \pi - \delta]. \tag{11.129}
\]

Proof of Theorem 11.87. From the asymptotics of fundamental solutions we know that $m(z) = -\sqrt{-z}\,(1 + o(1))$ if $\alpha = 0$ and (see Exercise 11.17) $m(z) = \cot\alpha\,(1 + o(1))$ if $\alpha \in (0,\pi)$, so the condition (11.129) implies $\alpha = \tilde\alpha$. The leading asymptotics of fundamental solutions implies
\[
\phi(x,z) = \frac{1}{2} \bigl( -\sin\alpha + \cos\alpha/\sqrt{-z} \bigr) e^{x\sqrt{-z}} (1 + o(1)), \qquad z \to \infty,\ \arg z \in [\delta, \pi - \delta]. \tag{11.130}
\]


Moreover, by Lemma 11.85, $G(x,x;z) = \phi(x,z)\psi(x,z) \to 0$ as $z \to \infty$ in the sector $\arg z \in [\delta, \pi - \delta]$. By (11.130), $\phi(x,z)/\tilde\phi(x,z) \to 1$ as $z \to \infty$ with $\arg z \in [\delta, \pi - \delta]$, so $\phi(x,z)\tilde\psi(x,z)$ and $\tilde\phi(x,z)\psi(x,z)$ converge to 0. Thus, so does their difference
\[
\tilde\phi(x,z)\theta(x,z) - \phi(x,z)\tilde\theta(x,z) + \bigl(m(z) - \tilde m(z)\bigr)\phi(x,z)\tilde\phi(x,z). \tag{11.131}
\]

(a) $\Rightarrow$ (b): If $V = \tilde V$ on $(0,d)$, then $\phi(x,z) = \tilde\phi(x,z)$ and $\theta(x,z) = \tilde\theta(x,z)$ for $x \in (0,d)$, so (11.131) implies
\[
\bigl(m(z) - \tilde m(z)\bigr)\phi(x,z)\tilde\phi(x,z) \to 0,
\]
which by the leading asymptotics of $\phi(x,z)$ and $\tilde\phi(x,z)$ implies $m(z) - \tilde m(z) = o(e^{-2x \operatorname{Re} \sqrt{-z}})$ for all $x < d$.

(b) $\Rightarrow$ (a): By the leading asymptotics of $\phi(x,z)$ and $\tilde\phi(x,z)$, (11.131) implies that for any $x \in (0,d)$,
\[
F(z) = \tilde\phi(x,z)\theta(x,z) - \phi(x,z)\tilde\theta(x,z) \to 0, \qquad z \to \infty,\ \arg z \in [\delta, \pi - \delta].
\]
The function $F$ is entire and, by the estimates for fundamental solutions, obeys
\[
F(z) = O\bigl(e^{2x|\operatorname{Re} \sqrt{-z}|}\bigr) = O\bigl(e^{2x|z|^{1/2}}\bigr), \qquad z \to \infty.
\]
Moreover, by the symmetry $F(\bar z) = \overline{F(z)}$, the function decays along any nonreal ray. By the Phragmén–Lindelöf Theorem 7.63 applied in the left and right half-planes, $F$ is bounded on $\mathbb{C}$, so it is constant by Liouville's theorem and therefore identically equal to 0. Thus, for any $z \in \mathbb{C} \setminus \mathbb{R}$ and $x \in (0,d)$,
\[
\frac{\theta(x,z)}{\phi(x,z)} = \frac{\tilde\theta(x,z)}{\tilde\phi(x,z)}.
\]
Differentiating in $x$ and using $\phi'\theta - \phi\theta' = \tilde\phi'\tilde\theta - \tilde\phi\tilde\theta' = 1$ implies that $\phi^2 = \tilde\phi^2$. Taking the logarithmic derivative in $x$ gives
\[
\frac{\phi'(x,z)}{\phi(x,z)} = \frac{\tilde\phi'(x,z)}{\tilde\phi(x,z)},
\]
and differentiating again implies
\[
\frac{\phi''(x,z)}{\phi(x,z)} = \frac{\tilde\phi''(x,z)}{\tilde\phi(x,z)},
\]
which, since $-\phi'' + V\phi = z\phi$ and $-\tilde\phi'' + \tilde V\tilde\phi = z\tilde\phi$, implies that $V = \tilde V$ on $(0,d)$.

Proof of Theorem 11.86. Theorem 11.87 applies for any $d < \min(b, \tilde b)$, so $\alpha = \tilde\alpha$ and $V = \tilde V$ on $(0,c)$, with $c = \min(b, \tilde b)$. In particular, $H$ and $\tilde H$ have the same Weyl disks $D(c,z) = \bigcap_{x \in (0,c)} D(x,z)$.


Note that $b = c$ if and only if $m(z) \in \partial D(c,z)$, and analogously $\tilde b = c$ if and only if $\tilde m(z) \in \partial D(c,z)$. Thus, $m = \tilde m$ implies that $b = c$ if and only if $\tilde b = c$. Then, by the definition of $c$, we conclude $b = \tilde b = c$. Now the value of $m = \tilde m$ encodes the boundary condition, if any, at the endpoint $b = \tilde b$ (see Exercise 11.13), so $H = \tilde H$.

11.12. Full-line eigenfunction expansions

In this section, we present the general eigenfunction expansion for one-dimensional Schrödinger operators with separated boundary conditions. We write the interval as $I = (\ell_-, \ell_+)$. The endpoints can be finite or infinite. If the potential is limit circle at the endpoint $\ell_\pm$, we assume that a self-adjoint boundary condition has been fixed at $\ell_\pm$, i.e.,
\[
W_\pm(\bar v_\pm, f) = 0 \qquad \text{for } f \in D(H)
\]
for some $v_\pm \in X_\pm \setminus X_\pm^*$ such that $W_\pm(\bar v_\pm, v_\pm) = 0$.

While the main motivation is the full-line Schrödinger operator corresponding to a potential on $\mathbb{R}$, this expansion can be applied to any one-dimensional Schrödinger operator with separated boundary conditions, and it is sometimes useful even when applied to half-line problems.

We fix an internal point $x_0 \in I$ and denote by $\phi_z$, $\theta_z$ solutions of $-f'' + (V - z)f = 0$ satisfying the initial conditions
\[
\begin{pmatrix} \phi_z'(x_0) & \theta_z'(x_0) \\ \phi_z(x_0) & \theta_z(x_0) \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.
\]
In particular, $W(\theta_z, \phi_z) = 1$. We could fix more general conditions in the style of (11.72), but here that does not correspond to a boundary condition of the Schrödinger operator and is only an internal choice; a different choice would lead to an only superficially different eigenfunction expansion and is not of interest to us.

Lemma 11.88. For $z \in \mathbb{C} \setminus \mathbb{R}$, the Weyl solutions $\psi_z^\pm$ have nonzero Wronskian with $\phi_z$. If we normalize them so that $W(\psi_z^\pm, \phi_z) = 1$ and define $m_\pm(z) = \pm W(\theta_z, \psi_z^\pm)$, then
\[
\psi_z^\pm = \theta_z \pm m_\pm(z) \phi_z \tag{11.132}
\]
and $m_\pm(z)$ are Herglotz functions with the symmetry $m_\pm(\bar z) = \overline{m_\pm(z)}$.

Proof. The claims for $\psi_z^+$ and $m_+$ are immediate from the half-line setting applied on the interval $[x_0, \ell_+)$. The claims for $\psi_z^-$ and $m_-$ follow from the same results after an affine transformation to reverse the interval. That reversal changes the sign of the derivative in $x$, which explains the $\pm$ sign in (11.132).


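In particular, $m_\pm$ correspond naturally to the half-line Schrödinger operators $H_-$ and $H_+$ on $(\ell_-, x_0)$ and $(x_0, \ell_+)$ with a Dirichlet boundary condition at $x_0$ and boundary conditions at $\ell_\pm$ (if needed) inherited from $H$. The main object corresponding to the whole-line operator on $(\ell_-, \ell_+)$ is the Weyl M-matrix, defined on $\mathbb{C} \setminus \mathbb{R}$ by
\[
M = \begin{pmatrix} \dfrac{m_- m_+}{m_- + m_+} & \dfrac{1}{2} \dfrac{m_- - m_+}{m_- + m_+} \\[2ex] \dfrac{1}{2} \dfrac{m_- - m_+}{m_- + m_+} & -\dfrac{1}{m_- + m_+} \end{pmatrix}.
\]
This can be written as
\[
M = \begin{pmatrix} \dfrac{m_- m_+}{m_- + m_+} & \dfrac{m_-}{m_- + m_+} \\[2ex] \dfrac{m_-}{m_- + m_+} & -\dfrac{1}{m_- + m_+} \end{pmatrix} - \begin{pmatrix} 0 & 1/2 \\ 1/2 & 0 \end{pmatrix},
\]
so by Lemma 7.68, it is a matrix-valued Herglotz function. It will serve as the full-line analogue of the Weyl m-function; compare the following lemma with Lemma 11.57. As in Definition 11.77, we use notation $\tilde o(\dots)$ to signify both a uniform $O(\dots)$ and a pointwise $o(\dots)$.

Lemma 11.89. The Weyl M-matrix is a matrix-valued Herglotz function and
\[
\operatorname{Im} G(x,y;\lambda + i\epsilon) = \binom{\phi_\lambda(x)}{\theta_\lambda(x)}^* \operatorname{Im} M(\lambda + i\epsilon) \binom{\phi_\lambda(y)}{\theta_\lambda(y)} + \tilde o(1), \qquad \epsilon \downarrow 0,
\]
uniformly on $(x,y,\lambda) \in [c,d]^2 \times [\lambda_1, \lambda_2]$, with compact $[c,d] \subset (\ell_-, \ell_+)$ and $[\lambda_1, \lambda_2] \subset \mathbb{R}$.

Proof. With $\psi_z^\pm$ normalized as in (11.132), denote
\[
M_1 = \frac{1}{W(\psi_z^+, \psi_z^-)} \begin{pmatrix} -m_- m_+ & -m_- \\ m_+ & 1 \end{pmatrix}
= \begin{pmatrix} \dfrac{m_- m_+}{m_- + m_+} & \dfrac{m_-}{m_- + m_+} \\[2ex] -\dfrac{m_+}{m_- + m_+} & -\dfrac{1}{m_- + m_+} \end{pmatrix}.
\]
Note that
\[
M = \frac{1}{2}\bigl(M_1 + M_1^\top\bigr), \qquad M_1^\top - M_1 = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}.
\]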
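The algebraic relations between $M$, $M_1$, and the half-line functions $m_\pm$ can be checked numerically for sample values. The sketch below uses the entries $M_1 = (m_- + m_+)^{-1}\begin{pmatrix} m_- m_+ & m_- \\ -m_+ & -1 \end{pmatrix}$, obtained here by expanding $\psi_z^-(x)\psi_z^+(y)/W(\psi_z^+,\psi_z^-)$ with (11.132) in the ordering $(\phi_z, \theta_z)$; that ordering, the sample values of $m_\pm$, and the function names are assumptions made for illustration.

```python
def weyl_matrices(m_minus, m_plus):
    """Return (M1, M) as nested lists, built from sample values of m_- and m_+."""
    s = m_minus + m_plus
    M1 = [[m_minus * m_plus / s, m_minus / s],
          [-m_plus / s, -1 / s]]
    off = 0.5 * (m_minus - m_plus) / s
    M = [[m_minus * m_plus / s, off],
         [off, -1 / s]]
    return M1, M

def imag_part_is_psd(M, tol=1e-12):
    """Check Im M = (M - M^*) / (2i) >= 0 via its diagonal entries and determinant."""
    a = M[0][0].imag
    d = M[1][1].imag
    b = (M[0][1] - M[1][0].conjugate()) / 2j
    return a >= -tol and d >= -tol and a * d - abs(b) ** 2 >= -tol

if __name__ == "__main__":
    # Any two points of the upper half-plane serve as sample values of m_-, m_+.
    M1, M = weyl_matrices(0.3 + 0.7j, -1.2 + 0.4j)
    print(M1[0][1] - M1[1][0])      # difference of off-diagonal entries of M1
    print(imag_part_is_psd(M))      # positivity of Im M
```

In this parametrization $\det \operatorname{Im} M = \operatorname{Im} m_- \operatorname{Im} m_+ / |m_- + m_+|^2$, which is one way to see that $\operatorname{Im} M \ge 0$ whenever $m_\pm$ take values in $\mathbb{C}_+$.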

Using (11.69) and (11.132), we can express Green's function in the form
\[
G(x,y;z) = \begin{cases}
\dbinom{\phi_z(x)}{\theta_z(x)}^{\!\top} M_1(z) \dbinom{\phi_z(y)}{\theta_z(y)}, & x \le y, \\[3ex]
\dbinom{\phi_z(x)}{\theta_z(x)}^{\!\top} M_1(z)^\top \dbinom{\phi_z(y)}{\theta_z(y)}, & x \ge y.
\end{cases}
\]
It remains to study the asymptotics of $\operatorname{Im} G(x,y;\lambda + i\epsilon)$. Using $AC^2([c,d])$-analyticity of the fundamental solutions, their reality at $\lambda \in \mathbb{R}$, and the


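observation that the difference $M_1 - M_1^\top$ has real entries, it follows that, with $z = \lambda + i\epsilon$,
\[
\operatorname{Im} \Bigl[ \binom{\phi_z(x)}{\theta_z(x)}^{\!\top} \bigl( M_1(z) - M_1(z)^\top \bigr) \binom{\phi_z(y)}{\theta_z(y)} \Bigr] = O(\epsilon), \qquad \epsilon \downarrow 0,
\]
so there is no difference in asymptotics in the cases $x \le y$ and $x \ge y$. In both cases, $M_1$ or $M_1^\top$ can be replaced by the average $M = \frac{1}{2}(M_1 + M_1^\top)$ to conclude
\[
\operatorname{Im} G(x,y;z) = \operatorname{Im} \Bigl[ \binom{\phi_z(x)}{\theta_z(x)}^{\!\top} M(z) \binom{\phi_z(y)}{\theta_z(y)} \Bigr] + O(\epsilon), \qquad \epsilon \downarrow 0.
\]
The proof now proceeds analogously to the half-line case (Lemma 11.57), using analyticity of the fundamental solutions and estimates for matrix-valued Herglotz functions. Namely, by (7.61),
\[
M(\lambda + i\epsilon) = O(\epsilon^{-1}), \qquad \epsilon \downarrow 0,
\]
uniformly for $\lambda \in [\lambda_1, \lambda_2]$, and $\epsilon \bigl( M(\lambda + i\epsilon) + M(\lambda + i\epsilon)^* \bigr) = o(1)$ as $\epsilon \downarrow 0$.

By the matrix-valued Stieltjes inversion (Theorem 7.67), the Weyl M-matrix corresponds to a Poisson-finite measure $\mu$ and a $2 \times 2$ matrix-valued function $W$ on $\mathbb{R}$ such that $W \ge 0$ and $\operatorname{Tr} W = 1$ $\mu$-a.e. and
\[
\lim_{\epsilon \downarrow 0} \frac{1}{\pi} \int h(\lambda) \operatorname{Im} M(\lambda + i\epsilon) \, d\lambda = \int h(\lambda) W(\lambda) \, d\mu(\lambda)
\]
for all $h \in C_c(\mathbb{R})$. Note that $M = M^\top$ implies $W = W^\top$. The full-line eigenfunction expansion will conjugate $H$ to a multiplication operator on the Hilbert space $L^2(\mathbb{R}, \mathbb{C}^2, W\,d\mu)$ (see Lemma 6.38). We can now introduce the eigenfunction expansion and its presumed inverse; their basic properties are proved analogously to the half-line case, so we omit details.

Lemma 11.90. For $f \in L^2_c(I)$, the function $\hat f : \mathbb{R} \to \mathbb{C}^2$ defined by
\[
\hat f(\lambda) = \begin{pmatrix} \int \phi_\lambda(x) f(x) \, dx \\ \int \theta_\lambda(x) f(x) \, dx \end{pmatrix}
\]
is a continuous function of $\lambda \in \mathbb{R}$.

Lemma 11.91. For $g \in L^2_c(\mathbb{R}, \mathbb{C}^2, W\,d\mu)$, the function $\check g : I \to \mathbb{C}$ defined by
\[
\check g(x) = \int \binom{\phi_\lambda(x)}{\theta_\lambda(x)}^{\!\top} W(\lambda) g(\lambda) \, d\mu(\lambda)
\]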


is in $AC^2([c,d])$ for every $[c,d] \subset (\ell_-, \ell_+)$ and
\[
\check g'(x) = \int \binom{\phi_\lambda'(x)}{\theta_\lambda'(x)}^{\!\top} W(\lambda) g(\lambda) \, d\mu(\lambda),
\]
\[
\check g''(x) = \int \binom{\phi_\lambda''(x)}{\theta_\lambda''(x)}^{\!\top} W(\lambda) g(\lambda) \, d\mu(\lambda).
\]
In particular, if $g \in L^2_c(\mathbb{R}, \mathbb{C}^2, W\,d\mu)$ and $\check g = 0$ Lebesgue-a.e., then
\[
\int W(\lambda) g(\lambda) \, d\mu(\lambda) = \binom{0}{0}.
\]
The main result about full-line eigenfunction expansion follows.

Theorem 11.92 (Full-line eigenfunction expansion). There is a unitary map $U : L^2(I) \to L^2(\mathbb{R}, \mathbb{C}^2, W\,d\mu)$ such that:
(a) $Uf = \hat f$ for compactly supported $f$;
(b) $U^{-1} g = \check g$ for compactly supported $g$;
(c) $U H U^{-1}$ is the operator of multiplication by $\lambda$ on $L^2(\mathbb{R}, \mathbb{C}^2, W(\lambda)\,d\mu(\lambda))$.

We omit the details of the proof, since the remaining steps are analogous to the proof of Theorem 11.56. As an immediate consequence of Proposition 9.41 we obtain the following.

Corollary 11.93. The measure $\mu$ is a maximal spectral measure for $H$. The multiplicity $n$ measures for $H$ (see Theorem 9.31) are given by $d\mu_n = \chi_{S_n} d\mu$, where $S_n = \{\lambda \mid \operatorname{rank} W(\lambda) = n\}$.

The spectral representation provided by Theorem 11.92 can also be used to express spectral information about $H$ in terms of half-line Weyl functions $m_\pm$. This allows us to relate spectral properties of $H$ to the spectral properties of half-line operators $H_\pm$. The following results are immediate consequences of results proved in an abstract setting in Section 7.12:

Corollary 11.94. $\sigma_{\mathrm{ess}}(H) = \sigma_{\mathrm{ess}}(H_+) \cup \sigma_{\mathrm{ess}}(H_-)$.

Corollary 11.95. The absolutely continuous spectrum of $H$ is precisely the sum of absolutely continuous spectra of $H_\pm$, with multiplicities added, i.e.,
\[
H|_{\mathcal{H}_{\mathrm{ac}}(H)} \cong H_+|_{\mathcal{H}_{\mathrm{ac}}(H_+)} \oplus H_-|_{\mathcal{H}_{\mathrm{ac}}(H_-)}.
\]

Corollary 11.96. For $\mu_s$-a.e. $x \in \mathbb{R}$, the following hold.
(a) $\operatorname{rank} W(x) = 1$.
(b) $m_\pm$ have normal limits which are values in $\mathbb{R} \cup \{\infty\}$.


(c) There exists $\alpha = \alpha(x) \in [0,\pi)$ such that $m_+(x + i0) = -m_-(x + i0) = -\cot\alpha(x)$ and
\[
W(x) = \begin{pmatrix} \cos^2\alpha(x) & -\cos\alpha(x) \sin\alpha(x) \\ -\cos\alpha(x) \sin\alpha(x) & \sin^2\alpha(x) \end{pmatrix}.
\]
In other words, the singular part of $\mu$ is supported on the set
\[
S = \bigcup_{\alpha \in [0,\pi)} \{ x \mid m_+(x + i0) = -m_-(x + i0) = -\cot\alpha \}.
\]
Moreover, $S$ has Lebesgue measure zero.

11.13. Subordinacy theory

Spectral properties of Schrödinger operators can be studied through the behavior of formal eigensolutions with real spectral parameters. Obviously, a pure point spectrum of a Schrödinger operator corresponds to formal eigensolutions which are square-integrable. Carmona's formula is useful, but it is not a pointwise criterion. Subordinacy theory developed by Gilbert–Pearson [40, 41] and Jitomirskaya–Last [47, 48] describes the decomposition into absolutely continuous/singular spectra and, more generally, decomposition into α-continuous/α-singular spectra in terms of the behavior of eigensolutions. We begin with the half-line setting with a potential $V \in L^1_{\mathrm{loc}}([0,\infty))$.

Definition 11.97. Fix $\lambda \in \mathbb{R}$. A nontrivial solution $f$ of $-f'' + Vf = \lambda f$ is called subordinate (at $+\infty$) if
\[
\lim_{x\to\infty} \frac{\int_0^x |f(t)|^2 \, dt}{\int_0^x |g(t)|^2 \, dt} = 0 \tag{11.133}
\]
for some solution $g$ of $-g'' + Vg = \lambda g$.

Lemma 11.98. (a) If (11.133) holds for some eigensolution $g$, then it holds for every eigensolution $g$ linearly independent with $f$.
(b) If $f$ is subordinate, it is linearly dependent with $\bar f$; it is a constant multiple of $\phi_\alpha$ for some $\alpha \in \mathbb{R}$.

Proof. (a) If $g = Cf$, the limit (11.133) is $1/|C|^2$. Thus, (11.133) implies that $g$ is linearly independent with $f$. Now any eigensolution $h$ can be written as $h = C_1 f + C_2 g$ and, if $h$ is linearly independent with $f$, then $C_2 \neq 0$. By elementary estimates,
\[
|h|^2 \ge \frac{1}{2} |C_2|^2 |g|^2 - |C_1|^2 |f|^2.
\]

430

11. One-dimensional Schr¨odinger operators

This implies

x x 2 |h(t)|2 dt 1 2 0 0 |g(t)| dt ≥ − |C1 |2 = ∞, |C | lim inf lim inf x 2 x 2 dt 2 dt x→∞ x→∞ 2 |f (t)| |f (t)| 0 0

and inverting completes the proof.
(b) If $g = \bar f$, then the limit in (11.133) is equal to 1. Thus, if $f$ is subordinate, $f$ must be linearly dependent with $\bar f$. Thus, vectors $(f'(0), f(0))$ and $(\bar f'(0), \bar f(0))$ are linearly dependent. This implies that $f'(0)/f(0) \in \mathbb{R} \cup \{\infty\}$, i.e., $f$ is a scalar multiple of the solution $\phi_\alpha$ for some $\alpha \in \mathbb{R}$.

Thus, when looking for subordinate solutions, we are not only focused on real spectral parameters $\lambda$ but also on real solutions: to check whether $\phi_\alpha$ is subordinate, it suffices to compare it to $\theta_\alpha$. This comparison can be related to values of $m_\alpha(z)$:

Lemma 11.99 (Jitomirskaya–Last inequality). For any $x > 0$, define $\epsilon(x) > 0$ by
\[
\epsilon(x) = \left( 4 \int_0^x \phi_\alpha(t, \lambda)^2\, dt \int_0^x \theta_\alpha(t, \lambda)^2\, dt \right)^{-1/2}. \tag{11.134}
\]
For all $x > 0$,
\[
\frac{5 - \sqrt{24}}{|m_\alpha(\lambda + i\epsilon(x))|} \le \left( \frac{\int_0^x \phi_\alpha(t,\lambda)^2\, dt}{\int_0^x \theta_\alpha(t,\lambda)^2\, dt} \right)^{1/2} \le \frac{5 + \sqrt{24}}{|m_\alpha(\lambda + i\epsilon(x))|}. \tag{11.135}
\]

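Proof. Lemma 11.16 generalizes with the same proof to use $\phi = \phi_\alpha(\cdot, \lambda)$, $\theta = \theta_\alpha(\cdot, \lambda)$ instead of fundamental solutions $u$, $v$; in particular, the Weyl solution $\psi$ at $z = \lambda + i\epsilon$ can be viewed as a solution of
\[
-\psi'' + (V - \lambda)\psi = i\epsilon\psi, \qquad W(\psi, \phi) = 1, \qquad W(\psi, \theta) = -m_\alpha(z)
\]
(note the Weyl solution is taken at the complex spectral parameter $z$ but compared to eigensolutions at $\lambda$) and expressed by variation of parameters as
\[
\psi(x) = \theta(x) + m_\alpha(z)\phi(x) + i\epsilon \int_0^x (\theta(x)\phi(t) - \theta(t)\phi(x))\psi(t)\, dt.
\]
With the notation $\|f\|_x = \left( \int_0^x |f(t)|^2\, dt \right)^{1/2}$, using the Cauchy–Schwarz inequality twice on the right-hand side implies
\[
|\psi(x)| \ge |\theta(x) + m_\alpha(z)\phi(x)| - \epsilon|\theta(x)| \|\phi\|_x \|\psi\|_x - \epsilon|\phi(x)| \|\theta\|_x \|\psi\|_x.
\]
Rearranging and using the triangle inequality in $L^2([0, x])$ gives
\[
\|\theta + m_\alpha(z)\phi\|_x \le \|\psi\|_x + 2\epsilon \|\theta\|_x \|\phi\|_x \|\psi\|_x.
\]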

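Squaring this, combining with $\|\psi\|_x^2 \le \|\psi\|^2 = \operatorname{Im} m_\alpha(z)/\epsilon$, and using the choice of $\epsilon = \epsilon(x)$ such that $2\epsilon \|\theta\|_x \|\phi\|_x = 1$, we obtain
\[
\|\theta + m_\alpha(z)\phi\|_x^2 \le 4\, \frac{\operatorname{Im} m_\alpha(z)}{\epsilon} \le 8 \|\theta\|_x \|\phi\|_x |m_\alpha(z)|.
\]
Using the triangle inequality on the left-hand side, this implies
\[
(\|\theta\|_x - |m_\alpha(z)| \|\phi\|_x)^2 \le 8 \|\theta\|_x \|\phi\|_x |m_\alpha(z)|.
\]
Dividing this by $\|\theta\|_x^2$ and expanding gives a quadratic inequality for $\kappa = |m_\alpha(z)| \|\phi\|_x / \|\theta\|_x$,
\[
\kappa^2 - 10\kappa + 1 \le 0,
\]
which implies $5 - \sqrt{24} \le \kappa \le 5 + \sqrt{24}$, completing the proof.

The main result of subordinacy theory is that subordinacy of $\phi_\alpha$ corresponds to infinite normal boundary values of $m_\alpha$:

Theorem 11.100. Let $H$ be regular at 0 and limit point at $\infty$. For any $\lambda \in \mathbb{R}$, $\phi_\alpha(\cdot, \lambda)$ is subordinate if and only if
\[
\lim_{\epsilon \downarrow 0} m_\alpha(\lambda + i\epsilon) = \infty. \tag{11.136}
\]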

Proof. Obviously, the function $\epsilon(x)$ defined above is a strictly decreasing, continuous function of $x$; moreover, $\phi_\alpha$ and $\theta_\alpha$ are not both square-integrable in the limit point case, so
\[
\lim_{x\to\infty} \epsilon(x) = 0.
\]
By taking $x \to \infty$ in the Jitomirskaya–Last inequality, we conclude that $\phi_\alpha$ is subordinate if and only if
\[
\lim_{x\to\infty} |m_\alpha(\lambda + i\epsilon(x))| = \infty.
\]
By observed properties of $\epsilon(x)$, this is equivalent to (11.136).
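The Jitomirskaya–Last inequality (11.135) can be sanity-checked numerically in the simplest case. The sketch below assumes the free half-line problem $V = 0$ with the normalization $\phi(0) = 0$, $\phi'(0) = 1$, $\theta(0) = 1$, $\theta'(0) = 0$, for which the Dirichlet $m$-function is $i\sqrt{z}$; identifying these with $\phi_\alpha$, $\theta_\alpha$, $m_\alpha$ at $\alpha = 0$ is an assumption about conventions, and the function names are hypothetical.

```python
import cmath
import math

LOW = 5 - math.sqrt(24)
HIGH = 5 + math.sqrt(24)


def norms_squared(lam, x):
    """Return (int_0^x phi^2 dt, int_0^x theta^2 dt) for V = 0 (assumed free case).

    phi and theta solve -f'' = lam * f with phi(0)=0, phi'(0)=1 and
    theta(0)=1, theta'(0)=0; the integrals are evaluated in closed form.
    """
    if lam > 0:
        k = math.sqrt(lam)
        osc = math.sin(2 * k * x) / (4 * k)
        return (x / 2 - osc) / lam, x / 2 + osc
    if lam < 0:
        k = math.sqrt(-lam)
        hyp = math.sinh(2 * k * x) / (4 * k)
        return (hyp - x / 2) / (-lam), hyp + x / 2
    return x ** 3 / 3, x  # lam == 0: phi(t) = t, theta(t) = 1


def m_dirichlet(z):
    """Free Dirichlet m-function i*sqrt(z); cmath.sqrt gives Im sqrt(z) > 0 on C_+."""
    return 1j * cmath.sqrt(z)


def jl_kappa(lam, x):
    """Return (eps(x), kappa) with eps as in (11.134) and
    kappa = |m(lam + i eps)| * ||phi||_x / ||theta||_x."""
    p2, t2 = norms_squared(lam, x)
    eps = (4 * p2 * t2) ** -0.5
    kappa = abs(m_dirichlet(complex(lam, eps))) * math.sqrt(p2 / t2)
    return eps, kappa


if __name__ == "__main__":
    for lam in (-4.0, -1.0, 0.0, 1.0, 100.0):
        for x in (0.1, 1.0, 3.0, 10.0, 50.0):
            eps, kappa = jl_kappa(lam, x)
            print(f"lam={lam:7.1f} x={x:5.1f} eps={eps:.3e} kappa={kappa:.4f}")
```

In each printed row, the lemma predicts that `kappa` lies between `LOW` and `HIGH`.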

Theorem 11.101. Let H be regular at 0 and a limit point at ∞. The singular part of its spectral measure μα is supported on the set Sα = {λ ∈ R | φα is subordinate}, and the absolutely continuous part of the spectral measure μα is mutually absolutely continuous with χN (λ) dλ, where N = {λ ∈ R | there is no subordinate solution at λ}. Proof. The set Sα is precisely the set on which mα (λ + i0) = ∞. Moreover, λ ∈ N if and only if mα (λ + i0) ∈ C+ or mα (λ + i0) does not exist; however, the second case happens on a set of Lebesgue measure zero. Thus, the theorem follows from Corollary 7.49.
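To illustrate the dichotomy in Theorem 11.101, the following sketch evaluates the ratio in (11.133) for the free potential $V = 0$ (an assumed toy case, with hypothetical function names). At a negative energy $\lambda = -\kappa^2$ the decaying solution $e^{-\kappa t}$ is subordinate, while at a positive energy $\lambda = k^2$ the ratio for $\sin(kt)$ against $\cos(kt)$ tends to 1, consistent with there being no subordinate solution.

```python
import math


def ratio_negative_energy(kappa, x):
    """int_0^x |f|^2 / int_0^x |g|^2 for f = exp(-kappa t), g = exp(kappa t),
    two solutions of -u'' = -kappa^2 u (free case, assumed)."""
    num = -math.expm1(-2 * kappa * x) / (2 * kappa)
    den = math.expm1(2 * kappa * x) / (2 * kappa)
    return num / den


def ratio_positive_energy(k, x):
    """Same ratio for f = sin(k t), g = cos(k t), solutions of -u'' = k^2 u."""
    osc = math.sin(2 * k * x) / (4 * k)
    return (x / 2 - osc) / (x / 2 + osc)


if __name__ == "__main__":
    for x in (1.0, 5.0, 20.0):
        print(x, ratio_negative_energy(1.0, x), ratio_positive_energy(1.0, x))
```

The first ratio decays exponentially in $x$, whereas the second stays bounded away from 0.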


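Strengthening the subordinacy assumption, we will be able to characterize spectral decompositions with respect to Hausdorff measures. Note the lim inf in the following definition.

Definition 11.102. Fix $\beta \in (0, 1]$ and $\lambda \in \mathbb{R}$. A nontrivial solution $f$ of $-f'' + Vf = \lambda f$ is called $\beta$-subordinate (at $+\infty$) if
\[
\liminf_{x\to\infty} \frac{\left( \int_0^x |f(t)|^2\, dt \right)^{2-\beta}}{\left( \int_0^x |g(t)|^2\, dt \right)^{\beta}} = 0 \tag{11.137}
\]
for some solution $g$ of $-g'' + Vg = \lambda g$.

Theorem 11.103. Let $H$ be regular at 0 and limit point at $\infty$. Fix $\beta \in (0, 1)$. The $\beta$-singular part of its spectral measure $\mu_\alpha$ is supported on the set
\[
S_{\alpha,\beta} = \{\lambda \in \mathbb{R} \mid \phi_\alpha \text{ is } \beta\text{-subordinate}\},
\]
and the $\beta$-continuous part of $\mu_\alpha$ is supported on $S_{\alpha,\beta}^c$.

Proof. Raising (11.134) to power $1 - \beta$ and using that to divide (11.135) gives
\[
\frac{5 - \sqrt{24}}{\epsilon(x)^{1-\beta} |m_\alpha(\lambda + i\epsilon(x))|} \le 2^{1-\beta} \frac{\|\phi\|_x^{2-\beta}}{\|\theta\|_x^{\beta}} \le \frac{5 + \sqrt{24}}{\epsilon(x)^{1-\beta} |m_\alpha(\lambda + i\epsilon(x))|}.
\]
Taking $x \to \infty$ proves that $\phi = \phi_\alpha(\cdot, \lambda)$ is $\beta$-subordinate if and only if
\[
\limsup_{\epsilon \downarrow 0} \epsilon^{1-\beta} |m_\alpha(\lambda + i\epsilon)| = \infty.
\]
Now the claim follows from Theorem 6.29 and Theorem 7.51.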

Subordinacy can also be used to study spectra of full-line Schrödinger operators [40]: with obvious modifications, we say a nontrivial eigensolution $f$ at $\lambda$ is subordinate at $-\infty$ if for some eigensolution $g$ at $\lambda$,
\[
\lim_{x\to-\infty} \frac{\int_x^0 |f(t)|^2\, dt}{\int_x^0 |g(t)|^2\, dt} = 0.
\]
Denote
\[
S_\alpha^\pm = \{\lambda \in \mathbb{R} \mid \phi_\alpha(\cdot, \lambda) \text{ is subordinate at } \pm\infty\}, \qquad
N^\pm = \{\lambda \in \mathbb{R} \mid \text{there is no subordinate eigensolution at } \pm\infty\}.
\]
Note that the set
\[
S = \bigcup_{\alpha \in [0,\pi)} (S_\alpha^- \cap S_\alpha^+)
\]
is precisely the set of $\lambda \in \mathbb{R}$ for which there exists an eigensolution which is subordinate at both endpoints $\pm\infty$.

Theorem 11.104. Let $H$ be a Schrödinger operator on $\mathbb{R}$ which is limit point at $\pm\infty$. Let $\mu$ be its canonical spectral measure.
(a) The singular part of $\mu$ is supported on $S$ and has multiplicity 1.
(b) $N^- \cup N^+$ is an essential support for $\mu_{\mathrm{ac}}$, i.e., $\mu_{\mathrm{ac}}$ is mutually absolutely continuous with $\chi_{N^- \cup N^+}(\lambda)\, d\lambda$.
(c) $N^- \cap N^+$ is an essential support for the multiplicity 2 part of $\mu_{\mathrm{ac}}$.

Proof. This follows from Corollary 11.95, Corollary 11.96, and Theorem 11.100.

11.14. Potentials bounded below in an $L^1_{\mathrm{loc}}$ sense

In this section, we begin to specialize to potentials for which each finite endpoint is regular, and that at each infinite endpoint,
\[
\limsup_{x\to\pm\infty} \int_x^{x+1} V_-(t)\, dt < \infty.
\]
(In Proposition 11.32 we showed that this condition implies a limit point endpoint.) Under this additional assumption, we will study semiboundedness, elements of the operator domain, and properties of eigenfunctions. Namely, we denote
\[
\|V_-\|_{L^1_{\mathrm{loc,unif}}} = \sup_{x : (x, x+1) \subset I} \int_x^{x+1} V_-(t)\, dt,
\]
assuming from now on that $I$ has length at least 1. Some properties of the operator domain follow immediately from properties of regular endpoints and Proposition 11.32:

Corollary 11.105. If $V \in L^1_{\mathrm{loc}}(I)$ and $\|V_-\|_{L^1_{\mathrm{loc,unif}}} < \infty$, then any $f \in D(H)$ has the properties $f \in L^\infty(I)$ and $f' \in L^2(I)$.

Next, we will prove that the corresponding Schrödinger operators are semibounded, with a controllable lower bound on the spectrum. We begin with the half-line setting.

Proposition 11.106. Let $V \in L^1_{\mathrm{loc}}([0, \infty))$ and $\|V_-\|_{L^1_{\mathrm{loc,unif}}} < \infty$. There is a constant $C$, depending only on the value of $\|V_-\|_{L^1_{\mathrm{loc,unif}}}$, such that the following hold.
(a) Consider the Schrödinger operators $H_0$, $H_{\pi/2}$ corresponding to this potential with a Dirichlet, respectively Neumann, boundary condition at 0. Then $\sigma(H_0), \sigma(H_{\pi/2}) \subset [-C, \infty)$.
(b) The Dirichlet $m$-function $m_0$ obeys the asymptotic behavior, for any $x > 0$ and $\delta > 0$,
\[
m_0(z) = -k - \int_0^x e^{-2kt} V(t)\, dt + o(|k|^{-1}), \qquad z \to \infty, \ \arg z \in [\delta, 2\pi - \delta].
\]
In particular, $m_0(z) = -k + o(1)$ as $z \to \infty$ on the same sectors.
(c) $m_0$ is strictly increasing on $(-\infty, -C)$ and $m_0(z) < 0$ for $z \in (-\infty, -C)$.

Proof. (a) Let $\alpha \in \{0, \pi/2\}$. By Proposition 11.32(d), there is a constant $C$ such that for all $\psi \in D(H_\alpha)$,
\[
C\langle \psi, \psi \rangle + \langle \psi, H_\alpha \psi \rangle \ge \frac{1}{2} \langle \psi', \psi' \rangle \ge 0
\]
($\langle \psi, H_\alpha\psi \rangle \in \mathbb{R}$ because $H_\alpha$ is self-adjoint). This implies that
\[
\langle \psi, H_\alpha \psi \rangle \ge -C \langle \psi, \psi \rangle \qquad \forall \psi \in D(H_\alpha),
\]

so the criterion for semiboundedness (Corollary 8.42) completes the proof.
(b) This follows from Theorem 11.78.
(c) It follows from (a) that $m_0$ and $m_{\pi/2}$ have analytic extensions through the interval $(-\infty, -C)$. Since the Neumann $m$-function is $m_{\pi/2} = -1/m_0$, we conclude that $m_0$ does not have poles or zeros on $(-\infty, -C)$. As a Herglotz function, $m_0$ is strictly increasing on this interval, and by (b), $\lim_{\lambda\to-\infty} m_0(\lambda) = -\infty$. Thus, by continuity, $m_0$ is strictly negative throughout that interval.

Now let us examine the full-line setting.

Proposition 11.107. Let $V$ be a potential on $\mathbb{R}$ such that $V \in L^1_{\mathrm{loc}}(\mathbb{R})$ and $\|V_-\|_{L^1_{\mathrm{loc,unif}}} < \infty$. There is a constant $C$, depending only on the value of $\|V_-\|_{L^1_{\mathrm{loc,unif}}}$, such that the following hold.
(a) Consider the Schrödinger operator $H$ corresponding to this potential. Then $\sigma(H) \subset [-C, \infty)$.
(b) The diagonal Green's function $G(x, x; z)$ obeys the asymptotic behavior, for any $x > 0$ and $\delta > 0$,
\[
G(x, x; z) = \frac{1}{2\sqrt{-z}} + o(z^{-1}), \qquad z \to \infty, \ \arg z \in [\delta, 2\pi - \delta].
\]
(c) $G(x, x; z)$ is strictly increasing on $(-\infty, \min \sigma(H))$ and is strictly positive on $(-\infty, \min\sigma(H))$.

Proof. (a) We will use the properties of half-line $m$-functions $m_\pm(x; z)$ which follow from Proposition 11.106. These functions have analytic extensions through $(-\infty, -C)$ which are strictly negative on that interval, so the functions
\[
-\frac{1}{m_+ + m_-}, \qquad \frac{m_+ m_-}{m_+ + m_-}
\]
also have analytic extensions through the same interval. Since their sum is the Herglotz function $\operatorname{Tr} M$ which corresponds to the maximal spectral measure for $H$, it follows that $\sigma(H) \subset [-C, \infty)$.
(b) This now follows from Lemma 11.85 and the Phragmén–Lindelöf method.
(c) $G(x, x; \lambda)$ is strictly increasing on the interval $(-\infty, \min\sigma(H))$ and, by (b), $\lim_{\lambda\to-\infty} G(x, x; \lambda) = 0$. Thus, $G(x, x; \lambda) > 0$ on this interval.

Part (a) can also be proved more elegantly by using Exercise 11.5.

The asymptotic behavior of eigensolutions at an infinite endpoint has important spectral consequences; we have already seen that through the context of Weyl solutions and through Carmona's formula. In such arguments, we often need input not only on the behavior of the eigensolution $f$, but also of its derivative $f'$. Our goal in this section is to show that various asymptotic properties of $f$ extend to $f'$ under the assumption that the negative part of $V$ is uniformly locally $L^1$. The main technical estimate is adapted from work of Stolz [103]. It should be noted that all constants in it depend only on $V$ and $|z|$.

Lemma 11.108. Let $f$ be a solution of $-f'' + Vf = zf$ on the interval $I$, and let $C = |z| + \|V_-\|_{L^1_{\mathrm{loc,unif}}}$.

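(a) Let $[x, y] \subset I$ and assume $\omega \in \mathbb{C}$, $f(x) \neq 0$, and $\operatorname{Re}[\bar\omega f(t)] \ge 0$ for $t \in [x, y]$. Then
\[
\operatorname{Re}[\bar\omega f(y)] \ge \operatorname{Re}[\bar\omega f(x)] + (y - x) \operatorname{Re}[\bar\omega f'(x)] - C(y - x)(y - x + 1)|\omega| \max_{x \le t \le y} |f(t)|. \tag{11.138}
\]
(b) Denote $K = 1/\sqrt{C}$. For any $x$ such that $[x - K, x + K] \subset I$,
\[
|f'(x)| \le C(1 + 2K) \max_{y \in [x-K, x+K]} |f(y)|. \tag{11.139}
\]
(c) Denote $\delta = -\frac{1}{2} + \sqrt{\frac{1}{4} + \frac{1}{2C}}$. If $x \in I$ obeys $f(x) \neq 0$, $\operatorname{Re}[\overline{f(x)} f'(x)] \ge 0$, and $x + \delta \in I$, then for all $y \in [x, x + \delta)$,
\[
|f(y)| > \frac{|f(x)|}{2}. \tag{11.140}
\]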

Proof. (a) Since $f \in AC^2_{\mathrm{loc}}(I)$,
\[
f(y) = f(x) + \int_x^y \left[ f'(x) + \int_x^t f''(s)\, ds \right] dt = f(x) + (y - x) f'(x) + \int_x^y (y - s) f''(s)\, ds. \tag{11.141}
\]
Denoting $M = \max_{x \le t \le y} |f(t)|$, we have $0 \le \operatorname{Re}[\bar\omega f(s)] \le |\bar\omega f(s)| \le |\omega| M$ for $s \in [x, y]$. Since $f$ is an eigensolution, this implies
\[
\begin{aligned}
\operatorname{Re}\left[ \bar\omega \int_x^y (y - s) f''(s)\, ds \right]
&= \int_x^y (y - s) V(s) \operatorname{Re}[\bar\omega f(s)]\, ds - \int_x^y (y - s) \operatorname{Re}[\bar\omega z f(s)]\, ds \\
&\ge -|\omega| M (y - x) \int_x^y V_-(s)\, ds - |\omega z| M (y - x)^2 \\
&\ge -|\omega| M (y - x)(y - x + 1) C,
\end{aligned}
\]
which together with (11.141) proves (11.138).

(b) Without loss of generality, assume $\operatorname{Re}[\overline{f'(x)} f(x)] \ge 0$ (the other case follows by considering $f(-x)$). Let $M = \max_{x - K \le y \le x + K} |f(y)|$. Assume that, contrary to (11.139), we have
\[
|f'(x)| > C(1 + 2K) M. \tag{11.142}
\]
Denote $g(y) = \operatorname{Re}[\overline{f'(x)} f(y)]$. The function $g$ is continuous, $g(x) \ge 0$, and $g'(x) = \operatorname{Re}[\overline{f'(x)} f'(x)] > 0$, so $g > 0$ in some interval $(x, x + \epsilon)$. We claim that $g > 0$ in $(x, x + K]$. Assume to the contrary, that there exists $y \in (x, x + K]$ such that $g(y) = 0$, and pick the smallest such $y$. Then $g \ge 0$ on $[x, y]$, so applying (a) with $\omega = f'(x)$, we have
\[
g(y) \ge g(x) + (y - x)|f'(x)|^2 - C(y - x)(y - x + 1)|f'(x)| M \ge (y - x)|f'(x)| \bigl( |f'(x)| - CM(y - x + 1) \bigr). \tag{11.143}
\]
Thus, by (11.142),
\[
g(y) > (y - x)|f'(x)| CM (2K - (y - x)) > 0, \tag{11.144}
\]
contradicting our assumption and proving $g > 0$ on $(x, x + K]$. Taking $y = x + K$ in (11.144), we have
\[
\operatorname{Re}[\overline{f'(x)} f(x + K)] > CMK^2 |f'(x)| = M|f'(x)| \ge |f'(x) f(x + K)|,
\]
which is a contradiction. Thus, the initial assumption (11.142) is wrong.

(c) Assume the contrary. Then there exists $y \in (x, x + \delta)$ such that $|f(y)| = \frac{|f(x)|}{2}$. Let $s \in [x, y)$ be such that
\[
|f(s)| = \max_{t \in [x, y]} |f(t)|.
\]
Since $\operatorname{Re}[\overline{f(t)} f'(t)] = \frac{1}{2} \frac{d}{dt} |f(t)|^2$, we have $\operatorname{Re}[\overline{f(s)} f'(s)] = 0$ (this is true even if $s = x$, since we know a priori that $\operatorname{Re}[\overline{f(x)} f'(x)] \ge 0$). Note also
\[
\operatorname{Re}[\overline{f(s)} f(y)] \le |f(s) f(y)| \le \frac{|f(s)|^2}{2},
\]
so we may pick $t \in (s, y]$ as the smallest $t > s$ with
\[
\operatorname{Re}[\overline{f(s)} f(t)] = \frac{|f(s)|^2}{2}. \tag{11.145}
\]
Using (a) with $x$ replaced by $s$ and $y$ replaced by $t$, and with $\omega = f(s)$ gives
\[
\operatorname{Re}[\overline{f(s)} f(t)] \ge |f(s)|^2 [1 - C(t - s)(t - s + 1)] > |f(s)|^2 [1 - C\delta(\delta + 1)] = \frac{|f(s)|^2}{2},
\]
where we used $t - s \le y - x < \delta$. This is a contradiction with (11.145), which completes the proof.
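The constants $K$ and $\delta$ in Lemma 11.108 are chosen so that $CK^2 = 1$ (used at the end of part (b)) and $C\delta(\delta + 1) = \frac{1}{2}$ (used at the end of part (c)). A minimal numerical check of these two identities, with a hypothetical helper name:

```python
import math


def stolz_constants(C):
    """Return (K, delta) for a given C > 0, following the formulas
    K = 1/sqrt(C) and delta = -1/2 + sqrt(1/4 + 1/(2C))."""
    K = 1 / math.sqrt(C)
    delta = -0.5 + math.sqrt(0.25 + 1 / (2 * C))
    return K, delta


if __name__ == "__main__":
    for C in (0.1, 1.0, 7.5, 1e3):
        K, delta = stolz_constants(C)
        # Both printed products should match the identities C*K^2 = 1
        # and C*delta*(delta+1) = 1/2 up to rounding.
        print(C, C * K * K, C * delta * (delta + 1))
```

The value of $\delta$ is the positive root of $\delta^2 + \delta - \frac{1}{2C} = 0$, which is why the strict inequality $t - s < \delta$ yields the strict lower bound in the proof of (c).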

This allows us to extend $L^2$-type estimates on $f$ with polynomial or exponential weights to $L^2$-type and pointwise estimates on $f$ and $f'$. We allow weighted estimates and consider weights $w : I \to (0, \infty)$ which obey
\[
\limsup_{x\to+\infty} \sup_{y \in [x-1, x+1]} \frac{w(y)}{w(x)} < \infty. \tag{11.146}
\]
Besides the trivial weight $w = 1$, the main examples to keep in mind are the polynomial weight $w(x) = x^\kappa$ and the exponential weight $w(x) = e^{\kappa x}$, for $\kappa \in \mathbb{R}$.

Theorem 11.109. Let $f$ be a solution of $-f'' + Vf = zf$ on $I = (c, \infty)$ with a potential $V$ which obeys $\|V_-\|_{L^1_{\mathrm{loc,unif}}} < \infty$. If $w : I \to (0, \infty)$ obeys (11.146) and
\[
\int_c^\infty w(x) |f(x)|^2\, dx < \infty,
\]
then
\[
\int_c^\infty w(x) |f'(x)|^2\, dx < \infty
\]
and
\[
\lim_{x\to\infty} \sqrt{w(x)}\, |f(x)| = \lim_{x\to\infty} \sqrt{w(x)}\, |f'(x)| = 0.
\]

Proof. We use the constant $\delta > 0$ from Lemma 11.108(c). By (11.146), there exists $C_1$ such that for all $x$ large enough and all $y \in [x - \delta, x + \delta]$,
\[
C_1^{-1} \le \frac{w(y)}{w(x)} \le C_1.
\]
We claim that
\[
w(x) |f(x)|^2 \le \frac{4C_1}{\delta} \int_{x-\delta}^{x+\delta} w(y) |f(y)|^2\, dy.
\]
For $\operatorname{Re}[\overline{f(x)} f'(x)] \ge 0$, the claim follows from (11.140) by squaring it, multiplying by $C_1 w(y) \ge w(x)$, and integrating from $x$ to $x + \delta$. The case $\operatorname{Re}[\overline{f(x)} f'(x)] < 0$ follows analogously, by considering $f(-x)$. This implies that $\lim_{x\to\infty} w(x)|f(x)|^2 = 0$. Similarly, using (11.139), we conclude
\[
w(x) |f'(x)|^2 \le \frac{4 C_1 C^2 (1 + 2K)^2}{\delta} \int_{x-K-\delta}^{x+K+\delta} w(y) |f(y)|^2\, dy.
\]
This implies $\lim_{x\to\infty} w(x)|f'(x)|^2 = 0$ by square-integrability of $\sqrt{w} f$, and it implies
\[
\int_c^\infty w(x) |f'(x)|^2\, dx \le \frac{8 C_1 C^2 (1 + 2K)^2 (K + \delta)}{\delta} \int_{c-K-\delta}^\infty w(y) |f(y)|^2\, dy < \infty
\]
by Tonelli's theorem.

In particular, for Weyl solutions, with the trivial weight $w = 1$, we conclude that the derivative of a Weyl solution is square-integrable and obtain pointwise decay:

Corollary 11.110. Let $\psi_z^+$ be a Weyl solution for the endpoint $\ell_+ = +\infty$ with a potential $V$ which obeys $\|V_-\|_{L^1_{\mathrm{loc,unif}}} < \infty$. Then for $c \in I$, $(\psi_z^+)' \in L^2((c, +\infty))$ and
\[
\lim_{x\to+\infty} \psi_z^+(x) = \lim_{x\to+\infty} (\psi_z^+)'(x) = 0.
\]
Another application of Lemma 11.108 is to relate boundedness of eigensolutions to lack of subordinate solutions and therefore to absolutely continuous spectrum:

Lemma 11.111. Let $V \in L^1_{\mathrm{loc}}([0, \infty))$ and $\|V_-\|_{L^1_{\mathrm{loc,unif}}} < \infty$. Let $\lambda \in \mathbb{R}$. If all eigensolutions at $\lambda$ are bounded, then there is no subordinate solution at $\lambda$.

Proof. Let $f$, $g$ be linearly independent eigensolutions at $\lambda$. Since $f$, $g$ are bounded, by Lemma 11.108, $g'$ is also bounded and there exist $c$, $d$ such that
\[
|f(x)| \le c \left( \int_{x-d}^{x+d} |f(t)|^2\, dt \right)^{1/2}, \qquad |f'(x)| \le c \left( \int_{x-d}^{x+d} |f(t)|^2\, dt \right)^{1/2}.
\]
Since $f$, $g$ are linearly independent, their Wronskian $W$ is nonzero. By the Cauchy–Schwarz inequality,
\[
|W| \le |f||g'| + |f'||g| \le c \left( \int_{x-d}^{x+d} |f(t)|^2\, dt \right)^{1/2} \bigl( \|g\|_\infty + \|g'\|_\infty \bigr).
\]
Squaring, integrating in $x$, dividing by $x$, and letting $x \to \infty$, we conclude
\[
\liminf_{x\to\infty} \frac{1}{x} \int_0^x |f(t)|^2\, dt \ge \frac{|W|^2}{2dc^2 (\|g\|_\infty + \|g'\|_\infty)^2} > 0.
\]
Since $\limsup_{x\to\infty} \frac{1}{x} \int_0^x |g(t)|^2\, dt \le \|g\|_\infty^2 < \infty$, dividing gives
\[
\liminf_{x\to\infty} \frac{\int_0^x |f(t)|^2\, dt}{\int_0^x |g(t)|^2\, dt} > 0,
\]
which proves that $f$ is not subordinate.

Combining this with subordinacy theory (Theorem 11.101) gives the following criterion:

Corollary 11.112. Let $V \in L^1_{\mathrm{loc}}([0, \infty))$ and $\|V_-\|_{L^1_{\mathrm{loc,unif}}} < \infty$. Fix a boundary condition at 0, and let $\mu$ denote the canonical spectral measure of the corresponding Schrödinger operator. If $S$ denotes the set of $\lambda \in \mathbb{R}$ for which all eigensolutions are bounded, then $\chi_S\, d\mu$ is mutually absolutely continuous with Lebesgue measure on $S$.

This is a very effective and commonly used criterion for establishing absolutely continuous spectrum; other sophisticated criteria for absolutely continuous spectrum have been proved by Last–Simon [61]. A first application of Corollary 11.112 can be to prove that potentials $V \in L^1([0, \infty))$ give rise to absolutely continuous spectrum on $(0, \infty)$ (Exercise 11.23). Note that integrability of $V$ on $(0, \infty)$ should be viewed as a kind of decay condition at $+\infty$; Schrödinger operators with decaying potentials at $+\infty$ are the subject of much study; see reviews [27, 56].

11.15. A Combes–Thomas estimate and Schnol's theorem

Weyl solutions are by definition $L^2$-integrable near the corresponding endpoint. We will now see that this can be significantly improved away from the essential spectrum for potentials whose negative part is uniformly locally integrable. Estimates of this form are known as Combes–Thomas estimates.

Theorem 11.113. Let $H$ be a Schrödinger operator on $I = (\ell_-, \ell_+)$ with separated boundary conditions. Assume that (11.5) holds at $\ell_+ = +\infty$. If $\psi_z^+$ is the Weyl solution for some $z \in \mathbb{C} \setminus \sigma_{\mathrm{ess}}(H)$, then there exists $\gamma > 0$ such that
\[
\psi_z^+(x) = O(e^{-\gamma x}), \qquad x \to +\infty.
\]

While the result is stated for an arbitrary Schrödinger operator with separated boundary conditions, the behavior of Weyl solutions at $\ell_+$ does not depend on the potential near $\ell_-$, and we will be able to reduce the proof to the case of a finite, regular endpoint $\ell_-$. Another observation is that $f$ is a solution of $-f'' + Vf = zf$ if and only if $g(x) = e^{\gamma x} f(x)$ is a solution of $-g'' + 2\gamma g' - \gamma^2 g + Vg = zg$. Exponential decay of $f$ is equivalent to boundedness of $g$, which motivates the use of differential operators corresponding to the differential expression
\[
H_\gamma g = -g'' + 2\gamma g' - \gamma^2 g + Vg. \tag{11.147}
\]
These are not self-adjoint for $\gamma \neq 0$ but will still be useful in the proof.

Proof. Let us fix $z \notin \sigma_{\mathrm{ess}}(H)$ and the Weyl solution $\psi = \psi_z^+$. Let us choose $c \in I$ such that $\psi(c) \neq 0$. Then, we consider the operator $H_0$ on $[c, +\infty)$ with the potential $V$ and a Dirichlet boundary condition at $c$. More explicitly, writing the domain of $H$ in terms of Lagrangian subspaces as $D(H) = Y_- \cap Y_+$, the domain of $H_0$ is
\[
D(H_0) = \{ g \in L^2([c, \infty)) \mid g = G|_{[c,\infty)} \text{ for some } G \in Y_+ \text{ and } g(c) = 0 \}.
\]
Corollary 11.94 implies $\sigma_{\mathrm{ess}}(H_0) \subset \sigma_{\mathrm{ess}}(H)$ and therefore $z \notin \sigma_{\mathrm{ess}}(H_0)$, and then the assumption $\psi(c) \neq 0$ ensures that $z \notin \sigma(H_0)$.

From now on, we work with the operator $H_0$ on the Hilbert space $L^2((c, \infty))$. As motivated above, we wish to consider the operator $H_\gamma$ defined formally by $H_\gamma = e^{\gamma x} H_0 e^{-\gamma x}$. By Proposition 11.32(d), for any $g \in D(H_0)$,
\[
\frac{1}{2} \|g'\|^2 \le M \|g\|^2 + \langle g, H_0 g \rangle
\]
($\langle g, H_0 g \rangle$ is real because $H_0$ is self-adjoint). By the Cauchy–Schwarz inequality and arithmetic mean–geometric mean inequality, there exists $C$ such that
\[
\|g'\|^2 \le C \|g\|^2 + C \|H_0 g\|^2 \tag{11.148}
\]
for all $g \in D(H_0)$. In particular, $g \in D(H_0)$ implies $g' \in L^2((c, \infty))$, so we can rigorously define the operator $H_\gamma$ by $D(H_\gamma) = D(H_0)$ and (11.147).

Combining (11.148) with the estimates $\|g\| \le \|(H_0 - z)^{-1}\| \|(H_0 - z) g\|$ and $\|H_0 g\| \le \|(H_0 - z) g\| + |z| \|g\|$ shows that
\[
\|g\| \le C_0 \|(H_0 - z) g\|, \qquad \|g'\| \le C_1 \|(H_0 - z) g\|
\]
for some constants $C_0$, $C_1$ independent of $g \in D(H_0)$. Thus,
\[
\|(H_\gamma - H_0) g\| \le 2\gamma \|g'\| + \gamma^2 \|g\| \le (2\gamma C_1 + \gamma^2 C_0) \|(H_0 - z) g\|,
\]
which finally implies that for small enough $\gamma > 0$,
\[
\|(H_\gamma - H_0)(H_0 - z)^{-1}\| \le 2\gamma C_1 + \gamma^2 C_0 < 1.
\]
For such $\gamma > 0$, the operator
\[
H_\gamma - z = \bigl( (H_\gamma - H_0)(H_0 - z)^{-1} + I \bigr)(H_0 - z)
\]
is invertible. Fix $d < \infty$; by the proof of Theorem 11.42, there exists $h \in L^2([c, \infty))$ such that $\operatorname{supp} h \subset [c, d]$ and $(H_0 - z)^{-1} h$ is a nontrivial multiple of $\psi$ on $[d, \infty)$. Then $(H_\gamma - z)^{-1}(e^{\gamma x} h(x))$ is a nontrivial multiple of $e^{\gamma x} \psi(x)$ on $[d, \infty)$. However, $(H_\gamma - z)^{-1}(e^{\gamma x} h(x)) \in D(H_0) \subset L^\infty(I)$, so $\psi(x) = O(e^{-\gamma x})$ as $x \to +\infty$.

The previous proof used the fact that any $f \in D(H_0)$ is a bounded function (Corollary 11.105). The same fact will be useful in the next two results, which closely relate the spectrum to the polynomial growth of formal eigensolutions.

Theorem 11.114 (Schnol). Consider $V \in L^1_{\mathrm{loc}}([0, \infty))$ such that (11.5) holds at $\ell_+ = +\infty$. Let $\kappa > 1/2$ and define
\[
S_\kappa = \{ \lambda \mid \phi_\lambda(x) = O(x^\kappa), \ x \to \infty \}.
\]
Then:
(a) $S_\kappa \subset \sigma(H)$;
(b) the maximal spectral measure for $H$ is supported on $S_\kappa$;
(c) $\overline{S_\kappa} = \sigma(H)$.

Proof. (a) For $\lambda \notin \sigma(H)$, there exists a Weyl solution $\psi(x)$ which is exponentially decaying at $\infty$, i.e., there exists $\gamma > 0$ such that $\psi(x) = O(e^{-\gamma x})$ as $x \to \infty$. By Theorem 11.109, this implies $\psi'(x) = O(e^{-\gamma x})$ and $\phi_\lambda(x) = O(x^\kappa)$ implies $\phi_\lambda'(x) = O(x^\kappa)$, so their Wronskian obeys
\[
W(\psi, \phi)(x) = O(x^\kappa e^{-\gamma x}).
\]
Since the Wronskian is independent of $x$, it must be 0, so $\psi$, $\phi$ are eigenvectors and $\lambda$ is an eigenvalue of $H$, leading to a contradiction.

(b) We use eigenfunction expansions. If $f(y) = G(y, x; z) = G(x, y; z)$ for some $z \in \mathbb{C} \setminus \mathbb{R}$, recall that $(Uf)(\lambda) = \frac{1}{\lambda - z} \phi_\lambda(x)$, and since $U$ is unitary,
\[
\int |G(x, y; z)|^2\, dy = \int |\phi_\lambda(x)|^2 \frac{1}{|\lambda - z|^2}\, d\mu(\lambda).
\]
By the integral representation for resolvents, since $D(H) \subset L^\infty(I)$, we see that for every $f \in L^2(I)$,
\[
\sup_{x \in I} \left| \int G(x, y; z) f(y)\, dy \right| < \infty,
\]
so by the uniform boundedness principle,
\[
\sup_{x \in I} \int |G(x, y; z)|^2\, dy < \infty.
\]
Thus,
\[
\int (1 + x^2)^{-\kappa} \int |G(x, y; z)|^2\, dy\, dx = \iint \frac{|\phi_\lambda(x)|^2}{|\lambda - z|^2} (1 + x^2)^{-\kappa}\, d\mu(\lambda)\, dx < \infty.
\]
By Tonelli's theorem, this implies that for $\mu$-a.e. $\lambda$,
\[
\int (1 + x^2)^{-\kappa} |\phi_\lambda(x)|^2\, dx < \infty,
\]
and therefore $\phi_\lambda(x) = O(|x|^\kappa)$ as $x \to \infty$.
(c) Since $S_\kappa \subset \sigma(H)$, $\overline{S_\kappa} \subset \sigma(H)$; conversely, since $\mu$ is supported on $S_\kappa$, $\sigma(H) = \operatorname{supp} \mu \subset \overline{S_\kappa}$.

Theorem 11.115 (Schnol). Let $H$ be a Schrödinger operator on $L^2(\mathbb{R})$ such that $\|V_-\|_{L^1_{\mathrm{loc,unif}}} < \infty$. Let $\kappa > 1/2$ and denote by $S_\kappa$ the set of $\lambda$ for which there exists a nontrivial solution of $-u'' + Vu = \lambda u$ such that $u(x) = O(|x|^\kappa)$ as $x \to \pm\infty$. Then:
(a) $S_\kappa \subset \sigma(H)$;
(b) the maximal spectral measure for $H$ is supported on $S_\kappa$;
(c) $\overline{S_\kappa} = \sigma(H)$.

Proof. (a) For $\lambda \notin \sigma(H)$, there exist Weyl solutions $\psi^\pm$, which are exponentially decaying at $\pm\infty$, respectively. If a solution is polynomially bounded, it must be a multiple of $\psi^-$ and of $\psi^+$, so it follows that $W(\psi^+, \psi^-) = 0$. This would mean that $\lambda$ is an eigenvalue of $H$, leading to a contradiction. This proves $S_\kappa \subset \sigma(H)$.
(b) Consider $W\, d\mu$ as in the eigenfunction expansion. Fix $y \in \mathbb{R}$ and $z \in \mathbb{C} \setminus \mathbb{R}$. The function $f(x) = G(x, y; z)$ is in $L^2(\mathbb{R})$ and $Uf$ can be

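computed analogously to Proposition 11.65 to give
\[
(Uf)(\lambda) = \frac{1}{\lambda - z} \begin{pmatrix} \phi_\lambda(y) \\ \theta_\lambda(y) \end{pmatrix}.
\]
In particular, since $U$ is unitary,
\[
\int |G(x, y; z)|^2\, dx = \int \begin{pmatrix} \phi_\lambda(y) \\ \theta_\lambda(y) \end{pmatrix}^{\!*} W(\lambda) \begin{pmatrix} \phi_\lambda(y) \\ \theta_\lambda(y) \end{pmatrix} \frac{1}{|\lambda - z|^2}\, d\mu(\lambda).
\]
Since $D(H) \subset L^\infty(\mathbb{R})$, as in the half-line case the uniform boundedness principle implies
\[
\sup_{y \in \mathbb{R}} \int |G(x, y; z)|^2\, dx < \infty,
\]
and therefore, renaming the variable $y$ as $x$,
\[
\iint (1 + x^2)^{-\kappa} \begin{pmatrix} \phi_\lambda(x) \\ \theta_\lambda(x) \end{pmatrix}^{\!*} W(\lambda) \begin{pmatrix} \phi_\lambda(x) \\ \theta_\lambda(x) \end{pmatrix} \frac{1}{|\lambda - z|^2}\, d\mu(\lambda)\, dx < \infty.
\]
Since the integrand is nonnegative, by Tonelli's theorem, for $\mu$-a.e. $\lambda$,
\[
\int (1 + x^2)^{-\kappa} \begin{pmatrix} \phi_\lambda(x) \\ \theta_\lambda(x) \end{pmatrix}^{\!*} W(\lambda) \begin{pmatrix} \phi_\lambda(x) \\ \theta_\lambda(x) \end{pmatrix} dx < \infty.
\]
Choosing a nonzero vector $v = \begin{pmatrix} v_1 \\ v_2 \end{pmatrix} \in \mathbb{C}^2$ such that $W \ge vv^*$, we obtain
\[
\int (1 + x^2)^{-\kappa} \begin{pmatrix} \phi_\lambda(x) \\ \theta_\lambda(x) \end{pmatrix}^{\!*} vv^* \begin{pmatrix} \phi_\lambda(x) \\ \theta_\lambda(x) \end{pmatrix} dx < \infty,
\]
which implies that the nontrivial solution $f = v_1 \theta_\lambda + v_2 \phi_\lambda$ obeys
\[
\int (1 + x^2)^{-\kappa} |f(x)|^2\, dx < \infty.
\]
Thus, this solution obeys $f(x) = O(|x|^\kappa)$ as $x \to \pm\infty$.
(c) Since $S_\kappa \subset \sigma(H)$, $\overline{S_\kappa} \subset \sigma(H)$; conversely, since $\mu$ is supported on $S_\kappa$, $\sigma(H) = \operatorname{supp} \mu \subset \overline{S_\kappa}$.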

11.16. The periodic discriminant and the Marchenko–Ostrovski map

We will now consider Schrödinger operators with periodic potentials. Up to rescaling, we can assume that $V$ is a periodic locally integrable function on $\mathbb{R}$ with period 1. Periodicity ensures that $V$ is limit point at $\pm\infty$; we denote by $H$ the corresponding Schrödinger operator on $L^2(\mathbb{R})$. As in the case of periodic Jacobi matrices, an important role will be played by the monodromy matrix (i.e., the transfer matrix over one period), which we will denote by $T(z) = T(1, z)$. Since $\det T(z) = 1$, the behavior of $T(z)$ will be largely determined by its trace $\Delta(z) = \operatorname{Tr} T(z)$, called the


discriminant. By Lemma 10.57, the value of $\Delta(z)$ determines the magnitude of eigenvalues of $T(z)$, which enters the following proof.

Lemma 11.116. The spectrum of $H$ is
\[
\sigma(H) = \{ z \in \mathbb{C} \mid \Delta(z) \in [-2, 2] \}.
\]
In particular, for any $z \in \mathbb{C} \setminus \mathbb{R}$, $\Delta(z) \notin [-2, 2]$.

Proof. For $z$ such that $\Delta(z) \in [-2, 2]$, $T(z)$ has a unimodular eigenvalue, which generates a bounded eigenfunction $f$. Conversely, if $\Delta(z) \notin [-2, 2]$, there exist eigenfunctions $\psi^\pm$ which are exponentially decaying at $\pm\infty$ and exponentially growing at $\mp\infty$, so any nontrivial linear combination will be exponentially growing in at least one direction. In summary, a polynomially bounded, nontrivial eigensolution exists if and only if $\Delta(z) \in [-2, 2]$. The claim now follows by Schnol's theorem, since the set $\{ z \in \mathbb{C} \mid \Delta(z) \in [-2, 2] \}$ is closed.

In particular, all zeros of $\Delta^2 - 4$ are on the real line. For further study, we introduce Herglotz function techniques. We define for $z \in \mathbb{C}_+$ the function
\[
w(z) = \int_0^1 m_+(x; z)\, dx. \tag{11.149}
\]
Since $m_+$ is jointly continuous in $(x, z) \in [0, 1] \times \mathbb{C}_+$, it is locally uniformly continuous, so the integral is an analytic function of $z$ by Morera's theorem.

Lemma 11.117. For all $z \in \mathbb{C}_+$,
\[
w(z) = \int_0^1 m_-(x; z)\, dx. \tag{11.150}
\]
Proof. Since $G(x, x; z) \in \mathbb{C}_+$ for all $x$, the function $g(x; z) = \log G(x, x; z)$ is well defined on $\mathbb{R} \times \mathbb{C}_+$ with $\operatorname{Im} g \in (0, \pi)$. Since $W(\psi^+, \psi^-)$ is independent of $x$, differentiating gives
\[
\partial_x g = \frac{\partial_x \psi^+\, \psi^- + \psi^+\, \partial_x \psi^-}{\psi^+ \psi^-} = m_+ - m_-.
\]
Integrating in $x$ from 0 to 1 and using periodicity of $g$ shows
\[
\int_0^1 (m_+(x; z) - m_-(x; z))\, dx = g(1, z) - g(0, z) = 0,
\]
so now (11.150) follows from the definition (11.149).

By averaging (11.149) and (11.150) and using (11.128), it also follows that
\[
w(z) = \int_0^1 \left( -\frac{1}{2G(x, x; z)} \right) dx.
\]

It is obvious that $w$ is a Herglotz function, since the functions $m_+$ are. Strikingly, we will soon see that $w$ has two more Herglotz properties. Let us also introduce the Marchenko–Ostrovski map
\[
\Theta(z) = -iw(z).
\]
Lemma 11.118. For all $z \in \mathbb{C}_+$, $\operatorname{Im} \Theta(z) > 0$ and
\[
T(z) \begin{pmatrix} m_+(z) \\ 1 \end{pmatrix} = e^{i\Theta(z)} \begin{pmatrix} m_+(z) \\ 1 \end{pmatrix}, \tag{11.151}
\]
\[
T(z) \begin{pmatrix} -m_-(z) \\ 1 \end{pmatrix} = e^{-i\Theta(z)} \begin{pmatrix} -m_-(z) \\ 1 \end{pmatrix}. \tag{11.152}
\]
Proof. The monodromy matrix evolves the Weyl solution from $x = 0$ to 1:
\[
T(z) \begin{pmatrix} (\partial_x \psi^+)(0, z) \\ \psi^+(0, z) \end{pmatrix} = \begin{pmatrix} (\partial_x \psi^+)(1, z) \\ \psi^+(1, z) \end{pmatrix}.
\]
By periodicity, the Weyl solution shifted by 1 is again a Weyl solution, so by uniqueness up to normalization, there exists $\eta \in \mathbb{C}$ such that $\psi^+(x + 1, z) = \eta \psi^+(x, z)$. Since
\[
\ln \frac{\psi^+(1, z)}{\psi^+(0, z)} = \int_0^1 \frac{(\partial_x \psi^+)(x, z)}{\psi^+(x, z)}\, dx = \int_0^1 m_+(x; z)\, dx = w(z),
\]
we conclude $\eta = e^{w(z)}$, which implies (11.151). Since $\psi^+$ is square-integrable at $+\infty$, $|\eta| < 1$, which implies $\operatorname{Im} \Theta(z) > 0$. Equation (11.152) is proved analogously.

Lemma 11.119. (a) For all $z \in \mathbb{C}_+$,
\[
\Delta(z) = 2 \cos \Theta(z). \tag{11.153}
\]
(b) For any interval $(c, d) \subset \mathbb{R}$ containing no zeros of $\Delta^2 - 4$, $\Theta$ has an analytic continuation to $\mathbb{C}_+ \cup (c, d) \cup \mathbb{C}_-$ such that (11.153) holds.
(c) For any $z \in \mathbb{C}$, if $\Delta(z) \in (-2, 2)$, then $\Delta'(z) \neq 0$.

Proof. Since $T(z)$ has eigenvalues $e^{\pm i\Theta(z)}$, its trace is computed as (11.153), which proves (a). The proofs of (b) and (c) are analogous to those of Lemmas 10.61 and 10.62 with $q = 1$.

Lemma 11.120. Zeros of $\Delta^2 - 4$ have multiplicity at most 2. At any double zero, $(\Delta^2 - 4)'' < 0$.

Proof. Assume that $\Delta^2 - 4$ has a zero of multiplicity $m$ at $\lambda$, so that $\Delta(\lambda) = \pm 2$. Then, by standard results in complex analysis, the equation $\Delta(z) = \pm 2\cos t$ for $t \in (0, \epsilon)$ locally has solutions $\gamma_j(t)$, $j = 1, \dots, m$, which lie on curves $\gamma_j$ with $\gamma_j(0) = \lambda$ and with arguments of $\gamma_1'(0), \dots, \gamma_m'(0)$

equispaced. Since $\Delta(z) \in [-2, 2]$ implies $z \in \mathbb{R}$, this can only happen if $m \le 2$. Moreover, in the case $m = 2$, the two curves must lie on $\mathbb{R}$, which implies that $\Delta^2 - 4$ as a function on $\mathbb{R}$ has a local maximum at $\lambda$.

Since $T(z)$ is the transfer matrix from $x = 0$ to $x = 1$, its entries are given in terms of the fundamental solutions $u(x, z)$, $v(x, z)$ on the interval $[0, 1]$ from (11.15). In particular,
\[
\Delta(z) = v(1, z) + (\partial_x u)(1, z),
\]
so Propositions 11.11 and 11.12 immediately imply that
\[
|\Delta(z)| \le 2 e^{|\operatorname{Re} k| + \int_0^1 |V(t)|\, dt} \qquad \forall z \in \mathbb{C}
\]
and
\[
|\Delta(z) - 2c(1, k)| \le 2 |||k|||^{-1} e^{|\operatorname{Re} k| + \int_0^1 |V(t)|\, dt} \qquad \forall z \in \mathbb{C}. \tag{11.154}
\]
This directly implies:

Lemma 11.121. $\lim_{\lambda\to-\infty} \Delta(\lambda) = +\infty$.

Using (11.154), we can adapt the counting lemma to prove the following:

Lemma 11.122. For large enough positive integers $N$, $\Delta^2 - 4$ has exactly $2N + 1$ zeros (counted with multiplicity) smaller than $(N + \frac{1}{2})^2 \pi^2$.

Proof. The previous estimates imply
\[
|\Delta(z)^2 - 4c(1, k)^2| \le 8 |||k|||^{-1} e^{2|\operatorname{Re} k| + 2\int_0^1 |V(t)|\, dt},
\]
so that
\[
|(\Delta(z)^2 - 4) + 4k^2 s(1, k)^2| \le 8 |||k|||^{-1} e^{2|\operatorname{Re} k| + 2\int_0^1 |V(t)|\, dt}.
\]
The function $g(z) = -4k^2 s(1, k)^2 = 4z s(1, k)^2$ is entire. It has a simple zero at 0 and double zeros at $n^2\pi^2$ for $n \in \mathbb{N}$, and no other zeros; thus, it has $2N + 1$ zeros including multiplicity on the interval $(-\infty, (N + 1/2)^2\pi^2)$. Moreover, on curves $\operatorname{Im} \sqrt{-z} = (N + \frac{1}{2})\pi$ for $N \in \mathbb{N}$ and $\operatorname{Re} \sqrt{-z} = C\pi$ for $C \ge 1$,
\[
|g(z)| > \frac{4}{9} e^{2|\operatorname{Re} k|}.
\]
Thus, if $N, C > 6 e^{2\int_0^1 |V(t)|\, dt}$, then on those contours, $|k| > 6\pi e^{2\int_0^1 |V(t)|\, dt}$, so
\[
|g(z)| > \frac{8}{|k|} e^{2|\operatorname{Re} k| + 2\int_0^1 |V(t)|\, dt} \ge |(\Delta(z)^2 - 4) - g(z)|,
\]
and Rouché's theorem completes the proof.

It is now possible to describe the behavior of the discriminant on $\mathbb{R}$ and the spectrum of $H$:

Theorem 11.123. All zeros of $\Delta^2 - 4$ are real and can be listed, with multiplicity, as a sequence $(\lambda_n)_{n=1}^\infty$ such that
\[
\lambda_{2j-1} < \lambda_{2j} \le \lambda_{2j+1} \qquad \forall j \in \mathbb{N}.
\]
Moreover,
\[
\Delta(\lambda_n) = \begin{cases} 2 & n \equiv 1, 4 \pmod 4 \\ -2 & n \equiv 2, 3 \pmod 4, \end{cases} \tag{11.155}
\]
and the periodic spectrum is $E = \bigcup_{j=1}^\infty [\lambda_{2j-1}, \lambda_{2j}]$.

Proof. The zeros of $\Delta^2 - 4$ divide $\mathbb{R}$ into intervals; counting the sign of $\Delta^2 - 4$ from $-\infty$ using Lemma 11.121, it follows that $|\Delta| < 2$ on the intervals $(\lambda_{2j-1}, \lambda_{2j})$ and $|\Delta| > 2$ on the intervals $(\lambda_{2j}, \lambda_{2j+1})$, whenever those intervals are open. Using Lemma 11.120 and Lemma 11.119(c) then determines the sign of $\Delta(\lambda_n)$ by induction in $n$, which implies (11.155).

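[Figure 11.2. The discriminant on $\mathbb{R}$. Horizontal axis: $\lambda$, with band edges $\lambda_1, \dots, \lambda_5$ marked; horizontal levels at $2$ and $-2$.]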
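The qualitative picture of the discriminant can be explored numerically. The sketch below is an illustration under stated assumptions, not a method from the text: it takes a 1-periodic piecewise constant potential equal to `v0` on $[0, a)$ and $0$ on $[a, 1)$, builds the monodromy matrix as a product of two explicit transfer matrices acting on the column vector $(u', u)$, and scans a grid for the set where $|\Delta| \le 2$. All function names and parameters are hypothetical.

```python
import cmath


def transfer(v, length, z):
    """Transfer matrix for -u'' + v u = z u with constant v over an interval
    of the given length, acting on the column vector (u', u)."""
    q2 = z - v
    q = cmath.sqrt(q2)
    c = cmath.cos(q * length)
    s = length if q == 0 else cmath.sin(q * length) / q
    # u(L) = c*u(0) + s*u'(0),  u'(L) = -q2*s*u(0) + c*u'(0)
    return ((c, -q2 * s), (s, c))


def matmul(a, b):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def monodromy(z, v0=0.0, a=0.5):
    """Transfer matrix over one period: first [0, a) with potential v0, then [a, 1)."""
    return matmul(transfer(0.0, 1.0 - a, z), transfer(v0, a, z))


def discriminant(z, v0=0.0, a=0.5):
    """Trace of the monodromy matrix."""
    t = monodromy(z, v0, a)
    return t[0][0] + t[1][1]


def bands(v0, a, lo, hi, steps=2000, tol=1e-9):
    """Grid approximation of {lam in [lo, hi] : |Delta(lam)| <= 2} as intervals."""
    h = (hi - lo) / steps
    out, start, prev = [], None, None
    for i in range(steps + 1):
        lam = lo + i * h
        inside = abs(discriminant(lam, v0, a).real) <= 2 + tol
        if inside and start is None:
            start = lam
        if not inside and start is not None:
            out.append((start, prev))
            start = None
        prev = lam
    if start is not None:
        out.append((start, hi))
    return out


if __name__ == "__main__":
    print(bands(0.0, 0.5, -5.0, 30.0))   # free case: one band starting near 0
    print(bands(10.0, 0.5, -5.0, 60.0))  # assumed step potential: bands and gaps
```

For `v0 = 0` the trace reduces to $2\cos\sqrt{z}$, so all gaps are closed; for a nonzero step the scan returns the grid approximations of the bands $[\lambda_{2j-1}, \lambda_{2j}]$ within the scanned window.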

As a function on $\mathbb{R}$, the discriminant is oscillatory toward $+\infty$; see Figure 11.2. The intervals $(\lambda_{2j}, \lambda_{2j+1})$ are spectral gaps; the $j$th gap is said to be open if $\lambda_{2j} < \lambda_{2j+1}$ and closed if $\lambda_{2j} = \lambda_{2j+1}$.

For further study, we indicate another Herglotz property related to the Marchenko–Ostrovski map, which comes from a relation with the diagonal Green's function:

Proposition 11.124. For $z \in \mathbb{C}_+$,
\[
w'(z) = \int_0^1 G(x, x; z)\, dx. \tag{11.156}
\]
In particular, $w'$ is a Herglotz function, and $w'$ has an analytic extension to $\mathbb{C} \setminus \sigma(H)$ which obeys $w'(\bar z) = \overline{w'(z)}$ and
\[
w'(z) = \frac{1}{2\sqrt{-z}} + o(1), \qquad z \to \infty, \ \arg z \in [\delta, 2\pi - \delta] \tag{11.157}
\]
for any $\delta > 0$.

Proof. Consider the function
\[
h(x) = \frac{\partial_z m_+(x; z)}{-m_+(x; z) - m_-(x; z)} = \frac{\psi_z^-(x)}{W(\psi_z^+, \psi_z^-)\, \psi_z^+(x)} \int_x^\infty \psi_z^+(y)^2\, dy.
\]
Using
\[
\left( \frac{\psi_z^-}{\psi_z^+} \right)' = \frac{(\psi_z^-)' \psi_z^+ - \psi_z^- (\psi_z^+)'}{(\psi_z^+)^2} = \frac{W(\psi_z^+, \psi_z^-)}{(\psi_z^+)^2},
\]
the derivative of $h$ is
\[
h'(x) = \frac{1}{\psi_z^+(x)^2} \int_x^\infty \psi_z^+(y)^2\, dy - \frac{\psi_z^-(x) \psi_z^+(x)}{W(\psi_z^+, \psi_z^-)} = \partial_z m_+(x; z) - G(x, x; z)
\]
(the last step uses (11.87)). Since the function $h$ is independent of normalization of the Weyl solutions, it is 1-periodic, so $\int_0^1 h'(x)\, dx = 0$. This implies
\[
\int_0^1 G(x, x; z)\, dx = \int_0^1 \partial_z m_+(x; z)\, dx = w'(z).
\]
Since $G(x, x; z)$ is Herglotz for each $x$, it follows that $w'$ is Herglotz. Due to joint continuity of $G(x, y; z)$ in $\mathbb{R} \times \mathbb{R} \times (\mathbb{C} \setminus \sigma(H))$, the right-hand side of (11.156) defines an analytic function on $\mathbb{C} \setminus \sigma(H)$ by Fubini's theorem and Morera's theorem. The conjugation symmetry of $w'$ follows from $G(x, x; \bar z) = \overline{G(x, x; z)}$. Finally, since the diagonal Green's function obeys the asymptotics $G(x, x; z) = \frac{1}{2\sqrt{-z}} + o(1)$ as $z \to -\infty$ uniformly in $x$, the function $w'$ obeys the same asymptotics.

Theorem 11.125. All zeros of $\Delta'$ are simple and can be listed, with multiplicity, as a sequence $\kappa_j \in [\lambda_{2j}, \lambda_{2j+1}]$ with $j \in \mathbb{N}$. Moreover, for each $j$, either $\lambda_{2j} < \kappa_j < \lambda_{2j+1}$ or $\lambda_{2j} = \kappa_j = \lambda_{2j+1}$.

Proof. It was already proved that there are no zeros of $\Delta'$ on the set where $\Delta \in (-2, 2)$. Moreover, a zero of $\Delta^2 - 4$ of multiplicity $m$ is also a zero of $\Delta'$ of multiplicity $m - 1$, so $\Delta'$ has a simple zero at every closed gap and no zeros at open gap edges. It remains for us to consider zeros on $\mathbb{C} \setminus E$. Since
\[
\Delta'(z) = -2 \sin \Theta(z)\, \Theta'(z),
\]
zeros of $\Delta'$ match those of $\Theta'$ and $w'$. In particular, by Proposition 11.124, $w'$ is Herglotz, so there are no zeros on $\mathbb{C} \setminus \mathbb{R}$. By (11.157), $w' \to 0$ as $z \to -\infty$, and since $w'$ is increasing on $(-\infty, \lambda_1)$, it has no zeros there. On each open gap $(\lambda_{2j}, \lambda_{2j+1})$, since $\Delta(\lambda_{2j}) = \Delta(\lambda_{2j+1})$, there exists a zero of $\Delta'$. By Proposition 11.124 and Proposition 7.56, $w'$ is strictly increasing there, and in particular, $\Theta'$ has at most one zero there, and it is simple.

Proposition 11.126. The function $\Theta$, originally defined on $\mathbb{C}_+$, has a continuous extension to $\overline{\mathbb{C}_+}$. This extension obeys the following:

11.16. The periodic discriminant and the Marchenko–Ostrovski map

Figure 11.3. Image of Θ(R) for a periodic Schrödinger operator. (Horizontal axis ticks: 0, π, 2π, 3π.)

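(a) $\operatorname{Im}\Theta = 0$ on $E$;
(b) $\operatorname{Re}\Theta = 0$ on $(-\infty,\lambda_1]$;
(c) $\operatorname{Re}\Theta = j\pi$ on $[\lambda_{2j},\lambda_{2j+1}]$ for $j\in\mathbb{N}$.

This describes the image of $\Theta$ on $\mathbb{R}$ as a generalized polygonal curve, with open gaps mapped to vertical line segments traversed up and then down; see Figure 11.3.

Proof. It is known that $\Theta$ has an analytic extension through any interval $(c,d)\subset\mathbb{R}$ which contains no zeros of $\Delta^2-4$, so $\Theta$ has a continuous extension to $\overline{\mathbb{C}_+}\setminus\{\lambda_j \mid j\in\mathbb{N}\}$. Consider a zero $\lambda_k$ of $\Delta^2-4$. From the exponential Herglotz representation of $w'$, it follows that
\[ \Theta'(z) = O(|z-\lambda_k|^{-1/2}), \qquad z\to\lambda_k,\ z\in\mathbb{C}_+. \]
As in the proof of Proposition 10.66, by the mean value theorem, this implies that
\[ \lim_{\substack{z\to\lambda_k \\ z\in\mathbb{C}_+}} \Theta(z) \]
exists, and this completes the continuous extension of $\Theta$ to $\overline{\mathbb{C}_+}$. Since $w'$ is real-valued on $\mathbb{R}\setminus E$, it follows that $\operatorname{Re}\Theta$ is constant on $(-\infty,\lambda_1]$ and on $[\lambda_{2j},\lambda_{2j+1}]$ for $j\in\mathbb{N}$. On $E$, $\Delta\in[-2,2]$ implies $\operatorname{Im}\Theta = 0$. Combining these conclusions shows $\Theta(\lambda_{2j}) = \Theta(\lambda_{2j+1})$ for $j\in\mathbb{N}$. Since $m_+(x;z) = -\sqrt{-z} + o(1)$ for each $x$, it follows from the definition of $w$ that $w(z) = -\sqrt{-z} + o(1)$ as $z\to-\infty$ and therefore $\operatorname{Re}\Theta(z) = 0$ for $z\in(-\infty,\lambda_1)$. Since integration over each band shows that
\[ \Theta(\lambda_{2j}) - \Theta(\lambda_{2j-1}) = \int_{\lambda_{2j-1}}^{\lambda_{2j}} \Theta'(\lambda)\,d\lambda = \int_{\lambda_{2j-1}}^{\lambda_{2j}} \frac{|\Delta'(\lambda)|}{\sqrt{4-\Delta(\lambda)^2}}\,d\lambda = \pi, \]
the remaining conclusions follow by induction.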
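The value π of the band integral can be checked by a one-line substitution. The following is a sketch under the assumption, established earlier in the section, that Δ is strictly monotone on each band and takes the values ±2 at its two endpoints; the substitution variable $t$ is introduced here only for illustration.

```latex
% Assumption: on the band [\lambda_{2j-1}, \lambda_{2j}], \Delta is strictly monotone
% and maps the endpoints to \pm 2. Substitute t = \Delta(\lambda)/2, so |dt| = |\Delta'(\lambda)|\,d\lambda / 2.
\begin{align*}
\int_{\lambda_{2j-1}}^{\lambda_{2j}} \frac{|\Delta'(\lambda)|}{\sqrt{4-\Delta(\lambda)^2}}\,d\lambda
  &= \int_{-1}^{1} \frac{2\,dt}{\sqrt{4-4t^2}}
   = \int_{-1}^{1} \frac{dt}{\sqrt{1-t^2}} \\
  &= \arcsin(1) - \arcsin(-1) = \pi .
\end{align*}
```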


11. One-dimensional Schrödinger operators

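Corollary 11.127. The analytic extension of $\Theta$ to $\mathbb{C}_+\cup(\lambda_{2j},\lambda_{2j+1})\cup\mathbb{C}_-$ obeys $\Theta(\bar z) = -\overline{\Theta(z)} + 2j\pi$.

Proof. This follows from the reflection principle, since $\operatorname{Re}\Theta = j\pi$ on the interval $(\lambda_{2j},\lambda_{2j+1})$.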
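The reflection step can be spelled out as follows. This is a sketch, assuming only that Θ is analytic on the upper half-plane, continuous up to the gap, and has real part jπ there; the auxiliary function $g$ is a name introduced here for illustration. Note that the complex conjugate appears on the right-hand side of the final identity.

```latex
% Assumption: Re \Theta = j\pi on (\lambda_{2j}, \lambda_{2j+1}), so \Theta - j\pi is purely imaginary there.
\begin{align*}
g(z) &:= i\bigl(\Theta(z) - j\pi\bigr), \qquad g(\lambda)\in\mathbb{R}
        \ \text{ for } \lambda\in(\lambda_{2j},\lambda_{2j+1}), \\
g(\bar z) &= \overline{g(z)} \qquad \text{(Schwarz reflection)}, \\
i\bigl(\Theta(\bar z) - j\pi\bigr) &= -i\bigl(\overline{\Theta(z)} - j\pi\bigr), \\
\Theta(\bar z) &= -\overline{\Theta(z)} + 2j\pi .
\end{align*}
```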

11.17. Direct spectral theory of periodic Schrödinger operators

The potential $V$ also determines Schrödinger operators on subintervals of $\mathbb{R}$. We denote by $H_\pm$ the Schrödinger operators on the intervals $(0,\pm\infty)$ with a Dirichlet boundary condition at 0, and denote by $H_1$ the operator on $(0,1)$ with Dirichlet boundary conditions at both endpoints. We denote the entries of the monodromy matrix by
\[ T = \begin{pmatrix} t_{11} & t_{12} \\ t_{21} & t_{22} \end{pmatrix}. \]
Since $t_{21}(z) = u(1,z)$, where $u$ denotes the Dirichlet solution, we will call zeros of $t_{21}$ Dirichlet eigenvalues. The following statement about Dirichlet spectrum is mostly familiar from Section 11.3:

Lemma 11.128. All zeros of $t_{21}$ are simple. Moreover, for any $z\in\mathbb{C}$, the following are equivalent:
(a) $z$ is a zero of $t_{21}$.
(b) $\binom{1}{0}$ is an eigenvector of $T(z)$.
(c) there is a nontrivial eigensolution of $-f'' + Vf = zf$ such that $f(0) = f(1) = 0$.
(d) $z$ is an eigenvalue of $H_1$.

Proof. The equivalence of (a), (c), and (d) was proved in Section 11.3, and (a) $\iff$ (b) is elementary since $T(z)\binom{1}{0} = \binom{u'(1,z)}{u(1,z)}$.

We now wish to compare the locations of Dirichlet eigenvalues to the periodic spectrum. The first step is the following.

Corollary 11.129. If $z$ is a Dirichlet eigenvalue, then $z\in\mathbb{R}$ and $\Delta(z)\notin(-2,2)$.

Proof. $z$ is real because it is an eigenvalue of the self-adjoint operator $H_1$. If $t_{21}(z) = 0$, then $T(z)$ is upper triangular, so $t_{11}(z)t_{22}(z) = \det T(z) = 1$. Since $T(z)$ has real entries for real $z$, this implies $|\Delta(z)| = |t_{11}(z) + 1/t_{11}(z)| \ge 2$ by the arithmetic mean–geometric mean inequality.
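Corollary 11.129 can be sanity-checked numerically. The sketch below assumes a hypothetical 1-periodic step potential (V = v0 on [0, 1/2), V = 0 on [1/2, 1)), chosen only because its monodromy matrix has a closed form; the helper names `step`, `monodromy`, `t21`, `discriminant`, `dirichlet_eigenvalues` and the value v0 = 10 are illustrative assumptions, not notation from the text. Matrices act on the vector (f′, f), matching the convention t21(z) = u(1, z).

```python
import cmath

def step(z, c, length):
    """Transfer matrix on (f', f) across an interval of the given length where V = c."""
    k = cmath.sqrt(z - c)
    if abs(k) < 1e-12:                      # limiting case k -> 0
        return [[1.0, 0.0], [length, 1.0]]
    cs, sn = cmath.cos(k * length), cmath.sin(k * length)
    return [[cs, -k * sn], [sn / k, cs]]

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][m] * b[m][j] for m in range(2)) for j in range(2)]
            for i in range(2)]

def monodromy(z, v0=10.0):
    """T(z) for the assumed potential: V = v0 on [0, 1/2), V = 0 on [1/2, 1)."""
    return matmul(step(z, 0.0, 0.5), step(z, v0, 0.5))

def t21(lam):
    """Entry t21 = u(1, lam); it is real for real lam."""
    return monodromy(lam)[1][0].real

def discriminant(lam):
    """Delta(lam) = tr T(lam) for real lam."""
    t = monodromy(lam)
    return (t[0][0] + t[1][1]).real

def dirichlet_eigenvalues(lo=-5.0, hi=200.0, n=4000):
    """Zeros of t21 on [lo, hi], located by a sign scan followed by bisection."""
    zeros = []
    grid = [lo + (hi - lo) * i / n for i in range(n + 1)]
    for a, b in zip(grid, grid[1:]):
        fa, fb = t21(a), t21(b)
        if fa * fb < 0:
            for _ in range(80):
                mid = 0.5 * (a + b)
                fm = t21(mid)
                if fa * fm <= 0:
                    b, fb = mid, fm
                else:
                    a, fa = mid, fm
            zeros.append(0.5 * (a + b))
    return zeros

mus = dirichlet_eigenvalues()
for mu in mus:
    print(f"mu = {mu:10.5f}   Delta(mu) = {discriminant(mu):9.5f}")
```

Each printed value of Δ(μ) should have absolute value at least 2, which is the statement that Dirichlet eigenvalues avoid the band interiors.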

To state the next result, let us fix the branch of $\sqrt{\Delta^2-4}$ on $\mathbb{C}\setminus E$ by
\[ \sqrt{\Delta^2-4} = -2i\sin\Theta(z). \]
Note that this branch is positive on $(-\infty,\lambda_1)$.

Theorem 11.130. The m-function for $H_+$ is given on $\mathbb{C}_+$ by
\[ m_+ = \frac{t_{11} - t_{22} - \sqrt{\Delta^2-4}}{2t_{21}}. \tag{11.158} \]
Moreover, the zeros of $t_{21}$ can be listed as $(\mu_j)_{j=1}^\infty$ so that $\mu_j < \mu_{j+1}$ and $\mu_j\in[\lambda_{2j},\lambda_{2j+1}]$ for all $j\in\mathbb{N}$.

Proof. Rewriting (11.151) projectively implies that
\[ m_+ = \frac{t_{11}m_+ + t_{12}}{t_{21}m_+ + t_{22}}. \]
This can be rewritten as a quadratic equation for $m_+$, whose solutions are
\[ \frac{t_{11} - t_{22} \pm \sqrt{\Delta^2-4}}{2t_{21}}. \]
Since $\sqrt{\Delta^2-4}$ is nonzero on $\mathbb{C}_+$, the $\pm$ sign must be chosen uniformly throughout $\mathbb{C}_+$. We will determine this choice of sign, and the placement of zeros of $t_{21}$, based on the condition that $m_+$ is Herglotz. On every band $(\lambda_{2j-1},\lambda_{2j})$, the boundary values of $\operatorname{Im} m_+$ are given by
\[ \lim_{\epsilon\downarrow 0} \operatorname{Im} m_+(\lambda+i\epsilon) = \mp\frac{2\lim_{\epsilon\downarrow 0}\sin\Theta(\lambda+i\epsilon)}{2t_{21}(\lambda)}. \]
These boundary values are nonzero and have constant sign on the band interior $(\lambda_{2j-1},\lambda_{2j})$. Since $m_+$ is Herglotz, this sign must be positive on each band interior. Since $\sin\Theta$ changes sign between consecutive gaps, $t_{21}(\lambda)$ must also change sign, so it must have at least one zero in the gap closure $[\lambda_{2j},\lambda_{2j+1}]$ for each $j$.

By the counting lemma, Lemma 11.24, for large enough $N$, $t_{21}$ has precisely $N$ zeros smaller than $(N+1/2)^2\pi^2$. Since that many zeros have already been found in the intervals $[\lambda_{2j},\lambda_{2j+1}]$ for $j = 1,\dots,N$, this shows that there is precisely one zero in each $[\lambda_{2j},\lambda_{2j+1}]$ and no zeros in $(-\infty,\lambda_1]$. Finally, on the band $[\lambda_1,\lambda_2]$, $\lim_{\epsilon\downarrow 0}\sin\Theta(\lambda+i\epsilon) > 0$ and $t_{21}(\lambda) > 0$ because all zeros of $t_{21}$ are greater than $\lambda$. This implies the choice of sign in front of $\sqrt{\Delta^2-4}$ in (11.158).

From $m_+$, spectral properties of $H_+$ can be read off:

Theorem 11.131. The operator $H_+$ has essential spectrum $\sigma_{\mathrm{ess}}(H_+) = E$ and discrete spectrum $\sigma_d(H_+) = \{\mu_j \mid j\in\mathbb{N},\ |t_{11}(\mu_j)| < 1\}$.


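More precisely, the spectral measure $\mu_+$ is given by
\[ d\mu_+(\lambda) = w_+(\lambda)\,d\lambda + \sum_{j\in\mathbb{N}} \kappa_j\,\delta_{\mu_j}, \]
where
\[ w_+(\lambda) = \begin{cases} \dfrac{\sqrt{4-\Delta(\lambda)^2}}{|t_{12}(\lambda)|} & \lambda\in(\lambda_{2j-1},\lambda_{2j}) \text{ for some } j\in\mathbb{N}, \\ 0 & \text{else,} \end{cases} \]
and $\kappa_j > 0$ if and only if $|t_{11}(\mu_j)| < 1$.

Similarly, $m_-$ can be found as the second solution of the quadratic equation:

Proposition 11.132.
\[ m_- = \frac{t_{22} - t_{11} - \sqrt{\Delta^2-4}}{2t_{21}}. \]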
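The two roots of the quadratic equation can be compared numerically. The sketch below again assumes a hypothetical step potential (V = v0 on [0, 1/2), V = 0 on [1/2, 1), v0 = 10) and illustrative helper names. Instead of fixing the branch of the square root through Θ, it selects the root whose Floquet multiplier t21·m + t22 has modulus below 1 (the solution decaying at +∞); this is a swapped-in selection rule, not the argument used in the proof. The other root, with its sign flipped, plays the role of m−.

```python
import cmath

def step(z, c, length):
    """Transfer matrix on (f', f) across an interval of the given length where V = c."""
    k = cmath.sqrt(z - c)
    if abs(k) < 1e-12:                      # limiting case k -> 0
        return [[1.0, 0.0], [length, 1.0]]
    cs, sn = cmath.cos(k * length), cmath.sin(k * length)
    return [[cs, -k * sn], [sn / k, cs]]

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][m] * b[m][j] for m in range(2)) for j in range(2)]
            for i in range(2)]

def monodromy(z, v0=10.0):
    """T(z) for the assumed potential: V = v0 on [0, 1/2), V = 0 on [1/2, 1)."""
    return matmul(step(z, 0.0, 0.5), step(z, v0, 0.5))

def weyl_functions(z):
    """Return (m_plus, m_minus, |multiplier of m_plus|, residual of the quadratic)."""
    (t11, t12), (t21, t22) = monodromy(z)
    root = cmath.sqrt((t11 + t22) ** 2 - 4)
    # Both solutions of t21*m^2 + (t22 - t11)*m - t12 = 0.
    candidates = [(t11 - t22 + s * root) / (2 * t21) for s in (1, -1)]
    # T (m, 1)^T = rho (m, 1)^T with rho = t21*m + t22.
    multiplier = lambda m: abs(t21 * m + t22)
    m_decaying = min(candidates, key=multiplier)
    m_growing = max(candidates, key=multiplier)
    residual = abs(t21 * m_decaying ** 2 + (t22 - t11) * m_decaying - t12)
    return m_decaying, -m_growing, multiplier(m_decaying), residual

samples = [1 + 1j, 20 + 0.5j, 60 + 2j, -3 + 0.1j]
for z in samples:
    m_plus, m_minus, rho, res = weyl_functions(z)
    print(f"z = {z}: Im m+ = {m_plus.imag:.5f}, Im m- = {m_minus.imag:.5f}, |rho| = {rho:.5f}")
```

For each sample point in the upper half-plane, both imaginary parts should come out positive, consistent with m± being Herglotz.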

It follows, in particular, that $m_\pm$ obey the reflectionless condition $m_-(\lambda+i0) = -\overline{m_+(\lambda+i0)}$ for all $\lambda$ in the interior of $E$. From this, it follows just as for periodic Jacobi matrices that:

Theorem 11.133. The full-line periodic Schrödinger operator $H$ has purely absolutely continuous spectrum on $E$ with multiplicity 2, i.e., $H \cong T_{\lambda,\chi_E(\lambda)\,d\lambda}\oplus T_{\lambda,\chi_E(\lambda)\,d\lambda}$.

Operators on $(0,\pm\infty)$ and $(0,1)$ with Neumann boundary conditions, denoted $H_\pm^N$ and $H_1^N$, can be related to the entry $t_{12}$. We leave as an exercise to the reader the following facts, which follow the same ideas as above.

Lemma 11.134. All zeros of $t_{12}$ are simple. Moreover, for any $z\in\mathbb{C}$, the following are equivalent:
(a) $z$ is a zero of $t_{12}$.
(b) $\binom{0}{1}$ is an eigenvector of $T(z)$.
(c) there is a nontrivial solution of $-f'' + Vf = zf$ such that $f'(0) = f'(1) = 0$.
(d) $z$ is an eigenvalue of $H_1^N$.

Lemma 11.135. All zeros of $t_{12}$ are real and can be listed in the form $(\nu_j)_{j=0}^\infty$ where $\nu_0\in(-\infty,\lambda_1]$ and $\nu_j\in[\lambda_{2j},\lambda_{2j+1}]$ for $j\in\mathbb{N}$; in particular, $\nu_{j-1} < \nu_j$ for all $j\in\mathbb{N}$.

Finally, we obtain equivalent characterizations of the open gap–closed gap dichotomy; see the discussion preceding Proposition 10.83 and its proof.


Theorem 11.136. For $\lambda\in\mathbb{C}$, the following are equivalent:
(a) $\lambda$ is a closed gap of $H$, i.e., $\lambda = \lambda_{2j} = \lambda_{2j+1}$ for some $j\in\mathbb{N}$.
(b) $\lambda$ is a double root of $\Delta^2-4$.
(c) $T(\lambda)\in\{+I,-I\}$.

The characterization through the geometric multiplicity has a direct spectral interpretation through a Schrödinger operator on the interval $(0,2)$ with periodic boundary conditions (Exercise 11.28).
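The simplest illustration of Theorem 11.136 is the free case V = 0, where every gap is closed. The sketch below is an assumption-labeled example: the function names are hypothetical, and the matrix is written in the (f′, f) convention used above. At λ = (jπ)² the monodromy matrix equals (−1)^j I, and Δ² − 4 touches zero without changing sign, as a double root should.

```python
import math

def free_monodromy(lam):
    """Monodromy matrix on (f', f) over one period for V = 0 and real lam > 0."""
    k = math.sqrt(lam)
    return [[math.cos(k), -k * math.sin(k)],
            [math.sin(k) / k, math.cos(k)]]

def discriminant(lam):
    """Delta(lam) = tr T(lam) = 2 cos(sqrt(lam)) in the free case."""
    t = free_monodromy(lam)
    return t[0][0] + t[1][1]

for j in (1, 2, 3):
    lam = (j * math.pi) ** 2
    t = free_monodromy(lam)
    rounded = [[round(x, 9) for x in row] for row in t]
    print(f"j = {j}: T = {rounded}, Delta^2 - 4 = {discriminant(lam) ** 2 - 4:.2e}")
```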

11.18. Exercises

11.1. Prove Lemma 11.1.

11.2. Prove that the initial value problem (11.6) has the unique solution given by (11.31).

11.3. Let $V\in L^1([0,1])$ and $\phi\in\mathbb{R}$. Prove that the operator $H_\phi$, defined by $H_\phi f = -f'' + Vf$ with
\[ D(H_\phi) = \{ f\in D(H_{\max}) \mid f(1) = e^{i\phi}f(0) \text{ and } f'(1) = e^{i\phi}f'(0) \}, \]
is self-adjoint. These boundary conditions are called skew-periodic. The case $\phi = 0$ is called periodic; the case $\phi = \pi$, antiperiodic.

11.4. Let $V\in L^1([0,1])$. Besides the separated boundary conditions (11.2) and (11.3) with $\alpha,\beta\in\mathbb{R}$ and the skew-periodic boundary conditions from the previous problem, are there any other self-adjoint choices of boundary conditions?

11.5. If $V$ is a potential on $\mathbb{R}$ such that
\[ \sup_{x\in\mathbb{R}} \int_x^{x+1} V_-(t)\,dt < \infty, \]
prove that there exists $M<\infty$, which depends only on the value of this supremum, such that for all $f\in X_-\cap X_+$,
\[ \frac12\int_{-\infty}^{+\infty} |f'|^2\,dx \le M\int_{-\infty}^{+\infty} |f|^2\,dx + \int_{-\infty}^{+\infty} \bar f\,(-f'' + Vf)\,dx. \]

11.6. If $\psi^\pm(x,z)$ are Weyl solutions for the Schrödinger operator $H$, denote
\[ m_\pm(x,z) = \pm\frac{\partial_x\psi^\pm(x,z)}{\psi^\pm(x,z)}. \]
Prove that
\[ \partial_x G(x,x;z) = \frac{m_-(x;z) - m_+(x;z)}{m_-(x;z) + m_+(x;z)}. \tag{11.159} \]


11.7. In the setting of Theorem 11.52, prove that for all $z,w\in\mathbb{C}\setminus\sigma(H)$,
\[ \int_0^b |\psi_w - \psi_z|^2\,dx = B(z) + B(w) - 2\operatorname{Re}\frac{m(z) - \overline{m(w)}}{z-\bar w}, \]
where $B$ is defined by
\[ B(z) = \begin{cases} \dfrac{m(z) - \overline{m(z)}}{z-\bar z} & z\in\mathbb{C}\setminus\mathbb{R}, \\ m'(z) & z\in\mathbb{R}\setminus\sigma(H). \end{cases} \]
Hint: Use Theorem 11.52.

11.8. Consider a Schrödinger operator $H$ on $(0,b)$ with a regular endpoint at 0. Since its m-function $m(z)$ was defined as a function on $\mathbb{C}\setminus\sigma(H)$, any $\lambda\in\sigma_d(H)$ is an isolated singularity of $m(z)$. For fixed $\lambda\in\sigma_d(H)$, prove the following.
(a) The Weyl solution $\psi_\lambda$ at $z = \lambda$ is a multiple of $\phi_\lambda$.
(b) In some neighborhood of $\lambda$, the Weyl solutions $\psi_z$ are linearly independent with $\theta_z$, so they can be normalized by $W(\psi_z,\theta_z) = 1$. With that normalization,
\[ \lim_{z\to\lambda}\int_0^b |\psi_z - \psi_\lambda|^2\,dx = 0. \]
(c) The m-function has a simple pole at $\lambda$, and its residue is
\[ \operatorname{Res}_\lambda m = -\int_0^b |\psi_\lambda(x)|^2\,dx, \]
where $\operatorname{Res}_\lambda m = \lim_{z\to\lambda}(z-\lambda)m(z)$ denotes the residue of $m$ at $\lambda$.

11.9. Consider the operator $H = -\frac{d^2}{dx^2}$ on the interval $I = (0,+\infty)$ with the boundary condition at 0
\[ \cos\alpha\, f(0) + \sin\alpha\, f'(0) = 0. \]
Find the m-function and the canonical spectral measure as functions of $\alpha$.

11.10. For any $z,w\in\mathbb{C}$, prove that the transfer matrices $T_\alpha(x,z)$ have the property
\[ J - T_\alpha(x,w)^* J\, T_\alpha(x,z) = -i(z-\bar w)\int_0^x T_\alpha(t,w)^* R_\alpha \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} R_\alpha^* T_\alpha(t,z)\,dt. \]

11.11. Let $H$ be a Schrödinger operator on $(0,b)$ with a regular endpoint at 0. Let $U$ denote its eigenfunction expansion as in Theorem 11.56. Let $z\in\mathbb{C}\setminus\sigma(H)$. Prove the following.
(a) For any $y\in(0,b)$, if $f(x) = \partial_y G(x,y;z)$, then $(Uf)(\lambda) = \dfrac{\phi_\lambda'(y)}{\lambda-z}$.
(b) If the Weyl solution $\psi_z$ is normalized by $W(\psi_z,\phi_z) = 1$, prove that $(U\psi_z)(\lambda) = \dfrac{1}{\lambda-z}$.


11.12. Let $V\in L^1_{\mathrm{loc}}([0,b))$ and $z\in\mathbb{C}_+$. Prove that the radius of the limit Weyl disk $\bigcap_x D_\alpha(x,z)$ is
\[ \left( 2\operatorname{Im} z\int_0^b |\phi_\alpha(t,z)|^2\,dt \right)^{-1}. \]

11.13. Assume that $V\in L^1_{\mathrm{loc}}([0,b))$ and that $V$ is a limit circle at $b$. Fix a boundary condition at 0. Recalling that any self-adjoint boundary condition at $b$ is described by $W_+(\bar v,f) = 0$, where $v\in X_+\setminus X_+^*$ obeys $W_+(\bar v,v) = 0$, let us denote the corresponding m-function by $m_v(z)$. Prove that for any $z\in\mathbb{C}_+$, the boundary of the limit disk $D_\alpha(b,z)$ is the set $\{m_v(z) \mid v\in X_+\setminus X_+^*,\ W_+(\bar v,v) = 0\}$.

11.14. Assume that $V$ is regular at 0 and a limit point at $b$. Prove that for any $h\in C_c((0,\infty))$,
\[ \lim_{x\to b}\int_0^\infty \frac{\lambda^{1/2}h(\lambda)}{\pi\bigl(\lambda\phi_\alpha(x,\lambda)^2 + \phi_\alpha'(x,\lambda)^2\bigr)}\,d\lambda = \int_0^\infty h(\lambda)\,d\mu_\alpha(\lambda). \]
This variant of Carmona's formula is useful for the study of decaying potentials, in combination with Prüfer variables [58, 66].

11.15. In the setting of Theorem 11.78, prove the asymptotic behavior
\[ m(z) = -k - \int_0^x V(t)e^{-2kt}\,dt + \frac1k\int_0^x\!\!\int_0^{t_1} e^{-2kt_1}\bigl(1-e^{-2kt_2}\bigr)V(t_1)V(t_2)\,dt_2\,dt_1 + O(|k|^{-2}) \]
as $z\to\infty$ in the appropriate sector.

11.16. Denote by $m_{0,\beta}(z)$ the Weyl m-function for a Schrödinger operator with a Dirichlet boundary condition at 0 and a $\beta$-boundary condition at 1. The Atkinson argument proves that
\[ \sup_{\beta\in\mathbb{R}} |m_{0,0}(z) - m_{0,\beta}(z)| \to 0, \qquad z\to\infty,\ \arg z\in[\delta,\pi-\delta]. \]
Is the same true in the limit $z\to\infty$, $\arg z\in[\delta,2\pi-\delta]$?

11.17. For $\alpha\in(0,\pi)$, let $m_\alpha$ denote the Weyl m-function corresponding to an $\alpha$-boundary condition at 0. Prove that as $z\to\infty$, $\arg z\in[\delta,2\pi-\delta]$,
\[ m_\alpha(z) = \cot\alpha + \frac{1}{\sin^2\alpha}\,k^{-1} + \frac{\cos\alpha}{\sin^3\alpha}\,k^{-2} + o(k^{-2}). \]

11.18. Let Hα,β denote the Schr¨odinger operator on [0, 1] with V ∈ L1 ([0, 1]) and boundary conditions (11.2) and (11.3). Prove that 0 0,

\[ m(z) = -k\sum_{j=0}^{n+2} c_j(V)k^{-j} + \tilde o(|k|^{-n-1}), \qquad z\to\infty,\ \arg z\in[\delta,\pi-\delta], \]
uniformly in bounded subsets of $V\in C^n([0,1])$.
(b) If in addition $H$ is semibounded, i.e., $\inf\sigma(H) > -\infty$, then
\[ m(z) = -k\sum_{j=0}^{n+2} c_j(V)k^{-j} + o(|k|^{-1}), \qquad z\to\infty,\ \arg z\in[\delta,2\pi-\delta]. \]

This is uniform in bounded subsets of $V\in C^n([0,1])$ with $H$ such that $\inf\sigma(H)\ge C$, where $C\in\mathbb{R}$.

11.21. Let $H_{\alpha,\beta}$ denote the Schrödinger operator on $[0,1]$ with $V\in L^1([0,1])$ and boundary conditions (11.2) and (11.3), and let $m_{\alpha,\beta}$ denote its Weyl function.
(a) Prove that $m_{0,\beta}$ is uniquely determined by the spectra $\sigma(H_{0,\beta})$ and $\sigma(H_{\pi/2,\beta})$ with Dirichlet and Neumann boundary conditions at 0. Hint: Use Example 7.62.
(b) Combine with the Borg–Marchenko theorem to conclude that $\sigma(H_{0,\beta})$ and $\sigma(H_{\pi/2,\beta})$ determine the potential $V$ and $\beta$ uniquely.

11.22. Prove that each coefficient $c_n(x)$ described in Theorem 11.84 is a polynomial in $V(x),\dots,V^{(n-2)}(x)$. If we define a notion of degree $\deg$ for differential polynomials of $V$ so that $\deg V^{(j)} = 2+j$ and that $\deg$ is multiplicative (informally speaking, every $V$ counts as 2, and every derivative counts as 1), prove that every monomial in $c_n$ has degree exactly $2n$.


11.23. Let $V\in L^1([0,\infty))$ (this is not a typo; $V$ is assumed integrable on the entire half-line, which is a kind of decay condition at $+\infty$). Denote by $T(x,z;V)$ the corresponding transfer matrices.
(a) Derive a first-order ordinary differential equation for $S(x,z;V) = T(x,z;0)^{-1}T(x,z;V)$.
(b) Use it to prove that for all $\lambda\in(0,\infty)$, there is a convergent limit $\lim_{x\to\infty} S(x,z;V)$.
(c) Conclude that for $\lambda\in(0,\infty)$, all eigensolutions are bounded and that the maximal spectral measure on $(0,\infty)$ is mutually absolutely continuous with Lebesgue measure.

11.24. Let $H$ be a Schrödinger operator on $(0,b)$ with a regular endpoint at 0 and $\phi_\lambda$ defined, as usual, as a nontrivial eigensolution at $\lambda$ which obeys the boundary condition at 0. If for some $\lambda\in\mathbb{C}$, $\phi_\lambda(x)$ grows at most subexponentially, i.e., for all $\epsilon>0$, $\phi_\lambda(x) = O(e^{\epsilon x})$ as $x\to\infty$, prove that $\lambda\in\sigma(H)$.

11.25. If for some $\lambda\in\mathbb{R}$, there exists a subexponentially growing eigensolution, i.e., a nontrivial eigensolution $u(x)$ such that for every $\epsilon>0$, $u(x) = O(e^{\epsilon|x|})$ as $x\to\pm\infty$, prove that $\lambda\in\sigma(H)$.

11.26. If $H$ is a Schrödinger operator with limit circle endpoints, prove that it has compact resolvent; in particular, $\sigma_{\mathrm{ess}}(H) = \emptyset$. Hint: Use the behavior of eigensolutions at limit circle endpoints to prove that $\iint_{I\times I} |G(x,y;z)|^2\,dx\,dy < \infty$.

11.27. If the Schrödinger operator $H$ on $(\ell_-,\ell_+)$ is a limit circle at $\ell_-$, prove that it has simple (multiplicity 1) spectrum. Hint: Use a Weyl matrix with respect to some internal point $c\in(\ell_-,\ell_+)$ and apply the previous exercise to a Schrödinger operator on $(\ell_-,c)$.

11.28. For a 1-periodic potential $V$, let $\Delta$ denote the discriminant, and let $H_{2P}$ denote the corresponding Schrödinger operator on the interval $(0,2)$ with periodic boundary conditions $f(2) = f(0)$, $f'(2) = f'(0)$. Prove that $\sigma(H_{2P}) = \{\lambda \mid \Delta(\lambda)^2 - 4 = 0\}$ and that for $\lambda\in\sigma(H_{2P})$, $\dim\operatorname{Ker}(H_{2P}-\lambda)$ is equal to the multiplicity of $\lambda$ as a root of $\Delta^2-4$.

Bibliography

[1] M. Aizenman and S. Warzel, Random operators: Disorder effects on quantum spectra and dynamics, Graduate Studies in Mathematics, vol. 168, American Mathematical Society, Providence, RI, 2015, DOI 10.1090/gsm/168. MR3364516
[2] N. I. Akhiezer, The classical moment problem and some related questions in analysis, Hafner Publishing Co., New York, 1965. Translated by N. Kemmer. MR0184042
[3] W. O. Amrein and V. Georgescu, On the characterization of bound states and scattering states in quantum mechanics, Helv. Phys. Acta 46 (1973/74), 635–658. MR363267
[4] D. H. Armitage and S. J. Gardiner, Classical potential theory, Springer Monographs in Mathematics, Springer-Verlag London, Ltd., London, 2001, DOI 10.1007/978-1-4471-0233-5. MR1801253
[5] F. V. Atkinson, On the location of the Weyl circles, Proc. Roy. Soc. Edinburgh Sect. A 88 (1981), no. 3-4, 345–356, DOI 10.1017/S0308210500020163. MR616784
[6] H. Behncke, Absolute continuity of Hamiltonians with von Neumann Wigner potentials. II, Manuscripta Math. 71 (1991), no. 2, 163–181, DOI 10.1007/BF02568400. MR1101267
[7] C. Bennewitz, A proof of the local Borg-Marchenko theorem, Comm. Math. Phys. 218 (2001), no. 1, 131–132, DOI 10.1007/s002200100384. MR1824201
[8] C. Bennewitz, M. Brown, and R. Weikard, Spectral and scattering theory for ordinary differential equations. Vol. I: Sturm-Liouville equations, Universitext, Springer, Cham, 2020, DOI 10.1007/978-3-030-59088-8. MR4199125
[9] G. Berkolaiko and P. Kuchment, Introduction to quantum graphs, Mathematical Surveys and Monographs, vol. 186, American Mathematical Society, Providence, RI, 2013, DOI 10.1090/surv/186. MR3013208
[10] A. S. Besicovitch, On existence of subsets of finite measure of sets of infinite measure, Nederl. Akad. Wetensch. Proc. Ser. A. 55 = Indagationes Math. 14 (1952), 339–344. MR0048540
[11] R. Bessonov, M. Lukić, and P. Yuditskii, Reflectionless canonical systems, I: Arov gauge and right limits, Integral Equations Operator Theory 94 (2022), no. 1, Paper No. 4, 30, DOI 10.1007/s00020-021-02683-z. MR4360428
[12] M. Sh. Birman and M. Z. Solomjak, Spectral theory of selfadjoint operators in Hilbert space, Mathematics and its Applications (Soviet Series), D. Reidel Publishing Co., Dordrecht, 1987. Translated from the 1980 Russian original by S. Khrushchëv and V. Peller. MR1192782


[13] G. Borg, Eine Umkehrung der Sturm-Liouvilleschen Eigenwertaufgabe. Bestimmung der Differentialgleichung durch die Eigenwerte (German), Acta Math. 78 (1946), 1–96, DOI 10.1007/BF02421600. MR15185
[14] G. Borg, Uniqueness theorems in the spectral theory of y'' + (λ − q(x))y = 0, Den 11te Skandinaviske Matematikerkongress, Trondheim, 1949, Johan Grundt Tanums Forlag, Oslo, 1952, pp. 276–287. MR0058063
[15] D. Borthwick, Spectral theory: Basic concepts and applications, Graduate Texts in Mathematics, vol. 284, Springer, Cham, [2020] ©2020, DOI 10.1007/978-3-030-38002-1. MR4180682
[16] M. J. Cantero, L. Moral, and L. Velázquez, Five-diagonal matrices and zeros of orthogonal polynomials on the unit circle, Linear Algebra Appl. 362 (2003), 29–56, DOI 10.1016/S0024-3795(02)00457-3. MR1955452
[17] R. Carmona and J. Lacroix, Spectral theory of random Schrödinger operators, Probability and its Applications, Birkhäuser Boston, Inc., Boston, MA, 1990, DOI 10.1007/978-1-4612-4488-2. MR1102675
[18] T. S. Chihara, An introduction to orthogonal polynomials, Mathematics and its Applications, Vol. 13, Gordon and Breach Science Publishers, New York-London-Paris, 1978. MR0481884
[19] H. L. Cycon, R. G. Froese, W. Kirsch, and B. Simon, Schrödinger operators with application to quantum mechanics and global geometry, Springer Study Edition, Texts and Monographs in Physics, Springer-Verlag, Berlin, 1987. MR883643
[20] D. Damanik, Schrödinger operators with dynamically defined potentials, Ergodic Theory Dynam. Systems 37 (2017), no. 6, 1681–1764, DOI 10.1017/etds.2015.120. MR3681983
[21] D. Damanik and J. Fillman, One-dimensional ergodic Schrödinger operators. I. General theory, Graduate Studies in Mathematics, vol. 221, American Mathematical Society, Providence, RI, 2022.
[22] D. Damanik and J. Fillman, One-dimensional ergodic Schrödinger operators. II. Specific classes, in preparation.
[23] R. O. Davies, Subsets of finite measure in analytic sets, Nederl. Akad. Wetensch. Proc. Ser. A. 55 = Indagationes Math. 14 (1952), 488–489. MR0053184
[24] L. de Branges, Hilbert spaces of entire functions, Prentice-Hall, Inc., Englewood Cliffs, N.J., 1968. MR0229011
[25] P. A. Deift, Orthogonal polynomials and random matrices: A Riemann-Hilbert approach, Courant Lecture Notes in Mathematics, vol. 3, New York University, Courant Institute of Mathematical Sciences, New York; American Mathematical Society, Providence, RI, 1999. MR1677884
[26] R. del Rio, S. Jitomirskaya, Y. Last, and B. Simon, Operators with singular continuous spectrum. IV. Hausdorff dimensions, rank one perturbations, and localization, J. Anal. Math. 69 (1996), 153–200, DOI 10.1007/BF02787106. MR1428099
[27] S. A. Denisov and A. Kiselev, Spectral properties of Schrödinger operators with decaying potentials, Spectral theory and mathematical physics: a Festschrift in honor of Barry Simon's 60th birthday, Proc. Sympos. Pure Math., vol. 76, Amer. Math. Soc., Providence, RI, 2007, pp. 565–589, DOI 10.1090/pspum/076.2/2307748. MR2307748
[28] J. Eckhardt, F. Gesztesy, R. Nichols, and G. Teschl, Weyl-Titchmarsh theory for Sturm-Liouville operators with distributional potentials, Opuscula Math. 33 (2013), no. 3, 467–563, DOI 10.7494/OpMath.2013.33.3.467. MR3046408
[29] V. Enss, Asymptotic completeness for quantum mechanical potential scattering. I. Short range potentials, Comm. Math. Phys. 61 (1978), no. 3, 285–291. MR523013
[30] A. Eremenko and P. Yuditskii, Comb functions, Recent advances in orthogonal polynomials, special functions, and their applications, Contemp. Math., vol. 578, Amer. Math. Soc., Providence, RI, 2012, pp. 99–118, DOI 10.1090/conm/578/11472. MR2964141


[31] K. J. Falconer, The geometry of fractal sets, Cambridge Tracts in Mathematics, vol. 85, Cambridge University Press, Cambridge, 1986. MR867284
[32] G. B. Folland, Real analysis: Modern techniques and their applications, Pure and Applied Mathematics (New York), John Wiley & Sons, Inc., New York, 1984. A Wiley-Interscience Publication. MR767633
[33] F. Gesztesy, B. Simon, and G. Teschl, Zeros of the Wronskian and renormalized oscillation theory, Amer. J. Math. 118 (1996), no. 3, 571–594. MR1393260
[34] F. Gesztesy, Inverse spectral theory as influenced by Barry Simon, Spectral theory and mathematical physics: a Festschrift in honor of Barry Simon's 60th birthday, Proc. Sympos. Pure Math., vol. 76, Amer. Math. Soc., Providence, RI, 2007, pp. 741–820, DOI 10.1090/pspum/076.2/2307754. MR2307754
[35] F. Gesztesy and H. Holden, Soliton equations and their algebro-geometric solutions. Vol. I: (1 + 1)-dimensional continuous models, Cambridge Studies in Advanced Mathematics, vol. 79, Cambridge University Press, Cambridge, 2003, DOI 10.1017/CBO9780511546723. MR1992536
[36] F. Gesztesy, H. Holden, J. Michor, and G. Teschl, Soliton equations and their algebro-geometric solutions. Vol. II: (1 + 1)-dimensional discrete models, Cambridge Studies in Advanced Mathematics, vol. 114, Cambridge University Press, Cambridge, 2008, DOI 10.1017/CBO9780511543203. MR2446594
[37] F. Gesztesy and B. Simon, On local Borg-Marchenko uniqueness results, Comm. Math. Phys. 211 (2000), no. 2, 273–287, DOI 10.1007/s002200050812. MR1754515
[38] F. Gesztesy and E. Tsekanovskii, On matrix-valued Herglotz functions, Math. Nachr. 218 (2000), 61–138, DOI 10.1002/1522-2616(200010)218:161::AID-MANA613.3.CO;2-4. MR1784638
[39] F. Gesztesy and M. Zinchenko, On spectral theory for Schrödinger operators with strongly singular potentials, Math. Nachr. 279 (2006), no. 9-10, 1041–1082, DOI 10.1002/mana.200510410. MR2242965
[40] D. J. Gilbert, On subordinacy and analysis of the spectrum of Schrödinger operators with two singular endpoints, Proc. Roy. Soc. Edinburgh Sect. A 112 (1989), no. 3-4, 213–229, DOI 10.1017/S0308210500018680. MR1014651
[41] D. J. Gilbert and D. B. Pearson, On subordinacy and analysis of the spectrum of one-dimensional Schrödinger operators, J. Math. Anal. Appl. 128 (1987), no. 1, 30–56, DOI 10.1016/0022-247X(87)90212-5. MR915965
[42] M. Harmer, Hermitian symplectic geometry and extension theory, J. Phys. A 33 (2000), no. 50, 9193–9203, DOI 10.1088/0305-4470/33/50/305. MR1804888
[43] P. D. Hislop and I. M. Sigal, Introduction to spectral theory: With applications to Schrödinger operators, Applied Mathematical Sciences, vol. 113, Springer-Verlag, New York, 1996, DOI 10.1007/978-1-4612-0741-2. MR1361167
[44] R. O. Hryniv and Ya. V. Mykytyuk, 1-D Schrödinger operators with periodic singular potentials, Methods Funct. Anal. Topology 7 (2001), no. 4, 31–42. MR1879483
[45] D. Hundertmark, Some bound state problems in quantum mechanics, Spectral theory and mathematical physics: a Festschrift in honor of Barry Simon's 60th birthday, Proc. Sympos. Pure Math., vol. 76, Amer. Math. Soc., Providence, RI, 2007, pp. 463–496, DOI 10.1090/pspum/076.1/2310215. MR2310215
[46] M. E. H. Ismail, Classical and quantum orthogonal polynomials in one variable, Encyclopedia of Mathematics and its Applications, vol. 98, Cambridge University Press, Cambridge, 2009. With two chapters by Walter Van Assche; With a foreword by Richard A. Askey; Reprint of the 2005 original. MR2542683
[47] S. Jitomirskaya and Y. Last, Power-law subordinacy and singular spectra. I. Half-line operators, Acta Math. 183 (1999), no. 2, 171–189, DOI 10.1007/BF02392827. MR1738043


[48] S. Ya. Jitomirskaya and Y. Last, Power law subordinacy and singular spectra. II. Line operators, Comm. Math. Phys. 211 (2000), no. 3, 643–658, DOI 10.1007/s002200050830. MR1773812
[49] R. Johnson and J. Moser, The rotation number for almost periodic potentials, Comm. Math. Phys. 84 (1982), no. 3, 403–438. MR667409
[50] R. Johnson and J. Moser, Erratum: “The rotation number for almost periodic potentials” [Comm. Math. Phys. 84 (1982), no. 3, 403–438; MR0667409 (83h:34018)], Comm. Math. Phys. 90 (1983), no. 2, 317–318. MR714441
[51] I. S. Kac, On the spectral multiplicity of a second-order differential operator (Russian), Dokl. Akad. Nauk SSSR 145 (1962), 510–513. MR0145375
[52] I. S. Kac, Spectral multiplicity of a second-order differential operator and expansion in eigenfunction (Russian), Izv. Akad. Nauk SSSR Ser. Mat. 27 (1963), 1081–1112. MR0159982
[53] T. Kappeler and J. Pöschel, KdV & KAM, Ergebnisse der Mathematik und ihrer Grenzgebiete. 3. Folge. A Series of Modern Surveys in Mathematics [Results in Mathematics and Related Areas. 3rd Series. A Series of Modern Surveys in Mathematics], vol. 45, Springer-Verlag, Berlin, 2003, DOI 10.1007/978-3-662-08054-2. MR1997070
[54] S. Khan and D. B. Pearson, Subordinacy and spectral theory for infinite matrices, Helv. Phys. Acta 65 (1992), no. 4, 505–527. MR1179528
[55] S. Khan and D. B. Pearson, Subordinacy and spectral theory for infinite matrices, Helv. Phys. Acta 65 (1992), no. 4, 505–527. MR1179528
[56] R. Killip, Spectral theory via sum rules, Spectral theory and mathematical physics: a Festschrift in honor of Barry Simon's 60th birthday, Proc. Sympos. Pure Math., vol. 76, Amer. Math. Soc., Providence, RI, 2007, pp. 907–930, DOI 10.1090/pspum/076.2/2310217. MR2310217
[57] W. Kirsch, An invitation to random Schrödinger operators (English, with English and French summaries), Random Schrödinger operators, Panor. Synthèses, vol. 25, Soc. Math. France, Paris, 2008, pp. 1–119. With an appendix by Frédéric Klopp. MR2509110
[58] A. Kiselev, Y. Last, and B. Simon, Modified Prüfer and EFGP transforms and the spectral analysis of one-dimensional Schrödinger operators, Comm. Math. Phys. 194 (1998), no. 1, 1–45, DOI 10.1007/s002200050346. MR1628290
[59] D. G. Larman, Subsets of given Hausdorff measure in connected spaces, Quart. J. Math. Oxford Ser. (2) 17 (1966), 239–243, DOI 10.1093/qmath/17.1.239. MR201597
[60] Y. Last, Quantum dynamics and decompositions of singular continuous spectra, J. Funct. Anal. 142 (1996), no. 2, 406–445, DOI 10.1006/jfan.1996.0155. MR1423040
[61] Y. Last and B. Simon, Eigenfunctions, transfer matrices, and absolutely continuous spectrum of one-dimensional Schrödinger operators, Invent. Math. 135 (1999), no. 2, 329–367, DOI 10.1007/s002220050288. MR1666767
[62] Y. Last and B. Simon, The essential spectrum of Schrödinger, Jacobi, and CMV operators, J. Anal. Math. 98 (2006), 183–220, DOI 10.1007/BF02790275. MR2254485
[63] N. Levinson, The inverse Sturm-Liouville problem, Mat. Tidsskr. B 1949 (1949), 25–30. MR32067
[64] B. M. Levitan and I. S. Sargsjan, Introduction to spectral theory: selfadjoint ordinary differential operators, Translations of Mathematical Monographs, Vol. 39, American Mathematical Society, Providence, R.I., 1975. Translated from the Russian by Amiel Feinstein. MR0369797
[65] B. M. Levitan and I. S. Sargsjan, Sturm-Liouville and Dirac operators, Mathematics and its Applications (Soviet Series), vol. 59, Kluwer Academic Publishers Group, Dordrecht, 1991. Translated from the Russian, DOI 10.1007/978-94-011-3748-5. MR1136037
[66] M. Lukic, Schrödinger operators with slowly decaying Wigner-von Neumann type potentials, J. Spectr. Theory 3 (2013), no. 2, 147–169, DOI 10.4171/JST/41. MR3042763


[67] V. A. Marchenko, Sturm-Liouville operators and applications, Operator Theory: Advances and Applications, vol. 22, Birkhäuser Verlag, Basel, 1986. Translated from the Russian by A. Iacob, DOI 10.1007/978-3-0348-5485-6. MR897106
[68] V. A. Marčenko, Some questions of the theory of one-dimensional linear differential operators of the second order. I (Russian), Trudy Moskov. Mat. Obšč. 1 (1952), 327–420. MR0058064
[69] C. A. Marx and S. Jitomirskaya, Dynamics and spectral theory of quasi-periodic Schrödinger-type operators, Ergodic Theory Dynam. Systems 37 (2017), no. 8, 2353–2393, DOI 10.1017/etds.2016.16. MR3719264
[70] L. Pastur and A. Figotin, Spectra of random and almost-periodic operators, Grundlehren der mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences], vol. 297, Springer-Verlag, Berlin, 1992, DOI 10.1007/978-3-642-74346-7. MR1223779
[71] J. Pöschel and E. Trubowitz, Inverse spectral theory, Pure and Applied Mathematics, vol. 130, Academic Press, Inc., Boston, MA, 1987. MR894477
[72] T. Ransford, Potential theory in the complex plane, London Mathematical Society Student Texts, vol. 28, Cambridge University Press, Cambridge, 1995, DOI 10.1017/CBO9780511623776. MR1334766
[73] M. Reed and B. Simon, Methods of modern mathematical physics. II. Fourier analysis, selfadjointness, Academic Press [Harcourt Brace Jovanovich, Publishers], New York-London, 1975. MR0493420
[74] M. Reed and B. Simon, Methods of modern mathematical physics. IV. Analysis of operators, Academic Press [Harcourt Brace Jovanovich, Publishers], New York–London, 1978. MR0493421
[75] M. Reed and B. Simon, Methods of modern mathematical physics. III, Academic Press [Harcourt Brace Jovanovich, Publishers], New York–London, 1979, Scattering theory. MR529429
[76] M. Reed and B. Simon, Methods of modern mathematical physics. I: Functional analysis, 2nd ed., Academic Press, Inc. [Harcourt Brace Jovanovich, Publishers], New York, 1980. MR751959
[77] C. Remling, The absolutely continuous spectrum of Jacobi matrices, Ann. of Math. (2) 174 (2011), no. 1, 125–171, DOI 10.4007/annals.2011.174.1.4. MR2811596
[78] C. Remling, Spectral theory of canonical systems, De Gruyter Studies in Mathematics, vol. 70, De Gruyter, Berlin, 2018. MR3890099
[79] C. Remling and K. Scarbrough, Oscillation theory and semibounded canonical systems, J. Spectr. Theory 10 (2020), no. 4, 1333–1359, DOI 10.4171/jst/329. MR4192754
[80] R. Romanov, Canonical systems and de Branges spaces, arXiv:1408.6022, 2014.
[81] W. Rudin, Real and complex analysis, 3rd ed., McGraw-Hill Book Co., New York, 1987. MR924157
[82] D. Ruelle, A remark on bound states in potential-scattering theory (English, with Italian summary), Nuovo Cimento A (10) 61 (1969), 655–662. MR246603
[83] A. M. Savchuk and A. A. Shkalikov, Sturm-Liouville operators with singular potentials (Russian, with Russian summary), Mat. Zametki 66 (1999), no. 6, 897–912, DOI 10.1007/BF02674332; English transl., Math. Notes 66 (1999), no. 5-6, 741–753 (2000). MR1756602
[84] J. H. Shapiro, Volterra adventures, Student Mathematical Library, vol. 85, American Mathematical Society, Providence, RI, 2018, DOI 10.1090/stml/085. MR3793153
[85] B. Simon, Bounded eigenfunctions and absolutely continuous spectra for one-dimensional Schrödinger operators, Proc. Amer. Math. Soc. 124 (1996), no. 11, 3361–3369, DOI 10.1090/S0002-9939-96-03599-X. MR1350963
[86] B. Simon, A new approach to inverse spectral theory. I. Fundamental formalism, Ann. of Math. (2) 150 (1999), no. 3, 1029–1057, DOI 10.2307/121061. MR1740987


[87] B. Simon, On a theorem of Kac and Gilbert, J. Funct. Anal. 223 (2005), no. 1, 109–115, DOI 10.1016/j.jfa.2004.08.015. MR2139882
[88] B. Simon, Orthogonal polynomials on the unit circle. Part 1: Classical theory, American Mathematical Society Colloquium Publications, vol. 54, American Mathematical Society, Providence, RI, 2005, DOI 10.1090/coll054.1. MR2105088
[89] B. Simon, Orthogonal polynomials on the unit circle. Part 2: Spectral theory, American Mathematical Society Colloquium Publications, vol. 54, American Mathematical Society, Providence, RI, 2005, DOI 10.1090/coll/054.2/01. MR2105089
[90] B. Simon, Sturm oscillation and comparison theorems, Sturm-Liouville theory, Birkhäuser, Basel, 2005, pp. 29–43. MR2145076
[91] B. Simon, Trace ideals and their applications, 2nd ed., Mathematical Surveys and Monographs, vol. 120, American Mathematical Society, Providence, RI, 2005, DOI 10.1090/surv/120. MR2154153
[92] B. Simon, Szegő's theorem and its descendants: Spectral theory for L2 perturbations of orthogonal polynomials, M. B. Porter Lectures, Princeton University Press, Princeton, NJ, 2011. MR2743058
[93] B. Simon, Advanced complex analysis, A Comprehensive Course in Analysis, Part 2B, American Mathematical Society, Providence, RI, 2015, DOI 10.1090/simon/002.2. MR3364090
[94] B. Simon, Basic complex analysis, A Comprehensive Course in Analysis, Part 2A, American Mathematical Society, Providence, RI, 2015, DOI 10.1090/simon/002.1. MR3443339
[95] B. Simon, Harmonic analysis, A Comprehensive Course in Analysis, Part 3, American Mathematical Society, Providence, RI, 2015, DOI 10.1090/simon/003. MR3410783
[96] B. Simon, Operator theory, A Comprehensive Course in Analysis, Part 4, American Mathematical Society, Providence, RI, 2015, DOI 10.1090/simon/004. MR3364494
[97] B. Simon, Real analysis, A Comprehensive Course in Analysis, Part 1, American Mathematical Society, Providence, RI, 2015. With a 68 page companion booklet, DOI 10.1090/simon/001. MR3408971
[98] H. Stahl and V. Totik, General orthogonal polynomials, Encyclopedia of Mathematics and its Applications, vol. 43, Cambridge University Press, Cambridge, 1992, DOI 10.1017/CBO9780511759420. MR1163828
[99] E. M. Stein and R. Shakarchi, Real analysis: Measure theory, integration, and Hilbert spaces, Princeton Lectures in Analysis, vol. 3, Princeton University Press, Princeton, NJ, 2005. MR2129625
[100] T.-J. Stieltjes, Recherches sur les fractions continues (French), Ann. Fac. Sci. Toulouse Sci. Math. Sci. Phys. 8 (1894), no. 4, J1–J122. MR1508159
[101] T. J. Stieltjes, Recherches sur les fractions continues [Suite et fin], Ann. Fac. Sci. Toulouse Sci. Math. Sci. Phys. 9 (1895), no. 1, A5–A47. MR1508160
[102] P. Stollmann, Caught by disorder: Bound states in random media, Progress in Mathematical Physics, vol. 20, Birkhäuser Boston, Inc., Boston, MA, 2001, DOI 10.1007/978-1-4612-0169-4. MR1935594
[103] G. Stolz, Bounded solutions and absolute continuity of Sturm-Liouville operators, J. Math. Anal. Appl. 169 (1992), no. 1, 210–228, DOI 10.1016/0022-247X(92)90112-Q. MR1180682
[104] M. H. Stone, On one-parameter unitary groups in Hilbert space, Ann. of Math. (2) 33 (1932), no. 3, 643–648, DOI 10.2307/1968538. MR1503079
[105] M. H. Stone, Linear transformations in Hilbert space, American Mathematical Society Colloquium Publications, vol. 15, American Mathematical Society, Providence, RI, 1990. Reprint of the 1932 original, DOI 10.1090/coll/015. MR1451877
[106] G. Szegö, Orthogonal Polynomials, American Mathematical Society Colloquium Publications, Vol. 23, American Mathematical Society, New York, 1939. MR0000077


[107] G. Teschl, Jacobi operators and completely integrable nonlinear lattices, Mathematical Surveys and Monographs, vol. 72, American Mathematical Society, Providence, RI, 2000, DOI 10.1090/surv/072. MR1711536
[108] G. Teschl, Mathematical methods in quantum mechanics: With applications to Schrödinger operators, Graduate Studies in Mathematics, vol. 99, American Mathematical Society, Providence, RI, 2009, DOI 10.1090/gsm/099. MR2499016
[109] E. C. Titchmarsh, Eigenfunction expansions associated with second-order differential equations. Part I, 2nd ed., Clarendon Press, Oxford, 1962. MR0176151
[110] J. Weidmann, Spectral theory of ordinary differential operators, Lecture Notes in Mathematics, vol. 1258, Springer-Verlag, Berlin, 1987, DOI 10.1007/BFb0077960. MR923320
[111] A. Zettl, Sturm-Liouville theory, Mathematical Surveys and Monographs, vol. 121, American Mathematical Society, Providence, RI, 2005, DOI 10.1090/surv/121. MR2170950

Notation Index

#A, the number of elements of set A, 6
≪, absolute continuity of one measure with respect to another, 162
⊥, mutual singularity of measures, 162
‖·‖_{L^1_{loc,unif}}, 433

|||k||| = max(1, |k|), 364
AC([a, b]), set of absolutely continuous functions on [a, b], 247
AC^2_loc(I) = {f ∈ AC_loc(I) | f′ ∈ AC_loc(I)}, 379
AC_loc(I), set of locally absolutely continuous functions on I, 250
A^c = X \ A, the complement of A in space X, 2
B_X, the Borel σ-algebra on X, 3
B_b(X), the algebra of bounded Borel functions from X to ℂ, 34

D(A), domain of unbounded operator A, 227
δ_x, the Dirac measure at x, 6
Δ, discriminant of a periodic Schrödinger operator, 444
D(x, z), Weyl disk, 409
F, Fourier transform on L^2(ℝ), 292
f_A, Möbius transformation induced by matrix A, 184
f̂, eigenfunction expansion of f, 400, 427
f̂, Fourier transform of f, 291
ǧ, inverse eigenfunction expansion of g, 400, 427
ǧ, inverse Fourier transform of g, 291
G(x, y; z), Green’s function, 391

C(K) = C(K, ℂ), the space of continuous maps K → ℂ, 48
C(K, ℝ), the space of continuous maps K → ℝ, 50
C_0(ℝ), the set of continuous decaying functions on ℝ, 195
C_A(ψ), cyclic subspace of vector ψ, 141
ℂ̂ = ℂ ∪ {∞}, the Riemann sphere, 184
ℂ_+, the upper half-plane, 183
ℂ[x], algebra of polynomials with complex coefficients, 118

H_ac, the absolutely continuous subspace for A, 274
H_αc, the α-continuous subspace for A, 275
H_αs, the α-singular subspace for A, 275
H_cont, the continuous subspace for A, 273
h_±, the positive and negative parts of a function, 21
H_pp, the pure point subspace for A, 272
H_sc, the singular continuous subspace for A, 274
H_s, the singular subspace for A, 274

𝔻, the unit disk in ℂ, 184

Ker, the kernel of an operator, 60


L(X, Y), the set of bounded linear operators from X to Y, 59
L^p(X, μ), 54
L^p([a, b]), L^p space on [a, b] with respect to Lebesgue measure, 247
ℓ^p(X), L^p-space with counting measure on X, 58
L^p_c(I), set of compactly supported functions in L^p(I), 400
L^p_loc(I), set of locally L^p functions on I, 250
m(x, z), 421
m_±(x, z), 422
μ_α, a Lebesgue–Stieltjes measure, 26
μ ⊗ ν, product measure, 31
M(z), Weyl M-matrix, 219, 326, 426
õ( ), asymptotic notation, 415
P(X), the set of subsets of X, 1
ψ_z^±(x) = ψ^±(x, z), Weyl solution, 390
Ran, the range of an operator, 60
Ran_μ g, essential range of g with respect to μ, 144
ℝ̂ = ℝ ∪ {−∞, +∞}, the extended real line, 13
Θ(z), Marchenko–Ostrovski map, 338, 445
T_X, the metric topology on metric space X, 3
W(f, g), Wronskian, 381
W_±(f, g), endpoint Wronskians, 381
X_±, endpoint domains for Schrödinger operators, 380
X_±^*, null subspaces of endpoint domains, 381
Y_±, endpoint domains for self-adjoint Schrödinger operators, 388


Index

Baire measure, 36
Banach space, 46
Banach–Alaoglu theorem, 65
Banach–Steinhaus theorem, 63
base of a metric/topological space, 10
Bessel’s inequality, 86, 93
Borel σ-algebra, 3
Borel function, 3
Borel functional calculus, 149, 158, 243
Borel measure, 6
Borel set, 3
Borg’s theorem, 356
Borg–Marchenko theorem, 423
  local, 423
bounded linear functional, 84
bounded operator, 64

Cauchy–Schwarz inequality, 79
Cayley transform, 185
closed graph theorem, 231
closure of an operator, 229
coefficient stripping, 311
Combes–Thomas estimate, 334, 440
compact resolvent operator, 264
completion
  of a Banach space, 75
  of a Hilbert space, 105
complex measure, 160
continuous spectrum, 274
convergence
  norm resolvent, 235
  strong operator, 111, 122, 135
  strong resolvent, 235
  strong, in Hilbert space, 97
  weak, 123
  weak operator, 111
  weak, in Hilbert space, 97, 99
  weak-∗, 65, 67, 97
counting lemma, 378, 446
counting measure, 6
cover of a set, 8
Croft–Garsia covering lemma, 166
cyclic subspace, 141

C*-algebra, 110
Carathéodory inequality, 223
Carathéodory’s theorem, 9, 40
Carmona’s theorem, 318, 413
Cauchy’s integral formula, Banach-space valued, 71

Dirac measure, 6
direct sum
  of bounded operators, 120
  of Hilbert spaces, 88, 89
  of operators, 146
  of subspaces of a Hilbert space, 90

absolutely continuous function, 247
absolutely continuous spectrum, 274
adjoint
  of bounded operator, 108
  of direct sum of operators, 120
  of matrix, 108
  of unbounded operator, 230
algebra of sets, 4
Arzelà–Ascoli theorem, 49


direct sum (continued)
  of unbounded operators, 237
  of unitary operators, 120
Dirichlet eigenvalue, 347, 450
discrete spectrum, 278
discriminant
  of a periodic Jacobi matrix, 337
  of periodic Schrödinger operator, 444
distribution function of measure, 24
dominated convergence theorem, 23
dual space, 64, 65
eigenfunction expansion, 400
  for full-line Jacobi matrix, 324
eigensolution, 309
eigenvalue, 114
eigenvector, 114, 136, 141
equicontinuity, 49
essential range, 144
essential spectrum, 278
  preservation under compact perturbations, 280
exhaustion by compact sets, 32
exponential Herglotz representation, 213
Fatou’s lemma, 19
first resolvent identity, 232
Floquet solutions, 352
Fourier series, 93, 105
Fourier transform, 292
Fubini’s theorem, 32
function
  convex, 75
  lower semicontinuous, 42
  upper semicontinuous, 42
fundamental solution, 367
Gram–Schmidt process, 94
graph of an operator, 228
Green’s function, 391
  for a Jacobi matrix, 321
  of Schrödinger operator, 375
Hölder’s inequality, 55
Hausdorff dimension, 171
Hausdorff distance, 135
Hausdorff measure, 169
Heaviside function, 392
Hellinger–Toeplitz theorem, 235
Herglotz function, 183
Herglotz representation, 194
Hilbert space, 80


inner product, 78
  on ℓ^2(ℕ), 80
  on ℂ^n, 80
  on L^2(X, dμ), 80
  on quotient Hilbert space, 104
integrable function, 22
J-contracting matrix, 187
J-expanding matrix, 187
Jacobi matrix, 299
Jacobi recursion, 309
kernel, 60, 114
  of integral operator, 125
Lagrangian subspace, 254
Lebesgue decomposition, 165
Lebesgue measure, 28
Lebesgue–Stieltjes measure, 26
limit circle, 382
limit point, 382
linear functional
  bounded, 64
linear relation, 228
Liouville’s theorem, Banach-space valued, 73
locally compact metric space, 43
Lyapunov exponent, 346
m-function of a Jacobi matrix, 300
Marchenko–Ostrovski map, 338, 445
matrix-valued measure, 176
measurable function, 3
measure, 6
  α-continuous, 172
  α-singular, 172
  absolutely continuous, 163, 165
  almost α-singular, 174
  complex, 178
  continuous, 160
  continuous with respect to another measure, 162
  pure point, 160
  singular continuous, 165
  singular with respect to another measure, 162
  strongly α-continuous, 174
measure class, 268
metric space
  discrete, 3
metric topology, 3
min-max principle, 280


monodromy matrix of a periodic Schrödinger operator, 444
monotone class, 4
monotone class theorem, 5
monotone convergence theorem, 17, 21
multiplication operator, 143, 236
multiplicity m spectral measure, 284
Neumann series, 115
norm, 45
  induced from inner product, 79, 80
  induced metric, 45
  of a linear operator, 59, 107
norm-preserving map, 61, 62
operator
  closable, 228
  closed, 228
  compact, 123, 136
  densely defined, 227
  finite rank, 123
  integral, 124
  inverse, 113
  order, 134
  positive, 134, 243
  self-adjoint, 129
  unbounded, 227
orthogonal complement, 82
orthogonal projection, 84–87, 93, 270
orthonormal basis, 92, 136
orthonormal polynomial, 96
outer measure, 8, 39
parallelogram identity, 82
partition, 15
periodic spectrum, 447
Phragmén–Lindelöf method, 215
Poisson kernel for ℂ_+, 201
polarization identity, 78
positive linear functional, 38
precompact subset, 49
product measure, 31
projection theorem, 82, 91
pure point spectrum, 274
pushforward of a σ-algebra, 3
Pythagorean theorem, 79, 81
Radon–Nikodym theorem, 163, 164
range, 60
regular endpoint, 360, 382
regular measure, 36
resolvent, 113, 115


resolvent (continued)
  of self-adjoint operator, 148, 151
  of unbounded operator, 231
resolvent identity, 114
Riccati equation, 421
Riesz–Fischer theorem, 57
Riesz–Markov theorem, 38
Schnol’s theorem, 335, 441, 442
Schur function, 188
Schwarz integral formula, 190
Schwarz lemma, 188
Schwarz–Pick theorem, 189, 223
second kind polynomials, 314
second-countability, 10
  of ℝ, 10
  of ℝ^n, 12
self-adjoint operator
  unbounded, 232
seminorm, 45, 47, 54
separability
  of C(K), 53
  of L^p spaces, 58
  of a Hilbert space, 95
separable metric space, 10
sesquilinear form, 77
  nondegenerate, 254
  positive definite, 78
  skew-symmetric, 254
  symplectic, 254
shift operator, 109, 117, 126
σ-algebra, 2
  generated by a set, 2
σ-locally compact space, 32
σ-additive, 6
σ-compactness, 30
simple function, 15
singular continuous spectrum, 274
singular spectrum, 274
singular value decomposition, 137
spectral basis, 146
spectral mapping theorem, 118, 157
spectral measure, 139, 153, 156
  for unbounded self-adjoint operator, 238
  maximal, 268
spectral multiplicity, 283
spectral projection, 270
spectral radius, 116
spectral representation, 148
spectral theorem, 136, 143, 147, 153, 155, 242
spectrum, 113, 115, 232


square root of positive operator, 152, 158
Stieltjes inversion, 202
Stone’s theorem, 290
Stone–Weierstrass theorem, 50, 52
subalgebra
  of B_b(ℝ), 244
  of B_b(X), 34, 150
  of C(K, ℝ) and C(K, ℂ), 50
subordinate solution, 328, 332, 429
subspace
  closed, 47
  cyclic, 153
  invariant, 121
  of Banach space, 47
  resolvent-invariant, 242
support
  essential, 165, 210
  of function, 33
support of a measure, 14
symmetric operator, 232
tensor product of Hilbert spaces, 100
Tonelli’s theorem, 31
transfer matrix, 313
trigonometric polynomials, 53, 92
uniform boundedness principle, 63
unitary map, 61, 93, 147
Weyl M-matrix, 219, 326, 426
Weyl disk, 409
  for a Jacobi matrix, 315
Weyl solution, 311, 390
Weyl’s criterion, 131, 148, 279
Wronskian, 305, 369
Young’s inequality, 55


Selected Published Titles in This Series

226 Milivoje Lukić, A First Course in Spectral Theory, 2022
225 Jacob Bedrossian and Vlad Vicol, The Mathematical Analysis of the Incompressible Euler and Navier-Stokes Equations, 2022
223 Volodymyr Nekrashevych, Groups and Topological Dynamics, 2022
222 Michael Artin, Algebraic Geometry, 2022
221 David Damanik and Jake Fillman, One-Dimensional Ergodic Schrödinger Operators, 2022
220 Isaac Goldbring, Ultrafilters Throughout Mathematics, 2022
219 Michael Joswig, Essentials of Tropical Combinatorics, 2021
218 Riccardo Benedetti, Lectures on Differential Topology, 2021
217 Marius Crainic, Rui Loja Fernandes, and Ioan Mărcuț, Lectures on Poisson Geometry, 2021
216 Brian Osserman, A Concise Introduction to Algebraic Varieties, 2021
215 Tai-Ping Liu, Shock Waves, 2021
214 Ioannis Karatzas and Constantinos Kardaras, Portfolio Theory and Arbitrage, 2021
213 Hung Vinh Tran, Hamilton–Jacobi Equations, 2021
212 Marcelo Viana and José M. Espinar, Differential Equations, 2021
211 Mateusz Michalek and Bernd Sturmfels, Invitation to Nonlinear Algebra, 2021
210 Bruce E. Sagan, Combinatorics: The Art of Counting, 2020
209 Jessica S. Purcell, Hyperbolic Knot Theory, 2020
208 Vicente Muñoz, Ángel González-Prieto, and Juan Ángel Rojo, Geometry and Topology of Manifolds, 2020
207 Dmitry N. Kozlov, Organized Collapse: An Introduction to Discrete Morse Theory, 2020
206 Ben Andrews, Bennett Chow, Christine Guenther, and Mat Langford, Extrinsic Geometric Flows, 2020
205 Mikhail Shubin, Invitation to Partial Differential Equations, 2020
204 Sarah J. Witherspoon, Hochschild Cohomology for Algebras, 2019
203 Dimitris Koukoulopoulos, The Distribution of Prime Numbers, 2019
202 Michael E. Taylor, Introduction to Complex Analysis, 2019
201 Dan A. Lee, Geometric Relativity, 2019
200 Semyon Dyatlov and Maciej Zworski, Mathematical Theory of Scattering Resonances, 2019
199 Weinan E, Tiejun Li, and Eric Vanden-Eijnden, Applied Stochastic Analysis, 2019
198 Robert L. Benedetto, Dynamics in One Non-Archimedean Variable, 2019
197 Walter Craig, A Course on Partial Differential Equations, 2018
196 Martin Stynes and David Stynes, Convection-Diffusion Problems, 2018
195 Matthias Beck and Raman Sanyal, Combinatorial Reciprocity Theorems, 2018
194 Seth Sullivant, Algebraic Statistics, 2018
193 Martin Lorenz, A Tour of Representation Theory, 2018
192 Tai-Peng Tsai, Lectures on Navier-Stokes Equations, 2018
191 Theo Bühler and Dietmar A. Salamon, Functional Analysis, 2018
190 Xiang-dong Hou, Lectures on Finite Fields, 2018
189 I. Martin Isaacs, Characters of Solvable Groups, 2018
188 Steven Dale Cutkosky, Introduction to Algebraic Geometry, 2018
187 John Douglas Moore, Introduction to Global Analysis, 2017

For a complete list of titles in this series, visit the AMS Bookstore at www.ams.org/bookstore/gsmseries/.

The central topic of this book is the spectral theory of bounded and unbounded self-adjoint operators on Hilbert spaces. After introducing the necessary prerequisites in measure theory and functional analysis, the exposition focuses on operator theory and especially the structure of self-adjoint operators. These can be viewed as infinite-dimensional analogues of Hermitian matrices; the infinite-dimensional setting leads to a richer theory which goes beyond eigenvalues and eigenvectors and studies self-adjoint operators in the language of spectral measures and the Borel functional calculus. The main approach to spectral theory adopted in the book is to present it as the interplay between three main classes of objects: self-adjoint operators, their spectral measures, and Herglotz functions, which are complex analytic functions mapping the upper half-plane to itself. Self-adjoint operators include many important classes of recurrence and differential operators; the later part of this book is dedicated to two of the most studied classes, Jacobi operators and one-dimensional Schrödinger operators. This text is intended as a course textbook or for independent reading for graduate students and advanced undergraduates. Prerequisites are linear algebra, a first course in analysis including metric spaces, and for parts of the book, basic complex analysis. Necessary results from measure theory and from the theory of Banach and Hilbert spaces are presented in the first three chapters of the book. Each chapter concludes with a number of helpful exercises.

For additional information and updates on this book, visit www.ams.org/bookpages/gsm-226
