From Euclidean to Hilbert Spaces: Introduction to Functional Analysis and its Applications [1 ed.] 1786306824, 9781786306821

From Euclidean to Hilbert Spaces analyzes the transition from finite-dimensional Euclidean spaces to infinite-dimensiona


English Pages 370 [355] Year 2021

Table of contents :
Cover
Half-Title Page
Dedication
Title Page
Copyright Page
Contents
Preface
Chapter 1. Inner Product Spaces (Pre-Hilbert)
1.1. Real and complex inner products
1.2. The norm associated with an inner product and normed vector spaces
1.2.1. The parallelogram law and the polarization formula
1.3. Orthogonal and orthonormal families in inner product spaces
1.4. Generalized Pythagorean theorem
1.5. Orthogonality and linear independence
1.6. Orthogonal projection in inner product spaces
1.7. Existence of an orthonormal basis: the Gram-Schmidt process
1.8. Fundamental properties of orthonormal and orthogonal bases
1.9. Summary
Chapter 2. The Discrete Fourier Transform and its Applications to Signal and Image Processing
2.1. The space l2(ZN) and its canonical basis
2.1.1. The orthogonal basis of complex exponentials in l2(ZN)
2.2. The orthonormal Fourier basis of l2(ZN)
2.3. The orthogonal Fourier basis of l2(ZN)
2.4. Fourier coefficients and the discrete Fourier transform
2.4.1. The inverse discrete Fourier transform
2.4.2. Definition of the DFT and the IDFT with the orthonormal Fourier basis
2.4.3. The real (orthonormal) Fourier basis
2.5. Matrix interpretation of the DFT and the IDFT
2.5.1. The fast Fourier transform
2.6. The Fourier transform in signal processing
2.6.1. Synthesis formula for 1D signals: decomposition on the harmonic basis
2.6.2. Signification of Fourier coefficients and spectrums of a 1D signal
2.6.3. The synthesis formula and Fourier coefficients of the unit pulse
2.6.4. High and low frequencies in the synthesis formula
2.6.5. Signal filtering in frequency representation
2.6.6. The multiplication operator and its diagonal matrix representation
2.6.7. The Fourier multiplier operator
2.7. Properties of the DFT
2.7.1. Periodicity of ẑ and ž
2.7.2. DFT and shift
2.7.3. DFT and conjugation
2.7.4. DFT and convolution
2.8. The DFT and stationary operators
2.8.1. The DFT and the diagonalization of stationary operators
2.8.2. Circulant matrices
2.8.3. Exhaustive characterization of stationary operators
2.8.4. High-pass, low-pass and band-pass filters
2.8.5. Characterizing stationary operators using shift operators
2.8.6. Frequency analysis of first and second derivation operators (discrete case)
2.9. The two-dimensional discrete Fourier transform (2D DFT)
2.9.1. Matrix representation of the 2D DFT: Kronecker product versus iteration of two 1D DFTs
2.9.2. Properties of the 2D DFT
2.9.3. 2D DFT and stationary operators
2.9.4. Gradient and Laplace operators and their action on digital images
2.9.5. Visualization of the amplitude spectrum in 2D
2.9.6. Filtering: an example of digital image filtering in a Fourier space
2.10. Summary
Chapter 3. Lebesgue’s Measure and Integration Theory
3.1. Riemann versus Lebesgue
3.2. σ-algebra, measurable space, measures and measured spaces
3.3. Measurable functions and almost-everywhere properties (a.e)
3.4. Integrable functions and Lebesgue integrals
3.5. Characterization of the Lebesgue measure on R and sets with a null Lebesgue measure
3.6. Three theorems for limit operations in integration theory
3.7. Summary
Chapter 4. Banach Spaces and Hilbert Spaces
4.1. Metric topology of inner product spaces
4.2. Continuity of fundamental operations in inner product spaces
4.2.1. Equivalence of separated topologies in finite-dimension vector spaces
4.3. Cauchy sequences and completeness: Banach and Hilbert
4.3.1. Completeness of vector spaces
4.3.2. Characterizing the completeness of normed vector spaces using series
4.3.3. Banach fixed-point theorem
4.4. Remarkable examples of Banach and Hilbert spaces
4.4.1. Lp and lp spaces: presentation and completeness
4.4.2. L∞ and l∞ spaces
4.4.3. Inclusion relationships between lp spaces
4.4.4. Inclusion relationships between Lp spaces
4.4.5. Density theorems in Lp(X,A,μ)
4.5. Summary
Chapter 5. The Geometric Structure of Hilbert Spaces
5.1. The orthogonal complement in a Hilbert space and its properties
5.2. Projection onto closed convex sets: theorem and consequences
5.2.1. Characterization of closed vector subspaces in Hilbert spaces
5.3. Polar and bipolar subsets of a Hilbert space
5.4. The (orthogonal) projection theorem in a Hilbert space
5.5. Orthonormal systems and Hilbert bases
5.5.1. Bessel’s inequality and Fourier coefficients
5.5.2. The Fischer-Riesz theorem
5.5.3. Characterizations of a Hilbert basis (or complete orthonormal system)
5.5.4. Isomorphisms between Hilbert spaces
5.5.5. l2(N,K) as the prototype of separable Hilbert spaces of infinite dimension
5.6. The Fourier Hilbert basis in L2
5.6.1. L2[-π, π] or L2[0, 2π]
5.6.2. L2(T)
5.6.3. L2[a,b]
5.6.4. Real Fourier series
5.6.5. Pointwise convergence of the real Fourier series: Dirichlet’s theorem
5.6.6. The Gibbs phenomenon and Cesàro summation
5.6.7. Speed of convergence to 0 of Fourier coefficients
5.6.8. Fourier transform in L2 (T) and shift
5.7. Summary
Chapter 6. Bounded Linear Operators in Hilbert Spaces
6.1. Fundamental properties of bounded linear operators between normed vector spaces
6.1.1. Continuity of linear operators defined on a finite-dimensional normed vector space
6.2. The operator norm, convergence of operator sequences and Banach algebras
6.2.1. A classical example of a non-bounded linear operator on a vector space of infinite dimension
6.3. Invertibility of linear operators
6.4. The dual of a Hilbert space and the Riesz representation theorem
6.4.1. The scalar product induced on the dual of a Hilbert space
6.5. Bilinear forms, sesquilinear forms and associated quadratic forms
6.5.1. The Lax-Milgram theorem and its consequences
6.6. The adjoint operator: presentation and properties
6.7. Orthogonal projection operators in a Hilbert space
6.7.1. Bounded multiplication operators and their relation to orthogonal projectors
6.7.2. Geometric realization of orthogonal projection operators via orthonormal systems
6.8. Isometric and unitary operators
6.8.1. Characterizations of isometric and unitary operators
6.8.2. Relationship between isometric and unitary operators and orthonormal systems
6.9. The Fourier transform on S(Rn), L1(Rn) and L2 (Rn)
6.9.1. The invariance of the Schwartz space with respect to the Fourier transform
6.9.2. Extension of the Fourier transform of S(Rn) to L1(Rn): the Riemann-Lebesgue theorem
6.9.3. Extension of the Fourier transform to a unitary operator on L2(Rn): the Fourier-Plancherel transform
6.9.4. Relationship between the Fourier-Plancherel transform and the Hermitian Hilbert basis
6.9.5. The Fourier transform and convolution
6.9.6. Convolution and Fourier transforms in L2: localization of the Fourier transform
6.10. The Nyquist-Shannon sampling theorem
6.10.1. The Nyquist frequency: aliasing and oversampling
6.11. Application of the Fourier transform to solve ordinary and partial differential equations
6.11.1. Solving an ordinary differential equation using the Fourier transform
6.11.2. The Fourier transform and partial differential equations
6.11.3. Solving the partial differential equation for heat propagation using the Fourier transform
6.12. Summary
Appendix 1. Quotient Space
Appendix 2. The Transpose (or Dual) of a Linear Operator
Appendix 3. Uniform, Strong and Weak Convergence
References
Index
Other titles from iSTE in Mathematics and Statistics


From Euclidean to Hilbert Spaces: Introduction to Functional Analysis and its Applications [1 ed.]
1786306824, 9781786306821

Author / Uploaded
Edoardo Provenzi


From Euclidean to Hilbert Spaces

To my mentors, Sissa Abbati and Renzo Cirelli, who taught me the importance of rigor in mathematics, and to Brunella, Paola, Clara and Tommo, whose passion for their work has both helped and brought joy to many

From Euclidean to Hilbert Spaces Introduction to Functional Analysis and its Applications

Edoardo Provenzi

First published 2021 in Great Britain and the United States by ISTE Ltd and John Wiley & Sons, Inc.

Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced, stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers, or in the case of reprographic reproduction in accordance with the terms and licenses issued by the CLA. Enquiries concerning reproduction outside these terms should be sent to the publishers at the undermentioned address: ISTE Ltd 27-37 St George’s Road London SW19 4EU UK

John Wiley & Sons, Inc. 111 River Street Hoboken, NJ 07030 USA

www.iste.co.uk

www.wiley.com

© ISTE Ltd 2021 The rights of Edoardo Provenzi to be identified as the author of this work have been asserted by him in accordance with the Copyright, Designs and Patents Act 1988. Library of Congress Control Number: 2021937006 British Library Cataloguing-in-Publication Data A CIP record for this book is available from the British Library ISBN 978-1-78630-682-1


Preface

This book provides an introduction to the key theoretical concepts associated with Hilbert spaces and with operators defined over these spaces. Our decision to dedicate a whole book to the subject of Hilbert spaces stems from a simple observation: of all the infinite-dimensional vector spaces, Hilbert spaces bear the closest resemblance to finite-dimensional Euclidean spaces, that is, $\mathbb{R}^n$ or $\mathbb{C}^n$, which provide the framework for classical analysis and linear algebra.

The topological subtleties which come into play in infinite dimensions mean that certain conditions (which always hold in finite dimensions) must be imposed in order to maintain the validity of known results from Euclidean spaces. For Hilbert spaces, one of these topological conditions is completeness, that is, every Cauchy sequence must converge in the space in which it is defined.

From this perspective, the theory of Hilbert spaces may be seen as an elegant conjunction of algebra, analysis and topology. It draws on the work of some of the great mathematicians of the early 20th century, including Riesz, Banach and, of course, Hilbert, who established the conditions needed to extend classical algebra and analysis into infinite dimensions.

One particularly important linear operator, the Fourier transform, appears on multiple occasions throughout this book. We start by examining the properties of this transform in finite dimensions, with the discrete Fourier transform, before extending it to infinite dimensions, considering the use of this operator in a range of different domains, including signal and image processing.

A clear understanding of the concepts introduced in this book is essential for mathematicians, physicists or engineers hoping to progress in any field, whether applied or theoretical. These concepts provide access to tools and techniques developed over a particularly rich, creative period in the history of mathematics, which remain relevant for both pure and applied forms of the subject.

The author would like to thank Olivier Husson for his assistance in producing the majority of the figures included in this book.

April 2021

Chapter 1. Inner Product Spaces (Pre-Hilbert)

This chapter will focus on inner product spaces, that is, vector spaces equipped with a scalar product, specifically those of finite dimension.

1.1. Real and complex inner products

In the real Euclidean spaces $\mathbb{R}^2$ and $\mathbb{R}^3$, the inner product of two vectors $v, w$ is defined as the real number:

\[ v \bullet w = \langle v, w \rangle = \|v\| \, \|w\| \cos(\vartheta) \]

where $\vartheta$ is the smallest angle between $v$ and $w$ and $\| \cdot \|$ represents the norm (or the magnitude) of the vectors.

Using the inner product, it is possible to define the orthogonal projection of vector $v$ in the direction defined by vector $w$. A distinction must be made between:

– the scalar projection of $v$ in the direction of $w$: $\|v\| \cos(\vartheta) = \dfrac{\langle v, w \rangle}{\|w\|}$; and

– the vector projection of $v$ in the direction of $w$: $\|v\| \cos(\vartheta) \, \dfrac{w}{\|w\|} = \dfrac{\langle v, w \rangle}{\|w\|^2} \, w$;

where $\dfrac{w}{\|w\|}$ is the unit vector in the direction of $w$. Evidently, the roles of $v$ and $w$ can be reversed.
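The two projection formulas above can be tried out numerically. The following is a minimal sketch in plain Python, not taken from the book: the helper names (`inner`, `norm`, `scalar_projection`, `vector_projection`) and the sample vectors are assumptions chosen for illustration, and $w$ is assumed to be non-zero.

```python
import math


def inner(v, w):
    """Canonical inner product of two real vectors of equal length."""
    return sum(vi * wi for vi, wi in zip(v, w))


def norm(v):
    """Euclidean norm induced by the inner product."""
    return math.sqrt(inner(v, v))


def scalar_projection(v, w):
    """<v, w> / ||w||, which equals ||v|| cos(theta). Assumes w != 0."""
    return inner(v, w) / norm(w)


def vector_projection(v, w):
    """(<v, w> / ||w||^2) w, the component of v along w. Assumes w != 0."""
    coeff = inner(v, w) / inner(w, w)
    return [coeff * wi for wi in w]


v = [3.0, 4.0, 0.0]
w = [2.0, 0.0, 0.0]

s = scalar_projection(v, w)   # 6 / 2 = 3
p = vector_projection(v, w)   # (6 / 4) * w = [3, 0, 0]
print(s, p)
```

Here the scalar projection does not depend on the length of $w$, only on its direction, which is why dividing by $\|w\|$ appears in both formulas.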

The absolute value of the scalar projection measures the “similarity” of the directions of two vectors. To understand this concept, consider two remarkable relative positions of $v$ and $w$:

– if $v$ and $w$ have the same direction, then the angle $\vartheta$ between them is either $0$ or $\pi$, hence $\cos(\vartheta) = \pm 1$, that is, the absolute value of the scalar projection of $v$ in the direction of $w$ is $\|v\|$;

– however, if $v$ and $w$ are perpendicular, then $\vartheta = \frac{\pi}{2}$ and hence $\cos(\vartheta) = 0$, showing that the scalar projection of $v$ in the direction of $w$ is null.

From Euclidean to Hilbert Spaces: Introduction to Functional Analysis and its Applications First Edition. Edoardo Provenzi. © ISTE Ltd 2021. Published by ISTE Ltd and John Wiley & Sons, Inc.

When the position of $v$ relative to $w$ falls somewhere between the two configurations described above, the absolute value of the scalar projection of $v$ in the direction of $w$ falls between $0$ and $\|v\|$; this explains its use to measure the similarity of the direction of vectors.

In this book, we shall consider vector spaces which are far more complex than $\mathbb{R}^2$ and $\mathbb{R}^3$, and the measure of vector similarity obtained through projection supplies crucial information concerning the coherence of directions. Before we can obtain this information, we must begin by moving from the Euclidean spaces $\mathbb{R}^2$ and $\mathbb{R}^3$ to abstract vector spaces. The general definition of an inner product and an orthogonal projection in these spaces may be seen as an extension of the previous definitions, permitting their application to spaces in which our representation of vectors is no longer applicable.

Geometric properties, which can only be apprehended and, notably, visualized in two or three dimensions, must be replaced by a set of algebraic properties which can be used in any dimension. Of course, these algebraic properties must be necessary and sufficient to characterize the inner product of vectors in a plane or in real space. This approach, in which we generalize concepts which are “intuitive” in two or three dimensions, is a classic approach in mathematics.

In this chapter, the symbol $V$ will be used to describe a vector space of finite dimension $n < +\infty$ defined over the field $\mathbb{K}$, where $\mathbb{K}$ is either $\mathbb{R}$ or $\mathbb{C}$. The field $\mathbb{K}$ contains the scalars used to construct linear combinations of vectors in $V$. Note that two finite-dimensional vector spaces over the same field are isomorphic if and only if they are of the same dimension. Furthermore, if we establish a basis $B = (b_1, \dots, b_n)$ for $V$, an isomorphism between $V$ and $\mathbb{K}^n$ can be constructed as follows:

\[ I : V \longrightarrow \mathbb{K}^n, \qquad v = \sum_{i=1}^{n} v_i b_i \longmapsto [v]_B = \begin{pmatrix} v_1 \\ \vdots \\ v_n \end{pmatrix} \]

that is, $I$ associates each $v \in V$ with the vector of $\mathbb{K}^n$ given by the scalar components of $v$ in relation to the established basis $B$. Since $I$ is an isomorphism, it follows that $\mathbb{K}^n$ is the prototype of all vector spaces of dimension $n$ over a field $\mathbb{K}$.

DEFINITION 1.1.– Let $V$ be a vector space defined over a field $\mathbb{K}$.


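A $\mathbb{K}$-form over $V$ is a map defined over $V \times V$ with values in $\mathbb{K}$, that is:

\[ \phi : V \times V \longrightarrow \mathbb{K}, \qquad (v, w) \longmapsto \phi(v, w). \]

DEFINITION 1.2.– Let $V$ be a real vector space. A pair $(V, \langle\, ,\, \rangle)$ is said to be a real inner product space (or a real pre-Hilbert space) if the form $\langle\, ,\, \rangle$ is:

1) bilinear, i.e.¹ linear in relation to each argument (the other being fixed):

\[ \langle v_1 + v_2, w_1 + w_2 \rangle = \langle v_1, w_1 \rangle + \langle v_1, w_2 \rangle + \langle v_2, w_1 \rangle + \langle v_2, w_2 \rangle, \quad \forall v_1, v_2, w_1, w_2 \in V \]

and:

\[ \langle \alpha v, \beta w \rangle = \alpha \langle v, \beta w \rangle = \beta \langle \alpha v, w \rangle = \alpha \beta \langle v, w \rangle, \quad \forall \alpha, \beta \in \mathbb{R},\ v, w \in V; \]

2) symmetric: $\langle v, w \rangle = \langle w, v \rangle$, $\forall v, w \in V$;

3) definite: $\langle v, v \rangle = 0 \iff v = 0_V$, the null vector of the vector space $V$;

4) positive: $\langle v, v \rangle > 0$ $\forall v \in V$, $v \neq 0_V$.

Upon reflection, we see that, for a real form over $V$, the symmetry and bilinearity requirements are equivalent to requiring symmetry and linearity on the left-hand side, that is:

\[ \langle v_1 + v_2, w \rangle = \langle v_1, w \rangle + \langle v_2, w \rangle, \qquad \langle \alpha v, w \rangle = \alpha \langle v, w \rangle, \qquad \forall v, v_1, v_2, w \in V,\ \forall \alpha \in \mathbb{R}. \]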
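The step behind this equivalence can be written out explicitly. The short derivation below is a sketch using only symmetry and linearity in the first argument; the vectors $w_1, w_2$ and the scalar $\beta$ are generic symbols introduced here for illustration.

```latex
\[
\langle v, \beta w_1 + w_2 \rangle
  = \langle \beta w_1 + w_2, v \rangle
  = \beta \langle w_1, v \rangle + \langle w_2, v \rangle
  = \beta \langle v, w_1 \rangle + \langle v, w_2 \rangle,
  \qquad \forall v, w_1, w_2 \in V,\ \forall \beta \in \mathbb{R}.
\]
```

The first and last equalities use symmetry, and the middle one uses linearity on the left, so linearity on the right follows and the form is bilinear.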

The simplest and most important example of a real inner product is the canonical inner product, defined as follows: let $v = (v_1, v_2, \dots, v_n)$, $w = (w_1, w_2, \dots, w_n)$ be two vectors in $\mathbb{R}^n$ written with their components in relation to any given, but fixed, basis $(b_i)_{i=1}^{n}$ in $\mathbb{R}^n$. The canonical inner product of $v$ and $w$ is:

\[ \langle v, w \rangle \equiv \sum_{i=1}^{n} v_i w_i = v^t \cdot w = v \cdot w^t, \]

where $v^t$ and $w^t$ in the final equations are the transposed vectors of $v$ and $w$, giving us the matrix product of a row vector (treated as a $1 \times n$ matrix) and a column vector (treated as an $n \times 1$ matrix).

The extension of these definitions to complex vector spaces is not particularly straightforward. First, note that if $V$ is a complex vector space, then there is no bilinear and positive-definite form over $V \times V$. Indeed, for such a form, any vector $v \in V$ would give the following:

\[ \langle iv, iv \rangle = i^2 \langle v, v \rangle = -\langle v, v \rangle \le 0 \]

since $\langle v, v \rangle \ge 0$ by positivity. This is incompatible with strict positivity whenever $v \neq 0_V$, since then $iv \neq 0_V$ as well.

¹ i.e. is the abbreviation of the Latin expression “id est”, meaning “that is”. This term is often used in mathematical literature.
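The canonical inner product and the sign problem described above can be illustrated numerically. The following is a minimal sketch in plain Python, not taken from the book: the function name `canonical_inner` and the sample vectors are assumptions chosen for illustration. The same bilinear formula is applied first to real vectors and then, without any conjugation, to a vector multiplied by the imaginary unit.

```python
def canonical_inner(v, w):
    """Bilinear formula sum_i v_i * w_i (no complex conjugation)."""
    if len(v) != len(w):
        raise ValueError("vectors must have the same length")
    return sum(vi * wi for vi, wi in zip(v, w))


v = [1.0, 2.0, 3.0]
w = [4.0, -5.0, 6.0]

real_value = canonical_inner(v, w)      # 1*4 + 2*(-5) + 3*6 = 12
norm_squared = canonical_inner(v, v)    # 1 + 4 + 9 = 14

# Multiply v by the imaginary unit and reuse the bilinear formula:
# the result is i^2 * <v, v> = -<v, v>, so positivity is lost.
iv = [1j * vi for vi in v]
bilinear_iv = canonical_inner(iv, iv)

print(real_value, norm_squared, bilinear_iv)
```

The negative value obtained for `bilinear_iv` is the numerical counterpart of the inequality $\langle iv, iv \rangle \le 0$.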


As we shall see, the property of positivity is essential in order to define a norm (and thus a distance, and by extension, a topology) from a complex inner product. To obtain an algebraic structure for complex scalar products which remains compatible with a topological structure, we are therefore forced to abandon the notion of bilinearity, and to search for an alternative.

We could consider antilinearity², i.e.

\[ \langle \alpha v, \beta w \rangle = \bar{\alpha}\bar{\beta} \langle v, w \rangle \]

But it has the same problem as bilinearity:

\[ \langle iv, iv \rangle = (-i)(-i) \langle v, v \rangle = i^2 \langle v, v \rangle = -\langle v, v \rangle \le 0. \]

A simple analysis shows that, in order to avoid losing the positivity, it is sufficient to request the linearity with respect to one variable and the antilinearity with respect to the other. This property is called sesquilinearity³. The choice of the linear and antilinear variable is entirely arbitrary. By convention, the antilinear component is placed on the right-hand side in mathematics, but on the left-hand side in physics. We have chosen to adopt the mathematical convention here, i.e.

\[ \langle \alpha v, \beta w \rangle = \alpha\bar{\beta} \langle v, w \rangle. \]

Next, it is important to note that sesquilinearity and symmetry are incompatible: if both properties were verified, then $\langle v, \alpha w \rangle = \bar{\alpha} \langle v, w \rangle$, and also $\langle v, \alpha w \rangle = \langle \alpha w, v \rangle = \alpha \langle w, v \rangle = \alpha \langle v, w \rangle$. Thus, $\langle v, \alpha w \rangle = \bar{\alpha} \langle v, w \rangle = \alpha \langle v, w \rangle$, which can only be verified if $\alpha \in \mathbb{R}$. Thus $\langle\,,\rangle$ cannot be both sesquilinear and symmetrical when working with vectors belonging to a complex vector space.

The example shown above demonstrates that, instead of symmetry, the property which must be verified for every vector pair $v, w$ is $\langle v, w \rangle = \overline{\langle w, v \rangle}$, that is, changing the order of the vectors in $\langle\,,\rangle$ must be equivalent to complex conjugation. A form which verifies this property is said to be Hermitian⁴.

² The symbols $z^*$ and $\bar{z}$ represent the complex conjugation, i.e. if $z \in \mathbb{C}$, $z = a + ib$, $a, b \in \mathbb{R}$, then $z^* = \bar{z} = a - ib$. We recall that $\overline{\sum_{k=1}^n z_k} = \sum_{k=1}^n \bar{z}_k$, $\overline{\prod_{k=1}^n z_k} = \prod_{k=1}^n \bar{z}_k$, $|z|^2 = z\bar{z}$ and $z = \bar{z}$ if and only if $z \in \mathbb{R}$.

³ Sesqui comes from the Latin semisque, meaning one and a half times. This term is used to highlight the fact that there are not two instances of linearity, but one "and a half", due to the presence of the complex conjugation.

⁴ For the French mathematician Charles Hermite (1822, Dieuze-1901, Paris).


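These observations provide full justification for Definition 1.3.

DEFINITION 1.3.– Let $V$ be a complex vector space. The pair $(V, \langle\,,\rangle)$ is said to be a complex inner product space (or a complex pre-Hilbert space) if $\langle\,,\rangle$ is a complex form which is:

1) sesquilinear:

\[ \langle v_1 + v_2, w_1 + w_2 \rangle = \langle v_1, w_1 \rangle + \langle v_1, w_2 \rangle + \langle v_2, w_1 \rangle + \langle v_2, w_2 \rangle \qquad \forall\, v_1, v_2, w_1, w_2 \in V, \]

and, with linearity on the left and antilinearity on the right:

\[ \langle \alpha v, \beta w \rangle = \alpha \langle v, \beta w \rangle = \bar{\beta} \langle \alpha v, w \rangle = \alpha\bar{\beta} \langle v, w \rangle \qquad \forall\, \alpha, \beta \in \mathbb{C},\ \forall\, v, w \in V; \]

2) Hermitian: $\langle v, w \rangle = \overline{\langle w, v \rangle}$, $\forall v, w \in V$;

3) definite: $\langle v, v \rangle = 0 \iff v = 0_V$, the null vector of the vector space $V$;

4) positive: $\langle v, v \rangle > 0$ $\forall v \in V$, $v \neq 0_V$.

As in the real case, for a complex form over $V$, requiring the Hermitian property and sesquilinearity is equivalent to requiring the Hermitian property and linearity on the left-hand side; if these two properties are verified, then:

\[ \langle v, \alpha w \rangle = \overline{\langle \alpha w, v \rangle} = \overline{\alpha \langle w, v \rangle} = \bar{\alpha}\, \overline{\langle w, v \rangle} = \bar{\alpha} \langle v, w \rangle, \qquad \forall \alpha \in \mathbb{C}. \]

Considering the sum of $n$, rather than two, vectors, sesquilinearity is represented by the following formulae:

\[ \Big\langle \sum_{i=1}^{n} \alpha_i v_i, w \Big\rangle = \sum_{i=1}^{n} \alpha_i \langle v_i, w \rangle \qquad [1.1] \]

\[ \Big\langle v, \sum_{i=1}^{n} \alpha_i w_i \Big\rangle = \sum_{i=1}^{n} \bar{\alpha}_i \langle v, w_i \rangle \qquad [1.2] \]

In $\mathbb{C}^n$, the complex Euclidean inner product is defined by:

\[ \langle v, w \rangle \equiv \sum_{i=1}^{n} v_i \bar{w}_i = v \cdot (\bar{w})^t = v^t \cdot \bar{w} \]

where $v = (v_1, v_2, \ldots, v_n)$, $w = (w_1, w_2, \ldots, w_n) \in \mathbb{C}^n$ are written with their components in relation to any given, but fixed, basis $(b_i)_{i=1}^n$ in $\mathbb{C}^n$.

The symbol $\mathbb{K}$ will be used throughout to represent either $\mathbb{R}$ or $\mathbb{C}$ in the context of properties which are valid independently of the reality or complexity of the inner product.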
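As a hedged illustration of Definition 1.3, the following minimal Python sketch implements the complex Euclidean inner product with the convention adopted in this chapter (linear on the left, antilinear on the right) and checks sesquilinearity and the Hermitian property numerically. The helper names `inner` and `scale` and the sample vectors are illustrative assumptions, not notation from the text.

```python
from typing import List, Sequence


def inner(v: Sequence[complex], w: Sequence[complex]) -> complex:
    """Complex Euclidean inner product: sum of v_i * conj(w_i)."""
    if len(v) != len(w):
        raise ValueError("vectors must have the same length")
    return sum(vi * complex(wi).conjugate() for vi, wi in zip(v, w))


def scale(alpha: complex, v: Sequence[complex]) -> List[complex]:
    """Multiply every component of v by the scalar alpha."""
    return [alpha * vi for vi in v]


# Arbitrary sample data (assumed for illustration only).
v = [1 + 2j, 3j]
w = [2 - 1j, 1 + 1j]
alpha = 2 - 3j

# Linearity on the left: <alpha v, w> = alpha <v, w>.
left_gap = inner(scale(alpha, v), w) - alpha * inner(v, w)
# Antilinearity on the right: <v, alpha w> = conj(alpha) <v, w>.
right_gap = inner(v, scale(alpha, w)) - alpha.conjugate() * inner(v, w)
# Hermitian property: <v, w> = conj(<w, v>).
hermitian_gap = inner(v, w) - inner(w, v).conjugate()

print(abs(left_gap), abs(right_gap), abs(hermitian_gap))
```

All three printed gaps should vanish up to rounding, while replacing `conjugate()` by the identity would break positivity, as discussed before Definition 1.3.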


THEOREM 1.1.– Let $(V, \langle\,,\rangle)$ be an inner product space. We have:

1) $\langle v, 0_V \rangle = 0$ $\forall v \in V$;

2) if $\langle u, w \rangle = \langle v, w \rangle$ $\forall w \in V$, then $u$ and $v$ must coincide;

3) $\langle v, w \rangle = 0$ $\forall v \in V \iff w = 0_V$, i.e. the null vector is the only vector which is orthogonal to all of the other vectors.

PROOF.–

1) $\langle v, 0_V \rangle = \langle v, 0_V + 0_V \rangle = \langle v, 0_V \rangle + \langle v, 0_V \rangle$ by additivity, i.e. $\langle v, 0_V \rangle - \langle v, 0_V \rangle = 0 = \langle v, 0_V \rangle$.

2) $\langle u, w \rangle = \langle v, w \rangle$ $\forall w \in V$ implies, by linearity, that $\langle u - v, w \rangle = 0$ $\forall w \in V$ and thus, notably, considering $w = u - v$, we obtain $\langle u - v, u - v \rangle = 0$, implying, due to the definite positiveness of the inner product, that $u - v = 0_V$, i.e. $u = v$.

3) If $w = 0_V$, then $\langle v, w \rangle = 0$ $\forall v \in V$ using property (1). Conversely, by hypothesis, it holds that $\langle v, w \rangle = 0 = \langle v, 0_V \rangle$ $\forall v \in V$, but then property (2), applied through the Hermitian (or symmetric) property, implies that $w = 0_V$. □

Finally, let us consider a typical property of the complex inner product, which results directly from a property of complex numbers.

THEOREM 1.2.– Let $(V, \langle\,,\rangle)$ be a complex inner product space. Then:

\[ \Im(\langle v, w \rangle) = \Re(\langle v, iw \rangle) \qquad \forall v, w \in V \]

PROOF.– Consider any complex number $z = a + ib$, so $-iz = b - ia$, hence $b = \Im(z) = \Re(-iz)$. Taking $z = \langle v, w \rangle$, we obtain $\Im(\langle v, w \rangle) = \Re(-i\langle v, w \rangle) = \Re(\langle v, iw \rangle)$ by sesquilinearity. □

1.2. The norm associated with an inner product and normed vector spaces

If $(V, \langle\,,\rangle)$ is an inner product space over $\mathbb{K}$, then a norm on $V$ can be defined as follows:

\[ \|\ \| : V \to \mathbb{R}_0^+ = [0, +\infty), \qquad v \mapsto \|v\| = \sqrt{\langle v, v \rangle} \]

Note that $\|v\|$ is well defined since $\langle v, v \rangle \ge 0$ $\forall v \in V$. Once a norm has been established, it is always possible to define a distance between two vectors $v, w$ in $V$: $d(v, w) = \|v - w\|$.


A vector $v \in V$ such that $\|v\| = 1$ is known as a unit vector. Every non-zero vector $v \in V$ can be normalized to produce a unit vector, simply by dividing it by its norm.

NOTABLE EXAMPLES.–

\[ (\mathbb{R}^n, \langle\,,\rangle) : \quad \|v\| = \sqrt{\sum_{i=1}^{n} v_i^2} \]

\[ (\mathbb{C}^n, \langle\,,\rangle) : \quad \|v\| = \sqrt{\sum_{i=1}^{n} v_i \bar{v}_i} = \sqrt{\sum_{i=1}^{n} |v_i|^2} \]
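The notable examples above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function names `inner`, `norm`, `distance` and `normalize` are hypothetical helpers, and vectors are represented as plain Python sequences of (possibly complex) numbers.

```python
import math
from typing import List, Sequence


def inner(v: Sequence[complex], w: Sequence[complex]) -> complex:
    """Euclidean inner product, linear in v and antilinear in w."""
    return sum(vi * complex(wi).conjugate() for vi, wi in zip(v, w))


def norm(v: Sequence[complex]) -> float:
    """Norm induced by the inner product: sqrt(<v, v>)."""
    return math.sqrt(inner(v, v).real)


def distance(v: Sequence[complex], w: Sequence[complex]) -> float:
    """Distance induced by the norm: d(v, w) = ||v - w||."""
    return norm([vi - wi for vi, wi in zip(v, w)])


def normalize(v: Sequence[complex]) -> List[complex]:
    """Return the unit vector v / ||v|| (undefined for the null vector)."""
    n = norm(v)
    if n == 0:
        raise ValueError("the null vector cannot be normalized")
    return [vi / n for vi in v]


v = [3, 4j]             # a sample vector of C^2 with |3|^2 + |4i|^2 = 25
unit = normalize(v)     # components 0.6 and 0.8i
print(norm(v), norm(unit), distance([1, 0], [0, 1]))
```

Note that the same `norm` function covers both the real and the complex example, since $v_i \bar{v}_i = v_i^2$ when $v_i$ is real.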

Three properties of the norm, which should already be known, are listed below. Taking any $v, w \in V$, and any $\alpha \in \mathbb{K}$:

1) $\|v\| \ge 0$, $\|v\| = 0 \iff v = 0_V$;

2) $\|\alpha v\| = |\alpha|\,\|v\|$ (homogeneity);

3) $\|v + w\| \le \|v\| + \|w\|$ (triangle inequality).

DEFINITION 1.4 (normed vector space).– A normed vector space is a pair $(V, \|\ \|)$ given by a vector space $V$ and a function, called a norm, $\|\ \| : V \to \mathbb{R}_0^+$, satisfying the three properties listed above. A norm $\|\ \|$ is Hilbertian if there exists an inner product $\langle\,,\rangle$ on $V$ such that $\|v\| = \sqrt{\langle v, v \rangle}$ $\forall v \in V$.

Canonically, an inner product space is therefore a normed vector space. Counterexamples can be used to show that the reverse is not generally true.

Note that, by definition, $\langle v, v \rangle = \|v\|\,\|v\|$, but, in general, the magnitude of the inner product between two different vectors is dominated by the product of their norms. This is the result of the well-known inequality shown below.

THEOREM 1.3 (Cauchy-Schwarz inequality).– For all $v, w \in (V, \langle\,,\rangle)$ we have:

\[ |\langle v, w \rangle| \le \|v\|\,\|w\| \]

PROOF.– Dozens of proofs of the Cauchy-Schwarz inequality have been produced. One of the most elegant proofs is shown below, followed by the simplest one:


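– first proof: if $w = 0_V$, then the inequality is verified trivially with $0 = 0$. If $w \neq 0_V$, then we can define $z = v - \frac{\langle v, w \rangle}{\|w\|^2} w$, i.e. $v = \frac{\langle v, w \rangle}{\|w\|^2} w + z$, and we note that:

\[ \langle z, w \rangle = \Big\langle v - \frac{\langle v, w \rangle}{\|w\|^2} w, w \Big\rangle = \langle v, w \rangle - \frac{\langle v, w \rangle}{\|w\|^2} \langle w, w \rangle = 0 \]

thus:

\[ \|v\|^2 = \langle v, v \rangle = \Big\langle \frac{\langle v, w \rangle}{\|w\|^2} w + z,\ \frac{\langle v, w \rangle}{\|w\|^2} w + z \Big\rangle \]

\[ = \frac{\langle v, w \rangle}{\|w\|^2} \frac{\overline{\langle v, w \rangle}}{\|w\|^2} \langle w, w \rangle + \frac{\langle v, w \rangle}{\|w\|^2} \langle w, z \rangle + \frac{\overline{\langle v, w \rangle}}{\|w\|^2} \langle z, w \rangle + \langle z, z \rangle \]

\[ = \frac{|\langle v, w \rangle|^2}{\|w\|^2} + \|z\|^2 \]

as the two intermediate terms in the penultimate step are zero, since $\langle z, w \rangle = \overline{\langle w, z \rangle} = 0$.

As $\|z\|^2 \ge 0$, we have seen that:

\[ \|v\|^2 = \frac{|\langle v, w \rangle|^2}{\|w\|^2} + \|z\|^2 \ge \frac{|\langle v, w \rangle|^2}{\|w\|^2} \]

i.e. $|\langle v, w \rangle|^2 \le \|v\|^2 \|w\|^2$, hence $|\langle v, w \rangle| \le \|v\|\,\|w\|$;

– second proof (in one line, written here for the real case $\mathbb{K} = \mathbb{R}$): $\forall t \in \mathbb{R}$ we have:

\[ 0 \le \|tv - w\|^2 = \langle tv - w, tv - w \rangle = t^2\|v\|^2 - 2t\langle v, w \rangle + \|w\|^2 \iff \Delta = 4\langle v, w \rangle^2 - 4\|v\|^2\|w\|^2 \le 0 \]

□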
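The decomposition used in the first proof can be checked numerically. The following Python sketch is an illustration only: the helper names `inner` and `residual` and the sample vectors of $\mathbb{C}^3$ are assumptions made for the example.

```python
import math
from typing import List, Sequence


def inner(v: Sequence[complex], w: Sequence[complex]) -> complex:
    """Euclidean inner product, linear in v and antilinear in w."""
    return sum(vi * complex(wi).conjugate() for vi, wi in zip(v, w))


def residual(v: Sequence[complex], w: Sequence[complex]) -> List[complex]:
    """Return z = v - (<v, w> / ||w||^2) w, assuming w is not the null vector."""
    w_norm_sq = inner(w, w).real
    if w_norm_sq == 0:
        raise ValueError("w must be non-zero")
    c = inner(v, w) / w_norm_sq
    return [vi - c * wi for vi, wi in zip(v, w)]


v = [1 + 1j, 2, -1j]
w = [2j, 1 - 1j, 3]
z = residual(v, w)

v_norm_sq = inner(v, v).real
w_norm_sq = inner(w, w).real
z_norm_sq = inner(z, z).real

# z is orthogonal to w, so ||v||^2 = |<v, w>|^2 / ||w||^2 + ||z||^2.
orthogonality_gap = abs(inner(z, w))
identity_gap = abs(v_norm_sq - (abs(inner(v, w)) ** 2 / w_norm_sq + z_norm_sq))
# Cauchy-Schwarz: |<v, w>| <= ||v|| ||w||.
lhs = abs(inner(v, w))
rhs = math.sqrt(v_norm_sq) * math.sqrt(w_norm_sq)
print(orthogonality_gap, identity_gap, lhs <= rhs)
```

Equality in the inequality corresponds to `z_norm_sq == 0`, i.e. to $v$ being a scalar multiple of $w$.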

The Cauchy-Schwarz inequality allows the concept of the angle between two vectors to be generalized for abstract vector spaces. In fact, for a real inner product and non-zero vectors $v, w$, it implies the existence of a coefficient $k$ between $-1$ and $+1$ such that $\langle v, w \rangle = \|v\|\,\|w\|\,k$, but, given that the restriction of $\cos$ to $[0, \pi]$ creates a bijection with $[-1, 1]$, this means that there is only one $\vartheta \in [0, \pi]$ such that $\langle v, w \rangle = \|v\|\,\|w\| \cos\vartheta$. $\vartheta \in [0, \pi]$ is known as the angle between the two vectors $v$ and $w$.

Another very important property of the norm is as follows.

THEOREM 1.4.– Let $(V, \|\ \|)$ be an arbitrary normed vector space and $v, w \in V$. We have:

\[ \big|\, \|v\| - \|w\| \,\big| \le \|v - w\| \qquad [1.3] \]


PROOF.– On one side: $\|v\| = \|v - w + w\| = \|(v - w) + w\| \le \|v - w\| + \|w\|$ by the triangle inequality, thus $\|v\| - \|w\| \le \|v - w\|$. On the other side: $\|w\| = \|w - v + v\| = \|(w - v) + v\| \le \|w - v\| + \|v\|$, thus $\|w\| - \|v\| \le \|v - w\|$, i.e. $\|v\| - \|w\| \ge -\|v - w\|$. Hence, $-\|v - w\| \le \|v\| - \|w\| \le \|v - w\|$, i.e. $\big|\, \|v\| - \|w\| \,\big| \le \|v - w\|$. □

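The following formula is also extremely useful.

THEOREM 1.5 (Carnot's theorem).– Taking $v, w \in (V, \langle\,,\rangle)$:

\[ \|v \pm w\|^2 = \|v\|^2 + \|w\|^2 \pm 2\langle v, w \rangle, \qquad (\mathbb{K} = \mathbb{R}) \qquad [1.4] \]

and

\[ \|v \pm w\|^2 = \|v\|^2 + \|w\|^2 \pm \langle v, w \rangle \pm \langle w, v \rangle, \qquad (\mathbb{K} = \mathbb{C}) \qquad [1.5] \]

PROOF.– Direct calculation:

\[ \|v \pm w\|^2 = \langle v \pm w, v \pm w \rangle = \langle v, v \rangle \pm \langle v, w \rangle \pm \langle w, v \rangle + \langle w, w \rangle = \|v\|^2 + \|w\|^2 \pm \langle v, w \rangle \pm \langle w, v \rangle \]

□

If $\mathbb{K} = \mathbb{C}$, then $\langle w, v \rangle = \overline{\langle v, w \rangle}$, and since, if $z = a + ib = \Re(z) + i\Im(z)$, then $z + \bar{z} = 2a = 2\Re(z)$, we can rewrite [1.5] as:

\[ \|v \pm w\|^2 = \|v\|^2 + \|w\|^2 \pm 2\Re(\langle v, w \rangle) \qquad [1.6] \]

The laws presented in this section have immediate consequences which will be highlighted in section 1.2.1.

1.2.1. The parallelogram law and the polarization formula

The parallelogram law in $\mathbb{R}^2$ is shown in Figure 1.1. This law can be generalized on a vector space with an arbitrary inner product.

THEOREM 1.6 (Parallelogram law).– Let $(V, \langle\,,\rangle)$ be an inner product space on $\mathbb{K}$. Thus, $\forall v, w \in V$:

\[ \|v + w\|^2 + \|v - w\|^2 = 2(\|v\|^2 + \|w\|^2) \]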


Figure 1.1. Parallelogram law in $\mathbb{R}^2$: The sum of the squares of the two diagonal lines is equal to two times the sum of the squares of the edges $v$ and $w$. For a color version of this figure, see www.iste.co.uk/provenzi/spaces.zip

PROOF.– A direct consequence of law [1.4] or law [1.5], taking $\|v + w\|^2$ then $\|v - w\|^2$. □

As we have seen, an inner product induces a norm. The polarization formula can be used to "reverse" roles and write the inner product using the norm.

THEOREM 1.7 (Polarization formula).– Let $(V, \langle\,,\rangle)$ be an inner product space on $\mathbb{K}$. In this case, $\forall v, w \in V$:

\[ \langle v, w \rangle = \frac{1}{4}\left( \|v + w\|^2 - \|v - w\|^2 \right), \qquad (\mathbb{K} = \mathbb{R}) \]

and:

\[ \langle v, w \rangle = \frac{1}{4}\left[ \|v + w\|^2 - \|v - w\|^2 + i\left( \|v + iw\|^2 - \|v - iw\|^2 \right) \right], \qquad (\mathbb{K} = \mathbb{C}) \]

PROOF.– This law is a direct consequence of law [1.4], in the real case. For the complex case, $w$ is replaced by $iw$ in law [1.5], and by sesquilinearity, we obtain:

\[ \|v \pm iw\|^2 = \|v\|^2 + \|w\|^2 \mp i\langle v, w \rangle \pm i\langle w, v \rangle \]

By direct calculation, we can then verify that $\|v + w\|^2 - \|v - w\|^2 + i\|v + iw\|^2 - i\|v - iw\|^2 = 4\langle v, w \rangle$. □

It may seem surprising that something as simple as the parallelogram law may be used to establish a necessary and sufficient condition to guarantee that a norm over a vector space will be induced by an inner product, that is, the norm is Hilbertian. This notion will be formalized in Chapter 4.
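To make the polarization formula concrete, here is a minimal Python sketch that recovers the complex Euclidean inner product from norms alone and also checks the parallelogram law. The helper names (`inner`, `norm_sq`, `combine`, `polarization`) and the test vectors are assumptions introduced for this illustration.

```python
from typing import List, Sequence


def inner(v: Sequence[complex], w: Sequence[complex]) -> complex:
    """Euclidean inner product, linear in v and antilinear in w."""
    return sum(vi * complex(wi).conjugate() for vi, wi in zip(v, w))


def norm_sq(v: Sequence[complex]) -> float:
    """Squared Euclidean norm, computed without using the inner product."""
    return sum(abs(vi) ** 2 for vi in v)


def combine(v: Sequence[complex], w: Sequence[complex], c: complex) -> List[complex]:
    """Return the vector v + c w."""
    return [vi + c * wi for vi, wi in zip(v, w)]


def polarization(v: Sequence[complex], w: Sequence[complex]) -> complex:
    """Complex polarization formula (case K = C of Theorem 1.7)."""
    real_part = norm_sq(combine(v, w, 1)) - norm_sq(combine(v, w, -1))
    imag_part = norm_sq(combine(v, w, 1j)) - norm_sq(combine(v, w, -1j))
    return 0.25 * (real_part + 1j * imag_part)


v = [1 + 2j, 3j]
w = [2 - 1j, 1 + 1j]

polarization_gap = abs(polarization(v, w) - inner(v, w))
parallelogram_gap = abs(
    norm_sq(combine(v, w, 1)) + norm_sq(combine(v, w, -1))
    - 2 * (norm_sq(v) + norm_sq(w))
)
print(polarization(v, w), polarization_gap, parallelogram_gap)
```

The sign of the imaginary term depends on the convention chosen for sesquilinearity; with the antilinear slot on the left (the physics convention), the factor $i$ would become $-i$.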


1.3. Orthogonal and orthonormal families in inner product spaces

The "geometric" definition of an inner product in $\mathbb{R}^2$ and $\mathbb{R}^3$ indicates that this product is zero if and only if $\vartheta$, the angle between the vectors, is $\pi/2$, which implies $\cos(\vartheta) = 0$. In more complicated vector spaces (e.g. polynomial spaces), or even Euclidean vector spaces of more than three dimensions, it is no longer possible to visualize vectors; their orthogonality must therefore be "axiomatized" via the nullity of their scalar product.

DEFINITION 1.5.– Let $(V, \langle\,,\rangle)$ be a real or complex inner product space of finite dimension $n$. Let $F = \{v_1, \cdots, v_n\}$ be a family of vectors in $V$. Thus:

– $F$ is an orthogonal family of vectors if each different vector pair has an inner product of 0: $\langle v_i, v_j \rangle = 0$ for $i \neq j$;

– $F$ is an orthonormal family if it is orthogonal and, furthermore, $\|v_i\| = 1$ $\forall i$. Thus, if $\{v_i\}_{i=1}^n$ is an orthogonal family of non-zero vectors, $\{u_i = \|v_i\|^{-1} v_i\}_{i=1}^n$ is an orthonormal family.

An orthonormal family (unit and orthogonal vectors) may be characterized as follows:

\[ \langle v_i, v_j \rangle = \delta_{i,j} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases} \qquad \text{(orthonormal family)} \]

$\delta_{i,j}$ is the Kronecker delta⁵.

1.4. Generalized Pythagorean theorem

The Pythagorean theorem can be generalized to abstract inner product spaces. The general formulation of this theorem is obtained using a lemma.

LEMMA 1.1.– Let $(V, \langle\,,\rangle)$ be a real or complex inner product space. Let $u \in V$ be orthogonal to all vectors $v_1, \ldots, v_n \in V$. Hence, $u$ is also orthogonal to all vectors in $V$ obtained as a linear combination of $v_1, \ldots, v_n$.

PROOF.– Let $w = \sum_{i=1}^{n} \alpha_i v_i$, $\alpha_i \in \mathbb{K}$ $\forall i = 1, \ldots, n$, be an arbitrary linear combination of vectors $v_1, \ldots, v_n$. By direct calculation:

\[ \langle u, w \rangle = \Big\langle u, \sum_{i=1}^{n} \alpha_i v_i \Big\rangle \underset{\text{(sesquilinearity)}}{=} \sum_{i=1}^{n} \bar{\alpha}_i \langle u, v_i \rangle \underset{u \perp v_i}{=} \sum_{i=1}^{n} \bar{\alpha}_i \cdot 0 = 0 \]

□

⁵ Leopold Kronecker (1823, Liegnitz-1891, Berlin).


THEOREM 1.8 (Generalized Pythagorean theorem).– Let $(V, \langle\,,\rangle)$ be an inner product space on $\mathbb{K}$. Let $u, v \in V$ be orthogonal to each other. Hence:

\[ \|u + v\|^2 = \|u\|^2 + \|v\|^2 \]

More generally, if the vectors $v_1, \ldots, v_n \in V$ are pairwise orthogonal, then:

\[ \Big\| \sum_{i=1}^{n} v_i \Big\|^2 = \sum_{i=1}^{n} \|v_i\|^2 \iff \|v_1 + \ldots + v_n\|^2 = \|v_1\|^2 + \ldots + \|v_n\|^2 \]

PROOF.– The two-vector case can be proven thanks to Carnot's formula:

\[ \|u + v\|^2 = \langle u + v, u + v \rangle = \langle u, u \rangle + \underbrace{\langle u, v \rangle}_{=0} + \underbrace{\langle v, u \rangle}_{=0} + \langle v, v \rangle = \|u\|^2 + \|v\|^2 \]

The proof for the case of $n$ vectors is obtained by induction:

– the case where $n = 2$ is demonstrated above;

– we suppose that $\big\| \sum_{i=1}^{n-1} v_i \big\|^2 = \sum_{i=1}^{n-1} \|v_i\|^2$ (induction hypothesis);

– now, we write $u = v_n$ and $z = \sum_{i=1}^{n-1} v_i$, so $u \perp z$ using Lemma 1.1. Hence, using case $n = 2$: $\|u + z\|^2 = \|u\|^2 + \|z\|^2$, but:

\[ u + z = v_n + \sum_{i=1}^{n-1} v_i = \sum_{i=1}^{n} v_i \]

so:

\[ \|u + z\|^2 = \Big\| \sum_{i=1}^{n} v_i \Big\|^2 \]

and:

\[ \|u\|^2 + \|z\|^2 = \|v_n\|^2 + \Big\| \sum_{i=1}^{n-1} v_i \Big\|^2 \underset{\text{(induction hypothesis)}}{=} \|v_n\|^2 + \sum_{i=1}^{n-1} \|v_i\|^2 = \sum_{i=1}^{n} \|v_i\|^2 \]

giving us the desired result. □

Note that the statement of the Pythagorean theorem for two vectors is an equivalence if and only if $V$ is real: in fact, using law [1.6], we have that $\|u + v\|^2 = \|u\|^2 + \|v\|^2$ holds true if and only if $\Re(\langle u, v \rangle) = 0$, which is equivalent to orthogonality if and only if $V$ is real.

The following result gives information concerning the distance between any two vectors within an orthonormal family.


THEOREM 1.9.– Let $(V, \langle\,,\rangle)$ be an inner product space on $\mathbb{K}$ and let $F$ be an orthonormal family in $V$. The distance between any two distinct elements of $F$ is constant and equal to $\sqrt{2}$.

PROOF.– Let $u, v$ be two distinct elements of $F$. Using the Pythagorean theorem: $\|u + (-v)\|^2 = \|u\|^2 + \|v\|^2 = 2$, from the fact that $u \perp v$. □

1.5. Orthogonality and linear independence

The orthogonality condition is more restrictive than that of linear independence: all orthogonal families of non-zero vectors are free.

THEOREM 1.10.– Let $F$ be an orthogonal family in $(V, \langle\,,\rangle)$, $F = \{v_1, \cdots, v_n\}$, $v_i \neq 0$ $\forall i$, then $F$ is free.

PROOF.– We need to prove the linear independence of the elements $v_i$, that is, $\sum_{i=1}^{n} a_i v_i = 0 \implies a_i = 0$ $\forall i$. To this end, we calculate the inner product of the linear combination $\sum_{i=1}^{n} a_i v_i$ and an arbitrary vector $v_j$ with $j \in \{1, \ldots, n\}$:

\[ \Big\langle \sum_{i=1}^{n} a_i v_i, v_j \Big\rangle \underset{[1.1]}{=} \sum_{i=1}^{n} a_i \langle v_i, v_j \rangle \underset{(\langle v_i, v_j \rangle \neq 0 \,\Leftrightarrow\, i = j)}{=} a_j \langle v_j, v_j \rangle = a_j \|v_j\|^2 \]

By hypothesis, none of the vectors in $F$ are zero; the hypothesis that $\sum_{i=1}^{n} a_i v_i = 0$ therefore implies that:

\[ \underbrace{\langle 0, v_j \rangle}_{=0} = a_j \underbrace{\|v_j\|^2}_{\neq 0} \implies a_j = 0. \]

This holds for any $j \in \{1, \ldots, n\}$, so the orthogonal family $F$ is free. □

Using the general theory of vector spaces in finite dimensions, an immediate corollary can be derived from Theorem 1.10.

COROLLARY 1.1.– An orthogonal family of $n$ non-null vectors in a space $(V, \langle\,,\rangle)$ of dimension $n$ is a basis of $V$.

DEFINITION 1.6.– A family of $n$ non-null orthogonal vectors in a vector space $(V, \langle\,,\rangle)$ of dimension $n$ is said to be an orthogonal basis of $V$. If this family is also orthonormal, it is said to be an orthonormal basis of $V$.


The extension of the orthogonal basis concept to inner product spaces of infinite dimensions will be discussed in Chapter 5. For the moment, it is important to note that an orthogonal basis is made up of the maximum number of mutually orthogonal vectors in a vector space. Taking $n$ to represent the dimension of the space $V$ and proceeding by reductio ad absurdum, imagine the existence of another vector $u^* \in V$, $u^* \neq 0$, orthogonal to all of the vectors in an orthogonal basis $(u_i)_{i=1}^n$; in this case, the set $(u^*, u_i)_{i=1}^n$ would be free as orthogonal vectors are linearly independent, and the dimension of $V$ would be $n + 1$ instead of $n$! This property is usually expressed by saying that an orthogonal family is a basis if it is not a subset of another orthogonal family of vectors in $V$.

Note that in order to determine the components of a vector in relation to an arbitrary basis, we must solve a linear system of $n$ equations with $n$ unknown variables. In fact, if $v \in V$ is any vector and $(u_i)$, $i = 1, \ldots, n$, is a basis of $V$, then the components of $v$ in $(u_i)$ are the scalars $\alpha_1, \ldots, \alpha_n$ such that:

\[ v = \sum_{i=1}^{n} \alpha_i u_i \iff \begin{cases} v_1 = \sum_{i=1}^{n} \alpha_i u_{i,1} \\ \quad\vdots \\ v_n = \sum_{i=1}^{n} \alpha_i u_{i,n}, \end{cases} \]

where $u_{i,j}$ is the $j$-th component of vector $u_i$.

However, in the presence of an orthogonal or orthonormal basis, components are determined by inner products, as seen in Theorem 1.11. Note, too, that solving a linear system of $n$ equations with $n$ unknown variables generally involves far more operations than the calculation of inner products; this highlights one advantage of having an orthogonal basis for a vector space.

THEOREM 1.11.– Let $B = \{u_1, \ldots, u_n\}$ be an orthogonal basis of $(V, \langle\,,\rangle)$. Then:

\[ v = \sum_{i=1}^{n} \frac{\langle v, u_i \rangle}{\|u_i\|^2} u_i \]

Notably, if $B$ is an orthonormal basis, then:

\[ v = \sum_{i=1}^{n} \langle v, u_i \rangle u_i \]


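PROOF.– $B$ is a basis, so there exists a set of scalars $\alpha_1, \ldots, \alpha_n$ such that $v = \sum_{j=1}^{n} \alpha_j u_j$. Consider the inner product of this expression of $v$ with a fixed vector $u_i$, $i \in \{1, \ldots, n\}$:

\[ \langle v, u_i \rangle = \Big\langle \sum_{j=1}^{n} \alpha_j u_j, u_i \Big\rangle = \sum_{j=1}^{n} \alpha_j \langle u_j, u_i \rangle \underset{(u_i \perp u_j\ \forall i \neq j)}{=} \alpha_i \langle u_i, u_i \rangle = \alpha_i \|u_i\|^2 \]

so $\alpha_i = \frac{\langle v, u_i \rangle}{\|u_i\|^2}$ $\forall i = 1, \cdots, n$, and thus $v = \sum_{i=1}^{n} \frac{\langle v, u_i \rangle}{\|u_i\|^2} u_i$. If $B$ is an orthonormal basis, $\|u_i\| = 1$, giving the second law in the theorem. □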
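Theorem 1.11 says that, on an orthogonal basis, the components of a vector are obtained from inner products alone, with no linear system to solve. The Python sketch below illustrates this on a small example; the helper names `inner`, `coefficients` and `reconstruct`, as well as the basis of $\mathbb{R}^2$ used, are assumptions made for the illustration.

```python
from typing import List, Sequence


def inner(v: Sequence[complex], w: Sequence[complex]) -> complex:
    """Euclidean inner product, linear in v and antilinear in w."""
    return sum(vi * complex(wi).conjugate() for vi, wi in zip(v, w))


def coefficients(v: Sequence[complex], basis: Sequence[Sequence[complex]]) -> List[complex]:
    """Components alpha_i = <v, u_i> / ||u_i||^2 on an orthogonal basis."""
    return [inner(v, u) / inner(u, u).real for u in basis]


def reconstruct(coeffs: Sequence[complex], basis: Sequence[Sequence[complex]]) -> List[complex]:
    """Rebuild the vector sum_i alpha_i u_i component by component."""
    dim = len(basis[0])
    return [sum(c * u[k] for c, u in zip(coeffs, basis)) for k in range(dim)]


# An orthogonal (but not orthonormal) basis of R^2: ||u_1||^2 = ||u_2||^2 = 2.
basis = [[1, 1], [1, -1]]
v = [3, 1]

coeffs = coefficients(v, basis)      # v = 2 u_1 + 1 u_2
rebuilt = reconstruct(coeffs, basis)
print(coeffs, rebuilt)
```

The cost is one inner product per basis vector, which is the computational advantage over solving an $n \times n$ linear system mentioned before the theorem.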

Geometric interpretation of the theorem: The theorem just proved is the generalization of the decomposition theorem of a vector in the plane $\mathbb{R}^2$ or in the space $\mathbb{R}^3$ on a canonical basis of unit vectors on the axes. To simplify this, consider the case of $\mathbb{R}^2$. If $\hat{\imath}$ and $\hat{\jmath}$ are, respectively, the unit vectors of axes $x$ and $y$, then, writing $\alpha$ and $\beta$ for the angles between $v$ and the $x$ and $y$ axes, the decomposition theorem says that:

\[ v = \underbrace{\|v\| \cos\alpha}_{\langle v, \hat{\imath} \rangle}\, \hat{\imath} + \underbrace{\|v\| \cos\beta}_{\langle v, \hat{\jmath} \rangle}\, \hat{\jmath} = \langle v, \hat{\imath} \rangle\, \hat{\imath} + \langle v, \hat{\jmath} \rangle\, \hat{\jmath} \]

which is a particular case of the theorem above.

We will see that the Fourier series can be viewed as a further generalization of the decomposition theorem on an orthogonal or orthonormal basis.

1.6. Orthogonal projection in inner product spaces

The definition of orthogonal projection can be extended by examining the geometric and algebraic properties of this operation in $\mathbb{R}^2$ and $\mathbb{R}^3$. Let us begin with $\mathbb{R}^2$. In the Euclidean space $\mathbb{R}^2$, the inner product of a vector $v$ and a unit vector evidently gives us the orthogonal projection of $v$ in the direction defined by this vector, as shown in Figure 1.2 with an orthogonal projection along the $x$ axis. The properties verified by this projection are as follows:

1) projecting onto the $x$ axis a second time, vector $P_x v$ obviously remains unchanged given that it is already on the $x$ axis, i.e. $P_x^2(v) := P_x(P_x v) = P_x v$ $\forall v \in V$. Put differently, the operator $P_x$ restricted to the $x$ axis is the identity of this axis;

2) the difference vector between $v$ and its projection, $v - P_x v$, is orthogonal to the $x$ axis, as we see from Figure 1.3;


Figure 1.2. Orthogonal projection $P_x v = \overrightarrow{OC}$ and diagonal projections $\overrightarrow{OB}$ and $\overrightarrow{OD}$ of a vector $v \in \mathbb{R}^2$ onto the $x$ axis. For a color version of this figure, see www.iste.co.uk/provenzi/spaces.zip

Figure 1.3. Visualization of property 2 in $\mathbb{R}^2$. For a color version of this figure, see www.iste.co.uk/provenzi/spaces.zip

3) $P_x v$ minimizes the distance between the terminal point of $v$ and the $x$ axis. In Figure 1.2, $\overline{AB}$ and $\overline{AD}$ are, in fact, the hypotenuses of right-angled triangles $ABC$ and $ACD$; on the other hand, $\overline{AC}$ is another side of these triangles, and is therefore smaller than $\overline{AB}$ and $\overline{AD}$. $\overline{AC}$ is the distance between the terminal point of $v$ and the terminal point of $P_x v$, while $\overline{AB}$ and $\overline{AD}$ are the distances between the terminal point of $v$ and the diagonal projections of $v$ onto $x$ rooted at $B$ and $D$, respectively.

We wish to define an orthogonal projection operation for an abstract inner product space of dimension $n$ which retains these same geometric properties. Analyzing orthogonal projections in $\mathbb{R}^3$ helps us to establish an idea of the algebraic definition of this operation. Figure 1.4 shows a vector $v \in \mathbb{R}^3$ and the plane


produced by the orthogonal vectors $u_1$ and $u_2$. We see that the projection $p$ of $v$ onto this plane is the vector sum of the orthogonal projections $p_1 = \frac{\langle v, u_1 \rangle}{\|u_1\|^2} u_1$ and $p_2 = \frac{\langle v, u_2 \rangle}{\|u_2\|^2} u_2$ onto the two vectors $u_1$ and $u_2$ taken separately, i.e. $p = p_1 + p_2 = \sum_{i=1}^{2} \frac{\langle v, u_i \rangle}{\|u_i\|^2} u_i$.

Figure 1.4. Orthogonal projection p of a vector in R3 onto the plane produced by two unit vectors. For a color version of this ﬁgure, see www.iste.co.uk/provenzi/spaces.zip

Generalization should now be straightforward: consider an inner product space $(V, \langle\,,\rangle)$ of dimension $n$ and an orthogonal family of non-zero vectors $F = \{u_1, \ldots, u_m\}$, $m \le n$, $u_i \neq 0_V$ $\forall i = 1, \ldots, m$. The vector subspace of $V$ produced by all linear combinations of the vectors of $F$ shall be written $\operatorname{span}(F)$:

\[ \operatorname{span}(F) \equiv S = \Big\{ s \in V : \exists\, \alpha_1, \ldots, \alpha_m \in \mathbb{K} \text{ such that } s = \sum_{j=1}^{m} \alpha_j u_j \Big\} \]

The orthogonal projection operator or orthogonal projector of a vector $v \in V$ onto $S$ is defined as the following map, which is obviously linear:

\[ P_S : V \longrightarrow S \subseteq V, \qquad v \longmapsto P_S(v) = \sum_{i=1}^{m} \frac{\langle v, u_i \rangle}{\|u_i\|^2} u_i \]
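The definition of $P_S$ translates directly into code. The following Python sketch is a minimal illustration under stated assumptions: the function names `inner` and `project`, and the orthogonal family of $\mathbb{R}^3$ chosen as an example, are hypothetical and not taken from the text.

```python
from typing import List, Sequence


def inner(v: Sequence[complex], w: Sequence[complex]) -> complex:
    """Euclidean inner product, linear in v and antilinear in w."""
    return sum(vi * complex(wi).conjugate() for vi, wi in zip(v, w))


def project(v: Sequence[complex], family: Sequence[Sequence[complex]]) -> List[complex]:
    """Orthogonal projection of v onto span(family).

    The family is assumed to consist of non-zero, pairwise orthogonal vectors.
    """
    p: List[complex] = [0j] * len(v)
    for u in family:
        c = inner(v, u) / inner(u, u).real   # <v, u_i> / ||u_i||^2
        p = [pk + c * uk for pk, uk in zip(p, u)]
    return p


# Orthogonal family spanning the (x, y) plane of R^3 (u_2 is not a unit vector).
family = [[1, 0, 0], [0, 2, 0]]
v = [3, 4, 5]

p = project(v, family)                      # expected: (3, 4, 0)
r = [vk - pk for vk, pk in zip(v, p)]       # residual: (0, 0, 5)
residual_products = [abs(inner(r, u)) for u in family]
print(p, r, residual_products)
```

In this example the residual is orthogonal to both vectors of the family and projecting `p` again returns `p`, which is the behaviour expected from the geometric discussion in $\mathbb{R}^2$.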

Theorem 1.12 shows that the orthogonal projection defined above retains all of the properties of the orthogonal projection demonstrated for $\mathbb{R}^2$.

THEOREM 1.12.– Using the same notation as before, we have:

1) if $s \in S$ then $P_S(s) = s$, i.e. the action of $P_S$ on the vectors in $S$ is the identity;


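2) $\forall v \in V$ and $s \in S$, the residual vector of the projection, i.e. $v - P_S(v)$, is orthogonal to $S$:

\[ \langle v - P_S(v), s \rangle = 0 \iff v - P_S(v) \perp s \]

3) $\forall v \in V$ and $s \in S$: $\|v - P_S(v)\| \le \|v - s\|$ and the equality holds if and only if $s = P_S(v)$. We write:

\[ P_S(v) = \underset{s \in S}{\operatorname{argmin}}\, \|v - s\| \]

PROOF.–

1) Let $s \in S$, i.e. $s = \sum_{j=1}^{m} \alpha_j u_j$, then:

\[ P_S(s) = \sum_{i=1}^{m} \frac{\big\langle \sum_{j=1}^{m} \alpha_j u_j, u_i \big\rangle}{\|u_i\|^2} u_i = \sum_{i=1}^{m} \frac{\sum_{j=1}^{m} \alpha_j \langle u_j, u_i \rangle}{\|u_i\|^2} u_i \underset{(u_i \perp u_j\ \forall i \neq j)}{=} \sum_{i=1}^{m} \frac{\alpha_i \langle u_i, u_i \rangle}{\|u_i\|^2} u_i = \sum_{i=1}^{m} \alpha_i u_i = s \]

2) Consider the inner product of $P_S(v)$ and a fixed vector $u_j$, $j \in \{1, \ldots, m\}$:

\[ \langle P_S(v), u_j \rangle = \Big\langle \sum_{i=1}^{m} \frac{\langle v, u_i \rangle}{\|u_i\|^2} u_i, u_j \Big\rangle \underset{\text{(linearity)}}{=} \sum_{i=1}^{m} \frac{\langle v, u_i \rangle}{\|u_i\|^2} \langle u_i, u_j \rangle \underset{(u_i \perp u_j\ \forall i \neq j)}{=} \frac{\langle v, u_j \rangle}{\|u_j\|^2} \langle u_j, u_j \rangle = \langle v, u_j \rangle \]

hence, by linearity of $\langle\,,\rangle$:

\[ \langle v, u_j \rangle - \langle P_S(v), u_j \rangle = 0 \iff \langle v - P_S(v), u_j \rangle = 0 \qquad \forall j \in \{1, \ldots, m\} \]

Lemma 1.1 guarantees that $\langle v - P_S(v), s \rangle = 0$ $\forall s = \sum_{j=1}^{m} \alpha_j u_j$.

3) It is helpful to rewrite the difference $v - s$ as $v - P_S(v) + P_S(v) - s$. From property 2, $v - P_S(v) \perp S$; however, $P_S(v), s \in S$, so $P_S(v) - s \in S$. Hence $(v - P_S(v)) \perp (P_S(v) - s)$. The generalized Pythagorean theorem implies that:

\[ \|v - s\|^2 = \|v - P_S(v) + P_S(v) - s\|^2 = \|v - P_S(v)\|^2 + \underbrace{\|P_S(v) - s\|^2}_{\ge 0} \ge \|v - P_S(v)\|^2 \]

hence $\|v - s\| \ge \|v - P_S(v)\|$ $\forall v \in V$, $s \in S$.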


Evidently, $\|P_S(v) - s\|^2 = 0$ if and only if $s = P_S(v)$, and in this case $\|v - s\|^2 = \|v - P_S(v)\|^2$. □

The theorem demonstrated above tells us that the vector in the vector subspace $S \subseteq V$ which is the most "similar" to $v \in V$ (in the sense of the norm induced by the inner product) is given by the orthogonal projection. The generalization of this result to infinite-dimensional Hilbert spaces will be discussed in Chapter 5.

As already seen for the projection operator in $\mathbb{R}^2$ and $\mathbb{R}^3$, the non-negative scalar quantity $\frac{|\langle v, u_i \rangle|}{\|u_i\|}$ gives a measure of the importance of $\frac{u_i}{\|u_i\|}$ in the reconstruction of the best approximation of $v$ in $S$ via the formula $P_S(v) = \sum_{i=1}^{m} \frac{\langle v, u_i \rangle}{\|u_i\|^2} u_i$: if this quantity is large, then $\frac{u_i}{\|u_i\|}$ is very important to reconstruct $P_S(v)$; otherwise, in some circumstances, it may be ignored. In the applications to signal compression, a usual strategy consists of reordering the summation that defines $P_S(v)$ in descending order of the quantities $\frac{|\langle v, u_i \rangle|}{\|u_i\|}$ and trying to eliminate as many small terms as possible without degrading the signal quality. This observation is crucial to understanding the significance of the Fourier decomposition, which will be examined in both discrete and continuous contexts in the following chapters.

Finally, note that the seemingly trivial equation $v = v - s + s$ is, in fact, far more meaningful than it first appears when $s$ is the orthogonal projection $P_S(v)$ of $v$ onto $S$: in this case, we know that $v - s$ and $s$ are orthogonal. The decomposition of a vector as the sum of a component belonging to a subspace $S$ and a component belonging to its orthogonal is known as the orthogonal projection theorem. This decomposition is unique, and its generalization for infinite dimensions, alongside its consequences for the geometric structure of Hilbert spaces, will be examined in detail in Chapter 5.

1.7. Existence of an orthonormal basis: the Gram-Schmidt process

As we have seen, projection and decomposition laws are much simpler when an orthonormal basis is available. Theorem 1.13 states that in a finite-dimensional inner product space, an orthonormal basis can always be constructed from a free family of generators.


THEOREM 1.13.– (The iterative Gram-Schmidt process⁶) If $(v_1, \ldots, v_n)$, $n < \infty$, is a basis of $(V, \langle\,,\rangle)$, then an orthonormal basis of $(V, \langle\,,\rangle)$ can be obtained from $(v_1, \ldots, v_n)$.

PROOF.– This proof is constructive in that it provides the method used to construct an orthonormal basis from any arbitrary basis.

– Step 1: normalization of $v_1$:

\[ u_1 = \frac{v_1}{\|v_1\|} \]

– Step 2, illustrated in Figure 1.5: $v_2$ is projected in the direction of $u_1$, that is, we consider $\langle v_2, u_1 \rangle u_1$. We know from Theorem 1.12 that the vector difference $v_2 - \langle v_2, u_1 \rangle u_1$ is orthogonal to $u_1$. The result is then normalized:

\[ u_2 = \frac{v_2 - \langle v_2, u_1 \rangle u_1}{\|v_2 - \langle v_2, u_1 \rangle u_1\|} \]

Figure 1.5. Illustration of the second step in the Gram-Schmidt orthonormalization process. For a color version of this figure, see www.iste.co.uk/provenzi/spaces.zip

– Step n, by iteration:

\[ u_n = \frac{v_n - (\langle v_n, u_{n-1} \rangle u_{n-1} + \ldots + \langle v_n, u_1 \rangle u_1)}{\|v_n - (\langle v_n, u_{n-1} \rangle u_{n-1} + \ldots + \langle v_n, u_1 \rangle u_1)\|} \]

□
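The iterative steps of the proof can be sketched as follows in Python. This is a minimal illustration of the classical Gram-Schmidt process exactly as written above, under stated assumptions: the names `inner`, `norm`, `gram_schmidt` and the tolerance `tol` are hypothetical, and no attempt is made at the numerically more robust "modified" variant.

```python
import math
from typing import List, Sequence


def inner(v: Sequence[complex], w: Sequence[complex]) -> complex:
    """Euclidean inner product, linear in v and antilinear in w."""
    return sum(vi * complex(wi).conjugate() for vi, wi in zip(v, w))


def norm(v: Sequence[complex]) -> float:
    """Norm induced by the inner product."""
    return math.sqrt(inner(v, v).real)


def gram_schmidt(vectors: Sequence[Sequence[complex]], tol: float = 1e-12) -> List[List[complex]]:
    """Build an orthonormal family from linearly independent vectors."""
    basis: List[List[complex]] = []
    for v in vectors:
        r = [complex(x) for x in v]
        # Subtract the projections <v, u_k> u_k onto the vectors already built.
        for u in basis:
            c = inner(v, u)
            r = [rk - c * uk for rk, uk in zip(r, u)]
        n = norm(r)
        if n < tol:
            raise ValueError("input vectors are linearly dependent")
        basis.append([rk / n for rk in r])   # normalization step
    return basis


vectors = [[1, 1, 0], [1, 0, 1], [0, 1, 1]]   # a non-orthogonal basis of R^3
basis = gram_schmidt(vectors)
gram = [[inner(a, b) for b in basis] for a in basis]
print(gram)
```

The matrix `gram` of pairwise inner products should be the identity up to rounding, i.e. $\langle u_i, u_j \rangle = \delta_{i,j}$.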

1.8. Fundamental properties of orthonormal and orthogonal bases

The most important properties of an orthonormal basis are listed in Theorem 1.14.

⁶ Jørgen Pedersen Gram (1850, Nustrup-1916, Copenhagen), Erhard Schmidt (1876, Tartu-1959, Berlin).


THEOREM 1.14.– Let $(u_1, \ldots, u_n)$ be an orthonormal basis of $(V, \langle\,,\rangle)$, $\dim(V) = n$. Then, $\forall v, w \in V$:

1) Decomposition theorem on an orthonormal basis:

\[ v = \sum_{i=1}^{n} \langle v, u_i \rangle u_i \qquad [1.7] \]

2) Parseval's identity⁷:

\[ \langle v, w \rangle = \sum_{i=1}^{n} \langle v, u_i \rangle \langle u_i, w \rangle \qquad [1.8] \]

3) Plancherel's theorem⁸:

\[ \|v\|^2 = \sum_{i=1}^{n} |\langle v, u_i \rangle|^2 \qquad [1.9] \]

Proof of 1: an immediate consequence of Theorem 1.12. Given that $(u_1, \ldots, u_n)$ is a basis, $v \in \operatorname{span}(u_1, \ldots, u_n)$; furthermore, $(u_1, \ldots, u_n)$ is orthonormal, so $v = P_S(v) = \sum_{i=1}^{n} \langle v, u_i \rangle u_i$. It is not necessary to divide by $\|u_i\|^2$ when summing since $\|u_i\| = 1$ $\forall i$.

Proof of 2: using point 1 it is possible to write $v = \sum_{i=1}^{n} \langle v, u_i \rangle u_i$, and calculating the inner product of $v$, written in this way, and $w$, using equation [1.1], we obtain:

\[ \langle v, w \rangle = \Big\langle \sum_{i=1}^{n} \langle v, u_i \rangle u_i, w \Big\rangle = \sum_{i=1}^{n} \langle v, u_i \rangle \langle u_i, w \rangle \]

Proof of 3: writing $w = v$ on the left-hand side of Parseval's identity gives us $\langle v, v \rangle = \|v\|^2$. On the right-hand side, we have:

\[ \sum_{i=1}^{n} \langle v, u_i \rangle \langle u_i, v \rangle = \sum_{i=1}^{n} \langle v, u_i \rangle \overline{\langle v, u_i \rangle} = \sum_{i=1}^{n} |\langle v, u_i \rangle|^2 \]

hence $\|v\|^2 = \sum_{i=1}^{n} |\langle v, u_i \rangle|^2$. □

⁷ Marc-Antoine de Parseval des Chêsnes (1755, Rosières-aux-Salines-1836, Paris).

⁸ Michel Plancherel (1885, Bussy-1967, Zurich).
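Identities [1.8] and [1.9] can be verified numerically on a small example. The Python sketch below is illustrative only: the helper `inner`, the orthonormal basis of $\mathbb{C}^2$ and the vectors `v`, `w` are assumptions chosen for the example.

```python
import math
from typing import Sequence


def inner(v: Sequence[complex], w: Sequence[complex]) -> complex:
    """Euclidean inner product, linear in v and antilinear in w."""
    return sum(vi * complex(wi).conjugate() for vi, wi in zip(v, w))


# An orthonormal basis of C^2 that is not the canonical one.
s = 1 / math.sqrt(2)
basis = [[s, s * 1j], [s, -s * 1j]]

v = [1 + 2j, 3j]
w = [2 - 1j, 1 + 1j]

# Parseval's identity [1.8]: <v, w> = sum_i <v, u_i> <u_i, w>.
parseval_sum = sum(inner(v, u) * inner(u, w) for u in basis)
# Plancherel's theorem [1.9]: ||v||^2 = sum_i |<v, u_i>|^2.
plancherel_sum = sum(abs(inner(v, u)) ** 2 for u in basis)

print(parseval_sum, inner(v, w))
print(plancherel_sum, inner(v, v).real)
```

Both pairs of printed values should agree up to rounding; sorting the terms $|\langle v, u_i \rangle|^2$ in decreasing order would give the "energy ranking" of the basis directions.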


NOTE.–

1) The physical interpretation of Plancherel's theorem is as follows: the energy of $v$, measured as the square of the norm, can be decomposed using the sum of the squared moduli of each projection of $v$ on the $n$ directions of the orthonormal basis $(u_1, \ldots, u_n)$. In Fourier theory, the directions of the orthonormal basis are fundamental harmonics (sines and cosines with defined frequencies): this is why Fourier analysis may be referred to as harmonic analysis.

2) If $(u_1, \ldots, u_n)$ is an orthogonal, rather than an orthonormal, basis, then using the projector formula and Theorem 1.12, the results of Theorem 1.14 can be written as:

a) decomposition of $v \in V$ on an orthogonal basis:

\[ v = \sum_{i=1}^{n} \frac{\langle v, u_i \rangle}{\|u_i\|^2} u_i \qquad [1.10] \]

b) Parseval's identity for an orthogonal basis:

\[ \langle v, w \rangle = \sum_{i=1}^{n} \frac{\langle v, u_i \rangle \langle u_i, w \rangle}{\|u_i\|^2} \qquad [1.11] \]

c) Plancherel's theorem for an orthogonal basis:

\[ \|v\|^2 = \sum_{i=1}^{n} \frac{|\langle v, u_i \rangle|^2}{\|u_i\|^2} \qquad [1.12] \]

The following exercise is designed to test the reader's knowledge of the theory of finite-dimensional inner product spaces. The two subsequent exercises explicitly include inner products which are non-Euclidean.

Exercise 1.1

Consider the complex Euclidean inner product space $\mathbb{C}^3$ and the following three vectors:

\[ u = (0, i, 2i), \qquad v = (2i, 0, -i), \qquad w = \left( 0, i, \tfrac{1}{2} e^{-\frac{\pi}{2} i} \right) \]

1) Determine the orthogonality relationships between vectors $u, v, w$.

2) Calculate the norm of $u, v, w$ and the Euclidean distances between them.

3) Verify that $(u, v, w)$ is a (non-orthogonal) basis of $\mathbb{C}^3$.


4) Let $S$ be the vector subspace of $\mathbb{C}^3$ generated by $u$ and $w$. Calculate $P_S v$, the orthogonal projection of $v$ onto $S$. Calculate $d(v, P_S v)$, that is, the Euclidean distance between $v$ and its projection onto $S$, and verify that this minimizes the distance between $v$ and the vectors of $S$ (hint: look at the square of the distance).

5) Using the results of the previous questions, determine an orthogonal basis and an orthonormal basis for $\mathbb{C}^3$ without using the Gram-Schmidt orthonormalization process (hint: remember the geometric relationship between the residual vector $r$ and the subspace $S$).

6) Given a vector $a = (2i, -1, 0)$, write the decomposition of $a$ and Plancherel's theorem in relation to the orthonormal basis identified in point 5. Use these results to identify the vector from the orthonormal basis which has the heaviest weight in the decomposition of $a$ (and which gives the best "rough approximation" of $a$). Use a graphics program to draw the progressive vector sum of $a$, beginning with the rough approximation and adding finer details supplied by the other vectors.

Solution to Exercise 1.1

1) Evidently, $e^{-\frac{\pi}{2} i} = -i$, so by directly calculating the inner products: $\langle u, v \rangle = -2$, $\langle u, w \rangle = 0$ and $\langle v, w \rangle = \frac{1}{2}$.

2) By direct calculation: $\|u\|^2 = 5$, $\|v\|^2 = 5$, $\|w\|^2 = \frac{5}{4}$. After calculating the difference vectors, we obtain: $d(u, v) = \|u - v\| = \sqrt{14}$, $d(u, w) = \|u - w\| = \frac{5}{2}$, $d(v, w) = \|v - w\| = \frac{\sqrt{21}}{2}$.

3) The three vectors $u, v, w$ are linearly independent, so they form a basis in $\mathbb{C}^3$. This basis is not orthogonal since only vectors $u$ and $w$ are orthogonal.

4) $S = \operatorname{span}(u, w)$. Since $(u, w)$ is an orthogonal basis in $S$, we can write:

\[ P_S(v) = \frac{\langle v, u \rangle}{\|u\|^2} u + \frac{\langle v, w \rangle}{\|w\|^2} w = (0, 0, -i) \]

The residual vector of the projection of $v$ on $S$ is $r = v - P_S v = (2i, 0, 0)$ and thus $d(v, P_S v)^2 = \|r\|^2 = 4$. The most general vector in $S$ is $s = \alpha u + \beta w = \left( 0, (\alpha + \beta) i, \left( 2\alpha - \frac{\beta}{2} \right) i \right)$, with $\alpha, \beta \in \mathbb{C}$, and $d(v, s)^2 = \|v - s\|^2 = 4 + |\alpha + \beta|^2 + \left| 2\alpha - \frac{\beta}{2} + 1 \right|^2 \ge 4 = d(v, P_S v)^2$. This confirms that $P_S v$ is the vector in $S$ with the minimum distance from $v$ in relation to the Euclidean norm.

5) $r$ is orthogonal to $S$, which is generated by $u$ and $w$, hence $(u, w, r)$ is a set of orthogonal vectors in $\mathbb{C}^3$, that is, an orthogonal basis of $\mathbb{C}^3$. To obtain an orthonormal basis, we then simply divide each vector by its norm:

\[ (\hat{u}, \hat{w}, \hat{r}) \equiv \left( \frac{u}{\|u\|}, \frac{w}{\|w\|}, \frac{r}{\|r\|} \right) = \left( \left( 0, \frac{i}{\sqrt{5}}, \frac{2i}{\sqrt{5}} \right), \left( 0, \frac{2i}{\sqrt{5}}, -\frac{i}{\sqrt{5}} \right), (i, 0, 0) \right) \]

6) Decomposition:

\[ a = \langle a, \hat{u} \rangle \hat{u} + \langle a, \hat{w} \rangle \hat{w} + \langle a, \hat{r} \rangle \hat{r} = \frac{i}{\sqrt{5}} \hat{u} + \frac{2i}{\sqrt{5}} \hat{w} + 2\hat{r}. \]


Plancherel’s theorem:

$$\|a\|^2 = 5 = |\langle a, \hat{u}\rangle|^2 + |\langle a, \hat{w}\rangle|^2 + |\langle a, \hat{r}\rangle|^2 = \frac{1}{5} + \frac{4}{5} + 4$$

The vector with the heaviest weight in the reconstruction of $a$ is thus $\hat{r}$: this vector gives the best rough approximation of $a$. By calculating the vector sum of this rough representation and the other two vectors, we can reconstruct the “fine details” of $a$, first with $\hat{w}$ and then with $\hat{u}$. □

Exercise 1.2

Let $M(n, \mathbb{C})$ be the space of $n \times n$ complex matrices. The map $\varphi : M(n, \mathbb{C}) \times M(n, \mathbb{C}) \to \mathbb{C}$ is defined by:

$$\varphi(A, B) = \mathrm{tr}(B^\dagger A)$$

where $B^\dagger := \overline{B}^{\,t}$ denotes the adjoint matrix of $B$ and $\mathrm{tr}$ is the matrix trace. Prove that $\varphi$ is an inner product.

Solution to Exercise 1.2

The distributivity of matrix multiplication over addition and the linearity of the trace establish the linearity of $\varphi$ in the first variable. Now, let us prove that $\varphi$ is Hermitian. Let $A = (a_{i,j})_{1 \le i,j \le n}$ and $B = (b_{i,j})_{1 \le i,j \le n}$ be two matrices in $M(n, \mathbb{C})$. Let $(c_{i,j})_{1 \le i,j \le n} = (\overline{b_{j,i}})_{1 \le i,j \le n}$ be the coefficients of the matrix $B^\dagger$ and let $(d_{i,j})_{1 \le i,j \le n} = (\overline{a_{j,i}})_{1 \le i,j \le n}$ be the coefficients of $A^\dagger$. This gives us:

$$\begin{aligned}
\varphi(A, B) = \mathrm{tr}(B^\dagger A) &= \mathrm{tr}\left[\left(\sum_{k=1}^{n} c_{i,k}\, a_{k,j}\right)_{i,j}\right] = \mathrm{tr}\left[\left(\sum_{k=1}^{n} \overline{b_{k,i}}\, a_{k,j}\right)_{i,j}\right] \\
&= \sum_{i,k=1}^{n} \overline{b_{k,i}}\, a_{k,i} = \overline{\sum_{i,k=1}^{n} d_{i,k}\, b_{k,i}} = \overline{\mathrm{tr}(A^\dagger B)} \\
&= \overline{\varphi(B, A)}
\end{aligned}$$

Thus, $\varphi$ is a sesquilinear Hermitian form. Furthermore, $\varphi$ is positive:

$$\forall A \in M(n, \mathbb{C}), \quad \varphi(A, A) = \sum_{i,k=1}^{n} |a_{k,i}|^2 \ge 0$$


It is also definite:

$$\varphi(A, A) = 0 \iff \sum_{i,k=1}^{n} |a_{k,i}|^2 = 0 \iff a_{k,i} = 0 \ \ \forall\, 1 \le k, i \le n \iff A = 0$$

Thus, $\varphi$ is an inner product. □

Exercise 1.3

Let $E = \mathbb{R}[X]$ be the vector space of single-variable polynomials with real coefficients. For $P, Q \in E$, set:

$$\Phi(P, Q) = \int_{-1}^{1} \frac{P(t)Q(t)}{\sqrt{1 - t^2}}\, dt$$

1) Recall that $f(t) = O(g(t))$ as $t \to t_0$ means that there exist $a, C > 0$ such that $|t - t_0| < a \implies |f(t)| \le C\, |g(t)|$. Prove that, for all $P, Q \in E$:

$$\frac{P(t)Q(t)}{\sqrt{1 - t^2}} \underset{t \to 1}{=} O\left(\frac{1}{\sqrt{1 - t}}\right) \qquad \text{and} \qquad \frac{P(t)Q(t)}{\sqrt{1 - t^2}} \underset{t \to -1}{=} O\left(\frac{1}{\sqrt{1 + t}}\right)$$

Use this result to deduce that $\Phi$ is well defined on $E \times E$.

2) Prove that $\Phi$ is an inner product on $E$, which we shall denote by $\langle\, ,\, \rangle$.

3) For $n \in \mathbb{N}$, let $T_n$ be the $n$-th Chebyshev polynomial, that is, the only polynomial such that $T_n(\cos\theta) = \cos(n\theta)$ for all $\theta \in \mathbb{R}$. Applying the substitution $t = \cos\theta$, show that $(T_n)_{n \in \mathbb{N}}$ is an orthogonal family in $E$. Hint: use the trigonometric formula [1.13]:

$$\frac{1}{2}\big(\cos((n + m)\theta) + \cos((n - m)\theta)\big) = \cos(n\theta)\cos(m\theta) \qquad \forall n, m \in \mathbb{N} \qquad [1.13]$$

4) Prove that, for all $n \in \mathbb{N}$, $(T_0, \ldots, T_n)$ is an orthogonal basis of $\mathbb{R}_n[X]$, the vector space of polynomials in $\mathbb{R}[X]$ of degree less than or equal to $n$. Deduce that $(T_n)_{n \in \mathbb{N}}$ is an orthogonal basis of $E$ in the algebraic sense: every element of $E$ is a finite linear combination of elements of the basis.


5) Calculate the norm of $T_n$ for all $n$ and use this result to deduce an orthonormal basis (in the algebraic sense) of $E$.

Solution to Exercise 1.3

1) We write:

$$f(t) = \frac{P(t)Q(t)}{\sqrt{1 - t^2}} = \frac{P(t)Q(t)}{\sqrt{1 - t}\, \sqrt{1 + t}}$$

Since $P$ and $Q$ are polynomials, the function $t \mapsto \frac{P(t)Q(t)}{\sqrt{1 + t}}$ is continuous in a (closed) neighborhood $V_1(1)$ of $1$ and thus, according to the Weierstrass theorem, it is bounded in this neighborhood, that is, there exists $C_1 > 0$ such that $t \in V_1(1) \implies \left|\frac{P(t)Q(t)}{\sqrt{1 + t}}\right| \le C_1$. Similarly, the function $t \mapsto \frac{P(t)Q(t)}{\sqrt{1 - t}}$ is continuous in a (closed) neighborhood $V_2(-1)$ of $-1$, thus there exists $C_2 > 0$ such that $t \in V_2(-1) \implies \left|\frac{P(t)Q(t)}{\sqrt{1 - t}}\right| \le C_2$. This gives us:

$$t \in V_1(1) \implies |f(t)| \le C_1 \frac{1}{\sqrt{1 - t}} \iff f(t) \underset{t \to 1}{=} O\left(\frac{1}{\sqrt{1 - t}}\right)$$

and:

$$t \in V_2(-1) \implies |f(t)| \le C_2 \frac{1}{\sqrt{1 + t}} \iff f(t) \underset{t \to -1}{=} O\left(\frac{1}{\sqrt{1 + t}}\right)$$

This implies that the integral defining $\Phi$ is well defined: $f$ is continuous on $(-1, 1)$ and can therefore be integrated on every closed subinterval, and the result which we have just proved shows that $f$ is integrable in a right neighborhood of $-1$ and in a left neighborhood of $1$, since its absolute value is bounded above by an integrable function in both cases.

2) The bilinearity of $\Phi$ follows directly from the linearity of the integral. Its symmetry is a consequence of the commutativity of the pointwise product of functions. The only property which is not immediately evident is definite positiveness. Let us start by proving positiveness:

$$\Phi(P, P) = \int_{-1}^{1} \frac{P^2(t)}{\sqrt{1 - t^2}}\, dt \ge 0$$

and (a.e. stands for almost everywhere, see Chapter 3):

$$\Phi(P, P) = 0 \iff \frac{P^2(t)}{\sqrt{1 - t^2}} = 0 \ \text{a.e. on } (-1, 1) \iff P(t) = 0 \ \text{a.e. on } (-1, 1)$$

but the only polynomial with an infinite number of roots is the null polynomial $0(t) \equiv 0$, so $P = 0$. $\Phi$ is therefore an inner product on $E$.

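3) For all $n, m \in \mathbb{N}$, we apply the substitution $t = \cos\theta$, $dt = -\sin\theta\, d\theta$, with $t = -1 \iff \theta = \pi$ and $t = 1 \iff \theta = 0$:

$$\begin{aligned}
\langle T_n, T_m\rangle &= \int_{-1}^{1} \frac{T_n(t)T_m(t)}{\sqrt{1 - t^2}}\, dt \\
&= \int_{\pi}^{0} \frac{T_n(\cos\theta)\, T_m(\cos\theta)}{\sqrt{1 - \cos^2\theta}}\, (-\sin\theta)\, d\theta \\
&= \int_{0}^{\pi} \frac{\cos(n\theta)\cos(m\theta)}{|\sin\theta|}\, \sin\theta\, d\theta \\
&= \int_{0}^{\pi} \frac{\cos(n\theta)\cos(m\theta)}{\sin\theta}\, \sin\theta\, d\theta \qquad (\sin\theta \ge 0 \text{ on } [0, \pi]) \\
&= \int_{0}^{\pi} \cos(n\theta)\cos(m\theta)\, d\theta \\
&= \frac{1}{2}\left(\int_{0}^{\pi} \cos((n + m)\theta)\, d\theta + \int_{0}^{\pi} \cos((n - m)\theta)\, d\theta\right) \qquad (\text{from [1.13]})
\end{aligned}$$

So, for all $n \ne m$, we have:

$$\langle T_n, T_m\rangle = \frac{1}{2}\left(\left[\frac{\sin((n + m)\theta)}{n + m}\right]_{0}^{\pi} + \left[\frac{\sin((n - m)\theta)}{n - m}\right]_{0}^{\pi}\right) = 0$$

that is, the Chebyshev polynomials form an orthogonal family of polynomials in relation to the inner product defined above.

4) The family $(T_0, T_1, \ldots, T_n)$ is an orthogonal (and thus free) family of $n + 1$ non-zero elements of $\mathbb{R}_n[X]$, which is of dimension $n + 1$, meaning that it is an orthogonal basis of $\mathbb{R}_n[X]$. To show that $(T_n)_{n \in \mathbb{N}}$ is a basis of $E$ in the algebraic sense, consider a polynomial $P \in E$ of arbitrary degree $d \in \mathbb{N}$, i.e. $P \in \mathbb{R}_d[X]$, and note that $(T_0, T_1, \ldots, T_d)$ is an orthogonal (free) family of generators of $\mathbb{R}_d[X]$. $P$ is therefore a finite linear combination of elements of $(T_n)_{n \in \mathbb{N}}$, which is thus a basis in the algebraic sense of the term.

5) The norm of $T_n$ is calculated using the following equality:

$$\langle T_n, T_m\rangle = \frac{1}{2}\left(\int_{0}^{\pi} \cos((n + m)\theta)\, d\theta + \int_{0}^{\pi} \cos((n - m)\theta)\, d\theta\right)$$

which was demonstrated in point 3. Taking $n = m$, we have:

$$\|T_n\|^2 = \langle T_n, T_n\rangle = \frac{1}{2}\left(\int_{0}^{\pi} \cos(2n\theta)\, d\theta + \int_{0}^{\pi} d\theta\right) = \begin{cases} \dfrac{1}{2}\left(\displaystyle\int_{0}^{\pi} d\theta + \int_{0}^{\pi} d\theta\right) = \pi & \text{if } n = 0 \\[2ex] \dfrac{1}{2}\left(\left[\dfrac{\sin 2n\theta}{2n}\right]_{0}^{\pi} + \pi\right) = \dfrac{\pi}{2} & \text{if } n \ge 1 \end{cases}$$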


hence $\|T_0\| = \sqrt{\pi}$ and $\|T_n\| = \sqrt{\pi/2}$ for $n \ge 1$. Finally, the family:

$$\left\{\frac{T_0}{\sqrt{\pi}}\right\} \cup \left\{\sqrt{\frac{2}{\pi}}\, T_n\right\}_{n \ge 1}$$

is an orthonormal basis (in the algebraic sense) of $E$, the vector space of single-variable polynomials with real coefficients. □

1.9. Summary

In this chapter, we have examined the properties of real and complex inner products, highlighting their differences. We noted that the symmetry and bilinearity of the real inner product must be replaced by conjugate symmetry and sesquilinearity in order to obtain a set of properties which are compatible with definite positivity. This final property is essential in order to produce a norm from an inner product.

We noted that the prototype for all inner product spaces, or pre-Hilbert spaces, of finite dimension $n$ is the Euclidean space $\mathbb{K}^n$, where $\mathbb{K} = \mathbb{R}$ or $\mathbb{K} = \mathbb{C}$.

Using the inner product, the concept of orthogonality between vectors can be extended to any inner product space. Two vectors are orthogonal if their inner product is zero. The null vector is the only vector which is orthogonal to all other vectors, and the property of definite positiveness means that it is the only vector to be orthogonal to itself. If two vectors have the same inner product with all other vectors, that is, the same projection in every direction, then these vectors coincide.

A norm on a vector space is said to be a Hilbert norm if an inner product can be defined which generates the norm in a canonical manner. Remarkably, a norm is a Hilbert norm if and only if it satisfies the parallelogram law; this holds true for both finite and infinite dimensions. The polarization formula can be used to define an inner product which is compatible with a Hilbert norm.

Orthogonality of non-zero vectors implies linear independence, guaranteeing that a set of $n$ non-zero orthogonal vectors in a vector space of dimension $n$ will constitute a basis. The expansion of a vector on an orthonormal basis is trivial: the components in relation to this basis are the inner products of the vector with the basis vectors.
It is therefore much simpler to calculate components in such cases because, if the basis is not orthonormal, then a linear system of equations must be solved. The concept of orthogonal projection on a vector subspace S was also presented. Given an orthogonal basis of this space, the projection can be represented as an expansion over the vectors of the basis, with coefﬁcients given by the inner products


(which are normalized if the basis is not orthonormalized). We have seen that the difference between a vector and its orthogonal projection, known as the residual vector, is orthogonal to the projection subspace S. We also demonstrated that the orthogonal projection is the vector in S which minimizes the distance (in relation to the norm of the vector space) between the vector and the vectors of S. Given an inner product space, of ﬁnite or inﬁnite dimensions, an orthonormal basis can always be deﬁned using the Gram-Schmidt orthonormalization algorithm. Finally, we proved the important Parseval identity and Plancherel’s theorem in relation to an orthonormal or orthogonal basis. The extension of these properties to inﬁnite dimensions is presented in Chapter 5.
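The projection, residual and Plancherel statements recalled in this summary can be checked numerically. The following is a minimal sketch, assuming NumPy is available; it reuses the vectors of Exercise 1.1, and the helper name `inner` is introduced here purely for illustration.

```python
import numpy as np

def inner(x, y):
    """Complex Euclidean inner product, linear in the first argument."""
    return np.sum(x * np.conj(y))

# Vectors of Exercise 1.1; the third component of w is (1/2) e^{-i pi/2} = -i/2.
u = np.array([0, 1j, 2j])
w = np.array([0, 1j, -0.5j])
v = np.array([2j, 0, -1j])

# Orthogonal projection of v onto S = span(u, w); this formula relies on <u, w> = 0.
proj = inner(v, u) / inner(u, u) * u + inner(v, w) / inner(w, w) * w
residual = v - proj

# Orthonormal basis obtained by normalizing (u, w, residual).
basis = [x / np.sqrt(inner(x, x).real) for x in (u, w, residual)]

# Components of a = (2i, -1, 0) are the inner products with the basis vectors.
a = np.array([2j, -1, 0])
coeffs = np.array([inner(a, e) for e in basis])
reconstructed = sum(c * e for c, e in zip(coeffs, basis))
energy = np.sum(np.abs(coeffs) ** 2)   # should equal ||a||^2 by Plancherel
```

Because the basis is orthonormal, no linear system has to be solved: each component is a single inner product.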

2 The Discrete Fourier Transform and its Applications to Signal and Image Processing

The information presented in the previous chapter (Chapter 1) concerning complex inner product spaces and their properties lays the foundations for a very simple introduction to the discrete Fourier transform (DFT). We simply need to prove that certain complex exponential functions constitute an orthogonal basis for a complex inner product space of finite dimension. From a mathematical standpoint, the DFT is a simple change of basis in a vector space; however, its interpretation is of crucial importance and is extremely useful in the context of applications, notably in signal theory, as we shall see in section 2.6. This section draws on the excellent work of M. Frazier (2001).

2.1. The space $\ell^2(\mathbb{Z}_N)$ and its canonical basis

In order to introduce the vector space in which the DFT is to be constructed, we need to make a few adjustments to the notation used thus far. We shall continue to work with complex vectors with a number of components $N$, $1 < N < +\infty$, but a vector in $\mathbb{C}^N$ will be considered as a finite sequence. Our first task is to define $\mathbb{Z}_N$.

From Euclidean to Hilbert Spaces: Introduction to Functional Analysis and its Applications First Edition. Edoardo Provenzi. © ISTE Ltd 2021. Published by ISTE Ltd and John Wiley & Sons, Inc.


DEFINITION 2.1.– Two integers $a, b \in \mathbb{Z}$ are said to be congruent modulo $N$ if their difference is divisible by $N$, that is:

$$\frac{a - b}{N} = m \in \mathbb{Z}$$

meaning that we can write $a = b + mN$. The (Gaussian) notation for two integers which are congruent modulo $N$ is:

$$a \equiv b \pmod{N}$$

Congruence modulo $N$ can be shown to be an equivalence relation on $\mathbb{Z}$. Like all equivalence relations, it creates a partition of $\mathbb{Z}$ into disjoint equivalence classes. The set of these equivalence classes is written as:

$$\mathbb{Z}_N = \mathbb{Z} \pmod{N}$$

A (finite) sequence of complex values on $\mathbb{Z}_N$ is a function:

$$z : \mathbb{Z}_N \to \mathbb{C}, \qquad j \mapsto z(j)$$

In practice, circular or “clock” arithmetic is applied: this consists of identifying a sequence defined on $\mathbb{Z}_N$ with a sequence defined on $\{0, 1, \ldots, N - 1\}$ and extended to $\mathbb{Z}$ by $N$-periodicity:

$$z(j + mN) = z(j) \qquad \forall j, m \in \mathbb{Z}$$

that is, given the definition of $z(j)$ when $j \in \{0, 1, \ldots, N - 1\}$, in order to determine $z(j)$ when $j \notin \{0, \ldots, N - 1\}$, we must add to $j$ the integer multiple $mN$, $m \in \mathbb{Z}$, of $N$ such that $\bar{j} = j + mN \in \{0, 1, \ldots, N - 1\}$. We then define $z(j) = z(\bar{j})$. An example is shown below.

EXAMPLE.– $N = 12$, $z = (1, i, i, \sqrt{2}\, i, 0, 0, 0, -1, 0, 0, 0, 2)$, that is:

$$\begin{cases} z(0) = 1 \\ z(1) = z(2) = i \\ z(3) = \sqrt{2}\, i \\ z(4) = z(5) = z(6) = 0 \\ z(7) = -1 \\ z(8) = z(9) = z(10) = 0 \\ z(11) = 2 \end{cases}$$


extended by 12-periodicity to $\mathbb{Z}$. Determine $z(-21)$. Since $N = 12$, we must find the integer $m \ne 0$ such that $-21 + 12m \in \{0, 1, \ldots, 11\}$:

$$-21 + 12m = \begin{cases} -21 + 12 = -9 & m = 1 \\ -21 + 24 = 3 & m = 2 \\ -21 + 36 = 15 & m = 3 \\ \ldots \end{cases}$$

The value of $m$ for which $-21 + 12m$ falls within $\{0, \ldots, 11\}$ is $m = 2$, and in this case we have $-21 + 2 \cdot 12 = 3$, which implies $z(-21) = z(3) = \sqrt{2}\, i$.

Despite the fact that $\mathbb{Z}_N$ is often considered to represent the set of canonical representatives $\{0, 1, \ldots, N - 1\}$, we can, in fact, consider $z$ to be defined over any subset of $\mathbb{Z}$ given by $N$ consecutive integers, and not necessarily over $\{0, \ldots, N - 1\}$. This convention will be used throughout this book.

The complex vector space that will be used in this section is the set of all sequences of complex values over $\mathbb{Z}_N$:

$$\ell^2(\mathbb{Z}_N) = \{z : \mathbb{Z}_N \to \mathbb{C}\}$$

The reason for using this particular notation will become clear later. $\ell^2(\mathbb{Z}_N)$ is a complex vector space with the usual operations of addition and multiplication by a scalar, that is, given $z, w \in \ell^2(\mathbb{Z}_N)$ and $\alpha \in \mathbb{C}$, the sum and the multiplication by a complex scalar are defined as follows:

$$z + w : \mathbb{Z}_N \to \mathbb{C}, \quad j \mapsto (z + w)(j) = z(j) + w(j)$$

$$\alpha z : \mathbb{Z}_N \to \mathbb{C}, \quad j \mapsto (\alpha z)(j) = \alpha z(j)$$

$\ell^2(\mathbb{Z}_N)$ is of dimension $N$: the map which associates each sequence $z \in \ell^2(\mathbb{Z}_N)$ with its images $(z(0), z(1), \ldots, z(N - 1))$:

$$\ell^2(\mathbb{Z}_N) \longleftrightarrow \mathbb{C}^N, \qquad z \longleftrightarrow (z(0), z(1), \ldots, z(N - 1)) = \begin{pmatrix} z(0) \\ z(1) \\ \vdots \\ z(N - 1) \end{pmatrix}$$

is a linear isomorphism (the proof is left to the reader). $z$ will be represented as a row vector or as a column vector as the case requires.
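The periodic extension used in the example above amounts to reducing the index modulo $N$. As a minimal sketch in plain Python (the function name `z_periodic` is hypothetical and introduced only for this illustration):

```python
N = 12
# The sequence of the example, stored on the canonical representatives 0, ..., 11.
z = [1, 1j, 1j, 2 ** 0.5 * 1j, 0, 0, 0, -1, 0, 0, 0, 2]

def z_periodic(j):
    """Value at any integer j of the N-periodic extension of z."""
    # For N > 0, Python's % operator returns a representative in {0, ..., N-1},
    # including for negative j, so no explicit search for m is needed.
    return z[j % N]

value = z_periodic(-21)   # index -21 is congruent to 3 modulo 12
```

The reduction `j % N` plays the role of the search for the integer $m$ carried out by hand in the example.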


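The isomorphism above allows us to define the canonical basis $B$ of $\ell^2(\mathbb{Z}_N)$ as the set of the following $N$ sequences:

$$B = (e_0, e_1, \ldots, e_{N-1}), \qquad e_j(k) = \delta_{j,k} = \begin{cases} 1 & k = j \\ 0 & k \ne j \end{cases}$$

We can also introduce an inner product into $\ell^2(\mathbb{Z}_N)$ using:

$$\langle z, w\rangle = \sum_{k=0}^{N-1} z(k)\overline{w(k)}$$

so $z, w \in \ell^2(\mathbb{Z}_N)$ are orthogonal if and only if $\langle z, w\rangle = \sum_{k=0}^{N-1} z(k)\overline{w(k)} = 0$.

The norm induced by this inner product is:

$$\|z\| = \left(\sum_{k=0}^{N-1} |z(k)|^2\right)^{\frac{1}{2}}$$

which will be referred to as the $\ell^2(\mathbb{Z}_N)$ norm.

2.1.1. The orthogonal basis of complex exponentials in $\ell^2(\mathbb{Z}_N)$

In this section, we are going to define the function system that will be essential to the development of the DFT. First, we recall these basic facts:

1) for an arbitrary $z \in \mathbb{C}$, $z = \rho[\cos\alpha + i\sin\alpha] = \rho e^{i\alpha}$, $\rho, \alpha \in \mathbb{R}$, $\rho \ge 0$;
2) Euler’s formulas: $\cos\alpha = \frac{1}{2}(e^{i\alpha} + e^{-i\alpha})$, $\sin\alpha = \frac{1}{2i}(e^{i\alpha} - e^{-i\alpha})$;
3) $|z| = 1 \iff z = e^{i\alpha}$ for some $\alpha \in \mathbb{R}$;
4) $e^{i\alpha} = e^{i(\alpha + 2\pi k)}$, $k \in \mathbb{Z}$;
5) as a specific instance of the previous point, if $\alpha = 0$, we obtain: $e^{2\pi i k} = 1 \ \ \forall k \in \mathbb{Z}$;
6) $e^{i\alpha} e^{i\beta} = e^{i(\alpha + \beta)}$;
7) $(e^{i\alpha})^n = e^{in\alpha}$;
8) $\overline{e^{i\alpha}} = e^{-i\alpha}$;
9) given $z = \rho e^{i\alpha}$, the solutions to the equation $w^N = z$ are the $N$ complex roots given by: $w_m = \sqrt[N]{\rho}\, e^{i\frac{2\pi m + \alpha}{N}}$, $m = 0, \ldots, N - 1$;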


10) specifically, the $N$-th roots of unity are: $\omega_m = e^{2\pi i \frac{m}{N}}$, $m = 0, \ldots, N - 1$.

We also need to recall the geometric summation formula. Let:

$$S_k = 1 + z + z^2 + \ldots + z^{k-1} + z^k = \sum_{j=0}^{k} z^j$$

If $z = 1$, then $S_k = k + 1$. If $z \ne 1$, we observe that:

$$(1 - z)S_k = 1 + z + z^2 + \ldots + z^k - (z + z^2 + \ldots + z^k + z^{k+1}) = 1 - z^{k+1}$$

hence:

$$\sum_{j=0}^{k} z^j = \begin{cases} \dfrac{1 - z^{k+1}}{1 - z} & \text{if } z \in \mathbb{C} \setminus \{1\} \\[1.5ex] k + 1 & \text{if } z = 1 \end{cases}$$
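The two ingredients just recalled, the $N$-th roots of unity and the geometric summation formula, can be compared numerically. This is only an illustrative sketch assuming NumPy; the value $N = 8$ and the function name `geometric_sum` are arbitrary choices made for the illustration.

```python
import numpy as np

N = 8
m = np.arange(N)
omega = np.exp(2j * np.pi * m / N)      # omega[m] = e^{2 pi i m / N}

def geometric_sum(z, k):
    """Closed form of 1 + z + ... + z^k, with the case z = 1 treated separately."""
    return k + 1 if z == 1 else (1 - z ** (k + 1)) / (1 - z)

# Sum of the first N powers of each root of unity, computed term by term
# and with the closed formula (taking k = N - 1).
direct = [sum(w ** j for j in range(N)) for w in omega]
closed = [geometric_sum(w, N - 1) for w in omega]
```

For $\omega_0 = 1$ the sum equals $N$, while for every other root the numerator $1 - \omega_m^N$ vanishes up to rounding error; this is the mechanism behind the orthogonality of the complex exponentials.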

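Now, consider the sequences in $\ell^2(\mathbb{Z}_N)$ defined by the following complex exponentials:

$$E_m : \mathbb{Z}_N \longrightarrow \mathbb{C}, \qquad n \longmapsto E_m(n)$$

where:

$$\begin{cases} E_0(n) = 1 \\ E_1(n) = e^{2\pi i \frac{n}{N}} \\ E_2(n) = e^{2\pi i \frac{2n}{N}} \\ \quad \vdots \\ E_{N-1}(n) = e^{2\pi i \frac{(N-1)n}{N}} \end{cases}$$

Hence:

– $E_0$ is the constant sequence $E_0(n) \equiv 1 \ \ \forall n \in \mathbb{Z}_N$;
– $E_1$ is the sequence $E_1 = \left(1, e^{2\pi i \frac{1}{N}}, e^{2\pi i \frac{2}{N}}, \ldots, e^{2\pi i \frac{N-1}{N}}\right)$;
– $E_2$ is the sequence $E_2 = \left(1, e^{2\pi i \frac{2}{N}}, e^{2\pi i \frac{4}{N}}, \ldots, e^{2\pi i \frac{2(N-1)}{N}}\right)$;
– $E_{N-1}$ is the sequence $E_{N-1} = \left(1, e^{2\pi i \frac{N-1}{N}}, e^{2\pi i \frac{2(N-1)}{N}}, \ldots, e^{2\pi i \frac{(N-1)^2}{N}}\right)$.

The general sequence is:

$$E_m(n) = e^{2\pi i \frac{mn}{N}} = (\omega_m)^n \qquad \forall m, n = 0, \ldots, N - 1$$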


where $(\omega_m)^n$ is the $n$-th power of the $N$-th root of unity $\omega_m$, $\forall n \in \{0, \ldots, N - 1\}$, so:

$$(\omega_m)^n = \left(e^{2\pi i \frac{m}{N}}\right)^n = e^{2\pi i \frac{mn}{N}}$$

From the formula $e^{i\alpha} = \cos\alpha + i\sin\alpha$, we know that the system defined above is a set of sequences of values which oscillate at different frequencies, since the arguments of the cos and sin functions change with the coefficients $m$ and $n$. As we shall see, the significance of these frequencies is crucial to Fourier analysis. For now, let us focus on proving that the exponential system defined above is an orthogonal basis of $\ell^2(\mathbb{Z}_N)$. This proof relies on a preliminary lemma.

LEMMA 2.1.– For all $j, k \in \{0, 1, \ldots, N - 1\}$, we have:

$$\sum_{n=0}^{N-1} e^{2\pi i n \frac{j-k}{N}} = \sum_{n=0}^{N-1} e^{-2\pi i n \frac{j-k}{N}} = N\delta_{j,k} = \begin{cases} N & j = k \\ 0 & j \ne k \end{cases} \qquad [2.1]$$

The physical interpretation of this key formula will be discussed later. Before going further with the proof, note that in the case where $j, k \in \{0, 1, \ldots, N - 1\}$, $j \ne k$, we have $|j - k| \in \{1, 2, \ldots, N - 1\}$, so $\frac{j-k}{N} = -\frac{k-j}{N} \notin \mathbb{Z}$.

PROOF.– This proof covers the first summation, but the same argument evidently holds for the second summation. We start by using the properties of complex exponentials to rewrite the formula as follows:

$$\sum_{n=0}^{N-1} e^{2\pi i n \frac{j-k}{N}} = \sum_{n=0}^{N-1} \left(e^{2\pi i \frac{j-k}{N}}\right)^n$$

Let us analyze the following two cases:

– if $j = k$, the exponentials in the sum are equal to 1, since $e^{2\pi i \frac{0}{N}} = 1$, and thus:

$$\sum_{n=0}^{N-1} e^{2\pi i n \frac{j-j}{N}} = \sum_{n=0}^{N-1} 1 = N$$

– if $j \ne k$, the base $e^{2\pi i \frac{j-k}{N}}$ is $\ne 1$, so, using the geometric summation formula:

$$\sum_{n=0}^{N-1} \left(e^{2\pi i \frac{j-k}{N}}\right)^n = \frac{1 - \left(e^{2\pi i \frac{j-k}{N}}\right)^{N-1+1}}{1 - e^{2\pi i \frac{j-k}{N}}} = \frac{1 - e^{2\pi i \frac{(j-k)N}{N}}}{1 - e^{2\pi i \frac{j-k}{N}}} = \frac{1 - e^{2\pi i (j-k)}}{1 - e^{2\pi i \frac{j-k}{N}}}$$


Since $j - k = m \in \mathbb{Z}$, we have $e^{2\pi i (j-k)} = 1$, so the numerator of the final formula is 0 when $j \ne k$. The denominator, on the other hand, never vanishes; as we saw in the remark before the proof, if $j \ne k$, then $\frac{j-k}{N} \notin \mathbb{Z}$. In this case, the summation is equal to 0. □

The demonstration that $E$ is an orthogonal basis of $\ell^2(\mathbb{Z}_N)$ is now trivial.

THEOREM 2.1.– $E = (E_0, \ldots, E_{N-1})$ is an orthogonal basis of $\ell^2(\mathbb{Z}_N)$.

PROOF.– $E$ is given by $N$ elements of an $N$-dimensional inner product space, so if we can prove that $E$ is an orthogonal family, then the theorem is also proved. Indeed, we know that an orthogonal family of non-zero vectors is free, and a free family of $N$ vectors in an $N$-dimensional vector space is a basis. We thus calculate the inner products $\langle E_j, E_k\rangle$, $\forall j, k \in \{0, \ldots, N - 1\}$:

$$\langle E_j, E_k\rangle = \sum_{n=0}^{N-1} E_j(n)\overline{E_k(n)} = \sum_{n=0}^{N-1} e^{2\pi i \frac{jn}{N}} e^{-2\pi i \frac{kn}{N}} = \sum_{n=0}^{N-1} e^{2\pi i \frac{(j-k)n}{N}} = N\delta_{j,k}$$

using Lemma 2.1 to give us the final equality, which proves that $\langle E_j, E_k\rangle = N\delta_{j,k}$, that is, the elements of $E$ are mutually orthogonal. □

If we take $j = k = m$ in the equation $\langle E_j, E_k\rangle = N\delta_{j,k}$, then $\langle E_m, E_m\rangle = N\delta_{m,m} = N$, hence:

$$\|E_m\|^2 = N, \qquad \|E_m\| = \sqrt{N}, \qquad \forall m \in \{0, 1, \ldots, N - 1\}$$

Now, let us consider two examples in which the expression of the complex exponentials is particularly simple: $N = 2$ and $N = 4$ (the expression using $N = 3$ is not quite so simple):

1) $N = 2$. $\ell^2(\mathbb{Z}_2) = \{z = (z(0), z(1)) \in \mathbb{C}^2\}$; in this case $E_m(n) = e^{2\pi i \frac{mn}{2}} = e^{\pi i mn}$ and thus:

$$m = 0: \quad E_0 = \left(e^{\pi i\, 0 \cdot 0}, e^{\pi i\, 0 \cdot 1}\right) = (1, 1)$$

$$m = 1: \quad E_1 = \left(e^{\pi i\, 1 \cdot 0}, e^{\pi i\, 1 \cdot 1}\right) = \left(1, e^{\pi i}\right)$$

Since $e^{\pi i} = \cos(\pi) + i\sin(\pi) = -1$, we have $E_1 = (1, -1)$. Thus:

$$E = ((1, 1), (1, -1)) \qquad [2.2]$$

is the basis of complex exponentials in $\ell^2(\mathbb{Z}_2)$. Note the presence of a constant sequence (the first) and an oscillating sequence (the second). This particular feature of the basis will be discussed in greater detail later.


2) $N = 4$. $\ell^2(\mathbb{Z}_4) = \{z = (z(0), z(1), z(2), z(3)) \in \mathbb{C}^4\}$: the Fourier basis is obtained from four complex sequences, each with four components. Verification that the basis of complex exponentials of $\ell^2(\mathbb{Z}_4)$ is:

$$E = ((1, 1, 1, 1), (1, i, -1, -i), (1, -1, 1, -1), (1, -i, -1, i)) \qquad [2.3]$$

is left to the reader.

Results [1.10], [1.11] and [1.12] from section 1.8 may be used to write the following formulas, which are valid for any two elements $z, w \in \ell^2(\mathbb{Z}_N)$:

– decomposition on the orthogonal basis $E$:

$$z = \sum_{m=0}^{N-1} \frac{\langle z, E_m\rangle}{N} E_m \qquad [2.4]$$

– Parseval’s identity for the orthogonal basis $E$:

$$\langle z, w\rangle = \sum_{m=0}^{N-1} \frac{\langle z, E_m\rangle \langle E_m, w\rangle}{N} \qquad [2.5]$$

– Plancherel’s theorem for $E$:

$$\|z\|^2 = \sum_{m=0}^{N-1} \frac{|\langle z, E_m\rangle|^2}{N} \qquad [2.6]$$
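Lemma 2.1 and formulas [2.4] to [2.6] lend themselves to a quick numerical check. The sketch below assumes NumPy; the size $N = 6$, the random test sequences and the helper `inner` are illustrative choices, not part of the text.

```python
import numpy as np

N = 6
n = np.arange(N)
# Row m of E holds the sequence E_m(n) = exp(2 pi i m n / N).
E = np.exp(2j * np.pi * np.outer(n, n) / N)

def inner(z, w):
    """Inner product of l^2(Z_N): sum over k of z(k) * conj(w(k))."""
    return np.sum(z * np.conj(w))

# Gram matrix of the family E; Lemma 2.1 predicts N times the identity.
gram = np.array([[inner(E[j], E[k]) for k in range(N)] for j in range(N)])

rng = np.random.default_rng(0)
z = rng.normal(size=N) + 1j * rng.normal(size=N)
w = rng.normal(size=N) + 1j * rng.normal(size=N)

cz = np.array([inner(z, E[m]) for m in range(N)])    # <z, E_m>
cw = np.array([inner(E[m], w) for m in range(N)])    # <E_m, w>

decomposition = sum(cz[m] * E[m] for m in range(N)) / N   # right-hand side of [2.4]
parseval = np.sum(cz * cw) / N                             # right-hand side of [2.5]
plancherel = np.sum(np.abs(cz) ** 2) / N                   # right-hand side of [2.6]
```

The division by $N$ in each formula compensates for the fact that the vectors $E_m$ have norm $\sqrt{N}$ rather than 1.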

The expressions above are calculated explicitly in section 2.3. There are several ways of renormalizing the basis $E$. Two of the most widespread approaches, which can also be used to define the DFT, are discussed in the next two sections.

2.2. The orthonormal Fourier basis of $\ell^2(\mathbb{Z}_N)$

As we saw in section 2.1.1, the $\ell^2(\mathbb{Z}_N)$ norm of every sequence $E_m$ is $\sqrt{N}$; an orthonormal basis can therefore be obtained by dividing each $E_m$ by $\sqrt{N}$. This justifies Definition 2.2.

DEFINITION 2.2.– The orthonormal Fourier basis of $\ell^2(\mathbb{Z}_N)$ is the set:

$$\mathcal{E} = (\mathcal{E}_0, \mathcal{E}_1, \mathcal{E}_2, \ldots, \mathcal{E}_{N-1})$$

of the $N$ sequences $\mathcal{E}_m \in \ell^2(\mathbb{Z}_N)$:

$$\mathcal{E}_m : \mathbb{Z}_N \longrightarrow \mathbb{C}, \qquad n \longmapsto \mathcal{E}_m(n)$$


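where:

$$\begin{cases} \mathcal{E}_0(n) = \frac{1}{\sqrt{N}} \\ \mathcal{E}_1(n) = \frac{1}{\sqrt{N}} e^{2\pi i \frac{n}{N}} \\ \mathcal{E}_2(n) = \frac{1}{\sqrt{N}} e^{2\pi i \frac{2n}{N}} \\ \quad \vdots \\ \mathcal{E}_{N-1}(n) = \frac{1}{\sqrt{N}} e^{2\pi i \frac{(N-1)n}{N}} \end{cases}$$

The general sequence of the orthonormal Fourier basis is:

$$\mathcal{E}_m(n) = \frac{1}{\sqrt{N}} e^{2\pi i \frac{mn}{N}} = \frac{1}{\sqrt{N}} (\omega_m)^n \qquad \forall m, n = 0, \ldots, N - 1$$

and the orthonormality formula $\langle \mathcal{E}_j, \mathcal{E}_k\rangle = \delta_{j,k}$ holds true. Using formulas [2.2] and [2.3], we can say that:

$$\mathcal{E} = \frac{1}{\sqrt{2}} ((1, 1), (1, -1)) \qquad [2.7]$$

is the orthonormal Fourier basis of $\ell^2(\mathbb{Z}_2)$ and:

$$\mathcal{E} = \frac{1}{2} ((1, 1, 1, 1), (1, i, -1, -i), (1, -1, 1, -1), (1, -i, -1, i)) \qquad [2.8]$$

is the orthonormal Fourier basis of $\ell^2(\mathbb{Z}_4)$.

The translation of Theorem 1.14 for $\ell^2(\mathbb{Z}_N)$ equipped with the orthonormal Fourier basis is as follows. Given arbitrary elements $z, w \in \ell^2(\mathbb{Z}_N)$, we have:

– a decomposition on the orthonormal Fourier basis:

$$z = \sum_{m=0}^{N-1} \langle z, \mathcal{E}_m\rangle \mathcal{E}_m \qquad [2.9]$$

– Parseval’s identity:

$$\langle z, w\rangle = \sum_{m=0}^{N-1} \langle z, \mathcal{E}_m\rangle \langle \mathcal{E}_m, w\rangle \qquad [2.10]$$

– Plancherel’s theorem:

$$\|z\|^2 = \sum_{m=0}^{N-1} |\langle z, \mathcal{E}_m\rangle|^2 \qquad [2.11]$$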
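As an illustration of the orthonormal Fourier basis, the case $N = 4$ can be built directly from the general formula and compared with [2.8]. This is a sketch assuming NumPy; the variable names are chosen for this example only.

```python
import numpy as np

N = 4
n = np.arange(N)
# Row m holds the m-th orthonormal Fourier sequence: exp(2 pi i m n / N) / sqrt(N).
E_orthonormal = np.exp(2j * np.pi * np.outer(n, n) / N) / np.sqrt(N)

# The basis written out explicitly in [2.8].
expected = 0.5 * np.array([
    [1, 1, 1, 1],
    [1, 1j, -1, -1j],
    [1, -1, 1, -1],
    [1, -1j, -1, 1j],
])

# Entry (j, k) of the Gram matrix is the inner product of sequences j and k.
gram = E_orthonormal @ E_orthonormal.conj().T
```

The Gram matrix is the identity, which is the matrix form of the orthonormality formula.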


2.3. The orthogonal Fourier basis of $\ell^2(\mathbb{Z}_N)$

Although the normalization constant $1/\sqrt{N}$, which appears in the definition of the orthonormal Fourier basis, might appear to be the most logical choice for normalizing the basis $E$ in $\ell^2(\mathbb{Z}_N)$, another normalization is more commonly used in practical applications. The reason for this choice, shown below, is that it simplifies the writing of several other formulas.

DEFINITION 2.3.– The orthogonal Fourier basis of $\ell^2(\mathbb{Z}_N)$ is the set:

$$F = (F_0, F_1, F_2, \ldots, F_{N-1})$$

of $N$ sequences $F_m \in \ell^2(\mathbb{Z}_N)$:

$$F_m : \mathbb{Z}_N \longrightarrow \mathbb{C}, \qquad n \longmapsto F_m(n)$$

where:

$$\begin{cases} F_0(n) = \frac{1}{N} \\ F_1(n) = \frac{1}{N} e^{2\pi i \frac{n}{N}} \\ F_2(n) = \frac{1}{N} e^{2\pi i \frac{2n}{N}} \\ \quad \vdots \\ F_{N-1}(n) = \frac{1}{N} e^{2\pi i \frac{(N-1)n}{N}} \end{cases}$$

The general sequence of the orthogonal Fourier basis is:

$$F_m(n) = \frac{1}{N} e^{2\pi i \frac{mn}{N}} = \frac{1}{N} (\omega_m)^n \qquad \forall m, n = 0, \ldots, N - 1$$

The relationships between the three bases $E$, $\mathcal{E}$ and $F$ are:

$$\mathcal{E}_m = \frac{E_m}{\sqrt{N}}, \qquad F_m = \frac{E_m}{N}, \qquad F_m = \frac{\mathcal{E}_m}{\sqrt{N}} \qquad \forall m \in \{0, 1, \ldots, N - 1\} \qquad [2.12]$$

Using the formulas above, the orthogonal Fourier bases of $\ell^2(\mathbb{Z}_2)$ and $\ell^2(\mathbb{Z}_4)$ are easy to calculate:

– orthogonal Fourier basis of $\ell^2(\mathbb{Z}_2)$:

$$F = \frac{1}{2} ((1, 1), (1, -1)) \qquad [2.13]$$

– orthogonal Fourier basis of $\ell^2(\mathbb{Z}_4)$:

$$F = \frac{1}{4} ((1, 1, 1, 1), (1, i, -1, -i), (1, -1, 1, -1), (1, -i, -1, i)) \qquad [2.14]$$


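Again, using relationship [2.12], we can determine the equivalents of formulas [2.9] or [2.4], [2.10] or [2.5] and [2.11] or [2.6] for two arbitrary elements $z, w \in \ell^2(\mathbb{Z}_N)$:

– decomposition on the orthogonal Fourier basis:

$$z = N \sum_{m=0}^{N-1} \langle z, F_m\rangle F_m$$

– Parseval’s identity for the orthogonal Fourier basis:

$$\langle z, w\rangle = N \sum_{m=0}^{N-1} \langle z, F_m\rangle \langle F_m, w\rangle$$

– Plancherel’s theorem for the orthogonal Fourier basis:

$$\|z\|^2 = N \sum_{m=0}^{N-1} |\langle z, F_m\rangle|^2$$

Table 2.1 supplies a helpful summary of the differences between these bases and formulas, where:

$$E_m(n) = e^{2\pi i \frac{mn}{N}}, \qquad \mathcal{E}_m(n) = \frac{1}{\sqrt{N}} e^{2\pi i \frac{mn}{N}}, \qquad F_m(n) = \frac{1}{N} e^{2\pi i \frac{mn}{N}}$$

| Basis | Decomposition | Parseval’s identity | Plancherel’s theorem |
|---|---|---|---|
| $E$ | $z = \sum_{m=0}^{N-1} \frac{\langle z, E_m\rangle}{N} E_m$ | $\langle z, w\rangle = \sum_{m=0}^{N-1} \frac{\langle z, E_m\rangle \langle E_m, w\rangle}{N}$ | $\lVert z\rVert^2 = \sum_{m=0}^{N-1} \frac{\lvert\langle z, E_m\rangle\rvert^2}{N}$ |
| $\mathcal{E}$ | $z = \sum_{m=0}^{N-1} \langle z, \mathcal{E}_m\rangle \mathcal{E}_m$ | $\langle z, w\rangle = \sum_{m=0}^{N-1} \langle z, \mathcal{E}_m\rangle \langle \mathcal{E}_m, w\rangle$ | $\lVert z\rVert^2 = \sum_{m=0}^{N-1} \lvert\langle z, \mathcal{E}_m\rangle\rvert^2$ |
| $F$ | $z = N \sum_{m=0}^{N-1} \langle z, F_m\rangle F_m$ | $\langle z, w\rangle = N \sum_{m=0}^{N-1} \langle z, F_m\rangle \langle F_m, w\rangle$ | $\lVert z\rVert^2 = N \sum_{m=0}^{N-1} \lvert\langle z, F_m\rangle\rvert^2$ |

Table 2.1. Different normalizations of Fourier bases and relative formulas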
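The three rows of Table 2.1 can be compared on a small worked case. The following sketch, assuming NumPy, takes $N = 2$ and the arbitrarily chosen sequence $z = (3, 1)$; the dictionary `bases` pairs each basis with the prefactor that appears in front of its sums ($1/N$, $1$ and $N$ respectively).

```python
import numpy as np

N = 2
z = np.array([3.0, 1.0])
E = np.array([[1.0, 1.0], [1.0, -1.0]])   # basis of complex exponentials [2.2]

# Each entry: (matrix whose rows are the basis sequences, prefactor of the sums).
bases = {
    "E": (E, 1 / N),
    "E_orthonormal": (E / np.sqrt(N), 1.0),
    "F": (E / N, float(N)),
}

results = {}
for name, (B, factor) in bases.items():
    coeffs = B.conj() @ z                                # coeffs[m] = <z, B_m>
    reconstruction = factor * coeffs @ B                 # decomposition column
    energy = factor * np.sum(np.abs(coeffs) ** 2)        # Plancherel column
    results[name] = (reconstruction, energy)
```

In all three cases the reconstruction returns $(3, 1)$ and the energy equals $\|z\|^2 = 10$; only the inner products and the prefactor change from one normalization to the other.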

2.4. Fourier coefficients and the discrete Fourier transform

The definition of the DFT varies from author to author and from application to application. The two most widespread definitions use the orthonormal basis $\mathcal{E}$ and a blend of the orthogonal bases $E$ and $F$. These two versions are useful for different reasons:


– using the orthonormal basis $\mathcal{E}$ allows us to obtain unitary operators;
– using a blend of the orthogonal bases $E$ and $F$ makes it possible to simplify many formulas, including the convolution formula, widely used in applications, which will be discussed later.

For the purposes of this book, we shall use formulas obtained by a blend of the orthogonal bases $E$ and $F$. This decision was made for reasons of coherency with various mathematical programs, notably MATLAB.

First, let us reconsider the following decomposition:

$$z = \sum_{m=0}^{N-1} \frac{\langle z, E_m\rangle}{N} E_m = \sum_{m=0}^{N-1} \langle z, E_m\rangle \frac{E_m}{N}$$

However, $E_m / N = F_m$, so:

$$z = \sum_{m=0}^{N-1} \langle z, E_m\rangle F_m$$

that is, any given element $z \in \ell^2(\mathbb{Z}_N)$ can be decomposed over the orthogonal Fourier basis $F$ with the components given by the inner products of $z$ with the elements of the basis $E$. Using the definition of the inner product of $\ell^2(\mathbb{Z}_N)$, we can write:

$$\langle z, E_m\rangle = \sum_{n=0}^{N-1} z(n)\overline{E_m(n)} = \sum_{n=0}^{N-1} z(n)\overline{e^{2\pi i \frac{mn}{N}}} = \sum_{n=0}^{N-1} z(n) e^{-2\pi i \frac{mn}{N}}$$

DEFINITION 2.4.– Given any $z \in \ell^2(\mathbb{Z}_N)$, the complex numbers $\langle z, E_m\rangle$, $m \in \{0, 1, \ldots, N - 1\}$, are known as the Fourier coefficients of $z$, denoted $\hat{z}(m)$. Explicitly:

$$\hat{z}(m) = \sum_{n=0}^{N-1} z(n) e^{-2\pi i \frac{mn}{N}} \qquad \text{(Fourier coefficients of } z\text{)} \qquad [2.15]$$

The sequence of Fourier coefficients of $z$ is written $\hat{z} \in \ell^2(\mathbb{Z}_N)$:

$$\hat{z} = (\hat{z}(0), \hat{z}(1), \hat{z}(2), \ldots, \hat{z}(N - 1)) \qquad [2.16]$$

The linear operator which transforms $z \in \ell^2(\mathbb{Z}_N)$ into the sequence $\hat{z} \in \ell^2(\mathbb{Z}_N)$ of its Fourier coefficients, that is:

$$\mathrm{DFT} \equiv \hat{\ } \equiv \mathcal{F} : \ell^2(\mathbb{Z}_N) \longrightarrow \ell^2(\mathbb{Z}_N), \qquad z \longmapsto \mathrm{DFT}(z) \equiv \hat{z} \equiv \mathcal{F}(z)$$

$$\hat{z}(m) = \sum_{n=0}^{N-1} z(n) e^{-2\pi i \frac{mn}{N}} \qquad \forall m \in \{0, 1, \ldots, N - 1\}$$

is known as the discrete Fourier transform, or DFT.

It is important to note that the variable of $z$ is $n$, while the variable of $\hat{z}$ is $m$. The interpretation of $n$ and $m$ in the context of signal theory will be given in section 2.6; for now, note simply that $n$ is the discrete value of an instant in time (or a position in space) at which a signal $z$ is measured, whereas $m$ is proportional to the oscillation frequency of a wave (harmonic) and is a multiple of a fundamental frequency. The DFT is used to translate a description of a signal in terms of temporal (or spatial) samples into a description in terms of signal frequencies. This notion will be formalized in section 2.6.

Using the definitions given above, the decomposition of $z$ may be written as follows:

$$z = \sum_{m=0}^{N-1} \hat{z}(m) F_m \qquad [2.17]$$

that is, the Fourier coefficients of $z$ are the components of $z$ in the orthogonal Fourier basis $F$:

$$\hat{z} = [z]_F \qquad [2.18]$$

Using the notation introduced above, the decomposition theorem, Parseval’s identity and Plancherel’s theorem may be rewritten as:

– decomposition of $z$ on the orthogonal Fourier basis:

$$z(n) = \frac{1}{N} \sum_{m=0}^{N-1} \hat{z}(m) e^{2\pi i \frac{mn}{N}} \qquad \forall n = 0, 1, \ldots, N - 1 \qquad [2.19]$$

– Parseval’s identity:

$$\langle z, w\rangle = \frac{1}{N} \sum_{m=0}^{N-1} \hat{z}(m)\overline{\hat{w}(m)} = \frac{1}{N} \langle \hat{z}, \hat{w}\rangle \qquad [2.20]$$

– Plancherel’s theorem:

$$\|z\|^2 = \frac{1}{N} \sum_{m=0}^{N-1} |\hat{z}(m)|^2 = \frac{1}{N} \|\hat{z}\|^2 \qquad [2.21]$$


2.4.1. The inverse discrete Fourier transform

It is interesting to compare formulas [2.15] and [2.19]:

$$\hat{z}(m) = \sum_{n=0}^{N-1} z(n) e^{-2\pi i \frac{mn}{N}}, \qquad z(n) = \frac{1}{N} \sum_{m=0}^{N-1} \hat{z}(m) e^{2\pi i \frac{mn}{N}} \qquad \forall n, m \in \{0, 1, \ldots, N - 1\}$$

The first relationship states that, given the values of $z(n)$, the values of $\hat{z}(m)$ can be calculated using formula [2.15]. The second relationship states that, given the values of $\hat{z}(m)$, the values of $z(n)$ can be reconstructed using formula [2.19]. There is thus a “duality” between the two formulas: it is possible to obtain the sequence $z$ from the sequence $\hat{z}$ and vice versa using relationships [2.15] and [2.19]. This duality is formalized in Definition 2.5 and Theorem 2.2.

DEFINITION 2.5.– The linear operator:

$$\mathrm{IDFT} \equiv \check{\ } \equiv \mathcal{F}^{-1} : \ell^2(\mathbb{Z}_N) \longrightarrow \ell^2(\mathbb{Z}_N), \qquad z \longmapsto \mathrm{IDFT}(z) \equiv \check{z} \equiv \mathcal{F}^{-1}(z)$$

$$\check{z}(n) = \frac{1}{N} \sum_{m=0}^{N-1} z(m) e^{2\pi i \frac{mn}{N}} \qquad \forall n \in \{0, 1, \ldots, N - 1\}$$

is known as the inverse discrete Fourier transform, or IDFT.

THEOREM 2.2.– The IDFT is the inverse linear operator of the DFT and vice versa:

$$\mathrm{IDFT} = \mathrm{DFT}^{-1}, \qquad \mathrm{DFT} = \mathrm{IDFT}^{-1}$$

or, in other terms:

$$\check{\hat{z}} = z, \qquad \hat{\check{z}} = z \qquad \forall z \in \ell^2(\mathbb{Z}_N)$$

PROOF.– We wish to prove that the composition of the DFT with the IDFT, and of the IDFT with the DFT, gives the identity operator $\mathrm{id}$: $\mathrm{DFT} \circ \mathrm{IDFT} = \mathrm{IDFT} \circ \mathrm{DFT} = \mathrm{id}$, where $\mathrm{id}(z) = z$, $\forall z \in \ell^2(\mathbb{Z}_N)$. We start by verifying that, given an arbitrary sequence $z \in \ell^2(\mathbb{Z}_N)$ and applying the DFT to obtain the sequence of Fourier coefficients $\hat{z} \in \ell^2(\mathbb{Z}_N)$, it is possible to obtain the original sequence by applying the IDFT:

$$\ell^2(\mathbb{Z}_N) \xrightarrow{\ \mathrm{DFT}\ } \ell^2(\mathbb{Z}_N) \xrightarrow{\ \mathrm{IDFT}\ } \ell^2(\mathbb{Z}_N), \qquad z \longmapsto \hat{z} \longmapsto \check{\hat{z}} = z$$


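Before writing the composition, it is important to note that the summation index, the symbol of which is unimportant, should not be confused with the fixed variables $n$, $m$ in $\check{z}(n)$ and $\hat{z}(m)$. To avoid this problem, we will use the neutral symbol $j$.

$$\begin{aligned}
\check{\hat{z}}(n) &= \frac{1}{N} \sum_{m=0}^{N-1} \hat{z}(m) e^{2\pi i \frac{mn}{N}} = \frac{1}{N} \sum_{m=0}^{N-1} \left(\sum_{j=0}^{N-1} z(j) e^{-2\pi i \frac{mj}{N}}\right) e^{2\pi i \frac{mn}{N}} \\
&= \frac{1}{N} \sum_{m=0}^{N-1} \sum_{j=0}^{N-1} z(j) e^{2\pi i m \frac{n-j}{N}} \\
&= \frac{1}{N} \sum_{j=0}^{N-1} z(j) \left(\sum_{m=0}^{N-1} e^{2\pi i m \frac{n-j}{N}}\right) \\
&= \frac{1}{N} \sum_{j=0}^{N-1} z(j) N\delta_{j,n} \qquad (\text{Lemma 2.1}) \\
&= z(n) \qquad \forall n \in \{0, 1, \ldots, N - 1\}
\end{aligned}$$

Now, let us verify that the inverse composition produces the same identity:

$$\ell^2(\mathbb{Z}_N) \xrightarrow{\ \mathrm{IDFT}\ } \ell^2(\mathbb{Z}_N) \xrightarrow{\ \mathrm{DFT}\ } \ell^2(\mathbb{Z}_N), \qquad z \longmapsto \check{z} \longmapsto \hat{\check{z}} = z$$

$$\begin{aligned}
\hat{\check{z}}(m) &= \sum_{n=0}^{N-1} \check{z}(n) e^{-2\pi i \frac{mn}{N}} = \sum_{n=0}^{N-1} \left(\frac{1}{N} \sum_{j=0}^{N-1} z(j) e^{2\pi i \frac{jn}{N}}\right) e^{-2\pi i \frac{mn}{N}} \\
&= \frac{1}{N} \sum_{n=0}^{N-1} \sum_{j=0}^{N-1} z(j) e^{2\pi i n \frac{j-m}{N}} \\
&= \frac{1}{N} \sum_{j=0}^{N-1} z(j) \left(\sum_{n=0}^{N-1} e^{2\pi i n \frac{j-m}{N}}\right) \\
&= \frac{1}{N} \sum_{j=0}^{N-1} z(j) N\delta_{j,m} \qquad (\text{Lemma 2.1}) \\
&= z(m) \qquad \forall m \in \{0, 1, \ldots, N - 1\}
\end{aligned}$$

Thus, $\check{\hat{z}}(n) = z(n)$ and $\hat{\check{z}}(m) = z(m)$, $\forall n, m \in \{0, 1, \ldots, N - 1\}$, which concludes our proof. □

Note the similarity between the DFT and the IDFT: the only differences are the coefficient $1/N$ and the sign of the complex exponential. We wish to draw the reader’s attention to the formulas demonstrated above:

$$\check{\hat{z}}(n) = \frac{1}{N} \sum_{m=0}^{N-1} \hat{z}(m) e^{2\pi i \frac{mn}{N}} = z(n) \qquad \forall n \in \mathbb{Z}_N$$

$$\hat{\check{z}}(m) = \sum_{n=0}^{N-1} \check{z}(n) e^{-2\pi i \frac{mn}{N}} = z(m) \qquad \forall m \in \mathbb{Z}_N$$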
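The two compositions of Theorem 2.2 can be checked numerically. In this sketch, which assumes NumPy, the functions `dft` and `idft` are illustrative implementations of the DFT of [2.15] and of the IDFT of Definition 2.5; the random test sequence is arbitrary.

```python
import numpy as np

def dft(z):
    """DFT: z_hat(m) = sum_n z(n) exp(-2 pi i m n / N)."""
    z = np.asarray(z, dtype=complex)
    N = len(z)
    k = np.arange(N)
    return np.exp(-2j * np.pi * np.outer(k, k) / N) @ z

def idft(z):
    """IDFT: z_check(n) = (1/N) sum_m z(m) exp(2 pi i m n / N)."""
    z = np.asarray(z, dtype=complex)
    N = len(z)
    k = np.arange(N)
    return np.exp(2j * np.pi * np.outer(k, k) / N) @ z / N

rng = np.random.default_rng(0)
z = rng.normal(size=7) + 1j * rng.normal(size=7)

round_trip_1 = idft(dft(z))   # IDFT after DFT
round_trip_2 = dft(idft(z))   # DFT after IDFT
```

The two functions differ only in the sign of the exponent and the factor $1/N$, as noted after the proof.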

DEFINITION 2.6.– The pair $(z, \hat z) \in \ell^2(\mathbb{Z}_N) \times \ell^2(\mathbb{Z}_N)$ is known as a Fourier pair.

2.4.2. Definition of the DFT and the IDFT with the orthonormal Fourier basis

An alternative definition of Fourier coefficients, the DFT and the IDFT, more commonly found in a theoretical mathematical context, uses the orthonormal Fourier basis $E$:

– $z, w \in \ell^2(\mathbb{Z}_N)$;

– Fourier coefficients:
$$\hat z(m) = \frac{1}{\sqrt N} \sum_{n=0}^{N-1} z(n)\, e^{-2\pi i \frac{mn}{N}} \qquad [2.22]$$
The notation $\hat z$ in the following formulas in this list (and only these formulas) refers to the Fourier coefficients above.

– decomposition on the orthonormal Fourier basis:
$$z(m) = \frac{1}{\sqrt N} \sum_{n=0}^{N-1} \hat z(n)\, e^{2\pi i \frac{mn}{N}}$$

– DFT:
$$\hat z(m) = \frac{1}{\sqrt N} \sum_{n=0}^{N-1} z(n)\, e^{-2\pi i \frac{mn}{N}} \qquad \forall m \in \{0, 1, \dots, N-1\}$$

– IDFT:
$$\check z(n) = \frac{1}{\sqrt N} \sum_{m=0}^{N-1} z(m)\, e^{2\pi i \frac{mn}{N}} \qquad \forall n \in \{0, 1, \dots, N-1\}$$

– Parseval's identity:
$$\langle z, w \rangle = \sum_{m=0}^{N-1} \hat z(m)\, \overline{\hat w(m)} = \langle \hat z, \hat w \rangle$$

– Plancherel's theorem:
$$\|z\|^2 = \sum_{m=0}^{N-1} |\hat z(m)|^2 = \|\hat z\|^2$$

Box 2.1. Discrete orthonormal Fourier analysis
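Parseval's identity and Plancherel's theorem from Box 2.1 can be tested on sample data. This is a minimal sketch, assuming the inner product $\langle z, w \rangle = \sum_n z(n)\overline{w(n)}$ (conjugation on the second argument); the names `dft_orthonormal` and `inner` and the two test sequences are hypothetical.

```python
import cmath
import math

def dft_orthonormal(z):
    """Orthonormal DFT of [2.22]: the unnormalized sum divided by sqrt(N)."""
    N = len(z)
    return [sum(z[n] * cmath.exp(-2j * cmath.pi * m * n / N) for n in range(N)) / math.sqrt(N)
            for m in range(N)]

def inner(z, w):
    """Inner product <z, w> = sum_n z(n) * conj(w(n)) (assumed convention)."""
    return sum(complex(a) * complex(b).conjugate() for a, b in zip(z, w))

z = [2, 3 - 1j, 2j, 4 + 1j, 0, 1]     # arbitrary test sequences in l^2(Z_6)
w = [1, 1j, -1, 2, 0.5, -3j]
z_hat, w_hat = dft_orthonormal(z), dft_orthonormal(w)

parseval_gap = abs(inner(z, w) - inner(z_hat, w_hat))      # <z, w> versus <z_hat, w_hat>
plancherel_gap = abs(inner(z, z) - inner(z_hat, z_hat))    # ||z||^2 versus ||z_hat||^2
print(parseval_gap, plancherel_gap)
```

With the unnormalized DFT the same comparison would differ by a factor $N$, which is one way to see why the orthonormal convention is the one that yields unitary matrices.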


As we can see, the greatest advantage of using the orthonormal Fourier basis in defining the objects used in Fourier analysis is that the DFT and the IDFT are operators which conserve the inner product, and consequently the norm; they are therefore represented using unitary matrices. We also see that, independently of the definition used, the product of the normalizing coefficients in the definitions of $\hat z$ and $\check z$ must always be equal to $1/N$ to guarantee that $\mathrm{IDFT} = \mathrm{DFT}^{-1}$.

2.4.3. The real (orthonormal) Fourier basis

The Fourier basis and DFT can be written using real notation. The advantage of a real DFT is that, if $z$ is real, we can avoid the need to introduce imaginary components. For simplicity's sake, we shall focus on the orthonormal Fourier basis.

First, we must determine whether $N$ is even or odd. Let us begin with the case where $N$ is even: $N = 2M$, $M \in \mathbb{N}$, $M \geq 1$. In this case, $\forall n = 0, 1, \dots, N-1$, we write:

$$\begin{cases}
c_0(n) = \dfrac{1}{\sqrt N} \\[2mm]
c_m(n) = \sqrt{\dfrac{2}{N}} \cos\left( \dfrac{2\pi m n}{N} \right) & m = 1, 2, \dots, M-1 \\[2mm]
c_M(n) = \dfrac{1}{\sqrt N} \cos\left( \dfrac{2\pi}{N} \dfrac{N}{2}\, n \right) = \dfrac{(-1)^n}{\sqrt N} \\[2mm]
s_m(n) = \sqrt{\dfrac{2}{N}} \sin\left( \dfrac{2\pi m n}{N} \right) & m = 1, 2, \dots, M-1
\end{cases}$$

If $N = 2M - 1$ is odd, $c_0$, $c_m$ and $s_m$ are defined in the same way as above, but $m = N/2$ should not be considered, as in this case $N/2$ is not an integer.

THEOREM 2.3.– The set $\{c_0, c_1, \dots, c_{M-1}, c_M, s_1, \dots, s_{M-1}\}$, when $N = 2M$, or the set $\{c_0, c_1, \dots, c_{M-1}, s_1, \dots, s_{M-1}\}$, when $N = 2M - 1$, is an orthonormal basis of $\ell^2(\mathbb{Z}_N)$; in both cases the set contains exactly $N$ sequences. Thus, for all $z \in \ell^2(\mathbb{Z}_N)$:

$$z = \sum_{m=0}^{M} \langle z, c_m \rangle c_m + \sum_{m=1}^{M-1} \langle z, s_m \rangle s_m \qquad (N = 2M)$$

$$z = \sum_{m=0}^{M-1} \langle z, c_m \rangle c_m + \sum_{m=1}^{M-1} \langle z, s_m \rangle s_m \qquad (N = 2M - 1)$$

DEFINITION 2.7.– The real orthonormal Fourier basis of $\ell^2(\mathbb{Z}_N)$ is the set of sequences $\{c_0, c_1, \dots, c_{M-1}, c_M, s_1, \dots, s_{M-1}\}$ when $N = 2M$, or the set of sequences $\{c_0, c_1, \dots, c_{M-1}, s_1, \dots, s_{M-1}\}$ when $N = 2M - 1$.
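Orthonormality of the real Fourier basis can be verified by computing its Gram matrix. The sketch below is illustrative only: `real_fourier_basis` and `gram_error` are hypothetical names, and the index bookkeeping is an assumption of the sketch. For even $N$ it takes $m = 1, \dots, N/2 - 1$ plus the alternating sequence $c_{N/2}$; for odd $N$ it takes $m = 1, \dots, (N-1)/2$, so that exactly $N$ sequences are produced in both cases.

```python
import math

def real_fourier_basis(N):
    """Real orthonormal Fourier basis of l^2(Z_N), as a list of N real sequences."""
    half = N // 2
    last = half - 1 if N % 2 == 0 else half      # largest m having both c_m and s_m
    basis = [[1 / math.sqrt(N)] * N]                                     # c_0
    for m in range(1, last + 1):                                          # c_1, ..., c_last
        basis.append([math.sqrt(2 / N) * math.cos(2 * math.pi * m * n / N) for n in range(N)])
    if N % 2 == 0:                                                        # c_{N/2}, even N only
        basis.append([(-1) ** n / math.sqrt(N) for n in range(N)])
    for m in range(1, last + 1):                                          # s_1, ..., s_last
        basis.append([math.sqrt(2 / N) * math.sin(2 * math.pi * m * n / N) for n in range(N)])
    return basis

def gram_error(N):
    """Largest deviation of the Gram matrix of the basis from the identity matrix."""
    basis = real_fourier_basis(N)
    worst = 0.0
    for i, b in enumerate(basis):
        for j, c in enumerate(basis):
            dot = sum(x * y for x, y in zip(b, c))
            worst = max(worst, abs(dot - (1.0 if i == j else 0.0)))
    return worst

print(len(real_fourier_basis(8)), gram_error(8))   # even case
print(len(real_fourier_basis(7)), gram_error(7))   # odd case
```

A Gram error at rounding level, together with a count of $N$ sequences, is enough to conclude that the family is an orthonormal basis of the $N$-dimensional space.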


The relationship with the Fourier coefficients is obtained using the following formulas:

$$\begin{cases}
\langle z, c_0 \rangle = \dfrac{\hat z(0)}{\sqrt N} \\[2mm]
\langle z, c_M \rangle = \dfrac{\hat z(M)}{\sqrt N} \\[2mm]
\langle z, c_m \rangle = \dfrac{1}{\sqrt{2N}} \left( \hat z(m) + \hat z(N-m) \right), & m = 1, 2, \dots, M-1 \\[2mm]
\langle z, s_m \rangle = \dfrac{i}{\sqrt{2N}} \left( \hat z(m) - \hat z(N-m) \right), & m = 1, 2, \dots, M-1 \\[2mm]
\hat z(0) = \sqrt N\, \langle z, c_0 \rangle \\[1mm]
\hat z(M) = \sqrt N\, \langle z, c_M \rangle \\[1mm]
\hat z(m) = \sqrt{N/2} \left( \langle z, c_m \rangle - i \langle z, s_m \rangle \right), & m = 1, 2, \dots, M-1 \\[1mm]
\hat z(m) = \sqrt{N/2} \left( \langle z, c_{N-m} \rangle + i \langle z, s_{N-m} \rangle \right), & m = M+1, M+2, \dots, N-1
\end{cases}$$

2.5. Matrix interpretation of the DFT and the IDFT

By definition, the DFT transforms sequences of $\ell^2(\mathbb{Z}_N)$ represented in the canonical basis $B$ of $\ell^2(\mathbb{Z}_N)$ into sequences of $\ell^2(\mathbb{Z}_N)$ represented in the orthogonal Fourier basis $F$ of $\ell^2(\mathbb{Z}_N)$ [2.17]:

$$\mathrm{DFT} : \ell^2(\mathbb{Z}_N) \longrightarrow \ell^2(\mathbb{Z}_N), \qquad z = [z]_B \longmapsto \mathrm{DFT}(z) = \hat z = [z]_F$$

The DFT is thus the operator which performs the change from the canonical basis $B$ of $\ell^2(\mathbb{Z}_N)$ to the Fourier basis $F$ of $\ell^2(\mathbb{Z}_N)$, and, consequently, the IDFT is the inverse operator.

We wish to establish a matrix representation of these two linear operators DFT and IDFT. To do this, we shall use a notation which is widely used in the literature concerning the DFT: $\omega_N = e^{-\frac{2\pi i}{N}}$. Using the properties of complex exponentials, we can write:

$$\omega_N^{mn} = e^{-2\pi i \frac{mn}{N}}$$

and the Fourier coefficients can thus be written as:

$$\hat z(m) = \sum_{n=0}^{N-1} z(n)\, e^{-2\pi i \frac{mn}{N}} = \sum_{n=0}^{N-1} z(n)\, \omega_N^{mn}$$

We define the matrix $W_N$ containing the elements $\omega_N^{mn}$:

$$w_{mn} = \omega_N^{mn}$$


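that is, explicitly:

$$W_N = \begin{pmatrix}
1 & 1 & 1 & 1 & \dots & 1 \\
1 & \omega_N & \omega_N^2 & \omega_N^3 & \dots & \omega_N^{N-1} \\
1 & \omega_N^2 & \omega_N^4 & \omega_N^6 & \dots & \omega_N^{2(N-1)} \\
1 & \omega_N^3 & \omega_N^6 & \omega_N^9 & \dots & \omega_N^{3(N-1)} \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
1 & \omega_N^{N-1} & \omega_N^{2(N-1)} & \omega_N^{3(N-1)} & \dots & \omega_N^{(N-1)(N-1)}
\end{pmatrix} \qquad [2.23]$$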

This $N \times N$ matrix is called the Sylvester matrix¹. It is symmetric: $W_N = W_N^t$, i.e. $w_{mn} = w_{nm}$ (an obvious consequence of the definition of $w_{mn}$, since $\omega_N^{mn} = \omega_N^{nm}$), and each row or column is a geometric progression² of a power of $\omega_N$. A matrix of this type is known as a Vandermonde matrix³.

By convention, when considering $W_N$, the row and column indices run from $0$ to $N-1$ (instead of the canonical range from $1$ to $N$). This convention is the reason why all elements in the first row ($m = 0$) and all elements in the first column ($n = 0$) are equal to 1.

If we apply $W_N$ to $z$ considered as a column vector in $\mathbb{C}^N$, then, by the definition of the matrix product, we obtain a vector $W_N z$ whose $m$-th component $(W_N z)(m)$⁴ is given by:

$$(W_N z)(m) = \sum_{n=0}^{N-1} w_{mn}\, z(n) = \sum_{n=0}^{N-1} z(n)\, e^{-2\pi i \frac{mn}{N}} = \hat z(m), \qquad \forall m \in \mathbb{Z}_N,$$

thus:

$$\hat z = W_N z \qquad \forall z \in \ell^2(\mathbb{Z}_N)$$

Using the same approach, we can verify that the IDFT is implemented via the conjugate matrix of $W_N$ normalized by the coefficient $1/N$ (transposition is not required as $W_N$ is symmetric):

$$W_N^{-1} = \frac{1}{N}\, \overline{W_N}, \qquad \check z = W_N^{-1} z \qquad \forall z \in \ell^2(\mathbb{Z}_N)$$

¹ James Joseph Sylvester (1814, London-1897, London).
² A geometric progression of common ratio $r$ is the sequence of powers $1 = r^0, r = r^1, r^2, r^3, \dots, r^n$.
³ Alexandre-Théophile Vandermonde (1735, Paris-1796, Paris).
⁴ This is the real Euclidean product of the $m$-th row of $W_N$, i.e. $(w_{m0}, w_{m1}, \dots, w_{m(N-1)})$, times the components $(z(0), z(1), \dots, z(N-1))$ of $z$.


$W_N$ is the change of basis matrix used to go from $B$ to $F$, and $W_N^{-1} = \frac{1}{N}\, \overline{W_N}$ is the change of basis matrix used to go from $F$ to $B$.

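OBSERVATIONS.– Using the definition of the DFT corresponding to equation [2.22], that is using the orthonormal Fourier basis, the associated matrix becomes $\widetilde W_N = W_N / \sqrt N$. This is a unitary matrix, and thus its inverse matrix is $\overline{\widetilde W_N}$ (again, transposition is not required because the matrix is symmetric).

Examples:

– $N = 2$: $\omega_2 = e^{-2\pi i/2} = e^{-i\pi} = \cos(\pi) - i \sin(\pi) = -1$, thus:

$$W_2 = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$$

hence:

$$W_2^{-1} = \frac{1}{2} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$$

– $N = 4$: $\omega_4 = e^{-2\pi i/4} = e^{-i\pi/2} = \cos(\pi/2) - i \sin(\pi/2) = -i$, thus:

$$W_4 = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & -i & (-i)^2 & (-i)^3 \\ 1 & (-i)^2 & (-i)^4 & (-i)^6 \\ 1 & (-i)^3 & (-i)^6 & (-i)^9 \end{pmatrix}$$

hence:

$$W_4 = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & -i & -1 & i \\ 1 & -1 & 1 & -1 \\ 1 & i & -1 & -i \end{pmatrix} \qquad [2.24]$$

The inverse matrix is:

$$W_4^{-1} = \frac{1}{4} \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & i & -1 & -i \\ 1 & -1 & 1 & -1 \\ 1 & -i & -1 & i \end{pmatrix} \qquad [2.25]$$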

Note that the columns of the matrix $W_4^{-1}$ consist, up to the factor $1/4$, of the elements of the orthogonal basis $F$ of $\ell^2(\mathbb{Z}_4)$, as seen in formula [2.14]; this is coherent with the fact that this is the matrix used to change from the orthogonal basis $F$ to the canonical basis of $\ell^2(\mathbb{Z}_4)$. $\square$
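The matrix statements of this section can be reproduced with a few lines of code. The following sketch builds $W_N$ from $w_{mn} = e^{-2\pi i mn/N}$, forms the candidate inverse $\frac{1}{N}\overline{W_N}$ and multiplies the two; the helper names (`sylvester_matrix`, `inverse_sylvester_matrix`, `matmul`) are hypothetical and plain nested lists are used in place of a matrix library.

```python
import cmath

def sylvester_matrix(N):
    """W_N with entries w_mn = exp(-2*pi*i*m*n/N); rows and columns indexed from 0."""
    return [[cmath.exp(-2j * cmath.pi * m * n / N) for n in range(N)] for m in range(N)]

def inverse_sylvester_matrix(N):
    """Candidate inverse: entrywise conjugate of W_N divided by N."""
    return [[entry.conjugate() / N for entry in row] for row in sylvester_matrix(N)]

def matmul(A, B):
    """Plain matrix product of two matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

N = 4
product = matmul(sylvester_matrix(N), inverse_sylvester_matrix(N))
identity_gap = max(abs(product[i][j] - (1 if i == j else 0))
                   for i in range(N) for j in range(N))

# Entries of W_4 as listed in [2.24]
W4_expected = [[1, 1, 1, 1], [1, -1j, -1, 1j], [1, -1, 1, -1], [1, 1j, -1, -1j]]
w4_gap = max(abs(sylvester_matrix(4)[m][n] - W4_expected[m][n])
             for m in range(4) for n in range(4))
print(identity_gap, w4_gap)
```

Forming the full matrix costs $N^2$ entries, so this construction is useful for checking small cases rather than for transforming long signals.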


2.5.1. The fast Fourier transform

As we have seen, the action of the DFT on a signal $z \in \ell^2(\mathbb{Z}_N)$ can be represented as a matrix product. We must therefore calculate $N$ multiplications for each element $\hat z(m)$ in the sequence $\hat z \in \ell^2(\mathbb{Z}_N)$. Since $\hat z$ has $N$ components, the complexity of the algorithm used to calculate the DFT is $O(N^2)$.

This complexity means that the DFT is extremely time-consuming when working with signals of large dimension. In practice, the Fourier transform was almost never used outside of a theoretical context (that is, in real-world applications) before the 1960s.

A breakthrough came in 1965, when Cooley and Tukey used symmetries concealed within the DFT to construct a fast algorithm for calculating the DFT: this algorithm is known as the fast Fourier transform (FFT). The complexity of the FFT is of the order of $O(N \log N)$, and, using modern computers, it allows the Fourier transform of large dimension signals to be calculated in under a second.

The FFT is extremely efficient in cases where the signal dimension is a power of 2. This is the reason why a 512 or 1,024 format is typically used for digital images, enabling rapid and efficient processing using the FFT.

The development of the FFT is considered as one of the greatest scientific breakthroughs of the 20th century, as it enables the use of Fourier transforms in a vast array of practical applications.

2.6. The Fourier transform in signal processing

Fourier theory has applications in a wide range of domains, for example in solving ordinary and partial differential equations, classical and quantum physics, statistics and probability, and signal processing. In this section, we shall highlight the crucial role of Fourier theory in signal processing in one dimension (1D).

2.6.1. Synthesis formula for 1D signals: decomposition on the harmonic basis

A discrete 1D signal of dimension $N$ may be defined as the set of $N$ samples of a variable, which may depend on time, on a spatial dimension ($x$, $y$ or $z$), or on another parameter with a single degree of freedom.


Two notable examples of discrete 1D signals, dependent on time or a single spatial dimension, are:

– the set of intensity values for a piece of music, sampled at $N$ different moments in time;

– the set of grayscale values of a line or column in a simple image, corresponding to $N$ different positions.

A discrete 1D signal can be processed with Fourier theory using the following basic identifications:

– the abstract mathematical representation of a discrete 1D signal is given by a sequence $z \in \ell^2(\mathbb{Z}_N)$;

– $n \in \mathbb{Z}_N = \{0, 1, \dots, N-1\}$ represents the value of the parameter (time, spatial dimension, etc.) according to which the signal is sampled. The unit of measurement used for $n$ is typically the second or meter;

– the energy of the signal $z$ is associated with the square of the norm $\|z\|^2$.

The next step is to interpret the decomposition formula over the Fourier basis, the DFT and the IDFT, and Plancherel's theorem in the context of signal processing.

The interpretation of Plancherel's theorem in this case is simplest: the energy of the signal $z$ is decomposed into the sum of the squared magnitudes of the Fourier coefficients.

The decomposition formula over the Fourier basis, equation [2.19], is known as the synthesis formula in the context of signal processing:

$$z(n) = \frac{1}{N} \sum_{m=0}^{N-1} \hat z(m)\, e^{2\pi i \frac{mn}{N}} \qquad \forall n \in \mathbb{Z}_N$$

Using this formula, the signal $z$ can be reconstructed (or "synthesized") using the Fourier coefficients $\hat z(m)$ and the oscillating functions $e^{2\pi i \frac{mn}{N}}$. The functions used in the signal synthesis operation are:

$$e^{2\pi i \frac{m}{N} n} = \cos\left( 2\pi \frac{m}{N} n \right) + i \sin\left( 2\pi \frac{m}{N} n \right). \qquad [2.26]$$

When $m = 0$, there is no oscillation; from $m = 1$ to $m = N-1$ the functions $e^{2\pi i \frac{m}{N} n}$ oscillate at a certain frequency ($m$ is therefore measured in hertz or rad/s). This will be discussed in detail in section 2.6.4. These functions are known as harmonics, a term derived from the field of music, as we see from Definition 2.8.

DEFINITION 2.8 (harmonics).– The function $n \mapsto e^{2\pi i \frac{1}{N} n}$ is known as a fundamental (discrete) harmonic⁵ and the functions $n \mapsto e^{2\pi i \frac{m}{N} n}$ for $m = 2, \dots, N-1$ are (discrete) harmonics of (higher) order $m$.

2.6.2. Signification of Fourier coefficients and spectrums of a 1D signal

The synthesis formula tells us that the signal $z$ at the value $n$ of its parameter can be reconstructed using a linear combination of harmonic waves whose frequencies are multiples of $1/N$ via the coefficient $m$: $\{0, 1/N, 2/N, \dots, (N-1)/N\}$. The complex scalars of the linear combination are the Fourier coefficients $\hat z(m)$.

Each Fourier coefficient $\hat z(m) \in \mathbb{C}$ may be written as:

$$\hat z(m) = a(m) + i\, b(m) = |\hat z(m)|\, e^{i \operatorname{Arg}(\hat z(m))}$$

where $|\hat z(m)| = \sqrt{a(m)^2 + b(m)^2}$ is the magnitude of the Fourier coefficient $\hat z(m)$ and $\operatorname{Arg}(\hat z(m)) = \arctan\left( \frac{b(m)}{a(m)} \right)$ is its argument.

Evidently, the "weight" which measures the importance of each harmonic $e^{2\pi i \frac{mn}{N}}$ in reconstructing a signal $z$ is the magnitude⁶ of the Fourier coefficient $\hat z(m)$:

$|\hat z(m)|$: measures the importance of the harmonic $e^{2\pi i \frac{mn}{N}}$ in reconstructing $z$.

For this reason, in signal processing, the Fourier coefficient formula is known as the analysis formula:

$$\hat z(m) = \sum_{n=0}^{N-1} z(n)\, e^{-2\pi i \frac{mn}{N}} \qquad \forall m \in \mathbb{Z}_N$$

since $\hat z$ allows us to analyze the frequency components of a signal.

If the discrete signal $z$ is dependent on the time $t$ (or a spatial dimension $x$), then the transformation $z \to \hat z$ obtained using the DFT enables us to go from a temporal (or spatial) representation of the signal to a frequency representation, or the Fourier space.

The Fourier transform is often described as the mathematical equivalent of Newton's prism. Newton's prism breaks down light into "hidden" frequency components corresponding to the colors of the spectrum. The Fourier transform reveals the frequency components which are "hidden" in any signal. This analogy explains the terms used in Definition 2.9.

⁵ It is important to specify that these harmonics are discrete; continuous harmonics are obtained using functions $t \mapsto e^{2\pi i m \nu t} = e^{i m \omega t}$, where $\nu$ is the frequency and $\omega = 2\pi\nu$ the angular frequency.
⁶ The magnitude must be used here due to the fact that complex numbers are not ordered.


DEFINITION 2.9.– Given $z \in \ell^2(\mathbb{Z}_N)$:

– $\{|\hat z(m)|,\ m \in \mathbb{Z}_N\}$ is known as the amplitude spectrum of $z$, or simply the spectrum of $z$;

– $\{|\hat z(m)|^2,\ m \in \mathbb{Z}_N\}$ is the power spectrum of $z$;

– $\{\operatorname{Arg}(\hat z(m)),\ m \in \mathbb{Z}_N\}$ is the phase spectrum of $z$.

The significance of these spectra will be discussed in detail later.

Note the presence of one particularly special Fourier coefficient, $\hat z(0)$, which provides information concerning the average value of $z$:

$$\hat z(0) = \sum_{n=0}^{N-1} z(n)\, e^{-2\pi i \frac{0 \cdot n}{N}} = \sum_{n=0}^{N-1} z(n) = N \langle z \rangle \quad \Longrightarrow \quad \hat z(0) = N \langle z \rangle$$

where $\langle z \rangle = \frac{1}{N} \sum_{n=0}^{N-1} z(n)$ is the average value of the signal $z$.

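Introducing this expression of $\hat z(0)$ into the synthesis formula and separating the first term from the rest of the sum, we obtain:

$$z(n) = \frac{1}{N}\, N \langle z \rangle + \frac{1}{N} \sum_{m=1}^{N-1} \hat z(m)\, e^{2\pi i \frac{mn}{N}}$$

that is:

$$z(n) = \langle z \rangle + \frac{1}{N} \sum_{m=1}^{N-1} \hat z(m)\, e^{2\pi i \frac{mn}{N}}$$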
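The three spectra and the role of $\hat z(0)$ can be illustrated on a short real signal. In this sketch the sample values are arbitrary, the variable names are hypothetical, and the phase is computed with `cmath.phase`, which returns the argument in $(-\pi, \pi]$ and therefore handles all four quadrants (an assumption that goes slightly beyond the arctangent formula).

```python
import cmath

def dft(z):
    """Analysis formula: z_hat(m) = sum_n z(n) * exp(-2*pi*i*m*n/N)."""
    N = len(z)
    return [sum(z[n] * cmath.exp(-2j * cmath.pi * m * n / N) for n in range(N))
            for m in range(N)]

z = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]     # arbitrary sample signal, N = 8
N = len(z)
z_hat = dft(z)

amplitude_spectrum = [abs(c) for c in z_hat]       # |z_hat(m)|
power_spectrum = [abs(c) ** 2 for c in z_hat]      # |z_hat(m)|^2
phase_spectrum = [cmath.phase(c) for c in z_hat]   # Arg(z_hat(m))

mean_value = sum(z) / N
dc_gap = abs(z_hat[0] - N * mean_value)            # z_hat(0) compared with N <z>

# Synthesis formula with the m = 0 term separated from the remaining harmonics
remaining = [sum(z_hat[m] * cmath.exp(2j * cmath.pi * m * n / N) for m in range(1, N)) / N
             for n in range(N)]
synthesis_gap = max(abs(mean_value + remaining[n] - z[n]) for n in range(N))
print(dc_gap, synthesis_gap)
```

Both gaps vanish up to rounding: the coefficient of order zero carries the mean value, and the harmonics of order $1$ to $N-1$ carry the deviations from it.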

The Fourier coefficient $\hat z(0)$ is known as the "DC" component of the synthesis formula, while the other terms constitute the "AC" component. This terminology is taken from the field of electronics, with DC standing for "direct current" (current of frequency zero) and AC standing for "alternating current".

One way of interpreting the formula set out above is to say that $z$ is decomposed into the sum of its mean value and the finer details reconstructed by higher order harmonics, weighted by the Fourier coefficients of $z$.

2.6.3. The synthesis formula and Fourier coefficients of the unit pulse

It is helpful to compare the synthesis formula with formula [2.1], that is:

$$\sum_{n=0}^{N-1} e^{\pm 2\pi i n \frac{j-k}{N}} = N \delta_{j,k} = \begin{cases} N & j = k \\ 0 & j \neq k \end{cases}$$


Rewriting $j - k = m \in \mathbb{Z}_N$, switching $m$ and $n$ (an acceptable substitution, as both are arbitrary values of $\mathbb{Z}_N$) and normalizing by $N$, we obtain the following formula:

$$\frac{1}{N} \sum_{m=0}^{N-1} e^{\pm 2\pi i n \frac{m}{N}} = \begin{cases} 1 & n = 0 \\ 0 & n = 1, \dots, N-1 \end{cases} \; = e_0(n) \equiv \delta_0(n) \equiv \delta(n)$$

$\delta$ is known as the unit pulse. If we select the option "+" in the formula shown above, we obtain the synthesis formula for the unit pulse, in which all Fourier coefficients are equal to 1:

$$\hat\delta_0(m) = 1 = \left| \hat\delta_0(m) \right| \qquad \forall m \in \mathbb{Z}_N$$

This result is particularly informative: the DFT transforms a signal which is completely "localized" at one value of its parameter into a signal which is fully "delocalized" across the spectrum: the harmonics for all frequencies have the same weight when reconstructing the signal.

Let us now calculate the Fourier coefficients of the constant signal $z(n) = \frac{1}{N}$, $\forall n \in \mathbb{Z}_N$. We obtain:

$$\hat z(m) = \sum_{n=0}^{N-1} \frac{1}{N}\, e^{-2\pi i \frac{mn}{N}} = \frac{1}{N} \sum_{n=0}^{N-1} e^{-2\pi i \frac{mn}{N}} = \begin{cases} 1 & m = 0 \\ 0 & m = 1, \dots, N-1 \end{cases} \; = \delta_0(m)$$

We see that the DFT of a constant signal (which is completely delocalized in relation to its parameter) is therefore a unit pulse in the Fourier domain, meaning that it is completely localized in its frequencies.

The generalization of this behavior for spaces which are more complicated than $\ell^2(\mathbb{Z}_N)$ (notably $L^2(\Omega)$, $\Omega \subseteq \mathbb{R}^n$, which we will examine later) forms the basis for understanding the Heisenberg uncertainty principle, the conceptual core of quantum mechanics.

Thanks to the results that we have discussed above, we can give a physical interpretation of the formula [2.1] in Lemma 2.1: the superposition of harmonic functions with frequencies which are integer multiples of one another is subjected to a destructive interference everywhere, except at one value where the harmonics experience a constructive interference.

Moreover, according to the synthesis formula, harmonics must be weighted differently in order to reconstruct any signal which is not a pulse.

2.6.4. High and low frequencies in the synthesis formula

Let us take a closer look at the meaning of the frequency coefficients $m$ in the set

$$\left\{ e^{2\pi i \frac{mn}{N}} = \cos\left( 2\pi \frac{mn}{N} \right) + i \sin\left( 2\pi \frac{mn}{N} \right),\ n = 0, 1, \dots, N-1 \right\},$$


which represents the value of the harmonics at each of the $N$ parameter values $n$. For the sake of simplicity, we shall only consider the real part of the elements of the set above, that is

$$H_m = \left\{ \cos\left( 2\pi \frac{mn}{N} \right),\ n = 0, 1, \dots, N-1 \right\};$$

our remarks concerning the cosine are equally applicable to the sine.

Consider the behavior of $\cos\left( 2\pi \frac{mn}{N} \right)$ when the value of $m$ is between $0$ and $N-1$, where $N$ is even (the case where $N$ is odd will be discussed later):

– $m = 0$: As we have already seen, in this case, there is no oscillation, but simply a series of constant values, $\cos\left( 2\pi \frac{0 \cdot n}{N} \right) = 1$, so: $H_0 = \{1, 1, \dots, 1\}$;

– $m = 1$:

$$H_1 = \left\{ 1, \cos\left( 2\pi \frac{1}{N} \right), \cos\left( 2\pi \frac{2}{N} \right), \dots, \cos\left( 2\pi \frac{N-1}{N} \right) \right\}$$

The values of $H_1$ represent $N$ samples of a cosine oscillation. The cycle does not terminate as we do not consider the value $n = N$, which would give us $\cos\left( 2\pi \frac{N}{N} \right) = \cos(2\pi) = 1$. Figure 2.1 shows the graph of $H_m$ for $m = 1$, $N = 16$;

– $m = 2$:

$$H_2 = \left\{ 1, \cos\left( 2\pi \frac{2}{N} \right), \cos\left( 2\pi \frac{4}{N} \right), \dots, \cos\left( 2\pi \frac{2 \frac{N}{2}}{N} \right) = 1, \dots, \cos\left( 2\pi \frac{2(N-1)}{N} \right) \right\}$$

The values of $H_2$ represent $N$ samples of two cosine oscillations. $n = N/2$ marks the end of a cosine cycle. Figure 2.2 shows the graph of $H_m$ for $m = 2$, $N = 16$. We see that, for $n = 8 = 16/2$, the cosine value is 1.

Increasing $m$ up to $N/2$, the oscillation frequency of the cosine increases. The maximum frequency is reached when $m = N/2$; in this case, $\cos\left( 2\pi \frac{N}{2} \frac{n}{N} \right) = \cos(\pi n)$, thus:

$$H_{\frac{N}{2}} = \{ (-1)^n,\ n = 0, 1, \dots, N-1 \}$$

Figure 2.1. $H_m$ for $m = 1$, $N = 16$

Figure 2.2. $H_m$ for $m = 2$, $N = 16$

We might expect the cosine oscillation frequency to increase up to $N-1$, but this is not the case. In reality, from $m = N/2 + 1$, the cosine oscillation frequency decreases. To understand this behavior, consider $\cos\left( 2\pi \frac{nm}{N} \right)$ when $m$ belongs to the set $\left\{ \frac{N}{2}+1, \frac{N}{2}+2, \dots, N-1 \right\}$, and apply the following change of variable:

$$k = N - m \iff m = N - k, \qquad m \in \left\{ \frac{N}{2}+1, \frac{N}{2}+2, \dots, N-1 \right\} \iff k \in \left\{ \frac{N}{2}-1, \dots, 2, 1 \right\},$$

then, when $m$ increases from $\frac{N}{2}+1$ up to $N-1$, $k$ decreases from $\frac{N}{2}-1$ down to $1$. Applying this variable change to the cosine, we obtain:

$$\cos\left( 2\pi \frac{n(N-k)}{N} \right) = \cos\left( 2\pi n - 2\pi \frac{nk}{N} \right) = \cos\left( -2\pi \frac{nk}{N} \right) = \cos\left( 2\pi \frac{nk}{N} \right),$$

having used the periodicity and parity of the cosine. Consequently:

$$\cos\left( 2\pi \frac{nm}{N} \right),\ m \in \left\{ \frac{N}{2}+1, \frac{N}{2}+2, \dots, N-1 \right\} \iff \cos\left( 2\pi \frac{nk}{N} \right),\ k \in \left\{ \frac{N}{2}-1, \dots, 1 \right\}$$

Thus, the maximum number of harmonic oscillations is obtained when $m = N/2$, and the number of oscillations is symmetrical about this value. For example, the graph of $H_m$ for $m = 9$, $N = 16$ is exactly equal to the graph of $H_m$ with $m = 7$, $N = 16$. Similarly, the graph of $m = 15$, $N = 16$ is exactly equal to the graph in Figure 2.1, representing $H_m$ with $m = 1$, $N = 16$;

– evidently, if $N$ is odd, the considerations set out above are valid for $\left[ \frac{N}{2} \right]$, the integer part of $\frac{N}{2}$, that is the integer closest to, but not greater than, $\frac{N}{2}$.

The elements described above are the reasons for certain choices of terminology:

– high frequencies: values of $m$ close to $\frac{N}{2}$;

– low frequencies: values of $m$ close to $0$ or $N-1$.

If the synthesis formula for a discrete signal $z \in \ell^2(\mathbb{Z}_N)$ includes Fourier coefficients $\hat z(m)$ with a high magnitude for values of $m$ which are close to $N/2$, the signal will be characterized by relatively violent variations (as in the case of high sounds, such as those produced by cymbals). However, if the Fourier coefficients with the highest magnitude correspond to values of $m$ close to $0$ and $N-1$, the signal will be characterized by "gentler" variations (as in the case of low sounds, such as those produced by bass drums).
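The symmetry of the oscillation frequency about $m = N/2$ can be observed directly on the sampled cosines. This is a small sketch with hypothetical helper names (`harmonic_samples`, `mirror_gap`), using $N = 16$ as in the figures.

```python
import math

def harmonic_samples(m, N):
    """H_m: the N samples cos(2*pi*m*n/N), n = 0, ..., N-1."""
    return [math.cos(2 * math.pi * m * n / N) for n in range(N)]

def mirror_gap(m, N):
    """Largest difference between the samples of H_m and those of H_{N-m}."""
    return max(abs(a - b)
               for a, b in zip(harmonic_samples(m, N), harmonic_samples(N - m, N)))

N = 16
worst_mirror_gap = max(mirror_gap(m, N) for m in range(1, N))   # H_m versus H_{N-m}
nyquist_gap = max(abs(h - (-1) ** n)                             # H_{N/2} versus (-1)^n
                  for n, h in enumerate(harmonic_samples(N // 2, N)))
print(mirror_gap(9, N), mirror_gap(15, N), nyquist_gap)
```

The sampled values of $H_9$ and $H_7$, or of $H_{15}$ and $H_1$, agree up to rounding, so the two members of each pair cannot be told apart from their $N$ samples.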


The frequency $m = N/2$ is known as the Nyquist frequency⁷. This is the highest harmonic frequency which can appear in the synthesis formula for $N$ samples of a signal.

2.6.5. Signal filtering in frequency representation

The DFT can be used to easily modify the frequency content of a signal, for example increasing the strength of the lowest or highest frequencies. The standard approach is to move to the Fourier space using the DFT, then adjust the Fourier coefficients as required using a filter $f : \ell^2(\mathbb{Z}_N) \to \ell^2(\mathbb{Z}_N)$, which may be either a linear or a nonlinear transform. Finally, the IDFT is applied to the sequence of modified Fourier coefficients to reconstruct the signal in its modified form. The signal processing approach used in the frequency domain is shown in Figure 2.3.

Figure 2.3. Filtering approach in the Fourier domain

Note that, in the $\mathrm{IDFT} \circ f \circ \mathrm{DFT}$ transform composition, only $f$ has the capacity to change the energy of the signal: the composition of the Fourier transform with its inverse produces the identity, so the energy of the original signal is retained.

One particularly important example of a filter $f$, defined in section 2.6.6, can be used to define the concept of the Fourier multiplier, defined in section 2.6.7.

⁷ Named after the Swedish-born American engineer Harry Nyquist (1889–1976).


2.6.6. The multiplication operator and its diagonal matrix representation

Let $w : \mathbb{Z}_N \to \mathbb{C}$ be a fixed sequence in $\ell^2(\mathbb{Z}_N)$.

DEFINITION 2.10.– The linear map below is known as the multiplication operator by the sequence $w$:

$$M_w : \ell^2(\mathbb{Z}_N) \longrightarrow \ell^2(\mathbb{Z}_N), \qquad z \longmapsto M_w(z) = w \cdot z$$

where $M_w(z) = w \cdot z : \mathbb{Z}_N \to \mathbb{C}$ is the sequence defined by the point-wise (also called Hadamard) product of $w$ and $z$:

$$M_w z(n) = (w \cdot z)(n) = w(n) \cdot z(n) \qquad \forall n \in \mathbb{Z}_N$$

Note that if $z$ is represented as a column vector in the canonical basis of $\ell^2(\mathbb{Z}_N)$, then the matrix associated with the operator $M_w$ in relation to the canonical basis of $\ell^2(\mathbb{Z}_N)$ is a diagonal matrix $D_w$ with diagonal elements given by the components of the sequence $w$:

$$D_w z = \begin{pmatrix} w(0) & & 0 \\ & \ddots & \\ 0 & & w(N-1) \end{pmatrix} \begin{pmatrix} z(0) \\ \vdots \\ z(N-1) \end{pmatrix} = \begin{pmatrix} w(0) z(0) \\ \vdots \\ w(N-1) z(N-1) \end{pmatrix}$$

EXAMPLE OF A MULTIPLICATION OPERATOR.– Consider the sequence of $\ell^2(\mathbb{Z}_6)$ given by $z = (2, 3 - i, 2i, 4 + i, 0, 1)$ and the sequence $w(n) = i^n$, $n \in \mathbb{Z}_6$, then:

$$(w(0) = 1,\ w(1) = i,\ w(2) = -1,\ w(3) = -i,\ w(4) = 1,\ w(5) = i)$$

and thus:

$$M_w z = (1 \cdot 2,\ i \cdot (3 - i),\ -1 \cdot 2i,\ -i \cdot (4 + i),\ 1 \cdot 0,\ i \cdot 1) = (2,\ 3i + 1,\ -2i,\ -4i + 1,\ 0,\ i)$$
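The worked example can be reproduced in two ways, once as a point-wise product and once through the diagonal matrix $D_w$. In this sketch the powers of $i$ are read from a lookup table so that the values are exact; all variable names are illustrative.

```python
z = [2, 3 - 1j, 2j, 4 + 1j, 0, 1]                 # the sequence z of the example
powers_of_i = [1, 1j, -1, -1j]                    # i^n repeats with period 4
w = [powers_of_i[n % 4] for n in range(6)]        # w(n) = i^n, n in Z_6

# Point-wise (Hadamard) product: (M_w z)(n) = w(n) * z(n)
Mw_z = [w[n] * z[n] for n in range(6)]

# Same result through the diagonal matrix D_w acting on z as a column vector
D_w = [[w[r] if r == c else 0 for c in range(6)] for r in range(6)]
Dw_z = [sum(D_w[r][c] * z[c] for c in range(6)) for r in range(6)]

print(Mw_z)
```

The entries agree with $(2,\ 1 + 3i,\ -2i,\ 1 - 4i,\ 0,\ i)$, and the diagonal matrix gives the same sequence, since each row of $D_w$ selects a single component of $z$.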

This provides the foundation for introducing the Fourier multiplier operator.

2.6.7. The Fourier multiplier operator

The Fourier multiplier operator (or multiplier) is one notable example of a frequency filter.


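DEFINITION 2.11.– Given a sequence $w : \mathbb{Z}_N \to \mathbb{C}$, the Fourier multiplier by the sequence $w$ is the following operator:

$$T_{(w)} : \ell^2(\mathbb{Z}_N) \longrightarrow \ell^2(\mathbb{Z}_N), \qquad z \longmapsto T_{(w)}(z) = (w \cdot \hat z)^{\vee}$$

that is, $T_{(w)}$ is the operator given by the composition $T_{(w)} = \mathrm{IDFT} \circ M_w \circ \mathrm{DFT}$, that is,

$$\ell^2(\mathbb{Z}_N) \xrightarrow{\ \mathrm{DFT}\ } \ell^2(\mathbb{Z}_N) \xrightarrow{\ M_w\ } \ell^2(\mathbb{Z}_N) \xrightarrow{\ \mathrm{IDFT}\ } \ell^2(\mathbb{Z}_N)$$

$$z \longmapsto \mathrm{DFT}(z) = \hat z \longmapsto M_w(\mathrm{DFT}(z)) = w \cdot \hat z \longmapsto \mathrm{IDFT}(M_w(\mathrm{DFT}(z))) = (w \cdot \hat z)^{\vee}$$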

Applying the DFT to both sides of the definition of $T_{(w)}$, we see that the action of the Fourier multiplier is diagonal in the Fourier basis $F$:

$$\mathrm{DFT}\, T_{(w)} z = [T_{(w)} z]_F = M_w \circ \mathrm{DFT}\, z = M_w \hat z, \qquad \forall z \in \ell^2(\mathbb{Z}_N) \qquad [2.27]$$
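The composition $\mathrm{IDFT} \circ M_w \circ \mathrm{DFT}$ translates directly into code. The sketch below is an illustration under stated assumptions: the test signal (one slow cosine plus an alternating component at $m = N/2$) and the 0/1 mask `w` are invented for the example, and `fourier_multiplier` is a hypothetical name.

```python
import cmath
import math

def dft(z):
    """Unnormalized DFT: z_hat(m) = sum_n z(n) * exp(-2*pi*i*m*n/N)."""
    N = len(z)
    return [sum(z[n] * cmath.exp(-2j * cmath.pi * m * n / N) for n in range(N))
            for m in range(N)]

def idft(w):
    """Inverse DFT with the factor 1/N."""
    N = len(w)
    return [sum(w[m] * cmath.exp(2j * cmath.pi * m * n / N) for m in range(N)) / N
            for n in range(N)]

def fourier_multiplier(w, z):
    """T_(w) z = IDFT(w . DFT(z)), where '.' is the point-wise product."""
    z_hat = dft(z)
    return idft([w[m] * z_hat[m] for m in range(len(z))])

N = 16
slow = [math.cos(2 * math.pi * n / N) for n in range(N)]       # harmonics m = 1 and m = N-1
z = [slow[n] + 0.5 * (-1) ** n for n in range(N)]             # plus a component at m = N/2
w = [1.0 if min(m, N - m) <= 2 else 0.0 for m in range(N)]     # keeps m close to 0 or N-1

filtered = fourier_multiplier(w, z)
lowpass_gap = max(abs(filtered[n] - slow[n]) for n in range(N))

# Formula [2.27]: the DFT of T_(w) z equals w . z_hat
z_hat = dft(z)
diagonal_gap = max(abs(a - w[m] * z_hat[m]) for m, a in enumerate(dft(filtered)))
print(lowpass_gap, diagonal_gap)
```

With this mask the alternating component is removed and the slow cosine is returned unchanged; replacing the 0/1 values of `w` by intermediate weights would attenuate frequencies instead of removing them.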

Thus, $T_{(w)}$ multiplies the Fourier coefficients of $z$ by the components of the sequence $w$ (making this operator a multiplier). This means that we can:

– attenuate the low frequencies of a signal $z$ by selecting a sequence $w(m)$ with a low value of $|w(m)|$ when $m \simeq 0$ and $m \simeq N-1$;

– attenuate the high frequencies of a signal $z$ by selecting a sequence $w(m)$ with a low value of $|w(m)|$ when $m \simeq N/2$;

– amplify the low frequencies of a signal $z$ by selecting a sequence $w(m)$ with a high value of $|w(m)|$ when $m \simeq 0$ and $m \simeq N-1$;

– amplify the high frequencies of a signal $z$ by selecting a sequence $w(m)$ with a high value of $|w(m)|$ when $m \simeq N/2$.

This principle is used in graphic equalizers, used by musicians to adjust the level of high frequencies and bass notes in an audio signal.

2.7. Properties of the DFT

In this section, we shall demonstrate the most important properties of the DFT. We shall begin by recalling the translation property of a summation index:

$$\sum_{i=n_0}^{n} a_i = \sum_{i=n_0-k}^{n-k} a_{i+k} = \sum_{i=n_0+k}^{n+k} a_{i-k} \qquad [2.28]$$

This property will be used on several occasions, along with the following lemma.


LEMMA 2.2.– Let $f : \mathbb{Z} \to \mathbb{C}$ be an $N$-periodic function, with $N \in \mathbb{N}$:

$$f(n + aN) = f(n) \qquad \forall a, n \in \mathbb{Z}$$

Then, for all $m \in \mathbb{Z}$:

$$\sum_{n=m}^{m+N-1} f(n) = \sum_{n=0}^{N-1} f(n)$$

that is, the sum of an $N$-periodic function across any interval of size $N$ is constant.

PROOF.– If $m = 0$, there is nothing to prove, so we may take $m \in \mathbb{Z}$, $m \neq 0$. Considering values of $m > 0$:

$$\sum_{n=m}^{m+N-1} f(n) = \sum_{n=0}^{m+N-1} f(n) - \sum_{n=0}^{m-1} f(n) = \sum_{n=0}^{N-1} f(n) + \sum_{n=N}^{m+N-1} f(n) - \sum_{n=0}^{m-1} f(n)$$

but, using [2.28]:

$$\sum_{n=N}^{m+N-1} f(n) = \sum_{n=0}^{m-1} f(n + N) = \sum_{n=0}^{m-1} f(n)$$

because of the $N$-periodicity of $f$, thus:

$$\sum_{n=m}^{m+N-1} f(n) = \sum_{n=0}^{N-1} f(n) + \sum_{n=0}^{m-1} f(n) - \sum_{n=0}^{m-1} f(n) = \sum_{n=0}^{N-1} f(n)$$

A similar demonstration may be used for cases where $m < 0$. $\square$
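Lemma 2.2 is easy to test on a concrete periodic function, including windows that start at negative indices. In this sketch the function `f` is an arbitrary $N$-periodic choice made for illustration, and `window_sum` is a hypothetical helper.

```python
import cmath

N = 6

def f(n):
    """An arbitrary N-periodic function on Z (illustrative choice)."""
    return cmath.exp(2j * cmath.pi * n / N) + (n % N) ** 2

def window_sum(m):
    """Sum of f over the window m, m+1, ..., m+N-1."""
    return sum(f(n) for n in range(m, m + N))

reference = window_sum(0)                                       # sum over 0, ..., N-1
gaps = [abs(window_sum(m) - reference) for m in range(-10, 11)]
print(max(gaps))                                                # rounding-error level
```

The test covers both signs of $m$, so it also exercises the case $m < 0$ that the proof leaves to the reader.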

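2.7.1. Periodicity of $\hat z$ and $\check z$

In what follows, we shall examine the most important properties of the discrete Fourier theory, starting with the periodicity of $\hat z$ and $\check z$. By direct calculation, if $a \in \mathbb{Z}$, then:

$$\begin{aligned}
\hat z(m + aN) &= \sum_{n=0}^{N-1} z(n)\, e^{-2\pi i \frac{(m+aN)n}{N}} = \sum_{n=0}^{N-1} z(n)\, e^{-2\pi i \frac{mn}{N}}\, e^{-2\pi i \frac{aNn}{N}} \\
&= \sum_{n=0}^{N-1} z(n)\, e^{-2\pi i \frac{mn}{N}}\, e^{-2\pi a n i} = \hat z(m)
\end{aligned}$$

since $e^{-2\pi a n i} = \cos(2\pi a n) - i \sin(2\pi a n) = 1$. The same calculation is used to prove $\check z(n + aN) = \check z(n)$ $\forall a \in \mathbb{Z}$.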


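Thanks to this property, the definitions of $\hat z$ and $\check z$ can be extended to $\mathbb{Z}$ by considering the two $N$-periodic sequences:

$$\hat z : \mathbb{Z} \longrightarrow \mathbb{C}, \qquad m \longmapsto \hat z(m) = \hat z(m + aN)$$

and:

$$\check z : \mathbb{Z} \longrightarrow \mathbb{C}, \qquad n \longmapsto \check z(n) = \check z(n + aN)$$

with $a \in \mathbb{Z}$ such that $m + aN \in \mathbb{Z}_N$, or $n + aN \in \mathbb{Z}_N$, respectively.

2.7.2. DFT and shift

We now wish to consider how the DFT of a signal $z \in \ell^2(\mathbb{Z}_N)$ varies in response to a shift in $z$ by a quantity which is not a multiple of $N$. Another operator on $\ell^2(\mathbb{Z}_N)$ must be introduced in order to formalize this consideration.

DEFINITION 2.12.– Take $z \in \ell^2(\mathbb{Z}_N)$ and $k \in \mathbb{Z}$. The following linear map is the right shift operator by the quantity $k$:

$$R_k : \ell^2(\mathbb{Z}_N) \longrightarrow \ell^2(\mathbb{Z}_N), \qquad z \longmapsto R_k(z)$$

where $R_k(z) : \mathbb{Z}_N \to \mathbb{C}$ is the sequence defined by the formula:

$$R_k z(n) = z(n - k) \qquad \forall n \in \mathbb{Z}_N$$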

EXAMPLE OF A SHIFT OPERATOR.– $N = 6$, $k = 2$, $z = (2, 3 - i, 2i, 4 + i, 0, 1)$. Then:

$$\begin{cases}
R_2 z(0) = z(0 - 2) = z(-2) = z(-2 + 6) = z(4) = 0 \\
R_2 z(1) = z(1 - 2) = z(-1) = z(-1 + 6) = z(5) = 1 \\
\dots
\end{cases}$$

giving us: $R_2 z = (0, 1, 2, 3 - i, 2i, 4 + i)$. Evidently, the effect of $R_2$ on $z$ is a simple displacement of each element in the sequence by two positions to the right (hence the notation $R$).


The final two elements "turn" into the first two positions, as though following a circle. For this reason, $R_k$ is also known as a circular shift operator or rotation operator.

Now, consider the composition of this shift operator with the DFT and, inversely, that of the DFT with the shift operator. We shall begin with this latter composition: $\mathrm{DFT}(z(n - k))$, that is:

$$\ell^2(\mathbb{Z}_N) \xrightarrow{\ R_k\ } \ell^2(\mathbb{Z}_N) \xrightarrow{\ \mathrm{DFT}\ } \ell^2(\mathbb{Z}_N)$$

$$z \longmapsto R_k z \longmapsto (\mathrm{DFT} \circ R_k) z = \mathrm{DFT}(R_k z) = \widehat{R_k z}$$

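Theorem 2.4 shows that, due to the DFT, the action of the operator $R_k$ is transformed into a multiplication by a complex exponential.

THEOREM 2.4.– Take $z \in \ell^2(\mathbb{Z}_N)$ and $k \in \mathbb{Z}$. Then:

$$\widehat{R_k z}(m) = e^{-2\pi i \frac{mk}{N}}\, \hat z(m) \qquad \forall m \in \mathbb{Z} \qquad [2.29]$$

that is, if we define the sequence $\omega_N^k \in \ell^2(\mathbb{Z}_N)$, $\omega_N^k(m) = \omega_N^{mk} = e^{-2\pi i \frac{mk}{N}}$ $\forall m \in \mathbb{Z}$, then:

$$\mathrm{DFT} \circ R_k = M_{\omega_N^k} \circ \mathrm{DFT} \qquad [2.30]$$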

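PROOF.–

$$\begin{aligned}
\widehat{R_k z}(m) &= \sum_{n=0}^{N-1} (R_k z)(n)\, e^{-2\pi i \frac{mn}{N}} = \sum_{n=0}^{N-1} z(n - k)\, e^{-2\pi i \frac{mn}{N}} \\
&= \sum_{n=-k}^{N-k-1} z(n - k + k)\, e^{-2\pi i \frac{m(n+k)}{N}} \\
&= \sum_{n=-k}^{N-k-1} z(n)\, e^{-2\pi i \frac{mn}{N}}\, e^{-2\pi i \frac{mk}{N}}
\end{aligned}$$

The factor $e^{-2\pi i \frac{mk}{N}}$ is independent of the index $n$ and can thus be taken out of the summation:

$$\begin{aligned}
\widehat{R_k z}(m) &= e^{-2\pi i \frac{mk}{N}} \sum_{n=-k}^{N-k-1} z(n)\, e^{-2\pi i \frac{mn}{N}} \\
&= e^{-2\pi i \frac{mk}{N}} \sum_{n=0}^{N-1} z(n)\, e^{-2\pi i \frac{mn}{N}} \qquad \text{(Lemma 2.2)} \\
&= e^{-2\pi i \frac{mk}{N}}\, \hat z(m)
\end{aligned}$$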


Lemma 2.2 can be applied in this case as, by hypothesis, $z$ is $N$-periodic and the exponential $e^{-2\pi i \frac{mn}{N}}$ is itself an $N$-periodic function. $\square$

Note that, if we write $\hat z(m) = |\hat z(m)|\, e^{i \operatorname{Arg}(\hat z(m))}$ then, since $\left| e^{-2\pi i \frac{mk}{N}} \right| = 1$, the product $e^{-2\pi i \frac{mk}{N}}\, \hat z(m)$ only changes the phase of $\hat z(m)$. This is the reason why we say that the DFT transforms the shift into a phase shift.

The fact that the phase of the Fourier coefficients is modified by translations implies that the phase spectrum contains information regarding the geometry of the original signal.
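Theorem 2.4 can be checked on the sequence used in the shift example. The sketch below implements the circular shift with index arithmetic modulo $N$; `right_shift` and the gap variables are hypothetical names.

```python
import cmath

def dft(z):
    """Unnormalized DFT: z_hat(m) = sum_n z(n) * exp(-2*pi*i*m*n/N)."""
    N = len(z)
    return [sum(z[n] * cmath.exp(-2j * cmath.pi * m * n / N) for n in range(N))
            for m in range(N)]

def right_shift(z, k):
    """Circular right shift: (R_k z)(n) = z(n - k), indices taken modulo N."""
    N = len(z)
    return [z[(n - k) % N] for n in range(N)]

z = [2, 3 - 1j, 2j, 4 + 1j, 0, 1]
N, k = len(z), 2
shifted = right_shift(z, k)                 # expected: (0, 1, 2, 3 - i, 2i, 4 + i)

z_hat = dft(z)
lhs = dft(shifted)                                                        # DFT(R_k z)
rhs = [cmath.exp(-2j * cmath.pi * m * k / N) * z_hat[m] for m in range(N)]  # formula [2.29]
shift_gap = max(abs(a - b) for a, b in zip(lhs, rhs))

# The shift changes phases only: the magnitudes of the coefficients are unchanged
amplitude_gap = max(abs(abs(a) - abs(b)) for a, b in zip(lhs, z_hat))
print(shifted, shift_gap, amplitude_gap)
```

The second comparison shows that the amplitude spectrum alone cannot distinguish a signal from its circular shifts.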

2.7.2.1. Shift invariance of the spectrum

Theorem 2.4 highlights an important limitation of the Fourier transform. Since:

$$\left| e^{-2\pi i \frac{mk}{N}} \right| = 1 \quad \Longrightarrow \quad |\widehat{R_k z}(m)| = |\hat z(m)| \qquad \forall m, k \in \mathbb{Z},$$

the magnitudes of the Fourier coefficients of $z$ and of all its shifts are equal. Consequently, the magnitude of the Fourier coefficients $|\hat z(m)|$ informs us of the (global) importance of the harmonic of frequency $m$ in the reconstruction of the signal $z$, but not of its (local) position within the signal.

To gain a clearer understanding of this behavior, let us consider the unit pulse, to which an arbitrary shift is applied: $R_k \delta_0$. The spectrum of this signal is $|\widehat{R_k \delta_0}(m)| = \left| e^{-2\pi i \frac{mk}{N}}\, \hat\delta_0(m) \right|$, but, as we have seen, $\hat\delta_0(m) = 1$ for all $m \in \mathbb{Z}_N$, thus $|\widehat{R_k \delta_0}(m)| = \left| e^{-2\pi i \frac{mk}{N}} \right| = 1$. The difference between this case and that of the non-shifted unit pulse is that, in the latter case, the Fourier coefficients are real and thus $\left| \hat\delta_0(m) \right| = \hat\delta_0(m) = 1$ $\forall m \in \mathbb{Z}_N$.

The spectrum of the unit pulse is therefore exactly the same as that of any of its shifted forms. Knowledge of the spectrum alone is not sufficient to reconstruct the spatial location of a signal; to do this, we need information from the phase, which is not easy to interpret or handle.

One solution to this problem lies in using two transforms which "localize" the Fourier transform: the Gabor transform and the wavelet transform. These transforms lie outside the scope of this book; the interested reader can consult, for instance, Frazier (2001).

Now, let us analyze the composition of the shift operator and the DFT: $\hat z(m - k)$, that is:

$$\ell^2(\mathbb{Z}_N) \xrightarrow{\ \mathrm{DFT}\ } \ell^2(\mathbb{Z}_N) \xrightarrow{\ R_k\ } \ell^2(\mathbb{Z}_N)$$

$$z \longmapsto \mathrm{DFT}(z) \longmapsto (R_k \circ \mathrm{DFT}) z = \hat z(m - k) \qquad [2.31]$$


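THEOREM 2.5.– Using the hypotheses from Theorem 2.4, this is equivalent to the formula:

$$(R_k \hat z)(m) = \hat z(m - k) = \widehat{\left( e^{2\pi i \frac{nk}{N}} z \right)}(m), \qquad \forall m \in \mathbb{Z} \qquad [2.32]$$

that is:

$$R_k \circ \mathrm{DFT} = \mathrm{DFT} \circ M_{\overline{\omega_N^k}} \qquad [2.33]$$

where $\overline{\omega_N^k}(n) = e^{2\pi i \frac{nk}{N}}$ is the conjugate of the sequence $\omega_N^k$ of Theorem 2.4.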

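PROOF.–

$$(R_k \hat z)(m) = \hat z(m - k) = \sum_{n=0}^{N-1} z(n)\, e^{-2\pi i \frac{(m-k)n}{N}} = \sum_{n=0}^{N-1} \left( e^{2\pi i \frac{kn}{N}} z(n) \right) e^{-2\pi i \frac{mn}{N}} = \widehat{\left( e^{2\pi i \frac{kn}{N}} z \right)}(m) \qquad \square$$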

The properties analyzed above may be summarized in the form of Fourier pairs, shown in Table 2.2. This information shows that the shift operation in the original representation of $z$ becomes a phase change in the Fourier space; conversely, the shift operation in the Fourier space corresponds to a phase change (with a conjugate phase) in the original representation of $z$.

The following situation illustrates a particularly remarkable case. If $N$ is even and $k = N/2$, then:

\[ e^{-\frac{2\pi i m}{N} \frac{N}{2}} = e^{-\pi i m} = (e^{-\pi i})^m = (-1)^m \]

and:

\[ e^{\frac{2\pi i n}{N} \frac{N}{2}} = e^{\pi i n} = (e^{\pi i})^n = (-1)^n \]

so:

\[ \mathrm{DFT}\left( z\left( n - \tfrac{N}{2} \right) \right) = (-1)^m \hat{z}(m), \qquad \hat{z}\left( m - \tfrac{N}{2} \right) = \widehat{\left( (-1)^n z \right)}(m) \qquad [2.34] \]

Thus, multiplying the sequence $z$ by $(-1)^n$ corresponds to shifting the spectrum by $N/2$. This operation is often used to display a spectrum centered on the zero frequency $m = 0$.
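The shift properties above can be checked numerically. The following is a minimal sketch in plain Python, not taken from the book: the helper names `dft`, `shift` and `close`, as well as the test signal, are assumptions introduced only for illustration. It uses the unnormalized DFT convention of this chapter.

```python
import cmath

def dft(z):
    """Unnormalized DFT: z_hat(m) = sum_n z(n) * exp(-2*pi*i*m*n/N)."""
    N = len(z)
    return [sum(z[n] * cmath.exp(-2j * cmath.pi * m * n / N) for n in range(N))
            for m in range(N)]

def shift(z, k):
    """Circular shift R_k: (R_k z)(n) = z(n - k), indices taken modulo N."""
    N = len(z)
    return [z[(n - k) % N] for n in range(N)]

def close(a, b, tol=1e-9):
    """Entrywise comparison of two sequences up to a small tolerance."""
    return all(abs(x - y) < tol for x, y in zip(a, b))

z = [1, 2 - 1j, 0, 3j, -1, 4]          # arbitrary test signal, N = 6 (even)
N, k = len(z), 2
z_hat = dft(z)

# A shift in n becomes a phase factor in m.
shifted_hat = dft(shift(z, k))
phase_times_hat = [cmath.exp(-2j * cmath.pi * m * k / N) * z_hat[m] for m in range(N)]
shift_ok = close(shifted_hat, phase_times_hat)

# The magnitudes of the coefficients are unchanged by the shift.
magnitudes_ok = close([abs(x) for x in shifted_hat], [abs(x) for x in z_hat])

# Multiplying z by (-1)^n shifts the Fourier coefficients by N/2 (formula [2.34]).
modulated = [(-1) ** n * z[n] for n in range(N)]
centering_ok = close(dft(modulated), shift(z_hat, N // 2))

print(shift_ok, magnitudes_ok, centering_ok)
```

The direct $O(N^2)$ sum is used on purpose so that the code mirrors the definition; for large $N$ an FFT routine would be the usual choice.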


Original representation              Fourier space
$z(n-k)$                             $e^{-2\pi i \frac{km}{N}}\, \hat{z}(m)$
$e^{2\pi i \frac{kn}{N}}\, z(n)$     $\hat{z}(m-k)$

Table 2.2. Fourier pairs and translation

Finally, note the relation between formula [2.30] and the diagonal representation of the operator $R_k$. Composing the left and right members of formula [2.30] with the IDFT, we obtain:

\[ \mathrm{DFT} \circ R_k \circ \mathrm{IDFT} = M_{\omega_N^{-k}} \]

where $M_{\omega_N^{-k}}$ denotes multiplication by the sequence $m \mapsto e^{-2\pi i \frac{km}{N}}$. Using $A_k$ and $D_{\omega_N^{-k}}$ (diagonal, see section 2.6.6) to write the matrices associated with the operator $R_k$ and with $M_{\omega_N^{-k}}$ in relation to the canonical basis, the previous equation can be rewritten as:

\[ W_N A_k W_N^{-1} = D_{\omega_N^{-k}}. \]

This tells us that the matrix $A_k$ associated with the shift operator $R_k$ is similar to the diagonal matrix $D_{\omega_N^{-k}}$. The invertible matrix which produces the matrix conjugation of $A_k$ and $D_{\omega_N^{-k}}$ is the Sylvester matrix $W_N$, so we can say that the action of the shift operator $R_k$ is diagonal in the Fourier space.

2.7.3. DFT and conjugation

Given a sequence $z \in \ell^2(\mathbb{Z}_N)$, the conjugate sequence $\bar{z}$ is written as $\bar{z} = (\overline{z(0)}, \overline{z(1)}, \ldots, \overline{z(N-1)})$, that is, $\bar{z}(n) = \overline{z(n)}$ $\forall n \in \mathbb{Z}_N$. The relationship between the DFT and conjugation is shown in Theorem 2.6.

THEOREM 2.6.– For all $z \in \ell^2(\mathbb{Z}_N)$:

\[ \hat{\bar{z}}(m) = \overline{\hat{z}(-m)} = \overline{\hat{z}(N-m)} \quad \forall m \in \mathbb{Z}_N \]

PROOF.–

\[ \hat{\bar{z}}(m) = \sum_{n=0}^{N-1} \overline{z(n)}\, e^{-2\pi i \frac{mn}{N}} = \overline{\sum_{n=0}^{N-1} z(n)\, e^{2\pi i \frac{mn}{N}}} = \overline{\sum_{n=0}^{N-1} z(n)\, e^{-2\pi i \frac{(-m)n}{N}}} = \overline{\hat{z}(-m)} \]

and $\hat{z}(N-m) = \hat{z}(-m)$, by periodicity. □
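Theorem 2.6 is easy to test on a small example. The sketch below is illustrative only: the helper `dft` and the sample sequences are assumptions made here, not material from the book. It also checks the special case of a real-valued sequence, for which $\bar{z} = z$.

```python
import cmath

def dft(z):
    """Unnormalized DFT: z_hat(m) = sum_n z(n) * exp(-2*pi*i*m*n/N)."""
    N = len(z)
    return [sum(z[n] * cmath.exp(-2j * cmath.pi * m * n / N) for n in range(N))
            for m in range(N)]

z = [1 + 2j, -3j, 2, 0.5 - 1j, 4]      # arbitrary complex test sequence, N = 5
N = len(z)
z_hat = dft(z)

# Left side: DFT of the conjugate sequence.
conj_hat = dft([complex(x).conjugate() for x in z])
# Right side: conjugate of z_hat evaluated at -m (index taken modulo N).
expected = [z_hat[(-m) % N].conjugate() for m in range(N)]
theorem_ok = all(abs(a - b) < 1e-9 for a, b in zip(conj_hat, expected))

# Special case: for a real sequence, z_hat(m) equals the conjugate of z_hat(N - m).
x = [3.0, -1.0, 4.0, 1.5, -2.0]
x_hat = dft(x)
real_case_ok = all(abs(x_hat[m] - x_hat[(-m) % N].conjugate()) < 1e-9 for m in range(N))

print(theorem_ok, real_case_ok)
```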


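COROLLARY 2.1.– $z \in \ell^2(\mathbb{Z}_N)$ is real, that is, $z(n) \in \mathbb{R}$ $\forall n \in \mathbb{Z}_N$, if and only if:

\[ \hat{z}(m) = \overline{\hat{z}(-m)} = \overline{\hat{z}(N-m)}, \quad \forall m \in \mathbb{Z}_N \]

$\hat{z} \in \ell^2(\mathbb{Z}_N)$ is real, that is, $\hat{z}(m) \in \mathbb{R}$ $\forall m \in \mathbb{Z}_N$, if and only if:

\[ z(n) = \overline{z(-n)} = \overline{z(N-n)}, \quad \forall n \in \mathbb{Z}_N \]

PROOF.– As the DFT is an isomorphism of $\ell^2(\mathbb{Z}_N)$, $z$ is real, that is, $z = \bar{z}$, if and only if $\hat{z} = \hat{\bar{z}}$; by Theorem 2.6, this holds if and only if $\hat{z}(m) = \overline{\hat{z}(-m)} = \overline{\hat{z}(N-m)}$.

$\hat{z}$ is real if and only if $\hat{z} = \overline{\hat{z}}$. The previous theorem states that $\overline{\hat{z}(-m)} = \hat{\bar{z}}(m)$ $\forall m \in \mathbb{Z}_N$, implying that $\overline{\hat{z}(m)} = \hat{\bar{z}}(-m)$, by simple substitution of the variable $m \leftrightarrow -m$. Hence:

\[ \mathrm{IDFT}(\overline{\hat{z}(m)}) = \mathrm{IDFT}(\hat{\bar{z}}(-m)) = \mathrm{IDFT}(\mathrm{DFT}(\bar{z}(-n))) = \bar{z}(-n) = \overline{z(-n)} \]

Therefore $\hat{z}$ is real $\iff \hat{z}(m) = \overline{\hat{z}(m)} \iff \mathrm{IDFT}(\hat{z}(m)) = \mathrm{IDFT}(\overline{\hat{z}(m)}) \iff z(n) = \overline{z(-n)} = \overline{z(N-n)}$ $\forall n \in \mathbb{Z}_N$. □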

Corollary 2.2 is an immediate consequence of the previous result.

COROLLARY 2.2.– $z, \hat{z} \in \ell^2(\mathbb{Z}_N)$ are simultaneously real if and only if they are symmetrical about 0, that is, $z(n) = z(-n)$ and $\hat{z}(m) = \hat{z}(-m)$, $\forall m, n \in \mathbb{Z}_N$.

2.7.4. DFT and convolution

One of the most important properties of the Fourier transform relates to the convolution operation. To understand this operation, we first note the formula for polynomial products. If $P(x) = a_0 + a_1 x + \ldots + a_n x^n = \sum_{i=0}^{n} a_i x^i$ and $Q(x) = b_0 + b_1 x + \ldots + b_m x^m = \sum_{j=0}^{m} b_j x^j$, then:

\[ P(x)Q(x) = \sum_{\ell=0}^{n+m} c_\ell x^\ell, \quad \text{where} \quad c_\ell = \sum_{k=0}^{\ell} a_{\ell-k}\, b_k = \sum_{k=0}^{\ell} a_k\, b_{\ell-k} \qquad [2.35] \]

with the convention that $a_i = 0$ for $i > n$ and $b_j = 0$ for $j > m$.


EXAMPLE.– $P(x) = a_0 + a_1 x + a_2 x^2$, $Q(x) = b_0 + b_1 x + b_2 x^2$, so:

\[ P(x)Q(x) = a_0 b_0 + (a_0 b_1 + a_1 b_0)x + (a_0 b_2 + a_1 b_1 + a_2 b_0)x^2 + (a_1 b_2 + a_2 b_1)x^3 + (a_2 b_2)x^4 \]

The coefficients of the powers of the variable $x$ verify formula [2.35].

We see that the coefficients $c_\ell$ include a sum of the products of the coefficients $a_i$ and $b_j$. Notably, the sum of the indices $i + j$ is always equal to $\ell$; as the index of one variable increases, that of the other decreases. These are the defining characteristics of the convolution operation (in its discrete form), which we shall introduce in the space $\ell^2(\mathbb{Z}_N)$.

DEFINITION 2.13.– Take $z, w \in \ell^2(\mathbb{Z}_N)$. The convolution of $z$ with $w$, written as $z * w$, is the sequence of $\ell^2(\mathbb{Z}_N)$ with components defined by:

\[ (z * w)(n) = \sum_{k=0}^{N-1} z(n-k)\, w(k) = \sum_{k=0}^{N-1} w(n-k)\, z(k), \quad \forall n \in \mathbb{Z}_N \]

Convolution is symmetrical, that is, $z * w = w * z$, due to the commutative nature of the product in $\mathbb{C}$.

EXAMPLE.– $z, w \in \ell^2(\mathbb{Z}_4)$, $z = (1, 1, 0, 2)$, $w = (i, 0, 1, 2)$, with canonical periodicity: $z(n + kN) = z(n)$ and $w(n + kN) = w(n)$ $\forall n \in \mathbb{Z}_N$ and $k \in \mathbb{Z}$. Then:

\[
\begin{aligned}
(z * w)(0) &= \sum_{k=0}^{4-1} z(0-k)\, w(k) = \sum_{k=0}^{3} z(-k)\, w(k) \\
&= z(0)w(0) + z(-1)w(1) + z(-2)w(2) + z(-3)w(3) \\
&= z(0)w(0) + z(4-1)w(1) + z(4-2)w(2) + z(4-3)w(3) \\
&= z(0)w(0) + z(3)w(1) + z(2)w(2) + z(1)w(3) \\
&= 1 \cdot i + 2 \cdot 0 + 0 \cdot 1 + 1 \cdot 2 \\
&= i + 2
\end{aligned}
\]


In the same way, we obtain $(z * w)(1) = z(1)w(0) + z(0)w(1) + z(3)w(2) + z(2)w(3) = 2 + i$, $(z * w)(2) = z(2)w(0) + z(1)w(1) + z(0)w(2) + z(3)w(3) = 5$ and $(z * w)(3) = z(3)w(0) + z(2)w(1) + z(1)w(2) + z(0)w(3) = 3 + 2i$, hence $z * w = (2 + i,\ 2 + i,\ 5,\ 3 + 2i)$.

The interaction between the DFT and convolution has a particularly elegant and useful property, described in Theorem 2.7.

THEOREM 2.7.– Take $z, w \in \ell^2(\mathbb{Z}_N)$. Then:

\[ \mathrm{DFT}(z * w)(m) = \hat{z}(m) \cdot \hat{w}(m) \iff (z * w)(n) = \mathrm{IDFT}(\hat{z} \cdot \hat{w})(n) \quad \forall n, m \in \mathbb{Z} \qquad [2.36] \]

\[ \mathrm{IDFT}(\hat{z} * \hat{w})(n) = N\, z(n) \cdot w(n) \iff (\hat{z} * \hat{w})(m) = N\, \mathrm{DFT}(z \cdot w)(m) \quad \forall n, m \in \mathbb{Z} \qquad [2.37] \]

In other words, the Fourier transform of the convolution of $z$ and $w$ is the pointwise product of the Fourier transforms and vice versa: the inverse Fourier transform of the convolution of $\hat{z}$ and $\hat{w}$ is $N$ times the pointwise product of $z$ and $w$. We thus obtain the Fourier pairs shown in Table 2.3.

Original representation     Fourier space
$z * w$                     $\hat{z} \cdot \hat{w}$
$N\, z \cdot w$             $\hat{z} * \hat{w}$

Table 2.3. Fourier pairs relative to convolution

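PROOF.– By definition:

\[ \widehat{(z * w)}(m) = \sum_{n=0}^{N-1} (z * w)(n)\, e^{-2\pi i \frac{mn}{N}} = \sum_{n=0}^{N-1} \left( \sum_{k=0}^{N-1} z(n-k)\, w(k) \right) e^{-2\pi i \frac{mn}{N}} \]

The exponential is rewritten as:

\[ e^{-2\pi i \frac{mn}{N}} = e^{-2\pi i \frac{m(n-k+k)}{N}} = e^{-2\pi i \frac{m(n-k)+mk}{N}} = e^{-2\pi i \frac{m(n-k)}{N}}\, e^{-2\pi i \frac{mk}{N}} \]

Then:

\[
\begin{aligned}
\widehat{(z * w)}(m) &= \sum_{n=0}^{N-1} \sum_{k=0}^{N-1} z(n-k)\, w(k)\, e^{-2\pi i \frac{m(n-k)}{N}}\, e^{-2\pi i \frac{mk}{N}} \\
&= \sum_{k=0}^{N-1} w(k)\, e^{-2\pi i \frac{mk}{N}} \sum_{n=0}^{N-1} z(n-k)\, e^{-2\pi i \frac{m(n-k)}{N}} \\
&= \sum_{k=0}^{N-1} w(k)\, e^{-2\pi i \frac{mk}{N}} \sum_{n=-k}^{N-k-1} z(n-k+k)\, e^{-2\pi i \frac{m(n-k+k)}{N}} \\
&= \sum_{k=0}^{N-1} w(k)\, e^{-2\pi i \frac{mk}{N}} \sum_{n=-k}^{N-k-1} z(n)\, e^{-2\pi i \frac{mn}{N}} \\
&= \sum_{k=0}^{N-1} w(k)\, e^{-2\pi i \frac{mk}{N}} \sum_{n=0}^{N-1} z(n)\, e^{-2\pi i \frac{mn}{N}} \quad \text{(Lemma 2.2)} \\
&= \hat{w}(m)\, \hat{z}(m) = \hat{z}(m)\, \hat{w}(m)
\end{aligned}
\]

Lemma 2.2 can be applied here as it is valid for any $k \in \mathbb{Z}$. Thus:

\[ \widehat{(z * w)}(m) = \hat{z}(m)\, \hat{w}(m), \quad \forall m \in \mathbb{Z} \]

The proof that $\mathrm{IDFT}(\hat{z} * \hat{w})(n) = N\, z(n) \cdot w(n)$ is very similar. By definition:

\[ \mathrm{IDFT}(\hat{z} * \hat{w})(n) = \frac{1}{N} \sum_{m=0}^{N-1} (\hat{z} * \hat{w})(m)\, e^{2\pi i \frac{mn}{N}} = \frac{1}{N} \sum_{m=0}^{N-1} \left( \sum_{k=0}^{N-1} \hat{z}(m-k)\, \hat{w}(k) \right) e^{2\pi i \frac{mn}{N}} \]

The exponential is rewritten as:

\[ e^{2\pi i \frac{mn}{N}} = e^{2\pi i \frac{n(m-k+k)}{N}} = e^{2\pi i \frac{n(m-k)+nk}{N}} = e^{2\pi i \frac{n(m-k)}{N}}\, e^{2\pi i \frac{nk}{N}} \]

Then:

\[
\begin{aligned}
\mathrm{IDFT}(\hat{z} * \hat{w})(n) &= \frac{1}{N} \sum_{m=0}^{N-1} \sum_{k=0}^{N-1} \hat{z}(m-k)\, \hat{w}(k)\, e^{2\pi i \frac{n(m-k)}{N}}\, e^{2\pi i \frac{nk}{N}} \\
&= \frac{1}{N} \sum_{k=0}^{N-1} \hat{w}(k)\, e^{2\pi i \frac{nk}{N}} \sum_{m=0}^{N-1} \hat{z}(m-k)\, e^{2\pi i \frac{n(m-k)}{N}} \\
&= \frac{1}{N} \sum_{k=0}^{N-1} \hat{w}(k)\, e^{2\pi i \frac{nk}{N}} \sum_{m=-k}^{N-k-1} \hat{z}(m-k+k)\, e^{2\pi i \frac{n(m-k+k)}{N}} \\
&= \frac{1}{N} \sum_{k=0}^{N-1} \hat{w}(k)\, e^{2\pi i \frac{nk}{N}} \sum_{m=-k}^{N-k-1} \hat{z}(m)\, e^{2\pi i \frac{mn}{N}} \\
&= \frac{1}{N} \sum_{k=0}^{N-1} \hat{w}(k)\, e^{2\pi i \frac{nk}{N}} \sum_{m=0}^{N-1} \hat{z}(m)\, e^{2\pi i \frac{mn}{N}} \quad \text{(Lemma 2.2)} \\
&= N \left( \frac{1}{N} \sum_{k=0}^{N-1} \hat{w}(k)\, e^{2\pi i \frac{nk}{N}} \right) \cdot \left( \frac{1}{N} \sum_{m=0}^{N-1} \hat{z}(m)\, e^{2\pi i \frac{mn}{N}} \right) \\
&= N\, \mathrm{IDFT}\,\hat{w}(n) \cdot \mathrm{IDFT}\,\hat{z}(n) = N\, w(n) \cdot z(n) = N\, z(n) \cdot w(n)
\end{aligned}
\]

□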
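Both statements of Theorem 2.7 can be verified directly on the sequences $z = (1, 1, 0, 2)$ and $w = (i, 0, 1, 2)$ of the example following Definition 2.13. The code below is a minimal sketch written for this purpose; the function names `dft`, `circ_conv` and `close` are assumptions introduced here and do not come from the book.

```python
import cmath

def dft(z):
    """Unnormalized DFT: z_hat(m) = sum_n z(n) * exp(-2*pi*i*m*n/N)."""
    N = len(z)
    return [sum(z[n] * cmath.exp(-2j * cmath.pi * m * n / N) for n in range(N))
            for m in range(N)]

def circ_conv(z, w):
    """Circular convolution: (z * w)(n) = sum_k z(n - k) w(k), indices modulo N."""
    N = len(z)
    return [sum(z[(n - k) % N] * w[k] for k in range(N)) for n in range(N)]

def close(a, b, tol=1e-9):
    """Entrywise comparison of two sequences up to a small tolerance."""
    return all(abs(x - y) < tol for x, y in zip(a, b))

z = [1, 1, 0, 2]
w = [1j, 0, 1, 2]
N = len(z)

zw = circ_conv(z, w)
print(zw)   # (2+1j), (2+1j), (5+0j), (3+2j)

z_hat, w_hat = dft(z), dft(w)

# Formula [2.36]: DFT(z * w) = z_hat . w_hat (pointwise product).
conv_to_product_ok = close(dft(zw), [z_hat[m] * w_hat[m] for m in range(N)])

# Formula [2.37]: z_hat * w_hat = N DFT(z . w).
pointwise = [z[n] * w[n] for n in range(N)]
product_to_conv_ok = close(circ_conv(z_hat, w_hat), [N * c for c in dft(pointwise)])

print(conv_to_product_ok, product_to_conv_ok)
```

A quick consistency check on the convolution itself: the entries of $z * w$ must sum to $\left(\sum_n z(n)\right)\left(\sum_n w(n)\right) = 4(3 + i) = 12 + 4i$, which is the case for $(2+i,\ 2+i,\ 5,\ 3+2i)$.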

OBSERVATIONS.–

– In this proof, $\sum_{k=0}^{N-1} w(k)\, e^{-2\pi i \frac{mk}{N}}$ cannot be replaced with $\hat{w}(m)$ before the final step, as the index $k$ is still present in the second sum. $\hat{w}(m)$ can only be substituted in once $k$ has been eliminated.

– Formulas [2.36] demonstrate a sort of "distributive property" in connection with convolution and the pointwise product: when the DFT is applied to a convolution product, it is distributed over the factors, and the convolution becomes a pointwise product. Inversely, when the IDFT is applied to a pointwise product, it is distributed over the factors, and the pointwise product becomes a convolution. Thus, writing $\check{z} = \mathrm{IDFT}(z)$:

\[ \mathrm{DFT}(\check{z} * \check{w}) = z \cdot w, \qquad \mathrm{IDFT}(z \cdot w) = \check{z} * \check{w} \qquad \forall z, w \in \ell^2(\mathbb{Z}_N) \qquad [2.38] \]

– Using the Fourier transform, a complex operation such as convolution can be transformed into a simple product of Fourier transforms (which can be calculated rapidly using the FFT). This result is particularly useful for signal processing applications. If we define the DFT using the normalization induced by the orthonormal Fourier basis, coefficients appear in the DFT formula of the convolution. These coefficients may be extremely large, particularly when dealing with DFTs in dimensions greater than 1 and/or large signals; this may result in calculation errors. The simplicity of formula [2.36] is the reason why many authors, and programmers, prefer the definition of Fourier coefficients used in this book to other definitions.

– Convolution is often carried out between a signal $z$ and another signal $w$ which is non-zero only on a support of size $T$. The value of $T$ is important in choosing whether to apply the convolution operation directly or to use the FFT. The complexity of the direct convolution operation is $O(NT)$; using the FFT, the complexity is $O(N \log N)$. It is therefore helpful to transform the convolution into a pointwise product with the FFT in cases where $T > \log(N)$. For example, taking $z \in \ell^2(\mathbb{Z}_N)$ with $N = 1{,}000$, then $\log(N) \simeq 7$ and it is thus preferable to carry out the convolution $z * w$ in the Fourier domain for all cases where the support of $w$ is larger than 7.

If one of the vectors in the convolution is fixed, we can define an endomorphism of $\ell^2(\mathbb{Z}_N)$.


DEFINITION 2.14.– Taking a fixed sequence $w \in \ell^2(\mathbb{Z}_N)$, the following linear transformation is the convolution operator with $w$:

\[ T_w : \ell^2(\mathbb{Z}_N) \longrightarrow \ell^2(\mathbb{Z}_N), \qquad z \longmapsto T_w(z) = z * w \]

As in the case of the shift operator, a diagonal representation of the convolution operator can be produced. To do this, we rewrite formula [2.36] without specifying the index $m$ (as the representation is valid for any index), that is, $\mathrm{DFT}(z * w) = \hat{z} \cdot \hat{w}$. Now $\mathrm{DFT}(z * w) = (\mathrm{DFT} \circ T_w)z$ and $\hat{z} \cdot \hat{w} = \hat{w} \cdot \hat{z} = M_{\hat{w}}\, \hat{z} = (M_{\hat{w}} \circ \mathrm{DFT})z$, that is, $(\mathrm{DFT} \circ T_w)z = (M_{\hat{w}} \circ \mathrm{DFT})z$ $\forall z \in \ell^2(\mathbb{Z}_N)$, making it possible to write the operator relationship $\mathrm{DFT} \circ T_w = M_{\hat{w}} \circ \mathrm{DFT}$. Composing the left and right sides of this expression with the IDFT, we obtain:

\[ \mathrm{DFT} \circ T_w \circ \mathrm{IDFT} = M_{\hat{w}} \]

Let us consider this relationship in the context of the canonical basis $B$, just as we did in the case of the shift operator. The DFT and the IDFT become $W_N$ and $W_N^{-1}$, and the multiplication operator $M_{\hat{w}}$ takes the form of the diagonal matrix $D_{\hat{w}} = \mathrm{diag}(\hat{w}(0), \ldots, \hat{w}(N-1))$. If the matrix $A_w$ is the representation of $T_w$ in the basis $B$, that is, $A_w = [T_w]_B$, then:

\[ W_N A_w W_N^{-1} = D_{\hat{w}} \]

which shows that the action of the convolution operator is diagonalized in the Fourier basis.

Shift and convolution operators are not unique in this regard: there is a whole specific category of operators which have a diagonal action in the Fourier basis. These operators are called stationary and they will be examined in greater detail in section 2.8.

2.8. The DFT and stationary operators

The relationship between the Fourier transform and the class of "stationary" operators is an important one. The DFT enables the diagonalization of these operators and they can be shown to be equivalent to convolutions and to Fourier multipliers. To prove these results, we shall also introduce the category of "circulant" matrices, which represent stationary operators in the canonical basis of $\ell^2(\mathbb{Z}_N)$.

Before giving the formal mathematical definition of a stationary operator, let us introduce the idea behind such an object by considering an audio signal $z$ and a device $T$ that acts linearly on it. If the signal $z$ is transmitted to $T$ with a delay $\Delta t$, and the only effect of this delay on $T$ is that its output is delayed by the same quantity $\Delta t$, then the device $T$ is said to be stationary.

Mathematically speaking, if $R_k$ is the shift operator by the quantity $k \in \mathbb{Z}$, then the stationarity of $T$ is translated as the following relationship:

\[ T(R_k z) = R_k(T z), \quad \forall z \in \ell^2(\mathbb{Z}_N) \]

The left side represents the action of the operator $T$ on the signal $z$ shifted by a quantity $k$, while the right side represents the shift in the action of operator $T$ on the original signal $z$. These notions are summarized in the commutative diagram below.

\[
\begin{array}{ccc}
\ell^2(\mathbb{Z}_N) & \xrightarrow{\ R_k\ } & \ell^2(\mathbb{Z}_N) \\
\Big\downarrow T & & \Big\downarrow T \\
\ell^2(\mathbb{Z}_N) & \xrightarrow{\ R_k\ } & \ell^2(\mathbb{Z}_N)
\end{array}
\]

These considerations justify Definition 2.15.

DEFINITION 2.15.– An operator $T : \ell^2(\mathbb{Z}_N) \to \ell^2(\mathbb{Z}_N)$ is said to be stationary (or shift invariant) if:

\[ T(R_k z) = R_k(T z), \quad \forall z \in \ell^2(\mathbb{Z}_N),\ \forall k \in \mathbb{Z} \qquad [2.39] \]

that is, $T$ is stationary if it commutes with all shift operators $R_k$:

\[ T \circ R_k = R_k \circ T, \quad \forall k \in \mathbb{Z} \qquad [2.40] \]

In section 2.8.5, we shall show that a linear operator $T \in \mathrm{End}(\ell^2(\mathbb{Z}_N))$ is stationary if and only if:

\[ (Tz)(n) = \sum_{k=0}^{N-1} a_k\, z(n-k) = \sum_{k=0}^{N-1} a_k\, R_k z(n), \quad n \in \{0, \ldots, N-1\},\ a_k \in \mathbb{C}. \]

The DFT provides an extremely important example of a non-stationary operator over $\ell^2(\mathbb{Z}_N)$. To prove that the DFT is not a stationary operator, we simply recall the way in which it interacts with the shift operators $R_k$: using formulas [2.30] and [2.33] we obtain, respectively, $\mathrm{DFT} \circ R_k = M_{\omega_N^{-k}} \circ \mathrm{DFT}$ and $R_k \circ \mathrm{DFT} = \mathrm{DFT} \circ M_{\omega_N^{k}}$, where $M_{\omega_N^{\pm k}}$ denotes multiplication by the sequence with entries $e^{\pm 2\pi i \frac{kn}{N}}$. This shows that the DFT does not commute with the shift operators.


2.8.1. The DFT and the diagonalization of stationary operators

The most important properties of the DFT with regard to stationary operators can be summarized in a single theorem, but we prefer to highlight the fact that the Fourier transform diagonalizes stationary operators through a separate theorem.

THEOREM 2.8.– Let $T \in \mathrm{End}(\ell^2(\mathbb{Z}_N))$ be a stationary operator. Then, $T$ is diagonalizable, and each element $F_m$ of the orthogonal Fourier basis of $\ell^2(\mathbb{Z}_N)$ is an eigenvector of $T$.

PROOF.– For every fixed $m \in \{0, \ldots, N-1\}$, let us consider the element $m$ of the orthogonal Fourier basis: $F_m(n) = \frac{1}{N} e^{2\pi i \frac{mn}{N}}$. As $T$ is an endomorphism, $T F_m \in \ell^2(\mathbb{Z}_N)$, and thus $T F_m$ can be decomposed over the basis $(F_0, \ldots, F_{N-1})$ itself:

\[ (T F_m)(n) = \sum_{k=0}^{N-1} a_k F_k(n) = \frac{1}{N} \sum_{k=0}^{N-1} a_k\, e^{2\pi i \frac{kn}{N}}, \quad \forall n \in \mathbb{Z}_N \qquad [2.41] \]

Now, consider the action of the shift operator $R_1$ on $F_m$:

\[ R_1 F_m(n) = F_m(n-1) = \frac{1}{N} e^{2\pi i \frac{m(n-1)}{N}} = e^{-2\pi i \frac{m}{N}} \cdot \frac{1}{N} e^{2\pi i \frac{mn}{N}} = e^{-2\pi i \frac{m}{N}} \cdot F_m(n) \]

Applying $T$ to $R_1 F_m$, we obtain:

\[
\begin{aligned}
T R_1 F_m(n) &= T\left( e^{-2\pi i \frac{m}{N}} \cdot F_m \right)(n) \\
&= e^{-2\pi i \frac{m}{N}}\, (T F_m)(n) \quad \text{(linearity of } T\text{)} \\
&= e^{-2\pi i \frac{m}{N}} \sum_{k=0}^{N-1} a_k F_k(n) \quad \text{(equation [2.41])} \\
&= \sum_{k=0}^{N-1} a_k\, e^{-2\pi i \frac{m}{N}} F_k(n)
\end{aligned}
\]

Now, we switch the order of composition of $R_1$ and $T$:

\[
\begin{aligned}
R_1 T F_m(n) &= T F_m(n-1) \\
&= \frac{1}{N} \sum_{k=0}^{N-1} a_k\, e^{2\pi i \frac{k(n-1)}{N}} \quad \text{(equation [2.41])} \\
&= \frac{1}{N} \sum_{k=0}^{N-1} a_k\, e^{-2\pi i \frac{k}{N}} \cdot e^{2\pi i \frac{kn}{N}} \\
&= \sum_{k=0}^{N-1} a_k\, e^{-2\pi i \frac{k}{N}} F_k(n)
\end{aligned}
\]

Since $T$ is stationary, $T R_1 F_m = R_1 T F_m$, that is:

\[ \sum_{k=0}^{N-1} a_k\, e^{-2\pi i \frac{m}{N}} F_k(n) = \sum_{k=0}^{N-1} a_k\, e^{-2\pi i \frac{k}{N}} F_k(n) \]

and due to the uniqueness of decomposition over a basis:

\[ a_k\, e^{-2\pi i \frac{m}{N}} = a_k\, e^{-2\pi i \frac{k}{N}}, \quad \forall k \in \mathbb{Z}_N \ (m \text{ is fixed}) \qquad [2.42] \]

Let us analyze this equality. If $k = m$, then equation [2.42] is simply an identity and requires no further discussion. In the case where $k \neq m$, we begin by recalling that $m, k \in \{0, \ldots, N-1\}$, so the cosine and sine of the complex exponentials take their values within a single period, as the next period begins when $m, k = N$. Then:

\[ k \neq m \implies e^{-2\pi i \frac{m}{N}} \neq e^{-2\pi i \frac{k}{N}} \]

and equation [2.42] can be verified if and only if $a_k = 0$ $\forall k \neq m$. Equation [2.41] thus becomes:

\[ T F_m(n) = a_m F_m(n), \quad \forall n \in \mathbb{Z}_N, \]

that is, $F_m$ is an eigenvector of $T$ with an eigenvalue $a_m$ given by the $m$-th coefficient of the decomposition of $T F_m$ on the orthogonal Fourier basis. Evidently, the coefficient $a_m$ is dependent on $T$.

Given that we fixed an arbitrary index $m$, every element of the orthogonal Fourier basis is an eigenvector of $T$, and consequently $\ell^2(\mathbb{Z}_N)$ has a basis of eigenvectors of $T$. By definition, $T$ is therefore diagonalizable. □

Theorem 2.9 shows how the eigenvalues $a_m$ can be made explicit using the DFT.

The theorem shown above can be interpreted using matrices. We know that the action of the DFT is represented by the Sylvester matrix $W_N$ defined in equation [2.23] and that $W_N$ is the matrix used to pass from the canonical basis $B$ of $\ell^2(\mathbb{Z}_N)$ to the Fourier basis $F$ of $\ell^2(\mathbb{Z}_N)$; the inverse is $W_N^{-1} = \frac{1}{N} \overline{W_N}$, representing the matrix used to pass from basis $F$ to basis $B$. If $A$ is the matrix associated with $T$ with respect to the canonical basis of $\ell^2(\mathbb{Z}_N)$ and $D$ is the diagonal matrix of the eigenvalues of $A$, then:

\[ D = W_N A W_N^{-1}, \qquad A = W_N^{-1} D W_N \qquad [2.43] \]

If $[w]_F$ represents any vector $w \in \ell^2(\mathbb{Z}_N)$ with respect to the Fourier basis $F$, then, since $F$ diagonalizes $A$:

\[ W_N A z = [Az]_F = D[z]_F = D W_N z, \quad \forall z \in \ell^2(\mathbb{Z}_N) \]

so $W_N A = D W_N$, which holds if and only if $W_N A W_N^{-1} = D W_N W_N^{-1} = D$.
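Theorem 2.8 can be illustrated numerically. In the sketch below, the operator `T` is a hypothetical stationary operator chosen only for the example (a fixed combination of circular shifts), and the helper names are likewise assumptions rather than notation from the book. The code first checks that `T` commutes with every shift, then that each $F_m$ is an eigenvector, reading the candidate eigenvalue $a_m$ off at $n = 0$.

```python
import cmath

N = 8

def shift(z, k):
    """Circular shift R_k: (R_k z)(n) = z(n - k), indices taken modulo N."""
    return [z[(n - k) % N] for n in range(N)]

def T(z):
    """Hypothetical stationary operator: (Tz)(n) = 2 z(n) - z(n-1) + 0.5 z(n-3)."""
    return [2 * z[n] - z[(n - 1) % N] + 0.5 * z[(n - 3) % N] for n in range(N)]

def fourier_vector(m):
    """Element F_m of the orthogonal Fourier basis: F_m(n) = (1/N) exp(2*pi*i*m*n/N)."""
    return [cmath.exp(2j * cmath.pi * m * n / N) / N for n in range(N)]

def close(a, b, tol=1e-9):
    """Entrywise comparison of two sequences up to a small tolerance."""
    return all(abs(x - y) < tol for x, y in zip(a, b))

# Stationarity: T R_k z = R_k T z for every shift k, on an arbitrary test signal.
z = [1, -2, 3j, 0, 4, 1 - 1j, 0.5, 2]
stationary_ok = all(close(T(shift(z, k)), shift(T(z), k)) for k in range(N))

# Each F_m is an eigenvector: T F_m = a_m F_m.
eigenvalues = []
eigen_ok = True
for m in range(N):
    F = fourier_vector(m)
    TF = T(F)
    a_m = TF[0] / F[0]                 # F_m(0) = 1/N is never zero
    eigenvalues.append(a_m)
    eigen_ok = eigen_ok and close(TF, [a_m * x for x in F])

print(stationary_ok, eigen_ok)
```

For $m = 0$ the vector $F_0$ is constant, so the eigenvalue is simply the sum of the coefficients of the operator, here $2 - 1 + 0.5 = 1.5$.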


2.8.2. Circulant matrices

Thanks to the introduction of the concept of circulant matrix, we will be able to prove the fundamental theorem concerning the link between the Fourier transform and stationary operators.

First, let us generalize the periodicity of sequences of $\ell^2(\mathbb{Z}_N)$ to matrices: given a matrix $A = (a_{mn})_{m,n=0}^{N-1}$, we say that $A$ is an $N$-periodic matrix if:

\[ a_{m+kN,\,n} = a_{m,n} \quad \text{and} \quad a_{m,\,n+kN} = a_{m,n}, \quad \forall m, n, k \in \mathbb{Z} \]

EXAMPLE.– $a_{0,2} = a_{N,2} = a_{N,N+2}$

DEFINITION 2.16.– Let $A = (a_{mn})_{m,n=0}^{N-1}$ be an $N \times N$ periodic matrix. $A$ is said to be circulant if:

\[ a_{m+1,\,n+1} = a_{m,n}, \quad \forall m, n \in \mathbb{Z} \]

Repeating the translation $k$ times, the definition is rewritten as:

\[ a_{m+k,\,n+k} = a_{m,n}, \quad \forall m, n, k \in \mathbb{Z} \]

We see that, since $k \in \mathbb{Z}$, a circulant periodic matrix can also be defined with the property $a_{m-k,\,n-k} = a_{m,n}$, $k \in \mathbb{Z}$.

This definition is interpreted as follows. Row $m + 1$ is obtained from row $m$ by shifting it cyclically one position to the right, and column $n + 1$ is obtained from column $n$ by shifting it cyclically one position downward, as follows:

\[
\begin{pmatrix}
a_0 & a_1 & a_2 & \cdots & a_{N-1} \\
a_{N-1} & a_0 & a_1 & \cdots & a_{N-2} \\
a_{N-2} & a_{N-1} & a_0 & \cdots & a_{N-3} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
a_1 & a_2 & a_3 & \cdots & a_0
\end{pmatrix}
\]

EXAMPLE OF A CIRCULANT MATRIX.–

\[
A = \begin{pmatrix}
3 & 2+i & -1 & 4i \\
4i & 3 & 2+i & -1 \\
-1 & 4i & 3 & 2+i \\
2+i & -1 & 4i & 3
\end{pmatrix}
\]
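The defining property $a_{m+1,n+1} = a_{m,n}$ (with indices taken modulo $N$) translates directly into code. The following sketch is illustrative only; the function names `is_circulant` and `circulant_from_first_row` are assumptions introduced here. It rebuilds the $4 \times 4$ example above from its first row and shows that changing a single entry destroys the circulant structure.

```python
def is_circulant(A):
    """Check a_{m+1, n+1} = a_{m, n} for all m, n, with indices taken modulo N."""
    N = len(A)
    return all(A[(m + 1) % N][(n + 1) % N] == A[m][n]
               for m in range(N) for n in range(N))

def circulant_from_first_row(row):
    """Build the circulant matrix whose row m is the first row shifted m places right."""
    N = len(row)
    return [[row[(n - m) % N] for n in range(N)] for m in range(N)]

# The 4 x 4 circulant example, entered row by row.
A = [[3, 2 + 1j, -1, 4j],
     [4j, 3, 2 + 1j, -1],
     [-1, 4j, 3, 2 + 1j],
     [2 + 1j, -1, 4j, 3]]

rebuilt_ok = (A == circulant_from_first_row(A[0]))
circulant_ok = is_circulant(A)

# Changing one entry breaks the property (hypothetical perturbation for the test).
B = [row[:] for row in A]
B[3][0] = 0
perturbed_is_circulant = is_circulant(B)

print(rebuilt_ok, circulant_ok, perturbed_is_circulant)
```

A circulant matrix is therefore fully determined by $N$ numbers (one row or one column) rather than $N^2$.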


EXAMPLE OF A NON-CIRCULANT MATRIX.–

\[
B = \begin{pmatrix}
2 & i & 3 \\
3 & 2 & i \\
i & 2 & 3
\end{pmatrix}
\]

For this matrix to be circulant, the third row would have to be $(i, 3, 2)$.

2.8.3. Exhaustive characterization of stationary operators

Theorem 2.9 is the most important result of this chapter. It is used to produce the eigenvalues of a stationary operator $T$ in a very simple manner; it can also be used to characterize $T$ as a convolution operator, in the original representation of $z$, and as a multiplier, in the frequency representation.

THEOREM 2.9.– Let $T : \ell^2(\mathbb{Z}_N) \to \ell^2(\mathbb{Z}_N)$ be an endomorphism. The following properties are equivalent.
1) $T$ is stationary.
2) The matrix $A$, which represents $T$ in the canonical basis of $\ell^2(\mathbb{Z}_N)$, is circulant.
3) $T$ is a convolution operator.
4) $T$ is a Fourier multiplier.
5) The matrix $D$, which represents $T$ in the orthogonal Fourier basis $F$, is diagonal.

Note that implication 1) $\implies$ 5) has already been proved. The theorem will be proved using the following strategy:

\[ 1) \implies 2) \implies 3) \implies 1) \quad \text{and} \quad 3) \iff 4) \quad \text{and} \quad 4) \iff 5) \]

The proof of this theorem is crucial, as it provides an explicit technique for finding the eigenvalues of $T$ and for constructing the convolution operator and Fourier multiplier which represent $T$.

PROOF.– 1) $\implies$ 2): let $A$ be the associated matrix of $T$ with respect to the canonical basis⁸ $(e_n)_{n=0}^{N-1}$ of $\ell^2(\mathbb{Z}_N)$:

\[
A = \begin{pmatrix}
a_{0,0} & a_{0,1} & \cdots & a_{0,N-1} \\
a_{1,0} & a_{1,1} & \cdots & a_{1,N-1} \\
\vdots & \vdots & \ddots & \vdots \\
a_{N-1,0} & a_{N-1,1} & \cdots & a_{N-1,N-1}
\end{pmatrix}
\]

⁸ We recall that $e_n(m) = \delta_{n,m}$, $\forall n, m \in \mathbb{Z}_N$.


From the definition of the associated matrix, we have $a_{m,n} = (T e_n)(m)$, that is, the $n$-th column of $A$ is the vector $T e_n$. Using the fact that $T$ is stationary, we wish to show that:

\[ a_{m+1,\,n+1} = a_{m,n} \iff (T e_{n+1})(m+1) = (T e_n)(m), \quad \forall m, n \in \mathbb{Z}_N \]

We see that:

\[
(R_1 e_n)(m) = e_n(m-1) =
\begin{cases}
1 & \text{if } n = m-1 \iff m = n+1 \\
0 & \text{if } n \neq m-1 \iff m \neq n+1
\end{cases}
\;= e_{n+1}(m) \quad \forall m \in \mathbb{Z}_N
\]

thus $e_{n+1} = R_1 e_n$ and, consequently, since $T$ is stationary:

\[ a_{m+1,\,n+1} = (T R_1 e_n)(m+1) = R_1 (T e_n)(m+1) = (T e_n)(m+1-1) = (T e_n)(m) = a_{m,n} \]

Since $a_{m+1,\,n+1} = a_{m,n}$ $\forall m, n \in \mathbb{Z}_N$, $A$ is circulant and the implication 1) $\implies$ 2) is proved.

2) $\implies$ 3): let $A$ be a periodic circulant matrix, that is, $a_{m,n} = a_{m-k,\,n-k}$ $\forall n, m, k \in \mathbb{Z}$. We wish to prove the existence of $h \in \ell^2(\mathbb{Z}_N)$ such that $Az = z * h = T_h(z)$ $\forall z \in \ell^2(\mathbb{Z}_N)$. We shall prove that the sequence $h$ which we are looking for is the first column in $A$, that is:

\[ h = T e_0 = \begin{pmatrix} a_{0,0} \\ \vdots \\ a_{N-1,0} \end{pmatrix}, \qquad h(m) = a_{m,0}, \quad \forall m \in \mathbb{Z}_N \]

We see that, since $A$ is circulant, $h(m-n) = a_{m-n,\,0} = a_{m-n,\,n-n} = a_{m,n}$, and thus, from the definition of the matrix-vector product, we have:

\[ (Az)(m) = \sum_{n=0}^{N-1} a_{m,n}\, z(n) = \sum_{n=0}^{N-1} h(m-n)\, z(n) = (h * z)(m) \]

and implication 2) $\implies$ 3) is proved.

3) $\implies$ 1): we must prove that a convolution operator $T_w$ is stationary, that is:

\[ (T_w \circ R_k)(z) = (R_k \circ T_w)(z), \quad \forall z \in \ell^2(\mathbb{Z}_N),\ \forall k \in \mathbb{Z} \]


We begin by calculating the left side of the equation:

\[ (T_w R_k z)(m) = (w * R_k z)(m) = \sum_{n=0}^{N-1} w(m-n)\, R_k z(n) = \sum_{n=0}^{N-1} w(m-n)\, z(n-k) \]

Making the index substitution $\ell = n - k \iff n = k + \ell$, the index $\ell$ ranges from $\ell = -k$ (for $n = 0$) to $\ell = N - 1 - k$ (for $n = N - 1$); then:

\[
\begin{aligned}
(T_w R_k z)(m) &= \sum_{\ell=-k}^{N-1-k} w(m-k-\ell)\, z(\ell) \\
&= \sum_{\ell=0}^{N-1} w((m-k)-\ell)\, z(\ell) \quad \text{(Lemma 2.2)} \\
&= (z * w)(m-k) = R_k (z * w)(m) = (R_k T_w z)(m)
\end{aligned}
\]

and this proves the implication 3) $\implies$ 1).

3) $\iff$ 4): we must prove that a linear operator $T : \ell^2(\mathbb{Z}_N) \to \ell^2(\mathbb{Z}_N)$ is a convolution operator if and only if $T$ is a Fourier multiplier. Taking an arbitrary fixed element $w \in \ell^2(\mathbb{Z}_N)$, and using formula [2.36] of Theorem 2.7, we have:

\[ T_w(z) = z * w = \mathrm{IDFT}(\mathrm{DFT}(z * w)) = \mathrm{IDFT}(\hat{z} \cdot \hat{w}) = \mathrm{IDFT}(\hat{w} \cdot \hat{z}) = (\mathrm{IDFT} \circ M_{\hat{w}} \circ \mathrm{DFT})(z) = T_{(\hat{w})}(z), \quad \forall w, z \in \ell^2(\mathbb{Z}_N) \]

where $M_{\hat{w}}$ is the multiplication operator by the sequence $\hat{w}$. Inversely, using equation [2.38]:

\[ T_{(w)}(z) = (\mathrm{IDFT} \circ M_w \circ \mathrm{DFT})(z) = \mathrm{IDFT}(w \cdot \hat{z}) = \check{w} * \check{\hat{z}} = \check{w} * z = T_{\check{w}}(z), \quad \forall w, z \in \ell^2(\mathbb{Z}_N) \]

This shows us that the convolution operator with $w$ can be interpreted as the Fourier multiplier by $\hat{w}$ and vice versa, and that the Fourier multiplier by $w$ can be interpreted as the convolution operator with $\check{w}$:

\[ T_w = T_{(\hat{w})}, \qquad T_{(w)} = T_{\check{w}} \qquad \forall w \in \ell^2(\mathbb{Z}_N) \]

The double implication 3) $\iff$ 4) is thus proved.

Before continuing on to the final stage in our proof, let us summarize our findings. A stationary operator $T : \ell^2(\mathbb{Z}_N) \to \ell^2(\mathbb{Z}_N)$ is represented by a circulant matrix $A$ with respect to the canonical basis $(e_0, \ldots, e_{N-1})$ of $\ell^2(\mathbb{Z}_N)$.


This matrix $A$ can be represented by the convolution operator $T_h$ with $h = T e_0$, the first column of $A$, or, as we have just seen, by the Fourier multiplier $T_{(\hat{h})}$, where $\hat{h}$ is the sequence of Fourier coefficients of $h$.

4) $\iff$ 5): we must prove that $T$ is a Fourier multiplier $T_{(w)}$ if and only if the associated matrix of $T$ with respect to the orthogonal Fourier basis $F$ is diagonal. The direct implication has already been proved in formula [2.27], so we simply need to prove the implication 5) $\implies$ 4). Stating that $D = \mathrm{diag}(d_{n,n})$, $n = 0, \ldots, N-1$, is the diagonal matrix which represents an operator $T$ in the Fourier basis $F$ means that:

\[ [T(z)]_F = D[z]_F \iff \mathrm{DFT} \circ T(z) = M_w \circ \mathrm{DFT}(z) \]

with $M_w$ the multiplication operator by the sequence $w(n) = d_{n,n}$, $n = 0, \ldots, N-1$. Applying the IDFT to both sides:

\[ T(z) = \mathrm{IDFT} \circ M_w \circ \mathrm{DFT}(z) \quad \forall z \in \ell^2(\mathbb{Z}_N) \]

hence $T = T_{(w)}$, proving the implication 5) $\implies$ 4). The proof of the theorem is now complete. □

The theorem demonstrated above provides a standard technique for studying stationary operators $T$ over $\ell^2(\mathbb{Z}_N)$. We recall that the sequence:

\[
\delta \in \ell^2(\mathbb{Z}_N), \qquad \delta(n) = e_0(n) = \delta_{0,n} =
\begin{cases}
1 & \text{if } n = 0 \\
0 & \text{if } n \neq 0
\end{cases}
\qquad \forall n \in \mathbb{Z}_N
\]

is the unit pulse; thus, the operator $T$ is completely determined by its action on $\delta$, $h = T\delta$, which is referred to as the unit pulse response. $\hat{h}$, the DFT of the unit pulse response, is called the transfer function.

The properties demonstrated in Theorems 2.8 and 2.9 can be used to summarize the analysis of stationary operators, as shown in Box 2.2.

– $T$ is a stationary operator of $\ell^2(\mathbb{Z}_N)$.

– $A$ is the circulant matrix associated with $T$ with respect to the canonical basis of $\ell^2(\mathbb{Z}_N)$.

– $h$ is the unit pulse response of $T$: $h = T\delta$ = first column of $A$.

– $T_h$ is the convolution operator with $h$: $Tz = T_h z = h * z = z * h$.

– $T_{(\hat{h})}$ is the Fourier multiplier by $\hat{h}$, the transfer function: $Tz = T_{(\hat{h})} z = \mathrm{IDFT}(\hat{h} \cdot \hat{z})$.

– Given $h = T\delta$, we obtain the Fourier pair in Table 2.4.

Original representation     Fourier space
$h * z$                     $\hat{h} \cdot \hat{z}$

Table 2.4. Fourier pair for the convolution between a signal z and the unit pulse response h of T

– $D$ is the diagonal matrix which represents $T$ in the orthogonal Fourier basis $F$ of $\ell^2(\mathbb{Z}_N)$:

\[ D = \frac{1}{N} W_N A \overline{W_N} = \mathrm{diag}(\hat{h}(0), \ldots, \hat{h}(N-1)) \]

– The eigenvalues of $T$ (the spectrum, in the linear algebra sense) are the components of the transfer function, that is, the Fourier coefficients of the unit pulse response:

\[ \text{Eigenvalues of } T : \{\hat{h}(0), \ldots, \hat{h}(N-1)\} \]

Box 2.2. Analysis of stationary operators over $\ell^2(\mathbb{Z}_N)$
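The recipe of Box 2.2 can be followed step by step on the $4 \times 4$ circulant matrix used as an example in section 2.8.2. The sketch below is only an illustration under that assumption; the helper names `dft`, `circ_conv` and `close` and the test signal are introduced here and are not part of the book.

```python
import cmath

def dft(z):
    """Unnormalized DFT: z_hat(m) = sum_n z(n) * exp(-2*pi*i*m*n/N)."""
    N = len(z)
    return [sum(z[n] * cmath.exp(-2j * cmath.pi * m * n / N) for n in range(N))
            for m in range(N)]

def circ_conv(h, z):
    """Circular convolution: (h * z)(n) = sum_k h(n - k) z(k), indices modulo N."""
    N = len(z)
    return [sum(h[(n - k) % N] * z[k] for k in range(N)) for n in range(N)]

def close(a, b, tol=1e-9):
    """Entrywise comparison of two sequences up to a small tolerance."""
    return all(abs(x - y) < tol for x, y in zip(a, b))

# Circulant matrix A representing a stationary operator T in the canonical basis.
A = [[3, 2 + 1j, -1, 4j],
     [4j, 3, 2 + 1j, -1],
     [-1, 4j, 3, 2 + 1j],
     [2 + 1j, -1, 4j, 3]]
N = len(A)

h = [A[m][0] for m in range(N)]        # unit pulse response = first column of A
h_hat = dft(h)                         # transfer function = eigenvalues of T

z = [1, -1j, 2, 0.5]                   # arbitrary test signal
z_hat = dft(z)
Az = [sum(A[m][n] * z[n] for n in range(N)) for m in range(N)]

conv_ok = close(Az, circ_conv(h, z))                                     # T z = h * z
multiplier_ok = close(dft(Az), [h_hat[m] * z_hat[m] for m in range(N)])  # DFT(Tz) = h_hat . z_hat

# Each Fourier vector F_m is an eigenvector of A with eigenvalue h_hat(m).
eigen_ok = True
for m in range(N):
    F = [cmath.exp(2j * cmath.pi * m * n / N) / N for n in range(N)]
    AF = [sum(A[r][c] * F[c] for c in range(N)) for r in range(N)]
    eigen_ok = eigen_ok and close(AF, [h_hat[m] * x for x in F])

print(conv_ok, multiplier_ok, eigen_ok)
```

Here $h = (3, 4i, -1, 2 + i)$, so for instance $\hat{h}(0) = 3 + 4i - 1 + 2 + i = 4 + 5i$, which is the common sum of every row of $A$.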

2.8.4. High-pass, low-pass and band-pass filters

The synthesis formula for any given signal $z \in \ell^2(\mathbb{Z}_N)$ transformed via the action of a stationary operator $T \in \mathrm{End}(\ell^2(\mathbb{Z}_N))$ is:

\[ Tz(n) = \sum_{m=0}^{N-1} \widehat{Tz}(m)\, F_m(n), \quad n \in \mathbb{Z}_N \qquad [2.44] \]

where $F_m$ is the vector with index $m$ of the orthogonal Fourier basis of $\ell^2(\mathbb{Z}_N)$. Thus, $|\widehat{Tz}(m)|$ represents the importance of the harmonic of frequency $m$ in the reconstruction of $Tz$, and $\{|\widehat{Tz}(m)|,\ m \in \mathbb{Z}_N\}$ represents the spectrum of the transformed signal $Tz$.

To understand how the spectrum of $Tz$ is linked to that of the original signal $z$, let us apply the DFT to both sides of the formula $Tz = T_{(\hat{h})} z = \mathrm{IDFT}(\hat{h} \cdot \hat{z})$, where $h = T\delta_0$:

\[ \mathrm{DFT}(Tz) = \mathrm{DFT} \circ \mathrm{IDFT}(\hat{h} \cdot \hat{z}) = \hat{h} \cdot \hat{z} \]

that is:

\[ \widehat{Tz}(m) = \hat{h}(m) \cdot \hat{z}(m), \quad \forall m \in \mathbb{Z}_N \]

so the Fourier coefficients of $Tz$, the sequence transformed by the operator $T$, are given by the product of the Fourier coefficients of the original sequence $z$ and the Fourier coefficients of the unit pulse response $h$. Consequently, the spectrum of the transformed sequence $Tz$ is:

\[ \left\{ \left| \widehat{Tz}(m) \right| = |\hat{h}(m)| \cdot |\hat{z}(m)|,\ m \in \mathbb{Z}_N \right\} \qquad [2.45] \]

This allows us to understand the action of stationary filters $T$ on the frequency content of a signal $z$:

– if $\hat{h}(0) = 0$, the average of $Tz$ is zero, since $|\widehat{Tz}(0)| = 0 \cdot |\hat{z}(0)| = 0$ and we know that $\widehat{Tz}(0)$ is proportional to the average of $Tz$;

– if $|\hat{h}(0)| = 1$, then $T$ preserves the average of $z$, that is, $\langle Tz \rangle = \langle z \rangle$;

– if $|\hat{h}(m)| > 1$ for $m \simeq 0$ and $m \simeq N-1$, and $|\hat{h}(m)| \in [0, 1)$ for $m \simeq N/2$, then $T$ increases the low frequencies and reduces the high frequencies (low-pass filter);

– if $|\hat{h}(m)| > 1$ for $m \simeq N/2$ and $|\hat{h}(m)| \in [0, 1)$ for $m \simeq 0$ and $m \simeq N-1$, then $T$ increases the high frequencies and reduces the low frequencies (high-pass filter);

– if $|\hat{h}(m)| > 1$ for intermediate values of $m$, then $T$ increases the mid-range frequencies (band-pass filter).
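Formula [2.45] can be observed on a concrete filter. The sketch below uses a hypothetical smoothing kernel $h$, with $h(0) = 1/2$ and $h(1) = h(N-1) = 1/4$, chosen here only for illustration: its transfer function is $\hat{h}(m) = \cos^2(\pi m / N)$, so it leaves the average unchanged ($\hat{h}(0) = 1$), attenuates the higher frequencies and removes the harmonic $m = N/2$ entirely. The helper names and the test signal are likewise assumptions.

```python
import cmath

def dft(z):
    """Unnormalized DFT: z_hat(m) = sum_n z(n) * exp(-2*pi*i*m*n/N)."""
    N = len(z)
    return [sum(z[n] * cmath.exp(-2j * cmath.pi * m * n / N) for n in range(N))
            for m in range(N)]

def circ_conv(h, z):
    """Circular convolution: (h * z)(n) = sum_k h(n - k) z(k), indices modulo N."""
    N = len(z)
    return [sum(h[(n - k) % N] * z[k] for k in range(N)) for n in range(N)]

z = [3, 1, 4, 1, 5, 9, 2, 6]                     # arbitrary real test signal
N = len(z)

# Hypothetical smoothing kernel: weights 1/4, 1/2, 1/4 on n-1, n, n+1 (circularly).
h = [0.5, 0.25] + [0] * (N - 3) + [0.25]

Tz = circ_conv(h, z)
h_hat, z_hat, Tz_hat = dft(h), dft(z), dft(Tz)

# Formula [2.45]: |(Tz)^(m)| = |h_hat(m)| * |z_hat(m)| for every m.
spectrum_ok = all(abs(abs(Tz_hat[m]) - abs(h_hat[m]) * abs(z_hat[m])) < 1e-9
                  for m in range(N))

# h_hat(0) = 1, so the average of the signal is preserved.
mean_ok = abs(sum(Tz) / N - sum(z) / N) < 1e-9

# h_hat(N/2) = 0, so the highest frequency is removed.
highest_removed = abs(Tz_hat[N // 2]) < 1e-9

print(spectrum_ok, mean_ok, highest_removed)
```

This kernel never amplifies a frequency ($|\hat{h}(m)| \le 1$ for all $m$), so it attenuates high frequencies without boosting low ones.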

2.8.5. Characterizing stationary operators using shift operators

We now have all of the results we need to demonstrate the characterization of a stationary operator as a linear combination of shift operators or, in an equivalent manner, as a polynomial of the shift operator $R_1$, since $R_k = R_1 \circ \cdots \circ R_1$ ($k$ times), that is, $R_k = R_1^k$, $\forall k \in \mathbb{Z}$.

THEOREM 2.10.– $T \in \mathrm{End}(\ell^2(\mathbb{Z}_N))$ is stationary if and only if the expression of $T$ is:

\[ (Tz)(n) = \sum_{k=0}^{N-1} a_k\, z(n-k) = \sum_{k=0}^{N-1} a_k\, R_k z(n) = \sum_{k=0}^{N-1} a_k\, (R_1)^k z(n) \qquad [2.46] \]

$\forall n \in \{0, \ldots, N-1\}$, where $a_k \in \mathbb{C}$.


PROOF.– $\implies$: let $T$ be stationary. We know that $T = T_h$, where $T_h$ is the convolution operator with regard to the unit pulse response $h = T\delta$, that is:

\[ (Tz)(n) = \sum_{k=0}^{N-1} h(k)\, z(n-k). \]

We must therefore simply identify the coefficients $a_k$ of the formula $(Tz)(n) = \sum_{k=0}^{N-1} a_k\, z(n-k)$ with $h(k)$ to obtain our thesis.

$\impliedby$: we can verify that $T$, written in the form used in formula [2.46], is stationary due to the linearity of $T$ and $R_k$. We know that $\forall n \in \{0, \ldots, N-1\}$:

\[
\begin{aligned}
(T R_m z)(n) &= \sum_{k=0}^{N-1} a_k\, (R_m z)(n-k) = \sum_{k=0}^{N-1} a_k\, z(n-k-m) \\
&= R_m \left( \sum_{k=0}^{N-1} a_k\, z(n-k) \right) \quad \text{(linearity of } R_m\text{)} \\
&= (R_m T z)(n)
\end{aligned}
\]

hence: $T \circ R_m = R_m \circ T$ $\forall m \in \mathbb{Z}$. □

Since $h(k) = T\delta(k)$, the proof of the theorem above also proves the validity of the formula:

\[ (Tz)(n) = \sum_{k=0}^{N-1} T\delta(k)\, z(n-k) \qquad \forall\, T \text{ stationary} \]

2.8.6. Frequency analysis of first and second derivation operators (discrete case)

In this section, we shall analyze two stationary operators which represent the discrete version of the first and second derivatives. By comparing their eigenvalues, we see that the second derivation operator is more efficient for amplifying high frequencies in digital signals.

DEFINITION 2.17.– Given a sequence $z \in \ell^2(\mathbb{Z}_N)$, we define:

\[ T_1 z(n) = z(n+1) - z(n) \qquad \text{(discrete first derivative)} \]

\[ T_2 z(n) = z(n+1) - 2z(n) + z(n-1) \qquad \text{(discrete second derivative)} \]


The discrete first derivative is simply the forward difference of $z$, divided by the difference of the values of $n$, but since $(n+1) - n = 1$ there is no need to write the denominator. The discrete second derivative is the backward difference of the discrete first derivative of $z$, divided by the difference of the values of $n$, which, once again, is 1, so does not need to be written: $T_2 z(n) = T_1 z(n) - T_1 z(n-1) = z(n+1) - z(n) - [z(n) - z(n-1)] = z(n+1) - 2z(n) + z(n-1)$.

Let us begin by analyzing $T_1$. To calculate the pulse response, $T_1$ is applied to the unit pulse $\delta = e_0$:

\[
h = T_1 \delta =
\begin{pmatrix}
e_0(1) - e_0(0) \\
e_0(2) - e_0(1) \\
\vdots \\
e_0(N-1+1) - e_0(N-1)
\end{pmatrix}
\begin{matrix}
\leftarrow n = 0 \\
\leftarrow n = 1 \\
\\
\leftarrow n = N-1
\end{matrix}
=
\begin{pmatrix}
-1 \\ 0 \\ \vdots \\ 0 \\ 1
\end{pmatrix}
\]

using the fact that $e_0(0) = e_0(N) = 1$. The matrix which represents $T_1$ in the canonical basis of $\ell^2(\mathbb{Z}_N)$ is:

\[
A_{T_1} =
\begin{pmatrix}
-1 & 1 & 0 & \cdots & 0 \\
0 & -1 & 1 & \cdots & 0 \\
\vdots & \ddots & \ddots & \ddots & \vdots \\
0 & 0 & \cdots & -1 & 1 \\
1 & 0 & \cdots & 0 & -1
\end{pmatrix}
\]

Now, let us calculate the DFT of $h$. For all $m \in \mathbb{Z}_N$, this is:

\[
\begin{aligned}
\hat{h}(m) &= \sum_{n=0}^{N-1} h(n)\, e^{-2\pi i \frac{mn}{N}} = -1 \cdot e^{-2\pi i \frac{m \cdot 0}{N}} + 0 + \ldots + 1 \cdot e^{-2\pi i \frac{m(N-1)}{N}} \\
&= -1 + e^{2\pi i \frac{m}{N}}\, e^{-2\pi i \frac{mN}{N}} = e^{2\pi i \frac{m}{N}} - 1
\end{aligned}
\]

so the eigenvalues of $T_1$ are $\{\hat{h}(m) = e^{2\pi i \frac{m}{N}} - 1,\ m = 0, 1, \ldots, N-1\}$ and its diagonal representation is:

\[ D = \mathrm{diag}\left( 0,\ e^{2\pi i \frac{1}{N}} - 1,\ e^{2\pi i \frac{2}{N}} - 1,\ \ldots,\ e^{2\pi i \frac{N-1}{N}} - 1 \right) \]

The action of $T_1$ in terms of frequency can now be interpreted using formula [2.45]. We wish to calculate the magnitudes of the eigenvalues $(\hat{h}(m))_{m \in \mathbb{Z}_N}$. We see that:

\[ e^{2\pi i \frac{m}{N}} - 1 = e^{\pi i \frac{m}{N}} \left( e^{\pi i \frac{m}{N}} - e^{-\pi i \frac{m}{N}} \right) = e^{\pi i \frac{m}{N}}\, 2i \sin\left( \pi \frac{m}{N} \right) \]


Thus, $|\hat{h}(m)| = \left| e^{\pi i \frac{m}{N}} \right| \cdot \left| 2i \sin\left( \pi \frac{m}{N} \right) \right| = 2 \left| \sin\left( \pi \frac{m}{N} \right) \right|$, while $m \in \mathbb{Z}_N$ gives $\frac{m}{N} < 1$, so the sine is always non-negative and the absolute value can be eliminated. To summarize:

\[ |\hat{h}(m)| = 2 \sin\left( \pi \frac{m}{N} \right), \quad m \in \mathbb{Z}_N \]

Specifically:

– $|\hat{h}(0)| = 0$: hence, the filtered signal $T_1 z$ averages to zero;

– $|\hat{h}(\frac{N}{2})| = 2$;

– $|\hat{h}(m)| < 2$ $\forall m \neq \frac{N}{2}$;

– $|\hat{h}(m)| \to 0$ if $m \to 0$ or $m \to N-1$;

– the action of the operator is symmetrical with regard to $\frac{N}{2}$.

Since $m = N/2$ represents the highest frequency of the signal and $m = 0$ and $m = N-1$ represent the lowest frequencies, we can deduce that $T_1$ reduces the low frequencies of $z$ and increases the high frequencies by up to two times. Thus, the discrete first derivative operator is a high-pass filter.

Now, let us analyze $T_2$. Its pulse response is given by the vector:

\[
h = T_2 \delta =
\begin{pmatrix}
e_0(1) - 2e_0(0) + e_0(-1) \\
e_0(2) - 2e_0(1) + e_0(0) \\
\vdots \\
e_0(N) - 2e_0(N-1) + e_0(N-2)
\end{pmatrix}
=
\begin{pmatrix}
-2 \\ 1 \\ 0 \\ \vdots \\ 0 \\ 1
\end{pmatrix}
\]

The matrix associated with $T_2$ in the canonical basis of $\ell^2(\mathbb{Z}_N)$ is:

\[
A_{T_2} =
\begin{pmatrix}
-2 & 1 & 0 & \cdots & 1 \\
1 & -2 & 1 & \cdots & 0 \\
\vdots & \ddots & \ddots & \ddots & \vdots \\
0 & 0 & \cdots & -2 & 1 \\
1 & 0 & \cdots & 1 & -2
\end{pmatrix}
\]


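Next, we calculate the DFT of $h$:

\[ \hat{h}(m) = \sum_{n=0}^{N-1} h(n) e^{-2\pi i \frac{mn}{N}} = -2 \cdot e^{-2\pi i \frac{m0}{N}} + 1 \cdot e^{-2\pi i \frac{m}{N}} + 0 + \ldots + 1 \cdot e^{-2\pi i \frac{m(N-1)}{N}} \]

\[ = -2 + e^{-2\pi i \frac{m}{N}} + e^{-2\pi i m} e^{2\pi i \frac{m}{N}} = -2 + e^{2\pi i \frac{m}{N}} + e^{-2\pi i \frac{m}{N}} = -2 + 2 \cdot \frac{e^{2\pi i \frac{m}{N}} + e^{-2\pi i \frac{m}{N}}}{2} = -2 + 2\cos\left(2\pi \frac{m}{N}\right) \]

These values of $\hat{h}(m)$ must now be compared with those of the first derivative operator. We do this by rewriting $\hat{h}(m) = -4\left[\frac{1}{2} - \frac{1}{2}\cos\left(2\pi \frac{m}{N}\right)\right]$ and using the trigonometric identity $\sin^2(\alpha) = \frac{1}{2} - \frac{1}{2}\cos(2\alpha)$ with $\alpha = \pi \frac{m}{N}$ to obtain $\hat{h}(m) = -4\sin^2\left(\pi \frac{m}{N}\right)$. The eigenvalues of $T_2$ are thus $\hat{h}(m) = -4\sin^2\left(\pi \frac{m}{N}\right)$, $m = 0, 1, \ldots, N-1$, and its diagonal representation is:

\[ D = \mathrm{diag}\left(0,\ -4\sin^2\left(\frac{\pi}{N}\right),\ -4\sin^2\left(\frac{2\pi}{N}\right),\ \ldots,\ -4\sin^2\left(\frac{(N-1)\pi}{N}\right)\right) \]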
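As a numerical cross-check of this result, here is a short sketch in plain Python (the size N = 8 and the helper name `dft` are assumptions made for illustration). It computes the DFT of the pulse responses of $T_1$ and $T_2$, compares the latter with $-4\sin^2(\pi m/N)$, and verifies that the magnitudes for $T_2$ are the squares of those for $T_1$.

```python
import cmath
import math


def dft(z):
    """Direct O(N^2) DFT: z_hat(m) = sum_n z(n) * exp(-2*pi*i*m*n/N)."""
    N = len(z)
    return [sum(z[n] * cmath.exp(-2j * math.pi * m * n / N) for n in range(N))
            for m in range(N)]


N = 8  # illustrative size (assumption); the construction below needs N >= 3

h1 = [-1.0] + [0.0] * (N - 2) + [1.0]        # pulse response of T1
h2 = [-2.0, 1.0] + [0.0] * (N - 3) + [1.0]   # pulse response of T2

h1_hat = dft(h1)
h2_hat = dft(h2)

# Closed form derived in the text: h2_hat(m) = -4 sin^2(pi m / N)
closed_form = [-4 * math.sin(math.pi * m / N) ** 2 for m in range(N)]
closed_form_error = max(abs(a - b) for a, b in zip(h2_hat, closed_form))

# Magnitudes for T2 are the squares of the magnitudes for T1
square_error = max(abs(abs(b) - abs(a) ** 2) for a, b in zip(h1_hat, h2_hat))
```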

Figure 2.4. Difference between the sine functions representing the spectrum values of the ﬁrst and second derivative operators between 0 and π. For a color version of this ﬁgure, see www.iste.co.uk/provenzi/spaces.zip


The effect of $T_2$ on the frequency is defined by the magnitudes of its eigenvalues:

\[ |\hat{h}(m)| = 4\sin^2\left(\pi \frac{m}{N}\right), \quad m \in \mathbb{Z}_N \]

We see that the magnitudes of the eigenvalues of the second derivative operator are the squares of those of the first derivative operator. Hence:
– $|\hat{h}(0)| = 0$: thus, as in the case of the first derivative, the filtered signal $T_2 z$ averages to zero;
– $|\hat{h}(\frac{N}{2})| = 4$;
– $|\hat{h}(m)| < 4$ $\forall m \neq \frac{N}{2}$;
– $|\hat{h}(m)| \to 0$ if $m \to 0$ or $m \to N-1$, and the convergence to zero is faster than for the first derivative operator since, in this case, the sine is squared, as illustrated in Figure 2.4;
– the action of the operator is symmetrical about $\frac{N}{2}$.

Thus, the discrete second derivative operator is also a high-pass filter, amplifying high frequencies and reducing low frequencies in a way which is the square of the action of the discrete first derivative operator.

2.9. The two-dimensional discrete Fourier transform (2D DFT)

The Fourier transform considered up to now applies to signals $z(n)$ which depend on only one parameter $n$. In practical contexts, signals are often very large and depend on multiple parameters. One classic example is that of digital images, which include two parameters: the two spatial coordinates of a pixel, as shown in Figure 2.5.

DFT theory can be generalized for signals which depend on any (finite) number of parameters. For simplicity's sake, we shall focus on the two-dimensional (2D) case, with parameters $n_1, n_2$. The first step is to introduce the domain vector space: if $N_1, N_2 \in \mathbb{N}$, we define:

\[ \ell^2(\mathbb{Z}_{N_1} \times \mathbb{Z}_{N_2}) = \{ z : \mathbb{Z}_{N_1} \times \mathbb{Z}_{N_2} \to \mathbb{C} \} \]

$z \in \ell^2(\mathbb{Z}_{N_1} \times \mathbb{Z}_{N_2})$ is a complex sequence which depends on two parameters:

\[ \begin{cases} n_1 \in \{0, 1, \ldots, N_1 - 1\} \\ n_2 \in \{0, 1, \ldots, N_2 - 1\} \end{cases} \]


Figure 2.5. The two coordinates of a pixel, n1 , n2 , in a digital image (image source: author). For a color version of this ﬁgure, see www.iste.co.uk/provenzi/spaces.zip

$\ell^2(\mathbb{Z}_{N_1} \times \mathbb{Z}_{N_2})$ is a vector space of dimension $N_1 \cdot N_2$. Addition and multiplication by a complex scalar are defined as in the 1D case, and so is the inner product:

\[ \langle z, w \rangle = \sum_{n_1=0}^{N_1-1} \sum_{n_2=0}^{N_2-1} z(n_1, n_2)\, \overline{w(n_1, n_2)}, \quad \forall z, w \in \ell^2(\mathbb{Z}_{N_1} \times \mathbb{Z}_{N_2}) \]

To extend DFT theory from one to two dimensions, we use the procedure for generating bases in $\ell^2(\mathbb{Z}_{N_1} \times \mathbb{Z}_{N_2})$ from bases in $\ell^2(\mathbb{Z}_{N_1})$ and $\ell^2(\mathbb{Z}_{N_2})$.

THEOREM 2.11.– Let $\{B_0, B_1, \ldots, B_{N_1-1}\}$, $\{C_0, C_1, \ldots, C_{N_2-1}\}$ be orthonormal bases in $\ell^2(\mathbb{Z}_{N_1})$ and $\ell^2(\mathbb{Z}_{N_2})$, respectively. For all $m_1 \in \{0, \ldots, N_1-1\}$ and $m_2 \in \{0, \ldots, N_2-1\}$, consider the sequences in $\ell^2(\mathbb{Z}_{N_1} \times \mathbb{Z}_{N_2})$ given by:

\[ D_{m_1,m_2}(n_1, n_2) = B_{m_1}(n_1) \cdot C_{m_2}(n_2) \]

Then, the family $(D_{m_1,m_2})$ is an orthonormal basis of $\ell^2(\mathbb{Z}_{N_1} \times \mathbb{Z}_{N_2})$, known as the tensor product basis of the two original bases.

PROOF.– The sequences $D_{m_1,m_2}$, $m_1 \in \{0, \ldots, N_1-1\}$ and $m_2 \in \{0, \ldots, N_2-1\}$, are $N_1 \cdot N_2$ elements of $\ell^2(\mathbb{Z}_{N_1} \times \mathbb{Z}_{N_2})$, which is of dimension $N_1 \cdot N_2$. Proof that these constitute an orthonormal basis can be obtained by showing that:

\[ \langle D_{m_1,m_2}, D_{k_1,k_2} \rangle = \delta_{(m_1,m_2),(k_1,k_2)} = \delta_{m_1,k_1} \delta_{m_2,k_2} = \begin{cases} 1 & \text{if } (m_1, m_2) = (k_1, k_2) \\ 0 & \text{if } (m_1, m_2) \neq (k_1, k_2) \end{cases} \]


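By the definition of $\langle\ ,\ \rangle$ and then the definition of $D$:

\[ \langle D_{m_1,m_2}, D_{k_1,k_2} \rangle = \sum_{n_1=0}^{N_1-1} \sum_{n_2=0}^{N_2-1} D_{m_1,m_2}(n_1, n_2)\, \overline{D_{k_1,k_2}(n_1, n_2)} = \sum_{n_1=0}^{N_1-1} \sum_{n_2=0}^{N_2-1} B_{m_1}(n_1) C_{m_2}(n_2)\, \overline{B_{k_1}(n_1)}\, \overline{C_{k_2}(n_2)} \]

\[ = \left( \sum_{n_1=0}^{N_1-1} B_{m_1}(n_1) \overline{B_{k_1}(n_1)} \right) \left( \sum_{n_2=0}^{N_2-1} C_{m_2}(n_2) \overline{C_{k_2}(n_2)} \right) = \underbrace{\langle B_{m_1}, B_{k_1} \rangle}_{\delta_{m_1,k_1}}\, \underbrace{\langle C_{m_2}, C_{k_2} \rangle}_{\delta_{m_2,k_2}} = \delta_{(m_1,m_2),(k_1,k_2)}. \quad \square \]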
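Theorem 2.11 can be illustrated on a small case. The sketch below, in plain Python, assumes $N_1 = 3$ and $N_2 = 4$, takes the orthonormal Fourier basis of $\ell^2(\mathbb{Z}_{N_1})$ for the $B_{m_1}$ and the canonical basis of $\ell^2(\mathbb{Z}_{N_2})$ for the $C_{m_2}$ (an arbitrary choice made here), and checks that the Gram matrix of the tensor product family is the identity. The function names are hypothetical.

```python
import cmath
import math


def fourier_onb(N):
    """Orthonormal Fourier basis of l2(Z_N): E_m(n) = exp(2 pi i m n / N) / sqrt(N)."""
    return [[cmath.exp(2j * math.pi * m * n / N) / math.sqrt(N) for n in range(N)]
            for m in range(N)]


def canonical_basis(N):
    """Canonical basis of l2(Z_N): e_m(n) = 1 if n == m, else 0."""
    return [[1.0 if n == m else 0.0 for n in range(N)] for m in range(N)]


def inner2d(z, w):
    """Inner product on l2(Z_N1 x Z_N2), conjugating the second argument."""
    return sum(z[a][b] * w[a][b].conjugate()
               for a in range(len(z)) for b in range(len(z[0])))


N1, N2 = 3, 4  # illustrative sizes (assumption)
B = fourier_onb(N1)
C = canonical_basis(N2)

# Tensor product family D_{m1,m2}(n1, n2) = B_{m1}(n1) * C_{m2}(n2)
D = {(m1, m2): [[B[m1][n1] * C[m2][n2] for n2 in range(N2)] for n1 in range(N1)]
     for m1 in range(N1) for m2 in range(N2)}

# Largest deviation of <D_p, D_q> from the Kronecker delta
gram_error = max(abs(inner2d(D[p], D[q]) - (1.0 if p == q else 0.0))
                 for p in D for q in D)
```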

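For $m_1 \in \{0, 1, \ldots, N_1-1\}$ and $m_2 \in \{0, 1, \ldots, N_2-1\}$, this theorem has the following corollaries:
– the canonical orthonormal basis of $\ell^2(\mathbb{Z}_{N_1} \times \mathbb{Z}_{N_2})$ is $B = \{e_{m_1,m_2}\}$, where:

\[ e_{m_1,m_2}(n_1, n_2) = \begin{cases} 1 & \text{if } (n_1, n_2) = (m_1, m_2) \\ 0 & \text{if } (n_1, n_2) \neq (m_1, m_2) \end{cases} \]

– the orthogonal Fourier basis of $\ell^2(\mathbb{Z}_{N_1} \times \mathbb{Z}_{N_2})$ is:

\[ F_{m_1,m_2}(n_1, n_2) = \frac{1}{N_1} e^{2\pi i \frac{m_1 n_1}{N_1}} \cdot \frac{1}{N_2} e^{2\pi i \frac{m_2 n_2}{N_2}} = \frac{1}{N_1 N_2} e^{2\pi i \left( \frac{m_1 n_1}{N_1} + \frac{m_2 n_2}{N_2} \right)} \]

– the orthonormal Fourier basis of $\ell^2(\mathbb{Z}_{N_1} \times \mathbb{Z}_{N_2})$ is:

\[ E_{m_1,m_2}(n_1, n_2) = \sqrt{N_1 N_2}\, F_{m_1,m_2}(n_1, n_2) \]

– the orthogonal basis of the complex exponentials in $\ell^2(\mathbb{Z}_{N_1} \times \mathbb{Z}_{N_2})$ is:

\[ \mathcal{E}_{m_1,m_2}(n_1, n_2) = N_1 N_2\, F_{m_1,m_2}(n_1, n_2) \]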

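Using the theory of complex inner product spaces, the definition of Fourier coefficients, the DFT and the IDFT can be generalized to $\ell^2(\mathbb{Z}_{N_1} \times \mathbb{Z}_{N_2})$. Taking $z \in \ell^2(\mathbb{Z}_{N_1} \times \mathbb{Z}_{N_2})$ and the complex exponential $\mathcal{E}_{m_1,m_2}(n_1, n_2) = e^{2\pi i \frac{m_1 n_1}{N_1}} e^{2\pi i \frac{m_2 n_2}{N_2}}$, we have:

\[ \langle z, \mathcal{E}_{m_1,m_2} \rangle = \sum_{n_1=0}^{N_1-1} \sum_{n_2=0}^{N_2-1} z(n_1, n_2)\, \overline{e^{2\pi i \frac{n_1 m_1}{N_1}} e^{2\pi i \frac{n_2 m_2}{N_2}}} = \sum_{n_1=0}^{N_1-1} \sum_{n_2=0}^{N_2-1} z(n_1, n_2)\, e^{-2\pi i \frac{m_1 n_1}{N_1}} e^{-2\pi i \frac{m_2 n_2}{N_2}} \]

\[ = \sum_{n_1=0}^{N_1-1} \sum_{n_2=0}^{N_2-1} z(n_1, n_2)\, e^{-2\pi i \left( \frac{m_1 n_1}{N_1} + \frac{m_2 n_2}{N_2} \right)} \]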


thus the Fourier coefficients of $z \in \ell^2(\mathbb{Z}_{N_1} \times \mathbb{Z}_{N_2})$ are defined as follows:

\[ \hat{z}(m_1, m_2) = \sum_{n_1=0}^{N_1-1} \sum_{n_2=0}^{N_2-1} z(n_1, n_2)\, e^{-2\pi i \left( \frac{m_1 n_1}{N_1} + \frac{m_2 n_2}{N_2} \right)} \]

As in the 1D case:

\[ \hat{z}(0, 0) = N_1 N_2 \langle z \rangle \]

where $\langle z \rangle$ is the average of $z$. Note that the quantity $N_1 N_2$ may be extremely large. The synthesis formula can also be generalized to the 2D case, as follows:

\[ z(n_1, n_2) = \frac{1}{N_1 N_2} \sum_{m_1=0}^{N_1-1} \sum_{m_2=0}^{N_2-1} \hat{z}(m_1, m_2)\, e^{2\pi i \left( \frac{m_1 n_1}{N_1} + \frac{m_2 n_2}{N_2} \right)} \]

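The 2D DFT and 2D IDFT operators can therefore be written using the following formulas:

\[ \hat{\ } : \ell^2(\mathbb{Z}_{N_1} \times \mathbb{Z}_{N_2}) \longrightarrow \ell^2(\mathbb{Z}_{N_1} \times \mathbb{Z}_{N_2}), \quad z \longmapsto \hat{z}, \quad \hat{z}(m_1, m_2) = \sum_{n_1=0}^{N_1-1} \sum_{n_2=0}^{N_2-1} z(n_1, n_2)\, e^{-2\pi i \left( \frac{m_1 n_1}{N_1} + \frac{m_2 n_2}{N_2} \right)} \]

and:

\[ \check{\ } : \ell^2(\mathbb{Z}_{N_1} \times \mathbb{Z}_{N_2}) \longrightarrow \ell^2(\mathbb{Z}_{N_1} \times \mathbb{Z}_{N_2}), \quad z \longmapsto \check{z}, \quad \check{z}(n_1, n_2) = \frac{1}{N_1 N_2} \sum_{m_1=0}^{N_1-1} \sum_{m_2=0}^{N_2-1} z(m_1, m_2)\, e^{2\pi i \left( \frac{m_1 n_1}{N_1} + \frac{m_2 n_2}{N_2} \right)} \]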
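The two operators can be transcribed almost literally. The following sketch in plain Python evaluates the double sums directly, which is only practical for very small arrays; the names `dft2` and `idft2` and the 3 × 4 test array are assumptions made for illustration. It checks that $\hat{z}(0,0)$ equals $N_1 N_2$ times the average of $z$ and that the IDFT inverts the DFT.

```python
import cmath
import math


def dft2(z):
    """2D DFT by direct summation; z is a list of N1 rows with N2 entries each."""
    N1, N2 = len(z), len(z[0])
    return [[sum(z[n1][n2] * cmath.exp(-2j * math.pi * (m1 * n1 / N1 + m2 * n2 / N2))
                 for n1 in range(N1) for n2 in range(N2))
             for m2 in range(N2)] for m1 in range(N1)]


def idft2(w):
    """2D IDFT by direct summation, with the 1/(N1*N2) normalization."""
    N1, N2 = len(w), len(w[0])
    return [[sum(w[m1][m2] * cmath.exp(2j * math.pi * (m1 * n1 / N1 + m2 * n2 / N2))
                 for m1 in range(N1) for m2 in range(N2)) / (N1 * N2)
             for n2 in range(N2)] for n1 in range(N1)]


# Small illustrative "image" with N1 = 3 rows and N2 = 4 columns (assumption)
z = [[1.0, 2.0, 0.0, -1.0],
     [0.5, 3.0, 1.0, 4.0],
     [2.0, -2.0, 1.5, 0.0]]
N1, N2 = len(z), len(z[0])

z_hat = dft2(z)
mean = sum(map(sum, z)) / (N1 * N2)

# z_hat(0, 0) should equal N1 * N2 * <z>
dc_error = abs(z_hat[0][0] - N1 * N2 * mean)

# The IDFT should recover z from z_hat
z_back = idft2(z_hat)
roundtrip_error = max(abs(z_back[a][b] - z[a][b])
                      for a in range(N1) for b in range(N2))
```

Direct summation costs on the order of $(N_1 N_2)^2$ operations, so this form is meant for checking formulas rather than for processing real images.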

Clearly, if the dimension is increased from 2 to $d$, with $2 < d < +\infty$, these formulas can be generalized in the following manner:

\[ \hat{z}(m_1, \ldots, m_d) = \sum_{n_1=0}^{N_1-1} \cdots \sum_{n_d=0}^{N_d-1} z(n_1, \ldots, n_d)\, e^{-2\pi i \sum_{k=1}^{d} \frac{m_k n_k}{N_k}} \]

\[ \check{z}(n_1, \ldots, n_d) = \left( \prod_{k=1}^{d} N_k \right)^{-1} \sum_{m_1=0}^{N_1-1} \cdots \sum_{m_d=0}^{N_d-1} z(m_1, \ldots, m_d)\, e^{2\pi i \sum_{k=1}^{d} \frac{m_k n_k}{N_k}} \]

2.9.1. Matrix representation of the 2D DFT: Kronecker product versus iteration of two 1D DFTs

The matrix representation of the 2D DFT in the canonical basis of $\ell^2(\mathbb{Z}_{N_1} \times \mathbb{Z}_{N_2})$ can be constructed using the Sylvester matrices $W_{N_1}$ and $W_{N_2}$ associated with the 1D DFT for $\ell^2(\mathbb{Z}_{N_1})$ and for $\ell^2(\mathbb{Z}_{N_2})$, respectively.


The operation used to obtain a matrix representation of the 2D DFT is the Kronecker product, which is defined below.

DEFINITION 2.18.– Given two matrices, $A$ of dimension $m \times n$ and $B$ of dimension $p \times q$:

\[ A = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix}, \quad B = \begin{pmatrix} b_{11} & \cdots & b_{1q} \\ \vdots & \ddots & \vdots \\ b_{p1} & \cdots & b_{pq} \end{pmatrix} \]

the Kronecker product matrix $A \otimes B$ is the matrix of dimension $mp \times nq$ defined by:

\[ A \otimes B = \begin{pmatrix} a_{11} B & \cdots & a_{1n} B \\ \vdots & \ddots & \vdots \\ a_{m1} B & \cdots & a_{mn} B \end{pmatrix} \]

The matrix associated with the 2D DFT in the canonical basis of $\ell^2(\mathbb{Z}_{N_1} \times \mathbb{Z}_{N_2})$ can be shown, by direct calculation, to be the matrix of dimension $N_1 N_2 \times N_1 N_2$ given by:

\[ W_{N_1,N_2} = W_{N_1} \otimes W_{N_2} \implies \hat{z}(m_1, m_2) = (W_{N_1} \otimes W_{N_2})\, z(n_1, n_2) \]
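To make the Kronecker representation concrete, the sketch below (plain Python) builds $W_{N_1} \otimes W_{N_2}$ for $N_1 = 2$, $N_2 = 3$ and applies it to the image stacked as a single column. The row-by-row stacking order, with entry $(n_1, n_2)$ placed at position $n_1 N_2 + n_2$, is an assumption of this sketch, chosen because it matches the block structure of the definition above; the function names are hypothetical.

```python
import cmath
import math


def dft_matrix(N):
    """1D DFT matrix with entries W[m][n] = exp(-2 pi i m n / N)."""
    return [[cmath.exp(-2j * math.pi * m * n / N) for n in range(N)] for m in range(N)]


def kron(A, B):
    """Kronecker product: the block in position (i, j) is A[i][j] * B."""
    p, q = len(B), len(B[0])
    rows, cols = len(A) * p, len(A[0]) * q
    return [[A[i // p][j // q] * B[i % p][j % q] for j in range(cols)]
            for i in range(rows)]


def dft2(z):
    """Reference 2D DFT by direct summation."""
    N1, N2 = len(z), len(z[0])
    return [[sum(z[n1][n2] * cmath.exp(-2j * math.pi * (m1 * n1 / N1 + m2 * n2 / N2))
                 for n1 in range(N1) for n2 in range(N2))
             for m2 in range(N2)] for m1 in range(N1)]


N1, N2 = 2, 3  # illustrative sizes (assumption)
z = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]

W = kron(dft_matrix(N1), dft_matrix(N2))                     # (N1*N2) x (N1*N2)
vec_z = [z[n1][n2] for n1 in range(N1) for n2 in range(N2)]  # row-by-row stacking
vec_z_hat = [sum(W[r][c] * vec_z[c] for c in range(N1 * N2)) for r in range(N1 * N2)]

z_hat = dft2(z)
kron_error = max(abs(vec_z_hat[m1 * N2 + m2] - z_hat[m1][m2])
                 for m1 in range(N1) for m2 in range(N2))

# Number of entries of the Kronecker matrix for a 512 x 512 image
entries_512 = (512 * 512) ** 2
```

The value of `entries_512` gives a sense of why the full Kronecker matrix is rarely formed for images of realistic size.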

Unfortunately, the calculation needed to obtain the Kronecker product matrix becomes unfeasibly large for high values of $N_1$ and $N_2$. In practice, the 2D DFT is generally written as the iteration of two 1D DFTs. To understand this approach, $z \in \ell^2(\mathbb{Z}_{N_1} \times \mathbb{Z}_{N_2})$ must be interpreted as a matrix made up of $N_2$ column vectors with $N_1$ elements each:

\[ z(n_1, n_2) = \underbrace{\begin{pmatrix} \vdots & \vdots & \cdots & \vdots \\ z(\cdot, 0) & z(\cdot, 1) & \cdots & z(\cdot, N_2 - 1) \\ \vdots & \vdots & \cdots & \vdots \end{pmatrix}}_{N_2 \text{ column vectors}} \quad (N_1 \text{ elements for each column vector}) \]

From the definition of the 2D DFT, we can write:

\[ \hat{z}(m_1, m_2) = \sum_{n_2=0}^{N_2-1} \underbrace{\left( \sum_{n_1=0}^{N_1-1} z(n_1, n_2)\, e^{-2\pi i \frac{n_1 m_1}{N_1}} \right)}_{\hat{z}(m_1, n_2) = W_{N_1} z(n_1, n_2)} e^{-2\pi i \frac{n_2 m_2}{N_2}} = \sum_{n_2=0}^{N_2-1} W_{N_1} z(n_1, n_2)\, e^{-2\pi i \frac{n_2 m_2}{N_2}} \quad [2.47] \]


In this formula, the sum with regard to index $n_2$ is the outermost one, so $n_2$ is fixed each time. For a fixed value of $n_2$, $z(n_1, n_2)$ is a column vector, so the highlighted parenthesis represents the 1D DFT of this column vector, which can be obtained by applying the matrix $W_{N_1}$ to $z(n_1, n_2)$ with $n_2$ fixed.

The next problem is that the first index is now fixed (at $m_1$) and the changing index is $n_2$, meaning that $W_{N_1} z(n_1, n_2)$ is a row vector. For this reason, the DFT cannot be obtained by applying $W_{N_2}$ directly: as we saw in section 2.5, the 1D DFT is obtained from the product of the matrix $W_N$ and a sequence represented using a column vector.

The solution to this problem consists of transposing the two sides of equation [2.47], transforming the row vector $\hat{z}(m_1, n_2)$ into a column vector:

\[ \hat{z}(m_1, m_2)^t = \sum_{n_2=0}^{N_2-1} (W_{N_1} z(n_1, n_2))^t\, e^{-2\pi i \frac{n_2 m_2}{N_2}} \]

Now, $(W_{N_1} z(n_1, n_2))^t$ is a column vector, so the DFT can be calculated by applying $W_{N_2}$. Using $(AB)^t = B^t A^t$:

\[ \hat{z}(m_1, m_2)^t = W_{N_2} (W_{N_1} z(n_1, n_2))^t = W_{N_2}\, z(n_1, n_2)^t\, (W_{N_1})^t = W_{N_2}\, z(n_2, n_1)\, W_{N_1} \]

since $W_{N_1}^t = W_{N_1}$ (note that $n_1$ and $n_2$ have swapped places). Thus, $\hat{z}(m_1, m_2)^t = W_{N_2}\, z(n_2, n_1)\, W_{N_1}$, so to find $\hat{z}(m_1, m_2)$, we must simply transpose both sides again:

\[ \hat{z}(m_1, m_2) = (\hat{z}(m_1, m_2)^t)^t = (W_{N_2}\, z(n_2, n_1)\, W_{N_1})^t = W_{N_1}\, z(n_1, n_2)\, W_{N_2} \]

The formula used to calculate the 2D DFT of a sequence $z \in \ell^2(\mathbb{Z}_{N_1} \times \mathbb{Z}_{N_2})$ is thus:

\[ \hat{z}(m_1, m_2) = W_{N_1}\, z(n_1, n_2)\, W_{N_2} \quad [2.48] \]
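Formula [2.48] is easy to test on a small array. In the sketch below (plain Python; the helper names and the 3 × 4 test array are assumptions), the image is stored as an $N_1 \times N_2$ matrix, multiplied on the left by $W_{N_1}$ and on the right by $W_{N_2}$, and the result is compared with the double-sum definition of the 2D DFT.

```python
import cmath
import math


def dft_matrix(N):
    """1D DFT matrix with entries W[m][n] = exp(-2 pi i m n / N)."""
    return [[cmath.exp(-2j * math.pi * m * n / N) for n in range(N)] for m in range(N)]


def matmul(A, B):
    """Plain matrix product of nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]


def dft2(z):
    """Reference 2D DFT by direct summation."""
    N1, N2 = len(z), len(z[0])
    return [[sum(z[n1][n2] * cmath.exp(-2j * math.pi * (m1 * n1 / N1 + m2 * n2 / N2))
                 for n1 in range(N1) for n2 in range(N2))
             for m2 in range(N2)] for m1 in range(N1)]


# Illustrative N1 x N2 = 3 x 4 array (assumption)
z = [[1.0, 2.0, 0.0, -1.0],
     [0.5, 3.0, 1.0, 4.0],
     [2.0, -2.0, 1.5, 0.0]]
N1, N2 = len(z), len(z[0])

# z_hat = W_{N1} z W_{N2}: column DFTs from the left, row DFTs from the right
z_hat_matrix = matmul(matmul(dft_matrix(N1), z), dft_matrix(N2))
z_hat_direct = dft2(z)

matrix_error = max(abs(z_hat_matrix[a][b] - z_hat_direct[a][b])
                   for a in range(N1) for b in range(N2))
```

With $N_1 \neq N_2$, as here, a product such as $W_{N_1} W_{N_2} z$ is not even defined, since the matrix sizes do not match.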

It is important to note that equation [2.48] is only meaningful if $\hat{z}(m_1, m_2)$ and $z(n_1, n_2)$ are interpreted as $N_1 \times N_2$ matrices in their entirety. Formula [2.48] is not the same as $W_{N_1} W_{N_2}\, z(n_1, n_2)$ or $W_{N_2} W_{N_1}\, z(n_1, n_2)$, i.e. the formulas that one could have naively thought to use to implement the 1D DFT over the columns and rows of $z$. The reason for this difference, as we have seen, is that the 1D matrix DFT requires the presence of a column vector, hence the transposition which results in formula [2.48].

2.9.2. Properties of the 2D DFT

The generalization of the properties of the 1D DFT, presented in section 2.7, to the 2D DFT is trivial.


The proofs of these properties in 1D and 2D are practically identical, notwithstanding certain differences in notation. For this reason, we shall not provide proofs for the 2D extensions presented below.

As in the 1D case, in order to examine the properties of the 2D DFT, we must first extend the definition of a sequence $z \in \ell^2(\mathbb{Z}_{N_1} \times \mathbb{Z}_{N_2})$ by periodicity to any interval of length $N_1$ with regard to the variable $n_1$ and of length $N_2$ with regard to the variable $n_2$. This extension is possible if $z$ is defined outside of $\mathbb{Z}_{N_1} \times \mathbb{Z}_{N_2}$ in the following manner:

\[ z(n_1 + j_1 N_1, n_2 + j_2 N_2) = z(n_1, n_2), \quad \forall n_1, n_2, j_1, j_2 \in \mathbb{Z} \quad [2.49] \]

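The shift operator is also helpful in 2D cases.

DEFINITION 2.19.– Take $z \in \ell^2(\mathbb{Z}_{N_1} \times \mathbb{Z}_{N_2})$, extended by periodicity as in formula [2.49], and $k_1, k_2 \in \mathbb{Z}$. The shift operator over $\ell^2(\mathbb{Z}_{N_1} \times \mathbb{Z}_{N_2})$ is defined by:

\[ R_{k_1,k_2} : \ell^2(\mathbb{Z}_{N_1} \times \mathbb{Z}_{N_2}) \longrightarrow \ell^2(\mathbb{Z}_{N_1} \times \mathbb{Z}_{N_2}), \quad z \longmapsto R_{k_1,k_2} z, \quad (R_{k_1,k_2} z)(n_1, n_2) = z(n_1 - k_1, n_2 - k_2) \]

Taking $z \in \ell^2(\mathbb{Z}_{N_1} \times \mathbb{Z}_{N_2})$, extended by periodicity as in formula [2.49], then, for all $n_1, n_2, m_1, m_2 \in \mathbb{Z}$:

– periodicity of $\hat{z}$ and $\check{z}$:

\[ \hat{z}(m_1, m_2) = \hat{z}(m_1 + N_1, m_2) = \hat{z}(m_1, m_2 + N_2) = \hat{z}(m_1 + N_1, m_2 + N_2) \]

and:

\[ \check{z}(n_1, n_2) = \check{z}(n_1 + N_1, n_2) = \check{z}(n_1, n_2 + N_2) = \check{z}(n_1 + N_1, n_2 + N_2) \]

– 2D DFT and shift:

\[ \widehat{R_{k_1,k_2} z}(m_1, m_2) = e^{-2\pi i \left( \frac{m_1 k_1}{N_1} + \frac{m_2 k_2}{N_2} \right)}\, \hat{z}(m_1, m_2), \quad \forall k_1, k_2 \in \mathbb{Z} \]

that is, if we define the sequence $\omega^{k_1,k_2}_{N_1,N_2} \in \ell^2(\mathbb{Z}_{N_1} \times \mathbb{Z}_{N_2})$, $\omega^{k_1,k_2}_{N_1,N_2}(m_1, m_2) = e^{-2\pi i \left( \frac{m_1 k_1}{N_1} + \frac{m_2 k_2}{N_2} \right)}$ $\forall m_1, m_2 \in \mathbb{Z}$, then:

\[ \mathrm{DFT}_{2D} \circ R_{k_1,k_2} = M_{\omega^{k_1,k_2}_{N_1,N_2}} \circ \mathrm{DFT}_{2D} \]

where $M_{\omega^{k_1,k_2}_{N_1,N_2}}$ is the multiplication operator by $\omega^{k_1,k_2}_{N_1,N_2}$ in $\ell^2(\mathbb{Z}_{N_1} \times \mathbb{Z}_{N_2})$. Permuting the direction of composition, we obtain:

\[ (R_{k_1,k_2} \hat{z})(m_1, m_2) = \hat{z}(m_1 - k_1, m_2 - k_2) = \mathrm{DFT}_{2D}\left( e^{2\pi i \left( \frac{n_1 k_1}{N_1} + \frac{n_2 k_2}{N_2} \right)} z(n_1, n_2) \right)(m_1, m_2) \]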


that is:

\[ R_{k_1,k_2} \circ \mathrm{DFT}_{2D} = \mathrm{DFT}_{2D} \circ M_{\left( \omega^{k_1,k_2}_{N_1,N_2} \right)^*}, \quad \forall k_1, k_2 \in \mathbb{Z} \]

The properties examined above are summarized by the Fourier pairs in Table 2.5.

Original representation | Fourier space
$z(n_1 - k_1, n_2 - k_2)$ | $e^{-2\pi i \left( \frac{m_1 k_1}{N_1} + \frac{m_2 k_2}{N_2} \right)}\, \hat{z}(m_1, m_2)$
$e^{2\pi i \left( \frac{n_1 k_1}{N_1} + \frac{n_2 k_2}{N_2} \right)}\, z(n_1, n_2)$ | $\hat{z}(m_1 - k_1, m_2 - k_2)$

Table 2.5. Fourier pairs for 2D shifts

As in the case of the 1D DFT, considering $k_1 = \frac{N_1}{2}$ and $k_2 = \frac{N_2}{2}$, then $(-1)^{n_1 + n_2} z(n_1, n_2)$ and $\hat{z}(m_1 - \frac{N_1}{2}, m_2 - \frac{N_2}{2})$ form a Fourier pair. This transformation is used to obtain a centered visualization of the spectrum of $z$. Furthermore, as in the 1D case, the amplitude spectrum of a 2D signal $z(n_1, n_2)$ and that of any shifted form $z(n_1 - k_1, n_2 - k_2)$ are strictly identical, as $\left| e^{-2\pi i \left( \frac{m_1 k_1}{N_1} + \frac{m_2 k_2}{N_2} \right)} \right| = 1$. Thus, the amplitude spectrum gives us the frequency content of the signal, but does not tell us where these frequencies are located;

– 2D DFT and conjugation:

\[ \hat{\bar{z}}(m_1, m_2) = \overline{\hat{z}(-m_1, -m_2)} = \overline{\hat{z}(N_1 - m_1, N_2 - m_2)} \]

– 2D DFT and convolution:

\[ \widehat{(z * w)}(m_1, m_2) = \hat{z}(m_1, m_2)\, \hat{w}(m_1, m_2) \]

where 2D convolution is defined as:

\[ T_z : \ell^2(\mathbb{Z}_{N_1} \times \mathbb{Z}_{N_2}) \longrightarrow \ell^2(\mathbb{Z}_{N_1} \times \mathbb{Z}_{N_2}), \quad w \longmapsto T_z w = z * w \]

\[ (z * w)(n_1, n_2) = \sum_{k_1=0}^{N_1-1} \sum_{k_2=0}^{N_2-1} z(n_1 - k_1, n_2 - k_2)\, w(k_1, k_2) = \sum_{k_1=0}^{N_1-1} \sum_{k_2=0}^{N_2-1} z(k_1, k_2)\, w(n_1 - k_1, n_2 - k_2) \]

Box 2.3. Properties of 2D DFT
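The shift and convolution properties collected in Box 2.3 can be checked numerically on small arrays. The sketch below is plain Python; the function names, the two 3 × 4 test arrays and the shift $(k_1, k_2) = (1, 2)$ are assumptions chosen for illustration. Indices are reduced modulo $N_1$ and $N_2$, which implements the periodic extension [2.49].

```python
import cmath
import math


def dft2(z):
    """2D DFT by direct summation."""
    N1, N2 = len(z), len(z[0])
    return [[sum(z[n1][n2] * cmath.exp(-2j * math.pi * (m1 * n1 / N1 + m2 * n2 / N2))
                 for n1 in range(N1) for n2 in range(N2))
             for m2 in range(N2)] for m1 in range(N1)]


def shift(z, k1, k2):
    """(R_{k1,k2} z)(n1, n2) = z(n1 - k1, n2 - k2), with periodic indices."""
    N1, N2 = len(z), len(z[0])
    return [[z[(n1 - k1) % N1][(n2 - k2) % N2] for n2 in range(N2)] for n1 in range(N1)]


def conv2(z, w):
    """Periodic 2D convolution (z * w)(n1, n2)."""
    N1, N2 = len(z), len(z[0])
    return [[sum(z[(n1 - k1) % N1][(n2 - k2) % N2] * w[k1][k2]
                 for k1 in range(N1) for k2 in range(N2))
             for n2 in range(N2)] for n1 in range(N1)]


z = [[1.0, 2.0, 0.0, -1.0], [0.5, 3.0, 1.0, 4.0], [2.0, -2.0, 1.5, 0.0]]
w = [[0.0, 1.0, 0.0, 2.0], [1.0, 0.0, -1.0, 0.0], [0.5, 0.0, 0.0, 1.0]]
N1, N2 = len(z), len(z[0])
k1, k2 = 1, 2
idx = [(a, b) for a in range(N1) for b in range(N2)]

z_hat, w_hat = dft2(z), dft2(w)
shifted_hat = dft2(shift(z, k1, k2))

# Shift property: multiplication by a phase factor in the Fourier space
shift_error = max(abs(shifted_hat[a][b]
                      - cmath.exp(-2j * math.pi * (a * k1 / N1 + b * k2 / N2)) * z_hat[a][b])
                  for a, b in idx)

# Consequence: the amplitude spectrum is shift-invariant
amplitude_error = max(abs(abs(shifted_hat[a][b]) - abs(z_hat[a][b])) for a, b in idx)

# Convolution property: DFT of z * w is the pointwise product of the DFTs
conv_hat = dft2(conv2(z, w))
conv_error = max(abs(conv_hat[a][b] - z_hat[a][b] * w_hat[a][b]) for a, b in idx)
```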

2.9.3. 2D DFT and stationary operators

The properties of 2D and 1D DFT with regard to stationary operators are the same.


Strictly speaking, an operator $T : \ell^2(\mathbb{Z}_{N_1} \times \mathbb{Z}_{N_2}) \to \ell^2(\mathbb{Z}_{N_1} \times \mathbb{Z}_{N_2})$ is stationary if:

\[ T \circ R_{k_1,k_2} = R_{k_1,k_2} \circ T, \quad \forall k_1, k_2 \in \mathbb{Z} \]

In practice, if $z$ is a digital image, a stationary operator is a transformation whose action is independent of the position of a pixel in the spatial context of the image. As in the 1D case, stationary operators over $\ell^2(\mathbb{Z}_{N_1} \times \mathbb{Z}_{N_2})$ may be characterized as convolution operators or as Fourier multiplier operators. The theorem formalizing this relation relies on definitions of the Fourier multiplier, the unit pulse and the pulse response in the 2D case.

DEFINITION 2.20.– Taking a fixed $w \in \ell^2(\mathbb{Z}_{N_1} \times \mathbb{Z}_{N_2})$, the Fourier multiplier associated with $w$ is defined as:

\[ T_{(w)} : \ell^2(\mathbb{Z}_{N_1} \times \mathbb{Z}_{N_2}) \longrightarrow \ell^2(\mathbb{Z}_{N_1} \times \mathbb{Z}_{N_2}), \quad z \longmapsto T_{(w)} z = (w \cdot \hat{z})^{\vee} \]

DEFINITION 2.21.– The unit pulse $\delta$ in $\ell^2(\mathbb{Z}_{N_1} \times \mathbb{Z}_{N_2})$ is the first vector in the canonical basis: $\delta = e_{0,0}$. Given a linear operator $T$ over $\ell^2(\mathbb{Z}_{N_1} \times \mathbb{Z}_{N_2})$, the pulse response is defined as the sequence $h = T\delta \in \ell^2(\mathbb{Z}_{N_1} \times \mathbb{Z}_{N_2})$.

THEOREM 2.12.– Let $T : \ell^2(\mathbb{Z}_{N_1} \times \mathbb{Z}_{N_2}) \longrightarrow \ell^2(\mathbb{Z}_{N_1} \times \mathbb{Z}_{N_2})$ be a linear operator. The following conditions are equivalent:
1) $T$ is stationary;
2) $T$ is the convolution operator with the pulse response $h = T\delta$: $Tz = T_h z = h * z = z * h$, $\forall z \in \ell^2(\mathbb{Z}_{N_1} \times \mathbb{Z}_{N_2})$;
3) $T$ is the Fourier multiplier associated with $\hat{h}$: $Tz = T_{(\hat{h})} z = (\hat{h} \cdot \hat{z})^{\vee}$, $\forall z \in \ell^2(\mathbb{Z}_{N_1} \times \mathbb{Z}_{N_2})$;
4) $T$ is diagonalizable, its eigenvectors are the orthogonal Fourier basis $F_{m_1,m_2}$ of $\ell^2(\mathbb{Z}_{N_1} \times \mathbb{Z}_{N_2})$, and its eigenvalues are the components of $\hat{h}$.

OBSERVATIONS.– This result can be extended to circulant matrices, but their definition in the 2D case is more complex.
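The equivalences of Theorem 2.12 can be observed on a concrete operator. The sketch below (plain Python) uses a periodic five-point discrete Laplacian as the example operator; this specific stencil, the function names and the test data are assumptions of the sketch and are not taken from the text. It checks that the operator commutes with a shift, that it coincides with the Fourier multiplier associated with $\hat{h}$, where $h = T\delta$, and that $\hat{h}(0,0) = 0$.

```python
import cmath
import math


def dft2(z):
    N1, N2 = len(z), len(z[0])
    return [[sum(z[n1][n2] * cmath.exp(-2j * math.pi * (m1 * n1 / N1 + m2 * n2 / N2))
                 for n1 in range(N1) for n2 in range(N2))
             for m2 in range(N2)] for m1 in range(N1)]


def idft2(w):
    N1, N2 = len(w), len(w[0])
    return [[sum(w[m1][m2] * cmath.exp(2j * math.pi * (m1 * n1 / N1 + m2 * n2 / N2))
                 for m1 in range(N1) for m2 in range(N2)) / (N1 * N2)
             for n2 in range(N2)] for n1 in range(N1)]


def shift(z, k1, k2):
    N1, N2 = len(z), len(z[0])
    return [[z[(a - k1) % N1][(b - k2) % N2] for b in range(N2)] for a in range(N1)]


def laplacian(z):
    """Hypothetical example operator: periodic five-point Laplacian."""
    N1, N2 = len(z), len(z[0])
    return [[z[(a + 1) % N1][b] + z[(a - 1) % N1][b]
             + z[a][(b + 1) % N2] + z[a][(b - 1) % N2] - 4 * z[a][b]
             for b in range(N2)] for a in range(N1)]


z = [[1.0, 2.0, 0.0, -1.0], [0.5, 3.0, 1.0, 4.0], [2.0, -2.0, 1.5, 0.0]]
N1, N2 = len(z), len(z[0])
idx = [(a, b) for a in range(N1) for b in range(N2)]

# 1) Stationarity: T o R = R o T for a sample shift
lhs, rhs = laplacian(shift(z, 1, 2)), shift(laplacian(z), 1, 2)
commute_error = max(abs(lhs[a][b] - rhs[a][b]) for a, b in idx)

# 3) Fourier multiplier: T z = IDFT(h_hat . z_hat), with h = T(delta)
delta = [[1.0 if (a, b) == (0, 0) else 0.0 for b in range(N2)] for a in range(N1)]
h_hat, z_hat = dft2(laplacian(delta)), dft2(z)
via_multiplier = idft2([[h_hat[a][b] * z_hat[a][b] for b in range(N2)] for a in range(N1)])
direct = laplacian(z)
multiplier_error = max(abs(via_multiplier[a][b] - direct[a][b]) for a, b in idx)

# h_hat(0, 0) = 0, so the filtered image sums (and averages) to zero
total = sum(map(sum, direct))
```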


2.9.4. Gradient and Laplace operators and their action on digital images

Repeating the analysis of discrete derivative operators from section 2.8.6 for 2D cases, the first derivative gives us the gradient, that is $\nabla = \left( \frac{\partial}{\partial x}, \frac{\partial}{\partial y} \right)$, and the second derivative gives us the Laplacian, that is $\nabla^2 = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}$.

The gradient is used to detect the edges of an image in a particular direction. For isotropic edge detection – that is detection which is uniform with regard to direction – the Laplacian is used; this approach is more efﬁcient than using a gradient for intensifying ﬁne details, as we saw in the 1D case. Even in 2D cases, the differential operators above cancel out the average of an image, which is why the output is entirely black, except near the edges, as we see from Figure 2.6.

2.9.5. Visualization of the amplitude spectrum in 2D

Visualizations of the spectrum of a 2D signal can be produced on the condition that the signal is centered, for the same reasons presented in the 1D case. Centering may be carried out using the 2D equivalent of formula [2.34], considering $(-1)^{n_1 + n_2} z(n_1, n_2)$ in the place of $z(n_1, n_2)$, as we saw in section 2.9.2.

Note that the 1D symmetry of the 1D DFT with regard to frequencies $m \in \{0, 1, \ldots, N/2\}$ and $m \in \{N/2 + 1, N/2 + 2, \ldots, N - 1\}$ is replaced by 2D mirror symmetry in the case of the 2D DFT.

Figure 2.7 shows three grayscale digital images with their amplitude spectrums. The brightest points correspond to high magnitude values of the Fourier coefficients, while the darkest points correspond to low values. There are several notable characteristics here:
– the symmetry of the spectrum: frequency content is repeated in each quadrant by mirror symmetry;
– the brightest points are located toward the center of the spectrum: this is due to the fact that these spectrums are centered, so the coordinates of the central frequency are $(m_1, m_2) = (0, 0)$ and $|\hat{z}(0, 0)| = N_1 N_2 \langle z \rangle$, that is, $N_1 N_2$ times the average value of the image. This is why a compressive function, such as a logarithm, must be used to visualize a spectrum: the value of $|\hat{z}(0, 0)|$ is so much higher than the others that the variability range needs to be compressed;


Figure 2.6. a) Original image of Panko; b) image after Laplacian ﬁlter; c) image ﬁltered using a gradient in the vertical direction; d) image ﬁltered using a gradient in the horizontal direction (image source: author). For a color version of this ﬁgure, see www.iste.co.uk/provenzi/spaces.zip


Figure 2.7. Left column: original images. Right column: centered amplitude spectrums of the images in the left column, visualized using a logarithmic scale


– moving out from the center, the spectrum shows the amplitude of the coefficients corresponding to the highest frequencies, up to the maximum frequencies $(N_1/2, N_2/2)$ if $N_1, N_2$ are even, or their integer parts $([N_1/2], [N_2/2])$ if $N_1, N_2$ are odd. The image with the highest frequency content is that of the mandrill: its spectrum is the widest of the three shown here. Note the particularly intense values near the edges, representing very high frequencies: these correspond to the fine details of the hairs near the animal's eyes;
– as $m_1$ and $m_2$ represent vertical and horizontal frequencies, the vertical and horizontal edges of the images produce Fourier coefficients which are localized on the corresponding axes. This is why the spectrum of the first image, which features strong vertical intensity gradients between the rocks and the sea, is heavily dominated by intense Fourier coefficients on the vertical axis. The second image ("Lena", a classic image used in image processing) features fine details in the hat area, at $45^\circ$ and $-45^\circ$. This results in evident diagonal structures in the spectrum;
– from this spectrum analysis, we see that the Fourier spectrum reveals the presence of geometric structures within an image, but does not tell us where in the image these structures are located.

2.9.6. Filtering: an example of digital image filtering in a Fourier space

Theorem 2.12 states that all stationary operators $T$ acting on images (interpreted as finite 2D sequences) are "hidden" convolutions between the image and the pulse response $h = T\delta$. Furthermore, these convolutions can be represented as Fourier multipliers (multiplication of $\hat{h}$ and the 2D DFT of the image within the Fourier space).

Different results will be obtained depending on the sequence $h$ with which convolution is carried out. The effect of a convolution is often easier to interpret by examining the associated Fourier multiplier.

Let us consider the notion of convolution with a discrete Gaussian, denoted $h(n_1, n_2)$. As we shall see in Chapter 6, the Fourier transform of a Gaussian with a standard deviation $\sigma$ is itself a Gaussian, but the standard deviation of the latter is inversely proportional to $\sigma$. Thus, we can further our understanding of the meaning of convolving an image $z(n_1, n_2)$ with a Gaussian $h(n_1, n_2)$ by analyzing the multiplication $\hat{z}(m_1, m_2) \cdot \hat{h}(m_1, m_2)$ in the Fourier space.

Figure 2.8 features three images corresponding to $512 \times 512$ 2D Gaussians. The intensity of the pixel in position $(n_1, n_2)$ is $h(n_1, n_2) = \exp\left( -\frac{n_1^2 + n_2^2}{2\sigma^2} \right)$ and the standard deviation is $\sigma = 1$, 5 and 10, respectively.


Figure 2.8. Two-dimensional Gaussian images with a standard deviation of (left - right) 1, 5 and 10

As we stated above, the 2D DFTs of $h$ are still Gaussians, but their standard deviations are proportional to 1, $\frac{1}{5}$ and $\frac{1}{10}$. Evidently, $h(0, 0) = 1$ and the values of $\hat{h}(m_1, m_2)$ decrease as we move away from the center; thus, multiplication in the Fourier space $\hat{z}(m_1, m_2) \cdot \hat{h}(m_1, m_2)$ decreases the importance of the harmonics with $(m_1, m_2) \neq (0, 0)$, which are associated with the finer details in the image. Applying the 2D IDFT to $\hat{z}(m_1, m_2) \cdot \hat{h}(m_1, m_2)$, we can reconstruct an image which is blurrier than the original. In image processing, convolution with a Gaussian corresponds to a blurring operation, as we see in Figure 2.9.
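The attenuation described above can be quantified on a small grid. The sketch below (plain Python) makes several assumptions that are not in the text: a 16 × 16 grid instead of 512 × 512, a Gaussian centered at the origin and wrapped periodically, and the hypothetical helper names shown. For each $\sigma$, it computes the relative gain $|\hat{h}(m, 0)| / |\hat{h}(0, 0)|$ at a few frequencies.

```python
import cmath
import math

N = 16  # small illustrative grid (assumption; the text uses 512 x 512)


def periodic_gaussian(sigma):
    """Gaussian centered at (0, 0), using the wrapped distance on Z_N (assumption)."""
    def d(n):
        return min(n, N - n)
    return [[math.exp(-(d(n1) ** 2 + d(n2) ** 2) / (2 * sigma ** 2))
             for n2 in range(N)] for n1 in range(N)]


def dft2_coefficient(h, m1, m2):
    """Single 2D DFT coefficient h_hat(m1, m2), by direct summation."""
    return sum(h[n1][n2] * cmath.exp(-2j * math.pi * (m1 * n1 + m2 * n2) / N)
               for n1 in range(N) for n2 in range(N))


def gain(sigma, m1, m2):
    """Relative weight given to the harmonic (m1, m2) by the Gaussian multiplier."""
    h = periodic_gaussian(sigma)
    return abs(dft2_coefficient(h, m1, m2)) / abs(dft2_coefficient(h, 0, 0))


frequencies = (1, 2, 4)
gains_sigma1 = [gain(1.0, m, 0) for m in frequencies]
gains_sigma2 = [gain(2.0, m, 0) for m in frequencies]
```

Under these assumptions, the gains decrease as the frequency grows, and they are smaller for the wider Gaussian, which is consistent with stronger blurring for larger $\sigma$.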

Figure 2.9. Blurred image of Lena obtained by multiplying DFTs and Gaussians with standard deviations of (left - right) 1, 5 and 10

COMMENT CONCERNING FIGURE 2.9.– Note that, as the standard deviation of the DFT of a Gaussian is inversely proportional to the original standard deviation, the DFT of the Gaussian with a standard deviation of 10 has a small standard deviation and thus tends rapidly toward 0. So, when the DFT of the Gaussian with a standard deviation of 10 is multiplied with the DFT of the image, much of the detail in the image is lost.

Blurring has a number of uses; for example, in cases where the original image is noisy, blurring can make this noise less evident (although it also reduces edge sharpness). Figure 2.10 shows a continuous version of the blurring frequency filter.

Figure 2.10. Blurring ﬁlter/low-pass ﬁlter in the frequency domain

NOTE.– Although convolution with a Gaussian results in a blurring effect, it would be wrong to assume that convolution is always associated with a blurring action. As we saw earlier, convolution, alongside the Fourier multiplier, constitutes a prototype for all stationary operators, which may blur a signal or enhance its contrast.

2.10. Summary

In this chapter, we considered the space $\ell^2(\mathbb{Z}_N)$ composed of $N$-periodic sequences with complex values, isomorphic to $\mathbb{C}^N$. We introduced a special basis in this space, made up of the complex exponentials generated by the consecutive powers of the $N$-th complex roots of unity. This basis is used to construct the Fourier basis of $\ell^2(\mathbb{Z}_N)$. We interpreted the elements of this basis as harmonic waves, oscillating at frequencies which are multiples of a fundamental one.


The Fourier coefficients of an element in $\ell^2(\mathbb{Z}_N)$ are its components with regard to the Fourier basis. As these coefficients are complex, their magnitude must be used to determine the importance of a harmonic in relation to a certain frequency when reconstructing (or synthesizing) the element itself. The set of magnitudes of the Fourier coefficients is known as the spectrum of an element in $\ell^2(\mathbb{Z}_N)$.

The DFT is the endomorphism of $\ell^2(\mathbb{Z}_N)$ which associates an element of $\ell^2(\mathbb{Z}_N)$ with the sequence of its Fourier coefficients. The DFT is actually an isomorphism, and its inverse is known as the IDFT. The DFT may be associated with a matrix, known as a Sylvester matrix; this matrix is a Vandermonde matrix, that is, all of the rows and columns in the matrix can be obtained through geometric progressions.

We presented an interpretation of these concepts in the context of signal theory, notably highlighting the fact that the highest harmonic oscillation frequency in a discrete signal obtained from $N$ samples is $N/2$ (or the integer part of $N/2$ if $N$ is odd); this is the Nyquist frequency.

The DFT transforms the shift operation into a multiplication by a phase factor, that is, a complex exponential with unit magnitude; this implies that the signal spectrum is shift-invariant. Convolution is transformed by the DFT into a pointwise product, allowing the convolution operator to be expressed diagonally in the Fourier space.

Finally, we saw that the DFT can be used to diagonalize stationary operators, that is, operators which commute with shift operators. Theorem 2.9 can be used to fully characterize a stationary operator as a convolution or as a Fourier multiplier and to determine the eigenvalues of this operator.

3 Lebesgue’s Measure and Integration Theory

In this chapter, we shall present the most essential elements of measure theory and integration. Our aim here is simply to establish clear and unambiguous notation and a common vocabulary. What follows is a deliberately brief summary. Readers who have not yet studied this important branch of mathematics may wish to look elsewhere for a more detailed introduction to measure theory and integration. Two excellent reference works in this domain are Briane and Pagès (1998) and Bartle (1966).

From Euclidean to Hilbert Spaces: Introduction to Functional Analysis and its Applications, First Edition. Edoardo Provenzi. © ISTE Ltd 2021. Published by ISTE Ltd and John Wiley & Sons, Inc.

3.1. Riemann versus Lebesgue

The main difference between the Riemann and Lebesgue approaches is shown in Figure 3.1. The key to Riemann integration lies in approximating the area of the surface between the $x$ axis and the curve of a function $f$ using small rectangles $[a_{i-1}, a_i] \times [0, \Phi_i]$ with their base on the $x$ axis, of a height $\Phi_i$ close to the average height of the function $f$ over $[a_{i-1}, a_i]$. Lebesgue's integration theory differs in that the first stage involves breaking down the $y$ axis into small intervals $[b_{j-1}, b_j]$; the surface below the curve $f$ is then approximated using:

\[ \int_a^b f \approx \sum_j \frac{b_{j-1} + b_j}{2} \cdot \mathrm{length}\left( \{ x : b_{j-1} \leq f(x) \leq b_j \} \right) \]


Figure 3.1. Riemann and Lebesgue integration. For a color version of this ﬁgure, see www.iste.co.uk/provenzi/spaces.zip
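The two approximation strategies of section 3.1 can be compared on an example where the level sets are easy to measure. The sketch below (plain Python) takes $f(x) = x^2$ on $[0, 1]$, an example chosen here and not taken from the text; since this $f$ is increasing, the set $\{x : b_{j-1} \leq f(x) \leq b_j\}$ is the interval $[\sqrt{b_{j-1}}, \sqrt{b_j}]$. Using the midpoint value of $f$ as the rectangle height in the Riemann sum is also an assumption of the sketch.

```python
import math


def f(x):
    return x * x


def riemann_sum(n):
    """Partition the x axis into n intervals; height = f at the midpoint."""
    dx = 1.0 / n
    return sum(f((i + 0.5) * dx) * dx for i in range(n))


def lebesgue_sum(n):
    """Partition the y axis [0, 1] into n intervals [b_lo, b_hi]."""
    total = 0.0
    for j in range(1, n + 1):
        b_lo, b_hi = (j - 1) / n, j / n
        # Level set {x in [0, 1] : b_lo <= x^2 <= b_hi} = [sqrt(b_lo), sqrt(b_hi)]
        length = math.sqrt(b_hi) - math.sqrt(b_lo)
        total += 0.5 * (b_lo + b_hi) * length
    return total


exact = 1.0 / 3.0  # integral of x^2 over [0, 1]
riemann_error = abs(riemann_sum(1000) - exact)
lebesgue_error = abs(lebesgue_sum(1000) - exact)
```

Both sums approach 1/3 here. For a less regular function, the level sets would no longer be intervals, and their length could not be computed this simply.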

The main difficulty lies in the fact that the sets:

\[ E_j = \{ x \in [a, b] : b_{j-1} \leq f(x) \leq b_j \} \]

shown in red in Figure 3.1(b), are generally not intervals, and it can be complicated, if not (as in certain cases) impossible, to associate them with a length or measure.

The development of measure theory was motivated by the need to create a theory of integration using the strategy described above. This approach is far longer and more complicated than Riemann integration; however, Lebesgue integration presents a significant advantage in terms of generality, and the properties that can be proved are far more powerful.

3.2. σ-algebra, measurable space, measures and measured spaces

In order to define a Lebesgue integral, we must first define the sets and functions which can be measured. The definitions and results below, based on work carried out in the early 20th century, make up the necessary formalization.

Let $X$ be a set. A σ-algebra on $X$ is a collection $\mathcal{A}$ of subsets of $X$, that is $\mathcal{A} \subseteq \mathcal{P}(X)$, which verifies the following properties:
– $\emptyset, X \in \mathcal{A}$;
– $\mathcal{A}$ is closed under complementation: $E \in \mathcal{A} \implies E^c \in \mathcal{A}$;
– $\mathcal{A}$ is closed under countable unions: $(E_n)_{n \in \mathbb{N}} \subset \mathcal{A} \implies \bigcup_{n \in \mathbb{N}} E_n \in \mathcal{A}$.

This definition implies that $\mathcal{A}$ is closed under countable intersection.

SIMPLE EXAMPLES.–
– $\mathcal{A} = \mathcal{P}(X)$: σ-algebra of the power set of $X$.
– $\mathcal{A} = \{\emptyset, X\}$: the minimal σ-algebra over $X$.
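On a finite set, the three defining properties can be tested mechanically. The sketch below (plain Python; the function names and the three-element set are assumptions) checks them for the two examples above and for a collection that fails closure under complementation. On a finite set, countable unions reduce to finite unions, so closure under pairwise unions is sufficient.

```python
from itertools import combinations


def power_set(X):
    """All subsets of the finite set X, as frozensets."""
    items = list(X)
    return {frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)}


def is_sigma_algebra(A, X):
    """Check the defining properties of a sigma-algebra on a FINITE set X."""
    X = frozenset(X)
    if frozenset() not in A or X not in A:
        return False
    if any(X - E not in A for E in A):             # closed under complementation
        return False
    return all(E | F in A for E in A for F in A)   # closed under (finite) unions


X = {1, 2, 3}
full = power_set(X)                                         # A = P(X)
trivial = {frozenset(), frozenset(X)}                       # A = {empty set, X}
not_closed = {frozenset(), frozenset({1}), frozenset(X)}    # complement {2, 3} is missing

checks = (is_sigma_algebra(full, X),
          is_sigma_algebra(trivial, X),
          is_sigma_algebra(not_closed, X))
```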


Mathematicians working on measure theory have proved that the defining properties of a σ-algebra are necessary and sufficient to "measure" the sets contained in the σ-algebra itself, in a sense which will be defined below. For this reason, the pair $(X, \mathcal{A})$ is called a measurable space and the elements of $\mathcal{A}$ are measurable sets.

One further concept must be introduced before we can examine a meaningful example of a measurable space: that of the ordering relation between σ-algebras. If every element in a σ-algebra $\mathcal{A}_1$ is contained in the σ-algebra $\mathcal{A}_2$, then $\mathcal{A}_1$ is said to be smaller than $\mathcal{A}_2$ and we write $\mathcal{A}_1 \subset \mathcal{A}_2$. This concept is used to define the smallest σ-algebra generated by a collection of subsets of $X$: taking $S \subset \mathcal{P}(X)$, the intersection of all σ-algebras which contain $S$ is known as the σ-algebra generated by $S$.

The case of a topological space $X$ is particularly interesting, and merits closer attention. The existence of a topology means that we can define the concept of an open part of $X$. Taking $\tau \subseteq \mathcal{P}(X)$ to be the open sets of $X$, we clearly see that $\tau$ is not, in general, a σ-algebra, since the complement of an open set is a closed set. However, we can consider the σ-algebra generated by $\tau$, called the Borel σ-algebra¹ and denoted $\mathcal{B}(X)$. Each element in this σ-algebra, which is a subset of $X$, is called a Borel set.

Once we have a measurable space $(X, \mathcal{A})$, the concept of a positive measure, or simply a measure, $\mu$ can be defined as a function $\mu : \mathcal{A} \to [0, +\infty]$ such that:
– $\mu(\emptyset) = 0$;
– $\mu$ is σ-additive (or countably additive): if $(E_n)_{n \in \mathbb{N}}$ is a countable family of pairwise disjoint elements in $\mathcal{A}$, then:

\[ \mu\left( \bigcup_{n \in \mathbb{N}} E_n \right) = \sum_{n \in \mathbb{N}} \mu(E_n) \]

The triple $(X, \mathcal{A}, \mu)$ is said to be a measure space. When the σ-algebra $\mathcal{A}$ and the measure $\mu$ are clearly specified, they are often omitted and one simply writes $X$.

One very simple, but meaningful, example of a measure is given by the Dirac measure in the measurable space $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$, that is, $\mathbb{R}$ with the Borel σ-algebra. The Dirac measure centered on $x_0 \in \mathbb{R}$ is defined by $\delta_{x_0} : \mathcal{B}(\mathbb{R}) \to \{0, 1\}$:

\[ \forall E \in \mathcal{B}(\mathbb{R}), \quad \delta_{x_0}(E) = \begin{cases} 1 & \text{if } x_0 \in E \\ 0 & \text{if } x_0 \notin E \end{cases} \]
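The Dirac measure is simple enough to be written as a one-line function. The sketch below (plain Python) is a simplified model in which sets are represented by finite `frozenset` objects of real numbers; this restriction to finite sets and the name `dirac` are assumptions of the sketch. It checks $\mu(\emptyset) = 0$ and additivity on a small family of pairwise disjoint sets.

```python
def dirac(x0):
    """Return the Dirac measure centered on x0, acting on Python sets."""
    def measure(E):
        return 1 if x0 in E else 0
    return measure


delta_0 = dirac(0.0)

# A family of pairwise disjoint sets (finite sets stand in for Borel sets here)
family = [frozenset({-1.0, -0.5}), frozenset({0.0, 2.0}), frozenset({3.0})]
union = frozenset().union(*family)

measure_of_union = delta_0(union)
sum_of_measures = sum(delta_0(E) for E in family)
measure_of_empty = delta_0(frozenset())
```

Exactly one set of the family contains the point 0, so both sides of the additivity relation are equal to 1.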

1 This σ-algebra cannot be described explicitly.


Since $\mathbb{R}$ itself is an element in $\mathcal{B}(\mathbb{R})$, $\delta_{x_0}(\mathbb{R}) = 1$: the Dirac measure of $\mathbb{R}$ is 1, independently of the point $x_0$. It is therefore an example of a finite measure, that is, the measure of the entire space is finite.

Measures are generally σ-finite, rather than simply finite. Given a measure space $(X, \mathcal{A}, \mu)$, $\mu$ is said to be a σ-finite measure if $X$ can be written as the countable union of measurable sets $(E_n)_{n \in \mathbb{N}} \subset X$ with a finite measure, that is:

\[ X = \bigcup_{n \in \mathbb{N}} E_n, \quad \mu(E_n) < +\infty \ \ \forall n \in \mathbb{N} \]

Several different techniques exist for constructing a measure, but these are not simple and cannot be described in short form. Readers may wish to consult the volume cited in the preface, or any other work on measure theory.

3.3. Measurable functions and almost-everywhere properties (a.e.)

The next step is to introduce the morphisms of measurable spaces, that is, maps between measurable spaces which preserve measurability.

Let $(X_1, \mathcal{A}_1)$, $(X_2, \mathcal{A}_2)$ be two measurable spaces and $f : X_1 \to X_2$ an arbitrary function. $f$ is a measurable function (with respect to the chosen σ-algebras $\mathcal{A}_1$ and $\mathcal{A}_2$) if the preimage via $f$ of any element of the σ-algebra $\mathcal{A}_2$ is included in $\mathcal{A}_1$, that is²:

\[ E \in \mathcal{A}_2 \implies f^{-1}(E) \in \mathcal{A}_1. \]

This is equivalent, by definition, to stating that the preimage via $f$ of a measurable set of $X_2$ (with respect to $\mathcal{A}_2$) is a measurable set of $X_1$ (with respect to $\mathcal{A}_1$).

REMARKS.–
– Continuous functions between two topological spaces are clearly measurable with respect to their Borel σ-algebras.
– Without other specifications, whenever we consider real-valued functions, that is $f : X \to \mathbb{R}$, where $(X, \mathcal{A})$ is a measurable space, we fix the Borel σ-algebra on $\mathbb{R}$ and we test the measurability of $f$ with respect to this choice.
– A complex-valued function $f : X \to \mathbb{C}$ is measurable if both its real and imaginary parts are measurable.

2 Note the similarity between this definition and that of a continuous function, in the topological sense of the term.


Let us now recall the crucial concept of properties which hold almost everywhere. A function $f$ defined on a measure space $(X, \mathcal{A}, \mu)$ has a property which holds almost everywhere (written a.e.) if $f$ possesses this property on $X \setminus E$, where $E \in \mathcal{A}$ has measure zero: $\mu(E) = 0$.

EXAMPLES.–
– Let $f, g$ be measurable functions defined on $(X, \mathcal{A}, \mu)$; then $f = g$ a.e. if there exists $U \in \mathcal{A}$ with $\mu(X \setminus U) = 0$ such that $f(x) = g(x)$ $\forall x \in U$.
– $f$ is the a.e. pointwise limit of the sequence $(f_n)_{n \in \mathbb{N}}$ if there exists $U \in \mathcal{A}$ with $\mu(X \setminus U) = 0$ such that $\lim_{n \to +\infty} f_n(x) = f(x)$ $\forall x \in U$.

3.4. Integrable functions and Lebesgue integrals

Given a measure space $(X, \mathcal{A}, \mu)$, the integral of a real- or complex-valued measurable function is relatively simple to define. We start by considering a special function, the indicator (or characteristic) function of a set $E \in \mathcal{A}$, $\chi_E : X \to \{0, 1\}$:

$$\chi_E(x) = \begin{cases} 1 & \text{if } x \in E \\ 0 & \text{if } x \notin E \end{cases}$$

An equivalent notation is $1_E$. Indicator functions are used to define simple functions, or step functions, via linear combination. More precisely, taking $(E_k)_{k=1}^{n}$ to be a finite partition of $X$ into disjoint measurable sets, that is, $E_k \cap E_{k'} = \emptyset$ $\forall k \neq k'$ and

$$\bigcup_{k=1}^{n} E_k = X,$$

a simple function $s : X \to \mathbb{R}$ is defined as:

$$s = \sum_{k=1}^{n} c_k \chi_{E_k}, \qquad c_k \in \mathbb{R},$$

so that $s(x) = c_k$ $\forall x \in E_k$; hence $s$ can only take a finite number of values. If $X$ is a subset of $\mathbb{R}$ and the sets $E_k$ are intervals, then $s$ is a piecewise constant function. The natural definition of the Lebesgue integral of a simple function is:

$$\int_X s \, d\mu = \sum_{k=1}^{n} c_k \, \mu(E_k)$$
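As a small illustration of the formula above, the following sketch computes the integral of a simple function from its values $c_k$ and the measures $\mu(E_k)$. The function name `integrate_simple` and the particular partition of $[0, 3)$ are assumptions chosen for the example, with the Lebesgue measure of an interval taken to be its length.

```python
from fractions import Fraction

def integrate_simple(pieces):
    """Integral of s = sum_k c_k * chi_{E_k}, computed as sum_k c_k * mu(E_k).

    `pieces` is a list of (c_k, mu_E_k) pairs; the sets E_k are assumed to be
    pairwise disjoint and measurable, with mu(E_k) already known.
    """
    return sum(c * mu for c, mu in pieces)

# Hypothetical example on X = [0, 3) with the Lebesgue measure (interval length):
# s = 2 on [0, 1), s = -1 on [1, 5/2), s = 4 on [5/2, 3).
pieces = [
    (Fraction(2), Fraction(1)),       # c_1 = 2,  m([0, 1))   = 1
    (Fraction(-1), Fraction(3, 2)),   # c_2 = -1, m([1, 5/2)) = 3/2
    (Fraction(4), Fraction(1, 2)),    # c_3 = 4,  m([5/2, 3)) = 1/2
]
integral = integrate_simple(pieces)
```

Exact fractions are used so that the result can be compared with the hand computation $2 \cdot 1 - 1 \cdot \tfrac{3}{2} + 4 \cdot \tfrac{1}{2}$.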


Note that the integral of $s$ is well defined only because each set $E_k$ is measurable, so that $\mu(E_k)$ makes sense. The importance of simple functions is expressed in Theorem 3.1.

THEOREM 3.1.– Let $(X, \mathcal{A}, \mu)$ be a measure space and $f : X \to [0, +\infty]$ a measurable and non-negative function. $f$ can be approximated from below by a sequence of simple functions, that is, $\exists (s_n)_{n \in \mathbb{N}}$, with each $s_n$ a simple function, such that $(s_n)_{n \in \mathbb{N}} \nearrow f$, that is:
1) $0 \le s_0(x) \le s_1(x) \le \ldots \le s_n(x) \le \ldots \le f(x)$ $\forall x \in X$;
2) $\lim_{n \to +\infty} s_n(x) = f(x)$ $\forall x \in X$ (pointwise limit).
If $f$ is bounded, then the convergence of the sequence $(s_n)_{n \in \mathbb{N}}$ toward $f$ is uniform.

The proof of this theorem is both elegant and informative, showing that a suitable sequence of simple functions is given by:

$$s_n(x) = \begin{cases} \dfrac{k}{2^n} & \text{if } \dfrac{k}{2^n} \le f(x) < \dfrac{k+1}{2^n}, \text{ for } k = 0, 1, \ldots, 2^{2n} - 1 \\[2mm] 2^n & \text{if } 2^n \le f(x) \end{cases}$$

This fundamental theorem makes it possible to define the integral of a measurable non-negative function $f : X \to [0, +\infty]$ as:

$$\int_X f \, d\mu = \sup_{\substack{0 \le s \le f \\ s \text{ simple}}} \int_X s \, d\mu$$

$f$ is said to be (Lebesgue) integrable if $\int_X f \, d\mu < +\infty$.
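The dyadic construction behind Theorem 3.1 can be sketched numerically. The code below assumes the standard form of the approximation (values $k/2^n$ below a cap of $2^n$); the helper name `dyadic_approx` and the test function `f` are hypothetical choices made for this illustration only.

```python
import math

def dyadic_approx(value, n):
    """n-th dyadic approximation from below of a number value >= 0.

    Returns k / 2**n where k / 2**n <= value < (k + 1) / 2**n,
    capped at 2**n when value >= 2**n.
    """
    if value >= 2 ** n:
        return float(2 ** n)
    return math.floor(value * 2 ** n) / 2 ** n

def f(x):
    # Hypothetical bounded, non-negative test function on [0, 1].
    return x * x

x = 0.7
# s_0(x), s_1(x), ..., s_7(x): a non-decreasing sequence approaching f(x) from below.
approximations = [dyadic_approx(f(x), n) for n in range(8)]
```

At a point where $f(x) < 2^n$, the error $f(x) - s_n(x)$ is smaller than $2^{-n}$, which is the mechanism behind the uniform convergence for bounded $f$.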

If $f : X \to \mathbb{R}$ is measurable, then its integral can be defined by considering its positive part:

$$f_+(x) = \begin{cases} f(x) & \text{if } f(x) \ge 0 \\ 0 & \text{if } f(x) < 0 \end{cases}$$

and its negative part:

$$f_-(x) = \begin{cases} -f(x) & \text{if } f(x) \le 0 \\ 0 & \text{if } f(x) > 0; \end{cases}$$

note that both these functions are non-negative. Since $f = f_+ - f_-$, if $f_+$ and $f_-$ are integrable, then we can define the integral of a real-valued measurable function as:

$$\int_X f \, d\mu = \int_X f_+ \, d\mu - \int_X f_- \, d\mu$$


The same strategy is used for measurable functions $f : X \to \mathbb{C}$, but using the positive and negative parts of the real part $\mathrm{Re}(f)$ and the imaginary part $\mathrm{Im}(f)$. The integral is thus defined as:

$$\int_X f \, d\mu = \int_X \mathrm{Re}(f) \, d\mu + i \int_X \mathrm{Im}(f) \, d\mu$$

Absolute integrability is a necessary and sufficient condition for the integrability of a real- or complex-valued measurable function:

$$f \text{ is integrable} \iff \int_X |f| \, d\mu < +\infty$$

3.5. Characterization of the Lebesgue measure on R and sets with a null Lebesgue measure

As we have seen, the construction of a measure is generally not trivial. However, given the importance of the Lebesgue measure on $\mathbb{R}$, it is helpful to provide a brief summary of the characteristics of this measure. Remarkably, a theorem exists which characterizes the Lebesgue measure by means of a few properties. Before quoting the result, we recall some definitions.

– Borel measure: Let $X$ be a topological space, and take the measurable space $(X, \mathcal{B}(X))$, where $\mathcal{B}(X)$ is the Borel σ-algebra. A measure $\mu$ defined on this space is said to be a Borel measure if it associates a finite number with each compact subset $K$ of $X$;
– Regular Borel measure: A Borel measure is regular if, for any Borel set $E \in \mathcal{B}(X)$, we have:
1) $\mu(E) = \sup\{\mu(K),\ K \subset E,\ K \text{ compact}\}$;
2) $\mu(E) = \inf\{\mu(O),\ E \subset O,\ O \text{ open}\}$.

Consider now $(\mathbb{R}, \mathcal{B}(\mathbb{R}), \mu)$. $\mu$ is a shift-invariant measure if $\mu(E + a) = \mu(E)$ for any Borel set $E \in \mathcal{B}(\mathbb{R})$ and all $a \in \mathbb{R}$, where:

$$E + a = \{x \in \mathbb{R} : x = e + a, \text{ where } e \in E\}.$$

We can now quote the theorem that provides the characterization of the Lebesgue measure on $\mathbb{R}$, noted $m$.

THEOREM 3.2.– If a measure $\mu$ on $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$ has the following properties:


1) $\mu$ is a regular Borel measure;
2) $\mu$ is shift-invariant;
3) $\mu$ is normalized, that is, $\mu([0, 1]) = 1$;
then $\mu$ is the Lebesgue measure $m$.

Thus, we can say that the Lebesgue measure on $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$ is the regular, shift-invariant, normalized Borel measure; this also implies that $m([a, b]) = b - a$. A further consequence of this theorem is that the Lebesgue measure is σ-finite: $\mathbb{R}$ can be covered by the compact intervals $[-n, n]$ with $n \in \mathbb{N}$, all of which possess finite measure ($m([-n, n]) = 2n$).

Generalization of the Lebesgue measure on $\mathbb{R}$ to $\mathbb{R}^n$ is straightforward, and we can prove that, if a function is Riemann-integrable on a bounded box of $\mathbb{R}^n$, it is also Lebesgue-integrable and the two integrals coincide.

Important examples of sets with null Lebesgue measure are given by hypersurfaces of dimension $n - 1$ in $\mathbb{R}^n$, such as two-dimensional (2D) surfaces in $\mathbb{R}^3$ and curves in $\mathbb{R}^2$. Regarding $\mathbb{R}$, which has the cardinality of the continuum, the countable or finite subsets have null Lebesgue measure, in particular:

$$m(\mathbb{N}) = m(\mathbb{Z}) = m(\mathbb{Q}) = 0$$

This means that even if we removed a countably infinite number of points from a measurable set in $\mathbb{R}$, for example an interval $[a, b]$, its Lebesgue measure would not change.

This property means that the class of Lebesgue-integrable functions is much broader than that of Riemann-integrable functions. Take the case of the indicator function of $\mathbb{Q} \cap [a, b]$: it is discontinuous at every point of $[a, b]$ and has no Riemann integral. It does, however, have a Lebesgue integral, equal to $m(\mathbb{Q} \cap [a, b]) = 0$. More generally, for a piecewise continuous function with a finite or countable number of jump discontinuities, the Lebesgue integral is the algebraic sum of the integrals over each section on which the function is continuous: the discontinuities constitute a set of null Lebesgue measure and therefore have no effect on the final result of integration.
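The statement that countable sets have null Lebesgue measure rests on a standard covering argument: the $k$-th point of an enumeration is placed inside an interval of length $\varepsilon / 2^{k+1}$, so that the total length of the cover never exceeds $\varepsilon$. The sketch below only checks this arithmetic; the function name `cover_length` and the value of `eps` are assumptions made for the example.

```python
from fractions import Fraction

def cover_length(num_points, eps):
    """Total length of the intervals covering the first `num_points` points
    of a countable set, where the k-th point (k = 0, 1, ...) sits inside an
    open interval of length eps / 2**(k + 1)."""
    return sum(eps / 2 ** (k + 1) for k in range(num_points))

eps = Fraction(1, 1000)
# However many points are covered, the total length stays below eps.
totals = [cover_length(n, eps) for n in (1, 10, 100)]
```

Since $\varepsilon$ can be taken as small as we wish, the measure of the countable set is bounded by every positive number, hence it is zero.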
It is important to remember that Lebesgue integration theory does not provide more advanced tools for the explicit calculation of integrals, except in certain very speciﬁc cases; however, as just discussed, it allows us to give a meaningful sense to integrals of functions which are much less regular than is required for Riemann integration. This result, along with the crucial theorems presented in section 3.6, gives Lebesgue integration theory a signiﬁcant advantage over that of Riemann.


3.6. Three theorems for limit operations in integration theory

In this section, we shall summarize the three most important theorems concerning the limit operation in integration theory. These will be used in Chapter 4. In these theorems, we shall take $(X, \mathcal{A}, \mu)$ to be an arbitrary fixed measure space.

THEOREM 3.3 (Monotone convergence theorem – Beppo Levi).– Let $(f_n)_{n \in \mathbb{N}}$, with $f_n : X \to \mathbb{R}$, be a monotonically increasing sequence of integrable functions. If the sequence of integrals is bounded, that is:

$$\exists K \in \mathbb{R} \text{ such that } \int_X f_n \, d\mu < K \quad \forall n \in \mathbb{N},$$

then $\lim_{n \to +\infty} f_n(x)$ exists and is finite a.e. Furthermore, if we define the limit function $f : X \to \mathbb{R}$ as:

$$f(x) = \begin{cases} \lim\limits_{n \to +\infty} f_n(x) & \text{if the limit is finite} \\ 0 & \text{otherwise} \end{cases}$$

then $f$ is integrable, and the limit and integral commute:

$$\int_X f(x) \, d\mu = \lim_{n \to +\infty} \int_X f_n(x) \, d\mu.$$
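A numerical illustration of the monotone convergence theorem may help here. The sketch below uses the hypothetical example $f_n(x) = \min(n, 1/\sqrt{x})$ on $(0, 1]$, an increasing sequence whose integrals equal $2 - 1/n$ and are therefore bounded by $K = 2$; the midpoint rule stands in for the Lebesgue integral, which is an assumption that is reasonable only because these functions are bounded and piecewise smooth.

```python
import math

def f_n(x, n):
    """Truncation at height n of f(x) = 1 / sqrt(x) on (0, 1]."""
    return min(n, 1.0 / math.sqrt(x))

def midpoint_integral(g, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))

levels = (1, 2, 4, 8)
# Numerical integrals of f_n over (0, 1], to be compared with 2 - 1/n.
integrals = [midpoint_integral(lambda x, n=n: f_n(x, n), 0.0, 1.0) for n in levels]
exact = [2 - 1 / n for n in levels]
```

The integrals increase toward 2, which is the integral of the limit function $1/\sqrt{x}$ on $(0, 1]$, in agreement with the exchange of limit and integral stated in the theorem.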

Let us now pass to Fatou's lemma by first recalling that, given an arbitrary sequence $(x_n)_{n \in \mathbb{N}}$ of real numbers, $\liminf$ is the limit inferior of the sequence, that is:

$$\liminf_{n \in \mathbb{N}} \, (x_n) = \inf\{x \in \mathbb{R} : x \text{ limit point for } (x_n)_{n \in \mathbb{N}}\}$$

THEOREM 3.4 (Fatou's lemma).– Let $(f_n)_{n \in \mathbb{N}}$, with $f_n : X \to \mathbb{R}$, be a sequence of non-negative integrable functions and let $\liminf_{n \to +\infty} \int_X f_n \, d\mu < +\infty$. The function $f$ defined by:

$$f(x) = \begin{cases} \liminf\limits_{n \to +\infty} f_n(x) & \text{if the limit inferior is finite} \\ 0 & \text{otherwise} \end{cases}$$

is integrable; moreover, the following inequality holds:

$$\int_X f \, d\mu \le \liminf_{n \to +\infty} \int_X f_n \, d\mu.$$


THEOREM 3.5 (Dominated convergence theorem – Henri Lebesgue).– Let $(f_n)_{n \in \mathbb{N}}$, where $f_n : X \to \mathbb{R}$, be a sequence of measurable functions, and let $\Phi : X \to \mathbb{R}$ be a non-negative integrable function such that:

$$|f_n| \le \Phi \quad \text{a.e.} \quad \forall n \in \mathbb{N}$$

If the real sequence $(f_n(x))_{n \in \mathbb{N}}$ is convergent $\forall x \in X$ and if $f(x) = \lim_{n \to +\infty} f_n(x)$, then $f_n$ and $f$ are integrable and the limit and the integral commute, that is:

$$\int_X f(x) \, d\mu = \lim_{n \to +\infty} \int_X f_n(x) \, d\mu$$
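A classical example shows why the hypotheses of these theorems matter. Take, as an assumption for this sketch, $X = (0, 1)$ with the Lebesgue measure and $f_n = n \, \chi_{(0, 1/n)}$. Every $f_n$ has integral 1, yet $f_n(x) \to 0$ for every $x$: the inequality in Fatou's lemma is strict ($0 < 1$), and the limit and the integral do not commute. The names `f_n` and `integral_f_n` are hypothetical helpers.

```python
from fractions import Fraction

def f_n(x, n):
    """f_n = n * indicator function of the open interval (0, 1/n)."""
    return n if 0 < x < Fraction(1, n) else 0

def integral_f_n(n):
    """Exact integral of the simple function f_n: value n times length 1/n."""
    return n * Fraction(1, n)

x = Fraction(1, 50)
# For n >= 50 we have 1/n <= x, so f_n(x) = 0: the pointwise limit at x is 0.
tail_values = [f_n(x, n) for n in range(50, 60)]
# The integrals, however, are all equal to 1.
integrals = [integral_f_n(n) for n in range(1, 20)]
```

The dominated convergence theorem does not apply here: any $\Phi$ with $|f_n| \le \Phi$ for all $n$ must be at least $\sup_n f_n(x) \ge 1/x - 1$, which is not integrable on $(0, 1)$.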

3.7. Summary

In this chapter, we provided a brief overview of key elements of measure theory and Lebesgue integration, touching on subjects such as σ-algebras, measurable sets, measures, measure spaces and measurable functions. Particular attention was paid to the Borel σ-algebra in a topological space: this σ-algebra is generated by the open subsets of the space in question.

Almost-everywhere (a.e.) properties play an important role in measure theory: a property is verified a.e. if it is valid on a measurable subset whose complement has null measure.

We saw that the Lebesgue measure $m$ on $\mathbb{R}$ can be characterized with respect to the Borel σ-algebra as a regular, normalized and shift-invariant measure. Remarkable examples of sets with null Lebesgue measure in $\mathbb{R}$ include countable sets, specifically $m(\mathbb{N}) = m(\mathbb{Z}) = m(\mathbb{Q}) = 0$.

Given a measure, the definition of the integral of a measurable function is straightforward and follows a standard approach. We begin by considering simple (or step) functions, which are linear combinations of characteristic functions of measurable sets. Simple functions approach any non-negative measurable function from below. This result is essential, allowing us to define the integral of a non-negative measurable function as the supremum of the integrals of the simple functions which are not greater than the function itself. This definition is extended to arbitrary real-valued functions by using their positive and negative parts, and to complex-valued functions by using their real and imaginary parts.

Finally, we outlined the three fundamental theorems concerning the relation between limits and integrals in Lebesgue theory: the monotone and dominated convergence theorems (due to Levi and Lebesgue, respectively) and Fatou's lemma.

4 Banach Spaces and Hilbert Spaces

In this chapter, we shall consider normed or inner product spaces of infinite dimension. Particular attention will be paid to "complete" spaces, for which several crucial theorems, which do not hold for non-complete spaces of infinite dimension, can be formulated.

Before we can begin our analysis, it is important to note that all of the properties described previously for inner product spaces of finite dimension which rely solely on the algebraic nature of the inner product remain valid for infinite-dimensional vector spaces. For example:
– a family of non-zero orthogonal vectors is free, that is, linearly independent;
– if $\langle x, z \rangle = \langle y, z \rangle$ $\forall z$, then the vectors $x$ and $y$ necessarily coincide;
– the null vector is the only vector which is orthogonal to all other vectors;
– the Gram-Schmidt orthonormalization procedure can be iterated, guaranteeing that an infinite system of mutually orthogonal vectors with unit norm will be obtained from any given infinite set of linearly independent vectors.

The proofs for the first three properties are identical to those used for finite-dimensional vector spaces. The proof of the final property relies on Zorn's lemma. Results for finite sums are harder to generalize; in this case, we need to take account of topological arguments in addition to algebraic considerations.

From Euclidean to Hilbert Spaces: Introduction to Functional Analysis and its Applications First Edition. Edoardo Provenzi. © ISTE Ltd 2021. Published by ISTE Ltd and John Wiley & Sons, Inc.


As we shall see, the definition and analysis of Banach and Hilbert spaces rely primarily on the analysis of the compatibility between the linear and topological structures of a normed or inner product vector space. For this reason, we start by recalling the concept of topology in such spaces.

4.1. Metric topology of inner product spaces

As we have seen, every inner product space $V$ can be assigned a norm, which is canonically induced by the inner product. Using this norm, it is always possible to define, canonically, a distance or metric on $V$:

$$d(x, y) = \|x - y\| = \sqrt{\langle x - y, x - y \rangle}, \qquad \forall x, y \in V$$

The function $d$ possesses the following properties, $\forall x, y, z \in V$:
1) $d(x, y) \ge 0$ and $d(x, y) = 0 \iff x = y$;
2) $d(x, y) = d(y, x)$ (symmetry);
3) $d(x, y) \le d(x, z) + d(z, y)$ (triangle inequality).

DEFINITION 4.1 (Metric vector space).– A metric vector space is a pair $(V, d)$ given by a vector space $V$ and a function, the distance $d : V \times V \to \mathbb{R}_0^+ = [0, +\infty)$, which satisfies the three properties given above.

An inner product space is thus automatically a normed vector space and possesses a distance, independently of whether the inner product is real or complex. As we shall see in this chapter, conversely, a norm is induced by an inner product if and only if it satisfies the parallelogram law.

The existence of a metric means that it is possible to establish relationships between points and subsets in a space which go further than the simple notion of a point belonging to a set. As we know, these relationships form the basis for constructing a topology. Reminders of a number of common definitions are given below, establishing clear notation and naming conventions for the rest of this chapter.

DEFINITION 4.2.– Let $(V, \langle \, , \rangle)$ be an inner product space and $\| \ \|$ the associated norm. Then:
– an (open) neighborhood of $x \in V$ of radius $\varepsilon$ is the subset of $V$ defined by:

$$U_\varepsilon(x) = \{y \in V : \|x - y\| < \varepsilon\}$$

if $\| \ \|$ is the Euclidean norm, then $U_\varepsilon(x)$ is an (open) sphere centered at $x$ with radius $\varepsilon$. By extension, $U_\varepsilon(x)$ is often called an (open) ball or sphere, and we write $B(x, \varepsilon)$, for any norm $\| \ \|$;


– a subset $O \subseteq V$ is said to be open if:

$$\forall x \in O \ \exists \varepsilon > 0 \text{ such that } y \in U_\varepsilon(x) \implies y \in O$$

– a subset $F \subseteq V$ is said to be closed if its complement $F^c = V \setminus F$ is open. Remember that this is the same as saying that any convergent sequence of elements of $F$ has its limit in $F$;
– the closure of $E \subset V$ is $\overline{E} = \bigcap_{\alpha \in I} E_\alpha$, where $(E_\alpha)_{\alpha \in I}$ is the family of closed subsets of $V$ containing $E$. $\overline{E}$ is the smallest closed subset of $V$ which contains $E$;
– the border (or spherical surface) of $U_\varepsilon(x)$ is the subset of $V$ defined by:

$$\partial U_\varepsilon(x) = \{y \in V : \|x - y\| = \varepsilon\}$$

using the symbol $B(x, \varepsilon)$, the border is noted $\partial B(x, \varepsilon)$;
– the closed neighborhood (or ball, or sphere) of radius $\varepsilon$ of $x \in V$ is the subset of $V$ defined by:

$$\overline{U}_\varepsilon(x) = \{y \in V : \|x - y\| \le \varepsilon\}$$

using the symbol $B(x, \varepsilon)$, we can write $\overline{B}(x, \varepsilon)$.

We also recall that a topology on $V$ is a collection of subsets of $V$ containing $V$ itself and $\emptyset$, which is stable with respect to arbitrary unions and finite intersections. The topology generated by the open balls of $V$ is the smallest topology which contains them. Using this topology, with respect to the open sets defined above, $V$ is a topological space.

The topology of $V$ is metric, that is, the open sets are defined using a distance function. We recall that this guarantees that the topology will be separated, that is, for all pairs $x, y \in V$, $x \neq y$, there exist two neighborhoods $U_{\varepsilon_1}(x)$ and $U_{\varepsilon_2}(y)$, of different or equal radius, such that $U_{\varepsilon_1}(x) \cap U_{\varepsilon_2}(y) = \emptyset$, and we say that the points are separated by these neighborhoods.

A standard result in topology guarantees the uniqueness of the limit of sequences in a separated topology; hence, if a sequence of vectors in $V$ converges, it has a single limit. We now recall the definition of convergence for a sequence in the topology of $V$.

DEFINITION 4.3.– Let $(V, \| \ \|)$ be a normed vector space. A sequence of vectors $(x_n)_{n \in \mathbb{N}} \subset V$ is convergent, or convergent in the norm $\| \ \|$, toward the limit $x$ if:

$$\forall \varepsilon > 0 \ \exists N_\varepsilon > 0 : n \ge N_\varepsilon \implies \|x_n - x\| < \varepsilon$$


that is if, from $n = N_\varepsilon$ onward, $x_n \in U_\varepsilon(x)$. This can be represented using the more compact notation:

$$\lim_{n \to +\infty} x_n = x, \qquad \|x_n - x\| \underset{n \to +\infty}{\longrightarrow} 0$$

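REMARK.– Requiring the inequality $\|x_n - x\| < \varepsilon$ to be valid $\forall \varepsilon > 0$ enables us to reformulate the definition of convergence by multiplying $\varepsilon$ by a strictly positive, finite constant $m \in (0, +\infty)$ which does not depend on $\varepsilon$, that is, $x_n \underset{n \to +\infty}{\longrightarrow} x$ if:

$$\forall \varepsilon > 0 \ \exists N_\varepsilon > 0 : n \ge N_\varepsilon \implies \|x_n - x\| < m\varepsilon$$

If the property is valid for all positive and arbitrarily small $\varepsilon$, then we can set $\tilde{\varepsilon} = m\varepsilon$ and restate the convergence with respect to $\tilde{\varepsilon}$:

$$\forall \tilde{\varepsilon} > 0 \ \exists N_{\tilde{\varepsilon}} > 0 : n \ge N_{\tilde{\varepsilon}} \implies \|x_n - x\| < \tilde{\varepsilon}$$

This is possible because it makes no difference whether we use the symbol $\varepsilon$ or $\tilde{\varepsilon}$; both quantities can be as small as we wish, so the two definitions are equivalent.

In a metric topology, the uniqueness of the limit follows simply from the triangle inequality. If $x_n \underset{n \to +\infty}{\longrightarrow} x$ and $x_n \underset{n \to +\infty}{\longrightarrow} y$, then:

$$0 \le d(x, y) \le d(x, x_n) + d(x_n, y) \underset{n \to +\infty}{\longrightarrow} 0 + 0 = 0$$

so that $d(x, y) = 0$, that is, $x = y$.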

In what follows, the freedom to rescale $\varepsilon$ described in the remark above will be referred to using the standard expression "due to the arbitrariness of $\varepsilon$...".

It is also helpful to recall the concept of density of a subset in a normed vector space. Proof of the equivalence of the properties expressed in Definition 4.4 can be found in most works on the subject of topology.

DEFINITION 4.4 (density).– Let $(V, \| \ \|)$ be a normed vector space. A subset $E \subset V$ is dense in $V$ if one of the following equivalent propositions is verified:
1) $\forall x \in V$, $\exists (x_n)_{n \in \mathbb{N}} \subset E : x_n \underset{n \to +\infty}{\longrightarrow} x$ (i.e. $\|x_n - x\| \underset{n \to +\infty}{\longrightarrow} 0$), that is: any element of $V$ can be approached indefinitely by a sequence of elements of $E$, and is the limit of this sequence;
2) $\forall x \in V$ and $\forall \varepsilon > 0$, $\exists y \in E : \|x - y\| < \varepsilon$, that is, for every element $x$ of $V$ there exists an element $y$ of $E$ at an arbitrarily small distance from $x$;
3) $V$ is the closure of $E$: $\overline{E} = V$.
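The density of $\mathbb{Q}$ in $\mathbb{R}$ is the prototype of Definition 4.4 and can be illustrated with decimal truncations. In the sketch below, the floating-point value of $\sqrt{2}$ is only a stand-in for the real number, and the helper name `rational_approx` is an assumption made for the example.

```python
import math
from fractions import Fraction

def rational_approx(x, n):
    """Rational number r_n = floor(x * 10**n) / 10**n, so that 0 <= x - r_n < 10**-n."""
    return Fraction(math.floor(x * 10 ** n), 10 ** n)

x = math.sqrt(2)
# Distances |x - r_n| for n = 1, ..., 7: each is smaller than 10**-n.
errors = [x - float(rational_approx(x, n)) for n in range(1, 8)]
```

This corresponds to proposition 1) of the definition, with $E = \mathbb{Q}$ and $V = \mathbb{R}$: the sequence $(r_n)$ lies in $E$ and converges to $x$.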


We end the recap of classical notions with the concept of continuity of a function between metric spaces, along with a classic result which says that we can characterize the continuity of a function via its action on sequences.

DEFINITION 4.5 (limits and continuity of functions between metric spaces).– Let $X$ and $Y$ be two arbitrary metric spaces, $f : X \to Y$, $\bar{x} \in X$ and $\ell \in Y$, then:

$$\lim_{x \to \bar{x}} f(x) = \ell \iff \forall \varepsilon > 0 \ \exists \delta_\varepsilon > 0 : x \in U_{\delta_\varepsilon}(\bar{x}) \cap X \implies f(x) \in U_\varepsilon(\ell) \cap Y$$

that is, the limit of $f$ in $\bar{x}$ is $\ell$ if $f$ transforms the points of $X$ which are arbitrarily close to $\bar{x}$ into points of $Y$ which are arbitrarily close to $\ell$. If $\ell = f(\bar{x})$, then the function $f : X \to Y$ is said to be continuous in $\bar{x} \in X$. In explicit terms:

$$\forall \varepsilon > 0 \ \exists \delta_\varepsilon > 0 : x \in U_{\delta_\varepsilon}(\bar{x}) \cap X \implies f(x) \in U_\varepsilon(f(\bar{x})) \cap Y$$

$f$ is continuous on $X$ if it is continuous at every point in $X$.

THEOREM 4.1 (Sequential continuity).– The function $f : X \to Y$, with $(X, d_X)$ and $(Y, d_Y)$ arbitrary metric spaces, is continuous in $\bar{x} \in X$ if and only if:

$$\forall (x_n)_{n \in \mathbb{N}} \subseteq X \text{ such that } \lim_{n \to +\infty} x_n = \bar{x} \implies \lim_{n \to +\infty} f(x_n) = f\Big(\lim_{n \to +\infty} x_n\Big) = f(\bar{x})$$

that is:

$$\forall (x_n)_{n \in \mathbb{N}} \subseteq X : d_X(x_n, \bar{x}) \underset{n \to +\infty}{\longrightarrow} 0 \implies d_Y(f(x_n), f(\bar{x})) \underset{n \to +\infty}{\longrightarrow} 0$$
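The sequential characterization of Theorem 4.1 can be tested on two elementary real functions. The step function and the sequence $x_n = -1/n$ used below are hypothetical choices for this sketch: the sequence converges to 0, but the images under the step function stay at distance 1 from the value at 0, which detects the discontinuity.

```python
def heaviside(x):
    """Step function: 0 for x < 0 and 1 for x >= 0 (discontinuous at 0)."""
    return 1.0 if x >= 0 else 0.0

def square(x):
    """Continuous function used for comparison."""
    return x * x

# Sequence x_n = -1/n, which converges to 0 from the left.
xs = [-1.0 / n for n in range(1, 1001)]

# Distance between f(x_n) and f(0) at the last computed term of the sequence.
gap_step = abs(heaviside(xs[-1]) - heaviside(0.0))   # does not shrink
gap_square = abs(square(xs[-1]) - square(0.0))       # shrinks with n
```

A single sequence with $f(x_n) \not\to f(\bar{x})$ is enough to rule out continuity at $\bar{x}$, whereas proving continuity requires the property for all sequences converging to $\bar{x}$.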

We see that the limit operation on the sequence $(x_n)_{n \in \mathbb{N}}$ is carried out in the metric space $(X, d_X)$, while the operation on the sequence $(f(x_n))_{n \in \mathbb{N}}$ is carried out in the metric space $(Y, d_Y)$. The possibility of switching the order of the limit and the (continuous) function in the expression:

$$\lim_{n \to +\infty} f(x_n) = f\Big(\lim_{n \to +\infty} x_n\Big)$$

is essential for proving many of the results presented later.

PROOF.– $\Longrightarrow$: let $f$ be continuous in $\bar{x}$ and let $(x_n)_{n \in \mathbb{N}} \subseteq X$ be an arbitrary sequence of elements of $X$ such that $\lim_{n \to +\infty} x_n = \bar{x}$. Then, by definition of the limit of a sequence,


for sufficiently large values of $n$, $x_n$ belongs to a neighborhood of $\bar{x}$ of arbitrarily small radius $\delta > 0$: in other words, for every $\delta > 0$ there exists $N \in \mathbb{N}$ such that $n \ge N \implies x_n \in U_\delta(\bar{x})$. On the other hand, due to the continuity of $f$, for every $\varepsilon > 0$ the radius $\delta$ can be chosen so that the elements of $U_\delta(\bar{x})$ are transformed by $f$ into points belonging to the neighborhood of $f(\bar{x})$ of radius $\varepsilon$, i.e. $n \ge N \implies f(x_n) \in U_\varepsilon(f(\bar{x})) \cap Y$, that is, $\lim_{n \to +\infty} f(x_n) = f(\bar{x})$.

$\Longleftarrow$: we shall assume that, for all sequences $(x_n)_{n \in \mathbb{N}} \subseteq X$ such that $\lim_{n \to +\infty} x_n = \bar{x} \in X$, it holds that $\lim_{n \to +\infty} f(x_n) = f(\bar{x})$; we need to prove that this implies the continuity of $f$ in $\bar{x} \in X$.

Using reductio ad absurdum, suppose that $f$ is not continuous in $\bar{x}$: as we shall see, this results in a contradiction. Negation (see footnote 1) of the continuity of $f$ in $\bar{x}$ is equivalent to saying that $\exists \varepsilon > 0$ such that $\forall \delta > 0$ there exists $x \in U_\delta(\bar{x}) \cap X$ with $f(x) \notin U_\varepsilon(f(\bar{x}))$. Since the values of $\delta$ are arbitrary, we may consider the sequence $(\delta_n)_{n \ge 1}$ defined by $\delta_n = \frac{1}{n}$ $\forall n \ge 1$, which implies the existence of a sequence $(x_n)_{n \ge 1} \subset X$ such that $x_n \in U_{\delta_n}(\bar{x})$ and $f(x_n) \notin U_\varepsilon(f(\bar{x}))$. This leaves us with a contradiction: on the one hand, when $n \to +\infty$, $\delta_n \to 0$ and thus $x_n \to \bar{x}$, while on the other hand, $f(x_n) \not\to f(\bar{x})$; that is, the hypothesis that $f$ is not continuous results in a sequence of elements of $X$ which converges to $\bar{x}$ without $f(x_n)$ being convergent to $f(\bar{x})$. This contradicts our initial hypothesis, and thus the possibility that $f$ is not continuous must be rejected. □

If $V, W$ are two normed vector spaces, then they automatically constitute two metric spaces with respect to the distances canonically induced by their norms; the definitions and results presented above therefore remain valid.

4.2. Continuity of fundamental operations in inner product spaces

Given an inner product space with both a linear structure and a metric topology, the question of the compatibility of these two structures is evidently important; in other words, we wish to know whether the linear operations of the vector space $V$, together with the inner product and the norm, are continuous in the topology of $V$ generated by its inner product. The answer to this question is affirmative, as Theorem 4.2 states.

1 Note that the negation of a mathematical proposition is performed by exchanging the universal and existential quantifiers and by considering the complementary affirmation of the initial proposition: thus, the negation of $(\forall A \ \exists B : C)$ is $(\exists A \ \forall B : \bar{C})$, where $\bar{C}$ is the negation of the affirmation $C$.


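THEOREM 4.2.– Let $(V, \langle \, , \rangle)$ be an inner product space on $\mathbb{K}$. We shall consider the topology induced by the inner product on $V$, the usual Euclidean topology on $\mathbb{K}$ and the product topology on $V \times V$ and $\mathbb{K} \times V$. Then:
– inner product: $\langle \, , \rangle : V \times V \longrightarrow \mathbb{K}$, $(x, y) \longmapsto \langle x, y \rangle$;
– norm: $\| \ \| : V \longrightarrow \mathbb{R}_0^+$, $x \longmapsto \|x\|$;
– sum: $+ : V \times V \longrightarrow V$, $(x, y) \longmapsto x + y$;
– and scalar multiplication: $\cdot_{\mathbb{K}} : \mathbb{K} \times V \longrightarrow V$, $(k, x) \longmapsto kx$
are continuous functions.

PROOF.– All of the proofs shown below involve bounding a selected norm from above by an expression which contains the norm of the difference between a sequence and its limit, which evidently converges to 0.

– Continuity of the inner product: we must prove that if $(x_n)_{n \in \mathbb{N}}$ and $(y_n)_{n \in \mathbb{N}}$ are any two sequences of elements in $V$ which converge to $x$ and $y$, respectively, then the sequence of scalars $(\langle x_n, y_n \rangle)_{n \in \mathbb{N}}$ converges to $\langle x, y \rangle$. To do this, we first write a simple algebraic manipulation which holds for all $n \in \mathbb{N}$:

$$\begin{aligned} \langle x_n, y_n \rangle - \langle x, y \rangle &= \langle x_n - x + x, y_n - y + y \rangle - \langle x, y \rangle \\ &= \langle x_n - x, y_n - y \rangle + \langle x_n - x, y \rangle + \langle x, y_n - y \rangle + \langle x, y \rangle - \langle x, y \rangle \\ &= \langle x_n - x, y_n - y \rangle + \langle x_n - x, y \rangle + \langle x, y_n - y \rangle \end{aligned}$$

We can write the following bound:

$$|\langle x_n, y_n \rangle - \langle x, y \rangle| \le |\langle x_n - x, y_n - y \rangle| + |\langle x_n - x, y \rangle| + |\langle x, y_n - y \rangle| \quad \forall n \in \mathbb{N}$$

and, from the Cauchy-Schwarz inequality:

$$|\langle x_n, y_n \rangle - \langle x, y \rangle| \le \|x_n - x\| \|y_n - y\| + \|x_n - x\| \|y\| + \|x\| \|y_n - y\| \quad \forall n \in \mathbb{N}$$

As the inequality holds for all $n \in \mathbb{N}$, the limit $n \to +\infty$ may be considered on both sides: by hypothesis, $\|x_n - x\| \underset{n \to +\infty}{\longrightarrow} 0$ and $\|y_n - y\| \underset{n \to +\infty}{\longrightarrow} 0$, so the right-hand side tends to 0, hence: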


$$|\langle x_n, y_n \rangle - \langle x, y \rangle| \underset{n \to +\infty}{\longrightarrow} 0$$

which proves the continuity of the inner product.

– Continuity of the norm: we must prove that if $(x_n)_{n \in \mathbb{N}}$ is an arbitrary sequence of elements in $V$ which converges to $x$, then the sequence of non-negative real numbers $(\|x_n\|)_{n \in \mathbb{N}}$ converges to $\|x\|$. This can be done using the bound provided by formula [1.3]:

$$\big| \|x_n\| - \|x\| \big| \le \|x_n - x\|$$

but $\|x_n - x\| \underset{n \to +\infty}{\longrightarrow} 0$, hence $\|x_n\| \underset{n \to +\infty}{\longrightarrow} \|x\|$.

– Continuity of the sum: we must show that if $(x_n)_{n \in \mathbb{N}}$ and $(y_n)_{n \in \mathbb{N}}$ are any two sequences of elements in $V$ which converge to $x$ and $y$, respectively, then the sequence $(x_n + y_n)_{n \in \mathbb{N}}$ converges to $x + y$. To do this, we write:

$$\|(x_n + y_n) - (x + y)\| = \|(x_n - x) + (y_n - y)\| \le \|x_n - x\| + \|y_n - y\| \underset{n \to +\infty}{\longrightarrow} 0$$

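– Continuity of scalar multiplication: we must show that if $(x_n)_{n \in \mathbb{N}}$ and $(k_n)_{n \in \mathbb{N}}$ are any two sequences of elements in $V$ and $\mathbb{K}$, respectively, which converge to $x$ and $k$, respectively, then the product sequence $(k_n x_n)_{n \in \mathbb{N}}$ converges to $kx$. Once again, an algebraic manipulation is involved:

$$\begin{aligned} \|k_n x_n - kx\| &= \|k_n(x_n - x + x) - kx\| = \|k_n(x_n - x) + k_n x - kx\| \\ &= \|k_n(x_n - x) + x(k_n - k)\| \le |k_n| \|x_n - x\| + \|x\| |k_n - k| \underset{n \to +\infty}{\longrightarrow} 0 \end{aligned}$$

where we have used the fact that the convergent sequence $(k_n)_{n \in \mathbb{N}}$ is bounded. □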
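The bound used for the continuity of the inner product can be checked numerically. The sketch below works in $\mathbb{C}^3$ with an inner product that is linear in the first argument; the vectors `x`, `y` and the perturbation used to build sequences converging to them are arbitrary choices made for this illustration.

```python
import math

def inner(x, y):
    """Inner product on C^n, linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm(x):
    return math.sqrt(inner(x, x).real)

def diff(u, v):
    return [a - b for a, b in zip(u, v)]

def perturbed(v, n):
    """Hypothetical sequence converging to v: each entry is shifted by (1 + 1j) / n."""
    return [a + (1 + 1j) / n for a in v]

x = [1 + 2j, -1j, 0.5 + 0j]
y = [2 - 1j, 3 + 0j, 1j]

checks = []
for n in (1, 10, 100, 1000):
    xn, yn = perturbed(x, n), perturbed(y, n)
    lhs = abs(inner(xn, yn) - inner(x, y))
    dx, dy = norm(diff(xn, x)), norm(diff(yn, y))
    # Right-hand side of the Cauchy-Schwarz bound from the proof.
    rhs = dx * dy + dx * norm(y) + norm(x) * dy
    checks.append((lhs, rhs))
```

Each pair in `checks` compares $|\langle x_n, y_n \rangle - \langle x, y \rangle|$ with the upper bound from the proof; both quantities shrink as $n$ grows.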

Let us consider the immediate consequences of this theorem. First, the continuity of the sum and scalar multiplication implies that the difference is also continuous, since $x - y = x + (-1)y$. If $(x_n)_{n \in \mathbb{N}}$ and $(y_n)_{n \in \mathbb{N}}$ are two sequences in $(V, \langle \, , \rangle)$ which converge to $x$ and $y$, respectively, then the continuity of the inner product and the norm, taken alongside Theorem 4.1, gives us the following formulas that will be used later:

$$\lim_{n \to +\infty} \langle x_n, y_n \rangle = \Big\langle \lim_{n \to +\infty} x_n, \lim_{n \to +\infty} y_n \Big\rangle = \langle x, y \rangle \qquad \text{[4.1]}$$

$$\lim_{n \to +\infty} \|x_n\| = \Big\| \lim_{n \to +\infty} x_n \Big\| = \|x\| \qquad \text{[4.2]}$$


The case of series needs to be considered separately. First of all, let us recall the definitions of a series and of a convergent series.

DEFINITION 4.6.– Given a sequence of vectors $(x_n)_{n \in \mathbb{N}} \subset V$, the series of general term $x_n$ is the sequence of partial sums $(S_n)_{n \in \mathbb{N}}$, where $S_n = \sum_{k=0}^{n} x_k$, and we write:

$$\sum_{n \in \mathbb{N}} x_n = \sum_{n=0}^{\infty} x_n = (S_n)_{n \in \mathbb{N}}$$

The series $\sum_{n \in \mathbb{N}} x_n$ is said to be convergent, or convergent in the norm $\| \ \|$, to the sum $x$ if the sequence of partial sums $(S_n)_{n \in \mathbb{N}}$ is convergent to $x$, that is:

$$x = \lim_{n \to +\infty} \sum_{k=0}^{n} x_k = \sum_{n \in \mathbb{N}} x_n \iff \lim_{n \to +\infty} \|S_n - x\| = 0 \iff \|S_n - x\| \underset{n \to +\infty}{\longrightarrow} 0$$

The series $\sum_{n \in \mathbb{N}} x_n$ is said to be absolutely convergent (see footnote 2) if the sequence of the partial sums of the norms, that is $\big( \sum_{k=0}^{n} \|x_k\| \big)_{n \in \mathbb{N}}$, is convergent. In this case, we write: $\sum_{n \in \mathbb{N}} \|x_n\| < +\infty$.

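We observe that, since $S_n - x = \sum_{k=0}^{n} x_k - \sum_{k=0}^{\infty} x_k = -\sum_{k=n+1}^{\infty} x_k$, then:

$$\|S_n - x\| = \Big\| \sum_{k=n+1}^{\infty} x_k \Big\| \qquad \text{[4.3]}$$

hence the explicit definition of a convergent series in a normed vector space is:

$$\forall \varepsilon > 0 \ \exists N_\varepsilon > 0 : n \ge N_\varepsilon \implies \Big\| \sum_{k=n+1}^{\infty} x_k \Big\| < \varepsilon$$

Given convergent series $\sum_{n \in \mathbb{N}} x_n$, $\sum_{m \in \mathbb{N}} y_m$, the fact that a series is the sequence of its partial sums, together with formula [4.1] and the linearity of the inner product on finite sums, means that we can write:

$$\Big\langle \sum_{n \in \mathbb{N}} x_n, \sum_{m \in \mathbb{N}} y_m \Big\rangle = \Big\langle \lim_{N \to +\infty} \sum_{n=0}^{N} x_n, \lim_{K \to +\infty} \sum_{m=0}^{K} y_m \Big\rangle = \lim_{N, K \to +\infty} \sum_{n=0}^{N} \sum_{m=0}^{K} \langle x_n, y_m \rangle \qquad \text{[4.4]}$$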
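Formula [4.3] can be made concrete with a geometric series of vectors. The sketch below assumes the hypothetical example $x_k = v / 2^k$ in $\mathbb{R}^2$ with $v = (3, 4)$, whose sum is $2v$; the tail after $S_n$ is $v / 2^n$, so that $\|S_n - x\|^2 = 25 / 4^n$. Exact fractions and squared distances are used to avoid rounding.

```python
from fractions import Fraction

v = (Fraction(3), Fraction(4))          # ||v|| = 5 in the Euclidean norm
limit = (2 * v[0], 2 * v[1])            # sum of the series x_k = v / 2**k

def partial_sum(n):
    """S_n = sum_{k=0}^{n} v / 2**k, computed exactly."""
    factor = sum(Fraction(1, 2 ** k) for k in range(n + 1))
    return (factor * v[0], factor * v[1])

def distance_sq(a, b):
    """Squared Euclidean distance, kept exact to avoid square roots."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

# ||S_n - x||^2 for n = 0, ..., 5: the squared norm of the tail of the series.
errors_sq = [distance_sq(partial_sum(n), limit) for n in range(6)]
```

Given $\varepsilon > 0$, any $N_\varepsilon$ with $5 / 2^{N_\varepsilon} < \varepsilon$ satisfies the explicit definition of convergence for this series.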

2 The absolute convergence deﬁned here becomes the normal convergence for the modulus of Sn when V “ R or V “ C.


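and:

$$\Big\| \sum_{n \in \mathbb{N}} x_n \Big\| = \Big\| \lim_{N \to +\infty} \sum_{n=0}^{N} x_n \Big\| = \lim_{N \to +\infty} \Big\| \sum_{n=0}^{N} x_n \Big\| \qquad \text{[4.5]}$$

Squaring the members of equation [4.5], we obtain:

$$\Big\| \sum_{n \in \mathbb{N}} x_n \Big\|^2 = \Big( \lim_{N \to +\infty} \Big\| \sum_{n=0}^{N} x_n \Big\| \Big)^2 = \lim_{N \to +\infty} \Big\| \sum_{n=0}^{N} x_n \Big\|^2$$

having used the fact that $\big\| \sum_{n=0}^{N} x_n \big\| \in \mathbb{R}$ and the continuity of the square operation in $\mathbb{R}$ to exchange the limit with the square.

If we consider an orthogonal family of vectors $(u_n)_{n \in \mathbb{N}}$ in place of an arbitrary sequence $(x_n)_{n \in \mathbb{N}}$, the generalized Pythagorean theorem (Theorem 1.8) can be used to write $\lim_{N \to +\infty} \big\| \sum_{n=0}^{N} u_n \big\|^2 = \lim_{N \to +\infty} \sum_{n=0}^{N} \|u_n\|^2 = \sum_{n \in \mathbb{N}} \|u_n\|^2$, giving the following very helpful formula:

$$\Big\| \sum_{n \in \mathbb{N}} u_n \Big\|^2 = \sum_{n \in \mathbb{N}} \|u_n\|^2, \qquad (u_n)_{n \in \mathbb{N}} \text{: orthogonal family of vectors} \qquad \text{[4.6]}$$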
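Formula [4.6] can be checked on finite truncations. The following sketch assumes the hypothetical orthogonal family $u_n = e_n / (n + 1)$ in $\mathbb{R}^N$, where $e_n$ is the canonical basis, and contrasts it with a non-orthogonal pair for which the identity fails; the helper names are illustrative.

```python
import math

def norm_sq(v):
    return sum(a * a for a in v)

def partial_sum(vectors, dim):
    """Componentwise sum of a finite list of vectors in R^dim."""
    total = [0.0] * dim
    for v in vectors:
        total = [t + a for t, a in zip(total, v)]
    return total

N = 200
# Orthogonal family u_n = e_n / (n + 1), truncated to the first N terms.
u = [[1.0 / (n + 1) if i == n else 0.0 for i in range(N)] for n in range(N)]
lhs = norm_sq(partial_sum(u, N))          # || sum u_n ||^2
rhs = sum(norm_sq(v) for v in u)          # sum || u_n ||^2

# Non-orthogonal pair: the identity fails.
a, b = [1.0, 0.0], [1.0, 1.0]
lhs_bad = norm_sq(partial_sum([a, b], 2))  # ||(2, 1)||^2
rhs_bad = norm_sq(a) + norm_sq(b)          # ||a||^2 + ||b||^2
```

For the orthogonal family both sides agree and stay below $\pi^2 / 6$, the value of $\sum_{k \ge 1} 1/k^2$; for the pair $a, b$ the cross term $2\langle a, b \rangle$ is responsible for the mismatch.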

Formula [4.6] will be used extensively in Chapter 5. It is important to note that this formula does not generally hold if $(u_n)_{n \in \mathbb{N}}$ is not an orthogonal system of vectors, or if we consider the norm rather than its square.

The possibility of exchanging the limit with the inner product and norm operations is crucial for proving many of the theorems that we shall see later. This consideration emphasizes the importance of the compatibility of the linear and topological structures in an inner product space.

The result below is a first example of the usefulness of the continuity of the norm. In Chapter 1, we saw that the parallelogram law can be used to characterize the norms generated by an inner product, that is, Hilbertian norms. We now have all of the tools we need to formalize this affirmation, which Yosida (1995) refers to as the Fréchet-von Neumann-Jordan theorem.

THEOREM 4.3 (Fréchet-von Neumann-Jordan theorem).– Let $V$ be a vector space on $\mathbb{K}$ (of finite or infinite dimension) and let $\| \ \|$ be a norm on $V$. $\| \ \|$ is a Hilbertian norm if and only if it satisfies the parallelogram law.


If the norm satisfies the parallelogram law, then the inner product from which it is induced is necessarily determined by the polarization formulas for the real and complex cases, respectively:

$$\langle v, w \rangle = \frac{1}{4} \big( \|v + w\|^2 - \|v - w\|^2 \big)$$

$$\langle v, w \rangle = \frac{1}{4} \Big[ \|v + w\|^2 - \|v - w\|^2 + i \big( \|v + iw\|^2 - \|v - iw\|^2 \big) \Big]$$
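Both halves of the theorem can be explored numerically. The sketch below assumes the standard inner product on $\mathbb{C}^2$, linear in the first argument, and checks that the complex polarization formula recovers it from the Euclidean norm; it also evaluates the parallelogram law for the $\ell^1$ norm, a standard example of a non-Hilbertian norm. All function names and test vectors are illustrative choices.

```python
import math

def inner(v, w):
    """Inner product on C^n, linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(v, w))

def norm2(v):
    return math.sqrt(inner(v, v).real)

def norm1(v):
    return sum(abs(a) for a in v)

def combo(v, w, c):
    """Vector v + c * w."""
    return [a + c * b for a, b in zip(v, w)]

def polarization(v, w, norm):
    """Complex polarization formula applied to an arbitrary norm."""
    real = norm(combo(v, w, 1)) ** 2 - norm(combo(v, w, -1)) ** 2
    imag = norm(combo(v, w, 1j)) ** 2 - norm(combo(v, w, -1j)) ** 2
    return (real + 1j * imag) / 4

def parallelogram_defect(v, w, norm):
    """||v+w||^2 + ||v-w||^2 - 2||v||^2 - 2||w||^2 (zero for Hilbertian norms)."""
    return (norm(combo(v, w, 1)) ** 2 + norm(combo(v, w, -1)) ** 2
            - 2 * norm(v) ** 2 - 2 * norm(w) ** 2)

v = [1 + 2j, -1j]
w = [0.5 + 0j, 2 - 1j]
recovered = polarization(v, w, norm2)
expected = inner(v, w)
defect_l2 = parallelogram_defect(v, w, norm2)
defect_l1 = parallelogram_defect([1 + 0j, 0j], [0j, 1 + 0j], norm1)
```

The Euclidean norm gives a vanishing defect and `recovered` agrees with `expected` up to rounding, while the $\ell^1$ norm violates the parallelogram law on the two canonical basis vectors, so no inner product can induce it.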

PROOF.– The direct implication is obvious, so we only need to prove the reverse implication, that is, if a norm $\| \ \|$ satisfies the parallelogram law, then it is induced by an inner product in the canonical manner: $\| \cdot \| = \sqrt{\langle \cdot, \cdot \rangle}$.

Let us begin by considering the real case. If an inner product exists which induces the norm, then it must take the following form:

$$p(v, w) = \frac{1}{4} \big( \|v + w\|^2 - \|v - w\|^2 \big), \qquad \forall v, w \in V$$

Note that $p$ is a composition of algebraic functions (sum and difference), the norm and its squared power, all of which are continuous functions; $p$ itself is thus a continuous function of its arguments. The next step is to verify that this definition satisfies the defining properties of a real inner product.

First, we note that the symmetry $p(v, w) = p(w, v)$ is obvious, as is positive definiteness, given that $p(v, v) = \frac{1}{4} \|2v\|^2 = \|v\|^2 \ge 0$ and $p(v, v) = 0$ if and only if $v = 0_V$.

Second, we must verify bilinearity. Given that the symmetry condition is satisfied, any property of $p$ which is demonstrated with respect to the first argument also holds for the second argument, meaning that we can focus on the first entry of $p$. Using the parallelogram law, we can write $\forall v, w, z \in V$:

$$\|(v + z) + w\|^2 + \|(v - z) + w\|^2 = \|(v + w) + z\|^2 + \|(v + w) - z\|^2 = 2\|v + w\|^2 + 2\|z\|^2$$

and:

$$\|(v + z) - w\|^2 + \|(v - z) - w\|^2 = \|(v - w) + z\|^2 + \|(v - w) - z\|^2 = 2\|v - w\|^2 + 2\|z\|^2$$


thus:

$$\begin{aligned} p(v + z, w) + p(v - z, w) &= \frac{1}{4} \big( \|v + z + w\|^2 - \|v + z - w\|^2 + \|v - z + w\|^2 - \|v - z - w\|^2 \big) \\ &= \frac{1}{4} \big[ 2(\|v + w\|^2 + \|z\|^2) - 2(\|v - w\|^2 + \|z\|^2) \big] \\ &= 2 \cdot \frac{1}{4} \big( \|v + w\|^2 - \|v - w\|^2 \big) \\ &= 2p(v, w) \end{aligned} \qquad \text{[4.7]}$$

Taking $v = z$, and noting that $p(v - v, w) = p(0_V, w) = 0$, formula [4.7] gives $p(v + v, w) + p(v - v, w) = p(2v, w) = 2p(v, w)$ $\forall v, w \in V$, that is:

$$2p(v, w) = p(2v, w), \qquad \forall v, w \in V \qquad \text{[4.8]}$$

Now, take $v_1, v_2 \in V$ such that $v = \frac{1}{2}(v_1 + v_2)$ and $z = \frac{1}{2}(v_1 - v_2)$, thus $v + z = v_1$ and $v - z = v_2$, then:

$$p(v_1, w) + p(v_2, w) = p(v + z, w) + p(v - z, w) \underset{[4.7]}{=} 2p(v, w) \underset{[4.8]}{=} p(2v, w) \underset{(\text{def. } v)}{=} p(v_1 + v_2, w)$$

Since $v, w, z$ are arbitrary vectors, $v_1, v_2$ are also arbitrary; therefore, the demonstration that $p(v_1 + v_2, w) = p(v_1, w) + p(v_2, w)$ proves the additivity of $p$.

Now, let us prove the property of homogeneity. We start by observing that if the reasoning which gave us $p(2v, w) = 2p(v, w)$ is iterated $n \in \mathbb{N}$ times, we obtain $p(nv, w) = np(v, w)$.

Furthermore, for all $m \in \mathbb{N}$, $m \neq 0$, it not only holds that $p(v, w) = p(m \frac{v}{m}, w)$, but also $p(m(\frac{v}{m}), w) = mp(\frac{v}{m}, w)$; combined with the formula $p(nv, w) = np(v, w)$, this gives us $p(\frac{n}{m} v, w) = \frac{n}{m} p(v, w)$ $\forall n, m \in \mathbb{N}$, $m \neq 0$, that is, $p$ is homogeneous with respect to any number $r \in \mathbb{Q}$, $r \ge 0$: $p(rv, w) = rp(v, w)$.

In order to extend this homogeneity to all rational numbers, we use the argument that if $r < 0$, then, by rewriting $rv = -|r|v = |r|(-v)$, we obtain:
$$\begin{aligned} r\,p(v, w) - p(rv, w) &= r\,p(v, w) - p(|r|(-v), w) = r\,p(v, w) - |r|\,p(-v, w) \\ &= r\,p(v, w) + r\,p(-v, w) = r\left(p(v, w) + p(-v, w)\right) \\ &\underset{\text{(additivity)}}{=} r\,p(v - v, w) = r\,p(0_V, w) = 0 \end{aligned}$$

Hence, the property of homogeneity also holds for negative rational numbers, and thus for all rational numbers. Now, using the fact that Q is dense in R, we know that


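for all $\alpha \in \mathbb{R}$ there exists a sequence of rational numbers $(r_n)_{n \in \mathbb{N}} \subset \mathbb{Q}$ such that $r_n \underset{n \to +\infty}{\longrightarrow} \alpha$. By the continuity of $p$, we have:
$$\alpha\,p(v, w) = \lim_{n \to +\infty} r_n\,p(v, w) = p\left(\lim_{n \to +\infty} r_n v, w\right) = p(\alpha v, w), \quad \forall \alpha \in \mathbb{R},\ v, w \in V$$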
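The real-case construction can be probed numerically. The sketch below is only an illustration, not part of the proof: it assumes $V = \mathbb{R}^n$, and the helper names (`polarization`, `additivity_defect`) are hypothetical. It checks that $p$ built from the Euclidean norm reproduces the dot product and is additive, while the same formula applied to the 1-norm, which violates the parallelogram law, fails additivity.

```python
import math
import random

def norm2(v):
    """Euclidean norm on R^n (induced by the dot product)."""
    return math.sqrt(sum(x * x for x in v))

def norm1(v):
    """1-norm on R^n (does not satisfy the parallelogram law)."""
    return sum(abs(x) for x in v)

def polarization(norm, v, w):
    """p(v, w) = (||v + w||^2 - ||v - w||^2) / 4 for the given norm."""
    plus = [a + b for a, b in zip(v, w)]
    minus = [a - b for a, b in zip(v, w)]
    return (norm(plus) ** 2 - norm(minus) ** 2) / 4

def additivity_defect(norm, v1, v2, w):
    """|p(v1 + v2, w) - p(v1, w) - p(v2, w)|; zero when p is additive."""
    s = [a + b for a, b in zip(v1, v2)]
    return abs(polarization(norm, s, w)
               - polarization(norm, v1, w)
               - polarization(norm, v2, w))

random.seed(0)
v1, v2, w = ([random.uniform(-1, 1) for _ in range(4)] for _ in range(3))
dot = sum(a * b for a, b in zip(v1, w))

print(abs(polarization(norm2, v1, w) - dot))             # close to 0
print(additivity_defect(norm2, v1, v2, w))               # close to 0
print(additivity_defect(norm1, [1, 0], [0, 1], [1, 0]))  # nonzero
```

The last line shows why the parallelogram law cannot be dropped: for the 1-norm, the function $p$ is still symmetric and continuous, but it is no longer additive.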

In summary, $p$ is an inner product on $V$ which is compatible with its norm if $\mathbb{K} = \mathbb{R}$.

Now, let us consider the complex case: $\mathbb{K} = \mathbb{C}$. As we saw in the real case, if there is an inner product which induces the norm, it must take the following form:
$$\tilde{p}(v, w) = \tfrac{1}{4}\left[\|v+w\|^2 - \|v-w\|^2 + i\left(\|v+iw\|^2 - \|v-iw\|^2\right)\right] = p(v, w) + i\,p(v, iw), \quad \forall v, w \in V$$
From the observations presented in section 1.1, to prove that $\tilde{p}$ is a complex inner product, we must simply verify the Hermitian property, that is, $\tilde{p}(v, w) = \overline{\tilde{p}(w, v)}$, since the linearity in the first variable and the definite positiveness of $p$ imply that these properties also hold for $\tilde{p}$.

$\tilde{p}$ is a Hermitian form if and only if $\tilde{p}(w, v) = \overline{\tilde{p}(v, w)} = p(v, w) - i\,p(v, iw)$, since $p(v, w)$ and $p(v, iw) \in \mathbb{R}$. Furthermore, by the definition of $\tilde{p}$, $\tilde{p}(w, v) = p(w, v) + i\,p(w, iv) = p(v, w) + i\,p(w, iv)$, given that $p(v, w) = p(w, v)$. Comparing the formulas:
$$\tilde{p}(w, v) = \overline{\tilde{p}(v, w)} = p(v, w) - i\,p(v, iw) \quad \text{and} \quad \tilde{p}(w, v) = p(v, w) + i\,p(w, iv)$$
we see that $\tilde{p}$ is a Hermitian form if and only if $p(v, iw) = -p(w, iv)$ $\forall v, w \in V$. Now, we calculate:
$$\begin{aligned} p(v, iw) &= \tfrac{1}{4}\left(\|v+iw\|^2 - \|v-iw\|^2\right) = \tfrac{1}{4}\left(|i|^2\|v+iw\|^2 - |i|^2\|v-iw\|^2\right) \\ &= \tfrac{1}{4}\left(\|iv-w\|^2 - \|iv+w\|^2\right) = -\tfrac{1}{4}\left(\|w+iv\|^2 - \|w-iv\|^2\right) \\ &= -p(w, iv) \end{aligned}$$
using the fact that $\|w - iv\| = \|iv - w\|$. In short, $\tilde{p}$ is the inner product associated with our norm in the complex case. □

The mathematical object below is crucial in mathematics.


DEFINITION 4.7 (Topological vector space).– A topological vector space (T.V.S.) is a vector space $V$ with a topology which is compatible with the linear structure of $V$, that is, such that the linear operations of sum and scalar multiplication are continuous functions.

The continuity of fundamental operations in an inner product space implies that these spaces are always T.V.S. The same can be said of normed vector spaces; the continuity of linear operations is proved in exactly the same way. In terms of topological arguments, there is no difference between an inner product space and a normed vector space, as the norm is the mathematical object used to prove continuity in both cases. The major difference between an inner product space and a normed vector space is related to the underlying geometric structure of the space itself, which is much richer in the former case.

4.2.1. Equivalence of separated topologies in finite-dimension vector spaces

The dimension of the vector space played no part in the proofs of Theorem 4.2, so the considerations presented in the previous section hold true for any vector space, whether of finite or infinite dimension. In finite dimension, however, the (separated) topology of a T.V.S. can be guaranteed to be essentially unique.

THEOREM 4.4 (Tychonoff).– Let $V$ be a separated T.V.S. of finite dimension $n$ on the field $\mathbb{K}$. Given an arbitrary fixed basis $B = (b_1, \ldots, b_n)$ in $V$, the linear isomorphism defined by:
$$I : V \longrightarrow \mathbb{K}^n, \qquad x = [x]_B = \sum_{i=1}^{n} x_i b_i \longmapsto \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}$$
is a homeomorphism (or topological isomorphism), that is, a bicontinuous application (continuous, invertible, and of which the inverse is continuous), considering the usual Euclidean topology on $\mathbb{K}^n$.

As we have seen, all inner product, normed and metric vector spaces are separated T.V.S., so one immediate consequence of Tychonoff's theorem is that all inner products, norms and distances which can be defined on a finite-dimensional vector space are topologically equivalent, that is, they generate the same topology,


which, up to an isomorphism, is the Euclidean topology. This does not hold for infinite dimensions, as shown by a number of counter-examples.

The simplest example of topological independence with respect to the choice of a norm in finite dimension concerns vector spaces of dimension 1, as we see from the result below.

THEOREM 4.5.– If $V$ is a normed, one-dimensional vector space on the field $\mathbb{K}$, any two norms defined over $V$ are multiples of each other by a real, strictly positive scalar.

PROOF.– Let $\|\cdot\|_1$, $\|\cdot\|_2$ be two norms on $V$. By definition, $\|0_V\|_1 = \|0_V\|_2 = 0$; so we just concentrate on an arbitrary $v \in V$ different from the null vector. Let $\|v\|_1 = k_1$ and $\|v\|_2 = k_2$, then we can write $\frac{\|v\|_1}{\|v\|_2} = \frac{k_1}{k_2} = k \in \mathbb{R}^+$, and thus $\|v\|_1 = k\|v\|_2$. Since $V$ is of dimension 1, for any other vector $w \in V$ there exists $\lambda \in \mathbb{K}$ such that $w = \lambda v$. Thus, by the homogeneity of the norm, we can write:
$$\|w\|_1 = \|\lambda v\|_1 = |\lambda|\|v\|_1 = |\lambda| k \|v\|_2 = k\|\lambda v\|_2 = k\|w\|_2$$
that is, for any pair of norms $\|\cdot\|_1$, $\|\cdot\|_2$ on $V$, there exists a constant $k \in \mathbb{R}^+$ such that $\|w\|_1 = k\|w\|_2$ for all $w \in V$. □

4.3. Cauchy sequences and completeness: Banach and Hilbert

Mathematicians working in the late 19th and early 20th centuries showed that the infinite-dimensional metric, normed and inner product vector spaces which are most "similar" to finite-dimensional Euclidean spaces can be characterized using a relatively simple property: converging sequences can be identified with Cauchy sequences.

DEFINITION 4.8.– Given a generic metric space $(X, d)$, a sequence $(x_n)_{n \in \mathbb{N}}$ is a Cauchy sequence if:
$$\forall \varepsilon > 0 \ \exists N_\varepsilon > 0 : n, m \geq N_\varepsilon \implies d(x_n, x_m) < \varepsilon$$
that is, the elements in the sequence become arbitrarily close to each other as the sequence progresses. $(X, d)$ is said to be a complete metric space if all Cauchy sequences converge to limits contained within $X$. We shall see many examples of complete metric spaces in this chapter.
Simple examples of non-complete metric spaces can be built by using the following basic result concerning Cauchy sequences.


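THEOREM 4.6.– Any convergent sequence in a metric space is necessarily a Cauchy sequence.

PROOF.– If $x_n \underset{n \to +\infty}{\longrightarrow} \bar{x}$, then, by the arbitrary nature of $\varepsilon$ and the triangular inequality:
$$\forall \varepsilon > 0 \ \exists N_\varepsilon > 0 : n, m \geq N_\varepsilon \implies d(x_n, x_m) \leq d(x_n, \bar{x}) + d(\bar{x}, x_m) < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon$$
□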

Using this result, we can prove that the metric spaces³ $(\mathbb{Q}, |\cdot|)$ and $((0, 1), |\cdot|)$ are not complete.

To verify that $(\mathbb{Q}, |\cdot|)$ is not complete, consider the sequence $\left(\left(1 + \frac{1}{n}\right)^n\right)_{n \geq 1}$: this sequence is rational, since $\mathbb{Q}$ is stable with respect to sums, quotients and integer powers and to their composition. Furthermore, the sequence is known to converge to $e$, the base of natural logarithms, so, by Theorem 4.6, it is a Cauchy sequence in $\mathbb{Q}$, interpreted as a subset of $\mathbb{R}$. However, $e$ is an irrational number, that is, $e \in \mathbb{R} \setminus \mathbb{Q}$, implying the existence of at least one Cauchy sequence in $\mathbb{Q}$ which converges outside of $\mathbb{Q}$ itself.

Similarly, in $((0, 1), |\cdot|)$, consider the sequence $\left(\frac{1}{n}\right)_{n \geq 2}$; this is evidently contained within $(0, 1)$ and converges to 0, making it a Cauchy sequence in $(0, 1) \subset \mathbb{R}$, but $0 \notin (0, 1)$.

³ Remember that $\mathbb{Q}$ is not a real or complex vector space, as it is not stable with regard to its product by a real or complex scalar; thus, Tychonoff's theorem cannot be applied to $\mathbb{Q}$.

Now, let us consider the relationship between complete and closed metric spaces.

THEOREM 4.7.– If $(X, d)$ is a complete metric space and $(E, d)$, $E \subseteq X$, a closed metric subspace in $X$, then $(E, d)$ is complete.

PROOF.– Let $(x_n)_{n \in \mathbb{N}} \subseteq E$ be a Cauchy sequence. Since $E \subseteq X$, we have that $(x_n)_{n \in \mathbb{N}} \subseteq X$, and thus, since $X$ is complete, $(x_n)_{n \in \mathbb{N}}$ converges to a limit $x \in X$. However, the limits of sequences in $E$ belong to the closure $\overline{E}$, and, since $E$ is closed, $\overline{E} = E$, hence $x \in E$, that is, all Cauchy sequences of elements of $E$ converge in $E$ itself. □

THEOREM 4.8.– If $(X, d)$ is an arbitrary metric space and $(E, d)$, $E \subseteq X$, is a complete metric subspace of $X$, then $(E, d)$ is closed.

PROOF.– Taking $x \in \overline{E}$, there exists a sequence $(x_n)_{n \in \mathbb{N}} \subseteq E$ which converges to $x$. Given that the sequence converges, it is a Cauchy sequence in $E$. As $E$ is complete,


$(x_n)_{n \in \mathbb{N}}$ must converge to an element $y \in E$. By uniqueness of the limit, $x = y$ and thus $x \in E$, that is, $\overline{E} = E$. □

An inner product vector space, or a normed vector space, is also a metric vector space; consequently, the definition of a Cauchy sequence can be rewritten as:
$$\forall \varepsilon > 0 \ \exists N_\varepsilon > 0 : n, m \geq N_\varepsilon \implies \|x_n - x_m\| < \varepsilon$$
Some authors use an even shorter form:
$$\lim_{n, m \to +\infty} \|x_n - x_m\| = 0$$
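The Cauchy condition can be observed on the rational sequence $\left(\left(1 + \frac{1}{n}\right)^n\right)_{n \geq 1}$ used above to show that $\mathbb{Q}$ is not complete. The following sketch is an illustration under stated assumptions: it uses exact rational arithmetic from Python's `fractions` module, and the helper `spread` (a hypothetical name) only inspects a finite window of indices, so it is a proxy for the tail condition rather than a proof of it.

```python
from fractions import Fraction
import math

def x(n):
    """n-th term (1 + 1/n)^n, computed exactly as a rational number."""
    return (1 + Fraction(1, n)) ** n

def spread(start, stop):
    """max |x_n - x_m| over start <= n, m <= stop (finite proxy for the tail)."""
    terms = [x(n) for n in range(start, stop + 1)]
    return max(terms) - min(terms)

for start in (1, 10, 100):
    # Every term is an exact Fraction, yet the gap to the irrational limit e
    # shrinks together with the spread of the tail.
    print(start, float(spread(start, 200)), math.e - float(x(start)))
```

Every term stays in $\mathbb{Q}$, and the spread of the tail shrinks as the starting index grows, while the only candidate limit, $e$, lies outside $\mathbb{Q}$.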

A standard result of Calculus guarantees that $(\mathbb{R}^n, |\cdot|)$ and $(\mathbb{C}^n, |\cdot|)$ are complete metric spaces for all finite $n \in \mathbb{N}$. Using Tychonoff's theorem (Theorem 4.4), we know that real or complex separated topological vector spaces of finite dimension $n$ are topologically equivalent to the Euclidean spaces $\mathbb{R}^n$ or $\mathbb{C}^n$, respectively; it follows that completeness is never a problem for pre-Hilbert vector spaces (or normed spaces) of finite dimension: converging sequences in these spaces are all, and only, Cauchy sequences.

If the dimension of the vector space is not finite, then while it remains true that convergent sequences are necessarily Cauchy sequences, the converse is not always true. For this reason, we shall introduce a definition to characterize spaces in which the Cauchy condition is necessary and sufficient for convergence⁴.

DEFINITION 4.9 (Hilbert and Banach spaces).– Let $V$ be a vector space of finite or infinite dimension.
– If $(V, \|\cdot\|)$ is complete, then it is called a Banach space.
– If $(V, \langle \cdot, \cdot \rangle)$ is complete, then it is called a Hilbert space.

One consequence of Tychonoff's theorem is that real or complex normed vector spaces of finite dimension are all Banach spaces, while real or complex inner product spaces of finite dimension are all Hilbert spaces. Finite or infinite-dimension Hilbert spaces are also Banach spaces, due to the fact that they are normed and complete vector spaces; the converse is not generally true, as the existence of an inner product in a Banach space is guaranteed if and only if the parallelogram law holds.

Two results related to Cauchy sequences are presented below. These will be extremely useful in what follows. Before proving them, we recall that a sequence in a metric space is said to be bounded if all elements of the sequence fall within a finite neighborhood of one element of the space, as described in Definition 4.10.

⁴ One can also define Fréchet spaces: locally convex topological vector spaces whose topology is induced by a complete translation-invariant metric.


DEFINITION 4.10.– A sequence $(x_n)_{n \in \mathbb{N}}$ in a metric space $(X, d)$ is said to be bounded if there exist $x^* \in X$ and $M \geq 0$ such that $d(x_n, x^*) \leq M$ $\forall n \in \mathbb{N}$.

THEOREM 4.9.– All Cauchy sequences are bounded.

PROOF.– By definition, if $(x_n)_{n \in \mathbb{N}}$ is a Cauchy sequence, then for any fixed $\varepsilon > 0$ there exists $N_\varepsilon > 0$ such that the distance between $x_{N_\varepsilon}$ and all elements $x_n$ of the sequence with $n \geq N_\varepsilon$ is less than $\varepsilon$, that is, $d(x_n, x_{N_\varepsilon}) < \varepsilon$ $\forall n \geq N_\varepsilon$. $x_{N_\varepsilon}$ is thus a good candidate to take the place of $x^*$ in the definition of a bounded sequence. To prove this, we note that the elements of the sequence corresponding to an index value $n$ lower than $N_\varepsilon$ are finite in number, thus their distances from $x_{N_\varepsilon}$ admit a finite maximum, and we can define the following value:
$$r = \max\{d(x_{N_\varepsilon}, x_0), d(x_{N_\varepsilon}, x_1), \ldots, d(x_{N_\varepsilon}, x_{N_\varepsilon - 1})\}$$
Now, defining $M = \max\{\varepsilon, r\}$, we obtain $d(x_n, x_{N_\varepsilon}) \leq M$ $\forall n \in \mathbb{N}$. □

The second result relates to subsequences.

DEFINITION 4.11.– Let $(x_n)_{n \in \mathbb{N}}$ be a sequence in a metric space $(X, d)$ and let $\varphi : \mathbb{N} \to \mathbb{N}$ be a strictly increasing function, that is, $\varphi(n + 1) > \varphi(n)$ for all $n \in \mathbb{N}$. The sequence defined by $(x_{\varphi(n)})_{n \in \mathbb{N}}$ is a subsequence of the initial sequence $(x_n)_{n \in \mathbb{N}}$.

As a very simple exercise, readers are invited to prove that, if a sequence $(x_n)_{n \in \mathbb{N}}$ in a metric space $(X, d)$ is convergent, then all of its subsequences also converge, and converge to the same limit. The following important result shows that, for Cauchy sequences, the order of this implication can be reversed.

THEOREM 4.10.– Any Cauchy sequence in a metric space $(X, d)$ which possesses at least one convergent subsequence is itself convergent to the same limit.

PROOF.– Let $(x_n)_{n \in \mathbb{N}}$ be a Cauchy sequence in $(X, d)$ which admits a convergent subsequence $(x_{\varphi(n)})_{n \in \mathbb{N}}$, where $\varphi : \mathbb{N} \to \mathbb{N}$ is the strictly increasing application which defines this subsequence. Let $a$ be the limit of the subsequence, that is, $a = \lim_{n \to +\infty} x_{\varphi(n)}$.

For all $n \in \mathbb{N}$, by the triangular inequality, we have $d(x_n, a) \leq d(x_n, x_{\varphi(n)}) + d(x_{\varphi(n)}, a)$; if we can bound both terms on the right by an arbitrarily small quantity, then the thesis of the theorem will be proven.


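To show that this is possible, we shall use the definition of a Cauchy sequence for $(x_n)_{n \in \mathbb{N}}$ to write:
$$\forall \varepsilon > 0 \ \exists N_\varepsilon \in \mathbb{N} \text{ such that } m, n \geq N_\varepsilon \implies d(x_m, x_n) < \frac{\varepsilon}{2}$$
but, as $\varphi$ is strictly increasing, $\varphi(n) \geq n \geq N_\varepsilon$, hence $d(x_n, x_{\varphi(n)}) < \frac{\varepsilon}{2}$. Since the subsequence $(x_{\varphi(n)})_{n \in \mathbb{N}}$ is presumed to converge to $a$, this implies that:
$$\forall \varepsilon > 0 \ \exists K_\varepsilon \in \mathbb{N} \text{ such that } n \geq K_\varepsilon \implies d(x_{\varphi(n)}, a) < \frac{\varepsilon}{2}$$
and, by considering $n \geq \max\{N_\varepsilon, K_\varepsilon\}$, we obtain:
$$d(x_n, a) \leq d(x_n, x_{\varphi(n)}) + d(x_{\varphi(n)}, a) < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon \quad \forall \varepsilon > 0$$
that is, $x_n \underset{n \to +\infty}{\longrightarrow} a$. □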

This theorem has notable applications in pure and applied mathematics. We shall see a theoretical use in the next section; here we mention its usefulness in optimization, where one seeks to identify the optimal solution to a problem by minimizing an appropriate function. In many cases, the function is too complicated for an analytical description of its minima to be possible, so the solution must be approximated using an iterative algorithm: in this way, a minimum point is attained after passing through a sequence of points. Theorem 4.10 is often used to demonstrate that the iterative algorithm converges, by proving that the sequence of points defined by the algorithm is a Cauchy sequence and that it admits a (wisely chosen) converging subsequence.

4.3.1. Completeness of vector spaces

Metric vector spaces can always be completed in an essentially unique way to complete spaces, as Theorem 4.11 establishes.

THEOREM 4.11 (Completion of a non-complete metric vector space).– If $(V, d)$ is a non-complete metric vector space, then there exists a complete metric vector space $(\hat{V}, \hat{d})$ and an isometric injective function $\iota : V \longrightarrow \hat{V}$, that is:
$$\begin{cases} x_1, x_2 \in V,\ x_1 \neq x_2 \implies \iota(x_1) \neq \iota(x_2) \\ \forall x_1, x_2 \in V,\ d(x_1, x_2) = \hat{d}(\iota(x_1), \iota(x_2)) \end{cases}$$
such that $\overline{\iota(V)} = \hat{V}$, that is, the image of $V$ via $\iota$ is dense in $\hat{V}$.

COROLLARY 4.1.– Any pre-Hilbert space $V$ can be completed to a Hilbert space $H$.


PROOF.– This proof will focus on the case of pre-Hilbert spaces, which is most relevant for our purposes. The general proof follows a similar approach, except for the fact that the norm of the difference between two vectors is replaced by their distance.

The completion of a pre-Hilbert space $V$ is built from the space $H_1$ of all the Cauchy sequences $(x_n)_{n \in \mathbb{N}}$ of elements in $V$, modulo the equivalence relation $\sim$ defined as follows: two Cauchy sequences $(x_n)_{n \in \mathbb{N}}$ and $(y_n)_{n \in \mathbb{N}}$ of elements in $V$ are equivalent if $\lim_{n \to +\infty} \|x_n - y_n\| = 0$.

The completion of $V$ is written as $H = H_1 / \sim$, and its elements are denoted $[x]$. We define a norm on $H$ as follows:
$$\forall [x] \in H, \quad \|[x]\| = \lim_{n \to +\infty} \|x_n\|$$
where $(x_n)_{n \in \mathbb{N}}$ is any Cauchy sequence in the equivalence class $[x]$. This definition does not depend on the choice of the Cauchy sequence used to represent the equivalence class, since, given that $\big|\|x_n\| - \|y_n\|\big| \leq \|x_n - y_n\|$, if $(y_n)_{n \in \mathbb{N}} \in [x]$, then at the limit we have:
$$\lim_{n \to +\infty} \big|\|x_n\| - \|y_n\|\big| \leq \lim_{n \to +\infty} \|x_n - y_n\| = 0$$
that is: $\lim_{n \to +\infty} \|x_n\| = \lim_{n \to +\infty} \|y_n\|$.

Now, let us define an inner product on $H$ which is compatible with this norm:
$$\langle [x], [y] \rangle = \lim_{n \to +\infty} \langle x_n, y_n \rangle$$
where $(x_n)_{n \in \mathbb{N}}$ and $(y_n)_{n \in \mathbb{N}}$ are any two Cauchy sequences in the equivalence classes $[x]$ and $[y]$, respectively. To verify that this inner product is well defined, we must verify the existence of the limit used to define it, and show that it does not depend on the chosen representative elements.

The first step is to prove the existence of the limit. To do this, we must simply show that $\langle x_n, y_n \rangle$ (a sequence in $\mathbb{K}$) is Cauchy; given that $\mathbb{K}$ is complete, the limit must exist. Note that $\forall n, m \in \mathbb{N}$, by the triangular inequality and the Cauchy-Schwarz inequality, we can write:
$$\begin{aligned} |\langle x_n, y_n \rangle - \langle x_m, y_m \rangle| &= |\langle x_n, y_n \rangle - \langle x_n, y_m \rangle + \langle x_n, y_m \rangle - \langle x_m, y_m \rangle| \\ &= |\langle x_n, y_n - y_m \rangle + \langle x_n - x_m, y_m \rangle| \\ &\leq |\langle x_n, y_n - y_m \rangle| + |\langle x_n - x_m, y_m \rangle| \\ &\leq \|x_n\| \|y_n - y_m\| + \|x_n - x_m\| \|y_m\| \underset{n, m \to +\infty}{\longrightarrow} 0 \end{aligned}$$


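since $\|x_n\|$ and $\|y_m\|$ are bounded, given that $(x_n)_{n \in \mathbb{N}}$ and $(y_n)_{n \in \mathbb{N}}$ are Cauchy sequences.

Now, we must verify that the limit is independent of the choice of representative elements: let $(\xi_n)_{n \in \mathbb{N}}$ and $(\eta_n)_{n \in \mathbb{N}}$ be two other representatives of the equivalence classes $[x]$ and $[y]$, respectively. Using direct algebraic manipulations, we can write:
$$\langle x_n, y_n \rangle = \langle x_n - \xi_n + \xi_n, y_n - \eta_n + \eta_n \rangle = \langle x_n - \xi_n, y_n \rangle + \langle \xi_n, y_n - \eta_n \rangle + \langle \xi_n, \eta_n \rangle$$
and:
$$|\langle x_n - \xi_n, y_n \rangle + \langle \xi_n, y_n - \eta_n \rangle| \leq \|x_n - \xi_n\| \|y_n\| + \|\xi_n\| \|y_n - \eta_n\| \underset{n \to +\infty}{\longrightarrow} 0$$
since $(x_n)_{n \in \mathbb{N}}, (\xi_n)_{n \in \mathbb{N}} \in [x]$ and $(y_n)_{n \in \mathbb{N}}, (\eta_n)_{n \in \mathbb{N}} \in [y]$, hence:
$$\langle [x], [y] \rangle = \lim_{n \to +\infty} \langle x_n, y_n \rangle = \lim_{n \to +\infty} \langle \xi_n, \eta_n \rangle$$
Due to the continuity of the inner product on $V$, all of the defining properties of an inner product are transferred onto $H$ by the limit operation. The final step is to verify the isometry:
$$\|[x]\| = \lim_{n \to +\infty} \|x_n\| = \lim_{n \to +\infty} \sqrt{\langle x_n, x_n \rangle} = \sqrt{\langle [x], [x] \rangle}, \quad \forall [x] \in H$$
□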

An alternative proof may be found in El Hage Hassan (2011).

4.3.2. Characterizing the completeness of normed vector spaces using series

In this section, we shall consider a completeness criterion for normed vector spaces which draws on series and is highly useful in practice.

The explicit definition of the Cauchy condition for the sequence of partial sums of a series $\sum_{n \in \mathbb{N}} x_n$ is:
$$\forall \varepsilon > 0 \ \exists N_\varepsilon > 0 : n, m \geq N_\varepsilon \implies \left\| \sum_{k=0}^{n} x_k - \sum_{k=0}^{m} x_k \right\| < \varepsilon$$

The two indices $n$ and $m$ vary independently of one another, and we can suppose, without loss of generality, that one is always greater than the other. For instance, supposing that $n > m$: $\sum_{k=0}^{n} x_k - \sum_{k=0}^{m} x_k = \sum_{k=m+1}^{n} x_k$, implying that the Cauchy condition for series can be rewritten as:
$$\forall \varepsilon > 0 \ \exists N_\varepsilon > 0 : n > m \geq N_\varepsilon \implies \left\| \sum_{k=m+1}^{n} x_k \right\| < \varepsilon \tag*{[4.9]}$$

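Instead, the Cauchy condition for the series of norms of $x_k$, that is, for the sequence $\left( \sum_{k=0}^{n} \|x_k\| \right)_{n \in \mathbb{N}}$, is:
$$\forall \varepsilon > 0 \ \exists N_\varepsilon > 0 : n > m \geq N_\varepsilon \implies \sum_{k=m+1}^{n} \|x_k\| < \varepsilon \tag*{[4.10]}$$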

This observation will be used in proving the following result.

THEOREM 4.12 (Characterizing the completeness of normed spaces using series).– A normed vector space $(V, \|\cdot\|)$ is complete if and only if all absolutely convergent series of elements in $V$ are also (simply) convergent in $V$.

PROOF.– The proof of the direct implication is extremely simple, while that of the converse is much more complicated and involves techniques that are very commonly used in functional analysis.

$\Longrightarrow$: Let us suppose $(V, \|\cdot\|)$ to be complete, and let us demonstrate that if $\sum_{n \in \mathbb{N}} \|x_n\|$ is convergent, then $\sum_{n \in \mathbb{N}} x_n$ is also convergent in $V$.

By the completeness of $\mathbb{R}$, the convergence of $\sum_{n \in \mathbb{N}} \|x_n\|$ is equivalent to the Cauchy condition [4.10], that is:
$$\forall \varepsilon > 0 \ \exists N_\varepsilon > 0 : n > m \geq N_\varepsilon \implies \sum_{k=m+1}^{n} \|x_k\| < \varepsilon \tag*{[4.11]}$$
and since $\left\| \sum_{k=m+1}^{n} x_k \right\| \leq \sum_{k=m+1}^{n} \|x_k\|$, the sequence of partial sums $\left( S_n = \sum_{k=0}^{n} x_k \right)_{n \in \mathbb{N}}$ is also Cauchy by [4.9]; since $V$ is complete, $\sum_{n \in \mathbb{N}} x_n$ is convergent.

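$\Longleftarrow$: Now, let us suppose that all absolutely convergent series of elements in $V$ are also simply convergent in $V$. We must prove that this implies that $V$ is complete, that is, that any Cauchy sequence $(x_n)_{n \in \mathbb{N}} \subset V$, that is, any sequence such that:
$$\forall \varepsilon > 0 \ \exists N_\varepsilon > 0 : n, m \geq N_\varepsilon \implies \|x_n - x_m\| < \varepsilon$$
converges in $V$, that is, there exists $\bar{x} \in V$ such that $x_n \underset{n \to +\infty}{\longrightarrow} \bar{x}$.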


The Cauchy condition must be valid for all values of $\varepsilon > 0$, and consequently for $\varepsilon_k = \frac{1}{2^k}$, $k \in \mathbb{N}$; thus, any Cauchy sequence in $V$ must verify:
$$\forall k \geq 0 \ \exists \tilde{N}_k > 0 : n, m \geq \tilde{N}_k \implies \|x_n - x_m\| < \frac{1}{2^k} \tag*{[4.12]}$$
note that all the objects contained in the expression above are discrete.

Since $\frac{1}{2^{k+1}} < \frac{1}{2^k}$, we can always choose $\tilde{N}_{k+1} \geq \tilde{N}_k$; this simple consideration allows us to define a strictly increasing sequence of natural numbers $(N_k)_{k \geq 0}$ simply by defining $N_k := \inf_{\ell \in \mathbb{N}} \{\tilde{N}_{k+\ell} > \tilde{N}_k\}$ for all $k \in \mathbb{N}$. Using this result, we can define the subsequence $(x_{N_k})_{k \in \mathbb{N}} \subset V$ of $(x_n)_{n \in \mathbb{N}}$ which, by its own definition, satisfies [4.12], that is:
$$\forall k \geq 0, \quad \|x_{N_k} - x_{N_{k+1}}\| < \frac{1}{2^k} \tag*{[4.13]}$$

The interest of using this subsequence is that, if it converges in $V$, that is, if there exists $\bar{x} \in V$ such that $\lim_{k \to +\infty} x_{N_k} = \bar{x}$, then by Theorem 4.10 the initial Cauchy sequence $(x_n)_{n \in \mathbb{N}}$ also converges to $\bar{x} \in V$.

To complete the proof, we must therefore demonstrate that the subsequence $(x_{N_k})_{k \in \mathbb{N}}$ is convergent in $V$. In the absence of information concerning the convergence of the original sequence $(x_n)_{n \in \mathbb{N}}$, the convergence of $(x_{N_k})_{k \in \mathbb{N}}$ cannot be proved directly; instead, we must use the hypothesis that absolute convergence of a series in $V$ implies its simple convergence in $V$.

The link to series is obtained using a startlingly simple technique: rewriting the subsequence $(x_{N_k})_{k \in \mathbb{N}}$ as a sequence of telescopic partial sums. To do this, we use $(x_{N_k})_{k \in \mathbb{N}}$ to define a new sequence $(y_k)_{k \in \mathbb{N}} \subset V$ as follows:
$$\begin{cases} y_0 = x_{N_0} \\ y_k = x_{N_k} - x_{N_{k-1}}, \quad \forall k \geq 1 \end{cases} \quad \text{hence: } (y_k)_{k \in \mathbb{N}} = (x_{N_0},\ x_{N_1} - x_{N_0},\ x_{N_2} - x_{N_1},\ \ldots)$$
then:
$$\sum_{j=0}^{k} y_j = \underbrace{x_{N_0}}_{y_0} + \underbrace{x_{N_1} - x_{N_0}}_{y_1} + \underbrace{x_{N_2} - x_{N_1}}_{y_2} + \ldots + \underbrace{x_{N_k} - x_{N_{k-1}}}_{y_k} = x_{N_k}$$
and this holds $\forall k \in \mathbb{N}$, thus $\left( \sum_{j=0}^{k} y_j \right)_{k \in \mathbb{N}} = (x_{N_k})_{k \in \mathbb{N}}$.

To summarize, the completeness of $V$, that is, the convergence of an arbitrary Cauchy sequence $(x_n)_{n \in \mathbb{N}}$ in $V$, is implied by the convergence of the sequence $(x_{N_k})_{k \in \mathbb{N}}$ in $V$; this is equivalent to the convergence of $\left( \sum_{j=0}^{k} y_j \right)_{k \in \mathbb{N}}$ in $V$, that is, the simple convergence of the series $\sum_{k=0}^{\infty} y_k$. By the starting hypothesis, if we can prove that $\sum_{k \in \mathbb{N}} \|y_k\|$ is convergent, this would be enough to prove the whole theorem. We begin by setting out the terms of the series $\sum_{k \in \mathbb{N}} \|y_k\|$:
$$\sum_{k \in \mathbb{N}} \|y_k\| = \|y_0\| + \sum_{k=1}^{\infty} \|y_k\| = \|y_0\| + \sum_{k=1}^{\infty} \|x_{N_k} - x_{N_{k-1}}\| = \|y_0\| + \sum_{k=0}^{\infty} \|x_{N_{k+1}} - x_{N_k}\|$$
From inequality [4.13], it holds that $\|x_{N_{k+1}} - x_{N_k}\| < \frac{1}{2^k} = \left(\frac{1}{2}\right)^k$ $\forall k \geq 0$, thus:
$$\sum_{k \in \mathbb{N}} \|y_k\| = \|y_0\| + \sum_{k=0}^{\infty} \|x_{N_{k+1}} - x_{N_k}\| < \|y_0\| + \sum_{k=0}^{\infty} \left(\frac{1}{2}\right)^k = \|y_0\| + \frac{1}{1 - \frac{1}{2}} = \|y_0\| + 2 < +\infty$$
using the geometric series formula.

Hence, $\sum_{k \in \mathbb{N}} \|y_k\|$ is a bounded series of non-negative real terms, and, from a classic result in series theory, we know that it converges. □

If the space $(V, \|\cdot\|)$ in the previous theorem is complete, then the Cauchy sequence $(x_n)_{n \in \mathbb{N}} \subset V$ seen at the start of the proof is convergent; consequently, we know that the subsequence $(x_{N_k})_{k \in \mathbb{N}}$ also converges to the same limit. This remark is formalized in Corollary 4.2.

COROLLARY 4.2.– Taking $(V, \|\cdot\|)$ to be a complete normed vector space and $(x_n)_{n \in \mathbb{N}} \subset V$ a sequence which converges to $x_0 \in V$, there exists a subsequence $(x_{n_k})_{k \in \mathbb{N}}$ which converges to $x_0$.

4.3.2.1. The matrix exponential

In this section, we shall examine a particularly important application of the previous theorem: the definition of the matrix exponential.


DEFINITION 4.12 (Matrix exponential).– Let $A \in M(n, \mathbb{K})$ be a square matrix⁵ with coefficients in the field $\mathbb{K} = \mathbb{R}$ or $\mathbb{C}$. The exponential of $A$ is the matrix defined by:
$$e^A = \sum_{k=0}^{\infty} \frac{A^k}{k!}$$
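Before turning to the proof that this series converges, a truncated version of it can be computed directly. The sketch below is a plain-Python illustration, not a production algorithm: the function names (`matmul`, `frobenius`, `exp_series`) and the truncation at 30 terms are assumptions made for this example. It also accumulates the Frobenius norms of the terms $A^k/k!$ for $k \geq 1$, so that the bound used in the absolute convergence argument can be compared with $e^{\|A\|} - 1$.

```python
import math

def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def frobenius(a):
    """Frobenius norm: Euclidean norm of the entries of the matrix."""
    return math.sqrt(sum(abs(x) ** 2 for row in a for x in row))

def exp_series(a, terms=30):
    """Partial sum of sum_k A^k / k! and the sum of ||A^k / k!|| for k >= 1."""
    n = len(a)
    term = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # A^0 / 0!
    total = [row[:] for row in term]
    norm_sum = 0.0
    for k in range(1, terms):
        # A^k / k! is obtained from A^(k-1) / (k-1)! by one product and one division.
        term = [[x / k for x in row] for row in matmul(term, a)]
        total = [[s + t for s, t in zip(rs, rt)] for rs, rt in zip(total, term)]
        norm_sum += frobenius(term)
    return total, norm_sum

# Generator of plane rotations: exp(A) should be the rotation of angle 1 radian.
A = [[0.0, -1.0], [1.0, 0.0]]
total, norm_sum = exp_series(A)
print(total)                                        # approx. [[cos 1, -sin 1], [sin 1, cos 1]]
print(norm_sum, math.exp(frobenius(A)) - 1)         # first value stays below the second
```

For serious numerical work a dedicated routine is preferable to a truncated power series; the point here is only to see the partial sums and the norm bound behave as the theory predicts.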

The proof that $e^A$ is well defined is straightforward using the theorem proved above. For instance, let us consider the Frobenius norm of $A$:
$$\|A\| = \left( \sum_{i=1}^{n} \sum_{j=1}^{n} |a_{ij}|^2 \right)^{1/2}$$
This is the Euclidean norm of the vector in $\mathbb{K}^{n^2}$ obtained from $A$ using lexicographical order, that is, by sequencing the rows (or columns) of $A$ one after another. We shall prove that the series $I_n + A + \frac{A^2}{2} + \frac{A^3}{3!} + \ldots + \frac{A^k}{k!} + \ldots$ converges in the topology of $M(n, \mathbb{K})$ generated by this norm, implying, by Tychonoff's theorem, its convergence with respect to any other norm.

$M(n, \mathbb{K})$ is homeomorphic to the Euclidean $\mathbb{K}^{n^2}$, which we know to be a complete normed space. To show that $e^A$ is well defined, we must show that the series defining $e^A$ is absolutely convergent; simple convergence is then implied by Theorem 4.12. The proof of absolute convergence is extremely simple: consider the inequality $\|A^k\| \leq \|A\|^k$, verified by the Frobenius norm for all $k \geq 1$ and for any matrix $A \in M(n, \mathbb{K})$ (the term $k = 0$ only contributes the finite constant $\|I_n\| = \sqrt{n}$), then:
$$\sum_{k=1}^{\infty} \frac{\|A^k\|}{k!} \leq \sum_{k=1}^{\infty} \frac{\|A\|^k}{k!} = e^{\|A\|} - 1$$

using the fact that $\|A\|$ is a real number $\geq 0$ and that the convergence radius of the exponential series in $\mathbb{R}$ is infinite.

⁵ $A$ must be square as we will be working with powers of $A$; for dimensional reasons, these are not defined if $A$ is not square.

4.3.3. Banach fixed-point theorem

The result presented in this section is highly significant for many different fields of mathematics, such as analysis, topology, solving differential equations, etc.


We begin with the following definition.

DEFINITION 4.13 (Contraction mapping).– Let $(X_1, d_1)$ and $(X_2, d_2)$ be any two metric spaces and let $k \in (0, 1)$ be a real constant. The application $f : X_1 \to X_2$ is a contraction with coefficient $k$ if, for all $x, y \in X_1$:
$$d_2(f(x), f(y)) \leq k\,d_1(x, y) \tag*{[4.14]}$$

The smallest value of $k$ for which [4.14] holds is called the Lipschitz constant of $f$.

Verification that a contraction mapping is always a continuous function is immediate: for any value $\varepsilon > 0$, let us take an arbitrary fixed element $\bar{x} \in X_1$ and consider the elements $y \in X_1$ such that $d_1(\bar{x}, y) < \varepsilon$. Then, by the definition of a contraction mapping, $d_2(f(\bar{x}), f(y)) \leq k\,d_1(\bar{x}, y) < k\varepsilon < \varepsilon$, since $k \in (0, 1)$, so the function $f$ is continuous in $\bar{x}$. As $\bar{x}$ is an arbitrary element in $X_1$, $f$ is continuous on all of $X_1$.

REMARK.– It is evident from the definition that the distance (in the codomain) between the images of a pair of elements via a contraction mapping is smaller than the initial distance. However, a contraction mapping cannot be redefined using this property alone; the definition given above is not the same as stating that, for all $x, y \in X_1$, $x \neq y$, $d_2(f(x), f(y)) < d_1(x, y)$. If $f$ satisfies this condition, it is said to be a weak contraction mapping, or an application which reduces the distance between points.

To understand the subtle difference between these two definitions, we begin by noting that if $f$ is a weak contraction mapping, then for any pair $x, y \in X_1$, $x \neq y$, there exists $k_{x,y} \in (0, 1)$ such that $d_2(f(x), f(y)) \leq k_{x,y}\,d_1(x, y)$, that is, $k_{x,y}$ is not a constant, as required by the definition of a contraction mapping. The two definitions coincide if and only if $\sup_{x, y \in X_1} k_{x,y} \equiv \bar{k} \in (0, 1)$, but this condition is not guaranteed to be verified. The sup necessarily exists, since $\{k_{x,y},\ x, y \in X_1,\ x \neq y\}$ is a bounded subset of $\mathbb{R}$, but it can take the value 1, meaning that it is not strictly less than 1 as required by the definition of a contraction mapping.

Contraction mappings with a domain and image in the same complete metric space have a remarkable property, described in the classic Theorem 4.13.

THEOREM 4.13 (Banach fixed-point theorem).– Let $(X, d)$ be a complete metric space and $f : X \to X$ a contraction mapping in $X$ of coefficient $k \in (0, 1)$: then $f$ admits a single fixed point, that is, there exists a single $\bar{x} \in X$ such that $f(\bar{x}) = \bar{x}$.


PROOF.– Let $a \in X$ be an arbitrary element. We define the sequence $(x_n)_{n \in \mathbb{N}} \subset X$ by recursion as:
$$\begin{cases} x_0 = a \\ x_n = f(x_{n-1}), \quad n \geq 1 \end{cases}$$
The first step of the proof consists simply of showing that, if this sequence admits a limit in $X$, then this limit is a fixed point for $f$. The uniqueness of the fixed point will be a simple consequence of the definition of the contraction mapping. Instead, the convergence of the sequence $(x_n)_{n \in \mathbb{N}}$, which is harder to prove, will be verified later.

– If there exists $X \ni \bar{x} = \lim_{n \to +\infty} x_n$, then $\bar{x}$ is a fixed point for $f$: The proof of this statement relies on a simple continuity argument. Since we know that a contraction mapping is continuous, if we let $n$ tend toward $+\infty$ in the definition of the sequence, that is, $x_n = f(x_{n-1})$, we obtain:
$$\lim_{n \to +\infty} x_n = \lim_{n \to +\infty} f(x_{n-1}) \iff \bar{x} = f\left(\lim_{n \to +\infty} x_{n-1}\right) \iff \bar{x} = f(\bar{x})$$
that is, $\bar{x}$ is a fixed point for $f$. This is the reason for considering the recursively defined sequence $(x_n)_{n \in \mathbb{N}}$ described above.

– Uniqueness of the fixed point: Let $\bar{x}, \bar{y} \in X$ be two fixed points for $f$, that is, $f(\bar{x}) = \bar{x}$, $f(\bar{y}) = \bar{y}$. We can show that their distance is null, that is, $\bar{x} = \bar{y}$, using the definite positiveness of the distance and the definition of contraction mapping:
$$d(\bar{x}, \bar{y}) = d(f(\bar{x}), f(\bar{y})) \leq k\,d(\bar{x}, \bar{y})$$
but since $k \in (0, 1)$, this inequality only holds if $d(\bar{x}, \bar{y}) = 0$, that is, $\bar{x} = \bar{y}$.

– Convergence of the sequence: Here, the hypothesis that $(X, d)$ is complete will be crucial, because if we can show that $(x_n)_{n \in \mathbb{N}}$ is Cauchy, then, by completeness, it is convergent. We begin by noting that for all $n \geq 1$, using the definition of the sequence and the hypothesis that $f$ is a contraction, we can write:
$$d(x_{n+1}, x_n) = d(f(x_n), f(x_{n-1})) \leq k\,d(x_n, x_{n-1})$$
hence, by iteration:
$$d(x_{n+1}, x_n) \leq k\,d(x_n, x_{n-1}) \leq k^2 d(x_{n-1}, x_{n-2}) \leq \ldots \leq k^n d(x_1, x_0) \tag*{[4.15]}$$
that is, the distance between consecutive elements, $x_{n+1}$ and $x_n$, in the sequence $(x_n)_{n \in \mathbb{N}}$ is bounded by $k^n d(x_1, x_0)$; note that the power of $k$ is equal to the smallest index value.


Now, let us take two arbitrary but different natural indices $n, m \in \mathbb{N}$. Without loss of generality, we may consider that $n < m$, hence $m - n = p \in \mathbb{N}$, or $m = n + p$, and so $d(x_m, x_n) = d(x_{n+p}, x_n)$. Iterating the triangular property of the distance, we obtain:
$$d(x_{n+p}, x_n) \leq d(x_{n+p}, x_{n+p-1}) + d(x_{n+p-1}, x_{n+p-2}) + \ldots + d(x_{n+1}, x_n)$$
We see that all terms on the right side of the inequality are distances between two consecutive elements of the sequence $(x_n)_{n \in \mathbb{N}}$; using this fact, we can apply the bound given by [4.15] and write:
$$\begin{cases} d(x_{n+p}, x_{n+p-1}) \leq k^{n+p-1} d(x_1, x_0) \\ d(x_{n+p-1}, x_{n+p-2}) \leq k^{n+p-2} d(x_1, x_0) \\ \quad \vdots \\ d(x_{n+1}, x_n) \leq k^n d(x_1, x_0) \end{cases}$$
that is:
$$\begin{aligned} d(x_{n+p}, x_n) &\leq (k^{n+p-1} + k^{n+p-2} + \ldots + k^n)\,d(x_1, x_0) \\ &= (k^{p-1} + k^{p-2} + \ldots + 1)\,k^n d(x_1, x_0) \\ &= \left( \sum_{j=0}^{p-1} k^j \right) k^n d(x_1, x_0) \\ &\leq \left( \sum_{j=0}^{+\infty} k^j \right) k^n d(x_1, x_0), \quad \text{since } k^j > 0 \end{aligned}$$
As $\sum_{j=0}^{+\infty} k^j$ is a geometric series with $k \in (0, 1)$, it converges to $\frac{1}{1-k}$, so we have:
$$d(x_{n+p}, x_n) \leq \frac{k^n}{1-k}\,d(x_1, x_0)$$
Remembering that $d(x_{n+p}, x_n) = d(x_m, x_n)$, with $m > n \in \mathbb{N}$ of arbitrary value, we have:
$$d(x_m, x_n) \leq \frac{k^n}{1-k}\,d(x_1, x_0) \underset{n \to +\infty}{\longrightarrow} 0$$
This implies that $(x_n)_{n \in \mathbb{N}}$ is a Cauchy sequence, and thus converges to an element $\bar{x} \in X$ by the hypothesis of completeness of $X$. □

It is important to note that the first element in the sequence $(x_n)_{n \in \mathbb{N}}$ is completely arbitrary: even if this element is distant from the fixed point $\bar{x}$, the sequence will reach the fixed point in the limit. On some occasions, a starting point $x_0$ may be selected in such a way as to accelerate the speed at which the sequence converges.
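The proof is constructive, and the estimate $d(x_m, x_n) \leq \frac{k^n}{1-k} d(x_1, x_0)$ gives, letting $m \to +\infty$, an a priori bound on the distance between $x_n$ and the fixed point. The sketch below uses this bound as a stopping rule. It is only an illustration under stated assumptions: the function name `fixed_point` and the test map $f(x) = \frac{1}{2}\cos x$ on $(\mathbb{R}, |\cdot|)$, a contraction with coefficient $k = \frac{1}{2}$ by the mean value theorem, are choices made for this example.

```python
import math

def fixed_point(f, x0, k, tol=1e-12, max_iter=1000):
    """Iterate x_n = f(x_{n-1}) on the real line.

    Stops as soon as the a priori bound k^n / (1 - k) * |x_1 - x_0|
    on the distance between x_n and the fixed point drops below tol.
    Returns the last iterate x_n and the index n.
    """
    x = f(x0)
    d10 = abs(x - x0)          # d(x_1, x_0)
    n = 1
    while k ** n / (1 - k) * d10 >= tol and n < max_iter:
        x = f(x)               # x_{n+1} = f(x_n)
        n += 1
    return x, n

def f(x):
    """Hypothetical test map: |f'(x)| = |sin(x)| / 2 <= 1/2, so k = 1/2 works."""
    return 0.5 * math.cos(x)

root_a, steps_a = fixed_point(f, 0.0, 0.5)
root_b, steps_b = fixed_point(f, 10.0, 0.5)
print(root_a, steps_a)
print(root_b, steps_b)   # same limit, reached from a distant starting point
```

Both runs end at the same point, in line with the uniqueness statement, and the farther starting point only costs a few additional iterations because the bound decays geometrically in $n$.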


The following nice exercise is proposed by Sondaz (2010).

Exercise 4.1

1) Give an example of a metric space $(X, d)$ and a contraction mapping $f : X \to X$ with no fixed point.

2) Give an example of a complete metric space $(X, d)$ and an application $f : X \to X$ which strictly reduces distances, that is, such that $d(f(x), f(y)) < d(x, y)$ $\forall x, y \in X$, $x \neq y$, and which admits no fixed point.

3) Show that the Cauchy problem:
$$\begin{cases} x'(t) = \frac{1}{2} \sin x(t) \\ x(0) = 1 \end{cases} \tag*{[4.16]}$$
has a unique solution $\varphi : [-1, 1] \to \mathbb{R}$.

Solution to Exercise 4.1

For points 1 and 2, the answer evidently involves undermining the fixed point theorem by removing a hypothesis. For point (1), we consider a non-complete metric space. For point (2), we consider an application which strictly reduces distances; as we have seen, this hypothesis is less strict than requiring the application to be a contraction mapping.

1) We need to consider a non-complete metric space. We have already seen that $((0, 1), |\cdot|)$ is not complete. In this space, let us consider, for instance, the function $f : (0, 1) \to (0, 1)$, $f(x) = \frac{1}{2}x$. Then:
$$\forall x, y \in (0, 1), \quad |f(x) - f(y)| = \left| \frac{1}{2}x - \frac{1}{2}y \right| = \frac{1}{2}|x - y| \leq \frac{1}{2}|x - y|$$
so $f$ is a contraction with coefficient $k = 1/2$. The fixed point equation for $f$, that is, $f(x) = x$, evidently has no solutions in $(0, 1)$ since $\frac{1}{2}x = x$ if and only if $x = 0 \notin (0, 1)$.

2) Consider the metric space $(X, d) = ([0, +\infty), |\cdot|)$, which is complete since $[0, +\infty)$ is closed in $\mathbb{R}$ (Theorem 4.7), and the application $f : [0, +\infty) \to [0, +\infty)$ defined by $f(x) = \sqrt{x^2 + 1}$. Taking two arbitrary fixed elements $x, y \in [0, +\infty)$, $x \neq y$, due to Lagrange's mean value theorem, there exists an element $\xi \in [0, +\infty)$ which is strictly included in the interval between $x$ and $y$, such that:
$$f(x) - f(y) = (x - y) f'(\xi) = (x - y) \frac{\xi}{\sqrt{\xi^2 + 1}}$$


From Euclidean to Hilbert Spaces

Since $\sqrt{\xi^2 + 1} > \sqrt{\xi^2} = \xi$ for $\xi \in [0, +\infty)$, then $\frac{\xi}{\sqrt{\xi^2 + 1}} < 1$, so $|f(x) - f(y)| < |x - y|$, that is, $f$ strictly reduces the distances.

Nevertheless, in $[0, +\infty)$ the fixed point equation for $f$, that is, $x = \sqrt{x^2 + 1}$, can be written as $x^2 = x^2 + 1$, meaning $1 = 0$, which is obviously a contradiction; thus, $f$ does not admit a fixed point.

3) We know from differential equation theory that solving the Cauchy problem [4.16] is equivalent to determining a function $\varphi \in C([-1, 1])$ which satisfies the following Volterra integral equation:

$$\varphi(t) = \frac{1}{2} \int_0^t \sin \varphi(s)\, ds + 1 \qquad [4.17]$$

Let us verify this statement. On the one hand, if $\varphi$ is a solution of [4.16], then, by definition, $\varphi$ is differentiable and thus continuous. Integrating both sides of the differential equation from $0$ to $t$, we obtain $\int_0^t \varphi'(s)\, ds = \frac{1}{2} \int_0^t \sin \varphi(s)\, ds$, that is, $\varphi(t) - \varphi(0) = \frac{1}{2} \int_0^t \sin \varphi(s)\, ds$. The initial condition in [4.16] gives us $\varphi(0) = 1$, thus $\varphi$ satisfies $\varphi(t) = \frac{1}{2} \int_0^t \sin \varphi(s)\, ds + 1$. On the other hand, supposing that $\varphi \in C([-1, 1])$ satisfies [4.17], the integral function $\frac{1}{2} \int_0^t \sin \varphi(s)\, ds + 1$ is differentiable $\forall t \in [-1, 1]$, since $\sin \circ \varphi$ is continuous and the integral of a continuous function is differentiable with respect to its upper limit. Differentiating [4.17] gives us $\varphi'(t) = \frac{1}{2} \sin \varphi(t)$, and setting $t = 0$ in [4.17] gives $\varphi(0) = 1$, that is, $\varphi$ satisfies [4.16].

These considerations highlight the interest of the space $C([-1, 1])$, which is a Banach space when it is endowed with the norm $\|f\| = \sup_{t \in [-1,1]} |f(t)|$. Consider the following application:

$$F : C([-1, 1]) \longrightarrow C([-1, 1]), \qquad f \longmapsto F(f)$$

where $F(f)$ is the real-valued continuous function on $[-1, 1]$ defined by the analytical expression $F(f)(t) = \frac{1}{2} \int_0^t \sin f(s)\, ds + 1$, for all $t \in [-1, 1]$. The fixed points of $F$ are exactly the solutions of [4.17]. Clearly, if we can show that $F$ is a contraction, by invoking the fixed point theorem we will complete the proof that there is only one solution to the Cauchy problem [4.16]. To do this, let us consider any two functions $f, g \in C([-1, 1])$ and an arbitrary $t \in [-1, 1]$; then:


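$$\begin{aligned}
|F(f)(t) - F(g)(t)| &= \frac{1}{2} \left| \int_0^t [\sin f(s) - \sin g(s)]\, ds \right| \\
&\le \frac{1}{2} \left| \int_0^t |\sin f(s) - \sin g(s)|\, ds \right| \\
&\qquad \left(\text{using the formula } \sin p - \sin q = 2 \sin\left(\tfrac{p - q}{2}\right) \cos\left(\tfrac{p + q}{2}\right)\right): \\
&= \left| \int_0^t \left| \sin \frac{f(s) - g(s)}{2} \right| \left| \cos \frac{f(s) + g(s)}{2} \right| ds \right| \\
&\qquad (|\cos(\alpha)| \le 1) \\
&\le \left| \int_0^t \left| \sin \frac{f(s) - g(s)}{2} \right| ds \right| \\
&\qquad (|\sin(\alpha)| \le |\alpha|) \\
&\le \left| \int_0^t \left| \frac{f(s) - g(s)}{2} \right| ds \right| \\
&\le \frac{1}{2} \left| \int_0^t \|f - g\|\, ds \right| \\
&\le \frac{\|f - g\|}{2} \left| \int_0^t ds \right| = \frac{\|f - g\|}{2} |t| \\
&\qquad (t \in [-1, 1] \implies |t| \le 1) \\
&\le \frac{\|f - g\|}{2}
\end{aligned}$$

In summary: $|F(f)(t) - F(g)(t)| \le \frac{\|f - g\|}{2}$ $\forall t \in [-1, 1]$, hence:

$$\|F(f) - F(g)\| = \sup_{t \in [-1,1]} |F(f)(t) - F(g)(t)| \le \frac{1}{2} \|f - g\|$$

that is, $F$ is a contraction. $\square$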
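The contraction argument above can be sketched numerically by iterating $F$ on a grid. This is only an illustration under stated assumptions: the integral in $F(f)(t) = \frac{1}{2}\int_0^t \sin f(s)\,ds + 1$ is replaced by the trapezoidal rule on a uniform grid of $[-1, 1]$, the iteration starts from the constant function $1$, and the function names `picard_step` and `solve_by_picard` are hypothetical.

```python
import math


def picard_step(f_vals, t, zero_idx):
    """One application of F(f)(t) = 1 + (1/2) * integral_0^t sin(f(s)) ds.

    The integral is approximated with the trapezoidal rule on the uniform grid t.
    """
    dt = t[1] - t[0]
    integrand = [math.sin(v) for v in f_vals]
    # cum[i] approximates the integral of sin(f) from t[0] to t[i].
    cum = [0.0]
    for i in range(1, len(t)):
        cum.append(cum[-1] + 0.5 * (integrand[i - 1] + integrand[i]) * dt)
    # Subtracting cum[zero_idx] turns this into the integral from 0 to t[i].
    return [1.0 + 0.5 * (c - cum[zero_idx]) for c in cum]


def solve_by_picard(half_points=500, n_iter=30):
    """Iterate F starting from the constant function 1 on [-1, 1]."""
    t = [k / half_points for k in range(-half_points, half_points + 1)]
    zero_idx = half_points  # t[zero_idx] == 0.0
    f = [1.0] * len(t)
    gaps = []  # sup-distances between successive iterates
    for _ in range(n_iter):
        f_next = picard_step(f, t, zero_idx)
        gaps.append(max(abs(a - b) for a, b in zip(f_next, f)))
        f = f_next
    return t, f, gaps


t, phi, gaps = solve_by_picard()
```

In this discretized setting, each entry of `gaps` is at most half of the previous one (up to rounding), which mirrors the estimate $\|F(f) - F(g)\| \le \frac{1}{2}\|f - g\|$, and `phi` approximates the unique solution $\varphi$ with $\varphi(0) = 1$.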

4.4. Remarkable examples of Banach and Hilbert spaces

In this section, we shall introduce function spaces which are of crucial importance in mathematics. We shall demonstrate that some of these spaces are Banach spaces, while others are Hilbert spaces; we shall then present density theorems related to these spaces.


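4.4.1. $L^p$ and $\ell^p$ spaces: presentation and completeness

In the following definitions, $\mathbb{K}$ will be either $\mathbb{R}$ or $\mathbb{C}$. Let $(X, \mathcal{A}, \mu)$ be a measure space. For all $1 \le p < +\infty$, we define:

$$\mathcal{L}^p(X, \mathcal{A}, \mu) = \left\{ f : X \to \mathbb{K},\ f \text{ measurable} : \int_X |f|^p\, d\mu < +\infty \right\}$$

The set $\mathcal{L}^p(X, \mathcal{A}, \mu)$ becomes a vector space if we define the pointwise vector structure, that is, $\forall \alpha, \beta \in \mathbb{K}$, $\forall f, g \in \mathcal{L}^p(X, \mathcal{A}, \mu)$:

$$\alpha f + \beta g : X \to \mathbb{K}, \qquad x \mapsto (\alpha f + \beta g)(x) = \alpha f(x) + \beta g(x)$$

This linear combination operation is well defined thanks to the famous Minkowski inequality (footnote 6) for integrals (which we will not prove):

$$\left( \int_X |f + g|^p\, d\mu \right)^{1/p} \le \left( \int_X |f|^p\, d\mu \right)^{1/p} + \left( \int_X |g|^p\, d\mu \right)^{1/p} \qquad [4.18]$$

Since multiplication by the scalars $\alpha, \beta$ has no effect on the integrability of $f, g$, the definition is coherent. Writing:

$$\|f\|_p = \left( \int_X |f|^p\, d\mu \right)^{1/p}$$

the properties of the Lebesgue integral give us:

– positiveness (non-definite) and homogeneity:

$$\|f\|_p \ge 0, \quad \|\lambda f\|_p = |\lambda| \|f\|_p \qquad \forall f \in \mathcal{L}^p(X, \mathcal{A}, \mu),\ \lambda \in \mathbb{K}$$

– the Minkowski inequality [4.18] becomes the triangular inequality (footnote 7) for $\|\ \|_p$:

$$\|f + g\|_p \le \|f\|_p + \|g\|_p, \qquad \forall f, g \in \mathcal{L}^p(X, \mathcal{A}, \mu)$$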

6 By iteration, we can write the generalized Minkowski inequality, which we shall use later: $\left( \int_X \left| \sum_{k=1}^n f_k \right|^p d\mu \right)^{1/p} \le \sum_{k=1}^n \left( \int_X |f_k|^p\, d\mu \right)^{1/p}$.

7 By iteration of [4.18]: $\left\| \sum_{k=1}^n f_k \right\|_p \le \sum_{k=1}^n \|f_k\|_p$.


– but $\|f\|_p = 0$ does not imply $f = 0$ (the null function): any function $g \in \mathcal{L}^p(X, \mathcal{A}, \mu)$ which is null a.e. is such that $\|g\|_p = 0$. Thus, the fact that $\|f\|_p = 0$ does not imply that $f$ is the null function, that is, that $f(x) = 0$ $\forall x \in X$.

Hence, $\|\ \|_p$ is a semi-norm (or pseudo-norm) on $\mathcal{L}^p(X, \mathcal{A}, \mu)$. Unfortunately, the presence of a semi-norm makes it impossible to use the (highly useful) property that, if the norm of the difference between two elements in a normed space is null, then the two elements coincide, since they can differ over a set of measure zero. This feature of a norm is used to show the uniqueness of a mathematical object in cases where it is difficult to prove it directly, which is why it is important to preserve it.

The solution to the problem is to take the quotient of $\mathcal{L}^p(X, \mathcal{A}, \mu)$ w.r.t. a suitable subspace that allows us to get rid of the redundant functions. It should be clear that this subspace is:

$$N = \{ f : X \to \mathbb{K},\ f \text{ measurable} : f = 0 \text{ a.e.} \}$$

The quotient space $L^p(X, \mathcal{A}, \mu) = \mathcal{L}^p(X, \mathcal{A}, \mu)/N$, formed by the equivalence classes of functions which are measurable on $X$, absolutely integrable in power $p$ and equal a.e., is thus a normed vector space with norm $\|\ \|_p$.

Using the considerations presented in Appendix 1, it becomes apparent that, once a representative $f$ of an equivalence class of $L^p(X, \mathcal{A}, \mu)$ is fixed, all other functions $g$ belonging to the same class can be written as $g = f + h$, where $h : X \to \mathbb{K}$ is null a.e. For simplicity's sake, a representative function and the equivalence class to which it belongs are generally noted using the same symbol. Furthermore, in cases where $X$, $\mathcal{A}$ and $\mu$ do not need to be specified, we may simply write $L^p$.

REMARK.– Take $X \subseteq \mathbb{K}^n$ with the Lebesgue measure. Let us consider two functions $f, g \in \mathcal{L}^p(X, \mathcal{A}, \mu)$ which are continuous on $X$ and which differ, at least, at the point $x_0 \in X$: $f(x_0) \neq g(x_0)$.
By definition of continuity:

$$\forall \varepsilon > 0\ \exists \delta_\varepsilon > 0 : x \in U_{\delta_\varepsilon}(x_0) \implies f(x) \in U_\varepsilon(f(x_0)) \text{ and } g(x) \in U_\varepsilon(g(x_0))$$

but, by the separation (Hausdorff) property of $\mathbb{K}$, $\exists \varepsilon > 0$ such that $U_\varepsilon(f(x_0)) \cap U_\varepsilon(g(x_0)) = \emptyset$, that is, $\exists \delta_\varepsilon > 0$ such that $x \in U_{\delta_\varepsilon}(x_0)$ implies $f(x) \neq g(x)$. In other words, if two continuous functions $f$ and $g$ are different at a point $x_0$, they must also be different on a neighborhood $U_{\delta_\varepsilon}(x_0)$ of radius $\delta_\varepsilon > 0$. This neighborhood has a non-null Lebesgue measure, so the two functions are not equal a.e.

In other words, two functions which are continuous on $X \subset \mathbb{K}^n$ cannot be equal a.e. without being equal everywhere: either they are the same function, or they differ on a set of non-null Lebesgue measure. Thus, two continuous functions of $\mathcal{L}^p(X, \mathcal{A}, \mu)$ which are different in at least one point are two different elements of $L^p(X, \mathcal{A}, \mu)$, as they are representatives of two different equivalence classes.

If $p = 2$, then we can define an inner product on $L^2(X, \mathcal{A}, \mu)$:

$$\langle f, g \rangle = \int_X f g\, d\mu \quad \text{if } \mathbb{K} = \mathbb{R} \qquad \text{and} \qquad \langle f, g \rangle = \int_X f \bar{g}\, d\mu \quad \text{if } \mathbb{K} = \mathbb{C}$$

These inner products are well defined thanks to Hölder's inequality for integrals (which we shall not prove here): if $p, q > 0$ are conjugate exponents, that is, $\frac{1}{p} + \frac{1}{q} = 1$, then it holds that:

$$\int_X |f g|\, d\mu \le \left( \int_X |f|^p\, d\mu \right)^{1/p} \left( \int_X |g|^q\, d\mu \right)^{1/q} \qquad [4.19]$$

Evidently, $p = q = 2$ are conjugate exponents and thus the inner product introduced above is well defined. The proof that this verifies the axioms of the inner product is left to the reader; here we note simply that Hölder's inequality for $p = q = 2$ implies the validity of the Cauchy-Schwarz inequality for the space $L^2(X, \mathcal{A}, \mu)$. In fact, for all $f, g \in L^2(X, \mathcal{A}, \mu)$:

$$|\langle f, g \rangle_{L^2}| = \left| \int_X f \bar{g}\, d\mu \right| \le \int_X |f g|\, d\mu \underset{[4.19]}{\le} \left( \int_X |f|^2\, d\mu \right)^{1/2} \left( \int_X |g|^2\, d\mu \right)^{1/2} = \|f\|_2 \|g\|_2$$

One notable instance of $L^p$ spaces is represented by the $\ell^p$ spaces, which are defined through the following choices:

– $X$ is taken to be a countable set, typically $X = \mathbb{N}$ or $X = \mathbb{Z}$;

– $\mathcal{A} = \mathcal{P}(X)$, the power set of $X$;

– $\mu$ is the counting measure, that is, $\mu : \mathcal{P}(X) \to [0, +\infty]$, $\mu(A) = \mathrm{card}(A)$ for all $A \in \mathcal{P}(X)$ with finite cardinality, and $\mu(A) = +\infty$ if $\mathrm{card}(A)$ is not finite.


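Using these choices, any function $f : X \to \mathbb{K}$ is measurable and it can be identified with a sequence of elements in $\mathbb{K}$, written $(x_n)_{n \in \mathbb{N}}$. Thus, explicitly (footnote 8):

$$\ell^p(\mathbb{N}, \mathbb{K}) = \left\{ (x_n)_{n \in \mathbb{N}},\ \sum_{n \in \mathbb{N}} |x_n|^p < +\infty \right\}$$

The same considerations hold if we exchange $\mathbb{N}$ for $\mathbb{Z}$. In cases where there is no need to specify $\mathbb{N}$, $\mathbb{Z}$ or any other countable set, we simply write $\ell^p$. The linear structure of these spaces is the same as that of the $L^p$ spaces, that is, pointwise defined, and the norm of $(x_n)_{n \in \mathbb{N}} \in \ell^p(\mathbb{N}, \mathbb{K})$ is:

$$\|(x_n)_{n \in \mathbb{N}}\|_p = \left( \sum_{n \in \mathbb{N}} |x_n|^p \right)^{1/p}$$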

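The same holds if we exchange $\mathbb{N}$ for $\mathbb{Z}$. The triangular inequality for this norm follows from the Minkowski inequality for series:

$$\left( \sum_{n \in \mathbb{N}} |x_n + y_n|^p \right)^{1/p} \le \left( \sum_{n \in \mathbb{N}} |x_n|^p \right)^{1/p} + \left( \sum_{n \in \mathbb{N}} |y_n|^p \right)^{1/p} \qquad [4.20]$$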

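As in the case of $L^p$ spaces, if $p = 2$, an inner product can be defined on $\ell^2$:

$$\langle x_n, y_n \rangle = \sum_{n \in \mathbb{N}} x_n y_n \quad \text{if } \mathbb{K} = \mathbb{R} \qquad \text{and} \qquad \langle x_n, y_n \rangle = \sum_{n \in \mathbb{N}} x_n \overline{y_n} \quad \text{if } \mathbb{K} = \mathbb{C}$$

The same holds if we exchange $\mathbb{N}$ for $\mathbb{Z}$, or any other countable set.

The inner product is well defined thanks to Hölder's inequality for series: if $p, q > 0$, $\frac{1}{p} + \frac{1}{q} = 1$, then it holds that:

$$\sum_{n \in \mathbb{N}} |x_n y_n| \le \left( \sum_{n \in \mathbb{N}} |x_n|^p \right)^{1/p} \left( \sum_{n \in \mathbb{N}} |y_n|^q \right)^{1/q}$$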
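Both inequalities for series can be checked on finite sequences, which are elements of every $\ell^p$ space once extended by zeros. The sketch below is an illustration only: the sample data are random, the exponent $p = 3$ is an arbitrary choice, and the helper names `lp_norm` and `parallelogram_defect` are hypothetical. It also evaluates the parallelogram law of section 1.2.1 on the first two canonical sequences, as a way of seeing what is special about $p = 2$.

```python
import random


def lp_norm(x, p):
    """(sum |x_n|^p)^(1/p) for a finite sequence x and 1 <= p < infinity."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)


random.seed(0)
x = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(50)]
y = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(50)]

p = 3.0
q = p / (p - 1.0)  # conjugate exponent: 1/p + 1/q = 1

# Minkowski: ||x + y||_p <= ||x||_p + ||y||_p, so this gap is non-negative.
minkowski_gap = (lp_norm(x, p) + lp_norm(y, p)
                 - lp_norm([a + b for a, b in zip(x, y)], p))

# Hoelder: sum |x_n y_n| <= ||x||_p ||y||_q, so this gap is non-negative.
holder_gap = (lp_norm(x, p) * lp_norm(y, q)
              - sum(abs(a * b) for a, b in zip(x, y)))


def parallelogram_defect(u, v, p):
    """||u+v||_p^2 + ||u-v||_p^2 - 2(||u||_p^2 + ||v||_p^2)."""
    plus = [a + b for a, b in zip(u, v)]
    minus = [a - b for a, b in zip(u, v)]
    return (lp_norm(plus, p) ** 2 + lp_norm(minus, p) ** 2
            - 2.0 * (lp_norm(u, p) ** 2 + lp_norm(v, p) ** 2))


e0, e1 = [1.0, 0.0], [0.0, 1.0]
defect_p2 = parallelogram_defect(e0, e1, 2.0)  # vanishes (up to rounding)
defect_p1 = parallelogram_defect(e0, e1, 1.0)  # equals 8 - 4 = 4
```

For these two sequences the defect equals $2 \cdot 2^{2/p} - 4$, which vanishes only for $p = 2$.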

REMARK.–

– The inner product of $\ell^2(\mathbb{N}, \mathbb{K})$ is the infinite-dimensional generalization of the inner product of $\ell^2(\mathbb{Z}_N)$.

– The role of the Minkowski and Hölder inequalities in defining $L^p$ and $\ell^p$ spaces should be clear: the Minkowski inequality guarantees the existence of a linear structure, and Hölder's inequality ensures that the inner product is well defined in the case where $p = 2$.

– $\|\ \|_p$ norms with $p \neq 2$ are not Hilbert norms: it is possible to provide examples of elements in every $L^p$ space with $p \neq 2$ for which the parallelogram law is not verified.

8 The spaces $\ell^p(\mathbb{N}, \mathbb{K})$ are vector subspaces of the vector space $\mathbb{K}^{\mathbb{N}} := \{ (x_n)_{n \in \mathbb{N}},\ x_n \in \mathbb{K}\ \forall n \in \mathbb{N} \}$ of sequences with values in $\mathbb{K}$, which possesses a pointwise defined linear structure. The same holds if $\mathbb{N}$ is replaced by $\mathbb{Z}$, in which case we speak of bilateral sequences.

Now, let us demonstrate that $L^p$ and $\ell^p$ spaces with $1 \le p < +\infty$, $p \neq 2$ are Banach spaces, and for $p = 2$, Hilbert spaces.

The completeness of the space $L^2([0, 1])$ was demonstrated independently by the Austrian mathematician Ernst Sigismund Fischer (1875-1954) and by Frigyes Riesz (footnote 9) in 1907. In 1910, Riesz demonstrated that all $L^p([0, 1])$ spaces are complete.

THEOREM 4.14 (Riesz-Fischer theorem).– For all $1 \le p < +\infty$, the spaces $(L^p(X, \mathcal{A}, \mu), \|\ \|_p)$ and $(\ell^p, \|\ \|_p)$ are complete.

PROOF.– We report Riesz's proof, which brings out the heavy artillery: the characterization theorem for complete normed vector spaces, Fatou's lemma, the generalized Minkowski inequality, the monotone convergence theorem and the dominated convergence theorem.

Let us consider any series $\sum_{k=0}^{\infty} f_k$ in $L^p(X, \mathcal{A}, \mu)$, $1 \le p < +\infty$, which is absolutely convergent, that is:

$$\sum_{k=0}^{\infty} \|f_k\|_p = M < +\infty$$

Then, by Theorem 4.12, to show that $L^p(X, \mathcal{A}, \mu)$ is complete, we must simply prove that $\sum_{k \in \mathbb{N}} f_k$ is convergent in norm, that is, that $\exists S \in L^p(X, \mathcal{A}, \mu)$ such that:

$$\left\| \sum_{k=0}^{n} f_k - S \right\|_p \underset{n \to +\infty}{\longrightarrow} 0 \qquad [4.21]$$

The first step in determining the function $S$ is to define the sequence:

$$(g_n)_{n \in \mathbb{N}}, \qquad g_n = \sum_{k=0}^{n} |f_k|, \qquad \forall n \in \mathbb{N}$$

9 Frigyes Riesz (1880-1956) was a Hungarian mathematician who made many hugely important contributions to the development of functional analysis, among other areas.


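Using the generalized Minkowski inequality (obtained by iterating [4.18]), we know that:

$$\left( \int_X (g_n)^p\, d\mu \right)^{1/p} = \|g_n\|_p = \left\| \sum_{k=0}^{n} |f_k| \right\|_p \le \sum_{k=0}^{n} \|f_k\|_p \le \sum_{k=0}^{\infty} \|f_k\|_p = M < +\infty \quad \text{(by hypothesis)}$$

hence:

$$\int_X (g_n)^p\, d\mu \le M^p, \qquad \forall n \in \mathbb{N} \qquad [4.22]$$

that is, $((g_n)^p)_{n \in \mathbb{N}}$ is a monotonically increasing sequence of integrable functions, and the sequence of integrals is bounded.

The monotone convergence theorem (Theorem 3.3) tells us that the pointwise limit function $\lim_{n \to +\infty} (g_n)^p(x)$ is finite a.e. on $X$, that is, $\forall x \in E \subseteq X$ with $\mu(X \setminus E) = 0$. This implies the existence, $\forall x \in E$, of a finite pointwise limit:

$$g(x) \equiv \lim_{n \to +\infty} g_n(x) = \lim_{n \to +\infty} \left[ (g_n)^p(x) \right]^{1/p}$$

Since, $\forall x \in E$, $\sum_{k=0}^{\infty} |f_k(x)| = g(x) < +\infty$, the series $\sum_{k=0}^{\infty} f_k(x)$ converges absolutely, and hence converges, a.e. on $X$, with $\left| \sum_{k=0}^{\infty} f_k(x) \right| \le g(x)$. Now, let us construct the required function $S : X \to \mathbb{K}$:

$$S(x) = \begin{cases} \sum_{k=0}^{\infty} f_k(x) & x \in E \\ 0 & x \in X \setminus E \end{cases}$$

This definition ensures that $S$ is measurable. The fact that $S \in L^p(X, \mathcal{A}, \mu)$, that is, that $\int_X |S|^p\, d\mu$ exists and is finite, is a consequence of the dominated convergence theorem (Theorem 3.5) and Fatou's lemma (Theorem 3.4). This can be proved by considering the sequence of partial sums $S_n = \sum_{k=0}^{n} f_k$, for which:

$$|S_n|^p = \left| \sum_{k=0}^{n} f_k \right|^p \le \left( \sum_{k=0}^{n} |f_k| \right)^p = (g_n)^p.$$

$((g_n)^p)_{n \in \mathbb{N}}$ is an increasing positive sequence, thus:

$$|S_n|^p(x) \le (g_n)^p(x) \le \lim_{n \to +\infty} (g_n)^p(x) = g^p(x), \qquad \forall x \in E \qquad [4.23]$$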


By monotonicity, $\lim_{n \to +\infty} (g_n)^p(x) = \liminf_{n \to +\infty} (g_n)^p(x)$ and thus, by Fatou's lemma, we have:

$$\int_X g^p\, d\mu \le \lim_{n \to +\infty} \int_X (g_n)^p\, d\mu \underset{[4.22]}{\le} \lim_{n \to +\infty} M^p = M^p < +\infty$$

The positive measurable function $g^p$ is therefore integrable on $X$. Using this information and equation [4.23], that is, $|S_n|^p \le g^p$ $\forall n \in \mathbb{N}$, the dominated convergence theorem can be used to guarantee that $|S|^p$, the a.e. limit of $|S_n|^p$, is integrable on $X$, that is, $S \in L^p(X, \mathcal{A}, \mu)$.

To complete our proof, we must demonstrate that the function $S$ verifies equation [4.21], that is:

$$\left\| \sum_{k=0}^{n} f_k - S \right\|_p \underset{n \to +\infty}{\longrightarrow} 0 \iff \lim_{n \to +\infty} \int_E \left| \sum_{k=0}^{n} f_k - \sum_{k=0}^{\infty} f_k \right|^p d\mu = 0$$

Note that we do not need to write the integration on $X \setminus E$ since $\mu(X \setminus E) = 0$. With our notation, the condition of convergence in norm $\|\ \|_p$ for the series $\sum_{k=0}^{\infty} f_k$ to $S$ can be rewritten in a simpler way as follows:

$$\lim_{n \to +\infty} \|S_n - S\|_p = 0 \iff \lim_{n \to +\infty} \int_E |S_n - S|^p\, d\mu = 0$$

Evidently, if we can show that the integral and the limit can switch places, then the result will be proved, since, in this case:

$$\lim_{n \to +\infty} \int_E |S_n - S|^p\, d\mu = \int_E \lim_{n \to +\infty} |S_n - S|^p\, d\mu \underset{(S \text{ is independent of } n)}{=} \int_E \left| \lim_{n \to +\infty} S_n - S \right|^p d\mu = 0$$

To make this exchange possible, we can write the following majorization:

$$|S_n(x) - S(x)|^p \le (|S_n(x)| + |S(x)|)^p \le (g(x) + g(x))^p = (2 g(x))^p = 2^p (g(x))^p \qquad \forall x \in E$$

As $\int_X g^p\, d\mu \le M^p < +\infty$, this majorization ensures that the sequence $(|S_n(x) - S(x)|^p)_{n \in \mathbb{N}}$ verifies the conditions of the dominated convergence theorem, meaning that the limit and integral can be exchanged.

As we saw previously, this ensures that the series $\sum_{k=0}^{\infty} f_k$ in $L^p(X, \mathcal{A}, \mu)$, which we presumed to be absolutely convergent, is also convergent in norm. Hence, all $L^p(X, \mathcal{A}, \mu)$ spaces with $1 \le p < \infty$ are complete.


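Since $\ell^p$ spaces are special cases of $L^p$ spaces, this result also holds for these spaces $\forall\, 1 \le p < \infty$. $\square$

Exercise 4.2

Let $a = (a_n)_{n \in \mathbb{N}}$ be a sequence of strictly positive real numbers, and let $\ell^2_a(\mathbb{N}, \mathbb{C})$ be the vector space formed by the sequences of complex numbers $(u_n)_{n \in \mathbb{N}}$ which verify $\sum_{n \in \mathbb{N}} a_n |u_n|^2 < +\infty$. Show that the application defined by:

$$\langle u, v \rangle_{\ell^2_a} = \sum_{n \in \mathbb{N}} a_n u_n \overline{v_n}$$

is well defined on $\ell^2_a(\mathbb{N}, \mathbb{C}) \times \ell^2_a(\mathbb{N}, \mathbb{C})$ (i.e. $\langle u, v \rangle$ exists for all $u, v \in \ell^2_a(\mathbb{N}, \mathbb{C})$), and deduce that this is an inner product.

Solution to Exercise 4.2

Since $u, v \in \ell^2_a(\mathbb{N}, \mathbb{C})$, the sequences $\sqrt{a}\,u = (\sqrt{a_n}\, u_n)_{n \in \mathbb{N}}$ and $\sqrt{a}\,v = (\sqrt{a_n}\, v_n)_{n \in \mathbb{N}}$ belong to $\ell^2(\mathbb{N}, \mathbb{C})$; then:

$$\langle u, v \rangle_{\ell^2_a} = \sum_{n \in \mathbb{N}} \sqrt{a_n}\, u_n\, \overline{\sqrt{a_n}\, v_n} = \langle \sqrt{a_n}\, u_n, \sqrt{a_n}\, v_n \rangle_{\ell^2}$$

and this series converges because the inner product of $\ell^2(\mathbb{N}, \mathbb{C})$ is well defined.

The sesquilinearity and conjugate symmetry of $\langle u, v \rangle_{\ell^2_a}$ follow directly from the analogous properties of the inner product of $\ell^2(\mathbb{N}, \mathbb{C})$. The only element to verify explicitly is definite positiveness. If $u \in \ell^2_a(\mathbb{N}, \mathbb{C})$, then $\langle u, u \rangle_{\ell^2_a} = \sum_{n \in \mathbb{N}} a_n |u_n|^2 \ge 0$, as it is a sum of non-negative terms. This formula also shows that $\langle u, u \rangle_{\ell^2_a} = 0 \iff a_n |u_n|^2 = 0$ for all $n \in \mathbb{N}$, but $a_n > 0$ for all $n \in \mathbb{N}$ by hypothesis, thus $|u_n|^2 = 0 \iff u_n = 0$ $\forall n \in \mathbb{N}$, that is, $u = 0_{\ell^2_a}$. $\square$

Exercise 4.3

Take $s \in \mathbb{R}$, $s > 0$ and:

$$H^s = \left\{ u = (u_n)_{n \in \mathbb{N}},\ u_n \in \mathbb{C}\ \forall n \in \mathbb{N} : \sum_{n \in \mathbb{N}} (1 + n^2)^s |u_n|^2 < +\infty \right\}$$

$H^s$ is a Hilbert space which is often encountered when solving differential equations using the Fourier transform.

1) Show that $H^s$ is a vector subspace of $\ell^2(\mathbb{N}, \mathbb{C})$.

2) Let $\varphi : H^s \times H^s \to \mathbb{C}$ be the application defined by:

$$\varphi(u, v) := \sum_{n \in \mathbb{N}} (1 + n^2)^s u_n \overline{v_n} \qquad \forall u, v \in H^s$$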


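presuming, for the moment, that the application is well defined, that is, the series converges. For any sequence $w = (w_n)_{n \in \mathbb{N}} \in H^s$, define the sequence $\tilde{w}$ as follows:

$$\tilde{w}_n = (1 + n^2)^{s/2} w_n \qquad \forall n \in \mathbb{N}$$

a) Show that $\tilde{w} \in \ell^2(\mathbb{N}, \mathbb{C})$ and it holds that:

$$\varphi(u, v) = \langle \tilde{u}, \tilde{v} \rangle_{\ell^2} \qquad \forall u, v \in H^s$$

where $\langle\ ,\ \rangle_{\ell^2}$ is the usual inner product of $\ell^2(\mathbb{N}, \mathbb{C})$.

b) Deduce that $\varphi$ is well defined on $H^s \times H^s$, then that it constitutes an inner product, noted $\varphi = \langle\ ,\ \rangle_{H^s}$.

3) We wish to show that $(H^s, \langle\ ,\ \rangle_{H^s})$ is a Hilbert space. To do this, let us fix an arbitrary Cauchy sequence $(u^m)_{m \in \mathbb{N}}$ in $H^s$.

a) Show that $(\tilde{u}^m)_{m \in \mathbb{N}}$ is Cauchy in $\ell^2(\mathbb{N}, \mathbb{C})$.

b) Deduce that $(\tilde{u}^m)_{m \in \mathbb{N}}$ converges in $\ell^2(\mathbb{N}, \mathbb{C})$ to a limit, which we shall note $\tilde{l}$.

c) Define the sequence $l = (l_n)_{n \in \mathbb{N}}$ by:

$$l_n = \frac{1}{(1 + n^2)^{s/2}}\, \tilde{l}_n \qquad \forall n \in \mathbb{N}$$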

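Show that $l$ belongs to $H^s$, that $(u^m)_{m \in \mathbb{N}}$ converges to $l$ in $H^s$, and conclude your proof.

Solution to Exercise 4.3

1) To show that $H^s$ is a vector subspace of $\ell^2(\mathbb{N}, \mathbb{C})$ we shall demonstrate, in order, that $u \in H^s \implies u \in \ell^2(\mathbb{N}, \mathbb{C})$, that $H^s \neq \emptyset$, and that $H^s$ is stable with respect to linear combinations of its elements.

For any sequence $u = (u_n)_{n \in \mathbb{N}} \subset \mathbb{C}$ it holds that $0 \le |u_n|^2 \le (1 + n^2)^s |u_n|^2$ for all $n \in \mathbb{N}$ (since $s > 0$), hence:

$$\sum_{n \in \mathbb{N}} |u_n|^2 \le \sum_{n \in \mathbb{N}} (1 + n^2)^s |u_n|^2 \underset{\text{def. of } H^s}{<} +\infty \implies u \in \ell^2(\mathbb{N}, \mathbb{C})$$

Evidently, $0_{\ell^2} \in H^s$, thus $H^s \neq \emptyset$. Finally, taking $\lambda \in \mathbb{C}$ and $u, v \in H^s$, then:

$$0 \le |u_n + \lambda v_n|^2 \le (|u_n| + |\lambda| |v_n|)^2 \le 2 (|u_n|^2 + |\lambda|^2 |v_n|^2)$$

where the final inequality draws on the fact that the moduli are real numbers and that, for all $a, b \in \mathbb{R}$, $0 \le (a - b)^2 = a^2 + b^2 - 2ab = 2a^2 - a^2 + 2b^2 - b^2 - 2ab$, so $a^2 + b^2 + 2ab \le 2a^2 + 2b^2$, that is, $(a + b)^2 \le 2(a^2 + b^2)$; writing $a = |u_n|$ and $b = |\lambda| |v_n|$, we obtain the final inequality from the previous formula. Now, with respect to the series, we can write:

$$\sum_{n \in \mathbb{N}} (1 + n^2)^s |u_n + \lambda v_n|^2 \le 2 \left( \sum_{n \in \mathbb{N}} (1 + n^2)^s |u_n|^2 + |\lambda|^2 \sum_{n \in \mathbb{N}} (1 + n^2)^s |v_n|^2 \right) < +\infty$$

as $u, v \in H^s$, thus $u + \lambda v \in H^s$ and so $H^s$ is a vector subspace of $\ell^2(\mathbb{N}, \mathbb{C})$.

2) a) $\tilde{w} \in \ell^2(\mathbb{N}, \mathbb{C})$ if $\sum_{n \in \mathbb{N}} |\tilde{w}_n|^2 < +\infty$, but:

$$\sum_{n \in \mathbb{N}} |\tilde{w}_n|^2 = \sum_{n \in \mathbb{N}} (1 + n^2)^s |w_n|^2 < +\infty$$

since $w \in H^s$, so $\tilde{w} \in \ell^2(\mathbb{N}, \mathbb{C})$. Now, taking $u, v \in H^s$:

$$\varphi(u, v) = \sum_{n \in \mathbb{N}} (1 + n^2)^s u_n \overline{v_n} = \sum_{n \in \mathbb{N}} (1 + n^2)^{s/2} u_n\, \overline{(1 + n^2)^{s/2} v_n} = \sum_{n \in \mathbb{N}} \tilde{u}_n \overline{\tilde{v}_n} = \langle \tilde{u}, \tilde{v} \rangle_{\ell^2}$$

b) We have:

$$|\varphi(u, v)| = \left| \sum_{n \in \mathbb{N}} (1 + n^2)^s u_n \overline{v_n} \right| \le \sum_{n \in \mathbb{N}} |(1 + n^2)^s u_n \overline{v_n}| = \sum_{n \in \mathbb{N}} |(1 + n^2)^{s/2} u_n\, (1 + n^2)^{s/2} \overline{v_n}| = \sum_{n \in \mathbb{N}} |\tilde{u}_n \tilde{v}_n| \underset{\text{Cauchy-Schwarz}}{\le} \|\tilde{u}\|_2 \|\tilde{v}\|_2 < +\infty$$

thus $\varphi(u, v)$ is well defined for all $u, v \in H^s$. By the fact that $\varphi(u, v) = \langle \tilde{u}, \tilde{v} \rangle_{\ell^2}$ and that $w \mapsto \tilde{w}$ is linear, we know that $\varphi$ is an inner product: it is Hermitian and sesquilinear, since $\langle\ ,\ \rangle_{\ell^2}$ possesses these properties. Regarding the definite positiveness, we simply note that for all $u \in H^s$, $\varphi(u, u) = 0$ implies $\sum_{n \in \mathbb{N}} (1 + n^2)^{s/2} u_n\, \overline{(1 + n^2)^{s/2} u_n} = \langle \tilde{u}, \tilde{u} \rangle_{\ell^2} = \|\tilde{u}\|_2^2 = 0$, that is, $\tilde{u} = 0$, that is, $(1 + n^2)^{s/2} u_n = 0 \iff u_n = 0$ $\forall n \in \mathbb{N}$. Hence $\varphi$ is a complex inner product on $H^s$, and this is noted $\varphi(u, v) = \langle u, v \rangle_{H^s}$.

3) a) To prove that if $(u^m)_{m \in \mathbb{N}}$ is an arbitrary Cauchy sequence in $H^s$ then $(\tilde{u}^m)_{m \in \mathbb{N}}$ is a Cauchy sequence in $\ell^2(\mathbb{N}, \mathbb{C})$, we write the Cauchy condition in its squared form:

$$\forall \varepsilon > 0\ \exists N_\varepsilon \in \mathbb{N} : m, k \ge N_\varepsilon \implies \|u^m - u^k\|^2_{H^s} < \varepsilon^2$$

but $\|u^m - u^k\|^2_{H^s} = \langle u^m - u^k, u^m - u^k \rangle_{H^s} \underset{(2.(a))}{=} \langle \widetilde{u^m - u^k}, \widetilde{u^m - u^k} \rangle_{\ell^2}$, and, componentwise in $n$:

$$\widetilde{u^m - u^k} = (1 + n^2)^{s/2} (u^m - u^k) = (1 + n^2)^{s/2} u^m - (1 + n^2)^{s/2} u^k = \tilde{u}^m - \tilde{u}^k$$

hence $\|u^m - u^k\|^2_{H^s} = \langle \widetilde{u^m - u^k}, \widetilde{u^m - u^k} \rangle_{\ell^2} = \langle \tilde{u}^m - \tilde{u}^k, \tilde{u}^m - \tilde{u}^k \rangle_{\ell^2} = \|\tilde{u}^m - \tilde{u}^k\|_2^2$, which implies that $(\tilde{u}^m)_{m \in \mathbb{N}}$ is a Cauchy sequence in $\ell^2(\mathbb{N}, \mathbb{C})$.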


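b) Given that $\ell^2(\mathbb{N}, \mathbb{C})$ is complete, the Cauchy sequence $(\tilde{u}^m)_{m \in \mathbb{N}}$ converges to an element in $\ell^2(\mathbb{N}, \mathbb{C})$ which we note $\tilde{l}$.

c) Let us consider the sequence $l = (l_n)_{n \in \mathbb{N}}$, $l_n = \tilde{l}_n / (1 + n^2)^{s/2}$, and show that it belongs to $H^s$ by calculating the square of its norm in $H^s$:

$$\|l\|^2_{H^s} = \sum_{n \in \mathbb{N}} (1 + n^2)^s |l_n|^2 = \sum_{n \in \mathbb{N}} (1 + n^2)^s \frac{|\tilde{l}_n|^2}{(1 + n^2)^s} = \sum_{n \in \mathbb{N}} |\tilde{l}_n|^2 < +\infty$$

so $l \in H^s$. Now, let us show that $(u^m)_{m \in \mathbb{N}}$ converges to $l$: using the result from point (2a), we have $\langle u^m - l, u^m - l \rangle_{H^s} = \langle \widetilde{u^m - l}, \widetilde{u^m - l} \rangle_{\ell^2}$. Since we have also seen that $\widetilde{u^m - l} = \tilde{u}^m - \tilde{l}$, it holds that $\|u^m - l\|^2_{H^s} = \|\tilde{u}^m - \tilde{l}\|_2^2 \to 0$ as $m \to +\infty$, by (3b), that is, $(u^m)_{m \in \mathbb{N}}$ converges to $l$ in $H^s$. We have thus demonstrated that the arbitrary Cauchy sequence $(u^m)_{m \in \mathbb{N}}$ converges inside $H^s$, that is, $H^s$ constitutes a Hilbert space. $\square$

4.4.2. $L^\infty$ and $\ell^\infty$ spaces

The case where $p = \infty$ has been deliberately excluded up to this point, and will be examined separately here. Let $(X, \mathcal{A}, \mu)$ be a measure space, as before, and let $\mathbb{K} = \mathbb{R}$ or $\mathbb{C}$. We begin by defining the space:

$$\mathcal{L}^\infty(X, \mathcal{A}, \mu) = \{ f : X \to \mathbb{K} : \exists M \in \mathbb{R},\ M \ge 0, \text{ such that } |f(x)| \le M \text{ a.e.} \}$$

The elements of $\mathcal{L}^\infty(X, \mathcal{A}, \mu)$ are known as essentially bounded functions, that is, functions which are bounded on the complement of a null measure set w.r.t. $\mu$. As in the case of $L^p$ spaces, we need to introduce the equivalence relation:

$$f, g \in \mathcal{L}^\infty(X, \mathcal{A}, \mu), \quad f \sim g \text{ if } f = g \text{ a.e.}$$

to make the quotient space:

$$L^\infty(X, \mathcal{A}, \mu) = \mathcal{L}^\infty(X, \mathcal{A}, \mu)/\!\sim$$

a normed vector space with norm given by:

$$\|f\|_\infty = \inf \{ M \ge 0 : |f(x)| \le M \text{ a.e.} \}$$

which we shall call $\operatorname{ess\,sup}(f)$, read as the essential supremum of $f$, which, by definition, satisfies: $|f(x)| \le \|f\|_\infty$ a.e. for all $f \in L^\infty(X, \mathcal{A}, \mu)$.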


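The symbol $\infty$ has its origins in the fact that if $1 \le p < +\infty$ and $f \in L^p \cap L^\infty$, then:

$$\|f\|_\infty = \lim_{p \to +\infty} \|f\|_p$$

As in the case of $L^p$ spaces, the case of continuous functions requires further clarification. If a continuous function $f$ is such that $|f(x_0)| > M$ at some point $x_0$, then, by continuity, there exists a neighborhood of $x_0$ of positive radius in which $|f| > M$; since this neighborhood has non-null measure, $f$ is not essentially bounded by $M$. Thus, a continuous and essentially bounded function is actually a bounded function in the usual sense. We also define:

$$\ell^\infty(\mathbb{N}, \mathbb{K}) = L^\infty(\mathbb{N}, \mathcal{P}(\mathbb{N}), \mu_{\text{counting}}) = \{ (x_n)_{n \in \mathbb{N}} : x_n \in \mathbb{K}\ \forall n \in \mathbb{N},\ \exists M \ge 0 : |x_n| \le M\ \forall n \in \mathbb{N} \}$$

that is, $\ell^\infty$ is the space of bounded sequences (a similar definition is obtained if we exchange $\mathbb{N}$ for $\mathbb{Z}$). $\ell^\infty(\mathbb{N}, \mathbb{K})$ is a normed space with:

$$\|(x_n)_{n \in \mathbb{N}}\|_\infty = \sup_{n \in \mathbb{N}} |x_n|$$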
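The limit relation between $\|\ \|_p$ and $\|\ \|_\infty$ can be observed on a finite sequence, viewed as an element of every $\ell^p$ space. This is only an illustrative sketch: the sample vector and the list of exponents are arbitrary choices, and the helper name `lp_norm` is hypothetical.

```python
def lp_norm(x, p):
    """l^p norm of a finite sequence, computed relative to the largest modulus
    so that large exponents p do not overflow."""
    moduli = [abs(t) for t in x]
    m = max(moduli)
    if m == 0.0:
        return 0.0
    return m * sum((t / m) ** p for t in moduli) ** (1.0 / p)


x = [0.5, -3.0, 2.0, 1.0, -0.25]
sup_norm = max(abs(t) for t in x)                       # the l^infinity norm, here 3.0
norms = [lp_norm(x, p) for p in (1, 2, 4, 8, 16, 64, 256)]
```

With these values, the entries of `norms` decrease towards `sup_norm` as $p$ grows, in agreement with $\|x\|_\infty = \lim_{p \to +\infty} \|x\|_p$.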

THEOREM 4.15.– $(L^\infty(X, \mathcal{A}, \mu), \|\ \|_\infty)$ and $(\ell^\infty(\mathbb{N}, \mathbb{K}), \|\ \|_\infty)$ are Banach spaces.

PROOF.– Let us set out the proof for $L^\infty(X, \mathcal{A}, \mu)$; the fact that $\ell^\infty(\mathbb{N}, \mathbb{K})$ is a Banach space will then be an automatic implication. We must show that, if $(f_n)_{n \in \mathbb{N}}$ is a Cauchy sequence of elements of $L^\infty(X, \mathcal{A}, \mu)$, then it converges to an element in $L^\infty(X, \mathcal{A}, \mu)$. By the definition of a Cauchy sequence, we have:

$$\forall \varepsilon > 0\ \exists N_\varepsilon > 0 : n, m \ge N_\varepsilon \implies \|f_n - f_m\|_\infty < \varepsilon \qquad [4.24]$$

This will be used later. Now, let us consider the sets of points where the functions in the sequence behave in a "peculiar" manner:

$$A_k = \{ x \in X : |f_k(x)| > \|f_k\|_\infty \}, \qquad B_{n,m} = \{ x \in X : |f_n(x) - f_m(x)| > \|f_n - f_m\|_\infty \}$$

By the definition of the norm of $L^\infty(X, \mathcal{A}, \mu)$, $\mu(A_k) = \mu(B_{n,m}) = 0$ and:

$$\forall x \in A_k^c : |f_k(x)| \le \|f_k\|_\infty, \qquad \forall x \in B_{n,m}^c : |f_n(x) - f_m(x)| \le \|f_n - f_m\|_\infty$$


To eliminate the dependency on the indices $k, n, m$, we construct the set:

$$E = \bigcup_{k \in \mathbb{N}} A_k \cup \bigcup_{n, m \in \mathbb{N}} B_{n,m}$$

which has a null measure, $\mu(E) = 0$, as a countable union of null measure sets. Now, we observe that:

$$\forall x \in E^c,\ \forall n, m \ge N_\varepsilon : |f_n(x) - f_m(x)| \le \|f_n - f_m\|_\infty \underset{[4.24]}{<} \varepsilon \qquad [4.25]$$

so $(f_n(x))_{n \in \mathbb{N}}$ is a Cauchy sequence of elements of $\mathbb{K}$, which is complete; thus, there exists a pointwise limit $f(x) = \lim_{n \to +\infty} f_n(x)$.

Passing to the limit $n \to +\infty$ in equation [4.25], we obtain, $\forall \varepsilon > 0$:

$$\forall x \in E^c,\ \forall m \ge N_\varepsilon : \left| \lim_{n \to +\infty} f_n(x) - f_m(x) \right| = |f(x) - f_m(x)| \le \varepsilon$$

which is the definition of uniform convergence of the sequence $(f_n)_{n \in \mathbb{N}} \subset L^\infty$ to $f$ on $E^c$. A standard result of calculus guarantees that if a sequence of bounded functions converges uniformly to a function, then the limit function is also bounded; in our case, this implies that $f$ is bounded on $E^c$.

The final step is to extend the definition of $f$ to a function $\tilde{f}$ defined on all $X$ (since the elements of $L^\infty(X, \mathcal{A}, \mu)$ are defined on all $X$) while retaining the property of essential boundedness. This is trivial, as we simply take:

$$\tilde{f}(x) = \begin{cases} f(x) & \text{if } x \in E^c \\ 0 & \text{if } x \in E \end{cases}$$

Since $\mu(E) = 0$, $\tilde{f} : X \to \mathbb{K}$ is the representative of an equivalence class of $L^\infty(X, \mathcal{A}, \mu)$ to which the Cauchy sequence $(f_n)_{n \in \mathbb{N}}$ converges; this concludes our proof. $\square$

Exercise 4.4

Consider a sequence $a = (a_k)_{k \in \mathbb{Z}}$ and, for all $u \in \ell^\infty(\mathbb{Z}, \mathbb{C})$, let $a * u$ be the bilateral sequence defined for $k \in \mathbb{Z}$ by:

$$(a * u)_k = \sum_{m \in \mathbb{Z}} a_m u_{k-m}$$

Let us take, for all $f \in \ell^\infty(\mathbb{Z}, \mathbb{C})$, $T(u) := a * u + f$.


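1) For the purposes of this question, we take $a = \delta_1$, that is, the sequence defined by $a_1 = 1$ and $a_j = 0$ if $j \neq 1$. Calculate $a * u$ as a function of $u$.

2) First, suppose that $a = (a_k)_{k \in \mathbb{Z}} \in \ell^1$ verifies $\|a\|_1 = \sum_{k \in \mathbb{Z}} |a_k| < 1$.

a) Show that $(a * u)_k$ is well defined for all $k \in \mathbb{Z}$ and that $a * u \in \ell^\infty$.

b) Show that $T : \ell^\infty(\mathbb{Z}, \mathbb{C}) \to \ell^\infty(\mathbb{Z}, \mathbb{C})$ is a contraction.

c) Deduce that there exists a unique solution $u \in \ell^\infty(\mathbb{Z}, \mathbb{C})$ to the equation $T(u) = u$.

3) Now, let us suppose that $a = (a_k)_{k \in \mathbb{Z}} \in \ell^2$ verifies $\|a\|_2 := \left( \sum_{k \in \mathbb{Z}} |a_k|^2 \right)^{1/2} < 1$.

a) Using an example, show that we can have $a \notin \ell^1(\mathbb{Z}, \mathbb{C})$.

b) Show that, for $u \in \ell^2(\mathbb{Z}, \mathbb{C})$, $(a * u)_k$ is well defined for all $k \in \mathbb{Z}$ and that $a * u \in \ell^\infty(\mathbb{Z}, \mathbb{C})$.

c) Deduce that, for all $u \in \ell^2(\mathbb{Z}, \mathbb{C})$, $T(u) \in \ell^\infty(\mathbb{Z}, \mathbb{C})$ and that if $u, v \in \ell^2(\mathbb{Z}, \mathbb{C})$, $u \neq v$, then $\|T(u) - T(v)\|_\infty < \|u - v\|_2$.

d) Now, take $a = \frac{1}{2} \delta_1$ and let $f = 1$ be the constant sequence $f_j = 1$ for all $j \in \mathbb{Z}$. Calculate $T(u)$ as a function of $u$ and determine $\lim_{k \to +\infty} (T(u))_k$.

e) Deduce that there is no $u \in \ell^2(\mathbb{Z}, \mathbb{C})$ such that $T(u) = u$. Does this contradict the fixed-point theorem? Hint: There is no need to determine $u$ to answer this question.

f) Determine $u \in \ell^\infty(\mathbb{Z}, \mathbb{C})$ such that $T(u) = u$.

Solution to Exercise 4.4

1) By definition:

$$(\delta_1 * u)_k = \sum_{m \in \mathbb{Z}} \delta_{m,1} u_{k-m} = u_{k-1}$$

2) a) By direct calculation:

$$|(a * u)_k| = \left| \sum_{m \in \mathbb{Z}} a_m u_{k-m} \right| \le \sum_{m \in \mathbb{Z}} |a_m u_{k-m}| \le \|u\|_\infty \sum_{m \in \mathbb{Z}} |a_m| = \|u\|_\infty \|a\|_1 < +\infty$$

since $a \in \ell^1(\mathbb{Z}, \mathbb{C})$ and $u \in \ell^\infty(\mathbb{Z}, \mathbb{C})$. Furthermore, as the majorization is independent of $k$, $\|a * u\|_\infty = \sup_{k \in \mathbb{Z}} |(a * u)_k| \le \|u\|_\infty \|a\|_1 < +\infty$.

b) Once again, by direct calculation, we have $\|T(u) - T(v)\|_\infty = \|a * u + f - a * v - f\|_\infty = \|a * u - a * v\|_\infty$, but $u \mapsto a * u$ is linear, so from what we saw in the previous question: $\|T(u) - T(v)\|_\infty = \|a * (u - v)\|_\infty \le \|a\|_1 \|u - v\|_\infty$. Since $\|a\|_1 < 1$ by hypothesis, $T$ is a contraction.

c) Since $(\ell^\infty(\mathbb{Z}, \mathbb{C}), \|\ \|_\infty)$ is a complete normed (and therefore metric) space, the fixed-point theorem gives us the existence of a unique element $\bar{u} \in \ell^\infty(\mathbb{Z}, \mathbb{C})$ such that $T(\bar{u}) = \bar{u}$, that is, $\bar{u} = a * \bar{u} + f$.

3) a) The simplest example of a sequence $a \in \ell^2(\mathbb{Z}, \mathbb{C})$ such that $a \notin \ell^1(\mathbb{Z}, \mathbb{C})$ is probably:

$$a_k = \begin{cases} 0 & k \le 0 \\ \frac{1}{k} & \text{otherwise} \end{cases}$$

In this case, $\sum_{k \in \mathbb{Z}} |a_k| = \sum_{k=1}^{\infty} \frac{1}{k}$, the harmonic series, which we know to be divergent, so $a \notin \ell^1(\mathbb{Z}, \mathbb{C})$. On the other hand, $\sum_{k \in \mathbb{Z}} |a_k|^2 = \sum_{k=1}^{\infty} \frac{1}{k^2}$, which is convergent, that is, $a \in \ell^2(\mathbb{Z}, \mathbb{C})$. (Since $\sum_{k=1}^{\infty} \frac{1}{k^2} = \frac{\pi^2}{6} > 1$, the hypothesis $\|a\|_2 < 1$ is obtained by rescaling, for instance replacing $a$ with $\frac{1}{2} a$, which does not affect either conclusion.)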

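b) Using the given hypotheses, for all fixed $k \in \mathbb{Z}$, the Cauchy-Schwarz inequality can be applied to give:

$$\sum_{m \in \mathbb{Z}} |a_m u_{k-m}| \le \left( \sum_{m \in \mathbb{Z}} |a_m|^2 \right)^{1/2} \left( \sum_{m \in \mathbb{Z}} |u_{k-m}|^2 \right)^{1/2} = \|a\|_2 \left( \sum_{n \in \mathbb{Z}} |u_n|^2 \right)^{1/2} = \|a\|_2 \|u\|_2$$

using the change of variable $n = k - m$, with fixed $k \in \mathbb{Z}$ and $m \in \mathbb{Z}$, thus $n \in \mathbb{Z}$. Since $|(a * u)_k| = \left| \sum_{m \in \mathbb{Z}} a_m u_{k-m} \right| \le \sum_{m \in \mathbb{Z}} |a_m u_{k-m}| < +\infty$ for all fixed $k \in \mathbb{Z}$, the sequence $a * u$ is well defined. Furthermore, as in question 2a, since the majorization does not depend on $k$, $\|a * u\|_\infty = \sup_{k \in \mathbb{Z}} |(a * u)_k| \le \|a\|_2 \|u\|_2 < +\infty$.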

c) $T(u) = a * u + f$ is the sum of two elements of $\ell^\infty(\mathbb{Z}, \mathbb{C})$ ($f$ by hypothesis, and $a * u$ as demonstrated above), so $T(u) \in \ell^\infty(\mathbb{Z}, \mathbb{C})$. Once again, $u \mapsto a * u$ is clearly linear, so, using the result from the previous question, for $u \neq v$:

$$\|T(u) - T(v)\|_\infty = \|a * u - a * v\|_\infty = \|a * (u - v)\|_\infty \le \|a\|_2 \|u - v\|_2 < \|u - v\|_2$$

since, by hypothesis, $\|a\|_2 < 1$.

d) Using the result from question 1, we have $(T(u))_k = \frac{u_{k-1}}{2} + 1$. Moreover, as $u \in \ell^2(\mathbb{Z}, \mathbb{C})$, $\sum_{k \in \mathbb{Z}} |u_k|^2$ converges, so we necessarily have $u_k \to 0$ as $k \to +\infty$, which implies $(T(u))_k \to 1$ as $k \to +\infty$.

e) Taking $T(u) = u$, we would have $u_k = \frac{u_{k-1}}{2} + 1$, and, taking the limit for $k$ which tends to infinity on both sides, we would obtain the absurd result $0 = 1$. There is no contradiction with the fixed-point theorem, since the inequality $\|T(u) - T(v)\|_\infty < \|u - v\|_2$ does not involve $\|T(u) - T(v)\|_2$. Evidently, as there is no fixed point, $T$ cannot be a contraction on $\ell^2(\mathbb{Z}, \mathbb{C})$; in fact, since $(T(u))_k \to 1$, $T(u)$ does not even belong to $\ell^2(\mathbb{Z}, \mathbb{C})$.

f) A sequence $u \in \ell^\infty(\mathbb{Z}, \mathbb{C})$ such that $T(u) = u$ is a bounded sequence satisfying $u_k = \frac{u_{k-1}}{2} + 1$ (this is an "arithmetico-geometric" sequence). Taking $u_k = v_k + \alpha$, with unknown $v_k$ and $\alpha$, then $v_k + \alpha = \frac{v_{k-1} + \alpha}{2} + 1 = \frac{v_{k-1}}{2} + 1 + \frac{\alpha}{2}$, that is, $v_k = \frac{v_{k-1}}{2} + 1 - \frac{\alpha}{2}$; thus, if we take $\alpha = 2$, we obtain a geometric sequence $v_k = \frac{v_{k-1}}{2}$; by a standard result for geometric sequences, $v_k = 2^{-k} v_0$. Furthermore, $v_0 = u_0 - \alpha$ and $\alpha = 2$, hence $v_0 = u_0 - 2$, implying that $u_k = 2^{-k} (u_0 - 2) + 2$. For all $k \ge 0$, $2^{-k} \le 1$, but for $k < 0$, $2^{-k}$ is not bounded, so to obtain a bounded $u_k$, we need to eliminate its factor, that is, to impose $u_0 - 2 = 0$. Finally, we see that the only sequence $u \in \ell^\infty(\mathbb{Z}, \mathbb{C})$ such that $T(u) = u$, that is, the only fixed point for the contraction $T : \ell^\infty(\mathbb{Z}, \mathbb{C}) \to \ell^\infty(\mathbb{Z}, \mathbb{C})$, is the constant sequence of 2, $u_k = 2$ for all $k \in \mathbb{Z}$. $\square$
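The fixed point found in question 3f can also be reached by iterating $T$. The sketch below is an illustration under a simplifying assumption: a bilateral sequence cannot be stored in full, so only periodic sequences are considered (they are bounded, hence in $\ell^\infty(\mathbb{Z}, \mathbb{C})$, and $T$ maps them to periodic sequences), and one period is stored in a list. The function names `T` and `sup_dist` are hypothetical.

```python
import random


def T(u):
    """(T u)_k = u_{k-1} / 2 + 1, that is, a = (1/2) delta_1 and f = 1.

    u holds one period of a periodic bilateral sequence, so the shift
    k -> k - 1 becomes a circular shift of the list.
    """
    shifted = [u[-1]] + u[:-1]          # shifted[k] == u[k - 1]
    return [0.5 * s + 1.0 for s in shifted]


def sup_dist(u, v):
    """Sup-norm distance between two periodic sequences with the same period."""
    return max(abs(a - b) for a, b in zip(u, v))


random.seed(1)
u = [random.uniform(-5.0, 5.0) for _ in range(12)]
v = [random.uniform(-5.0, 5.0) for _ in range(12)]

# One application of T halves the sup-distance between two sequences.
ratio = sup_dist(T(u), T(v)) / sup_dist(u, v)

# Repeated application drives any starting sequence to the constant sequence 2.
for _ in range(60):
    u = T(u)
```

Here `ratio` equals $\|a\|_1 = 1/2$ up to rounding, and after the loop every entry of `u` is numerically equal to $2$.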

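4.4.3. Inclusion relationships between $\ell^p$ spaces

Let us introduce the following functional space:

$$\ell^0(\mathbb{N}, \mathbb{K}) \equiv \ell^{\mathrm{fin}}(\mathbb{N}, \mathbb{K}) = \{ (x_n)_{n \in \mathbb{N}} \subset \mathbb{K},\ \exists N \in \mathbb{N} : x_n = 0\ \forall n \ge N \} \qquad [4.26]$$

that is, the space of sequences with a finite number of elements $\neq 0$. Clearly, $\ell^0(\mathbb{N}, \mathbb{K}) \subset \ell^p(\mathbb{N}, \mathbb{K})$ $\forall p \ge 1$.

THEOREM 4.16.– Taking $p, q \in \mathbb{R}$, $1 \le p \le q < \infty$, then:

$$\ell^0(\mathbb{N}, \mathbb{K}) \subset \ell^1(\mathbb{N}, \mathbb{K}) \subset \ldots \subset \ell^p(\mathbb{N}, \mathbb{K}) \subset \ldots \subset \ell^q(\mathbb{N}, \mathbb{K}) \subset \ldots \subset \ell^\infty(\mathbb{N}, \mathbb{K})$$

PROOF.– Given that $\ell^0(\mathbb{N}, \mathbb{K}) \subset \ell^1(\mathbb{N}, \mathbb{K})$, the demonstration that $\ell^p(\mathbb{N}, \mathbb{K}) \subset \ell^\infty(\mathbb{N}, \mathbb{K})$ $\forall\, 1 \le p < \infty$ is almost trivial since:

$$(x_n)_{n \in \mathbb{N}} \in \ell^p(\mathbb{N}, \mathbb{K}) \iff \sum_{n \in \mathbb{N}} |x_n|^p < +\infty$$

which gives us $|x_n| \to 0$ as $n \to +\infty$, that is, $|x_n|$ is bounded and thus $(x_n)_{n \in \mathbb{N}} \in \ell^\infty(\mathbb{N}, \mathbb{K})$.

It only remains to prove that $\ell^p(\mathbb{N}, \mathbb{K}) \subset \ell^q(\mathbb{N}, \mathbb{K})$ if $1 \le p \le q$: as $|x_n| \to 0$ as $n \to +\infty$, then, in particular, $\exists N \in \mathbb{N}$ such that $|x_n| \le 1$ $\forall n \ge N$, thus $|x_n|^q \le |x_n|^p$ $\forall n \ge N$, which implies that:

$$\sum_{n \ge N} |x_n|^q \le \sum_{n \ge N} |x_n|^p$$

The convergence of $\sum_{n \in \mathbb{N}} |x_n|^p$ therefore implies the convergence of $\sum_{n \in \mathbb{N}} |x_n|^q$, that is, $\ell^p \subset \ell^q$. $\square$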
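The key step of the proof, the termwise comparison $|x_n|^q \le |x_n|^p$ once $|x_n| \le 1$, can be sketched on a truncated sequence. The choices below are assumptions made purely for illustration: the sequence $x_n = 2/(n+1)$, the exponents $p = 2$ and $q = 3$, and the truncation to the first 100,000 terms.

```python
p, q = 2.0, 3.0                      # any exponents with 1 <= p <= q
x = [2.0 / (n + 1.0) for n in range(100_000)]   # x_n -> 0, but x_0 = 2 > 1

# Smallest N such that |x_n| <= 1 for all n >= N (within the truncation).
above_one = [n for n, t in enumerate(x) if abs(t) > 1.0]
N = above_one[-1] + 1 if above_one else 0

# Tails of the two series from index N onwards.
tail_p = sum(abs(t) ** p for t in x[N:])
tail_q = sum(abs(t) ** q for t in x[N:])
```

With these choices $N = 1$ and `tail_q` does not exceed `tail_p`, so the convergence of the $p$-series controls that of the $q$-series, as in the proof.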

REMARK.– The completeness of an infinite-dimensional metric space depends on the metric selected for the space. To verify this statement, let us examine the completeness of $(\ell^1, \|\ \|_\infty)$, that is, $\ell^1$ interpreted as a subspace of $\ell^\infty$ and equipped with the norm of this latter space.

Exercise 4.5

Show that $(\ell^1, \|\ \|_\infty)$ is not complete.

Solution to Exercise 4.5

Since $\ell^1 \subset \ell^\infty$, to solve this problem we must prove that $\ell^1$ is not a closed subset of $\ell^\infty$ with respect to the norm $\|\ \|_\infty$, that is, there exists at least one sequence of elements of $\ell^1$ which converges in $\ell^\infty$ (and so is Cauchy) to a limit outside $\ell^1$.

The elements of $\ell^1$ are sequences $x \equiv (x_n)_{n \in \mathbb{N}}$, so a sequence of elements of $\ell^1$ is a sequence of sequences. For all fixed $m \in \mathbb{N}$, we shall note this sequence $x^m \equiv (x^m_n)_{n \in \mathbb{N}}$.

Now, let us verify that the sequence of elements of $\ell^1$ defined by:

$$x^m_n = \begin{cases} 0 & \text{if } n = 0 \\ \frac{1}{n} & \text{if } 1 \le n \le m \\ 0 & \text{if } n > m \end{cases}$$

converges in $\ell^\infty \setminus \ell^1$. For all fixed $m \in \mathbb{N}$, the sequence $x^m$ is explicitly defined as follows:

$$\left( 0, 1, \frac{1}{2}, \ldots, \frac{1}{m}, 0, 0, \ldots \right)$$

which shows that $x^m \in \ell^1$ for all fixed $m \in \mathbb{N}$. Now, consider the sequence $x^* \equiv (x^*_n)_{n \in \mathbb{N}}$ defined by:

$$x^*_n = \begin{cases} 0 & \text{if } n = 0 \\ \frac{1}{n} & \text{if } n \ge 1 \end{cases}$$

Clearly, $(x^*_n)_{n \in \mathbb{N}}$ is bounded, and thus belongs to $\ell^\infty$, but $\|(x^*_n)_{n \in \mathbb{N}}\|_1 = \sum_{n=1}^{\infty} \frac{1}{n} = +\infty$, so $(x^*_n)_{n \in \mathbb{N}} \notin \ell^1$. If we can show that $(x^m)_{m \in \mathbb{N}}$ converges to $x^*$ in norm $\|\ \|_\infty$, this will complete our proof.


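To do this, we calculate:

\[ \|x^m - x^*\|_\infty = \sup_{n\in\mathbb{N}} |x^m_n - x^*_n| = \sup_{n > m} \frac{1}{n} \]

Up to $n = m$, the difference $x^m_n - x^*_n$ is null, but when $n > m$, the difference becomes $|0 - \frac{1}{n}| = \frac{1}{n}$. By the definition of sup, $\sup_{n>m} \frac{1}{n} = \sup\{\frac{1}{m+1}, \frac{1}{m+2}, \dots\} = \frac{1}{m+1}$ and thus:

\[ \|x^m - x^*\|_\infty = \frac{1}{m+1} \xrightarrow[m \to +\infty]{} 0 \]

$\square$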
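The computation in this solution can be checked with exact rational arithmetic. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, and the supremum is taken over a finite window $0 \le n \le n_{\max}$ with $n_{\max} > m$, which is enough here because $1/n$ is decreasing.

```python
from fractions import Fraction


def x_m(m, n):
    """n-th term of the truncated sequence x^m (1/n for 1 <= n <= m, else 0)."""
    return Fraction(1, n) if 1 <= n <= m else Fraction(0)


def x_star(n):
    """n-th term of the limit sequence x* (1/n for n >= 1, and 0 for n = 0)."""
    return Fraction(1, n) if n >= 1 else Fraction(0)


def sup_distance(m, n_max):
    """Maximum of |x^m_n - x*_n| over 0 <= n <= n_max (assumes n_max > m)."""
    return max(abs(x_m(m, n) - x_star(n)) for n in range(n_max + 1))


def l1_norm_x_m(m):
    """Exact l^1 norm of x^m, i.e. the m-th harmonic number."""
    return sum(x_m(m, n) for n in range(m + 1))


for m in (1, 10, 100):
    # The sup-norm distance equals 1/(m+1), while the l^1 norms keep growing.
    print(m, sup_distance(m, 10 * m + 10), float(l1_norm_x_m(m)))
```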

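4.4.4. Inclusion relationships between $L^p$ spaces

In general, there are no inclusion relationships between $L^p(X, \mathcal{A}, \mu)$ spaces. For instance, consider $L^1(\mathbb{R})$, $L^2(\mathbb{R})$ and the following functions:

\[ f(x) = \begin{cases} x^{-2/3} & \text{if } 0 < x < 1 \\ 0 & \text{otherwise} \end{cases}, \qquad g(x) = \begin{cases} x^{-2/3} & \text{if } x > 1 \\ 0 & \text{otherwise} \end{cases} \]

Clearly, $f \in L^1(\mathbb{R})$, but $f \notin L^2(\mathbb{R})$, since$^{10}$:

\[ \int_{\mathbb{R}} |f(x)|\,dx = \int_0^1 \frac{1}{x^{2/3}}\,dx < +\infty, \qquad \int_{\mathbb{R}} |f(x)|^2\,dx = \int_0^1 \frac{1}{x^{4/3}}\,dx = +\infty \]

and $g \in L^2(\mathbb{R})$, but $g \notin L^1(\mathbb{R})$, because:

\[ \int_{\mathbb{R}} |g(x)|\,dx = \int_1^{+\infty} \frac{1}{x^{2/3}}\,dx = +\infty, \qquad \int_{\mathbb{R}} |g(x)|^2\,dx = \int_1^{+\infty} \frac{1}{x^{4/3}}\,dx < +\infty \]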
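The four integrals can be made concrete with the elementary antiderivative of $x^{-s}$. The sketch below is an illustration only: `int_power` is a hypothetical helper, and the cut-offs $10^{-12}$ and $10^{12}$ are arbitrary stand-ins for the limits $0$ and $+\infty$.

```python
def int_power(a, b, s):
    """Exact integral of x**(-s) over [a, b], for 0 < a < b and s != 1."""
    return (b ** (1.0 - s) - a ** (1.0 - s)) / (1.0 - s)


tiny, huge = 1e-12, 1e12

# f(x) = x^(-2/3) on (0, 1): |f| is integrable, |f|^2 is not.
print(int_power(tiny, 1.0, 2.0 / 3.0))  # approaches 3 as tiny -> 0
print(int_power(tiny, 1.0, 4.0 / 3.0))  # blows up as tiny -> 0

# g(x) = x^(-2/3) on (1, +inf): |g| is not integrable, |g|^2 is.
print(int_power(1.0, huge, 2.0 / 3.0))  # blows up as huge -> +inf
print(int_power(1.0, huge, 4.0 / 3.0))  # approaches 3 as huge -> +inf
```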

Inclusions among $L^p$ spaces can be obtained by imposing additional conditions. Since the spaces $L^1(\mathbb{R})$ and $L^2(\mathbb{R})$ are particularly important, we shall examine the conditions used for these spaces, which are often satisfied in practical applications, in Theorem 4.17.

THEOREM 4.17.– The following statements are true:

1) if $f \in L^1(\mathbb{R})$, with $f$ bounded, then $f \in L^2(\mathbb{R})$;

2) if $f \in L^2(\mathbb{R})$, with $f$ null outside of a finite interval, then $f \in L^1(\mathbb{R})$.

10 Recall that, if $a > 0$ and $b > 0$, then $\int_0^a \frac{1}{x^\alpha}\,dx < +\infty$ and $\int_b^{+\infty} \frac{1}{x^\beta}\,dx < +\infty$ if and only if $\alpha < 1$ and $\beta > 1$.

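PROOF.–

1) If $f$ is in $L^1(\mathbb{R})$ and is bounded, say $|f(x)| \le M$ $\forall x \in \mathbb{R}$, $M \ge 0$, then:

\[ \int_{\mathbb{R}} |f(x)|^2\,dx = \int_{\mathbb{R}} |f(x)| \cdot |f(x)|\,dx \le M \int_{\mathbb{R}} |f(x)|\,dx = M \|f\|_1 < +\infty \]

thus $f \in L^2(\mathbb{R})$.

2) If $f$ is in $L^2(\mathbb{R})$ and is null outside of a finite interval, say $f(x) = 0$ $\forall x \notin [a, b]$, then, writing $\mathbb{1}(x) = 1$ $\forall x \in [a, b]$ and using the Cauchy-Schwarz inequality:

\[ \int_{\mathbb{R}} |f(x)|\,dx = \int_{[a,b]} |f(x)|\,dx = \int_{[a,b]} \mathbb{1}(x) \cdot |f(x)|\,dx = \langle \mathbb{1}, |f| \rangle_{L^2[a,b]} \le \left(\int_{[a,b]} dx\right)^{1/2} \left(\int_{[a,b]} |f(x)|^2\,dx\right)^{1/2} = \sqrt{b - a}\,\|f\|_2 < +\infty \]

so $f \in L^1(\mathbb{R})$. $\square$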

Statement 1 remains valid for all $f \in L^1(\mathbb{R}^n)$, $n \ge 1$, while statement 2 remains valid if we replace an interval with a finite-measure subset of $\mathbb{R}^n$. More generally, in the case where $\mu(X) < +\infty$, it is possible to create a highly useful string of inclusions.

THEOREM 4.18.– If $(X, \mathcal{A}, \mu)$ is a measure space with a finite measure, $\mu(X) < +\infty$, and if $q > p > 1$, then:

\[ L^\infty(X, \mathcal{A}, \mu) \subset \dots \subset L^q(X, \mathcal{A}, \mu) \subset \dots \subset L^p(X, \mathcal{A}, \mu) \subset \dots \subset L^1(X, \mathcal{A}, \mu) \]

PROOF.– First, let us verify the thesis for $L^\infty$, then for $L^1$ and $L^2$ (which provide a clearer illustration of the approach used), and finally for $L^p$ and $L^q$.

If $f \in L^\infty(X, \mathcal{A}, \mu)$, then $\int_X |f|^p\,d\mu \le \int_X \|f\|_\infty^p\,d\mu = \|f\|_\infty^p\,\mu(X) < +\infty$, hence $f \in L^p(X, \mathcal{A}, \mu)$.

If $f \in L^2(X, \mathcal{A}, \mu)$, then, by the Hölder inequality [4.19]:

\[ \int_X |f|\,d\mu = \int_X |1 \cdot f|\,d\mu \le \left(\int_X 1^2\,d\mu\right)^{1/2} \left(\int_X |f|^2\,d\mu\right)^{1/2} = \sqrt{\mu(X)}\,\|f\|_2 < +\infty \]

hence $f \in L^1(X, \mathcal{A}, \mu)$.


Taking $E = \{x \in X : |f(x)| \ge 1\}$ and $F = \{x \in X : |f(x)| \le 1\}$, then $X = E \cup F$, and let $p < q$. Then $|f(x)|^p \le |f(x)|^q$ $\forall x \in E$ and $|f(x)|^p \le 1$ $\forall x \in F$. Thus, if $f \in L^q$:

\[ \int_X |f(x)|^p\,d\mu \le \int_E |f(x)|^q\,d\mu + \int_F 1\,d\mu \le \int_X |f(x)|^q\,d\mu + \int_X 1\,d\mu = \|f\|_q^q + \mu(X) < +\infty \]

that is, $f \in L^p$. $\square$
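The two bounds used in this proof can be checked on a toy finite measure space. The sketch below is only an illustration: the four-point space, its weights `mu` and the function values `f` are hypothetical data, and `lp_norm` is a helper introduced here for the weighted norm.

```python
import math


def lp_norm(values, weights, p):
    """Weighted L^p norm on a finite measure space: (sum |f_i|^p mu_i)^(1/p)."""
    return sum(abs(v) ** p * w for v, w in zip(values, weights)) ** (1.0 / p)


mu = [0.5, 0.25, 2.0, 1.25]   # assumed point masses, mu(X) = 4.0
f = [3.0, -0.2, 0.5, -7.0]    # assumed function values
total_mass = sum(mu)

# Hölder bound: ||f||_1 <= sqrt(mu(X)) * ||f||_2
print(lp_norm(f, mu, 1), math.sqrt(total_mass) * lp_norm(f, mu, 2))

# Splitting bound for p < q: ||f||_p^p <= ||f||_q^q + mu(X)
p, q = 1.5, 3.0
print(lp_norm(f, mu, p) ** p, lp_norm(f, mu, q) ** q + total_mass)
```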

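4.4.5. Density theorems in $L^p(X, \mathcal{A}, \mu)$

We shall begin our examination of dense varieties in $L^p$ by considering step functions.

4.4.5.1. Step functions

Let $(X, \mathcal{A}, \mu)$ be any measure space and $\mathbb{K} = \mathbb{R}$ or $\mathbb{C}$. A piecewise constant function on $X$ with values in $\mathbb{K}$ is known as a step or simple function. For $N \in \mathbb{N}$, the rigorous definition of the space of these functions is:

\[ \Sigma = \left\{ s : X \to \mathbb{K} : \exists (\alpha_i)_{i=1}^N \subset \mathbb{K} : s = \sum_{i=1}^N \alpha_i \chi_{E_i},\ E_i \text{ measurable and } \mu(E_i) < +\infty \text{ if } \alpha_i \ne 0 \right\} \]

The function

\[ \chi_{E_i}(x) = \begin{cases} 1 & \text{if } x \in E_i \\ 0 & \text{if } x \notin E_i \end{cases} \]

is the indicator function of $E_i$.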

THEOREM 4.19.– $\overline{\Sigma} = L^p(X, \mathcal{A}, \mu)$ $\forall\, 1 \le p < \infty$, where the closure should be interpreted with respect to the topology of $L^p(X, \mathcal{A}, \mu)$ taking $\Sigma \subset L^p(X, \mathcal{A}, \mu)$.

4.4.5.2. Intersections: $L^p \cap L^q$ and $\ell^p \cap \ell^q$

THEOREM 4.20.– Let $(X, \mathcal{A}, \mu)$ be any measure space and $\mathbb{K} = \mathbb{R}$ or $\mathbb{C}$, then:

\[ \overline{L^p(X, \mathcal{A}, \mu) \cap L^q(X, \mathcal{A}, \mu)} = \begin{cases} L^p(X, \mathcal{A}, \mu) \\ L^q(X, \mathcal{A}, \mu) \end{cases} \qquad \forall\, 1 \le p, q \le \infty \]

In the first case, the intersection should be interpreted as a subset of $L^p(X, \mathcal{A}, \mu)$ and the closure with respect to the metric topology generated by the norm $\|\cdot\|_p$. In the second case, the intersection should be interpreted as a subset of $L^q(X, \mathcal{A}, \mu)$ and the closure with respect to the topology relative to the norm $\|\cdot\|_q$.

Notably, as $\ell^p$ spaces are nested, it holds that:

\[ \overline{\ell^p(\mathbb{N},\mathbb{K})} = \ell^q(\mathbb{N},\mathbb{K}) \qquad \forall\, 1 \le p < q < \infty \]

As before, for all fixed $q$, $\ell^p$ should be interpreted as a subspace of $\ell^q$ and the closure should be interpreted with respect to the norm $\|\cdot\|_q$.

THEOREM 4.21.– For all $p \in \mathbb{R}$, $1 \le p < +\infty$:

\[ \overline{\ell^0(\mathbb{N},\mathbb{K})} = \ell^p(\mathbb{N},\mathbb{K}) \]

that is, $\ell^0(\mathbb{N},\mathbb{K})$ is dense in $\ell^p(\mathbb{N},\mathbb{K})$ with respect to the topology generated by the norm $\|\cdot\|_p$.

PROOF.– Let $(x_n)_{n\in\mathbb{N}}$ be an arbitrary sequence in $\ell^p(\mathbb{N},\mathbb{K})$. Consider the sequence:

\[ x^N_n := \begin{cases} x_n & \text{if } n < N \\ 0 & \text{otherwise} \end{cases} \]

then:

\[ \|x_n - x^N_n\|_p^p = \sum_{n\in\mathbb{N}} |x_n - x^N_n|^p = \sum_{n=N}^{+\infty} |x_n|^p \xrightarrow[N \to +\infty]{} 0 \]

as this is the remainder of a convergent series (since $(x_n)_{n\in\mathbb{N}}$ belongs to $\ell^p(\mathbb{N},\mathbb{K})$), which proves the density of $\ell^0(\mathbb{N},\mathbb{K})$ in $\ell^p(\mathbb{N},\mathbb{K})$. $\square$

4.4.5.3. Test functions

Let $\Omega \subseteq \mathbb{R}^n$ be an open set.

DEFINITION 4.14.–

\[ C_c^\infty(\Omega) = \{ f : \Omega \to \mathbb{K},\ f \text{ infinitely differentiable on } \Omega \text{ and } \operatorname{supp}(f) \text{ compact in } \mathbb{R}^n \} \]

where $\operatorname{supp}(f) = \overline{\{x \in \Omega : f(x) \ne 0\}}$ is said to be the support of $f$.

The functions in $C_c^\infty(\Omega)$ are known as test functions, as they are so regular that they are often used to test the action and properties of certain "wild" operators. Test functions play a crucial role in distribution theory and in analyzing differential equations. The identically null function is obviously a test function; other explicit examples are much harder to find. The canonical example of a test function on $\mathbb{R}$ for any value of $\varepsilon > 0$ is given by:

\[ f(x) = \begin{cases} \exp\left(-\dfrac{1}{1 - \left(\frac{x}{\varepsilon}\right)^2}\right) & \text{if } |x| < \varepsilon \\ 0 & \text{if } |x| \ge \varepsilon \end{cases} \]

For the purposes of our discussion, we need a simple symbol to denote the partial derivative of a function of $n$ variables with respect to a multi-index $l = (l_1, l_2, \dots, l_n) \in \mathbb{N}^n$ of length $|l| = l_1 + l_2 + \dots + l_n$. The canonical notation is:

\[ D^l f(x) = \frac{\partial^{|l|} f}{\partial x_1^{l_1}\, \partial x_2^{l_2} \cdots \partial x_n^{l_n}}(x) \qquad \forall x \in \mathbb{R}^n \]

hence $D^l f(x)$ is the partial derivative of $f$ at $x$, taken $l_1$ times with respect to $x_1$, $l_2$ times with respect to $x_2$, etc. This symbol appears in the (non-trivial) definition of a topology on $C_c^\infty(\Omega)$ with respect to which the convergence of a sequence of test functions $(f_n)_{n\in\mathbb{N}}$ to a test function $f$ is equivalent to fulfilling the following two conditions:

– there exists a compact set $K \subset \Omega$ such that $\operatorname{supp}(f_n) \subseteq K$ for all $n \in \mathbb{N}$;

– $\forall x \in \mathbb{R}^n$, $\forall l \in \mathbb{N}^n$: $D^l f_n(x) \xrightarrow[n \to +\infty]{} D^l f(x)$, uniformly.
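The canonical bump function given in this section is easy to evaluate numerically, which makes its compact support and its extremely fast decay near the boundary visible. The sketch below is only an illustration: the name `bump` and the default `eps=1.0` are choices made here, and the extra guard on the denominator is a floating-point precaution rather than part of the mathematical definition.

```python
import math


def bump(x, eps=1.0):
    """Canonical test function: exp(-1 / (1 - (x/eps)^2)) inside (-eps, eps), else 0."""
    if abs(x) >= eps:
        return 0.0
    denom = 1.0 - (x / eps) ** 2
    if denom <= 0.0:  # guard against rounding when |x| is extremely close to eps
        return 0.0
    return math.exp(-1.0 / denom)


for x in (0.0, 0.5, 0.9, 0.99, 0.999, 1.0, 1.5):
    print(x, bump(x))
```

The values drop from $e^{-1}$ at the origin to numbers far below machine-scale quantities just inside $|x| = \varepsilon$, and are exactly zero outside.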

The space $C_c^\infty(\Omega)$ with this topology is usually written as $\mathcal{D}(\Omega)$ and is complete. The following density result holds true.

THEOREM 4.22.– Considering the Borel $\sigma$-algebra and the Lebesgue measure, then:

\[ \overline{C_c^\infty(\Omega)} = L^p(\Omega) \qquad \forall\, 1 \le p < \infty \]

where $C_c^\infty(\Omega)$ should be interpreted as a subspace of $L^p(\Omega)$ and the closure is taken with respect to the topology generated by the norm $\|\cdot\|_p$.

By the definition of closure, $C_c^\infty(\Omega)$ is not complete with respect to the topology generated by the norm $\|\cdot\|_p$, since there are sequences of elements of $C_c^\infty(\Omega)$ which converge to elements in $L^p(\Omega) \setminus C_c^\infty(\Omega)$.

4.4.5.4. Schwartz space

For simplicity's sake, particularly in terms of notation, we shall start by examining the case of a function of a single variable. Taking $k, l \in \mathbb{N}$ and $f \in C^\infty(\mathbb{R})$, for all $x \in \mathbb{R}$, we write:

\[ f^{k,l}(x) = x^k \frac{d^l f}{dx^l}(x) \]

DEFINITION 4.15 (Schwartz space, $n = 1$).– The space of functions $f \in C^\infty(\mathbb{R})$ such that:

\[ \lim_{|x| \to +\infty} |f^{k,l}(x)| = 0 \qquad \forall k, l \in \mathbb{N} \]

is known as the Schwartz space, or the space of rapidly decreasing functions. The canonical notation for this space is $\mathcal{S}(\mathbb{R})$.

Any element in $\mathcal{S}(\mathbb{R})$ is thus an everywhere infinitely differentiable function such that, if we consider its derivative of any order and multiply it by any power of its variable, the product converges to 0 as the variable tends to $\pm\infty$. To satisfy this characteristic, a function must decrease very rapidly to zero at infinity, hence the alternative name for the functions of $\mathcal{S}(\mathbb{R})$.

Evidently, $\mathcal{D}(\mathbb{R}) \subset \mathcal{S}(\mathbb{R})$, since test functions are null at infinity, but the inclusion is strict, as we see from the most important example of a rapidly decreasing function: the Gaussian $f(x) = e^{-x^2}$, which does not belong to $\mathcal{D}(\mathbb{R})$, as its support is not compact.

Now, let us consider a function of $n$ real variables $f \in C^\infty(\mathbb{R}^n)$. In this case, given two multi-indices $l, k \in \mathbb{N}^n$, we write:

\[ f^{k,l}(x) = x_1^{k_1} x_2^{k_2} \cdots x_n^{k_n}\, D^l f(x) \qquad \forall x \in \mathbb{R}^n \]

DEFINITION 4.16 (Schwartz space, arbitrary $n$).– The space of functions $f \in C^\infty(\mathbb{R}^n)$ such that:

\[ \lim_{\|x\| \to +\infty} |f^{k,l}(x)| = 0 \qquad \forall k, l \in \mathbb{N}^n \]

is the Schwartz space, or rapidly decreasing function space. The canonical notation for this space is $\mathcal{S}(\mathbb{R}^n)$.

By construction, $\mathcal{S}(\mathbb{R}^n)$ is stable with respect to partial differentiation and to multiplication by a polynomial. Functions of $\mathcal{S}(\mathbb{R}^n)$ (and their derivatives) decay at infinity faster than the reciprocal of any polynomial. As in the case where $n = 1$, $\mathcal{D}(\mathbb{R}^n) \subset \mathcal{S}(\mathbb{R}^n)$ and the inclusion is strict, as the Gaussian $f(x) = e^{-\|x\|^2}$ belongs to $\mathcal{S}(\mathbb{R}^n)$, but not to $\mathcal{D}(\mathbb{R}^n)$.

It is possible to define a topology on $\mathcal{S}(\mathbb{R}^n)$ in which a sequence $(f_n)_{n\in\mathbb{N}}$ of functions of $\mathcal{S}(\mathbb{R}^n)$ converges to $f \in \mathcal{S}(\mathbb{R}^n)$ if $f_n^{k,l} \xrightarrow[n \to +\infty]{} f^{k,l}$ uniformly $\forall k, l \in \mathbb{N}^n$.

With respect to this topology, the Schwartz space is complete.

Just as we saw for the test function space, Schwartz space plays an important role in distribution theory (which was formalized by Laurent Schwartz himself) and in the context of partial differential equations. The fact that $\mathcal{D}(\mathbb{R}^n) \subset \mathcal{S}(\mathbb{R}^n)$ and that $\mathcal{D}(\mathbb{R}^n)$ is $\|\cdot\|_p$-dense in $L^p(\mathbb{R}^n)$ implies the following result (Theorem 4.23).

THEOREM 4.23.– Considering the Borel $\sigma$-algebra and the Lebesgue measure, then:

\[ \overline{\mathcal{S}(\mathbb{R}^n)} = L^p(\mathbb{R}^n) \qquad \forall\, 1 \le p < \infty \]

interpreting $\mathcal{S}(\mathbb{R}^n)$ as a subspace of $L^p(\mathbb{R}^n)$ and considering the closure with respect to the topology generated by the norm $\|\cdot\|_p$.

From the definition of closure, $\mathcal{S}(\mathbb{R}^n)$ is not complete with respect to the topology generated by the norm $\|\cdot\|_p$: there exist sequences of $\mathcal{S}(\mathbb{R}^n)$ which converge to elements of $L^p(\mathbb{R}^n) \setminus \mathcal{S}(\mathbb{R}^n)$.

4.5. Summary

In this chapter, we have examined the compatibility of the topological structure of inner product spaces with the linear structure: the operations of sum and product by a scalar are continuous in the topology generated by the inner product, as are the inner product itself and the canonically induced norm. This compatibility is essential, as it implies that the limit operation commutes with the operations cited above; this result is fundamental in both theoretical and practical contexts.

We also saw that all finite-dimensional vector spaces possess the same Euclidean topological structure up to a homeomorphism.

Hilbert and Banach spaces were introduced as special cases of inner product or normed vector spaces, respectively, such that all Cauchy sequences converge within the space (completeness property). Any finite-dimensional inner product space is a Hilbert space, while any finite-dimensional normed vector space is a Banach space. All Hilbert spaces are Banach spaces, but the converse is not true in general.

Complete normed vector spaces can be characterized in a simple but very useful way: they are exactly the spaces in which absolutely convergent series are also convergent.

Any contraction defined on a complete metric space possesses a unique fixed point.

We presented the Hilbert spaces $L^2$ and $\ell^2$, along with examples of non-Hilbert, but Banach, spaces, $L^p$ and $\ell^p$, with $1 \le p \le \infty$, $p \ne 2$. The Minkowski inequality can be used to define a linear structure on all of these spaces, while Hölder's inequality is used to define an inner product when $p = 2$.

$\ell^p$ spaces are nested with increasing $p$; on the other hand, there is generally no inclusion relationship in $L^p$ spaces, with the notable exception of finite measure spaces, for which $L^p$ spaces are nested, but in the opposite way to $\ell^p$ spaces, that is, with decreasing $p$.

Finally, we demonstrated that $L^p$ spaces coincide with the closure of many widely used function spaces, such as the test function space and the Schwartz space.

5 The Geometric Structure of Hilbert Spaces

Among the infinite-dimensional vector spaces, Hilbert spaces are the closest to the Euclidean spaces $\mathbb{K}^n$ presented in Chapter 1 with respect to their geometric structure, which is the focus of the present chapter. Infinite-dimensional Banach spaces do not share this characteristic: their structural properties can be far more complicated than those of Hilbert spaces.

The rich geometric structure of Hilbert spaces makes it possible to extend the discrete Fourier transform (DFT) to spaces in infinite dimensions, using the concepts of series and the continuous Fourier transform.

Suggested reading for those wishing to go further into the subjects discussed in this chapter and in Chapter 6 includes Berberian (1961), Abbati and Cirelli (1997), Saxe (2000), Debnath and Mikusinski (2005) and Moretti (2013).

The first step in analyzing the geometric structure of Hilbert spaces is to consider the concept of orthogonal complement.

5.1. The orthogonal complement in a Hilbert space and its properties

The set of all vectors which are orthogonal to the vectors of a subset in a Hilbert space is of crucial importance in understanding the geometric properties of these spaces. The definition and properties of this set are given below.

From Euclidean to Hilbert Spaces: Introduction to Functional Analysis and its Applications First Edition. Edoardo Provenzi. © ISTE Ltd 2021. Published by ISTE Ltd and John Wiley & Sons, Inc.


DEFINITION 5.1.– Let $H$ be a Hilbert space and $M \subseteq H$ any subset. The orthogonal complement of $M$ in $H$ is:

\[ M^\perp = \{x \in H : \langle x, y \rangle = 0\ \forall y \in M\} \]

that is, $M^\perp$ contains all of the elements of $H$ which are orthogonal to every element of $M$.

We denote by $\operatorname{span}(M)$ the vector subspace of $H$ generated by $M$, that is, the set of (finite) linear combinations of vectors in $M$. In Theorem 5.1, we shall write $(M^\perp)^\perp = M^{\perp\perp}$ and $(M^{\perp\perp})^\perp = M^{\perp\perp\perp}$.

THEOREM 5.1 (Properties of the orthogonal complement).– Let $H$ be a Hilbert space and $M \subseteq H$ an arbitrary subset. Then:

1) $\{0_H\}^\perp = H$ and $H^\perp = \{0_H\}$;

2) $M \cap M^\perp = \begin{cases} \{0_H\} & \text{if } 0_H \in M \\ \emptyset & \text{if } 0_H \notin M \end{cases}$;

3) $M^\perp$ is a closed vector subspace of $H$;

4) if $N \subseteq H$, then $M \subseteq N \Rightarrow N^\perp \subseteq M^\perp$ ($\perp$ reverses the set inclusion relationships);

5) $M \subseteq M^{\perp\perp}$ (difference with respect to finite dimensions);

6) $(\overline{M})^\perp = M^\perp$;

7) $M^{\perp\perp\perp} = M^\perp$;

8) $M^\perp = (\operatorname{span} M)^\perp = (\overline{\operatorname{span} M})^\perp$;

9) if $\overline{M} = H \Rightarrow M^\perp = \{0_H\}$ (the orthogonal complement of a dense subset is the zero vector).

The proof is given below. First, however, we note that the fact that $M^\perp$ is always closed is a very useful property for demonstrating that a linear variety of $H$ is closed: we must simply demonstrate that this variety coincides with the orthogonal complement of a subset of $H$.

It is also noteworthy that, thanks to the orthogonal complement, we pass from the category of sets, to which $M$ belongs, to that of topological vector spaces, to which $M^\perp$ belongs; moreover, $M^\perp$ is always closed.

PROOF.–

1) The property follows from the fact that every vector of $H$ is orthogonal to $0_H$, and that $0_H$ is the only vector in $H$ which is orthogonal to all the others.

2) $0_H$ is the only vector which is orthogonal to itself.

3) $M^\perp$ is a vector subspace: if $x, x' \in M^\perp$, then $\langle x, y \rangle = \langle x', y \rangle = 0$ $\forall y \in M$, hence $\langle \alpha x + \beta x', y \rangle = \alpha \langle x, y \rangle + \beta \langle x', y \rangle = 0$ $\forall y \in M$, i.e. $M^\perp$ is a vector subspace, as it is stable with respect to linear combinations of its elements.

$M^\perp$ is closed: we must show that $M^\perp$ contains the limits of all convergent sequences in $M^\perp$. Let $(x_n)_{n\in\mathbb{N}} \subset M^\perp$ be a sequence which converges (and is thus Cauchy) to a limit $x \in H$. For all $y \in M$, $\langle x_n, y \rangle = 0$ $\forall n \in \mathbb{N}$, so, from the continuity of the inner product, we can write:

\[ 0 = \lim_{n \to +\infty} \langle x_n, y \rangle = \left\langle \lim_{n \to +\infty} x_n, y \right\rangle = \langle x, y \rangle \]

thus $x \perp y$ $\forall y \in M$, that is $x \in M^\perp$.

4) Since $M \subseteq N$, the vectors of $H$ which are orthogonal to all the vectors of $N$ are also orthogonal to all the vectors of $M$ (although the converse is not necessarily true). Thus, $y \in N^\perp$ implies $y \in M^\perp$, that is $N^\perp \subseteq M^\perp$.

5) Every vector in $M$ is orthogonal to every vector in $M^\perp$ by definition, but there may also be other vectors in $H$ which are orthogonal to $M^\perp$, hence $M \subseteq M^{\perp\perp}$.

6) The equality of the sets can be demonstrated by proving the two inclusions:

– $(\overline{M})^\perp \subseteq M^\perp$: this follows from $M \subseteq \overline{M}$ and property 4;

– $M^\perp \subseteq (\overline{M})^\perp$: we must show that $y \in M^\perp \implies y \in (\overline{M})^\perp$. The elements of $\overline{M}$ are the union of all elements in $M$ with the limits of the convergent sequences in $M$, so we must show that, if $y \in M^\perp$ is orthogonal to all of the elements of an arbitrary convergent sequence $(x_n)_{n\in\mathbb{N}} \subset M$, then $y$ is also orthogonal to the limit of this sequence. This can be proved using the continuity of the inner product: by hypothesis, $\langle x_n, y \rangle = 0$ $\forall n \in \mathbb{N}$, thus:

\[ 0 = \lim_{n \to +\infty} \langle x_n, y \rangle = \left\langle \lim_{n \to +\infty} x_n, y \right\rangle \]

that is, $y \perp \lim_{n \to +\infty} x_n$.

7) From property 5, $M \subseteq M^{\perp\perp}$, and thus, by property 4, $M^{\perp\perp\perp} \subseteq M^\perp$ for any subset $M$ of $H$. Conversely, writing $N = M^\perp$ and applying property 5 to $N$, we obtain $N \subseteq N^{\perp\perp}$, that is, $M^\perp \subseteq M^{\perp\perp\perp}$. The two inclusions $M^{\perp\perp\perp} \subseteq M^\perp$ and $M^\perp \subseteq M^{\perp\perp\perp}$ imply equality between the orthogonal complement of a subset of $H$ and its triorthogonal complement.

8) Consider an arbitrary element in $\operatorname{span}(M)$: $y_0 = \sum_{i=1}^n \alpha_i y_i$, $y_i \in M$ and $\alpha_i \in \mathbb{K}$ $\forall i = 1, \dots, n$. Taking any fixed $x \in M^\perp$, by the sesquilinearity (or bilinearity) of $\langle\ ,\ \rangle$ (for $\mathbb{K} = \mathbb{C}$ or $\mathbb{K} = \mathbb{R}$), we can write:

\[ \langle x, y_0 \rangle = \left\langle x, \sum_{i=1}^n \alpha_i y_i \right\rangle = \sum_{i=1}^n \overline{\alpha_i}\, \langle x, y_i \rangle = 0 \qquad (\text{since } x \perp y_i) \]

hence $x \in (\operatorname{span}(M))^\perp$, and since $x$ is arbitrary, this implies $M^\perp \subseteq (\operatorname{span}(M))^\perp$. Given that $M \subseteq \operatorname{span}(M)$, by (4), $(\operatorname{span}(M))^\perp \subseteq M^\perp$, that is $(\operatorname{span}(M))^\perp = M^\perp$; furthermore, by (6), $(\overline{\operatorname{span}(M)})^\perp = (\operatorname{span}(M))^\perp = M^\perp$.

9) $M^\perp = (\overline{M})^\perp = H^\perp = \{0_H\}$, as the only vector which is orthogonal to all vectors in $H$ is the zero vector. $\square$

5.2. Projection onto closed convex sets: theorem and consequences

In this section, we shall consider a particularly important geometric property which has already been covered in the context of Euclidean spaces: the orthogonal projection minimizes the distance between a given vector and those of the subset on which it projects. This result will be presented and proved for the case of a closed convex subset and then used for a closed vector subspace.

DEFINITION 5.2.– A subset $S$ of a vector space is convex if:

\[ \forall x, y \in S,\ \forall \lambda \in [0, 1] : \lambda x + (1 - \lambda) y \in S \]

that is, if $S$ is stable with respect to convex combinations, that is, linear combinations with non-negative coefficients which sum to 1.

In geometric terms, a convex subset can be characterized by the fact that any pair of its points may be connected by a line segment which remains within the subset. Evidently, all vector subspaces are convex, as they are stable with respect to all linear combinations, including convex combinations.

Note that the half sum of $x$ and $y$ (i.e. $\frac{x+y}{2}$) is a convex combination with $\lambda = 1/2$.

THEOREM 5.2.– Let $H$ be a Hilbert space and $S$ a closed, convex and proper subset$^1$ of $H$. Then, $\forall x \in H$ (fixed) there exists a unique point $y_0 \in S$ such that:

\[ \|x - y_0\| = \inf_{y \in S} \|x - y\| \]

that is, such that $y_0$ minimizes the distance between $x$ and the points in $S$.

1 If $S = H$, then the theorem may be verified trivially with $y_0 = x$.

Before presenting the proof of this theorem, we should note that this result also holds for any closed vector subspace of $H$: the theorem of projection onto a closed convex set generalizes property 3 from Theorem 1.12 to infinite-dimensional Hilbert spaces.

DEFINITION 5.3.– The vector $y_0$ in the previous theorem is the orthogonal projection of $x$ on $S$, denoted $y_0 = P_S(x)$. The non-negative real quantity $d(x, S) = \|x - P_S(x)\|$ is the distance between $x$ and the closed, convex and proper subset $S$.

It is evident that if $x \in S$ then $P_S(x) = x$ and $d(x, S) = 0$, so the information provided by the theorem is interesting when $x \notin S$.

PROOF.– Existence ($\exists$): for simplicity's sake, let us write$^2$ $\delta \equiv \inf_{y \in S} \|x - y\|$. We shall demonstrate the existence of $y_0$ using a non-constructive technique typical of the Hilbert school. Consider a sequence $(y_n)_{n\in\mathbb{N}} \subset S$ which satisfies the equation$^3$:

\[ \lim_{n \to +\infty} \|x - y_n\| = \delta \qquad \text{[5.1]} \]

The interest of such a sequence is that, if it converges, then by the continuity of the norm [5.1] can be rewritten as:

\[ \delta = \left\| \lim_{n \to +\infty} (x - y_n) \right\| = \left\| \lim_{n \to +\infty} x - \lim_{n \to +\infty} y_n \right\| = \left\| x - \lim_{n \to +\infty} y_n \right\| \]

hence to prove the existence of $y_0$, we can simply take $y_0 := \lim_{n \to +\infty} y_n$.

We begin by noting that $S$ is closed and is thus itself complete; to demonstrate the existence of the limit of $y_n$, we must show that $(y_n)_{n\in\mathbb{N}}$ is a Cauchy sequence in $S$.

To show that $(y_n)_{n\in\mathbb{N}}$ is a Cauchy sequence we will use the parallelogram law [1.6] (which holds since the norm is Hilbertian, see Theorem 4.3) on the elements $x - y_n$ and $x - y_m$:

\[ \|(x - y_n) + (x - y_m)\|^2 + \|(x - y_n) - (x - y_m)\|^2 = 2\left(\|x - y_n\|^2 + \|x - y_m\|^2\right) \]

which can be rewritten as:

\[ \|2x - y_n - y_m\|^2 + \|y_n - y_m\|^2 = 2\left(\|x - y_n\|^2 + \|x - y_m\|^2\right) \]

2 $\delta(x, S)$ would be more "correct", since $\delta$ generally changes with $x$ and $S$.

3 For example, $\delta^2 \le \|x - y_n\|^2 \le \delta^2 + \frac{1}{n}$, $\forall n \ge 1$.

that is:

\[ \|y_n - y_m\|^2 = 2\left(\|x - y_n\|^2 + \|x - y_m\|^2\right) - \|2x - y_n - y_m\|^2 \qquad \text{[5.2]} \]

and $2x - y_n - y_m = 2\left(x - \frac{1}{2}(y_n + y_m)\right)$, so [5.2] becomes:

\[ \|y_n - y_m\|^2 = 2\left(\|x - y_n\|^2 + \|x - y_m\|^2\right) - 4\left\|x - \frac{1}{2}(y_n + y_m)\right\|^2 \qquad \text{[5.3]} \]

Note that $\frac{1}{2}(y_n + y_m) \in S$ by convexity, then $\left\|x - \frac{1}{2}(y_n + y_m)\right\| \ge \delta$ by definition of $\delta$, thus $-4\left\|x - \frac{1}{2}(y_n + y_m)\right\|^2 \le -4\delta^2$, so equation [5.3] gives us:

\[ \|y_n - y_m\|^2 \le 2\left(\|x - y_n\|^2 + \|x - y_m\|^2\right) - 4\delta^2 \]

Since $\lim_{n \to +\infty} \|x - y_n\|^2 = \lim_{m \to +\infty} \|x - y_m\|^2 = \delta^2$, the right-hand side of the previous inequality tends to 0 as $n, m \to +\infty$, therefore $(y_n)_{n\in\mathbb{N}}$ is a Cauchy sequence.

Uniqueness ($!$): let us now prove that only one $y_0 \in S$ exists which satisfies $\|x - y_0\| = \delta$. Let $y_1$ be another element in S which satisfies $\|x - y_1\| = \delta$. Writing the parallelogram formula once again, but this time using $x - y_0$ and $x - y_1$, we obtain:

\[ \|(x - y_0) + (x - y_1)\|^2 + \|(x - y_0) - (x - y_1)\|^2 = 2\left(\|x - y_0\|^2 + \|x - y_1\|^2\right) \]

that is:

\[ \|2x - y_0 - y_1\|^2 + \|y_1 - y_0\|^2 = 4\delta^2 \]

thus:

\[ 0 \le \|y_1 - y_0\|^2 = 4\delta^2 - \|2x - y_0 - y_1\|^2 = 4\delta^2 - \left\|2\left(x - \frac{y_0 + y_1}{2}\right)\right\|^2 = 4\delta^2 - 4\left\|x - \frac{y_0 + y_1}{2}\right\|^2 = 4\left(\delta^2 - \left\|x - \frac{y_0 + y_1}{2}\right\|^2\right) \]

We observe that $\frac{y_0 + y_1}{2} \in S$ by convexity, and, since $\delta^2 = \inf_{y \in S} \|x - y\|^2$, it must hold that $\delta^2 \le \left\|x - \frac{y_0 + y_1}{2}\right\|^2$ and thus $\delta^2 - \left\|x - \frac{y_0 + y_1}{2}\right\|^2 \le 0$, that is:

\[ 0 \le \|y_1 - y_0\|^2 = 4\left(\delta^2 - \left\|x - \frac{y_0 + y_1}{2}\right\|^2\right) \le 0, \]

hence $y_1 = y_0$. $\square$

As we have seen, the parallelogram formula is essential to the proof of this theorem. Since, by Theorem 4.3, only Hilbert norms verify this formula, the proof given above cannot be applied to Banach spaces which are not Hilbert spaces and, in fact, counter-examples show that the theorem of projection onto a closed, convex and proper subset does not hold in arbitrary infinite-dimensional Banach spaces.

The theorem of projection onto a closed convex set has very important consequences, which will be described in detail later. For now, note that this theorem guarantees the existence and uniqueness of the orthogonal projection $y_0$, but it does not provide any information regarding the explicit expression of elements of the sequence $(y_n)_{n\in\mathbb{N}}$ in $S$ which converges to $y_0$. A geometric characterization of $y_0$, shown in Theorem 5.3, is therefore very useful. A remarkable application of this result will be presented in section 5.3.

THEOREM 5.3.– Let $H$ be a real Hilbert space, $S$ a closed, convex and proper subset of $H$ and $x$ a fixed element in $H$. Then $y_0 \in S$ is the orthogonal projection of $x$ onto $S$, that is $\|x - y_0\| = \inf_{y \in S} \|x - y\|$, if and only if:

\[ \forall y \in S, \qquad \langle x - y_0, y - y_0 \rangle \le 0 \]

that is$^4$, if and only if the angle $\vartheta$ between vectors $x - y_0$ and $y - y_0$ is obtuse (or right), as shown in Figure 5.1. If $H$ is complex, then we replace $\langle x - y_0, y - y_0 \rangle \le 0$ with $\Re(\langle x - y_0, y - y_0 \rangle) \le 0$.

4 Since $\langle x - y_0, y - y_0 \rangle = \|x - y_0\| \|y - y_0\| \cos(\vartheta)$, with $\vartheta$ the angle between vectors $x - y_0$ and $y - y_0$.

PROOF.– This proof concerns the real case. Proof of the complex case is left to the reader.

$\Rightarrow$: we wish to show that, if $\|x - y_0\| = \inf_{y \in S} \|x - y\|$, then $\langle x - y_0, y - y_0 \rangle \le 0$ $\forall y \in S$. To do this, let us consider any $y \in S$, using the convexity of $S$ to guarantee that $\lambda y + (1 - \lambda) y_0 \in S$ $\forall \lambda \in [0, 1]$. Thus, by hypothesis, and using the bilinearity and symmetry properties of the real inner product, we obtain:

\[ \begin{aligned} \|x - y_0\|^2 &\le \|x - [\lambda y + (1 - \lambda) y_0]\|^2 = \|x - y_0 - \lambda (y - y_0)\|^2 \\ &= \langle x - y_0 - \lambda (y - y_0), x - y_0 - \lambda (y - y_0) \rangle \\ &= \langle x - y_0, x - y_0 \rangle - \lambda \langle x - y_0, y - y_0 \rangle - \lambda \langle y - y_0, x - y_0 \rangle + \lambda^2 \langle y - y_0, y - y_0 \rangle \\ &= \|x - y_0\|^2 + \lambda^2 \|y - y_0\|^2 - 2\lambda \langle x - y_0, y - y_0 \rangle \end{aligned} \]

where we have used $\lambda \langle y - y_0, x - y_0 \rangle = \lambda \langle x - y_0, y - y_0 \rangle$. Thus:

\[ \|x - y_0\|^2 \le \|x - y_0\|^2 + \lambda^2 \|y - y_0\|^2 - 2\lambda \langle x - y_0, y - y_0 \rangle \]

Simplifying and dividing by $\lambda \in (0, 1]$, we obtain:

\[ 0 \le \lambda \|y - y_0\|^2 - 2 \langle x - y_0, y - y_0 \rangle \]

that is:

\[ \langle x - y_0, y - y_0 \rangle \le \frac{\lambda}{2} \|y - y_0\|^2 \]

for all $\lambda$ in $(0, 1]$. Now, taking the limit $\lambda \to 0$ on both sides of the inequality, we obtain $\langle x - y_0, y - y_0 \rangle \le \lim_{\lambda \to 0} \frac{\lambda}{2} \|y - y_0\|^2 = 0$, completing the proof of the direct implication.

$\Leftarrow$: let $x \in H$ be fixed, and take $y_0 \in S$ such that $\forall y \in S$, $\langle x - y_0, y - y_0 \rangle \le 0$. Then, we wish to show that $\|x - y_0\| = \inf_{y \in S} \|x - y\|$. We have:

\[ \begin{aligned} \|x - y\|^2 &= \|x - y_0 + y_0 - y\|^2 = \|x - y_0 - (y - y_0)\|^2 \\ &= \langle x - y_0 - (y - y_0), x - y_0 - (y - y_0) \rangle \\ &= \langle x - y_0, x - y_0 \rangle - \langle x - y_0, y - y_0 \rangle - \langle y - y_0, x - y_0 \rangle + \langle y - y_0, y - y_0 \rangle \\ &= \|x - y_0\|^2 + \|y - y_0\|^2 - 2 \langle x - y_0, y - y_0 \rangle, \end{aligned} \]

having used the symmetry of the real inner product. We thus have:

\[ \|x - y_0\|^2 - \|x - y\|^2 = 2 \underbrace{\langle x - y_0, y - y_0 \rangle}_{\le 0 \text{ by hypothesis}} - \underbrace{\|y - y_0\|^2}_{\ge 0} \le 0 \]

that is, $\|x - y_0\|^2 \le \|x - y\|^2$ $\forall y \in S$, that is: $\|x - y_0\| = \inf_{y \in S} \|x - y\|$. $\square$
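The variational characterization can be tested numerically in a finite-dimensional real Hilbert space. The sketch below is an illustration under stated assumptions, not the author's construction: the space is $\mathbb{R}^3$ with the usual dot product, the closed convex set is the closed unit ball (whose projection has a known closed form), and the helper names are hypothetical. It samples points $y$ of the ball and checks that $\langle x - y_0, y - y_0 \rangle \le 0$ and that $y_0$ is at least as close to $x$ as every sample.

```python
import math
import random


def dot(u, v):
    """Euclidean inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))


def sub(u, v):
    """Componentwise difference u - v."""
    return [a - b for a, b in zip(u, v)]


def norm(u):
    return math.sqrt(dot(u, u))


def project_onto_ball(x, radius=1.0):
    """Projection of x onto the closed ball of the given radius centred at 0."""
    n = norm(x)
    if n <= radius:
        return list(x)
    return [radius * a / n for a in x]


random.seed(0)
x = [3.0, -4.0, 12.0]              # ||x|| = 13, so x lies outside the unit ball
y0 = project_onto_ball(x)          # expected to be x / 13

# Sample points of the ball by projecting random points of a cube onto it.
samples = [project_onto_ball([random.uniform(-2.0, 2.0) for _ in range(3)])
           for _ in range(1000)]

worst_inner = max(dot(sub(x, y0), sub(y, y0)) for y in samples)
closest_sample = min(norm(sub(x, y)) for y in samples)

print(worst_inner)                         # should not be positive
print(norm(sub(x, y0)), closest_sample)    # distance to y0 vs. best sampled distance
```

Sampling can only fail to find a counter-example; it does not replace the proof above.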

The following corollary shows that the orthogonal complement of a closed proper vector subspace of a Hilbert space is not trivial, and generalizes property 2 from Theorem 1.12 to infinite-dimensional Hilbert spaces.

THEOREM 5.4.– Let $H$ be a Hilbert space and $S$ a closed, proper vector subspace of $H$, that is $S \ne \emptyset$ and $S \ne H$. Then $S^\perp$ is not reduced to $\{0_H\}$; in fact, for all fixed $x \in H \setminus S$, the vector $u = x - P_S(x)$ is non-zero and belongs to $S^\perp$:

\[ \forall x \in H \setminus S, \qquad u = x - P_S(x) \ne 0_H \text{ and } u \in S^\perp \]


Figure 5.1. Two-dimensional geometric visualization of the property veriﬁed by the projection onto a closed, convex and proper subset of H. For a color version of this ﬁgure, see www.iste.co.uk/ provenzi/spaces.zip

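The vector $u = x - P_S(x)$ is known as the residual vector, and the fact that $u \perp S$ fully justifies the use of the term "orthogonal projection" for $P_S(x)$.

PROOF.– Vector subspaces are convex, so the theorem of projection onto a closed convex subset holds, and thus $\exists\, P_S(x) \in S$, such that $\|u\| = \|x - P_S(x)\| = \inf_{y \in S} \|x - y\| \equiv \delta$.

Since $x \notin S$ and $P_S(x) \in S$, $u \ne 0_H$. If we can show that $u \in S^\perp$, that is that $\langle u, s \rangle = 0$ $\forall s \in S$, this will prove the theorem. To obtain this result, we start by noting that $\forall k \in \mathbb{K}$, $\forall s \in S$:

\[ \|u + ks\|^2 = \|x - P_S(x) + ks\|^2 = \|x - \underbrace{(P_S(x) - ks)}_{\in S}\|^2 \ge \delta^2 = \|u\|^2 \]

by the definition of $\delta$ and the fact that $S$ is a vector subspace. Hence:

\[ \|u + ks\|^2 - \|u\|^2 \ge 0, \qquad \forall k \in \mathbb{K},\ \forall s \in S \]

Furthermore:

\[ \|u + ks\|^2 = \langle u + ks, u + ks \rangle = \langle u, u \rangle + \langle u, ks \rangle + \langle ks, u \rangle + \langle ks, ks \rangle = \|u\|^2 + \bar{k} \langle u, s \rangle + k \langle s, u \rangle + |k|^2 \|s\|^2 \]

Thus, $\|u + ks\|^2 - \|u\|^2 \ge 0$ if and only if:

\[ \bar{k} \langle u, s \rangle + k \langle s, u \rangle + |k|^2 \|s\|^2 \ge 0 \]

As $k$ is arbitrary, we can take $k = \langle u, s \rangle t$ with any $t \in \mathbb{R}$. Thus, using $\langle s, u \rangle = \overline{\langle u, s \rangle}$, the inequality above becomes:

\[ \begin{aligned} & \overline{\langle u, s \rangle}\, t\, \langle u, s \rangle + \langle u, s \rangle\, t\, \langle s, u \rangle + |\langle u, s \rangle|^2 t^2 \|s\|^2 \ge 0 \\ \iff\ & |\langle u, s \rangle|^2 t + |\langle u, s \rangle|^2 t + |\langle u, s \rangle|^2 t^2 \|s\|^2 \ge 0 \\ \iff\ & t^2 \left(|\langle u, s \rangle|^2 \|s\|^2\right) + 2t\, |\langle u, s \rangle|^2 \ge 0 \qquad \forall t \in \mathbb{R} \end{aligned} \]

It would be pointless to simplify by $|\langle u, s \rangle|^2$, since we wish to determine $\langle u, s \rangle$. The strategy to complete our proof consists of interpreting $t^2 \left(|\langle u, s \rangle|^2 \|s\|^2\right) + 2t\, |\langle u, s \rangle|^2$ as a second-degree polynomial function of the form $P(t) = at^2 + bt + c$ with:

\[ \begin{cases} a = |\langle u, s \rangle|^2 \|s\|^2 \\ b = 2 |\langle u, s \rangle|^2 \\ c = 0 \end{cases} \]

The discriminant is equal to $\Delta = b^2 - 4ac = 4 |\langle u, s \rangle|^4 \ge 0$. Thus, for $P(t)$ to be non-negative $\forall t \in \mathbb{R}$, we must have $\Delta \le 0$; this is only possible if $\Delta = 0$, but:

\[ \Delta = 0 \iff 4 |\langle u, s \rangle|^4 = 0 \qquad \forall s \in S \text{ ($u$ being fixed since $x$ is fixed)} \]

that is, $\langle u, s \rangle = 0$ $\forall s \in S$, hence $u \in S^\perp$. $\square$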
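In finite dimensions, the orthogonality of the residual vector can be observed directly. The following sketch is an illustration only, under assumptions introduced here: the space is $\mathbb{R}^4$ with the usual dot product, $S$ is spanned by two hypothetical vectors `s1` and `s2`, and the projection is computed by orthonormalizing the spanning family with the Gram-Schmidt process and summing the components of $x$ along the resulting basis.

```python
import math


def dot(u, v):
    """Euclidean inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))


def axpy(alpha, u, v):
    """Return alpha * u + v, componentwise."""
    return [alpha * a + b for a, b in zip(u, v)]


def gram_schmidt(vectors):
    """Orthonormal family spanning the same subspace (assumes linear independence)."""
    basis = []
    for v in vectors:
        w = list(v)
        for e in basis:
            w = axpy(-dot(v, e), e, w)      # remove the component of v along e
        length = math.sqrt(dot(w, w))
        basis.append([a / length for a in w])
    return basis


def project_onto_span(x, vectors):
    """Orthogonal projection of x onto span(vectors)."""
    projection = [0.0] * len(x)
    for e in gram_schmidt(vectors):
        projection = axpy(dot(x, e), e, projection)
    return projection


s1 = [1.0, 1.0, 0.0, 0.0]
s2 = [1.0, 0.0, 1.0, 0.0]
x = [1.0, 2.0, 3.0, 4.0]                     # not in span{s1, s2}

p = project_onto_span(x, [s1, s2])
u = [a - b for a, b in zip(x, p)]            # residual vector u = x - P_S(x)

print(p)
print(dot(u, s1), dot(u, s2))                # both should vanish up to rounding
```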

5.2.1. Characterization of closed vector subspaces in Hilbert spaces

Theorem 5.4 can be used to deduce a highly useful characterization of closed vector subspaces in Hilbert spaces. We shall begin by considering an intermediate result.

LEMMA 5.1.– Let $H$ be a Hilbert space and $M$ any subset of $H$, then:

\[ \overline{\operatorname{span}(M)} = \operatorname{span}(M)^{\perp\perp} \]

PROOF.– Taking:

\[ \begin{cases} S = \overline{\operatorname{span}(M)} \\ T = \operatorname{span}(M)^{\perp\perp} \end{cases} \]

Theorem 5.1 guarantees that $S \subseteq T$; if we can prove that $S \subsetneq T$ is an impossible condition, then we will be left with $S = T$.

We begin by observing that $S$ is a closed vector subspace of $T$ and that $T$, as the orthogonal complement of a subset of $H$, is a closed vector subspace of $H$; $T$ itself is thus a Hilbert space.

Reasoning by contradiction, if we assume $S \subsetneq T$, then Theorem 5.4 can be applied to the pair $S$ and $T$ to ensure the existence of $u \in T$, $u \ne 0_H$ and $u \in S^\perp$. However, this implies that $u \in S^\perp \cap T = \left(\overline{\operatorname{span}(M)}\right)^\perp \cap \operatorname{span}(M)^{\perp\perp} = \operatorname{span}(M)^\perp \cap \left(\operatorname{span}(M)^\perp\right)^\perp = \{0_H\}$ (by properties 6 and 2 of Theorem 5.1), which contradicts the fact that $u \ne 0_H$. Hence, $\overline{\operatorname{span}(M)} = \operatorname{span}(M)^{\perp\perp}$. $\square$

We now have all of the information needed to prove a helpful characterization of closed vector subspaces in Hilbert spaces: closed vector subspaces are precisely those which coincide with their biorthogonal complement. This characterization is particularly powerful, as it creates a bridge between different structures that coexist in a Hilbert space: in fact, on one side, closure is related to the topological structure induced by the presence of a Hilbert norm, while, on the other side, the concept of orthogonality is related to the geometry of the Hilbert space by the presence of an inner produce. This bridge can be used, for instance, to verify the closure of a vector subspace by showing explicitly its biorthogonal complement, if this computation is easier than the direct veriﬁcation of the closure. T HEOREM 5.5.– Let H be a Hilbert space and M a vector subspace of H. 1) M KK “ spanpM q. 2) M “ M KK . 3) M is a closed vector subspace of H if and only if M “ M KK . P ROOF.– 1) Given that M is a vector subspace, M ” spanpM q. Furthermore, by property K

8 from Theorem 5.1, M K “ spanpM qK “ spanpM q . Hence, the previous lemma KK

implies M KK “ spanpM q

“ spanpM q.

182

From Euclidean to Hilbert Spaces

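2) We have shown that $M^{\perp\perp} = \overline{\operatorname{span}(M)}$ and we know that $M \equiv \operatorname{span}(M)$, thus $\overline{M} = M^{\perp\perp}$.
3) Let us demonstrate the double implication:
$\Rightarrow$: we know from point 2) that $\overline{M} = M^{\perp\perp}$, but if $M$ is closed then $\overline{M} = M$, and thus $M = M^{\perp\perp}$;
$\Leftarrow$: if $M = M^{\perp\perp}$, then $M$ is automatically a closed vector subspace, since it is the orthogonal complement of $M^{\perp}$. □

COROLLARY 5.1.– Let $H$ be a Hilbert space and $M, N$ any two parts of $H$.
1) It holds that:
\[ (M \cup N)^{\perp} = M^{\perp} \cap N^{\perp} \]
2) Additionally, if $M$ and $N$ are two closed vector subspaces of $H$, then:
\[ (M \cap N)^{\perp} = \overline{\operatorname{span}}\,(M^{\perp} \cup N^{\perp}) \qquad [5.4] \]

PROOF.–
1) Let us prove the two inclusions:
– $(M \cup N)^{\perp} \subseteq M^{\perp} \cap N^{\perp}$: taking $x \in (M \cup N)^{\perp}$ and $y \in M$, then $y$ also belongs to $M \cup N$, thus $\langle x, y \rangle = 0$, that is, $x \in M^{\perp}$. Now, taking $y \in N$, the same argument tells us that $x \in N^{\perp}$. Thus $x \in M^{\perp}$ and $x \in N^{\perp}$, that is, $x \in M^{\perp} \cap N^{\perp}$;
– $M^{\perp} \cap N^{\perp} \subseteq (M \cup N)^{\perp}$: taking $x \in M^{\perp} \cap N^{\perp}$, then $x \in M^{\perp}$ and $x \in N^{\perp}$. If $y \in M \cup N$, then $y \in M$ or $y \in N$, but in both cases $\langle x, y \rangle = 0$, that is, $x \in (M \cup N)^{\perp}$.
2) The relationship determined above holds for all parts of $H$ and thus also holds for $M^{\perp}$ and $N^{\perp}$. In this case, point 1 becomes:
\[ (M^{\perp} \cup N^{\perp})^{\perp} = M^{\perp\perp} \cap N^{\perp\perp} \underset{\text{th. 5.5}}{=} \overline{\operatorname{span}(M)} \cap \overline{\operatorname{span}(N)} = M \cap N \]
since $M$ and $N$ are presumed to be closed vector subspaces. Now, taking the orthogonal, we obtain:
\[ (M \cap N)^{\perp} = (M^{\perp} \cup N^{\perp})^{\perp\perp} \underset{\text{th. 5.5}}{=} \overline{\operatorname{span}}\,(M^{\perp} \cup N^{\perp}) \]
□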

5.3. Polar and bipolar subsets of a Hilbert space

In this section, we shall use a different approach to obtain the same result concerning the characterization of a closed part of a Hilbert space. In this case, we shall use the concept of polar sets, which is particularly important in the context of convex optimization theory.


DEFINITION 5.4 (Polar and bipolar).– Let $H$ be a Hilbert space and $M$ any non-empty part of $H$. The polar set of $M$, noted $M^{0}$, is the subset of $H$ defined by (see footnote 5):
\[ M^{0} := \{x \in H : \forall y \in M,\ \Re(\langle x, y \rangle) \le 1\} \equiv \{x \in H : \sup_{y \in M} \Re(\langle x, y \rangle) \le 1\} \]
The bipolar of $M$ is the polar of the polar, that is:
\[ M^{00} := (M^{0})^{0} = \{h \in H : \forall x \in M^{0},\ \Re(\langle h, x \rangle) \le 1\} \equiv \{h \in H : \sup_{x \in M^{0}} \Re(\langle h, x \rangle) \le 1\} \]
Footnote 5: Evidently, the real part of the inner product can be eliminated if $H$ is a real Hilbert space.

Let us also recall the concept of convex hull.

DEFINITION 5.5 (Closed convex hull).– The closed convex hull of a part $M$ of $H$ is the closure of the intersection of all convex parts of $H$ containing $M$. It is the smallest closed convex subset of $H$ which contains $M$.

The following result contains remarkable properties of both the polar and bipolar.

THEOREM 5.6.– Let $H$ be a Hilbert space and $M$ any non-empty part of $H$.
1) $M^{0}$ is a closed convex subset of $H$ which contains the zero vector $0_H$.
2) $M^{00}$ coincides with the closed convex hull $C$ of $M \cup \{0_H\}$.
3) If $M$ is a convex part of $H$ which contains $0_H$, then $\overline{M} = M^{00}$.
4) If $M$ is a vector subspace of $H$, then $M^{0} = M^{\perp}$.

PROOF.–
1) The fact that $M^{0}$ contains $0_H$ is an obvious consequence of the fact that $\langle 0_H, y \rangle = 0 < 1$ for all $y \in M$. To verify convexity, let us consider $\lambda \in [0, 1]$ and $x_1, x_2 \in M^{0}$; by the left linearity of the inner product, for all $y \in M$:
\[ \Re(\langle \lambda x_1 + (1 - \lambda) x_2, y \rangle) = \lambda \Re(\langle x_1, y \rangle) + (1 - \lambda) \Re(\langle x_2, y \rangle) \le \lambda + (1 - \lambda) = 1 \]
showing that $M^{0}$ is convex. All that remains is to prove the closure; to do this, we first remark that, for all fixed $y \in H$, the application $\varphi_y : H \to \mathbb{R}$, $x \mapsto \varphi_y(x) := \Re(\langle x, y \rangle)$ is continuous. Writing:
\[ H_y := \varphi_y^{-1}\big((-\infty, 1]\big) = \{x \in H : \Re(\langle x, y \rangle) \le 1\} \]


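then $H_y$ is a closed subset of $H$, as it is the inverse image of a closed subset of $\mathbb{R}$ via the continuous map $\varphi_y$ (remember that $(-\infty, 1]$ is closed, since its complement in $\mathbb{R}$ is $(1, +\infty)$, which is open). By definition, the elements of $M^{0}$ must belong to $H_y$ for all $y \in M$, that is, $M^{0} = \bigcap_{y \in M} H_y$, and thus it is closed in $H$ as it is the intersection of closed parts of $H$.

2) Now, let us demonstrate the opposite inclusions.

$C \subseteq M^{00}$: first, it is useful to show that $M \subseteq M^{00}$. To do this, we note that $M^{00} = \bigcap_{x \in M^{0}} H_x$. Next, let us take arbitrary but fixed $y \in M$ and $x \in M^{0}$; then, notably, $x \in H_y$, that is, $\Re(\langle x, y \rangle) \le 1$, and since $\Re(\langle x, y \rangle) = \Re(\langle y, x \rangle)$, we also have $\Re(\langle y, x \rangle) \le 1$, that is, $y \in H_x$. Since this holds for all $x \in M^{0}$, $y \in \bigcap_{x \in M^{0}} H_x = M^{00}$. Then: $y \in M \implies y \in M^{00}$, that is, $M \subseteq M^{00}$.

By (1), we know that $M^{00}$, as a polar set, is convex, closed and contains $0_H$. We have just seen that $M \subseteq M^{00}$, thus $M^{00}$ is a closed convex set which contains $M \cup \{0_H\}$. Given that $C$, the closed convex hull of $M \cup \{0_H\}$, is the smallest closed convex subset of $H$ which contains $M \cup \{0_H\}$, it must be included in $M^{00}$.

$M^{00} \subseteq C$: the fact that $0_H \in C$ comes into play at this stage of the proof. From Theorem 5.3, for all $x \in H$ it holds that:
\[ \Re(\langle x - P_C x, 0_H - P_C x \rangle) \le 0 \iff \Re(\langle x - P_C x, -P_C x \rangle) \le 0 \iff \Re(\langle x - P_C x, P_C x \rangle) \ge 0 \]
and for all $y \in M$, we also have $\Re(\langle x - P_C x, y - P_C x \rangle) \le 0$, that is, $\Re(\langle x - P_C x, y - P_C x \rangle) \le \varepsilon$ for all $\varepsilon > 0$, that is, by linearity of the inner product:
\[ \Re(\langle x - P_C x, y - P_C x \rangle) = \Re(\langle x - P_C x, y \rangle) - \Re(\langle x - P_C x, P_C x \rangle) \le \varepsilon \]
that is, given that $\varepsilon + \Re(\langle x - P_C x, P_C x \rangle)$ is a real number $> 0$:
\[ \Re(\langle x - P_C x, y \rangle) \le \varepsilon + \Re(\langle x - P_C x, P_C x \rangle) \iff \frac{\Re(\langle x - P_C x, y \rangle)}{\varepsilon + \Re(\langle x - P_C x, P_C x \rangle)} \le 1 \qquad \forall y \in M,\ \forall \varepsilon > 0 \]
which can be rewritten as:
\[ \Re\left(\left\langle \frac{x - P_C x}{\varepsilon + \Re(\langle x - P_C x, P_C x \rangle)}, y \right\rangle\right) \le 1 \]
that is, the element
\[ z(x) := \frac{x - P_C x}{\varepsilon + \Re(\langle x - P_C x, P_C x \rangle)} \in M^{0} \quad \text{for all } x \in H. \]
As this result holds for any $x \in H$, it can be applied when $x \in M^{00}$; in this case, by definition, we have $\Re(\langle x, z(x) \rangle) \le 1$, that is:
\[ \Re(\langle x, z(x) \rangle) = \Re\left(\left\langle x, \frac{x - P_C x}{\varepsilon + \Re(\langle x - P_C x, P_C x \rangle)} \right\rangle\right) \le 1 \]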


hence:
\[ \Re(\langle x, x - P_C x \rangle) \le \varepsilon + \Re(\langle x - P_C x, P_C x \rangle) = \varepsilon + \Re(\langle P_C x, x - P_C x \rangle) \]
which gives us:
\[ \Re(\langle x - P_C x, x - P_C x \rangle) \le \varepsilon \iff \|x - P_C x\|^2 \le \varepsilon \qquad \forall \varepsilon > 0 \]
but this is possible if and only if $x - P_C x = 0_H$, that is, $x = P_C x$; however, since $P_C x \in C$, $x \in C$. This completes our proof that $x \in M^{00} \implies x \in C$, that is, $M^{00} \subseteq C$.

3) By (2), if $M$ is a convex part of $H$ containing $0_H$, then $M^{00}$ is the smallest closed convex part which contains $M$. If $M$ is convex, then $\overline{M}$ is also convex and, furthermore, is the smallest closed set which contains $M$; consequently, $\overline{M} = M^{00}$.

4) Let us now prove the opposite inclusions.

$M^{\perp} \subseteq M^{0}$: if $x \in M^{\perp}$, then, for all $y \in M$, $\langle x, y \rangle = 0 < 1$, therefore $x \in M^{0}$.

$M^{0} \subseteq M^{\perp}$: taking $x \in M^{0}$, we wish to prove that $x \in M^{\perp}$, that is, $\langle x, y \rangle = 0$ $\forall y \in M$. This is done using the fact that $M$ is taken to be a vector subspace of $H$: if $y \in M$, then $ty \in M$ $\forall t \in \mathbb{R} \setminus \{0\}$. Since $x \in M^{0}$ and $ty \in M$, it must hold that:
\[ \Re(\langle x, ty \rangle) \le 1 \iff t\,\Re(\langle x, y \rangle) \le 1 \quad \forall t \in \mathbb{R} \setminus \{0\} \iff \Re(\langle x, y \rangle) = 0 \quad \forall y \in M \]
If $H$ is a real Hilbert space, this concludes our proof. If $H$ is complex, we also need to show that the imaginary part of the inner product is zero. We do this using Theorem 1.2, which tells us that $\Im(\langle x, y \rangle) = \Re(\langle x, iy \rangle)$, thus $\Im(\langle x, y \rangle) = \Re(\langle x, iy \rangle) = 0$, as we have previously proven that $\Re(\langle x, z \rangle) = 0$ $\forall z \in M$ and $z = iy \in M$ when $y \in M$ if $M$ is a complex vector subspace. Finally, $\langle x, y \rangle = 0$ $\forall y \in M$ and thus $x \in M^{\perp}$. □

Properties 3 and 4 from Theorem 5.6 imply property 2 of Theorem 5.5, that is, $\overline{M} = M^{\perp\perp}$. In fact, on one side, $M^{0} = M^{\perp}$, so by repeating the polar operation twice we obtain $M^{00} = M^{\perp\perp}$. Furthermore, $M^{00} = \overline{M}$, thus $\overline{M} = M^{\perp\perp}$.

5.4. The (orthogonal) projection theorem in a Hilbert space

We shall now present and demonstrate the most important corollary of the theorem of orthogonal projection on a closed convex set.

THEOREM 5.7 (Orthogonal projection theorem).– Let $H$ be a Hilbert space on $\mathbb{K} = \mathbb{R}$ or $\mathbb{C}$ and let $S$ be a closed proper subspace of $H$. Then:
\[ H = S \oplus S^{\perp} \]


that is, $\forall x \in H$, $\exists s \in S$, $\exists t \in S^{\perp}$: $x = s + t$, and this decomposition is unique, that is, if:
\[ \begin{cases} x = s + t \\ x = s' + t' \end{cases} \quad \text{with } s, s' \in S,\ t, t' \in S^{\perp}, \quad \text{then:} \quad \begin{cases} s = s' \\ t = t' \end{cases} \]
If $S$ is not a proper subspace, then we have the trivial decomposition $H = H \oplus \{0_H\}$.

PROOF.– Take a fixed $x \in H$, the orthogonal projection $P_S(x) \in S$ of $x$ onto $S$ and the residual vector $u = x - P_S(x) \in S^{\perp}$. By the theorem of orthogonal projection on a closed convex set, applied to the closed subspace $S$, $x$ can be decomposed as follows:
\[ x = \underbrace{P_S(x)}_{\in S} + \underbrace{x - P_S(x)}_{\in S^{\perp}} \]
We must now show that a decomposition of this type is unique. Consider the decompositions $x = s + t$ and $x = s' + t'$, with $s, s' \in S$, $t, t' \in S^{\perp}$; then $s + t = s' + t'$, that is:
\[ \underbrace{s - s'}_{\in S} = \underbrace{t' - t}_{\in S^{\perp}} \]
As $S$ and $S^{\perp}$ are vector spaces, they are stable by subtraction, hence the inclusions shown under the braces. We have $S \ni s - s' = t' - t \in S^{\perp}$, thus $s - s' \in S \cap S^{\perp}$ and $t' - t \in S \cap S^{\perp}$. However, $S \cap S^{\perp} = \{0_H\}$, so we must have $s' = s$ and $t' = t$. □

IMPORTANT OBSERVATION.– Whenever we recognize the presence of a closed vector subspace $S$ of a Hilbert space $H$, the orthogonal projection theorem gives a much more profound meaning to the otherwise trivial decomposition $x = (x - y) + y$, with $x \notin S$ and $y \in S$: in fact, choosing $y = P_S(x)$, we know that $y$ is orthogonal to $x - y$. This latter property guarantees the possibility to use the Pythagorean theorem to write $\|x\|^2 = \|x - y\|^2 + \|y\|^2$, which is often extremely useful in both theoretical and practical contexts, as we will see later in this chapter and the following one.

The results introduced above are applied in the exercise below.

Exercise 5.1

Let $\Omega$ be a bounded subset of $\mathbb{R}^n$, and consider the set $M$ of functions $f : \Omega \to \mathbb{R}$, $f \in L^2(\Omega)$, which are constant a.e. Show that:
1) $M$ is a closed vector subspace of $L^2(\Omega)$;
2) $\forall f \in L^2(\Omega)$, the projection of $f$ onto $M$ is the function which is constant a.e. and equal to the average of $f$ on $\Omega$, that is: $P_M f = \frac{1}{|\Omega|} \int_{\Omega} f(x)\,dx$, $|\Omega| = m(\Omega)$, with $m$ the Lebesgue measure on $\mathbb{R}^n$;
3) the orthogonal complement of $M$ in $L^2(\Omega)$ is given by the functions $h$ in $L^2(\Omega)$ with zero average, that is, $\int_{\Omega} h(x)\,dx = 0$.

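Solution to Exercise 5.1

1) $M$ can be characterized as the vector subspace of $L^2(\Omega)$ generated by the function $\mathbb{1}(x) = 1$, which is constant a.e. on $\Omega$. As there is only one generator, $M$ is one-dimensional, hence isomorphic to $\mathbb{R}$; being finite-dimensional, it is closed.

2) Taking $f \in L^2(\Omega)$, then, by the projection theorem, if we write $f = (f - P_M f) + P_M f$, we have $f - P_M f \in M^{\perp}$. Let $g$ be an element in $M$ such that $g(x) = c \ne 0$ a.e. on $\Omega$, and let us calculate the inner product between $f - P_M f$ and $g$ (since $P_M f \in M$ is constant a.e., we interpret $P_M f \in \mathbb{R}$):
\[ 0 = \langle f - P_M f, g \rangle_{L^2(\Omega)} = \int_{\Omega} (f(x) - P_M f)\, g(x)\,dx = \int_{\Omega} f(x) g(x)\,dx - \int_{\Omega} (P_M f)\, g(x)\,dx \]
\[ = c \int_{\Omega} f(x)\,dx - c\,(P_M f) \int_{\Omega} dx = c \left( \int_{\Omega} f(x)\,dx - (P_M f)\,|\Omega| \right) \]
that is, since $c \ne 0$:
\[ P_M f = \frac{1}{|\Omega|} \int_{\Omega} f(x)\,dx \]

3) Taking any $h \in M^{\perp}$, then by definition: $\langle h, g \rangle_{L^2(\Omega)} = 0$ $\forall g \in M$. Now, taking $g(x) = c \ne 0$ a.e., we have:
\[ 0 = \langle h, g \rangle_{L^2(\Omega)} = \int_{\Omega} h(x) g(x)\,dx = c \int_{\Omega} h(x)\,dx, \qquad \forall c \in \mathbb{R} \setminus \{0\} \]
hence $\int_{\Omega} h(x)\,dx = 0$. Conversely, the same computation shows that any $h \in L^2(\Omega)$ with zero average is orthogonal to every $g \in M$.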
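The three points of the solution can be checked on a discrete analogue. The following is only a minimal sketch under stated assumptions: $\Omega$ is replaced by 50 sample points, the integral average by the arithmetic mean, and the $L^2(\Omega)$ inner product by a normalized sum. The sample values in `f` and the helper names `inner` and `project_onto_constants` are hypothetical and chosen for illustration.

```python
import math

def inner(u, v):
    """Discrete stand-in for the L2 inner product: normalized sum of products."""
    return sum(a * b for a, b in zip(u, v)) / len(u)

def project_onto_constants(f):
    """Projection onto the constant vectors: every entry becomes the mean of f."""
    mean = sum(f) / len(f)
    return [mean] * len(f)

# Hypothetical samples of a function on 50 points (an assumption of this sketch).
f = [math.sin(3 * k / 50) + 0.5 for k in range(50)]

p = project_onto_constants(f)          # discrete analogue of P_M f
h = [a - b for a, b in zip(f, p)]      # residual f - P_M f

print("sum of residual:", sum(h))                       # zero average, up to rounding
print("<h, p>         :", inner(h, p))                  # orthogonality, up to rounding
print("Pythagoras gap :", inner(f, f) - inner(p, p) - inner(h, h))
```

The residual `h` has zero mean and is orthogonal to the constant part `p`, so the squared norm of `f` splits as in the Pythagorean theorem, which mirrors points 2) and 3) of the exercise.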

What we have just proven and the orthogonal projection theorem imply that any function $f \in L^2(\Omega)$, $\Omega \subset \mathbb{R}^n$ such that $m(\Omega) < +\infty$, can be represented in a unique manner as:
\[ f = \langle f \rangle_{\Omega} + h \]


where $\langle f \rangle_{\Omega}$ is the constant function on $\Omega$ equal to the average of $f$ on $\Omega$, and $h \in L^2(\Omega)$ is such that $\langle h \rangle_{\Omega} = 0$. This implies that $h$ must be an oscillating function, with oscillations that cancel out when we consider its average. This result is already remarkable on its own, but it will be further refined by the Fourier expansion of $f$, which will be described later in this chapter. □

5.5. Orthonormal systems and Hilbert bases

As we saw in Chapters 1 and 2, the presence of an orthonormal basis in a Euclidean space makes it easy to calculate vector components and to characterize orthogonal projections. Furthermore, using the Fourier basis, it is also possible to define Fourier coefficients and the DFT. In this section, we shall describe the conditions which must be added in order to extend these considerations to infinite-dimensional Hilbert spaces. Let us begin with a definition.

DEFINITION 5.6 (Orthonormal system).– An orthonormal family of elements in a Hilbert space is known as an orthonormal system.

The properties of orthonormal systems will be analyzed in the context of separable Hilbert spaces, which are defined below.

DEFINITION 5.7.– A Hilbert space $H$ is said to be separable if there exists a subset $E \subseteq H$ which is countable and dense in $H$: $\operatorname{card}(E) = \aleph_0$, $\overline{E} = H$.

As the vast majority of Hilbert spaces are, in fact, separable, we shall give a counter-example of a non-separable Hilbert space in section 5.5.3. The main advantage of working with separable Hilbert spaces is set out in Theorem 5.8.

THEOREM 5.8.– All orthonormal systems in a separable infinite-dimensional Hilbert space $H$ are countable.

PROOF.– Let $M$ be an infinite orthonormal system in $H$. Given that $H$ is separable, there exists a subset $E \subseteq H$ which is countable and dense in $H$: $\overline{E} = H$. From the characterization 2 of density given in Definition 4.4, we can guarantee that, for any element $x \in M$ and any arbitrary but fixed $\varepsilon > 0$, $\exists u_x \in E$ such that $\|x - u_x\| < \varepsilon$.
If we can show that the correspondence defined by the function:
\[ \imath : M \longrightarrow E, \qquad x \longmapsto \imath(x) = u_x \]


is injective, then the theorem will be proven. In fact, if this is the case, $M$ is in bijective correspondence with $\imath(M) \subseteq E$, which is an infinite part of a countable set, and is therefore itself countable.

To this aim, we take any $y \in M$ such that $y \ne x$, and $u_y \in E$ such that $\|y - u_y\| < \varepsilon$ for the same arbitrary but fixed $\varepsilon > 0$. Since $x \ne y$ are two distinct points arbitrarily selected in $M$, the injectivity of $\imath$ corresponds to the fact that $\imath(x) \ne \imath(y)$, that is, $u_x \ne u_y$. To prove this, we begin by noting that, since $x$ and $y$ belong to an orthonormal system, their distance is equal to $\sqrt{2}$ and we can write:
\[ \sqrt{2} = \|x - y\| = \|x - u_x + u_y - y + u_x - u_y\| \underset{\text{triang. ineq.}}{\le} \|x - u_x\| + \|y - u_y\| + \|u_x - u_y\| < 2\varepsilon + \|u_x - u_y\| \]
that is, $\|u_x - u_y\| > \sqrt{2} - 2\varepsilon$. Now $\sqrt{2} - 2\varepsilon > 0 \iff \varepsilon < \sqrt{2}/2$; thus, we simply need to fix $\varepsilon \in (0, \sqrt{2}/2)$ to get $\|u_x - u_y\| > 0$, and hence $u_x \ne u_y$. □

This theorem is the reason for selecting a discrete index, for example $n \in \mathbb{N}$ or $\mathbb{Z}$, to label the elements of an orthonormal system in a separable Hilbert space.

CONVENTION.– From now on, all Hilbert spaces $H$ will be assumed to be separable, unless otherwise stated.

The two most important propositions related to orthonormal systems are Bessel's inequality and the Fischer-Riesz theorem.

5.5.1. Bessel's inequality and Fourier coefficients

The expansion of a vector $v \in \mathbb{K}^n$, $n < +\infty$, with respect to an orthonormal basis $(u_i)_{i=1}^{n}$ is written as $v = \sum_{i=1}^{n} \langle v, u_i \rangle u_i$, $\langle v, u_i \rangle$ being the components of $v$ in this basis. Furthermore, the Plancherel identity holds true: $\|v\|^2 = \sum_{i=1}^{n} |\langle v, u_i \rangle|^2$. If we want to extend this property to an orthonormal system of an infinite-dimensional Hilbert space $H$, we immediately see that a necessary condition must be verified: for any element $x \in H$, the sequence $(\langle x, u_n \rangle)_{n \in \mathbb{N}}$ must decay toward 0 when $n \to +\infty$; otherwise, the series $\sum_{n \in \mathbb{N}} \langle x, u_n \rangle u_n$ would not converge. The following result guarantees that this necessary condition is always satisfied; the Plancherel identity, on the other hand, is not guaranteed to hold.


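THEOREM 5.9 (Bessel's inequality).– Let $(u_n)_{n \in \mathbb{N}} \subset H$ be an orthonormal system in a Hilbert space $H$. Then, $\forall x \in H$, it holds that:
\[ \sum_{n \in \mathbb{N}} |\langle x, u_n \rangle|^2 \le \|x\|^2 \qquad [5.5] \]
More precisely, the difference between the two sides of inequality [5.5] may be quantified as:
\[ \|x\|^2 - \sum_{n \in \mathbb{N}} |\langle x, u_n \rangle|^2 = \Big\| x - \sum_{n \in \mathbb{N}} \langle x, u_n \rangle u_n \Big\|^2 \qquad [5.6] \]

PROOF.– Bessel's inequality can be proved by showing that the difference $\|x\|^2 - \sum_{n \in \mathbb{N}} |\langle x, u_n \rangle|^2$ is equal to the square of a norm, which is $\ge 0$.

For simplicity's sake, we shall write $\lambda_n = \langle x, u_n \rangle \iff \overline{\lambda_n} = \langle u_n, x \rangle$ $\forall n \in \mathbb{N}$ and consider any $N \in \mathbb{N}$. By Carnot's theorem (Theorem 1.5) we have:
\[ \Big\| x - \sum_{n=0}^{N} \lambda_n u_n \Big\|^2 = \|x\|^2 - \Big\langle x, \sum_{n=0}^{N} \lambda_n u_n \Big\rangle - \Big\langle \sum_{n=0}^{N} \lambda_n u_n, x \Big\rangle + \Big\| \sum_{n=0}^{N} \lambda_n u_n \Big\|^2 \]
Applying sesquilinearity to the two intermediary terms, and the generalized Pythagorean theorem to the final term, the previous equality can be rewritten as follows:
\[ \Big\| x - \sum_{n=0}^{N} \lambda_n u_n \Big\|^2 = \|x\|^2 - \sum_{n=0}^{N} \overline{\lambda_n} \langle x, u_n \rangle - \sum_{n=0}^{N} \lambda_n \langle u_n, x \rangle + \sum_{n=0}^{N} |\lambda_n|^2 \|u_n\|^2 \]
From the definitions of $\lambda_n$ and $\overline{\lambda_n}$, and using the fact that $\|u_n\|^2 = 1$ for all $n$, the final equality becomes:
\[ \Big\| x - \sum_{n=0}^{N} \lambda_n u_n \Big\|^2 = \|x\|^2 - \sum_{n=0}^{N} \overline{\lambda_n} \lambda_n - \sum_{n=0}^{N} \lambda_n \overline{\lambda_n} + \sum_{n=0}^{N} |\lambda_n|^2 = \|x\|^2 - \sum_{n=0}^{N} |\lambda_n|^2 \]
that is:
\[ \|x\|^2 - \sum_{n=0}^{N} |\langle x, u_n \rangle|^2 = \Big\| x - \sum_{n=0}^{N} \langle x, u_n \rangle u_n \Big\|^2 \]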


As we did not impose any restrictions on $N \in \mathbb{N}$, this equality holds true for an arbitrarily large value of $N$, that is:
\[ \|x\|^2 - \sum_{n \in \mathbb{N}} |\langle x, u_n \rangle|^2 = \Big\| x - \sum_{n \in \mathbb{N}} \langle x, u_n \rangle u_n \Big\|^2 \]
□

Bessel's inequality allows us to generalize the definition of Fourier coefficients encountered in Chapter 2.

DEFINITION 5.8 (Generalized Fourier coefficients).– The scalars $\langle x, u_n \rangle \in \mathbb{K}$ are said to be the generalized Fourier coefficients of $x$ with respect to the orthonormal system $(u_n)_{n \in \mathbb{N}}$, and are written as:
\[ \hat{x}(n) = \langle x, u_n \rangle \qquad \forall n \in \mathbb{N} \]

Bessel's inequality can be reformulated by stating that, for all $x \in H$, the sequence $\hat{x} \equiv (\hat{x}(n))_{n \in \mathbb{N}}$ belongs to $\ell^2(\mathbb{N}, \mathbb{K})$, and that:
\[ \|\hat{x}\|_{\ell^2} \le \|x\| \qquad \forall x \in H \]
We see that the sequence of generalized Fourier coefficients always decays toward 0. For Hilbert spaces where $x$ can be identified with a function, analyzing the speed of decay of Fourier coefficients provides interesting information concerning the regularity of the function itself.

Equation [5.6] gives an estimation of the difference between $\|x\|^2$ and $\|\hat{x}\|_{\ell^2}^2$ and, rewritten with the notation introduced above, immediately implies Corollary 5.2.

COROLLARY 5.2.– Let $H$ be a Hilbert space and $(u_n)_{n \in \mathbb{N}}$ any orthonormal system in $H$. Then:
\[ \Big\| x - \sum_{n \in \mathbb{N}} \hat{x}(n) u_n \Big\|^2 = \|x\|^2 - \|\hat{x}\|_{\ell^2}^2 \]
Specifically, $\sum_{n \in \mathbb{N}} \hat{x}(n) u_n$ converges to $x$ if and only if $\|\hat{x}\|_{\ell^2} = \|x\|$.
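Identity [5.6] can be checked by hand in a small finite-dimensional case. The sketch below is an illustration under stated assumptions, not part of the original text: it works in $\mathbb{R}^4$ (so no conjugation is needed), and the two-element orthonormal system `u` and the vector `x` are hypothetical choices. Since `u` is not a basis, Bessel's inequality is strict here.

```python
import math

def inner(x, y):
    """Real Euclidean inner product (no conjugation needed over R)."""
    return sum(a * b for a, b in zip(x, y))

# Hypothetical orthonormal system in R^4 with only two elements (not a basis).
u = [
    [1 / math.sqrt(2), 1 / math.sqrt(2), 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]
x = [3.0, 1.0, 2.0, -1.0]

# Generalized Fourier coefficients <x, u_n>.
coeffs = [inner(x, un) for un in u]

# Partial Fourier sum and residual x - sum <x, u_n> u_n.
partial = [sum(c * un[i] for c, un in zip(coeffs, u)) for i in range(len(x))]
residual = [a - b for a, b in zip(x, partial)]

norm_x_sq = inner(x, x)                      # ||x||^2
bessel_sum = sum(c * c for c in coeffs)      # sum of |<x, u_n>|^2
residual_sq = inner(residual, residual)      # squared norm of the residual

print(norm_x_sq, bessel_sum, residual_sq)
```

With these values, $\|x\|^2 = 15$, the coefficient sum is $8 + 4 = 12$ and the residual $(1, -1, 0, -1)$ has squared norm $3$, so both sides of [5.6] equal $3$ up to rounding.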


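5.5.2. The Fischer-Riesz theorem

Theorem 5.10, which is fundamental in functional analysis, is sometimes referred to as the Fischer-Riesz theorem, for example in the classic Dunford and Schwartz (1958).

THEOREM 5.10 (Fischer-Riesz).– Let $H$ be a Hilbert space, $(u_n)_{n \in \mathbb{N}}$ an orthonormal system in $H$ and $(k_n)_{n \in \mathbb{N}}$ a sequence of scalars in $\mathbb{K} = \mathbb{R}$ or $\mathbb{C}$.
1) Then:
\[ \sum_{n \in \mathbb{N}} k_n u_n \text{ converges (in the norm } \|\cdot\| \text{ of } H) \iff \sum_{n \in \mathbb{N}} |k_n|^2 \text{ converges (in } \mathbb{K}) \]
that is, $\sum_{n \in \mathbb{N}} k_n u_n$ converges $\iff (k_n)_{n \in \mathbb{N}} \in \ell^2(\mathbb{N}, \mathbb{K})$.
2) If $\sum_{n \in \mathbb{N}} k_n u_n$ converges to the sum $x$, that is, $x = \sum_{n \in \mathbb{N}} k_n u_n$, then:
\[ k_n = \langle x, u_n \rangle = \hat{x}(n) \qquad \text{and} \qquad \|x\|^2 = \sum_{n \in \mathbb{N}} |k_n|^2 \]
that is, Bessel's inequality becomes Plancherel's equality $\|x\|^2 = \sum_{n \in \mathbb{N}} |\langle x, u_n \rangle|^2 = \|\hat{x}\|_{\ell^2}^2$.

PROOF.–
1) We wish to verify that studying the convergence of $\sum_{n \in \mathbb{N}} k_n u_n$ is equivalent to studying the convergence of $\sum_{n \in \mathbb{N}} |k_n|^2$. This will be done by using the fact that $H$ and $\mathbb{K}$ are complete, so the Cauchy condition is necessary and sufficient for the sequences to converge, and by remembering that the series $\sum_{n \in \mathbb{N}} k_n u_n$ is the sequence $(S_N)_{N \in \mathbb{N}} = \big( \sum_{n=0}^{N} k_n u_n \big)_{N \in \mathbb{N}}$ of partial sums.

The Cauchy condition for $(S_N)_{N \in \mathbb{N}}$ is:
\[ \forall \varepsilon > 0\ \exists K_{\varepsilon} > 0 : r > s \ge K_{\varepsilon} \implies \|S_r - S_s\| = \Big\| \sum_{n=s+1}^{r} k_n u_n \Big\| < \varepsilon \]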

Since $\big\| \sum_{n=s+1}^{r} k_n u_n \big\| < \varepsilon \iff \big\| \sum_{n=s+1}^{r} k_n u_n \big\|^2 < \varepsilon^2 \equiv \delta$, as the inequality concerns two real positive numbers, the Cauchy condition for $(S_N)_{N \in \mathbb{N}}$ can be restated as follows:
\[ \forall \delta > 0\ \exists K_{\delta} > 0 : r > s \ge K_{\delta} \implies \Big\| \sum_{n=s+1}^{r} k_n u_n \Big\|^2 < \delta \]
The usefulness of considering the squared norm is that, thanks to the orthogonality of the $u_n$, we can use the generalized Pythagorean theorem to write (recall that $\|u_n\|^2 = 1$):
\[ \Big\| \sum_{n=s+1}^{r} k_n u_n \Big\|^2 = \sum_{n=s+1}^{r} \|k_n u_n\|^2 = \sum_{n=s+1}^{r} |k_n|^2 \|u_n\|^2 = \sum_{n=s+1}^{r} |k_n|^2 \]
The Cauchy condition for the sequence of partial sums of the series $\sum_{n \in \mathbb{N}} k_n u_n$ can then be rewritten as:
\[ \forall \delta > 0\ \exists K_{\delta} > 0 : r > s \ge K_{\delta} \implies \sum_{n=s+1}^{r} |k_n|^2 < \delta, \]
which is the Cauchy condition for the sequence of partial sums of the series $\sum_{n \in \mathbb{N}} |k_n|^2$. Hence, the study of the convergence of the two series is equivalent.

2) Assuming that the series $\sum_{m \in \mathbb{N}} k_m u_m$ converges toward the sum $x$, then, by continuity of the inner product:
\[ \langle x, u_n \rangle = \Big\langle \sum_{m \in \mathbb{N}} k_m u_m, u_n \Big\rangle = \sum_{m \in \mathbb{N}} k_m \langle u_m, u_n \rangle = \sum_{m \in \mathbb{N}} k_m \delta_{m,n} = k_n, \qquad \forall n \in \mathbb{N}. \]
Hence: $x = \sum_{n \in \mathbb{N}} \langle x, u_n \rangle u_n = \sum_{n \in \mathbb{N}} \hat{x}(n) u_n$. The fact that property 2 implies that Bessel's inequality becomes Plancherel's equality is a direct consequence of Corollary 5.2. An alternative proof is possible using the continuity of the norm (again with $\|u_n\|^2 = 1$):
\[ \|x\|^2 = \Big\| \sum_{n \in \mathbb{N}} \langle x, u_n \rangle u_n \Big\|^2 = \sum_{n \in \mathbb{N}} |\langle x, u_n \rangle|^2 \|u_n\|^2 = \sum_{n \in \mathbb{N}} |\langle x, u_n \rangle|^2 = \|\hat{x}\|_{\ell^2}^2 \]
□

COROLLARY 5.3.– Let $H$ be a Hilbert space, $x \in H$ and let $(u_n)_{n \in \mathbb{N}}$ be an orthonormal system in $H$. Then the series $\sum_{n \in \mathbb{N}} \hat{x}(n) u_n$ is always convergent (with respect to the norm $\|\cdot\|$ of $H$).


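PROOF.– By Bessel's inequality, $(\hat{x}(n))_{n \in \mathbb{N}} \in \ell^2(\mathbb{N}, \mathbb{K})$, that is, $\sum_{n \in \mathbb{N}} |\hat{x}(n)|^2$ is convergent in $\mathbb{K}$; by property 1 of the Fischer-Riesz theorem, the series $\sum_{n \in \mathbb{N}} \hat{x}(n) u_n$ is convergent in $H$. □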
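The Cauchy criterion at the heart of the Fischer-Riesz theorem can be observed numerically. The sketch below is an illustration under stated assumptions: it uses the identity $\|S_r - S_s\|^2 = \sum_{n=s+1}^{r} |k_n|^2$, valid for any orthonormal system, and compares two hypothetical coefficient sequences, $k_n = 1/n$ (square summable) and $k_n = 1/\sqrt{n}$ (not square summable). The function names are illustrative.

```python
def tail_sq(k, s, r):
    """Squared norm of S_r - S_s for an orthonormal system: sum of |k(n)|^2, n = s+1..r."""
    return sum(abs(k(n)) ** 2 for n in range(s + 1, r + 1))

def k_square_summable(n):
    return 1.0 / n            # sum of 1/n^2 converges

def k_not_square_summable(n):
    return 1.0 / n ** 0.5     # sum of 1/n diverges

for s in (10, 100, 1000):
    r = 2 * s
    print(s, tail_sq(k_square_summable, s, r), tail_sq(k_not_square_summable, s, r))
```

For the first sequence, the tails between $s$ and $2s$ shrink roughly like $1/(2s)$, so the partial sums are Cauchy. For the second, the tails stay close to $\ln 2$ however large $s$ is, so the Cauchy condition fails and the series $\sum k_n u_n$ cannot converge in $H$.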

NOTABLE EXAMPLE.– The fact that $\sum_{n \in \mathbb{N}} \hat{x}(n) u_n$ is always convergent does not necessarily imply that it converges to $x$, as we show with the following counter-example. We take: $H = L^2[-\pi, \pi]$, $u_n(t) = \frac{1}{\sqrt{\pi}} \sin(nt)$, $n \in \mathbb{N}$ and $t \in [-\pi, \pi]$. It is easy to verify that $(u_n)_{n \in \mathbb{N}}$ is an orthonormal system for $H$. Taking $x(t) = \cos(t)$, by direct calculation, we obtain:
\[ \sum_{n \in \mathbb{N}} \hat{x}(n) u_n = \sum_{n=1}^{\infty} \frac{1}{\pi} \left( \int_{-\pi}^{\pi} \cos(t) \sin(nt)\,dt \right) \sin(nt) \]
Furthermore, $\int_{-\pi}^{\pi} \cos(t) \sin(nt)\,dt = 0$, as it is the integral of an odd function on a symmetrical domain, thus:
\[ \sum_{n \in \mathbb{N}} \hat{x}(n) u_n = \sum_{n=1}^{\infty} 0 \cdot \sin(nt) = 0 \]
where $0$ is the identically null function on $[-\pi, \pi]$, which is clearly different from the function $\cos(t)$; thus, $\sum_{n \in \mathbb{N}} \hat{x}(n) u_n \ne x$.
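The counter-example can be reproduced with a simple quadrature. The sketch below is only an illustration: the midpoint rule, the number of nodes and the helper names `integrate` and `u` are assumptions of this sketch, not part of the original text. It approximates the first five generalized Fourier coefficients of $\cos$ with respect to the sine system, together with $\|\cos\|^2$.

```python
import math

def integrate(g, a, b, n=2000):
    """Midpoint rule with n subintervals (an illustrative choice of quadrature)."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def u(n):
    """n-th element of the sine system: sin(n t) / sqrt(pi)."""
    return lambda t: math.sin(n * t) / math.sqrt(math.pi)

x = math.cos

# Generalized Fourier coefficients <x, u_n> for n = 1..5.
coeffs = []
for n in range(1, 6):
    un = u(n)
    coeffs.append(integrate(lambda t: x(t) * un(t), -math.pi, math.pi))

norm_x_sq = integrate(lambda t: x(t) ** 2, -math.pi, math.pi)

print("coefficients:", coeffs)          # all numerically zero
print("||x||^2     :", norm_x_sq)       # close to pi
```

All computed coefficients vanish up to rounding, while $\|x\|^2 = \pi$, so Bessel's inequality is strict and, by Corollary 5.2, the series of the $\hat{x}(n) u_n$ cannot converge to $x$.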

5.5.3. Characterizations of a Hilbert basis (or complete orthonormal system)

The example above shows that, for an orthonormal system in a Hilbert space $H$, the series of Fourier coefficients of $x \in H$ multiplied by the elements of this orthonormal system is not guaranteed to converge in norm to $x$ itself. This fact naturally raises the question of whether a condition which ensures such a convergence exists. In this section, we shall prove that the answer to this question is affirmative.

In section 1.5, we saw that, in finite dimension, this condition is that the orthonormal system must be an orthonormal basis, that is, a maximal set of unit vectors orthogonal to each other, where "maximal" means that no other unit vector exists which is orthogonal to all of them. Remarkably, this property also characterizes the bases of an infinite-dimensional Hilbert space, but the terminology used in this case is different.


DEFINITION 5.9 (Complete orthonormal system).– Let $(u_n)_{n \in \mathbb{N}} \subset H$ be an orthonormal system of a Hilbert space $H$. If $(u_n)_{n \in \mathbb{N}}$ is not a proper subset of another orthonormal system of $H$, that is, if there are no other unit vectors orthogonal to all the vectors $(u_n)_{n \in \mathbb{N}}$, then this system is referred to as a complete (or total) orthonormal system, or as a Hilbert basis.

The property of being a Hilbert basis, in the sense defined above, is equivalent to five other properties.

THEOREM 5.11.– Let $(u_n)_{n \in \mathbb{N}}$ be an orthonormal system of a Hilbert space $H$. The following statements are equivalent:
1) $(u_n)_{n \in \mathbb{N}}$ is a Hilbert basis;
2) $\langle x, u_n \rangle \equiv \hat{x}(n) = 0$ $\forall n \in \mathbb{N} \iff x = 0_H$, that is, $0_H$ is the only vector which is orthogonal to all vectors of a complete orthonormal system (or, equivalently, the only vector $x \in H$ whose generalized Fourier coefficients are all zero is the null vector);
3) $\overline{\operatorname{span}}((u_n)_{n \in \mathbb{N}}) = H$, that is, $(u_n)_{n \in \mathbb{N}}$ generates a vector subspace which is dense in $H$;
4) $\forall x \in H$ (generalized Fourier series expansion):
\[ x = \sum_{n \in \mathbb{N}} \langle x, u_n \rangle u_n = \sum_{n \in \mathbb{N}} \hat{x}(n) u_n \]
5) $\forall x, y \in H$ (Parseval's identity):
\[ \langle x, y \rangle = \sum_{n \in \mathbb{N}} \langle x, u_n \rangle \langle u_n, y \rangle = \langle \hat{x}, \hat{y} \rangle_{\ell^2(\mathbb{N}, \mathbb{K})} \]
6) $\forall x \in H$ (Plancherel's identity):
\[ \|x\|^2 = \sum_{n \in \mathbb{N}} |\langle x, u_n \rangle|^2 = \|\hat{x}\|_{\ell^2}^2 \]

PROOF.– Our proof consists of the following steps: 1) $\Rightarrow$ 2) $\Rightarrow$ 3) $\Rightarrow$ 4) $\Rightarrow$ 5) $\Rightarrow$ 6) $\Rightarrow$ 1).

1) $\Rightarrow$ 2): reasoning by contradiction, if statement 1 is true and statement 2 is false, then $\exists x^{*} \in H$, $x^{*} \ne 0_H$, such that $\langle x^{*}, u_n \rangle = 0$ $\forall n \in \mathbb{N}$; but then the vector $u^{*} = \frac{x^{*}}{\|x^{*}\|}$ is a unit vector orthogonal to all of the elements of $(u_n)_{n \in \mathbb{N}}$, thus $(u^{*}, (u_n)_{n \in \mathbb{N}})$ would be a larger orthonormal system than $(u_n)_{n \in \mathbb{N}}$, which contradicts the completeness of $(u_n)_{n \in \mathbb{N}}$.

2) $\Rightarrow$ 3): statement 2 implies $((u_n)_{n \in \mathbb{N}})^{\perp} = \{0_H\} \iff \big( \overline{\operatorname{span}}((u_n)_{n \in \mathbb{N}}) \big)^{\perp} = \{0_H\}$, by property 8 from Theorem 5.1; if we take the orthogonal complement of both sides,


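we obtain $\big( \overline{\operatorname{span}}((u_n)_{n \in \mathbb{N}}) \big)^{\perp\perp} = \{0_H\}^{\perp} = H$; then $H = \big( \overline{\operatorname{span}}((u_n)_{n \in \mathbb{N}}) \big)^{\perp\perp} = \overline{\operatorname{span}}((u_n)_{n \in \mathbb{N}})$, by Theorem 5.5.

3) $\Rightarrow$ 4): let us consider $x$, calculate the inner products with $(u_n)_{n \in \mathbb{N}}$ and write the series $\sum_{n \in \mathbb{N}} \langle x, u_n \rangle u_n$, which we know converges to a certain point $y \in H$. We must show that, if statement 3 holds, then it follows that $y = x$. To this aim, note that the second part of the Fischer-Riesz theorem tells us that $\langle x, u_n \rangle = \langle y, u_n \rangle$ $\forall n \in \mathbb{N}$, that is, $\langle x - y, u_n \rangle = 0$ $\forall n \in \mathbb{N}$, that is:
\[ x - y \in ((u_n)_{n \in \mathbb{N}})^{\perp} = \big( \overline{\operatorname{span}}((u_n)_{n \in \mathbb{N}}) \big)^{\perp} \underset{(3)}{=} H^{\perp} = \{0_H\} \]
that is, $y = x$.

4) $\Rightarrow$ 5): let us consider any $x, y \in H$ and write their generalized Fourier series. By statement 4, we have:
\[ x = \sum_{n \in \mathbb{N}} \langle x, u_n \rangle u_n, \qquad y = \sum_{m \in \mathbb{N}} \langle y, u_m \rangle u_m \]
thus:
\[ \langle x, y \rangle = \Big\langle \sum_{n \in \mathbb{N}} \langle x, u_n \rangle u_n, \sum_{m \in \mathbb{N}} \langle y, u_m \rangle u_m \Big\rangle \]
By the continuity and linearity of the inner product, we have:
\[ \langle x, y \rangle = \sum_{n \in \mathbb{N}} \langle x, u_n \rangle \Big\langle u_n, \sum_{m \in \mathbb{N}} \langle y, u_m \rangle u_m \Big\rangle \]
then, by the continuity and sesquilinearity of the inner product:
\[ \langle x, y \rangle = \sum_{n \in \mathbb{N}} \sum_{m \in \mathbb{N}} \langle x, u_n \rangle \overline{\langle y, u_m \rangle} \langle u_n, u_m \rangle = \sum_{n \in \mathbb{N}} \sum_{m \in \mathbb{N}} \langle x, u_n \rangle \langle u_m, y \rangle \delta_{n,m} \]
that is:
\[ \langle x, y \rangle = \sum_{n \in \mathbb{N}} \langle x, u_n \rangle \langle u_n, y \rangle \]

5) $\Rightarrow$ 6): consider $y = x$ in statement 5:
\[ \|x\|^2 = \langle x, x \rangle = \sum_{n \in \mathbb{N}} \langle x, u_n \rangle \langle u_n, x \rangle = \sum_{n \in \mathbb{N}} \langle x, u_n \rangle \overline{\langle x, u_n \rangle} = \sum_{n \in \mathbb{N}} |\langle x, u_n \rangle|^2. \]

6) $\Rightarrow$ 1): reasoning by contradiction, if statement 6 is true and statement 1 is false, then $\exists u^{*} \in H$, $\|u^{*}\| = 1$ and $\langle u^{*}, u_n \rangle = 0$ $\forall n \in \mathbb{N}$; this would give us $\sum_{n \in \mathbb{N}} |\langle u^{*}, u_n \rangle|^2 = 0$, which contradicts statement 6, since it states that $\sum_{n \in \mathbb{N}} |\langle u^{*}, u_n \rangle|^2 = \|u^{*}\|^2 = 1$. □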


IMPORTANT NOTE CONCERNING PROPERTY 4.– The expansion into a generalized Fourier series on a Hilbert basis is an extension of the decomposition theorem for vectors on an orthonormal basis in a Euclidean space of finite dimension $d$, as shown in Table 5.1.

$\mathbb{K}^d$ | Hilbert space $H$
Orthonormal basis: $(u_i)_{i=1,\dots,d}$ | Hilbert basis: $(u_n)_{n \in \mathbb{N}}$
Expansion: $\forall x \in \mathbb{K}^d$, $x = \sum_{i=1}^{d} \langle x, u_i \rangle u_i$ | Fourier series: $\forall x \in H$, $x = \sum_{n \in \mathbb{N}} \langle x, u_n \rangle u_n$
Components: $\langle x, u_i \rangle$ | Fourier coefficients: $\langle x, u_n \rangle$

Table 5.1. Analogies between a finite-dimensional Euclidean space and an infinite-dimensional Hilbert space

The generalization of the canonical basis of the space $\ell^2(\mathbb{Z}_N)$, introduced in section 2.1, is the canonical Hilbert basis of $H = \ell^2(\mathbb{Z}, \mathbb{K})$ given by the vectors $(e_k)_{k \in \mathbb{Z}}$, $e_k(n) = \delta_{k,n}$ $\forall n \in \mathbb{Z}$:
\[ (e_1 = (1, 0, 0, \dots),\ e_2 = (0, 1, 0, \dots),\ \dots) \]
The orthonormal property is obvious; completeness, for example, follows from the fact that the only vector which is orthogonal to $e_1, e_2, \dots$ is the zero vector.

THEOREM 5.12.– All Hilbert spaces $H$ admit a Hilbert basis.

PROOF.– Let $\mathcal{O}$ be the collection of all orthonormal families in $H$. $\mathcal{O}$ is a set ordered by inclusion. If $\Phi \subset \mathcal{O}$ is linearly ordered, then the union of all elements of $\Phi$ is an upper bound. Zorn's lemma (Moretti 2013) guarantees the existence of a maximal element in $\mathcal{O}$. □

EXAMPLE OF A NON-SEPARABLE HILBERT SPACE.– The Hilbert space in Theorem 5.11 was implicitly assumed to be separable. Any Hilbert space which does not verify any of the properties which characterize a Hilbert basis is non-separable. We shall use property 2 from Theorem 5.11 to illustrate an example of a non-separable Hilbert space. We begin by defining the following space:
\[ H = \{ f : \mathbb{R} \to \mathbb{K} : \exists E_f \subset \mathbb{R},\ \operatorname{card}(E_f) \le \aleph_0 : f|_{E_f} \in \ell^2(\mathbb{N}, \mathbb{K}) \text{ and } f|_{\mathbb{R} \setminus E_f} = 0_{\mathbb{R} \setminus E_f} \} \]
This is the space made up of all functions $f$ defined on $\mathbb{R}$ with values in $\mathbb{K}$, which vanish everywhere except on a finite or countable subset $E_f$ of $\mathbb{R}$, and such that the sequence $f : E_f \to \mathbb{K}$ is square summable.


$H$ is a vector space, with respect to the pointwise-defined linear operations, which may be equipped with the following inner product:
\[ \langle f, g \rangle = \sum_{x \in E_f \cap E_g} f(x) \overline{g(x)}, \qquad f, g \in H \]
This is well defined since, by definition of $H$, the sum is either finite or a convergent series (evidently, if $\mathbb{K} = \mathbb{R}$, the conjugation operation becomes the identity). We can easily verify that $H$ is a Hilbert space with respect to the topology induced by this inner product.

Reasoning by contradiction, let us suppose that $H$ is separable, so that any Hilbert basis is countable. Then let $u \equiv (u_n)_{n \in \mathbb{N}}$ be a Hilbert basis in $H$, under the separability hypothesis, and take $U := \bigcup_{n \in \mathbb{N}} U_n$, where the sets $U_n \subset \mathbb{R}$ $\forall n \in \mathbb{N}$ are such that $u_n|_{U_n} \in \ell^2(\mathbb{N}, \mathbb{K})$ and $u_n|_{\mathbb{R} \setminus U_n} = 0_{\mathbb{R} \setminus U_n}$. If we can show that there exists an element $f_u$ in $H$ which is orthogonal to all $u_n$ and which is not the identically null function on $\mathbb{R}$, this would prove that property 2 of Theorem 5.11 does not hold: this contradiction implies that $H$ cannot be separable.

To construct an element of this sort, we begin by noting that $U$ is a countable union of countable or finite sets, and is thus, itself, either countable or finite. Considering any point $\bar{x} \in \mathbb{R} \setminus U$, we can therefore define $f_u : \mathbb{R} \to \mathbb{K}$ as:
\[ f_u(x) = \begin{cases} 1 & \text{if } x = \bar{x} \\ 0 & \text{otherwise} \end{cases} \]
to obtain an element in $H$ such that $\langle u_n, f_u \rangle = 0$ $\forall n \in \mathbb{N}$, but $f_u \ne 0_{\mathbb{R}}$.

The fact that all complete orthonormal systems of a separable Hilbert space $H$ of infinite dimension are countable should not lead us to think that $H$ itself is of countable dimension as a vector space. In other words, if we consider $H$ simply as a vector space, rather than a Hilbert space, then by definition its dimension is the cardinality of a basis in the algebraic sense, that is, a subset $B \subset H$ of linearly independent elements in $H$ such that any element in $H$ can be obtained through a finite linear combination of elements in the basis $B$. The following result, which we shall not prove, gives us quite surprising information about the difference between the cardinality of complete orthonormal systems and that of algebraic bases of an infinite-dimensional Hilbert space.

THEOREM 5.13.– If the common cardinality of the Hilbert bases of a Hilbert space $H$ (separable or otherwise) is $\aleph_0$, then the cardinality of the dimension of $H$, as a vector space, cannot be less than $\aleph_1$.

It follows from this theorem that, as vector spaces, separable Hilbert spaces possess at least the cardinality of the continuum, that is a maximal system of linearly


independent vectors possesses at least the cardinality of the continuum. The orthonormality requirement implies a further constraint: the distance between any two distinct elements of the basis must be $\sqrt{2}$, and this forces the cardinality of a complete orthonormal system to drop to that of the natural numbers.

Nevertheless, it is important to note, once again, that given a Hilbert basis, any element in an infinite-dimensional Hilbert space can be reconstructed via the generalized Fourier series in the sense of the Hilbert norm; this is by no means equivalent to the possibility of reconstructing elements by means of a finite linear combination.

This consideration shows that the concept of Hilbert basis is the most adequate to "parameterize" the elements of an infinite-dimensional Hilbert space via their generalized Fourier coefficients relative to the Hilbert basis, rather than a basis in the algebraic sense. The reason for this lies in the fact that a Hilbert basis interacts with the rich geometric structure of the Hilbert space generated by the inner product via Fourier coefficients, while a mere algebraic basis only takes into account the linear structure.

The following definition establishes a specific terminology for the dimension of Hilbert spaces, adopted by certain authors, that we consider particularly adequate.

DEFINITION 5.10 (Orthogonal dimension).– Let $H$ be a Hilbert space. The orthogonal dimension of $H$ is the common cardinality of all Hilbert bases in $H$.

Evidently, the orthogonal dimension coincides with the ordinary dimension for a finite-dimensional Hilbert space, but the same cannot be said in infinite dimensions.

5.5.4. Isomorphisms between Hilbert spaces

One final property which highlights the analogy between Hilbert spaces and finite-dimensional Euclidean spaces is the existence of a prototype for these spaces. As we have seen, the dimension of a vector space $V$ of finite dimension $d$ is sufficient to characterize it up to an isomorphism.
In fact, we know that, for any fixed basis of $V$, the correspondence $I : V \to \mathbb{K}^d$ which associates each vector $v$ in $V$ with its components (in $\mathbb{K}^d$) with respect to the chosen basis is an isomorphism. In this sense, $\mathbb{K}^d$ is the prototype of vector spaces on $\mathbb{K}$ of dimension $d < +\infty$. For (separable) infinite-dimensional Hilbert spaces, the prototype is $\ell^2(\mathbb{N}, \mathbb{K})$ and the generalized Fourier coefficients replace the vector components.

The concept of isomorphism between Hilbert spaces must be defined before we can establish a rigorous statement regarding this fact. The presence of the inner product


implies that the canonical definition of isomorphism between vector spaces must be adapted to this situation.

DEFINITION 5.11 (Isomorphism between Hilbert spaces).– Let $H$ and $H'$ be two Hilbert spaces on the same field $\mathbb{K}$. The transformation $U : H \to H'$ is an isomorphism of Hilbert spaces if:
1) $U$ is linear;
2) $U$ is bijective;
3) $U$ preserves the inner product, that is:
$$\langle U(x), U(y)\rangle_{H'} = \langle x, y\rangle_{H} \quad \forall x, y \in H$$

Condition 3 implies (in the specific case where $x = y$) that $U$ preserves the norms, that is:
$$\|U(x)\|_{H'} = \|x\|_{H} \quad \forall x \in H$$

This also implies:
$$\|U(x) - U(y)\|_{H'} = \|U(x - y)\|_{H'} = \|x - y\|_{H} \quad \forall x, y \in H$$

that is, $U$ preserves the distances. In this case, we say that $U$ is isometric. The property of conservation of the norm implies $\|U(x)\|_{H'} = 0 \iff \|x\|_{H} = 0$; furthermore, by the positive definiteness of the norm, it holds that $U(x) = 0_{H'} \iff x = 0_{H}$, that is $\ker(U) = \{0_H\}$ and thus $U$ is injective.

An isomorphism $U$ between Hilbert spaces can thus be redefined as a surjective linear transformation which preserves the inner product. Actually, the linearity requirement is redundant, as we see from the following result.

THEOREM 5.14.– Let $V, V'$ be two inner product spaces, of finite or infinite dimension, on the same field $\mathbb{K}$. If the transformation $U : V \to V'$ is surjective and preserves the inner product, then it is linear.

PROOF.– $\forall x, y, z \in V$ and $\forall \alpha, \beta \in \mathbb{K}$:
$$\begin{aligned}
0 = \langle 0, z\rangle &= \langle \alpha x + \beta y - \alpha x - \beta y, z\rangle \\
&= \langle \alpha x + \beta y, z\rangle - \alpha\langle x, z\rangle - \beta\langle y, z\rangle \\
&= \langle U(\alpha x + \beta y), U(z)\rangle - \alpha\langle U(x), U(z)\rangle - \beta\langle U(y), U(z)\rangle && (U \text{ preserves } \langle\ ,\ \rangle) \\
&= \langle U(\alpha x + \beta y) - \alpha U(x) - \beta U(y), U(z)\rangle && (\text{linearity of } \langle\ ,\ \rangle)
\end{aligned}$$

Since, by hypothesis, $U$ is surjective, as $z \in V$ varies, $U(z)$ represents any element of $V'$; thus $U(\alpha x + \beta y) - \alpha U(x) - \beta U(y)$ is orthogonal to all of the elements of $V'$, that is, $U(\alpha x + \beta y) - \alpha U(x) - \beta U(y) = 0_{V'}$, hence:
$$U(\alpha x + \beta y) = \alpha U(x) + \beta U(y) \quad \forall x, y \in V,\ \forall \alpha, \beta \in \mathbb{K}$$
and so $U$ is linear. $\square$

The definition of isomorphism between Hilbert spaces can thus be reformulated as follows.

DEFINITION 5.12 (Alternative definition of isomorphism between Hilbert spaces).– Let $H$ and $H'$ be two Hilbert spaces on the same field $\mathbb{K}$. The transformation $U : H \to H'$ is an isomorphism of Hilbert spaces if:
1) $U$ is surjective;
2) $U$ preserves the inner product.

Being isomorphic is an equivalence relation on the set of Hilbert spaces on the same field $\mathbb{K}$. The following result says that the orthogonal dimension plays, for a separable infinite-dimensional Hilbert space, the same role played by the dimension for a finite-dimensional vector space.

THEOREM 5.15.– Let $H, H'$ be Hilbert spaces on the same field $\mathbb{K}$. $H$ is isomorphic to $H'$ if and only if the orthogonal dimension of $H$ is the same as that of $H'$.

5.5.5. $\ell^2(\mathbb{N}, \mathbb{K})$ as the prototype of separable Hilbert spaces of infinite dimension

LEMMA 5.2.– Let $(u_n)_{n\in\mathbb{N}}$ be a Hilbert basis of $H$; then, for any sequence $(k_n)_{n\in\mathbb{N}}$ of $\ell^2(\mathbb{N}, \mathbb{K})$, there exists $x \in H$ such that $(k_n)_{n\in\mathbb{N}} = (\langle x, u_n\rangle)_{n\in\mathbb{N}}$.

PROOF.– If $(k_n)_{n\in\mathbb{N}} \in \ell^2(\mathbb{N}, \mathbb{K})$, then, thanks to property 1 of Fischer-Riesz's theorem, $\sum_{n\in\mathbb{N}} k_n u_n$ converges to a certain $x \in H$. Then, property 2 of the same theorem guarantees that $(k_n)_{n\in\mathbb{N}} = (\langle x, u_n\rangle)_{n\in\mathbb{N}}$. $\square$

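THEOREM 5.16.– If the Hilbert space $H$ has countable orthogonal dimension $\aleph_0$, then $H$ is isomorphic to $\ell^2(\mathbb{N}, \mathbb{K})$.

PROOF.– Let $(u_n)_{n\in\mathbb{N}}$ be a countable Hilbert basis in $H$ and consider the application:
$$U : H \longrightarrow \ell^2(\mathbb{N}, \mathbb{K}), \qquad x \longmapsto U(x) = (\langle x, u_n\rangle)_{n\in\mathbb{N}}$$
$U$ is surjective by Lemma 5.2 and it preserves the inner product by Parseval's identity:
$$\forall x, y \in H: \quad \langle x, y\rangle_H = \sum_{n\in\mathbb{N}} \langle x, u_n\rangle \langle u_n, y\rangle = \sum_{n\in\mathbb{N}} \langle x, u_n\rangle \overline{\langle y, u_n\rangle} \equiv \langle U(x), U(y)\rangle_{\ell^2(\mathbb{N},\mathbb{K})}$$
Hence, $U$ is an isomorphism of Hilbert spaces. $\square$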
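As a finite-dimensional illustration of the proof above, the following Python sketch builds the coefficient map $U(x) = (\langle x, u_n\rangle)_n$ for an orthonormal basis of $\mathbb{C}^2$ and checks numerically that it preserves the inner product. The basis, the test vectors and the helper names `inner` and `U` are assumptions made for this example only; they do not come from the text.

```python
import math

def inner(x, y):
    """Inner product on C^n: linear in the first argument, conjugate-linear in the second."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

# An orthonormal basis of C^2 (an arbitrary choice made for this example).
s = 1 / math.sqrt(2)
basis = [(s, s), (s, -s)]

def U(x):
    """Coefficient map x -> (<x, u_1>, <x, u_2>)."""
    return tuple(inner(x, u) for u in basis)

x = (1 + 2j, 3 - 1j)
y = (0.5j, -2 + 0j)

lhs = inner(U(x), U(y))   # inner product of the coefficient sequences
rhs = inner(x, y)         # inner product of the original vectors
print(abs(lhs - rhs))     # expected to be at rounding-error level
```

In infinite dimensions the same computation is replaced by Parseval's identity, and surjectivity is no longer automatic: it is supplied by Lemma 5.2.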

5.6. The Fourier Hilbert basis in $L^2$

The best-known example of a Hilbert basis, which is also the most important in terms of practical applications, is the Fourier basis. This basis is defined below in the context of the Hilbert space $L^2$.

5.6.1. $L^2[-\pi, \pi]$ or $L^2[0, 2\pi]$

Let us begin with $H = L^2[-\pi, \pi]$ or $L^2[0, 2\pi]$ and $\mathbb{K} = \mathbb{C}$, then:
$$u_n(x) = \frac{1}{\sqrt{2\pi}} e^{inx}, \quad n \in \mathbb{Z}$$
is a complete orthonormal system, called the Fourier basis of $L^2[-\pi, \pi]$ or $L^2[0, 2\pi]$. Note that this orthonormal system completes the orthonormal system $\frac{1}{\sqrt{\pi}}\sin(nx)$, $n \in \mathbb{N}$, which we used in section 5.5.2 as a counterexample to show that the convergence (in Hilbert norm) of the generalized Fourier series to the element defining the generalized Fourier coefficients is not guaranteed if we consider a non-complete orthonormal system.

Orthonormality is easy to prove. Considering $L^2[-\pi, \pi]$ (the proof for $L^2[0, 2\pi]$ is the same):
$$\langle u_n, u_m\rangle = \int_{-\pi}^{\pi} u_n(x)\overline{u_m(x)}\,dx = \frac{1}{2\pi}\int_{-\pi}^{\pi} e^{inx} e^{-imx}\,dx = \frac{1}{2\pi}\int_{-\pi}^{\pi} e^{i(n-m)x}\,dx$$
– if $n = m$, then $e^{i(n-m)x} = e^0 = 1$ and thus $\langle u_n, u_n\rangle = \|u_n\|^2 = 1$;
– if $n \neq m$, then, writing $y = i(n - m)$, the inner product can be written as:
$$\langle u_n, u_m\rangle = \frac{1}{2\pi}\int_{-\pi}^{\pi} e^{yx}\,dx = \frac{1}{2\pi y}\left[e^{yx}\right]_{x=-\pi}^{x=\pi} = \frac{1}{2\pi i(n-m)}\left[e^{i(n-m)\pi} - e^{i(m-n)\pi}\right] = 0$$
where the last equality holds because $e^{ik\pi} = (-1)^k = e^{-ik\pi}$ for every integer $k$.


In short, $\langle u_n, u_m\rangle = \delta_{n,m}$, proving orthonormality. The proof that the system is complete, instead, is much more complicated.

The Fourier expansion here is written as follows:
$$\forall f \in L^2[-\pi, \pi]: \quad f = \sum_{n\in\mathbb{Z}} \langle f, u_n\rangle u_n$$
where:
$$\langle f, u_n\rangle \equiv \hat{f}(n) = \frac{1}{\sqrt{2\pi}}\int_{-\pi}^{\pi} f(x) e^{-inx}\,dx$$
is the $n$-th Fourier coefficient of $f$. Note that the convergence of the series should be interpreted as:
$$\int_{-\pi}^{\pi}\left| f(x) - \sum_{n=-N}^{N} \hat{f}(n) u_n(x)\right|^2 dx \xrightarrow[N\to+\infty]{} 0$$

DEFINITION 5.13.– Take $H = L^2[-\pi, \pi]$ or $L^2[0, 2\pi]$. The application:
$$\mathcal{F} \equiv \hat{\ } : H \longrightarrow \ell^2(\mathbb{Z}, \mathbb{C}), \qquad f \longmapsto (\hat{f}(n))_{n\in\mathbb{Z}}$$
is known as the Fourier transform of $H = L^2[-\pi, \pi]$ or $L^2[0, 2\pi]$.

We see that $\mathcal{F}$ coincides with the transformation which implements the isomorphism between $L^2[-\pi, \pi]$ or $L^2[0, 2\pi]$ and its prototype $\ell^2(\mathbb{Z}, \mathbb{C})$!

The Fourier Hilbert basis of $L^2([-\pi, \pi])$ and $L^2([0, 2\pi])$ can be written in terms of real functions:
$$\begin{cases} u_0 \equiv \frac{1}{\sqrt{2\pi}} \\ \cos_n(x) \equiv \frac{1}{\sqrt{\pi}}\cos(nx), & n \in \mathbb{N} \\ \sin_n(x) \equiv \frac{1}{\sqrt{\pi}}\sin(nx), & n \in \mathbb{N} \end{cases}$$
It is important to note that the complex exponential of parameter $n \in \mathbb{Z}$ is replaced by two real sequences of parameter $n \in \mathbb{N}$; this is a consequence of Euler's formula, $e^{i\vartheta} = \cos\vartheta + i\sin\vartheta$, for all $\vartheta \in \mathbb{R}$. The advantage of this basis is that it does not contain any imaginary parts; furthermore, the Fourier expansion in this case can be performed:
– for even functions, using $u_0$ and $\cos_n$;
– for odd functions, using $\sin_n$.


The reason for this result is easily explained: taking an even $f$, then $\langle f, \sin_n\rangle = \frac{1}{\sqrt{\pi}}\int_{-\pi}^{\pi} f(x)\sin(nx)\,dx = 0$ $\forall n$, since $f(x)\sin(nx)$ is odd and $[-\pi, \pi]$ is a symmetric domain. Similar arguments can be applied to odd functions to obtain the desired result.

5.6.2. $L^2(\mathbb{T})$

Our decision to consider the interval $[-\pi, \pi]$ or $[0, 2\pi]$ reflects the fact that the orthonormality of the system $\left(\frac{1}{\sqrt{2\pi}} e^{inx}\right)_{n\in\mathbb{Z}}$ is very easy to prove. Actually, all of the properties stated for this system remain valid if $[-\pi, \pi]$ or $[0, 2\pi]$ is replaced by any other interval of size $2\pi$. Furthermore, these properties continue to hold if we consider functions defined on the whole real line, that is, $f : \mathbb{R} \to \mathbb{C}$, on the condition that they are $2\pi$-periodic. This can be formalized using a highly useful Hilbert space:
$$L^2(\mathbb{T}) = \left\{ f : \mathbb{R} \to \mathbb{C},\ f \text{ measurable},\ f(x + 2\pi) = f(x),\ \int_0^{2\pi} |f(x)|^2\,dx < +\infty \right\} / \sim$$
where $f \sim g$ if $f = g$ a.e., as usual. By periodicity, integration can be carried out on any interval of size $2\pi$.

The symbol $\mathbb{T}$ represents the 1D torus, which may be identified with the unit circle. Any function $f : \mathbb{R} \to \mathbb{C}$ which is $2\pi$-periodic may be identified with a function $\tilde{f} : \mathbb{T} \to \mathbb{C}$ by means of the projection
$$p : \mathbb{R} \longrightarrow \mathbb{T}, \qquad x \longmapsto (\cos x, \sin x)$$
through the relation $\tilde{f}(p(x)) = f(x)$, that is, $f = \tilde{f} \circ p$ (the diagram formed by $f$, $p$ and $\tilde{f}$ commutes).

$L^2(\mathbb{T})$ is isomorphic to $L^2[0, 2\pi]$ or $L^2[-\pi, \pi]$ via the application which restricts $f : \mathbb{R} \to \mathbb{C}$, $f \in L^2(\mathbb{T})$, to the interval $[0, 2\pi]$ or $[-\pi, \pi]$ (or any interval of size $2\pi$):
$$I : L^2(\mathbb{T}) \longrightarrow L^2[0, 2\pi], \qquad f \longmapsto If = f|_{[0,2\pi]} \text{ or } f|_{[-\pi,\pi]}$$


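Using $I$, the complete orthonormal Fourier system can be transferred from $L^2[0, 2\pi]$ or $L^2[-\pi, \pi]$ onto $L^2(\mathbb{T})$:
$$\left(\frac{1}{\sqrt{2\pi}} e^{inx}\right)_{n\in\mathbb{Z}}: \text{ Hilbert basis for } L^2(\mathbb{T})$$
and the definition of the Fourier transform can be extended to $L^2(\mathbb{T})$.

DEFINITION 5.14.– The transformation:
$$\mathcal{F} \equiv \hat{\ } : L^2(\mathbb{T}) \longrightarrow \ell^2(\mathbb{Z}, \mathbb{C}), \qquad f \longmapsto \mathcal{F}f = \hat{f}$$
$$\mathcal{F}f(n) = \hat{f}(n) = \langle f, u_n\rangle = \frac{1}{\sqrt{2\pi}}\int_0^{2\pi} f(x) e^{-inx}\,dx, \quad n \in \mathbb{Z}$$
is known as the Fourier transform on $L^2(\mathbb{T})$.

We know that this transformation is an isomorphism between Hilbert spaces, and that $\|\hat{f}\|^2_{\ell^2(\mathbb{Z},\mathbb{C})} = \sum_{n\in\mathbb{Z}} |\langle f, u_n\rangle|^2 = \|f\|^2_{L^2(\mathbb{T})}$.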
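The norm identity above can be checked numerically on a simple example. The sketch below approximates $\hat{f}(n)$ with a midpoint rule for the trigonometric polynomial $f(x) = 2\cos(3x) + \sin(x)$, for which both $\sum_n |\hat{f}(n)|^2$ and $\|f\|^2_{L^2(\mathbb{T})}$ equal $5\pi$. The test function, the sample count and the name `fourier_coefficient` are assumptions chosen for illustration.

```python
import cmath
import math

SAMPLES = 4096

def fourier_coefficient(f, n, samples=SAMPLES):
    """Midpoint-rule approximation of (1/sqrt(2*pi)) * int_0^{2*pi} f(x) e^{-inx} dx."""
    h = 2 * math.pi / samples
    total = 0j
    for j in range(samples):
        x = (j + 0.5) * h
        total += f(x) * cmath.exp(-1j * n * x)
    return total * h / math.sqrt(2 * math.pi)

def f(x):
    # Illustrative 2*pi-periodic function: only the harmonics n = +-1, +-3 are present.
    return 2 * math.cos(3 * x) + math.sin(x)

# Sum of squared moduli of the coefficients (all nonzero ones lie in -5..5).
sum_sq = sum(abs(fourier_coefficient(f, n)) ** 2 for n in range(-5, 6))

# Squared L2 norm of f over one period, with the same midpoint rule.
h = 2 * math.pi / SAMPLES
norm_sq = h * sum(abs(f((j + 0.5) * h)) ** 2 for j in range(SAMPLES))

print(sum_sq, norm_sq, 5 * math.pi)
```

For a function with infinitely many nonzero coefficients, the truncated sum would only approach the squared norm from below, in agreement with Bessel's inequality.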

5.6.3. $L^2[a, b]$

To handle elements $f \in L^2[a, b]$, $a, b \in \mathbb{R}$, $a < b$, which are $(b - a)$-periodic, we must slightly modify the Fourier basis. The trick consists of multiplying the variable of the basis functions by an appropriate quantity, the pulse (angular frequency), which turns them into $(b - a)$-periodic functions. Formally, we define:
– $T = b - a$: the period;
– $\nu = \frac{1}{T}$: the frequency;
– $\omega = 2\pi\nu = \frac{2\pi}{T}$: the pulse.

We see that:
$$\begin{aligned}
e^{i\omega n(x+T)} &= \cos[\omega n(x + T)] + i\sin[\omega n(x + T)] \\
&= \cos[\omega n x + \omega n T] + i\sin[\omega n x + \omega n T] \\
&= \cos\left[\omega n x + \frac{2\pi}{T} nT\right] + i\sin\left[\omega n x + \frac{2\pi}{T} nT\right] \\
&= \cos[\omega n x + 2\pi n] + i\sin[\omega n x + 2\pi n] \\
&= \cos(\omega n x) + i\sin(\omega n x) = e^{i\omega n x}
\end{aligned}$$


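thus $x \mapsto e^{i\omega n x}$ is a $T$-periodic function. Using these considerations, we can show that a complete orthonormal system for $L^2[a, b]$ can be obtained using the following set of functions:
$$u_n : [a, b] \longrightarrow \mathbb{C}, \qquad x \longmapsto u_n(x) = \frac{1}{\sqrt{b-a}}\, e^{2\pi n i \frac{x-a}{b-a}}, \quad n \in \mathbb{Z}$$
in the complex case, and:
$$\begin{cases} u_0 = \frac{1}{\sqrt{b-a}} \\ \cos_n(x) \equiv \sqrt{\frac{2}{b-a}}\cos\left(2\pi n \frac{x-a}{b-a}\right), & n \in \mathbb{N} \\ \sin_n(x) \equiv \sqrt{\frac{2}{b-a}}\sin\left(2\pi n \frac{x-a}{b-a}\right), & n \in \mathbb{N} \end{cases}$$
in the real case. In the specific case of the Hilbert space $L^2[-\ell, \ell]$, $\ell \in \mathbb{R}$, $\ell > 0$, the Fourier basis can be written as:
$$u_n(x) = \frac{1}{\sqrt{2\ell}}\, e^{\pi i n \frac{x}{\ell}}, \quad n \in \mathbb{Z}$$
in the complex case, and:
$$\begin{cases} u_0 = \frac{1}{\sqrt{2\ell}} \\ \cos_n(x) \equiv \sqrt{\frac{1}{\ell}}\cos\left(\pi n \frac{x}{\ell}\right), & n \in \mathbb{N} \\ \sin_n(x) \equiv \sqrt{\frac{1}{\ell}}\sin\left(\pi n \frac{x}{\ell}\right), & n \in \mathbb{N} \end{cases}$$
in the real case.

5.6.4. Real Fourier series

Using the real Hilbert basis of $L^2(\mathbb{T})$, that is:
$$\begin{cases} u_0 \equiv \frac{1}{\sqrt{2\pi}} \\ \cos_n(x) \equiv \frac{1}{\sqrt{\pi}}\cos(nx), & n \in \mathbb{N} \\ \sin_n(x) \equiv \frac{1}{\sqrt{\pi}}\sin(nx), & n \in \mathbb{N} \end{cases}$$
the real Fourier series expansion for any element $f \in L^2(\mathbb{T})$ is:
$$f(t) \underset{L^2(\mathbb{T})}{=} \frac{a_0}{2} + \sum_{n=1}^{+\infty} a_n\cos(nt) + \sum_{n=1}^{+\infty} b_n\sin(nt)$$
with:
$$a_0 = \frac{1}{\pi}\int_{\mathbb{T}} f(t)\,dt \implies \frac{a_0}{2} = \frac{1}{2\pi}\int_{\mathbb{T}} f(t)\,dt = \langle f\rangle_{\mathbb{T}} \quad \text{(average of } f)$$
$$a_n = \frac{1}{\pi}\int_{\mathbb{T}} f(t)\cos(nt)\,dt \quad \forall n = 1, 2, \ldots$$
$$b_n = \frac{1}{\pi}\int_{\mathbb{T}} f(t)\sin(nt)\,dt \quad \forall n = 1, 2, \ldots$$

The coefficients $a_0, a_n, b_n$, $n = 1, 2, \ldots$ are known as the real Fourier coefficients of $f$. Evidently, the equality must be interpreted in the sense of $L^2(\mathbb{T})$, that is:
$$\int_{\mathbb{T}}\left[ f(t) - \left(\frac{a_0}{2} + \sum_{n=1}^{N} a_n\cos(nt) + \sum_{n=1}^{N} b_n\sin(nt)\right)\right]^2 dt \xrightarrow[N\to+\infty]{} 0$$

The expression:
$$S_N(t) = \frac{a_0}{2} + \sum_{n=1}^{N} a_n\cos(nt) + \sum_{n=1}^{N} b_n\sin(nt)$$
is known as a trigonometric polynomial of order $N$. $S_N$ is a $2\pi$-periodic function, like the elements of $L^2(\mathbb{T})$.

To understand the presence of the constant $\frac{1}{\pi}$ in the real Fourier coefficients, consider the expansion of $f$ with respect to the cosine system:
$$\sum_{n=1}^{+\infty}\left\langle f, \frac{1}{\sqrt{\pi}}\cos(ns)\right\rangle \frac{1}{\sqrt{\pi}}\cos(nt) = \sum_{n=1}^{+\infty}\left(\frac{1}{\pi}\int_{\mathbb{T}} f(s)\cos(ns)\,ds\right)\cos(nt)$$
The same holds true for the sine system and for the constant. Incorporating the constant $\frac{1}{\pi}$ into the definition of the Fourier coefficients makes it possible to identify $\frac{a_0}{2}$ with the average value of $f$, so that the real Fourier series can be interpreted as the superposition of the average value of $f$ and combinations of harmonic waves of increasing frequency. Notably:
– $t \mapsto a_1\cos(t) + b_1\sin(t)$ is known as the fundamental harmonic;
– $t \mapsto a_n\cos(nt) + b_n\sin(nt)$ is the harmonic of order $n$.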
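The formulas for $a_0$, $a_n$, $b_n$ and $S_N$ translate directly into a short numerical sketch. The code below approximates the integrals over one period with a midpoint rule and rebuilds the trigonometric polynomial. The function names, the sample count and the test function $f(t) = 1 + 2\cos(t) + 0.5\sin(3t)$ are assumptions made for illustration.

```python
import math

def integrate(g, a, b, samples=4000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / samples
    return h * sum(g(a + (j + 0.5) * h) for j in range(samples))

def real_fourier_coefficients(f, N):
    """Return lists a[0..N] and b[0..N] (with b[0] = 0) of real Fourier coefficients."""
    a = [integrate(lambda t: f(t) * math.cos(n * t), -math.pi, math.pi) / math.pi
         for n in range(N + 1)]
    b = [0.0] + [integrate(lambda t: f(t) * math.sin(n * t), -math.pi, math.pi) / math.pi
                 for n in range(1, N + 1)]
    return a, b

def partial_sum(a, b, t):
    """Trigonometric polynomial S_N(t) built from the coefficient lists."""
    return a[0] / 2 + sum(a[n] * math.cos(n * t) + b[n] * math.sin(n * t)
                          for n in range(1, len(a)))

def f(t):
    # Illustrative signal: average 1, fundamental harmonic 2cos(t), third harmonic 0.5sin(3t).
    return 1.0 + 2.0 * math.cos(t) + 0.5 * math.sin(3 * t)

a, b = real_fourier_coefficients(f, 3)
print(a[0] / 2, a[1], b[3])             # average, a_1 and b_3 of the test signal
print(partial_sum(a, b, 0.7) - f(0.7))  # S_3 reproduces f up to rounding
```

Since this particular $f$ is itself a trigonometric polynomial of order 3, $S_3$ coincides with $f$; for a general element of $L^2(\mathbb{T})$ the agreement holds only in the limit $N \to +\infty$ and in the $L^2$ sense.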


A tuning fork is able to produce a "pure" sound, that is one which consists exclusively of the fundamental harmonic; the vast majority of musical instruments, on the other hand, produce sounds which can be described by a Fourier series, that is a superposition of harmonics at frequencies which are multiples of the fundamental.

Using the orthogonal projection theorem and Plancherel's identity, we can say that the quadratic error (that is, the squared $L^2$ norm of the difference) between $f$ and the trigonometric polynomial of order $N$ is:
$$E_N = \int_{\mathbb{T}} [f(t) - S_N(t)]^2\,dt = \int_{\mathbb{T}} f(t)^2\,dt - \pi\left[\frac{a_0^2}{2} + \sum_{n=1}^{N}\left(a_n^2 + b_n^2\right)\right]$$
and since $E_N \xrightarrow[N\to+\infty]{} 0$, it holds that:
$$\int_{\mathbb{T}} f(t)^2\,dt = \pi\left[\frac{a_0^2}{2} + \sum_{n=1}^{+\infty}\left(a_n^2 + b_n^2\right)\right]$$
This is an identity between an integral and a numerical series, and is particularly useful for determining one of these two objects by calculating the other.

Taking $L^2[a, b]$ and writing $T = b - a$ and $\omega = \frac{2\pi}{T}$, we know that the real Fourier Hilbert basis is:
$$\left\{ \frac{1}{\sqrt{T}},\ \sqrt{\frac{2}{T}}\cos(\omega n t),\ \sqrt{\frac{2}{T}}\sin(\omega n t),\ n = 1, 2, 3, \ldots \right\}$$
With respect to this Hilbert basis, the Fourier series expansion of $f \in L^2([a, b])$, $f$ ($T$-periodic), is:
$$f(t) \underset{L^2[a,b]}{=} \frac{a_0}{2} + \sum_{n=1}^{+\infty} a_n\cos(\omega n t) + \sum_{n=1}^{+\infty} b_n\sin(\omega n t)$$
with:
$$\frac{a_0}{2} = \frac{1}{T}\int_a^b f(t)\,dt = \langle f\rangle_{[a,b]} \quad \text{(average of } f)$$
$$a_n = \frac{2}{T}\int_a^b f(t)\cos(\omega n t)\,dt, \qquad b_n = \frac{2}{T}\int_a^b f(t)\sin(\omega n t)\,dt \quad \forall n = 1, 2, \ldots$$
In this case, the Fourier polynomials are $T$-periodic functions.
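The coefficient formulas on a general interval $[a, b]$ can be sketched in the same way, the only change being the pulse $\omega = 2\pi/T$ inside the trigonometric functions. In the code below, the interval $[1, 4]$, the test function and the name `fourier_coefficients_on_interval` are hypothetical choices made for this example.

```python
import math

def fourier_coefficients_on_interval(f, a, b, N, samples=4000):
    """Midpoint-rule approximation of a_0..a_N and b_1..b_N on [a, b] (b_n[0] is set to 0)."""
    T = b - a
    omega = 2 * math.pi / T          # the pulse
    h = T / samples
    nodes = [a + (j + 0.5) * h for j in range(samples)]
    values = [f(t) for t in nodes]
    a_n = [2 / T * h * sum(v * math.cos(omega * n * t) for v, t in zip(values, nodes))
           for n in range(N + 1)]
    b_n = [0.0] + [2 / T * h * sum(v * math.sin(omega * n * t) for v, t in zip(values, nodes))
                   for n in range(1, N + 1)]
    return a_n, b_n

A, B = 1.0, 4.0                      # illustrative interval, so T = 3
OMEGA = 2 * math.pi / (B - A)

def f(t):
    # Illustrative T-periodic signal with average 3.
    return 3.0 + 2.0 * math.cos(OMEGA * t) - math.sin(2 * OMEGA * t)

a_n, b_n = fourier_coefficients_on_interval(f, A, B, 2)
print(a_n[0] / 2, a_n[1], b_n[2])    # average of f, a_1 and b_2
```

With $a = -\pi$ and $b = \pi$ the pulse equals 1 and the formulas reduce to those of the $2\pi$-periodic case.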


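Exercise 5.2

The family $(e_k : [-\pi, \pi] \to \mathbb{C})_{k\in\mathbb{Z}}$ of non-normalized exponentials $\left(e_k(t) := e^{ikt}\right)_{k\in\mathbb{Z}}$ is a Hilbert basis of $L^2[-\pi, \pi]$ if this space is equipped with an inner product defined by $\langle f, g\rangle_0 = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x)\overline{g(x)}\,dx$.

1) Write the Fourier series associated with the function $\varphi : [-\pi, \pi] \to \mathbb{C}$, $t \mapsto \cos(3t) - \sin(5t)$.

2) Take $\mathbb{N}^* = \mathbb{N}\setminus\{0\}$, and let $(\psi_k : \mathbb{R} \to \mathbb{R})_{k\in\mathbb{N}^*}$ be the family defined by $\psi_k(t) = \sin(kt)$.

a) Consider $f \in L^2[0, \pi]$ such that $\int_0^{\pi} f(t)\psi_k(t)\,dt = 0$ $\forall k \in \mathbb{N}^*$ and also
$$g(t) = \begin{cases} f(t) & \text{if } 0 \le t < \pi \\ -f(-t) & \text{if } -\pi < t < 0 \end{cases}$$
Prove that $\int_{-\pi}^{\pi} g(t) e^{-ikt}\,dt = 0$ $\forall k \in \mathbb{N}^*$.

b) Prove that $(\psi_k)_{k\in\mathbb{N}^*}$ is a complete system in $L^2[0, \pi]$ equipped with the inner product $\langle f, g\rangle = \int_0^{\pi} f(x)g(x)\,dx$, that is, a (not necessarily orthonormal) family of elements in $L^2[0, \pi]$ such that:
$$\overline{\mathrm{span}((\psi_k)_{k\in\mathbb{N}^*})} = L^2[0, \pi] \iff \left(\mathrm{span}((\psi_k)_{k\in\mathbb{N}^*})\right)^{\perp} = \{0_{L^2[0,\pi]}\} \iff \left[\langle f, \psi_k\rangle = 0\ \forall k \in \mathbb{N}^* \implies f \equiv 0_{L^2[0,\pi]}\right]$$

3) Construct a Hilbert basis of $L^2[0, \pi]$ from the family $(\psi_k)_{k\in\mathbb{N}^*}$.

4) Use the result obtained above to determine a sequence of real coefficients $(a_k)_{k\in\mathbb{N}^*}$ such that $\sum_{k=1}^{+\infty} a_k\psi_k = 1$ (equality in the sense of $L^2[0, \pi]$).

5) Using Plancherel's identity, prove that the following formula is valid:
$$\sum_{k=0}^{+\infty} \frac{1}{(2k+1)^2} = \frac{\pi^2}{8}$$

Solution to Exercise 5.2

1) We can rewrite $\varphi$ as:
$$\varphi(t) = \frac{1}{2}\left(e^{3it} + e^{-3it}\right) - \frac{1}{2i}\left(e^{5it} - e^{-5it}\right)$$
that is, $\varphi = \frac{1}{2}(e_3 + e_{-3}) - \frac{1}{2i}(e_5 - e_{-5})$, with the equality in the sense of $L^2[-\pi, \pi]$; by the uniqueness of the decomposition, this is the Fourier series of the function $\varphi$.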


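2) We shall consider these two points separately.

a) By direct calculation:
$$\int_{-\pi}^{\pi} g(t) e^{-ikt}\,dt = \int_{-\pi}^{0} -f(-t) e^{-ikt}\,dt + \int_0^{\pi} f(t) e^{-ikt}\,dt$$
If we change the variable in the first integral as follows, $s = -t$, $ds = -dt$, we obtain:
$$\int_{-\pi}^{0} -f(-t) e^{-ikt}\,dt = \int_{\pi}^{0} f(s) e^{iks}\,ds = \int_0^{\pi} -f(s) e^{iks}\,ds = \int_0^{\pi} -f(t) e^{ikt}\,dt$$
and thus:
$$\int_{-\pi}^{\pi} g(t) e^{-ikt}\,dt = \int_0^{\pi} -f(t) e^{ikt}\,dt + \int_0^{\pi} f(t) e^{-ikt}\,dt = \int_0^{\pi} f(t)\left(e^{-ikt} - e^{ikt}\right)dt$$
By using Euler's formula for the sine we have:
$$\int_{-\pi}^{\pi} g(t) e^{-ikt}\,dt = -2i\int_0^{\pi} f(t)\sin(kt)\,dt = -2i\int_0^{\pi} f(t)\psi_k(t)\,dt = 0$$
by the hypothesis on $f$.

b) The function $f$ defined in 2(a) is, by hypothesis, orthogonal to all the elements $(\psi_k)_{k\in\mathbb{N}^*}$, so, to verify that $(\psi_k)_{k\in\mathbb{N}^*}$ is a complete system for $L^2[0, \pi]$, we simply have to prove that $f = 0_{L^2[0,\pi]}$. To do that, we use the fact that, by definition, $g|_{[0,\pi]} = f$; thus, if we show that $g = 0_{L^2[-\pi,\pi]}$, then, necessarily, $f = 0_{L^2[0,\pi]}$.

Thanks to what was shown previously, $\langle g, e_k\rangle_0 = \frac{1}{2\pi}\int_{-\pi}^{\pi} g(t) e^{-ikt}\,dt = 0$ $\forall k \in \mathbb{N}^*$. If this holds also for $k = 0$ and for $-k$, with $k \in \mathbb{N}^*$, then $g$ is orthogonal to all the elements of the Hilbert basis $(e_k)_{k\in\mathbb{Z}}$ of $L^2[-\pi, \pi]$, which implies $g = 0_{L^2[-\pi,\pi]}$, thanks to theorem 5.11.

To summarize, the only properties that we have to verify are $\langle g, e_0\rangle_0 = 0$ and $\langle g, e_{-k}\rangle_0 = 0$ for all $k \in \mathbb{N}^*$. By construction, $g$ is odd, that is, $g(-t) = -g(t)$ for almost every $t \in [-\pi, \pi]$. Splitting the integral and using the change of variable $s = -t$ on $[-\pi, 0]$:
$$\langle g, e_0\rangle_0 = \frac{1}{2\pi}\int_{-\pi}^{0} -f(-t)\,dt + \frac{1}{2\pi}\int_0^{\pi} f(t)\,dt = -\frac{1}{2\pi}\int_0^{\pi} f(s)\,ds + \frac{1}{2\pi}\int_0^{\pi} f(t)\,dt = 0$$
With the same change of variable on the whole interval $[-\pi, \pi]$:
$$\langle g, e_{-k}\rangle_0 = \frac{1}{2\pi}\int_{-\pi}^{\pi} g(t) e^{ikt}\,dt = \frac{1}{2\pi}\int_{-\pi}^{\pi} g(-s) e^{-iks}\,ds = -\frac{1}{2\pi}\int_{-\pi}^{\pi} g(s) e^{-iks}\,ds \equiv -\langle g, e_k\rangle_0 = 0 \quad \forall k \in \mathbb{N}^*$$

3) The fact that $(\psi_k)_{k\in\mathbb{N}^*}$ is a complete system in $L^2[0, \pi]$ means that we can obtain a Hilbert basis for the same space simply by examining the orthonormality properties of this system. For all $n, m \in \mathbb{N}^*$:
$$\begin{aligned}
\langle \psi_n, \psi_m\rangle &= \int_0^{\pi} \sin(nt)\sin(mt)\,dt \qquad (t \mapsto \sin(nt)\sin(mt) \text{ is even}) \\
&= \frac{1}{2}\int_{-\pi}^{\pi} \sin(nt)\sin(mt)\,dt \\
&= \frac{1}{2}\int_{-\pi}^{\pi} \frac{e^{int} - e^{-int}}{2i}\,\frac{e^{imt} - e^{-imt}}{2i}\,dt \\
&= -\frac{1}{8}\int_{-\pi}^{\pi} \left(e^{int} - e^{-int}\right)\left(e^{imt} - e^{-imt}\right)dt \\
&= -\frac{1}{8}\int_{-\pi}^{\pi} \left(e^{int} - e^{-int}\right)\overline{\left(e^{-imt} - e^{imt}\right)}\,dt \\
&= -\frac{2\pi}{8}\langle e_n - e_{-n}, e_{-m} - e_m\rangle_0 \\
&= -\frac{\pi}{4}\left(\langle e_n, e_{-m}\rangle_0 - \langle e_n, e_m\rangle_0 - \langle e_{-n}, e_{-m}\rangle_0 + \langle e_{-n}, e_m\rangle_0\right) \\
&= \begin{cases} 0 & \text{if } n \neq m \\ -\frac{\pi}{4}(-1 - 1) = \frac{\pi}{2} & \text{if } n = m \end{cases}
\end{aligned}$$
Thus $\|\psi_n\| = \sqrt{\frac{\pi}{2}}$ $\forall n \in \mathbb{N}^*$ and so $\left(\sqrt{\frac{2}{\pi}}\,\psi_n\right)_{n\in\mathbb{N}^*}$ is a Hilbert basis of $L^2[0, \pi]$.

4) Let us interpret 1 as the constant function $1 \in L^2[0, \pi]$, $1(t) = 1$ $\forall t \in [0, \pi]$, which we shall decompose on the Hilbert basis $\left(\sqrt{\frac{2}{\pi}}\,\psi_n\right)_{n\in\mathbb{N}^*}$ of $L^2[0, \pi]$ determined above:
$$1 = \sum_{k=1}^{+\infty} \left\langle 1, \sqrt{\frac{2}{\pi}}\,\psi_k\right\rangle \sqrt{\frac{2}{\pi}}\,\psi_k = \sum_{k=1}^{+\infty} \frac{2}{\pi}\langle 1, \psi_k\rangle\,\psi_k$$
showing us that $1 = \sum_{k=1}^{+\infty} a_k\psi_k$, with:
$$a_k = \frac{2}{\pi}\langle 1, \psi_k\rangle = \frac{2}{\pi}\int_0^{\pi} \sin(kt)\,dt = \frac{2}{\pi k}\left[-\cos(kt)\right]_0^{\pi} = \frac{2}{\pi k}\left[1 - (-1)^k\right]$$
that is, the sequence we wanted to find is:
$$a_k = \begin{cases} 0 & k \text{ even} \\ \frac{4}{\pi k} & k \text{ odd} \end{cases}$$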


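5) Plancherel's identity for 1 gives us:
$$\|1\|^2 = \sum_{k=1}^{+\infty}\left|\left\langle 1, \sqrt{\frac{2}{\pi}}\,\psi_k\right\rangle\right|^2$$
Moreover, $\|1\|^2 = \int_0^{\pi} 1\,dt = \pi$ and $\left\langle 1, \sqrt{\frac{2}{\pi}}\,\psi_k\right\rangle = \sqrt{\frac{\pi}{2}}\,a_k$, hence:
$$\pi = \sum_{k=0}^{+\infty} \frac{\pi}{2}|a_{2k+1}|^2 = \sum_{k=0}^{+\infty} \frac{\pi}{2}\left(\frac{4}{\pi}\right)^2 \frac{1}{(2k+1)^2} \iff \sum_{k=0}^{+\infty} \frac{1}{(2k+1)^2} = \frac{\pi^2}{8}$$
$\square$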
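The results of points 4 and 5 lend themselves to a quick numerical sanity check. The sketch below sums the series $\sum 1/(2k+1)^2$ up to a finite index and evaluates the squared $L^2[0, \pi]$ distance between the constant function 1 and its truncated sine expansion, which by Plancherel's identity equals $\pi - \frac{\pi}{2}\sum_{k \le K} a_k^2$. The truncation indices and the function names are arbitrary choices made for this example.

```python
import math

def a(k):
    """Coefficient a_k of the sine expansion of the constant function 1 on [0, pi]."""
    return 0.0 if k % 2 == 0 else 4.0 / (math.pi * k)

def squared_l2_error(K):
    """||1 - sum_{k<=K} a_k psi_k||^2 in L2[0, pi], using ||psi_k||^2 = pi/2."""
    return math.pi - (math.pi / 2) * sum(a(k) ** 2 for k in range(1, K + 1))

partial = sum(1.0 / (2 * k + 1) ** 2 for k in range(100000))
target = math.pi ** 2 / 8

print(partial, target)          # the partial sum approaches pi^2/8 from below
print(squared_l2_error(2001))   # small positive number, decreasing with K
```

The slow decrease of the error, of order $1/K$, reflects the fact that the coefficients $a_k$ decay only like $1/k$.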

5.6.5. Pointwise convergence of the real Fourier series: Dirichlet's theorem

Fourier series were initially met with skepticism by the mathematical community. The idea that series with trigonometric (hence infinitely differentiable) functions could be used to approximate non-differentiable or, worse, non-continuous functions was considered absurd by many. Furthermore, Fourier did not provide rigorous convergence results for the series that bears his name. In fact, the theorems that we saw earlier concerning convergence in norm were obtained at a later stage by other mathematicians; furthermore, they are not sufficient to guarantee the pointwise convergence of the series.

The first conditions for pointwise convergence of the Fourier series were identified by Dirichlet (b. 1805, Düren; d. 1859, Göttingen; see note 6) in 1829. Dirichlet's constructive proof is of crucial importance in Fourier analysis; readers who wish to explore the subject further may wish to consult Vretblad (2003). For the purposes of this book, we shall simply provide a rigorous statement of Dirichlet's theorem, introducing the associated notation and terminology.

Note 6.– Remarkably, the "modern" definition of a function, as a univocal correspondence between two sets, was established by Dirichlet as part of his efforts to prove the pointwise convergence of the Fourier series.

If $t_0$ is a point of discontinuity of a real-valued function $f$ of one real variable, then the right and left limits are written as:
$$f(t_0^+) = \lim_{t\to t_0^+} f(t), \qquad f(t_0^-) = \lim_{t\to t_0^-} f(t)$$

DEFINITION 5.15 (Dirichlet function).– Let $f : \mathbb{R} \to \mathbb{R}$. $f$ is a Dirichlet function if it verifies the following conditions:
1) $f$ is $T$-periodic, $T \in \mathbb{R}^+$;
2) $f$ is piecewise continuous, that is, there is only a finite number of points in each period at which $f$ is not continuous;
3) for all $t_0 \in \mathbb{R}$:
$$f(t_0) = \frac{f(t_0^+) + f(t_0^-)}{2} \qquad [5.7]$$
that is, at any point $t_0 \in \mathbb{R}$, the value of $f$ in $t_0$ is the average of the right and left limits of $f$ in $t_0$.

Condition [5.7] is of course satisfied at any point where $f$ is continuous; however, it is a non-trivial requirement at any point of discontinuity.

DEFINITION 5.16 (Generalized derivative).– Let $f$ be a Dirichlet function and take $t_0 \in \mathbb{R}$. $f$ is said to possess a generalized derivative on the right in $t_0$ if the following (finite) limit exists:
$$\lim_{h\to 0^+} \frac{f(t_0 + h) - f(t_0^+)}{h}$$
In the same way, $f$ is said to possess a generalized derivative on the left in $t_0$ if the following (finite) limit exists:
$$\lim_{h\to 0^-} \frac{f(t_0 + h) - f(t_0^-)}{h}$$

These elements are necessary in stating Dirichlet's theorem.

THEOREM 5.17 (Dirichlet's theorem, 1829).– Let $f$ be a Dirichlet function and take $t_0 \in \mathbb{R}$. If the function $f$ possesses generalized derivatives on the right and left at point $t_0$, then the real Fourier series of $f$ evaluated in $t_0$ converges to $f(t_0)$.

The conditions of this theorem are known as the Dirichlet conditions; they are sufficient, but not necessary, for the pointwise convergence of the real Fourier series. Conditions which are both necessary and sufficient for the pointwise convergence of the Fourier series have yet to be identified. Nevertheless, and thankfully, the Dirichlet conditions are verified for the vast majority of functions encountered in practical applications.

Note that, if we ignore the requirement [5.7], then the Fourier series still converges at $t_0$ to the average $\frac{f(t_0^+) + f(t_0^-)}{2}$, which in that case need not coincide with $f(t_0)$.
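The behavior described by Dirichlet's theorem can be observed on a concrete example. The sketch below uses the $2\pi$-periodic function equal to $t$ on $(0, \pi)$ and to $0$ on $(-\pi, 0)$, whose real Fourier coefficients can be computed by hand through integration by parts: $a_0 = \pi/2$, $a_n = \frac{(-1)^n - 1}{\pi n^2}$, $b_n = \frac{(-1)^{n+1}}{n}$. This function and the truncation order are assumptions chosen for illustration. At the jump $t_0 = \pi$ the one-sided limits are $\pi$ and $0$, so the series is expected to approach $\pi/2$.

```python
import math

def partial_sum(t, N):
    """S_N(t) for the 2*pi-periodic function equal to t on (0, pi) and 0 on (-pi, 0)."""
    total = math.pi / 4                              # a_0 / 2
    for n in range(1, N + 1):
        a_n = ((-1) ** n - 1) / (math.pi * n ** 2)
        b_n = (-1) ** (n + 1) / n
        total += a_n * math.cos(n * t) + b_n * math.sin(n * t)
    return total

at_jump = partial_sum(math.pi, 2000)        # compare with the average (pi + 0) / 2
at_regular_point = partial_sum(1.0, 2000)   # f is continuous at t = 1, with f(1) = 1

print(at_jump, math.pi / 2)
print(at_regular_point, 1.0)
```

The convergence is slow in both cases (the error decreases roughly like $1/N$), which is typical of functions with jump discontinuities.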


One final remark concerning the possible consequences of a lack of continuity in $f$: in 1923, the great Russian mathematician Kolmogorov (b. 1903, Tambov; d. 1987, Moscow) succeeded in building a function with pathological discontinuities which make its Fourier series diverge almost everywhere.

5.6.6. The Gibbs phenomenon and Cesàro summation

Dirichlet's theorem does not imply that the behavior of the Fourier series in the neighborhood of a discontinuity of a function will be "regular"; in fact, as we approach a jump discontinuity, oscillations, known as Gibbs oscillations, begin to appear, and remain present even when the number of Fourier coefficients is increased. If a function $f$ is a Dirichlet function, then the oscillations to the left and right of the discontinuity cancel out, and their average coincides with the value of $f$ at the jump. The difference between the value of the function $f$ and the value of the trigonometric polynomial $S_N$ in an arbitrarily close neighborhood of a jump discontinuity can be shown to be close to 18% of half the jump (that is, about 9% of the full jump), even when $N \to +\infty$.

The analysis of the Gibbs phenomenon involves mathematical subtleties which lie outside the scope of this book. For a more detailed exploration of the Gibbs phenomenon, readers may wish to consult Vretblad (2003). Figure 5.2 shows the Gibbs effect for a rectangular pulse function.

Gibbs oscillations can be eliminated by considering a Cesàro (b. 1859, Naples; d. 1906, Torre Annunziata) summation in place of the usual summation; in this case, arithmetic averages of the partial sums are used to "smooth out" oscillations.

5.6.7. Speed of convergence to 0 of Fourier coefficients

We begin with a general result.

LEMMA 5.3 (Riemann-Lebesgue lemma).– Taking $f \in L^1[a, b]$, then:
$$\lim_{n\to+\infty}\int_a^b f(t)\cos(nt)\,dt = \lim_{n\to+\infty}\int_a^b f(t)\sin(nt)\,dt = \lim_{n\to+\infty}\int_a^b f(t) e^{int}\,dt = 0$$

The geometric interpretation of the Riemann-Lebesgue lemma is that the function $f(t)\cos(nt)$ or $f(t)\sin(nt)$ oscillates at such a high frequency when $n \to +\infty$ that the values around the average cancel out, and thus the integral converges to 0.

An immediate corollary of this lemma is that the Fourier coefficients of the Fourier series of a function $f \in L^1[a, b]$ (and, of course, $(b - a)$-periodic) decay toward 0 when $n \to +\infty$.


Figure 5.2. Gibbs phenomenon for the rectangular pulse function (courtesy of Éric Luçon)
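A rough numerical counterpart of the figure can be obtained with a few lines of Python. The sketch below uses the square wave equal to $+1$ on $(0, \pi)$ and $-1$ on $(-\pi, 0)$, whose Fourier series is $\frac{4}{\pi}\sum_{n \text{ odd}} \frac{\sin(nt)}{n}$; this signal is an assumption chosen for simplicity and is not necessarily the one plotted in the figure. The first maximum of the partial sum with $m$ odd harmonics is located at $t = \pi/(2m)$, and the Cesàro mean is computed with the weights $1 - n/(N+1)$, which correspond to averaging the partial sums $S_0, \ldots, S_N$.

```python
import math

def square_wave_partial_sum(t, m):
    """Partial sum containing the first m odd harmonics of the square wave."""
    return (4 / math.pi) * sum(math.sin((2 * j - 1) * t) / (2 * j - 1)
                               for j in range(1, m + 1))

def cesaro_mean(t, N):
    """Arithmetic mean of the partial sums S_0, ..., S_N of the same square wave."""
    return sum((1 - n / (N + 1)) * (4 / (math.pi * n)) * math.sin(n * t)
               for n in range(1, N + 1, 2))

# Height of the first overshoot peak for an increasing number of harmonics.
peaks = {m: square_wave_partial_sum(math.pi / (2 * m), m) for m in (10, 100, 1000)}

# Largest value taken by a Cesàro mean on a grid of points in (0, pi).
cesaro_max = max(cesaro_mean(j * math.pi / 400, 199) for j in range(1, 400))

print(peaks)        # the peak does not shrink toward 1 as m grows
print(cesaro_max)   # the averaged sums do not overshoot the value 1
```

The peak height stays near 1.18 whatever the number of harmonics, while the Cesàro means remain bounded by the values of the function, at the price of a less sharp transition across the jump.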

Theorem 5.18 shows that the regularity of $f$ has an important effect on the speed of decay of Fourier coefficients.

THEOREM 5.18.– Let $f : \mathbb{R} \to \mathbb{R}$ be a function that:
– is of class $C^p([a, b])$, that is, $f$ is differentiable $p$ times on $[a, b]$ with continuous derivatives up to order $p$;
– is $(b - a)$-periodic;
– possesses equal generalized derivatives at the endpoints of the interval $[a, b]$.

Then, the Fourier coefficients of $f$, $a_n, b_n$, $n = 1, 2, \ldots$ verify:
$$a_n, b_n = o\left(\frac{1}{n^p}\right)$$
that is, they decay toward 0 faster than $\frac{1}{n^p}$.

This result is very important, as it tells us that if $f$ is "smooth", then it can be approximated in a precise manner even with a small number of Fourier coefficients. However, if $f$ is not sufficiently smooth, then the convergence to 0 of the Fourier coefficients of $f$ is slow, and a large number of these coefficients is required in order to obtain a good approximation of $f$. The converse is also true under some suitable hypotheses, which space does not permit us to describe here. The most important concept to grasp is that the faster the Fourier coefficients of a function converge to 0, the smoother the function is.

PROOF.– Let us consider the coefficients $a_n$; the proof is identical for the coefficients $b_n$. We can develop our proof, without loss of generality, by considering $b = \pi$, $a = -\pi$; in fact, it is always possible to reduce the analysis to these values thanks to the following linear change of variable:
$$s(t) = \frac{b + a}{2} + \frac{b - a}{2\pi}\,t$$
which satisfies $s(\pi) = b$ and $s(-\pi) = a$. Using this convention, the expression of $a_n$, $n = 1, 2, \ldots$, is integrated by parts, with $u = f(t)$ and $dv = \cos(nt)\,dt$, hence $du = f'(t)\,dt$ and $v = \frac{1}{n}\sin(nt)$. We obtain:
$$a_n = \frac{1}{\pi n}\left[f(t)\sin(nt)\right]_{-\pi}^{\pi} - \frac{1}{\pi n}\int_{-\pi}^{\pi} f'(t)\sin(nt)\,dt = \frac{1}{\pi n}\int_{-\pi}^{\pi} f'(t)\cos\left(\frac{\pi}{2} + nt\right)dt$$
since $\sin(n\pi) = \sin(-n\pi) = 0$ and $\cos\left(\frac{\pi}{2} + \alpha\right) = -\sin(\alpha)$ $\forall \alpha \in \mathbb{R}$.

After a second integration by parts, we obtain:
$$a_n = \frac{1}{\pi n}\left\{ \frac{1}{n}\left[f'(t)\sin\left(\frac{\pi}{2} + nt\right)\right]_{-\pi}^{\pi} - \frac{1}{n}\int_{-\pi}^{\pi} f''(t)\sin\left(\frac{\pi}{2} + nt\right)dt \right\}$$
Since $f'(-\pi) = f'(\pi)$ by hypothesis, the first bracketed term is zero; indeed:
$$\begin{aligned}
\left[f'(t)\sin\left(\frac{\pi}{2} + nt\right)\right]_{-\pi}^{\pi} &= f'(\pi)\sin\left(\frac{\pi}{2} + n\pi\right) - f'(-\pi)\sin\left(\frac{\pi}{2} - n\pi\right) \\
&= f'(\pi)\left[\sin\left(\frac{\pi}{2} + n\pi\right) - \sin\left(\frac{\pi}{2} - n\pi\right)\right] \\
&= f'(\pi)\left[\sin\left(\frac{\pi}{2} + n\pi\right) - \sin\left(\frac{\pi}{2} - n\pi + 2n\pi\right)\right] \\
&= f'(\pi)\left[\sin\left(\frac{\pi}{2} + n\pi\right) - \sin\left(\frac{\pi}{2} + n\pi\right)\right] = 0
\end{aligned}$$
Furthermore, the second term in brackets can be rewritten as:
$$-\frac{1}{n}\int_{-\pi}^{\pi} f''(t)\sin\left(\frac{\pi}{2} + nt\right)dt = \frac{1}{n}\int_{-\pi}^{\pi} f''(t)\cos\left(\frac{\pi}{2} + \frac{\pi}{2} + nt\right)dt = \frac{1}{n}\int_{-\pi}^{\pi} f''(t)\cos\left(\frac{\pi}{2}\cdot 2 + nt\right)dt$$
Therefore:
$$a_n = \frac{1}{\pi n^2}\int_{-\pi}^{\pi} f''(t)\cos\left(\frac{\pi}{2}\cdot 2 + nt\right)dt$$

In short, one integration by parts of $a_n$ gives us the expression:
$$a_n = \frac{1}{\pi n}\int_{-\pi}^{\pi} f'(t)\cos\left(\frac{\pi}{2} + nt\right)dt$$
With two integrations by parts of $a_n$, we have:
$$a_n = \frac{1}{\pi n^2}\int_{-\pi}^{\pi} f''(t)\cos\left(\frac{\pi}{2}\cdot 2 + nt\right)dt$$
With $p$ integrations by parts of $a_n$, we have:
$$a_n = \frac{1}{\pi n^p}\int_{-\pi}^{\pi} f^{(p)}(t)\cos\left(\frac{\pi}{2}\cdot p + nt\right)dt$$
Similarly, we obtain:
$$b_n = \frac{1}{\pi n^p}\int_{-\pi}^{\pi} f^{(p)}(t)\sin\left(\frac{\pi}{2}\cdot p + nt\right)dt$$

We now see that, by using the trigonometric identities $\cos\left(\frac{\pi}{2} + \alpha\right) = -\sin(\alpha)$ and $\sin\left(\frac{\pi}{2} + \alpha\right) = \cos(\alpha)$, $\forall \alpha \in \mathbb{R}$, the integrals:
$$\varepsilon_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f^{(p)}(t)\cos\left(\frac{\pi}{2}\cdot p + nt\right)dt, \qquad \tilde{\varepsilon}_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f^{(p)}(t)\sin\left(\frac{\pi}{2}\cdot p + nt\right)dt$$
are, by definition, the Fourier coefficients of the function $f^{(p)}$ to within a sign. By hypothesis, $f^{(p)}$ is continuous on $[-\pi, \pi]$ and thus, as the domain $[-\pi, \pi]$ is compact, $f^{(p)} \in L^1[-\pi, \pi]$; hence, by the Riemann-Lebesgue lemma, its Fourier coefficients converge to 0 when $n \to +\infty$. Therefore, $\varepsilon_n \xrightarrow[n\to\infty]{} 0$ and $\tilde{\varepsilon}_n \xrightarrow[n\to\infty]{} 0$, which means that:
$$n^p a_n = \varepsilon_n \xrightarrow[n\to\infty]{} 0, \qquad n^p b_n = \tilde{\varepsilon}_n \xrightarrow[n\to\infty]{} 0$$
that is, $a_n, b_n = o\left(\frac{1}{n^p}\right)$. $\square$
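The qualitative link between smoothness and decay can be illustrated numerically. The sketch below compares two $2\pi$-periodic signals chosen for this example only: a square wave, which is discontinuous, and the triangle wave $|t|$ on $[-\pi, \pi]$, which is continuous with corners. Neither satisfies all the hypotheses of Theorem 5.18 for $p \ge 1$, so this is an illustration of the general principle rather than a verification of the theorem. For odd $n$, the closed forms are $b_n = \frac{4}{\pi n}$ for the square wave and $a_n = -\frac{4}{\pi n^2}$ for the triangle wave.

```python
import math

def midpoint_integral(g, a, b, samples=20000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / samples
    return h * sum(g(a + (j + 0.5) * h) for j in range(samples))

def square_wave(t):
    return 1.0 if t > 0 else -1.0   # jump discontinuity at t = 0

def triangle_wave(t):
    return abs(t)                   # continuous, with a corner at t = 0

rows = []
for n in (1, 3, 5, 7):
    b_n = midpoint_integral(lambda t: square_wave(t) * math.sin(n * t),
                            -math.pi, math.pi) / math.pi
    a_n = midpoint_integral(lambda t: triangle_wave(t) * math.cos(n * t),
                            -math.pi, math.pi) / math.pi
    # Rescale by n and n^2: both columns should stay close to 4/pi.
    rows.append((n, n * abs(b_n), n ** 2 * abs(a_n)))

for row in rows:
    print(row)
```

The coefficients of the continuous signal decay one power of $n$ faster than those of the discontinuous one, which is why far fewer harmonics are needed to approximate it well.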

This result was used by Krylov (1863–1945) as the foundation of his method for improving the convergence of Fourier series for jump-discontinuous functions.

5.6.8. Fourier transform in $L^2(\mathbb{T})$ and shift

Now, let us analyze the relationship between shift and the Fourier transform for a function $f \in L^2(\mathbb{T})$. The result is qualitatively identical to that which we obtained for the DFT in section 2.7.2.

THEOREM 5.19 (Fourier transform and shift).– Taking $f \in L^2(\mathbb{T})$, then:
1) if $g_a(x) = f(x - a)$, $a \in \mathbb{R}$, then: $\hat{g}_a(n) = e^{-ina}\hat{f}(n)$, $\forall n \in \mathbb{Z}$;
2) if $g_k(x) = e^{ikx} f(x)$, $k \in \mathbb{Z}$, then: $\hat{g}_k(n) = \hat{f}(n - k)$, $\forall n \in \mathbb{Z}$.

PROOF.– Only the proof for 1 is shown here, as the proof for 2 is analogous. The proof consists of a direct calculation in which we make use of the shift-invariance of the Lebesgue measure:
$$\begin{aligned}
\hat{g}_a(n) = \langle g_a, u_n\rangle &= \int_0^{2\pi} \frac{f(x - a)}{\sqrt{2\pi}}\, e^{-inx}\,dx \\
&= \int_{-a}^{2\pi - a} \frac{f(x)}{\sqrt{2\pi}}\, e^{-in(x + a)}\,d(x + a) = e^{-ina}\int_0^{2\pi} \frac{f(x)}{\sqrt{2\pi}}\, e^{-inx}\,dx \\
&= e^{-ina}\hat{f}(n).
\end{aligned}$$
$\square$

DEFINITION 5.17.– The set $\left\{|\hat{f}(n)|,\ n \in \mathbb{Z}\right\}$ is the spectrum (amplitude spectrum) of $f \in L^2(\mathbb{T})$.

$|\hat{f}(n)|$ represents the weight of importance of the harmonic of frequency $n$, that is, $e^{inx}$, in reconstructing $f$, as can be seen in the formula $f = \sum_{n\in\mathbb{Z}} \hat{f}(n)\frac{e^{inx}}{\sqrt{2\pi}}$.
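Point 1 of Theorem 5.19 can be checked numerically. The sketch below approximates the Fourier coefficients of a signal $f$ and of its shifted copy $g_a(x) = f(x - a)$ with a midpoint rule, then compares $\hat{g}_a(n)$ with $e^{-ina}\hat{f}(n)$. The test signal, the shift $a = 0.7$ and the name `fourier_coefficient` are assumptions made for illustration.

```python
import cmath
import math

def fourier_coefficient(f, n, samples=2048):
    """Midpoint-rule approximation of (1/sqrt(2*pi)) * int_0^{2*pi} f(x) e^{-inx} dx."""
    h = 2 * math.pi / samples
    total = sum(f((j + 0.5) * h) * cmath.exp(-1j * n * (j + 0.5) * h)
                for j in range(samples))
    return total * h / math.sqrt(2 * math.pi)

def f(x):
    # Illustrative 2*pi-periodic signal.
    return 1.0 + 2.0 * math.cos(3 * x) + math.sin(x)

a = 0.7   # shift

def g(x):
    return f(x - a)

max_phase_error = 0.0      # largest |g^(n) - e^{-ina} f^(n)|
max_spectrum_error = 0.0   # largest | |g^(n)| - |f^(n)| |
for n in range(-4, 5):
    f_hat = fourier_coefficient(f, n)
    g_hat = fourier_coefficient(g, n)
    max_phase_error = max(max_phase_error,
                          abs(g_hat - cmath.exp(-1j * n * a) * f_hat))
    max_spectrum_error = max(max_spectrum_error, abs(abs(g_hat) - abs(f_hat)))

print(max_phase_error, max_spectrum_error)   # both at rounding-error level
```

The shift only rotates each coefficient in the complex plane by the phase $e^{-ina}$, so the moduli $|\hat{f}(n)|$ carry no trace of it.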

The property which we have just proved shows that the spectrum of $f$ gives us information concerning the presence of certain frequencies in $f$; however, it tells us nothing about their "position": the shifted signal $g_a(x) = f(x - a)$ has the same spectrum as $f$, since $|\hat{g}_a(n)| = |\hat{f}(n)|$. Localized information concerning frequency and position can be obtained in the context of wavelet theory.

5.7. Summary

In this chapter, we extended some structural properties of finite-dimensional inner product spaces to infinite-dimensional Hilbert spaces.

The orthogonal complement to a subset or vector subspace of a Hilbert space plays an important role in this extension.

The theorem of projection onto a closed convex subset of a Hilbert space is essential for extending the geometric structure of finite-dimensional Euclidean spaces to infinite dimensions. The proof of this theorem draws on the parallelogram law, for which a Hilbert norm is required; hence, the theorem is only valid in Hilbert spaces.

When the closed convex subset from the previous theorem is also a vector subspace, then the difference between the original vector and its projection belongs to the orthogonal complement of the subspace, as it does in finite dimensions; this property allows us to extend the orthogonal projection theorem to infinite-dimensional Hilbert spaces.

The orthogonal projection theorem is used to produce an extremely useful characterization of closed vector subspaces in Hilbert spaces, as those which coincide with their biorthogonal complement.

We examined orthonormal systems in separable Hilbert spaces, that is, those which possess at least one countable dense subset. An orthonormal system of a separable Hilbert space is countable. All of the Hilbert spaces discussed here are implicitly considered to be separable unless otherwise stated.
In order for an orthonormal system $(u_n)_{n\in\mathbb{N}}$ to be the generalization of an orthonormal basis to an infinite-dimensional Hilbert space $H$, we must first guarantee that for all $x \in H$, the sequence of Fourier coefficients $(\hat{x}(n) = \langle x, u_n\rangle)_{n\in\mathbb{N}}$ decays toward 0; otherwise, the expansion $\sum_{n\in\mathbb{N}} \langle x, u_n\rangle u_n$ would not converge. Bessel's inequality ensures that this is the case, due to the fact that the sequence of Fourier coefficients with respect to any orthonormal system of a Hilbert space belongs to $\ell^2$.

Bessel's inequality also tells us that Plancherel's identity is not necessarily verified for any orthonormal system, as, in general, it holds that $\sum_{n\in\mathbb{N}} |\langle x, u_n\rangle|^2 \le \|x\|^2$.

The Fischer-Riesz theorem states that Plancherel's identity holds when the series $\sum_{n\in\mathbb{N}} \langle x, u_n\rangle u_n$ converges to $x$; using a counter-example, we showed that this is not the case for an arbitrary orthonormal system. It turns out that $\sum_{n\in\mathbb{N}} \langle x, u_n\rangle u_n$ is the expansion of $x$ when the orthonormal system $(u_n)_{n\in\mathbb{N}}$ is complete, that is, it is not a proper part of another orthonormal system in $H$. Complete orthonormal systems are also known as Hilbert bases.

A Hilbert basis $(u_n)_{n\in\mathbb{N}}$ can be characterized using five equivalent conditions: the fact that the zero vector is the only vector which is orthogonal to all elements in a Hilbert basis, the fact that the subspace generated by the Hilbert basis is dense in $H$, the ability to expand into a generalized Fourier series, Parseval's identity and Plancherel's identity.

An isomorphism of Hilbert spaces is a surjective transformation which preserves the inner product. We saw that the preservation of inner products implies isometry, and thus injectivity; furthermore, the combination of surjectivity and conservation of the inner product implies linearity.

All infinite-dimensional separable Hilbert spaces on the same field are isomorphic to one another; the prototype of an infinite-dimensional, separable Hilbert space on the field $\mathbb{K}$ is $\ell^2(\mathbb{N}, \mathbb{K})$. This result is the extension, to infinite dimensions, of the fact that $\mathbb{K}^n$ is the prototype of all vector spaces of finite dimension $n$ on $\mathbb{K}$.

The classic Fourier series and transform on spaces $L^2([a, b])$ are defined as a special case of the theory developed earlier; their specificity lies in the choice of a Hilbert basis given by complex exponentials, or by cosines and sines (plus a constant function). This also holds for functions defined on $\mathbb{R}$, as long as they are periodic.

As in the case of sequences in $\ell^2(\mathbb{Z}_N)$, for the functions of $L^2$ described earlier the Fourier spectrum (the set of magnitudes of the Fourier coefficients) is shift-invariant, raising the necessity of an extension of Fourier theory to provide a localized frequency analysis. Wavelet theory responded to this need.

6 Bounded Linear Operators in Hilbert Spaces

A function $A : V \to W$, with $V$ and $W$ normed vector spaces on the same field $\mathbb{K}$, is known as a linear operator between $V$ and $W$ if:

$$A(\alpha x + \beta y) = \alpha A(x) + \beta A(y) \qquad \forall \alpha, \beta \in \mathbb{K},\ \forall x, y \in V$$

To simplify the notation, the parentheses may be omitted in later occurrences, writing $Ax$ in place of $A(x)$. $V$ is the domain of $A$; the set:

$$\mathrm{Im}(A) = A(V) = \{y \in W : \exists x \in V : y = Ax\} \subseteq W$$

is the codomain or image of $A$, and $W$ is the destination set of $A$. Basic examples are shown below.

1) The identity operator: $\mathrm{id} : V \to V$, $\mathrm{id}(x) = x$ $\forall x \in V$, and the null operator: $0 : V \to V$, $0(x) = 0_V$ $\forall x \in V$;

2) The differential operator: this is defined on a space of differentiable functions which may change according to the particular application we are interested in. As a concrete example, consider the first-order differential operator $D_1 f(t) = \frac{df}{dt}(t) = f'(t)$. The set $\mathrm{dom}(D_1) = \{f \in L^2[a, b] \cap C^1[a, b] : f' \in L^2[a, b]\}$, where $a < b$ are real constants, could be a perfectly valid domain for $D_1$. Then:

$$D_1 : \mathrm{dom}(D_1) \subset L^2[a, b] \longrightarrow L^2[a, b], \qquad f \longmapsto D_1 f$$

Similarly, the operator $D_n f(t) = \frac{d^n f}{dt^n}(t) = f^{(n)}(t)$

From Euclidean to Hilbert Spaces: Introduction to Functional Analysis and its Applications First Edition. Edoardo Provenzi. © ISTE Ltd 2021. Published by ISTE Ltd and John Wiley & Sons, Inc.


can be defined on the domain $\mathrm{dom}(D_n) = \{f \in L^2[a, b] \cap C^n[a, b] : f^{(n)} \in L^2[a, b]\}$, where $a < b$ are real constants, that is:

$$D_n : \mathrm{dom}(D_n) \subset L^2[a, b] \longrightarrow L^2[a, b], \qquad f \longmapsto D_n f$$

Partial differential operators are defined in a similar way;

3) The integral operator: this operator is typically defined by considering a kernel function $k(s, t)$, $k \in L^2([a, b] \times [a, b])$, where $a < b$ are real constants. The integration operator with kernel $k$ is:

$$T_k : L^2[a, b] \longrightarrow L^2[a, b], \qquad f \longmapsto T_k f, \quad \text{where } T_k f(s) = \int_a^b k(s, t) f(t)\,dt$$

4) Linear operators in finite dimensions. Let $A : \mathbb{K}^n \to \mathbb{K}^n$ be a linear operator and let $(u_1, \ldots, u_n)$ be an orthonormal basis in $\mathbb{K}^n$. Any $x \in \mathbb{K}^n$ can be written as $x = \sum_{j=1}^{n} \lambda_j u_j$, with $\lambda_j \in \mathbb{K}$ $\forall j$ and, by linearity, $Ax = \sum_{j=1}^{n} \lambda_j A u_j$. Then:

$$\langle Ax, u_i \rangle = \sum_{j=1}^{n} \lambda_j \langle A u_j, u_i \rangle = \sum_{j=1}^{n} \alpha_{ij} \lambda_j, \qquad \forall i = 1, \ldots, n \qquad [6.1]$$

where $\alpha_{ij} = \langle A u_j, u_i \rangle$. This shows that the action of $A$ is entirely determined by the matrix of elements $(\alpha_{ij})_{i,j=1,\ldots,n}$ and vice versa: for any matrix with elements $(\alpha_{ij})_{i,j=1,\ldots,n}$, formula [6.1] can be used to define a linear operator on $\mathbb{K}^n$.

This last example highlights the well-known relationship between linear operators on $\mathbb{K}^n$ and $n \times n$ matrices with elements in $\mathbb{K}$. Since $\mathbb{K}^n$ is the prototype of all vector spaces $V$ of dimension $n$ on $\mathbb{K}$, we can say that the theory of linear operators on vector spaces in finite dimensions is, in essence, a matrix theory. As we shall see, the action of bounded linear operators on separable Hilbert spaces can also be expressed using a matrix, but, in this case, the matrix contains a countably infinite number of rows and columns.

The presence of a topology generated by a norm motivates the need to check the continuity of linear operators defined between two normed vector spaces $V$ and $W$. If $V$ and $W$ have finite dimension, then any linear operator between them is continuous. However, as we shall see in section 6.2.1, if $V$ is of infinite dimension, then even simple linear operators may not be continuous. In the following sections, we shall examine the main properties of linear operators, starting by showing that a linear operator is continuous if and only if it is bounded.
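The correspondence between an operator and its matrix in formula [6.1] can be sketched numerically. The following is a minimal illustration, assuming $\mathbb{K} = \mathbb{R}$, $n = 2$, a hypothetical operator `A` and a hypothetical orthonormal basis; none of these choices come from the text.

```python
import math

def inner(x, y):
    """Standard real inner product on R^n."""
    return sum(a * b for a, b in zip(x, y))

def operator_matrix(A, basis):
    """Matrix alpha[i][j] = <A u_j, u_i> of A in an orthonormal basis."""
    n = len(basis)
    return [[inner(A(basis[j]), basis[i]) for j in range(n)] for i in range(n)]

def apply_via_matrix(alpha, basis, x):
    """Rebuild Ax from formula [6.1]: <Ax, u_i> = sum_j alpha_ij * lambda_j."""
    n = len(basis)
    lam = [inner(x, u) for u in basis]  # lambda_j = <x, u_j>
    coeffs = [sum(alpha[i][j] * lam[j] for j in range(n)) for i in range(n)]
    # Ax = sum_i <Ax, u_i> u_i, written in standard coordinates
    return [sum(coeffs[i] * basis[i][k] for i in range(n)) for k in range(n)]

def A(x):
    """Hypothetical linear operator on R^2, chosen only for illustration."""
    return [2.0 * x[0] + x[1], x[0] - x[1]]

s = 1.0 / math.sqrt(2.0)
basis = [[s, s], [s, -s]]          # an orthonormal basis of R^2
alpha = operator_matrix(A, basis)

x = [3.0, -1.0]
direct = A(x)                                  # A applied directly
rebuilt = apply_via_matrix(alpha, basis, x)    # A applied through its matrix
print(direct, rebuilt)
```

Both computations agree up to rounding, which is the content of the statement that the matrix $(\alpha_{ij})$ determines the action of $A$.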


6.1. Fundamental properties of bounded linear operators between normed vector spaces

We begin by introducing formal definitions for continuous and bounded operators. Let $(V, \|\ \|_V)$ and $(W, \|\ \|_W)$ be two generic normed vector spaces.

DEFINITION 6.1.– Let $A : V \to W$ be a linear operator:

– $A$ is continuous in $x_0 \in V$ if:

$$\forall \varepsilon > 0\ \exists \delta_\varepsilon > 0 : \|x - x_0\|_V < \delta_\varepsilon \implies \|Ax - Ax_0\|_W = \|A(x - x_0)\|_W < \varepsilon$$

– $A$ is continuous on $V$ if $A$ is continuous in every element of $V$;

– $A$ is bounded if $\exists c \in \mathbb{R}$, $c \ge 0$, such that:

$$\|Ax\|_W \le c\|x\|_V \qquad \forall x \in V$$

that is, any vector $x \in V$ is transformed by $A$ into a vector $Ax$ whose norm in $W$ is majorized by a non-negative multiple of the norm of $x$ in $V$.

The continuity of a linear operator is equivalent to sequential continuity, just as we saw in the case of functions defined on metric spaces.

THEOREM 6.1.– The linear operator $A : V \to W$ is continuous in $x_0 \in V$ if and only if:

$$\forall (x_n)_{n\in\mathbb{N}} \subset V, \quad x_n \underset{n \to +\infty}{\longrightarrow} x_0 \implies Ax_n \underset{n \to +\infty}{\longrightarrow} Ax_0$$

that is:

$$\forall (x_n)_{n\in\mathbb{N}} \subset V, \quad \|x_n - x_0\|_V \underset{n \to +\infty}{\longrightarrow} 0 \implies \|Ax_n - Ax_0\|_W \underset{n \to +\infty}{\longrightarrow} 0$$

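Before going into the details concerning the properties of continuous linear operators, we can show that any continuous linear operator on a Hilbert space can be represented by an infinite matrix. Let us use the same argument as in example 4 discussed previously: let $H$ be a Hilbert space, $A : H \to H$ a continuous linear operator and $(u_n)_{n\in\mathbb{N}}$ a Hilbert basis of $H$. Then, for all $x \in H$, $x = \sum_{n\in\mathbb{N}} \langle x, u_n \rangle u_n$ and, by the continuity and linearity of $A$, we have:

$$Ax = A\Big(\sum_{n\in\mathbb{N}} \langle x, u_n \rangle u_n\Big) \underset{\text{(continuity)}}{=} \sum_{n\in\mathbb{N}} A(\langle x, u_n \rangle u_n) \underset{\text{(linearity)}}{=} \sum_{n\in\mathbb{N}} \langle x, u_n \rangle A u_n$$

Furthermore, by the continuity of the inner product:

$$\langle Ax, u_m \rangle = \Big\langle \sum_{n\in\mathbb{N}} \langle x, u_n \rangle A u_n, u_m \Big\rangle = \sum_{n\in\mathbb{N}} \langle A u_n, u_m \rangle \langle x, u_n \rangle = \sum_{n\in\mathbb{N}} \alpha_{nm} \langle x, u_n \rangle, \qquad \forall m \in \mathbb{N}$$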


where $\alpha_{nm} = \langle A u_n, u_m \rangle$; thus the infinite matrix with elements $(\alpha_{nm})_{n,m\in\mathbb{N}}$ is the representation of the continuous linear operator $A$ with respect to the Hilbert basis $(u_n)_{n\in\mathbb{N}}$.

Unlike the finite-dimensional case, it is not easy to know when an infinite matrix corresponds to a continuous linear operator; this is the reason why infinite matrices are almost never used when studying linear operators in infinite-dimensional Hilbert spaces. Theorem 6.2 makes it considerably simpler to analyze the continuity of linear operators.

THEOREM 6.2.– Let $A : V \to W$ be a linear operator and $x_0 \in V$ an arbitrary fixed element. Then, $A$ is continuous in $x_0$ if and only if $A$ is continuous on all $V$.

This theorem implies that we simply need to prove the continuity of a linear operator at a single, arbitrary point in order to guarantee the continuity over the whole vector space on which it is defined.

PROOF.– $\Longleftarrow$: trivial, as if $A$ is continuous on $V$, then, by definition, $A$ is continuous at all points in $V$.

$\Longrightarrow$: let $A$ be continuous in $x_0$. To demonstrate that $A$ is continuous on $V$, we must prove that the continuity of $A$ in $x_0$ implies its continuity in any arbitrary element $x \in V$. Given any sequence $(x_n)_{n\in\mathbb{N}} \subset V$ such that $x_n \to x$ as $n \to +\infty$, we must prove that this implies $\|A(x_n) - A(x)\| \to 0$ as $n \to +\infty$.

We note that the sequence $(x_n - x + x_0)_{n\in\mathbb{N}}$ converges to $x_0$ since $(x_n)_{n\in\mathbb{N}}$ converges to $x$. Thus, by the continuity of $A$ in $x_0$, it holds that $A(x_n - x + x_0) \to A(x_0)$, that is $\|A(x_n - x + x_0) - A(x_0)\| \to 0$ as $n \to +\infty$; furthermore, by the linearity of $A$:

$$\|A(x_n - x + x_0) - A(x_0)\| = \|A(x_n) - A(x) + A(x_0) - A(x_0)\| = \|A(x_n) - A(x)\| \underset{n \to +\infty}{\longrightarrow} 0. \qquad \square$$

Thus, to verify the continuity (or lack of continuity!) of a linear operator $A : V \to W$, we must simply verify this property for an arbitrary point in $V$. This point is often chosen to be $0_V$, the zero vector in $V$, as, in many cases, it simplifies the calculations involved. (We recall that, for a linear operator, continuity and uniform continuity are equivalent conditions.)


This fact is used below to prove a theorem which shows the relationship between continuous and bounded linear operators.

THEOREM 6.3.– A linear operator $A : V \to W$ is bounded if and only if it is continuous.

PROOF.– $A$ bounded $\implies$ $A$ continuous $\iff$ $A$ continuous in $0_V$. Take $(x_n)_{n\in\mathbb{N}} \subset V$ with $x_n \to 0_V$, that is $\|x_n\|_V \to 0$ as $n \to +\infty$; as $A$ is assumed to be bounded, $\exists c \in \mathbb{R}^+$ such that, using $A0_V = 0_W$:

$$\|Ax_n - A0_V\|_W = \|Ax_n\|_W \le c\|x_n\|_V \underset{n \to +\infty}{\longrightarrow} 0$$

thus, for any sequence $(x_n)_{n\in\mathbb{N}} \subset V$ with $x_n \to 0_V$, we have $Ax_n \to A(0_V)$, which corresponds to the continuity of $A$ in $0_V$, and hence, by the previous theorem, on all $V$.

$A$ continuous $\iff$ $A$ continuous in $0_V$ $\implies$ $A$ bounded. In this case, it is helpful to consider the original definition of continuity, and to express it for $x_0 = 0_V$:

$$\forall \varepsilon > 0\ \exists \delta_\varepsilon > 0 : \|x - 0_V\|_V < \delta_\varepsilon \implies \|Ax - A0_V\|_W < \varepsilon$$

that is:

$$\forall \varepsilon > 0\ \exists \delta_\varepsilon > 0 : \|x\|_V < \delta_\varepsilon \implies \|Ax\|_W < \varepsilon$$

As the previous expression is valid for all $\varepsilon$, we can consider the case where $\varepsilon = 1$. For simplicity's sake, we shall write $\delta_{\varepsilon=1} \equiv K > 0$. Using these choices, the hypothesis that $A$ is continuous in $0_V$ gives us the following implication:

$$\|x\|_V < K \implies \|Ax\|_W < 1 \qquad [6.2]$$

Note that we are approaching the definition of a bounded operator. The final step of the proof consists of determining a specific vector $x$ which satisfies [6.2] and that allows us to handle the inequality $\|Ax\|_W < 1$ in order to prove that $A$ is bounded. To this aim, let us consider a real positive number $0 < \sigma < K$, hence $K - \sigma > 0$, and an arbitrary element $y \in V$, $y \ne 0_V$. We analyze the norm of the vector $(K - \sigma)\frac{y}{\|y\|_V}$:

$$\Big\|(K - \sigma)\frac{y}{\|y\|_V}\Big\|_V = \frac{K - \sigma}{\|y\|_V}\|y\|_V = K - \sigma < K$$


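that is, $(K - \sigma)\frac{y}{\|y\|_V}$ is a vector in $V$ whose norm is strictly less than $K$; thus, relationship [6.2] implies:

$$\Big\|A\Big((K - \sigma)\frac{y}{\|y\|_V}\Big)\Big\|_W < 1 \iff \frac{K - \sigma}{\|y\|_V}\|Ay\|_W < 1 \iff \|Ay\|_W < \frac{1}{K - \sigma}\|y\|_V$$

Since $y \in V$ is arbitrary and $K - \sigma > 0$, we can take $c \equiv \frac{1}{K - \sigma}$ and obtain the definition of a bounded $A$:

$$\|Ay\|_W \le c\|y\|_V \qquad \forall y \in V$$

(for $y = 0_V$ the inequality is trivial, since $A0_V = 0_W$). $\square$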

This theorem implies that the terms "bounded" and "continuous" can be interchanged for linear operators between normed vector spaces.

So far, we specified the vector space in which the norm in question was considered. From now on, for simplicity's sake, this specification will not be shown and we shall simply write $\|\ \|$.

6.1.1. Continuity of linear operators defined on a finite-dimensional normed vector space

The following result shows that all linear operators defined on a finite-dimensional vector space are continuous (and thus bounded).

THEOREM 6.4.– If $V$ is a normed vector space of finite dimension $N$ and $W$ is a normed vector space (of any dimension), then any linear operator $A : V \to W$ is bounded (and thus continuous).

PROOF.– As the space $V$ is of finite dimension, all norms on $V$ are equivalent by Tychonoff's theorem (Theorem 4.4); thus, we must simply prove that $A : V \to W$ is continuous with respect to one norm, and this proof holds for all other norms. Let $(u_1, \ldots, u_N)$ be a basis of $V$, then any $x \in V$ can be written as $x = \sum_{n=1}^{N} x_n u_n$, $x_n \in \mathbb{K}$. Let us consider the following norm on $V$:

$$\|x\| = \Big\|\sum_{n=1}^{N} x_n u_n\Big\| \equiv \sup_{n=1,\ldots,N} |x_n|$$

By the linearity of $A$ and the triangular inequality, we have:


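$$\begin{aligned}
\|Ax\| = \Big\|A\Big(\sum_{n=1}^{N} x_n u_n\Big)\Big\| = \Big\|\sum_{n=1}^{N} x_n A u_n\Big\| &\le \sum_{n=1}^{N} \|x_n A u_n\| = \sum_{n=1}^{N} |x_n| \|A u_n\| \\
&\le \sum_{n=1}^{N} \Big(\sup_{n=1,\ldots,N} |x_n|\Big) \|A u_n\| = \Big(\sup_{n=1,\ldots,N} |x_n|\Big) \Big(\sum_{n=1}^{N} \|A u_n\|\Big) \\
&\underset{\text{def. of } \|x\|}{=} \Big(\sum_{n=1}^{N} \|A u_n\|\Big) \|x\|
\end{aligned}$$

This shows us that $A$ is bounded, and therefore continuous. $\square$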
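The bound obtained in this proof, $\|Ax\| \le \big(\sum_n \|Au_n\|\big) \sup_n |x_n|$, can be checked on a small example. The sketch below assumes a hypothetical linear map from $\mathbb{R}^2$ to $\mathbb{R}^3$ with the Euclidean norm on the image space and a hypothetical (non-orthonormal) basis; these are illustrative choices, not taken from the text.

```python
import math

def euclid(v):
    """Euclidean norm, used here as the norm on the image space W."""
    return math.sqrt(sum(t * t for t in v))

def A(v):
    """Hypothetical linear map R^2 -> R^3, chosen only for illustration."""
    return [v[0] + 2.0 * v[1], -v[1], 3.0 * v[0]]

basis = [[1.0, 0.0], [1.0, 1.0]]           # a basis (u_1, u_2) of V = R^2
c = sum(euclid(A(u)) for u in basis)       # constant from the proof: sum ||A u_n||

def combine(coords):
    """Vector x = sum_n x_n u_n written in standard coordinates."""
    return [sum(coords[n] * basis[n][k] for n in range(len(basis)))
            for k in range(2)]

worst_ratio = 0.0
grid = [i / 4.0 for i in range(-8, 9)]
for x1 in grid:
    for x2 in grid:
        sup_norm = max(abs(x1), abs(x2))   # ||x|| = sup |x_n| in the chosen basis
        if sup_norm == 0.0:
            continue
        worst_ratio = max(worst_ratio, euclid(A(combine([x1, x2]))) / sup_norm)

print(worst_ratio, c)
```

On every sampled vector the ratio $\|Ax\| / \|x\|$ stays below the constant `c`, in agreement with the theorem; the sampling only illustrates the inequality and does not replace the proof.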

We therefore do not face any continuity problems when considering linear operators defined on finite-dimensional normed vector spaces, whatever the dimension of the image space. As we shall see, the situation is much more complicated in the case of infinite-dimensional domains.

6.2. The operator norm, convergence of operator sequences and Banach algebras

DEFINITION 6.2.– Let $A : V \to W$, $V \ne \{0_V\}$, be a bounded linear operator. The operator norm of $A$ can be defined in four different (equivalent) ways:

$$\|A\| = \inf\{c \ge 0 : \|Ax\| \le c\|x\|,\ \forall x \in V\} = N_1 \qquad [6.3]$$

$$\|A\| = \sup_{\|x\| \le 1} \|Ax\| = N_2 \qquad [6.4]$$

$$\|A\| = \sup_{\|x\| = 1} \|Ax\| = N_3 \qquad [6.5]$$

$$\|A\| = \sup_{x \ne 0_V} \frac{\|Ax\|}{\|x\|} = N_4 \qquad [6.6]$$

For a non-bounded operator $A$, we write $\|A\| = +\infty$; evidently, for the zero operator $0$ it holds that $\|0\| = 0$.

Theorem 6.5 guarantees that the definition above is well posed.

THEOREM 6.5.– The four definitions given above coincide.

PROOF.– We shall show that $N_1 \le N_4 \le N_3 \le N_2 \le N_1$, working from right to left. In all of these proofs, we shall use the fact that the sup of a set is, by definition, the smallest of the upper bounds of the set itself.


$N_2 \le N_1$: by the definition of $N_1$ (i.e. equation [6.3]) we can write $\|Ax\| \le N_1\|x\|$ $\forall x \in V$; thus, in particular, for vectors $x$ such that $\|x\| \le 1$, it is true that $\|Ax\| \le N_1$, that is, $N_1$ is an upper bound for the set $\{\|Ax\|,\ x \in V,\ \|x\| \le 1\}$. By definition, the sup is the smallest of the upper bounds of a set, hence $N_2 = \sup_{\|x\| \le 1} \|Ax\| \le N_1$.

$N_3 \le N_2$: consider $x \in V$ such that $\|x\| = 1$ and the sequence $x_n = \big(1 - \frac{1}{n}\big)x$, $n \ge 1$. On one side: $\|x_n\| = \big(1 - \frac{1}{n}\big)\|x\| = 1 - \frac{1}{n} \le 1$ and thus $\|Ax_n\| \le \sup_{\|y\| \le 1} \|Ay\| = N_2$. Passing to the limit, we obtain: $\lim_{n \to +\infty} \|Ax_n\| \le \lim_{n \to +\infty} N_2 = N_2$. On the other side, it is clear that $x_n \to x$ as $n \to +\infty$ and thus, by the continuity of $A$ and of the norm, $\lim_{n \to +\infty} \|Ax_n\| = \|A \lim_{n \to +\infty} x_n\| = \|Ax\|$. Combining this information, we can write $\|Ax\| \le N_2$, that is, $N_2$ is an upper bound for the set $\{\|Ax\|,\ x \in V,\ \|x\| = 1\}$. The quantity $N_3$ is defined as the sup of this set, that is the smallest upper bound, hence $N_3 \le N_2$.

$N_4 \le N_3$: let us consider $x \in V$, $x \ne 0_V$; then $\big\|\frac{x}{\|x\|}\big\| = 1$ and $\big\|A\frac{x}{\|x\|}\big\| \le \sup_{\|y\| = 1} \|Ay\| = N_3$. Furthermore, $\big\|A\frac{x}{\|x\|}\big\| = \frac{\|Ax\|}{\|x\|}$, hence $\frac{\|Ax\|}{\|x\|} \le N_3$, that is, $N_3$ is an upper bound for the set $\big\{\frac{\|Ax\|}{\|x\|},\ x \in V,\ x \ne 0_V\big\}$. Since $N_4$ is the sup of this set, that is the smallest of the upper bounds, then $N_4 \le N_3$.

$N_1 \le N_4$: for all $x \ne 0_V$, $\frac{\|Ax\|}{\|x\|} \le N_4$, and so $\|Ax\| \le N_4\|x\|$; this inequality also holds trivially for $x = 0_V$, so $N_4$ is one of the constants $c$ appearing in [6.3]. Since $N_1$ is by definition the inf of these constants, it holds that $N_1 \le N_4$. $\square$

REMARK.–

1) The specification $\forall x \in V$ plays an important role in the definition $\|A\| = \inf\{c \ge 0 : \|Ax\| \le c\|x\|,\ \forall x \in V\}$. Without this condition, the norm of $A$ would be trivially null for any linear operator, since $A0 = 0$. By considering all of the transformed vectors $Ax$, $x \in V$, we ensure that the norm of $A$ is $\ne 0$ (except, evidently, in the case where $A$ is the identically null operator).

2) We should also highlight the difference between the expression $\|A\|$, which represents the operator norm of the linear application $A : V \to W$, and the expression $\|x\|$, which represents the norm of a vector $x \in V$. Certain authors use a different symbol for the operator norm, for example $|||A|||$, but we have chosen to retain the same symbol, $\|\ \|$.
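The equality of the four quantities in Definition 6.2 can be illustrated by sampling. The following is a rough sketch, assuming $V = W = \mathbb{R}^2$ with the Euclidean norm and a hypothetical operator `A`; the suprema are only approximated on a finite set of sampled directions, so the printed values are estimates rather than exact norms.

```python
import math

def A(v):
    """Hypothetical linear operator on R^2 (matrix [[2, 1], [0, 1]])."""
    return (2.0 * v[0] + v[1], v[1])

def norm(v):
    """Euclidean norm on R^2."""
    return math.hypot(v[0], v[1])

def scale(s, v):
    return (s * v[0], s * v[1])

M = 3600  # number of sampled directions on the unit circle
unit_vectors = [(math.cos(2.0 * math.pi * k / M), math.sin(2.0 * math.pi * k / M))
                for k in range(M)]

# N3 [6.5]: sup of ||Ax|| over the unit sphere ||x|| = 1
N3 = max(norm(A(x)) for x in unit_vectors)

# N2 [6.4]: sup of ||Ax|| over the closed unit ball ||x|| <= 1
N2 = max(norm(A(scale(r, x)))
         for r in (0.0, 0.25, 0.5, 0.75, 1.0) for x in unit_vectors)

# N4 [6.6]: sup of ||Ax|| / ||x|| over non-zero vectors of various lengths
N4 = max(norm(A(scale(s, x))) / norm(scale(s, x))
         for s in (0.5, 2.0, 10.0) for x in unit_vectors)

# N1 [6.3]: N4 is an admissible constant c on every sampled vector
N1_admissible = all(norm(A(scale(s, x))) <= N4 * norm(scale(s, x)) + 1e-12
                    for s in (0.5, 2.0, 10.0) for x in unit_vectors)

print(N2, N3, N4, N1_admissible)
```

The three estimates coincide up to rounding because of homogeneity: rescaling $x$ changes neither the ratio $\|Ax\| / \|x\|$ nor the direction on which the supremum is attained.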


We shall now verify that the operator norm is well defined on the set of linear operators from $V$ to $W$, and that this space is stable with respect to pointwise-defined linear operations, that is $(A + B)x = Ax + Bx$ and $(\alpha A)x = \alpha Ax$, for all $\alpha \in \mathbb{K}$ and for all $x \in V$.

– Positive definiteness: evidently, $\|A\| \ge 0$ for any bounded operator $A$ by equation [6.3]. Furthermore, by equation [6.6], $\|A\| = \sup_{x \ne 0_V} \frac{\|Ax\|}{\|x\|} = 0$ if and only if $\|Ax\| = 0$ $\forall x \in V$, $x \ne 0_V$ (if $x = 0_V$ then $Ax = 0$ by linearity). Thus, due to the positive definiteness of the norm of $W$, $\|A\| = 0 \iff Ax = 0$ $\forall x \in V$, that is, if and only if $A$ is the null operator $0(x) = 0$ $\forall x \in V$.

– Homogeneity: this is a direct consequence of the homogeneity of the norm of $W$. Using, for example, equation [6.5], we obtain, $\forall \alpha \in \mathbb{K}$:

$$\|\alpha A\| = \sup_{\|x\| = 1} \|\alpha Ax\| = \sup_{\|x\| = 1} |\alpha| \|Ax\| = |\alpha| \sup_{\|x\| = 1} \|Ax\| = |\alpha| \|A\|$$

that is:

$$\|\alpha A\| = |\alpha| \|A\| \qquad \forall \alpha \in \mathbb{K} \qquad [6.7]$$

– Triangular inequality: an immediate consequence of equation [6.3] in Definition 6.2 is that we can write:

$$\|Ax\| \le \|A\| \|x\| \qquad \forall x \in V \qquad [6.8]$$

Using this alongside the triangular inequality of the norm of $W$, for any pair of operators $A, B : V \to W$ and for all $x \in V$, we can write:

$$\|(A + B)x\| = \|Ax + Bx\| \le \|Ax\| + \|Bx\| \le \|A\|\|x\| + \|B\|\|x\| = (\|A\| + \|B\|)\|x\|$$

By equation [6.3], this implies:

$$\|A + B\| \le \|A\| + \|B\| \qquad [6.9]$$

The inequality [6.9] and the property of homogeneity [6.7] show that the set of bounded linear operators is invariant with respect to linear combinations, and is thus itself a vector space; this space becomes normed by the operator norm.

DEFINITION 6.3.– The normed vector space of bounded linear operators from $V$ to $W$ endowed with the operator norm is noted $\mathcal{B}(V, W)$. If $V = W$, we simply write $\mathcal{B}(V)$.


In the literature, the letter $\mathcal{B}$ is used to denote bounded. The notation $\mathcal{L}(V, W)$ is also used in this sense.

Definition 6.4 is an immediate consequence of the fact that $\mathcal{B}(V, W)$ is a normed vector space.

DEFINITION 6.4 (Convergence in $\mathcal{B}(V, W)$).– A sequence of bounded operators $(A_n)_{n\in\mathbb{N}} \subset \mathcal{B}(V, W)$ converges to the bounded operator $A \in \mathcal{B}(V, W)$ if:

$$\|A_n - A\| \underset{n \to +\infty}{\longrightarrow} 0$$

where $\|A_n - A\|$ is the operator norm of the difference between $A_n$ and $A$.

Exercise 6.1 Using Definition 6.4, prove that a necessary condition for the convergence of a sequence of operators $(A_n)_{n\in\mathbb{N}} \subset \mathcal{B}(V, W)$ to $A \in \mathcal{B}(V, W)$ is:

$$\lim_{n \to +\infty} \|(A_n - A)x\| = 0 \qquad \forall x \in B(0, 1) \qquad [6.10]$$

Solution to Exercise 6.1 We start by noting that, since the sup is a majorant of a set, it holds that:

$$\|A\| \ge \|Ax\| \quad \forall x \in B(0, 1), \qquad B(0, 1) := \{x \in V : \|x\| \le 1\} \qquad [6.11]$$

Inequality [6.11] implies $\|A_n - A\| \ge \|(A_n - A)x\|$ $\forall x \in B(0, 1)$; thus, if there exists at least one $x \in B(0, 1)$ such that $\lim_{n \to +\infty} \|(A_n - A)x\| > 0$, then $\lim_{n \to +\infty} \|A_n - A\| > 0$, which prevents the convergence of the sequence $(A_n)_{n\in\mathbb{N}}$ to $A$. Property [6.10] is thus necessary for $(A_n)_{n\in\mathbb{N}} \subset \mathcal{B}(V, W)$ to converge to $A \in \mathcal{B}(V, W)$. $\square$

In the case where $V = W$, we can add a third operation on $\mathcal{B}(V)$, the product:

$$(AB)(x) := (A \circ B)(x) = A(B(x)) \qquad \forall x \in V$$

that is, the product in $\mathcal{B}(V)$ corresponds to the operation of functional composition between linear operators. We observe that, since $Bx$ is a vector of $V$, inequality [6.8] can be applied twice:

$$\|(AB)x\| = \|A(Bx)\| \le \|A\|\|Bx\| \le \|A\|\|B\|\|x\| \qquad \forall x \in V$$

and thus, by equation [6.3] of Definition 6.2:

$$\|AB\| \le \|A\|\|B\| \qquad [6.12]$$


Hence, taking $A = B$, $\|A^2\| \le \|A\|^2$; by iterating these considerations we obtain the formula:

$$\|A^n\| \le \|A\|^n \qquad \forall n \in \mathbb{N}.$$
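Inequality [6.12] and its iterated form can be observed on matrices. The sketch below assumes $V = \mathbb{R}^2$ equipped with the sup norm; in that case the operator norm of a matrix is its maximum absolute row sum, a standard fact that is used here as an assumption and is not established in the text. The matrices `A` and `B` are hypothetical examples.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    n, m, p = len(A), len(B), len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]

def op_norm_sup(A):
    """Operator norm induced by the sup norm: maximum absolute row sum."""
    return max(sum(abs(a) for a in row) for row in A)

A = [[1.0, -2.0], [0.5, 3.0]]
B = [[0.0, 1.0], [-1.0, 2.0]]

norm_A = op_norm_sup(A)
norm_B = op_norm_sup(B)
norm_AB = op_norm_sup(matmul(A, B))

# Check ||A^n|| <= ||A||^n for the first few powers of A
power_checks = []
P = A
for n in range(1, 6):
    power_checks.append(op_norm_sup(P) <= norm_A ** n + 1e-12)
    P = matmul(P, A)

print(norm_AB, norm_A * norm_B, power_checks)
```

Here the inequality $\|AB\| \le \|A\|\|B\|$ is strict, which shows that [6.12] cannot in general be replaced by an equality.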

Thus $\mathcal{B}(V)$ is invariant with respect to the product operation defined above, and, consequently, $\mathcal{B}(V)$ is a normed associative unital algebra, where the unit is the identity operator.

We recall that an algebra $\mathcal{A}$ on the field $\mathbb{K}$ is a vector space on $\mathbb{K}$ equipped with a binary operation $\cdot : \mathcal{A} \times \mathcal{A} \to \mathcal{A}$, commonly called the product, which is compatible with linear operations; this is equivalent to requiring that $\cdot$ is bilinear, that is, for all $a, b, c \in \mathcal{A}$ and $k \in \mathbb{K}$ it holds that:

– $(a + b) \cdot c = a \cdot c + b \cdot c$ and $a \cdot (b + c) = a \cdot b + a \cdot c$;

– $(ka) \cdot b = k(a \cdot b)$, $a \cdot (kb) = k(a \cdot b)$.

THEOREM 6.6.– Let $(V, \|\ \|)$ be an arbitrary normed vector space on the field $\mathbb{K}$. The sum, product by a scalar of $\mathbb{K}$ and product in the algebra $\mathcal{B}(V)$ are continuous with respect to the operator norm.

PROOF.– Theorem 4.2 also applies in the case of the algebra $\mathcal{B}(V)$, so the sum and product by a scalar are continuous and only the continuity of the product must be proven. If $(A_n)_{n\in\mathbb{N}}$ and $(B_n)_{n\in\mathbb{N}}$ are two sequences of operators of $\mathcal{B}(V)$ which converge to $A \in \mathcal{B}(V)$ and $B \in \mathcal{B}(V)$, respectively, that is $\|A_n - A\| \to 0$ and $\|B_n - B\| \to 0$ as $n \to +\infty$, then we must show that $A_n B_n \to AB$, that is, $\|A_n B_n - AB\| \to 0$ as $n \to +\infty$:

$$\|A_n B_n - AB\| = \|A_n(B_n - B) + (A_n - A)B\| \underset{[6.9],[6.12]}{\le} \|A_n\|\|B_n - B\| + \|A_n - A\|\|B\| \underset{n \to +\infty}{\longrightarrow} 0$$

where the convergence to $0$ uses the fact that the sequence $(\|A_n\|)_{n\in\mathbb{N}}$ is bounded, since $(A_n)_{n\in\mathbb{N}}$ converges. $\square$

The presence of a norm on $\mathcal{B}(V, W)$ generates a topology, and this naturally leads us to examine the conditions under which this space is complete. The following result provides a sufficient condition for $\mathcal{B}(V, W)$ to be complete.

THEOREM 6.7.– Let $V, W$ be two normed vector spaces. If $W$ is complete, then $\mathcal{B}(V, W)$ is complete.

Before proving this theorem, we wish to highlight the fact that the theorem holds for $\mathcal{B}(H)$ or $\mathcal{B}(H_1, H_2)$, if $H, H_1, H_2$ are Hilbert spaces.

PROOF.– Let $(A_n)_{n\in\mathbb{N}}$ be a Cauchy sequence of operators in $\mathcal{B}(V, W)$, that is:

$$\forall \varepsilon > 0\ \exists N_\varepsilon > 0 : \forall m, n \ge N_\varepsilon : \|A_n - A_m\| < \varepsilon$$


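To prove the theorem, we must show that $(A_n)_{n\in\mathbb{N}}$ converges in $\mathcal{B}(V, W)$ using the hypothesis of completeness of $W$. We begin by noting that for all fixed $x \in V$, $x \ne 0_V$, it holds that:

$$\forall m, n \ge N_\varepsilon : \|A_n x - A_m x\| \le \|A_n - A_m\|\|x\| < \varepsilon\|x\| \qquad [6.13]$$

and thus, by the arbitrary nature of $\varepsilon$, $(A_n x)_{n\in\mathbb{N}}$ is a Cauchy sequence in $W$. By hypothesis, $W$ is complete, thus there exists $\lim_{n \to +\infty} A_n x \in W$; this means that we can define the limit operator $A$ associated with the sequence $(A_n)_{n\in\mathbb{N}}$:

$$A : V \longrightarrow W, \qquad x \longmapsto A(x) = \lim_{n \to +\infty} A_n x$$

The operator $A$ is linear, since each $A_n$ is linear and the limit is compatible with linear combinations. We shall show that $(A_n)_{n\in\mathbb{N}}$ converges in operator norm to $A$, and that $A \in \mathcal{B}(V, W)$, completing our proof. We begin by noting that $\forall n \ge N_\varepsilon$, it holds that:

$$\|A_n(x) - A(x)\| = \Big\|A_n(x) - \lim_{m \to +\infty} A_m(x)\Big\| \underset{\text{(continuity of } \|\ \|)}{=} \lim_{m \to +\infty} \|A_n(x) - A_m(x)\| \underset{[6.13]}{\le} \varepsilon\|x\| \qquad [6.14]$$

The final inequality draws on the fact that $m$ tends toward $+\infty$, so we may assume $m \ge N_\varepsilon$; the strict inequality in [6.13] becomes a non-strict one when passing to the limit. Hence:

$$\forall \varepsilon > 0\ \exists N_\varepsilon > 0 : n \ge N_\varepsilon \implies \|A_n - A\| = \sup_{\|x\| = 1} \|(A_n - A)x\| = \sup_{\|x\| = 1} \|A_n(x) - A(x)\| \le \varepsilon$$

that is, $(A_n)_{n\in\mathbb{N}}$ converges in operator norm to $A$. Finally, we must verify that $A \in \mathcal{B}(V, W)$. Taking an arbitrary $x \in V$, then, since inequality [6.14] holds for all $n \ge N_\varepsilon$, we can write:

$$\|Ax\| = \|Ax - A_{N_\varepsilon}x + A_{N_\varepsilon}x\| \le \|Ax - A_{N_\varepsilon}x\| + \|A_{N_\varepsilon}x\| \underset{[6.14]}{\le} \varepsilon\|x\| + \|A_{N_\varepsilon}\|\|x\|$$

that is, $\|Ax\| \le (\varepsilon + \|A_{N_\varepsilon}\|)\|x\|$ $\forall x \in V$, thus $A$ is bounded. $\square$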

DEFINITION 6.5 (Banach algebra).– An algebra $\mathcal{A}$ on the field $\mathbb{K}$ is a Banach algebra if the following properties are verified $\forall a, b, c \in \mathcal{A}$:


– $\mathcal{A}$ is an associative algebra, that is $a \cdot (b \cdot c) = (a \cdot b) \cdot c$;

– $\mathcal{A}$, as a vector space, admits a norm with respect to which it is a Banach space;

– $\|a \cdot b\| \le \|a\|\|b\|$.

From what we have already seen, we know that if $V$ is a Banach space, $\mathcal{B}(V)$ is a complete, associative unital algebra with respect to the operator norm; hence, $\mathcal{B}(V)$ is a unital Banach algebra. Evidently, for any Hilbert space $H$, $\mathcal{B}(H)$ is a unital Banach algebra.

A particularly important property of the kernel of the operators of $\mathcal{B}(V, W)$ is shown below.

THEOREM 6.8.– Let $V, W$ be two normed vector spaces and take $A \in \mathcal{B}(V, W)$, then $\ker(A)$ is a closed vector subspace of $V$.

PROOF.– Let $(v_n)_{n\in\mathbb{N}} \subset \ker(A)$ be an arbitrary convergent sequence. We must prove that its limit, $\bar{v} = \lim_{n \to +\infty} v_n$, remains within $\ker(A)$. $A$ is bounded, and thus continuous, so $\lim_{n \to +\infty} Av_n = A(\lim_{n \to +\infty} v_n) = A\bar{v}$. Furthermore, $Av_n = 0$ $\forall n \in \mathbb{N}$ since $v_n \in \ker(A)$, hence $A\bar{v} = \lim_{n \to +\infty} 0 = 0$, which implies $\bar{v} \in \ker(A)$. $\square$

The usefulness of this theorem is shown in the following exercise, which highlights the fact that the theorem of projection onto a closed proper vector subspace is not valid without the completeness hypothesis.

Exercise 6.2 Let $T$ be the linear operator (actually, a linear functional) defined by:

$$T : \ell^2(\mathbb{N}, \mathbb{C}) \longrightarrow \mathbb{C}, \qquad x = (x_n)_{n\in\mathbb{N}} \longmapsto T(x) = \sum_{n\in\mathbb{N}} \frac{x_n}{n+1}$$

1) Show that $T$ is continuous.

2) Taking $F = \Big\{(x_n)_{n\in\mathbb{N}} \in \ell^2(\mathbb{N}, \mathbb{C}) : \sum_{n\in\mathbb{N}} \frac{x_n}{n+1} = 0\Big\}$, show that $F$ is a closed proper vector subspace of $\ell^2(\mathbb{N}, \mathbb{C})$.

3) Prove the existence of $u \in \ell^2(\mathbb{N}, \mathbb{C})$ such that $F = \{u\}^\perp$ and use your result to deduce the explicit expression of $F^\perp$.

4) We know (see the definition corresponding to [4.26]) that $\ell_0(\mathbb{N}, \mathbb{C})$ is the vector subspace of $\ell^2(\mathbb{N}, \mathbb{C})$ made up of sequences $(x_n)_{n\in\mathbb{N}}$ which are zero after a


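certain index, which we equip with the topology induced by $\ell^2(\mathbb{N}, \mathbb{C})$. Take $G = F \cap \ell_0(\mathbb{N}, \mathbb{C})$.

a) Show that $G$ is a closed proper vector subspace of $\ell_0(\mathbb{N}, \mathbb{C})$.

b) Using formula [5.4], show that the orthogonal complement of $G$ in $\ell_0(\mathbb{N}, \mathbb{C})$, that is $G^{\perp_0} := G^\perp \cap \ell_0$, is reduced to the zero vector: $G^{\perp_0} = \{0_{\ell^2(\mathbb{N},\mathbb{C})}\}$.

c) Use your findings to deduce that $\ell_0(\mathbb{N}, \mathbb{C})$ is not complete in the topology inherited from $\ell^2(\mathbb{N}, \mathbb{C})$.

Solution to Exercise 6.2

1) Let $u = (u_n)_{n\in\mathbb{N}}$ denote the sequence defined by $u_n = \frac{1}{n+1}$ $\forall n \in \mathbb{N}$, which obviously belongs to $\ell^2(\mathbb{N}, \mathbb{C})$. For all $x \in \ell^2(\mathbb{N}, \mathbb{C})$, it holds that:

$$|T(x)| = \Big|\sum_{n\in\mathbb{N}} x_n \frac{1}{n+1}\Big| = \Big|\sum_{n\in\mathbb{N}} x_n u_n\Big| = |\langle x, u \rangle| \underset{\text{Cauchy-Schwarz}}{\le} \|x\|\|u\|$$

thus $T$ is bounded, with $\|T\| \le \|u\|$, that is continuous.

2) By definition, $F = \ker(T)$ and thus it forms a closed vector subspace in $\ell^2(\mathbb{N}, \mathbb{C})$, since we have just proved that $T$ is continuous. One example of an element in $\ell^2(\mathbb{N}, \mathbb{C})$ that does not belong to $F$ is the first vector in the canonical basis of $\ell^2(\mathbb{N}, \mathbb{C})$, that is, $e_1 = (1, 0, 0, \ldots)$, since $\sum_{n\in\mathbb{N}} \frac{e_1(n)}{n+1} = \frac{1}{2} \ne 0$. Thus, $F$ is a closed proper vector subspace of $\ell^2(\mathbb{N}, \mathbb{C})$.

3) Taking $u$ in the same way as in question 1, we know that:

$$F = \{x \in \ell^2(\mathbb{N}, \mathbb{C}) : \langle x, u \rangle = 0\} \equiv \{u\}^\perp$$

so $F^\perp = \{u\}^{\perp\perp} \underset{[5.5]}{=} \overline{\mathrm{span}\{u\}}$; moreover, $\mathrm{span}\{u\} = \big\{\big(\frac{\lambda}{n+1}\big)_{n\in\mathbb{N}},\ \lambda \in \mathbb{C}\big\}$ is a one-dimensional vector subspace, and thus it is closed. Hence $\overline{\mathrm{span}\{u\}} = \mathrm{span}\{u\}$ and

$$F^\perp = \Big\{\Big(\frac{\lambda}{n+1}\Big)_{n\in\mathbb{N}},\ \lambda \in \mathbb{C}\Big\}.$$

4) a) We can rewrite $G$ in an explicit form as:

$$G = F \cap \ell_0(\mathbb{N}, \mathbb{C}) = \Big\{(x_n)_{n\in\mathbb{N}},\ \exists N \in \mathbb{N} : x_n = 0\ \forall n > N \text{ and } \sum_{n=0}^{N} \frac{x_n}{n+1} = 0\Big\}$$

showing that $G = \ker T|_{\ell_0(\mathbb{N},\mathbb{C})}$. As the restriction of a continuous linear operator is itself continuous, $G$ must be a closed vector subspace of $\ell_0(\mathbb{N}, \mathbb{C})$. To prove that $G$ is proper, we consider $e_1$: $e_1 \in \ell_0(\mathbb{N}, \mathbb{C})$, and $\sum_{n=0}^{N} \frac{e_1(n)}{n+1} = \frac{1}{2} \ne 0$.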


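b) We have:

$$G^{\perp_0} = G^\perp \cap \ell_0(\mathbb{N}, \mathbb{C}) = (F \cap \ell_0(\mathbb{N}, \mathbb{C}))^\perp \cap \ell_0(\mathbb{N}, \mathbb{C}) \underset{[5.4]}{=} \overline{\mathrm{span}}\big(F^\perp \cup \ell_0(\mathbb{N}, \mathbb{C})^\perp\big) \cap \ell_0(\mathbb{N}, \mathbb{C})$$

Knowing (from Theorem 4.21) that $\ell_0(\mathbb{N}, \mathbb{C})$ is dense in $\ell^2(\mathbb{N}, \mathbb{C})$, we have $\ell_0(\mathbb{N}, \mathbb{C})^\perp = \{0_{\ell^2(\mathbb{N},\mathbb{C})}\}$, which is already included in $F^\perp$ as a vector subspace of $\ell^2(\mathbb{N}, \mathbb{C})$. Furthermore:

$$\overline{\mathrm{span}}\big(F^\perp \cup \ell_0(\mathbb{N}, \mathbb{C})^\perp\big) = \overline{\mathrm{span}}(F^\perp) = \overline{F^\perp} = F^\perp = \Big\{\Big(\frac{\lambda}{n+1}\Big)_{n\in\mathbb{N}},\ \lambda \in \mathbb{C}\Big\}$$

since $F^\perp$ is closed, from the answer to question 3. Then:

$$G^{\perp_0} = \Big\{\Big(\frac{\lambda}{n+1}\Big)_{n\in\mathbb{N}},\ \lambda \in \mathbb{C}\Big\} \cap \ell_0(\mathbb{N}, \mathbb{C}) = \{0_{\ell^2(\mathbb{N},\mathbb{C})}\}$$

since it is clear that, for $\lambda \ne 0$, the sequence $\big(\frac{\lambda}{n+1}\big)_{n\in\mathbb{N}} \notin \ell_0(\mathbb{N}, \mathbb{C})$.

c) $G$ is a closed, proper vector subspace in $\ell_0(\mathbb{N}, \mathbb{C})$ equipped with the topology inherited from $\ell^2(\mathbb{N}, \mathbb{C})$; nevertheless, we have just shown that $G^{\perp_0}$, the orthogonal complement of $G$ in $\ell_0(\mathbb{N}, \mathbb{C})$, consists of the zero vector alone. This contradicts the result of Theorem 5.4 (a corollary of the theorem of projection onto a closed, convex proper part of a Hilbert space) which states that the orthogonal complement of a closed, proper vector subspace does not solely consist of the zero vector. Clearly, the only hypothesis which is not respected here is the completeness of $\ell_0(\mathbb{N}, \mathbb{C})$ with respect to the inherited topology of $\ell^2(\mathbb{N}, \mathbb{C})$. $\square$

Our next step is to consider the way in which a continuous linear operator between two normed vector spaces interacts with Cauchy sequences.

THEOREM 6.9.– Let $V$ and $W$ be two arbitrary normed vector spaces, $A \in \mathcal{B}(V, W)$, and let $(x_n)_{n\in\mathbb{N}} \subset V$ be a Cauchy sequence; then $(Ax_n)_{n\in\mathbb{N}}$ is a Cauchy sequence in $W$.

PROOF.– By hypothesis: $\forall \varepsilon > 0\ \exists N_\varepsilon > 0 : \forall n, m \ge N_\varepsilon : \|x_n - x_m\| < \varepsilon$. Now, let us consider $(Ax_n)_{n\in\mathbb{N}}$ and analyze $\|Ax_n - Ax_m\| = \|A(x_n - x_m)\| \le \|A\|\|x_n - x_m\| < \|A\|\varepsilon$, $\forall n, m \ge N_\varepsilon$. By the arbitrary nature of $\varepsilon$, $(Ax_n)_{n\in\mathbb{N}}$ is a Cauchy sequence of elements in $W$. $\square$

This result can help to prove the completeness of a normed vector space, as we shall see in Exercise 6.3, which may be seen as a continuation of Exercise 4.2.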
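Theorem 6.9 can be observed on the functional $T$ of Exercise 6.2. The sketch below is only an illustration under stated assumptions: it uses the truncations of $u = (1/(n+1))_{n\in\mathbb{N}}$ as a hypothetical Cauchy sequence in $\ell^2(\mathbb{N}, \mathbb{C})$, represents finitely supported sequences as Python lists, and takes the value $\|u\| = \pi/\sqrt{6}$ from the classical sum $\sum_{n \ge 1} 1/n^2 = \pi^2/6$, which is not derived in the text.

```python
import math

def T(x):
    """T(x) = sum_n x_n / (n + 1) for a finitely supported sequence x (a list)."""
    return sum(xn / (n + 1) for n, xn in enumerate(x))

def l2_dist(x, y):
    """l2 distance between two finitely supported sequences, padded with zeros."""
    size = max(len(x), len(y))
    xp = x + [0.0] * (size - len(x))
    yp = y + [0.0] * (size - len(y))
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(xp, yp)))

def truncation(k):
    """First k terms of u = (1/(n+1))_n; all later terms are zero."""
    return [1.0 / (n + 1) for n in range(k)]

norm_u = math.pi / math.sqrt(6.0)   # ||u||, an upper bound for ||T|| by Exercise 6.2

report = []
for m, k in [(10, 20), (100, 200), (1000, 2000)]:
    xm, xk = truncation(m), truncation(k)
    input_gap = l2_dist(xk, xm)            # ||x_k - x_m|| in l2
    output_gap = abs(T(xk) - T(xm))        # |T x_k - T x_m| in C
    report.append((input_gap, output_gap, output_gap <= norm_u * input_gap))

for row in report:
    print(row)
```

As the indices grow, the gaps between truncations shrink, and the gaps between their images shrink at least as fast, up to the factor $\|u\|$, as the proof of Theorem 6.9 predicts.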


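Exercise 6.3 Given a fixed sequence $a = (a_n)_{n\in\mathbb{N}}$ of strictly positive real numbers, we write:

$$\ell^2_a(\mathbb{N}, \mathbb{C}) := \Big\{u \in \mathbb{C}^{\mathbb{N}} : \sum_{n\in\mathbb{N}} a_n |u_n|^2 < +\infty \iff \sqrt{a}\,u \in \ell^2(\mathbb{N}, \mathbb{C})\Big\}$$

In Exercise 4.2, we verified that:

$$\langle u, v \rangle_{\ell^2_a} = \sum_{n\in\mathbb{N}} a_n u_n \overline{v_n} \qquad \text{and} \qquad \|u\|^2_{\ell^2_a} = \sum_{n\in\mathbb{N}} a_n |u_n|^2$$

are an inner product and a norm on $\ell^2_a(\mathbb{N}, \mathbb{C})$, respectively.

1) Show that the operator

$$\imath_a : \ell^2_a(\mathbb{N}, \mathbb{C}) \hookrightarrow \ell^2(\mathbb{N}, \mathbb{C}), \qquad u \mapsto \imath_a(u) := \sqrt{a}\,u \equiv (\sqrt{a_n}\,u_n)_{n\in\mathbb{N}}$$

is linear, continuous, has unit norm and is bijective. Give the explicit expression of the inverse operator of $\imath_a$; verify that this is continuous and has a norm of 1.

2) Using your findings, deduce that $\ell^2_a(\mathbb{N}, \mathbb{C})$ is a Hilbert space.

3) Let $a$ and $b$ be two sequences of strictly positive real numbers such that $a_n = O(b_n)$ as $n \to +\infty$. Show that $\ell^2_b(\mathbb{N}, \mathbb{C}) \subset \ell^2_a(\mathbb{N}, \mathbb{C})$, and that the canonical injection is continuous.

Solution to Exercise 6.3

1) Linearity can be shown by simple rewriting: if $u, v \in \ell^2_a(\mathbb{N}, \mathbb{C})$ and $\lambda \in \mathbb{C}$, then $\imath_a(u + \lambda v) = \sqrt{a}(u + \lambda v) = \sqrt{a}\,u + \lambda\sqrt{a}\,v = \imath_a(u) + \lambda\imath_a(v)$. Concerning continuity, for all $u \in \ell^2_a(\mathbb{N}, \mathbb{C})$, $\imath_a(u) \in \ell^2(\mathbb{N}, \mathbb{C})$ and the norm is:

$$\|\imath_a(u)\|^2_{\ell^2} = \sum_{n\in\mathbb{N}} |\sqrt{a_n}\,u_n|^2 = \sum_{n\in\mathbb{N}} a_n |u_n|^2 = \|u\|^2_{\ell^2_a} \iff \|\imath_a(u)\|_{\ell^2} = \|u\|_{\ell^2_a}$$

This shows that $\imath_a$ is continuous, and that its norm is 1. The final condition to prove is bijectivity. We note that the operator:

$$j_{1/a} : \ell^2(\mathbb{N}, \mathbb{C}) \longrightarrow \ell^2_a(\mathbb{N}, \mathbb{C}), \qquad v \mapsto j_{1/a}(v) := \frac{1}{\sqrt{a}}\,v \equiv \Big(\frac{1}{\sqrt{a_n}}\,v_n\Big)_{n\in\mathbb{N}}$$

is well defined, since $a > 0$ and $v/\sqrt{a} \in \ell^2_a(\mathbb{N}, \mathbb{C}) \iff \sqrt{a}\,v/\sqrt{a} = v \in \ell^2(\mathbb{N}, \mathbb{C})$. Furthermore, it is such that $j_{1/a} \circ \imath_a : \ell^2_a(\mathbb{N}, \mathbb{C}) \to \ell^2_a(\mathbb{N}, \mathbb{C})$, $j_{1/a} \circ \imath_a(u) = \frac{\sqrt{a}}{\sqrt{a}}\,u = u$ $\forall u \in \ell^2_a(\mathbb{N}, \mathbb{C})$; vice versa, for all $v \in \ell^2(\mathbb{N}, \mathbb{C})$, $\imath_a \circ j_{1/a} : \ell^2(\mathbb{N}, \mathbb{C}) \to \ell^2(\mathbb{N}, \mathbb{C})$, $\imath_a \circ j_{1/a}(v) = \frac{\sqrt{a}}{\sqrt{a}}\,v = v$, that is, $j_{1/a} \circ \imath_a = \mathrm{id}_{\ell^2_a(\mathbb{N},\mathbb{C})}$ and $\imath_a \circ j_{1/a} = \mathrm{id}_{\ell^2(\mathbb{N},\mathbb{C})}$. Thus,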


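$\imath_a$ is bijective with inverse $j_{1/a}$. The inverse is also clearly continuous and possesses a unit norm, since:

$$\|j_{1/a}(v)\|^2_{\ell^2_a} = \sum_{n\in\mathbb{N}} a_n \frac{|v_n|^2}{a_n} = \|v\|^2_{\ell^2} \iff \|j_{1/a}(v)\|_{\ell^2_a} = \|v\|_{\ell^2} \qquad \forall v \in \ell^2(\mathbb{N}, \mathbb{C}) \qquad [6.15]$$

2) By the continuity of $\imath_a$ and by Theorem 6.9, $\imath_a$ transforms the Cauchy sequences in $\ell^2_a(\mathbb{N}, \mathbb{C})$ into Cauchy sequences in $\ell^2(\mathbb{N}, \mathbb{C})$. Now, let $(u_m)_{m\in\mathbb{N}}$ be an arbitrary Cauchy sequence of elements in $\ell^2_a(\mathbb{N}, \mathbb{C})$; $(\imath_a(u_m))_{m\in\mathbb{N}}$ is a Cauchy sequence in $\ell^2(\mathbb{N}, \mathbb{C})$, which we know to be complete, thus $\exists L \in \ell^2(\mathbb{N}, \mathbb{C})$ such that $\imath_a(u_m) \to L$ as $m \to +\infty$, that is:

$$\begin{aligned}
0 = \lim_{m \to +\infty} \|\imath_a(u_m) - L\|_{\ell^2} &\underset{[6.15]}{=} \lim_{m \to +\infty} \|j_{1/a}(\imath_a(u_m) - L)\|_{\ell^2_a} \\
&\underset{j_{1/a} \text{ linear}}{=} \lim_{m \to +\infty} \|j_{1/a} \circ \imath_a(u_m) - j_{1/a}(L)\|_{\ell^2_a} \\
&= \lim_{m \to +\infty} \|u_m - j_{1/a}(L)\|_{\ell^2_a}
\end{aligned}$$

that is, $(u_m)_{m\in\mathbb{N}}$ converges in $\ell^2_a(\mathbb{N}, \mathbb{C})$ to $j_{1/a}(L)$, hence $\ell^2_a(\mathbb{N}, \mathbb{C})$ is a Hilbert space.

3) We must show that if $a_n = O(b_n)$ as $n \to +\infty$, then $u \in \ell^2_b(\mathbb{N}, \mathbb{C}) \implies u \in \ell^2_a(\mathbb{N}, \mathbb{C})$, that is:

$$\sum_{n\in\mathbb{N}} b_n |u_n|^2 < +\infty \implies \sum_{n\in\mathbb{N}} a_n |u_n|^2 < +\infty$$

for all $u \in \ell^2_b(\mathbb{N}, \mathbb{C})$. By definition, $a_n = O(b_n)$ as $n \to +\infty$ if and only if there exist $C_1 > 0$ and $N \in \mathbb{N}$ such that, for all $n \ge N$, it holds that $a_n \le C_1 b_n$. For the purposes of this demonstration, we multiply both sides of the previous inequality by $|u_n|^2$, giving us $a_n |u_n|^2 \le C_1 b_n |u_n|^2$ for all $n \ge N$, that is, $\sum_{n=N}^{+\infty} a_n |u_n|^2 \le C_1 \sum_{n=N}^{+\infty} b_n |u_n|^2$. The summation of the first $N$ terms, from $n = 0$ to $n = N - 1$, is finite, so there is a constant $C_2 > 0$ (for instance $C_2 = \max_{0 \le n \le N-1} a_n / b_n$, which is well defined since $b_n > 0$) such that $\sum_{n=0}^{N-1} a_n |u_n|^2 \le C_2 \sum_{n=0}^{N-1} b_n |u_n|^2$; we therefore take $C := \max(C_1, C_2) > 0$, giving us $\sum_{n\in\mathbb{N}} a_n |u_n|^2 \le C \sum_{n\in\mathbb{N}} b_n |u_n|^2$. This tells us that if $u \in \ell^2_b(\mathbb{N}, \mathbb{C})$, then $u \in \ell^2_a(\mathbb{N}, \mathbb{C})$. Furthermore, the previous inequality can be rewritten as $\|u\|^2_{\ell^2_a} \le C\|u\|^2_{\ell^2_b}$, thus the canonical injection $\iota : \ell^2_b(\mathbb{N}, \mathbb{C}) \hookrightarrow \ell^2_a(\mathbb{N}, \mathbb{C})$ verifies $\|\iota(u)\|_{\ell^2_a} \le \sqrt{C}\|u\|_{\ell^2_b}$ for all $u \in \ell^2_b(\mathbb{N}, \mathbb{C})$, meaning that it is bounded and thus continuous. $\square$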
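The identities used in this solution can be tried out on finite truncations. The following is a minimal sketch, assuming hypothetical weights $a_n = n + 1$ and $b_n = (n+1)^2$, a hypothetical sequence `u`, and sequences cut off after 50 terms; on such a truncation the constant $C$ can simply be taken as the largest ratio $a_n / b_n$.

```python
import math

def weighted_norm(u, w):
    """Norm of u in the weighted space with weights w: sqrt(sum w_n |u_n|^2)."""
    return math.sqrt(sum(wn * abs(un) ** 2 for wn, un in zip(w, u)))

def iota(u, a):
    """iota_a(u) = (sqrt(a_n) u_n)_n, mapping l2_a into l2."""
    return [math.sqrt(an) * un for an, un in zip(a, u)]

def j(v, a):
    """j_{1/a}(v) = (v_n / sqrt(a_n))_n, the inverse of iota_a."""
    return [vn / math.sqrt(an) for an, vn in zip(a, v)]

size = 50
a = [float(n + 1) for n in range(size)]           # hypothetical weights a_n
b = [float((n + 1) ** 2) for n in range(size)]    # hypothetical weights b_n, a_n <= b_n
u = [complex(1.0, -1.0) / (n + 1) ** 2 for n in range(size)]
ones = [1.0] * size                               # unweighted l2 norm

norm_u_a = weighted_norm(u, a)
norm_iota_u = weighted_norm(iota(u, a), ones)     # should equal norm_u_a (isometry)
round_trip = j(iota(u, a), a)                     # should give back u

C = max(an / bn for an, bn in zip(a, b))          # constant with a_n <= C b_n
norm_u_b = weighted_norm(u, b)

print(norm_u_a, norm_iota_u, C, norm_u_b)
```

The first two printed values agree, reflecting the fact that $\imath_a$ preserves norms, and the norm in $\ell^2_a$ is bounded by $\sqrt{C}$ times the norm in $\ell^2_b$, as in question 3.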


We shall conclude this section by presenting an extremely useful result which can be used to characterize the equality between continuous operators on an inner product space of arbitrary dimension, via the equality of their action on vectors within an inner product.

THEOREM 6.10.– Let $A, B : V \to V$ be two linear operators defined on an inner product space $V$ of arbitrary dimension. Then:

$$A = B \iff \langle x, Ay \rangle = \langle x, By \rangle \qquad \forall x, y \in V$$

PROOF.– By linearity of the inner product, it holds that $\langle x, Ay \rangle = \langle x, By \rangle$ $\forall x, y \in V$ $\iff$ $\langle x, (A - B)y \rangle = 0$ $\forall x, y \in V$. Let us take an arbitrary but fixed element $y \in V$ and write $u = (A - B)y \in V$; then $\langle x, u \rangle = 0$ $\forall x \in V$ holds true if and only if $u = 0$ (it suffices to take $x = u$), that is $(A - B)y = 0$ $\forall y \in V$, that is $A - B = 0$, implying $A = B$. $\square$

6.2.1. A classical example of a non-bounded linear operator on a vector space of infinite dimension

Although we have chosen to focus only on bounded linear operators on Hilbert spaces, it is important to show at least one example of a non-bounded linear operator. Actually, we are going to prove that one of the simplest operations (differentiation) on the simplest Hilbert basis (the Fourier basis) does not produce a bounded operator.

Let $u_n(x) = \frac{1}{\sqrt{2\pi}} e^{inx}$, $n \in \mathbb{Z}$, be the Fourier basis of $L^2[0, 2\pi]$. Let us consider the first derivative operator on the infinite-dimensional vector space generated by the Fourier basis:

$$D : \mathrm{span}((u_n)_{n\in\mathbb{Z}}) \longrightarrow L^2[0, 2\pi], \qquad u_n \longmapsto Du_n = \frac{d}{dx} u_n$$

where $\frac{d}{dx} u_n(x) = \frac{in}{\sqrt{2\pi}} e^{inx}$, which is square integrable on $[0, 2\pi]$. Of course, the previous definition of $D$ is extended by linearity on the whole span.

We can show that the norm of $D$ is not finite. To calculate it, we may use equation [6.5] in Definition 6.2 of an operator norm; taking $v$ in the domain of $D$, we have:

$$\|D\| = \sup_{\|v\| = 1} \|Dv\| \ge \sup_{\|u_n\| = 1} \|Du_n\|$$

where the inequality is motivated by the fact that the sup on the right-hand side is computed over a subset of the domain of $D$.
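As a numerical aside, the quantities $\|Du_n\|$ appearing in this lower bound can be explored directly. The following Python sketch is only an illustration under stated assumptions: the helper names `l2_norm`, `u` and `du` are hypothetical, the derivative of $u_n$ is entered by hand, and the integral over $[0, 2\pi]$ is approximated by a midpoint rule.

```python
import cmath
import math

def l2_norm(f, a=0.0, b=2 * math.pi, steps=2000):
    """Approximate the L^2[a, b] norm of a complex-valued f (midpoint rule)."""
    h = (b - a) / steps
    total = sum(abs(f(a + (k + 0.5) * h)) ** 2 for k in range(steps))
    return math.sqrt(total * h)

def u(n):
    """Fourier basis function u_n(x) = exp(i n x) / sqrt(2 pi)."""
    return lambda x: cmath.exp(1j * n * x) / math.sqrt(2 * math.pi)

def du(n):
    """Derivative of u_n, entered by hand: (i n) exp(i n x) / sqrt(2 pi)."""
    return lambda x: 1j * n * cmath.exp(1j * n * x) / math.sqrt(2 * math.pi)

# For each n, store the pair (||u_n||, ||D u_n||).
norms = {n: (l2_norm(u(n)), l2_norm(du(n))) for n in (1, 5, 25, 125)}
for n, (norm_u, norm_du) in norms.items():
    print(n, round(norm_u, 6), round(norm_du, 6))
```

Each $u_n$ keeps unit norm while $\|Du_n\|$ grows with $|n|$, which is the behaviour the sup over $n$ detects.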


However, the condition $\|u_n\| = 1$ does not impose any constraint, as any element $u_n$ in the Fourier Hilbert basis of $L^2[0, 2\pi]$ has unit norm; the lower bound is thus simply the sup of the values $\|Du_n\|$ with respect to the integer index $n$, that is:

$$\|D\| \ge \sup_{n \in \mathbb{Z}} \|Du_n\| = \sup_{n \in \mathbb{Z}} \left( \int_0^{2\pi} \left| \frac{in}{\sqrt{2\pi}} e^{inx} \right|^2 dx \right)^{1/2} = \sup_{n \in \mathbb{Z}} \left( |in|^2 \int_0^{2\pi} \left| \frac{1}{\sqrt{2\pi}} e^{inx} \right|^2 dx \right)^{1/2}$$

that is:

$$\|D\| \ge \sup_{n \in \mathbb{Z}} |in| = \sup_{n \in \mathbb{Z}} |n| = +\infty$$

which implies that the differentiation operator defined above is not bounded, and is therefore not continuous.

6.3. Invertibility of linear operators

Exercise 6.3 highlighted the importance of analyzing the inverse of a linear operator. This subject will be examined in greater detail in this section.

DEFINITION 6.6.– Let $V, W$ be two normed vector spaces on the same field $\mathbb{K}$ and let $A : V \to \operatorname{Im}(A) \subseteq W$ be a linear operator. The inverse operator of $A$ is the operator $A^{-1}$ such that, for all $x \in V$:

$$A^{-1} : \operatorname{Im}(A) \subseteq W \longrightarrow V, \qquad Ax \longmapsto A^{-1}(Ax) = x$$

If there exists $A^{-1}$, then $A$ is invertible.

For all $x \in V$, it holds that $A^{-1}(Ax) = x$ and $A(A^{-1}(Ax)) = A(x)$, thus the invertibility of $A$ can be defined in an equivalent manner with the conditions:

$$A^{-1} \circ A = \mathrm{id}_V \quad \text{and} \quad A \circ A^{-1} = \mathrm{id}_{\operatorname{Im}(A)}$$

In the specific case where $W = V$ and $\operatorname{Im}(A) = V$, the invertibility of $A$ is equivalent to the existence of an operator $A^{-1} : V \to V$ such that:

$$A \circ A^{-1} = A^{-1} \circ A = \mathrm{id}_V$$

If $A : V \to V$, the symbol $GL(V)$ is used to designate the set of continuous bijective linear operators with a continuous inverse, known as the set of regular elements in $B(V)$.

Theorem 6.11 summarizes the elementary properties of the inverse (the proofs of these properties are identical to those performed in finite dimension).


THEOREM 6.11.– Let $V, W$ be two normed vector spaces and let $A : V \to \operatorname{Im}(A) \subseteq W$ be linear:

1) If $A^{-1}$ exists, then it is unique;

2) If $A^{-1}$ exists, then it is a linear operator;

3) $A^{-1}$ exists if and only if $\ker(A) = \{0_V\}$, that is, a necessary and sufficient condition for $A$ to be invertible on its image is that its kernel is reduced to the zero vector of $V$.

PROOF.–

1) Let $B_1, B_2 : \operatorname{Im}(A) \subseteq W \to V$ be two inverse operators of $A$, then: $B_1 = B_1 \circ \mathrm{id}_{\operatorname{Im}(A)} = B_1 \circ (A \circ B_2) = (B_1 \circ A) \circ B_2 = \mathrm{id}_V \circ B_2 = B_2$.

2) For all $w_1, w_2 \in \operatorname{Im}(A)$ and $k \in \mathbb{K}$, we have:

$$A^{-1}(w_1 + k w_2) = A^{-1}\big(AA^{-1}(w_1) + kAA^{-1}(w_2)\big) \underset{\text{(linearity of } A)}{=} A^{-1}A\big(A^{-1}(w_1) + kA^{-1}(w_2)\big) = A^{-1}(w_1) + kA^{-1}(w_2)$$

3) We know that the inverse of $A$ can be defined on its image if and only if $A$ is injective. Let us verify that this is equivalent to $\ker(A) = \{0_V\}$. On one side, if $Ax = 0_W$, then $x = A^{-1}Ax = A^{-1}0_W = 0_V$ by linearity of $A^{-1}$, so if there exists $A^{-1}$, the kernel of $A$ is reduced to the zero vector of $V$. On the other side, taking $\ker(A) = \{0_V\}$ and $x_1, x_2 \in V$ such that $Ax_1 = Ax_2$, then $Ax_1 - Ax_2 = 0_W$, that is, by linearity of $A$, $A(x_1 - x_2) = 0_W$; but if $\ker(A) = \{0_V\}$ then $x_1 - x_2 = 0_V$, that is, $x_1 = x_2$, proving the injectivity of $A$. $\square$

The condition $\ker(A) = \{0_V\}$ is necessary and sufficient for the invertibility of a linear operator on its image space $\operatorname{Im}(A)$ in finite and infinite dimensions. In finite dimensions, the inverse of a linear operator, if it exists, is always bounded. In infinite dimensions, on the other hand, the condition $\ker(A) = \{0_V\}$ does not imply any relationship between the continuity of $A$ and that of $A^{-1}$: $A$ may be bounded and have a non-bounded inverse or, conversely, $A$ may be non-bounded and have a bounded inverse. One classic example of this situation is given by the differentiation and integration operators.

An easier example is provided by the linear operator $A : \ell^2(\mathbb{N}, \mathbb{K}) \to \ell^2(\mathbb{N}, \mathbb{K})$ defined by $A(x_1, x_2, x_3, \dots, x_n, \dots) = (x_1, x_2/2, x_3/3, \dots, x_n/n, \dots)$, that is $A((x_n)_{n \in \mathbb{N}^*}) = (x_n/n)_{n \in \mathbb{N}^*}$. $A$ is bounded and $\|A\| \le 1$. For all $x = (x_n)_{n \in \mathbb{N}^*} \in \ell^2(\mathbb{N}, \mathbb{K})$:

$$\|Ax\|_2^2 = \sum_{n \in \mathbb{N}^*} \frac{|x_n|^2}{n^2} \le \sum_{n \in \mathbb{N}^*} |x_n|^2 = \|x\|_2^2$$


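The operator $A^{-1} : \operatorname{Im}(A) \subseteq \ell^2(\mathbb{N}, \mathbb{K}) \to \ell^2(\mathbb{N}, \mathbb{K})$, $A^{-1}((y_n)_{n \in \mathbb{N}^*}) = (n y_n)_{n \in \mathbb{N}^*}$, is evidently the inverse of $A$. Nevertheless, $A^{-1}$ is not bounded: we can verify this by considering the general element of the canonical basis of $\ell^2(\mathbb{N}, \mathbb{K})$, that is $e_n = (0, 0, \dots, 1, 0, 0, \dots)$, where 1 is in position $n$; each $e_n$ belongs to $\operatorname{Im}(A)$, since $e_n = A(n e_n)$. We see that, on one side, $\|e_n\|_2 = 1 \ \forall n \in \mathbb{N}^*$ and, on the other side, $\|A^{-1} e_n\|_2 = n$, hence $\|A^{-1}\| \ge \sup_{n \in \mathbb{N}^*} \|A^{-1} e_n\|_2 = +\infty$.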
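This example can be reproduced on finitely supported sequences. The sketch below is illustrative only: the function names (`apply_A`, `apply_A_inv`, `norm2`, `e`) and the sample vector are assumptions introduced here, and truncating to finite lists is a simplification of the $\ell^2$ setting.

```python
import math

def apply_A(x):
    """A x = (x_1 / 1, x_2 / 2, ...), acting on a finitely supported sequence."""
    return [xn / n for n, xn in enumerate(x, start=1)]

def apply_A_inv(y):
    """A^{-1} y = (1 * y_1, 2 * y_2, ...)."""
    return [n * yn for n, yn in enumerate(y, start=1)]

def norm2(x):
    """Euclidean (l^2) norm of a finite sequence."""
    return math.sqrt(sum(abs(t) ** 2 for t in x))

def e(n, length):
    """Canonical basis vector e_n, truncated to the given length."""
    return [1.0 if k == n else 0.0 for k in range(1, length + 1)]

x = [3.0, -1.0, 4.0, 1.0, -5.0]
ratio = norm2(apply_A(x)) / norm2(x)          # never exceeds 1, since ||A|| <= 1
roundtrip = apply_A_inv(apply_A(x))           # recovers x up to rounding
growth = [norm2(apply_A_inv(e(n, 50))) for n in (1, 10, 50)]
print(ratio, growth)
```

The list `growth` contains the values $\|A^{-1} e_n\|_2 = n$ for the chosen indices, so no single constant can bound $\|A^{-1} y\|_2 / \|y\|_2$.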

A very useful characterization exists for the bounded invertibility of linear operators. It is important to note that this characterization holds independently of the continuity of the operator, making it particularly helpful in practical applications.

THEOREM 6.12 (Bounded invertibility of a linear operator).– If $V$ and $W$ are two normed vector spaces and $A : V \to W$ is a linear operator (not necessarily bounded), then $\exists A^{-1} \in B(\operatorname{Im}(A), V)$ if and only if $\exists \mu > 0$ such that $\|Ax\| \ge \mu \|x\| \ \forall x \in V$.

PROOF.– $\Longrightarrow$: suppose that $\exists A^{-1} \in B(\operatorname{Im}(A), V)$; then, by definition, $\exists m > 0$ such that $\forall y \in \operatorname{Im}(A)$: $\|A^{-1} y\| \le m \|y\|$. Since $A$ is invertible and $y \in \operatorname{Im}(A)$, $\exists x \in V$ such that we can write $y = Ax$; then $\|x\| = \|A^{-1}Ax\| \le m \|Ax\|$, that is $\|Ax\| \ge \frac{1}{m} \|x\|$ with $\mu := \frac{1}{m} > 0$ and, since $y$ is an arbitrary element in $\operatorname{Im}(A)$, the inequality holds for all $x \in V$.

$\Longleftarrow$: suppose that $\|Ax\| \ge \mu \|x\| \ \forall x \in V$ with $\mu > 0$; then, in particular, if we consider $x \in \ker(A)$:

$$\|Ax\| = \|0\| = 0 \ge \mu \|x\| \underset{(\mu > 0)}{\iff} \|x\| = 0 \iff x = 0_V \implies \ker(A) = \{0_V\}$$

that is $\exists A^{-1} : \operatorname{Im}(A) \to V$. We must therefore prove that $A^{-1}$ is bounded. For all $y \in \operatorname{Im}(A)$, writing $x = A^{-1} y$, we have: $\|Ax\| \ge \mu \|x\| \iff \|AA^{-1} y\| \ge \mu \|A^{-1} y\| \iff \|y\| \ge \mu \|A^{-1} y\| \iff \|A^{-1} y\| \le \frac{1}{\mu} \|y\|$, $\forall y \in \operatorname{Im}(A)$, that is $A^{-1}$ is bounded. $\square$

The condition of the theorem is interpreted as follows. First, the fact that $\|Ax\| \ge \mu \|x\|$ guarantees that the kernel of $A$ consists solely of the zero vector. Furthermore, the inequality $\|Ax\| \ge \mu \|x\|$ is inverted with respect to the inequality which defines a bounded operator, so that it is well suited to guarantee that the inverse operator of $A$ is bounded.

One immediate consequence of the theorem shown above is that a linear operator $A : V \to W$ is bounded and has a bounded inverse if and only if it satisfies the following condition:

$$\exists a, b > 0, \ a \le b : \quad a \|x\| \le \|Ax\| \le b \|x\| \quad \forall x \in V$$


that is, the norm of every vector of $V$ transformed by the action of $A$ is bounded below and above by the norm of the vector itself multiplied by two positive constants. This consideration has an important consequence for the images of bounded linear operators defined on Banach spaces, as stated in the next theorem.

THEOREM 6.13.– Let $V$ be a Banach space and $W$ an arbitrary normed vector space. Take $A \in B(V, W)$. If $A$ is invertible with a bounded inverse, then $\operatorname{Im}(A)$ is a closed vector subspace of $W$.

PROOF.– From Theorem 6.12, we know that the condition $\exists A^{-1} \in B(\operatorname{Im}(A), V)$ is equivalent to:

$$\exists a > 0 : \quad \|x\| \le a \|Ax\| \quad \forall x \in V$$

We must prove that this condition implies that $\operatorname{Im}(A)$ is closed, that is, if $(y_n)_{n \in \mathbb{N}} \subset \operatorname{Im}(A)$ is such that $y_n \to y$ as $n \to +\infty$, then $y \in \operatorname{Im}(A)$. Since $y_n \in \operatorname{Im}(A)$, there exists $(x_n)_{n \in \mathbb{N}} \subset V$ such that $y_n = Ax_n \ \forall n \in \mathbb{N}$, hence:

$$\|x_n - x_m\| \le a \|A(x_n - x_m)\| = a \|Ax_n - Ax_m\| = a \|y_n - y_m\| \xrightarrow[n, m \to +\infty]{} 0$$

because $(y_n)_{n \in \mathbb{N}}$ is a convergent, and thus Cauchy, sequence. The sequence $(x_n)_{n \in \mathbb{N}}$ must therefore also be Cauchy and, since $V$ is a Banach space, there exists $x \in V$ such that $x_n \to x$ as $n \to +\infty$. By the continuity of $A$, we obtain:

$$Ax = A \lim_{n \to +\infty} x_n = \lim_{n \to +\infty} Ax_n = \lim_{n \to +\infty} y_n = y$$

that is, $y \in \operatorname{Im}(A)$. $\square$
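The lower bound $\|Ax\| \ge \mu \|x\|$ of Theorem 6.12, on which the proof above relies, can be observed in finite dimension. The following Python sketch is an illustration under stated assumptions: the $2 \times 2$ matrix `A`, the helper names and the sampling of the unit circle at 3600 angles are all choices made here, so `mu`, `b` and `inv_norm` are only sampled approximations of the exact constants.

```python
import math

A = [[2.0, 1.0],
     [0.0, 0.5]]   # hypothetical invertible matrix (determinant 1)

def mat_vec(M, v):
    return [M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1]]

def inverse_2x2(M):
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det],
            [-M[1][0] / det, M[0][0] / det]]

def norm(v):
    return math.hypot(v[0], v[1])

# Sample unit vectors x; ||Ax|| then lies between mu and b on the samples.
angles = [2 * math.pi * k / 3600 for k in range(3600)]
values = [norm(mat_vec(A, [math.cos(t), math.sin(t)])) for t in angles]
mu, b = min(values), max(values)

# Sampled operator norm of the inverse, to compare with 1 / mu.
A_inv = inverse_2x2(A)
inv_norm = max(norm(mat_vec(A_inv, [math.cos(t), math.sin(t)])) for t in angles)
print(mu, b, inv_norm, 1 / mu)
```

On this example the sampled norm of $A^{-1}$ agrees with $1/\mu$ up to the sampling error, in line with the bound $\|A^{-1} y\| \le \frac{1}{\mu} \|y\|$ obtained in the proof of Theorem 6.12.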

There is a second condition which is sufficient to ensure the continuity of the inverse of a linear operator. The presentation of this condition relies on an intermediary result, which is itself one of the most important theorems in functional analysis (the proof of this theorem is beyond the scope of this book; we simply note that it is a consequence of Baire's category theorem).

THEOREM 6.14 (Open mapping theorem – Banach-Schauder).– Let $V$ and $W$ be two Banach spaces. If $A \in B(V, W)$ is surjective, then $A$ is an open mapping, that is, $A$ transforms open subsets of $V$ into open subsets of $W$.

THEOREM 6.15 (Continuous inverse operator theorem in Banach spaces).– Let $V$ and $W$ be two Banach spaces. If $A \in B(V, W)$ is bijective, that is, $\ker(A) = \{0_V\}$ and $A$ is surjective, then $A^{-1} \in B(W, V)$, that is, $A^{-1}$ is continuous.

PROOF.– Recall the topological characterization of continuity: a function between two topological spaces is continuous if and only if the preimage of any open


subset is open. By definition, the preimages under $A^{-1}$ are the images under $A$, hence $A^{-1}$ is continuous if and only if the image under $A$ of any open set is open; this property is guaranteed by the open mapping theorem. $\square$

The continuous inverse theorem can be used to characterize operators belonging to the set $GL(V)$ for any given Banach space $V$.

THEOREM 6.16 (Characterization of $GL(V)$).– Let $V$ be a Banach space and $GL(V)$ the set of regular elements of the Banach algebra $B(V)$ (linear bijections with continuous inverse). For an operator $A \in B(V)$, the following two conditions are equivalent:

1) $A \in GL(V)$;

2) there exists a linear operator $B$ defined on all of $V$ such that $BA = \mathrm{id}_V$ and $AB = \mathrm{id}_V$.

If one of the two conditions is satisfied, then $B$ is unique and $B = A^{-1}$.

PROOF.– 1) $\Longrightarrow$ 2): If $A \in GL(V)$, then we must simply consider $B = A^{-1}$ to prove the implication.

2) $\Longrightarrow$ 1): The hypothesis $BA = \mathrm{id}_V$ implies that $\ker(A) = \{0\}$, that is, $A$ is injective. Arguing by contradiction, if $x \ne 0$ and $Ax = 0$, then we would have $BAx = 0$, which contradicts the fact that $BAx = \mathrm{id}_V(x) = x \ne 0$. Furthermore, the hypothesis $AB = \mathrm{id}_V$ implies that $\operatorname{Im}(A) = V$: for all $x \in V$, it holds that $A(Bx) = AB(x) = \mathrm{id}_V(x) = x$, so any $x \in V$ can be seen as the image via $A$ of an element in $V$, namely $Bx$, meaning that $A$ is surjective. Thus, the existence of $B$ such that $BA = \mathrm{id}_V$ and $AB = \mathrm{id}_V$ implies that $A$ is a linear bijection and that $\forall x \in V$, $B(Ax) = x$, that is, $B = A^{-1}$. Hence $A$ is bounded by hypothesis, injective and surjective; by the continuous inverse theorem, $B = A^{-1}$ is continuous, and therefore $A \in GL(V)$.

The final step is to prove uniqueness. Let $B$ and $B'$ be two operators which verify 2); then $A^{-1} = A^{-1}(AB) = (A^{-1}A)B = B$ and, in the same way, $A^{-1} = A^{-1}(AB') = B'$, hence $A^{-1} = B = B'$. $\square$

Clearly, if $A \in GL(V)$, then we also have $A^{-1} \in GL(V)$, and if $A, B \in GL(V)$, then $AB \in GL(V)$ since $(AB)^{-1} = B^{-1}A^{-1}$, given that $ABB^{-1}A^{-1} = \mathrm{id}_V$ and $B^{-1}A^{-1}AB = \mathrm{id}_V$. $GL(V)$ is therefore stable with respect to the product and inversion, and its unit element is $\mathrm{id}_V$, that is, $GL(V)$ is a group.

DEFINITION 6.7.– The group $GL(V)$ is called the general linear group of $V$.
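In finite dimension the group property reduces to matrix algebra, and the identity $(AB)^{-1} = B^{-1}A^{-1}$ can be checked exactly with rational arithmetic. The sketch below assumes $V = \mathbb{Q}^2$ and two arbitrarily chosen invertible matrices; the helper names `mul` and `inv` are hypothetical.

```python
from fractions import Fraction as F

def mul(M, N):
    """Product of two 2x2 matrices."""
    return [[sum(M[i][k] * N[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inv(M):
    """Inverse of a 2x2 matrix, raising an error when it is not invertible."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[M[1][1] / det, -M[0][1] / det],
            [-M[1][0] / det, M[0][0] / det]]

I = [[F(1), F(0)], [F(0), F(1)]]
A = [[F(2), F(1)], [F(1), F(1)]]
B = [[F(1), F(3)], [F(0), F(2)]]

lhs = inv(mul(A, B))          # (AB)^{-1}
rhs = mul(inv(B), inv(A))     # B^{-1} A^{-1}
print(lhs == rhs, mul(mul(A, B), rhs) == I)
```

Note the reversed order of the factors on the right-hand side: `mul(inv(A), inv(B))` would in general give a different matrix.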


6.4. The dual of a Hilbert space and the Riesz representation theorem

Again, let us consider $B(V, W)$, where $V, W$ are two normed vector spaces. We know that $B(V, W)$ is a Banach space with respect to the operator norm if $W$ is a Banach space. Consider the specific case in which $W$ is the field $\mathbb{K}$ on which $V$ is defined as a vector space. As $\mathbb{K} = \mathbb{R}$ or $\mathbb{C}$ is complete, $B(V, \mathbb{K})$ is a Banach space, known as the dual of $V$ and denoted $V^*$ (the notation $V'$ is sometimes used in the literature to denote a dual space). The elements of $V^*$ are known as the bounded linear functionals on $V$.

We could ask ourselves how the "dualization" process of $V$ can be iterated. For Hilbert spaces, the answer to this question is quite surprising: the dualization of any Hilbert space $\mathcal{H}$ is an involution, that is, $\mathcal{H}^{**} \simeq \mathcal{H}$, where $\simeq$ is an isomorphism between Hilbert spaces. $\mathcal{H}^{**}$ is called the bidual of $\mathcal{H}$. This is not true, in general, for Banach spaces; those which are isomorphic to their bidual are known as reflexive Banach spaces. The Banach spaces $L^p(X, \mathcal{A}, \mu)$ are reflexive for $1 < p < \infty$, but $L^1(X, \mathcal{A}, \mu)$ and $L^\infty(X, \mathcal{A}, \mu)$ are not.

Each functional $\phi \in V^*$ transforms an element of $V$ into a scalar of $\mathbb{K}$. This transformation is represented using the following notation:

$$\phi : V \longrightarrow \mathbb{K}, \qquad x \longmapsto \phi(x) = \langle \phi, x \rangle$$

The notation $\langle \phi, x \rangle$ comes from the fact that if $V$ is a Hilbert space, then any continuous linear functional $\phi \in V^*$ acts as an inner product on the vectors of $V$. This statement forms the basis for a famous result first identified by Riesz, which will be stated and proved below.

THEOREM 6.17 (Riesz representation theorem).– Let $\mathcal{H}$ be a Hilbert space on $\mathbb{K} = \mathbb{R}$ or $\mathbb{C}$, and let $\mathcal{H}^*$ be the dual of $\mathcal{H}$. Then:

$$T : \mathcal{H} \longrightarrow \mathcal{H}^*, \qquad x \longmapsto T_x$$

where:

$$T_x : \mathcal{H} \longrightarrow \mathbb{K}, \qquad y \longmapsto T_x(y) = \langle y, x \rangle$$

is an isomorphism between $\mathcal{H}$ and $\mathcal{H}^*$ interpreted as Banach spaces, that is, $T$ is bijective, preserves the norms and:

– if $\mathbb{K} = \mathbb{R}$, then $T$ is linear;

– if $\mathbb{K} = \mathbb{C}$, then $T$ is antilinear.

The functional $T_x$ is called the Riesz representative of $x$ in $\mathcal{H}^*$.

Before presenting the proof, it is important to understand the reason for the antilinearity in the case $\mathbb{K} = \mathbb{C}$. We shall begin by analyzing the summation operation. $T$ sends $x_1 + x_2$ to $T_{x_1 + x_2}$, where:

$$T_{x_1 + x_2} : \mathcal{H} \longrightarrow \mathbb{C}, \qquad y \longmapsto T_{x_1 + x_2}(y) = \langle y, x_1 + x_2 \rangle = \langle y, x_1 \rangle + \langle y, x_2 \rangle = T_{x_1}(y) + T_{x_2}(y)$$

thus $T_{x_1 + x_2} = T_{x_1} + T_{x_2}$. Now, consider the multiplication by a scalar $k \in \mathbb{C}$. $T$ sends $kx$ to $T_{kx}$, where:

$$T_{kx} : \mathcal{H} \longrightarrow \mathbb{C}, \qquad y \longmapsto T_{kx}(y) = \langle y, kx \rangle = \bar{k} \langle y, x \rangle = \bar{k}\, T_x(y)$$

thus $T_{kx} = \bar{k}\, T_x$. Therefore:

$$T : x_1 + x_2 \longmapsto T_{x_1} + T_{x_2}, \qquad T : kx \longmapsto \bar{k}\, T_x$$

which explains why $T$ is antilinear if $\mathbb{K} = \mathbb{C}$. Evidently, if $\mathbb{K} = \mathbb{R}$, this distinction has no place and $T$ is linear.

The Riesz representation theorem owes its name to the fact that it allows all continuous linear functionals on a Hilbert space to be represented via inner products; notably, for any continuous linear functional $\phi$ on $\mathcal{H} = L^2(X, \mathcal{A}, \mu)$ there exists a unique element $f \in L^2(X, \mathcal{A}, \mu)$ such that $\phi = T_f$ with:

$$T_f : L^2(X, \mathcal{A}, \mu) \longrightarrow \mathbb{K}, \qquad g \longmapsto T_f(g) = \langle g, f \rangle = \int_X g \bar{f} \, d\mu$$

More generally, we know that all separable, infinite-dimensional Hilbert spaces are isomorphic to $\ell^2(\mathbb{N}, \mathbb{K})$, for which the inner product is defined by a series. These observations are the reason why continuous linear functionals are very often represented by finite sums, series or integrals in applications of functional analysis.
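The additivity and antilinearity of $T$, as well as the fact that $T_x$ attains the value $\|x\|$ on the unit vector $x / \|x\|$, can be checked on $\mathbb{C}^3$ with its standard inner product. The sketch below is an illustration only; the names `inner`, `T`, `scale`, `add` and the sample vectors are assumptions made here.

```python
import math

def inner(y, x):
    """Inner product on C^n: linear in the first entry, antilinear in the second."""
    return sum(yi * xi.conjugate() for yi, xi in zip(y, x))

def T(x):
    """Riesz map: send the vector x to the functional T_x(y) = <y, x>."""
    return lambda y: inner(y, x)

def scale(k, x):
    return [k * xi for xi in x]

def add(x1, x2):
    return [a + b for a, b in zip(x1, x2)]

x1 = [1 + 2j, -1j, 3 + 0j]
x2 = [0.5j, 2 + 0j, -1 + 1j]
y = [2 - 1j, 1 + 1j, -3j]
k = 2 - 3j

additive_gap = abs(T(add(x1, x2))(y) - (T(x1)(y) + T(x2)(y)))     # ~ 0
antilinear_gap = abs(T(scale(k, x1))(y) - k.conjugate() * T(x1)(y))  # ~ 0
linear_gap = abs(T(scale(k, x1))(y) - k * T(x1)(y))               # not 0

norm_x1 = math.sqrt(inner(x1, x1).real)
attained = abs(T(x1)(scale(1 / norm_x1, x1)))   # |T_x(x / ||x||)| = ||x||
print(additive_gap, antilinear_gap, linear_gap, norm_x1, attained)
```

The non-zero value of `linear_gap` shows that $T_{kx} \ne k\,T_x$ in general when $k$ is not real, which is exactly the antilinearity discussed above.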


One final aspect to note before moving on to the proof is that if we consider the inner product in the way it is used in physics, that is, as antilinear with respect to the first entry and linear with respect to the second entry, then the definition of $T_x$ becomes $T_x(y) = \langle x, y \rangle$.

PROOF.– Since the linear or antilinear character of $T$ has already been examined, we shall start by verifying that $T$ is well defined, that is, $T_x$ is a bounded linear functional on $\mathcal{H}$. Taking $\alpha, \beta \in \mathbb{K}$, $y, y_1, y_2 \in \mathcal{H}$:

– $T_x$ is linear$^2$:

$$T_x(\alpha y_1 + \beta y_2) = \langle \alpha y_1 + \beta y_2, x \rangle = \alpha \langle y_1, x \rangle + \beta \langle y_2, x \rangle = \alpha T_x(y_1) + \beta T_x(y_2)$$

– $T_x$ is bounded: we begin by observing that $\|T_x(y)\| = |T_x(y)|$ since $T_x(y) \in \mathbb{K}$. Thus:

$$\|T_x(y)\| = |T_x(y)| = |\langle y, x \rangle| \underset{\text{(Cauchy-Schwarz)}}{\le} \|x\| \|y\| \qquad [6.16]$$

The fact that $T_x$ is a bounded linear operator between the Hilbert spaces $\mathcal{H}$ and $\mathbb{K}$ allows us to calculate the operator norm of $T_x$. With respect to this norm, $T$ is an isometry, that is, $\|T_x\|_{B(\mathcal{H}, \mathbb{K})} = \|x\|_{\mathcal{H}} \ \forall x \in \mathcal{H}$. The case of the zero vector is straightforward: if $x = 0_{\mathcal{H}}$ then $T_{0_{\mathcal{H}}}$ is the zero functional since $T_{0_{\mathcal{H}}}(y) = \langle y, 0_{\mathcal{H}} \rangle = 0 \ \forall y \in \mathcal{H}$, thus $\|0_{\mathcal{H}}\| = 0 = \|T_{0_{\mathcal{H}}}\|$. Taking $x \in \mathcal{H}$, $x \ne 0_{\mathcal{H}}$, let us prove that $\|T_x\| \le \|x\|$ and that $\|x\| \le \|T_x\|$, in that order:

– $\|T_x\| \le \|x\|$: by [6.16] we can write $\|T_x(y)\| \le \|x\| \|y\| \ \forall y \in \mathcal{H}$, hence:

$$\|T_x\| = \sup_{\|y\| = 1} |T_x(y)| \le \sup_{\|y\| = 1} \|x\| \|y\| = \|x\|$$

– $\|x\| \le \|T_x\|$: in this case, we can write:

$$\|x\|^2 = \langle x, x \rangle \underset{\text{(def. of } T_x)}{=} T_x(x) \underset{(T_x(x) = \|x\|^2 \ge 0)}{=} |T_x(x)| = \|T_x(x)\| \underset{(T_x \text{ bounded})}{\le} \|T_x\| \|x\|$$

and since $\|x\| \ne 0$, the first and last members of the expression above can be divided by $\|x\|$, giving us $\|x\| \le \|T_x\|$.

In summary, $\|T_x\| = \|x\| \ \forall x \in \mathcal{H}$, hence $T$ is an isometry and consequently $T$ is injective.

2. If we had defined $T_x(y) = \langle x, y \rangle$, then we would have $T_x(\alpha y_1 + \beta y_2) = \bar{\alpha} \langle x, y_1 \rangle + \bar{\beta} \langle x, y_2 \rangle = \bar{\alpha} T_x(y_1) + \bar{\beta} T_x(y_2)$, that is, $T_x$ would be an antilinear functional. It is thus impossible to avoid antilinearity either in $T$ or $T_x$.


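The final step in the proof is to demonstrate that $T$ is surjective, that is, for all $\phi \in \mathcal{H}^*$ there exists $x \in \mathcal{H}$ such that $\phi = T_x$. The argument which Riesz used to demonstrate the surjectivity of $T$ is particularly elegant. First, if $\phi$ is the identically zero functional 0, then $\phi = T_{0_{\mathcal{H}}}$. Now, let $\phi$ be a non-identically zero functional, and consider its kernel:

– $0_{\mathcal{H}} \in \ker(\phi)$ by linearity of $\phi$, thus $\ker(\phi) \ne \emptyset$;

– since $\phi \ne 0$, there exists at least one vector in $\mathcal{H}$ that is not nullified by $\phi$, that is, $\ker(\phi) \ne \mathcal{H}$;

– as we saw in Theorem 6.8, $\ker(\phi)$ is always closed.

Thus, $\ker(\phi)$ is a closed proper subspace of $\mathcal{H}$; based on this observation, Theorem 5.4 can be used to guarantee that $\ker(\phi)^\perp \ne \{0_{\mathcal{H}}\}$, that is, there exists at least one $u \ne 0_{\mathcal{H}}$, $u \in \ker(\phi)^\perp$.

Now, we note that since $\ker(\phi) \cap \ker(\phi)^\perp = \{0_{\mathcal{H}}\}$ and since $u \ne 0_{\mathcal{H}}$, we have $u \notin \ker(\phi)$, so $\phi(u) \ne 0$ and, for all $y \in \mathcal{H}$, the vector $z = y - \frac{\phi(y)}{\phi(u)} u$ is well defined. Moreover, $z \in \ker(\phi)$ since, by linearity, $\phi(z) = \phi\left(y - \frac{\phi(y)}{\phi(u)} u\right) = \phi(y) - \frac{\phi(y)}{\phi(u)} \phi(u) = 0$; in short:

$$u \in \ker(\phi)^\perp, \qquad z = y - \frac{\phi(y)}{\phi(u)} u \in \ker(\phi)$$

hence:

$$0 = \langle z, u \rangle = \left\langle y - \frac{\phi(y)}{\phi(u)} u, u \right\rangle = \langle y, u \rangle - \left\langle \frac{\phi(y)}{\phi(u)} u, u \right\rangle = \langle y, u \rangle - \frac{\phi(y)}{\phi(u)} \|u\|^2$$

that is:

$$\phi(y) = \frac{\phi(u)}{\|u\|^2} \langle y, u \rangle = \left\langle y, \frac{\overline{\phi(u)}}{\|u\|^2} u \right\rangle \quad \forall y \in \mathcal{H}$$

Hence, for any vector $u \in \ker(\phi)^\perp$, $u \ne 0_{\mathcal{H}}$, the vector $x = \frac{\overline{\phi(u)}}{\|u\|^2} u$ is such that:

$$\phi(y) = \langle y, x \rangle = T_x(y), \quad \forall y \in \mathcal{H}$$

that is, $\phi = T_x$. This proves that $T$ is surjective and concludes the proof. $\square$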

The final step of the proof above actually demonstrates an even finer result: the orthogonal complement of the kernel of a non-zero bounded linear functional on a Hilbert space $\mathcal{H}$ is a straight line in $\mathcal{H}$.


COROLLARY 6.1.– Let $\mathcal{H}$ be a Hilbert space and take $\phi \in \mathcal{H}^*$, $\phi \not\equiv 0$. Then $\ker(\phi)^\perp$ is a one-dimensional vector subspace of $\mathcal{H}$, that is, $\dim(\ker(\phi)^\perp) = 1$. One generator of this space is the residual vector $x - P_{\ker \phi}\, x$, where $x \in \mathcal{H}$ is such that $\phi = T_x$ via the Riesz isomorphism.

PROOF.– In the final part of the proof of the Riesz representation theorem, we showed that if $\phi$ is not the identically null functional, then for any given $u \in \ker(\phi)^\perp$, $u \ne 0_{\mathcal{H}}$, $x = \frac{\overline{\phi(u)}}{\|u\|^2} u$ is the vector in $\mathcal{H}$ which is identified with $\phi$ via the formula $\phi = T_x$.

Arguing by contradiction, if $\ker(\phi)^\perp$ has a dimension greater than 1, then there exists at least one other generator, which we shall denote $u' \ne u$, $u' \ne 0_{\mathcal{H}}$, $u' \in \ker(\phi)^\perp$, where $u$ and $u'$ are linearly independent. Since $\ker(\phi)^\perp$ is a vector space, the Gram-Schmidt algorithm can be applied to orthonormalize the pair $(u, u')$ and obtain the pair $(\tilde{u}, \tilde{u}') \in \ker(\phi)^\perp \times \ker(\phi)^\perp$, with $\|\tilde{u}\| = \|\tilde{u}'\| = 1$ and $\tilde{u} \perp \tilde{u}'$. We define the vectors:

$$x = \frac{\overline{\phi(\tilde{u})}}{\|\tilde{u}\|^2} \tilde{u} = \overline{\phi(\tilde{u})}\, \tilde{u}, \qquad x' = \frac{\overline{\phi(\tilde{u}')}}{\|\tilde{u}'\|^2} \tilde{u}' = \overline{\phi(\tilde{u}')}\, \tilde{u}'$$

which are themselves orthogonal, so Pythagoras' theorem can be used to compute the squared norm of their difference:

$$\|x - x'\|^2 = \|x + (-x')\|^2 = \|x\|^2 + \|x'\|^2 = |\phi(\tilde{u})|^2 \|\tilde{u}\|^2 + |\phi(\tilde{u}')|^2 \|\tilde{u}'\|^2 > 0$$

since $\phi(\tilde{u})$, $\phi(\tilde{u}')$ and the norms of $\tilde{u}$ and $\tilde{u}'$ are $\ne 0$. Consequently, $x \ne x'$, so we would have two different vectors in $\mathcal{H}$, $x$ and $x'$, associated with the same functional $\phi \in \mathcal{H}^*$. This is incompatible with the injectivity of the Riesz map.

Furthermore, since $x = \frac{\overline{\phi(u)}}{\|u\|^2} u$ and $u \in \ker(\phi)^\perp$, $x \notin \ker \phi$ and so Theorem 5.4 tells us that the residual vector of the orthogonal projection of $x$ onto $\ker \phi$, that is, $x - P_{\ker \phi}\, x$, belongs to $\ker(\phi)^\perp$. $\square$

REMARK.– In light of this discussion, the inverse of the Riesz map can be expressed as:

$$T^{-1} : \mathcal{H}^* \longrightarrow \mathcal{H}, \qquad \phi \longmapsto T^{-1}(\phi) = x = \frac{\overline{\phi(u)}}{\|u\|^2} u$$

where $u \ne 0_{\mathcal{H}}$ is an arbitrary vector in $\ker(\phi)^\perp$. Since $\dim(\ker(\phi)^\perp) = 1$, in order to verify that this definition is well posed, we must simply verify that if $k \in \mathbb{K}$, $k \ne 0$, then the vector $x'$ associated with $\phi$ via $u' = ku$ (as an arbitrary element of the one-dimensional subspace $\ker(\phi)^\perp$) is the same:

$$x' = \frac{\overline{\phi(u')}}{\|u'\|^2} u' = \frac{\bar{k}\, \overline{\phi(u)}}{|k|^2 \|u\|^2} ku = \frac{\bar{k} k\, \overline{\phi(u)}}{|k|^2 \|u\|^2} u = \frac{\overline{\phi(u)}}{\|u\|^2} u = x$$

hence the definition of $T^{-1}$ does not depend on the choice of the vector $u \ne 0_{\mathcal{H}}$, $u \in \ker(\phi)^\perp$.
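The formula for $T^{-1}$ and its independence from the choice of $u$ can be tried out on $\mathbb{C}^3$. In the sketch below, the functional $\phi(y) = \sum_i c_i y_i$ and its coefficients `c` are hypothetical choices made for illustration; for this $\phi$, the conjugated coefficient vector spans $\ker(\phi)^\perp$, which is what the code uses as $u$.

```python
def inner(y, x):
    """Inner product on C^n: linear in the first entry, antilinear in the second."""
    return sum(yi * xi.conjugate() for yi, xi in zip(y, x))

c = [1 - 1j, 2 + 0j, 3j]   # hypothetical coefficients defining phi

def phi(y):
    """Bounded linear functional phi(y) = sum_i c_i * y_i."""
    return sum(ci * yi for ci, yi in zip(c, y))

def riesz_vector(u):
    """x = conj(phi(u)) / ||u||^2 * u, for a non-zero u in ker(phi)^perp."""
    factor = phi(u).conjugate() / inner(u, u).real
    return [factor * ui for ui in u]

u = [ci.conjugate() for ci in c]   # spans ker(phi)^perp in this example
k = 0.5 - 2j                       # arbitrary non-zero scalar
x_from_u = riesz_vector(u)
x_from_ku = riesz_vector([k * ui for ui in u])

y = [2 + 1j, -1 + 0j, 1 - 4j]
representation_gap = abs(phi(y) - inner(y, x_from_u))                  # ~ 0
choice_gap = max(abs(a - b) for a, b in zip(x_from_u, x_from_ku))      # ~ 0
print(representation_gap, choice_gap)
```

Dropping the complex conjugate in `riesz_vector` would break the identity $\phi(y) = \langle y, x \rangle$ as soon as $\phi(u)$ is not real, for instance when $u$ is replaced by $ku$ with a non-real $k$.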


6.4.1. The scalar product induced on the dual of a Hilbert space

In the context of the Riesz representation theorem, we saw that a Hilbert space $\mathcal{H}$ and its dual $\mathcal{H}^*$ can be identified as Banach spaces, since the isometry of the transformation $T$ draws only on the norm of $\mathcal{H}$ and $\mathcal{H}^*$. It is possible to go even further, and identify these as Hilbert spaces.

The first step is to introduce an inner product on $\mathcal{H}^*$. This can be done using the Riesz isomorphism $T : \mathcal{H} \to \mathcal{H}^*$: any bounded linear functional of $\mathcal{H}^*$ is the image of a vector in $\mathcal{H}$ and, as we know the inner product of $\mathcal{H}$, there is no risk of ambiguity if we define the inner product on $\mathcal{H}^*$ as:

$$\langle \phi, \psi \rangle_{\mathcal{H}^*} := \langle T^{-1}\phi, T^{-1}\psi \rangle_{\mathcal{H}}, \quad \forall \phi, \psi \in \mathcal{H}^*$$

The fact that $T$ preserves the norm guarantees that this definition of inner product will be compatible with the pre-existing Banach space structure on $\mathcal{H}^*$. If $\phi = T_x$, that is, $\phi$ is the functional which can be identified with the image of the vector $x \in \mathcal{H}$ via $T$, then:

$$\|\phi\|^2 = \langle T^{-1}(T_x), T^{-1}(T_x) \rangle = \langle x, x \rangle = \|x\|^2 = \|T_x\|^2$$

where the final equality is a consequence of the Riesz representation theorem.

The compatibility between the co-existing structures of inner product space and complete normed space implies that $\mathcal{H}^*$, equipped with the inner product induced by the Riesz isomorphism $T$, is itself a Hilbert space; thus, $T$ becomes an (antilinear) isomorphism between the Hilbert spaces $\mathcal{H}$ and $\mathcal{H}^*$.

The Riesz representation theorem is one of the most important results of functional analysis. In the following two sections, we discuss an extension of this result (called the Lax-Milgram theorem) and an extremely significant consequence of Riesz's theorem: each operator in $B(\mathcal{H})$ can be unambiguously associated with another operator, called its adjoint, which plays a fundamental role in the analysis of projection and unitary operators, among other things.

6.5. Bilinear forms, sesquilinear forms and associated quadratic forms

The concept of a quadratic form associated with a bilinear or sesquilinear form could have been introduced in Chapter 1. However, we have decided to discuss this subject here because the connection between bounded linear operators in Hilbert spaces and quadratic forms leads directly to the definition of the adjoint operator, which will be presented in section 6.6.

DEFINITION 6.8 (quadratic form).– Let $\varphi : V \times V \to \mathbb{R}$ (resp. $\varphi : V \times V \to \mathbb{C}$) be a bilinear (resp. sesquilinear) form on the real (resp. complex) vector space $V$. The


function $\Phi : V \to \mathbb{R}$, resp. $\Phi : V \to \mathbb{C}$, defined by restriction of $\varphi$ to the diagonal of $V \times V$, that is, $\Phi(x) := \varphi(x, x)$, is called the quadratic form associated with $\varphi$.

With the addition of positive-definiteness and symmetry (resp. conjugate symmetry) requirements, $\varphi$ becomes an inner product $\langle \, , \rangle$ and, in this case, $\Phi(x) = \langle x, x \rangle = \|x\|^2$ for all $x \in V$, that is, $\Phi$ is the square of the norm canonically associated with $\varphi$. This observation is the reason why $\Phi$ is known as the quadratic form.

Now, let us consider the concept of bounded forms.

DEFINITION 6.9.– If $(V, \| \, \|)$ is a normed vector space, then the form $\varphi : V \times V \to \mathbb{K}$, taken to be bilinear if $\mathbb{K} = \mathbb{R}$ and sesquilinear if $\mathbb{K} = \mathbb{C}$, is said to be bounded if there exists a constant $m > 0$ such that:

$$|\varphi(x, y)| \le m \|x\| \|y\|, \quad \forall x, y \in V$$

Where applicable, the norm of $\varphi$ is defined by the formula:

$$\|\varphi\| := \inf\{m > 0 : |\varphi(x, y)| \le m \|x\| \|y\|, \ \forall x, y \in V\}$$

As in the case of operators in $B(\mathcal{H})$, the norm of $\varphi$ can be rewritten in an equivalent, and highly useful, form:

$$\|\varphi\| = \sup_{\|x\| = \|y\| = 1} |\varphi(x, y)|$$

giving us:

$$|\varphi(x, y)| \le \|\varphi\| \|x\| \|y\|, \quad \forall x, y \in V$$

DEFINITION 6.10 (bounded quadratic forms and their norm).– If $(V, \| \, \|)$ is a normed vector space, then the quadratic form $\Phi$ is said to be bounded if there exists a constant $k > 0$ such that:

$$|\Phi(x)| \le k \|x\|^2, \quad \forall x \in V$$

The norm of a bounded quadratic form is defined by:

$$\|\Phi\| := \inf\{k > 0 : |\Phi(x)| \le k \|x\|^2, \ \forall x \in V\}$$


As we saw with the norm of $\varphi$, the norm of $\Phi$ can be rewritten as:

$$\|\Phi\| = \sup_{\|x\| = 1} |\Phi(x)|$$

giving us:

$$|\Phi(x)| \le \|\Phi\| \|x\|^2, \quad \forall x \in V \qquad [6.17]$$

As in the case of inner products and their norms, the polarization formula can be used to completely describe a symmetric bilinear (or a sesquilinear) form via its associated quadratic form.

THEOREM 6.18.– Let $\varphi$ be a symmetric bilinear (resp. sesquilinear) form on $V$ and let $\Phi$ be its associated quadratic form. Then, for all $x, y \in V$:

$$4\varphi(x, y) = \Phi(x + y) - \Phi(x - y)$$

respectively:

$$4\varphi(x, y) = \Phi(x + y) - \Phi(x - y) + i\Phi(x + iy) - i\Phi(x - iy)$$

The proof is identical to that presented in section 1.2.1, where we saw that the bilinearity (together with symmetry) or the sesquilinearity of the form $\varphi$ is the only aspect required to prove the polarization formula. In the real case symmetry cannot be dropped, since for a general bilinear form $\Phi(x + y) - \Phi(x - y) = 2\varphi(x, y) + 2\varphi(y, x)$.

The following result is an immediate corollary of the polarization formula, and gives a condition which is equivalent to that set out in Theorem 6.10 for bilinear or sesquilinear forms.

COROLLARY 6.2.– Let $\varphi_1$ and $\varphi_2$ be two symmetric bilinear forms or two sesquilinear forms on $V$. Then $\varphi_1 = \varphi_2 \iff \Phi_1 = \Phi_2$, that is:

$$\varphi_1(x, y) = \varphi_2(x, y) \ \forall x, y \in V \iff \varphi_1(x, x) = \varphi_2(x, x) \ \forall x \in V$$

that is, the equality of the quadratic forms is necessary and sufficient to characterize the equality of the forms with which they are associated.

Now, let us consider an important consequence of this corollary.

THEOREM 6.19.– A sesquilinear form $\varphi : V \times V \to \mathbb{C}$ is Hermitian if and only if its associated quadratic form $\Phi$ is real, that is, if $\Phi(x) \in \mathbb{R} \ \forall x \in V$.


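PROOF.– Let us prove the two implications.

$\Longrightarrow$: Let $\varphi$ be Hermitian, that is, $\varphi(x, y) = \overline{\varphi(y, x)} \ \forall x, y \in V$. Then:

$$\Phi(x) = \varphi(x, x) = \overline{\varphi(x, x)} = \overline{\Phi(x)}, \quad \forall x \in V$$

that is, $\Phi$ is real.

$\Longleftarrow$: Now, taking $\Phi(x) = \overline{\Phi(x)}$, let us define a sesquilinear form $\psi : V \times V \to \mathbb{C}$ as follows: $\psi(x, y) = \overline{\varphi(y, x)}$. If we can show that $\psi = \varphi$, this will prove that $\varphi$ is Hermitian. To do this, we examine the quadratic form $\Psi$ associated with $\psi$:

$$\Psi(x) = \overline{\varphi(x, x)} = \overline{\Phi(x)} = \Phi(x), \quad \forall x \in V$$

and, by Corollary 6.2, $\Psi = \Phi$ implies $\psi = \varphi$. $\square$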
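The complex polarization formula of Theorem 6.18 and the criterion of Theorem 6.19 can be checked on $\mathbb{C}^2$. The sketch below assumes forms of the type $\varphi(x, y) = \sum_{i,j} M_{ij}\, x_i\, \overline{y_j}$, which are linear in the first entry and antilinear in the second; the two matrices, the sample vectors and the helper names are hypothetical choices.

```python
def form(M):
    """Sesquilinear form phi(x, y) = sum_ij M[i][j] * x_i * conj(y_j)."""
    def phi(x, y):
        return sum(M[i][j] * x[i] * y[j].conjugate()
                   for i in range(len(x)) for j in range(len(y)))
    return phi

def quad(phi):
    """Quadratic form associated with phi."""
    return lambda x: phi(x, x)

def polarize(Phi, x, y):
    """Right-hand side of the complex polarization formula, divided by 4."""
    plus = [a + b for a, b in zip(x, y)]
    minus = [a - b for a, b in zip(x, y)]
    plus_i = [a + 1j * b for a, b in zip(x, y)]
    minus_i = [a - 1j * b for a, b in zip(x, y)]
    return (Phi(plus) - Phi(minus) + 1j * Phi(plus_i) - 1j * Phi(minus_i)) / 4

M_general = [[1 + 1j, 2 - 1j], [0.5j, -3 + 0j]]       # not Hermitian
M_hermitian = [[2 + 0j, 1 - 1j], [1 + 1j, -1 + 0j]]   # equals its conjugate transpose

x = [1 + 0j, 1j]
y = [3 - 1j, 2j]

phi_g, phi_h = form(M_general), form(M_hermitian)
polarization_gap = abs(phi_g(x, y) - polarize(quad(phi_g), x, y))   # ~ 0
imag_general = quad(phi_g)(x).imag       # non-zero: Phi is not real
imag_hermitian = quad(phi_h)(x).imag     # zero: Phi is real
print(polarization_gap, imag_general, imag_hermitian)
```

The polarization identity holds for the non-Hermitian form as well, whereas only the Hermitian matrix yields a real value of $\Phi(x)$ at the chosen vector.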

As a special case of the theorem just proven, if a sesquilinear form $\varphi$ is positive, and thus real, it must necessarily be Hermitian. This consideration provides additional justification for the definition of complex inner product given in Chapter 1 as a sesquilinear positive-definite Hermitian form.

Theorem 6.20 relates the boundedness of a bilinear or sesquilinear form $\varphi$ to that of its associated quadratic form.

THEOREM 6.20.– A bilinear or sesquilinear form $\varphi$ on a normed vector space $(V, \| \, \|)$ is bounded if and only if the associated quadratic form $\Phi$ is bounded. Furthermore:

– if $\varphi$ is real (and symmetric, so that the polarization formula applies), then $\|\varphi\| = \|\Phi\|$;

– if $\varphi$ is complex, then its norm is contained in the interval between the norm of $\Phi$ and its double: $\|\Phi\| \le \|\varphi\| \le 2\|\Phi\|$.

PROOF.– We shall prove the first inequality by considering a real bilinear or complex sesquilinear form.

$\|\Phi\| \le \|\varphi\|$, $\varphi$ real or complex: by definition we have:

$$\|\Phi\| = \sup_{\|x\| = 1} |\Phi(x)| = \sup_{\|x\| = 1} |\varphi(x, x)| \underset{(*)}{\le} \sup_{\|x\| = \|y\| = 1} |\varphi(x, y)| = \|\varphi\|$$

where $(*)$ is due to the fact that the upper bound is calculated on a larger set of values. If $\varphi$ is bounded, then $\Phi$ is also bounded, and the first inequality is valid.

$\|\varphi\| \le \|\Phi\|$, $\varphi$ real bilinear: now, taking $\Phi$ to be bounded, then, by the polarization formula, we have:

$$|\varphi(x, y)| = \tfrac{1}{4} |\Phi(x + y) - \Phi(x - y)| \underset{[6.17]}{\le} \tfrac{1}{4} \|\Phi\| (\|x + y\|^2 + \|x - y\|^2) = \tfrac{1}{2} \|\Phi\| (\|x\|^2 + \|y\|^2)$$

by applying the parallelogram formula $\|x + y\|^2 + \|x - y\|^2 = 2(\|x\|^2 + \|y\|^2)$. Hence:

$$\|\varphi\| = \sup_{\|x\| = \|y\| = 1} |\varphi(x, y)| \le \sup_{\|x\| = \|y\| = 1} \tfrac{1}{2} \|\Phi\| (\|x\|^2 + \|y\|^2) = \|\Phi\|$$

Hence, a bounded $\Phi$ implies a bounded $\varphi$ and it holds that $\|\varphi\| \le \|\Phi\|$.

$\|\varphi\| \le 2\|\Phi\|$, $\varphi$ complex sesquilinear: taking $\Phi$ to be bounded, using the polarization formula, we have:

$$|\varphi(x, y)| = \tfrac{1}{4} |\Phi(x + y) - \Phi(x - y) + i\Phi(x + iy) - i\Phi(x - iy)| \underset{[6.17]}{\le} \tfrac{1}{4} \|\Phi\| (\|x + y\|^2 + \|x - y\|^2 + \|x + iy\|^2 + \|x - iy\|^2)$$

In this case, the parallelogram formula gives us $\|x + iy\|^2 + \|x - iy\|^2 = 2(\|x\|^2 + \|iy\|^2) = 2(\|x\|^2 + |i|^2 \|y\|^2) = 2(\|x\|^2 + \|y\|^2)$, thus $\|x + y\|^2 + \|x - y\|^2 + \|x + iy\|^2 + \|x - iy\|^2 = 4(\|x\|^2 + \|y\|^2)$ and so:

$$|\varphi(x, y)| \le \|\Phi\| (\|x\|^2 + \|y\|^2)$$

which implies:

$$\|\varphi\| = \sup_{\|x\| = \|y\| = 1} |\varphi(x, y)| \le \sup_{\|x\| = \|y\| = 1} \|\Phi\| (\|x\|^2 + \|y\|^2) = 2\|\Phi\|$$

Thus, a bounded $\Phi$ implies that $\varphi$ is bounded, and it holds that $\|\varphi\| \le 2\|\Phi\|$. $\square$
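The factor 2 in the complex case cannot be improved in general. A simple illustration, chosen here and not taken from the text, is the sesquilinear form $\varphi(x, y) = x_1 \overline{y_2}$ on $\mathbb{C}^2$ with the Euclidean norm, for which $\|\Phi\| = 1/2$ and $\|\varphi\| = 1$. The Python sketch below estimates both suprema on a grid; restricting to real unit vectors is legitimate only because $|\varphi|$ depends on the moduli of the entries for this particular form.

```python
import math

def phi(x, y):
    """Sesquilinear form on C^2 (hypothetical example): phi(x, y) = x_1 * conj(y_2)."""
    return x[0] * y[1].conjugate()

# |phi| depends only on the moduli of the entries, so unit vectors of the
# form (cos s, sin s) with s in [0, pi/2] suffice to find both suprema here.
grid = [math.pi / 2 * k / 180 for k in range(181)]
unit = [(complex(math.cos(s)), complex(math.sin(s))) for s in grid]

norm_Phi = max(abs(phi(x, x)) for x in unit)                 # sup of |Phi(x)|
norm_phi = max(abs(phi(x, y)) for x in unit for y in unit)   # sup of |phi(x, y)|
print(norm_Phi, norm_phi)
```

Since this $\varphi$ is not Hermitian, the example is consistent with the equality of norms that holds for real symmetric forms.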

If a (complex) sesquilinear form $\varphi$ is also Hermitian, then we know that its associated quadratic form $\Phi$ is real. The theorem proved above guarantees the equality of the norms of $\varphi$ and $\Phi$ when $\varphi$ is a real bilinear form (and thus $\Phi$ is also real). These considerations naturally lead to the idea that a Hermitian (complex) sesquilinear form might have a norm which coincides with that of its (real) quadratic form. The following result confirms that this is the case.

THEOREM 6.21.– If a sesquilinear form $\varphi : V \times V \to \mathbb{C}$, where $(V, \| \, \|)$ is a normed vector space, is bounded and Hermitian, then $\|\varphi\| = \|\Phi\|$.

PROOF.– We have seen that the inequality $\|\Phi\| \le \|\varphi\|$ is always valid, so we must simply show that the opposite inequality is valid when $\varphi$ is Hermitian. Once again, consider the polarization formula:

$$\varphi(x, y) = \tfrac{1}{4} \big( \Phi(x + y) - \Phi(x - y) + i\Phi(x + iy) - i\Phi(x - iy) \big)$$

Since $\Phi$ is real, the real part of both sides is:

$$\Re(\varphi(x, y)) = \tfrac{1}{4} \big( \Phi(x + y) - \Phi(x - y) \big)$$

Using equation [6.17] and the parallelogram formula, we can write the following inequality:

$$|\Re(\varphi(x, y))| \le \tfrac{1}{4} \|\Phi\| (\|x + y\|^2 + \|x - y\|^2) = \tfrac{1}{2} \|\Phi\| (\|x\|^2 + \|y\|^2) \qquad [6.18]$$

If $\theta \in [0, 2\pi)$ is such that $\varphi(x, y) = |\varphi(x, y)| e^{i\theta}$, then, by linearity in the first entry of $\varphi$:

$$0 \le |\varphi(x, y)| = e^{-i\theta} \varphi(x, y) = \varphi(e^{-i\theta} x, y)$$

that is, $\varphi(e^{-i\theta} x, y)$ is a real non-negative quantity, and thus it coincides with its real part and also with its magnitude, hence $|\varphi(x, y)| = |\Re(\varphi(e^{-i\theta} x, y))|$. Using equation [6.18] and $\|e^{-i\theta} x\| = \|x\|$, we obtain:

$$\|\varphi\| = \sup_{\|x\| = \|y\| = 1} |\varphi(x, y)| = \sup_{\|x\| = \|y\| = 1} |\Re(\varphi(e^{-i\theta} x, y))| \le \sup_{\|x\| = \|y\| = 1} \tfrac{1}{2} \|\Phi\| (\|x\|^2 + \|y\|^2) = \|\Phi\| \qquad \square$$

Now, let us consider the important relationship between bounded bilinear or sesquilinear forms defined on a Hilbert space $\mathcal{H}$ and the operators of $B(\mathcal{H})$. The two results presented below are essential for defining the adjoint of a bounded operator.

THEOREM 6.22.– For all fixed $A \in B(\mathcal{H})$, the bilinear form (if $\mathcal{H}$ is real) or sesquilinear form (if $\mathcal{H}$ is complex) $\varphi_A$ on $\mathcal{H}$ defined by:

$$\varphi_A(x, y) = \langle Ax, y \rangle \quad \text{or} \quad \varphi_A(x, y) = \langle x, Ay \rangle$$

is bounded, and it holds that $\|\varphi_A\| = \|A\|$.

PROOF.– Consider the definition $\varphi_A(x, y) = \langle Ax, y \rangle$: the proof for the other definition is similar. We observe that:

$$|\varphi_A(x, y)| = |\langle Ax, y \rangle| \underset{\text{(Cauchy-Schwarz)}}{\le} \|Ax\| \|y\| \underset{[6.11]}{\le} \|A\| \|x\| \|y\|, \quad \forall x, y \in \mathcal{H}$$

hence $\varphi_A$ is bounded and:

$$\|\varphi_A\| = \sup_{\|x\| = \|y\| = 1} |\varphi_A(x, y)| \le \sup_{\|x\| = \|y\| = 1} \|A\| \|x\| \|y\| = \|A\|$$


thus $\|\varphi_A\| \le \|A\|$. Now, we shall prove the equality of the norms by demonstrating that $\|A\| \le \|\varphi_A\|$. First, we note that $\varphi_A(x, Ax) = \langle Ax, Ax \rangle = \|Ax\|^2 \ge 0$, so it holds that $\|Ax\|^2 = |\varphi_A(x, Ax)|$. Then, given that $\varphi_A$ is bounded:

$$\|Ax\|^2 = |\varphi_A(x, Ax)| \le \|\varphi_A\| \|x\| \|Ax\|$$

If $Ax \ne 0$, then both sides of the previous inequality can be divided by $\|Ax\|$, giving us $\|Ax\| \le \|\varphi_A\| \|x\|$. If $Ax = 0$, then the inequality $\|Ax\| \le \|\varphi_A\| \|x\|$ is written as $0 \le \|\varphi_A\| \|x\|$, which is trivially true. Thus, the inequality $\|Ax\| \le \|\varphi_A\| \|x\|$ holds with no constraints, and we can write:

$$\|A\| = \sup_{\|x\| = 1} \|Ax\| \le \sup_{\|x\| = 1} \|\varphi_A\| \|x\| = \|\varphi_A\|$$

that is, $\|A\| \le \|\varphi_A\|$. $\square$

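If we write $\mathrm{Bil}_b(\mathcal{H})$, resp. $\mathrm{Sesq}_b(\mathcal{H})$, to denote the vector space (with respect to the pointwise defined linear operations) of the bounded bilinear, or sesquilinear, forms on $\mathcal{H}$, then the mapping:

$$B(\mathcal{H}) \longrightarrow \mathrm{Bil}_b(\mathcal{H}), \ A \longmapsto \varphi_A \qquad \text{or:} \qquad B(\mathcal{H}) \longrightarrow \mathrm{Sesq}_b(\mathcal{H}), \ A \longmapsto \varphi_A$$

is an isometric inclusion. The mapping defined by $B(\mathcal{H}) \ni A \mapsto \varphi_A \in \mathrm{Bil}_b(\mathcal{H})$ is linear. The mapping given by $B(\mathcal{H}) \ni A \mapsto \varphi_A \in \mathrm{Sesq}_b(\mathcal{H})$ is also linear if we define $\varphi_A(x, y) = \langle Ax, y \rangle$, but it is antilinear if we define $\varphi_A(x, y) = \langle x, Ay \rangle$.

By isometry, we can add a further characterization of the norm of an operator $A \in B(\mathcal{H})$.

COROLLARY 6.3 (Fifth characterization of the norm of an operator in $B(\mathcal{H})$).– For all $A \in B(\mathcal{H})$ it holds that:

$$\|A\| = \sup_{\|x\| = \|y\| = 1} |\langle Ax, y \rangle| \qquad [6.19]$$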

The following result tells us that the application which associates a bounded operator with a bounded bilinear or sesquilinear form is not only an isometric inclusion, but is also surjective, that is any bounded bilinear or sesquilinear form on a Hilbert space H is deﬁned by one, and only one, operator in BpHq. In short, the correspondence bounded operator ðñ bounded bilinear or sesquilinear form is an isometric isomorphism.


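THEOREM 6.23.– Let $H$ be a Hilbert space on $\mathbb{K} = \mathbb{R}$ or $\mathbb{C}$. For any bounded bilinear form $\varphi : H \times H \to \mathbb{K}$ if $\mathbb{K} = \mathbb{R}$, or any bounded sesquilinear form if $\mathbb{K} = \mathbb{C}$, there exists a unique operator $B \in \mathcal{B}(H)$ such that $\varphi = \varphi_B$, that is:

$$\varphi(x, y) = \langle Bx, y \rangle \quad \text{or} \quad \varphi(x, y) = \langle x, By \rangle, \qquad \forall x, y \in H$$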

PROOF.– For the purposes of our proof, let us consider the definition $\varphi_B(x, y) = \langle x, By \rangle$; the proof for the other one is analogous.

Injectivity: Theorem 6.22 guarantees that, for all $B \in \mathcal{B}(H)$, $\varphi_B(x, y) = \langle x, By \rangle$ is a bounded bilinear or sesquilinear form. Now, take $B_1, B_2 \in \mathcal{B}(H)$ such that $\varphi = \varphi_{B_1} = \varphi_{B_2}$, that is, $\varphi(x, y) = \langle x, B_1 y \rangle = \langle x, B_2 y \rangle$ $\forall x, y \in H$; then, by Theorem 6.10, $B_1 = B_2$.

Surjectivity: Taking an arbitrary fixed bounded bilinear or sesquilinear form $\varphi : H \times H \to \mathbb{K}$ and an arbitrary element $y \in H$, the mapping:

$$\varphi_y : H \longrightarrow \mathbb{K}, \quad x \longmapsto \varphi_y(x) := \varphi(x, y)$$

is clearly a bounded linear functional on $H$, that is, $\varphi_y \in H^*$. By the Riesz representation theorem, there exists a unique element $\xi_y \in H$ such that $\varphi_y = T_{\xi_y} = T(\xi_y)$, where $T$ is the Riesz isomorphism and $T_{\xi_y} \in H^*$ is the Riesz representative of $\xi_y \in H$, which acts on any $x \in H$ by $T_{\xi_y}(x) = \langle x, \xi_y \rangle$. In short, $\forall x, y \in H$, we know that $\varphi(x, y) = \varphi_y(x) = \langle x, \xi_y \rangle$, and thus surjectivity will be proven if we can show that the mapping:

$$B : H \longrightarrow H, \quad y \longmapsto By := \xi_y$$

is a bounded linear operator on $H$, since in this case it holds that $\varphi(x, y) = \langle x, By \rangle$ $\forall x, y \in H$. Taking arbitrary $x, y_1, y_2 \in H$ and $\alpha_1, \alpha_2 \in \mathbb{K}$, we have:

$$\langle x, \xi_{\alpha_1 y_1 + \alpha_2 y_2} \rangle = \varphi(x, \alpha_1 y_1 + \alpha_2 y_2) = \overline{\alpha_1}\varphi(x, y_1) + \overline{\alpha_2}\varphi(x, y_2) = \overline{\alpha_1}\langle x, \xi_{y_1} \rangle + \overline{\alpha_2}\langle x, \xi_{y_2} \rangle = \langle x, \alpha_1 \xi_{y_1} \rangle + \langle x, \alpha_2 \xi_{y_2} \rangle = \langle x, \alpha_1 \xi_{y_1} + \alpha_2 \xi_{y_2} \rangle$$

(the conjugations are trivial when $\mathbb{K} = \mathbb{R}$), which shows the linearity of the correspondence $H \ni y \mapsto \xi_y = By \in H$. To show that $B$ is bounded, we observe that, since $\varphi$ is bounded, there exists $k > 0$ such that:

$$|\langle x, By \rangle| = |\varphi(x, y)| \le k\|x\|\|y\| \qquad \forall x, y \in H$$

Due to the arbitrary nature of $x$, we know that the inequality also holds when $x = By$, that is:

$$\|By\|^2 = |\langle By, By \rangle| \le k\|By\|\|y\| \qquad \forall y \in H$$

hence $\|By\| \le k\|y\|$ for all $y \in H$ such that $By \neq 0$, and when $By = 0$ the inequality $\|By\| \le k\|y\|$ is trivially true, so it holds that $\|By\| \le k\|y\|$ $\forall y \in H$, that is, $B$ is bounded. □

6.5.1. The Lax-Milgram theorem and its consequences

In 1954, Peter Lax and Arthur Milgram presented a simple and elegant proof of a remarkable consequence of Theorem 6.23, generalizing the Riesz representation theorem to bilinear or sesquilinear forms. One of the hypotheses required to obtain this result is defined below.

DEFINITION 6.11 (coercive or $V$-elliptical forms).– Let $(V, \|\ \|)$ be a normed vector space. A bilinear or sesquilinear form $\varphi : V \times V \to \mathbb{K}$, $\mathbb{K} = \mathbb{R}$ or $\mathbb{C}$, is said to be coercive or $V$-elliptical if there exists a constant $K > 0$ such that:

$$\Phi(x) \ge K\|x\|^2, \qquad \forall x \in V$$

where $\Phi(x) = \varphi(x, x)$ is the quadratic form associated with $\varphi$.

It is evident that an inner product on $V$ is a coercive form, as, in this case, $\Phi(x) = \langle x, x \rangle = \|x\|^2 \ge K\|x\|^2$ $\forall x \in V$ with $0 < K \le 1$. The following example is less trivial. If $z \in C([0, 1], \mathbb{R})$ is such that $\min_{t \in [0,1]} z(t) > 0$, then the bilinear form:

$$\varphi_z : L^2[0, 1] \times L^2[0, 1] \longrightarrow \mathbb{R}, \quad (x, y) \longmapsto \varphi_z(x, y) := \int_0^1 x(t)y(t)z(t)\,dt$$

is coercive since:

$$\Phi_z(x) = \int_0^1 |x(t)|^2 z(t)\,dt \ge \int_0^1 |x(t)|^2 \min_{t \in [0,1]} z(t)\,dt = \min_{t \in [0,1]} z(t) \int_0^1 |x(t)|^2\,dt = K\|x\|^2$$

where $K = \min_{t \in [0,1]} z(t)$.
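The coercivity estimate above can be checked numerically on a concrete case. The following is a minimal sketch, assuming the hypothetical weight $z(t) = 1 + t$ (so that $K = 1$) and the hypothetical test function $x(t) = \sin(2\pi t)$; neither choice comes from the text, and the integrals are only approximated by a midpoint rule.

```python
import math

def integrate(f, n=10_000):
    """Midpoint rule on [0, 1] with n subintervals."""
    h = 1.0 / n
    return h * sum(f((i + 0.5) * h) for i in range(n))

def z(t):
    # Hypothetical weight: continuous on [0, 1], with minimum z(0) = 1 > 0.
    return 1.0 + t

def x(t):
    # Hypothetical test function in L^2[0, 1].
    return math.sin(2.0 * math.pi * t)

K = 1.0  # minimum of z on [0, 1]
quadratic_form = integrate(lambda t: x(t) ** 2 * z(t))  # approximates Phi_z(x)
norm_squared = integrate(lambda t: x(t) ** 2)           # approximates ||x||^2

# Coercivity predicts Phi_z(x) >= K * ||x||^2 for this (and any other) x.
coercive_here = quadratic_form >= K * norm_squared
```

A single test function proves nothing in general, of course; the sketch only makes the two sides of the inequality tangible.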

THEOREM 6.24 (Lax-Milgram theorem).– Let $H$ be a Hilbert space on $\mathbb{K} = \mathbb{R}$ or $\mathbb{C}$ and let $\varphi : H \times H \to \mathbb{K}$ be a bounded and coercive bilinear form if $\mathbb{K} = \mathbb{R}$, or a bounded and coercive sesquilinear form if $\mathbb{K} = \mathbb{C}$. Then, for any bounded functional $\phi \in H^*$, there exists a unique element $u_\phi \in H$ such that:

$$\phi(x) = \varphi(x, u_\phi), \qquad \forall x \in H$$

PROOF.– We know from Theorem 6.23 that there exists an operator $A \in \mathcal{B}(H)$ such that:

$$\varphi(x, y) = \langle x, Ay \rangle, \qquad \forall x, y \in H \qquad [6.20]$$

On the other hand, the Riesz representation theorem guarantees that, for any bounded linear functional $\phi \in H^*$, there exists a unique element $T^{-1}(\phi) \in H$, where $T$ is the Riesz isomorphism, such that:

$$\phi(x) = \langle x, T^{-1}(\phi) \rangle, \qquad \forall x \in H \qquad [6.21]$$

The main idea behind the proof of this theorem is to compare equations [6.20] and [6.21]. If the operator $A : H \to H$ is an isomorphism, then there exists a unique element in $H$, written as $u_\phi \in H$ since it depends on $\phi$, which satisfies $Au_\phi = T^{-1}(\phi)$; then, using [6.21], the identity $T^{-1}(\phi) = Au_\phi$ and [6.20] in turn:

$$\phi(x) = \langle x, T^{-1}(\phi) \rangle = \langle x, Au_\phi \rangle = \varphi(x, u_\phi), \qquad \forall x \in H$$

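that is, the statement of the Lax-Milgram theorem. Now, let us show that $A$ is an isomorphism. Injectivity is a simple consequence of coercivity:

$$0 \le K\|x\|^2 \le \Phi(x) = \varphi(x, x) = \langle x, Ax \rangle = |\langle x, Ax \rangle| \le \|x\|\|Ax\|$$

where the equality $\langle x, Ax \rangle = |\langle x, Ax \rangle|$ holds because $\langle x, Ax \rangle \ge 0$ and the last inequality is the Cauchy-Schwarz inequality; hence $\|x\| \le \frac{1}{K}\|Ax\|$ for all $x \neq 0$, and for $x = 0$ the inequality is trivial, so it holds for all $x \in H$. This implies the injectivity of $A$: given arbitrary $x_1, x_2 \in H$, by linearity, the condition $Ax_1 = Ax_2$ implies that $A(x_1 - x_2) = 0$; then $\|x_1 - x_2\| \le \frac{1}{K}\|A(x_1 - x_2)\| = 0$, that is, $x_1 = x_2$.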

The surjectivity of $A$, that is, the fact that $\mathrm{Im}(A) = H$, is slightly harder to prove. The first argument used here relies on the inequality proven above. More precisely, let $(x_n)_{n \in \mathbb{N}} \subset H$ be an arbitrary sequence of elements in $H$, then $(Ax_n)_{n \in \mathbb{N}} \subset \mathrm{Im}(A)$ is an arbitrary sequence of elements in the image of $A$. Now, let us suppose that this sequence is convergent in $H$, that is, there exists $y \in H$ such that $\|Ax_n - y\| \to 0$ as $n \to +\infty$. Notably, as a convergent sequence, $(Ax_n)_{n \in \mathbb{N}}$ is Cauchy, that is, for all $\varepsilon > 0$ $\exists N_\varepsilon \in \mathbb{N}$ such that $n, m \ge N_\varepsilon$ implies $\|Ax_n - Ax_m\| < \varepsilon$. It therefore also holds that $\|x_n - x_m\| \le \frac{1}{K}\|Ax_n - Ax_m\| < \frac{\varepsilon}{K}$ for all $n, m \ge N_\varepsilon$, that is, if $(Ax_n)_{n \in \mathbb{N}}$ converges in $H$, then $(x_n)_{n \in \mathbb{N}}$ is a Cauchy sequence in $H$. Since $H$ is complete, $(x_n)_{n \in \mathbb{N}}$ itself converges in $H$, that is, there exists $x \in H$ such that $\lim_{n \to +\infty} \|x_n - x\| = 0$. $A$ is bounded and therefore continuous, so:

$$\lim_{n \to +\infty} \|Ax_n - Ax\| = 0$$

Furthermore, by the uniqueness of the limit in a metric space, we obtain $y = Ax \in \mathrm{Im}(A)$, that is, $\mathrm{Im}(A)$ is a closed vector subspace of $H$ as it contains the limits of all of its convergent sequences. The closure of $\mathrm{Im}(A)$ means that we can use Theorem 5.4. Arguing by contradiction, if $\mathrm{Im}(A)$ is a proper vector subspace of $H$, then there exists a non-zero vector $\xi \in H \setminus \mathrm{Im}(A)$ that is orthogonal to $\mathrm{Im}(A)$, that is, $\langle \xi, Ay \rangle = 0$ $\forall y \in H$. Taking $y = \xi$ and using coercivity, we obtain:

$$0 = \langle \xi, A\xi \rangle = \Phi(\xi) \ge K\|\xi\|^2 > 0$$

since $\xi \neq 0$ and $K > 0$, which is absurd. □

The Lax-Milgram theorem is widely used in solving partial differential equations (PDE) expressed in variational form. Roughly speaking, this approach involves rewriting a PDE as the problem of minimization of a functional expressed by an integral, and looking for the so-called weak solution of the PDE, which takes the form of a minimizer of the functional. In this type of approach, one almost immediate corollary of the Lax-Milgram theorem (often cited as an integral part of the theorem) proves extremely useful.

COROLLARY 6.4 (Lax-Milgram: symmetric case).– Take:
– $H$ a real Hilbert space;
– $\phi \in H^*$;
– $\varphi : H \times H \to \mathbb{R}$ a bounded, coercive and symmetric bilinear form;
– $\Phi : H \to \mathbb{R}$ the quadratic form associated with $\varphi$.

Then the vector $u_\phi \in H$ such that $\phi(x) = \varphi(x, u_\phi)$ $\forall x \in H$ is the only element in $H$ which minimizes the (quadratic, not linear) functional:

$$J_\phi : H \longrightarrow \mathbb{R}, \quad x \longmapsto J_\phi(x) := \tfrac{1}{2}\Phi(x) - \phi(x)$$

that is, $\exists!\, u_\phi \in H$ such that:

$$J_\phi(u_\phi) = \min_{x \in H} J_\phi(x) \iff u_\phi = \arg\min_{x \in H} J_\phi(x)$$


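PROOF.– We perform a shift in a neighborhood of $u_\phi$ with an arbitrary vector $w \in H$ and compute $J_\phi$:

$$J_\phi(u_\phi + w) = \tfrac{1}{2}\varphi(u_\phi + w, u_\phi + w) - \phi(u_\phi + w)$$

$$= \tfrac{1}{2}\left[\varphi(u_\phi, u_\phi) + \varphi(u_\phi, w) + \varphi(w, u_\phi) + \varphi(w, w)\right] - \phi(u_\phi) - \phi(w)$$

$$= \tfrac{1}{2}\left[\varphi(u_\phi, u_\phi) + 2\varphi(w, u_\phi) + \varphi(w, w)\right] - \phi(u_\phi) - \phi(w) \qquad (\varphi \text{ symmetric})$$

$$= \tfrac{1}{2}\varphi(u_\phi, u_\phi) - \phi(u_\phi) + \tfrac{1}{2}\varphi(w, w) + \varphi(w, u_\phi) - \phi(w)$$

$$= J_\phi(u_\phi) + \tfrac{1}{2}\Phi(w) \qquad \left(\tfrac{1}{2}\varphi(u_\phi, u_\phi) - \phi(u_\phi) = J_\phi(u_\phi) \text{ and } u_\phi \text{ satisfies } \phi(w) = \varphi(w, u_\phi)\right)$$

$$\ge J_\phi(u_\phi) + \tfrac{K}{2}\|w\|^2 \qquad (\varphi \text{ coercive})$$

$$\ge J_\phi(u_\phi) \qquad \left(\tfrac{K}{2}\|w\|^2 \ge 0\right)$$

that is, $J_\phi(u_\phi) \le J_\phi(u_\phi + w)$ $\forall w \in H$. Moreover, $\frac{K}{2}\|w\|^2 > 0$ whenever $w \neq 0$, so the inequality is strict away from $u_\phi$; thus $u_\phi$ is the only minimizer of $J_\phi$. □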

Since a real inner product is a bounded, coercive and symmetric bilinear form, and since its associated quadratic form is the square of the norm (typically expressed in integral form), this result guarantees that, for any real functional of the form:

$$J_\phi(x) = \tfrac{1}{2}\|x\|^2 - \phi(x)$$

where $\phi \in H^*$, there exists a unique minimizer $u_\phi \in H$.

The Lax-Milgram theorem and its symmetric variant form the basis for finite element methods, which are based around the following idea: if $\phi$ does not have a simple expression, then looking directly for the minimizer (weak solution of a PDE) $u_\phi$ in the whole Hilbert space $H$ may be very complicated and time-consuming. In this case, the answer can be approximated by looking for a sequence $u_\phi^{(n)}$ in $H_n$, a finite-dimensional subspace of $H$ (hence the term "finite elements").

In the case where $\varphi$ is symmetric and positive-definite, $u_\phi^{(n)}$ is the orthogonal projection of $u_\phi$ on $H_n$ in the sense of the inner product defined by $\varphi$. Once we have defined a basis $(h_i)_{i=1}^n$ (which is typically orthonormal) on $H_n$, the problem consists of solving the linear system $Au_\phi^{(n)} = b$, where $A_{ij} = \varphi(h_j, h_i)$ and $b_i = \phi(h_i)$.
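To make the finite-dimensional system concrete, here is a minimal sketch on a two-dimensional subspace. The matrix `A` and the vector `b` are hypothetical data chosen for illustration (a symmetric, positive-definite matrix standing for the values $\varphi(h_j, h_i)$, and arbitrary values standing for $\phi(h_i)$); they are not taken from the text.

```python
def solve_2x2(a, b):
    """Solve the 2x2 linear system a c = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    c0 = (b[0] * a[1][1] - a[0][1] * b[1]) / det
    c1 = (a[0][0] * b[1] - b[0] * a[1][0]) / det
    return [c0, c1]

# Hypothetical data: A[i][j] plays the role of phi(h_j, h_i) for a symmetric,
# coercive form, and b[i] plays the role of the functional applied to h_i.
A = [[2.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]

def form(x, y):
    """Bilinear form in coordinates: sum over i, j of A[i][j] * x[j] * y[i]."""
    return sum(A[i][j] * x[j] * y[i] for i in range(2) for j in range(2))

def functional(x):
    """Linear functional in coordinates."""
    return b[0] * x[0] + b[1] * x[1]

def J(x):
    """Energy functional J(x) = Phi(x) / 2 - functional(x)."""
    return 0.5 * form(x, x) - functional(x)

u = solve_2x2(A, b)  # coordinates of the discrete solution

# Perturbing u in any direction should not decrease J (Corollary 6.4).
perturbations = [[0.1, 0.0], [0.0, -0.2], [0.3, 0.3], [-1.0, 2.0]]
is_minimizer = all(J(u) <= J([u[0] + w[0], u[1] + w[1]]) for w in perturbations)
```

The same pattern scales to larger subspaces once the hand-written 2x2 solver is replaced by a general linear solver.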


Finally, note that the Lax-Milgram theorem presented here may be obtained as a corollary of a theorem proven by Lions and Stampacchia in 1967 in the context of variational inequalities.

6.6. The adjoint operator: presentation and properties

In this section, we shall examine a particularly important consequence of the Riesz representation theorem and of the results presented in section 6.5.1: the possibility of associating $A$ with another operator, called "adjoint", which is of fundamental importance in functional analysis and its applications.

Consider an operator $A \in \mathcal{B}(H)$. By Theorem 6.22, the bilinear or sesquilinear form defined by $\varphi(x, y) = \langle x, Ay \rangle$ is bounded. By Theorem 6.23, there exists a unique bounded operator $B$ such that, for all $x, y \in H$, it holds that $\varphi(x, y) = \langle Bx, y \rangle$, hence: $\langle x, Ay \rangle = \varphi(x, y) = \langle Bx, y \rangle$. By the same arguments, if we select the alternative options in Theorems 6.22 and 6.23, we obtain the equation: $\langle Ax, y \rangle = \varphi(x, y) = \langle x, By \rangle$ $\forall x, y \in H$. The operator $B$ has a specific name and symbol.

DEFINITION 6.12.– Take $A \in \mathcal{B}(H)$. The adjoint operator of $A$, denoted by $A^\dagger$ (see footnote 3), is $A^\dagger \in \mathcal{B}(H)$ such that:

$$\langle A^\dagger x, y \rangle = \langle x, Ay \rangle \quad \text{and} \quad \langle Ax, y \rangle = \langle x, A^\dagger y \rangle \qquad \forall x, y \in H$$

The mapping $\dagger : \mathcal{B}(H) \to \mathcal{B}(H)$, $A \mapsto A^\dagger$ is known as adjunction.

THEOREM 6.25.– The adjunction is an antilinear automorphism of $\mathcal{B}(H)$ and it verifies the following properties: for all $A, B \in \mathcal{B}(H)$ and $k \in \mathbb{K}$:
1) $(A + B)^\dagger = A^\dagger + B^\dagger$;
2) $(kA)^\dagger = \bar{k}A^\dagger$;
3) $(AB)^\dagger = B^\dagger A^\dagger$;
4) $A^{\dagger\dagger} = A$;
5) $\|A^\dagger A\| = \|A\|^2$, $\|AA^\dagger\| = \|A^\dagger\|^2$;
6) $\|A^\dagger\| = \|A\|$.

3 The origin of the symbol $\dagger$, the dagger, reflects the close relationship between the adjoint operator $A^\dagger$ and the transposed or dual operator $A^t$. For more information, see Appendix 2. The symbol $A^*$ is also widely used.


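PROOF.– 1) and 2) are immediate consequences of the sesquilinearity of the complex inner product (if the Hilbert space is real, then evidently $\bar{k} = k$, as a consequence of bilinearity).

3) $\langle (AB)^\dagger x, y \rangle = \langle x, ABy \rangle = \langle A^\dagger x, By \rangle = \langle B^\dagger A^\dagger x, y \rangle$ $\forall x, y \in H$, hence property 3.

4) Since $A^{\dagger\dagger} = (A^\dagger)^\dagger$, $\langle A^{\dagger\dagger} x, y \rangle = \langle (A^\dagger)^\dagger x, y \rangle = \langle x, A^\dagger y \rangle = \overline{\langle A^\dagger y, x \rangle} = \overline{\langle y, Ax \rangle} = \langle Ax, y \rangle$ $\forall x, y \in H$, hence property 4.

5) Let us begin by showing that $\|A\|^2 \le \|A^\dagger A\|$: taking $x \in H$, $\|x\| = 1$, and using the Cauchy-Schwarz inequality, we have:

$$\|Ax\|^2 = \langle Ax, Ax \rangle = |\langle Ax, Ax \rangle| = |\langle A^\dagger Ax, x \rangle| \le \|A^\dagger Ax\|\|x\| = \|A^\dagger Ax\| \le \|A^\dagger A\|\|x\| = \|A^\dagger A\|$$

thus, since $\|A\|^2 = \sup_{\|x\|=1} \|Ax\|^2$, $\|A\|^2 \le \|A^\dagger A\|$.

Now, let us show that $\|A^\dagger A\| \le \|A\|^2$. We begin by noting that, for all $x, y \in H$, $\|x\| = \|y\| = 1$, it holds (using the Cauchy-Schwarz inequality for the second inequality) that:

$$\Re\langle Ax, Ay \rangle \le \sqrt{(\Re\langle Ax, Ay \rangle)^2 + (\Im\langle Ax, Ay \rangle)^2} = |\langle Ax, Ay \rangle| \le \|Ax\|\|Ay\| \le \|A\|\|x\|\|A\|\|y\| = \|A\|^2 \qquad [6.22]$$

If $\langle A^\dagger Ax, y \rangle = |\langle A^\dagger Ax, y \rangle| e^{i\vartheta}$, with $\vartheta$ the phase of $\langle A^\dagger Ax, y \rangle$, then:

$$\mathbb{R} \ni |\langle A^\dagger Ax, y \rangle| = e^{-i\vartheta}\langle A^\dagger Ax, y \rangle = \langle A^\dagger Ax, e^{i\vartheta} y \rangle$$

that is, $\langle A^\dagger Ax, e^{i\vartheta} y \rangle \in \mathbb{R}$ and thus, by [6.22]:

$$\langle A^\dagger Ax, e^{i\vartheta} y \rangle = \Re\langle A^\dagger Ax, e^{i\vartheta} y \rangle = \Re\langle Ax, A e^{i\vartheta} y \rangle \le \|A\|^2$$

since $\|e^{i\vartheta} y\| = 1$. Using the fact that $|\langle A^\dagger Ax, y \rangle| = \langle A^\dagger Ax, e^{i\vartheta} y \rangle$, we can write:

$$|\langle A^\dagger Ax, y \rangle| \le \|A\|^2, \qquad \forall x, y \in H, \ \|x\| = \|y\| = 1 \qquad [6.23]$$

Now, let us take an arbitrary $\xi \in H$ with $A^\dagger A\xi \neq 0$ (so that also $\xi \neq 0$) and use this last inequality to estimate the norm of $A^\dagger A\xi$:

$$\|A^\dagger A\xi\| = \frac{1}{\|A^\dagger A\xi\|}\|A^\dagger A\xi\|^2 = \frac{1}{\|A^\dagger A\xi\|}\frac{1}{\|\xi\|}\langle A^\dagger A\xi, A^\dagger A\xi \rangle\|\xi\| = \left|\left\langle A^\dagger A\frac{\xi}{\|\xi\|}, \frac{A^\dagger A\xi}{\|A^\dagger A\xi\|} \right\rangle\right|\|\xi\|$$

Writing $x = \frac{\xi}{\|\xi\|}$ and $y = \frac{A^\dagger A\xi}{\|A^\dagger A\xi\|}$ and observing that these are two unit vectors, we can use inequality [6.23] to write $\|A^\dagger A\xi\| \le \|A\|^2\|\xi\|$; this inequality is trivially true when $A^\dagger A\xi = 0$, so it holds for all $\xi \in H$, which implies that $\|A^\dagger A\| = \sup_{\|\xi\|=1} \|A^\dagger A\xi\| \le \|A\|^2$. Hence $\|A^\dagger A\| = \|A\|^2$ $\forall A \in \mathcal{B}(H)$. If we write $B = A^\dagger$, then $B \in \mathcal{B}(H)$ and $\|B^\dagger B\| = \|B\|^2$, that is, $\|A^{\dagger\dagger} A^\dagger\| = \|A^\dagger\|^2$; moreover, $A^{\dagger\dagger} = A$, thus $\|AA^\dagger\| = \|A^\dagger\|^2$ for all $A \in \mathcal{B}(H)$.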


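6) The claim is trivial if $A = 0$, so let $A \neq 0$ (and thus $A^\dagger \neq 0$, since $A^{\dagger\dagger} = A$). On one side, by [6.12], we have:

$$\|A\|^2 = \|A^\dagger A\| \le \|A^\dagger\|\|A\| \implies \frac{\|A\|^2}{\|A\|} \le \frac{\|A^\dagger\|\|A\|}{\|A\|} \iff \|A\| \le \|A^\dagger\|$$

and on the other side, again by [6.12], we have:

$$\|A^\dagger\|^2 = \|AA^\dagger\| \le \|A\|\|A^\dagger\| \implies \frac{\|A^\dagger\|^2}{\|A^\dagger\|} \le \frac{\|A\|\|A^\dagger\|}{\|A^\dagger\|} \iff \|A^\dagger\| \le \|A\|$$

□

An immediate corollary of properties 1 and 6 is that the adjunction $\dagger : \mathcal{B}(H) \to \mathcal{B}(H)$ is a continuous function. In fact, if $(A_n)_{n \in \mathbb{N}} \subset \mathcal{B}(H)$ is a sequence in $\mathcal{B}(H)$ which converges toward $A \in \mathcal{B}(H)$, that is, $\|A_n - A\| \to 0$ as $n \to +\infty$, then, by property 1 and then property 6:

$$\|A_n^\dagger - A^\dagger\| = \|(A_n - A)^\dagger\| = \|A_n - A\| \xrightarrow[n \to +\infty]{} 0$$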
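In finite dimension, with the standard inner product on $\mathbb{C}^n$ taken linear in the first argument, the adjoint is the conjugate transpose of the matrix. The following sketch checks the defining identity and properties 3 and 4 of Theorem 6.25 on hypothetical $2 \times 2$ matrices and vectors chosen only for illustration.

```python
def inner(x, y):
    """Inner product on C^n, linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def apply(m, x):
    """Matrix-vector product."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in m]

def matmul(a, b):
    """Matrix-matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def adjoint(m):
    """Conjugate transpose of a matrix."""
    return [[m[j][i].conjugate() for j in range(len(m))]
            for i in range(len(m[0]))]

# Hypothetical operators and vectors on C^2.
A = [[1 + 2j, 3j], [0j, 2 - 1j]]
B = [[2 + 0j, 1 - 1j], [1j, -1 + 0j]]
x = [1 + 1j, 2 - 1j]
y = [3j, 1 + 0j]

lhs = inner(apply(A, x), y)            # <Ax, y>
rhs = inner(x, apply(adjoint(A), y))   # <x, A^dagger y>

product_rule_holds = adjoint(matmul(A, B)) == matmul(adjoint(B), adjoint(A))
involution_holds = adjoint(adjoint(A)) == A
```

The entries are small Gaussian integers, so the floating-point arithmetic is exact here; with generic entries the comparisons should use a tolerance.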

The Banach algebra $\mathcal{B}(H)$ equipped with the adjunction operation becomes a C*-algebra, as formalized below.

DEFINITION 6.13 (C*-algebra).– A Banach algebra $\mathcal{A}$ is called a C*-algebra if it is possible to equip it with a map $j : \mathcal{A} \to \mathcal{A}$ such that, $\forall a, b \in \mathcal{A}$ and $\forall k \in \mathbb{C}$:
1) $j(a + b) = j(a) + j(b)$;
2) $j(ka) = \bar{k}j(a)$;
3) $j(ab) = j(b)j(a)$;
4) $j(j(a)) = a$;
together with the norm identity $\|j(a)a\| = \|a\|^2$, which for $\mathcal{B}(H)$ is property 5 of Theorem 6.25.

C*-algebra theory is extremely important in functional analysis and its applications, in particular in quantum mechanics; however, a thorough discussion of C*-algebras lies outside of the scope of this work.

Let us now consider the class of operators that are invariant with respect to adjunction.

DEFINITION 6.14 (self-adjoint or Hermitian operators).– $A \in \mathcal{B}(H)$ is a self-adjoint (s.a.) or Hermitian operator if $A^\dagger = A$, that is, if:

$$\langle Ax, y \rangle = \langle x, Ay \rangle, \qquad \forall x, y \in H$$

To understand the importance of self-adjoint operators, we just quote the fact that the physical observables in quantum mechanics are represented by self-adjoint operators on a Hilbert space. Two particularly remarkable self-adjoint operators are $A^\dagger A$ and $AA^\dagger$.


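THEOREM 6.26.– Taking $A \in \mathcal{B}(H)$, then $A^\dagger A$ and $AA^\dagger$ are self-adjoint.

PROOF.– We simply apply the properties $(AB)^\dagger = B^\dagger A^\dagger$ and $A^{\dagger\dagger} = A$:

$$(A^\dagger A)^\dagger = A^\dagger A^{\dagger\dagger} = A^\dagger A \quad \text{and} \quad (AA^\dagger)^\dagger = A^{\dagger\dagger} A^\dagger = AA^\dagger$$

□

The following theorem establishes the conditions under which the self-adjoint property is stable with respect to the operations of the algebra $\mathcal{B}(H)$. The following notation will be used: $\forall A, B \in \mathcal{B}(H)$, we define the operator $[A, B] := AB - BA$, called the commutator between $A$ and $B$. $A$ and $B$ are said to commute if $[A, B] = 0$, the null operator; in this case, $AB = BA$.

THEOREM 6.27.– If $A, B \in \mathcal{B}(H)$ are self-adjoint, then:
– $\alpha A + \beta B$ is self-adjoint for all $\alpha, \beta \in \mathbb{R}$;
– $AB$ is self-adjoint if and only if $[A, B] = 0$.

PROOF.– The first property is a straightforward consequence of property 2 concerning the adjunction and sesquilinearity of the inner product: $(\alpha A + \beta B)^\dagger = \bar{\alpha}A^\dagger + \bar{\beta}B^\dagger = \alpha A + \beta B$ when $\alpha, \beta$ are real. The second property is proven below.

$\implies$: $AB$ is s.a., that is, $AB = (AB)^\dagger$; then $(AB)^\dagger = B^\dagger A^\dagger = BA$ since $A, B$ are s.a., thus $AB = BA$.

$\impliedby$: $\forall x, y \in H$ it holds that:

$$\langle ABx, y \rangle = \langle Bx, A^\dagger y \rangle = \langle x, B^\dagger A^\dagger y \rangle = \langle x, BAy \rangle = \langle x, ABy \rangle$$

where the third equality uses the fact that $A, B$ are s.a. and the fourth uses $[A, B] = 0$; hence $AB = (AB)^\dagger$. □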
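As a small worked illustration of the second property (the matrices below are chosen here for illustration and do not come from the text), consider two real symmetric matrices acting on $\mathbb{R}^2$ that do not commute, followed by a pair that does:

```latex
\[
A = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad
B = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad
AB = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \qquad
BA = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}
\]
\[
[A, B] = AB - BA = \begin{pmatrix} 0 & 2 \\ -2 & 0 \end{pmatrix} \neq 0,
\qquad (AB)^\dagger = BA \neq AB
\]
\[
C = \begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix}, \qquad
[A, C] = 0, \qquad
AC = \begin{pmatrix} 2 & 0 \\ 0 & -3 \end{pmatrix} = (AC)^\dagger
\]
```

In the first case the product of two self-adjoint operators fails to be self-adjoint precisely because the commutator is non-zero; in the second, the two diagonal matrices commute and their product is again symmetric.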

The following exercise makes use of many of the results presented above.

Exercise 6.4

Let $(u_n)_{n \in \mathbb{N}}$ be an orthonormal system in the Hilbert space $H$, $(\lambda_n)_{n \in \mathbb{N}} \subset \mathbb{C}$ and $A : H \to H$:

$$Ax = \sum_{n \in \mathbb{N}} \lambda_n \langle x, u_n \rangle u_n, \qquad \forall x \in H$$

1) Show that, if the sequence $(\lambda_n)_{n \in \mathbb{N}}$ is bounded, then $A \in \mathcal{B}(H)$.

2) Calculate the adjoint $A^\dagger$ of $A$. Using your result, deduce a necessary and sufficient condition for operator $A$ to be anti-self-adjoint, that is, $A + A^\dagger = 0$.

3) For all $n \in \mathbb{N}$, consider the operator $A_n$ defined by:

$$A_n x = \sum_{k=0}^{n} \lambda_k \langle x, u_k \rangle u_k$$

a) Calculate $A_n u_{n+1} - Au_{n+1}$ for all $n \in \mathbb{N}$. Using your result, deduce a necessary condition to have $A_n \to A$ in $\mathcal{B}(H)$ as $n \to +\infty$.

b) Supposing that $\lim_{n \to +\infty} \lambda_n = 0$, prove that $A_n \to A$ in $\mathcal{B}(H)$ as $n \to +\infty$.

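Solution to Exercise 6.4

1) Since $(u_n)_{n \in \mathbb{N}}$ is an orthonormal system of a Hilbert space, the Fischer-Riesz theorem guarantees that $Ax$ is well defined, that is, the convergence (in $H$) of the series $\sum_{n \in \mathbb{N}} \lambda_n \langle x, u_n \rangle u_n$ is equivalent to the convergence (in $\mathbb{C}$) of the series $\sum_{n \in \mathbb{N}} |\lambda_n \langle x, u_n \rangle|^2 = \sum_{n \in \mathbb{N}} |\lambda_n|^2 |\langle x, u_n \rangle|^2$. If $(\lambda_n)_{n \in \mathbb{N}}$ is a bounded sequence, that is, $\sup_{n \in \mathbb{N}} |\lambda_n| = M < +\infty$, then $\sum_{n \in \mathbb{N}} |\lambda_n \langle x, u_n \rangle|^2 \le M^2\|x\|^2 < +\infty$, by Bessel's inequality [5.5]. Now, let us analyze the conditions under which $A$ is bounded. By the Pythagorean theorem:

$$\|Ax\|^2 = \left\| \sum_{n \in \mathbb{N}} \lambda_n \langle x, u_n \rangle u_n \right\|^2 = \sum_{n \in \mathbb{N}} \|\lambda_n \langle x, u_n \rangle u_n\|^2 = \sum_{n \in \mathbb{N}} |\lambda_n|^2 |\langle x, u_n \rangle|^2$$

The fact that $(\lambda_n)_{n \in \mathbb{N}}$ is bounded and Bessel's inequality can also be used to write $\|Ax\| \le M\|x\|$, $\forall x \in H$; therefore, $\|A\| = \sup_{\|x\|=1} \|Ax\| \le M$, showing that $A \in \mathcal{B}(H)$.

2) Taking $x, y \in H$, we have:

$$\langle Ax, y \rangle = \sum_{n \in \mathbb{N}} \lambda_n \langle x, u_n \rangle \langle u_n, y \rangle = \left\langle x, \sum_{n \in \mathbb{N}} \overline{\lambda_n}\, \overline{\langle u_n, y \rangle}\, u_n \right\rangle = \left\langle x, \sum_{n \in \mathbb{N}} \overline{\lambda_n} \langle y, u_n \rangle u_n \right\rangle$$

Hence $A^\dagger x = \sum_{n \in \mathbb{N}} \overline{\lambda_n} \langle x, u_n \rangle u_n$, $\forall x \in H$.

By continuity, we can write $(A + A^\dagger)x = \sum_{n \in \mathbb{N}} (\lambda_n + \overline{\lambda_n}) \langle x, u_n \rangle u_n$, thus $A + A^\dagger$ is the zero operator if and only if $\lambda_n + \overline{\lambda_n} = 0$ $\forall n \in \mathbb{N}$. Writing $\lambda_n = a_n + ib_n$, $a_n, b_n \in \mathbb{R}$ $\forall n \in \mathbb{N}$, we see that the condition $\lambda_n + \overline{\lambda_n} = 0$ is equivalent to $a_n = -a_n$ $\forall n \in \mathbb{N}$, that is, $a_n = 0$ $\forall n \in \mathbb{N}$, whereas there are no constraints on $b_n$. Thus, $A$ is anti-self-adjoint if and only if $\lambda_n \in i\mathbb{R}$ for all $n \in \mathbb{N}$, that is, $(\lambda_n)_{n \in \mathbb{N}}$ is a purely imaginary sequence.

3) a) Using the following facts:

$$\|u_n\| = 1, \qquad \langle u_k, u_{n+1} \rangle = 0 \qquad \forall n \in \mathbb{N}, \ k \neq n + 1$$

we deduce that $A_n u_{n+1} = 0$, $Au_{n+1} = \lambda_{n+1} u_{n+1}$ $\forall n \in \mathbb{N}$. Then, by [6.10], we can write:

$$\|A_n - A\|_{\mathcal{B}(H)} \ge \|(A_n - A)u_{n+1}\|_H = |\lambda_{n+1}| \qquad \forall n \in \mathbb{N}$$

that is, $\lim_{n \to +\infty} \lambda_n = 0$ is a necessary condition for $\lim_{n \to +\infty} \|A_n - A\|_{\mathcal{B}(H)} = 0$.

b) For all $n \in \mathbb{N}$ and $x \in H$, we calculate:

$$\|A_n x - Ax\|^2 = \left\| \sum_{k=n+1}^{\infty} \lambda_k \langle x, u_k \rangle u_k \right\|^2 = \sum_{k=n+1}^{\infty} |\lambda_k|^2 |\langle x, u_k \rangle|^2 \le \left( \sup_{k \ge n+1} |\lambda_k| \right)^2 \|x\|^2$$

by Bessel's inequality. Thus, $\|A_n - A\|_{\mathcal{B}(H)} \le \sup_{k \ge n+1} |\lambda_k|$, $\forall n \in \mathbb{N}$. Using the fact that $\lim_{n \to +\infty} \lambda_n = 0$, we obtain the required result:

$$\lim_{n \to +\infty} \|A_n - A\|_{\mathcal{B}(H)} = 0$$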
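The estimate of part 3) can be explored on a finite truncation, where vectors are stored through their coordinates $\langle x, u_k \rangle$. This is only a sketch: the truncation level `N`, the sequence $\lambda_k = 1/(k+1)$ and the vector `x` are hypothetical choices made for illustration.

```python
def apply_diagonal(lambdas, coords, n=None):
    """Coordinates of A x (or of A_n x when n is given) in the basis (u_k)."""
    last = len(lambdas) - 1 if n is None else n
    return [lam * c if k <= last else 0.0
            for k, (lam, c) in enumerate(zip(lambdas, coords))]

def norm(coords):
    """Norm of a vector given by its coordinates in an orthonormal system."""
    return sum(abs(c) ** 2 for c in coords) ** 0.5

def tail_sup(lambdas, n):
    """Supremum of |lambda_k| over k >= n + 1 (0 if the tail is empty)."""
    return max((abs(lam) for lam in lambdas[n + 1:]), default=0.0)

N = 200                                       # hypothetical truncation level
lambdas = [1.0 / (k + 1) for k in range(N)]   # lambda_k -> 0
x = [1.0 / (k + 1) for k in range(N)]         # coordinates <x, u_k>

def error(n):
    """||A_n x - A x|| for the chosen x."""
    full = apply_diagonal(lambdas, x)
    part = apply_diagonal(lambdas, x, n)
    return norm([a - b for a, b in zip(part, full)])

# Part b): the error is controlled by the tail supremum times ||x||.
bound_holds = all(error(n) <= tail_sup(lambdas, n) * norm(x) + 1e-12
                  for n in (0, 10, 100))

# Part a): for a constant sequence the tail supremum never decays.
constant_tail = tail_sup([1.0] * N, 50)
```

With `lambdas` constant the tail supremum stays equal to 1, which matches the necessary condition found in part a).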

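Now, let us consider the norm of self-adjoint operators.

THEOREM 6.28.– Let $A \in \mathcal{B}(H)$ be a self-adjoint operator, then:

$$\|A\| = \sup_{\|x\|=1} |\langle Ax, x \rangle|$$

PROOF.– For simplicity's sake, we write:

$$s_A = \sup_{\|x\|=1} |\langle Ax, x \rangle|$$

$s_A \le \|A\|$: by the Cauchy-Schwarz inequality, we have:

$$|\langle Ax, x \rangle| \le \|x\|\|Ax\| \le \|A\|\|x\|^2$$

thus:

$$s_A = \sup_{\|x\|=1} |\langle Ax, x \rangle| \le \sup_{\|x\|=1} \|A\|\|x\|^2 = \|A\|$$

and so $s_A \le \|A\|$.

$\|A\| \le s_A$: using the fact that $\forall z \in \mathbb{C}$, $z + \bar{z} = 2\Re(z)$, we can write $\forall x, y \in H$:

$$4\Re\langle Ax, y \rangle = 4 \cdot \tfrac{1}{2}\left[\langle Ax, y \rangle + \overline{\langle Ax, y \rangle}\right] = 2\left[\langle Ax, y \rangle + \langle y, Ax \rangle\right]$$

By direct calculation, using $A^\dagger = A$, we can verify that the following equality holds true:

$$2\left[\langle Ax, y \rangle + \langle y, Ax \rangle\right] = \langle A(x + y), x + y \rangle - \langle A(x - y), x - y \rangle$$

thus, for $x \neq \pm y$ (the final bound is immediate otherwise):

$$4\Re\langle Ax, y \rangle = \langle A(x + y), x + y \rangle - \langle A(x - y), x - y \rangle$$

$$= \|x + y\|^2 \left\langle A\frac{x + y}{\|x + y\|}, \frac{x + y}{\|x + y\|} \right\rangle - \|x - y\|^2 \left\langle A\frac{x - y}{\|x - y\|}, \frac{x - y}{\|x - y\|} \right\rangle$$

$$\le \|x + y\|^2 s_A + \|x - y\|^2 s_A = s_A\left(\|x + y\|^2 + \|x - y\|^2\right) = 2s_A\left(\|x\|^2 + \|y\|^2\right)$$

where the last equality is the parallelogram law [1.6]; that is, $\Re\langle Ax, y \rangle \le \frac{1}{2}s_A(\|x\|^2 + \|y\|^2)$ $\forall x, y \in H$. Since the inequality is valid for any pair of vectors in $H$, let us consider the pair $x, z$, where $z = e^{i\vartheta}y$ with arbitrary $\vartheta \in \mathbb{R}$. Given that $\|z\| = \|y\|$, the previous inequality becomes:

$$\Re\langle Ax, e^{i\vartheta}y \rangle \le \tfrac{1}{2}s_A\left(\|x\|^2 + \|y\|^2\right) \qquad [6.24]$$

We can now use a similar argument to that used to prove property 5 in the case of adjunction: we write $\langle Ax, y \rangle = |\langle Ax, y \rangle|e^{i\vartheta}$, where $\vartheta$ is the phase of $\langle Ax, y \rangle$, then, the quantity being real:

$$\mathbb{R} \ni |\langle Ax, y \rangle| = e^{-i\vartheta}\langle Ax, y \rangle = \langle Ax, e^{i\vartheta}y \rangle = \Re\langle Ax, e^{i\vartheta}y \rangle$$

thus $|\langle Ax, y \rangle| = \Re\langle Ax, e^{i\vartheta}y \rangle$, and so inequality [6.24] may be rewritten as:

$$|\langle Ax, y \rangle| \le \tfrac{1}{2}s_A\left(\|x\|^2 + \|y\|^2\right)$$

Now, for $Ax \neq 0$, let us introduce the vector $y = \frac{\|x\|}{\|Ax\|}Ax$ into this inequality. On the left side, we obtain:

$$|\langle Ax, y \rangle| = \left|\left\langle Ax, \frac{\|x\|}{\|Ax\|}Ax \right\rangle\right| = \frac{\|x\|}{\|Ax\|}|\langle Ax, Ax \rangle| = \frac{\|x\|}{\|Ax\|}\|Ax\|^2 = \|x\|\|Ax\|$$

while on the right side, we have:

$$\tfrac{1}{2}s_A\left(\|x\|^2 + \|y\|^2\right) = \tfrac{1}{2}s_A\left(\|x\|^2 + \frac{\|x\|^2}{\|Ax\|^2}\|Ax\|^2\right) = \tfrac{1}{2}s_A\left(\|x\|^2 + \|x\|^2\right) = s_A\|x\|^2$$

thus, $\forall x \in H$, it holds that $\|x\|\|Ax\| \le s_A\|x\|^2$ (trivially so when $Ax = 0$), and if $x \neq 0_H$, then $\|Ax\| \le s_A\|x\|$, hence:

$$\|A\| = \sup_{x \neq 0_H} \frac{\|Ax\|}{\|x\|} \le s_A \sup_{x \neq 0_H} \frac{\|x\|}{\|x\|} = s_A$$

and finally $\|A\| \le s_A$. □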
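A numerical sketch on $\mathbb{R}^2$ illustrates why self-adjointness matters in Theorem 6.28. The two matrices below are hypothetical examples chosen here, and both suprema are approximated by sampling unit vectors $(\cos t, \sin t)$.

```python
import math

def apply(m, x):
    """2x2 matrix-vector product."""
    return (m[0][0] * x[0] + m[0][1] * x[1], m[1][0] * x[0] + m[1][1] * x[1])

def unit_vectors(samples=3600):
    """Unit vectors (cos t, sin t) on a uniform grid of angles."""
    for i in range(samples):
        t = 2.0 * math.pi * i / samples
        yield (math.cos(t), math.sin(t))

def quadratic_sup(m):
    """Approximate sup of |<Ax, x>| over unit vectors of R^2."""
    return max(abs(sum(a * b for a, b in zip(apply(m, x), x)))
               for x in unit_vectors())

def operator_norm(m):
    """Approximate sup of ||Ax|| over unit vectors of R^2."""
    return max(math.hypot(*apply(m, x)) for x in unit_vectors())

symmetric = [[2.0, 1.0], [1.0, 2.0]]   # self-adjoint: both suprema agree
nilpotent = [[0.0, 1.0], [0.0, 0.0]]   # not self-adjoint: they differ

s_sym, n_sym = quadratic_sup(symmetric), operator_norm(symmetric)
s_nil, n_nil = quadratic_sup(nilpotent), operator_norm(nilpotent)
```

For the symmetric matrix both quantities come out close to 3, while for the nilpotent one the quadratic supremum is close to 1/2 although the operator norm is 1, so the equality of Theorem 6.28 genuinely relies on $A^\dagger = A$.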

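Theorem 6.29 points out a property of the adjoint operator which is of fundamental importance in optimization.

THEOREM 6.29.– Taking $A \in \mathcal{B}(H)$, then:

$$\ker(A) = (\mathrm{Im}(A^\dagger))^\perp \quad \text{and} \quad \overline{\mathrm{Im}(A^\dagger)} = (\ker(A))^\perp$$

thus:

$$H = \ker(A) \oplus \overline{\mathrm{Im}(A^\dagger)} \quad \text{and} \quad H = \ker(A^\dagger) \oplus \overline{\mathrm{Im}(A)}$$

PROOF.– $\ker(A) \subseteq (\mathrm{Im}(A^\dagger))^\perp$: taking any $x \in H$ and $y \in \ker(A)$, then $Ay = 0_H$ and so we can write:

$$0 = \langle x, 0_H \rangle = \langle x, Ay \rangle = \langle A^\dagger x, y \rangle$$

that is, $y \perp A^\dagger x$ $\forall x \in H$. Since $\mathrm{Im}(A^\dagger) = \{A^\dagger x, \ x \in H\}$, this implies that $y \in (\mathrm{Im}(A^\dagger))^\perp$.

$(\mathrm{Im}(A^\dagger))^\perp \subseteq \ker(A)$: taking $y \in (\mathrm{Im}(A^\dagger))^\perp$, then $\langle A^\dagger x, y \rangle = 0$ $\forall x \in H$, and since $\langle A^\dagger x, y \rangle = \langle x, Ay \rangle$, then $\langle x, Ay \rangle = 0$ $\forall x \in H$, that is, $Ay = 0_H$, therefore $y \in \ker(A)$.

Therefore: $\ker(A) = (\mathrm{Im}(A^\dagger))^\perp = \left(\overline{\mathrm{Im}(A^\dagger)}\right)^\perp$. Taking the orthogonal complement again: $\ker(A)^\perp = \left(\overline{\mathrm{Im}(A^\dagger)}\right)^{\perp\perp} = \overline{\mathrm{Im}(A^\dagger)}$. We see that it is essential to consider the closure of $\mathrm{Im}(A^\dagger)$, since $\ker(A)^\perp$ is a closed subspace in $H$ and, in general, $\mathrm{Im}(A^\dagger)$ is not.

The orthogonal decompositions of $H$ into a direct sum of subspaces are an immediate consequence of the orthogonal projection theorem; the second one follows from the first applied to $A^\dagger$, since $A^{\dagger\dagger} = A$. □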
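A finite-dimensional worked example may help fix ideas; the matrix is chosen here for illustration and is not taken from the text. In finite dimension every subspace is closed, so no closure is needed. On $H = \mathbb{C}^2$ with canonical basis $(e_1, e_2)$:

```latex
\[
A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad
A^\dagger = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}
\]
\[
\ker(A) = \operatorname{span}(e_1), \qquad
\operatorname{Im}(A^\dagger) = \operatorname{span}(e_2), \qquad
\ker(A) = (\operatorname{Im}(A^\dagger))^\perp
\]
\[
\ker(A^\dagger) = \operatorname{span}(e_2), \qquad
\operatorname{Im}(A) = \operatorname{span}(e_1), \qquad
\mathbb{C}^2 = \ker(A) \oplus \operatorname{Im}(A^\dagger)
             = \ker(A^\dagger) \oplus \operatorname{Im}(A)
\]
```

Note that $\ker(A)$ and $\operatorname{Im}(A)$ coincide here, so it is the image of the adjoint, not of $A$ itself, that complements the kernel.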


Finally, let us analyze the relationship between inversion and adjunction. Recall that, as we saw in section 6.3, if $V$ is a Banach space, then $GL(V)$ is its general linear group, that is, the group of continuous bijective linear operators with continuous inverses.

THEOREM 6.30.– Let $H$ be a Hilbert space and let $A \in GL(H)$. Then $A^\dagger$ is invertible and:
1) it holds that $(A^\dagger)^{-1} = (A^{-1})^\dagger$, that is, for the operators in $GL(H)$, inversion and adjunction commute: the inverse of the adjoint is the adjoint of the inverse;
2) if $A \in GL(H)$ is self-adjoint, then $A^{-1}$ is also self-adjoint.

PROOF.– 1) We need to prove that, for all $x \in H$, $(A^{-1})^\dagger A^\dagger x = A^\dagger (A^{-1})^\dagger x = x$. To do this, let us consider, $\forall x, y \in H$:

$$\langle y, (A^{-1})^\dagger A^\dagger x \rangle = \langle A^{-1}y, A^\dagger x \rangle = \langle AA^{-1}y, x \rangle = \langle y, x \rangle$$

$$\langle y, A^\dagger (A^{-1})^\dagger x \rangle = \langle Ay, (A^{-1})^\dagger x \rangle = \langle A^{-1}Ay, x \rangle = \langle y, x \rangle$$

hence, by [6.10], $(A^{-1})^\dagger A^\dagger x = A^\dagger (A^{-1})^\dagger x = x$.

2) An immediate consequence of property 1 is that if $A = A^\dagger$, then $A^{-1} = (A^\dagger)^{-1} = (A^{-1})^\dagger$. □

6.7. Orthogonal projection operators in a Hilbert space

We have already examined the concept of orthogonal projection in a Hilbert space $H$. Here we wish to characterize orthogonal projections from an operator point of view. We will see that the adjoint operator will play a crucial role.

A clear, simple way of understanding projection operators (orthogonal or otherwise) is to imagine that we are in a finite-dimensional Euclidean space, for example $\mathbb{R}^2$, and to project a vector in the direction of another vector. Now, imagine that we want to repeat the process, that is, we want to "project the projection"; clearly, this operation has no effect on the projected vector. This property is used to define the concept of projection itself (see footnote 4).

4 Many authors refer to this as oblique projection to distinguish it from the more restrictive concept of orthogonal projection.
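The picture in $\mathbb{R}^2$ can be sketched in a few lines. The vectors and the particular oblique projection below are hypothetical choices made for illustration: both maps leave an already projected vector unchanged, but only the first has a residual orthogonal to the line onto which it projects.

```python
def dot(x, y):
    """Euclidean inner product."""
    return sum(a * b for a, b in zip(x, y))

def project_orthogonal(v, x):
    """Orthogonal projection of x onto the line spanned by v (v non-zero)."""
    c = dot(x, v) / dot(v, v)
    return [c * vi for vi in v]

def project_oblique(x):
    """Projection onto the first axis along the direction (1, 1)."""
    return [x[0] - x[1], 0.0]

v = [1.0, 1.0]
x = [3.0, 1.0]

p = project_orthogonal(v, x)      # projection of x
pp = project_orthogonal(v, p)     # "projecting the projection"
residual = [xi - pi for xi, pi in zip(x, p)]

q = project_oblique(x)
qq = project_oblique(q)
oblique_residual = [xi - qi for xi, qi in zip(x, q)]

idempotent = (pp == p) and (qq == q)
orthogonal_residual = dot(residual, v) == 0.0
oblique_is_orthogonal = dot(oblique_residual, [1.0, 0.0]) == 0.0
```

Here `idempotent` and `orthogonal_residual` evaluate to `True`, while `oblique_is_orthogonal` is `False`: idempotence alone does not single out the orthogonal projection.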


DEFINITION 6.15.– An operator $A \in \mathcal{B}(H)$ is called a projector, or a projection operator, if it is idempotent, i.e. $A^2 = A$.

The presence of an inner product in $H$ allows us to target a specific projection: the orthogonal projection. The results presented in Chapter 5 showed that the completeness of $H$ with respect to the topology generated by the inner product allows us to give two equivalent definitions of orthogonal projection.

DEFINITION 6.16.– Let $H$ be a Hilbert space and $S$ a closed proper subspace of $H$. The function:

$$P_S : H \longrightarrow S, \quad x \longmapsto P_S(x)$$

is the orthogonal projector on the subspace $S$ if $\|x - P_S(x)\| = \inf_{y \in S} \|x - y\|$, that is, $P_S(x)$ is the element in $S$ which minimizes the distance from $x \in H$ with respect to the norm induced by the inner product of $H$. In an equivalent manner, if we consider the decomposition of $H$: $H = S \oplus S^\perp$ and write $x = x' + x''$, with $x \in H$, $x' \in S$, $x'' \in S^\perp$, then the orthogonal projection operator $P_S$ is defined via the formula $P_S(x) = x'$.

Let us consider an example of a projector. Take $H = L^2[-a, a]$, with $a \in \mathbb{R}$, equipped with the Borel $\sigma$-algebra and the Lebesgue measure. The odd and even functions can be easily verified to be orthogonal for the inner product of $L^2[-a, a]$. We then have the following decomposition:

$$f(x) = \underbrace{\frac{f(x) + f(-x)}{2}}_{\text{even part}} + \underbrace{\frac{f(x) - f(-x)}{2}}_{\text{odd part}}, \qquad \forall x \in [-a, a]$$

thus the projector of $f \in L^2[-a, a]$ on the subspace $\mathcal{P} \subset L^2[-a, a]$ of even functions is defined by $P_{\mathcal{P}} f(x) = \frac{f(x) + f(-x)}{2}$, and the projector on the subspace $\mathcal{I} \subset L^2[-a, a]$ of odd functions is defined by $P_{\mathcal{I}} f(x) = \frac{f(x) - f(-x)}{2}$, $\forall x \in [-a, a]$.

Now, let us examine the properties of the operator $P_S$.

1) $P_S|_S = \mathrm{id}_S$. This is trivial: the element $P_S(x) \in S$ which minimizes the distance to $x \in S$ is $x$ itself. In other words, if $x = x' \in S$, then $P_S(x) = P_S(x') = x'$.

2) $P_S^2 = P_S$ (idempotence). $\forall x \in H$, we have $P_S^2(x) = P_S(P_S(x)) = P_S(x)$, since $P_S(x) \in S$ by definition and $P_S$ acts as the identity on $S$. Thus $P_S$ is indeed a projector.


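3) $P_S$ is a continuous linear operator. Let $x_1 = x_1' + x_1'' \in H$, with $x_1' \in S$ and $x_1'' \in S^\perp$, and let $x_2 = x_2' + x_2'' \in H$, with $x_2' \in S$ and $x_2'' \in S^\perp$. For all $\alpha, \beta \in \mathbb{K}$ we have:

$$\alpha x_1 + \beta x_2 = \underbrace{\alpha x_1' + \beta x_2'}_{\in S} + \underbrace{\alpha x_1'' + \beta x_2''}_{\in S^\perp}$$

and thus:

$$P_S(\alpha x_1 + \beta x_2) = \alpha x_1' + \beta x_2' = \alpha P_S(x_1) + \beta P_S(x_2)$$

$P_S$ is thus a linear operator. Its continuity can be proven by showing that it is bounded: taking any $x = x' + x'' \in H$, with $x' \in S$, $x'' \in S^\perp$, then, by the Pythagorean theorem, $\|x\|^2 = \|x'\|^2 + \|x''\|^2$ and $\|P_S x\|^2 = \|x'\|^2 \le \|x'\|^2 + \|x''\|^2 = \|x\|^2$, i.e. $\|P_S x\| \le \|x\|$ $\forall x \in H$;

4) $P_S = P_S^\dagger$ (self-adjoint). To prove this, we use the projection theorem twice, on $x \in H$ and on $y \in H$: $x = x' + x''$, $y = y' + y''$, $x', y' \in S$, $x'', y'' \in S^\perp$. Since $\langle x', y'' \rangle = 0$ and $\langle x'', y' \rangle = 0$:

$$\langle P_S x, y \rangle = \langle x', y' + y'' \rangle = \langle x', y' \rangle + \langle x', y'' \rangle = \langle x', y' \rangle = \langle x - x'', y' \rangle = \langle x, y' \rangle - \langle x'', y' \rangle = \langle x, y' \rangle$$

and since $y' = P_S y$, then: $\langle P_S x, y \rangle = \langle x, P_S y \rangle$ $\forall x, y \in H$;

5) $P_S$ is a 1-Lipschitz function, that is:

$$\|P_S(x) - P_S(y)\| \le \|x - y\|, \qquad \forall x, y \in H$$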

We simply note that $\forall x, y \in H$, the projection of $x - y$, i.e. $P_S(x - y)$, and the residual vector $(x - y) - P_S(x - y)$ are orthogonal, since one belongs to $S$ and the other to $S^\perp$. Thus, we can apply the Pythagorean theorem and write:

$$\|x - y\|^2 = \|(x - y) - P_S(x - y) + P_S(x - y)\|^2 = \|(x - y) - P_S(x - y)\|^2 + \|P_S(x - y)\|^2 \ge \|P_S(x - y)\|^2 = \|P_S(x) - P_S(y)\|^2$$

So, we obtain: $\|P_S(x) - P_S(y)\| \le \|x - y\|$, $\forall x, y \in H$.

6) The non-trivial orthogonal projectors have unit norm:

$$\|P_S\| = \begin{cases} 1 & \text{if } S \neq \{0_H\} \\ 0 & \text{if } S = \{0_H\} \end{cases}$$

If $S = \{0_H\}$ then $P_S \equiv 0$, thus its norm is 0. Otherwise, by setting $y = 0$ in the 1-Lipschitz property of $P_S$ we have that $\|P_S x\| \le \|x\|$ $\forall x \in H$, i.e. $\frac{\|P_S x\|}{\|x\|} \le 1$ $\forall x \in H \setminus \{0_H\}$. Furthermore, if $S \neq \{0_H\}$, then there exists $\bar{x} \in S$, $\bar{x} \neq 0_H$ and $P_S \bar{x} = \bar{x}$, i.e. $\|P_S \bar{x}\| = \|\bar{x}\|$, i.e. $\frac{\|P_S \bar{x}\|}{\|\bar{x}\|} = 1$. Then $\|P_S\| = \sup_{x \in H, \, x \neq 0} \frac{\|P_S x\|}{\|x\|} = 1$.

7) $\mathrm{Im}(P_S) = S$. This is obvious, by definition of the projection operator.

8) $\ker P_S = S^\perp$. We must show the double inclusion: $x \in \ker P_S \implies P_S(x) = 0$, so, for all $y \in S$ it holds that $0 = \langle P_S(x), y \rangle = \langle x, P_S^\dagger(y) \rangle = \langle x, y \rangle$ since $P_S^\dagger = P_S$ and $P_S(y) = y$, thus $x \in S^\perp$. Conversely, $x \in S^\perp \implies x = 0 + x = P_S(x) + P_{S^\perp}(x)$, by uniqueness of the orthogonal decomposition, thus $P_S(x) = 0$ and then $x \in \ker P_S$.

9) One immediate consequence of the two previous properties and the projection theorem is that:

$$H = \mathrm{Im}(P_S) \oplus \ker(P_S) \qquad \forall S \text{ closed subspace of } H.$$

10) $P_S + P_{S^\perp} = \mathrm{id}_H$ (decomposition of the identity). For any $x \in H$, we always have the decomposition $x = x' + x''$ with $x' = P_S(x)$ and $x'' = P_{S^\perp}(x)$. We thus have $P_S(x) + P_{S^\perp}(x) = (P_S + P_{S^\perp})(x) = x' + x'' = x$, $\forall x \in H$. An immediate consequence is that:

$$P_{S^\perp} = \mathrm{id}_H - P_S \qquad \forall S \text{ closed subspace of } H.$$

$P_{S^\perp}$ is also called the complementary projector and it is denoted by $P_S^\perp$. For all $x \in H$, the residual vector of the projection of $x$ on $S$ is obtained via $P_{S^\perp}(x) = (\mathrm{id}_H - P_S)(x) = x - P_S(x)$;

11) Characterization of the projection subspace:

$$S = \{x \in H : P_S(x) = x\} = \{x \in H : \|P_S(x)\| = \|x\|\}$$

that is, the elements of $S$ are the fixed points of $P_S$ in $H$, which may themselves be characterized as elements of $H$ which have a norm equal to that of their projection on $S$.

Let us prove that $S = \{x \in H : P_S(x) = x\}$: an element of $S$ is a point of $H$ on which $P_S$ acts as the identity; vice versa, if $x \in H$ satisfies $P_S(x) = x$ then $x = P_S(x) \in \mathrm{Im}(P_S) = S$.

Let us now check that $P_S(x) = x \iff \|P_S(x)\| = \|x\|$:

$\implies$: evidently, $P_S(x) = x \implies \|P_S(x)\| = \|x\|$;

$\impliedby$: again, taking $H \ni x = x' + x''$ with $P_S(x) = x'$, we have:

$$\|P_S(x)\| = \|x\| \implies \|P_S(x)\|^2 = \|x\|^2 \implies \|x'\|^2 = \|x' + x''\|^2 \implies \|x'\|^2 = \|x'\|^2 + \|x''\|^2 \quad (\text{since } x' \perp x'')$$

$$\iff \|x''\|^2 = 0 \iff x'' = 0 \ (\text{property of the norm}) \iff x = x'$$

we thus have $\|P_S(x)\| = \|x\| \implies x = x' = P_S(x)$.

12) $\langle P_S x, x \rangle = \langle x, P_S x \rangle = \|P_S x\|^2$ for all $x \in H$. The first equality is simply a consequence of the fact that $P_S$ is self-adjoint; then, by idempotence, $P_S^2 = P_S P_S = P_S$, so $\langle P_S x, x \rangle = \langle P_S^2 x, x \rangle = \langle P_S x, P_S x \rangle = \|P_S x\|^2$.

Two of the properties proven above characterize bounded linear operators as orthogonal projectors.

THEOREM 6.31 ("Algebraic" characterization of orthogonal projectors).– Taking $A \in \mathcal{B}(H)$, the following statements are equivalent:
1) $A$ is an orthogonal projector;
2) $A^\dagger A = A$;
3) $A^\dagger A = A^\dagger$;
4) $A$ is self-adjoint and idempotent, i.e. $A^\dagger = A$ and $A^2 = A$.

PROOF.– The theorem will be proven by the logical loop 1) $\implies$ 2) $\implies$ 3) $\implies$ 4) $\implies$ 1).

1) $\implies$ 2): $A$ is an orthogonal projector, hence $A^\dagger = A$ and $A^2 = A$, then $A^\dagger A = AA = A^2 = A$.

2) $\implies$ 3): if $A^\dagger A = A$, then $(A^\dagger A)^\dagger = A^\dagger$, but we know that $A^\dagger A$ is self-adjoint, thus $(A^\dagger A)^\dagger = A^\dagger A = A^\dagger$.

3) $\implies$ 4): if $A^\dagger A = A^\dagger$, then $(A^\dagger A)^\dagger = A^{\dagger\dagger} = A$; moreover, $A^\dagger A$ is self-adjoint, hence $(A^\dagger A)^\dagger = A^\dagger A = A$. By hypothesis, $A^\dagger A = A^\dagger$, so $A^\dagger = A$, that is, $A$ is self-adjoint. Reusing the starting hypothesis $A^\dagger A = A^\dagger$, the fact that $A$ is self-adjoint implies that $A^2 = A$, that is, $A$ is idempotent.

4) $\implies$ 1): let $A \in \mathcal{B}(H)$ be self-adjoint and idempotent. We wish to show that $A$ is an orthogonal projector. By definition, an orthogonal projector projects onto a closed vector subspace, so we first need to show that $\mathrm{Im}(A)$, the subspace which is intended to be the "site" of the projection, is closed, given the hypotheses of the theorem.


We can show that the continuity and idempotence of a linear operator $A$ imply the closure of its image; this is remarkable, since the relationship between the concept of closure of a vector subspace and idempotence is far from obvious.

Let $(x_n)_{n \in \mathbb{N}} \subset \mathrm{Im}(A)$ be a sequence converging to $x_0 \in H$. We wish to show that $x_0 \in \mathrm{Im}(A)$. Since each $x_n \in \mathrm{Im}(A)$, then, $\forall n \in \mathbb{N}$, there exists $\xi_n \in H$ such that $x_n = A\xi_n$ and we can thus rewrite $x_n \to x_0$ as $A\xi_n \to x_0$ (as $n \to +\infty$). The continuity of $A$ implies that $A^2\xi_n \to Ax_0$, but $A^2 = A$, hence $A\xi_n \to Ax_0$. We then have $A\xi_n \to Ax_0$ and $A\xi_n \to x_0$, and the uniqueness of the limit implies that $Ax_0 = x_0$, i.e. $x_0 \in \mathrm{Im}(A)$, thus $\mathrm{Im}(A)$ is closed.

Next, the property $A^\dagger = A$ is used alongside idempotence to show that $A$ projects in an orthogonal manner. First, let us write an orthogonal decomposition of $H$ with respect to $\mathrm{Im}(A)$: for all $x \in H$, let us consider $x = Ax + (x - Ax)$, then $Ax \in \mathrm{Im}(A)$ by definition. We need to show that $x - Ax$ is orthogonal to any vector of the form $A\xi$, $\xi \in H$. Using first that $A$ is self-adjoint and then that $A^2 = A$:

$$\langle A\xi, x - Ax \rangle = \langle \xi, A^\dagger(x - Ax) \rangle = \langle \xi, A(x - Ax) \rangle = \langle \xi, Ax - A^2x \rangle = \langle \xi, Ax - Ax \rangle = 0$$

thus $x - Ax \in \mathrm{Im}(A)^\perp$ and then $A = P_{\mathrm{Im}(A)}$ by the orthogonal projection theorem. □

Exercise 6.5

Let $E \subset \ell^2(\mathbb{N}, \mathbb{C})$ be the set:

$$E = \{x = (x_n)_{n \in \mathbb{N}} \in \ell^2(\mathbb{N}, \mathbb{C}) : x_0 + x_1 = 0\}$$

1) Show that $E$ is a closed vector subspace of $\ell^2(\mathbb{N}, \mathbb{C})$.

2) Provide an explicit description of $E^\perp$ and determine the orthogonal projection operator $P_E : \ell^2(\mathbb{N}, \mathbb{C}) \to E$ on $E$. Determine $\|P_E\|$ and $P_E^\dagger$.

3) Let $x = (x_n)_{n \in \mathbb{N}} \in \ell^2(\mathbb{N}, \mathbb{C})$ such that:

$$x_n = \begin{cases} 1 & \text{if } n = 0 \\ 0 & \text{otherwise} \end{cases}$$

Calculate the distance between $x$ and the subspace $E$, i.e. $\delta = \inf_{y \in E} \{\|x - y\|\}$.


4) Let $A : \ell^2(\mathbb{N},\mathbb{C}) \to \ell^2(\mathbb{N},\mathbb{C})$ be the operator defined by:
\[
(A(x_0, x_1, x_2, \ldots))_n = \begin{cases} -x_1 & \text{if } n = 0 \\ x_n & \text{otherwise} \end{cases}
\]
Determine $\|A\|$ and $A^\dagger$. (Hint: calculate $A(0, 1, 0, 0, \ldots)$.)
5) Show that $A^2 = A$ and determine $\mathrm{Im}(A)$. Is $A$ an orthogonal projector?

Solution to Exercise 6.5

1) We begin by showing that $E$ is a vector subspace of $\ell^2(\mathbb{N},\mathbb{C})$: let us consider any $\lambda \in \mathbb{C}$ and arbitrary $x, y \in E$. Then, given that the linear structure of $\ell^2(\mathbb{N},\mathbb{C})$ is defined pointwise, [6.24] tells us that:
\[
z := \lambda x + y = (\lambda x_n)_{n\in\mathbb{N}} + (y_n)_{n\in\mathbb{N}} = (\lambda x_n + y_n)_{n\in\mathbb{N}}
\]
thus $z_0 = \lambda x_0 + y_0$ and $z_1 = \lambda x_1 + y_1$, and then $z_0 + z_1 = \lambda(x_0 + x_1) + (y_0 + y_1) = \lambda \cdot 0 + 0 = 0$ since $x, y \in E$, showing that $E$ is stable with respect to the linear combinations of its elements.

We can show that $E$ is closed using a technique which is particularly useful for constraints such as $x_0 + x_1 = 0$. This approach consists of identifying the constraint with the condition defining the kernel of a continuous linear operator between normed vector spaces, which we know from Theorem 6.8 to be a closed vector subspace of the operator domain. In our case, the operator is easy to identify: the sum of the projection operators on the first and second components, that is, $A := P_0 + P_1 : \ell^2(\mathbb{N},\mathbb{C}) \to \mathbb{C}$, $A(x) := P_0(x) + P_1(x) = x_0 + x_1$, is a continuous linear operator (insofar as it is a sum of continuous linear operators) between two Hilbert spaces such that $\ker(A) = \{x \in \ell^2(\mathbb{N},\mathbb{C}) : Ax = 0\} = \{x \in \ell^2(\mathbb{N},\mathbb{C}) : x_0 + x_1 = 0\} = E$, demonstrating the closure of $E$.

2) Using the constraint $x_0 + x_1 = 0$, a sequence $x \in E$ can be written as $(x_0, -x_0, x_2, x_3, \ldots)$, provided, of course, that $x \in \ell^2(\mathbb{N},\mathbb{C})$. This implies that the canonical Hilbert basis of $\ell^2(\mathbb{N},\mathbb{C})$, that is, $e = ((1, 0, 0, \ldots), (0, 1, 0, \ldots), \ldots)$, can be used to construct a family of vectors whose closed linear span is $E$ (it is not orthonormal, since its first two vectors are proportional, but it is sufficient to characterize $E^\perp$):
\[
\tilde{e} := ((1, -1, 0, \ldots), (-1, 1, 0, \ldots), (0, 0, 1, 0, \ldots), \ldots)
\]
thus:
\[
E^\perp = \{y \in \ell^2(\mathbb{N},\mathbb{C}) : \langle y, \tilde{e}_n\rangle = 0 \ \forall n \in \mathbb{N}\}.
\]
Taking $y = (y_0, y_1, y_2, \ldots) \in \ell^2(\mathbb{N},\mathbb{C})$, then:
- $n = 0$: $\langle y, \tilde{e}_0\rangle = y_0 - y_1 + 0 + \cdots = y_0 - y_1$, null if and only if $y_0 = y_1$;
- $n = 1$: $\langle y, \tilde{e}_1\rangle = -y_0 + y_1 + 0 + \cdots = -y_0 + y_1$, null if and only if $y_0 = y_1$, as in the case where $n = 0$;


- $n = 2$: $\langle y, \tilde{e}_2\rangle = 0 + 0 + y_2 + 0 + \cdots = y_2$, null if and only if $y_2 = 0$.

Evidently, for all $n \geq 2$, $\langle y, \tilde{e}_n\rangle = y_n$, which is null if and only if $y_n = 0$. Thus, the only vectors $y \in \ell^2(\mathbb{N},\mathbb{C})$ which are orthogonal to all elements of the family $\tilde{e}$ are those of the form $y = (y_0, y_0, 0, \ldots)$, that is:
\[
E^\perp = \{(y, y, 0, 0, \ldots),\ y \in \mathbb{C}\}
\]
The orthogonal projection operator on $E$ can be determined using the projection theorem: $\ell^2(\mathbb{N},\mathbb{C}) = E \oplus E^\perp$. We decompose the arbitrary vector $z = (z_n)_{n\in\mathbb{N}} \in \ell^2(\mathbb{N},\mathbb{C})$ into a sum of two vectors, one belonging to $E$ and the other to $E^\perp$. This is done by noting that, given $z = (z_0, z_1, z_2, \ldots)$, $z \in E$ if its first two components are the opposite of one another, and $z \in E^\perp$ if its first two components are equal and all components from the third position onward are null; then:
\[
z = (z_0, z_1, z_2, z_3, \ldots) = (a, -a, z_2, z_3, \ldots) + (b, b, 0, 0, \ldots) = (a + b, b - a, z_2, z_3, \ldots)
\]
which implies the system of constraints:
\[
\begin{cases} a + b = z_0 \\ b - a = z_1 \end{cases}
\]
solved by $a = (z_0 - z_1)/2$ and $b = (z_0 + z_1)/2$, that is:
\[
(z_0, z_1, z_2, \ldots) = \left(\frac{z_0 - z_1}{2}, -\frac{z_0 - z_1}{2}, z_2, \ldots\right) + \left(\frac{z_0 + z_1}{2}, \frac{z_0 + z_1}{2}, 0, 0, \ldots\right)
\]
with the first vector in $E$ and the second in $E^\perp$; thus:
\[
P_E(z_0, z_1, z_2, \ldots) = \left(\frac{z_0 - z_1}{2}, -\frac{z_0 - z_1}{2}, z_2, \ldots\right)
\]
is the explicit expression of the orthogonal projector on $E$. Finally, without carrying out a single calculation, we can state that $P_E$ has unit norm, $\|P_E\| = 1$, given that it is a non-trivial orthogonal projector, and also that $P_E^\dagger = P_E$, as orthogonal projectors are self-adjoint.

3) Let $x = (x_n)_{n\in\mathbb{N}}$ be the element in $\ell^2(\mathbb{N},\mathbb{C})$ such that:
\[
x_n = \begin{cases} 1 & \text{if } n = 0 \\ 0 & \text{otherwise} \end{cases}
\]
Since $E$ is a closed vector subspace of $\ell^2(\mathbb{N},\mathbb{C})$, the distance between $x$ and $E$ is well defined thanks to the projection theorem. $P_E(x)$ represents the vector in $E$ which is the closest to $x$; therefore, this distance is equal to $\delta = \|x - P_E(x)\|_2$:
\[
P_E(x) = P_E((1, 0, \ldots)) = \left(\frac{1 - 0}{2}, -\frac{1 - 0}{2}, 0, \ldots\right) = \left(\frac{1}{2}, -\frac{1}{2}, 0, \ldots\right)
\]


Then:
\[
\delta = \left\|(1, 0, \ldots) - \left(\frac{1}{2}, -\frac{1}{2}, 0, \ldots\right)\right\|_2 = \left\|\left(\frac{1}{2}, \frac{1}{2}, 0, \ldots\right)\right\|_2 = \sqrt{\left(\frac{1}{2}\right)^2 + \left(\frac{1}{2}\right)^2} = \frac{1}{\sqrt{2}}
\]

4) First, we note that $x_0$ plays no part in the action of $A$, thus:
\[
(A(x_0, x_1, x_2, \ldots))_n = (A(y_0, x_1, x_2, \ldots))_n = \begin{cases} -x_1 & \text{if } n = 0 \\ x_n & \text{otherwise} \end{cases} \qquad \forall y_0 \in \mathbb{C}
\]
Notably, this holds true for $y_0 = 0$, so we can restrict the action of $A$ to the elements of $\ell^2(\mathbb{N},\mathbb{C})$ of the form $x = (0, x_1, x_2, \ldots)$ (setting $x_0 = 0$ leaves $Ax$ unchanged and can only decrease $\|x\|_2$). Using this specification, by direct calculation, we obtain:
\[
\|Ax\|_2 = \sqrt{|{-x_1}|^2 + |x_1|^2 + |x_2|^2 + \ldots} = \sqrt{2|x_1|^2 + |x_2|^2 + \ldots} \leq \sqrt{2|x_1|^2 + 2|x_2|^2 + \ldots} = \sqrt{2}\sqrt{0^2 + |x_1|^2 + |x_2|^2 + \ldots} = \sqrt{2}\,\|x\|_2
\]
With this majorization, the definition of the operator norm from equation [6.3] becomes:
\[
\|A\| = \inf\{0 < c \leq \sqrt{2} : \|Ax\|_2 \leq c\|x\|_2 \ \forall x = (0, x_1, x_2, \ldots) \in \ell^2(\mathbb{N},\mathbb{C})\}
\]
Hence $\|A\| \leq \sqrt{2}$; thus, if we can identify a unit vector $x \in \ell^2(\mathbb{N},\mathbb{C})$ for which $\|Ax\|_2 = \sqrt{2}$, then the norm of $A$ must be $\sqrt{2}$. Taking the hint given in the question, we calculate:
\[
\|A(0, 1, 0, \ldots)\|_2 = \|(-1, 1, 0, \ldots)\|_2 = \sqrt{(-1)^2 + 1^2 + 0^2 + \ldots} = \sqrt{2}
\]
and then $\|A\| = \sqrt{2}$.

Now, let us determine $A^\dagger$. For all $x, y \in \ell^2(\mathbb{N},\mathbb{C})$ (in this case, $x$ is not necessarily of the form $(0, x_1, x_2, \ldots)$) it holds that:
\[
\langle Ax, y\rangle_2 = \langle(-x_1, x_1, x_2, \ldots), (y_0, y_1, y_2, \ldots)\rangle_2 = -x_1\overline{y_0} + x_1\overline{y_1} + x_2\overline{y_2} + \cdots = x_0 \cdot 0 + x_1\overline{(y_1 - y_0)} + x_2\overline{y_2} + \cdots = \langle x, (0, y_1 - y_0, y_2, \ldots)\rangle_2 = \langle x, A^\dagger y\rangle_2
\]
and then the adjoint operator of $A$ is:
\[
A^\dagger(y) = (0, y_1 - y_0, y_2, \ldots) \qquad \forall y \in \ell^2(\mathbb{N},\mathbb{C})
\]

5) We have $A^2x = AAx = A(-x_1, x_1, x_2, \ldots) = (-x_1, x_1, x_2, \ldots) = Ax$ for all $x \in \ell^2(\mathbb{N},\mathbb{C})$, thus $A$ is idempotent. Moreover, we clearly see that $\mathrm{Im}(A) = E$, where $E$ is the subspace defined at the beginning of the exercise. Thus $A$ is a projection operator on $E$, but since $\|A\| = \sqrt{2} \neq 1$, it cannot be an orthogonal projector. $A$ is


therefore an oblique projection operator on $E$. The difference between the actions of $A$ and $P_E$ is:
\[
\begin{aligned}
Ax &= (-x_1, x_1, x_2, \ldots) && \text{oblique projector on } E \\
P_E x &= \left(\frac{x_0 - x_1}{2}, -\frac{x_0 - x_1}{2}, x_2, \ldots\right) && \text{orthogonal projector on } E
\end{aligned}
\]
$\square$
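The conclusions of Exercise 6.5 can be checked numerically on a finite truncation. The sketch below is only an illustration under an assumption that is not part of the exercise: sequences are cut to their first four components (Python lists standing in for elements of $\ell^2(\mathbb{N},\mathbb{C})$), which is harmless here because both $P_E$ and $A$ act as the identity from the third component onward. The function names `proj_E`, `oblique_A`, `inner` and `norm` are hypothetical helpers introduced for the example.

```python
import math

def proj_E(x):
    """Orthogonal projector P_E: ((x0 - x1)/2, -(x0 - x1)/2, x2, ...)."""
    a = (x[0] - x[1]) / 2
    return [a, -a] + list(x[2:])

def oblique_A(x):
    """Operator A of the exercise: (-x1, x1, x2, ...)."""
    return [-x[1], x[1]] + list(x[2:])

def inner(x, y):
    """Inner product, linear in the first argument."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

def norm(x):
    return math.sqrt(inner(x, x).real)

e0 = [1, 0, 0, 0]
e1 = [0, 1, 0, 0]

# Part 3: distance from e0 to E, i.e. ||e0 - P_E e0||.
residual = [a - b for a, b in zip(e0, proj_E(e0))]
delta = norm(residual)

# Part 4 (hint): ratio ||A e1|| / ||e1||.
gain = norm(oblique_A(e1)) / norm(e1)

# Part 5: both maps are idempotent, but only P_E is self-adjoint.
x = [1 + 2j, -3, 0.5j, 2]
y = [2, 1j, -1, 4 - 1j]
idempotent_P = proj_E(proj_E(x)) == proj_E(x)
idempotent_A = oblique_A(oblique_A(x)) == oblique_A(x)
sym_gap_P = abs(inner(proj_E(x), y) - inner(x, proj_E(y)))
sym_gap_A = abs(inner(oblique_A(x), y) - inner(x, oblique_A(y)))
```

Here `delta` reproduces $1/\sqrt{2}$ and `gain` reproduces $\sqrt{2}$, while `sym_gap_A` is non-zero for the chosen test vectors, which is the numerical counterpart of the fact that $A$ is not self-adjoint.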

6.7.1. Bounded multiplication operators and their relation to orthogonal projectors

In this section, we shall present a concrete application of the last theorem, while taking the opportunity to introduce a new category of highly useful linear operators.

DEFINITION 6.17.– Let $H = L^2(X, \mathcal{A}, \mu)$ and take $g \in L^\infty(X, \mathcal{A}, \mu)$. The multiplication operator by $g$ is defined by:
\[
M_g : L^2(X, \mathcal{A}, \mu) \longrightarrow L^2(X, \mathcal{A}, \mu), \qquad f \longmapsto M_g f = f \cdot g
\]
where $(f \cdot g)(x) = f(x)g(x)$ $\forall x \in X$ (pointwise multiplication).

Now, let us examine the properties of $M_g$.

– $M_g$ is bounded $\forall g \in L^\infty(X, \mathcal{A}, \mu)$:
\[
\|M_g f\|_2^2 = \int_X |f(x)g(x)|^2 \, d\mu(x) \leq \left(\sup_{x\in X} |g(x)|^2\right) \int_X |f(x)|^2 \, d\mu(x) = \|g\|_\infty^2 \|f\|_2^2 < +\infty
\]
thus$^5$ $\|M_g\| \leq \|g\|_\infty$, and then $M_g \in \mathcal{B}(L^2(X, \mathcal{A}, \mu))$ $\forall g \in L^\infty(X, \mathcal{A}, \mu)$.

– $\forall g, h \in L^\infty(X, \mathcal{A}, \mu)$, by the commutativity of the pointwise product, multiplication operators commute, that is, $M_g M_h = M_h M_g$, i.e. $[M_g, M_h] = 0$.

– $\ker(M_g) = \{f \in L^2(X, \mathcal{A}, \mu) : M_g(f) = f \cdot g = 0_{L^2(X, \mathcal{A}, \mu)}\}$. Defining the set $N_g = \{x \in X : g(x) = 0\}$, it is clear that $(f \cdot g)(x) = 0$ $\forall x \in N_g$. Thus, since $g(x) \neq 0$ $\forall x \in N_g^c = X \setminus N_g$, to obtain the zero function on $X$ via the product $f \cdot g$, we must simply impose the condition that $f$ must be null on $N_g^c$ (remember that $f$ is an equivalence class of functions which are equal a.e.). In short:
\[
\ker(M_g) = \{f \in L^2(X, \mathcal{A}, \mu) : f(x) = 0 \text{ for a.e. } x \in N_g^c\}
\]

– Now, let us consider the invertibility of $M_g$. For the kernel of $M_g$ to be trivial, the only element in $\ker(M_g)$ must be the equivalence class in which the identically zero function appears. This corresponds to requiring that $\mu(N_g) = 0$, since in this case $\mu(N_g^c) = \mu(X) - \mu(N_g) = \mu(X)$, thus $\ker(M_g) = \{f \in L^2(X, \mathcal{A}, \mu) : f(x) = 0 \text{ a.e.}\}$.

– If $\mu(N_g) = 0$, then there exists an inverse operator of $M_g$: $M_g^{-1} : \mathrm{Im}(M_g) \to L^2(X, \mathcal{A}, \mu)$, which can be characterized using the function $\frac{1}{g} : X \to \mathbb{K}$,
\[
\frac{1}{g}(x) = \begin{cases} \dfrac{1}{g(x)} & \text{if } g(x) \neq 0 \\ 0 & \text{otherwise.} \end{cases}
\]
By definition, $\mathrm{Im}(M_g) = \{h \in L^2(X, \mathcal{A}, \mu) : \exists f \in L^2(X, \mathcal{A}, \mu) : h = M_g(f) = f \cdot g\}$; it is thus clear that $\frac{1}{g} \cdot h = \frac{1}{g} \cdot f \cdot g = f$ almost everywhere. This simple observation allows us to characterize both $\mathrm{Im}(M_g)$ and the action of $M_g^{-1}$:
\[
\mathrm{Im}(M_g) = \left\{h \in L^2(X, \mathcal{A}, \mu) : \frac{1}{g} \cdot h \in L^2(X, \mathcal{A}, \mu)\right\} \quad \text{and} \quad M_g^{-1} = M_{\frac{1}{g}}
\]

$^5$ It is possible to show that if $(X, \mathcal{A}, \mu)$ is a measure space with a $\sigma$-finite measure, i.e. $X = \bigcup_{k\in\mathbb{N}} A_k$, where $\mu(A_k) < +\infty$, then $\|M_g\| = \|g\|_\infty$.

– Let us determine the adjoint of $M_g$: $\forall f, h \in L^2(X, \mathcal{A}, \mu)$, $g \in L^\infty(X, \mathcal{A}, \mu)$:
\[
\langle M_g^\dagger f, h\rangle = \langle f, M_g h\rangle = \langle f, gh\rangle = \int_X f(x)\,\overline{g(x)h(x)}\, d\mu(x) = \int_X \left(\overline{g(x)}f(x)\right)\overline{h(x)}\, d\mu(x) = \langle \bar{g} \cdot f, h\rangle = \langle M_{\bar{g}} f, h\rangle
\]
that is, $M_g^\dagger = M_{\bar{g}}$, by Theorem 6.10.

– Now, we calculate $M_g^2$:
\[
M_g^2 f = M_g(M_g f) = M_g(fg) = fg^2 = M_{g^2} f \qquad \forall f \in L^2(X, \mathcal{A}, \mu),\ g \in L^\infty(X, \mathcal{A}, \mu)
\]
Thus, the bounded linear operator $M_g$ is self-adjoint and idempotent if and only if $\bar{g} = g$ and $g^2 = g$. The first condition means that $g$ must be a real-valued function, but the only function with real values which is equal to its own square is a function which


only takes values of 0 and 1, that is, the indicator function of a measurable subset of $X$, which is clearly an element of $L^\infty(X, \mathcal{A}, \mu)$.

In summary, the multiplication operator $M_g$ is an orthogonal projection operator if and only if $g = \chi_E$, with $E \subseteq X$ measurable. Since $N_{\chi_E} = E^c$, $M_{\chi_E}$ is invertible if and only if $\mu(E^c) = 0$, i.e. if $E$ coincides with $X$ up to a set of measure zero.

Leaving aside invertibility, let us calculate the image of $M_{\chi_E}$: the condition which determines this subspace is $\frac{1}{\chi_E} \cdot h \in L^2(X, \mathcal{A}, \mu)$, but, by definition, $\frac{1}{\chi_E}(x) = 0$ $\forall x \in X$ such that $\chi_E(x) = 0$, that is, $\forall x \in E^c$ and, in this case, $\frac{1}{\chi_E} \cdot h \in L^2(X, \mathcal{A}, \mu)$. When $x \in E$, $\frac{1}{\chi_E}(x) = 1$, so the defining condition of $\mathrm{Im}(M_g)$ becomes $h \in L^2(E, \mathcal{A}, \mu)$.

In conclusion, for any measurable set $E \subseteq X$, the orthogonal projector and multiplication operator $M_{\chi_E}$ is, explicitly:
\[
M_{\chi_E} : L^2(X, \mathcal{A}, \mu) \longrightarrow L^2(E, \mathcal{A}, \mu), \qquad f \longmapsto M_{\chi_E} f = \begin{cases} f(x) & x \in E \\ 0 & \text{otherwise} \end{cases}
\]
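The properties of $M_g$ derived in this section can be illustrated on a toy measure space. The following sketch assumes, purely for illustration, that $X = \{0, 1, 2, 3\}$ carries a weighted counting measure with weights `mu`, so that functions are lists of four complex numbers and the integral is a finite weighted sum; the helper names `inner`, `norm` and `multiply` are hypothetical and not taken from the text.

```python
import math

def inner(f, h, mu):
    """<f, h> = sum_x f(x) * conj(h(x)) * mu({x}), linear in the first argument."""
    return sum(a * complex(b).conjugate() * w for a, b, w in zip(f, h, mu))

def norm(f, mu):
    return math.sqrt(inner(f, f, mu).real)

def multiply(g, f):
    """Action of M_g: pointwise product f * g."""
    return [a * b for a, b in zip(f, g)]

mu = [0.5, 1.0, 2.0, 0.25]      # weights of the points of X = {0, 1, 2, 3}
g = [1 + 1j, -2j, 0.5, 3]       # a bounded function on X
chi = [1, 0, 1, 0]              # indicator function of E = {0, 2}
f = [2, 1 - 1j, -1j, 4]
h = [1j, 3, 1 + 2j, -1]

g_bar = [complex(v).conjugate() for v in g]
sup_g = max(abs(v) for v in g)

# Boundedness: ||M_g f|| <= ||g||_inf * ||f||.
bound_holds = norm(multiply(g, f), mu) <= sup_g * norm(f, mu)

# Adjoint: <M_g f, h> = <f, M_{conj(g)} h>.
adjoint_gap = abs(inner(multiply(g, f), h, mu) - inner(f, multiply(g_bar, h), mu))

# M_chi is idempotent and self-adjoint; a generic M_g is not idempotent.
chi_idempotent = multiply(chi, multiply(chi, f)) == multiply(chi, f)
chi_symmetric_gap = abs(inner(multiply(chi, f), h, mu) - inner(f, multiply(chi, h), mu))
g_idempotent = multiply(g, multiply(g, f)) == multiply(g, f)
```

In this discrete setting, multiplying by `chi` simply zeroes the components outside $E$, which is the finite analogue of the explicit form of $M_{\chi_E}$ given above.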

6.7.2. Geometric realization of orthogonal projection operators via orthonormal systems

We now have the means of proving another important analogy between Hilbert spaces and finite-dimensional Euclidean spaces, related to the geometric realization of orthogonal projectors on a vector subspace generated by an orthogonal family, which we have already discussed in Chapter 1. We recall that the orthogonal projector of an inner product vector space $V$ of finite dimension $n$ on a vector subspace $S$ of dimension $s$ can be written as:
\[
P_S(x) = \sum_{i=1}^{s} \langle x, u_i\rangle u_i
\]
where $(u_i)_{i=1}^{s}$ is any orthonormal basis of $S$. In a Hilbert space, we have the following result.

THEOREM 6.32.– Take $A \in \mathcal{B}(H)$, $A \neq 0$. The following statements are equivalent:
1) $A$ is an orthogonal projector;


2) there exists an orthonormal system$^6$ $(u_n)_{n\in\mathbb{N}}$ in $H$ such that:
\[
Ax = \sum_{n\in\mathbb{N}} \langle x, u_n\rangle u_n \qquad \forall x \in H
\]
Where applicable, $A$ projects onto the closed subspace $\overline{\mathrm{span}}(u_n, n \in \mathbb{N})$.

PROOF.– 1) $\implies$ 2): First, we note that an orthogonal projector $A$ is surjective if and only if it is the identity operator. The condition $\mathrm{Im}(A) = H$ implies, by properties 7 and 8 of orthogonal projectors, that $\mathrm{Im}(A)^\perp = \ker(A) = \{0_H\}$; thus, by property 9, it holds that $Ax = x$ $\forall x \in H$, i.e. $A = \mathrm{id}_H$. In this case, any complete orthonormal system $(u_n)_{n\in\mathbb{N}}$ in $H$ realizes $A$ since, on the one hand, $Ax = x$, and on the other hand, by Theorem 5.11 regarding the characterization of complete orthonormal systems, we can write $x = \sum_{n\in\mathbb{N}} \langle x, u_n\rangle u_n$. Given that a complete orthonormal system is a special instance of an orthonormal system, the implication 1) $\implies$ 2) when $\mathrm{Im}(A) = H$ is true.

Now, let $A$ be an orthogonal projector such that $\mathrm{Im}(A) \subsetneq H$, that is, $\mathrm{Im}(A)$ is a closed vector subspace (by definition of an orthogonal projector) and proper in $H$; thus, it is a Hilbert space itself, properly included in $H$. Let $(u_n)_{n\in\mathbb{N}}$ be any complete orthonormal system in $\mathrm{Im}(A)$. Given our hypotheses, $(u_n)_{n\in\mathbb{N}}$ is only an orthonormal system (and not, generally, a complete orthonormal system) of $H$. For all $y \in \mathrm{Im}(A)$, we have the following decomposition:
\[
y = \sum_{n\in\mathbb{N}} \langle y, u_n\rangle u_n
\]
Moreover, $\mathrm{Im}(A) = \{Ax, x \in H\}$, so, using the fact that $A$, as an orthogonal projector, is self-adjoint:
\[
Ax = \sum_{n\in\mathbb{N}} \langle Ax, u_n\rangle u_n \underset{(A\ \text{s.a.})}{=} \sum_{n\in\mathbb{N}} \langle x, Au_n\rangle u_n, \qquad \forall x \in H
\]
Since $A$ is the identity on $\mathrm{Im}(A)$ and $u_n \in \mathrm{Im}(A)$ $\forall n \in \mathbb{N}$, then $Au_n = u_n$, hence:
\[
Ax = \sum_{n\in\mathbb{N}} \langle x, u_n\rangle u_n, \qquad \forall x \in H
\]
that is, the orthogonal projector $A$ is realized on the orthonormal system $(u_n)_{n\in\mathbb{N}}$ of $H$ as described in point 2.

$^6$ Note that although we write $(u_n)_{n\in\mathbb{N}}$, the orthonormal system may be finite, i.e. it may include a finite number of $u_n \neq 0$.


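2) $\implies$ 1): let $(u_n)_{n\in\mathbb{N}}$ be an orthonormal system of $H$ such that, for any pair $x, y \in H$:
\[
Ax = \sum_{m\in\mathbb{N}} \langle x, u_m\rangle u_m, \qquad Ay = \sum_{n\in\mathbb{N}} \langle y, u_n\rangle u_n
\]
then, by the continuity of the inner product:
\[
\langle x, Ay\rangle = \Big\langle x, \sum_{n\in\mathbb{N}} \langle y, u_n\rangle u_n\Big\rangle = \sum_{n\in\mathbb{N}} \overline{\langle y, u_n\rangle}\,\langle x, u_n\rangle = \sum_{n\in\mathbb{N}} \langle x, u_n\rangle\langle u_n, y\rangle
\]
Again, using the continuity of the inner product, as we saw when proving Parseval's identity (Theorem 5.11):
\[
\begin{aligned}
\langle x, A^\dagger Ay\rangle = \langle Ax, Ay\rangle &= \Big\langle \sum_{m\in\mathbb{N}} \langle x, u_m\rangle u_m, \sum_{n\in\mathbb{N}} \langle y, u_n\rangle u_n\Big\rangle \\
&= \sum_{m\in\mathbb{N}}\sum_{n\in\mathbb{N}} \langle x, u_m\rangle\overline{\langle y, u_n\rangle}\langle u_m, u_n\rangle \\
&= \sum_{m\in\mathbb{N}}\sum_{n\in\mathbb{N}} \langle x, u_m\rangle\langle u_n, y\rangle\,\delta_{n,m} = \sum_{n\in\mathbb{N}} \langle x, u_n\rangle\langle u_n, y\rangle
\end{aligned}
\]
so $\langle x, Ay\rangle = \langle x, A^\dagger Ay\rangle$ $\forall x, y \in H$, that is, $A = A^\dagger A$ by Theorem 6.10. Using the algebraic characterization of orthogonal projectors, Theorem 6.31, we can therefore state that $A$ is an orthogonal projector.

Supposing that 1 and 2 are verified, then:

1) $\implies$ $\mathrm{Im}(A) = \ker(A)^\perp$ and $\ker(A) = \mathrm{Im}(A)^\perp$;

2) $\implies$ $\mathrm{Im}(A) \subseteq \overline{\mathrm{span}}(u_n, n \in \mathbb{N})$, which is obvious, and $\ker(A) \subseteq \left(\overline{\mathrm{span}}(u_n, n \in \mathbb{N})\right)^\perp$, which is not quite so obvious. For $x = 0_H$ this is true; taking $x \in \ker(A)$, $x \neq 0_H$, then $Ax = 0 = \sum_{n\in\mathbb{N}} \langle x, u_n\rangle u_n = \lim_{N\to+\infty} \sum_{n=1}^{N} \langle x, u_n\rangle u_n$. Taking the inner product of this identity with each $u_k$ and using the continuity of the inner product together with the orthonormality of the $u_n$, we obtain $\langle x, u_k\rangle = 0$ for all $k \in \mathbb{N}$, that is, $x \in (u_n, n \in \mathbb{N})^\perp = (\mathrm{span}(u_n, n \in \mathbb{N}))^\perp = \left(\overline{\mathrm{span}}(u_n, n \in \mathbb{N})\right)^\perp$, by the properties of the orthogonal complement.

To summarize: on one side, $\mathrm{Im}(A) \subseteq \overline{\mathrm{span}}(u_n, n \in \mathbb{N})$, while, on the other side, $\ker(A) \subseteq \left(\overline{\mathrm{span}}(u_n, n \in \mathbb{N})\right)^\perp$, thus $\overline{\mathrm{span}}(u_n, n \in \mathbb{N}) \subseteq \ker(A)^\perp = \mathrm{Im}(A)$, that is, $\mathrm{Im}(A) = \overline{\mathrm{span}}(u_n, n \in \mathbb{N})$. $\square$

REMARK.– We see from the proof of this theorem that any complete orthonormal system $(u_n)_{n\in\mathbb{N}}$ of $\mathrm{Im}(A)$ can be used to realize a projector in the sense defined by the theorem.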


This means that, although each term in the summation may be different, the overall action of the operator will be the same for any complete orthonormal system $(u_n)_{n\in\mathbb{N}}$ of $\mathrm{Im}(A)$.

A remarkable application of this result is shown in Exercise 6.6, which illustrates the way in which the best linear approximation of a parabola on a real interval may be found using the theory of orthogonal projection in a Hilbert space.

Exercise 6.6

Let $f(x) \equiv 1$ (the constant function equal to 1) and $g(x) = x$, the identity function, seen as two elements of $L^2[0, 1]$. Calculate:
1) the angle $\vartheta$ between $f$ and $g$;
2) their distance in $L^2[0, 1]$;
3) the projection $P_W h$ of the function $h \in L^2[0, 1]$, $h(x) = x^2$, on the vector subspace $W = \mathrm{span}(f, g)$. Interpret your findings.

Solution to Exercise 6.6

1) The angle between $f$ and $g$ is obtained using the definition of inner product: $\langle f, g\rangle = \|f\|\,\|g\|\cos(\vartheta)$, so we need to calculate $\langle f, g\rangle$, $\|f\|$, $\|g\|$:
\[
\langle f, g\rangle = \int_0^1 x\,dx = \left[\frac{x^2}{2}\right]_0^1 = \frac{1}{2}, \qquad \|f\| = \left(\int_0^1 dx\right)^{1/2} = 1, \qquad \|g\| = \left(\int_0^1 x^2\,dx\right)^{1/2} = \left(\left[\frac{x^3}{3}\right]_0^1\right)^{1/2} = \frac{1}{\sqrt{3}}
\]
In conclusion, $\cos(\vartheta) = \dfrac{\langle f, g\rangle}{\|f\|\,\|g\|} = \dfrac{\sqrt{3}}{2}$, thus $\vartheta = \dfrac{\pi}{6}$.

2) Distance:
\[
d(f, g) = \|f - g\| = \left(\int_0^1 (f(x) - g(x))^2\,dx\right)^{1/2} = \left(\int_0^1 (1 - x)^2\,dx\right)^{1/2} = \left(\left[-\frac{(1 - x)^3}{3}\right]_0^1\right)^{1/2} = \frac{1}{\sqrt{3}} = \frac{\sqrt{3}}{3}
\]

3) Projection on $W$: we use the characterization of projection given by the previous theorem. We need to construct an orthonormal basis of $W$, which can be done by using the Gram-Schmidt procedure. A wise choice is to begin with the function $f$, which is a generator of $W$ and, furthermore, has unit norm. The second (and final) element in the orthonormal basis of $W$ is then:
\[
\tilde{g}(x) = \frac{g(x) - \langle f, g\rangle f(x)}{\|g - \langle f, g\rangle f\|} = \frac{x - \frac{1}{2}}{\left\|x - \frac{1}{2}\right\|},
\]


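with
\[
\left\|x - \frac{1}{2}\right\| = \left(\int_0^1 \left(x - \frac{1}{2}\right)^2 dx\right)^{1/2} = \left(\left[\frac{1}{3}\left(x - \frac{1}{2}\right)^3\right]_0^1\right)^{1/2} = \frac{1}{2\sqrt{3}}
\]
so $\tilde{g}(x) = 2\sqrt{3}\left(x - \frac{1}{2}\right)$ and the desired orthonormal basis is $B = (1, \sqrt{3}(2x - 1))$. The orthogonal projection of $h(x) = x^2$ on $W$ is thus:
\[
P_W h = \langle h, f\rangle f + \langle h, \tilde{g}\rangle \tilde{g}
\]
By direct calculation, $\langle h, f\rangle = \frac{1}{3}$ and $\langle h, \tilde{g}\rangle = \frac{\sqrt{3}}{6}$, hence:
\[
P_W h(x) = \frac{1}{3} + \frac{\sqrt{3}}{6}\sqrt{3}\,(2x - 1) = x - \frac{1}{6}
\]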

The interpretation of this result is as follows. The functions $f : [0, 1] \ni x \mapsto 1$ and $g : [0, 1] \ni x \mapsto x$ are the generators of the space $W$ of linear functions (straight lines) defined on the interval $[0, 1]$. In fact, any linear function $\ell : [0, 1] \to \mathbb{K}$ may be written as $\ell(x) = \alpha + \beta x$, $x \in [0, 1]$, with $\alpha, \beta \in \mathbb{K}$; since $\alpha + \beta x = \alpha f(x) + \beta g(x)$ $\forall x \in [0, 1]$, then $\ell = \alpha f + \beta g$.

The function $h : [0, 1] \ni x \mapsto x^2$ is a parabola defined on the same interval. By definition of orthogonal projection:
\[
P_W h = \arg\min_{w\in W} \|h - w\|
\]
that is, $P_W h$ is the element in $W$ which minimizes the $L^2$ distance between $h$ and the straight lines. So, the straight line with equation $y = x - \frac{1}{6}$ is the best approximation of the parabola with equation $y = x^2$, in the sense of the $L^2$ norm, on the interval $[0, 1]$. $\square$

Figure 6.1 shows a graphical representation of this approximation.

A list of properties of orthogonal projectors follows (for the proofs of these properties, see, for example, Abbati and Cirelli 1997). For all $A, B \in \mathcal{B}(H)$, we recall that:
\[
[A, B] = AB - BA
\]
$[A, B]$ is said to be the commutator of $A$ and $B$. If $[A, B] = 0$, the zero operator, that is, $AB = BA$, then $A$ and $B$ are said to commute.

Let $R$ and $S$ be two closed vector subspaces in the Hilbert space $H$ and let $P_R$, $P_S$ be the orthogonal projectors on $R$ and $S$, respectively.


Figure 6.1. The line of equation $y = x - \frac{1}{6}$ (shown in blue) is the best approximation of the parabola with equation $y = x^2$ (in red) with respect to the Hilbert norm of $L^2[0, 1]$. For a color version of this figure, see www.iste.co.uk/provenzi/spaces.zip
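The values obtained in Exercise 6.6 can be reproduced with exact rational arithmetic. The sketch below is an illustration under stated assumptions rather than the method of the solution: polynomials are stored as coefficient lists, the $L^2[0,1]$ inner product is evaluated through $\int_0^1 x^k\,dx = 1/(k+1)$, and the Gram-Schmidt step is carried out without normalization (which avoids square roots and yields the same projection). The helpers `poly_inner` and `axpy` are hypothetical names.

```python
from fractions import Fraction

def poly_inner(p, q):
    """L2[0,1] inner product of real polynomials given as coefficient
    lists [c0, c1, ...], using int_0^1 x^k dx = 1/(k+1)."""
    return sum(Fraction(a) * Fraction(b) / (i + j + 1)
               for i, a in enumerate(p) for j, b in enumerate(q))

def axpy(alpha, p, q):
    """Return the coefficient list of alpha*p + q."""
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return [alpha * a + b for a, b in zip(p, q)]

f = [1]          # f(x) = 1
g = [0, 1]       # g(x) = x
h = [0, 0, 1]    # h(x) = x^2

# Gram-Schmidt step without normalization: g_perp is orthogonal to f.
g_perp = axpy(-poly_inner(g, f) / poly_inner(f, f), f, g)   # x - 1/2

# Projection of h onto W = span(f, g_perp) = span(f, g).
c_f = poly_inner(h, f) / poly_inner(f, f)
c_g = poly_inner(h, g_perp) / poly_inner(g_perp, g_perp)
proj = axpy(c_g, g_perp, [c_f])                             # c_f*f + c_g*g_perp

# Squared cosine of the angle and squared distance between f and g.
cos_squared = poly_inner(f, g) ** 2 / (poly_inner(f, f) * poly_inner(g, g))
diff = axpy(-1, g, f)                                       # 1 - x
dist_squared = poly_inner(diff, diff)
```

The list `proj` holds the coefficients of $P_W h$ in increasing degree, so it encodes the line $y = x - \frac{1}{6}$; `cos_squared` and `dist_squared` equal $\frac{3}{4}$ and $\frac{1}{3}$, consistent with $\vartheta = \frac{\pi}{6}$ and $d(f, g) = \frac{1}{\sqrt{3}}$.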

THEOREM 6.33 (Sum of orthogonal projectors).– The following statements are equivalent:
1) $P_R + P_S$ is an orthogonal projector;
2) $P_R P_S = P_S P_R = 0$;
3) $P_R(x) = 0$ $\forall x \in S$ and $P_S(x) = 0$ $\forall x \in R$;
4) $R \perp S$.
Moreover, if $P_R + P_S$ is an orthogonal projector, then it projects on $R + S$.

THEOREM 6.34 (Product of orthogonal projectors).– The following statements are equivalent:
1) $P_R P_S$ is an orthogonal projector;
2) $P_S P_R$ is an orthogonal projector;
3) $[P_R, P_S] = 0$;
4) $R = (R \cap S) \oplus (R \cap S^\perp)$;


5) $S = (R \cap S) \oplus (R^\perp \cap S)$.
If $P_R P_S$ and $P_S P_R$ are orthogonal projectors, then they project on $R \cap S$.

THEOREM 6.35 (Difference of orthogonal projectors).– The following statements are equivalent:
1) $P_R - P_S$ is an orthogonal projector;
2) $P_R P_S = P_S P_R = P_S$;
3) $R \cap S = S$, i.e. $S \subseteq R$.
If $P_R - P_S$ is an orthogonal projector, then it projects on $R \cap S^\perp$.

THEOREM 6.36 (Mixing projector sum, difference and product).– If $[P_S, P_R] = 0$, then $P_R + P_S - P_R P_S$ is an orthogonal projector which projects on $\overline{\mathrm{span}}(R \cup S)$.

6.8. Isometric and unitary operators

In this section, we shall determine the properties of isometric and unitary operators in a Hilbert space of infinite dimension, and provide an algebraic and geometric characterization of these operators. Once again, the adjoint operator plays a fundamental role in the algebraic characterization, while orthonormal systems and Hilbert bases are crucial for the geometric characterization.

In finite-dimensional inner product spaces $V$, a linear operator which preserves the inner product, that is, $A : V \to V$, $\langle Ax, Ay\rangle = \langle x, y\rangle$ $\forall x, y \in V$, also preserves the norm of the vectors (simply by considering $x = y$), that is, $\|Ax\| = \|x\|$ $\forall x \in V$. To prove the converse, it is sufficient to consider the polarization formula$^7$ [1.7], $\forall x, y \in V$:
\[
\langle x, y\rangle = \frac{1}{4}\left(\|x + y\|^2 - \|x - y\|^2 + i\|x + iy\|^2 - i\|x - iy\|^2\right)
\]
If we replace $x, y$ with $Ax, Ay$ and use the linearity of $A$:
\[
\langle Ax, Ay\rangle = \frac{1}{4}\left(\|A(x + y)\|^2 - \|A(x - y)\|^2 + i\|A(x + iy)\|^2 - i\|A(x - iy)\|^2\right)
\]

$^7$ The complex case is considered here; the real case is even simpler.


Assuming that $A$ preserves the norm, we have:
\[
\langle Ax, Ay\rangle = \frac{1}{4}\left(\|x + y\|^2 - \|x - y\|^2 + i\|x + iy\|^2 - i\|x - iy\|^2\right) = \langle x, y\rangle
\]
As we know, the norm canonically generates a metric via $d(x, y) = \|x - y\|$. For this reason, an operator which preserves the inner product or the norm is said to be isometric.

The only vector which has a norm of zero is the vector $0_V$, thus an isometric operator $A$ never transforms a non-zero vector (whose norm is $> 0$) into the null vector, that is, $\ker(A) = \{0_V\}$. Hence, $\dim(\ker(A)) = 0$ and then, by the rank theorem, $\dim(\mathrm{Im}(A)) = \dim(V)$. In other words, an isometric endomorphism in finite dimensions is automatically surjective.

In an infinite-dimensional Hilbert space, it is still true that a bounded linear operator preserves the inner product if and only if it preserves the norm. However, the statement that an isometric operator $A : H \to H$ is always surjective is no longer true. One counter-example is provided by the operator $A \in \mathcal{B}(H)$ defined by $Au_n = u_{2n}$, where $(u_n)_{n\in\mathbb{N}}$ is an arbitrary Hilbert basis of $H$. Evidently, $A$ is isometric, but, as we will see in Theorem 6.39, $\mathrm{Im}(A) = \overline{\mathrm{span}}(u_k, k \in \mathbb{N}, k \text{ even}) \subsetneq H$; the inclusion is strict, since $(u_k, k \in \mathbb{N}, k \text{ even})$ is not a complete orthonormal system, as it is a proper subset of $(u_n)_{n\in\mathbb{N}}$. These considerations lead to Definition 6.18.

DEFINITION 6.18.– The operator $A : H \to H$ is said to be:
– isometric, if $\langle Ax, Ay\rangle = \langle x, y\rangle$ $\forall x, y \in H$, or, in an equivalent manner, if $\|Ax\| = \|x\|$ $\forall x \in H$;
– unitary, if $A$ is isometric and surjective.

Let us calculate the norm of an isometric operator:
\[
\|A\| = \sup_{\|x\|=1} \|Ax\| = \sup_{\|x\|=1} \|x\| = 1
\]

Since a unitary operator is also isometric, the norm of isometric and unitary operators is 1.

BASIC EXAMPLES OF UNITARY OPERATORS.– Let us consider $\mathbb{R}^n$ with the Borel $\sigma$-algebra and the Lebesgue measure. Given a fixed element $a \in \mathbb{R}^n$, any translation operator:
\[
T_a : L^2(\mathbb{R}^n) \longrightarrow L^2(\mathbb{R}^n), \qquad f \longmapsto T_a f, \quad \text{where } T_a f(x) = f(x - a), \ \forall x \in \mathbb{R}^n
\]


is unitary. In fact, we know that it is well defined, linear and isometric due to the shift invariance of the Lebesgue measure. It is also surjective, since, for any element $g \in L^2(\mathbb{R}^n)$, we simply need to consider $f \in L^2(\mathbb{R}^n)$, $f(x) = g(x + a)$ $\forall x \in \mathbb{R}^n$, to obtain $T_a f = g$.

Now, let $R \in O(n)$ be a rotation matrix of $\mathbb{R}^n$, where $O(n)$ is the orthogonal group of dimension $n$, that is, the group of square matrices $R$ of dimension $n$ which are orthogonal, i.e. such that $R^t = R^{-1}$. Any rotation operator:
\[
T_R : L^2(\mathbb{R}^n) \longrightarrow L^2(\mathbb{R}^n), \qquad f \longmapsto T_R f, \quad \text{where } T_R f(x) = f(Rx), \ \forall x \in \mathbb{R}^n
\]
is unitary. It is isometric due to the fact that the Jacobian of the transformation, that is, the determinant of $R$, has an absolute value of 1 and thus the integrals used to calculate the norm of $T_R f$ and of $f$ are equal. It is also surjective, since for any element $g \in L^2(\mathbb{R}^n)$, we simply need to consider $f \in L^2(\mathbb{R}^n)$, $f(x) = g(R^t x)$ $\forall x \in \mathbb{R}^n$, to obtain $T_R f = g$.

A special case of the rotation operator is obtained with the negative identity matrix $P = -I$, for which $T_P f = f_P$, with $f_P(x) = f(-x)$. $T_P$ is known as the parity operator.

6.8.1. Characterizations of isometric and unitary operators

The following results establish a useful characterization of isometric and unitary operators.

THEOREM 6.37 (Algebraic characterization of isometric operators).– $A \in \mathcal{B}(H)$ is an isometric operator if and only if $A^\dagger A = \mathrm{id}_H$.

PROOF.– Let $A$ be isometric, then $\forall x \in H$:
\[
\langle A^\dagger Ax, x\rangle = \langle Ax, Ax\rangle = \|Ax\|^2 \underset{A\ \text{isometric}}{=} \|x\|^2 = \langle x, x\rangle
\]
that is, $A^\dagger A = \mathrm{id}_H$. Conversely, if $A^\dagger A = \mathrm{id}_H$, then $\forall x, y \in H$:
\[
\langle Ax, Ay\rangle = \langle A^\dagger Ax, y\rangle = \langle x, y\rangle
\]
that is, $A$ conserves the inner product, and thus it is isometric. $\square$

The following result is particularly useful in optimization theory and in quantum mechanics.


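THEOREM 6.38.– Let $A \in \mathcal{B}(H)$ be isometric, then $AA^\dagger$ is an orthogonal projector.

PROOF.– We will use the algebraic characterization of orthogonal projectors: since we already know that $AA^\dagger$ is self-adjoint for all $A \in \mathcal{B}(H)$, we simply need to verify that $AA^\dagger$ is bounded and idempotent.

1) $AA^\dagger$ is bounded: $\forall x \in H$ it holds that:
\[
\|AA^\dagger x\| = \|A(A^\dagger x)\| \underset{A\ \text{isometric}}{=} \|A^\dagger x\| \leq \|A^\dagger\|\,\|x\| = \|A\|\,\|x\| \underset{\|A\|=1}{=} \|x\|
\]

2) $AA^\dagger$ is idempotent:
\[
(AA^\dagger)^2 = AA^\dagger(AA^\dagger) = A(A^\dagger A)A^\dagger \underset{A^\dagger A = \mathrm{id}_H}{=} AA^\dagger.
\]
$\square$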
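Theorems 6.37 and 6.38 can be illustrated with a finite model of the operator $Au_n = u_{2n}$. The sketch below rests on an assumption that departs from the text: since a non-surjective isometry from a finite-dimensional space to itself does not exist, the truncation maps $\mathbb{C}^3$ into $\mathbb{C}^6$ (zero-based indices, so that the coordinate of index $n$ is moved to index $2n$). The names `A`, `A_dagger`, `inner` and `P` are hypothetical helpers.

```python
def A(x):
    """Truncated model of A u_n = u_{2n}: spreads x over the even slots."""
    y = [0] * (2 * len(x))
    for n, value in enumerate(x):
        y[2 * n] = value
    return y

def A_dagger(y):
    """Adjoint of A: keeps the even-indexed coordinates."""
    return list(y[0::2])

def inner(x, y):
    """Inner product, linear in the first argument."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

def P(v):
    """The operator A A^dagger on the larger space."""
    return A(A_dagger(v))

x = [1 + 1j, -2, 3j]
y = [1, 2j, -1, 0.5, 4, 1 - 1j]

# <A x, y> = <x, A^dagger y>, so A_dagger is indeed the adjoint of A.
adjoint_gap = abs(inner(A(x), y) - inner(x, A_dagger(y)))

# A^dagger A = id (Theorem 6.37) and A preserves the norm.
left_inverse_ok = A_dagger(A(x)) == x
norm_preserved = inner(A(x), A(x)) == inner(x, x)

# A A^dagger is idempotent (Theorem 6.38) but is not the identity.
Py = P(y)
idempotent = P(Py) == Py
is_identity = Py == y
```

In this model `P` zeroes the odd-indexed coordinates, which mirrors the fact that $AA^\dagger$ projects onto the closed span of the even-indexed basis vectors.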

$AA^\dagger$ projects onto its image, which can be characterized as follows.

THEOREM 6.39.– Let $A \in \mathcal{B}(H)$ be isometric. The image of the orthogonal projector $AA^\dagger$ is $\mathrm{Im}(A)$, so the image of an isometric $A \in \mathcal{B}(H)$ is a closed vector subspace of $H$.

PROOF.– We wish to show that $\mathrm{Im}(AA^\dagger) = \mathrm{Im}(A)$. We begin by observing that in general, $\forall A, B \in \mathcal{B}(H)$, it holds that $\mathrm{Im}(AB) \subseteq \mathrm{Im}(A)$, with $\mathrm{Im}(AB) = \mathrm{Im}(A)$ if $B$ is surjective:
\[
\mathrm{Im}(A) = \{y \in H : \exists x \in H : Ax = y\}, \qquad \mathrm{Im}(AB) = \{z \in H : \exists x \in \mathrm{Im}(B) : Ax = z\}
\]
Taking $B = A^\dagger$, then $\mathrm{Im}(AA^\dagger) \subseteq \mathrm{Im}(A)$ $\forall A \in \mathcal{B}(H)$. Now, let $A$ be isometric, that is, $A^\dagger A = \mathrm{id}_H$, and $y \in \mathrm{Im}(A)$; then $\exists x \in H$ such that:
\[
y = Ax = A(A^\dagger A)x = AA^\dagger(Ax)
\]
that is, $y \in \mathrm{Im}(AA^\dagger)$, hence $\mathrm{Im}(A) \subseteq \mathrm{Im}(AA^\dagger)$. Thus $\mathrm{Im}(AA^\dagger) = \mathrm{Im}(A)$. Since $AA^\dagger$ is an orthogonal projector, we know that its image is closed; hence, $\mathrm{Im}(A)$ is closed for any isometric operator. $\square$

The fact that an isometric operator has a closed image can be shown directly, using a proof very similar to that of Theorem 6.13.

If $A$ is unitary, that is, isometric and surjective, then $\mathrm{Im}(AA^\dagger) = \mathrm{Im}(A) = H$ and then $AA^\dagger = \mathrm{id}_H$.

The fact that $\ker(A^\dagger) = \mathrm{Im}(A)^\perp$ immediately gives us sufficient conditions to guarantee the invertibility or non-invertibility of the adjoint of an operator in $\mathcal{B}(H)$.


THEOREM 6.40.– Taking $A \in \mathcal{B}(H)$:
– if $A$ is isometric and not surjective, then $\ker(A^\dagger) = \mathrm{Im}(A)^\perp \neq \{0_H\}$, i.e. $A^\dagger$ is not invertible;
– if $A$ is unitary, then $\ker(A^\dagger) = H^\perp = \{0\}$, i.e. $A^\dagger$ is invertible.

Now, let us apply these results to the case of the operator $Au_n = u_{2n}$, where $(u_n)_{n\in\mathbb{N}}$ is a complete orthonormal system in $H$. As noted before, since $\|Au_n\| = \|u_{2n}\| = 1 = \|u_n\|$, $A$ is isometric, but it is not unitary, as $\mathrm{Im}(A) = \overline{\mathrm{span}}(u_k, k \text{ even}) \subsetneq H$. Let us determine $A^\dagger$: $\langle Au_n, u_m\rangle = \langle u_n, A^\dagger u_m\rangle$; furthermore, $\langle Au_n, u_m\rangle = \langle u_{2n}, u_m\rangle = \delta_{2n,m}$, then we can write $\langle u_n, A^\dagger u_m\rangle = \delta_{2n,m}$, that is:
\[
A^\dagger u_m = \begin{cases} u_{m/2} & \text{if } m = 2n \\ 0 & \text{if } m \neq 2n \end{cases}
\]
We see that $\ker(A^\dagger) = \overline{\mathrm{span}}(u_m, m \text{ odd}) = \mathrm{Im}(A)^\perp$, confirming our results.

The following theorem gives a complete algebraic characterization of unitary operators.

THEOREM 6.41 (Algebraic characterization of unitary operators).– Taking $A \in \mathcal{B}(H)$, the following statements are equivalent:
1) $A$ is unitary;
2) $A$ is invertible and $A^{-1} = A^\dagger \in \mathcal{B}(H)$;
3) $A^\dagger A = AA^\dagger = \mathrm{id}_H$;
4) $A^\dagger$ is unitary.

PROOF.– 1) $\implies$ 2): We know that if $A$ is unitary, then $A$ is injective, that is, invertible on its image; by definition, $A$ is surjective; therefore, it is bijective and invertible on all $H$. To show that $A^{-1} = A^\dagger$, we write:
\[
\langle A^\dagger Ax, y\rangle = \langle Ax, Ay\rangle = \langle x, y\rangle, \qquad \forall x, y \in H
\]
hence $A^\dagger A = \mathrm{id}_H$ (showing that $A^\dagger$ is the left inverse of $A$) and then:
\[
A^\dagger = A^\dagger(AA^{-1}) = (A^\dagger A)A^{-1} = \mathrm{id}_H A^{-1} = A^{-1}
\]


that is, $A^\dagger = A^{-1}$. Now, we need only to prove that $A^\dagger = A^{-1}$ is bounded. Since $A$ is surjective, for every $x \in H$ there exists $y \in H$ such that $x = Ay$ and, by unitarity, $\|x\| = \|Ay\| = \|y\|$, that is, $\|x\| = \|y\|$, and then:
\[
\|A^\dagger x\| = \|A^{-1}x\| = \|A^{-1}Ay\| = \|y\| = \|x\| \qquad \forall x \in H
\]
which implies that $\|A^{-1}\| = \|A^\dagger\| = \sup_{\|x\|=1} \|x\| = 1$.

2) $\implies$ 3): $A^\dagger = A^{-1} \implies A^\dagger A = A^{-1}A = \mathrm{id}_H$ and $A^\dagger = A^{-1} \implies AA^\dagger = AA^{-1} = \mathrm{id}_H$.

3) $\implies$ 4): From $AA^\dagger = \mathrm{id}_H$, we obtain $\langle x, AA^\dagger y\rangle = \langle x, y\rangle$ $\forall x, y \in H$; furthermore, $\langle x, AA^\dagger y\rangle = \langle A^\dagger x, A^\dagger y\rangle$, thus $\langle A^\dagger x, A^\dagger y\rangle = \langle x, y\rangle$ $\forall x, y \in H$, that is, $A^\dagger$ is isometric. Now, we only need to prove that $A^\dagger$ is surjective. We do this using the other identity, $A^\dagger A = \mathrm{id}_H$, which implies:
\[
A^\dagger(Ay) = (A^\dagger A)y = \mathrm{id}_H(y) = y, \qquad \forall y \in H
\]
that is, $\forall y \in H$ $\exists \xi = Ay \in H$ such that $A^\dagger\xi = y$, i.e. $A^\dagger$ is surjective.

4) $\implies$ 1): As we have seen through the chain 1) $\implies$ 2) $\implies$ 3) $\implies$ 4), the adjoint of an arbitrary unitary operator is also unitary. Using the hypothesis that $A^\dagger$ is a unitary operator, $A^{\dagger\dagger}$ is then unitary and, since $A^{\dagger\dagger} = A$, unitary $A^\dagger$ implies unitary $A$. $\square$

One consequence of this result is that we can study the unitarity of an operator $A$ by considering that of its adjoint, which can be simpler. Corollary 6.5 shows that the norm of a unitary operator is invariant with respect to adjunction and inversion.

COROLLARY 6.5.– If $A \in \mathcal{B}(H)$ is unitary, then $\|A\| = \|A^\dagger\| = \|A^{-1}\| = 1$.

Let $\mathcal{U}(H)$ be the set of unitary operators on a Hilbert space $H$. If $A, B \in \mathcal{U}(H)$, then, by direct calculation, we can verify that $AB \in \mathcal{U}(H)$. The theorem proved above tells us that if $A \in \mathcal{U}(H)$ then $A^{-1} = A^\dagger \in \mathcal{U}(H)$, that is, $\mathcal{U}(H)$ verifies the group axioms with respect to composition.

DEFINITION 6.19.– $\mathcal{U}(H)$ denotes the unitary group of $H$. $\mathcal{U}(H)$ coincides with the group of automorphisms of $H$: $\mathrm{Aut}(H)$.
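The algebraic characterization $A^\dagger A = AA^\dagger = \mathrm{id}_H$ and the group property of $\mathcal{U}(H)$ can be illustrated in the simplest finite-dimensional case. The sketch below assumes $H = \mathbb{C}^2$ and represents operators as nested lists; the test matrices `R`, `D`, `S` and the helpers `matmul`, `dagger`, `distance_to_identity`, `is_unitary` are choices made for this example only, and unitarity is checked up to a floating-point tolerance.

```python
import cmath
import math

def matmul(A, B):
    """Matrix product of nested-list matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def dagger(A):
    """Conjugate transpose (the adjoint in finite dimension)."""
    return [[complex(A[i][j]).conjugate() for i in range(len(A))]
            for j in range(len(A[0]))]

def distance_to_identity(A):
    """Largest entrywise deviation of a square matrix from the identity."""
    return max(abs(A[i][j] - (1 if i == j else 0))
               for i in range(len(A)) for j in range(len(A)))

def is_unitary(A, tol=1e-12):
    """Check A^dagger A = A A^dagger = id up to the tolerance tol."""
    return (distance_to_identity(matmul(dagger(A), A)) < tol and
            distance_to_identity(matmul(A, dagger(A))) < tol)

t = 0.7
R = [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]  # rotation
D = [[1, 0], [0, cmath.exp(0.3j)]]                             # diagonal phase
S = [[1, 1], [0, 1]]                                           # shear, not unitary

checks = (is_unitary(R), is_unitary(D), is_unitary(matmul(R, D)),
          is_unitary(dagger(R)), is_unitary(S))
```

The product `matmul(R, D)` and the adjoint `dagger(R)` pass the test, in line with closure under composition and adjunction, whereas the shear `S` fails it.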


Some applications of the characterization of unitary operators are shown below.

Taking $H = L^2(X, \mathcal{A}, \mu)$ and $g \in L^\infty(X, \mathcal{A}, \mu)$, we have seen that the multiplication operator by $g$ defined by:
\[
M_g : L^2(X, \mathcal{A}, \mu) \longrightarrow L^2(X, \mathcal{A}, \mu), \qquad f \longmapsto M_g f = f \cdot g
\]
where $(f \cdot g)(x) = f(x)g(x)$ $\forall x \in X$, is linear and bounded. Moreover, we know that $M_g^\dagger = M_{\bar{g}}$, thus $M_g M_g^\dagger = M_g^\dagger M_g = M_{g\bar{g}} = M_{|g|^2}$, and then $M_g M_g^\dagger = \mathrm{id}_H$ if and only if $M_{|g|^2} = \mathrm{id}_H$; this is equivalent to requiring that $|g|^2 = 1$ almost everywhere, that is, the equivalence class of $g$ must contain at least one representative, also denoted $g$ for simplicity's sake, of the form $g(x) = e^{ih(x)}$, with $h : X \to \mathbb{R}$ measurable.

Let us apply the last theorem that we have proven to verify that, for every complete orthonormal system $(u_n)_{n\in\mathbb{N}}$ of $H$, the operator $U$ defined as:
\[
Uu_n = (-i)^n u_n
\]
is a unitary operator. We will use the algebraic characterization $U^\dagger U = UU^\dagger = \mathrm{id}_H$. On one side, by definition:
\[
\langle Uu_n, u_n\rangle = \langle u_n, U^\dagger u_n\rangle \qquad [6.25]
\]
and, on the other side:
\[
\langle Uu_n, u_n\rangle = \langle (-i)^n u_n, u_n\rangle = (-i)^n\langle u_n, u_n\rangle = \langle u_n, \overline{(-i)^n}\,u_n\rangle \underset{[6.25]}{=} \langle u_n, U^\dagger u_n\rangle \qquad \forall n \in \mathbb{N}
\]
hence $U^\dagger u_n = \overline{(-i)^n}\,u_n$, and then:
\[
U^\dagger Uu_n = U^\dagger((-i)^n u_n) = (-i)^n U^\dagger u_n = (-i)^n\,\overline{(-i)^n}\,u_n = |(-i)^n|^2 u_n = u_n.
\]
Moreover:
\[
UU^\dagger u_n = U\left(\overline{(-i)^n}\,u_n\right) = \overline{(-i)^n}\,(-i)^n u_n = |(-i)^n|^2 u_n = u_n
\]
Since $(u_n)_{n\in\mathbb{N}}$ is a complete orthonormal system, the fact that $U^\dagger U$ and $UU^\dagger$ act as the identity on every element $u_n$ can be extended to all $H$. In fact, for all $x \in H$, it holds that $x = \sum_{n\in\mathbb{N}} \langle x, u_n\rangle u_n$ and, by the linearity and continuity of $U^\dagger U$ and $UU^\dagger$, we can write:
\[
U^\dagger Ux = \sum_{n\in\mathbb{N}} \langle x, u_n\rangle U^\dagger Uu_n = \sum_{n\in\mathbb{N}} \langle x, u_n\rangle u_n = x.
\]

Bounded Linear Operators in Hilbert Spaces


The same is true for $UU^\dagger x$, i.e. $U^\dagger U = UU^\dagger = \mathrm{id}_H$, proving that $U$ is unitary.

As in finite dimensions, unitary operators allow us to establish an equivalence relation between operators, as formalized in the following definition.

DEFINITION 6.20.– Two operators $A, B \in \mathcal{B}(H)$ are unitarily equivalent if there exists a unitary operator $U \in \mathcal{B}(H)$ such that $A = UBU^{-1}$.

We do not have the space to go into greater detail regarding the properties of unitary equivalence here. We simply note that unitary equivalence preserves operator properties, such as continuity, invertibility and self-adjointness.

6.8.2. Relationship between isometric and unitary operators and orthonormal systems

The final property of isometric and unitary operators that we wish to discuss here is their interaction with orthonormal systems and complete orthonormal systems in Hilbert spaces. In finite dimension, isometric and unitary operators coincide, and they transform orthonormal bases into orthonormal bases. In infinite dimension, this remains true only for unitary operators.

THEOREM 6.42 (Geometric characterization of isometric operators).– $A \in \mathcal{B}(H)$ is isometric if and only if it transforms complete orthonormal systems $(u_n)_{n\in\mathbb{N}}$ in $H$ into orthonormal systems $(Au_n)_{n\in\mathbb{N}}$.

PROOF.– $\implies$: for any complete orthonormal system $(u_n)_{n\in\mathbb{N}}$ in $H$, by the isometry of $A$, we can write:
\[ \langle Au_n, Au_m\rangle = \langle u_n, u_m\rangle = \delta_{n,m} \quad \forall n, m \in \mathbb{N}, \]
thus $(Au_n)_{n\in\mathbb{N}}$ is an orthonormal system of $H$.

$\impliedby$: let $A \in \mathcal{B}(H)$, let $(u_n)_{n\in\mathbb{N}}$ be a complete orthonormal system of $H$ and $(Au_n)_{n\in\mathbb{N}}$ an orthonormal system of $H$. We wish to prove that $A$ is isometric. On one side, the fact that $(u_n)_{n\in\mathbb{N}}$ is a complete orthonormal system implies that, for all $x \in H$, we have the decomposition into a generalized Fourier series $x = \sum_{n\in\mathbb{N}} \langle x, u_n\rangle u_n$ and Plancherel's identity $\|x\|^2 = \sum_{n\in\mathbb{N}} |\langle x, u_n\rangle|^2$.


On the other side, by the continuity of $A$, we can write $Ax = \sum_{n\in\mathbb{N}} \langle x, u_n\rangle\, Au_n$; furthermore, the hypothesis that $(Au_n)_{n\in\mathbb{N}}$ is an orthonormal system of $H$ allows us to use the second part of the Riesz-Fischer theorem (Theorem 5.10) to state that⁸:
\[ \|Ax\|^2 = \sum_{n\in\mathbb{N}} |\langle x, u_n\rangle|^2, \]
that is, $\|Ax\|^2 = \|x\|^2$, therefore $A$ is isometric. □

THEOREM 6.43 (Geometric characterization of unitary operators).– $A \in \mathcal{B}(H)$ is unitary if and only if it transforms complete orthonormal systems $(u_n)_{n\in\mathbb{N}}$ in $H$ into complete orthonormal systems $(Au_n)_{n\in\mathbb{N}}$.

PROOF.– $\implies$: a unitary operator $A$ is isometric, thus, by Theorem 6.42, $(Au_n)_{n\in\mathbb{N}}$ is an orthonormal system and we simply need to show that $(Au_n)_{n\in\mathbb{N}}$ is complete. We do this using one of the characteristic properties of a Hilbert basis: as we saw in point 2 of Theorem 5.11, if $\langle x, Au_n\rangle = 0$ $\forall n \in \mathbb{N}$ implies $x = 0_H$, then $(Au_n)_{n\in\mathbb{N}}$ is a complete orthonormal system. Since $A$ is surjective, there exists $y \in H$ such that $x = Ay$; hence, the condition $\langle x, Au_n\rangle = 0$ $\forall n \in \mathbb{N}$ becomes:
\[ \forall n \in \mathbb{N}: \quad \langle Ay, Au_n\rangle \underset{A \text{ unitary}}{=} \langle y, u_n\rangle = 0 \]
and, since $(u_n)_{n\in\mathbb{N}}$ is a complete orthonormal system of $H$, $y = 0_H$, implying $x = A0_H = 0_H$; then $(Au_n)_{n\in\mathbb{N}}$ is a complete orthonormal system of $H$.

$\impliedby$: by Theorem 6.42, we can guarantee that, if $A$ transforms complete orthonormal systems $(u_n)_{n\in\mathbb{N}}$ of $H$ into complete orthonormal systems $(Au_n)_{n\in\mathbb{N}}$ in $H$, then $A$ is at least isometric; thus, we only need to demonstrate its surjectivity. We have seen that the image of an isometric operator is closed, that is, by linearity, $\mathrm{Im}(A) = \overline{\mathrm{span}}(Au_n,\ n \in \mathbb{N}) = H$, since $(Au_n)_{n\in\mathbb{N}}$ is a complete orthonormal system by hypothesis; thus $A$ is surjective, implying that $A$ is unitary. □

We end this section with a simple exercise involving both unitary operators and orthogonal projectors.

⁸ Explicitly, the second part of the Riesz-Fischer theorem tells us that, given an orthonormal system $(v_n)_{n\in\mathbb{N}}$ in a Hilbert space $H$, if the series $\sum_{n\in\mathbb{N}} k_n v_n$ converges to $y \in H$, then it holds that $\|y\|^2 = \sum_{n\in\mathbb{N}} |k_n|^2$; in our case, $y = Ax$, $v_n = Au_n$ and $k_n = \langle x, u_n\rangle$.


Exercise 6.7

Let $H$ be a Hilbert space. Show that the following properties are equivalent.

1) $A \in \mathcal{B}(H)$ is self-adjoint and unitary.
2) The operator $P = \frac{1}{2}(A + \mathrm{id}_H)$ is an orthogonal projector.
3) There exist two mutually orthogonal closed subspaces $H_1$ and $H_2$ in $H$ such that $H = H_1 \oplus H_2$ and there exists an operator $A \in \mathcal{B}(H)$ such that, for all $x = x_1 + x_2$, $x_i \in H_i$, it holds that $Ax = x_1 - x_2$.

Suggestion: show that 1) $\iff$ 2) and 2) $\iff$ 3).

Solution to Exercise 6.7

We begin with the equivalence 1) $\iff$ 2).

1) $\implies$ 2): By hypothesis, $A$ is self-adjoint, that is, $A = A^\dagger$, and unitary, that is, $A^\dagger = A^{-1}$. Then $A = A^{-1}$ and thus $A^2 = AA = AA^{-1} = \mathrm{id}_H$. We can use this fact to show that $P = \frac{1}{2}(A + \mathrm{id}_H)$ is self-adjoint and idempotent, implying that it is an orthogonal projector:
\[ P^\dagger = \tfrac{1}{2}(A + \mathrm{id}_H)^\dagger \underset{\text{linearity of } \dagger}{=} \tfrac{1}{2}(A^\dagger + \mathrm{id}_H^\dagger) = \tfrac{1}{2}(A + \mathrm{id}_H) = P, \]
\[ P^2 = \tfrac{1}{4}(A^2 + 2A + \mathrm{id}_H) = \tfrac{1}{4}(\mathrm{id}_H + 2A + \mathrm{id}_H) = \tfrac{1}{2}(A + \mathrm{id}_H) = P. \]

2) $\implies$ 1): If property 2 holds, then we write $A = 2P - \mathrm{id}_H$, where $P$ is an orthogonal projector, and we prove that $A$ is self-adjoint and unitary:
\[ A^\dagger = 2P^\dagger - \mathrm{id}_H^\dagger = 2P - \mathrm{id}_H = A, \]
\[ A^\dagger A = AA^\dagger = (2P - \mathrm{id}_H)^2 = 4P^2 - 4P + \mathrm{id}_H = 4P - 4P + \mathrm{id}_H = \mathrm{id}_H \implies A^\dagger = A^{-1}. \]

The next step is to analyze the equivalence 2) $\iff$ 3).

2) $\implies$ 3): If property 2 holds, then we know that $H = \mathrm{Im}(P) \oplus \ker(P)$, hence $H_1 = \mathrm{Im}(P)$ and $H_2 = \ker(P)$. Furthermore, if we write $H \ni x = x_1 + x_2$, $x_1 \in \mathrm{Im}(P)$ and $x_2 \in \ker(P)$, then $Px = x_1$ and:
\[ P(x) = \tfrac{1}{2}(A + \mathrm{id}_H)(x) = \tfrac{1}{2}(Ax + x_1 + x_2), \]
that is, $x_1 = \frac{1}{2}(Ax + x_1 + x_2)$, and then $Ax = 2x_1 - x_1 - x_2 = x_1 - x_2$.

3) $\implies$ 2): Assuming that property 3 is verified, $P(x) = \frac{1}{2}(Ax + x) = \frac{1}{2}(x_1 - x_2 + x_1 + x_2) = x_1$ for all $x \in H$; thus, by definition, $P$ is the orthogonal projector $P_{H_1}$, by the hypothesis that $H_1$ and $H_2$ are orthogonal and closed. □
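The equivalences of Exercise 6.7 can be illustrated in the simplest finite-dimensional setting. The following is a minimal sketch, assuming $H = \mathbb{R}^2$ with the Euclidean inner product (so that the adjoint is the transpose); the angle `theta`, the helper names and the test vectors are illustrative choices, not part of the text.

```python
import math

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def close(a, b, tol=1e-12):
    """Entrywise comparison of two matrices up to a tolerance."""
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

theta = 0.7  # arbitrary angle (assumption): H1 = span{(cos theta, sin theta)}
c, s = math.cos(theta), math.sin(theta)
identity = [[1.0, 0.0], [0.0, 1.0]]

# Orthogonal projector onto H1 and the operator A = 2P - id (reflection across H1).
P = [[c * c, c * s], [c * s, s * s]]
A = [[2 * P[i][j] - identity[i][j] for j in range(2)] for i in range(2)]

# Property 1: A is self-adjoint (real case: A equals its transpose) and unitary.
is_self_adjoint = close(A, transpose(A))
is_unitary = close(matmul(transpose(A), A), identity)

# Property 2: (A + id)/2 is self-adjoint and idempotent.
P_back = [[0.5 * (A[i][j] + identity[i][j]) for j in range(2)] for i in range(2)]
is_projector = close(matmul(P_back, P_back), P_back) and close(P_back, transpose(P_back))

# Property 3: A x = x1 - x2 for x = x1 + x2, x1 in H1, x2 in H2 (orthogonal to H1).
x1 = [2.0 * c, 2.0 * s]
x2 = [-3.0 * s, 3.0 * c]
x = [x1[0] + x2[0], x1[1] + x2[1]]
Ax = [A[0][0] * x[0] + A[0][1] * x[1], A[1][0] * x[0] + A[1][1] * x[1]]
flips_H2 = all(abs(Ax[i] - (x1[i] - x2[i])) < 1e-12 for i in range(2))
```

In this two-dimensional picture, $A$ is the reflection across the line $H_1$: it fixes $H_1$ and changes the sign of the component in $H_2$.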


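6.9. The Fourier transform on $\mathcal{S}(\mathbb{R}^n)$, $L^1(\mathbb{R}^n)$ and $L^2(\mathbb{R}^n)$

The Fourier transform on $L^2(\mathbb{R}^n)$ is the most important example of a unitary operator on $L^2(\mathbb{R}^n)$ in terms of its applications to theoretical physics, differential equation theory and signal processing, among others. Nonetheless, this operator is not simple to construct, as $L^2(\mathbb{R}^n)$ is not the most natural space for the Fourier transform; the most suitable environment for the Fourier transform is, in fact, the Schwartz space.

Several constructions of the Fourier transform on $L^2(\mathbb{R}^n)$ can be found in the literature; the most widespread, which shall be used here, consists of defining the Fourier transform on the Schwartz space to highlight its remarkable properties, and then extending it to $L^2(\mathbb{R}^n)$ using a limit procedure. In addition to this result, we shall present an explicit formula which makes use of the Hermite basis of $L^2(\mathbb{R}^n)$.

6.9.1. The invariance of the Schwartz space with respect to the Fourier transform

Let us begin by defining the Fourier transform on the Schwartz space $\mathcal{S}(\mathbb{R}^n)$ for $n = 1$. We will then generalize this definition for an arbitrary (finite) $n$.

DEFINITION 6.21.– The Fourier transform on $\mathcal{S}(\mathbb{R})$ is the following linear operator:
\[ \hat{\mathcal{F}} : \mathcal{S}(\mathbb{R}) \longrightarrow \mathcal{S}(\mathbb{R}), \quad f \longmapsto \hat{\mathcal{F}}(f) = \hat f, \quad \text{where: } \hat f(\omega) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} f(x)\, e^{-i\omega x}\, dm(x), \]
where $m$ is the Lebesgue measure on $\mathbb{R}$ and $\omega \in \mathbb{R}$. The inverse Fourier transform on $\mathcal{S}(\mathbb{R})$ is the following linear operator:
\[ \check{\mathcal{F}} : \mathcal{S}(\mathbb{R}) \longrightarrow \mathcal{S}(\mathbb{R}), \quad f \longmapsto \check{\mathcal{F}}(f) = \check f, \quad \text{where: } \check f(x) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} f(\omega)\, e^{i\omega x}\, dm(\omega). \]
More generally, the Fourier transform on $\mathcal{S}(\mathbb{R}^n)$ is the following linear operator:
\[ \hat{\mathcal{F}} : \mathcal{S}(\mathbb{R}^n) \longrightarrow \mathcal{S}(\mathbb{R}^n), \quad f \longmapsto \hat{\mathcal{F}}(f) = \hat f, \quad \text{where: } \hat f(\omega) = \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} f(x)\, e^{-i\langle\omega, x\rangle}\, dm(x), \]
where $m$ is the Lebesgue measure on $\mathbb{R}^n$, $\omega \in \mathbb{R}^n$ and $\langle\omega, x\rangle = \sum_{k=1}^{n} \omega_k x_k$ is the Euclidean inner product in $\mathbb{R}^n$. The inverse Fourier transform on $\mathcal{S}(\mathbb{R}^n)$ is the following linear operator:
\[ \check{\mathcal{F}} : \mathcal{S}(\mathbb{R}^n) \longrightarrow \mathcal{S}(\mathbb{R}^n), \quad f \longmapsto \check{\mathcal{F}}(f) = \check f, \quad \text{where: } \check f(x) = \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} f(\omega)\, e^{i\langle\omega, x\rangle}\, dm(\omega). \]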


To verify that these definitions are well posed, we must ensure that the integrals exist and that $\hat f$ and $\check f$ are rapidly decreasing functions. The existence of the integrals is evident if we consider that $\mathcal{S}(\mathbb{R}^n) \subset L^1(\mathbb{R}^n)$, thus:
\[ \int_{\mathbb{R}^n} \left| f(x)\, e^{-i\langle\omega, x\rangle} \right| dm(x) = \int_{\mathbb{R}^n} |f(x)|\, dm(x) < +\infty. \]
The same is true for the inverse Fourier transform. The fact that $\hat f$ and $\check f$ are rapidly decreasing functions can be verified by iterating the derivation under the integral sign and by integrating by parts.

A summary of the most important properties of the Fourier transform for a function $f \in \mathcal{S}(\mathbb{R})$, $a, b, c \in \mathbb{R}$, $a \neq 0$, is given in Table 6.1.

IMPORTANT OBSERVATIONS.–
– $\hat{\mathcal{F}}$ transforms the multiplication of the variable by a constant into a division by the same constant (up to a coefficient).
– $\hat{\mathcal{F}}$, like the DFT, transforms the shift of the initial variable into the product by a complex exponential.
– $\hat{\mathcal{F}}$ transforms the $n$-th derivative into the product by a power of $i\omega$. This property is crucial for transforming differential equations into algebraic equations.
– $\hat{\mathcal{F}}$ transforms a Gaussian with unit standard deviation into a Gaussian with unit standard deviation. More generally, $\hat{\mathcal{F}}$ inverts the standard deviation: a Gaussian with a small standard deviation, that is, with values located in close proximity to its mean, is transformed by $\hat{\mathcal{F}}$ into a Gaussian with a large standard deviation, that is, with values which are spread away from the mean, and vice versa.

| Original function $f \in \mathcal{S}(\mathbb{R})$ | Fourier transform $\hat f \in \mathcal{S}(\mathbb{R})$ |
|---|---|
| $f(ax)$ | $\frac{1}{\lvert a\rvert}\, \hat f\!\left(\frac{\omega}{a}\right)$ |
| $f(x - b)$ | $e^{-i\omega b}\, \hat f(\omega)$ |
| $f(ax - b)$ | $\frac{e^{-i\omega \frac{b}{a}}}{\lvert a\rvert}\, \hat f\!\left(\frac{\omega}{a}\right)$ |
| $e^{icx} f(x)$ | $\hat f(\omega - c)$ |
| $f'(x)$ | $i\omega\, \hat f(\omega)$ |
| $f''(x)$ | $-\omega^2 \hat f(\omega)$ |
| $\frac{d^n f}{dx^n}(x)$ | $(i\omega)^n \hat f(\omega)$ |
| $(-ix)^n f(x)$ | $\frac{d^n \hat f}{d\omega^n}(\omega)$ |
| $e^{-\frac{x^2}{2}}$ | $e^{-\frac{\omega^2}{2}}$ |
| $e^{-c^2 x^2}$ | $\frac{1}{c\sqrt{2}}\, e^{-\frac{\omega^2}{4c^2}}$ |

Table 6.1. Properties of the Fourier transform on $\mathcal{S}(\mathbb{R})$


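We wish to prove the property $\widehat{f'}(\omega) = i\omega\, \hat f(\omega)$.

PROOF.– We begin by observing that, for $f : \mathbb{R} \to \mathbb{C}$, $f \in \mathcal{S}(\mathbb{R})$, we have $f(x) \to 0$ as $|x| \to +\infty$. We write the Fourier transform of $f'$:
\[
\begin{aligned}
\widehat{f'}(\omega) &= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} f'(x)\, e^{-i\omega x}\, dx \qquad \text{(int. by parts)} \\
&= \frac{1}{\sqrt{2\pi}} \Big[ f(x)\, e^{-i\omega x} \Big]_{-\infty}^{+\infty} - \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} f(x)(-i\omega)\, e^{-i\omega x}\, dx \\
&= 0 - (-i\omega)\, \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} f(x)\, e^{-i\omega x}\, dx \\
&= i\omega\, \hat f(\omega). \qquad \square
\end{aligned}
\]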

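The fact that the Gaussian with unit standard deviation is invariant with respect to the Fourier transform is not immediately evident, so a proof is helpful. For that, we need two lemmas.

LEMMA 6.1.– It holds that:
\[ \int_{-\infty}^{+\infty} e^{-\frac{x^2}{2}}\, dx = \sqrt{2\pi}. \]

PROOF.– We write $I = \int_{-\infty}^{+\infty} e^{-\frac{x^2}{2}}\, dx$, then:
\[ I^2 = \int_{-\infty}^{+\infty} e^{-\frac{x^2}{2}}\, dx \cdot \int_{-\infty}^{+\infty} e^{-\frac{y^2}{2}}\, dy \underset{\text{(th. Fubini)}}{=} \int_{-\infty}^{+\infty}\!\int_{-\infty}^{+\infty} e^{-\frac{1}{2}(x^2 + y^2)}\, dx\, dy. \]
Switching to polar coordinates $(\rho, \vartheta)$, $\rho \in [0, +\infty)$, $\vartheta \in [0, 2\pi)$, and recalling that the Jacobian in polar coordinates is $\rho$, we obtain:
\[ I^2 = \int_0^{2\pi}\!\int_0^{+\infty} e^{-\frac{\rho^2}{2}}\, \rho\, d\rho\, d\vartheta = \int_0^{2\pi} d\vartheta \int_0^{+\infty} e^{-\frac{\rho^2}{2}}\, \rho\, d\rho = 2\pi \Big[ -e^{-\frac{\rho^2}{2}} \Big]_0^{+\infty} = 2\pi \left[ -e^{-\infty} + e^{0} \right] = 2\pi. \]
Thus $I = \sqrt{2\pi}$. □

LEMMA 6.2.– It holds that:
\[ \int_{-\infty}^{+\infty} e^{-\frac{(x + i\omega)^2}{2}}\, dx = \int_{-\infty}^{+\infty} e^{-\frac{x^2}{2}}\, dx. \]
The proof uses the calculus of residues of complex analysis.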


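We can now prove that:
\[ \widehat{e^{-\frac{x^2}{2}}}(\omega) = e^{-\frac{\omega^2}{2}}. \]

PROOF.–
\[
\begin{aligned}
\widehat{e^{-\frac{x^2}{2}}}(\omega) &= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} e^{-\frac{x^2}{2}}\, e^{-i\omega x}\, dx \\
&\underset{\cdot\, e^{-\frac{\omega^2}{2}} \cdot\, e^{\frac{\omega^2}{2}}}{=} \frac{e^{-\frac{\omega^2}{2}}}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} e^{-\frac{x^2}{2}}\, e^{-i\omega x}\, e^{\frac{\omega^2}{2}}\, dx \\
&= \frac{e^{-\frac{\omega^2}{2}}}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} e^{-\frac{x^2 + 2i\omega x - \omega^2}{2}}\, dx \\
&= \frac{e^{-\frac{\omega^2}{2}}}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} e^{-\frac{(x + i\omega)^2}{2}}\, dx \\
&\underset{\text{Lemma 6.2}}{=} \frac{e^{-\frac{\omega^2}{2}}}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} e^{-\frac{x^2}{2}}\, dx \\
&\underset{\text{Lemma 6.1}}{=} \frac{e^{-\frac{\omega^2}{2}}}{\sqrt{2\pi}}\, \sqrt{2\pi} = e^{-\frac{\omega^2}{2}}. \qquad \square
\end{aligned}
\]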
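The invariance of the unit Gaussian can also be checked numerically. The following is a minimal sketch, assuming a trapezoidal-rule approximation of the integral of Definition 6.21 on a truncated interval; the function name `fourier_transform`, the interval half-width and the number of steps are illustrative choices, and the approximation is only meant for rapidly decreasing functions.

```python
import cmath
import math

def fourier_transform(f, omega, half_width=12.0, steps=4000):
    """Approximate (1/sqrt(2*pi)) * integral of f(x) e^{-i omega x} dx over
    [-half_width, half_width] with the trapezoidal rule."""
    h = 2.0 * half_width / steps
    total = 0.0 + 0.0j
    for k in range(steps + 1):
        x = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0  # trapezoidal end weights
        total += weight * f(x) * cmath.exp(-1j * omega * x)
    return total * h / math.sqrt(2.0 * math.pi)

def gaussian(x):
    """Gaussian with unit standard deviation (not normalized)."""
    return math.exp(-x * x / 2.0)

# Lemma 6.1: the integral of the unit Gaussian is sqrt(2*pi), i.e. hat{f}(0) = 1.
lemma_error = abs(fourier_transform(gaussian, 0.0) - 1.0)

# Invariance: hat{f}(omega) = exp(-omega^2 / 2) at a few sample frequencies.
invariance_error = max(
    abs(fourier_transform(gaussian, w) - math.exp(-w * w / 2.0))
    for w in (0.0, 0.5, 1.0, 2.0, 3.0)
)
```

Both errors should be at the level of floating-point round-off, since the trapezoidal rule is very accurate for smooth, rapidly decaying integrands.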

The inversion of the standard deviation, i.e. the fact that
\[ e^{-c^2 x^2} \overset{\hat{\mathcal{F}}}{\longmapsto} \frac{1}{c\sqrt{2}}\, e^{-\frac{\omega^2}{4c^2}}, \]
can be proven using an alternative technique (evidently, the technique presented earlier is also an option).

This technique is based on solving a differential equation. If $f(x) = e^{-c^2 x^2}$, then $f'(x) = -2c^2 x f(x)$, thus $f' + 2c^2 x f = 0$ and, given the properties $f'(x) \overset{\hat{\mathcal{F}}}{\longmapsto} i\omega\, \hat f(\omega)$, $-ix f(x) \overset{\hat{\mathcal{F}}}{\longmapsto} \hat f'(\omega)$ and the fact that $2c^2 x f = i\,2c^2(-ixf)$, by applying $\hat{\mathcal{F}}$ to both sides of the previous differential equation we can write:
\[ i\omega\, \hat f(\omega) + i\,2c^2 \hat f'(\omega) = 0 \iff \omega\, \hat f(\omega) + 2c^2 \hat f'(\omega) = 0. \qquad [6.26] \]
This gives us a separable differential equation⁹ with respect to $\hat f$. The canonical technique for solving this type of differential equation is to first search for constant solutions $\hat f(\omega) = C \in \mathbb{R}$ $\forall \omega \in \mathbb{R}$, implying $\hat f'(\omega) = 0$ $\forall \omega \in \mathbb{R}$; thus [6.26] becomes $\omega\, \hat f(\omega) = 0$, which may only be verified for all $\omega \in \mathbb{R}$ when $\hat f(\omega) \equiv 0$. Hence, the only constant solution to the differential equation [6.26] is the identically zero function. However, this function is not coherent with the fact that $\hat f(0) \neq 0$:
\[
\begin{aligned}
\hat f(0) &\underset{\text{def. of } \hat f(0)}{=} \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} f(x)\, e^{-i0x}\, dx = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} f(x)\, dx = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} e^{-c^2 x^2}\, dx \\
&\underset{y = \sqrt{2}\,c\,x}{=} \frac{1}{\sqrt{2\pi}\, c\sqrt{2}} \int_{\mathbb{R}} e^{-y^2/2}\, dy \underset{\text{Lemma 6.1}}{=} \frac{1}{\sqrt{2\pi}\, c\sqrt{2}}\, \sqrt{2\pi} = \frac{1}{c\sqrt{2}}.
\end{aligned}
\]
Hence, $\hat f \equiv 0$ is not the solution we are looking for. Now, let us suppose that $\hat f(\omega) \neq 0$ and look for non-constant solutions to [6.26] using the variable separation technique. We write the equation as follows:
\[ \frac{\hat f'(\omega)}{\hat f(\omega)} = -\frac{\omega}{2c^2}. \]
Integrating both sides, we obtain $\log |\hat f(\omega)| = -\frac{\omega^2}{4c^2} + \log C$, $C > 0$, where $\log C$ is the arbitrary constant resulting from integration. It is written in this way because, taking the exponential of both sides, we obtain:
\[ |\hat f(\omega)| = e^{-\frac{\omega^2}{4c^2} + \log C} = C e^{-\frac{\omega^2}{4c^2}}, \qquad \hat f(\omega) = \pm C e^{-\frac{\omega^2}{4c^2}}, \]
that is, $\hat f(\omega) = K e^{-\frac{\omega^2}{4c^2}}$, $K \in \mathbb{R} \setminus \{0\}$. Now, we simply observe that $K = \hat f(0) = \frac{1}{c\sqrt{2}}$, as before, which gives us the solution $\hat f(\omega) = \frac{1}{c\sqrt{2}}\, e^{-\frac{\omega^2}{4c^2}}$.

⁹ We recall that a differential equation with respect to a function $y(t)$ is said to be separable if it can be written as $y'(t) = f(y(t)) \cdot g(t)$, where $f$ and $g$ are two continuous functions.
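The differential-equation argument can be checked numerically. The sketch below assumes a particular value of the parameter (the constant `C`, standing for $c > 0$), approximates $\hat f$ by the trapezoidal rule on a truncated interval, and estimates $\hat f'$ with a central finite difference; all names and numerical parameters are illustrative assumptions.

```python
import cmath
import math

C = 1.5  # arbitrary positive value of the parameter c (assumption)

def f_hat(omega, half_width=10.0, steps=4000):
    """Trapezoidal approximation of the Fourier transform of exp(-c^2 x^2)."""
    h = 2.0 * half_width / steps
    total = 0.0 + 0.0j
    for k in range(steps + 1):
        x = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * math.exp(-(C * x) ** 2) * cmath.exp(-1j * omega * x)
    return total * h / math.sqrt(2.0 * math.pi)

def closed_form(omega):
    """Solution obtained in the text: exp(-omega^2 / (4 c^2)) / (c sqrt(2))."""
    return math.exp(-omega ** 2 / (4.0 * C ** 2)) / (C * math.sqrt(2.0))

# Initial condition K = hat{f}(0) = 1 / (c sqrt(2)).
initial_error = abs(f_hat(0.0) - closed_form(0.0))

step = 1e-3  # finite-difference step (assumption)

def residual(omega):
    """Absolute value of omega * hat{f} + 2 c^2 * hat{f}', the left side of [6.26]."""
    derivative = (f_hat(omega + step) - f_hat(omega - step)) / (2.0 * step)
    return abs(omega * f_hat(omega) + 2.0 * C ** 2 * derivative)

max_residual = max(residual(w) for w in (0.5, 1.0, 2.0))
closed_form_error = max(abs(f_hat(w) - closed_form(w)) for w in (0.5, 1.0, 2.0))
```

The residual is limited by the accuracy of the finite difference, so it is small but not at round-off level, whereas the comparison with the closed form does not involve any differentiation.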

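The properties of the Fourier transform defined on $\mathcal{S}(\mathbb{R}^n)$, summarized in Table 6.2 (where $c \in \mathbb{R}$, $c \neq 0$, $a, b \in \mathbb{R}^n$, $k \in \{1, \ldots, n\}$), follow directly from those obtained in the case where $n = 1$, with relatively straightforward changes to the demonstration technique, notably involving the use of Fubini's theorem to calculate multiple integrals.

We end this section by presenting the result which makes the Schwartz space so interesting for Fourier transform theory (and which justifies the name of $\check{\mathcal{F}}$).

THEOREM 6.44.– The transform $\hat{\mathcal{F}}$ is a linear isomorphism of $\mathcal{S}(\mathbb{R}^n)$ onto itself, and its inverse transformation is $\check{\mathcal{F}}$: $\check{\mathcal{F}} = \hat{\mathcal{F}}^{-1}$. Furthermore, if $f \in \mathcal{S}(\mathbb{R}^n)$ is interpreted as a function of $L^2(\mathbb{R}^n)$, then:
\[ \|f\| = \|\hat f\| \quad \forall f \in \mathcal{S}(\mathbb{R}^n) \subset L^2(\mathbb{R}^n). \]

The Schwartz space is thus invariant with respect to the application of the Fourier transform $\hat{\mathcal{F}}$, which possesses an explicit integral formula and an explicit inverse given by $\check{\mathcal{F}}$, and conserves the norm of rapidly decreasing functions when these are interpreted as elements of $L^2(\mathbb{R}^n)$. There is no other infinite-dimensional functional space in which the Fourier transform possesses all of these properties simultaneously.

| Original function $f \in \mathcal{S}(\mathbb{R}^n)$ | Fourier transform $\hat f \in \mathcal{S}(\mathbb{R}^n)$ |
|---|---|
| $f(cx)$ | $\frac{1}{\lvert c\rvert}\, \hat f\!\left(\frac{\omega}{c}\right)$ |
| $f(x - b)$ | $e^{-i\langle\omega, b\rangle}\, \hat f(\omega)$ |
| $f(cx - b)$ | $\frac{e^{-i\frac{\langle\omega, b\rangle}{c}}}{\lvert c\rvert}\, \hat f\!\left(\frac{\omega}{c}\right)$ |
| $e^{i\langle a, x\rangle} f(x)$ | $\hat f(\omega - a)$ |
| $\partial_{x_k} f(x)$ | $i\omega_k\, \hat f(\omega)$ |
| $\partial^2_{x_k} f(x)$ | $-\omega_k^2\, \hat f(\omega)$ |
| $\partial^n_{x_k} f(x)$ | $(i\omega_k)^n \hat f(\omega)$ |
| $(-ix_k)^n f(x)$ | $\partial^n_{\omega_k} \hat f(\omega)$ |
| $e^{-\frac{\|x\|^2}{2}}$ | $e^{-\frac{\|\omega\|^2}{2}}$ |
| $e^{-c^2\|x\|^2}$ | $\frac{1}{c\sqrt{2}}\, e^{-\frac{\|\omega\|^2}{4c^2}}$ |

Table 6.2. Properties of the Fourier transform on $\mathcal{S}(\mathbb{R}^n)$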

As we shall see, $L^1(\mathbb{R})$ is not invariant under the Fourier transform, while in $L^2(\mathbb{R})$ we lose the explicit integral formula.

6.9.2. Extension of the Fourier transform of $\mathcal{S}(\mathbb{R}^n)$ to $L^1(\mathbb{R}^n)$: the Riemann-Lebesgue theorem

The functions which constitute the elements of the Schwartz space are too regular to be exhaustive, particularly with respect to applications. It is thus important to consider the extension of the Fourier transform to less regular function spaces, such as $L^1(\mathbb{R})$ and $L^2(\mathbb{R})$. In this section, we shall consider $L^1(\mathbb{R})$, for which we have a particularly famous result.

THEOREM 6.45 (Riemann-Lebesgue theorem).– The operator $\hat{\mathcal{F}}$ from section 6.9.1 can be extended in a unique manner to the injective and continuous linear operator defined as follows:
\[ \hat{\mathcal{F}}_1 : L^1(\mathbb{R}^n) \longrightarrow C_\infty(\mathbb{R}^n), \quad f \longmapsto \hat{\mathcal{F}}_1(f), \quad \text{where: } \hat{\mathcal{F}}_1 f(\omega) = \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} f(x)\, e^{-i\langle\omega, x\rangle}\, dm(x). \]


The same statement holds for the extension of $\check{\mathcal{F}}$ to $L^1(\mathbb{R}^n)$ with the corresponding integral formula, that is:
\[ \check{\mathcal{F}}_1 : L^1(\mathbb{R}^n) \longrightarrow C_\infty(\mathbb{R}^n), \quad f \longmapsto \check{\mathcal{F}}_1(f), \quad \text{where: } \check{\mathcal{F}}_1 f(x) = \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} f(\omega)\, e^{i\langle\omega, x\rangle}\, dm(\omega). \]
We recall that $C_\infty(\mathbb{R}^n)$ is the space of continuous functions defined on $\mathbb{R}^n$ which tend toward 0 as we approach infinity, equipped with the norm $\|f\|_\infty = \sup_{x\in\mathbb{R}^n} |f(x)|$.

OBSERVATIONS.–
– The Riemann-Lebesgue theorem tells us that the integral formula of the Fourier transform remains valid for the elements of $L^1(\mathbb{R}^n)$; this is very important, since functions which are absolutely integrable in the Lebesgue sense are much more widespread than rapidly decreasing functions in practical applications.
– The injectivity of $\hat{\mathcal{F}}_1$ means that it can be inverted on the image $\hat{\mathcal{F}}_1(L^1(\mathbb{R}^n)) \subset C_\infty(\mathbb{R}^n)$ but not on $L^1(\mathbb{R}^n)$. A classic counter-example for the case where $n = 1$ is the indicator function for the interval $[-1, 1]$ in $\mathbb{R}$, that is, $\chi_{[-1,1]}$; it belongs to $L^1(\mathbb{R})$, but by direct calculation we obtain:
\[ \hat{\mathcal{F}}_1(\chi_{[-1,1]})(\omega) = \sqrt{\frac{2}{\pi}}\, \frac{\sin\omega}{\omega}. \qquad [6.27] \]
This evidently belongs to $C_\infty(\mathbb{R})$ but not to $L^1(\mathbb{R})$; it actually belongs to $L^2(\mathbb{R})$. Thus, $\check{\mathcal{F}}_1$, which is defined on all of $L^1(\mathbb{R})$, is not the inverse of $\hat{\mathcal{F}}_1$.

6.9.3. Extension of the Fourier transform to a unitary operator on $L^2(\mathbb{R}^n)$: the Fourier-Plancherel transform

The technique which is classically used to extend $\hat{\mathcal{F}}$ to $L^2(\mathbb{R}^n)$ consists of using a theorem that is of fundamental importance in functional analysis, which will be presented and proved below. First, however, we must establish a definition of the extension of a linear operator.

DEFINITION 6.22 (bounded extension of bounded linear operators).– Let $E, V, W$ be vector spaces on the same field $\mathbb{K}$ and let $E$ be a vector subspace of $V$. Let $A : E \to W$ be a linear operator. The linear operator $B : V \to W$ is an extension of $A$ on $V$ if the restriction of $B$ to $E$ coincides with $A$, that is, if $Ax = Bx$ $\forall x \in E$.

THEOREM 6.46 (Theorem of extension of a bounded linear operator).– Let $E$ and $F$ be two normed vector spaces, with $F$ a Banach space. Let $A : D_A \subseteq E \to F$ be a bounded linear operator, where $D_A$ is a vector subspace of $E$.
Then, there exists only one linear operator $\bar A$ with the following properties:

1) the domain of $\bar A$ is the closure of $D_A$ in $E$: $D_{\bar A} = \overline{D_A}$;
2) $\bar A$ is continuous: $\bar A \in \mathcal{B}(\overline{D_A}, F)$;
3) $\|\bar A\| = \|A\|$.

This operator is defined as follows. Let $(x_n)_{n\in\mathbb{N}} \subset D_A$ be an arbitrary sequence which converges to $x \in \overline{D_A}$, then:
\[ \bar A : \overline{D_A} \subseteq E \longrightarrow F, \qquad x \longmapsto \bar A x = \lim_{n\to+\infty} Ax_n. \]

PROOF.– Let $x$ be an arbitrary element in $D_{\bar A} = \overline{D_A}$; then, by definition, there exists a sequence $(x_n)_{n\in\mathbb{N}} \subset D_A$ such that $x = \lim_{n\to+\infty} x_n$. Being convergent, $(x_n)_{n\in\mathbb{N}}$ is a Cauchy sequence and, since $A$ is continuous, the sequence $(Ax_n)_{n\in\mathbb{N}} \subset F$ is also a Cauchy sequence, by Theorem 6.9. Since $F$ is a Banach space, there exists $y = \lim_{n\to+\infty} Ax_n$; thus, the operator $\bar A : \overline{D_A} \to F$, $\bar A x = \lim_{n\to+\infty} Ax_n$, is well defined and linear, as it is defined via the limit operation, which is linear.

Furthermore, $\bar A$ does not depend on the sequence which converges to $x$; in fact, if $(x'_n)_{n\in\mathbb{N}} \subset D_A$ is another sequence such that $x = \lim_{n\to+\infty} x'_n$, then:
\[
\begin{aligned}
\Big\| \lim_{n\to+\infty} Ax_n - \lim_{n\to+\infty} Ax'_n \Big\| &= \lim_{n\to+\infty} \|Ax_n - Ax'_n\| && \text{(continuity of } \|\cdot\|) \\
&= \lim_{n\to+\infty} \|A(x_n - x'_n)\| \\
&\le \lim_{n\to+\infty} \|A\|\,\|x_n - x'_n\| && (A \text{ bounded}) \\
&= \|A\| \lim_{n\to+\infty} \|x_n - x'_n\| \\
&= \|A\|\, \Big\| \lim_{n\to+\infty} (x_n - x'_n) \Big\| && \text{(continuity of } \|\cdot\|) \\
&= \|A\|\, \Big\| \lim_{n\to+\infty} x_n - \lim_{n\to+\infty} x'_n \Big\| \\
&= \|A\|\,\|x - x\| = 0.
\end{aligned}
\]
Evidently, any $x \in D_A$ may be identified as the limit of the constant sequence $x_n = x$ $\forall n \in \mathbb{N}$; hence, given that the definition of $\bar A$ is independent of the chosen sequence, $\bar A x = Ax$ $\forall x \in D_A$, that is, the restriction of $\bar A$ to $D_A$ is $A$ and, conversely, $\bar A$ is an extension of $A$ on $\overline{D_A}$.

The fact that $\bar A$ is a bounded operator on $\overline{D_A}$ can be verified by considering the limit of the inequality $\|Ax_n\| \le \|A\|\,\|x_n\|$. The limit conserves the order relation, that is:
\[ \lim_{n\to+\infty} \|Ax_n\| \le \lim_{n\to+\infty} \|A\|\,\|x_n\| \]


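and, by the continuity of the norm, we have:
\[ \Big\| \lim_{n\to+\infty} Ax_n \Big\| \le \|A\|\, \Big\| \lim_{n\to+\infty} x_n \Big\| \iff \|\bar A x\| \le \|A\|\,\|x\| \]
for all $x \in \overline{D_A}$, that is, $\bar A$ is bounded, and thus continuous.

Now, let us prove that any other extension of $A$ to $\overline{D_A}$ must coincide with $\bar A$. Let $B$ be another bounded extension of $A$ on $\overline{D_A}$; then, for all $x \in \overline{D_A}$, there exists a sequence $(x_n)_{n\in\mathbb{N}} \subset D_A$ such that $x = \lim_{n\to+\infty} x_n$ and, by the definition of $\bar A$ and the continuity of $B$, we have:
\[ \bar A x - Bx = \lim_{n\to+\infty} Ax_n - B\Big( \lim_{n\to+\infty} x_n \Big) = \lim_{n\to+\infty} Ax_n - \lim_{n\to+\infty} Bx_n = \lim_{n\to+\infty} (Ax_n - Bx_n). \]
For all $n \in \mathbb{N}$, $x_n \in D_A$ and, since $B$ is an extension of $A$, by definition $Bx_n = Ax_n$ $\forall n \in \mathbb{N}$; then $Ax_n - Bx_n = 0$ $\forall n \in \mathbb{N}$ and thus $\bar A x - Bx = \lim_{n\to+\infty} (Ax_n - Bx_n) = \lim_{n\to+\infty} 0 = 0$, i.e. $\bar A = B$.

Finally, we need to show that the extension is isometric, that is, $\|\bar A\| = \|A\|$. We have already seen that $\|\bar A x\| \le \|A\|\,\|x\|$ for all $x \in \overline{D_A}$, thus:
\[ \|\bar A\| = \sup_{\|x\| = 1} \|\bar A x\| \le \sup_{\|x\| = 1} \|A\|\,\|x\| = \|A\|; \]
then, if we can show that $\|\bar A\| \ge \|A\|$, this will prove the isometry of the extension. The proof is straightforward if we consider the following definition of the operator norm:
\[ \|\bar A\| = \sup\left\{ \frac{\|\bar A x\|}{\|x\|},\ x \in \overline{D_A} \setminus \{0_E\} \right\} \ge \sup\left\{ \frac{\|Ax\|}{\|x\|},\ x \in D_A \setminus \{0_E\} \right\} = \|A\|, \]
since $\bar A x = Ax$ $\forall x \in D_A$ and $D_A \subseteq \overline{D_A}$; hence $\|\bar A\| = \|A\|$ and the theorem is fully proven. □

Using the extension theorem and the fact that $\overline{\mathcal{S}(\mathbb{R}^n)} = L^2(\mathbb{R}^n)$, the Fourier transform on the Schwartz space can be extended to $L^2(\mathbb{R}^n)$ via the limit formula of the extension theorem, as formalized below.

THEOREM 6.47.– The operators $\hat{\mathcal{F}}$ and $\check{\mathcal{F}}$ which define the Fourier transform and the inverse Fourier transform on $\mathcal{S}(\mathbb{R}^n)$, respectively, can be extended in a unique manner to two unitary operators $\mathcal{F}$ and $\tilde{\mathcal{F}}$ on $L^2(\mathbb{R}^n)$; furthermore, $\tilde{\mathcal{F}} = \mathcal{F}^{-1}$.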


The operator $\mathcal{F}$ is known as the Fourier-Plancherel transform and it is defined as follows: let $(f_n)_{n\in\mathbb{N}} \subset \mathcal{S}(\mathbb{R}^n)$ be an arbitrary sequence of elements in $\mathcal{S}(\mathbb{R}^n)$ which converges to $f \in L^2(\mathbb{R}^n)$, then:
\[ \mathcal{F} : L^2(\mathbb{R}^n) \longrightarrow L^2(\mathbb{R}^n), \qquad f \longmapsto \mathcal{F}(f) = \lim_{n\to+\infty} \hat f_n. \]
Analogously:
\[ \mathcal{F}^{-1} : L^2(\mathbb{R}^n) \longrightarrow L^2(\mathbb{R}^n), \qquad f \longmapsto \mathcal{F}^{-1}(f) = \lim_{n\to+\infty} \check f_n. \]
Thus, the Fourier-Plancherel transform $\mathcal{F}$ on $L^2(\mathbb{R}^n)$ has the vital property of being a unitary operator with inverse given by the unitary operator $\mathcal{F}^{-1}$.

One reason $L^2(\mathbb{R}^n)$ is a less natural space than $\mathcal{S}(\mathbb{R}^n)$ for studying the Fourier transform is the lack of an integral formula valid for all elements of $L^2(\mathbb{R}^n)$. Theorem 6.48 provides a partial solution to this problem.

THEOREM 6.48.– If $f \in L^1(\mathbb{R}^n) \cap L^2(\mathbb{R}^n)$, then $\mathcal{F} f = \hat{\mathcal{F}}_1 f$, that is, for functions belonging to $L^1(\mathbb{R}^n) \cap L^2(\mathbb{R}^n)$, the integral formula of the Fourier transform remains valid.

Thankfully, as we saw in section 4.4.4, the functions of $L^1(\mathbb{R}^n) \cap L^2(\mathbb{R}^n)$ include the bounded functions of $L^1(\mathbb{R}^n)$ and those of $L^2(\mathbb{R}^n)$ which vanish outside of a compact subset, often encountered in practical applications.

6.9.4. Relationship between the Fourier-Plancherel transform and the Hermitian Hilbert basis

One very important Hilbert basis in $L^2(\mathbb{R})$ is the Hermite basis, defined as:
\[ u_n(x) = \frac{(-1)^n}{\sqrt{2^n\, n!\, \sqrt{\pi}}}\; e^{\frac{1}{2}x^2}\, \frac{d^n}{dx^n}\, e^{-x^2}, \qquad x \in \mathbb{R},\ n \in \mathbb{N}. \]
The functions $u_n$ can be shown to decay rapidly, so their Fourier transform is obtained by applying the integral formula, that is:
\[ \mathcal{F} u_n(\omega) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} u_n(x)\, e^{-i\omega x}\, dm(x) = \frac{1}{\sqrt{2\pi}}\, \frac{(-1)^n}{\sqrt{2^n\, n!\, \sqrt{\pi}}} \int_{\mathbb{R}} e^{-i\omega x + \frac{1}{2}x^2}\, \frac{d^n}{dx^n}\, e^{-x^2}\, dm(x). \]
By means of some simple algebraic manipulations, we can show that:
\[ \mathcal{F} u_n = (-i)^n u_n, \qquad n \in \mathbb{N}, \]


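and thus $\mathcal{F}$ coincides with the unitary operator introduced in section 6.8.1. By the continuity of $\mathcal{F}$, we can write:
\[ \mathcal{F} f = \sum_{n\in\mathbb{N}} (-i)^n \langle f, u_n\rangle u_n, \qquad \forall f \in L^2(\mathbb{R}),\ (u_n)_{n\in\mathbb{N}}: \text{Hermite basis}. \]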
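The eigenvalue relation $\mathcal{F} u_n = (-i)^n u_n$ can be checked numerically for the first few Hermite functions. The sketch below makes two assumptions that are not in the text: the functions $u_n$ are evaluated with the standard three-term recurrence $u_{k+1}(x) = \sqrt{2/(k+1)}\, x\, u_k(x) - \sqrt{k/(k+1)}\, u_{k-1}(x)$ (equivalent to the derivative formula above, but easier to evaluate), and the integral is approximated by the trapezoidal rule on a truncated interval. The function names are illustrative.

```python
import cmath
import math

def hermite_function(n, x):
    """Normalized Hermite function u_n(x), via the three-term recurrence
    (an assumption of this sketch, used instead of the derivative formula)."""
    previous = 0.0
    current = math.pi ** -0.25 * math.exp(-x * x / 2.0)  # u_0
    for k in range(n):
        previous, current = current, (
            math.sqrt(2.0 / (k + 1)) * x * current
            - math.sqrt(k / (k + 1)) * previous
        )
    return current

def fourier_transform(f, omega, half_width=12.0, steps=3000):
    """Trapezoidal approximation of (1/sqrt(2*pi)) * integral f(x) e^{-i omega x} dx."""
    h = 2.0 * half_width / steps
    total = 0.0 + 0.0j
    for k in range(steps + 1):
        x = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * f(x) * cmath.exp(-1j * omega * x)
    return total * h / math.sqrt(2.0 * math.pi)

# Largest deviation from F u_n = (-i)^n u_n for n = 0..4 at a few frequencies.
eigen_error = max(
    abs(fourier_transform(lambda x, n=n: hermite_function(n, x), w)
        - (-1j) ** n * hermite_function(n, w))
    for n in range(5)
    for w in (0.0, 0.8, 1.7)
)
```

Since the eigenvalues $(-i)^n$ take only the four values $1, -i, -1, i$, applying the transform four times to any $u_n$ returns $u_n$ itself.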

6.9.5. The Fourier transform and convolution

The properties of the Fourier transform with respect to convolution merit a separate discussion, given their importance and usefulness in both theoretical and applied mathematics. Readers wishing to study this subject in greater detail are advised to consult Gasquet and Witomski (2013).

We shall begin by defining convolution and discussing its basic properties, before proving the best-known and most important property of the Fourier transform in $L^1(\mathbb{R}^n)$ with respect to convolution: the convolution product is transformed into the pointwise product of the Fourier transforms (to within a coefficient).

DEFINITION 6.23.– Taking $f, g : \mathbb{R}^n \to \mathbb{R}$, the convolution between $f$ and $g$ is the function $f * g$ defined by:
\[ (f * g)(x) = \int_{\mathbb{R}^n} f(x - y)\, g(y)\, dm(y), \qquad \forall x \in \mathbb{R}^n, \]
as long as the integral exists in the Lebesgue sense.

THEOREM 6.49 (Basic properties of convolution).– The following properties hold:

1) if $f \in L^1(\mathbb{R}^n)$ and $g \in L^\infty(\mathbb{R}^n)$, or vice versa, then the convolution is well defined;
2) if $f, g \in L^2(\mathbb{R}^n)$, then the convolution is well defined and, in general, is an element of $L^\infty(\mathbb{R}^n)$;
3) if $f, g \in L^1(\mathbb{R}^n)$, then the convolution is well defined and belongs to $L^1(\mathbb{R}^n)$, which becomes a Banach algebra with respect to the convolution;
4) if the convolution is well defined, then:
- $f * (\alpha g + \beta h) = \alpha f * g + \beta f * h$ (linearity);
- $f * g = g * f$ (commutativity);
- $f * (g * h) = (f * g) * h$ (associativity).

PROOF.– Only the first two properties will be proved here. Proof of the remaining properties is left to the reader as an exercise.


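1) If $f \in L^1(\mathbb{R}^n)$ and $g \in L^\infty(\mathbb{R}^n)$, then:
\[ \int_{\mathbb{R}^n} |f(x - y)\, g(y)|\, dm(y) \le \|g\|_\infty \int_{\mathbb{R}^n} |f(x - y)|\, dm(y) = \|g\|_\infty \|f\|_1 \]
by the shift invariance of the Lebesgue measure.

2) If $f, g \in L^2(\mathbb{R}^n)$, then, by the Hölder inequality [4.19]:
\[ \int_{\mathbb{R}^n} |f(x - y)\, g(y)|\, dm(y) \le \left( \int_{\mathbb{R}^n} |f(x - y)|^2\, dm(y) \right)^{1/2} \left( \int_{\mathbb{R}^n} |g(y)|^2\, dm(y) \right)^{1/2} = \|f\|_2 \|g\|_2, \]
again by the shift invariance of the Lebesgue measure. □

THEOREM 6.50 (Convolution and Fourier transform in $L^1$).– Taking $f, g \in L^1(\mathbb{R}^n)$, the Fourier transform verifies the following property:
\[ \widehat{f * g} = (2\pi)^{n/2}\, \hat f \cdot \hat g. \]

PROOF.– We simply write the definition of convolution and of the Fourier transform, then apply the Fubini theorem twice, with a minor algebraic manipulation in between:
\[
\begin{aligned}
\widehat{(f * g)}(\omega) &= \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} \left( \int_{\mathbb{R}^n} f(x - y)\, g(y)\, dm(y) \right) e^{-i\langle\omega, x\rangle}\, dm(x) \\
&= \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n}\!\int_{\mathbb{R}^n} f(x - y)\, g(y)\, e^{-i\langle\omega, x\rangle}\, dm(x)\, dm(y) && \text{(Fubini)} \\
&= \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n}\!\int_{\mathbb{R}^n} f(x - y)\, g(y)\, e^{-i\langle\omega, x - y + y\rangle}\, dm(x)\, dm(y) \\
&= \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n}\!\int_{\mathbb{R}^n} f(x - y)\, e^{-i\langle\omega, x - y\rangle}\, g(y)\, e^{-i\langle\omega, y\rangle}\, dm(x)\, dm(y) \\
&= \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} f(x - y)\, e^{-i\langle\omega, x - y\rangle}\, dm(x) \int_{\mathbb{R}^n} g(y)\, e^{-i\langle\omega, y\rangle}\, dm(y) && \text{(Fubini)} \\
&= (2\pi)^{n/2}\, \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} f(t)\, e^{-i\langle\omega, t\rangle}\, dm(t)\; \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} g(y)\, e^{-i\langle\omega, y\rangle}\, dm(y) && (t = x - y,\ dm(t) = dm(x - y)) \\
&= (2\pi)^{n/2}\, \hat f(\omega) \cdot \hat g(\omega), \qquad \forall \omega \in \mathbb{R}^n. \qquad \square
\end{aligned}
\]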
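Theorem 6.50 can be illustrated numerically for $n = 1$. The following is a minimal sketch under stated assumptions: $f$ and $g$ are two Gaussians chosen for convenience, all integrals are replaced by Riemann sums on a uniform grid (the grid, its step `H` and the function names are illustrative), and the right-hand side uses the closed-form transforms listed in Table 6.1.

```python
import cmath
import math

H = 0.05                                     # grid step (assumption)
GRID = [-10.0 + k * H for k in range(401)]   # uniform grid on [-10, 10]

def f(x):
    return math.exp(-x * x / 2.0)   # unit Gaussian

def g(x):
    return math.exp(-x * x)         # Gaussian exp(-c^2 x^2) with c = 1

g_values = [g(y) for y in GRID]

def convolution(x):
    """Riemann-sum approximation of (f * g)(x) = integral f(x - y) g(y) dy."""
    return H * sum(f(x - y) * gy for y, gy in zip(GRID, g_values))

conv_values = [convolution(x) for x in GRID]

def transform_on_grid(values, omega):
    """Riemann-sum approximation of the Fourier transform of sampled values."""
    total = sum(v * cmath.exp(-1j * omega * x) for x, v in zip(GRID, values))
    return H * total / math.sqrt(2.0 * math.pi)

def right_hand_side(omega):
    """(2 pi)^{1/2} * hat{f}(omega) * hat{g}(omega), with the transforms of Table 6.1."""
    f_hat = math.exp(-omega ** 2 / 2.0)
    g_hat = math.exp(-omega ** 2 / 4.0) / math.sqrt(2.0)
    return math.sqrt(2.0 * math.pi) * f_hat * g_hat

max_error = max(
    abs(transform_on_grid(conv_values, w) - right_hand_side(w))
    for w in (0.0, 0.7, 1.5, 2.5)
)
```

At $\omega = 0$ both sides reduce to $\sqrt{\pi}$, which is the product of the integrals of $f$ and $g$ divided by $\sqrt{2\pi}$.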

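If we invert the Fourier transform on the image $\hat{\mathcal{F}}_1(L^1(\mathbb{R}^n)) \subset C_\infty(\mathbb{R}^n)$, then we obtain $f * g = (2\pi)^{n/2}\, (\hat f \cdot \hat g)^\vee$, which is often written in the form:
\[ (f \cdot g)^\vee = (2\pi)^{-n/2}\, \check f * \check g. \qquad [6.28] \]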


This formula will be used in section 6.11.

Convolution is a stationary operation, that is, it commutes with translation, as in the discrete case. Fixing $s \in \mathbb{R}$ and $g \in L^1(\mathbb{R})$, we can define the right translation operator $R_s$ and the convolution operator with $g$, $T_g$, as follows:
\[ R_s f(t) = f(t - s), \qquad T_g f(t) = (f * g)(t) = \int_{\mathbb{R}} f(t - x)\, g(x)\, dx; \]
then, for all $t \in \mathbb{R}$:
\[ R_s T_g f(t) = T_g f(t - s) = \int_{\mathbb{R}} f(t - s - x)\, g(x)\, dx = \int_{\mathbb{R}} R_s f(t - x)\, g(x)\, dx = T_g R_s f(t). \]

As we saw in the discrete case (see section 2.9.6), the convolution operation with the Gaussian function results in blurring of a signal. This result can be understood from a different perspective, using the following impulse function:
\[ I_\varepsilon(t) = \begin{cases} \frac{1}{\varepsilon} & 0 < t < \varepsilon \\ 0 & \text{otherwise.} \end{cases} \]
If $f \in L^1(\mathbb{R})$, then:
\[ (f * I_\varepsilon)(t) = \int_{\mathbb{R}} f(t - x)\, I_\varepsilon(x)\, dx = \frac{1}{\varepsilon} \int_0^\varepsilon f(t - x)\, dx. \]
Now, applying the variable substitution $u = t - x$, we obtain $du = -dx$, and the lower and upper limits of the integral with respect to the new variable $u$ become $t$ and $t - \varepsilon$. Then:
\[ (f * I_\varepsilon)(t) = -\frac{1}{\varepsilon} \int_t^{t - \varepsilon} f(u)\, du = \frac{1}{\varepsilon} \int_{t - \varepsilon}^{t} f(u)\, du = \langle f\rangle_{[t - \varepsilon,\, t]}, \]
that is, the mean of $f$ in the interval $[t - \varepsilon, t]$, of size $\varepsilon$.

A Gaussian $G_{\mu,\sigma}$ with mean $\mu$ and standard deviation $\sigma$ is a "smooth" version of the pulse $I_\varepsilon$, which rapidly tends toward 0 outside of the interval $[\mu - \sigma, \mu + \sigma]$, thus:
\[ f * G_{\mu,\sigma} \approx \text{local mean of } f \text{ in } [\mu - \sigma, \mu + \sigma]. \]
In section 2.9.6, we saw that blurring, in the frequency domain, results from the fact that the Fourier multiplier corresponding to the convolution with the Gaussian constitutes a low-pass filter. Here, we find the explanation for the blurring effect in the original domain of a signal $f$: following convolution with a Gaussian, each value of $f$ in $t$ is replaced by an approximation of the local mean value of $f$, with a locality parameter determined by the standard deviation of the Gaussian.

A further property of convolution, which is crucial for applications to the theory of differential equations, is discussed in Theorem 6.51.
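The interpretation of $f * I_\varepsilon$ as a local mean over $[t - \varepsilon, t]$ can be checked on a concrete signal. The sketch below assumes the test signal $f(x) = e^{-x^2}$ (chosen because its antiderivative is available through the error function), a midpoint-rule approximation of the convolution integral, and illustrative values of $t$ and $\varepsilon$; none of these choices come from the text.

```python
import math

def f(x):
    """Test signal (assumption): a Gaussian, which belongs to L^1(R)."""
    return math.exp(-x * x)

def convolve_with_pulse(t, eps, steps=2000):
    """Midpoint-rule approximation of (f * I_eps)(t) = (1/eps) * int_0^eps f(t - x) dx."""
    h = eps / steps
    return sum(f(t - (k + 0.5) * h) for k in range(steps)) * h / eps

def exact_mean(t, eps):
    """Mean of f over [t - eps, t], using int exp(-x^2) dx = (sqrt(pi)/2) erf(x)."""
    antiderivative = lambda x: 0.5 * math.sqrt(math.pi) * math.erf(x)
    return (antiderivative(t) - antiderivative(t - eps)) / eps

t, eps = 1.3, 0.4  # illustrative evaluation point and pulse width
mean_error = abs(convolve_with_pulse(t, eps) - exact_mean(t, eps))

# As eps shrinks, the local mean approaches the point value f(t).
pointwise_gap = abs(convolve_with_pulse(t, 1e-3) - f(t))
```

The second quantity illustrates the locality parameter: the narrower the pulse, the closer the averaged value is to $f(t)$, and the weaker the blurring.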


THEOREM 6.51.– Taking $f \in C(\mathbb{R}^n)$ with bounded partial derivatives and $g \in L^1(\mathbb{R}^n)$, then $f * g \in C(\mathbb{R}^n)$ and:
\[ \partial_{x_k}(f * g) = (\partial_{x_k} f) * g, \qquad \forall k = 1, \ldots, n. \]
In the same way, if $g \in C(\mathbb{R}^n)$ with bounded partial derivatives and $f \in L^1(\mathbb{R}^n)$, then:
\[ \partial_{x_k}(f * g) = f * (\partial_{x_k} g), \qquad \forall k = 1, \ldots, n. \]

PROOF.– The hypotheses of the theorem ensure that the derivation can be passed under the integral sign, thus $\forall k = 1, \ldots, n$:
\[ \partial_{x_k} \left( \int_{\mathbb{R}^n} f(x - y)\, g(y)\, dm(y) \right) = \int_{\mathbb{R}^n} \partial_{x_k} \big( f(x - y)\, g(y) \big)\, dm(y) = \int_{\mathbb{R}^n} \big( \partial_{x_k} f(x - y) \big)\, g(y)\, dm(y), \]
since $f$ is the only element which depends on $x$, that is, $\partial_{x_k}(f * g) = (\partial_{x_k} f) * g$. The second formula is a consequence of the commutative property of convolution, which allows us to switch the roles of $f$ and $g$. □

6.9.6. Convolution and Fourier transforms in $L^2$: localization of the Fourier transform

A generalization of equation [6.27] allows us to highlight a significant limitation of the Fourier transform. The formalization of this statement relies on the following result, taken from Gasquet and Witomski (2013), which shows that the Fourier transform of the product of two elements of $L^2(\mathbb{R}^n)$ is proportional to the convolution of their Fourier transforms.

THEOREM 6.52.– If $f, g \in L^2(\mathbb{R}^n)$, then:
\[ \widehat{f \cdot g} = (2\pi)^{-n/2}\, \hat f * \hat g. \]

Let us consider the spectrum of $f \in L^2(\mathbb{R})$, but only in the neighborhood of a value $t_0$. Using translation, it is always possible to consider $t_0 = 0$. The simplest, but incorrect (for reasons which we shall see later), approach to localizing the analysis of the spectrum of $f(t)$ consists of truncating it, that is, multiplying it by the step function of size $2T$:
\[ \chi(t) = \begin{cases} 1 & \text{if } |t| \le T \\ 0 & \text{otherwise,} \end{cases} \]


where $2T$ is the size of the neighborhood that we wish to consider. Since $\chi \in L^2(\mathbb{R})$, by Theorem 6.52¹⁰, the Fourier transform of the truncated signal $\tilde f(t) = f(t)\chi(t)$ is $\hat{\tilde f}(\omega) = \frac{1}{\sqrt{2\pi}}\, \hat f(\omega) * \hat\chi(\omega)$, where:
\[ \frac{1}{\sqrt{2\pi}}\, \hat\chi(\omega) = \frac{1}{\sqrt{2\pi}} \sqrt{\frac{2}{\pi}}\; T\, \frac{\sin(\omega T)}{\omega T} = \frac{T}{\pi}\, \mathrm{sinc}(\omega T), \]
with the function $\mathbb{R} \ni t \mapsto \mathrm{sinc}(t) := \frac{\sin t}{t}$. Thus:
\[ \hat{\tilde f}(\omega) = \frac{T}{\pi} \left( \hat f(\omega) * \mathrm{sinc}(\omega T) \right), \]
that is, the spectrum of the truncated signal is proportional to the convolution between the spectrum of the original signal and the sinc function of $\omega T$.

We thus see that precise localized information concerning the original signal cannot be obtained by truncation alone. This is one of the difficulties inherent in localizing frequency analysis of a signal within the context of Fourier analysis. Wavelet theory (Frazier 2001), developed to a significant extent in the late 1980s, offers powerful tools for handling this phenomenon.

6.10. The Nyquist-Shannon sampling theorem

The Nyquist-Shannon theorem¹¹ is one of the most important theorems in signal theory. It states that, when a function $f$ possesses a bounded spectrum as specified in Definition 6.24, this function can be reconstructed using a discrete set of samples.

DEFINITION 6.24.– The function $f : \mathbb{R} \to \mathbb{C}$ is said to be a continuous signal of finite bandwidth if there exists $\Omega \in \mathbb{R}^+$ such that:
\[ \hat f(\omega) = 0 \qquad \forall |\omega| > \Omega. \]

The human visual system is incapable of perceiving an electromagnetic wave as light when the oscillating frequency of the wave is lower than 400 THz or higher than 800 THz, where T = tera = $10^{12}$. Moreover, humans are able to hear sounds as variations in air pressure only at frequencies between 20 Hz and 20 kHz, where k = kilo = $10^3$. Visual and auditory signals, which are transmitted to the brain for interpretation, are two key examples of finite-bandwidth signals.

¹⁰ This argument is not valid if $f \in L^1(\mathbb{R})$, as, in this case, the formula from Theorem 6.52 would only be valid if $\hat f$ and $\hat\chi$ belong to $L^1(\mathbb{R})$; however, as we saw in section 6.9.2, $\hat\chi \notin L^1(\mathbb{R})$.

¹¹ This theorem is known by several different names, sometimes including the names of Whittaker and Kotelnikov, two other mathematicians who independently discovered it.


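THEOREM 6.53 (Shannon-Nyquist sampling theorem; see footnote 12).– Let:
– $f : \mathbb{R} \to \mathbb{C}$ be a signal of finite bandwidth: $\exists\, \Omega \in \mathbb{R}^+$ such that $\hat f(\omega) = 0$ $\forall |\omega| > \Omega$;
– $\hat f$ be continuous and piecewise $C^1(\mathbb{R})$.

Then $f$ is fully determined by its samples at the points $t_n = \frac{\pi}{\Omega} n$, $n \in \mathbb{Z}$, and the following formula holds:

$$f(t) = \sum_{n \in \mathbb{Z}} f\left(\frac{\pi}{\Omega} n\right) \mathrm{sinc}(\Omega t - \pi n) \qquad [6.29]$$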

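where the convergence of the series is uniform.

There are several proofs of the sampling theorem, including a notable example which uses the summation formula of Poisson (b. 1781, Pithiviers; d. 1840, Paris); here, we have chosen to present an alternative proof, found in Boggess and Narcowich (2015, p. 118).

PROOF.– We shall use the Fourier series and the Fourier transform of $\hat f$. To do this, we interpret $\hat f$ as a $2\Omega$-periodic function when we write its Fourier series and as a function with support bounded in $[-\Omega, \Omega]$ when we calculate its Fourier transform. Thanks to our hypotheses, $\hat f \in L^2[-\Omega, \Omega]$ and thus we can develop $\hat f$ into a Fourier series:

$$\hat f(\omega) = \sum_{k \in \mathbb{Z}} c_k\, e^{i \frac{2\pi \omega k}{2\Omega}} = \sum_{k \in \mathbb{Z}} c_k\, e^{i \frac{\pi \omega k}{\Omega}} \qquad [6.30]$$

with:

$$\begin{aligned}
c_k &= \frac{1}{2\Omega} \int_{-\Omega}^{\Omega} \hat f(\omega)\, e^{-i \frac{\pi \omega k}{\Omega}}\, d\omega \\
&= \frac{\sqrt{2\pi}}{2\Omega}\, \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} \hat f(\omega)\, e^{i \left(-\frac{\pi}{\Omega} k\right) \omega}\, d\omega \qquad (\hat f(\omega) = 0\ \ \forall |\omega| > \Omega) \\
&= \frac{\sqrt{2\pi}}{2\Omega}\, \check{\hat f}\left(-\frac{\pi}{\Omega} k\right) = \frac{\sqrt{2\pi}}{2\Omega}\, f\left(-\frac{\pi}{\Omega} k\right),
\end{aligned}$$

where in the final step of the previous computation we have used the definition of the inverse Fourier transform of $\hat f$, i.e. $f$, calculated in $-\frac{\pi}{\Omega} k$, and we included the normalization factor of the series in $c_k$.

12 Shannon (b. 1916, Petoskey; d. 2001, Medford), Nyquist (b. 1889, Stora Kil; d. 1976, Harlingen).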


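The Fourier series [6.30] can thus be rewritten as follows:

$$\hat f(\omega) = \sum_{k \in \mathbb{Z}} \frac{\sqrt{2\pi}}{2\Omega}\, f\left(-\frac{\pi}{\Omega} k\right) e^{i \frac{\pi \omega k}{\Omega}} = \sum_{n \in \mathbb{Z}} \frac{\sqrt{2\pi}}{2\Omega}\, f\left(\frac{\pi}{\Omega} n\right) e^{-i \frac{\pi \omega n}{\Omega}} \qquad (n = -k \iff k = -n)$$

and this series is uniformly convergent since $\hat f$ is continuous and piecewise $C^1$. We calculate $f(t)$ via the inverse Fourier transform of $\hat f(\omega)$:

$$\begin{aligned}
f(t) &= \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} \hat f(\omega)\, e^{i\omega t}\, d\omega \\
&= \frac{1}{\sqrt{2\pi}} \int_{-\Omega}^{\Omega} \hat f(\omega)\, e^{i\omega t}\, d\omega \qquad (\hat f(\omega) = 0\ \ \forall |\omega| > \Omega) \\
&= \frac{1}{\sqrt{2\pi}} \int_{-\Omega}^{\Omega} \sum_{n \in \mathbb{Z}} \frac{\sqrt{2\pi}}{2\Omega}\, f\left(\frac{\pi}{\Omega} n\right) e^{-i \frac{\pi \omega n}{\Omega}}\, e^{i\omega t}\, d\omega \\
&= \sum_{n \in \mathbb{Z}} \frac{1}{2\Omega}\, f\left(\frac{\pi}{\Omega} n\right) \int_{-\Omega}^{\Omega} e^{i\omega \frac{t\Omega - \pi n}{\Omega}}\, d\omega \qquad [6.31]
\end{aligned}$$

In the final step of the previous calculation, the series and the integral can be switched thanks to the fact that the series is uniformly convergent. Now, let us analyze the integral:

$$\int_{-\Omega}^{\Omega} e^{i\omega \frac{t\Omega - \pi n}{\Omega}}\, d\omega = \int_{-\Omega}^{\Omega} \cos\left(\omega\, \frac{t\Omega - \pi n}{\Omega}\right) d\omega + i \int_{-\Omega}^{\Omega} \sin\left(\omega\, \frac{t\Omega - \pi n}{\Omega}\right) d\omega$$

The second integral is zero, as the sine function is odd and the domain of integration is symmetric; on the other hand, the cosine function is even, so we obtain:

$$\begin{aligned}
\int_{-\Omega}^{\Omega} e^{i\omega \frac{t\Omega - \pi n}{\Omega}}\, d\omega &= 2 \int_{0}^{\Omega} \cos\left(\omega\, \frac{t\Omega - \pi n}{\Omega}\right) d\omega = 2 \left[ \frac{\sin\left(\omega\, \frac{t\Omega - \pi n}{\Omega}\right)}{\frac{t\Omega - \pi n}{\Omega}} \right]_{0}^{\Omega} \\
&= 2\Omega\, \frac{\sin\left(\Omega\, \frac{t\Omega - \pi n}{\Omega}\right)}{t\Omega - \pi n} - 0 = 2\Omega\, \frac{\sin(t\Omega - \pi n)}{t\Omega - \pi n}
\end{aligned}$$

Inserting this result in [6.31], we obtain:

$$f(t) = \sum_{n \in \mathbb{Z}} \frac{2\Omega}{2\Omega}\, f\left(\frac{\pi}{\Omega} n\right) \frac{\sin(t\Omega - \pi n)}{t\Omega - \pi n} = \sum_{n \in \mathbb{Z}} f\left(\frac{\pi}{\Omega} n\right) \mathrm{sinc}(t\Omega - \pi n)$$

and, as underlined before, the series is uniformly convergent. □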
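The reconstruction formula [6.29] can be checked numerically. The following is a minimal sketch rather than a definitive implementation: the test signal $f(t) = \mathrm{sinc}(t)^2$ (whose spectrum is a triangle supported in $[-2, 2]$, so that $\Omega = 2$), the function names and the truncation of the series to $|n| \le 2000$ are all assumptions made for the illustration.

```python
import math

def sinc(x: float) -> float:
    """Cardinal sine sin(x)/x, extended by continuity with sinc(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(x) / x

def reconstruct(f, omega_max: float, t: float, n_terms: int = 2000) -> float:
    """Truncated series [6.29]: sum over |n| <= n_terms of f(pi*n/Omega) * sinc(Omega*t - pi*n)."""
    total = 0.0
    for n in range(-n_terms, n_terms + 1):
        total += f(math.pi * n / omega_max) * sinc(omega_max * t - math.pi * n)
    return total

def signal(t: float) -> float:
    """Hypothetical test signal sinc(t)^2, band-limited with Omega = 2."""
    return sinc(t) ** 2

t0 = 0.3
exact = signal(t0)
well_sampled = reconstruct(signal, 2.0, t0)   # sampling period pi/2, as in the theorem
under_sampled = reconstruct(signal, 1.0, t0)  # sampling period pi: too coarse for this signal
```

With $\Omega = 2$ the truncated series agrees with `signal(t0)` up to the truncation error. With the coarser sampling period $\pi$, all samples except the one at $n = 0$ vanish, so the series returns $\mathrm{sinc}(t)$ instead of $\mathrm{sinc}(t)^2$, which gives a simple instance of a reconstruction error caused by undersampling.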

6.10.1. The Nyquist frequency: aliasing and oversampling

Since the sinc function is fixed, the signal $f$ is unequivocally characterized by the sequence of samples $f\left(\frac{\pi}{\Omega} n\right)$.

The sampling period used in the theorem is $T = \frac{\pi}{\Omega}$, so the sampling frequency, known as the Nyquist frequency and denoted $\nu_N$, is $\nu_N = \frac{1}{T} = \frac{\Omega}{\pi}$.

We now wish to compare the Nyquist frequency with the maximal frequency present in the signal $f$. Remember that we started with the hypothesis that $f$ is a finite-bandwidth signal with maximum angular frequency $\Omega$. Then the maximum frequency $\nu_{\max}$ of $f$ is defined by the relation $\Omega = 2\pi \nu_{\max}$, i.e. $\nu_{\max} = \frac{\Omega}{2\pi}$. Comparing the Nyquist sampling frequency $\nu_N$ with the maximal frequency $\nu_{\max}$ of the signal $f$, we obtain $\nu_N = 2\nu_{\max}$, which tells us that the sampling theorem holds if and only if the sampling frequency is at least twice the maximal frequency present in the signal $f$. This is coherent with the results of the discrete Fourier transform, where we have seen that the highest frequency of a discrete signal given by $N$ periodic samples is $\frac{N}{2}$ if $N$ is even, or the integer part of $\frac{N}{2}$ if $N$ is odd.

If the sampling frequency is lower than the Nyquist frequency, then a phenomenon known as aliasing occurs; this corresponds to errors in signal reconstruction. These errors result from the fact that, as we saw in our proof, we need to consider a periodic extension of the spectrum of $f$; the Nyquist frequency $\nu_N$ is the minimum frequency which allows $f$ to be reconstructed without "overflowing" into the next period of the spectrum. A lower sampling frequency results in the inclusion of parasitic information from the adjacent spectrum periods on each side.

Finally, we note that the general term of the series in the theorem converges to 0 with the same speed as $\frac{1}{n}$ when $n \to +\infty$; this is a relatively slow convergence. The convergence speed can be increased, for example to $\frac{1}{n^2}$, by increasing the sampling frequency: this technique is known as oversampling.

6.11. Application of the Fourier transform to solve ordinary and partial differential equations

The way the Fourier transform behaves with respect to derivatives makes it particularly helpful for solving certain types of differential equations. The general idea is illustrated below in the case of an ordinary differential equation (ODE).

6.11.1. Solving an ordinary differential equation using the Fourier transform

Taking $y, g : \mathbb{R} \to \mathbb{R}$, $y, g \in L^1(\mathbb{R})$, $y$ twice differentiable, consider the following ODE:

$$y''(t) - y(t) = -g(t) \quad \forall t \in \mathbb{R}$$


Applying the Fourier transform to both sides, by the property of linearity, we can write:

$$\widehat{y''}(\omega) - \hat y(\omega) = -\hat g(\omega)$$

that is:

$$-\omega^2 \hat y(\omega) - \hat y(\omega) = -\hat g(\omega) \iff (1 + \omega^2)\, \hat y(\omega) = \hat g(\omega)$$

that is:

$$\hat y(\omega) = \frac{1}{1 + \omega^2} \cdot \hat g(\omega) \qquad \text{(Solution in the frequency domain)}$$

We see that the properties of the Fourier transform allowed us to transform the ODE into an algebraic equation in the frequency domain. If we know the Fourier transform of $g$, then the ODE is solved in the Fourier space. However, as the original ODE was formulated in terms of the variable $t$, we must return to the original representation by applying the inverse Fourier transform to both sides of the final equation. Using property [6.28], we have:

$$y(t) = \left(\hat y(\omega)\right)^{\vee}(t) = \left[\frac{1}{1 + \omega^2} \cdot \hat g(\omega)\right]^{\vee}(t) = \frac{1}{\sqrt{2\pi}} \left( \left(\frac{1}{1 + \omega^2}\right)^{\vee}(t) * g(t) \right) \qquad [6.32]$$

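We can verify by direct calculation that:

$$\widehat{e^{-a|t|}}(\omega) = \sqrt{\frac{2}{\pi}}\, \frac{a}{a^2 + \omega^2}$$

so, considering $a = 1$:

$$\sqrt{\frac{\pi}{2}}\, \widehat{e^{-|t|}}(\omega) = \frac{1}{1 + \omega^2}$$

and then:

$$y(t) = \frac{1}{\sqrt{2\pi}} \sqrt{\frac{\pi}{2}}\, e^{-|t|} * g(t)$$

that is:

$$y(t) = \frac{1}{2} \int_{-\infty}^{+\infty} e^{-|t - s|}\, g(s)\, ds = \frac{1}{2} \int_{-\infty}^{+\infty} g(t - s)\, e^{-|s|}\, ds$$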
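As a numerical illustration of this convolution formula, the sketch below approximates $y(t)$ with the trapezoidal rule. The choice $g(t) = e^{-|t|}$ is an assumption made only for the example; for it, a direct computation gives $y(t) = \frac{1}{2}(1 + |t|)e^{-|t|}$, which does satisfy $y'' - y = -g$. The function name `solve_ode`, the truncation of the integral to $[-30, 30]$ and the step size are likewise hypothetical choices.

```python
import math

def solve_ode(g, t: float, half_width: float = 30.0, step: float = 1e-3) -> float:
    """Approximate y(t) = 1/2 * integral of g(t - s) * exp(-|s|) ds over [-half_width, half_width]
    with the composite trapezoidal rule (the tail beyond half_width is neglected)."""
    n = int(round(2 * half_width / step))
    total = 0.0
    for k in range(n + 1):
        s = -half_width + k * step
        weight = 0.5 if k in (0, n) else 1.0  # trapezoidal end-point weights
        total += weight * g(t - s) * math.exp(-abs(s))
    return 0.5 * step * total

def g(t: float) -> float:
    """Hypothetical right-hand side used for the check."""
    return math.exp(-abs(t))

def y_exact(t: float) -> float:
    """Closed-form solution for this particular g."""
    return 0.5 * (1.0 + abs(t)) * math.exp(-abs(t))

y_numeric = solve_ode(g, 1.0)
```

For this choice of $g$, `y_numeric` should agree with `y_exact(1.0)` up to the quadrature error, which is of the order of the square of the step.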

If we are able to calculate the integral (this depends on the analytical expression of $g$), then $y(t)$ can be determined explicitly; otherwise, the value must be approximated.

To solve an ODE via the Fourier transform, we thus need to perform the following operations:
1) transform the ODE into the frequency domain, applying the Fourier transform to both sides of the equation;
2) solve the resulting algebraic equation in the Fourier space;
3) apply the inverse Fourier transform to obtain the solution to the ODE in its original representation;
4) typically, the solution in the Fourier space is given by a product; hence, the solution in the original representation is given by a convolution.

This technique can only be used if the coefficients of the derivatives are constant, and if the functions are integrable.

6.11.2. The Fourier transform and partial differential equations

The Fourier transform is even more effective when applied to partial differential equations. For the purposes of our presentation, we shall only consider functions of the type $u = u(t, x)$ or $u = u(t, x, y, z)$, where $t$ is the time coordinate and $x$ or $(x, y, z)$ are one-dimensional (1D) or three-dimensional (3D) coordinates, respectively. It is implicitly considered that $u \in L^1(\mathbb{R}^2)$ or $u \in L^1(\mathbb{R}^4)$, respectively, and that $u$ can be differentiated enough times so that the corresponding PDE is well defined. For simplicity's sake, we write:

$$\frac{\partial u}{\partial x} = u_x, \quad \frac{\partial^2 u}{\partial x^2} = u_{xx}, \quad \frac{\partial u}{\partial t} = u_t, \ \ldots$$

The properties of the Fourier transform with respect to the partial derivatives are as follows:

– if the integration variable of the Fourier transform is $x$, then:

$$\widehat{u_x}(t, \omega) = i\omega\, \hat u(t, \omega), \qquad \widehat{u_{xx}}(t, \omega) = -\omega^2\, \hat u(t, \omega)$$

$$\widehat{u_t}(t, \omega) = \frac{\partial}{\partial t} \hat u(t, \omega), \qquad \widehat{u_{tt}}(t, \omega) = \frac{\partial^2}{\partial t^2} \hat u(t, \omega)$$

The first two formulas are straightforward; to obtain the remaining two, we note that, since $u \in L^1(\mathbb{R}^2)$, the order of differentiation and integration can be interchanged:

$$\widehat{u_t}(t, \omega) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} \frac{\partial u}{\partial t}(t, x)\, e^{-i\omega x}\, dx = \frac{\partial}{\partial t}\, \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} u(t, x)\, e^{-i\omega x}\, dx = \frac{\partial}{\partial t} \hat u(t, \omega)$$

The same is true for $u_{tt}$;


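– if the integration variable of the Fourier transform is $t$, then:

$$\widehat{u_t}(x, \omega) = i\omega\, \hat u(x, \omega), \qquad \widehat{u_{tt}}(x, \omega) = -\omega^2\, \hat u(x, \omega)$$

$$\widehat{u_x}(x, \omega) = \frac{\partial}{\partial x} \hat u(x, \omega), \qquad \widehat{u_{xx}}(x, \omega) = \frac{\partial^2}{\partial x^2} \hat u(x, \omega)$$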

– these considerations can be extended to $u(t, x, y, z)$.

6.11.3. Solving the partial differential equation for heat propagation using the Fourier transform

Consider the Cauchy problem for $u \in C^2(\mathbb{R}^2) \cap L^1(\mathbb{R}^2)$ and $\varphi \in C^2(\mathbb{R}) \cap L^1(\mathbb{R})$ defined by:

$$\begin{cases} u_t = \alpha^2 u_{xx} & \forall x \in (-\infty, +\infty),\ \forall t \in (0, +\infty),\ \alpha \in \mathbb{R}^+ \\ u(0, x) = \varphi(x) & \forall x \in (-\infty, +\infty),\ t = 0 \end{cases}$$

where
– $u(t, x)$ is the temperature of a 1D bar at time $t$ and at the point $x$;
– $u_t(t, x)$ is the rate of temperature change at time $t$ and at the point $x$;
– $u_{xx}(t, x)$ is the concavity of the temperature profile at time $t$ and at the point $x$ (note that the second derivative is with respect to the spatial variable, thus it would be wrong to interpret $u_{xx}$ as an acceleration);
– $\varphi(x)$ is the initial temperature profile at the point $x$.

If we write the second discrete derivative (with step $\Delta x$) with respect to $x$, we see that it compares the temperature at point $x$ at time $t$ with that of its neighbors at the same instant:

$$u_{xx}(t, x) \simeq \frac{u(t, x + \Delta x) - 2u(t, x) + u(t, x - \Delta x)}{(\Delta x)^2} = \frac{2}{(\Delta x)^2} \Bigg[ \underbrace{\frac{u(t, x + \Delta x) + u(t, x - \Delta x)}{2}}_{\text{mean temperature of neighboring points}} - u(t, x) \Bigg]$$

Thus, the equation $u_t = \alpha^2 u_{xx}$ tells us that:
– if $u(t, x)$ is less than the mean temperature of its neighbors, then $u_{xx} > 0$ and thus $u_t(t, x)\ (= \alpha^2 u_{xx}) > 0$, meaning that the temperature at the point $x$ will increase over time: the neighboring points lose some of their heat in favor of $x$ in order to attain thermal equilibrium;
– in the opposite case, $u_t(t, x)\ (= \alpha^2 u_{xx}) < 0$ and so the temperature at point $x$ decreases over time: $x$ loses heat to its neighbors in order to attain thermal equilibrium;
– the positive constant $\alpha^2$ is a characteristic of the material, known as the thermal diffusion coefficient. The higher the value of $\alpha^2$, the faster the bar will reach thermal equilibrium.

The heat equation is used in many other domains: for instance, in image processing, it is used to smooth out imperfections, and in the field of economics, it plays an important role in the Black-Scholes-Merton model of financial markets.

The heat equation is solved by calculating the Fourier transform (integrating with respect to the variable $x$) on both sides:

$$u_t(t, x) = \alpha^2 u_{xx}(t, x) \quad \longrightarrow \quad \frac{\partial}{\partial t} \hat u(t, \omega) = -\alpha^2 \omega^2\, \hat u(t, \omega)$$

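The initial condition in the Fourier space becomes $\hat u(0, \omega) = \hat\varphi(\omega)$. The PDE has thus been transformed into an ODE:

$$\begin{cases} u_t(t, x) = \alpha^2 u_{xx}(t, x) \\ u(0, x) = \varphi(x) \end{cases} \quad \longrightarrow \quad \begin{cases} \frac{\partial}{\partial t} \hat u(t, \omega) = -\alpha^2 \omega^2\, \hat u(t, \omega) \\ \hat u(0, \omega) = \hat\varphi(\omega) \end{cases}$$

because $\omega$ is a constant with respect to the variable $t$, thus the equation $\frac{\partial}{\partial t} \hat u(t, \omega) = -\alpha^2 \omega^2\, \hat u(t, \omega)$ is ordinary. We recall that the solution of the Cauchy problem:

$$\begin{cases} y' = -ky \\ y(0) = y_0 \end{cases}$$

is $y(t) = y_0 e^{-kt}$ and thus, in the present case:

$$\hat u(t, \omega) = \hat\varphi(\omega) \cdot e^{-\alpha^2 \omega^2 t} = \hat\varphi(\omega) \cdot e^{-(\alpha^2 t)\omega^2} \qquad \text{(Solution in the Fourier space)}$$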

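The inverse Fourier transform is then applied to obtain the solution in the original representation. Using equation [6.28], we obtain:

$$u(t, x) = \left( \hat\varphi(\omega) \cdot e^{-(\alpha^2 t)\omega^2} \right)^{\vee}(t, x) = \frac{1}{\sqrt{2\pi}}\, \check{\hat\varphi}(x) * \left( e^{-(\alpha^2 t)\omega^2} \right)^{\vee}(t, x)$$

Furthermore, $\check{\hat\varphi}(x) = \varphi(x)$, and $e^{-(\alpha^2 t)\omega^2}$ is a Gaussian with respect to $\omega$, so we can use the following property:

$$\widehat{e^{-c^2 x^2}}(\omega) = \frac{1}{c\sqrt{2}}\, e^{-\frac{\omega^2}{4c^2}} \iff c\sqrt{2}\; \widehat{e^{-c^2 x^2}}(\omega) = e^{-\frac{1}{4c^2}\omega^2}$$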


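In our case, this gives us $\frac{1}{4c^2} = \alpha^2 t$; hence $c^2 = \frac{1}{4\alpha^2 t}$ and then $c = \frac{1}{2\alpha\sqrt{t}}$ (in physical terms, only the positive determination of the root is relevant). Finally, we can write:

$$\left( e^{-(\alpha^2 t)\omega^2} \right)^{\vee}(t, x) = \frac{\sqrt{2}}{2\alpha\sqrt{t}}\, e^{-\frac{x^2}{4\alpha^2 t}} = \frac{1}{\alpha\sqrt{2t}}\, e^{-\frac{x^2}{4\alpha^2 t}}$$

and the solution of the heat equation is thus:

$$u(t, x) = \frac{1}{\alpha\sqrt{4\pi t}} \int_{-\infty}^{+\infty} e^{-\frac{(x - y)^2}{4\alpha^2 t}}\, \varphi(y)\, dy$$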
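The integral formula above can be evaluated numerically. The sketch below is only an illustration under stated assumptions: the initial profile $\varphi(y) = e^{-y^2}$ is a hypothetical choice, for which the convolution of two Gaussians gives the closed form $u(t, x) = \frac{1}{\sqrt{1 + 4\alpha^2 t}}\, e^{-\frac{x^2}{1 + 4\alpha^2 t}}$; the function name `heat_solution`, the integration window and the step size are also assumptions, and the window is only adequate for moderate values of $\alpha^2 t$.

```python
import math

def heat_solution(phi, alpha: float, t: float, x: float,
                  half_width: float = 15.0, step: float = 1e-3) -> float:
    """Approximate u(t, x) = 1/(alpha*sqrt(4*pi*t)) * integral of exp(-(x-y)^2/(4*alpha^2*t)) * phi(y) dy
    with the trapezoidal rule on [x - half_width, x + half_width]; requires t > 0."""
    n = int(round(2 * half_width / step))
    total = 0.0
    for k in range(n + 1):
        y = x - half_width + k * step
        weight = 0.5 if k in (0, n) else 1.0  # trapezoidal end-point weights
        kernel = math.exp(-((x - y) ** 2) / (4 * alpha ** 2 * t))
        total += weight * kernel * phi(y)
    return step * total / (alpha * math.sqrt(4 * math.pi * t))

def phi(y: float) -> float:
    """Hypothetical initial temperature profile."""
    return math.exp(-y * y)

def u_exact(alpha: float, t: float, x: float) -> float:
    """Closed form for the Gaussian initial profile above."""
    spread = 1.0 + 4.0 * alpha ** 2 * t
    return math.exp(-x * x / spread) / math.sqrt(spread)

u_numeric = heat_solution(phi, alpha=1.0, t=0.5, x=0.7)
```

In this example the factor $1 + 4\alpha^2 t$ grows linearly in $t$: the profile flattens and widens as time increases while its integral over $x$ stays constant.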

Certain expressions of $\varphi(x)$ permit exact integration, and an analytical expression of $u(t, x)$ is thus possible. Generally, however, it is only possible to approximate $u(t, x)$.

It is interesting to note that, as the standard expression of a Gaussian is:

$$\frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x - \mu)^2}{2\sigma^2}}$$

then $\sigma = \alpha\sqrt{2t}$, i.e. $\sigma^2 = 2\alpha^2 t$: the variance of the Gaussian featured in the solution of the heat equation is not fixed, but increases linearly as the time $t$ increases.

This tells us that the Gaussian widens over time; this is perfectly coherent with common experience, given that as $t \to +\infty$, the bar reaches thermal equilibrium and thus the temperature is uniform across the whole bar.

The observations above provide a deeper insight into the technique of convolution with a Gaussian, widely used in signal processing, for example to blur digital images. Taking $\varphi(y)$ to represent the original intensity of any given pixel $y$ in a digital image, and interpreting $u(t, x)$ as the intensity of the blurred image at time $t$ and in a fixed pixel at position $x$, the convolution of an image with a Gaussian may be considered as the exchange of intensity ("heat") between $x$ and its neighbors. Furthermore, just as heat propagation is an irreversible process, the blurring effect obtained by convolution with a Gaussian cannot be directly inverted.

One final observation linked to the spatial dimension of the problem is that the application of the technique described above requires $x$ to range between $-\infty$ and $+\infty$. Other techniques are used to solve problems where $x$ varies within a bounded interval, including the sine and cosine Fourier transforms and the Laplace transform.


6.12. Summary

Linear operators between normed vector spaces are continuous at a given point if and only if they are continuous everywhere, and if and only if they are bounded. All linear operators defined on a finite-dimensional vector space are continuous (and thus bounded); this ceases to be true, in general, when the space on which the operator is defined is not of finite dimension. A classic example is provided by the differentiation operator.

For bounded linear operators, we can define a norm, with four equivalent definitions, which makes the set $B(V, W)$ of bounded linear operators between two normed vector spaces $V$ and $W$ a normed vector space in its own right. In the specific case where $V = W$, the composition of operators defines a product in $B(V)$ with respect to which $B(V)$ becomes a unital normed associative algebra. Furthermore, if $W$ is complete, $B(V, W)$ is complete; in the specific case where $V = W = H$, a Hilbert space, $B(H)$ is a unital Banach algebra, that is, a complete associative normed algebra such that $\|AB\| \le \|A\|\,\|B\|$ $\forall A, B \in B(H)$.

The kernel of a bounded operator is always a closed vector subspace in the domain of the operator. If the kernel consists solely of the zero vector, then the operator is invertible on its image, but its inverse will not necessarily be bounded. The existence of $\mu > 0$ such that $\|Ax\| \ge \mu\|x\|$ for all $x \in V$ gives a simple and useful characterization of the bounded invertibility of an operator $A : V \to W$. If $V$ is a Banach space and this condition is verified, then $\mathrm{Im}(A)$, the image space of $A$, is closed. In practical applications, the closedness of $\ker(A)$ (where $A$ is continuous) and of $\mathrm{Im}(A)$ (under the hypotheses given above) may be used to show that a subspace is closed: we must simply show that it coincides with the kernel or image of a linear operator which satisfies those hypotheses.

The dual of an arbitrary vector space $V$ on the field $K = \mathbb{R}$ or $\mathbb{C}$ is the vector space $V^*$ of linear functionals defined on the vector space itself.
If the space is normed, then it is natural to require compatibility with the topological structure generated by the norm, that is, to require the functionals to be continuous: we define $V^* = B(V, K)$. Given that $K$ is complete, $V^*$ is always complete, even when $V$ is not. In the case of a Hilbert space $H$, the Riesz representation theorem tells us that $H$ and $H^*$ are isomorphic by the transformation which associates each $x \in H$ with the functional $T_x$ which implements the inner product, that is, $T_x(y) = \langle y, x \rangle$ $\forall y \in H$. This theorem makes it possible to define the adjoint $A^\dagger$ of any operator $A \in B(H)$ via the relationship $\langle A^\dagger x, y \rangle = \langle x, Ay \rangle$ $\forall x, y \in H$. If $A = A^\dagger$, then $A$ is said to be self-adjoint. Two examples of self-adjoint operators are $A^\dagger A$ and $AA^\dagger$.

The adjoint of a bounded linear operator is a particularly important operator in both theory and practice. An idea of its importance can be seen in the theorem used to characterize an orthogonal projection operator on a Hilbert space: $A \in B(H)$ is an orthogonal projector on $\mathrm{Im}(A)$ if and only if $A$ is self-adjoint ($A = A^\dagger$) and idempotent ($A^2 = A$). This result can be used, for example, to show that multiplication operators on $L^2(\mathbb{R}^n)$ are orthogonal projectors if and only if they multiply by the indicator function of a measurable subset of $\mathbb{R}^n$. There is also a highly important geometric representation of orthogonal projectors: $A \in B(H)$ is an orthogonal projector if and only if there exists an orthonormal system $(u_n)_{n \in \mathbb{N}}$ in $H$ such that $Ax = \sum_{n \in \mathbb{N}} \langle x, u_n \rangle u_n$, $\forall x \in H$. This realization of the projector is the extension, in infinite dimensions, of the analogous formula valid in finite dimension.

The adjoint also plays a role in the analysis of isometric and unitary operators. An operator $A \in B(H)$ is isometric if it conserves the norm (or, in an equivalent manner, the inner product); a unitary operator is isometric and surjective. The two categories of operators have unit norm. The relationship between isometric operators and orthogonal projectors is given by the following result: if $A \in B(H)$ is isometric, then $AA^\dagger$ is an orthogonal projector. If $A \in B(H)$ is isometric, then $\mathrm{Im}(A) = \mathrm{Im}(AA^\dagger)$ and, given that $AA^\dagger$ is an orthogonal projector (since $A$ is taken to be isometric), $\mathrm{Im}(AA^\dagger)$ is closed; thus the image space of an isometric operator is always closed. Since $\ker(A^\dagger) = \mathrm{Im}(A)^\perp$, if $A$ is isometric but not surjective, then $\mathrm{Im}(A) \ne H$; hence, $\mathrm{Im}(A)^\perp \ne \{0_H\}$ and then $A^\dagger$ is not invertible. Using the same argument, we also see that if $A$ is unitary, then $A^\dagger$ is invertible.

As in the case of orthogonal projection operators, an algebraic characterization of isometric and unitary operators can be obtained via the adjoint: $A \in B(H)$ is isometric if and only if $A^\dagger A = \mathrm{id}_H$, while $A \in B(H)$ is unitary if and only if $A^\dagger A = AA^\dagger = \mathrm{id}_H$; in this final case, $A$ is invertible and $A^{-1} = A^\dagger$. Moreover, we can show that $A$ is unitary if and only if $A^\dagger$ is unitary. One consequence of this result is that the unitary nature of an operator $A$ can be studied by examining that of its adjoint, which, in some cases, is simpler. Regarding the geometric realization of isometric and unitary operators, $A \in B(H)$ is isometric if and only if it transforms Hilbert bases into orthonormal systems, while $A \in B(H)$ is unitary if and only if it transforms Hilbert bases into Hilbert bases. This is an important difference with respect to the finite-dimensional case.

The Fourier transform $f(x) \mapsto \hat f(\omega) = \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} f(x)\, e^{-i\langle \omega, x \rangle}\, dx$ is widely used in both pure and applied mathematics.
The most "natural" space in which to define this transform is the Schwartz space; in this space, the Fourier transform has the integral formula given above, and is an isometric isomorphism with respect to the norm inherited from $L^2(\mathbb{R}^n)$. If we wish to extend the transform to a space with less regular functions, for example $L^1(\mathbb{R}^n)$ or $L^2(\mathbb{R}^n)$, certain properties must be sacrificed. On $L^1(\mathbb{R}^n)$, the image is contained in $C_\infty(\mathbb{R}^n)$, but the integral formula is preserved. On $L^2(\mathbb{R}^n)$, the integral formula must be replaced by a limit formula, but the isomorphic character of the transform is retained; the extension of the Fourier transform on $L^2(\mathbb{R}^n)$ defines a unitary operator $\mathcal{F} \in B(L^2(\mathbb{R}^n))$. An explicit formula for this unitary operator can be obtained by means of the Hermite basis: $\mathcal{F} f = \sum_{n \in \mathbb{N}} (-i)^n \langle f, u_n \rangle u_n$. Finally, we note that, to within a constant, the Fourier transform of the convolution of two functions in $L^1(\mathbb{R}^n)$ is the pointwise product of the transforms.

Finally, we presented the Nyquist-Shannon sampling theorem, which enables the reconstruction of a signal with bounded bandwidth using a sufficiently dense, but discrete, set of samples of this signal. We also described applications of the Fourier transform in solving differential equations, notably the heat equation, which played a crucial role in the development of Fourier's theory.

Appendix 1 Quotient Space

The concept of quotient of a vector space is essential in mathematics, and, in our opinion, does not always receive the attention it deserves in works on linear algebra. For this reason, we have chosen to devote an appendix to the definition and interpretation of this concept.

DEFINITION A1.1.– An equivalence relation $\sim$ defined on a vector space $V$ (of arbitrary dimension) on the field $K$ is said to be compatible with the linear structure of $V$ if:

$$v \sim v',\ w \sim w' \implies \alpha v + \beta w \sim \alpha v' + \beta w', \quad \forall \alpha, \beta \in K$$

The equivalence class of 0 in $V$ is a vector space $Z$ (since it is stable with respect to linear combinations, and contains the neutral element) known as the kernel of the equivalence relationship. One special case of this definition is when $w = w' = v'$ and $\alpha = 1$, $\beta = -1$, which implies:

$$v \sim v' \implies v - v' \sim 0 \iff v - v' \in Z$$

Conversely, if $v - v' \in Z$, i.e. $v - v' \sim 0 \iff v - v' \sim v' - v'$, and by the fact that $v' \sim v'$, and since $\sim$ is compatible with the linear structure of $V$, we obtain: $v - v' + v' \sim v' - v' + v'$, that is, $v \sim v'$. In short: $v \sim v' \iff v - v' \in Z$, which tells us that an equivalence relationship compatible with the linear structure of a vector space is univocally determined by its kernel, which is a vector subspace of $V$.

This observation allows us to reverse the process. Given an arbitrary vector subspace $W$ in $V$, if we define:

$$v \sim_W v' \iff v - v' \in W \quad \forall v, v' \in V$$

From Euclidean to Hilbert Spaces: Introduction to Functional Analysis and its Applications First Edition. Edoardo Provenzi. © ISTE Ltd 2021. Published by ISTE Ltd and John Wiley & Sons, Inc.


then $\sim_W$ is an equivalence relationship in $V$ which is compatible with its linear structure and has $W$ as its kernel. By symmetry, $v \sim_W v' \iff v' \sim_W v \iff v' - v \in W$, and this means that there exists $w \in W$ such that $v' - v = w$, that is, $v' = v + w$; thus, the equivalence class $[v]_W$ containing $v \in V$ is the subset of $V$ given by:

$$[v]_W = v + W = \{v + w : w \in W\}$$

This is referred to as a linear subvariety and interpreted geometrically as the shift of the subspace $W$ by the vector $v$. We observe that if $v \in W$, then, by linearity, the shift of $W$ through $v$ does not modify $W$. As a vector subspace of $V$, $W$ contains 0; thus if $v \notin W$, then the equivalence class $v + W$ does not contain 0 and cannot, therefore, be a vector subspace of $V$.

Lemma A1.1 is essential for defining a quotient space, and will be used extensively in the rest of this appendix.

LEMMA A1.1.– Let $V$ be an arbitrary vector space, let $v_1, v_2 \in V$ and let $W_1, W_2$ be vector subspaces of $V$. Then the equality $v_1 + W_1 = v_2 + W_2$ holds if and only if $W_1 = W_2 \equiv W$ and $v_1 - v_2 \in W$.

PROOF.– $\Leftarrow$: taking $v_1 - v_2 \equiv w_0 \in W = W_1 = W_2$, then $v_2 = v_1 - w_0$ and:

$$v_1 + W = \{v_1 + w : w \in W\}, \qquad v_2 + W = \{v_1 + w - w_0 : w \in W\}$$

but evidently $W = \{w : w \in W\} = \{w - w_0 : w \in W\}$, hence $v_1 + W = v_2 + W$.

$\Rightarrow$: inversely, taking $v_1 + W_1 = v_2 + W_2$, then, by the definition of a linear subvariety, $v_1 - v_2 + W_1 = v_2 - v_2 + W_2 = W_2$; thus, if $w_0 = v_1 - v_2$, we obtain $w_0 + W_1 = W_2$. Since $W_2$ is a vector subspace of $V$, it contains 0. Hence $w_0 + W_1$ also contains 0, i.e. $-w_0 \in W_1$ and thus $w_0 \in W_1$ since $W_1$ is also a vector subspace. Shifting the vectors of $W_1$ using $w_0$, which is a vector in $W_1$, does not change the subspace, i.e. $w_0 + W_1 = W_1$; however, since $w_0 + W_1 = W_2$, we obtain $W_1 = W_2 = W$ and $w_0 = v_1 - v_2 \in W_1 = W$. □

This lemma implies that every linear subvariety is uniquely determined by a single subspace $W$, of which the subvariety is the shift. Moreover, the vector which induces the shift is uniquely determined, up to the sum with a vector in $W$.

It is now possible to establish the definition of quotient space and prove that this definition is well posed.


DEFINITION A1.2 (quotient (vector) space).– Let $V$ be any vector space and $W$ a vector subspace of $V$. The quotient vector space $V/W$ is the set of all linear subvarieties of $V$ which are shifts of $W$, equipped with the following linear operations:

$$(v_1 + W) + (v_2 + W) = (v_1 + v_2) + W, \quad \forall v_1, v_2 \in V$$

$$\alpha(v + W) = \alpha v + W, \quad \forall v \in V,\ \forall \alpha \in K$$

Let us verify that these operations are well defined and that $V/W$ is a vector space on $K$. The easy proof that the vector space axioms for $V/W$ are directly induced by the vector space properties of $V$ is left to the reader. Let us just underline the following properties:
a) if $v_1 + W = v_1' + W$ and $v_2 + W = v_2' + W$, then $v_1 + v_2 + W = v_1' + v_2' + W$;
b) if $v_1 + W = v_1' + W$, then $\alpha v_1 + W = \alpha v_1' + W$;
$\forall v_1, v_2, v_1', v_2' \in V$ and $\forall \alpha \in K$.

We begin by proving the validity of property a. Lemma A1.1 tells us that $v_1 - v_1' \equiv w_1 \in W$ and $v_2 - v_2' \equiv w_2 \in W$, thus:

$$(v_1 + v_2) + W = (v_1' + v_2') + \underbrace{(w_1 + w_2) + W}_{= W} = (v_1' + v_2') + W$$

To prove the validity of property b, we simply note that if we write $v_1 - v_1' \equiv w \in W$, then $\alpha(v_1 - v_1') = \alpha w \in W$, and thus by Lemma A1.1, $\alpha v_1 + W = \alpha v_1' + W$.

It is natural to wonder what the dimension of $V/W$ is, and whether it is linked or not to the dimensions of $V$ and $W$. To answer this question we need the following preliminary result.

LEMMA A1.2.– Let $V$ be an arbitrary vector space, $W$ a vector subspace of $V$ and $H$ a subspace of $V$ which is supplementary to $W$, i.e. such that $W \cap H = \{0\}$ and $V = W \oplus H$. Then, for any vector $v \in V$ which implements a translation of $W$, there exists only one vector $h_v \in H \cap (v + W)$. This vector is used to write $v$ in a unique manner in the direct sum $v = w_v + h_v$.

PROOF.– Let us begin by proving the existence of a vector $h_v$ belonging to $H$ and to $v + W$. By the hypothesis $V = W \oplus H$, any vector $v \in V$ may be written in a unique manner as $v = w_v + h_v$, $w_v \in W$ and $h_v \in H$; we must prove that $h_v \in v + W$.

To do this, let us now consider a vector $v' = w_{v'} + h_{v'} \in V$, $w_{v'} \in W$ and $h_{v'} \in H$, which belongs to the same equivalence class as $v$, that is, which is such that $v + W = v' + W$. Again, using Lemma A1.1, $v' - v \in W$, that is, $w_{v'} + h_{v'} - w_v - h_v \in W$, that is, $(w_{v'} - w_v) + (h_{v'} - h_v) \in W$. Moreover, since $w_{v'} - w_v \in W$ and $h_{v'} - h_v \in H$, the only case in which their sum remains within $W$ is where $h_{v'} - h_v = 0$ (given that $W \cap H = \{0\}$). Hence, $h_{v'} = h_v$ and then $v + W \ni v' = w_{v'} + h_v$, that is, $w_{v'} + h_v \in v + W$. Using Lemma A1.1 once more, we know that the sum of a vector belonging to $v + W$ and a vector in $W$ does not take us outside of the equivalence class $v + W$, thus $h_v \in v + W$. Since $h_v \in H$ and $h_v \in v + W$, then $h_v \in H \cap (v + W)$.

Inversely, if $h \in H \cap (v + W)$, then, in particular, $h \in v + W$, that is, $h \sim_W v$, that is, there exists $\tilde w_v \in W$ such that $h = \tilde w_v + v$, that is, $v = w_v + h$, where $w_v = -\tilde w_v$; by the uniqueness of the decomposition in $V = W \oplus H$, this gives $h = h_v$. □

THEOREM A1.1.– If $W$ is a subspace of the vector space $V$ which admits a supplement $H$ in $V$, then $H$ is isomorphic to $V/W$:

$$V/W \simeq H, \qquad V = W \oplus H$$

PROOF.– The uniqueness of the vector $h_v$, as established by Lemma A1.2, allows us to construct a bijective and intrinsically linear correspondence which associates an arbitrary linear subvariety $v + W$ in $V$ with the component in $H$ of an arbitrary representative $v \in v + W$, that is, the map:

$$V/W \longrightarrow H, \qquad v + W \longmapsto h_v, \quad \text{such that } v = w_v + h_v$$

is a linear isomorphism. □

Note that, given a closed vector subspace $W$ of a Hilbert space $H$, the orthogonal projection theorem 5.7 tells us that a supplementary space always exists in the form of the orthogonal complement $W^\perp$; hence, in this case:

$$H/W \simeq W^\perp$$

that is, the quotient vector space of a Hilbert space on a closed vector subspace $W$ is isomorphic to the orthogonal complement of $W$.

This result also allows us to determine the dimension of $V/W$ as a function of that of $V$ and of $W$ in finite dimensions. In this case, $\dim(V) = \dim(W) + \dim(H)$ and $\dim(H) = \dim(V/W)$, then $\dim(V/W) = \dim(V) - \dim(W)$.

COROLLARY A1.1 (Dimension of $V/W$).– Let $V$ be a vector space of finite dimension and $W$ a vector subspace of $V$, then:

$$\dim(V/W) = \dim(V) - \dim(W)$$
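Lemma A1.2 and Theorem A1.1 can be illustrated in a very small case. The sketch below assumes $V = \mathbb{R}^2$, $W = \mathrm{span}\{(1, 1)\}$ and the supplement $H = \mathrm{span}\{(0, 1)\}$; these choices, and the function name `component_in_H`, are hypothetical and serve only the example. Writing $v = (a, b) = a(1, 1) + (0, b - a)$, the component in $H$ is $h_v = (0, b - a)$, and it should not depend on the representative chosen in the class $v + W$.

```python
def component_in_H(v):
    """For V = R^2, W = span{(1, 1)} and H = span{(0, 1)}, return h_v
    in the unique decomposition v = w_v + h_v with w_v = a * (1, 1)."""
    a, b = v
    return (0.0, b - a)

v = (2.0, 5.0)
v_prime = (v[0] + 3.0, v[1] + 3.0)   # v' = v + 3 * (1, 1): same class v + W
other = (2.0, 6.0)                   # other - v = (0, 1) is not in W: different class

h_v = component_in_H(v)
h_v_prime = component_in_H(v_prime)
h_other = component_in_H(other)
```

Here `h_v` and `h_v_prime` coincide, while `h_other` differs, in agreement with the correspondence $v + W \mapsto h_v$; the quotient $V/W$ is identified with the line $H$, so that $\dim(V/W) = 2 - 1 = 1$.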


Many problems in both pure and applied mathematics require us to consider situations where $V$ and $W$ are of infinite dimension, while $V/W$ is of finite dimension. In this case, $\dim(V/W)$ is known as the codimension of $W$ in $V$ and written as $\mathrm{codim}(V/W)$.

Once the dimension of $V/W$ and the linear isomorphism with $H$ have been determined, Corollary A1.2 concerning the bases of $V/W$ in finite dimensions is almost immediate.

COROLLARY A1.2 (Bases of $V/W$).– Let $V$ be a vector space of finite dimension $n$ and $W$ a vector subspace of $V$, then the linear subvarieties $(e_i + W)_{i=1}^{n} \subset V/W$ form a basis of the quotient vector space $V/W$ if and only if the representatives $(e_i)_{i=1}^{n} \subset V$ constitute a basis for a supplementary subspace of $W$ in $V$.

Note that the zero of $V/W$ is evidently the linear subvariety which contains the 0 of $V$, i.e. $0_V + W \equiv W$ is the zero of $V/W$.

We conclude our analysis of $V/W$ by considering the natural projection of $V$ onto $V/W$:

$$\pi : V \longrightarrow V/W, \qquad v \longmapsto \pi(v) = v + W$$

The properties of $\pi$ are as follows:
– $\pi$ is surjective: this stems from the fact that each element in $V/W$ is represented by a vector in $V$;
– the fibers of $\pi$, i.e. the preimages of the elements in $V/W$ through $\pi$, are the elements of $V/W$ interpreted as subvarieties of $V$:

$$\pi^{-1}([v_0]) = \{v \in V : v + W = v_0 + W\}$$

but the equality between sets $v + W = v_0 + W$ is only verified for $v \in v_0 + W$, thus:

$$\pi^{-1}([v_0]) = v_0 + W$$

where $[v_0]$ is interpreted, first, as the equivalence class corresponding to the element of $V/W$ identified by $v_0 \in V$, then as $v_0 + W$, seen as a subset of $V$;
– $\pi$ is a linear map, by the fact that the operations of $V/W$ are well defined;
– the kernel of $\pi$ is $W$: $\ker(\pi) = W$. By Lemma A1.1, $v_0 + W = W$ if and only if $v_0 \in W$.

Appendix 2 The Transpose (or Dual) of a Linear Operator

Any linear operator $A : V \to W$, where $V$ and $W$ are two finite-dimensional vector spaces, can be univocally associated with a linear operator $A^t$ known as the transpose or dual operator of $A$, defined as:

$$A^t : W^* \longrightarrow V^*, \qquad \varphi \longmapsto A^t\varphi = \varphi \circ A$$

that is:

$$A^t\varphi : V \longrightarrow K, \qquad v \longmapsto A^t\varphi(v) = \varphi(Av)$$

This definition is natural, as it only uses $A$ and the elements supplied by the vector spaces themselves. Using canonical notation to express the action of a linear functional, we can rewrite $A^t\varphi(v) = \varphi(Av)$ as:

$$\langle A^t\varphi, v \rangle = \langle \varphi, Av \rangle \qquad [A2.1]$$
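In coordinates, relation [A2.1] reduces to the familiar matrix transpose. The following sketch assumes $V = \mathbb{R}^3$ and $W = \mathbb{R}^2$ with their canonical bases, and functionals represented by their coefficient vectors in the dual bases; the matrix, the vectors and the helper names are hypothetical and chosen only for the illustration.

```python
def mat_vec(matrix, vector):
    """Product of a matrix (list of rows) with a vector (list of numbers)."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]

def transpose(matrix):
    """Matrix transpose: rows become columns."""
    return [list(column) for column in zip(*matrix)]

def pairing(functional, vector):
    """Action <phi, v> of a functional, given by its coefficients, on a vector."""
    return sum(p * x for p, x in zip(functional, vector))

A = [[1.0, 2.0, 0.0],
     [0.0, -1.0, 3.0]]      # A : R^3 -> R^2
phi = [2.0, -1.0]           # functional on W = R^2
v = [1.0, 4.0, -2.0]        # vector in V = R^3

lhs = pairing(mat_vec(transpose(A), phi), v)   # <A^t phi, v>
rhs = pairing(phi, mat_vec(A, v))              # <phi, A v>
```

Both sides evaluate to the same number, as [A2.1] requires: the coefficients of $A^t\varphi$ in the dual basis of $V^*$ are obtained by applying the transposed matrix to the coefficients of $\varphi$.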

The fact that this is well defined, that is, the linearity of the functional $A^t\varphi$, is guaranteed by the fact that for a fixed $\varphi$, the function $v \mapsto \varphi(Av)$ is linear, as it is a composition of linear maps.

The uniqueness of this definition can also be easily proven. Let $A_1^t$ and $A_2^t$ be two transpose operators such that $A_1^t\varphi(v) = \varphi(Av) = A_2^t\varphi(v)$, that is, $(A_1^t - A_2^t)\varphi(v) = 0$. Taking an arbitrary fixed $\varphi \in W^*$ and leaving $v$ free within $V$, it is evident from the equation $(A_1^t - A_2^t)\varphi(v) = 0$ that $(A_1^t - A_2^t)\varphi$ is the identically zero functional. This holds for all $\varphi \in W^*$, implying that $A_1^t - A_2^t = 0$, that is, $A_1^t = A_2^t$.


Now, let $V$ and $W$ be two Banach spaces (not necessarily of finite dimension). In this case, the definition remains valid provided that, for all $A \in \mathcal{B}(V, W)$, $A^t\varphi$ is a bounded linear functional on $V$ whenever $\varphi$ is a bounded linear functional on $W$, and the transpose operator defined above is continuous, that is, $A^t \in \mathcal{B}(W^*, V^*)$. Let us verify these properties.

– $A^t\varphi$ is a bounded linear functional on $V$ for all $\varphi \in W^*$: linearity is evident by definition, so we only need to prove that $A^t\varphi$ is bounded:

$$\|A^t\varphi\| = \sup_{\|v\|=1} |(A^t\varphi)v| \underset{\text{def of } A^t}{=} \sup_{\|v\|=1} |\varphi(Av)| \underset{\varphi \text{ bounded}}{\le} \sup_{\|v\|=1} \|\varphi\|\,\|Av\| \underset{A \text{ bounded}}{\le} \sup_{\|v\|=1} \|\varphi\|\,\|A\|\,\|v\| = \|\varphi\|\,\|A\| < +\infty$$

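– $A^t \in \mathcal{B}(W^*, V^*)$:

$$\|A^t\| = \sup_{\|\varphi\|=1} \|A^t\varphi\| \underset{\text{def of } A^t}{=} \sup_{\|\varphi\|=1} \|\varphi \circ A\| \underset{\varphi \text{ bounded}}{\le} \sup_{\|\varphi\|=1} \|\varphi\|\,\|A\| = \|A\| \underset{A \in \mathcal{B}(V,W)}{<} +\infty$$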

If $V = W = H$, where $H$ is a Hilbert space, then the Riesz isomorphism $T : H \to H^*$, $H \ni x \mapsto T(x) = T_x$, where $T_x(y) = \langle y, x\rangle$ for all $y \in H$, relates the transpose to the adjoint operator defined in section 6.4 via the expression: $A^\dagger = T^{-1} A^t T$.
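The relations $\langle A^t\varphi, v\rangle = \langle\varphi, Av\rangle$ and $A^\dagger = T^{-1}A^tT$ can be checked numerically in finite dimensions. The following is a minimal sketch, assuming $H = \mathbb{C}^2$ with the inner product $\langle y, x\rangle = \sum_i y_i \overline{x_i}$ and functionals stored as coefficient lists. The helper names (`riesz`, `riesz_inv`, `transpose_apply`, and so on) and the sample matrix are hypothetical choices made for illustration, not notation from the text.

```python
def matvec(A, v):
    """Matrix-vector product A v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def apply_functional(phi, v):
    """Evaluate the functional with coefficient list phi at v: sum_i phi_i v_i."""
    return sum(c * x for c, x in zip(phi, v))

def riesz(x):
    """Riesz map T: x -> T_x with T_x(y) = <y, x> = sum_i y_i conj(x_i)."""
    return [xi.conjugate() for xi in x]

def riesz_inv(phi):
    """Inverse Riesz map: recover x from the coefficients of T_x."""
    return [c.conjugate() for c in phi]

def transpose_apply(A, phi):
    """Coefficients of A^t phi = phi o A (row vector phi times A)."""
    return [sum(phi[i] * A[i][j] for i in range(len(A)))
            for j in range(len(A[0]))]

def conj_transpose(A):
    """Conjugate transpose, i.e. the matrix of the adjoint in C^n."""
    return [[A[i][j].conjugate() for i in range(len(A))]
            for j in range(len(A[0]))]

# Sample data (arbitrary illustrative values).
A = [[1 + 2j, 3j], [2 + 0j, 1 - 1j]]
x = [1 - 1j, 2 + 0.5j]
v = [2 - 1j, -1 + 3j]
phi = [0.5 + 1j, -2j]

# Duality relation [A2.1]: <A^t phi, v> = <phi, A v>.
pairing_left = apply_functional(transpose_apply(A, phi), v)
pairing_right = apply_functional(phi, matvec(A, v))

# Adjoint via the Riesz map: T^{-1} A^t T x versus (conjugate transpose of A) x.
lhs = riesz_inv(transpose_apply(A, riesz(x)))
rhs = matvec(conj_transpose(A), x)

print(pairing_left, pairing_right)
print(lhs, rhs)
```

In this setting the transpose acts on coefficient rows without any conjugation, and the conjugation in the adjoint comes entirely from the antilinear Riesz map.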

Appendix 3 Uniform, Strong and Weak Convergence

Sequences of operators may be shown to converge with respect to topologies other than the one induced by the operator norm. The same can be said for sequences of elements in Banach or Hilbert spaces. To take a concrete example, consider the following case. Let $(u_n)_{n\in\mathbb{N}}$ be an arbitrary Hilbert basis in a Hilbert space $H$. For all $n \in \mathbb{N}$, we define the linear operator:

$$A_n : H \longrightarrow H, \qquad x \longmapsto A_n x = \sum_{m=0}^{n} \langle x, u_m\rangle u_m.$$

From the geometric characterization of projection operators (see Theorem 6.32), we know that $A_n$ is the orthogonal projector on the vector subspace of $H$ generated by $u_0, \dots, u_n$: $S_n = \operatorname{span}(u_0, \dots, u_n)$. Since any $x \in H$ may be written as $x = \sum_{n=0}^{\infty} \langle x, u_n\rangle u_n$, it would seem that the sequence of projectors $(A_n)_{n\in\mathbb{N}}$ converges toward $\mathrm{id}_H$ when $n \to +\infty$.

Nevertheless, since $S_n \subset S_{n'}$ for all $n < n'$, we know by Theorem 6.35 that $A_{n'} - A_n$ is the projector onto $S_{n'} \cap S_n^{\perp} = \operatorname{span}(u_{n+1}, u_{n+2}, \dots, u_{n'})$, thanks to the orthogonality of the vectors $(u_n)_{n\in\mathbb{N}}$. As we have seen, all orthogonal projectors onto non-trivial subspaces have unit norm, that is, $\|A_{n'} - A_n\| = 1$ for all $n < n'$, and thus the sequence $(A_n)_{n\in\mathbb{N}}$ is not a Cauchy sequence in $\mathcal{B}(H)$ with respect to the operator norm; thus, it cannot be convergent, because $\mathcal{B}(H)$ is complete and so convergent and Cauchy sequences coincide. The sequence $(A_n x)_{n\in\mathbb{N}}$ in $H$, however, converges to $x$ for all $x \in H$, by the fact that $(u_n)_{n\in\mathbb{N}}$ is a Hilbert basis. This highlights the need to define an alternative form of convergence in order to assign a precise meaning to the intuitive notion that the sequence $(A_n)_{n\in\mathbb{N}}$ converges to $\mathrm{id}_H$.

Similar examples are encountered in Banach and Hilbert spaces; for this reason, we have organized our presentation of alternative forms of convergence into separate sections for different spaces.

A3.1. Strong and weak convergence in Banach spaces

Let $(V, \|\cdot\|)$ be a Banach space. By definition, a sequence $(x_n)_{n\in\mathbb{N}} \subset V$ converges toward $x \in V$ if $\|x_n - x\| \xrightarrow[n\to+\infty]{} 0$. A different type of convergence can be defined in $V$ by using the continuous linear functionals of its dual space $V^*$.

DEFINITION A3.1 (Weak convergence in a Banach space).– Let $V$ be a Banach space. The sequence $(x_n)_{n\in\mathbb{N}} \subset V$ converges weakly toward $x \in V$ if, for all $\varphi \in V^*$:

$$\varphi(x_n) \xrightarrow[n\to+\infty]{} \varphi(x)$$

where the convergence in this case is that of sequences of scalars in $\mathbb{K}$. $x$ is the weak limit of the sequence $(x_n)_{n\in\mathbb{N}}$ and we write $x_n \xrightarrow[n\to+\infty]{w} x$, with $w$ for weak.

We note that, for all $\varphi \in V^*$ and $x \in V$, $|\varphi(x)| \le \|\varphi\|\,\|x\|$; thus, if $\|x_n - x\| \xrightarrow[n\to+\infty]{} 0$, then:

$$|\varphi(x_n) - \varphi(x)| = |\varphi(x_n - x)| \le \|\varphi\|\,\|x_n - x\| \xrightarrow[n\to+\infty]{} 0$$

that is, “standard” convergence implies weak convergence. For this reason, “standard” convergence in a Banach space is also referred to as strong convergence. Counter-examples show that the converse is not generally true. Thus, in a Banach space, the topology defined by weak convergence has fewer open sets than the topology defined by strong convergence.

A3.2. Strong and weak convergence in a Hilbert space

A Hilbert space $H$ is also a Banach space, thus the definition of strong and weak convergence given above also applies to Hilbert spaces. Nevertheless, by the Riesz representation theorem, we know that the action of any continuous linear functional on $H$ can be identified with a scalar product. For this reason, an equivalent definition, which is more explicit for the purposes of calculation, can be used for weak convergence in a Hilbert space.


DEFINITION A3.2 (Weak convergence in a Hilbert space).– Let $H$ be a Hilbert space. The sequence $(x_n)_{n\in\mathbb{N}} \subset H$ converges weakly toward $x \in H$ if, for all $y \in H$:

$$\langle y, x_n\rangle \xrightarrow[n\to+\infty]{} \langle y, x\rangle$$

As in the case of Banach spaces, $x$ is said to be the weak limit of the sequence $(x_n)_{n\in\mathbb{N}}$ and we write $x_n \xrightarrow[n\to+\infty]{w} x$.

A very simple counter-example can be used to show that weak convergence does not generally imply strong convergence in a Hilbert space. Take any $y \in H$ and $x_n = u_n$ for all $n \in \mathbb{N}$, where $(u_n)_{n\in\mathbb{N}}$ is an arbitrary orthonormal system in $H$. By Bessel's inequality, $\sum_{n\in\mathbb{N}} |\langle y, u_n\rangle|^2 \le \|y\|^2$, so the series is convergent and thus its general term tends toward 0: $|\langle y, u_n\rangle|^2 \xrightarrow[n\to+\infty]{} 0$. This holds if and only if $\langle y, u_n\rangle \xrightarrow[n\to+\infty]{} 0$, and the argument applies to every $y \in H$.
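The behaviour of an orthonormal system under the pairing $\langle y, u_n\rangle$ can be illustrated numerically. The sketch below is only a finite stand-in for $\ell^2$: the truncation length `N`, the test vector $y_k = 1/(k+1)$ and the helper names are assumptions made for illustration. It shows the pairings with the canonical basis vectors shrinking, while the distance between two distinct basis vectors stays at $\sqrt{2}$, the standard value for orthonormal vectors.

```python
import math

N = 10_000  # truncation length: a finite stand-in for l^2 (assumption)

def basis_vector(n):
    """Canonical basis vector e_n of the truncated space."""
    e = [0.0] * N
    e[n] = 1.0
    return e

def inner(a, b):
    """Real inner product <a, b> on the truncated space."""
    return sum(ai * bi for ai, bi in zip(a, b))

def norm(a):
    """Norm induced by the inner product."""
    return math.sqrt(inner(a, a))

# A fixed square-summable test vector y with y_k = 1/(k+1).
y = [1.0 / (k + 1) for k in range(N)]

# Pairings <y, e_n> = 1/(n+1) shrink as n grows.
pairings = [inner(y, basis_vector(n)) for n in (0, 9, 99, 999)]

# The distance between two distinct orthonormal vectors does not shrink.
dist = norm([a - b for a, b in zip(basis_vector(3), basis_vector(7))])

print(pairings)
print(dist)
```

A finite computation cannot prove a limit statement; it only makes the two competing behaviours visible on a few indices.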

Hence, any orthonormal system $(u_n)_{n\in\mathbb{N}}$ in a Hilbert space is weakly convergent toward 0. However, we know that the distance between any two distinct elements of an orthonormal system is $\sqrt{2}$: $\|u_n - u_m\| = \sqrt{2}$ for all $n \neq m$, thus $(u_n)_{n\in\mathbb{N}}$ does not satisfy the Cauchy condition, and therefore it cannot be strongly convergent.

A3.3. Uniform, strong and weak convergence in the Banach algebra $\mathcal{B}(H)$

In the Banach algebra $(\mathcal{B}(H), \|\cdot\|)$, where $H$ is any Hilbert space and $\|\cdot\|$ is the operator norm, three different convergences can be defined for a sequence of operators $(A_n)_{n\in\mathbb{N}} \subset \mathcal{B}(H)$.

DEFINITION A3.3.– We shall use $u$, $s$ and $w$ to denote uniform, strong and weak. Let $(A_n)_{n\in\mathbb{N}} \subset \mathcal{B}(H)$ be a sequence of bounded linear operators on the Hilbert space $H$, and take $A \in \mathcal{B}(H)$.

– Uniform convergence (standard convergence, in operator norm):

$$A_n \xrightarrow[n\to+\infty]{u} A \iff \|A_n - A\| \xrightarrow[n\to+\infty]{} 0$$


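– Strong convergence:

$$A_n \xrightarrow[n\to+\infty]{s} A \iff A_n x \xrightarrow[n\to+\infty]{} Ax \iff \|A_n x - Ax\|_H \xrightarrow[n\to+\infty]{} 0 \qquad \forall x \in H$$

– Weak convergence:

$$A_n \xrightarrow[n\to+\infty]{w} A \iff \langle y, A_n x\rangle \xrightarrow[n\to+\infty]{} \langle y, Ax\rangle \qquad \forall x, y \in H$$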

As we saw at the beginning of this appendix, for any Hilbert basis $(u_m)_{m\in\mathbb{N}}$, the sequence:

$$A_n : H \longrightarrow H, \qquad x \longmapsto A_n x = \sum_{m=0}^{n} \langle x, u_m\rangle u_m$$

does not converge uniformly to $\mathrm{id}_H$. However, it converges strongly towards the identity operator, since, by the continuity of the norm, we have:

$$\lim_{n\to+\infty} \|A_n x - \mathrm{id}_H(x)\|_H = \Bigl\|\lim_{n\to+\infty} A_n x - x\Bigr\|_H = \Bigl\|\sum_{m\in\mathbb{N}} \langle x, u_m\rangle u_m - x\Bigr\|_H = 0$$

having used the fact that $\mathrm{id}_H(x)$ is not dependent on $n$ and the generalized Fourier expansion on the Hilbert basis $(u_m)_{m\in\mathbb{N}}$.

It is possible to show that, in $\mathcal{B}(H)$, uniform convergence implies strong convergence, which itself implies weak convergence. On the other hand, as we see from the example shown above, strong convergence does not imply uniform convergence. Other counter-examples can be used to show that weak convergence in $\mathcal{B}(H)$ does not imply strong convergence.
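The contrast between strong and uniform convergence of the projectors $A_n$ can be made tangible with a small numerical experiment. This is a hedged sketch, not a proof: it assumes a truncated coordinate space of length `N` as a stand-in for $\ell^2$ with its canonical basis, and the test vector $x_k = 1/(k+1)$ and the function names are hypothetical choices. For a fixed $x$ the error $\|A_n x - x\|$ decreases with $n$, while the difference $A_{n'} - A_n$ applied to the unit vector $u_{n'}$ always has norm 1.

```python
import math

N = 2000  # truncation length: a finite stand-in for l^2 (assumption)

def project(x, n):
    """A_n x: keep the coordinates 0..n and set the remaining ones to zero."""
    return [xi if m <= n else 0.0 for m, xi in enumerate(x)]

def norm(x):
    """Euclidean norm of a coordinate list."""
    return math.sqrt(sum(abs(xi) ** 2 for xi in x))

def basis_vector(k):
    """Canonical basis vector u_k of the truncated space."""
    e = [0.0] * N
    e[k] = 1.0
    return e

def difference(a, b):
    """Coordinate-wise difference a - b."""
    return [ai - bi for ai, bi in zip(a, b)]

# Strong convergence: for a fixed x, the error ||A_n x - x|| decreases with n.
x = [1.0 / (k + 1) for k in range(N)]
errors = [norm(difference(project(x, n), x)) for n in (10, 100, 1000)]

# No uniform convergence: (A_{n'} - A_n) u_{n'} = u_{n'} has norm 1 for n < n'.
n, n_prime = 10, 11
u = basis_vector(n_prime)
gap = norm(difference(project(u, n_prime), project(u, n)))

print(errors)
print(gap)
```

The unit vector on which the gap is attained changes with $n'$; this is why the error can go to zero for each fixed $x$ while the operator norm of the difference stays equal to 1.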


