Carlo Garoni · Stefano Serra-Capizzano
Generalized Locally Toeplitz Sequences: Theory and Applications Volume II
Carlo Garoni Department of Science and High Technology University of Insubria Como, Italy
Stefano Serra-Capizzano Department of Science and High Technology University of Insubria Como, Italy
ISBN 978-3-030-02232-7
ISBN 978-3-030-02233-4 (eBook)
https://doi.org/10.1007/978-3-030-02233-4
This book has been realized with the financial support of the Italian INdAM (Istituto Nazionale di Alta Matematica) and the European “Marie-Curie Actions” Programme through the Grant PCOFUND-GA2012-600198. Library of Congress Control Number: 2018958367 Mathematics Subject Classification (2010): 15B05, 65N06, 65N25, 65N30, 65N35, 47B06, 35P20, 15A18, 15A60, 15A69 © Springer Nature Switzerland AG 2018 This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Switzerland AG The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface
Sequences of matrices with increasing size naturally arise in several contexts and especially in the discretization of continuous problems, such as integral and differential equations. The theory of generalized locally Toeplitz (GLT) sequences was developed in order to compute/analyze the asymptotic spectral distribution of these sequences of matrices, which in many cases turn out to be GLT sequences. In the first volume [22], we presented the theory of univariate/unilevel GLT sequences, which arise in the discretization of unidimensional integral and differential equations; this is the reason why the first volume addressed only unidimensional applications. In this second volume, we present the theory of multivariate/multilevel GLT sequences, which arise in the discretization of multidimensional integral and differential equations. The focus here is accordingly on multidimensional applications, especially partial differential equations (PDEs). It is important to emphasize that the extension from the univariate case addressed in [22] to the multivariate case addressed here, despite being fundamental for the applications as it allows one to face concrete PDEs, is essentially a technical matter whose purpose is to illustrate the appropriate generalization of ideas already presented in [22]. The fact that all the main “GLT ideas” have been covered in [22] makes it an essential prerequisite to this book. In particular, apart from (almost) obvious adaptations, several “multivariate proofs” are the same as their corresponding “univariate versions” from [22]. We have therefore been tempted to omit them here so as to shorten the book, but ultimately we did not opt for this solution in order to help the reader gain familiarity with the multivariate language (especially the multi-index notation). The book is conceptually divided into two parts. The first part (Chaps. 1–5) covers the theory of multilevel GLT sequences, which is finally summarized in Chap. 6. 
The second part (Chap. 7) is devoted to PDE applications. The book is intended for use as a text for graduate or advanced undergraduate courses. It should also be useful as a reference for researchers working in the fields of linear and multilinear algebra, numerical analysis, and matrix analysis. Given its analytic spirit, it could also be of interest to analysts, particularly those working in the fields of measure and operator theory.
As already pointed out, the first volume [22] is an essential prerequisite to this second volume. It also provides detailed motivations to the theory of GLT sequences [22, pp. 1–3] which will not be repeated here for the sake of conciseness. In addition to [22], a basic knowledge of multidimensional integro-differential calculus (partial derivatives, multiple integrals, etc.) is required. Assuming the reader possesses the necessary prerequisites, most of which, if not already addressed in [22], will be tackled in Chap. 2, there exists a way of reading this book that allows one to omit essentially all the mathematical details/ technicalities without losing the core. This is probably “the best way of reading” for those who love practice more than theory, but it is also advisable for theorists, who can recover the missing details afterward. It consists in reading carefully the summary of the theory in Chap. 6 and the applications in Chap. 7. To conclude, we wish to express our gratitude to Bruno Iannazzo, Carla Manni, and Hendrik Speleers, who awakened the interest in the theory of GLT sequences and ultimately inspired the writing of this book. We also wish to thank all of our colleagues who have worked in the field of “Toeplitz matrices and spectral distributions” and contributed to laying the foundations of the theory of GLT sequences. We mention in particular Bernhard Beckermann, Albrecht Böttcher, Fabio Di Benedetto, Marco Donatelli, Leonid Golinskii, Sergei Grudsky, Arno Kuijlaars, Maya Neytcheva, Debora Sesana, Bernd Silbermann, Paolo Tilli, Eugene Tyrtyshnikov, and Nickolai Zamarashkin. Finally, special thanks go to those researchers who, possibly attracted by the first volume [22], decided to enter the research field of GLT sequences. 
We mention in particular Giovanni Barbarino from Scuola Normale Superiore (Pisa, Italy), Davide Bianchi and Isabella Furci from University of Insubria (Como, Italy), Ali Dorostkar and Sven-Erik Ekström from Uppsala University (Uppsala, Sweden), Mariarosa Mazza and Ahmed Ratnani from the Max Planck Institute for Plasma Physics (Munich, Germany). Several of their contributions will certainly appear in a future edition of both volumes I and II. Based on their research experience, the authors propose a reference textbook in two volumes on the theory of generalized locally Toeplitz sequences and their applications. The first volume focuses on the univariate version of the theory and the related applications in the unidimensional setting, while this second volume, which addresses the multivariate case, is mainly devoted to concrete PDE applications. Como, Italy August 2018
Carlo Garoni Stefano Serra-Capizzano
Contents
1 Notes to the Reader
2 Mathematical Background
  2.1 Notation and Terminology
    2.1.1 General Notation and Terminology
    2.1.2 Multi-index Notation
    2.1.3 Multilevel Matrix-Sequences
  2.2 Multivariate Trigonometric Polynomials
  2.3 Multivariate Riemann-Integrable Functions
  2.4 Matrix Norms
  2.5 Tensor Products and Direct Sums
  2.6 Singular Value and Eigenvalue Distribution of a Sequence of Matrices
    2.6.1 The Notion of Singular Value and Eigenvalue Distribution
    2.6.2 Clustering and Attraction
    2.6.3 Zero-Distributed Sequences
    2.6.4 Sparsely Unbounded and Sparsely Vanishing Sequences of Matrices
    2.6.5 Spectral Distribution of Sequences of Perturbed Hermitian Matrices
  2.7 Approximating Classes of Sequences
    2.7.1 Definition of a.c.s. and a.c.s. Topology
    2.7.2 The a.c.s. Tools for Computing Singular Value and Spectral Distributions
    2.7.3 The a.c.s. Algebra
    2.7.4 Some Criteria to Identify a.c.s.
    2.7.5 An Extension of the Concept of a.c.s.
3 Multilevel Toeplitz Sequences
  3.1 Multilevel Toeplitz Matrices and Multilevel Toeplitz Sequences
  3.2 Basic Properties of Multilevel Toeplitz Matrices
  3.3 Schatten p-Norms of Multilevel Toeplitz Matrices
  3.4 Multilevel Circulant Matrices
  3.5 Singular Value and Spectral Distribution of Multilevel Toeplitz Sequences: An a.c.s.-Based Proof
  3.6 Extreme Eigenvalues of Hermitian Multilevel Toeplitz Matrices
4 Multilevel Locally Toeplitz Sequences
  4.1 Multilevel LT Operator
    4.1.1 Definition of Multilevel LT Operator
    4.1.2 Properties of the Multilevel LT Operator
  4.2 Definition of Multilevel LT and sLT Sequences
  4.3 Fundamental Examples of Multilevel LT Sequences
    4.3.1 Zero-Distributed Sequences
    4.3.2 Sequences of Multilevel Diagonal Sampling Matrices
    4.3.3 Multilevel Toeplitz Sequences
  4.4 Singular Value and Spectral Distribution of a Finite Sum of Multilevel LT Sequences
  4.5 Algebraic Properties of Multilevel LT Sequences
  4.6 Characterizations of Multilevel LT Sequences
5 Multilevel Generalized Locally Toeplitz Sequences
  5.1 Equivalent Definitions of Multilevel GLT Sequences
  5.2 Singular Value and Spectral Distribution of Multilevel GLT Sequences
  5.3 Approximation Results for Multilevel GLT Sequences
    5.3.1 Characterizations of Multilevel GLT Sequences
    5.3.2 Sequences of Multilevel Diagonal Sampling Matrices
  5.4 The Multilevel GLT Algebra
  5.5 Algebraic-Topological Definitions of Multilevel GLT Sequences
6 Summary of the Theory
7 Applications
  7.1 Auxiliary Results
    7.1.1 Multilevel GLT Preconditioning
    7.1.2 Multilevel Arrow-Shaped Sampling Matrices
  7.2 Applications to PDE Discretizations: An Introduction
  7.3 FD Discretization of Convection-Diffusion-Reaction PDEs
  7.4 FE Discretization of Convection-Diffusion-Reaction PDEs
  7.5 B-Spline IgA Collocation Discretization of Convection-Diffusion-Reaction PDEs
  7.6 Galerkin B-Spline IgA Discretization of Convection-Diffusion-Reaction PDEs
  7.7 Galerkin B-Spline IgA Discretization of Second-Order Eigenvalue Problems
8 Future Developments
References
Index
About the Authors
Dr. Carlo Garoni graduated in Mathematics at the University of Insubria (Italy) in 2011 and received his Ph.D. in Mathematics at the same university in 2015. He has pursued research at the Universities of Insubria and Rome “Tor Vergata,” and he now has a Marie-Curie postdoctoral position at the USI University of Lugano (Switzerland). He has published about 25 research papers in different areas of Mathematics, most of which are connected with the theory of GLT sequences and its applications. Prof. Stefano Serra-Capizzano is a Full Professor of Numerical Analysis, Deputy Rector of the University of Insubria (Italy), and a long-term Visiting Professor at Uppsala University (Sweden). He has authored over 200 research papers in different areas of Mathematics, with more than 90 collaborators all over the world, and he has recently won a Prodi Chair Professorship in Nonlinear Analysis at Würzburg University (Germany). He is the founder of the Ph.D. program “Mathematics of Computation” at the University of Insubria’s Department of Science and High Technology.
Chapter 1
Notes to the Reader
The present book covers the multivariate version of the theory of Generalized Locally Toeplitz (GLT) sequences, also known as the theory of multilevel GLT sequences. In addition, the book presents some emblematic (multidimensional) applications of this theory in the context of the numerical discretization of Partial Differential Equations (PDEs). The generalization of the theory of GLT sequences from the univariate case addressed in [22] to the multivariate case addressed here is essentially a matter of technicalities, which results in the technical nature of the present volume. We therefore recommend that, before going into this book, the reader give a reading to [22, pp. 1–3] in order to call to mind the motivations behind the theory of (unilevel and multilevel) GLT sequences, which will not be repeated here for the sake of conciseness. When reading [22, pp. 1–3] in a multidimensional perspective, the GLT sequences and the Differential Equations (DEs) mentioned therein should be understood as multilevel GLT sequences and PDEs, respectively. After going through [22, pp. 1–3], we encourage the reader to try reading this book according to the scheme suggested in the preface, which consists in reading Chaps. 6 and 7 first, and then coming back to fill the gaps (if necessary or wanted). When reading the present book, it is advisable that the reader have at hand the first volume [22], for at least two reasons. First, [22] is cited many times throughout the book. Secondly, several “multivariate proofs” from Chaps. 2–5 are essentially the same as their corresponding “univariate versions” from [22], and we recommend that the reader compare them with each other so as to learn the way in which the multilevel language (especially, the multi-index notation) allows one to transfer many results from the univariate to the multivariate case. 
Roughly speaking, this transfer process is carried out through a sort of “automatic procedure” consisting in turning some letters ($n, i, j, x, \theta$, etc.) into boldface ($\mathbf{n}, \mathbf{i}, \mathbf{j}, \mathbf{x}, \boldsymbol{\theta}$, etc.). Finally, we remark that, as highlighted in the preface, the first volume [22] is an essential prerequisite to this second volume. In addition to [22], the other necessary prerequisite for reading this book is a basic knowledge of multidimensional integro-differential calculus (partial derivatives, multiple integrals, etc.).
© Springer Nature Switzerland AG 2018 C. Garoni and S. Serra-Capizzano, Generalized Locally Toeplitz Sequences: Theory and Applications, https://doi.org/10.1007/978-3-030-02233-4_1
Chapter 2
Mathematical Background
This chapter collects the necessary preliminaries to develop the multivariate version of the theory of GLT sequences. The reader is supposed to be familiar with the univariate version of the theory [22] and to possess a basic knowledge of multidimensional integro-differential calculus (partial derivatives, multiple integrals, etc.).
2.1 Notation and Terminology
For the reader’s convenience, we report in this section some of the most common notations and terminologies that will be used throughout the book. Special attention is devoted to the multi-index notation and the notion of multilevel matrix-sequences. Together with the index at the end, this section can be used as a reference whenever an unknown notation/terminology is encountered.
2.1.1 General Notation and Terminology
• The cardinality of a set $S$ is denoted by $\#S$.
• If $S$ is a subset of a topological space, the closure of $S$ is denoted by $\overline{S}$.
• A permutation $\sigma$ of the set $\{1, 2, \ldots, n\}$ is denoted by $[\sigma(1), \sigma(2), \ldots, \sigma(n)]$.
• $\mathbb{R}^{m\times n}$ (resp., $\mathbb{C}^{m\times n}$) is the space of real (resp., complex) $m \times n$ matrices.
• $O_m$ and $I_m$ denote, respectively, the $m \times m$ zero matrix and the $m \times m$ identity matrix. Sometimes, when the size $m$ can be inferred from the context, $O$ and $I$ are used instead of $O_m$ and $I_m$. The symbol $O$ is also used to indicate rectangular zero matrices whose sizes are clear from the context.
• If $\mathbf{x}$ is a vector and $X$ is a matrix, $\mathbf{x}^T$ and $\mathbf{x}^*$ (resp., $X^T$ and $X^*$) are the transpose and the conjugate transpose of $\mathbf{x}$ (resp., $X$).
• If $\mathbf{x}$ is a vector with $m$ components $x_1, \ldots, x_m$, $\mathrm{diag}(\mathbf{x})$ is the diagonal matrix whose diagonal entries are $x_1, \ldots, x_m$.
• If $\mathbf{x}, \mathbf{y}$ are vectors with $m$ components, $\mathbf{x} \cdot \mathbf{y}$ denotes their scalar product.
• We use the abbreviations HPD, HPSD, SPD, SPSD for “Hermitian Positive Definite”, “Hermitian Positive SemiDefinite”, “Symmetric Positive Definite”, “Symmetric Positive SemiDefinite”.
• If $X, Y \in \mathbb{C}^{m\times m}$, the notation $X \ge Y$ (resp., $X > Y$) means that $X, Y$ are Hermitian and $X - Y$ is HPSD (resp., HPD).
• If $X, Y \in \mathbb{C}^{m\times m}$, we denote by $X \circ Y$ the componentwise (or Hadamard) product of $X$ and $Y$: $(X \circ Y)_{ij} = x_{ij} y_{ij}$, $i, j = 1, \ldots, m$.
• If $X \in \mathbb{C}^{m\times m}$, we denote by $X^\dagger$ the Moore–Penrose pseudoinverse of $X$. For more on the Moore–Penrose pseudoinverse, see [22, Sect. 2.4.2].
• If $X \in \mathbb{C}^{m\times m}$, we denote by $\Lambda(X)$ the spectrum of $X$.
• If $X \in \mathbb{C}^{m\times m}$, the singular values and eigenvalues of $X$ are denoted by $\sigma_j(X)$, $j = 1, \ldots, m$, and $\lambda_j(X)$, $j = 1, \ldots, m$, respectively. The maximum and minimum singular values are also denoted by $\sigma_{\max}(X)$ and $\sigma_{\min}(X)$. If the eigenvalues are real, their maximum and minimum are also denoted by $\lambda_{\max}(X)$ and $\lambda_{\min}(X)$.
• If $1 \le p \le \infty$, the symbol $|\cdot|_p$ denotes both the $p$-norm of vectors and the associated operator norm for matrices:
\[
|\mathbf{x}|_p = \begin{cases} \left(\sum_{i=1}^m |x_i|^p\right)^{1/p}, & \text{if } 1 \le p < \infty, \\ \max_{i=1,\ldots,m} |x_i|, & \text{if } p = \infty, \end{cases} \qquad \mathbf{x} \in \mathbb{C}^m,
\]
\[
|X|_p = \max_{\substack{\mathbf{x} \in \mathbb{C}^m \\ \mathbf{x} \ne \mathbf{0}}} \frac{|X\mathbf{x}|_p}{|\mathbf{x}|_p}, \qquad X \in \mathbb{C}^{m\times m}.
\]
The 2-norm $|\cdot|_2$ is also known as the spectral (or Euclidean) norm and it will be preferably denoted by $\|\cdot\|$. For more on $p$-norms, see [22, Sect. 2.4.1].
• Given $X \in \mathbb{C}^{m\times m}$ and $1 \le p \le \infty$, $\|X\|_p$ denotes the Schatten $p$-norm of $X$, which is defined as the $p$-norm of the vector $(\sigma_1(X), \ldots, \sigma_m(X))$ formed by the singular values of $X$. The Schatten 1-norm is also known under the names of trace-norm and nuclear norm. For more on Schatten $p$-norms, see [22, Sect. 2.4.3].
• $\Re(X)$ and $\Im(X)$ are, respectively, the real and the imaginary part of the square matrix $X$:
\[
\Re(X) = \frac{X + X^*}{2}, \qquad \Im(X) = \frac{X - X^*}{2\mathrm{i}},
\]
where $\mathrm{i}$ is the imaginary unit ($\mathrm{i}^2 = -1$). Note that $\Re(X), \Im(X)$ are Hermitian and $X = \Re(X) + \mathrm{i}\,\Im(X)$ for all square matrices $X$.
• If $X \in \mathbb{C}^{m\times m}$ is diagonalizable and $f : \Lambda(X) \to \mathbb{C}$, we denote by $f(X)$ the matrix obtained by applying the function $f$ to $X$. For more on matrix functions, see [22, Sect. 2.4.6].
• We use the abbreviations FDs, FEs, IgA for “Finite Differences”, “Finite Elements”, “Isogeometric Analysis”.
• Given two sequences $\{\zeta_n\}_n$ and $\{\xi_n\}_n$, with $\zeta_n \ge 0$ and $\xi_n > 0$ for all $n$, the notation $\zeta_n = O(\xi_n)$ means that there exists a constant $C$, independent of $n$, such that $\zeta_n \le C\xi_n$ for all $n$; and the notation $\zeta_n = o(\xi_n)$ means that $\zeta_n/\xi_n \to 0$ as $n \to \infty$.
• $C_c(\mathbb{C})$ (resp., $C_c(\mathbb{R})$) is the space of complex-valued continuous functions defined on $\mathbb{C}$ (resp., $\mathbb{R}$) with bounded support. Moreover, for $m \in \mathbb{N} \cup \{\infty\}$, $C_c^m(\mathbb{R}) = C_c(\mathbb{R}) \cap C^m(\mathbb{R})$, where $C^m(\mathbb{R})$ is the space of functions $F : \mathbb{R} \to \mathbb{C}$ such that the real and imaginary parts $\Re(F), \Im(F)$ are of class $C^m$ over $\mathbb{R}$ in the classical sense.
• If $w_i : D_i \to \mathbb{C}$, $i = 1, \ldots, d$, we define the tensor-product function $w_1 \otimes \cdots \otimes w_d : D_1 \times \cdots \times D_d \to \mathbb{C}$ as follows: for every $(\xi_1, \ldots, \xi_d) \in D_1 \times \cdots \times D_d$,
\[
(w_1 \otimes \cdots \otimes w_d)(\xi_1, \ldots, \xi_d) = w_1(\xi_1) \cdots w_d(\xi_d).
\]
• If $f : D \to E$ and $g : E \to F$ are arbitrary functions, the composite function $g \circ f$ is preferably denoted by $g(f)$.
• If $g : D \to \mathbb{C}$, we set $\|g\|_\infty = \sup_{\xi \in D} |g(\xi)|$. If we need/want to specify the domain $D$, we write $\|g\|_{\infty,D}$ instead of $\|g\|_\infty$. Clearly, $\|g\|_\infty < \infty$ if and only if $g$ is bounded over its domain.
• If $g : D \to \mathbb{C}$ is continuous over $D$, with $D \subseteq \mathbb{C}^k$ for some $k$, we denote by $\omega_g(\cdot)$ the modulus of continuity of $g$,
\[
\omega_g(\delta) = \sup_{\substack{\mathbf{x}, \mathbf{y} \in D \\ |\mathbf{x} - \mathbf{y}|_\infty \le \delta}} |g(\mathbf{x}) - g(\mathbf{y})|, \qquad \delta > 0.
\]
If we need/want to specify $D$, we will say that $\omega_g(\cdot)$ is the modulus of continuity of $g$ over $D$.
• $\chi_E$ is the characteristic (or indicator) function of the set $E$,
\[
\chi_E(\xi) = \begin{cases} 1, & \text{if } \xi \in E, \\ 0, & \text{otherwise.} \end{cases}
\]
• $\mu_k$ denotes the Lebesgue measure in $\mathbb{R}^k$. Throughout this book, unless otherwise stated, all the terminology coming from measure theory (such as “measurable set”, “measurable function”, “almost everywhere (a.e.)”, etc.) is always referred to the Lebesgue measure.
• If $E_1, \ldots, E_d \subseteq \mathbb{R}$ are measurable sets and $f : E_1 \times \cdots \times E_d \to \mathbb{C}$, we say that $f$ is $d$-separable if there exist $d$ measurable functions $f_i : E_i \to \mathbb{C}$, $i = 1, \ldots, d$, such that $f = f_1 \otimes \cdots \otimes f_d$. In this case, $f_1 \otimes \cdots \otimes f_d$ is called a factorization of $f$. Note that any $d$-separable function is measurable. Throughout this book, “separable function” is an abbreviation of “$d$-separable function”.
• If $f : D \subseteq \mathbb{R}^k \to \mathbb{C}$ is measurable, we denote by $\mathrm{ER}(f)$ its essential range. For more on the essential range, see [22, Sect. 2.2.1].
• If $D$ is any measurable subset of some $\mathbb{R}^k$, we set
\[
\begin{aligned}
M_D &= \{ f : D \to \mathbb{C} : f \text{ is measurable} \}, \\
L^p(D) &= \Bigl\{ f \in M_D : \int_D |f|^p < \infty \Bigr\}, \qquad 1 \le p < \infty, \\
L^\infty(D) &= \bigl\{ f \in M_D : \operatorname{ess\,sup}_D |f| < \infty \bigr\}.
\end{aligned}
\]
If $D$ is the special domain $[0, 1]^d \times [-\pi, \pi]^d$, we preferably use the notation $M_d$ instead of $M_D$:
\[
M_d = \{ \kappa : [0, 1]^d \times [-\pi, \pi]^d \to \mathbb{C} : \kappa \text{ is measurable} \}.
\]
If $f \in L^p(D)$ and the domain $D$ is clear from the context, we write $\|f\|_{L^p}$ instead of $\|f\|_{L^p(D)}$ to indicate the $L^p$-norm of $f$, which is defined as
\[
\|f\|_{L^p} = \begin{cases} \left( \int_D |f|^p \right)^{1/p}, & \text{if } 1 \le p < \infty, \\ \operatorname{ess\,sup}_D |f|, & \text{if } p = \infty. \end{cases}
\]
For more on $L^p$ spaces, see [22, Sect. 2.2.2].
• If $D \subseteq \mathbb{R}^k$ is a measurable set with $0 < \mu_k(D) < \infty$, we denote by $d_{\mathrm{measure}}$ the pseudometric on $M_D$ defined in [22, Eq. (2.14)], which induces on $M_D$ the topology $\tau_{\mathrm{measure}}$ of convergence in measure. For more details on this topic, see [22, Sect. 2.3.2].
• If $f \in L^1([-\pi, \pi]^d)$, the Fourier coefficients of $f$ are denoted by $f_{\mathbf{k}}$ and are defined as follows:
\[
f_{\mathbf{k}} = \frac{1}{(2\pi)^d} \int_{[-\pi,\pi]^d} f(\boldsymbol{\theta})\, \mathrm{e}^{-\mathrm{i}\mathbf{k}\cdot\boldsymbol{\theta}}\, \mathrm{d}\boldsymbol{\theta}, \qquad \mathbf{k} \in \mathbb{Z}^d. \tag{2.1}
\]
• We use a notation borrowed from probability theory to indicate sets. For example, if $f, g : D \subseteq \mathbb{R}^k \to \mathbb{C}$, then
  $\{ f = 1 \} = \{ \mathbf{x} \in D : f(\mathbf{x}) = 1 \}$,
  $\{ 0 \le f \le 1,\ g > 2 \} = \{ \mathbf{x} \in D : 0 \le f(\mathbf{x}) \le 1,\ g(\mathbf{x}) > 2 \}$,
  $\mu_k\{ f > 0,\ g < 0 \}$ is the measure of the set $\{ \mathbf{x} \in D : f(\mathbf{x}) > 0,\ g(\mathbf{x}) < 0 \}$,
  $\chi_{\{ f = 0 \}}$ is the characteristic function of the set where $f$ vanishes,
  ...
• A functional $\phi$ is any function defined on some vector space (such as, for example, $C_c(\mathbb{C})$ or $C_c(\mathbb{R})$) and taking values in $\mathbb{C}$.
• If $\mathbb{K}$ is either $\mathbb{R}$ or $\mathbb{C}$ and $g : D \subset \mathbb{R}^k \to \mathbb{K}$ is a measurable function defined on a set $D$ with $0 < \mu_k(D) < \infty$, we denote by $\phi_g$ the functional
\[
\phi_g : C_c(\mathbb{K}) \to \mathbb{C}, \qquad \phi_g(F) = \frac{1}{\mu_k(D)} \int_D F(g(\mathbf{x}))\, \mathrm{d}\mathbf{x}. \tag{2.2}
\]
• A sequence of matrices is a sequence of the form $\{A_n\}_n$, where $n$ varies in some infinite subset of $\mathbb{N}$ and $A_n$ is a square matrix of size $d_n$ such that $d_n \to \infty$ as $n \to \infty$. Throughout this book, unless otherwise specified, the size of the $n$th matrix of a sequence of matrices is always assumed to be $d_n$.
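As a numerical illustration of definition (2.1), the following minimal sketch approximates the Fourier coefficients of a bivariate trigonometric polynomial with an equispaced quadrature rule on $[-\pi, \pi)^d$. The names `fourier_coefficient` and `symbol`, the choice $f(\theta_1, \theta_2) = 4 - 2\cos\theta_1 - 2\cos\theta_2$, and the number of nodes are assumptions made for this illustration only; they are not taken from the text. For a trigonometric polynomial the rule is exact up to rounding as long as the number of nodes per direction exceeds the frequencies involved.

```python
import cmath
import itertools
import math


def fourier_coefficient(f, k, nodes=16):
    """Approximate the Fourier coefficient f_k of (2.1) for a d-index k.

    f takes a tuple theta of length d; the integral over [-pi, pi]^d is
    replaced by an equispaced rule with `nodes` points per direction.
    """
    d = len(k)
    step = 2 * math.pi / nodes
    grid = [-math.pi + s * step for s in range(nodes)]
    total = 0j
    for theta in itertools.product(grid, repeat=d):
        phase = sum(kr * tr for kr, tr in zip(k, theta))  # k . theta
        total += f(theta) * cmath.exp(-1j * phase)
    # (2*pi)^(-d) * step^d simplifies to nodes^(-d)
    return total / nodes ** d


def symbol(theta):
    """Illustrative bivariate trigonometric polynomial (an assumption)."""
    return 4 - 2 * math.cos(theta[0]) - 2 * math.cos(theta[1])


f_00 = fourier_coefficient(symbol, (0, 0))
f_10 = fourier_coefficient(symbol, (1, 0))
f_11 = fourier_coefficient(symbol, (1, 1))
```

Writing the cosines as complex exponentials shows that this particular $f$ has $f_{(0,0)} = 4$, $f_{(\pm 1, 0)} = f_{(0, \pm 1)} = -1$, and all other coefficients equal to zero, which is what the sketch reproduces up to rounding.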
2.1.2 Multi-index Notation
Throughout this book, we will systematically use the multi-index notation. When discretizing a linear PDE over a $d$-dimensional domain $\Omega \subset \mathbb{R}^d$ by means of a linear numerical method, the actual computation of the numerical solution reduces to solving a linear system whose coefficient matrix usually possesses a $d$-level structure (see Example 2.5 below). As we shall see later on, especially in Chap. 7, the multi-index notation is a powerful tool that allows one to give a compact expression of this matrix by treating the dimensionality parameter $d$ as any other parameter involved in the considered discretization. In this way, the dependence of the matrix structure on $d$ is highlighted and a compact presentation is made possible.
A multi-index $\mathbf{i}$ of size $d$, also called a $d$-index, is simply a (row) vector in $\mathbb{Z}^d$; its components are denoted by $i_1, \ldots, i_d$.
• $\mathbf{0}, \mathbf{1}, \mathbf{2}, \ldots$ are the vectors of all zeros, all ones, all twos, $\ldots$ (their size will be clear from the context).
• For any $d$-index $\mathbf{m}$, we set $N(\mathbf{m}) = \prod_{j=1}^d m_j$ and we write $\mathbf{m} \to \infty$ to indicate that $\min(\mathbf{m}) \to \infty$. The notation $N(\boldsymbol{\alpha}) = \prod_{j=1}^d \alpha_j$ will be actually used for any vector $\boldsymbol{\alpha}$ with $d$ components and not only for $d$-indices.
• Let $\{a_{\mathbf{n}}\}_{\mathbf{n} \in \mathbb{N}^d}$ be a family of numbers parameterized by a $d$-index $\mathbf{n}$. The limit of $a_{\mathbf{n}}$ as $\mathbf{n} \to \infty$ is defined, as in the case of a traditional sequence $\{a_n\}_{n \in \mathbb{N}}$, in the following way: $\lim_{\mathbf{n} \to \infty} a_{\mathbf{n}} = a$ if and only if for every neighborhood $U$ of $a$ there exists $\mathbf{N}$ such that $a_{\mathbf{n}} \in U$ for $\mathbf{n} \ge \mathbf{N}$. Moreover, we define
\[
\limsup_{\mathbf{n} \to \infty} a_{\mathbf{n}} = \lim_{\mathbf{n} \to \infty} \sup_{\mathbf{m} \ge \mathbf{n}} a_{\mathbf{m}}, \qquad \liminf_{\mathbf{n} \to \infty} a_{\mathbf{n}} = \lim_{\mathbf{n} \to \infty} \inf_{\mathbf{m} \ge \mathbf{n}} a_{\mathbf{m}}.
\]
• If $\mathbf{h}, \mathbf{k}$ are $d$-indices, $\mathbf{h} \le \mathbf{k}$ means that $h_r \le k_r$ for all $r = 1, \ldots, d$, while $\mathbf{h} \not\le \mathbf{k}$ means that $h_r > k_r$ for at least one $r \in \{1, \ldots, d\}$.
• If $\mathbf{h}, \mathbf{k}$ are $d$-indices such that $\mathbf{h} \le \mathbf{k}$, the multi-index range $\mathbf{h}, \ldots, \mathbf{k}$ (or, more precisely, the $d$-index range $\mathbf{h}, \ldots, \mathbf{k}$) is the set of cardinality $N(\mathbf{k} - \mathbf{h} + \mathbf{1})$ given by $\{ \mathbf{j} \in \mathbb{Z}^d : \mathbf{h} \le \mathbf{j} \le \mathbf{k} \}$. We assume for this set the standard lexicographic ordering:
\[
\Bigl[\ \ldots\ \bigl[\ [\ (j_1, \ldots, j_d)\ ]_{j_d = h_d, \ldots, k_d}\ \bigr]_{j_{d-1} = h_{d-1}, \ldots, k_{d-1}}\ \ldots\ \Bigr]_{j_1 = h_1, \ldots, k_1}. \tag{2.3}
\]
For instance, in the case $d = 2$ the ordering is
\[
\begin{gathered}
(h_1, h_2),\ (h_1, h_2 + 1),\ \ldots,\ (h_1, k_2),\ (h_1 + 1, h_2),\ (h_1 + 1, h_2 + 1),\ \ldots,\ (h_1 + 1, k_2), \\
\ldots\ \ldots\ \ldots,\ (k_1, h_2),\ (k_1, h_2 + 1),\ \ldots,\ (k_1, k_2).
\end{gathered}
\]
• When a $d$-index $\mathbf{j}$ varies over a $d$-index range $\mathbf{h}, \ldots, \mathbf{k}$ (this is often written as $\mathbf{j} = \mathbf{h}, \ldots, \mathbf{k}$), it is understood that $\mathbf{j}$ varies from $\mathbf{h}$ to $\mathbf{k}$ following the specific ordering (2.3). For instance, if $\mathbf{m} \in \mathbb{N}^d$ and we write $\mathbf{x} = [x_{\mathbf{i}}]_{\mathbf{i}=\mathbf{1}}^{\mathbf{m}}$, then $\mathbf{x}$ is a vector
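of size $N(\mathbf{m})$ whose components $x_{\mathbf{i}}$, $\mathbf{i} = \mathbf{1}, \ldots, \mathbf{m}$, are ordered in accordance with (2.3): the first component is $x_{\mathbf{1}} = x_{(1,\ldots,1,1)}$, the second component is $x_{(1,\ldots,1,2)}$, and so on until the last component, which is $x_{\mathbf{m}} = x_{(m_1,\ldots,m_d)}$. Similarly, if
\[
X = [x_{\mathbf{i}\mathbf{j}}]_{\mathbf{i},\mathbf{j}=\mathbf{1}}^{\mathbf{m}}, \tag{2.4}
\]
then $X$ is an $N(\mathbf{m}) \times N(\mathbf{m})$ matrix whose components are indexed by a pair of $d$-indices $\mathbf{i}, \mathbf{j}$, both varying from $\mathbf{1}$ to $\mathbf{m}$ according to the lexicographic ordering (2.3).
• If $\mathbf{h}, \mathbf{k}$ are $d$-indices such that $\mathbf{h} \le \mathbf{k}$, the notation $\sum_{\mathbf{j}=\mathbf{h}}^{\mathbf{k}}$ indicates the summation over all $\mathbf{j}$ in $\mathbf{h}, \ldots, \mathbf{k}$.
• If $\mathbf{i}, \mathbf{j}$ are $d$-indices, $\mathbf{i} \preceq \mathbf{j}$ means that $\mathbf{i}$ precedes (or equals) $\mathbf{j}$ in the lexicographic ordering (which is a total ordering on $\mathbb{Z}^d$). Moreover, we define
\[
\mathbf{i} \wedge \mathbf{j} = \begin{cases} \mathbf{i}, & \text{if } \mathbf{i} \preceq \mathbf{j}, \\ \mathbf{j}, & \text{if } \mathbf{i} \succ \mathbf{j}. \end{cases} \tag{2.5}
\]
Note that $\mathbf{i} \wedge \mathbf{j}$ is the minimum among $\mathbf{i}$ and $\mathbf{j}$ with respect to the lexicographic ordering. In the case where $\mathbf{i}$ and $\mathbf{j}$ are 1-indices (i.e., normal scalar indices), it is clear that $i \wedge j = \min(i, j)$.
• Operations involving $d$-indices that have no meaning in the vector space $\mathbb{Z}^d$ must always be interpreted in the componentwise sense. For instance, $\mathbf{n}\mathbf{p} = (n_1 p_1, \ldots, n_d p_d)$, $\alpha\mathbf{i}/\mathbf{j} = (\alpha i_1/j_1, \ldots, \alpha i_d/j_d)$ for all $\alpha \in \mathbb{C}$, $\mathbf{i}^2 = (i_1^2, \ldots, i_d^2)$, $\max(\mathbf{i}, \mathbf{j}) = (\max(i_1, j_1), \ldots, \max(i_d, j_d))$, $\mathbf{i} \bmod \mathbf{m} = (i_1 \bmod m_1, \ldots, i_d \bmod m_d)$, etc.
• When a multi-index appears as subscript or superscript, we sometimes suppress the brackets to simplify the notation. For instance, the component of the vector $\mathbf{x} = [x_{\mathbf{i}}]_{\mathbf{i}=\mathbf{1}}^{\mathbf{m}}$ corresponding to the $d$-index $\mathbf{i}$ is denoted by $x_{\mathbf{i}}$ or $x_{i_1,\ldots,i_d}$, and we often avoid the heavy notation $x_{(i_1,\ldots,i_d)}$.
We provide below a few examples to help the reader become familiar with the multi-index notation.
Example 2.1 Let $\mathbf{h} = \mathbf{1} = (1, 1)$ and $\mathbf{k} = (4, 2)$. The multi-index range $\mathbf{h}, \ldots, \mathbf{k}$ consists of $N(\mathbf{k} - \mathbf{h} + \mathbf{1}) = N(\mathbf{k}) = 8$ elements which are sorted according to the lexicographic ordering (2.3) as follows: $(1, 1), (1, 2), (2, 1), (2, 2), (3, 1), (3, 2), (4, 1), (4, 2)$. Note that
\[
\sum_{\mathbf{j}=\mathbf{h}}^{\mathbf{k}} \mathbf{j}^2 = (60, 20).
\]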
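The computations of Example 2.1 can be checked with a short script. This is only a sketch: the helper names `multi_index_range` and `N` are chosen here for illustration, and the enumeration relies on the fact that `itertools.product` varies its last coordinate fastest, which matches the lexicographic ordering (2.3).

```python
import itertools
import math


def multi_index_range(h, k):
    """List the d-indices j with h <= j <= k, sorted as in (2.3)."""
    axes = [range(lo, hi + 1) for lo, hi in zip(h, k)]
    # The last component runs fastest, the first one slowest.
    return list(itertools.product(*axes))


def N(alpha):
    """Product of the components of alpha."""
    return math.prod(alpha)


h, k = (1, 1), (4, 2)
indices = multi_index_range(h, k)

# Cardinality of the range, N(k - h + 1), with componentwise operations.
size = N(tuple(kr - hr + 1 for hr, kr in zip(h, k)))

# Componentwise sum of j^2 over the whole range.
sum_of_squares = tuple(sum(j[r] ** 2 for j in indices) for r in range(len(h)))

print(indices)
print(size, sum_of_squares)
```

With $\mathbf{h} = (1, 1)$ and $\mathbf{k} = (4, 2)$, the script lists the eight bi-indices in the order given in Example 2.1 and returns the sum $(60, 20)$.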
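Example 2.2 Let $a : [0, 1]^2 \to \mathbb{C}$ and $\mathbf{n} \in \mathbb{N}^2$. Set
\[
\mathbf{x} = \left[ a\!\left( \frac{\mathbf{i}}{\mathbf{n}} \right) \right]_{\mathbf{i}=\mathbf{1}}^{\mathbf{n}}.
\]
Then, $\mathbf{x}$ is the vector of size $N(\mathbf{n})$ given by
\[
\mathbf{x} = \begin{bmatrix}
a(\frac{1}{n_1}, \frac{1}{n_2}) \\
a(\frac{1}{n_1}, \frac{2}{n_2}) \\
\vdots \\
a(\frac{1}{n_1}, 1) \\
a(\frac{2}{n_1}, \frac{1}{n_2}) \\
a(\frac{2}{n_1}, \frac{2}{n_2}) \\
\vdots \\
a(\frac{2}{n_1}, 1) \\
\vdots \\
\vdots \\
a(1, \frac{1}{n_2}) \\
a(1, \frac{2}{n_2}) \\
\vdots \\
a(1, 1)
\end{bmatrix}
= \begin{bmatrix} \mathbf{x}_1 \\ \mathbf{x}_2 \\ \vdots \\ \mathbf{x}_{n_1} \end{bmatrix},
\qquad
\mathbf{x}_{i_1} = \begin{bmatrix}
a(\frac{i_1}{n_1}, \frac{1}{n_2}) \\
a(\frac{i_1}{n_1}, \frac{2}{n_2}) \\
\vdots \\
a(\frac{i_1}{n_1}, 1)
\end{bmatrix},
\qquad i_1 = 1, \ldots, n_1.
\]
Moreover,
\[
\|\mathbf{x}\|^2 = \sum_{\mathbf{i}=\mathbf{1}}^{\mathbf{n}} |x_{\mathbf{i}}|^2 = \sum_{\mathbf{i}=\mathbf{1}}^{\mathbf{n}} |a(\mathbf{i}/\mathbf{n})|^2.
\]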
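The sampling vector of Example 2.2 can be assembled in a few lines. The sketch below is illustrative only: the function `sample_vector`, the test function `a(x1, x2) = x1 + 2*x2`, and the choice $\mathbf{n} = (2, 3)$ are assumptions introduced here, not part of the example. The quotient $\mathbf{i}/\mathbf{n}$ is taken componentwise and the entries are listed in the lexicographic ordering (2.3).

```python
import itertools


def sample_vector(a, n):
    """Return the list [a(i/n)] for i = 1, ..., n, ordered as in (2.3)."""
    axes = [range(1, nr + 1) for nr in n]
    return [
        a(*(ir / nr for ir, nr in zip(i, n)))  # componentwise quotient i/n
        for i in itertools.product(*axes)
    ]


def a(x1, x2):
    """Hypothetical function on [0, 1]^2, used only for this illustration."""
    return x1 + 2 * x2


n = (2, 3)
x = sample_vector(a, n)

# Squared 2-norm of x, i.e. the sum of |a(i/n)|^2 over the range 1, ..., n.
squared_norm = sum(abs(v) ** 2 for v in x)

# The block x_{i1} collects the n_2 consecutive entries sharing the same i_1.
blocks = [x[r * n[1]:(r + 1) * n[1]] for r in range(n[0])]
```

Here `x` has $N(\mathbf{n}) = 6$ entries, and `blocks` contains the $n_1 = 2$ subvectors $\mathbf{x}_1, \mathbf{x}_2$, each of length $n_2 = 3$.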
Example 2.3 Consider the matrix
\[
A = \left[ \begin{array}{cc|cc}
4 & 4 & 0 & 0 \\
4 & 4 & 0 & 0 \\ \hline
0 & 0 & 1 & 1 \\
0 & 0 & 1 & 1
\end{array} \right]. \tag{2.6}
\]
Instead of indexing the entries of $A$ in the standard way, i.e., by means of two traditional scalar indices $i, j = 1, \ldots, 4$, we can decide to index the entries of $A$ by means of two 2-indices (or bi-indices) $\mathbf{i}, \mathbf{j} = \mathbf{1}, \ldots, \mathbf{2} = (1, 1), (1, 2), (2, 1), (2, 2)$. This is possible because both the ranges $1, \ldots, 4$ and $\mathbf{1}, \ldots, \mathbf{2}$ have 4 elements, and 4 is the size of $A$. The two writings $A = [a_{ij}]_{i,j=1}^{4}$ and $A = [a_{\mathbf{i}\mathbf{j}}]_{\mathbf{i},\mathbf{j}=\mathbf{1}}^{\mathbf{2}}$ correspond to the two different indexings. We have
\[
\begin{aligned}
a_{(1,1),(1,1)} &= 4, & a_{(1,1),(1,2)} &= 4, & a_{(1,1),(2,1)} &= 0, & a_{(1,1),(2,2)} &= 0, \\
a_{(1,2),(1,1)} &= 4, & a_{(1,2),(1,2)} &= 4, & a_{(1,2),(2,1)} &= 0, & a_{(1,2),(2,2)} &= 0, \\
a_{(2,1),(1,1)} &= 0, & a_{(2,1),(1,2)} &= 0, & a_{(2,1),(2,1)} &= 1, & a_{(2,1),(2,2)} &= 1, \\
a_{(2,2),(1,1)} &= 0, & a_{(2,2),(1,2)} &= 0, & a_{(2,2),(2,1)} &= 1, & a_{(2,2),(2,2)} &= 1.
\end{aligned}
\]
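The indicization of the entries of $A$ with the bi-indices $\boldsymbol{i}, \boldsymbol{j}$ reflects the fact that we are thinking of $A$ as a block matrix partitioned into 4 blocks as indicated in (2.6): for all $\boldsymbol{i}, \boldsymbol{j} = \boldsymbol{1}, \ldots, \boldsymbol{2}$, the entry $a_{\boldsymbol{i}\boldsymbol{j}}$ is the $(i_2, j_2)$ entry of the $(i_1, j_1)$ block of $A$. For example, $a_{(2,1),(2,2)}$ = entry $(1, 2)$ in the $(2, 2)$ block of $A$ = 1. We can therefore write
\[
a_{\boldsymbol{i}\boldsymbol{j}} =
\begin{cases}
4, & \text{if } i_1 = j_1 = 1, \\
0, & \text{if } i_1 = 1 \text{ and } j_1 = 2, \\
0, & \text{if } i_1 = 2 \text{ and } j_1 = 1, \\
1, & \text{if } i_1 = j_1 = 2.
\end{cases}
\]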
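Note that $a_{\boldsymbol{i}\boldsymbol{j}}$ depends only on the first components of the bi-indices $\boldsymbol{i}$ and $\boldsymbol{j}$.

Example 2.4 (tensor products) Let $X \in \mathbb{C}^{m_1 \times m_2}$ and $Y \in \mathbb{C}^{\ell_1 \times \ell_2}$, and define the block matrix
\[
Z = [x_{ij} Y]_{\substack{i=1,\ldots,m_1 \\ j=1,\ldots,m_2}} =
\begin{bmatrix}
x_{11} Y & x_{12} Y & \cdots & x_{1 m_2} Y \\
x_{21} Y & x_{22} Y & \cdots & x_{2 m_2} Y \\
\vdots & \vdots & & \vdots \\
x_{m_1 1} Y & x_{m_1 2} Y & \cdots & x_{m_1 m_2} Y
\end{bmatrix}
\in \mathbb{C}^{m_1 \ell_1 \times m_2 \ell_2}.
\tag{2.7}
\]
Using the identities
\[ r = \lfloor r/s \rfloor s + r \bmod s, \qquad \lfloor r/s \rfloor = \lceil (r + 1)/s \rceil - 1, \]
which are satisfied for all integers $r \ge 0$ and $s \ge 1$, for every $i = 1, \ldots, m_1 \ell_1$ and $j = 1, \ldots, m_2 \ell_2$ we can write
\[
i = (\lceil i/\ell_1 \rceil - 1)\ell_1 + ((i - 1) \bmod \ell_1) + 1, \qquad
j = (\lceil j/\ell_2 \rceil - 1)\ell_2 + ((j - 1) \bmod \ell_2) + 1.
\]
The $(i, j)$ entry of the matrix $Z$ is then given by
\[ z_{ij} = x_{\lceil i/\ell_1 \rceil, \lceil j/\ell_2 \rceil} \, y_{((i-1) \bmod \ell_1)+1, ((j-1) \bmod \ell_2)+1}. \]
It is clear that this expression is rather complicated. Now, suppose we decide to index the entries of $Z$ by two bi-indices $\boldsymbol{i}, \boldsymbol{j}$ such that $\boldsymbol{i} = \boldsymbol{1}, \ldots, \boldsymbol{n}$ and $\boldsymbol{j} = \boldsymbol{1}, \ldots, \boldsymbol{k}$, where $\boldsymbol{n} = (m_1, \ell_1)$ and $\boldsymbol{k} = (m_2, \ell_2)$. This indicization, which is possible because
\[
\#\{\boldsymbol{1}, \ldots, \boldsymbol{n}\} = N(\boldsymbol{n}) = m_1 \ell_1 = \text{number of rows of } Z, \qquad
\#\{\boldsymbol{1}, \ldots, \boldsymbol{k}\} = N(\boldsymbol{k}) = m_2 \ell_2 = \text{number of columns of } Z,
\]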
reflects the fact that we are thinking of $Z$ as an $m_1 \times m_2$ block matrix in which each of the $m_1 m_2$ blocks is of size $\ell_1 \times \ell_2$. Actually, this is the natural way of thinking in view of the block structure of $Z$; see (2.7). With such indicization, for all $\boldsymbol{i} = \boldsymbol{1}, \ldots, \boldsymbol{n}$ and $\boldsymbol{j} = \boldsymbol{1}, \ldots, \boldsymbol{k}$ we have
\[ z_{\boldsymbol{i}\boldsymbol{j}} = \text{entry } (i_2, j_2) \text{ in the } (i_1, j_1) \text{ block of } Z = x_{i_1 j_1} y_{i_2 j_2}. \]
We then see that the bi-index expression $z_{\boldsymbol{i}\boldsymbol{j}}$ is much simpler than the scalar-index expression $z_{ij}$. To conclude, we remark that the matrix $Z$ is the so-called tensor (Kronecker) product of $X$ and $Y$, and it is usually denoted by $X \otimes Y$; we shall come back to tensor products in Sect. 2.5.

Example 2.5 (multilevel matrices) In many cases, it is convenient to partition matrices into blocks, which are partitioned into smaller blocks, which are partitioned into smaller blocks, and so on until a certain nesting level $d$ is reached. Such matrices are called multilevel matrices. More precisely, following Tyrtyshnikov [41, Sect. 6], we say that a square matrix $A$ of size $N$ is a d-level matrix with level orders $n_1, n_2, \ldots, n_d$ if $N = n_1 n_2 \cdots n_d$ and $A$ is partitioned into $n_1^2$ square blocks of size $N/n_1$, each of which is partitioned into $n_2^2$ square blocks of size $N/(n_1 n_2)$, each of which is partitioned into $n_3^2$ square blocks of size $N/(n_1 n_2 n_3)$, and so on until the last $n_d^2$ square blocks of size $N/(n_1 n_2 \cdots n_d) = 1$, which are scalars. In formulas,
\[
\begin{aligned}
A &= [A_{i_1 j_1}]_{i_1, j_1 = 1}^{n_1}, & A_{i_1 j_1} &\in \mathbb{C}^{\tilde{n}_1 \times \tilde{n}_1}, & \tilde{n}_1 &= \frac{N}{n_1}; \\
A_{i_1 j_1} &= [A_{i_1 j_1;\, i_2 j_2}]_{i_2, j_2 = 1}^{n_2}, & A_{i_1 j_1;\, i_2 j_2} &\in \mathbb{C}^{\tilde{n}_2 \times \tilde{n}_2}, & \tilde{n}_2 &= \frac{N}{n_1 n_2}; \\
&\ \ \vdots \\
A_{i_1 j_1;\, \ldots;\, i_{d-1} j_{d-1}} &= [A_{i_1 j_1;\, \ldots;\, i_d j_d}]_{i_d, j_d = 1}^{n_d}, & A_{i_1 j_1;\, \ldots;\, i_d j_d} &\in \mathbb{C}^{\tilde{n}_d \times \tilde{n}_d}, & \tilde{n}_d &= \frac{N}{n_1 \cdots n_d} = 1.
\end{aligned}
\]
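The level orders $n_1, n_2, \ldots, n_d$ and the order $N$ are also referred to as partial orders and total order, respectively. Indexing the entries of a d-level matrix $A$ by two traditional scalar indices $i, j = 1, \ldots, N$ is a nightmare. On the contrary, $A$ admits a natural indicization by means of two d-indices $\boldsymbol{i}, \boldsymbol{j} = \boldsymbol{1}, \ldots, \boldsymbol{n}$, where $\boldsymbol{n} = (n_1, \ldots, n_d)$. Indeed, $A$ can be written in the form (2.4) as follows: $A = [A_{\boldsymbol{i}\boldsymbol{j}}]_{\boldsymbol{i},\boldsymbol{j}=\boldsymbol{1}}^{\boldsymbol{n}}$, where $A_{\boldsymbol{i}\boldsymbol{j}} = A_{i_1 j_1;\, \ldots;\, i_d j_d}$ for $\boldsymbol{i}, \boldsymbol{j} = \boldsymbol{1}, \ldots, \boldsymbol{n}$. We remark that, as we shall see in Chap. 7, when a linear PDE over a d-dimensional hyperrectangle is discretized by a linear numerical method, the resulting discretization matrix is normally a d-level matrix with level orders $n_1, n_2, \ldots, n_d$ and total order $N(\boldsymbol{n}) = n_1 n_2 \cdots n_d$, where $n_i$ is the discretization parameter in the $i$th direction.

Example 2.6 (matrix computations with multi-indices) Let $\boldsymbol{n} \in \mathbb{N}^d$ and let
\[
A = [a_{\boldsymbol{i}\boldsymbol{j}}]_{\boldsymbol{i},\boldsymbol{j}=\boldsymbol{1}}^{\boldsymbol{n}}, \qquad
B = [b_{\boldsymbol{i}\boldsymbol{j}}]_{\boldsymbol{i},\boldsymbol{j}=\boldsymbol{1}}^{\boldsymbol{n}}, \qquad
\mathbf{x} = [x_{\boldsymbol{j}}]_{\boldsymbol{j}=\boldsymbol{1}}^{\boldsymbol{n}}, \qquad
\mathbf{y} = [y_{\boldsymbol{j}}]_{\boldsymbol{j}=\boldsymbol{1}}^{\boldsymbol{n}}.
\]
It is not difficult to see that the following properties hold.
• $A^* = [\overline{a_{\boldsymbol{j}\boldsymbol{i}}}]_{\boldsymbol{i},\boldsymbol{j}=\boldsymbol{1}}^{\boldsymbol{n}}$.
• $\alpha A + \beta B = [\alpha a_{\boldsymbol{i}\boldsymbol{j}} + \beta b_{\boldsymbol{i}\boldsymbol{j}}]_{\boldsymbol{i},\boldsymbol{j}=\boldsymbol{1}}^{\boldsymbol{n}}$ for all $\alpha, \beta \in \mathbb{C}$.
• $AB = \bigl[\sum_{\boldsymbol{k}=\boldsymbol{1}}^{\boldsymbol{n}} a_{\boldsymbol{i}\boldsymbol{k}} b_{\boldsymbol{k}\boldsymbol{j}}\bigr]_{\boldsymbol{i},\boldsymbol{j}=\boldsymbol{1}}^{\boldsymbol{n}}$.
• $A\mathbf{x} = \bigl[\sum_{\boldsymbol{j}=\boldsymbol{1}}^{\boldsymbol{n}} a_{\boldsymbol{i}\boldsymbol{j}} x_{\boldsymbol{j}}\bigr]_{\boldsymbol{i}=\boldsymbol{1}}^{\boldsymbol{n}}$.
• $\mathbf{x}^* A \mathbf{y} = \sum_{\boldsymbol{i},\boldsymbol{j}=\boldsymbol{1}}^{\boldsymbol{n}} a_{\boldsymbol{i}\boldsymbol{j}} \overline{x_{\boldsymbol{i}}} y_{\boldsymbol{j}}$.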
These properties show that the usual matrix computation rules remain formally the same when passing from standard scalar indices to multi-indices. We invite the reader to prove them as an exercise.
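The product rule of Example 2.6 can be checked numerically. The sketch below is only an illustration under the assumption that multi-indices are mapped to scalar indices by the lexicographic ordering, which in NumPy corresponds to row-major reshaping; the helper `scalar_index` is a hypothetical name introduced here.

```python
import numpy as np

n = (2, 3)                       # a 2-index n = (n_1, n_2)
N = int(np.prod(n))              # N(n) = n_1 * n_2
rng = np.random.default_rng(0)
A = rng.standard_normal((N, N))
B = rng.standard_normal((N, N))

# View each N x N matrix as a tensor indexed by (i_1, i_2, j_1, j_2).
# Row-major reshaping matches the lexicographic ordering of multi-indices.
A4 = A.reshape(n + n)
B4 = B.reshape(n + n)

# (AB)_{ij} = sum over the multi-index k = (k_1, k_2) of a_{ik} b_{kj}.
AB4 = np.einsum("abkl,klcd->abcd", A4, B4)

def scalar_index(i, n):
    """0-based position of the 1-based multi-index i in the range 1, ..., n."""
    return int(np.ravel_multi_index(tuple(c - 1 for c in i), n))

i, j = (2, 1), (1, 3)
entry_via_multi_index = AB4[i[0] - 1, i[1] - 1, j[0] - 1, j[1] - 1]
entry_via_scalar_index = (A @ B)[scalar_index(i, n), scalar_index(j, n)]
```

Both entries agree, and reshaping `AB4` back to an N x N array reproduces `A @ B`, which is the sense in which the usual rules remain formally the same.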
2.1.3 Multilevel Matrix-Sequences

We recall from Sect. 2.1.1 that a sequence of matrices is a sequence of the form $\{A_n\}_n$, where $n$ varies in some infinite subset of $\mathbb{N}$ and $A_n$ is a square matrix of size $d_n \to \infty$. A d-level matrix-sequence is a special sequence of matrices of the form $\{A_{\boldsymbol{n}}\}_n$, where:
• $n$ varies in some infinite subset of $\mathbb{N}$;
• $\boldsymbol{n} = \boldsymbol{n}(n)$ is a d-index with positive components which depends on $n$ and satisfies $\boldsymbol{n} \to \infty$ as $n \to \infty$;
• $A_{\boldsymbol{n}}$ is a square matrix of size $N(\boldsymbol{n})$.

Recall from Sect. 2.1.2 that $\boldsymbol{n} \to \infty$ means $\min(\boldsymbol{n}) \to \infty$. The name “d-level matrix-sequence” is due to the fact that, in practical applications, especially in the context of PDE discretizations, each matrix $A_{\boldsymbol{n}}$ of a d-level matrix-sequence $\{A_{\boldsymbol{n}}\}_n$ is normally a d-level matrix with level orders $(n_1, \ldots, n_d) = \boldsymbol{n}$; see Example 2.5. Throughout this book, we often use the abbreviation “matrix-sequence” for both “d-level matrix-sequence” and “multilevel matrix-sequence”.
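2.2 Multivariate Trigonometric Polynomials

A d-variate trigonometric polynomial is a finite linear combination of the d-variate Fourier frequencies
\[ \mathrm{e}^{\mathrm{i}\boldsymbol{k}\cdot\boldsymbol{\theta}} = \mathrm{e}^{\mathrm{i}(k_1\theta_1 + \ldots + k_d\theta_d)}, \qquad \boldsymbol{k} \in \mathbb{Z}^d, \]
that is, a function of the form
\[
f(\boldsymbol{\theta}) = \sum_{\boldsymbol{k}=-\boldsymbol{N}}^{\boldsymbol{N}} f_{\boldsymbol{k}} \, \mathrm{e}^{\mathrm{i}\boldsymbol{k}\cdot\boldsymbol{\theta}}, \qquad
f_{-\boldsymbol{N}}, \ldots, f_{\boldsymbol{N}} \in \mathbb{C}, \qquad \boldsymbol{N} \in \mathbb{N}^d.
\tag{2.8}
\]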
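Note that, as a consequence of the orthogonality relations
\[
\int_{[-\pi,\pi]^d} \mathrm{e}^{\mathrm{i}\boldsymbol{\ell}\cdot\boldsymbol{\theta}} \, \mathrm{e}^{-\mathrm{i}\boldsymbol{k}\cdot\boldsymbol{\theta}} \, \mathrm{d}\boldsymbol{\theta} =
\begin{cases}
(2\pi)^d, & \text{if } \boldsymbol{k} = \boldsymbol{\ell}, \\
0, & \text{if } \boldsymbol{k} \ne \boldsymbol{\ell},
\end{cases}
\tag{2.9}
\]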
the numbers $f_{-\boldsymbol{N}}, \ldots, f_{\boldsymbol{N}}$ appearing in (2.8) are the (only possible nonzero) Fourier coefficients of $f$ according to the definition (2.1). Note also that a 1-variate (or univariate) trigonometric polynomial is just a trigonometric polynomial in the classical sense. In what follows, we say that a d-variate trigonometric polynomial is d-separable (or simply separable) if it is a d-separable function from $\mathbb{R}^d$ to $\mathbb{C}$ according to the definition in Sect. 2.1.1.

Lemma 2.1 Let $f : \mathbb{R}^d \to \mathbb{C}$ be a separable d-variate trigonometric polynomial and let $f = f_1 \otimes \cdots \otimes f_d$ be a factorization of $f$. If $f$ is not identically 0, then $f_1, \ldots, f_d$ are (univariate) trigonometric polynomials.

Proof Since $f_2, \ldots, f_d$ are not identically 0, there exists $(\vartheta_2, \ldots, \vartheta_d)$ such that $f_2(\vartheta_2) \cdots f_d(\vartheta_d) \ne 0$. The definition of d-variate trigonometric polynomials implies that $\theta_1 \mapsto f(\theta_1, \vartheta_2, \ldots, \vartheta_d) = f_1(\theta_1) f_2(\vartheta_2) \cdots f_d(\vartheta_d)$ is a (univariate) trigonometric polynomial, and this means that $f_1$ is a trigonometric polynomial. With the same argument, one can show that $f_2, \ldots, f_d$ are trigonometric polynomials as well.

Corollary 2.1 Let $f : \mathbb{R}^d \to \mathbb{C}$ be a separable d-variate trigonometric polynomial. Then, there exist (univariate) trigonometric polynomials $f_1, \ldots, f_d : \mathbb{R} \to \mathbb{C}$ such that $f = f_1 \otimes \cdots \otimes f_d$.

The next lemma shows that the set of zeros of every non-trivial d-variate trigonometric polynomial has zero measure.

Lemma 2.2 Let $f : \mathbb{R}^d \to \mathbb{C}$ be a d-variate trigonometric polynomial with at least one nonzero Fourier coefficient. Then $\mu_d\{f = 0\} = 0$.

Proof The proof proceeds by induction on $d$. For $d = 1$, the result has already been proved in [22, solution of Exercise 6.2, pp. 286–287]. Suppose $d > 1$ and assume that the lemma holds for dimensions up to $d - 1$. Let $f(\boldsymbol{\theta}) = \sum_{\boldsymbol{k}=-\boldsymbol{N}}^{\boldsymbol{N}} f_{\boldsymbol{k}} \, \mathrm{e}^{\mathrm{i}\boldsymbol{k}\cdot\boldsymbol{\theta}}$ and set $Z = \{f = 0\}$. By Fubini's theorem,
\[
\mu_d(Z) = \int_{\mathbb{R}^d} \chi_Z(\theta_1, \ldots, \theta_d) \, \mathrm{d}\theta_1 \ldots \mathrm{d}\theta_d
= \int_{\mathbb{R}^{d-1}} \mathrm{d}\theta_2 \ldots \mathrm{d}\theta_d \int_{\mathbb{R}} \chi_Z(\theta_1, \ldots, \theta_d) \, \mathrm{d}\theta_1
= \int_{\mathbb{R}^{d-1}} \mathrm{d}\theta_2 \ldots \mathrm{d}\theta_d \int_{Z_{\theta_2,\ldots,\theta_d}} \mathrm{d}\theta_1,
\tag{2.10}
\]
where, for each fixed $(\theta_2, \ldots, \theta_d) \in \mathbb{R}^{d-1}$, the set $Z_{\theta_2,\ldots,\theta_d}$ is defined by
\[ Z_{\theta_2,\ldots,\theta_d} = \{\theta_1 \in \mathbb{R} : f(\theta_1, \theta_2, \ldots, \theta_d) = 0\}. \]
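Write
\[
f(\boldsymbol{\theta}) = \sum_{k_1=-N_1}^{N_1} p_{k_1}(\theta_2, \ldots, \theta_d) \, \mathrm{e}^{\mathrm{i} k_1 \theta_1},
\tag{2.11}
\]
where $p_{-N_1}, \ldots, p_{N_1}$ are $(d-1)$-variate trigonometric polynomials given by
\[
p_{k_1}(\theta_2, \ldots, \theta_d) = \sum_{(k_2,\ldots,k_d)=-(N_2,\ldots,N_d)}^{(N_2,\ldots,N_d)} f_{\boldsymbol{k}} \, \mathrm{e}^{\mathrm{i}(k_2,\ldots,k_d)\cdot(\theta_2,\ldots,\theta_d)}, \qquad k_1 = -N_1, \ldots, N_1.
\]
Let
\[ A = \{(\theta_2, \ldots, \theta_d) \in \mathbb{R}^{d-1} : p_{k_1}(\theta_2, \ldots, \theta_d) = 0 \text{ for all } k_1 = -N_1, \ldots, N_1\}. \]
Since not all the Fourier coefficients $f_{\boldsymbol{k}}$ are equal to 0, at least one of the polynomials $p_{k_1}$ has at least one nonzero Fourier coefficient. Thus, by induction hypothesis, $\mu_{d-1}(A) = 0$. Moreover, by (2.11),
• if $(\theta_2, \ldots, \theta_d) \in A$ then $Z_{\theta_2,\ldots,\theta_d} = \mathbb{R}$;
• if $(\theta_2, \ldots, \theta_d) \in A^c = \mathbb{R}^{d-1} \setminus A$ then there exists an index $k_1 \in \{-N_1, \ldots, N_1\}$ such that $p_{k_1}(\theta_2, \ldots, \theta_d) \ne 0$, and so $\mu_1(Z_{\theta_2,\ldots,\theta_d}) = 0$ by induction hypothesis.

Going back to (2.10), we see that
\[
\mu_d(Z) = \int_A \mathrm{d}\theta_2 \ldots \mathrm{d}\theta_d \int_{Z_{\theta_2,\ldots,\theta_d}} \mathrm{d}\theta_1
+ \int_{A^c} \mathrm{d}\theta_2 \ldots \mathrm{d}\theta_d \int_{Z_{\theta_2,\ldots,\theta_d}} \mathrm{d}\theta_1 = 0,
\]
and the proof is complete.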
The next two lemmas are the multivariate versions of [22, Lemmas 2.7 and 2.8]. They show how it is possible to approximate an $L^1$ (resp., a measurable) function by means of standard (resp., weighted) multivariate trigonometric polynomials. In what follows, for any $D \subseteq \mathbb{R}^k$ we denote by $C_c(D)$ the space of continuous functions $f : D \to \mathbb{C}$ such that the support $\mathrm{supp}(f) = \overline{\{x \in D : f(x) \ne 0\}} = \overline{\{f \ne 0\}}$ is compact. We recall that, if $D$ is measurable (so that it makes sense to talk about $L^p(D)$), then the space $C_c(D)$ is dense in $L^p(D)$ for all $1 \le p < \infty$ [22, p. 13].

Lemma 2.3 Let $f \in L^1([-\pi, \pi]^d)$. Then, there exists a sequence of d-variate trigonometric polynomials $\{p_m\}_m$ such that $\|p_m\|_\infty \le \operatorname{ess\,sup}_{[-\pi,\pi]^d} |f|$ for all $m$ and $p_m \to f$ a.e. and in $L^1([-\pi, \pi]^d)$.

Proof The proof follows the same pattern as [22, proof of Lemma 2.7]. It suffices to show that, for each $\varepsilon > 0$, there exists a d-variate trigonometric polynomial $p_\varepsilon$ such that
\[ \|p_\varepsilon\|_\infty \le \operatorname{ess\,sup}_{[-\pi,\pi]^d} |f|, \qquad \|f - p_\varepsilon\|_{L^1} \le \varepsilon. \]
Indeed, this shows the existence of a sequence of d-variate trigonometric polynomials $\{p_m\}_m$ such that $\|p_m\|_\infty \le \operatorname{ess\,sup}_{[-\pi,\pi]^d} |f|$ for all $m$ and $p_m \to f$ in $L^1([-\pi, \pi]^d)$.
Recalling that the $L^1$ convergence of a sequence implies the a.e. convergence of an appropriate subsequence, passing to an appropriate subsequence of $\{p_m\}_m$ (if necessary), we may assume that $p_m \to f$ a.e. Let $\varepsilon > 0$. By [22, Theorem 2.2] in combination with the dominated convergence theorem and the density of $C_c((-\pi, \pi)^d)$ in $L^1((-\pi, \pi)^d)$, there exists $f_\varepsilon \in C_c((-\pi, \pi)^d)$ such that
\[
\|f_\varepsilon\|_\infty \le \operatorname{ess\,sup}_{[-\pi,\pi]^d} |f|, \qquad \|f - f_\varepsilon\|_{L^1} < \varepsilon.
\tag{2.12}
\]
The function $f_\varepsilon$ is continuous on $[-\pi, \pi]^d$ and satisfies $f_\varepsilon(\boldsymbol{\theta}) = 0$ for every $\boldsymbol{\theta} \in \partial([-\pi, \pi]^d)$. Thus, by the multivariate version of Fejér's theorem, which is proved essentially in the same way as the classical (univariate) Fejér theorem [28, Theorem 3.1], there exists a d-variate trigonometric polynomial $p_\varepsilon$ such that
\[
\|p_\varepsilon\|_\infty \le \|f_\varepsilon\|_\infty, \qquad \|f_\varepsilon - p_\varepsilon\|_\infty < \varepsilon.
\tag{2.13}
\]
By combining (2.12) and (2.13), we arrive at
\[
\|p_\varepsilon\|_\infty \le \operatorname{ess\,sup}_{[-\pi,\pi]^d} |f|, \qquad \|f - p_\varepsilon\|_{L^1} \le \varepsilon(1 + (2\pi)^d),
\]
which proves the thesis.
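Lemma 2.4 Let $\kappa : [0, 1]^d \times [-\pi, \pi]^d \to \mathbb{C}$ be a measurable function. Then, there exists a sequence $\{\kappa_m\}_m$ such that $\kappa_m : [0, 1]^d \times [-\pi, \pi]^d \to \mathbb{C}$ is a function of the form
\[
\kappa_m(\mathbf{x}, \boldsymbol{\theta}) = \sum_{\boldsymbol{j}=-\boldsymbol{N}_m}^{\boldsymbol{N}_m} a_{\boldsymbol{j}}^{(m)}(\mathbf{x}) \, \mathrm{e}^{\mathrm{i}\boldsymbol{j}\cdot\boldsymbol{\theta}}, \qquad
a_{\boldsymbol{j}}^{(m)} \in C^\infty([0, 1]^d), \qquad \boldsymbol{N}_m \in \mathbb{N}^d,
\tag{2.14}
\]
and $\kappa_m \to \kappa$ a.e.

Proof The proof is essentially the same as [22, proof of Lemma 2.8]. The truncated function $\tilde{\kappa}_m = \kappa \chi_{\{|\kappa| \le m\}}$ belongs to $L^\infty([0, 1]^d \times [-\pi, \pi]^d)$ and converges to $\kappa$ in measure. Indeed, $\tilde{\kappa}_m \to \kappa$ pointwise over $[0, 1]^d \times [-\pi, \pi]^d$, and the pointwise (a.e.) convergence on a set of finite measure implies the convergence in measure [22, Lemma 2.4]. By [22, Lemma 2.2], the space generated by the trigonometric monomials
\[
\bigl\{ \mathrm{e}^{2\pi\mathrm{i}\boldsymbol{\ell}\cdot\mathbf{x}} \, \mathrm{e}^{\mathrm{i}\boldsymbol{j}\cdot\boldsymbol{\theta}} = \mathrm{e}^{\mathrm{i}(2\pi\ell_1 x_1 + \ldots + 2\pi\ell_d x_d + j_1\theta_1 + \ldots + j_d\theta_d)} : \boldsymbol{\ell}, \boldsymbol{j} \in \mathbb{Z}^d \bigr\}
\]
is dense in $L^1([0, 1]^d \times [-\pi, \pi]^d)$, so we can choose a function $\kappa_m$ belonging to this space such that $\|\kappa_m - \tilde{\kappa}_m\|_{L^1} \le 1/m$. Note that $\kappa_m$ is a function of the form (2.14). Moreover, for each $\varepsilon > 0$, using Chebyshev's inequality [22, Eq. (2.4)] we obtain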
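\[
\begin{aligned}
\mu_{2d}\{|\kappa_m - \kappa| > \varepsilon\}
&\le \mu_{2d}(\{|\kappa_m - \tilde{\kappa}_m| > \varepsilon/2\} \cup \{|\tilde{\kappa}_m - \kappa| > \varepsilon/2\}) \\
&\le \mu_{2d}\{|\kappa_m - \tilde{\kappa}_m| > \varepsilon/2\} + \mu_{2d}\{|\tilde{\kappa}_m - \kappa| > \varepsilon/2\} \\
&\le \frac{\|\kappa_m - \tilde{\kappa}_m\|_{L^1}}{\varepsilon/2} + \mu_{2d}\{|\tilde{\kappa}_m - \kappa| > \varepsilon/2\},
\end{aligned}
\]
which converges to 0 as $m \to \infty$. Hence, $\kappa_m \to \kappa$ in measure. Since the convergence in measure on a set of finite measure implies the existence of a subsequence that converges a.e. [22, Lemma 2.4], passing to a subsequence of $\{\kappa_m\}_m$ (if necessary) we may assume that $\kappa_m \to \kappa$ a.e.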
2.3 Multivariate Riemann-Integrable Functions

A function $a : [0, 1]^d \to \mathbb{C}$ is said to be Riemann-integrable if its real and imaginary parts $\Re(a), \Im(a) : [0, 1]^d \to \mathbb{R}$ are Riemann-integrable in the classical sense. Recall that any Riemann-integrable function is bounded by definition. We report below a list of properties possessed by Riemann-integrable functions that will be used in this book, either explicitly or implicitly.
• If $\alpha, \beta \in \mathbb{C}$ and $a, b : [0, 1]^d \to \mathbb{C}$ are Riemann-integrable, then $\alpha a + \beta b$ is Riemann-integrable.
• If $a, b : [0, 1]^d \to \mathbb{C}$ are Riemann-integrable, then $ab$ is Riemann-integrable.
• If $a : [0, 1]^d \to \mathbb{C}$ is Riemann-integrable and $F : \mathbb{C} \to \mathbb{C}$ is continuous, then $F(a) : [0, 1]^d \to \mathbb{C}$ is Riemann-integrable.
• If $a : [0, 1]^d \to \mathbb{C}$ is Riemann-integrable, then $a$ belongs to $L^\infty([0, 1]^d)$ and its Lebesgue and Riemann integrals over $[0, 1]^d$ coincide.
• If $a : [0, 1]^d \to \mathbb{C}$ is bounded, then $a$ is Riemann-integrable if and only if $a$ is continuous a.e.

Note that the last two properties imply the first three. The proof of the second-to-last property can be found in [30, pp. 73–74], while the last property is Lebesgue's characterization theorem of Riemann-integrable functions [30, p. 104]. Note that the proofs in [30] are made for the case $d = 1$ only, but the generalization to the case $d > 1$ is straightforward. A further property of Riemann-integrable functions that will be used in this book is stated and proved in the next lemma, which is the multivariate version of [22, Lemma 2.9].

Lemma 2.5 Let $a : [0, 1]^d \to \mathbb{R}$ be Riemann-integrable. For each $\boldsymbol{n} \in \mathbb{N}^d$, consider the partition of $(0, 1]^d$ given by the d-dimensional hyperrectangles
\[
I_{\boldsymbol{i},\boldsymbol{n}} = \Bigl( \frac{\boldsymbol{i} - \boldsymbol{1}}{\boldsymbol{n}}, \frac{\boldsymbol{i}}{\boldsymbol{n}} \Bigr]
= \Bigl( \frac{i_1 - 1}{n_1}, \frac{i_1}{n_1} \Bigr] \times \cdots \times \Bigl( \frac{i_d - 1}{n_d}, \frac{i_d}{n_d} \Bigr], \qquad \boldsymbol{i} = \boldsymbol{1}, \ldots, \boldsymbol{n},
\]
and let
\[
a_{\boldsymbol{i},\boldsymbol{n}} \in \Bigl[ \inf_{\mathbf{x} \in I_{\boldsymbol{i},\boldsymbol{n}}} a(\mathbf{x}), \ \sup_{\mathbf{x} \in I_{\boldsymbol{i},\boldsymbol{n}}} a(\mathbf{x}) \Bigr], \qquad \boldsymbol{i} = \boldsymbol{1}, \ldots, \boldsymbol{n}.
\]
Then
\[
\sum_{\boldsymbol{i}=\boldsymbol{1}}^{\boldsymbol{n}} a_{\boldsymbol{i},\boldsymbol{n}} \chi_{I_{\boldsymbol{i},\boldsymbol{n}}} \to a \quad \text{a.e. in } [0, 1]^d
\tag{2.15}
\]
and
\[
\lim_{\boldsymbol{n} \to \infty} \frac{1}{N(\boldsymbol{n})} \sum_{\boldsymbol{i}=\boldsymbol{1}}^{\boldsymbol{n}} a_{\boldsymbol{i},\boldsymbol{n}} = \int_{[0,1]^d} a(\mathbf{x}) \, \mathrm{d}\mathbf{x}.
\tag{2.16}
\]
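Proof The proof is essentially the same as [22, proof of Lemma 2.9]. Fix $\varepsilon > 0$ and let $\mathbf{x} \in (0, 1]^d$ be a continuity point of $a$. Then, there is a $\delta > 0$ such that $|a(\mathbf{y}) - a(\mathbf{x})| \le \varepsilon$ whenever $\mathbf{y} \in [0, 1]^d$ and $|\mathbf{y} - \mathbf{x}|_\infty \le \delta$. Take $\boldsymbol{n} \ge (1/\delta)\boldsymbol{1}$ and call $I_{\boldsymbol{k},\boldsymbol{n}}$ the unique hyperrectangle of the partition $(0, 1]^d = \bigcup_{\boldsymbol{i}=\boldsymbol{1}}^{\boldsymbol{n}} I_{\boldsymbol{i},\boldsymbol{n}}$ containing $\mathbf{x}$. For $\mathbf{y} \in I_{\boldsymbol{k},\boldsymbol{n}}$, we have $\mathbf{y} \in [0, 1]^d$ and $|\mathbf{y} - \mathbf{x}|_\infty \le \delta$, hence $|a(\mathbf{y}) - a(\mathbf{x})| \le \varepsilon$. It follows that
\[
\Bigl| \sum_{\boldsymbol{i}=\boldsymbol{1}}^{\boldsymbol{n}} a_{\boldsymbol{i},\boldsymbol{n}} \chi_{I_{\boldsymbol{i},\boldsymbol{n}}}(\mathbf{x}) - a(\mathbf{x}) \Bigr|
= |a_{\boldsymbol{k},\boldsymbol{n}} - a(\mathbf{x})|
\le \max\Bigl( a(\mathbf{x}) - \inf_{\mathbf{y} \in I_{\boldsymbol{k},\boldsymbol{n}}} a(\mathbf{y}), \ \sup_{\mathbf{y} \in I_{\boldsymbol{k},\boldsymbol{n}}} a(\mathbf{y}) - a(\mathbf{x}) \Bigr) \le \varepsilon.
\]
As a consequence, $\sum_{\boldsymbol{i}=\boldsymbol{1}}^{\boldsymbol{n}} a_{\boldsymbol{i},\boldsymbol{n}} \chi_{I_{\boldsymbol{i},\boldsymbol{n}}}(\mathbf{x}) \to a(\mathbf{x})$ whenever $\mathbf{x}$ is a continuity point of $a$ in $(0, 1]^d$. This implies (2.15), because $a$ is Riemann-integrable and hence continuous a.e. in $[0, 1]^d$. Since
\[
\Bigl| \sum_{\boldsymbol{i}=\boldsymbol{1}}^{\boldsymbol{n}} a_{\boldsymbol{i},\boldsymbol{n}} \chi_{I_{\boldsymbol{i},\boldsymbol{n}}} \Bigr| \le \|a\|_\infty < \infty, \qquad
\frac{1}{N(\boldsymbol{n})} \sum_{\boldsymbol{i}=\boldsymbol{1}}^{\boldsymbol{n}} a_{\boldsymbol{i},\boldsymbol{n}} = \int_{[0,1]^d} \sum_{\boldsymbol{i}=\boldsymbol{1}}^{\boldsymbol{n}} a_{\boldsymbol{i},\boldsymbol{n}} \chi_{I_{\boldsymbol{i},\boldsymbol{n}}},
\]
(2.16) follows from (2.15) and from the dominated convergence theorem.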
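The limit (2.16) can be observed numerically. The following sketch makes two assumptions that are not part of the lemma: it takes $d = 2$ with the sample function $a(x_1, x_2) = x_1 x_2$ (whose integral over $[0, 1]^2$ is 1/4), and it chooses $a_{\boldsymbol{i},\boldsymbol{n}} = a(\boldsymbol{i}/\boldsymbol{n})$, which is admissible because $\boldsymbol{i}/\boldsymbol{n}$ belongs to $I_{\boldsymbol{i},\boldsymbol{n}}$. The function name `sampled_mean` is hypothetical.

```python
import numpy as np

def sampled_mean(a, n):
    """(1/N(n)) * sum over i = 1, ..., n of a(i/n), for a 2-index n."""
    x1 = np.arange(1, n[0] + 1) / n[0]
    x2 = np.arange(1, n[1] + 1) / n[1]
    X1, X2 = np.meshgrid(x1, x2, indexing="ij")
    return a(X1, X2).mean()

def a(x1, x2):
    # Sample Riemann-integrable function; its integral over [0,1]^2 is 1/4.
    return x1 * x2

for n in [(10, 20), (100, 200), (1000, 2000)]:
    # The printed means approach 0.25 as min(n) grows.
    print(n, sampled_mean(a, n))
```

For this particular choice the mean equals $\frac{n_1+1}{2n_1}\cdot\frac{n_2+1}{2n_2}$ exactly, so the convergence to 1/4 requires both components of $\boldsymbol{n}$ to grow, in agreement with the meaning of $\boldsymbol{n} \to \infty$.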
2.4 Matrix Norms

For the reader's convenience, we report from [22, Sects. 2.4.1 and 2.4.3] several matrix-norm inequalities that we shall use throughout the book. First, we recall the expressions of the p-norms for $p = 1, \infty$:
\[
|X|_1 = \max_{j=1,\ldots,m} \sum_{i=1}^{m} |x_{ij}|, \qquad
|X|_\infty = \max_{i=1,\ldots,m} \sum_{j=1}^{m} |x_{ij}|, \qquad X \in \mathbb{C}^{m \times m}.
\]
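An important bound for $\|X\|$ in terms of $|X|_1$ and $|X|_\infty$ (and hence in terms of the components of $X$) is the following:
\[
\|X\| \le \sqrt{|X|_1 |X|_\infty} \le \max(|X|_1, |X|_\infty), \qquad X \in \mathbb{C}^{m \times m}.
\tag{2.17}
\]
Given $1 \le p, q \le \infty$, we say that $p, q$ are conjugate exponents if $\frac{1}{p} + \frac{1}{q} = 1$ (it is understood that $\frac{1}{\infty} = 0$). The following Hölder-type inequality holds for the Schatten norms:
\[
\|XY\|_1 \le \|X\|_p \|Y\|_q, \qquad X, Y \in \mathbb{C}^{m \times m}.
\tag{2.18}
\]
We will also need the following trace-norm inequalities:
\[
\|X\|_1 \le \operatorname{rank}(X) \|X\| \le m \|X\|, \qquad X \in \mathbb{C}^{m \times m},
\tag{2.19}
\]
\[
\|X\|_1 \le \sum_{i,j=1}^{m} |x_{ij}|, \qquad X \in \mathbb{C}^{m \times m}.
\tag{2.20}
\]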
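The inequalities (2.17)–(2.20) are easy to test on a random matrix. The sketch below relies on the standard identifications that the spectral norm $\|X\|$ is `numpy.linalg.norm(X, 2)`, the trace norm $\|X\|_1$ is the nuclear norm `"nuc"`, and the Schatten 2-norm is the Frobenius norm; the variable names and the small tolerance `eps` are choices made for this illustration only, and (2.18) is checked only for $p = q = 2$.

```python
import numpy as np

rng = np.random.default_rng(1)
m = 6
X = rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))
Y = rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))

norm_1 = np.linalg.norm(X, 1)           # |X|_1: maximum column sum
norm_inf = np.linalg.norm(X, np.inf)    # |X|_inf: maximum row sum
spectral = np.linalg.norm(X, 2)         # ||X||: largest singular value
trace_norm = np.linalg.norm(X, "nuc")   # ||X||_1: sum of the singular values
eps = 1e-12                             # slack for rounding errors

# (2.17): ||X|| <= sqrt(|X|_1 |X|_inf) <= max(|X|_1, |X|_inf)
holds_2_17 = (spectral <= np.sqrt(norm_1 * norm_inf) + eps
              and np.sqrt(norm_1 * norm_inf) <= max(norm_1, norm_inf) + eps)

# (2.18) with p = q = 2: ||XY||_1 <= ||X||_2 ||Y||_2 (Frobenius norms)
holds_2_18 = (np.linalg.norm(X @ Y, "nuc")
              <= np.linalg.norm(X, "fro") * np.linalg.norm(Y, "fro") + eps)

# (2.19): ||X||_1 <= rank(X) ||X|| <= m ||X||
r = np.linalg.matrix_rank(X)
holds_2_19 = trace_norm <= r * spectral + eps and r * spectral <= m * spectral + eps

# (2.20): ||X||_1 <= sum of |x_ij|
holds_2_20 = trace_norm <= np.abs(X).sum() + eps
```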
2.5 Tensor Products and Direct Sums

If $X, Y$ are matrices of any dimension, say $X \in \mathbb{C}^{m_1 \times m_2}$ and $Y \in \mathbb{C}^{\ell_1 \times \ell_2}$, the tensor (Kronecker) product of $X$ and $Y$ is the $m_1 \ell_1 \times m_2 \ell_2$ matrix defined by
\[
X \otimes Y = [x_{ij} Y]_{\substack{i=1,\ldots,m_1 \\ j=1,\ldots,m_2}} =
\begin{bmatrix}
x_{11} Y & \cdots & x_{1 m_2} Y \\
\vdots & & \vdots \\
x_{m_1 1} Y & \cdots & x_{m_1 m_2} Y
\end{bmatrix},
\]
and the direct sum of $X$ and $Y$ is the $(m_1 + \ell_1) \times (m_2 + \ell_2)$ matrix defined by
\[
X \oplus Y = \operatorname{diag}(X, Y) = \begin{bmatrix} X & O \\ O & Y \end{bmatrix}.
\]
Tensor products and direct sums possess a lot of nice algebraic properties.
(i) Associativity: for all matrices $X$, $Y$, $Z$,
\[ (X \otimes Y) \otimes Z = X \otimes (Y \otimes Z), \qquad (X \oplus Y) \oplus Z = X \oplus (Y \oplus Z). \]
We can therefore omit parentheses in expressions like $X_1 \otimes X_2 \otimes \cdots \otimes X_d$ or $X_1 \oplus X_2 \oplus \cdots \oplus X_d$.
(ii) If $X_1, X_2$ can be multiplied and $Y_1, Y_2$ can be multiplied, then
\[ (X_1 \otimes Y_1)(X_2 \otimes Y_2) = (X_1 X_2) \otimes (Y_1 Y_2), \qquad (X_1 \oplus Y_1)(X_2 \oplus Y_2) = (X_1 X_2) \oplus (Y_1 Y_2). \]
(iii) For all matrices $X$, $Y$,
\[
(X \otimes Y)^* = X^* \otimes Y^*, \qquad (X \otimes Y)^T = X^T \otimes Y^T,
\]
\[
(X \oplus Y)^* = X^* \oplus Y^*, \qquad (X \oplus Y)^T = X^T \oplus Y^T.
\]
(iv) Bilinearity (of tensor products): for each fixed matrix $X$, the application $Y \mapsto X \otimes Y$ is linear on $\mathbb{C}^{\ell_1 \times \ell_2}$ for all $\ell_1, \ell_2 \in \mathbb{N}$; for each fixed matrix $Y$, the application $X \mapsto X \otimes Y$ is linear on $\mathbb{C}^{m_1 \times m_2}$ for all $m_1, m_2 \in \mathbb{N}$.

From (i)–(iv), a lot of other interesting properties follow. For example, if $X, Y$ are invertible, then $X \otimes Y$ is invertible, with inverse $X^{-1} \otimes Y^{-1}$. If $X, Y$ are normal (resp., Hermitian, symmetric, unitary) then $X \otimes Y$ is also normal (resp., Hermitian, symmetric, unitary). If $X \in \mathbb{C}^{m \times m}$ and $Y \in \mathbb{C}^{\ell \times \ell}$, the eigenvalues and singular values of $X \otimes Y$ are
\[
\{\lambda_i(X)\lambda_j(Y) : i = 1, \ldots, m, \ j = 1, \ldots, \ell\},
\tag{2.21}
\]
\[
\{\sigma_i(X)\sigma_j(Y) : i = 1, \ldots, m, \ j = 1, \ldots, \ell\};
\tag{2.22}
\]
and the eigenvalues and singular values of $X \oplus Y$ are
\[
\{\lambda_i(X) : i = 1, \ldots, m\} \cup \{\lambda_j(Y) : j = 1, \ldots, \ell\},
\tag{2.23}
\]
\[
\{\sigma_i(X) : i = 1, \ldots, m\} \cup \{\sigma_j(Y) : j = 1, \ldots, \ell\};
\tag{2.24}
\]
see [22, Exercise 2.5]. In particular, for all $X \in \mathbb{C}^{m \times m}$, $Y \in \mathbb{C}^{\ell \times \ell}$, and $1 \le p \le \infty$, we have
\[
\|X \otimes Y\|_p = \|X\|_p \|Y\|_p,
\tag{2.25}
\]
\[
\|X \oplus Y\|_p = \bigl| (\|X\|_p, \|Y\|_p) \bigr|_p =
\begin{cases}
(\|X\|_p^p + \|Y\|_p^p)^{1/p}, & \text{if } 1 \le p < \infty, \\
\max(\|X\|_\infty, \|Y\|_\infty), & \text{if } p = \infty,
\end{cases}
\tag{2.26}
\]
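and
\[
\operatorname{rank}(X \otimes Y) = \operatorname{rank}(X)\operatorname{rank}(Y),
\tag{2.27}
\]
\[
\operatorname{rank}(X \oplus Y) = \operatorname{rank}(X) + \operatorname{rank}(Y).
\tag{2.28}
\]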
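The singular value formulas (2.22), (2.24) and the rank formulas (2.27), (2.28) can be checked on small random matrices. This is only a numerical sketch: the helpers `singular_values`, `rank` and `direct_sum` are hypothetical names, and the rank tolerance `1e-10` is an assumption made so that rounding noise is not counted as rank.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal((3, 3))
Y = rng.standard_normal((4, 4))
Y[3] = Y[0] + Y[1]                       # make Y rank-deficient: rank(Y) = 3

def singular_values(M):
    return np.linalg.svd(M, compute_uv=False)

def rank(M):
    # Explicit tolerance so that rounding noise is not counted as rank.
    return np.linalg.matrix_rank(M, tol=1e-10)

def direct_sum(P, Q):
    """Return diag(P, Q), i.e., the direct sum of the matrices P and Q."""
    zeros_pq = np.zeros((P.shape[0], Q.shape[1]))
    zeros_qp = np.zeros((Q.shape[0], P.shape[1]))
    return np.block([[P, zeros_pq], [zeros_qp, Q]])

K = np.kron(X, Y)          # tensor (Kronecker) product
S = direct_sum(X, Y)       # direct sum

# (2.22): singular values of the tensor product are all pairwise products.
products = np.outer(singular_values(X), singular_values(Y)).ravel()
kron_sv_match = np.allclose(np.sort(singular_values(K)), np.sort(products))

# (2.24): singular values of the direct sum are the union of the two sets.
union = np.concatenate([singular_values(X), singular_values(Y)])
sum_sv_match = np.allclose(np.sort(singular_values(S)), np.sort(union))

# (2.27) and (2.28): ranks multiply for tensor products, add for direct sums.
ranks = (rank(K), rank(X) * rank(Y), rank(S), rank(X) + rank(Y))
```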
In addition to the properties considered so far, we need to highlight two further properties of tensor products, which are very important for the “multidimensional purposes” of this book. The first one is the multi-index formula for tensor products: if we have $d$ matrices $X_k \in \mathbb{C}^{m_k \times m_k}$, $k = 1, \ldots, d$, then
\[
(X_1 \otimes X_2 \otimes \cdots \otimes X_d)_{\boldsymbol{i}\boldsymbol{j}} = (X_1)_{i_1 j_1} (X_2)_{i_2 j_2} \cdots (X_d)_{i_d j_d}, \qquad \boldsymbol{i}, \boldsymbol{j} = \boldsymbol{1}, \ldots, \boldsymbol{m},
\tag{2.29}
\]
where $\boldsymbol{m} = (m_1, m_2, \ldots, m_d)$. Note that (2.29) can be rewritten in the form (2.4) as follows:
\[
X_1 \otimes X_2 \otimes \cdots \otimes X_d = \bigl[ (X_1)_{i_1 j_1} (X_2)_{i_2 j_2} \cdots (X_d)_{i_d j_d} \bigr]_{\boldsymbol{i},\boldsymbol{j}=\boldsymbol{1}}^{\boldsymbol{m}}.
\tag{2.30}
\]
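Formula (2.29) can be verified directly for $d = 3$. In the sketch below, the sizes $\boldsymbol{m} = (2, 3, 2)$ and the random factors are arbitrary choices made for illustration; the tensor `T` stores the right-hand side of (2.29), and a row-major reshape converts the multi-indices into scalar indices according to the lexicographic ordering.

```python
import numpy as np

rng = np.random.default_rng(3)
m = (2, 3, 2)
X1, X2, X3 = (rng.standard_normal((mk, mk)) for mk in m)
N = int(np.prod(m))                       # N(m) = 12

K = np.kron(np.kron(X1, X2), X3)          # tensor product of the three factors

# Right-hand side of (2.29), indexed by (i_1, i_2, i_3, j_1, j_2, j_3).
T = np.einsum("ad,be,cf->abcdef", X1, X2, X3)

# Row-major reshaping maps multi-indices to scalar indices lexicographically.
K_from_multi_index = T.reshape(N, N)

# One entry, with 1-based multi-indices i = (2, 3, 1) and j = (1, 2, 2).
i, j = (2, 3, 1), (1, 2, 2)
row = np.ravel_multi_index(tuple(c - 1 for c in i), m)
col = np.ravel_multi_index(tuple(c - 1 for c in j), m)
entry_scalar = K[row, col]
entry_multi = X1[i[0] - 1, j[0] - 1] * X2[i[1] - 1, j[1] - 1] * X3[i[2] - 1, j[2] - 1]
```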
Note also that $X_1 \otimes X_2 \otimes \cdots \otimes X_d$ is one of the most eminent examples of a d-level matrix with level orders $m_1, m_2, \ldots, m_d$ and total order $N(\boldsymbol{m})$; see Example 2.5 for the corresponding definitions. Equation (2.29) is of fundamental importance and, indeed, it motivates the introduction of multi-indices to index the entries of a matrix formed by a sum of tensor products. To better understand the importance of (2.29), try to write the $(i, j)$ entry of $X_1 \otimes X_2 \otimes \cdots \otimes X_d$ as a function of two scalar indices $i, j = 1, \ldots, N(\boldsymbol{m})$; see also Example 2.4.

The second property is a natural upper bound for the rank of the difference of two tensor products formed by $d$ factors. More precisely, suppose we have $2d$ matrices $X_1, \ldots, X_d, Y_1, \ldots, Y_d$, with $X_i, Y_i \in \mathbb{C}^{m_i \times m_i}$ for all $i = 1, \ldots, d$; then,
\[
\operatorname{rank}(X_1 \otimes \cdots \otimes X_d - Y_1 \otimes \cdots \otimes Y_d) \le N(\boldsymbol{m}) \sum_{i=1}^{d} \frac{\operatorname{rank}(X_i - Y_i)}{m_i},
\tag{2.31}
\]
where $\boldsymbol{m} = (m_1, \ldots, m_d)$. This is true because
\[
\begin{aligned}
\operatorname{rank}(X_1 \otimes \cdots \otimes X_d - Y_1 \otimes \cdots \otimes Y_d)
&= \operatorname{rank}\Bigl( \sum_{i=1}^{d} Y_1 \otimes \cdots \otimes Y_{i-1} \otimes (X_i - Y_i) \otimes X_{i+1} \otimes \cdots \otimes X_d \Bigr) \\
&\le \sum_{i=1}^{d} \operatorname{rank}(Y_1 \otimes \cdots \otimes Y_{i-1} \otimes (X_i - Y_i) \otimes X_{i+1} \otimes \cdots \otimes X_d) \\
&= \sum_{i=1}^{d} \operatorname{rank}(Y_1 \otimes \cdots \otimes Y_{i-1}) \operatorname{rank}(X_i - Y_i) \operatorname{rank}(X_{i+1} \otimes \cdots \otimes X_d) \\
&\le \sum_{i=1}^{d} m_1 \cdots m_{i-1} \operatorname{rank}(X_i - Y_i) \, m_{i+1} \cdots m_d.
\end{aligned}
\]
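The rank bound (2.31) can be illustrated with $d = 2$ and rank-one perturbations. The setup below (sizes 3 and 4, random factors, perturbations built with `numpy.outer`, and the tolerance `1e-10` in the rank computation) is an assumption of this sketch, not something prescribed by the text.

```python
import numpy as np

rng = np.random.default_rng(4)
m = (3, 4)
X1 = rng.standard_normal((m[0], m[0]))
X2 = rng.standard_normal((m[1], m[1]))

# Each Y_i differs from X_i by a rank-one matrix.
Y1 = X1 + np.outer(rng.standard_normal(m[0]), rng.standard_normal(m[0]))
Y2 = X2 + np.outer(rng.standard_normal(m[1]), rng.standard_normal(m[1]))

def rank(M):
    # Explicit tolerance so that rounding noise is not counted as rank.
    return np.linalg.matrix_rank(M, tol=1e-10)

N = m[0] * m[1]                                    # N(m) = 12
lhs = rank(np.kron(X1, X2) - np.kron(Y1, Y2))      # left-hand side of (2.31)
rhs = N * (rank(X1 - Y1) / m[0] + rank(X2 - Y2) / m[1])   # 12 * (1/3 + 1/4) = 7
```

Here the difference of two full-rank 12 x 12 tensor products has rank at most 7, which is the kind of low-rank statement that (2.31) provides.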
We conclude this section with a few results concerning the commutative properties of tensor products and direct sums. We also discuss the distributive properties of tensor products with respect to direct sums.

Lemma 2.6 For every $\boldsymbol{m} \in \mathbb{N}^d$ and every permutation $\sigma$ of the set $\{1, \ldots, d\}$, there exists a permutation matrix $\Pi_{\boldsymbol{m};\sigma}$ of size $N(\boldsymbol{m})$ such that
\[ X_{\sigma(1)} \otimes X_{\sigma(2)} \otimes \cdots \otimes X_{\sigma(d)} = \Pi_{\boldsymbol{m};\sigma} (X_1 \otimes X_2 \otimes \cdots \otimes X_d) \Pi_{\boldsymbol{m};\sigma}^T \]
for all matrices $X_1 \in \mathbb{C}^{m_1 \times m_1}$, $X_2 \in \mathbb{C}^{m_2 \times m_2}$, ..., $X_d \in \mathbb{C}^{m_d \times m_d}$.

Proof The proof proceeds by induction on $d$. The case $d = 1$ is trivial. For $d = 2$, the result is clear if $\sigma$ is the identity $[1, 2]$, so we only have to prove it for $\sigma = [2, 1]$. In other words, we have to show that for every $\boldsymbol{m} \in \mathbb{N}^2$ there exists a permutation matrix $\Pi_{\boldsymbol{m};[2,1]}$ such that
\[
X_2 \otimes X_1 = \Pi_{\boldsymbol{m};[2,1]} (X_1 \otimes X_2) \Pi_{\boldsymbol{m};[2,1]}^T
\tag{2.32}
\]
for all $X_1 \in \mathbb{C}^{m_1 \times m_1}$ and $X_2 \in \mathbb{C}^{m_2 \times m_2}$. Let $\Pi_{\boldsymbol{m};[2,1]}$ be the permutation matrix associated with the permutation $\zeta$ of $\{1, \ldots, m_1 m_2\}$ given by
\[
\begin{aligned}
\zeta = [\, & 1, \ m_2 + 1, \ 2m_2 + 1, \ \ldots, \ (m_1 - 1)m_2 + 1, \\
& 2, \ m_2 + 2, \ 2m_2 + 2, \ \ldots, \ (m_1 - 1)m_2 + 2, \\
& \ldots, \\
& m_2, \ 2m_2, \ 3m_2, \ \ldots, \ m_1 m_2 \,],
\end{aligned}
\]
i.e.,
\[
\zeta(i) = ((i - 1) \bmod m_1) m_2 + \Bigl\lfloor \frac{i - 1}{m_1} \Bigr\rfloor + 1, \qquad i = 1, \ldots, m_1 m_2.
\]
In other words, $\Pi_{\boldsymbol{m};[2,1]}$ is the matrix whose rows are, in this order, $\mathbf{e}_{\zeta(1)}^T, \ldots, \mathbf{e}_{\zeta(m_1 m_2)}^T$, where $\mathbf{e}_1, \ldots, \mathbf{e}_{m_1 m_2}$ are the vectors of the canonical basis of $\mathbb{C}^{m_1 m_2}$. It can be verified that $\Pi_{\boldsymbol{m};[2,1]}$ satisfies (2.32) for all $X_1 \in \mathbb{C}^{m_1 \times m_1}$ and $X_2 \in \mathbb{C}^{m_2 \times m_2}$. The verification can be done componentwise, by showing that the $(i, j)$ entry of the first matrix in (2.32) is equal to the $(i, j)$ entry of the second matrix for all $i, j = 1, \ldots, m_1 m_2$. This completes the proof of the lemma for $d = 2$.

For $d \ge 3$, we assume that the lemma holds for indices up to $d - 1$ and we prove that it holds for $d$. Let $\boldsymbol{m} \in \mathbb{N}^d$ and let $\sigma$ be a permutation of $\{1, \ldots, d\}$. Let $1 \le i \le d$ be the index such that $\sigma(i) = d$, and let $\tau$ be the permutation of $\{1, \ldots, d - 1\}$ defined by $\tau(j) = \sigma(j)$ for $j = 1, \ldots, i - 1$ and $\tau(j) = \sigma(j + 1)$ for $j = i, \ldots, d - 1$. If $i = d$, then, by induction hypothesis and the properties of tensor products,
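\[
\begin{aligned}
X_{\sigma(1)} \otimes \cdots \otimes X_{\sigma(d)}
&= X_{\tau(1)} \otimes \cdots \otimes X_{\tau(d-1)} \otimes X_d \\
&= \Pi_{(m_1,\ldots,m_{d-1});\tau} (X_1 \otimes \cdots \otimes X_{d-1}) \Pi_{(m_1,\ldots,m_{d-1});\tau}^T \otimes X_d \\
&= (\Pi_{(m_1,\ldots,m_{d-1});\tau} \otimes I_{m_d}) (X_1 \otimes \cdots \otimes X_{d-1} \otimes X_d) (\Pi_{(m_1,\ldots,m_{d-1});\tau}^T \otimes I_{m_d}),
\end{aligned}
\]
and the thesis holds with $\Pi_{\boldsymbol{m};\sigma} = \Pi_{(m_1,\ldots,m_{d-1});\tau} \otimes I_{m_d}$. If $i < d$, then
\[
\begin{aligned}
X_{\sigma(1)} \otimes \cdots \otimes X_{\sigma(d)}
&= X_{\sigma(1)} \otimes \cdots \otimes X_{\sigma(i-1)} \otimes X_d \otimes X_{\sigma(i+1)} \otimes \cdots \otimes X_{\sigma(d)} \\
&= X_{\sigma(1)} \otimes \cdots \otimes X_{\sigma(i-1)} \otimes \Pi_{(m_{\sigma(i+1)} \cdots m_{\sigma(d)},\, m_d);[2,1]} (X_{\sigma(i+1)} \otimes \cdots \otimes X_{\sigma(d)} \otimes X_d) \Pi_{(m_{\sigma(i+1)} \cdots m_{\sigma(d)},\, m_d);[2,1]}^T \\
&= (I_{m_{\sigma(1)} \cdots m_{\sigma(i-1)}} \otimes \Pi_{(m_{\sigma(i+1)} \cdots m_{\sigma(d)},\, m_d);[2,1]}) \\
&\qquad \cdot (X_{\sigma(1)} \otimes \cdots \otimes X_{\sigma(i-1)} \otimes X_{\sigma(i+1)} \otimes \cdots \otimes X_{\sigma(d)} \otimes X_d) \\
&\qquad \cdot (I_{m_{\sigma(1)} \cdots m_{\sigma(i-1)}} \otimes \Pi_{(m_{\sigma(i+1)} \cdots m_{\sigma(d)},\, m_d);[2,1]}^T) \\
&= P_{\boldsymbol{m};\sigma} (X_{\tau(1)} \otimes \cdots \otimes X_{\tau(d-1)} \otimes X_d) P_{\boldsymbol{m};\sigma}^T,
\end{aligned}
\tag{2.33}
\]
where $P_{\boldsymbol{m};\sigma} = I_{m_{\sigma(1)} \cdots m_{\sigma(i-1)}} \otimes \Pi_{(m_{\sigma(i+1)} \cdots m_{\sigma(d)},\, m_d);[2,1]}$. By induction hypothesis,
\[
X_{\tau(1)} \otimes \cdots \otimes X_{\tau(d-1)} = \Pi_{(m_1,\ldots,m_{d-1});\tau} (X_1 \otimes \cdots \otimes X_{d-1}) \Pi_{(m_1,\ldots,m_{d-1});\tau}^T.
\]
Substituting in (2.33) and using the properties of tensor products, we see that the thesis holds with $\Pi_{\boldsymbol{m};\sigma} = P_{\boldsymbol{m};\sigma} (\Pi_{(m_1,\ldots,m_{d-1});\tau} \otimes I_{m_d})$, which is a permutation matrix because it is the product of two permutation matrices.

Lemma 2.7 For every $\boldsymbol{m} \in \mathbb{N}^d$ and every permutation $\sigma$ of the set $\{1, \ldots, d\}$, there exists a permutation matrix $V_{\boldsymbol{m};\sigma}$ of size $m_1 + \ldots + m_d$ such that
\[ X_{\sigma(1)} \oplus X_{\sigma(2)} \oplus \cdots \oplus X_{\sigma(d)} = V_{\boldsymbol{m};\sigma} (X_1 \oplus X_2 \oplus \cdots \oplus X_d) V_{\boldsymbol{m};\sigma}^T \]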
for all matrices $X_1 \in \mathbb{C}^{m_1 \times m_1}$, $X_2 \in \mathbb{C}^{m_2 \times m_2}$, ..., $X_d \in \mathbb{C}^{m_d \times m_d}$.

Proof The proof proceeds by induction on $d$. The case $d = 1$ is trivial. For $d = 2$, the only possible permutations are the identity $\sigma = [1, 2]$ and the transposition $\sigma = [2, 1]$, and we can take
\[
V_{\boldsymbol{m};[1,2]} = I_{m_1 + m_2}, \qquad
V_{\boldsymbol{m};[2,1]} = \begin{bmatrix} O & I_{m_2} \\ I_{m_1} & O \end{bmatrix}.
\]
For $d \ge 3$, we assume that the lemma holds for indices up to $d - 1$ and we prove that it holds for $d$. Let $1 \le i \le d$ be the index such that $\sigma(i) = d$, and let $\tau$ be the permutation of $\{1, \ldots, d - 1\}$ defined by $\tau(j) = \sigma(j)$ for $j = 1, \ldots, i - 1$ and $\tau(j) = \sigma(j + 1)$ for $j = i, \ldots, d - 1$. If $i = d$, then, by induction hypothesis and the properties of direct sums,
\[
\begin{aligned}
X_{\sigma(1)} \oplus \cdots \oplus X_{\sigma(d)}
&= X_{\tau(1)} \oplus \cdots \oplus X_{\tau(d-1)} \oplus X_d \\
&= V_{(m_1,\ldots,m_{d-1});\tau} (X_1 \oplus \cdots \oplus X_{d-1}) V_{(m_1,\ldots,m_{d-1});\tau}^T \oplus X_d \\
&= (V_{(m_1,\ldots,m_{d-1});\tau} \oplus I_{m_d}) (X_1 \oplus \cdots \oplus X_{d-1} \oplus X_d) (V_{(m_1,\ldots,m_{d-1});\tau}^T \oplus I_{m_d})
\end{aligned}
\]
and the thesis holds with $V_{\boldsymbol{m};\sigma} = V_{(m_1,\ldots,m_{d-1});\tau} \oplus I_{m_d}$. If $i < d$, then
\[
\begin{aligned}
X_{\sigma(1)} \oplus \cdots \oplus X_{\sigma(d)}
&= X_{\sigma(1)} \oplus \cdots \oplus X_{\sigma(i-1)} \oplus X_d \oplus X_{\sigma(i+1)} \oplus \cdots \oplus X_{\sigma(d)} \\
&= X_{\sigma(1)} \oplus \cdots \oplus X_{\sigma(i-1)} \oplus V_{(m_{\sigma(i+1)} + \ldots + m_{\sigma(d)},\, m_d);[2,1]} (X_{\sigma(i+1)} \oplus \cdots \oplus X_{\sigma(d)} \oplus X_d) V_{(m_{\sigma(i+1)} + \ldots + m_{\sigma(d)},\, m_d);[2,1]}^T \\
&= (I_{m_{\sigma(1)} + \ldots + m_{\sigma(i-1)}} \oplus V_{(m_{\sigma(i+1)} + \ldots + m_{\sigma(d)},\, m_d);[2,1]}) \\
&\qquad \cdot (X_{\sigma(1)} \oplus \cdots \oplus X_{\sigma(i-1)} \oplus X_{\sigma(i+1)} \oplus \cdots \oplus X_{\sigma(d)} \oplus X_d) \\
&\qquad \cdot (I_{m_{\sigma(1)} + \ldots + m_{\sigma(i-1)}} \oplus V_{(m_{\sigma(i+1)} + \ldots + m_{\sigma(d)},\, m_d);[2,1]}^T) \\
&= U_{\boldsymbol{m};\sigma} (X_{\tau(1)} \oplus \cdots \oplus X_{\tau(d-1)} \oplus X_d) U_{\boldsymbol{m};\sigma}^T,
\end{aligned}
\tag{2.34}
\]
where $U_{\boldsymbol{m};\sigma} = I_{m_{\sigma(1)} + \ldots + m_{\sigma(i-1)}} \oplus V_{(m_{\sigma(i+1)} + \ldots + m_{\sigma(d)},\, m_d);[2,1]}$. By induction hypothesis,
\[
X_{\tau(1)} \oplus \cdots \oplus X_{\tau(d-1)} = V_{(m_1,\ldots,m_{d-1});\tau} (X_1 \oplus \cdots \oplus X_{d-1}) V_{(m_1,\ldots,m_{d-1});\tau}^T.
\]
Substituting in (2.34) and using the properties of direct sums, we see that the thesis holds with $V_{\boldsymbol{m};\sigma} = U_{\boldsymbol{m};\sigma} (V_{(m_1,\ldots,m_{d-1});\tau} \oplus I_{m_d})$, which is a permutation matrix because it is the product of two permutation matrices.

Lemmas 2.6 and 2.7 show that tensor products and direct sums are “almost” commutative. More precisely, Lemma 2.6 says that the tensor product operation is commutative, up to a permutation transformation $\Pi_{\boldsymbol{m};\sigma}$ which depends on $\boldsymbol{m}$ and $\sigma$, but not on the specific matrices $X_1, X_2, \ldots, X_d$. Lemma 2.7 says the same thing for the direct sum operation. Concerning the distributive properties of tensor products with respect to direct sums, it follows directly from the definitions that the distributive law on the right holds without permutation transformations. In other words, for all matrices $X_1, \ldots, X_d, Y$ we have
\[
(X_1 \oplus X_2 \oplus \cdots \oplus X_d) \otimes Y = (X_1 \otimes Y) \oplus (X_2 \otimes Y) \oplus \cdots \oplus (X_d \otimes Y).
\tag{2.35}
\]
As for the distributive law on the left, a result analogous to Lemmas 2.6 and 2.7 holds, showing that this property holds modulo permutation transformations which only depend on the dimensions of the involved matrices. More precisely, for every $\ell \in \mathbb{N}$ and $\boldsymbol{m} \in \mathbb{N}^d$, there exists a permutation matrix $Q_{\ell,\boldsymbol{m}}$ of size $\ell(m_1 + \ldots + m_d)$ such that
\[
X \otimes (Y_1 \oplus Y_2 \oplus \cdots \oplus Y_d) = Q_{\ell,\boldsymbol{m}} \bigl[ (X \otimes Y_1) \oplus (X \otimes Y_2) \oplus \cdots \oplus (X \otimes Y_d) \bigr] Q_{\ell,\boldsymbol{m}}^T
\tag{2.36}
\]
for all matrices $X \in \mathbb{C}^{\ell \times \ell}$, $Y_1 \in \mathbb{C}^{m_1 \times m_1}$, $Y_2 \in \mathbb{C}^{m_2 \times m_2}$, ..., $Y_d \in \mathbb{C}^{m_d \times m_d}$. In this book, however, we will only need the distributive law on the right (2.35), and so we do not provide the proof of the distributive law on the left (2.36); the interested reader is referred to [24, Lemma 2].
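The case $d = 2$ of Lemma 2.6 and the right distributive law (2.35) lend themselves to a direct numerical check. The sketch below builds the permutation matrix from the formula for $\zeta$ given in the proof of Lemma 2.6; the function names `swap_permutation` and `direct_sum` and the sizes $m_1 = 3$, $m_2 = 4$ are hypothetical choices made for illustration.

```python
import numpy as np

def swap_permutation(m1, m2):
    """Permutation matrix whose i-th row is e_{zeta(i)}^T, with
    zeta(i) = ((i - 1) mod m1) * m2 + floor((i - 1) / m1) + 1."""
    size = m1 * m2
    P = np.zeros((size, size))
    for i in range(1, size + 1):              # 1-based index, as in the text
        zeta_i = ((i - 1) % m1) * m2 + (i - 1) // m1 + 1
        P[i - 1, zeta_i - 1] = 1.0
    return P

def direct_sum(A, B):
    """Return diag(A, B), i.e., the direct sum of the matrices A and B."""
    return np.block([[A, np.zeros((A.shape[0], B.shape[1]))],
                     [np.zeros((B.shape[0], A.shape[1])), B]])

rng = np.random.default_rng(5)
m1, m2 = 3, 4
X1 = rng.standard_normal((m1, m1))
X2 = rng.standard_normal((m2, m2))
Y = rng.standard_normal((2, 2))
P = swap_permutation(m1, m2)

# (2.32): swapping the factors amounts to a permutation similarity.
swapped = np.kron(X2, X1)
permuted = P @ np.kron(X1, X2) @ P.T

# (2.35): the right distributive law needs no permutation.
left = np.kron(direct_sum(X1, X2), Y)
right = direct_sum(np.kron(X1, Y), np.kron(X2, Y))
```

Note that `P` depends only on the sizes $m_1$ and $m_2$, not on the entries of `X1` and `X2`, which is exactly the point made after Lemma 2.7.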
2.6 Singular Value and Eigenvalue Distribution of a Sequence of Matrices

We recall that, throughout this book, a sequence of matrices is a sequence of the form $\{A_n\}_n$, where $n$ varies in some infinite subset of $\mathbb{N}$ and $A_n$ is a square matrix of size $d_n \to \infty$. In [22, Chap. 3] we studied the notion of (asymptotic) singular value and eigenvalue distribution, as well as other related concepts such as clustering and attraction, for sequences of matrices $\{A_n\}_n$ such that $d_n = n$. We also studied the so-called zero-distributed sequences $\{Z_n\}_n$ in the case where the size of the $n$th matrix $Z_n$ is $d_n = n$. In [22, Chap. 4] we investigated the spectral distribution of sequences of perturbed Hermitian matrices, i.e., sequences of the form $\{A_n = X_n + Y_n\}_n$, where $X_n$ is Hermitian and $Y_n$ is a small perturbation of $X_n$; once again, the size of $X_n$ and $Y_n$ was supposed to be $d_n = n$. The crucial observation is that the assumption $d_n = n$ was made only for reasons of notational simplicity and elegance. Indeed, apart from obvious adaptations, nothing changes in [22, Chaps. 3 and 4] if the assumption “$d_n = n$” is replaced by “$d_n \to \infty$ as $n \to \infty$”. In this section, for the reader's convenience, we rewrite definitions, results, etc., from [22, Chaps. 3 and 4] for general sequences of matrices, i.e., without the constraint $d_n = n$. Of course, all the proofs are omitted because, up to obvious adaptations, they are the same as in [22]; it is therefore not instructive to reproduce them here.
2.6.1 The Notion of Singular Value and Eigenvalue Distribution

Let $\mathbb{K}$ be either $\mathbb{R}$ or $\mathbb{C}$ and let $g : D \subset \mathbb{R}^k \to \mathbb{K}$ be a measurable function defined on a set $D$ with $0 < \mu_k(D) < \infty$. To any such $g$ we associate the functional $\phi_g$ defined by Eq. (2.2). This functional will play an important role in what follows.

Definition 2.1 (singular value and eigenvalue distribution of a sequence of matrices) Let $\{A_n\}_n$ be a sequence of matrices, with $A_n$ of size $d_n$.
• We say that $\{A_n\}_n$ has an asymptotic singular value distribution described by a functional $\phi : C_c(\mathbb{R}) \to \mathbb{C}$, and we write $\{A_n\}_n \sim_\sigma \phi$, if
\[
\lim_{n \to \infty} \frac{1}{d_n} \sum_{j=1}^{d_n} F(\sigma_j(A_n)) = \phi(F), \qquad \forall F \in C_c(\mathbb{R}).
\tag{2.37}
\]
If $\phi = \phi_{|f|}$ for some measurable $f : D \subset \mathbb{R}^k \to \mathbb{C}$ defined on a set $D$ with $0 < \mu_k(D) < \infty$, we say that $\{A_n\}_n$ has an asymptotic singular value distribution described by $f$ and we write $\{A_n\}_n \sim_\sigma f$. In this case, the function $f$ is referred to as the singular value symbol of $\{A_n\}_n$.
• We say that $\{A_n\}_n$ has an asymptotic eigenvalue (or spectral) distribution described by a functional $\phi : C_c(\mathbb{C}) \to \mathbb{C}$, and we write $\{A_n\}_n \sim_\lambda \phi$, if
\[
\lim_{n \to \infty} \frac{1}{d_n} \sum_{j=1}^{d_n} F(\lambda_j(A_n)) = \phi(F), \qquad \forall F \in C_c(\mathbb{C}).
\tag{2.38}
\]
If $\phi = \phi_f$ for some measurable $f : D \subset \mathbb{R}^k \to \mathbb{C}$ defined on a set $D$ with $0 < \mu_k(D) < \infty$, we say that $\{A_n\}_n$ has an asymptotic eigenvalue (or spectral) distribution described by $f$ and we write $\{A_n\}_n \sim_\lambda f$. In this case, the function $f$ is referred to as the eigenvalue (or spectral) symbol of $\{A_n\}_n$.

The adjective “asymptotic” is often omitted, i.e., one often simply says that $\{A_n\}_n$ has a singular value or eigenvalue/spectral distribution (described by something). When we write a relation such as $\{A_n\}_n \sim_\sigma \phi$ (resp., $\{A_n\}_n \sim_\lambda \phi$), it is understood that $\phi$ is a functional on $C_c(\mathbb{R})$ (resp., $C_c(\mathbb{C})$), as in Definition 2.1. Similarly, when we write a relation such as $\{A_n\}_n \sim_\sigma f$ or $\{A_n\}_n \sim_\lambda f$, it is understood that $f$ is as in Definition 2.1; that is, $f$ is a measurable function defined on a subset $D$ of some $\mathbb{R}^k$ with $0 < \mu_k(D) < \infty$. Sometimes, for brevity, we will write $\{A_n\}_n \sim_{\sigma,\lambda} f$ to indicate that $\{A_n\}_n \sim_\sigma f$ and $\{A_n\}_n \sim_\lambda f$.

Remark 2.1 (informal meaning of the singular value and eigenvalue distribution) By definition of $\phi_f$, the spectral distribution $\{A_n\}_n \sim_\lambda f$ means that
\[
\lim_{n \to \infty} \frac{1}{d_n} \sum_{j=1}^{d_n} F(\lambda_j(A_n)) = \frac{1}{\mu_k(D)} \int_D F(f(\mathbf{x})) \, \mathrm{d}\mathbf{x}, \qquad \forall F \in C_c(\mathbb{C}).
\tag{2.39}
\]
Intuitively, if the function $f$ were continuous a.e. and the eigenvalues of $A_n$ were exact samples of $f$ over an equispaced grid in $D$, then (2.39) would be satisfied as it would simply say that a special Riemann sum of a Riemann-integrable function converges to the corresponding integral. The informal meaning behind (2.39) is in fact the following. Assuming that $f$ is continuous a.e., if $n$ is large enough and $\{\mathbf{x}_{j,n} : j = 1, \ldots, d_n\}$ is an equispaced grid in $D$, then a suitable ordering of the eigenvalues of $A_n$ is such that the pairs $\{(\mathbf{x}_{j,n}, \lambda_j(A_n)) : j = 1, \ldots, d_n\}$ reconstruct approximately the hypersurface $\{(\mathbf{x}, f(\mathbf{x})) : \mathbf{x} \in D\}$. In other words, the eigenvalues of $A_n$, except possibly for $o(d_n)$ outliers, are approximately equal to the samples of $f$ over a uniform grid in $D$ (for $n$ large enough). For instance, if $k = 1$ and $D = [a, b]$, then, assuming we have no outliers, the eigenvalues of $A_n$ are approximately equal to
\[ f\Bigl( a + i \, \frac{b - a}{d_n} \Bigr), \qquad i = 1, \ldots, d_n, \]
26
2 Mathematical Background
for n large enough. Similarly, if k = 2, dn is a perfect square and D = [a1 , b1 ] × [a2 , b2 ], then, assuming we have no outliers, the eigenvalues of An are approximately equal to b1 − a1 b2 − a2 , a2 + i 2 √ , f a1 + i 1 √ dn dn
# i 1 , i 2 = 1, . . . , dn ,
for n large enough. A completely analogous meaning can also be given for the singular value distribution {An }n ∼σ f , which is equivalent to dn 1 1 F(σ j (An )) = F(| f (x)|)dx, n→∞ dn μk (D) D j=1 lim
∀ F ∈ Cc (R).
(2.40)
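The informal reading of (2.39) can be checked numerically on a matrix whose eigenvalues are known in closed form. The sketch below is only an illustration under stated assumptions: it uses the $n \times n$ tridiagonal matrix with 2 on the main diagonal and $-1$ on the two adjacent diagonals, whose eigenvalues are $2 - 2\cos(j\pi/(n+1))$, i.e., exact samples of $f(\theta) = 2 - 2\cos\theta$ on a uniform grid in $D = [0, \pi]$. The test function `hat` and all function names are hypothetical choices made for this example.

```python
import math

def symbol(theta):
    """f(theta) = 2 - 2 cos(theta), regarded as a function on D = [0, pi]."""
    return 2.0 - 2.0 * math.cos(theta)

def eigenvalues_tridiag(n):
    """Closed-form eigenvalues of the n x n tridiagonal matrix (-1, 2, -1)."""
    return [symbol(j * math.pi / (n + 1)) for j in range(1, n + 1)]

def tridiag_residual(n, j):
    """Max-norm of T v - lambda v for the j-th closed-form eigenpair (sanity check)."""
    x = j * math.pi / (n + 1)
    lam = symbol(x)
    v = [math.sin(i * x) for i in range(1, n + 1)]
    res = 0.0
    for i in range(n):
        left = v[i - 1] if i > 0 else 0.0
        right = v[i + 1] if i < n - 1 else 0.0
        res = max(res, abs(2.0 * v[i] - left - right - lam * v[i]))
    return res

def hat(x):
    """Continuous test function with compact support [0, 2] (restricted to real arguments)."""
    return max(0.0, 1.0 - abs(x - 1.0))

def spectral_average(n, F):
    """Left-hand side of (2.39) before taking the limit: (1/d_n) * sum_j F(lambda_j)."""
    return sum(F(lam) for lam in eigenvalues_tridiag(n)) / n

def symbol_average(F, points=20000):
    """Midpoint-rule approximation of (1/mu(D)) * integral over D of F(f(x)) dx."""
    h = math.pi / points
    return sum(F(symbol((i + 0.5) * h)) for i in range(points)) / points
```

For increasing `n`, `spectral_average(n, hat)` should approach `symbol_average(hat)`, which is what (2.39) predicts for this particular test function; a single test function does not, of course, prove the distribution relation.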
Remark 2.2 It is clear that $\{A_n\}_n \sim_\sigma f$ is equivalent to $\{A_n\}_n \sim_\sigma |f|$. Moreover, if every $A_n$ is normal, then $\{A_n\}_n \sim_\lambda f$ implies $\{A_n\}_n \sim_\sigma f$. Indeed, the singular values of a normal matrix coincide with the moduli of the eigenvalues [22, p. 30]. Therefore, for any fixed $F \in C_c(\mathbb{R})$, by applying the eigenvalue distribution (2.39) with the test function $F(|\cdot|) \in C_c(\mathbb{C})$, we get
\[
\lim_{n\to\infty} \frac{1}{d_n} \sum_{j=1}^{d_n} F(\sigma_j(A_n)) = \lim_{n\to\infty} \frac{1}{d_n} \sum_{j=1}^{d_n} F(|\lambda_j(A_n)|) = \frac{1}{\mu_k(D)} \int_D F(|f(\mathbf{x})|)\,d\mathbf{x}.
\]
2.6.2 Clustering and Attraction

This subsection introduces the notions of clustering and attraction for general sequences of matrices. Throughout the book, if $z \in \mathbb{C}$ and $\varepsilon > 0$, we denote by $D(z, \varepsilon)$ the disk with center $z$ and radius $\varepsilon$, i.e., $D(z, \varepsilon) = \{w \in \mathbb{C} : |w - z| < \varepsilon\}$. If $S \subseteq \mathbb{C}$ and $\varepsilon > 0$, we denote by $D(S, \varepsilon)$ the $\varepsilon$-expansion of $S$, which is defined as $D(S, \varepsilon) = \bigcup_{z \in S} D(z, \varepsilon)$.

Definition 2.2 (clustering of a sequence of matrices) Let $\{A_n\}_n$ be a sequence of matrices, with $A_n$ of size $d_n$, and let $S \subseteq \mathbb{C}$ be a nonempty subset of $\mathbb{C}$.

• We say that $\{A_n\}_n$ is strongly clustered at $S$ (in the sense of the eigenvalues), or equivalently that the eigenvalues of $\{A_n\}_n$ are strongly clustered at $S$, if, for every $\varepsilon > 0$, the number of eigenvalues of $A_n$ lying outside $D(S, \varepsilon)$ is bounded by a constant $C_\varepsilon$ independent of $n$. In other words, for every $\varepsilon > 0$ we have
\[
\#\{j \in \{1, \ldots, d_n\} : \lambda_j(A_n) \notin D(S, \varepsilon)\} = O(1). \tag{2.41}
\]

• We say that $\{A_n\}_n$ is weakly clustered at $S$ (in the sense of the eigenvalues), or equivalently that the eigenvalues of $\{A_n\}_n$ are weakly clustered at $S$, if, for every $\varepsilon > 0$,
\[
\#\{j \in \{1, \ldots, d_n\} : \lambda_j(A_n) \notin D(S, \varepsilon)\} = o(d_n). \tag{2.42}
\]
By replacing “eigenvalues” with “singular values” and $\lambda_j(A_n)$ with $\sigma_j(A_n)$ in (2.41)–(2.42), we obtain the definitions of a sequence of matrices strongly or weakly clustered at a nonempty subset of $\mathbb{C}$ in the sense of the singular values. Throughout this book, when we speak of strong/weak cluster, sequence of matrices strongly/weakly clustered, etc., without further specifications, it is understood “in the sense of the eigenvalues”. When the clustering is intended in the sense of the singular values, this is specified every time.

Definition 2.3 (spectral attraction) Let $\{A_n\}_n$ be a sequence of matrices, with $A_n$ of size $d_n$, and let $z \in \mathbb{C}$. We say that $z$ strongly attracts the spectrum $\Lambda(A_n)$ with infinite order if, once we have ordered the eigenvalues of $A_n$ according to their distance from $z$,
\[
|\lambda_1(A_n) - z| \le |\lambda_2(A_n) - z| \le \ldots \le |\lambda_{d_n}(A_n) - z|,
\]
the following limit relation holds for each fixed $j \ge 1$:
\[
\lim_{n\to\infty} |\lambda_j(A_n) - z| = 0.
\]
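The difference between (2.41) and (2.42) can be made concrete by counting outliers for two diagonal matrix sequences, whose eigenvalues are simply the diagonal entries. The following is a minimal sketch; the two example sequences and all function names are hypothetical and chosen only for illustration, with $S = \{0\}$ and $d_n = n$.

```python
import math

def outliers(eigenvalues, cluster_points, eps):
    """Number of eigenvalues lying outside D(S, eps) for a finite set S of complex points."""
    return sum(
        1 for lam in eigenvalues
        if all(abs(lam - z) >= eps for z in cluster_points)
    )

def harmonic_diag(n):
    """Eigenvalues of diag(1, 1/2, ..., 1/n)."""
    return [1.0 / j for j in range(1, n + 1)]

def sqrt_block_diag(n):
    """Eigenvalues of a diagonal matrix with floor(sqrt(n)) ones followed by zeros."""
    ones = math.isqrt(n)
    return [1.0] * ones + [0.0] * (n - ones)
```

For `harmonic_diag`, the number of outliers for a fixed `eps` stays bounded as `n` grows, which is the behavior required by (2.41). For `sqrt_block_diag`, the number of outliers grows like $\sqrt{n}$: it is unbounded, but still $o(d_n)$, so only (2.42) holds.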
The next theorem and its corollary are the versions of [22, Theorem 3.1 and Corollary 3.1] for general sequences of matrices.

Theorem 2.1 If $\{A_n\}_n \sim_\lambda f$, then $\{A_n\}_n$ is weakly clustered at the essential range $\mathcal{ER}(f)$ and every point of $\mathcal{ER}(f)$ strongly attracts the spectrum $\Lambda(A_n)$ with infinite order.

Corollary 2.2 If $\{A_n\}_n \sim_\lambda f$ and $\Lambda(A_n)$ is contained in $S \subseteq \mathbb{C}$ for all $n$, then $\mathcal{ER}(f)$ is contained in the closure $\overline{S}$. In particular, if $\{A_n\}_n \sim_\lambda f$ and the matrices $A_n$ are Hermitian (resp., HPSD), then $\mathcal{ER}(f) \subseteq \mathbb{R}$ (resp., $\mathcal{ER}(f) \subseteq [0, \infty)$).
2.6.3 Zero-Distributed Sequences

A sequence of matrices $\{Z_n\}_n$ is said to be a zero-distributed sequence if we have $\{Z_n\}_n \sim_\sigma 0$, where 0 is the identically zero function (defined on some unspecified measurable subset $D$ of some $\mathbb{R}^k$ with $0 < \mu_k(D) < \infty$). In other words, $\{Z_n\}_n$ is zero-distributed if and only if
\[
\lim_{n\to\infty} \frac{1}{d_n} \sum_{j=1}^{d_n} F(\sigma_j(Z_n)) = F(0), \qquad \forall\, F \in C_c(\mathbb{R}),
\]
where, of course, $d_n$ is the size of $Z_n$. Theorems 2.2 and 2.3 are the versions of [22, Theorems 3.2 and 3.3] for general sequences of matrices. In the statement of Theorem 2.3 and throughout the book, we use the natural convention $1/\infty = 0$.

Theorem 2.2 Let $\{Z_n\}_n$ be a sequence of matrices, with $Z_n$ of size $d_n$. The following conditions are equivalent.
1. $\{Z_n\}_n \sim_\sigma 0$.
2. For every $\varepsilon > 0$, $\displaystyle \lim_{n\to\infty} \frac{\#\{j \in \{1, \ldots, d_n\} : \sigma_j(Z_n) > \varepsilon\}}{d_n} = 0$.
3. For every $n$ we have $Z_n = R_n + N_n$, where $\displaystyle \lim_{n\to\infty} \frac{\mathrm{rank}(R_n)}{d_n} = \lim_{n\to\infty} \|N_n\| = 0$.
With the terminology of clustering introduced in Sect. 2.6.2, condition 2 in Theorem 2.2 can be reformulated by saying that $\{Z_n\}_n$ is weakly clustered at $\{0\}$ in the sense of the singular values.

Theorem 2.3 Let $\{Z_n\}_n$ be a sequence of matrices, with $Z_n$ of size $d_n$, and suppose that $\lim_{n\to\infty} \|Z_n\|_p / (d_n)^{1/p} = 0$ for some $p \in [1, \infty]$. Then $\{Z_n\}_n \sim_\sigma 0$.
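Condition 2 of Theorem 2.2 and the hypothesis of Theorem 2.3 are easy to evaluate when the singular values are known explicitly. The sketch below uses the hypothetical example $Z_n = \mathrm{diag}(n, 1/n, \ldots, 1/n)$ with $d_n = n$, and assumes that $\|\cdot\|_p$ denotes the Schatten $p$-norm, i.e., the $p$-norm of the vector of singular values; the function names are illustrative only.

```python
import math

def singular_values_example(n):
    """Singular values of Z_n = diag(n, 1/n, ..., 1/n): one large value, the rest small."""
    return [float(n)] + [1.0 / n] * (n - 1)

def fraction_above(singular_values, eps):
    """#{j : sigma_j > eps} / d_n, the quantity appearing in condition 2 of Theorem 2.2."""
    return sum(1 for s in singular_values if s > eps) / len(singular_values)

def schatten_norm(singular_values, p):
    """Schatten p-norm from the singular values (p = math.inf gives the spectral norm)."""
    if p == math.inf:
        return max(singular_values)
    return sum(s ** p for s in singular_values) ** (1.0 / p)
```

In this example `fraction_above` tends to 0 for every fixed `eps`, so the sequence is zero-distributed by Theorem 2.2, even though `schatten_norm(sv, 1) / n` stays above 1 and the spectral norm grows like `n`. This is consistent with Theorem 2.3 being a sufficient condition only.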
Remark 2.3 (algebra of zero-distributed sequences) It follows from the equivalence 1 ⇐⇒ 3 in Theorem 2.2 that, for each fixed sequence of positive integers $d_n$ such that $d_n \to \infty$ as $n \to \infty$, the set of zero-distributed sequences
\[
\mathcal{Z} = \bigl\{ \{Z_n\}_n : Z_n \in \mathbb{C}^{d_n \times d_n},\ \{Z_n\}_n \sim_\sigma 0 \bigr\} \tag{2.43}
\]
is a *-algebra over the complex field $\mathbb{C}$ with respect to the natural operations of conjugate transposition, addition, scalar-multiplication and product:
\[
\{A_n\}_n^* = \{A_n^*\}_n, \quad \{A_n\}_n + \{B_n\}_n = \{A_n + B_n\}_n, \quad \alpha\{A_n\}_n = \{\alpha A_n\}_n, \quad \{A_n\}_n \{B_n\}_n = \{A_n B_n\}_n. \tag{2.44}
\]
2.6.4 Sparsely Unbounded and Sparsely Vanishing Sequences of Matrices

We introduce in this subsection the notions of sparsely unbounded and sparsely vanishing sequences of matrices, along with a number of related results, which are the versions of [22, Propositions 5.3 and 5.4, Remark 8.6 and Proposition 8.4] for general sequences of matrices.

Definition 2.4 (sparsely unbounded sequence of matrices) A sequence of matrices $\{A_n\}_n$, with $A_n$ of size $d_n$, is said to be sparsely unbounded (s.u.) if for every $M > 0$ there exists $n_M$ such that, for $n \ge n_M$,
\[
\frac{\#\{i \in \{1, \ldots, d_n\} : \sigma_i(A_n) > M\}}{d_n} \le r(M),
\]
where $\lim_{M\to\infty} r(M) = 0$.

Definition 2.5 (sparsely vanishing sequence of matrices) A sequence of matrices $\{A_n\}_n$, with $A_n$ of size $d_n$, is said to be sparsely vanishing (s.v.) if for every $M > 0$ there exists $n_M$ such that, for $n \ge n_M$,
\[
\frac{\#\{i \in \{1, \ldots, d_n\} : \sigma_i(A_n) < 1/M\}}{d_n} \le r(M),
\]
where $\lim_{M\to\infty} r(M) = 0$.

It is clear that if $\{A_n\}_n$ is s.v. then $\{A_n^\dagger\}_n$ is s.u.; it suffices to recall that the singular values of $A^\dagger$ are $1/\sigma_1(A), \ldots, 1/\sigma_r(A), 0, \ldots, 0$, where $\sigma_1(A), \ldots, \sigma_r(A)$ are the nonzero singular values of $A$ ($r = \mathrm{rank}(A)$).

Proposition 2.1 Let $\{A_n\}_n$ be a sequence of matrices, with $A_n$ of size $d_n$. The following conditions are equivalent.
1. $\{A_n\}_n$ is s.u.
2. $\displaystyle \lim_{M\to\infty} \limsup_{n\to\infty} \frac{\#\{i \in \{1, \ldots, d_n\} : \sigma_i(A_n) > M\}}{d_n} = 0$.
3. For every $M > 0$ there exists $n_M$ such that, for $n \ge n_M$,
\[
A_n = \hat{A}_{n,M} + \tilde{A}_{n,M}, \qquad \mathrm{rank}(\hat{A}_{n,M}) \le r(M)\,d_n, \qquad \|\tilde{A}_{n,M}\| \le M,
\]
where $\lim_{M\to\infty} r(M) = 0$.

Note that condition 2 can be rewritten as
\[
\lim_{M\to\infty} \limsup_{n\to\infty} \frac{1}{d_n} \sum_{i=1}^{d_n} \chi_{(M,\infty)}(\sigma_i(A_n)) = 0.
\]
Proposition 2.2 If $\{A_n\}_n \sim_\sigma f$ then $\{A_n\}_n$ is s.u.

Proposition 2.3 Let $\{A_n\}_n$ be a sequence of matrices, with $A_n$ of size $d_n$. The following conditions are equivalent.
1. $\{A_n\}_n$ is s.v.
2. $\displaystyle \lim_{M\to\infty} \limsup_{n\to\infty} \frac{\#\{i \in \{1, \ldots, d_n\} : \sigma_i(A_n) < 1/M\}}{d_n} = 0$.

Note that condition 2 can be rewritten as
\[
\lim_{M\to\infty} \limsup_{n\to\infty} \frac{1}{d_n} \sum_{i=1}^{d_n} \chi_{[0,1/M)}(\sigma_i(A_n)) = 0.
\]
Proposition 2.4 If $\{A_n\}_n \sim_\sigma f$ then $\{A_n\}_n$ is s.v. if and only if $f \ne 0$ a.e.
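The quantities in Definitions 2.4 and 2.5 can be computed directly for a sequence with explicit singular values. The sketch below uses the hypothetical example $A_n = \mathrm{diag}(n/j)_{j=1}^{n}$ with $d_n = n$: its largest singular value equals $n$, so the sequence is unbounded in norm, yet only a fraction of roughly $1/M$ of the singular values exceeds $M$. The function names are illustrative.

```python
def example_singular_values(n):
    """Singular values of A_n = diag(n/1, n/2, ..., n/n); they all lie in [1, n]."""
    return [n / j for j in range(1, n + 1)]

def fraction_large(singular_values, M):
    """#{i : sigma_i > M} / d_n, the ratio bounded by r(M) in Definition 2.4."""
    return sum(1 for s in singular_values if s > M) / len(singular_values)

def fraction_small(singular_values, M):
    """#{i : sigma_i < 1/M} / d_n, the ratio bounded by r(M) in Definition 2.5."""
    return sum(1 for s in singular_values if s < 1.0 / M) / len(singular_values)
```

Here one may take $r(M) = 1/M$ in Definition 2.4, so the example is s.u.; it is also s.v., since no singular value is smaller than 1.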
Remark 2.4 Let $\{A_n\}_n$ be an s.u. sequence of Hermitian matrices, with $A_n$ of size $d_n$. Then, the following stronger version of condition 3 in Proposition 2.1 is satisfied: for every $M > 0$ there exists $n_M$ such that, for $n \ge n_M$,
\[
A_n = \hat{A}_{n,M} + \tilde{A}_{n,M}, \qquad \mathrm{rank}(\hat{A}_{n,M}) \le r(M)\,d_n, \qquad \|\tilde{A}_{n,M}\| \le M,
\]
where $\lim_{M\to\infty} r(M) = 0$, the matrices $\hat{A}_{n,M}$ and $\tilde{A}_{n,M}$ are Hermitian, and for all functions $g : \mathbb{R} \to \mathbb{R}$ we have $g(\hat{A}_{n,M} + \tilde{A}_{n,M}) = g(\hat{A}_{n,M}) + g(\tilde{A}_{n,M})$. This stronger version of condition 3 has been proved in [22, p. 157 (lines 21–34) and p. 158 (lines 1–8)] for the case “$d_n = n$”, but the extension to the case “$d_n \to \infty$ as $n \to \infty$” is obvious.
2.6.5 Spectral Distribution of Sequences of Perturbed Hermitian Matrices

The spectral distribution of sequences of perturbed Hermitian matrices has been investigated in [22, Chap. 4]. Here, we report the version of the main result obtained therein [22, Theorem 4.3] for general sequences of matrices.

Theorem 2.4 Let $\{X_n\}_n$, $\{Y_n\}_n$ be sequences of matrices, with $X_n$, $Y_n$ of size $d_n$, and set $A_n = X_n + Y_n$. Assume that the following conditions are met.
1. $\|X_n\|, \|Y_n\| \le C$ for all $n$, where $C$ is a constant independent of $n$.
2. Every $X_n$ is Hermitian and $\{X_n\}_n \sim_\lambda \phi$.
3. $\|Y_n\|_1 = o(d_n)$.
Then $\{A_n\}_n \sim_\lambda \phi$.

Corollary 2.3 Let $\{X_n\}_n$, $\{Y_n\}_n$ be sequences of matrices, with $X_n$, $Y_n$ of size $d_n$, and set $A_n = X_n + Y_n$. Assume that the following conditions are met.
1. $\|X_n\|, \|Y_n\| \le C$ for all $n$, where $C$ is a constant independent of $n$.
2. Every $X_n$ is Hermitian and $\{X_n\}_n \sim_\lambda f$.
3. $\|Y_n\|_1 = o(d_n)$.
Then $\{A_n\}_n \sim_\lambda f$.

Proof Recall from Definition 2.1 that $\{A_n\}_n \sim_\lambda f$ means $\{A_n\}_n \sim_\lambda \phi_f$ and apply Theorem 2.4 with $\phi = \phi_f$.
2.7 Approximating Classes of Sequences In [22, Chap. 5] we studied the theory of approximating classes of sequences (a.c.s.). For notational simplicity and elegance, the theory presented therein refers to sequences of matrices { An }n such that the size of An is dn = n. However, apart from obvious adaptations, nothing changes in [22, Chap. 5] if the assumption “dn = n” is replaced by “dn → ∞ as n → ∞”. In this section, for the reader’s convenience, we rewrite definitions, results, etc., from [22, Chap. 5] for general sequences of matrices, i.e., without the constraint dn = n. Of course, all the proofs are omitted because, up to obvious adaptations, they are the same as in [22]; it is therefore not instructive to reproduce them here. We also avoid repeating again the motivations behind the notion of a.c.s., which can be found in [22, p. 65]. Throughout the book, we use the abbreviation “a.c.s.” for both the singular “approximating class of sequences” and the plural “approximating classes of sequences”; it will be clear from the context whether “a.c.s.” is singular or plural.
2.7.1 Definition of a.c.s. and a.c.s. Topology

Here is the formal definition of a.c.s., as it appeared in the original paper [33].

Definition 2.6 (approximating class of sequences) Let $\{A_n\}_n$ be a sequence of matrices, with $A_n$ of size $d_n$, and let $\{\{B_{n,m}\}_n\}_m$ be a sequence of sequences of matrices, with $B_{n,m}$ of size $d_n$. We say that $\{\{B_{n,m}\}_n\}_m$ is an approximating class of sequences (a.c.s.) for $\{A_n\}_n$ if the following condition is met: for every $m$ there exists $n_m$ such that, for $n \ge n_m$,
\[
A_n = B_{n,m} + R_{n,m} + N_{n,m}, \qquad \mathrm{rank}(R_{n,m}) \le c(m)\,d_n, \qquad \|N_{n,m}\| \le \omega(m), \tag{2.45}
\]
where the quantities $n_m$, $c(m)$, $\omega(m)$ depend only on $m$, and
\[
\lim_{m\to\infty} c(m) = \lim_{m\to\infty} \omega(m) = 0.
\]
Roughly speaking, $\{\{B_{n,m}\}_n\}_m$ is an a.c.s. for $\{A_n\}_n$ if, for all sufficiently large $m$, the sequence $\{B_{n,m}\}_n$ approximates (asymptotically) the sequence $\{A_n\}_n$, in the sense that $A_n$ is eventually equal to $B_{n,m}$ plus a small-rank matrix (with respect to the matrix size $d_n$) plus a small-norm matrix. Note that an equivalent definition of a.c.s. is obtained by replacing, in Definition 2.6, “for every $m$” with “for every sufficiently large $m$”. Indeed, suppose that the splitting (2.45) holds for $m \ge M$. For $m < M$, define $n_m = 1$, $c(m) = 1$, $\omega(m) = 0$ and $R_{n,m} = A_n - B_{n,m}$, $N_{n,m} = O_{d_n}$. Then, we see that (2.45) actually holds for every $m$.

It turns out that, for each fixed sequence of positive integers $d_n$ such that $d_n \to \infty$, the notion of a.c.s. is a notion of convergence in the space of all sequences of matrices corresponding to $\{d_n\}_n$, i.e.,
\[
\mathcal{E} = \bigl\{ \{A_n\}_n : A_n \in \mathbb{C}^{d_n \times d_n} \text{ for every } n \bigr\}. \tag{2.46}
\]
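To be precise, for every square matrix $A \in \mathbb{C}^{\ell \times \ell}$, let
\[
p(A) = \inf\Bigl\{ \frac{\mathrm{rank}(R)}{\ell} + \|N\| : R, N \in \mathbb{C}^{\ell \times \ell},\ R + N = A \Bigr\} = \min_{i = 0, \ldots, \ell} \Bigl\{ \frac{i}{\ell} + \sigma_{i+1}(A) \Bigr\}, \tag{2.47}
\]
where $\sigma_1(A) \ge \ldots \ge \sigma_\ell(A)$ and $\sigma_{\ell+1}(A) = 0$ by convention; note that Eq. (2.47) is proved in [22, Theorem 5.3]. Define
\[
p_{\mathrm{a.c.s.}}(\{A_n\}_n) = \limsup_{n\to\infty} p(A_n), \qquad \{A_n\}_n \in \mathcal{E}, \tag{2.48}
\]
\[
d_{\mathrm{a.c.s.}}(\{A_n\}_n, \{B_n\}_n) = p_{\mathrm{a.c.s.}}(\{A_n - B_n\}_n), \qquad \{A_n\}_n, \{B_n\}_n \in \mathcal{E}. \tag{2.49}
\]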
Then, the following theorem holds [22, Sect. 5.2.1].

Theorem 2.5 Fix a sequence of positive integers $d_n$ such that $d_n \to \infty$, and let $\mathcal{E}$ be the space (2.46). The following properties hold.
1. $d_{\mathrm{a.c.s.}}$ in (2.49) is a pseudometric on $\mathcal{E}$ such that $d_{\mathrm{a.c.s.}}(\{A_n\}_n, \{B_n\}_n) = 0$ if and only if $\{A_n - B_n\}_n$ is zero-distributed.
2. Suppose $\{A_n\}_n \in \mathcal{E}$ and $\{\{B_{n,m}\}_n\}_m \subset \mathcal{E}$. Then, $\{\{B_{n,m}\}_n\}_m$ is an a.c.s. for $\{A_n\}_n$ if and only if $d_{\mathrm{a.c.s.}}(\{A_n\}_n, \{B_{n,m}\}_n) \to 0$ as $m \to \infty$.

Theorem 2.5 justifies the convergence notation $\{B_{n,m}\}_n \xrightarrow{\text{a.c.s.}} \{A_n\}_n$, which will be used to indicate that $\{\{B_{n,m}\}_n\}_m$ is an a.c.s. for $\{A_n\}_n$. The so-called a.c.s. topology $\tau_{\mathrm{a.c.s.}}$ is defined as the topology induced on $\mathcal{E}$ by the pseudometric $d_{\mathrm{a.c.s.}}$. As explained in [22, Sect. 5.2.3], the a.c.s. topology is strongly connected with the topology $\tau_{\mathrm{measure}}$ associated with the convergence in measure of functions. This connection becomes absolutely evident in the light of the two research issues proposed in [16, Sect. 4], which have been positively solved in two recent papers [3, 5]. It is not to be excluded that the deep connections between the a.c.s. convergence and the convergence in measure highlighted in [3, 5, 16] and [22, Sect. 5.2.3] may lead to a “bridge”, in the precise mathematical sense established in [11], between measure theory and the asymptotic linear algebra theory underlying the notion of a.c.s.; a bridge that could be exploited to obtain matrix theory results from measure theory results and vice versa. For deeper insights on this topic, we suggest reading [5, Sect. 1].
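The right-hand side of (2.47) gives a direct way to evaluate $p(A)$ once the singular values of $A$ are available. The following sketch implements that formula and applies it to a hypothetical pair of diagonal sequences, $A_n = \mathrm{diag}(j/n)$ and $B_{n,m} = \mathrm{diag}(\lfloor mj/n \rfloor / m)$, for which $\|A_n - B_{n,m}\| < 1/m$. The example and the function names are assumptions made for illustration; note that the code evaluates $p$ at a finite $n$, whereas $p_{\mathrm{a.c.s.}}$ in (2.48) is a limsup over $n$.

```python
def p_from_singular_values(singular_values):
    """p(A) = min over i = 0..l of ( i/l + sigma_{i+1}(A) ), with sigma_{l+1}(A) = 0."""
    size = len(singular_values)
    sigma = sorted(singular_values, reverse=True) + [0.0]  # sigma[i] holds sigma_{i+1}
    return min(i / size + sigma[i] for i in range(size + 1))

def diagonal_difference(n, m):
    """Singular values of A_n - B_{n,m} for A_n = diag(j/n), B_{n,m} = diag(floor(m*j/n)/m)."""
    return [abs(j / n - ((m * j) // n) / m) for j in range(1, n + 1)]
```

Since every singular value of $A_n - B_{n,m}$ is below $1/m$, the term $i = 0$ in (2.47) already gives $p(A_n - B_{n,m}) \le 1/m$, in agreement with the characterization of a.c.s. in Theorem 2.5.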
2.7.2 The a.c.s. Tools for Computing Singular Value and Spectral Distributions

The importance of the a.c.s. resides in Theorems 2.6 and 2.7, which are the versions of [22, Theorems 5.4 and 5.6] for general sequences of matrices.

Theorem 2.6 Let $\{A_n\}_n$, $\{B_{n,m}\}_n$ be sequences of matrices and let $\phi, \phi_m : C_c(\mathbb{R}) \to \mathbb{C}$ be functionals. Suppose that
1. $\{B_{n,m}\}_n \sim_\sigma \phi_m$ for every $m$,
2. $\{B_{n,m}\}_n \xrightarrow{\text{a.c.s.}} \{A_n\}_n$,
3. $\phi_m \to \phi$ pointwise over $C_c(\mathbb{R})$.
Then $\{A_n\}_n \sim_\sigma \phi$.

Theorem 2.7 Let $\{A_n\}_n$, $\{B_{n,m}\}_n$ be sequences of Hermitian matrices and let $\phi, \phi_m : C_c(\mathbb{C}) \to \mathbb{C}$ be functionals. Suppose that
1. $\{B_{n,m}\}_n \sim_\lambda \phi_m$ for every $m$,
2. $\{B_{n,m}\}_n \xrightarrow{\text{a.c.s.}} \{A_n\}_n$,
3. $\phi_m \to \phi$ pointwise over $C_c(\mathbb{C})$.
Then $\{A_n\}_n \sim_\lambda \phi$.

Theorem 2.6 admits the following converse, which is the version of [22, Theorem 5.5] for general sequences of matrices.

Theorem 2.8 Let $\{A_n\}_n$, $\{B_{n,m}\}_n$ be sequences of matrices and let $\phi, \phi_m : C_c(\mathbb{R}) \to \mathbb{C}$ be functionals. Suppose that
1. $\{A_n\}_n \sim_\sigma \phi$,
2. $\{B_{n,m}\}_n \sim_\sigma \phi_m$ for every $m$,
3. $\{B_{n,m}\}_n \xrightarrow{\text{a.c.s.}} \{A_n\}_n$.
Then $\phi_m \to \phi$ pointwise over $C_c(\mathbb{R})$.

We report below two important corollaries of Theorems 2.6 and 2.7.

Corollary 2.4 Let $\{A_n\}_n$, $\{B_{n,m}\}_n$ be sequences of matrices and let $f, f_m : D \subset \mathbb{R}^k \to \mathbb{C}$ be measurable functions defined on a set $D$ with $0 < \mu_k(D) < \infty$. Suppose that
1. $\{B_{n,m}\}_n \sim_\sigma f_m$ for every $m$,
2. $\{B_{n,m}\}_n \xrightarrow{\text{a.c.s.}} \{A_n\}_n$,
3. $f_m \to f$ in measure.
Then $\{A_n\}_n \sim_\sigma f$.

Proof Apply Theorem 2.6 with $\phi_m = \phi_{|f_m|}$ and $\phi = \phi_{|f|}$. Note that $\phi_m \to \phi$ pointwise over $C_c(\mathbb{R})$ by [22, Lemma 2.5], since $|f_m| \to |f|$ in measure.
Corollary 2.5 Let $\{A_n\}_n$, $\{B_{n,m}\}_n$ be sequences of Hermitian matrices and let $f, f_m : D \subset \mathbb{R}^k \to \mathbb{C}$ be measurable functions defined on a set $D$ with $0 < \mu_k(D) < \infty$. Suppose that
1. $\{B_{n,m}\}_n \sim_\lambda f_m$ for every $m$,
2. $\{B_{n,m}\}_n \xrightarrow{\text{a.c.s.}} \{A_n\}_n$,
3. $f_m \to f$ in measure.
Then $\{A_n\}_n \sim_\lambda f$.

Proof Apply Theorem 2.7 with $\phi_m = \phi_{f_m}$ and $\phi = \phi_f$. Note that $\phi_m \to \phi$ pointwise over $C_c(\mathbb{C})$ by [22, Lemma 2.5].

Remark 2.5 (topological interpretation of Corollaries 2.4 and 2.5) It is interesting to give a topological interpretation of Corollaries 2.4 and 2.5. We only focus on Corollary 2.4 as the discussion for Corollary 2.5 is similar. Fix a sequence of positive integers $d_n$ such that $d_n \to \infty$, and let
\[
\mathcal{E} = \bigl\{ \{A_n\}_n : A_n \in \mathbb{C}^{d_n \times d_n} \text{ for every } n \bigr\}, \qquad \mathfrak{M}_D = \{ f : D \to \mathbb{C} : f \text{ is measurable} \}.
\]
We have seen in Sect. 2.7.1 and [22, Sect. 2.3.2] that $\mathcal{E}$ (resp., $\mathfrak{M}_D$) is a pseudometric space with respect to the pseudometric $d_{\mathrm{a.c.s.}}$ (resp., $d_{\mathrm{measure}}$) which induces the a.c.s. topology $\tau_{\mathrm{a.c.s.}}$ (resp., the topology of convergence in measure $\tau_{\mathrm{measure}}$). Corollary 2.4 is then equivalent to saying that the set of “$\sigma$-pairs”
\[
\bigl\{ (\{A_n\}_n, f) \in \mathcal{E} \times \mathfrak{M}_D : \{A_n\}_n \sim_\sigma f \bigr\}
\]
is closed in $\mathcal{E} \times \mathfrak{M}_D$ equipped with the product (pseudometrizable) topology $\tau_{\mathrm{a.c.s.}} \times \tau_{\mathrm{measure}}$ induced, for example, by the pseudometric
\[
d_{\mathrm{a.c.s.} \times \mathrm{measure}}\bigl( (\{A_n\}_n, \kappa), (\{B_n\}_n, \xi) \bigr) = d_{\mathrm{a.c.s.}}(\{A_n\}_n, \{B_n\}_n) + d_{\mathrm{measure}}(\kappa, \xi);
\]
see [22, Exercise 2.2]. Indeed, Corollary 2.4 reads as follows: if a sequence of $\sigma$-pairs $(\{B_{n,m}\}_n, f_m)$ converges to a pair $(\{A_n\}_n, f)$ in $\mathcal{E} \times \mathfrak{M}_D$, then $(\{A_n\}_n, f)$ is a $\sigma$-pair.
2.7.3 The a.c.s. Algebra

In this subsection, we formulate the algebraic properties of a.c.s. [22, Sect. 5.4]. The next theorem is the version of [22, Propositions 5.1, 5.2 and 5.5] for general sequences of matrices.

Theorem 2.9 Let $\{A_n\}_n$, $\{A'_n\}_n$ be sequences of matrices, with $A_n$, $A'_n$ of size $d_n$, and suppose that $\{B_{n,m}\}_n \xrightarrow{\text{a.c.s.}} \{A_n\}_n$ and $\{B'_{n,m}\}_n \xrightarrow{\text{a.c.s.}} \{A'_n\}_n$. The following properties hold.
• $\{B_{n,m}^*\}_n \xrightarrow{\text{a.c.s.}} \{A_n^*\}_n$.
• $\{\alpha B_{n,m} + \beta B'_{n,m}\}_n \xrightarrow{\text{a.c.s.}} \{\alpha A_n + \beta A'_n\}_n$ for all $\alpha, \beta \in \mathbb{C}$.
• If $\{A_n\}_n$ and $\{A'_n\}_n$ are s.u. then $\{B_{n,m} B'_{n,m}\}_n \xrightarrow{\text{a.c.s.}} \{A_n A'_n\}_n$.
It is worth mentioning two further properties of a.c.s., which will not be used in this book, but are anyway interesting to know. They are the versions for general sequences of matrices of the two properties stated at the end of [22, Sect. 5.4].
• Let $\{A_n\}_n$ be an s.u. sequence of matrices and suppose that $\{B_{n,m}\}_n \xrightarrow{\text{a.c.s.}} \{A_n\}_n$. Assume also that the $A_n$ and $B_{n,m}$ are Hermitian. Then $\{f(B_{n,m})\}_n \xrightarrow{\text{a.c.s.}} \{f(A_n)\}_n$ for all continuous functions $f : \mathbb{C} \to \mathbb{C}$.
• Let $\{A_n\}_n$ be an s.v. sequence of matrices and suppose that $\{B_{n,m}\}_n \xrightarrow{\text{a.c.s.}} \{A_n\}_n$. Then $\{B_{n,m}^\dagger\}_n \xrightarrow{\text{a.c.s.}} \{A_n^\dagger\}_n$.
2.7.4 Some Criteria to Identify a.c.s.

Two useful criteria to identify an a.c.s. without constructing the splitting (2.45) are given below. They are the versions of [22, Corollaries 5.3 and 5.4] for general sequences of matrices.

Theorem 2.10 Let $\{A_n\}_n$ be a sequence of matrices, with $A_n$ of size $d_n$, let $\{\{B_{n,m}\}_n\}_m$ be a sequence of sequences of matrices, with $B_{n,m}$ of size $d_n$, and let $p \in [1, \infty]$. Suppose that for every $m$ there exists $n_m$ such that, for $n \ge n_m$,
\[
\|A_n - B_{n,m}\|_p \le \varepsilon(m, n)\,(d_n)^{1/p},
\]
where $\lim_{m\to\infty} \limsup_{n\to\infty} \varepsilon(m, n) = 0$. Then $\{B_{n,m}\}_n \xrightarrow{\text{a.c.s.}} \{A_n\}_n$.

Theorem 2.11 Let $\{A_n\}_n$ be a sequence of matrices, with $A_n$ of size $d_n$, and let $\{\{B_{n,m}\}_n\}_m$ be a sequence of sequences of matrices, with $B_{n,m}$ of size $d_n$. Suppose that $\{A_n - B_{n,m}\}_n \sim_\sigma g_m$ for some $g_m : D \subset \mathbb{R}^k \to \mathbb{C}$ such that $g_m \to 0$ in measure. Then $\{B_{n,m}\}_n \xrightarrow{\text{a.c.s.}} \{A_n\}_n$.

We provide the following converse of Theorem 2.11 for future use.

Proposition 2.5 Let $\{A_n\}_n$ be a sequence of matrices, with $A_n$ of size $d_n$, and let $\{\{B_{n,m}\}_n\}_m$ be a sequence of sequences of matrices, with $B_{n,m}$ of size $d_n$. Suppose that $\{A_n - B_{n,m}\}_n \sim_\sigma g_m$ for some $g_m : D \subset \mathbb{R}^k \to \mathbb{C}$ and that $\{B_{n,m}\}_n \xrightarrow{\text{a.c.s.}} \{A_n\}_n$. Then $g_m \to 0$ in measure.

Proof Since $\{A_n - B_{n,m}\}_n \xrightarrow{\text{a.c.s.}} \{O_{d_n}\}_n$ and $\{O_{d_n}\}_n \sim_\sigma 0$, Theorem 2.8 implies that $\phi_{g_m} \to \phi_0$ pointwise over $C_c(\mathbb{R})$. It follows that $g_m \to 0$ in measure by [22, Lemma 2.6].
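The criterion of Theorem 2.10 is useful precisely when the spectral norm of $A_n - B_{n,m}$ is not small. As a minimal sketch, assume $\|\cdot\|_1$ is the Schatten 1-norm (the sum of the singular values) and consider the hypothetical diagonal example $A_n = \mathrm{diag}(g(j/n))$ with $g(x) = x^{-1/2}$, approximated by the truncations $B_{n,m} = \mathrm{diag}(\min(g(j/n), m))$, with $d_n = n$. The function names below are illustrative.

```python
import math

def truncation_error_ratio(n, m):
    """epsilon(m, n) = ||A_n - B_{n,m}||_1 / d_n for the diagonal example, with d_n = n."""
    total = 0.0
    for j in range(1, n + 1):
        value = math.sqrt(n / j)          # j-th diagonal entry of A_n, i.e. g(j/n)
        total += value - min(value, m)    # j-th singular value of A_n - B_{n,m}
    return total / n

def spectral_norm_of_difference(n, m):
    """||A_n - B_{n,m}||, attained at j = 1 where g(1/n) = sqrt(n)."""
    return max(0.0, math.sqrt(n) - m)
```

In this example $\varepsilon(m, n) \le 2/m$ for all $n$ (compare the sum with the integral of $x^{-1/2}$ over $[0, 1/m^2]$), so Theorem 2.10 applies with $p = 1$, whereas the spectral norm of the difference grows like $\sqrt{n}$ and the choice $p = \infty$ gives no information.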
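2.7.5 An Extension of the Concept of a.c.s.

We now provide a natural extension of the a.c.s. notion. The underlying idea is that, in Definition 2.6, one could choose to approximate $\{A_n\}_n$ by a class of sequences $\{\{B_{n,\alpha}\}_n\}_{\alpha \in \mathcal{A}}$ parameterized by a not necessarily integer parameter $\alpha$. For example, one may want to use a parameter $\varepsilon > 0$ and claim that a given class of sequences $\{\{B_{n,\varepsilon}\}_n\}_{\varepsilon > 0}$ is an a.c.s. for $\{A_n\}_n$ as $\varepsilon \to 0$. Intuitively, this assertion should have the following meaning: for every $\varepsilon > 0$ there exists $n_\varepsilon$ such that, for $n \ge n_\varepsilon$,
\[
A_n = B_{n,\varepsilon} + R_{n,\varepsilon} + N_{n,\varepsilon}, \qquad \mathrm{rank}(R_{n,\varepsilon}) \le c(\varepsilon)\,d_n, \qquad \|N_{n,\varepsilon}\| \le \omega(\varepsilon),
\]
where $n_\varepsilon$, $c(\varepsilon)$, $\omega(\varepsilon)$ depend only on $\varepsilon$ and both $c(\varepsilon)$ and $\omega(\varepsilon)$ tend to 0 as $\varepsilon \to 0$. This is in fact the correct meaning.

Definition 2.7 (approximating class of sequences as $\varepsilon \to 0$) Let $\{A_n\}_n$ be a sequence of matrices, with $A_n$ of size $d_n$, and let $\{\{B_{n,\varepsilon}\}_n\}_{\varepsilon > 0}$ be a class of sequences of matrices, with $B_{n,\varepsilon}$ of size $d_n$. We say that $\{\{B_{n,\varepsilon}\}_n\}_{\varepsilon > 0}$ is an a.c.s. for $\{A_n\}_n$ as $\varepsilon \to 0$ if the following property holds: for every $\varepsilon > 0$ there exists $n_\varepsilon$ such that, for $n \ge n_\varepsilon$,
\[
A_n = B_{n,\varepsilon} + R_{n,\varepsilon} + N_{n,\varepsilon}, \qquad \mathrm{rank}(R_{n,\varepsilon}) \le c(\varepsilon)\,d_n, \qquad \|N_{n,\varepsilon}\| \le \omega(\varepsilon),
\]
where the quantities $n_\varepsilon$, $c(\varepsilon)$, $\omega(\varepsilon)$ depend only on $\varepsilon$ and
\[
\lim_{\varepsilon \to 0} c(\varepsilon) = \lim_{\varepsilon \to 0} \omega(\varepsilon) = 0.
\]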
Clearly, if $\{\{B_{n,\varepsilon}\}_n\}_{\varepsilon > 0}$ is an a.c.s. for $\{A_n\}_n$ as $\varepsilon \to 0$, then $\{\{B_{n,\varepsilon^{(m)}}\}_n\}_m$ is an a.c.s. for $\{A_n\}_n$ (in the sense of the classical Definition 2.6) for all sequences of positive numbers $\{\varepsilon^{(m)}\}_m$ such that $\varepsilon^{(m)} \to 0$ as $m \to \infty$. A.c.s. parameterized by a positive $\varepsilon \to 0$ appear, for example, in the definition of multilevel GLT sequences (Definition 5.1). Thanks to the topological results in Sect. 2.7.1, we can give the following elegant characterization of an a.c.s. of this kind: a class of sequences of matrices $\{\{B_{n,\varepsilon}\}_n\}_{\varepsilon > 0}$ is an a.c.s. for $\{A_n\}_n$ as $\varepsilon \to 0$ if and only if $d_{\mathrm{a.c.s.}}(\{A_n\}_n, \{B_{n,\varepsilon}\}_n) \to 0$ as $\varepsilon \to 0$. Throughout this book, to indicate that $\{\{B_{n,\varepsilon}\}_n\}_{\varepsilon > 0}$ is an a.c.s. for $\{A_n\}_n$ as $\varepsilon \to 0$, we will write $\{B_{n,\varepsilon}\}_n \xrightarrow{\text{a.c.s.}} \{A_n\}_n$ as $\varepsilon \to 0$.

For the definition of multilevel LT sequences (Definition 4.3), we need the concept of a.c.s. parameterized by a multi-index $\boldsymbol{m} \to \infty$. In what follows, a multi-index sequence of sequences of matrices is any class of sequences of the form $\{\{B_{n,\boldsymbol{m}}\}_n\}_{\boldsymbol{m} \in M}$ which satisfies the following two properties.
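1. $M \subseteq \mathbb{N}^q$ for some $q \ge 1$ and $M \cap \{\boldsymbol{i} \in \mathbb{N}^q : \boldsymbol{i} \ge \boldsymbol{k}\}$ is nonempty for every $\boldsymbol{k} \in \mathbb{N}^q$. We express the latter condition by saying that $\infty$ is an accumulation point for $M$. This is required to ensure that $\boldsymbol{m}$ can tend to $\infty$ inside $M$.
2. For every $\boldsymbol{m} \in M$, $\{B_{n,\boldsymbol{m}}\}_n$ is a sequence of matrices as defined at the end of Sect. 2.1.1.

Definition 2.8 (approximating class of sequences as $\boldsymbol{m} \to \infty$) Let $\{A_n\}_n$ be a sequence of matrices, with $A_n$ of size $d_n$, and let $\{\{B_{n,\boldsymbol{m}}\}_n\}_{\boldsymbol{m} \in M}$ be a multi-index sequence of sequences of matrices, with $B_{n,\boldsymbol{m}}$ of size $d_n$. We say that $\{\{B_{n,\boldsymbol{m}}\}_n\}_{\boldsymbol{m} \in M}$ is an a.c.s. for $\{A_n\}_n$ as $\boldsymbol{m} \to \infty$ if the following property holds: for every $\boldsymbol{m} \in M$ there exists $n_{\boldsymbol{m}}$ such that, for $n \ge n_{\boldsymbol{m}}$,
\[
A_n = B_{n,\boldsymbol{m}} + R_{n,\boldsymbol{m}} + N_{n,\boldsymbol{m}}, \qquad \mathrm{rank}(R_{n,\boldsymbol{m}}) \le c(\boldsymbol{m})\,d_n, \qquad \|N_{n,\boldsymbol{m}}\| \le \omega(\boldsymbol{m}), \tag{2.50}
\]
where the quantities $n_{\boldsymbol{m}}$, $c(\boldsymbol{m})$, $\omega(\boldsymbol{m})$ depend only on $\boldsymbol{m}$ and
\[
\lim_{\boldsymbol{m} \to \infty} c(\boldsymbol{m}) = \lim_{\boldsymbol{m} \to \infty} \omega(\boldsymbol{m}) = 0.
\]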
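Note that an equivalent definition is obtained by replacing, in Definition 2.8, “for every $\boldsymbol{m} \in M$” with “for every sufficiently large $\boldsymbol{m} \in M$” (i.e., “for every $\boldsymbol{m} \in M$ that is greater than or equal to some $\hat{\boldsymbol{m}}$”). Indeed, suppose the splitting (2.50) holds for $\boldsymbol{m} \ge \hat{\boldsymbol{m}}$. For the other values of $\boldsymbol{m}$, define $n_{\boldsymbol{m}} = 1$, $c(\boldsymbol{m}) = 1$, $\omega(\boldsymbol{m}) = 0$ and $R_{n,\boldsymbol{m}} = A_n - B_{n,\boldsymbol{m}}$, $N_{n,\boldsymbol{m}} = O_{d_n}$. Then, we see that (2.50) holds for every $\boldsymbol{m} \in M$.

Definition 2.8 extends the classical definition of a.c.s. (Definition 2.6). Indeed, a classical a.c.s. $\{\{B_{n,m}\}_n\}_m$ for $\{A_n\}_n$ is an a.c.s. also in the sense of Definition 2.8 (take $M$ as the infinite subset of $\mathbb{N}$ where $m$ varies). In addition, if $\{\{B_{n,\boldsymbol{m}}\}_n\}_{\boldsymbol{m} \in M}$ is an a.c.s. for $\{A_n\}_n$ in the sense of Definition 2.8, then $\{\{B_{n,\boldsymbol{m}}\}_n\}_m$ is an a.c.s. for $\{A_n\}_n$ (in the sense of the classical Definition 2.6) for all sequences of multi-indices $\{\boldsymbol{m} = \boldsymbol{m}(m)\}_m \subseteq M$ such that $\boldsymbol{m} \to \infty$ as $m \to \infty$.

Remark 2.6 Let $\{B_{n,\boldsymbol{m}}\}_n \xrightarrow{\text{a.c.s.}} \{A_n\}_n$ and $\{B'_{n,\boldsymbol{m}}\}_n \xrightarrow{\text{a.c.s.}} \{A'_n\}_n$ as $\boldsymbol{m} \to \infty$. The following properties hold.
• $\{B_{n,\boldsymbol{m}}^*\}_n \xrightarrow{\text{a.c.s.}} \{A_n^*\}_n$ as $\boldsymbol{m} \to \infty$.
• $\{\alpha B_{n,\boldsymbol{m}} + \beta B'_{n,\boldsymbol{m}}\}_n \xrightarrow{\text{a.c.s.}} \{\alpha A_n + \beta A'_n\}_n$ as $\boldsymbol{m} \to \infty$ for all $\alpha, \beta \in \mathbb{C}$.
• If $\{A_n\}_n$ and $\{A'_n\}_n$ are s.u. then $\{B_{n,\boldsymbol{m}} B'_{n,\boldsymbol{m}}\}_n \xrightarrow{\text{a.c.s.}} \{A_n A'_n\}_n$.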
The proof of these results is essentially the same as the proof of the analogous results for standard a.c.s.; see Theorem 2.9. As in the case of a.c.s. parameterized by a positive ε → 0, also for a.c.s. parameterized by a multi-index m → ∞ we can give the following elegant characterization based on the topological results of Sect. 2.7.1:
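a multi-index sequence of sequences of matrices $\{\{B_{n,\boldsymbol{m}}\}_n\}_{\boldsymbol{m} \in M}$ is an a.c.s. for $\{A_n\}_n$ as $\boldsymbol{m} \to \infty$ if and only if $d_{\mathrm{a.c.s.}}(\{A_n\}_n, \{B_{n,\boldsymbol{m}}\}_n) \to 0$ as $\boldsymbol{m} \to \infty$. Throughout this book, to indicate that $\{\{B_{n,\boldsymbol{m}}\}_n\}_{\boldsymbol{m} \in M}$ is an a.c.s. for $\{A_n\}_n$ as $\boldsymbol{m} \to \infty$, we will write $\{B_{n,\boldsymbol{m}}\}_n \xrightarrow{\text{a.c.s.}} \{A_n\}_n$ as $\boldsymbol{m} \to \infty$.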
Chapter 3
Multilevel Toeplitz Sequences
This chapter is devoted to multilevel Toeplitz matrices. More precisely, we focus on the sequences of multilevel Toeplitz matrices generated by a multivariate $L^1$ function. These sequences, together with the sequences of multilevel diagonal sampling matrices (to be introduced afterwards) and the zero-distributed sequences (already studied in Sect. 2.6.3), should be regarded as the “building blocks” of the theory of multilevel GLT sequences. Despite its conciseness, this chapter contains everything we need about multilevel Toeplitz matrices to fully develop the theory of multilevel GLT sequences. In particular, it contains an a.c.s.-based proof of the multivariate $L^1$ versions of Szegő’s first limit theorem and the Avram–Parter theorem about the singular value and eigenvalue distributions of multilevel Toeplitz sequences. We stress that almost all the results and the proofs contained in this chapter have an exact analog in [22, Chap. 6], where we dealt with classical (i.e., unilevel) Toeplitz matrices. However, we decided to reproduce here all the proofs without omitting details, in order to help the reader become familiar with the multilevel language (especially, the multi-index notation). In this regard, the reader is invited to compare the “multivariate proofs” presented in this chapter with the corresponding “univariate proofs” in [22, Chap. 6], in order to learn the way in which the multilevel language allows one to transfer many results from the univariate to the multivariate case by simply turning some letters ($n, i, j, x, \theta$, etc.) into boldface ($\boldsymbol{n}, \boldsymbol{i}, \boldsymbol{j}, \boldsymbol{x}, \boldsymbol{\theta}$, etc.).
© Springer Nature Switzerland AG 2018 C. Garoni and S. Serra-Capizzano, Generalized Locally Toeplitz Sequences: Theory and Applications, https://doi.org/10.1007/978-3-030-02233-4_3

3.1 Multilevel Toeplitz Matrices and Multilevel Toeplitz Sequences

Given a $d$-index $\boldsymbol{n} \in \mathbb{N}^d$, a matrix of the form
\[
\bigl[ a_{\boldsymbol{i} - \boldsymbol{j}} \bigr]_{\boldsymbol{i}, \boldsymbol{j} = \boldsymbol{1}}^{\boldsymbol{n}} \in \mathbb{C}^{N(\boldsymbol{n}) \times N(\boldsymbol{n})}, \tag{3.1}
\]
whose $(\boldsymbol{i}, \boldsymbol{j})$ entry depends only on the difference $\boldsymbol{i} - \boldsymbol{j}$, is called a multilevel Toeplitz matrix (or, more precisely, a $d$-level Toeplitz matrix). In the case $d = 1$, the matrix (3.1) becomes
\[
\bigl[ a_{i-j} \bigr]_{i,j=1}^{n} =
\begin{bmatrix}
a_0 & a_{-1} & a_{-2} & \cdots & \cdots & a_{-(n-1)} \\
a_1 & \ddots & \ddots & \ddots & & \vdots \\
a_2 & \ddots & \ddots & \ddots & \ddots & \vdots \\
\vdots & \ddots & \ddots & \ddots & \ddots & a_{-2} \\
\vdots & & \ddots & \ddots & \ddots & a_{-1} \\
a_{n-1} & \cdots & \cdots & a_2 & a_1 & a_0
\end{bmatrix},
\]
which means that a 1-level (or unilevel) Toeplitz matrix is just a matrix whose entries are constant along each diagonal. Unilevel Toeplitz matrices are nothing else than classical Toeplitz matrices, which have been the subject of [22, Chap. 6]. In the case d = 2, the matrix (3.1) can be written as
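\[
\bigl[ a_{\boldsymbol{i} - \boldsymbol{j}} \bigr]_{\boldsymbol{i}, \boldsymbol{j} = \boldsymbol{1}}^{\boldsymbol{n}}
= \Bigl[ \bigl[ a_{i_1 - j_1,\, i_2 - j_2} \bigr]_{i_2, j_2 = 1}^{n_2} \Bigr]_{i_1, j_1 = 1}^{n_1}
= \bigl[ a_{i_1 - j_1} \bigr]_{i_1, j_1 = 1}^{n_1}
=
\begin{bmatrix}
a_0 & a_{-1} & a_{-2} & \cdots & \cdots & a_{-(n_1-1)} \\
a_1 & \ddots & \ddots & \ddots & & \vdots \\
a_2 & \ddots & \ddots & \ddots & \ddots & \vdots \\
\vdots & \ddots & \ddots & \ddots & \ddots & a_{-2} \\
\vdots & & \ddots & \ddots & \ddots & a_{-1} \\
a_{n_1-1} & \cdots & \cdots & a_2 & a_1 & a_0
\end{bmatrix}, \tag{3.2}
\]
where, for all $k = -(n_1 - 1), \ldots, n_1 - 1$,
\[
a_k = \bigl[ a_{k,\, i_2 - j_2} \bigr]_{i_2, j_2 = 1}^{n_2} =
\begin{bmatrix}
a_{k,0} & a_{k,-1} & a_{k,-2} & \cdots & \cdots & a_{k,-(n_2-1)} \\
a_{k,1} & \ddots & \ddots & \ddots & & \vdots \\
a_{k,2} & \ddots & \ddots & \ddots & \ddots & \vdots \\
\vdots & \ddots & \ddots & \ddots & \ddots & a_{k,-2} \\
\vdots & & \ddots & \ddots & \ddots & a_{k,-1} \\
a_{k,n_2-1} & \cdots & \cdots & a_{k,2} & a_{k,1} & a_{k,0}
\end{bmatrix}. \tag{3.3}
\]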
A matrix of this form, which we call a 2-level Toeplitz matrix, is also known as a block Toeplitz matrix with Toeplitz blocks, or BTTB matrix. In general, we can say that a d-level Toeplitz matrix is a block Toeplitz matrix with (d − 1)-level Toeplitz blocks.
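For $n \in \mathbb{N}$ and $k \in \mathbb{Z}$, let $J_n^{(k)}$ be the $n \times n$ matrix whose $(i, j)$ entry equals 1 if $i - j = k$ and 0 otherwise, that is,
\[
(J_n^{(k)})_{ij} = \delta_{i-j-k}, \qquad i, j = 1, \ldots, n, \qquad n \in \mathbb{N}, \qquad k \in \mathbb{Z}, \tag{3.4}
\]
where $\delta_r = 1$ if $r = 0$ and $\delta_r = 0$ otherwise. For $\boldsymbol{n} \in \mathbb{N}^d$ and $\boldsymbol{k} \in \mathbb{Z}^d$, let
\[
J_{\boldsymbol{n}}^{(\boldsymbol{k})} = J_{n_1}^{(k_1)} \otimes J_{n_2}^{(k_2)} \otimes \cdots \otimes J_{n_d}^{(k_d)}. \tag{3.5}
\]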
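Lemma 3.1 The $d$-level Toeplitz matrix (3.1) admits the following expression:
\[
\bigl[ a_{\boldsymbol{i} - \boldsymbol{j}} \bigr]_{\boldsymbol{i}, \boldsymbol{j} = \boldsymbol{1}}^{\boldsymbol{n}} = \sum_{\boldsymbol{k} = -(\boldsymbol{n} - \boldsymbol{1})}^{\boldsymbol{n} - \boldsymbol{1}} a_{\boldsymbol{k}}\, J_{\boldsymbol{n}}^{(\boldsymbol{k})}, \tag{3.6}
\]
where $J_{\boldsymbol{n}}^{(\boldsymbol{k})}$ is defined in (3.5).

Proof Equation (3.6) is proved componentwise, by showing that the $(\boldsymbol{i}, \boldsymbol{j})$ entry of the matrix in the right-hand side is equal to $a_{\boldsymbol{i} - \boldsymbol{j}}$. By (3.4) and the crucial property (2.29), for all $\boldsymbol{i}, \boldsymbol{j} = \boldsymbol{1}, \ldots, \boldsymbol{n}$ we have
\[
(J_{\boldsymbol{n}}^{(\boldsymbol{k})})_{\boldsymbol{i}\boldsymbol{j}} = (J_{n_1}^{(k_1)} \otimes J_{n_2}^{(k_2)} \otimes \cdots \otimes J_{n_d}^{(k_d)})_{\boldsymbol{i}\boldsymbol{j}} = (J_{n_1}^{(k_1)})_{i_1 j_1} (J_{n_2}^{(k_2)})_{i_2 j_2} \cdots (J_{n_d}^{(k_d)})_{i_d j_d} = \delta_{i_1 - j_1 - k_1}\, \delta_{i_2 - j_2 - k_2} \cdots \delta_{i_d - j_d - k_d} = \delta_{\boldsymbol{i} - \boldsymbol{j} - \boldsymbol{k}}, \tag{3.7}
\]
where $\delta_{\boldsymbol{r}} = 1$ if $\boldsymbol{r} = \boldsymbol{0}$ and $\delta_{\boldsymbol{r}} = 0$ otherwise. Therefore,
\[
\Biggl( \sum_{\boldsymbol{k} = -(\boldsymbol{n} - \boldsymbol{1})}^{\boldsymbol{n} - \boldsymbol{1}} a_{\boldsymbol{k}}\, J_{\boldsymbol{n}}^{(\boldsymbol{k})} \Biggr)_{\boldsymbol{i}\boldsymbol{j}} = \sum_{\boldsymbol{k} = -(\boldsymbol{n} - \boldsymbol{1})}^{\boldsymbol{n} - \boldsymbol{1}} a_{\boldsymbol{k}}\, (J_{\boldsymbol{n}}^{(\boldsymbol{k})})_{\boldsymbol{i}\boldsymbol{j}} = \sum_{\boldsymbol{k} = -(\boldsymbol{n} - \boldsymbol{1})}^{\boldsymbol{n} - \boldsymbol{1}} a_{\boldsymbol{k}}\, \delta_{\boldsymbol{i} - \boldsymbol{j} - \boldsymbol{k}} = a_{\boldsymbol{i} - \boldsymbol{j}},
\]
and (3.6) is proved.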
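Lemma 3.1 can also be checked computationally on small sizes. The sketch below builds both sides of (3.6) with plain nested lists, assuming the standard lexicographic ordering of multi-indices (first component slowest, last component fastest), which is the ordering compatible with the Kronecker product in (3.5). All function names, and the coefficient function passed as `coeff`, are hypothetical.

```python
from itertools import product

def shift_matrix(n, k):
    """J_n^(k): n x n matrix with 1 where i - j = k and 0 elsewhere, as in (3.4)."""
    return [[1 if i - j == k else 0 for j in range(n)] for i in range(n)]

def kron(A, B):
    """Kronecker product of two dense matrices stored as lists of rows."""
    return [
        [a * b for a in row_a for b in row_b]
        for row_a in A for row_b in B
    ]

def multilevel_shift(n, k):
    """J_n^(k) = J_{n_1}^(k_1) (x) ... (x) J_{n_d}^(k_d) for d-indices n and k, as in (3.5)."""
    result = [[1]]
    for size, shift in zip(n, k):
        result = kron(result, shift_matrix(size, shift))
    return result

def toeplitz_from_sum(n, coeff):
    """Right-hand side of (3.6): sum over k = -(n-1), ..., n-1 of a_k J_n^(k)."""
    total = 1
    for size in n:
        total *= size
    T = [[0] * total for _ in range(total)]
    for k in product(*(range(-(s - 1), s) for s in n)):
        J = multilevel_shift(n, k)
        a = coeff(k)
        for r in range(total):
            for c in range(total):
                T[r][c] += a * J[r][c]
    return T

def toeplitz_direct(n, coeff):
    """Left-hand side of (3.6): entry (i, j) equals a_{i-j}, multi-indices in lexicographic order."""
    indices = list(product(*(range(s) for s in n)))
    return [[coeff(tuple(a - b for a, b in zip(i, j))) for j in indices] for i in indices]
```

For instance, with `n = (2, 3)` and any function `coeff` mapping a pair of integers to a number, `toeplitz_from_sum(n, coeff)` and `toeplitz_direct(n, coeff)` should coincide entry by entry. Zero-based indices are used in the code, which is harmless because only the differences $\boldsymbol{i} - \boldsymbol{j}$ matter.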
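Given a function $f : [-\pi, \pi]^d \to \mathbb{C}$ belonging to $L^1([-\pi, \pi]^d)$, its Fourier coefficients are denoted by
\[
f_{\boldsymbol{k}} = \frac{1}{(2\pi)^d} \int_{[-\pi, \pi]^d} f(\boldsymbol{\theta})\, \mathrm{e}^{-\mathrm{i}\boldsymbol{k} \cdot \boldsymbol{\theta}}\, d\boldsymbol{\theta}, \qquad \boldsymbol{k} \in \mathbb{Z}^d. \tag{3.8}
\]
The $\boldsymbol{n}$th ($d$-level) Toeplitz matrix associated with $f$ is defined as
\[
T_{\boldsymbol{n}}(f) = \bigl[ f_{\boldsymbol{i} - \boldsymbol{j}} \bigr]_{\boldsymbol{i}, \boldsymbol{j} = \boldsymbol{1}}^{\boldsymbol{n}} = \sum_{\boldsymbol{k} = -(\boldsymbol{n} - \boldsymbol{1})}^{\boldsymbol{n} - \boldsymbol{1}} f_{\boldsymbol{k}}\, J_{\boldsymbol{n}}^{(\boldsymbol{k})}, \tag{3.9}
\]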
where the second equality follows from Lemma 3.1. We call $\{T_{\boldsymbol{n}}(f)\}_{\boldsymbol{n} \in \mathbb{N}^d}$ the family of ($d$-level) Toeplitz matrices (or simply the Toeplitz family) generated by $f$, which in turn is referred to as the generating function of $\{T_{\boldsymbol{n}}(f)\}_{\boldsymbol{n} \in \mathbb{N}^d}$. Any matrix-sequence extracted from the Toeplitz family $\{T_{\boldsymbol{n}}(f)\}_{\boldsymbol{n} \in \mathbb{N}^d}$, i.e., any matrix-sequence of the form $\{T_{\boldsymbol{n}}(f)\}_n$ with $\boldsymbol{n} = \boldsymbol{n}(n) \to \infty$ as $n \to \infty$, is referred to as a ($d$-level) Toeplitz sequence generated by $f$. Note that if $g = f$ a.e. then $T_{\boldsymbol{n}}(g) = T_{\boldsymbol{n}}(f)$ for all $\boldsymbol{n} \in \mathbb{N}^d$, i.e., $g$ generates the same Toeplitz family (and hence also the same Toeplitz sequences) as $f$. This is due to the fact that the Fourier coefficients of $g$ and $f$ coincide. If $f$ is a $d$-variate trigonometric polynomial, say
\[
f(\boldsymbol{\theta}) = \sum_{\boldsymbol{k} = -\boldsymbol{r}}^{\boldsymbol{r}} f_{\boldsymbol{k}}\, \mathrm{e}^{\mathrm{i}\boldsymbol{k} \cdot \boldsymbol{\theta}}, \tag{3.10}
\]
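then, as indicated by the notation, the Fourier coefficients of $f$ are given by the coefficients $f_{\boldsymbol{k}}$ in (3.10) for $\boldsymbol{k} \in \{-\boldsymbol{r}, \ldots, \boldsymbol{r}\}$ and are equal to 0 for $\boldsymbol{k} \notin \{-\boldsymbol{r}, \ldots, \boldsymbol{r}\}$. This is a consequence of the orthogonality relations (2.9). It follows that every Toeplitz matrix $T_{\boldsymbol{n}}(f)$ generated by $f$ has a number of nonzero entries in each row and column which is bounded by $(2|\boldsymbol{r}|_\infty + 1)^d$, a constant independent of $\boldsymbol{n}$. Indeed, considering for instance the $\boldsymbol{i}$th row, we have $(T_{\boldsymbol{n}}(f))_{\boldsymbol{i}\boldsymbol{j}} = f_{\boldsymbol{i} - \boldsymbol{j}} = 0$ whenever $|\boldsymbol{i} - \boldsymbol{j}|_\infty > |\boldsymbol{r}|_\infty$, which means that the only possible nonzero entries of $T_{\boldsymbol{n}}(f)$ in the $\boldsymbol{i}$th row are those corresponding to the column multi-indices $\boldsymbol{j} \in \{\boldsymbol{1}, \ldots, \boldsymbol{n}\}$ such that $|\boldsymbol{i} - \boldsymbol{j}|_\infty \le |\boldsymbol{r}|_\infty$; and the number of all multi-indices $\boldsymbol{j} \in \mathbb{Z}^d$ satisfying $|\boldsymbol{i} - \boldsymbol{j}|_\infty \le |\boldsymbol{r}|_\infty$ is $(2|\boldsymbol{r}|_\infty + 1)^d$. We also note that, by (3.7) and the equation
\[
(T_{\boldsymbol{n}}(\mathrm{e}^{\mathrm{i}\boldsymbol{k} \cdot \boldsymbol{\theta}}))_{\boldsymbol{i}\boldsymbol{j}} = (\mathrm{e}^{\mathrm{i}\boldsymbol{k} \cdot \boldsymbol{\theta}})_{\boldsymbol{i} - \boldsymbol{j}} =
\begin{cases}
1, & \text{if } \boldsymbol{i} - \boldsymbol{j} = \boldsymbol{k}, \\
0, & \text{if } \boldsymbol{i} - \boldsymbol{j} \ne \boldsymbol{k},
\end{cases}
\qquad \boldsymbol{i}, \boldsymbol{j} = \boldsymbol{1}, \ldots, \boldsymbol{n},
\]
which is satisfied for all $\boldsymbol{n} \in \mathbb{N}^d$ and $\boldsymbol{k} \in \mathbb{Z}^d$, we have
\[
T_{\boldsymbol{n}}(\mathrm{e}^{\mathrm{i}\boldsymbol{k} \cdot \boldsymbol{\theta}}) = J_{\boldsymbol{n}}^{(\boldsymbol{k})}, \qquad \boldsymbol{n} \in \mathbb{N}^d, \qquad \boldsymbol{k} \in \mathbb{Z}^d.
\]
In other words, $\{J_{\boldsymbol{n}}^{(\boldsymbol{k})}\}_{\boldsymbol{n} \in \mathbb{N}^d}$ is the Toeplitz family generated by the $d$-variate Fourier frequency $\mathrm{e}^{\mathrm{i}\boldsymbol{k} \cdot \boldsymbol{\theta}}$.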
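3.2 Basic Properties of Multilevel Toeplitz Matrices

In this section, we study the basic properties of the Toeplitz matrices $T_{\boldsymbol{n}}(f)$ generated by a function $f \in L^1([-\pi, \pi]^d)$. For each fixed $\boldsymbol{n} \in \mathbb{N}^d$, the map
\[
T_{\boldsymbol{n}}(\cdot) : L^1([-\pi, \pi]^d) \to \mathbb{C}^{N(\boldsymbol{n}) \times N(\boldsymbol{n})}, \qquad f \mapsto T_{\boldsymbol{n}}(f),
\]
is linear, i.e.,
\[
T_{\boldsymbol{n}}(\alpha f + \beta g) = \alpha T_{\boldsymbol{n}}(f) + \beta T_{\boldsymbol{n}}(g), \qquad \alpha, \beta \in \mathbb{C}, \qquad f, g \in L^1([-\pi, \pi]^d). \tag{3.11}
\]
This follows from the relation
\[
(\alpha f + \beta g)_{\boldsymbol{k}} = \alpha f_{\boldsymbol{k}} + \beta g_{\boldsymbol{k}}, \qquad \boldsymbol{k} \in \mathbb{Z}^d,
\]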
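which is a consequence of the linearity of the integral in (3.8). It is clear from the definition that, if $C$ is a constant (considered as a constant function from $[-\pi,\pi]^d$ to $\mathbb C$), then
\[
T_{\mathbf n}(C)=C\,I_{N(\mathbf n)}. \tag{3.12}
\]
For every $f\in L^1([-\pi,\pi]^d)$, the Fourier coefficients of $f$ are related to the Fourier coefficients of its conjugate $\overline f$ by the equation
\[
(\overline f)_{\mathbf j}=\frac{1}{(2\pi)^d}\int_{[-\pi,\pi]^d}\overline{f(\boldsymbol\theta)}\,e^{-i\mathbf j\cdot\boldsymbol\theta}\,d\boldsymbol\theta
=\overline{\frac{1}{(2\pi)^d}\int_{[-\pi,\pi]^d}f(\boldsymbol\theta)\,e^{i\mathbf j\cdot\boldsymbol\theta}\,d\boldsymbol\theta}
=\overline{f_{-\mathbf j}},
\]
which is satisfied for all $\mathbf j\in\mathbb Z^d$. Hence,
\[
\bigl(T_{\mathbf n}(\overline f)\bigr)_{\mathbf i\mathbf j}=(\overline f)_{\mathbf i-\mathbf j}=\overline{f_{\mathbf j-\mathbf i}}=\bigl((T_{\mathbf n}(f))^*\bigr)_{\mathbf i\mathbf j}
\]
for all $\mathbf i,\mathbf j=\mathbf 1,\ldots,\mathbf n$, i.e.,
\[
(T_{\mathbf n}(f))^*=T_{\mathbf n}(\overline f),\qquad f\in L^1([-\pi,\pi]^d). \tag{3.13}
\]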
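In particular, if $f$ is real a.e. then all the Toeplitz matrices $T_{\mathbf n}(f)$ are Hermitian. The next lemma provides an integral expression for the quantity $\mathbf u^*T_{\mathbf n}(f)\mathbf v$, where $\mathbf u,\mathbf v\in\mathbb C^{N(\mathbf n)}$. This expression will be used in Theorem 3.1 to obtain a localization of the spectrum of $T_{\mathbf n}(f)$ in the case where $f$ is real a.e. (and hence $T_{\mathbf n}(f)$ is Hermitian).

Lemma 3.2 For every $f\in L^1([-\pi,\pi]^d)$ and every $\mathbf n\in\mathbb N^d$,
\[
\mathbf u^*T_{\mathbf n}(f)\mathbf v=\frac{1}{(2\pi)^d}\int_{[-\pi,\pi]^d}f(\boldsymbol\theta)\,\mathbf u^*U(\boldsymbol\theta)\mathbf v\,d\boldsymbol\theta,\qquad \mathbf u,\mathbf v\in\mathbb C^{N(\mathbf n)}, \tag{3.14}
\]
where $U(\boldsymbol\theta)=\bigl[e^{-i(\mathbf i-\mathbf j)\cdot\boldsymbol\theta}\bigr]_{\mathbf i,\mathbf j=\mathbf 1}^{\mathbf n}$. The matrix $U(\boldsymbol\theta)$ satisfies
\[
\mathbf u^*U(\boldsymbol\theta)\mathbf u=\Bigl|\sum_{\mathbf j=\mathbf 1}^{\mathbf n}u_{\mathbf j}\,e^{i\mathbf j\cdot\boldsymbol\theta}\Bigr|^2,\qquad \boldsymbol\theta\in[-\pi,\pi]^d,\quad \mathbf u=[u_{\mathbf j}]_{\mathbf j=\mathbf 1}^{\mathbf n}\in\mathbb C^{N(\mathbf n)}, \tag{3.15}
\]
hence $U(\boldsymbol\theta)$ is HPSD for every $\boldsymbol\theta\in[-\pi,\pi]^d$. Moreover,
\[
\frac{1}{(2\pi)^d}\int_{[-\pi,\pi]^d}\mathbf u^*U(\boldsymbol\theta)\mathbf u\,d\boldsymbol\theta=\|\mathbf u\|^2,\qquad \mathbf u\in\mathbb C^{N(\mathbf n)}. \tag{3.16}
\]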
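The identities (3.15) and (3.16) are easy to test numerically in the unilevel case. The sketch below assumes NumPy and $d=1$; the random vector, the sample angle and the grid size are arbitrary illustrative choices. The average over an equispaced grid is used in place of the normalized integral, which is exact here because the integrand is a trigonometric polynomial of low degree.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
u = rng.standard_normal(n) + 1j * rng.standard_normal(n)
idx = np.arange(1, n + 1)

def U(theta):
    """U(theta) = [exp(-i (i - j) theta)]_{i,j=1}^n in the case d = 1."""
    return np.exp(-1j * (idx[:, None] - idx[None, :]) * theta)

theta = 0.7
lhs = np.vdot(u, U(theta) @ u)                          # u^* U(theta) u
rhs = abs(np.sum(u * np.exp(1j * idx * theta))) ** 2    # |sum_j u_j e^{i j theta}|^2

# (3.16): the mean of u^* U(theta) u over one period equals ||u||^2.
grid = np.linspace(-np.pi, np.pi, 64, endpoint=False)
mean_val = np.mean([np.vdot(u, U(t) @ u).real for t in grid])
norm_sq = np.vdot(u, u).real
print(abs(lhs - rhs), abs(mean_val - norm_sq))
```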
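Proof If we index the entries of any vector $\mathbf y\in\mathbb C^{N(\mathbf n)}$ by a $d$-index $\mathbf j=\mathbf 1,\ldots,\mathbf n$ and the entries of any matrix $A\in\mathbb C^{N(\mathbf n)\times N(\mathbf n)}$ by a pair of $d$-indices $\mathbf i,\mathbf j=\mathbf 1,\ldots,\mathbf n$ (so that $\mathbf y=[y_{\mathbf j}]_{\mathbf j=\mathbf 1}^{\mathbf n}$ and $A=[a_{\mathbf i\mathbf j}]_{\mathbf i,\mathbf j=\mathbf 1}^{\mathbf n}$), then we have
\[
\mathbf u^*A\mathbf v=\sum_{\mathbf i,\mathbf j=\mathbf 1}^{\mathbf n}a_{\mathbf i\mathbf j}\,\overline{u_{\mathbf i}}\,v_{\mathbf j},\qquad \mathbf u,\mathbf v\in\mathbb C^{N(\mathbf n)},\quad A\in\mathbb C^{N(\mathbf n)\times N(\mathbf n)};
\]
see also Example 2.6. For every $\mathbf u,\mathbf v\in\mathbb C^{N(\mathbf n)}$,
\begin{align*}
\mathbf u^*T_{\mathbf n}(f)\mathbf v
&=\sum_{\mathbf i,\mathbf j=\mathbf 1}^{\mathbf n}f_{\mathbf i-\mathbf j}\,\overline{u_{\mathbf i}}\,v_{\mathbf j}
=\sum_{\mathbf i,\mathbf j=\mathbf 1}^{\mathbf n}\biggl[\frac{1}{(2\pi)^d}\int_{[-\pi,\pi]^d}f(\boldsymbol\theta)\,e^{-i(\mathbf i-\mathbf j)\cdot\boldsymbol\theta}\,d\boldsymbol\theta\biggr]\overline{u_{\mathbf i}}\,v_{\mathbf j}\\
&=\frac{1}{(2\pi)^d}\int_{[-\pi,\pi]^d}f(\boldsymbol\theta)\sum_{\mathbf i,\mathbf j=\mathbf 1}^{\mathbf n}e^{-i(\mathbf i-\mathbf j)\cdot\boldsymbol\theta}\,\overline{u_{\mathbf i}}\,v_{\mathbf j}\,d\boldsymbol\theta\\
&=\frac{1}{(2\pi)^d}\int_{[-\pi,\pi]^d}f(\boldsymbol\theta)\,\mathbf u^*U(\boldsymbol\theta)\mathbf v\,d\boldsymbol\theta,
\end{align*}
where $U(\boldsymbol\theta)$ is defined in the statement of the lemma. This proves (3.14). Equation (3.16) follows from (3.14) by taking $\mathbf v=\mathbf u$ and $f=1$ (recall that $T_{\mathbf n}(1)=I_{N(\mathbf n)}$). Finally,
\[
\mathbf u^*U(\boldsymbol\theta)\mathbf u
=\sum_{\mathbf i,\mathbf j=\mathbf 1}^{\mathbf n}e^{-i(\mathbf i-\mathbf j)\cdot\boldsymbol\theta}\,\overline{u_{\mathbf i}}\,u_{\mathbf j}
=\Bigl(\sum_{\mathbf i=\mathbf 1}^{\mathbf n}\overline{u_{\mathbf i}}\,e^{-i\mathbf i\cdot\boldsymbol\theta}\Bigr)\Bigl(\sum_{\mathbf j=\mathbf 1}^{\mathbf n}u_{\mathbf j}\,e^{i\mathbf j\cdot\boldsymbol\theta}\Bigr)
=\Bigl|\sum_{\mathbf j=\mathbf 1}^{\mathbf n}u_{\mathbf j}\,e^{i\mathbf j\cdot\boldsymbol\theta}\Bigr|^2,
\]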
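which proves (3.15).

Theorem 3.1 Assume that $f\in L^1([-\pi,\pi]^d)$ is real a.e. and let
\[
m_f=\operatorname*{ess\,inf}_{\boldsymbol\theta\in[-\pi,\pi]^d}f(\boldsymbol\theta),\qquad M_f=\operatorname*{ess\,sup}_{\boldsymbol\theta\in[-\pi,\pi]^d}f(\boldsymbol\theta).
\]
Then
\[
\Lambda(T_{\mathbf n}(f))\subseteq[m_f,M_f],\qquad \mathbf n\in\mathbb N^d.
\]
If moreover $m_f<M_f$, then
\[
\Lambda(T_{\mathbf n}(f))\subset(m_f,M_f),\qquad \mathbf n\in\mathbb N^d.
\]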
Note that the case m f = M f is trivial, because f = m f a.e. by [22, Lemma 2.1] and Tn ( f ) = m f I N (n) . Whenever f is not constant a.e., we have m f < M f and the spectrum of the Toeplitz matrices Tn ( f ) is contained in the open interval (m f , M f ).
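As a numerical illustration of Theorem 3.1, the sketch below (assuming NumPy) builds the 2-level Toeplitz matrix of the sample symbol $f(\theta_1,\theta_2)=(2-2\cos\theta_1)+(2-2\cos\theta_2)$, for which $m_f=0$ and $M_f=8$, and prints its extreme eigenvalues. The symbol and the helper names are illustrative assumptions; the assembly relies on the linearity (3.11), on $T_{\mathbf n}(1)=I_{N(\mathbf n)}$, and on the tensor-product identity stated in Lemma 3.3.

```python
import numpy as np

def toeplitz_1d(n, coeffs):
    """T_n(g) = [g_{i-j}]_{i,j=1}^n from a dict {k: g_k} of Fourier coefficients."""
    T = np.zeros((n, n))
    for k, c in coeffs.items():
        T += c * np.eye(n, k=-k)
    return T

# g(t) = 2 - 2cos(t) has Fourier coefficients g_0 = 2 and g_{1} = g_{-1} = -1.
g = {0: 2.0, 1: -1.0, -1: -1.0}

def toeplitz_2d(n1, n2):
    """T_n(f) for f(theta) = g(theta_1) + g(theta_2)."""
    return (np.kron(toeplitz_1d(n1, g), np.eye(n2))
            + np.kron(np.eye(n1), toeplitz_1d(n2, g)))

m_f, M_f = 0.0, 8.0   # essential infimum and supremum of f on [-pi, pi]^2
for n1, n2 in [(3, 4), (8, 8), (20, 15)]:
    eigs = np.linalg.eigvalsh(toeplitz_2d(n1, n2))
    print((n1, n2), eigs.min(), eigs.max())
```

In each case the printed extreme eigenvalues should lie strictly between $m_f$ and $M_f$, in agreement with the open-interval inclusion.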
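Proof By Lemma 3.2, for every $\mathbf u\in\mathbb C^{N(\mathbf n)}$ such that $\|\mathbf u\|=1$ we have
\[
\mathbf u^*T_{\mathbf n}(f)\mathbf u=\frac{1}{(2\pi)^d}\int_{[-\pi,\pi]^d}f(\boldsymbol\theta)\,\mathbf u^*U(\boldsymbol\theta)\mathbf u\,d\boldsymbol\theta,\qquad
\frac{1}{(2\pi)^d}\int_{[-\pi,\pi]^d}\mathbf u^*U(\boldsymbol\theta)\mathbf u\,d\boldsymbol\theta=1.
\]
Since $U(\boldsymbol\theta)$ is HPSD for all $\boldsymbol\theta\in[-\pi,\pi]^d$ and $m_f\le f\le M_f$ a.e. by [22, Lemma 2.1], we obtain
\[
m_f\le\mathbf u^*T_{\mathbf n}(f)\mathbf u\le M_f,
\]
and the inclusion $\Lambda(T_{\mathbf n}(f))\subseteq[m_f,M_f]$ follows from the minimax principle for eigenvalues [22, Theorem 2.8]. Suppose now that $m_f<M_f$. In this case, we show that, for all $\mathbf u\in\mathbb C^{N(\mathbf n)}$ satisfying $\|\mathbf u\|=1$,
\[
m_f<\mathbf u^*T_{\mathbf n}(f)\mathbf u<M_f. \tag{3.17}
\]
Once this is proved, the inclusion $\Lambda(T_{\mathbf n}(f))\subset(m_f,M_f)$ follows again from the minimax principle for eigenvalues. We only prove the left inequality in (3.17), because the proof of the right inequality is completely analogous. By contradiction, suppose there exists $\hat{\mathbf u}\in\mathbb C^{N(\mathbf n)}$ such that $\|\hat{\mathbf u}\|=1$ and $\hat{\mathbf u}^*T_{\mathbf n}(f)\hat{\mathbf u}=m_f$. By Lemma 3.2,
\begin{align*}
0=\hat{\mathbf u}^*T_{\mathbf n}(f)\hat{\mathbf u}-m_f
&=\frac{1}{(2\pi)^d}\int_{[-\pi,\pi]^d}(f(\boldsymbol\theta)-m_f)\,\hat{\mathbf u}^*U(\boldsymbol\theta)\hat{\mathbf u}\,d\boldsymbol\theta\\
&=\frac{1}{(2\pi)^d}\int_{[-\pi,\pi]^d}(f(\boldsymbol\theta)-m_f)\,\Bigl|\sum_{\mathbf j=\mathbf 1}^{\mathbf n}\hat u_{\mathbf j}\,e^{i\mathbf j\cdot\boldsymbol\theta}\Bigr|^2d\boldsymbol\theta. \tag{3.18}
\end{align*}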
Considering that the integrand in (3.18) is nonnegative a.e. (because $f\ge m_f$ a.e.) and that $\bigl|\sum_{\mathbf j=\mathbf 1}^{\mathbf n}\hat u_{\mathbf j}\,e^{i\mathbf j\cdot\boldsymbol\theta}\bigr|^2>0$ a.e. (Lemma 2.2), it follows from (3.18) and the vanishing property [22, Eq. (2.1)] that $f-m_f=0$ a.e. (a contradiction to the hypothesis $m_f<M_f$).

Two important corollaries of Theorem 3.1 are reported below. The first follows from Theorem 3.1 and the observation that every nonnegative function $f$ which does not vanish a.e. satisfies $m_f\ge0$ and $M_f>0$. The second follows from Theorem 3.1 and the linearity of the map $T_{\mathbf n}(\cdot)$; see (3.11). Throughout the book, if $X,Y$ are square matrices of the same size, the notation $X\ge Y$ (resp., $X>Y$) means that $X,Y$ are Hermitian and $X-Y$ is HPSD (resp., HPD).

Corollary 3.1 Assume that $f\in L^1([-\pi,\pi]^d)$ is nonnegative and not a.e. equal to 0. Then $T_{\mathbf n}(f)>O_{N(\mathbf n)}$ for all $\mathbf n\in\mathbb N^d$.

Corollary 3.2 For any fixed $\mathbf n\in\mathbb N^d$, the map $T_{\mathbf n}(\cdot):L^1([-\pi,\pi]^d)\to\mathbb C^{N(\mathbf n)\times N(\mathbf n)}$ is a Linear Positive Operator (LPO), i.e., it is linear and satisfies $T_{\mathbf n}(f)\ge O_{N(\mathbf n)}$ whenever $f\ge0$ a.e. In particular, $T_{\mathbf n}(\cdot)$ is monotone, i.e.,
\[
f\ge g\ \text{a.e.}\quad\Longrightarrow\quad T_{\mathbf n}(f)\ge T_{\mathbf n}(g).
\]
The last result of this section provides an important relation between tensor products and multilevel Toeplitz matrices. Recall that if $f_1,\ldots,f_d\in L^1([-\pi,\pi])$ then $f_1\otimes\cdots\otimes f_d\in L^1([-\pi,\pi]^d)$ by Fubini's theorem.

Lemma 3.3 Let $f_1,\ldots,f_d\in L^1([-\pi,\pi])$ and $\mathbf n\in\mathbb N^d$. Then
\[
T_{\mathbf n}(f_1\otimes\cdots\otimes f_d)=T_{n_1}(f_1)\otimes\cdots\otimes T_{n_d}(f_d). \tag{3.19}
\]
Proof The proof is very simple if we use the crucial property (2.29). By Fubini's theorem, the Fourier coefficients of $f_1\otimes\cdots\otimes f_d$ are given by
\[
(f_1\otimes\cdots\otimes f_d)_{\mathbf k}=(f_1)_{k_1}\cdots(f_d)_{k_d},\qquad \mathbf k\in\mathbb Z^d.
\]
Hence, for all $\mathbf i,\mathbf j=\mathbf 1,\ldots,\mathbf n$,
\begin{align*}
[T_{n_1}(f_1)\otimes\cdots\otimes T_{n_d}(f_d)]_{\mathbf i\mathbf j}
&=[T_{n_1}(f_1)]_{i_1j_1}\cdots[T_{n_d}(f_d)]_{i_dj_d}
=(f_1)_{i_1-j_1}\cdots(f_d)_{i_d-j_d}\\
&=(f_1\otimes\cdots\otimes f_d)_{\mathbf i-\mathbf j}
=[T_{\mathbf n}(f_1\otimes\cdots\otimes f_d)]_{\mathbf i\mathbf j}.
\end{align*}
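The identity (3.19) can be verified entrywise on a small example. The following sketch assumes NumPy and $d=2$; the coefficient dictionaries `f1`, `f2` are arbitrary illustrative choices. The "direct" matrix is assembled from the definition $[f_{\mathbf i-\mathbf j}]$ with the multi-indices in lexicographic order, and is then compared with the Kronecker product of the two unilevel Toeplitz matrices.

```python
import numpy as np
from itertools import product

def toeplitz_1d(n, c):
    """T_n(f) = [f_{i-j}]_{i,j=1}^n from a dict {k: f_k} of Fourier coefficients."""
    return np.array([[c.get(i - j, 0.0) for j in range(n)] for i in range(n)],
                    dtype=complex)

f1 = {-1: 1j, 0: 2.0, 1: -1j}    # Fourier coefficients of f_1 (sample values)
f2 = {0: 1.0, 2: 0.5}            # Fourier coefficients of f_2 (sample values)
n1, n2 = 3, 4

# Multi-indices (i1, i2) in lexicographic order: i1 varies slowest.
multi = list(product(range(n1), range(n2)))
T_direct = np.array([[f1.get(i1 - j1, 0.0) * f2.get(i2 - j2, 0.0)
                      for (j1, j2) in multi] for (i1, i2) in multi], dtype=complex)

T_kron = np.kron(toeplitz_1d(n1, f1), toeplitz_1d(n2, f2))
print(np.allclose(T_direct, T_kron))
```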
3.3 Schatten p-Norms of Multilevel Toeplitz Matrices

Important inequalities involving multilevel Toeplitz matrices and Schatten $p$-norms are provided in Theorem 3.2. They originally appeared in [37, Corollary 4.2] and were generalized in [34, Corollary 3.5]. To prove Theorem 3.2, we need a couple of intermediate lemmas, which are interesting also in themselves. They combine results from [37, Theorems 3.2 and 3.3], which hold in the more general context of "LPOs and unitarily invariant norms". We recall that the map $T_{\mathbf n}(\cdot)$ is an LPO (Corollary 3.2) and that the Schatten $p$-norms are unitarily invariant.

Lemma 3.4 For every $f\in L^1([-\pi,\pi]^d)$ and every $\mathbf n\in\mathbb N^d$,
\[
|\mathbf u^*T_{\mathbf n}(f)\mathbf v|\le\sqrt{\mathbf u^*T_{\mathbf n}(|f|)\mathbf u}\,\sqrt{\mathbf v^*T_{\mathbf n}(|f|)\mathbf v}
\le\frac12\,\mathbf u^*T_{\mathbf n}(|f|)\mathbf u+\frac12\,\mathbf v^*T_{\mathbf n}(|f|)\mathbf v,\qquad \mathbf u,\mathbf v\in\mathbb C^{N(\mathbf n)}.
\]
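Proof The proof is based on Lemma 3.2 and on the Cauchy–Schwarz inequality applied first in $\mathbb C^{N(\mathbf n)}$ and then in $L^2([-\pi,\pi]^d)$. Using these ingredients, we obtain
\begin{align*}
|\mathbf u^*T_{\mathbf n}(f)\mathbf v|^2
&=\biggl|\frac{1}{(2\pi)^d}\int_{[-\pi,\pi]^d}f(\boldsymbol\theta)\,\mathbf u^*U(\boldsymbol\theta)\mathbf v\,d\boldsymbol\theta\biggr|^2\\
&\le\biggl(\frac{1}{(2\pi)^d}\int_{[-\pi,\pi]^d}|f(\boldsymbol\theta)|\,|\mathbf u^*U(\boldsymbol\theta)\mathbf v|\,d\boldsymbol\theta\biggr)^2\\
&=\biggl(\frac{1}{(2\pi)^d}\int_{[-\pi,\pi]^d}|f(\boldsymbol\theta)|\,\bigl|(U(\boldsymbol\theta)^{1/2}\mathbf u)^*(U(\boldsymbol\theta)^{1/2}\mathbf v)\bigr|\,d\boldsymbol\theta\biggr)^2\\
&\le\biggl(\frac{1}{(2\pi)^d}\int_{[-\pi,\pi]^d}|f(\boldsymbol\theta)|\,\sqrt{\mathbf u^*U(\boldsymbol\theta)\mathbf u}\,\sqrt{\mathbf v^*U(\boldsymbol\theta)\mathbf v}\,d\boldsymbol\theta\biggr)^2\\
&=\biggl(\frac{1}{(2\pi)^d}\int_{[-\pi,\pi]^d}\sqrt{|f(\boldsymbol\theta)|\,\mathbf u^*U(\boldsymbol\theta)\mathbf u}\,\sqrt{|f(\boldsymbol\theta)|\,\mathbf v^*U(\boldsymbol\theta)\mathbf v}\,d\boldsymbol\theta\biggr)^2\\
&\le\biggl(\frac{1}{(2\pi)^d}\int_{[-\pi,\pi]^d}|f(\boldsymbol\theta)|\,\mathbf u^*U(\boldsymbol\theta)\mathbf u\,d\boldsymbol\theta\biggr)\biggl(\frac{1}{(2\pi)^d}\int_{[-\pi,\pi]^d}|f(\boldsymbol\theta)|\,\mathbf v^*U(\boldsymbol\theta)\mathbf v\,d\boldsymbol\theta\biggr)\\
&=\bigl(\mathbf u^*T_{\mathbf n}(|f|)\mathbf u\bigr)\bigl(\mathbf v^*T_{\mathbf n}(|f|)\mathbf v\bigr).
\end{align*}
This proves the first inequality of the lemma. The second one is just the geometric–arithmetic mean inequality; see, e.g., [32, p. 63].

Lemma 3.5 Let $f\in L^1([-\pi,\pi]^d)$, $\mathbf n\in\mathbb N^d$ and $1\le p\le\infty$. Then
\[
\|T_{\mathbf n}(f)\|_p\le\|T_{\mathbf n}(|f|)\|_p. \tag{3.20}
\]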
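Proof By Lemma 3.4, for any pair of orthonormal bases $\{\mathbf u_i\}_{i=1}^{N(\mathbf n)}$, $\{\mathbf v_i\}_{i=1}^{N(\mathbf n)}$ of $\mathbb C^{N(\mathbf n)}$ we have
\begin{align*}
\Bigl\|\bigl(\mathbf u_i^*T_{\mathbf n}(f)\mathbf v_i\bigr)_{i=1}^{N(\mathbf n)}\Bigr\|_p
&\le\Bigl\|\Bigl(\tfrac12\,\mathbf u_i^*T_{\mathbf n}(|f|)\mathbf u_i+\tfrac12\,\mathbf v_i^*T_{\mathbf n}(|f|)\mathbf v_i\Bigr)_{i=1}^{N(\mathbf n)}\Bigr\|_p\\
&\le\frac12\Bigl\|\bigl(\mathbf u_i^*T_{\mathbf n}(|f|)\mathbf u_i\bigr)_{i=1}^{N(\mathbf n)}\Bigr\|_p+\frac12\Bigl\|\bigl(\mathbf v_i^*T_{\mathbf n}(|f|)\mathbf v_i\bigr)_{i=1}^{N(\mathbf n)}\Bigr\|_p.
\end{align*}
Passing to the supremum over all pairs of orthonormal bases $\{\mathbf u_i\}_{i=1}^{N(\mathbf n)}$, $\{\mathbf v_i\}_{i=1}^{N(\mathbf n)}$, and taking into account that $T_{\mathbf n}(|f|)$ is HPSD by Corollary 3.2, from [22, Lemma 2.11] we obtain (3.20).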
In the statement of Theorem 3.2 and throughout the book, we use the natural convention $C/\infty=0$ for all numbers $C$.

Theorem 3.2 Let $f\in L^p([-\pi,\pi]^d)$, $\mathbf n\in\mathbb N^d$ and $1\le p\le\infty$. Then
\[
\|T_{\mathbf n}(f)\|_p\le\frac{N(\mathbf n)^{1/p}}{(2\pi)^{d/p}}\,\|f\|_{L^p}. \tag{3.21}
\]
In particular, for $p=\infty$ we have
\[
\|T_{\mathbf n}(f)\|=\|T_{\mathbf n}(f)\|_\infty\le\|f\|_{L^\infty}. \tag{3.22}
\]
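The bound (3.21) can be observed numerically. The sketch below assumes NumPy, $d=1$, and the sample symbol $f(\theta)=2-2\cos\theta$; the helper names and the quadrature grid are illustrative assumptions. Schatten norms are computed from the singular values, and the $L^p$ norms are approximated by averages over an equispaced grid. For $p=1$ and a nonnegative symbol the two sides coincide up to rounding, which is consistent with the trace computation in the proof of the theorem.

```python
import numpy as np

def toeplitz_1d(n, coeffs):
    """T_n(f) = [f_{i-j}]_{i,j=1}^n from a dict {k: f_k} of Fourier coefficients."""
    T = np.zeros((n, n))
    for k, c in coeffs.items():
        T += c * np.eye(n, k=-k)
    return T

def schatten_norm(A, p):
    """Schatten p-norm: the l^p norm of the vector of singular values."""
    s = np.linalg.svd(A, compute_uv=False)
    return s.max() if np.isinf(p) else np.sum(s ** p) ** (1.0 / p)

def lp_norm(f, p, points=4096):
    """Approximate L^p norm on [-pi, pi] via an equispaced grid (d = 1)."""
    theta = np.linspace(-np.pi, np.pi, points, endpoint=False)
    vals = np.abs(f(theta))
    if np.isinf(p):
        return vals.max()
    return (2 * np.pi * np.mean(vals ** p)) ** (1.0 / p)

def rhs_bound(n, f, p):
    """Right-hand side of (3.21) for d = 1, with the convention C/inf = 0."""
    if np.isinf(p):
        return lp_norm(f, p)
    return (n / (2 * np.pi)) ** (1.0 / p) * lp_norm(f, p)

f = lambda t: 2.0 - 2.0 * np.cos(t)
coeffs = {0: 2.0, 1: -1.0, -1: -1.0}
n = 10
T = toeplitz_1d(n, coeffs)
for p in (1, 2, np.inf):
    print(p, schatten_norm(T, p), rhs_bound(n, f, p))
```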
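Proof In view of Lemma 3.5, it suffices to prove (3.21) in the case where $f\ge0$. In this case, we know from Corollary 3.2 that $T_{\mathbf n}(f)$ is HPSD. In particular, $\|T_{\mathbf n}(f)\|=\lambda_{\max}(T_{\mathbf n}(f))$ and (3.22) follows directly from Theorem 3.1. Suppose now that $1\le p<\infty$. By Lemma 3.2 and the inequality in [22, Eq. (2.3)], for every $\mathbf u\in\mathbb C^{N(\mathbf n)}$ such that $\|\mathbf u\|=1$ we have
\begin{align*}
\bigl(\mathbf u^*T_{\mathbf n}(f)\mathbf u\bigr)^p
&=\biggl(\frac{1}{(2\pi)^d}\int_{[-\pi,\pi]^d}f(\boldsymbol\theta)\,\mathbf u^*U(\boldsymbol\theta)\mathbf u\,d\boldsymbol\theta\biggr)^p\\
&\le\frac{1}{(2\pi)^d}\int_{[-\pi,\pi]^d}f(\boldsymbol\theta)^p\,\mathbf u^*U(\boldsymbol\theta)\mathbf u\,d\boldsymbol\theta=\mathbf u^*T_{\mathbf n}(f^p)\mathbf u.
\end{align*}
Hence, by [22, Lemma 2.11],
\begin{align*}
\|T_{\mathbf n}(f)\|_p^p
&=\sup_{\{\mathbf u_i\}_{i=1}^{N(\mathbf n)}\ \text{orthonormal basis of}\ \mathbb C^{N(\mathbf n)}}\Bigl\|\bigl(\mathbf u_i^*T_{\mathbf n}(f)\mathbf u_i\bigr)_{i=1}^{N(\mathbf n)}\Bigr\|_p^p\\
&\le\sup_{\{\mathbf u_i\}_{i=1}^{N(\mathbf n)}\ \text{orthonormal basis of}\ \mathbb C^{N(\mathbf n)}}\Bigl\|\bigl(\mathbf u_i^*T_{\mathbf n}(f^p)\mathbf u_i\bigr)_{i=1}^{N(\mathbf n)}\Bigr\|_1=\|T_{\mathbf n}(f^p)\|_1.
\end{align*}
To conclude, we observe that, since $T_{\mathbf n}(f^p)$ is HPSD, its singular values coincide with its eigenvalues. Thus,
\[
\|T_{\mathbf n}(f^p)\|_1=\operatorname{trace}(T_{\mathbf n}(f^p))=N(\mathbf n)\,(f^p)_{\mathbf 0}
=\frac{N(\mathbf n)}{(2\pi)^d}\int_{[-\pi,\pi]^d}f(\boldsymbol\theta)^p\,d\boldsymbol\theta=\frac{N(\mathbf n)}{(2\pi)^d}\,\|f\|_{L^p}^p.
\]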
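The last result of this section shows that $\|T_{\mathbf n}(f)T_{\mathbf n}(g)-T_{\mathbf n}(fg)\|_1=o(N(\mathbf n))$ as $\mathbf n\to\infty$ in the case where $f\in L^p([-\pi,\pi]^d)$ and $g\in L^q([-\pi,\pi]^d)$, with $p,q$ conjugate exponents. Note that in this case $f,g,fg\in L^1([-\pi,\pi]^d)$ by Hölder's inequality, so the matrices $T_{\mathbf n}(f)$, $T_{\mathbf n}(g)$, $T_{\mathbf n}(fg)$ are defined.

Lemma 3.6 Let $f,g\in L^1([-\pi,\pi]^d)$, where $f(\boldsymbol\theta)=\sum_{\mathbf j=-\mathbf r}^{\mathbf r}f_{\mathbf j}\,e^{i\mathbf j\cdot\boldsymbol\theta}$ is a $d$-variate trigonometric polynomial, and let $\mathbf n\in\mathbb N^d$. Then, the rows of $T_{\mathbf n}(f)T_{\mathbf n}(g)-T_{\mathbf n}(fg)$ corresponding to indices $\mathbf k\in\{\mathbf r+\mathbf 1,\ldots,\mathbf n-\mathbf r\}$ are zero. In particular, $\operatorname{rank}(T_{\mathbf n}(f)T_{\mathbf n}(g)-T_{\mathbf n}(fg))\le N(\mathbf n)-N(\mathbf n-2\mathbf r)$.

Proof The Fourier coefficients of $fg$ are given by
\begin{align*}
(fg)_{\mathbf k}&=\frac{1}{(2\pi)^d}\int_{[-\pi,\pi]^d}f(\boldsymbol\theta)g(\boldsymbol\theta)\,e^{-i\mathbf k\cdot\boldsymbol\theta}\,d\boldsymbol\theta\\
&=\sum_{\mathbf j=-\mathbf r}^{\mathbf r}f_{\mathbf j}\,\frac{1}{(2\pi)^d}\int_{[-\pi,\pi]^d}g(\boldsymbol\theta)\,e^{-i(\mathbf k-\mathbf j)\cdot\boldsymbol\theta}\,d\boldsymbol\theta
=\sum_{\mathbf j=-\mathbf r}^{\mathbf r}f_{\mathbf j}\,g_{\mathbf k-\mathbf j},\qquad \mathbf k\in\mathbb Z^d.
\end{align*}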
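For all $\mathbf k,\boldsymbol\ell=\mathbf 1,\ldots,\mathbf n$,
\[
(T_{\mathbf n}(fg))_{\mathbf k\boldsymbol\ell}=(fg)_{\mathbf k-\boldsymbol\ell}=\sum_{\mathbf j=-\mathbf r}^{\mathbf r}f_{\mathbf j}\,g_{\mathbf k-\boldsymbol\ell-\mathbf j} \tag{3.23}
\]
and
\[
(T_{\mathbf n}(f)T_{\mathbf n}(g))_{\mathbf k\boldsymbol\ell}=\sum_{\mathbf s=\mathbf 1}^{\mathbf n}(T_{\mathbf n}(f))_{\mathbf k\mathbf s}(T_{\mathbf n}(g))_{\mathbf s\boldsymbol\ell}
=\sum_{\mathbf s=\mathbf 1}^{\mathbf n}f_{\mathbf k-\mathbf s}\,g_{\mathbf s-\boldsymbol\ell}
=\sum_{\mathbf j=\mathbf k-\mathbf n}^{\mathbf k-\mathbf 1}f_{\mathbf j}\,g_{\mathbf k-\mathbf j-\boldsymbol\ell}. \tag{3.24}
\]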
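Considering that $f_{\mathbf j}=0$ for $\mathbf j\notin\{-\mathbf r,\ldots,\mathbf r\}$, it is clear that (3.23) and (3.24) coincide for $\mathbf r+\mathbf 1\le\mathbf k\le\mathbf n-\mathbf r$, because in this case we have $\mathbf k-\mathbf n\le-\mathbf r$ and $\mathbf k-\mathbf 1\ge\mathbf r$.

Theorem 3.3 Let $f\in L^p([-\pi,\pi]^d)$ and $g\in L^q([-\pi,\pi]^d)$, where $1\le p,q\le\infty$ are conjugate exponents. Then
\[
\lim_{\mathbf n\to\infty}\frac{\|T_{\mathbf n}(f)T_{\mathbf n}(g)-T_{\mathbf n}(fg)\|_1}{N(\mathbf n)}=0. \tag{3.25}
\]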
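A small unilevel experiment makes Lemma 3.6 and the limit (3.25) concrete. The sketch below assumes NumPy and $d=1$, with both symbols taken to be trigonometric polynomials so that the Fourier coefficients of $fg$ can be obtained by discrete convolution; the coefficient values and helper names are illustrative. For $d=1$ the rank bound $N(n)-N(n-2r)$ reduces to $2r$.

```python
import numpy as np

def toeplitz_1d(n, coeffs):
    """T_n(f) = [f_{i-j}]_{i,j=1}^n from a dict {k: f_k} of Fourier coefficients."""
    T = np.zeros((n, n))
    for k, c in coeffs.items():
        T += c * np.eye(n, k=-k)
    return T

def product_coeffs(f, g):
    """Fourier coefficients of the product: (fg)_k = sum_j f_j g_{k-j}."""
    out = {}
    for j, fj in f.items():
        for l, gl in g.items():
            out[j + l] = out.get(j + l, 0.0) + fj * gl
    return out

f = {-1: 1.0, 0: 2.0, 1: -1.0}   # trigonometric polynomial of degree r = 1
g = {-2: 0.5, 0: 1.0, 1: 3.0}
r = 1

def defect(n):
    """E_n = T_n(f) T_n(g) - T_n(fg)."""
    return toeplitz_1d(n, f) @ toeplitz_1d(n, g) - toeplitz_1d(n, product_coeffs(f, g))

def trace_norm(A):
    return np.linalg.svd(A, compute_uv=False).sum()

for n in (8, 16, 32):
    E = defect(n)
    print(n, np.linalg.matrix_rank(E), trace_norm(E) / n)
```

Only the first $r$ and last $r$ rows of the defect are nonzero, so its trace norm stays bounded while $N(n)=n$ grows, and the printed ratio decreases.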
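Proof We first prove (3.25) under the assumption that $f,g\in L^\infty([-\pi,\pi]^d)$. Let $f_m(\boldsymbol\theta)=\sum_{\mathbf j=-\mathbf r_m}^{\mathbf r_m}(f_m)_{\mathbf j}\,e^{i\mathbf j\cdot\boldsymbol\theta}$ be a sequence of $d$-variate trigonometric polynomials such that
\[
\|f_m\|_\infty\le\|f\|_{L^\infty},\qquad \|f-f_m\|_{L^1}\to0;
\]
note that such a sequence exists by Lemma 2.3. For every $m$ and every $\mathbf n\in\mathbb N^d$,
\begin{align*}
\|T_{\mathbf n}(f)T_{\mathbf n}(g)-T_{\mathbf n}(fg)\|_1
&\le\|T_{\mathbf n}(f)T_{\mathbf n}(g)-T_{\mathbf n}(f_m)T_{\mathbf n}(g)\|_1\\
&\quad+\|T_{\mathbf n}(f_m)T_{\mathbf n}(g)-T_{\mathbf n}(f_mg)\|_1+\|T_{\mathbf n}(f_mg)-T_{\mathbf n}(fg)\|_1. \tag{3.26}
\end{align*}
Using the linearity of $T_{\mathbf n}(\cdot)$, the Hölder-type inequality (2.18) and Theorem 3.2, we can bound the first and last term in the right-hand side of (3.26) as follows:
\begin{align*}
\|T_{\mathbf n}(f)T_{\mathbf n}(g)-T_{\mathbf n}(f_m)T_{\mathbf n}(g)\|_1&\le\|T_{\mathbf n}(f-f_m)\|_1\,\|T_{\mathbf n}(g)\|\\
&\le N(\mathbf n)\,\|f-f_m\|_{L^1}\|g\|_{L^\infty}, \tag{3.27}\\
\|T_{\mathbf n}(f_mg)-T_{\mathbf n}(fg)\|_1&\le N(\mathbf n)\,\|f_mg-fg\|_{L^1}\\
&\le N(\mathbf n)\,\|f_m-f\|_{L^1}\|g\|_{L^\infty}. \tag{3.28}
\end{align*}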
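To bound the second term in the right-hand side of (3.26), we use Lemma 3.6 and the trace-norm inequality (2.19) in combination with Theorem 3.2. We have
\begin{align*}
\|T_{\mathbf n}(f_m)T_{\mathbf n}(g)-T_{\mathbf n}(f_mg)\|_1
&\le\operatorname{rank}(T_{\mathbf n}(f_m)T_{\mathbf n}(g)-T_{\mathbf n}(f_mg))\,\|T_{\mathbf n}(f_m)T_{\mathbf n}(g)-T_{\mathbf n}(f_mg)\|\\
&\le(N(\mathbf n)-N(\mathbf n-2\mathbf r_m))\bigl(\|T_{\mathbf n}(f_m)\|\,\|T_{\mathbf n}(g)\|+\|T_{\mathbf n}(f_mg)\|\bigr)\\
&\le(N(\mathbf n)-N(\mathbf n-2\mathbf r_m))\bigl(\|f_m\|_\infty\|g\|_{L^\infty}+\|f_mg\|_{L^\infty}\bigr)\\
&\le2(N(\mathbf n)-N(\mathbf n-2\mathbf r_m))\,\|f\|_{L^\infty}\|g\|_{L^\infty}. \tag{3.29}
\end{align*}
Putting together (3.26) and (3.27)–(3.29), we get
\[
\frac{\|T_{\mathbf n}(f)T_{\mathbf n}(g)-T_{\mathbf n}(fg)\|_1}{N(\mathbf n)}\le2\,\|f-f_m\|_{L^1}\|g\|_{L^\infty}+\frac{2(N(\mathbf n)-N(\mathbf n-2\mathbf r_m))\,\|f\|_{L^\infty}\|g\|_{L^\infty}}{N(\mathbf n)}.
\]
Passing to the limit as $\mathbf n\to\infty$, we obtain
\[
\limsup_{\mathbf n\to\infty}\frac{\|T_{\mathbf n}(f)T_{\mathbf n}(g)-T_{\mathbf n}(fg)\|_1}{N(\mathbf n)}\le2\,\|f-f_m\|_{L^1}\|g\|_{L^\infty}.
\]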
Passing to the limit as $m\to\infty$, we obtain (3.25). This concludes the proof of (3.25) in the case where $f,g\in L^\infty([-\pi,\pi]^d)$. Suppose now that $f\in L^p([-\pi,\pi]^d)$ and $g\in L^q([-\pi,\pi]^d)$, with $1\le p,q\le\infty$ conjugate exponents. Take two sequences $\{f_m\}_m$ and $\{g_m\}_m$ such that $f_m,g_m\in L^\infty([-\pi,\pi]^d)$ for all $m$, $f_m\to f$ in $L^p([-\pi,\pi]^d)$ and $g_m\to g$ in $L^q([-\pi,\pi]^d)$; for example, one can choose $f_m=f\chi_{\{|f|\le m\}}$ and $g_m=g\chi_{\{|g|\le m\}}$. By the linearity of $T_{\mathbf n}(\cdot)$, the Hölder-type inequality (2.18) and Theorem 3.2, for every $m$ and every $\mathbf n\in\mathbb N^d$ we have
\begin{align*}
\|T_{\mathbf n}(f)T_{\mathbf n}(g)-T_{\mathbf n}(fg)\|_1
&\le\|T_{\mathbf n}(f-f_m)T_{\mathbf n}(g)\|_1+\|T_{\mathbf n}(f_m)T_{\mathbf n}(g-g_m)\|_1\\
&\quad+\|T_{\mathbf n}(f_m)T_{\mathbf n}(g_m)-T_{\mathbf n}(f_mg_m)\|_1+\|T_{\mathbf n}(f_mg_m-fg)\|_1\\
&\le N(\mathbf n)^{1/p}\|f-f_m\|_{L^p}\,N(\mathbf n)^{1/q}\|g\|_{L^q}+N(\mathbf n)^{1/p}\|f_m\|_{L^p}\,N(\mathbf n)^{1/q}\|g-g_m\|_{L^q}\\
&\quad+\|T_{\mathbf n}(f_m)T_{\mathbf n}(g_m)-T_{\mathbf n}(f_mg_m)\|_1+N(\mathbf n)\,\|f_mg_m-fg\|_{L^1}\\
&\le N(\mathbf n)\biggl[\|f-f_m\|_{L^p}\|g\|_{L^q}+\sup_i\|f_i\|_{L^p}\,\|g-g_m\|_{L^q}\\
&\qquad\qquad+\frac{\|T_{\mathbf n}(f_m)T_{\mathbf n}(g_m)-T_{\mathbf n}(f_mg_m)\|_1}{N(\mathbf n)}+\|f_mg_m-fg\|_{L^1}\biggr]. \tag{3.30}
\end{align*}
Note that $\sup_i\|f_i\|_{L^p}<\infty$, because $f_i\to f$ in $L^p([-\pi,\pi]^d)$ and hence $\|f_i\|_{L^p}\to\|f\|_{L^p}$. Since $f_m,g_m\in L^\infty([-\pi,\pi]^d)$, by the first part of the proof we have
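\[
\lim_{\mathbf n\to\infty}\frac{\|T_{\mathbf n}(f_m)T_{\mathbf n}(g_m)-T_{\mathbf n}(f_mg_m)\|_1}{N(\mathbf n)}=0.
\]
Dividing both sides of (3.30) by $N(\mathbf n)$ and passing to the limit as $\mathbf n\to\infty$, we obtain
\[
\limsup_{\mathbf n\to\infty}\frac{\|T_{\mathbf n}(f)T_{\mathbf n}(g)-T_{\mathbf n}(fg)\|_1}{N(\mathbf n)}
\le\|f-f_m\|_{L^p}\|g\|_{L^q}+\sup_i\|f_i\|_{L^p}\,\|g-g_m\|_{L^q}+\|f_mg_m-fg\|_{L^1}. \tag{3.31}
\]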
This relation holds for every $m$. As $m\to\infty$, $f_m\to f$ in $L^p([-\pi,\pi]^d)$ and $g_m\to g$ in $L^q([-\pi,\pi]^d)$ by construction. Moreover, $f_mg_m\to fg$ in $L^1([-\pi,\pi]^d)$ by Hölder's inequality, since
\begin{align*}
\|fg-f_mg_m\|_{L^1}&\le\|(f-f_m)g\|_{L^1}+\|f_m(g-g_m)\|_{L^1}\\
&\le\|f-f_m\|_{L^p}\|g\|_{L^q}+\|f_m\|_{L^p}\|g-g_m\|_{L^q}\\
&\le\|f-f_m\|_{L^p}\|g\|_{L^q}+\sup_i\|f_i\|_{L^p}\,\|g-g_m\|_{L^q}.
\end{align*}
Passing to the limit as $m\to\infty$ in (3.31), we obtain (3.25).
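3.4 Multilevel Circulant Matrices

Given a $d$-index $\mathbf n\in\mathbb N^d$, a matrix of the form
\[
\bigl[a_{(\mathbf i-\mathbf j)\bmod\mathbf n}\bigr]_{\mathbf i,\mathbf j=\mathbf 1}^{\mathbf n}\in\mathbb C^{N(\mathbf n)\times N(\mathbf n)} \tag{3.32}
\]
is called a multilevel circulant matrix (or, more precisely, a $d$-level circulant matrix). Since the $(\mathbf i,\mathbf j)$ entry $a_{(\mathbf i-\mathbf j)\bmod\mathbf n}$ depends only on the difference $\mathbf i-\mathbf j$, it is clear that any $d$-level circulant matrix is in particular a $d$-level Toeplitz matrix. In the case $d=1$, the matrix (3.32) becomes
\[
\bigl[a_{(i-j)\bmod n}\bigr]_{i,j=1}^{n}=
\begin{bmatrix}
a_0 & a_{n-1} & a_{n-2} & \cdots & \cdots & a_1\\
a_1 & \ddots & \ddots & \ddots & & \vdots\\
a_2 & \ddots & \ddots & \ddots & \ddots & \vdots\\
\vdots & \ddots & \ddots & \ddots & \ddots & a_{n-2}\\
\vdots & & \ddots & \ddots & \ddots & a_{n-1}\\
a_{n-1} & \cdots & \cdots & a_2 & a_1 & a_0
\end{bmatrix}, \tag{3.33}
\]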
which is the expression of the generic unilevel circulant matrix. Unilevel circulant matrices are nothing else than classical circulant matrices, which have been the subject of [22, Sect. 6.4]. In the case $d=2$, the matrix (3.32) can be written as
\begin{align*}
\bigl[a_{(\mathbf i-\mathbf j)\bmod\mathbf n}\bigr]_{\mathbf i,\mathbf j=\mathbf 1}^{\mathbf n}
&=\Bigl[\bigl[a_{(i_1-j_1)\bmod n_1,\,(i_2-j_2)\bmod n_2}\bigr]_{i_2,j_2=1}^{n_2}\Bigr]_{i_1,j_1=1}^{n_1}
=\bigl[a_{(i_1-j_1)\bmod n_1}\bigr]_{i_1,j_1=1}^{n_1}\\
&=\begin{bmatrix}
a_0 & a_{n_1-1} & a_{n_1-2} & \cdots & \cdots & a_1\\
a_1 & \ddots & \ddots & \ddots & & \vdots\\
a_2 & \ddots & \ddots & \ddots & \ddots & \vdots\\
\vdots & \ddots & \ddots & \ddots & \ddots & a_{n_1-2}\\
\vdots & & \ddots & \ddots & \ddots & a_{n_1-1}\\
a_{n_1-1} & \cdots & \cdots & a_2 & a_1 & a_0
\end{bmatrix}, \tag{3.34}
\end{align*}
where, for all $k=0,\ldots,n_1-1$,
\[
a_k=\bigl[a_{k,(i_2-j_2)\bmod n_2}\bigr]_{i_2,j_2=1}^{n_2}=
\begin{bmatrix}
a_{k,0} & a_{k,n_2-1} & a_{k,n_2-2} & \cdots & \cdots & a_{k,1}\\
a_{k,1} & \ddots & \ddots & \ddots & & \vdots\\
a_{k,2} & \ddots & \ddots & \ddots & \ddots & \vdots\\
\vdots & \ddots & \ddots & \ddots & \ddots & a_{k,n_2-2}\\
\vdots & & \ddots & \ddots & \ddots & a_{k,n_2-1}\\
a_{k,n_2-1} & \cdots & \cdots & a_{k,2} & a_{k,1} & a_{k,0}
\end{bmatrix}. \tag{3.35}
\]
A matrix of this form, which we call a 2-level circulant matrix, is also known as a block circulant matrix with circulant blocks, or BCCB matrix. In general, we can say that a $d$-level circulant matrix is a block circulant matrix with $(d-1)$-level circulant blocks. Let $C_n$ be the $n\times n$ matrix whose $(i,j)$ entry equals 1 if $(i-j)\bmod n=1$ and 0 otherwise:
\[
C_n=\begin{bmatrix}
0 & & & & 1\\
1 & \ddots & & & \\
 & \ddots & \ddots & & \\
 & & \ddots & \ddots & \\
 & & & 1 & 0
\end{bmatrix}. \tag{3.36}
\]
The matrix $C_n$ is called the generator of classical circulant matrices of order $n$. This name is due to the fact that the powers of $C_n$ are
\[
C_n^2=\begin{bmatrix}
0 & & & & 1 & 0\\
0 & \ddots & & & & 1\\
1 & \ddots & \ddots & & & \\
 & \ddots & \ddots & \ddots & & \\
 & & \ddots & \ddots & \ddots & \\
 & & & 1 & 0 & 0
\end{bmatrix},\quad
C_n^3=\begin{bmatrix}
0 & & & & 1 & 0 & 0\\
0 & \ddots & & & & 1 & 0\\
0 & \ddots & \ddots & & & & 1\\
1 & \ddots & \ddots & \ddots & & & \\
 & \ddots & \ddots & \ddots & \ddots & & \\
 & & \ddots & \ddots & \ddots & \ddots & \\
 & & & 1 & 0 & 0 & 0
\end{bmatrix},\quad\ldots,\quad C_n^n=I_n, \tag{3.37}
\]
and so the classical circulant matrix (3.33) can be written as a linear combination of nonnegative powers of $C_n$:
\[
\bigl[a_{(i-j)\bmod n}\bigr]_{i,j=1}^{n}=\sum_{k=0}^{n-1}a_k\,C_n^k. \tag{3.38}
\]
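The expansion (3.38) and the relation $C_n^n=I_n$ can be confirmed directly. This is a minimal sketch assuming NumPy; the order `n`, the sample values in `a`, and the helper name `generator` are illustrative.

```python
import numpy as np

def generator(n):
    """C_n: entry (i, j) equals 1 if (i - j) mod n == 1 and 0 otherwise."""
    i, j = np.indices((n, n))
    return ((i - j) % n == 1).astype(float)

n = 5
a = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # a_0, ..., a_{n-1} (arbitrary sample values)
C = generator(n)

# Circulant matrix [a_{(i-j) mod n}] built directly from its definition ...
i, j = np.indices((n, n))
circ_direct = a[(i - j) % n]

# ... and as the linear combination sum_k a_k C_n^k of (3.38).
circ_powers = sum(a[k] * np.linalg.matrix_power(C, k) for k in range(n))

print(np.allclose(circ_direct, circ_powers),
      np.allclose(np.linalg.matrix_power(C, n), np.eye(n)))
```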
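For $\mathbf n\in\mathbb N^d$ and $\mathbf k\in\mathbb Z^d$, let
\[
C_{\mathbf n}^{\mathbf k}=C_{n_1}^{k_1}\otimes C_{n_2}^{k_2}\otimes\cdots\otimes C_{n_d}^{k_d}. \tag{3.39}
\]
Lemma 3.7, which is a "refined" version of Lemma 3.1 for multilevel circulant matrices, generalizes (3.38) to the multilevel case.

Lemma 3.7 The $d$-level circulant matrix (3.32) admits the following expression:
\[
\bigl[a_{(\mathbf i-\mathbf j)\bmod\mathbf n}\bigr]_{\mathbf i,\mathbf j=\mathbf 1}^{\mathbf n}=\sum_{\mathbf k=\mathbf 0}^{\mathbf n-\mathbf 1}a_{\mathbf k}\,C_{\mathbf n}^{\mathbf k}, \tag{3.40}
\]
where $C_{\mathbf n}^{\mathbf k}$ is defined in (3.39).

Proof The proof follows the same pattern as the proof of Lemma 3.1. Equation (3.40) is proved componentwise, by showing that the $(\mathbf i,\mathbf j)$ entry of the matrix in the right-hand side is equal to $a_{(\mathbf i-\mathbf j)\bmod\mathbf n}$. Let $\delta_r=1$ if $r=0$ and $\delta_r=0$ otherwise. By (3.36) and (3.37), for every $n\in\mathbb N$ and $k\in\mathbb Z$ we can express the $(i,j)$ entry of $C_n^k$ as follows:
\[
(C_n^k)_{ij}=\delta_{(i-j-k)\bmod n},\qquad i,j=1,\ldots,n. \tag{3.41}
\]
By the crucial property (2.29), for all $\mathbf i,\mathbf j=\mathbf 1,\ldots,\mathbf n$ we have
\begin{align*}
(C_{\mathbf n}^{\mathbf k})_{\mathbf i\mathbf j}&=(C_{n_1}^{k_1})_{i_1j_1}(C_{n_2}^{k_2})_{i_2j_2}\cdots(C_{n_d}^{k_d})_{i_dj_d}\\
&=\delta_{(i_1-j_1-k_1)\bmod n_1}\,\delta_{(i_2-j_2-k_2)\bmod n_2}\cdots\delta_{(i_d-j_d-k_d)\bmod n_d}
=\delta_{(\mathbf i-\mathbf j-\mathbf k)\bmod\mathbf n},
\end{align*}
where $\delta_{\mathbf r}=1$ if $\mathbf r=\mathbf 0$ and $\delta_{\mathbf r}=0$ otherwise. Therefore,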
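\[
\Bigl(\sum_{\mathbf k=\mathbf 0}^{\mathbf n-\mathbf 1}a_{\mathbf k}\,C_{\mathbf n}^{\mathbf k}\Bigr)_{\mathbf i\mathbf j}
=\sum_{\mathbf k=\mathbf 0}^{\mathbf n-\mathbf 1}a_{\mathbf k}\,(C_{\mathbf n}^{\mathbf k})_{\mathbf i\mathbf j}
=\sum_{\mathbf k=\mathbf 0}^{\mathbf n-\mathbf 1}a_{\mathbf k}\,\delta_{(\mathbf i-\mathbf j-\mathbf k)\bmod\mathbf n}
=a_{(\mathbf i-\mathbf j)\bmod\mathbf n},
\]
and (3.40) is proved.

As a consequence of Lemma 3.7, any linear combination of the form $\sum_{\mathbf k=-\mathbf r}^{\mathbf r}c_{\mathbf k}\,C_{\mathbf n}^{\mathbf k}$ is a $d$-level circulant matrix, because it can be written in the form (3.40) by using the identity $C_n^{-1}=C_n^{n-1}$ (see (3.37)) and the properties of tensor products (see Sect. 2.5). Theorem 3.4 provides the spectral decomposition of any linear combination of the form $\sum_{\mathbf k=-\mathbf r}^{\mathbf r}c_{\mathbf k}\,C_{\mathbf n}^{\mathbf k}$ and hence of any multilevel circulant matrix. Let
\[
F_n=\frac{1}{\sqrt n}\bigl[e^{-2\pi i\,jk/n}\bigr]_{j,k=0}^{n-1}=\frac{1}{\sqrt n}\bigl[e^{-2\pi i\,(j-1)(k-1)/n}\bigr]_{j,k=1}^{n}. \tag{3.42}
\]
The matrix $F_n$ is the so-called unitary discrete Fourier transform of order $n$. Using the equation
\[
\sum_{k=0}^{N}z^k=\begin{cases}\dfrac{z^{N+1}-1}{z-1}, & \text{if } z\ne1,\\[1ex] N+1, & \text{if } z=1,\end{cases}
\]
it is easy to check that $F_n$ is indeed unitary: $F_n^*F_n=I_n$. For $\mathbf n\in\mathbb N^d$, let
\[
F_{\mathbf n}=F_{n_1}\otimes\cdots\otimes F_{n_d}. \tag{3.43}
\]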
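Theorem 3.4 Let $\mathbf n,\mathbf r\in\mathbb N^d$ and $c_{-\mathbf r},\ldots,c_{\mathbf r}\in\mathbb C$. Then,
\[
\sum_{\mathbf k=-\mathbf r}^{\mathbf r}c_{\mathbf k}\,C_{\mathbf n}^{\mathbf k}=F_{\mathbf n}\Bigl[\operatorname*{diag}_{\mathbf j=\mathbf 0,\ldots,\mathbf n-\mathbf 1}c\Bigl(\frac{2\pi\mathbf j}{\mathbf n}\Bigr)\Bigr]F_{\mathbf n}^*,
\]
where $c(\boldsymbol\theta)=\sum_{\mathbf k=-\mathbf r}^{\mathbf r}c_{\mathbf k}\,e^{i\mathbf k\cdot\boldsymbol\theta}$ and $F_{\mathbf n}$ is defined in (3.43). In particular, $\sum_{\mathbf k=-\mathbf r}^{\mathbf r}c_{\mathbf k}\,C_{\mathbf n}^{\mathbf k}$ is a normal matrix whose spectrum is given by
\[
\Lambda\Bigl(\sum_{\mathbf k=-\mathbf r}^{\mathbf r}c_{\mathbf k}\,C_{\mathbf n}^{\mathbf k}\Bigr)=\Bigl\{c\Bigl(\frac{2\pi\mathbf j}{\mathbf n}\Bigr):\ \mathbf j=\mathbf 0,\ldots,\mathbf n-\mathbf 1\Bigr\}.
\]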
Proof The spectral decomposition of $C_n$ is known and is given by
\[
C_n=F_nD_nF_n^*,\qquad D_n=\operatorname*{diag}_{j=0,\ldots,n-1}e^{2\pi i\,j/n}=\operatorname*{diag}_{j=1,\ldots,n}e^{2\pi i\,(j-1)/n}.
\]
This can be verified by direct computation: for all $i,j=1,\ldots,n$,
\[
(F_n^*C_n)_{ij}=\frac{1}{\sqrt n}\sum_{\ell=1}^{n}e^{2\pi i\,(i-1)(\ell-1)/n}\,(C_n)_{\ell j}=\frac{1}{\sqrt n}\,e^{2\pi i\,(i-1)j/n}=(D_nF_n^*)_{ij}.
\]
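Therefore, also the spectral decomposition of the matrix $C_{\mathbf n}^{\mathbf k}$ is known. Indeed, using the properties of tensor products (see Sect. 2.5), we obtain
\begin{align*}
C_{\mathbf n}^{\mathbf k}&=C_{n_1}^{k_1}\otimes C_{n_2}^{k_2}\otimes\cdots\otimes C_{n_d}^{k_d}\\
&=(F_{n_1}D_{n_1}^{k_1}F_{n_1}^*)\otimes(F_{n_2}D_{n_2}^{k_2}F_{n_2}^*)\otimes\cdots\otimes(F_{n_d}D_{n_d}^{k_d}F_{n_d}^*)\\
&=(F_{n_1}\otimes F_{n_2}\otimes\cdots\otimes F_{n_d})(D_{n_1}^{k_1}\otimes D_{n_2}^{k_2}\otimes\cdots\otimes D_{n_d}^{k_d})(F_{n_1}\otimes F_{n_2}\otimes\cdots\otimes F_{n_d})^*\\
&=F_{\mathbf n}D_{\mathbf n}^{\mathbf k}F_{\mathbf n}^*,
\end{align*}
where
\[
D_{\mathbf n}^{\mathbf k}=D_{n_1}^{k_1}\otimes D_{n_2}^{k_2}\otimes\cdots\otimes D_{n_d}^{k_d}
=\operatorname*{diag}_{\mathbf j=\mathbf 0,\ldots,\mathbf n-\mathbf 1}\bigl(e^{2\pi i\sum_{r=1}^{d}j_rk_r/n_r}\bigr)
=\operatorname*{diag}_{\mathbf j=\mathbf 0,\ldots,\mathbf n-\mathbf 1}\bigl(e^{2\pi i\,(\mathbf j/\mathbf n)\cdot\mathbf k}\bigr);
\]
note that the second equality follows from the crucial property (2.29). Thus,
\begin{align*}
\sum_{\mathbf k=-\mathbf r}^{\mathbf r}c_{\mathbf k}\,C_{\mathbf n}^{\mathbf k}
&=\sum_{\mathbf k=-\mathbf r}^{\mathbf r}c_{\mathbf k}\,F_{\mathbf n}D_{\mathbf n}^{\mathbf k}F_{\mathbf n}^*
=F_{\mathbf n}\Bigl(\sum_{\mathbf k=-\mathbf r}^{\mathbf r}c_{\mathbf k}\,D_{\mathbf n}^{\mathbf k}\Bigr)F_{\mathbf n}^*\\
&=F_{\mathbf n}\Bigl(\sum_{\mathbf k=-\mathbf r}^{\mathbf r}c_{\mathbf k}\operatorname*{diag}_{\mathbf j=\mathbf 0,\ldots,\mathbf n-\mathbf 1}\bigl(e^{2\pi i\,(\mathbf j/\mathbf n)\cdot\mathbf k}\bigr)\Bigr)F_{\mathbf n}^*\\
&=F_{\mathbf n}\Bigl[\operatorname*{diag}_{\mathbf j=\mathbf 0,\ldots,\mathbf n-\mathbf 1}\sum_{\mathbf k=-\mathbf r}^{\mathbf r}c_{\mathbf k}\,e^{2\pi i\,(\mathbf j/\mathbf n)\cdot\mathbf k}\Bigr]F_{\mathbf n}^*,
\end{align*}
and the thesis follows from the identity $\sum_{\mathbf k=-\mathbf r}^{\mathbf r}c_{\mathbf k}\,e^{2\pi i\,(\mathbf j/\mathbf n)\cdot\mathbf k}=c(2\pi\mathbf j/\mathbf n)$.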
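The spectral decomposition of Theorem 3.4 can be checked on a small 2-level example. The sketch below assumes NumPy; the coefficients in `c`, the size `n` and the helper names are illustrative. Negative powers of the generator are reduced modulo $n$, which relies on $C_n^n=I_n$ and hence $C_n^{-1}=C_n^{n-1}$, as noted in the text.

```python
import numpy as np
from functools import reduce

def generator(n):
    """C_n: entry (i, j) equals 1 if (i - j) mod n == 1 and 0 otherwise."""
    i, j = np.indices((n, n))
    return ((i - j) % n == 1).astype(complex)

def dft(n):
    """Unitary DFT matrix F_n = n^{-1/2} [exp(-2 pi i jk/n)]_{j,k=0}^{n-1}."""
    j, k = np.indices((n, n))
    return np.exp(-2j * np.pi * j * k / n) / np.sqrt(n)

def kron_all(mats):
    return reduce(np.kron, mats)

n = (3, 4)
c = {(0, 0): 4.0, (1, 0): -1.0, (-1, 0): -1.0, (0, 1): -1.0 + 0.5j, (0, -1): -1.0}

# Left-hand side: sum_k c_k C_n^k with C_n^k = C_{n_1}^{k_1} (x) C_{n_2}^{k_2}.
lhs = sum(ck * kron_all([np.linalg.matrix_power(generator(nr), kr % nr)
                         for nr, kr in zip(n, k)])
          for k, ck in c.items())

def symbol(theta):
    """c(theta) = sum_k c_k exp(i k . theta)."""
    return sum(ck * np.exp(1j * np.dot(k, theta)) for k, ck in c.items())

# Grid points 2 pi j / n for j = 0, ..., n - 1 in lexicographic order.
grid = [2 * np.pi * np.array(j) / np.array(n) for j in np.ndindex(*n)]
D = np.diag([symbol(t) for t in grid])
F = kron_all([dft(nr) for nr in n])
rhs = F @ D @ F.conj().T

print(np.allclose(lhs, rhs))
```

Since the sample coefficients are not symmetric, the resulting matrix is not Hermitian, yet it is still normal, as the theorem predicts.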
3.5 Singular Value and Spectral Distribution of Multilevel Toeplitz Sequences: An a.c.s.-Based Proof

This section is devoted to the proof of Theorem 3.5, which comprises the multivariate $L^1$ versions of Szegő's first limit theorem and the Avram–Parter theorem. It provides the singular value distribution of multilevel Toeplitz sequences generated by a function $f\in L^1([-\pi,\pi]^d)$ and the spectral distribution of multilevel Toeplitz sequences generated by a real function $f\in L^1([-\pi,\pi]^d)$. For the eigenvalues it goes back to Szegő [27], and for the singular values it was established by Avram [2] and Parter [29]. They assumed $d=1$ and $f\in L^\infty([-\pi,\pi]^d)$; see [8, Sect. 5] and [9, Sect. 10.14] for more on the subject in the case of $L^\infty$ generating functions. The extension to $d\ge1$ and $f\in L^1([-\pi,\pi]^d)$ was performed by Tyrtyshnikov and Zamarashkin [41–43] and Tilli [38]. For the eigenvalues in the case of $d\ge1$ and $f\in L^\infty([-\pi,\pi]^d)$, Theorem 3.5 can also be derived from [7, Corollary 22]. In this section, we are going to see a proof of Theorem 3.5 based on the notion of a.c.s. It is essentially the same as the proof of [22, Theorem 6.5], with the only difference that
in [22] we focused on the case $d=1$, while here we will address the general case $d\ge1$.

Theorem 3.5 If $f\in L^1([-\pi,\pi]^d)$ and $\{\mathbf n=\mathbf n(n)\}_n\subseteq\mathbb N^d$ is any sequence such that $\mathbf n\to\infty$ as $n\to\infty$, then $\{T_{\mathbf n}(f)\}_n\sim_\sigma f$. If moreover $f$ is real, then $\{T_{\mathbf n}(f)\}_n\sim_\lambda f$.

Proof The proof consists of two steps. In the first step, we show that the theorem holds if $f$ is a $d$-variate trigonometric polynomial. In the second step, using an approximation argument based on the concept of a.c.s., we show that the theorem holds for every $f\in L^1([-\pi,\pi]^d)$.

Step 1. Suppose $f$ is a $d$-variate trigonometric polynomial, so that
\[
f(\boldsymbol\theta)=\sum_{\mathbf k=-\mathbf r}^{\mathbf r}f_{\mathbf k}\,e^{i\mathbf k\cdot\boldsymbol\theta}
\]
for some $\mathbf r\in\mathbb N^d$. Consider the $d$-level circulant matrix
\[
C_{\mathbf n}(f)=\sum_{\mathbf k=-\mathbf r}^{\mathbf r}f_{\mathbf k}\,C_{\mathbf n}^{\mathbf k}, \tag{3.44}
\]
where $C_{\mathbf n}^{\mathbf k}=C_{n_1}^{k_1}\otimes C_{n_2}^{k_2}\otimes\cdots\otimes C_{n_d}^{k_d}$ as in (3.39) and $C_n$ is defined in (3.36). Note that $C_{\mathbf n}(f)$ is Hermitian whenever $f$ is real, by Theorem 3.4. We are going to show that
\[
\{C_{\mathbf n}(f)\}_n\sim_{\sigma,\lambda}f \tag{3.45}
\]
and
\[
\{C_{\mathbf n}(f)\}_n\xrightarrow{\text{a.c.s.}}\{T_{\mathbf n}(f)\}_n. \tag{3.46}
\]
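Once this is done, the singular value distribution $\{T_{\mathbf n}(f)\}_n\sim_\sigma f$ follows from Corollary 2.4, and the spectral distribution $\{T_{\mathbf n}(f)\}_n\sim_\lambda f$ follows from Corollary 2.5 under the assumption that $f$ is real (in which case both $T_{\mathbf n}(f)$ and $C_{\mathbf n}(f)$ are Hermitian). To prove (3.45) we use Theorem 3.4. Since $C_{\mathbf n}(f)$ is normal, it is enough to show that $\{C_{\mathbf n}(f)\}_n\sim_\lambda f$ (see Remark 2.2). By Theorem 3.4, the eigenvalues of $C_{\mathbf n}(f)$ are given by $f(2\pi\mathbf j/\mathbf n)$, $\mathbf j=\mathbf 0,\ldots,\mathbf n-\mathbf 1$. Hence, for every $F\in C_c(\mathbb C)$,
\begin{align*}
\lim_{n\to\infty}\frac{1}{N(\mathbf n)}\sum_{j=1}^{N(\mathbf n)}F(\lambda_j(C_{\mathbf n}(f)))
&=\lim_{n\to\infty}\frac{1}{N(\mathbf n)}\sum_{\mathbf j=\mathbf 0}^{\mathbf n-\mathbf 1}F\Bigl(f\Bigl(\frac{2\pi\mathbf j}{\mathbf n}\Bigr)\Bigr)\\
&=\frac{1}{(2\pi)^d}\int_{[0,2\pi]^d}F(f(\boldsymbol\theta))\,d\boldsymbol\theta
=\frac{1}{(2\pi)^d}\int_{[-\pi,\pi]^d}F(f(\boldsymbol\theta))\,d\boldsymbol\theta.
\end{align*}
Note that the last equality holds because $f$ is a $d$-variate trigonometric polynomial (so it is periodic in each direction with period $2\pi$), while the second equality is due to the fact that $\frac{(2\pi)^d}{N(\mathbf n)}\sum_{\mathbf j=\mathbf 0}^{\mathbf n-\mathbf 1}F(f(\frac{2\pi\mathbf j}{\mathbf n}))$ is a Riemann sum for $\int_{[0,2\pi]^d}F(f(\boldsymbol\theta))\,d\boldsymbol\theta$ and converges to this integral as $n\to\infty$, because the function $F(f(\boldsymbol\theta))$ is continuous (and hence Riemann-integrable). Thus, $\{C_{\mathbf n}(f)\}_n\sim_\lambda f$ and the proof of (3.45) is complete. To prove (3.46), it is enough to show that, for $\mathbf n\ge\mathbf r+\mathbf 1$,
\[
\operatorname{rank}(T_{\mathbf n}(f)-C_{\mathbf n}(f))\le N(2\mathbf r+\mathbf 1)\,N(\mathbf n)\sum_{i=1}^{d}\frac{r_i}{n_i}. \tag{3.47}
\]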
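This implies that the matrix-sequence $\{T_{\mathbf n}(f)-C_{\mathbf n}(f)\}_n$ is zero-distributed by Theorem 2.2, hence $d_{\text{a.c.s.}}(\{T_{\mathbf n}(f)\}_n,\{C_{\mathbf n}(f)\}_n)=0$ and (3.46) is met. By (3.9) and the fact that $f_{\mathbf k}=0$ if $\mathbf k\notin\{-\mathbf r,\ldots,\mathbf r\}$, for $\mathbf n\ge\mathbf r+\mathbf 1$ we have
\[
T_{\mathbf n}(f)=\sum_{\mathbf k=-(\mathbf n-\mathbf 1)}^{\mathbf n-\mathbf 1}f_{\mathbf k}\,J_{\mathbf n}^{(\mathbf k)}=\sum_{\mathbf k=-\mathbf r}^{\mathbf r}f_{\mathbf k}\,J_{\mathbf n}^{(\mathbf k)}, \tag{3.48}
\]
where $J_{\mathbf n}^{(\mathbf k)}=J_{n_1}^{(k_1)}\otimes J_{n_2}^{(k_2)}\otimes\cdots\otimes J_{n_d}^{(k_d)}$ as in (3.5) and $J_n^{(k)}$ is defined in (3.4). Since it is clear from (3.4) and (3.37) that the nonzero rows of $C_n^k-J_n^{(k)}$ are at most $|k|$, we have
\[
\operatorname{rank}(C_n^k-J_n^{(k)})\le|k|,\qquad k\in\mathbb Z,\quad n\in\mathbb N.
\]
Hence, by (2.31),
\[
\operatorname{rank}(C_{\mathbf n}^{\mathbf k}-J_{\mathbf n}^{(\mathbf k)})\le N(\mathbf n)\sum_{i=1}^{d}\frac{|k_i|}{n_i},\qquad \mathbf k\in\mathbb Z^d,\quad \mathbf n\in\mathbb N^d.
\]
Comparing (3.44) and (3.48), we see that, for $\mathbf n\ge\mathbf r+\mathbf 1$,
\[
\operatorname{rank}(C_{\mathbf n}(f)-T_{\mathbf n}(f))\le\sum_{\mathbf k=-\mathbf r}^{\mathbf r}\operatorname{rank}(C_{\mathbf n}^{\mathbf k}-J_{\mathbf n}^{(\mathbf k)})
\le\sum_{\mathbf k=-\mathbf r}^{\mathbf r}N(\mathbf n)\sum_{i=1}^{d}\frac{|k_i|}{n_i}
\le N(2\mathbf r+\mathbf 1)\,N(\mathbf n)\sum_{i=1}^{d}\frac{r_i}{n_i},
\]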
and (3.47) is proved.

Step 2. Let $f\in L^1([-\pi,\pi]^d)$. Since the set of $d$-variate trigonometric polynomials is dense in $L^1([-\pi,\pi]^d)$ by, e.g., [22, Lemma 2.2] or Lemma 2.3, there exists a sequence of $d$-variate trigonometric polynomials $\{f_m\}_m$ such that $f_m\to f$ in $L^1([-\pi,\pi]^d)$. By replacing $f_m$ with its real part $\Re(f_m)$ (if necessary), we may assume that $f_m$ is real if $f$ is real. In this way, all the matrices $T_{\mathbf n}(f)$ and $T_{\mathbf n}(f_m)$ are Hermitian if $f$ is real. By Step 1, $\{T_{\mathbf n}(f_m)\}_n\sim_\sigma f_m$
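and, if $f$ is real, $\{T_{\mathbf n}(f_m)\}_n\sim_\lambda f_m$. Moreover,
\[
\{T_{\mathbf n}(f_m)\}_n\xrightarrow{\text{a.c.s.}}\{T_{\mathbf n}(f)\}_n
\]
by Theorem 2.10, because, by Theorem 3.2,
\[
\|T_{\mathbf n}(f)-T_{\mathbf n}(f_m)\|_1=\|T_{\mathbf n}(f-f_m)\|_1\le N(\mathbf n)\,\|f-f_m\|_{L^1}.
\]
Thus, the relations $\{T_{\mathbf n}(f)\}_n\sim_{\sigma,\lambda}f$ follow from Corollaries 2.4 and 2.5.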
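The singular value distribution can be observed numerically on a non-Hermitian example. The sketch below assumes NumPy, $d=1$, and the sample symbol $f(\theta)=1+e^{i\theta}$, whose Toeplitz matrices are lower bidiagonal; the test function `hat` is one arbitrary continuous, compactly supported choice of $F$, and the grid size is illustrative. It compares the average of $F$ over the singular values of $T_n(f)$ with $\frac{1}{2\pi}\int_{-\pi}^{\pi}F(|f(\theta)|)\,d\theta$, which is the limit appearing in the definition of $\sim_\sigma$.

```python
import numpy as np

def toeplitz_1d(n, coeffs):
    """T_n(f) = [f_{i-j}]_{i,j=1}^n from a dict {k: f_k} of Fourier coefficients."""
    T = np.zeros((n, n))
    for k, c in coeffs.items():
        T += c * np.eye(n, k=-k)
    return T

def hat(x):
    """Continuous test function F supported on [0.5, 1.5], with peak value 1 at x = 1."""
    return np.maximum(0.0, 1.0 - 2.0 * np.abs(x - 1.0))

coeffs = {0: 1.0, 1: 1.0}                      # f(theta) = 1 + exp(i theta)
f = lambda theta: 1.0 + np.exp(1j * theta)

# (1 / 2 pi) * integral of F(|f(theta)|), approximated on an equispaced grid.
theta = np.linspace(-np.pi, np.pi, 20000, endpoint=False)
integral = np.mean(hat(np.abs(f(theta))))

def sv_average(n):
    """(1/n) * sum_j F(sigma_j(T_n(f)))."""
    sigma = np.linalg.svd(toeplitz_1d(n, coeffs), compute_uv=False)
    return np.mean(hat(sigma))

for n in (25, 100, 400):
    print(n, sv_average(n), integral)
```

Note that all eigenvalues of these matrices equal 1, so the eigenvalues do not follow $f$ here; this is consistent with the theorem, which asserts the spectral distribution only for real $f$.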
A noteworthy extension of the spectral distribution result $\{T_{\mathbf n}(f)\}_n\sim_\lambda f$ to the case where $f$ is not real has been performed in [15], on the basis of Tilli's pioneering paper [40]. It was proved in [15] that the relation $\{T_{\mathbf n}(f)\}_n\sim_\lambda f$ holds whenever $f$ satisfies the following conditions:
• $f\in L^\infty([-\pi,\pi]^d)$;
• the essential range $\mathcal{ER}(f)$ has empty interior and does not disconnect the complex plane $\mathbb C$.
The class of functions $f\in L^\infty([-\pi,\pi]^d)$ whose essential range has empty interior and does not disconnect $\mathbb C$ is sometimes referred to as the Tilli class. The hypothesis of "being in the Tilli class" has been used not only in [15] but also in the recent work [13], which provided an extension of both Theorem 3.5 and the above theorem from [15]. For more on this subject, see [13].
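3.6 Extreme Eigenvalues of Hermitian Multilevel Toeplitz Matrices

Suppose that $f\in L^1([-\pi,\pi]^d)$ is real a.e. In this case, each Toeplitz matrix $T_{\mathbf n}(f)$ is Hermitian and $\Lambda(T_{\mathbf n}(f))\subseteq[m_f,M_f]$, where $m_f=\operatorname*{ess\,inf}_{\boldsymbol\theta\in[-\pi,\pi]^d}f(\boldsymbol\theta)$ and $M_f=\operatorname*{ess\,sup}_{\boldsymbol\theta\in[-\pi,\pi]^d}f(\boldsymbol\theta)$; see Theorem 3.1. The next theorem shows that, for each fixed $j\ge1$, the $j$ smallest eigenvalues of $T_{\mathbf n}(f)$ converge to $m_f$ and the $j$ largest eigenvalues of $T_{\mathbf n}(f)$ converge to $M_f$ as $\mathbf n\to\infty$.

Theorem 3.6 Let $f\in L^1([-\pi,\pi]^d)$ be real a.e., let
\[
m_f=\operatorname*{ess\,inf}_{\boldsymbol\theta\in[-\pi,\pi]^d}f(\boldsymbol\theta),\qquad M_f=\operatorname*{ess\,sup}_{\boldsymbol\theta\in[-\pi,\pi]^d}f(\boldsymbol\theta),
\]
and let $\lambda_1(T_{\mathbf n}(f))\ge\cdots\ge\lambda_{N(\mathbf n)}(T_{\mathbf n}(f))$ be the eigenvalues of $T_{\mathbf n}(f)$ sorted in nonincreasing order. Then, for each fixed $j\ge1$,
\[
\lim_{\mathbf n\to\infty}\lambda_j(T_{\mathbf n}(f))=M_f,\qquad \lim_{\mathbf n\to\infty}\lambda_{N(\mathbf n)-j+1}(T_{\mathbf n}(f))=m_f. \tag{3.49}
\]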
Proof We only prove the left limit in (3.49) as the proof of the right limit is conceptually identical. To prove the left limit, we show that
\[
\lim_{n\to\infty}\lambda_j(T_{\mathbf n}(f))=M_f
\]
for all sequences $\{\mathbf n=\mathbf n(n)\}_n\subseteq\mathbb N^d$ such that $\mathbf n\to\infty$ as $n\to\infty$. The proof of the latter assertion is based on the observation that, by Theorems 2.1 and 3.5, each point of the essential range $\mathcal{ER}(f)\subseteq[m_f,M_f]$ strongly attracts the spectrum $\Lambda(T_{\mathbf n}(f))$ with infinite order. If $M_f$ is finite, then it belongs to the essential range $\mathcal{ER}(f)$ because $M_f=\sup\mathcal{ER}(f)$ by definition and $\mathcal{ER}(f)$ is closed by [22, Lemma 2.1]. As a consequence, $M_f$ strongly attracts the spectrum $\Lambda(T_{\mathbf n}(f))$ with infinite order, so the left limit in (3.49) holds for each fixed $j\ge1$ by Definition 2.3. To prove that this limit continues to hold even if $M_f$ is not finite, suppose by contradiction that there exists a fixed $j\ge1$ such that
\[
\liminf_{n\to\infty}\lambda_j(T_{\mathbf n}(f))=\ell<M_f.
\]
Passing, if necessary, to a subsequence of $\{T_{\mathbf n}(f)\}_n$, we may assume that
\[
\lim_{n\to\infty}\lambda_j(T_{\mathbf n}(f))=\ell<M_f.
\]
This means that all the eigenvalues of $T_{\mathbf n}(f)$ (except possibly for the largest $j-1$) are eventually smaller than $\ell+\epsilon$ for some positive $\epsilon$ such that $\ell+\epsilon<M_f$. By definition of $M_f$, we can find a point $y\in\mathcal{ER}(f)$ which lies in $(\ell+\epsilon,M_f]$. Clearly, the point $y$ cannot strongly attract $\Lambda(T_{\mathbf n}(f))$ with infinite order because only the largest $j-1$ eigenvalues of $T_{\mathbf n}(f)$ can converge to $y$ as $n\to\infty$. This contradiction proves the left limit in (3.49).
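The convergence of the extreme eigenvalues stated in Theorem 3.6 can be illustrated as follows. The sketch assumes NumPy, $d=1$, and the sample symbol $f(\theta)=2-2\cos\theta$, for which $m_f=0$ and $M_f=4$; the helper names and the sizes are illustrative. It prints the three largest and three smallest eigenvalues for increasing $n$.

```python
import numpy as np

def toeplitz_1d(n, coeffs):
    """T_n(f) = [f_{i-j}]_{i,j=1}^n from a dict {k: f_k} of Fourier coefficients."""
    T = np.zeros((n, n))
    for k, c in coeffs.items():
        T += c * np.eye(n, k=-k)
    return T

coeffs = {0: 2.0, 1: -1.0, -1: -1.0}   # f(theta) = 2 - 2cos(theta)
m_f, M_f = 0.0, 4.0

def extreme_eigs(n, j=3):
    """Return the j largest and the j smallest eigenvalues of T_n(f)."""
    eigs = np.sort(np.linalg.eigvalsh(toeplitz_1d(n, coeffs)))[::-1]  # nonincreasing
    return eigs[:j], eigs[-j:]

for n in (10, 40, 160):
    largest, smallest = extreme_eigs(n)
    print(n, largest, smallest)
```

For each fixed $j$ the printed values approach $M_f$ and $m_f$ respectively, while remaining strictly inside $(m_f,M_f)$ as required by Theorem 3.1.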
Chapter 4
Multilevel Locally Toeplitz Sequences
The theory of Locally Toeplitz (LT) sequences dates back to Tilli's pioneering paper [39]. It was then carried forward by the second author [35, 36], and it was finally reviewed and extended in [20]. Following [20], in this chapter we address the multivariate version of the theory of LT sequences, also known as the theory of multilevel LT sequences. The topic is presented here on an abstract level, whereas for motivations and insights we refer the reader to [22, Sect. 7.1]. Needless to say, the theory of multilevel LT sequences is the keystone of the theory of multilevel GLT sequences, which will be the subject of Chap. 5. We stress that many results and proofs contained in this chapter have an exact analog in [22, Chap. 7], where we dealt with classical (unilevel) LT sequences. However, we decided to reproduce here all the proofs without omitting details, in order to help the reader become familiar with the multilevel language (especially, the multi-index notation). In this regard, the reader is invited to compare the "multivariate proofs" presented in this chapter with the corresponding "univariate proofs" in [22, Chap. 7], in order to learn the way in which the multilevel language allows one to transfer many results from the univariate to the multivariate case by simply turning some letters ($n$, $i$, $j$, $x$, $\theta$, etc.) into boldface ($\mathbf n$, $\mathbf i$, $\mathbf j$, $\mathbf x$, $\boldsymbol\theta$, etc.).
4.1 Multilevel LT Operator

Just as the theory of LT sequences [22, Chap. 7] begins with the notion of LT operator, the theory of multilevel LT sequences begins with the notion of multilevel LT operator. After introducing the multilevel LT operator in Sect. 4.1.1, we study its properties in Sect. 4.1.2.
© Springer Nature Switzerland AG 2018 C. Garoni and S. Serra-Capizzano, Generalized Locally Toeplitz Sequences: Theory and Applications, https://doi.org/10.1007/978-3-030-02233-4_4
4.1.1 Definition of Multilevel LT Operator

Throughout this book, if $\mathbf n\in\mathbb N^d$ and $a:[0,1]^d\to\mathbb C$, the $\mathbf n$th ($d$-level) diagonal sampling matrix generated by $a$ is denoted by $D_{\mathbf n}(a)$ and is defined as the following diagonal matrix of size $N(\mathbf n)$:
\[
D_{\mathbf n}(a)=\operatorname*{diag}_{\mathbf i=\mathbf 1,\ldots,\mathbf n}a\Bigl(\frac{\mathbf i}{\mathbf n}\Bigr), \tag{4.1}
\]
where we recall that $\mathbf i$ varies from $\mathbf 1$ to $\mathbf n$ following the lexicographic ordering; see Sect. 2.1.2. Note that $D_{\mathbf n}(a)$ can also be defined through a recursive formula: if $d=1$, then
\[
D_n(a)=\operatorname*{diag}_{i=1,\ldots,n}a\Bigl(\frac in\Bigr);
\]
if $d>1$, then
\[
D_{\mathbf n}(a)=D_{n_1,\ldots,n_d}(a)=\operatorname*{diag}_{i_1=1,\ldots,n_1}D_{n_2,\ldots,n_d}\Bigl(a\Bigl(\frac{i_1}{n_1},x_2,\ldots,x_d\Bigr)\Bigr), \tag{4.2}
\]
where $a(i_1/n_1,x_2,\ldots,x_d)$ is the $(d-1)$-variate function defined as follows:
\[
a\Bigl(\frac{i_1}{n_1},x_2,\ldots,x_d\Bigr):[0,1]^{d-1}\to\mathbb C,\qquad (x_2,\ldots,x_d)\mapsto a\Bigl(\frac{i_1}{n_1},x_2,\ldots,x_d\Bigr).
\]
If $u:D\to\mathbb C$ and $v:E\to\mathbb C$ are arbitrary functions, the tensor-product function $u\otimes v:D\times E\to\mathbb C$ is defined as
\[
(u\otimes v)(\xi,\vartheta)=u(\xi)v(\vartheta),\qquad (\xi,\vartheta)\in D\times E.
\]
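The two descriptions of the diagonal sampling matrix, the direct one in (4.1) and the recursive one in (4.2), can be compared in code. The following is a minimal sketch assuming NumPy; the function names `diag_sampling` and `diag_sampling_recursive` and the sample function `a` are illustrative, not part of the book.

```python
import numpy as np

def diag_sampling(n, a):
    """D_n(a) = diag_{i=1,...,n} a(i/n), with the multi-index i in lexicographic order."""
    points = [(np.array(i) + 1) / np.array(n) for i in np.ndindex(*n)]
    return np.diag([a(*x) for x in points])

def diag_sampling_recursive(n, a):
    """Recursive formula (4.2): freeze x_1 = i_1/n_1 and recurse on the other variables."""
    if len(n) == 1:
        return np.diag([a(i / n[0]) for i in range(1, n[0] + 1)])
    blocks = [diag_sampling_recursive(n[1:], lambda *x, t=i1 / n[0]: a(t, *x))
              for i1 in range(1, n[0] + 1)]
    size = sum(b.shape[0] for b in blocks)
    D = np.zeros((size, size))
    pos = 0
    for b in blocks:                       # assemble the block-diagonal matrix
        m = b.shape[0]
        D[pos:pos + m, pos:pos + m] = b
        pos += m
    return D

a = lambda x1, x2: x1 + 10.0 * x2          # sample function on [0, 1]^2
n = (2, 3)
D_direct = diag_sampling(n, a)
D_recursive = diag_sampling_recursive(n, a)
print(np.diag(D_direct))
```

`np.ndindex` enumerates the multi-indices with the last component varying fastest, which corresponds to the lexicographic ordering recalled in the text.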
Definition 4.1 (multilevel locally Toeplitz operator)
• Let $m,n\in\mathbb N$, let $a:[0,1]\to\mathbb C$, and let $f\in L^1([-\pi,\pi])$. In accordance with [22, Definition 7.1], the associated (1-level) Locally Toeplitz (LT) operator is defined as the following $n\times n$ matrix:
\begin{align*}
LT_n^m(a,f)&=D_m(a)\otimes T_{\lfloor n/m\rfloor}(f)\oplus O_{n\bmod m}\\
&=\Bigl[\operatorname*{diag}_{i=1,\ldots,m}a\Bigl(\frac im\Bigr)T_{\lfloor n/m\rfloor}(f)\Bigr]\oplus O_{n\bmod m}\\
&=\operatorname*{diag}_{i=1,\ldots,m}a\Bigl(\frac im\Bigr)T_{\lfloor n/m\rfloor}(f)\oplus O_{n\bmod m}.
\end{align*}
It is understood that $LT_n^m(a,f)=O_n$ when $n<m$ and that the term $O_{n\bmod m}$ is not present when $n$ is a multiple of $m$. Moreover, here and in what follows, the tensor product operation $\otimes$ is always applied before the direct sum $\oplus$, exactly as in the case of numbers, where multiplication is always applied before addition. Note also that in the last equality we intentionally removed the square brackets in order to illustrate a notation that will be used hereinafter to simplify the presentation (roughly speaking, we are assuming that the "diag operator" is applied before the direct sum $\oplus$).
• Let $\mathbf m,\mathbf n\in\mathbb N^d$, let $a:[0,1]^d\to\mathbb C$, and let $f_1,\ldots,f_d\in L^1([-\pi,\pi])$. The associated ($d$-level) Locally Toeplitz (LT) operator is defined as the following $N(\mathbf n)\times N(\mathbf n)$ matrix:
\begin{align*}
LT_{\mathbf n}^{\mathbf m}(a,f_1\otimes\cdots\otimes f_d)&=LT_{n_1,\ldots,n_d}^{m_1,\ldots,m_d}(a(x_1,\ldots,x_d),f_1\otimes\cdots\otimes f_d)\\
&=\operatorname*{diag}_{i_1=1,\ldots,m_1}T_{\lfloor n_1/m_1\rfloor}(f_1)\otimes LT_{n_2,\ldots,n_d}^{m_2,\ldots,m_d}\Bigl(a\Bigl(\frac{i_1}{m_1},x_2,\ldots,x_d\Bigr),f_2\otimes\cdots\otimes f_d\Bigr)\\
&\quad\oplus O_{(n_1\bmod m_1)n_2\cdots n_d}.
\end{align*}
This is a recursive definition, whose base case has been given in the previous item. For example, in the case $d=2$ we have
\begin{align*}
LT_{n_1,n_2}^{m_1,m_2}(a,f_1\otimes f_2)
&=\operatorname*{diag}_{i_1=1,\ldots,m_1}T_{\lfloor n_1/m_1\rfloor}(f_1)\otimes\Bigl[\operatorname*{diag}_{i_2=1,\ldots,m_2}a\Bigl(\frac{i_1}{m_1},\frac{i_2}{m_2}\Bigr)T_{\lfloor n_2/m_2\rfloor}(f_2)\oplus O_{n_2\bmod m_2}\Bigr]\\
&\quad\oplus O_{(n_1\bmod m_1)n_2}.
\end{align*}
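In Definition 4.1, we have defined the multilevel LT operator $LT_{\boldsymbol{n}}^{\boldsymbol{m}}(a, f)$ in the case where $f$ is a separable function of the form $f = f_1 \otimes \cdots \otimes f_d$ with $f_1, \ldots, f_d \in L^1([-\pi,\pi])$. We are going to see in Definition 4.2 that $LT_{\boldsymbol{n}}^{\boldsymbol{m}}(a, f)$ is actually well-defined (in a unique way) for any $f \in L^1([-\pi,\pi]^d)$. The crucial result in view of Definition 4.2 is Theorem 4.1. It shows that $LT_{\boldsymbol{n}}^{\boldsymbol{m}}(a, f_1 \otimes \cdots \otimes f_d)$ coincides with $D_{\boldsymbol{m}}(a) \otimes T_{\lfloor \boldsymbol{n}/\boldsymbol{m} \rfloor}(f_1 \otimes \cdots \otimes f_d) \oplus O$ up to a permutation transformation $\Pi_{\boldsymbol{n}}^{\boldsymbol{m}}$ depending only on $\boldsymbol{m}, \boldsymbol{n}$ and not on the functions $a, f_1, \ldots, f_d$.

Theorem 4.1 For any $\boldsymbol{m}, \boldsymbol{n} \in \mathbb{N}^d$ there exists a permutation matrix $\Pi_{\boldsymbol{n}}^{\boldsymbol{m}}$ of size $N(\boldsymbol{n})$ such that
\[
LT_{\boldsymbol{n}}^{\boldsymbol{m}}(a, f_1 \otimes \cdots \otimes f_d) = \Pi_{\boldsymbol{n}}^{\boldsymbol{m}} \bigl[ D_{\boldsymbol{m}}(a) \otimes T_{\lfloor \boldsymbol{n}/\boldsymbol{m} \rfloor}(f_1 \otimes \cdots \otimes f_d) \oplus O_{N(\boldsymbol{n}) - N(\boldsymbol{m}) N(\lfloor \boldsymbol{n}/\boldsymbol{m} \rfloor)} \bigr] (\Pi_{\boldsymbol{n}}^{\boldsymbol{m}})^T
\]
for every $a : [0,1]^d \to \mathbb{C}$ and every $f_1, \ldots, f_d \in L^1([-\pi,\pi])$.

Proof The proof is done by induction on $d$. For $d = 1$, the result holds with $\Pi_n^m = I_n$. For $d \ge 2$, set $\boldsymbol{\nu} = (n_2, \ldots, n_d)$ and $\boldsymbol{\mu} = (m_2, \ldots, m_d)$. By definition,
\[
LT_{\boldsymbol{n}}^{\boldsymbol{m}}(a, f_1 \otimes \cdots \otimes f_d) = \mathop{\mathrm{diag}}_{i_1=1,\ldots,m_1} \Bigl[ T_{\lfloor n_1/m_1 \rfloor}(f_1) \otimes LT_{\boldsymbol{\nu}}^{\boldsymbol{\mu}}\Bigl( a\Bigl(\frac{i_1}{m_1}, \cdot\Bigr), f_2 \otimes \cdots \otimes f_d \Bigr) \Bigr] \oplus O_{(n_1 \bmod m_1)\, n_2 \cdots n_d}, \tag{4.3}
\]
where $a(i_1/m_1, \cdot) : [0,1]^{d-1} \to \mathbb{C}$ is the function $(x_2, \ldots, x_d) \mapsto a(i_1/m_1, x_2, \ldots, x_d)$. By induction hypothesis, setting $N(\boldsymbol{\nu}, \boldsymbol{\mu}) = N(\boldsymbol{\nu}) - N(\boldsymbol{\mu}) N(\lfloor \boldsymbol{\nu}/\boldsymbol{\mu} \rfloor)$, we have
\[
LT_{\boldsymbol{\nu}}^{\boldsymbol{\mu}}\Bigl( a\Bigl(\frac{i_1}{m_1}, \cdot\Bigr), f_2 \otimes \cdots \otimes f_d \Bigr) = \Pi_{\boldsymbol{\nu}}^{\boldsymbol{\mu}} \Bigl[ D_{\boldsymbol{\mu}}\Bigl( a\Bigl(\frac{i_1}{m_1}, \cdot\Bigr) \Bigr) \otimes T_{\lfloor \boldsymbol{\nu}/\boldsymbol{\mu} \rfloor}(f_2 \otimes \cdots \otimes f_d) \oplus O_{N(\boldsymbol{\nu}, \boldsymbol{\mu})} \Bigr] (\Pi_{\boldsymbol{\nu}}^{\boldsymbol{\mu}})^T. \tag{4.4}
\]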
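Let us work on the argument of the "diag operator" in (4.3). From Lemma 2.6, Eq. (4.4) and the properties of tensor products (see Sect. 2.5), we get
\[
\begin{aligned}
& T_{\lfloor n_1/m_1 \rfloor}(f_1) \otimes LT_{\boldsymbol{\nu}}^{\boldsymbol{\mu}}\Bigl( a\Bigl(\frac{i_1}{m_1}, \cdot\Bigr), f_2 \otimes \cdots \otimes f_d \Bigr) \\
&\quad = \Pi_{(N(\boldsymbol{\nu}), \lfloor n_1/m_1 \rfloor);[2,1]} \Bigl[ LT_{\boldsymbol{\nu}}^{\boldsymbol{\mu}}\Bigl( a\Bigl(\frac{i_1}{m_1}, \cdot\Bigr), f_2 \otimes \cdots \otimes f_d \Bigr) \otimes T_{\lfloor n_1/m_1 \rfloor}(f_1) \Bigr] (\Pi_{(N(\boldsymbol{\nu}), \lfloor n_1/m_1 \rfloor);[2,1]})^T \\
&\quad = \Pi_{(N(\boldsymbol{\nu}), \lfloor n_1/m_1 \rfloor);[2,1]} \Bigl[ \Pi_{\boldsymbol{\nu}}^{\boldsymbol{\mu}} \Bigl( D_{\boldsymbol{\mu}}\Bigl( a\Bigl(\frac{i_1}{m_1}, \cdot\Bigr) \Bigr) \otimes T_{\lfloor \boldsymbol{\nu}/\boldsymbol{\mu} \rfloor}(f_2 \otimes \cdots \otimes f_d) \oplus O_{N(\boldsymbol{\nu}, \boldsymbol{\mu})} \Bigr) (\Pi_{\boldsymbol{\nu}}^{\boldsymbol{\mu}})^T \otimes T_{\lfloor n_1/m_1 \rfloor}(f_1) \Bigr] (\Pi_{(N(\boldsymbol{\nu}), \lfloor n_1/m_1 \rfloor);[2,1]})^T \\
&\quad = \Pi_{(N(\boldsymbol{\nu}), \lfloor n_1/m_1 \rfloor);[2,1]} \, (\Pi_{\boldsymbol{\nu}}^{\boldsymbol{\mu}} \otimes I_{\lfloor n_1/m_1 \rfloor}) \Bigl[ \Bigl( D_{\boldsymbol{\mu}}\Bigl( a\Bigl(\frac{i_1}{m_1}, \cdot\Bigr) \Bigr) \otimes T_{\lfloor \boldsymbol{\nu}/\boldsymbol{\mu} \rfloor}(f_2 \otimes \cdots \otimes f_d) \oplus O_{N(\boldsymbol{\nu}, \boldsymbol{\mu})} \Bigr) \otimes T_{\lfloor n_1/m_1 \rfloor}(f_1) \Bigr] \\
&\qquad \cdot (\Pi_{\boldsymbol{\nu}}^{\boldsymbol{\mu}} \otimes I_{\lfloor n_1/m_1 \rfloor})^T (\Pi_{(N(\boldsymbol{\nu}), \lfloor n_1/m_1 \rfloor);[2,1]})^T.
\end{aligned} \tag{4.5}
\]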
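Using Eq. (2.35), Lemmas 2.6, 3.3 and the properties of tensor products and direct sums (see Sect. 2.5), we obtain
\[
\begin{aligned}
& \Bigl( D_{\boldsymbol{\mu}}\Bigl( a\Bigl(\frac{i_1}{m_1}, \cdot\Bigr) \Bigr) \otimes T_{\lfloor \boldsymbol{\nu}/\boldsymbol{\mu} \rfloor}(f_2 \otimes \cdots \otimes f_d) \oplus O_{N(\boldsymbol{\nu}, \boldsymbol{\mu})} \Bigr) \otimes T_{\lfloor n_1/m_1 \rfloor}(f_1) \\
&\quad = D_{\boldsymbol{\mu}}\Bigl( a\Bigl(\frac{i_1}{m_1}, \cdot\Bigr) \Bigr) \otimes T_{\lfloor \boldsymbol{\nu}/\boldsymbol{\mu} \rfloor}(f_2 \otimes \cdots \otimes f_d) \otimes T_{\lfloor n_1/m_1 \rfloor}(f_1) \oplus O_{N(\boldsymbol{\nu}, \boldsymbol{\mu}) \lfloor n_1/m_1 \rfloor} \\
&\quad = \Pi_{(N(\boldsymbol{\mu}), \lfloor n_1/m_1 \rfloor, N(\lfloor \boldsymbol{\nu}/\boldsymbol{\mu} \rfloor));[1,3,2]} \Bigl[ D_{\boldsymbol{\mu}}\Bigl( a\Bigl(\frac{i_1}{m_1}, \cdot\Bigr) \Bigr) \otimes T_{\lfloor n_1/m_1 \rfloor}(f_1) \otimes T_{\lfloor \boldsymbol{\nu}/\boldsymbol{\mu} \rfloor}(f_2 \otimes \cdots \otimes f_d) \Bigr] (\Pi_{(N(\boldsymbol{\mu}), \lfloor n_1/m_1 \rfloor, N(\lfloor \boldsymbol{\nu}/\boldsymbol{\mu} \rfloor));[1,3,2]})^T \oplus O_{N(\boldsymbol{\nu}, \boldsymbol{\mu}) \lfloor n_1/m_1 \rfloor} \\
&\quad = \Pi_{(N(\boldsymbol{\mu}), \lfloor n_1/m_1 \rfloor, N(\lfloor \boldsymbol{\nu}/\boldsymbol{\mu} \rfloor));[1,3,2]} \Bigl[ D_{\boldsymbol{\mu}}\Bigl( a\Bigl(\frac{i_1}{m_1}, \cdot\Bigr) \Bigr) \otimes T_{\lfloor \boldsymbol{n}/\boldsymbol{m} \rfloor}(f_1 \otimes \cdots \otimes f_d) \Bigr] (\Pi_{(N(\boldsymbol{\mu}), \lfloor n_1/m_1 \rfloor, N(\lfloor \boldsymbol{\nu}/\boldsymbol{\mu} \rfloor));[1,3,2]})^T \oplus O_{N(\boldsymbol{\nu}, \boldsymbol{\mu}) \lfloor n_1/m_1 \rfloor} \\
&\quad = (\Pi_{(N(\boldsymbol{\mu}), \lfloor n_1/m_1 \rfloor, N(\lfloor \boldsymbol{\nu}/\boldsymbol{\mu} \rfloor));[1,3,2]} \oplus I_{N(\boldsymbol{\nu}, \boldsymbol{\mu}) \lfloor n_1/m_1 \rfloor}) \Bigl[ D_{\boldsymbol{\mu}}\Bigl( a\Bigl(\frac{i_1}{m_1}, \cdot\Bigr) \Bigr) \otimes T_{\lfloor \boldsymbol{n}/\boldsymbol{m} \rfloor}(f_1 \otimes \cdots \otimes f_d) \oplus O_{N(\boldsymbol{\nu}, \boldsymbol{\mu}) \lfloor n_1/m_1 \rfloor} \Bigr] \\
&\qquad \cdot (\Pi_{(N(\boldsymbol{\mu}), \lfloor n_1/m_1 \rfloor, N(\lfloor \boldsymbol{\nu}/\boldsymbol{\mu} \rfloor));[1,3,2]} \oplus I_{N(\boldsymbol{\nu}, \boldsymbol{\mu}) \lfloor n_1/m_1 \rfloor})^T.
\end{aligned} \tag{4.6}
\]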
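Substituting (4.6) into (4.5), we arrive at
\[
T_{\lfloor n_1/m_1 \rfloor}(f_1) \otimes LT_{\boldsymbol{\nu}}^{\boldsymbol{\mu}}\Bigl( a\Bigl(\frac{i_1}{m_1}, \cdot\Bigr), f_2 \otimes \cdots \otimes f_d \Bigr)
= P_{\boldsymbol{n}}^{\boldsymbol{m}} \Bigl[ D_{\boldsymbol{\mu}}\Bigl( a\Bigl(\frac{i_1}{m_1}, \cdot\Bigr) \Bigr) \otimes T_{\lfloor \boldsymbol{n}/\boldsymbol{m} \rfloor}(f_1 \otimes \cdots \otimes f_d) \oplus O_{N(\boldsymbol{\nu}, \boldsymbol{\mu}) \lfloor n_1/m_1 \rfloor} \Bigr] (P_{\boldsymbol{n}}^{\boldsymbol{m}})^T, \tag{4.7}
\]
where
\[
P_{\boldsymbol{n}}^{\boldsymbol{m}} = \Pi_{(N(\boldsymbol{\nu}), \lfloor n_1/m_1 \rfloor);[2,1]} \, (\Pi_{\boldsymbol{\nu}}^{\boldsymbol{\mu}} \otimes I_{\lfloor n_1/m_1 \rfloor}) \, (\Pi_{(N(\boldsymbol{\mu}), \lfloor n_1/m_1 \rfloor, N(\lfloor \boldsymbol{\nu}/\boldsymbol{\mu} \rfloor));[1,3,2]} \oplus I_{N(\boldsymbol{\nu}, \boldsymbol{\mu}) \lfloor n_1/m_1 \rfloor}).
\]
Combining (4.7) and (4.3), we obtain
\[
LT_{\boldsymbol{n}}^{\boldsymbol{m}}(a, f_1 \otimes \cdots \otimes f_d)
= \Bigl( \bigoplus_{i_1=1}^{m_1} P_{\boldsymbol{n}}^{\boldsymbol{m}} \Bigr) \Bigl[ \mathop{\mathrm{diag}}_{i_1=1,\ldots,m_1} \Bigl( D_{\boldsymbol{\mu}}\Bigl( a\Bigl(\frac{i_1}{m_1}, \cdot\Bigr) \Bigr) \otimes T_{\lfloor \boldsymbol{n}/\boldsymbol{m} \rfloor}(f_1 \otimes \cdots \otimes f_d) \oplus O_{N(\boldsymbol{\nu}, \boldsymbol{\mu}) \lfloor n_1/m_1 \rfloor} \Bigr) \Bigr] \Bigl( \bigoplus_{i_1=1}^{m_1} P_{\boldsymbol{n}}^{\boldsymbol{m}} \Bigr)^T \oplus O_{(n_1 \bmod m_1)\, n_2 \cdots n_d}.
\]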
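From Lemma 2.7 and Eqs. (2.35) and (4.2),
\[
\begin{aligned}
& \mathop{\mathrm{diag}}_{i_1=1,\ldots,m_1} \Bigl[ D_{\boldsymbol{\mu}}\Bigl( a\Bigl(\frac{i_1}{m_1}, \cdot\Bigr) \Bigr) \otimes T_{\lfloor \boldsymbol{n}/\boldsymbol{m} \rfloor}(f_1 \otimes \cdots \otimes f_d) \oplus O_{N(\boldsymbol{\nu}, \boldsymbol{\mu}) \lfloor n_1/m_1 \rfloor} \Bigr] \\
&\quad = \bigoplus_{i_1=1}^{m_1} \Bigl[ D_{\boldsymbol{\mu}}\Bigl( a\Bigl(\frac{i_1}{m_1}, \cdot\Bigr) \Bigr) \otimes T_{\lfloor \boldsymbol{n}/\boldsymbol{m} \rfloor}(f_1 \otimes \cdots \otimes f_d) \oplus O_{N(\boldsymbol{\nu}, \boldsymbol{\mu}) \lfloor n_1/m_1 \rfloor} \Bigr] \\
&\quad = V_{\boldsymbol{n}}^{\boldsymbol{m}} \Bigl[ \Bigl( \bigoplus_{i_1=1}^{m_1} D_{\boldsymbol{\mu}}\Bigl( a\Bigl(\frac{i_1}{m_1}, \cdot\Bigr) \Bigr) \otimes T_{\lfloor \boldsymbol{n}/\boldsymbol{m} \rfloor}(f_1 \otimes \cdots \otimes f_d) \Bigr) \oplus O_{N(\boldsymbol{\nu}, \boldsymbol{\mu}) \lfloor n_1/m_1 \rfloor m_1} \Bigr] (V_{\boldsymbol{n}}^{\boldsymbol{m}})^T \\
&\quad = V_{\boldsymbol{n}}^{\boldsymbol{m}} \bigl[ D_{\boldsymbol{m}}(a) \otimes T_{\lfloor \boldsymbol{n}/\boldsymbol{m} \rfloor}(f_1 \otimes \cdots \otimes f_d) \oplus O_{N(\boldsymbol{\nu}, \boldsymbol{\mu}) \lfloor n_1/m_1 \rfloor m_1} \bigr] (V_{\boldsymbol{n}}^{\boldsymbol{m}})^T,
\end{aligned}
\]
where $V_{\boldsymbol{n}}^{\boldsymbol{m}} = V_{\boldsymbol{h}(\boldsymbol{m}, \boldsymbol{n});\sigma}$, $\sigma = [1, m_1 + 1, 2, m_1 + 2, \ldots, m_1, 2m_1]$, and
\[
\boldsymbol{h}(\boldsymbol{m}, \boldsymbol{n}) = \bigl( \underbrace{N(\boldsymbol{\mu}) N(\lfloor \boldsymbol{n}/\boldsymbol{m} \rfloor), \ldots, N(\boldsymbol{\mu}) N(\lfloor \boldsymbol{n}/\boldsymbol{m} \rfloor)}_{m_1}, \underbrace{N(\boldsymbol{\nu}, \boldsymbol{\mu}) \lfloor n_1/m_1 \rfloor, \ldots, N(\boldsymbol{\nu}, \boldsymbol{\mu}) \lfloor n_1/m_1 \rfloor}_{m_1} \bigr).
\]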
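Thus,
\[
\begin{aligned}
LT_{\boldsymbol{n}}^{\boldsymbol{m}}(a, f_1 \otimes \cdots \otimes f_d)
&= \Bigl( \bigoplus_{i_1=1}^{m_1} P_{\boldsymbol{n}}^{\boldsymbol{m}} \Bigr) V_{\boldsymbol{n}}^{\boldsymbol{m}} \bigl[ D_{\boldsymbol{m}}(a) \otimes T_{\lfloor \boldsymbol{n}/\boldsymbol{m} \rfloor}(f_1 \otimes \cdots \otimes f_d) \oplus O_{N(\boldsymbol{\nu}, \boldsymbol{\mu}) \lfloor n_1/m_1 \rfloor m_1} \bigr] (V_{\boldsymbol{n}}^{\boldsymbol{m}})^T \Bigl( \bigoplus_{i_1=1}^{m_1} P_{\boldsymbol{n}}^{\boldsymbol{m}} \Bigr)^T \oplus O_{(n_1 \bmod m_1)\, n_2 \cdots n_d} \\
&= \Bigl[ \Bigl( \bigoplus_{i_1=1}^{m_1} P_{\boldsymbol{n}}^{\boldsymbol{m}} \Bigr) V_{\boldsymbol{n}}^{\boldsymbol{m}} \oplus I_{(n_1 \bmod m_1)\, n_2 \cdots n_d} \Bigr] \bigl[ D_{\boldsymbol{m}}(a) \otimes T_{\lfloor \boldsymbol{n}/\boldsymbol{m} \rfloor}(f_1 \otimes \cdots \otimes f_d) \oplus O_{N(\boldsymbol{\nu}, \boldsymbol{\mu}) \lfloor n_1/m_1 \rfloor m_1 + (n_1 \bmod m_1)\, n_2 \cdots n_d} \bigr] \\
&\qquad \cdot \Bigl[ (V_{\boldsymbol{n}}^{\boldsymbol{m}})^T \Bigl( \bigoplus_{i_1=1}^{m_1} P_{\boldsymbol{n}}^{\boldsymbol{m}} \Bigr)^T \oplus I_{(n_1 \bmod m_1)\, n_2 \cdots n_d} \Bigr].
\end{aligned}
\]
This concludes the proof; note that the permutation matrix $\Pi_{\boldsymbol{n}}^{\boldsymbol{m}}$ is given by
\[
\Pi_{\boldsymbol{n}}^{\boldsymbol{m}} = \Bigl( \bigoplus_{i_1=1}^{m_1} P_{\boldsymbol{n}}^{\boldsymbol{m}} \Bigr) V_{\boldsymbol{n}}^{\boldsymbol{m}} \oplus I_{(n_1 \bmod m_1)\, n_2 \cdots n_d}
\]
and, moreover, the number $N(\boldsymbol{\nu}, \boldsymbol{\mu}) \lfloor n_1/m_1 \rfloor m_1 + (n_1 \bmod m_1)\, n_2 \cdots n_d$ is equal to $N(\boldsymbol{n}) - N(\boldsymbol{m}) N(\lfloor \boldsymbol{n}/\boldsymbol{m} \rfloor)$.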
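Definition 4.2 (multilevel locally Toeplitz operator) Let $\boldsymbol{m}, \boldsymbol{n} \in \mathbb{N}^d$, let $a : [0,1]^d \to \mathbb{C}$, and let $f \in L^1([-\pi,\pi]^d)$. The associated ($d$-level) Locally Toeplitz (LT) operator is defined as the following $N(\boldsymbol{n}) \times N(\boldsymbol{n})$ matrix:
\[
LT_{\boldsymbol{n}}^{\boldsymbol{m}}(a, f) = \Pi_{\boldsymbol{n}}^{\boldsymbol{m}} \bigl[ D_{\boldsymbol{m}}(a) \otimes T_{\lfloor \boldsymbol{n}/\boldsymbol{m} \rfloor}(f) \oplus O_{N(\boldsymbol{n}) - N(\boldsymbol{m}) N(\lfloor \boldsymbol{n}/\boldsymbol{m} \rfloor)} \bigr] (\Pi_{\boldsymbol{n}}^{\boldsymbol{m}})^T,
\]
where $\Pi_{\boldsymbol{n}}^{\boldsymbol{m}}$ is the permutation matrix appearing in Theorem 4.1.

Remark 4.1 We note that $LT_{\boldsymbol{n}}^{\boldsymbol{m}}(a, f) = LT_{\boldsymbol{n}}^{\boldsymbol{m}}(a, g)$ whenever $f = g$ a.e. Moreover, suppose that $f = f_1 \otimes \cdots \otimes f_d$ a.e., with $f_1, \ldots, f_d \in L^1([-\pi,\pi])$; then $LT_{\boldsymbol{n}}^{\boldsymbol{m}}(a, f)$ is equal to $LT_{\boldsymbol{n}}^{\boldsymbol{m}}(a, f_1 \otimes \cdots \otimes f_d)$, as defined by Definition 4.1. This shows that Definition 4.2 is an extension of Definition 4.1.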
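4.1.2 Properties of the Multilevel LT Operator

We now investigate the properties of the multilevel LT operator $LT_{\boldsymbol{n}}^{\boldsymbol{m}}(a, f)$. Let $\mathbb{C}^{[0,1]^d}$ be the complex vector space of all functions $a : [0,1]^d \to \mathbb{C}$. For each fixed $\boldsymbol{n}, \boldsymbol{m} \in \mathbb{N}^d$, the map
\[
LT_{\boldsymbol{n}}^{\boldsymbol{m}}(a, \cdot) : L^1([-\pi,\pi]^d) \to \mathbb{C}^{N(\boldsymbol{n}) \times N(\boldsymbol{n})}, \qquad f \mapsto LT_{\boldsymbol{n}}^{\boldsymbol{m}}(a, f), \tag{4.8}
\]
is linear for any $a : [0,1]^d \to \mathbb{C}$, and the map
\[
LT_{\boldsymbol{n}}^{\boldsymbol{m}}(\cdot, f) : \mathbb{C}^{[0,1]^d} \to \mathbb{C}^{N(\boldsymbol{n}) \times N(\boldsymbol{n})}, \qquad a \mapsto LT_{\boldsymbol{n}}^{\boldsymbol{m}}(a, f), \tag{4.9}
\]
is linear for any $f \in L^1([-\pi,\pi]^d)$. This follows from Definition 4.2, the linearity of the maps $a \mapsto D_{\boldsymbol{m}}(a)$ and $f \mapsto T_{\lfloor \boldsymbol{n}/\boldsymbol{m} \rfloor}(f)$, and the bilinearity of tensor products. For any $\boldsymbol{n}, \boldsymbol{m} \in \mathbb{N}^d$ and any pair of functions $a : [0,1]^d \to \mathbb{C}$ and $f \in L^1([-\pi,\pi]^d)$, we have
\[
(LT_{\boldsymbol{n}}^{\boldsymbol{m}}(a, f))^* = LT_{\boldsymbol{n}}^{\boldsymbol{m}}(\overline{a}, \overline{f}) \tag{4.10}
\]
and
\[
\begin{aligned}
\| LT_{\boldsymbol{n}}^{\boldsymbol{m}}(a, f) \|_p
&= \| D_{\boldsymbol{m}}(a) \|_p \, \| T_{\lfloor \boldsymbol{n}/\boldsymbol{m} \rfloor}(f) \|_p
= \Bigl\| \mathop{\mathrm{diag}}_{\boldsymbol{i}=\boldsymbol{1},\ldots,\boldsymbol{m}} a\Bigl(\frac{\boldsymbol{i}}{\boldsymbol{m}}\Bigr) \Bigr\|_p \, \| T_{\lfloor \boldsymbol{n}/\boldsymbol{m} \rfloor}(f) \|_p \\
&= \begin{cases}
\max_{\boldsymbol{i}=\boldsymbol{1},\ldots,\boldsymbol{m}} \bigl| a\bigl(\frac{\boldsymbol{i}}{\boldsymbol{m}}\bigr) \bigr| \, \| T_{\lfloor \boldsymbol{n}/\boldsymbol{m} \rfloor}(f) \|, & \text{if } p = \infty, \\[4pt]
\Bigl( \sum_{\boldsymbol{i}=\boldsymbol{1}}^{\boldsymbol{m}} \bigl| a\bigl(\frac{\boldsymbol{i}}{\boldsymbol{m}}\bigr) \bigr|^p \Bigr)^{1/p} \| T_{\lfloor \boldsymbol{n}/\boldsymbol{m} \rfloor}(f) \|_p, & \text{if } 1 \le p < \infty.
\end{cases}
\end{aligned} \tag{4.11}
\]
Equation (4.10) follows from Definition 4.2, the relations $(X \otimes Y)^* = X^* \otimes Y^*$ and $(X \oplus Y)^* = X^* \oplus Y^*$, and the identity $(T_{\boldsymbol{k}}(f))^* = T_{\boldsymbol{k}}(\overline{f})$ in (3.13). The equations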
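in (4.11) follow from Definition 4.2, the invariance of $\|\cdot\|_p$ with respect to unitary transformations (such as permutations), and Eqs. (2.25)–(2.26).

Now, let $1 \le p, q \le \infty$ be conjugate exponents. If $f \in L^p([-\pi,\pi]^d)$ and $\tilde f \in L^q([-\pi,\pi]^d)$, then $f \tilde f \in L^1([-\pi,\pi]^d)$ by Hölder's inequality, so for any $a, \tilde a : [0,1]^d \to \mathbb{C}$ we can consider the matrices $LT_{\boldsymbol{n}}^{\boldsymbol{m}}(a, f)$, $LT_{\boldsymbol{n}}^{\boldsymbol{m}}(\tilde a, \tilde f)$, $LT_{\boldsymbol{n}}^{\boldsymbol{m}}(a \tilde a, f \tilde f)$. In Proposition 4.1 we show that $LT_{\boldsymbol{n}}^{\boldsymbol{m}}(a, f)\, LT_{\boldsymbol{n}}^{\boldsymbol{m}}(\tilde a, \tilde f)$ is "close" to $LT_{\boldsymbol{n}}^{\boldsymbol{m}}(a \tilde a, f \tilde f)$, as long as $a, \tilde a$ are bounded.

Proposition 4.1 Let $a, \tilde a : [0,1]^d \to \mathbb{C}$ be bounded, and let $f \in L^p([-\pi,\pi]^d)$ and $\tilde f \in L^q([-\pi,\pi]^d)$, where $1 \le p, q \le \infty$ are conjugate exponents. Then, for every $\boldsymbol{n}, \boldsymbol{m} \in \mathbb{N}^d$,
\[
\| LT_{\boldsymbol{n}}^{\boldsymbol{m}}(a, f)\, LT_{\boldsymbol{n}}^{\boldsymbol{m}}(\tilde a, \tilde f) - LT_{\boldsymbol{n}}^{\boldsymbol{m}}(a \tilde a, f \tilde f) \|_1 \le \varepsilon(\lfloor \boldsymbol{n}/\boldsymbol{m} \rfloor)\, N(\boldsymbol{n}), \tag{4.12}
\]
where
\[
\varepsilon(\boldsymbol{k}) = \| a \tilde a \|_\infty \, \frac{\| T_{\boldsymbol{k}}(f) T_{\boldsymbol{k}}(\tilde f) - T_{\boldsymbol{k}}(f \tilde f) \|_1}{N(\boldsymbol{k})}
\]
and $\lim_{\boldsymbol{k} \to \infty} \varepsilon(\boldsymbol{k}) = 0$ by Theorem 3.3. In particular, for every $\boldsymbol{m} \in \mathbb{N}^d$ there exists $\boldsymbol{n}_{\boldsymbol{m}} \in \mathbb{N}^d$ such that, for $\boldsymbol{n} \ge \boldsymbol{n}_{\boldsymbol{m}}$,
\[
\| LT_{\boldsymbol{n}}^{\boldsymbol{m}}(a, f)\, LT_{\boldsymbol{n}}^{\boldsymbol{m}}(\tilde a, \tilde f) - LT_{\boldsymbol{n}}^{\boldsymbol{m}}(a \tilde a, f \tilde f) \|_1 \le \frac{N(\boldsymbol{n})}{N(\boldsymbol{m})} \tag{4.13}
\]
and
\[
LT_{\boldsymbol{n}}^{\boldsymbol{m}}(a, f)\, LT_{\boldsymbol{n}}^{\boldsymbol{m}}(\tilde a, \tilde f) = LT_{\boldsymbol{n}}^{\boldsymbol{m}}(a \tilde a, f \tilde f) + R_{\boldsymbol{n},\boldsymbol{m}} + N_{\boldsymbol{n},\boldsymbol{m}}, \qquad
\mathrm{rank}(R_{\boldsymbol{n},\boldsymbol{m}}) \le \frac{N(\boldsymbol{n})}{\sqrt{N(\boldsymbol{m})}}, \qquad \| N_{\boldsymbol{n},\boldsymbol{m}} \| \le \frac{1}{\sqrt{N(\boldsymbol{m})}}. \tag{4.14}
\]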
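Proof By Definition 4.2 and the properties of tensor products and direct sums,
\[
LT_{\boldsymbol{n}}^{\boldsymbol{m}}(a, f)\, LT_{\boldsymbol{n}}^{\boldsymbol{m}}(\tilde a, \tilde f) - LT_{\boldsymbol{n}}^{\boldsymbol{m}}(a \tilde a, f \tilde f)
= \Pi_{\boldsymbol{n}}^{\boldsymbol{m}} \bigl[ D_{\boldsymbol{m}}(a \tilde a) \otimes \bigl( T_{\lfloor \boldsymbol{n}/\boldsymbol{m} \rfloor}(f) T_{\lfloor \boldsymbol{n}/\boldsymbol{m} \rfloor}(\tilde f) - T_{\lfloor \boldsymbol{n}/\boldsymbol{m} \rfloor}(f \tilde f) \bigr) \oplus O \bigr] (\Pi_{\boldsymbol{n}}^{\boldsymbol{m}})^T.
\]
Hence,
\[
\begin{aligned}
\| LT_{\boldsymbol{n}}^{\boldsymbol{m}}(a, f)\, LT_{\boldsymbol{n}}^{\boldsymbol{m}}(\tilde a, \tilde f) - LT_{\boldsymbol{n}}^{\boldsymbol{m}}(a \tilde a, f \tilde f) \|_1
&= \| D_{\boldsymbol{m}}(a \tilde a) \|_1 \, \| T_{\lfloor \boldsymbol{n}/\boldsymbol{m} \rfloor}(f) T_{\lfloor \boldsymbol{n}/\boldsymbol{m} \rfloor}(\tilde f) - T_{\lfloor \boldsymbol{n}/\boldsymbol{m} \rfloor}(f \tilde f) \|_1 \\
&\le N(\boldsymbol{n}) \, \| a \tilde a \|_\infty \, \frac{\| T_{\lfloor \boldsymbol{n}/\boldsymbol{m} \rfloor}(f) T_{\lfloor \boldsymbol{n}/\boldsymbol{m} \rfloor}(\tilde f) - T_{\lfloor \boldsymbol{n}/\boldsymbol{m} \rfloor}(f \tilde f) \|_1}{N(\lfloor \boldsymbol{n}/\boldsymbol{m} \rfloor)},
\end{aligned}
\]
and (4.12) is proved. Since $\varepsilon(\boldsymbol{k}) \to 0$ as $\boldsymbol{k} \to \infty$, for every $\boldsymbol{m} \in \mathbb{N}^d$ there exists $\boldsymbol{n}_{\boldsymbol{m}} \in \mathbb{N}^d$ such that, for $\boldsymbol{n} \ge \boldsymbol{n}_{\boldsymbol{m}}$, (4.13) holds; and (4.14) follows from (4.13) and [22, Lemma 5.6].

Theorems 4.2 and 4.3 provide information about the asymptotic singular value and eigenvalue distribution of a finite sum of the form $\sum_{i=1}^p LT_{\boldsymbol{n}}^{\boldsymbol{m}}(a_i, f_i)$. As we shall see, they play a central role in the computation of the singular value and eigenvalue distribution of multilevel GLT sequences.

Theorem 4.2 Let $a_1, \ldots, a_p : [0,1]^d \to \mathbb{C}$ and let $f_1, \ldots, f_p \in L^1([-\pi,\pi]^d)$. Then, for every $\boldsymbol{m} \in \mathbb{N}^d$ and every $F \in C_c(\mathbb{R})$,
\[
\lim_{\boldsymbol{n} \to \infty} \frac{1}{N(\boldsymbol{n})} \sum_{r=1}^{N(\boldsymbol{n})} F\Bigl( \sigma_r\Bigl( \sum_{i=1}^p LT_{\boldsymbol{n}}^{\boldsymbol{m}}(a_i, f_i) \Bigr) \Bigr)
= \phi_{\boldsymbol{m}}(F) = \frac{1}{N(\boldsymbol{m})} \sum_{\boldsymbol{j}=\boldsymbol{1}}^{\boldsymbol{m}} \frac{1}{(2\pi)^d} \int_{[-\pi,\pi]^d} F\Bigl( \Bigl| \sum_{i=1}^p a_i\Bigl(\frac{\boldsymbol{j}}{\boldsymbol{m}}\Bigr) f_i(\boldsymbol{\theta}) \Bigr| \Bigr) d\boldsymbol{\theta}. \tag{4.15}
\]
Moreover, if $a_1, \ldots, a_p$ are Riemann-integrable, then, for every $F \in C_c(\mathbb{R})$,
\[
\lim_{\boldsymbol{m} \to \infty} \phi_{\boldsymbol{m}}(F) = \phi(F) = \frac{1}{(2\pi)^d} \int_{[0,1]^d \times [-\pi,\pi]^d} F\Bigl( \Bigl| \sum_{i=1}^p a_i(\boldsymbol{x}) f_i(\boldsymbol{\theta}) \Bigr| \Bigr) d\boldsymbol{x}\, d\boldsymbol{\theta}. \tag{4.16}
\]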
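Proof By Definition 4.2,
\[
(\Pi_{\boldsymbol{n}}^{\boldsymbol{m}})^T \Bigl[ \sum_{i=1}^p LT_{\boldsymbol{n}}^{\boldsymbol{m}}(a_i, f_i) \Bigr] \Pi_{\boldsymbol{n}}^{\boldsymbol{m}}
= \Bigl[ \sum_{i=1}^p D_{\boldsymbol{m}}(a_i) \otimes T_{\lfloor \boldsymbol{n}/\boldsymbol{m} \rfloor}(f_i) \Bigr] \oplus O_{N(\boldsymbol{n}) - N(\boldsymbol{m}) N(\lfloor \boldsymbol{n}/\boldsymbol{m} \rfloor)}. \tag{4.17}
\]
Recalling that $D_{\boldsymbol{m}}(a_i) = \mathrm{diag}_{\boldsymbol{j}=\boldsymbol{1},\ldots,\boldsymbol{m}}\, a_i(\boldsymbol{j}/\boldsymbol{m})$, for every $\boldsymbol{j} = \boldsymbol{1}, \ldots, \boldsymbol{m}$ the $\boldsymbol{j}$th diagonal block of size $N(\lfloor \boldsymbol{n}/\boldsymbol{m} \rfloor)$ of the matrix (4.17) is given by
\[
\sum_{i=1}^p a_i\Bigl(\frac{\boldsymbol{j}}{\boldsymbol{m}}\Bigr) T_{\lfloor \boldsymbol{n}/\boldsymbol{m} \rfloor}(f_i) = T_{\lfloor \boldsymbol{n}/\boldsymbol{m} \rfloor}\Bigl( \sum_{i=1}^p a_i\Bigl(\frac{\boldsymbol{j}}{\boldsymbol{m}}\Bigr) f_i \Bigr).
\]
It follows that the singular values of $\sum_{i=1}^p LT_{\boldsymbol{n}}^{\boldsymbol{m}}(a_i, f_i)$ are
\[
\sigma_k\Bigl( T_{\lfloor \boldsymbol{n}/\boldsymbol{m} \rfloor}\Bigl( \sum_{i=1}^p a_i\Bigl(\frac{\boldsymbol{j}}{\boldsymbol{m}}\Bigr) f_i \Bigr) \Bigr), \qquad k = 1, \ldots, N(\lfloor \boldsymbol{n}/\boldsymbol{m} \rfloor), \qquad \boldsymbol{j} = \boldsymbol{1}, \ldots, \boldsymbol{m},
\]
plus further $N(\boldsymbol{n}) - N(\boldsymbol{m}) N(\lfloor \boldsymbol{n}/\boldsymbol{m} \rfloor)$ singular values equal to 0; see (2.24). Therefore, by Theorem 3.5, for any $F \in C_c(\mathbb{R})$ we have
\[
\begin{aligned}
& \lim_{\boldsymbol{n} \to \infty} \frac{1}{N(\boldsymbol{n})} \sum_{r=1}^{N(\boldsymbol{n})} F\Bigl( \sigma_r\Bigl( \sum_{i=1}^p LT_{\boldsymbol{n}}^{\boldsymbol{m}}(a_i, f_i) \Bigr) \Bigr) \\
&\quad = \lim_{\boldsymbol{n} \to \infty} \frac{N(\boldsymbol{m}) N(\lfloor \boldsymbol{n}/\boldsymbol{m} \rfloor)}{N(\boldsymbol{n})} \times \frac{1}{N(\boldsymbol{m})} \sum_{\boldsymbol{j}=\boldsymbol{1}}^{\boldsymbol{m}} \frac{1}{N(\lfloor \boldsymbol{n}/\boldsymbol{m} \rfloor)} \sum_{k=1}^{N(\lfloor \boldsymbol{n}/\boldsymbol{m} \rfloor)} F\Bigl( \sigma_k\Bigl( T_{\lfloor \boldsymbol{n}/\boldsymbol{m} \rfloor}\Bigl( \sum_{i=1}^p a_i\Bigl(\frac{\boldsymbol{j}}{\boldsymbol{m}}\Bigr) f_i \Bigr) \Bigr) \Bigr) \\
&\quad = \frac{1}{N(\boldsymbol{m})} \sum_{\boldsymbol{j}=\boldsymbol{1}}^{\boldsymbol{m}} \frac{1}{(2\pi)^d} \int_{[-\pi,\pi]^d} F\Bigl( \Bigl| \sum_{i=1}^p a_i\Bigl(\frac{\boldsymbol{j}}{\boldsymbol{m}}\Bigr) f_i(\boldsymbol{\theta}) \Bigr| \Bigr) d\boldsymbol{\theta}.
\end{aligned} \tag{4.18}
\]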
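This proves (4.15). If we assume that $a_1, \ldots, a_p$ are Riemann-integrable, then the function $\boldsymbol{x} \mapsto F\bigl( \bigl| \sum_{i=1}^p a_i(\boldsymbol{x}) f_i(\boldsymbol{\theta}) \bigr| \bigr)$ is Riemann-integrable for each fixed $\boldsymbol{\theta} \in [-\pi,\pi]^d$, because it is the composition of a continuous function with a Riemann-integrable function. Hence, by Lemma 2.5, for each fixed $\boldsymbol{\theta} \in [-\pi,\pi]^d$ we have
\[
\lim_{\boldsymbol{m} \to \infty} \frac{1}{N(\boldsymbol{m})} \sum_{\boldsymbol{j}=\boldsymbol{1}}^{\boldsymbol{m}} F\Bigl( \Bigl| \sum_{i=1}^p a_i\Bigl(\frac{\boldsymbol{j}}{\boldsymbol{m}}\Bigr) f_i(\boldsymbol{\theta}) \Bigr| \Bigr) = \int_{[0,1]^d} F\Bigl( \Bigl| \sum_{i=1}^p a_i(\boldsymbol{x}) f_i(\boldsymbol{\theta}) \Bigr| \Bigr) d\boldsymbol{x}.
\]
Passing to the limit as $\boldsymbol{m} \to \infty$ in (4.18), and using the dominated convergence theorem, we get (4.16).

Theorem 4.3 Let $a_1, \ldots, a_p : [0,1]^d \to \mathbb{C}$ and let $f_1, \ldots, f_p \in L^1([-\pi,\pi]^d)$. Then, for every $\boldsymbol{m} \in \mathbb{N}^d$ and every $F \in C_c(\mathbb{C})$,
\[
\lim_{\boldsymbol{n} \to \infty} \frac{1}{N(\boldsymbol{n})} \sum_{r=1}^{N(\boldsymbol{n})} F\Bigl( \lambda_r\Bigl( \Re\Bigl( \sum_{i=1}^p LT_{\boldsymbol{n}}^{\boldsymbol{m}}(a_i, f_i) \Bigr) \Bigr) \Bigr)
= \phi_{\boldsymbol{m}}(F) = \frac{1}{N(\boldsymbol{m})} \sum_{\boldsymbol{j}=\boldsymbol{1}}^{\boldsymbol{m}} \frac{1}{(2\pi)^d} \int_{[-\pi,\pi]^d} F\Bigl( \Re\Bigl( \sum_{i=1}^p a_i\Bigl(\frac{\boldsymbol{j}}{\boldsymbol{m}}\Bigr) f_i(\boldsymbol{\theta}) \Bigr) \Bigr) d\boldsymbol{\theta}. \tag{4.19}
\]
Moreover, if $a_1, \ldots, a_p$ are Riemann-integrable, then, for every $F \in C_c(\mathbb{C})$,
\[
\lim_{\boldsymbol{m} \to \infty} \phi_{\boldsymbol{m}}(F) = \phi(F) = \frac{1}{(2\pi)^d} \int_{[0,1]^d \times [-\pi,\pi]^d} F\Bigl( \Re\Bigl( \sum_{i=1}^p a_i(\boldsymbol{x}) f_i(\boldsymbol{\theta}) \Bigr) \Bigr) d\boldsymbol{x}\, d\boldsymbol{\theta}. \tag{4.20}
\]
Proof The proof follows the same pattern as the proof of Theorem 4.2. By (4.10) and Definition 4.2,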
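\[
\begin{aligned}
(\Pi_{\boldsymbol{n}}^{\boldsymbol{m}})^T \, \Re\Bigl( \sum_{i=1}^p LT_{\boldsymbol{n}}^{\boldsymbol{m}}(a_i, f_i) \Bigr) \Pi_{\boldsymbol{n}}^{\boldsymbol{m}}
&= \frac{1}{2} (\Pi_{\boldsymbol{n}}^{\boldsymbol{m}})^T \Bigl[ \sum_{i=1}^p LT_{\boldsymbol{n}}^{\boldsymbol{m}}(a_i, f_i) + \sum_{i=1}^p LT_{\boldsymbol{n}}^{\boldsymbol{m}}(\overline{a_i}, \overline{f_i}) \Bigr] \Pi_{\boldsymbol{n}}^{\boldsymbol{m}} \\
&= \frac{1}{2} \Bigl[ \sum_{i=1}^p D_{\boldsymbol{m}}(a_i) \otimes T_{\lfloor \boldsymbol{n}/\boldsymbol{m} \rfloor}(f_i) + \sum_{i=1}^p D_{\boldsymbol{m}}(\overline{a_i}) \otimes T_{\lfloor \boldsymbol{n}/\boldsymbol{m} \rfloor}(\overline{f_i}) \Bigr] \oplus O_{N(\boldsymbol{n}) - N(\boldsymbol{m}) N(\lfloor \boldsymbol{n}/\boldsymbol{m} \rfloor)}.
\end{aligned}
\]
For $\boldsymbol{1} \le \boldsymbol{j} \le \boldsymbol{m}$, the $\boldsymbol{j}$th block of this matrix is given by
\[
\frac{1}{2} \Bigl[ \sum_{i=1}^p a_i\Bigl(\frac{\boldsymbol{j}}{\boldsymbol{m}}\Bigr) T_{\lfloor \boldsymbol{n}/\boldsymbol{m} \rfloor}(f_i) + \sum_{i=1}^p \overline{a_i}\Bigl(\frac{\boldsymbol{j}}{\boldsymbol{m}}\Bigr) T_{\lfloor \boldsymbol{n}/\boldsymbol{m} \rfloor}(\overline{f_i}) \Bigr]
= T_{\lfloor \boldsymbol{n}/\boldsymbol{m} \rfloor}\Bigl( \Re\Bigl( \sum_{i=1}^p a_i\Bigl(\frac{\boldsymbol{j}}{\boldsymbol{m}}\Bigr) f_i \Bigr) \Bigr).
\]
It follows that the eigenvalues of $\Re\bigl( \sum_{i=1}^p LT_{\boldsymbol{n}}^{\boldsymbol{m}}(a_i, f_i) \bigr)$ are
\[
\lambda_k\Bigl( T_{\lfloor \boldsymbol{n}/\boldsymbol{m} \rfloor}\Bigl( \Re\Bigl( \sum_{i=1}^p a_i\Bigl(\frac{\boldsymbol{j}}{\boldsymbol{m}}\Bigr) f_i \Bigr) \Bigr) \Bigr), \qquad k = 1, \ldots, N(\lfloor \boldsymbol{n}/\boldsymbol{m} \rfloor), \qquad \boldsymbol{j} = \boldsymbol{1}, \ldots, \boldsymbol{m},
\]
plus further $N(\boldsymbol{n}) - N(\boldsymbol{m}) N(\lfloor \boldsymbol{n}/\boldsymbol{m} \rfloor)$ eigenvalues equal to 0; see (2.23). Therefore, by Theorem 3.5, for any $F \in C_c(\mathbb{C})$ we have
\[
\begin{aligned}
& \lim_{\boldsymbol{n} \to \infty} \frac{1}{N(\boldsymbol{n})} \sum_{r=1}^{N(\boldsymbol{n})} F\Bigl( \lambda_r\Bigl( \Re\Bigl( \sum_{i=1}^p LT_{\boldsymbol{n}}^{\boldsymbol{m}}(a_i, f_i) \Bigr) \Bigr) \Bigr) \\
&\quad = \lim_{\boldsymbol{n} \to \infty} \frac{N(\boldsymbol{m}) N(\lfloor \boldsymbol{n}/\boldsymbol{m} \rfloor)}{N(\boldsymbol{n})} \times \frac{1}{N(\boldsymbol{m})} \sum_{\boldsymbol{j}=\boldsymbol{1}}^{\boldsymbol{m}} \frac{1}{N(\lfloor \boldsymbol{n}/\boldsymbol{m} \rfloor)} \sum_{k=1}^{N(\lfloor \boldsymbol{n}/\boldsymbol{m} \rfloor)} F\Bigl( \lambda_k\Bigl( T_{\lfloor \boldsymbol{n}/\boldsymbol{m} \rfloor}\Bigl( \Re\Bigl( \sum_{i=1}^p a_i\Bigl(\frac{\boldsymbol{j}}{\boldsymbol{m}}\Bigr) f_i \Bigr) \Bigr) \Bigr) \Bigr) \\
&\quad = \frac{1}{N(\boldsymbol{m})} \sum_{\boldsymbol{j}=\boldsymbol{1}}^{\boldsymbol{m}} \frac{1}{(2\pi)^d} \int_{[-\pi,\pi]^d} F\Bigl( \Re\Bigl( \sum_{i=1}^p a_i\Bigl(\frac{\boldsymbol{j}}{\boldsymbol{m}}\Bigr) f_i(\boldsymbol{\theta}) \Bigr) \Bigr) d\boldsymbol{\theta}.
\end{aligned} \tag{4.21}
\]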
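This proves (4.19). If we assume that $a_1, \ldots, a_p$ are Riemann-integrable, then the function $\boldsymbol{x} \mapsto F\bigl( \Re\bigl( \sum_{i=1}^p a_i(\boldsymbol{x}) f_i(\boldsymbol{\theta}) \bigr) \bigr)$ is Riemann-integrable for each fixed $\boldsymbol{\theta} \in [-\pi,\pi]^d$, and so, by Lemma 2.5,
\[
\lim_{\boldsymbol{m} \to \infty} \frac{1}{N(\boldsymbol{m})} \sum_{\boldsymbol{j}=\boldsymbol{1}}^{\boldsymbol{m}} F\Bigl( \Re\Bigl( \sum_{i=1}^p a_i\Bigl(\frac{\boldsymbol{j}}{\boldsymbol{m}}\Bigr) f_i(\boldsymbol{\theta}) \Bigr) \Bigr) = \int_{[0,1]^d} F\Bigl( \Re\Bigl( \sum_{i=1}^p a_i(\boldsymbol{x}) f_i(\boldsymbol{\theta}) \Bigr) \Bigr) d\boldsymbol{x}.
\]
Passing to the limit as $\boldsymbol{m} \to \infty$ in (4.21), and using the dominated convergence theorem, we get (4.20).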
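The passage from $\phi_{\boldsymbol{m}}(F)$ in (4.15) to $\phi(F)$ in (4.16) is a Riemann-sum limit in the space variable, and it can be observed numerically in the simplest setting $d = 1$, $p = 1$. The sketch below is only illustrative: the choices $a(x) = x$, $f(\theta) = 2 - 2\cos\theta$, the hat-shaped test function `F`, and the midpoint quadrature with `K` and `M` nodes are all assumptions introduced for this example, and the "reference" value of $\phi(F)$ is itself a numerical approximation.

```python
import math

def F(y):
    """A continuous, compactly supported test function (hat centered at 1)."""
    return max(0.0, 1.0 - abs(y - 1.0))

def a(x):
    return x

def f(theta):
    return 2.0 - 2.0 * math.cos(theta)

def theta_mean(x, K=200):
    """Midpoint approximation of (1/2pi) * int_{-pi}^{pi} F(|a(x) f(theta)|) dtheta."""
    h = 2.0 * math.pi / K
    return sum(F(abs(a(x) * f(-math.pi + (k + 0.5) * h))) for k in range(K)) / K

def phi_m(m):
    """phi_m(F) for d = 1, p = 1: average of the theta-means over the samples j/m."""
    return sum(theta_mean(j / m) for j in range(1, m + 1)) / m

def phi_reference(M=1000):
    """Approximation of phi(F): midpoint rule in x on a fine grid."""
    return sum(theta_mean((j + 0.5) / M) for j in range(M)) / M

phi_ref = phi_reference()
errors = {m: abs(phi_m(m) - phi_ref) for m in (4, 16, 64)}
```

Printing `errors` shows the gap between $\phi_m(F)$ and the reference value shrinking as $m$ grows, in line with (4.16). The same $\theta$-quadrature is used on both sides, so the comparison isolates the effect of the sampling in $x$.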
4.2 Definition of Multilevel LT and sLT Sequences

From a historical point of view, the theory of multilevel LT sequences started with the notion of multilevel separable LT (sLT) sequences. For convenience, however, we first introduce the notion of multilevel LT sequences and then, as a special case, we will obtain the notion of multilevel sLT sequences.

Definition 4.3 (multilevel locally Toeplitz sequence) Let $\{A_{\boldsymbol{n}}\}_n$ be a $d$-level matrix-sequence, let $a : [0,1]^d \to \mathbb{C}$ be Riemann-integrable and let $f \in L^1([-\pi,\pi]^d)$. We say that $\{A_{\boldsymbol{n}}\}_n$ is a ($d$-level) Locally Toeplitz (LT) sequence with symbol $a \otimes f$, and we write $\{A_{\boldsymbol{n}}\}_n \sim_{\mathrm{LT}} a \otimes f$, if
\[
\{LT_{\boldsymbol{n}}^{\boldsymbol{m}}(a, f)\}_n \xrightarrow{\text{a.c.s.}} \{A_{\boldsymbol{n}}\}_n \quad \text{as } \boldsymbol{m} \to \infty.
\]
The symbol $a \otimes f$ is sometimes called the kernel of $\{A_{\boldsymbol{n}}\}_n$. The functions $a$ and $f$ are, respectively, the weight function and the generating function of $\{A_{\boldsymbol{n}}\}_n$; we refer the reader to [22, Sect. 7.1] for the origin and the meaning of this terminology.

Definition 4.4 (multilevel separable locally Toeplitz sequence) Let $\{A_{\boldsymbol{n}}\}_n$ be a $d$-level matrix-sequence. We say that $\{A_{\boldsymbol{n}}\}_n$ is a ($d$-level) separable Locally Toeplitz (sLT) sequence if $\{A_{\boldsymbol{n}}\}_n \sim_{\mathrm{LT}} a \otimes f$ for some Riemann-integrable function $a : [0,1]^d \to \mathbb{C}$ and some separable function $f \in L^1([-\pi,\pi]^d)$. In this case, we write $\{A_{\boldsymbol{n}}\}_n \sim_{\mathrm{sLT}} a \otimes f$.

It is clear from the definition that an sLT sequence is just an LT sequence with separable generating function. From now on, if we write $\{A_{\boldsymbol{n}}\}_n \sim_{\mathrm{LT}} a \otimes f$ (resp., $\{A_{\boldsymbol{n}}\}_n \sim_{\mathrm{sLT}} a \otimes f$), it is understood that $a : [0,1]^d \to \mathbb{C}$ is Riemann-integrable and $f \in L^1([-\pi,\pi]^d)$ (resp., $f \in L^1([-\pi,\pi]^d)$ is separable).
4.3 Fundamental Examples of Multilevel LT Sequences

In this section we provide three fundamental examples of multilevel LT sequences: zero-distributed sequences, sequences of multilevel diagonal sampling matrices and multilevel Toeplitz sequences. These may be regarded as the "building blocks" of the theory of multilevel GLT sequences, because from them we can construct, through algebraic operations, many other matrix-sequences that will turn out to be multilevel GLT sequences.
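4.3.1 Zero-Distributed Sequences

We have introduced zero-distributed sequences in Sect. 2.6.3. We now show that any $d$-level zero-distributed sequence is a $d$-level LT sequence with symbol 0.

Theorem 4.4 Let $\{Z_{\boldsymbol{n}}\}_n$ be a $d$-level matrix-sequence. The following conditions are equivalent.
1. $\{Z_{\boldsymbol{n}}\}_n \sim_\sigma 0$.
2. $\{O_{N(\boldsymbol{n})}\}_n \xrightarrow{\text{a.c.s.}} \{Z_{\boldsymbol{n}}\}_n$.
3. $\{Z_{\boldsymbol{n}}\}_n \sim_{\mathrm{sLT}} 0$.

Proof (1 ⟺ 2) By Theorem 2.5, we have $\{O_{N(\boldsymbol{n})}\}_n \xrightarrow{\text{a.c.s.}} \{Z_{\boldsymbol{n}}\}_n$ if and only if $d_{\text{a.c.s.}}(\{Z_{\boldsymbol{n}}\}_n, \{O_{N(\boldsymbol{n})}\}_n) \to 0$ if and only if $d_{\text{a.c.s.}}(\{Z_{\boldsymbol{n}}\}_n, \{O_{N(\boldsymbol{n})}\}_n) = 0$ if and only if $\{Z_{\boldsymbol{n}}\}_n \sim_\sigma 0$.

(2 ⟺ 3) This follows from the definition of $d$-level sLT sequences (Definition 4.4) and the observation that $LT_{\boldsymbol{n}}^{\boldsymbol{m}}(0, 0) = O_{N(\boldsymbol{n})}$ and $0 \otimes 0 = 0$.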
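4.3.2 Sequences of Multilevel Diagonal Sampling Matrices

To any $a : [0,1]^d \to \mathbb{C}$ we associate the family of $d$-level diagonal sampling matrices generated by $a$, that is, the family $\{D_{\boldsymbol{n}}(a)\}_{\boldsymbol{n} \in \mathbb{N}^d}$ with $D_{\boldsymbol{n}}(a)$ defined as in (4.1). We are going to show that any $d$-level matrix-sequence of the form $\{D_{\boldsymbol{n}}(a)\}_n$ with $\boldsymbol{n} = \boldsymbol{n}(n) \to \infty$ as $n \to \infty$ is a $d$-level sLT sequence with symbol $(a \otimes 1)(\boldsymbol{x}, \boldsymbol{\theta}) = a(\boldsymbol{x})$, as long as $a$ is Riemann-integrable. For the proof we need the following technical lemma.

Lemma 4.1 Let $M$ be any infinite subset of $\mathbb{N}$. For every $m \in M$ let $\{x(m, \boldsymbol{k})\}_{\boldsymbol{k} \in \mathbb{N}^d}$ be a family of numbers such that $x(m, \boldsymbol{k}) \to x(m)$ as $\boldsymbol{k} \to \infty$, where $x(m) \to 0$ as $m \to \infty$. Then, there exists a family $\{m(\boldsymbol{k})\}_{\boldsymbol{k} \in \mathbb{N}^d} \subseteq M$ such that $m(\boldsymbol{k}) \to \infty$ and $x(m(\boldsymbol{k}), \boldsymbol{k}) \to 0$ as $\boldsymbol{k} \to \infty$.

Proof Since $x(m, \boldsymbol{k}) \to x(m)$ for every $m \in M = \{m_1, m_2, m_3, \ldots\}$,
• for $m = m_1$ there exists $\boldsymbol{k}_1$ such that $|x(m_1, \boldsymbol{k}) - x(m_1)| \le 1/m_1$ for $\boldsymbol{k} \ge \boldsymbol{k}_1$;
• for $m = m_2$ there exists $\boldsymbol{k}_2 > \boldsymbol{k}_1$ such that $|x(m_2, \boldsymbol{k}) - x(m_2)| \le 1/m_2$ for $\boldsymbol{k} \ge \boldsymbol{k}_2$;
• for $m = m_3$ there exists $\boldsymbol{k}_3 > \boldsymbol{k}_2$ such that $|x(m_3, \boldsymbol{k}) - x(m_3)| \le 1/m_3$ for $\boldsymbol{k} \ge \boldsymbol{k}_3$;
• ...

Define
• $m(\boldsymbol{k}) = m_1$ for $\boldsymbol{k} \ngeq \boldsymbol{k}_2$;
• $m(\boldsymbol{k}) = m_2$ for $\boldsymbol{k} \ge \boldsymbol{k}_2 \wedge \boldsymbol{k} \ngeq \boldsymbol{k}_3$;
• $m(\boldsymbol{k}) = m_3$ for $\boldsymbol{k} \ge \boldsymbol{k}_3 \wedge \boldsymbol{k} \ngeq \boldsymbol{k}_4$;
• ...

By construction, $m(\boldsymbol{k}) \to \infty$ and $|x(m(\boldsymbol{k}), \boldsymbol{k}) - x(m(\boldsymbol{k}))| \le 1/m(\boldsymbol{k})$ for $\boldsymbol{k} \ge \boldsymbol{k}_2$. Since $x(m(\boldsymbol{k})) \to 0$ as $\boldsymbol{k} \to \infty$, we conclude that $x(m(\boldsymbol{k}), \boldsymbol{k}) \to 0$ as well.

Theorem 4.5 If $a : [0,1]^d \to \mathbb{C}$ is Riemann-integrable then $\{D_{\boldsymbol{n}}(a)\}_n \sim_{\mathrm{sLT}} a \otimes 1$ for any sequence $\{\boldsymbol{n} = \boldsymbol{n}(n)\}_n \subseteq \mathbb{N}^d$ such that $\boldsymbol{n} \to \infty$ as $n \to \infty$.

Proof The proof is organized in two steps: we first show that the thesis holds if $a$ is continuous; then, by using an approximation argument, we show that it holds for any Riemann-integrable function $a$. As we shall see, the approximation argument heavily relies on the Riemann-integrability of $a$.

Step 1. We prove by induction on $d$ that if $a \in C([0,1]^d)$ and $\omega_a(\cdot)$ is the modulus of continuity of $a$, then
\[
D_{\boldsymbol{n}}(a) = LT_{\boldsymbol{n}}^{\boldsymbol{m}}(a, 1) + R_{\boldsymbol{n},\boldsymbol{m}} + N_{\boldsymbol{n},\boldsymbol{m}}, \qquad
\mathrm{rank}(R_{\boldsymbol{n},\boldsymbol{m}}) \le N(\boldsymbol{n}) \sum_{i=1}^d \frac{m_i}{n_i}, \qquad
\| N_{\boldsymbol{n},\boldsymbol{m}} \| \le \sum_{i=1}^d \omega_a\Bigl( \frac{1}{m_i} + \frac{m_i}{n_i} \Bigr). \tag{4.22}
\]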
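Since $\omega_a(\delta) \to 0$ as $\delta \to 0$, the convergence $\{LT_{\boldsymbol{n}}^{\boldsymbol{m}}(a, 1)\}_n \xrightarrow{\text{a.c.s.}} \{D_{\boldsymbol{n}}(a)\}_n$ as $\boldsymbol{m} \to \infty$ (and hence the relation $\{D_{\boldsymbol{n}}(a)\}_n \sim_{\mathrm{sLT}} a \otimes 1$) follows immediately from Definition 2.8 (take $n_{\boldsymbol{m}}$ such that $\boldsymbol{n} \ge \boldsymbol{m}^2$ for $n \ge n_{\boldsymbol{m}}$, and take $c(\boldsymbol{m}) = \sum_{i=1}^d 1/m_i$ and $\omega(\boldsymbol{m}) = \sum_{i=1}^d \omega_a(2/m_i)$).

In the case $d = 1$, we have $\boldsymbol{n} = \boldsymbol{n}(n) = (d_n)$ for some sequence of numbers $\{d_n\}_n$ such that $d_n \to \infty$ as $n \to \infty$, and Eq. (4.22) reduces to
\[
D_{d_n}(a) = LT_{d_n}^m(a, 1) + R_{d_n,m} + N_{d_n,m}, \qquad \mathrm{rank}(R_{d_n,m}) \le m, \qquad \| N_{d_n,m} \| \le \omega_a\Bigl( \frac{1}{m} + \frac{m}{d_n} \Bigr).
\]
This is nothing else than Eq. (7.24) from [22] with $d_n$ in place of $n$, and it was already proved in [22].

In the case $d > 1$, $LT_{\boldsymbol{n}}^{\boldsymbol{m}}(a, 1)$ is an $N(\boldsymbol{n}) \times N(\boldsymbol{n})$ diagonal matrix that can be written as follows:
\[
\begin{aligned}
LT_{\boldsymbol{n}}^{\boldsymbol{m}}(a, 1)
&= \mathop{\mathrm{diag}}_{j_1=1,\ldots,m_1} \Bigl[ I_{\lfloor n_1/m_1 \rfloor} \otimes LT_{n_2,\ldots,n_d}^{m_2,\ldots,m_d}\Bigl( a\Bigl(\frac{j_1}{m_1}, \cdot\Bigr), 1 \Bigr) \Bigr] \oplus O_{(n_1 \bmod m_1)\, n_2 \cdots n_d} \\
&= \mathop{\mathrm{diag}}_{j_1=1,\ldots,m_1} \Bigl[ \mathop{\mathrm{diag}}_{i_1=(j_1-1)\lfloor n_1/m_1 \rfloor+1,\ldots,j_1 \lfloor n_1/m_1 \rfloor} LT_{n_2,\ldots,n_d}^{m_2,\ldots,m_d}\Bigl( a\Bigl(\frac{j_1}{m_1}, \cdot\Bigr), 1 \Bigr) \Bigr] \oplus O_{(n_1 \bmod m_1)\, n_2 \cdots n_d},
\end{aligned} \tag{4.23}
\]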
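where, for any $\hat x_1 \in [0,1]$, the function $a(\hat x_1, \cdot)$ is defined as follows:
\[
a(\hat x_1, \cdot) : [0,1]^{d-1} \to \mathbb{C}, \qquad (x_2, \ldots, x_d) \mapsto a(\hat x_1, x_2, \ldots, x_d).
\]
Moreover, by (4.2), $D_{\boldsymbol{n}}(a)$ is an $N(\boldsymbol{n}) \times N(\boldsymbol{n})$ diagonal matrix that can be written as follows:
\[
D_{\boldsymbol{n}}(a) = \mathop{\mathrm{diag}}_{j_1=1,\ldots,m_1} \Bigl[ \mathop{\mathrm{diag}}_{i_1=(j_1-1)\lfloor n_1/m_1 \rfloor+1,\ldots,j_1 \lfloor n_1/m_1 \rfloor} D_{n_2,\ldots,n_d}\Bigl( a\Bigl(\frac{i_1}{n_1}, \cdot\Bigr) \Bigr) \Bigr] \oplus \mathop{\mathrm{diag}}_{i_1=m_1 \lfloor n_1/m_1 \rfloor+1,\ldots,n_1} D_{n_2,\ldots,n_d}\Bigl( a\Bigl(\frac{i_1}{n_1}, \cdot\Bigr) \Bigr). \tag{4.24}
\]
For $j_1 = 1, \ldots, m_1$ and $i_1 = (j_1-1)\lfloor n_1/m_1 \rfloor + 1, \ldots, j_1 \lfloor n_1/m_1 \rfloor$, by induction hypothesis we have
\[
\begin{aligned}
& LT_{n_2,\ldots,n_d}^{m_2,\ldots,m_d}\Bigl( a\Bigl(\frac{j_1}{m_1}, \cdot\Bigr), 1 \Bigr) - D_{n_2,\ldots,n_d}\Bigl( a\Bigl(\frac{i_1}{n_1}, \cdot\Bigr) \Bigr) \\
&\quad = D_{n_2,\ldots,n_d}\Bigl( a\Bigl(\frac{j_1}{m_1}, \cdot\Bigr) \Bigr) - D_{n_2,\ldots,n_d}\Bigl( a\Bigl(\frac{i_1}{n_1}, \cdot\Bigr) \Bigr) + R^{[j_1/m_1]}_{n_2,\ldots,n_d,m_2,\ldots,m_d} + N^{[j_1/m_1]}_{n_2,\ldots,n_d,m_2,\ldots,m_d},
\end{aligned}
\]
where
\[
\mathrm{rank}\bigl( R^{[j_1/m_1]}_{n_2,\ldots,n_d,m_2,\ldots,m_d} \bigr) \le n_2 \cdots n_d \sum_{k=2}^d \frac{m_k}{n_k}, \qquad
\bigl\| N^{[j_1/m_1]}_{n_2,\ldots,n_d,m_2,\ldots,m_d} \bigr\| \le \sum_{k=2}^d \omega_{a(j_1/m_1, \cdot)}\Bigl( \frac{1}{m_k} + \frac{m_k}{n_k} \Bigr) \le \sum_{k=2}^d \omega_a\Bigl( \frac{1}{m_k} + \frac{m_k}{n_k} \Bigr).
\]
Moreover,
\[
\Bigl\| D_{n_2,\ldots,n_d}\Bigl( a\Bigl(\frac{j_1}{m_1}, \cdot\Bigr) \Bigr) - D_{n_2,\ldots,n_d}\Bigl( a\Bigl(\frac{i_1}{n_1}, \cdot\Bigr) \Bigr) \Bigr\| \le \omega_a\Bigl( \frac{1}{m_1} + \frac{m_1}{n_1} \Bigr),
\]
because
\[
\Bigl| \frac{j_1}{m_1} - \frac{i_1}{n_1} \Bigr| \le \frac{j_1}{m_1} - \frac{(j_1-1)\lfloor n_1/m_1 \rfloor}{n_1} \le \frac{j_1}{m_1} - \frac{(j_1-1)(n_1/m_1 - 1)}{n_1} \le \frac{1}{m_1} + \frac{m_1}{n_1}.
\]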
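Thus,
\[
\begin{aligned}
& LT_{n_2,\ldots,n_d}^{m_2,\ldots,m_d}\Bigl( a\Bigl(\frac{j_1}{m_1}, \cdot\Bigr), 1 \Bigr) - D_{n_2,\ldots,n_d}\Bigl( a\Bigl(\frac{i_1}{n_1}, \cdot\Bigr) \Bigr) = R^{[j_1/m_1]}_{n_2,\ldots,n_d,m_2,\ldots,m_d} + N^{[j_1/m_1,\, i_1/n_1]}_{\boldsymbol{n},\boldsymbol{m}}, \\
& \mathrm{rank}\bigl( R^{[j_1/m_1]}_{n_2,\ldots,n_d,m_2,\ldots,m_d} \bigr) \le n_2 \cdots n_d \sum_{k=2}^d \frac{m_k}{n_k}, \qquad
\bigl\| N^{[j_1/m_1,\, i_1/n_1]}_{\boldsymbol{n},\boldsymbol{m}} \bigr\| \le \sum_{k=1}^d \omega_a\Bigl( \frac{1}{m_k} + \frac{m_k}{n_k} \Bigr).
\end{aligned} \tag{4.25}
\]
Hence, by (4.23) and (4.24),
\[
\begin{aligned}
D_{\boldsymbol{n}}(a) - LT_{\boldsymbol{n}}^{\boldsymbol{m}}(a, 1)
&= \mathop{\mathrm{diag}}_{j_1=1,\ldots,m_1} \Bigl[ \mathop{\mathrm{diag}}_{i_1=(j_1-1)\lfloor n_1/m_1 \rfloor+1,\ldots,j_1 \lfloor n_1/m_1 \rfloor} \Bigl( D_{n_2,\ldots,n_d}\Bigl( a\Bigl(\frac{i_1}{n_1}, \cdot\Bigr) \Bigr) - LT_{n_2,\ldots,n_d}^{m_2,\ldots,m_d}\Bigl( a\Bigl(\frac{j_1}{m_1}, \cdot\Bigr), 1 \Bigr) \Bigr) \Bigr] \\
&\qquad \oplus \mathop{\mathrm{diag}}_{i_1=m_1 \lfloor n_1/m_1 \rfloor+1,\ldots,n_1} D_{n_2,\ldots,n_d}\Bigl( a\Bigl(\frac{i_1}{n_1}, \cdot\Bigr) \Bigr) \\
&= \mathop{\mathrm{diag}}_{j_1=1,\ldots,m_1} \Bigl[ \mathop{\mathrm{diag}}_{i_1=(j_1-1)\lfloor n_1/m_1 \rfloor+1,\ldots,j_1 \lfloor n_1/m_1 \rfloor} \Bigl( -R^{[j_1/m_1]}_{n_2,\ldots,n_d,m_2,\ldots,m_d} - N^{[j_1/m_1,\, i_1/n_1]}_{\boldsymbol{n},\boldsymbol{m}} \Bigr) \Bigr] \\
&\qquad \oplus \mathop{\mathrm{diag}}_{i_1=m_1 \lfloor n_1/m_1 \rfloor+1,\ldots,n_1} D_{n_2,\ldots,n_d}\Bigl( a\Bigl(\frac{i_1}{n_1}, \cdot\Bigr) \Bigr) \\
&= R_{\boldsymbol{n},\boldsymbol{m}} + N_{\boldsymbol{n},\boldsymbol{m}},
\end{aligned}
\]
where
\[
\begin{aligned}
R_{\boldsymbol{n},\boldsymbol{m}} &= \mathop{\mathrm{diag}}_{j_1=1,\ldots,m_1} \Bigl[ \mathop{\mathrm{diag}}_{i_1=(j_1-1)\lfloor n_1/m_1 \rfloor+1,\ldots,j_1 \lfloor n_1/m_1 \rfloor} \Bigl( -R^{[j_1/m_1]}_{n_2,\ldots,n_d,m_2,\ldots,m_d} \Bigr) \Bigr] \oplus \mathop{\mathrm{diag}}_{i_1=m_1 \lfloor n_1/m_1 \rfloor+1,\ldots,n_1} D_{n_2,\ldots,n_d}\Bigl( a\Bigl(\frac{i_1}{n_1}, \cdot\Bigr) \Bigr), \\
N_{\boldsymbol{n},\boldsymbol{m}} &= \mathop{\mathrm{diag}}_{j_1=1,\ldots,m_1} \Bigl[ \mathop{\mathrm{diag}}_{i_1=(j_1-1)\lfloor n_1/m_1 \rfloor+1,\ldots,j_1 \lfloor n_1/m_1 \rfloor} \Bigl( -N^{[j_1/m_1,\, i_1/n_1]}_{\boldsymbol{n},\boldsymbol{m}} \Bigr) \Bigr] \oplus O_{(n_1 \bmod m_1)\, n_2 \cdots n_d}.
\end{aligned}
\]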
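By (4.25), (2.26) and (2.28), we have
\[
\begin{aligned}
\mathrm{rank}(R_{\boldsymbol{n},\boldsymbol{m}}) &\le m_1 \Bigl\lfloor \frac{n_1}{m_1} \Bigr\rfloor n_2 \cdots n_d \sum_{k=2}^d \frac{m_k}{n_k} + (n_1 \bmod m_1)\, n_2 \cdots n_d \\
&\le n_1 n_2 \cdots n_d \sum_{k=2}^d \frac{m_k}{n_k} + m_1 n_2 \cdots n_d = N(\boldsymbol{n}) \sum_{k=1}^d \frac{m_k}{n_k}, \\
\| N_{\boldsymbol{n},\boldsymbol{m}} \| &\le \sum_{k=1}^d \omega_a\Bigl( \frac{1}{m_k} + \frac{m_k}{n_k} \Bigr),
\end{aligned}
\]
and (4.22) is proved.

Step 2. Let $a : [0,1]^d \to \mathbb{C}$ be any Riemann-integrable function. Take any sequence of continuous functions $a_m : [0,1]^d \to \mathbb{C}$ such that $a_m \to a$ in $L^1([0,1]^d)$. Note that such a sequence exists because $C([0,1]^d)$ is dense in $L^1([0,1]^d)$. By Step 1, we have $\{D_{\boldsymbol{n}}(a_m)\}_n \sim_{\mathrm{sLT}} a_m \otimes 1$ for every $m$. Hence, $\{LT_{\boldsymbol{n}}^{\boldsymbol{k}}(a_m, 1)\}_n \xrightarrow{\text{a.c.s.}} \{D_{\boldsymbol{n}}(a_m)\}_n$ as $\boldsymbol{k} \to \infty$, i.e., for every $m$ and every $\boldsymbol{k} \in \mathbb{N}^d$ there is $n_{m,\boldsymbol{k}}$ such that, for $n \ge n_{m,\boldsymbol{k}}$,
\[
D_{\boldsymbol{n}}(a_m) = LT_{\boldsymbol{n}}^{\boldsymbol{k}}(a_m, 1) + R_{\boldsymbol{n},m,\boldsymbol{k}} + N_{\boldsymbol{n},m,\boldsymbol{k}}, \qquad
\mathrm{rank}(R_{\boldsymbol{n},m,\boldsymbol{k}}) \le c(m, \boldsymbol{k}) N(\boldsymbol{n}), \qquad \| N_{\boldsymbol{n},m,\boldsymbol{k}} \| \le \omega(m, \boldsymbol{k}),
\]
where
\[
\lim_{\boldsymbol{k} \to \infty} c(m, \boldsymbol{k}) = \lim_{\boldsymbol{k} \to \infty} \omega(m, \boldsymbol{k}) = 0.
\]
Moreover, $\{D_{\boldsymbol{n}}(a_m)\}_n \xrightarrow{\text{a.c.s.}} \{D_{\boldsymbol{n}}(a)\}_n$. Indeed,
\[
\| D_{\boldsymbol{n}}(a) - D_{\boldsymbol{n}}(a_m) \|_1 = \sum_{\boldsymbol{j}=\boldsymbol{1}}^{\boldsymbol{n}} \Bigl| a\Bigl(\frac{\boldsymbol{j}}{\boldsymbol{n}}\Bigr) - a_m\Bigl(\frac{\boldsymbol{j}}{\boldsymbol{n}}\Bigr) \Bigr|.
\]
By the Riemann-integrability of $|a - a_m|$ and by the fact that $a_m \to a$ in $L^1([0,1]^d)$, the quantity
\[
\varepsilon(m, \boldsymbol{n}) = \frac{1}{N(\boldsymbol{n})} \sum_{\boldsymbol{j}=\boldsymbol{1}}^{\boldsymbol{n}} \Bigl| a\Bigl(\frac{\boldsymbol{j}}{\boldsymbol{n}}\Bigr) - a_m\Bigl(\frac{\boldsymbol{j}}{\boldsymbol{n}}\Bigr) \Bigr| \tag{4.26}
\]
satisfies
\[
\lim_{m \to \infty} \lim_{n \to \infty} \varepsilon(m, \boldsymbol{n}) = \lim_{m \to \infty} \int_{[0,1]^d} |a(\boldsymbol{x}) - a_m(\boldsymbol{x})| \, d\boldsymbol{x} = \lim_{m \to \infty} \| a - a_m \|_{L^1} = 0.
\]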
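By Theorem 2.10, this implies that $\{D_{\boldsymbol{n}}(a_m)\}_n \xrightarrow{\text{a.c.s.}} \{D_{\boldsymbol{n}}(a)\}_n$. Thus, for every $m$ there exists $n_m$ such that, for $n \ge n_m$,
\[
D_{\boldsymbol{n}}(a) = D_{\boldsymbol{n}}(a_m) + R_{\boldsymbol{n},m} + N_{\boldsymbol{n},m}, \qquad \mathrm{rank}(R_{\boldsymbol{n},m}) \le c(m) N(\boldsymbol{n}), \qquad \| N_{\boldsymbol{n},m} \| \le \omega(m),
\]
where
\[
\lim_{m \to \infty} c(m) = \lim_{m \to \infty} \omega(m) = 0.
\]
It follows that, for every $m$, every $\boldsymbol{k} \in \mathbb{N}^d$, and every $n \ge \max(n_m, n_{m,\boldsymbol{k}})$,
\[
\begin{aligned}
& D_{\boldsymbol{n}}(a) = LT_{\boldsymbol{n}}^{\boldsymbol{k}}(a, 1) + \bigl[ LT_{\boldsymbol{n}}^{\boldsymbol{k}}(a_m, 1) - LT_{\boldsymbol{n}}^{\boldsymbol{k}}(a, 1) \bigr] + (R_{\boldsymbol{n},m} + R_{\boldsymbol{n},m,\boldsymbol{k}}) + (N_{\boldsymbol{n},m} + N_{\boldsymbol{n},m,\boldsymbol{k}}), \\
& \mathrm{rank}(R_{\boldsymbol{n},m} + R_{\boldsymbol{n},m,\boldsymbol{k}}) \le (c(m) + c(m, \boldsymbol{k})) N(\boldsymbol{n}), \\
& \| N_{\boldsymbol{n},m} + N_{\boldsymbol{n},m,\boldsymbol{k}} \| \le \omega(m) + \omega(m, \boldsymbol{k}), \\
& \| LT_{\boldsymbol{n}}^{\boldsymbol{k}}(a_m, 1) - LT_{\boldsymbol{n}}^{\boldsymbol{k}}(a, 1) \|_1 \le \frac{N(\boldsymbol{n})}{N(\boldsymbol{k})} \sum_{\boldsymbol{j}=\boldsymbol{1}}^{\boldsymbol{k}} \Bigl| a\Bigl(\frac{\boldsymbol{j}}{\boldsymbol{k}}\Bigr) - a_m\Bigl(\frac{\boldsymbol{j}}{\boldsymbol{k}}\Bigr) \Bigr| = \varepsilon(m, \boldsymbol{k}) N(\boldsymbol{n}),
\end{aligned}
\]
where in the last inequality we used the linearity of $LT_{\boldsymbol{n}}^{\boldsymbol{k}}(\cdot, 1)$ and (4.11); recall that $\varepsilon(m, \boldsymbol{k})$ is defined in (4.26). Let $\{m(\boldsymbol{k})\}_{\boldsymbol{k} \in \mathbb{N}^d}$ be a family of indices such that $m(\boldsymbol{k}) \to \infty$ as $\boldsymbol{k} \to \infty$ and
\[
\lim_{\boldsymbol{k} \to \infty} \varepsilon(m(\boldsymbol{k}), \boldsymbol{k}) = \lim_{\boldsymbol{k} \to \infty} c(m(\boldsymbol{k}), \boldsymbol{k}) = \lim_{\boldsymbol{k} \to \infty} \omega(m(\boldsymbol{k}), \boldsymbol{k}) = 0.
\]
Such a family exists by Lemma 4.1 (apply the lemma with $x(m, \boldsymbol{k}) = \varepsilon(m, \boldsymbol{k}) + c(m, \boldsymbol{k}) + \omega(m, \boldsymbol{k})$). Then, for every $\boldsymbol{k} \in \mathbb{N}^d$ and every $n \ge \max(n_{m(\boldsymbol{k})}, n_{m(\boldsymbol{k}),\boldsymbol{k}})$,
\[
\begin{aligned}
& D_{\boldsymbol{n}}(a) = LT_{\boldsymbol{n}}^{\boldsymbol{k}}(a, 1) + \bigl[ LT_{\boldsymbol{n}}^{\boldsymbol{k}}(a_{m(\boldsymbol{k})}, 1) - LT_{\boldsymbol{n}}^{\boldsymbol{k}}(a, 1) \bigr] + (R_{\boldsymbol{n},m(\boldsymbol{k})} + R_{\boldsymbol{n},m(\boldsymbol{k}),\boldsymbol{k}}) + (N_{\boldsymbol{n},m(\boldsymbol{k})} + N_{\boldsymbol{n},m(\boldsymbol{k}),\boldsymbol{k}}), \\
& \mathrm{rank}(R_{\boldsymbol{n},m(\boldsymbol{k})} + R_{\boldsymbol{n},m(\boldsymbol{k}),\boldsymbol{k}}) \le (c(m(\boldsymbol{k})) + c(m(\boldsymbol{k}), \boldsymbol{k})) N(\boldsymbol{n}), \\
& \| N_{\boldsymbol{n},m(\boldsymbol{k})} + N_{\boldsymbol{n},m(\boldsymbol{k}),\boldsymbol{k}} \| \le \omega(m(\boldsymbol{k})) + \omega(m(\boldsymbol{k}), \boldsymbol{k}), \\
& \| LT_{\boldsymbol{n}}^{\boldsymbol{k}}(a_{m(\boldsymbol{k})}, 1) - LT_{\boldsymbol{n}}^{\boldsymbol{k}}(a, 1) \|_1 \le \varepsilon(m(\boldsymbol{k}), \boldsymbol{k}) N(\boldsymbol{n}).
\end{aligned}
\]
By [22, Lemma 5.6], we can decompose the matrix $LT_{\boldsymbol{n}}^{\boldsymbol{k}}(a_{m(\boldsymbol{k})}, 1) - LT_{\boldsymbol{n}}^{\boldsymbol{k}}(a, 1)$ as the sum of a small-rank term $\hat R_{\boldsymbol{n},\boldsymbol{k}}$, with rank bounded by $\sqrt{\varepsilon(m(\boldsymbol{k}), \boldsymbol{k})}\, N(\boldsymbol{n})$, plus a small-norm term $\hat N_{\boldsymbol{n},\boldsymbol{k}}$, with norm bounded by $\sqrt{\varepsilon(m(\boldsymbol{k}), \boldsymbol{k})}$. This shows that $\{LT_{\boldsymbol{n}}^{\boldsymbol{k}}(a, 1)\}_n \xrightarrow{\text{a.c.s.}} \{D_{\boldsymbol{n}}(a)\}_n$ as $\boldsymbol{k} \to \infty$, hence $\{D_{\boldsymbol{n}}(a)\}_n \sim_{\mathrm{sLT}} a \otimes 1$.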
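The splitting used in Step 1 can be made concrete in the case $d = 1$, where $T_{\lfloor n/m \rfloor}(1)$ is the identity and $LT_n^m(a, 1)$ is the diagonal matrix that repeats each sample $a(j/m)$ exactly $\lfloor n/m \rfloor$ times and ends with $n \bmod m$ zeros. The sketch below is only an illustration under stated assumptions: the function $a(x) = x^2$, the bound $\omega_a(\delta) \le 2\delta$ for its modulus of continuity, the sizes `n` and `m`, and the helper names are all chosen for this example.

```python
def a(x):
    return x * x

def modulus_bound(delta):
    """Upper bound for the modulus of continuity of a(x) = x^2 on [0, 1],
    since |a(x) - a(y)| <= 2 |x - y| there."""
    return 2.0 * delta

def split_difference(n, m):
    """Diagonal of D_n(a) - LT_n^m(a, 1), split into the part that is small
    in norm and the part that has few rows (small rank)."""
    q = n // m
    small_norm, small_rank = [], []
    for i in range(1, n + 1):
        if i <= m * q:
            j = (i - 1) // q + 1            # index of the block containing row i
            small_norm.append(a(i / n) - a(j / m))
        else:
            small_rank.append(a(i / n))     # rows facing the zero block O_{n mod m}
    return small_norm, small_rank

n, m = 103, 7
small_norm, small_rank = split_difference(n, m)
norm_N = max(abs(v) for v in small_norm)           # spectral norm of a diagonal matrix
rank_R = sum(1 for v in small_rank if v != 0)      # rank of a diagonal matrix
```

Here `norm_N` can be compared with `modulus_bound(1 / m + m / n)` and `rank_R` with `m`, matching the shape of the $d = 1$ estimate quoted from [22].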
4.3.3 Multilevel Toeplitz Sequences

Given $f \in L^1([-\pi,\pi]^d)$, we show that any $d$-level Toeplitz sequence of the form $\{T_{\boldsymbol{n}}(f)\}_n$ with $\boldsymbol{n} = \boldsymbol{n}(n) \to \infty$ as $n \to \infty$ is a $d$-level LT sequence with symbol $(1 \otimes f)(\boldsymbol{x}, \boldsymbol{\theta}) = f(\boldsymbol{\theta})$. In view of what follows, we recall from Corollary 2.1 that if $f$ is a separable $d$-variate trigonometric polynomial then $f = f_1 \otimes \cdots \otimes f_d$ for some univariate trigonometric polynomials $f_1, \ldots, f_d$. We also recall that the degree of a univariate trigonometric polynomial $g$ is the smallest nonnegative integer $r$ such that the Fourier coefficients $g_k$ are zero for $|k| > r$.

Theorem 4.6 If $f \in L^1([-\pi,\pi]^d)$ then $\{T_{\boldsymbol{n}}(f)\}_n \sim_{\mathrm{LT}} 1 \otimes f$ for any sequence $\{\boldsymbol{n} = \boldsymbol{n}(n)\}_n \subseteq \mathbb{N}^d$ such that $\boldsymbol{n} \to \infty$ as $n \to \infty$.

Proof The proof is organized in three steps: we first show that the thesis holds if $f$ is a separable $d$-variate trigonometric polynomial; then, by linearity, we show that it holds if $f$ is an arbitrary $d$-variate trigonometric polynomial; finally, using an approximation argument, we prove the theorem under the sole assumption that $f \in L^1([-\pi,\pi]^d)$.

Step 1. We show by induction on $d$ that, if $f$ is a separable $d$-variate trigonometric polynomial, say $f = f_1 \otimes \cdots \otimes f_d$ with $f_1, \ldots, f_d$ univariate trigonometric polynomials of degrees $r_1, \ldots, r_d$, respectively, then
\[
T_{\boldsymbol{n}}(f) = LT_{\boldsymbol{n}}^{\boldsymbol{m}}(1, f) + R_{\boldsymbol{n},\boldsymbol{m}}, \qquad \mathrm{rank}(R_{\boldsymbol{n},\boldsymbol{m}}) \le N(\boldsymbol{n}) \sum_{i=1}^d \frac{(2r_i + 1) m_i}{n_i}. \tag{4.27}
\]
Once this is done, the convergence $\{LT_{\boldsymbol{n}}^{\boldsymbol{m}}(1, f)\}_n \xrightarrow{\text{a.c.s.}} \{T_{\boldsymbol{n}}(f)\}_n$ (and hence the relation $\{T_{\boldsymbol{n}}(f)\}_n \sim_{\mathrm{LT}} 1 \otimes f$) follows immediately from Definition 2.8 (take $n_{\boldsymbol{m}}$ such that $\boldsymbol{n} \ge \boldsymbol{m}^2$ for $n \ge n_{\boldsymbol{m}}$, and take $c(\boldsymbol{m}) = \sum_{i=1}^d (2r_i + 1)/m_i$ and $\omega(\boldsymbol{m}) = 0$).

In the case $d = 1$, we have $\boldsymbol{n} = \boldsymbol{n}(n) = (d_n)$ for some sequence of numbers $\{d_n\}_n$ such that $d_n \to \infty$ as $n \to \infty$, and Eq. (4.27) reduces to
\[
T_{d_n}(f) = LT_{d_n}^m(1, f) + R_{d_n,m}, \qquad \mathrm{rank}(R_{d_n,m}) \le (2r + 1) m, \tag{4.28}
\]
where $r$ is the degree of $f$. This is nothing else than Eq. (7.26) from [22] with $d_n$ in place of $n$, and it was already proved in [22].

In the case $d > 1$, let $f = f_1 \otimes \cdots \otimes f_d$ with $f_1, \ldots, f_d$ being univariate trigonometric polynomials of degrees $r_1, \ldots, r_d$, respectively. By induction hypothesis,
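\[
LT_{n_2,\ldots,n_d}^{m_2,\ldots,m_d}(1, f_2 \otimes \cdots \otimes f_d) = T_{n_2,\ldots,n_d}(f_2 \otimes \cdots \otimes f_d) - R_{n_2,\ldots,n_d,m_2,\ldots,m_d}, \qquad
\mathrm{rank}(R_{n_2,\ldots,n_d,m_2,\ldots,m_d}) \le n_2 \cdots n_d \sum_{i=2}^d \frac{(2r_i + 1) m_i}{n_i}.
\]
From the definition of $LT_{\boldsymbol{n}}^{\boldsymbol{m}}(1, f)$ and the properties of tensor products and direct sums (see Sect. 2.5), we obtain
\[
\begin{aligned}
LT_{\boldsymbol{n}}^{\boldsymbol{m}}(1, f)
&= \mathop{\mathrm{diag}}_{j_1=1,\ldots,m_1} \bigl[ T_{\lfloor n_1/m_1 \rfloor}(f_1) \otimes LT_{n_2,\ldots,n_d}^{m_2,\ldots,m_d}(1, f_2 \otimes \cdots \otimes f_d) \bigr] \oplus O_{(n_1 \bmod m_1)\, n_2 \cdots n_d} \\
&= \mathop{\mathrm{diag}}_{j_1=1,\ldots,m_1} \bigl[ T_{\lfloor n_1/m_1 \rfloor}(f_1) \otimes \bigl( T_{n_2,\ldots,n_d}(f_2 \otimes \cdots \otimes f_d) - R_{n_2,\ldots,n_d,m_2,\ldots,m_d} \bigr) \bigr] \oplus O_{(n_1 \bmod m_1)\, n_2 \cdots n_d} \\
&= \Bigl[ \mathop{\mathrm{diag}}_{j_1=1,\ldots,m_1} T_{\lfloor n_1/m_1 \rfloor}(f_1) \oplus O_{n_1 \bmod m_1} \Bigr] \otimes \bigl( T_{n_2,\ldots,n_d}(f_2 \otimes \cdots \otimes f_d) - R_{n_2,\ldots,n_d,m_2,\ldots,m_d} \bigr) \\
&= LT_{n_1}^{m_1}(1, f_1) \otimes \bigl( T_{n_2,\ldots,n_d}(f_2 \otimes \cdots \otimes f_d) - R_{n_2,\ldots,n_d,m_2,\ldots,m_d} \bigr) \\
&= LT_{n_1}^{m_1}(1, f_1) \otimes T_{n_2,\ldots,n_d}(f_2 \otimes \cdots \otimes f_d) - \tilde R_{n_1,\ldots,n_d,m_1,\ldots,m_d},
\end{aligned}
\]
where $\tilde R_{n_1,\ldots,n_d,m_1,\ldots,m_d} = LT_{n_1}^{m_1}(1, f_1) \otimes R_{n_2,\ldots,n_d,m_2,\ldots,m_d}$ satisfies
\[
\mathrm{rank}(\tilde R_{n_1,\ldots,n_d,m_1,\ldots,m_d}) \le N(\boldsymbol{n}) \sum_{i=2}^d \frac{(2r_i + 1) m_i}{n_i}.
\]
Using (4.28) with $d_n = n_1(n) = n_1$, we can decompose $LT_{n_1}^{m_1}(1, f_1)$ into the sum of $T_{n_1}(f_1)$ plus a small-rank matrix $-R_{n_1,m_1}$, whose rank is bounded by $(2r_1 + 1) m_1$. Invoking Lemma 3.3, we obtain
\[
LT_{\boldsymbol{n}}^{\boldsymbol{m}}(1, f) = \bigl( T_{n_1}(f_1) - R_{n_1,m_1} \bigr) \otimes T_{n_2,\ldots,n_d}(f_2 \otimes \cdots \otimes f_d) - \tilde R_{n_1,\ldots,n_d,m_1,\ldots,m_d} = T_{\boldsymbol{n}}(f) - R_{\boldsymbol{n},\boldsymbol{m}},
\]
where $R_{\boldsymbol{n},\boldsymbol{m}} = R_{n_1,m_1} \otimes T_{n_2,\ldots,n_d}(f_2 \otimes \cdots \otimes f_d) + \tilde R_{n_1,\ldots,n_d,m_1,\ldots,m_d}$ satisfies
\[
\mathrm{rank}(R_{\boldsymbol{n},\boldsymbol{m}}) \le (2r_1 + 1) m_1 n_2 \cdots n_d + N(\boldsymbol{n}) \sum_{i=2}^d \frac{(2r_i + 1) m_i}{n_i} = N(\boldsymbol{n}) \sum_{i=1}^d \frac{(2r_i + 1) m_i}{n_i}.
\]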
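This completes the proof of (4.27).

Step 2. Let $f$ be any $d$-variate trigonometric polynomial. By definition, $f$ is a finite linear combination of the $d$-variate Fourier frequencies $e^{\mathrm{i} \boldsymbol{j} \cdot \boldsymbol{\theta}}$, $\boldsymbol{j} \in \mathbb{Z}^d$, and so we can write $f(\boldsymbol{\theta}) = \sum_{\boldsymbol{j}=-\boldsymbol{r}}^{\boldsymbol{r}} f_{\boldsymbol{j}} e^{\mathrm{i} \boldsymbol{j} \cdot \boldsymbol{\theta}}$ for some separable trigonometric polynomials $f_{\boldsymbol{j}} e^{\mathrm{i} \boldsymbol{j} \cdot \boldsymbol{\theta}}$. By linearity,
\[
T_{\boldsymbol{n}}(f) = \sum_{\boldsymbol{j}=-\boldsymbol{r}}^{\boldsymbol{r}} f_{\boldsymbol{j}} \, T_{\boldsymbol{n}}(e^{\mathrm{i} \boldsymbol{j} \cdot \boldsymbol{\theta}}), \qquad
LT_{\boldsymbol{n}}^{\boldsymbol{m}}(1, f) = \sum_{\boldsymbol{j}=-\boldsymbol{r}}^{\boldsymbol{r}} f_{\boldsymbol{j}} \, LT_{\boldsymbol{n}}^{\boldsymbol{m}}(1, e^{\mathrm{i} \boldsymbol{j} \cdot \boldsymbol{\theta}}).
\]
By Step 1, $\{T_{\boldsymbol{n}}(e^{\mathrm{i} \boldsymbol{j} \cdot \boldsymbol{\theta}})\}_n \sim_{\mathrm{LT}} 1 \otimes e^{\mathrm{i} \boldsymbol{j} \cdot \boldsymbol{\theta}}$, hence $\{LT_{\boldsymbol{n}}^{\boldsymbol{m}}(1, e^{\mathrm{i} \boldsymbol{j} \cdot \boldsymbol{\theta}})\}_n \xrightarrow{\text{a.c.s.}} \{T_{\boldsymbol{n}}(e^{\mathrm{i} \boldsymbol{j} \cdot \boldsymbol{\theta}})\}_n$ as $\boldsymbol{m} \to \infty$, and so $\{LT_{\boldsymbol{n}}^{\boldsymbol{m}}(1, f)\}_n \xrightarrow{\text{a.c.s.}} \{T_{\boldsymbol{n}}(f)\}_n$ as $\boldsymbol{m} \to \infty$ by Remark 2.6. Thus, $\{T_{\boldsymbol{n}}(f)\}_n \sim_{\mathrm{LT}} 1 \otimes f$.

Step 3. Let $f \in L^1([-\pi,\pi]^d)$. Since the set of $d$-variate trigonometric polynomials is dense in $L^1([-\pi,\pi]^d)$ by [22, Lemma 2.2] or Lemma 2.3, there is a sequence $\{f_m\}_m$ of $d$-variate trigonometric polynomials such that $f_m \to f$ in $L^1([-\pi,\pi]^d)$. By Step 2, $\{T_{\boldsymbol{n}}(f_m)\}_n \sim_{\mathrm{LT}} 1 \otimes f_m$. Hence, for every $m$ and every $\boldsymbol{k} \in \mathbb{N}^d$ there is $n_{m,\boldsymbol{k}}$ such that, for $n \ge n_{m,\boldsymbol{k}}$,
\[
T_{\boldsymbol{n}}(f_m) = LT_{\boldsymbol{n}}^{\boldsymbol{k}}(1, f_m) + R_{\boldsymbol{n},m,\boldsymbol{k}} + N_{\boldsymbol{n},m,\boldsymbol{k}}, \qquad
\mathrm{rank}(R_{\boldsymbol{n},m,\boldsymbol{k}}) \le c(m, \boldsymbol{k}) N(\boldsymbol{n}), \qquad \| N_{\boldsymbol{n},m,\boldsymbol{k}} \| \le \omega(m, \boldsymbol{k}),
\]
where
\[
\lim_{\boldsymbol{k} \to \infty} c(m, \boldsymbol{k}) = \lim_{\boldsymbol{k} \to \infty} \omega(m, \boldsymbol{k}) = 0.
\]
Moreover, by Theorem 3.2,
\[
\| T_{\boldsymbol{n}}(f) - T_{\boldsymbol{n}}(f_m) \|_1 = \| T_{\boldsymbol{n}}(f - f_m) \|_1 \le N(\boldsymbol{n}) \| f - f_m \|_{L^1}
\]
and so $\{T_{\boldsymbol{n}}(f_m)\}_n \xrightarrow{\text{a.c.s.}} \{T_{\boldsymbol{n}}(f)\}_n$ by Theorem 2.10: for every $m$ there exists $n_m$ such that, for $n \ge n_m$,
\[
T_{\boldsymbol{n}}(f) = T_{\boldsymbol{n}}(f_m) + R_{\boldsymbol{n},m} + N_{\boldsymbol{n},m}, \qquad \mathrm{rank}(R_{\boldsymbol{n},m}) \le c(m) N(\boldsymbol{n}), \qquad \| N_{\boldsymbol{n},m} \| \le \omega(m),
\]
where
\[
\lim_{m \to \infty} c(m) = \lim_{m \to \infty} \omega(m) = 0.
\]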
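It follows that, for every $m$, every $\boldsymbol{k} \in \mathbb{N}^d$ and every $n \ge \max(n_m, n_{m,\boldsymbol{k}})$,
\[
\begin{aligned}
& T_{\boldsymbol{n}}(f) = LT_{\boldsymbol{n}}^{\boldsymbol{k}}(1, f) + \bigl[ LT_{\boldsymbol{n}}^{\boldsymbol{k}}(1, f_m) - LT_{\boldsymbol{n}}^{\boldsymbol{k}}(1, f) \bigr] + (R_{\boldsymbol{n},m} + R_{\boldsymbol{n},m,\boldsymbol{k}}) + (N_{\boldsymbol{n},m} + N_{\boldsymbol{n},m,\boldsymbol{k}}), \\
& \mathrm{rank}(R_{\boldsymbol{n},m} + R_{\boldsymbol{n},m,\boldsymbol{k}}) \le (c(m) + c(m, \boldsymbol{k})) N(\boldsymbol{n}), \\
& \| N_{\boldsymbol{n},m} + N_{\boldsymbol{n},m,\boldsymbol{k}} \| \le \omega(m) + \omega(m, \boldsymbol{k}), \\
& \| LT_{\boldsymbol{n}}^{\boldsymbol{k}}(1, f_m) - LT_{\boldsymbol{n}}^{\boldsymbol{k}}(1, f) \|_1 = \| LT_{\boldsymbol{n}}^{\boldsymbol{k}}(1, f_m - f) \|_1 \le N(\boldsymbol{n}) \| f - f_m \|_{L^1},
\end{aligned}
\]
where in the last inequality we used (4.11) and Theorem 3.2. Let $\{m(\boldsymbol{k})\}_{\boldsymbol{k} \in \mathbb{N}^d}$ be a family of indices such that $m(\boldsymbol{k}) \to \infty$ as $\boldsymbol{k} \to \infty$ and
\[
\lim_{\boldsymbol{k} \to \infty} c(m(\boldsymbol{k}), \boldsymbol{k}) = \lim_{\boldsymbol{k} \to \infty} \omega(m(\boldsymbol{k}), \boldsymbol{k}) = 0.
\]
Such a family exists by Lemma 4.1 (apply the lemma with $x(m, \boldsymbol{k}) = c(m, \boldsymbol{k}) + \omega(m, \boldsymbol{k})$). Then, for every $\boldsymbol{k} \in \mathbb{N}^d$ and every $n \ge \max(n_{m(\boldsymbol{k})}, n_{m(\boldsymbol{k}),\boldsymbol{k}})$,
\[
\begin{aligned}
& T_{\boldsymbol{n}}(f) = LT_{\boldsymbol{n}}^{\boldsymbol{k}}(1, f) + \bigl[ LT_{\boldsymbol{n}}^{\boldsymbol{k}}(1, f_{m(\boldsymbol{k})}) - LT_{\boldsymbol{n}}^{\boldsymbol{k}}(1, f) \bigr] + (R_{\boldsymbol{n},m(\boldsymbol{k})} + R_{\boldsymbol{n},m(\boldsymbol{k}),\boldsymbol{k}}) + (N_{\boldsymbol{n},m(\boldsymbol{k})} + N_{\boldsymbol{n},m(\boldsymbol{k}),\boldsymbol{k}}), \\
& \mathrm{rank}(R_{\boldsymbol{n},m(\boldsymbol{k})} + R_{\boldsymbol{n},m(\boldsymbol{k}),\boldsymbol{k}}) \le (c(m(\boldsymbol{k})) + c(m(\boldsymbol{k}), \boldsymbol{k})) N(\boldsymbol{n}), \\
& \| N_{\boldsymbol{n},m(\boldsymbol{k})} + N_{\boldsymbol{n},m(\boldsymbol{k}),\boldsymbol{k}} \| \le \omega(m(\boldsymbol{k})) + \omega(m(\boldsymbol{k}), \boldsymbol{k}), \\
& \| LT_{\boldsymbol{n}}^{\boldsymbol{k}}(1, f_{m(\boldsymbol{k})}) - LT_{\boldsymbol{n}}^{\boldsymbol{k}}(1, f) \|_1 \le N(\boldsymbol{n}) \| f_{m(\boldsymbol{k})} - f \|_{L^1}.
\end{aligned}
\]
By [22, Lemma 5.6], we can decompose the matrix $LT_{\boldsymbol{n}}^{\boldsymbol{k}}(1, f_{m(\boldsymbol{k})}) - LT_{\boldsymbol{n}}^{\boldsymbol{k}}(1, f)$ as the sum of a small-rank term $\hat R_{\boldsymbol{n},\boldsymbol{k}}$, with rank bounded by $\sqrt{\| f_{m(\boldsymbol{k})} - f \|_{L^1}}\, N(\boldsymbol{n})$, plus a small-norm term $\hat N_{\boldsymbol{n},\boldsymbol{k}}$, with norm bounded by $\sqrt{\| f_{m(\boldsymbol{k})} - f \|_{L^1}}$. This shows that $\{LT_{\boldsymbol{n}}^{\boldsymbol{k}}(1, f)\}_n \xrightarrow{\text{a.c.s.}} \{T_{\boldsymbol{n}}(f)\}_n$ as $\boldsymbol{k} \to \infty$, hence $\{T_{\boldsymbol{n}}(f)\}_n \sim_{\mathrm{LT}} 1 \otimes f$.

It follows from Theorem 4.6 that $\{T_{\boldsymbol{n}}(f)\}_n \sim_{\mathrm{sLT}} 1 \otimes f$ whenever $f \in L^1([-\pi,\pi]^d)$ is separable.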
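The rank estimate (4.28) for $d = 1$ lends itself to a direct check on a small matrix: $T_n(f)$ and $LT_n^m(1, f)$ differ only near the boundaries between consecutive diagonal blocks and in the trailing rows facing $O_{n \bmod m}$. The sketch below is illustrative only; the trigonometric polynomial $f(\theta) = 2 - 2\cos\theta$ (degree $r = 1$), the sizes `n` and `m`, and the helper names are assumptions, and the rank is computed by exact Gaussian elimination over the rationals rather than by any routine from the text.

```python
from fractions import Fraction

def toeplitz(n, coeffs):
    """T_n(f) = [f_{i-j}] from a dict {k: f_k} of Fourier coefficients."""
    return [[coeffs.get(i - j, 0) for j in range(n)] for i in range(n)]

def lt_unit_weight(n, m, coeffs):
    """LT_n^m(1, f): m copies of T_{floor(n/m)}(f) on the diagonal, then O_{n mod m}."""
    q = n // m
    T = toeplitz(q, coeffs)
    out = [[0] * n for _ in range(n)]
    for b in range(m):
        for r in range(q):
            for c in range(q):
                out[b * q + r][b * q + c] = T[r][c]
    return out

def rank(M):
    """Rank by exact Gaussian elimination over the rationals."""
    A = [[Fraction(x) for x in row] for row in M]
    rows = len(A)
    cols = len(A[0]) if A else 0
    rk = 0
    for c in range(cols):
        pivot = next((r for r in range(rk, rows) if A[r][c] != 0), None)
        if pivot is None:
            continue
        A[rk], A[pivot] = A[pivot], A[rk]
        for r in range(rk + 1, rows):
            if A[r][c] != 0:
                factor = A[r][c] / A[rk][c]
                A[r] = [x - factor * y for x, y in zip(A[r], A[rk])]
        rk += 1
    return rk

coeffs = {0: 2, 1: -1, -1: -1}    # f(theta) = 2 - 2 cos(theta), degree r = 1
r, n, m = 1, 23, 4
T = toeplitz(n, coeffs)
L = lt_unit_weight(n, m, coeffs)
R = [[t - l for t, l in zip(row_t, row_l)] for row_t, row_l in zip(T, L)]
rank_R = rank(R)
```

The value `rank_R` can then be compared with the bound $(2r + 1)m$ of (4.28); changing `n` and `m` shows that the rank of the remainder is governed by the number of blocks and not by the matrix size.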
4.4 Singular Value and Spectral Distribution of a Finite Sum of Multilevel LT Sequences

Theorem 4.7 provides the singular value distribution of a finite sum of multilevel LT sequences.

Theorem 4.7 If $\{A_{\boldsymbol{n}}^{(i)}\}_n \sim_{\mathrm{LT}} a_i \otimes f_i$, $i = 1, \ldots, p$, then
\[
\Bigl\{ \sum_{i=1}^p A_{\boldsymbol{n}}^{(i)} \Bigr\}_n \sim_\sigma \sum_{i=1}^p a_i \otimes f_i.
\]
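Proof Take any sequence $\{\boldsymbol{m} = \boldsymbol{m}(m)\}_m \subseteq \mathbb{N}^d$ such that $\boldsymbol{m} \to \infty$ as $m \to \infty$. By definition of multilevel LT sequences,
\[
\{LT_{\boldsymbol{n}}^{\boldsymbol{m}}(a_i, f_i)\}_n \xrightarrow{\text{a.c.s.}} \{A_{\boldsymbol{n}}^{(i)}\}_n, \qquad i = 1, \ldots, p.
\]
By Theorem 2.9,
\[
\Bigl\{ \sum_{i=1}^p LT_{\boldsymbol{n}}^{\boldsymbol{m}}(a_i, f_i) \Bigr\}_n \xrightarrow{\text{a.c.s.}} \Bigl\{ \sum_{i=1}^p A_{\boldsymbol{n}}^{(i)} \Bigr\}_n.
\]
By Theorem 4.2, for every $F \in C_c(\mathbb{R})$ we have
\[
\begin{aligned}
& \lim_{n \to \infty} \frac{1}{N(\boldsymbol{n})} \sum_{r=1}^{N(\boldsymbol{n})} F\Bigl( \sigma_r\Bigl( \sum_{i=1}^p LT_{\boldsymbol{n}}^{\boldsymbol{m}}(a_i, f_i) \Bigr) \Bigr) = \phi_{\boldsymbol{m}}(F), \\
& \lim_{m \to \infty} \phi_{\boldsymbol{m}}(F) = \phi(F) = \frac{1}{(2\pi)^d} \int_{[0,1]^d \times [-\pi,\pi]^d} F\Bigl( \Bigl| \sum_{i=1}^p a_i(\boldsymbol{x}) f_i(\boldsymbol{\theta}) \Bigr| \Bigr) d\boldsymbol{x}\, d\boldsymbol{\theta}.
\end{aligned}
\]
By Theorem 2.6, we obtain
\[
\Bigl\{ \sum_{i=1}^p A_{\boldsymbol{n}}^{(i)} \Bigr\}_n \sim_\sigma \phi.
\]
Since $\phi$ is just the functional $\phi_{|\sum_{i=1}^p a_i \otimes f_i|}$ associated with $|\sum_{i=1}^p a_i \otimes f_i|$ according to Eq. (2.2), the previous singular value distribution is equivalent to
\[
\Bigl\{ \sum_{i=1}^p A_{\boldsymbol{n}}^{(i)} \Bigr\}_n \sim_\sigma \sum_{i=1}^p a_i \otimes f_i,
\]
and the thesis is proved.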
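Using Theorem 4.7, we show in Proposition 4.2 that the symbol of a multilevel LT sequence is essentially unique. Afterwards, in Proposition 4.3, we show that the symbol of a multilevel LT sequence formed by Hermitian matrices is real a.e.

Proposition 4.2 If $\{A_{\mathbf n}\}_n\sim_{\mathrm{LT}}a\otimes f$ and $\{A_{\mathbf n}\}_n\sim_{\mathrm{LT}}\tilde a\otimes\tilde f$, then $a\otimes f=\tilde a\otimes\tilde f$ a.e.

Proof By Theorem 4.7, $\{O_{N(\mathbf n)}\}_n=\{A_{\mathbf n}-A_{\mathbf n}\}_n\sim_\sigma a\otimes f-\tilde a\otimes\tilde f$. Hence, for every $F\in C_c(\mathbb R)$,
\[
F(0)=\frac{1}{(2\pi)^d}\int_{[0,1]^d\times[-\pi,\pi]^d} F\bigl(|a(\mathbf x)f(\boldsymbol\theta)-\tilde a(\mathbf x)\tilde f(\boldsymbol\theta)|\bigr)\,d\mathbf x\,d\boldsymbol\theta.
\]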
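This means that $\phi_{|a\otimes f-\tilde a\otimes\tilde f|}=\phi_0$ and so $|a\otimes f-\tilde a\otimes\tilde f|=0$ a.e. by [22, Remark 2.1].

Proposition 4.3 If $\{A_{\mathbf n}\}_n\sim_{\mathrm{LT}}a\otimes f$ and the $A_{\mathbf n}$ are Hermitian, then $a\otimes f\in\mathbb R$ a.e.

Proof It holds in general that $\{A_{\mathbf n}\}_n\sim_{\mathrm{LT}}a\otimes f$ implies $\{A^*_{\mathbf n}\}_n\sim_{\mathrm{LT}}\overline a\otimes\overline f$. This follows immediately from the definition of multilevel LT sequences in combination with (4.10) and Remark 2.6. If the matrices $A_{\mathbf n}$ are Hermitian, then, by Proposition 4.2, we have $a\otimes f=\overline{a\otimes f}$ a.e., i.e., $a\otimes f\in\mathbb R$ a.e.

Theorem 4.8 provides the spectral distribution of a finite sum of multilevel LT sequences formed by Hermitian matrices.

Theorem 4.8 If $\{A^{(i)}_{\mathbf n}\}_n\sim_{\mathrm{LT}}a_i\otimes f_i$, $i=1,\dots,p$, then
\[
\Bigl\{\Re\Bigl(\sum_{i=1}^p A^{(i)}_{\mathbf n}\Bigr)\Bigr\}_n \sim_\lambda \Re\Bigl(\sum_{i=1}^p a_i\otimes f_i\Bigr).
\]
In particular, if the $A^{(i)}_{\mathbf n}$ are Hermitian,
\[
\Bigl\{\sum_{i=1}^p A^{(i)}_{\mathbf n}\Bigr\}_n \sim_\lambda \sum_{i=1}^p a_i\otimes f_i.
\]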
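Proof Take any sequence $\{\mathbf m=\mathbf m(m)\}_m\subseteq\mathbb N^d$ such that $\mathbf m\to\infty$ as $m\to\infty$. By definition of multilevel LT sequences,
\[
\{LT_{\mathbf n}^{\mathbf m}(a_i,f_i)\}_n \xrightarrow{\text{a.c.s.}} \{A^{(i)}_{\mathbf n}\}_n,\qquad i=1,\dots,p.
\]
By Theorem 2.9,
\[
\Bigl\{\Re\Bigl(\sum_{i=1}^p LT_{\mathbf n}^{\mathbf m}(a_i,f_i)\Bigr)\Bigr\}_n \xrightarrow{\text{a.c.s.}} \Bigl\{\Re\Bigl(\sum_{i=1}^p A^{(i)}_{\mathbf n}\Bigr)\Bigr\}_n.
\]
By Theorem 4.3, for every $F\in C_c(\mathbb C)$ we have
\[
\begin{aligned}
\lim_{n\to\infty}\frac{1}{N(\mathbf n)}\sum_{r=1}^{N(\mathbf n)} F\Bigl(\lambda_r\Bigl(\Re\Bigl(\sum_{i=1}^p LT_{\mathbf n}^{\mathbf m}(a_i,f_i)\Bigr)\Bigr)\Bigr) &= \phi_m(F),\\
\lim_{m\to\infty}\phi_m(F)=\phi(F) &= \frac{1}{(2\pi)^d}\int_{[0,1]^d\times[-\pi,\pi]^d} F\Bigl(\Re\Bigl(\sum_{i=1}^p a_i(\mathbf x)f_i(\boldsymbol\theta)\Bigr)\Bigr)\,d\mathbf x\,d\boldsymbol\theta.
\end{aligned}
\]
By Theorem 2.7,
\[
\Bigl\{\Re\Bigl(\sum_{i=1}^p A^{(i)}_{\mathbf n}\Bigr)\Bigr\}_n \sim_\lambda \phi.
\]
Since $\phi=\phi_{\Re(\sum_{i=1}^p a_i\otimes f_i)}$, we obtain
\[
\Bigl\{\Re\Bigl(\sum_{i=1}^p A^{(i)}_{\mathbf n}\Bigr)\Bigr\}_n \sim_\lambda \Re\Bigl(\sum_{i=1}^p a_i\otimes f_i\Bigr).
\]
To conclude the proof, we note that if all the matrices $A^{(i)}_{\mathbf n}$ are Hermitian then we have $\Re(\sum_{i=1}^p A^{(i)}_{\mathbf n})=\sum_{i=1}^p A^{(i)}_{\mathbf n}$ and $\Re(\sum_{i=1}^p a_i\otimes f_i)=\sum_{i=1}^p a_i\otimes f_i$ a.e. by Proposition 4.3.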
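4.5 Algebraic Properties of Multilevel LT Sequences

Proposition 4.4 collects the most elementary algebraic properties of multilevel LT sequences, which follow from Remark 2.6, Eq. (4.10), and the bilinearity of the multilevel LT operator $LT_{\mathbf n}^{\mathbf m}(a,f)$ with respect to its arguments $a$ and $f$ (see Sect. 4.1.2, especially Eqs. (4.8) and (4.9)).

Proposition 4.4 The following properties hold.
• If $\{A_{\mathbf n}\}_n\sim_{\mathrm{LT}}a\otimes f$ then $\{A^*_{\mathbf n}\}_n\sim_{\mathrm{LT}}\overline a\otimes\overline f$.
• If $\{A_{\mathbf n}\}_n\sim_{\mathrm{LT}}a\otimes f$ then $\{\alpha A_{\mathbf n}\}_n\sim_{\mathrm{LT}}\alpha a\otimes f$ for all $\alpha\in\mathbb C$.
• If $\{A^{(i)}_{\mathbf n}\}_n\sim_{\mathrm{LT}}a\otimes f_i$, $i=1,\dots,r$, then $\{\sum_{i=1}^r A^{(i)}_{\mathbf n}\}_n\sim_{\mathrm{LT}}a\otimes(\sum_{i=1}^r f_i)$.
• If $\{A^{(i)}_{\mathbf n}\}_n\sim_{\mathrm{LT}}a_i\otimes f$, $i=1,\dots,r$, then $\{\sum_{i=1}^r A^{(i)}_{\mathbf n}\}_n\sim_{\mathrm{LT}}(\sum_{i=1}^r a_i)\otimes f$.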
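In Theorem 4.9, we show, under mild assumptions, that the product of multilevel LT sequences is again a multilevel LT sequence with symbol given by the product of the symbols.

Theorem 4.9 Let $\{A_{\mathbf n}\}_n\sim_{\mathrm{LT}}a\otimes f$ and $\{\tilde A_{\mathbf n}\}_n\sim_{\mathrm{LT}}\tilde a\otimes\tilde f$, where $f\in L^p([-\pi,\pi]^d)$, $\tilde f\in L^q([-\pi,\pi]^d)$, and $1\le p,q\le\infty$ are conjugate exponents. Then $\{A_{\mathbf n}\tilde A_{\mathbf n}\}_n\sim_{\mathrm{LT}}a\tilde a\otimes f\tilde f$.

Proof By Theorem 4.7 and Proposition 2.2, every multilevel LT sequence is s.u., so in particular $\{A_{\mathbf n}\}_n$ and $\{\tilde A_{\mathbf n}\}_n$ are s.u. Since, by definition of multilevel LT sequences,
\[
\begin{aligned}
\{LT_{\mathbf n}^{\mathbf m}(a,f)\}_n &\xrightarrow{\text{a.c.s.}} \{A_{\mathbf n}\}_n \quad\text{as } \mathbf m\to\infty,\\
\{LT_{\mathbf n}^{\mathbf m}(\tilde a,\tilde f)\}_n &\xrightarrow{\text{a.c.s.}} \{\tilde A_{\mathbf n}\}_n \quad\text{as } \mathbf m\to\infty,
\end{aligned}
\]
Remark 2.6 yields
\[
\{LT_{\mathbf n}^{\mathbf m}(a,f)\,LT_{\mathbf n}^{\mathbf m}(\tilde a,\tilde f)\}_n \xrightarrow{\text{a.c.s.}} \{A_{\mathbf n}\tilde A_{\mathbf n}\}_n \quad\text{as } \mathbf m\to\infty.
\]
Using Proposition 4.1, especially (4.14), we obtain
\[
\{LT_{\mathbf n}^{\mathbf m}(a\tilde a,f\tilde f)\}_n \xrightarrow{\text{a.c.s.}} \{A_{\mathbf n}\tilde A_{\mathbf n}\}_n \quad\text{as } \mathbf m\to\infty,
\]
hence $\{A_{\mathbf n}\tilde A_{\mathbf n}\}_n\sim_{\mathrm{LT}}a\tilde a\otimes f\tilde f$.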
As a consequence of Theorems 4.5, 4.6 and 4.9, we immediately obtain the following result.

Theorem 4.10 If $a:[0,1]^d\to\mathbb C$ is Riemann-integrable and $f\in L^1([-\pi,\pi]^d)$ then $\{D_{\mathbf n}(a)T_{\mathbf n}(f)\}_n\sim_{\mathrm{LT}}a\otimes f$ for any sequence $\{\mathbf n=\mathbf n(n)\}_n\subseteq\mathbb N^d$ such that $\mathbf n\to\infty$ as $n\to\infty$.
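To make the matrices in Theorem 4.10 concrete, the following Python sketch assembles $D_n(a)T_n(f)$ in the simplest unilevel case $d=1$. It assumes the usual conventions $D_n(a)=\mathrm{diag}_{i=1,\dots,n}\,a(i/n)$ and $T_n(f)=[f_{i-j}]_{i,j=1}^n$, where the $f_k$ are the Fourier coefficients of $f$. The function names and the choices $a(x)=x$, $f(\theta)=2-2\cos\theta$ are ours and serve only as an illustration.

```python
def toeplitz(n, coeffs):
    """T_n(f) = [f_{i-j}] for a trigonometric polynomial f given as {k: f_k}."""
    return [[coeffs.get(i - j, 0.0) for j in range(n)] for i in range(n)]


def diag_sampling(n, a):
    """Diagonal entries of D_n(a): the samples a(i/n), i = 1, ..., n."""
    return [a(i / n) for i in range(1, n + 1)]


def dn_times_tn(n, a, coeffs):
    """D_n(a) T_n(f): row i of T_n(f) is scaled by a(i/n)."""
    d = diag_sampling(n, a)
    t = toeplitz(n, coeffs)
    return [[d[i] * t[i][j] for j in range(n)] for i in range(n)]


# Illustrative choice: f(theta) = 2 - 2cos(theta) = 2 - e^{i theta} - e^{-i theta}.
laplacian_coeffs = {0: 2.0, 1: -1.0, -1: -1.0}
A = dn_times_tn(4, lambda x: x, laplacian_coeffs)

for row in A:
    print(row)
```

The resulting matrix is tridiagonal, with the $i$-th row of $\mathrm{tridiag}(-1,2,-1)$ multiplied by $i/n$. In the multilevel case the same construction applies with multi-indices in place of scalar indices.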
4.6 Characterizations of Multilevel LT Sequences

Theorem 4.10 shows that, for any $a,f$ as in Definition 4.3, there always exists a matrix-sequence $\{A_{\mathbf n}\}_n$ such that $\{A_{\mathbf n}\}_n\sim_{\mathrm{LT}}a\otimes f$. Indeed, it suffices to take $A_{\mathbf n}=D_{\mathbf n}(a)T_{\mathbf n}(f)$. Theorem 4.11 shows that the sequences of the form $\{D_{\mathbf n}(a)T_{\mathbf n}(f)\}_n$ play a central role in the world of multilevel LT sequences. Indeed, $\{A_{\mathbf n}\}_n\sim_{\mathrm{LT}}a\otimes f$ if and only if $A_{\mathbf n}$ equals $D_{\mathbf n}(a)T_{\mathbf n}(f)$ up to a small-rank plus small-norm correction. More precisely, any multilevel LT sequence $\{A_{\mathbf n}\}_n\sim_{\mathrm{LT}}a\otimes f$ admits the fixed matrix-sequence $\{D_{\mathbf n}(a)T_{\mathbf n}(f)\}_n$ as an a.c.s., and, vice versa, any matrix-sequence $\{A_{\mathbf n}\}_n$ admitting $\{D_{\mathbf n}(a)T_{\mathbf n}(f)\}_n$ as an a.c.s. is a multilevel LT sequence with symbol $a\otimes f$. From a topological viewpoint, this means that
\[
\{A_{\mathbf n}\}_n\sim_{\mathrm{LT}}a\otimes f \iff d_{\text{a.c.s.}}\bigl(\{A_{\mathbf n}\}_n,\{D_{\mathbf n}(a)T_{\mathbf n}(f)\}_n\bigr)=0.
\]
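Theorem 4.11 Let $\{A_{\mathbf n}\}_n$ be a d-level matrix-sequence, let $a:[0,1]^d\to\mathbb C$ be a Riemann-integrable function and let $f\in L^1([-\pi,\pi]^d)$. The following conditions are equivalent.
1. $\{A_{\mathbf n}\}_n\sim_{\mathrm{LT}}a\otimes f$.
2. For all sequences $\{a_m\}_m$, $\{f_m\}_m$, $\{\{A^{(m)}_{\mathbf n}\}_n\}_m$ such that
• $a_m:[0,1]^d\to\mathbb C$ is Riemann-integrable and $a_m\to a$ in $L^1([0,1]^d)$,
• $f_m\in L^1([-\pi,\pi]^d)$ and $f_m\to f$ in $L^1([-\pi,\pi]^d)$,
• $\{A^{(m)}_{\mathbf n}\}_n\sim_{\mathrm{LT}}a_m\otimes f_m$,
we have $\{A^{(m)}_{\mathbf n}\}_n\xrightarrow{\text{a.c.s.}}\{A_{\mathbf n}\}_n$.
3. There exist sequences $\{a_m\}_m$, $\{f_m\}_m$ such that
• $a_m:[0,1]^d\to\mathbb C$ is continuous with $\|a_m\|_\infty\le\|a\|_{L^\infty}$ for all $m$ and $a_m\to a$ a.e.,
• $f_m:[-\pi,\pi]^d\to\mathbb C$ is a d-variate trigonometric polynomial with $\|f_m\|_\infty\le\operatorname{ess\,sup}_{[-\pi,\pi]^d}|f|$ for all $m$ and $f_m\to f$ a.e. and in $L^1([-\pi,\pi]^d)$,
• $\{D_{\mathbf n}(a_m)T_{\mathbf n}(f_m)\}_n\xrightarrow{\text{a.c.s.}}\{A_{\mathbf n}\}_n$.
4. There exist sequences $\{a_m\}_m$, $\{f_m\}_m$, $\{\{A^{(m)}_{\mathbf n}\}_n\}_m$ such that
• $a_m:[0,1]^d\to\mathbb C$ is Riemann-integrable and $a_m\to a$ in $L^1([0,1]^d)$,
• $f_m\in L^1([-\pi,\pi]^d)$ and $f_m\to f$ in $L^1([-\pi,\pi]^d)$,
• $\{A^{(m)}_{\mathbf n}\}_n\sim_{\mathrm{LT}}a_m\otimes f_m$ and $\{A^{(m)}_{\mathbf n}\}_n\xrightarrow{\text{a.c.s.}}\{A_{\mathbf n}\}_n$.
5. $\{D_{\mathbf n}(a)T_{\mathbf n}(f)\}_n\xrightarrow{\text{a.c.s.}}\{A_{\mathbf n}\}_n$.
6. $A_{\mathbf n}=D_{\mathbf n}(a)T_{\mathbf n}(f)+Z_{\mathbf n}$ for every $n$, where $\{Z_{\mathbf n}\}_n$ is zero-distributed.

Proof (1 ⟹ 2) Let $\{a_m\}_m$, $\{f_m\}_m$, $\{\{A^{(m)}_{\mathbf n}\}_n\}_m$ be sequences with the properties specified in item 2. Since $\{A^{(m)}_{\mathbf n}\}_n\sim_{\mathrm{LT}}a_m\otimes f_m$, for every $m$ and every $\mathbf k\in\mathbb N^d$ there is $n_{m,\mathbf k}$ such that, for $n\ge n_{m,\mathbf k}$,
\[
A^{(m)}_{\mathbf n}=LT_{\mathbf n}^{\mathbf k}(a_m,f_m)+R_{\mathbf n,m,\mathbf k}+N_{\mathbf n,m,\mathbf k},\qquad
\operatorname{rank}(R_{\mathbf n,m,\mathbf k})\le c(m,\mathbf k)\,N(\mathbf n),\qquad
\|N_{\mathbf n,m,\mathbf k}\|\le\omega(m,\mathbf k),
\]
where
\[
\lim_{\mathbf k\to\infty}c(m,\mathbf k)=\lim_{\mathbf k\to\infty}\omega(m,\mathbf k)=0.
\]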
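Moreover, since $\{A_{\mathbf n}\}_n\sim_{\mathrm{LT}}a\otimes f$, for every $\mathbf k\in\mathbb N^d$ there is $n_{\mathbf k}$ such that, for $n\ge n_{\mathbf k}$,
\[
A_{\mathbf n}=LT_{\mathbf n}^{\mathbf k}(a,f)+R_{\mathbf n,\mathbf k}+N_{\mathbf n,\mathbf k},\qquad
\operatorname{rank}(R_{\mathbf n,\mathbf k})\le c(\mathbf k)\,N(\mathbf n),\qquad
\|N_{\mathbf n,\mathbf k}\|\le\omega(\mathbf k),
\]
where
\[
\lim_{\mathbf k\to\infty}c(\mathbf k)=\lim_{\mathbf k\to\infty}\omega(\mathbf k)=0.
\]
Hence, for every $m$, every $\mathbf k\in\mathbb N^d$ and every $n\ge\max(n_{m,\mathbf k},n_{\mathbf k})$,
\[
\begin{aligned}
A_{\mathbf n} &= A^{(m)}_{\mathbf n}+\bigl(LT_{\mathbf n}^{\mathbf k}(a,f)-LT_{\mathbf n}^{\mathbf k}(a_m,f_m)\bigr)+(R_{\mathbf n,\mathbf k}-R_{\mathbf n,m,\mathbf k})+(N_{\mathbf n,\mathbf k}-N_{\mathbf n,m,\mathbf k}),\\
\operatorname{rank}(R_{\mathbf n,\mathbf k}-R_{\mathbf n,m,\mathbf k}) &\le (c(\mathbf k)+c(m,\mathbf k))\,N(\mathbf n),\\
\|N_{\mathbf n,\mathbf k}-N_{\mathbf n,m,\mathbf k}\| &\le \omega(\mathbf k)+\omega(m,\mathbf k).
\end{aligned}
\tag{4.29}
\]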
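Thanks to (4.11), the linearity of the maps (4.8)–(4.9) and Theorem 3.2, we have
\[
\begin{aligned}
\|LT_{\mathbf n}^{\mathbf k}(a,f)-LT_{\mathbf n}^{\mathbf k}(a_m,f_m)\|_1
&\le \|LT_{\mathbf n}^{\mathbf k}(a,f-f_m)\|_1+\|LT_{\mathbf n}^{\mathbf k}(a-a_m,f_m)\|_1\\
&= \sum_{\mathbf j=\mathbf 1}^{\mathbf k}\Bigl|a\Bigl(\frac{\mathbf j}{\mathbf k}\Bigr)\Bigr|\,\|T_{\lfloor\mathbf n/\mathbf k\rfloor}(f-f_m)\|_1
+\sum_{\mathbf j=\mathbf 1}^{\mathbf k}\Bigl|a\Bigl(\frac{\mathbf j}{\mathbf k}\Bigr)-a_m\Bigl(\frac{\mathbf j}{\mathbf k}\Bigr)\Bigr|\,\|T_{\lfloor\mathbf n/\mathbf k\rfloor}(f_m)\|_1\\
&\le N(\mathbf n)\,\|a\|_\infty\,\|f-f_m\|_{L^1}+\|f_m\|_{L^1}\,\frac{N(\mathbf n)}{N(\mathbf k)}\sum_{\mathbf j=\mathbf 1}^{\mathbf k}\Bigl|a\Bigl(\frac{\mathbf j}{\mathbf k}\Bigr)-a_m\Bigl(\frac{\mathbf j}{\mathbf k}\Bigr)\Bigr|\\
&\le \Bigl[\|a\|_\infty\,\|f-f_m\|_{L^1}+\Bigl(\sup_\ell\|f_\ell\|_{L^1}\Bigr)\frac{1}{N(\mathbf k)}\sum_{\mathbf j=\mathbf 1}^{\mathbf k}\Bigl|a\Bigl(\frac{\mathbf j}{\mathbf k}\Bigr)-a_m\Bigl(\frac{\mathbf j}{\mathbf k}\Bigr)\Bigr|\Bigr]N(\mathbf n);
\end{aligned}
\tag{4.30}
\]
note that $\|f_\ell\|_{L^1}$ is uniformly bounded with respect to $\ell$, because $f_\ell$ converges to $f$ in $L^1([-\pi,\pi]^d)$. By the Riemann-integrability of $|a-a_m|$, and by the fact that $a_m\to a$ in $L^1([0,1]^d)$ and $f_m\to f$ in $L^1([-\pi,\pi]^d)$, the quantity
\[
\varepsilon(m,\mathbf k)=\|a\|_\infty\,\|f-f_m\|_{L^1}+\Bigl(\sup_\ell\|f_\ell\|_{L^1}\Bigr)\frac{1}{N(\mathbf k)}\sum_{\mathbf j=\mathbf 1}^{\mathbf k}\Bigl|a\Bigl(\frac{\mathbf j}{\mathbf k}\Bigr)-a_m\Bigl(\frac{\mathbf j}{\mathbf k}\Bigr)\Bigr|
\tag{4.31}
\]
satisfies
\[
\lim_{m\to\infty}\lim_{\mathbf k\to\infty}\varepsilon(m,\mathbf k)=\lim_{m\to\infty}\Bigl[\|a\|_\infty\,\|f-f_m\|_{L^1}+\Bigl(\sup_\ell\|f_\ell\|_{L^1}\Bigr)\int_{[0,1]^d}|a(\mathbf x)-a_m(\mathbf x)|\,d\mathbf x\Bigr]=0.
\tag{4.32}
\]
Choose any sequence $\{\mathbf k(m)\}_m\subseteq\mathbb N^d$ such that $\mathbf k(m)\to\infty$ as $m\to\infty$ and
\[
\lim_{m\to\infty}c(m,\mathbf k(m))=\lim_{m\to\infty}\omega(m,\mathbf k(m))=\lim_{m\to\infty}\varepsilon(m,\mathbf k(m))=0.
\]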
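By (4.29)–(4.30), for every $m$ and every $n\ge\max(n_{m,\mathbf k(m)},n_{\mathbf k(m)})$ we have
\[
\begin{aligned}
A_{\mathbf n} &= A^{(m)}_{\mathbf n}+\bigl(LT_{\mathbf n}^{\mathbf k(m)}(a,f)-LT_{\mathbf n}^{\mathbf k(m)}(a_m,f_m)\bigr)+(R_{\mathbf n,\mathbf k(m)}-R_{\mathbf n,m,\mathbf k(m)})+(N_{\mathbf n,\mathbf k(m)}-N_{\mathbf n,m,\mathbf k(m)}),\\
\operatorname{rank}(R_{\mathbf n,\mathbf k(m)}-R_{\mathbf n,m,\mathbf k(m)}) &\le (c(\mathbf k(m))+c(m,\mathbf k(m)))\,N(\mathbf n),\\
\|N_{\mathbf n,\mathbf k(m)}-N_{\mathbf n,m,\mathbf k(m)}\| &\le \omega(\mathbf k(m))+\omega(m,\mathbf k(m)),\\
\|LT_{\mathbf n}^{\mathbf k(m)}(a,f)-LT_{\mathbf n}^{\mathbf k(m)}(a_m,f_m)\|_1 &\le \varepsilon(m,\mathbf k(m))\,N(\mathbf n).
\end{aligned}
\tag{4.33}
\]
By [22, Lemma 5.6], we can decompose $LT_{\mathbf n}^{\mathbf k(m)}(a,f)-LT_{\mathbf n}^{\mathbf k(m)}(a_m,f_m)$ as the sum of a small-rank term $\hat R_{\mathbf n,m}$, with rank bounded by $\sqrt{\varepsilon(m,\mathbf k(m))}\,N(\mathbf n)$, plus a small-norm term $\hat N_{\mathbf n,m}$, with norm bounded by $\sqrt{\varepsilon(m,\mathbf k(m))}$. We then infer from (4.33) that $\{A^{(m)}_{\mathbf n}\}_n\xrightarrow{\text{a.c.s.}}\{A_{\mathbf n}\}_n$. This concludes the proof of the implication 1 ⟹ 2.

(2 ⟹ 3) Since any Riemann-integrable function is bounded by definition, we have $a\in L^\infty([0,1]^d)$. Hence, by [22, Theorem 2.2], there exists a sequence of continuous functions $a_m:[0,1]^d\to\mathbb C$ such that $\|a_m\|_\infty\le\|a\|_{L^\infty}$ for all $m$ and $a_m\to a$ a.e. The sequence $\{a_m\}_m$ satisfies the properties in item 3. Since $f\in L^1([-\pi,\pi]^d)$, by Lemma 2.3 there exists a sequence of d-variate trigonometric polynomials $\{f_m\}_m$ such that $\|f_m\|_\infty\le\operatorname{ess\,sup}_{[-\pi,\pi]^d}|f|$ for all $m$ and $f_m\to f$ a.e. and in $L^1([-\pi,\pi]^d)$. The sequence $\{f_m\}_m$ satisfies the properties in item 3. By item 2 and Theorem 4.10, we have $\{D_{\mathbf n}(a_m)T_{\mathbf n}(f_m)\}_n\xrightarrow{\text{a.c.s.}}\{A_{\mathbf n}\}_n$, and the proof is complete.

(3 ⟹ 4) Simply note that, under the assumptions in item 3, $a_m\to a$ in $L^1([0,1]^d)$ by the dominated convergence theorem and $\{D_{\mathbf n}(a_m)T_{\mathbf n}(f_m)\}_n\sim_{\mathrm{LT}}a_m\otimes f_m$ by Theorem 4.10.

(4 ⟹ 1) Since $\{A^{(m)}_{\mathbf n}\}_n\sim_{\mathrm{LT}}a_m\otimes f_m$, for every $m$ and every $\mathbf k\in\mathbb N^d$ there is $n_{m,\mathbf k}$ such that, for $n\ge n_{m,\mathbf k}$,
\[
A^{(m)}_{\mathbf n}=LT_{\mathbf n}^{\mathbf k}(a_m,f_m)+R_{\mathbf n,m,\mathbf k}+N_{\mathbf n,m,\mathbf k},\qquad
\operatorname{rank}(R_{\mathbf n,m,\mathbf k})\le c(m,\mathbf k)\,N(\mathbf n),\qquad
\|N_{\mathbf n,m,\mathbf k}\|\le\omega(m,\mathbf k),
\]
where
\[
\lim_{\mathbf k\to\infty}c(m,\mathbf k)=\lim_{\mathbf k\to\infty}\omega(m,\mathbf k)=0.
\]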
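Since $\{A^{(m)}_{\mathbf n}\}_n\xrightarrow{\text{a.c.s.}}\{A_{\mathbf n}\}_n$, for every $m$ there exists $n_m$ such that, for $n\ge n_m$,
\[
A_{\mathbf n}=A^{(m)}_{\mathbf n}+R_{\mathbf n,m}+N_{\mathbf n,m},\qquad
\operatorname{rank}(R_{\mathbf n,m})\le c(m)\,N(\mathbf n),\qquad
\|N_{\mathbf n,m}\|\le\omega(m),
\]
where
\[
\lim_{m\to\infty}c(m)=\lim_{m\to\infty}\omega(m)=0.
\]
Thus, for every $m$, every $\mathbf k\in\mathbb N^d$ and every $n\ge\max(n_m,n_{m,\mathbf k})$,
\[
\begin{aligned}
A_{\mathbf n} &= LT_{\mathbf n}^{\mathbf k}(a,f)+\bigl(LT_{\mathbf n}^{\mathbf k}(a_m,f_m)-LT_{\mathbf n}^{\mathbf k}(a,f)\bigr)+(R_{\mathbf n,m}+R_{\mathbf n,m,\mathbf k})+(N_{\mathbf n,m}+N_{\mathbf n,m,\mathbf k}),\\
\operatorname{rank}(R_{\mathbf n,m}+R_{\mathbf n,m,\mathbf k}) &\le (c(m)+c(m,\mathbf k))\,N(\mathbf n),\\
\|N_{\mathbf n,m}+N_{\mathbf n,m,\mathbf k}\| &\le \omega(m)+\omega(m,\mathbf k),\\
\|LT_{\mathbf n}^{\mathbf k}(a_m,f_m)-LT_{\mathbf n}^{\mathbf k}(a,f)\|_1 &\le \varepsilon(m,\mathbf k)\,N(\mathbf n),
\end{aligned}
\]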
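where in the last inequality we used (4.30); the quantity $\varepsilon(m,\mathbf k)$ is defined in (4.31) and satisfies (4.32). Let $\{m(\mathbf k)\}_{\mathbf k\in\mathbb N^d}$ be a family of indices such that $m(\mathbf k)\to\infty$ as $\mathbf k\to\infty$ and
\[
\lim_{\mathbf k\to\infty}\varepsilon(m(\mathbf k),\mathbf k)=\lim_{\mathbf k\to\infty}c(m(\mathbf k),\mathbf k)=\lim_{\mathbf k\to\infty}\omega(m(\mathbf k),\mathbf k)=0.
\]
Note that such a family exists by Lemma 4.1 (apply the lemma with $x(m,\mathbf k)=\varepsilon(m,\mathbf k)+c(m,\mathbf k)+\omega(m,\mathbf k)$). Then, for every $\mathbf k\in\mathbb N^d$ and $n\ge\max(n_{m(\mathbf k)},n_{m(\mathbf k),\mathbf k})$,
\[
\begin{aligned}
A_{\mathbf n} &= LT_{\mathbf n}^{\mathbf k}(a,f)+\bigl(LT_{\mathbf n}^{\mathbf k}(a_{m(\mathbf k)},f_{m(\mathbf k)})-LT_{\mathbf n}^{\mathbf k}(a,f)\bigr)+(R_{\mathbf n,m(\mathbf k)}+R_{\mathbf n,m(\mathbf k),\mathbf k})+(N_{\mathbf n,m(\mathbf k)}+N_{\mathbf n,m(\mathbf k),\mathbf k}),\\
\operatorname{rank}(R_{\mathbf n,m(\mathbf k)}+R_{\mathbf n,m(\mathbf k),\mathbf k}) &\le (c(m(\mathbf k))+c(m(\mathbf k),\mathbf k))\,N(\mathbf n),\\
\|N_{\mathbf n,m(\mathbf k)}+N_{\mathbf n,m(\mathbf k),\mathbf k}\| &\le \omega(m(\mathbf k))+\omega(m(\mathbf k),\mathbf k),\\
\|LT_{\mathbf n}^{\mathbf k}(a_{m(\mathbf k)},f_{m(\mathbf k)})-LT_{\mathbf n}^{\mathbf k}(a,f)\|_1 &\le \varepsilon(m(\mathbf k),\mathbf k)\,N(\mathbf n).
\end{aligned}
\]
The use of [22, Lemma 5.6] allows one to decompose $LT_{\mathbf n}^{\mathbf k}(a_{m(\mathbf k)},f_{m(\mathbf k)})-LT_{\mathbf n}^{\mathbf k}(a,f)$ as the sum of a small-rank term $\hat R_{\mathbf n,\mathbf k}$, with rank bounded by $\sqrt{\varepsilon(m(\mathbf k),\mathbf k)}\,N(\mathbf n)$, plus a small-norm term $\hat N_{\mathbf n,\mathbf k}$, with norm bounded by $\sqrt{\varepsilon(m(\mathbf k),\mathbf k)}$. We therefore infer that $\{LT_{\mathbf n}^{\mathbf k}(a,f)\}_n\xrightarrow{\text{a.c.s.}}\{A_{\mathbf n}\}_n$ as $\mathbf k\to\infty$, i.e., $\{A_{\mathbf n}\}_n\sim_{\mathrm{LT}}a\otimes f$. This concludes the proof of the implication 4 ⟹ 1.

(5 ⟺ 6) Item 5 is equivalent to saying that $\{O_{N(\mathbf n)}\}_n\xrightarrow{\text{a.c.s.}}\{A_{\mathbf n}-D_{\mathbf n}(a)T_{\mathbf n}(f)\}_n$, which, by Theorem 4.4, is equivalent to saying that $\{A_{\mathbf n}-D_{\mathbf n}(a)T_{\mathbf n}(f)\}_n$ is zero-distributed.

(2 ⟹ 5) Obvious (take $a_m=a$, $f_m=f$ and $A^{(m)}_{\mathbf n}=D_{\mathbf n}(a)T_{\mathbf n}(f)$).

(5 ⟹ 4) Obvious (take $a_m=a$, $f_m=f$ and $A^{(m)}_{\mathbf n}=D_{\mathbf n}(a)T_{\mathbf n}(f)$).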
Chapter 5
Multilevel Generalized Locally Toeplitz Sequences
In this chapter we develop the multivariate version of the theory of GLT sequences, also known as the theory of multilevel GLT sequences, which dates back to the pioneering papers [35, 36] and was recently reviewed and extended in [21]. The topic is presented here on an abstract level, whereas for motivations and insights we refer the reader to [22, pp. 1–3]; see also Chap. 1. We stress that essentially all the results and proofs contained in this chapter have an exact analog in [22, Chap. 8], where we dealt with classical (unilevel) GLT sequences. However, we decided to reproduce here all the proofs without omitting details, in order to help the reader become familiar with the multilevel language (especially, the multi-index notation). In this regard, the reader is invited to compare the “multivariate proofs” presented in this chapter with the corresponding “univariate proofs” in [22, Chap. 8], in order to learn the way in which the multilevel language allows one to transfer many results from the univariate to the multivariate case by simply turning some letters ($n, i, j, x, \theta$, etc.) into boldface ($\mathbf n, \mathbf i, \mathbf j, \mathbf x, \boldsymbol\theta$, etc.).
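For readers who like to experiment, the multi-index notation can be mimicked in a few lines of Python. The sketch below assumes the conventions used throughout: for $\mathbf n=(n_1,\dots,n_d)$ we set $N(\mathbf n)=n_1\cdots n_d$, the range $\mathbf 1,\dots,\mathbf n$ consists of all $\mathbf j$ with $1\le j_r\le n_r$ for every $r$, ordered lexicographically with the last component running fastest, and $\mathbf j/\mathbf n$ is a componentwise quotient. The helper names are ours.

```python
from itertools import product
from math import prod


def multi_index_range(n):
    """All d-indices j with 1 <= j <= n componentwise, in lexicographic order."""
    return list(product(*(range(1, n_r + 1) for n_r in n)))


def N(n):
    """N(n) = n_1 * ... * n_d, the number of d-indices in the range 1, ..., n."""
    return prod(n)


n = (2, 3)
indices = multi_index_range(n)
# Grid points j/n (componentwise division), as used by D_n(a) = diag a(j/n).
grid = [tuple(j_r / n_r for j_r, n_r in zip(j, n)) for j in indices]

print(indices)
print(N(n), len(indices))
```

With this ordering, a sum such as $\sum_{\mathbf j=\mathbf 1}^{\mathbf n}$ is simply a loop over `indices`, which is why unilevel proofs carry over once scalar indices are replaced by multi-indices.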
© Springer Nature Switzerland AG 2018 C. Garoni and S. Serra-Capizzano, Generalized Locally Toeplitz Sequences: Theory and Applications, https://doi.org/10.1007/978-3-030-02233-4_5

5.1 Equivalent Definitions of Multilevel GLT Sequences

We first report (a corrected version of) the original definition of multilevel GLT sequences. This definition is formulated in terms of a.c.s. parameterized by a positive $\varepsilon\to0$ (see Sect. 2.7.5).

Definition 5.1 (multilevel generalized locally Toeplitz sequence) Let $\{A_{\mathbf n}\}_n$ be a d-level matrix-sequence and let $\kappa:[0,1]^d\times[-\pi,\pi]^d\to\mathbb C$ be a measurable function. We say that $\{A_{\mathbf n}\}_n$ is a (d-level) Generalized Locally Toeplitz (GLT) sequence with symbol $\kappa$, and we write $\{A_{\mathbf n}\}_n\sim_{\mathrm{GLT}}\kappa$, if the following condition is met.

For every $\varepsilon>0$ there exists a finite number of d-level LT sequences $\{A^{(i,\varepsilon)}_{\mathbf n}\}_n\sim_{\mathrm{LT}}a_{i,\varepsilon}\otimes f_{i,\varepsilon}$, $i=1,\dots,N_\varepsilon$, such that
• $\sum_{i=1}^{N_\varepsilon}a_{i,\varepsilon}\otimes f_{i,\varepsilon}\to\kappa$ in measure as $\varepsilon\to0$,
• $\bigl\{\sum_{i=1}^{N_\varepsilon}A^{(i,\varepsilon)}_{\mathbf n}\bigr\}_n\xrightarrow{\text{a.c.s.}}\{A_{\mathbf n}\}_n$ as $\varepsilon\to0$.

The symbol $\kappa$ is sometimes called the kernel of $\{A_{\mathbf n}\}_n$. In what follows, whenever we write a relation such as $\{A_{\mathbf n}\}_n\sim_{\mathrm{GLT}}\kappa$, it is understood that $\{A_{\mathbf n}\}_n$ is a d-level matrix-sequence and $\kappa:[0,1]^d\times[-\pi,\pi]^d\to\mathbb C$ is a measurable function, as in Definition 5.1.

Proposition 5.1 provides a straightforward characterization of multilevel GLT sequences, which may be taken as their definition instead of Definition 5.1. Actually, Proposition 5.1 is essentially the same as Definition 5.1, but it is easier to handle, because it is based on the standard a.c.s. notion (see Definition 2.6).

Proposition 5.1 Let $\{A_{\mathbf n}\}_n$ be a d-level matrix-sequence and $\kappa:[0,1]^d\times[-\pi,\pi]^d\to\mathbb C$ a measurable function. We have $\{A_{\mathbf n}\}_n\sim_{\mathrm{GLT}}\kappa$ if and only if the following condition is met.

For every $m\in\mathbb N$ there exists a finite number of d-level LT sequences $\{A^{(i,m)}_{\mathbf n}\}_n\sim_{\mathrm{LT}}a_{i,m}\otimes f_{i,m}$, $i=1,\dots,N_m$, such that
• $\sum_{i=1}^{N_m}a_{i,m}\otimes f_{i,m}\to\kappa$ in measure,
• $\bigl\{\sum_{i=1}^{N_m}A^{(i,m)}_{\mathbf n}\bigr\}_n\xrightarrow{\text{a.c.s.}}\{A_{\mathbf n}\}_n$.

Proof If $\{A_{\mathbf n}\}_n\sim_{\mathrm{GLT}}\kappa$, then the condition of the proposition holds with
\[
a_{i,m}=a_{i,\varepsilon(m)},\qquad f_{i,m}=f_{i,\varepsilon(m)},\qquad \{A^{(i,m)}_{\mathbf n}\}_n=\{A^{(i,\varepsilon(m))}_{\mathbf n}\}_n,\qquad N_m=N_{\varepsilon(m)},
\]
where $a_{i,\varepsilon}$, $f_{i,\varepsilon}$, $\{A^{(i,\varepsilon)}_{\mathbf n}\}_n$, $N_\varepsilon$ are as in Definition 5.1 and $\{\varepsilon(m)\}_m$ is any sequence of positive numbers such that $\varepsilon(m)\to0$. Conversely, suppose the condition of the proposition holds. Then $\{A_{\mathbf n}\}_n\sim_{\mathrm{GLT}}\kappa$, because the condition of Definition 5.1 holds with
\[
a_{i,\varepsilon}=a_{i,m(\varepsilon)},\qquad f_{i,\varepsilon}=f_{i,m(\varepsilon)},\qquad \{A^{(i,\varepsilon)}_{\mathbf n}\}_n=\{A^{(i,m(\varepsilon))}_{\mathbf n}\}_n,\qquad N_\varepsilon=N_{m(\varepsilon)},
\]
where $a_{i,m}$, $f_{i,m}$, $\{A^{(i,m)}_{\mathbf n}\}_n$, $N_m$ are as in the proposition and $\{m(\varepsilon)\}_{\varepsilon>0}\subseteq\mathbb N$ is any family of indices such that $m(\varepsilon)\to\infty$ as $\varepsilon\to0$.

Remark 5.1 It is clear that any d-level LT sequence is a d-level GLT sequence. More precisely,
\[
\{A_{\mathbf n}\}_n\sim_{\mathrm{LT}}a\otimes f\implies\{A_{\mathbf n}\}_n\sim_{\mathrm{GLT}}a\otimes f.
\]
To see this, it suffices to take, in Proposition 5.1,
\[
\{A^{(1,m)}_{\mathbf n}\}_n=\{A_{\mathbf n}\}_n,\qquad a_{1,m}=a,\qquad f_{1,m}=f,\qquad N_m=1.
\]
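5.2 Singular Value and Spectral Distribution of Multilevel GLT Sequences

Theorem 5.1 provides the singular value distribution of a multilevel GLT sequence.

Theorem 5.1 If $\{A_{\mathbf n}\}_n\sim_{\mathrm{GLT}}\kappa$ then $\{A_{\mathbf n}\}_n\sim_\sigma\kappa$.

Proof By Proposition 5.1, there exist LT sequences
\[
\{A^{(i,m)}_{\mathbf n}\}_n\sim_{\mathrm{LT}}a_{i,m}\otimes f_{i,m},\qquad i=1,\dots,N_m,
\]
such that
• $\sum_{i=1}^{N_m}a_{i,m}\otimes f_{i,m}\to\kappa$ in measure,
• $\bigl\{\sum_{i=1}^{N_m}A^{(i,m)}_{\mathbf n}\bigr\}_n\xrightarrow{\text{a.c.s.}}\{A_{\mathbf n}\}_n$.
By Theorem 4.7,
\[
\Bigl\{\sum_{i=1}^{N_m}A^{(i,m)}_{\mathbf n}\Bigr\}_n\sim_\sigma\sum_{i=1}^{N_m}a_{i,m}\otimes f_{i,m}.
\]
All the assumptions of Corollary 2.4 are then satisfied and $\{A_{\mathbf n}\}_n\sim_\sigma\kappa$.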
Using Theorem 5.1, we show in Proposition 5.2 that the symbol of a multilevel GLT sequence is essentially unique. Afterwards, in Proposition 5.3, we show that the symbol of a multilevel GLT sequence formed by Hermitian matrices is real a.e. For the proofs we need the following lemma, which is the most elementary result in the world of the algebraic properties possessed by multilevel GLT sequences. These properties will be investigated in Sect. 5.4 and give rise to the so-called multilevel GLT algebra.

Lemma 5.1 Let $\{A_{\mathbf n}\}_n\sim_{\mathrm{GLT}}\kappa$ and $\{B_{\mathbf n}\}_n\sim_{\mathrm{GLT}}\xi$. Then
• $\{A^*_{\mathbf n}\}_n\sim_{\mathrm{GLT}}\overline\kappa$,
• $\{\alpha A_{\mathbf n}+\beta B_{\mathbf n}\}_n\sim_{\mathrm{GLT}}\alpha\kappa+\beta\xi$ for all $\alpha,\beta\in\mathbb C$.

Proof It suffices to write the meaning of $\{A_{\mathbf n}\}_n\sim_{\mathrm{GLT}}\kappa$ and $\{B_{\mathbf n}\}_n\sim_{\mathrm{GLT}}\xi$ (using the characterization of Proposition 5.1), and to apply the first property of Proposition 4.4 in combination with Theorem 2.9.

Proposition 5.2 If $\{A_{\mathbf n}\}_n\sim_{\mathrm{GLT}}\kappa$ and $\{A_{\mathbf n}\}_n\sim_{\mathrm{GLT}}\xi$ then $\kappa=\xi$ a.e.

Proof By Lemma 5.1 we have $\{O_{N(\mathbf n)}\}_n\sim_{\mathrm{GLT}}\kappa-\xi$. Therefore, by Theorem 5.1, for all test functions $F\in C_c(\mathbb R)$ we have
\[
F(0)=\frac{1}{(2\pi)^d}\int_{[0,1]^d\times[-\pi,\pi]^d}F\bigl(|\kappa(\mathbf x,\boldsymbol\theta)-\xi(\mathbf x,\boldsymbol\theta)|\bigr)\,d\mathbf x\,d\boldsymbol\theta.
\]
This means that $\phi_{|\kappa-\xi|}=\phi_0$ and so $|\kappa-\xi|=0$ a.e. by [22, Remark 2.1].
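Proposition 5.3 If $\{A_{\mathbf n}\}_n\sim_{\mathrm{GLT}}\kappa$ and the $A_{\mathbf n}$ are Hermitian then $\kappa\in\mathbb R$ a.e.

Proof Since the matrices $A_{\mathbf n}$ are Hermitian, by Lemma 5.1 we have $\{A_{\mathbf n}\}_n\sim_{\mathrm{GLT}}\kappa$ and $\{A_{\mathbf n}\}_n\sim_{\mathrm{GLT}}\overline\kappa$. Thus, by Proposition 5.2, $\kappa=\overline\kappa$ a.e., i.e., $\kappa\in\mathbb R$ a.e.

Theorem 5.2 provides the spectral distribution of a multilevel GLT sequence formed by Hermitian matrices.

Theorem 5.2 If $\{A_{\mathbf n}\}_n\sim_{\mathrm{GLT}}\kappa$ and the $A_{\mathbf n}$ are Hermitian then $\{A_{\mathbf n}\}_n\sim_\lambda\kappa$.

Proof By Proposition 5.1, there exist LT sequences
\[
\{A^{(i,m)}_{\mathbf n}\}_n\sim_{\mathrm{LT}}a_{i,m}\otimes f_{i,m},\qquad i=1,\dots,N_m,
\]
such that
• $\sum_{i=1}^{N_m}a_{i,m}\otimes f_{i,m}\to\kappa$ in measure,
• $\bigl\{\sum_{i=1}^{N_m}A^{(i,m)}_{\mathbf n}\bigr\}_n\xrightarrow{\text{a.c.s.}}\{A_{\mathbf n}\}_n$.
Since the matrices $A_{\mathbf n}$ are Hermitian, we have $A_{\mathbf n}=\Re(A_{\mathbf n})$ and, consequently, by Theorem 2.9,
\[
\Bigl\{\Re\Bigl(\sum_{i=1}^{N_m}A^{(i,m)}_{\mathbf n}\Bigr)\Bigr\}_n\xrightarrow{\text{a.c.s.}}\{A_{\mathbf n}\}_n.
\]
By Theorem 4.8,
\[
\Bigl\{\Re\Bigl(\sum_{i=1}^{N_m}A^{(i,m)}_{\mathbf n}\Bigr)\Bigr\}_n\sim_\lambda\Re\Bigl(\sum_{i=1}^{N_m}a_{i,m}\otimes f_{i,m}\Bigr).
\]
The function $\kappa$ is real a.e. by Proposition 5.3, so from the convergence in measure $\sum_{i=1}^{N_m}a_{i,m}\otimes f_{i,m}\to\kappa$ we get
\[
\Re\Bigl(\sum_{i=1}^{N_m}a_{i,m}\otimes f_{i,m}\Bigr)\to\kappa\quad\text{in measure}.
\]
All the assumptions of Corollary 2.5 are then satisfied and $\{A_{\mathbf n}\}_n\sim_\lambda\kappa$.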
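A relation of the form $\{A_{\mathbf n}\}_n\sim_\lambda\kappa$ can be checked numerically in simple cases. The following Python sketch does so for the unilevel Toeplitz sequence $\{T_n(f)\}_n$ with $f(\theta)=2-2\cos\theta$, for which the symbol is $\kappa(x,\theta)=f(\theta)$. It relies on the classical closed form $\lambda_j=2-2\cos(j\pi/(n+1))$, $j=1,\dots,n$, for the eigenvalues of $\mathrm{tridiag}(-1,2,-1)$. The hat-shaped test function $F\in C_c(\mathbb R)$, the matrix size and the function names are our own illustrative choices, not part of the theory.

```python
import math


def hat(x):
    """Continuous, compactly supported test function F(x) = max(0, 1 - |x - 2|)."""
    return max(0.0, 1.0 - abs(x - 2.0))


def eigenvalue_average(n, F):
    """(1/n) * sum_j F(lambda_j) for T_n(2 - 2cos(theta)) = tridiag(-1, 2, -1)."""
    eigenvalues = (2.0 - 2.0 * math.cos(j * math.pi / (n + 1)) for j in range(1, n + 1))
    return sum(F(lam) for lam in eigenvalues) / n


def symbol_average(F, samples=200000):
    """Midpoint rule for (1/(2*pi)) * integral over [-pi, pi] of F(2 - 2cos(theta))."""
    h = 2.0 * math.pi / samples
    total = sum(F(2.0 - 2.0 * math.cos(-math.pi + (k + 0.5) * h)) for k in range(samples))
    return total * h / (2.0 * math.pi)


lhs = eigenvalue_average(2000, hat)   # left-hand side of the distribution relation
rhs = symbol_average(hat)             # right-hand side (integral of F composed with the symbol)
print(lhs, rhs)
```

The two printed numbers should agree up to an error that shrinks as $n$ grows; this is only a sanity check for one test function, not a substitute for the proof.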
Remark 5.2 In view of Lemma 5.1, it is clear that Theorem 4.7 is a particular case of Theorem 5.1, and Theorem 4.8 is a particular case of Theorem 5.2. We end this section with a spectral distribution result for multilevel GLT sequences formed by perturbed Hermitian matrices.
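Theorem 5.3 Suppose $\{A_{\mathbf n}\}_n\sim_{\mathrm{GLT}}\kappa$ and $A_{\mathbf n}=X_{\mathbf n}+Y_{\mathbf n}$, where
1. every $X_{\mathbf n}$ is Hermitian,
2. $\|X_{\mathbf n}\|,\|Y_{\mathbf n}\|\le C$ for all $n$, where $C$ is a constant independent of $n$,
3. $\|Y_{\mathbf n}\|_1=o(N(\mathbf n))$ as $n\to\infty$.
Then $\{A_{\mathbf n}\}_n\sim_\lambda\kappa$.

Proof $\{Y_{\mathbf n}\}_n$ is zero-distributed by Theorem 2.3, so $\{Y_{\mathbf n}\}_n\sim_{\mathrm{GLT}}0$ by Theorem 4.4. Since $X_{\mathbf n}=A_{\mathbf n}-Y_{\mathbf n}$ and the matrices $X_{\mathbf n}$ are Hermitian, we have $\{X_{\mathbf n}\}_n\sim_{\mathrm{GLT}}\kappa$ by Lemma 5.1 and $\{X_{\mathbf n}\}_n\sim_\lambda\kappa$ by Theorem 5.2. All the assumptions of Corollary 2.3 are then satisfied and the thesis follows.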
5.3 Approximation Results for Multilevel GLT Sequences

Theorem 5.4 is the main approximation result for multilevel GLT sequences. It is the same as Corollaries 2.4 and 2.5 with “$\sim_\sigma$” and “$\sim_\lambda$” replaced by “$\sim_{\mathrm{GLT}}$”. As we shall see, it is particularly useful to show that a given matrix-sequence $\{A_{\mathbf n}\}_n$ is a multilevel GLT sequence.

Theorem 5.4 Let $\{A_{\mathbf n}\}_n$ be a d-level matrix-sequence and let $\kappa:[0,1]^d\times[-\pi,\pi]^d\to\mathbb C$ be a measurable function. Suppose that
1. $\{B_{\mathbf n,m}\}_n\sim_{\mathrm{GLT}}\kappa_m$ for every $m$,
2. $\{B_{\mathbf n,m}\}_n\xrightarrow{\text{a.c.s.}}\{A_{\mathbf n}\}_n$,
3. $\kappa_m\to\kappa$ in measure.
Then $\{A_{\mathbf n}\}_n\sim_{\mathrm{GLT}}\kappa$.

Proof Since $\{B_{\mathbf n,m}\}_n\sim_{\mathrm{GLT}}\kappa_m$, Proposition 5.1 ensures that, for every $m,k$, there exists a finite number of d-level LT sequences $\{A^{(i,k)}_{\mathbf n,m}\}_n\sim_{\mathrm{LT}}a_{i,k,m}\otimes f_{i,k,m}$, $i=1,\dots,N_{k,m}$, such that
• $\sum_{i=1}^{N_{k,m}}a_{i,k,m}\otimes f_{i,k,m}\to\kappa_m$ in measure as $k\to\infty$,
• $\bigl\{\sum_{i=1}^{N_{k,m}}A^{(i,k)}_{\mathbf n,m}\bigr\}_n\xrightarrow{\text{a.c.s.}}\{B_{\mathbf n,m}\}_n$ as $k\to\infty$.
Hence, for every $m,k$ there exists $n_{k,m}$ such that, for $n\ge n_{k,m}$,
\[
B_{\mathbf n,m}=\sum_{i=1}^{N_{k,m}}A^{(i,k)}_{\mathbf n,m}+R_{\mathbf n,k,m}+N_{\mathbf n,k,m},\qquad
\operatorname{rank}(R_{\mathbf n,k,m})\le c(k,m)\,N(\mathbf n),\qquad
\|N_{\mathbf n,k,m}\|\le\omega(k,m),
\]
where
\[
\lim_{k\to\infty}c(k,m)=\lim_{k\to\infty}\omega(k,m)=0.
\]
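Let $\{\delta_m\}_m$ be a sequence of positive numbers such that $\delta_m\to0$. Since $\sum_{i=1}^{N_{k,m}}a_{i,k,m}\otimes f_{i,k,m}\to\kappa_m$ in measure as $k\to\infty$, for every $m$ we have
\[
\mu(m,k,\delta_m)=\mu_{2d}\Bigl\{\Bigl|\sum_{i=1}^{N_{k,m}}a_{i,k,m}\otimes f_{i,k,m}-\kappa_m\Bigr|\ge\delta_m\Bigr\}\to0\quad\text{as }k\to\infty.
\]
Now we recall that $\{B_{\mathbf n,m}\}_n\xrightarrow{\text{a.c.s.}}\{A_{\mathbf n}\}_n$: for every $m$ there exists $n_m$ such that, for $n\ge n_m$,
\[
A_{\mathbf n}=B_{\mathbf n,m}+R_{\mathbf n,m}+N_{\mathbf n,m},\qquad
\operatorname{rank}(R_{\mathbf n,m})\le c(m)\,N(\mathbf n),\qquad
\|N_{\mathbf n,m}\|\le\omega(m),
\]
where
\[
\lim_{m\to\infty}c(m)=\lim_{m\to\infty}\omega(m)=0.
\]
It follows that, for every $m,k$ and every $n\ge\max(n_m,n_{k,m})$,
\[
\begin{aligned}
A_{\mathbf n} &= \sum_{i=1}^{N_{k,m}}A^{(i,k)}_{\mathbf n,m}+(R_{\mathbf n,k,m}+R_{\mathbf n,m})+(N_{\mathbf n,k,m}+N_{\mathbf n,m}),\\
\operatorname{rank}(R_{\mathbf n,k,m}+R_{\mathbf n,m}) &\le (c(k,m)+c(m))\,N(\mathbf n),\\
\|N_{\mathbf n,k,m}+N_{\mathbf n,m}\| &\le \omega(k,m)+\omega(m).
\end{aligned}
\]
Choose a sequence $\{k_m\}_m$ such that $k_m\to\infty$ and
\[
\lim_{m\to\infty}c(k_m,m)=\lim_{m\to\infty}\omega(k_m,m)=\lim_{m\to\infty}\mu(m,k_m,\delta_m)=0.
\]
Then, for every $m$ and every $n\ge\max(n_m,n_{k_m,m})$,
\[
\begin{aligned}
A_{\mathbf n} &= \sum_{i=1}^{N_{k_m,m}}A^{(i,k_m)}_{\mathbf n,m}+(R_{\mathbf n,k_m,m}+R_{\mathbf n,m})+(N_{\mathbf n,k_m,m}+N_{\mathbf n,m}),\\
\operatorname{rank}(R_{\mathbf n,k_m,m}+R_{\mathbf n,m}) &\le (c(k_m,m)+c(m))\,N(\mathbf n),\\
\|N_{\mathbf n,k_m,m}+N_{\mathbf n,m}\| &\le \omega(k_m,m)+\omega(m).
\end{aligned}
\]
It follows that
\[
\Bigl\{\sum_{i=1}^{N_{k_m,m}}A^{(i,k_m)}_{\mathbf n,m}\Bigr\}_n\xrightarrow{\text{a.c.s.}}\{A_{\mathbf n}\}_n.
\tag{5.1}
\]
Moreover,
\[
\{A^{(i,k_m)}_{\mathbf n,m}\}_n\sim_{\mathrm{LT}}a_{i,k_m,m}\otimes f_{i,k_m,m}
\tag{5.2}
\]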
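for all $m$ and $i=1,\dots,N_{k_m,m}$, and
\[
\sum_{i=1}^{N_{k_m,m}}a_{i,k_m,m}\otimes f_{i,k_m,m}\to\kappa
\tag{5.3}
\]
in measure as $m\to\infty$. Indeed, for any $\delta>0$,
\[
\mu_{2d}\Bigl\{\Bigl|\sum_{i=1}^{N_{k_m,m}}a_{i,k_m,m}\otimes f_{i,k_m,m}-\kappa\Bigr|\ge\delta\Bigr\}
\le\mu_{2d}\Bigl\{\Bigl|\sum_{i=1}^{N_{k_m,m}}a_{i,k_m,m}\otimes f_{i,k_m,m}-\kappa_m\Bigr|\ge\delta/2\Bigr\}+\mu_{2d}\{|\kappa_m-\kappa|\ge\delta/2\},
\]
where $\mu_{2d}\{|\kappa_m-\kappa|\ge\delta/2\}$ tends to 0 by assumption (since $\kappa_m\to\kappa$ in measure) and
\[
\mu_{2d}\Bigl\{\Bigl|\sum_{i=1}^{N_{k_m,m}}a_{i,k_m,m}\otimes f_{i,k_m,m}-\kappa_m\Bigr|\ge\delta/2\Bigr\}=\mu(m,k_m,\delta/2)
\]
tends to 0 because it is eventually less than $\mu(m,k_m,\delta_m)$. In view of (5.1)–(5.3) and Proposition 5.1, we conclude that $\{A_{\mathbf n}\}_n\sim_{\mathrm{GLT}}\kappa$.

Remark 5.3 (topological closure of multilevel GLT sequences) It is interesting to give a topological interpretation of Theorem 5.4, which is completely analogous to the topological interpretation of Corollaries 2.4 and 2.5 given in Remark 2.5. Fix a sequence of d-indices $\{\mathbf n=\mathbf n(n)\}_n\subseteq\mathbb N^d$ such that $\mathbf n\to\infty$ as $n\to\infty$, and let
\[
\begin{aligned}
\mathcal E &= \bigl\{\{A_{\mathbf n}\}_n : A_{\mathbf n}\in\mathbb C^{N(\mathbf n)\times N(\mathbf n)}\text{ for every }n\bigr\},\\
\mathcal M_d &= \bigl\{\kappa:[0,1]^d\times[-\pi,\pi]^d\to\mathbb C : \kappa\text{ is measurable}\bigr\}.
\end{aligned}
\]
We have seen in Sect. 2.7.1 and [22, Sect. 2.3.2] that $\mathcal E$ (resp., $\mathcal M_d$) is a topological (pseudometric) space with respect to the pseudometric $d_{\text{a.c.s.}}$ (resp., $d_{\text{measure}}$) which induces the a.c.s. topology $\tau_{\text{a.c.s.}}$ (resp., the topology of convergence in measure $\tau_{\text{measure}}$). Theorem 5.4 is then equivalent to saying that the set of d-level GLT pairs
\[
\mathcal G=\bigl\{(\{A_{\mathbf n}\}_n,\kappa)\in\mathcal E\times\mathcal M_d : \{A_{\mathbf n}\}_n\sim_{\mathrm{GLT}}\kappa\bigr\}
\]
is closed in $\mathcal E\times\mathcal M_d$ equipped with the product (pseudometrizable) topology $\tau_{\text{a.c.s.}}\times\tau_{\text{measure}}$ induced, for example, by the pseudometric
\[
d_{\text{a.c.s.}\times\text{measure}}\bigl((\{A_{\mathbf n}\}_n,\kappa),(\{B_{\mathbf n}\}_n,\xi)\bigr)=d_{\text{a.c.s.}}(\{A_{\mathbf n}\}_n,\{B_{\mathbf n}\}_n)+d_{\text{measure}}(\kappa,\xi);
\]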
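see [22, Exercise 2.2]. Indeed, Theorem 5.4 reads as follows: if a sequence of d-level GLT sequences $\{B_{\mathbf n,m}\}_n\sim_{\mathrm{GLT}}\kappa_m$ converges in the a.c.s. topology to another d-level matrix-sequence $\{A_{\mathbf n}\}_n$ and if the corresponding sequence of symbols $\kappa_m$ converges in measure to a measurable function $\kappa$ (i.e., if a sequence of d-level GLT pairs $(\{B_{\mathbf n,m}\}_n,\kappa_m)$ converges to a pair $(\{A_{\mathbf n}\}_n,\kappa)$ in $\mathcal E\times\mathcal M_d$), then $\{A_{\mathbf n}\}_n\sim_{\mathrm{GLT}}\kappa$ (i.e., $(\{A_{\mathbf n}\}_n,\kappa)$ is a d-level GLT pair).

The approximation result for GLT sequences stated in Theorem 5.4 admits the following converse, which can be interpreted as another approximation result for multilevel GLT sequences.

Theorem 5.5 Let $\{A_{\mathbf n}\}_n$ be a d-level matrix-sequence and let $\{\{B_{\mathbf n,m}\}_n\}_m$ be a sequence of d-level matrix-sequences. Suppose that:
1. $\{A_{\mathbf n}\}_n\sim_{\mathrm{GLT}}\kappa$;
2. $\{B_{\mathbf n,m}\}_n\sim_{\mathrm{GLT}}\kappa_m$ for every $m$.
Then $\{B_{\mathbf n,m}\}_n\xrightarrow{\text{a.c.s.}}\{A_{\mathbf n}\}_n$ if and only if $\kappa_m\to\kappa$ in measure.

Proof Assume that 1–2 hold. Then $\{A_{\mathbf n}-B_{\mathbf n,m}\}_n\sim_{\mathrm{GLT}}\kappa-\kappa_m$ (by Lemma 5.1) and $\{A_{\mathbf n}-B_{\mathbf n,m}\}_n\sim_\sigma\kappa-\kappa_m$ (by Theorem 5.1). Therefore, if $\kappa_m\to\kappa$ in measure then $\{B_{\mathbf n,m}\}_n\xrightarrow{\text{a.c.s.}}\{A_{\mathbf n}\}_n$ by Theorem 2.11. Conversely, if $\{B_{\mathbf n,m}\}_n\xrightarrow{\text{a.c.s.}}\{A_{\mathbf n}\}_n$ then $\kappa_m\to\kappa$ in measure by Proposition 2.5.

Corollary 5.1 Let $\{A_{\mathbf n}\}_n\sim_{\mathrm{GLT}}\kappa$. Then, for all functions $a_{i,m}$, $f_{i,m}$, $i=1,\dots,N_m$, such that
• $a_{i,m}:[0,1]^d\to\mathbb C$ is Riemann-integrable and $f_{i,m}\in L^1([-\pi,\pi]^d)$,
• $\sum_{i=1}^{N_m}a_{i,m}\otimes f_{i,m}\to\kappa$ in measure,
we have $\bigl\{\sum_{i=1}^{N_m}D_{\mathbf n}(a_{i,m})T_{\mathbf n}(f_{i,m})\bigr\}_n\xrightarrow{\text{a.c.s.}}\{A_{\mathbf n}\}_n$. In particular, $\{A_{\mathbf n}\}_n$ admits an a.c.s. of the form
\[
\Bigl\{\Bigl\{\sum_{\mathbf j=-\mathbf N_m}^{\mathbf N_m}D_{\mathbf n}(a^{(m)}_{\mathbf j})\,T_{\mathbf n}(e^{i\mathbf j\cdot\boldsymbol\theta})\Bigr\}_n\Bigr\}_m,
\tag{5.4}
\]
where
\[
a^{(m)}_{\mathbf j}\in C^\infty([0,1]^d),\qquad \mathbf N_m\in\mathbb N^d,\qquad \sum_{\mathbf j=-\mathbf N_m}^{\mathbf N_m}a^{(m)}_{\mathbf j}(\mathbf x)\,e^{i\mathbf j\cdot\boldsymbol\theta}\to\kappa(\mathbf x,\boldsymbol\theta)\ \text{a.e.}
\]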
Proof Let $a_{i,m}$, $f_{i,m}$, $i=1,\dots,N_m$, be functions with the properties specified in the statement of the corollary. Then
\[
\Bigl\{\sum_{i=1}^{N_m}D_{\mathbf n}(a_{i,m})T_{\mathbf n}(f_{i,m})\Bigr\}_n\sim_{\mathrm{GLT}}\sum_{i=1}^{N_m}a_{i,m}\otimes f_{i,m}
\]
by Theorem 4.10 and Lemma 5.1. Therefore, the convergence
\[
\Bigl\{\sum_{i=1}^{N_m}D_{\mathbf n}(a_{i,m})T_{\mathbf n}(f_{i,m})\Bigr\}_n\xrightarrow{\text{a.c.s.}}\{A_{\mathbf n}\}_n
\]
follows from Theorem 5.5 applied with
\[
B_{\mathbf n,m}=\sum_{i=1}^{N_m}D_{\mathbf n}(a_{i,m})T_{\mathbf n}(f_{i,m}),\qquad \kappa_m=\sum_{i=1}^{N_m}a_{i,m}\otimes f_{i,m}.
\]
To obtain for $\{A_{\mathbf n}\}_n$ an a.c.s. of the form (5.4), simply use the result of this corollary in combination with Lemma 2.4.

Remark 5.4 (topological density in the space of multilevel GLT sequences) With the notation of Remark 5.3, we recall that the set of d-level GLT pairs
\[
\mathcal G=\bigl\{(\{A_{\mathbf n}\}_n,\kappa)\in\mathcal E\times\mathcal M_d : \{A_{\mathbf n}\}_n\sim_{\mathrm{GLT}}\kappa\bigr\}
\]
is closed in $\mathcal E\times\mathcal M_d$ equipped with the product (pseudometrizable) topology $\tau_{\text{a.c.s.}}\times\tau_{\text{measure}}$. Consider the subset of $\mathcal G$ consisting of the d-level GLT pairs of the form
\[
\Bigl(\Bigl\{\sum_{i=1}^{N_m}D_{\mathbf n}(a_{i,m})T_{\mathbf n}(f_{i,m})\Bigr\}_n,\ \sum_{i=1}^{N_m}a_{i,m}\otimes f_{i,m}\Bigr),
\]
where $a_{i,m}\in C^\infty([0,1]^d)$, $f_{i,m}$ is a d-variate trigonometric monomial in $\{e^{i\mathbf j\cdot\boldsymbol\theta}:\mathbf j\in\mathbb Z^d\}$ for all $i=1,\dots,N_m$, and $N_m\in\mathbb N^d$. Then, according to Corollary 5.1, this subset is dense in $\mathcal G$, i.e., its closure in $\mathcal E\times\mathcal M_d$ with respect to the topology $\tau_{\text{a.c.s.}}\times\tau_{\text{measure}}$ coincides precisely with $\mathcal G$.
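The dense family of Remark 5.4 is easy to generate explicitly. The Python sketch below builds a finite sum $\sum_k D_n(a_k)T_n(e^{ik\theta})$ in the unilevel case $d=1$, under the usual conventions that $T_n(e^{ik\theta})$ has ones exactly where (row − column) equals $k$ and $D_n(a)=\mathrm{diag}_{i=1,\dots,n}\,a(i/n)$. The symbol $\kappa(x,\theta)=x(2-2\cos\theta)$ and all function names are hypothetical choices made only for this illustration.

```python
def shift(n, k):
    """T_n(e^{i k theta}): ones where row - column == k, zeros elsewhere."""
    return [[1.0 if r - s == k else 0.0 for s in range(n)] for r in range(n)]


def scale_rows(diag, matrix):
    """Left-multiply a matrix by the diagonal matrix with the given diagonal."""
    return [[d * v for v in row] for d, row in zip(diag, matrix)]


def glt_sum(n, terms):
    """sum_k D_n(a_k) T_n(e^{i k theta}) for terms = {k: a_k}."""
    total = [[0.0] * n for _ in range(n)]
    for k, a_k in terms.items():
        samples = [a_k(i / n) for i in range(1, n + 1)]
        part = scale_rows(samples, shift(n, k))
        total = [[t + p for t, p in zip(t_row, p_row)] for t_row, p_row in zip(total, part)]
    return total


# kappa(x, theta) = x * (2 - 2cos(theta)) = 2x - x e^{i theta} - x e^{-i theta}
terms = {0: lambda x: 2.0 * x, 1: lambda x: -x, -1: lambda x: -x}
S = glt_sum(4, terms)

for row in S:
    print(row)
```

In this example the sum coincides with $D_n(a)T_n(f)$ for $a(x)=x$ and $f(\theta)=2-2\cos\theta$, since $f$ is itself a sum of three monomials.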
5.3.1 Characterizations of Multilevel GLT Sequences

The next result is a characterization theorem for multilevel GLT sequences. All the provided characterizations have already been proved before, but it is anyway useful to collect them in a single statement.

Theorem 5.6 Let $\{A_{\mathbf n}\}_n$ be a d-level matrix-sequence and let $\kappa:[0,1]^d\times[-\pi,\pi]^d\to\mathbb C$ be a measurable function. The following conditions are equivalent.
1. $\{A_{\mathbf n}\}_n\sim_{\mathrm{GLT}}\kappa$.
2. For all sequences $\{\kappa_m\}_m$, $\{\{B_{\mathbf n,m}\}_n\}_m$ such that
• $\{B_{\mathbf n,m}\}_n\sim_{\mathrm{GLT}}\kappa_m$ for every $m$,
• $\kappa_m\to\kappa$ in measure,
we have $\{B_{\mathbf n,m}\}_n\xrightarrow{\text{a.c.s.}}\{A_{\mathbf n}\}_n$.
3. There exist functions $a_{i,m}$, $f_{i,m}$, $i=1,\dots,N_m$, such that
• $a_{i,m}:[0,1]^d\to\mathbb C$ belongs to $C^\infty([0,1]^d)$ and $f_{i,m}$ is a d-variate trigonometric monomial belonging to $\{e^{i\mathbf j\cdot\boldsymbol\theta}:\mathbf j\in\mathbb Z^d\}$,
• $\sum_{i=1}^{N_m}a_{i,m}\otimes f_{i,m}\to\kappa$ a.e.,
• $\bigl\{\sum_{i=1}^{N_m}D_{\mathbf n}(a_{i,m})T_{\mathbf n}(f_{i,m})\bigr\}_n\xrightarrow{\text{a.c.s.}}\{A_{\mathbf n}\}_n$.
4. There exist sequences $\{\kappa_m\}_m$, $\{\{B_{\mathbf n,m}\}_n\}_m$ such that
• $\{B_{\mathbf n,m}\}_n\sim_{\mathrm{GLT}}\kappa_m$ for every $m$,
• $\kappa_m\to\kappa$ in measure,
• $\{B_{\mathbf n,m}\}_n\xrightarrow{\text{a.c.s.}}\{A_{\mathbf n}\}_n$.

Proof The implication 1 ⟹ 2 follows from Theorem 5.5. The implication 2 ⟹ 3 follows from the observation that, on the one hand, we can find functions $a_{i,m}$, $f_{i,m}$, $i=1,\dots,N_m$, with the first two properties specified in item 3 (by Lemma 2.4), and, on the other hand, $\bigl\{\sum_{i=1}^{N_m}D_{\mathbf n}(a_{i,m})T_{\mathbf n}(f_{i,m})\bigr\}_n\sim_{\mathrm{GLT}}\sum_{i=1}^{N_m}a_{i,m}\otimes f_{i,m}$ (by Theorem 4.10 and Lemma 5.1). The implication 3 ⟹ 4 is obvious (it suffices to take $B_{\mathbf n,m}=\sum_{i=1}^{N_m}D_{\mathbf n}(a_{i,m})T_{\mathbf n}(f_{i,m})$ and $\kappa_m=\sum_{i=1}^{N_m}a_{i,m}\otimes f_{i,m}$). Finally, the implication 4 ⟹ 1 is Theorem 5.4.
5.3.2 Sequences of Multilevel Diagonal Sampling Matrices

We have seen in Sect. 4.3 the three most important examples of multilevel GLT sequences, namely multilevel zero-distributed sequences, multilevel Toeplitz sequences and sequences of multilevel diagonal sampling matrices. Concerning the latter kind of sequences, we proved that $\{D_{\mathbf n}(a)\}_n\sim_{\mathrm{GLT}}a\otimes1$ whenever $a$ is Riemann-integrable. From a mathematical viewpoint, however, the GLT relation $\{D_{\mathbf n}(a)\}_n\sim_{\mathrm{GLT}}a\otimes1$ makes sense for all measurable functions $a:[0,1]^d\to\mathbb C$, and it is therefore natural to ask whether we can drop the Riemann-integrability assumption. As an application of Theorem 5.4, in Theorem 5.7 we show that the relation $\{D_{\mathbf n}(a)\}_n\sim_{\mathrm{GLT}}a\otimes1$ holds for all functions $a:[0,1]^d\to\mathbb C$ that are continuous a.e. in $[0,1]^d$. Since a function $a:[0,1]^d\to\mathbb C$ is Riemann-integrable if and only if $a$ is bounded and continuous a.e. (see Sect. 2.3), Theorem 5.7 is an extension of Theorem 4.5. More precisely, in Theorem 5.7 we are dropping the boundedness assumption.

Theorem 5.7 If $a:[0,1]^d\to\mathbb C$ is continuous a.e. then $\{D_{\mathbf n}(a)\}_n\sim_{\mathrm{GLT}}a\otimes1$ for any sequence $\{\mathbf n=\mathbf n(n)\}_n\subseteq\mathbb N^d$ such that $\mathbf n\to\infty$ as $n\to\infty$.

Proof For an arbitrary a.e. continuous function $a:[0,1]^d\to\mathbb C$, we can write
\[
a=\alpha_+-\alpha_-+i\beta_+-i\beta_-,
\]
where $\alpha_\pm,\beta_\pm:[0,1]^d\to\mathbb R$ are nonnegative a.e. continuous functions; simply take
\[
\alpha_+=\max(\Re(a),0),\qquad \alpha_-=-\min(\Re(a),0),\qquad \beta_+=\max(\Im(a),0),\qquad \beta_-=-\min(\Im(a),0).
\]
Hence, due to Lemma 5.1 and the linearity of $D_{\mathbf n}(a)$ with respect to its argument $a$, it suffices to prove the relation $\{D_{\mathbf n}(a)\}_n\sim_{\mathrm{GLT}}a\otimes1$ in the case where $a:[0,1]^d\to\mathbb R$ is a nonnegative a.e. continuous function.

Let $a:[0,1]^d\to[0,\infty)$ be a nonnegative a.e. continuous function. Denote by $a_m$ the truncation of $a$ at level $m$, i.e.,
\[
a_m(\mathbf x)=\begin{cases}a(\mathbf x), & \text{if } a(\mathbf x)\le m,\\ m, & \text{if } a(\mathbf x)>m.\end{cases}
\]
Since $a_m$ is bounded and continuous a.e., $a_m$ is Riemann-integrable, hence $\{D_{\mathbf n}(a_m)\}_n\sim_{\mathrm{GLT}}a_m\otimes1$ by Theorem 4.5. Moreover, it is clear that $a_m\to a$ pointwise, so $a_m\to a$ in measure. We show that
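\[
\{D_{\mathbf n}(a_m)\}_n\xrightarrow{\text{a.c.s.}}\{D_{\mathbf n}(a)\}_n,
\]
after which the application of Theorem 5.4 concludes the proof. To show that $\{D_{\mathbf n}(a_m)\}_n\xrightarrow{\text{a.c.s.}}\{D_{\mathbf n}(a)\}_n$, we prove that for every $m$ there exists $n_m$ such that, for $n\ge n_m$,
\[
\operatorname{rank}(D_{\mathbf n}(a)-D_{\mathbf n}(a_m))=\operatorname{rank}(D_{\mathbf n}(a-a_m))\le\varepsilon(m)\,N(\mathbf n),
\tag{5.5}
\]
where $\varepsilon(m)\to0$ as $m\to\infty$. For any integer $k\ge1$, consider the partition of $(0,1]^d$ given by
\[
I_{\mathbf i,k}=\Bigl(\frac{\mathbf i-\mathbf 1}{2^k},\frac{\mathbf i}{2^k}\Bigr]=\Bigl(\frac{i_1-1}{2^k},\frac{i_1}{2^k}\Bigr]\times\cdots\times\Bigl(\frac{i_d-1}{2^k},\frac{i_d}{2^k}\Bigr],\qquad \mathbf i=\mathbf 1,\dots,2^k\mathbf 1,
\tag{5.6}
\]
and let
\[
a_{m,k}=\sum_{\mathbf i=\mathbf 1}^{2^k\mathbf 1}\Bigl(\sup_{\mathbf y\in I_{\mathbf i,k}}a_m(\mathbf y)\Bigr)\chi_{I_{\mathbf i,k}}.
\]
For every $\mathbf x\in(0,1]^d$ and every $m,k$, we have
\[
0\le a_m(\mathbf x)\le a_{m,k+1}(\mathbf x)\le a_{m,k}(\mathbf x)\le\sup_{\mathbf y\in(0,1]^d}a_m(\mathbf y)\le m.
\tag{5.7}
\]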
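Since $a\ge0$ on $[0,1]^d$, for all $m,n,k$ the rank of $D_{\mathbf n}(a-a_m)$ satisfies
\[
\begin{aligned}
\operatorname{rank}(D_{\mathbf n}(a-a_m))
&=\#\Bigl\{\mathbf j\in\{\mathbf 1,\dots,\mathbf n\}: a\Bigl(\frac{\mathbf j}{\mathbf n}\Bigr)-a_m\Bigl(\frac{\mathbf j}{\mathbf n}\Bigr)\ne0\Bigr\}\\
&\le\#\Bigl\{\mathbf j\in\{\mathbf 1,\dots,\mathbf n\}: a\Bigl(\frac{\mathbf j}{\mathbf n}\Bigr)\ge m\Bigr\}\\
&=\#\Bigl\{\mathbf j\in\{\mathbf 1,\dots,\mathbf n\}: a_m\Bigl(\frac{\mathbf j}{\mathbf n}\Bigr)=m\Bigr\}\\
&\le\#\Bigl\{\mathbf j\in\{\mathbf 1,\dots,\mathbf n\}: a_{m,k}\Bigl(\frac{\mathbf j}{\mathbf n}\Bigr)=m\Bigr\}\\
&=\#\Bigl\{\mathbf j\in\{\mathbf 1,\dots,\mathbf n\}: \frac{\mathbf j}{\mathbf n}\in\{a_{m,k}=m\}\Bigr\}\\
&=\#\Bigl\{\mathbf x\in\Bigl\{\frac{\mathbf j}{\mathbf n}:\mathbf j=\mathbf 1,\dots,\mathbf n\Bigr\}: \mathbf x\in\{a_{m,k}=m\}\Bigr\}\\
&=\#\Bigl(\Bigl\{\frac{\mathbf j}{\mathbf n}:\mathbf j=\mathbf 1,\dots,\mathbf n\Bigr\}\cap\{a_{m,k}=m\}\Bigr)\\
&\le\mu_d\{a_{m,k}=m\}\,N(\mathbf n+2^k\mathbf 1),
\end{aligned}
\]
where the last inequality is justified as follows: the set $\{a_{m,k}=m\}$ is a finite union of squares from (5.6), say $r_{m,k}$ squares, and each of these squares cannot contain more than $(n_1/2^k+1)(n_2/2^k+1)\cdots(n_d/2^k+1)=N(\mathbf n+2^k\mathbf 1)/2^{dk}$ grid points of $\{\mathbf j/\mathbf n:\mathbf j=\mathbf 1,\dots,\mathbf n\}$, implying that $\{a_{m,k}=m\}$ contains at most $r_{m,k}\,N(\mathbf n+2^k\mathbf 1)/2^{dk}=\mu_d\{a_{m,k}=m\}\,N(\mathbf n+2^k\mathbf 1)$ grid points. By Lemma 2.5, $a_{m,k}\to a_m$ a.e. in $[0,1]^d$ because $a_m$ is Riemann-integrable. By (5.7), the convergence of $a_{m,k}$ to $a_m$ is monotone. Thus,
\[
\lim_{k\to\infty}\mu_d\{a_{m,k}=m\}=\mu_d\Bigl(\bigcap_{k=1}^\infty\{a_{m,k}=m\}\Bigr)=\mu_d\{a_m=m\}=\mu_d\{a\ge m\}.
\]
For every $m$ we choose $k_m$ such that $\mu_d\{a_{m,k_m}=m\}\le\mu_d\{a\ge m\}+1/m$. Then we choose $n_m$ such that, for $n\ge n_m$, the inequality $N(\mathbf n+2^{k_m}\mathbf 1)/N(\mathbf n)\le2$ holds (this choice is possible because $N(\mathbf n+2^{k_m}\mathbf 1)/N(\mathbf n)\to1$ as $n\to\infty$). With these choices, we see that the inequality (5.5) is satisfied with $\varepsilon(m)=2(\mu_d\{a\ge m\}+1/m)$, which converges to 0 as $m\to\infty$.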
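The rank estimate (5.5) can be observed numerically. The following Python sketch takes $d=1$ and the unbounded, a.e. continuous function $a(x)=1/\sqrt x$ on $(0,1]$, for which $\mu_1\{a\ge m\}=1/m^2$; it counts the nonzero diagonal entries of $D_n(a)-D_n(a_m)$, i.e., the rank of this diagonal matrix. The choice of $a$, the value of $n$ and the function names are our own assumptions for illustration.

```python
import math


def a(x):
    """Unbounded but a.e. continuous on [0, 1]; only sampled at x = j/n > 0."""
    return 1.0 / math.sqrt(x)


def truncate(func, m):
    """Truncation of func at level m: min(func(x), m)."""
    return lambda x: min(func(x), m)


def rank_fraction(n, m):
    """rank(D_n(a) - D_n(a_m)) / n, i.e. the fraction of nonzero diagonal entries."""
    a_m = truncate(a, m)
    nonzero = sum(1 for j in range(1, n + 1) if a(j / n) - a_m(j / n) != 0.0)
    return nonzero / n


n = 10000
fractions = {m: rank_fraction(n, m) for m in (2, 5, 10)}
for m, fraction in fractions.items():
    # Compare with mu{a >= m} = 1/m^2 and with eps(m) = 2 * (mu{a >= m} + 1/m).
    print(m, fraction, 1.0 / m**2, 2.0 * (1.0 / m**2 + 1.0 / m))
```

The observed fractions stay close to $\mu_1\{a\ge m\}$ and well below $\varepsilon(m)$, so the difference $D_n(a)-D_n(a_m)$ is a small-rank correction in the sense required by the a.c.s. notion.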
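5.4 The Multilevel GLT Algebra

We investigate in this section the important algebraic properties possessed by multilevel GLT sequences, which give rise to the so-called (multilevel) GLT algebra. These properties establish that, if $\{A_{\mathbf{n}}^{(1)}\}_n, \ldots, \{A_{\mathbf{n}}^{(r)}\}_n$ are given multilevel GLT sequences with symbols $\kappa_1, \ldots, \kappa_r$, respectively, and if $A_{\mathbf{n}} = \mathrm{ops}(A_{\mathbf{n}}^{(1)}, \ldots, A_{\mathbf{n}}^{(r)})$ is obtained from $A_{\mathbf{n}}^{(1)}, \ldots, A_{\mathbf{n}}^{(r)}$ by means of certain operations “ops”, then $\{A_{\mathbf{n}}\}_n$ is a multilevel GLT sequence with symbol $\kappa = \mathrm{ops}(\kappa_1, \ldots, \kappa_r)$.

Theorem 5.8 Let $\{A_{\mathbf{n}}\}_n \sim_{\mathrm{GLT}} \kappa$ and $\{B_{\mathbf{n}}\}_n \sim_{\mathrm{GLT}} \xi$. Then
1. $\{A_{\mathbf{n}}^*\}_n \sim_{\mathrm{GLT}} \overline{\kappa}$,
2. $\{\alpha A_{\mathbf{n}} + \beta B_{\mathbf{n}}\}_n \sim_{\mathrm{GLT}} \alpha\kappa + \beta\xi$ for all $\alpha, \beta \in \mathbb{C}$,
3. $\{A_{\mathbf{n}} B_{\mathbf{n}}\}_n \sim_{\mathrm{GLT}} \kappa\xi$.

Proof The first two items have already been settled before (see Lemma 5.1). We prove the third item. By Proposition 5.1, there exist LT sequences

$$\{A_{\mathbf{n}}^{(i,m)}\}_n \sim_{\mathrm{LT}} a_{i,m} \otimes f_{i,m}, \quad i = 1, \ldots, N_m, \qquad \{B_{\mathbf{n}}^{(j,m)}\}_n \sim_{\mathrm{LT}} b_{j,m} \otimes g_{j,m}, \quad j = 1, \ldots, M_m,$$

such that
• $\sum_{i=1}^{N_m} a_{i,m} \otimes f_{i,m} \to \kappa$ in measure and $\sum_{j=1}^{M_m} b_{j,m} \otimes g_{j,m} \to \xi$ in measure,
• $\bigl\{\sum_{i=1}^{N_m} A_{\mathbf{n}}^{(i,m)}\bigr\}_n \xrightarrow{\text{a.c.s.}} \{A_{\mathbf{n}}\}_n$ and $\bigl\{\sum_{j=1}^{M_m} B_{\mathbf{n}}^{(j,m)}\bigr\}_n \xrightarrow{\text{a.c.s.}} \{B_{\mathbf{n}}\}_n$.

By Theorem 5.6, we may assume that the functions $f_{i,m}, g_{j,m}$ belong to $L^\infty([-\pi,\pi]^d)$ (actually, we might assume much more than this! We might assume that $f_{i,m}, g_{j,m}$ are $d$-variate trigonometric monomials, that $a_{i,m}, b_{j,m} \in C^\infty([0,1]^d)$, that $\sum_{i=1}^{N_m} a_{i,m} \otimes f_{i,m} \to \kappa$ a.e. and $\sum_{j=1}^{M_m} b_{j,m} \otimes g_{j,m} \to \xi$ a.e., and that $\{A_{\mathbf{n}}^{(i,m)}\}_n = \{D_{\mathbf{n}}(a_{i,m}) T_{\mathbf{n}}(f_{i,m})\}_n$ and $\{B_{\mathbf{n}}^{(j,m)}\}_n = \{D_{\mathbf{n}}(b_{j,m}) T_{\mathbf{n}}(g_{j,m})\}_n$). By Theorem 5.1 and Proposition 2.2, any multilevel GLT sequence is s.u., so in particular $\{A_{\mathbf{n}}\}_n$ and $\{B_{\mathbf{n}}\}_n$ are s.u. Thus, by Theorem 2.9,

$$\Bigl\{\Bigl(\sum_{i=1}^{N_m} A_{\mathbf{n}}^{(i,m)}\Bigr)\Bigl(\sum_{j=1}^{M_m} B_{\mathbf{n}}^{(j,m)}\Bigr)\Bigr\}_n = \Bigl\{\sum_{i=1}^{N_m} \sum_{j=1}^{M_m} A_{\mathbf{n}}^{(i,m)} B_{\mathbf{n}}^{(j,m)}\Bigr\}_n \xrightarrow{\text{a.c.s.}} \{A_{\mathbf{n}} B_{\mathbf{n}}\}_n.$$

Since $f_{i,m}, g_{j,m} \in L^\infty([-\pi,\pi]^d)$, by Theorem 4.9 we have

$$\{A_{\mathbf{n}}^{(i,m)} B_{\mathbf{n}}^{(j,m)}\}_n \sim_{\mathrm{LT}} a_{i,m} b_{j,m} \otimes f_{i,m} g_{j,m}, \qquad i = 1, \ldots, N_m, \quad j = 1, \ldots, M_m.$$

Finally,

$$\sum_{i=1}^{N_m} \sum_{j=1}^{M_m} a_{i,m} b_{j,m} \otimes f_{i,m} g_{j,m} = \Bigl(\sum_{i=1}^{N_m} a_{i,m} \otimes f_{i,m}\Bigr)\Bigl(\sum_{j=1}^{M_m} b_{j,m} \otimes g_{j,m}\Bigr) \to \kappa\xi$$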
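in measure by [22, Lemma 2.3]. By Proposition 5.1 we conclude that $\{A_{\mathbf{n}} B_{\mathbf{n}}\}_n \sim_{\mathrm{GLT}} \kappa\xi$.

Corollary 5.2 Let $r, q_1, \ldots, q_r \in \mathbb{N}$, $\alpha_1, \ldots, \alpha_r \in \mathbb{C}$, and let $\{A_{\mathbf{n}}^{(i,j)}\}_n \sim_{\mathrm{GLT}} \kappa_{ij}$ for $i = 1, \ldots, r$ and $j = 1, \ldots, q_i$. Then

$$\Bigl\{\sum_{i=1}^{r} \alpha_i \prod_{j=1}^{q_i} A_{\mathbf{n}}^{(i,j)}\Bigr\}_n \sim_{\mathrm{GLT}} \sum_{i=1}^{r} \alpha_i \prod_{j=1}^{q_i} \kappa_{ij}.$$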
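To see how Corollary 5.2 is used in practice, the following worked computation may help. It is only an illustration: the choice d = 1 and the functions a and f are assumptions made for this example, and it relies on the facts, recalled in Remark 5.5, that Toeplitz sequences and sequences of diagonal sampling matrices generated by a.e. continuous functions are GLT sequences with symbols f(θ) and a(x), respectively.

```latex
% Assumed data (illustrative only): d = 1, a(x) = 1 + x, f(\theta) = 2 - 2\cos\theta.
A_n = D_n(a)\,T_n(f) + T_n(f)\,D_n(a) + 3\,D_n(a)^2 .
% Building blocks:
%   \{D_n(a)\}_n \sim_{\mathrm{GLT}} a(x), \qquad \{T_n(f)\}_n \sim_{\mathrm{GLT}} f(\theta).
% Corollary 5.2 with r = 3, q_1 = q_2 = q_3 = 2, (\alpha_1, \alpha_2, \alpha_3) = (1, 1, 3):
\{A_n\}_n \sim_{\mathrm{GLT}} a(x) f(\theta) + f(\theta) a(x) + 3\,a(x)^2
  = (1 + x)\bigl(2\,(2 - 2\cos\theta) + 3\,(1 + x)\bigr).
```

Note that the matrices D_n(a) and T_n(f) do not commute in general, whereas their symbols do; the symbol only records the asymptotic behaviour of the sequence.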
Remark 5.5 (the multilevel GLT algebra) Theorem 5.8 is enough to conclude that the set of multilevel GLT sequences is a *-algebra over the complex field $\mathbb{C}$. More precisely, fix a sequence of $d$-indices $\{\mathbf{n} = \mathbf{n}(n)\}_n \subseteq \mathbb{N}^d$ such that $\mathbf{n} \to \infty$ as $n \to \infty$, and let

$$\mathscr{E} = \bigl\{\{A_{\mathbf{n}}\}_n : A_{\mathbf{n}} \in \mathbb{C}^{N(\mathbf{n}) \times N(\mathbf{n})} \text{ for every } n\bigr\}, \qquad \mathfrak{M}_d = \{\kappa : [0,1]^d \times [-\pi,\pi]^d \to \mathbb{C} : \kappa \text{ is measurable}\}.$$

The space $\mathscr{E}$ is a *-algebra over $\mathbb{C}$ with respect to the natural operations of conjugate transposition, addition, scalar-multiplication and product of matrix-sequences (see (2.44)). By Theorem 5.8, the set of $d$-level GLT sequences

$$\mathscr{G} = \bigl\{\{A_{\mathbf{n}}\}_n : \{A_{\mathbf{n}}\}_n \sim_{\mathrm{GLT}} \kappa \text{ for some } \kappa \in \mathfrak{M}_d\bigr\}$$

is a *-subalgebra of $\mathscr{E}$, which is referred to as the ($d$-level) GLT algebra. We note that, by Theorems 4.4, 4.6 and 5.7, the $d$-level GLT algebra contains the algebra generated by $d$-level zero-distributed sequences, $d$-level Toeplitz sequences, and sequences of $d$-level diagonal sampling matrices associated with a.e. continuous functions. Thus, if

$$\mathscr{B} = \bigl\{\{T_{\mathbf{n}}(f)\}_n : f \in L^1([-\pi,\pi]^d)\bigr\} \cup \bigl\{\{D_{\mathbf{n}}(a)\}_n : a : [0,1]^d \to \mathbb{C} \text{ is continuous a.e.}\bigr\} \cup \bigl\{\{Z_{\mathbf{n}}\}_n : \{Z_{\mathbf{n}}\}_n \sim_\sigma 0\bigr\}$$

and $\mathrm{Algebra}(\mathscr{B})$ denotes the subalgebra of $\mathscr{E}$ generated by $\mathscr{B}$, i.e., the smallest subalgebra of $\mathscr{E}$ containing $\mathscr{B}$, then $\mathrm{Algebra}(\mathscr{B}) \subseteq \mathscr{G}$. We also note that $\mathfrak{M}_d$ is a *-algebra over $\mathbb{C}$ with respect to the natural operations of complex conjugation, addition, scalar-multiplication and product of functions. Since $\mathscr{E} \times \mathfrak{M}_d$ is the product of two *-algebras over $\mathbb{C}$, it is itself a *-algebra over $\mathbb{C}$ with respect to the natural pointwise operations
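$$\begin{aligned}
(\{A_{\mathbf{n}}\}_n, \kappa)^* &= (\{A_{\mathbf{n}}^*\}_n, \overline{\kappa}), \\
(\{A_{\mathbf{n}}\}_n, \kappa) + (\{B_{\mathbf{n}}\}_n, \xi) &= (\{A_{\mathbf{n}} + B_{\mathbf{n}}\}_n, \kappa + \xi), \\
\alpha(\{A_{\mathbf{n}}\}_n, \kappa) &= (\{\alpha A_{\mathbf{n}}\}_n, \alpha\kappa), \\
(\{A_{\mathbf{n}}\}_n, \kappa)(\{B_{\mathbf{n}}\}_n, \xi) &= (\{A_{\mathbf{n}} B_{\mathbf{n}}\}_n, \kappa\xi).
\end{aligned} \tag{5.8}$$

By Theorem 5.8, the set of $d$-level GLT pairs

$$\mathfrak{G} = \bigl\{(\{A_{\mathbf{n}}\}_n, \kappa) \in \mathscr{E} \times \mathfrak{M}_d : \{A_{\mathbf{n}}\}_n \sim_{\mathrm{GLT}} \kappa\bigr\}$$

is a *-subalgebra of $\mathscr{E} \times \mathfrak{M}_d$.

We are going to see in Theorems 5.9 and 5.10 that the multilevel GLT algebra enjoys other nice properties, in addition to those of Theorem 5.8, which make it look like a “big container”, closed under any type of “regular” operation.

Theorem 5.9 If $\{A_{\mathbf{n}}\}_n \sim_{\mathrm{GLT}} \kappa$ and the $A_{\mathbf{n}}$ are Hermitian then $\{f(A_{\mathbf{n}})\}_n \sim_{\mathrm{GLT}} f(\kappa)$ for any continuous function $f : \mathbb{C} \to \mathbb{C}$.

Proof Since every $A_{\mathbf{n}}$ is Hermitian and $\kappa \in \mathbb{R}$ a.e. (by Proposition 5.3), it suffices to prove the theorem for real continuous functions $f : \mathbb{R} \to \mathbb{R}$. Indeed, suppose we have proved the theorem for functions of this kind and let $f : \mathbb{C} \to \mathbb{C}$ be any continuous complex function. Denote by $\alpha, \beta : \mathbb{R} \to \mathbb{R}$ the real and imaginary parts of the restriction of $f$ to $\mathbb{R}$. Then, $\alpha, \beta$ are continuous functions such that $f(x) = \alpha(x) + \mathrm{i}\beta(x)$ for all $x \in \mathbb{R}$, and since the eigenvalues of $A_{\mathbf{n}}$ are real we have $f(A_{\mathbf{n}}) = \alpha(A_{\mathbf{n}}) + \mathrm{i}\beta(A_{\mathbf{n}})$. In view of the relations $\{\alpha(A_{\mathbf{n}})\}_n \sim_{\mathrm{GLT}} \alpha(\kappa)$ and $\{\beta(A_{\mathbf{n}})\}_n \sim_{\mathrm{GLT}} \beta(\kappa)$, Theorem 5.8 yields $\{f(A_{\mathbf{n}})\}_n \sim_{\mathrm{GLT}} \alpha(\kappa) + \mathrm{i}\beta(\kappa) = f(\kappa)$.

Let $f : \mathbb{R} \to \mathbb{R}$ be a real continuous function. For each $M > 0$, let $\{p_{m,M}\}_m$ be a sequence of polynomials that converges uniformly to $f$ over $[-M, M]$:

$$\lim_{m \to \infty} \|f - p_{m,M}\|_{\infty,[-M,M]} = 0.$$

Note that such a sequence exists by the Weierstrass theorem; see, e.g., [31, Theorem 7.26]. For every $M > 0$ and every $m, n$, write

$$f(A_{\mathbf{n}}) = p_{m,M}(A_{\mathbf{n}}) + f(A_{\mathbf{n}}) - p_{m,M}(A_{\mathbf{n}}). \tag{5.9}$$

Since any multilevel GLT sequence is s.u. (by Theorem 5.1 and Proposition 2.2), the sequence $\{A_{\mathbf{n}}\}_n$ is s.u. Hence, by Remark 2.4, for all $M > 0$ there exists $n_M$ such that, for $n \ge n_M$,

$$A_{\mathbf{n}} = \hat{A}_{\mathbf{n},M} + \tilde{A}_{\mathbf{n},M}, \qquad \operatorname{rank}(\hat{A}_{\mathbf{n},M}) \le r(M)\,N(\mathbf{n}), \qquad \|\tilde{A}_{\mathbf{n},M}\| \le M,$$

where $\lim_{M \to \infty} r(M) = 0$, the matrices $\hat{A}_{\mathbf{n},M}$ and $\tilde{A}_{\mathbf{n},M}$ are Hermitian, and for all functions $g : \mathbb{R} \to \mathbb{R}$ we have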
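$$g(\hat{A}_{\mathbf{n},M} + \tilde{A}_{\mathbf{n},M}) = g(\hat{A}_{\mathbf{n},M}) + g(\tilde{A}_{\mathbf{n},M}).$$

Thus, for every $M > 0$, every $m$ and every $n \ge n_M$ we can write

$$\begin{aligned}
f(A_{\mathbf{n}}) &= p_{m,M}(A_{\mathbf{n}}) + f(\hat{A}_{\mathbf{n},M}) + f(\tilde{A}_{\mathbf{n},M}) - p_{m,M}(\hat{A}_{\mathbf{n},M}) - p_{m,M}(\tilde{A}_{\mathbf{n},M}) \\
&= p_{m,M}(A_{\mathbf{n}}) + (f - p_{m,M})(\hat{A}_{\mathbf{n},M}) + (f - p_{m,M})(\tilde{A}_{\mathbf{n},M}).
\end{aligned} \tag{5.10}$$

The matrix $(f - p_{m,M})(\hat{A}_{\mathbf{n},M})$ can be written as the sum of two terms, namely

$$(f - p_{m,M})(\hat{A}_{\mathbf{n},M}) = R_{\mathbf{n},m,M} + N'_{\mathbf{n},m,M},$$

where

$$\begin{aligned}
R_{\mathbf{n},m,M} &= (f - p_{m,M})(\hat{A}_{\mathbf{n},M}) \cdot \chi_{S^c}\bigl((f - p_{m,M})(\hat{A}_{\mathbf{n},M})\bigr), \\
N'_{\mathbf{n},m,M} &= (f - p_{m,M})(\hat{A}_{\mathbf{n},M}) \cdot \chi_{S}\bigl((f - p_{m,M})(\hat{A}_{\mathbf{n},M})\bigr),
\end{aligned}$$

and $S$ is the singleton $S = \{(f - p_{m,M})(0)\}$. In other words, $R_{\mathbf{n},m,M}$ is the matrix obtained from $(f - p_{m,M})(\hat{A}_{\mathbf{n},M})$ by setting to 0 all the eigenvalues that are equal to $(f - p_{m,M})(0)$, while $N'_{\mathbf{n},m,M}$ is the matrix obtained from $(f - p_{m,M})(\hat{A}_{\mathbf{n},M})$ by setting to 0 all the eigenvalues that are different from $(f - p_{m,M})(0)$. Note that

$$\operatorname{rank}(R_{\mathbf{n},m,M}) \le \operatorname{rank}(\hat{A}_{\mathbf{n},M}) \le r(M)\,N(\mathbf{n}), \qquad \|N'_{\mathbf{n},m,M}\| \le |f(0) - p_{m,M}(0)|.$$

Concerning the matrix $N''_{\mathbf{n},m,M} = (f - p_{m,M})(\tilde{A}_{\mathbf{n},M})$, the inequality $\|\tilde{A}_{\mathbf{n},M}\| \le M$ yields

$$\|N''_{\mathbf{n},m,M}\| \le \|f - p_{m,M}\|_{\infty,[-M,M]}.$$

Let

$$N_{\mathbf{n},m,M} = N'_{\mathbf{n},m,M} + N''_{\mathbf{n},m,M}.$$

By (5.10), for every $M > 0$, every $m$ and every $n \ge n_M$ we have

$$f(A_{\mathbf{n}}) = p_{m,M}(A_{\mathbf{n}}) + R_{\mathbf{n},m,M} + N_{\mathbf{n},m,M},$$

where $\operatorname{rank}(R_{\mathbf{n},m,M}) \le r(M)\,N(\mathbf{n})$ and

$$\|N_{\mathbf{n},m,M}\| \le \|N'_{\mathbf{n},m,M}\| + \|N''_{\mathbf{n},m,M}\| \le 2\,\|f - p_{m,M}\|_{\infty,[-M,M]}.$$

Choose a sequence $\{M_m\}_m$ such that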
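$$M_m \to \infty, \qquad \|f - p_{m,M_m}\|_{\infty,[-M_m,M_m]} \to 0. \tag{5.11}$$

Then, for every $m$ and every $n \ge n_{M_m}$,

$$f(A_{\mathbf{n}}) = p_{m,M_m}(A_{\mathbf{n}}) + R_{\mathbf{n},m,M_m} + N_{\mathbf{n},m,M_m}, \qquad \operatorname{rank}(R_{\mathbf{n},m,M_m}) \le r(M_m)\,N(\mathbf{n}), \qquad \|N_{\mathbf{n},m,M_m}\| \le 2\,\|f - p_{m,M_m}\|_{\infty,[-M_m,M_m]},$$

which implies that

$$\{p_{m,M_m}(A_{\mathbf{n}})\}_n \xrightarrow{\text{a.c.s.}} \{f(A_{\mathbf{n}})\}_n.$$

Moreover, by Theorem 5.8, $\{p_{m,M_m}(A_{\mathbf{n}})\}_n \sim_{\mathrm{GLT}} p_{m,M_m}(\kappa)$. Finally, by (5.11), $p_{m,M_m}(\kappa) \to f(\kappa)$ a.e. All the hypotheses of Theorem 5.4 are satisfied and $\{f(A_{\mathbf{n}})\}_n \sim_{\mathrm{GLT}} f(\kappa)$.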
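A small numerical check of Theorem 5.9 is sketched below. The setting is an assumption chosen for illustration: d = 1, A_n = T_n(2 − 2cos θ) (the tridiagonal matrix with stencil −1, 2, −1, whose eigenvalues are known in closed form) and f = exp. Since f(A_n) is Hermitian, its GLT symbol f(κ) should also describe its eigenvalue distribution, so the average of F over the eigenvalues of f(A_n) should approach the normalized integral of F(f(κ)). The test function `hat` and all function names are hypothetical.

```python
import math

def eigenvalues_laplacian(n):
    """Closed-form eigenvalues of T_n(2 - 2cos(theta)) = tridiag(-1, 2, -1)."""
    return [2 - 2 * math.cos(j * math.pi / (n + 1)) for j in range(1, n + 1)]

def hat(t, width=60.0):
    """A continuous, compactly supported test function F."""
    return max(0.0, 1.0 - abs(t) / width)

def eigenvalue_average(f, n, F=hat):
    """(1/n) * sum_j F(f(lambda_j)); the eigenvalues of f(A_n) are f(lambda_j)."""
    return sum(F(f(lam)) for lam in eigenvalues_laplacian(n)) / n

def symbol_average(f, F=hat, samples=200_000):
    """Midpoint-rule value of (1/(2*pi)) * integral over [-pi, pi] of F(f(kappa(theta)))."""
    h = 2 * math.pi / samples
    total = 0.0
    for s in range(samples):
        theta = -math.pi + (s + 0.5) * h
        total += F(f(2 - 2 * math.cos(theta)))
    return total / samples

if __name__ == "__main__":
    for n in (50, 500, 5000):
        print(n, eigenvalue_average(math.exp, n), symbol_average(math.exp))
```

The two printed columns should approach each other as n grows; any other continuous real f could be substituted for `math.exp`.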
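The last issue we are interested in is whether $\{A_{\mathbf{n}}^{-1}\}_n \sim_{\mathrm{GLT}} \kappa^{-1}$ in the case where $\{A_{\mathbf{n}}\}_n \sim_{\mathrm{GLT}} \kappa$, each $A_{\mathbf{n}}$ is invertible, and $\kappa \ne 0$ a.e. (so that $\kappa^{-1}$ is a well-defined measurable function). More generally, we may ask whether $\{A_{\mathbf{n}}^\dagger\}_n \sim_{\mathrm{GLT}} \kappa^{-1}$ when $\{A_{\mathbf{n}}\}_n \sim_{\mathrm{GLT}} \kappa$ and $\kappa \ne 0$ a.e. The answer to both questions is affirmative, as we are going to see.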
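Theorem 5.10 If $\{A_{\mathbf{n}}\}_n \sim_{\mathrm{GLT}} \kappa$ and $\kappa \ne 0$ a.e. then $\{A_{\mathbf{n}}^\dagger\}_n \sim_{\mathrm{GLT}} \kappa^{-1}$.

Proof Take a sequence of matrix-sequences $\{\{B_{\mathbf{n},m}\}_n\}_m$ such that $\{B_{\mathbf{n},m}\}_n \sim_{\mathrm{GLT}} \xi_m$ and $\xi_m \to \kappa^{-1}$ a.e. Note that a sequence $\{\{B_{\mathbf{n},m}\}_n\}_m$ with these properties exists. Indeed, by Lemma 2.4 there exists a sequence $\{\xi_m\}_m$, with $\xi_m$ of the form

$$\xi_m(\mathbf{x}, \boldsymbol{\theta}) = \sum_{\mathbf{j}=-\mathbf{N}_m}^{\mathbf{N}_m} a_{\mathbf{j}}^{(m)}(\mathbf{x})\, \mathrm{e}^{\mathrm{i}\mathbf{j}\cdot\boldsymbol{\theta}}, \qquad a_{\mathbf{j}}^{(m)} \in C^\infty([0,1]^d), \qquad \mathbf{N}_m \in \mathbb{N}^d,$$

such that $\xi_m \to \kappa^{-1}$ a.e. Therefore, it suffices to take

$$B_{\mathbf{n},m} = \sum_{\mathbf{j}=-\mathbf{N}_m}^{\mathbf{N}_m} D_{\mathbf{n}}(a_{\mathbf{j}}^{(m)})\, T_{\mathbf{n}}(\mathrm{e}^{\mathrm{i}\mathbf{j}\cdot\boldsymbol{\theta}})$$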
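and to observe that $\{B_{\mathbf{n},m}\}_n \sim_{\mathrm{GLT}} \xi_m$ by Theorems 4.10 and 5.8. We show that

$$\{B_{\mathbf{n},m}\}_n \xrightarrow{\text{a.c.s.}} \{A_{\mathbf{n}}^\dagger\}_n,$$

after which the thesis follows from Theorem 5.4. By Theorem 5.8 we have $\{B_{\mathbf{n},m} A_{\mathbf{n}} - I_{N(\mathbf{n})}\}_n \sim_{\mathrm{GLT}} \xi_m\kappa - 1$, which implies that $\{B_{\mathbf{n},m} A_{\mathbf{n}} - I_{N(\mathbf{n})}\}_n \sim_\sigma \xi_m\kappa - 1$ by Theorem 5.1. Moreover, $\xi_m\kappa - 1 \to 0$ a.e. (and hence also in measure). Thus, by Theorem 2.11, for every $m$ there exists $n_m$ such that, for $n \ge n_m$,

$$B_{\mathbf{n},m} A_{\mathbf{n}} = I_{N(\mathbf{n})} + R_{\mathbf{n},m} + N_{\mathbf{n},m}, \qquad \operatorname{rank}(R_{\mathbf{n},m}) \le c(m)\,N(\mathbf{n}), \qquad \|N_{\mathbf{n},m}\| \le \omega(m), \tag{5.12}$$

where

$$\lim_{m \to \infty} c(m) = \lim_{m \to \infty} \omega(m) = 0.$$

Multiplying (5.12) by $A_{\mathbf{n}}^\dagger$, we obtain that, for every $m$ and every $n \ge n_m$,

$$B_{\mathbf{n},m} A_{\mathbf{n}} A_{\mathbf{n}}^\dagger = A_{\mathbf{n}}^\dagger + (R_{\mathbf{n},m} + N_{\mathbf{n},m}) A_{\mathbf{n}}^\dagger. \tag{5.13}$$

Since $\kappa \ne 0$ a.e. by hypothesis, $\{A_{\mathbf{n}}\}_n$ is s.v. by Theorem 5.1 and Proposition 2.4. It follows that $\{A_{\mathbf{n}}^\dagger\}_n$ is s.u. and so, by Proposition 2.1, for all $M > 0$ there is $n_M$ such that, for $n \ge n_M$,

$$A_{\mathbf{n}}^\dagger = \hat{A}_{\mathbf{n},M}^\dagger + \tilde{A}_{\mathbf{n},M}^\dagger, \qquad \operatorname{rank}(\hat{A}_{\mathbf{n},M}^\dagger) \le r(M)\,N(\mathbf{n}), \qquad \|\tilde{A}_{\mathbf{n},M}^\dagger\| \le M,$$

where $\lim_{M \to \infty} r(M) = 0$. Choosing $M_m = (\omega(m))^{-1/2}$, from (5.13) we see that, for every $m$ and every $n \ge \max(n_m, n_{M_m})$,

$$B_{\mathbf{n},m} A_{\mathbf{n}} A_{\mathbf{n}}^\dagger = A_{\mathbf{n}}^\dagger + R'_{\mathbf{n},m} + N'_{\mathbf{n},m}, \qquad \operatorname{rank}(R'_{\mathbf{n},m}) \le (c(m) + r(M_m))\,N(\mathbf{n}), \qquad \|N'_{\mathbf{n},m}\| \le (\omega(m))^{1/2}, \tag{5.14}$$

where $R'_{\mathbf{n},m} = R_{\mathbf{n},m} A_{\mathbf{n}}^\dagger + N_{\mathbf{n},m} \hat{A}_{\mathbf{n},M_m}^\dagger$ and $N'_{\mathbf{n},m} = N_{\mathbf{n},m} \tilde{A}_{\mathbf{n},M_m}^\dagger$. If the matrices $A_{\mathbf{n}}$ were invertible, then $A_{\mathbf{n}}^\dagger = A_{\mathbf{n}}^{-1}$ and (5.14) would imply that $\{B_{\mathbf{n},m}\}_n \xrightarrow{\text{a.c.s.}} \{A_{\mathbf{n}}^\dagger\}_n$. In the general case where the matrices $A_{\mathbf{n}}$ are not invertible, the convergence $\{B_{\mathbf{n},m}\}_n \xrightarrow{\text{a.c.s.}} \{A_{\mathbf{n}}^\dagger\}_n$ will follow again from (5.14) as soon as we have proved the following: for every $m$ there exists $\hat{n}_m$ such that, for $n \ge \hat{n}_m$,

$$A_{\mathbf{n}} A_{\mathbf{n}}^\dagger = I_{N(\mathbf{n})} + S_{\mathbf{n}}, \qquad \operatorname{rank}(S_{\mathbf{n}}) \le \vartheta(m)\,N(\mathbf{n}),$$

where $\lim_{m \to \infty} \vartheta(m) = 0$. This is easy, because, by definition of $A_{\mathbf{n}}^\dagger$, the rank of the matrix $S_{\mathbf{n}} = A_{\mathbf{n}} A_{\mathbf{n}}^\dagger - I_{N(\mathbf{n})}$ is given by

$$\operatorname{rank}(S_{\mathbf{n}}) = \#\{i \in \{1, \ldots, N(\mathbf{n})\} : \sigma_i(A_{\mathbf{n}}) = 0\}.$$

Hence, the previous claim is a direct consequence of the fact that $\{A_{\mathbf{n}}\}_n$ is s.v.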
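The last step of the proof, namely that rank(A A† − I) counts the zero singular values of A, is easy to see on diagonal matrices. The sketch below is an illustration under stated assumptions: d = 1, A_n = D_n(a) with the hypothetical choice a(x) = x − 1/2 (nonzero a.e., but vanishing at one grid point when n is even), and helper names invented for this example.

```python
def diagonal_sampling(a, n):
    """Diagonal of D_n(a) for d = 1: the samples a(j/n), j = 1..n."""
    return [a(j / n) for j in range(1, n + 1)]

def pinv_diagonal(diag):
    """Moore-Penrose pseudoinverse of a diagonal matrix: invert the nonzero
    entries and leave the zero entries at zero."""
    return [1.0 / x if x != 0 else 0.0 for x in diag]

def rank_of_defect(diag, tol=1e-9):
    """rank(A A^+ - I) for a diagonal matrix A, i.e. the number of diagonal
    positions where the product A A^+ differs from 1."""
    pinv = pinv_diagonal(diag)
    return sum(1 for x, y in zip(diag, pinv) if abs(x * y - 1.0) > tol)

if __name__ == "__main__":
    for n in (10, 11, 1000):
        diag = diagonal_sampling(lambda x: x - 0.5, n)
        zeros = sum(1 for x in diag if x == 0)
        print(n, zeros, rank_of_defect(diag))
```

In this example the defect has rank at most 1 for every n, so rank(S_n)/N(n) tends to 0, in line with the sparsely vanishing property used in the proof.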
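5.5 Algebraic-Topological Definitions of Multilevel GLT Sequences

Before concluding the theory of multilevel GLT sequences, it is interesting to talk about a couple of possible abstract definitions of multilevel GLT sequences, which are based on the algebraic-topological results obtained in Sects. 5.3 and 5.4. Fix a sequence $\{\mathbf{n} = \mathbf{n}(n)\}_n \subseteq \mathbb{N}^d$ such that $\mathbf{n} \to \infty$ as $n \to \infty$. Let $\mathscr{E}$ and $\mathfrak{M}_d$ be, respectively, the space of matrix-sequences corresponding to the sequence $\{\mathbf{n} = \mathbf{n}(n)\}_n$ and the space of measurable functions defined on $[0,1]^d \times [-\pi,\pi]^d$, and let $\mathscr{E} \times \mathfrak{M}_d$ be the product space:

$$\begin{aligned}
\mathscr{E} &= \bigl\{\{A_{\mathbf{n}}\}_n : A_{\mathbf{n}} \in \mathbb{C}^{N(\mathbf{n}) \times N(\mathbf{n})} \text{ for every } n\bigr\}, \\
\mathfrak{M}_d &= \{\kappa : [0,1]^d \times [-\pi,\pi]^d \to \mathbb{C} : \kappa \text{ is measurable}\}, \\
\mathscr{E} \times \mathfrak{M}_d &= \bigl\{(\{A_{\mathbf{n}}\}_n, \kappa) : \{A_{\mathbf{n}}\}_n \in \mathscr{E},\ \kappa \in \mathfrak{M}_d\bigr\}.
\end{aligned}$$

As we have seen in Remarks 5.3 and 5.5,
• the space $\mathscr{E}$ is a *-algebra with respect to the natural operations, and it is also a topological (pseudometric) space with respect to the topology $\tau_{\text{a.c.s.}}$ induced by the distance $d_{\text{a.c.s.}}$;
• the space $\mathfrak{M}_d$ is a *-algebra with respect to the natural operations, and it is also a topological (pseudometric) space with respect to the topology $\tau_{\text{measure}}$ induced by the distance $d_{\text{measure}}$;
• the space $\mathscr{E} \times \mathfrak{M}_d$ is a *-algebra with respect to the natural (pointwise) operations, and it is also a topological (pseudometric) space with respect to the topology $\tau_{\text{a.c.s.}} \times \tau_{\text{measure}}$ induced by the distance $d_{\text{a.c.s.} \times \text{measure}}$.

Let $\mathfrak{G}$ be the subset of $\mathscr{E} \times \mathfrak{M}_d$ consisting of the $d$-level GLT pairs, i.e.,

$$\mathfrak{G} = \bigl\{(\{A_{\mathbf{n}}\}_n, \kappa) : \{A_{\mathbf{n}}\}_n \sim_{\mathrm{GLT}} \kappa\bigr\} \subseteq \mathscr{E} \times \mathfrak{M}_d.$$

By Remarks 5.3 and 5.5, $\mathfrak{G}$ is a closed *-subalgebra of $\mathscr{E} \times \mathfrak{M}_d$. By Theorems 4.4, 4.6 and 5.7, $\mathfrak{G}$ contains the set

$$\mathfrak{B} = \bigl\{(\{T_{\mathbf{n}}(f)\}_n, 1 \otimes f) : f \in L^1([-\pi,\pi]^d)\bigr\} \cup \bigl\{(\{D_{\mathbf{n}}(a)\}_n, a \otimes 1) : a : [0,1]^d \to \mathbb{C} \text{ is continuous a.e.}\bigr\} \cup \bigl\{(\{Z_{\mathbf{n}}\}_n, 0) : \{Z_{\mathbf{n}}\}_n \sim_\sigma 0\bigr\}.$$

By Remark 5.4, the algebra generated by $\mathfrak{B}$ is dense in $\mathfrak{G}$. In conclusion, the set of $d$-level GLT pairs $\mathfrak{G}$ is the closed *-subalgebra of $\mathscr{E} \times \mathfrak{M}_d$ generated by $\mathfrak{B}$, i.e., the smallest closed *-subalgebra of $\mathscr{E} \times \mathfrak{M}_d$ containing $\mathfrak{B}$. Looking more carefully at Remark 5.4, we also note that, if we let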
$$\mathfrak{C} = \bigl\{(\{D_{\mathbf{n}}(a)\}_n, a \otimes 1) : a \in C^\infty([0,1]^d)\bigr\} \cup \bigl\{(\{T_{\mathbf{n}}(\mathrm{e}^{\mathrm{i}\mathbf{j}\cdot\boldsymbol{\theta}})\}_n, 1 \otimes \mathrm{e}^{\mathrm{i}\mathbf{j}\cdot\boldsymbol{\theta}}) : \mathbf{j} \in \mathbb{Z}^d\bigr\},$$

then the set of $d$-level GLT pairs $\mathfrak{G}$ is the closure of the subalgebra of $\mathscr{E} \times \mathfrak{M}_d$ generated by $\mathfrak{C}$. One may also decide to start the theory of multilevel GLT sequences from one of these two algebraic-topological definitions instead of the traditional one (Definition 5.1). It should be said, however, that the traditional definition looks more effective for obtaining the fundamental singular value and spectral distribution results expressed in Theorems 5.1 and 5.2.
Chapter 6
Summary of the Theory
We conclude the theory of multilevel GLT sequences by providing a self-contained summary, which contains everything one needs to know in order to understand the applications presented in the next chapter. As mentioned in the preface, assuming the reader possesses the necessary prerequisites, a possible way of reading this book consists in first reading this chapter and the next one, and then coming back to fill the gaps, where “fill the gaps” essentially means “read the proofs of the results reported in this chapter”. The latter is substantially equivalent to reading the book. It is assumed that anyone who reads this summary is aware of the notation and terminology used throughout the book, which will be only partially repeated here for the sake of brevity. The reader can find both notation and terminology in Sect. 2.1 and/or in the index at the end.

Multi-index notation. A multi-index $\mathbf{i}$ of size $d$, also called a $d$-index, is a (row) vector in $\mathbb{Z}^d$; its components are denoted by $i_1, \ldots, i_d$. $\mathbf{0}, \mathbf{1}, \mathbf{2}, \ldots$ are the vectors of all zeros, all ones, all twos, $\ldots$ (their size will be clear from the context). For any $d$-index $\mathbf{m}$, we set $N(\mathbf{m}) = \prod_{j=1}^{d} m_j$ and we write $\mathbf{m} \to \infty$ to indicate that $\min(\mathbf{m}) \to \infty$. The notation $N(\boldsymbol{\alpha}) = \prod_{j=1}^{d} \alpha_j$ will be used for any vector $\boldsymbol{\alpha}$ with $d$ components and not only for $d$-indices. If $\mathbf{h}, \mathbf{k}$ are $d$-indices, $\mathbf{h} \le \mathbf{k}$ means that $h_r \le k_r$ for all $r = 1, \ldots, d$. If $\mathbf{h}, \mathbf{k}$ are $d$-indices such that $\mathbf{h} \le \mathbf{k}$, the multi-index (or $d$-index) range $\mathbf{h}, \ldots, \mathbf{k}$ is the set $\{\mathbf{j} \in \mathbb{Z}^d : \mathbf{h} \le \mathbf{j} \le \mathbf{k}\}$. We assume for this set the standard lexicographic ordering:

$$\Bigl[\ \cdots\ \bigl[\ [\ (j_1, \ldots, j_d)\ ]_{j_d = h_d, \ldots, k_d}\ \bigr]_{j_{d-1} = h_{d-1}, \ldots, k_{d-1}}\ \cdots\ \Bigr]_{j_1 = h_1, \ldots, k_1}.$$

For instance, in the case $d = 2$ the ordering is

$$(h_1, h_2), (h_1, h_2 + 1), \ldots, (h_1, k_2), (h_1 + 1, h_2), (h_1 + 1, h_2 + 1), \ldots, (h_1 + 1, k_2), \ldots\ldots, (k_1, h_2), (k_1, h_2 + 1), \ldots, (k_1, k_2).$$
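The lexicographic ordering described above (last component varying fastest) is what `itertools.product` produces, so a multi-index range can be sketched in a few lines of Python. The function names below are hypothetical helpers introduced for illustration; tuples stand in for d-indices.

```python
from itertools import product
from math import prod

def multi_index_range(h, k):
    """All d-indices j with h <= j <= k (componentwise), in lexicographic order."""
    return list(product(*(range(lo, hi + 1) for lo, hi in zip(h, k))))

def N(m):
    """N(m) = m_1 * ... * m_d."""
    return prod(m)

def linear_position(i, m):
    """0-based position of the d-index i inside the range 1, ..., m."""
    pos = 0
    for i_r, m_r in zip(i, m):
        pos = pos * m_r + (i_r - 1)
    return pos

if __name__ == "__main__":
    print(multi_index_range((1, 1), (2, 3)))
    print(N((2, 3)), linear_position((2, 1), (2, 3)))
```

Python compares tuples lexicographically, so the built-in `min(i, j)` on two tuples returns the minimum with respect to this same ordering.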
© Springer Nature Switzerland AG 2018 C. Garoni and S. Serra-Capizzano, Generalized Locally Toeplitz Sequences: Theory and Applications, https://doi.org/10.1007/978-3-030-02233-4_6
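When a $d$-index $\mathbf{j}$ varies over a multi-index range $\mathbf{h}, \ldots, \mathbf{k}$ (this is often written as $\mathbf{j} = \mathbf{h}, \ldots, \mathbf{k}$), it is understood that $\mathbf{j}$ varies from $\mathbf{h}$ to $\mathbf{k}$ following the lexicographic ordering. For instance, if $\mathbf{m} \in \mathbb{N}^d$ and we write $\mathbf{x} = [x_{\mathbf{i}}]_{\mathbf{i}=\mathbf{1}}^{\mathbf{m}}$, then $\mathbf{x}$ is a vector of size $N(\mathbf{m})$ whose components $x_{\mathbf{i}}$, $\mathbf{i} = \mathbf{1}, \ldots, \mathbf{m}$, are ordered in accordance with the lexicographic ordering: the first component is $x_{\mathbf{1}} = x_{(1,\ldots,1,1)}$, the second component is $x_{(1,\ldots,1,2)}$, and so on until the last component, which is $x_{\mathbf{m}} = x_{(m_1,\ldots,m_d)}$. Similarly, if $X = [x_{\mathbf{i}\mathbf{j}}]_{\mathbf{i},\mathbf{j}=\mathbf{1}}^{\mathbf{m}}$, then $X$ is an $N(\mathbf{m}) \times N(\mathbf{m})$ matrix whose components are indexed by a pair of $d$-indices $\mathbf{i}, \mathbf{j}$, both varying from $\mathbf{1}$ to $\mathbf{m}$ according to the lexicographic ordering. If $\mathbf{h}, \mathbf{k}$ are $d$-indices such that $\mathbf{h} \le \mathbf{k}$, the notation $\sum_{\mathbf{j}=\mathbf{h}}^{\mathbf{k}}$ indicates the summation over all $\mathbf{j}$ in $\mathbf{h}, \ldots, \mathbf{k}$. If $\mathbf{i}, \mathbf{j}$ are $d$-indices, $\mathbf{i} \preceq \mathbf{j}$ means that $\mathbf{i}$ precedes (or equals) $\mathbf{j}$ in the lexicographic ordering (which is a total ordering on $\mathbb{Z}^d$). Moreover, we define

$$\mathbf{i} \wedge \mathbf{j} = \begin{cases} \mathbf{i}, & \text{if } \mathbf{i} \preceq \mathbf{j}, \\ \mathbf{j}, & \text{if } \mathbf{i} \succ \mathbf{j}. \end{cases}$$

Note that $\mathbf{i} \wedge \mathbf{j}$ is the minimum among $\mathbf{i}$ and $\mathbf{j}$ with respect to the lexicographic ordering. Operations involving $d$-indices that have no meaning in the vector space $\mathbb{Z}^d$ must always be interpreted in the componentwise sense. For instance, $\mathbf{n}\mathbf{p} = (n_1 p_1, \ldots, n_d p_d)$, $\alpha\mathbf{i}/\mathbf{j} = (\alpha i_1/j_1, \ldots, \alpha i_d/j_d)$ for all $\alpha \in \mathbb{C}$, etc.

Matrix norms. Here is a list of important inequalities involving the $p$-norms and the Schatten $p$-norms of matrices.

N 1. $\|X\| \le \sqrt{|X|_1 |X|_\infty} \le \max(|X|_1, |X|_\infty)$ for all $X \in \mathbb{C}^{m \times m}$.
N 2. $\|X\|_1 \le \operatorname{rank}(X)\,\|X\| \le m\,\|X\|$ for all $X \in \mathbb{C}^{m \times m}$.
N 3. $\|X\|_1 \le \sum_{i,j=1}^{m} |x_{ij}|$ for all $X \in \mathbb{C}^{m \times m}$.

We also recall that $\|X\| = \sigma_{\max}(X) = \|X\|_\infty$ for all $X \in \mathbb{C}^{m \times m}$.

Tensor products. If $X \in \mathbb{C}^{m_1 \times m_2}$ and $Y \in \mathbb{C}^{\ell_1 \times \ell_2}$, the tensor (Kronecker) product of $X$ and $Y$ is the $m_1\ell_1 \times m_2\ell_2$ matrix defined by

$$X \otimes Y = \bigl[x_{ij} Y\bigr]_{\substack{i=1,\ldots,m_1 \\ j=1,\ldots,m_2}} = \begin{bmatrix} x_{11} Y & \cdots & x_{1 m_2} Y \\ \vdots & & \vdots \\ x_{m_1 1} Y & \cdots & x_{m_1 m_2} Y \end{bmatrix}.$$

Here is a list of important properties satisfied by tensor products.

P 1. Associativity: $(X \otimes Y) \otimes Z = X \otimes (Y \otimes Z)$ for all matrices $X, Y, Z$.
P 2. Bilinearity: $(\alpha X + \beta Y) \otimes (\gamma W + \eta Z) = \alpha\gamma (X \otimes W) + \alpha\eta (X \otimes Z) + \beta\gamma (Y \otimes W) + \beta\eta (Y \otimes Z)$ for all $\alpha, \beta, \gamma, \eta \in \mathbb{C}$ and for all matrices $X, Y, W, Z$ such that $X, Y$ are summable and $W, Z$ are summable.
P 3. $(X \otimes Y)^* = X^* \otimes Y^*$ and $(X \otimes Y)^T = X^T \otimes Y^T$ for all matrices $X, Y$.
P 4. $(X \otimes Y)(W \otimes Z) = (XW) \otimes (YZ)$ for all matrices $X, Y, W, Z$ such that $X, W$ are multipliable and $Y, Z$ are multipliable.
P 5. $\|X \otimes Y\|_p = \|X\|_p \|Y\|_p$ for all square matrices $X, Y$ and all $p \in [1, \infty]$.
P 6. $\operatorname{rank}(X \otimes Y) = \operatorname{rank}(X)\operatorname{rank}(Y)$ for all matrices $X, Y$.
P 7. If $X_i \in \mathbb{C}^{m_i \times m_i}$ for $i = 1, \ldots, d$ and $\mathbf{m} = (m_1, \ldots, m_d)$, then
$$(X_1 \otimes \cdots \otimes X_d)_{\mathbf{i}\mathbf{j}} = (X_1)_{i_1 j_1} \cdots (X_d)_{i_d j_d}, \qquad \mathbf{i}, \mathbf{j} = \mathbf{1}, \ldots, \mathbf{m}.$$
P 8. If $X_i, Y_i \in \mathbb{C}^{m_i \times m_i}$ for $i = 1, \ldots, d$ and $\mathbf{m} = (m_1, \ldots, m_d)$, then
$$\operatorname{rank}(X_1 \otimes \cdots \otimes X_d - Y_1 \otimes \cdots \otimes Y_d) \le N(\mathbf{m}) \sum_{i=1}^{d} \frac{\operatorname{rank}(X_i - Y_i)}{m_i}.$$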
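Properties P 4 and P 7 are easy to confirm on small examples. The sketch below uses plain Python lists of rows as matrices; `kron` and `matmul` are hypothetical helper names, and integer entries are used so that the comparisons are exact.

```python
def kron(X, Y):
    """Kronecker product X ⊗ Y of two matrices given as lists of rows.
    Rows and columns are indexed by pairs (i1, i2), (j1, j2) in lexicographic order."""
    return [
        [x * y for x in row_x for y in row_y]
        for row_x in X
        for row_y in Y
    ]

def matmul(X, Y):
    """Ordinary matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

if __name__ == "__main__":
    X, Y = [[1, 2], [3, 4]], [[0, 1], [1, 0]]
    W, Z = [[2, 0], [1, 1]], [[1, 1], [0, 2]]
    # P 4: (X ⊗ Y)(W ⊗ Z) = (XW) ⊗ (YZ)
    print(matmul(kron(X, Y), kron(W, Z)) == kron(matmul(X, W), matmul(Y, Z)))
```

With 0-based indices, the entry of `kron(X, Y)` in row `i1 * len(Y) + i2` and column `j1 * len(Y[0]) + j2` equals `X[i1][j1] * Y[i2][j2]`, which is P 7 for d = 2.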
Sequences of matrices and multilevel matrix-sequences. A sequence of matrices is a sequence of the form $\{A_n\}_n$, where $n$ varies in some infinite subset of $\mathbb{N}$ and $A_n$ is a square matrix of size $d_n$ such that $d_n \to \infty$ as $n \to \infty$. If $\{A_n\}_n$ is a sequence of matrices, with $A_n$ of size $d_n$, we say that $\{A_n\}_n$ is sparsely unbounded (s.u.) if

$$\lim_{M \to \infty} \limsup_{n \to \infty} \frac{\#\{i \in \{1, \ldots, d_n\} : \sigma_i(A_n) > M\}}{d_n} = 0;$$

and we say that $\{A_n\}_n$ is sparsely vanishing (s.v.) if

$$\lim_{M \to \infty} \limsup_{n \to \infty} \frac{\#\{i \in \{1, \ldots, d_n\} : \sigma_i(A_n) < 1/M\}}{d_n} = 0.$$

A $d$-level matrix-sequence is a special sequence of matrices of the form $\{A_{\mathbf{n}}\}_n$, where:
• $n$ varies in some infinite subset of $\mathbb{N}$;
• $\mathbf{n} = \mathbf{n}(n) \in \mathbb{N}^d$ and $\mathbf{n} \to \infty$ (i.e., $\min(\mathbf{n}) \to \infty$) as $n \to \infty$;
• $A_{\mathbf{n}}$ is a square matrix of size $N(\mathbf{n})$.

Singular value and eigenvalue distribution of a sequence of matrices. Let $\{A_n\}_n$ be a sequence of matrices, with $A_n$ of size $d_n$, and let $f : D \subset \mathbb{R}^k \to \mathbb{C}$ be a measurable function defined on a set $D$ with $0 < \mu_k(D) < \infty$.

• We say that $\{A_n\}_n$ has a singular value distribution described by $f$, and we write $\{A_n\}_n \sim_\sigma f$, if
$$\lim_{n \to \infty} \frac{1}{d_n} \sum_{i=1}^{d_n} F(\sigma_i(A_n)) = \frac{1}{\mu_k(D)} \int_D F(|f(\mathbf{x})|)\,\mathrm{d}\mathbf{x}, \qquad \forall\, F \in C_c(\mathbb{R}).$$
In this case, $f$ is called the singular value symbol of $\{A_n\}_n$.

• We say that $\{A_n\}_n$ has a spectral (or eigenvalue) distribution described by $f$, and we write $\{A_n\}_n \sim_\lambda f$, if
$$\lim_{n \to \infty} \frac{1}{d_n} \sum_{i=1}^{d_n} F(\lambda_i(A_n)) = \frac{1}{\mu_k(D)} \int_D F(f(\mathbf{x}))\,\mathrm{d}\mathbf{x}, \qquad \forall\, F \in C_c(\mathbb{C}).$$
In this case, $f$ is called the spectral (or eigenvalue) symbol of $\{A_n\}_n$.

When we write a relation such as $\{A_n\}_n \sim_\sigma f$ or $\{A_n\}_n \sim_\lambda f$, it is understood that $\{A_n\}_n$ is a sequence of matrices and $f$ is a measurable function defined on a subset $D$ of some $\mathbb{R}^k$ with $0 < \mu_k(D) < \infty$. In what follows, “iff” is an abbreviation of “if and only if”.

S 1. If $\{A_n\}_n \sim_\sigma f$ then $\{A_n\}_n$ is s.u.
S 2. If $\{A_n\}_n \sim_\sigma f$ then $\{A_n\}_n$ is s.v. iff $f \ne 0$ a.e.
S 3. If $\{A_n\}_n \sim_\lambda f$ and $\Lambda(A_n) \subseteq S$ for all $n$ then $f \in \overline{S}$ a.e.
S 4. If $A_n = X_n + Y_n$ where
• each $X_n$ is Hermitian and $\{X_n\}_n \sim_\lambda f$,
• $\|X_n\|, \|Y_n\| \le C$ for all $n$, where $C$ is a constant independent of $n$,
• $\lim_{n \to \infty} (d_n)^{-1} \|Y_n\|_1 = 0$,
then $\{A_n\}_n \sim_\lambda f$.

Informal meaning. Assuming that $f$ is continuous a.e., the spectral distribution $\{A_n\}_n \sim_\lambda f$ has the following informal meaning: all the eigenvalues of $A_n$, except possibly for $o(d_n)$ outliers (with $d_n$ being the size of $A_n$), are approximately equal to the samples of $f$ over a uniform grid in its domain $D$ (for $n$ large enough). For instance, if $k = 1$ and $D = [a, b]$, then, assuming we have no outliers, the eigenvalues of $A_n$ are approximately equal to

$$f\Bigl(a + i\,\frac{b - a}{d_n}\Bigr), \qquad i = 1, \ldots, d_n,$$

for $n$ large enough. Similarly, if $k = 2$, $d_n$ is a perfect square and $D = [a_1, b_1] \times [a_2, b_2]$, then, assuming we have no outliers, the eigenvalues of $A_n$ are approximately equal to

$$f\Bigl(a_1 + i_1\,\frac{b_1 - a_1}{\sqrt{d_n}},\ a_2 + i_2\,\frac{b_2 - a_2}{\sqrt{d_n}}\Bigr), \qquad i_1, i_2 = 1, \ldots, \sqrt{d_n},$$

for $n$ large enough. A completely analogous meaning can also be given for the singular value distribution $\{A_n\}_n \sim_\sigma f$.

Clustering and attraction.

• Let $\{A_n\}_n$ be a sequence of matrices, with $A_n$ of size $d_n$, and let $S$ be a nonempty subset of $\mathbb{C}$. We say that $\{A_n\}_n$ is weakly clustered at $S$ if
$$\lim_{n \to \infty} \frac{\#\{j \in \{1, \ldots, d_n\} : \lambda_j(A_n) \notin D(S, \varepsilon)\}}{d_n} = 0, \qquad \forall\, \varepsilon > 0.$$

• Let $\{A_n\}_n$ be a sequence of matrices, with $A_n$ of size $d_n$, and let $z \in \mathbb{C}$. We say that $z$ strongly attracts the spectrum $\Lambda(A_n)$ with infinite order if, once we have ordered the eigenvalues of $A_n$ according to their distance from $z$,
$$|\lambda_1(A_n) - z| \le |\lambda_2(A_n) - z| \le \cdots \le |\lambda_{d_n}(A_n) - z|,$$
the following limit relation holds for each fixed $j \ge 1$:
$$\lim_{n \to \infty} |\lambda_j(A_n) - z| = 0.$$

CA 1. If $\{A_n\}_n \sim_\lambda f$ then $\{A_n\}_n$ is weakly clustered at $\mathrm{ER}(f)$ and each $z \in \mathrm{ER}(f)$ strongly attracts $\Lambda(A_n)$ with infinite order.
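The informal meaning of the spectral distribution and the clustering statement CA 1 can be observed on a classical example. The sketch below is an illustration under stated assumptions: it uses the symbol f(θ) = 2 − 2cos θ on D = [−π, π] and the well-known closed-form eigenvalues of the tridiagonal Toeplitz matrix with stencil −1, 2, −1; the function names are invented for this example.

```python
import math

def symbol(theta):
    """Assumed example symbol f(theta) = 2 - 2cos(theta)."""
    return 2 - 2 * math.cos(theta)

def toeplitz_eigenvalues(n):
    """Sorted closed-form eigenvalues of tridiag(-1, 2, -1) of size n."""
    return sorted(symbol(j * math.pi / (n + 1)) for j in range(1, n + 1))

def uniform_samples(n, a=-math.pi, b=math.pi):
    """Sorted samples f(a + i(b - a)/n), i = 1..n, over the uniform grid in [a, b]."""
    return sorted(symbol(a + i * (b - a) / n) for i in range(1, n + 1))

def max_gap(n):
    """Largest distance between matched (sorted) eigenvalues and samples."""
    return max(abs(l - s) for l, s in zip(toeplitz_eigenvalues(n), uniform_samples(n)))

def outliers(n, eps):
    """Number of eigenvalues farther than eps from the essential range [0, 4]."""
    return sum(1 for lam in toeplitz_eigenvalues(n) if lam < -eps or lam > 4 + eps)

if __name__ == "__main__":
    for n in (100, 400, 1600):
        print(n, max_gap(n), outliers(n, 1e-9))
```

In this example there are no outliers at all, and the gap between sorted eigenvalues and sorted samples shrinks as n grows.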
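Zero-distributed sequences. A sequence of matrices $\{Z_n\}_n$ such that $\{Z_n\}_n \sim_\sigma 0$ is referred to as a zero-distributed sequence. In other words, $\{Z_n\}_n$ is zero-distributed iff

$$\lim_{n \to \infty} \frac{1}{d_n} \sum_{i=1}^{d_n} F(\sigma_i(Z_n)) = F(0), \qquad \forall\, F \in C_c(\mathbb{R}),$$

where $d_n$ is the size of $Z_n$. Given a sequence of matrices $\{Z_n\}_n$, with $Z_n$ of size $d_n$, the following properties hold. In what follows, we use the natural convention $C/\infty = 0$ for all numbers $C$.

Z 1. $\{Z_n\}_n \sim_\sigma 0$ iff $Z_n = R_n + N_n$ with $\lim_{n \to \infty} (d_n)^{-1} \operatorname{rank}(R_n) = \lim_{n \to \infty} \|N_n\| = 0$.
Z 2. $\{Z_n\}_n \sim_\sigma 0$ if there is a $p \in [1, \infty]$ such that $\lim_{n \to \infty} (d_n)^{-1/p} \|Z_n\|_p = 0$.

Sequences of multilevel diagonal sampling matrices. If $\mathbf{n} \in \mathbb{N}^d$ and $a : [0,1]^d \to \mathbb{C}$, the $\mathbf{n}$th ($d$-level) diagonal sampling matrix generated by $a$ is the $N(\mathbf{n}) \times N(\mathbf{n})$ diagonal matrix given by

$$D_{\mathbf{n}}(a) = \operatorname*{diag}_{\mathbf{i}=\mathbf{1},\ldots,\mathbf{n}} a\Bigl(\frac{\mathbf{i}}{\mathbf{n}}\Bigr).$$

Each $d$-level matrix-sequence of the form $\{D_{\mathbf{n}}(a)\}_n$, with $\mathbf{n} = \mathbf{n}(n) \to \infty$ as $n \to \infty$, is referred to as a sequence of ($d$-level) diagonal sampling matrices generated by $a$.

Multilevel Toeplitz sequences. If $\mathbf{n} \in \mathbb{N}^d$ and $f : [-\pi,\pi]^d \to \mathbb{C}$ is a function in $L^1([-\pi,\pi]^d)$, the $\mathbf{n}$th ($d$-level) Toeplitz matrix generated by $f$ is the $N(\mathbf{n}) \times N(\mathbf{n})$ matrix

$$T_{\mathbf{n}}(f) = [f_{\mathbf{i}-\mathbf{j}}]_{\mathbf{i},\mathbf{j}=\mathbf{1}}^{\mathbf{n}},$$

where the $f_{\mathbf{k}}$ are the Fourier coefficients of $f$,

$$f_{\mathbf{k}} = \frac{1}{(2\pi)^d} \int_{[-\pi,\pi]^d} f(\boldsymbol{\theta})\, \mathrm{e}^{-\mathrm{i}\mathbf{k}\cdot\boldsymbol{\theta}}\, \mathrm{d}\boldsymbol{\theta}, \qquad \mathbf{k} \in \mathbb{Z}^d.$$

Each $d$-level matrix-sequence of the form $\{T_{\mathbf{n}}(f)\}_n$, with $\mathbf{n} = \mathbf{n}(n) \to \infty$ as $n \to \infty$, is referred to as a ($d$-level) Toeplitz sequence generated by $f$.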
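The two definitions above translate directly into code once the lexicographic ordering of d-indices is fixed. The following sketch is only illustrative: matrices are plain lists of rows, the symbol f is assumed to be a trigonometric polynomial given by a dictionary of its Fourier coefficients, and all function names are hypothetical.

```python
from itertools import product

def multi_indices(n):
    """The d-indices 1, ..., n in lexicographic order."""
    return list(product(*(range(1, n_r + 1) for n_r in n)))

def diagonal_sampling_matrix(a, n):
    """D_n(a): diagonal matrix of the samples a(i/n), i = 1..n (i/n componentwise)."""
    idx = multi_indices(n)
    D = [[0.0] * len(idx) for _ in idx]
    for p, i in enumerate(idx):
        D[p][p] = a(*(i_r / n_r for i_r, n_r in zip(i, n)))
    return D

def toeplitz_matrix(coeffs, n):
    """T_n(f) = [f_{i-j}] for a trigonometric polynomial f, where coeffs maps
    d-indices k (tuples) to Fourier coefficients f_k; missing keys mean 0."""
    idx = multi_indices(n)
    return [
        [coeffs.get(tuple(i_r - j_r for i_r, j_r in zip(i, j)), 0) for j in idx]
        for i in idx
    ]

if __name__ == "__main__":
    # f(theta) = 2 - 2cos(theta): f_0 = 2, f_1 = f_{-1} = -1.
    print(toeplitz_matrix({(0,): 2, (1,): -1, (-1,): -1}, (3,)))
    print(diagonal_sampling_matrix(lambda x, y: x + y, (2, 2)))
```

For d = 2 and f(θ1, θ2) = 4 − 2cos θ1 − 2cos θ2 the same function produces the familiar block tridiagonal matrix of the 5-point Laplacian stencil.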
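T 1. For every $\mathbf{n} \in \mathbb{N}^d$ the map $T_{\mathbf{n}}(\cdot) : L^1([-\pi,\pi]^d) \to \mathbb{C}^{N(\mathbf{n}) \times N(\mathbf{n})}$
• is linear: $T_{\mathbf{n}}(\alpha f + \beta g) = \alpha T_{\mathbf{n}}(f) + \beta T_{\mathbf{n}}(g)$;
• is strictly positive: $T_{\mathbf{n}}(f) > O_{N(\mathbf{n})}$ if $f \ge 0$ and $f$ is not a.e. equal to 0;
• satisfies $T_{\mathbf{n}}(1) = I_{N(\mathbf{n})}$ and $(T_{\mathbf{n}}(f))^* = T_{\mathbf{n}}(\overline{f})$.
T 2. If $f$ is real a.e. then $T_{\mathbf{n}}(f)$ is Hermitian for all $\mathbf{n} \in \mathbb{N}^d$.
T 3. If $1 \le p \le \infty$ and $f \in L^p([-\pi,\pi]^d)$ then $\|T_{\mathbf{n}}(f)\|_p \le \dfrac{N(\mathbf{n})^{1/p}}{(2\pi)^{d/p}}\, \|f\|_{L^p}$.
T 4. If $f \in L^1([-\pi,\pi]^d)$ and $\{T_{\mathbf{n}}(f)\}_n$ is any Toeplitz sequence generated by $f$ then $\{T_{\mathbf{n}}(f)\}_n \sim_\sigma f$. If in addition $f$ is real a.e. then $\{T_{\mathbf{n}}(f)\}_n \sim_\lambda f$.
T 5. If $f \in L^\infty([-\pi,\pi]^d)$, the interior of $\mathrm{ER}(f)$ is empty, $\mathbb{C} \setminus \mathrm{ER}(f)$ is connected and $\{T_{\mathbf{n}}(f)\}_n$ is any Toeplitz sequence generated by $f$, then $\{T_{\mathbf{n}}(f)\}_n \sim_\lambda f$.
T 6. If $f$ is real a.e. and $m_f, M_f$ are its essential infimum and supremum, then
• $\Lambda(T_{\mathbf{n}}(f)) \subseteq [m_f, M_f]$ for all $\mathbf{n} \in \mathbb{N}^d$,
• $\Lambda(T_{\mathbf{n}}(f)) \subset (m_f, M_f)$ for all $\mathbf{n} \in \mathbb{N}^d$ whenever $m_f < M_f$,
• for each fixed $j \ge 1$ we have $\lambda_j(T_{\mathbf{n}}(f)) \to M_f$ and $\lambda_{N(\mathbf{n})-j+1}(T_{\mathbf{n}}(f)) \to m_f$ as $\mathbf{n} \to \infty$, where $\lambda_1(T_{\mathbf{n}}(f)) \ge \cdots \ge \lambda_{N(\mathbf{n})}(T_{\mathbf{n}}(f))$.
T 7. If $f \in L^p([-\pi,\pi]^d)$ and $g \in L^q([-\pi,\pi]^d)$, where $1 \le p, q \le \infty$ are conjugate exponents, then $N(\mathbf{n})^{-1} \|T_{\mathbf{n}}(f) T_{\mathbf{n}}(g) - T_{\mathbf{n}}(fg)\|_1 \to 0$ as $\mathbf{n} \to \infty$.
T 8. If $f_1, \ldots, f_d \in L^1([-\pi,\pi])$ and $\mathbf{n} \in \mathbb{N}^d$ then $T_{\mathbf{n}}(f_1 \otimes \cdots \otimes f_d) = T_{n_1}(f_1) \otimes \cdots \otimes T_{n_d}(f_d)$.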
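Property T 8 can be verified exactly for trigonometric polynomials with integer Fourier coefficients. The sketch below covers the case d = 2 only and is not meant as a general implementation; the coefficient dictionaries and the helper names are assumptions made for this example.

```python
from itertools import product

def toeplitz_1d(coeffs, n):
    """T_n(f) for a univariate trigonometric polynomial; coeffs maps k to f_k."""
    return [[coeffs.get(i - j, 0) for j in range(1, n + 1)] for i in range(1, n + 1)]

def toeplitz_2d(coeffs, n):
    """2-level T_n(f); coeffs maps (k1, k2) to f_k, indices in lexicographic order."""
    idx = list(product(range(1, n[0] + 1), range(1, n[1] + 1)))
    return [[coeffs.get((i[0] - j[0], i[1] - j[1]), 0) for j in idx] for i in idx]

def kron(X, Y):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in row_x for y in row_y] for row_x in X for row_y in Y]

def tensor_coeffs(c1, c2):
    """Fourier coefficients of (f1 ⊗ f2)(theta1, theta2) = f1(theta1) * f2(theta2)."""
    return {(k1, k2): v1 * v2 for k1, v1 in c1.items() for k2, v2 in c2.items()}

if __name__ == "__main__":
    f1 = {0: 2, 1: -1, -1: -1}   # 2 - 2cos(theta)
    f2 = {0: 1, 1: 3, -2: 5}     # an arbitrary trigonometric polynomial
    n = (3, 4)
    lhs = toeplitz_2d(tensor_coeffs(f1, f2), n)
    rhs = kron(toeplitz_1d(f1, n[0]), toeplitz_1d(f2, n[1]))
    print(lhs == rhs)
```

The equality holds entry by entry because the coefficient of f1 ⊗ f2 at (k1, k2) is the product of the coefficients of f1 at k1 and of f2 at k2, which is exactly the entry formula P 7 for tensor products.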
Approximating classes of sequences. Let $\{A_n\}_n$ be a sequence of matrices and $\{\{B_{n,m}\}_n\}_m$ a sequence of sequences of matrices, with $A_n$ and $B_{n,m}$ of size $d_n$. We say that $\{\{B_{n,m}\}_n\}_m$ is an approximating class of sequences (a.c.s.) for $\{A_n\}_n$ if the following condition is met: for every $m$ there exists $n_m$ such that, for $n \ge n_m$,

$$A_n = B_{n,m} + R_{n,m} + N_{n,m}, \qquad \operatorname{rank}(R_{n,m}) \le c(m)\,d_n, \qquad \|N_{n,m}\| \le \omega(m),$$

where $n_m, c(m), \omega(m)$ depend only on $m$, and

$$\lim_{m \to \infty} c(m) = \lim_{m \to \infty} \omega(m) = 0.$$

We use the abbreviation “a.c.s.” for both the singular “approximating class of sequences” and the plural “approximating classes of sequences”. It turns out that, for each fixed sequence of positive integers $d_n$ such that $d_n \to \infty$, the notion of a.c.s. is a notion of convergence in the space $\mathscr{E} = \{\{A_n\}_n : A_n \in \mathbb{C}^{d_n \times d_n} \text{ for every } n\}$. More precisely, for every $A \in \mathbb{C}^{\ell \times \ell}$ let

$$p(A) = \inf\Bigl\{\frac{\operatorname{rank}(R)}{\ell} + \|N\| : R, N \in \mathbb{C}^{\ell \times \ell},\ R + N = A\Bigr\} = \min_{i=0,\ldots,\ell}\Bigl\{\frac{i}{\ell} + \sigma_{i+1}(A)\Bigr\},$$

where $\sigma_1(A) \ge \cdots \ge \sigma_\ell(A)$ and $\sigma_{\ell+1}(A) = 0$ by convention. Set

$$p_{\text{a.c.s.}}(\{A_n\}_n) = \limsup_{n \to \infty} p(A_n), \qquad \{A_n\}_n \in \mathscr{E},$$
$$d_{\text{a.c.s.}}(\{A_n\}_n, \{B_n\}_n) = p_{\text{a.c.s.}}(\{A_n - B_n\}_n), \qquad \{A_n\}_n, \{B_n\}_n \in \mathscr{E}.$$

Then, $d_{\text{a.c.s.}}$ is a distance on $\mathscr{E}$ such that $d_{\text{a.c.s.}}(\{A_n\}_n, \{B_n\}_n) = 0$ iff $\{A_n - B_n\}_n$ is zero-distributed; moreover, $d_{\text{a.c.s.}}$ turns $\mathscr{E}$ into a pseudometric space $(\mathscr{E}, d_{\text{a.c.s.}})$ where the statement “$\{\{B_{n,m}\}_n\}_m$ converges to $\{A_n\}_n$” is equivalent to “$\{\{B_{n,m}\}_n\}_m$ is an a.c.s. for $\{A_n\}_n$”. In particular, we can reformulate the definition of a.c.s. in the following way: a sequence of sequences of matrices $\{\{B_{n,m}\}_n\}_m$ is said to be an a.c.s. for $\{A_n\}_n$ if $\{B_{n,m}\}_n$ converges to $\{A_n\}_n$ in $(\mathscr{E}, d_{\text{a.c.s.}})$ as $m \to \infty$, i.e., if $d_{\text{a.c.s.}}(\{B_{n,m}\}_n, \{A_n\}_n) \to 0$ as $m \to \infty$. The theory of a.c.s. may then be interpreted as an approximation theory for sequences of matrices, and for this reason we will use the convergence notation $\{B_{n,m}\}_n \xrightarrow{\text{a.c.s.}} \{A_n\}_n$ to indicate that $\{\{B_{n,m}\}_n\}_m$ is an a.c.s. for $\{A_n\}_n$.

ACS 1. $\{A_n\}_n \sim_\sigma f$ iff there exist sequences of matrices $\{B_{n,m}\}_n \sim_\sigma f_m$ such that $\{B_{n,m}\}_n \xrightarrow{\text{a.c.s.}} \{A_n\}_n$ and $f_m \to f$ in measure.
ACS 2. Suppose each $A_n$ is Hermitian. Then, $\{A_n\}_n \sim_\lambda f$ iff there exist sequences of Hermitian matrices $\{B_{n,m}\}_n \sim_\lambda f_m$ such that $\{B_{n,m}\}_n \xrightarrow{\text{a.c.s.}} \{A_n\}_n$ and $f_m \to f$ in measure.
ACS 3. If $\{B_{n,m}\}_n \xrightarrow{\text{a.c.s.}} \{A_n\}_n$ and $\{B'_{n,m}\}_n \xrightarrow{\text{a.c.s.}} \{A'_n\}_n$, with $A_n$ and $A'_n$ of the same size $d_n$, then
• $\{B_{n,m}^*\}_n \xrightarrow{\text{a.c.s.}} \{A_n^*\}_n$,
• $\{\alpha B_{n,m} + \beta B'_{n,m}\}_n \xrightarrow{\text{a.c.s.}} \{\alpha A_n + \beta A'_n\}_n$ for all $\alpha, \beta \in \mathbb{C}$,
• $\{B_{n,m} B'_{n,m}\}_n \xrightarrow{\text{a.c.s.}} \{A_n A'_n\}_n$ whenever $\{A_n\}_n, \{A'_n\}_n$ are s.u.
ACS 4. Suppose $\{A_n\}_n$ is s.v. If $\{B_{n,m}\}_n \xrightarrow{\text{a.c.s.}} \{A_n\}_n$ then $\{B_{n,m}^\dagger\}_n \xrightarrow{\text{a.c.s.}} \{A_n^\dagger\}_n$.
ACS 5. Suppose $\{A_n\}_n$ is s.u. and $A_n, B_{n,m}$ are Hermitian. If $\{B_{n,m}\}_n \xrightarrow{\text{a.c.s.}} \{A_n\}_n$ then $\{f(B_{n,m})\}_n \xrightarrow{\text{a.c.s.}} \{f(A_n)\}_n$ for every continuous function $f : \mathbb{C} \to \mathbb{C}$.
ACS 6. Let $p \in [1, \infty]$ and suppose for every $m$ there exists $n_m$ such that, for $n \ge n_m$, $\|A_n - B_{n,m}\|_p \le \varepsilon(m, n)(d_n)^{1/p}$, where $d_n$ is the size of both $A_n$ and $B_{n,m}$, and $\lim_{m \to \infty} \limsup_{n \to \infty} \varepsilon(m, n) = 0$. Then $\{B_{n,m}\}_n \xrightarrow{\text{a.c.s.}} \{A_n\}_n$.
ACS 7. Suppose $\{A_n - B_{n,m}\}_n \sim_\sigma g_m$ for some $g_m$ defined on a fixed domain (independent of $m$). If $g_m \to 0$ in measure then $\{B_{n,m}\}_n \xrightarrow{\text{a.c.s.}} \{A_n\}_n$.
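The quantity p(A) introduced above, which underlies the distance d_a.c.s., depends only on the singular values of A and is cheap to evaluate once they are known. The sketch below takes the singular values as input, since computing them is outside the scope of this illustration; the function name is hypothetical.

```python
def p_from_singular_values(singular_values):
    """p(A) = min over i = 0..l of (i/l + sigma_{i+1}(A)), where the singular
    values are sorted in non-increasing order and sigma_{l+1}(A) = 0."""
    size = len(singular_values)
    sigma = sorted(singular_values, reverse=True) + [0.0]
    return min(i / size + sigma[i] for i in range(size + 1))

if __name__ == "__main__":
    # One large singular value and many zeros: a rank-one perturbation is "small".
    print(p_from_singular_values([1000.0] + [0.0] * 99))
    # Uniformly small singular values: a small-norm perturbation is "small" too.
    print(p_from_singular_values([0.001] * 100))
```

The two demo cases correspond to the two kinds of error allowed in the definition of a.c.s.: a term of small relative rank and a term of small norm.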
Multilevel generalized locally Toeplitz sequences. A d-level Generalized Locally Toeplitz (GLT) sequence {A n }n is a special d-level matrix-sequence equipped with a measurable function κ : [0, 1]d × [−π, π]d → C, the so-called symbol (or kernel). We use the notation { A n }n ∼GLT κ to indicate that {A n }n is a d-level GLT sequence with symbol κ. The symbol of a d-level GLT sequence is unique in the sense that if
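$\{A_{\mathbf{n}}\}_n \sim_{\mathrm{GLT}} \kappa$ and $\{A_{\mathbf{n}}\}_n \sim_{\mathrm{GLT}} \xi$ then $\kappa = \xi$ a.e. in $[0,1]^d \times [-\pi,\pi]^d$. Conversely, if $\{A_{\mathbf{n}}\}_n \sim_{\mathrm{GLT}} \kappa$ and $\kappa = \xi$ a.e. in $[0,1]^d \times [-\pi,\pi]^d$ then $\{A_{\mathbf{n}}\}_n \sim_{\mathrm{GLT}} \xi$.

GLT 1. If $\{A_{\mathbf{n}}\}_n \sim_{\mathrm{GLT}} \kappa$ then $\{A_{\mathbf{n}}\}_n \sim_\sigma \kappa$. If $\{A_{\mathbf{n}}\}_n \sim_{\mathrm{GLT}} \kappa$ and the matrices $A_{\mathbf{n}}$ are Hermitian then $\{A_{\mathbf{n}}\}_n \sim_\lambda \kappa$.
GLT 2. If $\{A_{\mathbf{n}}\}_n \sim_{\mathrm{GLT}} \kappa$ and $A_{\mathbf{n}} = X_{\mathbf{n}} + Y_{\mathbf{n}}$, where
• every $X_{\mathbf{n}}$ is Hermitian,
• $\|X_{\mathbf{n}}\|, \|Y_{\mathbf{n}}\| \le C$ for some constant $C$ independent of $n$,
• $N(\mathbf{n})^{-1} \|Y_{\mathbf{n}}\|_1 \to 0$,
then $\{A_{\mathbf{n}}\}_n \sim_\lambda \kappa$.
GLT 3. We have
• $\{T_{\mathbf{n}}(f)\}_n \sim_{\mathrm{GLT}} \kappa(\mathbf{x}, \boldsymbol{\theta}) = f(\boldsymbol{\theta})$ if $f \in L^1([-\pi,\pi]^d)$,
• $\{D_{\mathbf{n}}(a)\}_n \sim_{\mathrm{GLT}} \kappa(\mathbf{x}, \boldsymbol{\theta}) = a(\mathbf{x})$ if $a : [0,1]^d \to \mathbb{C}$ is continuous a.e.,
• $\{Z_{\mathbf{n}}\}_n \sim_{\mathrm{GLT}} \kappa(\mathbf{x}, \boldsymbol{\theta}) = 0$ iff $\{Z_{\mathbf{n}}\}_n \sim_\sigma 0$.
GLT 4. If $\{A_{\mathbf{n}}\}_n \sim_{\mathrm{GLT}} \kappa$ and $\{B_{\mathbf{n}}\}_n \sim_{\mathrm{GLT}} \xi$ then
• $\{A_{\mathbf{n}}^*\}_n \sim_{\mathrm{GLT}} \overline{\kappa}$,
• $\{\alpha A_{\mathbf{n}} + \beta B_{\mathbf{n}}\}_n \sim_{\mathrm{GLT}} \alpha\kappa + \beta\xi$ for all $\alpha, \beta \in \mathbb{C}$,
• $\{A_{\mathbf{n}} B_{\mathbf{n}}\}_n \sim_{\mathrm{GLT}} \kappa\xi$.
GLT 5. If $\{A_{\mathbf{n}}\}_n \sim_{\mathrm{GLT}} \kappa$ and $\kappa \ne 0$ a.e. then $\{A_{\mathbf{n}}^\dagger\}_n \sim_{\mathrm{GLT}} \kappa^{-1}$.
GLT 6. If $\{A_{\mathbf{n}}\}_n \sim_{\mathrm{GLT}} \kappa$ and each $A_{\mathbf{n}}$ is Hermitian, then $\{f(A_{\mathbf{n}})\}_n \sim_{\mathrm{GLT}} f(\kappa)$ for every continuous function $f : \mathbb{C} \to \mathbb{C}$.
GLT 7. $\{A_{\mathbf{n}}\}_n \sim_{\mathrm{GLT}} \kappa$ iff there exist $d$-level GLT sequences $\{B_{\mathbf{n},m}\}_n \sim_{\mathrm{GLT}} \kappa_m$ such that $\{B_{\mathbf{n},m}\}_n \xrightarrow{\text{a.c.s.}} \{A_{\mathbf{n}}\}_n$ and $\kappa_m \to \kappa$ in measure.
GLT 8. Suppose $\{A_{\mathbf{n}}\}_n \sim_{\mathrm{GLT}} \kappa$ and $\{B_{\mathbf{n},m}\}_n \sim_{\mathrm{GLT}} \kappa_m$. Then, $\{B_{\mathbf{n},m}\}_n \xrightarrow{\text{a.c.s.}} \{A_{\mathbf{n}}\}_n$ iff $\kappa_m \to \kappa$ in measure.
GLT 9. If $\{A_{\mathbf{n}}\}_n \sim_{\mathrm{GLT}} \kappa$ then there exist functions $a_{i,m}, f_{i,m}$, $i = 1, \ldots, N_m$, such that
• $a_{i,m} \in C^\infty([0,1]^d)$ and $f_{i,m}$ is a $d$-variate trigonometric polynomial,
• $\sum_{i=1}^{N_m} a_{i,m}(\mathbf{x}) f_{i,m}(\boldsymbol{\theta}) \to \kappa(\mathbf{x}, \boldsymbol{\theta})$ a.e.,
• $\bigl\{\sum_{i=1}^{N_m} D_{\mathbf{n}}(a_{i,m}) T_{\mathbf{n}}(f_{i,m})\bigr\}_n \xrightarrow{\text{a.c.s.}} \{A_{\mathbf{n}}\}_n$.

Fix a sequence of $d$-indices $\{\mathbf{n} = \mathbf{n}(n)\}_n \subseteq \mathbb{N}^d$ such that $\mathbf{n} \to \infty$ as $n \to \infty$. Let

$$\begin{aligned}
\mathscr{E} &= \bigl\{\{A_{\mathbf{n}}\}_n : A_{\mathbf{n}} \in \mathbb{C}^{N(\mathbf{n}) \times N(\mathbf{n})} \text{ for every } n\bigr\}, \\
\mathfrak{M}_d &= \{\kappa : [0,1]^d \times [-\pi,\pi]^d \to \mathbb{C} : \kappa \text{ is measurable}\}, \\
\mathscr{E} \times \mathfrak{M}_d &= \bigl\{(\{A_{\mathbf{n}}\}_n, \kappa) : \{A_{\mathbf{n}}\}_n \in \mathscr{E},\ \kappa \in \mathfrak{M}_d\bigr\}.
\end{aligned}$$

We note the following.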
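• The space $\mathscr{E}$ is a *-algebra with respect to the natural operations of conjugate transposition, linear combination and product of $d$-level matrix-sequences:
$$\{A_{\mathbf{n}}\}_n^* = \{A_{\mathbf{n}}^*\}_n, \qquad \alpha\{A_{\mathbf{n}}\}_n + \beta\{B_{\mathbf{n}}\}_n = \{\alpha A_{\mathbf{n}} + \beta B_{\mathbf{n}}\}_n, \qquad \{A_{\mathbf{n}}\}_n \{B_{\mathbf{n}}\}_n = \{A_{\mathbf{n}} B_{\mathbf{n}}\}_n;$$
and it is also a pseudometric space with respect to the distance $d_{\text{a.c.s.}}$ inducing the a.c.s. convergence.
• The space $\mathfrak{M}_d$ is a *-algebra with respect to the natural operations of complex conjugation, linear combination and product of functions, and it is also a pseudometric space with respect to the distance $d_{\text{measure}}$ inducing the convergence in measure.
• The space $\mathscr{E} \times \mathfrak{M}_d$ is a *-algebra with respect to the natural (pointwise) operations:
$$\begin{aligned}
(\{A_{\mathbf{n}}\}_n, \kappa)^* &= (\{A_{\mathbf{n}}^*\}_n, \overline{\kappa}), \\
\alpha(\{A_{\mathbf{n}}\}_n, \kappa) + \beta(\{B_{\mathbf{n}}\}_n, \xi) &= (\{\alpha A_{\mathbf{n}} + \beta B_{\mathbf{n}}\}_n, \alpha\kappa + \beta\xi), \\
(\{A_{\mathbf{n}}\}_n, \kappa)(\{B_{\mathbf{n}}\}_n, \xi) &= (\{A_{\mathbf{n}} B_{\mathbf{n}}\}_n, \kappa\xi);
\end{aligned}$$
and it is also a pseudometric space with respect to the product distance
$$d_{\text{a.c.s.} \times \text{measure}}\bigl((\{A_{\mathbf{n}}\}_n, \kappa), (\{B_{\mathbf{n}}\}_n, \xi)\bigr) = d_{\text{a.c.s.}}(\{A_{\mathbf{n}}\}_n, \{B_{\mathbf{n}}\}_n) + d_{\text{measure}}(\kappa, \xi).$$

Let $\mathfrak{G}$ be the subset of $\mathscr{E} \times \mathfrak{M}_d$ consisting of the $d$-level GLT pairs, i.e.,

$$\mathfrak{G} = \bigl\{(\{A_{\mathbf{n}}\}_n, \kappa) : \{A_{\mathbf{n}}\}_n \sim_{\mathrm{GLT}} \kappa\bigr\} \subseteq \mathscr{E} \times \mathfrak{M}_d.$$

By GLT 4 and GLT 7, $\mathfrak{G}$ is a closed *-subalgebra of $\mathscr{E} \times \mathfrak{M}_d$. By GLT 3, $\mathfrak{G}$ contains the set

$$\mathfrak{B} = \bigl\{(\{T_{\mathbf{n}}(f)\}_n, \kappa(\mathbf{x}, \boldsymbol{\theta}) = f(\boldsymbol{\theta})) : f \in L^1([-\pi,\pi]^d)\bigr\} \cup \bigl\{(\{D_{\mathbf{n}}(a)\}_n, \kappa(\mathbf{x}, \boldsymbol{\theta}) = a(\mathbf{x})) : a : [0,1]^d \to \mathbb{C} \text{ is continuous a.e.}\bigr\} \cup \bigl\{(\{Z_{\mathbf{n}}\}_n, \kappa(\mathbf{x}, \boldsymbol{\theta}) = 0) : \{Z_{\mathbf{n}}\}_n \sim_\sigma 0\bigr\}.$$

By GLT 9, the subalgebra of $\mathscr{E} \times \mathfrak{M}_d$ generated by the set

$$\mathfrak{C} = \bigl\{(\{D_{\mathbf{n}}(a)\}_n, \kappa(\mathbf{x}, \boldsymbol{\theta}) = a(\mathbf{x})) : a \in C^\infty([0,1]^d)\bigr\} \cup \bigl\{(\{T_{\mathbf{n}}(f)\}_n, \kappa(\mathbf{x}, \boldsymbol{\theta}) = f(\boldsymbol{\theta})) : f \text{ is a } d\text{-variate trigonometric polynomial}\bigr\}$$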
is dense in $\mathscr G$. In conclusion:
• the set of d-level GLT pairs $\mathscr G$ is the closed *-subalgebra of $\mathscr E \times \mathfrak M_d$ generated by $\mathscr B$, i.e., the smallest closed *-subalgebra of $\mathscr E \times \mathfrak M_d$ containing $\mathscr B$;
• the set of d-level GLT pairs $\mathscr G$ is the closure of the subalgebra of $\mathscr E \times \mathfrak M_d$ generated by $\mathscr C$.
Either of these two algebraic-topological characterizations may be taken as the definition of d-level GLT sequences.
Chapter 7
Applications
In this chapter we present several applications of the theory of multilevel GLT sequences to the computation of the singular value and eigenvalue distribution of matrix-sequences arising from the numerical discretization of PDEs. In order to understand the content of this chapter, it is enough that the reader knows the summary of Chap. 6 and possesses the necessary prerequisites, most of which have been addressed in Chap. 2 and [22]. Indeed, our arguments/derivations in this chapter will never refer to Chaps. 1–5, i.e., they will only rely on the summary of Chap. 6. For more applications than the ones presented herein, we refer the reader to [22, Sect. 1.1], where specific pointers to the available literature are provided.
7.1 Auxiliary Results
Before going into the applications of the theory of multilevel GLT sequences, we collect in this section a couple of auxiliary results. Besides simplifying the presentation of the next sections, these results are also interesting in themselves. Actually, they may be considered as further applications of the theory of multilevel GLT sequences.
© Springer Nature Switzerland AG 2018 C. Garoni and S. Serra-Capizzano, Generalized Locally Toeplitz Sequences: Theory and Applications, https://doi.org/10.1007/978-3-030-02233-4_7
7.1.1 Multilevel GLT Preconditioning
The first auxiliary result concerns preconditioning in the context of multilevel GLT sequences. It is the multilevel version of [22, Exercise 8.4].
Theorem 7.1 Let $\{A_{\mathbf n}\}_n$ be a sequence of Hermitian matrices such that $\{A_{\mathbf n}\}_n \sim_{\rm GLT} \kappa$, and let $\{P_{\mathbf n}\}_n$ be a sequence of HPD matrices such that $\{P_{\mathbf n}\}_n \sim_{\rm GLT} \xi$ with $\xi \ne 0$ a.e. Then, the sequence of preconditioned matrices $P_{\mathbf n}^{-1}A_{\mathbf n}$ satisfies
\[ \{P_{\mathbf n}^{-1}A_{\mathbf n}\}_n \sim_{\rm GLT} \xi^{-1}\kappa \]
and
\[ \{P_{\mathbf n}^{-1}A_{\mathbf n}\}_n \sim_{\sigma,\lambda} \xi^{-1}\kappa. \]
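Proof The GLT relation $\{P_{\mathbf n}^{-1}A_{\mathbf n}\}_n \sim_{\rm GLT} \xi^{-1}\kappa$ is a direct consequence of GLT 4 and GLT 5. The singular value distribution $\{P_{\mathbf n}^{-1}A_{\mathbf n}\}_n \sim_\sigma \xi^{-1}\kappa$ follows immediately from GLT 1. The only difficult part is the spectral distribution $\{P_{\mathbf n}^{-1}A_{\mathbf n}\}_n \sim_\lambda \xi^{-1}\kappa$, which does not follow from GLT 1 because $P_{\mathbf n}^{-1}A_{\mathbf n}$ is not Hermitian in general. Since $P_{\mathbf n}$ is HPD, the eigenvalues of $P_{\mathbf n}$ are positive and the matrices $P_{\mathbf n}^{1/2}$, $P_{\mathbf n}^{-1/2}$ are well-defined. Moreover,
\[ P_{\mathbf n}^{-1}A_{\mathbf n} \sim P_{\mathbf n}^{-1/2}A_{\mathbf n}P_{\mathbf n}^{-1/2}, \tag{7.1} \]
where $X \sim Y$ means that $X$ is similar to $Y$. The good news is that $P_{\mathbf n}^{-1/2}A_{\mathbf n}P_{\mathbf n}^{-1/2}$ is Hermitian and, moreover, by GLT 4–GLT 6 (with GLT 6 applied to $f(z) = |z|^{1/2}$), we have
\[ \{P_{\mathbf n}^{-1/2}A_{\mathbf n}P_{\mathbf n}^{-1/2}\}_n \sim_{\rm GLT} |\xi|^{-1/2}\kappa|\xi|^{-1/2} = |\xi|^{-1}\kappa = \xi^{-1}\kappa; \]
note that the latter equation follows from the fact that $\xi \ge 0$ a.e. by S 3, since $P_{\mathbf n}$ is HPD and $\{P_{\mathbf n}\}_n \sim_\lambda \xi$ by GLT 1. Since $P_{\mathbf n}^{-1/2}A_{\mathbf n}P_{\mathbf n}^{-1/2}$ is Hermitian, GLT 1 yields $\{P_{\mathbf n}^{-1/2}A_{\mathbf n}P_{\mathbf n}^{-1/2}\}_n \sim_\lambda \xi^{-1}\kappa$. Thus, by the similarity (7.1), $\{P_{\mathbf n}^{-1}A_{\mathbf n}\}_n \sim_\lambda \xi^{-1}\kappa$.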
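The similarity argument used in the proof can be observed numerically. The following is only a minimal sketch in the case d = 1, under assumptions that are not taken from the text: the test matrices $A_n = D_n(a)^{1/2}T_n(2-2\cos\theta)D_n(a)^{1/2}$ with $a(x) = 1+x$ and $P_n = T_n(3+2\cos\theta)$, and the helper names `tridiag_toeplitz` and `preconditioned_spectra`, are illustrative choices. It compares the eigenvalues of $P_n^{-1}A_n$ with those of the Hermitian matrix $P_n^{-1/2}A_nP_n^{-1/2}$.

```python
import numpy as np

def tridiag_toeplitz(n, off, diag):
    """Symmetric tridiagonal Toeplitz matrix with constant diagonals."""
    return diag * np.eye(n) + off * (np.eye(n, k=1) + np.eye(n, k=-1))

def preconditioned_spectra(n):
    """Eigenvalues of P^{-1}A and of P^{-1/2} A P^{-1/2} (illustrative d = 1 example)."""
    x = np.arange(1, n + 1) / n
    d_half = np.diag(np.sqrt(1.0 + x))                    # D_n(a)^{1/2}, a(x) = 1 + x
    A = d_half @ tridiag_toeplitz(n, -1.0, 2.0) @ d_half  # Hermitian, symbol (1+x)(2-2cos θ)
    P = tridiag_toeplitz(n, 1.0, 3.0)                     # HPD, symbol 3 + 2cos θ >= 1
    w, V = np.linalg.eigh(P)
    P_inv_half = V @ np.diag(w ** -0.5) @ V.T             # P^{-1/2} via the spectral decomposition
    lam_sym = np.linalg.eigvalsh(P_inv_half @ A @ P_inv_half)
    lam_pre = np.sort(np.linalg.eigvals(np.linalg.solve(P, A)).real)
    return lam_pre, lam_sym

lam_pre, lam_sym = preconditioned_spectra(200)
print("max eigenvalue mismatch:", np.max(np.abs(lam_pre - lam_sym)))
```

Although $P_n^{-1}A_n$ is not symmetric, its computed spectrum should agree with that of the symmetrized matrix up to rounding, which is exactly what the similarity (7.1) predicts.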
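7.1.2 Multilevel Arrow-Shaped Sampling Matrices
If $\mathbf n \in \mathbb N^d$ and $a : [0,1]^d \to \mathbb C$, the $\mathbf n$th (d-level) arrow-shaped sampling matrix generated by $a$ is denoted by $S_{\mathbf n}(a)$ and is defined as the following symmetric matrix of size $N(\mathbf n)$:
\[ (S_{\mathbf n}(a))_{i,j} = (D_{\mathbf n}(a))_{i\wedge j,\,i\wedge j}, \qquad i,j = 1,\ldots,N(\mathbf n). \tag{7.2} \]
In multi-index notation, we have
\[ (S_{\mathbf n}(a))_{\mathbf i\mathbf j} = (D_{\mathbf n}(a))_{\mathbf i\wedge\mathbf j,\,\mathbf i\wedge\mathbf j} = a\Bigl(\frac{\mathbf i\wedge\mathbf j}{\mathbf n}\Bigr), \qquad \mathbf i,\mathbf j = \mathbf 1,\ldots,\mathbf n, \tag{7.3} \]
that is,
\[ S_{\mathbf n}(a) = \Bigl[a\Bigl(\frac{\mathbf i\wedge\mathbf j}{\mathbf n}\Bigr)\Bigr]_{\mathbf i,\mathbf j=\mathbf 1}^{\mathbf n}. \tag{7.4} \]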
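The motivation of the adjective “arrow-shaped” lies in the shape of the 1-level version of (7.4), as explained in [22, p. 190]. The next theorem is the multivariate version of [22, Theorem 10.4].
Theorem 7.2 Let $a : [0,1]^d \to \mathbb C$ be continuous and let $f(\boldsymbol\theta) = \sum_{\mathbf j=-\mathbf r}^{\mathbf r} f_{\mathbf j}\,{\rm e}^{{\rm i}\mathbf j\cdot\boldsymbol\theta}$ be a d-variate trigonometric polynomial. Then,
\[ \|S_{\mathbf n}(a)\circ T_{\mathbf n}(f) - D_{\mathbf n}(a)T_{\mathbf n}(f)\| \le (2|\mathbf r|_\infty+1)^d\,\|f\|_\infty\,\omega_a\Bigl(\frac{|\mathbf r|_\infty}{\min(\mathbf n)}\Bigr) \tag{7.5} \]
for every $\mathbf n \in \mathbb N^d$,
\[ \|S_{\mathbf n}(a)\circ T_{\mathbf n}(f)\| \le C \tag{7.6} \]
for every $\mathbf n \in \mathbb N^d$ and for some constant $C$ independent of $\mathbf n$, and
\[ \{S_{\mathbf n}(a)\circ T_{\mathbf n}(f)\}_n \sim_{\rm GLT} a(\mathbf x)f(\boldsymbol\theta) \tag{7.7} \]
for every sequence $\{\mathbf n = \mathbf n(n)\}_n \subseteq \mathbb N^d$ such that $\mathbf n \to \infty$ as $n \to \infty$.
Proof For $\mathbf i,\mathbf j = \mathbf 1,\ldots,\mathbf n$, we have the following.
• If $|\mathbf i-\mathbf j|_\infty > |\mathbf r|_\infty$, then the Fourier coefficient $f_{\mathbf i-\mathbf j}$ is zero and, consequently,
\[ (S_{\mathbf n}(a)\circ T_{\mathbf n}(f))_{\mathbf i\mathbf j} = (S_{\mathbf n}(a))_{\mathbf i\mathbf j}(T_{\mathbf n}(f))_{\mathbf i\mathbf j} = a\Bigl(\frac{\mathbf i\wedge\mathbf j}{\mathbf n}\Bigr)f_{\mathbf i-\mathbf j} = 0, \]
\[ (D_{\mathbf n}(a)T_{\mathbf n}(f))_{\mathbf i\mathbf j} = (D_{\mathbf n}(a))_{\mathbf i\mathbf i}(T_{\mathbf n}(f))_{\mathbf i\mathbf j} = a\Bigl(\frac{\mathbf i}{\mathbf n}\Bigr)f_{\mathbf i-\mathbf j} = 0. \]
• If $|\mathbf i-\mathbf j|_\infty \le |\mathbf r|_\infty$, then, considering that $|f_{\mathbf i-\mathbf j}| \le \|f\|_\infty$, we have
\[ \bigl|(S_{\mathbf n}(a)\circ T_{\mathbf n}(f))_{\mathbf i\mathbf j} - (D_{\mathbf n}(a)T_{\mathbf n}(f))_{\mathbf i\mathbf j}\bigr| = \Bigl|a\Bigl(\frac{\mathbf i\wedge\mathbf j}{\mathbf n}\Bigr)f_{\mathbf i-\mathbf j} - a\Bigl(\frac{\mathbf i}{\mathbf n}\Bigr)f_{\mathbf i-\mathbf j}\Bigr| \le \|f\|_\infty\Bigl|a\Bigl(\frac{\mathbf i\wedge\mathbf j}{\mathbf n}\Bigr) - a\Bigl(\frac{\mathbf i}{\mathbf n}\Bigr)\Bigr| \le \|f\|_\infty\,\omega_a\Bigl(\Bigl|\frac{\mathbf j}{\mathbf n}-\frac{\mathbf i}{\mathbf n}\Bigr|_\infty\Bigr) \le \|f\|_\infty\,\omega_a\Bigl(\frac{|\mathbf r|_\infty}{\min(\mathbf n)}\Bigr). \]
It follows from the first item that the nonzero entries in each row and column of the matrix $Z_{\mathbf n} = S_{\mathbf n}(a)\circ T_{\mathbf n}(f) - D_{\mathbf n}(a)T_{\mathbf n}(f)$ are at most $(2|\mathbf r|_\infty+1)^d$. Indeed, considering for instance the $\mathbf i$th row, we have $(Z_{\mathbf n})_{\mathbf i\mathbf j} = 0$ whenever $|\mathbf i-\mathbf j|_\infty > |\mathbf r|_\infty$, which means that the only possible nonzero entries of $Z_{\mathbf n}$ in the $\mathbf i$th row are those corresponding to the column multi-indices $\mathbf j \in \{\mathbf 1,\ldots,\mathbf n\}$ such that $|\mathbf i-\mathbf j|_\infty \le |\mathbf r|_\infty$; and the number of all multi-indices $\mathbf j \in \mathbb Z^d$ satisfying $|\mathbf i-\mathbf j|_\infty \le |\mathbf r|_\infty$ is $(2|\mathbf r|_\infty+1)^d$. It follows from the second item that each entry of $Z_{\mathbf n}$ is bounded in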
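modulus by $\|f\|_\infty\,\omega_a\bigl(\frac{|\mathbf r|_\infty}{\min(\mathbf n)}\bigr)$. Thus, $|Z_{\mathbf n}|_1, |Z_{\mathbf n}|_\infty \le (2|\mathbf r|_\infty+1)^d\,\|f\|_\infty\,\omega_a\bigl(\frac{|\mathbf r|_\infty}{\min(\mathbf n)}\bigr)$, and the application of N 1 yields (7.5). Using (7.5) we immediately obtain
\[ \|S_{\mathbf n}(a)\circ T_{\mathbf n}(f)\| \le \|D_{\mathbf n}(a)\|\,\|T_{\mathbf n}(f)\| + \|Z_{\mathbf n}\| \le \|a\|_\infty\|f\|_\infty + (2|\mathbf r|_\infty+1)^d\,\|f\|_\infty\,\omega_a\Bigl(\frac{|\mathbf r|_\infty}{\min(\mathbf n)}\Bigr), \]
which implies (7.6). Finally, for any sequence $\{\mathbf n = \mathbf n(n)\}_n \subseteq \mathbb N^d$ such that $\mathbf n \to \infty$ as $n \to \infty$, we have $\omega_a\bigl(\frac{|\mathbf r|_\infty}{\min(\mathbf n)}\bigr) \to 0$ as $n \to \infty$, and so $\{Z_{\mathbf n}\}_n \sim_\sigma 0$ by (7.5) and Z 1 (or Z 2). Thus, (7.7) follows from the decomposition $S_{\mathbf n}(a)\circ T_{\mathbf n}(f) = D_{\mathbf n}(a)T_{\mathbf n}(f) + Z_{\mathbf n}$ in combination with GLT 3 and GLT 4.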
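The bound (7.5) can be checked on a small example. The sketch below is restricted to d = 1 and rests on illustrative assumptions that do not come from the text: the choices $a(x) = x^2$ and $f(\theta) = 2-2\cos\theta$ (so that $r = 1$ and $\|f\|_\infty = 4$) and the function name `arrow_bound_check`. For this $a$, the modulus of continuity on $[0,1]$ is $\omega_a(\delta) = 2\delta-\delta^2$.

```python
import numpy as np

def arrow_bound_check(n):
    """Return (||S_n(a) o T_n(f) - D_n(a) T_n(f)||, right-hand side of (7.5)) for d = 1."""
    a = lambda x: x ** 2                                   # continuous on [0, 1]
    idx = np.arange(1, n + 1)
    S = a(np.minimum.outer(idx, idx) / n)                  # (S_n(a))_{ij} = a(min(i, j)/n)
    D = np.diag(a(idx / n))                                # D_n(a) = diag a(i/n)
    T = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # T_n(2 - 2cos θ), r = 1
    Z = S * T - D @ T                                      # Hadamard product minus matrix product
    lhs = np.linalg.norm(Z, 2)                             # spectral norm
    omega = 2.0 / n - 1.0 / n ** 2                         # ω_a(1/n) for a(x) = x^2
    rhs = (2 * 1 + 1) * 4.0 * omega                        # (2r + 1)^d ||f||_∞ ω_a(r/n)
    return lhs, rhs

for n in (10, 100, 1000):
    print(n, arrow_bound_check(n))
```

In this example both sides decay like $1/n$, in agreement with the fact that $\{Z_n\}_n$ is zero-distributed.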
7.2 Applications to PDE Discretizations: An Introduction
In the next sections we extend to the d-dimensional setting the GLT analysis carried out in [22, Sects. 10.5–10.7] for sequences of matrices arising from the discretization of unidimensional differential equations. More precisely, Sects. 7.3, 7.4, 7.5, 7.6, 7.7 are the d-dimensional versions of, respectively, Sects. 10.5.2, 10.6.1, 10.7.1, 10.7.2, 10.7.3 from [22]. The main observation is that no substantial differences are encountered when passing from 1 to d dimensions. In other words, all the main “GLT ideas” have already emerged in the unidimensional setting, and the GLT analysis of Sects. 7.3–7.7 is conceptually the same as the GLT analysis of the corresponding unidimensional subsections mentioned above. However, the d-dimensional case involves a lot of technical difficulties that are not visible in one dimension, and in order to gain familiarity with such technicalities it is necessary to see them in some detail. The most important of them is certainly the multi-index language, which allows one to tackle a d-dimensional GLT analysis by essentially maintaining the unidimensional notation, at the sole price of turning some letters ($n$, $p$, $i$, $j$, etc.) into boldface ($\mathbf n$, $\mathbf p$, $\mathbf i$, $\mathbf j$, etc.). Before going into Sects. 7.3–7.7, we outline here the main ideas of a d-dimensional GLT analysis. Consider, for example, the general d-dimensional linear second-order PDE
\[ -\sum_{\ell,k=1}^d a_{\ell k}\frac{\partial^2u}{\partial x_\ell\partial x_k} + \sum_{k=1}^d b_k\frac{\partial u}{\partial x_k} + cu = f \quad\iff\quad -\mathbf 1(A\circ Hu)\mathbf 1^T + \mathbf b\cdot\nabla u + cu = f, \tag{7.8} \]
where $a_{\ell k}$, $b_k$, $c$, $f$ are given functions, $A = [a_{\ell k}]_{\ell,k=1}^d$, $\mathbf b = [b_k]_{k=1}^d$, and $Hu$ is the Hessian of $u$,
\[ Hu = \Bigl[\frac{\partial^2u}{\partial x_\ell\partial x_k}\Bigr]_{\ell,k=1}^d. \]
Assume we discretize (7.8) by a standard numerical method, such as, for instance, a FD scheme. The resulting discretization matrices $A_{\mathbf n}$ are parameterized by a d-index $\mathbf n = (n_1,\ldots,n_d)$, where $n_i$ is related to the discretization step $h_i$ in the $i$th direction, and $n_i \to \infty$ if and only if $h_i \to 0$ (usually, we have $h_i \approx 1/n_i$). By choosing each $n_i$ as a function of a unique discretization parameter $n \in \mathbb N$, as normally happens in practice, where the most natural choice is $n_i = n$ for all $i = 1,\ldots,d$, we see that $\mathbf n = \mathbf n(n)$ and, consequently, $\{A_{\mathbf n}\}_n$ is a (d-level) matrix-sequence. The matrix $A_{\mathbf n}$ can be decomposed according to the terms of the PDE as follows:
\[ A_{\mathbf n} = \sum_{\ell,k=1}^d K_{\mathbf n,\ell k}(a_{\ell k}) + \sum_{k=1}^d H_{\mathbf n,k}(b_k) + I_{\mathbf n}(c) + R_{\mathbf n} = K_{\mathbf n} + Z_{\mathbf n}, \tag{7.9} \]
where
\[ K_{\mathbf n} = \sum_{\ell,k=1}^d K_{\mathbf n,\ell k}(a_{\ell k}), \tag{7.10} \]
\[ Z_{\mathbf n} = \sum_{k=1}^d H_{\mathbf n,k}(b_k) + I_{\mathbf n}(c) + R_{\mathbf n}, \tag{7.11} \]
$R_{\mathbf n}$ is a small-rank perturbation due to the imposed boundary conditions,¹ and $K_{\mathbf n,\ell k}(a_{\ell k})$, $H_{\mathbf n,k}(b_k)$, $I_{\mathbf n}(c)$ are the matrices resulting from the discretization of the separable differential operators²
\[ -a_{\ell k}\frac{\partial^2u}{\partial x_\ell\partial x_k}, \qquad b_k\frac{\partial u}{\partial x_k}, \qquad cu, \]
respectively. More precisely, $K_{\mathbf n,\ell k}(a_{\ell k})$, $H_{\mathbf n,k}(b_k)$, $I_{\mathbf n}(c)$ are the matrices resulting from the discretization of the three left-hand side terms in the PDE
\[ -a_{\ell k}\frac{\partial^2u}{\partial x_\ell\partial x_k} + b_k\frac{\partial u}{\partial x_k} + cu = f. \]

¹ Note that the boundary conditions (Dirichlet, Neumann, etc.) have not been specified precisely because they only produce a small-rank perturbation $R_{\mathbf n}$ in the resulting discretization matrix $A_{\mathbf n}$; see also the discussion in [22, p. 116] and the 2nd part of [22, Sect. 10.5.2].
² We say that a differential operator is separable if it is obtained by multiplying a given function with a product of partial derivatives. The general separable differential operator can be written as
\[ a\,\frac{\partial^{r_1+\cdots+r_d}u}{\partial x_1^{r_1}\cdots\partial x_d^{r_d}}. \]
An example of a non-separable differential operator is the Laplacian, which, however, can be written (just like any other linear differential operator) as a sum of separable differential operators:
\[ \Delta u = \sum_{k=1}^d \frac{\partial^2u}{\partial x_k^2}. \]
As evidenced by the forthcoming discussion, the discretization of a separable differential operator gives rise to a GLT (actually, a sLT) sequence. For instance, after a suitable normalization that we here ignore, the matrix-sequences $\{K_{\mathbf n,\ell k}(a_{\ell k})\}_n$, $\{H_{\mathbf n,k}(b_k)\}_n$, $\{I_{\mathbf n}(c)\}_n$ are GLT (actually, sLT) sequences. As a consequence, the discretization of an arbitrary linear differential operator (a sum of separable differential operators) gives rise to a sum of GLT (actually, sLT) sequences, i.e., again a GLT sequence.
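It normally turns out that, after a suitable normalization that we ignore in this discussion,
• the matrix-sequence $\{Z_{\mathbf n}\}_n$, which results from the discretization of the lower-order differential operators of the PDE (7.8) and includes the small-rank perturbation due to boundary conditions, is zero-distributed;
• the GLT analysis of $\{A_{\mathbf n}\}_n$ reduces to the GLT analysis of the matrix-sequence $\{K_{\mathbf n}\}_n$ resulting from the discretization of the higher-order differential operator of the PDE (7.8).
In addition, every matrix-sequence $\{K_{\mathbf n,\ell k}(a_{\ell k})\}_n$ appearing in the definition (7.10) of $K_{\mathbf n}$ usually turns out to be a d-level GLT sequence (actually, a d-level sLT sequence) of the form
\[ K_{\mathbf n,\ell k}(a_{\ell k}) = D_{\mathbf n}(a_{\ell k})K_{\mathbf n,\ell k}(1) + Z_{\mathbf n,\ell k}, \qquad \{Z_{\mathbf n,\ell k}\}_n \sim_\sigma 0, \tag{7.12} \]
\[ K_{\mathbf n,\ell k}(1) = T_{\mathbf n}(H_{\ell k}) + Y_{\mathbf n,\ell k}, \qquad \{Y_{\mathbf n,\ell k}\}_n \sim_\sigma 0, \tag{7.13} \]
where $H_{\ell k}$ is a (separable) d-variate trigonometric polynomial. It follows immediately from (7.12)–(7.13) and GLT 3–GLT 4 that
\[ \{K_{\mathbf n,\ell k}(1)\}_n \sim_{\rm GLT} H_{\ell k}(\boldsymbol\theta), \tag{7.14} \]
\[ \{K_{\mathbf n,\ell k}(a_{\ell k})\}_n \sim_{\rm GLT} a_{\ell k}(\mathbf x)H_{\ell k}(\boldsymbol\theta). \tag{7.15} \]
As a consequence,
\[ \{A_{\mathbf n}\}_n \sim_{\rm GLT} \sum_{\ell,k=1}^d a_{\ell k}(\mathbf x)H_{\ell k}(\boldsymbol\theta) = \mathbf 1(A(\mathbf x)\circ H(\boldsymbol\theta))\mathbf 1^T, \tag{7.16} \]
where $H(\boldsymbol\theta) = [H_{\ell k}(\boldsymbol\theta)]_{\ell,k=1}^d$. From (7.16) and GLT 1–GLT 2 one often obtains the distribution relations
\[ \{A_{\mathbf n}\}_n \sim_{\sigma,\lambda} \sum_{\ell,k=1}^d a_{\ell k}(\mathbf x)H_{\ell k}(\boldsymbol\theta) = \mathbf 1(A(\mathbf x)\circ H(\boldsymbol\theta))\mathbf 1^T. \]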
Remark 7.1 $K_{\mathbf n,\ell k}(1)$ is the matrix resulting from the discretization of the left-hand side of the PDE
\[ -\frac{\partial^2u}{\partial x_\ell\partial x_k} = f \]
and it is therefore referred to as the matrix associated with the discretization of the second derivative $-\partial^2u/\partial x_\ell\partial x_k$. Thus, in view of (7.14), $H_{\ell k}$ is referred to as the d-variate trigonometric polynomial associated with the discretization of $-\partial^2u/\partial x_\ell\partial x_k$, or simply the “symbol of $-\partial^2u/\partial x_\ell\partial x_k$”. For example, if the considered discretization method is a FD scheme, then $H_{\ell k}$ is the d-variate trigonometric polynomial that represents the FD formula used to discretize $-\partial^2u/\partial x_\ell\partial x_k$. The latter assertion will become clearer after reading Sect. 7.3.
Remark 7.2 (formal structure of the symbol and symbol of the negative Hessian operator) The formal analogy between the expression of the symbol $\mathbf 1(A(\mathbf x)\circ H(\boldsymbol\theta))\mathbf 1^T$ and the expression of the higher-order differential operator $-\mathbf 1(A\circ Hu)\mathbf 1^T$ in (7.8) is impressive! Because of this analogy, and especially because of (7.14) and Remark 7.1, the matrix $H(\boldsymbol\theta)$ in the so-called “Fourier variables” $\boldsymbol\theta = (\theta_1,\ldots,\theta_d)$ is usually referred to as the “symbol of the negative Hessian operator”, although this terminology is clearly not rigorous from a mathematical viewpoint. If we change the numerical method for the discretization of (7.8), the symbol $\mathbf 1(A(\mathbf x)\circ H(\boldsymbol\theta))\mathbf 1^T$ remains the same except for the matrix $H(\boldsymbol\theta)$, which changes according to the new method. For example, if we switch from a FD scheme to another, the symbol of the negative Hessian operator switches from $H(\boldsymbol\theta) = [H_{\ell k}(\boldsymbol\theta)]_{\ell,k=1}^d$ to $\tilde H(\boldsymbol\theta) = [\tilde H_{\ell k}(\boldsymbol\theta)]_{\ell,k=1}^d$, where $\tilde H_{\ell k}$ is the (separable) d-variate trigonometric polynomial associated with the new FD formula used to discretize $-\partial^2u/\partial x_\ell\partial x_k$. We invite the reader to compare Remarks 7.1 and 7.2 with their univariate analog [22, Remark 10.1] and with the three paragraphs which introduce the 1st, 3rd and 4th part of [22, Sect. 10.5.2].
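7.3 FD Discretization of Convection-Diffusion-Reaction PDEs
Consider the convection-diffusion-reaction problem
\[ \begin{cases} -\nabla\cdot A\nabla u + \mathbf b\cdot\nabla u + cu = f, & \text{in } (0,1)^d,\\ u = 0, & \text{on } \partial((0,1)^d), \end{cases} \iff \begin{cases} -\displaystyle\sum_{\ell,k=1}^d \frac{\partial}{\partial x_\ell}\Bigl(a_{\ell k}\frac{\partial u}{\partial x_k}\Bigr) + \sum_{k=1}^d b_k\frac{\partial u}{\partial x_k} + cu = f, & \text{in } (0,1)^d,\\ u = 0, & \text{on } \partial((0,1)^d), \end{cases} \tag{7.17} \]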
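where $a_{\ell k}$, $b_k$, $c$, $f$ are given functions, $A = [a_{\ell k}]_{\ell,k=1}^d$ and $\mathbf b = [b_k]_{k=1}^d$.
FD discretization. Problem (7.17) can be reformulated as follows:
\[ \begin{cases} -\mathbf 1(A\circ Hu)\mathbf 1^T + \mathbf s\cdot\nabla u + cu = f, & \text{in } (0,1)^d,\\ u = 0, & \text{on } \partial((0,1)^d), \end{cases} \iff \begin{cases} -\displaystyle\sum_{\ell,k=1}^d a_{\ell k}\frac{\partial^2u}{\partial x_\ell\partial x_k} + \sum_{k=1}^d s_k\frac{\partial u}{\partial x_k} + cu = f, & \text{in } (0,1)^d,\\ u = 0, & \text{on } \partial((0,1)^d), \end{cases} \tag{7.18} \]
where $Hu$ is the Hessian of $u$,
\[ (Hu)_{\ell k} = \frac{\partial^2u}{\partial x_\ell\partial x_k}, \qquad \ell,k = 1,\ldots,d, \]
and $\mathbf s$ collects the coefficients of the first-order derivatives,
\[ s_k = b_k - \sum_{\ell=1}^d \frac{\partial a_{\ell k}}{\partial x_\ell}, \qquad k = 1,\ldots,d. \]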
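We consider the classical central FD discretization of (7.18). We choose $\mathbf n \in \mathbb N^d$ and we set $\mathbf h = \frac{\mathbf 1}{\mathbf n+\mathbf 1}$ and $\mathbf x_{\mathbf j} = \mathbf j\mathbf h$ for $\mathbf j = \mathbf 0,\ldots,\mathbf n+\mathbf 1$.³ Let $\mathbf e_k$ be the $k$th vector of the canonical basis of $\mathbb R^d$. For $\mathbf j = \mathbf 1,\ldots,\mathbf n$, we have
\[ a_{kk}\frac{\partial^2u}{\partial x_k^2}\Big|_{\mathbf x=\mathbf x_{\mathbf j}} \approx a_{kk}(\mathbf x_{\mathbf j})\frac{u(\mathbf x_{\mathbf j}+h_k\mathbf e_k) - 2u(\mathbf x_{\mathbf j}) + u(\mathbf x_{\mathbf j}-h_k\mathbf e_k)}{h_k^2} = a_{kk}(\mathbf x_{\mathbf j})\frac{u(\mathbf x_{\mathbf j+\mathbf e_k}) - 2u(\mathbf x_{\mathbf j}) + u(\mathbf x_{\mathbf j-\mathbf e_k})}{h_k^2} \tag{7.19} \]
for $k = 1,\ldots,d$,
\[ a_{\ell k}\frac{\partial^2u}{\partial x_\ell\partial x_k}\Big|_{\mathbf x=\mathbf x_{\mathbf j}} \approx a_{\ell k}(\mathbf x_{\mathbf j})\frac{\frac{\partial u}{\partial x_\ell}(\mathbf x_{\mathbf j}+h_k\mathbf e_k) - \frac{\partial u}{\partial x_\ell}(\mathbf x_{\mathbf j}-h_k\mathbf e_k)}{2h_k} \]
\[ \approx a_{\ell k}(\mathbf x_{\mathbf j})\frac{1}{2h_k}\Bigl[\frac{u(\mathbf x_{\mathbf j}+h_k\mathbf e_k+h_\ell\mathbf e_\ell) - u(\mathbf x_{\mathbf j}+h_k\mathbf e_k-h_\ell\mathbf e_\ell)}{2h_\ell} - \frac{u(\mathbf x_{\mathbf j}-h_k\mathbf e_k+h_\ell\mathbf e_\ell) - u(\mathbf x_{\mathbf j}-h_k\mathbf e_k-h_\ell\mathbf e_\ell)}{2h_\ell}\Bigr] \]
\[ = a_{\ell k}(\mathbf x_{\mathbf j})\frac{u(\mathbf x_{\mathbf j+\mathbf e_k+\mathbf e_\ell}) - u(\mathbf x_{\mathbf j+\mathbf e_k-\mathbf e_\ell}) - u(\mathbf x_{\mathbf j-\mathbf e_k+\mathbf e_\ell}) + u(\mathbf x_{\mathbf j-\mathbf e_k-\mathbf e_\ell})}{4h_\ell h_k} \tag{7.20} \]
for $\ell,k = 1,\ldots,d$ with $\ell \ne k$,
\[ s_k\frac{\partial u}{\partial x_k}\Big|_{\mathbf x=\mathbf x_{\mathbf j}} \approx s_k(\mathbf x_{\mathbf j})\frac{u(\mathbf x_{\mathbf j}+h_k\mathbf e_k) - u(\mathbf x_{\mathbf j}-h_k\mathbf e_k)}{2h_k} = s_k(\mathbf x_{\mathbf j})\frac{u(\mathbf x_{\mathbf j+\mathbf e_k}) - u(\mathbf x_{\mathbf j-\mathbf e_k})}{2h_k} \tag{7.21} \]
for $k = 1,\ldots,d$,
\[ cu|_{\mathbf x=\mathbf x_{\mathbf j}} = c(\mathbf x_{\mathbf j})u(\mathbf x_{\mathbf j}). \tag{7.22} \]

³ Recall that operations involving d-indices that have no meaning in $\mathbb Z^d$ must be interpreted in the componentwise sense. In the present case, given $\mathbf n = (n_1,\ldots,n_d)$ and $\mathbf j = (j_1,\ldots,j_d)$, the vector of discretization steps $\mathbf h = \frac{\mathbf 1}{\mathbf n+\mathbf 1}$ and the grid point $\mathbf x_{\mathbf j} = \mathbf j\mathbf h$ are given by $\mathbf h = (\frac{1}{n_1+1},\ldots,\frac{1}{n_d+1}) = (h_1,\ldots,h_d)$ and $\mathbf x_{\mathbf j} = (j_1h_1,\ldots,j_dh_d)$.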
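Thus, for every $\mathbf j = \mathbf 0,\ldots,\mathbf n+\mathbf 1$, we approximate the evaluation $u(\mathbf x_{\mathbf j})$ of the solution of (7.18) at the grid point $\mathbf x_{\mathbf j}$ by the value $u_{\mathbf j}$, where $u_{\mathbf j} = 0$ for $\mathbf j \notin \{\mathbf 1,\ldots,\mathbf n\}$ and the vector $\mathbf u = (u_{\mathbf 1},\ldots,u_{\mathbf n})^T$ is the solution of the linear system
\[ -\sum_{k=1}^d a_{kk}(\mathbf x_{\mathbf j})\frac{u_{\mathbf j+\mathbf e_k} - 2u_{\mathbf j} + u_{\mathbf j-\mathbf e_k}}{h_k^2} - \sum_{\substack{\ell,k=1\\ \ell\ne k}}^d a_{\ell k}(\mathbf x_{\mathbf j})\frac{u_{\mathbf j+\mathbf e_k+\mathbf e_\ell} - u_{\mathbf j+\mathbf e_k-\mathbf e_\ell} - u_{\mathbf j-\mathbf e_k+\mathbf e_\ell} + u_{\mathbf j-\mathbf e_k-\mathbf e_\ell}}{4h_\ell h_k} + \sum_{k=1}^d s_k(\mathbf x_{\mathbf j})\frac{u_{\mathbf j+\mathbf e_k} - u_{\mathbf j-\mathbf e_k}}{2h_k} + c(\mathbf x_{\mathbf j})u_{\mathbf j} = f(\mathbf x_{\mathbf j}), \qquad \mathbf j = \mathbf 1,\ldots,\mathbf n. \tag{7.23} \]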
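The matrix $A_{\mathbf n}$ associated with this linear system admits the following natural decomposition:
\[ A_{\mathbf n} = \sum_{\ell,k=1}^d K_{\mathbf n,\ell k}(a_{\ell k}) + \sum_{k=1}^d H_{\mathbf n,k}(s_k) + I_{\mathbf n}(c), \]
where
\[ K_{\mathbf n,\ell k}(a_{\ell k}) = \frac{1}{h_\ell h_k}\Bigl(\mathop{\rm diag}_{\mathbf j=\mathbf 1,\ldots,\mathbf n} a_{\ell k}(\mathbf x_{\mathbf j})\Bigr)K_{\mathbf n,\ell k}, \quad \ell,k = 1,\ldots,d, \]
\[ H_{\mathbf n,k}(s_k) = \frac{1}{h_k}\Bigl(\mathop{\rm diag}_{\mathbf j=\mathbf 1,\ldots,\mathbf n} s_k(\mathbf x_{\mathbf j})\Bigr)H_{\mathbf n,k}, \quad k = 1,\ldots,d, \tag{7.24} \]
\[ I_{\mathbf n}(c) = \Bigl(\mathop{\rm diag}_{\mathbf j=\mathbf 1,\ldots,\mathbf n} c(\mathbf x_{\mathbf j})\Bigr)I_{\mathbf n}, \]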
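and the matrices $K_{\mathbf n,\ell k}$, $H_{\mathbf n,k}$, $I_{\mathbf n}$ are defined by their actions on a generic vector $\mathbf u \in \mathbb R^{N(\mathbf n)}$, as follows:
\[ (K_{\mathbf n,kk}\mathbf u)_{\mathbf j} = -u_{\mathbf j-\mathbf e_k} + 2u_{\mathbf j} - u_{\mathbf j+\mathbf e_k}, \qquad \mathbf j = \mathbf 1,\ldots,\mathbf n, \tag{7.25} \]
for $k = 1,\ldots,d$,
\[ (K_{\mathbf n,\ell k}\mathbf u)_{\mathbf j} = -\frac14\bigl(u_{\mathbf j-\mathbf e_\ell-\mathbf e_k} - u_{\mathbf j-\mathbf e_\ell+\mathbf e_k} - u_{\mathbf j+\mathbf e_\ell-\mathbf e_k} + u_{\mathbf j+\mathbf e_\ell+\mathbf e_k}\bigr), \qquad \mathbf j = \mathbf 1,\ldots,\mathbf n, \tag{7.26} \]
for $\ell,k = 1,\ldots,d$ with $\ell \ne k$,
\[ (H_{\mathbf n,k}\mathbf u)_{\mathbf j} = \frac12\bigl(-u_{\mathbf j-\mathbf e_k} + u_{\mathbf j+\mathbf e_k}\bigr), \qquad \mathbf j = \mathbf 1,\ldots,\mathbf n, \tag{7.27} \]
for $k = 1,\ldots,d$,
\[ (I_{\mathbf n}\mathbf u)_{\mathbf j} = u_{\mathbf j}, \qquad \mathbf j = \mathbf 1,\ldots,\mathbf n. \tag{7.28} \]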
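In (7.25)–(7.28), just like in (7.23), $u_{\mathbf i} = 0$ whenever $\mathbf i \notin \{\mathbf 1,\ldots,\mathbf n\}$.
FD discretization matrices. Thanks to the multi-index language, we are able to provide a compact and easy-to-manage expression for the matrices (7.25)–(7.28) (and hence also for the FD discretization matrix $A_{\mathbf n}$).
Lemma 7.1 For every $\mathbf n \in \mathbb N^d$, we have
\[ K_{\mathbf n,kk} = \Bigl(\bigotimes_{r=1}^{k-1} I_{n_r}\Bigr) \otimes K_{n_k} \otimes \Bigl(\bigotimes_{r=k+1}^{d} I_{n_r}\Bigr) \tag{7.29} \]
for $k = 1,\ldots,d$,
\[ K_{\mathbf n,\ell k} = K_{\mathbf n,k\ell} = -\Bigl(\bigotimes_{r=1}^{\ell-1} I_{n_r}\Bigr) \otimes H_{n_\ell} \otimes \Bigl(\bigotimes_{r=\ell+1}^{k-1} I_{n_r}\Bigr) \otimes H_{n_k} \otimes \Bigl(\bigotimes_{r=k+1}^{d} I_{n_r}\Bigr) \tag{7.30} \]
for $1 \le \ell < k \le d$,
\[ H_{\mathbf n,k} = \Bigl(\bigotimes_{r=1}^{k-1} I_{n_r}\Bigr) \otimes H_{n_k} \otimes \Bigl(\bigotimes_{r=k+1}^{d} I_{n_r}\Bigr) \tag{7.31} \]
for $k = 1,\ldots,d$, and
\[ I_{\mathbf n} = \bigotimes_{r=1}^{d} I_{n_r} = I_{N(\mathbf n)}, \tag{7.32} \]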
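where the matrices $K_n$, $H_n$ are defined for all $n$ as follows:
\[ K_n = \begin{bmatrix} 2 & -1 & & & \\ -1 & 2 & -1 & & \\ & \ddots & \ddots & \ddots & \\ & & -1 & 2 & -1 \\ & & & -1 & 2 \end{bmatrix} = T_n(2-2\cos\theta), \qquad H_n = \frac12\begin{bmatrix} 0 & 1 & & & \\ -1 & 0 & 1 & & \\ & \ddots & \ddots & \ddots & \\ & & -1 & 0 & 1 \\ & & & -1 & 0 \end{bmatrix} = -{\rm i}\,T_n(\sin\theta). \]
Proof We only prove (7.29) as the proofs of (7.30) and (7.31) are completely analogous, while (7.32) is obvious from (7.28). Let $\delta_{ij} = 1$ if $i = j$ and $\delta_{ij} = 0$ otherwise. By the crucial property P 7, for every $\mathbf u = [u_{\boldsymbol\ell}]_{\boldsymbol\ell=\mathbf 1}^{\mathbf n} \in \mathbb R^{N(\mathbf n)}$ and every $\mathbf j = \mathbf 1,\ldots,\mathbf n$,
\[ \Bigl[\Bigl(\Bigl(\bigotimes_{r=1}^{k-1} I_{n_r}\Bigr) \otimes K_{n_k} \otimes \Bigl(\bigotimes_{r=k+1}^{d} I_{n_r}\Bigr)\Bigr)\mathbf u\Bigr]_{\mathbf j} = \sum_{\boldsymbol\ell=\mathbf 1}^{\mathbf n}\Bigl(\Bigl(\bigotimes_{r=1}^{k-1} I_{n_r}\Bigr) \otimes K_{n_k} \otimes \Bigl(\bigotimes_{r=k+1}^{d} I_{n_r}\Bigr)\Bigr)_{\mathbf j\boldsymbol\ell} u_{\boldsymbol\ell} \]
\[ = \sum_{\boldsymbol\ell=\mathbf 1}^{\mathbf n} u_{\boldsymbol\ell}\,(K_{n_k})_{j_k\ell_k}\prod_{\substack{r=1\\ r\ne k}}^d (I_{n_r})_{j_r\ell_r} = \sum_{\boldsymbol\ell=\mathbf 1}^{\mathbf n} u_{\boldsymbol\ell}\,(K_{n_k})_{j_k\ell_k}\prod_{\substack{r=1\\ r\ne k}}^d \delta_{j_r\ell_r} = -u_{\mathbf j-\mathbf e_k} + 2u_{\mathbf j} - u_{\mathbf j+\mathbf e_k} = (K_{\mathbf n,kk}\mathbf u)_{\mathbf j}, \]
where the second-to-last equality is due to the fact that, when $\boldsymbol\ell$ varies from $\mathbf 1$ to $\mathbf n$,
\[ (K_{n_k})_{j_k\ell_k}\prod_{\substack{r=1\\ r\ne k}}^d \delta_{j_r\ell_r} = \begin{cases} -1,\ 2,\ -1, & \text{for } \boldsymbol\ell = \mathbf j-\mathbf e_k,\ \mathbf j,\ \mathbf j+\mathbf e_k, \text{ respectively},\\ 0, & \text{otherwise}. \end{cases} \]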
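Lemma 7.1 lends itself to a direct numerical check. The following sketch, limited to d = 2, compares the stencil actions (7.25)–(7.26) with the Kronecker-product expressions (7.29)–(7.30). The function names `K1`, `H1`, `stencil_actions` and `kron_matrices` are hypothetical, and the row-major (lexicographic) flattening of the grid values is an implementation assumption consistent with the ordering of multi-indices used in the text.

```python
import numpy as np

def K1(n):
    """Univariate K_n = T_n(2 - 2cos θ)."""
    return 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

def H1(n):
    """Univariate H_n = (1/2) tridiag(-1, 0, 1) = -i T_n(sin θ)."""
    return 0.5 * (np.eye(n, k=1) - np.eye(n, k=-1))

def stencil_actions(U):
    """Apply (7.25) for k = 1, 2 and (7.26) for (l, k) = (1, 2) to grid values U (zero outside the grid)."""
    P = np.pad(U, 1)                                   # zero padding encodes u_i = 0 outside {1, ..., n}
    c = P[1:-1, 1:-1]
    k11 = -P[:-2, 1:-1] + 2 * c - P[2:, 1:-1]
    k22 = -P[1:-1, :-2] + 2 * c - P[1:-1, 2:]
    k12 = -0.25 * (P[:-2, :-2] - P[:-2, 2:] - P[2:, :-2] + P[2:, 2:])
    return k11, k22, k12

def kron_matrices(n1, n2):
    """Kronecker expressions (7.29)-(7.30) for d = 2."""
    I1, I2 = np.eye(n1), np.eye(n2)
    return np.kron(K1(n1), I2), np.kron(I1, K1(n2)), -np.kron(H1(n1), H1(n2))

rng = np.random.default_rng(0)
U = rng.standard_normal((4, 5))                        # n = (4, 5)
u = U.reshape(-1)                                      # lexicographic ordering of the multi-indices
for M, S in zip(kron_matrices(4, 5), stencil_actions(U)):
    print(np.allclose(M @ u, S.reshape(-1)))
```

The same comparison extends to larger d by chaining `np.kron`, at the cost of a more elaborate stencil routine.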
Remark 7.3 $K_n$, $H_n$, $I_n$ are the diffusion, convection, reaction matrices resulting from the classical central FD discretization of the univariate problem
\[ \begin{cases} -u''(x) + u'(x) + u(x) = f(x), & x \in (0,1),\\ u(0) = u(1) = 0. \end{cases} \]
In other words, $K_n$, $H_n$, $I_n$ are the matrices resulting from the FD discretization of, respectively, the negative second derivative $-u''(x)$, the first derivative $u'(x)$, the identity operator $u(x)$. To see this, follow the above derivation of the FD discretization matrices in the univariate case $d = 1$ or take a look at the 3rd part of [22, Sect. 10.5.2]. Considering that $K_{\mathbf n,kk} = h_k^2K_{\mathbf n,kk}(1)$ is the matrix resulting from the FD discretization of the d-variate problem
\[ \begin{cases} -\dfrac{\partial^2u}{\partial x_k^2} = f, & \text{in } (0,1)^d,\\ u = 0, & \text{on } \partial((0,1)^d), \end{cases} \]
that is, the matrix associated with the FD discretization of the negative second derivative $-\partial^2u/\partial x_k^2$, the relationship between $K_n$, $I_n$ and $K_{\mathbf n,kk}$ is immediately clear from (7.29). Similar considerations also apply to $K_{\mathbf n,\ell k}$ with $\ell \ne k$, $H_{\mathbf n,k}$ and $I_{\mathbf n}$.
Remark 7.4 It follows from Lemma 7.1 and T 8 that
\[ K_{\mathbf n,kk} = T_{\mathbf n}(2-2\cos\theta_k), \qquad k = 1,\ldots,d, \tag{7.33} \]
\[ K_{\mathbf n,\ell k} = T_{\mathbf n}(\sin\theta_\ell\sin\theta_k), \qquad \ell,k = 1,\ldots,d, \quad \ell \ne k, \tag{7.34} \]
\[ H_{\mathbf n,k} = -{\rm i}\,T_{\mathbf n}(\sin\theta_k), \qquad k = 1,\ldots,d. \tag{7.35} \]
In particular,
\[ K_{\mathbf n,\ell k} = T_{\mathbf n}(H_{\ell k}), \qquad \ell,k = 1,\ldots,d, \tag{7.36} \]
where $H(\boldsymbol\theta)$ is the $d \times d$ symmetric matrix defined as follows:
\[ H_{\ell k}(\boldsymbol\theta) = \begin{cases} 2-2\cos\theta_k, & \text{if } \ell = k,\\ \sin\theta_\ell\sin\theta_k, & \text{if } \ell \ne k. \end{cases} \tag{7.37} \]
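The matrix $H(\boldsymbol\theta)$ in (7.37) is easy to explore numerically. The sketch below builds it for an arbitrary point $\boldsymbol\theta$ and samples its smallest eigenvalue; the function name `fd_hessian_symbol` and the sampling in d = 3 are illustrative assumptions. Nonnegativity is to be expected because $H(\boldsymbol\theta) = {\rm diag}\bigl((1-\cos\theta_k)^2\bigr)_{k=1}^d + \mathbf s\mathbf s^T$ with $\mathbf s = (\sin\theta_1,\ldots,\sin\theta_d)^T$, which follows from the elementary identity $2-2\cos\theta-\sin^2\theta = (1-\cos\theta)^2$.

```python
import numpy as np

def fd_hessian_symbol(theta):
    """Return the d x d matrix H(θ) of (7.37) for a point θ in [-π, π]^d."""
    theta = np.asarray(theta, dtype=float)
    s = np.sin(theta)
    H = np.outer(s, s)                                   # off-diagonal entries sin θ_l sin θ_k
    np.fill_diagonal(H, 2.0 - 2.0 * np.cos(theta))       # diagonal entries 2 - 2cos θ_k
    return H

rng = np.random.default_rng(0)
samples = rng.uniform(-np.pi, np.pi, size=(1000, 3))     # random points in [-π, π]^3
min_eigs = [np.linalg.eigvalsh(fd_hessian_symbol(t))[0] for t in samples]
print("smallest sampled eigenvalue:", min(min_eigs))
```

When some component of $\boldsymbol\theta$ vanishes, the corresponding row and column of $H(\boldsymbol\theta)$ vanish as well, so the matrix is singular at such points.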
Since $K_{\mathbf n,\ell k}(1) = (h_\ell h_k)^{-1}K_{\mathbf n,\ell k}$, according to (7.36) and the discussion in Sect. 7.2 (see in particular (7.14)), we may predict that $H(\boldsymbol\theta)$ is, up to some normalization, the symbol of the negative Hessian operator. For instance, assuming $\mathbf n+\mathbf 1 = \boldsymbol\nu n$ for some fixed vector $\boldsymbol\nu \in \mathbb Q^d$ with positive components, from (7.36), GLT 3 and GLT 4 we infer that
\[ \{n^{-2}K_{\mathbf n,\ell k}(1)\}_n = \{\nu_\ell\nu_kT_{\mathbf n}(H_{\ell k})\}_n \sim_{\rm GLT} \nu_\ell\nu_kH_{\ell k}(\boldsymbol\theta) = H^{(\boldsymbol\nu)}_{\ell k}(\boldsymbol\theta), \tag{7.38} \]
where $H^{(\boldsymbol\nu)}(\boldsymbol\theta) = {\rm diag}(\boldsymbol\nu)H(\boldsymbol\theta){\rm diag}(\boldsymbol\nu)$. This means that, assuming $\mathbf n+\mathbf 1 = \boldsymbol\nu n$ and after normalization by $n^{-2}$, the symbol of the negative Hessian operator is $H^{(\boldsymbol\nu)}(\boldsymbol\theta)$, which coincides with $H(\boldsymbol\theta)$ up to a trivial transformation by ${\rm diag}(\boldsymbol\nu)$. With the same argument as in the proof of [17, Theorem 2.2], one can show that the matrix $H(\boldsymbol\theta)$ is SPSD for all $\boldsymbol\theta \in [-\pi,\pi]^d$, and it is SPD for all $\boldsymbol\theta \in [-\pi,\pi]^d$ such that $\theta_1\cdots\theta_d \ne 0$.
GLT analysis of the FD discretization matrices. Using the theory of multilevel GLT sequences, we now derive the spectral and singular value distribution of the sequence of normalized FD discretization matrices $\{n^{-2}A_{\mathbf n}\}_n$ under the assumption
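that $\mathbf n+\mathbf 1 = \boldsymbol\nu n$ for some fixed vector $\boldsymbol\nu$. This assumption essentially says that each stepsize $h_i = \frac{1}{n_i+1}$ tends to 0 with the same asymptotic speed as the others.
Theorem 7.3 Suppose that the following conditions on the PDE coefficients are satisfied:
• for every $\ell,k = 1,\ldots,d$, the function $a_{\ell k} : [0,1]^d \to \mathbb R$ belongs to $C([0,1]^d)$ and its partial derivatives $\partial a_{\ell k}/\partial x_1,\ldots,\partial a_{\ell k}/\partial x_d : [0,1]^d \to \mathbb R$ are bounded;
• for every $k = 1,\ldots,d$, the function $b_k : [0,1]^d \to \mathbb R$ is bounded;
• $c : [0,1]^d \to \mathbb R$ is bounded.
Let $\boldsymbol\nu \in \mathbb Q^d$ be a vector with positive components and assume that $\mathbf n+\mathbf 1 = \boldsymbol\nu n$ (it is understood that $n$ varies in the infinite subset of $\mathbb N$ such that $\mathbf n+\mathbf 1 = \boldsymbol\nu n \in \mathbb N^d$). Then
\[ \{n^{-2}A_{\mathbf n}\}_n \sim_{\rm GLT} f^{(\boldsymbol\nu)}(\mathbf x,\boldsymbol\theta) \tag{7.39} \]
and
\[ \{n^{-2}A_{\mathbf n}\}_n \sim_{\sigma,\lambda} f^{(\boldsymbol\nu)}(\mathbf x,\boldsymbol\theta), \tag{7.40} \]
where
\[ f^{(\boldsymbol\nu)}(\mathbf x,\boldsymbol\theta) = \sum_{\ell,k=1}^d a_{\ell k}(\mathbf x)H^{(\boldsymbol\nu)}_{\ell k}(\boldsymbol\theta) = \mathbf 1(A(\mathbf x)\circ H^{(\boldsymbol\nu)}(\boldsymbol\theta))\mathbf 1^T = \boldsymbol\nu(A(\mathbf x)\circ H(\boldsymbol\theta))\boldsymbol\nu^T, \]
\[ H^{(\boldsymbol\nu)}(\boldsymbol\theta) = {\rm diag}(\boldsymbol\nu)H(\boldsymbol\theta){\rm diag}(\boldsymbol\nu), \]
and $H(\boldsymbol\theta)$ is defined in (7.37).
Proof The proof consists of the following steps. In what follows, the letter $C$ denotes a generic constant independent of $n$. While reading this proof, the reader should keep in mind the relation $\mathbf n+\mathbf 1 = \boldsymbol\nu n$.
Step 1. In view of (7.24), we decompose $n^{-2}A_{\mathbf n}$ as follows:
\[ n^{-2}A_{\mathbf n} = n^{-2}K_{\mathbf n} + n^{-2}Z_{\mathbf n}, \tag{7.41} \]
where
\[ n^{-2}K_{\mathbf n} = n^{-2}\sum_{\ell,k=1}^d \frac{1}{h_\ell h_k}\Bigl(\mathop{\rm diag}_{\mathbf j=\mathbf 1,\ldots,\mathbf n} a_{\ell k}(\mathbf x_{\mathbf j})\Bigr)K_{\mathbf n,\ell k} = \sum_{\ell,k=1}^d \nu_\ell\nu_k\Bigl(\mathop{\rm diag}_{\mathbf j=\mathbf 1,\ldots,\mathbf n} a_{\ell k}(\mathbf x_{\mathbf j})\Bigr)K_{\mathbf n,\ell k} \tag{7.42} \]
is the diffusion matrix, resulting from the FD discretization of the higher-order (diffusion) term in (7.18), while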
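\[ n^{-2}Z_{\mathbf n} = n^{-2}\sum_{k=1}^d \frac{1}{h_k}\Bigl(\mathop{\rm diag}_{\mathbf j=\mathbf 1,\ldots,\mathbf n} s_k(\mathbf x_{\mathbf j})\Bigr)H_{\mathbf n,k} + n^{-2}\Bigl(\mathop{\rm diag}_{\mathbf j=\mathbf 1,\ldots,\mathbf n} c(\mathbf x_{\mathbf j})\Bigr)I_{\mathbf n} = n^{-1}\sum_{k=1}^d \nu_k\Bigl(\mathop{\rm diag}_{\mathbf j=\mathbf 1,\ldots,\mathbf n} s_k(\mathbf x_{\mathbf j})\Bigr)H_{\mathbf n,k} + n^{-2}\mathop{\rm diag}_{\mathbf j=\mathbf 1,\ldots,\mathbf n} c(\mathbf x_{\mathbf j}) \tag{7.43} \]
is the matrix resulting from the FD discretization of the lower-order terms (the convection and reaction terms). We show that
\[ \|n^{-2}K_{\mathbf n}\| \le C, \tag{7.44} \]
\[ \|n^{-2}Z_{\mathbf n}\| \le Cn^{-1}. \tag{7.45} \]
We have
\[ \|n^{-2}K_{\mathbf n}\| = \Bigl\|\sum_{\ell,k=1}^d \nu_\ell\nu_k\Bigl(\mathop{\rm diag}_{\mathbf j=\mathbf 1,\ldots,\mathbf n} a_{\ell k}(\mathbf x_{\mathbf j})\Bigr)K_{\mathbf n,\ell k}\Bigr\| \le \sum_{\ell,k=1}^d \nu_\ell\nu_k\Bigl\|\mathop{\rm diag}_{\mathbf j=\mathbf 1,\ldots,\mathbf n} a_{\ell k}(\mathbf x_{\mathbf j})\Bigr\|\,\|K_{\mathbf n,\ell k}\| \le \sum_{\ell,k=1}^d \nu_\ell\nu_k\|a_{\ell k}\|_\infty\|K_{\mathbf n,\ell k}\| \le 4\sum_{\ell,k=1}^d \nu_\ell\nu_k\|a_{\ell k}\|_\infty = C, \]
where in the last inequality we used the fact that $\|K_{\mathbf n,\ell k}\| \le 4$, which follows from either Lemma 7.1 and P 5 (taking into account that $\|K_n\| \le 4$ and $\|H_n\| \le 1$ for all $n$ by N 1) or (7.36) and T 3 (taking into account that $\|H_{\ell k}\|_\infty \le 4$ for all $\ell,k = 1,\ldots,d$). Note that $\|a_{\ell k}\|_\infty$ is finite because of the assumption that $a_{\ell k} \in C([0,1]^d)$. This completes the proof of (7.44). The proof of (7.45) is analogous:
\[ \|n^{-2}Z_{\mathbf n}\| = \Bigl\|n^{-1}\sum_{k=1}^d \nu_k\Bigl(\mathop{\rm diag}_{\mathbf j=\mathbf 1,\ldots,\mathbf n} s_k(\mathbf x_{\mathbf j})\Bigr)H_{\mathbf n,k} + n^{-2}\mathop{\rm diag}_{\mathbf j=\mathbf 1,\ldots,\mathbf n} c(\mathbf x_{\mathbf j})\Bigr\| \le \sum_{k=1}^d n^{-1}\nu_k\Bigl\|\mathop{\rm diag}_{\mathbf j=\mathbf 1,\ldots,\mathbf n} s_k(\mathbf x_{\mathbf j})\Bigr\|\,\|H_{\mathbf n,k}\| + n^{-2}\Bigl\|\mathop{\rm diag}_{\mathbf j=\mathbf 1,\ldots,\mathbf n} c(\mathbf x_{\mathbf j})\Bigr\| \]
\[ \le \sum_{k=1}^d n^{-1}\nu_k\|s_k\|_\infty\|H_{\mathbf n,k}\| + n^{-2}\|c\|_\infty \le \sum_{k=1}^d n^{-1}\nu_k\|s_k\|_\infty + n^{-2}\|c\|_\infty \le Cn^{-1}, \]
where in the second-to-last inequality we used the fact that $\|H_{\mathbf n,k}\| \le 1$, which follows from either Lemma 7.1 and P 5 (taking into account that $\|H_n\| \le 1$ for all $n$) or (7.35)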
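and T 3 (taking into account that $\|\sin\theta_k\|_\infty \le 1$ for all $k = 1,\ldots,d$). Note that $\|c\|_\infty$ and $\|s_k\|_\infty$ are finite because of the assumption that $c$, $b_k$ and the partial derivatives $\partial a_{\ell k}/\partial x_r$ are bounded.
Step 2. Define the symmetric matrix
\[ n^{-2}\tilde K_{\mathbf n} = n^{-2}\sum_{\ell,k=1}^d \frac{1}{h_\ell h_k}S_{\mathbf n}(a_{\ell k})\circ T_{\mathbf n}(H_{\ell k}) = \sum_{\ell,k=1}^d S_{\mathbf n}(a_{\ell k})\circ T_{\mathbf n}(H^{(\boldsymbol\nu)}_{\ell k}) \tag{7.46} \]
and consider the following decomposition of $n^{-2}A_{\mathbf n}$:
\[ n^{-2}A_{\mathbf n} = n^{-2}\tilde K_{\mathbf n} + (n^{-2}K_{\mathbf n} - n^{-2}\tilde K_{\mathbf n}) + n^{-2}Z_{\mathbf n}. \tag{7.47} \]
By Theorem 7.2 and GLT 4, $\|n^{-2}\tilde K_{\mathbf n}\| \le C$ and $\{n^{-2}\tilde K_{\mathbf n}\}_n \sim_{\rm GLT} f^{(\boldsymbol\nu)}(\mathbf x,\boldsymbol\theta)$. In the next step we show that $n^{-2}\tilde K_{\mathbf n}$ is a symmetric approximation of $n^{-2}K_{\mathbf n}$, in the sense that
\[ \|n^{-2}K_{\mathbf n} - n^{-2}\tilde K_{\mathbf n}\| \to 0. \tag{7.48} \]
Once this is done, the theorem is proved. Indeed, from (7.48) and Step 1 we have $\|n^{-2}K_{\mathbf n} - n^{-2}\tilde K_{\mathbf n} + n^{-2}Z_{\mathbf n}\| \to 0$. Hence, the GLT relation (7.39) follows from (7.47), Z 1 (or Z 2) and GLT 3–GLT 4; the singular value distribution in (7.40) follows from (7.39) and GLT 1; and the spectral distribution in (7.40) follows from (7.39) and GLT 2 applied to the decomposition (7.47), taking into account N 2.
Step 3. To prove (7.48), we use (7.36) and the fact that
\[ \Bigl\|\mathop{\rm diag}_{\mathbf j=\mathbf 1,\ldots,\mathbf n} a(\mathbf x_{\mathbf j}) - D_{\mathbf n}(a)\Bigr\| = \max_{\mathbf j=\mathbf 1,\ldots,\mathbf n}\Bigl|a(\mathbf x_{\mathbf j}) - a\Bigl(\frac{\mathbf j}{\mathbf n}\Bigr)\Bigr| \le \max_{\mathbf j=\mathbf 1,\ldots,\mathbf n}\omega_a\Bigl(\Bigl|\mathbf x_{\mathbf j} - \frac{\mathbf j}{\mathbf n}\Bigr|_\infty\Bigr) \le \omega_a(\max(\mathbf h)) \tag{7.49} \]
for all functions $a \in C([0,1]^d)$. We have
\[ \|n^{-2}K_{\mathbf n} - n^{-2}\tilde K_{\mathbf n}\| = \Bigl\|\sum_{\ell,k=1}^d \nu_\ell\nu_k\Bigl(\mathop{\rm diag}_{\mathbf j=\mathbf 1,\ldots,\mathbf n} a_{\ell k}(\mathbf x_{\mathbf j})\Bigr)K_{\mathbf n,\ell k} - \sum_{\ell,k=1}^d S_{\mathbf n}(a_{\ell k})\circ T_{\mathbf n}(H^{(\boldsymbol\nu)}_{\ell k})\Bigr\| \]
\[ = \Bigl\|\sum_{\ell,k=1}^d \Bigl[\Bigl(\mathop{\rm diag}_{\mathbf j=\mathbf 1,\ldots,\mathbf n} a_{\ell k}(\mathbf x_{\mathbf j})\Bigr)T_{\mathbf n}(H^{(\boldsymbol\nu)}_{\ell k}) - D_{\mathbf n}(a_{\ell k})T_{\mathbf n}(H^{(\boldsymbol\nu)}_{\ell k})\Bigr] + \sum_{\ell,k=1}^d \Bigl[D_{\mathbf n}(a_{\ell k})T_{\mathbf n}(H^{(\boldsymbol\nu)}_{\ell k}) - S_{\mathbf n}(a_{\ell k})\circ T_{\mathbf n}(H^{(\boldsymbol\nu)}_{\ell k})\Bigr]\Bigr\| \]
\[ \le \sum_{\ell,k=1}^d \Bigl\|\mathop{\rm diag}_{\mathbf j=\mathbf 1,\ldots,\mathbf n} a_{\ell k}(\mathbf x_{\mathbf j}) - D_{\mathbf n}(a_{\ell k})\Bigr\|\,\|T_{\mathbf n}(H^{(\boldsymbol\nu)}_{\ell k})\| + \sum_{\ell,k=1}^d \|D_{\mathbf n}(a_{\ell k})T_{\mathbf n}(H^{(\boldsymbol\nu)}_{\ell k}) - S_{\mathbf n}(a_{\ell k})\circ T_{\mathbf n}(H^{(\boldsymbol\nu)}_{\ell k})\|, \]
which tends to 0 by Theorem 7.2 and (7.49), taking into account that $a_{\ell k} \in C([0,1]^d)$ by assumption and $\|T_{\mathbf n}(H^{(\boldsymbol\nu)}_{\ell k})\| \le \|H^{(\boldsymbol\nu)}_{\ell k}\|_\infty$ by T 3.
We remark that the proof of Theorem 7.3 is conceptually analogous to the proof of its univariate version [22, Theorem 10.8] in the form suggested by [22, Remark 10.4].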
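A rough numerical sanity check of the spectral distribution (7.40) is sketched below for d = 2. Everything specific in it is an assumption made for illustration and not part of the text: the coefficients $a_{11} = a_{22} = 1+x_1+x_2$, $a_{12} = a_{21} = 0$, the choice $\boldsymbol\nu = (1,1)$ with $m = n-1$ interior points per direction, the test function $F(\lambda) = \lambda^2$, and the function names. Only the diffusion part $n^{-2}K_{\mathbf n}$ of (7.42) is assembled, since $\|n^{-2}Z_{\mathbf n}\| \le Cn^{-1}$ by (7.45). For these choices the symbol is $f(\mathbf x,\boldsymbol\theta) = (1+x_1+x_2)(4-2\cos\theta_1-2\cos\theta_2)$ and the mean value of $f^2$ over $[0,1]^2\times[-\pi,\pi]^2$ equals $\frac{25}{6}\cdot 20 = \frac{250}{3}$.

```python
import numpy as np

def normalized_fd_diffusion(m):
    """n^{-2} K_n of (7.42) for d = 2, ν = (1, 1), a_11 = a_22 = 1 + x1 + x2, a_12 = a_21 = 0."""
    h = 1.0 / (m + 1)
    g = np.arange(1, m + 1) * h                           # interior grid points j h
    X1, X2 = np.meshgrid(g, g, indexing="ij")
    a = (1.0 + X1 + X2).reshape(-1)                       # a(x_j) in lexicographic ordering
    K = 2 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)  # K_m = T_m(2 - 2cos θ)
    I = np.eye(m)
    lap = np.kron(K, I) + np.kron(I, K)                   # K_{n,11} + K_{n,22} by (7.29)
    return a[:, None] * lap                               # diag(a(x_j)) (K_{n,11} + K_{n,22})

def second_moment(m):
    """Mean of λ^2 over the eigenvalues, and the largest imaginary part found."""
    lam = np.linalg.eigvals(normalized_fd_diffusion(m))
    return np.mean(lam.real ** 2), np.max(np.abs(lam.imag))

for m in (10, 20, 30):
    print(m, second_moment(m), "limit predicted by the symbol:", 250 / 3)
```

The matrix is not symmetric, yet its eigenvalues are real (it is similar to a symmetric matrix, as in Theorem 7.1), and the computed second moment should approach the value predicted by the symbol at a rate of roughly $1/m$.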
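7.4 FE Discretization of Convection-Diffusion-Reaction PDEs
Consider the convection-diffusion-reaction problem
\[ \begin{cases} -\nabla\cdot A\nabla u + \mathbf b\cdot\nabla u + cu = f, & \text{in } (0,1)^d,\\ u = 0, & \text{on } \partial((0,1)^d), \end{cases} \iff \begin{cases} -\displaystyle\sum_{\ell,k=1}^d \frac{\partial}{\partial x_\ell}\Bigl(a_{\ell k}\frac{\partial u}{\partial x_k}\Bigr) + \sum_{k=1}^d b_k\frac{\partial u}{\partial x_k} + cu = f, & \text{in } (0,1)^d,\\ u = 0, & \text{on } \partial((0,1)^d), \end{cases} \tag{7.50} \]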
where $a_{\ell k}$, $b_k$, $c$, $f$ are given functions, $A = [a_{\ell k}]_{\ell,k=1}^d$ and $\mathbf b = [b_k]_{k=1}^d$.
FE discretization. The weak form of (7.50) reads as follows [10, Chap. 9]: find $u \in H_0^1([0,1]^d)$ such that
\[ \mathrm a(u,w) = \mathrm f(w), \qquad \forall\, w \in H_0^1([0,1]^d), \]
where
\[ \mathrm a(u,w) = \int_{(0,1)^d} \bigl((\nabla w)^TA\nabla u + (\nabla u)^T\mathbf b\,w + cuw\bigr), \qquad \mathrm f(w) = \int_{(0,1)^d} fw. \]
Let $\mathbf n \in \mathbb N^d$, set $\mathbf h = \frac{\mathbf 1}{\mathbf n+\mathbf 1}$ and $\mathbf x_{\mathbf j} = \mathbf j\mathbf h$ for $\mathbf j = \mathbf 0,\ldots,\mathbf n+\mathbf 1$. In what follows, we use the notation $x_{j_i} = j_ih_i$ for $j_i = 0,\ldots,n_i+1$ and $i = 1,\ldots,d$, so that we can write $\mathbf x_{\mathbf j} = (x_{j_1},\ldots,x_{j_d})$ for all $\mathbf j = \mathbf 0,\ldots,\mathbf n+\mathbf 1$. Pay attention to the fact that this notation is used to simplify the presentation but is not rigorous from a mathematical viewpoint. Fix the subspace $\mathscr W_{\mathbf n} = {\rm span}(\varphi_{\mathbf 1},\ldots,\varphi_{\mathbf n})$, where $\varphi_{\mathbf 1},\ldots,\varphi_{\mathbf n} : [0,1]^d \to \mathbb R$ are the so-called tensor-product hat-functions. They are defined as follows:
\[ \varphi_{\mathbf j} = \varphi_{j_1}\otimes\cdots\otimes\varphi_{j_d}, \qquad \mathbf j = \mathbf 1,\ldots,\mathbf n, \tag{7.51} \]
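where the functions $\varphi_{j_i} : [0,1] \to \mathbb R$, $j_i = 1,\ldots,n_i$, are the hat-functions corresponding to the discretization step $h_i = \frac{1}{n_i+1}$. We recall from [22, Sect. 10.6.1] that the hat-functions $\varphi_j : [0,1] \to \mathbb R$, $j = 1,\ldots,n$, corresponding to the discretization step $h = \frac{1}{n+1}$ are defined as
\[ \varphi_j(x) = \frac{x-x_{j-1}}{x_j-x_{j-1}}\chi_{[x_{j-1},x_j)}(x) + \frac{x_{j+1}-x}{x_{j+1}-x_j}\chi_{[x_j,x_{j+1})}(x), \qquad j = 1,\ldots,n; \tag{7.52} \]
see also [22, Fig. 10.4]. It can be shown that $\mathscr W_{\mathbf n} \subset H_0^1([0,1]^d)$ and the tensor-product hat-functions (7.51) form a basis for $\mathscr W_{\mathbf n}$ (i.e., they are linearly independent). Moreover, any partial Sobolev derivative of each tensor-product hat-function $\varphi_{\mathbf j}$ coincides with the classical partial derivative (which exists a.e. in $[0,1]^d$). A few properties of tensor-product hat-functions, which can be easily established on the basis of (7.52), are reported below.
• Local support property:
\[ {\rm supp}(\varphi_{\mathbf i}) = [\mathbf x_{\mathbf i-\mathbf 1},\mathbf x_{\mathbf i+\mathbf 1}] = [x_{i_1-1},x_{i_1+1}]\times\cdots\times[x_{i_d-1},x_{i_d+1}] \tag{7.53} \]
for all $\mathbf i = \mathbf 1,\ldots,\mathbf n$. In particular, $\mu_d({\rm supp}(\varphi_{\mathbf i})) = 2^d/N(\mathbf n+\mathbf 1)$.
• Bound for partial derivatives:
\[ \Bigl|\frac{\partial\varphi_{\mathbf i}}{\partial x_k}(\mathbf x)\Bigr| = \Bigl|\varphi_{i_k}'(x_k)\prod_{\substack{r=1\\ r\ne k}}^d \varphi_{i_r}(x_r)\Bigr| \le n_k+1 \tag{7.54} \]
for all $\mathbf i = \mathbf 1,\ldots,\mathbf n$ and for a.e. $\mathbf x \in [0,1]^d$.
• Bound for the sum of partial derivatives:
\[ \sum_{\mathbf i=\mathbf 1}^{\mathbf n}\Bigl|\frac{\partial\varphi_{\mathbf i}}{\partial x_k}(\mathbf x)\Bigr| \le 2(n_k+1) \tag{7.55} \]
for a.e. $\mathbf x \in [0,1]^d$.
In the linear FE approach, we look for an approximation $u_{\mathscr W_{\mathbf n}}$ of $u$ by solving the following (Galerkin) problem: find $u_{\mathscr W_{\mathbf n}} \in \mathscr W_{\mathbf n}$ such that
\[ \mathrm a(u_{\mathscr W_{\mathbf n}},w) = \mathrm f(w), \qquad \forall\, w \in \mathscr W_{\mathbf n}. \]
Since $\{\varphi_{\mathbf 1},\ldots,\varphi_{\mathbf n}\}$ is a basis for $\mathscr W_{\mathbf n}$, we can write $u_{\mathscr W_{\mathbf n}} = \sum_{\mathbf j=\mathbf 1}^{\mathbf n} u_{\mathbf j}\varphi_{\mathbf j}$ for a unique vector $\mathbf u = (u_{\mathbf 1},\ldots,u_{\mathbf n})^T$. By linearity, the computation of $u_{\mathscr W_{\mathbf n}}$ (i.e., of $\mathbf u$) reduces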
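to solving the linear system $A_{\mathbf n}\mathbf u = \mathbf f$, where $\mathbf f = (\mathrm f(\varphi_{\mathbf 1}),\ldots,\mathrm f(\varphi_{\mathbf n}))^T$ and $A_{\mathbf n}$ is the stiffness matrix,
\[ A_{\mathbf n} = [\mathrm a(\varphi_{\mathbf j},\varphi_{\mathbf i})]_{\mathbf i,\mathbf j=\mathbf 1}^{\mathbf n}. \]
Note that $A_{\mathbf n}$ admits the following decomposition:
\[ A_{\mathbf n} = K_{\mathbf n} + Z_{\mathbf n}, \tag{7.56} \]
where
\[ K_{\mathbf n} = \Bigl[\int_{(0,1)^d} (\nabla\varphi_{\mathbf i})^TA\nabla\varphi_{\mathbf j}\Bigr]_{\mathbf i,\mathbf j=\mathbf 1}^{\mathbf n} \tag{7.57} \]
is the diffusion matrix and
\[ Z_{\mathbf n} = \Bigl[\int_{(0,1)^d} (\nabla\varphi_{\mathbf j})^T\mathbf b\,\varphi_{\mathbf i}\Bigr]_{\mathbf i,\mathbf j=\mathbf 1}^{\mathbf n} + \Bigl[\int_{(0,1)^d} c\,\varphi_{\mathbf j}\varphi_{\mathbf i}\Bigr]_{\mathbf i,\mathbf j=\mathbf 1}^{\mathbf n} \tag{7.58} \]
is the sum of the convection and reaction matrices.
FE discretization matrices. As explained in Sect. 7.2, the crucial object for determining the symbol of a sequence of matrices arising from the discretization of the PDE (7.50) is the “symbol of the negative Hessian operator”. Let us then investigate the structure of the FE discretization matrices so as to “guess” this crucial object. Let $E_{\ell k}$ be the $d \times d$ matrix with 1 in position $(\ell,k)$ and 0 elsewhere, $1 \le \ell,k \le d$. By looking at the expression of $K_{\mathbf n}$, we see that
\[ K_{\mathbf n} = \sum_{\ell,k=1}^d K_{\mathbf n,\ell k}(a_{\ell k}), \tag{7.59} \]
where
\[ K_{\mathbf n,\ell k}(a_{\ell k}) = \Bigl[\int_{(0,1)^d} a_{\ell k}\frac{\partial\varphi_{\mathbf i}}{\partial x_\ell}\frac{\partial\varphi_{\mathbf j}}{\partial x_k}\Bigr]_{\mathbf i,\mathbf j=\mathbf 1}^{\mathbf n} \tag{7.60} \]
is the matrix obtained from the FE discretization of (7.50) with $A$ replaced by $a_{\ell k}E_{\ell k}$, that is,
\[ \begin{cases} -\dfrac{\partial}{\partial x_\ell}\Bigl(a_{\ell k}\dfrac{\partial u}{\partial x_k}\Bigr) = f, & \text{in } (0,1)^d,\\ u = 0, & \text{on } \partial((0,1)^d). \end{cases} \tag{7.61} \]
According to the discussion in Sect. 7.2, in order to guess the symbol of the negative Hessian operator, we should understand which is the symbol of (a properly normalized version of) the matrix-sequence $\{K_{\mathbf n,\ell k}(1)\}_n$, once we have fixed a suitable relation $\mathbf n = \mathbf n(n)$. The matrix $K_{\mathbf n,\ell k}(1)$ is in fact the matrix resulting from the FE discretization of (7.61) with $a_{\ell k} = 1$ identically, and it is therefore the matrix resulting from the FE discretization of the second derivative $-\partial^2u/\partial x_\ell\partial x_k$. To sum up, we have to study the matrix
\[ K_{\mathbf n,\ell k} = K_{\mathbf n,\ell k}(1) = \Bigl[\int_{(0,1)^d} \frac{\partial\varphi_{\mathbf i}}{\partial x_\ell}\frac{\partial\varphi_{\mathbf j}}{\partial x_k}\Bigr]_{\mathbf i,\mathbf j=\mathbf 1}^{\mathbf n}. \tag{7.62} \]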
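Lemma 7.2 For every $\mathbf n \in \mathbb N^d$, we have
\[ K_{\mathbf n,kk} = \Bigl(\bigotimes_{r=1}^{k-1} M_{n_r}\Bigr) \otimes K_{n_k} \otimes \Bigl(\bigotimes_{r=k+1}^{d} M_{n_r}\Bigr) \tag{7.63} \]
for $k = 1,\ldots,d$, and
\[ K_{\mathbf n,\ell k} = K_{\mathbf n,k\ell} = -\Bigl(\bigotimes_{r=1}^{\ell-1} M_{n_r}\Bigr) \otimes H_{n_\ell} \otimes \Bigl(\bigotimes_{r=\ell+1}^{k-1} M_{n_r}\Bigr) \otimes H_{n_k} \otimes \Bigl(\bigotimes_{r=k+1}^{d} M_{n_r}\Bigr) \tag{7.64} \]
for $1 \le \ell < k \le d$, where the matrices $K_n$, $H_n$, $M_n$ are defined in terms of the hat-functions (7.52) as follows:
\[ K_n = \Bigl[\int_0^1 \varphi_j'(x)\varphi_i'(x)\,{\rm d}x\Bigr]_{i,j=1}^n = (n+1)\,T_n(2-2\cos\theta), \tag{7.65} \]
\[ H_n = \Bigl[\int_0^1 \varphi_j'(x)\varphi_i(x)\,{\rm d}x\Bigr]_{i,j=1}^n = -\Bigl[\int_0^1 \varphi_j(x)\varphi_i'(x)\,{\rm d}x\Bigr]_{i,j=1}^n = -{\rm i}\,T_n(\sin\theta), \tag{7.66} \]
\[ M_n = \Bigl[\int_0^1 \varphi_j(x)\varphi_i(x)\,{\rm d}x\Bigr]_{i,j=1}^n = \frac{1}{n+1}\,T_n\Bigl(\frac23+\frac13\cos\theta\Bigr), \tag{7.67} \]
with the last equalities in (7.65)–(7.67) following from direct computations based on (7.52).
Proof We only prove (7.63) because (7.64) is proved in the same way. For every $\mathbf i,\mathbf j = \mathbf 1,\ldots,\mathbf n$,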
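\[ (K_{\mathbf n,kk})_{\mathbf i\mathbf j} = \int_{(0,1)^d} \frac{\partial\varphi_{\mathbf j}}{\partial x_k}(\mathbf x)\frac{\partial\varphi_{\mathbf i}}{\partial x_k}(\mathbf x)\,{\rm d}\mathbf x = \int_{(0,1)^d} \varphi_{i_k}'(x_k)\varphi_{j_k}'(x_k)\prod_{\substack{r=1\\ r\ne k}}^d \varphi_{i_r}(x_r)\varphi_{j_r}(x_r)\,{\rm d}\mathbf x \]
\[ = \int_0^1 \varphi_{j_k}'(x_k)\varphi_{i_k}'(x_k)\,{\rm d}x_k\prod_{\substack{r=1\\ r\ne k}}^d \int_0^1 \varphi_{j_r}(x_r)\varphi_{i_r}(x_r)\,{\rm d}x_r = (K_{n_k})_{i_kj_k}\prod_{\substack{r=1\\ r\ne k}}^d (M_{n_r})_{i_rj_r} = (M_{n_1}\otimes\cdots\otimes M_{n_{k-1}}\otimes K_{n_k}\otimes M_{n_{k+1}}\otimes\cdots\otimes M_{n_d})_{\mathbf i\mathbf j}, \]
where the last equality follows from the crucial property P 7.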
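The “direct computations based on (7.52)” behind the last equalities in (7.65)–(7.67) can be reproduced numerically. The sketch below assembles the three univariate matrices element by element with a 2-point Gauss-Legendre rule (exact for the piecewise quadratic integrands involved) and compares them with the Toeplitz expressions. The function names `hat_matrices` and `toeplitz_reference`, and the element-wise assembly strategy, are illustrative assumptions rather than the procedure followed in the text.

```python
import numpy as np

def hat_matrices(n):
    """Assemble K_n, H_n, M_n of (7.65)-(7.67) for the hat-functions (7.52) on a uniform grid."""
    h = 1.0 / (n + 1)
    t = 0.5 + np.array([-1.0, 1.0]) / (2.0 * np.sqrt(3.0))   # Gauss-Legendre nodes on [0, 1]
    w = np.array([0.5, 0.5])                                  # corresponding weights
    N = np.vstack([1.0 - t, t])                               # local shape functions at the nodes
    dN = np.array([[-1.0 / h] * 2, [1.0 / h] * 2])            # their derivatives with respect to x
    K = np.zeros((n + 2, n + 2))
    H = np.zeros((n + 2, n + 2))
    M = np.zeros((n + 2, n + 2))
    for e in range(n + 1):                                    # element [x_e, x_{e+1}]
        nodes = (e, e + 1)
        for p in range(2):                                    # row index i (test function)
            for q in range(2):                                # column index j (trial function)
                i, j = nodes[p], nodes[q]
                K[i, j] += h * np.sum(w * dN[q] * dN[p])      # ∫ φ_j' φ_i'
                H[i, j] += h * np.sum(w * dN[q] * N[p])       # ∫ φ_j' φ_i
                M[i, j] += h * np.sum(w * N[q] * N[p])        # ∫ φ_j φ_i
    inner = slice(1, n + 1)                                   # drop the two boundary nodes
    return K[inner, inner], H[inner, inner], M[inner, inner]

def toeplitz_reference(n):
    """Right-hand sides of (7.65)-(7.67) written as explicit tridiagonal matrices."""
    E = np.eye(n, k=1) + np.eye(n, k=-1)
    K = (n + 1) * (2 * np.eye(n) - E)                         # (n+1) T_n(2 - 2cos θ)
    H = 0.5 * (np.eye(n, k=1) - np.eye(n, k=-1))              # -i T_n(sin θ)
    M = (2.0 / 3.0 * np.eye(n) + E / 6.0) / (n + 1)           # T_n(2/3 + cos(θ)/3) / (n+1)
    return K, H, M

for A, B in zip(hat_matrices(7), toeplitz_reference(7)):
    print(np.allclose(A, B))
```

Combined with `np.kron`, the same univariate blocks give the d-level matrices (7.63)–(7.64).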
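Remark 7.5 As in the case of FDs (see Remark 7.3), $K_n$, $H_n$, $M_n$ are the diffusion, convection, reaction matrices resulting from the FE discretization of the univariate problem
\[
-u''(x) + u'(x) + u(x) = f(x), \quad x \in (0,1), \qquad u(0) = u(1) = 0.
\]
In other words, $K_n$, $H_n$, $M_n$ are the matrices resulting from the FE discretization of, respectively, the negative second derivative $-u''(x)$, the first derivative $u'(x)$, the identity operator $u(x)$. To see this, follow the above derivation of the FE discretization matrices in the univariate case $d = 1$ or take a look at [22, Sect. 10.6.1].

Remark 7.6 It follows from Lemma 7.2 and T 8 that, for all $\ell, k = 1,\ldots,d$,
\[
K_{\mathbf n,\ell k} = \frac{(n_\ell+1)(n_k+1)}{N(\mathbf n+\mathbf 1)}\,T_{\mathbf n}(H_{\ell k}), \tag{7.68}
\]
where $H(\boldsymbol\theta)$ is the $d \times d$ symmetric matrix defined as follows:
\[
H_{\ell k}(\boldsymbol\theta) =
\begin{cases}
(2 - 2\cos\theta_k) \displaystyle\prod_{\substack{r=1\\ r\ne k}}^{d} \Bigl(\frac23 + \frac13\cos\theta_r\Bigr), & \text{if } \ell = k,\\[2ex]
\sin\theta_\ell \sin\theta_k \displaystyle\prod_{\substack{r=1\\ r\ne \ell,k}}^{d} \Bigl(\frac23 + \frac13\cos\theta_r\Bigr), & \text{if } \ell \ne k.
\end{cases} \tag{7.69}
\]
The matrix $H(\boldsymbol\theta)$ is what we may predict to be, up to some normalization, the symbol of the negative Hessian operator. For instance, assuming $\mathbf n + \mathbf 1 = \boldsymbol\nu n$ for some fixed vector $\boldsymbol\nu \in \mathbb Q^d$ with positive components, from (7.68), GLT 3 and GLT 4 we infer that
\[
\{n^{d-2} K_{\mathbf n,\ell k}\}_n = \Bigl\{\frac{\nu_\ell\nu_k}{N(\boldsymbol\nu)}\,T_{\mathbf n}(H_{\ell k})\Bigr\}_n \sim_{\rm GLT} \frac{\nu_\ell\nu_k}{N(\boldsymbol\nu)}\,H_{\ell k}(\boldsymbol\theta) = H^{(\boldsymbol\nu)}_{\ell k}(\boldsymbol\theta), \tag{7.70}
\]
where $H^{(\boldsymbol\nu)}(\boldsymbol\theta) = \operatorname{diag}(\boldsymbol\nu) H(\boldsymbol\theta) \operatorname{diag}(\boldsymbol\nu)/N(\boldsymbol\nu)$. This means that, assuming $\mathbf n + \mathbf 1 = \boldsymbol\nu n$ and after normalization by $n^{d-2}$, the symbol of the negative Hessian operator is $H^{(\boldsymbol\nu)}(\boldsymbol\theta)$, which coincides with $H(\boldsymbol\theta)$ up to a trivial transformation. With the same argument as in the proof of [17, Theorem 2.2], one can show that the matrix $H(\boldsymbol\theta)$ is SPSD for all $\boldsymbol\theta \in [-\pi,\pi]^d$, and it is SPD for all $\boldsymbol\theta \in [-\pi,\pi]^d$ such that $\theta_1 \cdots \theta_d \ne 0$.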
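To make the definition (7.69) concrete, here is a small sketch that evaluates $H(\boldsymbol\theta)$, the normalized matrix $H^{(\boldsymbol\nu)}(\boldsymbol\theta)$ and the scalar $\mathbf 1(A \circ H^{(\boldsymbol\nu)}(\boldsymbol\theta))\mathbf 1^T$ at a single point. The function names and the sample values of $\boldsymbol\theta$, $\boldsymbol\nu$ and $A$ are assumptions chosen for illustration; a numerical evaluation of this kind only samples the positive definiteness claim and does not replace the proof cited from [17].

```python
import numpy as np

def hessian_symbol(theta):
    """Evaluate the d x d matrix H(theta) of (7.69) at one point theta."""
    theta = np.asarray(theta, dtype=float)
    d = theta.size
    stiff = 2.0 - 2.0 * np.cos(theta)        # symbol factor of K_n
    conv = np.sin(theta)                     # symbol factor of H_n (up to -i)
    mass = 2.0 / 3.0 + np.cos(theta) / 3.0   # symbol factor of M_n
    H = np.empty((d, d))
    for l in range(d):
        for k in range(d):
            # product of the mass factors over the remaining directions
            m = np.prod([mass[r] for r in range(d) if r not in (l, k)])
            H[l, k] = stiff[k] * m if l == k else conv[l] * conv[k] * m
    return H

def normalized_hessian_symbol(theta, nu):
    """H^(nu)(theta) = diag(nu) H(theta) diag(nu) / N(nu), with N(nu) = prod(nu)."""
    nu = np.asarray(nu, dtype=float)
    return np.diag(nu) @ hessian_symbol(theta) @ np.diag(nu) / np.prod(nu)

def hadamard_form(A, theta, nu):
    """1 (A o H^(nu)(theta)) 1^T, i.e. the sum of a_lk * H^(nu)_lk."""
    return float(np.sum(np.asarray(A, dtype=float) * normalized_hessian_symbol(theta, nu)))
```

For example, `np.linalg.eigvalsh(hessian_symbol([0.7, -1.3]))` returns two positive numbers, while `hessian_symbol([0.0, 0.0])` is the zero matrix, in line with the SPSD/SPD statement above.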
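GLT analysis of the FE discretization matrices. Using the theory of multilevel GLT sequences, we now derive the spectral and singular value distribution of the sequence of normalized stiffness matrices $\{n^{d-2}A_{\mathbf n}\}_n$ under the assumptions that $\mathbf n + \mathbf 1 = \boldsymbol\nu n$ for some fixed vector $\boldsymbol\nu$ and that the matrix of diffusion coefficients $A(\mathbf x)$ is symmetric.

Theorem 7.4 Suppose that the following conditions on the PDE coefficients are satisfied:
• $a_{\ell k} \in L^\infty((0,1)^d)$ for every $\ell, k = 1,\ldots,d$;
• $b_k \in L^\infty((0,1)^d)$ for every $k = 1,\ldots,d$;
• $c \in L^\infty((0,1)^d)$;
• $A(\mathbf x) = [a_{\ell k}(\mathbf x)]_{\ell,k=1}^d$ is symmetric for every $\mathbf x \in (0,1)^d$.
Let $\boldsymbol\nu \in \mathbb Q^d$ be a vector with positive components and assume that $\mathbf n + \mathbf 1 = \boldsymbol\nu n$ (it is understood that $n$ varies in the infinite subset of $\mathbb N$ such that $\mathbf n + \mathbf 1 = \boldsymbol\nu n \in \mathbb N^d$). Then
\[
\{n^{d-2}A_{\mathbf n}\}_n \sim_{\rm GLT} f^{(\boldsymbol\nu)}(\mathbf x,\boldsymbol\theta) \tag{7.71}
\]
and
\[
\{n^{d-2}A_{\mathbf n}\}_n \sim_{\sigma,\lambda} f^{(\boldsymbol\nu)}(\mathbf x,\boldsymbol\theta), \tag{7.72}
\]
where
\begin{align*}
f^{(\boldsymbol\nu)}(\mathbf x,\boldsymbol\theta) &= \sum_{\ell,k=1}^d a_{\ell k}(\mathbf x) H^{(\boldsymbol\nu)}_{\ell k}(\boldsymbol\theta) = \mathbf 1\bigl(A(\mathbf x) \circ H^{(\boldsymbol\nu)}(\boldsymbol\theta)\bigr)\mathbf 1^T = \frac{\boldsymbol\nu\bigl(A(\mathbf x) \circ H(\boldsymbol\theta)\bigr)\boldsymbol\nu^T}{N(\boldsymbol\nu)},\\
H^{(\boldsymbol\nu)}(\boldsymbol\theta) &= \frac{\operatorname{diag}(\boldsymbol\nu) H(\boldsymbol\theta) \operatorname{diag}(\boldsymbol\nu)}{N(\boldsymbol\nu)},
\end{align*}
and $H(\boldsymbol\theta)$ is defined in (7.69).

Proof The proof consists of the following steps. Throughout the proof, the letter $C$ denotes a generic constant independent of $n$. While reading the proof, the reader should keep in mind the relation $\mathbf n + \mathbf 1 = \boldsymbol\nu n$ and the notation $E_{\ell k}$ for the $d \times d$ matrix having 1 in position $(\ell, k)$ and 0 elsewhere ($1 \le \ell, k \le d$).

Step 1. We show that
\begin{align}
\|n^{d-2}K_{\mathbf n}\| &\le C, \tag{7.73}\\
\|n^{d-2}Z_{\mathbf n}\| &\le Cn^{-1}. \tag{7.74}
\end{align}
To prove (7.73) we note that, for all $\mathbf i, \mathbf j = \mathbf 1,\ldots,\mathbf n$, we have the following.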
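• If $\|\mathbf i - \mathbf j\|_\infty > 1$ then there exists $k \in \{1,\ldots,d\}$ such that $|i_k - j_k| > 1$, which implies by (7.53) that $\operatorname{supp}(\varphi_{\mathbf i})$ and $\operatorname{supp}(\varphi_{\mathbf j})$ intersect at most on a set of zero measure. Thus,
\[
(K_{\mathbf n})_{\mathbf i\mathbf j} = \int_{(0,1)^d} (\nabla\varphi_{\mathbf i})^T A \nabla\varphi_{\mathbf j} = 0.
\]
• By (7.53) and (7.54),
\begin{align*}
|(K_{\mathbf n})_{\mathbf i\mathbf j}| &\le \int_{(0,1)^d} |(\nabla\varphi_{\mathbf i})^T A \nabla\varphi_{\mathbf j}| \le \int_{(0,1)^d} \sum_{\ell,k=1}^d |a_{\ell k}| \Bigl|\frac{\partial\varphi_{\mathbf i}}{\partial x_\ell}\Bigr| \Bigl|\frac{\partial\varphi_{\mathbf j}}{\partial x_k}\Bigr|\\
&\le \max_{\ell,k=1,\ldots,d} \|a_{\ell k}\|_{L^\infty} \int_{\operatorname{supp}(\varphi_{\mathbf i})} \sum_{\ell,k=1}^d (n_\ell+1)(n_k+1)\\
&\le \max_{\ell,k=1,\ldots,d} \|a_{\ell k}\|_{L^\infty} \sum_{\ell,k=1}^d (n_\ell+1)(n_k+1)\,\frac{2^d}{N(\mathbf n+\mathbf 1)} \le Cn^{2-d}.
\end{align*}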
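Thus, each row and column of $K_{\mathbf n}$ has at most $3^d$ nonzero entries whose moduli are bounded by $Cn^{2-d}$, which implies (7.73) by N 1. The proof of (7.74) is conceptually identical. For all $\mathbf i, \mathbf j = \mathbf 1,\ldots,\mathbf n$, we have the following.
• If $\|\mathbf i - \mathbf j\|_\infty > 1$ then $(Z_{\mathbf n})_{\mathbf i\mathbf j} = 0$ for the same reason for which $(K_{\mathbf n})_{\mathbf i\mathbf j} = 0$.
• By (7.53), (7.54) and the obvious inequality $|\varphi_{\mathbf i}(\mathbf x)| \le 1$,
\begin{align*}
|(Z_{\mathbf n})_{\mathbf i\mathbf j}| &\le \int_{(0,1)^d} |(\nabla\varphi_{\mathbf j})^T \mathbf b\,\varphi_{\mathbf i}| + \int_{(0,1)^d} |c\,\varphi_{\mathbf j}\varphi_{\mathbf i}|\\
&\le \int_{\operatorname{supp}(\varphi_{\mathbf i})} \sum_{k=1}^d |b_k| \Bigl|\frac{\partial\varphi_{\mathbf j}}{\partial x_k}\Bigr| + \int_{\operatorname{supp}(\varphi_{\mathbf i})} \|c\|_{L^\infty}\\
&\le \max_{k=1,\ldots,d} \|b_k\|_{L^\infty} \int_{\operatorname{supp}(\varphi_{\mathbf i})} \sum_{k=1}^d (n_k+1) + \frac{2^d \|c\|_{L^\infty}}{N(\mathbf n+\mathbf 1)}\\
&\le \max_{k=1,\ldots,d} \|b_k\|_{L^\infty}\,\frac{2^d \sum_{k=1}^d (n_k+1)}{N(\mathbf n+\mathbf 1)} + \frac{2^d \|c\|_{L^\infty}}{N(\mathbf n+\mathbf 1)}\\
&\le Cn^{1-d} + Cn^{-d} \le Cn^{1-d}.
\end{align*}
Thus, each row and column of $Z_{\mathbf n}$ has at most $3^d$ nonzero entries whose moduli are bounded by $Cn^{1-d}$, which implies (7.74) by N 1.

Step 2. Let $L^1([0,1]^d, \mathbb R^{d\times d})$ be the space of functions $L : [0,1]^d \to \mathbb R^{d\times d}$ such that $L_{ij} \in L^1([0,1]^d)$ for all $i, j = 1,\ldots,d$. Consider the linear operator $K_{\mathbf n}(\cdot) : L^1([0,1]^d, \mathbb R^{d\times d}) \to \mathbb R^{N(\mathbf n)\times N(\mathbf n)}$,
\[
K_{\mathbf n}(L) = \left[\int_{(0,1)^d} (\nabla\varphi_{\mathbf i})^T L \nabla\varphi_{\mathbf j}\right]_{\mathbf i,\mathbf j=\mathbf 1}^{\mathbf n}. \tag{7.75}
\]
The next four steps are devoted to showing that
\[
\{n^{d-2}K_{\mathbf n}(L)\}_n \sim_{\rm GLT} \mathbf 1\bigl(L(\mathbf x) \circ H^{(\boldsymbol\nu)}(\boldsymbol\theta)\bigr)\mathbf 1^T, \qquad \forall\, L \in L^1([0,1]^d, \mathbb R^{d\times d}). \tag{7.76}
\]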
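Once this is done, the theorem is proved. Indeed, by applying (7.76) with $L = A$, we get $\{n^{d-2}K_{\mathbf n}\}_n \sim_{\rm GLT} f^{(\boldsymbol\nu)}(\mathbf x,\boldsymbol\theta)$. Moreover, $\{n^{d-2}Z_{\mathbf n}\}_n$ is zero-distributed by Step 1 and Z 1 (or Z 2). Hence, the GLT relation (7.71) follows from the decomposition $n^{d-2}A_{\mathbf n} = n^{d-2}K_{\mathbf n} + n^{d-2}Z_{\mathbf n}$ (see (7.56)) and GLT 3−GLT 4; the singular value distribution in (7.72) follows from (7.71) and GLT 1; and the spectral distribution in (7.72) follows from (7.71) and GLT 2 applied to the decomposition $n^{d-2}A_{\mathbf n} = n^{d-2}K_{\mathbf n} + n^{d-2}Z_{\mathbf n}$, taking into account what we have seen in Step 1, the inequality N 2, and the fact that $K_{\mathbf n}$ is symmetric (because $A(\mathbf x)$ is symmetric for all $\mathbf x \in (0,1)^d$ by assumption).

Step 3. We first prove (7.76) in the constant-coefficient case where $L(\mathbf x) = E_{\ell k}$ identically. In this case, we have $K_{\mathbf n}(E_{\ell k}) = K_{\mathbf n,\ell k}$ and (7.76) is nothing else than (7.70).

Step 4. We now prove (7.76) in the case where $L(\mathbf x) = a(\mathbf x)E_{\ell k}$ with $a \in C([0,1]^d)$. To this end, we show that
\[
\|Y_{\mathbf n}\| \to 0, \tag{7.77}
\]
where $Y_{\mathbf n} = n^{d-2}K_{\mathbf n}(aE_{\ell k}) - n^{d-2}D_{\mathbf n}(a)K_{\mathbf n}(E_{\ell k})$. Once this is done, from Z 1 (or Z 2) we have $\{Y_{\mathbf n}\}_n \sim_\sigma 0$. Hence, from Step 3 and GLT 3−GLT 4 applied to the obvious decomposition $n^{d-2}K_{\mathbf n}(aE_{\ell k}) = n^{d-2}D_{\mathbf n}(a)K_{\mathbf n}(E_{\ell k}) + Y_{\mathbf n}$, we obtain
\[
\{n^{d-2}K_{\mathbf n}(aE_{\ell k})\}_n \sim_{\rm GLT} a(\mathbf x)H^{(\boldsymbol\nu)}_{\ell k}(\boldsymbol\theta),
\]
as required. For every $\mathbf i, \mathbf j = \mathbf 1,\ldots,\mathbf n$, we have the following.
• If $\|\mathbf i - \mathbf j\|_\infty > 1$ then $(K_{\mathbf n}(L))_{\mathbf i\mathbf j} = 0$ for all $L \in L^1([0,1]^d, \mathbb R^{d\times d})$, because of the local support property of the tensor-product hat-functions already exploited in Step 1. Thus, $(Y_{\mathbf n})_{\mathbf i\mathbf j} = 0$.
• If $\|\mathbf i - \mathbf j\|_\infty \le 1$ then, using (7.53)–(7.54) and taking into account that $\mathbf i/\mathbf n \in \operatorname{supp}(\varphi_{\mathbf i})$, we obtain
\begin{align*}
|(Y_{\mathbf n})_{\mathbf i\mathbf j}| &= n^{d-2}\left|\int_{(0,1)^d} \Bigl[a(\mathbf x) - a\Bigl(\frac{\mathbf i}{\mathbf n}\Bigr)\Bigr] \frac{\partial\varphi_{\mathbf i}}{\partial x_\ell}(\mathbf x)\,\frac{\partial\varphi_{\mathbf j}}{\partial x_k}(\mathbf x)\,d\mathbf x\right|\\
&\le n^{d-2}\int_{\operatorname{supp}(\varphi_{\mathbf i})} \Bigl|a(\mathbf x) - a\Bigl(\frac{\mathbf i}{\mathbf n}\Bigr)\Bigr| (n_\ell+1)(n_k+1)\,d\mathbf x\\
&\le n^{d-2}\,Cn^2\,\omega_a\Bigl(\max_{\mathbf x \in \operatorname{supp}(\varphi_{\mathbf i})} \Bigl\|\mathbf x - \frac{\mathbf i}{\mathbf n}\Bigr\|_\infty\Bigr) \int_{\operatorname{supp}(\varphi_{\mathbf i})} d\mathbf x\\
&\le Cn^d\,\omega_a\Bigl(\frac{2}{\min(\mathbf n)+1}\Bigr) \frac{2^d}{N(\mathbf n+\mathbf 1)} \le C\,\omega_a\Bigl(\frac{2}{\min(\mathbf n)+1}\Bigr).
\end{align*}
Thus, each row and column of $Y_{\mathbf n}$ has at most $3^d$ nonzero entries whose moduli are bounded by $C\,\omega_a\bigl(\frac{2}{\min(\mathbf n)+1}\bigr)$, which implies (7.77) by N 1.

Step 5. We now prove (7.76) in the case where $L(\mathbf x) = a(\mathbf x)E_{\ell k}$ with $a \in L^1([0,1]^d)$. Take $a_m \in C([0,1]^d)$ such that $a_m \to a$ in $L^1([0,1]^d)$. We have
\[
\{n^{d-2}K_{\mathbf n}(a_m E_{\ell k})\}_n \sim_{\rm GLT} a_m(\mathbf x)H^{(\boldsymbol\nu)}_{\ell k}(\boldsymbol\theta)
\]
by Step 4, and
\[
a_m(\mathbf x)H^{(\boldsymbol\nu)}_{\ell k}(\boldsymbol\theta) \to a(\mathbf x)H^{(\boldsymbol\nu)}_{\ell k}(\boldsymbol\theta)
\]
in measure as $m \to \infty$. We prove that
\[
\{n^{d-2}K_{\mathbf n}(a_m E_{\ell k})\}_n \xrightarrow{\rm a.c.s.} \{n^{d-2}K_{\mathbf n}(aE_{\ell k})\}_n, \tag{7.78}
\]
after which the desired result follows from GLT 7. By N 3 and the bounds for the sum of partial derivatives of tensor-product hat-functions (7.55), we have
\begin{align*}
\|K_{\mathbf n}(aE_{\ell k}) - K_{\mathbf n}(a_m E_{\ell k})\|_1 &= \|K_{\mathbf n}((a - a_m)E_{\ell k})\|_1 \le \sum_{\mathbf i,\mathbf j=\mathbf 1}^{\mathbf n} |(K_{\mathbf n}((a - a_m)E_{\ell k}))_{\mathbf i\mathbf j}|\\
&\le \sum_{\mathbf i,\mathbf j=\mathbf 1}^{\mathbf n} \int_{(0,1)^d} |a(\mathbf x) - a_m(\mathbf x)| \Bigl|\frac{\partial\varphi_{\mathbf i}}{\partial x_\ell}(\mathbf x)\Bigr| \Bigl|\frac{\partial\varphi_{\mathbf j}}{\partial x_k}(\mathbf x)\Bigr| d\mathbf x\\
&\le \int_{(0,1)^d} |a(\mathbf x) - a_m(\mathbf x)| \sum_{\mathbf i=\mathbf 1}^{\mathbf n} \Bigl|\frac{\partial\varphi_{\mathbf i}}{\partial x_\ell}(\mathbf x)\Bigr| \sum_{\mathbf j=\mathbf 1}^{\mathbf n} \Bigl|\frac{\partial\varphi_{\mathbf j}}{\partial x_k}(\mathbf x)\Bigr| d\mathbf x\\
&\le 4(n_\ell+1)(n_k+1)\|a - a_m\|_{L^1} \le Cn^2\|a - a_m\|_{L^1},
\end{align*}
hence
\[
\|n^{d-2}K_{\mathbf n}(aE_{\ell k}) - n^{d-2}K_{\mathbf n}(a_m E_{\ell k})\|_1 \le Cn^d\|a - a_m\|_{L^1} \le C\,N(\mathbf n)\|a - a_m\|_{L^1},
\]
and the a.c.s. convergence (7.78) follows from ACS 6.

Step 6. Finally, we prove (7.76) for an arbitrary $L \in L^1([0,1]^d, \mathbb R^{d\times d})$. Write $L(\mathbf x) =$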
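\[
\sum_{\ell,k=1}^d L_{\ell k}(\mathbf x)E_{\ell k}
\]
and note that, by linearity,
\[
K_{\mathbf n}(L) = \sum_{\ell,k=1}^d K_{\mathbf n}(L_{\ell k}E_{\ell k}).
\]
Hence, by Step 5 and GLT 4,
\[
\{n^{d-2}K_{\mathbf n}(L)\}_n \sim_{\rm GLT} \sum_{\ell,k=1}^d L_{\ell k}(\mathbf x)H^{(\boldsymbol\nu)}_{\ell k}(\boldsymbol\theta) = \mathbf 1\bigl(L(\mathbf x) \circ H^{(\boldsymbol\nu)}(\boldsymbol\theta)\bigr)\mathbf 1^T,
\]
which concludes the proof.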
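As a sanity check of Theorem 7.4 in the simplest setting, the following sketch compares the eigenvalues of the stiffness matrix with samples of the symbol. It assumes $d = 2$, $A = I$, $\mathbf b = \mathbf 0$, $c = 0$ and $n_1 = n_2 = m$ (so that $\boldsymbol\nu = (1,1)$, $n = m + 1$ and the normalization $n^{d-2}$ equals 1). In this constant-coefficient case the tridiagonal Toeplitz factors share their eigenvectors, so the eigenvalues coincide with the symbol samples on the grid $\theta_j = j\pi/(m+1)$; for variable coefficients one should only expect the asymptotic distribution stated in (7.72). The function names are hypothetical.

```python
import numpy as np

def tridiag_toeplitz(m, off, diag):
    """Symmetric m x m tridiagonal Toeplitz matrix."""
    return (np.diag(np.full(m, float(diag)))
            + np.diag(np.full(m - 1, float(off)), 1)
            + np.diag(np.full(m - 1, float(off)), -1))

def laplacian_stiffness_2d(m):
    """K_n = K_{n,11} + K_{n,22} assembled via Lemma 7.2 with n = (m, m)."""
    K = (m + 1) * tridiag_toeplitz(m, -1.0, 2.0)      # (7.65)
    M = tridiag_toeplitz(m, 1 / 6, 2 / 3) / (m + 1)   # (7.67)
    return np.kron(K, M) + np.kron(M, K)

def symbol_samples(m):
    """Samples of H_11 + H_22 from (7.69) on the grid theta_j = j*pi/(m+1)."""
    t = np.arange(1, m + 1) * np.pi / (m + 1)
    k = 2.0 - 2.0 * np.cos(t)
    h = 2.0 / 3.0 + np.cos(t) / 3.0
    return (np.outer(k, h) + np.outer(h, k)).ravel()

def max_sampling_error(m):
    """Largest gap between sorted eigenvalues and sorted symbol samples."""
    eig = np.sort(np.linalg.eigvalsh(laplacian_stiffness_2d(m)))
    return float(np.max(np.abs(eig - np.sort(symbol_samples(m)))))
```

Calling `max_sampling_error(m)` for a few values of `m` should return values at the level of rounding errors.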
We invite the reader to compare the proof of Theorem 7.4 with the proof of its univariate version [22, Theorem 10.12]: they are essentially the same! We remark that the GLT relation (7.71) and the singular value distribution in (7.72) remain true even without the hypothesis that the matrix of diffusion coefficients $A(\mathbf x)$ is symmetric for all $\mathbf x \in (0,1)^d$. Indeed, as it is clear from the proof of Theorem 7.4 and especially from Step 2, the symmetry of $A$ is used only in the proof of the eigenvalue distribution in (7.72). Actually, the eigenvalue distribution also remains true without the symmetry assumption on $A$, as long as we assume that the entries of $A$ are continuous. This result, which has never been observed before in the literature, is proved in the next theorem.

Theorem 7.5 Suppose that the following conditions on the PDE coefficients are satisfied:
• $a_{\ell k} \in C([0,1]^d)$ for every $\ell, k = 1,\ldots,d$;
• $b_k \in L^\infty((0,1)^d)$ for every $k = 1,\ldots,d$;
• $c \in L^\infty((0,1)^d)$.
Let $\boldsymbol\nu \in \mathbb Q^d$ be a vector with positive components and assume that $\mathbf n + \mathbf 1 = \boldsymbol\nu n$ (it is understood that $n$ varies in the infinite subset of $\mathbb N$ such that $\mathbf n + \mathbf 1 = \boldsymbol\nu n \in \mathbb N^d$). Then, both (7.71) and (7.72) are satisfied.

Proof As noted before the statement of the theorem, we only have to prove the eigenvalue distribution in (7.72). The underlying idea is that, even in the case where $A$ is not symmetric, the matrix $K_{\mathbf n}$ is “almost” symmetric (as long as the entries of $A$ are continuous). We can then derive the eigenvalue distribution in (7.72) from GLT 2 applied to a suitable decomposition of $A_{\mathbf n}$ obtained from (7.56) by replacing $K_{\mathbf n}$ with one of its symmetric approximations. Let us work out the details. By (7.59)–(7.60),
\[
K_{\mathbf n} = \sum_{\ell,k=1}^d K_{\mathbf n,\ell k}(a_{\ell k}), \qquad K_{\mathbf n,\ell k}(a) = \left[\int_{(0,1)^d} a\,\frac{\partial\varphi_{\mathbf i}}{\partial x_\ell}\,\frac{\partial\varphi_{\mathbf j}}{\partial x_k}\right]_{\mathbf i,\mathbf j=\mathbf 1}^{\mathbf n}.
\]
Let $\tilde K_{\mathbf n}$ be the symmetric approximation of $K_{\mathbf n}$ given by
\begin{align*}
\tilde K_{\mathbf n} = \sum_{\ell,k=1}^d \tilde K_{\mathbf n,\ell k}(a_{\ell k}), \qquad \tilde K_{\mathbf n,\ell k}(a) &= S_{\mathbf n}(a) \circ K_{\mathbf n,\ell k}(1)\\
&= S_{\mathbf n}(a) \circ \frac{(n_\ell+1)(n_k+1)}{N(\mathbf n+\mathbf 1)}\,T_{\mathbf n}(H_{\ell k})\\
&= S_{\mathbf n}(a) \circ n^{2-d}\,T_{\mathbf n}(H^{(\boldsymbol\nu)}_{\ell k}).
\end{align*}
We prove that
\[
\|n^{d-2}K_{\mathbf n} - n^{d-2}\tilde K_{\mathbf n}\| \to 0. \tag{7.79}
\]
Once this is done, the eigenvalue distribution in (7.72) follows from (7.71) and GLT 2 applied to the decomposition
\[
n^{d-2}A_{\mathbf n} = n^{d-2}\tilde K_{\mathbf n} + (n^{d-2}K_{\mathbf n} - n^{d-2}\tilde K_{\mathbf n}) + n^{d-2}Z_{\mathbf n},
\]
taking into account what we have seen in Step 1 of the proof of Theorem 7.4, the inequality N 2, and the symmetry of $\tilde K_{\mathbf n}$. To prove (7.79), it is enough to show that
\[
\|n^{d-2}K_{\mathbf n,\ell k}(a) - n^{d-2}\tilde K_{\mathbf n,\ell k}(a)\| \to 0
\]
for every $a \in C([0,1]^d)$ and every $\ell, k = 1,\ldots,d$. With the notation used in the proof of Theorem 7.4 (see in particular (7.75)), we have $K_{\mathbf n,\ell k}(a) = K_{\mathbf n}(aE_{\ell k})$. Thus, keeping in mind that $K_{\mathbf n,\ell k}(1) = K_{\mathbf n}(E_{\ell k}) = n^{2-d}\,T_{\mathbf n}(H^{(\boldsymbol\nu)}_{\ell k})$,
\begin{align*}
&\|n^{d-2}K_{\mathbf n,\ell k}(a) - n^{d-2}\tilde K_{\mathbf n,\ell k}(a)\|\\
&\qquad \le \|n^{d-2}K_{\mathbf n}(aE_{\ell k}) - n^{d-2}D_{\mathbf n}(a)K_{\mathbf n}(E_{\ell k})\| + \|D_{\mathbf n}(a)T_{\mathbf n}(H^{(\boldsymbol\nu)}_{\ell k}) - S_{\mathbf n}(a) \circ T_{\mathbf n}(H^{(\boldsymbol\nu)}_{\ell k})\|,
\end{align*}
which tends to 0 by Theorem 7.2 and what we have seen in Step 4 of the proof of Theorem 7.4.
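7.5 B-Spline IgA Collocation Discretization of Convection-Diffusion-Reaction PDEs

Consider the convection-diffusion-reaction problem
\[
\begin{cases}
-\nabla\cdot A\nabla u + \mathbf b\cdot\nabla u + cu = f, & \text{in } \Omega,\\
u = 0, & \text{on } \partial\Omega,
\end{cases}
\iff
\begin{cases}
-\displaystyle\sum_{\ell,k=1}^d \frac{\partial}{\partial x_\ell}\Bigl(a_{\ell k}\frac{\partial u}{\partial x_k}\Bigr) + \sum_{k=1}^d b_k\frac{\partial u}{\partial x_k} + cu = f, & \text{in } \Omega,\\
u = 0, & \text{on } \partial\Omega,
\end{cases} \tag{7.80}
\]
where $a_{\ell k}$, $b_k$, $c$, $f$ are given functions, $A = [a_{\ell k}]_{\ell,k=1}^d$, $\mathbf b = [b_k]_{k=1}^d$, and $\Omega$ is a bounded open domain in $\mathbb R^d$.

Isogeometric collocation approximation. Problem (7.80) can be reformulated as follows:
\[
\begin{cases}
-\mathbf 1(A \circ Hu)\mathbf 1^T + \mathbf s\cdot\nabla u + cu = f, & \text{in } \Omega,\\
u = 0, & \text{on } \partial\Omega,
\end{cases}
\iff
\begin{cases}
-\displaystyle\sum_{\ell,k=1}^d a_{\ell k}\frac{\partial^2 u}{\partial x_\ell\partial x_k} + \sum_{k=1}^d s_k\frac{\partial u}{\partial x_k} + cu = f, & \text{in } \Omega,\\
u = 0, & \text{on } \partial\Omega,
\end{cases} \tag{7.81}
\]
where $Hu$ is the Hessian of $u$,
\[
(Hu)_{\ell k} = \frac{\partial^2 u}{\partial x_\ell\partial x_k}, \qquad \ell, k = 1,\ldots,d,
\]
and $\mathbf s$ collects the coefficients of the first-order derivatives,
\[
s_k = b_k - \sum_{\ell=1}^d \frac{\partial a_{\ell k}}{\partial x_\ell}, \qquad k = 1,\ldots,d.
\]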
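In the standard collocation method, we choose a finite dimensional vector space $W$, consisting of sufficiently smooth functions defined on $\Omega$ and vanishing on $\partial\Omega$; we call $W$ the approximation space. Then, we introduce a set of $N = \dim W$ collocation points $\{\boldsymbol\tau_1,\ldots,\boldsymbol\tau_N\} \subset \Omega$, and we look for a function $u_W \in W$ satisfying the PDE (7.81) at the points $\boldsymbol\tau_i$, i.e.,
\[
-\mathbf 1\bigl(A(\boldsymbol\tau_i) \circ Hu_W(\boldsymbol\tau_i)\bigr)\mathbf 1^T + \mathbf s(\boldsymbol\tau_i)\cdot\nabla u_W(\boldsymbol\tau_i) + c(\boldsymbol\tau_i)u_W(\boldsymbol\tau_i) = f(\boldsymbol\tau_i), \qquad i = 1,\ldots,N. \tag{7.82}
\]
The function $u_W$ is taken as an approximation to the solution $u$ of (7.81). If $\{\varphi_1,\ldots,\varphi_N\}$ is a basis of $W$, then we have $u_W = \sum_{j=1}^N u_j\varphi_j$ for a unique vector $\mathbf u = (u_1,\ldots,u_N)^T$, and, by linearity, the computation of $u_W$ (i.e., of $\mathbf u$) reduces to solving the linear system
\[
A_C\mathbf u = \mathbf f, \tag{7.83}
\]
where $\mathbf f = [f(\boldsymbol\tau_i)]_{i=1}^N$ and $A_C$ is the collocation matrix,
\begin{align}
A_C &= \Bigl[-\mathbf 1\bigl(A(\boldsymbol\tau_i) \circ H\varphi_j(\boldsymbol\tau_i)\bigr)\mathbf 1^T + \mathbf s(\boldsymbol\tau_i)\cdot\nabla\varphi_j(\boldsymbol\tau_i) + c(\boldsymbol\tau_i)\varphi_j(\boldsymbol\tau_i)\Bigr]_{i,j=1}^N \notag\\
&= \Bigl[-\sum_{\ell,k=1}^d a_{\ell k}(\boldsymbol\tau_i)\frac{\partial^2\varphi_j}{\partial x_\ell\partial x_k}(\boldsymbol\tau_i) + \sum_{k=1}^d s_k(\boldsymbol\tau_i)\frac{\partial\varphi_j}{\partial x_k}(\boldsymbol\tau_i) + c(\boldsymbol\tau_i)\varphi_j(\boldsymbol\tau_i)\Bigr]_{i,j=1}^N \notag\\
&= -\sum_{\ell,k=1}^d \Bigl(\operatorname*{diag}_{i=1,\ldots,N} a_{\ell k}(\boldsymbol\tau_i)\Bigr)\Bigl[\frac{\partial^2\varphi_j}{\partial x_\ell\partial x_k}(\boldsymbol\tau_i)\Bigr]_{i,j=1}^N + \sum_{k=1}^d \Bigl(\operatorname*{diag}_{i=1,\ldots,N} s_k(\boldsymbol\tau_i)\Bigr)\Bigl[\frac{\partial\varphi_j}{\partial x_k}(\boldsymbol\tau_i)\Bigr]_{i,j=1}^N \notag\\
&\quad + \Bigl(\operatorname*{diag}_{i=1,\ldots,N} c(\boldsymbol\tau_i)\Bigr)\bigl[\varphi_j(\boldsymbol\tau_i)\bigr]_{i,j=1}^N. \tag{7.84}
\end{align}
Now, suppose that the physical domain $\Omega$ can be described by a global geometry function $\mathbf G : [0,1]^d \to \Omega$, which is invertible and satisfies $\mathbf G(\partial([0,1]^d)) = \partial\Omega$. Let $\{\hat\varphi_1,\ldots,\hat\varphi_N\}$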
(7.85)
be a set of basis functions defined on the parametric (or reference) domain [0, 1]d and vanishing on the boundary ∂([0, 1]d ). Let {τˆ 1 , . . . , τˆ N }
(7.86)
be a set of N collocation points in (0, 1)d . In the isogeometric collocation approach, we find an approximation u W of u by using the standard collocation method described above, in which • the approximation space is chosen as W = span(ϕ1 , . . . , ϕ N ), with ϕi (x) = ϕˆi (G−1 (x)) = ϕˆi (ˆx),
x = G(ˆx),
i = 1, . . . , N ,
(7.87)
• the collocation points in the physical domain Ω are defined as τ i = G(τˆ i ),
i = 1, . . . , N .
(7.88)
The resulting collocation matrix $A_C$ is given by (7.84), with the basis functions $\varphi_i$ and the collocation points $\boldsymbol\tau_i$ defined as in (7.87) and (7.88). Assuming that $\mathbf G$ and $\hat\varphi_i$, $i = 1,\ldots,N$, are sufficiently regular, we can apply standard differential calculus to express $A_C$ in terms of $\mathbf G$ and $\hat\varphi_i$, $\hat{\boldsymbol\tau}_i$, $i = 1,\ldots,N$. Let us work out this expression. For any $u : \Omega \to \mathbb R$ consider the corresponding function $\hat u : [0,1]^d \to \mathbb R$, which is defined on the parametric domain by
\[
\hat u(\hat{\mathbf x}) = u(\mathbf x), \qquad \mathbf x = \mathbf G(\hat{\mathbf x}). \tag{7.89}
\]
In other words, $\hat u = u(\mathbf G)$. Then, $u$ satisfies (7.81) if and only if $\hat u$ satisfies the corresponding transformed problem
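\[
\begin{cases}
-\mathbf 1(A_{\mathbf G} \circ H\hat u)\mathbf 1^T + \mathbf s_{\mathbf G}\cdot\nabla\hat u + c_{\mathbf G}\hat u = f(\mathbf G), & \text{in } (0,1)^d,\\
\hat u = 0, & \text{on } \partial((0,1)^d),
\end{cases}
\iff
\begin{cases}
-\displaystyle\sum_{\ell,k=1}^d a_{\mathbf G,\ell k}\frac{\partial^2\hat u}{\partial\hat x_\ell\partial\hat x_k} + \sum_{k=1}^d s_{\mathbf G,k}\frac{\partial\hat u}{\partial\hat x_k} + c_{\mathbf G}\hat u = f(\mathbf G), & \text{in } (0,1)^d,\\
\hat u = 0, & \text{on } \partial((0,1)^d).
\end{cases} \tag{7.90}
\]
In (7.90), $H\hat u$ is the Hessian of $\hat u$,
\[
(H\hat u)_{\ell k} = \frac{\partial^2\hat u}{\partial\hat x_\ell\partial\hat x_k}, \qquad \ell, k = 1,\ldots,d,
\]
while $A_{\mathbf G} = [a_{\mathbf G,\ell k}]_{\ell,k=1}^d$, $\mathbf s_{\mathbf G} = [s_{\mathbf G,k}]_{k=1}^d$, $c_{\mathbf G}$ are the transformed diffusion, convection and reaction coefficients. Through standard differential calculus, one can show that
\[
A_{\mathbf G} = (J_{\mathbf G})^{-1}A(\mathbf G)(J_{\mathbf G})^{-T}, \qquad c_{\mathbf G} = c(\mathbf G), \tag{7.91}
\]
where $J_{\mathbf G}$ is the Jacobian matrix of $\mathbf G$,
\[
J_{\mathbf G} = \Bigl[\frac{\partial G_i}{\partial\hat x_j}\Bigr]_{i,j=1}^d = \Bigl[\frac{\partial x_i}{\partial\hat x_j}\Bigr]_{i,j=1}^d.
\]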
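The transformation rule (7.91) for the diffusion coefficient is easy to evaluate pointwise. The sketch below is an illustration only: the quarter-annulus map $\mathbf G(\hat x_1,\hat x_2) = ((1+\hat x_1)\cos(\pi\hat x_2/2),\,(1+\hat x_1)\sin(\pi\hat x_2/2))$ is an assumed example geometry that does not appear in the text, and the function names are hypothetical.

```python
import numpy as np

def transformed_diffusion(A_at_x, J):
    """A_G = J^{-1} A(G) J^{-T} at one parametric point, as in (7.91)."""
    J_inv = np.linalg.inv(np.asarray(J, dtype=float))
    return J_inv @ np.asarray(A_at_x, dtype=float) @ J_inv.T

def quarter_annulus_jacobian(x1_hat, x2_hat):
    """Jacobian of the assumed map G = ((1+x1)cos(pi*x2/2), (1+x1)sin(pi*x2/2))."""
    r = 1.0 + x1_hat
    a = 0.5 * np.pi * x2_hat
    return np.array([[np.cos(a), -r * 0.5 * np.pi * np.sin(a)],
                     [np.sin(a),  r * 0.5 * np.pi * np.cos(a)]])

# Example: isotropic diffusion A = I in the physical domain.
J = quarter_annulus_jacobian(0.3, 0.6)
A_G = transformed_diffusion(np.eye(2), J)
```

Since the columns of this Jacobian are orthogonal, $A_{\mathbf G}$ comes out diagonal with entries $1$ and $1/((1+\hat x_1)\pi/2)^2$: an isotropic problem on the physical domain becomes anisotropic in the parametric variables, while symmetry of $A$ is preserved.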
The expression of $\mathbf s_{\mathbf G}$ in terms of $A$, $\mathbf s$, $\mathbf G$ is complicated and hence not reported here. The collocation matrix $A_C$ in (7.84) can be expressed in terms of $\mathbf G$ and $\hat\varphi_i$, $\hat{\boldsymbol\tau}_i$, $i = 1,\ldots,N$, as follows:
\begin{align}
A_C &= \Bigl[-\mathbf 1\bigl(A_{\mathbf G}(\hat{\boldsymbol\tau}_i) \circ H\hat\varphi_j(\hat{\boldsymbol\tau}_i)\bigr)\mathbf 1^T + \mathbf s_{\mathbf G}(\hat{\boldsymbol\tau}_i)\cdot\nabla\hat\varphi_j(\hat{\boldsymbol\tau}_i) + c_{\mathbf G}(\hat{\boldsymbol\tau}_i)\hat\varphi_j(\hat{\boldsymbol\tau}_i)\Bigr]_{i,j=1}^N \notag\\
&= -\sum_{\ell,k=1}^d \Bigl(\operatorname*{diag}_{i=1,\ldots,N} a_{\mathbf G,\ell k}(\hat{\boldsymbol\tau}_i)\Bigr)\Bigl[\frac{\partial^2\hat\varphi_j}{\partial\hat x_\ell\partial\hat x_k}(\hat{\boldsymbol\tau}_i)\Bigr]_{i,j=1}^N + \sum_{k=1}^d \Bigl(\operatorname*{diag}_{i=1,\ldots,N} s_{\mathbf G,k}(\hat{\boldsymbol\tau}_i)\Bigr)\Bigl[\frac{\partial\hat\varphi_j}{\partial\hat x_k}(\hat{\boldsymbol\tau}_i)\Bigr]_{i,j=1}^N \notag\\
&\quad + \Bigl(\operatorname*{diag}_{i=1,\ldots,N} c_{\mathbf G}(\hat{\boldsymbol\tau}_i)\Bigr)\bigl[\hat\varphi_j(\hat{\boldsymbol\tau}_i)\bigr]_{i,j=1}^N. \tag{7.92}
\end{align}
In the IgA context, the geometry map $\mathbf G$ is expressed in terms of the functions $\hat\varphi_i$, in accordance with the isoparametric approach [12, Sect. 3.1]. Moreover, the functions $\hat\varphi_i$ are usually tensor-product B-splines or their rational versions, the so-called NURBS. In this section, the role of the $\hat\varphi_i$ will be played by tensor-product B-splines over uniform knot sequences. Furthermore, we do not limit ourselves to the isoparametric approach, but we allow the geometry map $\mathbf G$ to be any sufficiently regular function from $[0,1]^d$ to $\Omega$, not necessarily expressed in terms of tensor-product B-splines. Finally, following [1], the collocation points $\hat{\boldsymbol\tau}_i$ will be chosen as the (tensor-product) Greville abscissae corresponding to the tensor-product B-splines $\hat\varphi_i$. For the same analysis as in this section, but with tensor-product B-splines replaced by NURBS, we refer the reader to [18].

Remark 7.7 For later purposes, we point out that the functions $s_{\mathbf G,k}$, $k = 1,\ldots,d$, are bounded over Ω if the following conditions are satisfied:
• for every $\ell, k = 1,\ldots,d$, the function $a_{\ell k} : \Omega \to \mathbb R$ is bounded and its partial derivatives $\partial a_{\ell k}/\partial x_1,\ldots,\partial a_{\ell k}/\partial x_d : \Omega \to \mathbb R$ are bounded;
• for every $k = 1,\ldots,d$, the function $b_k : \Omega \to \mathbb R$ is bounded;
• $\mathbf G \in C^2([0,1]^d)$ and $\det(J_{\mathbf G}) \ne 0$ in $[0,1]^d$.
To understand why this is true (without computing the complicated expression of $\mathbf s_{\mathbf G}$), we suggest that the reader have a look at the unidimensional case [22, Sect. 10.7.1], especially [22, Eq. (10.115)], where $\mathbf s_{\mathbf G} = s_G$ is explicitly given in terms of $A = a$, $\mathbf s = s$, $\mathbf G = G$.

Tensor-product B-splines and Greville abscissae. For $\mathbf p, \mathbf n \in \mathbb N^d$ and $k = 1,\ldots,d$, let $N_{i_k,[p_k]}$, $i_k = 1,\ldots,n_k+p_k$, be the B-splines of degree $p_k$ defined on the knot sequence
\[
t_1 = \cdots = t_{p_k+1} = 0 < t_{p_k+2} < \cdots < t_{p_k+n_k} < 1 = t_{p_k+n_k+1} = \cdots = t_{2p_k+n_k+1}, \tag{7.93}
\]
where
\[
t_{i_k+p_k+1} = \frac{i_k}{n_k}, \qquad i_k = 0,\ldots,n_k. \tag{7.94}
\]
For $i_k = 1,\ldots,n_k+p_k$, let $\xi_{i_k,[p_k]}$ be the Greville abscissa associated with the B-spline $N_{i_k,[p_k]}$. Note that the B-splines $N_{i,[p]}$, $i = 1,\ldots,n+p$, and the associated Greville abscissae $\xi_{i,[p]}$, $i = 1,\ldots,n+p$, have been defined in [22, p. 232] for all $p, n \ge 1$; see also [22, Fig. 10.8]. We define the tensor-product B-splines $N_{\mathbf i,[\mathbf p]} : [0,1]^d \to \mathbb R$ as follows:
\[
N_{\mathbf i,[\mathbf p]} = N_{i_1,[p_1]} \otimes \cdots \otimes N_{i_d,[p_d]}, \qquad \mathbf i = \mathbf 1,\ldots,\mathbf n+\mathbf p. \tag{7.95}
\]
The (tensor-product) Greville abscissa $\boldsymbol\xi_{\mathbf i,[\mathbf p]}$ associated with the tensor-product B-spline $N_{\mathbf i,[\mathbf p]}$ is defined by
\[
\boldsymbol\xi_{\mathbf i,[\mathbf p]} = (\xi_{i_1,[p_1]},\ldots,\xi_{i_d,[p_d]}), \qquad \mathbf i = \mathbf 1,\ldots,\mathbf n+\mathbf p. \tag{7.96}
\]
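The knot sequence (7.93)–(7.94) and the tensor-product Greville abscissae (7.96) can be generated in a few lines. The sketch below assumes the standard univariate definition $\xi_{i,[p]} = (t_{i+1} + \cdots + t_{i+p})/p$, which this section takes from [22, p. 232] without restating it; the function names are hypothetical, and the points are returned in lexicographic order with the last index running fastest.

```python
import numpy as np

def open_uniform_knots(n, p):
    """Knots t_1 <= ... <= t_{n+2p+1} of (7.93)-(7.94): p+1 zeros, i/n, p+1 ones."""
    return np.concatenate([np.zeros(p), np.arange(n + 1) / n, np.ones(p)])

def greville(n, p):
    """Greville abscissae xi_i = (t_{i+1} + ... + t_{i+p}) / p for i = 1, ..., n+p."""
    t = open_uniform_knots(n, p)
    # with 0-based storage, t_{i+1}, ..., t_{i+p} is the slice t[i : i+p]
    return np.array([t[i:i + p].mean() for i in range(1, n + p + 1)])

def tensor_greville(n, p):
    """Rows are the points xi_{i,[p]} of (7.96) for multi-indices n and p."""
    axes = [greville(nk, pk) for nk, pk in zip(n, p)]
    grid = np.meshgrid(*axes, indexing="ij")
    return np.stack([g.ravel() for g in grid], axis=1)

def scaled_deviation(n, p):
    """n * max_i |xi_i - i/(n+p)|; it stays bounded as n grows, in the spirit of (7.104)."""
    i = np.arange(1, n + p + 1)
    return float(n * np.max(np.abs(greville(n, p) - i / (n + p))))
```

Evaluating `scaled_deviation(n, p)` for increasing `n` and fixed `p` gives a quick numerical impression of the asymptotic equivalence between Greville abscissae and uniform grid points.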
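The properties of B-splines and Greville abscissae reported in [22, pp. 232–235] imply the following analogous properties of tensor-product B-splines and Greville abscissae.
• Local support property:
\[
\operatorname{supp}(N_{\mathbf i,[\mathbf p]}) = [\mathbf t_{\mathbf i}, \mathbf t_{\mathbf i+\mathbf p+\mathbf 1}], \qquad \mathbf i = \mathbf 1,\ldots,\mathbf n+\mathbf p, \tag{7.97}
\]
where $[\mathbf t_{\mathbf i}, \mathbf t_{\mathbf i+\mathbf p+\mathbf 1}] = [t_{i_1}, t_{i_1+p_1+1}] \times \cdots \times [t_{i_d}, t_{i_d+p_d+1}]$. In particular, for the measure of the support we have $\mu_d(\operatorname{supp}(N_{\mathbf i,[\mathbf p]})) \le N(\mathbf p+\mathbf 1)/N(\mathbf n)$.
• Vanishing property on the boundary:
\[
N_{\mathbf i,[\mathbf p]}(\mathbf t) = 0, \qquad \mathbf t \in \partial([0,1]^d), \qquad \mathbf i = \mathbf 2,\ldots,\mathbf n+\mathbf p-\mathbf 1. \tag{7.98}
\]
• Nonnegative partition of unity:
\begin{align}
N_{\mathbf i,[\mathbf p]}(\mathbf t) &\ge 0, \qquad \mathbf t \in [0,1]^d, \qquad \mathbf i = \mathbf 1,\ldots,\mathbf n+\mathbf p, \tag{7.99}\\
\sum_{\mathbf i=\mathbf 1}^{\mathbf n+\mathbf p} N_{\mathbf i,[\mathbf p]}(\mathbf t) &= 1, \qquad \mathbf t \in [0,1]^d. \tag{7.100}
\end{align}
• Bounds for derivatives:
\begin{align}
\sum_{\mathbf i=\mathbf 1}^{\mathbf n+\mathbf p} \Bigl|\frac{\partial N_{\mathbf i,[\mathbf p]}}{\partial t_k}(\mathbf t)\Bigr| &\le 2p_kn_k, \qquad \mathbf t \in [0,1]^d, \qquad k = 1,\ldots,d, \tag{7.101}\\
\sum_{\mathbf i=\mathbf 1}^{\mathbf n+\mathbf p} \Bigl|\frac{\partial^2 N_{\mathbf i,[\mathbf p]}}{\partial t_\ell\partial t_k}(\mathbf t)\Bigr| &\le 4p_\ell p_kn_\ell n_k, \qquad \mathbf t \in [0,1]^d, \qquad \ell, k = 1,\ldots,d. \tag{7.102}
\end{align}
In (7.101) and (7.102), just as in their univariate analogs [22, Eqs. (10.128) and (10.129)], it is understood that the undefined values are counted as 0 in the summations.
• $\boldsymbol\xi_{\mathbf i,[\mathbf p]}$ lies in the support of $N_{\mathbf i,[\mathbf p]}$,
\[
\boldsymbol\xi_{\mathbf i,[\mathbf p]} \in \operatorname{supp}(N_{\mathbf i,[\mathbf p]}) = [\mathbf t_{\mathbf i}, \mathbf t_{\mathbf i+\mathbf p+\mathbf 1}], \qquad \mathbf i = \mathbf 1,\ldots,\mathbf n+\mathbf p. \tag{7.103}
\]
• The Greville abscissae are somehow equivalent, in an asymptotic sense, to the uniform knots in $[0,1]^d$. More precisely,
\[
\Bigl\|\boldsymbol\xi_{\mathbf i,[\mathbf p]} - \frac{\mathbf i}{\mathbf n+\mathbf p}\Bigr\|_\infty \le \frac{C_{\mathbf p}}{\min(\mathbf n)}, \qquad \mathbf i = \mathbf 1,\ldots,\mathbf n+\mathbf p, \tag{7.104}
\]
where $C_{\mathbf p}$ depends only on $\mathbf p$.

B-spline IgA collocation matrices. In the IgA collocation approach based on (uniform) tensor-product B-splines, the basis functions $\hat\varphi_1,\ldots,\hat\varphi_N$ in (7.85) are chosen as the tensor-product B-splines
\[
N_{\mathbf i+\mathbf 1,[\mathbf p]}, \qquad \mathbf i = \mathbf 1,\ldots,\mathbf n+\mathbf p-\mathbf 2, \tag{7.105}
\]
and the collocation points $\hat{\boldsymbol\tau}_1,\ldots,\hat{\boldsymbol\tau}_N$ in (7.86) are chosen as the Greville abscissae
\[
\boldsymbol\xi_{\mathbf i+\mathbf 1,[\mathbf p]}, \qquad \mathbf i = \mathbf 1,\ldots,\mathbf n+\mathbf p-\mathbf 2. \tag{7.106}
\]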
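In this $d$-dimensional setting, we have $N = N(\mathbf n+\mathbf p-\mathbf 2)$. Of course, the basis functions (7.105) and the collocation points (7.106) are ordered in accordance with the standard lexicographic ordering. Throughout this section, we assume $\mathbf p \ge \mathbf 2$, so as to ensure that the second-order partial derivative $\partial^2 N_{\mathbf j+\mathbf 1,[\mathbf p]}/\partial t_\ell\partial t_k$ is defined at the Greville abscissa $\boldsymbol\xi_{\mathbf i+\mathbf 1,[\mathbf p]}$ for every $\mathbf i, \mathbf j = \mathbf 1,\ldots,\mathbf n+\mathbf p-\mathbf 2$ and every $\ell, k = 1,\ldots,d$. The collocation matrix (7.92) resulting from the choices of $\hat\varphi_i$, $\hat{\boldsymbol\tau}_i$ as in (7.105) and (7.106) will be denoted by $A^{[\mathbf p]}_{\mathbf G,\mathbf n}$ so as to emphasize its dependence on $\mathbf G$, $\mathbf n$, $\mathbf p$:
\begin{align}
A^{[\mathbf p]}_{\mathbf G,\mathbf n} &= \Bigl[-\mathbf 1\bigl(A_{\mathbf G}(\boldsymbol\xi_{\mathbf i+\mathbf 1,[\mathbf p]}) \circ HN_{\mathbf j+\mathbf 1,[\mathbf p]}(\boldsymbol\xi_{\mathbf i+\mathbf 1,[\mathbf p]})\bigr)\mathbf 1^T + \mathbf s_{\mathbf G}(\boldsymbol\xi_{\mathbf i+\mathbf 1,[\mathbf p]})\cdot\nabla N_{\mathbf j+\mathbf 1,[\mathbf p]}(\boldsymbol\xi_{\mathbf i+\mathbf 1,[\mathbf p]}) \notag\\
&\qquad + c_{\mathbf G}(\boldsymbol\xi_{\mathbf i+\mathbf 1,[\mathbf p]})N_{\mathbf j+\mathbf 1,[\mathbf p]}(\boldsymbol\xi_{\mathbf i+\mathbf 1,[\mathbf p]})\Bigr]_{\mathbf i,\mathbf j=\mathbf 1}^{\mathbf n+\mathbf p-\mathbf 2} \notag\\
&= \sum_{\ell,k=1}^d K^{[\mathbf p]}_{\mathbf n,\ell k}(a_{\mathbf G,\ell k}) + \sum_{k=1}^d H^{[\mathbf p]}_{\mathbf n,k}(s_{\mathbf G,k}) + M^{[\mathbf p]}_{\mathbf n}(c_{\mathbf G}), \tag{7.107}
\end{align}
where
\begin{align*}
K^{[\mathbf p]}_{\mathbf n,\ell k}(a_{\mathbf G,\ell k}) &= D^{[\mathbf p]}_{\mathbf n}(a_{\mathbf G,\ell k})K^{[\mathbf p]}_{\mathbf n,\ell k}, & \ell, k &= 1,\ldots,d,\\
H^{[\mathbf p]}_{\mathbf n,k}(s_{\mathbf G,k}) &= D^{[\mathbf p]}_{\mathbf n}(s_{\mathbf G,k})H^{[\mathbf p]}_{\mathbf n,k}, & k &= 1,\ldots,d,\\
M^{[\mathbf p]}_{\mathbf n}(c_{\mathbf G}) &= D^{[\mathbf p]}_{\mathbf n}(c_{\mathbf G})M^{[\mathbf p]}_{\mathbf n},
\end{align*}
with
\[
D^{[\mathbf p]}_{\mathbf n}(y) = \operatorname*{diag}_{\mathbf i=\mathbf 1,\ldots,\mathbf n+\mathbf p-\mathbf 2} y(\boldsymbol\xi_{\mathbf i+\mathbf 1,[\mathbf p]})
\]
being the $d$-level diagonal sampling matrix containing the samples of the function $y : [0,1]^d \to \mathbb R$ at the Greville abscissae (7.106), and
\begin{align}
K^{[\mathbf p]}_{\mathbf n,\ell k} &= \Bigl[-\frac{\partial^2 N_{\mathbf j+\mathbf 1,[\mathbf p]}}{\partial\hat x_\ell\partial\hat x_k}(\boldsymbol\xi_{\mathbf i+\mathbf 1,[\mathbf p]})\Bigr]_{\mathbf i,\mathbf j=\mathbf 1}^{\mathbf n+\mathbf p-\mathbf 2}, \qquad \ell, k = 1,\ldots,d, \tag{7.108}\\
H^{[\mathbf p]}_{\mathbf n,k} &= \Bigl[\frac{\partial N_{\mathbf j+\mathbf 1,[\mathbf p]}}{\partial\hat x_k}(\boldsymbol\xi_{\mathbf i+\mathbf 1,[\mathbf p]})\Bigr]_{\mathbf i,\mathbf j=\mathbf 1}^{\mathbf n+\mathbf p-\mathbf 2}, \qquad k = 1,\ldots,d, \tag{7.109}\\
M^{[\mathbf p]}_{\mathbf n} &= \bigl[N_{\mathbf j+\mathbf 1,[\mathbf p]}(\boldsymbol\xi_{\mathbf i+\mathbf 1,[\mathbf p]})\bigr]_{\mathbf i,\mathbf j=\mathbf 1}^{\mathbf n+\mathbf p-\mathbf 2}. \tag{7.110}
\end{align}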
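Note that $A^{[\mathbf p]}_{\mathbf G,\mathbf n}$ can be decomposed as follows:
\[
A^{[\mathbf p]}_{\mathbf G,\mathbf n} = K^{[\mathbf p]}_{\mathbf G,\mathbf n} + Z^{[\mathbf p]}_{\mathbf G,\mathbf n}, \tag{7.111}
\]
where
\[
K^{[\mathbf p]}_{\mathbf G,\mathbf n} = \sum_{\ell,k=1}^d K^{[\mathbf p]}_{\mathbf n,\ell k}(a_{\mathbf G,\ell k}) = \sum_{\ell,k=1}^d D^{[\mathbf p]}_{\mathbf n}(a_{\mathbf G,\ell k})K^{[\mathbf p]}_{\mathbf n,\ell k} \tag{7.112}
\]
is the collocation diffusion matrix, resulting from the collocation discretization of the higher-order (diffusion) term in (7.81), and
\[
Z^{[\mathbf p]}_{\mathbf G,\mathbf n} = \sum_{k=1}^d H^{[\mathbf p]}_{\mathbf n,k}(s_{\mathbf G,k}) + M^{[\mathbf p]}_{\mathbf n}(c_{\mathbf G}) = \sum_{k=1}^d D^{[\mathbf p]}_{\mathbf n}(s_{\mathbf G,k})H^{[\mathbf p]}_{\mathbf n,k} + D^{[\mathbf p]}_{\mathbf n}(c_{\mathbf G})M^{[\mathbf p]}_{\mathbf n} \tag{7.113}
\]
is the matrix resulting from the discretization of the lower-order terms (the convection and reaction terms). The next lemma highlights the structure of the matrices $K^{[\mathbf p]}_{\mathbf n,\ell k} = K^{[\mathbf p]}_{\mathbf n,\ell k}(1)$, $H^{[\mathbf p]}_{\mathbf n,k} = H^{[\mathbf p]}_{\mathbf n,k}(1)$, $M^{[\mathbf p]}_{\mathbf n} = M^{[\mathbf p]}_{\mathbf n}(1)$.

Lemma 7.3 For $\mathbf p, \mathbf n \in \mathbb N^d$ with $\mathbf p \ge \mathbf 2$, we have
\[
K^{[\mathbf p]}_{\mathbf n,kk} = \Bigl(\bigotimes_{r=1}^{k-1} M^{[p_r]}_{n_r}\Bigr) \otimes K^{[p_k]}_{n_k} \otimes \Bigl(\bigotimes_{r=k+1}^{d} M^{[p_r]}_{n_r}\Bigr) \tag{7.114}
\]
for $k = 1,\ldots,d$,
\[
K^{[\mathbf p]}_{\mathbf n,\ell k} = K^{[\mathbf p]}_{\mathbf n,k\ell} = -\Bigl(\bigotimes_{r=1}^{\ell-1} M^{[p_r]}_{n_r}\Bigr) \otimes H^{[p_\ell]}_{n_\ell} \otimes \Bigl(\bigotimes_{r=\ell+1}^{k-1} M^{[p_r]}_{n_r}\Bigr) \otimes H^{[p_k]}_{n_k} \otimes \Bigl(\bigotimes_{r=k+1}^{d} M^{[p_r]}_{n_r}\Bigr) \tag{7.115}
\]
for $1 \le \ell < k \le d$,
\[
H^{[\mathbf p]}_{\mathbf n,k} = \Bigl(\bigotimes_{r=1}^{k-1} M^{[p_r]}_{n_r}\Bigr) \otimes H^{[p_k]}_{n_k} \otimes \Bigl(\bigotimes_{r=k+1}^{d} M^{[p_r]}_{n_r}\Bigr) \tag{7.116}
\]
for $k = 1,\ldots,d$, and
\[
M^{[\mathbf p]}_{\mathbf n} = \bigotimes_{r=1}^{d} M^{[p_r]}_{n_r}, \tag{7.117}
\]
where the matrices $K^{[p]}_n$, $H^{[p]}_n$, $M^{[p]}_n$ are defined for all $p, n \in \mathbb N$ with $p \ge 2$ as follows: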
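\begin{align}
K^{[p]}_n &= \bigl[-N''_{j+1,[p]}(\xi_{i+1,[p]})\bigr]_{i,j=1}^{n+p-2}, \tag{7.118}\\
H^{[p]}_n &= \bigl[N'_{j+1,[p]}(\xi_{i+1,[p]})\bigr]_{i,j=1}^{n+p-2}, \tag{7.119}\\
M^{[p]}_n &= \bigl[N_{j+1,[p]}(\xi_{i+1,[p]})\bigr]_{i,j=1}^{n+p-2}, \tag{7.120}
\end{align}
with $N_{i,[p]}$, $i = 1,\ldots,n+p$, and $\xi_{i,[p]}$, $i = 1,\ldots,n+p$, being, respectively, the B-splines of degree $p$ defined on the knot sequence
\[
\Bigl\{\underbrace{0,\ldots,0}_{p+1},\ \frac1n,\ \frac2n,\ \ldots,\ \frac{n-1}{n},\ \underbrace{1,\ldots,1}_{p+1}\Bigr\}
\]
and the associated Greville abscissae; see [22, pp. 232–235] for the corresponding definitions and properties.

Proof We only prove (7.114) as the proof of the other equations is the same. By the crucial property P 7, for all $\mathbf i, \mathbf j = \mathbf 1,\ldots,\mathbf n+\mathbf p-\mathbf 2$ we have
\begin{align*}
\Bigl(\Bigl(\bigotimes_{r=1}^{k-1} M^{[p_r]}_{n_r}\Bigr) \otimes K^{[p_k]}_{n_k} \otimes \Bigl(\bigotimes_{r=k+1}^{d} M^{[p_r]}_{n_r}\Bigr)\Bigr)_{\mathbf i\mathbf j} &= (K^{[p_k]}_{n_k})_{i_kj_k} \prod_{\substack{r=1\\ r\ne k}}^{d} (M^{[p_r]}_{n_r})_{i_rj_r}\\
&= -N''_{j_k+1,[p_k]}(\xi_{i_k+1,[p_k]}) \prod_{\substack{r=1\\ r\ne k}}^{d} N_{j_r+1,[p_r]}(\xi_{i_r+1,[p_r]})\\
&= -\frac{\partial^2 N_{\mathbf j+\mathbf 1,[\mathbf p]}}{\partial\hat x_k^2}(\boldsymbol\xi_{\mathbf i+\mathbf 1,[\mathbf p]}) = (K^{[\mathbf p]}_{\mathbf n,kk})_{\mathbf i\mathbf j}.
\end{align*}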
Remark 7.8 As in the case of FDs and FEs (see Remarks 7.3 and 7.5), $K^{[p]}_n$, $H^{[p]}_n$, $M^{[p]}_n$ are the diffusion, convection, reaction matrices resulting from the B-spline IgA collocation discretization of the univariate problem
\[
-u''(x) + u'(x) + u(x) = f(x), \quad x \in (0,1), \qquad u(0) = u(1) = 0.
\]
In other words, $K^{[p]}_n$, $H^{[p]}_n$, $M^{[p]}_n$ are the matrices resulting from the B-spline IgA collocation discretization of, respectively, the negative second derivative $-u''(x)$, the first derivative $u'(x)$, the identity operator $u(x)$. To see this, follow the above derivation of the IgA collocation matrices in the univariate case $d = 1$ or take a look at [22, Sect. 10.7.1]. In view of what follows, we recall from [22, Eqs. (10.150)–(10.152)] that
\begin{align}
n^{-2}K^{[p]}_n &= T_{n+p-2}(f_p) + R^{[p]}_n, & \operatorname{rank}(R^{[p]}_n) &\le 4(p-1), \tag{7.121}\\
-\mathrm i\,n^{-1}H^{[p]}_n &= T_{n+p-2}(g_p) + S^{[p]}_n, & \operatorname{rank}(S^{[p]}_n) &\le 4(p-1), \tag{7.122}\\
M^{[p]}_n &= T_{n+p-2}(h_p) + V^{[p]}_n, & \operatorname{rank}(V^{[p]}_n) &\le 4(p-1), \tag{7.123}
\end{align}
where $f_p$, $g_p$, $h_p$ are real trigonometric polynomials whose definitions are given in [22, Eqs. (10.147)–(10.149)]. Moreover, we know from [22, Eq. (10.153)] that
\[
\|n^{-2}K^{[p]}_n\|,\ \|n^{-1}H^{[p]}_n\|,\ \|M^{[p]}_n\| \le C^{[p]} \tag{7.124}
\]
for some constant $C^{[p]}$ depending only on $p$.

Remark 7.9 It follows from Lemma 7.3, Eqs. (7.121)–(7.123), T 8 and P 8 that
\[
K^{[\mathbf p]}_{\mathbf n,\ell k} = n_\ell n_k\,T_{\mathbf n+\mathbf p-\mathbf 2}((H_{\mathbf p})_{\ell k}) + R^{[\mathbf p]}_{\mathbf n,\ell k}, \qquad \ell, k = 1,\ldots,d, \tag{7.125}
\]
where
\[
\operatorname{rank}(R^{[\mathbf p]}_{\mathbf n,\ell k}) \le N(\mathbf n+\mathbf p-\mathbf 2) \sum_{i=1}^d \frac{4(p_i-1)}{n_i+p_i-2}, \qquad \ell, k = 1,\ldots,d, \tag{7.126}
\]
and $H_{\mathbf p}(\boldsymbol\theta)$ is the $d \times d$ symmetric matrix defined as follows:
\[
(H_{\mathbf p})_{\ell k}(\boldsymbol\theta) =
\begin{cases}
f_{p_k}(\theta_k) \displaystyle\prod_{\substack{r=1\\ r\ne k}}^{d} h_{p_r}(\theta_r), & \text{if } \ell = k,\\[2ex]
g_{p_\ell}(\theta_\ell)\,g_{p_k}(\theta_k) \displaystyle\prod_{\substack{r=1\\ r\ne \ell,k}}^{d} h_{p_r}(\theta_r), & \text{if } \ell \ne k.
\end{cases} \tag{7.127}
\]
Since $K^{[\mathbf p]}_{\mathbf n,\ell k}(1) = K^{[\mathbf p]}_{\mathbf n,\ell k}$, according to (7.125) and the discussion in Sect. 7.2 (see in particular (7.14)), we may predict that $H_{\mathbf p}(\boldsymbol\theta)$ is, up to some normalization, the symbol of the negative Hessian operator. More precisely, we should call it the symbol of the negative Hessian operator in the parametric variables $\hat x_i$, because $K^{[\mathbf p]}_{\mathbf n,\ell k}(1)$ is the matrix resulting from the B-spline IgA collocation discretization of the $d$-variate problem
\[
\begin{cases}
-\dfrac{\partial^2\hat u}{\partial\hat x_\ell\partial\hat x_k} = f, & \text{in } (0,1)^d,\\
\hat u = 0, & \text{on } \partial((0,1)^d),
\end{cases}
\]
in which the physical domain $\Omega$ and the physical variables $x_i$ are replaced by the parametric domain $(0,1)^d$ and the parametric variables $\hat x_i$. For instance, assuming $\mathbf n = \boldsymbol\nu n$ for some fixed vector $\boldsymbol\nu \in \mathbb Q^d$ with positive components, from (7.125), (7.126), Z 1, GLT 3 and GLT 4 we infer that
\[
\{n^{-2}K^{[\mathbf p]}_{\mathbf n,\ell k}(1)\}_n = \bigl\{\nu_\ell\nu_k\,T_{\mathbf n+\mathbf p-\mathbf 2}((H_{\mathbf p})_{\ell k}) + n^{-2}R^{[\mathbf p]}_{\mathbf n,\ell k}\bigr\}_n \sim_{\rm GLT} \nu_\ell\nu_k\,(H_{\mathbf p})_{\ell k}(\boldsymbol\theta) = (H^{(\boldsymbol\nu)}_{\mathbf p})_{\ell k}(\boldsymbol\theta), \tag{7.128}
\]
where $H^{(\boldsymbol\nu)}_{\mathbf p}(\boldsymbol\theta) = \operatorname{diag}(\boldsymbol\nu)H_{\mathbf p}(\boldsymbol\theta)\operatorname{diag}(\boldsymbol\nu)$. This means that, assuming $\mathbf n = \boldsymbol\nu n$ and after normalization by $n^{-2}$, the symbol of the negative Hessian operator (in the parametric variables) is $H^{(\boldsymbol\nu)}_{\mathbf p}(\boldsymbol\theta)$, which coincides with $H_{\mathbf p}(\boldsymbol\theta)$ up to a trivial transformation. It
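was proved in [17, Theorem 6.1] that, for all $\mathbf p \ge \mathbf 2$, the matrix $H_{\mathbf p}(\boldsymbol\theta)$ is SPSD for all $\boldsymbol\theta \in [-\pi,\pi]^d$ and SPD for all $\boldsymbol\theta \in [-\pi,\pi]^d$ such that $\theta_1\cdots\theta_d \ne 0$.

GLT analysis of the B-spline IgA collocation matrices. Using the theory of multilevel GLT sequences, we now derive the spectral and singular value distribution of the sequence of normalized IgA collocation matrices $\{n^{-2}A^{[\mathbf p]}_{\mathbf G,\mathbf n}\}_n$ under the assumptions that $\mathbf n = \boldsymbol\nu n$ for some fixed vector $\boldsymbol\nu$ and that the geometry map $\mathbf G$ is sufficiently smooth.

Theorem 7.6 Let $\Omega$ be a bounded open domain in $\mathbb R^d$ and suppose that the following conditions on the PDE coefficients and the geometry map are satisfied:
• for every $\ell, k = 1,\ldots,d$, the function $a_{\ell k} : \Omega \to \mathbb R$ belongs to $C(\Omega)$ and its partial derivatives $\partial a_{\ell k}/\partial x_1,\ldots,\partial a_{\ell k}/\partial x_d : \Omega \to \mathbb R$ are bounded;
• for every $k = 1,\ldots,d$, the function $b_k : \Omega \to \mathbb R$ is bounded;
• $c : \Omega \to \mathbb R$ is bounded;
• $\mathbf G \in C^2([0,1]^d)$ and $\det(J_{\mathbf G}) \ne 0$ over $[0,1]^d$.
Let $\mathbf p \ge \mathbf 2$, let $\boldsymbol\nu \in \mathbb Q^d$ be a vector with positive components, and assume that $\mathbf n = \boldsymbol\nu n$ (it is understood that $n$ varies in the infinite subset of $\mathbb N$ such that $\mathbf n = \boldsymbol\nu n \in \mathbb N^d$). Then
\[
\{n^{-2}A^{[\mathbf p]}_{\mathbf G,\mathbf n}\}_n \sim_{\rm GLT} f^{(\boldsymbol\nu)}_{\mathbf G,\mathbf p} \tag{7.129}
\]
and
\[
\{n^{-2}A^{[\mathbf p]}_{\mathbf G,\mathbf n}\}_n \sim_{\sigma,\lambda} f^{(\boldsymbol\nu)}_{\mathbf G,\mathbf p}, \tag{7.130}
\]
where
\begin{align}
f^{(\boldsymbol\nu)}_{\mathbf G,\mathbf p}(\hat{\mathbf x},\boldsymbol\theta) &= \sum_{\ell,k=1}^d a_{\mathbf G,\ell k}(\hat{\mathbf x})(H^{(\boldsymbol\nu)}_{\mathbf p})_{\ell k}(\boldsymbol\theta) = \mathbf 1\bigl(A_{\mathbf G}(\hat{\mathbf x}) \circ H^{(\boldsymbol\nu)}_{\mathbf p}(\boldsymbol\theta)\bigr)\mathbf 1^T = \boldsymbol\nu\bigl(A_{\mathbf G}(\hat{\mathbf x}) \circ H_{\mathbf p}(\boldsymbol\theta)\bigr)\boldsymbol\nu^T, \tag{7.131}\\
H^{(\boldsymbol\nu)}_{\mathbf p}(\boldsymbol\theta) &= \operatorname{diag}(\boldsymbol\nu)H_{\mathbf p}(\boldsymbol\theta)\operatorname{diag}(\boldsymbol\nu), \notag
\end{align}
and $A_{\mathbf G}(\hat{\mathbf x})$ and $H_{\mathbf p}(\boldsymbol\theta)$ are defined, respectively, in (7.91) and (7.127).

Proof The proof consists of the following steps. Throughout the proof, the letter $C$ denotes a generic constant independent of $n$. While reading the proof, the reader should keep in mind the relation $\mathbf n = \boldsymbol\nu n$.

Step 1. We show that
\begin{align}
\|n^{-2}K^{[\mathbf p]}_{\mathbf G,\mathbf n}\| &\le C, \tag{7.132}\\
\|n^{-2}Z^{[\mathbf p]}_{\mathbf G,\mathbf n}\| &\le Cn^{-1}. \tag{7.133}
\end{align}
By Lemma 7.3, the property P 5 and the inequalities (7.124), we have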
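\begin{align}
\|K^{[\mathbf p]}_{\mathbf n,\ell k}\| &\le C^{[\mathbf p]}n_\ell n_k, & \ell, k &= 1,\ldots,d, \tag{7.134}\\
\|H^{[\mathbf p]}_{\mathbf n,k}\| &\le C^{[\mathbf p]}n_k, & k &= 1,\ldots,d, \tag{7.135}\\
\|M^{[\mathbf p]}_{\mathbf n}\| &\le C^{[\mathbf p]}, \tag{7.136}
\end{align}
where $C^{[\mathbf p]} = C^{[p_1]}\cdots C^{[p_d]}$. Moreover, due to the assumptions on the geometry map $\mathbf G$ and the PDE coefficients $a_{\ell k}$, $b_k$, $c$ (see also Remark 7.7), we have
\begin{align}
\|D^{[\mathbf p]}_{\mathbf n}(a_{\mathbf G,\ell k})\| &\le \|a_{\mathbf G,\ell k}\|_\infty < \infty, & \ell, k &= 1,\ldots,d, \tag{7.137}\\
\|D^{[\mathbf p]}_{\mathbf n}(s_{\mathbf G,k})\| &\le \|s_{\mathbf G,k}\|_\infty < \infty, & k &= 1,\ldots,d, \tag{7.138}\\
\|D^{[\mathbf p]}_{\mathbf n}(c_{\mathbf G})\| &\le \|c_{\mathbf G}\|_\infty < \infty. \tag{7.139}
\end{align}
Thus, from (7.112) and (7.113) we obtain
\begin{align*}
\|K^{[\mathbf p]}_{\mathbf G,\mathbf n}\| &\le \sum_{\ell,k=1}^d \|D^{[\mathbf p]}_{\mathbf n}(a_{\mathbf G,\ell k})\|\,\|K^{[\mathbf p]}_{\mathbf n,\ell k}\| \le \sum_{\ell,k=1}^d \|a_{\mathbf G,\ell k}\|_\infty C^{[\mathbf p]}n_\ell n_k \le Cn^2,\\
\|Z^{[\mathbf p]}_{\mathbf G,\mathbf n}\| &\le \sum_{k=1}^d \|D^{[\mathbf p]}_{\mathbf n}(s_{\mathbf G,k})\|\,\|H^{[\mathbf p]}_{\mathbf n,k}\| + \|D^{[\mathbf p]}_{\mathbf n}(c_{\mathbf G})\|\,\|M^{[\mathbf p]}_{\mathbf n}\|\\
&\le \sum_{k=1}^d \|s_{\mathbf G,k}\|_\infty C^{[\mathbf p]}n_k + \|c_{\mathbf G}\|_\infty C^{[\mathbf p]} \le Cn,
\end{align*}
which imply (7.132) and (7.133).

Step 2. Define the symmetric matrix
\begin{align}
n^{-2}\tilde K^{[\mathbf p]}_{\mathbf G,\mathbf n} &= n^{-2}\sum_{\ell,k=1}^d S_{\mathbf n+\mathbf p-\mathbf 2}(a_{\mathbf G,\ell k}) \circ n_\ell n_k\,T_{\mathbf n+\mathbf p-\mathbf 2}((H_{\mathbf p})_{\ell k}) \notag\\
&= \sum_{\ell,k=1}^d S_{\mathbf n+\mathbf p-\mathbf 2}(a_{\mathbf G,\ell k}) \circ T_{\mathbf n+\mathbf p-\mathbf 2}((H^{(\boldsymbol\nu)}_{\mathbf p})_{\ell k}) \tag{7.140}
\end{align}
and consider the following decomposition of $n^{-2}A^{[\mathbf p]}_{\mathbf G,\mathbf n}$:
\[
n^{-2}A^{[\mathbf p]}_{\mathbf G,\mathbf n} = n^{-2}\tilde K^{[\mathbf p]}_{\mathbf G,\mathbf n} + (n^{-2}K^{[\mathbf p]}_{\mathbf G,\mathbf n} - n^{-2}\tilde K^{[\mathbf p]}_{\mathbf G,\mathbf n}) + n^{-2}Z^{[\mathbf p]}_{\mathbf G,\mathbf n}. \tag{7.141}
\]
By Theorem 7.2 and GLT 4, $\|n^{-2}\tilde K^{[\mathbf p]}_{\mathbf G,\mathbf n}\| \le C$ and $\{n^{-2}\tilde K^{[\mathbf p]}_{\mathbf G,\mathbf n}\}_n \sim_{\rm GLT} f^{(\boldsymbol\nu)}_{\mathbf G,\mathbf p}(\hat{\mathbf x},\boldsymbol\theta)$. In the next step we show that $n^{-2}\tilde K^{[\mathbf p]}_{\mathbf G,\mathbf n}$ is a symmetric approximation of $n^{-2}K^{[\mathbf p]}_{\mathbf G,\mathbf n}$, in the sense that
\[
\|n^{-2}K^{[\mathbf p]}_{\mathbf G,\mathbf n} - n^{-2}\tilde K^{[\mathbf p]}_{\mathbf G,\mathbf n}\|_1 = o(n^d). \tag{7.142}
\]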
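Once this is done, the theorem is proved. Indeed, from (7.142), N 2 and Step 1 we have
\begin{align*}
\|n^{-2}K^{[\mathbf p]}_{\mathbf G,\mathbf n} - n^{-2}\tilde K^{[\mathbf p]}_{\mathbf G,\mathbf n} + n^{-2}Z^{[\mathbf p]}_{\mathbf G,\mathbf n}\|_1 &= o(n^d),\\
\|n^{-2}K^{[\mathbf p]}_{\mathbf G,\mathbf n} - n^{-2}\tilde K^{[\mathbf p]}_{\mathbf G,\mathbf n} + n^{-2}Z^{[\mathbf p]}_{\mathbf G,\mathbf n}\| &\le C,
\end{align*}
and so (7.129)–(7.130) follow from Z 2 and GLT 1−GLT 4, with GLT 2 applied to the decomposition (7.141).

Step 3. To prove (7.142), we use the fact that, by (7.104),
\begin{align}
\|D^{[\mathbf p]}_{\mathbf n}(a) - D_{\mathbf n+\mathbf p-\mathbf 2}(a)\| &= \max_{\mathbf j=\mathbf 1,\ldots,\mathbf n+\mathbf p-\mathbf 2} \Bigl|a(\boldsymbol\xi_{\mathbf j+\mathbf 1,[\mathbf p]}) - a\Bigl(\frac{\mathbf j}{\mathbf n+\mathbf p-\mathbf 2}\Bigr)\Bigr| \notag\\
&\le \max_{\mathbf j=\mathbf 1,\ldots,\mathbf n+\mathbf p-\mathbf 2} \omega_a\Bigl(\Bigl\|\boldsymbol\xi_{\mathbf j+\mathbf 1,[\mathbf p]} - \frac{\mathbf j}{\mathbf n+\mathbf p-\mathbf 2}\Bigr\|_\infty\Bigr) \le \omega_a\Bigl(\frac{C_{\mathbf p}}{\min(\mathbf n)}\Bigr) \le \omega_a(Cn^{-1}) \tag{7.143}
\end{align}
for all functions $a \in C([0,1]^d)$. We have
\begin{align}
n^{-2}K^{[\mathbf p]}_{\mathbf G,\mathbf n} - n^{-2}\tilde K^{[\mathbf p]}_{\mathbf G,\mathbf n} &= n^{-2}\sum_{\ell,k=1}^d D^{[\mathbf p]}_{\mathbf n}(a_{\mathbf G,\ell k})K^{[\mathbf p]}_{\mathbf n,\ell k} - \sum_{\ell,k=1}^d S_{\mathbf n+\mathbf p-\mathbf 2}(a_{\mathbf G,\ell k}) \circ T_{\mathbf n+\mathbf p-\mathbf 2}((H^{(\boldsymbol\nu)}_{\mathbf p})_{\ell k}) \notag\\
&= \sum_{\ell,k=1}^d \Bigl[D^{[\mathbf p]}_{\mathbf n}(a_{\mathbf G,\ell k})\,n^{-2}K^{[\mathbf p]}_{\mathbf n,\ell k} - D^{[\mathbf p]}_{\mathbf n}(a_{\mathbf G,\ell k})T_{\mathbf n+\mathbf p-\mathbf 2}((H^{(\boldsymbol\nu)}_{\mathbf p})_{\ell k})\Bigr] \tag{7.144}\\
&\quad + \sum_{\ell,k=1}^d \Bigl[D^{[\mathbf p]}_{\mathbf n}(a_{\mathbf G,\ell k})T_{\mathbf n+\mathbf p-\mathbf 2}((H^{(\boldsymbol\nu)}_{\mathbf p})_{\ell k}) - D_{\mathbf n+\mathbf p-\mathbf 2}(a_{\mathbf G,\ell k})T_{\mathbf n+\mathbf p-\mathbf 2}((H^{(\boldsymbol\nu)}_{\mathbf p})_{\ell k})\Bigr] \tag{7.145}\\
&\quad + \sum_{\ell,k=1}^d \Bigl[D_{\mathbf n+\mathbf p-\mathbf 2}(a_{\mathbf G,\ell k})T_{\mathbf n+\mathbf p-\mathbf 2}((H^{(\boldsymbol\nu)}_{\mathbf p})_{\ell k}) - S_{\mathbf n+\mathbf p-\mathbf 2}(a_{\mathbf G,\ell k}) \circ T_{\mathbf n+\mathbf p-\mathbf 2}((H^{(\boldsymbol\nu)}_{\mathbf p})_{\ell k})\Bigr]. \tag{7.146}
\end{align}
We consider separately the three summations in (7.144)–(7.146) and we show that their trace-norms are $o(n^d)$. Once this is done, the proof of the theorem is complete.
• Consider the first summation (7.144). For $\ell, k = 1,\ldots,d$, let
\[
X^{[\mathbf p]}_{\mathbf n,\ell k} = D^{[\mathbf p]}_{\mathbf n}(a_{\mathbf G,\ell k})\,n^{-2}K^{[\mathbf p]}_{\mathbf n,\ell k} - D^{[\mathbf p]}_{\mathbf n}(a_{\mathbf G,\ell k})T_{\mathbf n+\mathbf p-\mathbf 2}((H^{(\boldsymbol\nu)}_{\mathbf p})_{\ell k}).
\]
By (7.125),
\[
\operatorname{rank}(X^{[\mathbf p]}_{\mathbf n,\ell k}) \le Cn^{d-1}, \qquad \ell, k = 1,\ldots,d.
\]
Moreover, by (7.134), (7.137) and T 3,
\[
\|X^{[\mathbf p]}_{\mathbf n,\ell k}\| \le C, \qquad \ell, k = 1,\ldots,d.
\]
Thus, by N 2,
\[
\|X^{[\mathbf p]}_{\mathbf n,\ell k}\|_1 \le \operatorname{rank}(X^{[\mathbf p]}_{\mathbf n,\ell k})\,\|X^{[\mathbf p]}_{\mathbf n,\ell k}\| \le Cn^{d-1}.
\]
This proves that the trace-norm of the first summation (7.144) is $o(n^d)$ (actually, $O(n^{d-1})$).
• Consider now the second summation (7.145). For $\ell, k = 1,\ldots,d$, let
\[
Y^{[\mathbf p]}_{\mathbf n,\ell k} = D^{[\mathbf p]}_{\mathbf n}(a_{\mathbf G,\ell k})T_{\mathbf n+\mathbf p-\mathbf 2}((H^{(\boldsymbol\nu)}_{\mathbf p})_{\ell k}) - D_{\mathbf n+\mathbf p-\mathbf 2}(a_{\mathbf G,\ell k})T_{\mathbf n+\mathbf p-\mathbf 2}((H^{(\boldsymbol\nu)}_{\mathbf p})_{\ell k}).
\]
Taking into account that the $a_{\mathbf G,\ell k}$ are continuous on $[0,1]^d$ due to the assumptions on $\mathbf G$ and the $a_{\ell k}$, by (7.143) and T 3 we have
\[
\|Y^{[\mathbf p]}_{\mathbf n,\ell k}\| \le C\,\omega_{a_{\mathbf G,\ell k}}(Cn^{-1}), \qquad \ell, k = 1,\ldots,d,
\]
and so, by N 2,
\[
\|Y^{[\mathbf p]}_{\mathbf n,\ell k}\|_1 \le Cn^d\,\omega_{a_{\mathbf G,\ell k}}(Cn^{-1}), \qquad \ell, k = 1,\ldots,d.
\]
This proves that the trace-norm of the second summation (7.145) is $o(n^d)$.
• Finally, consider the third summation (7.146). For $\ell, k = 1,\ldots,d$, let
\[
V^{[\mathbf p]}_{\mathbf n,\ell k} = D_{\mathbf n+\mathbf p-\mathbf 2}(a_{\mathbf G,\ell k})T_{\mathbf n+\mathbf p-\mathbf 2}((H^{(\boldsymbol\nu)}_{\mathbf p})_{\ell k}) - S_{\mathbf n+\mathbf p-\mathbf 2}(a_{\mathbf G,\ell k}) \circ T_{\mathbf n+\mathbf p-\mathbf 2}((H^{(\boldsymbol\nu)}_{\mathbf p})_{\ell k}).
\]
By Theorem 7.2,
\[
\|V^{[\mathbf p]}_{\mathbf n,\ell k}\| \le C\,\omega_{a_{\mathbf G,\ell k}}(Cn^{-1}), \qquad \ell, k = 1,\ldots,d,
\]
and so, by N 2,
\[
\|V^{[\mathbf p]}_{\mathbf n,\ell k}\|_1 \le Cn^d\,\omega_{a_{\mathbf G,\ell k}}(Cn^{-1}), \qquad \ell, k = 1,\ldots,d.
\]
This proves that the trace-norm of the third summation (7.146) is $o(n^d)$.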
We remark that the proof of Theorem 7.6 is essentially the same(!) as the proof of its univariate version [22, Theorem 10.14].

Remark 7.10 (formal structure of the symbol) It is important to note the impressive analogy between the expression of the symbol $\mathbf 1(A_G(\hat{\mathbf x})\circ H_{\mathbf p}^{(\boldsymbol\nu)}(\boldsymbol\theta))\mathbf 1^T$ in (7.131) and the expression of the higher-order differential operator $-\mathbf 1(A_G(\hat{\mathbf x})\circ H\hat u(\hat{\mathbf x}))\mathbf 1^T$ in the transformed problem (7.90). We invite the reader to think about this analogy in the light of Remark 7.9 and the introductory discussion in Sect. 7.2.
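7.6 Galerkin B-Spline IgA Discretization of Convection-Diffusion-Reaction PDEs

Consider the convection-diffusion-reaction problem
\[
\begin{cases}
-\nabla\cdot A\nabla u+\mathbf b\cdot\nabla u+cu=f, & \text{in }\Omega,\\
u=0, & \text{on }\partial\Omega,
\end{cases}
\iff
\begin{cases}
\displaystyle-\sum_{\ell,k=1}^d\frac{\partial}{\partial x_\ell}\Bigl(a_{\ell k}\frac{\partial u}{\partial x_k}\Bigr)+\sum_{k=1}^d b_k\frac{\partial u}{\partial x_k}+cu=f, & \text{in }\Omega,\\
u=0, & \text{on }\partial\Omega,
\end{cases}
\tag{7.147}
\]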
where $a_{\ell k}$, $b_k$, $c$, $f$ are given functions, $A=[a_{\ell k}]_{\ell,k=1}^d$, $\mathbf b=[b_k]_{k=1}^d$, and $\Omega$ is a bounded open domain in $\mathbb R^d$ with Lipschitz boundary.

Isogeometric Galerkin approximation. The weak form of (7.147) reads as follows: find $u\in H_0^1(\Omega)$ such that
\[
a(u,v)=f(v),\qquad\forall\,v\in H_0^1(\Omega),
\]
where
\[
a(u,v)=\int_\Omega\bigl((\nabla v)^TA\nabla u+(\nabla u)^T\mathbf b\,v+cuv\bigr),\qquad f(v)=\int_\Omega fv.
\]
In the standard Galerkin method, we look for an approximation $u_W$ of $u$ by choosing a finite dimensional vector space $W\subset H_0^1(\Omega)$, the so-called approximation space, and by solving the following (Galerkin) problem: find $u_W\in W$ such that
\[
a(u_W,v)=f(v),\qquad\forall\,v\in W.
\]
If $\{\varphi_1,\ldots,\varphi_N\}$ is a basis of $W$, then we can write $u_W=\sum_{j=1}^N u_j\varphi_j$ for a unique vector $\mathbf u=(u_1,\ldots,u_N)^T$, and, by linearity, the computation of $u_W$ (i.e., of $\mathbf u$) reduces to solving the linear system
\[
\mathcal A_G\mathbf u=\mathbf f,
\]
where $\mathbf f=[f(\varphi_i)]_{i=1}^N$ and
\[
\mathcal A_G=\bigl[a(\varphi_j,\varphi_i)\bigr]_{i,j=1}^N=\Bigl[\int_\Omega\bigl((\nabla\varphi_i)^TA\nabla\varphi_j+(\nabla\varphi_j)^T\mathbf b\,\varphi_i+c\,\varphi_j\varphi_i\bigr)\Bigr]_{i,j=1}^N
\tag{7.148}
\]
is the Galerkin stiffness matrix.

Now, suppose that the physical domain $\Omega$ can be described by a global geometry function $G:[0,1]^d\to\Omega$, which is invertible and satisfies $G(\partial([0,1]^d))=\partial\Omega$. Let $\{\hat\varphi_1,\ldots,\hat\varphi_N\}$ be a set of basis functions defined on the parametric (or reference) domain $[0,1]^d$ and vanishing on the boundary $\partial([0,1]^d)$. In the isogeometric Galerkin approach, we find an approximation $u_W$ of $u$ by using the standard Galerkin method, in which the approximation space is chosen as $W=\mathrm{span}(\varphi_1,\ldots,\varphi_N)$, where
\[
\varphi_i(\mathbf x)=\hat\varphi_i(G^{-1}(\mathbf x))=\hat\varphi_i(\hat{\mathbf x}),\qquad\mathbf x=G(\hat{\mathbf x}).
\tag{7.149}
\]
The resulting stiffness matrix $\mathcal A_G$ is given by (7.148), with the basis functions $\varphi_i$ defined as in (7.149). Assuming that $G$ and $\hat\varphi_i$, $i=1,\ldots,N$, are sufficiently regular, we can apply standard differential calculus to obtain the following expression for $\mathcal A_G$ in terms of $G$ and $\hat\varphi_i$, $i=1,\ldots,N$:
\[
\mathcal A_G=\Bigl[\int_{[0,1]^d}\bigl((\nabla\hat\varphi_i)^TA_G\nabla\hat\varphi_j+(\nabla\hat\varphi_j)^T(J_G)^{-1}\mathbf b(G)\,\hat\varphi_i+c(G)\hat\varphi_j\hat\varphi_i\bigr)|\det(J_G)|\Bigr]_{i,j=1}^N,
\tag{7.150}
\]
where
\[
A_G=(J_G)^{-1}A(G)(J_G)^{-T}=\bigl[a_{G,\ell k}\bigr]_{\ell,k=1}^d,
\tag{7.151}
\]
and $J_G$ is the Jacobian matrix of $G$, i.e.,
\[
J_G=\Bigl[\frac{\partial G_i}{\partial\hat x_j}\Bigr]_{i,j=1}^d=\Bigl[\frac{\partial x_i}{\partial\hat x_j}\Bigr]_{i,j=1}^d.
\]
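The pull-back in (7.151) can be made concrete with a small numerical sketch. The quarter-annulus map $G$ used below is a hypothetical example chosen for illustration (it does not come from the text), and the helper names are ours. For isotropic diffusion $A=I$ and this map, a direct computation gives $A_G=\mathrm{diag}(1,(2/(\pi r))^2)$ and $|\det(J_G)|=\pi r/2$ with $r=1+\hat x_1$, which the code reproduces.

```python
import math

def jacobian(xh1, xh2):
    """Jacobian J_G of the (hypothetical) quarter-annulus map
    G(xh1, xh2) = ((1 + xh1) cos(pi*xh2/2), (1 + xh1) sin(pi*xh2/2))."""
    r, phi = 1.0 + xh1, 0.5 * math.pi * xh2
    return [[math.cos(phi), -0.5 * math.pi * r * math.sin(phi)],
            [math.sin(phi),  0.5 * math.pi * r * math.cos(phi)]]

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def inv2(m):
    d = det2(m)
    return [[ m[1][1] / d, -m[0][1] / d],
            [-m[1][0] / d,  m[0][0] / d]]

def matmul2(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose2(m):
    return [[m[0][0], m[1][0]], [m[0][1], m[1][1]]]

def pulled_back_coefficient(A, xh1, xh2):
    """A_G = J_G^{-1} A(G) J_G^{-T} at the parametric point (xh1, xh2).
    A is taken constant here, so it need not be evaluated at G(xh)."""
    Jinv = inv2(jacobian(xh1, xh2))
    return matmul2(matmul2(Jinv, A), transpose2(Jinv))

# Isotropic diffusion A = I evaluated at one parametric point.
AG = pulled_back_coefficient([[1.0, 0.0], [0.0, 1.0]], 0.3, 0.7)
weight = abs(det2(jacobian(0.3, 0.7)))   # the factor |det(J_G)| in (7.150)
```

Even though $A$ is the identity in physical coordinates, $A_G$ is anisotropic in the parametric coordinates; this is the matrix that enters the symbol through the Hadamard product with $H_{\mathbf p}^{(\boldsymbol\nu)}$.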
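In the IgA framework, $G$ is expressed in terms of the functions $\hat\varphi_i$, which are usually tensor-product B-splines or NURBS. Here, we do not require that $G$ be expressed in terms of the $\hat\varphi_i$, and the $\hat\varphi_i$ are chosen as tensor-product B-splines over uniform knot sequences; for the case of NURBS, see [18].

Galerkin B-spline IgA discretization matrices. As in the IgA collocation framework considered in Sect. 7.5, in the Galerkin B-spline IgA based on (uniform) tensor-product B-splines, the functions $\hat\varphi_1,\ldots,\hat\varphi_N$ are chosen as the tensor-product B-splines $N_{\mathbf 2,[\mathbf p]},\ldots,N_{\mathbf n+\mathbf p-\mathbf 1,[\mathbf p]}$ in (7.105). The resulting stiffness matrix (7.150) will be denoted by $A_{G,\mathbf n}^{[\mathbf p]}$ so as to emphasize its dependence on $G$, $\mathbf n$, $\mathbf p$:
\[
\begin{aligned}
A_{G,\mathbf n}^{[\mathbf p]}
&=\Bigl[\int_{[0,1]^d}\bigl((\nabla N_{\mathbf i+\mathbf 1,[\mathbf p]})^TA_G\nabla N_{\mathbf j+\mathbf 1,[\mathbf p]}+(\nabla N_{\mathbf j+\mathbf 1,[\mathbf p]})^T(J_G)^{-1}\mathbf b(G)\,N_{\mathbf i+\mathbf 1,[\mathbf p]} \\
&\qquad\qquad+c(G)N_{\mathbf j+\mathbf 1,[\mathbf p]}N_{\mathbf i+\mathbf 1,[\mathbf p]}\bigr)|\det(J_G)|\Bigr]_{\mathbf i,\mathbf j=\mathbf 1}^{\mathbf n+\mathbf p-\mathbf 2} \\
&=\sum_{\ell,k=1}^dK_{G,\mathbf n,\ell k}^{[\mathbf p]}+\sum_{k=1}^dH_{G,\mathbf n,k}^{[\mathbf p]}+M_{G,\mathbf n}^{[\mathbf p]},
\end{aligned}
\tag{7.152}
\]
where, for $\ell,k=1,\ldots,d$,
\[
K_{G,\mathbf n,\ell k}^{[\mathbf p]}=K_{\mathbf n,\ell k}^{[\mathbf p]}(|\det(J_G)|a_{G,\ell k})=\Bigl[\int_{[0,1]^d}|\det(J_G)|\,a_{G,\ell k}\,\frac{\partial N_{\mathbf i+\mathbf 1,[\mathbf p]}}{\partial\hat x_\ell}\frac{\partial N_{\mathbf j+\mathbf 1,[\mathbf p]}}{\partial\hat x_k}\Bigr]_{\mathbf i,\mathbf j=\mathbf 1}^{\mathbf n+\mathbf p-\mathbf 2},
\tag{7.153}
\]
\[
H_{G,\mathbf n,k}^{[\mathbf p]}=H_{\mathbf n,k}^{[\mathbf p]}(|\det(J_G)|((J_G)^{-1}\mathbf b(G))_k)=\Bigl[\int_{[0,1]^d}|\det(J_G)|\,((J_G)^{-1}\mathbf b(G))_k\,\frac{\partial N_{\mathbf j+\mathbf 1,[\mathbf p]}}{\partial\hat x_k}N_{\mathbf i+\mathbf 1,[\mathbf p]}\Bigr]_{\mathbf i,\mathbf j=\mathbf 1}^{\mathbf n+\mathbf p-\mathbf 2},
\tag{7.154}
\]
\[
M_{G,\mathbf n}^{[\mathbf p]}=M_{\mathbf n}^{[\mathbf p]}(|\det(J_G)|c(G))=\Bigl[\int_{[0,1]^d}|\det(J_G)|\,c(G)\,N_{\mathbf j+\mathbf 1,[\mathbf p]}N_{\mathbf i+\mathbf 1,[\mathbf p]}\Bigr]_{\mathbf i,\mathbf j=\mathbf 1}^{\mathbf n+\mathbf p-\mathbf 2}.
\tag{7.155}
\]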
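Note that $A_{G,\mathbf n}^{[\mathbf p]}$ can be decomposed as follows:
\[
A_{G,\mathbf n}^{[\mathbf p]}=K_{G,\mathbf n}^{[\mathbf p]}+Z_{G,\mathbf n}^{[\mathbf p]},
\tag{7.156}
\]
where
\[
K_{G,\mathbf n}^{[\mathbf p]}=\sum_{\ell,k=1}^dK_{G,\mathbf n,\ell k}^{[\mathbf p]}=\Bigl[\int_{[0,1]^d}(\nabla N_{\mathbf i+\mathbf 1,[\mathbf p]})^T|\det(J_G)|A_G\nabla N_{\mathbf j+\mathbf 1,[\mathbf p]}\Bigr]_{\mathbf i,\mathbf j=\mathbf 1}^{\mathbf n+\mathbf p-\mathbf 2}
\tag{7.157}
\]
is the matrix resulting from the discretization of the higher-order (diffusion) term in (7.147), and
\[
\begin{aligned}
Z_{G,\mathbf n}^{[\mathbf p]}&=\sum_{k=1}^dH_{G,\mathbf n,k}^{[\mathbf p]}+M_{G,\mathbf n}^{[\mathbf p]} \\
&=\Bigl[\int_{[0,1]^d}\bigl(|\det(J_G)|(\nabla N_{\mathbf j+\mathbf 1,[\mathbf p]})^T(J_G)^{-1}\mathbf b(G)\,N_{\mathbf i+\mathbf 1,[\mathbf p]}+|\det(J_G)|c(G)N_{\mathbf j+\mathbf 1,[\mathbf p]}N_{\mathbf i+\mathbf 1,[\mathbf p]}\bigr)\Bigr]_{\mathbf i,\mathbf j=\mathbf 1}^{\mathbf n+\mathbf p-\mathbf 2}
\end{aligned}
\tag{7.158}
\]
is the matrix resulting from the discretization of the terms in (7.147) with lower-order derivatives (the convection and reaction terms). For $\ell,k=1,\ldots,d$, define the matrices
\[
K_{\mathbf n,\ell k}^{[\mathbf p]}=K_{\mathbf n,\ell k}^{[\mathbf p]}(1)=\Bigl[\int_{[0,1]^d}\frac{\partial N_{\mathbf i+\mathbf 1,[\mathbf p]}}{\partial\hat x_\ell}\frac{\partial N_{\mathbf j+\mathbf 1,[\mathbf p]}}{\partial\hat x_k}\Bigr]_{\mathbf i,\mathbf j=\mathbf 1}^{\mathbf n+\mathbf p-\mathbf 2},
\tag{7.159}
\]
\[
H_{\mathbf n,k}^{[\mathbf p]}=H_{\mathbf n,k}^{[\mathbf p]}(1)=\Bigl[\int_{[0,1]^d}\frac{\partial N_{\mathbf j+\mathbf 1,[\mathbf p]}}{\partial\hat x_k}N_{\mathbf i+\mathbf 1,[\mathbf p]}\Bigr]_{\mathbf i,\mathbf j=\mathbf 1}^{\mathbf n+\mathbf p-\mathbf 2},
\tag{7.160}
\]
\[
M_{\mathbf n}^{[\mathbf p]}=M_{\mathbf n}^{[\mathbf p]}(1)=\Bigl[\int_{[0,1]^d}N_{\mathbf j+\mathbf 1,[\mathbf p]}N_{\mathbf i+\mathbf 1,[\mathbf p]}\Bigr]_{\mathbf i,\mathbf j=\mathbf 1}^{\mathbf n+\mathbf p-\mathbf 2}.
\tag{7.161}
\]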
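Lemma 7.4 For $\mathbf p,\mathbf n\in\mathbb N^d$, we have
\[
K_{\mathbf n,kk}^{[\mathbf p]}=\Bigl(\bigotimes_{r=1}^{k-1}M_{n_r}^{[p_r]}\Bigr)\otimes K_{n_k}^{[p_k]}\otimes\Bigl(\bigotimes_{r=k+1}^{d}M_{n_r}^{[p_r]}\Bigr)
\tag{7.162}
\]
for $k=1,\ldots,d$,
\[
K_{\mathbf n,\ell k}^{[\mathbf p]}=K_{\mathbf n,k\ell}^{[\mathbf p]}=-\Bigl(\bigotimes_{r=1}^{\ell-1}M_{n_r}^{[p_r]}\Bigr)\otimes H_{n_\ell}^{[p_\ell]}\otimes\Bigl(\bigotimes_{r=\ell+1}^{k-1}M_{n_r}^{[p_r]}\Bigr)\otimes H_{n_k}^{[p_k]}\otimes\Bigl(\bigotimes_{r=k+1}^{d}M_{n_r}^{[p_r]}\Bigr)
\tag{7.163}
\]
for $1\le\ell<k\le d$,
\[
H_{\mathbf n,k}^{[\mathbf p]}=\Bigl(\bigotimes_{r=1}^{k-1}M_{n_r}^{[p_r]}\Bigr)\otimes H_{n_k}^{[p_k]}\otimes\Bigl(\bigotimes_{r=k+1}^{d}M_{n_r}^{[p_r]}\Bigr)
\tag{7.164}
\]
for $k=1,\ldots,d$, and
\[
M_{\mathbf n}^{[\mathbf p]}=\bigotimes_{r=1}^{d}M_{n_r}^{[p_r]},
\tag{7.165}
\]
where the matrices $K_n^{[p]}$, $H_n^{[p]}$, $M_n^{[p]}$ are defined for all $p,n\in\mathbb N$ as follows:
\[
K_n^{[p]}=\Bigl[\int_0^1N'_{j+1,[p]}N'_{i+1,[p]}\Bigr]_{i,j=1}^{n+p-2},
\tag{7.166}
\]
\[
H_n^{[p]}=\Bigl[\int_0^1N'_{j+1,[p]}N_{i+1,[p]}\Bigr]_{i,j=1}^{n+p-2}=-\Bigl[\int_0^1N_{j+1,[p]}N'_{i+1,[p]}\Bigr]_{i,j=1}^{n+p-2},
\tag{7.167}
\]
\[
M_n^{[p]}=\Bigl[\int_0^1N_{j+1,[p]}N_{i+1,[p]}\Bigr]_{i,j=1}^{n+p-2},
\tag{7.168}
\]
with $N_{i,[p]}$, $i=1,\ldots,n+p$, being the B-splines of degree $p$ on the knot sequence
\[
\Bigl\{\underbrace{0,\ldots,0}_{p+1},\ \frac1n,\ \frac2n,\ \ldots,\ \frac{n-1}{n},\ \underbrace{1,\ldots,1}_{p+1}\Bigr\};
\]
see [22, pp. 232–235] for the corresponding definitions and properties.

Proof We only prove (7.162) as the proof of the other equations is the same. By the crucial property P 7, for all $\mathbf i,\mathbf j=\mathbf 1,\ldots,\mathbf n+\mathbf p-\mathbf 2$ we have
\[
\begin{aligned}
\Bigl(\Bigl(\bigotimes_{r=1}^{k-1}M_{n_r}^{[p_r]}\Bigr)\otimes K_{n_k}^{[p_k]}\otimes\Bigl(\bigotimes_{r=k+1}^{d}M_{n_r}^{[p_r]}\Bigr)\Bigr)_{\mathbf i\mathbf j}
&=(K_{n_k}^{[p_k]})_{i_kj_k}\prod_{\substack{r=1\\ r\ne k}}^d(M_{n_r}^{[p_r]})_{i_rj_r} \\
&=\int_0^1N'_{i_k+1,[p_k]}N'_{j_k+1,[p_k]}\ \prod_{\substack{r=1\\ r\ne k}}^d\int_0^1N_{i_r+1,[p_r]}N_{j_r+1,[p_r]} \\
&=\int_{[0,1]^d}\frac{\partial N_{\mathbf i+\mathbf 1,[\mathbf p]}}{\partial\hat x_k}\frac{\partial N_{\mathbf j+\mathbf 1,[\mathbf p]}}{\partial\hat x_k}
=(K_{\mathbf n,kk}^{[\mathbf p]})_{\mathbf i\mathbf j}.
\end{aligned}
\]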
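The identity (7.162) can be checked numerically in the simplest setting. The sketch below assumes $d=2$ and $\mathbf p=(1,1)$, so that the interior B-splines are the usual hat functions, and assumes that multi-indices are ordered lexicographically with the first component varying slowest (the ordering under which the Kronecker product acts as in property P 7). All function names are ours. A two-point Gauss rule per knot interval is exact for the piecewise polynomial integrands involved.

```python
import math

def hat(i, n, x):
    """Interior B-spline of degree 1 centred at i/n (i = 1, ..., n-1)."""
    return max(0.0, 1.0 - abs(n * x - i))

def dhat(i, n, x):
    """Derivative of hat(i, n, .) away from the knots."""
    t = n * x - i
    if -1.0 < t < 0.0:
        return float(n)
    if 0.0 < t < 1.0:
        return -float(n)
    return 0.0

def gauss_cells(n):
    """Two-point Gauss nodes and weights on each knot interval of [0, 1]."""
    h, g = 1.0 / n, 0.5 / math.sqrt(3.0)
    return [((e + 0.5 + s * g) * h, 0.5 * h)
            for e in range(n) for s in (-1.0, 1.0)]

def univariate(n, f, g):
    """Matrix [ int_0^1 f(j, .) g(i, .) ]_{i,j=1}^{n-1} by quadrature."""
    q = gauss_cells(n)
    return [[sum(w * f(j, n, x) * g(i, n, x) for x, w in q)
             for j in range(1, n)] for i in range(1, n)]

def kron(A, B):
    """Kronecker product of two square matrices given as nested lists."""
    m = len(B)
    size = len(A) * m
    return [[A[i // m][j // m] * B[i % m][j % m] for j in range(size)]
            for i in range(size)]

def k11_direct(n1, n2):
    """[ int d_1 N_i d_1 N_j ] over [0,1]^2 for tensor-product hats."""
    q1, q2 = gauss_cells(n1), gauss_cells(n2)
    idx = [(i1, i2) for i1 in range(1, n1) for i2 in range(1, n2)]
    def entry(i, j):
        return sum(w1 * w2
                   * dhat(i[0], n1, x1) * hat(i[1], n2, x2)
                   * dhat(j[0], n1, x1) * hat(j[1], n2, x2)
                   for x1, w1 in q1 for x2, w2 in q2)
    return [[entry(i, j) for j in idx] for i in idx]

n1, n2 = 4, 3
direct = k11_direct(n1, n2)
product = kron(univariate(n1, dhat, dhat), univariate(n2, hat, hat))
max_diff = max(abs(direct[r][c] - product[r][c])
               for r in range(len(direct)) for c in range(len(direct)))
```

Here `max_diff` measures the entrywise discrepancy between the bivariate quadrature and $K_{n_1}^{[1]}\otimes M_{n_2}^{[1]}$; it should be at the level of rounding errors.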
Remark 7.11 As in the case of FDs, FEs, IgA collocation (see Remarks 7.3, 7.5, 7.8), $K_n^{[p]}$, $H_n^{[p]}$, $M_n^{[p]}$ are the diffusion, convection, reaction matrices resulting from the Galerkin B-spline IgA discretization of the univariate problem
\[
\begin{cases}
-u''(x)+u'(x)+u(x)=f(x), & x\in(0,1),\\
u(0)=u(1)=0.
\end{cases}
\]
In other words, $K_n^{[p]}$, $H_n^{[p]}$, $M_n^{[p]}$ are the matrices resulting from the Galerkin B-spline IgA discretization of, respectively, the negative second derivative $-u''(x)$, the first derivative $u'(x)$, the identity operator $u(x)$. To see this, follow the above derivation of the Galerkin IgA matrices in the univariate case $d=1$ or take a look at [22, Sect. 10.7.2]. In view of what follows, we recall from [22, Eqs. (10.185)–(10.187)] that
\[
n^{-1}K_n^{[p]}=T_{n+p-2}(f_p)+R_n^{[p]},\qquad\mathrm{rank}(R_n^{[p]})\le4(p-1),
\tag{7.169}
\]
\[
-\mathrm i\,H_n^{[p]}=T_{n+p-2}(g_p)+S_n^{[p]},\qquad\mathrm{rank}(S_n^{[p]})\le4(p-1),
\tag{7.170}
\]
\[
n\,M_n^{[p]}=T_{n+p-2}(h_p)+V_n^{[p]},\qquad\mathrm{rank}(V_n^{[p]})\le4(p-1),
\tag{7.171}
\]
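where $f_p$, $g_p$, $h_p$ are real trigonometric polynomials whose definitions are given in [22, Eqs. (10.182)–(10.184)].

Remark 7.12 It follows from Lemma 7.4, Eqs. (7.169)–(7.171), T 8 and P 8 that
\[
K_{\mathbf n,\ell k}^{[\mathbf p]}=\frac{n_\ell n_k}{N(\mathbf n)}\,T_{\mathbf n+\mathbf p-\mathbf 2}((H_{\mathbf p})_{\ell k})+R_{\mathbf n,\ell k}^{[\mathbf p]},\qquad\ell,k=1,\ldots,d,
\tag{7.172}
\]
where
\[
\mathrm{rank}(R_{\mathbf n,\ell k}^{[\mathbf p]})\le N(\mathbf n+\mathbf p-\mathbf 2)\sum_{i=1}^d\frac{4(p_i-1)}{n_i+p_i-2},\qquad\ell,k=1,\ldots,d,
\tag{7.173}
\]
and $H_{\mathbf p}(\boldsymbol\theta)$ is the $d\times d$ symmetric matrix defined as follows:
\[
(H_{\mathbf p})_{\ell k}(\boldsymbol\theta)=
\begin{cases}
\displaystyle f_{p_k}(\theta_k)\prod_{\substack{r=1\\ r\ne k}}^dh_{p_r}(\theta_r), & \text{if }\ell=k,\\[2ex]
\displaystyle g_{p_\ell}(\theta_\ell)\,g_{p_k}(\theta_k)\prod_{\substack{r=1\\ r\ne\ell,k}}^dh_{p_r}(\theta_r), & \text{if }\ell\ne k.
\end{cases}
\tag{7.174}
\]
Since $K_{\mathbf n,\ell k}^{[\mathbf p]}(1)=K_{\mathbf n,\ell k}^{[\mathbf p]}$, according to (7.172) and the discussion in Sect. 7.2 (see in particular (7.14)), we may predict that $H_{\mathbf p}(\boldsymbol\theta)$ is, up to some normalization, the symbol of the negative Hessian operator. More precisely, we should call it the symbol of the negative Hessian operator in the parametric variables $\hat x_i$, because $K_{\mathbf n,\ell k}^{[\mathbf p]}(1)$ is the matrix resulting from the Galerkin B-spline IgA discretization of the $d$-variate problem
\[
\begin{cases}
\displaystyle-\frac{\partial^2\hat u}{\partial\hat x_\ell\,\partial\hat x_k}=f, & \text{in }(0,1)^d,\\[1ex]
\hat u=0, & \text{on }\partial((0,1)^d),
\end{cases}
\]
in which the physical domain $\Omega$ and the physical variables $x_i$ are replaced by the parametric domain $(0,1)^d$ and the parametric variables $\hat x_i$. For instance, assuming $\mathbf n=\boldsymbol\nu n$ for some fixed vector $\boldsymbol\nu\in\mathbb Q^d$ with positive components, from (7.172), (7.173), Z 1, GLT 3 and GLT 4 we infer that
\[
\begin{aligned}
n^{d-2}K_{\mathbf n,\ell k}^{[\mathbf p]}(1)&=\frac{\nu_\ell\nu_k}{N(\boldsymbol\nu)}\,T_{\mathbf n+\mathbf p-\mathbf 2}((H_{\mathbf p})_{\ell k})+n^{d-2}R_{\mathbf n,\ell k}^{[\mathbf p]}, \\
\bigl\{n^{d-2}K_{\mathbf n,\ell k}^{[\mathbf p]}(1)\bigr\}_n&\sim_{\mathrm{GLT}}\frac{\nu_\ell\nu_k}{N(\boldsymbol\nu)}(H_{\mathbf p})_{\ell k}(\boldsymbol\theta)=(H_{\mathbf p}^{(\boldsymbol\nu)})_{\ell k}(\boldsymbol\theta),
\end{aligned}
\tag{7.175}
\]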
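where $H_{\mathbf p}^{(\boldsymbol\nu)}(\boldsymbol\theta)=\mathrm{diag}(\boldsymbol\nu)H_{\mathbf p}(\boldsymbol\theta)\mathrm{diag}(\boldsymbol\nu)/N(\boldsymbol\nu)$. This means that, assuming $\mathbf n=\boldsymbol\nu n$ and after normalization by $n^{d-2}$, the symbol of the negative Hessian operator (in the parametric variables) is $H_{\mathbf p}^{(\boldsymbol\nu)}(\boldsymbol\theta)$, which coincides with $H_{\mathbf p}(\boldsymbol\theta)$ up to a trivial transformation. It was proved in [17, Theorem 2.2] that, for all $\mathbf p\ge\mathbf 1$, the matrix $H_{\mathbf p}(\boldsymbol\theta)$ is SPSD for all $\boldsymbol\theta\in[-\pi,\pi]^d$ and SPD for all $\boldsymbol\theta\in[-\pi,\pi]^d$ such that $\theta_1\cdots\theta_d\ne0$.

GLT analysis of the Galerkin B-spline IgA discretization matrices. Using the theory of multilevel GLT sequences, we now derive the spectral and singular value distribution of the sequence of normalized IgA Galerkin matrices $\{n^{d-2}A_{G,\mathbf n}^{[\mathbf p]}\}_n$ under the assumptions that $\mathbf n=\boldsymbol\nu n$ for some fixed vector $\boldsymbol\nu$, that the geometry map $G$ is sufficiently smooth, and that the matrix of diffusion coefficients $A(\mathbf x)$ is symmetric.

Theorem 7.7 Let $\Omega$ be a bounded open domain in $\mathbb R^d$ with Lipschitz boundary and suppose that the following conditions on the PDE coefficients and the geometry map are satisfied:
• $a_{\ell k}\in L^\infty(\Omega)$ for every $\ell,k=1,\ldots,d$;
• $b_k\in L^\infty(\Omega)$ for every $k=1,\ldots,d$;
• $c\in L^\infty(\Omega)$;
• $A(\mathbf x)=[a_{\ell k}(\mathbf x)]_{\ell,k=1}^d$ is symmetric for every $\mathbf x\in\Omega$;
• $G$ is regular, i.e., $G\in C^1([0,1]^d)$ and $\det(J_G)\ne0$ over $[0,1]^d$.

Let $\mathbf p\ge\mathbf 1$, let $\boldsymbol\nu\in\mathbb Q^d$ be a vector with positive components, and assume that $\mathbf n=\boldsymbol\nu n$ (it is understood that $n$ varies in the infinite subset of $\mathbb N$ such that $\mathbf n=\boldsymbol\nu n\in\mathbb N^d$). Then
\[
\{n^{d-2}A_{G,\mathbf n}^{[\mathbf p]}\}_n\sim_{\mathrm{GLT}}f_{G,\mathbf p}^{(\boldsymbol\nu)}
\tag{7.176}
\]
and
\[
\{n^{d-2}A_{G,\mathbf n}^{[\mathbf p]}\}_n\sim_{\sigma,\lambda}f_{G,\mathbf p}^{(\boldsymbol\nu)},
\tag{7.177}
\]
where
\[
\begin{aligned}
f_{G,\mathbf p}^{(\boldsymbol\nu)}(\hat{\mathbf x},\boldsymbol\theta)
&=\sum_{\ell,k=1}^d|\det(J_G(\hat{\mathbf x}))|\,a_{G,\ell k}(\hat{\mathbf x})\,(H_{\mathbf p}^{(\boldsymbol\nu)})_{\ell k}(\boldsymbol\theta)
=\mathbf 1\bigl(|\det(J_G(\hat{\mathbf x}))|A_G(\hat{\mathbf x})\circ H_{\mathbf p}^{(\boldsymbol\nu)}(\boldsymbol\theta)\bigr)\mathbf 1^T \\
&=\frac{\boldsymbol\nu\bigl(|\det(J_G(\hat{\mathbf x}))|A_G(\hat{\mathbf x})\circ H_{\mathbf p}(\boldsymbol\theta)\bigr)\boldsymbol\nu^T}{N(\boldsymbol\nu)},
\qquad
H_{\mathbf p}^{(\boldsymbol\nu)}(\boldsymbol\theta)=\frac{\mathrm{diag}(\boldsymbol\nu)H_{\mathbf p}(\boldsymbol\theta)\mathrm{diag}(\boldsymbol\nu)}{N(\boldsymbol\nu)},
\end{aligned}
\tag{7.178}
\]
and $A_G(\hat{\mathbf x})$ and $H_{\mathbf p}(\boldsymbol\theta)$ are defined, respectively, in (7.151) and (7.174).

Proof We follow the same argument as in the proof of Theorem 7.4. Throughout the proof, the letter $C$ denotes a generic constant independent of $n$. While reading the proof, the reader should keep in mind the relation $\mathbf n=\boldsymbol\nu n$ and the notation $E_{\ell k}$ for the $d\times d$ matrix having 1 in position $(\ell,k)$ and 0 elsewhere ($1\le\ell,k\le d$).

Step 1. We show that
\[
\|n^{d-2}K_{G,\mathbf n}^{[\mathbf p]}\|\le C,
\tag{7.179}
\]
\[
\|n^{d-2}Z_{G,\mathbf n}^{[\mathbf p]}\|\le Cn^{-1}.
\tag{7.180}
\]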
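To prove (7.179), we note that, for all $\mathbf i,\mathbf j=\mathbf 1,\ldots,\mathbf n+\mathbf p-\mathbf 2$, we have the following.

• If $\|\mathbf i-\mathbf j\|_\infty>\|\mathbf p\|_\infty$ then $|i_k-j_k|>p_k$ for some $k\in\{1,\ldots,d\}$, which implies that the intersection of the supports of $N_{\mathbf i+\mathbf 1,[\mathbf p]}$ and $N_{\mathbf j+\mathbf 1,[\mathbf p]}$ has zero measure by the local support property (7.97). Thus,
\[
(K_{G,\mathbf n}^{[\mathbf p]})_{\mathbf i\mathbf j}=\int_{[0,1]^d}(\nabla N_{\mathbf i+\mathbf 1,[\mathbf p]})^T|\det(J_G)|A_G\nabla N_{\mathbf j+\mathbf 1,[\mathbf p]}=0.
\]
• By (7.97), (7.101) and the assumptions on the geometry map $G$ and the PDE coefficients $a_{\ell k}$, we have
\[
\begin{aligned}
\bigl|(K_{G,\mathbf n}^{[\mathbf p]})_{\mathbf i\mathbf j}\bigr|
&\le\sum_{\ell,k=1}^d\bigl|(K_{G,\mathbf n,\ell k}^{[\mathbf p]})_{\mathbf i\mathbf j}\bigr|
\le\sum_{\ell,k=1}^d\int_{[0,1]^d}|a_{G,\ell k}\det(J_G)|\,\Bigl|\frac{\partial N_{\mathbf i+\mathbf 1,[\mathbf p]}}{\partial\hat x_\ell}\Bigr|\,\Bigl|\frac{\partial N_{\mathbf j+\mathbf 1,[\mathbf p]}}{\partial\hat x_k}\Bigr| \\
&\le C\sum_{\ell,k=1}^d\int_{\mathrm{supp}(N_{\mathbf i+\mathbf 1,[\mathbf p]})}\Bigl|\frac{\partial N_{\mathbf i+\mathbf 1,[\mathbf p]}}{\partial\hat x_\ell}\Bigr|\,\Bigl|\frac{\partial N_{\mathbf j+\mathbf 1,[\mathbf p]}}{\partial\hat x_k}\Bigr| \\
&\le C\sum_{\ell,k=1}^d4p_\ell n_\ell p_kn_k\,\mu_d(\mathrm{supp}(N_{\mathbf i+\mathbf 1,[\mathbf p]}))\le Cn^{2-d}.
\end{aligned}
\tag{7.181}
\]
Thus, each row and column of $K_{G,\mathbf n}^{[\mathbf p]}$ has at most $(2\|\mathbf p\|_\infty+1)^d$ nonzero entries whose moduli are bounded by $Cn^{2-d}$, which implies (7.179) by N 1.

The proof of (7.180) is conceptually identical. For all $\mathbf i,\mathbf j=\mathbf 1,\ldots,\mathbf n+\mathbf p-\mathbf 2$, we have the following.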
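• If $\|\mathbf i-\mathbf j\|_\infty>\|\mathbf p\|_\infty$ then $(Z_{G,\mathbf n}^{[\mathbf p]})_{\mathbf i\mathbf j}=0$ for the same reason for which $(K_{G,\mathbf n}^{[\mathbf p]})_{\mathbf i\mathbf j}=0$.
• By (7.97), (7.99)–(7.101) and the assumptions on the geometry map $G$ and the PDE coefficients $b_k$, $c$, we have
\[
\begin{aligned}
\bigl|(Z_{G,\mathbf n}^{[\mathbf p]})_{\mathbf i\mathbf j}\bigr|
&\le\sum_{k=1}^d\bigl|(H_{G,\mathbf n,k}^{[\mathbf p]})_{\mathbf i\mathbf j}\bigr|+\bigl|(M_{G,\mathbf n}^{[\mathbf p]})_{\mathbf i\mathbf j}\bigr| \\
&\le\sum_{k=1}^d\int_{[0,1]^d}\bigl|((J_G)^{-1}\mathbf b(G))_k\det(J_G)\bigr|\,\Bigl|\frac{\partial N_{\mathbf j+\mathbf 1,[\mathbf p]}}{\partial\hat x_k}\Bigr|\,N_{\mathbf i+\mathbf 1,[\mathbf p]}
+\int_{[0,1]^d}|c(G)\det(J_G)|\,N_{\mathbf j+\mathbf 1,[\mathbf p]}N_{\mathbf i+\mathbf 1,[\mathbf p]} \\
&\le C\Bigl(\sum_{k=1}^d\int_{\mathrm{supp}(N_{\mathbf i+\mathbf 1,[\mathbf p]})}\Bigl|\frac{\partial N_{\mathbf j+\mathbf 1,[\mathbf p]}}{\partial\hat x_k}\Bigr|\,N_{\mathbf i+\mathbf 1,[\mathbf p]}
+\int_{\mathrm{supp}(N_{\mathbf i+\mathbf 1,[\mathbf p]})}N_{\mathbf j+\mathbf 1,[\mathbf p]}N_{\mathbf i+\mathbf 1,[\mathbf p]}\Bigr) \\
&\le C\,\mu_d(\mathrm{supp}(N_{\mathbf i+\mathbf 1,[\mathbf p]}))\Bigl(\sum_{k=1}^d2p_kn_k+1\Bigr)\le Cn^{1-d}.
\end{aligned}
\]
Thus, each row and column of $Z_{G,\mathbf n}^{[\mathbf p]}$ has at most $(2\|\mathbf p\|_\infty+1)^d$ nonzero entries whose moduli are bounded by $Cn^{1-d}$, which implies (7.180) by N 1.

Step 2. Let $L^1([0,1]^d,\mathbb R^{d\times d})$ be the space of functions $L:[0,1]^d\to\mathbb R^{d\times d}$ such that $L_{ij}\in L^1([0,1]^d)$ for all $i,j=1,\ldots,d$. Consider the linear operator $K_{\mathbf n}^{[\mathbf p]}(\cdot):L^1([0,1]^d,\mathbb R^{d\times d})\to\mathbb R^{N(\mathbf n+\mathbf p-\mathbf 2)\times N(\mathbf n+\mathbf p-\mathbf 2)}$,
\[
K_{\mathbf n}^{[\mathbf p]}(L)=\Bigl[\int_{[0,1]^d}(\nabla N_{\mathbf i+\mathbf 1,[\mathbf p]})^TL\,\nabla N_{\mathbf j+\mathbf 1,[\mathbf p]}\Bigr]_{\mathbf i,\mathbf j=\mathbf 1}^{\mathbf n+\mathbf p-\mathbf 2}.
\tag{7.182}
\]
The next four steps are devoted to showing that
\[
\{n^{d-2}K_{\mathbf n}^{[\mathbf p]}(L)\}_n\sim_{\mathrm{GLT}}\mathbf 1\bigl(L(\hat{\mathbf x})\circ H_{\mathbf p}^{(\boldsymbol\nu)}(\boldsymbol\theta)\bigr)\mathbf 1^T,\qquad\forall\,L\in L^1([0,1]^d,\mathbb R^{d\times d}).
\tag{7.183}
\]
Once this is done, the theorem is proved. Indeed, by applying (7.183) with $L=|\det(J_G)|A_G$, we get $\{n^{d-2}K_{G,\mathbf n}^{[\mathbf p]}\}_n\sim_{\mathrm{GLT}}f_{G,\mathbf p}^{(\boldsymbol\nu)}$. Moreover, $\{n^{d-2}Z_{G,\mathbf n}^{[\mathbf p]}\}_n\sim_\sigma0$ by Step 1 and Z 1 (or Z 2). Hence, the GLT relation (7.176) follows from the decomposition $n^{d-2}A_{G,\mathbf n}^{[\mathbf p]}=n^{d-2}K_{G,\mathbf n}^{[\mathbf p]}+n^{d-2}Z_{G,\mathbf n}^{[\mathbf p]}$ (see (7.156)) and GLT 3–GLT 4; the singular value distribution in (7.177) follows from (7.176) and GLT 1; and the spectral distribution in (7.177) follows from (7.176) and GLT 2 applied to the decomposition $n^{d-2}A_{G,\mathbf n}^{[\mathbf p]}=n^{d-2}K_{G,\mathbf n}^{[\mathbf p]}+n^{d-2}Z_{G,\mathbf n}^{[\mathbf p]}$, taking into account what we have seen in Step 1, the inequality N 2, and the fact that, due to the symmetry assumption on $A(\mathbf x)$ for all $\mathbf x\in\Omega$, the matrix $n^{d-2}K_{G,\mathbf n}^{[\mathbf p]}$ is symmetric.

Step 3. We first prove (7.183) in the constant-coefficient case where $L(\hat{\mathbf x})=E_{\ell k}$ identically. In this case, $K_{\mathbf n}^{[\mathbf p]}(E_{\ell k})=K_{\mathbf n,\ell k}^{[\mathbf p]}(1)=K_{\mathbf n,\ell k}^{[\mathbf p]}$ and (7.183) is just (7.175).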
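Step 4. We now prove (7.183) in the case where $L(\hat{\mathbf x})=a(\hat{\mathbf x})E_{\ell k}$ with $a\in C([0,1]^d)$. To this end, we show that
\[
\|Y_n^{[\mathbf p]}\|\to0,
\tag{7.184}
\]
where $Y_n^{[\mathbf p]}=n^{d-2}K_{\mathbf n}^{[\mathbf p]}(aE_{\ell k})-n^{d-2}D_{\mathbf n+\mathbf p-\mathbf 2}(a)K_{\mathbf n}^{[\mathbf p]}(E_{\ell k})$. Once this is done, from Z 1 (or Z 2) we have $\{Y_n^{[\mathbf p]}\}_n\sim_\sigma0$. Hence, from Step 3 and GLT 3–GLT 4 we obtain $\{n^{d-2}K_{\mathbf n}^{[\mathbf p]}(aE_{\ell k})\}_n\sim_{\mathrm{GLT}}a(\hat{\mathbf x})(H_{\mathbf p}^{(\boldsymbol\nu)})_{\ell k}(\boldsymbol\theta)$, as required. For every $\mathbf i,\mathbf j=\mathbf 1,\ldots,\mathbf n+\mathbf p-\mathbf 2$, we have the following.

• If $\|\mathbf i-\mathbf j\|_\infty>\|\mathbf p\|_\infty$ then $(K_{\mathbf n}^{[\mathbf p]}(L))_{\mathbf i\mathbf j}=0$ for all $L\in L^1([0,1]^d,\mathbb R^{d\times d})$, because of the local support property of tensor-product B-splines already exploited in Step 1. Thus, $(Y_n^{[\mathbf p]})_{\mathbf i\mathbf j}=0$.
• If $\|\mathbf i-\mathbf j\|_\infty\le\|\mathbf p\|_\infty$ then, using (7.97), (7.101) and the fact that $\mathrm{supp}(N_{\mathbf i+\mathbf 1,[\mathbf p]})$ is located near the point $\mathbf i/(\mathbf n+\mathbf p-\mathbf 2)$ because
\[
\max_{\hat{\mathbf x}\in\mathrm{supp}(N_{\mathbf i+\mathbf 1,[\mathbf p]})}\Bigl\|\hat{\mathbf x}-\frac{\mathbf i}{\mathbf n+\mathbf p-\mathbf 2}\Bigr\|_\infty\le\frac{C}{\min(\mathbf n)}\le Cn^{-1},
\]
we obtain
\[
\begin{aligned}
|(Y_n^{[\mathbf p]})_{\mathbf i\mathbf j}|
&=n^{d-2}\Bigl|\int_{[0,1]^d}\Bigl[a(\hat{\mathbf x})-a\Bigl(\frac{\mathbf i}{\mathbf n+\mathbf p-\mathbf 2}\Bigr)\Bigr]\frac{\partial N_{\mathbf i+\mathbf 1,[\mathbf p]}}{\partial\hat x_\ell}(\hat{\mathbf x})\frac{\partial N_{\mathbf j+\mathbf 1,[\mathbf p]}}{\partial\hat x_k}(\hat{\mathbf x})\,\mathrm d\hat{\mathbf x}\Bigr| \\
&\le n^{d-2}\,4p_\ell p_kn_\ell n_k\int_{\mathrm{supp}(N_{\mathbf i+\mathbf 1,[\mathbf p]})}\Bigl|a(\hat{\mathbf x})-a\Bigl(\frac{\mathbf i}{\mathbf n+\mathbf p-\mathbf 2}\Bigr)\Bigr|\,\mathrm d\hat{\mathbf x} \\
&\le n^{d-2}\,Cn^2\,\omega_a\Bigl(\max_{\hat{\mathbf x}\in\mathrm{supp}(N_{\mathbf i+\mathbf 1,[\mathbf p]})}\Bigl\|\hat{\mathbf x}-\frac{\mathbf i}{\mathbf n+\mathbf p-\mathbf 2}\Bigr\|_\infty\Bigr)\int_{\mathrm{supp}(N_{\mathbf i+\mathbf 1,[\mathbf p]})}\mathrm d\hat{\mathbf x} \\
&\le Cn^d\,\omega_a(Cn^{-1})\,\mu_d(\mathrm{supp}(N_{\mathbf i+\mathbf 1,[\mathbf p]}))\le C\,\omega_a(Cn^{-1}).
\end{aligned}
\]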
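Thus, each row and column of $Y_n^{[\mathbf p]}$ has at most $(2\|\mathbf p\|_\infty+1)^d$ nonzero entries whose moduli are bounded by $C\omega_a(Cn^{-1})$, which implies (7.184) by N 1.

Step 5. We now prove (7.183) in the case where $L(\hat{\mathbf x})=a(\hat{\mathbf x})E_{\ell k}$ with $a\in L^1([0,1]^d)$. Take $a_m\in C([0,1]^d)$ such that $a_m\to a$ in $L^1([0,1]^d)$. We have $\{n^{d-2}K_{\mathbf n}^{[\mathbf p]}(a_mE_{\ell k})\}_n\sim_{\mathrm{GLT}}a_m(\hat{\mathbf x})(H_{\mathbf p}^{(\boldsymbol\nu)})_{\ell k}(\boldsymbol\theta)$ by Step 4, and
\[
a_m(\hat{\mathbf x})(H_{\mathbf p}^{(\boldsymbol\nu)})_{\ell k}(\boldsymbol\theta)\to a(\hat{\mathbf x})(H_{\mathbf p}^{(\boldsymbol\nu)})_{\ell k}(\boldsymbol\theta)
\]
in measure as $m\to\infty$. We prove that
\[
\{n^{d-2}K_{\mathbf n}^{[\mathbf p]}(a_mE_{\ell k})\}_n\xrightarrow{\text{a.c.s.}}\{n^{d-2}K_{\mathbf n}^{[\mathbf p]}(aE_{\ell k})\}_n,
\tag{7.185}
\]
after which the desired result follows from GLT 7. By N 3 and the bounds for the sum of partial derivatives of tensor-product B-splines (7.101), we have
\[
\begin{aligned}
\|K_{\mathbf n}^{[\mathbf p]}(aE_{\ell k})-K_{\mathbf n}^{[\mathbf p]}(a_mE_{\ell k})\|_1
&=\|K_{\mathbf n}^{[\mathbf p]}((a-a_m)E_{\ell k})\|_1
\le\sum_{\mathbf i,\mathbf j=\mathbf 1}^{\mathbf n+\mathbf p-\mathbf 2}\bigl|(K_{\mathbf n}^{[\mathbf p]}((a-a_m)E_{\ell k}))_{\mathbf i\mathbf j}\bigr| \\
&\le\sum_{\mathbf i,\mathbf j=\mathbf 1}^{\mathbf n+\mathbf p-\mathbf 2}\int_{[0,1]^d}|a(\hat{\mathbf x})-a_m(\hat{\mathbf x})|\,\Bigl|\frac{\partial N_{\mathbf i+\mathbf 1,[\mathbf p]}}{\partial\hat x_\ell}(\hat{\mathbf x})\Bigr|\,\Bigl|\frac{\partial N_{\mathbf j+\mathbf 1,[\mathbf p]}}{\partial\hat x_k}(\hat{\mathbf x})\Bigr|\,\mathrm d\hat{\mathbf x} \\
&\le\int_{[0,1]^d}|a(\hat{\mathbf x})-a_m(\hat{\mathbf x})|\sum_{\mathbf i=\mathbf 1}^{\mathbf n+\mathbf p-\mathbf 2}\Bigl|\frac{\partial N_{\mathbf i+\mathbf 1,[\mathbf p]}}{\partial\hat x_\ell}(\hat{\mathbf x})\Bigr|\sum_{\mathbf j=\mathbf 1}^{\mathbf n+\mathbf p-\mathbf 2}\Bigl|\frac{\partial N_{\mathbf j+\mathbf 1,[\mathbf p]}}{\partial\hat x_k}(\hat{\mathbf x})\Bigr|\,\mathrm d\hat{\mathbf x} \\
&\le4p_kn_kp_\ell n_\ell\,\|a-a_m\|_{L^1}\le Cn^2\|a-a_m\|_{L^1},
\end{aligned}
\]
hence
\[
\|n^{d-2}K_{\mathbf n}^{[\mathbf p]}(aE_{\ell k})-n^{d-2}K_{\mathbf n}^{[\mathbf p]}(a_mE_{\ell k})\|_1\le Cn^d\|a-a_m\|_{L^1}\le C\,N(\mathbf n+\mathbf p-\mathbf 2)\,\|a-a_m\|_{L^1},
\]
and the a.c.s. convergence (7.185) follows from ACS 6.

Step 6. Finally, we prove (7.183) for an arbitrary $L\in L^1([0,1]^d,\mathbb R^{d\times d})$. Write
\[
L(\hat{\mathbf x})=\sum_{\ell,k=1}^dL_{\ell k}(\hat{\mathbf x})E_{\ell k}
\]
and note that, by linearity,
\[
K_{\mathbf n}^{[\mathbf p]}(L)=\sum_{\ell,k=1}^dK_{\mathbf n}^{[\mathbf p]}(L_{\ell k}E_{\ell k}).
\]
Hence, by Step 5 and GLT 4,
\[
\{n^{d-2}K_{\mathbf n}^{[\mathbf p]}(L)\}_n\sim_{\mathrm{GLT}}\sum_{\ell,k=1}^dL_{\ell k}(\hat{\mathbf x})(H_{\mathbf p}^{(\boldsymbol\nu)})_{\ell k}(\boldsymbol\theta)=\mathbf 1\bigl(L(\hat{\mathbf x})\circ H_{\mathbf p}^{(\boldsymbol\nu)}(\boldsymbol\theta)\bigr)\mathbf 1^T,
\]
which concludes the proof.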
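The mechanism of Step 4 can be observed numerically in the simplest case $d=1$, $p=1$, where the normalization $n^{d-2}$ reduces to $n^{-1}$ and the B-splines are hat functions. The sketch below is only an illustration under these assumptions. It takes $D_m(a)=\mathrm{diag}_{i=1,\ldots,m}\,a(i/m)$ with $m=n+p-2=n-1$, and it bounds the spectral norm by $\sqrt{\text{(max row sum)}\cdot\text{(max column sum)}}$, which is the standard bound we assume N 1 refers to. The function names and the coefficient $a(x)=1+x^2$ are ours.

```python
def stiffness_with_coefficient(n, a, q=8):
    """K_n^{[1]}(a) = [ int_0^1 a N'_{j+1} N'_{i+1} ]_{i,j=1}^{n-1} (hat functions)."""
    h = 1.0 / n
    # cell[e] approximates the integral of a over [e/n, (e+1)/n] (midpoint rule).
    cell = [sum(a((e + (s + 0.5) / q) * h) for s in range(q)) * h / q
            for e in range(n)]
    m = n - 1
    K = [[0.0] * m for _ in range(m)]
    for i in range(1, n):                  # hat i is supported on cells i-1 and i
        K[i - 1][i - 1] = n * n * (cell[i - 1] + cell[i])
        if i < m:                          # neighbouring hats overlap on cell i
            K[i - 1][i] = K[i][i - 1] = -n * n * cell[i]
    return K

def norm_bound(n, a):
    """Upper bound for ||Y_n||, Y_n = n^{-1} K_n(a) - n^{-1} D_{n-1}(a) K_n(1)."""
    m = n - 1
    Ka = stiffness_with_coefficient(n, a)
    K1 = stiffness_with_coefficient(n, lambda x: 1.0)
    Y = [[(Ka[i][j] - a((i + 1) / m) * K1[i][j]) / n for j in range(m)]
         for i in range(m)]
    rows = max(sum(abs(v) for v in row) for row in Y)
    cols = max(sum(abs(Y[i][j]) for i in range(m)) for j in range(m))
    return (rows * cols) ** 0.5

bounds = [norm_bound(n, lambda x: 1.0 + x * x) for n in (10, 20, 40)]
```

For a smooth coefficient the modulus of continuity behaves like $\omega_a(\delta)\le C\delta$, so the computed bounds are expected to decay roughly like $n^{-1}$, in agreement with the estimate $C\omega_a(Cn^{-1})$.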
We remark that the proof of Theorem 7.7 is essentially the same(!) as the proof of its univariate version [22, Theorem 10.15].

We also remark that the GLT relation (7.176) and the singular value distribution in (7.177) remain true even without the hypothesis that the matrix of diffusion coefficients $A(\mathbf x)$ is symmetric for all $\mathbf x\in\Omega$. Indeed, as it is clear from the proof of Theorem 7.7 and especially from Step 2, the symmetry of $A$ is used only in the proof of the eigenvalue distribution in (7.177). Actually, also the eigenvalue distribution remains true without the symmetry assumption on $A$, as long as we assume that the entries of $A$ are continuous. This result, which has never been observed before in the literature, is proved in the next theorem.

Theorem 7.8 Let $\Omega$ be a bounded open domain in $\mathbb R^d$ with Lipschitz boundary and suppose that the following conditions on the PDE coefficients and the geometry map are satisfied:
• $a_{\ell k}\in C(\Omega)$ for every $\ell,k=1,\ldots,d$;
• $b_k\in L^\infty(\Omega)$ for every $k=1,\ldots,d$;
• $c\in L^\infty(\Omega)$;
• $G$ is regular, i.e., $G\in C^1([0,1]^d)$ and $\det(J_G)\ne0$ over $[0,1]^d$.

Let $\mathbf p\ge\mathbf 1$, let $\boldsymbol\nu\in\mathbb Q^d$ be a vector with positive components, and assume that $\mathbf n=\boldsymbol\nu n$ (it is understood that $n$ varies in the infinite subset of $\mathbb N$ such that $\mathbf n=\boldsymbol\nu n\in\mathbb N^d$). Then, both (7.176) and (7.177) are satisfied.

Proof As noted before the statement of the theorem, we only have to prove the eigenvalue distribution in (7.177). The proof is similar to that of Theorem 7.5. The underlying idea is that, even in the case where $A$ is not symmetric, the matrix $K_{G,\mathbf n}^{[\mathbf p]}$ is “almost” symmetric (as long as the entries of $A$ are continuous). We can then derive the eigenvalue distribution in (7.177) from GLT 2 applied to a suitable decomposition of $A_{G,\mathbf n}^{[\mathbf p]}$ obtained from (7.156) by replacing $K_{G,\mathbf n}^{[\mathbf p]}$ with one of its symmetric approximations. Let us work out the details. Throughout the proof, the letter $C$ denotes a generic constant independent of $n$. By (7.153) and (7.157),
\[
K_{G,\mathbf n}^{[\mathbf p]}=\sum_{\ell,k=1}^dK_{\mathbf n,\ell k}^{[\mathbf p]}(|\det(J_G)|a_{G,\ell k}),\qquad
K_{\mathbf n,\ell k}^{[\mathbf p]}(a)=\Bigl[\int_{[0,1]^d}a\,\frac{\partial N_{\mathbf i+\mathbf 1,[\mathbf p]}}{\partial\hat x_\ell}\frac{\partial N_{\mathbf j+\mathbf 1,[\mathbf p]}}{\partial\hat x_k}\Bigr]_{\mathbf i,\mathbf j=\mathbf 1}^{\mathbf n+\mathbf p-\mathbf 2}.
\]
Let $\tilde K_{G,\mathbf n}^{[\mathbf p]}$ be the symmetric approximation of $K_{G,\mathbf n}^{[\mathbf p]}$ given by
\[
\tilde K_{G,\mathbf n}^{[\mathbf p]}=\sum_{\ell,k=1}^d\tilde K_{\mathbf n,\ell k}^{[\mathbf p]}(|\det(J_G)|a_{G,\ell k}),\qquad
\tilde K_{\mathbf n,\ell k}^{[\mathbf p]}(a)=S_{\mathbf n+\mathbf p-\mathbf 2}(a)\circ n^{2-d}\,T_{\mathbf n+\mathbf p-\mathbf 2}((H_{\mathbf p}^{(\boldsymbol\nu)})_{\ell k}).
\]
Note that $\|n^{d-2}\tilde K_{\mathbf n,\ell k}^{[\mathbf p]}(a)\|\le C$ for all $a\in C([0,1]^d)$, by Theorem 7.2. We prove that
\[
\|n^{d-2}K_{G,\mathbf n}^{[\mathbf p]}-n^{d-2}\tilde K_{G,\mathbf n}^{[\mathbf p]}\|_1=o(n^d).
\tag{7.186}
\]
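Once this is done, the eigenvalue distribution in (7.177) follows from (7.176) and GLT 2 applied to the decomposition
\[
n^{d-2}A_{G,\mathbf n}^{[\mathbf p]}=n^{d-2}\tilde K_{G,\mathbf n}^{[\mathbf p]}+\bigl(n^{d-2}K_{G,\mathbf n}^{[\mathbf p]}-n^{d-2}\tilde K_{G,\mathbf n}^{[\mathbf p]}\bigr)+n^{d-2}Z_{G,\mathbf n}^{[\mathbf p]},
\]
taking into account what we have seen in Step 1 of the proof of Theorem 7.7, the inequality N 2, and the symmetry of $\tilde K_{G,\mathbf n}^{[\mathbf p]}$. To prove (7.186), it is enough to show that
\[
\|n^{d-2}K_{\mathbf n,\ell k}^{[\mathbf p]}(a)-n^{d-2}\tilde K_{\mathbf n,\ell k}^{[\mathbf p]}(a)\|_1=o(n^d)
\]
for every $a\in C([0,1]^d)$ and every $\ell,k=1,\ldots,d$. With the notation used in the proof of Theorem 7.7 (see in particular (7.182)), we have $K_{\mathbf n,\ell k}^{[\mathbf p]}(a)=K_{\mathbf n}^{[\mathbf p]}(aE_{\ell k})$. Thus,
\[
\begin{aligned}
\|n^{d-2}K_{\mathbf n,\ell k}^{[\mathbf p]}(a)-n^{d-2}\tilde K_{\mathbf n,\ell k}^{[\mathbf p]}(a)\|_1
&\le\|n^{d-2}K_{\mathbf n}^{[\mathbf p]}(aE_{\ell k})-n^{d-2}D_{\mathbf n+\mathbf p-\mathbf 2}(a)K_{\mathbf n}^{[\mathbf p]}(E_{\ell k})\|_1 &&(7.187)\\
&\quad+\|n^{d-2}D_{\mathbf n+\mathbf p-\mathbf 2}(a)K_{\mathbf n}^{[\mathbf p]}(E_{\ell k})-D_{\mathbf n+\mathbf p-\mathbf 2}(a)T_{\mathbf n+\mathbf p-\mathbf 2}((H_{\mathbf p}^{(\boldsymbol\nu)})_{\ell k})\|_1 &&(7.188)\\
&\quad+\|D_{\mathbf n+\mathbf p-\mathbf 2}(a)T_{\mathbf n+\mathbf p-\mathbf 2}((H_{\mathbf p}^{(\boldsymbol\nu)})_{\ell k})-S_{\mathbf n+\mathbf p-\mathbf 2}(a)\circ T_{\mathbf n+\mathbf p-\mathbf 2}((H_{\mathbf p}^{(\boldsymbol\nu)})_{\ell k})\|_1. &&(7.189)
\end{aligned}
\]
We consider separately the three trace-norms in (7.187)–(7.189) and we show that they are $o(n^d)$. Once this is done, the proof of the theorem is complete.

• The trace-norm (7.187) is $o(n^d)$ by N 2, because the spectral norm of its argument tends to 0; see Step 4 in the proof of Theorem 7.7.
• The trace-norm (7.188) is $o(n^d)$ by N 2, because of the following two properties: first, the rank of its argument is bounded by $Cn^{d-1}$ (see Remark 7.12 and take into account that $K_{\mathbf n}^{[\mathbf p]}(E_{\ell k})=K_{\mathbf n,\ell k}^{[\mathbf p]}(1)=K_{\mathbf n,\ell k}^{[\mathbf p]}$); second, the spectral norm of its argument is bounded by $C$, because
\[
\|D_{\mathbf n+\mathbf p-\mathbf 2}(a)\|\le\|a\|_\infty,\qquad
\|T_{\mathbf n+\mathbf p-\mathbf 2}((H_{\mathbf p}^{(\boldsymbol\nu)})_{\ell k})\|\le\|(H_{\mathbf p}^{(\boldsymbol\nu)})_{\ell k}\|_\infty,\qquad
\|n^{d-2}K_{\mathbf n}^{[\mathbf p]}(E_{\ell k})\|\le C,
\]
where the first inequality is obvious, the second inequality follows from T 3, and the third inequality is proved by the same argument used in Step 1 of the proof of Theorem 7.7 to prove the inequality $\|n^{d-2}K_{G,\mathbf n}^{[\mathbf p]}\|\le C$.⁴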
• The trace-norm (7.189) is $o(n^d)$ by N 2, because the spectral norm of its argument tends to 0 by Theorem 7.2.

⁴ Actually, the argument used in Step 1 of the proof of Theorem 7.7 to prove the inequality $\|n^{d-2}K_{G,\mathbf n}^{[\mathbf p]}\|\le C$ can be used to prove that $\|n^{d-2}K_{\mathbf n}^{[\mathbf p]}(L)\|\le C$ for all functions $L:[0,1]^d\to\mathbb R^{d\times d}$ such that $L_{ij}\in L^\infty([0,1]^d)$ for all $i,j=1,\ldots,d$. Recall also that $K_{G,\mathbf n}^{[\mathbf p]}=K_{\mathbf n}^{[\mathbf p]}(|\det(J_G)|A_G)$ and $(|\det(J_G)|A_G)_{ij}\in L^\infty([0,1]^d)$ for all $i,j=1,\ldots,d$ under the assumptions of both Theorem 7.7 and the present theorem.

Remark 7.13 (formal structure of the symbol) As we know from Sect. 7.5, problem (7.147) can be formally rewritten as in (7.81). If, for any $u:\Omega\to\mathbb R$, we define $\hat u:[0,1]^d\to\mathbb R$ as in (7.89), then $u$ satisfies (7.81) if and only if $\hat u$ satisfies the corresponding transformed problem (7.90), in which the higher-order operator takes the form $-\mathbf 1(A_G(\hat{\mathbf x})\circ H\hat u(\hat{\mathbf x}))\mathbf 1^T$. It is then clear that, similarly to the collocation case (see Remark 7.10), even in the Galerkin case the symbol $f_{G,\mathbf p}^{(\boldsymbol\nu)}(\hat{\mathbf x},\boldsymbol\theta)=\mathbf 1(|\det(J_G(\hat{\mathbf x}))|A_G(\hat{\mathbf x})\circ H_{\mathbf p}^{(\boldsymbol\nu)}(\boldsymbol\theta))\mathbf 1^T$ preserves the formal structure of the higher-order operator associated with the transformed problem (7.90). However, in this Galerkin context, we notice the appearance of the determinant factor $|\det(J_G(\hat{\mathbf x}))|$, which is not present in the collocation setting; cf. (7.178) and (7.131).
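To make the structure of the symbol tangible, the following sketch evaluates $H_{\mathbf p}(\boldsymbol\theta)$ from (7.174) and $f_{G,\mathbf p}^{(\boldsymbol\nu)}$ from (7.178) pointwise for $d=2$ and $\mathbf p=(1,1)$. The definitions of $f_p$, $g_p$, $h_p$ are given in [22] and are not reproduced in this section; the formulas $f_1(\theta)=2-2\cos\theta$, $g_1(\theta)=-\sin\theta$, $h_1(\theta)=(2+\cos\theta)/3$ used here are an assumption, obtained by reading off the Toeplitz generating functions of the hat-function matrices $n^{-1}K_n^{[1]}=\mathrm{tridiag}(-1,2,-1)$, $-\mathrm iH_n^{[1]}$ and $nM_n^{[1]}=\mathrm{tridiag}(1,4,1)/6$. The values of $A_G$ and $|\det(J_G)|$ at the point $\hat{\mathbf x}$ are passed in as plain numbers.

```python
import math

# Assumed univariate symbols for degree p = 1 (see the lead-in above).
def f1(t): return 2.0 - 2.0 * math.cos(t)      # diffusion
def g1(t): return -math.sin(t)                 # convection
def h1(t): return (2.0 + math.cos(t)) / 3.0    # reaction (mass)

def H_p(theta):
    """2 x 2 matrix H_p(theta) of (7.174) for d = 2 and p = (1, 1)."""
    t1, t2 = theta
    off = g1(t1) * g1(t2)
    return [[f1(t1) * h1(t2), off], [off, h1(t1) * f1(t2)]]

def symbol(AG, detJ, theta, nu):
    """f = 1 (|det J_G| A_G o H_p^(nu)) 1^T with
    H_p^(nu) = diag(nu) H_p diag(nu) / N(nu), as in (7.178)."""
    H = H_p(theta)
    N_nu = nu[0] * nu[1]
    return sum(abs(detJ) * AG[l][k] * nu[l] * nu[k] * H[l][k] / N_nu
               for l in range(2) for k in range(2))

def is_spsd(H, tol=1e-12):
    """Symmetric 2 x 2 matrix is positive semidefinite iff trace, det >= 0."""
    trace = H[0][0] + H[1][1]
    det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
    return trace >= -tol and det >= -tol

identity = [[1.0, 0.0], [0.0, 1.0]]
value = symbol(identity, 1.0, (0.7, -1.9), (1, 1))
```

With $A_G=I$, $|\det(J_G)|=1$ and $\boldsymbol\nu=(1,1)$ the symbol reduces to $f_1(\theta_1)h_1(\theta_2)+h_1(\theta_1)f_1(\theta_2)$, i.e., the sum of the diagonal entries of $H_{\mathbf p}(\boldsymbol\theta)$; a non-diagonal $A_G$ brings in the mixed term $g_1(\theta_1)g_1(\theta_2)$ through the Hadamard product.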
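Remark 7.14 The matrix $A_{G,\mathbf n}^{[\mathbf p]}$ in (7.152), which we decomposed as in (7.156), can also be decomposed as follows, according to the diffusion, convection and reaction terms:
\[
A_{G,\mathbf n}^{[\mathbf p]}=K_{G,\mathbf n}^{[\mathbf p]}+H_{G,\mathbf n}^{[\mathbf p]}+M_{G,\mathbf n}^{[\mathbf p]},
\]
where the diffusion, convection and reaction matrices are given by
\[
K_{G,\mathbf n}^{[\mathbf p]}=\Bigl[\int_{[0,1]^d}(\nabla N_{\mathbf i+\mathbf 1,[\mathbf p]})^T|\det(J_G)|A_G\nabla N_{\mathbf j+\mathbf 1,[\mathbf p]}\Bigr]_{\mathbf i,\mathbf j=\mathbf 1}^{\mathbf n+\mathbf p-\mathbf 2},
\tag{7.190}
\]
\[
H_{G,\mathbf n}^{[\mathbf p]}=\Bigl[\int_{[0,1]^d}|\det(J_G)|(\nabla N_{\mathbf j+\mathbf 1,[\mathbf p]})^T(J_G)^{-1}\mathbf b(G)\,N_{\mathbf i+\mathbf 1,[\mathbf p]}\Bigr]_{\mathbf i,\mathbf j=\mathbf 1}^{\mathbf n+\mathbf p-\mathbf 2},
\tag{7.191}
\]
\[
M_{G,\mathbf n}^{[\mathbf p]}=\Bigl[\int_{[0,1]^d}|\det(J_G)|\,c(G)\,N_{\mathbf j+\mathbf 1,[\mathbf p]}N_{\mathbf i+\mathbf 1,[\mathbf p]}\Bigr]_{\mathbf i,\mathbf j=\mathbf 1}^{\mathbf n+\mathbf p-\mathbf 2}.
\tag{7.192}
\]
Let $\mathbf p\ge\mathbf 1$ and assume $\mathbf n=\boldsymbol\nu n$ as in Theorem 7.7.

• Suppose that $|\det(J_G)|A_G\in L^1([0,1]^d,\mathbb R^{d\times d})$. Then
\[
\{n^{d-2}K_{G,\mathbf n}^{[\mathbf p]}\}_n\sim_{\mathrm{GLT}}\mathbf 1\bigl(|\det(J_G(\hat{\mathbf x}))|A_G(\hat{\mathbf x})\circ H_{\mathbf p}^{(\boldsymbol\nu)}(\boldsymbol\theta)\bigr)\mathbf 1^T=\frac{\boldsymbol\nu\bigl(|\det(J_G(\hat{\mathbf x}))|A_G(\hat{\mathbf x})\circ H_{\mathbf p}(\boldsymbol\theta)\bigr)\boldsymbol\nu^T}{N(\boldsymbol\nu)}
\tag{7.193}
\]
by (7.183) applied with $L=|\det(J_G)|A_G$. If we also assume that $A(\mathbf x)$ is symmetric for all $\mathbf x\in\Omega$ (so that $K_{G,\mathbf n}^{[\mathbf p]}$ is symmetric), then, by GLT 1,
\[
\{n^{d-2}K_{G,\mathbf n}^{[\mathbf p]}\}_n\sim_{\sigma,\lambda}\mathbf 1\bigl(|\det(J_G(\hat{\mathbf x}))|A_G(\hat{\mathbf x})\circ H_{\mathbf p}^{(\boldsymbol\nu)}(\boldsymbol\theta)\bigr)\mathbf 1^T=\frac{\boldsymbol\nu\bigl(|\det(J_G(\hat{\mathbf x}))|A_G(\hat{\mathbf x})\circ H_{\mathbf p}(\boldsymbol\theta)\bigr)\boldsymbol\nu^T}{N(\boldsymbol\nu)}.
\]
• Suppose that $|\det(J_G)|c(G)\in L^1([0,1]^d)$. Note that this is the same as assuming that $c\in L^1(\Omega)$, because
\[
\int_\Omega c=\int_{[0,1]^d}c(G)|\det(J_G)|.
\]
Then, with the same argument used for proving (7.193), one can show that
\[
\{n^dM_{G,\mathbf n}^{[\mathbf p]}\}_n\sim_{\mathrm{GLT}}\frac{|\det(J_G(\hat{\mathbf x}))|\,c(G(\hat{\mathbf x}))\,h_{\mathbf p}(\boldsymbol\theta)}{N(\boldsymbol\nu)},
\tag{7.194}
\]
where
\[
h_{\mathbf p}(\boldsymbol\theta)=\prod_{r=1}^dh_{p_r}(\theta_r).
\tag{7.195}
\]
Hence, by GLT 1 and the symmetry of $M_{G,\mathbf n}^{[\mathbf p]}$,
\[
\{n^dM_{G,\mathbf n}^{[\mathbf p]}\}_n\sim_{\sigma,\lambda}\frac{|\det(J_G(\hat{\mathbf x}))|\,c(G(\hat{\mathbf x}))\,h_{\mathbf p}(\boldsymbol\theta)}{N(\boldsymbol\nu)}.
\]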
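A quick first-moment sanity check of (7.194) is possible in the univariate case. The sketch below assumes $d=1$, $p=1$, $G$ equal to the identity (so $|\det(J_G)|=1$ and $N(\boldsymbol\nu)=1$), the hypothetical coefficient $c(x)=1+x$, and the formula $h_1(\theta)=(2+\cos\theta)/3$ (an assumption, read off from $nM_n^{[1]}=\mathrm{tridiag}(1,4,1)/6$). Since the trace equals the sum of the eigenvalues, the average diagonal entry of $nM_n^{[1]}(c)$ should approach the mean of the symbol $c(\hat x)h_1(\theta)$ over $[0,1]\times[-\pi,\pi]$. This only tests the distribution relation against the test function $F(\lambda)=\lambda$ and is not a proof of it.

```python
import math

def hat(i, n, x):
    """Interior B-spline of degree 1 centred at i/n."""
    return max(0.0, 1.0 - abs(n * x - i))

def scaled_mass_diagonal(n, c):
    """Diagonal of n * M_n^{[1]}(c): n * int_0^1 c N_{i+1}^2 for i = 1..n-1."""
    h, g = 1.0 / n, 0.5 / math.sqrt(3.0)
    diag = []
    for i in range(1, n):
        total = 0.0
        for e in (i - 1, i):               # the two cells in the support of hat i
            for s in (-1.0, 1.0):          # two-point Gauss rule on cell e
                x = (e + 0.5 + s * g) * h
                total += 0.5 * h * c(x) * hat(i, n, x) ** 2
        diag.append(n * total)
    return diag

def symbol_mean(c, samples=2000):
    """Mean of c(x) * h_1(theta) over [0,1] x [-pi,pi] (midpoint rule)."""
    mean_c = sum(c((k + 0.5) / samples) for k in range(samples)) / samples
    mean_h = sum((2.0 + math.cos(-math.pi + (k + 0.5) * 2.0 * math.pi / samples)) / 3.0
                 for k in range(samples)) / samples
    return mean_c * mean_h

c = lambda x: 1.0 + x
diag = scaled_mass_diagonal(50, c)
average_eigenvalue = sum(diag) / len(diag)   # trace / size
```

For this linear coefficient the two quantities coincide up to rounding, because the Gauss rule is exact and the diagonal entries equal $\tfrac23c(i/n)$.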
7.7 Galerkin B-Spline IgA Discretization of Second-Order Eigenvalue Problems

Let $\mathbb R^+$ be the set of positive real numbers. Consider the following second-order eigenvalue problem: find eigenvalues $\lambda_j\in\mathbb R^+$ and eigenfunctions $u_j$, for $j=1,2,\ldots,\infty$, such that
\[
\begin{cases}
-\nabla\cdot A\nabla u_j=\lambda_j\,c\,u_j, & \text{in }\Omega,\\
u_j=0, & \text{on }\partial\Omega,
\end{cases}
\tag{7.196}
\]
where $A=[a_{\ell k}]_{\ell,k=1}^d$ and $\Omega$ is a bounded open domain in $\mathbb R^d$ with Lipschitz boundary. We assume that $a_{\ell k}\in L^1(\Omega)$ for all $\ell,k=1,\ldots,d$, that $A(\mathbf x)=[a_{\ell k}(\mathbf x)]_{\ell,k=1}^d$ is SPD for almost every $\mathbf x\in\Omega$, and that $c\in L^1(\Omega)$ with $c>0$ a.e. in $\Omega$. It can be shown that the eigenvalues $\lambda_j$ must necessarily be real and positive. This can be formally seen by multiplying (7.196) by $u_j$ and integrating over $\Omega$:
\[
\lambda_j=\frac{-\int_\Omega(\nabla\cdot A\nabla u_j)\,u_j}{\int_\Omega c\,u_j^2}=\frac{\int_\Omega(\nabla u_j)^TA\nabla u_j}{\int_\Omega c\,u_j^2}>0.
\]
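As a minimal illustration of the quotient above, consider the model case $\Omega=(0,1)$, $A=1$, $c=1$ (an assumption made only for this example), whose eigenfunctions are $u_j(x)=\sin(j\pi x)$ with $\lambda_j=(j\pi)^2$. The sketch evaluates the two integrals by the midpoint rule; the function name is ours.

```python
import math

def rayleigh_quotient(u, du, a, c, samples=20000):
    """Midpoint-rule approximation of int a (u')^2 / int c u^2 over (0, 1)."""
    xs = [(k + 0.5) / samples for k in range(samples)]
    num = sum(a(x) * du(x) ** 2 for x in xs) / samples
    den = sum(c(x) * u(x) ** 2 for x in xs) / samples
    return num / den

j = 3
lam = rayleigh_quotient(lambda x: math.sin(j * math.pi * x),
                        lambda x: j * math.pi * math.cos(j * math.pi * x),
                        lambda x: 1.0, lambda x: 1.0)
```

Both integrands are nonnegative and the denominator is strictly positive, which is the formal reason behind $\lambda_j>0$; in this model case the quotient equals $(3\pi)^2$ up to quadrature error.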
Isogeometric Galerkin approximation. The weak form of (7.196) reads as follows: find eigenvalues $\lambda_j\in\mathbb R^+$ and eigenfunctions $u_j\in H_0^1(\Omega)$, for $j=1,2,\ldots,\infty$, such that
\[
a(u_j,w)=\lambda_j\,(c\,u_j,w),\qquad\forall\,w\in H_0^1(\Omega),
\]
where
\[
a(u_j,w)=\int_\Omega(\nabla w)^TA\nabla u_j,\qquad(c\,u_j,w)=\int_\Omega c\,u_jw.
\]
In the standard Galerkin method, we choose a finite dimensional vector space $W\subset H_0^1(\Omega)$, the so-called approximation space, we let $N=\dim W$ and we look for approximations of the eigenpairs $(\lambda_j,u_j)$, $j=1,2,\ldots,\infty$, by solving the following discrete (Galerkin) problem: find $\lambda_{j,W}\in\mathbb R^+$ and $u_{j,W}\in W$, for $j=1,\ldots,N$, such that
\[
a(u_{j,W},w)=\lambda_{j,W}\,(c\,u_{j,W},w),\qquad\forall\,w\in W.
\tag{7.197}
\]
Assuming that both the exact and numerical eigenvalues are arranged in non-decreasing order, the pair $(\lambda_{j,W},u_{j,W})$ is taken as an approximation to the pair $(\lambda_j,u_j)$ for all $j=1,2,\ldots,N$. The numbers $\lambda_{j,W}/\lambda_j-1$, $j=1,\ldots,N$, are referred to as the (relative) eigenvalue errors. If $\{\varphi_1,\ldots,\varphi_N\}$ is a basis of $W$, we can identify each $w\in W$ with its coefficient vector relative to this basis. With this identification in mind, solving the discrete problem (7.197) is equivalent to solving the generalized eigenvalue problem
\[
K_G\mathbf u_{j,W}=\lambda_{j,W}M_G\mathbf u_{j,W},
\tag{7.198}
\]
where $\mathbf u_{j,W}$ is the coefficient vector of $u_{j,W}$ with respect to $\{\varphi_1,\ldots,\varphi_N\}$ and
\[
K_G=\Bigl[\int_\Omega(\nabla\varphi_i)^TA\nabla\varphi_j\Bigr]_{i,j=1}^N,
\tag{7.199}
\]
\[
M_G=\Bigl[\int_\Omega c\,\varphi_j\varphi_i\Bigr]_{i,j=1}^N.
\tag{7.200}
\]
The matrices $K_G$ and $M_G$ are referred to as the Galerkin stiffness and mass matrix, respectively. Due to our assumption that $A>O$ a.e. and $c>0$ a.e., both $K_G$ and $M_G$ are SPD, regardless of the chosen basis functions $\varphi_1,\ldots,\varphi_N$. Moreover, it is clear from (7.198) that the numerical eigenvalues $\lambda_{j,W}$, $j=1,\ldots,N$, are just the eigenvalues of the matrix
\[
L_G=(M_G)^{-1}K_G.
\tag{7.201}
\]
In the isogeometric Galerkin method, we assume that the physical domain $\Omega$ is described by a global geometry function $G:[0,1]^d\to\Omega$, which is invertible and satisfies $G(\partial([0,1]^d))=\partial\Omega$. We fix a set of basis functions $\{\hat\varphi_1,\ldots,\hat\varphi_N\}$ defined on the reference (parametric) domain $[0,1]^d$ and vanishing on the boundary $\partial([0,1]^d)$, and we find approximations to the exact eigenpairs $(\lambda_j,u_j)$, $j=1,2,\ldots,\infty$, by using the standard Galerkin method described above, in which the approximation space is chosen as $W=\mathrm{span}(\varphi_1,\ldots,\varphi_N)$, where
\[
\varphi_i(\mathbf x)=\hat\varphi_i(G^{-1}(\mathbf x))=\hat\varphi_i(\hat{\mathbf x}),\qquad\mathbf x=G(\hat{\mathbf x}).
\tag{7.202}
\]
The resulting stiffness and mass matrices $K_G$ and $M_G$ are given by (7.199) and (7.200), with the basis functions $\varphi_i$ defined as in (7.202). If we assume that $G$ and $\hat\varphi_i$, $i=1,\ldots,N$, are sufficiently regular, we can apply standard differential calculus to obtain for $K_G$ and $M_G$ the following expressions:
\[
K_G=\Bigl[\int_{[0,1]^d}|\det(J_G)|\,(\nabla\hat\varphi_i)^TA_G\nabla\hat\varphi_j\Bigr]_{i,j=1}^N,
\tag{7.203}
\]
\[
M_G=\Bigl[\int_{[0,1]^d}c(G)\,|\det(J_G)|\,\hat\varphi_j\hat\varphi_i\Bigr]_{i,j=1}^N,
\tag{7.204}
\]
where
\[
A_G=(J_G)^{-1}A(G)(J_G)^{-T}
\tag{7.205}
\]
and
\[
J_G=\Bigl[\frac{\partial G_i}{\partial\hat x_j}\Bigr]_{i,j=1}^d=\Bigl[\frac{\partial x_i}{\partial\hat x_j}\Bigr]_{i,j=1}^d.
\]
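GLT analysis of the Galerkin B-spline IgA discretization matrices. Following the approach of Sects. 7.5 and 7.6, we choose the basis functions $\hat\varphi_i$, $i=1,\ldots,N$, as the tensor-product B-splines $N_{\mathbf i+\mathbf 1,[\mathbf p]}$, $\mathbf i=\mathbf 1,\ldots,\mathbf n+\mathbf p-\mathbf 2$. The resulting stiffness and mass matrices (7.203) and (7.204) are given by
\[
K_{G,\mathbf n}^{[\mathbf p]}=\Bigl[\int_{[0,1]^d}|\det(J_G)|\,(\nabla N_{\mathbf i+\mathbf 1,[\mathbf p]})^TA_G\nabla N_{\mathbf j+\mathbf 1,[\mathbf p]}\Bigr]_{\mathbf i,\mathbf j=\mathbf 1}^{\mathbf n+\mathbf p-\mathbf 2},
\qquad
M_{G,\mathbf n}^{[\mathbf p]}=\Bigl[\int_{[0,1]^d}c(G)\,|\det(J_G)|\,N_{\mathbf j+\mathbf 1,[\mathbf p]}N_{\mathbf i+\mathbf 1,[\mathbf p]}\Bigr]_{\mathbf i,\mathbf j=\mathbf 1}^{\mathbf n+\mathbf p-\mathbf 2},
\]
and it is immediately seen that they are the same as the diffusion and reaction matrices in (7.190) and (7.192), respectively. The numerical eigenvalues are simply the eigenvalues of the matrix
\[
L_{G,\mathbf n}^{[\mathbf p]}=(M_{G,\mathbf n}^{[\mathbf p]})^{-1}K_{G,\mathbf n}^{[\mathbf p]}.
\]
Theorem 7.9 Let $\Omega$ be a bounded open domain in $\mathbb R^d$ with Lipschitz boundary and suppose that the following conditions on the PDE coefficients and the geometry map are satisfied:
• $a_{\ell k}\in L^1(\Omega)$ for all $\ell,k=1,\ldots,d$;
• $c\in L^1(\Omega)$ and $c>0$ a.e. in $\Omega$;
• $A(\mathbf x)=[a_{\ell k}(\mathbf x)]_{\ell,k=1}^d$ is SPD for a.e. $\mathbf x\in\Omega$;
• $|\det(J_G)|>0$ a.e. in $[0,1]^d$ and $(|\det(J_G)|A_G)_{\ell k}\in L^1([0,1]^d)$ for all $\ell,k=1,\ldots,d$, where $A_G$ is defined in (7.205).

Let $\mathbf p\ge\mathbf 1$, let $\boldsymbol\nu\in\mathbb Q^d$ be a vector with positive components, and assume that $\mathbf n=\boldsymbol\nu n$ (it is understood that $n$ varies in the infinite subset of $\mathbb N$ such that $\mathbf n=\boldsymbol\nu n\in\mathbb N^d$). Then
\[
\{n^{-2}L_{G,\mathbf n}^{[\mathbf p]}\}_n\sim_{\mathrm{GLT}}e_{G,\mathbf p}^{(\boldsymbol\nu)}(\hat{\mathbf x},\boldsymbol\theta)
\tag{7.206}
\]
and
\[
\{n^{-2}L_{G,\mathbf n}^{[\mathbf p]}\}_n\sim_{\sigma,\lambda}e_{G,\mathbf p}^{(\boldsymbol\nu)}(\hat{\mathbf x},\boldsymbol\theta),
\tag{7.207}
\]
where
\[
e_{G,\mathbf p}^{(\boldsymbol\nu)}(\hat{\mathbf x},\boldsymbol\theta)=\frac{\boldsymbol\nu\bigl(A_G(\hat{\mathbf x})\circ H_{\mathbf p}(\boldsymbol\theta)\bigr)\boldsymbol\nu^T}{c(G(\hat{\mathbf x}))\,h_{\mathbf p}(\boldsymbol\theta)}
\tag{7.208}
\]
and $H_{\mathbf p}(\boldsymbol\theta)$, $h_{\mathbf p}(\boldsymbol\theta)$ are defined in (7.174), (7.195), respectively.

Proof The components of $|\det(J_G)|A_G$ belong to $L^1([0,1]^d)$ by assumption, and also the function $c(G)|\det(J_G)|$ belongs to $L^1([0,1]^d)$ because $c\in L^1(\Omega)$ by assumption and
\[
\int_{[0,1]^d}c(G)|\det(J_G)|=\int_\Omega c.
\]
Hence, by Remark 7.14,
\[
\{n^{d-2}K_{G,\mathbf n}^{[\mathbf p]}\}_n\sim_{\mathrm{GLT}}\frac{\boldsymbol\nu\bigl(|\det(J_G(\hat{\mathbf x}))|A_G(\hat{\mathbf x})\circ H_{\mathbf p}(\boldsymbol\theta)\bigr)\boldsymbol\nu^T}{N(\boldsymbol\nu)},
\qquad
\{n^dM_{G,\mathbf n}^{[\mathbf p]}\}_n\sim_{\mathrm{GLT}}\frac{|\det(J_G(\hat{\mathbf x}))|\,c(G(\hat{\mathbf x}))\,h_{\mathbf p}(\boldsymbol\theta)}{N(\boldsymbol\nu)},
\]
and the relations (7.206)–(7.207) follow from Theorem 7.1, taking into account that $K_{G,\mathbf n}^{[\mathbf p]}$ and $M_{G,\mathbf n}^{[\mathbf p]}$ are SPD, that $|\det(J_G)|,c>0$ a.e. by assumption, and that $h_{\mathbf p}(\boldsymbol\theta)=\prod_{r=1}^dh_{p_r}(\theta_r)>0$ for all $\boldsymbol\theta\in[-\pi,\pi]^d$ because $h_p(\theta)\ge(4/\pi^2)^{p+1}$ for all $p\ge1$ and $\theta\in[-\pi,\pi]$; see [22, Eq. (10.165) and Remark 10.13].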
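In the simplest setting, the symbol in (7.208) can be compared with the numerical eigenvalues in closed form. The sketch below assumes $d=1$, $p=1$, $A=1$, $c=1$, $G$ equal to the identity and $\nu=1$, so that $e(\theta)=f_1(\theta)/h_1(\theta)$ with the assumed formulas $f_1(\theta)=2-2\cos\theta$ and $h_1(\theta)=(2+\cos\theta)/3$ (read off from the hat-function matrices $K_n^{[1]}=n\,\mathrm{tridiag}(-1,2,-1)$ and $M_n^{[1]}=\mathrm{tridiag}(1,4,1)/(6n)$). For these matrices the vectors $(\sin(i\theta_j))_{i=1}^{n-1}$, $\theta_j=j\pi/n$, are generalized eigenvectors, and $n^{-2}\lambda_{j,W}=e(\theta_j)$ is a uniform sampling of the symbol. The exact eigenvalues of the continuous problem are $\lambda_j=(j\pi)^2$.

```python
import math

def f1(t): return 2.0 - 2.0 * math.cos(t)
def h1(t): return (2.0 + math.cos(t)) / 3.0

def tridiag_apply(lo, di, up, v):
    """Apply the Toeplitz tridiagonal matrix tridiag(lo, di, up) to v."""
    m = len(v)
    return [di * v[i]
            + (lo * v[i - 1] if i > 0 else 0.0)
            + (up * v[i + 1] if i < m - 1 else 0.0) for i in range(m)]

def check_pair(n, j):
    """Return (lambda, residual) for K v = lambda M v with
    lambda = n^2 f1(theta) / h1(theta), theta = j pi / n, v_i = sin(i theta)."""
    theta = j * math.pi / n
    v = [math.sin(i * theta) for i in range(1, n)]
    Kv = tridiag_apply(-n, 2.0 * n, -n, v)                 # K_n^{[1]} v
    s = 1.0 / (6.0 * n)
    Mv = tridiag_apply(s, 4.0 * s, s, v)                   # M_n^{[1]} v
    lam = n * n * f1(theta) / h1(theta)
    residual = max(abs(a - lam * b) for a, b in zip(Kv, Mv))
    return lam, residual

def relative_error(n, j):
    """Relative eigenvalue error lambda_{j,W} / lambda_j - 1, lambda_j = (j pi)^2."""
    lam, _ = check_pair(n, j)
    return lam / (j * math.pi) ** 2 - 1.0
```

For example, `relative_error(20, 1)` is a small positive number that shrinks as `n` grows, whereas for $j$ close to $n$ the ratio $e(\theta_j)/\theta_j^2$ departs markedly from 1; this is the kind of information on the eigenvalue errors that the symbol encodes.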
Chapter 8
Future Developments
In the present Volume II, we have developed the theory of multilevel (or multivariate) GLT sequences, which, as illustrated in the applications of Chap. 7, allows the computation of the asymptotic singular value and eigenvalue distribution of matrices arising from the discretization of PDEs by virtually any approximation technique. In this final chapter, we list a series of possible future developments.

1. Develop the theories of multilevel block and reduced GLT sequences, as explained in items 1 and 2 of [22, Chap. 11]. We highlight that, for the unilevel block case, much work has already been carried out in three very recent papers [19, 25, 26]. In particular, papers [25, 26] systematically developed the theory of unilevel block GLT sequences, while paper [19] presented some of its most emblematic applications. It is worth noting that papers [25, 26] also suggested a new interesting algebraic-topological definition of GLT sequences, which allows a considerable simplification of the theory; it will certainly appear in the second editions of both Volumes I and II.

2. Try to design an automatic procedure for computing the symbol of a sequence of PDE discretization matrices, assuming that it is known to be a (multilevel) GLT sequence. This objective was already proposed in item 3 of [22, Chap. 11] and it was partially pursued by Ahmed Ratnani at the Max Planck Institute for Plasma Physics (Munich, Germany). Ratnani is also the author of a so-called “GLT library”.

3. Develop a new edition of both Volumes I and II. GLT sequences are an expanding research field, both from the theoretical and the applied point of view. It will then be necessary to develop a new edition of Volumes I and II in order to include the new important achievements in this young research area. Actually, in very recent years, also due to the publication of Volume I, the interest in the theory of GLT sequences has grown considerably, thus leading to an impressive number of new recent findings.
We here mention some of the most important ones, in addition to those already mentioned in item 4 of [22, Chap. 11]. © Springer Nature Switzerland AG 2018 C. Garoni and S. Serra-Capizzano, Generalized Locally Toeplitz Sequences: Theory and Applications, https://doi.org/10.1007/978-3-030-02233-4_8
3.1 Giovanni Barbarino from Scuola Normale Superiore (Pisa, Italy) proved in [4] the following interesting result for unilevel GLT sequences, whose extension to multilevel GLT sequences is expected to be straightforward. Suppose $\{A_n\}_n \sim_{\mathrm{GLT}} \kappa$, where $\{A_n\}_n \in \mathcal{E} = \{\{B_n\}_n : B_n \in \mathbb{C}^{n\times n}\}$ and $\kappa \in \mathfrak{M}_1 = \{\xi : [0,1]\times[-\pi,\pi] \to \mathbb{C} : \xi \text{ is measurable}\}$. Then, there exists a sequence of unitary matrices $\{Q_n\}_n \in \mathcal{E}$ which is independent of $\{A_n\}_n$ and satisfies $A_n = Q_n D_n Q_n^* + Z_n$, where $\{Z_n\}_n \in \mathcal{E}$ is a zero-distributed sequence and $\{D_n\}_n \in \mathcal{E}$ is a sequence of diagonal matrices such that $\{D_n\}_n \sim_\lambda \kappa$. This means in particular that, up to a small perturbation $Z_n$, the $n$th matrix $A_n$ of a GLT sequence is close to a normal matrix $\tilde{A}_n = Q_n D_n Q_n^*$ whose eigenvectors are given by the columns of a fixed $n \times n$ unitary matrix $Q_n$; moreover, if $\{A_n\}_n \sim_{\mathrm{GLT}} \kappa$ then $\{\tilde{A}_n\}_n \sim_{\mathrm{GLT}} \kappa$ and $\{\tilde{A}_n\}_n \sim_\lambda \kappa$. An analogous result was somehow anticipated in [36, Remark 0.1] with the construction of the class of Generalized Locally Circulant (GLC) sequences, in which the role of Toeplitz matrices is played by circulant matrices and the unitary matrices $Q_n$ admit an explicit expression.
3.2 Giovanni Barbarino proved in [6] the following remarkable result about sequences of perturbed Hermitian matrices. Let $\{X_n\}_n$, $\{Y_n\}_n$ be sequences of matrices, with $X_n$, $Y_n$ of size $d_n$, and set $A_n = X_n + Y_n$. Assume that the following conditions are met.
1. Every $X_n$ is Hermitian and $\{X_n\}_n \sim_\lambda f$.
2. $\|Y_n\|_2 = o(\sqrt{d_n})$.
Then $\{A_n\}_n \sim_\lambda f$.
This is a noteworthy extension of Corollary 2.3, because if $\|Y_n\| \le C$ for some constant $C$ independent of $n$ and $\|Y_n\|_1 = o(d_n)$, as in the hypotheses of Corollary 2.3, then
$$\|Y_n\|_2 = \sqrt{\sum_{i=1}^{d_n} \sigma_i(Y_n)^2} \le \sqrt{\sigma_{\max}(Y_n) \sum_{i=1}^{d_n} \sigma_i(Y_n)} = \sqrt{\|Y_n\|\,\|Y_n\|_1} = o(\sqrt{d_n}).$$
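The bound relating the Schatten 2-norm to the spectral norm and the trace-norm can be checked numerically. The following Python sketch assumes NumPy is available; the helper names `schatten_norms` and `bound_holds` and the random test matrices are illustrative choices that do not come from the book. It computes the three norms from the singular values and verifies that $\|Y\|_2 \le \sqrt{\|Y\|\,\|Y\|_1}$ up to rounding errors.

```python
import numpy as np

def schatten_norms(Y):
    """Return (spectral norm, trace-norm, Schatten 2-norm) of Y from its singular values."""
    s = np.linalg.svd(Y, compute_uv=False)
    return s.max(), s.sum(), np.sqrt(np.sum(s ** 2))

def bound_holds(Y, rtol=1e-12):
    """Check ||Y||_2 <= sqrt(||Y|| * ||Y||_1), with a small allowance for rounding."""
    spectral, trace, schatten2 = schatten_norms(Y)
    return schatten2 <= np.sqrt(spectral * trace) * (1.0 + rtol)

if __name__ == "__main__":
    rng = np.random.default_rng(0)  # fixed seed, only for reproducibility
    for d in (10, 50, 200):
        # Hypothetical test matrices: dense complex Gaussian, not taken from the text.
        Y = rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))
        print(d, bound_holds(Y))
```

The check is only a sanity test of the inequality on sample matrices; the asymptotic statement $o(\sqrt{d_n})$ additionally relies on the two hypotheses of Corollary 2.3.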
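This extension of Corollary 2.3 increases the potential of the theory of GLT sequences, as it allows us to replace property GLT 2 with the following more powerful version:
GLT 2. If $\{A_{\boldsymbol{n}}\}_n \sim_{\mathrm{GLT}} \kappa$ and $A_{\boldsymbol{n}} = X_{\boldsymbol{n}} + Y_{\boldsymbol{n}}$, where
• every $X_{\boldsymbol{n}}$ is Hermitian,
• $N(\boldsymbol{n})^{-1/2}\,\|Y_{\boldsymbol{n}}\|_2 \to 0$,
then $\{A_{\boldsymbol{n}}\}_n \sim_\lambda \kappa$.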
Such an improved GLT 2 property is the key for computing the asymptotic spectral distribution of matrices arising from the discretization of PDEs with unbounded coefficients, as illustrated in [6]. It was also conjectured in [6], on the basis of numerical experiments, that the assumption “$\|Y_n\|_2 = o(\sqrt{d_n})$” in the above extension of Corollary 2.3 can be replaced by the weaker assumption “$\|Y_n\|_1 = o(d_n)$”, thus yielding an even more powerful version of GLT 2. This conjecture is still open, and settling it is certainly an interesting topic for future research.
3.3 A “GLT program” for future research has been outlined in [23, Remark 15]. Roughly speaking, it suggests that, from the symbol of the GLT sequence arising from the discretization of a continuous eigenvalue problem, one can obtain the distribution of the continuous eigenvalues through an appropriate limit process “from the discrete to the continuous”. Although this may seem like science fiction, the idea deserves serious consideration, because Davide Bianchi from University of Insubria (Como, Italy) was able to select a class of continuous eigenvalue problems for which the above GLT program applies.
3.4 On the applied side, it was shown in [14] that the theory of GLT sequences also finds applications in the context of fractional differential equations. Considering the importance of this topic nowadays, the GLT analysis of at least one model fractional differential equation should be included in a future edition of Volumes I and/or II.
References
1. Auricchio F., Beirão da Veiga L., Hughes T. J. R., Reali A., Sangalli G. Isogeometric collocation methods. Math. Models Methods Appl. Sci. 20 (2010) 2075–2107.
2. Avram F. On bilinear forms in Gaussian random variables and Toeplitz matrices. Probab. Theory Related Fields 79 (1988) 37–45.
3. Barbarino G. Equivalence between GLT sequences and measurable functions. Linear Algebra Appl. 529 (2017) 397–412.
4. Barbarino G. Normal form for GLT sequences. Full text available at: https://arxiv.org/abs/1805.08708v2.
5. Barbarino G., Garoni C. From convergence in measure to convergence of matrix-sequences through concave functions and singular values. Electron. J. Linear Algebra 32 (2017) 500–513.
6. Barbarino G., Serra-Capizzano S. Non-Hermitian perturbations of Hermitian matrix-sequences and applications to the spectral analysis of approximated PDEs. Technical Report 2018-004 (2018), Department of Information Technology, Uppsala University. Full text available at: http://www.it.uu.se/research/publications/reports/2018-004.
7. Böttcher A., Gutiérrez-Gutiérrez J., Crespo P. M. Mass concentration in quasicommutators of Toeplitz matrices. J. Comput. Appl. Math. 205 (2007) 129–148.
8. Böttcher A., Silbermann B. Introduction to Large Truncated Toeplitz Matrices. Springer-Verlag, New York (1999).
9. Böttcher A., Silbermann B. Analysis of Toeplitz Operators. Second Edition, Springer-Verlag, Berlin (2006).
10. Brezis H. Functional Analysis, Sobolev Spaces and Partial Differential Equations. Springer, New York (2011).
11. Caramello O. Theories, Sites, Toposes: Relating and Studying Mathematical Theories through Topos-Theoretic ‘Bridges’. Oxford University Press, Oxford (2017).
12. Cottrell J. A., Hughes T. J. R., Bazilevs Y. Isogeometric Analysis: Toward Integration of CAD and FEA. John Wiley & Sons, Chichester (2009).
13. Donatelli M., Garoni C., Mazza M., Serra-Capizzano S., Sesana D. Preconditioned HSS method for large multilevel block Toeplitz linear systems via the notion of matrix-valued symbol. Numer. Linear Algebra Appl. 23 (2016) 83–119.
14. Donatelli M., Mazza M., Serra-Capizzano S. Spectral analysis and structure preserving preconditioners for fractional diffusion equations. J. Comput. Phys. 307 (2016) 262–279.
15. Donatelli M., Neytcheva M., Serra-Capizzano S. Canonical eigenvalue distribution of multilevel block Toeplitz sequences with non-Hermitian symbols. Oper. Theory Adv. Appl. 221 (2012) 269–291.
16. Garoni C. Topological foundations of an asymptotic approximation theory for sequences of matrices with increasing size. Linear Algebra Appl. 513 (2017) 324–341.
17. Garoni C., Manni C., Serra-Capizzano S., Sesana D., Speleers H. Lusin theorem, GLT sequences and matrix computations: an application to the spectral analysis of PDE discretization matrices. J. Math. Anal. Appl. 446 (2017) 365–382.
18. Garoni C., Manni C., Serra-Capizzano S., Speleers H. NURBS versus B-splines in isogeometric discretization methods: a spectral analysis. Submitted.
19. Garoni C., Mazza M., Serra-Capizzano S. Block generalized locally Toeplitz sequences: from the theory to the applications. Axioms 7 (2018) 49.
20. Garoni C., Serra-Capizzano S. The theory of locally Toeplitz sequences: a review, an extension, and a few representative applications. Bol. Soc. Mat. Mex. 22 (2016) 529–565.
21. Garoni C., Serra-Capizzano S. The theory of generalized locally Toeplitz sequences: a review, an extension, and a few representative applications. Oper. Theory Adv. Appl. 259 (2017) 353–394.
22. Garoni C., Serra-Capizzano S. Generalized Locally Toeplitz Sequences: Theory and Applications (Volume I). Springer, Cham (2017).
23. Garoni C., Serra-Capizzano S. Generalized locally Toeplitz sequences: a spectral analysis tool for discretized differential equations. Lecture Notes in Mathematics 2219 (2018) 161–236.
24. Garoni C., Serra-Capizzano S., Sesana D. Spectral analysis and spectral symbol of $d$-variate $\mathbb{Q}_p$ Lagrangian FEM stiffness matrices. SIAM J. Matrix Anal. Appl. 36 (2015) 1100–1128.
25. Garoni C., Serra-Capizzano S., Sesana D. Block locally Toeplitz sequences: construction and properties. Springer INdAM Series, Proceedings Volume of the INdAM Meeting “Structured Matrices in Numerical Linear Algebra: Analysis, Algorithms and Applications”, Cortona (Arezzo), Italy, 04–08/09/2017 (in press).
26. Garoni C., Serra-Capizzano S., Sesana D. Block generalized locally Toeplitz sequences: topological construction, spectral distribution results, and star-algebra structure. Springer INdAM Series, Proceedings Volume of the INdAM Meeting “Structured Matrices in Numerical Linear Algebra: Analysis, Algorithms and Applications”, Cortona (Arezzo), Italy, 04–08/09/2017 (in press).
27. Grenander U., Szegő G. Toeplitz Forms and Their Applications. Second Edition, AMS Chelsea Publishing, New York (1984).
28. Katznelson Y. An Introduction to Harmonic Analysis. Third Edition, Cambridge University Press, Cambridge (2004).
29. Parter S. V. On the distribution of the singular values of Toeplitz matrices. Linear Algebra Appl. 80 (1986) 115–130.
30. Royden H. L., Fitzpatrick P. M. Real Analysis. Fourth Edition, Pearson Education Asia Limited and China Machine Press (2010).
31. Rudin W. Principles of Mathematical Analysis. Third Edition, McGraw-Hill, New York (1976).
32. Rudin W. Real and Complex Analysis. Third Edition, McGraw-Hill, Singapore (1987).
33. Serra-Capizzano S. Distribution results on the algebra generated by Toeplitz sequences: a finite dimensional approach. Linear Algebra Appl. 328 (2001) 121–130.
34. Serra-Capizzano S. More inequalities and asymptotics for matrix valued linear positive operators: the noncommutative case. Oper. Theory Adv. Appl. 135 (2002) 293–315.
35. Serra-Capizzano S. Generalized locally Toeplitz sequences: spectral analysis and applications to discretized partial differential equations. Linear Algebra Appl. 366 (2003) 371–402.
36. Serra-Capizzano S. The GLT class as a generalized Fourier analysis and applications. Linear Algebra Appl. 419 (2006) 180–233.
37. Serra-Capizzano S., Tilli P. On unitarily invariant norms of matrix-valued linear positive operators. J. Inequal. Appl. 7 (2002) 309–330.
38. Tilli P. A note on the spectral distribution of Toeplitz matrices. Linear and Multilinear Algebra 45 (1998) 147–159.
39. Tilli P. Locally Toeplitz sequences: spectral properties and applications. Linear Algebra Appl. 278 (1998) 91–120.
40. Tilli P. Some results on complex Toeplitz eigenvalues. J. Math. Anal. Appl. 239 (1999) 390–401.
41. Tyrtyshnikov E. E. A unifying approach to some old and new theorems on distribution and clustering. Linear Algebra Appl. 232 (1996) 1–43.
42. Tyrtyshnikov E. E., Zamarashkin N. L. Spectra of multilevel Toeplitz matrices: advanced theory via simple matrix relationships. Linear Algebra Appl. 270 (1998) 15–27.
43. Zamarashkin N. L., Tyrtyshnikov E. E. Distribution of eigenvalues and singular values of Toeplitz matrices under weakened conditions on the generating function. Sb. Math. 188 (1997) 1191–1201.