Vector Extrapolation Methods with Applications 161197495X, 9781611974959

English Pages 428 [421] Year 2017

Vector Extrapolation Methods with Applications

Computational Science & Engineering

The SIAM series on Computational Science and Engineering publishes research monographs, advanced undergraduate- or graduate-level textbooks, and other volumes of interest to an interdisciplinary CS&E community of computational mathematicians, computer scientists, scientists, and engineers. The series includes both introductory volumes aimed at a broad audience of mathematically motivated readers interested in understanding methods and applications within computational science and engineering and monographs reporting on the most recent developments in the field. The series also includes volumes addressed to specific groups of professionals whose work relies extensively on computational science and engineering.

SIAM created the CS&E series to support access to the rapid and far-ranging advances in computer modeling and simulation of complex problems in science and engineering, to promote the interdisciplinary culture required to meet these large-scale challenges, and to provide the means to the next generation of computational scientists and engineers.

Editor-in-Chief
Donald Estep, Colorado State University

Editorial Board
Daniela Calvetti, Case Western Reserve University
Paul Constantine, Colorado School of Mines
Omar Ghattas, University of Texas at Austin
Chen Greif, University of British Columbia
Jan S. Hesthaven, Ecole Polytechnique Fédérale de Lausanne
Johan Hoffman, KTH Royal Institute of Technology
David Keyes, Columbia University
J. Nathan Kutz, University of Washington
Ralph C. Smith, North Carolina State University
Charles F. Van Loan, Cornell University
Karen Willcox, Massachusetts Institute of Technology

Series Volumes

Sidi, Avram, Vector Extrapolation Methods with Applications
Borzì, A., Ciaramella, G., and Sprengel, M., Formulation and Numerical Solution of Quantum Control Problems
Benner, Peter, Cohen, Albert, Ohlberger, Mario, and Willcox, Karen, editors, Model Reduction and Approximation: Theory and Algorithms
Kuzmin, Dmitri and Hämäläinen, Jari, Finite Element Methods for Computational Fluid Dynamics: A Practical Guide
Rostamian, Rouben, Programming Projects in C for Students of Engineering, Science, and Mathematics
Smith, Ralph C., Uncertainty Quantification: Theory, Implementation, and Applications
Dankowicz, Harry and Schilder, Frank, Recipes for Continuation
Mueller, Jennifer L. and Siltanen, Samuli, Linear and Nonlinear Inverse Problems with Practical Applications
Shapira, Yair, Solving PDEs in C++: Numerical Methods in a Unified Object-Oriented Approach, Second Edition
Borzì, Alfio and Schulz, Volker, Computational Optimization of Systems Governed by Partial Differential Equations
Ascher, Uri M. and Greif, Chen, A First Course in Numerical Methods
Layton, William, Introduction to the Numerical Analysis of Incompressible Viscous Flows
Ascher, Uri M., Numerical Methods for Evolutionary Differential Equations
Zohdi, T. I., An Introduction to Modeling and Simulation of Particulate Flows
Biegler, Lorenz T., Ghattas, Omar, Heinkenschloss, Matthias, Keyes, David, and van Bloemen Waanders, Bart, editors, Real-Time PDE-Constrained Optimization
Chen, Zhangxin, Huan, Guanren, and Ma, Yuanle, Computational Methods for Multiphase Flows in Porous Media
Shapira, Yair, Solving PDEs in C++: Numerical Methods in a Unified Object-Oriented Approach

AVRAM SIDI Technion – Israel Institute of Technology Haifa, Israel

Vector Extrapolation Methods with Applications

Society for Industrial and Applied Mathematics Philadelphia

Copyright © 2017 by the Society for Industrial and Applied Mathematics

10 9 8 7 6 5 4 3 2 1

All rights reserved. Printed in the United States of America. No part of this book may be reproduced, stored, or transmitted in any manner without the written permission of the publisher. For information, write to the Society for Industrial and Applied Mathematics, 3600 Market Street, 6th Floor, Philadelphia, PA 19104-2688 USA.

Trademarked names may be used in this book without the inclusion of a trademark symbol. These names are used in an editorial context only; no infringement of trademark is intended.

Publisher: David Marshall and Kivmars Bowling
Executive Editor: Elizabeth Greenspan
Developmental Editor: Gina Rinelli Harris
Managing Editor: Kelly Thomas
Production Editor: Ann Manning Allen
Copy Editor: Julia Cochrane
Production Manager: Donna Witzleben
Production Coordinator: Cally Shrader
Compositor: Cheryl Hufnagle
Graphic Designer: Lois Sellers

Library of Congress Cataloging-in-Publication Data

Names: Sidi, Avram, author.
Title: Vector extrapolation methods with applications / Avram Sidi, Technion – Israel Institute of Technology, Haifa, Israel.
Description: Philadelphia : Society for Industrial and Applied Mathematics, [2017] | Series: Computational science and engineering series ; 17 | Includes bibliographical references and index.
Identifiers: LCCN 2017026889 (print) | LCCN 2017031009 (ebook) | ISBN 9781611974966 (e-book) | ISBN 9781611974959 (print)
Subjects: LCSH: Vector analysis. | Extrapolation.
Classification: LCC QA433 (ebook) | LCC QA433 .S525 2017 (print) | DDC 515/.63--dc23
LC record available at https://lccn.loc.gov/2017026889

SIAM is a registered trademark.

Contents

Preface

0 Introduction and Review of Linear Algebra
  0.1 General background and motivation
  0.2 Some linear algebra notation and background
  0.3 Fixed-point iterative methods for nonlinear systems
  0.4 Fixed-point iterative methods for linear systems

I Vector Extrapolation Methods

1 Development of Polynomial Extrapolation Methods
  1.1 Preliminaries
  1.2 Solution to $x = Tx + d$ from $\{x_m\}$
  1.3 Derivation of MPE, RRE, MMPE, and SVD-MPE
  1.4 Finite termination property
  1.5 Application of polynomial extrapolation methods to arbitrary $\{x_m\}$
  1.6 Further developments
  1.7 Determinant representations
  1.8 Compact representations
  1.9 Numerical stability in polynomial extrapolation
  1.10 Solution of $x = f(x)$ by polynomial extrapolation methods in cycling mode
  1.11 Final remarks

2 Unified Algorithms for MPE and RRE
  2.1 General considerations
  2.2 QR factorization
  2.3 Solution of least-squares problems by QR factorization
  2.4 Algorithms for MPE and RRE
  2.5 Error estimation via algorithms
  2.6 Further algorithms for MPE and RRE

3 MPE and RRE Are Related
  3.1 Introduction
  3.2 $R_k \gamma_k$ for MPE and RRE
  3.3 First main result
  3.4 Second main result
  3.5 Peak-plateau phenomenon

4 Algorithms for MMPE and SVD-MPE
  4.1 Introduction
  4.2 LU factorization and SVD
  4.3 Algorithm for MMPE
  4.4 Error estimation via algorithm for MMPE
  4.5 Algorithm for SVD-MPE
  4.6 Error estimation via algorithm for SVD-MPE

5 Epsilon Algorithms
  5.1 Introduction
  5.2 SEA
  5.3 VEA
  5.4 TEA
  5.5 Implementation of epsilon algorithms in cycling mode
  5.6 ETEA: An economical implementation of TEA
  5.7 Comparison of epsilon algorithms with polynomial methods

6 Convergence Study of Extrapolation Methods: Part I
  6.1 Introduction
  6.2 Convergence and stability of rows in the extrapolation table
  6.3 Technical preliminaries
  6.4 Proof of Theorem 6.6 for MMPE and TEA
  6.5 Proof of Theorem 6.6 for MPE and RRE
  6.6 Another proof of (6.13)
  6.7 Proof of Theorem 6.7
  6.8 Generalizations
  6.9 Extension to infinite-dimensional spaces

7 Convergence Study of Extrapolation Methods: Part II
  7.1 Introduction
  7.2 Simple error bounds for $s_{n,k}^{\mathrm{RRE}}$
  7.3 Error bounds for $s_{n,k}^{\mathrm{MPE}}$ and $s_{n,k}^{\mathrm{RRE}}$ via orthogonal polynomials
  7.4 Appraisal of the upper bounds on $\Gamma_{n,k}^{D}$ and conclusions
  7.5 Justification of cycling
  7.6 Cycling and nonlinear systems of equations

8 Recursion Relations for Vector Extrapolation Methods
  8.1 Introduction
  8.2 Recursions for fixed $m$
  8.3 Recursions for $m = n + q$ with fixed $q$
  8.4 Recursions for fixed $n$
  8.5 qd-type algorithms and the matrix eigenvalue problem

II Krylov Subspace Methods

9 Krylov Subspace Methods for Linear Systems
  9.1 Projection methods for linear systems
  9.2 Krylov subspace methods: General discussion
  9.3 Method of Arnoldi: Full orthogonalization method (FOM)
  9.4 Method of generalized minimal residuals (GMR)
  9.5 FOM and GMR are related
  9.6 Recursive algorithms for FOM and GMR with $A$ Hermitian positive definite: A unified treatment of conjugate gradients and conjugate residuals
  9.7 On the existence of short recurrences for FOM and GMR
  9.8 The method of Lanczos
  9.9 Preconditioning
  9.10 FOM and GMR with prior iterations
  9.11 Krylov subspace methods for nonlinear systems

10 Krylov Subspace Methods for Eigenvalue Problems
  10.1 Projection methods for eigenvalue problems
  10.2 Krylov subspace methods
  10.3 The method of Arnoldi
  10.4 The method of Lanczos
  10.5 The case of Hermitian $A$
  10.6 Methods of Arnoldi and Lanczos for eigenvalues with special properties

III Applications and Generalizations

11 Miscellaneous Applications of Vector Extrapolation Methods
  11.1 Introduction
  11.2 Computation of steady-state solutions
  11.3 Computation of eigenvectors with known eigenvalues
  11.4 Computation of eigenpair derivatives
  11.5 Application to solution of singular linear systems
  11.6 Application to multidimensional scaling

12 Rational Approximations from Vector-Valued Power Series: Part I
  12.1 Introduction
  12.2 Derivation of vector-valued rational approximations
  12.3 A compact formula for $s_{n,k}(z)$ from SMPE
  12.4 Algebraic properties of $s_{n,k}(z)$
  12.5 Efficient computation of $s_{n,k}(z)$ from SMPE
  12.6 Some sources of vector-valued power series
  12.7 Convergence study of $s_{n,k}(z)$: A constructive theory of de Montessus type

13 Rational Approximations from Vector-Valued Power Series: Part II
  13.1 Introduction
  13.2 Generalized inverse vector-valued Padé approximants
  13.3 Simultaneous Padé approximants
  13.4 Directed simultaneous Padé approximants

14 Applications of SMPE, SMMPE, and STEA
  14.1 Introduction
  14.2 Application to the solution of $x = zAx + b$ versus Krylov subspace methods
  14.3 A related application to Fredholm integral equations
  14.4 Application to reanalysis of structures
  14.5 Application to nonlinear differential equations
  14.6 Application to computation of $f(A)b$

15 Vector Generalizations of Scalar Extrapolation Methods
  15.1 Introduction
  15.2 Review of a generalized Richardson extrapolation process
  15.3 Vectorization of FGREP
  15.4 A convergence theory for FGREP2
  15.5 Vector extrapolation methods from scalar sequence transformations

16 Vector-Valued Rational Interpolation Methods
  16.1 Introduction
  16.2 Development of vector-valued rational interpolation methods
  16.3 Algebraic properties of interpolants
  16.4 Convergence study of $r_{p,k}(z)$: A constructive theory of de Montessus type

IV Appendices

A QR Factorization
  A.1 Gram–Schmidt orthogonalization (GS) and QR factorization
  A.2 Modified Gram–Schmidt orthogonalization (MGS)
  A.3 MGS with reorthogonalization

B Singular Value Decomposition (SVD)
  B.1 Full SVD
  B.2 Reduced SVD
  B.3 SVD as a sum of rank-one matrices

C Moore–Penrose Generalized Inverse
  C.1 Definition of the Moore–Penrose generalized inverse
  C.2 SVD representation of $A^{+}$
  C.3 Connection with linear least-squares problems
  C.4 Moore–Penrose inverse of a normal matrix

D Basics of Orthogonal Polynomials
  D.1 Definition and basic properties of orthogonal polynomials
  D.2 Some bounds related to orthogonal polynomials

E Chebyshev Polynomials: Basic Properties
  E.1 Definition of Chebyshev polynomials
  E.2 Zeros, extrema, and min-max properties
  E.3 Orthogonality properties

F Useful Formulas and Results for Jacobi Polynomials
  F.1 Definition of Jacobi polynomials
  F.2 Some consequences

G Rayleigh Quotient and Power Method
  G.1 Properties of the Rayleigh quotient
  G.2 The power method
  G.3 Inverse power method or inverse iteration
  G.4 Inverse power method with variable shifts

H Unified FORTRAN 77 Code for MPE and RRE

Bibliography

Index

Preface

An important problem that arises in different areas of science and engineering is that of computing limits of sequences of vectors $\{x_m\}$, where $x_m \in \mathbb{C}^N$, the dimension $N$ being very large. Such sequences arise, for example, in the solution by fixed-point iterative methods of systems of linear or nonlinear algebraic equations. Given such a system of equations with solution $s$ and a sequence $\{x_m\}$ obtained from this system by a fixed-point iterative method, we have $\lim_{m\to\infty} x_m = s$ if $\lim_{m\to\infty} x_m$ exists. Now, in most cases of interest, $\{x_m\}$ converges to $s$ extremely slowly, making direct use of the $x_m$ to approximate $s$ with reasonable accuracy quite expensive. This will especially be the case when computing each $x_m$ is very time-consuming. One practical way to remedy this problem is to apply a suitable extrapolation (or convergence acceleration) method to the available $x_m$. An extrapolation method takes a finite (and preferably small) number of the vectors $x_m$ and processes them to obtain an approximation to $s$ that is better than the individual $x_m$ used in the process. A good method is in general nonlinear in the $x_m$ and takes into account, either implicitly or explicitly, the asymptotic behavior of the $x_m$ as $m \to \infty$.

If the sequence $\{x_m\}$ does not converge, we may think that no use can be made of it to approximate $s$. However, at least in some cases, a suitable vector extrapolation method can still be applied to the divergent sequence $\{x_m\}$ to produce good approximations to $s$, the solution of the system of equations being solved. In this case, we call $s$ the antilimit of $\{x_m\}$; figuratively speaking, we may view the sequence $\{x_m\}$ as diverging from its antilimit $s$.

One nice feature of the methods we study is that they take the vector sequence $\{x_m\}$ as their only input, nothing else being needed. As such, they can be applied to arbitrary vector sequences, whether these are obtained from linear systems or from nonlinear systems or in any other way.
The subject of vector extrapolation methods was initiated by Peter Wynn in the 1960s with an interesting and successful generalization of his famous epsilon algorithm, which implements the transformation of Daniel Shanks for accelerating the convergence of sequences of scalars. The works of Shanks and Wynn had a great impact and paved the way for more research into convergence acceleration. With the addition of more methods and their detailed study, the subject of vector extrapolation methods has come a long way since then. Today, it is an independent research area of numerical analysis. It has many practical applications. It also has connections to approximation theory. The relevance of vector extrapolation methods as effective computational tools for solving problems of very high dimension has long been recognized, as can be ascertained by doing a literature search in different computational disciplines.

There are a few books that discuss different aspects of vector extrapolation methods: The 1977 monograph of Brezinski [29] contains one chapter that deals with the epsilon algorithm, its vectorized versions, and a matrix version. The 1991 book by Brezinski and Redivo Zaglia [36] contains one chapter that discusses some of the developments that took place in vector extrapolation methods up to the 1980s. Both books treat vector extrapolation methods as part of the general topic of convergence acceleration. The more recent book by Gander, Gander, and Kwok [93] briefly discusses a few of these methods as tools of scientific computing. So far, however, there has not been a book that is fully dedicated to the subject of vector extrapolation methods and their applications. The present book will hopefully help to fill this void.

The main purpose of this book is to present a unified and systematic account of the existing literature, old and new, on the theory and applications of vector extrapolation methods that is as comprehensive and up-to-date as possible. In this account, I include much of the original and relevant literature that deals with methods of practical importance whose effectiveness has been amply verified in various surveys and comparative studies. I discuss the algebraic, computational/algorithmic, and analytical aspects of the methods covered. The discussions are rigorous, and complete proofs are provided in most places to make the reading flow better. I believe this treatment will help the reader understand the thought process leading to the development of the individual methods, why these methods work, how they work, and how they should be applied for best results. Inevitably, the contents and the perspective of this book reflect my personal interests and taste. Therefore, I apologize to those colleagues whose work has not been covered.

Following the introduction and a review of substantial general and numerical linear algebra background in Chapter 0, which is needed throughout, this book is divided into four parts:

(i) Part I reviews several vector extrapolation methods that are in use and that have proved to be efficient convergence accelerators. These methods are divided into two groups: (i) polynomial-type methods and (ii) epsilon algorithms.
Chapter 1 presents the development and algebraic properties of four polynomial-type methods: minimal polynomial extrapolation (MPE), reduced rank extrapolation (RRE), modified minimal polynomial extrapolation (MMPE), and the most recent singular value decomposition–based minimal polynomial extrapolation (SVD-MPE). Chapters 2 and 4 present computationally efficient and numerically stable algorithms for these methods. The algorithms presented are also very economical as far as computer storage requirements are concerned; this issue is crucial since most major applications of vector extrapolation methods are to very high dimensional problems. Chapter 3 discusses some interesting relations between MPE and RRE that were discovered recently. (Note that MPE and RRE are essentially different from each other.)

Chapter 5 covers the three known epsilon algorithms: the scalar epsilon algorithm (SEA), the vector epsilon algorithm (VEA), and the topological epsilon algorithm (TEA).

Chapters 6 and 7 present unified convergence and convergence acceleration theories for MPE, RRE, MMPE, and TEA. Technically speaking, the contents of the two chapters are quite involved. In these chapters, I have given detailed proofs of some of the convergence theorems. I have decided to present the complete proofs as part of this book since their techniques are quite general and are immediately applicable in other problems as well. For example, the techniques of proof developed in Chapter 6 have been used to prove some of the results presented in Chapters 12, 13, 14, and 16. (Of course, readers who do not want to spend their time on the proofs can simply skip them and study only the statements of the relevant convergence theorems and the remarks and explanations that follow the latter.)

Chapter 8 discusses some interesting recursion relations that exist among the vectors obtained from each of the methods MPE, RRE, MMPE, and TEA.

(ii) Part II reviews some of the developments related to Krylov subspace methods for matrix problems, a most interesting topic of numerical linear algebra, to which vector extrapolation methods are closely related.

Chapter 9 deals with Krylov subspace methods for solving linear systems since these are related to MPE, RRE, and TEA when the latter are applied to vector sequences that are generated by fixed-point iterative procedures for linear systems. In particular, it reviews the method of Arnoldi that is also known as the full orthogonalization method (FOM), the method of generalized minimal residuals (GMR), and the method of Lanczos. It discusses the method of conjugate gradients (CG) and the method of conjugate residuals (CR) in a unified manner. It also discusses the biconjugate gradient algorithm (Bi-CG).

Chapter 10 deals with Krylov subspace methods for solving matrix eigenvalue problems. It reviews the method of Arnoldi and the method of Lanczos for these problems. These methods are also closely related to MPE and TEA.

(iii) Part III reviews some of the applications of vector extrapolation methods.

Chapter 11 presents some nonstandard uses for computing eigenvectors corresponding to known eigenvalues (such as the PageRank of the Google Web matrix) and for computing derivatives of eigenpairs. Another interesting application concerns multidimensional scaling.

Chapter 12 deals with applying vector extrapolation methods to vector-valued power series.
When MPE, MMPE, and TEA are applied to sequences of vector-valued polynomials that form the partial sums of vector-valued Maclaurin series, they produce vector-valued rational approximations to the sums of these series. Chapter 12 discusses the properties of these rational approximations.

Chapter 13 presents additional methods based on ideas from Padé approximants for obtaining rational approximations from vector-valued power series.

Chapter 14 gives some interesting applications of vector-valued rational approximations.

Chapter 15 briefly presents some of the current knowledge about vector generalizations of some scalar extrapolation methods, a subject that has not yet been explored fully.

Chapter 16 discusses vector-valued rational interpolation procedures in the complex plane that are closely related to the methods developed in Chapter 12.

(iv) Part IV is a compendium of eight appendices covering topics that we refer to in Parts I–III. The topics covered are QR factorization in Appendix A; singular value decomposition (SVD) in Appendix B; the Moore–Penrose inverse in Appendix C; fundamental properties of orthogonal polynomials and special properties of Chebyshev and Jacobi polynomials in Appendices D, E, and F; and the Rayleigh quotient and the power method in Appendix G. Appendix G also gives an independent rigorous treatment of the local convergence properties of the Rayleigh quotient inverse power method with variable shifts for the eigenvalue problem, a subject not treated in most books on numerical linear algebra. A well-documented and well-tested FORTRAN 77 code for implementing MPE and RRE in a unified manner is included in Appendix H.

The informed reader may pay attention to the fact that I have not included matrix extrapolation methods in this book, even though they are definitely related to vector extrapolation methods; I have pointed out some of the relevant literature on this topic, however. In my discussion of Krylov subspace methods for linear systems, I have also excluded the topic of semi-iterative methods, with Chebyshev iteration being the most interesting representative. This subject is covered very extensively in the various books and papers referred to in the relevant chapters. The main reason for both omissions, which I regret very much, was the necessity to keep the size of the book in check. Finally, I have not included any numerical examples since there are many of these in the existing literature; the limitation I imposed on the size of the book was again the reason for this omission. Nevertheless, I have pointed out some papers containing numerical examples that illustrate the theoretical results presented in the different chapters.

I hope this book will serve as a reference for the more mathematically inclined researchers in the area of vector extrapolation methods and for scientists and engineers in different computational disciplines and as a textbook for students interested in undertaking to study the subject seriously. Most of the mathematical background needed to cope with the material is summarized in Chapter 0 and the appendices, and some is provided as needed in the relevant chapters.

Before closing, I would like to express my deepest gratitude and appreciation to my dear friends and colleagues Dr. William F. Ford of NASA Lewis Research Center (today, NASA John H. Glenn Research Center at Lewis Field) and Professor David A. Smith of Duke University, who introduced me to the general topic of vector extrapolation methods. Our fruitful collaboration began after I was invited by Dr. Ford to Lewis Research Center to spend a sabbatical there during 1981–1983. Our first joint work was summarized very briefly in the NASA technical memorandum [297] and presented at the Thirtieth Anniversary Meeting of the Society for Industrial and Applied Mathematics, Stanford, California, July 19–23, 1982. This work was eventually published as the NASA technical paper [298] and, later, as the journal paper [299]. I consider it a privilege to acknowledge their friendship and their influence on my career in this most interesting topic.

Lastly, I owe a debt of gratitude to my dear wife Carmella for her constant patience, understanding, support, and encouragement while this book was being written. I dedicate this book to her with love.

Avram Sidi
Technion, Haifa
December 2016

Chapter 0

Introduction and Review of Linear Algebra

0.1 General background and motivation

An important problem that arises in different areas of science and engineering is that of computing limits of sequences of vectors $\{\boldsymbol{x}_m\}$,¹ where $\boldsymbol{x}_m \in \mathbb{C}^N$, the dimension $N$ being very large in many applications. Such vector sequences arise, for example, in the numerical solution of very large systems of linear or nonlinear equations by fixed-point iterative methods, and $\lim_{m\to\infty}\boldsymbol{x}_m$ are simply the required solutions to these systems. One common source of such systems is the finite-difference or finite-element discretization of continuum problems. In later chapters, we will discuss further problems that give rise to vector sequences whose limits are needed.

In most cases of interest, however, the sequences $\{\boldsymbol{x}_m\}$ converge to their limits extremely slowly. That is, to approximate $\boldsymbol{s} = \lim_{m\to\infty}\boldsymbol{x}_m$ with a reasonable prescribed level of accuracy by $\boldsymbol{x}_m$, we need to consider very large values of $m$. Since the vectors $\boldsymbol{x}_m$ are normally computed in the order $m = 0, 1, 2, \ldots,$ it is clear that we have to compute many such vectors until we reach one that has acceptable accuracy. Thus, this way of approximating $\boldsymbol{s}$ via the $\boldsymbol{x}_m$ becomes very expensive computationally.

Nevertheless, we may ask whether we can do something with those $\boldsymbol{x}_m$ that are already available, to somehow obtain new approximations to $\boldsymbol{s}$ that are better than each individual available $\boldsymbol{x}_m$. The answer to this question is yes for at least a large class of sequences that arise from fixed-point iteration of linear and nonlinear systems of equations. One practical way of achieving this is to apply to the sequence $\{\boldsymbol{x}_m\}$ a suitable convergence acceleration method (or extrapolation method).

Of course, if $\lim_{m\to\infty}\boldsymbol{x}_m$ does not exist, it seems that no use can be made of the $\boldsymbol{x}_m$. Now, if the sequence $\{\boldsymbol{x}_m\}$ is generated by an iterative solution of a linear or nonlinear system of equations, it can be thought of as “diverging from” the solution $\boldsymbol{s}$ of this system. We call $\boldsymbol{s}$ the antilimit of $\{\boldsymbol{x}_m\}$ in this case. It turns out that vector extrapolation methods can be applied to such divergent sequences $\{\boldsymbol{x}_m\}$ to obtain good approximations to the relevant antilimits, at least in some cases.

As we will see later, a vector extrapolation method computes as an approximation to the limit or antilimit of $\{\boldsymbol{x}_m\}$ a “weighted average” of a certain number of the vectors $\boldsymbol{x}_m$. This approximation is of the general form $\sum_{i=0}^{k}\gamma_i \boldsymbol{x}_{n+i}$, where $n$ and $k$ are integers chosen by the user and the scalars $\gamma_i$, which can be complex, satisfy $\sum_{i=0}^{k}\gamma_i = 1$. Of course, the methods differ in the way they determine the $\gamma_i$.

¹ Unless otherwise stated, $\{c_m\}$ will mean $\{c_m\}_{m=0}^{\infty}$ throughout this book.


Now, a good way to approach and motivate vector extrapolation methods is within the context of the fixed-point iterative solution of systems of equations. Because the actual development of these methods proceeds via the solution of linear systems, we devote the next section to a brief review of linear algebra, where we introduce the notation that we employ throughout this work and state some important results from matrix theory that we recall as we go along. Following these, in Sections 0.3 and 0.4 of this chapter, we review the essentials of the fixed-point iterative solution of nonlinear and, especially, linear systems in some detail. We advise the reader to study this chapter with some care and become familiar with its contents before proceeding to the next chapters.
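To make the weighted-average form $\sum_{i=0}^{k}\gamma_i \boldsymbol{x}_{n+i}$ of Section 0.1 concrete, the following minimal Python sketch applies it with $n = 0$ and $k = 1$ to a deliberately divergent linear iteration. The iteration matrix $1.5\,\boldsymbol{I}$, the vector $\boldsymbol{d}$, the helper names `step` and `extrapolate_k1`, and the least-squares rule used to pick the $\gamma_i$ are all assumptions made for this illustration only; the methods themselves are defined in later chapters.

```python
def dot(u, v):
    """Real Euclidean inner product of two lists."""
    return sum(a * b for a, b in zip(u, v))


def step(x):
    # Hypothetical iteration x_{m+1} = T x_m + d with T = 1.5*I and d = (-0.5, -1.0).
    # Its fixed point is s = (1, 2); since rho(T) = 1.5 > 1 the iterates diverge,
    # so s plays the role of an antilimit.
    return [1.5 * x[0] - 0.5, 1.5 * x[1] - 1.0]


def extrapolate_k1(x0, x1, x2):
    """Weighted average gamma0*x0 + gamma1*x1 with gamma0 + gamma1 = 1.

    The weights come from minimizing ||c0*u0 + u1||_2 over c0, where
    u0 = x1 - x0 and u1 = x2 - x1 (an assumed rule for this sketch).
    """
    u0 = [b - a for a, b in zip(x0, x1)]
    u1 = [b - a for a, b in zip(x1, x2)]
    c0 = -dot(u0, u1) / dot(u0, u0)
    gamma0 = c0 / (c0 + 1.0)
    gamma1 = 1.0 / (c0 + 1.0)
    approx = [gamma0 * a + gamma1 * b for a, b in zip(x0, x1)]
    return (gamma0, gamma1), approx


x0 = [0.0, 0.0]
x1 = step(x0)
x2 = step(x1)
gammas, s_approx = extrapolate_k1(x0, x1, x2)
print(gammas, s_approx)
```

In this special case every error vector is a multiple of the previous one, so two iterates already suffice to recover the fixed point, even though the iterates themselves move away from it. Note that the weights need not be positive.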

0.2 Some linear algebra notation and background

In this section, we provide the necessary background in matrix analysis and numerical linear algebra that we will need for this book; in addition, we establish most of the notation we will be using throughout. The rigorous treatments of these subjects are to be found in various books. For matrix analysis, we refer the reader to Gantmacher [95], Horn and Johnson [138, 139], Householder [142], Berman and Plemmons [21], and Varga [333], for example. For numerical linear algebra, we refer the reader (in alphabetical order) to Axelsson [12], Barrett et al. [16], Björck [23, 24], Brezinski [33, 34], Ciarlet [60], Datta [64], Demmel [73], Golub and Van Loan [103], Greenbaum [118], Hageman and Young [125], Ipsen [144], Kelley [157], Liesen and Strakoš [174], Meurant [187], Meyer [188], Parlett [208], Saad [239, 240], Stewart [309, 310, 311], Trefethen and Bau [324], van der Vorst [326], Watkins [339], and Wilkinson [344], for example. See also the numerical analysis books by Ralston and Rabinowitz [214] and Stoer and Bulirsch [313], and the more recent book by Gander, Gander, and Kwok [93], which also contain extensive treatments of numerical linear algebra.

Vector and matrix spaces

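We assume that the reader is familiar with the basic properties of vector spaces. We will be using the following standard notation:

• $\mathbb{C}$: the field of complex numbers.
• $\mathbb{R}$: the field of real numbers.
• $\mathbb{C}^s$: the complex vector space of dimension $s$ (over $\mathbb{C}$).
• $\mathbb{R}^s$: the real vector space of dimension $s$ (over $\mathbb{R}$).
• $\mathbb{C}^{r\times s}$: the space of $r \times s$ matrices with complex entries.
• $\mathbb{R}^{r\times s}$: the space of $r \times s$ matrices with real entries.

We denote the dimension of any vector space $\mathcal{X}$ by $\dim \mathcal{X}$.

Subspaces

• A subset $\mathcal{Y}$ of a vector space $\mathcal{X}$ is a subspace of $\mathcal{X}$ if it is a vector space itself.
• If $\mathcal{Y}$ and $\mathcal{Z}$ are subspaces of the vector space $\mathcal{X}$, then the set $\mathcal{Y} \cap \mathcal{Z}$ is a subspace of $\mathcal{X}$, and we have
\[
\dim \mathcal{Y} + \dim \mathcal{Z} - \dim \mathcal{X} \le \dim(\mathcal{Y} \cap \mathcal{Z}) \le \min\{\dim \mathcal{Y}, \dim \mathcal{Z}\}.
\]
The set $\mathcal{Y} \cup \mathcal{Z}$ is not necessarily a subspace.
• Define the sum of the subspaces $\mathcal{Y}$ and $\mathcal{Z}$ of $\mathcal{X}$ via
\[
\mathcal{Y} + \mathcal{Z} = \{\boldsymbol{z} \in \mathcal{X} : \boldsymbol{z} = \boldsymbol{x} + \boldsymbol{y},\ \boldsymbol{x} \in \mathcal{Y},\ \boldsymbol{y} \in \mathcal{Z}\}.
\]
$\mathcal{Y} + \mathcal{Z}$ is a subspace of $\mathcal{X}$, and
\[
\max\{\dim \mathcal{Y}, \dim \mathcal{Z}\} \le \dim(\mathcal{Y} + \mathcal{Z}) \le \dim \mathcal{Y} + \dim \mathcal{Z},
\qquad
\dim(\mathcal{Y} + \mathcal{Z}) = \dim \mathcal{Y} + \dim \mathcal{Z} - \dim(\mathcal{Y} \cap \mathcal{Z}).
\]

Vectors and matrices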

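We will use lowercase boldface italic letters to denote column vectors. We will also use $\boldsymbol{0}$ to denote the zero column vector. Thus,
\[
\boldsymbol{x} \in \mathbb{C}^s \quad\text{if}\quad
\boldsymbol{x} = \begin{bmatrix} x^{(1)} \\ x^{(2)} \\ \vdots \\ x^{(s)} \end{bmatrix},
\qquad x^{(i)} \in \mathbb{C}\ \ \forall\, i. \tag{0.1}
\]
We will denote the standard basis vectors in $\mathbb{C}^s$ by $\boldsymbol{e}_i$. Thus $\boldsymbol{e}_i$ has one as its $i$th component, the remaining components being zero. We will also denote by $\boldsymbol{e}$ the vector whose components are all unity.

We denote the transpose of $\boldsymbol{x}$ and the Hermitian conjugate of $\boldsymbol{x}$, both row vectors, by $\boldsymbol{x}^T$ and $\boldsymbol{x}^*$, respectively, and these are given as
\[
\boldsymbol{x}^T = [x^{(1)}, x^{(2)}, \ldots, x^{(s)}]
\quad\text{and}\quad
\boldsymbol{x}^* = [\,\overline{x^{(1)}}, \overline{x^{(2)}}, \ldots, \overline{x^{(s)}}\,].
\]
Here, $\overline{a}$ stands for the complex conjugate of $a$.

We will use uppercase boldface italic letters to denote matrices. We will also use $\boldsymbol{O}$ to denote the zero matrix. Of course, $\boldsymbol{I}$ denotes the identity matrix. Sometimes it becomes necessary to emphasize the dimension of the identity matrix, and in such cases we will also write $\boldsymbol{I}_s$ to denote the identity matrix in $\mathbb{C}^{s\times s}$. Thus,
\[
\boldsymbol{A} \in \mathbb{C}^{r\times s} \quad\text{if}\quad
\boldsymbol{A} = [a_{ij}]_{\substack{1\le i\le r \\ 1\le j\le s}} =
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1s} \\
a_{21} & a_{22} & \cdots & a_{2s} \\
\vdots & \vdots & & \vdots \\
a_{r1} & a_{r2} & \cdots & a_{rs}
\end{bmatrix},
\qquad a_{ij} \in \mathbb{C}\ \ \forall\, i, j. \tag{0.2}
\]
Sometimes, we will also denote $a_{ij}$ by $(\boldsymbol{A})_{ij}$.

• We denote by $\boldsymbol{A}^T$ the transpose of $\boldsymbol{A}$. Thus, if $\boldsymbol{A} \in \mathbb{C}^{r\times s}$, then $\boldsymbol{A}^T \in \mathbb{C}^{s\times r}$, and
\[
(\boldsymbol{A}^T)_{ij} = (\boldsymbol{A})_{ji} \quad \forall\, i, j.
\]
Similarly, we denote by $\boldsymbol{A}^*$ the Hermitian conjugate of $\boldsymbol{A}$. Thus, if $\boldsymbol{A} \in \mathbb{C}^{r\times s}$, then $\boldsymbol{A}^* \in \mathbb{C}^{s\times r}$, and
\[
(\boldsymbol{A}^*)_{ij} = \overline{(\boldsymbol{A})_{ji}} \quad \forall\, i, j.
\]
Here too, $\overline{a}$ stands for the complex conjugate of $a$. Thus,
\[
\boldsymbol{A}^* = \overline{\boldsymbol{A}^T} = \overline{\boldsymbol{A}}^{\,T}.
\]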

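• A square matrix $\boldsymbol{A}$ is said to be symmetric if $\boldsymbol{A}^T = \boldsymbol{A}$. It is said to be skew symmetric if $\boldsymbol{A}^T = -\boldsymbol{A}$. If $\boldsymbol{A}$ is real skew symmetric, then $\boldsymbol{x}^T\boldsymbol{A}\boldsymbol{x} = 0$ for every real vector $\boldsymbol{x}$. Similarly, a square matrix $\boldsymbol{A}$ is said to be Hermitian if $\boldsymbol{A}^* = \boldsymbol{A}$. It is said to be skew Hermitian if $\boldsymbol{A}^* = -\boldsymbol{A}$. If $\boldsymbol{A}$ is Hermitian (skew Hermitian), then $\boldsymbol{x}^*\boldsymbol{A}\boldsymbol{x}$ is real (purely imaginary or zero) for every complex vector $\boldsymbol{x}$. If $\boldsymbol{A} = [a_{ij}]_{1\le i,j\le s}$ is Hermitian (skew Hermitian), then the $a_{ii}$ are all real (purely imaginary or zero).
• A square matrix $\boldsymbol{A}$ is said to be normal if it satisfies $\boldsymbol{A}^*\boldsymbol{A} = \boldsymbol{A}\boldsymbol{A}^*$. Thus, Hermitian, skew-Hermitian, real symmetric, and real skew-symmetric matrices are all normal.
• Any complex square matrix $\boldsymbol{A}$ can be written in the form $\boldsymbol{A} = \boldsymbol{A}_H + \boldsymbol{A}_S$, where $\boldsymbol{A}_H = \tfrac{1}{2}(\boldsymbol{A} + \boldsymbol{A}^*)$ is the Hermitian part of $\boldsymbol{A}$ and $\boldsymbol{A}_S = \tfrac{1}{2}(\boldsymbol{A} - \boldsymbol{A}^*)$ is the skew-Hermitian part of $\boldsymbol{A}$. The symmetric part and the skew-symmetric part of a real square matrix can be defined analogously.
• A square matrix $\boldsymbol{Q}$ is said to be unitary if it satisfies $\boldsymbol{Q}^*\boldsymbol{Q} = \boldsymbol{Q}\boldsymbol{Q}^* = \boldsymbol{I}$.
• A matrix $\boldsymbol{Q} \in \mathbb{C}^{r\times s}$, $r > s$, is also said to be unitary if $\boldsymbol{Q}^*\boldsymbol{Q} = \boldsymbol{I}_s$. (Note that, in this case, $\boldsymbol{Q}\boldsymbol{Q}^*$ is a singular $r \times r$ matrix, hence not equal to $\boldsymbol{I}_r$.)
• If we denote the $i$th column of $\boldsymbol{Q} \in \mathbb{C}^{r\times s}$, $r \ge s$, by $\boldsymbol{q}_i$, then $\boldsymbol{Q}$ is unitary if $\boldsymbol{q}_i^*\boldsymbol{q}_j = \delta_{ij}$ for all $i, j$.
• A square matrix $\boldsymbol{P}$ is said to be a permutation matrix if it is obtained by permuting the rows (or the columns) of the identity matrix $\boldsymbol{I}$. If $\boldsymbol{P}$ is a permutation matrix, then so is $\boldsymbol{P}^T$. In addition, $\boldsymbol{P}^T\boldsymbol{P} = \boldsymbol{P}\boldsymbol{P}^T = \boldsymbol{I}$; that is, $\boldsymbol{P}$ is also unitary. Let $\boldsymbol{A} \in \mathbb{C}^{r\times s}$, and let $\boldsymbol{P} \in \mathbb{C}^{s\times s}$ and $\boldsymbol{P}' \in \mathbb{C}^{r\times r}$ be two permutation matrices. Then the matrices $\boldsymbol{A}\boldsymbol{P}$ and $\boldsymbol{P}'\boldsymbol{A}$ are obtained by permuting, respectively, the columns and the rows of $\boldsymbol{A}$.
• $\boldsymbol{A} \in \mathbb{C}^{r\times s}$ is said to be diagonal if $a_{ij} = 0$ for $i \ne j$, whether $r \ge s$ or $r \le s$.
• If $\boldsymbol{A} \in \mathbb{C}^{r\times s}$ is a diagonal matrix with elements $d_1, d_2, \ldots, d_p$ along its main diagonal, where $p = \min\{r, s\}$, then we will write
\[
\boldsymbol{A} = \operatorname{diag}(d_1, d_2, \ldots, d_p), \qquad p = \min\{r, s\}.
\]
If $\boldsymbol{A} = [a_{ij}]_{1\le i,j\le s}$ is a square matrix, we will define the matrix $\operatorname{diag}(\boldsymbol{A}) \in \mathbb{C}^{s\times s}$ via
\[
\operatorname{diag}(\boldsymbol{A}) = \operatorname{diag}(a_{11}, a_{22}, \ldots, a_{ss}),
\]
and we will define $\operatorname{tr}(\boldsymbol{A})$, the trace of $\boldsymbol{A}$, as
\[
\operatorname{tr}(\boldsymbol{A}) = \sum_{i=1}^{s} a_{ii}.
\]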

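• A matrix $\boldsymbol{A} \in \mathbb{C}^{r\times s}$ whose elements below (above) the main diagonal are all zero is said to be upper triangular (lower triangular).²
• If $\boldsymbol{A} = [a_{ij}]_{1\le i,j\le s}$ and $\boldsymbol{B} = [b_{ij}]_{1\le i,j\le s}$ are upper (lower) triangular, then $\boldsymbol{A}\boldsymbol{B} = \boldsymbol{C} = [c_{ij}]_{1\le i,j\le s}$ is upper (lower) triangular too, and $c_{ii} = a_{ii}b_{ii}$ for all $i$.
• If $\boldsymbol{A} = [a_{ij}]_{1\le i,j\le s}$ is upper (lower) triangular with $a_{ii} \ne 0$ for all $i$, then $\boldsymbol{A}$ is nonsingular. $\boldsymbol{A}^{-1}$ is upper (lower) triangular too, and $(\boldsymbol{A}^{-1})_{ii} = 1/a_{ii}$ for all $i$.
• A matrix whose elements below the subdiagonal (above the superdiagonal) are all zero is said to be upper Hessenberg (lower Hessenberg).

Partitioning of matrices

We will make frequent use of matrix partitionings in different places.

• If $\boldsymbol{A} = [a_{ij}]_{\substack{1\le i\le r \\ 1\le j\le s}} \in \mathbb{C}^{r\times s}$, then we will denote the $i$th row and $j$th column of $\boldsymbol{A}$ by $\boldsymbol{a}_i^T$ and $\boldsymbol{a}_j$, respectively; that is,
\[
\boldsymbol{a}_i^T = [a_{i1}, a_{i2}, \ldots, a_{is}]
\quad\text{and}\quad
\boldsymbol{a}_j = [a_{1j}, a_{2j}, \ldots, a_{rj}]^T.
\]
We will also write
\[
\boldsymbol{A} = \begin{bmatrix} \boldsymbol{a}_1^T \\ \boldsymbol{a}_2^T \\ \vdots \\ \boldsymbol{a}_r^T \end{bmatrix}
\quad\text{and}\quad
\boldsymbol{A} = [\,\boldsymbol{a}_1 \,|\, \boldsymbol{a}_2 \,|\, \cdots \,|\, \boldsymbol{a}_s\,]. \tag{0.3}
\]
• If $\boldsymbol{A} \in \mathbb{C}^{r\times s}$ and $\boldsymbol{B} \in \mathbb{C}^{s\times t}$, then we have³
\[
\boldsymbol{A}\boldsymbol{B} = [\,\boldsymbol{A}\boldsymbol{b}_1 \,|\, \boldsymbol{A}\boldsymbol{b}_2 \,|\, \cdots \,|\, \boldsymbol{A}\boldsymbol{b}_t\,]
\quad\text{and also}\quad
\boldsymbol{A}\boldsymbol{B} = \sum_{i=1}^{s} \boldsymbol{a}_i \boldsymbol{b}_i^T,
\]
where, in the first expression, $\boldsymbol{b}_j$ is the $j$th column of $\boldsymbol{B}$ and, in the second, $\boldsymbol{a}_i$ is the $i$th column of $\boldsymbol{A}$ and $\boldsymbol{b}_i^T$ is the $i$th row of $\boldsymbol{B}$.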

Matrix-vector multiplication

Let $\boldsymbol{x} = [x^{(1)}, \ldots, x^{(s)}]^T$ and let $\boldsymbol{A} = [a_{ij}]_{\substack{1\le i\le r \\ 1\le j\le s}} \in \mathbb{C}^{r\times s}$, with the columnwise partitioning in (0.3). Then $\boldsymbol{z} = \boldsymbol{A}\boldsymbol{x}$ can be computed as follows:

• Row version: $z^{(i)} = \sum_{j=1}^{s} a_{ij} x^{(j)}$, $i = 1, \ldots, r$.
• Column version: $\boldsymbol{z} = \sum_{j=1}^{s} x^{(j)} \boldsymbol{a}_j$.

² When $r < s$ ($r > s$), $\boldsymbol{A}$ is said to be upper trapezoidal (lower trapezoidal) too.
³ Note that if $\boldsymbol{x} = [x^{(1)}, \ldots, x^{(r)}]^T \in \mathbb{C}^r$ and $\boldsymbol{y} = [y^{(1)}, \ldots, y^{(s)}]^T \in \mathbb{C}^s$, then $\boldsymbol{Z} = \boldsymbol{x}\boldsymbol{y}^T \in \mathbb{C}^{r\times s}$, with $z_{ij} = (\boldsymbol{Z})_{ij} = x^{(i)} y^{(j)}$.
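The row and column versions of the matrix-vector product, and the outer-product form of the matrix product from the partitioning discussion, can be sketched in plain Python as follows. The function names and the small test matrices are assumptions chosen for illustration; matrices are stored as lists of rows.

```python
def matvec_row(A, x):
    """Row version: z^(i) = sum_j a_ij * x^(j)."""
    return [sum(A[i][j] * x[j] for j in range(len(x))) for i in range(len(A))]


def matvec_col(A, x):
    """Column version: z = sum_j x^(j) * a_j, accumulating one column at a time."""
    r = len(A)
    z = [0] * r
    for j, xj in enumerate(x):
        for i in range(r):
            z[i] += xj * A[i][j]
    return z


def matmul_outer(A, B):
    """AB as the sum over i of (i-th column of A) times (i-th row of B)."""
    r, s, t = len(A), len(B), len(B[0])
    C = [[0] * t for _ in range(r)]
    for i in range(s):
        for p in range(r):
            for q in range(t):
                C[p][q] += A[p][i] * B[i][q]
    return C


A = [[1, 2], [3, 4], [5, 6]]   # a 3 x 2 example
x = [1, -1]
B = [[0, 1], [1, 0]]           # swaps the two columns of A

print(matvec_row(A, x), matvec_col(A, x))
print(matmul_outer(A, B))
```

Both versions perform the same arithmetic; they differ only in the order in which the entries of $\boldsymbol{A}$ are visited, which matters in practice when the matrix is stored by rows or by columns.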

Linear independence and rank

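• A set of vectors $\{\boldsymbol{a}_1, \ldots, \boldsymbol{a}_k\}$ is said to be linearly dependent if there exist scalars $\alpha_i$, not all zero, such that $\sum_{i=1}^{k} \alpha_i \boldsymbol{a}_i = \boldsymbol{0}$. Otherwise, it is said to be linearly independent.
• The number of linearly independent rows of a matrix $\boldsymbol{A}$ is equal to the number of its linearly independent columns, and this number is called the rank of $\boldsymbol{A}$. We denote this number by $\operatorname{rank}(\boldsymbol{A})$.
• Thus, if $\boldsymbol{A} \in \mathbb{C}^{r\times s}$, then $\operatorname{rank}(\boldsymbol{A}) \le \min\{r, s\}$.
• We also have $\operatorname{rank}(\boldsymbol{A}) = \operatorname{rank}(\boldsymbol{A}^T) = \operatorname{rank}(\boldsymbol{A}^*)$.
• If $\boldsymbol{A} \in \mathbb{C}^{r\times s}$ and $\operatorname{rank}(\boldsymbol{A}) = \min\{r, s\}$, then $\boldsymbol{A}$ is said to be of full rank. If $\boldsymbol{A} \in \mathbb{C}^{r\times s}$, $r \ge s$, and is of full column rank, that is, $\operatorname{rank}(\boldsymbol{A}) = s$, then $\boldsymbol{A}\boldsymbol{x} = \boldsymbol{0}$ if and only if $\boldsymbol{x} = \boldsymbol{0}$.
• $\boldsymbol{A} \in \mathbb{C}^{s\times s}$ is nonsingular if and only if $\operatorname{rank}(\boldsymbol{A}) = s$.
• $\operatorname{rank}(\boldsymbol{A}\boldsymbol{B}) \le \min\{\operatorname{rank}(\boldsymbol{A}), \operatorname{rank}(\boldsymbol{B})\}$.
• If $\boldsymbol{A} \in \mathbb{C}^{r\times r}$ and $\boldsymbol{C} \in \mathbb{C}^{s\times s}$ are nonsingular, and $\boldsymbol{B} \in \mathbb{C}^{r\times s}$, then $\operatorname{rank}(\boldsymbol{A}\boldsymbol{B}) = \operatorname{rank}(\boldsymbol{B}\boldsymbol{C}) = \operatorname{rank}(\boldsymbol{A}\boldsymbol{B}\boldsymbol{C}) = \operatorname{rank}(\boldsymbol{B})$.
• $\operatorname{rank}(\boldsymbol{A}^*\boldsymbol{A}) = \operatorname{rank}(\boldsymbol{A}\boldsymbol{A}^*) = \operatorname{rank}(\boldsymbol{A})$.

Range and null space

Let $\boldsymbol{A} \in \mathbb{C}^{r\times s}$, with the columnwise partitioning in (0.3).

• The range of $\boldsymbol{A}$ is the subspace
\[
\mathcal{R}(\boldsymbol{A}) = \{\boldsymbol{y} \in \mathbb{C}^r : \boldsymbol{y} = \boldsymbol{A}\boldsymbol{x} \text{ for some } \boldsymbol{x} \in \mathbb{C}^s\}.
\]
Thus, $\mathcal{R}(\boldsymbol{A})$ is the set of all vectors of the form $\boldsymbol{y} = \sum_{i=1}^{s} x^{(i)} \boldsymbol{a}_i$. Hence, the dimension of $\mathcal{R}(\boldsymbol{A})$ satisfies $\dim \mathcal{R}(\boldsymbol{A}) = \operatorname{rank}(\boldsymbol{A})$.
• The null space of $\boldsymbol{A}$ is the subspace
\[
\mathcal{N}(\boldsymbol{A}) = \{\boldsymbol{x} \in \mathbb{C}^s : \boldsymbol{A}\boldsymbol{x} = \boldsymbol{0}\}.
\]
The dimension of $\mathcal{N}(\boldsymbol{A})$ satisfies $\dim \mathcal{N}(\boldsymbol{A}) = s - \operatorname{rank}(\boldsymbol{A})$.
• For any two matrices $\boldsymbol{A} \in \mathbb{C}^{r\times s}$ and $\boldsymbol{B} \in \mathbb{C}^{s\times p}$,
\[
\mathcal{R}(\boldsymbol{A}\boldsymbol{B}) \subseteq \mathcal{R}(\boldsymbol{A})
\quad\text{and}\quad
\dim \mathcal{R}(\boldsymbol{A}\boldsymbol{B}) = \dim \mathcal{R}(\boldsymbol{B}) - \dim[\mathcal{N}(\boldsymbol{A}) \cap \mathcal{R}(\boldsymbol{B})].
\]
• We also have
\[
\mathcal{R}(\boldsymbol{A}^*\boldsymbol{A}) = \mathcal{R}(\boldsymbol{A}^*), \qquad \mathcal{N}(\boldsymbol{A}^*\boldsymbol{A}) = \mathcal{N}(\boldsymbol{A}).
\]
• For $\boldsymbol{A} \in \mathbb{C}^{r\times s}$, the orthogonal complement of $\mathcal{R}(\boldsymbol{A})$, denoted by $\mathcal{R}(\boldsymbol{A})^{\perp}$, is defined as
\[
\mathcal{R}(\boldsymbol{A})^{\perp} = \{\boldsymbol{y} \in \mathbb{C}^r : \boldsymbol{y}^*\boldsymbol{x} = 0 \text{ for all } \boldsymbol{x} \in \mathcal{R}(\boldsymbol{A})\}.
\]
Then every vector in $\mathbb{C}^r$ is the sum of a vector in $\mathcal{R}(\boldsymbol{A})$ and another vector in $\mathcal{R}(\boldsymbol{A})^{\perp}$; that is, $\mathbb{C}^r = \mathcal{R}(\boldsymbol{A}) \oplus \mathcal{R}(\boldsymbol{A})^{\perp}$. In addition,
\[
\mathcal{R}(\boldsymbol{A})^{\perp} = \mathcal{N}(\boldsymbol{A}^*).
\]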


Eigenvalues and eigenvectors

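• We say $(\lambda, \boldsymbol{v})$, where $\lambda \in \mathbb{C}$ and $\boldsymbol{v} \in \mathbb{C}^s$, $\boldsymbol{v} \ne \boldsymbol{0}$, is an eigenpair of a square matrix $\boldsymbol{A} \in \mathbb{C}^{s\times s}$ if $\boldsymbol{A}\boldsymbol{v} = \lambda\boldsymbol{v}$. $\lambda$ is an eigenvalue of $\boldsymbol{A}$ and $\boldsymbol{v}$ is a right eigenvector, or simply an eigenvector of $\boldsymbol{A}$, corresponding to the eigenvalue $\lambda$. To emphasize that $\lambda$ is an eigenvalue of $\boldsymbol{A}$, we will sometimes write $\lambda(\boldsymbol{A})$ instead of $\lambda$. The left eigenvector $\boldsymbol{w}$ of $\boldsymbol{A}$ corresponding to its eigenvalue $\lambda$ is defined via $\boldsymbol{w}^T\boldsymbol{A} = \lambda\boldsymbol{w}^T$.
• $\boldsymbol{A} \in \mathbb{C}^{s\times s}$ has exactly $s$ eigenvalues that are the roots of its characteristic polynomial $R(\lambda)$, which is defined by $R(\lambda) = \det(\lambda\boldsymbol{I} - \boldsymbol{A})$. Note that $R(\lambda)$ is of degree exactly $s$ with leading coefficient one. Thus, if $\lambda_1, \ldots, \lambda_q$ are the distinct roots of $R(\lambda)$, then
\[
R(\lambda) = \prod_{i=1}^{q} (\lambda - \lambda_i)^{r_i}, \qquad \lambda_i \ne \lambda_j \ \text{if}\ i \ne j, \qquad \sum_{i=1}^{q} r_i = s.
\]
$r_i$ is called the algebraic multiplicity of $\lambda_i$.
• An eigenvalue $\lambda_i$ of $\boldsymbol{A}$ is said to be simple if $r_i = 1$; otherwise, $\lambda_i$ is multiple.
• Eigenvectors corresponding to different eigenvalues are linearly independent.
• Denote by $r_i'$ the number of linearly independent eigenvectors corresponding to $\lambda_i$. $r_i'$ is called the geometric multiplicity of $\lambda_i$, and it satisfies $1 \le r_i' \le r_i$. If $r_i' = r_i$, we say that $\lambda_i$ is nondefective; otherwise, $\lambda_i$ is defective. If $r_i' = r_i$ for all $i$, then $\boldsymbol{A}$ is diagonalizable or nondefective; otherwise, $\boldsymbol{A}$ is nondiagonalizable or defective.
• The set $\{\lambda_1, \ldots, \lambda_s\}$ of all the eigenvalues of $\boldsymbol{A}$ is called the spectrum of $\boldsymbol{A}$ and will be denoted by $\sigma(\boldsymbol{A})$.
• Let us express the characteristic polynomial of $\boldsymbol{A}$ in the form $R(\lambda) = \sum_{i=0}^{s} c_i\lambda^i$, $c_s = 1$. Then
\[
-c_{s-1} = \operatorname{tr}(\boldsymbol{A}) = \sum_{i=1}^{s} \lambda_i
\quad\text{and}\quad
(-1)^s c_0 = \det \boldsymbol{A} = \prod_{i=1}^{s} \lambda_i.
\]
• By the Cayley–Hamilton theorem, $R(\boldsymbol{A}) = \boldsymbol{O}$, where $R(\lambda)$ is the characteristic polynomial of $\boldsymbol{A}$. Thus, with $R(\lambda) = \sum_{i=0}^{s} c_i\lambda^i$, we have $R(\boldsymbol{A}) = \sum_{i=0}^{s} c_i\boldsymbol{A}^i$, with $\boldsymbol{A}^0 \equiv \boldsymbol{I}$.
• If $\boldsymbol{A} \in \mathbb{C}^{r\times s}$ and $\boldsymbol{B} \in \mathbb{C}^{s\times r}$, then
\[
\lambda^s \det(\lambda\boldsymbol{I}_r - \boldsymbol{A}\boldsymbol{B}) = \lambda^r \det(\lambda\boldsymbol{I}_s - \boldsymbol{B}\boldsymbol{A}).
\]
As a result, when $r = s$, $\boldsymbol{A}\boldsymbol{B}$ and $\boldsymbol{B}\boldsymbol{A}$ have the same characteristic polynomials and hence the same eigenvalues. This can be extended to the product of an arbitrary number of square matrices of the same dimension. For example, if $\boldsymbol{A}, \boldsymbol{B}, \boldsymbol{C}$ are all in $\mathbb{C}^{r\times r}$, then $\boldsymbol{A}\boldsymbol{B}\boldsymbol{C}$, $\boldsymbol{C}\boldsymbol{A}\boldsymbol{B}$, and $\boldsymbol{B}\boldsymbol{C}\boldsymbol{A}$ all have the same characteristic polynomials and hence the same eigenvalues.
• If $\boldsymbol{A} \in \mathbb{C}^{s\times s}$ has exactly $s$ linearly independent eigenvectors, then $\boldsymbol{A}$ can be diagonalized by a nonsingular matrix $\boldsymbol{V}$ as in
\[
\boldsymbol{V}^{-1}\boldsymbol{A}\boldsymbol{V} = \operatorname{diag}(\lambda_1, \ldots, \lambda_s).
\]
Here
\[
\boldsymbol{V} = [\,\boldsymbol{v}_1 \,|\, \cdots \,|\, \boldsymbol{v}_s\,], \qquad \boldsymbol{A}\boldsymbol{v}_i = \lambda_i\boldsymbol{v}_i, \quad i = 1, \ldots, s,
\]
and
\[
\boldsymbol{V}^{-1} = [\,\boldsymbol{w}_1 \,|\, \cdots \,|\, \boldsymbol{w}_s\,]^T, \qquad \boldsymbol{w}_i^T\boldsymbol{A} = \lambda_i\boldsymbol{w}_i^T, \quad i = 1, \ldots, s.
\]
In addition, $\boldsymbol{w}_i^T\boldsymbol{v}_j = \delta_{ij}$.⁴
• $\boldsymbol{A} \in \mathbb{C}^{s\times s}$ is a normal matrix if and only if it can be diagonalized by a unitary matrix $\boldsymbol{V} = [\,\boldsymbol{v}_1 \,|\, \cdots \,|\, \boldsymbol{v}_s\,]$, that is,
\[
\boldsymbol{V}^*\boldsymbol{A}\boldsymbol{V} = \operatorname{diag}(\lambda_1, \ldots, \lambda_s), \qquad \boldsymbol{V}^*\boldsymbol{V} = \boldsymbol{V}\boldsymbol{V}^* = \boldsymbol{I}.
\]
Thus, $\boldsymbol{A}\boldsymbol{v}_i = \lambda_i\boldsymbol{v}_i$, $i = 1, \ldots, s$, and $\boldsymbol{v}_i^*\boldsymbol{v}_j = \delta_{ij}$ for all $i, j$. Special cases of normal matrices are Hermitian and skew-Hermitian matrices. (i) If $\boldsymbol{A} \in \mathbb{C}^{s\times s}$ is Hermitian, then all its eigenvalues are real. (ii) If $\boldsymbol{A} \in \mathbb{C}^{s\times s}$ is skew Hermitian, then all its eigenvalues are purely imaginary or zero.
• The spectral radius of a square matrix $\boldsymbol{A}$ is defined as
\[
\rho(\boldsymbol{A}) = \max_i |\lambda_i(\boldsymbol{A})|, \qquad \lambda_i(\boldsymbol{A}) \text{ eigenvalues of } \boldsymbol{A}.
\]
• A square matrix $\boldsymbol{A}$ is said to be positive definite if $\boldsymbol{x}^*\boldsymbol{A}\boldsymbol{x} > 0$ for all $\boldsymbol{x} \ne \boldsymbol{0}$. It is said to be positive semidefinite if $\boldsymbol{x}^*\boldsymbol{A}\boldsymbol{x} \ge 0$ for all $\boldsymbol{x} \ne \boldsymbol{0}$. A matrix $\boldsymbol{A}$ is positive definite (positive semidefinite) if and only if (i) it is Hermitian and (ii) all its eigenvalues are positive (nonnegative). Note that the requirement that $\boldsymbol{x}^*\boldsymbol{A}\boldsymbol{x}$ be real for all $\boldsymbol{x}$ already forces $\boldsymbol{A}$ to be Hermitian.
• If $\boldsymbol{A} \in \mathbb{C}^{r\times s}$, then $\boldsymbol{A}^*\boldsymbol{A} \in \mathbb{C}^{s\times s}$ and $\boldsymbol{A}\boldsymbol{A}^* \in \mathbb{C}^{r\times r}$ are both Hermitian and positive semidefinite. If $r \ge s$ and $\operatorname{rank}(\boldsymbol{A}) = s$, $\boldsymbol{A}^*\boldsymbol{A}$ is positive definite. If $r \le s$ and $\operatorname{rank}(\boldsymbol{A}) = r$, $\boldsymbol{A}\boldsymbol{A}^*$ is positive definite.
• The singular values of a matrix $\boldsymbol{A} \in \mathbb{C}^{r\times s}$ are defined as
\[
\sigma_i(\boldsymbol{A}) = \sqrt{\lambda_i(\boldsymbol{A}^*\boldsymbol{A})} = \sqrt{\lambda_i(\boldsymbol{A}\boldsymbol{A}^*)}, \qquad i = 1, \ldots, \min\{r, s\}.
\]
The eigenvectors of $\boldsymbol{A}^*\boldsymbol{A}$ (of $\boldsymbol{A}\boldsymbol{A}^*$) are called the right (left) singular vectors of $\boldsymbol{A}$. Thus, $\sigma_i(\boldsymbol{A}) = |\lambda_i(\boldsymbol{A})|$ if $\boldsymbol{A}$ is normal.

⁴ When $\boldsymbol{A}$ is nondiagonalizable or defective, there is a nonsingular matrix $\boldsymbol{V}$ such that the matrix $\boldsymbol{J} = \boldsymbol{V}^{-1}\boldsymbol{A}\boldsymbol{V}$, called the Jordan canonical form of $\boldsymbol{A}$, is almost diagonal. We will deal with this general case in detail later via Theorem 0.1 in Section 0.4.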
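Several of the eigenvalue facts above can be checked numerically in the $2 \times 2$ case, where the characteristic polynomial is $\lambda^2 - \operatorname{tr}(\boldsymbol{A})\lambda + \det \boldsymbol{A}$ and its roots are available in closed form. The following is a small sketch under that restriction; the helper names (`eig2`, `matmul2`, `spectral_radius2`) and the two test matrices are assumptions made for illustration.

```python
import cmath


def eig2(A):
    """Eigenvalues of a 2 x 2 matrix as roots of lambda^2 - tr*lambda + det."""
    (a, b), (c, d) = A
    tr, det = a + d, a * d - b * c
    disc = cmath.sqrt(tr * tr - 4 * det)
    return (tr + disc) / 2, (tr - disc) / 2


def matmul2(A, B):
    """Product of two 2 x 2 matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def spectral_radius2(A):
    """rho(A) = max |lambda_i(A)| for a 2 x 2 matrix."""
    return max(abs(lam) for lam in eig2(A))


A = [[2.0, 1.0], [1.0, 3.0]]    # real symmetric, hence Hermitian
B = [[0.0, 1.0], [-1.0, 0.0]]   # real skew symmetric, hence skew Hermitian

lam_A = eig2(A)
lam_B = eig2(B)
lam_AB = eig2(matmul2(A, B))
lam_BA = eig2(matmul2(B, A))
print(lam_A, lam_B, lam_AB, lam_BA, spectral_radius2(A))
```

For these matrices the eigenvalues of `A` are real, those of `B` are purely imaginary, and `AB` and `BA` share the same pair of eigenvalues even though the two products are different matrices.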


Vector norms

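• We use $\|\cdot\|$ to denote vector norms in $\mathbb{C}^s$. Vector norms satisfy the following conditions:
1. $\|\boldsymbol{x}\| \ge 0$ for all $\boldsymbol{x} \in \mathbb{C}^s$; $\|\boldsymbol{x}\| = 0$ if and only if $\boldsymbol{x} = \boldsymbol{0}$.
2. $\|\gamma\boldsymbol{x}\| = |\gamma|\,\|\boldsymbol{x}\|$ for every $\gamma \in \mathbb{C}$ and $\boldsymbol{x} \in \mathbb{C}^s$.
3. $\|\boldsymbol{x} + \boldsymbol{y}\| \le \|\boldsymbol{x}\| + \|\boldsymbol{y}\|$ for every $\boldsymbol{x}, \boldsymbol{y} \in \mathbb{C}^s$.
• With $\boldsymbol{x} = [x^{(1)}, \ldots, x^{(s)}]^T$, the $l_p$-norm in $\mathbb{C}^s$ is defined via
\[
\|\boldsymbol{x}\|_p = \Bigl(\sum_{i=1}^{s} |x^{(i)}|^p\Bigr)^{1/p}, \quad 1 \le p < \infty;
\qquad
\|\boldsymbol{x}\|_\infty = \max_i |x^{(i)}|, \quad p = \infty.
\]
Thus, the $l_2$-norm (also called the Euclidean norm) is simply $\|\boldsymbol{x}\|_2 = \sqrt{\boldsymbol{x}^*\boldsymbol{x}}$.
• If $\boldsymbol{A} \in \mathbb{C}^{r\times s}$, $r \ge s$, is unitary and $\boldsymbol{x} \in \mathbb{C}^s$, then $\|\boldsymbol{A}\boldsymbol{x}\|_2 = \|\boldsymbol{x}\|_2$.
• If $\boldsymbol{A} \in \mathbb{C}^{r\times s}$ and $\boldsymbol{x} \in \mathbb{C}^s$, then
\[
\min_i \sigma_i(\boldsymbol{A}) \le \frac{\|\boldsymbol{A}\boldsymbol{x}\|_2}{\|\boldsymbol{x}\|_2} \le \max_i \sigma_i(\boldsymbol{A}).
\]
• If $\boldsymbol{A} \in \mathbb{C}^{s\times s}$ is normal and $\boldsymbol{x} \in \mathbb{C}^s$, then
\[
\min_i |\lambda_i(\boldsymbol{A})| \le \frac{\|\boldsymbol{A}\boldsymbol{x}\|_2}{\|\boldsymbol{x}\|_2} \le \max_i |\lambda_i(\boldsymbol{A})|.
\]
• The following is known as the Hölder inequality:
\[
|\boldsymbol{x}^*\boldsymbol{y}| \le \|\boldsymbol{x}\|_p \|\boldsymbol{y}\|_q, \qquad \frac{1}{p} + \frac{1}{q} = 1, \qquad 1 \le p, q \le \infty.
\]
The special case $p = q = 2$ of the Hölder inequality is called the Cauchy–Schwarz inequality.
• If $\|\cdot\|$ is a norm on $\mathbb{C}^s$ and $\boldsymbol{A} \in \mathbb{C}^{s\times s}$ is nonsingular, then $\|\boldsymbol{x}\|' \equiv \|\boldsymbol{A}\boldsymbol{x}\|$ is a norm on $\mathbb{C}^s$ too.
• If $\|\cdot\|$ is a norm on $\mathbb{C}^s$, then $\|\boldsymbol{x}\|$ is a continuous function of $\boldsymbol{x}$. Thus, with $\boldsymbol{x} = [x^{(1)}, \ldots, x^{(s)}]^T$, $\|\boldsymbol{x}\|$ is a continuous function in the $s$ scalar variables $x^{(1)}, \ldots, x^{(s)}$.
• All vector norms are equivalent. That is, given any two vector norms $\|\cdot\|_{(a)}$ and $\|\cdot\|_{(b)}$ in $\mathbb{C}^s$, there exist positive constants $C_{ab}$ and $D_{ab}$ such that
\[
C_{ab}\|\boldsymbol{x}\|_{(a)} \le \|\boldsymbol{x}\|_{(b)} \le D_{ab}\|\boldsymbol{x}\|_{(a)} \quad \forall\, \boldsymbol{x} \in \mathbb{C}^s.
\]
This implies that if a vector sequence $\{\boldsymbol{x}_m\}$ converges to $\boldsymbol{s}$ in one norm, it converges to $\boldsymbol{s}$ in every norm. Thus, if $\lim_{m\to\infty}\|\boldsymbol{x}_m - \boldsymbol{s}\|_{(a)} = 0$, then $\lim_{m\to\infty}\|\boldsymbol{x}_m - \boldsymbol{s}\|_{(b)} = 0$ too, and vice versa, because
\[
C_{ab}\|\boldsymbol{x}_m - \boldsymbol{s}\|_{(a)} \le \|\boldsymbol{x}_m - \boldsymbol{s}\|_{(b)} \le D_{ab}\|\boldsymbol{x}_m - \boldsymbol{s}\|_{(a)}.
\]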
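The $l_p$-norms, the Hölder inequality, and one instance of norm equivalence can be checked on a small complex example. The sketch below is illustrative only: the function names and the sample vectors are assumptions, and the equivalence constants shown ($1$ and $\sqrt{s}$ between the $l_\infty$- and $l_2$-norms) are one standard pair.

```python
def norm_p(x, p):
    """l_p-norm of a list of (complex) numbers; p may be float('inf')."""
    if p == float("inf"):
        return max(abs(t) for t in x)
    return sum(abs(t) ** p for t in x) ** (1.0 / p)


def inner(x, y):
    """Euclidean inner product x^* y (conjugate taken on the first argument)."""
    return sum(a.conjugate() * b for a, b in zip(x, y))


x = [3 + 4j, 0j, -1j]
y = [1 + 0j, 2j, 2 + 0j]
inf = float("inf")

# Hoelder pairs (p, q) with 1/p + 1/q = 1, including the Cauchy-Schwarz case p = q = 2.
pairs = [(1.0, inf), (2.0, 2.0), (3.0, 1.5), (inf, 1.0)]
holder_ok = all(abs(inner(x, y)) <= norm_p(x, p) * norm_p(y, q) for p, q in pairs)

# Equivalence of the l_inf- and l_2-norms on C^s: ||x||_inf <= ||x||_2 <= sqrt(s) ||x||_inf.
s = len(x)
equiv_ok = norm_p(x, inf) <= norm_p(x, 2.0) <= s ** 0.5 * norm_p(x, inf)

print(norm_p(x, 1.0), norm_p(x, 2.0), norm_p(x, inf), holder_ok, equiv_ok)
```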


Matrix norms

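• Matrix norms will also be denoted by $\|\cdot\|$. Since matrices in $\mathbb{C}^{r\times s}$ can be viewed also as vectors in $\mathbb{C}^{rs}$ (that is, the matrix space $\mathbb{C}^{r\times s}$ is isomorphic to the vector space $\mathbb{C}^{rs}$), we can define matrix norms just as we define vector norms, by the following three conditions:
1. $\|\boldsymbol{A}\| \ge 0$ for all $\boldsymbol{A} \in \mathbb{C}^{r\times s}$; $\|\boldsymbol{A}\| = 0$ if and only if $\boldsymbol{A} = \boldsymbol{O}$.
2. $\|\gamma\boldsymbol{A}\| = |\gamma|\,\|\boldsymbol{A}\|$ for every $\gamma \in \mathbb{C}$ and $\boldsymbol{A} \in \mathbb{C}^{r\times s}$.
3. $\|\boldsymbol{A} + \boldsymbol{B}\| \le \|\boldsymbol{A}\| + \|\boldsymbol{B}\|$ for every $\boldsymbol{A}, \boldsymbol{B} \in \mathbb{C}^{r\times s}$.
The matrix norms we will be using are generally the natural norms (or induced norms or subordinate norms) that are defined via
\[
\|\boldsymbol{A}\|_{(a,b)} = \max_{\boldsymbol{x} \ne \boldsymbol{0}} \frac{\|\boldsymbol{A}\boldsymbol{x}\|_{(a)}}{\|\boldsymbol{x}\|_{(b)}},
\]
where $\boldsymbol{A} \in \mathbb{C}^{r\times s}$ and $\boldsymbol{x} \in \mathbb{C}^s$, and $\|\boldsymbol{A}\boldsymbol{x}\|_{(a)}$ and $\|\boldsymbol{x}\|_{(b)}$ are the vector norms in $\mathbb{C}^r$ and $\mathbb{C}^s$, respectively. Note that this maximum exists and is achieved for some nonzero vector $\boldsymbol{x}_0$. We say that the matrix norm $\|\cdot\|_{(a,b)}$ is induced by, or is subordinate to, the vector norms $\|\cdot\|_{(a)}$ and $\|\cdot\|_{(b)}$. (Here, $\|\cdot\|_{(a)}$ and $\|\cdot\|_{(b)}$ are not to be confused with the $l_p$-norms.) With this notation, induced matrix norms satisfy the following fourth condition, in addition to the above three:
4. $\|\boldsymbol{A}\boldsymbol{B}\|_{(a,c)} \le \|\boldsymbol{A}\|_{(a,b)}\|\boldsymbol{B}\|_{(b,c)}$, with $\boldsymbol{A} \in \mathbb{C}^{r\times s}$ and $\boldsymbol{B} \in \mathbb{C}^{s\times t}$.
There are matrix norms that are not natural norms and that satisfy the fourth condition. Matrix norms, whether natural or not, that satisfy the fourth condition are said to be multiplicative.
• In view of the definition above, natural norms satisfy
\[
\|\boldsymbol{A}\boldsymbol{x}\|_{(a)} \le \|\boldsymbol{A}\|_{(a,b)}\|\boldsymbol{x}\|_{(b)} \quad \text{for all } \boldsymbol{x}.
\]
In addition,
\[
\|\boldsymbol{A}\boldsymbol{x}\|_{(a)} \le M\|\boldsymbol{x}\|_{(b)} \ \text{ for all } \boldsymbol{x}
\quad\Rightarrow\quad
\|\boldsymbol{A}\|_{(a,b)} \le M.
\]
• When $\boldsymbol{A}$ is a square matrix and the vector norms $\|\cdot\|_{(a)}$ and $\|\cdot\|_{(b)}$ are the same, we let $\|\boldsymbol{A}\|_{(a)}$ stand for $\|\boldsymbol{A}\|_{(a,a)}$. In this case, we have
\[
\|\boldsymbol{A}\boldsymbol{x}\|_{(a)} \le \|\boldsymbol{A}\|_{(a)}\|\boldsymbol{x}\|_{(a)} \quad \text{for all } \boldsymbol{x}.
\]
Also, we say that the matrix norm $\|\cdot\|_{(a)}$ is induced by the vector norm $\|\cdot\|_{(a)}$.
• In $\mathbb{C}^{s\times s}$, for any natural norm $\|\boldsymbol{A}\|_{(a,a)} \equiv \|\boldsymbol{A}\|_{(a)}$, we have $\|\boldsymbol{I}\|_{(a)} = 1$.
• If $\boldsymbol{A}_i \in \mathbb{C}^{s\times s}$, $i = 1, \ldots, k$, and $\|\cdot\|$ is a multiplicative norm on $\mathbb{C}^{s\times s}$, then
\[
\|\boldsymbol{A}_1 \cdots \boldsymbol{A}_k\| \le \prod_{i=1}^{k} \|\boldsymbol{A}_i\|
\quad\text{and}\quad
\|\boldsymbol{I}\| \ge 1.
\]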


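• With $\boldsymbol{A} \in \mathbb{C}^{r\times s}$ as in (0.2), if we let $\|\boldsymbol{A}\boldsymbol{x}\|$ and $\|\boldsymbol{x}\|$ be the vector $l_p$-norms in $\mathbb{C}^r$ and $\mathbb{C}^s$, respectively, the natural norm $\|\boldsymbol{A}\|$ of $\boldsymbol{A}$ becomes
\[
\begin{aligned}
\|\boldsymbol{A}\| &= \|\boldsymbol{A}\|_1 = \max_{1\le j\le s} \sum_{i=1}^{r} |a_{ij}|, & p &= 1,\\
\|\boldsymbol{A}\| &= \|\boldsymbol{A}\|_\infty = \max_{1\le i\le r} \sum_{j=1}^{s} |a_{ij}|, & p &= \infty,\\
\|\boldsymbol{A}\| &= \|\boldsymbol{A}\|_2 = \max_i \sigma_i(\boldsymbol{A}), & p &= 2; \quad \sigma_i(\boldsymbol{A}) \text{ singular values of } \boldsymbol{A}.
\end{aligned}
\]
In view of these, we have the following:
\[
\begin{aligned}
&\|\boldsymbol{P}\|_p = 1,\ 1 \le p \le \infty, && \text{if } \boldsymbol{P} \text{ is a permutation matrix.}\\
&\|\boldsymbol{A}\boldsymbol{P}\|_1 = \|\boldsymbol{A}\|_1 && \text{if } \boldsymbol{P} \text{ is a permutation matrix.}\\
&\|\boldsymbol{P}\boldsymbol{A}\|_\infty = \|\boldsymbol{A}\|_\infty && \text{if } \boldsymbol{P} \text{ is a permutation matrix.}\\
&\|\boldsymbol{U}\|_2 = 1 && \text{if } \boldsymbol{U} \in \mathbb{C}^{r\times s} \text{ and is unitary.}\\
&\|\boldsymbol{U}\boldsymbol{A}\|_2 = \|\boldsymbol{A}\|_2 && \text{if } \boldsymbol{A} \in \mathbb{C}^{r\times s},\ \boldsymbol{U} \in \mathbb{C}^{r\times r} \text{ and is unitary.}\\
&\|\boldsymbol{U}\boldsymbol{A}\boldsymbol{V}\|_2 = \|\boldsymbol{A}\|_2 && \text{if } \boldsymbol{A} \in \mathbb{C}^{r\times s},\ \boldsymbol{U} \in \mathbb{C}^{r\times r},\ \boldsymbol{V} \in \mathbb{C}^{s\times s}, \text{ and } \boldsymbol{U}, \boldsymbol{V} \text{ are unitary.}\\
&\|\boldsymbol{A}\|_2 = \max_i |\lambda_i(\boldsymbol{A})| = \rho(\boldsymbol{A}) && \text{if } \boldsymbol{A} \in \mathbb{C}^{s\times s} \text{ and is normal.}
\end{aligned}
\]
• A matrix norm that is multiplicative but not natural and that is used frequently in applications is the Frobenius or Schur norm. For $\boldsymbol{A} \in \mathbb{C}^{r\times s}$, $\boldsymbol{A} = [a_{ij}]_{\substack{1\le i\le r \\ 1\le j\le s}}$ as in (0.2), this norm is defined by
\[
\|\boldsymbol{A}\|_F = \sqrt{\sum_{i=1}^{r}\sum_{j=1}^{s} |a_{ij}|^2} = \sqrt{\operatorname{tr}(\boldsymbol{A}^*\boldsymbol{A})} = \sqrt{\operatorname{tr}(\boldsymbol{A}\boldsymbol{A}^*)}.
\]
We also have
\[
\|\boldsymbol{U}\boldsymbol{A}\boldsymbol{V}\|_F = \|\boldsymbol{A}\|_F \quad \text{if } \boldsymbol{U} \in \mathbb{C}^{r\times r} \text{ and } \boldsymbol{V} \in \mathbb{C}^{s\times s} \text{ are unitary.}
\]
• The natural matrix norms and spectral radii for square matrices satisfy
\[
\rho(\boldsymbol{A}) \le \|\boldsymbol{A}\|.
\]
In addition, given $\varepsilon > 0$, there exists a vector norm $\|\cdot\|$ that depends on $\boldsymbol{A}$ and $\varepsilon$ such that the matrix norm induced by it satisfies
\[
\|\boldsymbol{A}\| \le \rho(\boldsymbol{A}) + \varepsilon.
\]
The two inequalities between $\|\boldsymbol{A}\|$ and $\rho(\boldsymbol{A})$ become an equality when we have the following: (i) $\boldsymbol{A}$ is normal and $\|\boldsymbol{A}\| = \|\boldsymbol{A}\|_2$, since $\|\boldsymbol{A}\|_2 = \rho(\boldsymbol{A})$ in this case. (ii) $\boldsymbol{A} = \operatorname{diag}(d_1, \ldots, d_s)$ and $\|\boldsymbol{A}\| = \|\boldsymbol{A}\|_p$, with arbitrary $p$, for, in this case,
\[
\|\boldsymbol{A}\|_p = \max_{1\le i\le s} |d_i| = \rho(\boldsymbol{A}), \qquad 1 \le p \le \infty.
\]
• The natural matrix norms and spectral radii for square matrices also satisfy
\[
\rho(\boldsymbol{A}) = \lim_{k\to\infty} \|\boldsymbol{A}^k\|^{1/k}.
\]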
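The explicit formulas for $\|\boldsymbol{A}\|_1$, $\|\boldsymbol{A}\|_\infty$, and $\|\boldsymbol{A}\|_F$, the bound $\rho(\boldsymbol{A}) \le \|\boldsymbol{A}\|$, and the limit $\rho(\boldsymbol{A}) = \lim_{k\to\infty}\|\boldsymbol{A}^k\|^{1/k}$ can be observed on a small example. The sketch below assumes a $2 \times 2$ upper triangular matrix, whose spectral radius can be read off its diagonal; the function names and the choice $k = 200$ are assumptions for illustration.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]


def norm_1(A):
    """Maximum absolute column sum."""
    return max(sum(abs(row[j]) for row in A) for j in range(len(A[0])))


def norm_inf(A):
    """Maximum absolute row sum."""
    return max(sum(abs(a) for a in row) for row in A)


def norm_fro(A):
    """Frobenius norm: square root of the sum of squared moduli of all entries."""
    return sum(abs(a) ** 2 for row in A for a in row) ** 0.5


A = [[0.5, 1.0], [0.0, 0.5]]   # defective; eigenvalues are the diagonal entries
rho = 0.5                      # spectral radius, read off the diagonal

k = 200
P = [[1.0, 0.0], [0.0, 1.0]]
for _ in range(k):
    P = matmul(P, A)           # P = A^k after the loop
gelfand = norm_inf(P) ** (1.0 / k)

print(norm_1(A), norm_inf(A), norm_fro(A), gelfand)
```

Here every norm of `A` exceeds one although $\rho(\boldsymbol{A}) = 0.5$, yet $\|\boldsymbol{A}^k\|_\infty^{1/k}$ approaches $0.5$ as $k$ grows. The approach is slow because the matrix is defective.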


Condition numbers

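• The condition number of a nonsingular square matrix $\boldsymbol{A}$ relative to a natural matrix norm $\|\cdot\|$ is defined by
\[
\kappa(\boldsymbol{A}) = \|\boldsymbol{A}\|\,\|\boldsymbol{A}^{-1}\|.
\]
• Relative to every natural norm, $\kappa(\boldsymbol{A}) \ge 1$.
• If the natural norm is induced by the vector $l_p$-norm, we denote $\kappa(\boldsymbol{A})$ by $\kappa_p(\boldsymbol{A})$. For $l_2$-norms, we have the following:
\[
\kappa_2(\boldsymbol{A}) = \frac{\sigma_{\max}(\boldsymbol{A})}{\sigma_{\min}(\boldsymbol{A})}, \quad \boldsymbol{A} \text{ arbitrary},
\qquad\text{and}\qquad
\kappa_2(\boldsymbol{A}) = \frac{|\lambda_{\max}(\boldsymbol{A})|}{|\lambda_{\min}(\boldsymbol{A})|}, \quad \boldsymbol{A} \text{ normal}.
\]
Here $\sigma_{\min}(\boldsymbol{A})$ and $\sigma_{\max}(\boldsymbol{A})$ are the smallest and largest singular values of $\boldsymbol{A}$. Similarly, $\lambda_{\min}(\boldsymbol{A})$ and $\lambda_{\max}(\boldsymbol{A})$ are the smallest and largest eigenvalues of $\boldsymbol{A}$ in modulus.
• If $\boldsymbol{A} = [a_{ij}]_{1\le i,j\le s}$ is upper or lower triangular and nonsingular, then $a_{ii} \ne 0$, $1 \le i \le s$, necessarily, and
\[
\kappa_p(\boldsymbol{A}) \ge \frac{\max_i |a_{ii}|}{\min_i |a_{ii}|}, \qquad 1 \le p \le \infty.
\]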
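As a quick check of the triangular lower bound on $\kappa_p(\boldsymbol{A})$, the following sketch computes $\kappa_1$ and $\kappa_\infty$ for a $2 \times 2$ upper triangular matrix using the explicit $2 \times 2$ inverse. The matrix and the helper names are assumptions chosen for illustration.

```python
def inv2(A):
    """Inverse of a nonsingular 2 x 2 matrix via the adjugate formula."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def norm_1(A):
    """Maximum absolute column sum."""
    return max(sum(abs(row[j]) for row in A) for j in range(len(A[0])))


def norm_inf(A):
    """Maximum absolute row sum."""
    return max(sum(abs(a) for a in row) for row in A)


def cond(A, norm):
    """kappa(A) = ||A|| * ||A^{-1}|| relative to the given norm function."""
    return norm(A) * norm(inv2(A))


A = [[4.0, 1.0], [0.0, 0.5]]   # upper triangular with nonzero diagonal
diag_ratio = max(abs(A[0][0]), abs(A[1][1])) / min(abs(A[0][0]), abs(A[1][1]))

kappa_1 = cond(A, norm_1)
kappa_inf = cond(A, norm_inf)
print(kappa_1, kappa_inf, diag_ratio)
```

For this matrix the diagonal ratio is $8$, while both condition numbers equal $10$, consistent with the bound.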

Inner products

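• We will use $(\cdot\,, \cdot)$ to denote inner products (or scalar products). Thus, $(\boldsymbol{x}, \boldsymbol{y})$, with $\boldsymbol{x}, \boldsymbol{y} \in \mathbb{C}^s$, denotes an inner product in $\mathbb{C}^s$. Inner products satisfy the following conditions:
1. $(\boldsymbol{y}, \boldsymbol{x}) = \overline{(\boldsymbol{x}, \boldsymbol{y})}$ for all $\boldsymbol{x}, \boldsymbol{y} \in \mathbb{C}^s$.
2. $(\boldsymbol{x}, \boldsymbol{x}) \ge 0$ for all $\boldsymbol{x} \in \mathbb{C}^s$ and $(\boldsymbol{x}, \boldsymbol{x}) = 0$ if and only if $\boldsymbol{x} = \boldsymbol{0}$.
3. $(\alpha\boldsymbol{x}, \beta\boldsymbol{y}) = \overline{\alpha}\beta(\boldsymbol{x}, \boldsymbol{y})$ for $\boldsymbol{x}, \boldsymbol{y} \in \mathbb{C}^s$ and $\alpha, \beta \in \mathbb{C}$.
4. $(\boldsymbol{x}, \beta\boldsymbol{y} + \gamma\boldsymbol{z}) = \beta(\boldsymbol{x}, \boldsymbol{y}) + \gamma(\boldsymbol{x}, \boldsymbol{z})$ for $\boldsymbol{x}, \boldsymbol{y}, \boldsymbol{z} \in \mathbb{C}^s$ and $\beta, \gamma \in \mathbb{C}$.
• We say that the vectors $\boldsymbol{x}$ and $\boldsymbol{y}$ are orthogonal to each other if $(\boldsymbol{x}, \boldsymbol{y}) = 0$.
• An inner product $(\cdot\,, \cdot)$ in $\mathbb{C}^s$ can also be used to define a vector norm $\|\cdot\|$ in $\mathbb{C}^s$ as
\[
\|\boldsymbol{x}\| = \sqrt{(\boldsymbol{x}, \boldsymbol{x})}.
\]
• For any inner product $(\cdot\,, \cdot)$ and vector norm $\|\cdot\|$ induced by it, and any two vectors $\boldsymbol{x}, \boldsymbol{y} \in \mathbb{C}^s$, we have
\[
|(\boldsymbol{x}, \boldsymbol{y})| \le \|\boldsymbol{x}\|\,\|\boldsymbol{y}\|.
\]
Equality holds if and only if $\boldsymbol{x}$ and $\boldsymbol{y}$ are linearly dependent. This is a more general version of the Cauchy–Schwarz inequality mentioned earlier.
• The standard Euclidean inner product in $\mathbb{C}^s$ and the vector norm induced by it are defined by, respectively,
\[
(\boldsymbol{x}, \boldsymbol{y}) = \boldsymbol{x}^*\boldsymbol{y} \equiv \langle \boldsymbol{x}, \boldsymbol{y} \rangle
\quad\text{and}\quad
\|\boldsymbol{z}\| = \sqrt{\langle \boldsymbol{z}, \boldsymbol{z} \rangle} \equiv \|\boldsymbol{z}\|_2.
\]
The standard Euclidean inner product is used to define the angle $\angle(\boldsymbol{x}, \boldsymbol{y})$ between two nonzero vectors $\boldsymbol{x}$ and $\boldsymbol{y}$ as follows:
\[
\cos \angle(\boldsymbol{x}, \boldsymbol{y}) = \frac{|\langle \boldsymbol{x}, \boldsymbol{y} \rangle|}{\|\boldsymbol{x}\|_2\,\|\boldsymbol{y}\|_2}, \qquad 0 \le \angle(\boldsymbol{x}, \boldsymbol{y}) \le \frac{\pi}{2}.
\]
• The most general inner product in $\mathbb{C}^s$ and the vector norm induced by it are, respectively,
\[
(\boldsymbol{x}, \boldsymbol{y}) = \boldsymbol{x}^*\boldsymbol{M}\boldsymbol{y} \equiv (\boldsymbol{x}, \boldsymbol{y})_{\boldsymbol{M}}
\quad\text{and}\quad
\|\boldsymbol{z}\| = \sqrt{(\boldsymbol{z}, \boldsymbol{z})_{\boldsymbol{M}}} \equiv \|\boldsymbol{z}\|_{\boldsymbol{M}},
\]
where $\boldsymbol{M} \in \mathbb{C}^{s\times s}$ is Hermitian positive definite. Such an inner product is also called a weighted inner product. (Throughout this book, we will be using both the Euclidean inner product and the weighted inner product and the norms induced by them. Normally, we will use the notation $(\boldsymbol{y}, \boldsymbol{z})$ and $\|\boldsymbol{z}\|$ for all inner products and the norms induced by them, whether weighted or not. We will use the notation $\langle \boldsymbol{y}, \boldsymbol{z} \rangle = \boldsymbol{y}^*\boldsymbol{z}$ and $\|\boldsymbol{z}\|_2 = \sqrt{\langle \boldsymbol{z}, \boldsymbol{z} \rangle}$ to avoid confusion when both weighted and standard Euclidean inner products and norms induced by them are being used simultaneously.)
• Unitary matrices preserve the Euclidean inner product $\langle \cdot\,, \cdot \rangle$. That is, if $\boldsymbol{U}$ is unitary, then $\langle \boldsymbol{U}\boldsymbol{x}, \boldsymbol{U}\boldsymbol{y} \rangle = \langle \boldsymbol{x}, \boldsymbol{y} \rangle$.

Linear least-squares problems

The least-squares solution $\tilde{\boldsymbol{x}}$ to the problem $\boldsymbol{A}\boldsymbol{x} = \boldsymbol{b}$, where $\boldsymbol{A} \in \mathbb{C}^{r\times s}$, $\boldsymbol{b} \in \mathbb{C}^r$, and $\boldsymbol{x} \in \mathbb{C}^s$, is defined to be the solution to the minimization problem
\[
\min_{\boldsymbol{x}} \|\boldsymbol{A}\boldsymbol{x} - \boldsymbol{b}\|,
\]
where $\|\boldsymbol{z}\| = \sqrt{(\boldsymbol{z}, \boldsymbol{z})}$, $(\cdot\,, \cdot)$ being an arbitrary inner product in $\mathbb{C}^r$.

• Let the column partitioning of $\boldsymbol{A}$ be $\boldsymbol{A} = [\,\boldsymbol{a}_1 \,|\, \boldsymbol{a}_2 \,|\, \cdots \,|\, \boldsymbol{a}_s\,]$. It is known that $\tilde{\boldsymbol{x}}$ also satisfies the normal equations
\[
\sum_{j=1}^{s} (\boldsymbol{a}_i, \boldsymbol{a}_j)\,\tilde{x}^{(j)} = (\boldsymbol{a}_i, \boldsymbol{b}), \qquad i = 1, \ldots, s.
\]
• When $(\boldsymbol{y}, \boldsymbol{z}) = \boldsymbol{y}^*\boldsymbol{z}$, the normal equations become $\boldsymbol{A}^*\boldsymbol{A}\tilde{\boldsymbol{x}} = \boldsymbol{A}^*\boldsymbol{b}$. If $r \ge s$ and $\operatorname{rank}(\boldsymbol{A}) = s$, $\boldsymbol{A}^*\boldsymbol{A}$ is nonsingular, and hence the solution is given by $\tilde{\boldsymbol{x}} = (\boldsymbol{A}^*\boldsymbol{A})^{-1}\boldsymbol{A}^*\boldsymbol{b}$. In addition, this solution is unique.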
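The normal equations can be formed and solved directly when $\boldsymbol{A}$ has only two columns. The sketch below does this for the Euclidean inner product on a small $3 \times 2$ problem and verifies that the residual is orthogonal to both columns of $\boldsymbol{A}$. The data, the helper names, and the use of Cramer's rule for the $2 \times 2$ system are assumptions for illustration; for larger or ill-conditioned problems other solution techniques are normally preferred.

```python
def inner(u, v):
    """Euclidean inner product u^* v."""
    return sum(a.conjugate() * b for a, b in zip(u, v))


def least_squares_2col(a1, a2, b):
    """Solve the 2 x 2 normal equations sum_j (a_i, a_j) x_j = (a_i, b) by Cramer's rule."""
    g11, g12 = inner(a1, a1), inner(a1, a2)
    g21, g22 = inner(a2, a1), inner(a2, a2)
    r1, r2 = inner(a1, b), inner(a2, b)
    det = g11 * g22 - g12 * g21          # nonzero when a1, a2 are linearly independent
    return [(r1 * g22 - g12 * r2) / det, (g11 * r2 - g21 * r1) / det]


# Columns of a 3 x 2 matrix A of full column rank, and a right-hand side b.
a1 = [1.0, 1.0, 1.0]
a2 = [0.0, 1.0, 2.0]
b = [1.0, 2.0, 2.0]

x_ls = least_squares_2col(a1, a2, b)
residual = [x_ls[0] * p + x_ls[1] * q - t for p, q, t in zip(a1, a2, b)]
print(x_ls, inner(a1, residual), inner(a2, residual))
```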


Some special classes of square matrices

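Next, we discuss some special classes of square matrices $\boldsymbol{A} = [a_{ij}]_{1\le i,j\le s} \in \mathbb{C}^{s\times s}$.

• $\boldsymbol{A}$ is said to be strictly diagonally dominant if it satisfies
\[
|a_{ii}| > \sum_{\substack{j=1 \\ j \ne i}}^{s} |a_{ij}|, \qquad i = 1, \ldots, s.
\]
It is known that if $\boldsymbol{A}$ is strictly diagonally dominant, then $\boldsymbol{A}$ is also nonsingular.
• $\boldsymbol{A}$ is said to be reducible if there exists a permutation matrix $\boldsymbol{P}$ such that
\[
\boldsymbol{P}^T\boldsymbol{A}\boldsymbol{P} = \begin{bmatrix} \boldsymbol{A}_{11} & \boldsymbol{A}_{12} \\ \boldsymbol{O} & \boldsymbol{A}_{22} \end{bmatrix},
\]
where $\boldsymbol{A}_{11}$ and $\boldsymbol{A}_{22}$ are square matrices. Otherwise, $\boldsymbol{A}$ is irreducible. Whether $\boldsymbol{A} \in \mathbb{C}^{s\times s}$ is irreducible or not can also be determined by looking at its directed graph $G(\boldsymbol{A})$. This graph is obtained as follows: Let $P_1, \ldots, P_s$ be $s$ points in the plane; these are called nodes. If $a_{ij} \ne 0$, connect $P_i$ to $P_j$ by a path $\overrightarrow{P_iP_j}$ directed from $P_i$ to $P_j$. $G(\boldsymbol{A})$ is said to be strongly connected if, for any pair of nodes $P_i$ and $P_j$, there exists a directed path $\overrightarrow{P_{l_0}P_{l_1}}, \overrightarrow{P_{l_1}P_{l_2}}, \ldots, \overrightarrow{P_{l_{r-1}}P_{l_r}}$, with $l_0 = i$ and $l_r = j$, connecting $P_i$ to $P_j$. Then $\boldsymbol{A}$ is irreducible if and only if $G(\boldsymbol{A})$ is strongly connected.
• $\boldsymbol{A}$ is said to be irreducibly diagonally dominant if it is irreducible and satisfies
\[
|a_{ii}| \ge \sum_{\substack{j=1 \\ j \ne i}}^{s} |a_{ij}|, \qquad i = 1, \ldots, s,
\]
with strict inequality occurring at least once. It is known that if $\boldsymbol{A}$ is irreducibly diagonally dominant, then $\boldsymbol{A}$ is also nonsingular. In addition, $a_{ii} \ne 0$ for all $i$.
• $\boldsymbol{A}$ is said to be nonnegative (positive) if $a_{ij} \ge 0$ ($a_{ij} > 0$) for all $i, j$, and we write $\boldsymbol{A} \ge \boldsymbol{O}$ ($\boldsymbol{A} > \boldsymbol{O}$). If $\boldsymbol{A}$ is nonnegative, then $\rho(\boldsymbol{A})$ is an eigenvalue of $\boldsymbol{A}$, and the corresponding eigenvector $\boldsymbol{v}$ is nonnegative, that is, $\boldsymbol{v} \ge \boldsymbol{0}$. If $\boldsymbol{A}$ is irreducible, in addition to being nonnegative, then $\rho(\boldsymbol{A})$ is a simple eigenvalue of $\boldsymbol{A}$, and the corresponding eigenvector $\boldsymbol{v}$ is positive, that is, $\boldsymbol{v} > \boldsymbol{0}$.
• $\boldsymbol{A}$ is said to be an M-matrix if (i) $a_{ij} \le 0$ for $i \ne j$ and (ii) $\boldsymbol{A}^{-1} \ge \boldsymbol{O}$. It is known that $\boldsymbol{A}$ is an M-matrix if and only if (i) $a_{ii} > 0$ for all $i$ and (ii) the matrix $\boldsymbol{B} = \boldsymbol{I} - \boldsymbol{D}^{-1}\boldsymbol{A}$, where $\boldsymbol{D} = \operatorname{diag}(\boldsymbol{A})$, satisfies $\rho(\boldsymbol{B}) < 1$.
• $\boldsymbol{A} = \boldsymbol{M} - \boldsymbol{N}$ is said to be a regular splitting of $\boldsymbol{A}$ if $\boldsymbol{M}$ is nonsingular with $\boldsymbol{M}^{-1} \ge \boldsymbol{O}$ and $\boldsymbol{N} \ge \boldsymbol{O}$. Thus, the splitting $\boldsymbol{A} = \boldsymbol{M} - \boldsymbol{N}$, with $\boldsymbol{M} = \operatorname{diag}(\boldsymbol{A})$ and $\boldsymbol{N} = \boldsymbol{M} - \boldsymbol{A}$, is a regular splitting if $\boldsymbol{A}$ is an M-matrix.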
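The definitions of strict diagonal dominance, irreducibility via the directed graph $G(\boldsymbol{A})$, and irreducible diagonal dominance translate directly into small checks. In the sketch below, strong connectivity is tested by a depth-first search from every node; the function names and the two sample matrices are assumptions for illustration.

```python
def off_diagonal_row_sums(A):
    """Sum of |a_ij| over j != i, for each row i."""
    n = len(A)
    return [sum(abs(A[i][j]) for j in range(n) if j != i) for i in range(n)]


def is_strictly_diagonally_dominant(A):
    off = off_diagonal_row_sums(A)
    return all(abs(A[i][i]) > off[i] for i in range(len(A)))


def reachable(A, start):
    """Nodes reachable from `start` in G(A), where an edge i -> j exists if a_ij != 0."""
    seen, stack = {start}, [start]
    while stack:
        i = stack.pop()
        for j in range(len(A)):
            if A[i][j] != 0 and j not in seen:
                seen.add(j)
                stack.append(j)
    return seen


def is_irreducible(A):
    """A is irreducible if and only if G(A) is strongly connected."""
    n = len(A)
    return all(len(reachable(A, i)) == n for i in range(n))


def is_irreducibly_diagonally_dominant(A):
    off = off_diagonal_row_sums(A)
    weak = all(abs(A[i][i]) >= off[i] for i in range(len(A)))
    strict_once = any(abs(A[i][i]) > off[i] for i in range(len(A)))
    return weak and strict_once and is_irreducible(A)


T = [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]   # tridiagonal example
U = [[1, 1], [0, 1]]                         # upper triangular, hence reducible

print(is_strictly_diagonally_dominant(T), is_irreducibly_diagonally_dominant(T),
      is_irreducible(U))
```

The tridiagonal matrix `T` fails strict diagonal dominance in its middle row but is irreducibly diagonally dominant, which illustrates why the second notion is useful.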


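0.3 Fixed-point iterative methods for nonlinear systems

Consider the nonlinear system of equations
\[
\boldsymbol{\psi}(\boldsymbol{x}) = \boldsymbol{0}, \qquad \boldsymbol{\psi} : \mathbb{C}^N \to \mathbb{C}^N, \tag{0.4}
\]
whose solution we denote by $\boldsymbol{s}$. What is meant by $\boldsymbol{x}$, $\boldsymbol{s}$, and $\boldsymbol{\psi}(\boldsymbol{x})$ is
\[
\boldsymbol{x} = [x^{(1)}, \ldots, x^{(N)}]^T, \qquad \boldsymbol{s} = [s^{(1)}, \ldots, s^{(N)}]^T; \qquad x^{(i)}, s^{(i)} \text{ scalars},
\]
and
\[
\boldsymbol{\psi}(\boldsymbol{x}) = \bigl[\psi_1(\boldsymbol{x}), \ldots, \psi_N(\boldsymbol{x})\bigr]^T; \qquad \psi_i(\boldsymbol{x}) = \psi_i\bigl(x^{(1)}, \ldots, x^{(N)}\bigr) \text{ scalar functions}.
\]
Then, starting with a suitable vector $\boldsymbol{x}_0$, an initial approximation to $\boldsymbol{s}$, the sequence $\{\boldsymbol{x}_m\}$ of approximations can be generated by some fixed-point iterative method as
\[
\boldsymbol{x}_{m+1} = \boldsymbol{f}(\boldsymbol{x}_m), \qquad m = 0, 1, \ldots, \tag{0.5}
\]
with
\[
\boldsymbol{f}(\boldsymbol{x}) = \bigl[f_1(\boldsymbol{x}), \ldots, f_N(\boldsymbol{x})\bigr]^T; \qquad f_i(\boldsymbol{x}) = f_i\bigl(x^{(1)}, \ldots, x^{(N)}\bigr) \text{ scalar functions}.
\]
Here $\boldsymbol{x} - \boldsymbol{f}(\boldsymbol{x}) = \boldsymbol{0}$ is a possibly “preconditioned” form of (0.4); hence, it has the same solution $\boldsymbol{s}$ [that is, $\boldsymbol{\psi}(\boldsymbol{s}) = \boldsymbol{0}$ and also $\boldsymbol{s} = \boldsymbol{f}(\boldsymbol{s})$], and, in the case of convergence, $\lim_{m\to\infty}\boldsymbol{x}_m = \boldsymbol{s}$. One possible form of $\boldsymbol{f}(\boldsymbol{x})$ is
\[
\boldsymbol{f}(\boldsymbol{x}) = \boldsymbol{x} + \boldsymbol{C}(\boldsymbol{x})\boldsymbol{\psi}(\boldsymbol{x}),
\]
where $\boldsymbol{C}(\boldsymbol{x})$ is an $N \times N$ matrix such that $\boldsymbol{C}(\boldsymbol{s})$ is nonsingular.

We now want to study the nature of the vectors $\boldsymbol{x}_m$ that arise from the iterative method of (0.5), the function $\boldsymbol{f}(\boldsymbol{x})$ there being nonlinear in general. Assuming that $\lim_{m\to\infty}\boldsymbol{x}_m$ exists, hence that $\boldsymbol{x}_m \approx \boldsymbol{s}$ for all large $m$ [recall that $\boldsymbol{s}$ is the solution to the system $\boldsymbol{\psi}(\boldsymbol{x}) = \boldsymbol{0}$ and hence to the system $\boldsymbol{x} = \boldsymbol{f}(\boldsymbol{x})$], we expand $\boldsymbol{f}(\boldsymbol{x}_m)$ in a Taylor series about $\boldsymbol{s}$. Expanding each of the functions $f_i(\boldsymbol{x}_m)$, we have
\[
f_i(\boldsymbol{x}_m) = f_i(\boldsymbol{s}) + \sum_{j=1}^{N} f_{i,j}(\boldsymbol{s})\bigl(x_m^{(j)} - s^{(j)}\bigr) + O(\|\boldsymbol{x}_m - \boldsymbol{s}\|^2) \quad \text{as } m \to \infty,
\]
where
\[
f_{i,j}(\boldsymbol{s}) = \frac{\partial f_i}{\partial x^{(j)}}\bigg|_{\boldsymbol{x}=\boldsymbol{s}}, \qquad i, j = 1, \ldots, N.
\]
Consequently,
\[
\boldsymbol{x}_{m+1} = \boldsymbol{f}(\boldsymbol{s}) + \boldsymbol{F}(\boldsymbol{s})(\boldsymbol{x}_m - \boldsymbol{s}) + O(\|\boldsymbol{x}_m - \boldsymbol{s}\|^2) \quad \text{as } m \to \infty, \tag{0.6}
\]
where $\boldsymbol{F}(\boldsymbol{x})$ is the Jacobian matrix of the vector-valued function $\boldsymbol{f}(\boldsymbol{x})$ given as
\[
\boldsymbol{F}(\boldsymbol{x}) =
\begin{bmatrix}
f_{1,1}(\boldsymbol{x}) & f_{1,2}(\boldsymbol{x}) & \cdots & f_{1,N}(\boldsymbol{x}) \\
f_{2,1}(\boldsymbol{x}) & f_{2,2}(\boldsymbol{x}) & \cdots & f_{2,N}(\boldsymbol{x}) \\
\vdots & \vdots & & \vdots \\
f_{N,1}(\boldsymbol{x}) & f_{N,2}(\boldsymbol{x}) & \cdots & f_{N,N}(\boldsymbol{x})
\end{bmatrix}. \tag{0.7}
\]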


Recalling that $\boldsymbol{s} = \boldsymbol{f}(\boldsymbol{s})$, we rewrite (0.6) in the form
\[
\boldsymbol{x}_{m+1} = \boldsymbol{s} + \boldsymbol{F}(\boldsymbol{s})(\boldsymbol{x}_m - \boldsymbol{s}) + O(\|\boldsymbol{x}_m - \boldsymbol{s}\|^2) \quad \text{as } m \to \infty. \tag{0.8}
\]
By (0.8), we realize that the vectors $\boldsymbol{x}_m$ and $\boldsymbol{x}_{m+1}$ satisfy the approximate equality
\[
\boldsymbol{x}_{m+1} \approx \boldsymbol{s} + \boldsymbol{F}(\boldsymbol{s})(\boldsymbol{x}_m - \boldsymbol{s}) = \boldsymbol{F}(\boldsymbol{s})\boldsymbol{x}_m + [\boldsymbol{I} - \boldsymbol{F}(\boldsymbol{s})]\boldsymbol{s} \quad \text{for all large } m.
\]
That is, for all large $m$, the sequence $\{\boldsymbol{x}_m\}$ behaves as if it were being generated by an $N$-dimensional linear system of the form $(\boldsymbol{I} - \boldsymbol{T})\boldsymbol{x} = \boldsymbol{d}$ through
\[
\boldsymbol{x}_{m+1} = \boldsymbol{T}\boldsymbol{x}_m + \boldsymbol{d}, \qquad m = 0, 1, \ldots, \tag{0.9}
\]
where $\boldsymbol{T} = \boldsymbol{F}(\boldsymbol{s})$ and $\boldsymbol{d} = [\boldsymbol{I} - \boldsymbol{F}(\boldsymbol{s})]\boldsymbol{s}$. This suggests that we should study those sequences $\{\boldsymbol{x}_m\}$ that arise from linear systems of equations to derive and study vector extrapolation methods. We undertake this task in the next section.

Now, the rate of convergence (to $\boldsymbol{s}$) of the sequence $\{\boldsymbol{x}_m\}$ above is determined by $\rho\bigl(\boldsymbol{F}(\boldsymbol{s})\bigr)$, the spectral radius of $\boldsymbol{F}(\boldsymbol{s})$. It is known that $\rho\bigl(\boldsymbol{F}(\boldsymbol{s})\bigr) < 1$ must hold for convergence to take place and that, the closer $\rho\bigl(\boldsymbol{F}(\boldsymbol{s})\bigr)$ is to zero, the faster the convergence. The rate of convergence deteriorates as $\rho\bigl(\boldsymbol{F}(\boldsymbol{s})\bigr)$ becomes closer to one, however.

As an example, let us consider the cases in which (0.4) and (0.5) arise from finite-difference or finite-element discretizations of continuum problems. For $\boldsymbol{s}$ [the solution to (0.4) and (0.5)] to be a reasonable approximation to the solution of the continuum problem, the mesh size of the discretization must be small enough. However, a small mesh size means a large $N$. In addition, as the mesh size tends to zero, hence $N \to \infty$, generally, $\rho\bigl(\boldsymbol{F}(\boldsymbol{s})\bigr)$ tends to one, as can be shown rigorously in some cases. All this means that, when the mesh size decreases, not only does the dimension of the problem increase, the convergence of the fixed-point method in (0.5) deteriorates as well. As mentioned above, this problem of slow convergence can be treated efficiently via vector extrapolation methods.

Remark: From our discussion of the nature of the iterative methods for nonlinear systems, it is clear that the vector-valued functions $\boldsymbol{\psi}(\boldsymbol{x})$ and $\boldsymbol{f}(\boldsymbol{x})$ above are assumed to be differentiable at least twice in a neighborhood of the solution $\boldsymbol{s}$.

0.4 Fixed-point iterative methods for linear systems

0.4.1 General treatment

Let $A \in \mathbb{C}^{N \times N}$ be a nonsingular matrix and $b \in \mathbb{C}^N$ be a given vector, and consider the linear system of equations
\[ Ax = b, \tag{0.10} \]
whose solution we denote by $s$. As already mentioned, by $s$, $x$, and $b$ we mean
\[ s = [s^{(1)}, \ldots, s^{(N)}]^T, \quad x = [x^{(1)}, \ldots, x^{(N)}]^T, \quad b = [b^{(1)}, \ldots, b^{(N)}]^T; \quad s^{(i)}, x^{(i)}, b^{(i)} \text{ scalars}. \tag{0.11} \]

We now split the matrix as in$^5$
\[ A = M - N, \quad M \text{ nonsingular}. \tag{0.12} \]

$^5$ Note that once $M$ is chosen, $N$ is determined (by $A$ and $M$) as $N = M - A$.

Rewriting (0.10) in the form
\[ Mx = Nx + b \tag{0.13} \]
and choosing an initial vector $x_0$, we generate the vectors $x_1, x_2, \ldots$ by solving (for $x_{m+1}$) the linear systems
\[ M x_{m+1} = N x_m + b, \quad m = 0, 1, \ldots. \tag{0.14} \]
Now, (0.14) can also be written in the form
\[ x_{m+1} = T x_m + d, \quad m = 0, 1, \ldots; \quad T = M^{-1}N, \quad d = M^{-1}b. \tag{0.15} \]
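The iteration (0.15) can be sketched in a few lines of Python. The matrix $A$, the vector $b$, and the choice $M = \mathrm{diag}(A)$ below are hypothetical data picked only for illustration, and the helper names `matvec` and `fixed_point_iterates` are not from the text.

```python
def matvec(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def fixed_point_iterates(T, d, x0, steps):
    """Return [x_0, ..., x_steps] generated by x_{m+1} = T x_m + d, as in (0.15)."""
    xs = [list(x0)]
    for _ in range(steps):
        Tx = matvec(T, xs[-1])
        xs.append([t + di for t, di in zip(Tx, d)])
    return xs


# Hypothetical data: A = [[4, 1], [2, 5]], b = [9, 12]; exact solution s = [11/6, 5/3].
A = [[4.0, 1.0], [2.0, 5.0]]
b = [9.0, 12.0]

# Splitting A = M - N with M = diag(A), so that solving with M is trivial.
T = [[0.0, -A[0][1] / A[0][0]],
     [-A[1][0] / A[1][1], 0.0]]          # T = M^{-1} N, with N = M - A
d = [b[0] / A[0][0], b[1] / A[1][1]]     # d = M^{-1} b

xs = fixed_point_iterates(T, d, [0.0, 0.0], 60)
x_last = xs[-1]
print(x_last)
```

For this particular $T$ the eigenvalues are $\pm\sqrt{0.1}$, so the iterates approach $s$ quickly; a different splitting of the same $A$ would give a different $T$ and a different rate.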

Note that the matrix $T$ cannot have one as an eigenvalue since $A = M(I - T)$ and $A$ is nonsingular. [Since we have to solve the equations in (0.14) many times, we need to choose $M$ such that the solution of these equations is much less expensive than the solution of (0.10).]

Now, we would like the sequence $\{x_m\}$ to converge to $s$. The subject of convergence can be addressed in terms of $\rho(T)$, the spectral radius of $T$, among others. Actually, we have the following result.

Theorem 0.1. Let $s$ be the (unique) solution to the system $x = Tx + d$, where $T$ does not have one as an eigenvalue, and let the sequence $\{x_m\}$ be generated as in
\[ x_{m+1} = T x_m + d, \quad m = 0, 1, \ldots, \tag{0.16} \]
starting with some initial vector $x_0$. A necessary and sufficient condition for $\{x_m\}$ to converge to $s$ from arbitrary $x_0$ is $\rho(T) < 1$.

Proof. Since $s$ is the solution to the system $x = Tx + d$, it satisfies
\[ s = Ts + d. \tag{0.17} \]
Subtracting (0.17) from (0.16), we obtain
\[ x_{m+1} - s = T(x_m - s), \quad m = 0, 1, \ldots. \tag{0.18} \]
Then, by induction, we have
\[ x_m - s = T^m(x_0 - s), \quad m = 0, 1, \ldots. \tag{0.19} \]

Since $x_0$ is arbitrary, so is $x_0 - s$, and hence $\lim_{m\to\infty}(x_m - s) = 0$ with arbitrary $x_0$ if and only if $\lim_{m\to\infty} T^m = O$.

We now turn to the study of $T^m$. For this, we will make use of the Jordan factorization of $T$ given as
\[ T = V J V^{-1}, \tag{0.20} \]
where $V$ is a nonsingular matrix and $J$ is the Jordan canonical form of $T$ and is a block diagonal matrix given by
\[ J = \begin{bmatrix} J_{r_1}(\lambda_1) & & & \\ & J_{r_2}(\lambda_2) & & \\ & & \ddots & \\ & & & J_{r_q}(\lambda_q) \end{bmatrix}, \tag{0.21} \]
with the Jordan blocks $J_{r_i}(\lambda_i)$ defined as
\[ J_1(\lambda) = [\lambda] \in \mathbb{C}^{1\times 1}, \qquad J_r(\lambda) = \begin{bmatrix} \lambda & 1 & & \\ & \lambda & \ddots & \\ & & \ddots & 1 \\ & & & \lambda \end{bmatrix} \in \mathbb{C}^{r\times r}, \quad r > 1. \tag{0.22} \]
Here $\lambda_i$ are the (not necessarily distinct) eigenvalues of $T$ and $\sum_{i=1}^q r_i = N$. Therefore,
\[ T^m = (V J V^{-1})^m = V J^m V^{-1}, \tag{0.23} \]
where
\[ J^m = \begin{bmatrix} [J_{r_1}(\lambda_1)]^m & & & \\ & [J_{r_2}(\lambda_2)]^m & & \\ & & \ddots & \\ & & & [J_{r_q}(\lambda_q)]^m \end{bmatrix}. \tag{0.24} \]

It is clear that $\lim_{m\to\infty} J^m = O$ implies that $\lim_{m\to\infty} T^m = O$ by (0.23). Conversely, by $J^m = V^{-1}T^m V$, $\lim_{m\to\infty} T^m = O$ implies that $\lim_{m\to\infty} J^m = O$, which implies that $\lim_{m\to\infty}[J_{r_i}(\lambda_i)]^m = O$ for each $i$.$^6$ Therefore, it is enough to study $[J_r(\lambda)]^m$. First, when $r = 1$,
\[ [J_1(\lambda)]^m = [\lambda^m]. \tag{0.25} \]
As a result, $\lim_{m\to\infty}[J_1(\lambda)]^m = O$ if and only if $|\lambda| < 1$. For $r > 1$, let us write
\[ J_r(\lambda) = \lambda I_r + E_r, \qquad E_r = \begin{bmatrix} 0 & 1 & & \\ & 0 & \ddots & \\ & & \ddots & 1 \\ & & & 0 \end{bmatrix} \in \mathbb{C}^{r\times r}, \tag{0.26} \]
so that
\[ [J_r(\lambda)]^m = (\lambda I_r + E_r)^m = \lambda^m I_r + \sum_{i=1}^{m} \binom{m}{i} \lambda^{m-i} E_r^i. \tag{0.27} \]
Now, observe that, when $k < r$, the only nonzero elements of $E_r^k$ are $(E_r^k)_{i,k+i} = 1$, $i = 1, \ldots, r-k$, and that $E_r^r = O$.$^7$ Then
\[ E_r^m = O \quad \text{if } m \ge r, \tag{0.28} \]

$^6$ For a sequence of matrices $\{B_m\}_{m=0}^{\infty} \in \mathbb{C}^{r\times s}$, by $\lim_{m\to\infty} B_m = O$ we mean that $B_m \to O$ as $m \to \infty$ entrywise, that is, $\lim_{m\to\infty}(B_m)_{ij} = 0$, $1 \le i \le r$, $1 \le j \le s$, simultaneously.

$^7$ For example,
\[ E_4 = \begin{bmatrix} 0&1&0&0\\ 0&0&1&0\\ 0&0&0&1\\ 0&0&0&0 \end{bmatrix}, \quad E_4^2 = \begin{bmatrix} 0&0&1&0\\ 0&0&0&1\\ 0&0&0&0\\ 0&0&0&0 \end{bmatrix}, \quad E_4^3 = \begin{bmatrix} 0&0&0&1\\ 0&0&0&0\\ 0&0&0&0\\ 0&0&0&0 \end{bmatrix}, \quad E_4^4 = O. \]

and, therefore, (0.27) becomes
\[ [J_r(\lambda)]^m = \lambda^m I_r + \sum_{i=1}^{r-1} \binom{m}{i} \lambda^{m-i} E_r^i = \begin{bmatrix} \lambda^m & \binom{m}{1}\lambda^{m-1} & \binom{m}{2}\lambda^{m-2} & \cdots & \binom{m}{r-1}\lambda^{m-r+1} \\ & \lambda^m & \binom{m}{1}\lambda^{m-1} & \cdots & \binom{m}{r-2}\lambda^{m-r+2} \\ & & \lambda^m & \cdots & \binom{m}{r-3}\lambda^{m-r+3} \\ & & & \ddots & \vdots \\ & & & & \lambda^m \end{bmatrix}. \tag{0.29} \]
By the fact that $\binom{m}{l} = 0$ when $l > m$, note that (0.29) is valid for all $m = 0, 1, \ldots$. In addition, because
\[ \binom{m}{k} = \frac{m(m-1)\cdots(m-k+1)}{k!} = \sum_{j=0}^{k} c_j m^j, \qquad c_k = \frac{1}{k!}, \]
for $\lambda \ne 0$, the most dominant entry as $m \to \infty$ in $[J_r(\lambda)]^m$ is $\binom{m}{r-1}\lambda^{m-r+1}$, which is $O(m^{r-1}\lambda^m)$. Therefore, $\lim_{m\to\infty}[J_r(\lambda)]^m = O$ if and only if $|\lambda| < 1$.

We have shown that, whether $r = 1$ or $r > 1$, $\lim_{m\to\infty}[J_r(\lambda)]^m = O$ if and only if $|\lambda| < 1$. Going back to (0.24), we realize that $\lim_{m\to\infty} J^m = O$ if and only if $|\lambda_i| < 1$, $i = 1, \ldots, q$. The result now follows.

Remarks:

1. Jordan blocks of size $r_i > 1$ occur when the matrix $T$ is not diagonalizable. (Nondiagonalizable matrices are also said to be defective.)

2. If $T$ is diagonalizable, then $r_i = 1$ for all $i = 1, \ldots, q$, and $q = N$ necessarily, that is, $J$ is a diagonal matrix, $\lambda_i$ being the $i$th element along the diagonal. Therefore, $J^m$ is diagonal too and is given by $J^m = \mathrm{diag}(\lambda_1^m, \ldots, \lambda_N^m)$. In this case, the $i$th column of $V$ is an eigenvector of $T$ corresponding to the eigenvalue $\lambda_i$.

3. It is important to observe that
\[ [J_r(\lambda) - \lambda I_r]^k = E_r^k \quad \begin{cases} \ne O & \text{if } k < r, \\ = O & \text{if } k \ge r. \end{cases} \tag{0.30} \]

4. It is clear from (0.19) that, the faster $T^m$ tends to $O$, the faster the convergence of $\{x_m\}$ to $s$. The rate of convergence of $T^m$ to $O$ improves as $\rho(T)$ decreases, as is clear from (0.24) and (0.29).
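As a quick sanity check of the closed form (0.29), the following sketch compares the entries $\binom{m}{j-i}\lambda^{m-(j-i)}$ with powers of a Jordan block obtained by repeated multiplication. Integer arithmetic keeps the comparison exact; the function names and the sample values of $\lambda$, $r$, and $m$ are illustrative choices only.

```python
from math import comb


def jordan_block(lam, r):
    """r x r Jordan block J_r(lam): lam on the diagonal, ones on the superdiagonal."""
    return [[lam if i == j else (1 if j == i + 1 else 0) for j in range(r)]
            for i in range(r)]


def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def jordan_power_by_multiplication(lam, r, m):
    """[J_r(lam)]^m computed by m successive multiplications."""
    P = [[1 if i == j else 0 for j in range(r)] for i in range(r)]
    J = jordan_block(lam, r)
    for _ in range(m):
        P = matmul(P, J)
    return P


def jordan_power_by_formula(lam, r, m):
    """[J_r(lam)]^m from (0.29): entry (i, j) with j >= i is C(m, j-i) lam^(m-(j-i))."""
    out = [[0] * r for _ in range(r)]
    for i in range(r):
        for j in range(i, r):
            k = j - i
            if k <= m:                      # C(m, k) = 0 when k > m
                out[i][j] = comb(m, k) * lam ** (m - k)
    return out


print(jordan_power_by_formula(2, 3, 4))
```

The same comparison also covers $\lambda = 0$, where the formula reduces to $E_r^m$ and vanishes for $m \ge r$, in agreement with (0.28).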

0.4.2 Error formulas for $x_m$

It is important to also analyze the structure of the error vector $x_m - s$ as a function of $m$. We will need this to analyze the behavior of the error in extrapolation. In addition, the result of Theorem 0.1 can also be obtained by looking at this structure. We treat the error $x_m - s$ next.

Error when $T$ is diagonalizable

To start, we will look at the case where $T$ is diagonalizable, that is, $r_i = 1$ for all $i$. Of course, in this case, $q = N$. As already mentioned, the $i$th column of $V$, which we will denote by $v_i$, is an eigenvector corresponding to the eigenvalue $\lambda_i$ of $T$, that is,
\[ T v_i = \lambda_i v_i. \tag{0.31} \]

In addition, $v_1, \ldots, v_N$ form a basis for $\mathbb{C}^N$. In view of this, we have the following theorem.

Theorem 0.2. Let
\[ x_0 - s = \sum_{i=1}^{N} \alpha_i v_i \quad \text{for some } \alpha_i \in \mathbb{C}. \tag{0.32} \]
Then
\[ x_m - s = \sum_{i=1}^{N} \alpha_i \lambda_i^m v_i, \quad m = 1, 2, \ldots. \tag{0.33} \]
This result is valid whether the sequence $\{x_m\}$ converges or not.

Proof. By (0.19) and (0.32), we have
\[ x_m - s = T^m \sum_{i=1}^{N} \alpha_i v_i = \sum_{i=1}^{N} \alpha_i (T^m v_i). \]
Invoking (0.31), which gives $T^m v_i = \lambda_i^m v_i$, the result follows.
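The error formula (0.33) can be checked exactly on a small example. The sketch below uses a hypothetical symmetric $2\times 2$ matrix $T$ whose eigenpairs are known in closed form, and rational arithmetic so that the comparison involves no rounding; all data and names are assumptions made for illustration.

```python
from fractions import Fraction as Fr

# Hypothetical T = [[a, b], [b, a]]: eigenvalues a + b and a - b,
# with eigenvectors v_1 = [1, 1] and v_2 = [1, -1].
a, b = Fr(1, 2), Fr(3, 10)
T = [[a, b], [b, a]]
lam = [a + b, a - b]
v = [[Fr(1), Fr(1)], [Fr(1), Fr(-1)]]
d = [Fr(1), Fr(2)]

# Solve (I - T) s = d; here I - T = [[p, -b], [-b, p]] with p = 1 - a.
p = 1 - a
det = p * p - b * b
s = [(p * d[0] + b * d[1]) / det, (b * d[0] + p * d[1]) / det]

# Expand x_0 - s = alpha_1 v_1 + alpha_2 v_2, as in (0.32).
x0 = [Fr(0), Fr(0)]
e0 = [x0[0] - s[0], x0[1] - s[1]]
alpha = [(e0[0] + e0[1]) / 2, (e0[0] - e0[1]) / 2]


def error_by_iteration(m):
    """x_m - s, with x_m obtained from x_{m+1} = T x_m + d."""
    x = list(x0)
    for _ in range(m):
        x = [T[0][0] * x[0] + T[0][1] * x[1] + d[0],
             T[1][0] * x[0] + T[1][1] * x[1] + d[1]]
    return [x[0] - s[0], x[1] - s[1]]


def error_by_formula(m):
    """Right-hand side of (0.33): sum_i alpha_i lam_i^m v_i."""
    return [sum(alpha[i] * lam[i] ** m * v[i][k] for i in range(2)) for k in range(2)]


print(error_by_iteration(5) == error_by_formula(5))
```

Because $|\lambda_1| = 4/5$ and $|\lambda_2| = 1/5$ here, the term with $\lambda_1$ dominates the error for large $m$, which is the structure that extrapolation methods exploit.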

Remark: As $x_0$ is chosen arbitrarily, $x_0 - s$ is also arbitrary, and so are the $\alpha_i$. Therefore, by (0.33), for convergence of $\{x_m\}$ from any $x_0$, we need to have $|\lambda_i| < 1$ for all $i$.

Error when $T$ is nondiagonalizable (defective)

When $T$ is nondiagonalizable (or defective), the treatment of $x_m - s$ becomes much more involved. For this, we need to recall some facts and details concerning the matrix $V$ in (0.20). We have first the partitioning
\[ V = [V_1 \,|\, V_2 \,|\, \cdots \,|\, V_q], \quad V_i \in \mathbb{C}^{N\times r_i}, \quad i = 1, \ldots, q, \tag{0.34} \]
where the matrices $V_i$ have the columnwise partitionings given in
\[ V_i = [\, v_{i1} \,|\, v_{i2} \,|\, \cdots \,|\, v_{i r_i} \,] \in \mathbb{C}^{N\times r_i}, \quad i = 1, \ldots, q. \tag{0.35} \]
Here, $v_{i1}$ is an eigenvector of $T$ corresponding to the eigenvalue $\lambda_i$, whether $r_i = 1$ or $r_i > 1$. When $r_i > 1$, the vectors $v_{ij}$, $j = 2, \ldots, r_i$, are principal vectors (or generalized eigenvectors) corresponding to $\lambda_i$. The $v_{ij}$ satisfy
\[ T v_{i1} = \lambda_i v_{i1}, \quad r_i \ge 1; \qquad T v_{ij} = \lambda_i v_{ij} + v_{i,j-1}, \quad j = 2, \ldots, r_i, \quad r_i > 1. \tag{0.36} \]
Consequently, we also have
\[ (T - \lambda_i I)^k v_{ij} = \begin{cases} v_{i,j-k} \ne 0 & \text{if } k < j, \\ 0 & \text{if } k = j. \end{cases} \tag{0.37} \]

Generally, a nonzero vector $u$ that satisfies
\[ (T - \lambda_i I)^r u = 0 \quad \text{but} \quad (T - \lambda_i I)^{r-1} u \ne 0 \]
is said to be a generalized eigenvector of $T$ of rank $r$, with associated eigenvalue $\lambda_i$. Thus, $v_{ij}$ is of rank $j$. In addition, the $N$ vectors $v_{ij}$ are linearly independent, and hence they form a basis for $\mathbb{C}^N$. That is, every vector in $\mathbb{C}^N$ can be expressed as a linear combination of the $v_{ij}$. We make use of these facts to prove the next theorem.

Theorem 0.3. Let
\[ x_0 - s = \sum_{i=1}^{q} \sum_{j=1}^{r_i} \alpha_{ij} v_{ij} \quad \text{for some } \alpha_{ij} \in \mathbb{C}. \tag{0.38} \]
Then there exist a vector $z_m$ associated with the zero eigenvalues and vector-valued polynomials $p_i(m)$ associated with the respective nonzero eigenvalues $\lambda_i$, given by
\[ z_m = \sum_{\substack{i=1 \\ \lambda_i = 0}}^{q} \sum_{j=m+1}^{r_i} \alpha_{ij} v_{i,j-m}, \tag{0.39} \]
\[ p_i(m) = \sum_{l=0}^{r_i - 1} a_{il} \binom{m}{l}, \qquad a_{il} = \lambda_i^{-l} \sum_{j=l+1}^{r_i} \alpha_{ij} v_{i,j-l}, \quad \lambda_i \ne 0, \tag{0.40} \]
such that
\[ x_m - s = z_m + \sum_{\substack{i=1 \\ \lambda_i \ne 0}}^{q} p_i(m) \lambda_i^m. \tag{0.41} \]

This result is valid whether the sequence $\{x_m\}$ converges or not.

Proof. By (0.38) and (0.19), we first have
\[ x_m - s = \sum_{i=1}^{q} \sum_{j=1}^{r_i} \alpha_{ij} (T^m v_{ij}) = \Biggl( \sum_{\substack{i=1 \\ \lambda_i = 0}}^{q} \sum_{j=1}^{r_i} + \sum_{\substack{i=1 \\ \lambda_i \ne 0}}^{q} \sum_{j=1}^{r_i} \Biggr) \alpha_{ij} (T^m v_{ij}). \tag{0.42} \]
The first double summation represents the contribution of the zero eigenvalues of $T$ to $x_m - s$, while the second represents the contribution of the nonzero eigenvalues of $T$. We therefore need to analyze $T^m v_{ij}$ for $\lambda_i = 0$ and for $\lambda_i \ne 0$ separately.

We start with the contribution of the zero eigenvalues $\lambda_i$. Letting $\lambda_i = 0$ in (0.37), we have $T^m v_{ij} = v_{i,j-m}$, where we also mean that $v_{ik} = 0$ when $k \le 0$. Substituting this into the summation $\sum_{i=1,\, \lambda_i = 0}^{q} \sum_{j=1}^{r_i}$, we obtain the vector $z_m$ as the contribution of the zero eigenvalues.

We now turn to the contribution of the nonzero eigenvalues $\lambda_i$, which turns out to be more complicated. By (0.23), we have $T^m V = V J^m$. Now, by (0.34),
\[ T^m V = [\, T^m V_1 \,|\, T^m V_2 \,|\, \cdots \,|\, T^m V_q \,], \]
and, invoking (0.24), we have
\[ T^m V_i = V_i [J_{r_i}(\lambda_i)]^m, \quad i = 1, \ldots, q. \]
Now,
\[ T^m V_i = [\, T^m v_{i1} \,|\, T^m v_{i2} \,|\, \cdots \,|\, T^m v_{i r_i} \,]. \]
Equating the $j$th column of the matrix $T^m V_i$, namely, the vector $T^m v_{ij}$, with the $j$th column of $V_i [J_{r_i}(\lambda_i)]^m$ by invoking (0.29), we obtain
\[ T^m v_{ij} = \sum_{k=1}^{j} v_{ik} \binom{m}{j-k} \lambda_i^{m-j+k}. \tag{0.43} \]
Substituting (0.43) into the summation $\sum_{i=1,\, \lambda_i \ne 0}^{q} \sum_{j=1}^{r_i}$ and rearranging (with $l = j - k$), we obtain
\[ \sum_{\substack{i=1 \\ \lambda_i \ne 0}}^{q} \sum_{l=0}^{r_i - 1} \Biggl( \sum_{j=l+1}^{r_i} \alpha_{ij} v_{i,j-l} \Biggr) \binom{m}{l} \lambda_i^{m-l} = \sum_{\substack{i=1 \\ \lambda_i \ne 0}}^{q} p_i(m) \lambda_i^m \]
as the contribution of the nonzero eigenvalues to $x_m - s$. This completes the proof.
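To see what Theorem 0.3 says in the simplest defective case, consider, purely as an illustration, a matrix $T$ with a single Jordan block of size two: $q = 1$, $r_1 = 2$, and $\lambda_1 = \lambda \ne 0$. Writing $v_1, v_2$ for $v_{11}, v_{12}$ and $\alpha_1, \alpha_2$ for $\alpha_{11}, \alpha_{12}$, the formulas for $a_{il}$ and $p_i(m)$ in the theorem give
\[ a_{10} = \alpha_1 v_1 + \alpha_2 v_2, \qquad a_{11} = \lambda^{-1} \alpha_2 v_1, \qquad p_1(m) = a_{10} + m\, a_{11}, \]
so that, since $z_m = 0$,
\[ x_m - s = \lambda^m \bigl( \alpha_1 v_1 + \alpha_2 v_2 \bigr) + m \lambda^{m-1} \alpha_2 v_1 . \]
The same expression follows directly from $T^m v_1 = \lambda^m v_1$ and $T^m v_2 = \lambda^m v_2 + m \lambda^{m-1} v_1$. When $\alpha_2 \ne 0$, the error therefore behaves like $m \lambda^m$ rather than $\lambda^m$ as $m \to \infty$.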

Remarks:

1. Since $\binom{m}{l}$ is a polynomial in $m$ of degree exactly $l$, $p_i(m)$ is a polynomial in $m$ of degree at most $r_i - 1$.

2. Naturally, $z_m = 0$ for all $m$ when zero is not an eigenvalue of $T$.

3. In addition, $z_m = 0$ for $m \ge \max\{r_i : \lambda_i = 0\}$, since the summations on $j$ in (0.39) are empty for such $m$.

4. Clearly, $z_m$ is in the subspace spanned by the eigenvectors and principal vectors corresponding to the zero eigenvalues. Similarly, $p_i(m)$ is in the subspace spanned by the eigenvector and principal vectors corresponding to the eigenvalue $\lambda_i \ne 0$.

5. It is clear from (0.38) that, because $x_0$ is chosen arbitrarily, $x_0 - s$ is also arbitrary, and so are the $\alpha_{ij}$. Therefore, by (0.41), for convergence of $\{x_m\}$ from any $x_0$, we need to have $|\lambda_i| < 1$ for all $i$. In addition, the smaller $\rho(T)$ is, the faster $\{x_m\}$ converges to $s$.

0.4.3 Some basic iterative methods

We now turn to a few known examples of iterative methods. Of course, all the matrices $A$ below are assumed to be nonsingular. To see what the matrices $M$ and $N$ in (0.12) are in some of the cases below, we decompose $A$ as
\[ A = D - E - F, \tag{0.44} \]
where
\[ D = \mathrm{diag}(A) = \mathrm{diag}(a_{11}, a_{22}, \ldots, a_{NN}), \tag{0.45} \]
while $-E$ is the lower triangular part of $A$ excluding the main diagonal, and $-F$ is the upper triangular part of $A$ excluding the main diagonal. Hence, both $E$ and $F$ have zero diagonals.

1. Richardson iteration: In this method, we first rewrite (0.10) in the form
\[ x = x + \omega(b - Ax) \quad \text{for some } \omega \in \mathbb{C} \tag{0.46} \]
and iterate as in
\[ x_{m+1} = x_m + \omega(b - A x_m), \quad m = 0, 1, \ldots. \tag{0.47} \]
That is,
\[ M = \frac{1}{\omega} I, \qquad N = \frac{1}{\omega}(I - \omega A) \tag{0.48} \]
in (0.12), and hence
\[ T = I - \omega A, \qquad d = \omega b \tag{0.49} \]
in (0.15).

2. Jacobi method: In this method, $x_{m+1}$ is computed from $x_m$ as follows:
\[ x_{m+1}^{(i)} = \frac{1}{a_{ii}} \Biggl( b^{(i)} - \sum_{\substack{j=1 \\ j \ne i}}^{N} a_{ij} x_m^{(j)} \Biggr), \quad i = 1, \ldots, N, \tag{0.50} \]
provided that $a_{ii} \ne 0$ for all $i$. This amounts to setting
\[ M = D, \qquad N = D - A = E + F. \tag{0.51} \]
Therefore,
\[ T = I - D^{-1}A = D^{-1}(E + F). \tag{0.52} \]

3. Gauss–Seidel method: In this method, $x_{m+1}$ is computed from $x_m$ as follows:
\[ x_{m+1}^{(i)} = \frac{1}{a_{ii}} \Biggl( b^{(i)} - \sum_{j=1}^{i-1} a_{ij} x_{m+1}^{(j)} - \sum_{j=i+1}^{N} a_{ij} x_m^{(j)} \Biggr), \quad i = 1, \ldots, N, \tag{0.53} \]
provided that $a_{ii} \ne 0$ for all $i$. Here, we first compute $x_{m+1}^{(1)}$, then $x_{m+1}^{(2)}$, and so on. Thus, we have
\[ M = D - E, \qquad N = F \tag{0.54} \]
so that
\[ T = (D - E)^{-1} F. \tag{0.55} \]

4. Symmetric Gauss–Seidel method: Let us recall that in each step of the Gauss–Seidel method, the $x_m^{(i)}$ are being updated in the order $i = 1, 2, \ldots, N$. We will call this updating a forward sweep. We can also perform the updating of the $x_m^{(i)}$ in the order $i = N, N-1, \ldots, 1$. One such step is called a backward sweep, and it amounts to taking
\[ M = D - F, \qquad N = E \tag{0.56} \]
so that
\[ T = (D - F)^{-1} E. \tag{0.57} \]
If we obtain $x_{m+1}$ by applying to $x_m$ a forward sweep followed by a backward sweep, we obtain a method called the symmetric Gauss–Seidel method. The matrix of iteration relevant to this method is thus
\[ T = (D - F)^{-1} E\, (D - E)^{-1} F. \tag{0.58} \]

5. Successive overrelaxation (SOR): This method is similar to the Gauss–Seidel method but involves a scalar $\omega$ called the relaxation parameter. In this method, $x_{m+1}$ is computed from $x_m$ as follows:
\[ \left. \begin{aligned} \hat{x}_{m+1}^{(i)} &= \frac{1}{a_{ii}} \Biggl( b^{(i)} - \sum_{j=1}^{i-1} a_{ij} x_{m+1}^{(j)} - \sum_{j=i+1}^{N} a_{ij} x_m^{(j)} \Biggr) \\ x_{m+1}^{(i)} &= \omega \hat{x}_{m+1}^{(i)} + (1 - \omega) x_m^{(i)} \end{aligned} \right\} \quad i = 1, \ldots, N, \tag{0.59} \]
provided that $a_{ii} \ne 0$ for all $i$. Here too, we first compute $x_{m+1}^{(1)}$, then $x_{m+1}^{(2)}$, and so on. Decomposing $A$ as in (0.44), for SOR, we have
\[ M = \frac{1}{\omega}(D - \omega E), \qquad N = \frac{1}{\omega}[(1 - \omega)D + \omega F]; \tag{0.60} \]
hence,
\[ T = (D - \omega E)^{-1}[(1 - \omega)D + \omega F]. \tag{0.61} \]
Note that SOR reduces to the Gauss–Seidel method when $\omega = 1$. By adjusting the parameter $\omega$, we can minimize $\rho(T)$ and thus optimize the rate of convergence of SOR. In this case, SOR is denoted optimal SOR.

6. Symmetric SOR (SSOR): Note that, as in the case of the Gauss–Seidel method, in SOR too the $x_m^{(i)}$ are being updated in the order $i = 1, 2, \ldots, N$. We will call this updating a forward sweep. We can also perform the updating of the $x_m^{(i)}$ in the order $i = N, N-1, \ldots, 1$. One such step is called a backward sweep, and it amounts to taking
\[ M = \frac{1}{\omega}(D - \omega F), \qquad N = \frac{1}{\omega}[(1 - \omega)D + \omega E] \tag{0.62} \]
so that
\[ T = (D - \omega F)^{-1}[(1 - \omega)D + \omega E]. \tag{0.63} \]
If we obtain $x_{m+1}$ by applying to $x_m$ a forward sweep followed by a backward sweep, we obtain a method called SSOR. The matrix of iteration relevant to this method is thus
\[ T = (D - \omega F)^{-1}[(1 - \omega)D + \omega E]\,(D - \omega E)^{-1}[(1 - \omega)D + \omega F]. \tag{0.64} \]

7. Alternating direction implicit (ADI) method: This method was developed to solve linear systems arising from finite-difference or finite-element solution of elliptic and parabolic partial differential equations. In the linear system $Ax = b$, we have $A = H + V$. Expressing this linear system in the forms
\[ (H + \mu I)x = (-V + \mu I)x + b \quad \text{and} \quad (V + \mu I)x = (-H + \mu I)x + b, \]
ADI is defined via the following two-stage iterative method, in which $x'_{m+1}$ denotes the intermediate vector produced by the first stage:
\[ \left. \begin{aligned} (H + \mu I)x'_{m+1} &= (-V + \mu I)x_m + b \\ (V + \mu I)x_{m+1} &= (-H + \mu I)x'_{m+1} + b \end{aligned} \right\} \quad m = 0, 1, \ldots, \tag{0.65} \]

where $\mu$ is some appropriate scalar. In this case,
\[ M = \frac{1}{2\mu}(H + \mu I)(V + \mu I), \qquad N = \frac{1}{2\mu}(H - \mu I)(V - \mu I) \tag{0.66} \]
so that
\[ T = (V + \mu I)^{-1}(H - \mu I)(H + \mu I)^{-1}(V - \mu I). \tag{0.67} \]

Remark: In the Jacobi, Gauss–Seidel, and SOR methods mentioned above, the $x_m^{(i)}$ are updated one at a time. Because of this, these methods are called point methods. We can choose to update several of the $x_m^{(i)}$ simultaneously, that is, we can choose to update the $x_m^{(i)}$ in blocks. The resulting methods are said to be block methods.
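The componentwise formulas for the Jacobi, Gauss–Seidel, and SOR methods translate almost directly into code. The following is a minimal sketch of the point versions; the stopping rule, the tridiagonal test matrix, and the value $\omega = 1.3$ are assumptions made for illustration, not part of the methods themselves.

```python
def jacobi_step(A, b, x):
    """One Jacobi update (0.50): every component uses only the old vector x."""
    n = len(b)
    return [(b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
            for i in range(n)]


def sor_step(A, b, x, omega=1.0):
    """One forward SOR sweep (0.59); omega = 1 gives the Gauss-Seidel update (0.53)."""
    n = len(b)
    x_new = list(x)                      # entries j < i get overwritten as we go
    for i in range(n):
        sigma = sum(A[i][j] * x_new[j] for j in range(n) if j != i)
        x_hat = (b[i] - sigma) / A[i][i]
        x_new[i] = omega * x_hat + (1.0 - omega) * x[i]
    return x_new


def iterate(step, A, b, x0, tol=1e-12, max_steps=10_000):
    """Apply `step` until successive iterates differ by less than tol (a simple, assumed rule)."""
    x = list(x0)
    for m in range(1, max_steps + 1):
        x_next = step(A, b, x)
        if max(abs(p - q) for p, q in zip(x_next, x)) < tol:
            return x_next, m
        x = x_next
    return x, max_steps


# Hypothetical test problem: A = tridiag(-1, 2, -1) of order 5, exact solution all ones.
n = 5
A = [[2.0 if i == j else (-1.0 if abs(i - j) == 1 else 0.0) for j in range(n)]
     for i in range(n)]
b = [1.0, 0.0, 0.0, 0.0, 1.0]
x0 = [0.0] * n

x_jac, steps_jac = iterate(jacobi_step, A, b, x0)
x_gs, steps_gs = iterate(lambda A, b, x: sor_step(A, b, x, 1.0), A, b, x0)
x_sor, steps_sor = iterate(lambda A, b, x: sor_step(A, b, x, 1.3), A, b, x0)
print(steps_jac, steps_gs, steps_sor)
```

A backward sweep is obtained by running the loop in `sor_step` over `reversed(range(n))`; applying a forward sweep followed by a backward sweep then gives the symmetric variants.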

0.4.4 Some convergence results for basic iterative methods

We now state without proof some convergence results for the iterative methods described in the preceding subsection. For the proofs, we refer the reader to the relevant literature.

Theorem 0.4. Let $A$ have eigenvalues $\lambda_i$ that satisfy $0 < \lambda_1 \le \lambda_2 \le \cdots \le \lambda_N$. Then the Richardson iterative method converges provided $0 < \omega < 2/\lambda_N$. Denoting $T$ by $T(\omega)$, we also have the following optimal result:
\[ \omega_0 = \frac{2}{\lambda_N + \lambda_1}, \qquad \min_{\omega} \rho(T(\omega)) = \rho(T(\omega_0)) = \frac{\lambda_N - \lambda_1}{\lambda_N + \lambda_1} < 1. \]
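Since $T(\omega) = I - \omega A$ has eigenvalues $1 - \omega\lambda_i$, the optimal value in Theorem 0.4 is easy to check numerically. The sketch below does this for a hypothetical set of eigenvalues; the list `eigs`, the grid of trial values, and the function name are illustrative assumptions.

```python
def richardson_rho(eigs, omega):
    """Spectral radius of T(omega) = I - omega*A, given the eigenvalues of A."""
    return max(abs(1.0 - omega * lam) for lam in eigs)


# Hypothetical spectrum with 0 < lambda_1 <= ... <= lambda_N.
eigs = [1.0, 2.0, 5.0, 9.0]
lam_1, lam_N = eigs[0], eigs[-1]

omega0 = 2.0 / (lam_N + lam_1)
rho0 = richardson_rho(eigs, omega0)
predicted = (lam_N - lam_1) / (lam_N + lam_1)

# Scan the convergence interval 0 < omega < 2/lambda_N on a uniform grid.
grid = [k * (2.0 / lam_N) / 1000 for k in range(1, 1000)]
best_on_grid = min(richardson_rho(eigs, w) for w in grid)
print(omega0, rho0, predicted, best_on_grid)
```

No grid value does better than $\omega_0$, and the spectral radius at $\omega_0$ matches $(\lambda_N - \lambda_1)/(\lambda_N + \lambda_1)$ up to rounding.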

Theorem 0.5. Let the matrix $A$ be strictly diagonally dominant. Then both the Jacobi and the Gauss–Seidel methods converge. Define
\[ \mu_i = \sum_{j=1}^{i-1} |a_{ij}|/|a_{ii}|, \qquad \nu_i = \sum_{j=i+1}^{N} |a_{ij}|/|a_{ii}|, \quad i = 1, \ldots, N. \]

1. For the Jacobi method,
\[ \rho(T) \le \|T\|_\infty = \max_i (\mu_i + \nu_i) < 1. \]

2. For the Gauss–Seidel method,
\[ \rho(T) \le \|T\|_\infty \le \max_i \frac{\nu_i}{1 - \mu_i} < 1. \]

We next state the Stein–Rosenberg theorem that pertains to the convergence of the Jacobi and Gauss–Seidel methods.

Theorem 0.6. Denote by $T_{\rm J}$ and $T_{\rm G\text{-}S}$ the iteration matrices for the Jacobi and Gauss–Seidel iterative methods for the linear system $Ax = b$. If $T_{\rm J} \ge O$, then precisely one of the following takes place:

1. $\rho(T_{\rm G\text{-}S}) = \rho(T_{\rm J}) = 0$.

2. $0 < \rho(T_{\rm G\text{-}S}) < \rho(T_{\rm J}) < 1$.

3. $\rho(T_{\rm G\text{-}S}) = \rho(T_{\rm J}) = 1$.

4. $\rho(T_{\rm G\text{-}S}) > \rho(T_{\rm J}) > 1$.

Thus, the Jacobi and Gauss–Seidel methods converge together and diverge together. In the case of convergence, the Gauss–Seidel method converges faster than the Jacobi method.

Theorem 0.7. Let the matrix $A$ be irreducibly diagonally dominant. Then both the Jacobi and the Gauss–Seidel methods converge.

Theorem 0.8. Let $A$ be an M-matrix. Then the Jacobi method converges.

Theorem 0.9. Let $A = M - N$ be a regular splitting of the matrix $A$, and let $A^{-1} \ge O$. Then
\[ \rho(M^{-1}N) = \frac{\rho(A^{-1}N)}{1 + \rho(A^{-1}N)} < 1. \]
Hence, the iterative method $x_{m+1} = T x_m + d$, where $T = M^{-1}N$, converges. Conversely, $\rho(M^{-1}N) < 1$ implies that $A$ is nonsingular and $A^{-1} \ge O$.

Theorem 0.10. Let the matrix $A$ be Hermitian positive definite. Then we have the following:

1. The Gauss–Seidel method converges.

2. The Jacobi method converges if and only if the matrix $2D - A$ is Hermitian positive definite. Here $D = \mathrm{diag}(A)$.

Theorem 0.11. For SOR to converge, it is necessary (but not sufficient) that $0 < \omega < 2$.

Theorem 0.12. Let $A$ be Hermitian positive definite. Then SOR converges if and only if $0 < \omega < 2$.

Theorem 0.13. Let the matrices $H$ and $V$ in the ADI method be normal. Then
\[ \rho(T) \le \max_i \left| \frac{\mu - \lambda_i(H)}{\mu + \lambda_i(H)} \right| \cdot \max_i \left| \frac{\mu - \lambda_i(V)}{\mu + \lambda_i(V)} \right|. \]
Then the ADI method converges if $H$ and $V$ are Hermitian positive definite and $\mu > 0$.

Theorem 0.14. Denote the Hermitian and skew-Hermitian parts of $A$ by $A_{\rm H}$ and $A_{\rm S}$, respectively. That is, $A_{\rm H} = \frac{1}{2}(A + A^*)$ and $A_{\rm S} = \frac{1}{2}(A - A^*)$, and $A = A_{\rm H} + A_{\rm S}$. Consider the following two-stage fixed-point iterative method for the system $Ax = b$: Pick $x_0$ and a scalar $\mu$, and compute $x_1, x_2, \ldots$ as in
\[ \left. \begin{aligned} (\mu I + A_{\rm H})x'_{m+1} &= (\mu I - A_{\rm S})x_m + b \\ (\mu I + A_{\rm S})x_{m+1} &= (\mu I - A_{\rm H})x'_{m+1} + b \end{aligned} \right\}, \quad m = 0, 1, \ldots, \]
where $x'_{m+1}$ denotes the intermediate vector produced by the first stage. The matrix of iteration $T$ in $x_{m+1} = T x_m + d$ is then given by
\[ T = (\mu I + A_{\rm S})^{-1}(\mu I - A_{\rm H})(\mu I + A_{\rm H})^{-1}(\mu I - A_{\rm S}). \]
If $A_{\rm H}$ is positive definite, then $A$ is nonsingular. If, in addition, $\mu$ is real and $\mu > 0$, then
\[ \rho(T) \le \max_i \left| \frac{\mu - \lambda_i(A_{\rm H})}{\mu + \lambda_i(A_{\rm H})} \right| < 1. \]
Thus, the iterative method converges. (Note that the method described here is an ADI method.)

Chapter 1

Development of Polynomial Extrapolation Methods

1.1 Preliminaries

1.1.1 Motivation

In this chapter, we present the derivation of four polynomial extrapolation methods: minimal polynomial extrapolation (MPE), reduced rank extrapolation (RRE), modified minimal polynomial extrapolation (MMPE), and SVD-based minimal polynomial extrapolation (SVD-MPE). Of these, MPE, RRE, and MMPE date back to the 1970s, while SVD-MPE was published in 2016. MPE was introduced by Cabay and Jackson [52]; RRE was introduced independently by Kaniel and Stein [155], Eddy [74], and Mešina [185]; and MMPE was introduced independently by Brezinski [28], Pugachev [211], and Sidi, Ford, and Smith [299]. SVD-MPE is a new method by the author [290].$^8$

MPE and RRE, along with the epsilon algorithms (to be described in Chapter 5), have been reviewed by Skelboe [305] and by Smith, Ford, and Sidi [306]. Since the publication of these reviews, quite a few developments have taken place on the subject of vector extrapolation, and some of the newer developments have been reviewed by Sidi [286, 289]; for still newer developments, see Sidi [290, 292]. Our purpose here is to cover as many of these developments as possible and to present a broad perspective.

Given a vector sequence that converges slowly, our aim in this chapter is to develop extrapolation methods whose only input is the sequence $\{x_m\}$ itself. As we mentioned in the preceding chapter, a major area of application of vector extrapolation methods is that of iterative solution of systems of equations. We have also seen that nonlinear systems of equations “behave” linearly close to their solutions. Therefore, in our derivation of polynomial extrapolation methods, we will go through the iterative solution of linear systems of equations. That is, we will derive the methods within the context of linear systems, making sure that these methods involve only the sequence of approximations $\{x_m\}$ that result from the iterative methods used.$^9$ Following their derivation (definition), we will present a detailed discussion of their algebraic properties. We will not address the important issues of (i) actual algorithms for their numerical implementation and (ii) their analytical (convergence) properties in this chapter; we leave these topics to Chapters 2, 4, 6, and 7.

$^8$ The extrapolation methods we discuss in this book apply to vector sequences, as already mentioned. Block versions of some of the methods we describe here, which apply to sequences of vectors and matrices, have been given in Brezinski and Redivo Zaglia [40] and Messaoudi [186] and in the recent papers by Jbilou and Sadok [151] and Jbilou and Messaoudi [146]. See also Baron and Wajc [15]. We do not discuss these methods here.

Important note: Starting with this chapter, and throughout this book, we will fix our notation for the inner products in $\mathbb{C}^s$ and the vector norms induced by them as follows:

• For general or weighted inner products,
\[ (y, z) = y^* M z, \qquad \|z\| = \sqrt{(z, z)}, \tag{1.1} \]
where $M \in \mathbb{C}^{s\times s}$ is a fixed Hermitian positive definite matrix. Recall that all inner products in $\mathbb{C}^s$ are weighted inner products unless they are Euclidean.

• For the standard $l_2$ or Euclidean inner product,
\[ \langle y, z \rangle = y^* z, \qquad \|z\|_2 = \sqrt{\langle z, z \rangle}, \tag{1.2} \]
whenever confusion may arise.

Throughout, we use the fact that, for any square matrix $H$ and analytic functions $f(\lambda)$ and $g(\lambda)$, we have $f(H)\,g(H) = g(H)\,f(H)$. We will also need the following definition throughout this work.

Definition 1.1. The polynomial $A(z) = \sum_{i=0}^{m} a_i z^i$ is monic of degree $m$ if $a_m = 1$. We also denote the set of monic polynomials of degree $m$ by $\mathcal{P}_m$.

1.1.2 Minimal polynomials of matrices

We start by discussing minimal polynomials of matrices. We already know that the characteristic polynomial $R(\lambda) = \det(\lambda I - T)$ of a square matrix $T \in \mathbb{C}^{N\times N}$ is a monic polynomial of degree exactly $N$ and its roots are the eigenvalues of $T$. If $\lambda_i$ and $r_i$ are the eigenvalues of $T$ and their corresponding multiplicities, precisely as in (0.21) and (0.22) in the proof of Theorem 0.1, then
\[ R(\lambda) = \sum_{i=0}^{N} e_i \lambda^i = \prod_{i=1}^{q} (\lambda - \lambda_i)^{r_i}, \qquad \sum_{i=1}^{q} r_i = N, \quad e_N = 1. \tag{1.3} \]
The following theorem is known as the Cayley–Hamilton theorem, and there are different proofs of it. The proof we present here employs the Jordan canonical form and should provide a good exercise in the subject.

Theorem 1.2. The matrix $T$ satisfies
\[ R(T) = \sum_{i=0}^{N} e_i T^i = \prod_{i=1}^{q} (T - \lambda_i I)^{r_i} = O, \qquad T^0 = I. \tag{1.4} \]
In other words, the characteristic polynomial of $T$ annihilates the matrix $T$.

$^9$ A completely different derivation of vector extrapolation methods can be given starting with the Shanks transformation; this was done by Sidi, Ford, and Smith [299]. We summarize the Shanks transformation in Section 5.2. For yet another approach that proceeds through kernels, see Brezinski and Redivo Zaglia [42].

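Proof. We will recall (0.20)–(0.29), concerning the Jordan canonical form, and use the same notation. First, substituting (0.20) into $R(T) = \prod_{i=1}^{q}(T - \lambda_i I)^{r_i}$, we have
\[ R(T) = \sum_{i=0}^{N} e_i T^i = \prod_{i=1}^{q} \bigl[ V (J - \lambda_i I) V^{-1} \bigr]^{r_i} = V \Biggl[ \prod_{i=1}^{q} (J - \lambda_i I)^{r_i} \Biggr] V^{-1} = V R(J) V^{-1}. \]
Next, by (0.24), we have
\[ R(J) = \sum_{i=0}^{N} e_i J^i = \begin{bmatrix} R\bigl(J_{r_1}(\lambda_1)\bigr) & & & \\ & R\bigl(J_{r_2}(\lambda_2)\bigr) & & \\ & & \ddots & \\ & & & R\bigl(J_{r_q}(\lambda_q)\bigr) \end{bmatrix}. \]
Now, for each $j = 1, \ldots, q$,
\[ R\bigl(J_{r_j}(\lambda_j)\bigr) = \Biggl\{ \prod_{\substack{i=1 \\ i \ne j}}^{q} \bigl[ J_{r_j}(\lambda_j) - \lambda_i I_{r_j} \bigr]^{r_i} \Biggr\} \bigl[ J_{r_j}(\lambda_j) - \lambda_j I_{r_j} \bigr]^{r_j} = O, \]
since, by (0.30), $[J_{r_j}(\lambda_j) - \lambda_j I_{r_j}]^{r_j} = O$. Therefore, $R(J) = O$. Consequently, $R(T) = O$ as well.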
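The Cayley–Hamilton theorem is easy to confirm on a small matrix. The sketch below obtains the coefficients $e_i$ of $R(\lambda) = \det(\lambda I - T)$ by the Faddeev–LeVerrier recursion (a standard technique that is not part of the proof above) and then evaluates $R(T)$ by Horner's rule in exact rational arithmetic. The test matrix is a hypothetical example with $R(\lambda) = (\lambda - 2)^2(\lambda - 3)$.

```python
from fractions import Fraction


def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def char_poly_coeffs(T):
    """Coefficients [e_0, ..., e_N] of det(lambda*I - T), with e_N = 1 (Faddeev-LeVerrier)."""
    n = len(T)
    e = [Fraction(0)] * (n + 1)
    e[n] = Fraction(1)
    M = [[Fraction(0)] * n for _ in range(n)]
    for k in range(1, n + 1):
        TM = matmul(T, M)
        # M_k = T M_{k-1} + e_{n-k+1} I
        M = [[TM[i][j] + (e[n - k + 1] if i == j else 0) for j in range(n)]
             for i in range(n)]
        TM = matmul(T, M)
        # e_{n-k} = -trace(T M_k) / k
        e[n - k] = -sum(TM[i][i] for i in range(n)) / k
    return e


def poly_of_matrix(e, T):
    """Evaluate sum_i e_i T^i by Horner's rule."""
    n = len(T)
    R = [[e[-1] if i == j else Fraction(0) for j in range(n)] for i in range(n)]
    for coeff in reversed(e[:-1]):
        R = matmul(R, T)
        R = [[R[i][j] + (coeff if i == j else 0) for j in range(n)] for i in range(n)]
    return R


# Hypothetical example with det(lambda*I - T) = (lambda - 2)^2 (lambda - 3).
T = [[2, 1, 0],
     [0, 2, 0],
     [1, 0, 3]]
e = char_poly_coeffs(T)
R_of_T = poly_of_matrix(e, T)
print(e, R_of_T)
```

For this matrix the coefficients come out as $e = [-12, 16, -7, 1]$, that is, $R(\lambda) = \lambda^3 - 7\lambda^2 + 16\lambda - 12$, and $R(T)$ is the zero matrix.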

Definition 1.3. The monic polynomial $Q(\lambda)$ is said to be a minimal polynomial of $T$ if $Q(T) = O$ and if $Q(\lambda)$ has smallest degree.

Theorem 1.4. The minimal polynomial $Q(\lambda)$ of $T$ exists and is unique. Moreover, if $Q_1(T) = O$ for some polynomial $Q_1(\lambda)$ with $\deg Q_1 > \deg Q$, then $Q(\lambda)$ divides $Q_1(\lambda)$. In particular, $Q(\lambda)$ divides $R(\lambda)$, the characteristic polynomial of $T$. [Thus, the degree of $Q(\lambda)$ is at most $N$, and its zeros are some or all of the eigenvalues of $T$.]

Proof. Since the characteristic polynomial $R(\lambda)$ satisfies $R(T) = O$, there is also a monic polynomial $Q(\lambda)$ of smallest degree $m$, $m \le N$, satisfying $Q(T) = O$. Suppose that there is another monic polynomial $\widetilde{Q}(\lambda)$ of degree $m$ that satisfies $\widetilde{Q}(T) = O$. Then the difference $S(\lambda) = Q(\lambda) - \widetilde{Q}(\lambda)$ also satisfies $S(T) = O$, and its degree is less than $m$, which is impossible. Therefore, $Q(\lambda)$ is unique.

Let $Q_1(\lambda)$ be of degree $m_1$ such that $m_1 > m$ and $Q_1(T) = O$. Then there exist polynomials $a(\lambda)$ of degree $m_1 - m$ and $r(\lambda)$ of degree at most $m - 1$ such that $Q_1(\lambda) = a(\lambda)Q(\lambda) + r(\lambda)$. Therefore,
\[ O = Q_1(T) = a(T)Q(T) + r(T) = r(T). \]
Since $r(T) = O$, but $r(\lambda)$ has degree less than $m$, $r(\lambda)$ must be the zero polynomial. Therefore, $Q(\lambda)$ divides $Q_1(\lambda)$. Letting $Q_1(\lambda) = R(\lambda)$, we realize that $Q(\lambda)$ divides $R(\lambda)$, meaning that its zeros are some or all of the eigenvalues of $T$.

Note that, with $T = V J V^{-1}$ and $J$ as before, we have $Q(T) = V Q(J) V^{-1}$, where
\[ Q(J) = \begin{bmatrix} Q\bigl(J_{r_1}(\lambda_1)\bigr) & & & \\ & Q\bigl(J_{r_2}(\lambda_2)\bigr) & & \\ & & \ddots & \\ & & & Q\bigl(J_{r_q}(\lambda_q)\bigr) \end{bmatrix}. \]

To see how $Q(\lambda)$ factorizes, let us consider the case in which $\lambda_1 = a = \lambda_2$ and are different from the rest of the $\lambda_i$. Assume also that $r_1 \ge r_2$. Then, by (0.30), we have that $[J_{r_j}(\lambda_j) - a I_{r_j}]^k = O$ only when $k \ge r_j$, $j = 1, 2$. This means that $Q(\lambda)$ will have $(\lambda - a)^{r_1}$ as one of its factors.

Definition 1.5. Given a nonzero vector $u \in \mathbb{C}^N$, the monic polynomial $P(\lambda)$ is said to be a minimal polynomial of $T$ with respect to $u$ if $P(T)u = 0$ and if $P(\lambda)$ has smallest degree.

Theorem 1.6. The minimal polynomial $P(\lambda)$ of $T$ with respect to $u$ exists and is unique. Moreover, if $P_1(T)u = 0$ for some polynomial $P_1(\lambda)$ with $\deg P_1 > \deg P$, then $P(\lambda)$ divides $P_1(\lambda)$. In particular, $P(\lambda)$ divides $Q(\lambda)$, the minimal polynomial of $T$, which in turn divides $R(\lambda)$, the characteristic polynomial of $T$. [Thus, the degree of $P(\lambda)$ is at most $N$, and its zeros are some or all of the eigenvalues of $T$.]

Proof. Since the minimal polynomial $Q(\lambda)$ satisfies $Q(T) = O$, it also satisfies $Q(T)u = 0$. Therefore, there is a monic polynomial $P(\lambda)$ of smallest degree $k$, $k \le m$, where $m$ is the degree of $Q(\lambda)$, satisfying $P(T)u = 0$. Suppose that there is another monic polynomial $\widetilde{P}(\lambda)$ of degree $k$ that satisfies $\widetilde{P}(T)u = 0$. Then the difference $S(\lambda) = P(\lambda) - \widetilde{P}(\lambda)$ also satisfies $S(T)u = 0$, and its degree is less than $k$, which is impossible. Therefore, $P(\lambda)$ is unique.

Let $P_1(\lambda)$ be of degree $k_1$ such that $k_1 > k$ and $P_1(T)u = 0$. Then there exist polynomials $a(\lambda)$ of degree $k_1 - k$ and $r(\lambda)$ of degree at most $k - 1$ such that $P_1(\lambda) = a(\lambda)P(\lambda) + r(\lambda)$. Therefore,
\[ 0 = P_1(T)u = a(T)P(T)u + r(T)u = r(T)u. \]
Since $r(T)u = 0$, but $r(\lambda)$ has degree less than $k$, $r(\lambda)$ must be the zero polynomial. Therefore, $P(\lambda)$ divides $P_1(\lambda)$. Letting $P_1(\lambda) = Q(\lambda)$ and $P_1(\lambda) = R(\lambda)$, we realize that $P(\lambda)$ divides $Q(\lambda)$ and $R(\lambda)$, meaning that its zeros are some or all of the eigenvalues of $T$.

Again, with $T = V J V^{-1}$ and $J$ as before, we have $P(T) = V P(J) V^{-1}$, where
\[ P(J) = \begin{bmatrix} P\bigl(J_{r_1}(\lambda_1)\bigr) & & & \\ & P\bigl(J_{r_2}(\lambda_2)\bigr) & & \\ & & \ddots & \\ & & & P\bigl(J_{r_q}(\lambda_q)\bigr) \end{bmatrix}. \]
To see how $P(\lambda)$ factorizes, let us consider the case in which $\lambda_1 = a = \lambda_2$ and are different from the rest of the $\lambda_i$. Assume also that $r_1 \ge r_2$. Recall that the eigenvectors and principal vectors $v_{ij}$ of $T$ span $\mathbb{C}^N$. Therefore, $u$ can be expressed as a linear combination of the $v_{ij}$. Suppose that
\[ u = \sum_{j=1}^{r_1'} \alpha_{1j} v_{1j} + \sum_{j=1}^{r_2'} \alpha_{2j} v_{2j} + (\text{a linear combination of } \{v_{ij}\},\ i \ge 3), \]
\[ r_1' \le r_1, \quad r_2' \le r_2, \quad r_1' \ge r_2' \ge 1, \quad \alpha_{1 r_1'}, \alpha_{2 r_2'} \ne 0. \]
Then, by (0.37), we have that
\[ (T - aI)^k \Biggl( \sum_{j=1}^{r_1'} \alpha_{1j} v_{1j} + \sum_{j=1}^{r_2'} \alpha_{2j} v_{2j} \Biggr) = 0 \quad \text{only when } k \ge r_1'. \]
This means that $P(\lambda)$ will have $(\lambda - a)^{r_1'}$ as one of its factors.

Example 1.7. Let $T = V J V^{-1}$, where $J$ is the Jordan canonical form of $T$, given as
\[ J = \begin{bmatrix} a & 1 & & & & & & & \\ & a & 1 & & & & & & \\ & & a & & & & & & \\ & & & a & 1 & & & & \\ & & & & a & & & & \\ & & & & & b & 1 & & \\ & & & & & & b & & \\ & & & & & & & b & \\ & & & & & & & & b \end{bmatrix}, \qquad a \ne b. \]
Note that $J$ has five Jordan blocks with $(\lambda_1 = a, r_1 = 3)$, $(\lambda_2 = a, r_2 = 2)$, $(\lambda_3 = b, r_3 = 2)$, $(\lambda_4 = b, r_4 = 1)$, and $(\lambda_5 = b, r_5 = 1)$. Thus, the characteristic polynomial $R(\lambda)$ and the minimal polynomial $Q(\lambda)$ are
\[ R(\lambda) = (\lambda - a)^5 (\lambda - b)^4, \qquad Q(\lambda) = (\lambda - a)^3 (\lambda - b)^2. \]
If
\[ u = 2v_{11} - v_{12} + 3v_{21} + 4v_{32} - 2v_{41} + v_{51}, \]
then $(T - aI)^2$ and $(T - bI)^2$ annihilate the vectors $2v_{11} - v_{12} + 3v_{21}$ and $4v_{32} - 2v_{41} + v_{51}$, respectively. Consequently, the minimal polynomial of $T$ with respect to $u$ is $P(\lambda) = (\lambda - a)^2 (\lambda - b)^2$.
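Example 1.7 can be verified numerically. In the sketch below we take $V = I$, so that $T = J$ and each $v_{ij}$ is a standard basis vector, and we pick the hypothetical values $a = 2$ and $b = 5$; the helper names are illustrative. The code checks that $(T - aI)^2 (T - bI)^2$ annihilates $u$ while neither exponent can be lowered.

```python
def jordan_matrix(blocks):
    """Block-diagonal Jordan matrix from a list of (eigenvalue, block size) pairs."""
    n = sum(r for _, r in blocks)
    J = [[0] * n for _ in range(n)]
    pos = 0
    for lam, r in blocks:
        for k in range(r):
            J[pos + k][pos + k] = lam
            if k + 1 < r:
                J[pos + k][pos + k + 1] = 1
        pos += r
    return J


def apply_shifted_power(J, shift, power, u):
    """Return (J - shift*I)^power u."""
    n = len(u)
    for _ in range(power):
        u = [sum(J[i][j] * u[j] for j in range(n)) - shift * u[i] for i in range(n)]
    return u


a, b = 2, 5   # hypothetical numerical values with a != b
J = jordan_matrix([(a, 3), (a, 2), (b, 2), (b, 1), (b, 1)])

# With V = I the basis vectors are ordered v11, v12, v13, v21, v22, v31, v32, v41, v51,
# so u = 2 v11 - v12 + 3 v21 + 4 v32 - 2 v41 + v51 has the coordinates below.
u = [2, -1, 0, 3, 0, 0, 4, -2, 1]


def annihilates(power_a, power_b):
    """True if (J - aI)^power_a (J - bI)^power_b u is the zero vector."""
    w = apply_shifted_power(J, a, power_a, u)
    w = apply_shifted_power(J, b, power_b, w)
    return all(c == 0 for c in w)


print(annihilates(2, 2), annihilates(1, 2), annihilates(2, 1))
```

Dropping $v_{12}$ from $u$ would lower the required power of $(\lambda - a)$ to one, which illustrates how $P(\lambda)$ depends on the vectors actually present in $u$.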

Remark: From the examples given here, it must be clear that the minimal polynomial of $T$ with respect to $u$ is determined by the eigenvectors and principal vectors of $T$ that are present in the spectral decomposition of $u$. This means that if two vectors $d_1$ and $d_2$, $d_1 \ne d_2$, have the same eigenvectors and principal vectors in their spectral decompositions, then the minimal polynomial of $T$ with respect to $d_1$ is also the minimal polynomial of $T$ with respect to $d_2$.

1.2 Solution to x = T x + d from {x m } 1.2.1 General considerations and notation Let s be the unique solution to the N -dimensional linear system x = T x + d.

(1.5)

Writing this system in the form (I − T )x = d, it becomes clear that the uniqueness of the solution is guaranteed when the matrix I − T is nonsingular or, equivalently,

36

Chapter 1. Development of Polynomial Extrapolation Methods

when T does not have one as its eigenvalue. Starting with an arbitrary vector x 0 , let the vector sequence {x m } be generated via the iterative scheme x m+1 = T x m + d,

m = 0, 1, . . . .

(1.6)

As we have shown already, provided $\rho(T) < 1$, $\lim_{m\to\infty} x_m$ exists and equals $s$. Making use of what we already know about minimal polynomials, we can actually construct $s$ as a linear combination of a finite number (at most $N + 1$) of the vectors $x_m$, whether $\{x_m\}$ converges or not. This is the subject of Theorem 1.8 below. Before we state this theorem, we introduce some notation and a few simple, but useful, facts. Given the sequence $\{x_m\}$, generated as in (1.6), let

$$u_m = \Delta x_m, \quad w_m = \Delta u_m = \Delta^2 x_m, \quad m = 0, 1, \ldots, \tag{1.7}$$

where $\Delta x_m = x_{m+1} - x_m$ and $\Delta^2 x_m = \Delta(\Delta x_m) = x_{m+2} - 2x_{m+1} + x_m$, and define the error vectors $\epsilon_m$ as in

$$\epsilon_m = x_m - s, \quad m = 0, 1, \ldots. \tag{1.8}$$

Using (1.6), it is easy to show by induction that

$$u_m = T^m u_0, \quad w_m = T^m w_0, \quad m = 0, 1, \ldots. \tag{1.9}$$

Similarly, by (1.6) and by the fact that $s = Ts + d$, one can relate the error in $x_{m+1}$ to the error in $x_m$ via

$$\epsilon_{m+1} = (Tx_m + d) - (Ts + d) = T(x_m - s) = T\epsilon_m, \tag{1.10}$$

which, by induction, gives

$$\epsilon_m = T^m \epsilon_0, \quad m = 0, 1, \ldots. \tag{1.11}$$

In addition, we can relate $\epsilon_m$ to $u_m$, and vice versa, via

$$u_m = (T - I)\epsilon_m \quad \text{and} \quad \epsilon_m = (T - I)^{-1} u_m, \quad m = 0, 1, \ldots. \tag{1.12}$$

Similarly, we can relate $u_m$ to $w_m$, and vice versa, via

$$w_m = (T - I)u_m \quad \text{and} \quad u_m = (T - I)^{-1} w_m, \quad m = 0, 1, \ldots. \tag{1.13}$$

Finally, by $T^{m+i} = T^i T^m$, we can also rewrite (1.9) and (1.11) as in

$$T^{m+i} u_0 = T^i u_m, \quad T^{m+i} w_0 = T^i w_m, \quad T^{m+i} \epsilon_0 = T^i \epsilon_m, \quad m = 0, 1, \ldots. \tag{1.14}$$

As usual, $T^0 = I$ in (1.9) and (1.11) and throughout.
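The relations (1.9) and (1.11)–(1.13) are easy to check numerically. The following is a minimal sketch, assuming NumPy and a randomly generated test problem; the matrix `T`, the vector `d`, the sizes `N` and `M`, and all variable names are illustrative choices, not part of the text.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 5, 6

# Hypothetical test problem: random T rescaled so that rho(T) = 0.9, random d.
T = rng.standard_normal((N, N))
T *= 0.9 / np.max(np.abs(np.linalg.eigvals(T)))
d = rng.standard_normal(N)
I = np.eye(N)
s = np.linalg.solve(I - T, d)            # unique solution of x = T x + d

# x_0, ..., x_{M+2} from the iteration (1.6); row m of x holds x_m.
x = [rng.standard_normal(N)]
for _ in range(M + 2):
    x.append(T @ x[-1] + d)
x = np.array(x)

u = np.diff(x, axis=0)                   # u_m = x_{m+1} - x_m
w = np.diff(u, axis=0)                   # w_m = u_{m+1} - u_m
eps = x - s                              # eps_m = x_m - s

Tm = np.linalg.matrix_power(T, M)
checks = {
    "(1.9)  u_m = T^m u_0": np.allclose(u[M], Tm @ u[0]),
    "(1.9)  w_m = T^m w_0": np.allclose(w[M], Tm @ w[0]),
    "(1.11) eps_m = T^m eps_0": np.allclose(eps[M], Tm @ eps[0]),
    "(1.12) u_m = (T - I) eps_m": np.allclose(u[M], (T - I) @ eps[M]),
    "(1.13) w_m = (T - I) u_m": np.allclose(w[M], (T - I) @ u[M]),
}
for name, ok in checks.items():
    print(name, ok)
```

Each check compares the two sides of one identity at the single index $m = M$, up to floating-point rounding.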

1.2.2 Construction of solution via minimal polynomials

Theorem 1.8. Let $P(\lambda)$ be the minimal polynomial of $T$ with respect to $\epsilon_n = x_n - s$, given as

$$P(\lambda) = \sum_{i=0}^{k} c_i \lambda^i, \quad c_k = 1. \tag{1.15}$$

Then $\sum_{i=0}^{k} c_i \neq 0$, and $s$ can be expressed as

$$s = \frac{\sum_{i=0}^{k} c_i x_{n+i}}{\sum_{i=0}^{k} c_i}. \tag{1.16}$$

Proof. By definition of $P(\lambda)$, $P(T)\epsilon_n = 0$. Therefore,

$$0 = P(T)\epsilon_n = \sum_{i=0}^{k} c_i T^i \epsilon_n = \sum_{i=0}^{k} c_i \epsilon_{n+i}, \tag{1.17}$$

the last equality following from (1.14). Therefore,

$$0 = \sum_{i=0}^{k} c_i \epsilon_{n+i} = \sum_{i=0}^{k} c_i x_{n+i} - \Bigl(\sum_{i=0}^{k} c_i\Bigr) s,$$

and solving this for $s$, we obtain (1.16), provided $\sum_{i=0}^{k} c_i \neq 0$. Now, $\sum_{i=0}^{k} c_i = P(1) \neq 0$ since one is not an eigenvalue of $T$, and hence $(\lambda - 1)$ is not a factor of $P(\lambda)$. This completes the proof.

By Theorem 1.8, we need to determine $P(\lambda)$ to construct $s$. By the fact that $P(\lambda)$ is uniquely defined via $P(T)\epsilon_n = 0$, it seems that we have to actually know $\epsilon_n$ to know $P(\lambda)$. However, since $\epsilon_n = x_n - s$ and since $s$ is unknown, we have no way of knowing $\epsilon_n$. Fortunately, we can obtain $P(\lambda)$ solely from our knowledge of the vectors $x_m$. This we achieve with the help of Theorem 1.9.

Theorem 1.9. The minimal polynomial of $T$ with respect to $\epsilon_n$ is also the minimal polynomial of $T$ with respect to $u_n = x_{n+1} - x_n$.

Proof. Let $P(\lambda)$ be the minimal polynomial of $T$ with respect to $\epsilon_n$ as before, and denote by $S(\lambda)$ the minimal polynomial of $T$ with respect to $u_n$. Thus,

$$P(T)\epsilon_n = 0 \tag{1.18}$$

and

$$S(T)u_n = 0. \tag{1.19}$$

Multiplying (1.18) by $(T - I)$, which commutes with $P(T)$, and recalling from (1.12) that $(T - I)\epsilon_n = u_n$, we obtain $P(T)u_n = 0$. By Theorem 1.6, this implies that $S(\lambda)$ divides $P(\lambda)$. Next, again by (1.12), we can rewrite (1.19) as $(T - I)S(T)\epsilon_n = 0$, which, upon multiplying by $(T - I)^{-1}$, gives $S(T)\epsilon_n = 0$. By Theorem 1.6, this implies that $P(\lambda)$ divides $S(\lambda)$. Therefore, $P(\lambda) \equiv S(\lambda)$.

What Theorem 1.9 says is that $P(\lambda)$ in Theorem 1.8 satisfies

$$P(T)u_n = 0 \tag{1.20}$$

and has smallest degree. Now, since all the vectors $x_m$ are available to us, so are the vectors $u_m = x_{m+1} - x_m$. Thus, the polynomial $P(\lambda)$ can now be determined from (1.20), as we show next.

First, by (1.20), (1.15), and (1.14), we have that

$$0 = P(T)u_n = \sum_{i=0}^{k} c_i T^i u_n = \sum_{i=0}^{k} c_i u_{n+i}. \tag{1.21}$$

Next, recalling that $c_k = 1$, we rewrite $\sum_{i=0}^{k} c_i u_{n+i} = 0$ in the form

$$\sum_{i=0}^{k-1} c_i u_{n+i} = -u_{n+k}. \tag{1.22}$$

Let us express (1.21) and (1.22) more conveniently in matrix form. For this, let us define the matrices $U_j$ as

$$U_j = [\, u_n \,|\, u_{n+1} \,|\, \cdots \,|\, u_{n+j} \,]. \tag{1.23}$$

Thus, $U_j$ is an $N \times (j + 1)$ matrix, $u_n, u_{n+1}, \ldots, u_{n+j}$ being its columns. In this notation, (1.21) and (1.22) read, respectively,

$$U_k c = 0, \quad c = [c_0, c_1, \ldots, c_k]^T, \tag{1.24}$$

and

$$U_{k-1} c' = -u_{n+k}, \quad c' = [c_0, c_1, \ldots, c_{k-1}]^T. \tag{1.25}$$

We will continue to use this notation without further explanation below. Clearly, (1.25) is a system of $N$ linear equations in the $k$ unknowns $c_0, c_1, \ldots, c_{k-1}$ and is in general overdetermined since $k \leq N$. Nevertheless, by Theorem 1.9, it is consistent and has a unique solution for the $c_i$. With this, we see that the solution $s$ in (1.16) is determined completely by the $k + 2$ vectors $x_n, x_{n+1}, \ldots, x_{n+k+1}$.

We now express $s$ in a form that is slightly different from that in (1.16). With $c_k = 1$ again, let us set

$$\gamma_i = \frac{c_i}{\sum_{j=0}^{k} c_j}, \quad i = 0, 1, \ldots, k. \tag{1.26}$$

This is allowed because $\sum_{j=0}^{k} c_j = P(1) \neq 0$ by Theorem 1.8. Obviously,

$$\sum_{i=0}^{k} \gamma_i = 1. \tag{1.27}$$

Thus, (1.16) becomes

$$s = \sum_{i=0}^{k} \gamma_i x_{n+i}. \tag{1.28}$$

Dividing (1.24) by $\sum_{j=0}^{k} c_j$, and invoking (1.26), we realize that the $\gamma_i$ satisfy the system

$$U_k \gamma = 0 \quad \text{and} \quad \sum_{i=0}^{k} \gamma_i = 1, \quad \gamma = [\gamma_0, \gamma_1, \ldots, \gamma_k]^T. \tag{1.29}$$

This is a linear system of $N + 1$ equations in the $k + 1$ unknowns $\gamma_0, \gamma_1, \ldots, \gamma_k$. It is generally overdetermined, but consistent, and has a unique solution.

At this point, we note again that $s$ is the solution to $(I - T)x = d$, whether $\rho(T) < 1$ or not. Thus, with the $\gamma_i$ as determined above, $s = \sum_{i=0}^{k} \gamma_i x_{n+i}$, whether $\lim_{m\to\infty} x_m$ exists or not. We close this section with an observation concerning the polynomial $P(\lambda)$ in Theorem 1.9.

Proposition 1.10. Let $k$ be the degree of $P(\lambda)$, the minimal polynomial of $T$ with respect to $u_n$. Then the sets $\{u_n, u_{n+1}, \ldots, u_{n+j}\}$, $j < k$, are linearly independent, while the set $\{u_n, u_{n+1}, \ldots, u_{n+k}\}$ is not. The vector $u_{n+k}$ is a linear combination of $u_{n+i}$, $0 \leq i \leq k - 1$, as shown in (1.22).
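The construction of this section can be tried out on a small divergent iteration. The sketch below is an illustration under stated assumptions, not an algorithm from the text: it uses NumPy, a hypothetical diagonalizable matrix `T` with three distinct eigenvalues (so that $k = 3$ for a generic starting vector) and $\rho(T) = 2 > 1$, and it solves the consistent system (1.25) with a least-squares routine.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 6
# Hypothetical T: diagonalizable with three distinct eigenvalues, so the minimal
# polynomial with respect to a generic vector has degree k = 3; rho(T) = 2 > 1.
lam = np.array([2.0, 2.0, -1.5, -1.5, 0.5, 0.5])
V = rng.standard_normal((N, N))
T = V @ np.diag(lam) @ np.linalg.inv(V)
d = rng.standard_normal(N)
s_exact = np.linalg.solve(np.eye(N) - T, d)

n, k = 1, 3
x = [rng.standard_normal(N)]
for _ in range(n + k + 1):
    x.append(T @ x[-1] + d)
x = np.array(x)                    # rows x_0, ..., x_{n+k+1}
u = np.diff(x, axis=0)             # rows u_0, ..., u_{n+k}

U_k = u[n:n + k + 1].T             # N x (k+1) matrix [u_n | ... | u_{n+k}]
# (1.25): U_{k-1} c' = -u_{n+k}; the system is consistent here, so the
# least-squares solution solves it up to rounding.
c_prime, *_ = np.linalg.lstsq(U_k[:, :k], -U_k[:, k], rcond=None)
c = np.append(c_prime, 1.0)        # c_k = 1
gamma = c / c.sum()                # (1.26)
s = gamma @ x[n:n + k + 1]         # (1.28)

print("residual of (1.24):", np.linalg.norm(U_k @ c))
print("error in s:        ", np.linalg.norm(s - s_exact))
print("c (lowest degree first):", c)
```

Although the iterates grow without bound, the combination (1.28) recovers $s$ from the $k + 2 = 5$ vectors $x_1, \ldots, x_5$, and the computed $c_i$ are the coefficients of $(\lambda - 2)(\lambda + 1.5)(\lambda - 0.5)$.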

1.3 Derivation of MPE, RRE, MMPE, and SVD-MPE

1.3.1 General remarks

So far, we have seen that $s$ can be computed via a sum of the form $\sum_{i=0}^{k} \gamma_i x_{n+i}$, with $\sum_{i=0}^{k} \gamma_i = 1$, once $P(\lambda) = \sum_{i=0}^{k} c_i \lambda^i$ ($c_k = 1$), the minimal polynomial of $T$ with respect to $\epsilon_n$, has been determined. We have seen that $P(\lambda)$ is also the minimal polynomial of $T$ with respect to $u_n$ and can be determined uniquely by solving the generally overdetermined, but consistent, system of linear equations in (1.22) or, equivalently, (1.25). However, the degree of the minimal polynomial of $T$ with respect to $\epsilon_n$ can be as large as $N$. Because $N$ can be a very large integer in general, determining $s$ in the way we have described here becomes prohibitively expensive as far as computation time and storage requirements are concerned. [Note that we need to store the vectors $u_n, u_{n+1}, \ldots, u_{n+k}$ and solve the $N \times k$ linear system in (1.22).] Thus, we conclude that computing $s$ via a combination of the iteration vectors $x_m$, as described above, may not be feasible after all.

Nevertheless, with a twist, we can use the framework developed thus far to approximate $s$ effectively. To do this, we replace the minimal polynomial of $T$ with respect to $u_n$ (or $\epsilon_n$) by another unknown polynomial, whose degree is smaller (in fact, much smaller) than $N$ and is at our disposal.

Let us denote the degree of the minimal polynomial of $T$ with respect to $u_n$ by $k_0$; of course, $k_0 \leq N$. Then, by Definition 1.5, it is clear that the sets $\{u_n, u_{n+1}, \ldots, u_{n+k}\}$, $0 \leq k \leq k_0 - 1$, are linearly independent, but the set $\{u_n, u_{n+1}, \ldots, u_{n+k_0}\}$ is not. This implies that the matrices $U_k$, $k = 0, 1, \ldots, k_0 - 1$, are of full rank, but $U_{k_0}$ is not; that is,

$$\operatorname{rank}(U_k) = k + 1, \quad 0 \leq k \leq k_0 - 1, \qquad \operatorname{rank}(U_{k_0}) = k_0. \tag{1.30}$$

1.3.2 Derivation of MPE

Let us choose $k$ to be an arbitrary positive integer that is normally much smaller than the degree of the minimal polynomial of $T$ with respect to $u_n$ (hence also $\epsilon_n$) and, therefore, also much smaller than $N$. In view of Proposition 1.10, the overdetermined linear system $U_{k-1} c' = -u_{n+k}$ in (1.25) is now clearly inconsistent and hence has no solution for $c_0, c_1, \ldots, c_{k-1}$ in the ordinary sense. To get around this problem, we solve this system in the least-squares sense, since such a solution always exists. Following that, we let $c_k = 1$ and, provided $\sum_{i=0}^{k} c_i \neq 0$, we compute $\gamma_0, \gamma_1, \ldots, \gamma_k$ precisely as in (1.26) and then compute the vector $s_{n,k} = \sum_{i=0}^{k} \gamma_i x_{n+i}$ as our approximation to $s$. The resulting method is MPE.

We can summarize the definition of MPE through the following steps:

1. Choose the integers $k$ and $n$ and input the vectors $x_n, x_{n+1}, \ldots, x_{n+k+1}$.

2. Compute the vectors $u_n, u_{n+1}, \ldots, u_{n+k}$ and form the $N \times k$ matrix $U_{k-1}$. (Recall that $u_m = x_{m+1} - x_m$.)

3. Solve the overdetermined linear system $U_{k-1} c' = -u_{n+k}$ in the least-squares sense for $c' = [c_0, c_1, \ldots, c_{k-1}]^T$. This amounts to solving the optimization problem

   $$\min_{c_0, c_1, \ldots, c_{k-1}} \Bigl\| \sum_{i=0}^{k-1} c_i u_{n+i} + u_{n+k} \Bigr\|, \tag{1.31}$$

   which can also be expressed as

   $$\min_{c'} \| U_{k-1} c' + u_{n+k} \|, \quad c' = [c_0, c_1, \ldots, c_{k-1}]^T. \tag{1.32}$$

   Here the vector norm $\|\cdot\|$ that we are using is defined via $\|z\| = \sqrt{(z, z)}$, where $(\cdot\,, \cdot)$ is an arbitrary inner product at our disposal, as defined in (1.1).$^{10}$ With $c_0, c_1, \ldots, c_{k-1}$ available, set $c_k = 1$ and compute $\gamma_i = c_i / \sum_{j=0}^{k} c_j$, $i = 0, 1, \ldots, k$, provided $\sum_{j=0}^{k} c_j \neq 0$.

4. Compute $s_{n,k} = \sum_{i=0}^{k} \gamma_i x_{n+i}$ as an approximation to $\lim_{m\to\infty} x_m = s$.
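The four steps above can be sketched directly in code. This is a minimal illustration, assuming NumPy, real data, and the Euclidean inner product (the text allows any inner product); the function name `mpe` and the test problem are hypothetical, and the sketch is not the efficient implementation developed later in the book.

```python
import numpy as np

def mpe(xs):
    """Minimal polynomial extrapolation with the Euclidean inner product.

    xs : array of shape (k + 2, N) whose rows are x_n, ..., x_{n+k+1}.
    Returns (s_nk, gamma).
    """
    xs = np.asarray(xs, dtype=float)
    k = xs.shape[0] - 2
    U = np.diff(xs, axis=0).T                      # N x (k+1): [u_n | ... | u_{n+k}]
    # Step 3: least-squares solution of U_{k-1} c' = -u_{n+k}, cf. (1.32).
    c_prime, *_ = np.linalg.lstsq(U[:, :k], -U[:, k], rcond=None)
    c = np.append(c_prime, 1.0)                    # c_k = 1
    total = c.sum()
    if total == 0.0:
        raise ZeroDivisionError("sum of the c_i vanishes; s_{n,k} is not defined")
    gamma = c / total
    return gamma @ xs[:k + 1], gamma               # Step 4

# Hypothetical example: symmetric T with eigenvalues in [-0.5, 0.95].
rng = np.random.default_rng(2)
N, n, k = 40, 0, 6
Qm, _ = np.linalg.qr(rng.standard_normal((N, N)))
T = Qm @ np.diag(np.linspace(-0.5, 0.95, N)) @ Qm.T
d = rng.standard_normal(N)
s_exact = np.linalg.solve(np.eye(N) - T, d)

x = [np.zeros(N)]
for _ in range(n + k + 1):
    x.append(T @ x[-1] + d)
x = np.array(x)

s_nk, gamma = mpe(x[n:n + k + 2])
U = np.diff(x[n:n + k + 2], axis=0).T
normal_eq = U[:, :k].T @ (U @ gamma)   # vanishes for a least-squares solution
print("error of x_{n+k+1}:", np.linalg.norm(x[n + k + 1] - s_exact))
print("error of s_{n,k}:  ", np.linalg.norm(s_nk - s_exact))
```

The vector `normal_eq` checks the normal equations of (1.32): the residual $U_{k-1} c' + u_{n+k}$ is orthogonal to the columns of $U_{k-1}$.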

1.3.3 Derivation of RRE

Again, let us choose $k$ to be an arbitrary positive integer that is normally much smaller than the degree of the minimal polynomial of $T$ with respect to $u_n$ (hence also $\epsilon_n$) and, therefore, also much smaller than $N$. In view of Proposition 1.10, the overdetermined linear system $U_k \gamma = 0$ in (1.29), subject to $\sum_{i=0}^{k} \gamma_i = 1$, is inconsistent, hence has no solution for $\gamma_0, \gamma_1, \ldots, \gamma_k$ in the ordinary sense. Therefore, we solve the system $U_k \gamma = 0$ in the least-squares sense, with the equation $\sum_{i=0}^{k} \gamma_i = 1$ serving as a constraint. Note that such a solution always exists. Following that, we compute the vector $s_{n,k} = \sum_{i=0}^{k} \gamma_i x_{n+i}$ as our approximation to $s$. The resulting method is RRE. This approach to RRE was essentially given by Kaniel and Stein [155] and by Mešina [185];$^{11}$ however, their motivations are different from the one that we have used here, which goes through the minimal polynomial of a matrix with respect to a vector.

We can summarize the definition of RRE through the following steps:

1. Choose the integers $k$ and $n$ and input the vectors $x_n, x_{n+1}, \ldots, x_{n+k+1}$.

2. Compute the vectors $u_n, u_{n+1}, \ldots, u_{n+k}$ and form the $N \times (k + 1)$ matrix $U_k$. (Recall that $u_m = x_{m+1} - x_m$.)

3. Solve the overdetermined linear system $U_k \gamma = 0$ in the least-squares sense, subject to the constraint $\sum_{i=0}^{k} \gamma_i = 1$. This amounts to solving the optimization problem

   $$\min_{\gamma_0, \gamma_1, \ldots, \gamma_k} \Bigl\| \sum_{i=0}^{k} \gamma_i u_{n+i} \Bigr\| \quad \text{subject to} \quad \sum_{i=0}^{k} \gamma_i = 1, \tag{1.33}$$

   which can also be expressed as

   $$\min_{\gamma} \| U_k \gamma \| \quad \text{subject to} \quad \sum_{i=0}^{k} \gamma_i = 1; \quad \gamma = [\gamma_0, \gamma_1, \ldots, \gamma_k]^T. \tag{1.34}$$

   Here too the vector norm $\|\cdot\|$ that we are using is defined via $\|z\| = \sqrt{(z, z)}$, where $(\cdot\,, \cdot)$ is an arbitrary inner product at our disposal, as defined in (1.1).$^{12}$

4. Compute $s_{n,k} = \sum_{i=0}^{k} \gamma_i x_{n+i}$ as an approximation to $\lim_{m\to\infty} x_m = s$.

$^{10}$ For the use of other norms, see Section 1.6.

$^{11}$ The two works compute the $\gamma_i$ in the same way but differ in the way they compute $s_{n,k}$; namely, $s_{n,k} = \sum_{i=0}^{k} \gamma_i x_{n+i}$ in [185], while $s_{n,k} = \sum_{i=0}^{k} \gamma_i x_{n+i+1}$ in [155]. Note that, in both methods, only the vectors $x_n, x_{n+1}, \ldots, x_{n+k+1}$ are used to compute $s_{n,k}$.

Another approach to RRE

The constrained minimization problem satisfied by the $\gamma_i$ can be replaced by a different unconstrained minimization problem as follows: By repeated application of (1.7), we have

$$x_{n+i} = x_n + \sum_{j=0}^{i-1} u_{n+j} \quad \text{and} \quad u_{n+i} = u_n + \sum_{j=0}^{i-1} w_{n+j}. \tag{1.35}$$

Therefore, recalling also that $\sum_{i=0}^{k} \gamma_i = 1$, we have

$$\sum_{i=0}^{k} \gamma_i x_{n+i} = x_n + \sum_{j=0}^{k-1} \xi_j u_{n+j} \quad \text{and} \quad \sum_{i=0}^{k} \gamma_i u_{n+i} = u_n + \sum_{j=0}^{k-1} \xi_j w_{n+j}, \tag{1.36}$$

where

$$\xi_j = \sum_{i=j+1}^{k} \gamma_i = 1 - \sum_{i=0}^{j} \gamma_i, \quad j = 0, 1, \ldots, k - 1. \tag{1.37}$$

Therefore, we can replace (1.33) by

$$\min_{\xi_0, \xi_1, \ldots, \xi_{k-1}} \Bigl\| u_n + \sum_{j=0}^{k-1} \xi_j w_{n+j} \Bigr\|, \tag{1.38}$$

which can also be expressed, and replaces (1.34), as

$$\min_{\xi} \| W_{k-1} \xi + u_n \|, \quad W_j = [\, w_n \,|\, w_{n+1} \,|\, \cdots \,|\, w_{n+j} \,], \quad \xi = [\xi_0, \xi_1, \ldots, \xi_{k-1}]^T. \tag{1.39}$$

With $\xi_i$ determined, we can now go back to (1.36) and compute $s_{n,k}$ through

$$s_{n,k} = x_n + \sum_{j=0}^{k-1} \xi_j u_{n+j}.$$

We can also compute $s_{n,k}$ through

$$s_{n,k} = \sum_{i=0}^{k} \gamma_i x_{n+i},$$

with the $\gamma_i$ computed from the $\xi_j$ with the help of (1.37) as

$$\gamma_0 = 1 - \xi_0, \quad \gamma_i = \xi_{i-1} - \xi_i, \quad i = 1, \ldots, k - 1, \quad \gamma_k = \xi_{k-1}. \tag{1.40}$$

$^{12}$ For the use of other norms, see Section 1.6.

We will make partial use of this approach in the next two chapters, where we develop efficient algorithms to implement polynomial extrapolation methods. Note that we have now defined RRE, with the help of (1.36) and (1.38), only via the $\xi_i$ and without any reference to the $\gamma_i$. This is precisely how RRE was defined by Eddy [74]. We have thus shown here that the $s_{n,k}$ obtained in [74] is the same as that obtained in [185]. This equivalence of the methods was first proved in Smith, Ford, and Sidi [306].

The approach we have just presented allows us to express $s_{n,k}$ in terms of the matrices $U_{k-1}$ and $W_{k-1}$ in a simple way when the norm used is the standard $l_2$-norm, that is, $\|z\|_2 = \sqrt{z^* z}$. In this case, the solution to the minimization problem in (1.39) can be expressed in the form

$$\xi = -W_{k-1}^{+} u_n,$$

where $A^{+}$ stands for the Moore–Penrose generalized inverse$^{13}$ of the matrix $A$, whether $A$ is square and nonsingular or not. Next, by (1.36),

$$s_{n,k} = x_n + \sum_{j=0}^{k-1} \xi_j u_{n+j} = x_n + U_{k-1} \xi.$$

Combining these two relations, we obtain

$$s_{n,k} = x_n - U_{k-1} W_{k-1}^{+} u_n.$$

A special case: The method of Henrici

Let us consider the case $k = N$. Then $U_{N-1}$ and $W_{N-1}$ are both $N \times N$ matrices. If, in addition, $W_{N-1}$ is nonsingular, then $s_{n,N}$ becomes

$$s_{n,N} = x_n - U_{N-1} W_{N-1}^{-1} u_n,$$

which was proposed by Henrici in [130, Section 5.9] as a vector generalization of the method of Steffensen for nonlinear scalar equations. Clearly, this method can be used only when $N$ is small. For more on the method of Henrici, see Sadok [244]. For the method of Steffensen, see also Ralston and Rabinowitz [214], Atkinson [9], Stoer and Bulirsch [313], and Sidi [282].
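The formula $s_{n,k} = x_n - U_{k-1} W_{k-1}^{+} u_n$ and the relations (1.40) can be sketched as follows. This is an illustration under assumptions: NumPy, real data, the $l_2$-norm, and a hypothetical function name `rre`; `np.linalg.pinv` is used for the Moore–Penrose inverse, which is convenient for a small example but is not how one would implement RRE for large $N$.

```python
import numpy as np

def rre(xs):
    """Reduced rank extrapolation via the unconstrained problem (1.39), l2-norm.

    xs : array of shape (k + 2, N) whose rows are x_n, ..., x_{n+k+1}.
    Returns (s_nk, gamma).
    """
    xs = np.asarray(xs, dtype=float)
    k = xs.shape[0] - 2
    u = np.diff(xs, axis=0)                 # rows u_n, ..., u_{n+k}
    w = np.diff(u, axis=0)                  # rows w_n, ..., w_{n+k-1}
    U_km1, W_km1 = u[:k].T, w.T             # both N x k
    xi = -np.linalg.pinv(W_km1) @ u[0]      # xi = -W_{k-1}^+ u_n
    s_nk = xs[0] + U_km1 @ xi               # s_{n,k} = x_n + U_{k-1} xi
    gamma = np.empty(k + 1)                 # (1.40)
    gamma[0] = 1.0 - xi[0]
    gamma[1:k] = xi[:k - 1] - xi[1:]
    gamma[k] = xi[k - 1]
    return s_nk, gamma

# Hypothetical small linear problem (N = 4), so that k = N is affordable.
rng = np.random.default_rng(3)
N = 4
T = rng.standard_normal((N, N))
T *= 0.8 / np.max(np.abs(np.linalg.eigvals(T)))
d = rng.standard_normal(N)
s_exact = np.linalg.solve(np.eye(N) - T, d)
x = [np.zeros(N)]
for _ in range(N + 1):
    x.append(T @ x[-1] + d)
x = np.array(x)                              # x_0, ..., x_{N+1}

s_2, gamma_2 = rre(x[:4])                    # k = 2
s_N, gamma_N = rre(x)                        # k = N: the special case of Henrici
u = np.diff(x, axis=0)
res_2 = np.linalg.norm(u[:3].T @ gamma_2)    # ||U_2 gamma||, minimal over sum(gamma) = 1
print("k = 2 error:", np.linalg.norm(s_2 - s_exact))
print("k = N error:", np.linalg.norm(s_N - s_exact))
```

With $k = N$ and $W_{N-1}$ nonsingular, the result agrees with $s$ up to rounding, in line with the finite termination property for linear iterations.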

$^{13}$ For the Moore–Penrose generalized inverse, see Appendix C and the sources cited there.

1.3.4 Derivation of MMPE

Recall that the $c_i$ in MPE are obtained by solving the linear system $U_{k-1} c' + u_{n+k} = 0$ in the least-squares sense, that is, by requiring that the vector $U_{k-1} c' + u_{n+k}$ have minimum norm with respect to some appropriate norm. We now demand that this vector be orthogonal to (or have zero projection in) a fixed subspace of dimension $k$. Thus, if we let $q_1, \ldots, q_k$ be a basis for this subspace, then $c_0, c_1, \ldots, c_{k-1}$ are the solution to

$$\bigl(q_i, U_{k-1} c' + u_{n+k}\bigr) = 0, \quad i = 1, \ldots, k. \tag{1.41}$$

The resulting method is MMPE.

We can summarize the definition of MMPE through the following steps:

1. Choose the integers $k$ and $n$ and input the vectors $x_n, x_{n+1}, \ldots, x_{n+k+1}$.

2. Compute the vectors $u_n, u_{n+1}, \ldots, u_{n+k}$ and form the $N \times k$ matrix $U_{k-1}$. (Recall that $u_m = x_{m+1} - x_m$.)

3. Solve the linear system

   $$\bigl(q_i, U_{k-1} c'\bigr) = -\bigl(q_i, u_{n+k}\bigr), \quad i = 1, \ldots, k, \tag{1.42}$$

   which can also be expressed as

   $$\sum_{j=0}^{k-1} \bigl(q_i, u_{n+j}\bigr) c_j = -\bigl(q_i, u_{n+k}\bigr), \quad i = 1, \ldots, k. \tag{1.43}$$

   This is, in fact, a system of $k$ linear equations for the $k$ unknowns $c_0, c_1, \ldots, c_{k-1}$. Here $(\cdot\,, \cdot)$ is an arbitrary inner product. [For example, when $(y, z) = \langle y, z \rangle = y^* z$ and $q_i = e_i$, $i = 1, \ldots, k$, the equations in (1.43) are the first $k$ of the $N$ equations of the linear system $\sum_{j=0}^{k-1} c_j u_{n+j} = -u_{n+k}$.] With $c_0, c_1, \ldots, c_{k-1}$ available, set $c_k = 1$ and compute $\gamma_i = c_i / \sum_{j=0}^{k} c_j$, $i = 0, 1, \ldots, k$, provided $\sum_{j=0}^{k} c_j \neq 0$.

4. Compute $s_{n,k} = \sum_{i=0}^{k} \gamma_i x_{n+i}$ as an approximation to $\lim_{m\to\infty} x_m = s$.
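The MMPE steps reduce to one small $k \times k$ solve. The sketch below is an illustration assuming NumPy, real data, and the Euclidean inner product; the function name `mmpe`, the optional argument `Q` (whose columns play the role of $q_1, \ldots, q_k$), and the test problem are hypothetical.

```python
import numpy as np

def mmpe(xs, Q=None):
    """Modified MPE with the Euclidean inner product (real data assumed).

    xs : array of shape (k + 2, N) whose rows are x_n, ..., x_{n+k+1}.
    Q  : optional N x k array with columns q_1, ..., q_k; defaults to e_1, ..., e_k.
    Returns (s_nk, gamma).
    """
    xs = np.asarray(xs, dtype=float)
    k = xs.shape[0] - 2
    U = np.diff(xs, axis=0).T                    # N x (k+1)
    if Q is None:
        Q = np.eye(U.shape[0])[:, :k]            # q_i = e_i, i = 1, ..., k
    A = Q.T @ U[:, :k]                           # entries (q_i, u_{n+j})
    b = -Q.T @ U[:, k]                           # entries -(q_i, u_{n+k})
    c = np.append(np.linalg.solve(A, b), 1.0)    # k x k system (1.43), then c_k = 1
    gamma = c / c.sum()
    return gamma @ xs[:k + 1], gamma

# Hypothetical linear test problem.
rng = np.random.default_rng(4)
N, k = 10, 3
T = rng.standard_normal((N, N))
T *= 0.7 / np.max(np.abs(np.linalg.eigvals(T)))
d = rng.standard_normal(N)
xs = [np.zeros(N)]
for _ in range(k + 1):
    xs.append(T @ xs[-1] + d)
xs = np.array(xs)                                # x_0, ..., x_{k+1}  (n = 0)
U = np.diff(xs, axis=0).T

s_e, gamma_e = mmpe(xs)                          # q_i = e_i
Q = rng.standard_normal((N, k))                  # another arbitrary fixed basis
s_q, gamma_q = mmpe(xs, Q)
print("first k components of U_k gamma:", (U @ gamma_e)[:k])
```

With $q_i = e_i$, the first $k$ components of $U_k \gamma$ vanish, which is the bracketed remark in step 3.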

1.3.5 Derivation of SVD-MPE

Again, we recall that we obtain the $c_i$ in MPE by what amounts to solving the linear system $U_k c = 0$ with $c_k = 1$ in the least-squares sense. This we accomplish by minimizing $\|U_k c\|$ with respect to $c$, subject to the constraint $c_k = 1$, where the norm $\|\cdot\|$ is defined by $\|z\| = \sqrt{(z, z)}$, $(\cdot\,, \cdot)$ being an arbitrary inner product. We now propose to modify MPE by minimizing $\|U_k c\|_M$ with respect to $c$, now subject to the constraint $\|c\|_L = 1$, where $M \in \mathbb{C}^{N \times N}$ and $L \in \mathbb{C}^{(k+1) \times (k+1)}$ are Hermitian positive definite matrices. That is, determine the vector $c = [c_0, c_1, \ldots, c_k]^T$ as the solution to the constrained minimization problem

$$\min_{c} \|U_k c\|_M \quad \text{subject to} \quad \|c\|_L = 1, \quad c = [c_0, c_1, \ldots, c_k]^T,$$

where $\|y\|_M = \sqrt{y^* M y}$ and $\|z\|_L = \sqrt{z^* L z}$. With $c = [c_0, c_1, \ldots, c_k]^T$ determined, we next compute the $\gamma_i$ via $\gamma_i = c_i / \sum_{j=0}^{k} c_j$, provided $\sum_{j=0}^{k} c_j \neq 0$, and then compute $s_{n,k} = \sum_{i=0}^{k} \gamma_i x_{n+i}$ as an approximation to $\lim_{m\to\infty} x_m = s$. A complete algorithmic treatment of the resulting method has been given recently in Sidi [290].

To keep the treatment and the technical details of SVD-MPE simple, in what follows, we take $M = I_N$ and $L = I_{k+1}$, that is, we use the standard $l_2$-norm $\|z\|_2 = \sqrt{z^* z}$ in both $\mathbb{C}^N$ and $\mathbb{C}^{k+1}$. Then the vector $c$ turns out to be the right singular vector of $U_k$ corresponding to the smallest singular value. On account of this, we call the resulting method SVD-MPE.

We can summarize the definition of SVD-MPE through the following steps:

1. Choose the integers $k$ and $n$ and input the vectors $x_n, x_{n+1}, \ldots, x_{n+k+1}$.

2. Compute the vectors $u_n, u_{n+1}, \ldots, u_{n+k}$ and form the $N \times (k + 1)$ matrix $U_k$. (Recall that $u_m = x_{m+1} - x_m$.)

3. Solve the standard $l_2$ constrained minimization problem

   $$\min_{c} \|U_k c\|_2 \quad \text{subject to} \quad \|c\|_2 = 1, \quad c = [c_0, c_1, \ldots, c_k]^T. \tag{1.44}$$

   The solution $c$ is the right singular vector corresponding to the smallest singular value $\sigma_{\min}$ of $U_k$; that is, $U_k^* U_k c = \sigma_{\min}^2 c$, $\|c\|_2 = 1$. We assume that $\sigma_{\min}$ is simple so that $c$ is unique up to a multiplicative constant $\phi$, $|\phi| = 1$. With $c_0, c_1, \ldots, c_k$ available, compute $\gamma_i = c_i / \sum_{j=0}^{k} c_j$, $i = 0, 1, \ldots, k$, provided $\sum_{j=0}^{k} c_j \neq 0$. The assumption that $\sigma_{\min}$ is simple guarantees the uniqueness of the $\gamma_i$. Note that if $u_n, u_{n+1}, \ldots, u_{n+k}$ are linearly independent, then $U_k$ has full rank and $U_k^* U_k$ is positive definite. As a result, all its eigenvalues are positive, and $\sigma_{\min}$ is the positive square root of the smallest eigenvalue of $U_k^* U_k$.

4. Compute $s_{n,k} = \sum_{i=0}^{k} \gamma_i x_{n+i}$ as an approximation to $\lim_{m\to\infty} x_m = s$.

Clearly, the vector $c$ can be obtained from the singular value decomposition (SVD) of $U_k$, namely, from $U_k = G \Sigma H^*$, where

$$G = [\, g_0 \,|\, g_1 \,|\, \cdots \,|\, g_k \,] \in \mathbb{C}^{N \times (k+1)}, \quad g_i^* g_j = \delta_{ij}, \tag{1.45}$$

$$H = [\, h_0 \,|\, h_1 \,|\, \cdots \,|\, h_k \,] \in \mathbb{C}^{(k+1) \times (k+1)}, \quad h_i^* h_j = \delta_{ij}, \quad 0 \leq i, j \leq k, \tag{1.46}$$

and

$$\Sigma = \operatorname{diag}(\sigma_0, \sigma_1, \ldots, \sigma_k) \in \mathbb{R}^{(k+1) \times (k+1)}, \quad \sigma_0 \geq \sigma_1 \geq \cdots \geq \sigma_k > 0, \tag{1.47}$$

such that

$$U_k h_j = \sigma_j g_j, \quad U_k^* g_j = \sigma_j h_j, \quad j = 0, 1, \ldots, k. \tag{1.48}$$

Then $\sigma_{\min} = \sigma_k$ and $c = h_k$.
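The relation $c = h_k$ translates into a few lines with a library SVD. The following is a minimal sketch, assuming NumPy, real data, $M = I_N$, $L = I_{k+1}$, and $N \geq k + 1$; the function name `svd_mpe` and the test problem are hypothetical. `np.linalg.svd` returns the factor $H^*$ with singular values in decreasing order, so its last row is $h_k^*$.

```python
import numpy as np

def svd_mpe(xs):
    """SVD-MPE with the standard l2-norms (M = I, L = I).

    xs : array of shape (k + 2, N) whose rows are x_n, ..., x_{n+k+1}.
    Returns (s_nk, c, sigma_min) with ||c||_2 = 1.
    """
    xs = np.asarray(xs, dtype=float)
    k = xs.shape[0] - 2
    U = np.diff(xs, axis=0).T                               # N x (k+1), N >= k+1 assumed
    G, sigma, Hh = np.linalg.svd(U, full_matrices=False)    # U = G diag(sigma) Hh
    c = Hh[-1].conj()                                       # c = h_k
    gamma = c / c.sum()                                     # requires sum(c) != 0
    return gamma @ xs[:k + 1], c, sigma[-1]

# Hypothetical linear test problem.
rng = np.random.default_rng(6)
N, k = 12, 4
T = rng.standard_normal((N, N))
T *= 0.7 / np.max(np.abs(np.linalg.eigvals(T)))
d = rng.standard_normal(N)
xs = [np.zeros(N)]
for _ in range(k + 1):
    xs.append(T @ xs[-1] + d)
xs = np.array(xs)
U = np.diff(xs, axis=0).T

s_nk, c, sigma_min = svd_mpe(xs)

# Compare ||U_k c|| with ||U_k v|| for some random unit vectors v.
vs = rng.standard_normal((100, k + 1))
vs /= np.linalg.norm(vs, axis=1, keepdims=True)
smallest_random = np.linalg.norm(U @ vs.T, axis=0).min()
print("||U_k c||_2 =", np.linalg.norm(U @ c), " sigma_min =", sigma_min)
```

The sign (or phase) ambiguity in $c$ is harmless, since it cancels in $\gamma_i = c_i / \sum_j c_j$.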

1.3.6 A unification of polynomial extrapolation methods: Generalized MPE

The derivation of the four polynomial methods above can be formally unified by reconsidering the derivation of MMPE. For MMPE, we have so far required the vector $U_{k-1} c' + u_{n+k}$ to be orthogonal to a fixed subspace of dimension $k$ spanned by the vectors $q_1, \ldots, q_k$. We can generalize this idea by requiring that $U_{k-1} c' + u_{n+k}$ be orthogonal to a subspace of dimension $k$ varying with $n$. If this subspace is spanned by the vectors $q_1^{(n)}, \ldots, q_k^{(n)}$, then the equations (1.43) that define MMPE become

$$\sum_{j=0}^{k-1} \bigl(q_i^{(n)}, u_{n+j}\bigr) c_j = -\bigl(q_i^{(n)}, u_{n+k}\bigr), \quad i = 1, \ldots, k. \tag{1.49}$$

As we will see later in Theorems 1.15 and 1.17, all four methods described above can be viewed as special cases of the resulting method, which we call generalized minimal polynomial extrapolation (GMPE). Within this unified framework, for $i = 1, \ldots, k$, we have

$$q_i^{(n)} = \begin{cases} u_{n+i-1} & \text{for MPE}, \\ w_{n+i-1} & \text{for RRE}, \\ q_i & \text{for MMPE } (q_i \text{ independent of } n), \\ g_{i-1} & \text{for SVD-MPE}. \end{cases}$$

Recall that $g_j$ are left singular vectors of $U_k = [\, u_n \,|\, u_{n+1} \,|\, \cdots \,|\, u_{n+k} \,]$, so they depend on $n$ and on $k$ as well. This formal unification was suggested in Sidi [262].
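The unification can be checked numerically by solving (1.49) with each choice of $q_i^{(n)}$ and comparing against the direct definitions of Sections 1.3.2, 1.3.3, and 1.3.5. The sketch below is an illustration assuming NumPy, real data, and the Euclidean inner product; the function name `gmpe`, the helper `from_c`, and the test matrix are hypothetical.

```python
import numpy as np

def gmpe(xs, Q):
    """Solve (1.49) for given columns q_1^(n), ..., q_k^(n) of Q (real Euclidean case)."""
    k = xs.shape[0] - 2
    U = np.diff(xs, axis=0).T
    c = np.append(np.linalg.solve(Q.T @ U[:, :k], -Q.T @ U[:, k]), 1.0)
    return (c / c.sum()) @ xs[:k + 1]

rng = np.random.default_rng(5)
N, k = 8, 3
B = rng.standard_normal((N, N))
T = 0.25 * (B + B.T) / np.sqrt(N)          # hypothetical symmetric iteration matrix
d = rng.standard_normal(N)
xs = [np.zeros(N)]
for _ in range(k + 1):
    xs.append(T @ xs[-1] + d)
xs = np.array(xs)                          # x_0, ..., x_{k+1}  (n = 0)

U = np.diff(xs, axis=0).T                  # [u_n | ... | u_{n+k}]
W = U[:, 1:] - U[:, :-1]                   # [w_n | ... | w_{n+k-1}]
G, sigma, Hh = np.linalg.svd(U, full_matrices=False)

def from_c(c):
    return (c / c.sum()) @ xs[:k + 1]

# Reference results computed directly from the definitions of the three methods.
s_mpe = from_c(np.append(np.linalg.lstsq(U[:, :k], -U[:, k], rcond=None)[0], 1.0))
xi = np.linalg.lstsq(W, -U[:, 0], rcond=None)[0]
s_rre = xs[0] + U[:, :k] @ xi
s_svd = from_c(Hh[-1])

same_mpe = np.allclose(gmpe(xs, U[:, :k]), s_mpe, atol=1e-8)   # q_i = u_{n+i-1}
same_rre = np.allclose(gmpe(xs, W), s_rre, atol=1e-8)          # q_i = w_{n+i-1}
same_svd = np.allclose(gmpe(xs, G[:, :k]), s_svd, atol=1e-8)   # q_i = g_{i-1}
print(same_mpe, same_rre, same_svd)
```

Solving (1.49) with $Q = U_{k-1}$ forms the normal equations explicitly, which squares the condition number; the sketch is meant only to confirm the equivalences, not as a recommended way to compute $s_{n,k}$.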

1.4 Finite termination property

From the way we developed the extrapolation methods MPE, RRE, MMPE, and SVD-MPE, it is clear that, when these are applied to a vector sequence $\{x_m\}$ obtained via iterative solution of a nonsingular linear system of equations, they produce the solution to this system from a finite number of the vectors $x_m$, whether the sequence converges or not. We formalize this in the following theorem.

Theorem 1.11. Let $s$ be the solution to the nonsingular linear system $x = Tx + d$, and let $\{x_m\}$ be the sequence obtained via the fixed-point iterative scheme $x_{m+1} = Tx_m + d$, $m = 0, 1, \ldots$, with $x_0$ chosen arbitrarily. If $k$ is the degree of the minimal polynomial of $T$ with respect to $\epsilon_n = x_n - s$ (or, equivalently, $u_n = x_{n+1} - x_n$), then MPE, RRE, MMPE, and SVD-MPE produce $s$ via $s_{n,k} = s$ from the $k + 2$ vectors $x_n, x_{n+1}, \ldots, x_{n+k+1}$.

The result of this theorem is made possible by the fact that the vectors $u_n, u_{n+1}, \ldots, u_{n+k-1}$ are linearly independent, while $u_n, u_{n+1}, \ldots, u_{n+k-1}, u_{n+k}$ are linearly dependent. As a consequence, we have the following:

• $\sum_{i=0}^{k-1} c_i u_{n+i} + u_{n+k} = 0$ has a unique solution for $c_0, c_1, \ldots, c_{k-1}$ in the ordinary sense (pertinent to MPE and MMPE).

• $\sum_{i=0}^{k} \gamma_i u_{n+i} = 0$ along with $\sum_{i=0}^{k} \gamma_i = 1$ has a unique solution for $\gamma_0, \gamma_1, \ldots, \gamma_k$ in the ordinary sense (pertinent to RRE).

• $\sum_{i=0}^{k} c_i u_{n+i} = 0$ along with $\|c\| = 1$ has a unique solution for $c = [c_0, c_1, \ldots, c_k]^T$ (up to a common multiplicative constant $\phi$, $|\phi| = 1$) in the ordinary sense (pertinent to SVD-MPE).

Generalization of finite termination property

The next theorem concerns the extension of the finite termination property of our extrapolation methods to a broader class of linearly generated sequences.

Theorem 1.12. Let the sequence $\{x_m\}$ be such that

$$\sum_{i=0}^{k} a_i (x_{m+i} - s) = 0, \quad m = 0, 1, \ldots; \qquad a_0 a_k \neq 0, \quad \sum_{i=0}^{k} a_i \neq 0, \tag{1.50}$$

$k$ being minimal. Let $u_m = x_{m+1} - x_m$ as usual, and assume also that $\{u_0, u_1, \ldots, u_{k-1}\}$ is a linearly independent set. Then the vectors $s_{n,k}$ obtained by applying MPE, RRE, MMPE, and SVD-MPE to the sequence $\{x_m\}$ satisfy $s_{n,k} = s$ for all $n = 0, 1, \ldots$.

Proof. First, by (1.50), we have

$$\Bigl(\sum_{i=0}^{k} a_i\Bigr) s = \sum_{i=0}^{k} a_i x_{m+i},$$

from which

$$s = \frac{\sum_{i=0}^{k} a_i x_{m+i}}{\sum_{i=0}^{k} a_i}; \tag{1.51}$$

hence

$$s = \sum_{i=0}^{k} \delta_i x_{m+i}, \quad \sum_{i=0}^{k} \delta_i = 1; \qquad \delta_i = \frac{a_i}{\sum_{j=0}^{k} a_j}, \quad i = 0, 1, \ldots, k. \tag{1.52}$$

Note that the $\delta_i$ are the same for all $m$.

Next, we show by induction on $m$ that the sets $S_m = \{u_m, u_{m+1}, \ldots, u_{m+k-1}\}$ are all linearly independent. Now, we are given that the set $S_0$ is linearly independent. Let us assume that $S_m$, $m \leq n$, are all linearly independent and show that so is $S_{n+1}$. Taking differences in (1.50), we have

$$\sum_{i=0}^{k} a_i u_{m+i} = 0, \quad m = 0, 1, \ldots. \tag{1.53}$$

For simplicity, let us set $a_k = 1$. This is legitimate since $a_k \neq 0$. Then (1.53) implies, with $m = n$, that

$$u_{n+k} = -\sum_{i=0}^{k-1} a_i u_{n+i}. \tag{1.54}$$

Since $S_{n+1} = \{u_{n+1}, u_{n+2}, \ldots, u_{n+k}\}$, let us consider the solution for $b_1, \ldots, b_k$ to the linear system

$$\sum_{i=1}^{k} b_i u_{n+i} = 0. \tag{1.55}$$

Substituting (1.54) into (1.55) and rearranging, we obtain

$$-b_k a_0 u_n + \sum_{i=1}^{k-1} (b_i - b_k a_i) u_{n+i} = 0.$$

By our induction hypothesis that the set $S_n$ is linearly independent, it follows that

$$-b_k a_0 = 0, \quad b_i - b_k a_i = 0, \quad i = 1, \ldots, k - 1,$$

which, upon invoking the assumption that $a_0 \neq 0$, gives $b_1 = \cdots = b_k = 0$ as the unique solution to (1.55). This, of course, implies that the set $S_{n+1}$ must be linearly independent.

The linear independence of $S_n$ implies that the equation $\sum_{i=0}^{k} c_i u_{n+i} = 0$ has a solution for $c = [c_0, c_1, \ldots, c_k]^T$ that is unique up to a multiplicative constant. From this, and from the discussion following the statement of Theorem 1.11, we conclude that the scalars $\gamma_i$ in $s_{n,k} = \sum_{i=0}^{k} \gamma_i x_{n+i}$ resulting from the four extrapolation methods we have discussed satisfy $\gamma_i = \delta_i$, $i = 0, 1, \ldots, k$, with the $\delta_i$ as in (1.52). This completes the proof.

Before proceeding further, we would like to mention that the sequence of vectors $\{x_m\}$ generated by the fixed-point iterative procedure $x_{m+1} = Tx_m + d$, $m = 0, 1, \ldots$, is actually of the form described in the statement of Theorem 1.12. We will leave the complete proof of this to Lemma 6.22 in Chapter 6. Here, we will be content with a brief explanation of the matter and a conclusion. We already know from Theorem 0.3 that

$$\epsilon_m = x_m - s = z_m + \sum_{\substack{i=1 \\ \lambda_i \neq 0}}^{q} p_i(m) \lambda_i^m,$$

where $z_m$ is the contribution of the zero eigenvalues of $T$ and $p_i(m)\lambda_i^m$ is the contribution of the nonzero eigenvalue $\lambda_i$ of $T$ that appears in the Jordan block $J_{r_i}(\lambda_i)$ of (0.21). Of course, $z_m = 0$ for all $m \geq 0$ if $T$ has no zero eigenvalue. $z_m = 0$ for all $m \geq m_0 \geq 0$ for some integer $m_0 \geq 0$ if $T$ does have zero eigenvalues. Thus, $z_m = 0$ for all $m \geq m_0 \geq 0$ for some integer $m_0 \geq 0$, whether $T$ has a zero eigenvalue or not. Hence,

$$\epsilon_m = x_m - s = \sum_{\substack{i=1 \\ \lambda_i \neq 0}}^{q} p_i(m) \lambda_i^m \quad \forall\, m \geq m_0 \quad \text{for some } m_0 \geq 0.$$

As a result, and this is what we prove later in this book, the minimal polynomial $P(\lambda) = \sum_{i=0}^{k} c_i \lambda^i$ of $T$ with respect to the vector $\epsilon_{m_0}$ is the minimal polynomial of $T$ with respect to $\epsilon_m$ for all $m > m_0$ as well. That is,

$$\sum_{i=0}^{k} c_i (x_{m+i} - s) = 0, \quad m = m_0, m_0 + 1, m_0 + 2, \ldots.$$

The $c_i$ have the following properties: (i) Since the degree of $P(\lambda)$ is exactly $k$, $c_k = 1 \neq 0$. (ii) Since there is no contribution from zero eigenvalues, we have $c_0 \neq 0$. (iii) Since one is not an eigenvalue of $T$, we also have $\sum_{i=0}^{k} c_i \neq 0$. In addition, since $k$ is minimal, the set $\{u_{m_0}, u_{m_0+1}, \ldots, u_{m_0+k-1}\}$ is linearly independent. Thus, the sequence $\{x_m\}_{m \geq m_0}$ is exactly as in the statement of Theorem 1.12.

1.5 Application of the extrapolation methods to arbitrary $\{x_m\}$

We note that all the methods we have described above take as their only input the integers $k$ and $n$ and the vectors $x_n, x_{n+1}, \ldots, x_{n+k+1}$. Therefore, they can be applied whether the sequence of vectors $\{x_m\}$ is generated by a linear or a nonlinear iterative process; $\{x_m\}$ can even be an arbitrary sequence having nothing to do with systems of equations. This is an important feature of the polynomial vector extrapolation methods discussed so far and also of the epsilon algorithms that will be discussed later.

Let us now explore the consequences of the sequence $\{x_m\}$ being arbitrary. As before, setting $u_m = x_{m+1} - x_m$, $m = 0, 1, \ldots$, let us consider the sets $\{u_n, u_{n+1}, \ldots, u_{n+k}\}$ and the matrices $U_k = [\, u_n \,|\, u_{n+1} \,|\, \cdots \,|\, u_{n+k} \,]$, $k = 0, 1, \ldots$, again. Clearly, in such a general case, too, there always exists an integer $k_0 \leq N$ for which the sets $\{u_n, u_{n+1}, \ldots, u_{n+k}\}$, $0 \leq k \leq k_0 - 1$, are linearly independent, but the set $\{u_n, u_{n+1}, \ldots, u_{n+k_0}\}$ is not. This implies that the matrices $U_k$, $k = 0, 1, \ldots, k_0 - 1$, are of full rank, but $U_{k_0}$ is not; that is,

$$\operatorname{rank}(U_k) = k + 1, \quad 0 \leq k \leq k_0 - 1, \qquad \operatorname{rank}(U_{k_0}) = k_0. \tag{1.56}$$

This is simply (1.30), where $k_0$ is the degree of the minimal polynomial of a matrix $T$ with respect to $u_n$; $k_0$ in (1.56) has no such meaning in the general case we are considering here.

It is now easy to see that the extrapolation methods developed so far can be applied to arbitrary $\{x_m\}$ with no changes for all $k < k_0$. It also follows that the vectors $s_{n,k_0}$ computed by all four methods are identical when they exist, analogous to what is stated in Theorem 1.11. We summarize this as our next theorem.

Theorem 1.13. Let $\{x_m\}$ be an arbitrary sequence of vectors. Let $u_m = x_{m+1} - x_m$ be such that $\operatorname{rank}(U_{k-1}) = \operatorname{rank}(U_k) = k$. Then the vectors $s_{n,k}$ (produced from the $k + 2$ vectors $x_n, x_{n+1}, \ldots, x_{n+k+1}$) by MPE, RRE, MMPE, and SVD-MPE, if they exist, are identical.
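Theorem 1.13 can be observed on a sequence produced by a nonlinear iteration. The sketch below is an illustration assuming NumPy and the Euclidean inner product; the map `F` on $\mathbb{R}^3$ is a made-up example, and taking $k = N = 3$ is simply one convenient way to obtain $\operatorname{rank}(U_{k-1}) = \operatorname{rank}(U_k) = k$. The four sets of $\gamma_i$ are computed in compact form from the definitions of Section 1.3, with $q_i = e_i$ for MMPE.

```python
import numpy as np

def F(x):
    # Hypothetical nonlinear map on R^3, chosen only for illustration.
    return np.array([0.5 * np.cos(x[1]) + 0.2 * x[2],
                     0.4 * np.sin(x[0]) - 0.3 * x[2],
                     0.3 * x[0] * x[1] + 0.5])

k = 3                                   # k = N, so U_k (3 x 4) has rank at most k
xs = [np.zeros(3)]
for _ in range(k + 1):
    xs.append(F(xs[-1]))
xs = np.array(xs)                       # x_0, ..., x_{k+1}  (n = 0)
U = np.diff(xs, axis=0).T               # [u_n | ... | u_{n+k}]
W = U[:, 1:] - U[:, :-1]                # [w_n | ... | w_{n+k-1}]

def normalize(c):
    return c / c.sum()

gam = {}
gam["MPE"] = normalize(np.append(np.linalg.lstsq(U[:, :k], -U[:, k], rcond=None)[0], 1.0))
xi = np.linalg.lstsq(W, -U[:, 0], rcond=None)[0]
gam["RRE"] = np.concatenate(([1.0 - xi[0]], xi[:-1] - xi[1:], [xi[-1]]))      # (1.40)
gam["MMPE"] = normalize(np.append(np.linalg.solve(U[:k, :k], -U[:k, k]), 1.0))
gam["SVD-MPE"] = normalize(np.linalg.svd(U)[2][-1])     # null vector of U_k

s = {name: g @ xs[:k + 1] for name, g in gam.items()}
ranks = (np.linalg.matrix_rank(U[:, :k]), np.linalg.matrix_rank(U))
for name, val in s.items():
    print(f"{name:8s}", val, " ||F(s) - s|| =", np.linalg.norm(F(val) - val))
```

Since $U_k$ has a one-dimensional null space here, every method is forced to the same $\gamma$, and hence to the same $s_{n,k}$, up to rounding.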

1.6 Further developments

1. If we use the standard Euclidean $l_2$ inner product $\langle y, z \rangle = y^* z$, hence the standard Euclidean $l_2$-norm $\|z\|_2 = \sqrt{z^* z}$, in our derivation of MPE and RRE, then the minimization problems that we need to solve become standard least-squares problems, which can be solved by employing known tools of numerical linear algebra. Clearly, we can also use other norms to define new methods in the spirit of MPE and RRE. For example, we can use the $l_p$-norms with $p \neq 2$ for this purpose. If we use the $l_1$-norm or the $l_\infty$-norm, then the resulting minimization problems can be expressed as linear programming problems, which can be solved by appropriate known methods and algorithms. All of this was suggested by Sidi, Ford, and Smith [299].

2. It is easy to see that the extrapolation methods proposed as generalizations of MPE and RRE (by switching to general $l_p$-norms) in item 1 above also have the finite termination property. That is, if the $x_m$ are generated linearly via $x_{m+1} = Tx_m + d$, as described above, and if $k$ is the degree of the minimal polynomial of $T$ with respect to $u_n$, then the generalized MPE and RRE methods give $s_{n,k} = s$ exactly.

3. As for MMPE, we have already noted that the inner product used in deriving MMPE does not have to be the Euclidean inner product; we can employ any inner product for this purpose. If the inner product used in implementing MMPE is the Euclidean inner product, that is, $\langle y, z \rangle = y^* z$, and if we choose the vectors $q_i$ to be $k$ of the standard basis vectors in $\mathbb{C}^N$, then the $k \times k$ system in (1.43) is nothing but a subsystem of $U_{k-1} c' = -u_{n+k}$. As mentioned earlier, if $q_i = e_i$, $i = 1, \ldots, k$, then the $c_i$ are the solutions to the first $k$ equations, for example. All of this was actually suggested and used in [299].

4. Finally, we do not have to restrict our attention to vector sequences in finite-dimensional spaces. Even though MPE, RRE, and MMPE were developed within the setting of finite-dimensional linear spaces, they can also be defined, without any conceptual changes, in arbitrary infinite-dimensional linear spaces, such as function spaces. Specifically, we only require the following of the space being considered:

   • For MPE, RRE, and SVD-MPE, the space should be an inner product space (such as a Hilbert space). Note that we need to compute the scalars $(y, z)$ with $y, z$ in this space throughout.

   • For MMPE, the space can also be a normed space (such as a Banach space). We can then replace the inner products $(q_i, z)$, with $z$ in this space, by the scalars $Q_i(z)$, where $Q_i$ are some linear functionals in the dual space.

   As we will see later in Chapter 6, most of the convergence theory carries over to these cases with minor modifications. This was suggested originally in [299]. The convergence and stability studies of [299], [261], [294], and [268] assume that the vector sequences $\{x_m\}$ may reside in such spaces in general. See the explanation given in Section 6.9.

1.7 Determinant representations

In this section, we derive determinant representations for MPE, RRE, MMPE, and SVD-MPE. This is made possible by the fact that the $\gamma_i$ for all four methods satisfy a system of linear equations. These representations turn out to be quite useful in the convergence and stability analyses we present later. We begin with the following general lemma (see [299]), which has been used by the author in various places.

Lemma 1.14. Let $u_{i,j}$ and $\gamma_j$ be scalars and let the $\gamma_j$ satisfy the linear system
\[
\sum_{j=0}^{k} u_{i,j}\gamma_j = 0, \quad i = 0, 1, \ldots, k-1; \qquad \sum_{j=0}^{k} \gamma_j = 1. \tag{1.57}
\]
Then, whether $v_j$ are scalars or vectors,
\[
\sum_{j=0}^{k} \gamma_j v_j = \frac{D(v_0, v_1, \ldots, v_k)}{D(1, 1, \ldots, 1)}, \tag{1.58}
\]
where
\[
D(v_0, v_1, \ldots, v_k) =
\begin{vmatrix}
v_0 & v_1 & \cdots & v_k \\
u_{0,0} & u_{0,1} & \cdots & u_{0,k} \\
u_{1,0} & u_{1,1} & \cdots & u_{1,k} \\
\vdots & \vdots & & \vdots \\
u_{k-1,0} & u_{k-1,1} & \cdots & u_{k-1,k}
\end{vmatrix}. \tag{1.59}
\]
If the $v_i$ are vectors, the determinant $D(v_0, v_1, \ldots, v_k)$ is defined via its expansion with respect to its first row.

Proof. We will prove the lemma by direct verification of (1.58). Let $C_j$ be the cofactor of $v_j$ in $D(v_0, v_1, \ldots, v_k)$. Then, by properties of determinants,
\[
\sum_{j=0}^{k} C_j v_j = D(v_0, v_1, \ldots, v_k) \quad \text{and} \quad \sum_{j=0}^{k} C_j = D(1, 1, \ldots, 1)
\]
and
\[
\sum_{j=0}^{k} u_{i,j} C_j = 0, \quad i = 0, 1, \ldots, k-1.
\]
Thus, we realize that the equations in (1.57) are satisfied with $\gamma_j = C_j \big/ \sum_{l=0}^{k} C_l$, $j = 0, 1, \ldots, k$.
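Lemma 1.14 can be checked on a small case. The following is a minimal sketch for $k = 2$ with scalar $v_j$, using exact rational arithmetic; the entries of `u` and `v` are hypothetical test data, and the helper names `det` and `cofactor` are illustrative only.

```python
from fractions import Fraction


def det(m):
    """Determinant by cofactor expansion along the first row (tiny matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


# Hypothetical data for k = 2: row i holds u_{i,0}, u_{i,1}, u_{i,2}.
u = [[Fraction(2), Fraction(1), Fraction(3)],
     [Fraction(1), Fraction(4), Fraction(2)]]
v = [Fraction(5), Fraction(7), Fraction(11)]  # scalars v_0, v_1, v_2


def cofactor(j):
    """Cofactor C_j of the first-row entry in column j; it depends only on u."""
    minor = [row[:j] + row[j + 1:] for row in u]
    return (-1) ** j * det(minor)


C = [cofactor(j) for j in range(3)]
gamma = [c / sum(C) for c in C]          # gamma_j = C_j / sum_l C_l

lhs = sum(g * x for g, x in zip(gamma, v))            # sum_j gamma_j v_j
rhs = det([v] + u) / det([[Fraction(1)] * 3] + u)     # D(v_0, v_1, v_2) / D(1, 1, 1)
```

Here `gamma` satisfies both parts of (1.57), and `lhs` and `rhs` agree exactly, as (1.58) states.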

The determinant representations for MPE and RRE given in the next theorem were first derived in Sidi [261], while that for MMPE was given in Brezinski [28] and Sidi, Ford, and Smith [299]. The determinant representation for SVD-MPE is from Sidi [290].

Theorem 1.15. Let MPE, RRE, and MMPE be defined as in the preceding section, using an arbitrary inner product $(y, z)$ in $\mathbb{C}^N$ and the norm $\|z\| = \sqrt{(z, z)}$ induced by it. Let SVD-MPE be defined using the standard Euclidean inner product $\langle y, z\rangle = y^*z$ and the norm $\|z\|_2 = \sqrt{z^*z}$ induced by it. With $u_m$ and $w_m$ as in (1.7), define the scalars $u_{i,j}$ by
\[
u_{i,j} =
\begin{cases}
(u_{n+i}, u_{n+j}) = (\Delta x_{n+i}, \Delta x_{n+j}) & \text{for MPE}, \\
(w_{n+i}, u_{n+j}) = (\Delta^2 x_{n+i}, \Delta x_{n+j}) & \text{for RRE}, \\
(q_{i+1}, u_{n+j}) = (q_{i+1}, \Delta x_{n+j}) & \text{for MMPE}, \\
\langle g_i, u_{n+j}\rangle = \langle g_i, \Delta x_{n+j}\rangle & \text{for SVD-MPE},
\end{cases} \tag{1.60}
\]
where $g_i$ is the left singular vector of $U_k$ corresponding to the singular value $\sigma_i$, with the ordering $\sigma_0 \ge \sigma_1 \ge \cdots \ge \sigma_k$, so that $\sigma_{\min} = \sigma_k$, as in (1.45)–(1.48). Then, for all four methods, $s_{n,k}$ has the determinant representation
\[
s_{n,k} = \frac{D(x_n, x_{n+1}, \ldots, x_{n+k})}{D(1, 1, \ldots, 1)}, \tag{1.61}
\]
where $D(v_0, v_1, \ldots, v_k)$ is the $(k+1) \times (k+1)$ determinant defined in (1.59) in Lemma 1.14 with the $u_{i,j}$ as in (1.60).¹⁴ That is,
\[
s_{n,k} =
\frac{\begin{vmatrix}
x_n & x_{n+1} & \cdots & x_{n+k} \\
u_{0,0} & u_{0,1} & \cdots & u_{0,k} \\
u_{1,0} & u_{1,1} & \cdots & u_{1,k} \\
\vdots & \vdots & & \vdots \\
u_{k-1,0} & u_{k-1,1} & \cdots & u_{k-1,k}
\end{vmatrix}}
{\begin{vmatrix}
1 & 1 & \cdots & 1 \\
u_{0,0} & u_{0,1} & \cdots & u_{0,k} \\
u_{1,0} & u_{1,1} & \cdots & u_{1,k} \\
\vdots & \vdots & & \vdots \\
u_{k-1,0} & u_{k-1,1} & \cdots & u_{k-1,k}
\end{vmatrix}}. \tag{1.62}
\]

Proof. We need to show that the $\gamma_j$ for the four methods satisfy the equations in (1.57). As they already satisfy $\sum_{j=0}^{k} \gamma_j = 1$, we need only show that they satisfy $\sum_{j=0}^{k} u_{i,j}\gamma_j = 0$, $i = 0, 1, \ldots, k-1$. We treat each method separately.

MPE: By the fact that the $c_j$ are the solution to the minimization problem of (1.32), they also satisfy the normal equations
\[
\sum_{j=0}^{k-1} (u_{n+i}, u_{n+j})\, c_j = -(u_{n+i}, u_{n+k}), \quad i = 0, 1, \ldots, k-1. \tag{1.63}
\]
Letting $c_k = 1$, and dividing by $\sum_{i=0}^{k} c_i$, these equations become $\sum_{j=0}^{k} u_{i,j}\gamma_j = 0$, $i = 0, 1, \ldots, k-1$.

RRE: We consider the approach presented in (1.36)–(1.40), with the notation there. The minimization problem in (1.38) gives rise to the normal equations
\[
\sum_{j=0}^{k-1} (w_{n+i}, w_{n+j})\, \xi_j = -(w_{n+i}, u_n), \quad i = 0, 1, \ldots, k-1. \tag{1.64}
\]
Now,
\[
\sum_{j=0}^{k-1} (w_{n+i}, w_{n+j})\, \xi_j + (w_{n+i}, u_n)
= \Bigl(w_{n+i},\, u_n + \sum_{j=0}^{k-1} \xi_j w_{n+j}\Bigr)
= \Bigl(w_{n+i},\, \sum_{j=0}^{k} \gamma_j u_{n+j}\Bigr)
= \sum_{j=0}^{k} (w_{n+i}, u_{n+j})\, \gamma_j. \tag{1.65}
\]
Therefore, the equations in (1.64) become $\sum_{j=0}^{k} u_{i,j}\gamma_j = 0$, $i = 0, 1, \ldots, k-1$.

MMPE: Letting $c_k = 1$, and dividing the equations in (1.43) by $\sum_{j=0}^{k} c_j$, we obtain $\sum_{j=0}^{k} u_{i,j}\gamma_j = 0$, $i = 0, 1, \ldots, k-1$.

SVD-MPE: By $\sum_{j=0}^{k} c_j u_{n+j} = U_k c$, we first have
\[
\sum_{j=0}^{k} u_{i,j} c_j = \sum_{j=0}^{k} (g_i^* u_{n+j})\, c_j = g_i^* \Bigl(\sum_{j=0}^{k} c_j u_{n+j}\Bigr) = g_i^* U_k c.
\]
Next, by $c = h_k$, where $h_j$ is the right singular vector of $U_k$ corresponding to $\sigma_j$, and by $U_k h_j = \sigma_j g_j$ and $g_i^* g_j = \delta_{ij}$, we have
\[
g_i^* U_k c = g_i^* U_k h_k = \sigma_k\, g_i^* g_k = 0, \quad i = 0, 1, \ldots, k-1,
\]
which, upon dividing by $\sum_{j=0}^{k} c_j$, gives $\sum_{j=0}^{k} u_{i,j}\gamma_j = 0$, $i = 0, 1, \ldots, k-1$. This completes the proof.

¹⁴ Unfortunately, the determinant representations for MPE and RRE given in Theorem 1.15 have been repeated numerous times and by different authors without ever mentioning their original source, namely, the paper [261]. These authors take (1.62) with (1.60) to be the definitions of MPE and RRE despite the fact that, originally, these methods were not defined via the determinant representations given in Theorem 1.15. These determinant representations were derived in [261] from the original definitions of MPE and RRE, which had nothing to do with determinants. When dealing especially with MPE and RRE, it is clear from the proof of Theorem 1.15 that the derivation of the determinant representations for these two methods is not straightforward at all.

Example 1.16. The case $k = 1$: Vector generalizations of the Aitken $\Delta^2$-process: For MPE, RRE, MMPE, and SVD-MPE when $k = 1$, we have
\[
s_{n,1} = \frac{u_{0,0}\, x_{n+1} - u_{0,1}\, x_n}{u_{0,0} - u_{0,1}} = x_n - \frac{u_{0,0}}{u_{0,1} - u_{0,0}}\, \Delta x_n.
\]
Thus,
\[
s_{n,1} =
\begin{cases}
x_n - \dfrac{(\Delta x_n, \Delta x_n)}{(\Delta x_n, \Delta^2 x_n)}\, \Delta x_n & \text{for MPE}, \\[2ex]
x_n - \dfrac{(\Delta^2 x_n, \Delta x_n)}{(\Delta^2 x_n, \Delta^2 x_n)}\, \Delta x_n & \text{for RRE}, \\[2ex]
x_n - \dfrac{(q_1, \Delta x_n)}{(q_1, \Delta^2 x_n)}\, \Delta x_n & \text{for MMPE}, \\[2ex]
x_n - \dfrac{\langle g_0, \Delta x_n\rangle}{\langle g_0, \Delta^2 x_n\rangle}\, \Delta x_n & \text{for SVD-MPE}.
\end{cases}
\]
When compared with the Aitken $\Delta^2$-process on a scalar sequence $\{z_m\}$, which produces another sequence $\{\hat{z}_m\}$, where
\[
\hat{z}_n = \frac{z_n z_{n+2} - z_{n+1}^2}{z_{n+2} - 2z_{n+1} + z_n} = z_n - \frac{(\Delta z_n)^2}{\Delta^2 z_n},
\]
as approximations to the limit or antilimit of $\{z_m\}$, we see that $s_{n,1}$ from MPE, RRE, MMPE, and SVD-MPE can all be considered as vector generalizations of the $\Delta^2$-process. Of these, $s_{n,1}$ from MPE and RRE are closely related to iterative methods by Lemaréchal [171] and Barzilai and Borwein [17] for the solution of systems of equations of the form $x = f(x)$. We will discuss the solution of such systems briefly in Section 1.10.
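The $k = 1$ formulas of Example 1.16 are short enough to try directly. The sketch below assumes real vectors stored as Python lists and the Euclidean inner product; the function names `aitken` and `s_n1` and the test sequence $x_m = s + (1/2)^m e$ are illustrative choices, not taken from the text.

```python
def dot(a, b):
    """Euclidean inner product for real vectors stored as lists."""
    return sum(p * q for p, q in zip(a, b))


def aitken(z0, z1, z2):
    """Scalar Aitken Delta^2 step: z0 - (Delta z)^2 / (Delta^2 z)."""
    dz = z1 - z0
    d2z = z2 - 2 * z1 + z0
    return z0 - dz * dz / d2z


def s_n1(x0, x1, x2, method="MPE"):
    """s_{n,1} from x_n, x_{n+1}, x_{n+2} for MPE or RRE (real data, M = I)."""
    dx = [b - a for a, b in zip(x0, x1)]                   # Delta x_n
    d2x = [c - 2 * b + a for a, b, c in zip(x0, x1, x2)]   # Delta^2 x_n
    if method == "MPE":
        coeff = dot(dx, dx) / dot(dx, d2x)
    elif method == "RRE":
        coeff = dot(d2x, dx) / dot(d2x, d2x)
    else:
        raise ValueError("method must be 'MPE' or 'RRE'")
    return [a - coeff * p for a, p in zip(x0, dx)]


# Scalar sequence z_m = 1 + 0.5**m and vector sequence x_m = s + 0.5**m * e,
# with s = [1, 2] and e = [1, -2] (hypothetical test data).
z_hat = aitken(2.0, 1.5, 1.25)
xs = [[2.0, 0.0], [1.5, 1.0], [1.25, 1.5]]
s_mpe = s_n1(*xs, method="MPE")
s_rre = s_n1(*xs, method="RRE")
```

Because this test sequence has a single geometric error term, both the scalar and the vector versions recover the limit exactly.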

In the context of MPE, RRE, MMPE, and SVD-MPE, the equations in (1.57) have the following interpretation, whose proof we leave to the reader.

Theorem 1.17. The vectors $\gamma = [\gamma_0, \gamma_1, \ldots, \gamma_k]^T$ associated with MPE, RRE, MMPE, and SVD-MPE satisfy the orthogonality conditions
\[
\Bigl(z,\, \sum_{j=0}^{k} \gamma_j u_{n+j}\Bigr) = (z, U_k\gamma) = 0 \quad \forall\, z \in \mathcal{Y}_{n,k} \tag{1.66}
\]
in the inner product $(\cdot\,, \cdot)$ relevant to each method, where
\[
\mathcal{Y}_{n,k} =
\begin{cases}
\operatorname{span}\{u_n, \ldots, u_{n+k-1}\} & \text{for MPE}, \\
\operatorname{span}\{w_n, \ldots, w_{n+k-1}\} & \text{for RRE}, \\
\operatorname{span}\{q_1, \ldots, q_k\} & \text{for MMPE}, \\
\operatorname{span}\{g_0, \ldots, g_{k-1}\} & \text{for SVD-MPE}.
\end{cases} \tag{1.67}
\]
In words, these methods are such that the projections of the vector $\sum_{j=0}^{k} \gamma_j u_{n+j} = U_k\gamma$ onto the respective subspaces $\mathcal{Y}_{n,k}$ are zero.

The determinant representations of Theorem 1.15 enable us to formulate a necessary and sufficient condition for the existence of $s_{n,k}$. This is the subject of the next theorem, in which we also use the following simple fact: If
\[
A = [a_1 | \cdots | a_s], \quad B = [b_1 | \cdots | b_s], \quad A, B \in \mathbb{C}^{r \times s}, \quad M \in \mathbb{C}^{r \times r},
\]
then
\[
C = A^* M B = [c_{ij}]_{1 \le i,j \le s}, \quad c_{ij} = a_i^* M b_j \quad \forall\, i, j.
\]

Theorem 1.18. Let $(y, z) = y^* M z$, for some Hermitian positive definite matrix $M \in \mathbb{C}^{N \times N}$, be the inner product used for MPE, RRE, and MMPE, and let $(y, z) = y^* z$ be the inner product used for SVD-MPE. Then $s_{n,k}$ for each method exists if and only if
\[
\begin{aligned}
\det(U^* M W) &\ne 0 && \text{for MPE}, \\
\det(W^* M W) &\ne 0 && \text{for RRE}, \\
\det(Q^* M W) &\ne 0 && \text{for MMPE}, \\
\det(G^* W) &\ne 0 && \text{for SVD-MPE},
\end{aligned} \tag{1.68}
\]
where
\[
U = [u_n | u_{n+1} | \cdots | u_{n+k-1}], \quad W = [w_n | w_{n+1} | \cdots | w_{n+k-1}], \quad
Q = [q_1 | q_2 | \cdots | q_k], \quad G = [g_0 | g_1 | \cdots | g_{k-1}].
\]

Proof. From Lemma 1.14 and from (1.61), it is clear that $D(1, \ldots, 1) \ne 0$ is necessary and sufficient for the existence of $s_{n,k}$. Let us perform a sequence of elementary column transformations on $D(1, \ldots, 1)$ that do not change its value as follows: Subtract the $j$th column from the $(j+1)$st in the order $j = k, k-1, \ldots, 1$. We obtain
\[
D(1, \ldots, 1) =
\begin{vmatrix}
1 & 0 & \cdots & 0 \\
u_{0,0} & w_{0,0} & \cdots & w_{0,k-1} \\
u_{1,0} & w_{1,0} & \cdots & w_{1,k-1} \\
\vdots & \vdots & & \vdots \\
u_{k-1,0} & w_{k-1,0} & \cdots & w_{k-1,k-1}
\end{vmatrix}, \quad w_{i,j} = u_{i,j+1} - u_{i,j}.
\]
Let $Z = [z_0 | z_1 | \cdots | z_{k-1}]$ stand for any of the matrices $U$, $W$, and $Q$. Invoking now $w_m = u_{m+1} - u_m$, we realize that $w_{i,j} = (z_i, w_{n+j}) = z_i^* M w_{n+j}$. Consequently,
\[
D(1, \ldots, 1) =
\begin{vmatrix}
1 & 0^T \\
\begin{matrix} u_{0,0} \\ \vdots \\ u_{k-1,0} \end{matrix} & Z^* M W
\end{vmatrix}
= \det(Z^* M W).
\]
The result for MPE, RRE, and MMPE now follows. The result for SVD-MPE follows by letting $Z = G$ and $M = I$.

We now use Theorem 1.18 to provide sufficient conditions for the existence of $s_{n,k}$ from MPE and RRE in case the $x_m$ are generated by the fixed-point iterative method $x_{m+1} = Tx_m + d$, where $I - T$ is nonsingular. To do this, we need the following results.

Lemma 1.19. Let $A \in \mathbb{C}^{s \times s}$ have a positive definite Hermitian part $A_H$. Then $A$ is nonsingular. In addition,
\[
|x^* A x| \ge x^* A_H x \quad \forall\, x \in \mathbb{C}^s.
\]

Proof. Recall that $A_H = \tfrac{1}{2}(A + A^*)$ and $A_S = \tfrac{1}{2}(A - A^*)$ are, respectively, the Hermitian and skew-Hermitian parts of $A$. Now, for every nonzero $x \in \mathbb{C}^s$, $x^* A_H x = \alpha$ is real positive and $x^* A_S x = \mathrm{i}\beta$ is purely imaginary or zero; therefore, $|x^* A x| = \sqrt{\alpha^2 + \beta^2} \ge \alpha > 0$. Assume, to the contrary, that $A$ is singular. Then there exists a nonzero vector $x_0 \in \mathbb{C}^s$ such that $Ax_0 = 0$. Consequently, $x_0^* A x_0 = 0$, contradicting $|x^* A x| > 0$ for every nonzero $x \in \mathbb{C}^s$. Therefore, $A$ cannot be singular.

Lemma 1.20. Let $A \in \mathbb{C}^{s \times s}$ and $B \in \mathbb{C}^{s \times p}$, $s \ge p$, be such that $A$ has a positive definite Hermitian part and $B$ has full rank. Then $C = B^* A B$ is nonsingular.

Proof. Let $A_H$ and $A_S$ be, respectively, the Hermitian and skew-Hermitian parts of $A$. Then $C_H = B^* A_H B$ and $C_S = B^* A_S B$ are, respectively, the Hermitian and skew-Hermitian parts of $C$. It is easy to show that $C_H$ is positive definite since $A_H$ is positive definite and $B$ has full rank. The result now follows from the preceding lemma.

Remark: For Lemmas 1.19 and 1.20 to hold, it is not necessary that $A_H$ be positive definite. It is easy to see that these lemmas hold whenever $\tilde{A} = aA$, for some $a$, $|a| = 1$, has positive definite Hermitian part $\tilde{A}_H$. We make use of this fact in the next theorem by Sidi [262], which concerns the existence of $s_{n,k}$ from MPE and RRE.

Theorem 1.21. Let $\{x_m\}$ be generated by the fixed-point iterative method $x_{m+1} = Tx_m + d$, where $I - T$ is nonsingular. Let $k$ be less than the degree of the minimal polynomial of $T$ with respect to $u_n$. Then
1. $s_{n,k}$ from RRE exists unconditionally, and
2. $s_{n,k}$ from MPE exists if $E_H$, the Hermitian part of $E = aM(I - T)$, is positive definite for some $a \in \mathbb{C}$, $|a| = 1$.

Remark: Recall that we already know that $s_{n,k}$ exists unconditionally both for MPE and for RRE, and $s_{n,k} = s$, where $s$ is the solution of $(I - T)x = d$, when $k$ is the degree of the minimal polynomial of $T$ with respect to $u_n$.

Proof. Let the matrices $U$ and $W$ be as in Theorem 1.18. We already know that $u_n, u_{n+1}, \ldots, u_{n+k-1}$ are linearly independent. Therefore, the matrix $U$ has full rank. Since $w_m = (T - I)u_m$ for all $m$, $W = (T - I)U$. Since $T - I$ is nonsingular and $U$ has full rank, $W$ has full rank too. Therefore, $W^* M W$ is Hermitian positive definite, and hence nonsingular, which guarantees the existence of $s_{n,k}$ from RRE. When $E_H$ is positive definite, $U^* M W = U^* M (T - I) U = -a^{-1} U^* E U$ is nonsingular, and hence $s_{n,k}$ from MPE exists.

Example 1.22. Nonexistence of $s_{n,k}$ for MPE: Let the vectors $x_m$ be generated via $x_{m+1} = Tx_m + d$, where $T$ is a Hermitian $N \times N$ matrix with eigenvalues $\mu_i$ and corresponding eigenvectors $z_i$. Of course, $\mu_i$ are real and $z_i$ can be normalized such that $z_i^* z_j = \delta_{ij}$. Let $x_0$ be such that $u_0 = z_1 + z_2$. Clearly, the degree of the minimal polynomial of $T$ with respect to $u_0$ is two. Let us define MPE by using the Euclidean inner product $\langle \cdot\,, \cdot \rangle$ and consider MPE with $n = 0$ and $k = 1$. We have
\[
D(1, 1) = \begin{vmatrix} 1 & 1 \\ u_{0,0} & u_{0,1} \end{vmatrix} = \langle u_0, w_0 \rangle,
\]
which, by the fact that $w_0 = (T - I)u_0 = (\mu_1 - 1)z_1 + (\mu_2 - 1)z_2$, gives
\[
D(1, 1) = \mu_1 + \mu_2 - 2.
\]
If $T$ is such that $\mu_1 + \mu_2 = 2$, then $D(1, 1) = 0$, and $s_{0,1}$ does not exist.
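Example 1.22 can be reproduced numerically. The sketch below assumes the hypothetical choice $T = \operatorname{diag}(0.5, 1.5)$, $d = [1, 1]^T$, and $x_0 = 0$, so that $u_0 = z_1 + z_2$ with $z_1, z_2$ the standard basis vectors and $\mu_1 + \mu_2 = 2$. It evaluates $D(1,1) = u_{0,1} - u_{0,0}$ for MPE and, for comparison, for RRE.

```python
def dot(a, b):
    """Euclidean inner product for real vectors stored as lists."""
    return sum(p * q for p, q in zip(a, b))


T = [[0.5, 0.0], [0.0, 1.5]]   # Hermitian, eigenvalues 0.5 and 1.5 (sum equals 2)
d = [1.0, 1.0]                 # hypothetical right-hand side
x0 = [0.0, 0.0]


def step(x):
    """One fixed-point step x -> T x + d."""
    return [dot(row, x) + c for row, c in zip(T, d)]


x1 = step(x0)
x2 = step(x1)
u0 = [b - a for a, b in zip(x0, x1)]   # u_0 = x_1 - x_0 = [1, 1]
u1 = [b - a for a, b in zip(x1, x2)]   # u_1 = x_2 - x_1
w0 = [b - a for a, b in zip(u0, u1)]   # w_0 = u_1 - u_0

# D(1, 1) = u_{0,1} - u_{0,0}, with u_{i,j} as in (1.60) for n = 0.
D_mpe = dot(u0, u1) - dot(u0, u0)      # equals <u_0, w_0>
D_rre = dot(w0, u1) - dot(w0, u0)      # equals <w_0, w_0>
```

Here `D_mpe` vanishes, so $s_{0,1}$ from MPE does not exist, while `D_rre` is nonzero, in line with the unconditional existence for RRE stated in Theorem 1.21.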

1.8 Compact representations

The alternative approach for RRE developed in Subsection 1.3.3 together with Theorems 1.17 and 1.18 enables us to express $s_{n,k}$ for MPE, RRE, MMPE, and SVD-MPE in compact form via matrices. First, by (1.36) and (1.37),
\[
s_{n,k} = \sum_{i=0}^{k} \gamma_i x_{n+i} = x_n + \sum_{j=0}^{k-1} \xi_j u_{n+j}
\quad \text{and} \quad
\sum_{i=0}^{k} \gamma_i u_{n+i} = u_n + \sum_{j=0}^{k-1} \xi_j w_{n+j}, \tag{1.69}
\]
where the $\xi_j$ are related to the $\gamma_i$ via
\[
\xi_j = \sum_{i=j+1}^{k} \gamma_i, \quad j = 0, 1, \ldots, k-1. \tag{1.70}
\]
Letting
\[
U = [u_n | u_{n+1} | \cdots | u_{n+k-1}], \quad W = [w_n | w_{n+1} | \cdots | w_{n+k-1}], \quad
Q = [q_1 | q_2 | \cdots | q_k], \quad G = [g_0 | g_1 | \cdots | g_{k-1}],
\]
as in Theorem 1.18, we can rewrite (1.69) in the form
\[
s_{n,k} = x_n + U\xi \quad \text{and} \quad \sum_{i=0}^{k} \gamma_i u_{n+i} = u_n + W\xi, \quad \xi = [\xi_0, \xi_1, \ldots, \xi_{k-1}]^T. \tag{1.71}
\]

Compact forms for MPE, RRE, and MMPE

Invoking (1.71), the orthogonality relations of Theorem 1.17 for MPE, RRE, and MMPE can be rewritten as
\[
Z^* M (u_n + W\xi) = 0 \quad \Rightarrow \quad Z^* M W \xi = -Z^* M u_n, \tag{1.72}
\]
where
\[
Z =
\begin{cases}
U & \text{for MPE}, \\
W & \text{for RRE}, \\
Q & \text{for MMPE}.
\end{cases} \tag{1.73}
\]
Solving (1.72) for $\xi$, we have $\xi = -(Z^* M W)^{-1} (Z^* M) u_n$, which, when substituted into (1.71), gives the compact representation
\[
s_{n,k} = x_n - U (Z^* M W)^{-1} (Z^* M) u_n. \tag{1.74}
\]
Of course, when the inner product used to define the methods is the standard Euclidean inner product $\langle y, z \rangle = y^* z$, we have $M = I$, and (1.74) becomes
\[
s_{n,k} = x_n - U (Z^* W)^{-1} Z^* u_n. \tag{1.75}
\]

Compact form for SVD-MPE

As for $s_{n,k}$ from SVD-MPE, as in Theorem 1.18, we just need to let $Z = G = [g_0 | g_1 | \cdots | g_{k-1}]$ in (1.75). We then have
\[
s_{n,k} = x_n - U (G^* W)^{-1} G^* u_n. \tag{1.76}
\]

Remark: The compact formulas we have here are in the spirit of Nuttall’s compact form for Padé approximants; see [14, p. 17]. We will make use of them in Section 12.3. We note that these formulas can also be obtained using the Schur complement, as shown in Brezinski [32] and Jbilou and Sadok [148]. See also Brezinski and Redivo Zaglia [41]. For the Schur complement, see [138], for example.
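The compact formula (1.75) can be exercised on a tiny example. The sketch below assumes real data, $M = I$, $n = 0$, and $k = 2$, with a hypothetical $2 \times 2$ matrix `T`; matrices are stored as lists of columns, and the helper names `solve2` and `compact_s` are illustrative only. Since $k = 2$ is the degree of the minimal polynomial of this `T` with respect to $u_0$, the finite termination property predicts $s_{0,2} = s$ exactly, which exact rational arithmetic lets us confirm.

```python
from fractions import Fraction as F


def dot(a, b):
    """Euclidean inner product for real vectors stored as lists."""
    return sum(p * q for p, q in zip(a, b))


def solve2(A, b):
    """Solve a 2x2 linear system by Cramer's rule (A given as a list of rows)."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(b[0] * A[1][1] - A[0][1] * b[1]) / det,
            (A[0][0] * b[1] - A[1][0] * b[0]) / det]


def compact_s(x_n, Z, U, W):
    """s_{n,2} = x_n - U (Z^T W)^{-1} Z^T u_n; Z, U, W are lists of two columns."""
    ZW = [[dot(z, w) for w in W] for z in Z]
    Zu = [dot(z, U[0]) for z in Z]                 # U[0] is u_n
    xi = [-c for c in solve2(ZW, Zu)]              # xi = -(Z^T W)^{-1} Z^T u_n
    return [x + xi[0] * a + xi[1] * b for x, a, b in zip(x_n, U[0], U[1])]


T = [[F(1, 2), F(1, 4)], [F(0), F(1, 4)]]   # hypothetical iteration matrix (rows)
d = [F(1), F(1)]
xs = [[F(0), F(0)]]
for _ in range(3):                           # x_1, x_2, x_3 via x_{m+1} = T x_m + d
    xs.append([dot(row, xs[-1]) + c for row, c in zip(T, d)])

us = [[b - a for a, b in zip(xs[m], xs[m + 1])] for m in range(3)]   # u_0, u_1, u_2
ws = [[b - a for a, b in zip(us[m], us[m + 1])] for m in range(2)]   # w_0, w_1
U, W = us[:2], ws

s_mpe = compact_s(xs[0], U, U, W)   # Z = U
s_rre = compact_s(xs[0], W, U, W)   # Z = W
```

Both results equal $s = [8/3, 4/3]^T$, the solution of $(I - T)x = d$ for this data.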

1.9 Numerical stability in polynomial extrapolation

An important issue that we encounter in extrapolation is that of numerical stability. As the topic of stability may not be common knowledge, we start with some rather general remarks on what we mean by it and how we analyze it. Our discussion here applies to every method that produces vectors $s_{n,k}$ from a given sequence $\{x_m\}$ and that can be expressed in the form
\[
s_{n,k} = \sum_{i=0}^{k} \gamma_i x_{n'+i}, \quad \sum_{i=0}^{k} \gamma_i = 1.
\]
Here $n' \ge n$. Clearly, all the polynomial methods introduced in this chapter are included in this discussion with $n' = n$.

Suppose that the vectors used in the extrapolation process are not precisely $x_m$, but $\tilde{x}_m = x_m + \eta_m$, where $\eta_m$ are (absolute) error vectors. As a result, the $\gamma_i$ change to $\tilde{\gamma}_i = \gamma_i + \delta_i$ and $s_{n,k}$ changes to $\tilde{s}_{n,k}$, given as
\[
\tilde{s}_{n,k} = \sum_{i=0}^{k} \tilde{\gamma}_i \tilde{x}_{n'+i}.
\]
We are interested in analyzing the total error $\tilde{s}_{n,k} - s$. We first have
\[
\tilde{s}_{n,k} - s = \bigl(\tilde{s}_{n,k} - s_{n,k}\bigr) + \bigl(s_{n,k} - s\bigr). \tag{1.77}
\]
The term $s_{n,k} - s$ is the theoretical error, and in the case of convergence (if $\lim_{n\to\infty} s_{n,k} = s$, for example), it becomes arbitrarily small. Thus, the total error is dominated by the term $\tilde{s}_{n,k} - s_{n,k}$ close to convergence. This term is given by
\[
\tilde{s}_{n,k} - s_{n,k} = \sum_{i=0}^{k} \tilde{\gamma}_i \tilde{x}_{n'+i} - \sum_{i=0}^{k} \gamma_i x_{n'+i}
= \sum_{i=0}^{k} \gamma_i \eta_{n'+i} + \sum_{i=0}^{k} \delta_i x_{n'+i} + \sum_{i=0}^{k} \delta_i \eta_{n'+i}. \tag{1.78}
\]
Now, assuming that the relative errors in the $x_j$ and the $\gamma_i$ are all bounded by some $\varepsilon > 0$, we will have $\|\eta_j\| \le \varepsilon \|x_j\|$ and $|\delta_i| \le \varepsilon |\gamma_i|$. Taking norms in (1.78), and invoking these bounds, we obtain
\[
\|\tilde{s}_{n,k} - s_{n,k}\| \le 2 \Bigl(\sum_{i=0}^{k} |\gamma_i| \, \|x_{n'+i}\|\Bigr) \varepsilon + O(\varepsilon^2). \tag{1.79}
\]


Of course, when $\varepsilon = 10^{-t}$, $t > 0$ (that is, the $x_m$ have been computed with approximately $t$ correct significant decimal digits), and $\Gamma_{n,k} = 10^q$, $q > 0$, then the number of correct significant decimal digits that can be achieved in $s_{n,k}$ is approximately $t - q$. From this discussion, it is clear that for reliable numerical results, we need a relatively small $\Gamma_{n,k}$, which in any case satisfies $\Gamma_{n,k} \ge 1$. The smaller $\Gamma_{n,k}$, the more stable and accurate the $s_{n,k}$.
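The digit-count rule above is easy to evaluate for a given set of coefficients. The sketch below makes one assumption that should be stated loudly: it takes $\Gamma_{n,k} = \sum_{i=0}^{k} |\gamma_i|$, which is consistent with the bound (1.79) and with $\Gamma_{n,k} \ge 1$ but is not spelled out in the passage as it stands. The function name and the sample coefficients are hypothetical.

```python
import math


def stability_estimate(gammas, t):
    """Return (Gamma, estimated correct digits), assuming Gamma = sum_i |gamma_i|.

    `t` is the approximate number of correct decimal digits in the input vectors.
    """
    Gamma = sum(abs(g) for g in gammas)
    q = math.log10(Gamma)        # Gamma = 10**q
    return Gamma, t - q


# Hypothetical coefficients that sum to 1 but have mixed signs and large magnitudes.
Gamma, digits = stability_estimate([5.5, -9.0, 4.5], t=16)
```

For these sample coefficients, $\Gamma = 19$, so roughly 1.3 of the 16 available digits would be lost under this estimate.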

1.10 Solution of x = f(x) by polynomial extrapolation methods in cycling mode

As we have already mentioned, a major application area of vector extrapolation methods is that of numerical solution of large systems of equations $\psi(x) = 0$ by fixed-point iterations $x_{m+1} = f(x_m)$.¹⁵ For all the polynomial methods discussed in this chapter, the computation of the approximation $s_{n,k}$ to $s$, the solution of $\psi(x) = 0$, requires $k + 1$ vectors to be stored in the computer memory. For systems of very large dimension $N$, this means that we should keep $k$ at a moderate size. As we will see in Chapter 6, when $k$ is kept fixed, $s_{n,k} \to s$ as $n \to \infty$ faster than $x_m \to s$, in general. This may suggest that we should keep $k$ fixed at a reasonable size that is suitable for the available computer memory and increase $n$ indefinitely; the computational cost of this approach is too high, however. In view of these limitations, a better strategy for systems of equations is cycling, for which both $n$ and $k$ are fixed. Below, we describe three types of cycling, namely, full cycling, cycling with frozen $\gamma_i$, and cycling in parallel. We will justify these in Section 7.5.

¹⁵ As mentioned earlier, the system $x = f(x)$ has the same solution $s$ as $\psi(x) = 0$ and may serve as a suitably preconditioned form of the latter.


1.10.1 Full cycling

In view of what we have learned so far, the most straightforward type of cycling is what we shall call here full cycling, whose steps are as follows:¹⁶

C0. Choose integers $n \ge 0$ and $k \ge 1$ and an initial vector $x_0$.
C1. Compute the vectors $x_1, x_2, \ldots, x_{n+k+1}$ [via $x_{m+1} = f(x_m)$].
C2. Apply any of the four extrapolation methods, namely, MPE, RRE, MMPE, and SVD-MPE, to the vectors $x_n, x_{n+1}, \ldots, x_{n+k+1}$, with end result $s_{n,k}$.
C3. If $s_{n,k}$ satisfies the accuracy test, stop. Otherwise, set $x_0 = s_{n,k}$ and go to step C1.

We will call each application of steps C1–C3 a cycle and denote by $s_{n,k}^{(r)}$ the $s_{n,k}$ computed in the $r$th cycle. We will also denote the initial vector $x_0$ in step C0 by $s_{n,k}^{(0)}$. Under suitable conditions, it can be shown rigorously that the sequence $\{s_{n,k}^{(r)}\}_{r=0}^{\infty}$ has very good convergence properties.

Remark: Before proceeding further, we note that to compute the approximations $s_{n,k}$ and $s_{0,n+k}$ to $\lim_{m\to\infty} x_m = s$ in each cycle we must generate the $n + k + 1$ vectors $x_1, x_2, \ldots, x_{n+k+1}$. Now, as can be argued heuristically (and as we will see in Chapter 7), $s_{0,n+k}$ is a better approximation to $s$ than $s_{n,k}$. However, when computing $s_{0,n+k}$, we need to store $n + k + 1$ vectors of dimension $N$ in the core memory at any given instance, as opposed to the $k + 1$ vectors needed to be stored for $s_{n,k}$. (Here $N$ is the dimension of the space we are working in.) If we do not wish to store more than $k + 1$ vectors in the core memory, a good strategy is to cycle with $s_{n,k}$, where $n > 0$, rather than to cycle with $s_{0,k}$, provided the cost of computing the $x_m$ is not high. At least in some cases, cycling with $n > 0$ may give better results than cycling with $n = 0$.

A more fundamental reason for cycling with $n > 0$ rather than with $n = 0$ is as follows: In some instances, cycling with $s_{0,k}$ in floating-point arithmetic stalls, in the sense that errors stay large for many cycles until they start to decrease, and even then the decrease is slow. Numerical experience suggests that cycling with even moderately large $n > 0$ can prevent stalling from happening (see Sidi and Shapira [303, 304]).

A special case

In Example 1.16, we briefly discussed the vectors $s_{0,1}$ defined by MPE and RRE. Here we discuss them again within the context of full cycling and for solving systems of the form $x = f(x)$. Assume that an initial approximation to the solution of $x = f(x)$ is given. Then we compute the sequence $\{s_{0,1}^{(r)}\}$ as follows:

Input $s^{(0)}$ (initial approximation).
For $r = 0, 1, \ldots,$ do
    Set $y_0 = s^{(r)}$.
    Compute $y_1 = f(y_0) = f(s^{(r)})$, $y_2 = f(y_1) = f(f(s^{(r)}))$.
    Compute $a_r = y_1 - y_0$, $b_r = y_2 - 2y_1 + y_0$.
    Compute $s^{(r+1)} = s^{(r)} - \dfrac{a_r^* a_r}{a_r^* b_r}\, a_r$ for MPE.
    Compute $s^{(r+1)} = s^{(r)} - \dfrac{b_r^* a_r}{b_r^* b_r}\, a_r$ for RRE.
end do ($r$)

¹⁶ Since this is the most common use of cycling, in the following, we will usually refer to full cycling simply as cycling.
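The special-case loop above translates almost line by line into code. The following is a minimal sketch for real vectors and the Euclidean inner product; the stopping test on $\|a_r\|$, the tolerance, the cycle cap, and the $2 \times 2$ linear test map are assumptions made for illustration and are not part of the text.

```python
def dot(a, b):
    """Euclidean inner product for real vectors stored as lists."""
    return sum(p * q for p, q in zip(a, b))


def cycle_s01(f, s0, method="MPE", tol=1e-10, max_cycles=100):
    """Full cycling with s_{0,1}: repeat s <- s - coeff * a until ||f(s) - s|| < tol."""
    s = list(s0)
    for _ in range(max_cycles):
        y0 = s
        y1 = f(y0)
        y2 = f(y1)
        a = [p - q for p, q in zip(y1, y0)]                    # a_r = y_1 - y_0
        b = [p - 2 * q + w for p, q, w in zip(y2, y1, y0)]     # b_r = y_2 - 2 y_1 + y_0
        if dot(a, a) ** 0.5 < tol:                             # assumed accuracy test
            break
        if method == "MPE":
            num, den = dot(a, a), dot(a, b)
        else:                                                  # RRE
            num, den = dot(b, a), dot(b, b)
        if den == 0:
            raise ZeroDivisionError("s_{0,1} does not exist in this cycle")
        s = [p - (num / den) * q for p, q in zip(s, a)]
    return s


# Hypothetical test problem: f(x) = T x + d with a symmetric 2x2 matrix T.
T = [[0.6, 0.2], [0.2, 0.3]]
d = [1.0, 1.0]


def f(x):
    return [dot(row, x) + c for row, c in zip(T, d)]


s_mpe = cycle_s01(f, [0.0, 0.0], method="MPE")
s_rre = cycle_s01(f, [0.0, 0.0], method="RRE")
```

For this test map the fixed point is $s = [3.75, 2.5]^T$, and both variants approach it to within the chosen tolerance.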

1.10.2 Cycling with frozen $\gamma_i$

The polynomial methods developed in this chapter can also be applied in a mode we shall call cycling with frozen $\gamma_i$; this mode was suggested recently in Sidi [279, 285]. In this application, we first perform a few full cycles ($p$ cycles, for example, for some integer $p$) as described above to compute $s_{n,k}^{(r)}$, $r = 1, \ldots, p$, and save ("freeze") the $\gamma_i = \gamma_i^{(p)}$ used to compute $s_{n,k}^{(p)}$. To compute $s_{n,k}^{(r)}$, $r = p + 1, p + 2, \ldots,$ we then go over the steps C1–C3, with step C2 modified as follows:

C2. Compute $s_{n,k}$ for any of the four extrapolation methods, namely, MPE, RRE, MMPE, and SVD-MPE, via
\[
s_{n,k} = \sum_{i=0}^{k} \gamma_i^{(p)} x_{n+i} \quad \text{(with the frozen $\gamma_i$)}.
\]
(That is, do not compute new $\gamma_i$ for cycles $r = p + 1, p + 2, \ldots$.)

Clearly, the $s_{n,k}^{(r)}$, $r \ge p + 1$, computed with the frozen $\gamma_i$ are not the actual $s_{n,k}^{(r)}$ computed by the extrapolation method being used in the full cycling mode; they are good approximations to them nevertheless.

This strategy enables us to save the overhead involved in computing the $\gamma_i^{(r)}$ (pertaining to $s_{n,k}^{(r)}$ with $r \ge p + 1$) in each cycle and allows us to store only one $x_j$ at any given instance. The vectors $x_j$ can be computed by keeping two vectors in the memory at any given instance. Thus, as suggested by $x_{m+1} = f(x_m)$, $x_1$ overwrites $x_0$, $x_2$ overwrites $x_1$, and so on, until $x_n$. Once $g = \gamma_0^{(p)} x_n$ has been computed and stored, we overwrite $x_n$ with $x_{n+1}$, compute $\gamma_1^{(p)} x_{n+1}$, and add it to $g$, replacing $g$ by $g + \gamma_1^{(p)} x_{n+1}$, and so on, finally replacing $g$ by $g + \gamma_k^{(p)} x_{n+k}$. Following this, we set $s_{n,k}^{(r)} = g$.

Clearly, this strategy is very economical both computationally and storagewise when the dimension $N$ of the vectors $x_m$ is very large: (i) it avoids all of the expensive vector operations for determining the $\gamma_i$ (due to the solution of least-squares problems involving the matrix $U_k$ in the cases of MPE and RRE, for example), and (ii) it avoids the storage of $U_k$, which can be a very large matrix. It becomes useful especially when the cost of computing the individual vectors is very small (of order $dN$, where $d$ is a small number independent of $N$, for example). In addition, it produces very good results in many cases.

Of course, we would not want to use the frozen $\gamma_i$ for too many of the cycles with $r \ge p + 1$; instead, we would want to "refresh" the $\gamma_i$ every once in a while (every $q$ cycles, for example, for some fixed integer $q$) by applying the extrapolation method in full. Here are the steps of this mode of usage of cycling with frozen $\gamma_i$:

FC0. Choose integers $n \ge 0$, $k \ge 1$, $p \ge 1$, and $q \ge 2$, and an initial vector $x_0$.

FC1. Set $s_{n,k}^{(0)} = x_0$.
    For $r = 1, \ldots, p - 1$ do
        Compute the exact $s_{n,k}^{(r)}$ from $s_{n,k}^{(r-1)}$.
    end do ($r$)
FC2. For $h = 0, 1, 2, \ldots$ do
        Set $r = p + hq$ and compute the exact $s_{n,k}^{(r)}$ from $s_{n,k}^{(r-1)}$.
        Set $x_0 = s_{n,k}^{(r)}$.
        For $j = 1, \ldots, q - 1$ do
            Set $r = p + hq + j$.
            Compute $s_{n,k}^{(r)}$ from $s_{n,k}^{(r-1)}$ using frozen coefficients from $s_{n,k}^{(p+hq)}$.
            Set $x_0 = s_{n,k}^{(r)}$.
        end do ($j$)
    end do ($h$)

Here, by "compute the exact $s_{n,k}^{(r)}$ from $s_{n,k}^{(r-1)}$" we mean that, starting with $x_0 = s_{n,k}^{(r-1)}$, the vectors $x_1, \ldots, x_{n+k+1}$ are to be generated via $x_{m+1} = f(x_m)$, and the extrapolation method is to be applied to the vectors $x_n, x_{n+1}, \ldots, x_{n+k+1}$ to produce the exact $s_{n,k}^{(r)}$. Similarly, for $r = p + hq + j$, $1 \le j \le q - 1$, by "compute $s_{n,k}^{(r)}$ from $s_{n,k}^{(r-1)}$ using frozen coefficients from $s_{n,k}^{(p+hq)}$" we mean that, starting with $x_0 = s_{n,k}^{(r-1)}$, the vectors $x_1, \ldots, x_{n+k+1}$ are to be generated via $x_{m+1} = f(x_m)$, and $s_{n,k}^{(r)}$ is to be computed as $s_{n,k}^{(r)} = \sum_{i=0}^{k} \gamma_i^{(p+hq)} x_{n+i}$, where the $\gamma_i^{(p+hq)}$ are those $\gamma_i$ that were determined exactly by the relevant extrapolation method for the exact $s_{n,k}^{(p+hq)}$.

Note also that the vectors $s_{n,k}^{(1)}, \ldots, s_{n,k}^{(p-1)}$ and $s_{n,k}^{(p+hq)}$, $h = 0, 1, \ldots,$ are computed exactly by the extrapolation methods, while the rest of the $s_{n,k}^{(r)}$ are computed using frozen $\gamma_i$. Of these, $s_{n,k}^{(1)}, \ldots, s_{n,k}^{(p)}$ are computed by full cycling.
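The storage argument for the frozen-coefficient step (one iterate plus one accumulator) can be sketched as follows. The function name `frozen_cycle` and the scalar-valued test map are hypothetical; the coefficients passed in are assumed to have been produced earlier by a full extrapolation cycle.

```python
def frozen_cycle(f, x0, gammas, n):
    """One cycle with frozen coefficients: returns sum_i gammas[i] * x_{n+i}.

    Only two vectors are held at any time: the current iterate `x` and the
    accumulator `g`.
    """
    x = list(x0)
    for _ in range(n):                      # x_1 overwrites x_0, ..., up to x_n
        x = f(x)
    g = [gammas[0] * c for c in x]          # g = gamma_0 * x_n
    for gam in gammas[1:]:
        x = f(x)                            # overwrite x_{n+i-1} with x_{n+i}
        g = [gi + gam * c for gi, c in zip(g, x)]
    return g


# Hypothetical one-component example: f(x) = 0.5 x + 1 has the fixed point 2.
def f(x):
    return [0.5 * c + 1.0 for c in x]


# With n = 1 and k = 1, the coefficients [-1, 2] annihilate the error term 0.5**m.
s = frozen_cycle(f, [0.0], [-1.0, 2.0], n=1)
```

In this example `s` equals the fixed point, because the frozen coefficients happen to be the exact ones for this linear map.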

Application to linear f(x) and a special case

Let $f(x)$ be linear, that is, $f(x) = Tx + d$, and take $p = 1$ and $q = 2$ when cycling with frozen $\gamma_i$. Then the vectors $s_{n,k}^{(2r)}$, $r = 1, 2, \ldots,$ that are computed using frozen $\gamma_i$, have an interesting structure. To analyze this structure, it is sufficient to consider $s_{n,k}^{(j)}$, $j = 2r, 2r + 1, 2r + 2$.

Now, given the initial vector $x_0 \equiv s_{n,k}^{(2r)}$, the vector $s_{n,k}^{(2r+1)}$ is obtained by first computing the vectors $x_1, x_2, \ldots, x_{n+k+1}$ via $x_{m+1} = f(x_m)$ and next applying one of the four extrapolation methods, namely, MPE, RRE, MMPE, and SVD-MPE, to the vectors $x_n, x_{n+1}, \ldots, x_{n+k+1}$. Thus, $s_{n,k}^{(2r+1)} = \sum_{i=0}^{k} \gamma_i^{(2r+1)} x_{n+i}$, with the exact $\gamma_i^{(2r+1)}$ determined by each method. Next, we set $y_0 = s_{n,k}^{(2r+1)}$ and compute $y_1, y_2, \ldots, y_{n+k+1}$ via $y_{m+1} = f(y_m)$ and set $s_{n,k}^{(2r+2)} = \sum_{j=0}^{k} \gamma_j^{(2r+1)} y_{n+j}$.

Because $\sum_{i=0}^{k} \gamma_i^{(2r+1)} = 1$ and $x_{m+1} = Tx_m + d$ and $y_{m+1} = Ty_m + d$,
\[
y_1 = Ty_0 + d = T \sum_{i=0}^{k} \gamma_i^{(2r+1)} x_{n+i} + d = \sum_{i=0}^{k} \gamma_i^{(2r+1)} (Tx_{n+i} + d) = \sum_{i=0}^{k} \gamma_i^{(2r+1)} x_{n+i+1},
\]
and, by induction,
\[
y_j = \sum_{i=0}^{k} \gamma_i^{(2r+1)} x_{n+i+j}, \quad j = 0, 1, \ldots.
\]
Consequently,
\[
s_{n,k}^{(2r+2)} = \sum_{j=0}^{k} \gamma_j^{(2r+1)} \sum_{i=0}^{k} \gamma_i^{(2r+1)} x_{2n+i+j}
= \sum_{i=0}^{k} \sum_{j=0}^{k} \gamma_i^{(2r+1)} \gamma_j^{(2r+1)} x_{2n+i+j}. \tag{1.83}
\]

[It must be emphasized that all the vectors $x_1, x_2, \ldots, x_{2n+2k}$ that are used for constructing $s_{n,k}^{(2r+2)}$ are obtained via $x_{m+1} = f(x_m)$, starting with $x_0 = s_{n,k}^{(2r)}$.] This is the structure that we alluded to above. Letting $n = 0$ in (1.83), we obtain the special case
\[
s_{0,k}^{(2r+2)} = \sum_{i=0}^{k} \sum_{j=0}^{k} \gamma_i^{(2r+1)} \gamma_j^{(2r+1)} x_{i+j}.
\]
This is nothing but the "squared" version of the extrapolation methods MPE, RRE, MMPE, and SVD-MPE considered in Varadhan and Roland [332] and Roland, Varadhan, and Frangakis [222]. We have thus shown that, when $f(x)$ is linear [that is, $f(x) = Tx + d$], the sequence $\{s_{n,k}^{(2r)}\}_{r=1}^{\infty}$ produced by our cycling with frozen coefficients, with $p = 1$, $q = 2$, and $n = 0$, is mathematically identical to that described in [222], even though the two are computed in completely different ways. [For nonlinear $f(x)$, the two sequences have the same computational cost; they are mathematically different, however.]

We end by considering the structure of the residual vector $r(x)$ corresponding to $x$ as an approximation to the solution of $x = f(x)$, which we define by $r(x) = f(x) - x$, for the case $f(x) = Tx + d$. Using the fact that $\sum_{i=0}^{k} \sum_{j=0}^{k} \gamma_i^{(2r+1)} \gamma_j^{(2r+1)} = 1$, it is easy to verify that
\[
r(s_{n,k}^{(2r+2)}) = \sum_{i=0}^{k} \sum_{j=0}^{k} \gamma_i^{(2r+1)} \gamma_j^{(2r+1)} u_{2n+i+j}, \quad u_m = x_{m+1} - x_m.
\]
By the fact that $u_m = T^m u_0 = T^m r(s_{n,k}^{(2r)})$, we then have
\[
r(s_{n,k}^{(2r+2)}) = \sum_{i=0}^{k} \sum_{j=0}^{k} \gamma_i^{(2r+1)} \gamma_j^{(2r+1)} T^{2n+i+j}\, r(s_{n,k}^{(2r)})
= \Bigl[T^n \sum_{i=0}^{k} \gamma_i^{(2r+1)} T^i\Bigr]^2 r(s_{n,k}^{(2r)}).
\]
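The double-sum structure (1.83) only uses linearity of $f$ and the fact that the coefficients sum to one, so it can be checked with any such coefficients. The sketch below uses exact rational arithmetic with a hypothetical $2 \times 2$ matrix `T`, with $n = 1$, $k = 2$, and hypothetical coefficients `gam` standing in for the $\gamma_i^{(2r+1)}$.

```python
from fractions import Fraction as F

T = [[F(1, 2), F(1, 4)], [F(0), F(1, 4)]]   # hypothetical iteration matrix (rows)
d = [F(1), F(1)]


def f(x):
    """Linear fixed-point map f(x) = T x + d."""
    return [sum(t * c for t, c in zip(row, x)) + di for row, di in zip(T, d)]


def iterate(x0, count):
    """Return [x_0, x_1, ..., x_count] with x_{m+1} = f(x_m)."""
    xs = [list(x0)]
    for _ in range(count):
        xs.append(f(xs[-1]))
    return xs


def combine(coeffs, vectors):
    """Linear combination sum_i coeffs[i] * vectors[i]."""
    return [sum(c * v[i] for c, v in zip(coeffs, vectors))
            for i in range(len(vectors[0]))]


n, k = 1, 2
gam = [F(1, 2), F(-1), F(3, 2)]             # stand-in coefficients, sum equals 1

xs = iterate([F(0), F(0)], 2 * n + 2 * k)   # x_0, ..., x_{2n+2k}
s_odd = combine(gam, xs[n:n + k + 1])       # plays the role of s^{(2r+1)}
ys = iterate(s_odd, n + k)                  # y_0 = s^{(2r+1)}, y_{m+1} = f(y_m)
s_even = combine(gam, ys[n:n + k + 1])      # frozen-coefficient cycle, s^{(2r+2)}

# Right-hand side of (1.83): sum_i sum_j gam_i gam_j x_{2n+i+j}.
double_sum = [sum(gam[i] * gam[j] * xs[2 * n + i + j][c]
                  for i in range(k + 1) for j in range(k + 1))
              for c in range(2)]
```

The two vectors `s_even` and `double_sum` coincide exactly, as (1.83) predicts for linear $f$.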


1.10.3 Cycling in parallel

When vector extrapolation methods are applied on parallel computers with several processors, the overhead due to the implementation of the methods can be too large. This is because vector operations, such as inner products and vector additions, become inefficient because they require lots of communication between the different processors throughout the computation. Despite this deficiency, we can modify the extrapolation methods to reduce the cost of the communication between processors substantially. Even though the results of this modification are not what we would get out of the original unmodified extrapolation methods, they are very close to them.

Here we are assuming that the vectors $x_m$ in $\mathbb{C}^N$ have been obtained exactly from the iterative process $x_{m+1} = f(x_m)$, $m = 0, 1, \ldots$. We are also assuming that our machine has $q$ processors. As usual, we let
\[
x_m = [x_m^{(1)}, x_m^{(2)}, \ldots, x_m^{(N)}]^T, \quad m = 0, 1, \ldots.
\]
We choose positive integers $\nu_1, \ldots, \nu_q$ such that $\nu_1 < \nu_2 < \cdots < \nu_q = N$ and let
\[
x_{m,j} = [x_m^{(\nu_{j-1}+1)}, x_m^{(\nu_{j-1}+2)}, \ldots, x_m^{(\nu_j)}]^T, \quad j = 1, \ldots, q, \quad \nu_0 = 0.
\]
Clearly, $x_{m,j} \in \mathbb{C}^{\nu_j - \nu_{j-1}}$, $j = 1, \ldots, q$, and
\[
x_m = \begin{bmatrix} x_{m,1} \\ x_{m,2} \\ \vdots \\ x_{m,q} \end{bmatrix}.
\]
Here are the steps of this approach:

PC0. Choose integers $n \ge 0$ and $k \ge 1$ and an initial vector $x_0$.
PC1. Compute the vectors $x_1, x_2, \ldots, x_{n+k+1}$ [via $x_{m+1} = f(x_m)$], and form the partial vectors $x_{m,j}$, $j = 1, \ldots, q$, as above.
PC2. Apply any of the four extrapolation methods, namely, MPE, RRE, MMPE, and SVD-MPE, to the vectors $x_{n,j}, x_{n+1,j}, \ldots, x_{n+k+1,j}$ in the $j$th processor, and call the resulting vector $s_{n,k,j}$, $j = 1, \ldots, q$. Then set
\[
\hat{s}_{n,k} = \begin{bmatrix} s_{n,k,1} \\ s_{n,k,2} \\ \vdots \\ s_{n,k,q} \end{bmatrix}
\]
as the approximation to $s$.
PC3. If $\hat{s}_{n,k}$ satisfies the accuracy test, stop. Otherwise, set $x_0 = \hat{s}_{n,k}$, and go to step PC1.

Let $s_{n,k,j} = \sum_{i=0}^{k} \gamma_{i,j} x_{n+i,j}$. As before, let $s_{n,k} = \sum_{i=0}^{k} \gamma_i x_{n+i}$ be the vector that results from full cycling applied to $x_n, x_{n+1}, \ldots, x_{n+k+1}$. Since the $\gamma_{i,j}$ differ (i) from one value of $j$ to another and also (ii) from the $\gamma_i$, it is clear that $\hat{s}_{n,k} \ne s_{n,k}$; $\hat{s}_{n,k}$ is quite close to $s_{n,k}$, however.

Again, we will call each application of steps PC1–PC3 a cycle and denote by $\hat{s}_{n,k}^{(r)}$ the $\hat{s}_{n,k}$ that is computed in the $r$th cycle. We will also denote the initial vector $x_0$ in step PC0 by $\hat{s}_{n,k}^{(0)}$. Usually, the sequence $\{\hat{s}_{n,k}^{(r)}\}_{r=0}^{\infty}$ has very good convergence properties.

Note also that, when computing $s_{n,k,j}$, the $j$th processor is functioning independently of the other processors, which implies that there is no communication between the different processors when the vectors $s_{n,k,1}, \ldots, s_{n,k,q}$ are being computed. In addition, the computation times of these vectors will be the same if $\nu_j - \nu_{j-1} = N/q$ for all $j$ provided $N$ is a multiple of $q$; otherwise, the computation times will be approximately the same if $\nu_j - \nu_{j-1} \approx N/q$ for all $j$.

Note that this type of cycling may be useful, for example, when dealing with systems of partial differential equations involving $q$ unknown functions, $g_1, \ldots, g_q$, say. The vectors $x_{m,1}, \ldots, x_{m,q}$ may be chosen to correspond to $g_1, \ldots, g_q$, respectively.
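Step PC2 amounts to extrapolating each block of components on its own and stacking the results. The sketch below simulates this sequentially for $n = 0$, $k = 1$ with MPE in the Euclidean inner product; the function names, the block boundaries `nus`, and the three-component test sequence are hypothetical, and no actual parallel execution is attempted.

```python
from fractions import Fraction as F


def mpe_k1(x0, x1, x2):
    """s_{n,1} from MPE (real data, Euclidean inner product)."""
    dx = [b - a for a, b in zip(x0, x1)]                    # Delta x_n
    d2x = [c - 2 * b + a for a, b, c in zip(x0, x1, x2)]    # Delta^2 x_n
    coeff = sum(p * p for p in dx) / sum(p * q for p, q in zip(dx, d2x))
    return [a - coeff * p for a, p in zip(x0, dx)]


def blockwise(xs, nus, extrapolate):
    """Apply `extrapolate` to each block of components and stack the results.

    `nus` = [nu_1, ..., nu_q] with nu_q = N; block j holds components
    nu_{j-1}+1, ..., nu_j (1-based), that is, the slice [nu_{j-1}:nu_j].
    """
    result, start = [], 0
    for end in nus:
        result += extrapolate(*[x[start:end] for x in xs])
        start = end
    return result


# Hypothetical sequence: components 1-2 converge like (1/2)^m, component 3 like (1/4)^m.
xs = [[1 + F(1, 2) ** m, 2 - 2 * F(1, 2) ** m, 3 + F(1, 4) ** m] for m in range(3)]

s_blocks = blockwise(xs, [2, 3], mpe_k1)   # q = 2 blocks, each with its own gammas
s_full = mpe_k1(*xs)                       # one set of gammas for the whole vector
```

The two results differ because each block determines its own coefficients $\gamma_{i,j}$. In this constructed example each block contains a single geometric error term, so the blockwise result equals the limit $[1, 2, 3]^T$ exactly; that is a property of the test data, not of the approach in general.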

1.11 Final remarks

So far, we have discussed in some detail the development and the various algebraic properties of polynomial-type extrapolation methods and their efficient application in the cycling mode. We end by mentioning, without proof at this stage, some recursive properties of MPE, RRE, and MMPE. Our first item here concerns the relation that exists between MPE and RRE, while the second concerns the existence of recursions (i) among the vectors $s_{n,k} = s_{n,k}^{\mathrm{MMPE}}$ produced by MMPE, (ii) among the vectors $s_{n,k} = s_{n,k}^{\mathrm{MPE}}$ produced by MPE, and (iii) among the vectors $s_{n,k} = s_{n,k}^{\mathrm{RRE}}$ produced by RRE.

1. Despite the fact that MPE and RRE are essentially two different methods, the vectors $s_{n,k}^{\mathrm{MPE}}$ and $s_{n,k}^{\mathrm{RRE}}$ are closely related in more than one way. We will discuss their relation in detail in Chapter 3, after we have discussed the algorithms for their numerical implementation in Chapter 2.$^{17}$ Two main results are proved in Chapter 3. The first one concerns the stagnation of RRE when MPE fails; it reads as follows:

$$s_{n,k}^{\mathrm{RRE}} = s_{n,k-1}^{\mathrm{RRE}} \quad\Longleftrightarrow\quad s_{n,k}^{\mathrm{MPE}} \text{ does not exist.}$$

The second result concerns the general case in which $s_{n,k}^{\mathrm{MPE}}$ exists; part of it reads

$$\mu_k s_{n,k}^{\mathrm{RRE}} = \mu_{k-1} s_{n,k-1}^{\mathrm{RRE}} + \nu_k s_{n,k}^{\mathrm{MPE}}, \qquad \mu_k = \mu_{k-1} + \nu_k,$$

where $\mu_k$, $\mu_{k-1}$, $\nu_k$ are positive scalars depending only on $s_{n,k}^{\mathrm{RRE}}$, $s_{n,k-1}^{\mathrm{RRE}}$, $s_{n,k}^{\mathrm{MPE}}$, respectively. An important conclusion follows from the analysis of Chapter 3: When applied to the sequence $\{x_m\}$, both methods either perform well simultaneously or perform poorly simultaneously.

2. There exists a three-term recursion relation among the vectors $s_{n,k}^{\mathrm{MMPE}}$ of the form

$$s_{n,k+1} = \alpha_{nk} s_{n,k} + \beta_{nk} s_{n+1,k}, \qquad \alpha_{nk} + \beta_{nk} = 1,$$

17 The mathematical apparatus leading to the algorithms of Chapter 2 is crucial for the study of Chapter 3.


where $\alpha_{nk}$ and $\beta_{nk}$ are some scalars. The initial conditions are $s_{n,0} = x_n$ and $s_{n,1}$, $n = 0, 1, \ldots$. Similarly, there exist four-term recursion relations among the vectors $s_{n,k}^{\mathrm{MPE}}$ and among the vectors $s_{n,k}^{\mathrm{RRE}}$ of the form

$$s_{n,k+1} = \alpha_{nk} s_{n,k} + \beta_{nk} s_{n+1,k-1} + \gamma_{nk} s_{n+1,k}, \qquad \alpha_{nk} + \beta_{nk} + \gamma_{nk} = 1,$$

where $\alpha_{nk}$, $\beta_{nk}$, and $\gamma_{nk}$ are some scalars. The initial conditions are $s_{n,0} = x_n$ and $s_{n,1}$, $n = 0, 1, \ldots$. Of course, $\alpha_{nk}$, $\beta_{nk}$, and $\gamma_{nk}$ are different for the two methods. We derive these recursion relations and more in Chapter 8.

Given the facts that (i) the input to the methods MPE, RRE, and MMPE is a sequence of arbitrary vectors and (ii) these methods are highly nonlinear, the existence of the recursive properties mentioned here is quite surprising.

Chapter 2

Unified Algorithms for MPE and RRE

2.1 General considerations

In this chapter, we will develop algorithms for the efficient implementation of MPE and RRE. As is clear from the way we have derived these methods, the crucial stage of the implementations is that of determining the $\gamma_j$. We have also seen that to determine the $\gamma_j$ we need to solve some least-squares problems, namely, (1.32) for MPE and (1.34) or (1.39) for RRE. The most immediate way of solving these problems is via the solution of the relevant normal equations, namely, (1.63) and (1.64) in the proof of Theorem 1.15. But the matrices of the normal equations in our problems tend to have large condition numbers, which is likely to cause their numerical solutions to have low accuracy. We thus need to pick better numerical methods for solving our least-squares problems. One very good choice uses the QR factorization of matrices, and we start the development of our algorithms by introducing the QR factorization.

The algorithms for MPE and RRE that we present here have the following features:

1. Their treatments are unified via the QR factorization of the matrices $U_k$.
2. Their computational and computer storage requirements are minimal. (Note that economy of storage is crucial when working with very high dimensional problems; such problems form the main area of application for our methods.)
3. Unlike the algorithms that proceed via the normal equations, they are numerically stable.

As in the preceding chapter, throughout this chapter we will use the vector norm $\|\cdot\|$ in $\mathbb{C}^N$ defined as $\|z\| = \sqrt{(z, z)}$, with $(y, z) = y^* M z$, where $M$ is a Hermitian positive definite matrix.$^{18}$ We will also use the standard Euclidean inner product and the norm induced by it, namely, $\langle y, z \rangle = y^* z$ and $\|z\|_2 = \sqrt{\langle z, z \rangle}$, respectively.

The developments that follow next are based entirely on those in Sidi [266, 292]. Of these two papers, [266] uses the standard Euclidean inner product and norm, while [292] uses a weighted inner product and norm.

18 Recall that the weighted inner product $(y, z) = y^* M z$, where $M$ is a Hermitian positive definite matrix, is the most general inner product in $\mathbb{C}^N$.


Summary of main points concerning MPE and RRE

Before proceeding further, we would like to summarize the main points relevant to MPE and RRE. We recall that for both methods, the approximations to $\lim_{m\to\infty} x_m$ are of the form

$$s_{n,k} = \sum_{i=0}^{k} \gamma_i x_{n+i}, \qquad \sum_{i=0}^{k} \gamma_i = 1. \tag{2.1}$$

The methods differ in the way they determine the $\gamma_i$. Starting with the vector sequence $\{x_m\}$, we let

$$u_m = x_{m+1} - x_m, \qquad m = 0, 1, \ldots, \tag{2.2}$$

and define the matrices $U_j$ as

$$U_j = [\, u_n \mid u_{n+1} \mid \cdots \mid u_{n+j} \,] \in \mathbb{C}^{N \times (j+1)}. \tag{2.3}$$

Thus, $U_j$ has $u_n, u_{n+1}, \ldots, u_{n+j}$ as its columns.

• Determining the $\gamma_i$ for MPE: Denote $c' = [c_0, c_1, \ldots, c_{k-1}]^T$ and solve the equations $U_{k-1} c' = -u_{n+k}$ in the least-squares sense; that is, solve the minimization problem

$$\min_{c'} \| U_{k-1} c' + u_{n+k} \|. \tag{2.4}$$

Following this, let $c_k = 1$ and compute

$$\gamma_i = \frac{c_i}{\sum_{j=0}^{k} c_j}, \qquad i = 0, 1, \ldots, k, \tag{2.5}$$

provided $\sum_{j=0}^{k} c_j \neq 0$.

• Determining the $\gamma_i$ for RRE: Denote $\gamma = [\gamma_0, \gamma_1, \ldots, \gamma_k]^T$ and solve the equations $U_k \gamma = 0$ in the least-squares sense, subject to the constraint that $\sum_{i=0}^{k} \gamma_i = 1$; that is, solve the minimization problem

$$\min_{\gamma} \| U_k \gamma \| \quad \text{subject to} \quad \sum_{i=0}^{k} \gamma_i = 1. \tag{2.6}$$

Note that the computation of the γi for MPE involves an unconstrained linear least-squares problem, while that for RRE involves a least-squares problem with one linear constraint. Surprisingly, the treatment of both problems can be unified by applying QR factorization to the matrices U k . Because we are using a weighted norm to define MPE and RRE, we need to study the weighted QR factorization of matrices, which is done in detail in Appendix A. We urge the reader to study this appendix carefully before delving into Section 2.3, where we derive our algorithms for MPE and RRE.

2.2 QR factorization

For convenience, we state Theorem A.1 from Appendix A next.


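Theorem 2.1. Let

$$A = [\, a_1 \mid a_2 \mid \cdots \mid a_s \,] \in \mathbb{C}^{m \times s}, \qquad m \geq s, \qquad \operatorname{rank}(A) = s.$$

Also, let $M \in \mathbb{C}^{m \times m}$ be Hermitian positive definite and let $(y, z) = y^* M z$. Then there exists a matrix $Q \in \mathbb{C}^{m \times s}$, unitary in the sense that $Q^* M Q = I_s$, and an upper triangular matrix $R \in \mathbb{C}^{s \times s}$ with positive diagonal elements, such that $A = QR$. Specifically,

$$Q = [\, q_1 \mid q_2 \mid \cdots \mid q_s \,], \qquad R = \begin{bmatrix} r_{11} & r_{12} & \cdots & r_{1s} \\ & r_{22} & \cdots & r_{2s} \\ & & \ddots & \vdots \\ & & & r_{ss} \end{bmatrix},$$

$$(q_i, q_j) = q_i^* M q_j = \delta_{ij} \quad \forall\, i, j, \qquad r_{ij} = (q_i, a_j) = q_i^* M a_j \quad \forall\, i \leq j, \qquad r_{ii} > 0 \quad \forall\, i.$$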

In addition, the matrices $Q$ and $R$ are unique.

Note that the $r_{ij}$ are obtained in the process of computing $q_j$ from $a_j$ by the Gram–Schmidt (GS) orthogonalization process once the vectors $q_1, \ldots, q_{j-1}$ have been computed. As explained in Appendix A, when we are using the standard Euclidean inner product (that is, when $M = I$), the matrices $Q$ and $R$ should be computed numerically by the modified Gram–Schmidt (MGS) process. We assume that MGS will be effective when $M$ is not ill conditioned.

In what follows, we make use of the following simple, but important, observation.

Lemma 2.2. Let $P \in \mathbb{C}^{m \times s}$ be unitary in the sense that $P^* M P = I_s$, where $M$ is Hermitian positive definite, and let $(y, z) = y^* M z$ and $\|z\| = \sqrt{(z, z)}$. Then

$$(P y, P z) = y^* z \qquad \text{and} \qquad \|P z\| = \sqrt{z^* z} = \|z\|_2.$$
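The weighted factorization of Theorem 2.1 and the norm identity of Lemma 2.2 can be checked on a small example. The following is only a sketch under simplifying assumptions that are not part of the text: real arithmetic, a diagonal weight matrix $M = \operatorname{diag}(w)$, and illustrative names (`wdot`, `weighted_mgs`) and data chosen here for demonstration.

```python
import math

def wdot(y, z, w):
    """Weighted inner product (y, z) = y^T M z for real vectors, M = diag(w)."""
    return sum(wi * yi * zi for wi, yi, zi in zip(w, y, z))

def weighted_mgs(cols, w):
    """Weighted QR of A = [cols[0] | ... | cols[s-1]] by modified Gram-Schmidt.

    Returns (Q, R): Q is a list of s columns with (q_i, q_j) = delta_ij,
    R is an s x s upper triangular list of lists with A = Q R.
    """
    s = len(cols)
    Q = []
    R = [[0.0] * s for _ in range(s)]
    for j in range(s):
        v = list(cols[j])
        for i in range(j):
            # MGS: project the partially reduced vector v, not the original column.
            R[i][j] = wdot(Q[i], v, w)
            v = [vk - R[i][j] * qk for vk, qk in zip(v, Q[i])]
        R[j][j] = math.sqrt(wdot(v, v, w))
        Q.append([vk / R[j][j] for vk in v])
    return Q, R

# Illustrative data (assumed): a 4 x 2 matrix of full rank and positive weights.
cols = [[1.0, 2.0, 0.0, 1.0], [0.0, 1.0, 1.0, 3.0]]
w = [1.0, 2.0, 0.5, 4.0]
Q, R = weighted_mgs(cols, w)

# Lemma 2.2 with P = Q: the weighted norm of P z equals the Euclidean norm of z.
z = [3.0, -4.0]
Pz = [sum(Q[j][i] * z[j] for j in range(2)) for i in range(4)]
norm_Pz = math.sqrt(wdot(Pz, Pz, w))
```

Here `norm_Pz` should agree with $\|z\|_2 = 5$ up to rounding, even though the weighted norm of a general vector in $\mathbb{R}^4$ differs from its Euclidean norm.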

2.3 Solution of least-squares problems by QR factorization

2.3.1 Solution of unconstrained problems

The following is a well-known result for unconstrained least-squares problems.

Theorem 2.3. Consider the least-squares problem

$$\min_{x \in \mathbb{C}^s} \|A x - b\|, \qquad A \in \mathbb{C}^{m \times s}, \qquad b \in \mathbb{C}^m, \qquad \|z\| = \sqrt{(z, z)} = \sqrt{z^* M z}. \tag{2.7}$$

1. This problem has at least one solution.
2. The vector $x \in \mathbb{C}^s$ is a solution to (2.7) if and only if it satisfies the $s \times s$ linear system (also called the system of normal equations)

$$(A^* M A) x = A^* M b \quad\Longleftrightarrow\quad A^* M (A x - b) = 0. \tag{2.8}$$


3. If $x_1$ and $x_2$ are two different solutions, then $x_1 - x_2 \in \mathcal{N}(A)$.
4. If $\operatorname{rank}(A) = s$, then the solution is unique and is given by

$$x = (A^* M A)^{-1} A^* M b. \tag{2.9}$$

Proof. We start by expressing $b$ as

$$b = b_1 + b_2, \qquad b_1 \in \mathcal{R}(A), \qquad b_2 \in \mathcal{R}(A)^{\perp_M},$$

where $\mathcal{R}(A)$ stands for the subspace spanned by the columns of $A$ as usual and $\mathcal{R}(A)^{\perp_M} = \{x \in \mathbb{C}^m : (x, y) = 0 \ \forall\, y \in \mathcal{R}(A)\}$. This is possible by the fact that $\mathbb{C}^m = \mathcal{R}(A) \oplus \mathcal{R}(A)^{\perp_M}$. Since $A x \in \mathcal{R}(A)$ just as $b_1 \in \mathcal{R}(A)$, we have that $A x - b_1 \in \mathcal{R}(A)$ too, and, therefore, $(b_2, A x - b_1) = b_2^* M (A x - b_1) = 0$. Consequently,

$$\|A x - b\|^2 = \|(A x - b_1) - b_2\|^2 = \|A x - b_1\|^2 + \|b_2\|^2 \geq \|b_2\|^2 \qquad \forall\, x \in \mathbb{C}^s.$$

We thus see that the solution to $\min_{x \in \mathbb{C}^s} \|A x - b\|$ is the same as the solution to $\min_{x \in \mathbb{C}^s} \|A x - b_1\|$, and the latter problem has at least one solution $\hat{x}$ that satisfies $A \hat{x} - b_1 = 0$, since $b_1 \in \mathcal{R}(A)$. Consequently, $A \hat{x} = b_1$; hence

$$\min_{x \in \mathbb{C}^s} \|A x - b\| = \|A \hat{x} - b\| = \|b_2\|.$$

This proves part 1 of the theorem.

To prove part 2, let $\hat{x}$ be an arbitrary vector, and set $\hat{r} = A \hat{x} - b$. We need to show that $\hat{x}$ is a solution to (2.7) if and only if $A^* M \hat{r} = 0$. For any $x$,

$$\|A x - b\|^2 = \|(A \hat{x} - b) + (A x - A \hat{x})\|^2 = \|\hat{r} + A(x - \hat{x})\|^2,$$

which can be expanded as

$$\|A x - b\|^2 = \|\hat{r}\|^2 + \|A(x - \hat{x})\|^2 + 2\Re[\hat{r}^* M A (x - \hat{x})] = \|\hat{r}\|^2 + \|A(x - \hat{x})\|^2 + 2\Re[(A^* M \hat{r})^* (x - \hat{x})].$$

First, assume that $A^* M \hat{r} = 0$. Then

$$\|A x - b\|^2 = \|\hat{r}\|^2 + \|A(x - \hat{x})\|^2 \geq \|\hat{r}\|^2 = \|A \hat{x} - b\|^2,$$

implying that $\hat{x}$ is a solution to (2.7). Now assume that $\hat{x}$ is a solution to (2.7), but that $A^* M \hat{r} \neq 0$. Choose $x = \hat{x} + t (A^* M \hat{r})$, where $t$ is a real scalar. Then

$$\|A x - b\|^2 = \|\hat{r}\|^2 + H(t), \qquad H(t) = t^2 \|A A^* M \hat{r}\|^2 + 2 t \|A^* M \hat{r}\|_2^2.$$

By the fact that $\|A x - b\|^2 \geq \|\hat{r}\|^2$, we must have $H(t) \geq 0$ for all real $t$. However, by the assumption that $A^* M \hat{r} \neq 0$, we have $H(t) < 0$ for $t < 0$ and $|t|$ sufficiently close to zero. This contradicts the assumption that $\hat{x}$ is a solution to (2.7). Thus, $A^* M \hat{r} = 0$ must hold.

To prove part 3, let $x_1$ and $x_2$ be two different solutions to (2.7). Then, by (2.8) and by the fact that $\mathcal{N}(A^* M A) = \mathcal{N}(A)$,

$$A^* M A (x_1 - x_2) = 0 \quad\Longrightarrow\quad A (x_1 - x_2) = 0,$$

implying that $x_1 - x_2 \in \mathcal{N}(A)$.

Part 4 follows from (2.8) and from the fact that the matrix $A^* M A$ is positive definite, and hence nonsingular, when $\operatorname{rank}(A) = s \leq m$.


Numerical computation of $\hat{x}$

There are a few ways of solving (2.7) numerically when $A$ has full column rank, that is, $\operatorname{rank}(A) = s$. Here we present the solution that employs the weighted QR factorization of Theorem 2.1.

Let the weighted QR factorization of $A$ and of the $m \times (s+1)$ augmented matrix $A' = [A \mid b]$ be $A = QR$ and $A' = Q' R'$, respectively, as in Theorem 2.1. Then, whether $\operatorname{rank}(A') = s+1$ or not, that is, whether $b$ is in $\operatorname{span}\{a_1, \ldots, a_s\}$ or not, there exists a vector $q_{s+1}$, $\|q_{s+1}\| = 1$, and a vector $[t \mid t']^T \in \mathbb{C}^{s+1}$, $t \in \mathbb{C}^s$, such that$^{19}$

$$Q' = [\, Q \mid q_{s+1} \,], \qquad Q^* M q_{s+1} = 0, \qquad R' = \begin{bmatrix} R & t \\ 0^T & t' \end{bmatrix}. \tag{2.10}$$

Next,

$$Q^* M A' = [\, Q^* M A \mid Q^* M b \,] = [\, R \mid Q^* M b \,]$$

and also

$$Q^* M A' = Q^* M Q' R' = [\, Q^* M Q \mid Q^* M q_{s+1} \,] R' = [\, I \mid 0 \,] \begin{bmatrix} R & t \\ 0^T & t' \end{bmatrix} = [\, R \mid t \,],$$

which, when equated, give

$$Q^* M b = t.$$

Now,

$$A^* M A = R^* (Q^* M Q) R = R^* R. \tag{2.11}$$

Consequently, the normal equations (2.8) become $R^* R x = R^* Q^* M b = R^* t$. Multiplying both sides of this equation by $(R^*)^{-1}$, we obtain the $s \times s$ upper triangular linear system

$$R x = t, \tag{2.12}$$

and we obtain the solution $\hat{x}$ to (2.7) by solving (2.12) by back substitution. The matrix $R$ and the vector $t$ that are needed are provided by the QR factorization of the augmented matrix $A' = [A \mid b]$, namely, $A' = Q' R'$, with $Q'$ and $R'$ as in (2.10).

Now, in the problems that are of interest to us, we have $m \gg s$. Thus, by the fact that $R \in \mathbb{C}^{s \times s}$, the system in (2.12) is much smaller than that in (2.7) in these problems.
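The procedure just described, namely MGS on the augmented matrix followed by back substitution on $Rx = t$, can be sketched as follows. This is an illustration under assumptions made here for brevity (real data and $M = I$); the function name `lstsq_mgs` and the sample data are hypothetical.

```python
import math

def dot(y, z):
    """Euclidean inner product for real vectors (M = I is assumed here)."""
    return sum(yi * zi for yi, zi in zip(y, z))

def lstsq_mgs(cols, b):
    """Solve min ||A x - b||_2 for a full-column-rank A given by its columns.

    MGS is run on the augmented matrix [A | b]; the last column yields
    t = Q^T b, and R x = t is then solved by back substitution.
    """
    s = len(cols)
    Q, R = [], [[0.0] * s for _ in range(s)]
    for j in range(s):
        v = list(cols[j])
        for i in range(j):
            R[i][j] = dot(Q[i], v)
            v = [vk - R[i][j] * qk for vk, qk in zip(v, Q[i])]
        R[j][j] = math.sqrt(dot(v, v))
        Q.append([vk / R[j][j] for vk in v])
    # Treat b as column s+1: only the entries t_i = r_{i,s+1} are needed,
    # so b is never normalized and q_{s+1} is never formed.
    v, t = list(b), []
    for i in range(s):
        t.append(dot(Q[i], v))
        v = [vk - t[i] * qk for vk, qk in zip(v, Q[i])]
    # Back substitution for the upper triangular system R x = t.
    x = [0.0] * s
    for i in range(s - 1, -1, -1):
        acc = t[i] - sum(R[i][j] * x[j] for j in range(i + 1, s))
        x[i] = acc / R[i][i]
    return x

# Illustrative data (assumed): fit y = x0 + x1 * tau through three points.
cols = [[1.0, 1.0, 1.0], [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 2.0]
x = lstsq_mgs(cols, b)
```

For this data the normal equations are $3x_0 + 3x_1 = 5$ and $3x_0 + 5x_1 = 6$, so the computed `x` should be close to $[7/6,\ 1/2]$.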

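2.3.2 Solution of problems with one linear constraint

Lemma 2.4. Let $F \in \mathbb{C}^{s \times s}$ be Hermitian positive definite and let $g \in \mathbb{C}^s$, $g \neq 0$. Then the minimization problem

$$\min_{x \in \mathbb{C}^s} x^* F x \quad \text{subject to} \quad g^* x = 1 \tag{2.13}$$

has a unique solution $\hat{x}$ given as

$$\hat{x} = \lambda F^{-1} g, \qquad \lambda = \frac{1}{g^* F^{-1} g} > 0. \tag{2.14}$$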

19 If $b \notin \operatorname{span}\{a_1, \ldots, a_s\}$, we have that $\operatorname{rank}(A') = s+1$, and hence Theorem 2.1 applies to $A'$. Otherwise, we have $b = \sum_{i=1}^{s} r_{i,s+1} q_i$, implying that $r_{i,s+1} = (q_i, b)$, $1 \leq i \leq s$, and $r_{s+1,s+1} = 0$. In any case, only the $r_{i,s+1}$, $1 \leq i \leq s$, are needed to form the vector $t$, and they are computed by MGS exactly as the $r_{ij}$, $1 \leq i, j \leq s$; $r_{s+1,s+1}$ and the vector $q_{s+1}$ are not needed and hence do not have to be computed.


In addition,

$$\min_{\substack{x \in \mathbb{C}^s \\ g^* x = 1}} x^* F x = \hat{x}^* F \hat{x} = \lambda. \tag{2.15}$$

Proof. We prove the lemma by direct verification. First, $F^{-1}$ is Hermitian positive definite since $F$ is. Therefore, $\lambda > 0$. Next, it is easily seen that $\hat{x}$, as given in (2.14), satisfies the constraint $g^* \hat{x} = 1$. To show that $\hat{x}$ in (2.14) is the (unique) solution, we proceed as follows: For every $x \in \mathbb{C}^s$, we have

$$x^* F x = \hat{x}^* F \hat{x} + (x - \hat{x})^* F (x - \hat{x}) + 2\Re\big[\hat{x}^* F (x - \hat{x})\big].$$

Next, by the fact that $F$ is Hermitian and by (2.14), for all $x$ satisfying $g^* x = 1$,

$$\hat{x}^* F (x - \hat{x}) = (F \hat{x})^* (x - \hat{x}) = \lambda g^* (x - \hat{x}) = \lambda (g^* x - g^* \hat{x}) = 0,$$

on account of which we have

$$x^* F x = \hat{x}^* F \hat{x} + (x - \hat{x})^* F (x - \hat{x}) \geq \hat{x}^* F \hat{x} \qquad \text{for every } x \in \mathbb{C}^s \text{ such that } g^* x = 1.$$

Therefore, $\hat{x}$ is a solution of (2.13). By the fact that $F$ is positive definite, equality holds here if and only if $x = \hat{x}$. Therefore, $\hat{x}$ is unique. Finally, (2.15) follows from (2.14).

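Remark: The actual computation of $\hat{x}$ in Lemma 2.4 can be carried out as follows: First, solve the linear system $F h = g$ for $h$. Next, compute $\lambda$ via $\lambda = 1/(g^* h)$. Finally, set $\hat{x} = \lambda h$.

The following theorem is actually a corollary of Lemma 2.4.

Theorem 2.5. The least-squares problem

$$\min_{x \in \mathbb{C}^s} \|A x\| \quad \text{subject to} \quad g^* x = 1, \qquad A \in \mathbb{C}^{m \times s}, \quad m \geq s, \quad \operatorname{rank}(A) = s, \quad g \in \mathbb{C}^s, \tag{2.16}$$

has a unique solution $\hat{x}$ that is given by

$$\hat{x} = \frac{(A^* M A)^{-1} g}{g^* (A^* M A)^{-1} g}. \tag{2.17}$$

In addition,

$$\min_{\substack{x \in \mathbb{C}^s \\ g^* x = 1}} \|A x\| = \|A \hat{x}\| = \sqrt{\lambda}, \qquad \lambda = \frac{1}{g^* (A^* M A)^{-1} g}. \tag{2.18}$$

Proof. The proof can be achieved by noting that $\|A x\|^2 = (A x)^* M (A x) = x^* (A^* M A) x$ and applying Lemma 2.4 with $F = A^* M A \in \mathbb{C}^{s \times s}$, which is Hermitian positive definite due to the assumption that $\operatorname{rank}(A) = s$.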


Numerical computation of $\hat{x}$

We can now solve the problem in (2.16) as follows: Invoking the QR factorization of $A$, namely, $A = QR$, we have $A^* M A = R^* R$, as before. Therefore, $\|A x\|^2 = x^* (R^* R) x$. Now apply Lemma 2.4 with $F = R^* R$, everything else remaining the same. That is, first, obtain $h$ as the solution of $R^* R h = g$ and compute $\lambda = 1/(g^* h) > 0$. Next, set $\hat{x} = \lambda h$. The solution of $R^* R h = g$ entails the solution of two triangular linear systems, namely, of $R^* p = g$ for $p$, and of $R h = p$ for $h$. Finally, $\min_{x \in \mathbb{C}^s,\, g^* x = 1} \|A x\| = \|A \hat{x}\| = \sqrt{\lambda}$.
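The two triangular solves and the scaling by $\lambda$ can be written out as below. This is a sketch only, assuming real arithmetic (so that $R^* = R^T$) and a triangular factor $R$ that is already available; the function name `constrained_min` and the $2 \times 2$ data are illustrative.

```python
def constrained_min(R, g):
    """Minimize ||A x|| subject to g^T x = 1, given the factor R of A = Q R.

    Real arithmetic is assumed, so R^* = R^T. Returns (x, lam) with
    min ||A x|| = sqrt(lam).
    """
    s = len(R)
    # Forward substitution: R^T p = g (R^T is lower triangular).
    p = [0.0] * s
    for i in range(s):
        acc = g[i] - sum(R[j][i] * p[j] for j in range(i))
        p[i] = acc / R[i][i]
    # Back substitution: R h = p.
    h = [0.0] * s
    for i in range(s - 1, -1, -1):
        acc = p[i] - sum(R[i][j] * h[j] for j in range(i + 1, s))
        h[i] = acc / R[i][i]
    lam = 1.0 / sum(gi * hi for gi, hi in zip(g, h))
    return [lam * hi for hi in h], lam

# Illustrative data (assumed): an upper triangular R and the constraint x0 + x1 = 1.
R = [[2.0, 1.0], [0.0, 1.0]]
g = [1.0, 1.0]
x, lam = constrained_min(R, g)
```

With this data, $R^T R = \begin{bmatrix} 4 & 2 \\ 2 & 2 \end{bmatrix}$; substituting $x = [a,\ 1-a]^T$ gives $x^T R^T R x = 2a^2 + 2$, so the minimizer is $x = [0,\ 1]^T$ with $\lambda = 2$, which is what the sketch should return.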

2.4 Algorithms for MPE and RRE

2.4.1 General preliminaries

We now go back to Section 2.1 and turn to the actual implementations of MPE and RRE. The algorithms for MPE and RRE that we present in this section were first given in Sidi [266] with $M = I$. This paper also contains a well-documented FORTRAN 77 code that unifies the implementations of these algorithms. This code is reproduced in Appendix H. The treatment with arbitrary $M$ is new, however, and appeared in Sidi [292].

We start by recalling Subsection 1.3.3; in particular, we recall equations (1.35)–(1.40). Invoking $\sum_{i=0}^{k} \gamma_i = 1$, we can write

$$\sum_{i=0}^{k} \gamma_i x_{n+i} = x_n + \sum_{j=0}^{k-1} \xi_j u_{n+j} = x_n + U_{k-1} \xi, \qquad \xi = [\xi_0, \xi_1, \ldots, \xi_{k-1}]^T \in \mathbb{C}^k, \tag{2.19}$$

where

$$\xi_j = \sum_{i=j+1}^{k} \gamma_i = 1 - \sum_{i=0}^{j} \gamma_i, \qquad j = 0, 1, \ldots, k-1. \tag{2.20}$$
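The identity (2.19) with the $\xi_j$ of (2.20) is easy to check numerically. The sketch below uses scalar "vectors" and made-up values of $\gamma_i$ that sum to 1; the helper name `xi_from_gamma` is an assumption of this illustration.

```python
def xi_from_gamma(gamma):
    """xi_j = sum_{i=j+1}^{k} gamma_i, j = 0..k-1, via a backward running sum."""
    k = len(gamma) - 1
    xi = [0.0] * k
    tail = 0.0
    for j in range(k - 1, -1, -1):
        tail += gamma[j + 1]
        xi[j] = tail
    return xi

# Illustrative data (assumed): n = 0, k = 2, scalar iterates, weights summing to 1.
x = [1.0, 1.5, 1.75]
gamma = [0.25, -0.5, 1.25]
xi = xi_from_gamma(gamma)                      # [0.75, 1.25]

u = [x[i + 1] - x[i] for i in range(2)]        # u_0, u_1
s_direct = sum(g * xv for g, xv in zip(gamma, x))
s_via_xi = x[0] + sum(c * uv for c, uv in zip(xi, u))
```

Both `s_direct` and `s_via_xi` evaluate to 1.6875 here, and `xi[0]` equals $1 - \gamma_0 = 0.75$, in agreement with the second form in (2.20).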

In our algorithms below, we first compute the $\gamma_i$, and then we use these to compute the $\xi_i$ via (2.20), which are then used to compute $s_{n,k}$ via (2.19). To obtain the $\gamma_i$, we solve the relevant least-squares problems via weighted QR factorization of the matrices $U_j$, assuming that these matrices are of full rank for $j \leq k$, for some integer $k > 0$, that is, $\operatorname{rank}(U_j) = j + 1$, $j \leq k$. Thus,

$$U_j = Q_j R_j, \qquad j = 0, 1, \ldots, \qquad Q_j \in \mathbb{C}^{N \times (j+1)}, \qquad R_j \in \mathbb{C}^{(j+1) \times (j+1)}, \tag{2.21}$$

where

$$Q_j = [\, q_0 \mid q_1 \mid \cdots \mid q_j \,], \qquad R_j = \begin{bmatrix} r_{00} & r_{01} & \cdots & r_{0j} \\ & r_{11} & \cdots & r_{1j} \\ & & \ddots & \vdots \\ & & & r_{jj} \end{bmatrix}, \tag{2.22}$$

$$q_i^* M q_j = \delta_{ij} \quad \forall\, i, j, \qquad r_{ij} = q_i^* M u_{n+j} \quad \forall\, i \leq j, \qquad r_{ii} > 0 \quad \forall\, i. \tag{2.23}$$

We carry out the weighted QR factorizations by using MGS. For completeness, we reproduce here the steps of this process as applied to the matrix $U_k$:

1. Compute $r_{00} = \|u_n\|$ and $q_0 = u_n / r_{00}$.
2. For $j = 1, \ldots, k$ do
     Set $u_j^{(0)} = u_{n+j}$
     For $i = 0, 1, \ldots, j-1$ do
       $r_{ij} = (q_i, u_j^{(i)})$ and $u_j^{(i+1)} = u_j^{(i)} - r_{ij} q_i$
     end do ($i$)
     Compute $r_{jj} = \|u_j^{(j)}\|$ and $q_j = u_j^{(j)} / r_{jj}$.
   end do ($j$)

Note that the matrices $Q_j$ and $R_j$ are obtained from $Q_{j-1}$ and $R_{j-1}$, respectively, as follows:

$$Q_j = [\, Q_{j-1} \mid q_j \,], \qquad R_j = \begin{bmatrix} R_{j-1} & \begin{matrix} r_{0j} \\ \vdots \\ r_{j-1,j} \end{matrix} \\ 0 \ \cdots \ 0 & r_{jj} \end{bmatrix}. \tag{2.24}$$

As mentioned earlier, $x_n$ needs to be saved first. Then, $u_{n+j}$ overwrites $x_{n+j}$, $u_j^{(0)}$ overwrites $u_{n+j}$, $u_j^{(i+1)}$ overwrites $u_j^{(i)}$, and $q_j$ overwrites $u_j^{(j)}$. At the end of the process, $Q_k$ replaces $U_k$. Thus, throughout the process, we need memory locations for $k + 2$ vectors only, in which $x_n$ and $q_0, q_1, \ldots, q_k$ are stored at the end.

After we have computed the $\gamma_i$ (solving the appropriate least-squares problems using the QR factorizations of the $U_j$, as will be explained shortly), we compute the $\xi_i$ via (2.20). With the QR factorizations and the $\xi_i$ available, we finally compute $s_{n,k} = \sum_{i=0}^{k} \gamma_i x_{n+i}$, starting with (2.19), as

$$s_{n,k} = x_n + U_{k-1} \xi = x_n + Q_{k-1} R_{k-1} \xi. \tag{2.25}$$

To implement this economically, we first compute

$$\eta = R_{k-1} \xi, \qquad \eta = [\eta_0, \eta_1, \ldots, \eta_{k-1}]^T. \tag{2.26}$$

Following that, we compute $s_{n,k}$ as

$$s_{n,k} = x_n + Q_{k-1} \eta = x_n + \sum_{i=0}^{k-1} \eta_i q_i. \tag{2.27}$$

Note that to compute $s_{n,k}$ via (2.19), we need to save either the $x_{n+j}$ or the $u_{n+j}$. Here we have seen that $s_{n,k}$ can be computed using the QR factorization of $U_{k-1}$, without the need to save either the $x_{n+j}$ or the $u_{n+j}$.

2.4.2 Computation of $\gamma_i$ for MPE

Since $c'$ is the solution to the least-squares problem in (2.4), we can determine $c'$ via the solution of the linear system [cf. (2.7) and (2.12)]

$$R_{k-1} c' = -Q_{k-1}^* M u_{n+k} \equiv -\rho_k. \tag{2.28}$$

By the discussion following the proof of Theorem 2.3, it is easy to see that $\rho_k = [r_{0k}, r_{1k}, \ldots, r_{k-1,k}]^T$ and is already available as part of the last column of $R_k$. Thus, $c'$ is obtained by solving the upper triangular system

$$R_{k-1} c' = -\rho_k, \qquad \rho_k = [r_{0k}, r_{1k}, \ldots, r_{k-1,k}]^T. \tag{2.29}$$

With $c'$ available, we let $c_k = 1$ and compute the $\gamma_i$ as in (2.5).


2.4.3 Computation of $\gamma_i$ for RRE

Since $\gamma$ is the solution to the constrained least-squares problem in (2.6), it can be obtained as explained following the proof of Theorem 2.5. Realizing that the constraint $\sum_{i=0}^{k} \gamma_i = 1$ can be expressed in the form $\hat{e}^T \gamma = 1$, where $\hat{e} = [1, \ldots, 1]^T$, we first solve

$$R_k^* R_k h = \hat{e}, \qquad h = [h_0, h_1, \ldots, h_k]^T. \tag{2.30}$$

This amounts to solving two triangular systems of linear equations, namely, $R_k^* p = \hat{e}$ for $p$ and $R_k h = p$ for $h$. With $h$ available, we compute

$$\lambda = \frac{1}{\hat{e}^T h} = \frac{1}{\sum_{i=0}^{k} h_i}. \tag{2.31}$$

We already know that $\lambda$ is real and positive. Finally, we let

$$\gamma = \lambda h, \qquad \gamma_i = \lambda h_i, \qquad i = 0, 1, \ldots, k. \tag{2.32}$$

2.4.4 Unification and summary of algorithms

We can now combine all of the above in the following algorithm that implements both MPE and RRE:

1. Choose integers $n \geq 0$ and $k \geq 1$. Input the vectors $x_n, x_{n+1}, \ldots, x_{n+k+1}$.

2. Compute the vectors $u_{n+i} = \Delta x_{n+i}$, $i = 0, 1, \ldots, k$; form the $N \times (k+1)$ matrix $U_k = [\, u_n \mid u_{n+1} \mid \cdots \mid u_{n+k} \,]$; and form its weighted QR factorization, namely, $U_k = Q_k R_k$, with $Q_k$ and $R_k$ as in (2.22), using MGS. (Here we are assuming that $U_k$ has full rank.)

3. Determine the $\gamma_i$:

   • For MPE: With $\rho_k = [r_{0k}, r_{1k}, \ldots, r_{k-1,k}]^T$, solve the $k \times k$ upper triangular system

     $$R_{k-1} c' = -\rho_k, \qquad c' = [c_0, c_1, \ldots, c_{k-1}]^T.$$

     Set $c_k = 1$ and $\gamma_i = c_i / \sum_{j=0}^{k} c_j$, $i = 0, 1, \ldots, k$, provided $\sum_{j=0}^{k} c_j \neq 0$.

   • For RRE: With $\hat{e} = [1, 1, \ldots, 1]^T$, solve the $(k+1) \times (k+1)$ linear system

     $$R_k^* R_k h = \hat{e}; \qquad h = [h_0, h_1, \ldots, h_k]^T.$$

     (This amounts to solving two triangular systems: first $R_k^* p = \hat{e}$ for $p$ and, following that, $R_k h = p$ for $h$.) Next, compute $\lambda = 1 / \sum_{i=0}^{k} h_i$. (Note that $\lambda$ is always real and positive because $U_k$ has full rank.) Next, set $\gamma = \lambda h$, that is, $\gamma_i = \lambda h_i$, $i = 0, 1, \ldots, k$.

4. With the $\gamma_i$ determined, compute $\xi = [\xi_0, \xi_1, \ldots, \xi_{k-1}]^T$ via (2.20) or via

   $$\xi_{k-1} = \gamma_k, \qquad \xi_j = \xi_{j+1} + \gamma_{j+1}, \qquad j = k-2, k-3, \ldots, 1, 0.$$


5. With the $\xi_i$ determined, compute $\eta = [\eta_0, \eta_1, \ldots, \eta_{k-1}]^T = R_{k-1} \xi$. Then compute

   $$s_{n,k} = x_n + Q_{k-1} \eta = x_n + \sum_{i=0}^{k-1} \eta_i q_i.$$

For convenience, we also present the unified algorithm, with $n = 0$ for simplicity, in Table 2.1.

Remark: The linear systems that we solve in step 3 are very small compared to the dimension $N$ of the vectors $x_m$, so their cost is negligible.
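Steps 1 through 5 above can be sketched in a few dozen lines. The code below is an illustration only, not the FORTRAN 77 code of Appendix H: it assumes real arithmetic, $M = I$, $n = 0$, and full rank of $U_k$, and the names `extrapolate`, `back_sub`, and `forward_sub_T`, as well as the test iteration, are choices made here.

```python
import math

def dot(y, z):
    return sum(a * b for a, b in zip(y, z))

def back_sub(R, rhs, n):
    """Solve the leading n x n upper triangular system R x = rhs."""
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (rhs[i] - sum(R[i][j] * x[j] for j in range(i + 1, n))) / R[i][i]
    return x

def forward_sub_T(R, rhs, n):
    """Solve R^T p = rhs for the leading n x n block (R^T is lower triangular)."""
    p = [0.0] * n
    for i in range(n):
        p[i] = (rhs[i] - sum(R[j][i] * p[j] for j in range(i))) / R[i][i]
    return p

def extrapolate(xs, method="MPE"):
    """Return (s, gamma) from iterates xs = [x_0, ..., x_{k+1}] (n = 0, M = I)."""
    k = len(xs) - 2
    # Step 2: differences u_i and QR factorization of U_k by MGS.
    U = [[b - a for a, b in zip(xs[i], xs[i + 1])] for i in range(k + 1)]
    Q, R = [], [[0.0] * (k + 1) for _ in range(k + 1)]
    for j in range(k + 1):
        v = list(U[j])
        for i in range(j):
            R[i][j] = dot(Q[i], v)
            v = [vk - R[i][j] * qk for vk, qk in zip(v, Q[i])]
        R[j][j] = math.sqrt(dot(v, v))
        Q.append([vk / R[j][j] for vk in v])
    # Step 3: the gamma_i.
    if method == "MPE":
        c = back_sub(R, [-R[i][k] for i in range(k)], k) + [1.0]
        total = sum(c)                      # assumed nonzero
        gamma = [ci / total for ci in c]
    else:                                   # RRE
        p = forward_sub_T(R, [1.0] * (k + 1), k + 1)
        h = back_sub(R, p, k + 1)
        lam = 1.0 / sum(h)
        gamma = [lam * hi for hi in h]
    # Step 4: xi_{k-1} = gamma_k, xi_j = xi_{j+1} + gamma_{j+1}.
    xi = [0.0] * k
    xi[k - 1] = gamma[k]
    for j in range(k - 2, -1, -1):
        xi[j] = xi[j + 1] + gamma[j + 1]
    # Step 5: eta = R_{k-1} xi, then s = x_0 + sum_i eta_i q_i.
    eta = [sum(R[i][j] * xi[j] for j in range(i, k)) for i in range(k)]
    s = list(xs[0])
    for i in range(k):
        s = [sk + eta[i] * qk for sk, qk in zip(s, Q[i])]
    return s, gamma

# Illustrative linear iteration x_{m+1} = T x_m + d with diagonal T (assumed data).
t = [0.5, 0.4, 0.3, 0.2]
d = [1.0, 1.0, 1.0, 1.0]
xs = [[0.0] * 4]
for _ in range(3):                          # k = 2 needs x_0, ..., x_3
    xs.append([ti * xv + di for ti, xv, di in zip(t, xs[-1], d)])
s_mpe, g_mpe = extrapolate(xs, "MPE")
s_rre, g_rre = extrapolate(xs, "RRE")
```

In both cases the returned `gamma` should sum to 1 and `s` should agree with $\sum_i \gamma_i x_i$ up to rounding, although `s` is formed from $x_0$, $R_{k-1}$, and the $q_i$ only. A production code would overwrite the $x_{n+i}$ in place as described in Subsection 2.4.1; the sketch keeps separate lists for readability.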

2.4.5 Operation counts and computer storage requirements

In the problems we are interested in, $N$, the dimension of the vectors $x_m$, is extremely large, while $k$ takes on very small values. Consequently, the major part of the computational effort is spent in handling the large vectors, the rest being negligible. The terminology used for vector operations is as follows:

    vector addition:                  $x + y$,
    scalar-vector multiplication:     $\alpha x$,
    (weighted) dot product:           $x^* M y$ (weighted inner product),
    saxpy (scalar alpha x plus y):    $\alpha x + y$.

As we can easily see, most of the vector computations take place in the QR factorization. At the $j$th stage of this process (by MGS), the vector $x_{n+j+1}$ is introduced first. Starting with this, we need one vector addition to compute $u_{n+j} = x_{n+j+1} - x_{n+j}$, and following that, $j$ saxpy operations to compute the $u_j^{(i)}$, $1 \leq i \leq j$; one scalar-vector multiplication to compute $q_j$; and $j + 1$ dot products to compute the $r_{ij}$, $0 \leq i \leq j$. The total numbers of the various operations to compute $Q_k$ and $R_k$ are thus as follows:

    vector additions:                 $k + 1$,
    scalar-vector multiplications:    $k + 1$,
    (weighted) dot products:          $(k^2 + 3k + 2)/2$,
    saxpy (scalar alpha x plus y):    $(k^2 + k)/2$.

Finally, when forming $s_{n,k}$ via (2.27), we perform $k$ saxpy operations. It is clear that most of the overhead is caused by the computation of the weighted inner products.

As for the storage requirements, it is easy to see that we need only $(k+2)N$ memory locations throughout the whole process. These are used to store first $x_{n+i}$, $0 \leq i \leq k + 1$; next $x_n$ and $u_{n+i}$, $0 \leq i \leq k$ (that is, the matrix $U_k$); and next $x_n$ and $q_i$, $0 \leq i \leq k$ (that is, the matrix $Q_k$). At the end of the process, we overwrite $x_n$ with $s_{n,k}$.
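The totals quoted above follow from summing the per-stage costs over $j = 0, 1, \ldots, k$. As a quick check, the following sketch tallies them stage by stage; the function name and the dictionary keys are illustrative.

```python
def mgs_operation_counts(k):
    """Tally vector operations when MGS builds Q_k and R_k stage by stage."""
    counts = {"add": 0, "scale": 0, "dot": 0, "saxpy": 0}
    for j in range(k + 1):
        counts["add"] += 1        # u_{n+j} = x_{n+j+1} - x_{n+j}
        counts["saxpy"] += j      # u_j^{(i+1)} = u_j^{(i)} - r_ij q_i, i < j
        counts["dot"] += j + 1    # r_ij for 0 <= i <= j (r_jj needs one product)
        counts["scale"] += 1      # q_j = u_j^{(j)} / r_jj
    return counts

counts = mgs_operation_counts(10)
```

For $k = 10$ this gives 11 vector additions, 11 scalar-vector multiplications, $(100 + 30 + 2)/2 = 66$ dot products, and $(100 + 10)/2 = 55$ saxpy operations.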

Table 2.1. Unified algorithm for implementing MPE and RRE.

Step 0. Input: The Hermitian positive definite matrix $M \in \mathbb{C}^{N \times N}$, the integer $k$, and the vectors $x_0, x_1, \ldots, x_{k+1}$.

Step 1. Compute $u_i = \Delta x_i = x_{i+1} - x_i$, $i = 0, 1, \ldots, k$.
        Set $U_j = [\, u_0 \mid u_1 \mid \cdots \mid u_j \,] \in \mathbb{C}^{N \times (j+1)}$, $j = 0, 1, \ldots$.
        Compute the weighted QR factorization of $U_k$, namely, $U_k = Q_k R_k$; $Q_k = [\, q_0 \mid q_1 \mid \cdots \mid q_k \,]$ unitary in the sense $Q_k^* M Q_k = I_{k+1}$, and $R_k = [r_{ij}]_{0 \leq i, j \leq k}$ upper triangular, $r_{ij} = q_i^* M u_j$.
        ($U_{k-1} = Q_{k-1} R_{k-1}$ is contained in $U_k = Q_k R_k$.)

Step 2. Computation of $\gamma_k = [\gamma_0, \gamma_1, \ldots, \gamma_k]^T$:
        For MPE:
          Solve the (upper triangular) linear system
          $R_{k-1} c' = -\rho_k$, $\rho_k = [r_{0k}, r_{1k}, \ldots, r_{k-1,k}]^T$, $c' = [c_0, c_1, \ldots, c_{k-1}]^T$.
          (Note that $\rho_k = Q_{k-1}^* M u_k$.)
          Set $c_k = 1$ and compute $\alpha = \sum_{i=0}^{k} c_i$.
          Set $\gamma_k = c/\alpha$; that is, $\gamma_i = c_i/\alpha$, $i = 0, 1, \ldots, k$, provided $\alpha \neq 0$.
        For RRE:
          Solve the linear system
          $R_k^* R_k h = \hat{e}_k$, $h = [h_0, h_1, \ldots, h_k]^T$, $\hat{e}_k = [1, 1, \ldots, 1]^T \in \mathbb{C}^{k+1}$.
          [This amounts to solving two triangular (lower and upper) systems.]
          Set $\lambda = \big(\sum_{i=0}^{k} h_i\big)^{-1}$. (Note that $\lambda$ is real and positive.)
          Set $\gamma_k = \lambda h$; that is, $\gamma_i = \lambda h_i$, $i = 0, 1, \ldots, k$.

Step 3. Compute $\xi_k = [\xi_0, \xi_1, \ldots, \xi_{k-1}]^T$ by
        $\xi_0 = 1 - \gamma_0$, $\xi_j = \xi_{j-1} - \gamma_j$, $j = 1, \ldots, k-1$.
        Compute $s_k^{\mathrm{MPE}}$ and $s_k^{\mathrm{RRE}}$ via
        $s_k = x_0 + Q_{k-1} R_{k-1} \xi_k = x_0 + Q_{k-1} \eta$.
        [For this, first compute $\eta = R_{k-1} \xi_k$, $\eta = [\eta_0, \eta_1, \ldots, \eta_{k-1}]^T$. Next, set $s_k = x_0 + \sum_{i=0}^{k-1} \eta_i q_i$.]

2.5 Error estimation via algorithms

We now turn to the assessment of the quality of $s_{n,k}$ as an approximation to $s$, the solution to the system $x = f(x)$, when the sequence $\{x_m\}$ is generated via the iterative scheme $x_{m+1} = f(x_m)$. All we know to compute is $f(x)$ for a given $x$, and we must use this knowledge for our assessment. One common way of assessing the quality of a given vector $x$ as an approximation to $s$ is by looking at some norm of the associated residual vector $r(x) = f(x) - x$, since $\lim_{x \to s} r(x) = r(s) = 0$. [We will give another justification of this assertion in the following subsection when dealing with linear $f(x)$.] Note first that the residual vectors for the individual vectors $x_m$ are obtained at no cost because

$$r(x_m) = f(x_m) - x_m = x_{m+1} - x_m = u_m, \qquad \text{whether } f \text{ is linear or nonlinear.} \tag{2.33}$$

Actually, what is of interest is the size of the ratio $\|r(s_{n,k})\| / \|r(x_0)\|$, namely, the order of magnitude of $\|r(s_{n,k})\|$ relative to that of $\|r(x_0)\|$. This ratio shows by how many orders of magnitude the norm of the initial residual $r(x_0)$ has been reduced in the course of extrapolation. It also indicates to some extent the order of magnitude of $\|s_{n,k} - s\|$ relative to that of $\|x_0 - s\|$. The following developments were first given in Sidi [266, Section 5].

2.5.1 Exact residual vector for linear sequences

When the iteration vectors $x_m$ are generated linearly via $x_{m+1} = T x_m + d$, we have $r(x) = (T x + d) - x$; hence $r(s_{n,k}) = T s_{n,k} + d - s_{n,k}$. Invoking $s_{n,k} = \sum_{i=0}^{k} \gamma_i x_{n+i}$ with $\sum_{i=0}^{k} \gamma_i = 1$, we therefore have

$$r(s_{n,k}) = \sum_{i=0}^{k} \gamma_i u_{n+i} = U_k \gamma; \qquad \text{hence} \qquad \|r(s_{n,k})\| = \|U_k \gamma\|. \tag{2.34}$$

Recall now that $u_m = (T - I)\varepsilon_m$, where $\varepsilon_m = x_m - s$. Therefore,

$$r(s_{n,k}) = \sum_{i=0}^{k} \gamma_i u_{n+i} = (T - I) \sum_{i=0}^{k} \gamma_i \varepsilon_{n+i} = (T - I)(s_{n,k} - s),$$

and by the fact that $T - I$ is nonsingular, $\|r(s_{n,k})\| = \|(T - I)(s_{n,k} - s)\|$ is a true norm for $s_{n,k} - s$. Note also that

$$s_{n,k} - s = (T - I)^{-1} r(s_{n,k}), \qquad \text{so} \qquad \|s_{n,k} - s\| \leq \|(T - I)^{-1}\| \, \|r(s_{n,k})\|,$$

and all this provides additional justification for looking at $\|r(s_{n,k})\|$ to assess the quality of $\|s_{n,k} - s\|$ versus $\|x_0 - s\|$.
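The identity $r(s_{n,k}) = U_k \gamma$ in (2.34) uses only the linearity of the iteration and $\sum_i \gamma_i = 1$; it does not depend on how the $\gamma_i$ were obtained. A small numerical check, with a made-up $2 \times 2$ matrix $T$, vector $d$, and weights (all assumptions of this sketch), is as follows.

```python
def matvec(T, x):
    return [sum(tij * xj for tij, xj in zip(row, x)) for row in T]

# Illustrative data (assumed): a small contractive linear iteration.
T = [[0.5, 0.1], [0.2, 0.3]]
d = [1.0, -1.0]

def f(x):
    """One step of the linear iteration x -> T x + d."""
    return [y + di for y, di in zip(matvec(T, x), d)]

xs = [[0.0, 0.0]]
for _ in range(3):                       # x_0, ..., x_3 (n = 0, k = 2)
    xs.append(f(xs[-1]))

gamma = [0.25, -0.75, 1.5]               # any weights with sum 1 will do
s = [sum(g * x[r] for g, x in zip(gamma, xs)) for r in range(2)]

# Residual of s computed from its definition r(s) = f(s) - s ...
r_s = [fs - sv for fs, sv in zip(f(s), s)]
# ... and the vector U_k gamma built from the differences u_i = x_{i+1} - x_i.
U_gamma = [sum(gamma[i] * (xs[i + 1][r] - xs[i][r]) for i in range(3))
           for r in range(2)]
```

The two vectors `r_s` and `U_gamma` should agree to rounding error. If the weights did not sum to 1, they would differ by $(\sum_i \gamma_i - 1)\,d$.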

2.5.2 Approximate residual vector for nonlinear sequences

In Section 0.3, we concluded that, when the $x_m$ are generated via the convergent fixed-point iterative method $x_{m+1} = f(x_m)$, where $f(x)$ is nonlinear, the sequence $\{x_m\}$ behaves as if it were being generated linearly for all sufficiently large $m$. Thus, when $s_{n,k}$ is sufficiently close to $s$, we will have, analogous to the linear case,

$$r(s_{n,k}) = f(s_{n,k}) - s_{n,k} \approx U_k \gamma; \qquad \text{hence} \qquad \|r(s_{n,k})\| \approx \|U_k \gamma\|. \tag{2.35}$$

We now show the validity of this last assertion. We start with the fact that, for $x$ sufficiently close to the solution $s$, and because $f(s) = s$,

$$r(x) = f(x) - x = [f(s) + F(s)(x - s) + O(\|x - s\|^2)] - x = [F(s) - I](x - s) + O(\|x - s\|^2),$$


where $F(x)$ is the Jacobian matrix of $f$ evaluated at $x$. Thus, to first order in $x - s$, we have

$$r(x) \approx [F(s) - I](x - s). \tag{2.36}$$

By this and by (2.33),

$$u_m = r(x_m) \approx [F(s) - I](x_m - s) \tag{2.37}$$

to first order in $x_m - s$, provided $x_m$ is sufficiently close to the solution $s$. Next, provided $s_{n,k} = \sum_{i=0}^{k} \gamma_i x_{n+i}$ is sufficiently close to $s$, we also have

$$\begin{aligned}
r(s_{n,k}) &\approx [F(s) - I](s_{n,k} - s) && \text{by (2.36)} \\
&= [F(s) - I] \sum_{i=0}^{k} \gamma_i (x_{n+i} - s) && \text{by } \textstyle\sum_{i=0}^{k} \gamma_i = 1 \\
&= \sum_{i=0}^{k} \gamma_i [F(s) - I](x_{n+i} - s) \\
&\approx \sum_{i=0}^{k} \gamma_i u_{n+i} && \text{by (2.37)} \\
&= U_k \gamma.
\end{aligned}$$

Remark: As already mentioned, that the vector $U_k \gamma$ is (i) the actual residual vector $r(s_{n,k})$ for $s_{n,k}$ from linear systems and (ii) a true approximate residual vector for $s_{n,k}$ from nonlinear systems was proved originally in Sidi [266]. $U_k \gamma$ was adopted in a subsequent paper by Jbilou and Sadok [149] and named the “generalized residual” there. Despite sounding interesting, this name has no meaning and is misleading. By expanding $f(s_{n,k})$ about the solution $s$ and retaining first-order terms only, we have shown that $U_k \gamma$ is actually a genuine approximation to the residual vector $r(s_{n,k})$ for $s_{n,k}$ from nonlinear systems when $s_{n,k}$ is close to the solution $s$. Thus, there is nothing “generalized” about $U_k \gamma$.

2.5.3 Computation of $\|U_k \gamma\|$ via algorithms

Whether the vectors $x_m$ are generated linearly or nonlinearly, $\|U_k \gamma\|$ can be determined in terms of already computed quantities and at no cost, without actually having to compute $U_k \gamma$. Clearly, this computation of $\|U_k \gamma\|$ can be made a part of the algorithms. Indeed, we have the following result from [292], which generalizes that proved in [266] for the case $M = I$.

Theorem 2.6. $\|U_k \gamma\|$ is given as

$$\|U_k \gamma\| = \begin{cases} r_{kk} |\gamma_k| & \text{for MPE,} \\ \sqrt{\lambda} & \text{for RRE.} \end{cases} \tag{2.38}$$

Here, $r_{kk}$ is the last diagonal element of the matrix $R_k$ in (2.22) and $\lambda$ is the parameter computed in step 3 of the algorithms in Subsection 2.4.4.

Proof. First, by $U_k = Q_k R_k$ and by Lemma 2.2, we have

$$\|U_k \gamma\| = \|Q_k R_k \gamma\| = \|R_k \gamma\|_2.$$


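For MPE, we start with

$$R_k \gamma = \frac{1}{\sum_{i=0}^{k} c_i} \begin{bmatrix} R_{k-1} & \rho_k \\ 0^T & r_{kk} \end{bmatrix} \begin{bmatrix} c' \\ 1 \end{bmatrix} = \frac{1}{\sum_{i=0}^{k} c_i} \begin{bmatrix} R_{k-1} c' + \rho_k \\ r_{kk} \end{bmatrix} = \frac{r_{kk}}{\sum_{i=0}^{k} c_i} \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \tag{2.39}$$

where we have invoked $R_{k-1} c' = -\rho_k$. Finally, recalling that $c_k = 1$ and $\gamma_k = c_k / \sum_{i=0}^{k} c_i$ for MPE, (2.39) becomes

$$R_k \gamma = r_{kk} \gamma_k \begin{bmatrix} 0 \\ 1 \end{bmatrix}; \qquad \text{hence} \qquad \|R_k \gamma\|_2 = r_{kk} |\gamma_k|.$$

For RRE, we recall that the vector $\gamma$ relevant to this method is the solution to the constrained minimization problem in (2.6), to which Theorem 2.5 applies with $A = U_k$ and $g = \hat{e}$. Consequently, $\|U_k \gamma\| = \sqrt{\lambda}$ by (2.18).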
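Theorem 2.6 can be verified numerically by comparing the two closed-form estimates with $\|U_k \gamma\|$ computed directly. The following is a sketch under the same simplifying assumptions as before (real arithmetic, $M = I$, full-rank $U_k$); the function names and the $4 \times 3$ matrix are illustrative.

```python
import math

def dot(y, z):
    return sum(a * b for a, b in zip(y, z))

def mgs_R(U):
    """Return the triangular factor R of U = QR (U given by columns), M = I."""
    m = len(U)
    Q, R = [], [[0.0] * m for _ in range(m)]
    for j in range(m):
        v = list(U[j])
        for i in range(j):
            R[i][j] = dot(Q[i], v)
            v = [a - R[i][j] * b for a, b in zip(v, Q[i])]
        R[j][j] = math.sqrt(dot(v, v))
        Q.append([a / R[j][j] for a in v])
    return R

def back_sub(R, rhs, n):
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (rhs[i] - sum(R[i][j] * x[j] for j in range(i + 1, n))) / R[i][i]
    return x

def forward_sub_T(R, rhs, n):
    p = [0.0] * n
    for i in range(n):
        p[i] = (rhs[i] - sum(R[j][i] * p[j] for j in range(i))) / R[i][i]
    return p

def residual_estimates(U):
    """Return ((gamma_MPE, est_MPE), (gamma_RRE, est_RRE)) for U = U_k."""
    k = len(U) - 1
    R = mgs_R(U)
    c = back_sub(R, [-R[i][k] for i in range(k)], k) + [1.0]
    total = sum(c)                                  # assumed nonzero
    g_mpe = [ci / total for ci in c]
    est_mpe = R[k][k] * abs(g_mpe[k])               # r_kk |gamma_k|
    h = back_sub(R, forward_sub_T(R, [1.0] * (k + 1), k + 1), k + 1)
    lam = 1.0 / sum(h)
    g_rre = [lam * hi for hi in h]
    return (g_mpe, est_mpe), (g_rre, math.sqrt(lam))

def norm_U_gamma(U, gamma):
    """||U gamma||_2 formed explicitly, for comparison only."""
    v = [sum(g * col[r] for g, col in zip(gamma, U)) for r in range(len(U[0]))]
    return math.sqrt(dot(v, v))

# Illustrative data (assumed): columns u_0, u_1, u_2 of a 4 x 3 matrix U_2.
U = [[1.0, 1.0, 1.0, 1.0], [0.5, 0.4, 0.3, 0.2], [0.25, 0.16, 0.09, 0.04]]
(g_mpe, est_mpe), (g_rre, est_rre) = residual_estimates(U)
```

Each estimate should match the explicitly formed norm to rounding error. Since the MPE weights also sum to 1 and RRE minimizes $\|U_k \gamma\|$ over all such weights, `est_rre` cannot exceed `est_mpe`.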

2.6 Further algorithms for MPE and RRE

2.6.1 General preliminaries

We end this chapter with a further set of algorithms to compute $s_{n,k}$ for MPE and RRE that are more recent than those presented in Section 2.4. These algorithms were developed in Sidi [279, 285] by using the standard Euclidean inner product and the norm induced by it. The new MPE algorithm generalizes the algorithm developed by Kamvar et al. in [153] for implementing the so-called quadratic extrapolation method defined there. The quadratic extrapolation method was designed for computing the PageRank of the Google Web matrix, and it turns out to be closely related to the $k = 2$ case of MPE; this relation was shown in [279, 285]. Note that the new algorithms here are derived using the weighted inner product $(y, z) = y^* M z$ and the norm $\|z\| = \sqrt{(z, z)}$ induced by it.

In these new algorithms, instead of working with the vectors $u_{n+i} = x_{n+i+1} - x_{n+i}$, we work with the vectors $\tilde{u}_i$ defined as

$$\tilde{u}_i = x_{n+i+1} - x_n, \qquad i = 0, 1, \ldots. \tag{2.40}$$

Similarly, we define the matrices $\tilde{U}_j$ as

$$\tilde{U}_j = [\, \tilde{u}_0 \mid \tilde{u}_1 \mid \cdots \mid \tilde{u}_j \,], \qquad j = 0, 1, \ldots, \tag{2.41}$$

and also define their weighted QR factorizations

$$\tilde{U}_j = \tilde{Q}_j \tilde{R}_j, \qquad j = 0, 1, \ldots, \qquad \tilde{U}_j \in \mathbb{C}^{N \times (j+1)}, \qquad \tilde{R}_j \in \mathbb{C}^{(j+1) \times (j+1)}, \tag{2.42}$$

where

$$\tilde{Q}_j = [\, \tilde{q}_0 \mid \tilde{q}_1 \mid \cdots \mid \tilde{q}_j \,], \qquad \tilde{R}_j = \begin{bmatrix} \tilde{r}_{00} & \tilde{r}_{01} & \cdots & \tilde{r}_{0j} \\ & \tilde{r}_{11} & \cdots & \tilde{r}_{1j} \\ & & \ddots & \vdots \\ & & & \tilde{r}_{jj} \end{bmatrix}, \tag{2.43}$$

$$\tilde{q}_i^* M \tilde{q}_j = \delta_{ij} \quad \forall\, i, j, \qquad \tilde{r}_{ij} = \tilde{q}_i^* M \tilde{u}_j \quad \forall\, i \leq j, \qquad \tilde{r}_{ii} > 0 \quad \forall\, i. \tag{2.44}$$


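The new algorithms are based on $x_n$ and $\tilde{U}_k$ and rely on the following simple lemma.

Lemma 2.7.
1. Let $a = [a_0, a_1, \ldots, a_k]^T$ and $\tilde{a} = [\tilde{a}_0, \tilde{a}_1, \ldots, \tilde{a}_k]^T$. If $U_k a = \tilde{U}_k \tilde{a}$, then

$$\text{(i)} \quad a_i = \sum_{j=i}^{k} \tilde{a}_j, \quad 0 \leq i \leq k \quad (a_k = \tilde{a}_k), \qquad \text{(ii)} \quad \sum_{i=0}^{k} a_i = \sum_{j=0}^{k} (j+1)\, \tilde{a}_j.$$

2. In addition,

$$\sum_{i=0}^{k} a_i x_{n+i} = \Big( \sum_{i=0}^{k} a_i \Big) x_n + \sum_{i=0}^{k-1} a_{i+1} \tilde{u}_i.$$

Proof. Substituting $u_{n+i} = \tilde{u}_i - \tilde{u}_{i-1}$ with $\tilde{u}_{-1} = 0$ into $U_k a$, we obtain

$$U_k a = \sum_{i=0}^{k} a_i u_{n+i} = \sum_{i=0}^{k} \tilde{a}_i \tilde{u}_i = \tilde{U}_k \tilde{a}, \qquad \tilde{a}_i = a_i - a_{i+1}, \quad 0 \leq i \leq k-1, \qquad \tilde{a}_k = a_k.$$

The proof of part 1 can now be easily completed. Part 2 is proved by substituting $x_{n+i} = x_n + \tilde{u}_{i-1}$ into $\sum_{i=0}^{k} a_i x_{n+i}$.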
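The relations in Lemma 2.7 can be confirmed on a tiny example. The sketch below takes $n = 0$, $k = 2$, scalar iterates, and an arbitrary coefficient vector $a$; the values and the helper name `tilde_coeffs` are assumptions made for illustration.

```python
def tilde_coeffs(a):
    """a~_i = a_i - a_{i+1} for i < k, and a~_k = a_k."""
    k = len(a) - 1
    return [a[i] - a[i + 1] for i in range(k)] + [a[k]]

# Illustrative data (assumed): scalar iterates x_0..x_3 and coefficients a_0..a_2.
x = [1.0, 2.0, 4.0, 8.0]
a = [3.0, -1.0, 2.0]
at = tilde_coeffs(a)                              # [4.0, -3.0, 2.0]

u = [x[i + 1] - x[i] for i in range(3)]           # u_i   = x_{i+1} - x_i
ut = [x[i + 1] - x[0] for i in range(3)]          # u~_i  = x_{i+1} - x_0

Ua = sum(ai * ui for ai, ui in zip(a, u))         # U_k a
Ut_at = sum(ai * ui for ai, ui in zip(at, ut))    # U~_k a~

# Part 1(i) and 1(ii).
a_back = [sum(at[i:]) for i in range(3)]
sum_a = sum(a)
weighted_sum = sum((j + 1) * at[j] for j in range(3))

# Part 2: sum_i a_i x_i = (sum_i a_i) x_0 + sum_{i<k} a_{i+1} u~_i.
lhs = sum(ai * xi for ai, xi in zip(a, x))
rhs = sum_a * x[0] + sum(a[i + 1] * ut[i] for i in range(2))
```

With these numbers, `Ua` and `Ut_at` both equal 9, `a_back` reproduces `a`, both sums in part 1(ii) equal 4, and `lhs` and `rhs` both equal 9.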

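Theorem 2.8. Let $\tilde{c}' = [\tilde{c}_0, \tilde{c}_1, \ldots, \tilde{c}_{k-1}]^T$ be the solution to the minimization problem

$$\min_{\tilde{c}'} \| \tilde{U}_{k-1} \tilde{c}' + \tilde{u}_k \|. \tag{2.45}$$

Let $\tilde{c}_k = 1$, and, provided $\sum_{j=0}^{k} (j+1)\, \tilde{c}_j \neq 0$, define

$$\tilde{\xi}_i = \frac{\sum_{j=i+1}^{k} \tilde{c}_j}{\sum_{j=0}^{k} (j+1)\, \tilde{c}_j}, \qquad i = 0, 1, \ldots, k-1. \tag{2.46}$$

Then, for MPE, $s_{n,k}$ is given by

$$s_{n,k} = x_n + \sum_{i=0}^{k-1} \tilde{\xi}_i \tilde{u}_i. \tag{2.47}$$

Proof. We observe that Lemma 2.7 applies with $a = c$ and $\tilde{a} = \tilde{c}$. By the fact that $c_k = \tilde{c}_k = 1$, we have $U_{k-1} c' + u_{n+k} = \tilde{U}_{k-1} \tilde{c}' + \tilde{u}_k$; hence

$$\min_{c'} \| U_{k-1} c' + u_{n+k} \| = \min_{\tilde{c}'} \| \tilde{U}_{k-1} \tilde{c}' + \tilde{u}_k \|.$$

In addition,

$$s_{n,k} = x_n + \frac{\sum_{i=0}^{k-1} c_{i+1} \tilde{u}_i}{\sum_{i=0}^{k} c_i}.$$

The proof can now be completed by making further use of Lemma 2.7.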


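Theorem 2.9. Let $\tilde{\gamma} = [\tilde{\gamma}_0, \tilde{\gamma}_1, \ldots, \tilde{\gamma}_k]^T$ be the solution to the constrained minimization problem

\[ \min_{\tilde{\gamma}} \big\| \tilde{U}_k \tilde{\gamma} \big\| \quad \text{subject to} \quad \sum_{j=0}^{k} (j+1)\,\tilde{\gamma}_j = 1. \tag{2.48} \]

Define

\[ \tilde{\xi}_i = \sum_{j=i+1}^{k} \tilde{\gamma}_j, \quad i = 0, 1, \ldots, k-1. \tag{2.49} \]

Then, for RRE, $s_{n,k}$ is given by

\[ s_{n,k} = x_n + \sum_{i=0}^{k-1} \tilde{\xi}_i \tilde{u}_i. \tag{2.50} \]

Proof. We observe that Lemma 2.7 applies with $a = \gamma$ and $\tilde{a} = \tilde{\gamma}$. We have $U_k \gamma = \tilde{U}_k \tilde{\gamma}$ and $\sum_{i=0}^{k} \gamma_i = \sum_{i=0}^{k} (i+1)\,\tilde{\gamma}_i$. Therefore,

\[ \min_{\gamma:\ \sum_{i=0}^{k} \gamma_i = 1} \big\| U_k \gamma \big\| = \min_{\tilde{\gamma}:\ \sum_{j=0}^{k} (j+1)\tilde{\gamma}_j = 1} \big\| \tilde{U}_k \tilde{\gamma} \big\|. \]

In addition,

\[ s_{n,k} = x_n + \sum_{i=0}^{k-1} \gamma_{i+1} \tilde{u}_i. \]

The proof can now be completed by making further use of Lemma 2.7.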

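2.6.2 Unification of algorithms and error estimation

Theorems 2.8 and 2.9 also show how to implement MPE and RRE stably and economically. Here is the relevant unified algorithm.

1. Choose integers $n \ge 0$ and $k \ge 1$. Input the vectors $x_n, x_{n+1}, \ldots, x_{n+k+1}$. (The vector $x_{n+k+1}$ is needed to form $\tilde{u}_k$ in step 2.)

2. Compute the vectors $\tilde{u}_i = x_{n+i+1} - x_n$, $i = 0, 1, \ldots, k$; form the $N \times (k+1)$ matrix

\[ \tilde{U}_k = [\,\tilde{u}_0 \,|\, \tilde{u}_1 \,|\, \cdots \,|\, \tilde{u}_k\,]; \]

and form its weighted QR factorization, namely, $\tilde{U}_k = \tilde{Q}_k \tilde{R}_k$, with $\tilde{Q}_k$ and $\tilde{R}_k$ as in (2.23), using MGS. (Here we are assuming that $\tilde{U}_k$ has full rank.)

3. Determine the $\tilde{\xi}_i$:

• For MPE: With $\tilde{\rho}_k = [\tilde{r}_{0k}, \tilde{r}_{1k}, \ldots, \tilde{r}_{k-1,k}]^T$, solve the $k \times k$ upper triangular system

\[ \tilde{R}_{k-1} \tilde{c}' = -\tilde{\rho}_k, \qquad \tilde{c}' = [\tilde{c}_0, \tilde{c}_1, \ldots, \tilde{c}_{k-1}]^T. \]

Set $\tilde{c}_k = 1$ and $\tilde{\xi}_i = \sum_{j=i+1}^{k} \tilde{c}_j \big/ \sum_{j=0}^{k} (j+1)\,\tilde{c}_j$, $i = 0, 1, \ldots, k-1$, provided $\sum_{j=0}^{k} (j+1)\,\tilde{c}_j \ne 0$.

• For RRE: With $\tilde{e} = [1, 2, \ldots, k+1]^T$, solve the $(k+1) \times (k+1)$ linear system

\[ \tilde{R}_k^* \tilde{R}_k \tilde{h} = \tilde{e}, \qquad \tilde{h} = [\tilde{h}_0, \tilde{h}_1, \ldots, \tilde{h}_k]^T. \]

(This amounts to solving two triangular systems: first $\tilde{R}_k^* \tilde{p} = \tilde{e}$ for $\tilde{p}$ and then $\tilde{R}_k \tilde{h} = \tilde{p}$ for $\tilde{h}$.) Next, compute $\tilde{\lambda} = 1 \big/ \sum_{i=0}^{k} (i+1)\,\tilde{h}_i$. (Note that $\tilde{\lambda}$ is always real and positive because $\tilde{U}_k$ has full rank.) Next, set $\tilde{\gamma} = \tilde{\lambda} \tilde{h}$; that is, $\tilde{\gamma}_i = \tilde{\lambda} \tilde{h}_i$, $i = 0, 1, \ldots, k$. Then set $\tilde{\xi}_i = \sum_{j=i+1}^{k} \tilde{\gamma}_j$, $i = 0, 1, \ldots, k-1$.

4. With $\tilde{\xi} = [\tilde{\xi}_0, \tilde{\xi}_1, \ldots, \tilde{\xi}_{k-1}]^T$ determined, compute

\[ \tilde{\eta} = [\tilde{\eta}_0, \tilde{\eta}_1, \ldots, \tilde{\eta}_{k-1}]^T = \tilde{R}_{k-1} \tilde{\xi}. \]

Then compute

\[ s_{n,k} = x_n + \tilde{Q}_{k-1} \tilde{\eta} = x_n + \sum_{i=0}^{k-1} \tilde{\eta}_i \tilde{q}_i. \]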
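The four steps above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: real data, $M = I$ (Euclidean inner product), and a generic dense solver standing in for dedicated triangular solves. The names `mgs_qr` and `extrapolate`, and the small linear test iteration $x_{m+1} = T x_m + d$, are hypothetical and not taken from the text.

```python
import numpy as np

def mgs_qr(A):
    """QR factorization by modified Gram-Schmidt (M = I); R has a positive diagonal."""
    A = np.array(A, dtype=float)
    n_rows, m = A.shape
    Q = np.zeros((n_rows, m))
    R = np.zeros((m, m))
    for j in range(m):
        v = A[:, j].copy()
        for i in range(j):
            R[i, j] = Q[:, i] @ v
            v -= R[i, j] * Q[:, i]
        R[j, j] = np.linalg.norm(v)
        Q[:, j] = v / R[j, j]
    return Q, R

def extrapolate(xs, method="MPE"):
    """xs has shape (k+2, N): rows x_n, ..., x_{n+k+1}. Returns (s_{n,k}, residual estimate)."""
    xs = np.asarray(xs, dtype=float)
    k = xs.shape[0] - 2
    U_t = (xs[1:] - xs[0]).T                 # step 2: columns u~_i = x_{n+i+1} - x_n
    Q, R = mgs_qr(U_t)
    w = np.arange(1.0, k + 2)                # weights (j+1), j = 0..k
    if method == "MPE":                      # step 3, MPE branch
        c = np.append(np.linalg.solve(R[:k, :k], -R[:k, k]), 1.0)
        denom = w @ c
        if denom == 0.0:
            raise ZeroDivisionError("s_{n,k} does not exist for MPE")
        g = c / denom
        estimate = R[k, k] * abs(g[k])       # r~_kk |gamma~_k|
    elif method == "RRE":                    # step 3, RRE branch
        p = np.linalg.solve(R.T, w)          # R~^* p = e~
        h = np.linalg.solve(R, p)            # R~ h = p
        lam = 1.0 / (w @ h)
        g = lam * h
        estimate = np.sqrt(lam)
    else:
        raise ValueError("method must be 'MPE' or 'RRE'")
    xi = np.cumsum(g[::-1])[::-1][1:]        # xi~_i = sum_{j=i+1}^k g_j, i = 0..k-1
    eta = R[:k, :k] @ xi                     # step 4
    return xs[0] + Q[:, :k] @ eta, estimate

# Hypothetical test problem: a contractive linear fixed-point iteration.
rng = np.random.default_rng(0)
N, k = 8, 3
A = rng.standard_normal((N, N))
T = 0.9 * A / np.linalg.norm(A, 2)
d = rng.standard_normal(N)
xs = [np.zeros(N)]
for _ in range(k + 1):
    xs.append(T @ xs[-1] + d)
xs = np.array(xs)

s_mpe, est_mpe = extrapolate(xs, "MPE")
s_rre, est_rre = extrapolate(xs, "RRE")
```

For this linear example, the returned estimates can be compared with the norm of the true residual $T s + d - s$ evaluated at `s_mpe` and `s_rre`.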

With these algorithms too, the error can be estimated in terms of quantities already computed without actually having to form $s_{n,k}$. Note that $U_k \gamma = \tilde{U}_k \tilde{\gamma}$. As a result, $r(s_{n,k}) = \tilde{U}_k \tilde{\gamma}$ for linear systems, while $r(s_{n,k}) \approx \tilde{U}_k \tilde{\gamma}$ for nonlinear systems. Then we have the following result, whose proof we leave to the reader.

Theorem 2.10. $\|\tilde{U}_k \tilde{\gamma}\|$ is given as

\[ \big\| \tilde{U}_k \tilde{\gamma} \big\| = \begin{cases} \tilde{r}_{kk}\, |\tilde{\gamma}_k| & \text{for MPE}, \\ \sqrt{\tilde{\lambda}} & \text{for RRE}. \end{cases} \tag{2.51} \]

Here, $\tilde{r}_{kk}$ is the last diagonal element of the matrix $\tilde{R}_k$ in (2.22) and $\tilde{\lambda}$ is the parameter computed in step 3 of the algorithms above.
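One possible way to fill in the omitted proof is sketched below. It assumes that, for MPE, $\tilde{\gamma}$ denotes $\tilde{c} \big/ \sum_{j=0}^{k} (j+1)\tilde{c}_j$ with $\tilde{c} = [\tilde{c}'^{\,T}, 1]^T$ (so that $\tilde{\gamma}_k = 1 \big/ \sum_{j=0}^{k} (j+1)\tilde{c}_j$), and that, for RRE, $\tilde{\gamma} = \tilde{\lambda}\tilde{h}$ as in step 3; it is a sketch rather than the author's argument.

```latex
\begin{align*}
\|\tilde{U}_k\tilde{\gamma}\| &= \|\tilde{R}_k\tilde{\gamma}\|_2
  && \text{since } \tilde{Q}_k^* M \tilde{Q}_k = I_{k+1}, \\
\text{MPE:}\quad \tilde{R}_k\tilde{\gamma}
  &= \tilde{\gamma}_k
     \begin{bmatrix} \tilde{R}_{k-1}\tilde{c}' + \tilde{\rho}_k \\ \tilde{r}_{kk} \end{bmatrix}
   = \tilde{r}_{kk}\,\tilde{\gamma}_k \begin{bmatrix} 0 \\ 1 \end{bmatrix}
  && \Longrightarrow\ \|\tilde{R}_k\tilde{\gamma}\|_2 = \tilde{r}_{kk}\,|\tilde{\gamma}_k|, \\
\text{RRE:}\quad \|\tilde{R}_k\tilde{\gamma}\|_2^2
  &= \tilde{\lambda}^2\, \tilde{h}^* \big(\tilde{R}_k^*\tilde{R}_k\tilde{h}\big)
   = \tilde{\lambda}^2\, \tilde{h}^*\tilde{e}
   = \tilde{\lambda}^2 \cdot \frac{1}{\tilde{\lambda}}
   = \tilde{\lambda}
  && \Longrightarrow\ \|\tilde{R}_k\tilde{\gamma}\|_2 = \sqrt{\tilde{\lambda}}.
\end{align*}
```

The RRE line uses the fact that $\tilde{h}^*\tilde{e} = \tilde{h}^*\tilde{R}_k^*\tilde{R}_k\tilde{h} = \|\tilde{R}_k\tilde{h}\|_2^2$ is real and positive, which is also why $\tilde{\lambda}$ is real and positive.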

Chapter 3

MPE and RRE Are Related

3.1 Introduction

In spite of the fact that MPE and RRE are two different methods, the vectors $s_{n,k}$ produced by them turn out to be related in more than one interesting way, as already mentioned in Section 1.11. The purpose of this chapter is to explore this relationship in detail. Throughout this chapter, we use the notation of Sections 2.4 and 2.5. For clarity of notation and to avoid confusion, in what follows, we consider the vectors $s_{n,k}$ with $n = 0$ only and we denote $s_{0,k}$ by $s_k$ for simplicity. We also denote the vectors $\gamma$ and $\xi$ corresponding to $s_k$ by $\gamma_k$ and $\xi_k$, respectively. We will also denote (i) the $\gamma_i$ associated with $\gamma_k$ by $\gamma_{ki}$ and (ii) the $\xi_j$ associated with $\xi_k$ by $\xi_{kj}$. That is,

\[ \gamma_k = [\gamma_{k0}, \gamma_{k1}, \ldots, \gamma_{kk}]^T \quad \text{and} \quad \xi_k = [\xi_{k0}, \xi_{k1}, \ldots, \xi_{k,k-1}]^T. \]

We will denote by $s_k^{\mathrm{MPE}}$ and $s_k^{\mathrm{RRE}}$ the vectors $s_k$ produced by MPE and RRE, respectively. We will do the same for $\gamma_k$ and $\xi_k$ whenever necessary. Again, the inner product and the norm induced by it are $(y, z) = y^* M z$ and $\|z\| = \sqrt{(z, z)}$, respectively.

As we will realize soon, the unified algorithms of the preceding chapter in general, and the weighted QR factorization in particular, turn out to be very useful tools in our study, which is completely theoretical. The results of this section were published only recently in a paper by Sidi [292].^20

Two main results are proved in this chapter. The first one concerns the stagnation of RRE when MPE fails; it reads as follows:

\[ s_k^{\mathrm{RRE}} = s_{k-1}^{\mathrm{RRE}} \quad \Longleftrightarrow \quad s_k^{\mathrm{MPE}} \ \text{does not exist.} \]

The second result concerns the general case in which $s_k^{\mathrm{MPE}}$ exists; part of it reads

\[ \mu_k s_k^{\mathrm{RRE}} = \mu_{k-1} s_{k-1}^{\mathrm{RRE}} + \nu_k s_k^{\mathrm{MPE}}, \qquad \mu_k = \mu_{k-1} + \nu_k, \]

where $\mu_k$, $\mu_{k-1}$, and $\nu_k$ are positive scalars depending only on $s_k^{\mathrm{RRE}}$, $s_{k-1}^{\mathrm{RRE}}$, and $s_k^{\mathrm{MPE}}$, respectively.

Remark: What is surprising about these results is that the phenomena described by them are universal because they occur independent of how the sequence $\{x_m\}$ is generated. In fact, we have not restricted the vectors $x_m$ in any way; they can be generated linearly or nonlinearly or they can be completely arbitrary.

^20 Used with permission from Springer [292].

As explained earlier, from the way we construct the matrices $U_k$, namely,

\[ U_k = [\,u_0 \,|\, u_1 \,|\, \cdots \,|\, u_k\,], \quad k = 0, 1, \ldots, \tag{3.1} \]

it is clear that there is an integer $k_0 \le N$ such that the matrices $U_k$, $k = 0, 1, \ldots, k_0 - 1$, are of full rank, but $U_{k_0}$ is not; that is,

\[ \operatorname{rank}(U_k) = k + 1, \quad k = 0, 1, \ldots, k_0 - 1, \qquad \operatorname{rank}(U_{k_0}) = k_0. \tag{3.2} \]

(Of course, this is the same as saying that $\{u_0, u_1, \ldots, u_{k_0 - 1}\}$ is a linearly independent set, but $\{u_0, u_1, \ldots, u_{k_0}\}$ is not.) In the following, we will be looking at the cases in which $k < k_0$.

Let $U_k = Q_k R_k$ be the weighted QR factorization of $U_k$. We begin by observing that, since

\[ U_k \gamma_k = Q_k (R_k \gamma_k) \tag{3.3} \]

and since $Q_k$ is the same for both MPE and RRE, the vector that is of relevance is $R_k \gamma_k$, and we turn to the study of this vector. In addition, since $Q_k$ is unitary in the sense $Q_k^* M Q_k = I_{k+1}$, by Lemma 2.2, we also have

\[ \| U_k \gamma_k \| = \| R_k \gamma_k \|_2. \tag{3.4} \]

This was shown as part of the proof of Theorem 2.6. Needless to say, all this is valid for both MPE and RRE. Below, we recall

\[ s_k = \sum_{i=0}^{k} \gamma_{ki} x_i, \qquad \sum_{i=0}^{k} \gamma_{ki} = 1, \tag{3.5} \]

and also

\[ s_k = x_0 + \sum_{j=0}^{k-1} \xi_{kj} u_j = x_0 + U_{k-1} \xi_k, \qquad \xi_{kj} = \sum_{i=j+1}^{k} \gamma_{ki} = \gamma_{k,j+1} + \xi_{k,j+1}, \quad j = 0, 1, \ldots, k-1, \tag{3.6} \]

(with $\xi_{kk} = 0$) and the obvious fact that $\xi_k$ determines $\gamma_k$ uniquely and vice versa. We make repeated use of the partitionings

\[ U_k = [\,U_{k-1} \,|\, u_k\,], \qquad Q_k = [\,Q_{k-1} \,|\, q_k\,], \tag{3.7} \]

\[ R_k = \begin{bmatrix} R_{k-1} & \rho_k \\ 0^T & r_{kk} \end{bmatrix}, \qquad \rho_k = [r_{0k}, r_{1k}, \ldots, r_{k-1,k}]^T. \tag{3.8} \]

We also use the fact that the vectors $c'$ and $c$ in the definition of MPE are given by

\[ R_{k-1} c' = -\rho_k \quad \Rightarrow \quad c' = -R_{k-1}^{-1} \rho_k, \qquad c = \begin{bmatrix} c' \\ 1 \end{bmatrix}. \tag{3.9} \]

Note that, for all $k < k_0$, $R_k$ is nonsingular and $c$ exists and is unique whether $s_k^{\mathrm{MPE}}$ is defined or not. We will also use the notation

\[ \hat{e}_k = [1, 1, \ldots, 1]^T \in \mathbb{C}^{k+1}, \quad k = 1, 2, \ldots. \]

(Note that the $\hat{e}_k$ here should not be confused with the standard basis vectors $e_k$.) Finally, we recall that $s_k^{\mathrm{MPE}}$ exists if and only if $\hat{e}_k^T c \ne 0$.

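3.2 $R_k \gamma_k$ for MPE and RRE

3.2.1 $R_k \gamma_k$ for MPE

Assuming that $\hat{e}_k^T c \ne 0$, so that $s_k^{\mathrm{MPE}}$ exists, we have

\[ \gamma_k^{\mathrm{MPE}} = \frac{c}{\hat{e}_k^T c}, \tag{3.10} \]

and hence

\[ R_k \gamma_k^{\mathrm{MPE}} = \frac{1}{\hat{e}_k^T c} \begin{bmatrix} R_{k-1} & \rho_k \\ 0^T & r_{kk} \end{bmatrix} \begin{bmatrix} c' \\ 1 \end{bmatrix} = \frac{1}{\hat{e}_k^T c} \begin{bmatrix} R_{k-1} c' + \rho_k \\ r_{kk} \end{bmatrix} = \frac{r_{kk}}{\hat{e}_k^T c} \begin{bmatrix} 0 \\ 1 \end{bmatrix}. \tag{3.11} \]

Of course, this immediately implies that

\[ \big\| R_k \gamma_k^{\mathrm{MPE}} \big\|_2 = \frac{r_{kk}}{|\hat{e}_k^T c|}. \tag{3.12} \]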

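3.2.2 $R_k \gamma_k$ for RRE and an identity

As for RRE, by [see (2.30)–(2.32)]

\[ \gamma_k^{\mathrm{RRE}} = \lambda h, \qquad R_k^* R_k h = \hat{e}_k, \qquad \lambda = \frac{1}{\hat{e}_k^T h}, \tag{3.13} \]

we have

\[ R_k \gamma_k^{\mathrm{RRE}} = \frac{R_k (R_k^* R_k)^{-1} \hat{e}_k}{\hat{e}_k^T (R_k^* R_k)^{-1} \hat{e}_k} = \frac{R_k^{-*} \hat{e}_k}{\| R_k^{-*} \hat{e}_k \|_2^2}. \tag{3.14} \]

[Here $B^{-*}$ stands for $(B^*)^{-1} = (B^{-1})^*$.] Of course, this immediately implies that

\[ \big\| R_k \gamma_k^{\mathrm{RRE}} \big\|_2 = \frac{1}{\| R_k^{-*} \hat{e}_k \|_2} \quad \Rightarrow \quad R_k^{-*} \hat{e}_k = \frac{R_k \gamma_k^{\mathrm{RRE}}}{\| R_k \gamma_k^{\mathrm{RRE}} \|_2^2}. \tag{3.15} \]

We now go on to study $R_k^{-*} \hat{e}_k$ in more detail. First,

\[ R_k^{-1} = \begin{bmatrix} R_{k-1}^{-1} & c'/r_{kk} \\ 0^T & 1/r_{kk} \end{bmatrix} \quad \Rightarrow \quad R_k^{-*} = \begin{bmatrix} R_{k-1}^{-*} & 0 \\ (c')^*/r_{kk} & 1/r_{kk} \end{bmatrix}. \tag{3.16} \]

Consequently,

\[ R_k^{-*} \hat{e}_k = \begin{bmatrix} R_{k-1}^{-*} & 0 \\ (c')^*/r_{kk} & 1/r_{kk} \end{bmatrix} \begin{bmatrix} \hat{e}_{k-1} \\ 1 \end{bmatrix} = \begin{bmatrix} R_{k-1}^{-*} \hat{e}_{k-1} \\ (c')^* \hat{e}_{k-1}/r_{kk} + 1/r_{kk} \end{bmatrix} = \begin{bmatrix} R_{k-1}^{-*} \hat{e}_{k-1} \\ \overline{\hat{e}_k^T c}/r_{kk} \end{bmatrix}, \tag{3.17} \]

which, by (3.15), can also be expressed as

\[ \frac{1}{\| R_k \gamma_k^{\mathrm{RRE}} \|_2^2}\, R_k \gamma_k^{\mathrm{RRE}} = \frac{1}{\| R_{k-1} \gamma_{k-1}^{\mathrm{RRE}} \|_2^2} \begin{bmatrix} R_{k-1} \gamma_{k-1}^{\mathrm{RRE}} \\ 0 \end{bmatrix} + \frac{\overline{\hat{e}_k^T c}}{r_{kk}} \begin{bmatrix} 0 \\ 1 \end{bmatrix}. \tag{3.18} \]

This is our key identity, and we will use it to prove our main results next. (Here $\overline{t}$ stands for the complex conjugate of $t$, as usual.)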

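3.3 First main result

The following theorem is our first main result, concerning the case in which $s_k^{\mathrm{MPE}}$ does not exist and RRE stagnates.

Theorem 3.1.

1. If $s_k^{\mathrm{MPE}}$ does not exist,

\[ s_k^{\mathrm{RRE}} = s_{k-1}^{\mathrm{RRE}}, \tag{3.19} \]

which is also equivalent to

\[ U_k \gamma_k^{\mathrm{RRE}} = U_{k-1} \gamma_{k-1}^{\mathrm{RRE}}. \tag{3.20} \]

2. Conversely, if (3.19) holds, then $s_k^{\mathrm{MPE}}$ does not exist.

Proof. The proof is based on the fact that $s_k^{\mathrm{MPE}}$ exists if and only if $\hat{e}_k^T c \ne 0$.

Proof of part 1: Since $\hat{e}_k^T c = 0$ when $s_k^{\mathrm{MPE}}$ does not exist, by (3.18),

\[ \frac{1}{\| R_k \gamma_k^{\mathrm{RRE}} \|_2^2}\, R_k \gamma_k^{\mathrm{RRE}} = \frac{1}{\| R_{k-1} \gamma_{k-1}^{\mathrm{RRE}} \|_2^2} \begin{bmatrix} R_{k-1} \gamma_{k-1}^{\mathrm{RRE}} \\ 0 \end{bmatrix}. \tag{3.21} \]

Taking norms in (3.21), we obtain

\[ \| R_k \gamma_k^{\mathrm{RRE}} \|_2 = \| R_{k-1} \gamma_{k-1}^{\mathrm{RRE}} \|_2, \tag{3.22} \]

which, upon substituting back into (3.21), gives

\[ R_k \gamma_k^{\mathrm{RRE}} = \begin{bmatrix} R_{k-1} \gamma_{k-1}^{\mathrm{RRE}} \\ 0 \end{bmatrix} = R_k \begin{bmatrix} \gamma_{k-1}^{\mathrm{RRE}} \\ 0 \end{bmatrix}. \tag{3.23} \]

By the fact that $R_k$ is nonsingular, it follows that

\[ \gamma_k^{\mathrm{RRE}} = \begin{bmatrix} \gamma_{k-1}^{\mathrm{RRE}} \\ 0 \end{bmatrix}, \tag{3.24} \]

which, together with (3.5), gives (3.19) and (3.20).

Proof of part 2: By (3.19) and (3.6), we have

\[ s_k^{\mathrm{RRE}} = x_0 + U_{k-1} \xi_k^{\mathrm{RRE}} = x_0 + U_{k-2} \xi_{k-1}^{\mathrm{RRE}} = s_{k-1}^{\mathrm{RRE}}, \tag{3.25} \]

from which

\[ U_{k-1} \xi_k^{\mathrm{RRE}} = U_{k-2} \xi_{k-1}^{\mathrm{RRE}} \quad \Rightarrow \quad U_{k-1} \xi_k^{\mathrm{RRE}} = U_{k-1} \begin{bmatrix} \xi_{k-1}^{\mathrm{RRE}} \\ 0 \end{bmatrix}. \tag{3.26} \]

By the fact that $U_{k-1}$ is of full column rank, (3.26) implies that

\[ \xi_k^{\mathrm{RRE}} = \begin{bmatrix} \xi_{k-1}^{\mathrm{RRE}} \\ 0 \end{bmatrix}, \tag{3.27} \]

by which we have $\xi_{k,k-1}^{\mathrm{RRE}} = 0$. But, by (3.6), $\xi_{k,k-1}^{\mathrm{RRE}} = \gamma_{kk}^{\mathrm{RRE}}$. Thus, $\gamma_{kk}^{\mathrm{RRE}} = 0$ in our case. Invoking this in (3.6), we obtain (3.24). Multiplying both sides of (3.24) on the left by $R_k$ and invoking also (3.8), we obtain

\[ R_k \gamma_k^{\mathrm{RRE}} = \begin{bmatrix} R_{k-1} \gamma_{k-1}^{\mathrm{RRE}} \\ 0 \end{bmatrix} \quad \Rightarrow \quad \| R_k \gamma_k^{\mathrm{RRE}} \|_2 = \| R_{k-1} \gamma_{k-1}^{\mathrm{RRE}} \|_2. \tag{3.28} \]

Substituting (3.28) into (3.18), we obtain $\hat{e}_k^T c = 0$, which completes the proof.

Remark: What Theorem 3.1 is saying is that the stagnation of RRE (in the sense that $s_k^{\mathrm{RRE}} = s_{k-1}^{\mathrm{RRE}}$) and the failure of $s_k^{\mathrm{MPE}}$ to exist take place simultaneously.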

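3.4 Second main result

The next theorem is our second main result, concerning the general case in which $s_k^{\mathrm{MPE}}$ exists.

Theorem 3.2. If $s_k^{\mathrm{MPE}}$ exists,

\[ \frac{1}{\| U_k \gamma_k^{\mathrm{RRE}} \|^2} = \frac{1}{\| U_{k-1} \gamma_{k-1}^{\mathrm{RRE}} \|^2} + \frac{1}{\| U_k \gamma_k^{\mathrm{MPE}} \|^2} \tag{3.29} \]

and

\[ \frac{U_k \gamma_k^{\mathrm{RRE}}}{\| U_k \gamma_k^{\mathrm{RRE}} \|^2} = \frac{U_{k-1} \gamma_{k-1}^{\mathrm{RRE}}}{\| U_{k-1} \gamma_{k-1}^{\mathrm{RRE}} \|^2} + \frac{U_k \gamma_k^{\mathrm{MPE}}}{\| U_k \gamma_k^{\mathrm{MPE}} \|^2}. \tag{3.30} \]

Consequently, we also have

\[ \frac{s_k^{\mathrm{RRE}}}{\| U_k \gamma_k^{\mathrm{RRE}} \|^2} = \frac{s_{k-1}^{\mathrm{RRE}}}{\| U_{k-1} \gamma_{k-1}^{\mathrm{RRE}} \|^2} + \frac{s_k^{\mathrm{MPE}}}{\| U_k \gamma_k^{\mathrm{MPE}} \|^2}. \tag{3.31} \]

In addition,

\[ \| U_k \gamma_k^{\mathrm{RRE}} \| < \| U_{k-1} \gamma_{k-1}^{\mathrm{RRE}} \|. \tag{3.32} \]

Proof. Since $s_k^{\mathrm{MPE}}$ exists, we have $\hat{e}_k^T c \ne 0$. Taking the standard $l_2$-norm of both sides in (3.18) and observing that the two terms on the right-hand side are orthogonal to each other in the standard $l_2$ inner product, we first obtain

\[ \frac{1}{\| R_k \gamma_k^{\mathrm{RRE}} \|_2^2} = \frac{1}{\| R_{k-1} \gamma_{k-1}^{\mathrm{RRE}} \|_2^2} + \left( \frac{|\hat{e}_k^T c|}{r_{kk}} \right)^2, \tag{3.33} \]

which, upon invoking (3.12), gives

\[ \frac{1}{\| R_k \gamma_k^{\mathrm{RRE}} \|_2^2} = \frac{1}{\| R_{k-1} \gamma_{k-1}^{\mathrm{RRE}} \|_2^2} + \frac{1}{\| R_k \gamma_k^{\mathrm{MPE}} \|_2^2}. \tag{3.34} \]

The result in (3.29) follows from (3.34) and (3.4). Next, invoking (3.11) and (3.12) in (3.18), we obtain

\[ \frac{1}{\| R_k \gamma_k^{\mathrm{RRE}} \|_2^2}\, R_k \gamma_k^{\mathrm{RRE}} = \frac{1}{\| R_{k-1} \gamma_{k-1}^{\mathrm{RRE}} \|_2^2} \begin{bmatrix} R_{k-1} \gamma_{k-1}^{\mathrm{RRE}} \\ 0 \end{bmatrix} + \frac{1}{\| R_k \gamma_k^{\mathrm{MPE}} \|_2^2}\, R_k \gamma_k^{\mathrm{MPE}}. \tag{3.35} \]

Multiplying both sides of (3.35) on the left by $Q_k$, and invoking (3.4) and

\[ Q_k \begin{bmatrix} R_{k-1} \gamma_{k-1}^{\mathrm{RRE}} \\ 0 \end{bmatrix} = [\,Q_{k-1} \,|\, q_k\,] \begin{bmatrix} R_{k-1} \gamma_{k-1}^{\mathrm{RRE}} \\ 0 \end{bmatrix} = Q_{k-1} (R_{k-1} \gamma_{k-1}^{\mathrm{RRE}}) = U_{k-1} \gamma_{k-1}^{\mathrm{RRE}}, \tag{3.36} \]

we obtain (3.30). Let us rewrite (3.30) in the form

\[ \frac{1}{\| U_k \gamma_k^{\mathrm{RRE}} \|^2}\, U_k \gamma_k^{\mathrm{RRE}} = \frac{1}{\| U_{k-1} \gamma_{k-1}^{\mathrm{RRE}} \|^2}\, U_k \begin{bmatrix} \gamma_{k-1}^{\mathrm{RRE}} \\ 0 \end{bmatrix} + \frac{1}{\| U_k \gamma_k^{\mathrm{MPE}} \|^2}\, U_k \gamma_k^{\mathrm{MPE}}. \tag{3.37} \]

From (3.37) and by the fact that $U_k$ is of full column rank, it follows that

\[ \frac{1}{\| U_k \gamma_k^{\mathrm{RRE}} \|^2}\, \gamma_k^{\mathrm{RRE}} = \frac{1}{\| U_{k-1} \gamma_{k-1}^{\mathrm{RRE}} \|^2} \begin{bmatrix} \gamma_{k-1}^{\mathrm{RRE}} \\ 0 \end{bmatrix} + \frac{1}{\| U_k \gamma_k^{\mathrm{MPE}} \|^2}\, \gamma_k^{\mathrm{MPE}}, \tag{3.38} \]

which, together with (3.5), gives (3.31). Finally, (3.32) follows directly from (3.29).

The following facts can be deduced directly from (3.29):

\[ \| U_k \gamma_k^{\mathrm{MPE}} \| = \frac{\| U_k \gamma_k^{\mathrm{RRE}} \|}{\sqrt{1 - \big( \| U_k \gamma_k^{\mathrm{RRE}} \| / \| U_{k-1} \gamma_{k-1}^{\mathrm{RRE}} \| \big)^2}}, \tag{3.39} \]

\[ \frac{1}{\| U_k \gamma_k^{\mathrm{RRE}} \|^2} = \sum_{i \in S_k} \frac{1}{\| U_i \gamma_i^{\mathrm{MPE}} \|^2}, \qquad S_k = \{\, 0 \le i \le k : s_i^{\mathrm{MPE}} \ \text{exists} \,\}. \tag{3.40} \]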
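Because these identities hold for arbitrary vectors $x_m$, they can be checked on random data. The following sketch assumes real vectors and $M = I$, and computes $\gamma_k^{\mathrm{MPE}}$ and $\gamma_k^{\mathrm{RRE}}$ directly from their defining least-squares problems rather than via the QR-based algorithms; the helper name `gammas` and the random data are illustrative assumptions.

```python
import numpy as np

def gammas(U):
    """Return (gamma_MPE, gamma_RRE) for the columns u_0, ..., u_k of U (M = I)."""
    k = U.shape[1] - 1
    if k == 0:
        return np.ones(1), np.ones(1)          # s_0 = x_0 for both methods
    # MPE: c' minimizes ||U_{k-1} c' + u_k||_2, c = [c'; 1], gamma = c / sum(c)
    c = np.append(np.linalg.lstsq(U[:, :k], -U[:, k], rcond=None)[0], 1.0)
    # RRE: minimize ||U_k gamma||_2 subject to sum(gamma) = 1
    h = np.linalg.solve(U.T @ U, np.ones(k + 1))
    return c / c.sum(), h / h.sum()

rng = np.random.default_rng(3)
N, K = 10, 4
x = rng.standard_normal((K + 2, N))            # completely arbitrary x_0, ..., x_{K+1}
U = (x[1:] - x[:-1]).T                         # columns u_i = x_{i+1} - x_i

phi_mpe, phi_rre, s_mpe, s_rre = [], [], [], []
for k in range(K + 1):
    g_m, g_r = gammas(U[:, :k + 1])
    phi_mpe.append(np.linalg.norm(U[:, :k + 1] @ g_m))   # ||U_k gamma_k^MPE||
    phi_rre.append(np.linalg.norm(U[:, :k + 1] @ g_r))   # ||U_k gamma_k^RRE||
    s_mpe.append(g_m @ x[:k + 1])                        # s_k = sum_i gamma_ki x_i
    s_rre.append(g_r @ x[:k + 1])
```

With these lists, (3.29), (3.31), (3.32), and (3.40) can be compared term by term for $k = 1, \ldots, K$.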

3.5 Peak-plateau phenomenon

Let us go back to the case in which $\{x_m\}$ is generated by $x_{m+1} = f(x_m)$, $m = 0, 1, \ldots$, from the system $x = f(x)$. As we have already noted, with the residual associated with an arbitrary vector $x$ defined as $r(x) = f(x) - x$, (i) $U_k \gamma_k = r(s_k)$ when $f(x)$ is linear, and (ii) $U_k \gamma_k \approx r(s_k)$ when $f(x)$ is nonlinear and $s_k$ is close to the solution $s$ of $x = f(x)$. Then, Theorem 3.2 [especially (3.39)] implies that the convergence behaviors of MPE and RRE are interrelated in the following sense: MPE and RRE either converge well simultaneously or perform poorly simultaneously.

Letting $\phi_k^{\mathrm{MPE}} = \| U_k \gamma_k^{\mathrm{MPE}} \|$ and $\phi_k^{\mathrm{RRE}} = \| U_k \gamma_k^{\mathrm{RRE}} \|$, and recalling that $\phi_k^{\mathrm{RRE}} / \phi_{k-1}^{\mathrm{RRE}} \le 1$ for all $k$, we have the following: (i) When $\phi_k^{\mathrm{RRE}} / \phi_{k-1}^{\mathrm{RRE}}$ is significantly smaller than one, which means that RRE is performing well, $\phi_k^{\mathrm{MPE}}$ is close to $\phi_k^{\mathrm{RRE}}$, that is, MPE is performing well too, and (ii) when $\phi_k^{\mathrm{MPE}}$ is increasing, that is, MPE is performing poorly, $\phi_k^{\mathrm{RRE}} / \phi_{k-1}^{\mathrm{RRE}}$ is approaching one, that is, RRE is performing poorly too. Thus, when the graph of $\phi_k^{\mathrm{MPE}}$ has a peak for $\tilde{k}_1 \le k \le \tilde{k}_2$, then the graph of $\phi_k^{\mathrm{RRE}}$ has a plateau for $\tilde{k}_1 \le k \le \tilde{k}_2$.

Now, when applied to sequences $\{x_m\}$ obtained from fixed-point iterative solution of linear systems, MPE and RRE are mathematically equivalent to two important Krylov subspace methods for linear systems, known as the method of Arnoldi and the method of generalized minimal residuals (GMR), respectively. Thus, the relationship between MPE and RRE we have discovered in this chapter reduces to that between these two Krylov subspace methods, discovered earlier by Brown [50] and Weiss [340]. The phenomenon we have discussed in the preceding paragraph is known as the peak-plateau phenomenon in the context of Krylov subspace methods for linear systems. We will consider this topic in detail in Chapter 9.

Chapter 4

Algorithms for MMPE and SVD-MPE

4.1 Introduction

In this chapter, we present algorithms for implementing MMPE and SVD-MPE. These algorithms are based in part on the developments of Chapter 2. We summarize the relevant developments first. Starting with the vector sequence $\{x_m\}$, we let

\[ u_m = x_{m+1} - x_m, \quad m = 0, 1, \ldots, \tag{4.1} \]

and define the matrices $U_j$ as

\[ U_j = [\,u_n \,|\, u_{n+1} \,|\, \cdots \,|\, u_{n+j}\,] \in \mathbb{C}^{N \times (j+1)}. \tag{4.2} \]

Let us recall that we need the vectors $u_{n+i}$, $i = 0, 1, \ldots, k$, to determine the $\gamma_i$. Let us also recall that the vector $s_{n,k}$ is of the form

\[ s_{n,k} = \sum_{i=0}^{k} \gamma_i x_{n+i}, \qquad \sum_{i=0}^{k} \gamma_i = 1. \tag{4.3} \]

After determining the $\gamma_i$, we let

\[ \xi_{-1} = 1, \qquad \xi_j = \sum_{i=j+1}^{k} \gamma_i = \xi_{j-1} - \gamma_j, \quad j = 0, 1, \ldots, k-1, \tag{4.4} \]

and compute $s_{n,k}$ without using the $x_{n+i}$, $i \ge 1$, as follows:

\[ s_{n,k} = x_n + \sum_{i=0}^{k-1} \xi_i u_{n+i} = x_n + U_{k-1} \xi, \qquad \xi = [\xi_0, \xi_1, \ldots, \xi_{k-1}]^T. \tag{4.5} \]

The advantage of this is that, with the exception of $x_n$, we can discard the rest of the $x_{n+i}$. Actually, as soon as we compute $u_{n+i}$, we can discard $x_{n+i}$. This way, we need to work with only $k + 2$ vectors throughout the extrapolation process.

In the next section, we recall briefly the LU factorization and SVD, as these will be used in developing the algorithms for MMPE and SVD-MPE.

4.2 LU factorization and SVD

Theorem 4.1. Let $A \in \mathbb{C}^{r \times s}$, $r \ge s$, and $\operatorname{rank}(A) = s$. Then there exist a permutation matrix $P \in \mathbb{C}^{r \times r}$, a lower trapezoidal matrix $L \in \mathbb{C}^{r \times s}$ with ones along its main diagonal, and a nonsingular upper triangular matrix $U \in \mathbb{C}^{s \times s}$ such that $PA = LU$. $P$ can be chosen such that all the elements of $L$ are bounded by one in modulus.

Remark: We do not intend to go into the details of the LU factorization here. We will only mention that, to construct $L$ to have all its elements bounded in modulus by unity, we perform the LU factorization with pivoting, where the pivot chosen when zeroing the $(i, j)$ elements with $i > j$ in the $j$th column is the largest in modulus of the $(k, j)$ elements with $k \ge j$.

Theorem 4.2. Let $A \in \mathbb{C}^{r \times s}$, $r \ge s$. Then there exist unitary matrices $U \in \mathbb{C}^{r \times s}$ and $V \in \mathbb{C}^{s \times s}$ and a diagonal matrix $\Sigma = \operatorname{diag}(\sigma_1, \ldots, \sigma_s) \in \mathbb{C}^{s \times s}$, with $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_s \ge 0$, such that $A = U \Sigma V^*$. Furthermore, if $U = [\,u_1 \,|\, \cdots \,|\, u_s\,]$ and $V = [\,v_1 \,|\, \cdots \,|\, v_s\,]$, then

\[ A v_i = \sigma_i u_i, \qquad A^* u_i = \sigma_i v_i, \qquad i = 1, \ldots, s. \]

If $\operatorname{rank}(A) = t$, then $\sigma_i > 0$, $i = 1, \ldots, t$, and the rest of the $\sigma_i$ are zero.

Remark: The $\sigma_i$ are called the singular values of $A$ and the $v_i$ ($u_i$) are called the right (left) singular vectors of $A$. We also have

\[ A^* A v_i = \sigma_i^2 v_i, \qquad A A^* u_i = \sigma_i^2 u_i, \qquad i = 1, \ldots, s. \]
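The relations in Theorem 4.2 can be observed directly with NumPy's reduced SVD. This is a small sketch on a random real matrix; the matrix sizes and variable names are arbitrary choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((7, 3))                        # r = 7 >= s = 3

# Reduced SVD: A = U diag(sigma) Vh, with U of size r x s and Vh = V^*
U, sigma, Vh = np.linalg.svd(A, full_matrices=False)
V = Vh.conj().T

left_check = A @ V - U * sigma                         # columns A v_i - sigma_i u_i
right_check = A.conj().T @ U - V * sigma               # columns A^* u_i - sigma_i v_i
gram_eigs = np.linalg.eigvalsh(A.conj().T @ A)[::-1]   # eigenvalues of A^* A, descending
```

Both `left_check` and `right_check` vanish to rounding error, and `gram_eigs` matches `sigma**2`, in line with the remark following the theorem.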

4.3 Algorithm for MMPE

The algorithm for MMPE that we present here is based on the developments of Subsection 1.3.4. Let us recall that, to determine the $\gamma_i$, we choose $k$ linearly independent vectors $q_1, \ldots, q_k$ and form the $k \times k$ linear system

\[ \sum_{j=0}^{k-1} (q_i, u_{n+j})\, c_j = -(q_i, u_{n+k}), \quad i = 1, \ldots, k. \tag{4.6} \]

This is, in fact, a system of $k$ linear equations for the $k$ unknowns $c_0, c_1, \ldots, c_{k-1}$. Here $(\cdot\,, \cdot)$ is an arbitrary inner product. With $c_0, c_1, \ldots, c_{k-1}$ determined, we set $c_k = 1$ and compute

\[ \gamma_i = \frac{c_i}{\sum_{j=0}^{k} c_j}, \quad i = 0, 1, \ldots, k, \tag{4.7} \]

provided $\sum_{j=0}^{k} c_j \ne 0$. As already mentioned, if we choose the $q_i$ to be $k$ of the standard basis vectors in $\mathbb{C}^N$ and choose $(\cdot\,, \cdot)$ to be the Euclidean inner product, then the $k \times k$ system in (4.6) is nothing but a subsystem of $U_{k-1} c' = -u_{n+k}$, where $c' = [c_0, c_1, \ldots, c_{k-1}]^T$. In other words, we are using the first $k$ of the equations $P U_{k-1} c' = -P u_{n+k}$, where $P$ is a suitable permutation matrix that permutes the rows of $U_k$. This way of implementing MMPE was originally suggested in Sidi, Ford, and Smith [299], and this is precisely how we would like to determine the $c_i$ here. Jbilou and Sadok [149] have proposed to choose this subsystem via LU factorization of $U_k$ by partial pivoting. The details of their approach follow.

We assume that $\operatorname{rank}(U_j) = j + 1$ for $j \le k$. Then, by Theorem 4.1, there exist a permutation matrix $P \in \mathbb{C}^{N \times N}$, a lower trapezoidal matrix $L_j \in \mathbb{C}^{N \times (j+1)}$ with ones along its diagonal, and a nonsingular upper triangular matrix $R_j \in \mathbb{C}^{(j+1) \times (j+1)}$ such that

\[ P U_j = L_j R_j. \tag{4.8} \]

Note that $P$ depends only on $k$. Also, when partial pivoting is used, all the elements of the matrices $L_j$ are bounded by one in modulus. Now, $L_j$ can be partitioned as

\[ L_j = \begin{bmatrix} L_j' \\ L_j'' \end{bmatrix}, \qquad L_j' \in \mathbb{C}^{(j+1) \times (j+1)}, \quad L_j'' \in \mathbb{C}^{(N-j-1) \times (j+1)}, \]

$L_j'$ being lower triangular with ones along its diagonal, and $R_j$ is as in

\[ R_j = \begin{bmatrix} r_{00} & r_{01} & \cdots & r_{0j} \\ & r_{11} & \cdots & r_{1j} \\ & & \ddots & \vdots \\ & & & r_{jj} \end{bmatrix} \in \mathbb{C}^{(j+1) \times (j+1)}. \]

In addition, $L_k$ and $R_k$ can be partitioned as

\[ L_k = \begin{bmatrix} L_{k-1}' & 0 \\ L_{k-1}'' & l_k \end{bmatrix}, \qquad l_k \in \mathbb{C}^{N-k}, \tag{4.9} \]

\[ R_k = \begin{bmatrix} R_{k-1} & \rho_k \\ 0^T & r_{kk} \end{bmatrix}, \qquad \rho_k = [r_{0k}, r_{1k}, \ldots, r_{k-1,k}]^T \in \mathbb{C}^{k}. \tag{4.10} \]

Partitioning $P U_k$ in exactly the same way as $L_k$, we obtain

\[ P U_k = P\,[\,U_{k-1} \,|\, u_{n+k}\,] = [\,P U_{k-1} \,|\, P u_{n+k}\,] = \begin{bmatrix} U_{k-1}' & u_{n+k}' \\ U_{k-1}'' & u_{n+k}'' \end{bmatrix}. \tag{4.11} \]

Then $P U_k = L_k R_k$ gives

\[ \begin{bmatrix} U_{k-1}' & u_{n+k}' \\ U_{k-1}'' & u_{n+k}'' \end{bmatrix} = \begin{bmatrix} L_{k-1}' & 0 \\ L_{k-1}'' & l_k \end{bmatrix} \begin{bmatrix} R_{k-1} & \rho_k \\ 0^T & r_{kk} \end{bmatrix}, \tag{4.12} \]

from which we obtain

\[ U_{k-1}' = L_{k-1}' R_{k-1}, \qquad u_{n+k}' = L_{k-1}' \rho_k. \tag{4.13} \]

With these developments, the first $k$ of the $N$ equations $P U_{k-1} c' = -P u_{n+k}$ give the system

\[ U_{k-1}' c' = -u_{n+k}', \tag{4.14} \]

which is the same as

\[ L_{k-1}' R_{k-1} c' = -L_{k-1}' \rho_k. \tag{4.15} \]

Since $L_{k-1}'$ is a lower triangular $k \times k$ matrix with ones along its diagonal, it is nonsingular, and hence $c'$ satisfies the $k \times k$ upper triangular linear system

\[ R_{k-1} c' = -\rho_k. \tag{4.16} \]

With $c'$ determined, we compute the $\gamma_i$ via (4.7) and compute $\xi_0, \xi_1, \ldots, \xi_{k-1}$ as in (4.4). To complete the process, we need to compute $s_{n,k}$ given in (4.5). We would like to do this in a way that saves storage. For this, we use the LU factorization $P U_{k-1} = L_{k-1} R_{k-1}$, from which we have $U_{k-1} = P^T L_{k-1} R_{k-1}$. This enables us to compute $s_{n,k}$ via

\[ s_{n,k} = x_n + U_{k-1} \xi = x_n + P^T L_{k-1} R_{k-1} \xi. \tag{4.17} \]

Thus, we can compute $s_{n,k}$ by

\[ s_{n,k} = x_n + P^T (L_{k-1} \eta), \qquad \eta = R_{k-1} \xi. \tag{4.18} \]

Of course, $L_{k-1} \eta$ is best computed as a linear combination of the columns of $L_{k-1}$. This shows that we do not have to save the matrix $U_k$ after we compute the LU factorization of $P U_k$. Actually, we know that we can overwrite $P U_k$ by $L_k$ and $R_k$ as soon as these are computed.

4.3.1 Summary of algorithm

For convenience, we now combine all of the above in the following algorithm, which implements MMPE:

1. Choose integers $n \ge 0$ and $k \ge 1$. Input the vectors $x_n, x_{n+1}, \ldots, x_{n+k+1}$.

2. Compute the vectors $u_{n+i} = \Delta x_{n+i}$, $i = 0, 1, \ldots, k$; form the $N \times (k+1)$ matrix

\[ U_k = [\,u_n \,|\, u_{n+1} \,|\, \cdots \,|\, u_{n+k}\,]; \]

and form its LU factorization by partial pivoting, namely, $P U_k = L_k R_k$, with $L_k$ and $R_k$ as in (4.9) and (4.10). (Here $P$ is a suitable permutation matrix that results from partial pivoting. We are assuming that $U_k$ has full rank.)

3. Determine the $\gamma_i$: With $\rho_k = [r_{0k}, r_{1k}, \ldots, r_{k-1,k}]^T$, solve the $k \times k$ upper triangular system

\[ R_{k-1} c' = -\rho_k, \qquad c' = [c_0, c_1, \ldots, c_{k-1}]^T. \]

Set $c_k = 1$ and $\gamma_i = c_i \big/ \sum_{j=0}^{k} c_j$, $i = 0, 1, \ldots, k$, provided $\sum_{j=0}^{k} c_j \ne 0$.

4. With the $\gamma_i$ determined, compute $\xi = [\xi_0, \xi_1, \ldots, \xi_{k-1}]^T$ via

\[ \xi_{k-1} = \gamma_k, \qquad \xi_j = \xi_{j+1} + \gamma_{j+1}, \quad j = k-2, k-3, \ldots, 1, 0. \]

5. With the $\xi_i$ determined, compute

\[ \eta = [\eta_0, \eta_1, \ldots, \eta_{k-1}]^T = R_{k-1} \xi. \]

Then compute

\[ s_{n,k} = x_n + P^T (L_{k-1} \eta). \]

Remark: The linear systems that we solve in step 3 are very small compared to the dimension N of the vectors x m ; hence their cost is negligible.
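The five steps of the summary can be sketched as follows. This is a minimal illustration assuming real data; the hand-written routine `lu_partial_pivoting` (returning a row permutation `perm` with `A[perm] = L @ R`) and the function name `mmpe` are hypothetical helpers, and a generic dense solver is used in place of a dedicated triangular solve.

```python
import numpy as np

def lu_partial_pivoting(A):
    """Return (perm, L, R) with A[perm] = L @ R; L is unit lower trapezoidal, |L_ij| <= 1."""
    A = np.array(A, dtype=float)
    n_rows, m = A.shape
    perm = np.arange(n_rows)
    L = np.zeros((n_rows, m))
    R = np.zeros((m, m))
    for j in range(m):
        p = j + np.argmax(np.abs(A[j:, j]))   # pivot: largest modulus in column j, rows >= j
        A[[j, p]] = A[[p, j]]                 # swap rows in the working matrix,
        L[[j, p]] = L[[p, j]]                 # in the already computed part of L,
        perm[[j, p]] = perm[[p, j]]           # and record the permutation
        R[j, j:] = A[j, j:]
        L[j:, j] = A[j:, j] / A[j, j]
        A[j + 1:, j:] -= np.outer(L[j + 1:, j], R[j, j:])
    return perm, L, R

def mmpe(xs):
    """xs has shape (k+2, N): rows x_n, ..., x_{n+k+1}. Returns (s_{n,k}, gamma, perm, R)."""
    xs = np.asarray(xs, dtype=float)
    k = xs.shape[0] - 2
    U = (xs[1:] - xs[:-1]).T                                    # step 2: columns u_{n+i}
    perm, L, R = lu_partial_pivoting(U)                         # P U_k = L_k R_k
    c = np.append(np.linalg.solve(R[:k, :k], -R[:k, k]), 1.0)   # step 3
    total = c.sum()
    if total == 0.0:
        raise ZeroDivisionError("s_{n,k} does not exist for MMPE")
    gamma = c / total
    xi = np.cumsum(gamma[::-1])[::-1][1:]                       # step 4: xi_j = sum_{i>j} gamma_i
    eta = R[:k, :k] @ xi                                        # step 5
    s = xs[0].copy()
    s[perm] += L[:, :k] @ eta                                   # s = x_n + P^T (L_{k-1} eta)
    return s, gamma, perm, R

rng = np.random.default_rng(2)
N, k = 9, 3
xs = rng.standard_normal((k + 2, N))          # arbitrary test vectors
s, gamma, perm, R = mmpe(xs)
```

By construction, the $k$ rows of $U_k \gamma$ selected by the pivoting (the first $k$ entries of `perm`) vanish, which is the defining property of MMPE with standard basis vectors as the $q_i$.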

4.4 Error estimation via algorithm for MMPE

As for error estimation without having to actually compute $s_{n,k}$, we proceed as in the case of MPE and RRE by using the residual vector $r(s_{n,k})$. As was shown originally in Sidi [266] and as we have discussed in detail in Section 2.5, (i) $r(s_{n,k}) = U_k \gamma$ for linear systems and (ii) $r(s_{n,k}) \approx U_k \gamma$ for nonlinear systems. Thus, in both cases, we have to either compute exactly, or estimate, some norm of $U_k \gamma$. In the case of MMPE, this can be done both in the $l_\infty$-norm and in the $l_2$-norm. Actually, we have the following result due to Jbilou and Sadok [149].

Theorem 4.3. The vector $U_k \gamma$ satisfies

\[ \| U_k \gamma \|_\infty = |r_{kk}|\, |\gamma_k| \quad \text{and} \quad \| U_k \gamma \|_2 \le |r_{kk}|\, |\gamma_k| \sqrt{N - k}. \tag{4.19} \]

Here $r_{kk}$ is the last diagonal element of $R_k$.

Proof. First,

\[ U_k \gamma = P^T L_k R_k \gamma, \]

and, with $\mu = \sum_{i=0}^{k} c_i$ and $\gamma_k = 1/\mu$, and by (4.16),

\[ R_k \gamma = \frac{1}{\mu} \begin{bmatrix} R_{k-1} & \rho_k \\ 0^T & r_{kk} \end{bmatrix} \begin{bmatrix} c' \\ 1 \end{bmatrix} = \gamma_k \begin{bmatrix} R_{k-1} c' + \rho_k \\ r_{kk} \end{bmatrix} = r_{kk} \gamma_k \begin{bmatrix} 0 \\ 1 \end{bmatrix}. \]

Consequently,

\[ L_k R_k \gamma = r_{kk} \gamma_k \begin{bmatrix} L_{k-1}' & 0 \\ L_{k-1}'' & l_k \end{bmatrix} \begin{bmatrix} 0 \\ 1 \end{bmatrix} = r_{kk} \gamma_k \begin{bmatrix} 0 \\ l_k \end{bmatrix}. \]

Now, the first entry of $l_k$ is one and the remaining entries are bounded by one in modulus. Therefore, $\| l_k \|_\infty = 1$ and $\| l_k \|_2 \le \sqrt{N - k}$. In addition, by the fact that $P^T$ is also a permutation matrix,

\[ \| U_k \gamma \|_\infty = \| P^T L_k R_k \gamma \|_\infty = \| L_k R_k \gamma \|_\infty = |r_{kk} \gamma_k|\, \| l_k \|_\infty = |r_{kk} \gamma_k| \]

and

\[ \| U_k \gamma \|_2 = \| P^T L_k R_k \gamma \|_2 = \| L_k R_k \gamma \|_2 = |r_{kk} \gamma_k|\, \| l_k \|_2 \le |r_{kk} \gamma_k| \sqrt{N - k}. \]

This completes the proof.

4.5 Algorithm for SVD-MPE

The algorithm for SVD-MPE that we present here is based on the developments of Subsection 1.3.5. Let us recall that, to determine the $\gamma_i$, we first solve the minimization problem

\[ \min_{c} \| U_k c \|_2 \quad \text{subject to} \quad \| c \|_2 = 1. \tag{4.20} \]

The solution $c = [c_0, c_1, \ldots, c_k]^T$ is the right singular vector $h_{\min}$ corresponding to the smallest singular value $\sigma_{\min}$ of $U_k$. [That is, $\sigma_{\min}^2$ is the smallest eigenvalue of the matrix $U_k^* U_k$, and $h_{\min}$ (with $h_{\min}^* h_{\min} = 1$) is the corresponding normalized eigenvector.] Thus,

\[ \min_{\| c \|_2 = 1} \| U_k c \|_2 = \| U_k h_{\min} \|_2 = \sigma_{\min}. \tag{4.21} \]

Assuming that $\operatorname{rank}(U_k) = k + 1$, we have $\sigma_{\min} > 0$. With $c = h_{\min}$ available, we next set

\[ \gamma_i = \frac{c_i}{\sum_{j=0}^{k} c_j}, \quad i = 0, 1, \ldots, k, \tag{4.22} \]

provided $\sum_{j=0}^{k} c_j \ne 0$. Finally, we compute $s_{n,k}$ via (4.4) and (4.5).

Now, $\sigma_{\min} > 0$ and $h_{\min}$ can be obtained from the SVD of $U_k \in \mathbb{C}^{N \times (k+1)}$. Of course, the SVD of $U_k$ can be computed by applying directly to $U_k$ the algorithm of Golub and Kahan [102], for example. Here we choose to apply SVD to $U_k$ indirectly, which will result in a very efficient algorithm for SVD-MPE that is economical both computationally and storagewise. Here are the details of the computation of the SVD of $U_k$, which was originally suggested by Chan [57]:

1. We first compute the QR factorization of $U_k$, as before, in the form

\[ U_k = Q_k R_k, \qquad Q_k \in \mathbb{C}^{N \times (k+1)}, \quad R_k \in \mathbb{C}^{(k+1) \times (k+1)}, \tag{4.23} \]

where $Q_k$ is unitary (that is, $Q_k^* Q_k = I_{k+1}$) and $R_k$ is upper triangular with positive diagonal elements, that is,

\[ Q_k = [\,q_0 \,|\, q_1 \,|\, \cdots \,|\, q_k\,], \qquad R_k = \begin{bmatrix} r_{00} & r_{01} & \cdots & r_{0k} \\ & r_{11} & \cdots & r_{1k} \\ & & \ddots & \vdots \\ & & & r_{kk} \end{bmatrix}, \tag{4.24} \]

\[ q_i^* q_j = \delta_{ij} \ \ \forall\, i, j, \qquad r_{ij} = q_i^* u_{n+j} \ \ \forall\, i \le j, \qquad r_{ii} > 0 \ \ \forall\, i. \]

(Of course, we can carry out the QR factorizations in different ways; we can do this using MGS, for example.)

2. We next compute the SVD of $R_k$: By Theorem 4.2, there exist unitary matrices $Y, H \in \mathbb{C}^{(k+1) \times (k+1)}$,

\[ Y = [\,y_0 \,|\, y_1 \,|\, \cdots \,|\, y_k\,], \quad Y^* Y = I_{k+1}, \qquad H = [\,h_0 \,|\, h_1 \,|\, \cdots \,|\, h_k\,], \quad H^* H = I_{k+1}, \tag{4.25} \]

and a diagonal matrix $\Sigma \in \mathbb{C}^{(k+1) \times (k+1)}$,

\[ \Sigma = \operatorname{diag}(\sigma_0, \sigma_1, \ldots, \sigma_k), \qquad \sigma_0 \ge \sigma_1 \ge \cdots \ge \sigma_k \ge 0, \tag{4.26} \]

such that

\[ R_k = Y \Sigma H^*. \tag{4.27} \]

In addition, since $R_k$ is nonsingular by our assumption that $\operatorname{rank}(U_k) = k + 1$, we have that $\sigma_i > 0$ for all $i$. Consequently, $\sigma_{\min} = \sigma_k > 0$.

3. Substituting (4.27) into (4.23), we obtain the following true SVD of $U_k$:

\[ U_k = G \Sigma H^*, \qquad G = Q_k Y \in \mathbb{C}^{N \times (k+1)} \ \text{unitary}, \quad G^* G = I_{k+1}, \qquad G = [\,g_0 \,|\, g_1 \,|\, \cdots \,|\, g_k\,], \quad g_i^* g_j = \delta_{ij}. \tag{4.28} \]

Thus, $\sigma_i$, the singular values of $R_k$, are also the singular values of $U_k$, and $h_i$, the corresponding right singular vectors of $R_k$, are also the corresponding right singular vectors of $U_k$. [Of course, the $g_i$ are corresponding left singular vectors of $U_k$. Note that, unlike $Y$, $H$, and $\Sigma$, which we must compute for our algorithm, we do not need to actually compute $G$. The mere knowledge that the SVD of $U_k$ is as given in (4.28) suffices to conclude that $c = h_{\min} = h_k$ is the required optimal solution to (4.20). From this point, we continue with the development of our algorithm.]

With $c = h_k$ already determined, we next compute the $\gamma_i$ as in (4.22) and the $\xi_i$ as in (4.4). Finally, we compute $s_{n,k}$ as follows: Making use of the fact that $U_{k-1} = Q_{k-1} R_{k-1}$, with

\[ Q_{k-1} = [\,q_0 \,|\, q_1 \,|\, \cdots \,|\, q_{k-1}\,], \qquad R_{k-1} = \begin{bmatrix} r_{00} & r_{01} & \cdots & r_{0,k-1} \\ & r_{11} & \cdots & r_{1,k-1} \\ & & \ddots & \vdots \\ & & & r_{k-1,k-1} \end{bmatrix}, \tag{4.29} \]

we rewrite (4.5) as

\[ s_{n,k} = x_n + Q_{k-1} (R_{k-1} \xi). \tag{4.30} \]

Thus, $s_{n,k}$ can be computed economically as

\[ s_{n,k} = x_n + Q_{k-1} \eta, \qquad \eta = R_{k-1} \xi, \qquad \eta = [\eta_0, \eta_1, \ldots, \eta_{k-1}]^T. \tag{4.31} \]

Of course, $Q_{k-1} \eta$ is best computed as a linear combination of the columns of $Q_{k-1}$; hence (4.31) is computed as

\[ s_{n,k} = x_n + \sum_{i=0}^{k-1} \eta_i q_i. \tag{4.32} \]

As we have already seen in Chapter 2, we can overwrite the matrix $U_k$ with the matrix $Q_k$ simultaneously with the computation of $Q_k$ and $R_k$; that is, at any stage of the QR factorization, we store $k + 2$ $N$-dimensional vectors in memory. Since $N \gg k$ in our applications, the storage requirement of the $(k+1) \times (k+1)$ matrix $R_k$ is negligible. So is the cost of computing the SVD of $R_k$, and so is the cost of computing the $(k+1)$-dimensional vector $\eta$. For all practical purposes, the computational and storage requirements of SVD-MPE are the same as those of MPE and RRE.

4.5.1 Summary of algorithm

For convenience, we now combine all of the above in the following algorithm, which implements SVD-MPE:

1. Choose integers $n \ge 0$ and $k \ge 1$. Input the vectors $x_n, x_{n+1}, \ldots, x_{n+k+1}$.

2. Compute the vectors $u_{n+i} = \Delta x_{n+i}$, $i = 0, 1, \ldots, k$; form the $N \times (k+1)$ matrix

\[ U_k = [\,u_n \,|\, u_{n+1} \,|\, \cdots \,|\, u_{n+k}\,]; \]

and form its QR factorization (preferably by MGS), namely, $U_k = Q_k R_k$: $Q_k \in \mathbb{C}^{N \times (k+1)}$ unitary and $R_k \in \mathbb{C}^{(k+1) \times (k+1)}$ upper triangular,

\[ Q_k = [\,q_0 \,|\, q_1 \,|\, \cdots \,|\, q_k\,], \quad Q_k^* Q_k = I_{k+1}, \qquad R_k = [r_{ij}]_{0 \le i, j \le k}, \quad r_{ij} = q_i^* u_{n+j}. \]

(We are assuming that $U_k$ has full rank.)

3. Compute the SVD of $R_k$, namely, $R_k = Y \Sigma H^*$:

\[ Y = [\,y_0 \,|\, y_1 \,|\, \cdots \,|\, y_k\,], \quad H = [\,h_0 \,|\, h_1 \,|\, \cdots \,|\, h_k\,] \ \text{both unitary}, \quad Y^* Y = I_{k+1}, \quad H^* H = I_{k+1}, \]

\[ \Sigma = \operatorname{diag}(\sigma_0, \sigma_1, \ldots, \sigma_k), \qquad \sigma_0 \ge \sigma_1 \ge \cdots \ge \sigma_k > 0. \]

4. Determine the $\gamma_i$: With $c = [c_0, c_1, \ldots, c_k]^T$ and $h_k = [h_{k0}, h_{k1}, \ldots, h_{kk}]^T$, set $c = h_k$, that is, $c_i = h_{ki}$, $i = 0, 1, \ldots, k$. Compute $\gamma_i = c_i \big/ \sum_{j=0}^{k} c_j$, $i = 0, 1, \ldots, k$, provided $\sum_{j=0}^{k} c_j \ne 0$.

5. With the $\gamma_i$ determined, compute $\xi = [\xi_0, \xi_1, \ldots, \xi_{k-1}]^T$ via

\[ \xi_{k-1} = \gamma_k, \qquad \xi_j = \xi_{j+1} + \gamma_{j+1}, \quad j = k-2, k-3, \ldots, 1, 0. \]

6. With the $\xi_i$ determined, compute $s_{n,k} = x_n + U_{k-1} \xi$ as follows: Compute

\[ \eta = [\eta_0, \eta_1, \ldots, \eta_{k-1}]^T = R_{k-1} \xi. \]

Then compute

\[ s_{n,k} = x_n + \sum_{i=0}^{k-1} \eta_i q_i. \]

Remark: The SVD problem that we solve in step 3 is very small compared to the dimension N of the vectors x m ; hence its cost is negligible.
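The steps of the summary can be sketched with NumPy as follows. This is an illustration under stated assumptions, not a reference implementation: the data are real, and `np.linalg.qr` (Householder-based, so the diagonal of $R_k$ need not be positive) stands in for MGS, which does not affect $U_{k-1} = Q_{k-1} R_{k-1}$ or the singular values. The function name `svd_mpe` and the random test vectors are hypothetical.

```python
import numpy as np

def svd_mpe(xs):
    """xs has shape (k+2, N): rows x_n, ..., x_{n+k+1}. Returns (s_{n,k}, gamma, sigma_k, h_k)."""
    xs = np.asarray(xs, dtype=float)
    k = xs.shape[0] - 2
    U = (xs[1:] - xs[:-1]).T                  # step 2: columns u_{n+i}
    Q, R = np.linalg.qr(U)                    # reduced QR: Q is N x (k+1), R is (k+1) x (k+1)
    _, sigma, Hh = np.linalg.svd(R)           # step 3: R = Y diag(sigma) H^*, Hh = H^*
    h_k = Hh[-1].conj()                       # right singular vector for the smallest sigma
    total = h_k.sum()
    if total == 0.0:
        raise ZeroDivisionError("s_{n,k} does not exist for SVD-MPE")
    gamma = h_k / total                       # step 4
    xi = np.cumsum(gamma[::-1])[::-1][1:]     # step 5: xi_j = sum_{i>j} gamma_i
    eta = R[:k, :k] @ xi                      # step 6
    s = xs[0] + Q[:, :k] @ eta
    return s, gamma, sigma[-1], h_k

rng = np.random.default_rng(4)
N, k = 12, 4
xs = rng.standard_normal((k + 2, N))          # arbitrary test vectors
s, gamma, sigma_k, h_k = svd_mpe(xs)
```

Only the small $(k+1) \times (k+1)$ matrix $R_k$ is passed to the SVD routine, which is the point of the indirect approach described above.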

4.6 Error estimation via algorithm for SVD-MPE

As for error estimation without having to actually compute $s_{n,k}$, we proceed by using the residual vector $r(s_{n,k})$ as before. Again, $r(s_{n,k}) = U_k \gamma$ for linear systems, and $r(s_{n,k}) \approx U_k \gamma$ for nonlinear systems. Concerning $U_k \gamma$, we have the following result.

Theorem 4.4. Let $\sigma_k$ be the smallest singular value of $U_k$, and let $h_k$ be the corresponding right singular vector. Let also $\hat{e} = [1, 1, \ldots, 1]^T \in \mathbb{C}^{k+1}$. Then $\| U_k \gamma \|_2$ satisfies

\[ \| U_k \gamma \|_2 = \frac{\sigma_k}{|\hat{e}^T h_k|}. \tag{4.33} \]

Proof. By the fact that $\gamma = c \big/ \sum_{j=0}^{k} c_j$, where $c = h_k$, we first have

\[ U_k \gamma = \frac{U_k c}{\mu}; \qquad \mu = \sum_{j=0}^{k} c_j = \hat{e}^T h_k. \]

Thus, by the fact that $\| c \|_2 = 1$, we have

\[ \| U_k \gamma \|_2 = \frac{\| U_k c \|_2}{|\mu|} = \frac{\sigma_k}{|\mu|}, \]

from which the result follows.

Chapter 5

Epsilon Algorithms

5.1 Introduction An important sequence transformation used to accelerate the convergence of infinite sequences of scalars is the Shanks transformation. This transformation was originally derived by Schmidt [252] for solving linear systems by iteration. After it was neglected for a long time, it was resurrected by Shanks [255], who also gave a detailed study of its remarkable properties. Shanks’s paper was followed by that of Wynn [350], in which the epsilon algorithm, a most elegant and efficient implementation of the Shanks transformation, was presented. The papers of Shanks and Wynn made an enormous impact and paved the way for more research in sequence transformations. In this chapter, we first discuss in some detail the Shanks transformation and the epsilon algorithm of Wynn that implements it and then mention how it is used to accelerate the convergence of vector sequences as the scalar epsilon algorithm (SEA). Following this, we discuss the vector epsilon algorithm (VEA) of Wynn, which is obtained via a very interesting generalization of his scalar algorithm. Finally, we discuss another generalization of the epsilon algorithm due to Brezinski, which is called the topological epsilon algorithm (TEA). The first work in which the Shanks transformation and the epsilon algorithm are studied in detail is the book by Brezinski [29]. This book was also the first to treat the vector versions of the Shanks transformation. These topics are also treated in the book by Brezinski and Redivo Zaglia [36]. For a detailed study of the Shanks transformation containing the most recent developments, see Sidi [271]; [288]; and [278, Chapter 16]. Another algorithm (denoted the FS/qd algorithm) that implements the Shanks transformation was proposed in Sidi [278, Chapter 21]. The FS/qd algorithm is as efficient as the epsilon algorithm.

5.2 SEA

In this section, we will dwell on the derivation and implementation of the Shanks transformation. As our derivation will proceed through the solution of linear recursion relations with constant coefficients, we will discuss the latter briefly first.


5.2.1 Linear homogeneous recursions with constant coefficients

Let $z_0, z_1, \ldots$ satisfy the $(k+1)$-term recursion relation
\[
\sum_{i=0}^{k} a_i z_{m+i} = 0, \quad m = 0, 1, \ldots; \quad a_k = 1, \quad a_0 \neq 0. \tag{5.1}
\]

We are interested in solving this recursion relation for $z_m$. Consider the polynomial $S(\lambda) = \sum_{i=0}^{k} a_i\lambda^i$. $S(\lambda)$ is of degree exactly $k$ since $a_k \neq 0$. Therefore, it has exactly $k$ roots, counting multiplicities. Furthermore, none of these roots are zero since $a_0 \neq 0$. Let $\lambda_1, \ldots, \lambda_t$ be the distinct zeros of $S(\lambda)$ with respective multiplicities $\omega_1, \ldots, \omega_t$. Thus, $S(\lambda) = \prod_{i=1}^{t}(\lambda - \lambda_i)^{\omega_i}$ and $\sum_{i=1}^{t}\omega_i = k$. Then the general solution to this recursion is
\[
z_m = \sum_{i=1}^{t} p_i(m)\lambda_i^m, \quad m = 0, 1, \ldots, \tag{5.2}
\]
where, for each $i$, $p_i(m)$ is a polynomial in $m$ of degree $\omega_i - 1$. Of course, the polynomials $p_i(m)$ have a total of $k$ coefficients that can be determined by prescribing $z_0, z_1, \ldots, z_{k-1}$ as $k$ initial values.

Let us now consider the reverse problem: Suppose that the elements of the sequence $\{z_m\}$ are precisely of the form given in (5.2), where $\lambda_1, \ldots, \lambda_t$ are distinct and, for each $i$, the polynomial $p_i(m)$ is of degree exactly $\omega_i - 1$, $i = 1, \ldots, t$. Set $k = \sum_{i=1}^{t}\omega_i$. Then the $z_m$ satisfy a unique $(k+1)$-term recursion relation of the form given in (5.1), with $\sum_{i=0}^{k} a_i\lambda^i = \prod_{i=1}^{t}(\lambda - \lambda_i)^{\omega_i}$. Furthermore, the $z_m$ do not satisfy a $(k'+1)$-term recursion relation with $k' < k$; in this sense, $k$ is minimal.
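As a quick check of the two statements above, the following sketch builds the coefficients $a_i$ from a set of roots and multiplicities and verifies, in exact rational arithmetic, that a sequence of the form (5.2) satisfies (5.1). The particular roots, the polynomials $p_i(m)$, and the helper name `poly_from_roots` are our own illustrative choices, not part of the original development.

```python
from fractions import Fraction

def poly_from_roots(roots_with_mult):
    """Ascending coefficients a_0..a_k of S(lambda) = prod (lambda - lambda_i)^omega_i."""
    coeffs = [Fraction(1)]                           # the constant polynomial 1
    for root, mult in roots_with_mult:
        for _ in range(mult):
            # multiply the current polynomial by (lambda - root)
            shifted = [Fraction(0)] + coeffs         # lambda * p(lambda)
            scaled = [-root * c for c in coeffs] + [Fraction(0)]
            coeffs = [s + t for s, t in zip(shifted, scaled)]
    return coeffs

# Hypothetical data: lambda_1 = 1/2 (omega_1 = 2), lambda_2 = -1/3 (omega_2 = 1), so k = 3.
a = poly_from_roots([(Fraction(1, 2), 2), (Fraction(-1, 3), 1)])
k = len(a) - 1

def z(m):
    # p_1(m) = 2 + 3m has degree omega_1 - 1 = 1; p_2(m) = 5 has degree 0.
    return (2 + 3 * m) * Fraction(1, 2) ** m + 5 * Fraction(-1, 3) ** m

# Left-hand side of (5.1) for m = 0, ..., 5; every entry should vanish.
residuals = [sum(a[i] * z(m + i) for i in range(k + 1)) for m in range(6)]
```

Because the arithmetic is exact, the residuals are zeros rather than small rounding errors; with floating-point data one would compare against a tolerance instead.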

5.2.2 Derivation of the Shanks transformation

Let us assume that $x_0, x_1, \ldots$ are of the form
\[
x_m = s + \sum_{i=1}^{t} p_i(m)\lambda_i^m, \quad m = 0, 1, \ldots, \tag{5.3}
\]
where
\[
\lambda_i \neq \lambda_j \quad \forall\, i \neq j, \qquad \lambda_i \neq 0, 1 \quad \forall\, i, \tag{5.4}
\]
and for each $i$, $p_i(m)$ is a polynomial in $m$ of degree exactly $\omega_i - 1$. Note that, if $|\lambda_i| < 1$ for all $i$, we have $\lim_{m\to\infty} x_m = s$. Assuming that we know the $x_m$, but we do not know the $p_i(m)$, the $\lambda_i$, and $s$, we would like to determine $s$ from our knowledge of the $x_m$ only.

We start by rewriting (5.3) in the form
\[
x_m - s = \sum_{i=1}^{t} p_i(m)\lambda_i^m, \quad m = 0, 1, \ldots. \tag{5.5}
\]

We already know from the last paragraph of the preceding subsection that the $(x_m - s)$ satisfy a unique recursion relation of the form
\[
\sum_{i=0}^{k} a_i (x_{m+i} - s) = 0, \quad m = 0, 1, \ldots; \quad a_k = 1, \quad a_0 \neq 0, \quad k = \sum_{i=1}^{t}\omega_i, \tag{5.6}
\]


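where the $a_i$ are defined via
\[
S(\lambda) = \sum_{i=0}^{k} a_i\lambda^i = \prod_{i=1}^{t}(\lambda - \lambda_i)^{\omega_i}. \tag{5.7}
\]
From (5.6), we obtain
\[
\sum_{i=0}^{k} a_i x_{m+i} = \Bigl(\sum_{i=0}^{k} a_i\Bigr) s, \quad m = 0, 1, \ldots. \tag{5.8}
\]
Now, since $\lambda_i \neq 1$ for all $i$, we have that $\sum_{i=0}^{k} a_i = S(1) \neq 0$. Therefore, we can divide both sides of (5.8) by $\sum_{i=0}^{k} a_i$ and obtain
\[
s = \frac{\sum_{i=0}^{k} a_i x_{m+i}}{\sum_{i=0}^{k} a_i}, \quad m = 0, 1, \ldots, \tag{5.9}
\]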

which we can also write in the form
\[
s = \sum_{i=0}^{k}\gamma_i x_{m+i}, \quad m = 0, 1, \ldots; \quad \gamma_i = \frac{a_i}{\sum_{j=0}^{k} a_j}, \quad i = 0, 1, \ldots, k; \quad \sum_{i=0}^{k}\gamma_i = 1. \tag{5.10}
\]
Treating $s$ and the $\gamma_i$ as unknowns now, we realize that they can be determined as the solution to the linear system
\[
\sum_{i=0}^{k}\gamma_i x_{m+i} = s, \quad m = n, n+1, \ldots, n+k, \qquad \sum_{i=0}^{k}\gamma_i = 1, \tag{5.11}
\]
where $n$ is some arbitrary nonnegative integer. Taking differences of the first $k+1$ equations in (5.11), we obtain the following linear system for the $\gamma_i$:
\[
\sum_{i=0}^{k}\gamma_i \Delta x_{m+i} = 0, \quad m = n, n+1, \ldots, n+k-1, \qquad \sum_{i=0}^{k}\gamma_i = 1. \tag{5.12}
\]
(As usual, $\Delta x_m = x_{m+1} - x_m$, $m = 0, 1, \ldots$.) Of course, once the $\gamma_i$ have been determined, the first equation in (5.11) gives
\[
s = \sum_{i=0}^{k}\gamma_i x_{n+i}. \tag{5.13}
\]

Summarizing, we have seen that when the sequence $\{x_m\}$ is as described in the first paragraph of this subsection, $s$ is given by (5.13), with the $\gamma_i$ as the solution to (5.12). When the $x_m$ are not necessarily as in the first paragraph of this subsection, we can use the developments above to define the Shanks transformation.

Definition 5.1. Let $\{x_m\}$ be an arbitrary sequence. Then, for any two integers $n$ and $k$, the Shanks transformation produces an approximation to the limit or antilimit of this sequence that is given by
\[
e_k(x_n) = \sum_{i=0}^{k}\gamma_i x_{n+i}, \tag{5.14}
\]


the $\gamma_i$ being the solution to the linear system
\[
\sum_{i=0}^{k}\gamma_i \Delta x_{m+i} = 0, \quad m = n, n+1, \ldots, n+k-1, \qquad \sum_{i=0}^{k}\gamma_i = 1. \tag{5.15}
\]

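In view of (5.14) and (5.15), Lemma 1.14 applies, and $e_k(x_n)$ has the determinant representation
\[
e_k(x_n) = \frac{\begin{vmatrix}
x_n & x_{n+1} & \cdots & x_{n+k}\\
\Delta x_n & \Delta x_{n+1} & \cdots & \Delta x_{n+k}\\
\vdots & \vdots & & \vdots\\
\Delta x_{n+k-1} & \Delta x_{n+k} & \cdots & \Delta x_{n+2k-1}
\end{vmatrix}}{\begin{vmatrix}
1 & 1 & \cdots & 1\\
\Delta x_n & \Delta x_{n+1} & \cdots & \Delta x_{n+k}\\
\vdots & \vdots & & \vdots\\
\Delta x_{n+k-1} & \Delta x_{n+k} & \cdots & \Delta x_{n+2k-1}
\end{vmatrix}}. \tag{5.16}
\]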
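Definition 5.1 can be turned into code directly by solving the linear system (5.15) for the $\gamma_i$. The sketch below does this with a small Gauss–Jordan solver in exact rational arithmetic. The helper names `solve` and `shanks` and the test sequence are assumptions made for illustration; this is a direct transcription of the definition, not an efficient way of computing $e_k(x_n)$.

```python
from fractions import Fraction

def solve(A, b):
    """Gauss-Jordan elimination with row swaps; A is a square list of rows."""
    n = len(A)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [v - factor * w for v, w in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]

def shanks(x, n, k):
    """e_k(x_n) from (5.14)-(5.15); uses x[n], ..., x[n + 2k]."""
    dx = [x[m + 1] - x[m] for m in range(len(x) - 1)]
    # k difference equations plus the normalization sum(gamma) = 1
    A = [[dx[m + i] for i in range(k + 1)] for m in range(n, n + k)]
    A.append([Fraction(1)] * (k + 1))
    b = [Fraction(0)] * k + [Fraction(1)]
    gamma = solve(A, b)
    return sum(g * x[n + i] for i, g in enumerate(gamma))

# Hypothetical sequence of the form (5.3): s = 3, lambda_1 = 1/2, lambda_2 = 1/3.
x = [3 + 2 * Fraction(1, 2) ** m - Fraction(1, 3) ** m for m in range(6)]
s_from_n0 = shanks(x, 0, 2)
s_from_n1 = shanks(x, 1, 2)
```

For this sequence, which has two simple $\lambda_i$, the choice $k = 2$ reproduces $s$ for either starting index $n$, while $k = 1$ gives only an approximation.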

Remarks:

1. Clearly, if $\{x_m\}$ is exactly as described in the first paragraph of this subsection, and we set $k = \sum_{i=1}^{t}\omega_i$, then $e_k(x_n) = s$ for all $n$. This was first shown by Brezinski and Crouzeix [35].

2. It turns out that the Shanks transformation accelerates the convergence of sequences $\{x_m\}$ described in the first paragraph of this subsection in the following sense: When $k = \sum_{i=1}^{r}\omega_i$, $r < t$, $e_k(x_n)$ converges to $s$ as $n \to \infty$ faster than $x_n$ provided
\[
|\lambda_1| \geq |\lambda_2| \geq \cdots \geq |\lambda_r| > |\lambda_{r+1}| \geq \cdots.
\]
This was shown by Sidi [271]. [When $k < \sum_{i=1}^{t}\omega_i$, $e_k(x_n) \neq s$.]

3. Letting $k = 1$ in (5.16), we obtain
\[
e_1(x_n) = \frac{x_n(\Delta x_{n+1}) - x_{n+1}(\Delta x_n)}{\Delta x_{n+1} - \Delta x_n} = \frac{x_n x_{n+2} - x_{n+1}^2}{x_n - 2x_{n+1} + x_{n+2}},
\]
which is nothing but the well-known Aitken $\Delta^2$-process that is used in accelerating the convergence of some linearly convergent sequences, that is, those sequences for which
\[
\lim_{m\to\infty}\frac{x_{m+1} - s}{x_m - s} = \lim_{m\to\infty}\frac{\Delta x_{m+1}}{\Delta x_m} = \rho \neq 1.
\]
If $|\rho| < 1$, $s = \lim_{m\to\infty} x_m$. Otherwise, $\{x_m\}$ diverges, and $s$ is its antilimit. For a survey of the properties of the $\Delta^2$-process, see Sidi [278, Chapter 15], for example.

4. To compute $e_k(x_n)$, we need $x_n, x_{n+1}, \ldots, x_{n+2k}$, a total of $2k + 1$ terms of $\{x_m\}$.

An important and interesting property of the Shanks transformation is its connection with the Padé table. We mention that the $[r/s]$ Padé approximant from the infinite power series $f(z) = \sum_{i=0}^{\infty} a_i z^i$ is the rational function $f_{r,s}(z) = P(z)/Q(z)$,


with numerator polynomial $P(z)$ of degree at most $r$ and denominator polynomial $Q(z)$ of degree at most $s$, $Q(0) = 1$, that satisfies $f_{r,s}(z) - f(z) = O(z^{r+s+1})$ as $z \to 0$. Furthermore, if it exists, $f_{r,s}(z)$ is unique. For the subject of the Padé table, see the books by Baker [13], Baker and Graves-Morris [14], and Gilewicz [97], and the survey paper by Gragg [106]. See also Sidi [278, Chapter 17] for a detailed summary. The following result was given by Shanks [255].

Theorem 5.2. Let $f_m(z) = \sum_{i=0}^{m} a_i z^i$, $m = 0, 1, \ldots$. If the Shanks transformation is applied to the sequence $\{f_m(z)\}$, then $e_k(f_n(z)) = f_{n+k,k}(z)$, where $f_{r,s}(z)$ is the $[r/s]$ Padé approximant from the infinite power series $f(z) = \sum_{i=0}^{\infty} a_i z^i$.
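A small instance of Theorem 5.2 can be checked by hand or by machine. The sketch below takes the exponential series ($a_i = 1/i!$, our choice of example), applies $e_1$ in its Aitken form to $f_0, f_1, f_2$ at one evaluation point, and compares the result with the well-known $[1/1]$ Padé approximant $(1 + z/2)/(1 - z/2)$ of $e^z$. The function names are hypothetical.

```python
from fractions import Fraction
from math import factorial

def aitken(x0, x1, x2):
    """e_1(x_n) = (x_n x_{n+2} - x_{n+1}^2) / (x_n - 2 x_{n+1} + x_{n+2})."""
    return (x0 * x2 - x1 * x1) / (x0 - 2 * x1 + x2)

def partial_sum(m, z):
    """f_m(z) for the exponential series, a_i = 1/i!."""
    return sum(Fraction(1, factorial(i)) * z ** i for i in range(m + 1))

z = Fraction(1, 3)                      # arbitrary rational evaluation point
e1 = aitken(partial_sum(0, z), partial_sum(1, z), partial_sum(2, z))
pade_11 = (1 + z / 2) / (1 - z / 2)     # [1/1] Pade approximant of exp(z)
```

Here $n = 0$ and $k = 1$, so the theorem predicts $e_1(f_0(z)) = f_{1,1}(z)$; the two values agree exactly in rational arithmetic.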

5.2.3 Algorithms for the Shanks transformation

Epsilon algorithm

Although the Shanks transformation is defined via the determinant representation of Definition 5.1, the computation of $e_k(x_n)$ via this representation is not efficient, especially when $k$ is large. The following recursive algorithm, which implements it most efficiently and elegantly, is known as the epsilon algorithm and was developed by Wynn [350]. As the derivation of this algorithm is very complicated, we do not give it here. For this, we refer the reader to Wynn [350] or to Wimp [345, pp. 244–247]. Here are the steps of the epsilon algorithm:

1. Set
\[
\varepsilon_{-1}^{(n)} = 0, \quad \varepsilon_0^{(n)} = x_n, \quad n = 0, 1, \ldots. \tag{5.17}
\]

2. Compute the $\varepsilon_k^{(n)}$ via the recursion
\[
\varepsilon_{k+1}^{(n)} = \varepsilon_{k-1}^{(n+1)} + \frac{1}{\varepsilon_k^{(n+1)} - \varepsilon_k^{(n)}}, \quad n, k = 0, 1, \ldots. \tag{5.18}
\]

Commonly, the $\varepsilon_k^{(n)}$ are arranged in a two-dimensional array as in Table 5.1. This array is known as the epsilon table. Note that the sequences $\{\varepsilon_k^{(n)}\}_{n=0}^{\infty}$ ($k$ fixed) form the columns of the epsilon table, and the sequences $\{\varepsilon_k^{(n)}\}_{k=0}^{\infty}$ ($n$ fixed) form its diagonals.

Concerning the $\varepsilon_k^{(n)}$, Wynn [350] has proved the following result.

Theorem 5.3. The entries of the epsilon table satisfy
\[
\varepsilon_{2k}^{(n)} = e_k(x_n) \quad\text{and}\quad \varepsilon_{2k+1}^{(n)} = \frac{1}{e_k(\Delta x_n)} \quad (\Delta x_i = x_{i+1} - x_i). \tag{5.19}
\]
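The recursion (5.17)–(5.18) translates almost line by line into code. The following is a minimal sketch in exact rational arithmetic; the function name `epsilon_table`, the column-wise storage, and the test sequence are our own assumptions. No safeguard against a vanishing denominator in (5.18) is included.

```python
from fractions import Fraction

def epsilon_table(x):
    """Return cols with cols[k][n] = eps_k^{(n)} for k = 0, ..., len(x) - 1."""
    prev = [Fraction(0)] * (len(x) + 1)     # eps_{-1}^{(n)} = 0
    curr = list(x)                          # eps_0^{(n)} = x_n
    cols = [curr]
    while len(curr) > 1:
        # (5.18): eps_{k+1}^{(n)} = eps_{k-1}^{(n+1)} + 1/(eps_k^{(n+1)} - eps_k^{(n)})
        nxt = [prev[n + 1] + 1 / (curr[n + 1] - curr[n])
               for n in range(len(curr) - 1)]
        prev, curr = curr, nxt
        cols.append(curr)
    return cols

# Hypothetical sequence of the form (5.3): s = 3, lambda_1 = 1/2, lambda_2 = 1/3.
x = [3 + 2 * Fraction(1, 2) ** m - Fraction(1, 3) ** m for m in range(5)]
cols = epsilon_table(x)
eps_4_0 = cols[4][0]    # equals e_2(x_0) by Theorem 5.3
```

With five terms the table stops at $\varepsilon_4^{(0)}$. Supplying more terms of this particular sequence would make the column $\varepsilon_4^{(n)}$ constant, and the next application of (5.18) would then divide by zero, so a practical code has to detect that situation.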

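Since we are interested only in the $\varepsilon_{2k}^{(n)}$ by (5.19), and since the $\varepsilon_{2k+1}^{(n)}$ are auxiliary quantities, we may ask whether it is possible to obtain a recursion relation among the $\varepsilon_{2k}^{(n)}$ only. The answer to this question, which is yes, was given again by Wynn [356], the result being the so-called cross rule:
\[
\frac{1}{\varepsilon_{2k+2}^{(n-1)} - \varepsilon_{2k}^{(n)}} + \frac{1}{\varepsilon_{2k-2}^{(n+1)} - \varepsilon_{2k}^{(n)}} = \frac{1}{\varepsilon_{2k}^{(n-1)} - \varepsilon_{2k}^{(n)}} + \frac{1}{\varepsilon_{2k}^{(n+1)} - \varepsilon_{2k}^{(n)}}, \tag{5.20}
\]
with the initial conditions
\[
\varepsilon_{-2}^{(n)} = \infty \quad\text{and}\quad \varepsilon_0^{(n)} = x_n, \quad n = 0, 1, \ldots. \tag{5.21}
\]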
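The cross rule (5.20) can be checked numerically on a small epsilon table. The sketch below builds the table with the ordinary recursion (5.18) and evaluates the difference between the two sides of (5.20) for $k \geq 1$, $n \geq 1$. The function names and the test sequence (partial sums of $1 - 1/2 + 1/3 - \cdots$) are illustrative assumptions.

```python
from fractions import Fraction

def epsilon_columns(x):
    """cols[k][n] = eps_k^{(n)}, built with the recursion (5.18)."""
    prev, curr = [Fraction(0)] * (len(x) + 1), list(x)
    cols = [curr]
    while len(curr) > 1:
        prev, curr = curr, [prev[n + 1] + 1 / (curr[n + 1] - curr[n])
                            for n in range(len(curr) - 1)]
        cols.append(curr)
    return cols

def cross_rule_gap(cols, n, k):
    """Left side minus right side of (5.20), for k >= 1 and n >= 1."""
    c = cols[2 * k][n]
    lhs = 1 / (cols[2 * k + 2][n - 1] - c) + 1 / (cols[2 * k - 2][n + 1] - c)
    rhs = 1 / (cols[2 * k][n - 1] - c) + 1 / (cols[2 * k][n + 1] - c)
    return lhs - rhs

# Hypothetical data: seven partial sums of the alternating harmonic series.
x, total = [], Fraction(0)
for i in range(7):
    total += Fraction((-1) ** i, i + 1)
    x.append(total)
cols = epsilon_columns(x)
gap = cross_rule_gap(cols, n=1, k=1)
```

For $n = k = 1$ both sides of (5.20) equal 357 for this sequence, so `gap` is exactly zero.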

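Table 5.1. The epsilon table.
\[
\begin{array}{ccccccc}
\varepsilon_{-1}^{(0)} & & & & & & \\
 & \varepsilon_{0}^{(0)} & & & & & \\
\varepsilon_{-1}^{(1)} & & \varepsilon_{1}^{(0)} & & & & \\
 & \varepsilon_{0}^{(1)} & & \varepsilon_{2}^{(0)} & & & \\
\varepsilon_{-1}^{(2)} & & \varepsilon_{1}^{(1)} & & \varepsilon_{3}^{(0)} & & \\
 & \varepsilon_{0}^{(2)} & & \varepsilon_{2}^{(1)} & & \varepsilon_{4}^{(0)} & \\
\varepsilon_{-1}^{(3)} & & \varepsilon_{1}^{(2)} & & \varepsilon_{3}^{(1)} & & \ddots \\
 & \varepsilon_{0}^{(3)} & & \varepsilon_{2}^{(2)} & & \vdots & \\
\varepsilon_{-1}^{(4)} & & \varepsilon_{1}^{(3)} & & \vdots & & \\
 & \varepsilon_{0}^{(4)} & & \vdots & & & \\
\vdots & \vdots & \vdots & & & &
\end{array}
\]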

FS/qd algorithm

A different algorithm, denoted the FS/qd algorithm, which is as efficient as the epsilon algorithm, was recently proposed by Sidi in [275]; [278, Chapter 21]. In this algorithm, we first compute two sets of quantities, $\{e_k^{(n)}\}$ and $\{q_k^{(n)}\}$, via the quotient-difference algorithm (or the qd algorithm). With the $e_k^{(n)}$ and $q_k^{(n)}$ already computed, we apply the FS algorithm to compute two other sets of quantities, $M_k^{(n)}$ and $N_k^{(n)}$, with the help of which we finally compute the $e_k(x_n)$. [Note that $e_k^{(n)}$ and $e_k(x_n)$ are not to be confused with each other.] Here are the steps of the FS/qd algorithm:

1. For $n = 0, 1, \ldots$, set
\[
e_0^{(n)} = 0, \quad q_1^{(n)} = \frac{u_{n+1}}{u_n} \quad (u_i = x_{i+1} - x_i). \tag{5.22}
\]

2. For $n = 0, 1, \ldots$ and $k = 1, 2, \ldots$, compute $e_k^{(n)}$ and $q_k^{(n)}$ via the recursions
\[
e_k^{(n)} = q_k^{(n+1)} - q_k^{(n)} + e_{k-1}^{(n+1)}, \quad q_{k+1}^{(n)} = \frac{e_k^{(n+1)}}{e_k^{(n)}}\, q_k^{(n+1)}. \tag{5.23}
\]

3. For $n = 0, 1, \ldots$, set
\[
M_0^{(n)} = \frac{x_n}{u_n}, \quad N_0^{(n)} = \frac{1}{u_n} \quad (u_i = x_{i+1} - x_i). \tag{5.24}
\]

4. For $n = 0, 1, \ldots$ and $k = 1, 2, \ldots$, compute $M_k^{(n)}$, $N_k^{(n)}$, and $e_k(x_n)$ via the recursions
\[
M_k^{(n)} = \frac{M_{k-1}^{(n+1)} - M_{k-1}^{(n)}}{e_k^{(n)}}, \quad N_k^{(n)} = \frac{N_{k-1}^{(n+1)} - N_{k-1}^{(n)}}{e_k^{(n)}}, \quad e_k(x_n) = \frac{M_k^{(n)}}{N_k^{(n)}}. \tag{5.25}
\]
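Steps 1–4 can be sketched as follows. The code is a literal transcription of (5.22)–(5.25) in exact rational arithmetic; the function name `fs_qd`, the dictionary-based storage, and the test sequence are our own assumptions. Because step 4 is followed literally, the sketch divides both $M_k^{(n)}$ and $N_k^{(n)}$ by $e_k^{(n)}$ and therefore consumes the difference $u_{n+2k}$ as well, although that factor cancels in the ratio $M_k^{(n)}/N_k^{(n)}$.

```python
from fractions import Fraction

def fs_qd(x, kmax):
    """Return {(k, n): e_k(x_n)} for k = 1..kmax, following (5.22)-(5.25)."""
    u = [x[i + 1] - x[i] for i in range(len(x) - 1)]
    L = len(u)
    # qd part, steps 1 and 2: e[k][n] = e_k^{(n)}, q[k][n] = q_k^{(n)}
    e = {0: [Fraction(0)] * L}
    q = {1: [u[n + 1] / u[n] for n in range(L - 1)]}
    for k in range(1, kmax + 1):
        e[k] = [q[k][n + 1] - q[k][n] + e[k - 1][n + 1]
                for n in range(len(q[k]) - 1)]
        q[k + 1] = [e[k][n + 1] / e[k][n] * q[k][n + 1]
                    for n in range(len(e[k]) - 1)]
    # FS part, steps 3 and 4
    M = {0: [x[n] / u[n] for n in range(L)]}
    N = {0: [1 / u[n] for n in range(L)]}
    result = {}
    for k in range(1, kmax + 1):
        M[k] = [(M[k - 1][n + 1] - M[k - 1][n]) / e[k][n] for n in range(len(e[k]))]
        N[k] = [(N[k - 1][n + 1] - N[k - 1][n]) / e[k][n] for n in range(len(e[k]))]
        for n in range(len(e[k])):
            result[(k, n)] = M[k][n] / N[k][n]
    return result

# Hypothetical data: six partial sums of 1 - 1/2 + 1/3 - ... (limit log 2).
x, total = [], Fraction(0)
for i in range(6):
    total += Fraction((-1) ** i, i + 1)
    x.append(total)
shanks_values = fs_qd(x, 2)
```

For this sequence the sketch gives $e_1(x_0) = 7/10$ and $e_2(x_0) = 52/75$, the same values as $\varepsilon_2^{(0)}$ and $\varepsilon_4^{(0)}$ from the epsilon algorithm. Note that for a sequence that is exactly a sum of two exponentials plus a constant, $e_2^{(n)}$ vanishes, so this literal version cannot be used to reproduce the exactness result $e_2(x_n) = s$ in exact arithmetic.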

Table 5.2. The qd-table.
\[
\begin{array}{ccccc}
 & q_1^{(0)} & & & \\
e_0^{(1)} & & e_1^{(0)} & & \\
 & q_1^{(1)} & & q_2^{(0)} & \\
e_0^{(2)} & & e_1^{(1)} & & e_2^{(0)} \\
 & q_1^{(2)} & & q_2^{(1)} & \\
e_0^{(3)} & & e_1^{(2)} & & e_2^{(1)} \\
 & q_1^{(3)} & & q_2^{(2)} & \\
e_0^{(4)} & & e_1^{(3)} & & e_2^{(2)} \\
\vdots & \vdots & \vdots & \vdots & \vdots
\end{array}
\]

Table 5.3. The Romberg table.
\[
\begin{array}{ccccc}
Q_0^{(0)} & & & & \\
Q_0^{(1)} & Q_1^{(0)} & & & \\
Q_0^{(2)} & Q_1^{(1)} & Q_2^{(0)} & & \\
Q_0^{(3)} & Q_1^{(2)} & Q_2^{(1)} & Q_3^{(0)} & \\
\vdots & \vdots & \vdots & \vdots & \ddots
\end{array}
\]

The quantities $q_k^{(n)}$ and $e_k^{(n)}$ can be arranged in a two-dimensional array as in Table 5.2. This table is called the qd-table. Similarly, the quantities $M_k^{(n)}$ and $N_k^{(n)}$ can be arranged in a two-dimensional array as in Table 5.3, where the $Q_k^{(n)}$ stand for $M_k^{(n)}$ or $N_k^{(n)}$. This table is called a Romberg table.

Remarks:

1. Steps 1 and 2 of the FS/qd algorithm implement the qd algorithm, while steps 3 and 4 implement the FS algorithm.

2. The FS/qd algorithm was designed to implement, among others, the higher-order G-transformation of Gray, Atchison, and McWilliams [117], the Shanks transformation being a special case. See also Pye and Atchison [212] and Sidi [278, Chapter 21]. The FS algorithm was designed by Ford and Sidi [86] to implement a generalization of the Richardson extrapolation process. See also [278, Chapter 3].

3. The qd algorithm was developed by Rutishauser in [229] and is discussed further in [228] and [230]. It arises in the context of continued fractions. For a detailed


treatment of it, we refer the reader to Henrici [131, 132]. For a summary, see also [278, Chapter 17].

4. The qd algorithm is actually closely related to the epsilon algorithm. The connection between the two was discovered and analyzed by Bauer [18, 19], who also developed another algorithm, denoted the η-algorithm, that is closely related to the ε-algorithm. We do not go into the η-algorithm here, but refer the reader to [18] and [19]. See also the description given in Wimp [345, pp. 160–165].

5.2.4 Shanks transformation applied to vector sequences: The SEA

Let us go back to vector sequences $\{\mathbf{x}_m\}$ in $\mathbb{C}^N$ generated by a linear fixed-point iteration, as in Section 0.4. By Theorem 0.3, we see that the vectors $\mathbf{x}_m$ satisfy (5.5) componentwise. That is, we have
\[
x_m^{(j)} - s^{(j)} = \sum_{i=1}^{t} p_i^{(j)}(m)\lambda_i^m, \quad m = 0, 1, \ldots,
\]
where $x_m^{(j)}$, $s^{(j)}$, and $p_i^{(j)}(m)$ stand for the $j$th components of $\mathbf{x}_m$, $\mathbf{s}$, and $\mathbf{p}_i(m)$, respectively. This suggests that the Shanks transformation can be used to approximate limits or antilimits of arbitrary vector sequences $\{\mathbf{x}_m\}$ in $\mathbb{C}^N$ componentwise. That is, it can be applied to the scalar sequences $\{x_m^{(j)}\}_{m=0}^{\infty}$ to approximate $s^{(j)}$, $j = 1, 2, \ldots, N$. When used this way within the context of vector sequences, the resulting method is SEA.

It is clear that, to compute $\varepsilon_{2k}^{(n)}$ from each sequence $\{x_m^{(j)}\}_{m=0}^{\infty}$, $j = 1, 2, \ldots, N$, we need $\mathbf{x}_n, \mathbf{x}_{n+1}, \ldots, \mathbf{x}_{n+2k}$, a total of $2k + 1$ terms of $\{\mathbf{x}_m\}$.

Unfortunately, SEA turns out to be rather unstable numerically when applied to vector sequences. Therefore, it is not recommended for vector applications.
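To make the componentwise use of the epsilon algorithm concrete, here is a minimal sketch of SEA on a two-dimensional linear fixed-point iteration $\mathbf{x}_{m+1} = T\mathbf{x}_m + \mathbf{d}$. The matrix `T`, the vector `d`, and the function names are hypothetical choices of ours. Exact rational arithmetic is used, so the numerical instability mentioned above does not show up here.

```python
from fractions import Fraction

def eps_2k(x, k):
    """eps_{2k}^{(0)} from the scalars x[0..2k] via (5.17)-(5.18)."""
    prev, curr = [0] * (2 * k + 2), list(x[: 2 * k + 1])
    for _ in range(2 * k):
        prev, curr = curr, [prev[n + 1] + 1 / (curr[n + 1] - curr[n])
                            for n in range(len(curr) - 1)]
    return curr[0]

def sea(vectors, k):
    """Apply the scalar epsilon algorithm to each component sequence separately."""
    N = len(vectors[0])
    return [eps_2k([v[j] for v in vectors], k) for j in range(N)]

# Hypothetical linear iteration with two distinct eigenvalues of modulus < 1.
T = [[Fraction(1, 2), Fraction(1, 4)], [Fraction(1, 4), Fraction(1, 3)]]
d = [Fraction(1), Fraction(1)]
xs = [[Fraction(0), Fraction(0)]]
for _ in range(4):
    xs.append([sum(T[i][j] * xs[-1][j] for j in range(2)) + d[i] for i in range(2)])

s = sea(xs, 2)      # uses x_0, ..., x_4
```

Each component sequence here involves both eigenvalues of `T`, so $k = 2$ recovers the fixed point $(44/13,\ 36/13)$ exactly, in line with Remark 1 following Definition 5.1.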

5.3 VEA

5.3.1 Vectorization of epsilon algorithm via Samelson inverse

As is clear, when applied to a vector sequence $\{\mathbf{x}_m\}$ in $\mathbb{C}^N$, SEA treats each of the scalar sequences $\{x_m^{(j)}\}_{m=0}^{\infty}$, $j = 1, 2, \ldots, N$, separately, as if they were generated independently. Since, in the problems we are interested in, the processes that generate the $\mathbf{x}_m$ cause the components $x_m^{(j)}$ to be coupled, we want our extrapolation method to take this coupling into account. To achieve this with the epsilon algorithm, Wynn [352, 354] suggested “vectorizing” it by introducing the Samelson inverse of a nonzero vector, namely,
\[
\mathbf{z}^{-1} = \frac{1}{\mathbf{z}} = \frac{\overline{\mathbf{z}}}{\|\mathbf{z}\|_2^2}; \quad \mathbf{z} \neq \mathbf{0}, \quad \|\mathbf{z}\|_2 = \sqrt{\mathbf{z}^*\mathbf{z}},
\]
where $\overline{\mathbf{z}}$ stands for the complex conjugate of $\mathbf{z}$ and is a column vector just as $\mathbf{z}$ is itself. Thus, $\mathbf{z}^{-1}$ is a column vector too and has the desirable properties that
\[
(\mathbf{z}^{-1})^{-1} = \mathbf{z} \quad\text{and}\quad \mathbf{z}^T\mathbf{z}^{-1} = 1.
\]


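It is easy to show that the Samelson inverse $\mathbf{z}^{-1}$ of $\mathbf{z}$ is the unique vector $\mathbf{w}$ that solves the constrained minimization problem
\[
\min \|\mathbf{w}\|_2 \quad\text{subject to}\quad \mathbf{z}^T\mathbf{w} = 1.
\]
Finally, comparing the Samelson inverse $\mathbf{z}^{-1}$ of $\mathbf{z}$ with its Moore–Penrose inverse $\mathbf{z}^+$, we realize that the two are closely related since
\[
\mathbf{z}^+ = \frac{\mathbf{z}^*}{\|\mathbf{z}\|_2^2}; \quad\text{hence}\quad \mathbf{z}^{-1} = (\mathbf{z}^+)^T.
\]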

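With this interpretation of $1/\mathbf{z}$, we obtain VEA as follows:

1. Set
\[
\boldsymbol{\varepsilon}_{-1}^{(n)} = \mathbf{0}, \quad \boldsymbol{\varepsilon}_0^{(n)} = \mathbf{x}_n, \quad n = 0, 1, \ldots. \tag{5.26}
\]

2. Compute the $\boldsymbol{\varepsilon}_k^{(n)}$ via the recursion
\[
\boldsymbol{\varepsilon}_{k+1}^{(n)} = \boldsymbol{\varepsilon}_{k-1}^{(n+1)} + \frac{1}{\boldsymbol{\varepsilon}_k^{(n+1)} - \boldsymbol{\varepsilon}_k^{(n)}}, \quad n, k = 0, 1, \ldots. \tag{5.27}
\]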
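A minimal sketch of the Samelson inverse and of the recursion (5.26)–(5.27) is given below. Vectors are plain Python lists; the function names and the test iteration $\mathbf{x}_{m+1} = T\mathbf{x}_m + \mathbf{d}$ are our own assumptions, and only $\boldsymbol{\varepsilon}_{2k}^{(0)}$ is returned. The `conjugate()` method is used so that the same code accepts real, rational, or complex entries.

```python
from fractions import Fraction

def samelson_inverse(z):
    """z^{-1} = conj(z) / ||z||_2^2 for a nonzero vector z."""
    norm_sq = sum(c * c.conjugate() for c in z)
    if norm_sq == 0:
        raise ZeroDivisionError("Samelson inverse of the zero vector")
    return [c.conjugate() / norm_sq for c in z]

def vea(vectors, k):
    """eps_{2k}^{(0)} from x_0, ..., x_{2k} via (5.26)-(5.27)."""
    N = len(vectors[0])
    curr = [list(v) for v in vectors[: 2 * k + 1]]      # eps_0^{(n)} = x_n
    prev = [[0] * N for _ in range(len(curr) + 1)]      # eps_{-1}^{(n)} = 0
    for _ in range(2 * k):
        nxt = []
        for n in range(len(curr) - 1):
            diff = [a - b for a, b in zip(curr[n + 1], curr[n])]
            inv = samelson_inverse(diff)
            nxt.append([p + q for p, q in zip(prev[n + 1], inv)])
        prev, curr = curr, nxt
    return curr[0]

# Hypothetical linear iteration x_{m+1} = T x_m + d in two dimensions.
T = [[Fraction(1, 2), Fraction(1, 4)], [Fraction(1, 4), Fraction(1, 3)]]
d = [Fraction(1), Fraction(1)]
xs = [[Fraction(0), Fraction(0)]]
for _ in range(4):
    xs.append([sum(T[i][j] * xs[-1][j] for j in range(2)) + d[i] for i in range(2)])

eps_2 = vea(xs, 1)      # uses x_0, x_1, x_2
eps_4 = vea(xs, 2)      # uses x_0, ..., x_4
```

For this iteration $\boldsymbol{\varepsilon}_2^{(0)} = (60/17,\ 36/17)$, while $\boldsymbol{\varepsilon}_4^{(0)}$ coincides with the fixed point $(44/13,\ 36/13)$.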

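The $\boldsymbol{\varepsilon}_k^{(n)}$ can be arranged in a two-dimensional array exactly as in Table 5.1.

As in the case of SEA, here too the approximations to the limit or antilimit of $\{\mathbf{x}_m\}$ are the $\boldsymbol{\varepsilon}_{2k}^{(n)}$. It is clear that to compute $\boldsymbol{\varepsilon}_{2k}^{(n)}$ we need $\mathbf{x}_n, \mathbf{x}_{n+1}, \ldots, \mathbf{x}_{n+2k}$, a total of $2k + 1$ terms of $\{\mathbf{x}_m\}$.

Again, as in the case of SEA, in the vector case too the $\boldsymbol{\varepsilon}_{2k+1}^{(n)}$ can be eliminated. This results in the following cross rule for VEA:
\[
\frac{1}{\boldsymbol{\varepsilon}_{2k+2}^{(n-1)} - \boldsymbol{\varepsilon}_{2k}^{(n)}} + \frac{1}{\boldsymbol{\varepsilon}_{2k-2}^{(n+1)} - \boldsymbol{\varepsilon}_{2k}^{(n)}} = \frac{1}{\boldsymbol{\varepsilon}_{2k}^{(n-1)} - \boldsymbol{\varepsilon}_{2k}^{(n)}} + \frac{1}{\boldsymbol{\varepsilon}_{2k}^{(n+1)} - \boldsymbol{\varepsilon}_{2k}^{(n)}}, \tag{5.28}
\]
with the initial conditions
\[
\boldsymbol{\varepsilon}_{-2}^{(n)} = \infty \quad\text{and}\quad \boldsymbol{\varepsilon}_0^{(n)} = \mathbf{x}_n, \quad n = 0, 1, \ldots. \tag{5.29}
\]
Replacing $n$ by $n + 1$ in (5.28), and invoking $1/\mathbf{z} = \overline{\mathbf{z}}/\|\mathbf{z}\|_2^2$, we realize that the vectors $\boldsymbol{\varepsilon}_{2k}^{(n)}$ satisfy a five-term recursion relation of the form
\[
\boldsymbol{\varepsilon}_{2k+2}^{(n)} = a_{nk}\boldsymbol{\varepsilon}_{2k}^{(n)} + b_{nk}\boldsymbol{\varepsilon}_{2k}^{(n+1)} + c_{nk}\boldsymbol{\varepsilon}_{2k}^{(n+2)} + d_{nk}\boldsymbol{\varepsilon}_{2k-2}^{(n+2)}, \quad n \geq 0, \quad k \geq 1, \tag{5.30}
\]
$a_{nk}, \ldots, d_{nk}$ being scalars that satisfy $a_{nk} + b_{nk} + c_{nk} + d_{nk} = 1$.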

5.3.2 Determinant representation for VEA

VEA is related to vector-valued Padé approximants. It has been studied extensively in the papers by Graves-Morris [107, 110] and Graves-Morris and Jenkins [112]. See also [14, Section 8.4]. The following theorem, which is based on the connection between VEA and vector-valued Padé approximants, gives a determinant representation for $\boldsymbol{\varepsilon}_{2k}^{(n)}$.


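Theorem 5.4. $\boldsymbol{\varepsilon}_{2k}^{(n)}$ has determinant representation
\[
\boldsymbol{\varepsilon}_{2k}^{(n)} = \frac{E(\mathbf{x}_n, \mathbf{x}_{n+1}, \ldots, \mathbf{x}_{n+2k})}{E(1, 1, \ldots, 1)}, \tag{5.31}
\]
where
\[
E(v_0, v_1, \ldots, v_{2k}) = \begin{vmatrix}
v_0 & v_1 & \cdots & v_{2k}\\
M_{00} & M_{01} & \cdots & M_{0,2k}\\
M_{10} & M_{11} & \cdots & M_{1,2k}\\
\vdots & \vdots & & \vdots\\
M_{2k-1,0} & M_{2k-1,1} & \cdots & M_{2k-1,2k}
\end{vmatrix}, \tag{5.32}
\]
with
\[
M_{ii} = 0, \quad M_{ij} = \sum_{r=0}^{j-i-1}\langle \mathbf{u}_{j-r+n-1}, \mathbf{u}_{i+r+n}\rangle = -M_{ji}, \quad i < j. \tag{5.33}
\]
Here, $\mathbf{u}_m = \mathbf{x}_{m+1} - \mathbf{x}_m$, $m = 0, 1, \ldots$, as usual, and $\langle \mathbf{y}, \mathbf{z}\rangle = \mathbf{y}^*\mathbf{z}$. If the $v_i$ are vectors, the determinant $E(v_0, v_1, \ldots, v_{2k})$ is defined via its expansion with respect to its first row.

By (5.31), it also follows that $\boldsymbol{\varepsilon}_{2k}^{(n)}$ is of the form
\[
\boldsymbol{\varepsilon}_{2k}^{(n)} = \sum_{i=0}^{2k}\gamma_i\mathbf{x}_{n+i}, \quad \sum_{i=0}^{2k}\gamma_i = 1. \tag{5.34}
\]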

Proof. We make use of the connection between VEA as applied to the vector sequence $\{\mathbf{x}_m\}_{m=0}^{\infty}$ and vector Padé approximants from the vector-valued power series $\sum_{i=0}^{\infty}\mathbf{c}_i z^i$ with $\mathbf{c}_0 = \mathbf{x}_0$ and $\mathbf{c}_i = \mathbf{x}_i - \mathbf{x}_{i-1} = \mathbf{u}_{i-1}$, $i = 1, 2, \ldots$. Let us denote the vector Padé approximant of type $[n + 2k/2k]$ by $\mathbf{r}_{[n+2k/2k]}(z)$. The connection we alluded to is
\[
\boldsymbol{\varepsilon}_{2k}^{(n)} = \mathbf{r}_{[n+2k/2k]}(1). \tag{5.35}
\]
Letting $\mathbf{f}_m(z) = \sum_{i=0}^{m}\mathbf{c}_i z^i$, $m = 0, 1, \ldots$, we begin with the determinant representation of $\mathbf{r}_{[n+2k/2k]}(z)$, namely,
\[
\mathbf{r}_{[n+2k/2k]}(z) = \frac{E\bigl(z^{2k}\mathbf{f}_n(z), z^{2k-1}\mathbf{f}_{n+1}(z), \ldots, z^{1}\mathbf{f}_{n+2k-1}(z), z^{0}\mathbf{f}_{n+2k}(z)\bigr)}{E(z^{2k}, z^{2k-1}, \ldots, z^{1}, z^{0})}, \tag{5.36}
\]
with $E(v_0, v_1, \ldots, v_{2k})$ as given in (5.32) and (5.33) (see Chapter 13). We now let $z = 1$ everywhere and also observe that $\mathbf{x}_m = \mathbf{f}_m(1)$. By invoking (5.35), the result follows.

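Remark: We will consider the subject of vector Padé approximants in more detail in Chapter 13. The references mentioned above provide only the denominator determinant in (5.31). The determinant representation of the numerator follows from an argument we present in Section 13.2.

As an example, let us look at $\boldsymbol{\varepsilon}_2^{(n)}$. It can be verified both via (5.27) and via Theorem 5.4 that
\[
\boldsymbol{\varepsilon}_2^{(n)} = \mathbf{x}_{n+1} + \frac{\|\mathbf{u}_n\|_2^2\,\mathbf{u}_{n+1} - \|\mathbf{u}_{n+1}\|_2^2\,\mathbf{u}_n}{\|\mathbf{u}_{n+1} - \mathbf{u}_n\|_2^2}
\]
and
\[
\boldsymbol{\varepsilon}_2^{(n)} = \frac{\|\mathbf{u}_{n+1}\|_2^2\,\mathbf{x}_n - 2[\Re\langle \mathbf{u}_{n+1}, \mathbf{u}_n\rangle]\,\mathbf{x}_{n+1} + \|\mathbf{u}_n\|_2^2\,\mathbf{x}_{n+2}}{\|\mathbf{u}_{n+1} - \mathbf{u}_n\|_2^2}.
\]
Recall that $\langle \mathbf{y}, \mathbf{z}\rangle = \mathbf{y}^*\mathbf{z}$ and $\|\mathbf{z}\|_2 = \sqrt{\mathbf{z}^*\mathbf{z}}$ here.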

5.3.3 McLeod’s theorem

Theorem 5.5, which we state next, is known as McLeod’s theorem. It is the VEA analogue of Theorem 1.12, which we stated for the four polynomial extrapolation methods MPE, RRE, MMPE, and SVD-MPE.

Theorem 5.5. Let the sequence $\{\mathbf{x}_m\}$ be such that
\[
\sum_{i=0}^{k} a_i(\mathbf{x}_{m+i} - \mathbf{s}) = \mathbf{0}, \quad m = 0, 1, \ldots, \quad a_0, a_k \neq 0, \quad \sum_{i=0}^{k} a_i \neq 0, \tag{5.37}
\]
$k$ being minimal. Then the vectors $\boldsymbol{\varepsilon}_{2k}^{(n)}$ obtained by applying VEA to $\{\mathbf{x}_m\}$ satisfy $\boldsymbol{\varepsilon}_{2k}^{(n)} = \mathbf{s}$, $n = 0, 1, \ldots$, provided that zero divisors are not encountered during the recursion.

The result of Theorem 5.5 was conjectured by Wynn [353] and was first proved by McLeod [183] with the $a_i$ restricted to be real numbers; see [306, p. 211]. The complete proof was given by Graves-Morris [107].

5.3.4 Operation count and computer storage requirements

We now turn to the analysis of the cost of computing $\boldsymbol{\varepsilon}_{2k}^{(n)}$ for given $n$ and $k$. In the problems we are interested in, $N$, the dimension of the vectors $\mathbf{x}_m$, is extremely large, while $k$ takes on very small values. Consequently, the major part of the computational effort is spent in handling the large vectors, the rest being negligible. Rewriting (5.27) as
\[
\boldsymbol{\varepsilon}_{r+1}^{(s)} = \boldsymbol{\varepsilon}_{r-1}^{(s+1)} + \frac{\overline{\Delta\boldsymbol{\varepsilon}_r^{(s)}}}{\|\Delta\boldsymbol{\varepsilon}_r^{(s)}\|_2^2}, \quad \Delta\boldsymbol{\varepsilon}_r^{(s)} = \boldsymbol{\varepsilon}_r^{(s+1)} - \boldsymbol{\varepsilon}_r^{(s)}, \quad \|\mathbf{z}\|_2 = \sqrt{\mathbf{z}^*\mathbf{z}},
\]
we realize that, for $r \geq 1$, the computation of $\boldsymbol{\varepsilon}_{r+1}^{(s)}$ involves one vector addition, one inner product, and one saxpy operation. For $r = 0$, it involves one inner product and one scalar-vector multiplication, since $\boldsymbol{\varepsilon}_{-1}^{(m)} = \mathbf{0}$.

Now, given the vectors $\boldsymbol{\varepsilon}_0^{(n+s)} = \mathbf{x}_{n+s}$, $s = 0, 1, \ldots, 2k$, the computation of $\boldsymbol{\varepsilon}_{2k}^{(n)}$ necessitates the computation of all the vectors $\boldsymbol{\varepsilon}_r^{(n+j)}$, $1 \leq j + r \leq 2k$, a total of $2k^2 + k$ vectors. Taking into account that the computation of the $\boldsymbol{\varepsilon}_1^{(m)}$ is slightly different, it is easy to show that the cost of the whole process of computing $\boldsymbol{\varepsilon}_{2k}^{(n)}$ is $6k^2 + O(k)$ vector operations and is as follows:

vector additions: $2k^2 + k$,
scalar-vector multiplications: $2k$,
dot products: $2k^2 + k$,
saxpy (scalar alpha x plus y): $2k^2 - k$.


As for storage, we need $(2k + 1)N$ memory locations throughout the whole process of computing $\boldsymbol{\varepsilon}_{2k}^{(n)}$. We may use them first to store $\mathbf{x}_n, \mathbf{x}_{n+1}, \ldots, \mathbf{x}_{n+2k}$ if these $\mathbf{x}_m$ are provided first. We overwrite these with the $\boldsymbol{\varepsilon}_r^{(m)}$ as we go along.

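5.4 TEA

5.4.1 Preliminaries

Instead of using the Samelson inverse, Brezinski [28, 29] “vectorizes” the epsilon algorithm by introducing a different vector inverse. Starting with the bilinear form $\langle \mathbf{y}, \mathbf{z}\rangle = \mathbf{y}^*\mathbf{z}$, we define the inverse of an ordered pair of vectors $\{\mathbf{a}, \mathbf{b}\}$ such that $\langle \mathbf{a}, \mathbf{b}\rangle \neq 0$ to be the ordered pair $\{\mathbf{b}^{-1}, \mathbf{a}^{-1}\}$, where
\[
\mathbf{b}^{-1} = \frac{\mathbf{a}}{\langle \mathbf{b}, \mathbf{a}\rangle}, \quad \mathbf{a}^{-1} = \frac{\mathbf{b}}{\langle \mathbf{a}, \mathbf{b}\rangle}.
\]
We call $\mathbf{a}^{-1}$ the inverse of $\mathbf{a}$ with respect to $\mathbf{b}$, and vice versa. Here are some desirable consequences of this inversion:
\[
\bigl(\{\mathbf{a}, \mathbf{b}\}^{-1}\bigr)^{-1} = \{\mathbf{a}, \mathbf{b}\}, \quad \langle \mathbf{a}^{-1}, \mathbf{b}^{-1}\rangle = \frac{1}{\langle \mathbf{b}, \mathbf{a}\rangle}.
\]
Note that, when $\mathbf{a}, \mathbf{b} \in \mathbb{C}^N$, $\langle \mathbf{b}, \mathbf{a}\rangle = \overline{\langle \mathbf{a}, \mathbf{b}\rangle}$; hence $\langle \mathbf{b}, \mathbf{a}\rangle \neq \langle \mathbf{a}, \mathbf{b}\rangle$ in general. Of course, $\langle \mathbf{b}, \mathbf{a}\rangle = \langle \mathbf{a}, \mathbf{b}\rangle$ when $\mathbf{a}$ and $\mathbf{b}$ are real vectors.

Interpreting the inverse of a vector with the help of this, we obtain two other vector versions of the epsilon algorithm next. Each of these vector versions is called TEA. To this effect, we treat $\boldsymbol{\varepsilon}_{2k}^{(n)}$ and $\boldsymbol{\varepsilon}_{2k+1}^{(n)}$ differently by rewriting the epsilon algorithm in the following form:

1. Set
\[
\boldsymbol{\varepsilon}_{-1}^{(n)} = \mathbf{0}, \quad \boldsymbol{\varepsilon}_0^{(n)} = \mathbf{x}_n, \quad n = 0, 1, \ldots. \tag{5.38}
\]

2. Compute the $\boldsymbol{\varepsilon}_k^{(n)}$ via the recursions
\[
\boldsymbol{\varepsilon}_{2k+1}^{(n)} = \boldsymbol{\varepsilon}_{2k-1}^{(n+1)} + \frac{1}{\Delta\boldsymbol{\varepsilon}_{2k}^{(n)}}, \quad \boldsymbol{\varepsilon}_{2k+2}^{(n)} = \boldsymbol{\varepsilon}_{2k}^{(n+1)} + \frac{1}{\Delta\boldsymbol{\varepsilon}_{2k+1}^{(n)}}, \quad n, k = 0, 1, \ldots, \tag{5.39}
\]
interpreting $1/\Delta\boldsymbol{\varepsilon}_{2k}^{(n)}$ and $1/\Delta\boldsymbol{\varepsilon}_{2k+1}^{(n)}$ differently. Here, $\Delta\boldsymbol{\varepsilon}_k^{(n)} = \boldsymbol{\varepsilon}_k^{(n+1)} - \boldsymbol{\varepsilon}_k^{(n)}$.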

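5.4.2 First version of TEA: TEA1

We now define a vector version of the epsilon algorithm with the vectors $1/\Delta\boldsymbol{\varepsilon}_{2k}^{(n)}$ and $1/\Delta\boldsymbol{\varepsilon}_{2k+1}^{(n)}$ interpreted in the following way: The inverse of $\Delta\boldsymbol{\varepsilon}_{2k}^{(n)}$ is with respect to a fixed vector $\mathbf{y}$, arbitrary except for the restriction that all the required inverses should exist, that is, $\langle \mathbf{y}, \Delta\boldsymbol{\varepsilon}_{2k}^{(n)}\rangle \neq 0$ for each $n$ and $k$. The inverse of $\Delta\boldsymbol{\varepsilon}_{2k+1}^{(n)}$ is with respect to $\mathbf{y}^{-1}$, the inverse of $\mathbf{y}$ with respect to $\Delta\boldsymbol{\varepsilon}_{2k}^{(n)}$. Thus, we have
\[
\mathbf{y}^{-1} = \frac{\Delta\boldsymbol{\varepsilon}_{2k}^{(n)}}{\bigl\langle \mathbf{y}, \Delta\boldsymbol{\varepsilon}_{2k}^{(n)}\bigr\rangle}, \quad
\bigl(\Delta\boldsymbol{\varepsilon}_{2k+1}^{(n)}\bigr)^{-1} = \frac{\mathbf{y}^{-1}}{\bigl\langle \Delta\boldsymbol{\varepsilon}_{2k+1}^{(n)}, \mathbf{y}^{-1}\bigr\rangle} = \frac{\Delta\boldsymbol{\varepsilon}_{2k}^{(n)}}{\bigl\langle \Delta\boldsymbol{\varepsilon}_{2k+1}^{(n)}, \Delta\boldsymbol{\varepsilon}_{2k}^{(n)}\bigr\rangle}.
\]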


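With these interpretations of $1/\mathbf{z}$ in (5.38) and (5.39), we obtain the first version of TEA (TEA1) given as follows:

1. Set
\[
\boldsymbol{\varepsilon}_{-1}^{(n)} = \mathbf{0}, \quad \boldsymbol{\varepsilon}_0^{(n)} = \mathbf{x}_n, \quad n = 0, 1, \ldots. \tag{5.40}
\]

2. Compute the $\boldsymbol{\varepsilon}_k^{(n)}$ via the recursions
\[
\left.
\begin{aligned}
\boldsymbol{\varepsilon}_{2k+1}^{(n)} &= \boldsymbol{\varepsilon}_{2k-1}^{(n+1)} + \frac{\mathbf{y}}{\bigl\langle \Delta\boldsymbol{\varepsilon}_{2k}^{(n)}, \mathbf{y}\bigr\rangle}, & n &= 0, 1, \ldots\\
\boldsymbol{\varepsilon}_{2k+2}^{(n)} &= \boldsymbol{\varepsilon}_{2k}^{(n+1)} + \frac{\Delta\boldsymbol{\varepsilon}_{2k}^{(n)}}{\bigl\langle \Delta\boldsymbol{\varepsilon}_{2k+1}^{(n)}, \Delta\boldsymbol{\varepsilon}_{2k}^{(n)}\bigr\rangle}, & n &= 0, 1, \ldots
\end{aligned}
\right\}, \quad k = 0, 1, \ldots. \tag{5.41}
\]

As usual, the approximations to the limit or antilimit of $\{\mathbf{x}_m\}$ are the $\boldsymbol{\varepsilon}_{2k}^{(n)}$. Of course, the $\boldsymbol{\varepsilon}_k^{(n)}$ can be arranged in a two-dimensional array as in Table 5.1.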
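The two recursions of TEA1 can be sketched as below. Vectors are Python lists, `inner` implements $\langle \mathbf{y}, \mathbf{z}\rangle = \mathbf{y}^*\mathbf{z}$, and only $\boldsymbol{\varepsilon}_{2k}^{(0)}$ is returned. The function names, the fixed vector $\mathbf{y} = (1, 0)^T$, and the test iteration are our own illustrative assumptions; no check for vanishing inner products is made.

```python
from fractions import Fraction

def inner(y, z):
    """<y, z> = y^* z (conjugation applied to the first argument)."""
    return sum(a.conjugate() * b for a, b in zip(y, z))

def tea1(vectors, y, k):
    """eps_{2k}^{(0)} from x_0, ..., x_{2k} via (5.40)-(5.41)."""
    N = len(y)
    even = [list(v) for v in vectors[: 2 * k + 1]]    # eps_0^{(n)} = x_n
    odd = [[0] * N for _ in range(len(even) + 1)]     # eps_{-1}^{(n)} = 0
    for _ in range(k):
        d_even = [[a - b for a, b in zip(even[n + 1], even[n])]
                  for n in range(len(even) - 1)]
        # eps_{2k+1}^{(n)} = eps_{2k-1}^{(n+1)} + y / <d_even[n], y>
        odd = [[o + c / inner(d_even[n], y) for o, c in zip(odd[n + 1], y)]
               for n in range(len(d_even))]
        d_odd = [[a - b for a, b in zip(odd[n + 1], odd[n])]
                 for n in range(len(odd) - 1)]
        # eps_{2k+2}^{(n)} = eps_{2k}^{(n+1)} + d_even[n] / <d_odd[n], d_even[n]>
        even = [[e + c / inner(d_odd[n], d_even[n])
                 for e, c in zip(even[n + 1], d_even[n])]
                for n in range(len(d_odd))]
    return even[0]

# Hypothetical linear iteration x_{m+1} = T x_m + d and fixed vector y.
T = [[Fraction(1, 2), Fraction(1, 4)], [Fraction(1, 4), Fraction(1, 3)]]
d = [Fraction(1), Fraction(1)]
xs = [[Fraction(0), Fraction(0)]]
for _ in range(4):
    xs.append([sum(T[i][j] * xs[-1][j] for j in range(2)) + d[i] for i in range(2)])
y = [Fraction(1), Fraction(0)]

eps_2 = tea1(xs, y, 1)
eps_4 = tea1(xs, y, 2)
```

With these data $\boldsymbol{\varepsilon}_2^{(0)} = (4, 4)$ and $\boldsymbol{\varepsilon}_4^{(0)} = (44/13,\ 36/13)$, the latter being the fixed point of the iteration.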

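It is also interesting to note from the second equality in (5.41) that the vectors $\boldsymbol{\varepsilon}_{2k}^{(n)}$ satisfy a three-term recursion relation of the form
\[
\boldsymbol{\varepsilon}_{2k+2}^{(n)} = a_{nk}\boldsymbol{\varepsilon}_{2k}^{(n)} + b_{nk}\boldsymbol{\varepsilon}_{2k}^{(n+1)}, \tag{5.42}
\]
$a_{nk}, b_{nk}$ being scalars satisfying $a_{nk} + b_{nk} = 1$.

Concerning the $\boldsymbol{\varepsilon}_k^{(n)}$, Brezinski [28, 29] has proved the following results.

Theorem 5.6. The entries of the TEA1 table satisfy
\[
\boldsymbol{\varepsilon}_{2k}^{(n)} = e_k(\mathbf{x}_n) \quad\text{and}\quad \boldsymbol{\varepsilon}_{2k+1}^{(n)} = \bigl[e_k(\Delta\mathbf{x}_n)\bigr]^{-1} = \frac{\mathbf{y}}{\langle e_k(\Delta\mathbf{x}_n), \mathbf{y}\rangle}, \tag{5.43}
\]
where
\[
e_k(\mathbf{x}_n) = \frac{F(\mathbf{x}_n, \mathbf{x}_{n+1}, \ldots, \mathbf{x}_{n+k})}{F(1, 1, \ldots, 1)}, \tag{5.44}
\]
with
\[
F(v_0, v_1, \ldots, v_k) = \begin{vmatrix}
v_0 & v_1 & \cdots & v_k\\
\langle \mathbf{y}, \Delta\mathbf{x}_n\rangle & \langle \mathbf{y}, \Delta\mathbf{x}_{n+1}\rangle & \cdots & \langle \mathbf{y}, \Delta\mathbf{x}_{n+k}\rangle\\
\vdots & \vdots & & \vdots\\
\langle \mathbf{y}, \Delta\mathbf{x}_{n+k-1}\rangle & \langle \mathbf{y}, \Delta\mathbf{x}_{n+k}\rangle & \cdots & \langle \mathbf{y}, \Delta\mathbf{x}_{n+2k-1}\rangle
\end{vmatrix}. \tag{5.45}
\]
In addition, Theorem 5.5 is valid with the present $\boldsymbol{\varepsilon}_{2k}^{(n)}$ too.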

Let us note that the determinant representation of $\boldsymbol{\varepsilon}_{2k}^{(n)}$ is of the same form as those of MPE, RRE, and MMPE, which follow from Lemma 1.14. Therefore, Lemma 1.14 applies to the case of TEA1 too and we have the following result, whose proof we leave to the reader.

Theorem 5.7. For TEA1, $\boldsymbol{\varepsilon}_{2k}^{(n)}$ is of the form
\[
\boldsymbol{\varepsilon}_{2k}^{(n)} = \sum_{i=0}^{k}\gamma_i\mathbf{x}_{n+i},
\]
the $\gamma_i$ being the solution to the linear system
\[
\sum_{j=0}^{k} u_{i,j}\gamma_j = 0, \quad i = 0, 1, \ldots, k-1, \qquad \sum_{j=0}^{k}\gamma_j = 1,
\]
with
\[
u_{i,j} = \langle \mathbf{y}, \mathbf{u}_{n+i+j}\rangle, \quad i, j \geq 0.
\]
Therefore, we also have the following analogue of Theorem 1.17:
\[
\Bigl\langle \mathbf{y}, \sum_{j=0}^{k}\gamma_j\mathbf{u}_{n+i+j}\Bigr\rangle = 0, \quad i = 0, 1, \ldots, k-1.
\]

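The next theorem gives a sufficient condition for the existence of $\boldsymbol{\varepsilon}_{2k}^{(n)}$. Since its proof can be achieved exactly as that of Theorem 1.18, with $D(1, \ldots, 1)$ in the latter replaced by $F(1, \ldots, 1)$ now, we leave it to the reader.

Theorem 5.8. Let $\mathbf{u}_m = \mathbf{x}_{m+1} - \mathbf{x}_m$ and $\mathbf{w}_m = \mathbf{u}_{m+1} - \mathbf{u}_m$, $m = 0, 1, \ldots$, as usual. Then $\mathbf{s}_{n,k}$ exists and is unique if $\det\widehat{W} \neq 0$, where
\[
\widehat{W} = \begin{bmatrix}
\langle \mathbf{y}, \mathbf{w}_n\rangle & \langle \mathbf{y}, \mathbf{w}_{n+1}\rangle & \cdots & \langle \mathbf{y}, \mathbf{w}_{n+k-1}\rangle\\
\langle \mathbf{y}, \mathbf{w}_{n+1}\rangle & \langle \mathbf{y}, \mathbf{w}_{n+2}\rangle & \cdots & \langle \mathbf{y}, \mathbf{w}_{n+k}\rangle\\
\vdots & \vdots & & \vdots\\
\langle \mathbf{y}, \mathbf{w}_{n+k-1}\rangle & \langle \mathbf{y}, \mathbf{w}_{n+k}\rangle & \cdots & \langle \mathbf{y}, \mathbf{w}_{n+2k-2}\rangle
\end{bmatrix}. \tag{5.46}
\]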

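5.4.3 Second version of TEA: TEA2

Brezinski [28, 29] proposes an additional TEA that is slightly different from TEA1 presented in (5.40) and (5.41). In this algorithm, denoted the second version of TEA (TEA2), $\boldsymbol{\varepsilon}_{2k+1}^{(n)}$ is as in TEA1, but $\mathbf{y}^{-1}$ in the computation of $\boldsymbol{\varepsilon}_{2k+2}^{(n)}$ is the inverse of $\mathbf{y}$ with respect to $\Delta\boldsymbol{\varepsilon}_{2k}^{(n+1)}$, that is,
\[
\mathbf{y}^{-1} = \frac{\Delta\boldsymbol{\varepsilon}_{2k}^{(n+1)}}{\bigl\langle \mathbf{y}, \Delta\boldsymbol{\varepsilon}_{2k}^{(n+1)}\bigr\rangle}.
\]
Here are the steps of TEA2:

1. Set
\[
\boldsymbol{\varepsilon}_{-1}^{(n)} = \mathbf{0}, \quad \boldsymbol{\varepsilon}_0^{(n)} = \mathbf{x}_n, \quad n = 0, 1, \ldots. \tag{5.47}
\]

2. Compute the $\boldsymbol{\varepsilon}_k^{(n)}$ via the recursions
\[
\left.
\begin{aligned}
\boldsymbol{\varepsilon}_{2k+1}^{(n)} &= \boldsymbol{\varepsilon}_{2k-1}^{(n+1)} + \frac{\mathbf{y}}{\bigl\langle \Delta\boldsymbol{\varepsilon}_{2k}^{(n)}, \mathbf{y}\bigr\rangle}, & n &= 0, 1, \ldots\\
\boldsymbol{\varepsilon}_{2k+2}^{(n)} &= \boldsymbol{\varepsilon}_{2k}^{(n+1)} + \frac{\Delta\boldsymbol{\varepsilon}_{2k}^{(n+1)}}{\bigl\langle \Delta\boldsymbol{\varepsilon}_{2k+1}^{(n)}, \Delta\boldsymbol{\varepsilon}_{2k}^{(n+1)}\bigr\rangle}, & n &= 0, 1, \ldots
\end{aligned}
\right\}, \quad k = 0, 1, \ldots. \tag{5.48}
\]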

Again, the approximations to the limit or antilimit of $\{\mathbf{x}_m\}$ are the $\boldsymbol{\varepsilon}_{2k}^{(n)}$. Of course, the $\boldsymbol{\varepsilon}_k^{(n)}$ can be arranged in a two-dimensional array as in Table 5.1.
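TEA2 differs from TEA1 only in the even-indexed update, where $\Delta\boldsymbol{\varepsilon}_{2k}^{(n+1)}$ replaces $\Delta\boldsymbol{\varepsilon}_{2k}^{(n)}$. A minimal sketch of (5.47)–(5.48) follows; as before, the function names, the vector $\mathbf{y} = (1, 0)^T$, and the test iteration are hypothetical, and no safeguards against zero inner products are included.

```python
from fractions import Fraction

def inner(y, z):
    """<y, z> = y^* z (conjugation applied to the first argument)."""
    return sum(a.conjugate() * b for a, b in zip(y, z))

def tea2(vectors, y, k):
    """eps_{2k}^{(0)} from x_0, ..., x_{2k} via (5.47)-(5.48)."""
    N = len(y)
    even = [list(v) for v in vectors[: 2 * k + 1]]    # eps_0^{(n)} = x_n
    odd = [[0] * N for _ in range(len(even) + 1)]     # eps_{-1}^{(n)} = 0
    for _ in range(k):
        d_even = [[a - b for a, b in zip(even[n + 1], even[n])]
                  for n in range(len(even) - 1)]
        # Odd entries are computed exactly as in TEA1.
        odd = [[o + c / inner(d_even[n], y) for o, c in zip(odd[n + 1], y)]
               for n in range(len(d_even))]
        d_odd = [[a - b for a, b in zip(odd[n + 1], odd[n])]
                 for n in range(len(odd) - 1)]
        # Even entries use the shifted difference d_even[n + 1].
        even = [[e + c / inner(d_odd[n], d_even[n + 1])
                 for e, c in zip(even[n + 1], d_even[n + 1])]
                for n in range(len(d_odd))]
    return even[0]

# Hypothetical linear iteration x_{m+1} = T x_m + d and fixed vector y.
T = [[Fraction(1, 2), Fraction(1, 4)], [Fraction(1, 4), Fraction(1, 3)]]
d = [Fraction(1), Fraction(1)]
xs = [[Fraction(0), Fraction(0)]]
for _ in range(4):
    xs.append([sum(T[i][j] * xs[-1][j] for j in range(2)) + d[i] for i in range(2)])
y = [Fraction(1), Fraction(0)]

eps_2 = tea2(xs, y, 1)
eps_4 = tea2(xs, y, 2)
```

For these data TEA2 gives $\boldsymbol{\varepsilon}_2^{(0)} = (4,\ 10/3)$, which differs from the TEA1 value in its second component, while $\boldsymbol{\varepsilon}_4^{(0)} = (44/13,\ 36/13)$ is again the fixed point.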

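It is also interesting to note from the second equality in (5.48) that the vectors $\boldsymbol{\varepsilon}_{2k}^{(n)}$ satisfy a three-term recursion relation of the form
\[
\boldsymbol{\varepsilon}_{2k+2}^{(n)} = c_{nk}\boldsymbol{\varepsilon}_{2k}^{(n+1)} + d_{nk}\boldsymbol{\varepsilon}_{2k}^{(n+2)}, \tag{5.49}
\]
$c_{nk}, d_{nk}$ being scalars satisfying $c_{nk} + d_{nk} = 1$.

Concerning the $\boldsymbol{\varepsilon}_k^{(n)}$, Brezinski [28, 29] has proved the results in the following theorem.

Theorem 5.9. The entries of the TEA2 table satisfy
\[
\boldsymbol{\varepsilon}_{2k}^{(n)} = e_k(\mathbf{x}_n) \quad\text{and}\quad \boldsymbol{\varepsilon}_{2k+1}^{(n)} = \bigl[e_k(\Delta\mathbf{x}_n)\bigr]^{-1} = \frac{\mathbf{y}}{\langle e_k(\Delta\mathbf{x}_n), \mathbf{y}\rangle}, \tag{5.50}
\]
where
\[
e_k(\mathbf{x}_n) = \frac{F(\mathbf{x}_{n+k}, \mathbf{x}_{n+k+1}, \ldots, \mathbf{x}_{n+2k})}{F(1, 1, \ldots, 1)}, \tag{5.51}
\]
with $F(v_0, v_1, \ldots, v_k)$ precisely as in (5.45). In addition, Theorem 5.5 is valid with the present $\boldsymbol{\varepsilon}_{2k}^{(n)}$ too.

Analogous to Theorem 5.7, we also have the following result.

Theorem 5.10. For TEA2, $\boldsymbol{\varepsilon}_{2k}^{(n)}$ is of the form
\[
\boldsymbol{\varepsilon}_{2k}^{(n)} = \sum_{i=0}^{k}\gamma_i\mathbf{x}_{n+k+i},
\]
the $\gamma_i$ being the solution to the linear system
\[
\sum_{j=0}^{k} u_{i,j}\gamma_j = 0, \quad i = 0, 1, \ldots, k-1, \qquad \sum_{j=0}^{k}\gamma_j = 1,
\]
with
\[
u_{i,j} = \langle \mathbf{y}, \mathbf{u}_{n+i+j}\rangle, \quad i, j \geq 0.
\]
Therefore, we also have the following analogue of Theorem 1.17:
\[
\Bigl\langle \mathbf{y}, \sum_{j=0}^{k}\gamma_j\mathbf{u}_{n+i+j}\Bigr\rangle = 0, \quad i = 0, 1, \ldots, k-1.
\]
Clearly, the present $\gamma_i$ are precisely those that arise from TEA1, hence are as in Theorem 5.7 with no changes. Since the determinant representation of $\mathbf{s}_{n,k}$ for TEA2 is almost the same as that of TEA1, we also have, analogous to Theorem 5.8, the following theorem.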


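Theorem 5.11. Let $\mathbf{u}_m = \mathbf{x}_{m+1} - \mathbf{x}_m$ and $\mathbf{w}_m = \mathbf{u}_{m+1} - \mathbf{u}_m$, $m = 0, 1, \ldots$, as usual. Then $\mathbf{s}_{n,k}$ exists and is unique if $\det\widehat{W} \neq 0$, where $\widehat{W}$ is exactly as given in (5.46).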

5.4.4 Operation count and computer storage requirements

We now turn to the analysis of the cost of computing $\boldsymbol{\varepsilon}_{2k}^{(n)}$ by TEA1 and TEA2. We recall again that, in the problems we are interested in, $N$, the dimension of the vectors $\mathbf{x}_m$, is extremely large, while $k$ takes on very small values relative to $N$. Consequently, the major part of the computational effort is spent in handling the large vectors, the rest being negligible.

Comparing the recursions defining TEA1 and TEA2, we realize that they have exactly the same computational cost and storage requirements. Furthermore, these are precisely the same as those of VEA. Thus, given the vectors $\boldsymbol{\varepsilon}_0^{(n+s)} = \mathbf{x}_{n+s}$, $s = 0, 1, \ldots, 2k$, the computation of $\boldsymbol{\varepsilon}_{2k}^{(n)}$ necessitates the computation of all the vectors $\boldsymbol{\varepsilon}_r^{(n+j)}$, $1 \leq j + r \leq 2k$, a total of $2k^2 + k$ vectors, each requiring one vector addition, one scalar-vector multiplication, and one saxpy operation. Taking into account that the computation of the $\boldsymbol{\varepsilon}_1^{(m)}$ is slightly different, it is easy to show that the whole process of computing $\boldsymbol{\varepsilon}_{2k}^{(n)}$ involves $6k^2 + O(k)$ vector operations and is as follows:

vector additions: $2k^2 + k$,
scalar-vector multiplications: $2k$,
dot products: $2k^2 + k$,
saxpy (scalar alpha x plus y): $2k^2 - k$.

As for storage, we need $(2k + 1)N$ memory locations throughout the whole process of computing $\boldsymbol{\varepsilon}_{2k}^{(n)}$. We may use them first to store $\mathbf{x}_n, \mathbf{x}_{n+1}, \ldots, \mathbf{x}_{n+2k}$ if these $\mathbf{x}_m$ are provided first. We overwrite these with the $\boldsymbol{\varepsilon}_r^{(m)}$ as we go along.

5.4.5 Concluding remark

The bilinear form of Brezinski in his definition of TEA1 and TEA2 is actually $\langle \mathbf{y}, \mathbf{z}\rangle = \mathbf{y}^T\mathbf{z}$, for which $\langle \mathbf{y}, \mathbf{z}\rangle = \langle \mathbf{z}, \mathbf{y}\rangle$, whereas $\langle \mathbf{y}, \mathbf{z}\rangle = \overline{\langle \mathbf{z}, \mathbf{y}\rangle}$ in our case here. (The way we defined the inverse of the ordered pair $\{\mathbf{a}, \mathbf{b}\}$ in the beginning of this section is consistent with the inner product we use here.) This has an important consequence when we apply TEA to a complex sequence $\{\mathbf{x}_m\}$; in this case $\langle \mathbf{y}, \mathbf{z}\rangle \neq \langle \mathbf{z}, \mathbf{y}\rangle$ generally speaking, which means that the order in which $\mathbf{y}$ and $\mathbf{z}$ appear in the inner product is important. This is very relevant when computing $1/\Delta\boldsymbol{\varepsilon}_{2k}^{(n)}$ and $1/\Delta\boldsymbol{\varepsilon}_{2k+1}^{(n)}$ in (5.41) and (5.48). A detailed discussion of this important issue is given in Tan [321]. (Of course, no problems occur for real sequences.)

5.5 Implementation of epsilon algorithms in cycling mode

Epsilon algorithms, like polynomial methods, can be applied in the cycling mode to the numerical solution of large systems of equations $\boldsymbol{\psi}(\mathbf{x}) = \mathbf{0}$, $\boldsymbol{\psi} : \mathbb{C}^N \to \mathbb{C}^N$, by fixed-point iterations $\mathbf{x}_{m+1} = \mathbf{f}(\mathbf{x}_m)$. Here are the steps of cycling for epsilon algorithms:


C0. Choose integers $n \geq 0$ and $k \geq 1$ and an initial vector $\mathbf{x}_0$.

C1. Compute the vectors $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_{n+2k}$ [via $\mathbf{x}_{m+1} = \mathbf{f}(\mathbf{x}_m)$].

C2. Apply any of the three epsilon algorithms, namely, SEA, VEA, and TEA, to the vectors $\mathbf{x}_n, \mathbf{x}_{n+1}, \ldots, \mathbf{x}_{n+2k}$, with end result $\mathbf{s}_{n,k} = \boldsymbol{\varepsilon}_{2k}^{(n)}$.

C3. If $\mathbf{s}_{n,k}$ satisfies the accuracy test, stop. Otherwise, set $\mathbf{x}_0 = \mathbf{s}_{n,k}$ and go to step C1.
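Steps C0–C3 can be sketched as a short driver loop. To keep the example small, the sketch below is restricted to scalar problems ($N = 1$) and uses the Shanks transformation with $k = 1$ (the Aitken $\Delta^2$-process, written in the algebraically equivalent form $x_n - (\Delta x_n)^2/\Delta^2 x_n$) as the extrapolation step. The function names, the residual-based accuracy test, the tolerance, and the test problem $x = \cos x$ are all assumptions of ours.

```python
import math

def aitken(xs):
    """e_1(x_n) from xs = [x_n, x_{n+1}, x_{n+2}], in the form x_n - (dx_n)^2 / d2x_n."""
    x0, x1, x2 = xs
    denom = x2 - 2 * x1 + x0
    if denom == 0:              # iteration has stagnated; nothing to extrapolate
        return x2
    return x0 - (x1 - x0) ** 2 / denom

def cycle(f, x0, n, k, accelerate, tol=1e-12, max_cycles=50):
    """Steps C0-C3: iterate, extrapolate, and restart from the extrapolated value."""
    s = x0
    for _ in range(max_cycles):
        xs = [s]
        for _ in range(n + 2 * k):          # C1: x_1, ..., x_{n+2k}
            xs.append(f(xs[-1]))
        s = accelerate(xs[n:])              # C2: s_{n,k} from x_n, ..., x_{n+2k}
        if abs(f(s) - s) <= tol:            # C3: accuracy test on the residual
            return s
    return s

root = cycle(math.cos, 1.0, n=0, k=1, accelerate=aitken)
```

With $n = 0$ and $k = 1$ on a scalar problem, this cycling procedure is Steffensen's method. For vector problems, `accelerate` would be replaced by one of the epsilon algorithms and `abs` by a vector norm.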

We will call each application of steps C1–C3 a cycle. We will also denote by $\mathbf{s}_{n,k}^{(i)}$ the $\mathbf{s}_{n,k}$ that is computed in the $i$th cycle. Under suitable conditions, the sequence $\{\mathbf{s}_{n,k}^{(i)}\}_{i=1}^{\infty}$ has very good convergence properties.

Since the approximations we generate when cycling with epsilon algorithms are $\boldsymbol{\varepsilon}_{2k}^{(n)}$ with fixed $n$ and $k$, and since it takes $6k^2 + O(k)$ vector operations and $(2k + O(1))N$ storage locations to compute $\boldsymbol{\varepsilon}_{2k}^{(n)}$, we may wonder whether $\boldsymbol{\varepsilon}_{2k}^{(n)}$ can be generated less expensively. We tackle this issue in the next section, where we develop very economical implementations, which we shall call ETEA, for both TEA1 and TEA2. The analogous treatment for VEA turns out to be less economical; therefore, we omit it.

5.6 ETEA: An economical implementation of TEA

Our discussion in Section 5.4 shows that the overhead cost of $6k^2 + O(k)$ vector operations when computing $\boldsymbol{\varepsilon}_{2k}^{(n)}$ is due to the implementation of TEA via the recursion relations that define TEA1 and TEA2, because they necessitate the computation of all $\boldsymbol{\varepsilon}_r^{(n+j)}$, $1 \leq j + r \leq 2k$. We now propose new procedures for computing $\boldsymbol{\varepsilon}_{2k}^{(n)}$ that do not use recursions. These procedures are based entirely on Theorems 5.7 and 5.10 and the approach to the implementation of the polynomial methods discussed in Chapters 2 and 4. They are very suitable for computing $\boldsymbol{\varepsilon}_{2k}^{(n)}$ with fixed $n$ and $k$. As such, they are especially suitable for use in the cycling mode.

Let us recall Theorems 5.7 and 5.10:
\[
\boldsymbol{\varepsilon}_{2k}^{(n)} = \sum_{i=0}^{k}\gamma_i\mathbf{x}_{n+i} \quad\text{for TEA1}, \qquad
\boldsymbol{\varepsilon}_{2k}^{(n)} = \sum_{i=0}^{k}\gamma_i\mathbf{x}_{n+k+i} \quad\text{for TEA2},
\]
the $\gamma_i$ (for both TEA1 and TEA2) being the solution to the linear system
\[
\sum_{j=0}^{k} u_{i,j}\gamma_j = 0, \quad i = 0, 1, \ldots, k-1, \qquad \sum_{j=0}^{k}\gamma_j = 1,
\]
with
\[
u_{i,j} = \langle \mathbf{y}, \mathbf{u}_{n+i+j}\rangle, \quad i, j \geq 0, \qquad \mathbf{u}_m = \mathbf{x}_{m+1} - \mathbf{x}_m.
\]


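Here are the steps of ETEA1 and ETEA2, our new implementations of TEA1 and TEA2:

1. Choose the integers $n \ge 0$ and $k \ge 1$ and the vectors $\mathbf{x}_0$ and $\mathbf{y}$.

2. Introduce $\mathbf{x}_1, \ldots, \mathbf{x}_n$ and save only $\mathbf{x}_n$.

3. For $i = 0, 1, \ldots, 2k-1$, introduce $\mathbf{x}_{n+i+1}$ and compute $\mathbf{u}_{n+i} = \mathbf{x}_{n+i+1} - \mathbf{x}_{n+i}$, and then compute the scalar $a_i = \langle \mathbf{y}, \mathbf{u}_{n+i} \rangle$.
   • For ETEA1, save $\mathbf{x}_n$ and $\mathbf{u}_{n+i}$, $i = 0, 1, \ldots, k-1$, and discard the rest of the $\mathbf{x}_{n+j}$ and $\mathbf{u}_{n+j}$ as soon as the corresponding $a_j$ are computed.
   • For ETEA2, save $\mathbf{x}_{n+k}$ and $\mathbf{u}_{n+k+i}$, $i = 0, 1, \ldots, k-1$, and discard the rest of the $\mathbf{x}_{n+j}$ and $\mathbf{u}_{n+j}$ as soon as the corresponding $a_j$ are computed.

4. Set $u_{i,j} = a_{i+j}$ and solve the linear system (see footnote 21)
\[ \sum_{j=0}^{k-1} u_{i,j} c_j = -u_{i,k}, \quad i = 0, 1, \ldots, k-1. \]
   Set $c_k = 1$ and compute
\[ \gamma_i = \frac{c_i}{\sum_{j=0}^{k} c_j}, \quad i = 0, 1, \ldots, k, \qquad \text{provided } \sum_{j=0}^{k} c_j \ne 0. \]

5. With the $\gamma_i$ available, compute the scalars $\xi_0, \xi_1, \ldots, \xi_{k-1}$ via
\[ \xi_0 = 1 - \gamma_0, \qquad \xi_j = \xi_{j-1} - \gamma_j, \quad j = 1, \ldots, k-1. \]

6. Compute $\boldsymbol{\varepsilon}^{(n)}_{2k}$ via
\[ \boldsymbol{\varepsilon}^{(n)}_{2k} = \mathbf{x}_n + \sum_{j=0}^{k-1} \xi_j \mathbf{u}_{n+j} \quad \text{for ETEA1}, \]
\[ \boldsymbol{\varepsilon}^{(n)}_{2k} = \mathbf{x}_{n+k} + \sum_{j=0}^{k-1} \xi_j \mathbf{u}_{n+k+j} \quad \text{for ETEA2}. \]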
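Steps 1–6 above can be sketched in plain Python as follows. This is a minimal illustration, not a tuned implementation: the function names (`etea`, `solve`, `dot`, `f_demo`) are hypothetical, vectors are real Python lists, $\langle \mathbf{y}, \mathbf{u} \rangle$ is taken to be the Euclidean dot product, and the $k \times k$ Hankel system of step 4 is solved by ordinary Gaussian elimination with partial pivoting rather than by a dedicated $O(k^2)$ Hankel solver.

```python
def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def solve(A, b):
    """Solve A c = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            fac = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= fac * M[col][c]
    c = [0.0] * n
    for i in reversed(range(n)):
        c[i] = (M[i][n] - sum(M[i][j] * c[j] for j in range(i + 1, n))) / M[i][i]
    return c

def etea(f, x0, y, n, k, variant=1):
    """Return eps_{2k}^{(n)} via ETEA1 (variant=1) or ETEA2 (variant=2)."""
    x = list(x0)
    for _ in range(n):                       # step 2: keep only x_n
        x = f(x)
    lo = 0 if variant == 1 else k            # first index i of the saved u_{n+i}
    base, saved_u, a = None, [], []
    for i in range(2 * k):                   # step 3
        if i == lo:
            base = x                         # x_n (ETEA1) or x_{n+k} (ETEA2)
        x_next = f(x)
        u = [p - q for p, q in zip(x_next, x)]
        a.append(dot(y, u))                  # a_i = <y, u_{n+i}>
        if lo <= i < lo + k:
            saved_u.append(u)
        x = x_next
    # Step 4: Hankel system sum_{j<k} a_{i+j} c_j = -a_{i+k}, then c_k = 1.
    H = [[a[i + j] for j in range(k)] for i in range(k)]
    c = solve(H, [-a[i + k] for i in range(k)]) + [1.0]
    total = sum(c)
    if total == 0:
        raise ZeroDivisionError("sum of c_j vanishes; approximation undefined")
    gamma = [ci / total for ci in c]
    # Step 5: xi_0 = 1 - gamma_0, xi_j = xi_{j-1} - gamma_j.
    xi, acc = [], 1.0
    for j in range(k):
        acc -= gamma[j]
        xi.append(acc)
    # Step 6: base vector plus k saxpy operations.
    result = list(base)
    for xj, u in zip(xi, saved_u):
        result = [r + xj * uc for r, uc in zip(result, u)]
    return result

def f_demo(x):
    # Hypothetical linear iteration x_{m+1} = T x_m + d, T = diag(0.5, -0.3), d = (1, 2).
    return [0.5 * x[0] + 1.0, -0.3 * x[1] + 2.0]

s1 = etea(f_demo, [0.0, 0.0], [1.0, 1.0], n=0, k=2, variant=1)
s2 = etea(f_demo, [0.0, 0.0], [1.0, 1.0], n=0, k=2, variant=2)
```

In this demonstration the error $\mathbf{x}_m - \mathbf{s}$ involves only two distinct eigenvalues, so with $k = 2$ both variants should reproduce the fixed point $(2, 2/1.3)$ up to rounding.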

The ETEA implementations for computing $\boldsymbol{\varepsilon}^{(n)}_{2k}$ we have just described involve $5k$ vector operations, namely,

vector additions: $2k$,
dot products: $2k$,
saxpy (scalar alpha x plus y): $k$.

As for storage, we need $(k+1)N$ memory locations (for $\mathbf{x}_n$ and $\mathbf{u}_{n+i}$, $0 \le i \le k-1$, with ETEA1 and for $\mathbf{x}_{n+k}$ and $\mathbf{u}_{n+k+i}$, $0 \le i \le k-1$, with ETEA2) throughout the whole process of computing $\boldsymbol{\varepsilon}^{(n)}_{2k}$. [These should be compared with $6k^2 + O(k)$ vector operations and $(2k+1)N$ storage locations incurred by the recursive computations of TEA1 and TEA2.]

Recently, two simplifications of TEA, one for TEA1 and another for TEA2, were given by Brezinski and Redivo Zaglia [43], called STEA1 and STEA2, respectively. For all practical purposes, they are based on (5.40)–(5.41) and (5.47)–(5.48). Of these, STEA2 is the preferred one since it requires $k$ vectors to be stored in the core memory, while STEA1 requires $2k$ vectors to be stored. We do not go into the details of these simplifications here. We only mention that the recursion relations for the $\boldsymbol{\varepsilon}^{(n)}_{2k}$ in STEA2 are simplified to read

\[ \boldsymbol{\varepsilon}^{(n)}_{2k+2} = \boldsymbol{\varepsilon}^{(n+1)}_{2k} + \frac{\hat{\varepsilon}^{(n)}_{2k+2} - \hat{\varepsilon}^{(n+1)}_{2k}}{\hat{\varepsilon}^{(n+2)}_{2k} - \hat{\varepsilon}^{(n+1)}_{2k}} \left( \boldsymbol{\varepsilon}^{(n+2)}_{2k} - \boldsymbol{\varepsilon}^{(n+1)}_{2k} \right), \]

where the (scalar) $\hat{\varepsilon}^{(n)}_{k}$ are obtained by applying the epsilon algorithm to the scalar sequence $\{\langle \mathbf{y}, \mathbf{x}_m \rangle\}$. The number of vector operations required by both simplifications for computing $\boldsymbol{\varepsilon}^{(n)}_{2k}$ is $k^2 + O(k)$ vector additions, $k^2 + O(k)$ saxpy operations, and $2k$ dot products. Comparing this overhead with that of ETEA proposed here, we see that ETEA is more economical as far as numbers of vector operations are concerned, the storage requirements of ETEA2 and STEA2 being the same.

Footnote 21. Since $u_{i,j} = a_{i+j}$, the matrix of the linear system $\sum_{j=0}^{k-1} u_{i,j} c_j = -u_{i,k}$, $i = 0, 1, \ldots, k-1$, is a Hankel matrix. This system can be solved using $O(k^2)$ arithmetic operations. See Golub and Van Loan [103], for example.

5.7 Comparison of epsilon algorithms with polynomial methods

We end this chapter by briefly comparing epsilon algorithms with polynomial methods, when all these methods are being applied in the cycling mode in numerical solution of large systems of equations $\boldsymbol{\psi}(\mathbf{x}) = \mathbf{0}$, $\boldsymbol{\psi} : \mathbb{C}^N \to \mathbb{C}^N$, by fixed-point iterations $\mathbf{x}_{m+1} = \mathbf{f}(\mathbf{x}_m)$. By Theorems 1.12 and 5.5, we know that whenever the sequence $\{\mathbf{x}_m\}$ satisfies (1.50) or (5.37), MPE and RRE give $\mathbf{s}_{n,k} = \mathbf{s}$ and VEA and TEA give $\boldsymbol{\varepsilon}^{(n)}_{2k} = \mathbf{s}$. Thus, it can be argued heuristically that, for arbitrary $n$ and $k$, $\mathbf{s}_{n,k}$ from MPE or RRE and $\boldsymbol{\varepsilon}^{(n)}_{2k}$ from VEA and TEA have comparable accuracies. (This can also be concluded from the fact that they converge at the same rate as $n \to \infty$ if $\mathbf{f}(\mathbf{x})$ is a linear function, as we will show later in Chapter 6.)

Now, in many practical applications, such as those arising from the numerical solution of systems of linear or nonlinear partial differential equations, for example, the cost of computing the vectors $\mathbf{x}_m$ is dominant, the cost of the arithmetic operations needed for implementing the various extrapolation methods being a small percentage of the total computational cost. In such cases, the polynomial methods are clearly preferable over the epsilon algorithms. This is so because the vectors needed for $\mathbf{s}_{n,k}$ are $\mathbf{x}_0, \mathbf{x}_1, \ldots, \mathbf{x}_{n+k+1}$ ($n + k + 2$ vectors altogether), while the vectors needed for $\boldsymbol{\varepsilon}^{(n)}_{2k}$ are $\mathbf{x}_0, \mathbf{x}_1, \ldots, \mathbf{x}_{n+2k}$ ($n + 2k + 1$ vectors altogether). For $n = 0$, for example, $2k$ vectors need to be computed for the epsilon algorithms, while $k + 1$ vectors need to be computed for the polynomial methods. In addition, the computational overhead (the number of arithmetic operations) incurred for obtaining a given level of accuracy with epsilon algorithms is much higher than that incurred by polynomial methods, as can be seen by comparing their operation counts.


If the cost of computing the vectors x m is sufficiently low, however, TEA (via ETEA) can be used just as effectively as polynomial methods since its storage requirement is the same as those of MPE and RRE, but its computational overhead is considerably smaller. In remark 2 following the statement of Theorem 6.6, we mention that the condition in (6.12a) that guarantees the success of MPE and RRE need not hold for the success of TEA. What does guarantee the success of TEA is the condition in (6.12c), which can always be satisfied by an appropriate choice of the vector q in Theorem 6.6; note that there is always a suitable choice for q. Thus, TEA seems to have a clear advantage over MPE and RRE in such cases. It seems that there exist classes of problems on which the epsilon algorithms VEA and TEA would perform as well as, or even better than, polynomial methods. We discuss this issue in Chapter 6.

Chapter 6

Convergence Study of Extrapolation Methods: Part I

6.1 Introduction

6.1.1 General remarks

Given the vector sequence $\{\mathbf{x}_m\}$, the approximations obtained by applying the various vector extrapolation methods to it can be arranged in a two-dimensional array as in Table 6.1.

Table 6.1. The extrapolation table.

\[ \begin{array}{cccc} \mathbf{s}_{0,0} & \mathbf{s}_{1,0} & \mathbf{s}_{2,0} & \cdots \\ \mathbf{s}_{0,1} & \mathbf{s}_{1,1} & \mathbf{s}_{2,1} & \cdots \\ \mathbf{s}_{0,2} & \mathbf{s}_{1,2} & \mathbf{s}_{2,2} & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{array} \]

Here $\mathbf{s}_{n,k}$ also stands for the $\boldsymbol{\varepsilon}^{(n)}_{2k}$ that result from the different epsilon algorithms. Note that $k$ is fixed along the rows of the table, while $n$ is fixed along its columns. In the $k = 0$ row, we have $\mathbf{s}_{n,0} = \mathbf{x}_n$; that is, the entries of this row are simply the elements of the sequence $\{\mathbf{x}_m\}$ whose limit or antilimit is $\mathbf{s}$. We also know in theory how good the $\mathbf{x}_m$ are as approximations to $\mathbf{s}$; we know precisely how they converge to (or diverge from) $\mathbf{s}$. Now the entries in the remaining rows of this table are obtained from those in the $k = 0$ row, and they are intended to be better approximations to $\mathbf{s}$. Two important questions that arise then are: “Are these rows really better? What can be said about their convergence and rates of convergence?” Similar questions can also be asked about the columns of the table.

Our purpose in this chapter is to address these questions for row sequences $\{\mathbf{s}_{n,k}\}_{n=0}^{\infty}$, with $k$ fixed, when the sequences $\{\mathbf{x}_m\}$ are generated by linear iterative processes. For simplicity, we will present the details of our study for sequences $\{\mathbf{x}_m\}$ in $\mathbb{C}^N$. The results of our investigation will also suggest general strategies for efficient application of vector extrapolation methods in general.

The theory presented here concerns the methods MPE, RRE, MMPE, and TEA and was developed in the papers Sidi, Ford, and Smith [299]; Sidi [261, 263, 268, 273]; and Sidi and Bridger [294]. (For a completely different approach to error analysis involving angles between vectors and subspaces, see Jbilou and Sadok [150].)

At the end of this chapter, we will consider sequences in infinite-dimensional spaces, such as Hilbert spaces or other normed linear spaces. It turns out that the results obtained for $\mathbb{C}^N$ can be extended to such spaces with minor modifications. This, of course, expands the scope of vector extrapolation methods significantly. All the works by the author and collaborators mentioned in the preceding paragraph consider sequences in infinite-dimensional spaces.

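6.1.2 Summary of determinant representations

Before going on, we generalize the $\mathbf{s}_{n,k}$ that result from applying MPE, RRE, MMPE, and TEA to $\{\mathbf{x}_m\}$ slightly as follows.

Definition 6.1. We define the approximations $\mathbf{s}_{n,k;\nu}$ via the following determinant representations:

\[ \mathbf{s}_{n,k;\nu} = \frac{D(\mathbf{x}_{n+\nu}, \mathbf{x}_{n+\nu+1}, \ldots, \mathbf{x}_{n+\nu+k})}{D(1, 1, \ldots, 1)} \quad \text{for some fixed } \nu \ge 0, \tag{6.1} \]

where

\[ D(v_0, v_1, \ldots, v_k) = \begin{vmatrix} v_0 & v_1 & \cdots & v_k \\ u_{0,0} & u_{0,1} & \cdots & u_{0,k} \\ u_{1,0} & u_{1,1} & \cdots & u_{1,k} \\ \vdots & \vdots & & \vdots \\ u_{k-1,0} & u_{k-1,1} & \cdots & u_{k-1,k} \end{vmatrix}, \tag{6.2} \]

with

\[ u_{i,j} = \begin{cases} (\mathbf{u}_{n+i}, \mathbf{u}_{n+j}) = (\Delta\mathbf{x}_{n+i}, \Delta\mathbf{x}_{n+j}) & \text{for MPE}, \\ (\mathbf{w}_{n+i}, \mathbf{u}_{n+j}) = (\Delta^2\mathbf{x}_{n+i}, \Delta\mathbf{x}_{n+j}) & \text{for RRE}, \\ (\mathbf{q}_{i+1}, \mathbf{u}_{n+j}) = (\mathbf{q}_{i+1}, \Delta\mathbf{x}_{n+j}) & \text{for MMPE}, \\ (\mathbf{q}, \mathbf{u}_{n+i+j}) = (\mathbf{q}, \Delta\mathbf{x}_{n+i+j}) & \text{for TEA}. \end{cases} \tag{6.3} \]

Here $(\mathbf{y}, \mathbf{z}) = \mathbf{y}^* M \mathbf{z}$, $M$ Hermitian positive definite, is an arbitrary inner product in $\mathbb{C}^N$. The vectors $\mathbf{q}_1, \ldots, \mathbf{q}_k$ are fixed for MMPE. The vector $\mathbf{q}$ is fixed for TEA.

Remarks:

1. Note that the $u_{i,j}$ are exactly as defined in Theorem 1.15 for MPE, RRE, and MMPE. Thus, $\mathbf{s}_{n,k;0}$ is exactly $\mathbf{s}_{n,k}$ of Theorem 1.15. We will continue to let $\mathbf{s}_{n,k}$ mean $\mathbf{s}_{n,k;0}$ in what follows.

2. The definition of $u_{i,j}$ for TEA is consistent with that given in Chapter 5, namely, with $u_{i,j} = \langle \mathbf{y}, \mathbf{u}_{n+i+j} \rangle$, where $\mathbf{y}$ is a fixed vector, when we relate $\mathbf{y}$ and $\mathbf{q}$ via $\mathbf{y} = M\mathbf{q}$ or, equivalently, $\mathbf{q} = M^{-1}\mathbf{y}$. Then $(\mathbf{q}, \mathbf{u}_{n+i+j}) = \langle \mathbf{y}, \mathbf{u}_{n+i+j} \rangle$. As a result, $\mathbf{s}_{n,k;0} = \boldsymbol{\varepsilon}^{(n)}_{2k}$ as defined in Theorem 5.6 for TEA1, and $\mathbf{s}_{n,k;k} = \boldsymbol{\varepsilon}^{(n)}_{2k}$ as defined in Theorem 5.9 for TEA2.

By Lemma 1.14, we also have the following.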


Lemma 6.2. The vector $\mathbf{s}_{n,k;\nu}$ in Definition 6.1 is also of the form

\[ \mathbf{s}_{n,k;\nu} = \sum_{i=0}^{k} \gamma_i \mathbf{x}_{n+\nu+i}, \tag{6.4a} \]

where the $\gamma_i$ are independent of $\nu$ and satisfy

\[ \sum_{j=0}^{k} u_{i,j}\gamma_j = 0, \quad i = 0, 1, \ldots, k-1; \qquad \sum_{j=0}^{k} \gamma_j = 1. \tag{6.4b} \]

In the next lemma, we give determinant representations for the polynomial $Q_{n,k}(z) = \sum_{i=0}^{k} \gamma_i z^i$ and the error $\mathbf{s}_{n,k;\nu} - \mathbf{s}$ for the different extrapolation methods we have encountered so far. The proof of this lemma follows directly from Lemma 1.14 again.

Lemma 6.3. With $\mathbf{s}_{n,k;\nu}$ as in Definition 6.1, and the $\gamma_i$ as in (6.4a)–(6.4b), we have the determinant representations

\[ Q_{n,k}(z) = \sum_{i=0}^{k} \gamma_i z^i = \frac{H_{n,k}(z)}{H_{n,k}(1)}, \tag{6.5} \]

where

\[ H_{n,k}(z) = D(z^0, z^1, \ldots, z^k) \quad \text{and} \quad H_{n,k}(1) = D(1, 1, \ldots, 1), \tag{6.6} \]

and

\[ \mathbf{s}_{n,k;\nu} - \mathbf{s} = \sum_{i=0}^{k} \gamma_i (\mathbf{x}_{n+\nu+i} - \mathbf{s}) = \frac{D(\mathbf{x}_{n+\nu} - \mathbf{s}, \mathbf{x}_{n+\nu+1} - \mathbf{s}, \ldots, \mathbf{x}_{n+\nu+k} - \mathbf{s})}{D(1, 1, \ldots, 1)}. \tag{6.7} \]
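To make (6.4a)–(6.4b) concrete, the following Python sketch assembles and solves the $(k+1) \times (k+1)$ linear system for the $\gamma_i$ in the MPE case and then forms $\mathbf{s}_{n,k;\nu}$ for two values of $\nu$ with the same $\gamma_i$. It is only an illustration under assumptions: real vectors, the Euclidean inner product ($M = I$), a dense solve by Gaussian elimination (not the implementations of Chapters 2 and 4), and hypothetical names `mpe_gammas` and `s_nk`.

```python
def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def solve(A, b):
    """Solve A g = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            fac = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= fac * M[col][c]
    g = [0.0] * n
    for i in reversed(range(n)):
        g[i] = (M[i][n] - sum(M[i][j] * g[j] for j in range(i + 1, n))) / M[i][i]
    return g

def mpe_gammas(xs, n, k):
    """gamma_0..gamma_k from (6.4b) with u_{i,j} = (u_{n+i}, u_{n+j}) (MPE)."""
    u = [[q - p for p, q in zip(xs[m], xs[m + 1])] for m in range(n, n + k + 1)]
    A = [[dot(u[i], u[j]) for j in range(k + 1)] for i in range(k)]
    A.append([1.0] * (k + 1))            # normalization: sum of gamma_j = 1
    return solve(A, [0.0] * k + [1.0])

def s_nk(xs, n, k, nu=0):
    """s_{n,k;nu} = sum_i gamma_i x_{n+nu+i}, as in (6.4a)."""
    g = mpe_gammas(xs, n, k)
    dim = len(xs[0])
    return [sum(g[i] * xs[n + nu + i][c] for i in range(k + 1)) for c in range(dim)]

# Hypothetical linear iteration with T = diag(0.5, -0.3), d = (1, 2).
xs = [[0.0, 0.0]]
for _ in range(4):
    xs.append([0.5 * xs[-1][0] + 1.0, -0.3 * xs[-1][1] + 2.0])

gammas = mpe_gammas(xs, 0, 2)
s_nu0 = s_nk(xs, 0, 2, nu=0)
s_nu1 = s_nk(xs, 0, 2, nu=1)
```

In this small example the error involves only two eigenvalues, so with $k = 2$ both `s_nu0` and `s_nu1` should agree with the fixed point $(2, 2/1.3)$ up to rounding, and the $\gamma_i$ should be the coefficients of $(z - 0.5)(z + 0.3)/0.65$.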

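6.2 Convergence and stability of rows in the extrapolation table

Throughout, we make the following key assumption on $\{\mathbf{x}_m\} \subset \mathbb{C}^N$:

\[ \mathbf{x}_m = \mathbf{s} + \sum_{i=1}^{p} \mathbf{v}_i \lambda_i^m, \quad m \ge m_0 \ge 0, \quad p \le N, \tag{6.8} \]

where

\[ \mathbf{v}_i \ne \mathbf{0}, \quad \lambda_i \in \mathbb{C}, \quad \lambda_i \text{ distinct}, \quad \lambda_i \ne 0, 1, \quad i = 1, \ldots, p. \tag{6.9} \]

We also order the $\lambda_i$ as in

\[ |\lambda_1| \ge |\lambda_2| \ge \cdots \ge |\lambda_p|. \tag{6.10} \]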

Lemma 6.4. The vectors x m generated by the fixed-point iterative process x m+1 = T x m + d with diagonalizable T , which are described in Theorem 0.2, are exactly of the form described in (6.8) with (6.9). The λi are some or all of the distinct nonzero eigenvalues of T , and the v i are corresponding eigenvectors, that is, for each i, we have T v i = λi v i . In addition, the vectors v i are linearly independent.


Proof. Let us go back to Theorem 0.2 and replace the λi there by μi to avoid confusion. By (0.32) and (0.33), we realize that (i) the zero eigenvalues of T do not contribute to x m − s for m ≥ 1 since their contributions vanish as soon as m = 1, (ii) all the eigenvalues μi of T are different from one since I − T is nonsingular, (iii) nonzero eigenvalues μi for which αi = 0 do not contribute anything to x m − s, and (iv) the contributions of equal nonzero eigenvalues that do contribute can be lumped together into one. Renaming the distinct nonzero eigenvalues that contribute, we obtain the expansion in (6.8) with (6.9). As a result, the λi in (6.8) are some or all of the distinct and nonzero eigenvalues μ h of T and the v i are corresponding eigenvectors, that is, T v i = λi v i , i = 1, . . . , p. Hence, the v i are linearly independent as well.

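Example 6.5. As an example, consider the case in which $T$ is a $10 \times 10$ matrix with eigenvalues $\mu_1, \ldots, \mu_{10}$ such that $\mu_1 = \mu_2 = 0$; $\mu_4 = \mu_5 = \mu_6$; $\mu_7 = \mu_8$; and $\mu_3$, $\mu_4$, $\mu_7$, $\mu_9$, and $\mu_{10}$ are nonzero and distinct. Denote the corresponding eigenvectors by $\mathbf{w}_i$. Assume that $\alpha_3 = \alpha_5 = \alpha_{10} = 0$ in (0.32) and (0.33), the rest of the $\alpha_i$ being nonzero. Then (0.32) becomes

\[ \mathbf{x}_0 - \mathbf{s} = (\alpha_1\mathbf{w}_1 + \alpha_2\mathbf{w}_2) + (\alpha_4\mathbf{w}_4 + \alpha_6\mathbf{w}_6) + (\alpha_7\mathbf{w}_7 + \alpha_8\mathbf{w}_8) + \alpha_9\mathbf{w}_9, \]

while for $m \ge 1$, (0.33) becomes

\[ \mathbf{x}_m - \mathbf{s} = (\alpha_4\mathbf{w}_4 + \alpha_6\mathbf{w}_6)\mu_4^m + (\alpha_7\mathbf{w}_7 + \alpha_8\mathbf{w}_8)\mu_7^m + \alpha_9\mathbf{w}_9\mu_9^m. \]

Now, denoting $\lambda_1 = \mu_4$, $\lambda_2 = \mu_7$, and $\lambda_3 = \mu_9$, and denoting $\mathbf{v}_1 = \alpha_4\mathbf{w}_4 + \alpha_6\mathbf{w}_6$, $\mathbf{v}_2 = \alpha_7\mathbf{w}_7 + \alpha_8\mathbf{w}_8$, and $\mathbf{v}_3 = \alpha_9\mathbf{w}_9$, as well, we have

\[ \mathbf{x}_m - \mathbf{s} = \sum_{i=1}^{3} \mathbf{v}_i \lambda_i^m. \]

Because the $\mathbf{w}_i$ are all linearly independent, the $\mathbf{v}_i$ are nonzero and also linearly independent. Clearly, we also have that $T\mathbf{v}_i = \lambda_i\mathbf{v}_i$, $i = 1, 2, 3$. We have thus shown that $\mathbf{x}_m$ is precisely as in (6.8) with (6.9), the $\mathbf{v}_i$ being linearly independent in addition.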
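The bookkeeping in Example 6.5 can be checked numerically. The sketch below is only an illustration under assumptions: $T$ is taken to be diagonal (so the eigenvectors are the coordinate vectors), and the particular eigenvalues, coefficients $\alpha_i$, and solution $\mathbf{s}$ are made-up values chosen to match the pattern of the example. It groups the contributing distinct nonzero eigenvalues and compares the resulting three-term expansion with the actual iterates for $m \ge 1$.

```python
# Eigenvalue pattern of Example 6.5: mu1 = mu2 = 0, mu4 = mu5 = mu6, mu7 = mu8.
mu    = [0.0, 0.0, 0.9, 0.7, 0.7, 0.7, -0.5, -0.5, 0.3, 0.2]
# alpha3 = alpha5 = alpha10 = 0; the remaining alpha_i are nonzero.
alpha = [1.0, 2.0, 0.0, 3.0, 0.0, 4.0, 5.0, 6.0, 7.0, 0.0]
s = [float(i + 1) for i in range(10)]            # hypothetical solution vector
d = [(1.0 - m) * si for m, si in zip(mu, s)]     # so that s = T s + d

def step(x):
    """One iteration x_{m+1} = T x_m + d with T = diag(mu)."""
    return [m * xi + di for m, xi, di in zip(mu, x, d)]

def contributing_terms(mu, alpha):
    """Map each contributing distinct nonzero eigenvalue lambda_i to its vector v_i."""
    groups = {}
    for i, (m, a) in enumerate(zip(mu, alpha)):
        if m != 0.0 and a != 0.0:
            v = groups.setdefault(m, [0.0] * len(mu))
            v[i] += a
    return groups

def expansion(groups, s, m):
    """s + sum_i v_i lambda_i^m, the right-hand side of (6.8)."""
    x = list(s)
    for lam, v in groups.items():
        x = [xi + vi * lam ** m for xi, vi in zip(x, v)]
    return x

groups = contributing_terms(mu, alpha)
x = [si + a for si, a in zip(s, alpha)]          # x_0 = s + sum_i alpha_i w_i
x0_mismatch = max(abs(p - q) for p, q in zip(x, expansion(groups, s, 0)))
errors = []
for m in range(1, 6):
    x = step(x)
    ref = expansion(groups, s, m)
    errors.append(max(abs(p - q) for p, q in zip(x, ref)))
```

Here `groups` ends up with exactly three entries, and the expansion matches the iterates only from $m = 1$ on, because the zero eigenvalues still contribute to $\mathbf{x}_0 - \mathbf{s}$; this is the role of $m_0$ in (6.8).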

In this section, we state the stability and convergence results that relate to the behavior of $\mathbf{s}_{n,k;\nu}$ as $n \to \infty$. We recall that these sequences form the rows of the extrapolation table, namely, Table 6.1. We shall state the simplest versions of these theorems here, leaving the more complete versions to Sections 6.3–6.7, in which we provide the proofs. When necessary, we will denote the $\mathbf{s}_{n,k;\nu}$ that result from MPE, RRE, MMPE, and TEA (TEA1 or TEA2), respectively, by $\mathbf{s}^{\mathrm{MPE}}_{n,k;\nu}$, $\mathbf{s}^{\mathrm{RRE}}_{n,k;\nu}$, $\mathbf{s}^{\mathrm{MMPE}}_{n,k;\nu}$, and $\mathbf{s}^{\mathrm{TEA}}_{n,k;\nu}$.

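6.2.1 Convergence of rows with $|\lambda_k| > |\lambda_{k+1}|$

By the ordering of the $\lambda_i$ in (6.8)–(6.10), we have either $|\lambda_k| > |\lambda_{k+1}|$ or $|\lambda_k| = |\lambda_{k+1}|$. We start by studying the sequences $\{\mathbf{s}_{n,k;\nu}\}_{n=0}^{\infty}$ for which $|\lambda_k| > |\lambda_{k+1}|$.

Theorem 6.6. Let $\{\mathbf{x}_m\}$, $\lambda_i$, and $\mathbf{v}_i$ be as in (6.8)–(6.9), with the $\lambda_i$ ordered as in (6.10). Let $\mathbf{s}_{n,k;\nu}$ be as in Definition 6.1. Assume that

\[ |\lambda_k| > |\lambda_{k+1}|. \tag{6.11} \]

Assume, in addition, that

\[ \mathbf{v}_1, \ldots, \mathbf{v}_k \text{ are linearly independent} \quad \text{for MPE, RRE, and MMPE}, \tag{6.12a} \]

\[ \begin{vmatrix} (\mathbf{q}_1, \mathbf{v}_1) & (\mathbf{q}_1, \mathbf{v}_2) & \cdots & (\mathbf{q}_1, \mathbf{v}_k) \\ (\mathbf{q}_2, \mathbf{v}_1) & (\mathbf{q}_2, \mathbf{v}_2) & \cdots & (\mathbf{q}_2, \mathbf{v}_k) \\ \vdots & \vdots & & \vdots \\ (\mathbf{q}_k, \mathbf{v}_1) & (\mathbf{q}_k, \mathbf{v}_2) & \cdots & (\mathbf{q}_k, \mathbf{v}_k) \end{vmatrix} \ne 0 \quad \text{for MMPE}, \tag{6.12b} \]

and

\[ \prod_{i=1}^{k} (\mathbf{q}, \mathbf{v}_i) \ne 0 \quad \text{for TEA}. \tag{6.12c} \]

Then the following are true:

(i) Existence: $\mathbf{s}^{\mathrm{RRE}}_{n,k;\nu}$ exists unconditionally for all $n$, and $\mathbf{s}^{\mathrm{MPE}}_{n,k;\nu}$, $\mathbf{s}^{\mathrm{MMPE}}_{n,k;\nu}$, and $\mathbf{s}^{\mathrm{TEA}}_{n,k;\nu}$ exist for all large $n$.

(ii) Acceleration of convergence: For all four methods,

\[ \mathbf{s}_{n,k;\nu} - \mathbf{s} = \left[ \mathbf{a}_{n,k;\nu} + o(1) \right] |\lambda_{k+1}|^n = O(\lambda_{k+1}^n) \quad \text{as } n \to \infty, \tag{6.13} \]

where the vectors $\mathbf{a}_{n,k;\nu}$ are bounded in $n$, that is, $\sup_n \|\mathbf{a}_{n,k;\nu}\| < \infty$. For MPE and RRE, the vectors $\mathbf{a}_{n,k;\nu}$ are identical. Thus, as $n \to \infty$, the errors $\mathbf{s}^{\mathrm{MPE}}_{n,k;\nu} - \mathbf{s}$ and $\mathbf{s}^{\mathrm{RRE}}_{n,k;\nu} - \mathbf{s}$ are asymptotically equal as well; that is,

\[ \mathbf{s}^{\mathrm{MPE}}_{n,k;\nu} - \mathbf{s} \sim \mathbf{s}^{\mathrm{RRE}}_{n,k;\nu} - \mathbf{s} \quad \text{as } n \to \infty, \text{ componentwise}. \]

(iii) The polynomial $Q_{n,k}(z) = \sum_{i=0}^{k} \gamma_i z^i$: Denote $\gamma^{(n,k)}_i \equiv \gamma_i$ to show their dependence on $n$ and $k$. Then the polynomial $Q_{n,k}(z)$ satisfies

\[ Q_{n,k}(z) = \sum_{i=0}^{k} \gamma^{(n,k)}_i z^i = \widetilde{Q}_k(z) + O\!\left( |\lambda_{k+1}/\lambda_k|^n \right) \quad \text{as } n \to \infty, \tag{6.14} \]

where

\[ \widetilde{Q}_k(z) = \prod_{i=1}^{k} \frac{z - \lambda_i}{1 - \lambda_i}. \tag{6.15} \]

Thus, $\lim_{n\to\infty} \gamma^{(n,k)}_i$ all exist, and

\[ \lim_{n\to\infty} Q_{n,k}(z) = \sum_{i=0}^{k} \left[ \lim_{n\to\infty} \gamma^{(n,k)}_i \right] z^i = \widetilde{Q}_k(z). \tag{6.16} \]

Therefore, for all large $n$, $Q_{n,k}(z)$ has exactly $k$ zeros, $z^{(n,k)}_1, \ldots, z^{(n,k)}_k$, which, as $n \to \infty$, tend to $\lambda_1, \ldots, \lambda_k$, respectively, and satisfy

\[ z^{(n,k)}_s - \lambda_s = O\!\left( |\lambda_{k+1}/\lambda_s|^n \right) \quad \text{as } n \to \infty, \quad s = 1, \ldots, k. \tag{6.17} \]

If the $\mathbf{v}_i$ are mutually orthogonal, that is, $(\mathbf{v}_i, \mathbf{v}_j) = 0$ if $i \ne j$, and the $\gamma^{(n,k)}_i$ are those related to MPE or RRE, (6.14) and (6.17) improve to read, respectively,

\[ Q_{n,k}(z) = \sum_{i=0}^{k} \gamma^{(n,k)}_i z^i = \widetilde{Q}_k(z) + O\!\left( |\lambda_{k+1}/\lambda_k|^{2n} \right) \quad \text{as } n \to \infty, \tag{6.18} \]

and

\[ z^{(n,k)}_s - \lambda_s = O\!\left( |\lambda_{k+1}/\lambda_s|^{2n} \right) \quad \text{as } n \to \infty, \quad s = 1, \ldots, k. \tag{6.19} \]

(iv) Stability: In addition, $\Gamma_{n,k} = \sum_{i=0}^{k} |\gamma^{(n,k)}_i|$, the quantity that measures the numerical stability of the sequence $\{\mathbf{s}_{n,k;\nu}\}$, satisfies

\[ \lim_{n\to\infty} \Gamma_{n,k} = \sum_{i=0}^{k} \left| \lim_{n\to\infty} \gamma^{(n,k)}_i \right| \le \prod_{i=1}^{k} \frac{1 + |\lambda_i|}{|1 - \lambda_i|}. \tag{6.20} \]

Equality holds in (6.20) when the $\lambda_i$ are all positive or all negative. If they are all negative, then $\lim_{n\to\infty} \Gamma_{n,k} = 1$.

(v) Finite termination: Finally, for $k = p$, we have

\[ \mathbf{s}_{n,p} = \mathbf{s} \quad \text{and} \quad \sum_{i=0}^{p} \gamma^{(n,p)}_i z^i = \prod_{i=1}^{p} \frac{z - \lambda_i}{1 - \lambda_i} \quad \text{for all } n \ge m_0. \tag{6.21} \]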
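The limit polynomial in (6.15) and the stability bound in (6.20) are easy to evaluate for given $\lambda_i$. The following Python sketch expands $\prod_{i=1}^{k} (z - \lambda_i)/(1 - \lambda_i)$ into its coefficients, sums their moduli, and compares the result with the product on the right of (6.20). The function names and the sample values of $\lambda_i$ are hypothetical and chosen only for illustration.

```python
def q_limit_coeffs(lams):
    """Coefficients (ascending powers of z) of prod_i (z - lam_i) / (1 - lam_i)."""
    coeffs = [1.0 + 0j]
    for lam in lams:
        new = [0j] * (len(coeffs) + 1)
        for i, c in enumerate(coeffs):
            new[i] += -lam * c / (1 - lam)   # contribution of the constant term -lam
            new[i + 1] += c / (1 - lam)      # contribution of the factor z
        coeffs = new
    return coeffs

def gamma_limit(lams):
    """Sum of |limiting gamma_i|, the left-hand side of (6.20)."""
    return sum(abs(c) for c in q_limit_coeffs(lams))

def stability_bound(lams):
    """prod_i (1 + |lam_i|) / |1 - lam_i|, the right-hand side of (6.20)."""
    b = 1.0
    for lam in lams:
        b *= (1 + abs(lam)) / abs(1 - lam)
    return b

all_negative = [-0.9, -0.5, -0.2]
all_positive = [0.9, 0.5]
mixed = [0.5, -0.5]

results = {
    "negative": (gamma_limit(all_negative), stability_bound(all_negative)),
    "positive": (gamma_limit(all_positive), stability_bound(all_positive)),
    "mixed": (gamma_limit(mixed), stability_bound(mixed)),
}
```

For the all-negative sample both quantities equal 1; for the all-positive sample both equal $19 \times 3 = 57$, showing how a $\lambda_i$ near one inflates $\Gamma_{n,k}$; for the mixed sample the sum is $5/3$ while the bound is 3, so the inequality is strict.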

Remarks:

1. The approximations $\mathbf{s}^{\mathrm{MPE}}_{n,k;\nu}$, $\mathbf{s}^{\mathrm{MMPE}}_{n,k;\nu}$, and $\mathbf{s}^{\mathrm{TEA}}_{n,k;\nu}$ do not necessarily exist for all $n \ge 0$. The reason for their possible nonexistence is that their corresponding $\gamma_i$ are obtained by first determining the $c_i$ and then dividing by $\sum_{i=0}^{k} c_i$, which may not always be nonzero. However, when $|\lambda_k| > |\lambda_{k+1}|$, they exist for all $n \ge n_0$, with some $n_0 > 0$. On the other hand, $\mathbf{s}^{\mathrm{RRE}}_{n,k;\nu}$ exists for every $n \ge 0$, since its corresponding $\gamma_i$ do.

2. The condition given in (6.11) must be satisfied for Theorem 6.6 to be true.
   • The convergence proof for MPE and RRE requires (6.12a) to be satisfied, while for MMPE, we need (6.12b) in addition. In such cases, MPE, RRE, and MMPE need $n + k + 2$ vectors $\mathbf{x}_m$ as their input, while TEA needs $n + 2k + 1$ of these vectors. Thus, MPE, RRE, and MMPE have an advantage over TEA when (6.12a) is satisfied, especially when the cost of computing the $\mathbf{x}_m$ dominates. This often takes place when we are solving multidimensional linear or nonlinear systems of equations by fixed-point iterative methods.
   • The convergence proof for TEA, on the other hand, does not require the condition in (6.12a), but only that in (6.12c) (see footnote 22). This suggests that TEA will be effective even when (6.12a) is not satisfied, provided (6.12c) is. Clearly, there are infinitely many choices for the vector $\mathbf{q}$ to guarantee that (6.12c) holds. This certainly seems to be an advantage of TEA over the polynomial extrapolation methods in problems where (6.12a) is not satisfied.

3. Note that $|\lambda_{k+1}| < |\lambda_1|$ by (6.11). Thus, if $|\lambda_1| < 1$, that is, if $\{\mathbf{x}_m\}$ converges to $\mathbf{s}$, $\mathbf{s}_{n,k;\nu}$ converges to $\mathbf{s}$ faster than all the $\mathbf{x}_m$, $n \le m \le n + \nu + k$, that are used in constructing $\mathbf{s}_{n,k;\nu}$. Thus, all four methods accelerate the convergence of the sequence $\{\mathbf{x}_m\}$. Note also that, when $|\lambda_{k+1}| < 1$, $\mathbf{s}_{n,k;\nu}$ converges to $\mathbf{s}$ even if $|\lambda_1| \ge 1$, that is, even if $\{\mathbf{x}_m\}$ diverges. In other words, under suitable conditions, the four vector extrapolation methods can transform a divergent sequence into a convergent one.

Footnote 22. Note that these two conditions do not contradict each other.


4. Comparing the result

\[ \mathbf{s}_{n,k;\nu} - \mathbf{s} = O(\lambda_{k+1}^n) \quad \text{as } n \to \infty \]

with

\[ \mathbf{x}_n - \mathbf{s} = \sum_{i=1}^{k} \mathbf{v}_i \lambda_i^n + O(\lambda_{k+1}^n) \quad \text{as } n \to \infty, \]

which follows from (6.8), one may be led to think that the vector $\sum_{i=1}^{k} \mathbf{v}_i \lambda_i^n$ that forms the dominant part of $\mathbf{x}_n - \mathbf{s}$ is totally eliminated by the extrapolation method. As will become clear from Theorems 6.12 and 6.16 and Section 6.6, this is false. The true picture is as follows:

\[ \mathbf{s}_{n,k;\nu} - \mathbf{s} = \sum_{i=1}^{k} \mathbf{v}_i \phi_i(n) + \sum_{i=k+1}^{p} \mathbf{v}_i \phi_i(n), \]

where

\[ \phi_i(n) = O(\lambda_{k+1}^n) \quad \text{as } n \to \infty, \quad i = 1, \ldots, k, \]

and

\[ \phi_i(n) = O(\lambda_i^n) \quad \text{as } n \to \infty, \quad i = k+1, \ldots, p. \]

In any case, the vectors $\mathbf{v}_1, \ldots, \mathbf{v}_k$ are still present in the expansion of $\mathbf{s}_{n,k;\nu} - \mathbf{s}$, but they are multiplied by coefficients that are $O(\lambda_{k+1}^n)$ as $n \to \infty$.

5. That $\mathbf{x}_n - \mathbf{s} = O(\lambda_1^n)$ and $\mathbf{s}_{n,k} - \mathbf{s} = O(\lambda_{k+1}^n)$ as $n \to \infty$ (with $k$ fixed) shows that the rate of acceleration of $\mathbf{s}_{n,k}$ relative to $\mathbf{x}_n$ is $|\lambda_{k+1}/\lambda_1|^n$. This rate is satisfactory when $|\lambda_{k+1}|$ is considerably smaller than $|\lambda_1|$. This also suggests that if $|\lambda_1| = \cdots = |\lambda_k| = |\lambda_{k+1}|$, the approximations $\mathbf{s}_{n,k}$ produced by extrapolation methods cease to be effective. This is the case, for example, when the sequence $\{\mathbf{x}_m\}$ from a linear system $A\mathbf{x} = \mathbf{b}$ is generated by optimal SOR in some cases: Specifically, when $A$ is consistently ordered and its associated Jacobi iteration matrix has only real eigenvalues, then the eigenvalues of the optimal SOR matrix of iteration are located on the same circle centered at the origin in the complex plane, hence have the same modulus. See Hageman and Young [125, Section 9.3].

6. Part (iii) of Theorem 6.6 shows that the $\gamma^{(n,k)}_i$ contain a large amount of relevant information about the $\lambda_i$, which is hidden in the sequence $\{\mathbf{x}_m\}$. Since our purpose is ultimately to obtain $\mathbf{s}_{n,k}$, we may be led to think that the $\gamma^{(n,k)}_i$ are only auxiliary parameters of secondary importance. The results of part (iii) show otherwise.

7. The results in (6.17) and (6.19) show that the zeros $z^{(n,k)}_s$ of $Q_{n,k}(z) = \sum_{i=0}^{k} \gamma^{(n,k)}_i z^i$ tend to the corresponding $\lambda_s$ at different rates. The best rate is obtained for $\lambda_1$, followed by $\lambda_2$, and so on. Comparing (6.17) and (6.19), we conclude that, for MPE and RRE, the rate of convergence of $z^{(n,k)}_s$ to $\lambda_s$ when the $\mathbf{v}_i$ are mutually orthogonal, that is, when $(\mathbf{v}_i, \mathbf{v}_j) = 0$ for $i \ne j$, is twice the rate of convergence when the $\mathbf{v}_i$ are not mutually orthogonal. Mutual orthogonality holds when the $\mathbf{x}_m$ are generated via $\mathbf{x}_{m+1} = T\mathbf{x}_m + \mathbf{d}$, the matrix $T$ being normal and the inner product being $(\mathbf{y}, \mathbf{z}) = \mathbf{y}^*\mathbf{z}$. This, of course, includes Hermitian and skew-Hermitian $T$.


In addition, we note that (6.18) and (6.19) do not contradict (6.14) and (6.17), respectively; they imply, but are not implied by, (6.14) and (6.17). The proofs of (6.14)–(6.19) are based on Sidi [263, 273].

We must warn the reader that computing all the $z^{(n,k)}_s$ by actually solving the equation $\sum_{i=0}^{k} \gamma^{(n,k)}_i z^i = 0$ numerically, with the $\gamma^{(n,k)}_i$ determined as explained in the appropriate algorithms described in the preceding chapters, leads to rather inaccurate approximations for those $\lambda_i$ for which $|\lambda_i/\lambda_1|$ is not sufficiently close to one. We will describe more reliable ways later.

8. A detailed analysis of the vector $\mathbf{a}_{n,k;\nu}$ in (6.13) reveals that it is proportional to $\prod_{i=1}^{k} (1 - \lambda_i)^{-1}$. Thus, $\mathbf{a}_{n,k;\nu}$ is large, and hence $\mathbf{s}_{n,k;\nu} - \mathbf{s}$ is large as well, when some of the largest $\lambda_i$ are too close to one in the complex plane. Similarly, the result in (6.16) shows that some of the $\gamma_i$ become large when some of the largest $\lambda_i$ are too close to one in the complex plane. By (6.20), this causes $\Gamma_{n,k}$ to become large, which means that the computation of $\mathbf{s}_{n,k;\nu}$ in floating-point arithmetic is likely to become less reliable. One way of overcoming these problems is by applying the vector extrapolation methods to a subsequence $\{\mathbf{x}_{rm}\}$, where $r > 1$ is fixed. For this subsequence, we have

\[ \mathbf{x}_{rm} = \mathbf{s} + \sum_{i=1}^{p} \mathbf{v}_i \sigma_i^m, \qquad \sigma_i = \lambda_i^r, \quad i = 1, \ldots, p. \]

Therefore, when $\mathbf{s}_{n,k}$ is obtained from $\{\mathbf{x}_{rm}\}$, Theorem 6.6 applies with the $\lambda_i$ replaced by the $\lambda_i^r$. Now, it is easy to see that if $\mu$ is very close to one, then $\mu^2, \mu^3, \ldots$ are less close to one in the complex plane, whether $|\mu| < 1$ or not. This means that if, for example, $\lambda_1$ is very close to one, then $\lambda_1^r$ will be less close to one. Clearly, $\Gamma_{n,k}$ is now likely to be smaller. Consequently, $\mathbf{s}_{n,k;\nu}$ will be more reliable.

9. The result in (6.13) concerning the asymptotic behavior of $\mathbf{s}_{n,k} - \mathbf{s}$ is illustrated very nicely by the numerical examples (for MPE and MMPE) in Sidi, Ford, and Smith [299].

10. Essentially the same results can be stated for SEA and VEA. The results for SEA (as applied to scalar sequences) have been given by Wynn [355] and by Sidi [271]. See also Sidi [278, Chapter 16]. The results for VEA have been given by Graves–Morris and Saff [113, 114, 116], Graves–Morris [110, 111], and Roberts [221]. See also Baker and Graves–Morris [14, Chapter 8].
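The effect described in remark 8 above can be seen by evaluating the bound on the right of (6.20) with the $\lambda_i$ replaced by $\sigma_i = \lambda_i^r$. The following Python sketch does this for a made-up set of $\lambda_i$, one of which is close to one; the function name and the sample values are hypothetical, and the bound is only an upper bound on $\lim_{n\to\infty} \Gamma_{n,k}$, so the numbers indicate a trend rather than exact stability constants.

```python
def stability_bound(lams, r=1):
    """prod_i (1 + |sigma_i|) / |1 - sigma_i| with sigma_i = lam_i ** r."""
    b = 1.0
    for lam in lams:
        sigma = lam ** r
        b *= (1 + abs(sigma)) / abs(1 - sigma)
    return b

lams = [0.99, 0.9, 0.5]          # hypothetical lambda_1, lambda_2, lambda_3
bounds = {r: stability_bound(lams, r) for r in (1, 2, 5)}
```

For these values the bound drops from $199 \times 19 \times 3 = 11343$ at $r = 1$ to roughly $1.6 \times 10^3$ at $r = 2$ and further at $r = 5$, at the price of $r$ iterations per vector fed to the extrapolation method.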

6.2.2 Convergence of intermediate rows with $|\lambda_k| = |\lambda_{k+1}|$

The validity of the result in (6.13) depends on the condition $|\lambda_k| > |\lambda_{k+1}|$ in (6.11). An immediate question then is: “What can be said for the case in which $|\lambda_k| = |\lambda_{k+1}|$, which is the only remaining case in view of the ordering $|\lambda_k| \ge |\lambda_{k+1}|$ in (6.10)?” This question can be answered in a satisfactory way for MPE and RRE when the sequence $\{\mathbf{x}_m\}$ has been generated by a linear fixed-point iterative method. (Unfortunately, we do not have an analogous answer for MMPE and TEA.) Part 2 of the following theorem covers special cases of more general results given in Sidi [268] and is valid for both $|\lambda_k| > |\lambda_{k+1}|$ and $|\lambda_k| = |\lambda_{k+1}|$. Part 1 is from Sidi [262, Theorem 2.1].


Theorem 6.7. Let $\{\mathbf{x}_m\}$, $\lambda_i$, and $\mathbf{v}_i$ be as in (6.8)–(6.9), with the $\lambda_i$ ordered as in (6.10). Assume, in addition, that the vectors $\mathbf{x}_m$ have been generated by the iterative procedure $\mathbf{x}_{m+1} = T\mathbf{x}_m + \mathbf{d}$, $T$ being diagonalizable and $I - T$ being nonsingular. Let $\mathbf{s}_{n,k;\nu}$, $k < p$, be as in Definition 6.1, where we also have $(\mathbf{y}, \mathbf{z}) = \mathbf{y}^* M \mathbf{z}$ for some Hermitian positive definite matrix $M$. Then we have the following:

1. $\mathbf{s}_{n,k;\nu}$ exists and is unique (i) for RRE unconditionally and (ii) for MPE provided the Hermitian part of the matrix $E = aM(I - T)$ is positive definite for some $a \in \mathbb{C}$, $|a| = 1$.

2. Under the conditions of the preceding item, $\mathbf{s}_{n,k;\nu}$ for MPE and RRE satisfy

\[ \mathbf{s}_{n,k;\nu} - \mathbf{s} = O(\lambda_{k+1}^n) \quad \text{as } n \to \infty, \tag{6.22} \]

whether $|\lambda_k| > |\lambda_{k+1}|$ or $|\lambda_k| = |\lambda_{k+1}|$.

Remarks:

1. When $M = I$ and $a = 1$, we have $E = I - T$, and the condition imposed on $E$ for MPE is that the matrix $I - T$ have a positive definite Hermitian part. Of course, this condition is met trivially if $I - T$ is Hermitian positive definite.

2. Recall that (i) if $B$ is a Hermitian positive definite matrix, then $\mathbf{z}^* B \mathbf{z}$ is real and positive for all $\mathbf{z} \ne \mathbf{0}$, and (ii) if $B$ is a skew-Hermitian matrix, then $\mathbf{z}^* B \mathbf{z}$ is purely imaginary or zero. Denote by $A_H$ and $A_S$ the Hermitian and skew-Hermitian parts of the matrix $A$; thus, $A = A_H + A_S$. Let $(\lambda, \mathbf{x})$ be an eigenpair of $A$. Then

\[ \lambda\mathbf{x} = A\mathbf{x} \quad \Longrightarrow \quad \lambda(\mathbf{x}^*\mathbf{x}) = \mathbf{x}^* A_H \mathbf{x} + \mathbf{x}^* A_S \mathbf{x} = \alpha + \mathrm{i}\beta, \quad \alpha \text{ and } \beta \text{ real}. \]

If, in addition, $A_H$ is positive definite, then $\alpha > 0$. That is, the eigenvalues of $A$ all have positive real parts; hence they all lie strictly to the right of (and never on) the imaginary axis in the complex plane. In view of this, the condition imposed on the matrix $E$ for MPE implies that all the eigenvalues of $M(I - T)$ lie strictly on one side of a straight line through the origin.

3. The general implication of Theorem 6.7 is that, if $|\lambda_t| > |\lambda_{t+1}| = \cdots = |\lambda_{t+r}| > |\lambda_{t+r+1}|$, the rows $k = t, t+1, \ldots, t+r-1$ all have the same asymptotic behavior as $n \to \infty$, that is, the error $\boldsymbol{\epsilon}_{n,k;\nu} = \mathbf{s}_{n,k;\nu} - \mathbf{s}$ is of order $\lambda_{t+1}^n$ for all these rows. This is true, in particular, when $t = 0$, that is, when $|\lambda_1| = |\lambda_2| = \cdots = |\lambda_r| > |\lambda_{r+1}|$, in which case $\boldsymbol{\epsilon}_{n,k;\nu} = O(\lambda_1^n)$ for $k = 0, 1, \ldots, r-1$, while $\boldsymbol{\epsilon}_{n,r;\nu} = O(\lambda_{r+1}^n)$. (Recall remark 5 following the statement of Theorem 6.6.)

4. If $|\lambda_1| = \cdots = |\lambda_p|$, Theorem 6.7 implies that no convergence acceleration takes place when applying extrapolation methods. This will be so when the sequence $\{\mathbf{x}_m\}$ arises from the iterative solution of a linear system by the optimal SOR method in some cases, for example. (Again, recall remark 5 following the statement of Theorem 6.6.)

5. The result in (6.22) concerning the asymptotic behavior of $\mathbf{s}_{n,k} - \mathbf{s}$ for MPE and RRE is illustrated by the numerical examples in Sidi [268].
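The observation in remark 2 above, that a positive definite Hermitian part forces all eigenvalues into the open right half-plane, can be checked on a small example. The Python sketch below does this for a made-up $2 \times 2$ complex matrix; the matrix, the helper names, and the restriction to the $2 \times 2$ case (so that eigenvalues follow from the quadratic formula and definiteness from the leading minors) are all assumptions made for illustration.

```python
import cmath

def hermitian_part(A):
    """A_H = (A + A^*) / 2 for a 2x2 matrix given as nested lists."""
    return [[(A[i][j] + A[j][i].conjugate()) / 2 for j in range(2)] for i in range(2)]

def is_positive_definite(H):
    """Leading-minor test for a 2x2 Hermitian matrix."""
    det = (H[0][0] * H[1][1] - H[0][1] * H[1][0]).real
    return H[0][0].real > 0 and det > 0

def eigenvalues(A):
    """Eigenvalues of a 2x2 matrix from its characteristic polynomial."""
    tr = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    root = cmath.sqrt(tr * tr - 4 * det)
    return (tr + root) / 2, (tr - root) / 2

# Hypothetical matrix: Hermitian part diag(2, 1) plus a large skew-Hermitian part.
A = [[2 + 0j, 5 + 1j], [-5 + 1j, 1 + 0j]]

A_H = hermitian_part(A)
eigs = eigenvalues(A)
```

Here $A_H = \mathrm{diag}(2, 1)$ is positive definite, and the two eigenvalues come out as $1.5 \pm \mathrm{i}\sqrt{103}/2$: the skew-Hermitian part moves them far from the real axis but cannot push them across the imaginary axis.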


6.3 Technical preliminaries

We achieve the proof of Theorem 6.6 pertaining to the row sequences $\{\mathbf{s}_{n,k;\nu}\}_{n=0}^{\infty}$ by actually analyzing the asymptotic behavior as $n \to \infty$ of the different determinants $D(v_0, v_1, \ldots, v_k)$ that appear in Lemma 6.3. We begin with two lemmas of general interest. The first of these is Lemma 6.8, which was stated and proved as Lemma A.1 in [299]; it has been used in different contexts.

Lemma 6.8. Let $i_0, i_1, \ldots, i_k$ be positive integers, and assume that the scalars $v_{i_0,i_1,\ldots,i_k}$ are odd under an interchange of any two of the indices $i_0, i_1, \ldots, i_k$. Let $x_i$, $i \ge 1$, be scalars or vectors, and let $t_{i,j}$, $i \ge 1$, $1 \le j \le k$, be scalars. Define $I_{k,N}$ and $J_{k,N}$ by $I_{k,N}$ = and

Jk,N

N

N

···

i0 =1 i1 =1

N

ik =1

xi0

  xi  0 t  i0 ,1

t  i0 ,2 =  1≤i0 0.

They are given only for the sake of comparison. Indeed, the bounds on $\Gamma^{D}_{n,k}$ in Tables 7.1–7.4 decrease at an increasing rate as $n$ increases, and they

Ch . are much better than the corresponding Γn,k

We now turn to the explicit theoretical bounds Γn,k in (7.47), (7.50), (7.53), and (7.59). We first note that, when β < 1, these bounds tend to zero monotonically and quickly as n and/or k tend to ∞ individually or simultaneously. Clearly, β = ρ(T ) for all four sets D considered, which implies that the sequence {x m } generated via the fixed-point iterative method x m+1 = T x m + d is convergent. Next, let us compare the upper bounds Γn,k for the four spectra, β being the same in D all cases. From (7.47) and (7.50), we see that Γ 2 is smaller and decreases more quickly n,k

D D than Γn,k1 as n and/or k increase. Similarly, from (7.53) and (7.59), we see that Γn,k4 is

172

Table 7.3. Bounds for $\Gamma^{D}_{n,k}$ when $D = [-\beta, \beta]$ with $\beta = 0.96$. The columns "lower" and "upper" give, respectively, the lower and upper bounds on $\Gamma^{D}_{n,k}$ given in (7.52), and "Cheb." gives the Chebyshev bound $\Gamma^{\mathrm{Ch}}_{n,k}$ defined in (7.60) with $[a, b] = [-\beta, \beta]$. Used courtesy of NASA [303].

      n = 0                           n = 50                          n = 100
 k    lower     upper     Cheb.       lower     upper     Cheb.       lower     upper     Cheb.
 0    1.00D+00  1.00D+00  1.00D+00    1.82D-02  1.30D-01  1.30D-01    1.68D-03  1.69D-02  1.69D-02
 2    4.42D-01  8.55D-01  8.55D-01    3.24D-03  2.39D-02  1.11D-01    1.71D-04  1.74D-03  1.44D-02
 4    2.41D-01  6.44D-01  5.75D-01    8.31D-04  6.38D-03  7.47D-02    2.83D-05  2.94D-04  9.70D-03
 6    1.38D-01  4.44D-01  3.45D-01    2.55D-04  2.03D-03  4.48D-02    6.02D-06  6.37D-05  5.82D-03
 8    7.92D-02  2.90D-01  1.98D-01    8.78D-05  7.18D-04  2.57D-02    1.49D-06  1.61D-05  3.34D-03
10    4.54D-02  1.83D-01  1.12D-01    3.26D-05  2.74D-04  1.46D-02    4.14D-07  4.54D-06  1.89D-03
12    2.58D-02  1.13D-01  6.33D-02    1.28D-05  1.11D-04  8.22D-03    1.25D-07  1.39D-06  1.07D-03
14    1.47D-02  6.89D-02  3.56D-02    5.26D-06  4.65D-05  4.63D-03    4.01D-08  4.54D-07  6.01D-04
16    8.30D-03  4.15D-02  2.00D-02    2.23D-06  2.02D-05  2.60D-03    1.36D-08  1.56D-07  3.38D-04
18    4.69D-03  2.48D-02  1.13D-02    9.77D-07  9.02D-06  1.46D-03    4.81D-09  5.61D-08  1.90D-04
20    2.65D-03  1.47D-02  6.34D-03    4.37D-07  4.12D-06  8.24D-04    1.76D-09  2.09D-08  1.07D-04

Table 7.4. Bounds for $\Gamma^{D}_{n,k}$ when $D = \{z : z = it,\ t \in [-\beta, \beta]\}$ with $\beta = 0.96$. The columns "lower" and "upper" give, respectively, the lower and upper bounds on $\Gamma^{D}_{n,k}$ given in (7.58), and "Cheb." gives the Chebyshev bound $\Gamma^{\mathrm{Ch}}_{n,k}$ defined in (7.61). Used courtesy of NASA [303].

      n = 0                           n = 50                          n = 100
 k    lower     upper     Cheb.       lower     upper     Cheb.       lower     upper     Cheb.
 0    1.00D+00  1.00D+00  1.00D+00    1.82D-02  1.30D-01  1.30D-01    1.68D-03  1.69D-02  1.69D-02
 2    1.79D-01  3.15D-01  3.15D-01    1.66D-04  1.21D-03  4.10D-02    7.85D-06  7.97D-05  5.32D-03
 4    3.02D-02  6.86D-02  5.24D-02    2.92D-06  2.16D-05  6.80D-03    7.20D-08  7.38D-07  8.83D-04
 6    4.98D-03  1.34D-02  8.48D-03    7.41D-08  5.59D-07  1.10D-03    9.72D-10  1.01D-08  1.43D-04
 8    8.13D-04  2.47D-03  1.37D-03    2.42D-09  1.86D-08  1.78D-04    1.71D-11  1.79D-10  2.32D-05
10    1.32D-04  4.45D-04  2.22D-04    9.50D-11  7.42D-10  2.89D-05    3.71D-13  3.91D-12  3.75D-06
12    2.15D-05  7.85D-05  3.60D-05    4.33D-12  3.44D-11  4.67D-06    9.45D-15  1.00D-13  6.07D-07
14    3.49D-06  1.37D-05  5.82D-06    2.22D-13  1.79D-12  7.56D-07    2.76D-16  2.96D-15  9.82D-08
16    5.66D-07  2.36D-06  9.42D-07    1.26D-14  1.03D-13  1.22D-07    9.02D-18  9.77D-17  1.59D-08
18    9.17D-08  4.05D-07  1.52D-07    7.80D-16  6.49D-15  1.98D-08    3.26D-19  3.56D-18  2.57D-09
20    1.49D-08  6.90D-08  2.47D-08    5.19D-17  4.38D-16  3.20D-09    1.29D-20  1.42D-19  4.16D-10

smaller and decreases more quickly than $\Gamma^{D_3}_{n,k}$ as $n$ and/or $k$ increase. Clearly, the best results are obtained for $D_2$, followed by $D_4$, followed by $D_1$, followed by $D_3$. One conclusion that could be drawn from this pattern is the following: The best results will be possible when all the eigenvalues $\mu_i$ are real and negative. This is followed by the case in which all the $\mu_i$ are purely imaginary. This is followed by the case in which some of the $\mu_i$ are real positive and some are real negative. This is followed by the case in which all the $\mu_i$ are real positive.

Note that, in the cases $D = [-\beta, 0]$ and $D = \{z : z = it,\ t \in [-\beta, \beta]\}$, all eigenvalues of $T$ are far from $z = 1$, while in the cases $D = [0, \beta]$ and $D = [-\beta, \beta]$, $\beta$ is an eigenvalue of $T$ and is close to $z = 1$. Recall our discussion of the results of Chapter 6, where we mentioned that eigenvalues that are close to $z = 1$ make the convergence slower and the numerical stability problematic.

We note that, by the fact that both $D_1$ and $D_2$ are proper subsets of $D_3$, with the help of Lemma 7.7 only, we can conclude immediately that $\Gamma^{D_1}_{n,k} \le \Gamma^{D_3}_{n,k}$ and $\Gamma^{D_2}_{n,k} \le \Gamma^{D_3}_{n,k}$.

Finally, Example 6.1 in [303] concerns the case in which the matrix $T$ is real symmetric (hence normal). The numerical results obtained for $s_{n,k}$ with $n = 80$ and $k = 1, \ldots, 20$ confirm the tightness of the upper bounds for $r(s_{n,k})$ given in (7.18) of Theorem 7.6.


7.5 Justification of cycling

In Section 1.10, we discussed the use of the methods MPE, RRE, MMPE, and SVD-MPE in the full cycling mode, and we also introduced cycling with frozen $\gamma_i$ and cycling in parallel. As we have already mentioned, a major application area of vector extrapolation methods is the numerical solution of large systems of linear and nonlinear equations $x = f(x)$, whose solution we denote by $s$, by fixed-point iterations $x_{m+1} = f(x_m)$. We have also seen that the nonlinear system $x = f(x)$ behaves linearly when $x$ is close to its solution $s$. In view of the fact that the studies of Chapters 6 and 7 concern linearly generated vector sequences, we will refer to these studies to justify the use of (full) cycling in general, and the use of cycling with frozen $\gamma_i$ and cycling in parallel in particular. Below, by ExtM, we shall mean any of these four methods. We advise the reader to review the steps of the three types of cycling first.

7.5.1 Justification of full cycling

The justification of full cycling is easy in view of the results of Theorems 6.6 and 6.7 of Chapter 6. For all practical purposes, these theorems say that the error in $s_{n,k}$ behaves like $\lambda_{k+1}^n$ for sufficiently large $n$, with the eigenvalues of $F(s)$, the Jacobian matrix of $f(x)$ evaluated at $s$, ordered as in $|\lambda_1| \ge |\lambda_2| \ge \cdots$. In other words, the $n$ fixed-point iterations decrease the contribution of the smaller eigenvalues to the point where they become of order $|\lambda_{k+1}^n|$, while the extrapolation method decreases the contribution of the $k$ largest eigenvalues to the point that they become of order $|\lambda_{k+1}^n|$ as well.

Another justification is supplied by our analysis of $\Gamma^{\sigma(T)}_{n,k}$ with fixed $n$ and $k$, which forms the main factor of the error $s_{n,k} - s$ as shown in this chapter: We have shown that $\Gamma^{\sigma(T)}_{n,k}$ is decreasing as a function of $n$ and $k$ at least in the special cases we have considered, which suggests reduction of errors. Thus, cycling with fixed $n$ and $k$ repeats this trend, causing further reductions of errors.

Example 6.2 in [303] concerns the application of MPE and RRE in the cycling mode to a system of linear equations obtained from the finite-difference solution of a linear convection-diffusion equation, with some adjustable parameters that control the nature of the iteration matrix. The numerical results there demonstrate the effectiveness of the strategy with even moderately large $n > 0$ whether the iterative method converges or diverges. In addition, this strategy is observed to be effective when the cycling mode with $n = 0$ fails. (See our discussion of full cycling in Section 1.10.)
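To make the cycling idea concrete, here is a minimal Python sketch of full cycling with MPE. It is an illustration only, not the implementation recommended in this book (which relies on the numerically stable algorithms of the earlier chapters). It assumes the standard least-squares definition of MPE, and the names `mpe` and `full_cycling`, the residual-based accuracy test, and the small test matrix are all hypothetical choices made for this example.

```python
import numpy as np

def mpe(xs):
    """Apply MPE to the k+2 vectors xs[0], ..., xs[k+1] (k >= 1)."""
    X = np.asarray(xs, dtype=float)            # row i is x_i
    U = np.diff(X, axis=0).T                   # column i is u_i = x_{i+1} - x_i
    k = U.shape[1] - 1
    # Least-squares solution of [u_0 | ... | u_{k-1}] c = -u_k, with c_k = 1.
    c, *_ = np.linalg.lstsq(U[:, :k], -U[:, k], rcond=None)
    c = np.append(c, 1.0)
    gamma = c / c.sum()                        # weights gamma_i, summing to 1
    return gamma @ X[:k + 1]                   # sum_i gamma_i x_i, i = 0..k

def full_cycling(f, x0, n, k, tol=1e-10, max_cycles=50):
    """Hypothetical driver: iterate n+k+1 times, extrapolate, restart."""
    s = np.asarray(x0, dtype=float)
    for cycle in range(1, max_cycles + 1):
        xs = [s]
        for _ in range(n + k + 1):
            xs.append(f(xs[-1]))
        s = mpe(xs[n:])                        # uses x_n, ..., x_{n+k+1}
        if np.linalg.norm(f(s) - s) <= tol:    # accuracy test on the residual
            return s, cycle
    return s, max_cycles

# Small linear test problem x = T x + d (illustrative data, not from the text).
T = np.array([[0.5, 0.2, 0.0],
              [0.1, 0.4, 0.2],
              [0.0, 0.3, 0.3]])
d = np.array([1.0, 2.0, 3.0])
s_cyc, cycles = full_cycling(lambda x: T @ x + d, np.zeros(3), n=0, k=3)
s_exact = np.linalg.solve(np.eye(3) - T, d)
print(cycles, np.linalg.norm(s_cyc - s_exact))
```

In this linear example, $k$ equals the dimension, so a single cycle should already reproduce the solution up to rounding. For a nonlinear $f$, the same driver would be called with smaller $k$ and some $n \ge 0$, and several cycles would normally be needed.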

7.5.2 Justification of cycling with frozen $\gamma_i$

As for cycling with frozen $\gamma_i$, its justification is provided by the part of Theorem 6.6 that concerns the polynomial $Q_{n,k}(z)$. There we saw that $\lim_{n\to\infty} \gamma_i^{(n,k)}$ all exist and satisfy

$$\sum_{i=0}^{k} \Bigl(\lim_{n\to\infty} \gamma_i^{(n,k)}\Bigr) z^i = \prod_{i=1}^{k} \frac{z - \lambda_i}{1 - \lambda_i}.$$

This means that, for sufficiently large $n$, the $\gamma_i^{(n,k)}$ are about the same for the given $k$ in all cycles. Indeed, our computations confirm that this is so. Thus, by doing a few full cycles, we obtain well-established values for the $\gamma_i^{(n,k)}$, which we can now use in the next cycles without having to compute them again as functions of the vectors $x_m$.
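As a small worked instance of the limit relation for the $\gamma_i^{(n,k)}$ (our own illustration, for $k = 1$ only, with $\gamma_i$ standing for the limiting values), matching powers of $z$ gives the limiting weights explicitly:

```latex
\gamma_0 + \gamma_1 z = \frac{z - \lambda_1}{1 - \lambda_1}
\quad\Longrightarrow\quad
\gamma_0 = \frac{-\lambda_1}{1 - \lambda_1}, \qquad
\gamma_1 = \frac{1}{1 - \lambda_1}, \qquad
\gamma_0 + \gamma_1 = 1 .
% Example: \lambda_1 = 0.9 gives \gamma_0 = -9 and \gamma_1 = 10.
```

The weights depend only on the dominant eigenvalue, not on the particular vectors of a cycle, which is why freezing them after a few full cycles is plausible.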

7.5.3 Justification of cycling in parallel

The justification of cycling in parallel is also provided by Theorem 6.6. As usual, we let

$$x_m = [x_m^{(1)}, x_m^{(2)}, \ldots, x_m^{(N)}]^T, \quad m = 0, 1, \ldots .$$

Similarly, we let

$$v_i = [v_i^{(1)}, v_i^{(2)}, \ldots, v_i^{(N)}]^T, \quad i = 1, \ldots, p.$$

Assuming that our machine has $q$ processors, we choose positive integers $\nu_1, \ldots, \nu_q$ such that $\nu_1 < \nu_2 < \cdots < \nu_q = N$. Setting also $\nu_0 = 0$, for each $j = 1, \ldots, q$, let us now define the vectors $x_{m,j}$ and $v_{i,j}$ as

$$x_{m,j} = [x_m^{(\nu_{j-1}+1)}, x_m^{(\nu_{j-1}+2)}, \ldots, x_m^{(\nu_j)}]^T, \qquad v_{i,j} = [v_i^{(\nu_{j-1}+1)}, v_i^{(\nu_{j-1}+2)}, \ldots, v_i^{(\nu_j)}]^T.$$

Clearly, for each $j \in \{1, \ldots, q\}$, $x_{m,j}$ and $v_{i,j}$ are all vectors in $\mathbb{C}^{\nu_j - \nu_{j-1}}$, and

$$x_m = \begin{bmatrix} x_{m,1} \\ x_{m,2} \\ \vdots \\ x_{m,q} \end{bmatrix}, \qquad v_i = \begin{bmatrix} v_{i,1} \\ v_{i,2} \\ \vdots \\ v_{i,q} \end{bmatrix}.$$

We start by noting that the most important condition in Theorem 6.6 is that given in (6.8) with (6.9). It is easy to realize that this condition is satisfied by the individual $x_{m,j}$ in the sense that

$$x_{m,j} = s_j + \sum_{i=1}^{p} v_{i,j} \lambda_i^m, \quad m \ge m_0, \quad p \le N.$$

Here we have defined the vectors $s_j$ just like $x_{m,j}$ and $v_{i,j}$. Thus, it is reasonable to assume that full cycling applied to each individual sequence $\{x_{m,j}\}$, $j = 1, \ldots, q$, performs in nearly the same way as when it is applied to the sequence $\{x_m\}$.
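The block decomposition can be tried out on a synthetic sequence. The sketch below is only an illustration under stated assumptions: it builds $x_m = s + \sum_{i=1}^{p} v_i \lambda_i^m$ with $p = 2$ from random data (so $m_0 = 0$), splits each vector into $q = 2$ blocks via hypothetical indices `nu`, and applies a plain least-squares MPE (the helper `mpe` is our own, with $k = p$) to each block separately, as the $q$ processors would. No actual parallel execution is attempted here.

```python
import numpy as np

def mpe(X):
    """MPE applied to the rows X[0], ..., X[k+1] (k >= 1)."""
    U = np.diff(X, axis=0).T                   # column i is u_i = x_{i+1} - x_i
    k = U.shape[1] - 1
    c, *_ = np.linalg.lstsq(U[:, :k], -U[:, k], rcond=None)
    c = np.append(c, 1.0)                      # c_k = 1
    return (c / c.sum()) @ X[:k + 1]

rng = np.random.default_rng(0)
N, p = 6, 2
s = rng.standard_normal(N)                     # limit vector (synthetic)
V = rng.standard_normal((N, p))                # columns play the role of v_1, v_2
lam = np.array([0.9, -0.5])                    # lambda_1, lambda_2 (synthetic)
X = np.array([s + V @ lam**m for m in range(p + 2)])   # rows x_0, ..., x_{p+1}

nu = [0, 3, 6]                                 # nu_0 = 0 < nu_1 < nu_2 = N, q = 2
s_blocks = np.concatenate(
    [mpe(X[:, nu[j]:nu[j + 1]]) for j in range(len(nu) - 1)]
)
print(np.linalg.norm(s_blocks - s))
```

Each block here has at least $p$ components, so the block vectors $v_{1,j}, v_{2,j}$ are (generically) linearly independent and MPE with $k = p$ recovers $s_j$ up to rounding. With blocks shorter than $p$ this exactness would not be expected.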

7.6 Cycling and nonlinear systems of equations

As mentioned earlier, the most appropriate strategy for solving nonlinear systems of equations $x = f(x)$ with solution $s$ by vector extrapolation methods is via cycling. Below, by ExtM, we shall mean any of the four extrapolation methods MPE, RRE, MMPE, and SVD-MPE, as before. The following strategy has been of some theoretical interest despite the fact that it is not practical, especially when dealing with large-dimensional problems.

C0. Choose an initial vector $x_0$.
C1. Compute the vectors $x_1, x_2, \ldots, x_{k+1}$ [via $x_{m+1} = f(x_m)$], where $k$ is the degree of the minimal polynomial of $F(s)$ with respect to $x_0 - s$. [As always, $F(x)$ is the Jacobian matrix of $f$ evaluated at $x$.]
C2. Apply ExtM to the vectors $x_0, x_1, \ldots, x_{k+1}$, with end result $s_{0,k}$.
C3. If $s_{0,k}$ satisfies the accuracy test, stop. Otherwise, set $x_0 = s_{0,k}$, and go to step C1.

If any one of the epsilon algorithms SEA, VEA, and TEA is used instead of the polynomial methods, then the above cycling strategy has the following steps:

C0. Choose an initial vector $x_0$.
C1. Compute the vectors $x_1, x_2, \ldots, x_{2k}$ [via $x_{m+1} = f(x_m)$], where $k$ is the degree of the minimal polynomial of $F(s)$ with respect to $x_0 - s$. [As always, $F(x)$ is the Jacobian matrix of $f$ evaluated at $x$.]
C2. Apply any one of the epsilon algorithms to the vectors $x_0, x_1, \ldots, x_{2k}$, with end result $s_{0,k}$.
C3. If $s_{0,k}$ satisfies the accuracy test, stop. Otherwise, set $x_0 = s_{0,k}$, and go to step C1.

As before, we call each application of steps C1–C3 a cycle and denote by $s^{(r)}$ the $s_{0,k}$ that is computed in the $r$th cycle. We also denote the initial vector $x_0$ in step C0 by $s^{(0)}$. It is clear that $k$ may change from one cycle to the next. In addition, because $F(s)$ is not known, it would be practically impossible to determine $k$ in each cycle. Even if we did know $k$ for each cycle, it may be large; actually, it may be as large as the dimension of the problem itself. All of these considerations render this cycling strategy impractical when the dimension is very large. Nevertheless, the convergence properties of the sequence $\{s^{(r)}\}_{r=0}^{\infty}$ have been of some interest.

The claim concerning this particular cycling is that the sequence $\{s^{(r)}\}_{r=0}^{\infty}$ converges to $s$ quadratically. Even though this claim may be valid in some cases, judging from numerical experiments, all the proofs suffer from some serious deficiencies. The first papers dealing with this topic are those by Brezinski [26, 27], Gekeler [96], and Skelboe [305]. Of these, [26], [27], and [96] consider the application of the epsilon algorithms, while [305] also considers the application of MPE and RRE.
The quadratic convergence proofs in all of these papers have a gap in that they all end up with the relation $\|s^{(r+1)} - s\| \le K_r \|s^{(r)} - s\|^2$, from which they conclude that $\{s^{(r)}\}_{r=0}^{\infty}$ converges quadratically. However, $K_r$ is a scalar that depends on $r$ through $s^{(r)}$, and the proofs do not show how it depends on $r$. In particular, they do not show whether $K_r$ is bounded in $r$ or how it grows with $r$ if it is not bounded. This gap was disclosed in Smith, Ford, and Sidi [306].

A more recent paper, by Jbilou and Sadok [148], deals with the same cycling strategy via MPE and RRE. Yet another paper, by Le Ferrand [169], treats TEA. Both works provide proofs of quadratic convergence by imposing some strict conditions on the whole sequence $\{s^{(r)}\}_{r=0}^{\infty}$ as well as on $f(x)$, when the $s^{(r)}$ are actually the objects we are trying to analyze. For completeness, we state the theorem of [148].

Theorem 7.14. Let $s$ be the solution to the nonlinear system $x = f(x)$, where $f : \mathbb{C}^N \to D$, $D$ being a convex set in $\mathbb{C}^N$. Assume that $F(s)$, the Jacobian matrix of $f$ evaluated at $x = s$, is such that $F(s) - I$ is nonsingular and that

$$\|F(x) - F(y)\| \le L \|x - y\| \quad \forall\, x, y \in D \quad \text{for some } L > 0.$$

Consider the sequence $\{s^{(r)}\}$ obtained by applying MPE or RRE by the cycling procedure described above, $k_r$ being the degree of the minimal polynomial of the matrix $F(s)$ with respect to the vector $s^{(r)} - s$. Define

$$f^0(x) = x \quad \text{and} \quad f^i(x) = f(f^{i-1}(x)), \quad i = 1, 2, \ldots .$$

For each $r = 0, 1, \ldots$, define the $N \times k_r$ matrix $H_r(x)$ via

$$H_r(x) = \left[\, \frac{f^1(x) - f^0(x)}{\|f^1(x) - f^0(x)\|} \;\Big|\; \frac{f^2(x) - f^1(x)}{\|f^2(x) - f^1(x)\|} \;\Big|\; \cdots \;\Big|\; \frac{f^{k_r}(x) - f^{k_r - 1}(x)}{\|f^{k_r}(x) - f^{k_r - 1}(x)\|} \,\right],$$

and let

$$\alpha_r(x) = \sqrt{\det[H_r^*(x) H_r(x)]}.$$

If there exist a constant $\alpha > 0$ and an integer $R > 0$ such that

$$\alpha_r(s^{(r)}) \ge \alpha \quad \forall\, r \ge R,$$

then the sequence $\{s^{(r)}\}$ converges to $s$ quadratically when $s^{(0)}$ is sufficiently close to $s$.

The problem here is that the condition that $\alpha_r(s^{(r)}) \ge \alpha > 0$ independently of $r$ is a restriction on the $s^{(r)}$, which are the objects we are trying to analyze.

Chapter 8

Recursion Relations for Vector Extrapolation Methods

8.1 Introduction

In Section 1.7, we gave determinant representations for the vectors $s_{n,k}$ that result from application of the polynomial extrapolation methods MPE, RRE, and MMPE, and in Section 5.4, we gave the determinant representation for TEA. These representations can be used to obtain recursion relations among the different $s_{n,k}$, which seem to be of interest in themselves.²⁶ The source of the developments we present here is the paper by Ford and Sidi [87], which the reader may consult for more details.

²⁶ To compute the vectors $s_{n,k}$ economically and in a numerically stable way, we prefer the algorithms given in Chapters 2, 4, and 5, however.

We first let $\mu_n^m$, $m, n = 0, 1, \ldots$, be given real or complex scalars and define the determinants

$$G_k^{n,m} = \begin{vmatrix} \mu_n^m & \mu_{n+1}^m & \cdots & \mu_{n+k-1}^m \\ \mu_n^{m+1} & \mu_{n+1}^{m+1} & \cdots & \mu_{n+k-1}^{m+1} \\ \vdots & \vdots & & \vdots \\ \mu_n^{m+k-1} & \mu_{n+1}^{m+k-1} & \cdots & \mu_{n+k-1}^{m+k-1} \end{vmatrix}, \qquad G_0^{n,m} = 1, \tag{8.1}$$

and

$$f_k^{n,m}(b) = \begin{vmatrix} b_n & b_{n+1} & \cdots & b_{n+k} \\ \mu_n^m & \mu_{n+1}^m & \cdots & \mu_{n+k}^m \\ \mu_n^{m+1} & \mu_{n+1}^{m+1} & \cdots & \mu_{n+k}^{m+1} \\ \vdots & \vdots & & \vdots \\ \mu_n^{m+k-1} & \mu_{n+1}^{m+k-1} & \cdots & \mu_{n+k}^{m+k-1} \end{vmatrix}, \qquad f_0^{n,m}(b) = b_n. \tag{8.2}$$

Here $b$ stands for the sequence $\{b_i\}_{i=0}^{\infty}$, whether scalar or vector. In addition, when $b$ is a vector sequence, $f_k^{n,m}(b)$ is the vector obtained by expanding the determinant in (8.2) with respect to its first row, as usual. In any case, $f_k^{n,m}(b) = \sum_{i=0}^{k} \nu_i b_{n+i}$, where $\nu_i$ is the cofactor of $b_{n+i}$ in the first row and $\nu_0 = G_k^{n+1,m}$. Following these, we define

$$S_k^{n,m}(b) = \frac{f_k^{n,m}(b)}{f_k^{n,m}(I)}, \tag{8.3}$$

where $I$ stands for the sequence $\{I_i\}_{i=0}^{\infty}$ with $I_i = 1$ for all $i$, as well as

$$T_k^{n,m}(b) = \frac{f_k^{n,m}(b)}{G_k^{n+1,m}}. \tag{8.4}$$

Note also that

$$S_k^{n,m}(b) = \frac{T_k^{n,m}(b)}{T_k^{n,m}(I)}. \tag{8.5}$$

Finally,

$$S_0^{n,m}(b) = b_n \quad \text{and} \quad T_0^{n,m}(b) = b_n. \tag{8.6}$$

By (8.2), it is easy to see that

$$f_k^{n,m}(\mu^r) = 0, \quad r = m, m+1, \ldots, m+k-1, \qquad \mu^r = \{\mu_i^r\}_{i=0}^{\infty}, \tag{8.7}$$

since the determinant representation of $f_k^{n,m}(\mu^r)$ has two identical rows when $m \le r \le m+k-1$. Consequently,

$$S_k^{n,m}(\mu^r) = 0, \quad r = m, m+1, \ldots, m+k-1; \qquad S_k^{n,m}(b) = \sum_{i=0}^{k} \rho_i b_{n+i} \ \text{ for some } \rho_i, \qquad S_k^{n,m}(I) = \sum_{i=0}^{k} \rho_i = 1, \tag{8.8}$$

and

$$T_k^{n,m}(\mu^r) = 0, \quad r = m, m+1, \ldots, m+k-1; \qquad T_k^{n,m}(b) = \sum_{i=0}^{k} \sigma_i b_{n+i} \ \text{ for some } \sigma_i, \qquad \sigma_0 = 1. \tag{8.9}$$

It is easy to see that the $k+1$ equations in (8.8) determine the $\rho_i$ and hence $S_k^{n,m}(b)$ uniquely. Similarly, the $k+1$ equations in (8.9) determine the $\sigma_i$ and hence $T_k^{n,m}(b)$ uniquely.

Quantities such as $S_k^{n,m}(b)$ arise as $s_{n,k}$ when one applies MPE, RRE, MMPE, and TEA1 to a vector sequence $\{x_i\}_{i=0}^{\infty}$. Specifically, $b_i = x_i$ for all $i$, and the $S_k^{n,m}(b)$ and $\mu_n^m$ take different forms for the different methods:

$$s_{n,k} = S_k^{n,0}(x) \ \text{ for MMPE and TEA1}, \qquad \mu_n^m = \begin{cases} (q_{m+1}, u_n) & \text{for MMPE}, \\ (q, u_{n+m}) & \text{for TEA1}, \end{cases} \tag{8.10}$$

and

$$s_{n,k} = S_k^{n,n}(x) \ \text{ for MPE and RRE}, \qquad \mu_n^m = \begin{cases} (u_m, u_n) & \text{for MPE}, \\ (w_m, u_n) & \text{for RRE}. \end{cases} \tag{8.11}$$

These can be verified with the help of the determinant representations of $s_{n,k}$. Of course, here $x$ stands for the sequence $\{x_i\}_{i=0}^{\infty}$, and $u_i = \Delta x_i$ and $w_i = \Delta^2 x_i$, as usual.

From the discussion above, it is clear that there are two separate problems of interest as far as the extrapolation methods are concerned:

• Recursive computation of $S_k^{n,m}(b)$ and $T_k^{n,m}(b)$ when $m$ is fixed. (The special case $m = 0$ is relevant to MMPE and TEA1.)
• Recursive computation of $S_k^{n,m}(b)$ and $T_k^{n,m}(b)$ when $m$ varies with $n$ as $m = n + q$, where $q$ is a fixed nonnegative integer. (The special case $m = n$, that is, $q = 0$, is relevant to MPE and RRE.)

In the next three sections, we derive the relevant recursion relations for these two cases and for the additional case in which $n$ is held fixed. We do not go into the details of the computational procedures/algorithms here; for these, we refer the reader to [87].

In our developments, we make use of the well-known Sylvester determinant identity, which is the subject of the next theorem. For a proof of this theorem, see Gragg [106], for example.

Theorem 8.1. Let $C$ be a square matrix, and let $C_{\rho\sigma}$ denote the matrix obtained by deleting row $\rho$ and column $\sigma$ of $C$. Also let $C_{\rho\rho';\sigma\sigma'}$ denote the matrix obtained by deleting rows $\rho$ and $\rho'$ and columns $\sigma$ and $\sigma'$ of $C$. Provided $\rho < \rho'$ and $\sigma < \sigma'$,

$$\det C \, \det C_{\rho\rho';\sigma\sigma'} = \det C_{\rho\sigma} \det C_{\rho'\sigma'} - \det C_{\rho\sigma'} \det C_{\rho'\sigma}. \tag{8.12}$$

If $C$ is a $2 \times 2$ matrix, then (8.12) holds with $\det C_{\rho\rho';\sigma\sigma'} = 1$.
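The Sylvester determinant identity (8.12) is easy to check numerically. The following Python sketch is our own sanity check on a random matrix. The helper name `minor_det`, the matrix size, and the particular (0-based) row and column indices are arbitrary choices for illustration.

```python
import numpy as np

def minor_det(C, rows, cols):
    """Determinant of C after deleting the given rows and columns (0-based)."""
    M = np.delete(np.delete(C, rows, axis=0), cols, axis=1)
    if M.shape[0] == 0:          # empty matrix: determinant taken to be 1
        return 1.0
    return np.linalg.det(M)

rng = np.random.default_rng(1)
C = rng.standard_normal((5, 5))
rho, rho_p = 1, 3                # rho < rho'
sig, sig_p = 0, 2                # sigma < sigma'

lhs = np.linalg.det(C) * minor_det(C, [rho, rho_p], [sig, sig_p])
rhs = (minor_det(C, [rho], [sig]) * minor_det(C, [rho_p], [sig_p])
       - minor_det(C, [rho], [sig_p]) * minor_det(C, [rho_p], [sig]))
print(lhs, rhs)
```

The two printed numbers should agree up to rounding. In the proofs of this chapter, the identity is used with the first and last (or first and second) rows and the first and last columns.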

8.2 Recursions for fixed m

Since $m$ is kept fixed, we shall denote $S_k^{n,m}(b)$ and $T_k^{n,m}(b)$ by $S_k^n(b)$ and $T_k^n(b)$, respectively. Thus, $S_k^{n+j}(b)$ stands for $S_k^{n+j,m}(b)$, for example.

Theorem 8.2. Let $\mu^r = \{\mu_i^r\}_{i=0}^{\infty}$, $r = 0, 1, \ldots$.

1. Then we have the three-term recursion relations

$$S_k^n(b) = \frac{S_{k-1}^n(b) - c_k^n S_{k-1}^{n+1}(b)}{1 - c_k^n} \tag{8.13}$$

and

$$T_k^n(b) = T_{k-1}^n(b) - d_k^n T_{k-1}^{n+1}(b), \tag{8.14}$$

where

$$c_k^n = S_{k-1}^n(\mu^{m+k-1}) / S_{k-1}^{n+1}(\mu^{m+k-1}) \tag{8.15}$$

and

$$d_k^n = T_{k-1}^n(\mu^{m+k-1}) / T_{k-1}^{n+1}(\mu^{m+k-1}). \tag{8.16}$$

2. In addition, we have the following determinant representations:

$$S_k^n(b) = \frac{\begin{vmatrix} S_{k-1}^n(b) & S_{k-1}^{n+1}(b) \\ S_{k-1}^n(\mu^{m+k-1}) & S_{k-1}^{n+1}(\mu^{m+k-1}) \end{vmatrix}}{\begin{vmatrix} 1 & 1 \\ S_{k-1}^n(\mu^{m+k-1}) & S_{k-1}^{n+1}(\mu^{m+k-1}) \end{vmatrix}} \tag{8.17}$$

and

$$T_k^n(b) = \frac{\begin{vmatrix} T_{k-1}^n(b) & T_{k-1}^{n+1}(b) \\ T_{k-1}^n(\mu^{m+k-1}) & T_{k-1}^{n+1}(\mu^{m+k-1}) \end{vmatrix}}{T_{k-1}^{n+1}(\mu^{m+k-1})}. \tag{8.18}$$

Proof. We give the proof for $S_k^n(b)$ first. Applying Theorem 8.1 to $f_k^{n,m}(b)$ with respect to the first and last rows and first and last columns, we obtain

$$f_k^{n,m}(b)\, G_{k-1}^{n+1,m} = f_{k-1}^{n,m}(b)\, G_k^{n+1,m} - f_{k-1}^{n+1,m}(b)\, G_k^{n,m}. \tag{8.19}$$

From this and from (8.3), it is clear that

$$S_k^n(b) = \alpha S_{k-1}^n(b) + \beta S_{k-1}^{n+1}(b), \tag{8.20}$$

for some constants $\alpha$ and $\beta$. To determine these, we resort to (8.8). First, letting $b = I$ in (8.20) and recalling that $S_k^n(I) = 1$ for all $n$ and $k$, we obtain the equation $\alpha + \beta = 1$. Next, letting $b = \mu^{m+k-1}$ in (8.20) and recalling that $S_k^n(\mu^{m+k-1}) = 0$, we obtain the equation

$$\alpha S_{k-1}^n(\mu^{m+k-1}) + \beta S_{k-1}^{n+1}(\mu^{m+k-1}) = 0.$$

Solving these two equations for $\alpha$ and $\beta$ and substituting in (8.20), we obtain (8.13), (8.15), and (8.17).

We now turn to the proof for $T_k^n(b)$. From (8.19) and from (8.4), it is clear that

$$T_k^n(b) = \alpha' T_{k-1}^n(b) + \beta' T_{k-1}^{n+1}(b), \tag{8.21}$$

for some constants $\alpha'$ and $\beta'$. To determine these, we resort to (8.9). First, we realize that $\alpha' = 1$ because $T_k^{n,m}(b) = \sum_{i=0}^{k} \sigma_i b_{n+i}$ with $\sigma_0 = 1$. Next, letting $b = \mu^{m+k-1}$ in (8.21) and recalling that $T_k^n(\mu^{m+k-1}) = 0$, we obtain the equation

$$\alpha' T_{k-1}^n(\mu^{m+k-1}) + \beta' T_{k-1}^{n+1}(\mu^{m+k-1}) = 0.$$

Letting $\alpha' = 1$, solving for $\beta'$, and substituting in (8.21), we obtain (8.14), (8.16), and (8.18).
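The three-term recursion (8.13) with (8.15) can be checked against the defining linear equations (8.8). The Python sketch below is our own illustration with random data, not the algorithm of [87]. The array `mu[r, i]` stands for $\mu_i^r$, and the function names `S_direct` and `S_recursive` are hypothetical. The recursive version is written for clarity and recomputes lower-order quantities many times.

```python
import numpy as np

def S_direct(mu, b, n, m, k):
    """S_k^{n,m}(b) from (8.8): sum(rho) = 1 and
    sum_i rho_i * mu[r, n+i] = 0 for r = m, ..., m+k-1."""
    A = np.vstack([np.ones(k + 1), mu[m:m + k, n:n + k + 1]])
    rhs = np.zeros(k + 1)
    rhs[0] = 1.0
    rho = np.linalg.solve(A, rhs)
    return rho @ b[n:n + k + 1]

def S_recursive(mu, b, n, m, k):
    """S_k^{n,m}(b) via the three-term recursion (8.13) with (8.15), m fixed."""
    if k == 0:
        return b[n]                            # (8.6)
    r = m + k - 1
    c = (S_recursive(mu, mu[r], n, m, k - 1)
         / S_recursive(mu, mu[r], n + 1, m, k - 1))
    return (S_recursive(mu, b, n, m, k - 1)
            - c * S_recursive(mu, b, n + 1, m, k - 1)) / (1.0 - c)

rng = np.random.default_rng(2)
mu = rng.standard_normal((6, 10))              # mu[r, i] plays the role of mu_i^r
b = rng.standard_normal((10, 3))               # a vector sequence b_0, ..., b_9
n, m, k = 1, 2, 3
print(S_direct(mu, b, n, m, k))
print(S_recursive(mu, b, n, m, k))
```

Both calls should print the same vector up to rounding, provided none of the divisors vanishes, which is the generic situation for random data.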

For the actual computation of the $S_k^n(b)$ and $T_k^n(b)$ via the recursion relations above, see Algorithm 2.2 in [87].

Remark: For MMPE and TEA1, by Theorem 8.2, we have that the $s_{n,k} = S_k^{n,0}(x)$ satisfy three-term recursion relations of the form

$$s_{n,k+1} = \alpha_{nk} s_{n,k} + \beta_{nk} s_{n+1,k}, \qquad \alpha_{nk} + \beta_{nk} = 1. \tag{8.22}$$

Note that as far as TEA is concerned, this result is in agreement with (5.42).

Before proceeding to the case of varying $m$, we note that in an earlier paper [31], Brezinski treated the problem of fixed $m$, resulting in the recursion relations denoted recursive projection algorithm (RPA) and compact recursive projection algorithm (CRPA), which are closely related to the ones we have summarized in this section. For a brief discussion about this relationship, see [87]. For a review of RPA and CRPA, see Brezinski and Redivo Zaglia [36, Chapter 4].

8.3 Recursions for m = n + q with fixed q

Since $m = n + q$, and $q$ is fixed, we shall denote $S_k^{n,m}(b)$ and $T_k^{n,m}(b)$ by $S_k^n(b)$ and $T_k^n(b)$, respectively. We shall also need the auxiliary quantities $\bar{S}_k^n(b) = S_k^{n,m-1}(b)$ and $\bar{T}_k^n(b) = T_k^{n,m-1}(b)$. Thus, $S_k^{n+j}(b)$ and $\bar{S}_k^{n+j}(b)$ stand for $S_k^{n+j,m+j}(b)$ and $S_k^{n+j,m+j-1}(b)$, respectively, for example.

Theorem 8.3. Let $\mu^r = \{\mu_i^r\}_{i=0}^{\infty}$, $r = 0, 1, \ldots$.

1. Then we have the coupled recursion relations

$$S_k^n(b) = \frac{S_{k-1}^n(b) - c_k^n \bar{S}_{k-1}^{n+1}(b)}{1 - c_k^n}, \qquad \bar{S}_k^n(b) = \frac{S_{k-1}^n(b) - \bar{c}_k^n \bar{S}_{k-1}^{n+1}(b)}{1 - \bar{c}_k^n} \tag{8.23}$$

and

$$T_k^n(b) = T_{k-1}^n(b) - d_k^n \bar{T}_{k-1}^{n+1}(b), \qquad \bar{T}_k^n(b) = T_{k-1}^n(b) - \bar{d}_k^n \bar{T}_{k-1}^{n+1}(b), \tag{8.24}$$

where

$$c_k^n = S_{k-1}^n(\mu^{m+k-1}) / \bar{S}_{k-1}^{n+1}(\mu^{m+k-1}), \qquad \bar{c}_k^n = S_{k-1}^n(\mu^{m-1}) / \bar{S}_{k-1}^{n+1}(\mu^{m-1}) \tag{8.25}$$

and

$$d_k^n = T_{k-1}^n(\mu^{m+k-1}) / \bar{T}_{k-1}^{n+1}(\mu^{m+k-1}), \qquad \bar{d}_k^n = T_{k-1}^n(\mu^{m-1}) / \bar{T}_{k-1}^{n+1}(\mu^{m-1}). \tag{8.26}$$

2. Furthermore, $S_k^n(b)$ and $T_k^n(b)$ satisfy four-term (lozenge) recursion relations²⁷ and have the following determinant representations:

$$S_{k+1}^n(b) = \frac{\begin{vmatrix} S_k^n(b) & S_{k-1}^{n+1}(b) & S_k^{n+1}(b) \\ S_k^n(\mu^m) & S_{k-1}^{n+1}(\mu^m) & S_k^{n+1}(\mu^m) \\ S_k^n(\mu^{m+k}) & S_{k-1}^{n+1}(\mu^{m+k}) & S_k^{n+1}(\mu^{m+k}) \end{vmatrix}}{\begin{vmatrix} 1 & 1 & 1 \\ S_k^n(\mu^m) & S_{k-1}^{n+1}(\mu^m) & S_k^{n+1}(\mu^m) \\ S_k^n(\mu^{m+k}) & S_{k-1}^{n+1}(\mu^{m+k}) & S_k^{n+1}(\mu^{m+k}) \end{vmatrix}} \tag{8.27}$$

and

$$T_{k+1}^n(b) = \frac{\begin{vmatrix} T_k^n(b) & T_{k-1}^{n+1}(b) & T_k^{n+1}(b) \\ T_k^n(\mu^m) & T_{k-1}^{n+1}(\mu^m) & T_k^{n+1}(\mu^m) \\ T_k^n(\mu^{m+k}) & T_{k-1}^{n+1}(\mu^{m+k}) & T_k^{n+1}(\mu^{m+k}) \end{vmatrix}}{\begin{vmatrix} T_{k-1}^{n+1}(\mu^m) & T_k^{n+1}(\mu^m) \\ T_{k-1}^{n+1}(\mu^{m+k}) & T_k^{n+1}(\mu^{m+k}) \end{vmatrix}}. \tag{8.28}$$

Note that $m = n + q$ throughout.

²⁷ Unfortunately, there are misprints in the formulas given in [87]; we correct them here.

Proof. The proof is similar to that of Theorem 8.2. Applying the Sylvester determinant identity to $f_k^{n,m}(b)$ with respect to the first and last rows and first and last columns, as in the preceding theorem, we realize that

$$f_k^{n,m}(b) = \varepsilon f_{k-1}^{n,m}(b) + \eta f_{k-1}^{n+1,m}(b), \qquad \varepsilon, \eta \text{ some scalars}. \tag{8.29}$$

Clearly, the term $f_{k-1}^{n+1,m}(b)$ needs separate treatment. Applying the Sylvester determinant identity to $f_k^{n,m-1}(b)$ with respect to the first and second rows and first and last columns, we have

$$f_k^{n,m-1}(b) = \varepsilon' f_{k-1}^{n,m}(b) + \eta' f_{k-1}^{n+1,m}(b), \qquad \varepsilon', \eta' \text{ some scalars}. \tag{8.30}$$

It is easy to realize that (8.29) and (8.30) produce two coupled recursion relations for $S_k^n(b)$ and $\bar{S}_k^n(b)$, and for $T_k^n(b)$ and $\bar{T}_k^n(b)$, and these are of the form

$$S_k^n(b) = \alpha S_{k-1}^n(b) + \beta \bar{S}_{k-1}^{n+1}(b), \tag{8.31}$$

$$\bar{S}_k^n(b) = \gamma S_{k-1}^n(b) + \delta \bar{S}_{k-1}^{n+1}(b) \tag{8.32}$$

and

$$T_k^n(b) = \alpha' T_{k-1}^n(b) + \beta' \bar{T}_{k-1}^{n+1}(b), \tag{8.33}$$

$$\bar{T}_k^n(b) = \gamma' T_{k-1}^n(b) + \delta' \bar{T}_{k-1}^{n+1}(b). \tag{8.34}$$

From (8.8) and (8.9), we also have that

$$\alpha + \beta = 1, \qquad \alpha' = 1, \qquad \gamma + \delta = 1, \qquad \gamma' = 1.$$

We now have to determine the coefficients $\alpha, \beta, \ldots$. Note that we need one additional equation for each recursion relation. For this we make use of (8.8) and (8.9) as follows:

For $\alpha, \beta$, set $b = \mu^{m+k-1}$ in (8.31) and invoke $S_k^n(\mu^{m+k-1}) = 0$.
For $\gamma, \delta$, set $b = \mu^{m-1}$ in (8.32) and invoke $\bar{S}_k^n(\mu^{m-1}) = 0$.
For $\alpha', \beta'$, set $b = \mu^{m+k-1}$ in (8.33) and invoke $T_k^n(\mu^{m+k-1}) = 0$.
For $\gamma', \delta'$, set $b = \mu^{m-1}$ in (8.34) and invoke $\bar{T}_k^n(\mu^{m-1}) = 0$.

The results in (8.23)–(8.26) now follow.

As for (8.27), we proceed by first showing that the $S_k^n(b)$ satisfy a linear four-term recursion relation. By (8.31), we realize that $\bar{S}_{k-1}^{n+1}(b)$ is a linear combination of $S_k^n(b)$ and $S_{k-1}^n(b)$. Making use of this in (8.32), thus eliminating $\bar{S}_k^n(b)$ and $\bar{S}_{k-1}^{n+1}(b)$ from (8.32), and then replacing $n$ by $n + 1$, we obtain a four-term recursion relation of the form

$$S_{k+1}^n(b) = \tau S_k^n(b) + \tau' S_{k-1}^{n+1}(b) + \tau'' S_k^{n+1}(b).$$

That this gives rise to (8.27) can be verified as follows: First, it is easy to see that the right-hand side of (8.27) is a linear combination of $b_n, b_{n+1}, \ldots, b_{n+k+1}$ and is equal to one when $b = I$, as it should be, by (8.8). Next, we need to show that it vanishes for $b = \mu^r$, $m \le r \le m + k$, again by (8.8). Now, since

$$S_k^n(\mu^r) = 0, \quad m \le r \le m + k - 1,$$
$$S_{k-1}^{n+1}(\mu^r) = 0, \quad m + 1 \le r \le m + k - 1,$$
$$S_k^{n+1}(\mu^r) = 0, \quad m + 1 \le r \le m + k,$$

it is clear that the first row of the numerator determinant on the right-hand side of (8.27) vanishes when $b = \mu^r$, $m + 1 \le r \le m + k - 1$; therefore, the determinant vanishes. This determinant also vanishes when $b = \mu^m$ and $b = \mu^{m+k}$, since it has two identical rows in each of these cases. This completes the proof. Since the treatment of (8.28) is almost identical to that of (8.27), we leave it to the reader.

Note that, by the fact that $S_k^n(\mu^m) = 0$ and $S_k^{n+1}(\mu^{m+k}) = 0$, (8.27) can be rewritten as

$$S_{k+1}^n(b) = \frac{\begin{vmatrix} S_k^n(b) & S_{k-1}^{n+1}(b) & S_k^{n+1}(b) \\ 0 & S_{k-1}^{n+1}(\mu^m) & S_k^{n+1}(\mu^m) \\ S_k^n(\mu^{m+k}) & S_{k-1}^{n+1}(\mu^{m+k}) & 0 \end{vmatrix}}{\begin{vmatrix} 1 & 1 & 1 \\ 0 & S_{k-1}^{n+1}(\mu^m) & S_k^{n+1}(\mu^m) \\ S_k^n(\mu^{m+k}) & S_{k-1}^{n+1}(\mu^{m+k}) & 0 \end{vmatrix}}. \tag{8.35}$$

Analogously, (8.28) can be rewritten as

$$T_{k+1}^n(b) = -\frac{\begin{vmatrix} T_k^n(b) & T_{k-1}^{n+1}(b) & T_k^{n+1}(b) \\ 0 & T_{k-1}^{n+1}(\mu^m) & T_k^{n+1}(\mu^m) \\ T_k^n(\mu^{m+k}) & T_{k-1}^{n+1}(\mu^{m+k}) & 0 \end{vmatrix}}{T_k^{n+1}(\mu^m)\, T_{k-1}^{n+1}(\mu^{m+k})}. \tag{8.36}$$

For the actual computation of the $S_k^n(b)$ and $T_k^n(b)$ via the recursion relations above, see Algorithm 3.2 in [87].

Remark: For MPE and RRE, by Theorem 8.3, we have that the $s_{n,k} = S_k^{n,n}(x)$ satisfy four-term recursion relations of the form

$$s_{n,k+1} = \alpha_{nk} s_{n,k} + \beta_{nk} s_{n+1,k-1} + \gamma_{nk} s_{n+1,k}, \qquad \alpha_{nk} + \beta_{nk} + \gamma_{nk} = 1. \tag{8.37}$$

8.4 Recursions for fixed n

We end with recursion relations for the case in which $n$ is held fixed and $m$ is varying. We shall denote $S_k^{n,m}(b)$ and $T_k^{n,m}(b)$ by $\tilde{S}_k^m(b)$ and $\tilde{T}_k^m(b)$, respectively.

Theorem 8.4. Let $\mu^r = \{\mu_i^r\}_{i=0}^{\infty}$, $r = 0, 1, \ldots$. Then we have the three-term recursion relations

$$\tilde{S}_k^m(b) = \frac{\tilde{S}_{k-1}^m(b) - \tilde{c}_k^m \tilde{S}_k^{m-1}(b)}{1 - \tilde{c}_k^m} \tag{8.38}$$

and

$$\tilde{T}_k^m(b) = \frac{\tilde{T}_{k-1}^m(b) - \tilde{d}_k^m \tilde{T}_k^{m-1}(b)}{1 - \tilde{d}_k^m}, \tag{8.39}$$

where

$$\tilde{c}_k^m = \tilde{S}_{k-1}^m(\mu^{m+k-1}) / \tilde{S}_k^{m-1}(\mu^{m+k-1}) \tag{8.40}$$

and

$$\tilde{d}_k^m = \tilde{T}_{k-1}^m(\mu^{m+k-1}) / \tilde{T}_k^{m-1}(\mu^{m+k-1}). \tag{8.41}$$

Proof. Eliminating $f_{k-1}^{n+1,m}(b)$ from (8.29) and (8.30), we obtain a recursion relation between $f_k^{n,m}(b)$, $f_{k-1}^{n,m}(b)$, and $f_k^{n,m-1}(b)$. Thus, we have a recursion relation of the form

$$\tilde{S}_k^m(b) = \alpha \tilde{S}_{k-1}^m(b) + \beta \tilde{S}_k^{m-1}(b). \tag{8.42}$$

Letting $b = \mu^{m+k-1}$ in this recursion relation and realizing that $\tilde{S}_k^m(\mu^{m+k-1}) = 0$, we obtain

$$\alpha \tilde{S}_{k-1}^m(\mu^{m+k-1}) + \beta \tilde{S}_k^{m-1}(\mu^{m+k-1}) = 0,$$

which, together with $\alpha + \beta = 1$, results in (8.38) with (8.40). A similar treatment, along with the fact that the coefficient of $b_n$ in $T_k^{n,m}(b)$ is unity for all $m, n, k$, results in (8.39) with (8.41). This completes the proof.

For the actual computation of the $\tilde{S}_k^m(b)$ and $\tilde{T}_k^m(b)$ via the recursion relations above, see Algorithm 4.2 in [87].

8.5 qd-type algorithms and the matrix eigenvalue problem

The scalars $d_k^n$ and $\bar{d}_k^n$ used in the recursions for the $T_k^{n,m}(b)$ have some interesting approximation properties when applied in conjunction with vector sequences from power iterations with matrices. We turn to a brief discussion of this subject now. For a detailed treatment, see Sidi and Ford [296].

8.5.1 qd-type algorithms

Given a vector sequence $\{x_n\}$, and the vectors $q$ and $q_1, q_2, \ldots$, define, analogous to (8.10)–(8.11),

$$\mu_n^m = \begin{cases} (q_{m+1}, x_n) & \text{analogous to MMPE}, \\ (q, x_{n+m}) & \text{analogous to TEA}, \\ (x_m, x_n) & \text{analogous to MPE}. \end{cases}$$

1. qd-MMPE algorithm: Set $T_0^n(\mu^j) = \mu_n^j$. For $k = 1, 2, \ldots$, (i) compute $d_k^n$ via (8.16), and then (ii) for $j \ge k$, compute $T_k^n(\mu^j) \equiv T_k^{n,0}(\mu^j)$ via (8.14). (Note that $d_k^n$ is constructed from the $k + 1$ vectors $x_n, x_{n+1}, \ldots, x_{n+k}$.)

2. qd-TEA algorithm: Set $T_0^n(\mu^j) = \mu_n^j$. For $k = 1, 2, \ldots$, (i) compute $d_k^n$ via (8.16), and then (ii) for $j \ge k$, compute $T_k^n(\mu^j) \equiv T_k^{n,0}(\mu^j)$ via (8.14). (Note that $d_k^n$ is constructed from the $2k$ vectors $x_n, x_{n+1}, \ldots, x_{n+2k-1}$.)

3. qd-MPE algorithm: Set $T_0^n(\mu^j) = \mu_n^j = \bar{T}_0^n(\mu^j)$. For $k = 1, 2, \ldots$, (i) compute $d_k^n$ and $\bar{d}_k^n$ via (8.26), and then (ii) for $j \ge k$, compute $T_k^n(\mu^j) \equiv T_k^{n,n}(\mu^j)$ and $\bar{T}_k^n(\mu^j) \equiv T_k^{n,n-1}(\mu^j)$ via (8.24). (Note that $d_k^n$ and $\bar{d}_k^n$ are constructed from the $k + 1$ vectors $x_n, x_{n+1}, \ldots, x_{n+k}$.)

Theorem 8.5 is the summary of four theorems proved in [296].

Theorem 8.5. Let the sequence $\{x_n\}$ be such that

$$x_n = \sum_{i=1}^{p} v_i \lambda_i^n, \quad n = 0, 1, \ldots, \tag{8.43}$$

where

$$v_i \ne 0, \qquad \lambda_i \ne 0 \ \text{distinct}, \qquad |\lambda_1| \ge |\lambda_2| \ge \cdots \ge |\lambda_p|, \tag{8.44}$$

$$v_1, \ldots, v_p \ \text{linearly independent for qd-MMPE and qd-MPE}, \tag{8.45}$$

and

$$\begin{vmatrix} (q_1, v_1) & (q_1, v_2) & \cdots & (q_1, v_k) \\ (q_2, v_1) & (q_2, v_2) & \cdots & (q_2, v_k) \\ \vdots & \vdots & & \vdots \\ (q_k, v_1) & (q_k, v_2) & \cdots & (q_k, v_k) \end{vmatrix} \ne 0 \quad \text{for qd-MMPE} \tag{8.46}$$

and

$$\prod_{i=1}^{k} (q, v_i) \ne 0 \quad \text{for qd-TEA}. \tag{8.47}$$

Assume also that

$$|\lambda_{k-1}| > |\lambda_k| > |\lambda_{k+1}| \quad \text{for some integer } k \ge 1, \qquad \lambda_0 \equiv \infty. \tag{8.48}$$

Define

$$\varepsilon_k = \max\left\{ \left|\frac{\lambda_k}{\lambda_{k-1}}\right|, \left|\frac{\lambda_{k+1}}{\lambda_k}\right| \right\} \quad (\text{note that } 0 < \varepsilon_k < 1). \tag{8.49}$$

Apply the qd-MMPE, qd-TEA, and qd-MPE algorithms to $\{x_n\}$ as described above.

1. Then, for the qd-MMPE and qd-TEA algorithms,

$$1/d_k^n = \lambda_k + O(\varepsilon_k^n) \quad \text{as } n \to \infty, \tag{8.50}$$

and, for the qd-MPE algorithm, we have similarly

$$1/d_k^n = \lambda_k + O(\varepsilon_k^n) \quad \text{and} \quad 1/\bar{d}_k^n = \lambda_k + O(\varepsilon_k^n) \quad \text{as } n \to \infty. \tag{8.51}$$

2. If $(v_i, v_j) = 0$ for $i \ne j$, the above results for the qd-MPE algorithm improve to read

$$1/d_k^n = \lambda_k + O(\varepsilon_k^{2n}) \quad \text{and} \quad 1/\bar{d}_k^n = \lambda_k + O(\varepsilon_k^{2n}) \quad \text{as } n \to \infty. \tag{8.52}$$

We do not present the proof of this theorem here as it is quite lengthy and complicated. We only mention that it is achieved by employing the techniques we used in the treatment of the convergence of vector extrapolation methods given in Chapter 6.

8.5.2 Connection with the matrix eigenvalue problem

• Let us observe that

$$d_k^n = \frac{T_{k-1}^n(\mu^{k-1})}{T_{k-1}^{n+1}(\mu^{k-1})} = \frac{T_{k-1}^{n,0}(\mu^{k-1})}{T_{k-1}^{n+1,0}(\mu^{k-1})} \quad \text{for qd-MMPE and qd-TEA}.$$

Therefore, for $k = 1$, we have

$$d_1^n = \frac{T_0^{n,0}(\mu^0)}{T_0^{n+1,0}(\mu^0)} = \frac{\mu_n^0}{\mu_{n+1}^0}.$$

Consequently,

$$\frac{1}{d_1^n} = \frac{(q_1, x_{n+1})}{(q_1, x_n)} \ \text{ for qd-MMPE}, \qquad \frac{1}{d_1^n} = \frac{(q, x_{n+1})}{(q, x_n)} \ \text{ for qd-TEA}.$$

• Let us also observe that

$$d_k^n = \frac{T_{k-1}^n(\mu^{n+k-1})}{\bar{T}_{k-1}^{n+1}(\mu^{n+k-1})} = \frac{T_{k-1}^{n,n}(\mu^{n+k-1})}{T_{k-1}^{n+1,n}(\mu^{n+k-1})}, \qquad \bar{d}_k^n = \frac{T_{k-1}^n(\mu^{n-1})}{\bar{T}_{k-1}^{n+1}(\mu^{n-1})} = \frac{T_{k-1}^{n,n}(\mu^{n-1})}{T_{k-1}^{n+1,n}(\mu^{n-1})} \quad \text{for qd-MPE}.$$

Therefore, for $k = 1$, we have

$$d_1^n = \frac{T_0^{n,n}(\mu^n)}{T_0^{n+1,n}(\mu^n)} = \frac{\mu_n^n}{\mu_{n+1}^n} \quad \text{and} \quad \bar{d}_1^n = \frac{T_0^{n,n}(\mu^{n-1})}{T_0^{n+1,n}(\mu^{n-1})} = \frac{\mu_n^{n-1}}{\mu_{n+1}^{n-1}}.$$

Consequently,

$$\frac{1}{d_1^n} = \frac{(x_n, x_{n+1})}{(x_n, x_n)} \quad \text{and} \quad \frac{1}{\bar{d}_1^n} = \frac{(x_{n-1}, x_{n+1})}{(x_{n-1}, x_n)} \quad \text{for qd-MPE}.$$

Let $A$ be a square matrix and let the sequence $\{x_n\}$ be obtained by performing power iterations with $A$, starting with an arbitrary nonzero vector $x_0$. That is,

$$x_{n+1} = A x_n, \quad n = 0, 1, \ldots .$$

Then the vectors $x_n$ are precisely as described in (8.43)–(8.45) in Theorem 8.5, assuming that $A$ is diagonalizable. The $\lambda_i$ are some or all of the distinct eigenvalues of $A$ and the $v_i$ are corresponding eigenvectors. Thus, the scalars $1/d_k^n$ and $1/\bar{d}_k^n$ we have in our algorithms produce approximations to the eigenvalue $\lambda_k$, as described in (8.50), provided the condition in (8.48) is satisfied, and provided (8.46) holds for qd-MMPE and (8.47) holds for qd-TEA. The situation described in part 2 of the theorem occurs when $A$ is a normal matrix, and (8.52) implies that the convergence of $1/d_k^n$ and $1/\bar{d}_k^n$ to $\lambda_k$ for qd-MPE is twice as fast as that for a nonnormal matrix.
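The qd-TEA algorithm is short enough to try on a small power iteration. The Python sketch below is our own illustration, not the implementation of [296]. It uses (8.16) and (8.14) with $m = 0$ and $\mu_n^j = (q, x_{n+j})$. The function name `qd_tea`, the $3 \times 3$ test matrix (eigenvalues 4, 2, 1), and the vectors $q$ and $x_0$ are assumptions chosen for this example.

```python
import numpy as np

def qd_tea(y, n, K):
    """Return [d_1^n, ..., d_K^n] from the scalars y[i] = (q, x_i),
    i = n, ..., n + 2K - 1, via (8.16) and (8.14) with m = 0."""
    # Row i, column j holds T_0^{n+i}(mu^j) = mu_{n+i}^j = y[n + i + j].
    T = np.array([[y[n + i + j] for j in range(K)] for i in range(K + 1)],
                 dtype=float)
    d_n = []
    for k in range(1, K + 1):
        d = T[:-1, k - 1] / T[1:, k - 1]        # (8.16): d_k^{n+i}
        d_n.append(d[0])
        T = T[:-1, :] - d[:, None] * T[1:, :]   # (8.14): T_k^{n+i}(mu^j)
    return d_n

# Power iterations x_{n+1} = A x_n with an illustrative upper triangular A.
A = np.array([[4.0, 1.0, 0.0],
              [0.0, 2.0, 1.0],
              [0.0, 0.0, 1.0]])
q = np.ones(3)
x = np.ones(3)
n, K = 20, 2
y = []
for _ in range(n + 2 * K):
    y.append(q @ x)
    x = A @ x

d = qd_tea(y, n, K)
print([1.0 / dk for dk in d])   # approximations to lambda_1 and lambda_2
```

For this data, $1/d_1^n$ and $1/d_2^n$ should be close to 4 and 2, respectively. Asking for $k = 3$ in floating-point arithmetic would suffer from severe cancellation, which is in line with the remark that the recursions are mainly of theoretical interest.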

Remarks:

1. The case $k = 1$ turns out to be familiar:

$$\frac{1}{d_1^n} = \frac{(q_1, A x_n)}{(q_1, x_n)} \ \text{ for qd-MMPE}, \qquad \frac{1}{d_1^n} = \frac{(q, A x_n)}{(q, x_n)} \ \text{ for qd-TEA},$$

and

$$\frac{1}{d_1^n} = \frac{(x_n, A x_n)}{(x_n, x_n)} \quad \text{and} \quad \frac{1}{\bar{d}_1^n} = \frac{(x_{n-1}, A x_n)}{(x_{n-1}, x_n)} \quad \text{for qd-MPE}.$$

We realize that $1/d_1^n$ for the qd-MMPE and qd-TEA algorithms is the approximation to the largest eigenvalue of $A$ provided by the regular power method, while $1/d_1^n$ for the qd-MPE algorithm is the Rayleigh quotient and is also an approximation to the largest eigenvalue of $A$.²⁸

2. For arbitrary $k$, $d_k^n$ have the following determinant representations:

$$d_k^n = \frac{G_k^{n,0}\, G_{k-1}^{n+2,0}}{G_k^{n+1,0}\, G_{k-1}^{n+1,0}} \quad \text{for qd-MMPE and qd-TEA}$$

and

$$d_k^n = \frac{G_k^{n,n}\, G_{k-1}^{n+2,n}}{G_k^{n+1,n}\, G_{k-1}^{n+1,n}} \quad \text{and} \quad \bar{d}_k^n = \frac{G_k^{n,n-1}\, G_{k-1}^{n+2,n}}{G_k^{n+1,n-1}\, G_{k-1}^{n+1,n}} \quad \text{for qd-MPE}.$$

Note that these representations resemble the representations of the quantities $e_k^{(n)}$ and $q_k^{(n)}$ that define the qd algorithm we mentioned in Section 5.2 in our discussion of the FS/qd algorithm for the Shanks transformation. (See Henrici [130, Theorem 7.6a] or Sidi [278, p. 336], for example.) In addition, the convergence rates shown in Theorem 8.5 are similar to those obtained by the quotient-difference algorithm; see Kershaw [158], for example. These convergence properties also resemble those obtained from the basic LR and QR algorithms. (See Parlett [206] or Ralston and Rabinowitz [214, Theorem 10.16, pp. 523–525].)

²⁸ We provide a detailed treatment of the Rayleigh quotient and the power method for the matrix eigenvalue problem in Chapter 10 and Appendix G.

Chapter 9

Krylov Subspace Methods for Linear Systems

9.1 Projection methods for linear systems

In this chapter, we are concerned with the solution, via so-called Krylov subspace methods, of nonsingular linear (as opposed to nonlinear) systems of equations of the form $Ax = b$, $A \in \mathbb{C}^{N \times N}$, $b \in \mathbb{C}^N$, whose unique solution we denote by $s$. As we will see later in this chapter, these methods are closely related (actually, mathematically equivalent) to the vector extrapolation methods we have discussed in the previous chapters when the latter are applied to vector sequences $\{x_m\}$ obtained by applying fixed-point iterative methods to these nonsingular linear systems. Krylov subspace methods and extrapolation methods differ completely in their algorithmic aspects: The former take as their only input a procedure that performs the matrix-vector multiplication $Ax$ without actually having to know the matrix $A$ explicitly, while the latter take as their only input a vector sequence $\{x_m\}$ that results from a fixed-point iterative scheme without actually having to know what the scheme is.

Krylov subspace methods for linear systems form a very well developed area of numerical linear algebra and are covered in great detail in the works of Saad [235, 236] and the recent books by Barrett et al. [16], Axelsson [12], Greenbaum [118], Saad [239], and van der Vorst [326] mentioned in Chapter 0. We also mention the classic book by Vorobyev [335], which is strongly related to Krylov subspace methods. Here, we will be studying only those Krylov subspace methods that are related to vector extrapolation methods. Thus, this chapter is meant to be introductory and not exhaustive.

Because Krylov subspace methods are special cases of projection methods, we begin with the treatment of projection methods. Throughout this chapter, we define the residual vector associated with $x$ as
\[ r(x) = b - Ax. \tag{9.1} \]

We start with the definition of projection methods.

Definition 9.1. Let $\mathcal{Y}$ and $\mathcal{Z}$ be two $k$-dimensional subspaces of $\mathbb{C}^N$, and choose a vector $s_0 = x_0 \in \mathbb{C}^N$. Define the approximation $s_k$ to the solution $s$ of $Ax = b$ to be the vector $s_k = x_0 + y$, where $y \in \mathcal{Y}$, such that $r(s_k) = b - As_k$ is orthogonal to $\mathcal{Z}$. That is, $(z, r(s_k)) = 0$ for every $z \in \mathcal{Z}$. Here $(\cdot\,, \cdot)$ is some inner product in $\mathbb{C}^N$. Such a method is called a projection method onto $\mathcal{Y}$ orthogonally to $\mathcal{Z}$. If $\mathcal{Z} = \mathcal{Y}$, we have an orthogonal projection method; otherwise, we have an oblique projection method. $\mathcal{Y}$ and $\mathcal{Z}$ are called, respectively, right and left subspaces.

Remarks:

1. A vector $x = x_0 + y$ with $y \in \mathcal{Y}$ is said to belong to the affine subspace $x_0 + \mathcal{Y}$, and we write $x \in x_0 + \mathcal{Y}$. Thus, $s_k \in x_0 + \mathcal{Y}$ in Definition 9.1.

2. If $\mathcal{Z} = \mathcal{Y}$, then the condition that $r(s_k)$ be orthogonal to $\mathcal{Z}$ is called a Ritz–Galerkin condition. If $\mathcal{Z} \neq \mathcal{Y}$, then it is called a Petrov–Galerkin condition.

3. When it becomes necessary to stress the fact that $\mathcal{Y}$ and $\mathcal{Z}$ are $k$-dimensional, we will denote these subspaces by $\mathcal{Y}_k$ and $\mathcal{Z}_k$, respectively.

The following theorem shows that $s_k$ in Definition 9.1 exists and is unique under certain conditions. It also shows how $s_k$ can be determined.

Theorem 9.2. Let the inner product in Definition 9.1 be
\[ (u, w) = u^* M w, \quad M \text{ Hermitian positive definite.} \]
(Recall that this is the form of the most general inner product in $\mathbb{C}^N$.) Let $\mathcal{Y} = \operatorname{span}\{y_1, \ldots, y_k\}$ and $\mathcal{Z} = \operatorname{span}\{z_1, \ldots, z_k\}$, and define also two $N \times k$ matrices $Y$ and $Z$ by $Y = [\, y_1 \,|\, \cdots \,|\, y_k \,]$ and $Z = [\, z_1 \,|\, \cdots \,|\, z_k \,]$. Then $s_k$ in Definition 9.1 exists, is unique, satisfies $Z^* M r(s_k) = 0$, and is given by
\[ s_k = x_0 + Y (Z^* M A Y)^{-1} Z^* M r_0, \quad r_0 = r(x_0) = b - A x_0, \]
provided the $k \times k$ matrix $Z^* M A Y$ is nonsingular.

Proof. First, $y \in \mathcal{Y}$ means that $y = \sum_{i=1}^{k} \xi_i y_i = Y\xi$, where $\xi = [\xi_1, \ldots, \xi_k]^T$. Therefore, $s_k = x_0 + Y\xi$ and $r(s_k) = r_0 - AY\xi$. Next, $(z, r(s_k)) = 0$ for every $z \in \mathcal{Z}$ means that $z_i^* M r(s_k) = 0$, $i = 1, \ldots, k$, which in turn can be written as $Z^* M r(s_k) = 0$. This gives $Z^* M A Y \xi = Z^* M r_0$. Solving for $\xi$, and substituting into $s_k = x_0 + Y\xi$, the result now follows.
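The formula of Theorem 9.2 can be tried out numerically. The following is a minimal sketch, not an implementation taken from the text: the function name `projection_step`, the random test data, and the use of dense NumPy arrays are all assumptions made for illustration. It forms $s_k$ for randomly chosen right and left subspaces and evaluates $Z^* M r(s_k)$, which should vanish; it also illustrates the case $k = N$, in which the projection method returns the exact solution.

```python
import numpy as np

def projection_step(A, b, x0, Y, Z, M=None):
    """Sketch of s_k = x0 + Y (Z^* M A Y)^{-1} Z^* M r0 from Theorem 9.2.

    Assumes Z^* M A Y is nonsingular; M defaults to the identity.
    """
    N = A.shape[0]
    M = np.eye(N) if M is None else M
    r0 = b - A @ x0
    K = Z.conj().T @ M @ A @ Y                 # the k x k matrix Z^* M A Y
    xi = np.linalg.solve(K, Z.conj().T @ M @ r0)
    return x0 + Y @ xi

# Hypothetical test data (real matrices, so ^* is just the transpose).
rng = np.random.default_rng(0)
N, k = 6, 3
A = rng.standard_normal((N, N)) + N * np.eye(N)
b = rng.standard_normal(N)
x0 = np.zeros(N)
Y = rng.standard_normal((N, k))                # columns span the right subspace
Z = rng.standard_normal((N, k))                # columns span the left subspace
B = rng.standard_normal((N, N))
M = B.T @ B + np.eye(N)                        # Hermitian positive definite weight

s_k = projection_step(A, b, x0, Y, Z, M)
petrov_galerkin = Z.T @ M @ (b - A @ s_k)      # Z^* M r(s_k), expected to vanish

# With k = N the residual is orthogonal to N independent vectors, so s_N = s.
s_N = projection_step(A, b, x0, np.eye(N), rng.standard_normal((N, N)), M)
```

In practice one would not form $Y$ and $Z$ from arbitrary vectors; the Krylov subspace methods discussed later in this chapter choose them in a structured way.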


Remarks:

1. It is clear that a projection method is characterized entirely by its right and left subspaces and the inner product $(\cdot\,, \cdot)$.

2. In most cases of interest, the subspaces $\mathcal{Y}$ and $\mathcal{Z}$ expand with increasing $k$. That is, if we denote the $k$-dimensional subspaces $\mathcal{Y}$ and $\mathcal{Z}$ by $\mathcal{Y}_k$ and $\mathcal{Z}_k$, respectively, then we have $\mathcal{Y}_k \subset \mathcal{Y}_{k+1}$ and $\mathcal{Z}_k \subset \mathcal{Z}_{k+1}$.

3. Part of the philosophy behind approximating the solution $s$ of $Ax = b$ via a projection method is that, as $k$ increases, $r(s_k)$ is orthogonal to an expanding subspace and hence is likely to become closer to $0$. Of course, when $k = N$, we will have $r(s_N) = 0$ because $r(s_N)$ is orthogonal to $N$ linearly independent vectors in $\mathbb{C}^N$. As a result, $s_N = s$.

4. Projection methods are used in different areas of computational mathematics. When used within the context of approximation theory, they are known as Galerkin methods. Without any changes, they can also be defined in any inner product space.

There are two important cases in which $Z^* M A Y$ is nonsingular. We treat these in Theorem 9.4 below. We begin with the following result that will also be used in Theorem 9.4.

Theorem 9.3. Let $\mathcal{Y} = \operatorname{span}\{y_1, \ldots, y_k\}$ be a $k$-dimensional subspace of $\mathbb{C}^N$, and let $Y = [\, y_1 \,|\, \cdots \,|\, y_k \,] \in \mathbb{C}^{N \times k}$. Consider the minimization problem
\[ \min_{x \in x_0 + \mathcal{Y}} \|x - s\|_G, \quad \|c\|_G = \sqrt{c^* G c}, \tag{9.2} \]
where $s \in \mathbb{C}^N$ is fixed and $G$ is Hermitian positive semidefinite. A vector $\hat{s} \in x_0 + \mathcal{Y}$ is a solution to this problem if and only if it satisfies
\[ Y^* G (\hat{s} - s) = 0. \tag{9.3} \]
If $G$ is positive definite, $\hat{s}$ is unique and
\[ \hat{s} = x_0 + Y (Y^* G Y)^{-1} Y^* G (s - x_0). \tag{9.4} \]

Remarks:

1. Recall that $\|c\|_G = \sqrt{c^* G c}$ is a vector norm when $G$ is Hermitian positive definite. If $G$ is Hermitian positive semidefinite, $\|c\|_G$ is a seminorm, that is, there exists $c \neq 0$ for which $\|c\|_G = 0$.

2. Note that $x = s$ is a solution to the problem $\min_{x \in \mathbb{C}^N} \|x - s\|_G$. In view of this, the solution $\hat{s}$ to (9.2) is an approximation to $s$ from the affine subspace $x_0 + \mathcal{Y}$.

Proof. First, $x - s = (x - \hat{s}) + (\hat{s} - s)$, and, therefore,
\[ \|x - s\|_G^2 = \|x - \hat{s}\|_G^2 + \|\hat{s} - s\|_G^2 + 2\Re[(x - \hat{s})^* G (\hat{s} - s)], \]
which, by letting
\[ g(x) = \|x - \hat{s}\|_G^2 + 2\Re[(x - \hat{s})^* G (\hat{s} - s)], \]
can be written as
\[ \|x - s\|_G^2 = \|\hat{s} - s\|_G^2 + g(x). \]
By the fact that $x, \hat{s} \in x_0 + \mathcal{Y}$, we have $x = x_0 + Y\xi$ and $\hat{s} = x_0 + Y\hat{\xi}$ for some $\xi, \hat{\xi} \in \mathbb{C}^k$. Thus, $x - \hat{s} = Y(\xi - \hat{\xi})$. Also, let $a = Y^* G (\hat{s} - s)$ for short. Note that $a \in \mathbb{C}^k$ and is independent of $x$. With these, $g(x)$ becomes
\[ g(x) = \|Y(\xi - \hat{\xi})\|_G^2 + 2\Re[(\xi - \hat{\xi})^* a]. \]
Clearly, if $a = 0$, $g(x) = \|x - \hat{s}\|_G^2 \geq 0$; hence $\hat{s}$ is indeed an optimal solution. Let us now assume that $\hat{s}$ is an optimal solution, that is, that $g(x) \geq 0$ for all $x \in x_0 + \mathcal{Y}$, but that $a \neq 0$. Let us choose $\xi = \hat{\xi} + ta$, $t$ being an arbitrary real scalar. [This amounts to choosing $x = x(t) = \hat{s} + t(Ya)$ in $x_0 + \mathcal{Y}$.] Then
\[ g(x) = t^2 \|Ya\|_G^2 + 2t \|a\|_2^2, \]
which, by choosing $t < 0$ and sufficiently close to zero, can be made negative, thus resulting in a contradiction. Therefore, $a = 0$ must hold. This completes the proof of (9.3).

Next, from the fact that every optimal solution $\hat{s} = x_0 + Y\hat{\xi}$ satisfies (9.3), we have $Y^* G Y \hat{\xi} = Y^* G (s - x_0)$. When $G$ is positive definite, the matrix $Y^* G Y \in \mathbb{C}^{k \times k}$ is Hermitian positive definite since $Y$ has full rank. Solving this equation for $\hat{\xi}$, and substituting into $\hat{s} = x_0 + Y\hat{\xi}$, we obtain (9.4).

Remark: If the subspaces $\mathcal{Y}$ and $\mathcal{Z}$ are expanding in the sense described in remark 2 following the proof of Theorem 9.2, we also have that the sequence $\{\|s_k - s\|_G\}$ in Theorem 9.3 is nonincreasing.

The next theorem shows that, in some important cases, the vectors $s_k$ produced by projection methods are also those produced by minimizing certain scalar functions related to the linear system $Ax = b$. For this, we recall that the Hermitian and skew-Hermitian parts of a square matrix $K$ are $K_H = \frac{1}{2}(K + K^*)$ and $K_S = \frac{1}{2}(K - K^*)$, respectively.

Theorem 9.4. In the following cases, the matrix $K = Z^* M A Y$ is nonsingular; hence the vector $s_k$ in Theorem 9.2 exists and is unique.

1. $\mathcal{Z} = \mathcal{Y}$, $M = I$, and $A_H$ is positive definite. If $A$ is itself Hermitian positive definite, $s_k$ also satisfies
\[ \|s_k - s\|_A = \min_{x \in x_0 + \mathcal{Y}} \|x - s\|_A, \quad \|c\|_A = \sqrt{c^* A c}. \tag{9.5} \]

2. $\mathcal{Z} = A\mathcal{Y}$, $A$ is nonsingular, and $s$ is the solution to $Ax = b$. In this case, $s_k$ also satisfies
\[ \|r(s_k)\|_M = \min_{x \in x_0 + \mathcal{Y}} \|r(x)\|_M, \quad \|c\|_M = \sqrt{c^* M c}. \tag{9.6} \]

Proof. In part 1, we first have that $Z = Y$; therefore, $K = Y^* A Y$, and it is easy to verify that $K_H = Y^* A_H Y$ and $K_S = Y^* A_S Y$. In addition, because $A_H$ is positive definite and $Y$ has full rank, it follows that $K_H$ is positive definite. By this and by Lemma 1.19, we conclude that $K$ is nonsingular.

In part 2, we first have that $Z = AY$; therefore, $K = (AY)^* M (AY)$. Now, since $A$ is nonsingular and $Y$ has full rank, the matrix $AY$ has full rank too. That is, $\operatorname{rank}(AY) = k$. Invoking the fact that $M$ is Hermitian positive definite, we conclude that $K$ is Hermitian positive definite and hence nonsingular.

To prove the validity of the minimization problems, it is enough to show that $s_k$ satisfies $Y^* G (s - s_k) = 0$ for some appropriate Hermitian positive definite matrix $G$, by Theorem 9.3. First, realizing that $r(x) = b - Ax = A(s - x)$ for every $x$, the equation $Z^* M r(s_k) = 0$ satisfied by $s_k$ becomes
\[ Z^* M A (s - s_k) = 0. \tag{9.7} \]
In part 1, this last equation becomes $Y^* A (s - s_k) = 0$. Since $A$ is assumed to be Hermitian positive definite, Theorem 9.3 applies with $G = A$, which proves (9.5).

In part 2, the equation $Z^* M r(s_k) = 0$ satisfied by $s_k$ becomes
\[ Y^* A^* M A (s - s_k) = 0. \tag{9.8} \]
Now $A^* M A$ is Hermitian positive definite since $M$ is Hermitian positive definite and $A$ is nonsingular. Therefore, Theorem 9.3 applies with $G = A^* M A$. Finally, (9.6) follows from the fact that $\|x - s\|_{A^* M A} = \|A(x - s)\|_M = \|r(x)\|_M$. This completes the proof.

Remark: Theorem 9.4 is also valid when the Hermitian part of $\alpha A$ for some $\alpha \in \mathbb{C}$, $|\alpha| = 1$, is positive definite or $\alpha A$ is itself positive definite. In this case, we consider the solution to $A'x = b'$, where $A' = \alpha A$ and $b' = \alpha b$.

One-dimensional examples

We now consider two examples of projection methods for $Ax = b$ in which $k = 1$. By Theorem 9.2, with $\mathcal{Y} = \operatorname{span}\{y\}$ and $\mathcal{Z} = \operatorname{span}\{z\}$, we have $Z^* M A Y = z^* M A y = (z, Ay)$ and $Z^* M r_0 = z^* M r_0 = (z, r_0)$, and hence
\[ s_1 = x_0 + \frac{(z, r_0)}{(z, Ay)}\, y. \]
In the next two examples, we let $r(x) = b - Ax$ and $r_0 = r(x_0)$, as usual. We also take $M = I$. Thus, $(c, d) = c^* d$ and $\|c\| = \sqrt{c^* c}$.


• Example 1. Let $y = z = r_0$. Then
\[ s_1 = x_0 + \frac{(r_0, r_0)}{(r_0, A r_0)}\, r_0. \]
By part 1 of Theorem 9.4, if $A$ is Hermitian positive definite, $s_1$ also satisfies
\[ \|s_1 - s\|_A = \min_{\alpha} \|(x_0 + \alpha r_0) - s\|_A, \quad \|c\|_A = \sqrt{c^* A c}. \]
The procedure we have just described is actually one step of the steepest descent method (SD) applied to the quadratic function $f(x) = \|x - s\|_A^2$ starting with the vector $x_0$.²⁹

• Example 2. Let $y = r_0$ and $z = Ay = A r_0$. Then
\[ s_1 = x_0 + \frac{(A r_0, r_0)}{(A r_0, A r_0)}\, r_0. \]
By part 2 of Theorem 9.4, $s_1$ also satisfies
\[ \|r(s_1)\|_2 = \min_{\alpha} \|r(x_0 + \alpha r_0)\|_2. \]
The procedure we have just described is actually one step of the minimal residual method (MR) applied to the quadratic function $f(x) = \|r(x)\|_2^2$ starting with the vector $x_0$.

In both SD and MR, we start with some initial vector $x_0 = s^{(0)}$ to obtain $s_1$, which we denote $s^{(1)}$. We next set $x_0 = s^{(1)}$ and repeat the procedure to obtain $s_1$, which we denote $s^{(2)}$, and repeat this procedure as many times as necessary to obtain the sequence $\{s^{(i)}\}$. Thus, the vectors $y$ and $z$ used to compute $s^{(i)}$ vary with $i$. We have the following error bounds for the $s^{(i)}$:

• For SD: Let $\mu_{\min}$ and $\mu_{\max}$ be, respectively, the smallest and the largest eigenvalues of $A$. (Since $A$ is Hermitian positive definite, $0 < \mu_{\min} \leq \mu_{\max}$.) Then
\[ \frac{\|s^{(m+1)} - s\|_A}{\|s^{(m)} - s\|_A} \leq \frac{\mu_{\max} - \mu_{\min}}{\mu_{\max} + \mu_{\min}} \quad \Rightarrow \quad \frac{\|s^{(m+1)} - s\|_A}{\|s^{(0)} - s\|_A} \leq \left( \frac{\mu_{\max} - \mu_{\min}}{\mu_{\max} + \mu_{\min}} \right)^{m}. \]
Clearly, the sequence $\{\|s^{(m)} - s\|_A\}_{m=0}^{\infty}$ is monotonically decreasing and the method converges.

29 Here is a short description of SD. Consider the problem of minimizing a real scalar function $f(x)$, $x \in \mathbb{R}^N$. At any point $x_0 = s^{(0)}$, $f(x)$ has the greatest rate of decrease along the direction of $h = -\nabla f(x_0)$, where $\nabla f(x_0)$ is the gradient of $f(x)$ evaluated at $x_0$. Starting at the point $x_0$, we therefore minimize $f(x)$ along the straight line $x = x_0 + \alpha h$. This amounts to minimizing the function $g(\alpha) = f(x_0 + \alpha h)$ with respect to $\alpha$. Thus, denoting by $\alpha_0$ the solution to $g'(\alpha) = h \cdot \nabla f(x_0 + \alpha h) = 0$, we have $s^{(1)} = x_0 + \alpha_0 h$ as the minimum point of $f(x_0 + \alpha h)$. (Note that $\alpha_0 > 0$ necessarily.) Now replace $x_0$ by $s^{(1)}$, and repeat the procedure we have just described as many times as needed to obtain $s^{(2)}, s^{(3)}, \ldots$. Clearly, $\{f(s^{(i)})\}$ is a decreasing sequence. It can be shown under certain conditions that $\{s^{(i)}\}$ converges to a minimum point of $f(x)$. See Luenberger [178], for example.


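• For MR: Let $A_H$, the Hermitian part of $A$, be positive definite, and denote the smallest eigenvalue of $A_H$ and the largest singular value of $A$ by $\tilde{\mu}_{\min}$ and $\sigma_{\max}$, respectively. Then
\[ \frac{\|r(s^{(m+1)})\|_2}{\|r(s^{(m)})\|_2} \leq \left( 1 - \frac{\tilde{\mu}_{\min}^2}{\sigma_{\max}^2} \right)^{1/2} \quad \Rightarrow \quad \frac{\|r(s^{(m)})\|_2}{\|r(s^{(0)})\|_2} \leq \left( 1 - \frac{\tilde{\mu}_{\min}^2}{\sigma_{\max}^2} \right)^{m/2}. \]
If $A$ is Hermitian positive definite, let $\mu_{\min}$ and $\mu_{\max}$ be, respectively, the smallest and the largest eigenvalues of $A$. Then
\[ \frac{\|r(s^{(m+1)})\|_2}{\|r(s^{(m)})\|_2} \leq \frac{\mu_{\max} - \mu_{\min}}{\mu_{\max} + \mu_{\min}} \quad \Rightarrow \quad \frac{\|r(s^{(m)})\|_2}{\|r(s^{(0)})\|_2} \leq \left( \frac{\mu_{\max} - \mu_{\min}}{\mu_{\max} + \mu_{\min}} \right)^{m}. \]
Clearly, if $A$ or $A_H$ is Hermitian positive definite, $\{\|r(s^{(m)})\|_2\}_{m=0}^{\infty}$ is a decreasing sequence and the method converges.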
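The two one-dimensional methods are short enough to write down directly. The following is a minimal sketch under stated assumptions: $M = I$, a real symmetric positive definite test matrix generated at random, and hypothetical helper names (`sd_step`, `mr_step`, `a_norm`). It repeats the SD and MR steps of Examples 1 and 2 and records the per-step contraction ratios so that they can be compared with the factor $(\mu_{\max} - \mu_{\min})/(\mu_{\max} + \mu_{\min})$ appearing in the bounds above.

```python
import numpy as np

def sd_step(A, b, x):
    """One steepest descent step: y = z = r (Example 1, M = I)."""
    r = b - A @ x
    return x + (np.vdot(r, r) / np.vdot(r, A @ r)) * r

def mr_step(A, b, x):
    """One minimal residual step: y = r, z = A r (Example 2, M = I)."""
    r = b - A @ x
    Ar = A @ r
    return x + (np.vdot(Ar, r) / np.vdot(Ar, Ar)) * r

# Hypothetical test problem: real symmetric positive definite A, known solution s.
rng = np.random.default_rng(1)
N = 8
B = rng.standard_normal((N, N))
A = B.T @ B + np.eye(N)
s = rng.standard_normal(N)
b = A @ s

mu = np.linalg.eigvalsh(A)                     # eigenvalues in ascending order
rho = (mu[-1] - mu[0]) / (mu[-1] + mu[0])      # contraction factor in the bounds

def a_norm(v):
    """The A-norm sqrt(v^* A v) for real v."""
    return np.sqrt(v @ A @ v)

x_sd = np.zeros(N)
x_mr = np.zeros(N)
sd_ratios, mr_ratios = [], []
for _ in range(20):
    x_new = sd_step(A, b, x_sd)
    sd_ratios.append(a_norm(x_new - s) / a_norm(x_sd - s))
    x_sd = x_new

    x_new = mr_step(A, b, x_mr)
    mr_ratios.append(np.linalg.norm(b - A @ x_new) / np.linalg.norm(b - A @ x_mr))
    x_mr = x_new
```

Every entry of `sd_ratios` and `mr_ratios` should be at most `rho`; when $A$ is ill conditioned, `rho` is close to 1, which is one motivation for the higher-dimensional Krylov subspace methods that follow.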

9.2 Krylov subspace methods: General discussion

9.2.1 Krylov subspaces and Krylov subspace methods

The one-dimensional methods for solving $Ax = b$ discussed above are actually special cases of two important Krylov subspace methods. Specifically, SD is a special case of the method of Arnoldi and MR is a special case of the method of generalized minimal residuals. We treat both methods and the method of Lanczos, which is yet another Krylov subspace method, in this chapter. We start with the definition of a Krylov subspace.

Definition 9.5. Let $A$ be a square matrix and $u$ be an arbitrary nonzero vector. Then the subspace $\mathcal{K}_k(A; u)$ defined via
\[ \mathcal{K}_k(A; u) = \operatorname{span}\{u, Au, A^2 u, \ldots, A^{k-1} u\} \tag{9.9} \]
is called a Krylov subspace.

We already know from Proposition 1.10 that the dimension of $\mathcal{K}_k(A; u)$ will be $k$ only if $k \leq k_0$, where $k_0$ is the degree of the minimal polynomial of $A$ with respect to $u$. If $k > k_0$, it is easy to see that $Av \in \mathcal{K}_k(A; u)$ when $v \in \mathcal{K}_k(A; u)$. That is, $\mathcal{K}_k(A; u)$ is an invariant subspace of $A$.

Definition 9.6. Let $A \in \mathbb{C}^{N \times N}$ be nonsingular, and consider again the solution of the linear system $Ax = b$ by the projection method of Definition 9.1. If the right subspace $\mathcal{Y}_k$ is the Krylov subspace $\mathcal{K}_k(A; r_0)$, where $r_0 = r(x_0) = b - Ax_0$, the resulting projection method is said to be a Krylov subspace method for the solution of $Ax = b$.

From this definition, it is clear that each Krylov subspace method for solving a linear system $Ax = b$ is characterized by its left subspace $\mathcal{Z}_k$ and the inner product. [Observe that the right subspace $\mathcal{Y}_k$ is $\mathcal{K}_k(A; r_0)$ for all Krylov subspace methods.] In addition, the dimension of $\mathcal{Y}_k$ must be $k$, which is possible only when $k \leq k_0$, where $k_0$ is the degree of the minimal polynomial of $A$ with respect to $r_0$. We now turn briefly to the three Krylov subspace methods we alluded to above.


• Method of Arnoldi or full orthogonalization method (FOM): In this method, $\mathcal{Z}_k = \mathcal{Y}_k = \mathcal{K}_k(A; r_0)$.

• Method of generalized minimal residuals (GMR): In this method, $\mathcal{Z}_k = A\mathcal{Y}_k = \mathcal{K}_k(A; Ar_0)$.

• Method of Lanczos: In this method, $\mathcal{Z}_k = \mathcal{K}_k(A^*; y)$ for some arbitrarily chosen vector $y$.

Normally, these methods are defined with the standard Euclidean inner product $\langle y, z \rangle = y^* z$, and hence with $M = I$, even though they can also be defined using a weighted inner product. Here, we assume weighted inner products for FOM and GMR; for TEA, we assume the Euclidean inner product. We will discuss these methods and the commonly used algorithms for implementing them numerically in detail in the next sections.

9.2.2 Vector extrapolation versus Krylov subspace methods

When applied to the solution of linear systems $Ax = b$ in conjunction with a fixed-point iterative scheme, the vector extrapolation methods we studied in the previous chapters are closely related to the methods of Arnoldi, GMR, and Lanczos. The following theorem from Sidi [262] shows the mathematical (but not computational) equivalence of the three Krylov subspace methods with the three vector extrapolation methods MPE, RRE, and TEA1. (Part of this theorem can also be found in a preprint by Beuneu [22].) In our proof, we make use of the following simple and easily verifiable observation concerning Krylov subspaces:
\[ \mathcal{K}_k(\alpha A + \beta I; u) = \mathcal{K}_k(A; u) \quad \text{if } \alpha \neq 0. \tag{9.10} \]

Theorem 9.7. Consider the linear system $(I - T)x = d$, and let the vector sequence $\{x_m\}$ be generated by the iterative scheme $x_{m+1} = T x_m + d$, with $x_0$ as the initial vector. Denote the vectors $s_k = s_{0,k}$ generated from $\{x_m\}$ by applying MPE, RRE, and TEA1 by $s_k^{\mathrm{MPE}}$, $s_k^{\mathrm{RRE}}$, and $s_k^{\mathrm{TEA}}$, respectively. Apply the methods of Arnoldi, GMR, and Lanczos to $(I - T)x = d$ starting with $x_0$, and denote the vectors generated by them by $s_k^{\mathrm{Arnoldi}}$, $s_k^{\mathrm{GMR}}$, and $s_k^{\mathrm{Lanczos}}$, respectively. Then
\[ s_k^{\mathrm{MPE}} = s_k^{\mathrm{Arnoldi}}, \quad s_k^{\mathrm{RRE}} = s_k^{\mathrm{GMR}}, \quad s_k^{\mathrm{TEA}} = s_k^{\mathrm{Lanczos}}. \tag{9.11} \]

The vectors $y$ used in the definitions of TEA1 and the method of Lanczos are the same.

Proof. Let $C = I - T$. Using the notation and results introduced in the previous chapters, we start by showing that the three vector extrapolation methods are true Krylov subspace methods with $\mathcal{Y}_k = \mathcal{K}_k(C; r_0)$, where $r_0 = d - Cx_0$. We already know that, for all three vector extrapolation methods, $s_k$ is of the form
\[ s_k = \sum_{i=0}^{k} \gamma_i x_i, \quad \sum_{i=0}^{k} \gamma_i = 1, \]
which, with $\xi_i = \sum_{j=i+1}^{k} \gamma_j$, $i = 0, 1, \ldots, k-1$, that is,
\[ \xi_0 = 1 - \gamma_0, \quad \xi_i = \xi_{i-1} - \gamma_i, \quad 1 \leq i \leq k-1, \]
can be written as
\[ s_k = x_0 + \sum_{i=0}^{k-1} \xi_i u_i = x_0 + \sum_{i=0}^{k-1} \xi_i T^i u_0 = x_0 + \sum_{i=0}^{k-1} \xi_i T^i r_0 \quad \Rightarrow \quad s_k \in x_0 + \mathcal{K}_k(T; r_0), \]
since $u_0 = x_1 - x_0 = (T x_0 + d) - x_0 = d - C x_0 = r_0$. In addition, by the fact that $T = I - C$ and from (9.10),
\[ \mathcal{K}_k(T; r_0) = \mathcal{K}_k(C; r_0). \tag{9.12} \]
Consequently, for MPE, RRE, and TEA1, we have
\[ s_k \in x_0 + \mathcal{K}_k(C; r_0). \tag{9.13} \]

Thus, MPE, RRE, and TEA1 will be true Krylov subspace methods for the system $Cx = d$ provided we are able to identify proper left subspaces to which the respective residual vectors $r(s_k)$ are orthogonal. We start with MPE and RRE. First, by Theorem 1.17, we have
\[ \Bigl( z, \sum_{j=0}^{k} \gamma_j u_j \Bigr) = 0 \quad \forall\, z \in \mathcal{Z}_k, \tag{9.14} \]
where
\[ \mathcal{Z}_k = \begin{cases} \operatorname{span}\{u_0, u_1, \ldots, u_{k-1}\} & \text{for MPE}, \\ \operatorname{span}\{w_0, w_1, \ldots, w_{k-1}\} & \text{for RRE}. \end{cases} \tag{9.15} \]
Let us recall that, by $\sum_{j=0}^{k} \gamma_j = 1$,
\[ \sum_{j=0}^{k} \gamma_j u_j = \sum_{j=0}^{k} \gamma_j (x_{j+1} - x_j) = \sum_{j=0}^{k} \gamma_j (T x_j + d - x_j) = d - C \sum_{j=0}^{k} \gamma_j x_j = r(s_k). \tag{9.16} \]
Then, by
\[ u_i = T^i u_0 = T^i r_0, \quad w_i = (T - I) u_i = -C T^i r_0, \quad i = 0, 1, \ldots, \]
and by (9.12), it follows that
\[ \mathcal{Z}_k = \begin{cases} \mathcal{K}_k(C; r_0) & \text{for MPE}, \\ C\mathcal{K}_k(C; r_0) & \text{for RRE}. \end{cases} \]
Consequently, (9.14) can be rewritten as
\[ (z, r(s_k)) = 0 \quad \forall\, z \in \mathcal{K}_k(C; r_0) \text{ for MPE}, \qquad (z, r(s_k)) = 0 \quad \forall\, z \in C\mathcal{K}_k(C; r_0) \text{ for RRE}. \]


As for TEA1, by Theorem 5.7, we have
\[ \Bigl\langle y, \sum_{j=0}^{k} \gamma_j u_{i+j} \Bigr\rangle = 0, \quad i = 0, 1, \ldots, k-1, \tag{9.17} \]
which, recalling that $u_{i+j} = T^i u_j$, can be rewritten as
\[ \Bigl\langle (T^*)^i y, \sum_{j=0}^{k} \gamma_j u_j \Bigr\rangle = 0, \quad i = 0, 1, \ldots, k-1. \]
Thus, recalling also (9.16), (9.17) becomes
\[ \langle z, r(s_k) \rangle = 0 \quad \forall\, z \in \mathcal{K}_k(T^*; y) = \mathcal{K}_k(C^*; y), \]
since $T^* = I - C^*$. This completes the proof.

Remarks:

1. Recall that the iterative scheme $x_{m+1} = T x_m + d$ for the solution to $Ax = b$ is obtained by splitting $A$ via $A = M - N$ and then setting $M^{-1} N = T$. (Note that the matrix $M$ in this splitting has nothing to do with the matrix $M$ used for the weighted inner product and norm.) Therefore, the linear system $(I - T)x = d$ that is the subject of Theorem 9.7 is actually $M^{-1} A x = M^{-1} b$, a preconditioned version of $Ax = b$. Thus, an iterative method for a linear system serves as a preconditioning of the system, provided $M$ is not a multiple of $I$. In view of this, the error analyses of Chapters 6 and 7 apply to FOM, GMR, and the method of Lanczos.

2. It is clear that MMPE is also a Krylov subspace method. In this case, the left subspace is $\mathcal{Z}_k = \operatorname{span}\{q_1, \ldots, q_k\}$, where $q_i$ are linearly independent constant vectors.

Numerical examples that demonstrate the mathematical equivalence of MPE, RRE, and TEA to FOM, GMR, and the method of Lanczos, respectively, are given in Gander, Golub, and Gruntz [94]. Of course, we recall that this equivalence is valid only for vector sequences generated by fixed-point iterative solution of linear systems.
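The first two equalities in (9.11) can be checked numerically on a small example. The sketch below is illustrative only and rests on several assumptions: the Euclidean inner product ($M = I$), a random real matrix $T$ scaled so that $\|T\|_2 = 0.9$, and the standard least-squares characterizations of MPE (solve for $c_0, \ldots, c_{k-1}$ with $c_k = 1$, then normalize to obtain the $\gamma_j$) and RRE (minimize $\|u_0 + \sum_i \xi_i w_i\|_2$). The Krylov side uses the projection formula of Theorem 9.2 with an orthonormal basis of $\mathcal{K}_k(T; r_0) = \mathcal{K}_k(C; r_0)$ obtained by a QR factorization, rather than the algorithms described later in this chapter.

```python
import numpy as np

rng = np.random.default_rng(2)
N, k = 10, 3
T = rng.standard_normal((N, N))
T *= 0.9 / np.linalg.norm(T, 2)        # ||T||_2 = 0.9, so x_{m+1} = T x_m + d converges
d = rng.standard_normal(N)
C = np.eye(N) - T                      # the system is C x = d

# Generate x_0, ..., x_{k+1} and the differences u_j = x_{j+1} - x_j, j = 0, ..., k.
x = [np.zeros(N)]
for _ in range(k + 1):
    x.append(T @ x[-1] + d)
X = np.column_stack(x)
U = np.diff(X, axis=1)

# MPE: least-squares solution of sum_{i<k} c_i u_i = -u_k, c_k = 1, gamma = c / sum(c).
c, *_ = np.linalg.lstsq(U[:, :k], -U[:, k], rcond=None)
c = np.append(c, 1.0)
gamma = c / c.sum()
s_mpe = X[:, :k + 1] @ gamma

# RRE: s_k = x_0 + sum_{i<k} xi_i u_i with xi minimizing ||u_0 + W xi||_2, w_i = u_{i+1} - u_i.
W = U[:, 1:k + 1] - U[:, :k]
xi, *_ = np.linalg.lstsq(W, -U[:, 0], rcond=None)
s_rre = x[0] + U[:, :k] @ xi

# Krylov side: orthonormal basis Q of K_k(T; r0) = K_k(C; r0), then Theorem 9.2 with M = I.
r0 = d - C @ x[0]                      # equals u_0
Q, _ = np.linalg.qr(U[:, :k])          # u_i = T^i r0 span the Krylov subspace
s_fom = x[0] + Q @ np.linalg.solve(Q.T @ C @ Q, Q.T @ r0)          # left subspace = right subspace
s_gmr = x[0] + Q @ np.linalg.lstsq(C @ Q, r0, rcond=None)[0]       # left subspace = C * right subspace
```

Up to rounding, `s_mpe` should agree with `s_fom` and `s_rre` with `s_gmr`. The two sides never share intermediate quantities: the extrapolation side sees only the vectors $x_m$, while the Krylov side uses products with $C$.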

9.2.3 Error analysis for special cases

We now go back to the problem treated in Theorem 9.3 by letting $\mathcal{Y}_k = \mathcal{K}_k(A; r_0)$ there and assuming that $A$ is a Hermitian positive definite matrix. In view of Definition 9.6, we have the following simple but important observations.

Lemma 9.8. Let $\mathcal{Y}_k = \mathcal{K}_k(A; r_0)$, $r_0 = r(x_0)$, and consider $x \in x_0 + \mathcal{Y}_k$. Then the following statements are true:

1. $x$ can be expressed in the form
\[ x = x_0 + \sum_{i=0}^{k-1} \alpha_i A^i r_0 = x_0 + q(A) r_0, \quad q(z) = \sum_{i=0}^{k-1} \alpha_i z^i. \tag{9.18} \]

2. Consequently, we also have
\[ s - x = p(A)(s - x_0) \quad \text{and} \quad r(x) = p(A) r_0, \quad p(z) = 1 - z q(z). \tag{9.19} \]
Thus,
\[ p \in \tilde{\mathcal{P}}_k, \quad \tilde{\mathcal{P}}_k = \Bigl\{ p(z) = \sum_{i=0}^{k} c_i z^i : p(0) = 1 \Bigr\}. \tag{9.20} \]

Proof. That (9.18) holds is obvious. The rest follows from the fact that $r_0 = b - Ax_0 = A(s - x_0)$. We leave the details to the reader.

Note that, for arbitrary $x \in x_0 + \mathcal{Y}_k$, the polynomial $q(z)$ in Lemma 9.8 is also arbitrary. So is the polynomial $p(z)$, except for the constraint that $p(0) = 1$. The following lemma will be used in the proof of our main result.

Lemma 9.9. Let $G$ be Hermitian positive definite, $B$ be arbitrary, and $G, B \in \mathbb{C}^{s \times s}$. Then $\|B\|_G = \|G^{1/2} B G^{-1/2}\|_2$, and thus we have the following:

1. $\|B\|_G = \|B\|_2$ if $B$ is diagonalizable and commutes with $G$. In this case, $B$ is necessarily normal.

2. $\|B\|_G = \|G^{1/2} B G^{-1/2}\|_2 \leq \sqrt{\kappa_2(G)}\, \|B\|_2$ if $B$ does not commute with $G$.

Proof. By definition,
\[ \|B\|_G^2 = \max_{x \neq 0} \left( \frac{\|Bx\|_G}{\|x\|_G} \right)^2 = \max_{x \neq 0} \frac{(Bx)^* G (Bx)}{x^* G x} = \max_{x \neq 0} \frac{x^* B^* G B x}{x^* G x}. \]
Since $G$ is Hermitian positive definite, $G = V \Lambda V^*$, where $V$ is unitary and $\Lambda = \operatorname{diag}(\mu_1, \ldots, \mu_s)$, the $\mu_i$ being the eigenvalues of $G$. Of course, $\mu_i$ are all real and positive. Let us now define $G^c = V \Lambda^c V^*$, where $\Lambda^c = \operatorname{diag}(\mu_1^c, \ldots, \mu_s^c)$ and $c$ is real, whether positive or negative. Then $G^c$ is also Hermitian positive definite and $G^c G^{-c} = I$. Thus, with $y = G^{1/2} x$, we have
\[ x^* G x = \|y\|_2^2 \quad \text{and} \quad x^* B^* G B x = \|(G^{1/2} B G^{-1/2}) y\|_2^2. \]
Hence,
\[ \frac{(Bx)^* G (Bx)}{x^* G x} = \left( \frac{\|(G^{1/2} B G^{-1/2}) y\|_2}{\|y\|_2} \right)^2. \]
In addition, because $G$ is nonsingular, for every $x$, there is a unique $y$, and vice versa. Consequently,
\[ \|B\|_G = \max_{x \neq 0} \frac{\|Bx\|_G}{\|x\|_G} = \max_{y \neq 0} \frac{\|(G^{1/2} B G^{-1/2}) y\|_2}{\|y\|_2} = \|G^{1/2} B G^{-1/2}\|_2. \]


Part 1 follows from the fact that, under the given conditions, $G$ and $B$ can be diagonalized simultaneously³⁰ by the unitary matrix $V$ and that $V$ diagonalizes $G^{\pm 1/2}$ as well, so that $G^{\pm 1/2}$ commutes with $B$ too. Part 2 is immediate, and we leave it to the reader.

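Theorem 9.10 is the main result of this section.

Theorem 9.10. Let $A$ be a Hermitian positive definite matrix, and let $s$ be the solution of $Ax = b$. Let $\mathcal{Y}_k = \mathcal{K}_k(A; r_0)$, where $r_0 = b - Ax_0$ for some $x_0$, and let $s_k$ be the approximate solution defined via the minimization problem
\[ \|s - s_k\|_G = \min_{x \in x_0 + \mathcal{Y}_k} \|s - x\|_G, \quad \|c\|_G = \sqrt{c^* G c}, \tag{9.21} \]
where $G$ is Hermitian positive definite. Denote the smallest and largest eigenvalues of $A$ by $\mu_{\min}$ and $\mu_{\max}$, respectively, and set $\kappa = \mu_{\max}/\mu_{\min}$. (Note that all eigenvalues of $A$ are real and positive.) Then
\[ \frac{\|s - s_k\|_G}{\|s - x_0\|_G} \leq \frac{L}{T_k\!\left( \dfrac{\mu_{\max} + \mu_{\min}}{\mu_{\max} - \mu_{\min}} \right)} \leq 2L \left( \frac{\sqrt{\kappa} - 1}{\sqrt{\kappa} + 1} \right)^{k}, \quad \kappa = \frac{\mu_{\max}}{\mu_{\min}}, \tag{9.22} \]
where
\[ L = \begin{cases} 1 & \text{if } GA = AG, \\ \sqrt{\kappa_2(G)} & \text{otherwise}, \end{cases} \]
and $T_k(z)$ is the $k$th Chebyshev polynomial.

Proof. By (9.21) and by Lemma 9.8, we have
\[ \|s - s_k\|_G = \min_{x \in x_0 + \mathcal{K}_k(A; r_0)} \|s - x\|_G = \min_{p \in \tilde{\mathcal{P}}_k} \|p(A)(s - x_0)\|_G \leq \|p(A)(s - x_0)\|_G \quad \forall\, p \in \tilde{\mathcal{P}}_k, \tag{9.23} \]
from which
\[ \|s - s_k\|_G \leq \|p(A)\|_G\, \|s - x_0\|_G \quad \forall\, p \in \tilde{\mathcal{P}}_k. \tag{9.24} \]
Now, by Lemma 9.9, we have
\[ \|p(A)\|_G = \|p(A)\|_2 \quad \text{if } GA = AG, \qquad \|p(A)\|_G = \|G^{1/2} p(A) G^{-1/2}\|_2 \leq \sqrt{\kappa_2(G)}\, \|p(A)\|_2 \quad \text{otherwise}. \tag{9.25} \]
Consequently, with $\mu_1, \ldots, \mu_N$ as eigenvalues of $A$, which all lie in the positive interval $[\mu_{\min}, \mu_{\max}]$, we have
\[ \|p(A)\|_2 = \max_i |p(\mu_i)| \leq \max_{z \in [\mu_{\min}, \mu_{\max}]} |p(z)|. \tag{9.26} \]
Since $p(z)$ is arbitrary in (9.24), we therefore have
\[ \|s - s_k\|_G \leq L \Bigl( \min_{p \in \tilde{\mathcal{P}}_k} \max_{z \in [\mu_{\min}, \mu_{\max}]} |p(z)| \Bigr) \|s - x_0\|_G. \]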

30 Here we are using the fact that if two matrices are diagonalizable, then they are simultaneously diagonalizable if and only if they commute. See Horn and Johnson [138, Theorem 1.3.12, p. 50], for example.


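The solution to the min-max problem here is
\[ p(z) = p^*(z) = \frac{T_k\!\left( \dfrac{2z - \mu_{\max} - \mu_{\min}}{\mu_{\max} - \mu_{\min}} \right)}{T_k\!\left( \dfrac{-\mu_{\max} - \mu_{\min}}{\mu_{\max} - \mu_{\min}} \right)} \quad \text{and} \quad \max_{z \in [\mu_{\min}, \mu_{\max}]} |p^*(z)| = \frac{1}{T_k\!\left( \dfrac{\mu_{\max} + \mu_{\min}}{\mu_{\max} - \mu_{\min}} \right)}. \]
From this, the result follows.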

Theorem 9.10 can be used to make important statements about the error in $s_k$ when FOM, GMR, and the method of Lanczos are applied (using the standard Euclidean inner product and norm) to the solution of $Ax = b$ with $A$ Hermitian positive definite. In this case, FOM and GMR can be implemented by the conjugate gradient algorithm (CG) and the conjugate residual algorithm (CR), respectively. Both algorithms are extremely efficient because they require a fixed amount of computer memory and very few arithmetic operations. We discuss these later in this chapter. As for the method of Lanczos, by choosing $y = r_0$ in the brief description given following Definition 9.6, it is clear that it reduces to the method of Arnoldi since now $\mathcal{Z}_k = \mathcal{Y}_k$ by the fact that $A^* = A$.

By part 1 of Theorem 9.4, if we choose $G = A$ in Theorem 9.10, we have $\mathcal{Z}_k = \mathcal{Y}_k$; therefore, the $s_k$ there is that produced by FOM (implementable by CG). Thus, (9.22) becomes
\[ \frac{\|s - s_k\|_A}{\|s - x_0\|_A} = \frac{\sqrt{(s - s_k)^* A (s - s_k)}}{\sqrt{(s - x_0)^* A (s - x_0)}} \leq \frac{1}{T_k\!\left( \dfrac{\mu_{\max} + \mu_{\min}}{\mu_{\max} - \mu_{\min}} \right)} \leq 2 \left( \frac{\sqrt{\kappa} - 1}{\sqrt{\kappa} + 1} \right)^{k}. \]
By part 2 of Theorem 9.4, if we choose $G = A^2$ in Theorem 9.10, we have $\mathcal{Z}_k = A\mathcal{Y}_k$; therefore, the $s_k$ there is that produced by the method of GMR (implementable by CR). Thus, (9.22) becomes
\[ \frac{\|r(s_k)\|_2}{\|r(x_0)\|_2} = \frac{\|s - s_k\|_{A^2}}{\|s - x_0\|_{A^2}} \leq \frac{1}{T_k\!\left( \dfrac{\mu_{\max} + \mu_{\min}}{\mu_{\max} - \mu_{\min}} \right)} \leq 2 \left( \frac{\sqrt{\kappa} - 1}{\sqrt{\kappa} + 1} \right)^{k}. \]
These are the well-known and frequently cited error bounds for CG and CR.
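To get a feeling for the two bounds, they can be evaluated for given extreme eigenvalues. The following small sketch assumes $\mu_{\max} > \mu_{\min} > 0$ and uses the identity $T_k(x) = \cosh(k \operatorname{arccosh} x)$ for $x \geq 1$; the function name `chebyshev_bounds` and the sample spectrum are hypothetical.

```python
import numpy as np

def chebyshev_bounds(mu_min, mu_max, k):
    """Return (1 / T_k((mu_max+mu_min)/(mu_max-mu_min)), 2 * rho**k) with
    rho = (sqrt(kappa) - 1) / (sqrt(kappa) + 1); assumes mu_max > mu_min > 0."""
    kappa = mu_max / mu_min
    x = (mu_max + mu_min) / (mu_max - mu_min)      # argument of T_k, larger than 1
    t_k = np.cosh(k * np.arccosh(x))               # T_k(x) for x >= 1
    rho = (np.sqrt(kappa) - 1.0) / (np.sqrt(kappa) + 1.0)
    return 1.0 / t_k, 2.0 * rho**k

# Sample spectrum [1, 100], i.e., kappa = 100 (an arbitrary illustration).
bounds = {k: chebyshev_bounds(1.0, 100.0, k) for k in (1, 5, 10, 20, 40)}
for k, (sharp, simple) in bounds.items():
    print(f"k = {k:2d}: 1/T_k = {sharp:.3e}, 2*rho^k = {simple:.3e}")
```

For $k = 1$ the first bound reduces to $(\mu_{\max} - \mu_{\min})/(\mu_{\max} + \mu_{\min})$, the contraction factor of SD and MR in Section 9.1, while for larger $k$ the two bounds approach each other.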

9.3 Method of Arnoldi: Full orthogonalization method (FOM)

The method of Arnoldi [7], also known as FOM, for the solution of $Ax = b$ is a Krylov subspace method, with $\mathcal{Y}_k = \mathcal{K}_k(A; r_0) = \mathcal{Z}_k$. As we have already noted, this method was originally defined with the Euclidean inner product and norm. Essai [83] modified the definition of FOM by introducing the weighted inner product $(\cdot\,, \cdot)$ and the norm $\|\cdot\|$ induced by it, namely, $(y, z) = y^* M z$ and $\|z\| = \sqrt{(z, z)}$, $M$ being a Hermitian positive definite matrix. We present this modification here. (Note that when $M = I$ and $k = 1$, $s_1$ obtained from this method is the same as that obtained from one stage of SD, discussed in Section 9.1.) One of the consequences of Definition 9.6 is given in the following lemma.

Lemma 9.11. The residual vectors $r_k = r(s_k) = b - As_k$ resulting from the method of Arnoldi are mutually orthogonal; that is,
\[ (r_j, r_k) = 0 \quad \text{if } j \neq k. \]

Proof. First, by Lemma 9.8, $r_k \in \mathcal{K}_{k+1}(A; r_0)$. Next, $r_k$ is orthogonal to the subspace $\mathcal{K}_k(A; r_0)$; therefore, it is also orthogonal to the subspaces $\mathcal{K}_j(A; r_0)$, $j = 1, \ldots, k-1$. Therefore, $r_k$ is orthogonal to all $r_j$, $j = 0, 1, \ldots, k-1$, since $r_j \in \mathcal{K}_{j+1}(A; r_0)$.

In [7], Arnoldi suggests a very efficient and elegant algorithm for implementing FOM, which generates an orthonormal basis for $\mathcal{K}_k(A; r_0)$. This basis is obtained by using an interesting procedure that is analogous to the Gram–Schmidt process; we will call this procedure the Arnoldi–Gram–Schmidt process. In this procedure, we first normalize $r_0$ to obtain $q_1$, the first vector of the orthonormal basis. The second vector in this basis, $q_2$, is obtained by orthonormalizing $Aq_1$ (hence $Ar_0$, as in GS) against $q_1$. Following that, the third vector, $q_3$, is obtained by orthonormalizing $Aq_2$ (instead of $A^2 r_0$) against $q_1$ and $q_2$, and so on, as long as $k$ is less than or equal to the degree of the minimal polynomial of $A$ with respect to $r_0$. Thus, the $q_i$ are mutually orthonormal; that is, $(q_i, q_j) = \delta_{ij}$. Of course, the process can be carried out in a way that is analogous to MGS, which is our choice below. (Another way of constructing the orthonormal basis of Arnoldi when $M = I$ is by using Householder reflections, as proposed by Walker [336]. We do not treat this here, however.)

Here are the steps of this process as it is carried out in the spirit of MGS. Recall that the inner product we use here and the norm induced by it are $(y, z) = y^* M z$ and $\|z\| = \sqrt{z^* M z}$.

1. Set $\beta = \|r_0\|$ and $q_1 = r_0/\beta$.
2. For $j = 1, \ldots, k$ do
      Set $a_{j+1}^{(1)} = Aq_j$.
      For $i = 1, \ldots, j$ do
         Compute $h_{ij} = (q_i, a_{j+1}^{(i)})$ and compute $a_{j+1}^{(i+1)} = a_{j+1}^{(i)} - h_{ij} q_i$.
      end do ($i$)
      Compute $h_{j+1,j} = \|a_{j+1}^{(j+1)}\|$ and set $q_{j+1} = a_{j+1}^{(j+1)}/h_{j+1,j}$.
   end do ($j$)

Note that, with $q_1, \ldots, q_j$ already determined, $h_{ij} q_i$ is the projection of $Aq_j$ along $q_i$, $i \leq j$. Clearly, $a_{j+1}^{(j+1)} = Aq_j - \sum_{i=1}^{j} h_{ij} q_i$, and hence
\[ Aq_j = \sum_{i=1}^{j+1} h_{ij} q_i, \quad h_{ij} = (q_i, Aq_j) \quad \forall\, i, j. \tag{9.27} \]
From this, we also have the following useful result.


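Lemma 9.12. Define the unitary matrices $Q_j \in \mathbb{C}^{N \times j}$ and the upper Hessenberg matrices $\bar{H}_j \in \mathbb{C}^{(j+1) \times j}$ and $H_j \in \mathbb{C}^{j \times j}$ as $Q_j = [\, q_1 \,|\, \cdots \,|\, q_j \,]$ and
\[ \bar{H}_j = \begin{bmatrix} h_{11} & h_{12} & \cdots & \cdots & h_{1j} \\ h_{21} & h_{22} & \cdots & \cdots & h_{2j} \\ & \ddots & \ddots & & \vdots \\ & & \ddots & \ddots & \vdots \\ & & & \ddots & h_{jj} \\ & & & & h_{j+1,j} \end{bmatrix}, \tag{9.28} \]
\[ H_j = \begin{bmatrix} h_{11} & h_{12} & \cdots & \cdots & h_{1j} \\ h_{21} & h_{22} & \cdots & \cdots & h_{2j} \\ & \ddots & \ddots & & \vdots \\ & & \ddots & \ddots & \vdots \\ & & & h_{j,j-1} & h_{jj} \end{bmatrix}. \tag{9.29} \]
Then
\[ AQ_k = Q_{k+1} \bar{H}_k = Q_k H_k + h_{k+1,k}\, q_{k+1} e_k^T, \tag{9.30} \]
where $e_j$ is the $j$th standard basis vector in $\mathbb{C}^k$, as usual. In addition, since $(q_i, q_j) = q_i^* M q_j = \delta_{ij}$, we have
\[ Q_k^* M A Q_k = H_k. \tag{9.31} \]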

Proof. The proof of (9.30) follows from (9.27). The proof of (9.31) is achieved by multiplying (9.30) on the left by $Q_k^* M$ and invoking $Q_k^* M Q_k = I_k$ and the fact that $Q_k^* M q_{k+1} = 0$ (since all the columns of $Q_k$, namely, the vectors $q_1, \ldots, q_k$, are orthogonal to $q_{k+1}$ in the inner product $(\cdot\,, \cdot)$).
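The Arnoldi–Gram–Schmidt process with the weighted inner product translates almost line by line into code. The following is a minimal sketch under stated assumptions: dense NumPy arrays, no breakdown (all $h_{j+1,j} \neq 0$), and hypothetical names (`arnoldi_mgs`, `H_bar` for the $(k+1) \times k$ Hessenberg matrix). It returns $Q_{k+1}$, the $(k+1) \times k$ Hessenberg matrix, and $\beta$, so that the relations (9.30) and (9.31) can be checked on random data.

```python
import numpy as np

def arnoldi_mgs(A, r0, k, M=None):
    """Sketch of the Arnoldi-Gram-Schmidt process (MGS form) in the
    inner product (y, z) = y^* M z. Assumes no breakdown occurs."""
    N = A.shape[0]
    M = np.eye(N) if M is None else M
    inner = lambda y, z: np.vdot(y, M @ z)          # (y, z) = y^* M z
    norm = lambda z: np.sqrt(inner(z, z).real)      # induced norm
    dtype = np.result_type(A.dtype, r0.dtype, np.float64)
    Q = np.zeros((N, k + 1), dtype=dtype)           # columns q_1, ..., q_{k+1}
    H_bar = np.zeros((k + 1, k), dtype=dtype)       # (k+1) x k upper Hessenberg
    beta = norm(r0)
    Q[:, 0] = r0 / beta
    for j in range(k):
        a = A @ Q[:, j]                             # a_{j+1}^{(1)} = A q_j
        for i in range(j + 1):
            H_bar[i, j] = inner(Q[:, i], a)         # h_ij = (q_i, a_{j+1}^{(i)})
            a = a - H_bar[i, j] * Q[:, i]
        H_bar[j + 1, j] = norm(a)                   # h_{j+1,j}
        Q[:, j + 1] = a / H_bar[j + 1, j]
    return Q, H_bar, beta

# Hypothetical test data.
rng = np.random.default_rng(3)
N, k = 8, 4
A = rng.standard_normal((N, N))
B = rng.standard_normal((N, N))
M = B.T @ B + np.eye(N)                             # Hermitian positive definite
r0 = rng.standard_normal(N)

Q, H_bar, beta = arnoldi_mgs(A, r0, k, M)
Q_k, H_k = Q[:, :k], H_bar[:k, :]
lhs_930 = A @ Q_k                                   # left-hand side of (9.30)
rhs_930 = Q @ H_bar                                 # Q_{k+1} times the (k+1) x k matrix
gram = Q.conj().T @ M @ Q                           # should be the identity
```

Note that the columns of $Q_k$ produced this way are orthonormal with respect to $(\cdot\,, \cdot)$, so `gram` is the identity even though `Q.T @ Q` is not unless $M = I$.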

With the help of the matrices $Q_k$ and $H_k$, we can express the approximation $s_k$ obtained from the method of Arnoldi in very simple terms. First, by the fact that $\mathcal{Y}_k = \mathcal{Z}_k = \mathcal{K}_k(A; r_0)$, we can let $Y = Z = Q_k$ in Theorem 9.2. This results in $s_k = x_0 + Q_k (Q_k^* M A Q_k)^{-1} Q_k^* M r_0$, which, by the fact that $r_0 = \beta q_1$, by (9.31), and by $Q_k^* M q_1 = e_1$, becomes

\[ s_k = x_0 + \beta Q_k H_k^{-1} e_1, \tag{9.32} \]

provided, of course, that $H_k$ is nonsingular. (Note that $H_k$ can be singular in certain cases. It is nonsingular when $A$ has a positive definite Hermitian part and $M = I$, for example.) From this, we also see that $s_k$ can be computed as follows: Following the Arnoldi–Gram–Schmidt process that produces $Q_k$ and $H_k$, we solve for $\eta$ the $k \times k$ linear system

\[ H_k \eta = \beta e_1, \qquad \eta = [\eta_1, \ldots, \eta_k]^T, \tag{9.33} \]

and compute $s_k$ via

\[ s_k = x_0 + Q_k \eta = x_0 + \sum_{i=1}^{k} \eta_i q_i. \tag{9.34} \]

The residual vector $r_k = r(s_k)$ and its norm can be determined in terms of the already computed quantities without having to actually compute $s_k$ or $r_k$. This can be achieved as follows: Using (9.34), we have $r_k = r_0 - AQ_k\eta = \beta q_1 - AQ_k\eta$.


By (9.30) and (9.33), we have

\[ AQ_k\eta = Q_k(H_k\eta) + h_{k+1,k}\, q_{k+1}(e_k^T\eta) = \beta Q_k e_1 + h_{k+1,k}\eta_k q_{k+1} = \beta q_1 + h_{k+1,k}\eta_k q_{k+1}. \]

Consequently,

\[ r_k = -h_{k+1,k}\,\eta_k\, q_{k+1}. \tag{9.35} \]

Invoking also $\beta = \|r_0\|$, we finally have

\[ \|r_k\| = (h_{k+1,k}|\eta_k|/\beta)\, \|r_0\|. \tag{9.36} \]

Thus, the quantity $h_{k+1,k}|\eta_k|/\beta$ is actually the ratio $\|r_k\|/\|r_0\|$, whose size can be used as a criterion for deciding whether to increase $k$ without first computing $s_k$; this saves computing time.

It is clear that, as in the QR factorization of an $N \times k$ matrix, to compute $Q_k$ via the Arnoldi–Gram–Schmidt orthogonalization process, we need to store $k + 1$ vectors in $\mathbb{C}^N$ throughout the process; $Aq_j$ and all $a_{j+1}^{(i)}$ occupy the same storage locations. The amount of storage needed for the matrices $\underline{H}_k$ and $H_k$ is negligible when $N \gg k$. Since the storage requirements for computing $s_k$ increase with $k$, we cannot increase $k$ indefinitely. One way of avoiding this problem is by applying FOM in cycles, just like vector extrapolation methods. Within the context of Krylov subspace methods, this is called restarting. We choose a positive integer $m$, and, starting with some initial vector $x_0$, we compute $s_m$ by FOM. We set $x_0 = s_m$ and apply FOM again to obtain a new $s_m$, repeating these steps as many times as needed. This procedure is called the restarted Arnoldi method and is denoted FOM($m$).
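The computation of $s_k$ via (9.33) and (9.34), together with the residual estimate (9.36), can be sketched as follows. This is only an illustration under stated assumptions: real data, $M = I$, no breakdown in the Arnoldi process, and a nonsingular $H_k$. The function name `fom`, the helpers, and the test data are hypothetical; the small system (9.33) is solved here by plain Gaussian elimination with partial pivoting, which is a choice of this sketch.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(A, x):
    return [dot(row, x) for row in A]

def fom(A, b, x0, k):
    """k-step FOM with the Euclidean inner product; returns (s_k, estimate of ||r_k||)."""
    r0 = [bi - yi for bi, yi in zip(b, matvec(A, x0))]
    beta = math.sqrt(dot(r0, r0))
    Q = [[x / beta for x in r0]]
    H = [[0.0] * k for _ in range(k + 1)]
    for j in range(k):                               # Arnoldi-Gram-Schmidt process
        a = matvec(A, Q[j])
        for i in range(j + 1):
            H[i][j] = dot(Q[i], a)
            a = [ai - H[i][j] * qi for ai, qi in zip(a, Q[i])]
        H[j + 1][j] = math.sqrt(dot(a, a))           # assumed nonzero
        Q.append([ai / H[j + 1][j] for ai in a])
    # Solve H_k eta = beta e_1, see (9.33), on the augmented k x (k+1) array T.
    T = [H[i][:] + [beta if i == 0 else 0.0] for i in range(k)]
    for c in range(k):
        piv = max(range(c, k), key=lambda i: abs(T[i][c]))
        T[c], T[piv] = T[piv], T[c]
        for i in range(c + 1, k):
            f = T[i][c] / T[c][c]
            T[i] = [ti - f * tc for ti, tc in zip(T[i], T[c])]
    eta = [0.0] * k
    for i in reversed(range(k)):
        eta[i] = (T[i][k] - sum(T[i][j] * eta[j] for j in range(i + 1, k))) / T[i][i]
    s = [x + sum(eta[i] * Q[i][n] for i in range(k)) for n, x in enumerate(x0)]
    return s, H[k][k - 1] * abs(eta[k - 1])          # (9.34) and (9.36)

# Hypothetical test data; A has a positive definite symmetric part.
A = [[4.0, 1.0, 0.0], [2.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]
s2, estimate = fom(A, b, [0.0, 0.0, 0.0], 2)
```

The second returned value is obtained without forming $r_k$; it should agree with the norm of `b - A s2` computed directly, up to roundoff.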

9.4 Method of generalized minimal residuals (GMR)

GMR for the solution of $Ax = b$ is also a Krylov subspace method, in which $\mathcal{Y}_k = \mathcal{K}_k(A; r_0)$ and $\mathcal{Z}_k = A\mathcal{Y}_k$. As we have already noted, like FOM, GMR was originally defined with the Euclidean inner product and norm. Essai [83] modified the definition of GMR by introducing the weighted inner product $(\cdot\,,\cdot)$ and the norm $\|\cdot\|$ induced by it, namely, $(y, z) = y^* M z$ and $\|z\| = \sqrt{(z, z)}$, $M$ being a Hermitian positive definite matrix. (Note that when $k = 1$, $s_1$ obtained from this method is the same as that obtained from one stage of MR, which was discussed in Section 9.1.) One of the consequences of Definition 9.6 and Theorem 9.4 is given in the following lemma, whose proof is similar to that of Lemma 9.11.

Lemma 9.13. The residual vectors $r_k = r(s_k) = b - As_k$ associated with the vector $s_k$ resulting from GMR are orthogonal in the sense that

\[ (Ar_j, r_k) = 0, \qquad j = 0, 1, \ldots, k-1, \]

and also satisfy

\[ \|r_k\| = \min_{x \in x_0 + \mathcal{K}_k(A; r_0)} \|r(x)\|. \]

This method, with M = I , has been considered by Axelsson [11]; Young and Jea [357]; Eisenstat, Elman, and Schultz [78]; and Saad and Schultz [241], who proposed


different algorithms for implementing it. The algorithm of [78] is called the method of generalized conjugate residuals (GCR), that of [357] is called ORTHODIR, and that of [241] is known as GMRES. The special case of k = 1 of this method was considered earlier by Vinsome [334] and is known as ORTHOMIN(1). Of these, GMRES by Saad and Schultz seems to be the most efficient computationally, and we present its details here.

9.4.1 GMRES algorithm

The first part of the algorithm computes an orthonormal basis for $\mathcal{K}_k(A; r_0)$ using the Arnoldi–Gram–Schmidt process, exactly as in the method of Arnoldi. Thus, if $x \in x_0 + \mathcal{K}_k(A; r_0)$, then

\[ x = x_0 + \sum_{i=1}^{k} \eta_i q_i = x_0 + Q_k \eta, \qquad \eta = [\eta_1, \ldots, \eta_k]^T. \tag{9.37} \]

Therefore,

\[ r(x) = r_0 - AQ_k\eta, \tag{9.38} \]

which, by (9.30), becomes

\[ r(x) = r_0 - Q_{k+1}\underline{H}_k\eta. \tag{9.39} \]

Invoking $r_0 = \beta q_1 = \beta Q_{k+1} e_1$, where $e_j$ is the $j$th standard basis vector in $\mathbb{C}^{k+1}$, as usual, (9.39) can be rewritten as

\[ r(x) = Q_{k+1}(\beta e_1 - \underline{H}_k\eta), \tag{9.40} \]

and since $Q_j^* M Q_j = I_j$ for all $j$, giving $\|Q_{k+1}y\|^2 = (Q_{k+1}y)^* M (Q_{k+1}y) = y^*(Q_{k+1}^* M Q_{k+1})y = y^*y = \|y\|_2^2$, (9.40) gives

\[ \|r(x)\| = \|\beta e_1 - \underline{H}_k\eta\|_2. \tag{9.41} \]

Consequently, the vector $s_k$ given by GMR is

\[ s_k = x_0 + \sum_{i=1}^{k} \eta_i q_i = x_0 + Q_k\eta, \qquad \eta = [\eta_1, \ldots, \eta_k]^T, \tag{9.42} \]

$\eta$ being the solution to the linear least-squares problem

\[ \|r_k\| = \|r(s_k)\| = \min_{\eta} \|\beta e_1 - \underline{H}_k\eta\|_2. \tag{9.43} \]

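This problem can be solved by using the QR factorization described in Section 2.3. Define $\widehat{H} = [\,\underline{H}_k \,|\, \beta e_1\,]$; clearly, $\widehat{H} \in \mathbb{C}^{(k+1)\times(k+1)}$ is an upper Hessenberg matrix. Let $\widehat{H} = \widehat{Q}\widehat{R}$ be the QR factorization of $\widehat{H}$. Here, both $\widehat{Q}$ and $\widehat{R}$ are in $\mathbb{C}^{(k+1)\times(k+1)}$, $\widehat{Q}$ is unitary in the sense that $\widehat{Q}^*\widehat{Q} = I_{k+1}$, and $\widehat{R}$ is upper triangular with positive diagonal elements. If we write $\widehat{R}$ in the form

\[ \widehat{R} = \begin{bmatrix} R & f \\ 0^T & \phi \end{bmatrix}, \]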


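then it is clear that $R$ is $k \times k$ and upper triangular with positive diagonal elements, $\phi > 0$, and $f$ is a $k$-dimensional vector. Thus,

\[ \underline{H}_k = \widehat{Q}\begin{bmatrix} R \\ 0^T \end{bmatrix} \quad\text{and}\quad \beta e_1 = \widehat{Q}\begin{bmatrix} f \\ \phi \end{bmatrix} \quad\Rightarrow\quad \underline{H}_k\eta = \widehat{Q}\begin{bmatrix} R\eta \\ 0 \end{bmatrix}. \]

Consequently,

\[ \beta e_1 - \underline{H}_k\eta = \widehat{Q}\begin{bmatrix} f - R\eta \\ \phi \end{bmatrix}, \]

and hence

\[ \|\beta e_1 - \underline{H}_k\eta\|_2 = \left\| \begin{bmatrix} f - R\eta \\ \phi \end{bmatrix} \right\|_2 = \left( \|f - R\eta\|_2^2 + \phi^2 \right)^{1/2}. \]

As a result, $\eta$ in (9.42) is the solution to the least-squares problem

\[ \|r_k\| = \min_{\eta} \left( \|f - R\eta\|_2^2 + \phi^2 \right)^{1/2}. \tag{9.44} \]

Since $R$ is a $k \times k$ nonsingular matrix, the minimum is achieved when $\eta$ is the solution to the system $R\eta = f$. As a result, we also have

\[ \|r_k\| = \phi = (\phi/\beta)\, \|r_0\|. \tag{9.45} \]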

Thus, the quantity $\phi/\beta$ is actually the ratio $\|r_k\|/\|r_0\|$, whose size can be used as a criterion for deciding whether to increase $k$ without first computing $s_k$; this saves computing time.

It is clear that the storage requirements of GMRES are almost the same as those of FOM. In addition, exactly as in FOM, the storage requirements for computing $s_k$ increase with $k$; therefore, we cannot increase $k$ indefinitely. One way of avoiding this problem is by applying GMRES in cycles, that is, using restarting. We choose a positive integer $m$, and, starting with some initial vector $x_0$, we compute $s_m$ by GMRES. We set $x_0 = s_m$, apply GMRES again to obtain a new $s_m$, and repeat these steps as many times as needed. This procedure is called restarted GMRES and is denoted GMRES($m$).
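The following minimal sketch illustrates one cycle of GMRES for real data with $M = I$. Note that, instead of the QR factorization of Section 2.3, it triangularizes the augmented matrix $[\,\underline{H}_k \,|\, \beta e_1\,]$ with Givens rotations; this is a substitution made for brevity in this sketch, and it leads to the same $R$, $f$, and $\phi$ up to signs. The function name `gmres`, the helpers, and the test data are hypothetical, and breakdown of the Arnoldi process is assumed not to occur.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(A, x):
    return [dot(row, x) for row in A]

def gmres(A, b, x0, k):
    """k-step GMRES (Euclidean inner product); returns (s_k, ||r_k||)."""
    r0 = [bi - yi for bi, yi in zip(b, matvec(A, x0))]
    beta = math.sqrt(dot(r0, r0))
    Q = [[x / beta for x in r0]]
    H = [[0.0] * k for _ in range(k + 1)]
    for j in range(k):                               # Arnoldi-Gram-Schmidt process
        a = matvec(A, Q[j])
        for i in range(j + 1):
            H[i][j] = dot(Q[i], a)
            a = [ai - H[i][j] * qi for ai, qi in zip(a, Q[i])]
        H[j + 1][j] = math.sqrt(dot(a, a))           # assumed nonzero
        Q.append([ai / H[j + 1][j] for ai in a])
    # Apply Givens rotations to the (k+1) x k matrix and to g = beta e_1.
    R = [row[:] for row in H]
    g = [beta] + [0.0] * k
    for j in range(k):
        d = math.hypot(R[j][j], R[j + 1][j])
        c, s = R[j][j] / d, R[j + 1][j] / d
        for col in range(j, k):
            u, v = R[j][col], R[j + 1][col]
            R[j][col], R[j + 1][col] = c * u + s * v, -s * u + c * v
        g[j], g[j + 1] = c * g[j], -s * g[j]         # g[j+1] was zero before this rotation
    # Back substitution for R eta = f, with f = g[0:k]; |g[k]| plays the role of phi.
    eta = [0.0] * k
    for i in reversed(range(k)):
        eta[i] = (g[i] - sum(R[i][j] * eta[j] for j in range(i + 1, k))) / R[i][i]
    s_k = [x + sum(eta[i] * Q[i][n] for i in range(k)) for n, x in enumerate(x0)]
    return s_k, abs(g[k])

# Hypothetical test data.
A = [[4.0, 1.0, 0.0], [2.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]
s2, phi = gmres(A, b, [0.0, 0.0, 0.0], 2)
```

A restarted variant GMRES($m$) would simply call `gmres(A, b, x, m)` repeatedly, each time passing the returned approximation as the new initial vector.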

9.5 FOM and GMR are related

In Theorems 3.1 and 3.2 of Chapter 3, we showed that MPE and RRE are closely related in the sense that they perform well simultaneously and poorly simultaneously, and that when MPE is not defined, RRE stagnates. We have also shown in Theorem 9.7 of the present chapter that, when applied in conjunction with a fixed-point iterative scheme for a linear system, MPE and RRE are mathematically equivalent to FOM and GMR, respectively. In view of these, we restate Theorems 3.1 and 3.2 in terms of FOM and GMR. The norm $\|\cdot\|$ used for defining all four methods is the same weighted norm.

Theorem 9.14. Let $s_k$ be the approximations produced by FOM or GMR to the solution of $Ax = b$ with the initial vector $x_0$, and denote $r_k = r(s_k)$.

1. If $s_k^{\mathrm{FOM}}$ does not exist, then

\[ r_k^{\mathrm{GMR}} = r_{k-1}^{\mathrm{GMR}}. \tag{9.46} \]

Consequently, we also have that

\[ s_k^{\mathrm{GMR}} = s_{k-1}^{\mathrm{GMR}}. \tag{9.47} \]

2. Conversely, if (9.47) holds, then $s_k^{\mathrm{FOM}}$ does not exist.


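Theorem 9.15. Let $s_k$ be the approximations produced by FOM or GMR to the solution of $Ax = b$ with the initial vector $x_0$, and denote $r_k = r(s_k)$. If $s_k^{\mathrm{FOM}}$ exists, then

\[ \frac{1}{\|r_k^{\mathrm{GMR}}\|^2} = \frac{1}{\|r_{k-1}^{\mathrm{GMR}}\|^2} + \frac{1}{\|r_k^{\mathrm{FOM}}\|^2} \tag{9.48} \]

and

\[ \frac{r_k^{\mathrm{GMR}}}{\|r_k^{\mathrm{GMR}}\|^2} = \frac{r_{k-1}^{\mathrm{GMR}}}{\|r_{k-1}^{\mathrm{GMR}}\|^2} + \frac{r_k^{\mathrm{FOM}}}{\|r_k^{\mathrm{FOM}}\|^2}. \tag{9.49} \]

Consequently, we also have

\[ \frac{s_k^{\mathrm{GMR}}}{\|r_k^{\mathrm{GMR}}\|^2} = \frac{s_{k-1}^{\mathrm{GMR}}}{\|r_{k-1}^{\mathrm{GMR}}\|^2} + \frac{s_k^{\mathrm{FOM}}}{\|r_k^{\mathrm{FOM}}\|^2}. \tag{9.50} \]

In addition,

\[ \|r_k^{\mathrm{GMR}}\| < \|r_{k-1}^{\mathrm{GMR}}\|. \tag{9.51} \]

The following facts, following from (9.48), are restatements of (3.39) and (3.40):

\[ \|r_k^{\mathrm{FOM}}\| = \frac{\|r_k^{\mathrm{GMR}}\|}{\sqrt{1 - \left(\|r_k^{\mathrm{GMR}}\|/\|r_{k-1}^{\mathrm{GMR}}\|\right)^2}}, \tag{9.52} \]

\[ \frac{1}{\|r_k^{\mathrm{GMR}}\|^2} = \sum_{i \in S_k} \frac{1}{\|r_i^{\mathrm{FOM}}\|^2}, \qquad S_k = \{0 \le i \le k : s_i^{\mathrm{FOM}} \text{ exists}\}. \tag{9.53} \]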

These results were given originally by Weiss [340] and Brown [50]. This topic has been analyzed further in the papers by Gutknecht [121, 122], Weiss [341], Zhou and Walker [360], Walker [337], Cullum and Greenbaum [62], and Eiermann and Ernst [75] by using weighted inner products and norms induced by them. In the next paragraph, we restate what we said concerning MPE and RRE in Chapter 3.

Theorem 9.15 [especially (9.52)] implies that the convergence behaviors of FOM and GMR are interrelated in the following sense: FOM and GMR either converge well simultaneously or perform poorly simultaneously. Letting $\phi_k^{\mathrm{FOM}} = \|r_k^{\mathrm{FOM}}\|$ and $\phi_k^{\mathrm{GMR}} = \|r_k^{\mathrm{GMR}}\|$, and recalling that $\phi_k^{\mathrm{GMR}}/\phi_{k-1}^{\mathrm{GMR}} \le 1$ for all $k$, we have the following: (i) When $\phi_k^{\mathrm{GMR}}/\phi_{k-1}^{\mathrm{GMR}}$ is significantly smaller than one, which means that GMR is performing well, $\phi_k^{\mathrm{FOM}}$ is close to $\phi_k^{\mathrm{GMR}}$, that is, FOM is performing well too, and (ii) when $\phi_k^{\mathrm{FOM}}$ is increasing, that is, FOM is performing poorly, $\phi_k^{\mathrm{GMR}}/\phi_{k-1}^{\mathrm{GMR}}$ is approaching one, that is, GMR is performing poorly too. Thus, when the graph of $\phi_k^{\mathrm{FOM}}$ has a peak for $\tilde{k}_1 \le k \le \tilde{k}_2$, then the graph of $\phi_k^{\mathrm{GMR}}$ has a plateau for $\tilde{k}_1 \le k \le \tilde{k}_2$. This is known as the peak-plateau phenomenon in the context of Krylov subspace methods for linear systems.
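The relation (9.52) can be made concrete with a short numerical sketch. The function name `fom_norms_from_gmr` and the sample residual norms below are hypothetical; the function simply evaluates (9.52) and returns infinity where the GMR residual norm stagnates, that is, where the FOM approximation does not exist.

```python
import math

def fom_norms_from_gmr(gmr_norms):
    """Map [||r_0||, ||r_1^GMR||, ...] to the FOM residual norms via (9.52)."""
    out = [gmr_norms[0]]                       # s_0 = x_0 for both methods
    for prev, cur in zip(gmr_norms, gmr_norms[1:]):
        ratio = cur / prev
        if ratio < 1.0:
            out.append(cur / math.sqrt(1.0 - ratio * ratio))
        else:
            out.append(math.inf)               # stagnation: FOM approximation does not exist
    return out

# Hypothetical GMR residual norms with a plateau between k = 1 and k = 2.
fom_norms = fom_norms_from_gmr([1.0, 0.6, 0.59, 0.1])
```

In this made-up example the GMR norms barely move from 0.6 to 0.59, and the corresponding FOM norm jumps from 0.75 to more than 3, which is the peak that accompanies the plateau; once GMR drops sharply to 0.1, the FOM norm is again close to it.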

9.6 Recursive algorithms for FOM and GMR with A Hermitian positive definite: A unified treatment of conjugate gradients and conjugate residuals

9.6.1 Introduction

In view of the fact that both FOM and GMRES require an increasing amount of storage, we ask ourselves whether it is possible to design algorithms different from the


ones described above by which we can implement FOM and GMR (with no restarting) with a fixed amount of storage. It turns out that this is possible for both methods when $A$ is Hermitian positive definite. We discuss one approach to this issue in this section.

As a side remark, we note that when $A$ is Hermitian, an interesting simplification takes place in the matrix $H_k$ produced by the Arnoldi–Gram–Schmidt process if the vectors $q_i$ are constructed to be mutually orthogonal with respect to the standard Euclidean inner product, namely, $M = I$ and $q_i^* q_j = \delta_{ij}$. In this case, $H_k = Q_k^* A Q_k$ is also Hermitian because $A$ is Hermitian. By the fact that $H_k$ is already upper Hessenberg, it follows that it is tridiagonal as well. In addition, it is real, since its elements are all real: For each $j$, (i) $h_{jj} = q_j^* A q_j$ is real because $A$ is Hermitian, (ii) $h_{j+1,j}$ is real positive by the way we construct $q_{j+1}$, (iii) $h_{j,j+1}$ is real positive since $h_{j,j+1} = \overline{h_{j+1,j}} = h_{j+1,j}$, and (iv) the rest of the $h_{ij}$ are zero. This means that to compute $q_{j+1}$, we need only $q_{j-1}$ and $q_j$; we actually have the three-term recursion relation

\[ Aq_j = h_{j+1,j}\, q_{j+1} + h_{jj}\, q_j + h_{j-1,j}\, q_{j-1}. \tag{9.54} \]

Finally, if $A$ is positive definite, so is $H_k$, and $h_{jj} = q_j^* A q_j > 0$, in addition.

When $A$ is Hermitian positive definite, the vectors $s_k$ defined by FOM and GMR can be computed economically via recursion relations of fixed length independent of $k$, just as with $q_k$, provided the weighted inner product and norm used to define FOM and GMR are replaced by the standard Euclidean inner product and norm. These implementations of FOM (CG) and of GMR (CR) require very little storage that is fixed independently of $k$. Assuming in the remainder of this section that $A$ is Hermitian positive definite, we provide a unified derivation of the relevant algorithms next. As usual, $s$ is the solution to the system $Ax = b$, and we let $r(x) = b - Ax = A(s - x)$, and $r_k = r(s_k)$ for short. To unify the derivations of these algorithms, we also define the inner product $(\cdot\,,\cdot)$ via

\[ (y, z) = y^* K z, \qquad K = \begin{cases} I & \text{for FOM}, \\ A & \text{for GMR}. \end{cases} \tag{9.55} \]

Because $A^* = A$ and because $KA = AK$, we thus have

\[ (y, Az) = (Ay, z) \quad \text{for both FOM and GMR}. \tag{9.56} \]

We will make use of this last property of (· , ·) below.31

9.6.2 Three-term recursions

We now develop three-term recursion relations for the $s_k$ in a way that resembles the development of three-term recursion relations for orthogonal polynomials. Let us return to Sections 9.3 and 9.4 and replace the weighted inner product and norm there by the standard Euclidean inner product and norm. Then, the orthogonality results of Lemmas 9.11 and 9.13 for FOM and GMR become

\[ (r_j^{\mathrm{FOM}})^* (r_k^{\mathrm{FOM}}) = 0 \quad\text{and}\quad (Ar_j^{\mathrm{GMR}})^* (r_k^{\mathrm{GMR}}) = 0 \quad \text{if } j < k. \]

31 Note that (· , ·) is the standard Euclidean inner product for FOM; therefore, it is a true inner product for all A. It is a true inner product for GMR provided A is Hermitian positive definite; otherwise, it is not.


These can be rewritten in terms of the inner product defined in (9.55) in a unified manner as

\[ (r_j, r_k) = r_j^* K r_k = 0 \quad \text{if } j < k. \tag{9.57} \]

Before we continue, we note that (i) $H_k$ is nonsingular, hence $s_k^{\mathrm{FOM}}$ exists for all $k < k_0$ and (ii) $s_k^{\mathrm{GMR}}$ exists unconditionally for all $k < k_0$, where $k_0$ is the degree of the minimal polynomial of $A$ with respect to $r_0$. Therefore, the orthogonality relations we have just stated are valid for all $k < k_0$.

Lemma 9.16. Let $k_0$ be the degree of the minimal polynomial of $A$ with respect to $r_0$. Then we have the following:

1. $\dim \mathcal{K}_k(A; r_0) = k$ for $1 \le k \le k_0$.
2. $r_k \neq 0$ for $0 \le k < k_0$ and $r_k \in \mathcal{K}_{k+1}(A; r_0)$ but $r_k \notin \mathcal{K}_k(A; r_0)$. Thus, $r_k = p_k(A) r_0$ for some $p_k \in \widetilde{\mathcal{P}}_k$; hence $\deg p_k = k$. Consequently, $\mathcal{K}_k(A; r_0) = \operatorname{span}\{r_0, r_1, \ldots, r_{k-1}\}$ for $1 \le k \le k_0$, and $r_0, r_1, \ldots, r_{k_0-1}$ are linearly independent.
3. In addition, $r_{k_0} = 0$, and hence $s_{k_0} = s$.

Proof. Let us denote $\mathcal{K}_k(A; r_0)$ by $\mathcal{K}_k$ for short and recall the definition of the minimal polynomial of $A$ with respect to $r_0$.

To prove part 1, we first realize that $\dim \mathcal{K}_k \le k$. Let us assume that $\dim \mathcal{K}_k < k$. This implies that the set $\{r_0, Ar_0, \ldots, A^{k-1} r_0\}$ is linearly dependent. Therefore, there exist scalars $\alpha_i$, not all zero, such that $\sum_{i=0}^{k-1} \alpha_i A^i r_0 = 0$. This implies that the polynomial $\sum_{i=0}^{k-1} \alpha_i z^i$ annihilates $r_0$ and has degree strictly less than $k_0$ for $k \le k_0$, which is impossible. Therefore, $\dim \mathcal{K}_k = k$.

To prove part 2, we proceed as follows: (i) Suppose $r_k = 0$ for some $k < k_0$. By Lemma 9.8, this implies that $p_k(A) r_0 = 0$ for some polynomial $p_k(z)$ in $\widetilde{\mathcal{P}}_k$; thus, $p_k(z)$ is of degree at most $k$, such that $p_k(0) = 1$, and hence $p_k(z) \not\equiv 0$. This implies that $p_k(z)$ annihilates $r_0$ and has degree at most $k < k_0$, which is impossible. Therefore, $r_k \neq 0$. (ii) Suppose now that $\deg p_k < k$; that is, $r_k \in \mathcal{K}_k$. But $(z, r_k) = 0$ for all $z \in \mathcal{K}_k$. Thus, $r_k$ lies in $\mathcal{K}_k$ and is orthogonal to every vector in $\mathcal{K}_k$, which is possible only when $r_k = 0$, which we have shown to be impossible. Therefore, $\deg p_k = k$ and, therefore, $r_k \in \mathcal{K}_{k+1}$, but $r_k \notin \mathcal{K}_k$. (iii) Letting $p_k(z) = \sum_{i=0}^{k} \alpha_{ki} z^i$, we have $\alpha_{kk} \neq 0$ for all $k < k_0$. Therefore, $r_0, r_1, \ldots, r_{k_0-1}$ are linearly independent and $r_0, r_1, \ldots, r_{k-1}$ span $\mathcal{K}_k$ for $1 \le k \le k_0$.

To prove part 3, we note that, by the definition of the minimal polynomial of $A$ with respect to $r_0$, $A^{k_0} r_0$ is a linear combination of the $A^i r_0$, $0 \le i \le k_0 - 1$, which implies that $r_{k_0}$ is in $\mathcal{K}_{k_0}(A; r_0)$. Since $r_{k_0}$ is also orthogonal to $\mathcal{K}_{k_0}(A; r_0)$, it must satisfy $r_{k_0} = 0$. Consequently, $s_{k_0} = s$ too.

Theorem 9.17. For $k < k_0$, $k_0$ being the degree of the minimal polynomial of $A$ with respect to $r_0$, the approximations $s_k$ and their residual vectors $r_k$ satisfy the three-term recursion relations

\[ s_{k+1} = \rho_k r_k + \mu_k s_k + \nu_k s_{k-1}, \qquad r_{k+1} = -\rho_k A r_k + \mu_k r_k + \nu_k r_{k-1}, \qquad k = 0, 1, \ldots, \tag{9.58} \]


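where

\[ \rho_0 = \frac{1}{\alpha_0}, \qquad \mu_0 = 1, \qquad \nu_0 = 0, \tag{9.59} \]

\[ \rho_k = \frac{1}{\alpha_k + \beta_k}, \qquad \mu_k = \frac{\alpha_k}{\alpha_k + \beta_k}, \qquad \nu_k = \frac{\beta_k}{\alpha_k + \beta_k}, \qquad k = 1, 2, \ldots, \tag{9.60} \]

with

\[ \alpha_k = \frac{(r_k, Ar_k)}{(r_k, r_k)}, \qquad \beta_k = \frac{(r_{k-1}, Ar_k)}{(r_{k-1}, r_{k-1})}. \tag{9.61} \]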

In addition, the scalars $\rho_k$, $\mu_k$, and $\nu_k$ are all real. Note also that $\mu_k + \nu_k = 1$.

Remark: From (9.59)–(9.61), it is clear that $\rho_k$, $\mu_k$, and $\nu_k$ in (9.58) are determined by $r_k$ and $r_{k-1}$ only. Thus, (9.58) is a true three-term recursion relation that enables us to produce $s_{k+1}$ from $s_k$ and $s_{k-1}$ only. As such, it can be used as an efficient algorithm for implementing FOM and GMR when $A$ is Hermitian positive definite, with the inner product as in (9.55), just as with the conjugate gradients and conjugate residuals that we treat in the next subsection.

Proof. We begin with the proof of the recursion among the $r_k$. Since $Ar_k \in \mathcal{K}_{k+2}(A; r_0) = \operatorname{span}\{r_0, r_1, \ldots, r_{k+1}\}$ by Lemma 9.16, we have that

\[ Ar_k = \sum_{i=0}^{k+1} \varepsilon_{ik} r_i. \]

By (9.57), we have

\[ \varepsilon_{ik} = \frac{(r_i, Ar_k)}{(r_i, r_i)}, \qquad i = 0, 1, \ldots, k+1. \]

Now $(r_i, Ar_k) = (Ar_i, r_k)$ by (9.56), and $Ar_i \in \mathcal{K}_{i+2}(A; r_0)$. Therefore, $(r_i, Ar_k) = 0$ for $i + 1 < k$, implying that $\varepsilon_{ik} = 0$ for $i = 0, 1, \ldots, k-2$. Thus, we have

\[ Ar_k = \varepsilon_{k+1,k}\, r_{k+1} + \varepsilon_{kk}\, r_k + \varepsilon_{k-1,k}\, r_{k-1}, \]

from which we conclude that

\[ \rho_k = -\frac{1}{\varepsilon_{k+1,k}}, \qquad \mu_k = -\frac{\varepsilon_{kk}}{\varepsilon_{k+1,k}}, \qquad \nu_k = -\frac{\varepsilon_{k-1,k}}{\varepsilon_{k+1,k}}. \]

Now, $\varepsilon_{k-1,k}$ and $\varepsilon_{kk}$ can be computed since $r_{k-1}$ and $r_k$ are already available. It seems, however, that $\varepsilon_{k+1,k}$ cannot be computed because $r_{k+1}$ is not available yet. We can determine $\varepsilon_{k+1,k}$ by recalling that $r_k = p_k(A) r_0$, where $p_k \in \widetilde{\mathcal{P}}_k$, with $\widetilde{\mathcal{P}}_k$ as in Lemma 9.8. We thus have $z p_k(z) = \varepsilon_{k+1,k}\, p_{k+1}(z) + \varepsilon_{kk}\, p_k(z) + \varepsilon_{k-1,k}\, p_{k-1}(z)$, which, by letting $z = 0$ and invoking the fact that $p_k(0) = 1$ for all $k$, gives

\[ \varepsilon_{k+1,k} + \varepsilon_{kk} + \varepsilon_{k-1,k} = 0 \quad\Rightarrow\quad \varepsilon_{k+1,k} = -(\varepsilon_{kk} + \varepsilon_{k-1,k}). \tag{9.62} \]

The recursion for the $r_k$ now follows with $\alpha_k = \varepsilon_{kk}$ and $\beta_k = \varepsilon_{k-1,k}$.


As for the claim that $\rho_k$, $\mu_k$, and $\nu_k$ are real, we first observe that it is sufficient if we show that the $\varepsilon_{ik}$ are all real. Next, since $\varepsilon_{kk} = (r_k, Ar_k)/(r_k, r_k) > 0$ for all $k$ as long as $r_k \neq 0$, it is sufficient if we show that $\varepsilon_{k-1,k}$ and $\varepsilon_{k+1,k}$ are real. We do this part of the proof by induction on $k$. For $k = 0$, we have $Ar_0 = \varepsilon_{00} r_0 + \varepsilon_{10} r_1$ and hence $\varepsilon_{00} + \varepsilon_{10} = 0$, with $\varepsilon_{00} = (r_0, Ar_0)/(r_0, r_0) > 0$ and $\varepsilon_{10} = -\varepsilon_{00} < 0$. Thus, the assertion is proved for $k = 0$. Assume now that $\varepsilon_{i,k-1}$, $0 \le i \le k$, are all real. Then

\[ \varepsilon_{k-1,k} = \frac{(r_{k-1}, Ar_k)}{(r_{k-1}, r_{k-1})} = \frac{(Ar_{k-1}, r_k)}{(r_{k-1}, r_{k-1})} = \varepsilon_{k,k-1} \frac{(r_k, r_k)}{(r_{k-1}, r_{k-1})}. \]

Since $\varepsilon_{k,k-1}$ is real by the induction hypothesis, then so is $\varepsilon_{k-1,k}$. By (9.62), $\varepsilon_{k+1,k}$ is real since $\varepsilon_{kk}$ and $\varepsilon_{k-1,k}$ are. This completes the induction.

Finally, the recursion for the $s_k$ is obtained by first letting $r_j = b - As_j$, $j = k-1, k, k+1$, on both sides of the recursion for the $r_k$ given in (9.58) and then by multiplying both sides by $A^{-1}$.

The three-term recursion relation for FOM was given originally in Engeli et al. [81]. Here we have given an independent development for both FOM (with K = I ) and GMR (with K = A).
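The recursion (9.58) with the coefficients (9.59)–(9.61) can be sketched as follows for a real symmetric positive definite matrix. The function name `three_term`, the flag `K_is_A` (selecting $K = I$ or $K = A$ in (9.55)), and the test data are hypothetical. Setting $\beta_0 = 0$ in the first step reproduces (9.59), which is a convenience of this sketch.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(A, x):
    return [dot(row, x) for row in A]

def three_term(A, b, x0, steps, K_is_A=False):
    """Run `steps` steps of (9.58); K = I (FOM) if K_is_A is False, K = A (GMR) otherwise."""
    def inner(y, z):                         # (y, z) = y^T K z, real case of (9.55)
        return dot(y, matvec(A, z)) if K_is_A else dot(y, z)

    s = list(x0)
    r = [bi - yi for bi, yi in zip(b, matvec(A, s))]
    s_old, r_old = s, r                      # placeholders; nu_0 = 0 makes them irrelevant
    for k in range(steps):                   # assumes r_k != 0, i.e. steps <= k_0
        Ar = matvec(A, r)
        alpha = inner(r, Ar) / inner(r, r)                                  # (9.61)
        beta = 0.0 if k == 0 else inner(r_old, Ar) / inner(r_old, r_old)    # (9.61)
        rho = 1.0 / (alpha + beta)                                          # (9.59), (9.60)
        mu = alpha / (alpha + beta)
        nu = beta / (alpha + beta)
        s_new = [rho * ri + mu * si + nu * so for ri, si, so in zip(r, s, s_old)]
        r_new = [-rho * ai + mu * ri + nu * ro for ai, ri, ro in zip(Ar, r, r_old)]
        s_old, r_old, s, r = s, r, s_new, r_new
    return s, r

# Hypothetical symmetric positive definite test problem; its solution is (2/9, 1/9, 13/9).
A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]
s_fom, r_fom = three_term(A, b, [0.0, 0.0, 0.0], 3)
s_gmr, r_gmr = three_term(A, b, [0.0, 0.0, 0.0], 3, K_is_A=True)
```

Since $k_0 = 3$ for this data, three steps should reproduce the solution up to roundoff for both choices of $K$, in agreement with part 3 of Lemma 9.16.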

9.6.3 CG and CR algorithms

We now give the derivations of the CG algorithm of Hestenes and Stiefel [133] and the CR algorithm of Stiefel [312] for the solution of the linear system $Ax = b$, where $A$ is Hermitian positive definite. We do this within a unified framework, just as we have done for the three-term recursion relations. Of course, the inner product $(\cdot\,,\cdot)$ here too is that defined in (9.55).

Definition 9.18. Two nonzero vectors $u$ and $v$ are said to be A-conjugate or A-orthogonal with respect to the inner product $(\cdot\,,\cdot)$ if they satisfy $(u, Av) = 0$.

Note that, by the fact that $A$ and $A^2$ are both Hermitian positive definite, A-conjugacy of $u$ and $v$ with respect to $(\cdot\,,\cdot)$ means regular orthogonality of $u$ and $v$ in appropriate weighted inner products, as follows:

\[ (u, Av) = \begin{cases} u^* A v & \text{for FOM}, \\ u^* A^2 v & \text{for GMR}. \end{cases} \tag{9.63} \]

In view of this, it is clear that we can always construct $N$ vectors in $\mathbb{C}^N$ that are mutually A-conjugate; of course, these vectors form a basis for $\mathbb{C}^N$. Let us denote these vectors $p_0, p_1, \ldots, p_{N-1}$. We then have the following simple result, which can be proved easily.

Lemma 9.19. Let the nonzero vectors $p_0, p_1, \ldots, p_{N-1}$ be mutually A-conjugate with respect to the inner product $(\cdot\,,\cdot)$. Then they form a linearly independent set.

Let us now consider the orthogonal projection method of solving the linear system $Ax = b$ for which $\mathcal{Y}_k = \operatorname{span}\{p_0, p_1, \ldots, p_{k-1}\} = \mathcal{Z}_k$ and the inner product is $(\cdot\,,\cdot)$. The approximation $s_k$ generated by this method is then

\[ s_k = x_0 + \sum_{i=0}^{k-1} \beta_{ki}\, p_i, \tag{9.64} \]


satisfying

\[ (p_j, r_k) = 0, \qquad j = 0, 1, \ldots, k-1, \qquad r_k = r(s_k). \tag{9.65} \]

Invoking also the A-conjugacy of the $p_i$ and noting that

\[ r_k = b - As_k = r_0 - \sum_{i=0}^{k-1} \beta_{ki}\, Ap_i, \tag{9.66} \]

we obtain

\[ \beta_{ki} = \frac{(p_i, r_0)}{(p_i, Ap_i)}, \qquad i = 0, 1, \ldots, k-1. \tag{9.67} \]

One important point to realize here is that the $\beta_{ki}$ are the same for all $k$, which means that

\[ s_{k+1} = s_k + \alpha_k p_k, \qquad \alpha_k = \frac{(p_k, r_0)}{(p_k, Ap_k)}, \qquad k = 0, 1, \ldots, \qquad s_0 = x_0. \tag{9.68} \]

This implies that $s_{k+1}$ can be constructed from $s_k$ using a two-term recursion provided we have a set of A-conjugate vectors, which can also be computed via a short recursion. This is precisely achieved by CG and CR, which compute, again, by two-term recursions, A-conjugate bases $\{p_0, p_1, \ldots, p_{k-1}\}$ for the Krylov subspaces $\mathcal{K}_k(A; r_0)$ in FOM and GMR. Because the vectors $p_i$ are A-conjugate, we will call the method described in (9.64)–(9.68) a conjugate direction method. Based on these developments, the $s_k$ and the $p_k$ can be computed simultaneously, and with two-term recursions, by the following algorithm.

General unified algorithm

1. Pick an initial vector $x_0 = s_0$, compute $r_0 = b - Ax_0$, and set $p_0 = r_0$.
2. For $k = 0, 1, \ldots$ do
   (a) Set
   \[ s_{k+1} = s_k + \alpha_k p_k, \quad\text{where}\quad \alpha_k = \frac{(p_k, r_k)}{(p_k, Ap_k)}. \tag{9.69} \]
   (b) Compute
   \[ r_{k+1} = r_k - \alpha_k Ap_k. \tag{9.70} \]
   (c) Set
   \[ p_{k+1} = r_{k+1} + \beta_k p_k, \quad\text{where}\quad \beta_k = -\frac{(p_k, Ar_{k+1})}{(p_k, Ap_k)}. \tag{9.71} \]
   end do ($k$)

Remarks:

1. The expression for $r_{k+1}$ in (9.70) follows from that for $s_{k+1}$ in (9.69) via $r_{k+1} = b - As_{k+1}$.
2. $\alpha_k$ is determined by requiring $(p_k, r_{k+1}) = 0$, while $\beta_k$ is determined by requiring $(p_k, Ap_{k+1}) = 0$.
3. The expressions for $\alpha_k$ and $\beta_k$ given in (9.69) and (9.71), respectively, are not the most suitable for computing in floating-point arithmetic. Computationally better expressions for $\alpha_k$ and $\beta_k$ are given in (9.75) of Theorem 9.20, which follows. This theorem verifies the validity of the general algorithm given above.


Theorem 9.20. The general unified algorithm described above is a conjugate direction method. If $r_k \neq 0$, then $p_k \neq 0$ too, and the following are true:

1. \[ \operatorname{span}\{r_0, r_1, \ldots, r_k\} = \operatorname{span}\{r_0, Ar_0, \ldots, A^k r_0\} = \mathcal{K}_{k+1}(A; r_0). \tag{9.72} \]
2. \[ \operatorname{span}\{p_0, p_1, \ldots, p_k\} = \operatorname{span}\{r_0, Ar_0, \ldots, A^k r_0\} = \mathcal{K}_{k+1}(A; r_0). \tag{9.73} \]
3. \[ (r_i, r_k) = 0, \qquad (p_i, r_k) = 0, \qquad (p_i, Ap_k) = 0, \qquad 0 \le i \le k-1. \tag{9.74} \]
4. \[ \alpha_k = \frac{(r_k, r_k)}{(p_k, Ap_k)}, \qquad \beta_k = \frac{(r_{k+1}, r_{k+1})}{(r_k, r_k)}, \tag{9.75} \]
   with $\alpha_k > 0$ if $r_k \neq 0$ and $\beta_k > 0$ if $r_{k+1} \neq 0$.

As soon as $r_k = 0$, we have $p_k = 0$ as well, and $s_k = s$.

Proof. We prove (9.72)–(9.74) simultaneously by induction on $k$. As before, we let $\mathcal{K}_k$ stand for $\mathcal{K}_k(A; r_0)$ for short. Clearly, the assertions in (9.72)–(9.74) are true for $k = 0, 1$. Assuming that they are true for arbitrary $k$, let us prove that they are true for $k + 1$.

First, by the induction hypothesis and by (9.72) and (9.73), $r_k \in \mathcal{K}_{k+1}$ and $Ap_k \in \mathcal{K}_{k+2}$. Thus, by (9.70), $r_{k+1} \in \mathcal{K}_{k+2}$. As a result, by (9.71), $p_{k+1} \in \mathcal{K}_{k+2}$ as well.

Next, $(p_i, r_{k+1}) = 0$, $0 \le i \le k$, by (9.70): (i) for $0 \le i < k$ because $(p_i, r_k) = 0$ and $(p_i, Ap_k) = 0$ by the induction hypothesis on (9.74), and (ii) for $i = k$ with $\alpha_k$ as given in (9.69). Therefore, $(r_i, r_{k+1}) = 0$, $0 \le i \le k$, as well, since $r_i$ is a linear combination of $p_0, p_1, \ldots, p_i$ by (9.72) and (9.73). Now, if $r_{k+1} \neq 0$, then $r_{k+1} \in \mathcal{K}_{k+1}(A; r_0)$ is impossible because this, along with $(p_i, r_{k+1}) = 0$, $0 \le i \le k$, would imply that $r_{k+1} = 0$, which is a contradiction.

Next, $(p_i, Ap_{k+1}) = 0$, $0 \le i \le k$, by (9.71): (i) for $0 \le i < k$ because $(p_i, Ar_{k+1}) = (Ap_i, r_{k+1}) = 0$, which we have already proved, and because $(p_i, Ap_k) = 0$ by the induction hypothesis on (9.74), and (ii) for $i = k$ with $\beta_k$ as given in (9.71). Since, when $r_{k+1} \neq 0$, $r_{k+1}$ is not in $\mathcal{K}_{k+1}(A; r_0)$, neither is $p_{k+1}$ by (9.71); hence $p_{k+1} \neq 0$ as well. This completes the proof of (9.72)–(9.74).

We now go on to the proof of (9.75). The expression for $\alpha_k$ follows from (9.71), with $k$ replaced by $k - 1$, and from (9.74): $(p_k, r_k) = (r_k + \beta_{k-1} p_{k-1}, r_k) = (r_k, r_k)$. Substituting this into (9.69), the expression in (9.75) is obtained. As for the expression for $\beta_k$, we proceed as follows:

\[ (p_k, Ar_{k+1}) = (Ap_k, r_{k+1}) = \frac{1}{\alpha_k}(r_k - r_{k+1}, r_{k+1}) = -\frac{1}{\alpha_k}(r_{k+1}, r_{k+1}). \]

Here, we have used (9.70) to replace $Ap_k$ by $(r_k - r_{k+1})/\alpha_k$ in the second equality, the last equality resulting from $(r_k, r_{k+1}) = 0$, which we have already proved. Substituting this and the new expression for $\alpha_k$ into (9.71), we obtain the expression for $\beta_k$ given in (9.75).

Finally, when $r_k = 0$, we have $s_k = s$ immediately. Replacing $k$ by $k - 1$ in (9.71), we have $p_k = r_k + \beta_{k-1} p_{k-1}$. Next, when $r_k = 0$, we also have that $\beta_{k-1} = 0$, from which $p_k = 0$ also follows.
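The general unified algorithm, with the coefficients taken in the form (9.75), can be sketched as follows for real symmetric positive definite $A$. The function name `conjugate_directions`, the flag `K_is_A`, and the test data are hypothetical. For simplicity, the inner product with $K = A$ is evaluated by an extra matrix-vector product, so this sketch does not attain the operation counts of an economical CR implementation.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(A, x):
    return [dot(row, x) for row in A]

def conjugate_directions(A, b, x0, steps, K_is_A=False):
    """Unified algorithm: K = I gives CG, K = A gives CR (real symmetric positive definite A)."""
    def inner(y, z):                         # (y, z) = y^T K z, real case of (9.55)
        return dot(y, matvec(A, z)) if K_is_A else dot(y, z)

    s = list(x0)
    r = [bi - yi for bi, yi in zip(b, matvec(A, s))]
    p = list(r)
    rr = inner(r, r)
    for _ in range(steps):
        if rr == 0.0:                        # r_k = 0 means s_k is already the solution
            break
        Ap = matvec(A, p)
        alpha = rr / inner(p, Ap)            # alpha_k from (9.75)
        s = [si + alpha * pi for si, pi in zip(s, p)]      # (9.69)
        r = [ri - alpha * ai for ri, ai in zip(r, Ap)]     # (9.70)
        rr_new = inner(r, r)
        beta = rr_new / rr                   # beta_k from (9.75)
        p = [ri + beta * pi for ri, pi in zip(r, p)]       # (9.71)
        rr = rr_new
    return s, r

# Hypothetical symmetric positive definite test problem; its solution is (2/9, 1/9, 13/9).
A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]
s_cg, r_cg = conjugate_directions(A, b, [0.0, 0.0, 0.0], 3)
s_cr, r_cr = conjugate_directions(A, b, [0.0, 0.0, 0.0], 3, K_is_A=True)
```

After one step, the CR residual should have a Euclidean norm no larger than that of the CG residual, since CR minimizes the residual norm over the same affine subspace.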


CG algorithm

Letting $K = I$, the general algorithm described above becomes the CG algorithm. For completeness, we give here the steps of CG:

1. Pick an initial vector $x_0 = s_0$, compute $r_0 = b - Ax_0$, and set $p_0 = r_0$.
2. For $k = 0, 1, \ldots$ do
   (a) Set $s_{k+1} = s_k + \alpha_k p_k$, where $\alpha_k = \dfrac{r_k^* r_k}{p_k^* A p_k}$.
   (b) Compute $r_{k+1} = r_k - \alpha_k Ap_k$.
   (c) Set $p_{k+1} = r_{k+1} + \beta_k p_k$, where $\beta_k = \dfrac{r_{k+1}^* r_{k+1}}{r_k^* r_k}$.
   end do ($k$)

Here $\alpha_k > 0$ if $r_k \neq 0$ and $\beta_k > 0$ if $r_{k+1} \neq 0$. The vectors $p_i$ and $r_i$ satisfy $p_i^* A p_j = 0$ and $r_i^* r_j = 0$ if $i \neq j$, and $s_k$ is the solution to

\[ \min_{x \in x_0 + \mathcal{K}_k(A; r_0)} \|s - x\|_A. \]

Clearly, at stage $k$ of the do loop, we need the vectors $s_k$, $r_k$, and $p_k$ from the previous step, and these are overwritten by $s_{k+1}$, $r_{k+1}$, and $p_{k+1}$, respectively. In addition, we need to compute $Ap_k$, which overwrites $Ap_{k-1}$. Thus, we need to store only four vectors at any stage. When programmed properly, the computational cost of stage $k$ is as follows: (i) one matrix-vector product, namely, $Ap_k$; (ii) three saxpy operations for computing $s_{k+1}$, $r_{k+1}$, and $p_{k+1}$; and (iii) two inner products, namely, $p_k^* A p_k$ and $r_{k+1}^* r_{k+1}$, for computing $\alpha_k$ and $\beta_k$. (Note that $r_k^* r_k$ is available from step $k - 1$.)

CR algorithm

Letting $K = A$, the general algorithm described above becomes the CR algorithm. For completeness, we give here the steps of CR:

1. Pick an initial vector $x_0 = s_0$, compute $r_0 = b - Ax_0$, and set $p_0 = r_0$.
2. For $k = 0, 1, \ldots$ do
   (a) Set $s_{k+1} = s_k + \alpha_k p_k$, where $\alpha_k = \dfrac{r_k^* A r_k}{(Ap_k)^* (Ap_k)}$.
   (b) Compute $r_{k+1} = r_k - \alpha_k Ap_k$.
   (c) Set $p_{k+1} = r_{k+1} + \beta_k p_k$, where $\beta_k = \dfrac{r_{k+1}^* A r_{k+1}}{r_k^* A r_k}$.
   end do ($k$)

Here $\alpha_k > 0$ if $r_k \neq 0$ and $\beta_k > 0$ if $r_{k+1} \neq 0$. The vectors $p_i$ and $r_i$ satisfy $(Ap_i)^* (Ap_j) = 0$ and $r_i^* A r_j = 0$ if $i \neq j$, and $s_k$ is the solution to

\[ \min_{x \in x_0 + \mathcal{K}_k(A; r_0)} \|r(x)\|_2, \qquad r(x) = b - Ax. \]


Clearly, at stage $k$ of the do loop, we need the vectors $s_k$, $r_k$, and $p_k$ from the previous step, and these are overwritten by $s_{k+1}$, $r_{k+1}$, and $p_{k+1}$, respectively. In addition, we need to compute $Ap_k$ and $Ar_{k+1}$, which overwrite $Ap_{k-1}$ and $Ar_k$, respectively. Thus, we need to store only five vectors at any stage. When programmed properly, the computational cost of stage $k$ is as follows: (i) two matrix-vector products, namely, $Ap_k$ and $Ar_{k+1}$; (ii) three saxpy operations for computing $s_{k+1}$, $r_{k+1}$, and $p_{k+1}$; and (iii) two inner products, namely, $(Ap_k)^* (Ap_k)$ and $r_{k+1}^* A r_{k+1}$, for computing $\alpha_k$ and $\beta_k$. (Note that $r_k^* A r_k$ is available from step $k - 1$.)

Error analysis for CG and CR

We have already considered the error in $s - s_k$ for both CG and CR following Theorem 9.10. For convenience, we restate the relevant results here.

Theorem 9.21. Let $\mu_{\max}$ and $\mu_{\min}$ be, respectively, the largest and the smallest eigenvalues of the matrix $A$, and let $\kappa = \mu_{\max}/\mu_{\min}$. (Of course, $\mu_{\max}$ and $\mu_{\min}$ are both positive.) Then the errors $s - s_k^{\mathrm{CG}}$ and $s - s_k^{\mathrm{CR}}$ satisfy

\[ \frac{\|s - s_k^{\mathrm{CG}}\|_A}{\|s - x_0\|_A} = \sqrt{\frac{(s - s_k^{\mathrm{CG}})^* A (s - s_k^{\mathrm{CG}})}{(s - x_0)^* A (s - x_0)}} \le \frac{1}{T_k\!\left(\dfrac{\mu_{\max} + \mu_{\min}}{\mu_{\max} - \mu_{\min}}\right)} \le 2\left(\frac{\sqrt{\kappa} - 1}{\sqrt{\kappa} + 1}\right)^k, \]

\[ \frac{\|s - s_k^{\mathrm{CR}}\|_{A^2}}{\|s - x_0\|_{A^2}} = \frac{\|r(s_k^{\mathrm{CR}})\|_2}{\|r(x_0)\|_2} \le \frac{1}{T_k\!\left(\dfrac{\mu_{\max} + \mu_{\min}}{\mu_{\max} - \mu_{\min}}\right)} \le 2\left(\frac{\sqrt{\kappa} - 1}{\sqrt{\kappa} + 1}\right)^k. \]
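The two upper bounds in Theorem 9.21 can be evaluated numerically with the following small sketch. The function name `cg_error_bounds` is hypothetical. The sketch takes $T_k$ to be the $k$th Chebyshev polynomial of the first kind and evaluates it through the identity $T_k(x) = \cosh(k \operatorname{arccosh} x)$, valid for $x \ge 1$; it assumes $\mu_{\max} > \mu_{\min} > 0$.

```python
import math

def cg_error_bounds(mu_min, mu_max, k):
    """Return (1 / T_k(x), 2 * sigma**k) for the bounds of Theorem 9.21."""
    kappa = mu_max / mu_min
    x = (mu_max + mu_min) / (mu_max - mu_min)        # argument of T_k, always > 1
    chebyshev_bound = 1.0 / math.cosh(k * math.acosh(x))
    sigma = (math.sqrt(kappa) - 1.0) / (math.sqrt(kappa) + 1.0)
    return chebyshev_bound, 2.0 * sigma ** k

# Hypothetical spectrum bounds: mu_min = 1, mu_max = 9, so kappa = 9.
cheb2, simple2 = cg_error_bounds(1.0, 9.0, 2)
```

For this data $x = 1.25$ and $T_2(x) = 2x^2 - 1 = 2.125$, so the Chebyshev bound is $1/2.125 \approx 0.47$, slightly sharper than the simpler bound $2(1/2)^2 = 0.5$.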

9.6.4 CGNE and CGNR: Conjugate gradients on normal equations

A linear system $Ax = b$ with nonsingular non-Hermitian $A \in \mathbb{C}^{N \times N}$ can be turned into another linear system $\widetilde{A}\widetilde{x} = \widetilde{b}$, with Hermitian positive definite $\widetilde{A} \in \mathbb{C}^{N \times N}$. Subsequently, its solution can be achieved by applying CG to the new system. This can be done in two ways:

1. Solve $A^* A x = A^* b$ for $x$ via CG. This solution method is called CGNR. Clearly, the new system is nothing but the system of normal equations for $Ax = b$.
2. Solve $AA^* y = b$ for $y$ via CG and set $x = A^* y$. This solution method is called CGNE. It is also known as Craig's method.

Note that both matrices $A^* A$ and $AA^*$ are Hermitian positive definite since $A$ is nonsingular. Therefore, the solutions for $x$ produced by CGNE and CGNR are nothing but the unique solution $s$ to $Ax = b$. By realizing that

\[ u^* (A^* A) v = (Au)^* (Av) \quad\text{and}\quad u^* (AA^*) v = (A^* u)^* (A^* v), \]

CG can be applied to both problems without actually forming the matrices $A^* A$ and $AA^*$. In the case of CGNE, one need not explicitly generate the approximations $y_k$ to $y$ when solving $AA^* y = b$; one can compute the approximations $s_k = A^* y_k$ to $s$ directly. Here is a unified description of CGNR and CGNE:


1. Pick an initial vector $x_0 = s_0$ and compute $r_0 = b - Ax_0$. Compute $A^* r_0$ and set $p_0 = A^* r_0$.
2. For $k = 0, 1, \ldots$ do
   (a) Set $s_{k+1} = s_k + \alpha_k p_k$, where
   \[ \alpha_k^{\mathrm{CGNR}} = \frac{\|A^* r_k\|_2^2}{\|Ap_k\|_2^2}, \qquad \alpha_k^{\mathrm{CGNE}} = \frac{\|r_k\|_2^2}{\|p_k\|_2^2}. \]
   (b) Compute $r_{k+1} = r_k - \alpha_k Ap_k$ and also $A^* r_{k+1}$.
   (c) Set $p_{k+1} = A^* r_{k+1} + \beta_k p_k$, where
   \[ \beta_k^{\mathrm{CGNR}} = \frac{\|A^* r_{k+1}\|_2^2}{\|A^* r_k\|_2^2}, \qquad \beta_k^{\mathrm{CGNE}} = \frac{\|r_{k+1}\|_2^2}{\|r_k\|_2^2}. \]
   end do ($k$)

Now, for CGNR, given an initial vector $x_0$, we have that $s_k \in x_0 + \mathcal{K}_k(A^* A; A^* r_0)$, with $r_0 = b - Ax_0$. Similarly, for CGNE, given an initial vector $y_0$, we have $y_k \in y_0 + \mathcal{K}_k(AA^*; \tilde{r}_0)$, with $\tilde{r}_0 = b - AA^* y_0$. By $x_0 = A^* y_0$, it follows that $\tilde{r}_0 = b - Ax_0 = r_0$. In addition, by $s_k = A^* y_k$, it follows that $s_k \in x_0 + A^* \mathcal{K}_k(AA^*; r_0)$, and it is easy to show that $A^* \mathcal{K}_k(AA^*; r_0) = \mathcal{K}_k(A^* A; A^* r_0)$, so that $s_k \in x_0 + \mathcal{K}_k(A^* A; A^* r_0)$, with $r_0 = b - Ax_0$, for CGNE just as for CGNR.

Note that since the condition numbers of both matrices $A^* A$ and $AA^*$ satisfy $\kappa_2(A^* A) = \kappa_2(AA^*) = [\kappa_2(A)]^2$, one may be led to conclude that this approach should be used only when $\kappa_2(A)$ is not large. For an argument that counters this conclusion, see the discussion given in Greenbaum [118, Section 7.1, pp. 105–107].

CGNE and CGNR can also be used to solve, in the least-squares sense, linear systems $Ax = b$, when $A$ is a rectangular matrix:

1. CGNR can be used to solve overdetermined linear systems $Ax = b$, where $A \in \mathbb{C}^{m \times n}$, with $m > n$ and $\operatorname{rank}(A) = n$, in the sense $\min_x \|Ax - b\|_2$. Note that the system of normal equations that results from this problem is $A^* A x = A^* b$, and $A^* A$ is Hermitian positive definite. Its solution is $s = (A^* A)^{-1} A^* b = A^+ b$, and this solution is unique.
2. CGNE can be used to solve underdetermined linear systems $Ax = b$, where $A \in \mathbb{C}^{m \times n}$, with $m < n$ and $\operatorname{rank}(A) = m$, by first solving $AA^* y = b$ for $y$ and then by letting $x = A^* y$. Note that $AA^*$ is Hermitian positive definite. Now, the linear system $Ax = b$ has infinitely many solutions since $m < n$. The solution for $x$ produced by CGNE is $s = A^* (AA^*)^{-1} b = A^+ b$; hence it is the unique solution with the smallest $l_2$-norm.
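The unified description of CGNR and CGNE given in this subsection can be sketched as follows for real matrices, for which $A^* = A^T$. The function name `cg_normal`, the `variant` argument, and the test data are hypothetical. For CGNE the initial vector is assumed to be of the form $x_0 = A^* y_0$; the zero vector used below satisfies this trivially.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(A, x):
    return [dot(row, x) for row in A]

def cg_normal(A, b, x0, steps, variant="CGNR"):
    """CG on the normal equations without forming A^T A or A A^T (real case)."""
    At = [list(col) for col in zip(*A)]              # A^T, playing the role of A*
    s = list(x0)
    r = [bi - yi for bi, yi in zip(b, matvec(A, s))]
    z = matvec(At, r)                                # A* r_k
    p = list(z)                                      # p_0 = A* r_0
    gamma = dot(z, z) if variant == "CGNR" else dot(r, r)
    for _ in range(steps):
        if gamma == 0.0:                             # exact solution already reached
            break
        Ap = matvec(A, p)
        denom = dot(Ap, Ap) if variant == "CGNR" else dot(p, p)
        alpha = gamma / denom
        s = [si + alpha * pi for si, pi in zip(s, p)]
        r = [ri - alpha * ai for ri, ai in zip(r, Ap)]
        z = matvec(At, r)
        gamma_new = dot(z, z) if variant == "CGNR" else dot(r, r)
        beta = gamma_new / gamma
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        gamma = gamma_new
    return s, r

# Hypothetical nonsymmetric, nonsingular test problem.
A = [[4.0, 1.0, 0.0], [2.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]
s_nr, _ = cg_normal(A, b, [0.0, 0.0, 0.0], 3, "CGNR")
s_ne, _ = cg_normal(A, b, [0.0, 0.0, 0.0], 3, "CGNE")
```

Because CG applied to a 3 × 3 Hermitian positive definite system terminates in at most three steps in exact arithmetic, both variants should return the solution of $Ax = b$ up to roundoff here.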

9.6.5 Variants of CG and CR

As we have seen, CG and CR can be applied very efficiently to a linear system $Ax = b$ when $A$ is Hermitian positive definite. When $A$ is nonsingular Hermitian but indefinite, they cannot be applied in a straightforward way because now products like $z^*Az$ are not necessarily positive; they can be negative and they can even vanish. This causes problems for both CG and CR. These problems are overcome by the methods SYMMLQ (analogous to CG) and MINRES (analogous to CR) by Paige and Saunders [204]. In both methods, $s_k \in x_0 + \mathcal{K}_k(A; r_0)$. The $s_k$ generated by MINRES minimize the $l_2$-norm of the residual $r(x)$ over the affine subspace $x_0 + \mathcal{K}_k(A; r_0)$, while the residuals $r_k$ of the $s_k$ generated by SYMMLQ are orthogonal to $\mathcal{K}_k(A; r_0)$; no minimization of any norm is involved in the latter case, however. In another paper [205], Paige and Saunders present an algorithm of the CG type, called LSQR, for solving arbitrary linear systems and least-squares problems. For details, see the literature.

9.7 On the existence of short recurrences for FOM and GMR

In the preceding section, we showed that FOM and GMR can both be implemented using three-term recursions involving the residual vectors when $A$ is Hermitian positive definite, thus saving lots of computing and also keeping fixed the amount of memory needed throughout the course of computing. A question that comes to mind now is whether and when it would be possible to implement these methods by short recursions (of fixed length) to achieve similar savings. This question was answered completely by Faber and Manteuffel [84], who give necessary and sufficient conditions for the existence of short recurrences. We will not go into the details of their treatment here but will present very briefly the issue of sufficient conditions. We will do this by generating the analogue of Theorem 9.17 concerning three-term recursions. For analogues of CG and CR algorithms, we refer the reader to [84] and [118, Chapter 6].

We start by recalling from Lemmas 9.11 and 9.13 that the residual vectors of FOM and GMR possess the following orthogonality properties:
\[ (r_j^{\mathrm{FOM}})^* (r_k^{\mathrm{FOM}}) = 0 \quad \text{and} \quad (A r_j^{\mathrm{GMR}})^* (r_k^{\mathrm{GMR}}) = 0 \quad \text{if } j < k. \]
These can be rewritten in a unified manner in terms of the inner product [analogous to that defined in (9.55)]$^{32}$
\[ (y, z) = y^* K z, \qquad K = \begin{cases} I & \text{for FOM}, \\ A^* & \text{for GMR}. \end{cases} \tag{9.76} \]
Thus, for both FOM and GMR,
\[ (r_j, r_k) = r_j^* K r_k = 0 \quad \text{if } j < k. \tag{9.77} \]

We wish to show that a sufficient condition for the existence of short recurrences is that the matrix $A$ be such that$^{33}$
\[ A^* = \sum_{j=0}^{m} a_j A^j; \quad a_m \neq 0, \quad \text{and} \quad (z, z) \neq 0 \ \ \forall\, z \neq 0. \tag{9.78} \]
Clearly, $A^*A = AA^*$, which implies that $KA = AK$. Thus,
\[ (y, Az) = (A^* y, z) \quad \forall\, y, z. \tag{9.79} \]

We now proceed as in Theorem 9.17. First, by the fact that $r_k \in \mathcal{K}_{k+1}(A; r_0)$ and by (9.77), we have
\[ A r_k = \sum_{i=0}^{k+1} \varepsilon_{ik} r_i, \qquad \varepsilon_{ik} = \frac{(r_i, A r_k)}{(r_i, r_i)}, \quad i = 0, 1, \ldots, k+1. \tag{9.80} \]

32 As already mentioned, $(\cdot\,, \cdot)$ is a true inner product for FOM; it is a true inner product for GMR if and only if $A$ is Hermitian positive definite.
33 The condition $(z, z) \neq 0$ for $z \neq 0$ is automatically satisfied for FOM. It will be satisfied for GMR if we impose on $A$ the additional requirement that $z^* A z \neq 0$ for all $z \neq 0$.


Now
\[ (r_i, A r_k) = (A^* r_i, r_k) = \sum_{j=0}^{m} \bar{a}_j (A^j r_i, r_k). \tag{9.81} \]
By (9.77), we have $(A^j r_i, r_k) = 0$ for $j + i < k$, implying that $(r_i, A r_k) = 0$, hence $\varepsilon_{ik} = 0$, for $0 \le i \le k - m - 1$. Thus, we have the following $(m + 2)$-term recursion relation among the $r_i$:
\[ A r_k = \sum_{i=k-m}^{k+1} \varepsilon_{ik} r_i. \tag{9.82} \]

As for the computation of the $\varepsilon_{ik}$, we note that these are computable from (9.80) for $k - m \le i \le k$ because $r_i$, $k - m \le i \le k$, are all available. Since $r_{k+1}$ is not available yet, we cannot use (9.80) to compute $\varepsilon_{k+1,k}$. For this, we proceed as in the proof of Theorem 9.17 and invoke $r_k = p_k(A) r_0$, where $p_k \in \tilde{\mathcal{P}}_k$, with $\tilde{\mathcal{P}}_k$ as in Lemma 9.8. Thus, we have
\[ z\, p_k(z) = \sum_{i=k-m}^{k+1} \varepsilon_{ik}\, p_i(z), \]
which, by letting $z = 0$ and invoking $p_i(0) = 1$ for all $i$, gives
\[ \sum_{i=k-m}^{k+1} \varepsilon_{ik} = 0 \quad \Rightarrow \quad \varepsilon_{k+1,k} = -\sum_{i=k-m}^{k} \varepsilon_{ik}. \tag{9.83} \]

We now turn to the related recursion relation among the $s_i$. Invoking first $r_i = b - A s_i$ and next (9.83) in (9.82), we obtain
\[ A r_k = -A \sum_{i=k-m}^{k+1} \varepsilon_{ik} s_i, \]
which, upon multiplying by $A^{-1}$, yields the following $(m + 2)$-term recursion relation:
\[ s_{k+1} = -\frac{1}{\varepsilon_{k+1,k}} \Bigl[ r_k + \sum_{i=k-m}^{k} \varepsilon_{ik} s_i \Bigr]. \tag{9.84} \]
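As an illustration of the polynomial condition in (9.78) (added here as a worked example, not taken from the original derivation), two familiar classes of matrices satisfy it with $m = 1$:
\[
A^* = A \ \ (\text{Hermitian}): \quad a_0 = 0,\ a_1 = 1; \qquad
A = \alpha I + S,\ \ S^* = -S,\ \ \alpha \in \mathbb{R}: \quad A^* = 2\alpha I - A,\ \ a_0 = 2\alpha,\ a_1 = -1.
\]
In both cases the $(m+2)$-term relations (9.82) to (9.84) become three-term relations:
\[
A r_k = \varepsilon_{k-1,k}\, r_{k-1} + \varepsilon_{kk}\, r_k + \varepsilon_{k+1,k}\, r_{k+1}, \qquad
\varepsilon_{k+1,k} = -(\varepsilon_{k-1,k} + \varepsilon_{kk}),
\]
\[
s_{k+1} = -\frac{1}{\varepsilon_{k+1,k}} \bigl[ r_k + \varepsilon_{k-1,k}\, s_{k-1} + \varepsilon_{kk}\, s_k \bigr].
\]
For GMR, the remaining requirement $z^*Az \neq 0$ for $z \neq 0$ holds in the Hermitian case when $A$ is definite, and in the shifted skew-Hermitian case when $\alpha \neq 0$, since $z^*Sz$ is purely imaginary and therefore $\mathrm{Re}(z^*Az) = \alpha \|z\|_2^2$.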

9.8 The method of Lanczos

The method of Lanczos [165] for the solution of $Ax = b$ is an oblique projection method that is also a Krylov subspace method, with $\mathcal{Y}_k = \mathcal{K}_k(A; r_0)$ and $\mathcal{Z}_k = \mathcal{K}_k(A^*; \tilde r_0)$, where $r_0 = b - Ax_0$, $\tilde r_0$ is arbitrary, and $M = I$, such that $\langle \tilde r_0, r_0 \rangle = \tilde r_0^* r_0 \neq 0$. [Of course, when $A$ is Hermitian and $\tilde r_0 = r_0$, we have $\mathcal{Z}_k = \mathcal{Y}_k = \mathcal{K}_k(A; r_0)$; hence, the method of Lanczos reduces to the method of Arnoldi.] In [165], Lanczos developed an algorithm by which one constructs bases for the subspaces $\mathcal{Y}_k$ and $\mathcal{Z}_k$ whose members are mutually biorthogonal. These bases are obtained by using an interesting iterative procedure that is analogous to that we studied in connection with the method of Arnoldi.


Lanczos biorthogonalization

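Given the vectors $v_1$ and $w_1$, such that $\langle w_1, v_1 \rangle = w_1^* v_1 = 1$, we would like to construct two sequences of vectors $\{v_1, v_2, \ldots\}$ and $\{w_1, w_2, \ldots\}$ such that
\[ v_k \in \mathcal{K}_k(A; v_1) \quad \text{and} \quad w_k \in \mathcal{K}_k(A^*; w_1), \quad k = 1, 2, \ldots, \tag{9.85} \]
and
\[ \langle w_i, v_j \rangle = \delta_{ij} \quad \forall\, i, j, \qquad \langle c, d \rangle = c^* d. \tag{9.86} \]
We do this in a way that is analogous to the Arnoldi–Gram–Schmidt process: We begin with the observation that
\[ A v_k = \sum_{i=1}^{k+1} \rho_{ik} v_i \quad \text{and} \quad A^* w_k = \sum_{i=1}^{k+1} \sigma_{ik} w_i. \tag{9.87} \]
Clearly,
\[ \rho_{ik} = \langle w_i, A v_k \rangle = \langle A^* w_i, v_k \rangle \quad \text{and} \quad \sigma_{ik} = \langle v_i, A^* w_k \rangle = \langle A v_i, w_k \rangle. \tag{9.88} \]
Since, by (9.85),
\[ A v_i \in \mathcal{K}_{i+1}(A; v_1) \quad \text{and} \quad A^* w_i \in \mathcal{K}_{i+1}(A^*; w_1) \]
and by (9.86),
\[ \langle w_k, y \rangle = 0 \ \ \forall\, y \in \mathcal{K}_{k-1}(A; v_1) \quad \text{and} \quad \langle z, v_k \rangle = 0 \ \ \forall\, z \in \mathcal{K}_{k-1}(A^*; w_1), \]
(9.88) gives
\[ \rho_{ik} = 0 \quad \text{and} \quad \sigma_{ik} = 0, \quad i = 1, \ldots, k - 2. \tag{9.89} \]
As a result, (9.87) becomes
\[ \begin{aligned} A v_k &= \rho_{k-1,k} v_{k-1} + \rho_{kk} v_k + \rho_{k+1,k} v_{k+1}, \\ A^* w_k &= \sigma_{k-1,k} w_{k-1} + \sigma_{kk} w_k + \sigma_{k+1,k} w_{k+1}. \end{aligned} \tag{9.90} \]
By (9.88), we also have
\[ \sigma_{ik} = \langle A v_i, w_k \rangle = \overline{\langle w_k, A v_i \rangle} = \bar{\rho}_{ki}. \tag{9.91} \]
Then, letting
\[ \alpha_k = \rho_{kk}, \quad \gamma_k = \rho_{k,k-1}, \quad \delta_k = \rho_{k-1,k}, \]
and invoking also (9.91), we can rewrite (9.90) as
\[ \begin{aligned} A v_k &= \delta_k v_{k-1} + \alpha_k v_k + \gamma_{k+1} v_{k+1}, \\ A^* w_k &= \bar{\gamma}_k w_{k-1} + \bar{\alpha}_k w_k + \bar{\delta}_{k+1} w_{k+1}. \end{aligned} \tag{9.92} \]
Here $\gamma_k$ and $\delta_k$ are available from the computation of $v_k$ and $w_k$, while $\alpha_k = \langle w_k, A v_k \rangle$ can be computed since $w_k$ and $A v_k$ are available. What remains is to determine $\gamma_{k+1}$ and $\delta_{k+1}$. We note that, because we have not determined $v_{k+1}$ and $w_{k+1}$ yet, there is some freedom in determining $\gamma_{k+1}$ and $\delta_{k+1}$. Letting
\[ \begin{aligned} \gamma_{k+1} v_{k+1} &= A v_k - \delta_k v_{k-1} - \alpha_k v_k \equiv \hat{v}_{k+1}, \\ \bar{\delta}_{k+1} w_{k+1} &= A^* w_k - \bar{\gamma}_k w_{k-1} - \bar{\alpha}_k w_k \equiv \hat{w}_{k+1}, \end{aligned} \tag{9.93} \]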


and invoking the requirement that $\langle w_{k+1}, v_{k+1} \rangle = 1$, we have
\[ \langle \hat{w}_{k+1}, \hat{v}_{k+1} \rangle = \delta_{k+1} \gamma_{k+1} \langle w_{k+1}, v_{k+1} \rangle = \delta_{k+1} \gamma_{k+1}. \tag{9.94} \]
Thus, we choose $\gamma_{k+1}$ and $\delta_{k+1}$ to satisfy
\[ \delta_{k+1} \gamma_{k+1} = \langle \hat{w}_{k+1}, \hat{v}_{k+1} \rangle \tag{9.95} \]
and let
\[ v_{k+1} = \frac{\hat{v}_{k+1}}{\gamma_{k+1}}, \qquad w_{k+1} = \frac{\hat{w}_{k+1}}{\bar{\delta}_{k+1}}. \tag{9.96} \]
Different ways of choosing $\gamma_{k+1}$ and $\delta_{k+1}$ have been suggested. Letting
\[ \mu = |\langle \hat{w}_{k+1}, \hat{v}_{k+1} \rangle|, \qquad \phi = \arg \langle \hat{w}_{k+1}, \hat{v}_{k+1} \rangle, \]
one choice of $\gamma_{k+1}$ and $\delta_{k+1}$ (due to Saad [236]) is
\[ \gamma_{k+1} = \sqrt{\mu}, \qquad \delta_{k+1} = \sqrt{\mu}\, e^{i\phi}. \]
Summarizing, here are the steps of the Lanczos biorthogonalization process:
1. Pick $v_1$ and $w_1$ such that $\langle w_1, v_1 \rangle = 1$. Also set $\gamma_1 = \delta_1 = 0$ and $v_0 = w_0 = 0$.
2. For $k = 1, 2, \ldots$ do
   Compute $\alpha_k = \langle w_k, A v_k \rangle$.
   Compute $\hat{v}_{k+1} = A v_k - \alpha_k v_k - \delta_k v_{k-1}$.
   Compute $\hat{w}_{k+1} = A^* w_k - \bar{\alpha}_k w_k - \bar{\gamma}_k w_{k-1}$.
   Choose $\delta_{k+1}, \gamma_{k+1}$ such that $\delta_{k+1} \gamma_{k+1} = \langle \hat{w}_{k+1}, \hat{v}_{k+1} \rangle$.
   Set $v_{k+1} = \hat{v}_{k+1} / \gamma_{k+1}$, $w_{k+1} = \hat{w}_{k+1} / \bar{\delta}_{k+1}$.
   end do (k)

Remark: Note that the process can be continued as long as $\hat{v}_{k+1} \neq 0$, $\hat{w}_{k+1} \neq 0$, and $\langle \hat{w}_{k+1}, \hat{v}_{k+1} \rangle \neq 0$. It terminates in two different manners:
1. If $\hat{v}_{k+1} = 0$ or $\hat{w}_{k+1} = 0$, by (9.87), the process has found an invariant subspace of $A$ or $A^*$. If $\hat{v}_{k+1} = 0$, $\mathcal{K}_k(A; v_1)$ is an invariant subspace of $A$. If $\hat{w}_{k+1} = 0$, $\mathcal{K}_k(A^*; w_1)$ is an invariant subspace of $A^*$. This situation is referred to as regular termination.
2. If $\hat{v}_{k+1} \neq 0$ and $\hat{w}_{k+1} \neq 0$, but $\langle \hat{w}_{k+1}, \hat{v}_{k+1} \rangle = 0$, vectors $v_{k+1}$ and $w_{k+1}$ satisfying $\langle w_{k+1}, v_{k+1} \rangle = 1$ do not exist. This situation is referred to as serious breakdown. Note that, at some later stage, there may exist nonzero vectors $v_{k+j}$ orthogonal to $\mathcal{K}_{k+j-1}(A^*; w_1)$ and $w_{k+j}$ orthogonal to $\mathcal{K}_{k+j-1}(A; v_1)$. Techniques that skip the stages at which the vectors $v_i$ and $w_i$ are not defined and go straight to the stages at which they are defined are known as look-ahead procedures. Since this important topic is beyond the scope of this book, we do not discuss it here. For different approaches to look-ahead procedures, we refer the reader to Parlett, Taylor, and Liu [210]; Brezinski, Redivo Zaglia, and Sadok [38, 39, 44]; Freund, Gutknecht, and Nachtigal [91]; and Nachtigal [193], for example.
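The biorthogonalization steps summarized above can be sketched as follows. This is a minimal illustration, assuming real data (so $A^* = A^T$ and all conjugations drop out) and plain Python lists; in the real case the choice $\gamma_{k+1} = \sqrt{\mu}$, $\delta_{k+1} = \sqrt{\mu}\,e^{i\phi}$ reduces to $\delta_{k+1} = \langle \hat{w}_{k+1}, \hat{v}_{k+1} \rangle / \gamma_{k+1}$. The function name `lanczos_biorth`, the breakdown threshold, and the test matrix are assumptions of this sketch.

```python
def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def lanczos_biorth(A, v1, w1, k):
    """Return V = [v_1..v_k], W = [w_1..w_k] and the coefficients of T_k.

    alpha[i] = alpha_{i+1}, gamma[i] = gamma_{i+1}, delta[i] = delta_{i+1}.
    """
    n = len(v1)
    At = [list(col) for col in zip(*A)]
    V, W = [list(v1)], [list(w1)]
    alpha = []
    gamma, delta = [0.0], [0.0]               # gamma_1 = delta_1 = 0
    v_old, w_old = [0.0] * n, [0.0] * n       # v_0 = w_0 = 0
    for j in range(k):
        v, w = V[j], W[j]
        Av, Atw = matvec(A, v), matvec(At, w)
        a = dot(w, Av)                        # alpha_k = <w_k, A v_k>
        alpha.append(a)
        if j == k - 1:
            break                             # v_{k+1}, w_{k+1} not needed here
        v_hat = [Av[i] - a * v[i] - delta[j] * v_old[i] for i in range(n)]
        w_hat = [Atw[i] - a * w[i] - gamma[j] * w_old[i] for i in range(n)]
        ip = dot(w_hat, v_hat)
        if abs(ip) < 1e-14:
            # Either regular termination or serious breakdown; not handled here.
            raise ArithmeticError("termination or serious breakdown")
        g = abs(ip) ** 0.5                    # gamma_{k+1} = sqrt(mu)
        d = ip / g                            # delta_{k+1}, so that delta*gamma = ip
        gamma.append(g)
        delta.append(d)
        v_old, w_old = v, w
        V.append([x / g for x in v_hat])
        W.append([x / d for x in w_hat])
    return V, W, alpha, gamma, delta

# Small nonsymmetric example (illustrative data).
A = [[2.0, 1.0, 0.0, 0.0],
     [0.5, 3.0, 1.0, 0.0],
     [0.0, 0.5, 4.0, 1.0],
     [0.0, 0.0, 0.5, 5.0]]
e1 = [1.0, 0.0, 0.0, 0.0]
V, W, alpha, gamma, delta = lanczos_biorth(A, e1, e1, 3)
```

With these outputs one can check numerically that $w_i^T v_j = \delta_{ij}$ and that $w_i^T A v_j$ reproduces the tridiagonal entries $\alpha_i$, $\gamma_i$, $\delta_j$. No look-ahead is attempted; the sketch simply raises an error when the inner product $\langle \hat{w}_{k+1}, \hat{v}_{k+1} \rangle$ is numerically zero.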


Let us now assume that $v_i$, $w_i$, $i = 1, \ldots, k + 1$, have been computed, and let
\[ V_k = [\, v_1 \,|\, \cdots \,|\, v_k \,], \qquad W_k = [\, w_1 \,|\, \cdots \,|\, w_k \,]. \tag{9.97} \]
From (9.92), we have
\[ A V_k = V_k T_k + \gamma_{k+1} v_{k+1} e_k^T, \tag{9.98} \]
where $T_k$ is a tridiagonal matrix given by
\[ T_k = \begin{bmatrix} \alpha_1 & \delta_2 & & & \\ \gamma_2 & \alpha_2 & \delta_3 & & \\ & \ddots & \ddots & \ddots & \\ & & \gamma_{k-1} & \alpha_{k-1} & \delta_k \\ & & & \gamma_k & \alpha_k \end{bmatrix}. \tag{9.99} \]

Since $W_k^* V_k = I_k$ and $W_k^* v_{k+1} = 0$ by the biorthogonality of the $v_i$ and $w_j$, we also have
\[ T_k = W_k^* A V_k. \tag{9.100} \]
Remark: Assuming that the biorthogonalization process does not fail for $1 \le k \le N$, we see that $V_N$ and $W_N$ are $N \times N$ matrices and that $W_N^* V_N = I_N$, which implies that $W_N^* = V_N^{-1}$. Thus, (9.100) becomes the similarity relation
\[ V_N^{-1} A V_N = T_N. \]
As a result, $T_N$ and $A$ have the same eigenvalues.

Solution of Ax = b via Lanczos biorthogonalization

In [236], Saad proposes to use Lanczos biorthogonalization to solve the linear systems $Ax = b$, where $A$ is not necessarily Hermitian, via an oblique projection method. We describe this approach here. Let $x_0$ be a given initial vector, and let $r_0 = b - Ax_0$. Let $\beta = \|r_0\|_2$ and $v_1 = r_0 / \beta$. Choose a vector $w_1$ such that $w_1^* v_1 = 1$. (Normally, $w_1 = v_1$ is the standard choice.) Generate the vectors $v_i$, $w_i$, $i = 1, \ldots, k$, and hence the subspaces
\[ \mathcal{Y}_k = \mathcal{K}_k(A; v_1) = \mathcal{K}_k(A; r_0) \quad \text{and} \quad \mathcal{Z}_k = \mathcal{K}_k(A^*; w_1), \]
as described above, assuming that all the vectors involved exist. Let $s_k \in x_0 + \mathcal{Y}_k$ such that $r(s_k) = r_k = b - A s_k$ is orthogonal to $\mathcal{Z}_k$. Then, by Theorem 9.2, with $M = I$, we have $s_k = x_0 + V_k (W_k^* A V_k)^{-1} W_k^* r_0$, which, also by invoking (9.100), $r_0 = \beta v_1$, and $W_k^* v_1 = e_1$, becomes
\[ s_k = x_0 + \beta V_k T_k^{-1} e_1. \tag{9.101} \]
Thus, $s_k$ can be computed as
\[ s_k = x_0 + \beta V_k \xi = x_0 + \beta \sum_{i=1}^{k} \xi_i v_i, \qquad \xi = [\xi_1, \ldots, \xi_k]^T, \qquad T_k \xi = e_1. \tag{9.102} \]


The residual vector $r_k$ and its $l_2$-norm can be determined in terms of the already computed quantities without having to actually compute $s_k$ or $r_k$. This can be achieved exactly as in the case of FOM. We have
\[ r_k = -\beta \gamma_{k+1} (e_k^T \xi) v_{k+1} \tag{9.103} \]
and hence
\[ \|r_k\| = |\gamma_{k+1}|\, |\xi_k|\, \|v_{k+1}\|\, \|r_0\|. \tag{9.104} \]
By (9.102), it is clear that to be able to compute $s_k$, we need all the vectors $v_1, \ldots, v_k$. Thus, as in the case of FOM and GMRES, the storage requirements of the method of Lanczos via the biorthogonalization presented above increase with $k$.

9.8.1 The biconjugate gradient algorithm (Bi-CG)

The problem of increasing storage requirements that is present in the algorithm just described can be overcome via an algorithm by Fletcher [85] that is known as the biconjugate gradient algorithm (Bi-CG). Actually, the work of Fletcher is based on the algorithm of Lanczos [164] for computing the eigenvalues of a non-Hermitian matrix. Starting with the vector $r_0 = b - Ax_0$ and another arbitrary vector $\tilde r_0$ such that $\tilde r_0^* r_0 \neq 0$, Bi-CG constructs by short recurrences two bases for the Krylov subspaces $\mathcal{K}_k(A; r_0)$ and $\mathcal{K}_k(A^*; \tilde r_0)$, $\mathrm{span}\{p_0, p_1, \ldots, p_{k-1}\}$ and $\mathrm{span}\{\tilde p_0, \tilde p_1, \ldots, \tilde p_{k-1}\}$, respectively, that are mutually $A$-biconjugate, that is, they satisfy $\tilde p_i^* A p_j = 0$ for $i \neq j$. Simultaneously, it also constructs the vectors $s_k$, again by short recurrences. In this sense, Bi-CG for arbitrary matrices is analogous to CG for Hermitian positive definite matrices. For completeness, we give here the steps of Bi-CG:
1. Pick an initial vector $x_0 = s_0$, compute $r_0 = b - Ax_0$, and set $p_0 = r_0$. Pick also a vector $\tilde r_0$ such that $\tilde r_0^* r_0 \neq 0$, and set $\tilde p_0 = \tilde r_0$.
2. For $k = 0, 1, \ldots$ do
   (a) Set $s_{k+1} = s_k + \alpha_k p_k$, where
   \[ \alpha_k = \frac{\tilde r_k^* r_k}{\tilde p_k^* A p_k}. \]
   (b) Compute
   \[ r_{k+1} = r_k - \alpha_k A p_k, \qquad \tilde r_{k+1} = \tilde r_k - \bar{\alpha}_k A^* \tilde p_k. \]
   (c) Set
   \[ p_{k+1} = r_{k+1} + \beta_k p_k, \qquad \tilde p_{k+1} = \tilde r_{k+1} + \bar{\beta}_k \tilde p_k, \quad \text{where} \quad \beta_k = \frac{\tilde r_{k+1}^* r_{k+1}}{\tilde r_k^* r_k}. \]
   end do (k)

Clearly, at stage $k$ of the do loop, we need the vectors $s_k$, $r_k$, $\tilde r_k$, $p_k$, and $\tilde p_k$ from the previous step, and these are overwritten by $s_{k+1}$, $r_{k+1}$, $\tilde r_{k+1}$, $p_{k+1}$, and $\tilde p_{k+1}$, respectively. In addition, we need to compute $A p_k$ and $A^* \tilde p_k$, which overwrite $A p_{k-1}$ and $A^* \tilde p_{k-1}$, respectively. Thus, we need to store only seven vectors at any stage. When programmed properly, the computational cost of stage $k$ is as follows: two matrix-vector products (namely, $A p_k$ and $A^* \tilde p_k$); five saxpy operations (for computing $s_{k+1}$, $r_{k+1}$, $\tilde r_{k+1}$, $p_{k+1}$, and $\tilde p_{k+1}$); and two inner products, namely, $\tilde p_k^* A p_k$ and $\tilde r_{k+1}^* r_{k+1}$ (for computing $\alpha_k$ and $\beta_k$). (Note that $\tilde r_k^* r_k$ is available from step $k - 1$.)
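The Bi-CG steps above can be sketched as follows. This is a minimal illustration, assuming real data (so $A^* = A^T$ and the conjugates on $\alpha_k$ and $\beta_k$ disappear), the common choice $\tilde r_0 = r_0$, and plain Python lists; the function name `bicg`, the stopping tolerance, and the exact-zero breakdown test are assumptions of this sketch rather than part of the algorithm as stated.

```python
def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def bicg(A, b, x0, tol=1e-10, max_iter=100):
    """Bi-CG for real A with shadow residual r~_0 = r_0."""
    At = [list(col) for col in zip(*A)]
    s = list(x0)
    r = [bi - yi for bi, yi in zip(b, matvec(A, s))]   # r_0 = b - A x_0
    rt = list(r)                                        # r~_0
    p, pt = list(r), list(rt)                           # p_0, p~_0
    rho = dot(rt, r)                                    # r~_k^T r_k
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 <= tol:
            break
        Ap = matvec(A, p)                               # first matrix-vector product
        Atpt = matvec(At, pt)                           # second matrix-vector product
        denom = dot(pt, Ap)                             # p~_k^T A p_k
        if rho == 0.0 or denom == 0.0:
            raise ArithmeticError("Bi-CG breakdown")
        alpha = rho / denom
        s = [si + alpha * pi for si, pi in zip(s, p)]
        r = [ri - alpha * qi for ri, qi in zip(r, Ap)]
        rt = [ti - alpha * qi for ti, qi in zip(rt, Atpt)]
        rho_new = dot(rt, r)
        beta = rho_new / rho
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        pt = [ti + beta * qi for ti, qi in zip(rt, pt)]
        rho = rho_new
    return s

# Nonsymmetric 3x3 example whose exact solution is (1, 2, 3) (illustrative data).
A = [[4.0, 1.0, 0.0], [2.0, 5.0, 1.0], [0.0, 1.0, 3.0]]
b = [6.0, 15.0, 11.0]
x = bicg(A, b, [0.0, 0.0, 0.0])
```

The loop body mirrors the cost count given above: two matrix-vector products, five vector updates, and two inner products per stage, with $\tilde r_k^T r_k$ carried over in `rho`.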


The following theorem verifies the validity of the general algorithm given above.

Theorem 9.22. As long as $\tilde r_k^* r_k \neq 0$ and $\tilde p_k^* A p_k \neq 0$ in the algorithm described above, the following are true:
1. $\mathrm{span}\{r_0, r_1, \ldots, r_k\} = \mathrm{span}\{p_0, p_1, \ldots, p_k\} = \mathcal{K}_{k+1}(A; r_0)$. (9.105)
2. $\mathrm{span}\{\tilde r_0, \tilde r_1, \ldots, \tilde r_k\} = \mathrm{span}\{\tilde p_0, \tilde p_1, \ldots, \tilde p_k\} = \mathcal{K}_{k+1}(A^*; \tilde r_0)$. (9.106)
3. $\tilde r_i^* r_k = 0$, $\tilde p_i^* r_k = 0$, $\tilde p_i^* A p_k = 0$, $0 \le i \le k - 1$. (9.107)
4. $r_i^* \tilde r_k = 0$, $p_i^* \tilde r_k = 0$, $p_i^* A^* \tilde p_k = 0$, $0 \le i \le k - 1$. (9.108)

As soon as $r_k = 0$, we have $s_k = s$.

Proof. We prove this theorem by induction on $k$. Obviously, the assertion is true for $k = 0$. Let us assume that it is true for $k$. Then, for $k + 1$, parts 1 and 2 are true by the fact that
\[ r_{k+1} = r_k - \alpha_k A p_k, \qquad \tilde r_{k+1} = \tilde r_k - \bar{\alpha}_k A^* \tilde p_k \]
and by the induction hypothesis.

To prove parts 3 and 4, we proceed as follows. We begin with the proof of $\tilde r_i^* r_{k+1} = 0$ for $0 \le i \le k$. By step 2(b) in the algorithm, we have
\[ \tilde r_i^* r_{k+1} = \tilde r_i^* r_k - \alpha_k \tilde r_i^* A p_k. \tag{9.109} \]
By the induction hypothesis, $\tilde r_i^* r_k = 0$ for $0 \le i \le k - 1$. Also, invoking $\tilde r_i = \tilde p_i - \bar{\beta}_{i-1} \tilde p_{i-1}$ from step 2(c) in the algorithm, we have $\tilde r_i^* A p_k = 0$ for $0 \le i \le k - 1$, which follows from the induction hypothesis. Thus, we have shown that $\tilde r_i^* r_{k+1} = 0$ for $0 \le i \le k - 1$. We now have to show that this is also true for $i = k$. By the fact that $\tilde r_k = \tilde p_k - \bar{\beta}_{k-1} \tilde p_{k-1}$ from step 2(c) in the algorithm, and by the induction hypothesis, we have $\tilde r_k^* A p_k = \tilde p_k^* A p_k$. Thus, we have obtained
\[ \tilde r_k^* r_{k+1} = \tilde r_k^* r_k - \alpha_k \tilde p_k^* A p_k. \]
Substituting into this equality the expression for $\alpha_k$ given in step 2(a) of the algorithm, we observe that the right-hand side vanishes. We have thus proved that $\tilde r_i^* r_{k+1} = 0$ for $0 \le i \le k$. The proof of $r_i^* \tilde r_{k+1} = 0$ for $0 \le i \le k$ is identical.

We now proceed to the proof of $\tilde p_i^* A p_{k+1} = 0$ for $0 \le i \le k$. By step 2(c) in the algorithm, we have
\[ \tilde p_i^* A p_{k+1} = \tilde p_i^* A r_{k+1} + \beta_k \tilde p_i^* A p_k = (A^* \tilde p_i)^* r_{k+1} + \beta_k \tilde p_i^* A p_k, \]
which, from step 2(b), becomes
\[ \tilde p_i^* A p_{k+1} = \frac{1}{\alpha_i} (\tilde r_i - \tilde r_{i+1})^* r_{k+1} + \beta_k \tilde p_i^* A p_k. \tag{9.110} \]
Now, for $0 \le i \le k - 1$, $(\tilde r_i - \tilde r_{i+1})^* r_{k+1} = 0$, as we have already shown in the preceding paragraph, and $\tilde p_i^* A p_k = 0$ by the induction hypothesis. Thus, we have shown that $\tilde p_i^* A p_{k+1} = 0$ for $0 \le i \le k - 1$. We now have to show that this is also true for $i = k$. Letting $i = k$ in (9.110) and invoking $\tilde r_k^* r_{k+1} = 0$, we obtain
\[ \tilde p_k^* A p_{k+1} = -\frac{1}{\alpha_k} \tilde r_{k+1}^* r_{k+1} + \beta_k \tilde p_k^* A p_k. \]
Substituting the expressions for $\alpha_k$ and $\beta_k$ given in the algorithm, we obtain $\tilde p_k^* A p_{k+1} = 0$. We have thus shown that $\tilde p_i^* A p_{k+1} = 0$ for $0 \le i \le k$. The proof of $p_i^* A^* \tilde p_{k+1} = 0$ for $0 \le i \le k$ is identical.

Remarks:
1. We mentioned that the iterations in Bi-CG can be continued as long as $\tilde r_k^* r_k \neq 0$. They fail when $\tilde r_k^* r_k = 0$, in which case we say that a serious breakdown has occurred.
2. When $A$ is Hermitian positive definite and $\tilde r_0 = r_0$, we have that $\mathcal{Y}_k = \mathcal{Z}_k$ and that Bi-CG reduces to CG. This can be verified by comparing the two algorithms as we have described them here. It is easy to see that, in such a case, $\tilde r_k = r_k$ and $\tilde p_k = p_k$ for all $k$ in Bi-CG, and $\alpha_k$ and $\beta_k$ in Bi-CG are real and the same as those in CG. Indeed, the recursions for $s_k$, $r_k$, and $p_k$ are exactly the same in both Bi-CG and CG in this case.

9.8.2 Variants of Bi-CG

Since Bi-CG requires both matrices $A$ and $A^*$, we need to have two separate matrix-vector multiplication procedures for applying Bi-CG, namely, one for $Ax$ and another for $A^* x$. (This should be contrasted with FOM, GMR, CG, and CR, for which we need only one procedure to compute $Ax$.) To avoid the use of $A^*$, variants of Bi-CG have been proposed: conjugate gradients squared (CGS) by Sonneveld [307] and biconjugate gradients stabilized (Bi-CGSTAB) by van der Vorst [325] are two such methods. Both methods converge faster than Bi-CG. CGS appears to have irregular behavior, while Bi-CGSTAB does not suffer from this problem. Another method that uses both $A$ and $A^*$ but does not have irregular behavior was given by Freund and Nachtigal [88] and is known as the quasi-minimal residual method (QMR). Subsequently, a version of QMR that does not use $A^*$ was given by Freund [90] and is known as the transpose-free quasi-minimal residual method (TFQMR). We do not go into the details of these methods here and refer the reader to the books [118], [239], and [326] on Krylov subspace methods mentioned in Chapter 0 and to the relevant papers.

9.9 Preconditioning

We have seen from the analyses of Chapters 6 and 7 concerning vector extrapolation methods and of the present chapter concerning Krylov subspace methods for solving a linear system $Ax = b$ that the rates at which the sequences generated by all these methods converge depend very much on the spectral properties of the matrix $A$. To improve the convergence rates, we may transform the given linear system into one that has the same solution but whose matrix has more favorable spectral properties than $A$. For example, when solving $Ax = b$, with $A$ Hermitian positive definite, the rate at which convergence of CG and CR takes place is dictated by the condition number $\kappa_2$ of $A$; the larger $\kappa_2$, the slower the convergence. To improve the convergence rates of these methods, we may transform the given linear system into one whose matrix has a smaller condition number than that of $A$.

We can achieve this, for example, by premultiplying the system by $M^{-1}$, where $M$ is a nonsingular matrix, so that $M^{-1}Ax = M^{-1}b$. Clearly, this transformed system has the same solution as $Ax = b$, and we would like $M^{-1}A$ to have better spectral properties than $A$. The matrix $M$ is called a preconditioner. Clearly, the closer $M$ is to $A$ (or $M^{-1}$ is to $A^{-1}$), the closer $M^{-1}A$ is to $I$, and the better the rate of convergence of Krylov subspace methods. Thus, $M$ should be a reasonably good approximation to $A$ in some sense. In addition, it should be relatively easy to construct $M$. Finally, when applying Krylov subspace methods to $Ax = b$, we do not wish to actually compute the matrix $M^{-1}A$. We always compute a vector $M^{-1}z$ as the solution of $My = z$ for $y$. Thus, another important feature we require $M$ to have is that it should be easy (inexpensive) to solve $My = z$ for $y$. This way of preconditioning is known as left preconditioning.

A similar preconditioning is achieved by applying Krylov subspace methods to $AM^{-1}y = b$ to compute $y$; following this, we let $x = M^{-1}y$. This is known as right preconditioning.

Yet another preconditioning involves a factorized form of $M$. Suppose that the factorization $M = M_1 M_2$ of $M$ is available. Then we apply Krylov subspace methods to $M_1^{-1} A M_2^{-1} y = M_1^{-1} b$; following this, we let $x = M_2^{-1} y$. This is known as two-sided preconditioning. The relevant factorization may be the LU factorization $M = LU$ in general, or the Cholesky factorization $M = LL^*$ when $M$ is Hermitian positive definite, for example.

When $A$ is a large and sparse matrix, we would like the preconditioner $M$ to be a sparse matrix as well. Now, the LU or Cholesky factorization of a sparse matrix is normally denser than the matrix itself, which makes these factorizations not so suitable. An often very effective alternative is an approximate LU factorization that is as sparse, or nearly as sparse, as $A$: In this approach, we choose sparse matrices $L$ and $U$ so that the difference $LU - A$ is small. This is known as an incomplete LU (ILU) factorization, and $M = LU$ is known as an incomplete LU preconditioner.
For a Hermitian A, we compute M = LL∗ , which is known as an incomplete Cholesky (IC) preconditioner. There are different ways of achieving ILU and IC. In the simplest case, for a non-Hermitian A, we compute an LU factorization, but where A has a zero element, we replace any nonzero element of L or U by zero. For a Hermitian A, we do the same with the Cholesky factors L and L∗ . In this way we preserve the sparseness structure of A in ILU or IC. Sometimes, we may have to use pivoting when applying this procedure, by which we achieve that LU − PA is small, where P is a permutation matrix. Note that incomplete factorizations do not always exist, but there are classes of matrices for which they do exist; see Meijerink and van der Vorst [184]. There are several different approaches to the design of preconditioners. We do not go into these in any detail since the subject is beyond the scope of this book. This is done at length in numerous papers and in the books on Krylov subspace methods mentioned earlier. See also the book by Chen [58]. In the next subsection, we briefly discuss an approach that arises from fixed-point iterative schemes.
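The simplest incomplete factorization described above, which keeps the sparsity pattern of $A$, is commonly called ILU(0). A minimal sketch follows, assuming a real matrix stored densely as nested Python lists, no pivoting, and nonzero pivots; here fill-in is discarded during the elimination itself, which is the usual way this variant is organized and is a choice of this sketch. The function name `ilu0` and the test matrix are illustrative.

```python
def ilu0(A):
    """ILU(0): Gaussian elimination restricted to the nonzero pattern of A.

    Returns a unit lower-triangular L and an upper-triangular U (dense lists)
    such that L and U have nonzeros only where A does.
    """
    n = len(A)
    pattern = [[A[i][j] != 0.0 for j in range(n)] for i in range(n)]
    W = [row[:] for row in A]                 # working copy, overwritten by L and U
    for i in range(1, n):
        for k in range(i):
            if not pattern[i][k]:
                continue                      # entry outside the pattern stays zero
            W[i][k] /= W[k][k]                # multiplier l_ik
            for j in range(k + 1, n):
                if pattern[i][j]:             # update only entries inside the pattern
                    W[i][j] -= W[i][k] * W[k][j]
    L = [[W[i][j] if j < i else (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    U = [[W[i][j] if j >= i else 0.0 for j in range(n)] for i in range(n)]
    return L, U

# Diagonally dominant matrix whose exact LU factors would have fill-in.
A = [[4.0, -1.0, 0.0, -1.0],
     [-1.0, 4.0, -1.0, 0.0],
     [0.0, -1.0, 4.0, -1.0],
     [-1.0, 0.0, -1.0, 4.0]]
L, U = ilu0(A)
LU = [[sum(L[i][k] * U[k][j] for k in range(4)) for j in range(4)] for i in range(4)]
```

For this variant, the product $LU$ agrees with $A$ at every position where $A$ is nonzero, and the difference $LU - A$ is confined to the positions where fill-in was dropped. Applying the preconditioner then amounts to one forward and one backward triangular solve with $L$ and $U$.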

9.9.1 Fixed-point iterative schemes as preconditioners

Let us recall that an iterative scheme for solving the nonsingular system $Ax = b$ is obtained by first splitting $A$ as $A = M - N$, where $M$ is nonsingular; then rewriting $Ax = b$ in the form
\[ x = Tx + d, \qquad T = M^{-1}N = I - M^{-1}A, \qquad d = M^{-1}b; \]
and then generating a sequence of vectors $\{x_m\}$ as in
\[ x_{m+1} = T x_m + d, \quad m = 0, 1, \ldots, \quad \text{for some arbitrary } x_0. \]
Clearly, the system $x = Tx + d$, from which the sequence $\{x_m\}$ is derived, is nothing but $M^{-1}Ax = M^{-1}b$, which is $Ax = b$ (left) preconditioned by the matrix $M$. Now the matrices $M$ resulting from the Jacobi, Gauss–Seidel, and SOR methods discussed in Chapter 0 have all the desirable properties mentioned above at least for certain classes of sparse matrices $A$. Thus, Krylov subspace methods can be used very effectively in conjunction with fixed-point iterative schemes to solve linear systems that arise from, for example, finite-difference or finite-element solutions of elliptic equations.

9.9.2 Preconditioned CG

What we have described so far can be used in a straightforward way with FOM, GMRES, or Bi-CG when these are applied to linear systems with non-Hermitian matrices $A$. In general, they cannot be used with CG or CR to directly solve $Ax = b$ with Hermitian positive definite $A$, however. For example, when using left preconditioning, we run into the problem that the preconditioned matrix $M^{-1}A$ ceases to be Hermitian generally. This problem can be solved in different ways by choosing $M$ to be Hermitian positive definite first of all and then switching to a suitable inner product with respect to which $M^{-1}A$ is self-adjoint.$^{34}$ We now turn to the details of this approach for CG.

The inner product we switch to is $(y, z)_M = y^* M z$. With respect to this inner product, $M^{-1}A$ is self-adjoint, namely, $(y, M^{-1}Az)_M = (M^{-1}Ay, z)_M$. In addition, it is positive definite since it is similar to $M^{-1/2} A M^{-1/2}$, which is clearly Hermitian positive definite. We can now apply CG or CR to $M^{-1}Ax = M^{-1}b$ with this new inner product. We note that now
\[ s_k \in x_0 + \mathcal{Y}_k, \quad \text{with} \quad r_k = b - A s_k \quad \text{and} \quad \mathcal{Y}_k = \mathcal{K}_k(M^{-1}A; \tilde r_0), \]
and
\[ \tilde r_k = M^{-1} r_k, \quad k = 0, 1, \ldots, \qquad (z, \tilde r_k)_M = 0 \quad \forall\, z \in \mathcal{Y}_k. \]
Here are the steps of the preconditioned CG:
1. Pick an initial vector $x_0 = s_0$, compute $r_0 = b - Ax_0$ and $\tilde r_0 = M^{-1} r_0$, and set $p_0 = \tilde r_0$.
2. For $k = 0, 1, \ldots$ do
   (a) Set $s_{k+1} = s_k + \alpha_k p_k$, where
   \[ \alpha_k = \frac{(\tilde r_k, \tilde r_k)_M}{(p_k, M^{-1}A p_k)_M} = \frac{\tilde r_k^* r_k}{p_k^* A p_k}. \]
   (b) Compute $r_{k+1} = r_k - \alpha_k A p_k$ and $\tilde r_{k+1} = M^{-1} r_{k+1}$.
   (c) Set $p_{k+1} = \tilde r_{k+1} + \beta_k p_k$, where
   \[ \beta_k = \frac{(\tilde r_{k+1}, \tilde r_{k+1})_M}{(\tilde r_k, \tilde r_k)_M} = \frac{\tilde r_{k+1}^* r_{k+1}}{\tilde r_k^* r_k}. \]
   end do (k)

34 If two-sided preconditioning is used to solve $Ax = b$ in the form $(P^{-1} A P^{-*}) y = P^{-1} b$, $x = P^{-*} y$, then the matrix $B = P^{-1} A P^{-*}$ is clearly Hermitian; in this case we can directly apply CG to $(P^{-1} A P^{-*}) y = P^{-1} b$.
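The preconditioned CG steps above can be sketched as follows. This is a minimal illustration, assuming a real symmetric positive definite $A$ in plain Python lists and a Jacobi (diagonal) preconditioner $M = \mathrm{diag}(A)$ chosen purely for the example; the function name `pcg`, the `apply_minv` callback, and the tolerance are assumptions of this sketch.

```python
def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def pcg(A, b, x0, apply_minv, tol=1e-10, max_iter=200):
    """Preconditioned CG; apply_minv(r) returns M^{-1} r, i.e. solves M y = r."""
    s = list(x0)
    r = [bi - yi for bi, yi in zip(b, matvec(A, s))]   # r_0 = b - A x_0
    rt = apply_minv(r)                                  # r~_0 = M^{-1} r_0
    p = list(rt)                                        # p_0 = r~_0
    rho = dot(rt, r)                                    # r~_k^T r_k = (r~_k, r~_k)_M
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 <= tol:
            break
        Ap = matvec(A, p)
        alpha = rho / dot(p, Ap)
        s = [si + alpha * pi for si, pi in zip(s, p)]
        r = [ri - alpha * qi for ri, qi in zip(r, Ap)]
        rt = apply_minv(r)                              # one preconditioner solve per step
        rho_new = dot(rt, r)
        beta = rho_new / rho
        p = [ti + beta * pi for ti, pi in zip(rt, p)]
        rho = rho_new
    return s

# SPD example with exact solution (1, -1, 2); Jacobi preconditioner M = diag(A).
A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
b = [3.0, 0.0, 3.0]
diag = [A[i][i] for i in range(3)]

def jacobi(r):
    return [ri / di for ri, di in zip(r, diag)]

x = pcg(A, b, [0.0, 0.0, 0.0], jacobi)
```

Note that the $M$-inner products never appear explicitly: thanks to $(\tilde r_k, \tilde r_k)_M = \tilde r_k^* r_k$, each step needs only one product with $A$ and one solve with $M$.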


The vectors $p_i$ and $\tilde r_i$ satisfy $p_i^* A p_j = (p_i, M^{-1}A p_j)_M = 0$ and $\tilde r_i^* r_j = (\tilde r_i, \tilde r_j)_M = 0$ if $i \neq j$. Since $s_k \in x_0 + \mathcal{Y}_k$ and $(z, \tilde r_k)_M = 0$ for all $z \in \mathcal{Y}_k$, we have that $Y^* M \tilde r_k = 0$, which, by $\tilde r_k = M^{-1} r_k = M^{-1}A(s - s_k)$, can be rewritten as $Y^* A (s - s_k) = 0$. Therefore, by the fact that $A$ is Hermitian positive definite, Theorem 9.3 applies and we have
\[ \|s - s_k\|_A = \min_{x \in x_0 + \mathcal{K}_k(M^{-1}A;\, \tilde r_0)} \|s - x\|_A. \]
Now, for arbitrary $x \in x_0 + \mathcal{Y}_k$, we have $s - x = p(M^{-1}A)(s - x_0)$, where $p \in \tilde{\mathcal{P}}_k$ and is arbitrary otherwise, as we showed in Lemma 9.8. Therefore, by Lemma 9.9,
\[ \frac{\|s - s_k\|_A}{\|s - x_0\|_A} \le \|p(M^{-1}A)\|_A = \|A^{1/2} M^{-1/2}\, p(M^{-1/2} A M^{-1/2})\, M^{1/2} A^{-1/2}\|_2 \le \kappa_2(M^{1/2} A^{-1/2})\, \|p(M^{-1/2} A M^{-1/2})\|_2, \]
which holds for all $p \in \tilde{\mathcal{P}}_k$. Applying Theorem 9.10, we finally obtain
\[ \frac{\|s - s_k\|_A}{\|s - x_0\|_A} \le \frac{\kappa_2(M^{1/2} A^{-1/2})}{T_k\!\left( \dfrac{\mu_{\max} + \mu_{\min}}{\mu_{\max} - \mu_{\min}} \right)} \le 2 \kappa_2(M^{1/2} A^{-1/2}) \left( \frac{\sqrt{\tilde\kappa} - 1}{\sqrt{\tilde\kappa} + 1} \right)^{k}, \]
where $\mu_{\max}$ and $\mu_{\min}$ are, respectively, the largest and the smallest eigenvalues of the (Hermitian positive definite) matrix $M^{-1/2} A M^{-1/2}$ (equivalently, of $M^{-1}A$), and $\tilde\kappa = \mu_{\max} / \mu_{\min}$.

9.10 FOM and GMR with prior iterations

Numerical experience shows that, in their restarted versions, FOM and GMR may stall in some cases; that is, both methods may suffer from very slow convergence. There have been several approaches to remedy this problem. Here we describe one due to Sidi and Shapira [304] that is easy to implement and uses ideas from vector extrapolation methods.

Consider the linear system $Ax = b$. The Richardson iteration method applied to this system is
\[ x_{j+1} = x_j + \omega (b - A x_j), \quad j = 0, 1, \ldots, \tag{9.111} \]
where $\omega$ is a carefully chosen scalar to ensure that $\{x_m\}$ converges, or diverges slowly. [Of course, convergence is guaranteed if $\rho(I - \omega A) < 1$.] Then apply FOM or GMR in their restarted forms, modified as follows:
1. Choose two integers $n, k \ge 1$ and an initial vector $x_0$.
2. Compute $x_1, \ldots, x_n$ via (9.111).
3. Apply $k$ stages of FOM or GMR to the system $Ax = b$, starting with the vector $x_n$, the end result being $s_{n,k}$.
4. If $s_{n,k}$ passes the accuracy test, stop. Otherwise, set $x_0 = s_{n,k}$ and go to step 2.
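Step 2 of the procedure above, the $n$ prior Richardson iterations (9.111), can be sketched as follows. This is a minimal illustration, assuming real data in plain Python lists; the function name `richardson` and the values of $\omega$ and $n$ are choices made for the example, and the Krylov stage of step 3 is deliberately left out.

```python
def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def richardson(A, b, x0, omega, n):
    """Return x_n from x_{j+1} = x_j + omega * (b - A x_j), j = 0, ..., n-1."""
    x = list(x0)
    for _ in range(n):
        r = [bi - yi for bi, yi in zip(b, matvec(A, x))]   # residual b - A x_j
        x = [xi + omega * ri for xi, ri in zip(x, r)]
    return x

# SPD example with solution (1, 1); eigenvalues of A are (5 -/+ sqrt(5)) / 2,
# so with omega = 0.4 the spectral radius of I - omega*A is about 0.447.
A = [[2.0, 1.0], [1.0, 3.0]]
b = [3.0, 4.0]
x_n = richardson(A, b, [0.0, 0.0], omega=0.4, n=30)
```

In FOM($n,k$) or GMR($n,k$), the vector `x_n` returned here would serve as the starting vector for the $k$ Krylov stages of step 3.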


These modified restarted versions of FOM and GMR are denoted FOM($n, k$) and GMR($n, k$), respectively, in [304], where their error bounds (for fixed $n$ and $k$) are studied in detail too. Of course, FOM($0, k$) and GMR($0, k$) are FOM($k$) and GMR($k$), respectively. The numerical examples provided in [304] show clearly that FOM($n, k$) and GMR($n, k$), even with moderately large $n$, can be very effective where FOM($k$) and GMR($k$) fail.

Finally, we recall that $\mathcal{K}_k(A; r) = \mathcal{K}_k(I - \omega A; r)$ for arbitrary $\omega \neq 0$. Thus, by Theorem 9.7, $s_{n,k}^{\mathrm{FOM}} = s_{n,k}^{\mathrm{MPE}}$ and $s_{n,k}^{\mathrm{GMR}} = s_{n,k}^{\mathrm{RRE}}$, MPE and RRE being applied to the vectors $x_0, x_1, \ldots, x_{n+k+1}$ generated via (9.111), starting with the vector $x_n$. We have already analyzed the error bounds for $s_{n,k}^{\mathrm{MPE}}$ and $s_{n,k}^{\mathrm{RRE}}$ in Chapter 7 (see also [303]) in terms of the spectrum of the matrix $T$ used to generate the sequence $\{x_m\}$ via $x_{m+1} = T x_m + d$. In the present case concerning FOM($n, k$) and GMR($n, k$), we have $T = I - \omega A$. Therefore, the bounds that we obtained for $\Gamma_{n,k}^{D}$ in Section 7.3 of Chapter 7 pertain to the sets $D$ containing the spectrum of $I - \omega A$. For details, see [304], where several numerical examples showing the effectiveness of this approach are also given.

9.11 Krylov subspace methods for nonlinear systems

We have seen that Krylov subspace methods are very effective at solving linear systems. Because nonlinear systems behave linearly close to their solutions, modified versions of Krylov subspace methods, FOM and GMR in particular, have been developed to solve nonlinear systems. See Wigton, Yu, and Young [343] and also Brown [49] and Brown and Saad [51], for example. See also Kelley [157, Chapter 6].

Let $\psi(x) = 0$ be the nonlinear $N$-dimensional system we wish to solve, and let $s$ be the solution to it. Consider the solution of this system via Newton's method. Here are the steps of this method:
1. Choose an initial approximation $s_0$ to $s$.
2. For $m = 0, 1, \ldots,$ do
   (a) Solve the linear system $\psi(s_m) + \Psi(s_m) u_m = 0$ for $u_m$.
   (b) Set $s_{m+1} = s_m + u_m$.
   (c) If $s_{m+1}$ passes the accuracy test, stop.
   end do (m)

Here $\Psi(x)$ is the Jacobian matrix of $\psi$ evaluated at $x$. Provided $\psi \in C^2$ in a neighborhood of $s$ and $\Psi(s)$ is nonsingular, this method converges quadratically; that is, $\lim_{m \to \infty} s_m = s$ and
\[ \|s_{m+1} - s\| \le C \|s_m - s\|^2 \quad \text{for all } m \text{ and some constant } C > 0 \]
provided $s_0$ is sufficiently close to $s$.

To be able to use Newton's method to solve $\psi(x) = 0$, we need to solve linear systems of the form $\Psi(x) u = -\psi(x)$ at each iteration, and for this, we need to know $\Psi(x)$, which is possible when $\psi(x)$ is known analytically. If we are unable or unwilling to compute $\Psi(x)$, we can resort to procedures that resemble Krylov subspace methods to solve the linear systems $\Psi(x) u = -\psi(x)$ approximately. Now, to be able to apply regular Krylov subspace methods to these systems, we need to have a procedure for computing matrix-vector products of the form $\Psi(x) y$. In the absence of such a procedure,$^{35}$ which is the case when $\Psi(x)$ is not known, we approximate the product $\Psi(x) y$ as
\[ \Psi(x) y \approx \frac{1}{\varepsilon} \bigl[ \psi(x + \varepsilon y) - \psi(x) \bigr] \equiv g(x; y) \tag{9.112} \]
for a sufficiently small $\varepsilon \neq 0$.$^{36}$ With this approximate matrix-vector product, we can now perform several steps of some Krylov subspace method in a suitable fashion to approximate $u_m$ in Newton's method. Such modifications of Newton's method are known as Newton–Krylov methods.

In what follows, we describe the Newton–Arnoldi and Newton-GMRES methods of obtaining approximate solutions to $\Psi(s_m) u_m = -\psi(s_m)$ for each $m$. We start by modifying the Arnoldi–Gram–Schmidt process described in Section 9.3 to accommodate the present approximate matrix-vector product $g(x; y)$ in (9.112) and to generate a set of orthonormal vectors $\{q_1, \ldots, q_k\}$ for some fixed integer $k$.

1. Compute $r_0 = -\psi(s_m)$ and set $\beta = \|r_0\|$ and $q_1 = r_0 / \beta$.
2. For $j = 1, \ldots, k$ do
   Set $a_{j+1}^{(1)} = g(s_m; q_j)$.
   For $i = 1, \ldots, j$ do
      Compute $h_{ij} = (q_i, a_{j+1}^{(i)})$ and compute $a_{j+1}^{(i+1)} = a_{j+1}^{(i)} - h_{ij} q_i$.
   end do (i)
   Compute $h_{j+1,j} = \|a_{j+1}^{(j+1)}\|$ and set $q_{j+1} = a_{j+1}^{(j+1)} / h_{j+1,j}$.
   end do (j)

We next define the unitary matrix $Q_k$ and the upper Hessenberg matrices $\bar{H}_k$ and $H_k$ as
\[ Q_k = [\, q_1 \,|\, \cdots \,|\, q_k \,] \]
and
\[ \bar{H}_k = \begin{bmatrix} h_{11} & h_{12} & \cdots & \cdots & h_{1k} \\ h_{21} & h_{22} & \cdots & \cdots & h_{2k} \\ & \ddots & \ddots & & \vdots \\ & & \ddots & \ddots & \vdots \\ & & & h_{k,k-1} & h_{kk} \\ & & & & h_{k+1,k} \end{bmatrix}, \qquad H_k = \begin{bmatrix} h_{11} & h_{12} & \cdots & \cdots & h_{1k} \\ h_{21} & h_{22} & \cdots & \cdots & h_{2k} \\ & \ddots & \ddots & & \vdots \\ & & \ddots & \ddots & \vdots \\ & & & h_{k,k-1} & h_{kk} \end{bmatrix}. \]

From this point on, we proceed as in Sections 9.3 and 9.4:

• For the Newton–Arnoldi approach, we solve the linear system $H_k \eta = \beta e_1$ for $\eta$ and set $\tilde{u}_{mk} = Q_k \eta$.
• For the Newton–GMRES approach, we solve the linear least-squares problem $\min_{\eta} \|\beta e_1 - \bar{H}_k \eta\|$ for $\eta$ and set $\tilde{u}_{mk} = Q_k \eta$.

Following these, for both procedures, we replace $s_{m+1} = s_m + u_m$ in Newton's method by $s_{m+1} = s_m + \tilde{u}_{mk}$ and continue to the computation of $s_{m+2}$.

Remarks:
1. Note that the vector function $\psi$ is evaluated $k + 1$ times for each $m$.
2. If $\psi$ is linear, that is, $\psi(x) = Ax - b$, it is easy to see that $g(x; y) = Ay$, so that the modified Arnoldi–Gram–Schmidt process above becomes the regular Arnoldi–Gram–Schmidt process; hence the methods above are the regular FOM(k) and GMRES(k), namely, the restarted FOM and GMRES.

For error analyses of the methods described here and other inexact Newton methods, see Brown [49], Brown and Saad [51], and Kelley [157], for example.

As we have seen, the Newton–Krylov methods are applied directly to the nonlinear system $\psi(x) = 0$. If a fixed-point iterative procedure of the form $x_{m+1} = f(x_m)$ has been developed for this system, a more direct approach to its solution would be via vector extrapolation methods, such as MPE and RRE, in the cycling mode. This mode of operation would be especially convenient when $x = f(x)$ is a suitably preconditioned form of $\psi(x) = 0$, that is, when $F(s)$, the Jacobian matrix of $f(x)$ evaluated at $x = s$, has a favorable spectrum. This is what is done when solving most engineering problems.

³⁵ The important advantage of Krylov subspace methods in solving a linear system $Ax = b$ stems from the fact that a procedure for computing the matrix-vector product $Ax$ is available whether the matrix $A$ is given explicitly or not. For a nonlinear system $\psi(x) = 0$, however, no such thing is available, and we need to invent a procedure that will play the same role, at least approximately.
³⁶ Note that $\lim_{\varepsilon\to 0} \frac{1}{\varepsilon}\bigl[\psi(x + \varepsilon y) - \psi(x)\bigr] = \Psi(x)y$. The proper choice of $\varepsilon$ in (9.112) is an important issue in floating-point arithmetic. For this issue, see the discussion in Kelley [157, pp. 79–82].
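The Newton–GMRES procedure described above can be sketched in NumPy as follows. This is only an illustrative sketch under stated assumptions, not the book's implementation: the names `fd_jacvec` and `newton_gmres`, the fixed value of ε (1e-7), the stopping test on ‖ψ(s)‖, and the small test system ψ are all hypothetical choices made here, and no breakdown handling is included.

```python
import numpy as np

def fd_jacvec(psi, x, psi_x, y, eps=1e-7):
    """Forward-difference approximation g(x; y) of Psi(x) y, as in (9.112)."""
    return (psi(x + eps * y) - psi_x) / eps

def newton_gmres(psi, s0, k, tol=1e-10, max_newton=50, eps=1e-7):
    """Newton iteration whose correction comes from k Arnoldi steps plus least squares."""
    s = np.array(s0, dtype=float)
    for _ in range(max_newton):
        psi_s = psi(s)                      # one evaluation of psi at s_m
        beta = np.linalg.norm(psi_s)
        if beta < tol:                      # accuracy test (an assumed criterion)
            break
        Q = np.zeros((s.size, k + 1))
        Hbar = np.zeros((k + 1, k))
        Q[:, 0] = -psi_s / beta             # q_1 = r_0 / beta with r_0 = -psi(s_m)
        for j in range(k):                  # k further evaluations of psi
            a = fd_jacvec(psi, s, psi_s, Q[:, j], eps)
            for i in range(j + 1):          # modified Gram-Schmidt sweep
                Hbar[i, j] = Q[:, i] @ a
                a = a - Hbar[i, j] * Q[:, i]
            Hbar[j + 1, j] = np.linalg.norm(a)
            Q[:, j + 1] = a / Hbar[j + 1, j]   # assumes no breakdown
        rhs = np.zeros(k + 1)
        rhs[0] = beta
        # Newton-GMRES: minimize || beta e_1 - Hbar eta ||
        eta, *_ = np.linalg.lstsq(Hbar, rhs, rcond=None)
        s = s + Q[:, :k] @ eta
    return s

# Hypothetical test system with a known solution x_true.
x_true = np.linspace(-1.0, 1.0, 10)
b = x_true + 0.1 * np.sin(np.roll(x_true, 1))

def psi(x):
    return x + 0.1 * np.sin(np.roll(x, 1)) - b

s = newton_gmres(psi, np.zeros(10), k=5)
print(np.linalg.norm(psi(s)), np.linalg.norm(s - x_true))
```

Replacing the least-squares solve by a solve with the leading k × k block of `Hbar` would give the Newton–Arnoldi variant. The test system was chosen so that its Jacobian is close to the identity, which is the kind of favorable spectrum for which a small k suffices.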

Chapter 10

Krylov Subspace Methods for Eigenvalue Problems

10.1 Projection methods for eigenvalue problems

In the preceding chapter, we discussed the solution of large and sparse nonsingular linear systems of the form $Ax = b$ by Krylov subspace methods. In this chapter, we will tackle the problem of computing eigenpairs of large and sparse square matrices by Krylov subspace methods. This subject is covered in the books by Wilkinson [344], Parlett [208], Cullum and Willoughby [63], Golub and Van Loan [103], and Saad [240]. See also the works by Kaniel [154], Paige [203], Saad [233, 234], and Golub and van der Vorst [105]. Again, because Krylov subspace methods are special cases of projection methods, we begin by treating projection methods for eigenvalue problems. We will use the notation of Chapter 9 throughout. We will also use the standard $l_2$ inner product and norm throughout this chapter, that is,

$$(y, z) = y^* z \quad \text{and} \quad \|z\| = \sqrt{(z, z)}.$$

Given a matrix $A \in \mathbb{C}^{N\times N}$, we will denote its eigenpairs by $(\mu_i, x_i)$. Thus, $Ax_i = \mu_i x_i$. Since $Ax_i - \mu_i x_i = 0$, for an approximate eigenpair $(\tilde{\mu}, \tilde{x})$, we necessarily have $A\tilde{x} - \tilde{\mu}\tilde{x} \neq 0$. We can thus view the difference $r(\tilde{x}) \equiv A\tilde{x} - \tilde{\mu}\tilde{x}$, with the normalization $\|\tilde{x}\| = 1$, as the residual vector of the approximate eigenpair $(\tilde{\mu}, \tilde{x})$. This vector also has the interesting property that its norm serves as an indicator of the quality of $\tilde{\mu}$ as an approximation to some eigenvalue. This is the subject of Theorem 10.1.

Theorem 10.1. Assume that $A$ is diagonalizable with eigenvalues $\mu_1, \ldots, \mu_N$, with $A = PDP^{-1}$, $D = \text{diag}(\mu_1, \ldots, \mu_N)$. Let $(\tilde{\mu}, \tilde{x})$, $\|\tilde{x}\| = 1$, be an approximate eigenpair. Then the residual vector $r(\tilde{x})$ satisfies

$$\min_i |\mu_i - \tilde{\mu}| \le \kappa(P)\,\|r(\tilde{x})\|, \qquad \kappa(P) = \|P\|\,\|P^{-1}\|. \tag{10.1}$$

If $A$ is normal, then (10.1) improves to read

$$\min_i |\mu_i - \tilde{\mu}| \le \|r(\tilde{x})\|. \tag{10.2}$$

Proof. If $\tilde{\mu}$ is an eigenvalue of $A$, (10.1) is satisfied automatically. Therefore, assume that $\tilde{\mu}$ is not an eigenvalue. Then $A - \tilde{\mu}I$ is nonsingular and we have the identity

$$\tilde{x} = (A - \tilde{\mu}I)^{-1}(A - \tilde{\mu}I)\tilde{x} = (A - \tilde{\mu}I)^{-1} r(\tilde{x}).$$

Taking norms on both sides of this identity,

$$1 = \|\tilde{x}\| \le \|(A - \tilde{\mu}I)^{-1}\|\,\|r(\tilde{x})\|. \tag{10.3}$$

Now,

$$A - \tilde{\mu}I = P(D - \tilde{\mu}I)P^{-1} \quad \Rightarrow \quad (A - \tilde{\mu}I)^{-1} = P(D - \tilde{\mu}I)^{-1}P^{-1}.$$

Consequently,

$$\|(A - \tilde{\mu}I)^{-1}\| \le \|P\|\,\|(D - \tilde{\mu}I)^{-1}\|\,\|P^{-1}\| = \frac{\|P\|\,\|P^{-1}\|}{\min_i |\mu_i - \tilde{\mu}|}.$$

Substituting this into (10.3), we obtain (10.1). The result in (10.2) follows from (10.1) and from the fact that $P$ is a unitary matrix when $A$ is normal.
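The bounds (10.1) and (10.2) are easy to check numerically. The following sketch builds a diagonalizable matrix with known eigenvalues, perturbs an exact eigenvector to obtain an approximate eigenpair, and compares both sides of each bound. The helper name `residual_check`, the use of the Rayleigh quotient as the approximate eigenvalue, and the random test matrices are assumptions made for this illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 6
mu = np.arange(1.0, N + 1.0)                 # exact eigenvalues 1, ..., 6

# Nonnormal, diagonalizable A = P D P^{-1}
P = rng.standard_normal((N, N))
A = P @ np.diag(mu) @ np.linalg.inv(P)

def residual_check(A, x, mu):
    """Return min_i |mu_i - mu_t| and ||r(x)|| for the normalized vector x."""
    x = x / np.linalg.norm(x)
    mu_t = x @ A @ x                          # Rayleigh quotient as approximate eigenvalue
    r = A @ x - mu_t * x
    return np.min(np.abs(mu - mu_t)), np.linalg.norm(r)

# Approximate eigenvector: exact eigenvector plus a small perturbation
gap, rnorm = residual_check(A, P[:, 2] + 1e-3 * rng.standard_normal(N), mu)
kappa = np.linalg.cond(P)                     # ||P|| ||P^{-1}|| in the 2-norm
print("general:", gap, "<=", kappa * rnorm)

# Normal (here real symmetric) A: the factor kappa(P) is not needed
U, _ = np.linalg.qr(rng.standard_normal((N, N)))
A_sym = U @ np.diag(mu) @ U.T
gap_s, rnorm_s = residual_check(A_sym, U[:, 2] + 1e-3 * rng.standard_normal(N), mu)
print("normal: ", gap_s, "<=", rnorm_s)
```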

In view of Theorem 10.1, we aim to make $r(\tilde{x})$ as small as possible by imposing additional conditions on it. This is what we do in the next definition, which is the analogue of Definition 9.1 for linear systems.

Definition 10.2. Let

$$\mathcal{Y} = \text{span}\{y_1, \ldots, y_k\} \quad \text{and} \quad \mathcal{Z} = \text{span}\{z_1, \ldots, z_k\}$$

be two k-dimensional subspaces of $\mathbb{C}^N$. Define $(\tilde{\mu}, \tilde{x})$ to be an approximate eigenpair for $A$ satisfying the following conditions: (i) $\tilde{x} \in \mathcal{Y}$ and (ii) $A\tilde{x} - \tilde{\mu}\tilde{x}$ is orthogonal to $\mathcal{Z}$. Such an approximation method is called a projection method for the eigenvalue problem. If $\mathcal{Z} = \mathcal{Y}$, we have an orthogonal projection method; otherwise, we have an oblique projection method. $\mathcal{Y}$ and $\mathcal{Z}$ are called, respectively, right and left subspaces.

As in Chapter 9, let us define the matrices $Y, Z \in \mathbb{C}^{N\times k}$ as

$$Y = [\, y_1 \,|\, \cdots \,|\, y_k \,] \quad \text{and} \quad Z = [\, z_1 \,|\, \cdots \,|\, z_k \,]. \tag{10.4}$$

From Definition 10.2, it is clear that

$$\tilde{x} = \sum_{i=1}^{k} \xi_i y_i = Y\xi, \qquad \xi = [\xi_1, \ldots, \xi_k]^T \in \mathbb{C}^k, \tag{10.5}$$

and

$$(z, AY\xi - \tilde{\mu}Y\xi) = 0 \quad \forall\, z \in \mathcal{Z}, \tag{10.6}$$

which means that $(z_i, AY\xi - \tilde{\mu}Y\xi) = 0$, $i = 1, \ldots, k$, which in turn can be rewritten in the form

$$Z^*(AY\xi - \tilde{\mu}Y\xi) = 0. \tag{10.7}$$

This is simply the k-dimensional generalized eigenvalue problem

$$(Z^*AY - \tilde{\mu}Z^*Y)\xi = 0 \quad \Rightarrow \quad (Z^*AY)\xi = \tilde{\mu}(Z^*Y)\xi. \tag{10.8}$$


The approximate eigenpairs $(\tilde{\mu}, \xi)$ are obtained as follows: The $\tilde{\mu}$ are obtained as the solution to

$$\det(Z^*AY - \tilde{\mu}Z^*Y) = 0, \tag{10.9}$$

and the $\xi$ are obtained by solving the singular homogeneous linear systems in (10.8). When the matrix $Z^*Y$ is nonsingular, (10.8) becomes the regular k-dimensional eigenvalue problem

$$B\xi = \tilde{\mu}\xi, \qquad B = (Z^*Y)^{-1}(Z^*AY). \tag{10.10}$$

(This happens when $\mathcal{Z} = \mathcal{Y}$, for example, since in this case, $Z^*Y = Y^*Y$ is Hermitian positive definite.) Clearly, this problem has $k$ eigenvalues. If $B$ is diagonalizable, it also has $k$ eigenvectors. Assuming that $B$ is diagonalizable, denote the $k$ eigenpairs of $B$ by $(\tilde{\mu}_i, \xi_i)$, and set $\tilde{x}_i = Y\xi_i$, $i = 1, \ldots, k$. The $\tilde{\mu}_i$ and $\tilde{x}_i$ are called, respectively, Ritz values and Ritz vectors of $A$, and they are the required approximate eigenvalues and eigenvectors of $A$. In addition, $(\tilde{\mu}_i, \tilde{x}_i)$ are called Ritz pairs of $A$.

The matrix $B$ in (10.10) simplifies considerably when the vectors $y_i$ and $z_i$ are biorthogonal, that is, when $(z_i, y_j) = \delta_{ij}$, in which case we have $Z^*Y = I_k$. Consequently, $B = Z^*AY$ now. Later in this chapter, we show how this can be achieved in two different but familiar cases.

Remarks:
1. It is clear that a projection method is characterized entirely by its right and left subspaces.
2. In most cases of interest, the subspaces $\mathcal{Y}$ and $\mathcal{Z}$ expand with increasing $k$. That is, if we denote the k-dimensional subspaces $\mathcal{Y}$ and $\mathcal{Z}$ by $\mathcal{Y}_k$ and $\mathcal{Z}_k$, respectively, then we have $\mathcal{Y}_k \subset \mathcal{Y}_{k+1}$ and $\mathcal{Z}_k \subset \mathcal{Z}_{k+1}$.
3. Part of the philosophy behind projection methods for the eigenvalue problem is that, as $k$ increases, $A\tilde{x} - \tilde{\mu}\tilde{x}$ is orthogonal to an expanding subspace and hence is likely to become closer to 0. Of course, when $k = N$, we will have $A\tilde{x} - \tilde{\mu}\tilde{x} = 0$ since the vector $A\tilde{x} - \tilde{\mu}\tilde{x}$ is orthogonal to $N$ linearly independent vectors in $\mathbb{C}^N$ now. As a result, $A\tilde{x} = \tilde{\mu}\tilde{x}$; that is, the Ritz pair $(\tilde{\mu}, \tilde{x})$ becomes an exact eigenpair.
4. It is easy to see that, when $A\mathcal{Y} \subseteq \mathcal{Y}$, the Ritz pairs $(\tilde{\mu}, \tilde{x})$ are actually exact eigenpairs.
5.
The cost of numerically solving the k-dimensional eigenvalue problem in (10.10) is negligible since k 1, v j l is a generalized eigenvector of A of rank l corresponding to the eigenvalue λ j . The λ j and v j l in the spectral decomposition of Am u are the only information contained in (k (A; u). For simplicity, let us consider the case in which A is diagonalizable. (The argument also applies when A is nondiagonalizable.) From the structure of Am u, which is dictated by that of u, we observe the following: (i) It may happen that some of the distinct eigenvalues of A are not in the set {λ1 , . . . , λ p }. (This may be true in exact arithmetic. In floating-point arithmetic, because of roundoff, all distinct eigenvalues of A will eventually participate in the computed Am u, however.) (ii) If one of the λ j in (10.24) has more than one corresponding eigenvector, then the only one contained in the vectors Am u is v j . If we want to obtain another of these eigenvectors, we need to construct a new Krylov subspace, (k (A; u  ), such that u  , in its spectral decomposition, has another eigenvector v j corresponding to λ j that is linearly independent of vj.

10.3 The method of Arnoldi

10.3.1 Derivation of the method

In the method of Arnoldi, we choose

$$\mathcal{Y} = \mathcal{K}_k(A; u) = \text{span}\{u, Au, A^2u, \ldots, A^{k-1}u\} \quad \text{and} \quad \mathcal{Z} = \mathcal{Y}$$

for some $u \neq 0$ in $\mathbb{C}^N$. Recalling the method of Arnoldi for linear systems, we can achieve $Y^*Y = I_k$ by constructing an orthonormal basis for $\mathcal{K}_k(A; u)$ via the Arnoldi–Gram–Schmidt process (carried out in the spirit of MGS), which we repeat here for convenience:³⁸

³⁸ Direct use of the vectors $y_j = A^{j-1}u$, $j = 1, 2, \ldots$, to form the matrices $Y^*Y$ and $Y^*AY$ and compute the Ritz values as in the preceding section is not recommended in floating-point arithmetic as it gives rather inaccurate results. The algorithm developed by Arnoldi performs much better in floating-point arithmetic.


1. Set $\beta = \|u\|$ and $q_1 = u/\beta$.
2. For $j = 1, \ldots, k$ do
      Set $a_{j+1}^{(1)} = Aq_j$.
      For $i = 1, \ldots, j$ do
         Compute $h_{ij} = (q_i, a_{j+1}^{(i)})$ and compute $a_{j+1}^{(i+1)} = a_{j+1}^{(i)} - h_{ij} q_i$.
      end do (i)
      Compute $h_{j+1,j} = \|a_{j+1}^{(j+1)}\|$ and set $q_{j+1} = a_{j+1}^{(j+1)}/h_{j+1,j}$.
   end do (j)

Note that, with $q_1, \ldots, q_j$ already determined, $h_{ij} q_i$ is the projection of $Aq_j$ along $q_i$, $i \le j$. Clearly, $a_{j+1}^{(j+1)} = Aq_j - \sum_{i=1}^{j} h_{ij} q_i$, and hence

$$Aq_j = \sum_{i=1}^{j+1} h_{ij} q_i, \qquad h_{ij} = (q_i, Aq_j), \qquad (q_i, q_j) = \delta_{ij} \quad \forall\, i, j. \tag{10.30}$$

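Note also that, as long as $k$ is less than the degree of the minimal polynomial of $A$ with respect to $u$, which we are assuming throughout, we have $h_{j,j-1} > 0$, $2 \le j \le k + 1$. We make use of this fact in the proof of Lemma 10.5 below. We now define the unitary matrix $Q_k \in \mathbb{C}^{N\times k}$ and the upper Hessenberg matrices $\bar{H}_k \in \mathbb{C}^{(k+1)\times k}$ and $H_k \in \mathbb{C}^{k\times k}$ as

$$Q_k = [\, q_1 \,|\, \cdots \,|\, q_k \,], \qquad Q_k^* Q_k = I_k, \tag{10.31}$$

and

$$\bar{H}_k = \begin{bmatrix}
h_{11} & h_{12} & \cdots & \cdots & h_{1k} \\
h_{21} & h_{22} & \cdots & \cdots & h_{2k} \\
 & \ddots & \ddots & & \vdots \\
 & & \ddots & \ddots & \vdots \\
 & & & h_{k,k-1} & h_{kk} \\
 & & & & h_{k+1,k}
\end{bmatrix}, \qquad
H_k = \begin{bmatrix}
h_{11} & h_{12} & \cdots & \cdots & h_{1k} \\
h_{21} & h_{22} & \cdots & \cdots & h_{2k} \\
 & \ddots & \ddots & & \vdots \\
 & & \ddots & \ddots & \vdots \\
 & & & h_{k,k-1} & h_{kk}
\end{bmatrix}. \tag{10.32}$$

Then

$$AQ_k = Q_{k+1}\bar{H}_k = Q_k H_k + h_{k+1,k}\, q_{k+1} e_k^T, \tag{10.33}$$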

where $e_j$ is the $j$th standard basis vector in $\mathbb{C}^k$, as usual. In addition, because $Q_k^* q_{k+1} = 0$,

$$Q_k^* A Q_k = H_k. \tag{10.34}$$

Clearly, we now have $Z = Y = Q_k$, and hence $Z^*Y = I_k$ and $B = H_k$ by (10.10). Hence,

$$H_k \xi = \tilde{\mu}\xi. \tag{10.35}$$

Thus, if $(\tilde{\mu}_i, \xi_i)$ are the eigenpairs of $H_k$, then $(\tilde{\mu}_i, \tilde{x}_i)$, where $\tilde{x}_i = Q_k \xi_i$, are the Ritz pairs of $A$. In addition, we have the following interesting result concerning the Ritz values $\tilde{\mu}_i$.

Lemma 10.5. Provided $k$ is less than the degree of the minimal polynomial of $A$ with respect to $u$, the Ritz values $\tilde{\mu}_i$ are all distinct.

Proof. Let $\tilde{\mu}$ be an eigenvalue of $H_k$. Then $\text{rank}(H_k - \tilde{\mu}I_k) \le k - 1$. The (upper triangular) matrix obtained by crossing out the first row and last column in

$$H_k - \tilde{\mu}I_k = \begin{bmatrix}
h_{11} - \tilde{\mu} & h_{12} & \cdots & \cdots & h_{1k} \\
h_{21} & h_{22} - \tilde{\mu} & \cdots & \cdots & h_{2k} \\
 & \ddots & \ddots & & \vdots \\
 & & \ddots & \ddots & \vdots \\
 & & & h_{k,k-1} & h_{kk} - \tilde{\mu}
\end{bmatrix}$$

has rank $k - 1$ since its diagonal elements $h_{j,j-1}$, $j = 2, \ldots, k$, are all nonzero. Consequently, $\text{rank}(H_k - \tilde{\mu}I_k) = k - 1$ exactly. Thus, by the rank-nullity theorem,³⁹ the null space of $H_k - \tilde{\mu}I_k$ is one-dimensional, so $\tilde{\mu}$ is simple. The result follows by applying this argument to each of the eigenvalues $\tilde{\mu}_i$.
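The Arnoldi–Gram–Schmidt process and the resulting Ritz pairs can be sketched in NumPy as follows. This is a minimal illustration, assuming that no breakdown occurs (that is, $k$ is less than the degree of the minimal polynomial of $A$ with respect to $u$); the function name `arnoldi` and the random test matrix are choices made here for illustration.

```python
import numpy as np

def arnoldi(A, u, k):
    """Return Q (N x (k+1)) and Hbar ((k+1) x k) with A Q[:, :k] = Q Hbar."""
    N = u.size
    Q = np.zeros((N, k + 1), dtype=complex)
    Hbar = np.zeros((k + 1, k), dtype=complex)
    Q[:, 0] = u / np.linalg.norm(u)
    for j in range(k):
        a = A @ Q[:, j]
        for i in range(j + 1):                 # modified Gram-Schmidt sweep
            Hbar[i, j] = np.vdot(Q[:, i], a)   # (q_i, a) = q_i^* a
            a = a - Hbar[i, j] * Q[:, i]
        Hbar[j + 1, j] = np.linalg.norm(a)
        Q[:, j + 1] = a / Hbar[j + 1, j]       # assumes h_{j+1,j} > 0
    return Q, Hbar

rng = np.random.default_rng(0)
N, k = 40, 6
A = rng.standard_normal((N, N))
u = rng.standard_normal(N)

Q, Hbar = arnoldi(A, u, k)
Qk, Hk = Q[:, :k], Hbar[:k, :]

# Ritz pairs: eigenpairs of H_k lifted by Q_k
ritz_values, Xi = np.linalg.eig(Hk)
ritz_vectors = Qk @ Xi

# Residuals A x - mu x, one column per Ritz pair
residuals = A @ ritz_vectors - ritz_vectors * ritz_values
print(np.linalg.norm(Qk.conj().T @ residuals))   # orthogonal to the Krylov subspace
```

The last line checks the defining property of the projection method: each residual is orthogonal to the left subspace, which here coincides with the right subspace.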

Remark: From the algorithm described above, it is clear that, whether we want to determine the $k$ Ritz values only or the $k$ complete Ritz pairs, we need to store the matrix $Q_k$ (equivalently, the vectors $q_1, \ldots, q_k$) in the core memory throughout the computation. When $N$, the dimension of $A$, is very large, we can work only with a fixed k |μ2 |, and $u_0$ contains the eigenvector $v_1$ corresponding to $\mu_1$ in its spectral decomposition, then

$$\rho(A; u_m) - \mu_1 = \begin{cases} O(|\mu_2/\mu_1|^{m}) & \text{as } m \to \infty \text{ if } A \text{ is not normal}, \\ O(|\mu_2/\mu_1|^{2m}) & \text{as } m \to \infty \text{ if } A \text{ is normal}, \end{cases}$$

while

$$c_m u_m - v_1 = O(|\mu_2/\mu_1|^{m}) \quad \text{as } m \to \infty \text{ in all cases},$$

where $c_m$ is some normalization scalar. Thus,

$$\lim_{m\to\infty} \rho(A; u_m) = \mu_1 \quad \text{and} \quad \lim_{m\to\infty} c_m u_m = v_1.$$

The power method can be used to find other special eigenvalues of a matrix $A$ as well:

• If $A$ is nonsingular and its smallest eigenvalue is required, then we can apply the power method to $A^{-1}$.
• If the eigenvalue that is closest to some scalar $a$ is required, then we can apply the power method to $(A - aI)^{-1}$. Here $a$ is called a shift, and this method is known as the inverse power method with a fixed shift. If $a$ varies from one power iteration to the next, the method is known as the inverse power method with a variable shift.

For a detailed and rigorous treatment of the Rayleigh quotient and its different uses as a power method and an inverse power method (with both fixed and variable shifts) for computing eigenpairs of general matrices, we refer the reader to Appendix G, where we also describe how the method can be implemented in a numerically stable fashion. Appendix G gives an independent rigorous treatment of the local convergence properties of the inverse power method with variable shifts, a subject not treated in most books on numerical linear algebra. For further developments, see the books mentioned in the beginning of this chapter.
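As a rough illustration of the variants just listed, the following NumPy sketch applies the Rayleigh quotient to power iterates of $A$ and of $(A - aI)^{-1}$ with a fixed shift. It is not the numerically stable implementation referred to in Appendix G: the function names, the normalization at every step, the use of a dense linear solve for the shifted system, and the symmetric test matrix are all assumptions made here.

```python
import numpy as np

def rayleigh(A, z):
    """Rayleigh quotient rho(A; z) = z^* A z / z^* z."""
    return np.vdot(z, A @ z) / np.vdot(z, z)

def power_method(A, u0, m):
    """m power iterations with A; returns (rho(A; u_m), normalized u_m)."""
    u = u0 / np.linalg.norm(u0)
    for _ in range(m):
        u = A @ u
        u = u / np.linalg.norm(u)          # normalization only prevents overflow
    return rayleigh(A, u), u

def inverse_power_method(A, u0, a, m):
    """m power iterations with (A - a I)^{-1}, fixed shift a."""
    M = A - a * np.eye(A.shape[0])
    u = u0 / np.linalg.norm(u0)
    for _ in range(m):
        u = np.linalg.solve(M, u)          # apply (A - a I)^{-1} without forming it
        u = u / np.linalg.norm(u)
    return rayleigh(A, u), u

# Symmetric test matrix with eigenvalues 1, 2, ..., 8
rng = np.random.default_rng(0)
U, _ = np.linalg.qr(rng.standard_normal((8, 8)))
A = U @ np.diag(np.arange(1.0, 9.0)) @ U.T
u0 = rng.standard_normal(8)

lam_max, _ = power_method(A, u0, m=200)                  # largest eigenvalue, 8
lam_near, _ = inverse_power_method(A, u0, a=3.2, m=30)   # eigenvalue closest to 3.2, i.e., 3
print(lam_max, lam_near)
```

Because the test matrix is normal, the Rayleigh quotient converges at the squared rate $|\mu_2/\mu_1|^{2m}$, which is why 200 iterations suffice even though $|\mu_2/\mu_1| = 7/8$.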

10.3.4 Optimality properties

We already know that the $k$ Ritz pairs of the method of Arnoldi satisfy Theorem 10.4, and that they are distinct. They also enjoy certain optimality properties, to which we now turn. The results in Theorem 10.6, which follows, are from Sidi [270], and they are valid for arbitrary matrices $A$. The proof for Hermitian $A$ is given in Parlett [208, Chapter 12, pp. 239–240]. We will be using the notation of Theorems 10.3 and 10.4.

Theorem 10.6. Denote the set of monic polynomials of degree $k$ by $\mathcal{P}_k$.

1. The monic polynomial $S(z) = \prod_{j=1}^{k}(z - \tilde{\mu}_j) = \sum_{j=0}^{k-1} c_j z^j + z^k$ in (10.20) is the unique solution to the optimization problem

$$\|S(A)u\| = \min_{f \in \mathcal{P}_k} \|f(A)u\|. \tag{10.39}$$

2. Consequently, the Ritz vectors $\tilde{x}_i$, normalized as in Theorem 10.4, satisfy

$$\|r(\tilde{x}_i)\| = \min_{f \in \mathcal{P}_k} \|f(A)u\| \quad \forall\, i \tag{10.40}$$

and also

$$\|r(\tilde{x}_i)\| = \min_{\mu \in \mathbb{C}} \|(A - \mu I)\tilde{x}_i\|. \tag{10.41}$$

3. Therefore, $\tilde{\mu}_i$ and $\tilde{x}_i$ are also related via

$$\tilde{\mu}_i = \frac{\tilde{x}_i^* A \tilde{x}_i}{\tilde{x}_i^* \tilde{x}_i} = \rho(A; \tilde{x}_i), \tag{10.42}$$

where $\rho(A; z) \equiv z^*Az / z^*z$ is the Rayleigh quotient. Therefore, just like the eigenvalues of $A$, the Ritz values of $A$ are in the field of values $\mathcal{F}(A)$ of $A$,⁴⁰ which is a compact and convex set of the complex plane.

4. Finally, when $k = k_0$, where $k_0$ is the degree of the minimal polynomial of $A$ with respect to $u$, the Krylov subspace $\mathcal{K}_{k_0}(A; u)$ is an invariant subspace of $A$, and $(\tilde{\mu}_i, \tilde{x}_i)$, $i = 1, \ldots, k_0$, are exact eigenpairs of $A$.

Proof. To prove part 1, we start with (10.22) and (10.23), namely,

$$r(\tilde{x}_i) = S(A)u \quad \text{and} \quad (z_i, S(A)u) = 0, \quad i = 1, \ldots, k. \tag{10.43}$$

⁴⁰ See Appendix G.

Now, for the method of Arnoldi, $z_i = y_i = A^{i-1}u$. Letting $u_i = A^i u$ for convenience of notation, we have

$$S(A)u = \sum_{j=0}^{k-1} c_j A^j u + A^k u = \sum_{j=0}^{k-1} c_j u_j + u_k.$$

We can now rewrite (10.43) as

$$\Bigl(u_i, \sum_{j=0}^{k-1} c_j u_j + u_k\Bigr) = 0, \quad i = 0, 1, \ldots, k - 1,$$

from which we have

$$\sum_{j=0}^{k-1} (u_i, u_j)\, c_j = -(u_i, u_k), \quad i = 0, 1, \ldots, k - 1,$$

and these are the normal equations that result from the least-squares problem

$$\min_{c_0, c_1, \ldots, c_{k-1}} \Bigl\| \sum_{j=0}^{k-1} c_j u_j + u_k \Bigr\|. \tag{10.44}$$

Since the vectors $u_j$, $0 \le j \le k - 1$, are linearly independent, a unique solution for $c_0, c_1, \ldots, c_{k-1}$ exists. Letting $u_j = A^j u$ in these equations, we obtain (10.39). The rest of the proof follows from part 1. We leave the details to the interested reader.

Remark: Following the work of Arnoldi, the equivalent form given in (10.44) was suggested in a paper by Erdélyi [82], in the book by Wilkinson [344, pp. 583–584], and in the papers by Manteuffel [180] and Sidi and Bridger [294]. The equivalence of the different approaches with the method of Arnoldi was noted in Sidi [270].

The following theorem extends that given in Kahan [152] (see also Parlett [208]) for Hermitian $A$ to arbitrary $A$. It is the generalization of the Rayleigh quotient (for the case $k = 1$) to the case $k > 1$.

Theorem 10.7. For simplicity, denote $Q_k$ and $H_k$ by $Q$ and $H$, respectively, and define

$$R(B) = AQ - QB, \qquad B \in \mathbb{C}^{k\times k} \text{ arbitrary}.$$

Then

$$\|R(H)\| = \min_{B} \|R(B)\|.$$

Proof. We start by recalling that $Q^*Q = I_k$ and $\|G\| = \sqrt{\lambda_{\max}(G^*G)}$, where $\lambda_{\max}(K)$ denotes the largest (in modulus) eigenvalue of the matrix $K$. Writing $B = H + (B - H)$, we first have $R(B) = R(H) - Q(B - H)$. Therefore,

$$R(B)^*R(B) = R(H)^*R(H) + (B - H)^*(B - H) - C - C^*, \qquad C = [Q(B - H)]^* R(H).$$

By $H = Q^*AQ$,

$$C = (B - H)^* Q^* (AQ - QH) = (B - H)^* (Q^*AQ - H) = O.$$

Therefore,

$$R(B)^*R(B) = R(H)^*R(H) + (B - H)^*(B - H).$$

Since both $R(H)^*R(H)$ and $(B - H)^*(B - H)$ are Hermitian positive semidefinite, we have

$$\|R(B)\|^2 = \lambda_{\max}(R(B)^*R(B)) \ge \lambda_{\max}(R(H)^*R(H)) = \|R(H)\|^2 \quad \forall\, B \in \mathbb{C}^{k\times k}.$$

This completes the proof.

The matrix $R(H)$ is also directly related to the residual vectors of the $\tilde{x}_i$. This is the subject of the next theorem, whose proof we leave to the reader.

Theorem 10.8. With $\tilde{x}_i = Q\xi_i$, $\|\tilde{x}_i\| = 1$, $i = 1, \ldots, k$, we have

$$r(\tilde{x}_i) = R(H)\xi_i \quad \Rightarrow \quad \|r(\tilde{x}_i)\| \le \|R(H)\|.$$
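Theorems 10.7 and 10.8 can be checked numerically with a few lines of NumPy. The sketch below uses an arbitrary matrix $Q$ with orthonormal columns (the proof of Theorem 10.7 uses only $Q^*Q = I_k$ and $H = Q^*AQ$) rather than the Arnoldi basis; the random test data and the particular perturbations $B = H + E$ are assumptions made for this illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
N, k = 30, 5
A = rng.standard_normal((N, N))

# Any Q with orthonormal columns and H = Q^* A Q
Q, _ = np.linalg.qr(rng.standard_normal((N, k)))
H = Q.T @ A @ Q

def R(B):
    """Residual matrix R(B) = A Q - Q B."""
    return A @ Q - Q @ B

norm_RH = np.linalg.norm(R(H), 2)

# Theorem 10.7: no other B gives a smaller spectral norm than B = H
trial_norms = [np.linalg.norm(R(H + 0.1 * rng.standard_normal((k, k))), 2)
               for _ in range(20)]
print(norm_RH, min(trial_norms))

# Theorem 10.8: residual norms of the unit-norm Ritz vectors are bounded by ||R(H)||
mu_t, Xi = np.linalg.eig(H)            # columns of Xi have unit Euclidean norm
ritz_res = [np.linalg.norm(A @ (Q @ Xi[:, i]) - mu_t[i] * (Q @ Xi[:, i]))
            for i in range(k)]
print(max(ritz_res))
```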

10.4 The method of Lanczos

10.4.1 Derivation of the method

In the method of Lanczos, we choose

$$\mathcal{Y} = \mathcal{K}_k(A; v) = \text{span}\{v, Av, A^2v, \ldots, A^{k-1}v\}$$

and

$$\mathcal{Z} = \mathcal{K}_k(A^*; w) = \text{span}\{w, A^*w, (A^*)^2w, \ldots, (A^*)^{k-1}w\}$$

for some $v, w$ in $\mathbb{C}^N$ such that $(w, v) \neq 0$. Recalling the method of Lanczos and BiCG for linear systems, we can achieve $Z^*Y = I_k$ by constructing bases for $\mathcal{Y}_k$ and $\mathcal{Z}_k$ that are mutually biorthogonal. We repeat the steps of this process for convenience. (Note that the vectors $v_j$ that we define below should not be confused with the $v_j$ in (10.24) and the $v_{ji}$ in (10.25).)

1. Pick $\alpha, \beta$ such that $\alpha\beta = (w, v)$. Set $v_1 = v/\alpha$ and $w_1 = w/\bar{\beta}$. [Thus, $(w_1, v_1) = 1$.] Also set $\gamma_1 = \delta_1 = 0$ and $v_0 = w_0 = 0$.
2. For $j = 1, 2, \ldots, k$ do
      Compute $\alpha_j = (w_j, Av_j)$.
      Compute $\tilde{v}_{j+1} = Av_j - \alpha_j v_j - \delta_j v_{j-1}$.
      Compute $\tilde{w}_{j+1} = A^*w_j - \bar{\alpha}_j w_j - \bar{\gamma}_j w_{j-1}$.
      Choose $\delta_{j+1}, \gamma_{j+1}$ such that $\delta_{j+1}\gamma_{j+1} = (\tilde{w}_{j+1}, \tilde{v}_{j+1})$.
      Set $v_{j+1} = \tilde{v}_{j+1}/\gamma_{j+1}$ and $w_{j+1} = \tilde{w}_{j+1}/\bar{\delta}_{j+1}$.
   end do (j)

Note that the process can be continued as long as $\tilde{v}_{k+1} \neq 0$, $\tilde{w}_{k+1} \neq 0$, and $(\tilde{w}_{k+1}, \tilde{v}_{k+1}) \neq 0$. Assuming that the process does not fail and we have computed $v_i, w_i$, $i = 1, \ldots, k + 1$, we have $(w_i, v_j) = \delta_{ij}$ and

$$Av_k = \delta_k v_{k-1} + \alpha_k v_k + \gamma_{k+1} v_{k+1}, \qquad A^*w_k = \bar{\gamma}_k w_{k-1} + \bar{\alpha}_k w_k + \bar{\delta}_{k+1} w_{k+1}. \tag{10.45}$$

Let

$$V_k = [\, v_1 \,|\, \cdots \,|\, v_k \,], \qquad W_k = [\, w_1 \,|\, \cdots \,|\, w_k \,]. \tag{10.46}$$

From (9.92), we have

$$AV_k = V_k T_k + \gamma_{k+1} v_{k+1} e_k^T \quad \text{and} \quad A^*W_k = W_k T_k^* + \bar{\delta}_{k+1} w_{k+1} e_k^T, \tag{10.47}$$

where $T_k$ is a tridiagonal matrix given by

$$T_k = \begin{bmatrix}
\alpha_1 & \delta_2 & & & \\
\gamma_2 & \alpha_2 & \delta_3 & & \\
 & \ddots & \ddots & \ddots & \\
 & & \gamma_{k-1} & \alpha_{k-1} & \delta_k \\
 & & & \gamma_k & \alpha_k
\end{bmatrix}. \tag{10.48}$$

Since $W_k^* V_k = I_k$ and $W_k^* v_{k+1} = 0$ by the biorthogonality of the $v_i$ and $w_j$, we also have

$$T_k = W_k^* A V_k. \tag{10.49}$$

Clearly, now $Z = W_k$ and $Y = V_k$ and hence $Z^*Y = W_k^* V_k = I_k$ and $B = T_k$ by (10.10). Hence,

$$T_k \xi = \tilde{\mu}\xi. \tag{10.50}$$

Thus, if $(\tilde{\mu}_i, \xi_i)$ are the eigenpairs of $T_k$, then $(\tilde{\mu}_i, \tilde{x}_i)$, where $\tilde{x}_i = V_k \xi_i$, are the Ritz pairs of $A$. In addition, we have the following analogue of Lemma 10.5 concerning the Ritz values $\tilde{\mu}_i$ for the method of Lanczos.

Lemma 10.9. Provided $k$ is less than the degree of the minimal polynomial of $A$ with respect to $u$, and provided the $\gamma_i$ in (10.48) are all nonzero, the Ritz values $\tilde{\mu}_i$ are all distinct.

The proof is the same as that of Lemma 10.5, since $T_k$ is an upper (and also lower) Hessenberg matrix.

Remarks:
1. From the algorithm described above, it is clear that (i) we need to store only the last three vectors $w_i$ at each stage of the computation, (ii) we need to store only the last three vectors $v_i$ if we want to determine the $k$ Ritz values only, and (iii) we need to store all $k$ vectors $v_1, \ldots, v_k$ if we want to compute the $k$ complete Ritz pairs.
2. The following facts concerning the breakdown of the Lanczos tridiagonalization if $\tilde{v}_{k+1} = 0$, $\tilde{w}_{k+1} = 0$, or $(\tilde{w}_{k+1}, \tilde{v}_{k+1}) = 0$ were mentioned in Section 9.8:
   (a) If $\tilde{v}_{k+1} = 0$ or $\tilde{w}_{k+1} = 0$, the process has found an invariant subspace of $A$ or $A^*$. This situation is referred to as regular termination. Let $T_k \xi_i = \tilde{\mu}_i \xi_i$ and $\eta_i^* T_k = \tilde{\mu}_i \eta_i^*$.
      • If $\tilde{v}_{k+1} = 0$, we have $AV_k = V_k T_k$, that is, $\mathcal{K}_k(A; v_1)$ is an invariant subspace of $A$; the (right) Ritz pairs $(\tilde{\mu}_i, V_k \xi_i)$ are exact (right) eigenpairs of $A$ in this case.
      • If $\tilde{w}_{k+1} = 0$, we have $W_k^* A = T_k W_k^*$, that is, $\mathcal{K}_k(A^*; w_1)$ is an invariant subspace of $A^*$; the (left) Ritz pairs $(\tilde{\mu}_i, W_k \eta_i)$ are exact (left) eigenpairs of $A$ in this case.
   (b) If $\tilde{v}_{k+1} \neq 0$ and $\tilde{w}_{k+1} \neq 0$, but $(\tilde{w}_{k+1}, \tilde{v}_{k+1}) = 0$, vectors $v_{k+1}$ and $w_{k+1}$ satisfying $(w_{k+1}, v_{k+1}) = 1$ do not exist. This situation is referred to as serious breakdown. Note that, at some later stage, there may exist nonzero vectors $v_{k+j}$ that are orthogonal to $\mathcal{K}_{k+j-1}(A^*; w_1)$ and $w_{k+j}$ that are orthogonal to $\mathcal{K}_{k+j-1}(A; v_1)$. Techniques that skip the stages at which the vectors $v_i$ and $w_i$ are not defined and go straight to the stages at which they are defined are known as look-ahead procedures. In this case, the matrix $T_k$ ceases to be tridiagonal and has a more complicated structure. For this case, we refer the reader to the relevant literature mentioned earlier.

10.4.2 Computation of residuals

As in the case of the method of Arnoldi, in the method of Lanczos, the residual vector $r(\tilde{x}) = A\tilde{x} - \tilde{\mu}\tilde{x}$ for the Ritz pair $(\tilde{\mu}, \tilde{x})$, $\|\tilde{x}\| = 1$, and its $l_2$-norm can be expressed in terms of already computed quantities. With the help of (10.47)–(10.50), we have

$$A\tilde{x} = AV_k\xi = V_k T_k \xi + \gamma_{k+1} v_{k+1} e_k^T \xi = \tilde{\mu} V_k \xi + \gamma_{k+1}(e_k^T \xi) v_{k+1} = \tilde{\mu}\tilde{x} + \gamma_{k+1}\xi_k v_{k+1}.$$

Here, $\xi_k$ is the $k$th component of the vector $\xi$. As a result,

$$r(\tilde{x}) = \gamma_{k+1}\xi_k v_{k+1} \quad \Rightarrow \quad \|r(\tilde{x})\| = |\gamma_{k+1}|\,|\xi_k|\,\|v_{k+1}\|.$$
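The biorthogonalization process and the residual formula above can be illustrated with the following NumPy sketch, restricted to real matrices and vectors so that complex conjugates play no role. The function name `two_sided_lanczos`, the particular scaling $\delta_{j+1} = \sqrt{|(\tilde{w}_{j+1}, \tilde{v}_{j+1})|}$, the breakdown threshold, and the mildly nonsymmetric test matrix are assumptions made for this illustration; no look-ahead is attempted.

```python
import numpy as np

def two_sided_lanczos(A, v, w, k):
    """Real two-sided Lanczos; returns V_k, W_k, T_k, gamma_{k+1}, v_{k+1}."""
    N = v.size
    V = np.zeros((N, k + 2))            # column j holds v_j; column 0 is v_0 = 0
    W = np.zeros((N, k + 2))
    alpha = np.zeros(k + 1)
    gamma = np.zeros(k + 2)             # gamma[1] = 0
    delta = np.zeros(k + 2)             # delta[1] = 0
    s = w @ v
    a0 = np.sqrt(abs(s))
    b0 = s / a0                         # a0 * b0 = (w, v)
    V[:, 1] = v / a0
    W[:, 1] = w / b0
    for j in range(1, k + 1):
        Av = A @ V[:, j]
        alpha[j] = W[:, j] @ Av
        vt = Av - alpha[j] * V[:, j] - delta[j] * V[:, j - 1]
        wt = A.T @ W[:, j] - alpha[j] * W[:, j] - gamma[j] * W[:, j - 1]
        s = wt @ vt
        if abs(s) < 1e-14:              # regular termination or serious breakdown
            raise RuntimeError("Lanczos breakdown")
        delta[j + 1] = np.sqrt(abs(s))
        gamma[j + 1] = s / delta[j + 1]     # delta * gamma = (wt, vt)
        V[:, j + 1] = vt / gamma[j + 1]
        W[:, j + 1] = wt / delta[j + 1]
    T = (np.diag(alpha[1:k + 1])
         + np.diag(delta[2:k + 1], 1)       # superdiagonal delta_2, ..., delta_k
         + np.diag(gamma[2:k + 1], -1))     # subdiagonal gamma_2, ..., gamma_k
    return V[:, 1:k + 1], W[:, 1:k + 1], T, gamma[k + 1], V[:, k + 1]

rng = np.random.default_rng(2)
N, k = 20, 5
S = rng.standard_normal((N, N))
A = (S + S.T) / 2 + 0.1 * rng.standard_normal((N, N))   # mildly nonsymmetric
v = rng.standard_normal(N)
w = v + 0.1 * rng.standard_normal(N)

Vk, Wk, Tk, gamma_next, v_next = two_sided_lanczos(A, v, w, k)

# Residual of each Ritz pair versus the formula gamma_{k+1} xi_k v_{k+1}
mu_t, Xi = np.linalg.eig(Tk)
res_direct = A @ (Vk @ Xi) - (Vk @ Xi) * mu_t
res_formula = gamma_next * np.outer(v_next, Xi[-1, :])
print(np.linalg.norm(res_direct - res_formula))
```

The residual identity holds for any scaling of $\xi$; the normalization $\|\tilde{x}\| = 1$ matters only when residual norms of different Ritz pairs are compared.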

10.5 The case of Hermitian A

10.5.1 Arnoldi = Lanczos for Hermitian A

If $A$ is Hermitian and $w = v = u$, we have

$$\mathcal{Y} = \mathcal{Z} = \mathcal{K}_k(A; u) = \text{span}\{u, Au, \ldots, A^{k-1}u\}$$

for both the method of Lanczos and the method of Arnoldi. That is, the two methods become equivalent mathematically. Furthermore, when we choose $\gamma_j = \delta_j = \sqrt{(\tilde{w}_j, \tilde{v}_j)}$ in the Lanczos process, we have $w_j = v_j$; hence $W_k = V_k$, and both matrices are the same as the unitary matrix $Q_k$ produced by the method of Arnoldi. Thus, the algorithms for both methods produce the same matrix $B$ defined in (10.10), this matrix being simply $T_k = V_k^* A V_k$, with

$$B = H_k = T_k = \begin{bmatrix}
\alpha_1 & \delta_2 & & & \\
\delta_2 & \alpha_2 & \delta_3 & & \\
 & \ddots & \ddots & \ddots & \\
 & & \delta_{k-1} & \alpha_{k-1} & \delta_k \\
 & & & \delta_k & \alpha_k
\end{bmatrix}, \qquad \alpha_j \text{ real}, \quad \delta_j > 0. \tag{10.51}$$

Thus, the recursion relation in (10.45) becomes

$$Av_k = \delta_k v_{k-1} + \alpha_k v_k + \delta_{k+1} v_{k+1}, \tag{10.52}$$

while (10.47) becomes

$$AV_k = V_k T_k + \delta_{k+1} v_{k+1} e_k^T. \tag{10.53}$$

As for the residual vector $r(\tilde{x})$ of the Ritz pair $(\tilde{\mu}, \tilde{x})$, we have

$$r(\tilde{x}) = \delta_{k+1}\xi_k v_{k+1} \quad \Rightarrow \quad \|r(\tilde{x})\| = \delta_{k+1}|\xi_k|. \tag{10.54}$$

The Lanczos process can now be simplified and results in the Hermitian Lanczos process. Here are the steps of this process:

1. Choose a vector $v_1$, $\|v_1\| = 1$. Also set $\delta_1 = 0$ and $v_0 = 0$.
2. For $j = 1, 2, \ldots, k$ do
      Compute $\alpha_j = (v_j, Av_j)$.
      Compute $\tilde{v}_{j+1} = Av_j - \alpha_j v_j - \delta_j v_{j-1}$.
      Compute $\delta_{j+1} = \|\tilde{v}_{j+1}\|$ and set $v_{j+1} = \tilde{v}_{j+1}/\delta_{j+1}$.
   end do (j)

Since $A$ is Hermitian, so is $T_k$. Therefore, just like $A$, $T_k$ has only real eigenvalues, which are approximations to $k$ of the eigenvalues of $A$. Good approximations are obtained from $T_k$ for the largest and smallest eigenvalues. We turn to the analysis of this case in the following.
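The Hermitian Lanczos process can be sketched in NumPy as follows, here for a real symmetric matrix. The function name `hermitian_lanczos`, the storage of all Lanczos vectors, the absence of reorthogonalization, and the test matrix (eigenvalues in [0, 1] plus an isolated eigenvalue 3) are assumptions made for this illustration.

```python
import numpy as np

def hermitian_lanczos(A, v1, k):
    """Real symmetric Lanczos; returns V (N x (k+1)), T_k, and delta_{k+1}."""
    N = v1.size
    V = np.zeros((N, k + 1))
    alpha = np.zeros(k)
    delta = np.zeros(k + 1)             # delta[j] holds delta_{j+1}; delta[0] = delta_1 = 0
    V[:, 0] = v1 / np.linalg.norm(v1)
    v_prev = np.zeros(N)                # v_0 = 0
    for j in range(k):
        Av = A @ V[:, j]
        alpha[j] = V[:, j] @ Av
        vt = Av - alpha[j] * V[:, j] - delta[j] * v_prev
        delta[j + 1] = np.linalg.norm(vt)
        V[:, j + 1] = vt / delta[j + 1]     # assumes no breakdown
        v_prev = V[:, j]
    T = np.diag(alpha) + np.diag(delta[1:k], 1) + np.diag(delta[1:k], -1)
    return V, T, delta[k]

rng = np.random.default_rng(0)
N, k = 50, 10
eigs = np.concatenate([np.linspace(0.0, 1.0, N - 1), [3.0]])
U, _ = np.linalg.qr(rng.standard_normal((N, N)))
A = U @ np.diag(eigs) @ U.T
A = (A + A.T) / 2                       # enforce exact symmetry

V, T, delta_next = hermitian_lanczos(A, rng.standard_normal(N), k)
ritz, Xi = np.linalg.eigh(T)            # Ritz values in ascending order
X = V[:, :k] @ Xi                       # Ritz vectors

# Residual norms computed directly and via (10.54)
res_direct = np.linalg.norm(A @ X - X * ritz, axis=0)
res_formula = delta_next * np.abs(Xi[-1, :])
print(ritz[-1], res_direct[-1], res_formula[-1])
```

With an isolated largest eigenvalue, the largest Ritz value is already accurate to many digits after ten steps, which is consistent with the remark that the extreme eigenvalues are approximated best.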

10.5.2 Localization of Ritz values

We now present two different theorems that give intervals in which the Ritz values reside when the matrix $A$ is Hermitian. For simplicity of notation, in what follows we denote $H_k$ and $Q_k$ (in the method of Arnoldi) by $H$ and $Q$, respectively. Of course, we remember that $H_k = T_k$ and is real symmetric and tridiagonal and $Q_k = V_k$ is unitary, as mentioned above.

Let us form the $N \times N$ unitary matrix $U$ by appending $N - k$ orthonormal vectors to $Q$, namely, $U = [\,Q \,|\, \widetilde{Q}\,]$. Then

$$\widetilde{A} = U^*AU = \begin{bmatrix} H & C \\ C^* & \widetilde{H} \end{bmatrix}, \qquad H = Q^*AQ, \quad \widetilde{H} = \widetilde{Q}^*A\widetilde{Q}, \quad C = Q^*A\widetilde{Q}.$$

Of course, $\widetilde{A}$ is Hermitian and similar to $A$; hence it has the same eigenvalues.

Theorem 10.10. Let us order the eigenvalues $\mu_i$ of $A$ and the eigenvalues $\tilde{\mu}_i$ of $H$ (the Ritz values of $A$), respectively, as

$$\mu_1 \ge \mu_2 \ge \cdots \ge \mu_N \quad \text{and} \quad \tilde{\mu}_1 \ge \tilde{\mu}_2 \ge \cdots \ge \tilde{\mu}_k. \tag{10.55}$$

Then

$$\mu_{N-k+i} \le \tilde{\mu}_i \le \mu_i, \quad i = 1, \ldots, k. \tag{10.56}$$

Proof. The result follows by applying the Cauchy interlace theorem⁴¹ $N - k$ times to $\widetilde{A}$, noting that $H$ is a $k \times k$ principal submatrix of $\widetilde{A}$.

As is clear, the bounds for $\tilde{\mu}_i$ given in Theorem 10.10 are of a general and qualitative nature; they depend only on $k$, whatever the matrix $H$. The next theorem, due to Kahan [152] (also proved in [208, pp. 219–220]), is more quantitative in nature as it depends on the matrix $H$.

Theorem 10.11. $k$ of the eigenvalues of $A$, namely, $\mu_{j_1}, \ldots, \mu_{j_k}$, can be put into one-to-one correspondence with the eigenvalues of $H$, namely, $\tilde{\mu}_1, \ldots, \tilde{\mu}_k$, such that

$$|\mu_{j_i} - \tilde{\mu}_i| \le \|R(H)\|, \quad i = 1, \ldots, k,$$

where $\|R(H)\| = \|AQ - QH\|$, as before.

Remark: Theorem 10.11 is valid for arbitrary unitary matrices $Q \in \mathbb{C}^{N\times k}$ as long as $H = Q^*AQ$.
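Both localization results can be checked on a small example. The sketch below uses a random real symmetric matrix and a random $Q$ with orthonormal columns (both theorems hold for any such $Q$ with $H = Q^*AQ$); the brute-force search over index assignments used to verify Theorem 10.11 is only practical for very small $N$ and $k$ and is an assumption of this illustration, not a method from the text.

```python
import numpy as np
from itertools import permutations

rng = np.random.default_rng(3)
N, k = 12, 4
S = rng.standard_normal((N, N))
A = (S + S.T) / 2                                   # real symmetric test matrix

Q, _ = np.linalg.qr(rng.standard_normal((N, k)))    # orthonormal columns
H = Q.T @ A @ Q

mu = np.sort(np.linalg.eigvalsh(A))[::-1]           # mu_1 >= ... >= mu_N
ritz = np.sort(np.linalg.eigvalsh(H))[::-1]         # Ritz values, descending

# Theorem 10.10: mu_{N-k+i} <= ritz_i <= mu_i (0-based indices below)
interlace_ok = all(mu[N - k + i] - 1e-12 <= ritz[i] <= mu[i] + 1e-12
                   for i in range(k))

# Theorem 10.11: some one-to-one assignment j_1, ..., j_k satisfies the bound
norm_R = np.linalg.norm(A @ Q - Q @ H, 2)
kahan_ok = any(all(abs(mu[j[i]] - ritz[i]) <= norm_R + 1e-12 for i in range(k))
               for j in permutations(range(N), k))
print(interlace_ok, kahan_ok, norm_R)
```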

10.5.3 Kaniel–Paige–Saad convergence theory

So far, we have seen, when dealing with Hermitian matrices, how eigenvalue estimates via Ritz values can be obtained by the methods of Lanczos and Arnoldi; at this moment, we do not know much about the rates of convergence of the Ritz values, however. This topic was first considered by Kaniel [154], who took advantage of the approximation power of Krylov subspaces and provided a priori error bounds for the $\tilde{\mu}_i$ using Chebyshev polynomials; these bounds show very convincingly that $\tilde{\mu}_i \to \mu_i$ and $\tilde{\mu}_{k-i+1} \to \mu_{N-i+1}$ as $k$ increases, for small $i = 1, 2, \ldots$, providing rates of convergence at the same time. Kaniel's theory was later reconsidered and improved by Paige [203] and by Saad [233]. For the following theorem, see [240] and [208].

⁴¹ Cauchy interlace theorem (see [208], for example): Let $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_s$ be the eigenvalues of the Hermitian $s \times s$ matrix $B$. Let $\theta_1 \ge \theta_2 \ge \cdots \ge \theta_{s-1}$ be the eigenvalues of any $(s - 1) \times (s - 1)$ principal submatrix of $B$. Then the $\theta_i$ interlace the $\lambda_i$; that is, $\lambda_{i+1} \le \theta_i \le \lambda_i$, $i = 1, \ldots, s - 1$.

Theorem 10.12. Let the eigenpairs $(\mu_i, x_i)$ of $A$ and the Ritz pairs $(\tilde{\mu}_i^{(k)}, \tilde{x}_i^{(k)})$ produced by the k-step Lanczos method be ordered such that (10.55) and (10.56) hold. Then, for $i = 1, 2, \ldots$,

$$\mu_i \ge \tilde{\mu}_i^{(k)} \ge \mu_i - (\mu_1 - \mu_N)\left[\frac{\kappa_i^{(k)} \tan\theta_i}{T_{k-i}(1 + 2\gamma_i)}\right]^2,$$

$$\theta_i = \angle(q_1, x_i), \qquad \gamma_i = \frac{\mu_i - \mu_{i+1}}{\mu_{i+1} - \mu_N},$$

$$\kappa_1^{(k)} = 1, \qquad \kappa_i^{(k)} = \prod_{j=1}^{i-1} \frac{\tilde{\mu}_j^{(k)} - \mu_N}{\tilde{\mu}_j^{(k)} - \mu_i}, \quad i > 1, \tag{10.57}$$

and

$$\mu_{N-i+1} \le \tilde{\mu}_{k-i+1}^{(k)} \le \mu_{N-i+1} + (\mu_1 - \mu_N)\left[\frac{\kappa_{-i}^{(k)} \tan\theta_{-i}}{T_{k-i}(1 + 2\gamma_{-i})}\right]^2,$$

$$\theta_{-i} = \angle(q_1, x_{N-i+1}), \qquad \gamma_{-i} = \frac{\mu_{N-i} - \mu_{N-i+1}}{\mu_1 - \mu_{N-i}},$$

$$\kappa_{-1}^{(k)} = 1, \qquad \kappa_{-i}^{(k)} = \prod_{j=1}^{i-1} \frac{\mu_1 - \tilde{\mu}_{k-j+1}^{(k)}}{\mu_{N-i+1} - \tilde{\mu}_{k-j+1}^{(k)}}, \quad i > 1. \tag{10.58}$$

Remarks:
1. Since $\gamma_{\pm i} > 0$ (so that $1 + 2\gamma_{\pm i} > 1$), the Chebyshev polynomials $T_{k-i}(1 + 2\gamma_{\pm i})$ increase exponentially in $k$ as $k$ increases; therefore, $\tilde{\mu}_i^{(k)} \to \mu_i$ and $\tilde{\mu}_{k-i+1}^{(k)} \to \mu_{N-i+1}$, the respective errors tending to zero exponentially in $k$ as $k$ increases. In particular, $\tilde{\mu}_1^{(k)} \to \mu_1$ and $\tilde{\mu}_k^{(k)} \to \mu_N$.
2. Note that once the proof of (10.57) on $\tilde{\mu}_i^{(k)} \to \mu_i$ is achieved, the proof of (10.58) on $\tilde{\mu}_{k-i+1}^{(k)} \to \mu_{N-i+1}$ is achieved by applying (10.57) to the matrix $-A$.
3. From the error bounds in (10.57) and (10.58), it is clear that the best Ritz values are obtained for $\mu_1$ and $\mu_N$, followed by $\mu_2$ and $\mu_{N-1}$, and so on. For $\mu_1$ and $\mu_N$, for example, we have

$$\mu_1 \ge \tilde{\mu}_1^{(k)} \ge \mu_1 - (\mu_1 - \mu_N)\left[\frac{\tan\theta_1}{T_{k-1}(1 + 2\gamma_1)}\right]^2,$$

$$\mu_N \le \tilde{\mu}_k^{(k)} \le \mu_N + (\mu_1 - \mu_N)\left[\frac{\tan\theta_{-1}}{T_{k-1}(1 + 2\gamma_{-1})}\right]^2.$$
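The bound for $\mu_1$ in Remark 3 can be evaluated for a small example and compared with the actual largest Ritz value. The sketch below is an illustration under stated assumptions: the test matrix is diagonal with a prescribed spectrum, the orthonormal basis of the Krylov subspace is obtained by a QR factorization of the Krylov matrix (adequate only for small $k$, and not how the Lanczos process computes it), and the helper name `cheb` is hypothetical.

```python
import numpy as np

def cheb(n, x):
    """Chebyshev polynomial T_n(x) for x >= 1, via the hyperbolic formula."""
    return np.cosh(n * np.arccosh(x))

N, k = 40, 8
mu = np.linspace(0.0, 1.0, N)[::-1].copy()   # descending order
mu[0] = 1.5                                  # isolated largest eigenvalue mu_1
A = np.diag(mu)                              # eigenvectors are the unit vectors

q1 = np.ones(N) / np.sqrt(N)                 # starting vector

# Orthonormal basis of K_k(A; q1) from the Krylov matrix [q1, A q1, ..., A^{k-1} q1]
K = np.column_stack([mu**j * q1 for j in range(k)])
Q, _ = np.linalg.qr(K)
ritz = np.sort(np.linalg.eigvalsh(Q.T @ A @ Q))[::-1]

# Quantities in the bound for mu_1
cos_theta = abs(q1[0])                       # angle between q1 and x_1 = e_1
tan_theta = np.sqrt(1.0 - cos_theta**2) / cos_theta
gamma1 = (mu[0] - mu[1]) / (mu[1] - mu[-1])
bound = (mu[0] - mu[-1]) * (tan_theta / cheb(k - 1, 1.0 + 2.0 * gamma1))**2

print("error in largest Ritz value:", mu[0] - ritz[0])
print("Kaniel-Paige-Saad bound:    ", bound)
```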

10.6 Methods of Arnoldi and Lanczos for eigenvalues with special properties In Subsection 10.3.3, we very briefly described the Rayleigh quotient power method, while in Appendix G, we discuss this method in detail. From these discussions, we

10.6. Methods of Arnoldi and Lanczos for eigenvalues with special properties

251

see that, starting with a vector u, this method can be applied to the power iteration sequences {u m } with (i) u m = Am u to obtain the largest eigenvalue of A, (ii) u m = A−m u to obtain the smallest eigenvalue of A when A is nonsingular, and (iii) u m = (A −aI )−m u to obtain the eigenvalue of A closest to the scalar a in the complex plane, and to obtain their corresponding eigenvectors. What is important to realize here is that, in all three cases, the power iteration vectors u m get richer in the direction of the required eigenvectors as m increases. Our purpose here is to modify and generalize this approach to find one or more eigenvalues that have a specified special property, along with their corresponding eigenvectors, via the methods of Arnoldi and Lanczos. Following Sidi [273],42 we assume that, corresponding to the special property considered, there exists a scalar function ψ(z), ψ :  → , analytic on a set in  containing the eigenvalues of A, such that the eigenvalues satisfying this special property maximize |ψ(z)|. If we order the eigenvalues μi of A such that |ψ(μ1 )| ≥ |ψ(μ2 )| ≥ |ψ(μ3 )| ≥ · · · ,

(10.59)

then we are interested in finding $\mu_1, \mu_2, \ldots$ in this order. Hence, our task can be reformulated to read as follows: Given the function $\psi : \mathbb{C} \to \mathbb{C}$ and the ordering of the $\mu_i$ in (10.59), find $\mu_1, \mu_2, \ldots, \mu_k$ for a given integer $k$.

Going back to the above examples, we see that the most obvious candidates for $\psi(z)$ can be as follows:

1. For eigenvalues that are largest in modulus, $\psi(z) = z$. (This is what is done in Sidi [270].)
2. For eigenvalues that are smallest in modulus, $\psi(z) = z^{-1}$.
3. For eigenvalues closest to some scalar $a$, $\psi(z) = (z - a)^{-1}$.
4. For eigenvalues with largest real parts, $\psi(z) = \exp(z)$.
5. For eigenvalues in a set $\Omega$ of the complex plane, pick $\psi(z)$ to be, for example, a polynomial whose modulus assumes its largest values on $\Omega$ and is relatively small on the rest of the spectrum of $A$. [The behavior of $\psi(z)$ outside the spectrum of $A$ is immaterial.]$^{43}$

Remarks:

1. Note that if $(\mu_i, \mathbf{x}_i)$ are eigenpairs of $A$, then $(\psi(\mu_i), \mathbf{x}_i)$ are eigenpairs of $\psi(A)$.
2. The function $\psi(z)$ enters the picture through the computation of vectors of the form $\psi(A)\mathbf{w}$. The computation of the matrix $\psi(A)$, which may be a prohibitively expensive task even in the simplest cases, is not necessary for this purpose. The vectors $\psi(A)\mathbf{w}$ can be computed exactly if $\psi(z)$ is a polynomial. If $\psi(z)$ is not a polynomial but is approximated by a polynomial $\tilde{\psi}(z)$, then $\psi(A)\mathbf{w}$ can be approximated by $\tilde{\psi}(A)\mathbf{w}$. [In Section 14.6, we show that $\psi(A) = p(A)$, $p(z)$ being a polynomial; therefore, $\psi(A)\mathbf{w}$ is a matrix-vector product.]

$^{42}$ The paper [273] treats the simultaneous iteration method in addition to the methods of Lanczos and Arnoldi. As it is outside the main theme of this chapter, we have not included the simultaneous iteration method in our study here.

$^{43}$ Assume, for example, that the spectrum of $A$ is known to be real and in $[a, b]$. (i) If we are interested in the eigenvalues that are closest to the middle of $[a, b]$, then we can choose $\psi(z) = (z - a)(z - b)$. With this choice, $|\psi(z)|$ on $[a, b]$ is maximal at $z = (a + b)/2$. (ii) If we are interested in the eigenvalues that are closest to the endpoints $a$ and $b$, then we can choose $\psi(z) = (z - a)(z - b) + c$ with $c = (b - a)^2/4$. With this choice, $|\psi(z)|$ on $[a, b]$ is maximal at $z = a$ and $z = b$.


3. The common use of the methods of Arnoldi and Lanczos involves right Krylov subspaces $\mathcal{K}_k(A; \mathbf{u}_0)$. We now wish to apply these methods with right Krylov subspaces $\mathcal{K}_k(A; [\psi(A)]^n\mathbf{u}_0)$ with $n > 0$. Thus, our application of these methods is analogous to the most common application of the power method when we increase $n$. By increasing $n$, we are able to increase the accuracy of the required Ritz pairs while keeping the dimension $k$ (hence the storage requirements) small and fixed. As such, the approach we present here is indeed a convergence acceleration procedure for the Arnoldi and Lanczos methods when $k$ is held fixed. It also has an elegant convergence theory.
4. Numerical experience suggests that if we obtain a certain accuracy by applying the methods of Lanczos and Arnoldi in the common way with right Krylov subspace $\mathcal{K}_k(A; \mathbf{u}_0)$, we can obtain the same accuracy by applying the methods with right Krylov subspace $\mathcal{K}_{k'}(A; [\psi(A)]^n\mathbf{u}_0)$ with sufficiently large $n$ but with $k' < k$, that is, with smaller storage requirements. See the numerical examples in [270] and [273].
5. In what follows, we will restrict our attention to diagonalizable matrices. In Theorems 10.13–10.16, we will also assume that the $\psi(\mu_j)$ satisfy $|\psi(\mu_k)| > |\psi(\mu_{k+1})|$, in addition to (10.59). These theorems show that the best Ritz value is obtained for $\mu_1$, followed by $\mu_2$, and so on.

Let us choose a nonzero vector $\mathbf{u}_0$ and a positive integer $n$ and generate $\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n$ via

\[ \mathbf{u}_{m+1} = \psi(A)\mathbf{u}_m, \quad m = 0, 1, \ldots. \tag{10.60} \]

Next, let us apply the $k$-stage methods of Arnoldi and Lanczos with the matrix $A$ [and not with $\psi(A)$], starting with the vector $\mathbf{u}_n$. Then the relevant right and left subspaces for these methods are

\[ \text{right subspace} = \mathcal{K}_k(A; \mathbf{u}_n) = \text{left subspace} \quad \text{for the Arnoldi method}, \]

\[ \text{right subspace} = \mathcal{K}_k(A; \mathbf{u}_n), \quad \text{left subspace} = \mathcal{K}_k(A^*; \mathbf{u}_n) \quad \text{for the Lanczos method.}^{44} \]

From here on, we follow the developments of Section 10.2.

Remark: Before we go on, however, we need to make one more comment concerning the practical use of the method we have just described: If we generate the vectors $\mathbf{u}_m$ in floating-point arithmetic exactly as in (10.60), we may run into overflows when $|\psi(\lambda_1)| > 1$ and underflows when $|\psi(\lambda_1)| < 1$. To avoid these problems, the vectors $\mathbf{u}_m$ should always be normalized to have unit length as follows:

1. Choose an initial vector $\mathbf{u}_0$, $\|\mathbf{u}_0\| = 1$.
2. For $m = 0, 1, \ldots$ do
   Compute $\tilde{\mathbf{u}}_{m+1} = \psi(A)\mathbf{u}_m$.
   Set $\mathbf{u}_{m+1} = \tilde{\mathbf{u}}_{m+1} / \|\tilde{\mathbf{u}}_{m+1}\|$.
   end do ($m$)

$^{44}$ This amounts to taking $\mathbf{w} = \mathbf{v}$ in the method of Lanczos, as defined at the beginning of Section 10.4.
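The normalized iteration in steps 1 and 2 can be sketched in a few lines of Python. This is only an illustrative sketch under assumptions that are not part of the text: $\psi(z) = z$, the Euclidean norm, a small symmetric test matrix chosen here for illustration, and $k = 1$, in which case the single Ritz value of the Arnoldi method reduces to the Rayleigh quotient $(\mathbf{u}_n, A\mathbf{u}_n)/(\mathbf{u}_n, \mathbf{u}_n)$. All function names are hypothetical.

```python
import math

def matvec(A, x):
    """Dense matrix-vector product A x for a small list-of-lists matrix."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def norm2(x):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(xi * xi for xi in x))

def normalized_iterates(A, u0, n):
    """Return u_n from u_{m+1} = psi(A) u_m / ||psi(A) u_m||, assuming psi(z) = z."""
    nrm = norm2(u0)
    u = [ui / nrm for ui in u0]
    for _ in range(n):
        w = matvec(A, u)              # unnormalized vector psi(A) u_m
        nrm = norm2(w)
        u = [wi / nrm for wi in w]    # rescale to unit length
    return u

def ritz_value_k1(A, u):
    """Rayleigh quotient (u, A u) / (u, u): the single Ritz value when k = 1."""
    Au = matvec(A, u)
    return sum(a * b for a, b in zip(u, Au)) / sum(a * a for a in u)

# Illustrative matrix (an assumption): symmetric, with eigenvalues 3 and 1.
A = [[2.0, 1.0], [1.0, 2.0]]
u0 = [1.0, 0.0]
for n in (0, 5, 20):
    un = normalized_iterates(A, u0, n)
    print(n, ritz_value_k1(A, un))
```

With $k$ held fixed at 1, the printed Ritz value moves toward the eigenvalue of largest modulus as $n$ grows, while the iterates keep unit length, so neither overflow nor underflow can occur.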


The Krylov subspaces $\mathcal{K}_k(A; \mathbf{u}_n)$ and $\mathcal{K}_k(A^*; \mathbf{u}_n)$ remain the same under this mode of computation of the $\mathbf{u}_m$, which means that the methods produce the same Ritz pairs.

We now go back to the convergence analysis of the methods as $n$ is increased. What we want is to obtain the asymptotic behavior as $n \to \infty$ of $\det T(z)$, with $T(z)$ as in (10.15). Letting $t_{r+1,s+1} = u_{r,s}$ in (10.15), we now have

\[ \det T(z) = \begin{vmatrix} z^0 & z^1 & \cdots & z^k \\ u_{0,0} & u_{0,1} & \cdots & u_{0,k} \\ u_{1,0} & u_{1,1} & \cdots & u_{1,k} \\ \vdots & \vdots & & \vdots \\ u_{k-1,0} & u_{k-1,1} & \cdots & u_{k-1,k} \end{vmatrix} \equiv H_{n,k}(z), \tag{10.61} \]

with

\[ u_{r,s} = (A^r\mathbf{u}_n, A^s\mathbf{u}_n) \quad \text{for the Arnoldi method}, \tag{10.62} \]

\[ u_{r,s} = ((A^*)^r\mathbf{u}_n, A^s\mathbf{u}_n) = (\mathbf{u}_n, A^{r+s}\mathbf{u}_n) \quad \text{for the Lanczos method.} \tag{10.63} \]

To find the asymptotic behavior of $H_{n,k}(z) \equiv \det T(z)$, we need the asymptotic behavior of the $u_{r,s}$. Now, as we have already encountered in various places, the vectors $\mathbf{u}_m$ are necessarily of the form

\[ \mathbf{u}_m = [\psi(A)]^m\mathbf{u}_0 = \sum_{j=1}^{p} \mathbf{v}_j[\psi(\lambda_j)]^m, \tag{10.64} \]

where $\lambda_j$ are some or all of the distinct eigenvalues of $A$ for which $\psi(\lambda_j) \neq 0$ and $\mathbf{v}_j$ are corresponding eigenvectors. Of course, the $\mathbf{v}_j$ are linearly independent.

Below, we present the convergence theorems for Ritz pairs of all diagonalizable matrices. The convergence of Ritz pairs was originally considered in [270], with $\psi(z) = z$, for all matrices, whether diagonalizable or not, and in [273], for normal matrices, with arbitrary $\psi(z)$. Thus, the next convergence theorems pertaining to Ritz pairs of nonnormal diagonalizable matrices, with arbitrary $\psi(z)$, are new.

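10.6.1 Ritz values for nonnormal diagonalizable matrices

If $A$ is an arbitrary diagonalizable matrix, we have

\[ u_{r,s} = \Bigl( \sum_{i=1}^{p} \mathbf{v}_i[\psi(\lambda_i)]^n\lambda_i^r,\ \sum_{j=1}^{p} \mathbf{v}_j[\psi(\lambda_j)]^n\lambda_j^s \Bigr) = \sum_{i=1}^{p}\sum_{j=1}^{p} (\mathbf{v}_i, \mathbf{v}_j)\,\overline{\lambda}_i^{\,r}\lambda_j^s\,\overline{[\psi(\lambda_i)]^n}\,[\psi(\lambda_j)]^n \quad \text{for the Arnoldi method}, \tag{10.65} \]

\[ u_{r,s} = \Bigl( \sum_{i=1}^{p} \mathbf{v}_i[\psi(\lambda_i)]^n,\ \sum_{j=1}^{p} \mathbf{v}_j[\psi(\lambda_j)]^n\lambda_j^{r+s} \Bigr) = \sum_{i=1}^{p}\sum_{j=1}^{p} (\mathbf{v}_i, \mathbf{v}_j)\,\lambda_j^{r+s}\,\overline{[\psi(\lambda_i)]^n}\,[\psi(\lambda_j)]^n \quad \text{for the Lanczos method.} \tag{10.66} \]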


Clearly, both (10.65) and (10.66) serve as true asymptotic expansions of the respective $u_{r,s}$ as $n \to \infty$. We can unify the treatment of both methods by rewriting (10.65) and (10.66) in the form

\[ u_{r,s} = \sum_{i=1}^{p}\sum_{j=1}^{p} v_{ij}\,\theta_i^r\lambda_j^s\,\overline{\psi}_i^{\,n}\psi_j^n, \tag{10.67} \]

where

\[ v_{ij} = (\mathbf{v}_i, \mathbf{v}_j), \quad \psi_i = \psi(\lambda_i), \quad \theta_i = \begin{cases} \overline{\lambda}_i & \text{for the Arnoldi method}, \\ \lambda_i & \text{for the Lanczos method.} \end{cases} \tag{10.68} \]

Comparing (i) $H_{n,k}(z)$ in (10.61) and $u_{r,s}$ in (10.67) with (ii) $H_{n,k}(z)$ in (6.86) and $w_{r+1,s}$ as in (6.80), we see that they are of exactly the same form: $v_{ij}$, $\theta_i^r$, $\lambda_j^s$, $\overline{\psi}_i^{\,n}$, and $\psi_j^n$ in (10.67) are the analogues of, respectively, $v_{ij}$, $\mu_i^r$, $\lambda_j^s$, $\mu_i^n$, and $\lambda_j^n$ in (6.80). Substituting (10.67) into (10.61), and using the technique of the proof of Theorem 6.14, we obtain the following result.

Theorem 10.13. Hn,k (z) ≡ det T (z) has the expansion Hn,k (z) =

1≤i1 0 for all i .


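which, upon invoking (10.64), becomes

\[ \mathbf{x}_s^{(n,k)} = \Bigl[ \prod_{\substack{j=1 \\ j \neq s}}^{k} (A - z_j^{(n,k)}I) \Bigr] \sum_{i=1}^{p} \mathbf{v}_i\psi_i^n = \sum_{i=1}^{p} \psi_i^n \Bigl[ \prod_{\substack{j=1 \\ j \neq s}}^{k} (A - z_j^{(n,k)}I) \Bigr] \mathbf{v}_i = \sum_{i=1}^{p} \psi_i^n \Bigl[ \prod_{\substack{j=1 \\ j \neq s}}^{k} (\lambda_i - z_j^{(n,k)}) \Bigr] \mathbf{v}_i = \sum_{i=1}^{p} \psi_i^n L_s^{(n,k)}(\lambda_i)\mathbf{v}_i, \]

where we have let

\[ L_s^{(n,k)}(\mu) = \prod_{\substack{j=1 \\ j \neq s}}^{k} (\mu - z_j^{(n,k)}). \]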

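Let us now split the last expression for $\mathbf{x}_s^{(n,k)}$ as

\[ \mathbf{x}_s^{(n,k)} = L_s^{(n,k)}(\lambda_s)\,\psi_s^n\,(\Sigma_1 + \Sigma_2 + \Sigma_3), \]

where

\[ \Sigma_1 = \mathbf{v}_s, \quad \Sigma_2 = \sum_{\substack{i=1 \\ i \neq s}}^{k} \Bigl( \frac{\psi_i}{\psi_s} \Bigr)^{n} \frac{L_s^{(n,k)}(\lambda_i)}{L_s^{(n,k)}(\lambda_s)}\,\mathbf{v}_i, \quad \Sigma_3 = \sum_{i=k+1}^{p} \Bigl( \frac{\psi_i}{\psi_s} \Bigr)^{n} \frac{L_s^{(n,k)}(\lambda_i)}{L_s^{(n,k)}(\lambda_s)}\,\mathbf{v}_i. \]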

To complete the proof, we need to analyze the behavior of $\Sigma_2$ and $\Sigma_3$ as $n \to \infty$, for which we need to analyze the fractions $L_s^{(n,k)}(\lambda_i)/L_s^{(n,k)}(\lambda_s)$, which multiply the respective $\mathbf{v}_i$, as $n \to \infty$. In doing so, we must recall that the $\lambda_j$ are distinct and that the $\mathbf{v}_j$ are linearly independent. We must also recall that $\lim_{n\to\infty} z_i^{(n,k)} = \lambda_i$, $i = 1, \ldots, k$, by which

\[ \lim_{n\to\infty} L_s^{(n,k)}(\lambda_s) = L_s^{(k)} = \prod_{\substack{j=1 \\ j \neq s}}^{k} (\lambda_s - \lambda_j) \neq 0. \]

1. For $\Sigma_2$, we observe that, as $n \to \infty$,

\[ \frac{L_s^{(n,k)}(\lambda_i)}{L_s^{(n,k)}(\lambda_s)} \sim \frac{\prod_{j=1,\ j \neq s,i}^{k} (\lambda_i - \lambda_j)}{\prod_{j=1,\ j \neq s}^{k} (\lambda_s - \lambda_j)}\,(\lambda_i - z_i^{(n,k)}), \quad i = 1, \ldots, k, \quad i \neq s. \]

We have to consider the following two cases:

• The case where $A$ is nonnormal: In this case, by Theorem 10.14, $\lambda_i - z_i^{(n,k)} = O(|\psi_{k+1}/\psi_i|^n)$ as $n \to \infty$. Thus, the coefficient multiplying $\mathbf{v}_i$ in $\Sigma_2$ is $O(|\psi_{k+1}/\psi_s|^n)$. As a result, $\Sigma_2 = O(|\psi_{k+1}/\psi_s|^n)$ as $n \to \infty$ too.

• The case where $A$ is normal: In this case, by Theorem 10.16, $\lambda_i - z_i^{(n,k)} = O(|\psi_{k+1}/\psi_i|^{2n})$ as $n \to \infty$. Thus, the coefficient multiplying $\mathbf{v}_i$ in $\Sigma_2$ is $O(|\psi_{k+1}/\psi_s|^n\,|\psi_{k+1}/\psi_i|^n)$ with $|\psi_{k+1}/\psi_i| < 1$ since $i < k + 1$. As a result, $\Sigma_2 = o(|\psi_{k+1}/\psi_s|^n)$ as $n \to \infty$.

2. As for $\Sigma_3$, we observe that, as $n \to \infty$,

\[ \frac{L_s^{(n,k)}(\lambda_i)}{L_s^{(n,k)}(\lambda_s)} \sim \frac{\prod_{j=1,\ j \neq s}^{k} (\lambda_i - \lambda_j)}{\prod_{j=1,\ j \neq s}^{k} (\lambda_s - \lambda_j)}, \quad i = k + 1, \ldots, p. \]

Therefore, the coefficient multiplying $\mathbf{v}_i$ in $\Sigma_3$ is strictly $O(|\psi_i/\psi_s|^n)$. As a result, $\Sigma_3 = O(|\psi_{k+1}/\psi_s|^n)$ as $n \to \infty$ by the ordering $|\psi_{k+1}| \ge |\psi_i|$ for $i \ge k + 1$.

Therefore, $\Sigma_2 + \Sigma_3 = O(|\psi_{k+1}/\psi_s|^n)$ as $n \to \infty$. This completes the proof.

10.6.4 Final remarks

The problem of finding eigenvalues with special properties has received considerable attention in the past. As we have seen, when applied to a Hermitian matrix, the (symmetric) Lanczos method already produces approximations to the largest and the smallest eigenvalues whose accuracy increases with the dimension $k$ of the underlying Krylov subspaces. It is also observed that if Krylov subspace methods are applied with right subspace $\mathcal{K}_k(A; \mathbf{u})$ starting with an arbitrary vector $\mathbf{u}$, the Ritz values obtained are approximations to some of the eigenvalues in the outer part of the spectrum of $A$ when $A$ is not necessarily Hermitian. To approximate those in some inner part of the spectrum, we need to have different strategies, however. Having to be in some specified part of the spectrum can be considered to be a special property.

We have observed in some numerical examples that if the largest eigenvalue is real and positive, and the method of Arnoldi with right subspace $\mathcal{K}_k(A; \mathbf{u})$ attains some accuracy with a certain $k$, it attains comparable accuracy with right subspace $\mathcal{K}_{k'}(A; A^n\mathbf{u})$, $k' < k$, for some appropriate $n > 0$. This strategy can certainly be useful when storage limitations are crucial. See [270, Example 7.2].

To improve the convergence of the methods of Arnoldi and Lanczos, here we have suggested applying these methods to the vector $[\psi(A)]^n\mathbf{u}$, where $\psi(\mu)$ is a suitable function that is largest when evaluated at the desired eigenvalues, and $[\psi(\mu)]^n$ is even larger. If $\psi(\mu)$ is a polynomial of degree $d$, then $[\psi(\mu)]^n$ is also a polynomial but of degree $nd$. Thus, one can now think of using a polynomial $P(\mu)$ of a high degree that is large in the domain where the desired eigenvalues lie, while it is much smaller outside this domain. In particular, one can employ the Chebyshev polynomials. This was done for the simultaneous iteration method for Hermitian matrices by Rutishauser [231, 232]. The approach in these two papers was generalized by Saad [237] to improve the convergence of the Arnoldi method and the simultaneous iteration method for eigenvalues with largest real parts of non-Hermitian matrices; in particular, the polynomial $P(\mu)$ is now taken to be a Chebyshev polynomial rescaled and shifted to an elliptical domain that contains the unwanted eigenvalues. Following this, the method of Arnoldi is applied to the vector $P(A)\mathbf{u}$, which is now the analogue of our $[\psi(A)]^n\mathbf{u}$.

Since the method of Arnoldi requires $k$ vectors to be stored in the core memory at all times, it cannot be used with an ever-increasing value of $k$. One simple way of improving the accuracy of the computed Ritz value is by restarting. For example, if we are interested in the eigenvalue with the largest real part, we apply the method with $\mathcal{K}_k(A; \mathbf{u})$ and choose the Ritz value that has the largest real part and compute its corresponding Ritz vector $\tilde{\mathbf{u}}$. We next apply the method again, this time with $\mathcal{K}_k(A; \tilde{\mathbf{u}})$, and we repeat this as many times as necessary.

Currently, the Arnoldi method is applied to large sparse matrices by using so-called filters and implicit restarting, a sophisticated technique due to Sorensen [308]; the software by Lehoucq, Sorensen, and Yang [170] that implements this technique is known as the Arnoldi Package (ARPACK) and is in wide use. The polynomials $P(\mu)$ mentioned above serve as filters if they are chosen such that their zeros are the Ritz values corresponding to unwanted eigenvalues. We refer the reader to [308] and [240, Chapter 7] for details.

Lastly, the problem of finding eigenvalues with largest real parts was tackled in Goldhirsch, Orszag, and Maulik [99] with $\psi(\mu) = \exp(\mu)$ explicitly. The vectors $\mathbf{u}_n = [\psi(A)]^n\mathbf{u} = \exp(nA)\mathbf{u}$ that are needed are approximated through the numerical solution of the linear system of ordinary differential equations $\mathbf{u}'(t) = A\mathbf{u}(t)$ with initial conditions $\mathbf{u}(0) = \mathbf{u}_0$, where $\mathbf{u}(n) = \mathbf{u}_n$. Subsequently, a mathematical equivalent of the Arnoldi method is employed. (This equivalence can be verified with the help of [270, Section 5].) The method is then applied to a problem in hydrodynamic stability that involves the Orr–Sommerfeld equation.

Chapter 11

Miscellaneous Applications of Vector Extrapolation Methods

11.1 Introduction

Since their development in the 1970s and 1980s, vector extrapolation methods have had quite a few successful applications to practical problems in very high dimensional spaces. These problems commonly arise in various scientific and engineering disciplines, among others. Some of the first applications of these methods were to the time-independent and also time-periodic steady-state solution of time-dependent linear and nonlinear systems of partial differential equations arising in different engineering disciplines, such as computational fluid dynamics. (These applications involve the solution of linear and nonlinear systems of algebraic equations from finite-difference and finite-element discretization of systems of partial differential equations.)

Since then, vector extrapolation methods have also been applied to nonlinear problems in various other areas, such as multidimensional scaling (see Rosman et al. [223, 224]), image processing (see Rosman et al. [225]), semiconductor research (see Schilders [250, 251]), computerized tomography (see Rajeevan, Rajgopal, and Krishna [213]), transport theory involving the discrete Riccati equation (see El-Moallem and Sadok [79]), and Markov chains (see Wen et al. [342]), to name a few.

Vector extrapolation methods have been used in conjunction with some powerful iterative methods, such as multigrid and spectral element methods for partial differential equations, and have been observed to help enhance the (already favorable) convergence of these iterative methods as well; see, for example, Shapira, Israeli, and Sidi [256, 257] and Shapira et al. [258]. In Brezinski and Redivo Zaglia [37], vector extrapolation methods are used to improve the convergence of the method of Kaczmarz. They have also been used in conjunction with sequences of truncated SVD solutions of ill-posed problems; see, for example, Jbilou, Reichel, and Sadok [147]. For all these applications and others, we refer the reader to the relevant literature.

11.2 Computation of steady-state solutions

As already mentioned, some of the first applications of vector extrapolation methods were to the time-independent and also time-periodic steady-state solution of time-dependent linear and nonlinear partial differential equations arising in different engineering disciplines, such as computational fluid dynamics. Here we briefly discuss time-independent steady-state solution computations.

The general setting for such problems is as follows. Let $\Omega$ be a subset of $\mathbb{R}^d$, and denote the boundary of $\Omega$ by $\partial\Omega$. Let $u(t)$ be the solution to the initial boundary value problem (IBVP)

\[ \frac{du}{dt} = \phi(u) \ \text{in } \Omega,\ t > 0; \qquad \Lambda(u) = 0 \ \text{on } \partial\Omega \ \text{(boundary condition)}; \qquad M(u) = 0 \ \text{at } t = 0 \ \text{(initial condition)}. \tag{11.1} \]

Here $\phi$, $\Lambda$, and $M$ are linear or nonlinear operators (differential, integral, etc.); $\Lambda$ is independent of $t$, and $t$ denotes time. Usually, $\phi(u)$ depends only on $u$ and/or its partial derivatives with respect to the variables other than $t$, but it has no explicit time dependence. Assume that this IBVP has a time-independent (or steady-state) solution that we shall denote $\hat{u}$. Then $\hat{u}$ is the solution to the boundary value problem (BVP)

\[ \phi(\hat{u}) = 0 \ \text{in } \Omega; \qquad \Lambda(\hat{u}) = 0 \ \text{on } \partial\Omega \ \text{(boundary condition)}. \tag{11.2} \]

Example 11.1. An example of this is the heat equation $\partial u/\partial t = \Delta u$ in $\Omega$ with a time-independent boundary condition. The steady-state solution $\lim_{t\to\infty} u(t)$ to this problem satisfies the Laplace equation $\Delta u = 0$ in $\Omega$ with the same boundary condition. To illustrate this point, let us consider the following one-dimensional heat equation for the temperature $u(x, t)$:

\[ \frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2}, \quad 0 < x < 1,\ t > 0; \qquad u(0, t) = 0, \quad u(1, t) = 1, \quad t > 0; \qquad u(x, 0) = f(x), \quad 0 < x < 1. \]

It is easy to verify that the steady-state solution $\hat{u}(x)$, which satisfies

\[ \frac{\partial^2 \hat{u}}{\partial x^2} = 0, \quad 0 < x < 1; \qquad \hat{u}(0) = 0, \quad \hat{u}(1) = 1, \]

is $\hat{u}(x) = x$. By using the method of separation of variables, it can be shown that $u(x, t)$ is given as

\[ u(x, t) = x + \sum_{k=1}^{\infty} a_k e^{-(k\pi)^2 t} \sin(k\pi x), \qquad a_k = 2\int_0^1 \sin(k\pi x)\,[f(x) - x]\,dx, \quad k = 1, 2, \ldots. \]

Under reasonable conditions on $f(x)$, it can be shown that the infinite sum tends to zero as $t \to \infty$, and we obtain $\lim_{t\to\infty} u(x, t) = x = \hat{u}(x)$.

Since $\hat{u} = \lim_{t\to\infty} u(t)$, one of the widely used techniques for determining $\hat{u}$ has been one in which the IBVP in (11.1) is solved numerically by a time-marching technique with a fixed time step. For example, in the simplest case, letting $\Delta t$ be the time increment, we approximate the IBVP via

\[ \frac{u(t + \Delta t) - u(t)}{\Delta t} \approx \phi(u(t)) \quad \Rightarrow \quad u(t + \Delta t) \approx u(t) + (\Delta t)\,\phi(u(t)), \tag{11.3} \]

and following that, we approximate/discretize $\phi(u)$ by using appropriate finite differences or finite elements and let $t = n\Delta t$ to obtain the iterative scheme

\[ \mathbf{u}_{n+1} = \psi(\mathbf{u}_n), \quad n = 0, 1, \ldots, \qquad \mathbf{u}_n \approx u(n\Delta t). \tag{11.4} \]
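As a concrete instance of (11.3) and (11.4), the following Python sketch marches the one-dimensional heat equation of Example 11.1 to its steady state $\hat{u}(x) = x$ with the explicit Euler scheme and central differences in $x$. The grid size, the ratio $r = \Delta t/(\Delta x)^2 = 0.4$ (chosen below the stability limit $1/2$ of this scheme), the initial condition $f(x) = 0$, and all function names are assumptions made only for this illustration.

```python
def heat_step(u, r):
    """One explicit Euler step u_{n+1} = psi(u_n) for u_t = u_xx.

    u holds the values at the grid points x_0 = 0, ..., x_J = 1 and
    r = dt / dx**2. The boundary values u[0] and u[-1] are kept fixed.
    """
    new = u[:]
    for j in range(1, len(u) - 1):
        new[j] = u[j] + r * (u[j + 1] - 2.0 * u[j] + u[j - 1])
    return new

def march(f, num_intervals=10, r=0.4, steps=1000):
    """Apply `steps` time steps starting from u(x, 0) = f(x)."""
    dx = 1.0 / num_intervals
    x = [j * dx for j in range(num_intervals + 1)]
    u = [f(xj) for xj in x]
    u[0], u[-1] = 0.0, 1.0        # boundary conditions u(0,t) = 0, u(1,t) = 1
    for _ in range(steps):
        u = heat_step(u, r)
    return x, u

def max_error(steps):
    """Maximum deviation of u_n from the steady state x on the grid."""
    x, u = march(lambda s: 0.0, steps=steps)
    return max(abs(uj - xj) for uj, xj in zip(u, x))

for steps in (10, 100, 1000):
    print(steps, max_error(steps))
```

The printed errors decrease only gradually with the number of time steps, which is the kind of slowly converging sequence $\{\mathbf{u}_n\}$ to which an acceleration method would be applied.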

Of course, the initial approximation $\mathbf{u}_0 = u(0)$ needs to be provided in a suitable way.$^{47}$ Obviously, when the discretization scheme is stable, $\lim_{n\to\infty} \mathbf{u}_n$ exists and is an approximation to $\hat{u}$. In most cases of interest, the iterative scheme in (11.4) converges very slowly, and its convergence can be accelerated by using a vector extrapolation method that is applied in the cycling mode. For some earlier applications of vector extrapolation and Krylov subspace methods to such problems, see Wong and Hafez [346]; Wigton, Yu, and Young [343]; Hafez et al. [124]; Reddy and Jacocks [217]; Yungster [358, 359]; and Kao [156], for example.

Now, $\phi(u)$ can be discretized in various ways, some better than others. Normally, the very simpleminded discretization procedures give rise to sequences $\{\mathbf{u}_n\}$ that have poor convergence properties, and so do extrapolation methods when applied to these sequences. Therefore, it is a good strategy to spend enough effort to design a good discretization procedure before everything else. Normally, completely explicit discretization schemes give rise to very poor convergence of $\{\mathbf{u}_n\}$, and vector extrapolation methods do not produce meaningful acceleration. If we include even a small amount of implicitness, we obtain considerably better convergence from extrapolation methods. Usually, the best acceleration results are obtained when extrapolation methods are applied in conjunction with implicit schemes.

For the different fluid mechanics problems, for example, quite a few sophisticated approaches to the discretization of the relevant IBVPs have been suggested and are in use. When vector extrapolation methods are applied to these sequences in conjunction with cycling, they seem to be very effective. See the numerical examples in Sidi and Celestina [295], where two essentially different fluid flow problems are treated numerically. Each of these problems is discretized in two different ways, one explicit and the other fully implicit. The improvement in the numerical results achieved by applying MPE and RRE in conjunction with the more sophisticated fully implicit schemes is remarkable.

Various studies have shown that the vector extrapolation methods that are most suitable for problems that result in a very high dimensional system of algebraic equations are polynomial-type methods; epsilon algorithms turn out to be quite expensive. Of the polynomial-type methods, MPE and RRE have proved very effective in many instances. From our discussion in Section 5.7, we recall that, when applied in the cycling mode and in conjunction with fixed-point iterative schemes, $\mathbf{s}_{0,k}$ from MPE and RRE and $\boldsymbol{\varepsilon}_{2k}^{(0)}$ from VEA and TEA have comparable accuracies. However, VEA and TEA require $2k$ vectors $\mathbf{u}_n$ to be computed, while MPE and RRE require $k + 1$ such vectors, namely, half as many vectors as required by VEA and TEA. This is crucial in problems where the computation of the individual vectors $\mathbf{u}_n$ has a high cost; for example, problems in computational fluid dynamics are of this kind.

Here we would like to recall the conclusions we reached in Chapter 7 in our study of the errors when applying MPE and RRE, with $n$ and $k$ fixed, to vector sequences obtained from fixed-point linear iterative schemes. The main point there was that the spectrum of the iteration matrix $T$ has great influence on the rates of convergence of MPE and RRE. In Section 7.4, we compared the theoretical and numerical results from four different real and purely imaginary spectra. We observed that, when $\rho(T) < 1$ is fixed, the best convergence acceleration is obtained from a purely negative spectrum followed by a purely imaginary spectrum, followed by a real mixed spectrum, followed by a purely positive spectrum.

So far, we have discussed the application of vector extrapolation methods to obtain time-independent steady-state solutions of equations. Vector extrapolation methods in general, and MPE and RRE in particular, have also been applied to obtain time-periodic steady-state solutions. The case in which the period of the solution is known has been treated by Skelboe [305]. In Section 11.3, we present another method for this problem, which combines the Richardson extrapolation process with vector extrapolation methods. This hybrid method seems to be more economical than that of [305]. The application of vector extrapolation methods to the case in which the period of the steady-state solution is not known is considered in Houben and Maubach [141] and Houben [140]. When applied properly, these methods seem to be successful in all cases.

$^{47}$ The scheme in (11.3) corresponds to the application of the Euler method (see Atkinson [9, p. 300], for example) to the solution of the scalar initial value problem (IVP) $u' = \phi(u)$, $u(0) = u_0$, for example. We can use a more sophisticated scheme that corresponds to the application of better methods to the scalar IVP, such as the Runge–Kutta method or others. This has been done in the finite-difference or finite-element solution of problems in computational fluid dynamics; see Jameson, Schmidt, and Turkel [145].

11.3 Computation of eigenvectors with known eigenvalues One of the interesting applications of vector extrapolation methods is to the problem of computing eigenvectors of large sparse matrices corresponding to known eigenvalues. The topic has become of much interest recently in connection with the computation of the PageRank of the Google Web matrix; we deal with the PageRank computation later in this section. As we shall see shortly, vector extrapolation methods turn out to be very suitable for this problem. The approach we present next was developed in Sidi [279, 285].

11.3.1 Treatment of diagonalizable A

We start by considering the case in which $A$ is diagonalizable. Let us assume that some eigenvalue $\hat{\mu}$ of the matrix $A \in \mathbb{C}^{N\times N}$ is known. Assume also that this eigenvalue is of multiplicity $q$, which need not be known. Then $A$ has $q$ linearly independent eigenvectors corresponding to this eigenvalue, and we are interested in computing one such eigenvector. Now, if $(\hat{\mu}, \hat{\mathbf{v}})$ is an eigenpair of $A$, then $(1, \hat{\mathbf{v}})$ is an eigenpair of $\hat{\mu}^{-1}A$. Therefore, we may assume without loss of generality that $\hat{\mu} = 1$, that is,

\[ A\hat{\mathbf{v}} = \hat{\mathbf{v}}, \tag{11.5} \]

and this is what we do in what follows. As a result, $A$ has $N$ eigenpairs $(\mu_i, \mathbf{w}_i)$, $A\mathbf{w}_i = \mu_i\mathbf{w}_i$, $i = 1, \ldots, N$, ordered such that

\[ \mu_1 = \cdots = \mu_q = 1, \qquad \mu_i \neq 1, \quad i = q + 1, \ldots, N. \]

Let us pick an arbitrary initial vector $\mathbf{x}_0$ and compute the vectors $\mathbf{x}_1, \mathbf{x}_2, \ldots$ as

\[ \mathbf{x}_{m+1} = A\mathbf{x}_m, \quad m = 0, 1, \ldots. \]

Since the eigenvectors $\mathbf{w}_1, \ldots, \mathbf{w}_N$ span $\mathbb{C}^N$, we have

\[ \mathbf{x}_0 = \sum_{i=1}^{N} \alpha_i\mathbf{w}_i \quad \text{for some scalars } \alpha_i. \]

Therefore,

\[ \mathbf{x}_m = \sum_{i=1}^{N} \alpha_i\mu_i^m\mathbf{w}_i = \sum_{i=1}^{q} \alpha_i\mathbf{w}_i + \sum_{i=q+1}^{N} \alpha_i\mu_i^m\mathbf{w}_i. \]

Assuming that $\sum_{i=1}^{q} |\alpha_i| \neq 0$, let $\sum_{i=1}^{q} \alpha_i\mathbf{w}_i = \hat{\mathbf{v}}$. Following the arguments in the proof of Lemma 6.4 and in Example 6.5, it is easy to conclude that $\mathbf{x}_m$ is precisely of the form

\[ \mathbf{x}_m = \hat{\mathbf{v}} + \sum_{i=1}^{p} \mathbf{v}_i\lambda_i^m, \quad m = 1, 2, \ldots. \]

Here, (i) $\lambda_i$ are some or all of the distinct nonzero eigenvalues of $A$ not equal to one; (ii) $A\hat{\mathbf{v}} = \hat{\mathbf{v}}$ and $A\mathbf{v}_i = \lambda_i\mathbf{v}_i$, $i = 1, 2, \ldots, p$; and (iii) $p < N$. Clearly, $\mathbf{v}_i$ are linearly independent vectors. Note that the sequence $\{\mathbf{x}_m\}$ converges to $\hat{\mathbf{v}}$ provided one is the largest eigenvalue of $A$; $\{\mathbf{x}_m\}$ diverges otherwise.

Let us order the $\lambda_i$ as in

\[ |\lambda_1| \ge |\lambda_2| \ge \cdots \ge |\lambda_p| \]

and assume also that

\[ |\lambda_k| > |\lambda_{k+1}|. \]

We now note that the vector sequence $\{\mathbf{x}_m\}$ is precisely as in Theorem 6.6. Therefore, vector extrapolation methods can be applied to $\{\mathbf{x}_m\}$ to approximate $\hat{\mathbf{v}}$. In particular, we can apply MPE or RRE to $\{\mathbf{x}_m\}$ to obtain the approximations $\mathbf{s}_{n,k}$ to $\hat{\mathbf{v}}$; by Theorem 6.6, we have

\[ \mathbf{s}_{n,k} - \hat{\mathbf{v}} = O(|\lambda_{k+1}|^n) \quad \text{as } n \to \infty. \]

Of course, convergence takes place provided $|\lambda_{k+1}| < 1$. This will be the case naturally if $\hat{\mu} = 1$ is the largest eigenvalue of $A$, as is the case in PageRank computations. Theorem 6.7 applies to $\mathbf{s}_{n,k}$ when $|\lambda_k| = |\lambda_{k+1}|$.

In view of all of this, we can apply MPE and RRE in the cycling mode if the dimension $N$ is very large. Of course, we can also apply these methods using cycling with frozen $\gamma_i$ and cycling in parallel, depending on the resources available. We also recall from remark 8 following the statement of Theorem 6.6 that we can apply MPE and RRE to a subsequence $\{\mathbf{x}_{rm}\}$ with a suitable integer $r > 1$.
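To make the procedure concrete, here is a minimal Python sketch of MPE applied to the power iterates $\mathbf{x}_{m+1} = A\mathbf{x}_m$. It rests on several assumptions that are not from the text: a small $3 \times 3$ column-stochastic test matrix, $n = 0$ and $k = 2$, and a least-squares solve via the normal equations, used here only for brevity (a QR-based least-squares solver would be preferable numerically). The sketch uses the standard definition of MPE: with $\mathbf{u}_i = \mathbf{x}_{i+1} - \mathbf{x}_i$, minimize $\|\sum_{i=0}^{k-1} c_i\mathbf{u}_i + \mathbf{u}_k\|$, set $c_k = 1$ and $\gamma_i = c_i/\sum_j c_j$, and form $\mathbf{s}_{0,k} = \sum_{i=0}^{k} \gamma_i\mathbf{x}_i$. All function names are hypothetical.

```python
def matvec(A, x):
    """Dense matrix-vector product A x."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def solve(M, b):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [bi] for row, bi in zip(M, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x

def mpe(xs):
    """MPE approximation s_{0,k} from x_0, ..., x_{k+1}, where k = len(xs) - 2."""
    k = len(xs) - 2
    # Differences u_i = x_{i+1} - x_i, i = 0, ..., k.
    u = [[b - a for a, b in zip(xs[i], xs[i + 1])] for i in range(k + 1)]
    # Normal equations for min || sum_{i<k} c_i u_i + u_k || (assumption: brevity).
    gram = [[dot(u[i], u[j]) for j in range(k)] for i in range(k)]
    rhs = [-dot(u[i], u[k]) for i in range(k)]
    c = solve(gram, rhs) + [1.0]
    total = sum(c)
    gamma = [ci / total for ci in c]
    dim = len(xs[0])
    return [sum(gamma[i] * xs[i][j] for i in range(k + 1)) for j in range(dim)]

# Illustrative column-stochastic matrix (an assumption); one is its largest eigenvalue.
A = [[0.5, 0.2, 0.3],
     [0.3, 0.7, 0.1],
     [0.2, 0.1, 0.6]]
xs = [[1.0, 0.0, 0.0]]
for _ in range(3):                 # x_0, ..., x_3 are needed for k = 2
    xs.append(matvec(A, xs[-1]))
s = mpe(xs)
residual = max(abs(a - b) for a, b in zip(matvec(A, s), s))
print(s, residual)
```

For this tiny example only two eigenvalues differ from one, so $k = 2$ already reproduces the eigenvector up to rounding. For large $N$ one would take $k \ll N$ and rely on the error estimate above, typically within the cycling mode.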

11.3.2 Treatment of nondiagonalizable A

We now treat the case in which $A$ is nondiagonalizable but the known eigenvalue $\hat{\mu}$ is nondefective and has $q$ corresponding linearly independent eigenvectors. Thus, at least one of the eigenvalues of $A$ different from $\hat{\mu}$ is defective; that is, it has at least one corresponding generalized eigenvector. In this case, with $\hat{\mu} = 1$, $\mathbf{x}_m$ is precisely of the form

\[ \mathbf{x}_m = \hat{\mathbf{v}} + \sum_{i=1}^{p} \mathbf{g}_i(m)\lambda_i^m, \quad m = 1, 2, \ldots. \]

Here, (i) $\lambda_i$ are some or all of the distinct nonzero eigenvalues of $A$ not equal to one; (ii) $A\hat{\mathbf{v}} = \hat{\mathbf{v}}$, $\mathbf{g}_i(m)$ is a vector-valued polynomial in $m$ of degree $\omega_i - 1$, $i = 1, \ldots, p$, and the set of the coefficients of all the $\mathbf{g}_i(m)$ is linearly independent; and (iii) $\sum_{i=1}^{p} \omega_i \le N - q$. (See Section 6.8 and Lemma 6.22.) Note that the sequence $\{\mathbf{x}_m\}$ converges to $\hat{\mathbf{v}}$ provided $\hat{\mu} = 1$ is the largest eigenvalue of $A$; $\{\mathbf{x}_m\}$ diverges otherwise.

Let us order the $\lambda_i$ as in

\[ |\lambda_1| \ge |\lambda_2| \ge \cdots \ge |\lambda_p| \quad \text{and} \quad |\lambda_i| = |\lambda_{i+1}| \ \Rightarrow \ \omega_i \ge \omega_{i+1}, \]

and let us assume that

\[ |\lambda_t| > |\lambda_{t+1}|. \]

Then Theorem 6.23 applies to $\mathbf{s}_{n,k}$ with $k = \sum_{i=1}^{t} \omega_i$, and we have

\[ \mathbf{s}_{n,k} - \hat{\mathbf{v}} = O(n^c|\lambda_{t+1}|^n) \quad \text{as } n \to \infty; \quad c = \omega_{t+1} - 1. \]

Theorem 6.24 applies to $\mathbf{s}_{n,k}$ with $\sum_{i=1}^{t} \omega_i < k < \sum_{i=1}^{t+r} \omega_i$ if $|\lambda_t| > |\lambda_{t+1}| = \cdots = |\lambda_{t+r}| > |\lambda_{t+r+1}|$, and we have

\[ \mathbf{s}_{n,k} - \hat{\mathbf{v}} = O(n^{\beta}|\lambda_{t+1}|^n) \quad \text{as } n \to \infty, \]

$\beta \ge 0$ being some integer. We leave the details to the interested reader.

11.3.3 Application to PageRank computation

General preliminaries

We start with the well-known Perron–Frobenius theorem and the definition of stochastic matrices. (See Varga [333, Chapter 2] or Berman and Plemmons [21, Chapter 2], for example.)

Theorem 11.2. Let $A$ be a square matrix.

1. If $A \ge O$, then $\rho(A)$ is an eigenvalue of $A$ and it has a nonnegative corresponding eigenvector.
2. If $A \ge O$ and is irreducible, then $\rho(A)$ is a simple eigenvalue of $A$ and it has a positive corresponding eigenvector.

Definition 11.3. Let $A = [a_{ij}] \in \mathbb{R}^{N\times N}$. $A$ is column stochastic if it is nonnegative and the sum of the elements in each of its columns is one, that is, if

\[ a_{ij} \ge 0 \quad \forall\, i, j, \qquad \sum_{i=1}^{N} a_{ij} = 1, \quad j = 1, \ldots, N. \]

Clearly, if $A$ is column stochastic, then $\rho(A) = 1$; therefore, by Theorem 11.2, $A$ has $\hat{\mu} = 1$ as an eigenvalue and a corresponding nonnegative right eigenvector. It is easy to verify that the corresponding left eigenvector of $A$ is $\mathbf{e} = [1, 1, \ldots, 1]^T$; that is, $\mathbf{e}^T A = \mathbf{e}^T$. If $A$ is also irreducible, then $\hat{\mu} = 1$ is a simple eigenvalue and the corresponding right eigenvector is positive.
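These facts are easy to check numerically on a small example. The Python sketch below, in which the test matrix and all function names are assumptions made for illustration, verifies that $\mathbf{e}^T A = \mathbf{e}^T$ amounts to all column sums being one, and runs a plain power iteration, which preserves the sum of the components and, for this positive (hence irreducible) matrix, approaches a positive eigenvector belonging to the eigenvalue one.

```python
def column_sums(A):
    """Entries of the row vector e^T A, i.e., the column sums of A."""
    n = len(A)
    return [sum(A[i][j] for i in range(n)) for j in range(n)]

def is_column_stochastic(A, tol=1e-12):
    """Check a_ij >= 0 and that every column sums to one (within tol)."""
    nonnegative = all(a >= 0.0 for row in A for a in row)
    return nonnegative and all(abs(s - 1.0) <= tol for s in column_sums(A))

def power_iteration(A, x, steps):
    """Apply x <- A x `steps` times; no normalization is needed since rho(A) = 1."""
    for _ in range(steps):
        x = [sum(a * xj for a, xj in zip(row, x)) for row in A]
    return x

# Illustrative positive column-stochastic matrix (an assumption).
A = [[0.5, 0.2, 0.3],
     [0.3, 0.7, 0.1],
     [0.2, 0.1, 0.6]]
print(is_column_stochastic(A))
v = power_iteration(A, [1.0, 0.0, 0.0], 200)
print(v, sum(v))
```

Because $\mathbf{e}^T A = \mathbf{e}^T$, the sum of the components of the iterates stays at its initial value, so a starting vector whose components sum to one yields an eigenvector already normalized in the same way.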


Properties of the Web matrix and the PageRank vector

The matrix $G(c) \in \mathbb{R}^{N\times N}$ used in the Google PageRank computations is of the form$^{48}$

\[ G(c) = cP + (1 - c)E, \qquad P, E \ \text{column stochastic}, \qquad 0 < c < 1. \]

Therefore, $G(c)$ is also column stochastic, its largest eigenvalue being $\hat{\mu} = 1$ with a corresponding nonnegative right eigenvector $\hat{\mathbf{v}}$. In addition, $E$ is a rank-one matrix of the form $E = \mathbf{u}\mathbf{e}^T$, where $\mathbf{u}$ is a nonnegative vector such that $\mathbf{e}^T\mathbf{u} = 1$. Interestingly, whether $\mathbf{u}$ is positive or nonnegative, the eigenvalue $\hat{\mu} = \mu_1 = 1$ is always simple, and this result follows from the paper [128] by Haveliwala and Kamvar, where it is proved that the second eigenvalue $\mu_2$ of $G(c)$ satisfies $|\mu_2| \le c < 1$. Thus, even if $P$ has a multiple eigenvalue equal to one, $G(c)$ has a simple eigenvalue equal to one. Yet, in two other papers by Langville and Meyer [167, 168], the following is proved concerning the eigenvalues of $G(c)$: If $1, \mu_2, \mu_3, \ldots, \mu_N$ are the eigenvalues of $P$, then the eigenvalues of $G(c)$ are $1, c\mu_2, c\mu_3, \ldots, c\mu_N$. For another proof of this result, see Eldén [80].

In summary, the largest eigenvalue of the Google matrix $G(c)$ is $\hat{\mu} = \mu_1 = 1$, and this eigenvalue is simple and the corresponding right eigenvector $\hat{\mathbf{v}}$ is nonnegative, while the corresponding left eigenvector is $\mathbf{e} = [1, \ldots, 1]^T$. (If $\mathbf{u}$ is positive, the matrix $G(c)$ becomes irreducible, which implies that $\hat{\mathbf{v}}$ is positive.) The PageRank is simply $\hat{\mathbf{v}}$, normalized such that the sum of its components is one. It serves as a measure of the relative importance of Web pages.

The very special nature of the Web matrix $G(c)$ enables us to make a useful observation concerning the PageRank vector $\hat{\mathbf{v}}$ as a function of $c$. Since the eigenvalue $\hat{\mu} = \mu_1 = 1$ of $G(c)$ is simple, the matrix $I - G(c)$ has rank $N - 1$. Thus, the linear system $(I - G(c))\hat{\mathbf{v}} = \mathbf{0}$ has $N - 1$ independent equations. When these equations are solved using Cramer's rule, after assigning a constant value to one of the unknowns, it becomes obvious that $\hat{\mathbf{v}}$ is a vector-valued rational function of $c$, since the elements of $G(c)$ are linear functions of $c$.$^{49}$

We can actually determine the exact nature of $\hat{\mathbf{v}}$ assuming for simplicity that $P$ is diagonalizable. Allowing also for one to be an eigenvalue of $P$ of multiplicity $q$, we have

\[ V^{-1}PV = D = \operatorname{diag}(\mu_1, \ldots, \mu_N), \qquad \mu_1 = \cdots = \mu_q = 1, \tag{11.6} \]

where

\[ V = [\,\mathbf{y}_1 \,|\, \mathbf{y}_2 \,|\, \cdots \,|\, \mathbf{y}_N\,], \qquad V^{-1} = [\,\mathbf{z}_1 \,|\, \mathbf{z}_2 \,|\, \cdots \,|\, \mathbf{z}_N\,]^T, \tag{11.7} \]

such that $\mathbf{z}_1 = \mathbf{e} = [1, \ldots, 1]^T$ and$^{50}$

\[ P\mathbf{y}_i = \mu_i\mathbf{y}_i, \qquad \mathbf{z}_i^T P = \mu_i\mathbf{z}_i^T, \quad i = 1, \ldots, N, \qquad \mathbf{z}_i^T\mathbf{y}_j = \delta_{ij}. \tag{11.8} \]

Under these conditions, we then have the following result.

Theorem 11.4. As a function of $c$, the PageRank vector is given as

\[ \hat{\mathbf{v}} = \hat{\mathbf{v}}(c) = \mathbf{y}_1 + \sum_{i=2}^{q} (\mathbf{z}_i^T\mathbf{u})\,\mathbf{y}_i + (1 - c)\sum_{i=q+1}^{N} \frac{\mathbf{z}_i^T\mathbf{u}}{1 - c\mu_i}\,\mathbf{y}_i. \tag{11.9} \]

$^{48}$ The dimension $N$ is $O(10^{10})$ at present.

$^{49}$ Here we are using the fact that the determinant of a matrix whose elements are polynomials in $c$ is also a polynomial in $c$.

$^{50}$ Note that, by the fact that $PV = VD$ and $V^{-1}P = DV^{-1}$, the vectors $\mathbf{y}_i$ and $\mathbf{z}_i$ are, respectively, the right and left eigenvectors of $P$ corresponding to the eigenvalue $\mu_i$, $i = 1, \ldots, N$. In addition, by $V^{-1}V = I$, the sets $\{\mathbf{y}_1, \ldots, \mathbf{y}_N\}$ and $\{\mathbf{z}_1, \ldots, \mathbf{z}_N\}$ are mutually biorthonormal in the sense $\mathbf{z}_i^T\mathbf{y}_j = \delta_{ij}$.

270

Chapter 11. Miscellaneous Applications of Vector Extrapolation Methods

(c) is a vector-valued rational function of c that is analytic at c = 1 and has simple Thus, v poles at 1/μi , i = q + 1, . . . , N . Proof. Because y 1 , . . . , y N span N , we can write = v

N

i =1

αi (c)y i ,

(11.10)

where αi (c) are scalar functions to be determined. Then (I − G(c)) v =0



N

i =1

αi (c)(1 − cμi )y i = (1 − c)

N

j =1

α j (c)E y j .

(11.11)

By E = ue T = u z T1 and z Ti y j = δi j , we have N

j =1

α j (c)E y j =



N j =1

 α j (c)(z T1 y j ) u = α1 (c)u.

Scaling the that α1 (c) = 1 (this will be validated immediately), and observing αi (c) such T that u = N (z u)y i , (11.11) becomes i =1 i N

i =1

αi (c)(1 − cμi )y i = (1 − c)

N

i =1

(z Ti u)y i ,

which, by the linear independence of the y i , gives αi (c)(1 − cμi ) = (1 − c)(z Ti u)



αi (c) =

1−c (z T u), 1 − cμi i

i = 1, . . . , N . (11.12)

Since one is an eigenvalue of multiplicity q, we have μ1 = · · · = μq = 1; therefore, αi (c) = (z Ti u),

i = 1, . . . , q.

(11.13)

By z 1 = e and e T u = 1, (11.13) gives α1 (c) = 1, as required. Combining (11.12) and (11.13) with α1 (c) = 1 in (11.10), we obtain the result in (11.9).
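As a numerical sanity check of (11.9), the following minimal sketch compares the formula with the limit of the power iterations for a $2 \times 2$ column stochastic matrix. The matrix $P = \begin{bmatrix} 1-a & b \\ a & 1-b \end{bmatrix}$, the values of `a`, `b`, `c`, `u`, and the function names are illustrative assumptions, not taken from the text; for this $P$ we have $q = 1$, $\mu_2 = 1 - a - b$, and the eigenvectors can be written down by hand.

```python
def pagerank_power(P, u, c, iters=200):
    """Power iteration x <- G(c) x with G(c) = c P + (1 - c) u e^T."""
    n = len(u)
    x = [1.0 / n] * n                       # positive start with e^T x = 1
    for _ in range(iters):
        s = sum(x)                          # e^T x (stays equal to 1)
        Px = [sum(P[i][j] * x[j] for j in range(n)) for i in range(n)]
        x = [c * Px[i] + (1.0 - c) * s * u[i] for i in range(n)]
    return x


def pagerank_formula_2x2(a, b, u, c):
    """Evaluate (11.9) with q = 1 for P = [[1-a, b], [a, 1-b]]."""
    mu2 = 1.0 - a - b
    y1 = [b / (a + b), a / (a + b)]         # right eigenvector for 1, e^T y1 = 1
    y2 = [1.0, -1.0]                        # right eigenvector for mu2
    z2 = [a / (a + b), -b / (a + b)]        # left eigenvector for mu2, z2^T y2 = 1
    coeff = (1.0 - c) * (z2[0] * u[0] + z2[1] * u[1]) / (1.0 - c * mu2)
    return [y1[i] + coeff * y2[i] for i in range(2)]


# Illustrative data (assumed for this sketch only).
a, b, c = 0.3, 0.6, 0.85
P = [[1.0 - a, b], [a, 1.0 - b]]
u = [0.25, 0.75]
v_iter = pagerank_power(P, u, c)
v_form = pagerank_formula_2x2(a, b, u, c)
```

Here `v_iter` and `v_form` should agree to rounding error, and both have components summing to one.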

Theorem 11.4 concerning the PageRank was given originally in Serra-Capizzano [254], where the case of P being nondiagonalizable is also considered in full detail. For additional results concerning the sensitivity and derivative of the PageRank as a function of c, see also Gleich et al. [98], for example.

Computation of the PageRank vector

Now, the matrix $P$ is very sparse, the number of nonzero elements in each of its rows being $O(1)$. This means that the cost of computing a matrix-vector product $Pw$ is $O(N)$ arithmetic operations. The cost of computing the product $Ew$ is also the same because $Ew = (e^T w)u$. As a result, the cost of computing the matrix-vector product $G(c)w$ is $O(N)$ arithmetic operations. From this, we see that the matrix $G(c)$, despite the fact that it is not a sparse matrix, behaves like one when one computes $G(c)w$. Thus, methods that are based on power iterations can be most useful in PageRank computations.

Because storage is a crucial problem in PageRank computations, we must strive to use methods that give accurate results quickly and require little storage. These aims can be realized by applying MPE and RRE in the cycling mode to the sequence of power iterations $\{x_m\}$, $x_{m+1} = G(c)x_m$, $m = 0, 1, \ldots$, or to a subsequence $\{x_{rm}\}$, precisely as explained above, by tuning the integer parameters $n, k, r$ appropriately. One other strategy is to apply MPE and RRE in the cycling mode with frozen $\gamma_i$, which are obtained after a few full cycles. This way we need to save only one vector and do not have the overhead involved in implementing MPE and RRE through full cycling. Of course, we can also apply MPE and RRE in the parallel cycling mode.

Note that, when the initial vector $x_0$ is chosen to be positive and satisfies $e^T x_0 = 1$, all the power iterations $x_m$, $m = 1, 2, \ldots$, are also positive and satisfy $e^T x_m = 1$ because $G(c)$ is column stochastic. This implies that the vector $\lim_{m\to\infty} x_m$ is exactly the PageRank. Since $\sum_{i=0}^{k} \gamma_i = 1$ in both MPE and RRE, $s_{n,k}$ also satisfies $e^T s_{n,k} = 1$ for both MPE and RRE, which is a desired property that $s_{n,k}$ should have.

PageRank was developed by Brin and Page [46] and Page et al. [202] as "a method for rating Web pages objectively and mechanically, effectively measuring the human interest and attention devoted to them." The PageRank vector via extrapolation of power iterations was first computed by Kamvar et al. [153], who developed a technique denoted the quadratic extrapolation for computing the PageRank.
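The $O(N)$ cost of the product $G(c)w$ and the invariance of $e^T x_m = 1$ can be illustrated with a small sketch. It assumes, for illustration only, that $P$ is stored as adjacency lists (`cols[j]` holds the pages that page `j` links to, each page having at least one outgoing link) and that $u$ is the uniform vector; the function names and the four-page graph are hypothetical.

```python
def google_matvec(cols, u, c, w):
    """Return G(c) w = c P w + (1 - c) (e^T w) u without forming G(c).

    cols[j] lists the pages that page j links to; column j of P then has
    the value 1/len(cols[j]) in those rows, so P is column stochastic.
    """
    n = len(w)
    out = [0.0] * n
    for j, targets in enumerate(cols):
        share = c * w[j] / len(targets)
        for i in targets:                   # only the nonzeros of P are touched
            out[i] += share
    total = sum(w)                          # e^T w
    for i in range(n):
        out[i] += (1.0 - c) * total * u[i]  # rank-one part (1 - c) E w
    return out


def power_iterations(cols, u, c, x0, m):
    """Return the list [x_0, x_1, ..., x_m] with x_{k+1} = G(c) x_k."""
    xs = [x0]
    for _ in range(m):
        xs.append(google_matvec(cols, u, c, xs[-1]))
    return xs


# A hypothetical four-page web graph.
cols = [[1, 2], [2], [0], [0, 2]]
n = len(cols)
u = [1.0 / n] * n
xs = power_iterations(cols, u, 0.85, [1.0 / n] * n, 200)
```

Every iterate in `xs` stays positive with components summing to one, and the last iterate is, to rounding error, a fixed point of `google_matvec`.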
Following the publication of [153], the application of known vector extrapolation methods, such as MPE and RRE, to the sequence of power iterations was suggested in Sidi [279, 285], where it was also shown that the approximation $\hat{s}_{n,2}$ to the PageRank vector produced by quadratic extrapolation from the vectors $x_{n+i}$, $i = 0, 1, 2, 3$, is related to the vector $s_{n,2}$ obtained from the same vectors by MPE, in the sense that $\hat{s}_{n,2} = G(c)s_{n,2}$. Quadratic extrapolation was generalized in [279, 285], where an efficient algorithm for implementing this generalization was also given.$^{51}$ The vector $\hat{s}_{n,k}$ produced by this generalization from the input vectors $x_n, x_{n+1}, \ldots, x_{n+k+1}$ is closely related to the vector $s_{n,k}$ obtained by MPE from the same input vectors, in the sense that $\hat{s}_{n,k} = G(c)s_{n,k}$. Consequently, $\hat{s}_{n,k} - \hat{v} = G(c)(s_{n,k} - \hat{v})$, which implies that the error analysis developed in Chapter 6 for MPE and RRE applies here without any changes. See [279, 285] for details.

Computation of $\hat{v}(1)$

Note that
\[
\hat{v}(1) = \lim_{c \to 1} \hat{v}(c) = y_1 + \sum_{i=2}^{q} (z_i^T u)\, y_i.
\]
Clearly, this vector is a linear combination of the eigenvectors of $P$ that correspond to the eigenvalue one. Interestingly, $\hat{v}(1)$ depends on the matrix $E$ (via the vector $u$), as well as the matrix $P$, even though $G(1) = P$ does not depend on $E$. Thus, the vector $\hat{v}(1)$ cannot be computed by letting $c = 1$ in $G(c)$.

The computation of $\hat{v}(1)$ is considered in Brezinski, Redivo Zaglia, and Serra-Capizzano [45]. The method suggested in this paper is as follows: (i) Compute $\hat{v}(c)$ for several values of $c \in (0, 1)$, say $c_1, \ldots, c_p$; (ii) approximate $\hat{v}(c)$ by a vector-valued rational function $h_p(c)$ that interpolates $\hat{v}(c)$ at $c_1, \ldots, c_p$; and (iii) compute $h_p(1)$ as an approximation to $\hat{v}(1)$. The authors also define the interpolant $h_p(c)$ in some fashion.

Recently, three vector-valued rational interpolation methods, named IMPE, IMMPE, and ITEA, were developed and analyzed in Sidi [280, 281, 283, 284, 287, 291]. These methods can be used effectively to compute $\hat{v}(1)$ as suggested in [45]. This is justified in view of the fact that the de Montessus-type convergence theories for IMPE, IMMPE, and ITEA given in [283, 284, 287, 291] apply to $\hat{v}(c)$ because, being analytic at $c = 1$ and meromorphic elsewhere in the $c$-plane, $\hat{v}(c)$ automatically satisfies all the conditions of these convergence theories. We do not pursue this topic here any further; we refer the reader to Chapter 16, where these methods and their convergence theories are summarized.

51 This generalization of the algorithm for quadratic extrapolation forms the basis of the additional algorithms for MPE and RRE developed in Section 2.6.

11.3.4 Further application: A hybrid method

We end by considering the solution to a problem that generalizes those considered earlier in this section. We assume that the sequence $\{x_m\}$ is such that
\[
x_m = \sum_{i=1}^{L} w_i \mu_i^m + \sum_{i=1}^{p} v_i \lambda_i^m, \qquad m = 0, 1, \ldots, \tag{11.14}
\]
where the scalars $\mu_i$ and $\lambda_i$ are distinct and different from zero, the $\mu_i$ are known, and the vectors $w_i$ and $v_i$ are not necessarily known. The problem is to compute (approximate) $\tilde{x}_m = \sum_{i=1}^{L} w_i \mu_i^m$. An interesting situation occurs when, for example, $\mu_i$ are the $L$th roots of unity, so $|\mu_i| = 1$, $i = 1, \ldots, L$, and $1 > |\lambda_1| \ge |\lambda_2| \ge \cdots \ge |\lambda_p|$, in which case $\sum_{i=1}^{p} v_i \lambda_i^m$ is the transient (which tends to zero as $m \to \infty$) and $\tilde{x}_m = \sum_{i=1}^{L} w_i \mu_i^m$ is the periodic steady state in that $\tilde{x}_{m+L} = \tilde{x}_m$ for all $m$. How such a series can come about is discussed at length in Sidi [278, Section 25.13] via a linear system of first-order differential equations of the form
\[
x'(t) = C x(t) + f(t), \tag{11.15}
\]

where $C$ is a constant matrix and $f(t)$ is a vector-valued periodic function of $t$ with a known period $T$. Provided the eigenvalues of $C$ have negative real parts, this system has a periodic steady state (as $t \to \infty$) with period $T$. When we solve (11.15) by finite differences using the Euler method with a fixed time step $h = T/L$, where $L$ is a positive integer, we obtain a sequence of vectors $x_m$ such that $x_m \approx x(mh)$ and is of the form given in (11.14), $\mu_i$ being the $L$th roots of unity.

We can approximate the periodic part $\sum_{i=1}^{L} w_i \mu_i^m$ by approximating the $w_i$ individually by a hybrid method that combines the Richardson extrapolation process with a vector extrapolation method. To compute $w_r$, we first form the vectors $x_m^{(r)} = x_m \mu_r^{-m}$. By (11.14), we have
\[
x_m^{(r)} = w_r + \sum_{\substack{i=1 \\ i \ne r}}^{L} w_i \Big( \frac{\mu_i}{\mu_r} \Big)^m + \sum_{i=1}^{p} v_i \Big( \frac{\lambda_i}{\mu_r} \Big)^m, \qquad m = 0, 1, \ldots. \tag{11.16}
\]
Next, we apply the Richardson extrapolation process to the sequence $\{x_m^{(r)}\}$ to eliminate (filter out) the summation $\sum_{i=1,\, i \ne r}^{L} w_i (\mu_i/\mu_r)^m$ from $x_m^{(r)}$ as follows:
\[
\begin{aligned}
y_0^{(j)} &= x_{m+j}^{(r)}, && j = 0, 1, \ldots, L - 1, \\
y_s^{(j)} &= \frac{y_{s-1}^{(j+1)} - c_s\, y_{s-1}^{(j)}}{1 - c_s}, && j = 0, 1, \ldots, L - s - 1, \quad s = 1, \ldots, L - 1, \\
y_m &= y_{L-1}^{(0)}.
\end{aligned}
\]
Here $c_s = \mu_s/\mu_r$ for $1 \le s \le r - 1$ and $c_s = \mu_{s+1}/\mu_r$ for $r \le s \le L - 1$. Each stage multiplies a term of the form $\xi^m$ by $(\xi - c_s)/(1 - c_s)$, so it leaves the constant term $w_r$ unchanged and annihilates the term with $\xi = c_s$. The vectors $y_m$ therefore have the expansion
\[
y_m = w_r + \sum_{i=1}^{p} \alpha_i v_i \Big( \frac{\lambda_i}{\mu_r} \Big)^m, \qquad m = 0, 1, \ldots, \qquad \alpha_i = \prod_{s=1}^{L-1} \frac{\lambda_i/\mu_r - c_s}{1 - c_s}. \tag{11.17}
\]

In view of this expansion, it is clear that we can effectively apply a vector extrapolation method to the sequence $\{y_m\}$ to approximate $w_r$; thus, provided $|\lambda_k| > |\lambda_{k+1}|$, the approximation $s_{n,k}$ obtained by MPE or RRE from the vectors $y_m$, $n \le m \le n + k + 1$, satisfies $s_{n,k} - w_r = O(|\lambda_{k+1}/\mu_r|^n)$ as $n \to \infty$. Doing this for $r = 1, \ldots, L$, we continue to the approximation of the periodic part $\sum_{i=1}^{L} w_i \mu_i^m$. This approach was suggested originally by Sidi and Israeli [300] and is discussed in [278, Section 25.13].
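The Richardson filtering stage can be sketched for a scalar sequence (each vector component is treated in the same way). The function name `richardson_filter`, the 0-based indexing, and the test data (fourth roots of unity plus a single transient $v\lambda^m$) are assumptions made for this illustration; the vector extrapolation stage that would follow is not shown.

```python
import cmath


def richardson_filter(x, mus, r, m):
    """Return y_m, i.e. x_m mu_r^{-m} with the modes mu_i, i != r, filtered out.

    x   : list of scalars x_0, x_1, ... (at least m + len(mus) entries)
    mus : the known distinct nonzero mu_1, ..., mu_L (stored 0-based)
    r   : 0-based index of the mode whose coefficient w_r is wanted
    """
    L = len(mus)
    # y_0^{(j)} = x_{m+j} mu_r^{-(m+j)}, j = 0, ..., L-1
    y = [x[m + j] / mus[r] ** (m + j) for j in range(L)]
    # the c_s run over the ratios mu_i / mu_r with i != r
    for c in (mus[i] / mus[r] for i in range(L) if i != r):
        y = [(y[j + 1] - c * y[j]) / (1 - c) for j in range(len(y) - 1)]
    return y[0]


# Illustrative data: L = 4 roots of unity and one transient term v * lam**m.
mus = [cmath.exp(2j * cmath.pi * k / 4) for k in range(4)]
w = [1.0, 0.5 - 1.0j, -2.0, 0.25j]
v, lam = 3.0, 0.5
x = [sum(wi * mu ** m for wi, mu in zip(w, mus)) + v * lam ** m
     for m in range(40)]

r = 1
y5 = richardson_filter(x, mus, r, 5)
y30 = richardson_filter(x, mus, r, 30)
```

With these data, `y5 - w[r]` consists only of the damped transient term, and `y30` is already close to `w[r]`.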

11.4 Computation of eigenpair derivatives

An additional application of vector extrapolation methods has been to the problem of computing the derivatives of known eigensystems of large sparse matrices. This problem has an interesting mathematical structure, which makes it worth considering in some detail. It arises in different disciplines of engineering, such as mechanical, aeronautical, and civil engineering. It arises, for example, in optimal design of systems where the dynamic stability and/or response of the system is a function of several design parameters. It has been considered using different approaches by many authors. See the earlier works by Rudisill [226], Nelson [194], Golub and Meyer [104], and Meyer and Stewart [189], for example. Here we will be concerned with an iterative method and its acceleration via vector extrapolation methods.

11.4.1 An iterative method for eigenpair derivatives

Let $A(t) \in \mathbb{C}^{N \times N}$ be a function of a real scalar parameter $t$, and, for simplicity, assume that $A(t)$ is diagonalizable in a neighborhood of $t = t_0$. Let $(\mu_i(t), v_i(t))$, with $v_i^*(t)v_i(t) = 1$, $i = 1, \ldots, N$, be the eigenpairs of $A(t)$. A dot denotes differentiation with respect to $t$. For convenience of notation, denote $A(t_0)$ by $A$ and $\dot{A}(t_0)$ by $\dot{A}$, and denote the eigenpairs $(\mu_i(t_0), v_i(t_0))$ of $A(t_0)$ and their derivatives $(\dot{\mu}_i(t_0), \dot{v}_i(t_0))$ by $(\mu_i, v_i)$ and $(\dot{\mu}_i, \dot{v}_i)$, respectively. Finally, assume that $\mu_r$ is a simple nonzero eigenvalue of $A$ and that the eigenpair $(\mu_r, v_r)$ is known; we would like to compute $(\dot{\mu}_r, \dot{v}_r)$.

Differentiating the identity $[A(t) - \mu_r(t)I]v_r(t) = 0$ at $t = t_0$, we first have
\[
(\dot{A} - \dot{\mu}_r I)v_r + (A - \mu_r I)\dot{v}_r = 0, \tag{11.18}
\]
from which
\[
\dot{v}_r = \frac{1}{\mu_r}\big[ A\dot{v}_r + (\dot{A} - \dot{\mu}_r I)v_r \big]. \tag{11.19}
\]
Multiplying (11.18) on the left by $v_r^*$ and invoking $v_r^* v_r = 1$, we obtain
\[
\dot{\mu}_r = v_r^*(A - \mu_r I)\dot{v}_r + v_r^* \dot{A} v_r. \tag{11.20}
\]
Substituting (11.20) into (11.19), thus eliminating $\dot{\mu}_r$, we also have
\[
\dot{v}_r = B\dot{v}_r + c, \tag{11.21}
\]
where
\[
B = \frac{1}{\mu_r}\big[ A - v_r v_r^*(A - \mu_r I) \big], \qquad c = \frac{1}{\mu_r}\big[ \dot{A} v_r - (v_r^* \dot{A} v_r)v_r \big]. \tag{11.22}
\]
Starting with an initial approximation $x_0$ for $\dot{v}_r$, we use (11.19) and (11.20) to generate the sequences of approximations $\{\sigma_m\}$ and $\{x_m\}$ to $\dot{\mu}_r$ and $\dot{v}_r$, respectively, as follows:
\[
\begin{aligned}
\sigma_m &= v_r^*(A - \mu_r I)x_m + v_r^* \dot{A} v_r, \\
x_{m+1} &= \frac{1}{\mu_r}\big[ A x_m + (\dot{A} - \sigma_m I)v_r \big],
\end{aligned}
\qquad m = 0, 1, \ldots. \tag{11.23}
\]

The iterative procedure we have just described was developed by Rudisill and Chu [227]. We turn to the convergence study of this procedure next.
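A minimal sketch of the recursion (11.23) for real data follows. The test matrix $A(t) = \begin{bmatrix} 2+t & t \\ t & 1 \end{bmatrix}$ at $t_0 = 0$, the helper names, and the starting vector $x_0 = 0$ are assumptions chosen for illustration; for this example the known eigenpair is $\mu_r = 2$, $v_r = [1, 0]^T$, and first-order perturbation theory gives $\dot{\mu}_r = 1$ and $\dot{v}_r = [0, 1]^T$.

```python
def matvec(M, x):
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def rudisill_chu(A, Adot, mu, v, x0, steps):
    """Run `steps` steps of (11.23) for real data.

    Returns (sigma, x), where x is the iterate after `steps` updates and
    sigma is the value computed from the iterate just before the last update.
    """
    n = len(v)
    Adot_v = matvec(Adot, v)
    x = list(x0)
    sigma = None
    for _ in range(steps):
        Ax = matvec(A, x)
        # sigma_m = v^*(A - mu I) x_m + v^* Adot v
        sigma = dot(v, Ax) - mu * dot(v, x) + dot(v, Adot_v)
        # x_{m+1} = [A x_m + (Adot - sigma_m I) v] / mu
        x = [(Ax[i] + Adot_v[i] - sigma * v[i]) / mu for i in range(n)]
    return sigma, x


# A(t) = [[2 + t, t], [t, 1]] evaluated at t0 = 0 (illustrative choice).
A = [[2.0, 0.0], [0.0, 1.0]]
Adot = [[1.0, 1.0], [1.0, 0.0]]
mu_r, v_r = 2.0, [1.0, 0.0]
sigma, x = rudisill_chu(A, Adot, mu_r, v_r, [0.0, 0.0], 60)
```

Because $\mu_r = 2$ is the largest eigenvalue here, the iteration converges on its own (at the rate $\mu_2/\mu_r = 1/2$), which makes the example easy to check by hand.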

11.4.2 Analysis of the iterative method

Upon substituting the first of the recursions in (11.23) into the second, thus eliminating $\sigma_m$, we obtain the following recursion relation for $\{x_m\}$ only:
\[
x_{m+1} = Bx_m + c, \qquad m = 0, 1, \ldots. \tag{11.24}
\]
Of course, this is something we expect in view of (11.21). Subtracting (11.21) from (11.24), we obtain
\[
x_{m+1} - \dot{v}_r = B(x_m - \dot{v}_r), \qquad m = 0, 1, \ldots,
\]
which implies
\[
x_m - \dot{v}_r = B^m(x_0 - \dot{v}_r), \qquad m = 0, 1, \ldots. \tag{11.25}
\]
To analyze the behavior of $x_m - \dot{v}_r$, we need to have information about the eigenpairs of $B$. It can easily be verified that
\[
(1, v_r) \quad \text{and} \quad (\mu_i/\mu_r, w_i), \quad w_i = v_i - (v_r^* v_i)v_r, \quad 1 \le i \le N, \ i \ne r, \tag{11.26}
\]
are the eigenpairs of $B$. Let us recall that $A$ is diagonalizable; therefore, its eigenvectors are linearly independent and hence span $\mathbb{C}^N$. It is easy to see that the eigenvectors of $B$ are also linearly independent and hence span $\mathbb{C}^N$ too. Therefore, we can express $x_0 - \dot{v}_r$ as
\[
x_0 - \dot{v}_r = \sum_{\substack{i=1 \\ i \ne r}}^{N} \alpha_i w_i + \alpha_r v_r. \tag{11.27}
\]
Consequently, by (11.25), we also have that
\[
x_m - \dot{v}_r = \sum_{\substack{i=1 \\ i \ne r}}^{N} \alpha_i (\mu_i/\mu_r)^m w_i + \alpha_r v_r, \qquad m = 0, 1, \ldots, \tag{11.28}
\]
which we rewrite as
\[
x_m = (\dot{v}_r + \alpha_r v_r) + \sum_{\substack{i=1 \\ i \ne r}}^{N} \alpha_i (\mu_i/\mu_r)^m w_i, \qquad m = 0, 1, \ldots. \tag{11.29}
\]
Recalling our assumption that $\mu_r$ is a simple eigenvalue, we can conclude from (11.29) that $x_m$ has an expansion of the form
\[
x_m = (\dot{v}_r + \alpha_r v_r) + \sum_{i=1}^{p} z_i \lambda_i^m, \qquad m \ge 1. \tag{11.30}
\]
Here (i) $\lambda_i$ are some or all of the distinct nonzero eigenvalues $\mu_i/\mu_r$ of $B$ with $i \ne r$, and they are all different from 1; (ii) $z_i$ are eigenvectors of $B$ corresponding to the respective $\lambda_i$; and (iii) $p < N$. Let us order the $\lambda_i$ such that $|\lambda_1| \ge |\lambda_2| \ge \cdots \ge |\lambda_p|$ and assume that $|\lambda_k| > |\lambda_{k+1}|$. Thus, all the conditions of Theorem 6.6 are satisfied. Consequently, vector extrapolation methods can be applied to the sequence $\{x_m\}$, and the resulting approximation to the limit or antilimit $\dot{v}_r + \alpha_r v_r$ satisfies
\[
s_{n,k} = (\dot{v}_r + \alpha_r v_r) + O(|\lambda_{k+1}|^n) \quad \text{as } n \to \infty. \tag{11.31}
\]
Of course, convergence (to $\dot{v}_r + \alpha_r v_r$) takes place provided $|\lambda_{k+1}| < 1$. This will be the case naturally if one is the largest eigenvalue of $B$ or, equivalently, $\mu_r$ is the largest eigenvalue of $A$. Since the matrices $A$ and $\dot{A}$ are available, as always, we can apply extrapolation methods in the cycling mode too.

With the vector $s_{n,k}$ available as an approximation to $\dot{v}_r + \alpha_r v_r$, we can approximate $\dot{\mu}_r$ from the first of the recursions in (11.23) by replacing $\sigma_m$ by $\theta_{n,k}$ and $x_m$ by $s_{n,k}$:
\[
\theta_{n,k} = v_r^*(A - \mu_r I)s_{n,k} + v_r^* \dot{A} v_r. \tag{11.32}
\]
We now show that $\theta_{n,k}$ is an approximation to $\dot{\mu}_r$ independent of $\alpha_r$. Subtracting (11.20) from (11.32) and invoking (11.31), we have
\[
\begin{aligned}
\theta_{n,k} - \dot{\mu}_r &= v_r^*(A - \mu_r I)(s_{n,k} - \dot{v}_r) \\
&= v_r^*(A - \mu_r I)\big[ \alpha_r v_r + O(|\lambda_{k+1}|^n) \big] \quad \text{as } n \to \infty \\
&= O(|\lambda_{k+1}|^n) \quad \text{as } n \to \infty.
\end{aligned}
\tag{11.33}
\]

Note that $s_{n,k}$ is an approximation to $\dot{v}_r + \alpha_r v_r$ and not to $\dot{v}_r$ necessarily. By assuming, without loss of generality, that$^{52}$
\[
v_r^* \dot{v}_r = 0 \quad \text{as well as} \quad v_r^* v_r = 1, \tag{11.34}
\]
and by choosing $x_0$ such that
\[
v_r^* x_0 = 0, \tag{11.35}
\]
we can force $\alpha_r = 0$, as we show next: By (11.26), (11.27), (11.34), and (11.35), we have
\[
\begin{aligned}
v_r^*(x_0 - \dot{v}_r) &= \sum_{\substack{i=1 \\ i \ne r}}^{N} \alpha_i (v_r^* w_i) + \alpha_r (v_r^* v_r), \\
(v_r^* x_0) - (v_r^* \dot{v}_r) &= \sum_{\substack{i=1 \\ i \ne r}}^{N} \alpha_i \big[ (v_r^* v_i) - (v_r^* v_i)(v_r^* v_r) \big] + \alpha_r (v_r^* v_r), \\
0 &= \alpha_r.
\end{aligned}
\]
Of course, for an arbitrary vector $f$, $x_0 = f - (v_r^* f)v_r$ guarantees (11.35); so does the simplest choice $x_0 = 0$. Finally, with $x_0$ chosen as in (11.35), we also have
\[
s_{n,p} = \dot{v}_r \quad \text{and} \quad \theta_{n,p} = \dot{\mu}_r
\]
exactly, by Theorem 6.6.

The study of the convergence of the sequence $\{x_m\}$ presented above was given in Andrew [3]. For further developments concerning multiple eigenvalues, defective matrices, and more, see Andrew [4]. The application of vector extrapolation methods to the sequence $\{x_m\}$ was suggested by Tan [318, 319, 320, 322]. The results on the convergence of extrapolation methods presented here are new. Computation of the derivatives of several eigenpairs simultaneously has also received considerable attention lately; see Andrew and Tan [5, 6], for example.

52 That $v_r$ can be taken to satisfy both $v_r^* v_r = 1$ and $v_r^* \dot{v}_r = 0$ can be shown as follows: For all $t$ close to $t_0$, let $y_r(t)$ be an eigenvector of $A(t)$ corresponding to $\mu_r(t)$, normalized such that $y_r^*(t)y_r(t) = 1$. From this, we have $\frac{d}{dt}[y_r^*(t)y_r(t)] = 0$, which implies that $\Re[y_r^*(t)\dot{y}_r(t)] = 0$; $\Im[y_r^*(t)\dot{y}_r(t)] \ne 0$ in general, however. Let $\Im[y_r^*(t_0)\dot{y}_r(t_0)] = a$, and set $v_r(t) = e^{-ia(t - t_0)}y_r(t)$. Clearly, (i) $A(t)v_r(t) = \mu_r(t)v_r(t)$, (ii) $v_r^*(t)v_r(t) = 1$, and (iii) $v_r^*(t_0)\dot{v}_r(t_0) = 0$. [If $v_r(t_0)$ is real and $v_r^*(t_0)v_r(t_0) = 1$, we have $v_r^*(t_0)\dot{v}_r(t_0) = 0$ automatically.]

A special case: A a normal matrix

When $A$ is a normal matrix, a simplification takes place in the sense that $\dot{\mu}_r$ can be computed exactly via
\[
\dot{\mu}_r = v_r^* \dot{A} v_r. \tag{11.36}
\]
This follows from (11.20) and from the fact that $v_r^* A = \mu_r v_r^*$ when $A$ is normal. Let
\[
B = \frac{1}{\mu_r} A, \qquad c = \frac{1}{\mu_r}(\dot{A} - \dot{\mu}_r I)v_r. \tag{11.37}
\]
Then we can express (11.19) as
\[
\dot{v}_r = B\dot{v}_r + c. \tag{11.38}
\]
Starting with an initial approximation $x_0$ for $\dot{v}_r$, we use (11.38) to generate the sequence of approximations $\{x_m\}$ to $\dot{v}_r$ as follows:
\[
x_{m+1} = Bx_m + c, \qquad m = 0, 1, \ldots. \tag{11.39}
\]
Subtracting (11.38) from (11.39), we have
\[
x_{m+1} - \dot{v}_r = B(x_m - \dot{v}_r), \qquad m = 0, 1, \ldots. \tag{11.40}
\]
Clearly, $B$ has as its eigenpairs
\[
(1, v_r) \quad \text{and} \quad (\mu_i/\mu_r, v_i), \quad 1 \le i \le N, \ i \ne r. \tag{11.41}
\]
We now proceed as before. We leave the details to the reader.

11.5 Application to solution of singular linear systems

We begin with three definitions concerning singular matrices in $\mathbb{C}^{N \times N}$.

Definition 11.5. Let $C$ be a singular matrix. The index of $C$, denoted $\operatorname{ind}(C)$, is defined to be the smallest integer $k$ for which $\operatorname{rank}(C^{k+1}) = \operatorname{rank}(C^k)$. Equivalently, $\operatorname{ind}(C)$ is the size of the largest Jordan block of $C$ with eigenvalue zero.

Definition 11.6. Let $C$ be a square singular matrix, and let the Jordan canonical form of $C$ be
\[
V^{-1} C V = J = \begin{bmatrix} J_1 & O \\ O & J_0 \end{bmatrix},
\]
where $J_0$ and $J_1$ are block diagonal matrices made up of the Jordan blocks of $C$ with, respectively, zero and nonzero eigenvalues. The Drazin inverse of $C$, denoted $C^D$, is then the matrix
\[
C^D = V J^D V^{-1}, \qquad J^D = \begin{bmatrix} J_1^{-1} & O \\ O & O \end{bmatrix}.
\]
If $\operatorname{ind}(C) = 1$, $C^D$ is also called the group inverse of $C$ and is denoted $C^{\#}$.$^{53}$

The Drazin inverse is one of several generalized inverses of matrices. It is treated along with other generalized inverses in great detail in the books by Ben-Israel and Greville [20], Campbell and Meyer [56], and Rao and Mitra [215]. It is a most important tool when dealing with so-called differential-algebraic equations; for this topic, see Campbell [54, 55], Ascher and Petzold [8], and Kunkel and Mehrmann [162], for example.

Definition 11.7. Let $C$ be a singular matrix, and consider the linear system $Cx = d$, whether consistent or not. The vector $C^D d$ is said to be the Drazin inverse solution to this linear system. When $\operatorname{ind}(C) = 1$, this vector is also the group inverse solution $C^{\#} d$.

53 The Drazin inverse of the singular matrix $C$ with $\operatorname{ind}(C) = a$ is actually the unique matrix $X$ that satisfies $C^{a+1}X = C^a$, $XCX = X$, $CX = XC$. It is easy to verify that $C^D$, as given here in terms of the Jordan canonical form of $C$, satisfies all these conditions.


The two problems that we dealt with in Sections 11.3 and 11.4 give rise to the singular linear systems (i) $A\hat{v} = \hat{v}$ in (11.5) and (ii) $\dot{v}_r = B\dot{v}_r + c$ in (11.21), both of the form
\[
x = Tx + d, \qquad T \in \mathbb{C}^{N \times N}, \qquad I - T \text{ singular}, \tag{11.42}
\]
such that (i) this system is consistent and (ii) $\operatorname{ind}(I - T) = 1$, because both matrices $A$ and $B$ are assumed to have nondefective eigenvalues equal to one. For convenience, let us write (11.42) in the equivalent form
\[
Cx = d, \qquad C = I - T \in \mathbb{C}^{N \times N}, \qquad C \text{ singular}. \tag{11.43}
\]
The numerical solution of such systems, whether consistent or not, is of interest in itself. Here we would like to consider briefly the application of vector extrapolation and Krylov subspace methods to them, assuming that $\operatorname{ind}(C) = \operatorname{ind}(I - T)$ is arbitrary.

Let us define $\mathcal{S}(T)$ to be the subspace spanned by the eigenvectors and principal vectors of $T$ corresponding to the eigenvalues different from one and $\widehat{\mathcal{N}}(T)$ to be the subspace spanned by the eigenvectors and principal vectors of $T$ corresponding to the eigenvalue one. It is easy to see that $\widehat{\mathcal{N}}(T)$ contains $\mathcal{N}(C)$, the null space of $C$.$^{54}$ Clearly, $\mathbb{C}^N = \mathcal{S}(T) \oplus \widehat{\mathcal{N}}(T)$.

We begin by considering the singular system in (11.42), assuming that it is consistent. This system has infinitely many solutions (in the regular sense)
\[
s = \hat{s} + \tilde{x}, \qquad \hat{s} \in \mathcal{S}(T) \text{ unique}, \qquad \tilde{x} \in \widehat{\mathcal{N}}(T) \text{ nonunique}. \tag{11.44}
\]
The vector $\hat{s}$ is the Drazin inverse solution of the singular system above, namely, $\hat{s} = C^D d$. Of course, if $d = 0$, then $s = \tilde{x}$. If $\operatorname{ind}(C) = 1$ in addition, then $s$ is simply an eigenvector of $T$ corresponding to the eigenvalue one. If one is a simple eigenvalue of $T$, then $s$ is a constant multiple of the only corresponding eigenvector. This is precisely what we have in Section 11.3, with $s = \tilde{x} = \hat{v}$.

If $d \ne 0$ and $\operatorname{ind}(C) = 1$, then $s = C^{\#}d + \tilde{x}$, $\tilde{x} \in \mathcal{N}(C)$. Again, if one is a simple eigenvalue of $T$, then $\tilde{x}$ is a multiple of the only corresponding eigenvector. This is precisely the situation we have in Section 11.4, with $s = \dot{v}_r + \beta v_r$ and $\dot{v}_r = (I - B)^{\#}c$.

As shown in Sidi [264], these consistent systems, with $\operatorname{ind}(C) = 1$, can be solved for $\hat{s} = C^{\#}d$ by applying vector extrapolation methods to a sequence $\{x_m\}$ generated via
\[
x_{m+1} = Tx_m + d, \qquad m = 0, 1, \ldots. \tag{11.45}
\]
We turn to this issue next. First, by (11.44) and by the fact that $Tz = z$ for every $z \in \widehat{\mathcal{N}}(T)$, we have
\[
s = Ts + d \quad \Rightarrow \quad \hat{s} = T\hat{s} + d. \tag{11.46}
\]
Next, subtracting (11.46) from (11.45), we obtain
\[
x_{m+1} - \hat{s} = T(x_m - \hat{s}) \quad \Rightarrow \quad x_m - \hat{s} = T^m(x_0 - \hat{s}), \qquad m = 0, 1, \ldots. \tag{11.47}
\]

54 Of course, if $T$ has only eigenvectors corresponding to the eigenvalue one, that is, if $\operatorname{ind}(C) = 1$, then $\widehat{\mathcal{N}}(T) = \mathcal{N}(C)$.

Assuming for simplicity that $T$ is diagonalizable, $(\mu_i, v_i)$, $i = 1, \ldots, N$, being its eigenpairs, we have
\[
x_0 - \hat{s} = \sum_{\substack{i=1 \\ \mu_i \ne 1}}^{N} \alpha_i v_i + z, \qquad z = \sum_{\substack{i=1 \\ \mu_i = 1}}^{N} \alpha_i v_i \in \widehat{\mathcal{N}}(T)
\]
for some scalars $\alpha_i$. Therefore, by (11.47) and by $T^m z = z$,
\[
x_m = (\hat{s} + z) + \sum_{\substack{i=1 \\ \mu_i \ne 1}}^{N} \alpha_i \mu_i^m v_i, \qquad m = 0, 1, \ldots, \qquad z \in \widehat{\mathcal{N}}(T).
\]
We can now conclude, as before, that vector extrapolation methods can be applied to the sequence $\{x_m\}$, and we obtain $s_{n,k}$ as approximations to $\hat{s} + z$. If we want $s_{n,k}$ to approximate $\hat{s} = C^{\#}d$ only, then we can force $z = 0$ by choosing $x_0 \in \mathcal{S}(T)$; this can be accomplished by choosing $x_0 = Cy$ for some arbitrary $y$ or by choosing $x_0 = 0$. See [264] for further developments.

As proposed in Sidi [264, 279, 285], one can also apply Krylov subspace methods such as FOM and GMR to the singular system $Cx = d$. See also Ipsen and Meyer [143]. Golub and Greif [100, 101] discuss the application of the method of Arnoldi (FOM) and methods of Arnoldi type to $(I - A)x = 0$ to compute the PageRank of the Google Web matrix $A$.

Whether the system in (11.42) is consistent or not, and whatever $\operatorname{ind}(C)$, the Drazin inverse solution $C^D d$ can be obtained by applying to a vector sequence $\{x_m\}$ generated via $x_{m+1} = Tx_m + d$ some specially designed vector extrapolation methods of Sidi [267]. $C^D d$ can also be obtained by applying to the system $Cx = d$ some specially designed Krylov subspace methods of Calvetti, Reichel, and Zhang [53]; Sidi [274, 276]; and Sidi and Kluzner [302], and semiiterative methods by Eiermann, Marek, and Niethammer [77]; Hanke and Hochbruck [126]; Climent, Neumann, and Sidi [61]; and Sidi and Kanevsky [301]. As this is a very specialized topic beyond the scope of this work, we do not pursue it further here.
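The role of the initial vector in the iteration $x_{m+1} = Tx_m + d$ for a consistent singular system with $\operatorname{ind}(I - T) = 1$ can be seen in a small sketch. The $2 \times 2$ matrix, the right-hand side, and the function name below are assumptions chosen so that everything can be verified by hand: $T$ has eigenvalues $1$ and $0.25$, the range of $C = I - T$ is spanned by $[1, -1]^T$, and the solution of $Cx = d$ lying in that range is $[4/3, -4/3]^T$.

```python
def iterate(T, d, x0, steps):
    """Apply x <- T x + d `steps` times, starting from x0."""
    x = list(x0)
    n = len(x)
    for _ in range(steps):
        x = [sum(T[i][j] * x[j] for j in range(n)) + d[i] for i in range(n)]
    return x


# T has the simple eigenvalue one (eigenvector [2, 1]) and the eigenvalue 0.25.
T = [[0.75, 0.5], [0.25, 0.5]]
d = [1.0, -1.0]                     # d = (I - T) [4, 0]^T, so the system is consistent

from_zero = iterate(T, d, [0.0, 0.0], 80)    # x0 = 0: the null-space part z is 0
from_other = iterate(T, d, [2.0, 1.0], 80)   # x0 = eigenvector for one: z = [2, 1]
```

Starting from $x_0 = 0$ the iterates tend to $[4/3, -4/3]^T$, whereas starting from $x_0 = [2, 1]^T$ they tend to that vector plus $z = [2, 1]^T$, in line with the expansion of $x_m$ above.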

11.6 Application to multidimensional scaling

Multidimensional scaling (MDS) is a generic name for a family of methods that, given a matrix representing the pairwise distances between a set of points in some abstract metric space, attempts to find a representation of these points in a low-dimensional (typically Euclidean) space. The distances in the target space should be as close to the original ones as possible. MDS methods are of great importance in the field of multidimensional data analysis. Originally introduced in the field of psychology, these methods have since then been applied to various problems. Some of the most common applications are dimensionality reduction, visualization and analysis of data (for example, financial data), information retrieval, graph visualization, texture mapping in computer graphics, and bioinformatics. More recently, MDS methods have been brought into the computer vision community as efficient methods for nonrigid shape analysis and recognition. For an extensive survey of MDS, see the book by Borg and Groenen [25], for example. We follow the treatment given in this book here.

The data sets encountered in the above applications are often very large. At the same time, the nonlinear and nonconvex nature of MDS problems tends to make their solution computationally demanding. As a result, MDS algorithms tend to be slow, which makes their practical application in large-scale problems challenging. A number of low-cost algorithms that find an approximate solution to MDS problems have recently been proposed for large-scale settings. Yet some of the applications (for example, the representation of intrinsic geometry of nonrigid shapes in computer vision) require (numerically) exact solutions, which makes approximate MDS algorithms inappropriate. This problem can be alleviated by using multigrid methods (see Bronstein et al. [48]) and vector extrapolation methods in conjunction with some fixed-point iterative schemes (see Rosman et al. [223]). Here we will discuss very briefly the application of vector extrapolation methods in the cycling mode to a least-squares MDS formulation with the so-called SMACOF iterative scheme.$^{55}$ For several numerical examples from different disciplines, we refer the reader to [223]. See also the book by Bronstein, Bronstein, and Kimmel [47, Chapter 7].

The SMACOF algorithm uses iterative majorization, a concept that can be explained as follows: We would like to minimize a complicated function $f(x)$.$^{56}$ The central idea is to construct a function $g(x, z)$, with $z$ fixed, that is simpler to minimize than $f(x)$. The following requirements need to be satisfied by $g(x, z)$:
\[
f(x) \le g(x, z) \quad \text{and} \quad f(z) = g(z, z).
\]
If we let $x'$ be such that $g(x', z) = \min_x g(x, z)$, then we have $f(x') \le g(x', z) \le g(z, z) = f(z)$. This can now be used iteratively to obtain a sequence $\{x_m\}$ such that $\{f(x_m)\}$ is nonincreasing: Starting with some $x_0$, and assuming that $x_1, \ldots, x_m$ have been generated, we determine $x_{m+1}$ via $g(x_{m+1}, x_m) = \min_x g(x, x_m)$. Then we have $f(x_{m+1}) \le g(x_{m+1}, x_m) \le g(x_m, x_m) = f(x_m)$, meaning that the sequence $\{f(x_m)\}$ is nonincreasing. Of course, if $f(x)$ is bounded from below, then $\lim_{m\to\infty} f(x_m)$ exists, and if $f(x)$ is continuous, then $\lim_{m\to\infty} x_m$ exists and is a point of local minimum of $f(x)$.
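Iterative majorization can be tried out on a one-dimensional toy problem. The function $f(x) = |x - 3| + x^2/2$, whose minimum is at $x = 1$, and the quadratic majorizer of the absolute value used below are assumptions chosen for illustration; they are not the SMACOF majorizer itself. The sketch relies on the elementary bound $|t| \le t^2/(2|t_0|) + |t_0|/2$, which holds with equality at $t = t_0 \ne 0$.

```python
def f(x):
    return abs(x - 3.0) + 0.5 * x * x


def majorize_step(z):
    """Minimize g(x, z) = (x - 3)^2 / (2|z - 3|) + |z - 3|/2 + x^2/2 over x.

    g(x, z) >= f(x) for all x and g(z, z) = f(z) (for z != 3); setting
    dg/dx = (x - 3)/|z - 3| + x = 0 gives the closed-form minimizer below.
    """
    return 3.0 / (1.0 + abs(z - 3.0))


xs = [0.0]
for _ in range(60):
    xs.append(majorize_step(xs[-1]))
values = [f(x) for x in xs]
```

The values $f(x_m)$ in `values` never increase, and the iterates `xs` approach the minimizer $x = 1$, where $f = 2.5$.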

11.6.1 Description of MDS and SMACOF algorithm

Let the data to be analyzed consist of $N$ elements in some metric space, and for $1 \le i, j \le N$, let the real scalars $\delta_{ij}$ be the distances between the $i$th and $j$th elements. Thus, $\delta_{ij}$ is some measure of the dissimilarity between the $i$th and $j$th elements; therefore, the larger $\delta_{ij}$, the greater the dissimilarity. We assume that the $\delta_{ij}$ are given. We now represent the $i$th element by a real $\nu$-dimensional row vector $x_i = [x_{i1}, \ldots, x_{i\nu}]$, $i = 1, \ldots, N$, and define the $N \times \nu$ matrix $X$ as
\[
X = [x_{ij}]_{1 \le i \le N,\ 1 \le j \le \nu} =
\begin{bmatrix}
x_{11} & x_{12} & \cdots & x_{1\nu} \\
x_{21} & x_{22} & \cdots & x_{2\nu} \\
\vdots & \vdots & & \vdots \\
x_{N1} & x_{N2} & \cdots & x_{N\nu}
\end{bmatrix}.
\]

55 The acronym SMACOF originally stood for scaling by maximizing a convex function. Now it stands for scaling by majorizing a complicated function.
56 $f(x)$ may be complicated in the sense that the computation of its derivative $f'(x)$ may be very costly, or that $f'(x)$ may not exist for all relevant $x$, for example.

Taking $d_{ij}(X)$ to be the Euclidean distance between $x_i$ and $x_j$, namely,
\[
d_{ij}(X) = \|x_i - x_j\| = \Big[ \sum_{k=1}^{\nu} (x_{ik} - x_{jk})^2 \Big]^{1/2},
\]
we define the so-called stress function $s(X)$ as
\[
s(X) = \sum_{i<j} w_{ij}\,\big(d_{ij}(X) - \delta_{ij}\big)^2, \qquad w_{ij} > 0 \ \ \forall\, i \ne j,
\]
where $\sum_{i<j}$ stands for $\sum_{i=1}^{N} \sum_{j=i+1}^{N}$. Finally, we minimize the stress function with respect to the matrix $X$ (that is, with respect to the $x_{ij}$). The minimization process is achieved by the SMACOF algorithm as follows: First, we expand $s(X)$ as in
\[
s(X) = \eta_\delta + \eta(X) - 2\rho(X), \tag{11.48}
\]

where ηδ =

i ρ necessarily.59 In addition to being good approximations to $f(z)$, these rational approximations have the important advantage that they can be used constructively to provide quantitative information about the singularities (their location and nature) of $f(z)$.

In this chapter, we apply MPE, MMPE, and TEA to the sequence of the partial sums of the Maclaurin series of $f(z)$ to obtain some vector-valued rational approximations to it. We denote these approximation procedures SMPE, SMMPE, and STEA, respectively, and we analyze their algebraic and analytic properties. As before, we use the weighted inner product and the norm induced by it, namely,
\[
(y, z) = y^* M z, \qquad \|z\| = \sqrt{(z, z)}, \qquad M \text{ Hermitian positive definite}.
\]
The contents of this chapter are taken from Sidi [269, 270]. The rational approximations from STEA were originally proposed by Brezinski [28]. In the next chapter, we will briefly treat a few more vector-valued rational approximation procedures that have been proposed in the past, such as the generalized inverse vector-valued Padé approximants that are related to VEA and the simultaneous Padé approximants.

59 A function $f(z)$ is meromorphic in a domain of the complex plane if it is analytic at every point of this domain, except at a finite number of points where it has only polar singularities.


Chapter 12. Rational Approximations from Vector-Valued Power Series: Part I

12.2 Derivation of vector-valued rational approximations

Let the Maclaurin series of $f(z)$ be
\[
f(z) = \sum_{i=0}^{\infty} u_i z^i, \qquad u_i = \frac{f^{(i)}(0)}{i!}, \qquad i = 0, 1, \ldots, \tag{12.1}
\]
and define
\[
x_m(z) = \sum_{i=0}^{m-1} u_i z^i, \qquad m = 0, 1, \ldots, \qquad x_0(z) \equiv 0. \tag{12.2}
\]
As a result,$^{60}$
\[
\Delta x_m(z) = x_{m+1}(z) - x_m(z) = u_m z^m, \qquad m = 0, 1, \ldots. \tag{12.3}
\]

Now apply the three vector extrapolation methods to the sequence $\{x_m(z)\}$. By Lemma 1.14 and Theorem 1.15 of Chapter 1, the approximations $s_{n,k}(z)$ that we obtain have the following determinant representation:
\[
s_{n,k}(z) = \frac{\widetilde{D}(x_n(z), x_{n+1}(z), \ldots, x_{n+k}(z))}{\widetilde{D}(1, 1, \ldots, 1)}, \tag{12.4}
\]
with
\[
\widetilde{D}(v_0, v_1, \ldots, v_k) =
\begin{vmatrix}
v_0 & v_1 & \cdots & v_k \\
\tilde{u}_{0,0} & \tilde{u}_{0,1} & \cdots & \tilde{u}_{0,k} \\
\tilde{u}_{1,0} & \tilde{u}_{1,1} & \cdots & \tilde{u}_{1,k} \\
\vdots & \vdots & & \vdots \\
\tilde{u}_{k-1,0} & \tilde{u}_{k-1,1} & \cdots & \tilde{u}_{k-1,k}
\end{vmatrix} \tag{12.5}
\]
and
\[
\tilde{u}_{i,j} =
\begin{cases}
(\Delta x_{n+i}(z), \Delta x_{n+j}(z)) & \text{for SMPE}, \\
(q_{i+1}, \Delta x_{n+j}(z)) & \text{for SMMPE}, \\
(q, \Delta x_{n+i+j}(z)) & \text{for STEA}.
\end{cases} \tag{12.6}
\]
Thus, from (12.3),
\[
\tilde{u}_{i,j} =
\begin{cases}
u_{i,j}\, \bar{z}^{\,n+i} z^{n+j}, & u_{i,j} = (u_{n+i}, u_{n+j}) & \text{for SMPE}, \\
u_{i,j}\, z^{n+j}, & u_{i,j} = (q_{i+1}, u_{n+j}) & \text{for SMMPE}, \\
u_{i,j}\, z^{n+i+j}, & u_{i,j} = (q, u_{n+i+j}) & \text{for STEA}.
\end{cases} \tag{12.7}
\]
As usual, $q_1, \ldots, q_k$ are linearly independent vectors in $\mathbb{C}^N$ and $q$ is a nonzero vector in $\mathbb{C}^N$. We now perform elementary row and column transformations on $\widetilde{D}(v_0, v_1, \ldots, v_k)$ as follows.

60 We have chosen to work with the Maclaurin series for convenience. We can also work with a Taylor series about an arbitrary $z_0$, in which case we will have $f(z) = \sum_{i=0}^{\infty} u_i \zeta^i$, where $\zeta = z - z_0$, instead of (12.1). Thus, there is no loss of generality when working with (12.1).


For SMPE:
    For i = 0, 1, ..., k − 1 do
        Divide row i + 2 by $\bar{z}^{\,n+i}$.
    end do (i)
    For j = 0, 1, ..., k do
        Divide column j + 1 by $z^{n+j}$.
    end do (j)
    Multiply the first row by $z^{n+k}$.

For SMMPE:
    For j = 0, 1, ..., k do
        Divide column j + 1 by $z^{n+j}$.
    end do (j)
    Multiply the first row by $z^{n+k}$.

For STEA:
    For i = 0, 1, ..., k − 1 do
        Divide row i + 2 by $z^{i}$.
    end do (i)
    For j = 0, 1, ..., k do
        Divide column j + 1 by $z^{n+j}$.
    end do (j)
    Multiply the first row by $z^{n+k}$.

As a result of these transformations, we obtain
\[ \widetilde{D}(v_0, v_1, \ldots, v_k) = \phi_{n,k}\, D(z^k v_0, z^{k-1} v_1, \ldots, z^0 v_k), \tag{12.8} \]
where
\[ \phi_{n,k} = \begin{cases} \left(\prod_{i=0}^{k-1} \bar{z}^{\,n+i}\right) \left(\prod_{j=0}^{k} z^{n+j}\right) z^{-n-k} & \text{for SMPE}, \\ \left(\prod_{j=0}^{k} z^{n+j}\right) z^{-n-k} & \text{for SMMPE}, \\ \left(\prod_{i=0}^{k-1} z^{i}\right) \left(\prod_{j=0}^{k} z^{n+j}\right) z^{-n-k} & \text{for STEA}, \end{cases} \tag{12.9} \]
while
\[ D(w_0, w_1, \ldots, w_k) = \begin{vmatrix} w_0 & w_1 & \cdots & w_k \\ u_{0,0} & u_{0,1} & \cdots & u_{0,k} \\ u_{1,0} & u_{1,1} & \cdots & u_{1,k} \\ \vdots & \vdots & & \vdots \\ u_{k-1,0} & u_{k-1,1} & \cdots & u_{k-1,k} \end{vmatrix}. \tag{12.10} \]
Here the $u_{i,j}$ are as in (12.7). Substituting (12.8) with (12.10) into (12.4) and canceling the factors $\phi_{n,k}$ from the numerator and denominator, we obtain the determinant representation given in the next theorem.

Theorem 12.1. The approximations $s_{n,k}(z)$ from the power series $\sum_{i=0}^{\infty} u_i z^i$ are
\[ s_{n,k}(z) = \frac{D(z^k x_n(z), z^{k-1} x_{n+1}(z), \ldots, z^0 x_{n+k}(z))}{D(z^k, z^{k-1}, \ldots, z^0)}, \tag{12.11} \]
which we write in full as
\[ s_{n,k}(z) = \frac{\begin{vmatrix} z^k x_n(z) & z^{k-1} x_{n+1}(z) & \cdots & z^0 x_{n+k}(z) \\ u_{0,0} & u_{0,1} & \cdots & u_{0,k} \\ u_{1,0} & u_{1,1} & \cdots & u_{1,k} \\ \vdots & \vdots & & \vdots \\ u_{k-1,0} & u_{k-1,1} & \cdots & u_{k-1,k} \end{vmatrix}}{\begin{vmatrix} z^k & z^{k-1} & \cdots & z^0 \\ u_{0,0} & u_{0,1} & \cdots & u_{0,k} \\ u_{1,0} & u_{1,1} & \cdots & u_{1,k} \\ \vdots & \vdots & & \vdots \\ u_{k-1,0} & u_{k-1,1} & \cdots & u_{k-1,k} \end{vmatrix}}, \tag{12.12} \]
with
\[ u_{i,j} = \begin{cases} (u_{n+i}, u_{n+j}) & \text{for SMPE}, \\ (q_{i+1}, u_{n+j}) & \text{for SMMPE}, \\ (q, u_{n+i+j}) & \text{for STEA}. \end{cases} \tag{12.13} \]

As in Chapter 6, we can now replace $x_m(z)$ in this determinant representation by $x_{m+\nu}(z)$ for some fixed integer $\nu \ge 0$ and modify (12.11) to read
\[ s_{n,k}(z) = \frac{D(z^k x_{n+\nu}(z), z^{k-1} x_{n+\nu+1}(z), \ldots, z^0 x_{n+\nu+k}(z))}{D(z^k, z^{k-1}, \ldots, z^0)}. \tag{12.14} \]

Remarks:

1. An important thing to note here is that all the rows of the numerator and denominator determinants, except their first rows, are independent of $z$ since the $u_{i,j}$ are all scalars independent of $z$; that is, the dependence on $z$ comes in through the first rows only. This is true for all three methods described here.

2. From the representation in (12.14) it is clear that the numerator determinant is a vector-valued polynomial in $z$ of degree at most $n + k + \nu - 1$, while the denominator determinant is a scalar-valued polynomial of degree $k$. Thus, $s_{n,k}(z)$ is a vector-valued rational function in $z$.

3. That $s_{n,k}(z)$ from SMMPE and STEA are functions of $z$ only can already be seen from (12.4)–(12.7) because the $\tilde{u}_{i,j}$ are constant multiples of some nonnegative powers of $z$ for these two methods. For SMPE this is not immediate since the $\tilde{u}_{i,j}$ in this case depend on $\bar{z}$ as well as $z$. Interestingly, the powers of $\bar{z}$ disappear altogether from SMPE, even though they are present in the $\tilde{u}_{i,j}$. This makes $s_{n,k}(z)$ a rational function in $z$ for SMPE as well.

The rational approximations $s_{n,k}(z)$ from $f(z) = \sum_{i=0}^{\infty} u_i z^i$ obtained by applying the various vector extrapolation methods can be arranged in the two-dimensional array in Table 12.1. By imposing some realistic conditions on the asymptotic behavior of the $x_m(z)$ as $m \to \infty$, one can now study the convergence behavior of the row or column sequences in this table. We undertake such an analysis for row sequences in Section 12.7 of this chapter.

Table 12.1. The extrapolation table.

\[ \begin{array}{cccc} s_{0,0}(z) & s_{1,0}(z) & s_{2,0}(z) & \cdots \\ s_{0,1}(z) & s_{1,1}(z) & s_{2,1}(z) & \cdots \\ s_{0,2}(z) & s_{1,2}(z) & s_{2,2}(z) & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{array} \]

12.3 A compact formula for $s_{n,k}(z)$ from SMPE

In Section 1.8, we presented a compact formula for the polynomial vector extrapolation methods. We would like to use that formula to derive a compact formula for SMPE that is very similar to Nuttall's compact formula for Padé approximants; see Baker and Graves-Morris [14, pp. 16–17], for example. By (1.74), with proper substitutions, we have
\[ s_{n,k}(z) = x_n(z) - \widetilde{U}(z)\,[\widetilde{U}(z)^* M \widetilde{W}(z)]^{-1}\,[\widetilde{U}(z)^* M]\, u_n z^n, \tag{12.15} \]
where the $N \times k$ matrices $\widetilde{U}(z)$ and $\widetilde{W}(z)$ are given by
\[ \widetilde{U}(z) = [\,u_n z^n \mid u_{n+1} z^{n+1} \mid \cdots \mid u_{n+k-1} z^{n+k-1}\,] \tag{12.16} \]
and
\[ \widetilde{W}(z) = [\,u_{n+1} z^{n+1} - u_n z^n \mid u_{n+2} z^{n+2} - u_{n+1} z^{n+1} \mid \cdots \mid u_{n+k} z^{n+k} - u_{n+k-1} z^{n+k-1}\,]. \tag{12.17} \]
Factoring out $z^{n+j-1}$ from the $j$th columns of these matrices, $j = 1, \ldots, k$, we obtain
\[ \widetilde{U}(z) = z^n\, U S(z) \quad \text{and} \quad \widetilde{W}(z) = z^n\, W(z) S(z), \tag{12.18} \]
where
\[ U = [\,u_n \mid u_{n+1} \mid \cdots \mid u_{n+k-1}\,], \tag{12.19} \]
\[ W(z) = [\,u_{n+1} z - u_n \mid u_{n+2} z - u_{n+1} \mid \cdots \mid u_{n+k} z - u_{n+k-1}\,], \tag{12.20} \]
and
\[ S(z) = \operatorname{diag}(z^0, z^1, \ldots, z^{k-1}). \tag{12.21} \]
Substituting (12.18) into (12.15) and invoking (12.19)–(12.21), we obtain the compact formula
\[ s_{n,k}(z) = x_n(z) - z^n\, U\,[U^* M W(z)]^{-1}\,[U^* M u_n]. \tag{12.22} \]
We can summarize and rewrite (12.22) as in the following theorem.

Theorem 12.2. Let
\[ u_{i,j} = (u_{n+i}, u_{n+j}) = u_{n+i}^* M u_{n+j}, \quad i, j = 0, 1, \ldots. \tag{12.23} \]
Then $s_{n,k}(z)$ has the compact form
\[ s_{n,k}(z) = x_n(z) - z^n\, U\,[T(z)]^{-1} c, \quad T(z) \in \mathbb{C}^{k \times k}, \quad c \in \mathbb{C}^{k}, \tag{12.24} \]
where $T(z) = U^* M W(z)$ and $c = U^* M u_n$ are given as
\[ T(z) = \begin{bmatrix} u_{0,1} z - u_{0,0} & u_{0,2} z - u_{0,1} & \cdots & u_{0,k} z - u_{0,k-1} \\ u_{1,1} z - u_{1,0} & u_{1,2} z - u_{1,1} & \cdots & u_{1,k} z - u_{1,k-1} \\ \vdots & \vdots & & \vdots \\ u_{k-1,1} z - u_{k-1,0} & u_{k-1,2} z - u_{k-1,1} & \cdots & u_{k-1,k} z - u_{k-1,k-1} \end{bmatrix} \tag{12.25} \]
and
\[ c = [u_{0,0}, u_{1,0}, \ldots, u_{k-1,0}]^T. \tag{12.26} \]

Note that, in this formula, $U$ and $c$ are independent of $z$; the dependence on $z$ comes only through $x_n(z)$, $z^n$, and $T(z)$. The similarity to Nuttall's compact formula is a result of the structure of $T(z)$ in particular. Similar treatments can be given to $s_{n,k}(z)$ from SMMPE and STEA. We leave the details to the reader.
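The compact form (12.24) can be tried out on a small example. The following is a minimal sketch, not the author's implementation: it assumes real data and the Euclidean inner product ($M = I$), uses exact rational arithmetic, and the helper names `solve`, `dot`, and `s_compact` as well as the test function $f(z) = a_1/(1-2z) + a_2/(1-3z)$ are illustrative choices.

```python
from fractions import Fraction as F

def solve(A, b):
    """Solve A y = b by Gauss-Jordan elimination; exact when entries are Fractions."""
    k = len(A)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(k):
        piv = next(r for r in range(col, k) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(k):
            if r != col:
                fac = M[r][col]
                M[r] = [a - fac * p for a, p in zip(M[r], M[col])]
    return [row[k] for row in M]

def dot(a, b):
    # Real Euclidean inner product (assumption: M = I, real vectors).
    return sum(x * y for x, y in zip(a, b))

def s_compact(u, n, k, z):
    """Evaluate (12.24): s_{n,k}(z) = x_n(z) - z^n U T(z)^{-1} c for SMPE."""
    g = lambda i, j: dot(u[n + i], u[n + j])                  # u_{i,j} of (12.23)
    T = [[g(i, j + 1) * z - g(i, j) for j in range(k)] for i in range(k)]  # (12.25)
    c = [g(i, 0) for i in range(k)]                           # (12.26)
    y = solve(T, c)                                           # y = T(z)^{-1} c
    N = len(u[0])
    x_n = [sum(u[i][r] * z**i for i in range(n)) for r in range(N)]
    Uy = [sum(u[n + j][r] * y[j] for j in range(k)) for r in range(N)]
    return [x_n[r] - z**n * Uy[r] for r in range(N)]

# Hypothetical test data: f(z) = a1/(1 - 2z) + a2/(1 - 3z), so u_m = a1*2^m + a2*3^m.
a1, a2 = [F(1), F(0), F(2)], [F(0), F(1), F(1)]
u = [[p * 2**m + q * 3**m for p, q in zip(a1, a2)] for m in range(8)]
z = F(1, 7)
f_exact = [p / (1 - 2 * z) + q / (1 - 3 * z) for p, q in zip(a1, a2)]
s = s_compact(u, n=1, k=2, z=z)
```

Because this $f(z)$ is rational with a denominator of degree 2, the value `s` computed with $k = 2$ agrees with `f_exact` exactly. Only the $k \times k$ matrix $T(z)$ has to be rebuilt when $z$ changes; the inner products $u_{i,j}$ do not.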

12.4 Algebraic properties of $s_{n,k}(z)$

12.4.1 General structure and Padé-like properties

Theorem 12.3. Assume that the denominator of $s_{n,k}(z)$ is of degree exactly $k$. Then $s_{n,k}(z)$ can be expressed as
\[ s_{n,k}(z) = \frac{\sum_{j=0}^{k} c_j z^{k-j} x_{n+\nu+j}(z)}{\sum_{j=0}^{k} c_j z^{k-j}} \equiv \frac{p_{n,k}(z)}{q_{n,k}(z)}, \quad q_{n,k}(0) = c_k = 1, \tag{12.27} \]
where the $c_j$ satisfy the linear system
\[ \sum_{j=0}^{k} u_{i,j} c_j = 0, \quad i = 0, 1, \ldots, k-1, \quad c_k = 1, \tag{12.28} \]
where
\[ u_{i,j} = \begin{cases} (u_{n+i}, u_{n+j}) & \text{for SMPE}, \\ (q_{i+1}, u_{n+j}) & \text{for SMMPE}, \\ (q, u_{n+i+j}) & \text{for STEA}. \end{cases} \tag{12.29} \]
[Note that the $u_{i,j}$ are as in (12.7) and (12.13).]

Proof. Denote by $d_j$ the cofactor of $w_j$ in (12.10). Then
\[ s_{n,k}(z) = \frac{\sum_{j=0}^{k} d_j z^{k-j} x_{n+\nu+j}(z)}{\sum_{j=0}^{k} d_j z^{k-j}} = \frac{\sum_{j=0}^{k} c_j z^{k-j} x_{n+\nu+j}(z)}{\sum_{j=0}^{k} c_j z^{k-j}}, \]
where $c_j = d_j / d_k$, $j = 0, 1, \ldots, k$, recalling that $d_k \ne 0$ by assumption. Because the $d_j$ are the cofactors of the first row in $D(w_0, w_1, \ldots, w_k)$, they satisfy
\[ \sum_{j=0}^{k} u_{i,j} d_j = 0, \quad i = 0, 1, \ldots, k-1. \]
By dividing both sides of these equations by $d_k$, we obtain the equations in (12.28). This completes the proof.
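The proof of Theorem 12.3 translates directly into a small computation: form the rows of (12.10), take the cofactors $d_j$ of the first row, and normalize by $d_k$. The sketch below does this for STEA under stated assumptions (real data, Euclidean inner product, exact rational arithmetic); the function names and the sample vectors `a1`, `a2`, `q` are hypothetical, and the Laplace expansion is only sensible for small $k$.

```python
from fractions import Fraction as F

def det(A):
    """Determinant by Laplace expansion along the first row (adequate for small k)."""
    if not A:
        return F(1)
    return sum(((-1) ** j * a * det([row[:j] + row[j + 1:] for row in A[1:]])
                for j, a in enumerate(A[0])), F(0))

def dot(a, b):
    # Real Euclidean inner product (assumption).
    return sum(x * y for x, y in zip(a, b))

def stea_rows(u, q, n, k):
    """Rows 2..k+1 of (12.10) for STEA: u_{i,j} = (q, u_{n+i+j})."""
    return [[dot(q, u[n + i + j]) for j in range(k + 1)] for i in range(k)]

def coefficients(rows):
    """c_j = d_j / d_k, where d_j is the cofactor of w_j in (12.10)."""
    k = len(rows)
    d = [(-1) ** j * det([r[:j] + r[j + 1:] for r in rows]) for j in range(k + 1)]
    return [dj / d[k] for dj in d]

def s_nk(u, c, n, z, nu=0):
    """Evaluate the quotient (12.27) at the point z."""
    k = len(c) - 1
    N = len(u[0])
    x = lambda m: [sum(u[i][r] * z**i for i in range(m)) for r in range(N)]
    q_val = sum(c[j] * z ** (k - j) for j in range(k + 1))
    p_val = [F(0)] * N
    for j in range(k + 1):
        xj = x(n + nu + j)
        p_val = [p + c[j] * z ** (k - j) * v for p, v in zip(p_val, xj)]
    return [p / q_val for p in p_val]

# Hypothetical data: u_m = a1*2^m + a2*3^m, i.e. f(z) = a1/(1-2z) + a2/(1-3z).
a1, a2, q = [F(1), F(0), F(2)], [F(0), F(1), F(1)], [F(1), F(1), F(1)]
u = [[p * 2**m + r * 3**m for p, r in zip(a1, a2)] for m in range(8)]
rows = stea_rows(u, q, n=0, k=2)
c = coefficients(rows)          # coefficients of q_{0,2}(z) = c_0 z^2 + c_1 z + c_2
z = F(1, 7)
s = s_nk(u, c, n=0, z=z)
```

For this data the rows are $[5, 12, 30]$ and $[12, 30, 78]$, the cofactors are $36, -30, 6$, and hence $c = [6, -5, 1]$, which are the coefficients of $(1-2z)(1-3z) = 6z^2 - 5z + 1$.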

The following lemma, whose proof is easy, shows that $s_{n,k}(z)$ can be expressed in a way that is more convenient computationally than what is given in Theorem 12.3.

Lemma 12.4. Split $x_{m+\mu}(z)$ as in
\[ x_{m+\mu}(z) = x_m(z) + z^m\, \hat{x}_{m,\mu}(z), \quad \hat{x}_{m,\mu}(z) = \sum_{i=0}^{\mu-1} u_{m+i} z^i. \tag{12.30} \]
Then $s_{n,k}(z)$ can be rewritten in different forms as follows:
\[ s_{n,k}(z) = x_{n+\nu}(z) + z^{n+\nu}\, \frac{\sum_{j=0}^{k} c_j z^{k-j}\, \hat{x}_{n+\nu,j}(z)}{\sum_{j=0}^{k} c_j z^{k-j}}, \tag{12.31} \]
\[ s_{n,k}(z) = x_{n+\nu}(z) + z^{n+\nu}\, \frac{\sum_{i=0}^{k-1} \left( \sum_{j=i+1}^{k} c_j z^{k-j+i} \right) u_{n+\nu+i}}{\sum_{j=0}^{k} c_j z^{k-j}}. \tag{12.32} \]

The next result concerns the behavior of $s_{n,k}(z)$ as $z \to 0$ and shows that it has a Padé-like property.

Theorem 12.5. Let $s_{n,k}(z)$, produced via SMPE, SMMPE, or STEA, be as in Theorem 12.3. Then
\[ f(z) - s_{n,k}(z) = O(z^{n+\nu+k}) \quad \text{as } z \to 0; \tag{12.33} \]
thus, $s_{n,k}(z)$ interpolates $f(z)$ $n + \nu + k$ times at $z = 0$. That is,
\[ s_{n,k}^{(i)}(0) = f^{(i)}(0), \quad i = 0, 1, \ldots, n + \nu + k - 1. \tag{12.34} \]

Proof. Subtracting $s_{n,k}(z)$ from $f(z)$ and invoking (12.27), we obtain
\[ f(z) - s_{n,k}(z) = \frac{\sum_{j=0}^{k} c_j z^{k-j}\, [f(z) - x_{n+\nu+j}(z)]}{\sum_{j=0}^{k} c_j z^{k-j}}. \]
The result now follows from the fact that $f(z) - x_m(z) = O(z^m)$ as $z \to 0$ and from the fact that $c_k = 1$.

Remark: As can be seen from the proof of Theorem 12.5, the result in (12.33) is valid for arbitrary $c_j$. Whether $s_{n,k}(z)$ is a good approximation to $f(z)$ depends entirely on how the $c_j$ are chosen. The choice of the $c_j$ as the solution of the linear system in (12.28) turns out to be an excellent one. This will become clear later, in Section 12.7, from our discussion of the convergence properties of the $s_{n,k}(z)$ as $n \to \infty$.
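The remark that (12.33) holds for arbitrary $c_j$ (with $c_k = 1$) can be checked by expanding $p_{n,k}(z)/q_{n,k}(z)$ in powers of $z$. The sketch below is an illustration under assumptions: $\nu = 0$, an arbitrary made-up coefficient sequence `u`, arbitrary numbers `c`, and hypothetical helper names. It builds the coefficients of the numerator in (12.27) and then divides by the denominator as a formal power series.

```python
from fractions import Fraction as F

def numerator_coeffs(u, c, n):
    """Coefficients (by degree) of p(z) = sum_j c_j z^(k-j) x_{n+j}(z), nu = 0."""
    k = len(c) - 1
    N = len(u[0])
    p = [[F(0)] * N for _ in range(n + k)]          # deg p <= n + k - 1
    for j in range(k + 1):
        for i in range(n + j):                      # x_{n+j}(z) = sum_{i < n+j} u_i z^i
            deg = k - j + i
            p[deg] = [a + c[j] * b for a, b in zip(p[deg], u[i])]
    return p

def series_of_quotient(p, c, terms):
    """First Maclaurin coefficients of p(z)/q(z), q(z) = sum_j c_j z^(k-j)."""
    k = len(c) - 1
    qc = [c[k - m] for m in range(k + 1)]           # q(z) = sum_m qc[m] z^m, qc[0] = c_k
    N = len(p[0])
    s = []
    for m in range(terms):
        acc = list(p[m]) if m < len(p) else [F(0)] * N
        for l in range(1, min(m, k) + 1):
            acc = [a - qc[l] * b for a, b in zip(acc, s[m - l])]
        s.append([a / qc[0] for a in acc])
    return s

# Arbitrary (made-up) coefficient vectors u_i and arbitrary c_j with c_k = 1.
u = [[F(1, i + 1), F(i * i)] for i in range(8)]
c = [F(7), F(-3), F(1)]                              # k = 2
n = 3
p = numerator_coeffs(u, c, n)
s = series_of_quotient(p, c, terms=6)
```

With $n = 3$ and $k = 2$, the first $n + k = 5$ coefficients in `s` coincide with $u_0, \ldots, u_4$ regardless of the choice of `c`, while the sixth generally does not. The quality of the approximation away from $z = 0$ is what the particular choice (12.28) controls.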


12.4.2 Reproducing property

Since the approximations $s_{n,k}(z)$ are vector-valued rational functions of $z$, a natural question that arises is whether the methods SMPE, SMMPE, and STEA reproduce vector-valued rational functions $f(z)$. The following theorem answers this question affirmatively.

Theorem 12.6. Let the vector-valued rational function $f(z)$ be given by
\[ f(z) = \frac{p(z)}{q(z)}, \quad \deg p \le m - 1, \quad m \ge k, \quad \deg q = k, \quad \text{and} \quad q(0) = 1. \tag{12.35} \]
Here $p(z)$ is a vector-valued polynomial and $q(z)$ is a scalar polynomial. Let $\sum_{i=0}^{\infty} u_i z^i$ be the Maclaurin series expansion of $f(z)$. Denote by $s_{n,k}(z)$ the approximations generated by any of the methods MPE, MMPE, and TEA from $\sum_{i=0}^{\infty} u_i z^i$. Then, provided $s_{m-k,k}(z)$ exists, it satisfies $s_{m-k,k}(z) \equiv f(z)$.

Remark: The assumption that $\deg p \le m - 1$ with $m \ge k$ may seem to rule out the cases of $p(z)$ having actual degree smaller than $k - 1$. This is not the case, however, because every polynomial $g(z)$ satisfying $\deg g \le k - 1$ also satisfies $\deg g \le m - 1$ with arbitrary $m \ge k$. Thus, we can always assume without loss of generality that $\deg p \le m - 1$ with $m \ge k$.

Proof. We recall the representation of $s_{n,k}(z)$ given in Theorem 12.3. We need to show that the numerator and denominator polynomials of $s_{n,k}(z)$ given in this theorem are $p(z)$ and $q(z)$, respectively.

We start by showing that $q(z)$ is precisely the denominator polynomial of $s_{n,k}(z)$. By (12.35), we have the identity
\[ q(z) f(z) = p(z). \tag{12.36} \]
Since $\deg p \le m - 1$, we also have the identities
\[ \frac{d^r}{dz^r}[q(z) f(z)] = \frac{d^r}{dz^r}[p(z)] \equiv 0, \quad r = m, m+1, \ldots. \]
Using Leibniz's rule of differentiation of a product and setting $z = 0$, these identities give
\[ \sum_{j=0}^{r} \binom{r}{j} q^{(j)}(0)\, f^{(r-j)}(0) = 0, \quad r = m, m+1, \ldots. \tag{12.37} \]
Now,
\[ f(z) = \sum_{j=0}^{\infty} u_j z^j, \quad u_j = \frac{f^{(j)}(0)}{j!}, \quad j = 0, 1, \ldots, \]
\[ q(z) = \sum_{j=0}^{k} c_{k-j} z^j = \sum_{j=0}^{k} c_j z^{k-j}, \quad c_{k-j} = \frac{q^{(j)}(0)}{j!}, \quad j = 0, 1, \ldots, k. \tag{12.38} \]
[Note also that $c_k = q(0) = 1$ by (12.35).] With these, and by the fact that $q^{(j)}(0) = 0$ for $j > k$, (12.37) becomes
\[ \sum_{j=0}^{k} c_{k-j}\, u_{r-j} = 0, \quad r = m, m+1, \ldots, \]
which, letting $m = n + k$, we write in the form
\[ \sum_{j=0}^{k} c_j u_{n+i+j} = 0, \quad i = 0, 1, \ldots. \tag{12.39} \]

• Letting $i = 0$, (12.39) gives
\[ \sum_{j=0}^{k} c_j u_{n+j} = 0. \]
Taking the inner product of this equation with the vectors $u_n, u_{n+1}, \ldots, u_{n+k-1}$, we obtain
\[ \sum_{j=0}^{k} c_j (u_{n+i}, u_{n+j}) = 0, \quad i = 0, 1, \ldots, k-1, \]
and, together with $c_k = 1$, these are precisely the equations given in (12.28) that define the scalars $c_j$ for SMPE.

• Letting $i = 0$, again (12.39) gives
\[ \sum_{j=0}^{k} c_j u_{n+j} = 0. \]
Taking the inner product of this equation with the vectors $q_1, q_2, \ldots, q_k$, we obtain
\[ \sum_{j=0}^{k} c_j (q_i, u_{n+j}) = 0, \quad i = 1, \ldots, k, \]
and, together with $c_k = 1$, these are precisely the equations given in (12.28) that define the scalars $c_j$ for SMMPE.

• Letting $i = 0, 1, \ldots, k-1$ in (12.39), we obtain
\[ \sum_{j=0}^{k} c_j u_{n+i+j} = 0, \quad i = 0, 1, \ldots, k-1. \]
Taking the inner product of these equations with a vector $q$, we obtain
\[ \sum_{j=0}^{k} c_j (q, u_{n+i+j}) = 0, \quad i = 0, 1, \ldots, k-1, \]
and, together with $c_k = 1$, these are precisely the equations given in (12.28) that define the scalars $c_j$ for STEA.

Thus, we have shown that $q(z)$ is precisely the denominator polynomial of $s_{n,k}(z)$.

Before going on, we wish to introduce the following short-hand notation concerning polynomials and power series: For a given power series $g(z) = \sum_{i=0}^{\infty} g_i z^i$, we will let
\[ [g(z)]_r^s = \sum_{i=r}^{s} g_i z^i = g_r z^r + g_{r+1} z^{r+1} + \cdots + g_s z^s. \]

We now show that $p(z)$ is precisely the numerator polynomial of $s_{n,k}(z)$. By (12.36) and by the assumption that $\deg p \le m - 1 = n + k - 1$, we have
\[ p(z) = [q(z) f(z)]_0^{n+k-1} = \left[ \left( \sum_{j=0}^{k} c_j z^{k-j} \right) \left( \sum_{j=0}^{\infty} u_j z^j \right) \right]_0^{n+k-1}. \]
Recalling that $x_r(z) = \sum_{i=0}^{r-1} u_i z^i$, it is easy to see that
\[ p(z) = \sum_{j=0}^{k} c_j z^{k-j} x_{n+j}(z). \]
This completes the proof.
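The two halves of this proof can be traced numerically on a small rational function. The sketch below is illustrative only: the polynomials `p_poly` and `q_poly` are made-up data with $k = 2$ and $m = 3$, the inner product is the real Euclidean one, and exact rational arithmetic is used so that the identities hold exactly.

```python
from fractions import Fraction as F

k, m = 2, 3
n = m - k
q_poly = [F(1), F(-5), F(6)]                           # q(z) = 1 - 5z + 6z^2, q(0) = 1
p_poly = [[F(1), F(2)], [F(1), F(-1)], [F(0), F(1)]]   # vector coefficients of p(z), deg p = 2 <= m - 1

def maclaurin(p_poly, q_poly, terms):
    """Coefficients u_0, u_1, ... of p(z)/q(z), from q(z) f(z) = p(z) with q(0) = 1."""
    N = len(p_poly[0])
    u = []
    for r in range(terms):
        acc = list(p_poly[r]) if r < len(p_poly) else [F(0)] * N
        for j in range(1, min(r, len(q_poly) - 1) + 1):
            acc = [a - q_poly[j] * b for a, b in zip(acc, u[r - j])]
        u.append(acc)
    return u

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

u = maclaurin(p_poly, q_poly, 10)
c = [q_poly[k - j] for j in range(k + 1)]              # c_j multiplies z^(k-j), as in (12.38)

# (12.39): sum_j c_j u_{n+i+j} = 0 for i = 0, 1, ...
recurrence_ok = all(
    sum(c[j] * u[n + i + j][r] for j in range(k + 1)) == 0
    for i in range(6) for r in range(2)
)
# The SMPE equations in (12.28) are then satisfied by the same c_j.
smpe_ok = all(
    sum(dot(u[n + i], u[n + j]) * c[j] for j in range(k + 1)) == 0 for i in range(k)
)

# Numerator of (12.27) (nu = 0) and the quotient at a sample point z0.
z0 = F(1, 7)
x = lambda mm: [sum(u[i][r] * z0**i for i in range(mm)) for r in range(2)]
num = [sum(c[j] * z0 ** (k - j) * x(n + j)[r] for j in range(k + 1)) for r in range(2)]
den = sum(c[j] * z0 ** (k - j) for j in range(k + 1))
p_val = [sum(p_poly[d][r] * z0**d for d in range(3)) for r in range(2)]
q_val = sum(q_poly[d] * z0**d for d in range(3))
s = [v / den for v in num]
```

Here `c` is $[6, -5, 1]$, the recurrence (12.39) holds from $i = 0$ onward, and `num` equals `p_val`, so that `s` coincides with $p(z_0)/q(z_0)$.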

12.5 Efficient computation of $s_{n,k}(z)$ from SMPE

We can make use of the representation of $s_{n,k}(z)$ given in (12.32) for actual computation of $s_{n,k}(z)$ from SMPE in a way that saves a lot of storage (see footnote 61). We first notice that the equations in (12.28) are the normal equations for the linear least-squares problem
\[ \min_{c'} \| U_{k-1} c' + u_{n+k} \|, \quad c' \in \mathbb{C}^{k}, \]
where
\[ U_k = [\,u_n \mid u_{n+1} \mid \cdots \mid u_{n+k}\,], \quad c' = [c_0, c_1, \ldots, c_{k-1}]^T, \]
which we dealt with in our treatment of MPE. Recall that we solved this problem by using the weighted QR factorization of $U_k$, namely, $U_k = Q_k R_k$, as in Section 2.4. Thus,
\[ Q_k = [\,q_0 \mid q_1 \mid \cdots \mid q_k\,], \quad R_k = \begin{bmatrix} r_{00} & r_{01} & \cdots & r_{0k} \\ & r_{11} & \cdots & r_{1k} \\ & & \ddots & \vdots \\ & & & r_{kk} \end{bmatrix}, \]
\[ q_i^* M q_j = \delta_{ij}, \quad r_{ii} > 0 \ \ \forall\, i. \]
[The vectors $q_i$, which are the columns of the matrix $Q_k$, should not be confused with the vectors $q_i$ in the definition of $s_{n,k}(z)$ from SMMPE.] We next obtain $c'$ by solving the upper triangular $k \times k$ linear system
\[ R_{k-1} c' = -\rho_k, \quad \rho_k = [r_{0k}, r_{1k}, \ldots, r_{k-1,k}]^T. \]
Following the determination of $c_0, c_1, \ldots, c_{k-1}$, we set $c_k = 1$ and compute the denominator $q_{n,k}(z) = \sum_{j=0}^{k} c_j z^{k-j}$ of $s_{n,k}(z)$. As for the numerator of $s_{n,k}(z) - x_{n+\nu}(z)$ in (12.32), we first let
\[ h(z) = [h_0(z), h_1(z), \ldots, h_{k-1}(z)]^T, \quad h_i(z) = \sum_{j=i+1}^{k} c_j z^{k-j+i}, \]
and compute this numerator (with $\nu = 0$ now) as in
\[ \sum_{i=0}^{k-1} \left( \sum_{j=i+1}^{k} c_j z^{k-j+i} \right) u_{n+i} = \sum_{i=0}^{k-1} h_i(z)\, u_{n+i} = U_{k-1} h(z) = Q_{k-1} [R_{k-1} h(z)]. \]
Thus,
\[ s_{n,k}(z) = x_n(z) + z^n\, \frac{Q_{k-1} [R_{k-1} h(z)]}{\sum_{j=0}^{k} c_j z^{k-j}}. \]

61 Note that we can apply the algorithm for MMPE given in Chapter 4 to compute $s_{n,k}(z)$ from SMMPE. We leave the details to the reader.

Recall that we can overwrite $u_{n+i}$ by $q_i$, $i = 0, 1, \ldots, k$, in the process of computing the weighted QR factorization of $U_k$. Clearly, we need to keep in the core memory only $k + 1$ vectors, namely, $x_n(z)$ and the vectors $q_0, q_1, \ldots, q_{k-1}$. Of course, this enables us to compute $s_{n,k}(z)$ for different values of $z$ economically, since the important and expensive vector quantities needed for this, namely, the matrices $Q_k$ and $R_k$, are independent of $z$ and hence need to be computed only once.

Error estimation

Using the approach developed in Section 2.5, we can obtain error estimates for the vectors $s_{n,k}(z)$, which may be of use at least when the limit function $f(z)$ is analytic at $z = 0$ and meromorphic in some domain of the $z$-plane containing $z = 0$. The important quantity now is the vector $\widetilde{U}_k \gamma$, where
\[ \widetilde{U}_k = [\,u_n z^n \mid u_{n+1} z^{n+1} \mid \cdots \mid u_{n+k} z^{n+k}\,] \]
and
\[ \gamma = [\gamma_0, \gamma_1, \ldots, \gamma_k]^T, \quad \gamma_i = \frac{c_i z^{k-i}}{\sum_{j=0}^{k} c_j z^{k-j}}, \quad i = 0, 1, \ldots, k. \]
We recall that this vector is the exact residual vector when the $x_m(z)$ are generated linearly; it serves as an approximate residual vector when the $x_m(z)$ are generated nonlinearly. We can express this vector in terms of already computed quantities as follows: First,
\[ \widetilde{U}_k \gamma = \sum_{i=0}^{k} \gamma_i (u_{n+i} z^{n+i}) = \frac{z^{n+k}}{\sum_{j=0}^{k} c_j z^{k-j}} \sum_{i=0}^{k} c_i u_{n+i} = \frac{z^{n+k}}{\sum_{j=0}^{k} c_j z^{k-j}}\, U_k c, \]
where
\[ U_k = [\,u_n \mid u_{n+1} \mid \cdots \mid u_{n+k}\,] \quad \text{and} \quad c = [c_0, c_1, \ldots, c_k]^T \]
as usual. Invoking the weighted QR factorization of $U_k$, namely, $U_k = Q_k R_k$, we obtain
\[ \widetilde{U}_k \gamma = \frac{z^{n+k}}{\sum_{j=0}^{k} c_j z^{k-j}}\, Q_k R_k c. \]
But
\[ R_k c = \begin{bmatrix} R_{k-1} & \rho_k \\ 0^T & r_{kk} \end{bmatrix} \begin{bmatrix} c' \\ 1 \end{bmatrix} = \begin{bmatrix} R_{k-1} c' + \rho_k \\ r_{kk} \end{bmatrix} = r_{kk} \begin{bmatrix} 0 \\ 1 \end{bmatrix}. \]
Therefore,
\[ \widetilde{U}_k \gamma = \frac{r_{kk} z^{n+k}}{\sum_{j=0}^{k} c_j z^{k-j}}\, Q_k \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \frac{r_{kk} z^{n+k}}{\sum_{j=0}^{k} c_j z^{k-j}}\, q_k. \tag{12.40} \]
Taking norms, we finally have
\[ \| \widetilde{U}_k \gamma \| = r_{kk}\, \frac{|z|^{n+k}}{\left| \sum_{j=0}^{k} c_j z^{k-j} \right|}. \tag{12.41} \]
Note that we do not have to actually compute $s_{n,k}(z)$ to determine either $\widetilde{U}_k \gamma$ or $\| \widetilde{U}_k \gamma \|$.

Since $u_n z^n$ is the exact residual vector of $s_{n,0}(z) = x_n(z)$ for linearly generated sequences, and the approximate residual vector otherwise, we can assess the accuracy of $s_{n,k}(z)$ relative to that of $x_n(z)$ via the quotient
\[ \frac{\| \widetilde{U}_k \gamma \|}{\| u_n z^n \|} = \frac{r_{kk}}{r_{00}}\, \frac{|z|^k}{\left| \sum_{j=0}^{k} c_j z^{k-j} \right|} = \frac{r_{kk}}{r_{00}}\, \frac{1}{\left| \sum_{j=0}^{k} c_j z^{-j} \right|}. \]
Here we have used the fact that $\| u_n \| = r_{00}$.
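The procedure of this section can be sketched in a few lines. What follows is an assumption-laden illustration rather than the algorithm of Section 2.4: it takes $M = I$, real data, and $\nu = 0$, and it uses plain modified Gram-Schmidt for the QR factorization. The function names and the three-pole test function are hypothetical.

```python
import math

def mgs_qr(cols):
    """QR factorization of [cols[0] | cols[1] | ...] by modified Gram-Schmidt (M = I)."""
    Q = [list(map(float, col)) for col in cols]
    m = len(Q)
    R = [[0.0] * m for _ in range(m)]
    for j in range(m):
        for i in range(j):
            R[i][j] = sum(a * b for a, b in zip(Q[i], Q[j]))
            Q[j] = [b - R[i][j] * a for a, b in zip(Q[i], Q[j])]
        R[j][j] = math.sqrt(sum(b * b for b in Q[j]))
        Q[j] = [b / R[j][j] for b in Q[j]]
    return Q, R

def smpe(u, n, k, z):
    """Return s_{n,k}(z), the c_j, and the residual-norm estimate (12.41)."""
    Q, R = mgs_qr(u[n:n + k + 1])                        # U_k = Q_k R_k
    c = [0.0] * k + [1.0]
    for i in range(k - 1, -1, -1):                       # back substitution: R_{k-1} c' = -rho_k
        c[i] = (-R[i][k] - sum(R[i][j] * c[j] for j in range(i + 1, k))) / R[i][i]
    q_val = sum(c[j] * z ** (k - j) for j in range(k + 1))
    h = [sum(c[j] * z ** (k - j + i) for j in range(i + 1, k + 1)) for i in range(k)]
    Rh = [sum(R[i][j] * h[j] for j in range(i, k)) for i in range(k)]   # R_{k-1} h(z)
    N = len(u[0])
    x_n = [sum(u[i][r] * z ** i for i in range(n)) for r in range(N)]
    s = [x_n[r] + z ** n * sum(Q[i][r] * Rh[i] for i in range(k)) / q_val for r in range(N)]
    estimate = R[k][k] * abs(z) ** (n + k) / abs(q_val)
    return s, c, estimate

# Hypothetical test function with three simple poles at 1/2, 1, and 4.
a = [[1, 0, 0, 0], [1, 1, 0, 0], [0, 1, 1, 1]]
lam = [2.0, 1.0, 0.25]
u = [[sum(a[j][r] * lam[j] ** m for j in range(3)) for r in range(4)] for m in range(12)]
n, k, z = 6, 2, 0.3
f = [sum(a[j][r] / (1 - lam[j] * z) for j in range(3)) for r in range(4)]
s, c, estimate = smpe(u, n, k, z)
x_n = [sum(u[i][r] * z ** i for i in range(n)) for r in range(4)]
err_s = max(abs(p - q) for p, q in zip(s, f))
err_x = max(abs(p - q) for p, q in zip(x_n, f))
# Direct evaluation of || U~_k gamma ||, to compare with (12.41).
q_val = sum(c[j] * z ** (k - j) for j in range(k + 1))
resid = [sum(c[i] * z ** (k - i) / q_val * u[n + i][r] * z ** (n + i) for i in range(k + 1))
         for r in range(4)]
direct = math.sqrt(sum(v * v for v in resid))
```

In this example `err_s` is several orders of magnitude smaller than `err_x`, and `direct` agrees with `estimate` up to rounding. The factorization depends only on the $u_{n+i}$, so in a fuller implementation one would compute `Q` and `R` once and reuse them for every $z$, as the text points out.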

12.6 Some sources of vector-valued power series

Before proceeding further, we would like to briefly discuss how vector-valued power series can arise in applications. Consider the system of linear or nonlinear algebraic equations $\psi(x; z) = 0$, where $\psi : \mathbb{C}^N \to \mathbb{C}^N$, $x \in \mathbb{C}^N$, and $z$ is a complex parameter. We would like to solve this system for $x$. Clearly, $x$ is a function of $z$; let us denote it by $x(z)$. Thus, provided $\psi$ is a $C^\infty$ function in a certain domain of the $z$-plane containing $z = 0$, the solution $x(z)$ will have a Maclaurin series in this domain. The simplest example is provided by a linear system of the form
\[ x - zAx - b = 0, \quad A \in \mathbb{C}^{N \times N}, \quad b \in \mathbb{C}^{N}, \]
an interesting case we will study in detail in Section 14.2.1. It is easy to show that $x(0) = x_0 = b$ and $\frac{d^r}{dz^r} x(0) = r!\,(A^r b)$, $r = 1, 2, \ldots$, for this example.

In the general case, such Maclaurin series arise, at least theoretically, as follows: Assume that $\psi(x; z) = 0$ has been solved when $z = 0$, giving $x_0 = x(0)$. To find $\frac{d^r}{dz^r} x(0)$, the $r$th derivative of $x(z)$ evaluated at $z = 0$, we proceed as follows: Let
\[ x = [x^{(1)}, \ldots, x^{(N)}]^T \quad \text{and} \quad \psi = [\psi_1, \ldots, \psi_N]^T. \]
Then, because $\psi(x(z); z) \equiv 0$,
\[ \frac{d^r}{dz^r} \psi(x(z); z) \equiv 0, \quad r = 1, 2, \ldots. \]
For each $r$, and with $z = 0$, these give $N \times N$ systems of linear algebraic equations from which $\frac{d^r}{dz^r} x(0)$ can be obtained. For example, with $r = 1$, we have
\[ \sum_{j=1}^{N} \frac{\partial}{\partial x^{(j)}} \psi_i(x(z); z)\, \frac{d}{dz} x^{(j)}(z) + \frac{\partial}{\partial z} \psi_i(x(z); z) = 0, \quad i = 1, \ldots, N, \]
which can be written in matrix form as
\[ \Psi(x(z); z)\, \frac{d}{dz} x(z) = -\frac{\partial}{\partial z} \psi(x(z); z), \tag{12.42} \]
where $\Psi(x; z)$ is the Jacobian matrix of the vector-valued function $\psi(x; z)$ (with respect to $x$ only) evaluated at $(x; z) = (x(z); z)$. $\Psi(x; z)$ is given as
\[ \Psi(x; z) = \begin{bmatrix} \psi_{1,1}(x; z) & \psi_{1,2}(x; z) & \cdots & \psi_{1,N}(x; z) \\ \psi_{2,1}(x; z) & \psi_{2,2}(x; z) & \cdots & \psi_{2,N}(x; z) \\ \vdots & \vdots & & \vdots \\ \psi_{N,1}(x; z) & \psi_{N,2}(x; z) & \cdots & \psi_{N,N}(x; z) \end{bmatrix}, \tag{12.43} \]
where
\[ \psi_{i,j}(x; z) = \frac{\partial}{\partial x^{(j)}} \psi_i(x; z), \quad i, j = 1, \ldots, N. \tag{12.44} \]
Upon setting $z = 0$ and $x(0) = x_0$ in (12.42), we obtain the linear system
\[ \Psi(x_0; 0)\, \frac{d}{dz} x(0) = -\frac{\partial}{\partial z} \psi(x(0); 0), \tag{12.45} \]
which we solve for $\frac{d}{dz} x(0)$, namely, for $\frac{d}{dz} x^{(j)}(0)$, $j = 1, \ldots, N$. With $x(0)$ and $\frac{d}{dz} x(0)$ available, we go on to $r = 2$ and compute $\frac{d^2}{dz^2} x(0)$, and so on. Note that, for each $r$, the vector $\frac{d^r}{dz^r} x(0)$ is obtained by solving a linear system of the form
\[ \Psi(x_0; 0)\, \frac{d^r}{dz^r} x(0) = g_r\!\left( x(0), \frac{d}{dz} x(0), \ldots, \frac{d^{r-1}}{dz^{r-1}} x(0) \right). \]
Of course, we are assuming that the matrix $\Psi(x_0; 0)$, which features in all these computations, is nonsingular.

Vector-valued power series may also arise from (systems of) differential equations with a small parameter $\varepsilon$, for example. When these systems are solved using perturbation techniques, one may end up with a series in powers of $\varepsilon$ that may converge when $\varepsilon$ is in some interval or may diverge for all $\varepsilon \ne 0$. In all cases, one may be interested in summing this perturbation series to obtain good approximations to the solution for different values of $\varepsilon$. One can also investigate the singularities of the solution as a function of $\varepsilon$ by looking at the poles of the resulting rational approximations, without having to compute these approximations in full and for every $\varepsilon$. We consider an example of this in Section 14.5.
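For the linear example $x - zAx - b = 0$ of this section, the Maclaurin coefficients are $u_r = A^r b$ and can be generated by repeated matrix-vector products. The sketch below, with a made-up $2 \times 2$ matrix and hypothetical helper names, also checks a consequence that is easy to verify by hand: substituting the partial sum $x_m(z)$ into the system leaves the residual $x_m - zAx_m - b = -z^m A^m b = -u_m z^m$.

```python
from fractions import Fraction as F

def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def maclaurin_coeffs(A, b, count):
    """u_0 = b, u_r = A u_{r-1} = A^r b for the system x - zAx - b = 0."""
    u = [list(b)]
    for _ in range(count - 1):
        u.append(matvec(A, u[-1]))
    return u

def partial_sum(u, m, z):
    """x_m(z) = sum_{i<m} u_i z^i."""
    return [sum(u[i][r] * z**i for i in range(m)) for r in range(len(u[0]))]

def residual(A, b, x, z):
    """psi(x; z) = x - zAx - b."""
    Ax = matvec(A, x)
    return [xi - z * yi - bi for xi, yi, bi in zip(x, Ax, b)]

# Made-up data for illustration.
A = [[F(1), F(2)], [F(3), F(4)]]
b = [F(1), F(-1)]
z, m = F(1, 10), 5
u = maclaurin_coeffs(A, b, m + 1)
x_m = partial_sum(u, m, z)
r_m = residual(A, b, x_m, z)
```

Here `r_m` equals $-u_m z^m$ componentwise, which is the sense in which $u_n z^n$ (up to sign convention) measures the residual of $x_n(z)$ for linearly generated sequences.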

12.7 Convergence study of $s_{n,k}(z)$: A constructive theory of de Montessus type

We now turn to the convergence of the rows in Table 12.1, that is, the convergence of $s_{n,k}(z)$ as $n \to \infty$ with $k$ fixed; we provide a de Montessus–type study analogous to that for Padé approximants. The convergence theory of MPE, MMPE, and TEA that we mentioned in Chapter 6 can be used to derive the results below in a very convenient way. Here we state the important theorems but do not give their proofs. For details, we refer the reader to [269] and [270] and the papers mentioned in them.

We assume throughout that the function $f(z)$ is analytic at $z = 0$ and meromorphic in the open disk $K = \{z : |z| < R\}$ for some $R > 0$ (see footnote 62). We let $h$ be the number of distinct poles of $f(z)$, which we denote by $z_j = \lambda_j^{-1}$, and order them such that
\[ 0 < |z_1| \le |z_2| \le \cdots \le |z_h| < R, \tag{12.46} \]
which implies the ordering
\[ |\lambda_1| \ge |\lambda_2| \ge \cdots \ge |\lambda_h| > R^{-1}. \tag{12.47} \]
In addition, we choose $\xi^{-1} \in (|z_h|, R)$, but arbitrarily close to $R$ otherwise, and set $\lambda_{h+1}^{-1} = z_{h+1} = \xi^{-1}$.

62 Of course, $f(z)$ will have at least one singularity, such as a pole or a branch point, when $|z| = R$.

Of course, these poles may be simple or multiple, and we treat these different cases below. As will become clear from the next two subsections, the rational functions $s_{n,k}(z)$ from the power series of $f(z)$ are much better approximations to $f(z)$ than the individual $x_m(z)$. In addition, they can be used constructively to produce a lot of useful information about the poles $z_j$ of $f(z)$ and its Laurent expansions about the $z_j$.

12.7.1 Meromorphic $f(z)$ with simple poles

For simplicity of presentation, let us assume first that $f(z)$ has only simple poles in $K$. Consequently, $f(z)$ has the representation
\[ f(z) = \sum_{j=1}^{h} \frac{a_j}{1 - \lambda_j z} + g(z), \quad g(z) \text{ analytic in } K. \tag{12.48} \]
Here the $a_j$ are constant vectors in $\mathbb{C}^N$. Since $g(z)$ is analytic for $|z| < R$,
\[ g(z) = \sum_{i=0}^{\infty} w_i z^i \quad \text{for } |z| < R, \tag{12.49} \]
and hence
\[ w_m = o(\xi^m) \quad \text{and} \quad g(z) - \sum_{i=0}^{m-1} w_i z^i = o((\xi z)^m) \quad \text{as } m \to \infty, \quad |z| < \xi^{-1} < R. \tag{12.50} \]
In addition,
\[ \frac{1}{1 - \lambda_j z} = \sum_{i=0}^{m-1} (\lambda_j z)^i + \frac{(\lambda_j z)^m}{1 - \lambda_j z} \quad \forall\, z \ne \lambda_j^{-1}. \tag{12.51} \]
It is now easy to see that
\[ u_m = \sum_{j=1}^{h} a_j \lambda_j^m + o(\xi^m) \quad \text{as } m \to \infty \tag{12.52} \]
and
\[ f(z) - x_m(z) = \sum_{j=1}^{h} \frac{a_j}{1 - \lambda_j z} (\lambda_j z)^m + O((\xi z)^m) \quad \text{as } m \to \infty, \quad |z| < \xi^{-1} < R, \tag{12.53} \]
in the set $K \setminus \{z_1, \ldots, z_h\}$. Applying the techniques of Chapter 6, we can now prove the following theorem.


Theorem 12.7. Assume that $f(z)$ is exactly as described in the preceding paragraph, and define $z_{h+1} = \xi^{-1}$ (equivalently, $\lambda_{h+1} = \xi$). Let the integer $k$ be such that $|\lambda_k| > |\lambda_{k+1}|$. (This holds immediately when $k = h$.) Assume also that
\[ a_1, \ldots, a_k \text{ are linearly independent} \quad \text{for SMPE and SMMPE}, \tag{12.54a} \]
\[ \begin{vmatrix} (q_1, a_1) & (q_1, a_2) & \cdots & (q_1, a_k) \\ (q_2, a_1) & (q_2, a_2) & \cdots & (q_2, a_k) \\ \vdots & \vdots & & \vdots \\ (q_k, a_1) & (q_k, a_2) & \cdots & (q_k, a_k) \end{vmatrix} \ne 0 \quad \text{for SMMPE}, \tag{12.54b} \]
\[ (q, a_j) \ne 0, \quad j = 1, \ldots, k, \quad \text{for STEA}. \tag{12.54c} \]
Then the following are true:

1. Define $K_k = \{z : |z| < |z_{k+1}|\}$. Then $s_{n,k}(z)$ exists for all sufficiently large $n$. It converges to $f(z)$ as $n \to \infty$ uniformly in $z$, in every compact subset of $K_k \setminus \{z_1, \ldots, z_k\}$, such that
\[ s_{n,k}(z) = f(z) + O(|z / z_{k+1}|^n) \quad \text{as } n \to \infty. \tag{12.55} \]

2. The polynomial $q_{n,k}(z)$ [see (12.27)] exists for all sufficiently large $n$ and satisfies
\[ \lim_{n \to \infty} q_{n,k}(z) = \prod_{j=1}^{k} (1 - \lambda_j z) \]
such that
\[ q_{n,k}(z) = \prod_{j=1}^{k} (1 - \lambda_j z) + O(|\lambda_{k+1} / \lambda_k|^n) \quad \text{as } n \to \infty. \tag{12.56} \]
$q_{n,k}(z)$ has $k$ zeros, $z_1^{(n,k)}, \ldots, z_k^{(n,k)}$, that converge to the poles $z_1, \ldots, z_k$, as in
\[ z_j^{(n,k)} - z_j = O(|\lambda_{k+1} / \lambda_j|^n) \quad \text{as } n \to \infty, \quad j = 1, \ldots, k. \tag{12.57} \]
If $g(z)$ is a polynomial in $z$ and the vectors $a_j$ are mutually orthogonal, that is, $(a_i, a_j) = 0$ if $i \ne j$, these results for SMPE improve to read
\[ q_{n,k}(z) = \prod_{j=1}^{k} (1 - \lambda_j z) + O(|\lambda_{k+1} / \lambda_k|^{2n}) \quad \text{as } n \to \infty \tag{12.58} \]
and
\[ z_j^{(n,k)} - z_j = O(|\lambda_{k+1} / \lambda_j|^{2n}) \quad \text{as } n \to \infty, \quad j = 1, \ldots, k. \tag{12.59} \]

3. The residues of $s_{n,k}(z)$ at its poles $z_1^{(n,k)}, \ldots, z_k^{(n,k)}$ converge to the residues of $f(z)$ at its poles $z_1, \ldots, z_k$, respectively, as in
\[ \operatorname{Res} s_{n,k}(z) \big|_{z = z_j^{(n,k)}} = -a_j z_j + O(|\lambda_{k+1} / \lambda_j|^n) \quad \text{as } n \to \infty, \quad j = 1, \ldots, k. \tag{12.60} \]
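Parts 2 and 3 of Theorem 12.7 can be watched at work in the simplest setting, STEA with $k = 1$, where $q_{n,1}(z) = c_0 z + 1$ has the single zero $-1/c_0 = (q, u_n)/(q, u_{n+1})$. The sketch below is a toy illustration under assumptions: the two poles, the vectors `a1`, `a2`, `q`, and the function names are all made up, the inner product is the real Euclidean one, and $g(z) \equiv 0$.

```python
from fractions import Fraction as F

lam1, lam2 = F(2), F(1, 2)                 # poles z_1 = 1/2 and z_2 = 2
a1, a2 = [F(1), F(0)], [F(1), F(3)]
q = [F(1), F(1)]                           # (q, a1) = 1 != 0, as (12.54c) requires

def u(m):
    """u_m = a1*lam1^m + a2*lam2^m, cf. (12.52) with g(z) = 0."""
    return [x * lam1**m + y * lam2**m for x, y in zip(a1, a2)]

def dot(v, w):
    return sum(x * y for x, y in zip(v, w))

def pole_and_residue(n):
    c0 = -dot(q, u(n + 1)) / dot(q, u(n))  # (12.28) with k = 1 and c_1 = 1
    z_hat = -1 / c0                        # zero of q_{n,1}(z) = c0*z + 1
    # p_{n,1}(z_hat) = c0*z_hat*x_n(z_hat) + x_{n+1}(z_hat) = u_n * z_hat^n,
    # and the derivative of q_{n,1} is c0, so the residue is p/q'.
    res = [v * z_hat**n / c0 for v in u(n)]
    return z_hat, res

errors = [pole_and_residue(n)[0] - 1 / lam1 for n in range(12)]
ratio = errors[11] / errors[10]            # expected to approach lam2/lam1 = 1/4
z_hat, res = pole_and_residue(10)
```

The pole errors shrink by a factor close to $|\lambda_2/\lambda_1| = 1/4$ per step, in line with (12.57), and the computed residue approaches $-a_1 z_1 = [-1/2,\ 0]$, in line with (12.60).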


12.7.2 Meromorphic $f(z)$ with multiple poles

We now turn to the case in which the (distinct) poles $z_j = \lambda_j^{-1}$ of $f(z)$ are multiple in general. Let us denote by $\omega_j = p_j + 1$ the multiplicity of $z_j$ for each $j$. Consequently, $f(z)$ has the representation
\[ f(z) = \sum_{j=1}^{h} \sum_{i=0}^{p_j} \frac{a_{ji}}{(1 - \lambda_j z)^{i+1}} + g(z), \quad g(z) \text{ analytic in } K. \tag{12.61} \]
Here the $a_{ji}$ are constant vectors in $\mathbb{C}^N$ such that $a_{j p_j} \ne 0$, $1 \le j \le h$, and $g(z)$ is precisely as in (12.49) and (12.50). With these assumptions, $f(z)$ has a power series as in (12.1), whose coefficients $u_m$ are
\[ u_m = \sum_{j=1}^{h} \left[ \sum_{l=0}^{p_j} \tilde{a}_{jl} \binom{m}{l} \right] \lambda_j^m + o(\xi^m) \quad \text{as } m \to \infty, \tag{12.62} \]
where the $\tilde{a}_{jl}$ are constant vectors in $\mathbb{C}^N$ defined as
\[ \tilde{a}_{jl} = \sum_{i=l}^{p_j} \binom{i}{l} a_{ji}, \tag{12.63} \]
and
\[ f(z) - x_m(z) = \sum_{j=1}^{h} \left[ \sum_{l=0}^{p_j} b_{jl}(z) \binom{m}{l} \right] (\lambda_j z)^m + o((\xi z)^m) \quad \text{as } m \to \infty, \tag{12.64} \]
where the $b_{jl}(z)$ are vector-valued rational functions of $z$ defined as
\[ b_{jl}(z) = \sum_{i=l}^{p_j} \frac{a_{ji}}{1 - \lambda_j z} \sum_{q=l}^{i} \binom{i}{q-l} \left( \frac{\lambda_j z}{1 - \lambda_j z} \right)^{i-q}. \tag{12.65} \]

Theorem 12.8. Assume that $f(z)$ is exactly as described in the preceding paragraph, and define $z_{h+1} = \xi^{-1}$ (equivalently, $\lambda_{h+1} = \xi$). Assume that $|\lambda_t| > |\lambda_{t+1}|$ (this holds immediately when $t = h$), and let $k = \sum_{j=1}^{t} \omega_j$. Assume also that
\[ a_{ji},\ 0 \le i \le p_j,\ 1 \le j \le t, \text{ are linearly independent} \quad \text{for SMPE and SMMPE}, \tag{12.66a} \]
\[ \begin{vmatrix} (q_1, a_{10}) & \cdots & (q_1, a_{1 p_1}) & \cdots & (q_1, a_{t0}) & \cdots & (q_1, a_{t p_t}) \\ \vdots & & \vdots & & \vdots & & \vdots \\ (q_k, a_{10}) & \cdots & (q_k, a_{1 p_1}) & \cdots & (q_k, a_{t0}) & \cdots & (q_k, a_{t p_t}) \end{vmatrix} \ne 0 \quad \text{for SMMPE}, \tag{12.66b} \]
\[ (q, a_{j p_j}) \ne 0, \quad j = 1, \ldots, t, \quad \text{for STEA}. \tag{12.66c} \]
Let also
\[ \bar{p} = \max\{ p_j : |\lambda_j| = |\lambda_{t+1}| \}. \tag{12.67} \]
Then the following are true:

1. Define $K_t = \{z : |z| < |z_{t+1}|\}$. Then $s_{n,k}(z)$ exists for all sufficiently large $n$. It converges to $f(z)$ as $n \to \infty$ uniformly in $z$, in every compact subset of $K_t \setminus \{z_1, \ldots, z_t\}$, such that
\[ s_{n,k}(z) = f(z) + O(n^{\bar{p}} |z / z_{t+1}|^n) \quad \text{as } n \to \infty. \tag{12.68} \]

2. The polynomial $q_{n,k}(z)$ exists for all sufficiently large $n$ and satisfies
\[ \lim_{n \to \infty} q_{n,k}(z) = \prod_{j=1}^{t} (1 - \lambda_j z)^{\omega_j} \]
such that
\[ q_{n,k}(z) = \prod_{j=1}^{t} (1 - \lambda_j z)^{\omega_j} + O(n^{\alpha} |\lambda_{t+1} / \lambda_t|^n) \quad \text{as } n \to \infty, \tag{12.69} \]
where $\alpha$ is a nonnegative integer. That is, $q_{n,k}(z)$ has exactly $k$ zeros that converge to the poles of $f(z)$.

For each $j = 1, \ldots, t$, $\omega_j$ zeros $z_{j1}^{(n,k)}, \ldots, z_{j\omega_j}^{(n,k)}$ of $q_{n,k}(z)$ converge to the pole $z_j$ as in
\[ z_{jl}^{(n,k)} - z_j = O(\delta_j(n,k)^{1/\omega_j}) \quad \text{as } n \to \infty, \quad 1 \le l \le \omega_j, \quad \delta_j(n,k) = n^{\bar{p}} |\lambda_{t+1} / \lambda_j|^n. \tag{12.70} \]
Let us denote
\[ \hat{z}_j^{(n,k)} = \frac{1}{\omega_j} \sum_{l=1}^{\omega_j} z_{jl}^{(n,k)}. \tag{12.71} \]
Then
\[ \hat{z}_j^{(n,k)} - z_j = O(\delta_j(n,k)) \quad \text{as } n \to \infty. \tag{12.72} \]

3. Let us rewrite $f(z)$ as
\[ f(z) = \sum_{j=1}^{h} \sum_{i=0}^{p_j} \frac{d_{ji}}{(z - z_j)^{i+1}} + g(z), \quad d_{ji} = a_{ji} (-z_j)^{i+1} \ \ \forall\, j, i. \tag{12.73} \]
Denote the residues of $(z - \hat{z}_j^{(n,k)})^i s_{n,k}(z)$ at its poles $z_{jl}^{(n,k)}$ by $d_{ji,l}^{(n,k)}$, $1 \le l \le \omega_j$, $0 \le i \le p_j$, and let
\[ \hat{d}_{ji}^{(n,k)} = \sum_{l=1}^{\omega_j} d_{ji,l}^{(n,k)}. \tag{12.74} \]
Then
\[ \limsup_{n \to \infty} \left\| \hat{d}_{ji}^{(n,k)} - d_{ji} \right\|^{1/n} \le |z_j / z_{t+1}|. \tag{12.75} \]

12.7.3 Conclusions from Theorems 12.7 and 12.8

The following conclusions can be drawn from Theorems 12.7 and 12.8:

1. Since $f(z)$ is analytic at $z = 0$ and $z_1$ is the pole that is closest to $z = 0$, the sequence of the partial sums $x_n(z) = \sum_{i=0}^{n-1} u_i z^i$ converges to $f(z)$ only for $|z| < |z_1|$, and from (12.48) and (12.61), we have
\[ \limsup_{n \to \infty} \| f(z) - x_n(z) \|^{1/n} \le |z / z_1|, \quad |z| < |z_1|. \]

2. We see from Theorems 12.7 and 12.8, however, that $s_{n,k}(z)$ converges to $f(z)$ for $|z| < |\tilde{z}|$, with (i) $\tilde{z} = z_{k+1}$ in Theorem 12.7 when poles are simple and (ii) $\tilde{z} = z_{t+1}$ in Theorem 12.8 when poles are multiple, and we have
\[ \limsup_{n \to \infty} \| f(z) - s_{n,k}(z) \|^{1/n} \le |z / \tilde{z}|. \]
Thus, $s_{n,k}(z)$ converges in a larger set of the $z$-plane than $x_n(z)$ since $|\tilde{z}| > |z_1|$ in all cases, and it converges faster than $x_n(z)$ where the latter converges. We thus conclude that the methods we have developed here are also true convergence acceleration methods for vector-valued power series.

3. The poles of $s_{n,k}(z)$ tend to the $k$ smallest poles of $f(z)$, counted according to their multiplicities. If $z_j$ is a simple pole, the error in $z_j^{(n,k)}$ tends to zero like $\delta_j(n,k)$ as $n \to \infty$. If $z_j$ has multiplicity $\omega_j > 1$, then the $\omega_j$ approximations $z_{jl}^{(n,k)}$ tend to $z_j$ at a rate that is $\omega_j$ times as slow as if $z_j$ were a simple pole; that is, the errors in the $z_{jl}^{(n,k)}$ tend to zero like $\delta_j(n,k)^{1/\omega_j}$ as $n \to \infty$. The simple average $\hat{z}_j^{(n,k)}$ of the $z_{jl}^{(n,k)}$ tends to $z_j$ as if the latter were a simple pole, however. In this sense, $\hat{z}_j^{(n,k)}$ is an "optimal" approximation to $z_j$.

4. Note the following facts concerning residues and their actual computation from $s_{n,k}(z)$:

• By (12.48), $a_j$ is given by
\[ \operatorname{Res} f(z) \big|_{z = z_j} = -a_j z_j, \]
and since $z_j^{(n,k)}$ is simple,
\[ \operatorname{Res} s_{n,k}(z) \big|_{z = z_j^{(n,k)}} = \frac{p_{n,k}(z_j^{(n,k)})}{q'_{n,k}(z_j^{(n,k)})}. \]

• By (12.61), $a_{ji}$ are given by
\[ \operatorname{Res}\, [(z - z_j)^{i+1} f(z)] \big|_{z = z_j} = a_{ji} (-z_j)^{i+1}, \]
and since the $z_{jl}^{(n,k)}$ are simple,
\[ \operatorname{Res}\, [(z - \hat{z}_j^{(n,k)})^i s_{n,k}(z)] \big|_{z = z_{jl}^{(n,k)}} = (z_{jl}^{(n,k)} - \hat{z}_j^{(n,k)})^i\, \frac{p_{n,k}(z_{jl}^{(n,k)})}{q'_{n,k}(z_{jl}^{(n,k)})}. \]

Here $q'_{n,k}(z)$ is the derivative of $q_{n,k}(z)$ with respect to $z$.

5. In remark 2 following the statement of Theorem 6.6 and also in remark 1 following Theorem 6.23, we discussed the relative advantage of MPE, RRE, and MMPE versus TEA and vice versa. In view of Theorems 12.7 and 12.8, we can give an analogous discussion pertaining to SMPE and SMMPE versus STEA. When (12.54a) or (12.66a) is satisfied, SMPE is more efficient than STEA. The same holds true for SMMPE when (12.54b) or (12.66b) is satisfied. When (12.54a) or (12.66a) does not hold, we cannot make a statement about the convergence of SMPE and SMMPE. In such a case, STEA can be used efficiently, since (12.54c) or (12.66c) can always be forced to hold by choosing the vector $q$ appropriately.

Chapter 13

Rational Approximations from Vector-Valued Power Series: Part II

13.1 Introduction

The problem of approximating vector-valued functions by rational functions using their power series expansions has received a lot of attention, and different approaches to this problem have been developed. In this chapter, we will give a brief overview of this important topic. All of the approaches, including SMPE, SMMPE, and STEA, which we have already discussed, have one common property: The rational approximations that result from them are all of the form $r(z) = p(z)/q(z)$, where $p(z)$ is a vector-valued polynomial and $q(z)$ is a scalar polynomial. Assume that $f : \mathbb{C} \to \mathbb{C}^N$, that is,

$$f(z) = [f_1(z), f_2(z), \ldots, f_N(z)]^T, \quad f_i(z)\ \text{scalar functions}.$$

Then

$$p(z) = [p_1(z), p_2(z), \ldots, p_N(z)]^T, \quad p_i(z)\ \text{scalar functions};$$

hence

$$r(z) = \left[\frac{p_1(z)}{q(z)}, \frac{p_2(z)}{q(z)}, \ldots, \frac{p_N(z)}{q(z)}\right]^T.$$

In other words, all components of $r(z)$ have the same denominator polynomial. Of course, what differs between the methods are the criteria imposed on $r(z)$ by which the $p_i(z)$ and $q(z)$ are determined.

Before going on, we recall the following shorthand notation concerning polynomials and power series: For a given power series $g(z) = \sum_{i=0}^{\infty} g_i z^i$, we will let

$$[g(z)]_r^s = \sum_{i=r}^{s} g_i z^i = g_r z^r + g_{r+1} z^{r+1} + \cdots + g_s z^s.$$

This notation will be useful in the next sections.

13.2 Generalized inverse vector-valued Padé approximants

We will start our discussion with generalized inverse vector-valued Padé approximants, as these are very closely related to VEA. These approximants were originally developed in a more general setting by Graves-Morris in [107, 108]. They have been studied extensively in Graves-Morris and Jenkins [112], Graves-Morris and Saff [113, 114, 115, 116], and Roberts [219, 220, 221]. See also Chapter 8 in Baker and Graves-Morris [14], which contains a wealth of information on extensions of Padé approximants to the vector and matrix cases.

Given the vector-valued power series $f(z) = \sum_{i=0}^{\infty} c_i z^i$, we define the vector Padé approximant of type $[n/2k]$ to be the vector-valued rational function $r^{[n/2k]}(z) = p(z)/q(z)$, where $p(z)$ is a vector-valued polynomial and $q(z)$ is a real scalar polynomial, such that$^{63}$

$$\deg p \le n, \quad \deg q = 2k, \quad \text{and} \quad q(0) \ne 0;$$

$$q(z) \ \text{divides} \ \langle p(\bar z), p(z) \rangle;$$

and

$$f(z) - r^{[n/2k]}(z) = O(z^{n+1}) \quad \text{as } z \to 0.$$

This definition of $r^{[n/2k]}(z)$ suffices to prove that, if it exists, $r^{[n/2k]}(z)$ is unique. The denominator polynomial $q(z)$ has the determinant representation

$$q(z) = E(z^{2k}, z^{2k-1}, \ldots, z^1, z^0),$$

where

$$E(v_0, v_1, \ldots, v_{2k}) = \begin{vmatrix} v_0 & v_1 & \cdots & v_{2k} \\ M_{00} & M_{01} & \cdots & M_{0,2k} \\ M_{10} & M_{11} & \cdots & M_{1,2k} \\ \vdots & \vdots & & \vdots \\ M_{2k-1,0} & M_{2k-1,1} & \cdots & M_{2k-1,2k} \end{vmatrix},$$

with

$$M_{ii} = 0, \qquad M_{ij} = \sum_{r=0}^{j-i-1} \langle c_{j-r+n-2k}, c_{i+r+n-2k+1} \rangle = -M_{ji}, \quad i < j.$$

(Note that the matrix $M = [M_{ij}]_{0 \le i,j \le 2k}$ is skew symmetric of odd order and singular.) As for $p(z)$, we have

$$p(z) = [q(z) f(z)]_0^n.$$

With the help of the determinant representation of $q(z)$, we can now give a determinant representation for $p(z)$ as well. First, if we let $q(z) = \sum_{j=0}^{2k} q_j z^{2k-j}$, we realize that $q_j$ is the cofactor of the entry $v_j$ in the first row of the determinant $E(v_0, v_1, \ldots, v_{2k})$. In addition,

$$[q(z) f(z)]_0^n = \sum_{j=0}^{2k} q_j z^{2k-j} f_{n-2k+j}(z),$$

where

$$f_m(z) = \sum_{i=0}^{m} c_i z^i, \quad m = 0, 1, \ldots.$$

Therefore, $p(z)$ has the determinant representation

$$p(z) = E\big(z^{2k} f_{n-2k}(z), z^{2k-1} f_{n-2k+1}(z), \ldots, z^1 f_{n-1}(z), z^0 f_n(z)\big).$$

Consequently, $r^{[n/2k]}(z)$ has the following determinant representation:

$$r^{[n/2k]}(z) = \frac{E\big(z^{2k} f_{n-2k}(z), z^{2k-1} f_{n-2k+1}(z), \ldots, z^1 f_{n-1}(z), z^0 f_n(z)\big)}{E(z^{2k}, z^{2k-1}, \ldots, z^1, z^0)}.$$

$^{63}$ As always, $\langle y, z \rangle = y^* z$ is the standard Euclidean inner product. Thus, if $a(z) = \sum_{i=0}^{n} a_i z^i$ and $b(z) = \sum_{i=0}^{n} b_i z^i$, then $\langle a(\bar z), b(z) \rangle = \sum_{i=0}^{n} \sum_{j=0}^{n} \langle a_i, b_j \rangle z^{i+j}$.

The following convergence theorem, which is analogous to Theorems 12.7 and 12.8, is given in [114].

Theorem 13.1. Let $f(z)$ be analytic at $z = 0$ and meromorphic in the disk $D_R = \{z : |z| < R\}$, with poles $z_1, \ldots, z_k$ in $D_R$. Let $Q(z) = \prod_{i=1}^{k} (z - z_i)$ and $g(z) = Q(z) f(z)$. Assume that $\langle g(\bar z_i), g(z_i) \rangle \ne 0$, $i = 1, \ldots, k$. Let $r^{[n/2k]}(z) = p_{n,k}(z)/q_{n,k}(z)$, where $q_{n,k}(z)$ is normalized to be monic. Then the following are true with $k$ fixed: First,

$$\lim_{n \to \infty} r^{[n/2k]}(z) = f(z) \quad \forall\, z \in D_R^-, \qquad D_R^- = D_R \setminus \{z_1, \ldots, z_k\},$$

the rate of convergence being governed by

$$\limsup_{n \to \infty} \| f(z) - r^{[n/2k]}(z) \|_{K_\mu}^{1/n} \le \mu/R \quad \forall\, \mu < R,$$

where $K_\mu = D_R^- \cap \{z : |z| < \mu\}$. Next,

$$\lim_{n \to \infty} q_{n,k}(z) = [Q(z)]^2,$$

the rate of convergence being governed by

$$\limsup_{n \to \infty} |q_{n,k}(z) - [Q(z)]^2|^{1/n} \le \max_{1 \le i \le k} |z_i|/R \quad \forall\, z \in \mathbb{C}.$$

Remarks:

1. The condition $\langle g(\bar z_i), g(z_i) \rangle \ne 0$, $i = 1, \ldots, k$, is satisfied automatically if $f(x) \in \mathbb{R}^N$ when $x \in \mathbb{R}$ and $z_1, \ldots, z_k$ are all real.

2. Note that, qualitatively speaking, the rate of convergence of $r^{[n/2k]}(z)$ to $f(z)$ is the same as that of $s_{n,k}(z)$ via the methods SMPE, SMMPE, and STEA.

3. The convergence of the denominator polynomials of $r^{[n/2k]}(z)$ is much different from that of the denominators of $s_{n,k}(z)$, however. The denominator of $s_{n,k}(z)$ has precisely $k$ zeros that tend to the $k$ poles of $f(z)$ in $D_R$. On the other hand, the denominator of $r^{[n/2k]}(z)$ has $2k$ zeros that tend to the $k$ poles of $f(z)$ in pairs. This does not imply that the residues of $r^{[n/2k]}(z)$ converge to those of $f(z)$. In fact, numerical experiments and some analysis indicate that the residues do not converge. (See [14, pp. 488–489] for more details.)


13.3 Simultaneous Padé approximants

Suppose the Maclaurin series of the function $f(z)$ is given as

$$f(z) = \sum_{j=0}^{\infty} c_j z^j, \qquad c_j = [c_j^{(1)}, c_j^{(2)}, \ldots, c_j^{(N)}]^T. \tag{13.1}$$

Then

$$f_i(z) = \sum_{j=0}^{\infty} c_j^{(i)} z^j, \quad i = 1, \ldots, N. \tag{13.2}$$

Let us choose an index set

$$I = \{m_1, m_2, \ldots, m_N\}, \quad m_i \ge 0 \ \text{integers}. \tag{13.3}$$

Then we choose the polynomials $p_i(z)$ and $q(z)$ such that$^{64}$

$$\deg p_i \le L_i = n_i - m_i, \quad i = 1, \ldots, N, \qquad \deg q \le M = \sum_{i=1}^{N} m_i, \tag{13.4}$$

and such that, for some integers $n_1, \ldots, n_N$,

$$q(z) f_i(z) - p_i(z) = O(z^{n_i+1}) \quad \text{as } z \to 0, \quad i = 1, \ldots, N. \tag{13.5}$$

Note that the number of unknown coefficients in $q(z)$ is $M + 1$, while this number for $p_i(z)$ is $L_i + 1$, $i = 1, \ldots, N$. Therefore, the total number of unknown coefficients in the $p_i(z)$ and $q(z)$ is $M + 1 + \sum_{i=1}^{N} (L_i + 1) = 1 + \sum_{i=1}^{N} (n_i + 1)$. By (13.5), the first $n_i + 1$ coefficients of the Maclaurin expansion of $p_i(z) - q(z) f_i(z)$ vanish. Thus, $p_i(z)$ and $q(z)$ satisfy a total of $\sum_{i=1}^{N} (n_i + 1)$ (linear homogeneous) equations in $1 + \sum_{i=1}^{N} (n_i + 1)$ unknowns. Of course, a nontrivial solution to this system is guaranteed. Now, (13.5) can be rewritten as

$$[q(z) f_i(z) - p_i(z)]_0^{n_i} = 0, \quad i = 1, \ldots, N. \tag{13.6}$$

The computation of the coefficients of $q(z)$ can be separated from that of the coefficients of the $p_i(z)$ if we realize from (13.4) and (13.6) that

$$[q(z) f_i(z) - p_i(z)]_{L_i+1}^{n_i} = 0 \quad \Rightarrow \quad [q(z) f_i(z)]_{L_i+1}^{n_i} = 0, \quad i = 1, \ldots, N. \tag{13.7}$$

The number of these (homogeneous) equations is $\sum_{i=1}^{N} (n_i - L_i) = M$, while the number of unknown coefficients of $q(z)$ is $M + 1$. Assuming that the matrix of these equations has rank $M$, a solution exists and is given as

$$q(z) = \left| \begin{array}{cccc} z^M & z^{M-1} & \cdots & z^0 \\ c^{(1)}_{n_1-M-m_1+1} & c^{(1)}_{n_1-M-m_1+2} & \cdots & c^{(1)}_{n_1-m_1+1} \\ \vdots & \vdots & & \vdots \\ c^{(1)}_{n_1-M} & c^{(1)}_{n_1-M+1} & \cdots & c^{(1)}_{n_1} \\ \hline \vdots & \vdots & & \vdots \\ \hline c^{(N)}_{n_N-M-m_N+1} & c^{(N)}_{n_N-M-m_N+2} & \cdots & c^{(N)}_{n_N-m_N+1} \\ \vdots & \vdots & & \vdots \\ c^{(N)}_{n_N-M} & c^{(N)}_{n_N-M+1} & \cdots & c^{(N)}_{n_N} \end{array} \right|. \tag{13.8}$$

$^{64}$ It is easy to see that, when $N = 1$, $p_1(z)/q(z)$ is simply the $[n_1 - m_1/m_1]$ Padé approximant for $f_1(z)$.

To obtain the coefficients of $p_i(z)$, we use the remaining equations from (13.6), namely,

$$[q(z) f_i(z) - p_i(z)]_0^{L_i} = 0 \quad \Rightarrow \quad p_i(z) = [q(z) f_i(z)]_0^{L_i}, \quad i = 1, \ldots, N. \tag{13.9}$$
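The two-stage computation in (13.7) and (13.9) can be sketched numerically as follows. This is only an illustrative sketch under stated assumptions, not the book's algorithm: the function name `simultaneous_pade` is hypothetical, the coefficients of $q(z)$ are stored in ascending powers of $z$ (rather than in the order used in the first row of (13.8)), the null vector of the $M \times (M+1)$ system is taken from an SVD, and it is assumed that $M \ge 1$, that the system has rank $M$, and that $q(0) \ne 0$ so that the normalization $q(0) = 1$ is possible.

```python
import numpy as np

def simultaneous_pade(c, n, m):
    """Sketch of (13.7) and (13.9).

    c[i][j] is the coefficient of z^j in f_i(z); n[i] and m[i] are the
    integers n_i and m_i.  Returns (q, p), where q holds the coefficients of
    q(z) in ascending powers (normalized so that q[0] = 1) and p[i] holds the
    coefficients of p_i(z) in ascending powers.
    """
    N = len(c)
    M = sum(m)
    rows = []
    for i in range(N):
        L_i = n[i] - m[i]
        for s in range(L_i + 1, n[i] + 1):
            # Coefficient of z^s in q(z) f_i(z) is sum_l q_l c^{(i)}_{s-l}.
            rows.append([c[i][s - l] if s - l >= 0 else 0.0
                         for l in range(M + 1)])
    E = np.array(rows, dtype=float)      # M equations, M + 1 unknowns
    _, _, Vh = np.linalg.svd(E)
    q = Vh[-1]                           # null vector when rank(E) = M
    q = q / q[0]                         # assumes q(0) != 0
    p = []
    for i in range(N):
        L_i = n[i] - m[i]
        p.append(np.array([
            sum(q[l] * c[i][s - l] for l in range(min(s, M) + 1))
            for s in range(L_i + 1)
        ]))
    return q, p

# Example: f_1(z) = 1/(1 - z), f_2(z) = 1/(1 - 2z), with m_1 = m_2 = 1.
c = [[1.0] * 6, [2.0 ** j for j in range(6)]]
q, p = simultaneous_pade(c, n=[4, 4], m=[1, 1])
```

In this example the common denominator comes out as $(1 - z)(1 - 2z)$, and the numerators are $1 - 2z$ and $1 - z$, so both components are reproduced exactly.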

Following these, we form

$$r(z) = \frac{p(z)}{q(z)}, \tag{13.10}$$

as explained in the preceding section. This $r(z)$ is called the simultaneous Padé approximant of type $[L, M]$ with index set $I$. The condition that $q(0) \ne 0$ guarantees the uniqueness of $r(z)$.

The subject of simultaneous Padé approximation was developed by Mahler [179] and by de Bruin [65, 66, 67, 68, 69]. See also van Iseghem [327, 328, 329, 330]. Detailed convergence results of de Montessus type for them were given by Graves-Morris and Saff [113, 115]. Two of their results are summarized in Theorem 13.3. We start with the following definition.

Definition 13.2. Let each of the functions $f_1(z), \ldots, f_N(z)$ be meromorphic in the disk $D_R = \{z : |z| < R\}$, and let nonnegative integers $m_1, \ldots, m_N$ be given for which $\sum_{i=1}^{N} m_i > 0$. Then the functions $f_i(z)$ are said to be polewise independent, with respect to the numbers $m_i$, in $D_R$, if there do not exist polynomials $\pi_1(z), \ldots, \pi_N(z)$, at least one of which is nonnull, satisfying

$$\deg \pi_i(z) \le m_i - 1 \ \text{ if } m_i \ge 1, \qquad \pi_i(z) \equiv 0 \ \text{ if } m_i = 0,$$

and such that

$$\phi(z) = \sum_{i=1}^{N} \pi_i(z) f_i(z)$$

is analytic in $D_R$.

Let us now go back to the definition of the simultaneous Padé approximants given in the first paragraph of this section. Let us fix the integers $m_i$, and hence $M = \sum_{i=1}^{N} m_i$, and set $n_i = n$ for all $i = 1, \ldots, N$. Let us also denote the resulting approximant by $r_n(z) = p_n(z)/q_n(z)$. Note that as $n \to \infty$, $L_i \to \infty$ for all $i$ simultaneously.

Theorem 13.3. Suppose that each of the functions $f_1(z), \ldots, f_N(z)$ is analytic in the disk $D_R = \{z : |z| < R\}$, except for possible poles at the $M$ points $z_1, \ldots, z_M$ counted according to their multiplicities. Let $m_1, \ldots, m_N$ be nonnegative integers such that $M = \sum_{i=1}^{N} m_i$ and such that the functions $f_i(z)$ are polewise independent in $D_R$ with respect to the $m_i$. Then the following are true:

1. The suitably normalized denominator polynomial $q_n(z)$ satisfies

$$\lim_{n \to \infty} q_n(z) = q_\infty(z) = \prod_{i=1}^{M} (z - z_i) \quad \forall\, z \in \mathbb{C}.$$

More precisely, if $K$ is any compact subset of the complex plane,

$$\limsup_{n \to \infty} \| q_n(z) - q_\infty(z) \|_K^{1/n} \le \max_{1 \le i \le M} |z_i|/R.$$

2. The sequence $\{r_n(z)\}$ converges as in

$$\lim_{n \to \infty} r_n(z) = f(z) \quad \forall\, z \in D_R^-, \qquad D_R^- = D_R \setminus \{z_1, \ldots, z_M\},$$

the convergence being uniform on compact subsets of $D_R^-$. More precisely, in any compact subset $E$ of $D_R^-$,

$$\limsup_{n \to \infty} \| r_n(z) - f(z) \|_E^{1/n} \le \| z \|_E / R.$$

Here the norms are the sup norms taken over the indicated sets.

3. If, for some $i$, $f_i(z)$ is not analytic at some point on the boundary of $D_R$, then the sequence $\{r_n(z)\}$ diverges at every point exterior to $D_R$ according to the rule

$$\limsup_{n \to \infty} \| r_n(z) - f(z) \|^{1/n} > |z|/R.$$

13.4 Directed simultaneous Padé approximants

The index set $I = \{m_1, \ldots, m_N\}$ serves to specify the degree reduction from $n_i$ to $L_i = n_i - m_i$ in all $N$ components of $r(z)$. We now generalize this idea and demand that degree reduction take place along certain linearly independent directions $w_1, \ldots, w_r$ for some $r < N$. If we also force $n_i = n$ for all of these directions, we have the following definition for $r(z) = p(z)/q(z)$:

$$\deg p(z) \le n, \quad \deg q(z) = M, \quad f(z) - r(z) = O(z^{n+1}) \ \text{ as } z \to 0, \tag{13.11}$$

$$\deg \langle w_i, p(z) \rangle = L_i \le n - m_i, \quad i = 1, \ldots, r, \qquad M = \sum_{i=1}^{r} m_i, \tag{13.12}$$

where $\langle y, z \rangle = y^* z$. The resulting rational function $r(z)$ is now called a directed simultaneous Padé approximant. These approximations were developed in Graves-Morris [108].

First, we can uncouple the solution for $q(z)$ from $p(z)$ as follows: By (13.11),

$$[\langle w_i, q(z) f(z) - p(z) \rangle]_0^n = 0 \quad \Rightarrow \quad [q(z) \langle w_i, f(z) \rangle - \langle w_i, p(z) \rangle]_0^n = 0, \tag{13.13}$$

which, upon invoking (13.12), gives

$$[q(z) \langle w_i, f(z) \rangle]_{L_i+1}^{n} = 0, \quad i = 1, \ldots, r. \tag{13.14}$$

Thus, we have obtained $\sum_{i=1}^{r} (n - L_i) = \sum_{i=1}^{r} m_i = M$ linear homogeneous equations for the $M + 1$ unknown coefficients of $q(z)$. Of course, a nontrivial solution exists and we have $q(z) = G(z^M, z^{M-1}, \ldots, z^0)$, where

$$G(v_0, v_1, \ldots, v_M) = \left| \begin{array}{cccc} v_0 & v_1 & \cdots & v_M \\ c_{1,n-M-m_1+1} & c_{1,n-M-m_1+2} & \cdots & c_{1,n-m_1+1} \\ c_{1,n-M-m_1+2} & c_{1,n-M-m_1+3} & \cdots & c_{1,n-m_1+2} \\ \vdots & \vdots & & \vdots \\ c_{1,n-M} & c_{1,n-M+1} & \cdots & c_{1,n} \\ \hline \vdots & \vdots & & \vdots \\ \hline c_{r,n-M-m_r+1} & c_{r,n-M-m_r+2} & \cdots & c_{r,n-m_r+1} \\ c_{r,n-M-m_r+2} & c_{r,n-M-m_r+3} & \cdots & c_{r,n-m_r+2} \\ \vdots & \vdots & & \vdots \\ c_{r,n-M} & c_{r,n-M+1} & \cdots & c_{r,n} \end{array} \right|, \qquad c_{i,j} = \langle w_i, c_j \rangle.$$

Proceeding as before, once $q(z)$ has been determined, we have

$$[q(z) f(z) - p(z)]_0^n = 0 \quad \Rightarrow \quad p(z) = [q(z) f(z)]_0^n;$$

hence,

$$p(z) = G\big(z^M f_{n-M}(z), z^{M-1} f_{n-M+1}(z), \ldots, z^0 f_n(z)\big).$$

As before, $f_m(z) = \sum_{j=0}^{m} c_j z^j$, $m = 0, 1, \ldots$. Consequently, $r(z)$ has the determinant representation

$$r(z) = \frac{G\big(z^M f_{n-M}(z), z^{M-1} f_{n-M+1}(z), \ldots, z^0 f_n(z)\big)}{G(z^M, z^{M-1}, \ldots, z^0)}.$$

A de Montessus–type convergence theorem for these approximations was proved by Graves-Morris and Saff [113], and it is the same as Theorem 13.3, except that we now assume that the functions $\langle w_i, f(z) \rangle$, $i = 1, \ldots, r$, are polewise independent in $D_R$ with respect to the integers $m_1, \ldots, m_r$.

Finally, by choosing the $w_i$ and the $m_i$ in suitable ways, we obtain the methods SMMPE and STEA as special cases of the directed simultaneous Padé approximation:

• Let $r = M$ and $m_i = 1$, $i = 1, \ldots, M$. Then $r(z) = s_{n-M,M}(z)$ from SMMPE.

• Let $r = 1$ and $m_1 = M$. Then $r(z) = s_{n-M,M}(z)$ from STEA.

We leave the details to the reader. Note that the convergence theory for SMMPE and STEA that we gave in the preceding chapter is carried out under conditions that are stated differently from those given for directed simultaneous Padé approximants.

Chapter 14

Applications of SMPE, SMMPE, and STEA

14.1 Introduction

In this chapter, we discuss some applications of the vector-valued rational approximation procedures SMPE, SMMPE, and STEA that we studied in Chapter 12.

The first application of these procedures is to the solution of the equation $x = zAx + b$ in conjunction with Picard iterations. The approximations to the unknown vector $x$ obtained via these procedures have a close connection with Krylov subspace methods for the eigenvalue problem for $A$.

The second application is to the solution of Fredholm integral equations of the second kind. Here we are working in an infinite-dimensional function space with the inner product defined via an integral.

The third application is to problems involving the study of the so-called reanalysis of structures, a problem of interest in civil engineering.

The fourth application is to the perturbation expansion of the solution to a nonlinear differential equation with a small parameter. The perturbation solution is obtained via the Lindstedt–Poincaré method, and we are again working in an infinite-dimensional function space with the inner product defined via an integral.

The final application is to the computation of vectors of the form $f(A)b$, where $f(t)$ is a scalar function, $A$ and $f(A)$ are square matrices of the same dimension, and $b$ is a vector. The matrix $f(A)$ is called a matrix function and has a very elegant theory having to do with minimal polynomials of matrices. We summarize this theory and discuss the application of vector-valued rational approximation procedures and also of Krylov subspace methods to the approximation of the vector $f(A)b$.

14.2 Application to the solution of x = zAx + b versus Krylov subspace methods

14.2.1 Fixed-point iterations and $s_{n,k}(z)$

Let us apply the procedures developed above to the solution of the problem

$$x = zAx + b, \quad z \in \mathbb{C}, \quad A \in \mathbb{C}^{N \times N}, \quad x, b \in \mathbb{C}^N. \tag{14.1}$$

Clearly, the solution $x$ of this problem is $f(z) = (I - zA)^{-1} b$. Picking $x_0(z) = 0$, and iterating as in

$$x_{m+1}(z) = zA x_m(z) + b, \quad m = 0, 1, \ldots, \tag{14.2}$$

we obtain

$$x_m(z) = \sum_{i=0}^{m-1} u_i z^i, \quad m = 0, 1, \ldots, \qquad u_i = A^i b. \tag{14.3}$$

Clearly, if $|z| \rho(A) < 1$, we have

$$\lim_{m \to \infty} x_m(z) = (I - zA)^{-1} b = f(z). \tag{14.4}$$
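The iteration (14.2) and its limit (14.4) can be illustrated with a short NumPy sketch. The function name `picard_iterates` and the small test matrix are illustrative choices, not taken from the text; the sketch only generates the iterates $x_m(z)$ and does not apply any extrapolation.

```python
import numpy as np

def picard_iterates(A, b, z, m_max):
    """Return [x_0, x_1, ..., x_{m_max}] from x_{m+1} = z A x_m + b, x_0 = 0."""
    x = np.zeros_like(b, dtype=complex)
    iterates = [x]
    for _ in range(m_max):
        x = z * (A @ x) + b
        iterates.append(x)
    return iterates

# Example with rho(A) = 0.5, so the iteration converges for |z| < 2.
A = np.array([[0.5, 0.1], [0.0, 0.25]])
b = np.array([1.0, 1.0])
xs = picard_iterates(A, b, z=1.0, m_max=60)
exact = np.linalg.solve(np.eye(2) - A, b)   # f(1) = (I - A)^{-1} b
```

Here `xs[m]` equals the partial sum $\sum_{i=0}^{m-1} A^i b\, z^i$ of (14.3), and `xs[-1]` agrees with `exact` to rounding error because $|z| \rho(A) = 0.5 < 1$.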

Let us now analyze the spectral structure of $f(z)$.

• If $A$ is diagonalizable, and assuming for simplicity that the eigenvalues of $A$ are nonzero, we have

$$f(z) = \sum_{j=1}^{h} \frac{v_j}{1 - \lambda_j z},$$

where $\lambda_j$ are some or all of the distinct eigenvalues of $A$ and, for each $j$, $v_j$ is an eigenvector corresponding to $\lambda_j$. (If zero is an eigenvalue of $A$, we have to add a constant vector to this summation.)

• If $A$ is defective, and assuming for simplicity that the eigenvalues of $A$ are nonzero, we have

$$f(z) = \sum_{j=1}^{h} \sum_{i=0}^{p_j} \frac{w_{ji}}{(1 - \lambda_j z)^{i+1}},$$

where $\lambda_j$ are some or all of the distinct eigenvalues of $A$ and, for each $j$, $w_{ji}$ are linear combinations of the eigenvectors and principal vectors corresponding to $\lambda_j$. (If zero is an eigenvalue of $A$, we have to add a vector-valued polynomial in $z$ to this summation.)

It is clear that the vector-valued function $f(z)$ is precisely of the forms given in Theorems 12.7 and 12.8. Hence, the rational approximation procedures developed in Chapter 12 can be used to (i) accelerate the convergence of the sequence $\{x_m(z)\}$ within the disk $|z| < |z_{\min}|$, where $z_{\min}$ is the pole of $f(z)$ closest to the origin, and (ii) cause convergence for $|z| \in (|z_{\min}|, R)$ for some $R > |z_{\min}|$ as well.

14.2.2 $s_{n,k}(z)$ versus Krylov subspace methods for eigenvalue problems

We already know that the vector-valued rational approximations $s_{n,k}(z)$ obtained as above are precisely those obtained from the related Krylov subspace methods when solving the linear system $(I - zA)x = b$. Surprisingly, there is a further connection between the approximations $s_{n,k}(z)$ we have just described and Krylov subspace methods for the eigenvalue problem for $A$. We now wish to explore this connection in detail.

Poles of $s_{n,k}(z)$ versus Ritz values

Let us consider the determinant representation of $s_{n,k}(z)$ given in (12.11) with (12.10), and set $n = \nu = 0$ for clarity. Factoring out $z^k$ from the first row of the denominator determinant, we obtain

$$D(z^k, z^{k-1}, \ldots, z^0) = z^k D(\lambda^0, \lambda^1, \ldots, \lambda^k), \qquad \lambda = z^{-1},$$

$$D(\lambda^0, \lambda^1, \ldots, \lambda^k) = \begin{vmatrix} \lambda^0 & \lambda^1 & \cdots & \lambda^k \\ u_{0,0} & u_{0,1} & \cdots & u_{0,k} \\ u_{1,0} & u_{1,1} & \cdots & u_{1,k} \\ \vdots & \vdots & & \vdots \\ u_{k-1,0} & u_{k-1,1} & \cdots & u_{k-1,k} \end{vmatrix}. \tag{14.5}$$

Here the $u_{i,j}$ are as in (12.7), $u_{ij} = t_{i+1,j+1}$, and the $t_{rs}$ are as in (10.12). Therefore, by Theorem 10.3 of Chapter 10, the poles of $s_{0,k}(z)$ are the reciprocals of the zeros of $D(\lambda^0, \lambda^1, \ldots, \lambda^k)$, which in turn are the Ritz values of the Krylov subspace methods whose right subspace is $\mathcal{Y} = \mathcal{K}_k(A; b)$ and whose left subspaces are (i) $\mathcal{Z} = \mathcal{K}_k(A; b)$ (for SMPE/method of Arnoldi), (ii) $\mathcal{Z} = \operatorname{span}\{q_1, \ldots, q_k\}$ (for SMMPE), and (iii) $\mathcal{Z} = \mathcal{K}_k(A^*; q)$ (for STEA/method of Lanczos).

Residues of $s_{n,k}(z)$ versus Ritz vectors

We now wish to show that the residues of $s_{n,k}(z)$ are the Ritz vectors from the Krylov subspace methods just mentioned. As above, we set $n = 0$ for clarity and set $\nu = 0$ as well. Let $\hat z$ be a pole of $s_{0,k}(z)$, and set $\hat\lambda = \hat z^{-1}$. Then, by (12.32), again with $n = \nu = 0$,

$$\operatorname{Res}\, s_{0,k}(z)\big|_{z=\hat z} = \frac{P_{0,k}(\hat z)}{Q'_{0,k}(\hat z)},$$

where

$$P_{0,k}(\hat z) = \hat z^{\,k} \sum_{i=0}^{k-1} \hat\eta_i u_i, \qquad \hat\eta_i = \sum_{j=i+1}^{k} c_j \hat\lambda^{\,j-i}.$$

We only need to show that $\sum_{i=0}^{k-1} \hat\eta_i u_i$ is the Ritz vector corresponding to $\hat\lambda$. First, the Ritz vector corresponding to $\hat\lambda$ is $\sum_{i=0}^{k-1} \hat\xi_i u_i$, $\hat\xi = [\hat\xi_0, \ldots, \hat\xi_{k-1}]^T$ being the solution to the generalized eigenvalue problem in (10.8), namely,

$$\sum_{j=0}^{k-1} (u_{i,j+1} - \hat\lambda u_{i,j}) \hat\xi_j = 0, \quad i = 0, 1, \ldots, k-1. \tag{14.6}$$

Thus, we will be done if we show that (14.6) is satisfied with the $\hat\xi_j$ replaced by the respective $\hat\eta_j$. For $i = 0, 1, \ldots, k-1$, we have

$$\sum_{j=0}^{k-1} (u_{i,j+1} - \hat\lambda u_{i,j}) \hat\eta_j = \sum_{j=0}^{k-1} (u_{i,j+1} - \hat\lambda u_{i,j}) \sum_{r=j+1}^{k} c_r \hat\lambda^{\,r-j},$$

which, upon changing the order of the summations and simplifying, gives

$$\begin{aligned} \sum_{j=0}^{k-1} (u_{i,j+1} - \hat\lambda u_{i,j}) \hat\eta_j &= \hat\lambda \Big[ \sum_{j=1}^{k} u_{i,j} c_j - u_{i,0} \sum_{r=1}^{k} c_r \hat\lambda^{\,r} \Big] \\ &= \hat\lambda \Big[ u_{i,0} c_0 + \sum_{j=1}^{k} u_{i,j} c_j \Big] \quad \text{by } \sum_{r=0}^{k} c_r \hat\lambda^{\,r} = 0 \\ &= \hat\lambda \sum_{j=0}^{k} u_{i,j} c_j \\ &= 0 \quad \text{by (12.28)}. \end{aligned}$$
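The two claims above (reciprocal poles are Ritz values, residue directions are Ritz vectors) can be checked numerically for SMPE with $n = \nu = 0$. The sketch below makes several assumptions that are not in the text: $A$ and $b$ are real, $u_{i,j}$ is taken as the Euclidean inner product $\langle u_i, u_j \rangle$ with $u_i = A^i b$, the denominator coefficients $c_j$ are normalized by $c_k = 1$, and the function names are hypothetical. It forms the polynomial $\sum_j c_j \lambda^j$ from the $k$ orthogonality conditions $\sum_j u_{i,j} c_j = 0$, builds $\sum_i \hat\eta_i u_i$ for each root, and compares with an orthogonal projection onto the Krylov subspace.

```python
import numpy as np

def smpe_ritz_pairs(A, b, k):
    """Reciprocal poles and (unnormalized) residue directions of s_{0,k}(z)."""
    U = [np.asarray(b, dtype=float)]
    for _ in range(k):
        U.append(A @ U[-1])
    U = np.column_stack(U)                  # columns u_0, ..., u_k
    G = U[:, :k].T @ U                      # G[i, j] = <u_i, u_j>, i < k, j <= k
    # Denominator coefficients c_0..c_k: G c = 0 with the normalization c_k = 1.
    c = np.append(np.linalg.solve(G[:, :k], -G[:, k]), 1.0)
    lams = np.roots(c[::-1])                # zeros of sum_j c_j lambda^j
    vecs = []
    for lam in lams:
        eta = np.array([sum(c[j] * lam ** (j - i) for j in range(i + 1, k + 1))
                        for i in range(k)])
        vecs.append(U[:, :k] @ eta)         # sum_i eta_i u_i
    return lams, vecs, U[:, :k]

def arnoldi_ritz_values(A, b, k):
    """Ritz values of A from an orthonormal basis of K_k(A; b)."""
    K = [np.asarray(b, dtype=float)]
    for _ in range(k - 1):
        K.append(A @ K[-1])
    Q, _ = np.linalg.qr(np.column_stack(K))
    return np.linalg.eigvals(Q.T @ A @ Q)

A = np.diag(np.arange(1.0, 7.0)) + 0.1 * np.ones((6, 6))
b = np.ones(6)
lams, vecs, Uk = smpe_ritz_pairs(A, b, 3)
ritz = arnoldi_ritz_values(A, b, 3)
```

For this small symmetric example, the sorted `lams` and `ritz` agree to rounding error, and each residual $A v - \hat\lambda v$ is numerically orthogonal to the columns of `Uk`, which is the Galerkin condition that defines a Ritz pair. Because the power iterates $u_i$ quickly become nearly parallel, this direct Gram-matrix formulation is only meant for small $k$.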

Summary of $s_{n,k}(z)$ versus Krylov subspace methods for eigenvalue problems

We now combine the results obtained above to state the following equivalence theorem.

Theorem 14.1. Let $s_{n,k}(z)$ (with $\nu = 0$) be the rational approximations to $f(z) = (I - zA)^{-1} b$ obtained by applying SMPE, SMMPE, or STEA to the vector-valued power series $\sum_{i=0}^{\infty} u_i z^i$ of $f(z)$, where $u_i = A^i b$, $i = 0, 1, \ldots$, are power iterations. Denote the reciprocals of the poles of $s_{n,k}(z)$ by $\hat\lambda_i$, and the corresponding residues by $\hat v_i$, $i = 1, \ldots, k$. Then $(\hat\lambda_i, \hat v_i)$ are the Ritz pairs obtained by applying Krylov subspace methods to the matrix $A$, with right subspace $\mathcal{Y} = \mathcal{K}_k(A; u_n)$ and left subspaces $\mathcal{Z}$ as follows:

1. SMPE: $\mathcal{Z} = \mathcal{K}_k(A; u_n) = \mathcal{Y}$ (method of Arnoldi).

2. SMMPE: $\mathcal{Z} = \operatorname{span}\{q_1, \ldots, q_k\}$.

3. STEA: $\mathcal{Z} = \mathcal{K}_k(A^*; q)$ (method of Lanczos).

14.3 A related application to Fredholm integral equations

The application of SMPE, SMMPE, or STEA to the solution of $x = zAx + b$ can be extended to the case in which $A$ is a bounded linear operator, with only a discrete spectrum, on an infinite-dimensional normed linear space. The convergence theory we presented above can be extended to this case provided $A$ has a complete set of eigenpairs. (For example, if the space is a Hilbert space and $A$ is a compact self-adjoint operator on it, then $A$ has an infinite number of eigenvalues $\lambda_i$, which are all real and give $\lim_{i \to \infty} \lambda_i = 0$; the corresponding eigenvectors are orthogonal in the inner product of the space. See Lovitt [176] and Atkinson [10], for example.) The convergence theory we presented for the case in which $n \to \infty$ can be extended without any difficulty to such problems, as we mentioned in Section 6.9. See also [269].

One such example is the Fredholm integral equation of the second kind,

$$f(t) = z \int_a^b K(t, \tau) f(\tau)\, d\tau + g(t), \quad a \le t \le b,$$

$f(t) \equiv f(t, z)$ being the unknown function and $K(t, \tau)$ being the kernel. We can apply SMPE, for example, with the inner product $(\phi, \psi) = \int_a^b w(t) \overline{\phi(t)} \psi(t)\, dt$, where $w(t) \ge 0$ on $(a, b)$, to the sequence $\{f_m(t)\}$ with

$$f_0(t, z) = 0, \qquad f_{m+1}(t, z) = z \int_a^b K(t, \tau) f_m(\tau, z)\, d\tau + g(t), \quad m = 0, 1, \ldots.$$

Thus,

$$f_m(t, z) = \sum_{i=0}^{m-1} u_i(t) z^i,$$

with

$$u_0(t) = g(t), \qquad u_i(t) = \int_a^b K(t, \tau) u_{i-1}(\tau)\, d\tau, \quad i = 1, 2, \ldots.$$
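The recursion for the $u_i(t)$ and the partial sums $f_m(t, z)$ can be sketched on a quadrature grid. Everything specific below is an assumption made for illustration: the integral is discretized by Gauss–Legendre quadrature (a Nyström-type approach that the text does not prescribe), the function name `fredholm_partial_sums` is hypothetical, and the test kernel $K(t, \tau) = t\tau$ with $g(t) = t$ on $[0, 1]$ is chosen because its exact solution $f(t, z) = t/(1 - z/3)$ is easy to verify by hand.

```python
import numpy as np

def fredholm_partial_sums(kernel, g, a, b, z, m, n_quad=16):
    """Return quadrature nodes t and f_m(t, z) = sum_{i<m} u_i(t) z^i at those nodes."""
    x, w = np.polynomial.legendre.leggauss(n_quad)   # nodes/weights on [-1, 1]
    t = 0.5 * (b - a) * x + 0.5 * (b + a)            # map nodes to [a, b]
    w = 0.5 * (b - a) * w
    K = kernel(t[:, None], t[None, :])               # K[i, j] = K(t_i, t_j)
    u = g(t)                                         # u_0 = g
    f = np.zeros_like(u)
    power = 1.0
    for _ in range(m):
        f = f + power * u                            # add u_i z^i
        u = K @ (w * u)                              # u_{i+1}(t) = int K(t, tau) u_i(tau) dtau
        power *= z
    return t, f

t, f = fredholm_partial_sums(lambda t, tau: t * tau, lambda t: t,
                             a=0.0, b=1.0, z=1.0, m=40)
```

For this kernel $u_i(t) = t/3^i$, so the partial sums converge for $|z| < 3$, and at $z = 1$ the computed `f` matches $1.5\,t$ at the nodes. The same vectors `u` sampled on the grid are what an extrapolation method such as SMPE would take as input.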

The analogue of the approximation $s_{n,k}(z)$ for SMPE is now the scalar function $s_{n,k}(t, z)$, which is obtained as follows: Compute $u_{i,j} = (u_{n+i}, u_{n+j})$, where $(\phi, \psi) = \int_a^b w(t) \overline{\phi(t)} \psi(t)\, dt$ and $w(t) \ge 0$ on $(a, b)$, and define

$$s_{n,k}(t, z) = \frac{D(z^k f_n(t, z), z^{k-1} f_{n+1}(t, z), \ldots, f_{n+k}(t, z))}{D(z^k, z^{k-1}, \ldots, z^0)},$$

with $D(w_0, w_1, \ldots, w_k)$ defined precisely as in (12.10). Clearly, the $t$-dependence comes through the numerator, the denominator being independent of $t$.

Assume that $K(t, \tau)$ is Hermitian and in $L_2([a, b] \times [a, b])$, that is, it satisfies

$$K(t, \tau) = \overline{K(\tau, t)} \quad \text{and} \quad \int_a^b \int_a^b |K(t, \tau)|^2\, dt\, d\tau < \infty.$$

Then $K(t, \tau)$ has an infinite sequence of eigenpairs $(\mu_i, \phi_i(t))$ such that $\mu_i$ are real and $\lim_{i \to \infty} \mu_i = 0$, with $\phi_i(t)$ mutually orthonormal in the sense $(\phi_i, \phi_j) = \int_a^b \overline{\phi_i(t)} \phi_j(t)\, dt = \delta_{ij}$. Assuming also that $g(t)$ is in $L_2[a, b]$, the approximations $f_m(t, z)$ have absolutely and uniformly convergent series expansions

$$f_m(t, z) = f(t, z) + \sum_{i=1}^{\infty} c_i \phi_i(t) (\mu_i z)^m, \quad m = 1, 2, \ldots,$$

that also serve as the asymptotic expansion of $f_m(t, z)$ as $m \to \infty$. Here $c_i$ are some fixed scalars. Clearly, the convergence theory we have presented above applies to this example with no changes.

Finally, information about the singularities of $f(t, z)$ as a function of $z$ can be obtained very easily from the zeros of the denominator of $s_{n,k}(t, z)$, which is now a polynomial in $z$ with no dependence on $t$.

Note that $s_{n,k}(t, z)$ is of the same form as the exact solution produced by the theory of Fredholm. Recall that the theory of Fredholm also gives the exact solution of the integral equation in the form

$$f(t, z) = \frac{\sum_{i=0}^{\infty} w_i(t) z^i}{\sum_{i=0}^{\infty} a_i z^i},$$

and our $s_{n,k}(t, z)$ has the same form, except that in the numerator and denominator of $s_{n,k}(t, z)$ we have polynomials in $z$ instead of infinite power series in $z$. In addition, the zeros of the denominator of the exact solution produce the eigenvalues of $K(t, \tau)$, while the zeros in the denominator of $s_{n,k}(t, z)$ produce approximations to the largest eigenvalues.

Of course, the approach we have described here can be applied in conjunction with VEA as well. In [109], Graves-Morris applies this approach using the generalized inverse vector-valued Padé approximants to approximate the solution to a Fredholm integral equation.

14.4 Application to reanalysis of structures

In reanalysis of structures, we deal with the set of linear equations

$$K_0 r_0 = w,$$

where $K_0$ is the stiffness matrix, which is real symmetric; $r_0$ is the displacement vector; and $w$ is the load vector that is assumed to be independent of the design variables. In addition, if the matrix $K_0$ is also positive definite, it can be written in the decomposed form (Cholesky factorization)

$$K_0 = U_0^T U_0, \quad U_0 \ \text{upper triangular}.$$

Let us make a change in the design variables so that the stiffness matrix $K_0$ is changed to $K = K_0 + \Delta K$; the load vector $w$ is kept fixed, however. Thus, we would like to determine the modified displacement vector $r$ that satisfies

$$(K_0 + \Delta K) r = w.$$

The aim of reanalysis is to compute accurate approximations to the modified displacement $r$ without having to solve the modified equations. See Kirsch [159, 160] and Kirsch and Papalambros [161], for example.

Using the rational approximation procedure developed here, this problem has been tackled by Wu, Li, and Li [347]: It is easy to see that $r$ is the solution to

$$r = -(K_0^{-1} \Delta K) r + K_0^{-1} w.$$

Let us introduce a parameter $z$ and consider the solution of the system

$$x = zAx + b, \qquad A = -(K_0^{-1} \Delta K), \quad b = K_0^{-1} w.$$

The parameter $z$ serves as the magnitude of the change in the design variables when these variables are changed simultaneously by multiplying them by $z$. Of course, if $f(z)$ is the solution of this system, then $f(1) = r$. Thus, we can approach the solution to this problem precisely in the way we have described above, starting with $u_0 = r_0$, and obtain the solution for all values of $z \in (0, 1]$ with very little extra effort. Of course, the vectors $u_i = A^i b$ are computed by solving the linear systems $K_0 u_{i+1} = (-\Delta K) u_i$, $i = 0, 1, \ldots$. In the process, we invoke the known Cholesky factorization of $K_0$ and solve two triangular systems (involving $U_0$ and $U_0^T$) to obtain $u_{i+1}$ very inexpensively.
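The generation of the vectors $u_i$ from a single Cholesky factorization can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `reanalysis_partial_sum` and the $2 \times 2$ data are hypothetical, NumPy's `cholesky` returns the lower factor $L$ with $K_0 = L L^T$ (so $U_0 = L^T$ in the notation above), and the two triangular systems are solved with the general-purpose `np.linalg.solve` for brevity, whereas a dedicated triangular solver would exploit the structure. Only the plain partial sum of the series is formed; no extrapolation is applied.

```python
import numpy as np

def reanalysis_partial_sum(K0, dK, w, z, m):
    """Return sum_{i<m} u_i z^i with u_0 = K0^{-1} w and K0 u_{i+1} = -dK u_i."""
    L = np.linalg.cholesky(K0)          # K0 = L L^T, factored once and reused

    def solve_K0(rhs):
        y = np.linalg.solve(L, rhs)     # system with the lower factor L = U0^T
        return np.linalg.solve(L.T, y)  # system with the upper factor U0

    u = solve_K0(w)                     # u_0 = r_0
    x = np.zeros_like(u)
    power = 1.0
    for _ in range(m):
        x = x + power * u
        u = solve_K0(-dK @ u)           # K0 u_{i+1} = (-dK) u_i
        power *= z
    return x

K0 = np.array([[4.0, 1.0], [1.0, 3.0]])
dK = np.array([[0.1, 0.0], [0.0, 0.2]])
w = np.array([1.0, 2.0])
r_series = reanalysis_partial_sum(K0, dK, w, z=1.0, m=60)
r_direct = np.linalg.solve(K0 + dK, w)
```

In this example the change $\Delta K$ is small relative to $K_0$, so the series itself converges at $z = 1$ and `r_series` agrees with `r_direct`. For larger changes the series may converge slowly or diverge, which is the situation in which the rational approximations discussed in the text are intended to help.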

14.5 Application to nonlinear differential equations

One very popular method of solving IVPs involving nonlinear differential equations analytically is the perturbation method. Generally speaking, the method is used when the problems involve a small parameter. The idea is to transform such a nonlinear problem to a sequence of linear problems in suitable ways and, subsequently, to express the solution as an infinite sum $\sum_{i=0}^{\infty} u_i \varepsilon^{\alpha_i}$, where $\varepsilon$ is the small parameter in the problem and $u_i$ are solutions to the linear problems. However, many strongly nonlinear problems contain parameters that are not small. Consequently, the approximations obtained by truncating the series $\sum_{i=0}^{\infty} u_i \varepsilon^{\alpha_i}$, which are computed via the perturbation techniques, may be quite poor for values of $\varepsilon$ that are not sufficiently small.

The effectiveness of these methods as analytical tools can be enhanced by including vector-valued rational approximations as part of the solution process, which was done in two papers by Wu and Zhong [348, 349] in the solution of some autonomous nonlinear differential equations with oscillatory solutions, such as the Duffing equation and the equation of the simple pendulum. Here we summarize this approach as described in [348].

Consider the nonlinear IVP

$$\frac{d^2 w}{dt^2} + p_0^2 w + \varepsilon f(w) = 0, \qquad w(0) = \beta, \quad \frac{dw}{dt}(0) = 0, \tag{14.7}$$

where $p_0$ is a positive constant and $f(w)$ is a nonlinear function that is infinitely differentiable at $w = 0$. This problem has a periodic solution that depends on $\varepsilon$, as does its corresponding period $T$. Upon introducing a new variable $\tau$, $\tau = pt$, and setting $u(\tau) = w(t)$, (14.7) becomes

$$p^2 u'' + p_0^2 u + \varepsilon f(u) = 0, \qquad u(0) = \beta, \quad u'(0) = 0, \tag{14.8}$$

where $'$ denotes differentiation with respect to $\tau$. The aim is to determine the frequency $p(\varepsilon)$ and the periodic solution $u(\tau, \varepsilon)$, which is a function of $\tau$ of period $2\pi$. The period of oscillation (in terms of the variable $t$) is then $T(\varepsilon) = 2\pi/p(\varepsilon)$.

Applying the Lindstedt–Poincaré method (see Mickens [190], for example), we obtain the solution $u(\tau, \varepsilon)$ and $p(\varepsilon)$ in the following forms:

$$u(\tau, \varepsilon) = \sum_{i=0}^{\infty} u_i(\tau) \varepsilon^i \quad \text{and} \quad p(\varepsilon) = \sum_{i=0}^{\infty} p_i \varepsilon^i. \tag{14.9}$$

Here the functions $u_0(\tau), u_1(\tau), \ldots$ are $2\pi$-periodic in $\tau$ and $p_1, p_2, \ldots$ are unknown constants; $p_0$ is as in (14.7). They are obtained as follows: First, by substituting (14.9) into (14.8) and equating the coefficients of equal powers of $\varepsilon$, we obtain the problems

$$p_0^2 (u_0'' + u_0) = 0, \qquad u_0(0) = \beta, \quad u_0'(0) = 0,$$

$$p_0^2 (u_1'' + u_1) = -2 p_0 p_1 u_0'' - f(u_0), \qquad u_1(0) = 0, \quad u_1'(0) = 0,$$

$$p_0^2 (u_2'' + u_2) = -2 p_0 p_1 u_1'' - (p_1^2 + 2 p_0 p_2) u_0'' - f'(u_0) u_1, \qquad u_2(0) = 0, \quad u_2'(0) = 0,$$

and so on. In general, the functions $u_i(\tau)$ satisfy IVPs of the form

$$u_i'' + u_i = g_i(u_0, u_1, \ldots, u_{i-1}), \qquad u_i(0) = 0, \quad u_i'(0) = 0, \quad i = 1, 2, \ldots.$$

Obviously, $u_0(\tau) = \beta \cos \tau$. The rest of the $u_i(\tau)$ and the constants $p_i$ are determined by requiring that the $u_i(\tau)$ be $2\pi$-periodic (hence have no secular terms). Once the $u_i(\tau)$ have been obtained, we form the sequence of partial sums

$$x_m(\tau, \varepsilon) = \sum_{i=0}^{m-1} u_i(\tau) \varepsilon^i, \quad m = 0, 1, \ldots,$$

and apply to it SMPE, for example.

The analogue of the approximation $s_{n,k}(z)$ for SMPE is now the scalar function $s_{n,k}(\tau, \varepsilon)$, which is obtained as follows: Compute $u_{i,j} = (u_{n+i}, u_{n+j})$, where $(\phi, \psi) = \int_0^{2\pi} \phi(\tau) \psi(\tau)\, d\tau$, and let

$$s_{n,k}(\tau, \varepsilon) = \frac{D(\varepsilon^k x_n(\tau, \varepsilon), \varepsilon^{k-1} x_{n+1}(\tau, \varepsilon), \ldots, x_{n+k}(\tau, \varepsilon))}{D(\varepsilon^k, \varepsilon^{k-1}, \ldots, \varepsilon^0)},$$

with $D(w_0, w_1, \ldots, w_k)$ defined precisely as in (12.10). Clearly, the $\tau$-dependence comes through the numerator, the denominator being independent of $\tau$.
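The no-secular-terms requirement used above to fix the constants $p_i$ can be made concrete for the first correction $p_1$. With $u_0 = \beta \cos \tau$ and $u_0'' = -u_0$, the right-hand side of the $u_1$ problem is $2 p_0 p_1 u_0 - f(u_0)$, and $u_1$ is $2\pi$-periodic only if this right-hand side has no $\cos \tau$ component. The sketch below computes that Fourier coefficient numerically; the function names are hypothetical, and the Duffing nonlinearity $f(w) = w^3$ is used only as an example (it is one of the equations mentioned in the text), for which the condition gives $p_1 = 3\beta^2/(8 p_0)$.

```python
import numpy as np

def cos_coefficient(values, tau):
    """First Fourier cosine coefficient of samples on a uniform periodic grid."""
    return 2.0 * np.mean(values * np.cos(tau))

def secular_coefficient(f, p0, p1, beta, n=4096):
    """cos(tau) component of the right-hand side of the u_1 problem."""
    tau = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    u0 = beta * np.cos(tau)
    rhs = 2.0 * p0 * p1 * u0 - f(u0)    # -2 p0 p1 u0'' - f(u0), using u0'' = -u0
    return cos_coefficient(rhs, tau)

def first_frequency_correction(f, p0, beta, n=4096):
    """The p_1 that removes the secular term from the u_1 problem."""
    tau = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    a1 = cos_coefficient(f(beta * np.cos(tau)), tau)
    return a1 / (2.0 * p0 * beta)

duffing = lambda w: w ** 3
p1 = first_frequency_correction(duffing, p0=2.0, beta=1.5)
```

For the Duffing case, `p1` agrees with $3\beta^2/(8 p_0)$, and `secular_coefficient(duffing, 2.0, p1, 1.5)` vanishes to rounding error. The sine component is zero automatically, because $f(\beta \cos \tau)$ is even in $\tau$.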


Following the computation of $s_{n,k}(\tau, \varepsilon)$, we approximate $p(\varepsilon)$ as follows: We first solve (14.8) for $p = p(\varepsilon)$, obtaining $p = \sqrt{-[p_0^2 u + \varepsilon f(u)]/u''}$. Next, we approximate $p$ by replacing $u$ by the approximation $s_{n,k}(\tau, \varepsilon)$, thus obtaining

$$p(\varepsilon) \approx \sqrt{-\frac{p_0^2 s_{n,k}(\tau, \varepsilon) + \varepsilon f(s_{n,k}(\tau, \varepsilon))}{s''_{n,k}(\tau, \varepsilon)}} \equiv p_{n,k}(\tau, \varepsilon), \tag{14.10}$$

which is valid for all $\tau$. Next, we evaluate this at some $\tau = \tau_0 \in (0, 2\pi)$ (for example, at $\tau_0 = \pi$) to obtain $p_{n,k}(\tau_0, \varepsilon)$ as our approximation to $p(\varepsilon)$. Finally, we compute the following approximation to $T(\varepsilon)$:

$$T(\varepsilon) \approx \frac{2\pi}{p_{n,k}(\tau_0, \varepsilon)}.$$

These approximations turn out to be better than the partial sums obtained from the perturbation series in (14.9). In addition, they can also provide quantitative information about singularities in the period. For details and specific examples, see [348]. For additional examples, see [349].

14.6 Application to computation of f (A)b A subject that has received a lot of attention in the literature of numerical linear algebra is that of computing matrices of the form f (A), called matrix functions, and vectors of the form f (A)b, where A is a square matrix of dimension N and f (t ) is some given function that is analytic in a domain D of the t -plane containing the spectrum of A and b ∈ N . Problems involving matrix functions arise in many applications. For example, the function exp(A) has received much attention. In the remainder of this section, we will discuss two approaches to their computation. Matrix functions f (A) are discussed at length in Gantmacher [95, Volume 2], Horn and Johnson [139], Lancaster and Tismenetsky [163], and Higham [135]. In the next subsection, we present some of the fundamentals of matrix functions. We follow mostly [95, Volume 2] and [135]. In what follows, our knowledge of the Jordan canonical form and of minimal polynomials of matrices will be very helpful.

14.6.1 Definition and general properties of matrix functions

For an arbitrary function $f(t)$, we have the following definition of the matrix function $f(A)$.

Definition 14.2. Let $f(t)$ be a scalar function that is analytic in a domain $D$ containing the spectrum of $A$. Then the matrix $f(A)$ is defined via

\[
f(A) = \frac{1}{2\pi \mathrm{i}} \oint_{\Gamma} f(t)\,(tI - A)^{-1}\, dt,
\]

where Γ is a positively oriented closed Jordan curve in D whose interior contains the spectrum of A. As a consequence of this definition, we have the following theorem, which gives f (A) in terms of the Jordan canonical form of A.


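Theorem 14.3. Let $A = V J V^{-1}$ be the Jordan factorization of $A$, with

\[
J = \begin{bmatrix} J_{r_1}(\lambda_1) & & & \\ & J_{r_2}(\lambda_2) & & \\ & & \ddots & \\ & & & J_{r_q}(\lambda_q) \end{bmatrix},
\]

where the Jordan blocks $J_r(\lambda)$ are defined via

\[
J_1(\lambda) = [\lambda] \in \mathbb{C}^{1\times 1}, \qquad
J_r(\lambda) = \begin{bmatrix} \lambda & 1 & & \\ & \lambda & \ddots & \\ & & \ddots & 1 \\ & & & \lambda \end{bmatrix} = \lambda I_r + E_r \in \mathbb{C}^{r\times r}, \quad r > 1. \tag{14.11}
\]

Here $\lambda_i$ are the (not necessarily distinct) eigenvalues of $A$ and $\sum_{i=1}^{q} r_i = N$. Then

\[
f(A) = V f(J) V^{-1}, \tag{14.12}
\]

where

\[
f(J) = \operatorname{diag}\bigl( f(J_{r_1}(\lambda_1)), \ldots, f(J_{r_q}(\lambda_q)) \bigr), \tag{14.13}
\]

\[
f(J_r(\lambda)) = \sum_{j=0}^{r-1} \frac{f^{(j)}(\lambda)}{j!} E_r^{\,j}
= \begin{bmatrix}
f(\lambda) & f'(\lambda) & \frac{1}{2!} f''(\lambda) & \cdots & \frac{1}{(r-1)!} f^{(r-1)}(\lambda) \\
 & f(\lambda) & f'(\lambda) & \cdots & \frac{1}{(r-2)!} f^{(r-2)}(\lambda) \\
 & & f(\lambda) & \cdots & \frac{1}{(r-3)!} f^{(r-3)}(\lambda) \\
 & & & \ddots & \vdots \\
 & & & & f(\lambda)
\end{bmatrix}. \tag{14.14}
\]

Proof. That (14.13) is true follows from $(tI - A)^{-1} = V (tI - J)^{-1} V^{-1}$ and

\[
(tI - J)^{-1} = \operatorname{diag}\bigl( [tI_{r_1} - J_{r_1}(\lambda_1)]^{-1}, \ldots, [tI_{r_q} - J_{r_q}(\lambda_q)]^{-1} \bigr).
\]

The proof of (14.14) can be achieved by first noting that

\[
[tI_r - J_r(\lambda)]^{-1} = \sum_{j=0}^{r-1} (t - \lambda)^{-j-1} E_r^{\,j},
\]

which follows from the fact that $E_r^{\,r} = O$, and next applying the Cauchy residue theorem to

\[
f(J_r(\lambda)) = \frac{1}{2\pi \mathrm{i}} \oint_{\Gamma} f(t)\,[tI_r - J_r(\lambda)]^{-1}\, dt.
\]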
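As a small numerical illustration of (14.14), the following sketch evaluates $\exp(J_r(\lambda))$ from the finite sum over powers of the nilpotent matrix $E_r$ and compares it with a truncated Taylor series. The function names (`jordan_block`, `exp_of_jordan_block`, `exp_taylor`) and the test values are assumptions made for this example only; they do not come from the text.

```python
import math
import numpy as np

def jordan_block(lam, r):
    """Return the r x r Jordan block J_r(lam) = lam*I_r + E_r."""
    return lam * np.eye(r) + np.diag(np.ones(r - 1), k=1)

def exp_of_jordan_block(lam, r):
    """Evaluate exp(J_r(lam)) via (14.14); all derivatives of exp equal exp."""
    E = np.diag(np.ones(r - 1), k=1)  # nilpotent part E_r, with E_r^r = O
    F = np.zeros((r, r))
    for j in range(r):
        F += math.exp(lam) / math.factorial(j) * np.linalg.matrix_power(E, j)
    return F

def exp_taylor(A, terms=40):
    """Reference value: truncated Taylor series sum_{i<terms} A^i / i!."""
    S = np.zeros_like(A)
    term = np.eye(A.shape[0])
    for i in range(terms):
        S = S + term
        term = term @ A / (i + 1)
    return S

lam, r = 0.5, 3
F = exp_of_jordan_block(lam, r)
err = np.max(np.abs(F - exp_taylor(jordan_block(lam, r))))
print("max deviation from Taylor reference:", err)
```

The finite sum needs only $r$ terms because $E_r^{\,r} = O$, which is the same fact used in the proof above.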

The following conclusions can easily be derived from Definition 14.2 and Theorem 14.3:


1. By (14.13) and (14.14), $f(\lambda_i)$ is an eigenvalue of $f(A)$ of multiplicity $r_i$, $i = 1, \ldots, q$.

2. If $A$ is diagonal, then $f(A)$ is also diagonal.

3. If $f(t) = t^i$, then $f(A) = A^i$, $i = 0, 1, \ldots$ (see footnote 65). Consequently,

\[
f(t) = \sum_{i=0}^{p} c_i t^i \quad\Rightarrow\quad f(A) = \sum_{i=0}^{p} c_i A^i, \qquad A^0 = I, \tag{14.15}
\]

which we have already used in different places for polynomial $f(t)$ without worrying about matrix functions.

4. $f(A)$ commutes with $A$.

5. $f(A^T) = [f(A)]^T$.

6. $f(XAX^{-1}) = X f(A) X^{-1}$.

7. If $A$ is block diagonal, then so is $f(A)$; namely,

\[
A = \operatorname{diag}(A_{11}, \ldots, A_{mm}) \quad\Rightarrow\quad f(A) = \operatorname{diag}\bigl(f(A_{11}), \ldots, f(A_{mm})\bigr).
\]

From Theorem 14.3 [especially from (14.14)], we observe that what determines $f(A)$ is the finite set $S_f \equiv \{ f^{(j)}(\lambda_i),\ 0 \le j \le r_i - 1,\ 1 \le i \le q \}$. From this, we conclude that if $S_f = S_g$ for two different functions $f(t)$ and $g(t)$, then $f(A) = g(A)$ as well. Based on this and on (14.15), we can show that $f(A)$ is a polynomial in $A$ even when $f(t)$ is not a polynomial in $t$. This is the subject of the next theorem.

Theorem 14.4. Let $Q(t) = \prod_{j=1}^{s} (t - \nu_j)^{n_j}$ be the minimal polynomial of $A$. Let $w(t)$ be the polynomial that interpolates $f(t)$ in the Hermite sense as follows:

\[
w^{(i)}(\nu_j) = f^{(i)}(\nu_j), \qquad i = 0, 1, \ldots, n_j - 1, \quad j = 1, \ldots, s.
\]

Then $S_w = S_f$, and, therefore,

\[
f(A) = w(A).
\]

Thus, $f(A) = \sum_{i=0}^{q} d_i A^i$ for some $q \le n - 1$, where $n = \sum_{j=1}^{s} n_j$.
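To make Theorem 14.4 concrete, here is a minimal sketch for a matrix whose eigenvalues are distinct, so that every $n_j = 1$ and Hermite interpolation reduces to ordinary (Lagrange) interpolation. The matrix, the choice $f = \exp$, and the Taylor-series reference are assumptions made for the example.

```python
import numpy as np

# Upper triangular matrix: its eigenvalues are the diagonal entries 1, 2, 3.
A = np.array([[1.0, 1.0, 0.0],
              [0.0, 2.0, 1.0],
              [0.0, 0.0, 3.0]])
nodes = np.array([1.0, 2.0, 3.0])  # the nu_j; distinct, so each n_j = 1

# Interpolation conditions w(nu_j) = exp(nu_j) for w(t) = d_0 + d_1 t + d_2 t^2.
V = np.vander(nodes, increasing=True)  # rows [1, nu_j, nu_j^2]
d = np.linalg.solve(V, np.exp(nodes))

# f(A) = w(A) = sum_i d_i A^i
w_of_A = sum(d[i] * np.linalg.matrix_power(A, i) for i in range(len(d)))

# Reference: truncated Taylor series of exp(A).
ref = np.zeros_like(A)
term = np.eye(3)
for i in range(60):
    ref = ref + term
    term = term @ A / (i + 1)

err = np.max(np.abs(w_of_A - ref))
print("max deviation of w(A) from exp(A):", err)
```

When some $n_j > 1$, derivative conditions would have to be added to the interpolation system; that case is not covered by this sketch.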

Recall that $Q(t)$ divides the characteristic polynomial of $A$. Therefore, $\nu_j$ are all of the distinct eigenvalues of $A$, which implies that $n \le N$. In view of Theorem 14.4, we have the following polynomial representation of the vector $f(A)b$.

Theorem 14.5. Let $P(t) = \prod_{j=1}^{r} (t - \mu_j)^{m_j}$ be the minimal polynomial of $A$ with respect to $b$. Let $u(t)$ be the polynomial that interpolates $f(t)$ in the Hermite sense as follows:

\[
u^{(i)}(\mu_j) = f^{(i)}(\mu_j), \qquad i = 0, 1, \ldots, m_j - 1, \quad j = 1, \ldots, r.
\]

Then

\[
f(A)b = u(A)b.
\]

Thus, $f(A)b = \sum_{i=0}^{p} c_i (A^i b)$ for some $p \le \tilde{m} - 1$, where $\tilde{m} = \sum_{j=1}^{r} m_j$.

65 This can be seen by comparing (14.14) with (0.29) in the proof of Theorem 0.1.


Recall that $\mu_j$ are some or all of the $\nu_j$, that is, some or all of the distinct eigenvalues of $A$. Recall also that $P(t)$ divides $Q(t)$. Therefore, $\tilde{m} \le n \le N$. In addition, we see that $f(A)b$ is in the Krylov subspace $\mathcal{K}_{p+1}(A; b)$, with $\mathcal{K}_k(A; b) = \operatorname{span}\{b, Ab, \ldots, A^{k-1}b\}$.

It may be convenient to express $f(A)$ as a power series in some situations. Concerning this, we have the following theorem, which can be proved by starting with Definition 14.2 and recalling that $(I - B)^{-1} = \sum_{i=0}^{k-1} B^i + (I - B)^{-1} B^k$.

Theorem 14.6. Assume that the function $f(t)$ is analytic at $t = a$, and let $R > 0$ be the radius of convergence of the Taylor series of $f(t)$ about $t = a$, that is,

\[
f(t) = \sum_{i=0}^{\infty} c_i (t - a)^i, \quad |t - a| < R, \qquad c_i = \frac{f^{(i)}(a)}{i!}, \quad i = 0, 1, \ldots.
\]

Then $f(A)$ can also be expressed as a convergent infinite series via

\[
f(A) = \sum_{i=0}^{\infty} c_i (A - aI)^i, \qquad \text{provided } \rho(A - aI) < R.
\]

Consequently, we also have

\[
f(A)b = \sum_{i=0}^{\infty} c_i \bigl[ (A - aI)^i b \bigr], \qquad \text{provided } \rho(A - aI) < R.
\]

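Here are some examples:

\[
f(t) = (1 + t)^{-1} \ \Rightarrow\ f(A) = (I + A)^{-1} = \sum_{i=0}^{\infty} (-1)^{i} A^i, \qquad \text{provided } \rho(A) < 1 \text{ (see footnote 66)}.
\]

\[
f(t) = \log(1 + t) \ \Rightarrow\ f(A) = \log(I + A) = \sum_{i=1}^{\infty} \frac{(-1)^{i-1}}{i} A^i, \qquad \text{provided } \rho(A) < 1.
\]

\[
f(t) = e^t \ \Rightarrow\ f(A) = \exp(A) = \sum_{i=0}^{\infty} \frac{1}{i!} A^i \qquad \forall\, A.
\]

\[
f(t) = \cos t \ \Rightarrow\ \cos(A) = \sum_{i=0}^{\infty} \frac{(-1)^i}{(2i)!} A^{2i} \qquad \forall\, A.
\]

\[
f(t) = \sin t \ \Rightarrow\ \sin(A) = \sum_{i=0}^{\infty} \frac{(-1)^i}{(2i+1)!} A^{2i+1} \qquad \forall\, A.
\]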
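The series of Theorem 14.6 can be turned into a vector sequence using matrix-vector products only. The sketch below builds the partial sums $x_m = \sum_{i=0}^{m-1} c_i A^i b$ (with $a = 0$) for the first example, $f(t) = (1+t)^{-1}$ with $c_i = (-1)^i$, and compares the last one with a direct solve. The helper name `partial_sums` and the test data are assumptions for illustration.

```python
import numpy as np

def partial_sums(matvec, b, coeff, m_max):
    """Return [x_1, ..., x_m_max] with x_m = sum_{i<m} coeff(i) * A^i b.

    Only products w -> A w are needed; A itself is never formed here.
    """
    sums = []
    x = np.zeros_like(b)
    u = b.copy()                 # u holds A^i b
    for i in range(m_max):
        x = x + coeff(i) * u
        sums.append(x)
        u = matvec(u)
    return sums

A = np.array([[0.2, 0.1],
              [0.0, -0.3]])      # spectral radius 0.3 < 1
b = np.array([1.0, 2.0])

xs = partial_sums(lambda w: A @ w, b, lambda i: (-1.0) ** i, 60)
exact = np.linalg.solve(np.eye(2) + A, b)
err = np.linalg.norm(xs[-1] - exact)
print("error of the 60-term partial sum:", err)
```

A sequence such as `xs` is the kind of input to which a vector extrapolation method would be applied.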

14.6.2 Application of vector extrapolation methods to f(A)b

When the vectors $u_i = (A - aI)^i b$ are known or computable, we can apply vector extrapolation methods to the sequence $\{x_m = \sum_{i=0}^{m-1} c_i u_i\}$ of partial sums of the infinite series $\sum_{i=0}^{\infty} c_i u_i$, whether this series converges or diverges. In this way, we can accelerate the convergence of $\sum_{i=0}^{\infty} c_i u_i$ when the latter converges, or we can make it converge when it diverges, hence obtaining good approximations to $f(A)b$.

In approaching the issue in this way, we propose to look at a slightly more general problem by introducing a complex variable $z$ and considering the computation of $f(zA)b$. Of course, $f(A)b = f(zA)b|_{z=1}$. With $f(t)$ as in Theorem 14.6, and with $a = 0$ for convenience, we have the vector-valued power series

\[
f(zA)b = \sum_{i=0}^{\infty} c_i (A^i b)\, z^i,
\]

which converges provided $|z| < R/\rho(A)$. Setting $u_i = c_i (A^i b)$ in Chapter 12, we can apply to $\sum_{i=0}^{\infty} u_i z^i$ one of the rational approximation methods SMPE, SMMPE, and STEA to accelerate its convergence when it converges [that is, when $|z| < R/\rho(A)$], or make it converge when it diverges. Note also that there is no need to know (or store) the matrix $A$; it is sufficient to have a procedure for computing matrix-vector products $Aw$. By applying SMPE precisely as described in Section 12.5, for example, we can approximate $f(zA)b$ for all $z$ (including $z = 1$) economically, by which we mean that the expensive vector computations are done only once, the remaining computation of the vectors $s_{n,k}(z)$ involving a very small additional expense for each $z$. This approach has been used by Zhu, Li, and Gu [361] to compute $\exp(zA)b$.

Note that the computation of $\exp(zA)$ has received a lot of attention in the literature, and many different ways of computing it have been proposed. See Moler and Van Loan [191, 192], Saad [238], Hochbruck and Lubich [137], Higham [134], Higham and Al-Mohy [136], and Al-Mohy and Higham [2], for example, and the references mentioned in these papers. We do not intend to go into this specific topic here, however.

66 Note also that $f(A) = (I + A)^{-1}$ as long as $I + A$ is not singular, whether $\rho(A) < 1$ or not. Similarly, with $f(t) = (1 - zt)^{-1}$, we have $f(A) = (I - zA)^{-1}$, which we have already considered in Section 14.2.1.

14.6.3 Application of Krylov subspaces to f(A)b

Another approach to the computation of $f(A)b$ employs Krylov subspaces, and this approach has been suggested and studied in detail in various works. See Saad [238], Gallopoulos and Saad [92], Tal-Ezer [315, 316, 317], and Eiermann and Ernst [76], for example. In this approach, we approximate $f(A)b$ by a vector $w_k$ in a Krylov subspace $\mathcal{K}_k(A; b)$, with $k < p$, recalling that $f(A)b$ resides in the Krylov subspace $\mathcal{K}_{p+1}(A; b)$. Since $w_k$ is of the form $w_k = \bigl(\sum_{i=0}^{k-1} c_i A^i\bigr) b$, this amounts to approximating the function $f(t)$ by the polynomial $p(t) = \sum_{i=0}^{k-1} c_i t^i$. We can determine $p(t)$ by requiring it to interpolate $f(t)$ at some points $t_1, \ldots, t_k$. From the theory of polynomial interpolation, we thus have

\[
f(t) - p(t) = r(t) \prod_{i=1}^{k} (t - t_i),
\]

where $r(t) = f^{(k)}(\tilde{t})/k!$ for some $\tilde{t}$ that depends on $t$ and the $t_i$. Consequently, we also have

\[
f(A)b - p(A)b = r(A) \Bigl\{ \Bigl[ \prod_{i=1}^{k} (A - t_i I) \Bigr] b \Bigr\}.
\]

Taking norms on both sides of this equality, we obtain

\[
\| f(A)b - p(A)b \| \le \| r(A) \| \cdot \Bigl\| \Bigl[ \prod_{i=1}^{k} (A - t_i I) \Bigr] b \Bigr\|.
\]


We now choose the $t_i$ to be such that $\bigl[\prod_{i=1}^{k} (A - t_i I)\bigr] b$ is small in some sense. Let us note that $\bigl[\prod_{i=1}^{k} (A - t_i I)\bigr] b = \sum_{i=0}^{k} \varepsilon_i A^i b$ for some $\varepsilon_i$ with $\varepsilon_k = 1$.

• We can choose the $t_i$ to be the solution to the minimization problem

\[
\min_{t_1, \ldots, t_k} \Bigl\| \Bigl[ \prod_{i=1}^{k} (A - t_i I) \Bigr] b \Bigr\|
\quad\Leftrightarrow\quad
\min_{\varepsilon_0, \varepsilon_1, \ldots, \varepsilon_{k-1}} \Bigl\| \sum_{i=0}^{k-1} \varepsilon_i A^i b + A^k b \Bigr\|.
\]

As we have shown in Theorem 10.6 of Chapter 10, when the standard $l_2$-norm is employed, the $t_i$ turn out to be the Ritz values of $A$ that are obtained via the method of Arnoldi applied to the Krylov subspace $\mathcal{K}_k(A; b)$. Thus, the $t_i$ are the eigenvalues of the matrix $H_k = Q_k^* A Q_k$, where $Q_k = [\, q_1 \,|\, q_2 \,|\, \cdots \,|\, q_k \,]$, with $q_1 = b/\beta$, $\beta = \|b\|$, and $Q_k^* Q_k = I_k$. (See Section 10.3.) Then we also have the representation

\[
p(A)b = \beta\, Q_k f(H_k)\, e_1. \tag{14.16}
\]

• In view of the preceding item, we can also choose the $t_i$ to be the Ritz values of $A$ that are obtained via the method of Lanczos applied to the Krylov subspaces $\mathcal{K}_k(A; b)$ and $\mathcal{K}_k(A^*; c)$ for some nonzero vector $c$. Thus, the $t_i$ are the eigenvalues of the tridiagonal matrix $T_k = W_k^* A V_k$, where $V_k = [\, v_1 \,|\, v_2 \,|\, \cdots \,|\, v_k \,]$, with $v_1 = b/\beta$, $\beta = \|b\|$, and $W_k = [\, w_1 \,|\, w_2 \,|\, \cdots \,|\, w_k \,]$, with $w_1 = c/\gamma$, such that $w_1^* v_1 = 1$ and $W_k^* V_k = I_k$. (See Section 10.4.) Then we also have the representation

\[
p(A)b = \beta\, V_k f(T_k)\, e_1. \tag{14.17}
\]

The representations of $p(A)b$ given in (14.16) and (14.17) can be obtained as follows: First, by [238, Lemma 3.1],

\[
p(A)b = \beta\, Q_k\, p(H_k)\, e_1 \quad \text{via Arnoldi orthogonalization,}
\qquad
p(A)b = \beta\, V_k\, p(T_k)\, e_1 \quad \text{via Lanczos biorthogonalization.}
\]

Next, we recall that, by Lemma 10.5 in Chapter 10, the Ritz values $t_1, \ldots, t_k$ are simple and distinct since the matrices $H_k$ and $T_k$ are irreducible Hessenberg matrices for $k$ less than the degree of the minimal polynomial of $A$ with respect to $b$ (see footnote 67). Consequently,

\[
p(H_k) = f(H_k) \quad \text{via Arnoldi orthogonalization,}
\qquad
p(T_k) = f(T_k) \quad \text{via Lanczos biorthogonalization.}
\]

Since we are dealing with matrices of very large dimensions N and since k is chosen to be much smaller than N , the exact computation of f (H k ) or f (T k ) is much cheaper than that of f (A), which is what makes this approach practical. For these developments and more, see Saad [238], for example.
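The Arnoldi representation (14.16) can be sketched as follows for $f = \exp$ and a real matrix. This is only an illustration under stated assumptions: the Arnoldi routine uses modified Gram-Schmidt and assumes no breakdown, $f(H_k)$ is evaluated by a truncated Taylor series (any reliable small-matrix method could be substituted), and the test matrix, sizes, and function names are invented for the example.

```python
import numpy as np

def arnoldi(A, b, k):
    """k Arnoldi steps (real case, modified Gram-Schmidt).

    Returns Q_k (N x k), H_k = Q_k^T A Q_k (k x k), and beta = ||b||.
    Assumes no breakdown, i.e., every new vector has nonzero norm.
    """
    N = b.shape[0]
    Q = np.zeros((N, k + 1))
    H = np.zeros((k + 1, k))
    beta = np.linalg.norm(b)
    Q[:, 0] = b / beta
    for j in range(k):
        w = A @ Q[:, j]
        for i in range(j + 1):
            H[i, j] = Q[:, i] @ w
            w = w - H[i, j] * Q[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        Q[:, j + 1] = w / H[j + 1, j]
    return Q[:, :k], H[:k, :k], beta

def expm_taylor(M, terms=60):
    """Truncated Taylor series of exp(M); adequate for the small norms used here."""
    S = np.zeros_like(M)
    term = np.eye(M.shape[0])
    for i in range(terms):
        S = S + term
        term = term @ M / (i + 1)
    return S

rng = np.random.default_rng(0)
N, k = 50, 12
A = -np.diag(np.linspace(0.1, 2.0, N)) + 0.01 * rng.standard_normal((N, N))
b = rng.standard_normal(N)

Q, H, beta = arnoldi(A, b, k)
approx = beta * (Q @ expm_taylor(H)[:, 0])   # beta * Q_k f(H_k) e_1, as in (14.16)
exact = expm_taylor(A) @ b                    # reference, feasible only for small N
rel_err = np.linalg.norm(approx - exact) / np.linalg.norm(exact)
print("relative error with k =", k, ":", rel_err)
```

Only the $k \times k$ matrix $H_k$ is passed to the matrix-function evaluation inside the approximation; the full exponential of $A$ is formed here purely as a reference.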

67 Actually, Lemma 10.5 was proved for the method of Arnoldi by invoking the fact that the first subdiagonal of H k is all nonzero. As for the method of Lanczos, we note that T k is also a Hessenberg matrix, and we assume that the elements of the first subdiagonal (hence the first superdiagonal as well) are all nonzero; then Lemma 10.5 applies.

Chapter 15

Vector Generalizations of Scalar Extrapolation Methods

15.1 Introduction

Throughout the earlier part of this book, we have considered the acceleration of the convergence of vector sequences $\{x_m\}$ by polynomial methods and epsilon algorithms. Of these, the polynomial methods are specifically designed to treat vector sequences. In this chapter, we would like to see how some of the known extrapolation methods for scalar sequences can be “vectorized” to apply to vector sequences under suitable conditions. Recall that we have already seen how the epsilon algorithm that implements the Shanks transformation is vectorized in two different ways to obtain VEA and TEA. In recent years, additional scalar extrapolation methods have been vectorized in different forms to accelerate the convergence of vector sequences under certain conditions. Some convergence analysis for these methods has been developed, and some numerical experience has been gained. We devote most of this chapter to the vectorization of a generalized Richardson extrapolation process. We also briefly discuss additional methods from vectorization of some other scalar sequence transformations.

15.2 Review of a generalized Richardson extrapolation process

We recall that all the methods we have studied so far take as their input only the vector sequence $\{x_m\}$. We now briefly consider methods that rely not only on the $x_m$ but also on additional information about the $x_m$ when this information is available. To motivate the development of these methods, we start with a scalar sequence $\{x_m\}$, for which we have the asymptotic expansion

\[
x_m \sim s + \sum_{i=1}^{\infty} \alpha_i g_i(m) \quad \text{as } m \to \infty, \tag{15.1}
\]

where $s$ is the limit or antilimit of $\{x_m\}$ and the sequences $\{g_i(m)\}_{m=0}^{\infty}$, $i = 1, 2, \ldots$, are available/computable. The scalars $\alpha_i$ are not necessarily known. Of course, we are assuming that the sequence $\{g_i(m)\}_{i=1}^{\infty}$ forms an asymptotic scale as $m \to \infty$ in the sense that

\[
\lim_{m \to \infty} \frac{g_{i+1}(m)}{g_i(m)} = 0, \qquad i = 1, 2, \ldots. \tag{15.2}
\]

Our aim is to develop a convergence acceleration method for obtaining $s$. The most direct way of achieving this is as follows: First, we truncate the summation on the right-hand side of (15.1) at the $i = k$ term. Next, we replace the asymptotic equality sign $\sim$ by the equality sign $=$. Finally, we replace $s$ by $s_{n,k}$ and the $\alpha_i$ by $\bar{\alpha}_i$ and collocate at $m = n, n+1, \ldots, n+k$ to obtain the $(k+1) \times (k+1)$ linear system

\[
x_m = s_{n,k} + \sum_{i=1}^{k} \bar{\alpha}_i g_i(m), \qquad m = n, n+1, \ldots, n+k, \tag{15.3}
\]

$s_{n,k}$ and $\bar{\alpha}_1, \ldots, \bar{\alpha}_k$ being the unknowns. Here $s_{n,k}$ is the desired approximation to $s$. Using Cramer’s rule, we obtain the determinant representation

\[
s_{n,k} = \frac{G(x_n, x_{n+1}, \ldots, x_{n+k})}{G(1, 1, \ldots, 1)}, \tag{15.4}
\]

where

\[
G(v_0, v_1, \ldots, v_k) =
\begin{vmatrix}
v_0 & v_1 & \cdots & v_k \\
g_1(n) & g_1(n+1) & \cdots & g_1(n+k) \\
g_2(n) & g_2(n+1) & \cdots & g_2(n+k) \\
\vdots & \vdots & & \vdots \\
g_k(n) & g_k(n+1) & \cdots & g_k(n+k)
\end{vmatrix}. \tag{15.5}
\]

By (15.4), we can also express $s_{n,k}$ in the form

\[
s_{n,k} = \sum_{j=0}^{k} \gamma_j x_{n+j}, \tag{15.6}
\]

where the $\gamma_j$ satisfy the $(k+1) \times (k+1)$ linear system

\[
\sum_{j=0}^{k} g_i(n+j)\, \gamma_j = 0, \quad i = 1, \ldots, k, \qquad \sum_{j=0}^{k} \gamma_j = 1. \tag{15.7}
\]
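The collocation system (15.3) can be solved directly for small $k$. The following sketch does so for a synthetic scalar sequence; the function name `fgrep_scalar`, the choice $g_i(m) = (i+1)^{-m}$, and the coefficients are assumptions made for this example, and a dense solve is used instead of the recursive algorithms mentioned in the text.

```python
import numpy as np

def fgrep_scalar(x, g, n, k):
    """Solve (15.3) for the unknowns (s_{n,k}, abar_1, ..., abar_k); return s_{n,k}.

    x(m) returns the sequence element x_m; g(i, m) returns g_i(m).
    """
    M = np.empty((k + 1, k + 1))
    rhs = np.empty(k + 1)
    for row, m in enumerate(range(n, n + k + 1)):
        M[row, 0] = 1.0                                   # coefficient of s_{n,k}
        M[row, 1:] = [g(i, m) for i in range(1, k + 1)]   # coefficients of abar_i
        rhs[row] = x(m)
    return np.linalg.solve(M, rhs)[0]

# Synthetic data: x_m = s + 3*2^{-m} - 5*3^{-m} + 4^{-m}, with g_i(m) = (i+1)^{-m}.
s_true = 1.25

def g(i, m):
    return float(i + 1) ** (-m)

def x(m):
    return s_true + 3.0 * g(1, m) - 5.0 * g(2, m) + g(3, m)

errors = [abs(fgrep_scalar(x, g, 0, k) - s_true) for k in (1, 2, 3)]
print("errors of s_{0,k} for k = 1, 2, 3:", errors)
```

Because the synthetic sequence has exactly three terms in its expansion, $s_{0,3}$ reproduces $s$ up to rounding; for a genuine asymptotic expansion one would only expect the error to shrink as $n$ grows.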

The general framework and formal setting for scalar sequences satisfying (15.1) and (15.2) was first given in Hart et al. [127, p. 39]. It was considered in detail again in Schneider [253], who also gave the first recursive algorithm for computing the $s_{n,k}$. This algorithm was later rederived using different techniques by Håvie [129] and Brezinski [30], who gave it the name E-algorithm. A different and more economical recursive algorithm, called the FS-algorithm, was later given by Ford and Sidi [86]. For a detailed exposition and the most recent developments on this topic as it pertains to scalar sequences, see Sidi [278, Chapter 3], where the extrapolation method obtained is termed, for historical reasons, the first generalization of the Richardson extrapolation process. In the following, we will denote it by FGREP for short.

We now present a different way of obtaining $s_{n,k}$. We note that by subtracting the equation with $m = n + i - 1$ from that with $m = n + i$, $i = 1, \ldots, k$, the linear system in (15.3) can also be written as

\[
s_{n,k} = x_n - \sum_{i=1}^{k} \bar{\alpha}_i g_i(n), \tag{15.8a}
\]

\[
\sum_{i=1}^{k} \bar{\alpha}_i\, \Delta g_i(m) = \Delta x_m, \qquad m = n, n+1, \ldots, n+k-1, \tag{15.8b}
\]

where $\Delta x_m = x_{m+1} - x_m$ and $\Delta g_i(m) = g_i(m+1) - g_i(m)$. Thus, $\bar{\alpha}_1, \ldots, \bar{\alpha}_k$ in (15.3) also satisfy by themselves the $k \times k$ system of equations in (15.8b). Of course, with $\bar{\alpha}_1, \ldots, \bar{\alpha}_k$ determined, we can obtain $s_{n,k}$ from (15.8a). In view of this, we can also express $s_{n,k}$ as (see footnote 68)

\[
s_{n,k} =
\frac{\begin{vmatrix}
x_n & \Delta x_n & \cdots & \Delta x_{n+k-1} \\
g_1(n) & \Delta g_1(n) & \cdots & \Delta g_1(n+k-1) \\
\vdots & \vdots & & \vdots \\
g_k(n) & \Delta g_k(n) & \cdots & \Delta g_k(n+k-1)
\end{vmatrix}}
{\begin{vmatrix}
\Delta g_1(n) & \cdots & \Delta g_1(n+k-1) \\
\vdots & & \vdots \\
\Delta g_k(n) & \cdots & \Delta g_k(n+k-1)
\end{vmatrix}}. \tag{15.9}
\]

In the next section, we show how the approaches we have presented so far can be vectorized.

15.3 Vectorization of FGREP

Let us assume that the vector $\mathbf{x}_m$, analogous to (15.1), has an asymptotic expansion of the form

\[
\mathbf{x}_m \sim \mathbf{s} + \sum_{i=1}^{\infty} \alpha_i \mathbf{g}_i(m) \quad \text{as } m \to \infty, \tag{15.10}
\]

$\{\mathbf{g}_i(m)\}_{i=1}^{\infty}$ being an asymptotic scale as $m \to \infty$, in the sense that

\[
\lim_{m \to \infty} \frac{\|\mathbf{g}_{i+1}(m)\|}{\|\mathbf{g}_i(m)\|} = 0, \qquad i = 1, 2, \ldots, \tag{15.11}
\]

and that $\mathbf{g}_i(m)$ are available/computable. The vector sequences $\{\mathbf{g}_i(m)\}_{m=0}^{\infty}$, $i = 1, 2, \ldots$, are known, as well as $\{\mathbf{x}_m\}$. The scalars $\alpha_i$ do not have to be known. By (15.10), we mean

\[
\Bigl\| \mathbf{x}_m - \mathbf{s} - \sum_{i=1}^{r} \alpha_i \mathbf{g}_i(m) \Bigr\| = o(\|\mathbf{g}_r(m)\|) \quad \text{as } m \to \infty \quad \forall\, r \ge 1. \tag{15.12}
\]

Of course, the summation $\sum_{i=1}^{\infty} \alpha_i \mathbf{g}_i(m)$ in the asymptotic expansion of (15.10) does not need to be convergent; it may diverge in general. Finally, the $\alpha_i$ are not all nonzero necessarily; some may be zero in general. Vector extrapolation methods can now be designed by using ideas developed for the scalar case. Below, we adopt two different approaches.

68 Note that we can also obtain (15.9) from (15.4) by performing elementary column transformations in (15.5).


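15.3.1 First vectorized FGREP: FGREP1

We start with the equations (15.7) that define the $\gamma_j$ and replace the scalar $g_i(n+j)$ there by the vector $\mathbf{g}_i(n+j)$. This gives the overdetermined linear system

\[
\sum_{j=0}^{k} \mathbf{g}_i(n+j)\, \gamma_j = \mathbf{0}, \quad i = 1, \ldots, k, \qquad \sum_{j=0}^{k} \gamma_j = 1. \tag{15.13}
\]

But this system is inconsistent in general, and we should define the $\gamma_j$ to be a “solution” in some generalized sense. Once the $\gamma_j$ have been determined, set

\[
\mathbf{s}_{n,k} = \sum_{j=0}^{k} \gamma_j \mathbf{x}_{n+j}. \tag{15.14}
\]

We can achieve this in a few ways. The simplest way is the following: Choose a nonzero vector $\mathbf{y} \in \mathbb{C}^N$ and solve the $(k+1) \times (k+1)$ system

\[
\sum_{j=0}^{k} \langle \mathbf{y}, \mathbf{g}_i(n+j) \rangle\, \gamma_j = 0, \quad i = 1, \ldots, k, \qquad \sum_{j=0}^{k} \gamma_j = 1. \tag{15.15}
\]

This gives rise to

\[
\mathbf{s}_{n,k} = \frac{H(\mathbf{x}_n, \mathbf{x}_{n+1}, \ldots, \mathbf{x}_{n+k})}{H(1, 1, \ldots, 1)}, \tag{15.16}
\]

where

\[
H(v_0, v_1, \ldots, v_k) =
\begin{vmatrix}
v_0 & v_1 & \cdots & v_k \\
\langle \mathbf{y}, \mathbf{g}_1(n) \rangle & \langle \mathbf{y}, \mathbf{g}_1(n+1) \rangle & \cdots & \langle \mathbf{y}, \mathbf{g}_1(n+k) \rangle \\
\langle \mathbf{y}, \mathbf{g}_2(n) \rangle & \langle \mathbf{y}, \mathbf{g}_2(n+1) \rangle & \cdots & \langle \mathbf{y}, \mathbf{g}_2(n+k) \rangle \\
\vdots & \vdots & & \vdots \\
\langle \mathbf{y}, \mathbf{g}_k(n) \rangle & \langle \mathbf{y}, \mathbf{g}_k(n+1) \rangle & \cdots & \langle \mathbf{y}, \mathbf{g}_k(n+k) \rangle
\end{vmatrix}. \tag{15.17}
\]

We will call this method FGREP1. It is easy to see that these vectors satisfy a three-term recursion relation that can be carried out as explained in Chapter 8. Note that when $\mathbf{g}_i(n) = \Delta \mathbf{x}_{n+i-1}$, $\mathbf{s}_{n,k}$ is the vector obtained from TEA1.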
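A direct (non-recursive) realization of FGREP1 simply solves (15.15) for the $\gamma_j$ and forms (15.14). The sketch below does this for real data, using the Euclidean inner product; the function name `fgrep1`, the synthetic sequence with $\mathbf{g}_i(m) = b_i^m \mathbf{w}_i$, and the vector $\mathbf{y}$ are assumptions chosen so that the exact answer is known.

```python
import numpy as np

def fgrep1(x, g, y, n, k):
    """Solve (15.15) for gamma_0..gamma_k and return s_{n,k} from (15.14).

    x(m) and g(i, m) return real vectors; <y, v> is taken as y @ v.
    """
    M = np.empty((k + 1, k + 1))
    M[0, :] = 1.0                                    # sum_j gamma_j = 1
    for i in range(1, k + 1):
        M[i, :] = [y @ g(i, n + j) for j in range(k + 1)]
    rhs = np.zeros(k + 1)
    rhs[0] = 1.0
    gamma = np.linalg.solve(M, rhs)
    return sum(gamma[j] * x(n + j) for j in range(k + 1))

# Synthetic sequence x_m = s + 2 g_1(m) - 3 g_2(m) with g_i(m) = b_i^m w_i.
s_true = np.array([1.0, -2.0, 0.5])
w = [np.array([1.0, 0.0, 1.0]), np.array([0.0, 1.0, -1.0])]
b = [0.9, 0.5]

def g(i, m):
    return b[i - 1] ** m * w[i - 1]

def x(m):
    return s_true + 2.0 * g(1, m) - 3.0 * g(2, m)

y = np.array([1.0, 2.0, 0.5])      # chosen so that <y, w_i> != 0 for both i
s_est = fgrep1(x, g, y, n=0, k=2)
print("FGREP1 estimate:", s_est)
```

For this particular form of $\mathbf{g}_i(m)$, the projected equations force $\sum_j \gamma_j \mathbf{g}_i(n+j) = \mathbf{0}$ exactly, so the limit is recovered up to rounding; in general only the projections onto $\mathbf{y}$ vanish.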

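15.3.2 Second vectorized FGREP: FGREP2

We start with the equations (15.8b) that define the $\bar{\alpha}_i$ and replace the scalar $g_i(m)$ there by the vector $\mathbf{g}_i(m)$. This gives the overdetermined linear system

\[
\sum_{i=1}^{k} \bar{\alpha}_i\, \Delta \mathbf{g}_i(m) = \Delta \mathbf{x}_m, \qquad m = n, n+1, \ldots, n+k-1, \tag{15.18}
\]

where $\Delta \mathbf{x}_m = \mathbf{x}_{m+1} - \mathbf{x}_m$ and $\Delta \mathbf{g}_i(m) = \mathbf{g}_i(m+1) - \mathbf{g}_i(m)$. But this system too is inconsistent in general, and we should define the $\bar{\alpha}_i$ to be a “solution” in some generalized sense. Once the $\bar{\alpha}_i$ have been determined, we set

\[
\mathbf{s}_{n,k} = \mathbf{x}_n - \sum_{i=1}^{k} \bar{\alpha}_i \mathbf{g}_i(n). \tag{15.19}
\]

We can achieve the “solution” for the $\bar{\alpha}_i$ in a few ways. The simplest way is the following: Choose a nonzero vector $\mathbf{y} \in \mathbb{C}^N$, take the inner product of both sides of (15.18) with $\mathbf{y}$, and solve the $k \times k$ system

\[
\sum_{i=1}^{k} \langle \mathbf{y}, \Delta \mathbf{g}_i(m) \rangle\, \bar{\alpha}_i = \langle \mathbf{y}, \Delta \mathbf{x}_m \rangle, \qquad m = n, n+1, \ldots, n+k-1. \tag{15.20}
\]

Taken together, (15.19) and (15.20) give rise to

\[
\mathbf{s}_{n,k} = \mathbf{f}_{n,k}(\mathbf{x}), \qquad \mathbf{x} \equiv \{\mathbf{x}_m\}, \tag{15.21}
\]

where, for an arbitrary vector sequence $\mathbf{v} = \{\mathbf{v}_m\}$ in $\mathbb{C}^N$,

\[
\mathbf{f}_{n,k}(\mathbf{v}) = \frac{\mathbf{N}_{n,k}(\mathbf{v})}{D_{n,k}} =
\frac{\begin{vmatrix}
\mathbf{v}_n & \langle \mathbf{y}, \Delta \mathbf{v}_n \rangle & \cdots & \langle \mathbf{y}, \Delta \mathbf{v}_{n+k-1} \rangle \\
\mathbf{g}_1(n) & \langle \mathbf{y}, \Delta \mathbf{g}_1(n) \rangle & \cdots & \langle \mathbf{y}, \Delta \mathbf{g}_1(n+k-1) \rangle \\
\vdots & \vdots & & \vdots \\
\mathbf{g}_k(n) & \langle \mathbf{y}, \Delta \mathbf{g}_k(n) \rangle & \cdots & \langle \mathbf{y}, \Delta \mathbf{g}_k(n+k-1) \rangle
\end{vmatrix}}
{\begin{vmatrix}
\langle \mathbf{y}, \Delta \mathbf{g}_1(n) \rangle & \cdots & \langle \mathbf{y}, \Delta \mathbf{g}_1(n+k-1) \rangle \\
\vdots & & \vdots \\
\langle \mathbf{y}, \Delta \mathbf{g}_k(n) \rangle & \cdots & \langle \mathbf{y}, \Delta \mathbf{g}_k(n+k-1) \rangle
\end{vmatrix}}. \tag{15.22}
\]

Of course, we are assuming that $D_{n,k}$, the denominator determinant of $\mathbf{s}_{n,k}$, is nonzero. Note also that $\mathbf{N}_{n,k}(\mathbf{x})$, the numerator determinant of $\mathbf{s}_{n,k}$, which is a vector, is to be interpreted as its expansion with respect to its first column. We will call this method FGREP2. The method we have just described is due to Brezinski [30], where a recursion relation for it is also given. For different recursion relations, see Ford and Sidi [87]. This method was named the vector E-algorithm in [30].

Remark: As already mentioned, FGREP2 is not the only method that can be obtained from the formalism given above. We can determine a set of $\bar{\alpha}_i$ by solving only one of the equations in (15.18), namely,

\[
\sum_{i=1}^{k} \bar{\alpha}_i\, \Delta \mathbf{g}_i(n) = \Delta \mathbf{x}_n,
\]

in different ways.

• We can solve it by least squares, which amounts to requiring that the projection of the vector $\sum_{i=1}^{k} \bar{\alpha}_i \Delta \mathbf{g}_i(n) - \Delta \mathbf{x}_n$ on the subspace spanned by the vectors $\Delta \mathbf{g}_1(n), \ldots, \Delta \mathbf{g}_k(n)$ be zero. (This is analogous to MPE.)

• We can solve it by requiring that the projection of the vector $\sum_{i=1}^{k} \bar{\alpha}_i \Delta \mathbf{g}_i(n) - \Delta \mathbf{x}_n$ on a $k$-dimensional subspace spanned by the fixed vectors $\mathbf{q}_1, \ldots, \mathbf{q}_k$ be zero. (This is analogous to MMPE.)

• We can solve for the $\bar{\alpha}_i$ by minimizing some $l_p$-norm of $\sum_{i=1}^{k} \bar{\alpha}_i \Delta \mathbf{g}_i(n) - \Delta \mathbf{x}_n$. The $l_1$- and $l_\infty$-norms give rise to problems that can be solved by techniques of linear programming.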
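The two simplest options described in this subsection, FGREP2 via (15.19) and (15.20) and the least-squares variant of the remark, can be sketched as follows for real data. The function names, the synthetic sequence, and the vector $\mathbf{y}$ are assumptions made for illustration; dense solves are used rather than the recursions of [30] or [87].

```python
import numpy as np

def fgrep2(x, g, y, n, k):
    """Solve the k x k system (15.20) for abar_i, then form s_{n,k} by (15.19)."""
    M = np.array([[y @ (g(i, m + 1) - g(i, m)) for i in range(1, k + 1)]
                  for m in range(n, n + k)])
    rhs = np.array([y @ (x(m + 1) - x(m)) for m in range(n, n + k)])
    abar = np.linalg.solve(M, rhs)
    return x(n) - sum(abar[i - 1] * g(i, n) for i in range(1, k + 1))

def fgrep2_least_squares(x, g, n, k):
    """Variant from the remark: least-squares solution of the single equation
    sum_i abar_i * Delta g_i(n) = Delta x_n, followed by (15.19)."""
    G = np.column_stack([g(i, n + 1) - g(i, n) for i in range(1, k + 1)])
    abar, *_ = np.linalg.lstsq(G, x(n + 1) - x(n), rcond=None)
    return x(n) - sum(abar[i - 1] * g(i, n) for i in range(1, k + 1))

# Synthetic sequence x_m = s + 2 g_1(m) - 3 g_2(m) with g_i(m) = b_i^m w_i.
s_true = np.array([1.0, -2.0, 0.5])
w = [np.array([1.0, 0.0, 1.0]), np.array([0.0, 1.0, -1.0])]
b = [0.9, 0.5]

def g(i, m):
    return b[i - 1] ** m * w[i - 1]

def x(m):
    return s_true + 2.0 * g(1, m) - 3.0 * g(2, m)

y = np.array([1.0, 2.0, 0.5])
s_proj = fgrep2(x, g, y, n=0, k=2)
s_lsq = fgrep2_least_squares(x, g, n=0, k=2)
print("FGREP2:", s_proj, " least squares:", s_lsq)
```

Both variants recover the limit here because the synthetic sequence satisfies (15.18) exactly with $k = 2$; with a genuine asymptotic expansion the two choices generally give different approximations.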


15.4 A convergence theory for FGREP2

Convergence acceleration properties of $\mathbf{s}_{n,k}$ from FGREP2, under certain conditions, have been considered in a few publications. See Wimp [345, Chapter 10] and Matos [182] for different results. Here we provide a new study by Sidi [293], whose results are summarized in Theorem 15.2, which is stated and proved below (see footnote 69). This theorem provides optimal results in the form of

1. a genuine asymptotic expansion for $\mathbf{s}_{n,k}$ as $n \to \infty$, and

2. a definitive and quantitative convergence acceleration result.

The technique we use to prove Theorem 15.2 is derived in part from Wimp [345] and mostly from Sidi [265], with necessary modifications to accommodate vector sequences. It also involves the notion of generalized asymptotic expansion; see Temme [323, Chapter 1], for example. For convenience, we give the precise definition of this notion here.

Definition 15.1. Let $\{\phi_i(m)\}_{i=1}^{\infty}$ and $\{\psi_i(m)\}_{i=1}^{\infty}$ be two asymptotic scales as $m \to \infty$. Let $\{W_m\}_{m=0}^{\infty}$ be a given sequence. We say that the formal series $\sum_{i=1}^{\infty} a_i \phi_i(m)$ is the generalized asymptotic expansion of $W_m$ with respect to $\{\psi_i(m)\}_{i=1}^{\infty}$ as $m \to \infty$, written in the form

\[
W_m \sim \sum_{i=1}^{\infty} a_i \phi_i(m) \quad \text{as } m \to \infty, \quad \{\psi_i\},
\]

provided

\[
W_m - \sum_{i=1}^{r} a_i \phi_i(m) = o(\psi_r(m)) \quad \text{as } m \to \infty \quad \forall\, r \ge 1.
\]

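The notation we use in the following is precisely that introduced in the previous section.

Theorem 15.2. Let the sequence $\{\mathbf{x}_m\}$ be as in (15.10), with the $\mathbf{g}_i(m)$ satisfying (15.11), and

\[
\lim_{m \to \infty} \frac{\langle \mathbf{y}, \mathbf{g}_i(m+1) \rangle}{\langle \mathbf{y}, \mathbf{g}_i(m) \rangle} = b_i \ne 1, \quad i = 1, 2, \ldots, \qquad b_i \text{ distinct}, \quad |b_1| > |b_2| > \cdots, \quad \lim_{i \to \infty} b_i = 0, \tag{15.23}
\]

in addition. Assume also that

\[
\lim_{m \to \infty} \frac{\mathbf{g}_i(m)}{\langle \mathbf{y}, \Delta \mathbf{g}_i(m) \rangle} = \hat{\mathbf{g}}_i \ne \mathbf{0}, \qquad i = 1, 2, \ldots, \tag{15.24}
\]

and define

\[
\mathbf{h}_{k,i} =
\begin{vmatrix}
\hat{\mathbf{g}}_i & 1 & b_i & \cdots & b_i^{k-1} \\
\hat{\mathbf{g}}_1 & 1 & b_1 & \cdots & b_1^{k-1} \\
\vdots & \vdots & \vdots & & \vdots \\
\hat{\mathbf{g}}_k & 1 & b_k & \cdots & b_k^{k-1}
\end{vmatrix}, \qquad i \ge k+1. \tag{15.25}
\]

Then the following are true:

1. We have

\[
\lim_{m \to \infty} \frac{\langle \mathbf{y}, \Delta \mathbf{g}_i(m+1) \rangle}{\langle \mathbf{y}, \Delta \mathbf{g}_i(m) \rangle} = b_i, \qquad i = 1, 2, \ldots, \tag{15.26}
\]

in addition to (15.23). Furthermore, the sequence $\{\langle \mathbf{y}, \Delta \mathbf{g}_i(m) \rangle\}_{i=1}^{\infty}$ is an asymptotic scale as $m \to \infty$; that is,

\[
\lim_{m \to \infty} \frac{\langle \mathbf{y}, \Delta \mathbf{g}_{i+1}(m) \rangle}{\langle \mathbf{y}, \Delta \mathbf{g}_i(m) \rangle} = 0, \qquad i = 1, 2, \ldots. \tag{15.27}
\]

2. With arbitrary $\mathbf{v} \equiv \{\mathbf{v}_m\}_{m=0}^{\infty}$, $\mathbf{f}_{n,k}(\mathbf{v})$ defined in (15.22) exist for all $n \ge n_0$, $n_0$ being some positive integer independent of $\mathbf{v}$.

3. (a) With $\mathbf{g}_i \equiv \{\mathbf{g}_i(m)\}_{m=0}^{\infty}$, we have $\mathbf{f}_{n,k}(\mathbf{g}_i) = \mathbf{0}$ for $i = 1, \ldots, k$, while for $i \ge k+1$,

\[
\begin{aligned}
\frac{\mathbf{f}_{n,k}(\mathbf{g}_i)}{\langle \mathbf{y}, \Delta \mathbf{g}_i(n) \rangle} &\sim \frac{\mathbf{h}_{k,i}}{V(b_1, \ldots, b_k)} \quad \text{as } n \to \infty && \text{if } \mathbf{h}_{k,i} \ne \mathbf{0}, \\
\frac{\mathbf{f}_{n,k}(\mathbf{g}_i)}{\langle \mathbf{y}, \Delta \mathbf{g}_i(n) \rangle} &= o(1) \quad \text{as } n \to \infty && \text{if } \mathbf{h}_{k,i} = \mathbf{0},
\end{aligned} \tag{15.28}
\]

and also

\[
\begin{aligned}
\|\mathbf{f}_{n,k}(\mathbf{g}_i)\| &\sim C_{k,i}\, \|\mathbf{g}_i(n)\| \quad \text{as } n \to \infty && \text{if } \mathbf{h}_{k,i} \ne \mathbf{0}, \\
\|\mathbf{f}_{n,k}(\mathbf{g}_i)\| &= o(\|\mathbf{g}_i(n)\|) \quad \text{as } n \to \infty && \text{if } \mathbf{h}_{k,i} = \mathbf{0},
\end{aligned} \tag{15.29}
\]

where

\[
C_{k,i} = \frac{1}{|V(b_1, \ldots, b_k)|} \frac{\|\mathbf{h}_{k,i}\|}{\|\hat{\mathbf{g}}_i\|} \tag{15.30}
\]

and $V(c_1, \ldots, c_k)$ is the Vandermonde determinant of $c_1, \ldots, c_k$, given as

\[
V(c_1, \ldots, c_k) =
\begin{vmatrix}
1 & c_1 & \cdots & c_1^{k-1} \\
1 & c_2 & \cdots & c_2^{k-1} \\
\vdots & \vdots & & \vdots \\
1 & c_k & \cdots & c_k^{k-1}
\end{vmatrix}
= \prod_{1 \le i < j \le k} (c_j - c_i). \tag{15.31}
\]

(b) In addition, for $i \ge k+1$, $\{\mathbf{f}_{n,k}(\mathbf{g}_i)\}_{i=k+1}^{\infty}$ is an asymptotic scale as $n \to \infty$ in the following generalized sense:

\[
\lim_{n \to \infty} \frac{\|\mathbf{f}_{n,k}(\mathbf{g}_{i+1})\|}{\|\mathbf{g}_i(n)\|} = 0, \qquad i \ge k+1. \tag{15.32}
\]

4. $\mathbf{s}_{n,k}$ has a genuine generalized asymptotic expansion with respect to the asymptotic scale $\{\|\mathbf{g}_i(n)\|\}_{i=1}^{\infty}$ as $n \to \infty$; namely,

\[
\mathbf{s}_{n,k} \sim \mathbf{s} + \sum_{i=k+1}^{\infty} \alpha_i \mathbf{f}_{n,k}(\mathbf{g}_i) \quad \text{as } n \to \infty, \quad \{\|\mathbf{g}_i\|\}, \tag{15.33}
\]

in the sense that

\[
\Bigl\| \mathbf{s}_{n,k} - \mathbf{s} - \sum_{i=k+1}^{r} \alpha_i \mathbf{f}_{n,k}(\mathbf{g}_i) \Bigr\| = o(\|\mathbf{g}_r(n)\|) \quad \text{as } n \to \infty \quad \forall\, r \ge k+1. \tag{15.34}
\]

We also have

\[
\Bigl\| \mathbf{s}_{n,k} - \mathbf{s} - \sum_{i=k+1}^{r} \alpha_i \mathbf{f}_{n,k}(\mathbf{g}_i) \Bigr\| = o(\|\mathbf{f}_{n,k}(\mathbf{g}_r)\|) \quad \text{as } n \to \infty \quad \text{if } \mathbf{h}_{k,r} \ne \mathbf{0}. \tag{15.35}
\]

5. Let $\alpha_{k+\mu}$ be the first nonzero $\alpha_i$ with $i \ge k+1$. Then the following are true:

(a) $\mathbf{s}_{n,k}$ satisfies

\[
\|\mathbf{s}_{n,k} - \mathbf{s}\| = O(\|\mathbf{g}_{k+\mu}(n)\|) \quad \text{as } n \to \infty, \tag{15.36}
\]

and, therefore, also

\[
\|\mathbf{s}_{n,k+j} - \mathbf{s}\| = O(\|\mathbf{g}_{k+\mu}(n)\|) \quad \text{as } n \to \infty, \qquad j = 0, 1, \ldots, \mu - 1. \tag{15.37}
\]

(b) We also have

\[
\mathbf{s}_{n,k} - \mathbf{s} \sim \alpha_{k+\mu}\, \mathbf{f}_{n,k}(\mathbf{g}_{k+\mu}) \quad \text{as } n \to \infty \quad \text{if } \mathbf{h}_{k,k+\mu} \ne \mathbf{0}. \tag{15.38}
\]

As a result, provided $\mathbf{h}_{k+j,k+\mu} \ne \mathbf{0}$, $j = 0, 1, \ldots, \mu - 1$, we also have

\[
\mathbf{s}_{n,k+j} - \mathbf{s} \sim \alpha_{k+\mu}\, \mathbf{f}_{n,k+j}(\mathbf{g}_{k+\mu}) \quad \text{as } n \to \infty, \qquad j = 0, 1, \ldots, \mu - 1, \tag{15.39}
\]

which also implies

\[
\|\mathbf{s}_{n,k+j} - \mathbf{s}\| \sim |\alpha_{k+\mu}|\, C_{k+j,k+\mu}\, \|\mathbf{g}_{k+\mu}(n)\| \quad \text{as } n \to \infty, \qquad j = 0, 1, \ldots, \mu - 1. \tag{15.40}
\]

(c) If $\alpha_k \ne 0$ and $\mathbf{h}_{k-1,k} \ne \mathbf{0}$, then

\[
\frac{\|\mathbf{s}_{n,k+j} - \mathbf{s}\|}{\|\mathbf{s}_{n,k-1} - \mathbf{s}\|} = O\!\left( \frac{\|\mathbf{g}_{k+\mu}(n)\|}{\|\mathbf{g}_k(n)\|} \right) = o(1) \quad \text{as } n \to \infty, \qquad j = 0, 1, \ldots, \mu - 1. \tag{15.41}
\]

69 Theorem 15.2 and its proof are used with permission from Springer [293].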

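Proof. Proof of part 1: We first note that

\[
\frac{\langle \mathbf{y}, \Delta \mathbf{g}_i(m+1) \rangle}{\langle \mathbf{y}, \Delta \mathbf{g}_i(m) \rangle}
= \frac{\langle \mathbf{y}, \mathbf{g}_i(m+1) \rangle}{\langle \mathbf{y}, \mathbf{g}_i(m) \rangle}
\cdot
\frac{\dfrac{\langle \mathbf{y}, \mathbf{g}_i(m+2) \rangle}{\langle \mathbf{y}, \mathbf{g}_i(m+1) \rangle} - 1}{\dfrac{\langle \mathbf{y}, \mathbf{g}_i(m+1) \rangle}{\langle \mathbf{y}, \mathbf{g}_i(m) \rangle} - 1}.
\]

Taking limits as $m \to \infty$ and invoking (15.23), we obtain (15.26). Next, by (15.24), we have the asymptotic equality

\[
\mathbf{g}_i(m) \sim \hat{\mathbf{g}}_i\, \langle \mathbf{y}, \Delta \mathbf{g}_i(m) \rangle \quad \text{as } m \to \infty, \tag{15.42}
\]

which, upon taking norms, gives the asymptotic equality

\[
| \langle \mathbf{y}, \Delta \mathbf{g}_i(m) \rangle | \sim \frac{\|\mathbf{g}_i(m)\|}{\|\hat{\mathbf{g}}_i\|} \quad \text{as } m \to \infty. \tag{15.43}
\]

Therefore,

\[
\frac{| \langle \mathbf{y}, \Delta \mathbf{g}_{i+1}(m) \rangle |}{| \langle \mathbf{y}, \Delta \mathbf{g}_i(m) \rangle |} \sim \frac{\|\hat{\mathbf{g}}_i\|}{\|\hat{\mathbf{g}}_{i+1}\|} \frac{\|\mathbf{g}_{i+1}(m)\|}{\|\mathbf{g}_i(m)\|} \quad \text{as } m \to \infty.
\]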

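Invoking the fact that $\{\mathbf{g}_i(m)\}_{i=1}^{\infty}$ is itself an asymptotic scale as $m \to \infty$, as in (15.11), the result in (15.27) follows.

Proof of part 2: By (15.22), $\mathbf{f}_{n,k}(\mathbf{v})$ for arbitrary $\mathbf{v} \equiv \{\mathbf{v}_m\}$ exists provided $D_{n,k}$, the denominator determinant, is nonzero. Therefore, we need to analyze only the determinant $D_{n,k}$ in (15.22). Let us set

\[
\eta_{i,j}(m) = \frac{\langle \mathbf{y}, \Delta \mathbf{g}_i(m+j) \rangle}{\langle \mathbf{y}, \Delta \mathbf{g}_i(m) \rangle}, \qquad i, j = 1, 2, \ldots, \tag{15.44}
\]

and observe that

\[
\eta_{i,j}(m) = \prod_{r=1}^{j} \frac{\langle \mathbf{y}, \Delta \mathbf{g}_i(m+r) \rangle}{\langle \mathbf{y}, \Delta \mathbf{g}_i(m+r-1) \rangle}. \tag{15.45}
\]

Letting $m \to \infty$ and invoking (15.26), we obtain

\[
\lim_{m \to \infty} \eta_{i,j}(m) = b_i^{\,j}. \tag{15.46}
\]

Factoring out $\langle \mathbf{y}, \Delta \mathbf{g}_j(n) \rangle$ from the $j$th row of $D_{n,k}$, $j = 1, \ldots, k$, we have

\[
\frac{D_{n,k}}{\prod_{j=1}^{k} \langle \mathbf{y}, \Delta \mathbf{g}_j(n) \rangle} =
\begin{vmatrix}
1 & \eta_{1,1}(n) & \cdots & \eta_{1,k-1}(n) \\
\vdots & \vdots & & \vdots \\
1 & \eta_{k,1}(n) & \cdots & \eta_{k,k-1}(n)
\end{vmatrix}
\equiv \psi_{n,k}, \tag{15.47}
\]

which, upon letting $n \to \infty$, gives

\[
\lim_{n \to \infty} \psi_{n,k} = V(b_1, b_2, \ldots, b_k) = \prod_{1 \le i < j \le k} (b_j - b_i), \tag{15.48}
\]

this limit being nonzero since the $b_i$ are distinct. Therefore,

\[
D_{n,k} \sim V(b_1, b_2, \ldots, b_k) \prod_{j=1}^{k} \langle \mathbf{y}, \Delta \mathbf{g}_j(n) \rangle \quad \text{as } n \to \infty.
\]

From this and from (15.43), we conclude that $D_{n,k} \ne 0$ for all large $n$. Since $D_{n,k}$ is also independent of $\mathbf{v}$, we have that $D_{n,k} \ne 0$ for all $n \ge n_0$, $n_0$ being independent of $\mathbf{v}$ trivially.

Proof of part 3: We now turn to $\mathbf{f}_{n,k}(\mathbf{g}_i) = \mathbf{N}_{n,k}(\mathbf{g}_i)/D_{n,k}$, where

\[
\mathbf{N}_{n,k}(\mathbf{g}_i) =
\begin{vmatrix}
\mathbf{g}_i(n) & \langle \mathbf{y}, \Delta \mathbf{g}_i(n) \rangle & \cdots & \langle \mathbf{y}, \Delta \mathbf{g}_i(n+k-1) \rangle \\
\mathbf{g}_1(n) & \langle \mathbf{y}, \Delta \mathbf{g}_1(n) \rangle & \cdots & \langle \mathbf{y}, \Delta \mathbf{g}_1(n+k-1) \rangle \\
\vdots & \vdots & & \vdots \\
\mathbf{g}_k(n) & \langle \mathbf{y}, \Delta \mathbf{g}_k(n) \rangle & \cdots & \langle \mathbf{y}, \Delta \mathbf{g}_k(n+k-1) \rangle
\end{vmatrix}. \tag{15.49}
\]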


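We first observe that $\mathbf{N}_{n,k}(\mathbf{g}_i) = \mathbf{0}$ for $i = 1, \ldots, k$, since the determinant in (15.49) has two identical rows when $1 \le i \le k$. This proves that $\mathbf{f}_{n,k}(\mathbf{g}_i) = \mathbf{0}$ for $i = 1, \ldots, k$. Therefore, we consider the case $i \ge k+1$. Proceeding as in the analysis of $D_{n,k}$, let us factor out $\langle \mathbf{y}, \Delta \mathbf{g}_i(n) \rangle$ and $\langle \mathbf{y}, \Delta \mathbf{g}_1(n) \rangle, \ldots, \langle \mathbf{y}, \Delta \mathbf{g}_k(n) \rangle$ from the $k+1$ rows of $\mathbf{N}_{n,k}(\mathbf{g}_i)$. We obtain

\[
\frac{\mathbf{N}_{n,k}(\mathbf{g}_i)}{\langle \mathbf{y}, \Delta \mathbf{g}_i(n) \rangle \prod_{j=1}^{k} \langle \mathbf{y}, \Delta \mathbf{g}_j(n) \rangle} =
\begin{vmatrix}
\dfrac{\mathbf{g}_i(n)}{\langle \mathbf{y}, \Delta \mathbf{g}_i(n) \rangle} & 1 & \eta_{i,1}(n) & \cdots & \eta_{i,k-1}(n) \\
\dfrac{\mathbf{g}_1(n)}{\langle \mathbf{y}, \Delta \mathbf{g}_1(n) \rangle} & 1 & \eta_{1,1}(n) & \cdots & \eta_{1,k-1}(n) \\
\vdots & \vdots & \vdots & & \vdots \\
\dfrac{\mathbf{g}_k(n)}{\langle \mathbf{y}, \Delta \mathbf{g}_k(n) \rangle} & 1 & \eta_{k,1}(n) & \cdots & \eta_{k,k-1}(n)
\end{vmatrix}
\equiv \mathbf{h}_{k,i}(n), \tag{15.50}
\]

which, upon letting $n \to \infty$ and invoking (15.46), (15.24), and (15.25), gives

\[
\lim_{n \to \infty} \mathbf{h}_{k,i}(n) = \mathbf{h}_{k,i}. \tag{15.51}
\]

Combining (15.50) with (15.47), we obtain

\[
\mathbf{f}_{n,k}(\mathbf{g}_i) = \frac{\mathbf{h}_{k,i}(n)}{\psi_{n,k}}\, \langle \mathbf{y}, \Delta \mathbf{g}_i(n) \rangle, \tag{15.52}
\]

which, upon letting $n \to \infty$, gives

\[
\lim_{n \to \infty} \frac{\mathbf{f}_{n,k}(\mathbf{g}_i)}{\langle \mathbf{y}, \Delta \mathbf{g}_i(n) \rangle} = \frac{\mathbf{h}_{k,i}}{V(b_1, \ldots, b_k)}, \tag{15.53}
\]

from which (15.28) follows. (15.29) is obtained by taking norms in (15.28) and by making use of (15.43). Finally, applying (15.52) with $i$ replaced by $i+1$,

\[
\frac{\|\mathbf{f}_{n,k}(\mathbf{g}_{i+1})\|}{\|\mathbf{g}_i(n)\|} = \frac{\|\mathbf{h}_{k,i+1}(n)\|}{|\psi_{n,k}|} \frac{| \langle \mathbf{y}, \Delta \mathbf{g}_{i+1}(n) \rangle |}{\|\mathbf{g}_i(n)\|},
\]

which, upon letting $n \to \infty$ and invoking (15.43), gives

\[
\frac{\|\mathbf{f}_{n,k}(\mathbf{g}_{i+1})\|}{\|\mathbf{g}_i(n)\|} \sim \frac{\|\mathbf{h}_{k,i+1}(n)\|}{|V(b_1, \ldots, b_k)|} \frac{1}{\|\hat{\mathbf{g}}_{i+1}\|} \frac{\|\mathbf{g}_{i+1}(n)\|}{\|\mathbf{g}_i(n)\|} \quad \text{as } n \to \infty.
\]

Invoking (15.11), and noting that $\mathbf{h}_{k,i+1}(n)$ is bounded in $n$, we obtain (15.32).

Proof of part 4: We now turn to $\mathbf{s}_{n,k}$. First, we note that

\[
\mathbf{s}_{n,k} - \mathbf{s} = \mathbf{f}_{n,k}(\mathbf{x}) - \mathbf{s} = \frac{\mathbf{N}_{n,k}(\mathbf{x} - \mathbf{s})}{D_{n,k}}, \qquad \mathbf{x} - \mathbf{s} \equiv \{\mathbf{x}_m - \mathbf{s}\}, \tag{15.54}
\]

with

\[
\mathbf{N}_{n,k}(\mathbf{x} - \mathbf{s}) =
\begin{vmatrix}
\mathbf{x}_n - \mathbf{s} & \langle \mathbf{y}, \Delta \mathbf{x}_n \rangle & \cdots & \langle \mathbf{y}, \Delta \mathbf{x}_{n+k-1} \rangle \\
\mathbf{g}_1(n) & \langle \mathbf{y}, \Delta \mathbf{g}_1(n) \rangle & \cdots & \langle \mathbf{y}, \Delta \mathbf{g}_1(n+k-1) \rangle \\
\vdots & \vdots & & \vdots \\
\mathbf{g}_k(n) & \langle \mathbf{y}, \Delta \mathbf{g}_k(n) \rangle & \cdots & \langle \mathbf{y}, \Delta \mathbf{g}_k(n+k-1) \rangle
\end{vmatrix}, \tag{15.55}
\]

because the coefficient of $\mathbf{x}_n$ in the expansion of $\mathbf{N}_{n,k}(\mathbf{x})$ is $D_{n,k}$. By (15.10), the elements in the first row of $\mathbf{N}_{n,k}(\mathbf{x} - \mathbf{s})$ have the asymptotic expansions

\[
\mathbf{x}_n - \mathbf{s} \sim \sum_{i=1}^{\infty} \alpha_i \mathbf{g}_i(n) \quad \text{as } n \to \infty,
\qquad
\langle \mathbf{y}, \Delta \mathbf{x}_{n+j} \rangle \sim \sum_{i=1}^{\infty} \alpha_i \langle \mathbf{y}, \Delta \mathbf{g}_i(n+j) \rangle \quad \text{as } n \to \infty, \quad j = 0, 1, \ldots.
\]

Multiplying the $(i+1)$st row of $\mathbf{N}_{n,k}(\mathbf{x} - \mathbf{s})$ in (15.55) by $\alpha_i$ and subtracting from the first row, $i = 1, \ldots, k$, we obtain

\[
\mathbf{N}_{n,k}(\mathbf{x} - \mathbf{s}) \sim
\begin{vmatrix}
\sum_{i=k+1}^{\infty} \alpha_i \mathbf{g}_i(n) & \sum_{i=k+1}^{\infty} \alpha_i \langle \mathbf{y}, \Delta \mathbf{g}_i(n) \rangle & \cdots & \sum_{i=k+1}^{\infty} \alpha_i \langle \mathbf{y}, \Delta \mathbf{g}_i(n+k-1) \rangle \\
\mathbf{g}_1(n) & \langle \mathbf{y}, \Delta \mathbf{g}_1(n) \rangle & \cdots & \langle \mathbf{y}, \Delta \mathbf{g}_1(n+k-1) \rangle \\
\vdots & \vdots & & \vdots \\
\mathbf{g}_k(n) & \langle \mathbf{y}, \Delta \mathbf{g}_k(n) \rangle & \cdots & \langle \mathbf{y}, \Delta \mathbf{g}_k(n+k-1) \rangle
\end{vmatrix} \tag{15.56}
\]

as $n \to \infty$. Taking the summations $\sum_{i=k+1}^{\infty}$ and the multiplicative factors $\alpha_i$ from the first row outside the determinant in (15.56), we have, formally,

\[
\mathbf{N}_{n,k}(\mathbf{x} - \mathbf{s}) \sim \sum_{i=k+1}^{\infty} \alpha_i \mathbf{N}_{n,k}(\mathbf{g}_i) \quad \text{as } n \to \infty, \tag{15.57}
\]

with $\mathbf{N}_{n,k}(\mathbf{g}_i)$ as in (15.49). Substituting (15.57) into (15.54), we obtain, again formally,

\[
\mathbf{s}_{n,k} - \mathbf{s} \sim \sum_{i=k+1}^{\infty} \alpha_i \mathbf{f}_{n,k}(\mathbf{g}_i) \quad \text{as } n \to \infty,
\]

which is almost the asymptotic expansion of $\mathbf{s}_{n,k}$ given in (15.33). The infinite series $\sum_{i=k+1}^{\infty} \alpha_i \mathbf{f}_{n,k}(\mathbf{g}_i)$ will be a valid generalized asymptotic expansion for $\mathbf{s}_{n,k} - \mathbf{s}$ with respect to the asymptotic scale $\{\|\mathbf{g}_i(n)\|\}_{i=1}^{\infty}$ as $n \to \infty$ provided

\[
\Bigl\| \mathbf{s}_{n,k} - \mathbf{s} - \sum_{i=k+1}^{r} \alpha_i \mathbf{f}_{n,k}(\mathbf{g}_i) \Bigr\| = o(\|\mathbf{g}_r(n)\|) \quad \text{as } n \to \infty \quad \forall\, r \ge k+1. \tag{15.58}
\]

By (15.12), for arbitrary $r$, we have

\[
\mathbf{x}_m = \mathbf{s} + \sum_{i=1}^{r} \alpha_i \mathbf{g}_i(m) + \boldsymbol{\varepsilon}_r(m), \qquad \|\boldsymbol{\varepsilon}_r(m)\| = o(\|\mathbf{g}_r(m)\|) \quad \text{as } m \to \infty. \tag{15.59}
\]

Let us substitute this into (15.54) and proceed exactly as above; we obtain

\[
\mathbf{s}_{n,k} = \mathbf{s} + \sum_{i=k+1}^{r} \alpha_i \mathbf{f}_{n,k}(\mathbf{g}_i) + \mathbf{f}_{n,k}(\boldsymbol{\varepsilon}_r). \tag{15.60}
\]

Comparing (15.60) with (15.58), we realize that (15.58) will be satisfied provided

\[
\|\mathbf{f}_{n,k}(\boldsymbol{\varepsilon}_r)\| = o(\|\mathbf{g}_r(n)\|) \quad \text{as } n \to \infty. \tag{15.61}
\]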


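Now, $f_{n,k}(\varepsilon_r) = N_{n,k}(\varepsilon_r)/D_{n,k}$, and
\[
N_{n,k}(\varepsilon_r) =
\begin{vmatrix}
\varepsilon_r(n) & \langle y, \Delta \varepsilon_r(n)\rangle & \cdots & \langle y, \Delta \varepsilon_r(n+k-1)\rangle \\
g_1(n) & \langle y, \Delta g_1(n)\rangle & \cdots & \langle y, \Delta g_1(n+k-1)\rangle \\
\vdots & \vdots & & \vdots \\
g_k(n) & \langle y, \Delta g_k(n)\rangle & \cdots & \langle y, \Delta g_k(n+k-1)\rangle
\end{vmatrix}.
\]
Let us factor out $\langle y, \Delta g_r(n)\rangle$ and $\langle y, \Delta g_1(n)\rangle, \ldots, \langle y, \Delta g_k(n)\rangle$ from the $k+1$ rows of this determinant. We obtain
\[
\frac{N_{n,k}(\varepsilon_r)}{\langle y, \Delta g_r(n)\rangle \prod_{j=1}^{k} \langle y, \Delta g_j(n)\rangle} =
\begin{vmatrix}
\dfrac{\varepsilon_r(n)}{\langle y, \Delta g_r(n)\rangle} & \dfrac{\langle y, \Delta \varepsilon_r(n)\rangle}{\langle y, \Delta g_r(n)\rangle} & \dfrac{\langle y, \Delta \varepsilon_r(n+1)\rangle}{\langle y, \Delta g_r(n)\rangle} & \cdots & \dfrac{\langle y, \Delta \varepsilon_r(n+k-1)\rangle}{\langle y, \Delta g_r(n)\rangle} \\
\dfrac{g_1(n)}{\langle y, \Delta g_1(n)\rangle} & 1 & \eta_{1,1}(n) & \cdots & \eta_{1,k-1}(n) \\
\vdots & \vdots & \vdots & & \vdots \\
\dfrac{g_k(n)}{\langle y, \Delta g_k(n)\rangle} & 1 & \eta_{k,1}(n) & \cdots & \eta_{k,k-1}(n)
\end{vmatrix}
\equiv \phi_{n,k}(\varepsilon_r). \tag{15.62}
\]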

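Dividing (15.62) by (15.47), we obtain
\[
f_{n,k}(\varepsilon_r) = \frac{\phi_{n,k}(\varepsilon_r)}{\psi_{n,k}}\, \langle y, \Delta g_r(n)\rangle,
\]
which, upon taking norms and invoking (15.43), gives
\[
\|f_{n,k}(\varepsilon_r)\| \sim \frac{\|\phi_{n,k}(\varepsilon_r)\|\, \|g_r(n)\|}{\|\hat{g}_r\|\, |V(b_1, \ldots, b_k)|} \quad \text{as } n \to \infty.
\]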

Therefore, (15.58) will hold provided $\lim_{n\to\infty} \phi_{n,k}(\varepsilon_r) = 0$. As we already know, with the exception of the elements in the first row, all the remaining elements of the determinant $\phi_{n,k}(\varepsilon_r)$ have finite limits as $n \to \infty$, by (15.24) and (15.46). Therefore, $\lim_{n\to\infty} \phi_{n,k}(\varepsilon_r) = 0$ will hold provided all the elements in the first row of $\phi_{n,k}(\varepsilon_r)$ tend to zero as $n \to \infty$. That this is the case is what we show next. First, by (15.43)–(15.46), as $n \to \infty$,
\[
\|g_r(n+j)\| \sim \|\hat{g}_r\|\, |\langle y, \Delta g_r(n+j)\rangle| \sim |b_r|^j \|\hat{g}_r\|\, |\langle y, \Delta g_r(n)\rangle| \sim |b_r|^j \|g_r(n)\|. \tag{15.63}
\]
Next, by applying the Cauchy–Schwarz inequality to $\langle y, \Delta \varepsilon_r(n+j)\rangle$, and invoking (15.59) and (15.63), we have
\[
|\langle y, \Delta \varepsilon_r(n+j)\rangle| \le \|y\| \bigl(\|\varepsilon_r(n+j+1)\| + \|\varepsilon_r(n+j)\|\bigr) = o(\|g_r(n+j+1)\|) + o(\|g_r(n+j)\|) = o(\|g_r(n)\|) \quad \text{as } n \to \infty.
\]
Invoking also (15.43), for the elements in the first row of $\phi_{n,k}(\varepsilon_r)$, we finally obtain
\[
\left\| \frac{\varepsilon_r(n)}{\langle y, \Delta g_r(n)\rangle} \right\| = o(1) \quad \text{as } n \to \infty
\]
and
\[
\left| \frac{\langle y, \Delta \varepsilon_r(n+j)\rangle}{\langle y, \Delta g_r(n)\rangle} \right| = o(1) \quad \text{as } n \to \infty, \quad j = 0, 1, \ldots, k-1.
\]

This implies that $\phi_{n,k}(\varepsilon_r) = o(1)$ as $n \to \infty$, and the proof is complete.

Proof of part 5: By $\alpha_{k+j} = 0$, $j = 1, \ldots, \mu - 1$, and $\alpha_{k+\mu} \neq 0$,⁷⁰ the validity of (15.36) is obvious. (15.37) follows from (15.36). The validity of (15.38)–(15.40) can be shown in the same way. As for (15.41), we start with
\[
\frac{\|s_{n,k} - s\|}{\|s_{n,k-1} - s\|} \sim \frac{1}{C_{k-1,k}\, |\alpha_k|}\, \frac{\|s_{n,k} - s\|}{\|g_k(n)\|},
\]
which follows from (15.40), and invoke (15.37). We leave the details to the reader. This completes the proof.

Remarks:

1. Note that Theorem 15.2 is stated under a minimal number of conditions on the $g_i(m)$ and the $x_m$. Of these, the condition in (15.23) is given in [345, p. 180, Eq. (3)], while that in (15.24) is a modification of [345, p. 180, Eq. (5)].

2. The conditions we have imposed on the $g_i(m)$ enable us to proceed with the proof rigorously by employing asymptotic equalities $\sim$ everywhere possible. This should be contrasted with bounds formulated in terms of the big $O$ notation, which do not allow us to obtain the optimal results we have in our theorem.⁷¹

3. Note that we have imposed essentially two different conditions on the $g_i(m)$, namely (15.23) and (15.24). One may naturally think that these conditions contradict each other. In addition, one may think that they also contradict the very first and fundamental property in (15.11), which must hold to make (15.10) a genuine asymptotic expansion. Thus, we need to make sure that there are no contradictions present in our theorem. For this, it is enough to show that all three conditions can hold simultaneously, which is the case when
\[
g_i(m) \sim w_i b_i^m \quad \text{as } m \to \infty, \qquad |b_i| > |b_{i+1}| \quad \forall\, i \ge 1.
\]
It is easy to verify that (15.11), (15.23), and (15.24) are satisfied simultaneously in this case.

4. Due to the possibility that $\hat{h}_{k,i} = 0$ for some $i \ge k+1$, we cannot claim a priori that $\{f_{n,k}(g_i)\}_{i=k+1}^{\infty}$ is an asymptotic scale in the regular sense. Note, however, that we can safely replace (15.28) by
\[
\|f_{n,k}(g_i)\| = O(\|g_i(n)\|) \quad \text{as } n \to \infty \quad \forall\, i \ge k+1,
\]
whether $\hat{h}_{k,i} \neq 0$ or $\hat{h}_{k,i} = 0$.

70 Note that this already takes into account the possibility that $\alpha_{k+1} \neq 0$, in which case $\mu = 1$.

71 Recall that $u_m \sim v_m$ as $m \to \infty$ if and only if $\lim_{m\to\infty} (u_m/v_m) = 1$. One big advantage of asymptotic equalities is that they allow symmetry and division. That is, if $u_m \sim v_m$, then $v_m \sim u_m$ as well. In addition, $u_m \sim v_m$ and $u'_m \sim v'_m$ also imply $u_m/u'_m \sim v_m/v'_m$. On the other hand, if $u_m = O(v_m)$, we do not necessarily have $v_m = O(u_m)$. In addition, $u_m = O(v_m)$ and $u'_m = O(v'_m)$ do not necessarily imply $u_m/u'_m = O(v_m/v'_m)$.


5. $\hat{h}_{k,i} \neq 0$ for all $i \ge k+1$ if, for example, the vectors $\hat{g}_i$ are all linearly independent, which is possible if the underlying space is infinite-dimensional. This can be seen by expanding the determinant defining $\hat{h}_{k,i}$ in (15.25) with respect to its first column and realizing that $\hat{h}_{k,i} = c_i \hat{g}_i + \sum_{j=1}^{k} c_j \hat{g}_j$, where $c_i$ and the $c_j$ are all nonzero Vandermonde determinants. In such a case, by (15.29) and (15.11),
\[
\frac{\|f_{n,k}(g_{i+1})\|}{\|f_{n,k}(g_i)\|} \sim \frac{C_{k,i+1}}{C_{k,i}}\, \frac{\|g_{i+1}(n)\|}{\|g_i(n)\|} = o(1) \quad \text{as } n \to \infty;
\]
hence $\{f_{n,k}(g_i)\}_{i=k+1}^{\infty}$ is an asymptotic scale in the regular sense. Therefore, the asymptotic expansion of $s_{n,k}$ in (15.33) is a regular asymptotic expansion, which means that
\[
s_{n,k} - s - \sum_{i=k+1}^{r} \alpha_i f_{n,k}(g_i) = o(f_{n,k}(g_r)) \quad \text{as } n \to \infty \quad \forall\, r \ge k+1.
\]

6. When $\alpha_1 \neq 0$, the sequence $\{x_m\}$ is convergent if $|b_1| < 1$; it is divergent if $|b_1| \ge 1$. The asymptotic result in (15.34), which is always true, shows clearly that $s_{n,k}$ converges to $s$ faster than $x_n$ when $\{x_m\}$ is convergent. If $\{x_m\}$ is divergent, by the assumption that $\lim_{i\to\infty} b_i = 0$, we have that $|b_i| < 1$, $i \ge p$, for some integer $p$, and $s_{n,k}$ converges when $k \ge p$.

7. Consider the case
\[
\alpha_k \neq 0, \qquad \alpha_{k+1} = \cdots = \alpha_{k+\mu-1} = 0, \qquad \alpha_{k+\mu} \neq 0.
\]
By (15.34)–(15.41), we have the following:
• Whether $\hat{h}_{k-1,k} \neq 0$ or $\hat{h}_{k-1,k} = 0$, $s_{n,k-1} - s = O(g_k(n))$ as $n \to \infty$.
• Whether $\hat{h}_{k+j,k+\mu} \neq 0$ or $\hat{h}_{k+j,k+\mu} = 0$, $0 \le j \le \mu - 1$, $s_{n,k+j} - s = O(g_{k+\mu}(n))$ as $n \to \infty$, $0 \le j \le \mu - 1$.
• If $\hat{h}_{k-1,k} \neq 0$, then $s_{n,k}$ converges faster (or diverges slower) than $s_{n,k-1}$; that is,
\[
\lim_{n\to\infty} \frac{\|s_{n,k} - s\|}{\|s_{n,k-1} - s\|} = 0.
\]
• If $\hat{h}_{k+j,k+\mu} \neq 0$, $0 \le j \le \mu - 1$, then
\[
\|s_{n,k+j} - s\| \sim M_{k+j}\, \|g_{k+\mu}(n)\| \quad \text{as } n \to \infty, \quad j = 0, 1, \ldots, \mu - 1,
\]
for some positive constants $M_{k+j}$. That is, $s_{n,k}, s_{n,k+1}, \ldots, s_{n,k+\mu-1}$ converge (or diverge) at precisely the same rate.

8. We have assumed that the underlying space is an inner product space for the sake of simplicity. We can assume it to be a normed Banach space in general. In this case, we replace $\langle y, u\rangle$ by $Q(u)$, where $Q$ is a bounded linear functional on that space. With this, the analysis of this section is straightforward.


15.5 Vector extrapolation methods from scalar sequence transformations

So far we have seen how FGREP can be vectorized in two different ways. The approach we have described for FGREP can be extended to other scalar sequence transformations or convergence acceleration methods. This has been done by Brezinski [28] for his θ algorithm, by Osada [196, 197, 198, 199] for the transformation of Lubkin [177], for the ρ algorithm of Wynn [351], for the modified ε algorithm of Sablonnière [242], for the u and v transformations of Levin [172], and for the $d^{(m)}$ transformations of Levin and Sidi [173]. In his papers, Osada presents some convergence theory for these vectorizations and also illustrates their use with numerical examples. The convergence theory in Osada's works is developed along the lines of the convergence theory for the scalar methods. The convergence properties of the scalar sequence transformations mentioned here have been analyzed in several publications; see Sablonnière [243], Van Tuyl [331], Osada [195], and Sidi [259, 260, 277], for example. For an up-to-date account and the relevant references, see [278, Chapters 15, 19, 20].

In this section, we will discuss very briefly the vectorized versions of the θ and ρ algorithms. For the rest, we refer the reader to the works mentioned here. In what follows, the vector sequence $\{x_m\}$ is assumed to be real.

15.5.1 Vectorization of the θ algorithm

We start with the θ algorithm of Brezinski [28]. Given a scalar sequence $\{x_m\}$, this algorithm computes a two-dimensional array whose elements we denote by $\theta_k^{(n)}$ as follows:

1. Set
\[
\theta_{-1}^{(n)} = 0, \qquad \theta_0^{(n)} = x_n, \qquad n = 0, 1, \ldots.
\]

2. Compute the $\theta_k^{(n)}$ via the recursions
\[
\theta_{2k+1}^{(n)} = \theta_{2k-1}^{(n+1)} + D_{2k}^{(n)}, \qquad n, k = 0, 1, \ldots,
\]
\[
\theta_{2k+2}^{(n)} = \theta_{2k}^{(n+1)} - \frac{\Delta\theta_{2k}^{(n+1)}}{\Delta D_{2k+1}^{(n)}}\, D_{2k+1}^{(n)}, \qquad n, k = 0, 1, \ldots.
\]

Here $D_k^{(n)} = 1/\Delta\theta_k^{(n)} = (\theta_k^{(n+1)} - \theta_k^{(n)})^{-1}$. (Note that $\Delta$ operates on the upper index only.) The required approximations to the limit or antilimit of $\{x_m\}$ are the $\theta_{2k}^{(n)}$.

The following vectorized forms of the θ algorithm, called the generalized θ algorithm and the vector θ algorithm, were also given in [28]. They make use of different forms of vector inverses.

Generalized θ algorithm

1. Set
\[
\theta_{-1}^{(n)} = 0, \qquad \theta_0^{(n)} = x_n, \qquad n = 0, 1, \ldots.
\]

2. Compute the $\theta_k^{(n)}$ via the recursions
\[
\theta_{2k+1}^{(n)} = \theta_{2k-1}^{(n+1)} + \frac{y}{\langle y, \Delta\theta_{2k}^{(n)}\rangle}, \qquad n, k = 0, 1, \ldots,
\]
\[
\theta_{2k+2}^{(n)} = \theta_{2k}^{(n+1)} + \omega_k^{(n)} D_{2k+1}^{(n)}, \qquad n, k = 0, 1, \ldots.
\]

Here
\[
\omega_k^{(n)} = -\frac{\langle z, \Delta\theta_{2k}^{(n+1)}\rangle}{\langle z, \Delta D_{2k+1}^{(n)}\rangle}, \qquad
D_{2k+1}^{(n)} = \frac{\Delta\theta_{2k}^{(n)}}{\langle \Delta\theta_{2k+1}^{(n)}, \Delta\theta_{2k}^{(n)}\rangle},
\]
where $y$ and $z$ are two arbitrary real vectors such that the denominators do not vanish.

Vector θ algorithm

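1. Set
\[
\theta_{-1}^{(n)} = 0, \qquad \theta_0^{(n)} = x_n, \qquad n = 0, 1, \ldots.
\]

2. Compute the $\theta_k^{(n)}$ via the recursions
\[
\theta_{2k+1}^{(n)} = \theta_{2k-1}^{(n+1)} + \frac{\Delta\theta_{2k}^{(n)}}{\langle \Delta\theta_{2k}^{(n)}, \Delta\theta_{2k}^{(n)}\rangle}, \qquad n, k = 0, 1, \ldots,
\]
\[
\theta_{2k+2}^{(n)} = \theta_{2k}^{(n+1)} + \frac{\langle \Delta\theta_{2k+1}^{(n+1)}, \Delta^2\theta_{2k+1}^{(n)}\rangle}{\langle \Delta^2\theta_{2k+1}^{(n)}, \Delta^2\theta_{2k+1}^{(n)}\rangle} \cdot \Delta\theta_{2k}^{(n+1)}, \qquad n, k = 0, 1, \ldots.
\]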
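For reference, the scalar θ recursion stated at the beginning of this subsection can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `theta_table` and the list-of-columns layout are assumptions made here and do not come from the text. Column `j` holds the values $\theta_j^{(n)}$, $n = 0, 1, \ldots$, that can be formed from the supplied terms.

```python
def theta_table(x, kmax):
    """Columns of the scalar theta algorithm (hypothetical helper).

    Returns cols with cols[j][n] = theta_j^{(n)} for j = 0, ..., 2*kmax.
    """
    prev = [0.0] * (len(x) + 1)      # theta_{-1}^{(n)} = 0
    cur = [float(v) for v in x]      # theta_0^{(n)} = x_n
    cols = [cur]
    for j in range(2 * kmax):
        if j % 2 == 0:
            # odd column: theta_{j+1}^{(n)} = theta_{j-1}^{(n+1)} + D_j^{(n)}
            new = [prev[n + 1] + 1.0 / (cur[n + 1] - cur[n])
                   for n in range(len(cur) - 1)]
        else:
            # even column: theta_{j+1}^{(n)} = theta_{j-1}^{(n+1)}
            #   - (Delta theta_{j-1}^{(n+1)} / Delta D_j^{(n)}) * D_j^{(n)}
            new = []
            for n in range(len(cur) - 2):
                d0 = 1.0 / (cur[n + 1] - cur[n])       # D_j^{(n)}
                d1 = 1.0 / (cur[n + 2] - cur[n + 1])   # D_j^{(n+1)}
                new.append(prev[n + 1]
                           - (prev[n + 2] - prev[n + 1]) / (d1 - d0) * d0)
        cols.append(new)
        prev, cur = cur, new
    return cols


# Usage: theta_2 recovers the limit of a purely geometric sequence.
geometric = [2.0 + 3.0 * 0.5 ** n for n in range(4)]
print(theta_table(geometric, 1)[2][0])
```

The sketch performs no safeguarding against vanishing differences; a practical implementation would have to handle (near) zero denominators.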

15.5.2 Vectorization of the ρ algorithm

Given a scalar sequence $\{x_m\}$, the ρ algorithm of Wynn [351] computes a two-dimensional array whose elements we denote by $\rho_k^{(n)}$ as follows:

1. Set
\[
\rho_{-1}^{(n)} = 0, \qquad \rho_0^{(n)} = x_n, \qquad n = 0, 1, \ldots.
\]

2. Compute the $\rho_k^{(n)}$ via the recursions
\[
\rho_{k+1}^{(n)} = \rho_{k-1}^{(n+1)} + \frac{k+1}{\rho_k^{(n+1)} - \rho_k^{(n)}}, \qquad n, k = 0, 1, \ldots.
\]

The required approximations to the limit or antilimit of $\{x_m\}$ are the $\rho_{2k}^{(n)}$. For an up-to-date treatment of the ρ algorithm, and the relevant references, see Sidi [278]. The ρ algorithm has been vectorized by Osada [196] in two different ways, resulting in the topological ρ algorithm and the vector ρ algorithm.
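The scalar ρ recursion above is short enough to sketch directly. The following Python fragment is a minimal illustration; the name `rho_table` and the column layout are assumptions made here, not notation from the text.

```python
def rho_table(x, jmax):
    """cols[j][n] = rho_j^{(n)} for j = 0, ..., jmax (hypothetical helper)."""
    prev = [0.0] * (len(x) + 1)    # rho_{-1}^{(n)} = 0
    cur = [float(v) for v in x]    # rho_0^{(n)} = x_n
    cols = [cur]
    for k in range(jmax):
        # rho_{k+1}^{(n)} = rho_{k-1}^{(n+1)} + (k+1)/(rho_k^{(n+1)} - rho_k^{(n)})
        new = [prev[n + 1] + (k + 1) / (cur[n + 1] - cur[n])
               for n in range(len(cur) - 1)]
        cols.append(new)
        prev, cur = cur, new
    return cols


# Usage: x_n = (2n + 1)/(n + 3) tends to 2, and rho_2^{(0)} reproduces it.
x = [(2 * n + 1) / (n + 3) for n in range(3)]
print(rho_table(x, 2)[2][0])
```

For this particular sequence, which is a ratio of two first-degree polynomials in $n$, the value $\rho_2^{(0)}$ agrees with the limit up to rounding.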

Topological ρ algorithm

1. Set
\[
\rho_{-1}^{(n)} = 0, \qquad \rho_0^{(n)} = x_n, \qquad n = 0, 1, \ldots.
\]

2. Compute the $\rho_k^{(n)}$ via the recursions
\[
\rho_{2k+1}^{(n)} = \rho_{2k-1}^{(n+1)} + \frac{(2k+1)\, y}{\langle y, \Delta\rho_{2k}^{(n)}\rangle}, \qquad n, k = 0, 1, \ldots,
\]
\[
\rho_{2k+2}^{(n)} = \rho_{2k}^{(n+1)} + \frac{(2k+2)\, \Delta\rho_{2k}^{(n+1)}}{\langle \Delta\rho_{2k}^{(n+1)}, \Delta\rho_{2k+1}^{(n)}\rangle}, \qquad n, k = 0, 1, \ldots.
\]

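Vector ρ algorithm

1. Set
\[
\rho_{-1}^{(n)} = 0, \qquad \rho_0^{(n)} = x_n, \qquad n = 0, 1, \ldots.
\]

2. Compute the $\rho_k^{(n)}$ via the recursions
\[
\rho_{k+1}^{(n)} = \rho_{k-1}^{(n+1)} + \frac{(k+1)\, \Delta\rho_k^{(n)}}{\langle \Delta\rho_k^{(n)}, \Delta\rho_k^{(n)}\rangle}, \qquad n, k = 0, 1, \ldots.
\]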
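The vector ρ recursion replaces the scalar reciprocal by the vector inverse $v/\langle v, v\rangle$. A minimal Python sketch for real vectors stored as plain lists follows; the names `vector_inverse` and `vector_rho` are illustrative assumptions, and the Euclidean inner product is assumed for $\langle\cdot,\cdot\rangle$.

```python
def vector_inverse(v):
    """Return v / <v, v> with the Euclidean inner product (assumed here)."""
    norm2 = sum(c * c for c in v)
    return [c / norm2 for c in v]


def vector_rho(xs, jmax):
    """cols[j][n] = rho_j^{(n)} (a vector) for j = 0, ..., jmax."""
    dim = len(xs[0])
    prev = [[0.0] * dim for _ in range(len(xs) + 1)]   # rho_{-1}^{(n)} = 0
    cur = [[float(c) for c in v] for v in xs]          # rho_0^{(n)} = x_n
    cols = [cur]
    for k in range(jmax):
        new = []
        for n in range(len(cur) - 1):
            diff = [b - a for a, b in zip(cur[n], cur[n + 1])]  # Delta rho_k^{(n)}
            inv = vector_inverse(diff)
            new.append([p + (k + 1) * c for p, c in zip(prev[n + 1], inv)])
        cols.append(new)
        prev, cur = cur, new
    return cols


# Usage: x_n = s + w/(n + 3) with s = (1, 2), w = (3, -1).
xs = [[1.0 + 3.0 / (n + 3), 2.0 - 1.0 / (n + 3)] for n in range(3)]
print(vector_rho(xs, 2)[2][0])
```

For the model sequence $x_n = s + w/(n+3)$ used in the usage lines, a direct calculation with the recursion shows that $\rho_2^{(n)} = s$, so the printed vector agrees with $s$ up to rounding.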


Chapter 16

Vector-Valued Rational Interpolation Methods

16.1 Introduction

In Chapter 12, we derived vector-valued rational approximations to functions $f(z)$, $f : \mathbb{C} \to \mathbb{C}^N$, by applying the vector extrapolation methods MPE, MMPE, and TEA to their Maclaurin series expansions, which were assumed to be known. From Theorem 12.5, we concluded that the rational approximations $s_{n,k}(z)$ (with $\nu = 0$) obtained in this way actually interpolate the functions $f(z)$ at only one point, namely, at $z = 0$, $n + k$ times in the Hermite sense. In this chapter, we would like to briefly discuss vector-valued rational approximations that interpolate the functions $f(z)$ at different points in the complex $z$-plane, again in the Hermite sense. We would like these approximations to generalize those of Chapter 12 in the sense that they reduce to the latter when all points of interpolation coincide.

In what follows, we present the development of the rational interpolation procedures and discuss some of their algebraic properties. These topics are taken from Sidi [280, 281]. The interpolants defined here have interesting de Montessus–type convergence theories, which are given in Sidi [283, 284, 287, 291]; we summarize this topic here by stating the important results. For a detailed treatment, which is beyond the scope of this book, we refer the reader to the papers mentioned above.

Finally, as mentioned in [280], the procedures we propose here for functions $f : \mathbb{C} \to \mathbb{C}^N$ can be extended to the case in which $f$ maps $\mathbb{C}$ into a general linear space, exactly as is shown in [269, Section 6]. This amounts to introducing the norm defined in that space when it is an inner product space such as a Hilbert space (for IMPE), and to introducing some bounded linear functionals (for IMMPE and ITEA) (see Section 11.3.3 of Chapter 11). With these, all the developments we perform remain unchanged as well. We refer the reader to [269] for details. See also Sections 1.6, 6.9, and 14.3.

16.2 Development of vector-valued rational interpolation methods

16.2.1 General preliminaries

Let $z$ be a complex variable, and let $f(z)$ be a vector-valued function such that $f : \mathbb{C} \to \mathbb{C}^N$. Assume that $f(z)$ is defined on a bounded open set $\Omega \subset \mathbb{C}$, and consider the problem of interpolating $f(z)$ at some points $\xi_1, \xi_2, \ldots$ in this set. We do not assume that the $\xi_i$ are necessarily distinct. The general picture is described in the next paragraph.

Let $a_1, a_2, \ldots$ be distinct complex numbers, and let
\[
\begin{aligned}
\xi_1 = \xi_2 = \cdots = \xi_{r_1} &= a_1, \\
\xi_{r_1+1} = \xi_{r_1+2} = \cdots = \xi_{r_1+r_2} &= a_2, \\
\xi_{r_1+r_2+1} = \xi_{r_1+r_2+2} = \cdots = \xi_{r_1+r_2+r_3} &= a_3, \\
&\ \ \text{and so on.}
\end{aligned} \tag{16.1}
\]

Let $g_{m,n}(z)$ be the vector-valued polynomial (of degree at most $n - m$) that interpolates $f(z)$ at the points $\xi_m, \xi_{m+1}, \ldots, \xi_n$ in the Hermite sense. Thus, in Newtonian form, this polynomial is given as (see Stoer and Bulirsch [313, Chapter 2] or Atkinson [9, Chapter 3], for example)
\[
\begin{aligned}
g_{m,n}(z) = {} & f[\xi_m] + f[\xi_m, \xi_{m+1}](z - \xi_m) + f[\xi_m, \xi_{m+1}, \xi_{m+2}](z - \xi_m)(z - \xi_{m+1}) \\
& + \cdots + f[\xi_m, \xi_{m+1}, \ldots, \xi_n](z - \xi_m)(z - \xi_{m+1}) \cdots (z - \xi_{n-1}).
\end{aligned} \tag{16.2}
\]

Here, $f[\xi_r, \xi_{r+1}, \ldots, \xi_{r+s}] \in \mathbb{C}^N$ is the divided difference of order $s$ of $f(z)$ over the set of points $\{\xi_r, \xi_{r+1}, \ldots, \xi_{r+s}\}$. The $f[\xi_r, \xi_{r+1}, \ldots, \xi_{r+s}]$ are defined and can be computed, as in the scalar case, by the recursion relations
\[
f[\xi_r, \xi_{r+1}, \ldots, \xi_{r+s}] = \frac{f[\xi_r, \xi_{r+1}, \ldots, \xi_{r+s-1}] - f[\xi_{r+1}, \xi_{r+2}, \ldots, \xi_{r+s}]}{\xi_r - \xi_{r+s}} \quad \text{if } \xi_r \neq \xi_{r+s}, \qquad r = 1, 2, \ldots, \quad s = 1, 2, \ldots, \tag{16.3}
\]
with the initial conditions
\[
f[\xi_r] = f(\xi_r), \qquad r = 1, 2, \ldots. \tag{16.4}
\]
Note that, if $\xi_r = \xi_{r+1} = \cdots = \xi_{r+s} = a$, the right-hand side of (16.3) is defined via a limiting process, with the result
\[
f[\xi_r, \xi_{r+1}, \ldots, \xi_{r+s}] = f[a, a, \ldots, a] = \frac{f^{(s)}(a)}{s!} \tag{16.5}
\]
when $f(z)$ has $s$ continuous derivatives at $z = a$.⁷² For simplicity of notation, we let
\[
d_{m,n} = f[\xi_m, \xi_{m+1}, \ldots, \xi_n], \qquad n \ge m. \tag{16.6}
\]
We also define the scalar polynomials $\psi_{m,n}(z)$ via
\[
\psi_{m,n}(z) = \prod_{r=m}^{n} (z - \xi_r), \quad n \ge m \ge 1; \qquad \psi_{m,m-1}(z) = 1, \quad m \ge 1. \tag{16.7}
\]
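To make (16.2)–(16.4) and (16.6) concrete, here is a small Python sketch for the case of distinct interpolation points. The function names `divided_differences` and `newton_eval` are illustrative assumptions; vectors are plain lists, and the indices follow the 1-based convention of the text.

```python
def divided_differences(xi, f):
    """Return d with d[(m, n)] = f[xi_m, ..., xi_n] (1-based, distinct points)."""
    count = len(xi)
    d = {(m, m): list(f(xi[m - 1])) for m in range(1, count + 1)}  # (16.4)
    for s in range(1, count):
        for m in range(1, count - s + 1):
            n = m + s
            h = xi[m - 1] - xi[n - 1]                              # xi_m - xi_n
            d[(m, n)] = [(a - b) / h                               # (16.3)
                         for a, b in zip(d[(m, n - 1)], d[(m + 1, n)])]
    return d


def newton_eval(d, xi, m, n, z):
    """Evaluate g_{m,n}(z) = sum_{i=m}^{n} d_{m,i} psi_{m,i-1}(z), cf. (16.2)."""
    total = [0.0] * len(d[(m, m)])
    psi = 1.0                                # psi_{m,m-1}(z) = 1
    for i in range(m, n + 1):
        total = [t + psi * c for t, c in zip(total, d[(m, i)])]
        psi *= z - xi[i - 1]                 # becomes psi_{m,i}(z)
    return total


# Usage with f(z) = (z^2, z + 1) and points 0, 1, 2.
xi = [0.0, 1.0, 2.0]
f = lambda z: [z * z, z + 1.0]
d = divided_differences(xi, f)
print(d[(1, 3)], newton_eval(d, xi, 1, 3, 5.0))
```

Because both components of this $f$ are polynomials of degree at most 2, the interpolant $g_{1,3}(z)$ on three points coincides with $f(z)$. Confluent points, as in (16.5), would require derivative values and are not handled by this sketch.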

72 Actually, provided $f(z)$ is continuously differentiable $s$ times in a neighborhood $K(a)$ of $a$, we have
\[
f[\eta_0, \eta_1, \ldots, \eta_s] = \frac{f^{(s)}(\hat{\eta}(a))}{s!} \quad \text{for some } \hat{\eta}(a) \in K(a),
\]
and also
\[
\lim_{\eta_0, \eta_1, \ldots, \eta_s \to a} f[\eta_0, \eta_1, \ldots, \eta_s] = f[a, a, \ldots, a] = \frac{f^{(s)}(a)}{s!}.
\]

With this notation, we can rewrite (16.2) in the form
\[
g_{m,n}(z) = \sum_{i=m}^{n} d_{m,i}\, \psi_{m,i-1}(z), \tag{16.8}
\]
and the error in $g_{m,n}(z)$ is given by
\[
f(z) - g_{m,n}(z) = \hat{d}_{m,n}(z)\, \psi_{m,n}(z), \qquad \hat{d}_{m,n}(z) = f[\xi_m, \xi_{m+1}, \ldots, \xi_n, z]. \tag{16.9}
\]

16.2.2 Construction of rational interpolants

The vector-valued rational interpolants to the function $f(z)$ we develop here are all of the general form
\[
r_{p,k}(z) = \frac{u(z)}{v(z)} = \frac{\sum_{j=0}^{k} c_j\, \psi_{1,j}(z)\, g_{j+1,p}(z)}{\sum_{j=0}^{k} c_j\, \psi_{1,j}(z)}, \tag{16.10}
\]
where $c_0, c_1, \ldots, c_k$ are, for the time being, arbitrary complex scalars with $c_k \neq 0$ and $p$ is an integer greater than $k$. Obviously, $u(z)$ is a vector-valued polynomial of degree at most $p - 1$ and $v(z)$ is a scalar polynomial of degree $k$. The following lemma says that, whether the $\xi_i$ are distinct or not, $r_{p,k}(z)$ interpolates $f(z)$ at the points $\xi_1, \ldots, \xi_p$ in the Hermite sense. See [280, Lemma 2.1 and Lemma 2.3].

Lemma 16.1. Let the vector-valued rational function $r_{p,k}(z)$ be as in (16.10), and assume that $v(\xi_i) \neq 0$, $i = 1, 2, \ldots, p$.

(i) When the $\xi_i$ are distinct, $r_{p,k}(z)$ interpolates $f(z)$ at the points $\xi_1, \xi_2, \ldots, \xi_p$ in the regular sense:
\[
r_{p,k}(\xi_i) = f(\xi_i), \qquad i = 1, \ldots, p. \tag{16.11}
\]

(ii) When the $\xi_i$ are not necessarily distinct and are ordered as in (16.1), $r_{p,k}(z)$ interpolates $f(z)$ at the points $\xi_1, \xi_2, \ldots, \xi_p$ in the Hermite sense as follows: Let $t$ and $\rho$ be the unique integers satisfying $t \ge 0$ and $0 \le \rho < r_{t+1}$ for which $p = \sum_{i=1}^{t} r_i + \rho$. Then
\[
r_{p,k}^{(s)}(a_i) = f^{(s)}(a_i) \quad \text{for } s = 0, 1, \ldots, r_i - 1 \text{ when } i = 1, \ldots, t \text{ and for } s = 0, 1, \ldots, \rho - 1 \text{ when } i = t + 1. \tag{16.12}
\]

(Of course, when $\rho = 0$, there is no interpolation at $a_{t+1}$.)

Proof. Subtracting $r_{p,k}(z)$ as given in (16.10) from $f(z)$, we obtain
\[
f(z) - r_{p,k}(z) = \frac{1}{v(z)} \Biggl[ \sum_{j=0}^{k} c_j\, \psi_{1,j}(z) \bigl[ f(z) - g_{j+1,p}(z) \bigr] \Biggr], \tag{16.13}
\]
which, by (16.9) and by $\psi_{m,n}(z)\, \psi_{n+1,q}(z) = \psi_{m,q}(z)$, becomes
\[
f(z) - r_{p,k}(z) = \frac{\psi_{1,p}(z)}{v(z)} \sum_{j=0}^{k} c_j\, \hat{d}_{j+1,p}(z). \tag{16.14}
\]
The result follows by setting $z = \xi_i$ and invoking $\psi_{1,p}(\xi_i) = 0$ while recalling that $v(\xi_i) \neq 0$, with $1 \le i \le p$.

Remark: It must be noted that the condition $v(\xi_i) \neq 0$, $i = 1, \ldots, p$, features throughout. Because $k < p$ and because $p$ can be arbitrarily large, this condition might look too restrictive at first. This is not the case, however. Indeed, the condition $v(\xi_i) \neq 0$, $i = 1, \ldots, p$, is natural for the following reason: Normally, we take the points of interpolation $\xi_i$ in a set $\Omega$ on which the function $f(z)$ is regular. If $r_{p,k}(z)$ is to approximate $f(z)$, it should also be a regular function over $\Omega$ and hence free of singularities there. Since the singularities of $r_{p,k}(z)$ are the zeros of $v(z)$, this implies that $v(z)$ should not vanish on $\Omega$. [We expect the singularities of $r_{p,k}(z)$, that is, the zeros of $v(z)$, to be close to the singularities of $f(z)$, which are outside the set $\Omega$.]

So far, the $c_j$ in (16.10) are arbitrary. Of course, the quality of $r_{p,k}(z)$ as an approximation to $f(z)$ depends very strongly on the choice of the $c_j$. Naturally, the $c_j$ must depend on $f(z)$ and on the $\xi_i$. Fixing the integers $k$ and $p$ such that $p \ge k + 1$, we determine the $c_j$ as follows:

1. With the normalization $c_k = 1$, we determine $c_0, c_1, \ldots, c_{k-1}$ as the solution to the problem
\[
\min_{c_0, c_1, \ldots, c_{k-1}} \Biggl\| \sum_{j=0}^{k} c_j\, d_{j+1,p+1} \Biggr\| \quad \text{subject to } c_k = 1, \tag{16.15}
\]
where $\|\cdot\|$ stands for an arbitrary vector norm in $\mathbb{C}^N$. With the $l_1$- and $l_\infty$-norms, the optimization problem can be solved using linear programming. With the $l_2$-norm, it becomes a least-squares problem, which can be solved numerically via standard techniques. Of course, the inner product $(\cdot\,, \cdot)$ that defines the $l_2$-norm [that is, $\|z\| = \sqrt{(z, z)}$] is not restricted to the standard inner product $(y, z) = y^* z$; it can be given by $(y, z) = y^* M z$, where $M$ is a Hermitian positive definite matrix. We let $\|\cdot\|$ in (16.15) be an $l_2$-norm; in this case we can determine $c_0, c_1, \ldots, c_{k-1}$ via the solution of the linear system (of normal equations)
\[
\Biggl( d_{i,p+1},\ \sum_{j=0}^{k} c_j\, d_{j+1,p+1} \Biggr) = 0, \quad i = 1, \ldots, k; \qquad c_k = 1,
\]
which we also write as
\[
\sum_{j=0}^{k} (d_{i,p+1}, d_{j+1,p+1})\, c_j = 0, \quad i = 1, \ldots, k; \qquad c_k = 1. \tag{16.16}
\]

We denote the resulting rational interpolation procedure IMPE.

2. Again, with the normalization $c_k = 1$, we determine $c_0, c_1, \ldots, c_{k-1}$ via the solution of the linear system
\[
\Biggl( q_i,\ \sum_{j=0}^{k} c_j\, d_{j+1,p+1} \Biggr) = 0, \quad i = 1, \ldots, k; \qquad c_k = 1,
\]
which we also write as
\[
\sum_{j=0}^{k} (q_i, d_{j+1,p+1})\, c_j = 0, \quad i = 1, \ldots, k; \qquad c_k = 1, \tag{16.17}
\]
where $q_1, \ldots, q_k$ are linearly independent vectors in $\mathbb{C}^N$. Note that we can choose the vectors $q_1, \ldots, q_k$ to be independent of $p$ or to depend on $p$. We denote the resulting rational interpolation procedure IMMPE.

3. Again, with the normalization $c_k = 1$, we determine $c_0, c_1, \ldots, c_{k-1}$ via the solution of the linear system
\[
\Biggl( q,\ \sum_{j=0}^{k} c_j\, d_{j+1,p+i} \Biggr) = 0, \quad i = 1, 2, \ldots, k; \qquad c_k = 1,
\]
which we also write as
\[
\sum_{j=0}^{k} (q, d_{j+1,p+i})\, c_j = 0, \quad i = 1, 2, \ldots, k; \qquad c_k = 1, \tag{16.18}
\]

where $q$ is a nonzero vector in $\mathbb{C}^N$. We denote the resulting rational interpolation procedure ITEA.

Remarks:

1. The determination of the $c_j$ as in (16.15) is that adopted in [281], and it differs from the one originally given in [280]. Here and in [281] we impose the constraint $c_k = 1$; in [280], the constraint is $c_0 = 1$. Under the present constraint, $c_k = 1$, the denominator polynomials $v(z)$ for IMMPE and for ITEA given here turn out to be the same as those given in [280], up to a constant multiplicative factor. The denominator polynomial $v(z)$ for IMPE here is slightly different from the corresponding one in [280].

2. Clearly, to compute $r_{p,k}(z)$, we need to know $k$ scalars and $p$ vectors. In view of this, the fact that the definition of $r_{p,k}(z)$ presented above requires only knowledge of $k$ scalars, independent of what $p$ is, is quite remarkable.

3. To be acceptable as valid interpolation procedures, IMPE, IMMPE, and ITEA must satisfy a few important criteria:
• The functions $r_{p,k}(z)$ must be independent of the way the points of interpolation $\xi_i$ are ordered. From their definition so far, they seem to depend on the ordering $\xi_1, \xi_2, \ldots$.
• The interpolation procedures should be able to reproduce (recover) rational functions.
• They should have at least a de Montessus–type convergence theory similar to the ones pertaining to Padé approximants and to SMPE, SMMPE, and STEA.
Fortunately, these criteria are met by the interpolation procedures of this chapter, as we discuss in what follows.
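As an illustration of how (16.10) and (16.17) fit together, the following Python sketch treats the simplest case $k = 1$ with distinct points, where the linear system reduces to the single equation $c_0\,(q, d_{1,p+1}) + (q, d_{2,p+1}) = 0$ (for $k = 1$, IMMPE with $q_1 = q$ and ITEA coincide). All function names are assumptions made for this sketch, the vectors are real lists, and the inner product is taken to be the standard one.

```python
def divided_differences(xi, f):
    """d[(m, n)] = f[xi_m, ..., xi_n], 1-based indices, distinct points."""
    count = len(xi)
    d = {(m, m): list(f(xi[m - 1])) for m in range(1, count + 1)}
    for s in range(1, count):
        for m in range(1, count - s + 1):
            n = m + s
            h = xi[m - 1] - xi[n - 1]
            d[(m, n)] = [(a - b) / h
                         for a, b in zip(d[(m, n - 1)], d[(m + 1, n)])]
    return d


def newton_eval(d, xi, m, n, z):
    """g_{m,n}(z) in the Newtonian form (16.2)."""
    total = [0.0] * len(d[(m, m)])
    psi = 1.0
    for i in range(m, n + 1):
        total = [t + psi * c for t, c in zip(total, d[(m, i)])]
        psi *= z - xi[i - 1]
    return total


def rational_interpolant_k1(xi, f, p, q, z):
    """r_{p,1}(z) from (16.10), with c_1 = 1 and c_0 from (16.17), k = 1."""
    d = divided_differences(xi[:p + 1], f)        # needs xi_1, ..., xi_{p+1}
    dot = lambda a, b: sum(s * t for s, t in zip(a, b))
    c0 = -dot(q, d[(2, p + 1)]) / dot(q, d[(1, p + 1)])
    psi11 = z - xi[0]                             # psi_{1,1}(z); psi_{1,0} = 1
    g1 = newton_eval(d, xi, 1, p, z)              # g_{1,p}(z)
    g2 = newton_eval(d, xi, 2, p, z)              # g_{2,p}(z)
    v = c0 + psi11                                # denominator v(z)
    return [(c0 * a + psi11 * b) / v for a, b in zip(g1, g2)]


# Usage: f(z) = (1, z)/(z - 3) has a degree-1 denominator, p = 3.
f = lambda z: [1.0 / (z - 3.0), z / (z - 3.0)]
xi = [0.0, 1.0, 2.0, -1.0]
print(rational_interpolant_k1(xi, f, 3, [1.0, 1.0], 0.5), f(0.5))
```

In the usage lines the test function is itself a rational function with a linear denominator, so the two printed vectors agree up to rounding; a general $k$ would require solving a $k \times k$ linear system instead of the single division used here.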


Theorem 16.2 below concerns the limiting case of $\xi_i \to 0$ for all $i$ simultaneously and amply justifies the relevance and usefulness of our rational interpolation procedures.

Theorem 16.2. Assuming that $f(z)$ is sufficiently differentiable at $z = 0$, let
\[
u_m = \frac{f^{(m)}(0)}{m!}, \qquad x_m(z) = \sum_{i=0}^{m-1} u_i z^i, \qquad m = 0, 1, \ldots. \tag{16.19}
\]
If $\xi_i = 0$ for all $i$, then $r_{p,k}(z)$ in (16.10) becomes (with $p = n + k$)
\[
r_{n+k,k}(z) = \frac{\sum_{j=0}^{k} c_j\, z^j\, x_{n+k-j}(z)}{\sum_{j=0}^{k} c_j\, z^j} = \frac{\sum_{j=0}^{k} \tilde{c}_j\, z^{k-j}\, x_{n+j}(z)}{\sum_{j=0}^{k} \tilde{c}_j\, z^{k-j}}, \tag{16.20}
\]
where the scalars $\tilde{c}_j = c_{k-j}$ satisfy
\[
\sum_{j=0}^{k} u_{i,j}\, \tilde{c}_j = 0, \qquad i = 1, \ldots, k, \tag{16.21}
\]
with
\[
u_{i,j} =
\begin{cases}
(u_{n+i}, u_{n+j}) & \text{for IMPE,} \\
(q_i, u_{n+j}) & \text{for IMMPE,} \\
(q, u_{n+i+j-1}) & \text{for ITEA.}
\end{cases} \tag{16.22}
\]

Proof. By (16.5)–(16.8), and $\xi_i = 0$ for all $i$, we have
\[
\psi_{1,i}(z) = z^i, \qquad d_{r,r+i} = \frac{f^{(i)}(0)}{i!} = u_i, \qquad g_{r,r+s}(z) = \sum_{i=0}^{s} \frac{f^{(i)}(0)}{i!} z^i = x_{s+1}(z)
\]
for all $i, r, s$. Upon substituting these into (16.10) and setting $\tilde{c}_j = c_{k-j}$, we obtain
\[
r_{n+k,k}(z) = \frac{\sum_{j=0}^{k} c_j\, z^j\, x_{n+k-j}(z)}{\sum_{j=0}^{k} c_j\, z^j} = \frac{\sum_{j=0}^{k} \tilde{c}_j\, z^{k-j}\, x_{n+j}(z)}{\sum_{j=0}^{k} \tilde{c}_j\, z^{k-j}},
\]
which is precisely the form given in (12.27) of Theorem 12.3. The proof can be completed by rewriting the equations in (16.16), (16.17), and (16.18) in terms of the $\tilde{c}_j$ and showing that they reduce to those in (16.21) with (16.22). We leave the details to the reader.

Comparing (16.21) and (16.22) with (12.28) and (12.29), we observe that, when $\xi_i = 0$ for all $i$, $r_{n+k,k}(z)$ from IMMPE and ITEA become $s_{n,k}(z)$ from SMMPE and STEA, respectively; $r_{n+k,k}(z)$ from IMPE and $s_{n,k}(z)$ from SMPE are only slightly different from each other. This observation suggests that the approach to the vector-valued rational interpolation problem presented here is valid. More evidence for this is provided by the convergence theory we present later in this chapter.

16.3 Algebraic properties of interpolants

16.3.1 Projection properties

With the above definitions, we can prove the next interesting result concerning IMPE, IMMPE, and ITEA.

Theorem 16.3. In addition to interpolating $f(z)$ at $z = \xi_i$, $i = 1, \ldots, p$, the functions $r_{p,k}(z)$ from IMPE, IMMPE, and ITEA have the following projection properties:
\[
\bigl( d_{i,p+1},\ f(z) - r_{p,k}(z) \bigr)\Big|_{z=\xi_{p+1}} = 0, \quad i = 1, \ldots, k, \quad \text{for IMPE.} \tag{16.23}
\]
\[
\bigl( q_i,\ f(z) - r_{p,k}(z) \bigr)\Big|_{z=\xi_{p+1}} = 0, \quad i = 1, \ldots, k, \quad \text{for IMMPE.} \tag{16.24}
\]
\[
\bigl( q,\ f(z) - r_{p,k}(z) \bigr)\Big|_{z=\xi_{p+i}} = 0, \quad i = 1, \ldots, k, \quad \text{for ITEA.} \tag{16.25}
\]

Proof. We start with (16.8) and (16.9), setting $m = j + 1$ and $n > p$ there. We have
\[
f(z) = \sum_{s=j+1}^{n} d_{j+1,s}\, \psi_{j+1,s-1}(z) + \hat{d}_{j+1,n}(z)\, \psi_{j+1,n}(z),
\]
which we write as
\[
f(z) = g_{j+1,p}(z) + \sum_{s=p+1}^{n} d_{j+1,s}\, \psi_{j+1,s-1}(z) + \hat{d}_{j+1,n}(z)\, \psi_{j+1,n}(z).
\]
Substituting this into (16.13) and rearranging, we obtain
\[
f(z) - r_{p,k}(z) = \frac{1}{v(z)} \Biggl[ \sum_{s=p+1}^{n} \psi_{1,s-1}(z) \sum_{j=0}^{k} c_j\, d_{j+1,s} + \psi_{1,n}(z) \sum_{j=0}^{k} c_j\, \hat{d}_{j+1,n}(z) \Biggr]. \tag{16.26}
\]
Setting $z = \xi_m$ and realizing that $\psi_{1,n}(\xi_m) = 0$ only when $1 \le m \le n$, we obtain
\[
f(\xi_m) - r_{p,k}(\xi_m) = \frac{1}{v(\xi_m)} \sum_{s=p+1}^{n} \psi_{1,s-1}(\xi_m) \sum_{j=0}^{k} c_j\, d_{j+1,s}, \qquad m \le n. \tag{16.27}
\]

Letting n = m = p + 1 in (16.27) and taking the inner product of both sides with d i , p+1 , i = 1, . . . , k, and invoking (16.16), we obtain (16.23). Letting n = m = p + 1 in (16.27) and taking the inner product of both sides with q i , i = 1, . . . , k, and invoking (16.17), we obtain (16.24). Letting n = p + k and m = p + i in (16.27) and taking the inner product of both sides with q, for i = 1, . . . , k, in this order, and invoking (16.18), we obtain (16.25).

Remark: When $f(z)$ is a scalar function (that is, $N = 1$), by taking $(q, f(z)) = f(z)$, we realize that the (scalar) function $r_{p,k}(z)$ generated by ITEA interpolates $f(z)$ at the $p + k$ points $\xi_1, \xi_2, \ldots, \xi_{p+k}$. Since the number of the parameters of $r_{p,k}(z)$ is precisely $p + k$, we conclude that $r_{p,k}(z)$ is nothing but the solution to the Cauchy rational interpolation problem for $f(z)$ at $\xi_1, \xi_2, \ldots, \xi_{p+k}$, by Theorem 16.3. This provides another justification for the approach to vector-valued rational interpolation presented here. Again, we need to know only $k$ scalar parameters to compute this solution via the simple expression given in (16.10), whatever the value of $p$. (For a discussion of the Cauchy interpolation problem, see Baker and Graves-Morris [14, Chapter 7], for example.)

16.3.2 Determinant representations

The constructions given above enable us to develop determinant representations for the $r_{p,k}(z)$ that have proved to be very useful in the study of their properties. This is the subject of the next theorem, whose proof we omit.

Theorem 16.4. Let the vector-valued rational interpolant $r_{p,k}(z)$ to $f(z)$ be given by
\[
r_{p,k}(z) = \frac{u(z)}{v(z)} = \frac{\sum_{j=0}^{k} c_j\, \psi_{1,j}(z)\, g_{j+1,p}(z)}{\sum_{j=0}^{k} c_j\, \psi_{1,j}(z)}, \tag{16.28}
\]
such that $r_{p,k}(\xi_i) = f(\xi_i)$, $i = 1, \ldots, p$, and the scalars $c_j$ are defined by (16.16) for IMPE, by (16.17) for IMMPE, and by (16.18) for ITEA. Then $r_{p,k}(z)$ has a determinant representation of the form
\[
r_{p,k}(z) =
\frac{\begin{vmatrix}
\psi_{1,0}(z)\, g_{1,p}(z) & \psi_{1,1}(z)\, g_{2,p}(z) & \cdots & \psi_{1,k}(z)\, g_{k+1,p}(z) \\
u_{1,0} & u_{1,1} & \cdots & u_{1,k} \\
u_{2,0} & u_{2,1} & \cdots & u_{2,k} \\
\vdots & \vdots & & \vdots \\
u_{k,0} & u_{k,1} & \cdots & u_{k,k}
\end{vmatrix}}
{\begin{vmatrix}
\psi_{1,0}(z) & \psi_{1,1}(z) & \cdots & \psi_{1,k}(z) \\
u_{1,0} & u_{1,1} & \cdots & u_{1,k} \\
u_{2,0} & u_{2,1} & \cdots & u_{2,k} \\
\vdots & \vdots & & \vdots \\
u_{k,0} & u_{k,1} & \cdots & u_{k,k}
\end{vmatrix}}, \tag{16.29}
\]
where
\[
u_{i,j} =
\begin{cases}
(d_{i,p+1}, d_{j+1,p+1}) & \text{for IMPE,} \\
(q_i, d_{j+1,p+1}) & \text{for IMMPE,} \\
(q, d_{j+1,p+i}) & \text{for ITEA.}
\end{cases} \tag{16.30}
\]
Here, the numerator determinant is vector-valued and is defined by its expansion with respect to its first row. That is, if $M_j$ is the cofactor of the term $\psi_{1,j}(z)$ in the denominator determinant, then
\[
r_{p,k}(z) = \frac{\sum_{j=0}^{k} M_j\, \psi_{1,j}(z)\, g_{j+1,p}(z)}{\sum_{j=0}^{k} M_j\, \psi_{1,j}(z)}. \tag{16.31}
\]

16.3.3 Symmetry properties

Going back to the way the $r_{p,k}(z)$ are defined, we realize that their definition relies on the ordering $\xi_1, \xi_2, \ldots$ of the points of interpolation. Now, to be acceptable as valid interpolants, the functions $r_{p,k}(z)$ must be independent of the way these points are ordered. That is, we should be able to obtain the same $r_{p,k}(z)$ for every permutation of the points of interpolation. In other words, $r_{p,k}(z)$ should be a symmetric function of $\xi_1, \xi_2, \ldots, \xi_p$.⁷³ We address this issue next. We start with the following general lemma.

Lemma 16.5. Define $r(z)$ to be a vector-valued rational function of the form $r(z) = u(z)/v(z)$, where $u(z)$ is a vector-valued polynomial of degree at most $p - 1$ and $v(z)$ is a scalar polynomial of degree $k$. Assume that $v(\xi_i) \neq 0$, $i = 1, \ldots, p$, and that $r(\xi_i) = f(\xi_i)$, $i = 1, 2, \ldots, p$. Then $r(z)$ is a symmetric function of $\xi_1, \ldots, \xi_p$ provided $v(z)$ is too.

Proof. Because $v(z)$ is a symmetric function of $\xi_1, \ldots, \xi_p$, $r(z)$ will also be a symmetric function of $\xi_1, \ldots, \xi_p$ provided $u(z)$ is too. Now, $u(z) = v(z) r(z)$. Therefore,
\[
u(\xi_i) = v(\xi_i)\, r(\xi_i) = v(\xi_i)\, f(\xi_i), \qquad i = 1, \ldots, p; \tag{16.32}
\]
that is, $u(z)$ interpolates $v(z) f(z)$ at the points $\xi_1, \ldots, \xi_p$. Being a (vector-valued) polynomial of degree at most $p - 1$, $u(z)$ is the unique polynomial of interpolation to $v(z) f(z)$ at $\xi_1, \ldots, \xi_p$. Hence, $u(z)$ is a symmetric function of $\xi_1, \ldots, \xi_p$. Consequently, so is $r(z) = u(z)/v(z)$.

In the next lemma, we address the symmetry properties of the denominator polynomials $v(z)$ of the $r_{p,k}(z)$. We skip the proof of this lemma since it is quite complicated; we refer the reader to [281, Lemmas 3.4, 3.6, 3.8].

Lemma 16.6. The denominator polynomials $v(z)$ of $r_{p,k}(z)$ are symmetric functions of all the $\xi_i$ that are used in their construction. Thus, we have the following:
1. For IMPE, $v(z)$ is a symmetric function of $\xi_1, \xi_2, \ldots, \xi_{p+1}$.
2. For IMMPE, $v(z)$ is a symmetric function of $\xi_1, \xi_2, \ldots, \xi_{p+1}$.
3. For ITEA, $v(z)$ is a symmetric function of $\xi_1, \xi_2, \ldots, \xi_{p+k}$.

Combining Lemma 16.5 and Lemma 16.6, we have the following main result.

Theorem 16.7. Let $v(z)$ in $r_{p,k}(z)$ be such that $v(\xi_i) \neq 0$, $i = 1, 2, \ldots, p$. Then $r_{p,k}(z)$ is a symmetric function of $\xi_1, \xi_2, \ldots, \xi_p$ for IMPE, IMMPE, and ITEA.

73 A function $f(x_1, \ldots, x_m)$ is symmetric in $x_1, \ldots, x_m$ if $f(x_{i_1}, \ldots, x_{i_m}) = f(x_1, \ldots, x_m)$ for every permutation $(x_{i_1}, \ldots, x_{i_m})$ of $(x_1, \ldots, x_m)$.

16.3.4 Reproducing properties

In the next theorem, we show that, provided the denominator polynomials $v(z)$ satisfy $v(\xi_i) \neq 0$, $i = 1, \ldots, p$, the interpolants $r_{p,k}(z)$ for IMPE, IMMPE, and ITEA reproduce $f(z)$ when the latter is itself a vector-valued rational function.

Theorem 16.8. Let $f(z)$ be of the form $f(z) = \tilde{u}(z)/\tilde{v}(z)$, with $\tilde{u}(z)$ a vector-valued polynomial of degree at most $p - 1$ and $\tilde{v}(z)$ a scalar polynomial of degree exactly $k$. Then all three rational interpolants $r_{p,k}(z)$ reproduce $f(z)$ in the sense that $r_{p,k}(z) \equiv f(z)$ provided $\tilde{v}(\xi_i) \neq 0$, $i = 1, \ldots, p$.

Proof. By the fact that $\tilde{u}(z)$ is a polynomial of degree at most $p - 1$, we first have that all divided differences of $\tilde{u}(z)$ of order $p$ or more vanish,⁷⁴ that is,
\[
\tilde{u}[\xi_1, \ldots, \xi_p, \xi_{p+1}, \ldots, \xi_{p+s}] = 0, \qquad s = 1, 2, \ldots. \tag{16.33}
\]
Now, since $\tilde{u}(z) = \tilde{v}(z) f(z)$, by the Leibniz rule for divided differences (see, for example, Stoer and Bulirsch [313, p. 117]), we have
\[
\tilde{u}[\xi_1, \ldots, \xi_m] = \sum_{i=1}^{m} \tilde{v}[\xi_1, \ldots, \xi_i]\, f[\xi_i, \ldots, \xi_m]. \tag{16.34}
\]
But, because $\tilde{v}(z)$ is a polynomial of degree $k$,
\[
\tilde{v}[\xi_1, \ldots, \xi_i] = 0, \qquad i \ge k + 2.
\]
Furthermore, writing $\tilde{v}(z)$ in the form
\[
\tilde{v}(z) = \sum_{j=0}^{k} \tilde{c}_j\, \psi_{1,j}(z),
\]
which is legitimate, and comparing it with the Newtonian form
\[
\tilde{v}(z) = \sum_{i=1}^{k+1} \tilde{v}[\xi_1, \ldots, \xi_i]\, \psi_{1,i-1}(z),
\]
we realize that
\[
\tilde{c}_j = \tilde{v}[\xi_1, \ldots, \xi_{j+1}], \qquad j = 0, 1, \ldots, k.
\]
Substituting this into (16.34) and letting $m = p + s$ there, switching to the notation $d_{i,m} = f[\xi_i, \xi_{i+1}, \ldots, \xi_m]$ [recall (16.6)], and invoking (16.33), we see that the $\tilde{c}_j$ satisfy the equations
\[
\sum_{j=0}^{k} \tilde{c}_j\, d_{j+1,p+s} = 0, \qquad s = 1, 2, \ldots. \tag{16.35}
\]
Therefore, they also satisfy (16.16)–(16.18). It is now easy to see that, when $\tilde{v}(\xi_i) \neq 0$, $i = 1, \ldots, p$, we have $c_j = \tilde{c}_j$, $j = 0, 1, \ldots, k$. This completes the proof.

Note that Theorem 16.8 and its proof can also serve to define the rational interpolation procedures. That is, these interpolation procedures can be obtained by demanding that r p,k (z) ≡ f (z) when f (z) is a vector-valued rational function, as described in Theorem 16.8.
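The two facts about divided differences used in the proof of Theorem 16.8 can be checked numerically. The following is a minimal sketch in Python, assuming scalar stand-ins for the vector-valued quantities; the functions `v`, `f`, `u` and the points `xi` are hypothetical examples, not taken from the text.

```python
def divided_difference(func, pts):
    """Return func[pts[0], ..., pts[-1]] for distinct points (recursive table)."""
    vals = [func(x) for x in pts]
    for level in range(1, len(pts)):
        vals = [(vals[i + 1] - vals[i]) / (pts[i + level] - pts[i])
                for i in range(len(pts) - level)]
    return vals[0]

def v(z):
    # Scalar polynomial of degree k = 2 (illustrative choice).
    return z * z - 3.0 * z + 2.0

def f(z):
    # Scalar stand-in for the vector-valued f(z).
    return 1.0 / (z - 5.0)

def u(z):
    return v(z) * f(z)

xi = [0.0, 0.5, 1.5, 2.5, 3.0]
m = len(xi)

# Leibniz rule (16.34): u[xi_1..xi_m] = sum_i v[xi_1..xi_i] f[xi_i..xi_m].
lhs = divided_difference(u, xi)
rhs = sum(divided_difference(v, xi[:i]) * divided_difference(f, xi[i - 1:])
          for i in range(1, m + 1))

# Divided differences of v over k + 2 = 4 points should vanish.
high_order = divided_difference(v, xi[:4])
print(lhs, rhs, high_order)
```

The recursion assumes distinct points; confluent points would need derivative values instead.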

16.4 Convergence study of $r_{p,k}(z)$: A constructive theory of de Montessus type

Under certain conditions imposed on $f(z)$, the interpolation procedures described here possess interesting de Montessus–type convergence theories in the spirit of Saff [245]. Below, we summarize the theories that are currently available. The details can be found in [284] and [287] for IMPE, in [283] for IMMPE, and in [291] for ITEA.

74 See footnote 72.

Let $E$ be a closed and bounded set in the $z$-plane whose complement $K$, including the point at infinity, is connected and has a classical Green's function $g(z)$ with a pole at infinity, which is continuous on $\partial E$, the boundary of $E$, and is zero on $\partial E$. For each $\sigma$, let $\Gamma_\sigma$ be the locus $g(z) = \log \sigma$, and let $E_\sigma$ denote the interior of $\Gamma_\sigma$. Then $E_1$ is the interior of $E$ and, for $1 < \sigma' < \sigma''$, $E \subset E_{\sigma'} \subset E_{\sigma''}$.

For each $p \in \{1, 2, \ldots\}$, let $\Xi_p = \{\xi_1^{(p)}, \xi_2^{(p)}, \ldots, \xi_{p+a}^{(p)}\}$ be the set of interpolation points used in constructing the $r_{p,k}(z)$ from IMPE (with $a = 1$), from IMMPE (with $a = 1$), and from ITEA (with $a = k$). Assume that the sets $\Xi_p$ are such that the $\xi_i^{(p)}$ have no limit points in $K$ and
\[ \lim_{p \to \infty} \Bigl| \prod_{i=1}^{p+a} \bigl(z - \xi_i^{(p)}\bigr) \Bigr|^{1/p} = \kappa\, \Phi(z), \quad \kappa = \operatorname{cap}(E), \quad \Phi(z) = \exp[g(z)], \]
uniformly in $z$ on every compact subset of $K$, where $\operatorname{cap}(E)$ is the logarithmic capacity of $E$ defined by
\[ \operatorname{cap}(E) = \lim_{n \to \infty} \Bigl( \min_{r \in \widetilde{\Pi}_n} \max_{z \in E} |r(z)| \Bigr)^{1/n}, \quad \widetilde{\Pi}_n = \{ r(z) : r \in \Pi_n \text{ and monic} \}. \]
Such sequences $\{\xi_1^{(p)}, \xi_2^{(p)}, \ldots, \xi_{p+a}^{(p)}\}$, $p = 1, 2, \ldots$, exist; see Walsh [338, p. 74]. Note that, in terms of $\Phi(z)$, the locus $\Gamma_\sigma$ is defined by $\Phi(z) = \sigma$ for $\sigma > 1$, while $\partial E = \Gamma_1$ is simply the locus defined by $\Phi(z) = 1$. It is clear that if $z' \in \Gamma_{\sigma'}$, $z'' \in \Gamma_{\sigma''}$, and $1 < \sigma' < \sigma''$, then $\Phi(z') < \Phi(z'')$.
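As a concrete instance of the limit relation above, one may take $E = [-1, 1]$, for which $\operatorname{cap}(E) = 1/2$ and $\Phi(z) = |z + \sqrt{z^2 - 1}|$ (with the branch chosen so that $\Phi \geq 1$), and use the zeros of the Chebyshev polynomial $T_{p+a}$ as interpolation points. The sketch below is only an illustration under these assumptions; the choice of $E$, of the nodes, and of the test point `z` is ours and is not prescribed by the text.

```python
import cmath
import math

def phi_interval(z):
    """Phi(z) = exp(g(z)) for E = [-1, 1]; the two roots are reciprocals."""
    w = z + cmath.sqrt(z * z - 1.0)
    return max(abs(w), 1.0 / abs(w))

def root_of_node_product(z, p, a=1):
    """|prod_{i=1}^{p+a} (z - xi_i)|^(1/p) for the zeros xi_i of T_{p+a}."""
    n = p + a
    nodes = [math.cos((2 * i - 1) * math.pi / (2 * n)) for i in range(1, n + 1)]
    # Summing logarithms avoids overflow/underflow of the raw product.
    log_sum = sum(math.log(abs(z - x)) for x in nodes)
    return math.exp(log_sum / p)

kappa = 0.5            # logarithmic capacity of [-1, 1]
z = 0.3 + 1.2j         # a point in the complement K
target = kappa * phi_interval(z)
for p in (10, 100, 1000):
    print(p, root_of_node_product(z, p), target)
```

The printed values should approach `target` slowly, at a rate of roughly $1/p$ in relative terms.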

16.4.1 Convergence theory for rational $f(z)$ with simple poles

Theorem 16.9. Assume that $f(z)$ is a rational function that is analytic in $E$ and meromorphic with simple poles in $K$, given as
\[ f(z) = \sum_{s=1}^{\mu} \frac{w_s}{z - z_s} + p(z), \]
where $z_1, \ldots, z_\mu$ are distinct points in $K$ and $p(z)$ is an arbitrary vector-valued polynomial. For IMPE, assume that $w_1, \ldots, w_k$ are linearly independent. For IMMPE, assume that $w_1, \ldots, w_k$ are linearly independent and
\[ \begin{vmatrix} (q_1, w_1) & \cdots & (q_1, w_k) \\ \vdots & & \vdots \\ (q_k, w_1) & \cdots & (q_k, w_k) \end{vmatrix} \neq 0. \]
For ITEA, assume that
\[ \prod_{i=1}^{k} (q, w_i) \neq 0. \]
Order the $z_i$ such that
\[ \Phi(z_1) \leq \Phi(z_2) \leq \cdots \leq \Phi(z_\mu), \]
and assume that
\[ \Phi(z_k) < \Phi(z_{k+1}). \]
Then the following are true:

1. $r_{p,k}(z) = u_{p,k}(z)/v_{p,k}(z)$ exists for all large $p$, and $\lim_{p \to \infty} r_{p,k}(z) = f(z)$, such that
\[ \limsup_{p \to \infty} \| f(z) - r_{p,k}(z) \|^{1/p} \leq \frac{\Phi(z)}{\Phi(z_{k+1})} \tag{16.36} \]
uniformly in every compact subset of $S = \{ z : \Phi(z) < \Phi(z_{k+1}) \}$, excluding the poles of $f(z)$.

2. In addition, $\lim_{p \to \infty} v_{p,k}(z) = \prod_{i=1}^{k} (z - z_i)$ such that
\[ \limsup_{p \to \infty} \Bigl| v_{p,k}(z) - \prod_{i=1}^{k} (z - z_i) \Bigr|^{1/p} \leq \frac{\Phi(z_k)}{\Phi(z_{k+1})}. \tag{16.37} \]
Consequently, $v_{p,k}(z)$ has $k$ zeros $z_1^{(p)}, \ldots, z_k^{(p)}$, which converge to the poles $z_1, \ldots, z_k$ as in
\[ \limsup_{p \to \infty} \bigl| z_m^{(p)} - z_m \bigr|^{1/p} \leq \frac{\Phi(z_m)}{\Phi(z_{k+1})}, \quad m = 1, \ldots, k. \tag{16.38} \]
If $(w_i, w_j) = 0$ when $i \neq j$, then the convergence rates of $v_{p,k}(z)$ and $z_m^{(p)}$ from IMPE improve to read
\[ \limsup_{p \to \infty} \Bigl| v_{p,k}(z) - \prod_{i=1}^{k} (z - z_i) \Bigr|^{1/p} \leq \Bigl[ \frac{\Phi(z_k)}{\Phi(z_{k+1})} \Bigr]^2 \tag{16.39} \]
and
\[ \limsup_{p \to \infty} \bigl| z_m^{(p)} - z_m \bigr|^{1/p} \leq \Bigl[ \frac{\Phi(z_m)}{\Phi(z_{k+1})} \Bigr]^2, \quad m = 1, \ldots, k. \tag{16.40} \]

3. If $w_1^{(p)}, \ldots, w_k^{(p)}$ are the residues corresponding to the poles $z_1^{(p)}, \ldots, z_k^{(p)}$, respectively, then $\lim_{p \to \infty} w_m^{(p)} = w_m$, $m = 1, \ldots, k$, and also
\[ \limsup_{p \to \infty} \bigl\| w_m^{(p)} - w_m \bigr\|^{1/p} \leq \frac{\Phi(z_m)}{\Phi(z_{k+1})}, \quad m = 1, \ldots, k. \tag{16.41} \]

16.4.2 Extension to irrational $f(z)$ with simple poles

The results of Theorem 16.9 can be extended to the case in which $f(z)$ is not a rational function. Let us assume that $f(z)$ is analytic in $E$ and meromorphic in $E_\rho = \operatorname{int} \Gamma_\rho$ ($\Gamma_\rho$ is the locus $\Phi(z) = \rho$, $\rho > 1$). In this case, $f(z)$ is of the form
\[ f(z) = \sum_{s=1}^{\mu} \frac{w_s}{z - z_s} + \theta(z), \]
where $\theta(z)$ is an arbitrary vector-valued function that is analytic in $E_\rho$. Again, order the $z_i$ such that
\[ \Phi(z_1) \leq \Phi(z_2) \leq \cdots \leq \Phi(z_\mu) < \rho. \]
[If $k = \mu$, replace $\Phi(z_{k+1})$ by $\rho$ everywhere.] Now Theorem 16.9 applies and (16.36)–(16.41) hold without any changes.

16.4.3 Further extension for IMPE

The additional improved results in (16.39) and (16.40) continue to hold if $f : \mathbb{C} \to \mathcal{H}$, where $\mathcal{H}$ is an infinite-dimensional Hilbert space with inner product $(\cdot\,, \cdot)$ and $f(z)$ is meromorphic in the whole complex plane and is of the form
\[ f(z) = \sum_{s=1}^{\mu} \frac{w_s}{z - z_s} + \theta(z); \quad (w_i, w_j) = 0 \ \text{when } i \neq j, \quad \theta(z) \ \text{entire}, \]
where $\mu$ can be finite or infinite. If $\mu = \infty$, then we assume that $\lim_{s \to \infty} |z_s| = \infty$. (Consequently, there can only be a finite number of the $z_s$ with the same modulus.)

An example of such functions is $f(z) = (I - zA)^{-1} b$, where $A$ is a compact self-adjoint operator on a Hilbert space $\mathcal{H}$. An important special case arises in the Hilbert–Schmidt theory of Fredholm integral equations of the second kind,
\[ f(x) - z \int_a^b K(x, t)\, f(t)\, dt = g(x), \quad a \leq x \leq b, \]
where $K(x, t)$ is real and continuous in $(x, t) \in [a, b] \times [a, b]$ and satisfies $K(x, t) = K(t, x)$. The space $\mathcal{H}$ in this case is $L_2[a, b]$. For more on this, see [284, Section 6].
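A finite-dimensional analogue of the example $f(z) = (I - zA)^{-1} b$ makes the pole and residue structure explicit. The sketch below assumes a real symmetric matrix `A` with no zero eigenvalue as a stand-in for the compact self-adjoint operator; in that case the poles are $z_i = 1/\lambda_i$, the residues are $w_i = -(u_i^{*} b / \lambda_i)\, u_i$, and $\theta(z) \equiv 0$. The matrix, the vector `b`, and the evaluation point are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
B = rng.standard_normal((n, n))
A = (B + B.T) / 2                 # real symmetric stand-in for the operator A
b = rng.standard_normal(n)

lam, U = np.linalg.eigh(A)        # A u_i = lam_i u_i with orthonormal u_i
poles = 1.0 / lam                 # z_i = 1 / lam_i (assumes lam_i != 0)
residues = [-(U[:, i] @ b) / lam[i] * U[:, i] for i in range(n)]   # w_i

z = 0.37 + 0.21j                  # any point that is not a pole
f_direct = np.linalg.solve(np.eye(n) - z * A, b)
f_partial = sum(w / (z - zi) for w, zi in zip(residues, poles))

# Residues belonging to different poles are mutually orthogonal.
W = np.column_stack(residues)
gram = W.T @ W
off_diag = gram - np.diag(np.diag(gram))
print(np.abs(f_direct - f_partial).max(), np.abs(off_diag).max())
```

The orthogonality $(w_i, w_j) = 0$ for $i \neq j$ is inherited from the orthonormal eigenvectors, which is the condition under which (16.39) and (16.40) apply.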

Appendix A

QR Factorization

A.1 Gram–Schmidt orthogonalization (GS) and QR factorization

Gram–Schmidt orthogonalization (GS), also called the Gram–Schmidt process, is a procedure that takes an arbitrary set of linearly independent vectors $\{a_1, \ldots, a_n\}$ and constructs a set of vectors $\{q_1, \ldots, q_n\}$ out of them such that the $q_i$ are mutually orthonormal with respect to some specified inner product $(\cdot\,, \cdot)$; that is, $(q_i, q_j) = \delta_{ij}$. In addition, the $q_i$ are generated so that
\[ \operatorname{span}\{a_1, \ldots, a_j\} = \operatorname{span}\{q_1, \ldots, q_j\}, \quad j = 1, \ldots, n. \tag{A.1} \]
This, of course, implies
\[ \begin{aligned} a_1 &= r_{11} q_1, \\ a_2 &= r_{12} q_1 + r_{22} q_2, \\ a_3 &= r_{13} q_1 + r_{23} q_2 + r_{33} q_3, \\ &\ \,\vdots \\ a_n &= r_{1n} q_1 + r_{2n} q_2 + \cdots + r_{nn} q_n \end{aligned} \tag{A.2} \]
for some suitable scalars $r_{ij}$, with $r_{ii} \neq 0$, $i = 1, \ldots, n$.

Remark: We observe that, with $a_i, q_i \in \mathbb{C}^m$, the relationships in (A.2), with no conditions imposed on the $q_i$ and $r_{ij}$, can be expressed in matrix form as
\[ A = QR, \tag{A.3} \]
where
\[ A = [\, a_1 \,|\, a_2 \,|\, \cdots \,|\, a_n \,] \in \mathbb{C}^{m \times n} \tag{A.4} \]
and
\[ Q = [\, q_1 \,|\, q_2 \,|\, \cdots \,|\, q_n \,] \in \mathbb{C}^{m \times n}, \quad R = \begin{bmatrix} r_{11} & r_{12} & \cdots & r_{1n} \\ & r_{22} & \cdots & r_{2n} \\ & & \ddots & \vdots \\ & & & r_{nn} \end{bmatrix} \in \mathbb{C}^{n \times n}. \tag{A.5} \]
By this, we mean that (A.3)–(A.5) follow directly from (A.2), in which the vectors $q_i$ and the scalars $r_{ij}$ are not required to satisfy any specific conditions except (A.2) itself.

We now return to determining the $q_i$ and the $r_{ij}$ such that the $q_i$ satisfy the orthogonality condition $(q_i, q_j) = \delta_{ij}$. Letting $\|z\| = \sqrt{(z, z)}$, the first of the relations in (A.2) gives $r_{11}$ and $q_1$ as
\[ r_{11} = \|a_1\|, \quad q_1 = a_1 / r_{11}. \]
With $q_1$ determined, and by invoking the requirement that $(q_1, q_2) = 0$, the second relation is used to determine $r_{12}$, $r_{22}$, and $q_2$ as
\[ r_{12} = (q_1, a_2); \quad a_2^{(2)} = a_2 - r_{12} q_1, \quad r_{22} = \|a_2^{(2)}\|, \quad q_2 = a_2^{(2)} / r_{22}. \]
We now proceed in the same way and determine $q_3, \ldots, q_n$ and the rest of the $r_{ij}$. For example, with $q_1, \ldots, q_{j-1}$ already determined such that $(q_i, q_s) = \delta_{is}$ for $1 \leq i, s \leq j - 1$, and by invoking the requirement that $(q_i, q_j) = 0$ for $i < j$, from the $j$th of the relations in (A.2), namely, $a_j = \sum_{i=1}^{j} r_{ij} q_i$, we determine the $r_{ij}$ and $q_j$ as follows:
\[ r_{ij} = (q_i, a_j), \quad 1 \leq i \leq j - 1, \]
\[ a_j^{(j)} = a_j - \sum_{i=1}^{j-1} r_{ij} q_i, \quad r_{jj} = \|a_j^{(j)}\|, \quad q_j = a_j^{(j)} / r_{jj}. \]
Note that, since the $a_i$ are linearly independent, the vectors $a_j^{(j)}$ are nonzero and hence $r_{jj} > 0$ for all $j$. Note also that $r_{ij} q_i$ is the projection of $a_j$ along $q_i$, $i \leq j$.

As formulated above, GS can be implemented via the following algorithm.

GS algorithm

GS1. Set $r_{11} = \|a_1\|$ and $q_1 = a_1 / r_{11}$.
GS2. For $j = 2, \ldots, n$ do
         Set $a_j^{(1)} = a_j$.
         For $i = 1, \ldots, j - 1$ do
             Compute $r_{ij} = (q_i, a_j)$ and $a_j^{(i+1)} = a_j^{(i)} - r_{ij} q_i$.
         end do ($i$)
         Compute $r_{jj} = \|a_j^{(j)}\|$ and set $q_j = a_j^{(j)} / r_{jj}$.
     end do ($j$)

The construction via GS of the vectors $q_i$, orthonormal with respect to an arbitrary inner product $(\cdot\,, \cdot)$, leads naturally to a generalization of the well-known QR factorization of matrices, which is the subject of the next theorem. Here we recall that every inner product $(\cdot\,, \cdot)$ in $\mathbb{C}^m$ is necessarily of the form $(y, z) = y^* M z$, where $M \in \mathbb{C}^{m \times m}$ is Hermitian positive definite.

Theorem A.1. Let
\[ A = [\, a_1 \,|\, a_2 \,|\, \cdots \,|\, a_n \,] \in \mathbb{C}^{m \times n}, \quad m \geq n, \quad \operatorname{rank}(A) = n. \]
Also, let $M \in \mathbb{C}^{m \times m}$ be Hermitian positive definite and let $(y, z) = y^* M z$. Then there exists a matrix $Q \in \mathbb{C}^{m \times n}$, unitary in the sense that $Q^* M Q = I_n$, and an upper triangular matrix $R \in \mathbb{C}^{n \times n}$ with positive diagonal elements such that
\[ A = QR. \]
Specifically,
\[ Q = [\, q_1 \,|\, q_2 \,|\, \cdots \,|\, q_n \,], \quad R = \begin{bmatrix} r_{11} & r_{12} & \cdots & r_{1n} \\ & r_{22} & \cdots & r_{2n} \\ & & \ddots & \vdots \\ & & & r_{nn} \end{bmatrix}, \]
\[ (q_i, q_j) = q_i^* M q_j = \delta_{ij} \ \ \forall\, i, j, \qquad r_{ij} = (q_i, a_j) = q_i^* M a_j \ \ \forall\, i \leq j, \qquad r_{ii} > 0 \ \ \forall\, i. \]

In addition, the matrices $Q$ and $R$ are unique.

Proof. Clearly, the columns of $Q$ and the entries of $R$ are precisely those in (A.2), as provided by GS we have just described, with the inner product $(y, z) = y^* M z$. We have thus shown the existence of the QR factorization for $A$. [Recall also the remark following (A.2).]

We now turn to the proof of uniqueness. Suppose that $A = Q_1 R_1$ and $A = Q_2 R_2$ are two QR factorizations. Then, by $Q_1 R_1 = Q_2 R_2$, we have both $R_1 R_2^{-1} = Q_1^* M Q_2$ and $R_2 R_1^{-1} = Q_2^* M Q_1$. Since $R_1 R_2^{-1}$ and $R_2 R_1^{-1}$ are both upper triangular with positive diagonals, so are $Q_1^* M Q_2$ and $Q_2^* M Q_1$. But $Q_1^* M Q_2 = (Q_2^* M Q_1)^*$, and hence $Q_1^* M Q_2$ is lower triangular as well; therefore, $Q_1^* M Q_2$ is diagonal with positive diagonal elements. So is $Q_2^* M Q_1$. Now, $Q_1^* M Q_2$ is unitary (in the regular sense) since
\[ (Q_1^* M Q_2)^* (Q_1^* M Q_2) = (Q_2^* M Q_1)(Q_1^* M Q_2) = (R_2 R_1^{-1})(R_1 R_2^{-1}) = I_n. \]
Since the only diagonal unitary matrix with positive diagonal elements is $I_n$, we have $Q_1^* M Q_2 = I_n$. Similarly, we have $Q_2^* M Q_1 = I_n$. Therefore,
\[ R_1 R_2^{-1} = I_n = R_2 R_1^{-1}; \quad \text{hence } R_1 = R_2. \]
As a result,
\[ Q_1 = A R_1^{-1} = A R_2^{-1} = Q_2. \]
This completes the proof.

Remarks:
1. The generalization of the QR factorization in Theorem A.1 reduces to the standard QR factorization when $M = I_m$, that is, when $(\cdot\,, \cdot)$ is the standard Euclidean inner product $(y, z) = \langle y, z \rangle = y^* z$. In this case, $Q$ is a unitary matrix in the regular sense; that is, $Q^* Q = I_n$.
2. Because the columns of $Q$ are orthonormal with respect to a weighted inner product when $M \neq I_m$, we will call the resulting QR factorization a weighted QR factorization.
3. The weighted QR factorization we have described here is the same as the weighted QR factorization of Gulliksson and Wedin [120] and Gulliksson [119], who put more emphasis on the case in which $M$ is a diagonal matrix with positive diagonal elements.
4. If $A$ and $M$ are real, then so are $Q$ and $R$.
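The GS algorithm and Theorem A.1 can be sketched in a few lines of Python with NumPy. This is a minimal illustration, assuming a dense Hermitian positive definite `M`; the function name `weighted_gs` and the test matrices are hypothetical, and the code follows steps GS1–GS2 directly rather than any numerically robust variant.

```python
import numpy as np

def weighted_gs(A, M):
    """Classical GS (steps GS1-GS2) with inner product (y, z) = y^* M z."""
    m, n = A.shape
    Q = np.zeros((m, n), dtype=complex)
    R = np.zeros((n, n), dtype=complex)

    def inner(y, z):
        return np.vdot(y, M @ z)          # vdot conjugates its first argument

    for j in range(n):
        a = A[:, j].astype(complex)       # a_j^(1) = a_j
        for i in range(j):
            R[i, j] = inner(Q[:, i], A[:, j])   # r_ij = (q_i, a_j): original a_j
            a = a - R[i, j] * Q[:, i]           # a_j^(i+1) = a_j^(i) - r_ij q_i
        R[j, j] = np.sqrt(inner(a, a).real)     # r_jj = ||a_j^(j)||
        Q[:, j] = a / R[j, j]
    return Q, R

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 3)) + 1j * rng.standard_normal((6, 3))
C = rng.standard_normal((6, 6)) + 1j * rng.standard_normal((6, 6))
M = C.conj().T @ C + np.eye(6)            # Hermitian positive definite weight
Q, R = weighted_gs(A, M)
print(np.abs(Q.conj().T @ M @ Q - np.eye(3)).max(), np.abs(Q @ R - A).max())
```

For a well-conditioned `A` such as this one, both printed residuals should be near machine precision.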


A.2 Modified Gram–Schmidt orthogonalization (MGS)

So far, we have shown that the QR factorization of a matrix $A$ amounts to nothing but the application of GS to the columns of $A$. As for actually computing the matrices $Q$ and $R$ in Theorem A.1 in floating-point arithmetic (as opposed to exact arithmetic), we may be tempted to employ the GS algorithm as described in the preceding section. However, when $M = I_m$, this algorithm turns out to be numerically unstable; for example, the matrix $Q$ computed via the GS algorithm is far from being unitary. This deficiency can be alleviated by switching to another orthogonalization process to construct the matrices $Q$ and $R$; we can use the modified Gram–Schmidt orthogonalization (MGS) or Givens rotations or Householder reflections.$^{75}$ We choose to use MGS because it is easier to program than the others and produces good results numerically when $M = I_m$. MGS can be implemented via the following algorithm for an arbitrary inner product $(\cdot\,, \cdot)$ and norm $\|z\| = \sqrt{(z, z)}$.

MGS algorithm

MGS1. Set $r_{11} = \|a_1\|$ and $q_1 = a_1 / r_{11}$.
MGS2. For $j = 2, \ldots, n$ do
          Set $a_j^{(1)} = a_j$.
          For $i = 1, \ldots, j - 1$ do
              Compute $r_{ij} = (q_i, a_j^{(i)})$ and compute $a_j^{(i+1)} = a_j^{(i)} - r_{ij} q_i$.
          end do ($i$)
          Compute $r_{jj} = \|a_j^{(j)}\|$ and set $q_j = a_j^{(j)} / r_{jj}$.
      end do ($j$)

Comparing the GS and MGS algorithms, it is easy to show that, in exact arithmetic, the scalars $r_{ij}$ and the vectors $q_j$ computed by MGS are the same as those computed by GS. Comparing the GS and MGS algorithms, we also see that their costs are identical. However, they differ in the way they compute the $r_{ij}$ [namely, $r_{ij} = (q_i, a_j)$ for GS, while $r_{ij} = (q_i, a_j^{(i)})$ for MGS, the rest of the computations being identical], which has a beneficial effect on the rest of the computation, at least when $M = I_m$. For example, once $r_{13} q_1$, the projection of $a_3$ along $q_1$, has been computed, it is immediately subtracted from $a_3^{(1)} = a_3$ to give $a_3^{(2)}$, and then $r_{23}$ is computed via $(q_2, a_3^{(2)})$ instead of $(q_2, a_3)$, even though they are equal, mathematically speaking.

An important point to realize here is that $A$ can be overwritten by $Q$ during the orthogonalization process if $A$ does not need to be saved. Specifically, $q_1$ overwrites $a_1$ componentwise at the time it is being computed. For $j \geq 2$, $a_j^{(1)}$ overwrites $a_j$, $a_j^{(2)}$ overwrites $a_j^{(1)}$, \ldots, $a_j^{(j)}$ overwrites $a_j^{(j-1)}$, and $q_j$ overwrites $a_j^{(j)}$, componentwise at the time they are being computed. Thus, the only storage needed to produce the matrix $Q$ is that of the matrix $A$, additional storage being needed for the matrix $R$. If $A \in \mathbb{C}^{m \times n}$ and $m \gg n$, which is the case in our problems, this results in significant economy since $Q$ needs $mn$ memory locations, whereas $R$ needs $n^2$ locations, and $mn \gg n^2$.

75 We do not go into Givens rotations and Householder reflections here and refer the reader to the various books on numerical linear algebra.


Finally, the performance of MGS will depend on the properties of the matrix M . This point is studied in detail in the papers [120] and [119] when M is a diagonal matrix. Problems occur when the condition number of M is very large. We do not go into this subject here and refer the reader to these papers for details. In this work, we assume that M is such that MGS is an effective procedure for computing the weighted QR factorization.
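The difference between GS and MGS in floating-point arithmetic can be seen on a small, nearly rank-deficient example. The sketch below assumes $M = I_m$ and real data; the function names and the test matrix (a classical example with three almost parallel columns) are our illustrative choices. The working array is overwritten column by column, as described above.

```python
import numpy as np

def gram_schmidt(A, modified):
    """Return (Q, R) for M = I; modified=True gives MGS, False gives GS."""
    Q = np.array(A, dtype=float)       # working copy; columns become the q_j
    n = Q.shape[1]
    R = np.zeros((n, n))
    for j in range(n):
        a_orig = Q[:, j].copy()        # a_j itself, needed only by GS
        for i in range(j):
            # GS: r_ij = (q_i, a_j);  MGS: r_ij = (q_i, a_j^(i))
            R[i, j] = Q[:, i] @ (Q[:, j] if modified else a_orig)
            Q[:, j] -= R[i, j] * Q[:, i]       # a_j^(i+1) overwrites a_j^(i)
        R[j, j] = np.linalg.norm(Q[:, j])
        Q[:, j] /= R[j, j]                     # q_j overwrites a_j^(j)
    return Q, R

def orthogonality_error(Q):
    return np.abs(Q.T @ Q - np.eye(Q.shape[1])).max()

eps = 1e-8
A = np.array([[1.0, 1.0, 1.0],
              [eps, 0.0, 0.0],
              [0.0, eps, 0.0],
              [0.0, 0.0, eps]])
Q_gs, R_gs = gram_schmidt(A, modified=False)
Q_mgs, R_mgs = gram_schmidt(A, modified=True)
print("GS  loss of orthogonality:", orthogonality_error(Q_gs))
print("MGS loss of orthogonality:", orthogonality_error(Q_mgs))
```

On this matrix, GS should lose orthogonality almost completely (an error of order one), whereas the MGS error should stay of order `eps`; both still reproduce $A = QR$ to working precision.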

A.3 MGS with reorthogonalization

Let us denote the vectors $q_i$ and the scalars $r_{ij}$ computed in floating-point arithmetic by $\check{q}_i$ and $\check{r}_{ij}$, respectively. Even though MGS is much better than GS when carried out in floating-point arithmetic, it too may sometimes suffer from loss of accuracy due to roundoff, in the sense that the computed vectors $\check{q}_i$ may fail to retain mutual orthogonality, which results in $\check{Q}^* M \check{Q} \neq I_n$ in a substantial way, where $\check{Q} = [\, \check{q}_1 \,|\, \check{q}_2 \,|\, \cdots \,|\, \check{q}_n \,]$.$^{76}$ One effective way of dealing with this problem is by a procedure called reorthogonalization, which corrects the $\check{q}_i$ and $\check{r}_{ij}$ as MGS progresses.

76 This problem has been considered in the literature for the case $M = I_m$; here we are assuming that the technique developed for this case will work for the case $M \neq I_m$ too.

Reorthogonalization is derived/designed via the following reasoning. Suppose that (the exact) $q_i$, $i = 1, \ldots, j - 1$, are known (or have been computed), and we would like to compute $q_j$ by GS. Suppose also that we have computed $\check{r}_{ij} \approx (q_i, a_j)$ [instead of the exact $r_{ij} = (q_i, a_j)$], $1 \leq i \leq j - 1$, and formed the vector
\[ \check{a}_j^{(j)} = a_j - \sum_{i=1}^{j-1} \check{r}_{ij} q_i. \tag{A.6} \]
With the exact $r_{ij}$, we would have
\[ a_j^{(j)} = a_j - \sum_{i=1}^{j-1} r_{ij} q_i \tag{A.7} \]
and
\[ r_{jj} = \|a_j^{(j)}\| \quad \text{and} \quad q_j = a_j^{(j)} / r_{jj}, \tag{A.8} \]
and we would like to obtain the (exact) $r_{ij}$ and $q_j$. Subtracting (A.7) from (A.6), we get the equality
\[ \check{a}_j^{(j)} - a_j^{(j)} = \sum_{i=1}^{j-1} (r_{ij} - \check{r}_{ij})\, q_i. \tag{A.9} \]
Taking the inner product of this equality with $q_i$, and noting that
\[ a_j^{(j)} = r_{jj} q_j \quad \Rightarrow \quad (q_i, a_j^{(j)}) = 0, \quad i = 1, \ldots, j - 1, \]
we obtain (in exact arithmetic)
\[ \Delta r_{ij} = r_{ij} - \check{r}_{ij} = (q_i, \check{a}_j^{(j)}) \quad \Rightarrow \quad r_{ij} = \check{r}_{ij} + \Delta r_{ij}, \quad i = 1, \ldots, j - 1. \tag{A.10} \]
Substituting (A.10) into (A.9), we have
\[ a_j^{(j)} = \check{a}_j^{(j)} - \sum_{i=1}^{j-1} (\Delta r_{ij})\, q_i, \quad \Delta r_{ij} = (q_i, \check{a}_j^{(j)}), \quad i = 1, \ldots, j - 1, \tag{A.11} \]
which shows that $a_j^{(j)}$ is the vector obtained (via GS) by subtracting from the vector $\check{a}_j^{(j)}$ its projections along $q_1, \ldots, q_{j-1}$. Of course, once $a_j^{(j)}$ is obtained, we set $r_{jj} = \|a_j^{(j)}\|$ and $q_j = a_j^{(j)} / r_{jj}$. Thus, $a_j^{(j)}$ is the vector obtained by orthogonalizing $\check{a}_j^{(j)}$ against $q_1, \ldots, q_{j-1}$. We now perform this orthogonalization via MGS: We set $b_j^{(1)} = \check{a}_j^{(j)}$ and apply MGS to compute $\Delta r_{ij}$ and $b_j^{(i+1)}$, $i = 1, \ldots, j - 1$, as
\[ \Delta r_{ij} = (q_i, b_j^{(i)}), \quad b_j^{(i+1)} = b_j^{(i)} - (\Delta r_{ij})\, q_i, \quad i = 1, \ldots, j - 1, \]
and, finally, we set
\[ r_{jj} = \|b_j^{(j)}\| \quad \text{and} \quad q_j = b_j^{(j)} / r_{jj}. \]
Clearly, the $\Delta r_{ij}$ computed in floating-point arithmetic are not exact; they turn out to be sufficiently close to being exact, however. As has been shown by Kahan for the case $M = I_m$ (see [208, Section 6-9]), it is enough to perform the above (in floating-point arithmetic) only once; the computed $q_i$ are mutually orthonormal to working precision of the floating-point arithmetic being used.

In view of these developments, it is clear that when $A \in \mathbb{C}^{m \times n}$ and $m \gg n$, (i) the computational cost of MGS with reorthogonalization is about twice that of MGS without reorthogonalization, while (ii) the storage requirements are the same for both algorithms, because we can overwrite $\check{a}_j^{(j)}$ with $b_j^{(1)}$ and $b_j^{(i)}$ with $b_j^{(i+1)}$ for $i = 1, \ldots, j - 1$, in this order, and, finally, we can overwrite $b_j^{(j)}$ with $q_j$.

Here are the steps of MGS with reorthogonalization.

MGS algorithm with reorthogonalization

MGS1. Set $r_{11} = \|a_1\|$ and $q_1 = a_1 / r_{11}$.
MGS2. For $j = 2, \ldots, n$ do
          Set $\check{a}_j^{(1)} = a_j$.
          For $i = 1, \ldots, j - 1$ do
              Compute $\check{r}_{ij} = (q_i, \check{a}_j^{(i)})$ and compute $\check{a}_j^{(i+1)} = \check{a}_j^{(i)} - \check{r}_{ij} q_i$.
          end do ($i$)
          Set $b_j^{(1)} = \check{a}_j^{(j)}$.
          For $i = 1, \ldots, j - 1$ do
              Compute $\Delta r_{ij} = (q_i, b_j^{(i)})$ and compute $b_j^{(i+1)} = b_j^{(i)} - (\Delta r_{ij})\, q_i$.
              Compute $r_{ij} = \check{r}_{ij} + \Delta r_{ij}$.
          end do ($i$)
          Compute $r_{jj} = \|b_j^{(j)}\|$ and set $q_j = b_j^{(j)} / r_{jj}$.
      end do ($j$)

Note that the two do loops over $i$ can be written as one nested do loop. For the sake of clarity, we have chosen to display the order of the computations.
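The steps of MGS with reorthogonalization translate almost line by line into code. The following sketch assumes a dense weight matrix `M` (defaulting to the identity) and overwrites a working copy of $A$ in place; the function name `mgs_reorth` and the test matrix are hypothetical choices for illustration only.

```python
import numpy as np

def mgs_reorth(A, M=None):
    """MGS with one reorthogonalization pass, inner product (y, z) = y^* M z."""
    m, n = A.shape
    M = np.eye(m) if M is None else M

    def inner(y, z):
        return np.vdot(y, M @ z)

    Q = np.array(A, dtype=complex)     # working copy, overwritten by the q_j
    R = np.zeros((n, n), dtype=complex)
    for j in range(n):
        for i in range(j):             # first pass: r-check and a-check vectors
            R[i, j] = inner(Q[:, i], Q[:, j])
            Q[:, j] -= R[i, j] * Q[:, i]
        for i in range(j):             # second pass: corrections Delta r_ij
            dr = inner(Q[:, i], Q[:, j])
            Q[:, j] -= dr * Q[:, i]
            R[i, j] += dr              # r_ij = r-check_ij + Delta r_ij
        R[j, j] = np.sqrt(inner(Q[:, j], Q[:, j]).real)
        Q[:, j] /= R[j, j]
    return Q, R

eps = 1e-8
A = np.array([[1.0, 1.0, 1.0],
              [eps, 0.0, 0.0],
              [0.0, eps, 0.0],
              [0.0, 0.0, eps]])
Q, R = mgs_reorth(A)
err = np.abs(Q.conj().T @ Q - np.eye(3)).max()
print("loss of orthogonality:", err)
```

For $j = 1$ both inner loops are empty, so step MGS1 is covered by the same loop body. On this nearly rank-deficient matrix the single correction pass should bring the loss of orthogonality down to roughly machine precision.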

Appendix B

Singular Value Decomposition (SVD)

Singular Value Decomposition (SVD) is a very important theoretical and computational tool in numerical linear algebra and is covered in every intermediate to advanced linear algebra course. Two versions of SVD are discussed in the literature: (i) full SVD and (ii) reduced SVD. Different proofs of the existence of SVD can be found in the literature. See, for example, the books by Golub and Van Loan [103], Stewart [310], Stoer and Bulirsch [313], and Trefethen and Bau [324]. Here we provide a somewhat simpler derivation of full SVD. Reduced SVD follows from full SVD in a straightforward manner.

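B.1 Full SVD

We start with what is known as full SVD.

Theorem B.1. Let $A \in \mathbb{C}^{m \times n}$, $A \neq O$. Then there exist unitary matrices $U \in \mathbb{C}^{m \times m}$ and $V \in \mathbb{C}^{n \times n}$ such that
\[ A = U \Sigma V^*, \quad U^* A V = \Sigma = \operatorname{diag}(\sigma_1, \sigma_2, \ldots, \sigma_p) \in \mathbb{R}^{m \times n}, \quad p = \min\{m, n\}, \tag{B.1} \]
where $\sigma_i$ are real and nonnegative scalars and can be ordered as
\[ \sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_p \geq 0 \tag{B.2} \]
and $\sigma_i^2$ are eigenvalues of $A^* A$ and hence also of $A A^*$. If $\operatorname{rank}(A) = r$, then
\[ \sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_r > 0 = \sigma_{r+1} = \cdots = \sigma_p. \tag{B.3} \]
If we partition the unitary matrices $U$ and $V$ columnwise as
\[ U = [\, u_1 \,|\, \cdots \,|\, u_m \,] \quad \text{and} \quad V = [\, v_1 \,|\, \cdots \,|\, v_n \,], \tag{B.4} \]
the vectors $u_i \in \mathbb{C}^m$ and $v_i \in \mathbb{C}^n$ for $i = 1, \ldots, p$ satisfy
\[ A^* A v_i = \sigma_i^2 v_i, \quad A v_i = \sigma_i u_i \quad \text{and} \quad A A^* u_i = \sigma_i^2 u_i, \quad A^* u_i = \sigma_i v_i, \tag{B.5} \]
in addition to satisfying
\[ u_i^* u_j = \delta_{ij} \quad \text{and} \quad v_i^* v_j = \delta_{ij}, \tag{B.6} \]
since $U$ and $V$ are unitary.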


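Remarks:
1. $\sigma_i$ is called the $i$th singular value of $A$. The vectors $u_i \in \mathbb{C}^m$ and $v_i \in \mathbb{C}^n$ are called, respectively, the $i$th left and $i$th right singular vectors of $A$. The $m \times n$ matrix $\Sigma$ is of the form
\[ \Sigma = \begin{bmatrix} \sigma_1 & 0 & \cdots & 0 \\ 0 & \sigma_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_n \\ 0 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 0 \end{bmatrix} \quad \text{if } m \geq n \tag{B.7} \]
and
\[ \Sigma = \begin{bmatrix} \sigma_1 & 0 & \cdots & 0 & 0 & \cdots & 0 \\ 0 & \sigma_2 & \cdots & 0 & 0 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & \sigma_m & 0 & \cdots & 0 \end{bmatrix} \quad \text{if } m \leq n. \tag{B.8} \]
2. If $A$ is real, then so are $U$ and $V$.
3. In our proof below, we also make use of the fact that
\[ B^* B x = 0 \quad \Leftrightarrow \quad B x = 0, \tag{B.9} \]
which can be shown as follows:
\[ B^* B x = 0 \quad \Rightarrow \quad x^* B^* B x = 0 \quad \Rightarrow \quad (B x)^* (B x) = 0 \quad \Rightarrow \quad B x = 0. \]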

That $B x = 0 \Rightarrow B^* B x = 0$ is obvious.

Proof. We start by recalling that $A^* A$ and $A A^*$ have $p$ common nonnegative eigenvalues (counting multiplicities) $\sigma_1^2 \geq \cdots \geq \sigma_p^2 \geq 0$. Next, we note that $\operatorname{rank}(A^* A) = \operatorname{rank}(A A^*) = r$ since $\operatorname{rank}(A) = r$; therefore, $\sigma_1 \geq \cdots \geq \sigma_r > 0$, the rest of the $\sigma_i$ being zero.

We now construct the unitary matrices $U$ and $V$. Let $v_1, \ldots, v_r$ be orthonormalized eigenvectors of $A^* A$ corresponding to the eigenvalues $\sigma_1^2, \ldots, \sigma_r^2$, respectively, and let $v_{r+1}, \ldots, v_n$ be the remaining $n - r$ orthonormalized eigenvectors of $A^* A$ corresponding to the zero eigenvalues. Thus, we have
\[ A^* A v_i = \begin{cases} \sigma_i^2 v_i, & i = 1, \ldots, r, \\ 0, & i = r + 1, \ldots, n, \end{cases} \quad v_i^* v_j = \delta_{ij}, \quad \Rightarrow \quad A v_i = 0, \quad i = r + 1, \ldots, n, \tag{B.10} \]
where we have also made use of (B.9). Now let
\[ u_i = \sigma_i^{-1} A v_i, \quad i = 1, \ldots, r. \tag{B.11} \]
It is easy to verify that
\[ A A^* u_i = \sigma_i^2 u_i, \quad i = 1, \ldots, r, \quad u_i^* u_j = \delta_{ij}. \tag{B.12} \]
We have shown that $u_1, \ldots, u_r$, as defined above, are orthonormalized eigenvectors of $A A^*$ corresponding to the eigenvalues $\sigma_1^2, \ldots, \sigma_r^2$, respectively. Letting $u_{r+1}, \ldots, u_m$ be the remaining $m - r$ orthonormalized eigenvectors of $A A^*$ corresponding to the zero eigenvalues, we have
\[ A A^* u_i = \begin{cases} \sigma_i^2 u_i, & i = 1, \ldots, r, \\ 0, & i = r + 1, \ldots, m, \end{cases} \quad u_i^* u_j = \delta_{ij}, \quad \Rightarrow \quad A^* u_i = 0, \quad i = r + 1, \ldots, m, \tag{B.13} \]
where we have made use of (B.9) again. Finally, set
\[ U = [\, u_1 \,|\, \cdots \,|\, u_m \,] \quad \text{and} \quad V = [\, v_1 \,|\, \cdots \,|\, v_n \,]. \tag{B.14} \]
Clearly, $U$ and $V$ are unitary. Letting $\Sigma = U^* A V$, we have
\[ (\Sigma)_{ij} = (U^* A V)_{ij} = u_i^* A v_j = u_i^* (A v_j) = (A^* u_i)^* v_j = \begin{cases} \sigma_i \delta_{ij}, & i, j = 1, \ldots, r, \\ 0, & \text{otherwise}. \end{cases} \tag{B.15} \]
Thus, $\Sigma$ is a diagonal matrix in $\mathbb{R}^{m \times n}$, as given in (B.7)–(B.8). This completes the proof.

Note that the proof also displays the connection between the two singular vectors $u_i$ and $v_i$ of $A$ corresponding to the singular value $\sigma_i > 0$, namely, that
\[ u_i = \sigma_i^{-1} A v_i, \quad v_i = \sigma_i^{-1} A^* u_i, \quad i = 1, \ldots, r. \tag{B.16} \]
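The construction in the proof of Theorem B.1 can be followed step by step in code. The sketch below is an illustration only, assuming NumPy's `eigh` for the Hermitian eigenproblems and a relative threshold `tol` (our choice) to decide the numerical rank. Forming $A^* A$ squares the condition number, so this route is instructive rather than recommended for accurate computation.

```python
import numpy as np

def svd_via_eig(A, tol=1e-10):
    """Full SVD built as in the proof: eigenvectors of A^*A, then (B.11)."""
    m, n = A.shape
    lam, V = np.linalg.eigh(A.conj().T @ A)       # ascending eigenvalues
    order = np.argsort(lam)[::-1]                 # reorder to descending
    lam, V = np.maximum(lam[order], 0.0), V[:, order]
    r = int(np.sum(lam > tol * lam[0]))           # numerical rank (A != O assumed)
    sigma = np.sqrt(lam[:r])
    U_r = (A @ V[:, :r]) / sigma                  # u_i = A v_i / sigma_i
    # Remaining u_i: eigenvectors of AA^* for the m - r zero eigenvalues.
    _, W = np.linalg.eigh(A @ A.conj().T)
    U = np.hstack([U_r, W[:, :m - r]])
    Sigma = np.zeros((m, n))
    Sigma[:r, :r] = np.diag(sigma)
    return U, Sigma, V

rng = np.random.default_rng(2)
B = rng.standard_normal((5, 2)) + 1j * rng.standard_normal((5, 2))
C = rng.standard_normal((2, 3)) + 1j * rng.standard_normal((2, 3))
A = B @ C                                         # 5 x 3 matrix of rank 2
U, Sigma, V = svd_via_eig(A)
print(np.diag(Sigma))
print(np.abs(U @ Sigma @ V.conj().T - A).max())
```

Because of the squared conditioning, the reconstruction error is checked here only to a modest tolerance.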

B.2 Reduced SVD

We now turn to reduced SVD, which is used more frequently than full SVD in applications. As we shall see, the reduced form of SVD can be obtained directly from the full form. Using the columnwise partitions of $U$ and $V$ in (B.4) of Theorem B.1, we can immediately write
\[ A = \sum_{i=1}^{p} \sigma_i u_i v_i^*, \tag{B.17} \]
which can be rewritten in the form
\[ A = U_p \Sigma_p V_p^*, \tag{B.18} \]
where
\[ U_p = [\, u_1 \,|\, \cdots \,|\, u_p \,] \in \mathbb{C}^{m \times p}, \quad V_p = [\, v_1 \,|\, \cdots \,|\, v_p \,] \in \mathbb{C}^{n \times p}, \tag{B.19} \]
and
\[ \Sigma_p = \operatorname{diag}(\sigma_1, \ldots, \sigma_p) \in \mathbb{R}^{p \times p}. \tag{B.20} \]
What we have in (B.18)–(B.20) is the reduced SVD of $A$. We end by summarizing the reduced SVD of $A \in \mathbb{C}^{m \times n}$ when $m \geq n$ (so that $p = n$) in the next theorem. The case $m \leq n$ can be summarized similarly, and we leave it to the reader.


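Theorem B.2. Let $A \in \mathbb{C}^{m \times n}$, $A \neq O$, with $m \geq n$. Then there exist unitary matrices $U \in \mathbb{C}^{m \times n}$ and $V \in \mathbb{C}^{n \times n}$ such that
\[ A = U \Sigma V^*, \quad U^* A V = \Sigma = \operatorname{diag}(\sigma_1, \sigma_2, \ldots, \sigma_n) \in \mathbb{R}^{n \times n}, \tag{B.21} \]
where $\sigma_i$ are real and nonnegative scalars and can be ordered as
\[ \sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_n \geq 0 \tag{B.22} \]
and $\sigma_i^2$ are eigenvalues of $A^* A$ and hence also of $A A^*$. If $\operatorname{rank}(A) = r$, then
\[ \sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_r > 0 = \sigma_{r+1} = \cdots = \sigma_n. \tag{B.23} \]
If we partition the unitary matrices $U$ and $V$ columnwise as
\[ U = [\, u_1 \,|\, \cdots \,|\, u_n \,] \quad \text{and} \quad V = [\, v_1 \,|\, \cdots \,|\, v_n \,], \tag{B.24} \]
then the vectors $u_i \in \mathbb{C}^m$ and $v_i \in \mathbb{C}^n$, for $i = 1, \ldots, n$, satisfy
\[ A^* A v_i = \sigma_i^2 v_i, \quad A v_i = \sigma_i u_i \quad \text{and} \quad A A^* u_i = \sigma_i^2 u_i, \quad A^* u_i = \sigma_i v_i, \tag{B.25} \]
in addition to satisfying
\[ u_i^* u_j = \delta_{ij} \quad \text{and} \quad v_i^* v_j = \delta_{ij}, \tag{B.26} \]
since $U$ and $V$ are unitary.

Thus, the matrix $V$ in Theorem B.2 is the same as the matrix $V$ of the full SVD. Note also that the columns $u_1, \ldots, u_n$ of $U$ in Theorem B.2 are the first $n$ columns of $U$ in the full SVD. Similarly, the matrix $\Sigma$ in Theorem B.2 forms the first $n$ rows of the matrix $\Sigma$ in the full SVD.

The following can be proved using the SVD of $A$:
\[ \|A\|_2 = \sigma_1 \quad \text{and} \quad \|A\|_F = \sqrt{\sum_{i=1}^{r} \sigma_i^2}. \tag{B.27} \]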

B.3 SVD as a sum of rank-one matrices

We end by mentioning that, because $\operatorname{rank}(A) = r$, the expansion in (B.17) can be rewritten as
\[ A = \sum_{i=1}^{r} \sigma_i u_i v_i^*. \tag{B.28} \]
Clearly, each of the terms $u_i v_i^*$ is a rank-one matrix. From (B.28), it can be observed that
\[ \mathcal{R}(A) = \operatorname{span}\{u_1, \ldots, u_r\}, \quad \mathcal{N}(A) = \operatorname{span}\{v_{r+1}, \ldots, v_n\}. \tag{B.29} \]
Of course, we also have
\[ A^* = \sum_{i=1}^{r} \sigma_i v_i u_i^*. \tag{B.30} \]
Finally,
\[ A^* A = \sum_{i=1}^{r} \sigma_i^2 v_i v_i^* \quad \text{and} \quad A A^* = \sum_{i=1}^{r} \sigma_i^2 u_i u_i^*. \tag{B.31} \]
The partial sums of the representation of $A$ given in (B.28) can be used to show the following result.

Theorem B.3. Let $A_s = \sum_{i=1}^{s} \sigma_i u_i v_i^*$, $1 \leq s \leq r - 1$. Then
\[ \min_{\substack{B \in \mathbb{C}^{m \times n} \\ \operatorname{rank}(B) \leq s}} \|A - B\|_2 = \|A - A_s\|_2 = \sigma_{s+1} \tag{B.32} \]
and
\[ \min_{\substack{B \in \mathbb{C}^{m \times n} \\ \operatorname{rank}(B) \leq s}} \|A - B\|_F = \|A - A_s\|_F = \sqrt{\sum_{i=s+1}^{r} \sigma_i^2}. \tag{B.33} \]
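The norm identities (B.27) and the error formulas of Theorem B.3 are easy to confirm numerically. The sketch below uses NumPy's reduced SVD (`full_matrices=False`); the random test matrix, the truncation index `s_trunc`, and the competing rank-2 matrix `B` are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 4))
U, s, Vh = np.linalg.svd(A, full_matrices=False)   # U is 6 x 4, Vh is V^*

# (B.27): spectral and Frobenius norms in terms of singular values.
two_norm = np.linalg.norm(A, 2)
fro_norm = np.linalg.norm(A, "fro")

# Theorem B.3: truncate the rank-one expansion (B.28) after s_trunc terms.
s_trunc = 2
A_s = (U[:, :s_trunc] * s[:s_trunc]) @ Vh[:s_trunc, :]
err2 = np.linalg.norm(A - A_s, 2)
errF = np.linalg.norm(A - A_s, "fro")

# Any other matrix of rank <= s_trunc should do no better than A_s.
B = rng.standard_normal((6, s_trunc)) @ rng.standard_normal((s_trunc, 4))
print(err2, s[s_trunc], np.linalg.norm(A - B, 2))
```

Comparing against a single random `B` does not prove optimality, of course; it only illustrates the inequality in (B.32).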

Appendix C

Moore–Penrose Generalized Inverse

C.1 Definition of the Moore–Penrose generalized inverse

As is well known, for a nonsingular square matrix $A \in \mathbb{C}^{n \times n}$, there is a unique matrix $X \in \mathbb{C}^{n \times n}$, called the inverse of $A$ and denoted $A^{-1}$, such that $XA = AX = I_n$. The question arises as to whether the idea of an inverse matrix can be generalized to arbitrary matrices, whether square and singular or even rectangular. The answer to this question is yes, and several types of generalized inverses (or pseudo-inverses) have been proposed and studied in detail in the literature. See the books by Ben-Israel and Greville [20], Campbell and Meyer [56], and Rao and Mitra [215], for example. Of these, the Moore–Penrose generalized inverse is one of the most widely used in applications, and we discuss it briefly here.

Theorem C.1. Let $A \in \mathbb{C}^{m \times n}$, $A \neq O$. Then there exists a unique matrix $X \in \mathbb{C}^{n \times m}$ that satisfies the following conditions:
(i) $AXA = A$,
(ii) $XAX = X$,
(iii) $(AX)^* = AX$,
(iv) $(XA)^* = XA$.
We call $X$ the Moore–Penrose generalized inverse of $A$ and denote it by $A^+$.

The following facts can be verified easily:
1. $(A^+)^+ = A$.
2. If $A$ is square and nonsingular, that is, $m = n = \operatorname{rank}(A)$, then $A^+ = A^{-1}$.
3. If $\operatorname{rank}(A) = n$, then $A^+ = (A^* A)^{-1} A^*$ and $A^+ A = I_n$.
4. If $\operatorname{rank}(A) = m$, then $A^+ = A^* (A A^*)^{-1}$ and $A A^+ = I_m$.
5. If $A \in \mathbb{C}^{m \times n}$ and $B \in \mathbb{C}^{n \times p}$, and $\operatorname{rank}(A) = n = \operatorname{rank}(B)$, then $(AB)^+ = B^+ A^+$.


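C.2 SVD representation of $A^+$

We recall from Theorem B.1 in Appendix B that
\[ A = U \Sigma V^*, \quad \Sigma = \operatorname{diag}(\sigma_1, \sigma_2, \ldots, \sigma_r, \underbrace{0, \ldots, 0}_{p - r \text{ times}}) \in \mathbb{R}^{m \times n}, \tag{C.1} \]
where $U \in \mathbb{C}^{m \times m}$ and $V \in \mathbb{C}^{n \times n}$ are unitary and $p = \min\{m, n\}$, $r = \operatorname{rank}(A)$, and $\sigma_1 \geq \cdots \geq \sigma_r$ are the nonzero singular values of $A$. As a result, we also have the representation
\[ A = \sum_{i=1}^{r} \sigma_i u_i v_i^*. \tag{C.2} \]
We now state results for $A^+$, which are analogous to (C.1) and (C.2) and are given in terms of the SVD of $A$. Invoking (C.1) and Theorem C.1, it can be shown that $A^+$ has SVD
\[ A^+ = V \Sigma^+ U^*, \quad \Sigma^+ = \operatorname{diag}(\sigma_1^{-1}, \sigma_2^{-1}, \ldots, \sigma_r^{-1}, \underbrace{0, \ldots, 0}_{p - r \text{ times}}) \in \mathbb{R}^{n \times m}, \tag{C.3} \]
which in turn gives
\[ A^+ = \sum_{i=1}^{r} \sigma_i^{-1} v_i u_i^*. \tag{C.4} \]
From these, it is clear that
\[ \mathcal{R}(A^+) = \operatorname{span}\{v_1, \ldots, v_r\}, \quad \mathcal{N}(A^+) = \operatorname{span}\{u_{r+1}, \ldots, u_m\}. \tag{C.5} \]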
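The representation (C.4) gives a direct way to form $A^+$ and to check the four conditions of Theorem C.1. The following is a minimal sketch, assuming a relative threshold `tol` (our choice) for deciding which singular values count as nonzero; the function name `pinv_via_svd` and the rank-deficient test matrix are hypothetical.

```python
import numpy as np

def pinv_via_svd(A, tol=1e-10):
    """A^+ = sum_{i<=r} sigma_i^{-1} v_i u_i^*, as in (C.4)."""
    U, s, Vh = np.linalg.svd(A, full_matrices=False)
    r = int(np.sum(s > tol * s[0]))           # numerical rank
    V_r = Vh[:r].conj().T                     # columns v_1, ..., v_r
    return (V_r / s[:r]) @ U[:, :r].conj().T

rng = np.random.default_rng(4)
B = rng.standard_normal((5, 2)) + 1j * rng.standard_normal((5, 2))
C = rng.standard_normal((2, 4)) + 1j * rng.standard_normal((2, 4))
A = B @ C                                     # 5 x 4 matrix of rank 2
X = pinv_via_svd(A)

checks = {
    "(i)   AXA = A": np.allclose(A @ X @ A, A),
    "(ii)  XAX = X": np.allclose(X @ A @ X, X),
    "(iii) (AX)^* = AX": np.allclose((A @ X).conj().T, A @ X),
    "(iv)  (XA)^* = XA": np.allclose((X @ A).conj().T, X @ A),
}
for name, ok in checks.items():
    print(name, ok)
```

The threshold matters only for rank-deficient or nearly rank-deficient matrices, where tiny computed singular values must not be inverted.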

C.3 Connection with linear least-squares problems

Let $A \in \mathbb{C}^{m \times n}$ and $b \in \mathbb{C}^m$, and consider the linear least-squares problem
\[ \min_{x \in \mathbb{C}^n} \|Ax - b\|_2. \tag{C.6} \]
We treated the solution to this problem in Section 2.3. We now state a theorem that relates $A^+$ to this problem.

Theorem C.2. The vector $\hat{x} = A^+ b$ is a solution to (C.6). In addition, $\hat{x}$ is orthogonal to $\mathcal{N}(A)$; therefore, among all possible solutions, $\hat{x}$ is the only one with smallest standard $l_2$-norm. Thus, the general solution to (C.6) is of the form $x = A^+ b + x_0$, where $x_0$ is an arbitrary vector in $\mathcal{N}(A)$.

As a result of Theorem C.2 and (C.4), the vector $\hat{x}$ has the following expansion:
\[ \hat{x} = \sum_{i=1}^{r} \Bigl( \frac{u_i^* b}{\sigma_i} \Bigr) v_i. \tag{C.7} \]

C.4 Moore–Penrose inverse of a normal matrix

Let $A \in \mathbb{C}^{n \times n}$ be normal, and let $A = U D U^*$, with $U$ unitary, be the Jordan factorization of $A$. Thus, $D = \operatorname{diag}(\mu_1, \mu_2, \ldots, \mu_n)$ and $U = [\, u_1 \,|\, u_2 \,|\, \cdots \,|\, u_n \,]$, $A u_i = \mu_i u_i$ $\forall\, i$, $u_i^* u_j = \delta_{ij}$ $\forall\, i, j$. Then
\[ A = \sum_{i=1}^{n} \mu_i u_i u_i^*. \]
If $A$ is nonsingular,
\[ A^{-1} = U D^{-1} U^*, \quad D^{-1} = \operatorname{diag}(\mu_1^{-1}, \mu_2^{-1}, \ldots, \mu_n^{-1}) \quad \Rightarrow \quad A^{-1} = \sum_{i=1}^{n} \mu_i^{-1} u_i u_i^*. \]
If $A$ is singular with $\operatorname{rank}(A) = r < n$ and nonzero eigenvalues $\mu_1, \ldots, \mu_r$,
\[ A^+ = U D^+ U^*, \quad D^+ = \operatorname{diag}(\mu_1^{-1}, \ldots, \mu_r^{-1}, 0, \ldots, 0) \quad \Rightarrow \quad A^+ = \sum_{i=1}^{r} \mu_i^{-1} u_i u_i^*. \]
If $A$ is singular, we can use the above to also express $A^+$ in terms of regular inverses. We show this in the next theorem, where we also use the notation above.

Theorem C.3. Let $A \in \mathbb{C}^{n \times n}$ be normal and singular with $\operatorname{rank}(A) = r < n$ and nonzero eigenvalues $\mu_1, \ldots, \mu_r$. Then, for arbitrary nonzero scalars $\alpha_{r+1}, \alpha_{r+2}, \ldots, \alpha_n$, we have
\[ A^+ = \Bigl( A + \sum_{i=r+1}^{n} \alpha_i u_i u_i^* \Bigr)^{-1} - \sum_{i=r+1}^{n} \alpha_i^{-1} u_i u_i^*. \]

Proof. We start by expressing $A^+$ as
\[ A^+ = B - \sum_{i=r+1}^{n} \alpha_i^{-1} u_i u_i^*, \quad B = \sum_{i=1}^{r} \mu_i^{-1} u_i u_i^* + \sum_{i=r+1}^{n} \alpha_i^{-1} u_i u_i^*. \]
Clearly, $B$ is normal and nonsingular, and $B = \bigl( A + \sum_{i=r+1}^{n} \alpha_i u_i u_i^* \bigr)^{-1}$.
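Theorems C.2 and C.3 both lend themselves to a quick numerical check. The sketch below is a hedged illustration with small random matrices: the first part builds the minimum-norm least-squares solution from (C.7) for a real rank-deficient $A$, and the second part builds a singular normal matrix from a unitary factor and compares both sides of Theorem C.3. All matrices, the eigenvalues `mu`, and the scalars `alpha` are arbitrary choices of ours.

```python
import numpy as np

rng = np.random.default_rng(3)

# Theorem C.2 and (C.7), real case: A is 6 x 4 with rank r = 2.
A = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 4))
b = rng.standard_normal(6)
U, s, Vh = np.linalg.svd(A)
r = 2
x_hat = sum((U[:, i] @ b) / s[i] * Vh[i] for i in range(r))   # (C.7)
x0 = Vh[r]                          # a unit vector in N(A)
x_other = x_hat + 0.5 * x0          # another least-squares solution
res_hat = np.linalg.norm(A @ x_hat - b)
res_other = np.linalg.norm(A @ x_other - b)
print(res_hat, res_other, np.linalg.norm(x_hat), np.linalg.norm(x_other))

# Theorem C.3: normal, singular N = U diag(mu_1, mu_2, 0, 0) U^*.
n, rk = 4, 2
Qm, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
mu = np.array([2.0 + 1.0j, -0.5j])                 # nonzero eigenvalues
N = (Qm[:, :rk] * mu) @ Qm[:, :rk].conj().T
N_plus = (Qm[:, :rk] / mu) @ Qm[:, :rk].conj().T   # U D^+ U^*
alpha = np.array([3.0, -1.0 + 2.0j])               # arbitrary nonzero scalars
U0 = Qm[:, rk:]                                    # u_{r+1}, ..., u_n
formula = np.linalg.inv(N + (U0 * alpha) @ U0.conj().T) - (U0 / alpha) @ U0.conj().T
print(np.abs(formula - N_plus).max())
```

Both solutions give the same residual, while `x_hat` has the smaller norm; the last printed value should be near machine precision whatever nonzero `alpha` is used.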

Appendix D

Basics of Orthogonal Polynomials

D.1 Definition and basic properties of orthogonal polynomials Let w(x) ≥ 0 for a ≤ x ≤ b , where [a, b ] can be a finite or infinite interval, such that 8b its moments μi = a w(x)x i d x, i = 0, 1, . . . , all exist. We call w(x) a weight function on the interval [a, b ]. Let us define the inner product (· , ·) and the norm  ·  induced by it via 9b  w(x) f (x) g (x)d x and  f  = ( f , f ). (f , g) = a

We say that the functions $u(x)$ and $v(x)$ are orthogonal to each other with respect to the inner product $(\cdot\,, \cdot)$ if $(u, v) = 0$.

Definition D.1. A monic polynomial $\hat{p}_n(x)$ of degree $n$ is said to be an orthogonal polynomial with respect to the inner product $(\cdot\,, \cdot)$ above if it is orthogonal to the powers $x^i$, $i = 0, 1, \ldots, n-1$, that is, $\int_a^b w(x)\,\hat{p}_n(x)\,x^i\,dx = 0$ for $i = 0, 1, \ldots, n-1$.

Below we state some of the basic properties of orthogonal polynomials:

• The orthogonal polynomials $\hat{p}_n(x)$ exist for all $n = 0, 1, \ldots$ and are mutually orthogonal; that is, $(\hat{p}_m, \hat{p}_n) = 0$ if $m \ne n$. Let us normalize the monic $\hat{p}_n(x)$ as in $p_n(x) = \hat{p}_n(x)/\|\hat{p}_n\|$, $n = 0, 1, \ldots.$ (Note that, with this normalization, the leading coefficient of $p_n(x)$ is $c_{n,n} = 1/\|\hat{p}_n\| > 0$.) Then

$$(p_m, p_n) = \delta_{mn}, \quad m, n = 0, 1, \ldots.$$

Here $\delta_{mn}$ is the Kronecker delta. We call $\{p_0(x), p_1(x), \ldots\}$ an orthonormal set of polynomials with respect to $(\cdot\,, \cdot)$.

• The set of orthonormal polynomials $\{p_0(x), p_1(x), \ldots, p_n(x)\}$ is linearly independent, and every polynomial $u(x)$ of degree at most $n$ can be uniquely represented as

$$u(x) = \sum_{i=0}^{n} c_i p_i(x), \quad c_i = (p_i, u), \quad i = 0, 1, \ldots, n.$$

• Every function $f(x)$ for which $\int_a^b w(x)|f(x)|^2\,dx$ exists and is finite can be expanded formally as

$$f(x) \sim \sum_{n=0}^{\infty} c_n p_n(x), \quad c_n = \int_a^b w(x)\,p_n(x)\,f(x)\,dx, \quad n = 0, 1, \ldots,$$

and we have

$$\int_a^b w(x)|f(x)|^2\,dx = \sum_{n=0}^{\infty} |c_n|^2.$$

• The $p_n(x)$ satisfy the following three-term recursion relation:

$$\gamma_{n+1}\,p_{n+1}(x) = (x - \alpha_n)\,p_n(x) - \gamma_n\,p_{n-1}(x), \quad n = 0, 1, \ldots,$$

with initial conditions

$$p_{-1}(x) = 0, \quad p_0(x) = 1/\|1\|.$$

Here

$$\alpha_n = (x p_n, p_n), \quad n \ge 0, \qquad \gamma_n = (x p_{n-1}, p_n) > 0, \quad n \ge 1.$$

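• The zeros of the polynomial $p_n(x)$ are real and simple, lie in the open interval $(a, b)$, and interlace the zeros of $p_{n+1}(x)$. They are also the eigenvalues of the $n \times n$ real symmetric tridiagonal matrix

$$T_n = \begin{bmatrix}
\alpha_0 & \gamma_1 & & & & \\
\gamma_1 & \alpha_1 & \gamma_2 & & & \\
 & \gamma_2 & \alpha_2 & \gamma_3 & & \\
 & & \ddots & \ddots & \ddots & \\
 & & & \ddots & \ddots & \gamma_{n-1} \\
 & & & & \gamma_{n-1} & \alpha_{n-1}
\end{bmatrix}.$$

• The (monic) $\hat{p}_n(x)$ have the following extremal property:

$$\|\hat{p}_n\| = \min_{q \in \widetilde{\Pi}_n} \|q\| \le \|q\| \quad \forall\, q \in \widetilde{\Pi}_n,$$

where $\widetilde{\Pi}_n$ is the set of monic polynomials of degree $n$.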
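As an illustration of the three-term recursion, the following minimal sketch generates the orthonormal polynomials for the particular weight $w(x) = 1$ on $[-1, 1]$ (the Legendre case) and checks orthonormality numerically. The coefficients $\alpha_n = 0$ and $\gamma_n = n/\sqrt{4n^2 - 1}$ are the known values for this weight; the function names and the use of composite Simpson quadrature are our own assumptions made for the example.

```python
import math

def legendre_gamma(k):
    """Recursion coefficient gamma_k (k >= 1) for w(x) = 1 on [-1, 1]."""
    return k / math.sqrt(4.0 * k * k - 1.0)

def orthonormal_legendre(x, n):
    """Return [p_0(x), ..., p_n(x)] from the three-term recursion (alpha_k = 0)."""
    p_prev, p_cur = 0.0, 1.0 / math.sqrt(2.0)   # p_{-1} = 0, p_0 = 1/||1||
    values = [p_cur]
    for k in range(n):
        # For k = 0 the coefficient multiplies p_{-1} = 0, so its value is irrelevant.
        gamma_k = legendre_gamma(k) if k >= 1 else 0.0
        p_next = (x * p_cur - gamma_k * p_prev) / legendre_gamma(k + 1)
        p_prev, p_cur = p_cur, p_next
        values.append(p_cur)
    return values

def inner(i, j, n, panels=2000):
    """Composite Simpson approximation of (p_i, p_j) for w(x) = 1 on [-1, 1]."""
    h = 2.0 / panels
    total = 0.0
    for m in range(panels + 1):
        x = -1.0 + m * h
        weight = 1.0 if m in (0, panels) else (4.0 if m % 2 else 2.0)
        vals = orthonormal_legendre(x, n)
        total += weight * vals[i] * vals[j]
    return total * h / 3.0

# Gram matrix of p_0, ..., p_3; it should be close to the identity.
gram = [[inner(i, j, 3) for j in range(4)] for i in range(4)]

# The zeros of p_2 are the eigenvalues of T_2 = [[0, gamma_1], [gamma_1, 0]].
zero_of_p2 = legendre_gamma(1)
```

Here `zero_of_p2` equals $1/\sqrt{3}$, which is the positive eigenvalue of the $2 \times 2$ matrix $T_2$ for this weight.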

D.2 Some bounds related to orthogonal polynomials

Concerning monotonic weight functions, we have the following useful result that is stated and proved in Szegő [314, Theorem 7.2, p. 163].

Theorem D.2. Let $\{p_n(x)\}_{n=0}^{\infty}$ be the sequence of polynomials orthogonal on $[a, b]$ for finite $b$ with respect to the nonnegative weight function $w(x)$. Assume that $w(x)$ is nondecreasing on $[a, b]$. Then the functions $\sqrt{w(x)}\,|p_n(x)|$ attain their maximum on $[a, b]$ for $x = b$. A corresponding statement holds for any subinterval $[x_0, b]$ of $[a, b]$ where $w(x)$ is nondecreasing.


The following theorem by Sidi and Shapira [303, Appendix, Theorem A.4] and [304, Appendix A, Theorem A.4] concerns a lower bound for a best polynomial $L_\infty$ approximation problem.

Theorem D.3. Let $\{p_n(z)\}_{n=0}^{\infty}$ be the sequence of orthonormal polynomials on a compact set $\Omega$ of the complex $z$-plane with respect to the real nonnegative continuous weight function $w(z)$ on $\Omega$, that is,

$$\int_\Omega w(z)\,\overline{p_m(z)}\,p_n(z)\,d\Omega = \delta_{mn}, \tag{D.1}$$

where $d\Omega$ stands for the area element if $\Omega$ is a domain $D$ and for the line element if $\Omega$ is the boundary of a domain $D$ or an arbitrary rectifiable curve. Let $\phi^*(z)$ be the solution of the constrained min-max problem

$$\min_{\phi \in \pi_k}\,\max_{z \in \Omega}\,\bigl|\sqrt{w(z)}\,\phi(z)\bigr| \quad\text{subject to } M(\phi) = 1, \tag{D.2}$$

where $\pi_k$ stands for the set of polynomials of degree at most $k$ and $M$ is a bounded linear functional on the space of continuous functions on $\Omega$. Then

$$\max_{z \in \Omega}\,\bigl|\sqrt{w(z)}\,\phi^*(z)\bigr| \ge \Bigl[c \sum_{i=0}^{k} |M(p_i)|^2\Bigr]^{-1/2}, \quad c = \int_\Omega d\Omega. \tag{D.3}$$

Proof. We start by observing that, for any function $f(z)$ that is continuous on $\Omega$, we have

$$\max_{z \in \Omega} |f(z)| \ge \Bigl[c^{-1} \int_\Omega |f(z)|^2\,d\Omega\Bigr]^{1/2}. \tag{D.4}$$

Letting $f(z) = \sqrt{w(z)}\,\phi(z)$ in (D.4), where $\phi(z)$ is a polynomial of degree at most $k$ satisfying $M(\phi) = 1$, and minimizing both sides of (D.4) with respect to $\phi$, we obtain

$$\max_{z \in \Omega}\,\bigl|\sqrt{w(z)}\,\phi^*(z)\bigr| \ge c^{-1/2} \Bigl[\min_{\substack{\phi \in \pi_k \\ M(\phi) = 1}} \int_\Omega w(z)|\phi(z)|^2\,d\Omega\Bigr]^{1/2}. \tag{D.5}$$

Since $\phi(z)$ is a polynomial of degree at most $k$, it can be written as $\phi(z) = \sum_{i=0}^{k} \alpha_i p_i(z)$; hence

$$\int_\Omega w(z)|\phi(z)|^2\,d\Omega = \sum_{i=0}^{k} |\alpha_i|^2 \tag{D.6}$$

and

$$M(\phi) = \sum_{i=0}^{k} \alpha_i M(p_i), \tag{D.7}$$

so that the minimization problem inside the square brackets on the right-hand side of (D.5) becomes

$$\min_{\alpha_i} \sum_{i=0}^{k} |\alpha_i|^2 \quad\text{subject to}\quad \sum_{i=0}^{k} \alpha_i M(p_i) = 1. \tag{D.8}$$


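The solution of (D.8) can be achieved, for example, by using the method of Lagrange multipliers and is given by

$$\alpha_i = \hat{\alpha}_i = \frac{\overline{M(p_i)}}{\sum_{j=0}^{k} |M(p_j)|^2}, \quad i = 0, 1, \ldots, k. \tag{D.9}$$

Thus,

$$\min_{\alpha_i} \sum_{i=0}^{k} |\alpha_i|^2 = \sum_{i=0}^{k} |\hat{\alpha}_i|^2 = \Bigl[\sum_{i=0}^{k} |M(p_i)|^2\Bigr]^{-1}. \tag{D.10}$$

The result in (D.3) now follows. That (D.9) is the solution can be proved by direct verification:

$$\sum_{i=0}^{k} |\alpha_i|^2 = \sum_{i=0}^{k} |\hat{\alpha}_i|^2 + \sum_{i=0}^{k} |\alpha_i - \hat{\alpha}_i|^2 + 2\,\Re \sum_{i=0}^{k} \overline{\hat{\alpha}_i}\,(\alpha_i - \hat{\alpha}_i).$$

By the fact that $\sum_{i=0}^{k} \alpha_i M(p_i) = 1$ and $\sum_{i=0}^{k} \hat{\alpha}_i M(p_i) = 1$,

$$\sum_{i=0}^{k} \overline{\hat{\alpha}_i}\,(\alpha_i - \hat{\alpha}_i) = \frac{1}{\sum_{j=0}^{k} |M(p_j)|^2} \sum_{i=0}^{k} (\alpha_i - \hat{\alpha}_i) M(p_i) = \frac{1}{\sum_{j=0}^{k} |M(p_j)|^2} \Bigl[\sum_{i=0}^{k} \alpha_i M(p_i) - \sum_{i=0}^{k} \hat{\alpha}_i M(p_i)\Bigr] = 0.$$

Thus,

$$\sum_{i=0}^{k} |\alpha_i|^2 = \sum_{i=0}^{k} |\hat{\alpha}_i|^2 + \sum_{i=0}^{k} |\alpha_i - \hat{\alpha}_i|^2 \ge \sum_{i=0}^{k} |\hat{\alpha}_i|^2,$$

and equality holds if and only if $\alpha_i = \hat{\alpha}_i$, $i = 0, 1, \ldots, k$.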

If Ω = [a, b ] is a finite real interval, we have d Ω = d x and c = b − a. Also, if M is a point evaluation functional, i.e., M (φ) = φ(ξ ) for some ξ , then M ( pi ) = pi (ξ ) in (D.3).
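The constrained minimization (D.8) and its solution (D.9) can be checked numerically. The sketch below is only an illustration: the values assigned to $M(p_0), M(p_1), M(p_2)$ are hypothetical, and the function names are ours. It verifies that the coefficients of (D.9) satisfy the constraint, that their squared norm equals the right-hand side of (D.10), and that another feasible choice of coefficients has a larger norm.

```python
def min_norm_coefficients(m_values):
    """Coefficients of (D.9): conj(M(p_i)) / sum_j |M(p_j)|^2."""
    total = sum(abs(m) ** 2 for m in m_values)
    return [m.conjugate() / total for m in m_values]

def constraint(alpha, m_values):
    """Left-hand side of the constraint in (D.8): sum_i alpha_i M(p_i)."""
    return sum(a * m for a, m in zip(alpha, m_values))

def squared_norm(alpha):
    """Objective of (D.8): sum_i |alpha_i|^2."""
    return sum(abs(a) ** 2 for a in alpha)

# Hypothetical values of M(p_0), M(p_1), M(p_2), chosen only for illustration.
m_values = [1 + 2j, -0.5j, 3.0 + 0j]
alpha_hat = min_norm_coefficients(m_values)

# Another feasible point: add a direction d with sum_i d_i M(p_i) = 0.
d = [m_values[1], -m_values[0], 0j]
alpha_other = [a + 0.3 * di for a, di in zip(alpha_hat, d)]

bound_d10 = 1.0 / sum(abs(m) ** 2 for m in m_values)
```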

Appendix E

Chebyshev Polynomials: Basic Properties

In this appendix, we summarize very briefly some of the basic properties of Chebyshev polynomials, which we have used in various places in this book. Of course, the literature of Chebyshev polynomials and their applications is very rich. For further study, we refer the reader to the books by Rivlin [218], Mason and Handscomb [181], and Lanczos [166], for example.

E.1 Definition of Chebyshev polynomials

Let $x = \cos\theta$ for $\theta \in [0, \pi]$. The Chebyshev polynomials$^{77}$ are defined by $T_n(x) = \cos(n\theta)$, $n = 0, 1, \ldots.$ Thus, $T_0(x) = 1$, $T_1(x) = x$, $T_2(x) = 2x^2 - 1, \ldots.$ From this definition, it follows that the $T_n(x)$ satisfy the three-term recursion relation

$$T_{n+1}(x) = 2x\,T_n(x) - T_{n-1}(x), \quad n = 1, 2, \ldots,$$

from which it can be shown that $T_n(x)$ is a polynomial in $x$ of degree exactly $n$ (with leading coefficient $2^{n-1}$) and that $T_n(x)$ is even or odd depending on whether $n$ is an even or an odd integer. The three-term recursion can be generalized as

$$T_{m+n}(x) = 2\,T_m(x)\,T_n(x) - T_{|m-n|}(x).$$

Because the transformation $x = \cos\theta$ is one to one between $-1 \le x \le 1$ and $0 \le \theta \le \pi$, we see from the definition of the $T_n(x)$ that $|T_n(x)| \le 1$ for $x \in [-1, 1]$. In other words, for $x \in [-1, 1]$, $T_n(x)$ assumes values between $-1$ and $1$ only.

When $x$ is real and $|x| > 1$, the $T_n(x)$ can be shown to satisfy

$$T_n(x) = (\operatorname{sgn} x)^n \cosh(n\phi), \quad\text{where } \phi > 0 \text{ and } e^{\phi} = |x| + \sqrt{x^2 - 1} > 1.$$

77 Here we mean Chebyshev polynomials of the first kind. There are three additional kinds of Chebyshev polynomials.

As a result, we have that

$$T_n(x) \sim \tfrac{1}{2}(\operatorname{sgn} x)^n e^{n\phi} = \tfrac{1}{2}(\operatorname{sgn} x)^n \bigl(|x| + \sqrt{x^2 - 1}\bigr)^n \quad\text{as } n \to \infty.$$

In other words, when |x| > 1, the sequence {|Tn (x)|} increases to infinity exponentially in n like e nφ for some φ > 0 that depends on x.
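The relations of this section are easy to check numerically. The following minimal sketch (the function names are our own) evaluates $T_n(x)$ by the three-term recursion and compares it with $\cos(n\theta)$ on $[-1, 1]$ and with the $\cosh$ representation for real $|x| > 1$.

```python
import math

def chebyshev(n, x):
    """T_n(x) from T_{k+1} = 2 x T_k - T_{k-1}, with T_0 = 1 and T_1 = x."""
    t_prev, t_cur = 1.0, x
    if n == 0:
        return t_prev
    for _ in range(n - 1):
        t_prev, t_cur = t_cur, 2.0 * x * t_cur - t_prev
    return t_cur

def chebyshev_outside(n, x):
    """(sgn x)^n cosh(n*phi) with e^phi = |x| + sqrt(x^2 - 1), for real |x| > 1."""
    phi = math.log(abs(x) + math.sqrt(x * x - 1.0))
    sign = 1.0 if (x > 0 or n % 2 == 0) else -1.0
    return sign * math.cosh(n * phi)

theta = 0.7
inside_recursion = chebyshev(5, math.cos(theta))   # should match cos(5 * theta)
outside_recursion = chebyshev(6, -1.5)             # should match the cosh form
outside_closed = chebyshev_outside(6, -1.5)
```

For moderate $n$ the recursion is the cheaper of the two; the closed form makes the exponential growth outside $[-1, 1]$ explicit.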

E.2 Zeros, extrema, and min-max properties

$T_n(x)$ has $n$ zeros in the (open) interval $(-1, 1)$. These are given by

$$x_k = \cos\Bigl(\frac{2k - 1}{2n}\,\pi\Bigr), \quad k = 1, \ldots, n.$$

$T_n(x)$ has $n + 1$ extrema in the (closed) interval $[-1, 1]$, and these are given by

$$x_k = \cos\frac{k\pi}{n}, \quad k = 0, 1, \ldots, n.$$

[Note that $x_0 = 1$ and $x_n = -1$. That is, $\pm 1$ are extrema of $T_n(x)$.] At these points, $T_n(x)$ assumes the values $T_n(x_k) = (-1)^k$.

Among all monic polynomials of degree $n$, $2^{-n+1} T_n(x)$ is the only one that has smallest $L_\infty$-norm on $[-1, 1]$. This is known as the min-max property of Chebyshev polynomials. This is expressed formally as follows.

Theorem E.1. Denote the set of monic polynomials $p(x)$ of degree $n$ by $\widetilde{\Pi}_n$. The (monic) polynomial $\hat{p}_n(x) = 2^{-n+1} T_n(x)$ is the unique solution to the min-max problem

$$\min_{p \in \widetilde{\Pi}_n}\,\max_{-1 \le x \le 1} |p(x)|.$$

Of course, another way of expressing the min-max theorem is

$$\max_{-1 \le x \le 1} |p(x)| \ge 2^{-n+1} \max_{-1 \le x \le 1} |T_n(x)| = 2^{-n+1} \quad \forall\, p \in \widetilde{\Pi}_n.$$

To obtain the monic polynomial $\hat{p}_n(x)$ of degree $n$ that has the smallest $L_\infty$-norm in an arbitrary interval $[a, b]$, we shift the Chebyshev polynomial $T_n$ from $[-1, 1]$ to $[a, b]$. Thus, the solution to this problem is

$$\hat{p}_n(x) = 2\Bigl(\frac{b - a}{4}\Bigr)^n\,T_n\Bigl(\frac{2x - a - b}{b - a}\Bigr).$$

An additional theorem that is useful in theoretical work with Krylov subspace methods and that also involves Chebyshev polynomials is as follows.

Theorem E.2. Denote the set of polynomials $p(x)$ of degree at most $n$ that satisfy $p(c) = 1$ by $\widetilde{\Pi}_n(c)$. Let $[a, b]$ be a real positive interval and let $c < a$ or $c > b$. Consider the min-max problem

$$\min_{p \in \widetilde{\Pi}_n(c)}\,\max_{a \le x \le b} |p(x)|.$$

The solution $p_n(x)$ to this problem is

$$p_n(x) = \frac{T_n\Bigl(\dfrac{2x - a - b}{b - a}\Bigr)}{T_n\Bigl(\dfrac{2c - a - b}{b - a}\Bigr)}.$$
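To make Theorem E.2 concrete, the sketch below evaluates the stated solution for one hypothetical choice of data ($n = 4$, $[a, b] = [1, 3]$, $c = 0$) and compares its maximum modulus on a grid with that of a simple competitor satisfying the same constraint. The competitor and all names are assumptions made for the illustration; the comparison is a spot check, not a proof of optimality.

```python
def chebyshev(n, y):
    """T_n(y) by the three-term recursion."""
    t_prev, t_cur = 1.0, y
    if n == 0:
        return t_prev
    for _ in range(n - 1):
        t_prev, t_cur = t_cur, 2.0 * y * t_cur - t_prev
    return t_cur

def minmax_polynomial(n, a, b, c, x):
    """T_n((2x - a - b)/(b - a)) / T_n((2c - a - b)/(b - a)); equals 1 at x = c."""
    def to_reference(t):
        return (2.0 * t - a - b) / (b - a)
    return chebyshev(n, to_reference(x)) / chebyshev(n, to_reference(c))

n, a, b, c = 4, 1.0, 3.0, 0.0
grid = [a + (b - a) * k / 1000 for k in range(1001)]
max_optimal = max(abs(minmax_polynomial(n, a, b, c, x)) for x in grid)

# Hypothetical competitor with q(c) = 1: q(x) = ((x - mid)/(c - mid))^n.
mid = 0.5 * (a + b)
max_competitor = max(abs(((x - mid) / (c - mid)) ** n) for x in grid)
```

For this data the maximum of the Chebyshev-based polynomial is $1/|T_4(-2)| = 1/97$, whereas the competitor attains $1/16$ at the endpoints.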

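E.3 Orthogonality properties

Integral orthogonality

The $T_n(x)$ also satisfy the orthogonality property

$$\int_{-1}^{1} \frac{T_m(x)\,T_n(x)}{\sqrt{1 - x^2}}\,dx = \begin{cases} 0 & \text{if } m \ne n, \\ \tfrac{1}{2}\pi & \text{if } m = n \ne 0, \\ \pi & \text{if } m = n = 0. \end{cases}$$

Discrete orthogonality

Let $x_k$, $1 \le k \le n$, be the zeros of $T_n(x)$, and let $x_k$, $0 \le k \le n$, be the extrema of $T_n(x)$. Then the polynomials $T_s(x)$, $0 \le s \le n - 1$, satisfy the following discrete orthogonality relations:

$$\sum_{k=1}^{n} T_r(x_k)\,T_s(x_k) = \begin{cases} 0 & \text{if } r \ne s, \\ \tfrac{1}{2}n & \text{if } r = s \ne 0, \\ n & \text{if } r = s = 0, \end{cases}$$

where the $x_k$ are the zeros, and

$$\sideset{}{''}\sum_{k=0}^{n} T_r(x_k)\,T_s(x_k) = \begin{cases} 0 & \text{if } r \ne s, \\ \tfrac{1}{2}n & \text{if } r = s \ne 0, \\ n & \text{if } r = s = 0, \end{cases}$$

where the $x_k$ are the extrema.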

Here the double prime indicates that the terms with indices k = 0 and k = n are to be multiplied by 1/2.
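Both discrete orthogonality relations can be verified directly for small $n$. The following sketch, with names of our own choosing, forms the two sums for $n = 6$; the second sum applies the factor $1/2$ to the terms $k = 0$ and $k = n$.

```python
import math

def chebyshev(n, x):
    """T_n(x) by the three-term recursion."""
    t_prev, t_cur = 1.0, x
    if n == 0:
        return t_prev
    for _ in range(n - 1):
        t_prev, t_cur = t_cur, 2.0 * x * t_cur - t_prev
    return t_cur

def sum_over_zeros(r, s, n):
    """Sum of T_r(x_k) T_s(x_k) over the n zeros of T_n."""
    zeros = [math.cos((2 * k - 1) * math.pi / (2 * n)) for k in range(1, n + 1)]
    return sum(chebyshev(r, x) * chebyshev(s, x) for x in zeros)

def sum_over_extrema(r, s, n):
    """Double-primed sum over the n + 1 extrema of T_n (end terms halved)."""
    total = 0.0
    for k in range(n + 1):
        x = math.cos(k * math.pi / n)
        factor = 0.5 if k in (0, n) else 1.0
        total += factor * chebyshev(r, x) * chebyshev(s, x)
    return total

n = 6
zeros_table = [[sum_over_zeros(r, s, n) for s in range(n)] for r in range(n)]
extrema_table = [[sum_over_extrema(r, s, n) for s in range(n)] for r in range(n)]
```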

Appendix F

Useful Formulas and Results for Jacobi Polynomials

F.1 Definition of Jacobi polynomials

The Jacobi polynomials $P_k^{(\alpha,\beta)}(x)$ are defined as

$$P_k^{(\alpha,\beta)}(x) = \sum_{j=0}^{k} \binom{k + \alpha}{k - j} \binom{k + \beta}{j} \Bigl(\frac{x - 1}{2}\Bigr)^{j} \Bigl(\frac{x + 1}{2}\Bigr)^{k - j}, \tag{F.1}$$

with $\alpha > -1$ and $\beta > -1$. They are orthogonal with respect to the weight function $w(x) = (1 - x)^{\alpha}(1 + x)^{\beta}$ on $[-1, 1]$; that is,

$$\int_{-1}^{1} (1 - x)^{\alpha}(1 + x)^{\beta}\,P_m^{(\alpha,\beta)}(x)\,P_k^{(\alpha,\beta)}(x)\,dx = \delta_{mk}\,\frac{2^{\alpha+\beta+1}}{2k + \alpha + \beta + 1}\,\frac{\Gamma(k + \alpha + 1)\,\Gamma(k + \beta + 1)}{\Gamma(k + 1)\,\Gamma(k + \alpha + \beta + 1)}, \tag{F.2}$$

where $\delta_{mk}$ is the Kronecker delta. The $P_k^{(\alpha,\beta)}(x)$ are normalized such that

$$P_k^{(\alpha,\beta)}(1) = \binom{k + \alpha}{k}. \tag{F.3}$$

The normalization condition given in (F.3) is the one that has been widely accepted in the literature of orthogonal polynomials. Thus, (F.1)–(F.3) can be found in many books. See, for example, Abramowitz and Stegun [1, Chapter 22] or Szegő [314].
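The explicit sum (F.1) translates directly into code. The sketch below is a minimal illustration in which the generalized binomial coefficients are evaluated through the gamma function; the function names are our own, and no attention is paid to efficiency or to overflow for large $k$. It checks the normalization (F.3) and the special case $\alpha = \beta = 0$, for which $P_2^{(0,0)}(x) = (3x^2 - 1)/2$ is the Legendre polynomial of degree 2.

```python
import math

def binom(a, m):
    """Generalized binomial coefficient C(a, m) for real a and integer m >= 0."""
    return math.gamma(a + 1.0) / (math.gamma(m + 1.0) * math.gamma(a - m + 1.0))

def jacobi(k, alpha, beta, x):
    """Evaluate P_k^{(alpha,beta)}(x) from the explicit sum (F.1)."""
    return sum(
        binom(k + alpha, k - j) * binom(k + beta, j)
        * ((x - 1.0) / 2.0) ** j * ((x + 1.0) / 2.0) ** (k - j)
        for j in range(k + 1)
    )

value_at_one = jacobi(3, 0.5, -0.5, 1.0)      # should equal C(3.5, 3) by (F.3)
legendre_2 = jacobi(2, 0.0, 0.0, 0.3)         # should equal (3 * 0.09 - 1) / 2
growth = [abs(jacobi(k, 0.5, -0.5, 1.2)) for k in range(6)]   # values at x = 1.2 > 1
```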

F.2 Some consequences

Theorem F.1. For $x > 1$ and $x < -1$, with $x$ fixed otherwise, the sequence $\{|P_k^{(\alpha,\beta)}(x)|\}_{k=0}^{\infty}$ is monotonically increasing.

Proof. We start with the case $x > 1$. First, all the terms in the summation on the right-hand side of (F.1) are positive for $x > 1$. Next, the $j$th term of $P_k^{(\alpha,\beta)}(x)$ in (F.1) is strictly less than the corresponding term of $P_{k+1}^{(\alpha,\beta)}(x)$. The result now follows. As for $x < -1$, we first recall that

$$P_k^{(\alpha,\beta)}(-x) = (-1)^k P_k^{(\beta,\alpha)}(x) \tag{F.4}$$

and then apply the result for $x > 1$, which we have already proved, to the polynomials $P_k^{(\beta,\alpha)}(-x)$.

Polynomials orthogonal on $[a, b]$ with respect to the weight function $w(x) = (b - x)^{\alpha}(x - a)^{\beta}$ are

$$p_k(x) = P_k^{(\alpha,\beta)}\Bigl(2\,\frac{x - a}{b - a} - 1\Bigr). \tag{F.5}$$

Polynomials orthogonal on $[-1, 1]$ with respect to the weight function $w(x) = |x|^{2n}$ can be expressed in terms of Jacobi polynomials and are given by (see [314, pp. 59–60])

$$p_k(x) = \begin{cases} P_\nu^{(0,\,n-1/2)}(2x^2 - 1) & \text{if } k = 2\nu, \\ x\,P_\nu^{(0,\,n+1/2)}(2x^2 - 1) & \text{if } k = 2\nu + 1. \end{cases} \tag{F.6}$$

By proper manipulation, $p_k(x)$ in (F.6), for both even and odd values of $k$, can be expressed in the unified form

$$p_k(x) = \sum_{j=0}^{\nu} \binom{\nu}{j} \binom{n + \mu - 1/2}{j} (x^2 - 1)^{j}\,x^{k - 2j}, \quad \nu = \Bigl\lfloor \frac{k}{2} \Bigr\rfloor, \quad \mu = \Bigl\lfloor \frac{k + 1}{2} \Bigr\rfloor. \tag{F.7}$$

Theorem F.2. The polynomials $p_k(x)$ that are defined in (F.6) are such that, for $x$ real and $|x| > 1$, or for $x$ purely imaginary and $|x| \ge 1$, the sequence $\{|p_k(x)|\}_{k=0}^{\infty}$ is monotonically increasing. For $x$ purely imaginary and $|x| < 1$, the sequences $\{|p_{2\nu}(x)|\}_{\nu=0}^{\infty}$ and $\{|p_{2\nu+1}(x)|\}_{\nu=0}^{\infty}$ are monotonically increasing.

Proof. We start with (F.7). Note that both $\nu$ and $\mu$ are monotonically nondecreasing in $k$, and that one of them is always increasing. Letting $x$ be real and $x > 1$, we see that all the terms in the summation on the right-hand side of (F.7) are positive. Next, the $j$th term of $p_k(x)$ in (F.7) is strictly less than the corresponding term of $p_{k+1}(x)$. The result now follows for $x > 1$. For $x < -1$, we note that $p_k(-x) = (-1)^k p_k(x)$ and apply the result for $x > 1$, which we have already proved, to the polynomials $p_k(-x)$.

For the case in which $x$ is purely imaginary, that is, $x = \mathrm{i}\xi$, $\xi$ real, the factor $(x^2 - 1)^{j} x^{k-2j}$ in the $j$th term of $p_k(x)$ becomes $\mathrm{i}^k (\xi^2 + 1)^{j} \xi^{k-2j}$. The proof for the case $|x| \ge 1$ can now be completed as before. The proof of the case $|x| < 1$ can be carried out by employing Theorem F.1 in conjunction with (F.6).

Appendix G

Rayleigh Quotient and Power Method

G.1 Properties of the Rayleigh quotient

As we saw in Chapter 10, the method of Arnoldi, with $k = 1$ and an initial vector $u \ne 0$, produces the approximate eigenvalue (i.e., the Ritz value) $\hat{\mu}$ and the approximate eigenvector (i.e., the Ritz vector) $\hat{x}$, which are given as

$$\hat{\mu} = \frac{u^* A u}{u^* u} \equiv \rho(A; u) \quad\text{and}\quad \hat{x} = u. \tag{G.1}$$

$\rho(A; u)$ is known as the Rayleigh quotient. $\rho(A; u)$ has interesting properties, some of which we state here.

If $u$ is an approximation to an eigenvector, then $\hat{\mu}$ is an approximation to the corresponding eigenvalue. Whether $u$ is an eigenvector of $A$ or not,

$$u^*(Au - \hat{\mu} u) = 0, \tag{G.2}$$

from which we also have$^{78}$

$$\|Au - \hat{\mu} u\| = \min_{\mu \in \mathbb{C}} \|Au - \mu u\|, \qquad \|Au - \mu u\| = \|Au - \hat{\mu} u\| \;\Leftrightarrow\; \mu = \hat{\mu}. \tag{G.3}$$

The set

$$\mathcal{F}(A) = \Bigl\{ \frac{u^* A u}{u^* u} : u \in \mathbb{C}^N,\ u \ne 0 \Bigr\}$$

is known as the field of values of $A$. Of course, $\mathcal{F}(A)$ contains all the eigenvalues of $A$. It has quite a few interesting and useful properties. For example, $\mathcal{F}(A)$ is compact and also convex in the complex plane. This result is known as the Hausdorff–Toeplitz theorem, whose proof can be found in Horn and Johnson [139], for example. If $A$ is a normal matrix, then $\mathcal{F}(A)$ is the convex hull of the eigenvalues of $A$. If $A$ is Hermitian, then $\mathcal{F}(A)$ is the real interval $[\mu_{\min}, \mu_{\max}]$, where $\mu_{\min}$ and $\mu_{\max}$ are, respectively, the smallest and the largest eigenvalues of $A$.

78 Throughout this appendix, $\|\cdot\|$ stands for the standard Euclidean vector norm in $\mathbb{C}^N$ and the matrix norm induced by it.


$\rho(A; u)$ has a most useful property, which also has practical implications; namely, the closer $u$ is to an eigenvector, the closer $\rho(A; u)$ is to the corresponding eigenvalue. Precisely this property is the subject of the following lemma.

Lemma G.1. Let $A \in \mathbb{C}^{N \times N}$; let $(\mu, x)$, $\|x\| = 1$, be an eigenpair of $A$; and assume that $\mu$ has only corresponding eigenvectors but no corresponding generalized eigenvectors. Denote by $\mathcal{Y}$ the subspace spanned by the eigenvectors and principal vectors corresponding to the eigenvalues different from $\mu$. Denote by $C$ the restriction of $A - \mu I$ to the subspace $\mathcal{Y}$. Let $u$ be an approximation to $x$ such that $u = c(x + \varepsilon)$, where $c$ is some scalar and the error vector $\varepsilon$ lies in $\mathcal{Y}$ and satisfies $\|\varepsilon\| \le \sigma < 1$.$^{79}$ Let

$$\hat{\mu} = \frac{u^* A u}{u^* u}. \tag{G.4}$$

Then

$$\begin{aligned}
|\hat{\mu} - \mu| &\le K_1(\|\varepsilon\|)\,\|C\|\,\|\varepsilon\| && \text{if $A$ is nonnormal}, & K_1(t) &= \frac{1 + t}{(1 - t)^2}, \\
|\hat{\mu} - \mu| &\le K_2(\|\varepsilon\|)\,\|C\|\,\|\varepsilon\|^2 && \text{if $A$ is normal}, & K_2(t) &= \frac{1}{(1 - t)^2}.
\end{aligned} \tag{G.5}$$

$K_1(t)$ and $K_2(t)$ are both increasing for $t \in (0, 1)$; therefore,

$$K_1(\|\varepsilon\|) \le K_1(\sigma) \quad\text{and}\quad K_2(\|\varepsilon\|) \le K_2(\sigma). \tag{G.6}$$

Remarks:

1. Note that $\mathcal{Y}$ is an invariant subspace of $A$ and hence of $A - aI$ for any $a$.

2. Since $C$ is the restriction of $A - \mu I$ to the subspace $\mathcal{Y}$, namely, $C = (A - \mu I)|_{\mathcal{Y}}$, we have

$$\|C\| = \max_{\substack{y \in \mathcal{Y} \\ y \ne 0}} \frac{\|(A - \mu I) y\|}{\|y\|}. \tag{G.7}$$

3. Because $\mu$ has only corresponding eigenvectors and no generalized eigenvectors, every nonzero vector $u \in \mathbb{C}^N$, $u \notin \mathcal{Y}$, can be expressed as $u = u' + u''$, where $u'$ is an eigenvector of $A$ corresponding to the eigenvalue $\mu$ and $u''$ is in $\mathcal{Y}$.

4. The bound given in (G.5) for nonnormal $A$ is valid also for normal $A$; it is inferior to the one stated specifically for normal $A$, however.

5. From the statement of the lemma, it is clear that the closer $\varepsilon$ is to zero, the closer $\hat{\mu}$ is to $\mu$. In addition, $\hat{\mu}$ is a better approximation to $\mu$ when $A$ is normal than when it is not.

6. Finally, $\lim_{\varepsilon \to 0} \hat{\mu} = \mu$.

Proof. Substituting $u = c(x + \varepsilon)$ into (G.4), subtracting $\mu$ from both sides, and invoking $Ax = \mu x$, we obtain

$$\hat{\mu} - \mu = \frac{x^*(A - \mu I)\varepsilon + \varepsilon^*(A - \mu I)\varepsilon}{\|x + \varepsilon\|^2} \equiv \frac{\mathrm{Num}(\varepsilon)}{\mathrm{Den}(\varepsilon)}. \tag{G.8}$$

In what follows, we invoke $\|x\| = 1$ and $\|\varepsilon\| \le \sigma < 1$.

1. The denominator (which is real and positive) can be bounded from below as follows:

$$\mathrm{Den}(\varepsilon) \ge (\|x\| - \|\varepsilon\|)^2 = (1 - \|\varepsilon\|)^2.$$

2. As for the numerator, we have two different cases to consider:

• When $A$ is nonnormal, invoking the fact that $(A - \mu I)\varepsilon = C\varepsilon$, we have

$$|\mathrm{Num}(\varepsilon)| \le \|C\|\,\|\varepsilon\| + \|C\|\,\|\varepsilon\|^2 = (1 + \|\varepsilon\|)\,\|C\|\,\|\varepsilon\|.$$

• When $A$ is normal, $x^* z = 0$ for all $z \in \mathcal{Y}$. In addition, $(A - \mu I) y \in \mathcal{Y}$ if $y \in \mathcal{Y}$, because $\mathcal{Y}$ is an invariant subspace of $A - \mu I$. Therefore,

$$x^*(A - \mu I)\varepsilon = 0 \quad\Rightarrow\quad \mathrm{Num}(\varepsilon) = \varepsilon^* C \varepsilon.$$

Consequently,

$$|\mathrm{Num}(\varepsilon)| \le \|C\|\,\|\varepsilon\|^2.$$

The result now follows.

79 Because $\|x\| = 1$, $\varepsilon$ can be viewed as the relative error vector in the sense that $\|\varepsilon\| = \|c^{-1} u - x\|/\|x\|$.
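The two bounds of Lemma G.1 can be observed on small examples. The sketch below uses two hypothetical real $2 \times 2$ matrices that share the eigenpair $\mu = 2$, $x = e_1$: a diagonal (normal) matrix and an upper triangular (nonnormal) one. In both cases $\|C\| = 3$, and the perturbation has norm $t = 0.1$ and lies in the subspace spanned by the eigenvector belonging to the other eigenvalue. All names and data are assumptions made for the illustration.

```python
import math

def matvec(a, v):
    """Matrix-vector product for a dense matrix stored as a list of rows."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]

def rayleigh(a, u):
    """Rayleigh quotient u^T A u / u^T u for real data."""
    au = matvec(a, u)
    return sum(ui * aui for ui, aui in zip(u, au)) / sum(ui * ui for ui in u)

def k1(t):
    return (1.0 + t) / (1.0 - t) ** 2

def k2(t):
    return 1.0 / (1.0 - t) ** 2

mu, norm_c, t = 2.0, 3.0, 0.1

# Normal case: A = diag(2, 5), x = e_1, eps = t * e_2.
err_normal = abs(rayleigh([[2.0, 0.0], [0.0, 5.0]], [1.0, t]) - mu)

# Nonnormal case: A = [[2, 1], [0, 5]], x = e_1, eps = t * (1, 3)/sqrt(10),
# where (1, 3) is the eigenvector belonging to the eigenvalue 5.
y = [1.0 / math.sqrt(10.0), 3.0 / math.sqrt(10.0)]
err_nonnormal = abs(rayleigh([[2.0, 1.0], [0.0, 5.0]], [1.0 + t * y[0], t * y[1]]) - mu)

bound_normal = k2(t) * norm_c * t ** 2
bound_nonnormal = k1(t) * norm_c * t
```

With these data the error is of order $t^2$ in the normal case and of order $t$ in the nonnormal case, in agreement with (G.5).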

G.2 The power method

In view of Lemma G.1, the Rayleigh quotient can be used to approximate the eigenvalue with largest modulus of $A$ (and a corresponding eigenvector) in conjunction with power iterations, which amounts to the following theoretically:

1. Choose an initial vector $u_0$, and compute the vectors $u_1, u_2, \ldots$ via

$$u_{m+1} = A u_m, \quad m = 0, 1, \ldots. \tag{G.9}$$

2. Compute $\mu^{(m)}$, $m = 0, 1, \ldots,$ as in

$$\mu^{(m)} = \frac{u_m^* A u_m}{u_m^* u_m} = \frac{u_m^* u_{m+1}}{u_m^* u_m}. \tag{G.10}$$

The method we have just described is known as the power method. Let us assume for simplicity that $A$ is diagonalizable. Then, by the fact that $u_m = A^m u_0$, it is clear that $u_m$ has the structure

$$u_m = \sum_{i=1}^{p} v_i \lambda_i^m, \quad m = 1, 2, \ldots, \quad p \le N, \tag{G.11}$$

where $\lambda_1, \ldots, \lambda_p$ are some or all of the distinct nonzero eigenvalues of $A$ and, for each $i$, $v_i$ is an eigenvector of $A$ corresponding to $\lambda_i$; that is, $A v_i = \lambda_i v_i$, $i = 1, \ldots, p$.$^{80}$ The relevant theorem for this case is as follows.

80 Recall Lemma 6.4 and Example 6.5 in Section 6.2.

Theorem G.2. With $\lambda_1, \lambda_2, \ldots, \lambda_p$ in (G.11) ordered as

$$|\lambda_1| \ge |\lambda_2| \ge \cdots \ge |\lambda_p|, \tag{G.12}$$

assume also that

$$|\lambda_1| > |\lambda_2|. \tag{G.13}$$

Then

$$\begin{aligned}
\mu^{(m)} &= \lambda_1 + O(|\lambda_2/\lambda_1|^{m}) && \text{as } m \to \infty && \text{if $A$ is nonnormal}, \\
\mu^{(m)} &= \lambda_1 + O(|\lambda_2/\lambda_1|^{2m}) && \text{as } m \to \infty && \text{if $A$ is normal},
\end{aligned} \tag{G.14}$$

and

$$u_m = \lambda_1^m \bigl[v_1 + O(|\lambda_2/\lambda_1|^{m})\bigr] \quad\text{as } m \to \infty \quad\text{in both cases}. \tag{G.15}$$

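Proof. Writing (G.11) in the form $u_m = c_m(x + \varepsilon_m)$, with

$$c_m = \|v_1\|\,\lambda_1^m, \quad x = \frac{v_1}{\|v_1\|}, \quad \varepsilon_m = \sum_{i=2}^{p} \frac{v_i}{\|v_1\|} \Bigl(\frac{\lambda_i}{\lambda_1}\Bigr)^{m} = O(|\lambda_2/\lambda_1|^{m}) \quad\text{as } m \to \infty,$$

we now apply Lemma G.1.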

Remarks:

1. Needless to say, with the ordering of the $\lambda_i$ as in (G.12) and (G.13), for an arbitrarily chosen vector $u_0$, $\lambda_1$ in (G.11) is likely to be the largest (in modulus) eigenvalue of $A$ whether this eigenvalue is simple or multiple. Even when an eigenvector corresponding to the largest eigenvalue of $A$ is missing from $u_0$, roundoff errors in floating-point arithmetic will introduce such an eigenvector into $u_m$ as the power iterations progress.

2. We can prove a similar result for the case in which $A$ is nonnormal and nondiagonalizable (that is, defective) but the eigenvalue $\mu$ has equal algebraic and geometric multiplicities. This means that the eigenvalues other than $\mu$ may have unequal algebraic and geometric multiplicities. In this case, with the ordering of the $\lambda_i$ as in (G.12) and (G.13), we have$^{81}$

$$u_m = v_1 \lambda_1^m + \sum_{i=2}^{q} \tilde{v}_i(m)\,\lambda_i^m, \quad \tilde{v}_i(m) \text{ vector-valued polynomials in } m.$$

Consequently, (G.14) and (G.15) now read, respectively,

$$\mu^{(m)} = \lambda_1 + O(m^{\nu} |\lambda_2/\lambda_1|^{m}) \quad\text{as } m \to \infty$$

and

$$u_m = \lambda_1^m \bigl[v_1 + O(m^{\nu} |\lambda_2/\lambda_1|^{m})\bigr] \quad\text{as } m \to \infty.$$

Here $\nu$ is some nonnegative integer; it is equal to the degree of $\tilde{v}_2(m)$ when $|\lambda_2| > |\lambda_3|$, for example. Of course, when $A$ is normal, it is necessarily diagonalizable.

If the Rayleigh quotient method is applied as described above, in floating-point arithmetic, we may run into overflows when $|\lambda_1| > 1$ and underflows when $|\lambda_1| < 1$. To avoid these problems, the method should be applied as follows:

81 Recall Lemma 6.22 in Section 6.8.


1. Choose an initial vector $u_0$, $\|u_0\| = 1$.

2. For $m = 0, 1, \ldots$ do

   Compute $\tilde{u}_{m+1} = A u_m$ and set

   $$\mu^{(m)} = \frac{u_m^* A u_m}{u_m^* u_m} = \frac{u_m^* \tilde{u}_{m+1}}{u_m^* u_m}.$$

   Set $u_{m+1} = \tilde{u}_{m+1} / \|\tilde{u}_{m+1}\|$.

   end do

It is easy to verify that the $\mu^{(m)}$ here are the same as the ones we introduced in (G.10).

Remark: It is clear that the Rayleigh quotient method converges provided $|\lambda_1| > |\lambda_2|$ when the $\lambda_i$ are ordered as in (G.12), the rate of convergence being determined by the ratio $|\lambda_2/\lambda_1|$, the largest of all the ratios $|\lambda_i/\lambda_1|$, $2 \le i \le p$. Clearly, the smaller this ratio, the faster the convergence.

In certain cases, we can apply the power method to the vectors $u_{m+1} = (A - aI) u_m$, $m = 0, 1, \ldots,$ by choosing the scalar $a$ such that the ratio $|\lambda_2 - a|/|\lambda_1 - a|$ is smaller than $|\lambda_2/\lambda_1|$. The scalar $a$ is called a shift in this usage of the power method, which is now called a power method with a shift. When the $\lambda_i$ are all real and positive and $\lambda_1 > \lambda_2 \ge \lambda_3 \ge \cdots \ge \lambda_p > 0$, the choice $a = \lambda_p$ will yield an improved rate of convergence. The optimal $a$ that minimizes $\max_{2 \le i \le p} \{|\lambda_i - a|/|\lambda_1 - a|\}$ is given as $a_{\mathrm{opt}} = (\lambda_2 + \lambda_p)/2$, and we have

$$\max_{2 \le i \le p} \{|\lambda_i - a_{\mathrm{opt}}|\} = |\lambda_2 - a_{\mathrm{opt}}| < |\lambda_1 - a_{\mathrm{opt}}|.$$

The resulting power method is known as Wilkinson’s method. See, for example, [214, pp. 496–497]. Of course, the basis for these developments is the fact that (λi − a, v i ) are eigenpairs of A − aI .
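The normalized power iterations and the effect of a shift can be sketched as follows. This is a minimal illustration for real matrices only; the function names, the diagonal test matrix with eigenvalues 10, 9, and 1, and the number of steps are all assumptions made for the example. With $a_{\mathrm{opt}} = (9 + 1)/2 = 5$, the convergence ratio drops from $0.9$ to $0.8$.

```python
import math

def matvec(a, v):
    """Matrix-vector product for a dense matrix stored as a list of rows."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def power_method(a, u0, steps, shift=0.0):
    """Normalized power iterations applied to A - shift*I (real data).

    Returns the last Rayleigh quotient, shifted back to approximate an
    eigenvalue of A, together with the last normalized vector.
    """
    n0 = norm(u0)
    u = [x / n0 for x in u0]
    mu = 0.0
    for _ in range(steps):
        w = matvec(a, u)
        w = [wi - shift * ui for wi, ui in zip(w, u)]   # w = (A - shift*I) u
        mu = sum(ui * wi for ui, wi in zip(u, w))       # Rayleigh quotient, ||u|| = 1
        nw = norm(w)
        u = [wi / nw for wi in w]                       # normalize to avoid overflow
    return mu + shift, u

a = [[10.0, 0.0, 0.0], [0.0, 9.0, 0.0], [0.0, 0.0, 1.0]]
lam_plain, _ = power_method(a, [1.0, 1.0, 1.0], 40)
lam_shifted, _ = power_method(a, [1.0, 1.0, 1.0], 40, shift=5.0)
```

After the same number of steps, the shifted iteration gives a noticeably smaller error in the approximation to $\lambda_1 = 10$.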

G.3 Inverse power method or inverse iteration

The power method can be used to approximate the eigenvalue that is closest to a number $a \in \mathbb{C}$ (and the corresponding eigenvector) by applying it to the vectors $u_{m+1} = B u_m$, $m = 0, 1, \ldots,$ with the matrix $B = (A - aI)^{-1}$, assuming that $a$ is not an eigenvalue of $A$ so that $B$ is nonsingular. In particular, with $a = 0$, it can be used to approximate the smallest eigenvalue of $A$ when $A$ is nonsingular. This method is known as the inverse power method or inverse iteration. In this case, $u_{m+1} = (A - aI)^{-1} u_m$; hence, assuming again that $A$ is diagonalizable, we have

$$u_m = \sum_{i=1}^{p} v_i \Bigl(\frac{1}{\lambda_i - a}\Bigr)^{m}, \quad m = 1, 2, \ldots, \quad p \le N.$$

Again, $\lambda_i$ are some or all of the distinct eigenvalues of $A$ and $v_i$ are corresponding eigenvectors; that is, $(A - aI)^{-1} v_i = (\lambda_i - a)^{-1} v_i$, $i = 1, \ldots, p$. Let us order the $\lambda_i$ such that

$$|(\lambda_1 - a)^{-1}| \ge |(\lambda_2 - a)^{-1}| \ge \cdots \ge |(\lambda_p - a)^{-1}|$$

and assume also that

$$|(\lambda_1 - a)^{-1}| > |(\lambda_2 - a)^{-1}|.$$

Then Theorem G.2 applies with $\lambda_i$ there replaced by $(\lambda_i - a)^{-1}$, from which we also obtain

$$\begin{aligned}
\frac{1}{\mu^{(m)}} + a &= \lambda_1 + O\Bigl(\Bigl|\frac{\lambda_1 - a}{\lambda_2 - a}\Bigr|^{m}\Bigr) && \text{as } m \to \infty && \text{if $A$ is nonnormal}, \\
\frac{1}{\mu^{(m)}} + a &= \lambda_1 + O\Bigl(\Bigl|\frac{\lambda_1 - a}{\lambda_2 - a}\Bigr|^{2m}\Bigr) && \text{as } m \to \infty && \text{if $A$ is normal},
\end{aligned} \tag{G.16}$$

and

$$u_m = (\lambda_1 - a)^{-m} \Bigl[v_1 + O\Bigl(\Bigl|\frac{\lambda_1 - a}{\lambda_2 - a}\Bigr|^{m}\Bigr)\Bigr] \quad\text{as } m \to \infty \quad\text{in all cases}. \tag{G.17}$$

Of course the vector $u_{m+1}$ should not be computed by inverting $A - aI$; rather, it should be obtained by actually solving $(A - aI) u_{m+1} = u_m$, which can be done via the use of the LU factorization of $A - aI$. Once the LU factorization has been computed, it can be used to solve for $u_{m+1}$ from $(A - aI) u_{m+1} = u_m$ by solving two triangular systems for each $m$. This is especially efficient when $A$ has been transformed, via suitable similarity transformations, to a simple form, such as tridiagonal or upper Hessenberg. In addition, the vectors $u_m$ should be normalized to avoid overflows or underflows, as explained above.

It can be argued that if $a$ is too close to $\lambda_1$, then the matrix $A - aI$ will be close to singular and hence ill conditioned. Even though this is true, the method works efficiently because of this ill-conditioning, since the computed $u_{m+1}$ will be enhanced in the direction of $v_1$, which is what we want ultimately. This can be shown by analyzing the solution $u$ to $(A - aI) u = u'$ in terms of the spectral decomposition of $u'$: Assume for simplicity that $A$ is diagonalizable with $(\mu_i, x_i)$, $i = 1, \ldots, N$, as its eigenpairs, and that $|\mu_1 - a| < \min_{2 \le i \le N} |\mu_i - a|$. Clearly, $u' = \sum_{i=1}^{N} \alpha_i x_i$ for some scalars $\alpha_i$, and, therefore, $u = \sum_{i=1}^{N} \frac{\alpha_i}{\mu_i - a} x_i$. This shows that, the closer $a$ is to $\mu_1$, the richer $u$ is (compared to $u'$) in the direction of $x_1$, which is what we want to happen eventually, even when the floating-point computation of $u$ suffers from large roundoff errors.

Note that the closer $a$ is to an eigenvalue, the faster the convergence of the method to that eigenvalue. We make use of this fact to vary the shift from one iteration to the next, making it closer and closer to an eigenvalue, as we discuss next.
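The recommendation to factor $A - aI$ once and to solve two triangular systems per iteration can be sketched as follows. This is a minimal dense illustration for real matrices, not a tuned implementation: the helper names `lu_factor`, `lu_solve`, and `inverse_iteration`, the $3 \times 3$ tridiagonal test matrix with eigenvalues $2 - \sqrt{2}$, $2$, $2 + \sqrt{2}$, and the shift $a = 0.5$ are all assumptions made for the example.

```python
import math

def lu_factor(a):
    """LU factorization with partial pivoting; returns (packed LU, row permutation)."""
    n = len(a)
    lu = [row[:] for row in a]
    perm = list(range(n))
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        lu[k], lu[p] = lu[p], lu[k]
        perm[k], perm[p] = perm[p], perm[k]
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]
    return lu, perm

def lu_solve(lu, perm, b):
    """Solve using the packed factors: forward, then back substitution."""
    n = len(b)
    y = [b[p] for p in perm]
    for i in range(n):                       # unit lower triangular system
        y[i] -= sum(lu[i][j] * y[j] for j in range(i))
    for i in reversed(range(n)):             # upper triangular system
        y[i] = (y[i] - sum(lu[i][j] * y[j] for j in range(i + 1, n))) / lu[i][i]
    return y

def inverse_iteration(a, shift, u0, steps):
    """Power method applied to B = (A - shift*I)^{-1} without forming B."""
    n = len(a)
    shifted = [[a[i][j] - (shift if i == j else 0.0) for j in range(n)] for i in range(n)]
    lu, perm = lu_factor(shifted)            # factor once, reuse for every m
    n0 = math.sqrt(sum(x * x for x in u0))
    u = [x / n0 for x in u0]
    theta = 0.0
    for _ in range(steps):
        w = lu_solve(lu, perm, u)            # solves (A - shift*I) w = u
        theta = sum(ui * wi for ui, wi in zip(u, w))   # Rayleigh quotient of B
        nw = math.sqrt(sum(x * x for x in w))
        u = [x / nw for x in w]
    return 1.0 / theta + shift, u            # 1/mu^(m) + a, as in (G.16)

a = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]
lam, vec = inverse_iteration(a, 0.5, [1.0, 1.0, 1.0], 12)
```

Here `lam` approximates the eigenvalue of the test matrix closest to the shift, namely $2 - \sqrt{2}$.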

G.4 Inverse power method with variable shifts

The scalar parameter $a$ in the inverse power method just discussed is held fixed throughout. We now discuss a different version of this method in which $a$ is varied at each iteration, when we have an approximate eigenvector $w_0$ of $A$. First, we compute the Rayleigh quotient

$$\mu^{(0)} = \frac{w_0^* A w_0}{w_0^* w_0} \tag{G.18}$$

as an approximation to the relevant eigenvalue, followed by computing the vector $w_1$, the improvement of $w_0$, via

$$\tilde{w}_1 = (A - \mu^{(0)} I)^{-1} w_0, \quad w_1 = \frac{\tilde{w}_1}{\|\tilde{w}_1\|}. \tag{G.19}$$

Assuming that we have computed $w_m$, we go on to compute $\mu^{(m)}$ and $w_{m+1}$ as follows:

$$\mu^{(m)} = \frac{w_m^* A w_m}{w_m^* w_m}, \tag{G.20}$$

$$\tilde{w}_{m+1} = (A - \mu^{(m)} I)^{-1} w_m, \quad w_{m+1} = \frac{\tilde{w}_{m+1}}{\|\tilde{w}_{m+1}\|}. \tag{G.21}$$
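The iteration defined by (G.20) and (G.21) can be sketched as follows for real symmetric matrices. This is only an illustration: the dense Gaussian elimination routine, the stopping rule based on the residual norm, the guard against an exactly singular shifted matrix, and the test data (a $3 \times 3$ tridiagonal matrix and a starting vector near the eigenvector belonging to $2 - \sqrt{2}$) are assumptions of ours.

```python
import math

def matvec(a, v):
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting (small dense a)."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(m[i][k]))
        m[k], m[p] = m[p], m[k]
        for i in range(k + 1, n):
            f = m[i][k] / m[k][k]
            for j in range(k, n + 1):
                m[i][j] -= f * m[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def rayleigh_quotient_iteration(a, w0, max_steps=10, tol=1e-12):
    """Inverse iteration with the Rayleigh quotient as a variable shift (real data)."""
    n = len(a)
    n0 = norm(w0)
    w = [x / n0 for x in w0]
    mu, residuals = 0.0, []
    for _ in range(max_steps):
        aw = matvec(a, w)
        mu = sum(wi * awi for wi, awi in zip(w, aw))          # (G.20), ||w|| = 1
        residuals.append(norm([awi - mu * wi for awi, wi in zip(aw, w)]))
        if residuals[-1] <= tol:
            break
        shifted = [[a[i][j] - (mu if i == j else 0.0) for j in range(n)] for i in range(n)]
        try:
            w_tilde = solve(shifted, w)                       # (G.21)
        except ZeroDivisionError:
            break   # shifted matrix exactly singular: mu is an eigenvalue to working precision
        nt = norm(w_tilde)
        w = [x / nt for x in w_tilde]
    return mu, w, residuals

a = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]
mu, w, residuals = rayleigh_quotient_iteration(a, [0.5, 0.7, 0.55])
```

Because a new matrix must be factored at every step, the cost per iteration is higher than with a fixed shift; the list `residuals` records the residual norms $\|A w_m - \mu^{(m)} w_m\|$ so that the rapid decrease can be inspected.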

Clearly, $\mu^{(m)}$, $m = 0, 1, \ldots,$ play the role of the scalar $a$ in the inverse power method. Also, $\|w_m\| = 1$ for all $m$. One drawback of this method is that, at each stage, we have to solve a linear system with a different matrix, which increases the cost per iteration compared to inverse power iteration with a fixed shift. As we show below in Theorem G.3, the new method converges much faster, however. In fact, it converges at least quadratically. This suggests that we need to solve the linear systems in (G.18)–(G.21) only a few times.

The issue of convergence for this new method [described via (G.18)–(G.21)] was considered first by Ostrowski [200, 201]. See also Kahan [152] and Parlett [207, 208]. In the next theorem, we give an independent treatment of the local convergence properties of this method; we show that, if $w_0$ is sufficiently close to an eigenvector $x$ corresponding to some eigenvalue $\mu$ of $A$, then $\{w_m\}$ converges to $x$ (i) cubically when $A$ is normal and (ii) quadratically when $A$ is nonnormal. Of course, $\{\mu^{(m)}\}$ converges to $\mu$. For a detailed treatment of the global convergence properties of this method, see [208]. As in Lemma G.1, we consider simultaneously both diagonalizable and nondiagonalizable matrices in this treatment. We also use the notation of Lemma G.1.

Theorem G.3. Let $A \in \mathbb{C}^{N \times N}$, and let $\mu$ be an eigenvalue of $A$ that has only corresponding eigenvectors but no corresponding generalized eigenvectors. Denote by $\mathcal{Y}$ the subspace spanned by the eigenvectors and principal vectors corresponding to the eigenvalues different from $\mu$. Denote by $C$ the restriction of $A - \mu I$ to the subspace $\mathcal{Y}$; that is, $C = (A - \mu I)|_{\mathcal{Y}}$. Let $w_0 = c_0(x + \varepsilon_0)$, where $x$, $\|x\| = 1$, is an eigenvector of $A$ corresponding to $\mu$ and $\varepsilon_0$ lies in $\mathcal{Y}$.$^{82}$ Let the sequences $\{\mu^{(m)}\}$ and $\{w_m\}$ be generated via (G.20) and (G.21), and assume that $\mu^{(m)} \ne \mu$ for every $m$. Then the following are true:

1. $w_m$ are of the form

$$w_m = c_m(x + \varepsilon_m), \quad \varepsilon_m \in \mathcal{Y}, \quad m = 0, 1, \ldots, \tag{G.22}$$

where $c_m$ are some scalars that ensure $\|w_m\| = 1$.

2. Provided $\|\varepsilon_0\| \le \sigma$
0. Since this also implies that $\|\varepsilon_m\| \le \sigma < 1/(9\kappa) < 1/3$, by Lemma G.1,

$$|\mu - \mu^{(m)}| \le K_1(\sigma)\,\|C\|\,\|\varepsilon_m\| \quad\text{whether $A$ is normal or not}.$$

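Consequently,$^{84}$

$$\|E_m\| \le \bigl[K_1(\sigma)\,\|C\|\,\|\varepsilon_m\|\bigr]\,\|C^{-1}\| = K_1(\sigma)\,\kappa\,\|\varepsilon_m\| < 3\kappa\sigma < \frac{1}{3} \tag{G.28}$$

whether $A$ is normal or not. As a result, $E_m(I + E_m)^{-1}$ satisfies$^{85}$

$$\|E_m(I + E_m)^{-1}\| \le \|E_m\|\,\|(I + E_m)^{-1}\| \le \frac{\|E_m\|}{1 - \|E_m\|} < \frac{1}{2}.$$

Consequently,

$$\|\varepsilon_{m+1}\| \le \|E_m(I + E_m)^{-1}\|\,\|\varepsilon_m\| < \frac{1}{2}\|\varepsilon_m\| \le 2^{-(m+1)}\sigma, \tag{G.29}$$

which completes the proof of (G.26). Of course, this implies that $\lim_{m \to \infty} (\mu - \mu^{(m)}) = 0$ by Lemma G.1. Consequently, we also have that $\lim_{m \to \infty} E_m = O$ by (G.27).

We now turn to the order of convergence of the $\varepsilon_m$ to 0. Invoking Lemma G.1 in (G.27), we have

$$\begin{aligned}
\|\varepsilon_{m+1}\| &\le \|E_m\|\,\|(I + E_m)^{-1}\|\,\|\varepsilon_m\| \\
&\le \bigl[|\mu - \mu^{(m)}|\,\|C^{-1}\|\bigr]\,\|(I + E_m)^{-1}\|\,\|\varepsilon_m\| \\
&\le \bigl[K_s(\|\varepsilon_m\|)\,\kappa\,\|\varepsilon_m\|^{s}\bigr]\,\|(I + E_m)^{-1}\|\,\|\varepsilon_m\|,
\end{aligned}$$

where, we recall,

$$\begin{aligned}
s &= 1, & K_1(\|\varepsilon_m\|) &= \frac{1 + \|\varepsilon_m\|}{(1 - \|\varepsilon_m\|)^2} && \text{when $A$ is nonnormal}, \\
s &= 2, & K_2(\|\varepsilon_m\|) &= \frac{1}{(1 - \|\varepsilon_m\|)^2} && \text{when $A$ is normal}.
\end{aligned}$$

Consequently,

$$\frac{\|\varepsilon_{m+1}\|}{\|\varepsilon_m\|^{s+1}} \le \kappa\,K_s(\|\varepsilon_m\|)\,\|(I + E_m)^{-1}\|,$$

which, upon letting $m \to \infty$, invoking $\lim_{m \to \infty} \varepsilon_m = 0$, and recalling that $\lim_{m \to \infty} E_m = O$, gives (G.23). The result in (G.24) follows directly from Lemma G.1.

84 Here we have made use of the fact that the function $K_1(t) = (1 + t)/(1 - t)^2$ is increasing for $t \in (0, 1)$ and that $K_1(\sigma) < K_1(1/3) = 3$ since $\sigma < 1/3$.

85 Here we have made use of the fact that the function $g(t) = t/(1 - t)$ is increasing for $t \in (0, 1)$ and that $g(\|E_m\|) < g(1/3) = 1/2$ since $\|E_m\| < 1/3$.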


Finally, there is an interesting global phenomenon taking place as far as the residual vectors $r(w_m)$ are concerned. This is the subject of the next theorem, which was proved for Hermitian matrices by Kahan [152] and published in Parlett and Kahan [209]. The extension to normal matrices is due to Parlett [207].

Theorem G.4. Let $A$ be normal in Theorem G.3. Then the norms of the residual vectors $r(w_m) = A w_m - \mu^{(m)} w_m \equiv r_m$ form a decreasing sequence from any $w_0$; that is, $\|r_{m+1}\| \le \|r_m\|$ for all $m$. Equality holds if and only if $\mu^{(m+1)} = \mu^{(m)}$ and $w_m$ is an eigenvector of $(A - \mu^{(m)} I)^*(A - \mu^{(m)} I)$.

Proof. We start by recalling that [see (G.3)]

$$\|(A - \rho(A; u) I) u\| = \min_{\mu \in \mathbb{C}} \|(A - \mu I) u\| \tag{G.30}$$

and that $w_m$ are all unit vectors, that is, $\|w_m\| = 1$. Then

$$\begin{aligned}
\|r_{m+1}\| &= \|(A - \mu^{(m+1)} I) w_{m+1}\| && \text{by definition} \\
&\le \|(A - \mu^{(m)} I) w_{m+1}\| && \text{by } \mu^{(m+1)} = \rho(A; w_{m+1}) \text{ and (G.30)} \\
&= |w_m^*(A - \mu^{(m)} I) w_{m+1}| && \text{since } (A - \mu^{(m)} I) w_{m+1} \text{ is a multiple of } w_m \\
&= |[(A - \mu^{(m)} I)^* w_m]^* w_{m+1}| \\
&\le \|(A - \mu^{(m)} I)^* w_m\|\,\|w_{m+1}\| && \text{by the Cauchy–Schwarz inequality} \\
&= \|(A - \mu^{(m)} I) w_m\| && \text{since } \|B^* z\| = \|B z\| \text{ if $B$ is normal} \\
&= \|r_m\|.
\end{aligned}$$

The first inequality becomes an equality if and only if $\mu^{(m+1)} = \mu^{(m)}$. The second inequality becomes an equality if and only if $w_{m+1}$ is a multiple of $(A - \mu^{(m)} I)^* w_m$, meaning that $(A - \mu^{(m)} I)^{-1} w_m = \alpha^{-1}(A - \mu^{(m)} I)^* w_m$, from which $(A - \mu^{(m)} I)(A - \mu^{(m)} I)^* w_m = \alpha w_m$. This completes the proof.

Appendix H

Unified FORTRAN 77 Code for MPE and RRE

86

Appendix H. Unified FORTRAN 77 Code for MPE and RRE

86 Used courtesy of NASA [266].

C***********************************************************************
C  IMPLEMENTATION OF MPE AND RRE WITH QR FACTORIZATION FOR LEAST
C  SQUARES. (QR PERFORMED BY MODIFIED GRAM-SCHMIDT PROCESS)
C  MPE AND RRE ARE APPLIED IN THE CYCLING MODE.
C***********************************************************************
C  THE COMPONENTS OF THE INITIAL VECTOR X, NAMELY, X(I),I=1,...,NDIM,
C  CAN BE PICKED RANDOMLY. WE ACHIEVE THIS, E.G., BY INVOKING THE
C  IMSL VERSION 10 SUBROUTINE DRNUN THAT GENERATES PSEUDORANDOM
C  NUMBERS FROM A UNIFORM (0,1) DISTRIBUTION.
C  OTHER CHOICES FOR X(1),...,X(NDIM) ARE POSSIBLE, SUCH AS X(I)=0,
C  I=1,...,NDIM. IN THIS CASE REPLACE THE STATEMENT
C       CALL DRNUN(NDIM,X)
C  BY THE DO LOOP
C       DO 10 I=1,NDIM
C       X(I)=0
C    10 CONTINUE
C***********************************************************************
      IMPLICIT DOUBLE PRECISION (A-H,O-Z)
      PARAMETER (METHOD=1,N0=20,N=0,KMAX=10,NCYCLE=15,NDIM=1000)
      PARAMETER (EPSC=1D-10,IPRES=1,IPRES1=1)
      DIMENSION X(NDIM),S(NDIM),Y(NDIM),Z(NDIM)
      DIMENSION Q(NDIM,0:KMAX-1),R(0:KMAX,0:KMAX)
      DIMENSION C(0:KMAX),GAMMA(0:KMAX),XI(0:KMAX-1)
      EXTERNAL VECTOR
C
C  INITIAL VECTOR DETERMINATION.
C
C     CALL DRNUN(NDIM,X)
      DO 10 I=1,NDIM
         X(I)=0
   10 CONTINUE
C
C  END OF INITIAL VECTOR DETERMINATION.
C
      CALL CYCLE(METHOD,X,S,N0,N,KMAX,NCYCLE,NDIM,Y,Z,VECTOR,Q,R,
     *C,GAMMA,XI,RESC,EPSC,IPRES,IPRES1)
      STOP
      END

      SUBROUTINE CYCLE(METHOD,X,S,N0,N,KMAX,NCYCLE,NDIM,Y,Z,VECTOR,Q,R,
     *C,GAMMA,XI,RESC,EPSC,IPRES,IPRES1)
C***********************************************************************
C  THIS SUBROUTINE APPLIES MPE AND RRE IN THE CYCLING MODE.
C  MPE AND RRE ARE INVOKED BY CALLING SUBROUTINE MPERRE.
C***********************************************************************
C  THE ARGUMENTS METHOD,NDIM,Y,Z,VECTOR,Q,R,C,GAMMA,XI,IPRES,IPRES1
C  ARE AS IN SUBROUTINE MPERRE.
C
C  X     : INITIAL VECTOR. INPUT ARRAY OF DIMENSION NDIM. (DOUBLE
C          PRECISION)
C  S     : THE FINAL APPROXIMATION PRODUCED BY THE SUBROUTINE. OUTPUT
C          ARRAY OF DIMENSION NDIM. (DOUBLE PRECISION)
C  N0    : NUMBER OF ITERATIONS PERFORMED BEFORE CYCLING IS STARTED,
C          I.E., BEFORE MPE OR RRE IS APPLIED FOR THE FIRST TIME.
C          INPUT. (INTEGER)
C  N     : NUMBER OF ITERATIONS PERFORMED BEFORE MPE OR RRE IS APPLIED
C          IN EACH CYCLE AFTER THE FIRST CYCLE. INPUT. (INTEGER)
C  KMAX  : WIDTH OF EXTRAPOLATION. ON EXIT FROM SUBROUTINE MPERRE IN
C          EACH CYCLE, THE ARRAY S IS, IN FACT, THE APPROXIMATION
C          S(N0,KMAX) IN THE FIRST CYCLE, AND S(N,KMAX) IN THE
C          FOLLOWING CYCLES. INPUT. (INTEGER)
C  NCYCLE: MAXIMUM NUMBER OF CYCLES ALLOWED. INPUT. (INTEGER)
C  RESC  : L2-NORM OF THE RESIDUAL FOR S AT THE END OF EACH CYCLE.
C          RETRIEVED AT THE END OF THE NEXT CYCLE. OUTPUT. (DOUBLE
C          PRECISION)
C  EPSC  : AN UPPER BOUND ON RESC/RESP, SOME RELATIVE RESIDUAL FOR S,
C          USED IN THE STOPPING CRITERION. HERE RESP IS THE L2-NORM
C          OF THE RESIDUAL FOR S(N0,KMAX) AT THE END OF THE FIRST
C          CYCLE, I.E., ON EXIT FROM SUBROUTINE MPERRE THE FIRST TIME.
C          IF RESC.LE.EPSC*RESP AT THE END OF SOME CYCLE, THEN ONE
C          ADDITIONAL CYCLE IS PERFORMED, AND THE CORRESPONDING
C          S(N,KMAX) IS ACCEPTED AS THE FINAL APPROXIMATION, AND THE
C          SUBROUTINE IS EXITED. INPUT. (DOUBLE PRECISION)
C***********************************************************************
      IMPLICIT DOUBLE PRECISION (A-H,O-Z)
      PARAMETER (EPS=0)
      DIMENSION X(NDIM),S(NDIM),Y(NDIM),Z(NDIM)
      DIMENSION Q(NDIM,0:KMAX-1),R(0:KMAX,0:KMAX)
      DIMENSION C(0:KMAX),GAMMA(0:KMAX),XI(0:KMAX-1)
      EXTERNAL VECTOR
      DO 40 IC=1,NCYCLE
         IF (IPRES.EQ.1.OR.IPRES1.EQ.1) THEN
            WRITE(6,101) IC
  101       FORMAT(/,' CYCLE NO. ',I3)
         END IF
         NN=N
         IF (IC.EQ.1) NN=N0
         IF (IPRES.EQ.1.OR.IPRES1.EQ.1) THEN
            WRITE(6,102) NN
  102       FORMAT(/,' NO. OF ITERATIONS PRIOR TO EXTRAPOLATION IS ',I3)
            WRITE(6,103) KMAX
  103       FORMAT(/,' WIDTH OF EXTRAPOLATION IS ',I3)
         END IF
         DO 20 J=0,NN-1
            CALL VECTOR(X,Y,NDIM)
            DO 10 I=1,NDIM
               X(I)=Y(I)
   10       CONTINUE
   20    CONTINUE
         CALL MPERRE(METHOD,X,S,KMAX,KOUT,NDIM,Y,Z,VECTOR,Q,R,C,
     *   GAMMA,XI,RES,RES1,EPS,IPRES,IPRES1)
         IF (IC.EQ.1) RESP=R(0,0)
         RESC=R(0,0)
         IF (RESC.LE.EPSC*RESP) RETURN
         DO 30 I=1,NDIM
            X(I)=S(I)
   30    CONTINUE
   40 CONTINUE
      RETURN
      END

      SUBROUTINE MPERRE(METHOD,X,S,KMAX,KOUT,NDIM,Y,Z,VECTOR,Q,R,C,
     *GAMMA,XI,RES,RES1,EPS,IPRES,IPRES1)
C***********************************************************************
C  THIS SUBROUTINE APPLIES THE MINIMAL POLYNOMIAL EXTRAPOLATION (MPE)
C  OR THE REDUCED RANK EXTRAPOLATION (RRE) METHODS TO A VECTOR
C  SEQUENCE X0,X1,X2,... THAT IS OFTEN GENERATED BY A FIXED-POINT
C  ITERATIVE TECHNIQUE.
C  BOTH MPE AND RRE ARE ACCELERATION OF CONVERGENCE (OR EXTRAPOLATION)
C  METHODS FOR VECTOR SEQUENCES. EACH METHOD PRODUCES A TWO-DIMENSIONAL
C  ARRAY S(N,K) OF APPROXIMATIONS TO THE LIMIT OR ANTILIMIT OF THE
C  SEQUENCE IN QUESTION.
C  THE IMPLEMENTATIONS EMPLOYED IN THE PRESENT SUBROUTINE GENERATE
C  THE SEQUENCES S(0,0)=X0,S(0,1),S(0,2),....
C***********************************************************************
C  AUTHOR : AVRAM SIDI
C           COMPUTER SCIENCE DEPARTMENT
C           TECHNION-ISRAEL INSTITUTE OF TECHNOLOGY
C           HAIFA 32000, ISRAEL
C  E-MAIL ADDRESS: [email protected]
C***********************************************************************
C  METHOD: IF METHOD.EQ.1, THEN MPE IS EMPLOYED. IF METHOD.EQ.2, THEN
C          RRE IS EMPLOYED. INPUT. (INTEGER)
C  X     : THE VECTOR X0. INPUT ARRAY OF DIMENSION NDIM. (DOUBLE
C          PRECISION)
C  S     : THE APPROXIMATION S(0,K) PRODUCED BY THE SUBROUTINE FOR
C          EACH K. ON EXIT, S IS S(0,KOUT). OUTPUT ARRAY OF DIMENSION
C          NDIM. (DOUBLE PRECISION)
C  KMAX  : A NONNEGATIVE INTEGER. THE MAXIMUM WIDTH OF EXTRAPOLATION
C          ALLOWED. THUS, THE NUMBER OF THE VECTORS X0,X1,X2,...
C          EMPLOYED IN THE PROCESS IS KMAX+2 AT MOST. INPUT. (INTEGER)
C  KOUT  : A NONNEGATIVE INTEGER. KOUT IS DETERMINED BY A SUITABLE
C          STOPPING CRITERION AND DOES NOT EXCEED KMAX. THE VECTORS
C          ACTUALLY EMPLOYED BY THE EXTRAPOLATION PROCESS ARE
C          X0,X1,X2,...,XP, WHERE P=KOUT+1. OUTPUT. (INTEGER)
C  NDIM  : DIMENSION OF THE VECTORS. INPUT. (INTEGER)
C  Y     : WORK ARRAY OF DIMENSION NDIM. (DOUBLE PRECISION)
C  Z     : WORK ARRAY OF DIMENSION NDIM. (DOUBLE PRECISION)
C  VECTOR: A USER-SUPPLIED SUBROUTINE WHOSE CALLING SEQUENCE IS
C          CALL VECTOR(Y,Z,NDIM); Y,NDIM INPUT,Z OUTPUT.
C          Y,Z,NDIM ARE EXACTLY AS DESCRIBED ABOVE. FOR A FIXED-POINT
C          ITERATIVE TECHNIQUE FOR SOLVING THE LINEAR OR NONLINEAR
C          SYSTEM T=F(T), DIM(T)=NDIM, Y AND Z ARE RELATED BY Z=F(Y).
C          THUS, X1=F(X0), X2=F(X1), ETC.
C          VECTOR SHOULD BE DECLARED IN AN EXTERNAL STATEMENT IN THE
C          CALLING PROGRAM.
C  Q     : WORK ARRAY OF DIMENSION (NDIM,0:KMAX-1). FOR EACH K, ITS
C          ELEMENTS ARE THOSE OF THE ORTHOGONAL MATRIX OBTAINED FROM
C          QR FACTORIZATION OF THE MATRIX U
C               U = ( U0 | U1 | ... | UK ), K=0,1,2,...,
C          WHERE U0=X1-X0, U1=X2-X1, U2=X3-X2, ETC. OUTPUT. (DOUBLE
C          PRECISION)
C  R     : WORK ARRAY OF DIMENSION (0:KMAX,0:KMAX). FOR EACH K, ITS
C          ELEMENTS ARE THOSE OF THE UPPER TRIANGULAR MATRIX OBTAINED
C          FROM QR FACTORIZATION OF THE MATRIX U DESCRIBED ABOVE.
C          OUTPUT. (DOUBLE PRECISION)
C  C     : WORK ARRAY OF DIMENSION (0:KMAX). FOR EACH K, C FOR MPE IS
C          THE LEAST-SQUARES SOLUTION OF THE SYSTEM U*C=0 SUBJECT TO
C          THE CONSTRAINT C(K)=1. (DOUBLE PRECISION)
C  GAMMA : WORK ARRAY OF DIMENSION (0:KMAX). FOR EACH K, THE GAMMAS
C          ARE SUCH THAT
C             S(0,K)=GAMMA(0)*X0+GAMMA(1)*X1+...+GAMMA(K)*XK.
C          FOR EACH K, GAMMA FOR RRE IS THE LEAST-SQUARES SOLUTION OF
C          THE SYSTEM U*GAMMA=0 SUBJECT TO THE CONSTRAINT
C          GAMMA(0)+GAMMA(1)+...+GAMMA(K)=1. (DOUBLE PRECISION)
C  XI    : WORK ARRAY OF DIMENSION (0:KMAX-1). FOR EACH K, THE XIS
C          ARE SUCH THAT
C             S(0,K)=X0+XI(0)*U0+XI(1)*U1+...+XI(J)*UJ, J=K-1.
C          (DOUBLE PRECISION)
C  RES   : L2-NORM OF THE RESIDUAL FOR S(0,K) FOR A LINEAR SYSTEM
C          T=A*T+B (OR AN ESTIMATE FOR IT FOR A NONLINEAR SYSTEM
C          T=F(T)) FOR EACH K. ON EXIT, THIS K IS KOUT. OUTPUT.
C          (DOUBLE PRECISION)
C  RES1  : L2-NORM OF THE RESIDUAL ACTUALLY COMPUTED FROM S(0,K) FOR
C          EACH K. (THE RESIDUAL VECTOR FOR ANY VECTOR VEC IS TAKEN
C          AS (F(VEC)-VEC).) ON EXIT, THIS K IS KOUT. OUTPUT.
C          (DOUBLE PRECISION)
C  EPS   : AN UPPER BOUND ON RES/R(0,0), THE RELATIVE RESIDUAL FOR S,
C          USED IN THE STOPPING CRITERION. NOTE THAT R(0,0)=L2-NORM
C          OF THE RESIDUAL FOR X0, THE INITIAL VECTOR. IF, FOR SOME K,
C          RES.LE.EPS*R(0,0), THEN THE CORRESPONDING S(0,K) IS ACCEPTED
C          AS THE FINAL APPROXIMATION, AND THE SUBROUTINE IS EXITED
C          WITH KOUT=K. IF S(0,KMAX) IS NEEDED, THEN EPS SHOULD BE
C          SET EQUAL TO ZERO. INPUT. (DOUBLE PRECISION)
C  IPRES : IF IPRES.EQ.1, THEN RES IS PRINTED FOR ALL K, K=0,1,....
C          OTHERWISE, IT IS NOT. INPUT. (INTEGER)
C  IPRES1: IF IPRES1.EQ.1, THEN RES1 IS COMPUTED AND PRINTED FOR ALL
C          K, K=0,1,.... OTHERWISE, IT IS NOT. INPUT. (INTEGER)
C***********************************************************************
C  THE ABOVE-MENTIONED QR FACTORIZATION IS PERFORMED BY EMPLOYING
C  THE MODIFIED GRAM-SCHMIDT PROCESS.
C***********************************************************************
      IMPLICIT DOUBLE PRECISION (A-H,O-Z)
      PARAMETER (EPS1=1D-32,EPS2=1D-16)
      DIMENSION X(NDIM),S(NDIM),Y(NDIM),Z(NDIM)
      DIMENSION Q(NDIM,0:KMAX-1),R(0:KMAX,0:KMAX)
      DIMENSION C(0:KMAX),GAMMA(0:KMAX),XI(0:KMAX-1)
      IF (IPRES.EQ.1.AND.IPRES1.EQ.1) THEN
         WRITE(6,301)
  301    FORMAT(/,' K RES RES1')
      ELSE IF (IPRES.EQ.1.AND.IPRES1.NE.1) THEN
         WRITE(6,302)
  302    FORMAT(/,' K RES')
      ELSE IF (IPRES.NE.1.AND.IPRES1.EQ.1) THEN
         WRITE(6,303)
  303    FORMAT(/,' K RES1')
      END IF
      DO 10 I=1,NDIM
         Y(I)=X(I)
   10 CONTINUE
      DO 250 K=0,KMAX
C
C  COMPUTATION OF THE VECTOR XJ, J=K+1, FROM XK, AND COMPUTATION OF UK
C
         CALL VECTOR(Y,Z,NDIM)
         DO 20 I=1,NDIM
            Y(I)=Z(I)-Y(I)
   20    CONTINUE
C
C  DETERMINATION OF THE ORTHONORMAL VECTOR QK FROM UK BY THE
C  MODIFIED GRAM-SCHMIDT PROCESS
C
         DO 50 J=0,K-1
            SUM=0
            DO 30 I=1,NDIM
               SUM=SUM+Q(I,J)*Y(I)
   30       CONTINUE
            R(J,K)=SUM
            DO 40 I=1,NDIM
               Y(I)=Y(I)-R(J,K)*Q(I,J)
   40       CONTINUE
   50    CONTINUE
         SUM=0
         DO 60 I=1,NDIM
            SUM=SUM+Y(I)**2
   60    CONTINUE
         R(K,K)=DSQRT(SUM)
         IF (R(K,K).GT.EPS1*R(0,0).AND.K.LT.KMAX) THEN
            HP=1D0/R(K,K)
            DO 70 I=1,NDIM
               Q(I,K)=HP*Y(I)
   70       CONTINUE
         ELSE IF (R(K,K).LE.EPS1*R(0,0)) THEN
            EEE=EPS1
            WRITE(6,304) K,K,EEE
  304       FORMAT(/,' R(',I3,',',I3,') .LE.',1P,D8.1,'*R(0,0).',/)
         END IF
C
C  END OF COMPUTATION OF THE VECTOR QK
C
C
C  COMPUTATION OF THE GAMMAS FOR MPE
C
         IF (METHOD.EQ.1) THEN
            DO 90 I=K-1,0,-1
               CI=-R(I,K)
               DO 80 J=I+1,K-1
                  CI=CI-R(I,J)*C(J)
   80          CONTINUE
               C(I)=CI/R(I,I)
   90       CONTINUE
            C(K)=1D0
            SUM=0
            DO 100 I=0,K
               SUM=SUM+C(I)
  100       CONTINUE
            IF (DABS(SUM).LE.EPS2) THEN
               WRITE(6,311) K
  311          FORMAT(/,' S( 0,',I3,') IS NOT DEFINED.',/)
               GO TO 250
            END IF
            DO 110 I=0,K
               GAMMA(I)=C(I)/SUM
  110       CONTINUE
            RES=R(K,K)*DABS(GAMMA(K))
C
C  END OF COMPUTATION OF THE GAMMAS FOR MPE
C
C
C  COMPUTATION OF THE GAMMAS FOR RRE
C
         ELSE IF (METHOD.EQ.2) THEN
            DO 130 I=0,K
               CI=1D0
               DO 120 J=0,I-1
                  CI=CI-R(J,I)*C(J)
  120          CONTINUE
               C(I)=CI/R(I,I)
  130       CONTINUE
            DO 150 I=K,0,-1
               CI=C(I)
               DO 140 J=I+1,K
                  CI=CI-R(I,J)*GAMMA(J)
  140          CONTINUE
               GAMMA(I)=CI/R(I,I)
  150       CONTINUE
            SUM=0
            DO 160 I=0,K
               SUM=SUM+GAMMA(I)
  160       CONTINUE
            DO 170 I=0,K
               GAMMA(I)=GAMMA(I)/SUM
  170       CONTINUE
            RES=1D0/DSQRT(DABS(SUM))
C
C  END OF COMPUTATION OF THE GAMMAS FOR RRE
C
         END IF
         KOUT=K
         IF (IPRES.EQ.1.AND.IPRES1.NE.1) THEN
            WRITE(6,321) K,RES
  321       FORMAT(I3,2X,1P,D15.2)
         END IF
         IF (RES.LE.EPS*R(0,0).OR.R(K,K).LE.EPS1*R(0,0)
     *   .OR.K.EQ.KMAX.OR.IPRES1.EQ.1) THEN
C
C  COMPUTATION OF THE APPROXIMATION S(0,K)
C
            XI(0)=1D0-GAMMA(0)
            DO 180 J=1,K-1
               XI(J)=XI(J-1)-GAMMA(J)
  180       CONTINUE
            DO 190 I=1,NDIM
               S(I)=X(I)
  190       CONTINUE
            DO 220 J=0,K-1
               HP=0
               DO 200 I=J,K-1
                  HP=HP+R(J,I)*XI(I)
  200          CONTINUE
               DO 210 I=1,NDIM
                  S(I)=S(I)+HP*Q(I,J)
  210          CONTINUE
  220       CONTINUE
C
C  END OF COMPUTATION OF THE APPROXIMATION S(0,K)
C
         END IF
         IF (IPRES1.EQ.1) THEN
C
C  EXACT COMPUTATION OF RESIDUAL L2-NORM.
C
            CALL VECTOR(S,Y,NDIM)
            RES1=0
            DO 230 I=1,NDIM
               RES1=RES1+(Y(I)-S(I))**2
  230       CONTINUE
            RES1=DSQRT(RES1)
C
C  END OF EXACT COMPUTATION OF RESIDUAL L2-NORM.
C
            IF (IPRES.EQ.1) THEN
               WRITE(6,331) K,RES,RES1
  331          FORMAT(I3,2X,1P,2D15.2)
            ELSE IF (IPRES.NE.1) THEN
               WRITE(6,332) K,RES1
  332          FORMAT(I3,2X,1P,D15.2)
            END IF
         END IF
         IF (RES.LE.EPS*R(0,0).OR.R(K,K).LE.EPS1*R(0,0)) RETURN
         DO 240 I=1,NDIM
            Y(I)=Z(I)
  240    CONTINUE
  250 CONTINUE
      RETURN
      END

      SUBROUTINE VECTOR(X,Y,NDIM)
C***********************************************************************
C  THIS SUBROUTINE GENERATES THE VECTOR Y FROM THE VECTOR X BY USING,
C  E.G., A FIXED-POINT ITERATION TECHNIQUE.
C***********************************************************************
C  IN THE PRESENT EXAMPLE THE ITERATIVE TECHNIQUE IS OF THE FORM
C  Y=A1*X+B1. HERE A1 IS AN NDIM*NDIM SEPTADIAGONAL MATRIX SYMMETRIC
C  WITH RESPECT TO BOTH OF ITS DIAGONALS AND IS DEFINED AS
C  A1=(1-OMEGA)*I+OMEGA*A, WHERE OMEGA IS A SCALAR, I IS THE
C  IDENTITY MATRIX, AND A IS THE MATRIX
C
C             | 5  2  1  1                   |
C             | 2  6  3  1  1                |
C             | 1  3  6  3  1  1             |
C     A = 0.06*| 1  1  3  6  3  1  1          | .
C             |    1  1  3  6  3  1  1       |
C             |       1  1  3  6  3  1  1    |
C             |          .  .  .  .  .  .  . |
C
C  B1 IS THE VECTOR DEFINED AS B1=OMEGA*B, THE VECTOR B BEING CHOSEN
C  SUCH THAT THE SOLUTION OF THE SYSTEM T=A*T+B IS THE VECTOR
C  (1,1,...,1).
C  THE ITERATIVE TECHNIQUE USED IS THUS RICHARDSON'S ITERATIVE
C  METHOD APPLIED TO THE SYSTEM (I-A)*T=B.
C***********************************************************************
      IMPLICIT DOUBLE PRECISION (A-H,O-Z)
      PARAMETER (OMEGA=2D0,TAU=1D0-OMEGA)
      DIMENSION X(NDIM),Y(NDIM)
      N=NDIM
      Y(1)=(5*X(1)+2*X(2)+X(3)+X(4))*6D-2+46D-2
      Y(2)=(2*X(1)+6*X(2)+3*X(3)+X(4)+X(5))*6D-2+22D-2
      Y(3)=(X(1)+3*X(2)+6*X(3)+3*X(4)+X(5)+X(6))*6D-2+1D-1
      DO 10 I=4,N-3
      Y(I)=(X(I-3)+X(I-2)+3*X(I-1)+6*X(I)+3*X(I+1)+X(I+2)+X(I+3))*6D-2
     *   +4D-2
   10 CONTINUE
      Y(N-2)=(X(N)+3*X(N-1)+6*X(N-2)+3*X(N-3)+X(N-4)+X(N-5))*6D-2+1D-1
      Y(N-1)=(2*X(N)+6*X(N-1)+3*X(N-2)+X(N-3)+X(N-4))*6D-2+22D-2
      Y(N)=(5*X(N)+2*X(N-1)+X(N-2)+X(N-3))*6D-2+46D-2
      DO 20 I=1,N
         Y(I)=TAU*X(I)+OMEGA*Y(I)
   20 CONTINUE
      RETURN
      END
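The comments on GAMMA and XI in MPERRE describe two equivalent forms of S(0,K): a weighted sum of the vectors X0,...,XK with weights summing to 1, and X0 plus a combination of the differences U0,...,U(K-1). The free-standing FORTRAN 77 sketch below checks that equivalence on a small made-up example, using the same recursion for XI as MPERRE. The program name GAMXI, the three vectors, and the weights are illustrative assumptions and are not part of the appendix code.

```fortran
      PROGRAM GAMXI
C  ILLUSTRATIVE CHECK (NOT PART OF THE APPENDIX CODE) THAT
C     S = GAMMA(0)*X0+...+GAMMA(K)*XK
C  EQUALS
C     S = X0+XI(0)*U0+...+XI(K-1)*U(K-1),  UJ=X(J+1)-XJ,
C  WHEN THE GAMMAS SUM TO 1 AND XI IS BUILT AS IN MPERRE.
      IMPLICIT DOUBLE PRECISION (A-H,O-Z)
      PARAMETER (NDIM=2,K=2)
      DIMENSION X(NDIM,0:K),GAMMA(0:K),XI(0:K-1)
      DIMENSION S1(NDIM),S2(NDIM)
C  MADE-UP VECTORS X0=(1,2), X1=(3,5), X2=(4,9), STORED BY COLUMNS,
C  AND MADE-UP WEIGHTS THAT SUM TO 1.
      DATA X /1D0,2D0, 3D0,5D0, 4D0,9D0/
      DATA GAMMA /0.5D0,-1.5D0,2D0/
C  SAME RECURSION AS IN MPERRE.
      XI(0)=1D0-GAMMA(0)
      DO 10 J=1,K-1
         XI(J)=XI(J-1)-GAMMA(J)
   10 CONTINUE
      DO 40 I=1,NDIM
C  GAMMA FORM.
         S1(I)=0
         DO 20 J=0,K
            S1(I)=S1(I)+GAMMA(J)*X(I,J)
   20    CONTINUE
C  XI FORM, WITH UJ FORMED ON THE FLY.
         S2(I)=X(I,0)
         DO 30 J=0,K-1
            S2(I)=S2(I)+XI(J)*(X(I,J+1)-X(I,J))
   30    CONTINUE
         WRITE(6,100) I,S1(I),S2(I)
  100    FORMAT(' I=',I2,'  GAMMA FORM:',F8.3,'  XI FORM:',F8.3)
   40 CONTINUE
      STOP
      END
```

For these data XI(0)=0.5 and XI(1)=2, and both forms give the vector (4, 11.5). MPERRE uses the XI form because the differences are available through the QR factors: it accumulates S from the columns of Q and the entries of R rather than from stored copies of X0,...,XK.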

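The orthogonalization step of MPERRE (loops 30 through 70) can also be tried in isolation. The following FORTRAN 77 sketch applies the same modified Gram-Schmidt loop structure to a small made-up matrix with columns U0=(3,4,0) and U1=(3,4,5). The program name MGSDEM and the data are illustrative assumptions; unlike MPERRE, the sketch stores every column of Q and omits the test of R(K,K) against EPS1*R(0,0).

```fortran
      PROGRAM MGSDEM
C  ILLUSTRATIVE SKETCH (NOT PART OF THE APPENDIX CODE): MODIFIED
C  GRAM-SCHMIDT QR OF A SMALL MATRIX U=(U0|U1), FOLLOWING THE
C  LOOP STRUCTURE OF LOOPS 30-70 OF MPERRE.
      IMPLICIT DOUBLE PRECISION (A-H,O-Z)
      PARAMETER (NDIM=3,KMAX=1)
      DIMENSION U(NDIM,0:KMAX),Q(NDIM,0:KMAX),R(0:KMAX,0:KMAX)
      DIMENSION Y(NDIM)
C  MADE-UP COLUMNS U0=(3,4,0) AND U1=(3,4,5).
      DATA U /3D0,4D0,0D0, 3D0,4D0,5D0/
      DO 60 K=0,KMAX
         DO 10 I=1,NDIM
            Y(I)=U(I,K)
   10    CONTINUE
C  REMOVE FROM Y ITS COMPONENTS ALONG Q0,...,Q(K-1), ONE AT A TIME,
C  ALWAYS USING THE CURRENT (ALREADY UPDATED) Y.
         DO 40 J=0,K-1
            SUM=0
            DO 20 I=1,NDIM
               SUM=SUM+Q(I,J)*Y(I)
   20       CONTINUE
            R(J,K)=SUM
            DO 30 I=1,NDIM
               Y(I)=Y(I)-R(J,K)*Q(I,J)
   30       CONTINUE
   40    CONTINUE
C  NORMALIZE WHAT IS LEFT TO OBTAIN QK.
         SUM=0
         DO 50 I=1,NDIM
            SUM=SUM+Y(I)**2
   50    CONTINUE
         R(K,K)=DSQRT(SUM)
         DO 55 I=1,NDIM
            Q(I,K)=Y(I)/R(K,K)
   55    CONTINUE
   60 CONTINUE
      WRITE(6,100) R(0,0),R(0,1),R(1,1)
  100 FORMAT(' R(0,0)=',F7.3,'  R(0,1)=',F7.3,'  R(1,1)=',F7.3)
      STOP
      END
```

For this matrix R(0,0), R(0,1), and R(1,1) all equal 5, with Q0=(0.6,0.8,0) and Q1=(0,0,1). The feature that distinguishes the modified process from the classical one is that each inner product R(J,K) is taken with the partially orthogonalized Y, not with the original column.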
Bibliography

[1] M. Abramowitz and I.A. Stegun. Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables. Number 55 in Nat. Bur. Standards Appl. Math. Series. US Government Printing Office, Washington, D.C., 1964. (Cited on p. 385)
[2] A.H. Al-Mohy and N.J. Higham. A new scaling and squaring algorithm for the matrix exponential. SIAM J. Matrix Anal. Appl., 31:970–989, 2009. (Cited on p. 324)
[3] A.L. Andrew. Convergence of an iterative method for derivatives of eigensystems. J. Comput. Phys., 26:107–112, 1978. (Cited on p. 276)
[4] A.L. Andrew. Iterative computation of derivatives of eigenvalues and eigenvectors. J. Inst. Math. Appl., 24:209–218, 1979. (Cited on p. 276)
[5] A.L. Andrew and R.C.E. Tan. Computation of mixed partial derivatives of eigenvalues and eigenvectors by simultaneous iteration. Comm. Numer. Methods Engrg., 15:641–649, 1999. (Cited on p. 276)
[6] A.L. Andrew and R.C.E. Tan. Iterative computation of derivatives of repeated eigenvalues and the corresponding eigenvectors. Numer. Linear Algebra Appl., 7:151–167, 2000. (Cited on p. 276)
[7] W.E. Arnoldi. The principle of minimized iterations in the solution of the matrix eigenvalue problem. Quart. Appl. Math., 9:17–29, 1951. (Cited on pp. 203, 204)
[8] U. Ascher and L.R. Petzold. Computer Methods for Ordinary Differential Equations and Differential-Algebraic Equations. SIAM, Philadelphia, 1998. (Cited on p. 277)
[9] K.E. Atkinson. An Introduction to Numerical Analysis. John Wiley & Sons Inc., New York, second edition, 1989. (Cited on pp. 42, 265, 346)
[10] K.E. Atkinson. The Numerical Solution of Integral Equations of the Second Kind. Number 4 in Cambridge Monographs on Applied and Computational Mathematics. Cambridge University Press, Cambridge, 1997. (Cited on p. 316)
[11] O. Axelsson. Conjugate gradient type methods for unsymmetric and inconsistent systems of linear equations. Linear Algebra Appl., 29:1–16, 1980. (Cited on p. 206)
[12] O. Axelsson. Iterative Solution Methods. Cambridge University Press, Cambridge, 1994. (Cited on pp. 2, 191)
[13] G.A. Baker Jr. Essentials of Padé Approximants. Academic Press, New York, 1975. (Cited on p. 103)
[14] G.A. Baker Jr. and P.R. Graves-Morris. Padé Approximants. Cambridge University Press, Cambridge, second edition, 1996. (Cited on pp. 56, 103, 107, 126, 289, 306, 307, 352)
[15] B. Baron and S. Wajc. Convergence acceleration of non-scalar sequences with non-linear transformations. Math. Comput. Simulation, 232:133–141, 1981. (Cited on p. 31)
[16] R. Barrett, M. Berry, T.F. Chan, J. Demmel, J. Donato, J. Dongarra, V. Eijkhout, R. Pozo, C. Romine, and H. van der Vorst. Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods. SIAM, Philadelphia, 1994. (Cited on pp. 2, 191)
[17] J. Barzilai and J.M. Borwein. Two-point step size gradient methods. IMA J. Numer. Anal., 8:141–148, 1988. (Cited on p. 52)
[18] F.L. Bauer. The quotient-difference and epsilon algorithms. In R.E. Langer, editor, On Numerical Approximation, pages 361–370. The University of Wisconsin Press, Madison, WI, 1959. (Cited on p. 106)
[19] F.L. Bauer. Nonlinear sequence transformations. In H.L. Garabedian, editor, Approximation of Functions, pages 134–151. Elsevier, New York, 1965. (Cited on p. 106)
[20] A. Ben-Israel and T.N.E. Greville. Generalized Inverses: Theory and Applications. CMS Books in Mathematics. Springer-Verlag, New York, second edition, 2003. (Cited on pp. 277, 373)
[21] A. Berman and R.J. Plemmons. Nonnegative Matrices in Mathematical Sciences. Number 9 in Classics in Applied Mathematics. SIAM, Philadelphia, 1994. (Cited on pp. 2, 268)
[22] J. Beuneu. Minimal polynomial projection methods. Preprint, 1984. (Cited on p. 198)
[23] Å. Björck. Numerical Methods for Least Squares Problems. SIAM, Philadelphia, 1996. (Cited on p. 2)
[24] Å. Björck. Numerical Methods in Matrix Computations. Springer, Berlin, 2015. (Cited on p. 2)
[25] I. Borg and P. Groenen. Modern Multidimensional Scaling. Springer, New York, 1997. (Cited on p. 279)
[26] C. Brezinski. Application de l’ε-algorithme à la résolution des systèmes non linéaires. C. R. Acad. Sci. Paris, 271 A:1174–1177, 1970. (Cited on p. 175)
[27] C. Brezinski. Sur un algorithme de résolution des systèmes non linéaires. C. R. Acad. Sci. Paris, 272 A:145–148, 1971. (Cited on p. 175)
[28] C. Brezinski. Généralisations de la transformation de Shanks, de la table de Padé, et de l’ε-algorithme. Calcolo, 12:317–360, 1975. (Cited on pp. 31, 50, 110, 111, 112, 113, 285, 341)
[29] C. Brezinski. Accélération de la Convergence en Analyse Numérique. Number 584 in Lecture Notes in Mathematics. Springer-Verlag, Berlin, 1977. (Cited on pp. xi, 99, 110, 111, 112, 113)
[30] C. Brezinski. A general extrapolation algorithm. Numer. Math., 35:175–187, 1980. (Cited on pp. 328, 331)
[31] C. Brezinski. Recursive interpolation, extrapolation and projection. J. Comput. Appl. Math., 9:369–376, 1983. (Cited on p. 180)
[32] C. Brezinski. Other manifestations of the Schur complement. Linear Algebra Appl., 111:231–247, 1988. (Cited on p. 56)
[33] C. Brezinski. Biorthogonality and Its Applications to Numerical Analysis. Marcel Dekker, New York, 1992. (Cited on p. 2)
[34] C. Brezinski. Projection Methods for Systems of Equations. Elsevier, New York, 1997. (Cited on p. 2)
[35] C. Brezinski and M. Crouzeix. Remarques sur le procédé δ² d’Aitken. C. R. Acad. Sci. Paris, 270 A:896–898, 1970. (Cited on p. 102)
[36] C. Brezinski and M. Redivo Zaglia. Extrapolation Methods: Theory and Practice. North-Holland, Amsterdam, 1991. (Cited on pp. xi, 99, 180)
[37] C. Brezinski and M. Redivo Zaglia. Convergence acceleration of Kaczmarz’s method. J. Engrg. Math., 93:3–19, 2015. (Cited on p. 263)
[38] C. Brezinski, M. Redivo Zaglia, and H. Sadok. Avoiding breakdown and near breakdown in Lanczos type algorithms. Numer. Algorithms, 1:261–284, 1991. (Cited on p. 222)
[39] C. Brezinski, M. Redivo Zaglia, and H. Sadok. Breakdown in the implementation of the Lanczos method for solving linear systems. Comput. Math. Appl., 33:31–44, 1997. (Cited on p. 222)
[40] C. Brezinski and M. Redivo Zaglia. Vector and matrix sequence transformations based on biorthogonality. Appl. Numer. Math., 21:353–373, 1996. (Cited on p. 31)
[41] C. Brezinski and M. Redivo Zaglia. A Schur complement approach to a general extrapolation algorithm. Linear Algebra Appl., 368:279–301, 2003. (Cited on p. 56)
[42] C. Brezinski and M. Redivo Zaglia. New vector sequence transformations. Linear Algebra Appl., 389:189–213, 2004. (Cited on p. 32)
[43] C. Brezinski and M. Redivo-Zaglia. The simplified topological ε-algorithms for accelerating sequences in a vector space. SIAM J. Sci. Comput., 36:A2227–A2247, 2014. (Cited on p. 117)
[44] C. Brezinski, M. Redivo Zaglia, and H. Sadok. New look-ahead Lanczos-type algorithms for linear systems. Numer. Math., 83:53–85, 1999. (Cited on p. 222)
[45] C. Brezinski, M. Redivo Zaglia, and S. Serra-Capizzano. Extrapolation methods for PageRank computations. C. R. Math. Acad. Sci. Paris, 340:393–397, 2005. (Cited on pp. 271, 272)
[46] S. Brin and L. Page. The anatomy of a large-scale hypertextual web search engine. Comput. Networks ISDN Systems, 30:107–117, 1998. (Cited on p. 271)
[47] A.M. Bronstein, M.M. Bronstein, and R. Kimmel. Numerical Geometry of Non-rigid Shapes. Springer, New York, 2008. (Cited on p. 280)
[48] M.M. Bronstein, A.M. Bronstein, R. Kimmel, and I. Yavneh. Multigrid multidimensional scaling. Numer. Linear Algebra Appl., 13:149–171, 2006. Special issue on multigrid methods. (Cited on p. 280)
[49] P.N. Brown. A local convergence theory for combined inexact-Newton/finite-difference projection methods. SIAM J. Numer. Anal., 24:407–434, 1987. (Cited on pp. 230, 232)
[50] P.N. Brown. A theoretical comparison of the Arnoldi and GMRES algorithms. SIAM J. Sci. Stat. Comput., 12:58–78, 1991. (Cited on pp. 89, 209)
[51] P.N. Brown and Y. Saad. Hybrid Krylov methods for nonlinear systems of equations. SIAM J. Sci. Stat. Comput., 11:450–481, 1990. (Cited on pp. 230, 232)
[52] S. Cabay and L.W. Jackson. A polynomial extrapolation method for finding limits and antilimits of vector sequences. SIAM J. Numer. Anal., 13:734–752, 1976. (Cited on p. 31)
[53] D. Calvetti, L. Reichel, and Q. Zhang. Conjugate gradient algorithms for symmetric inconsistent linear systems. In J.D. Brown, M.T. Chu, D.C. Ellison, and R.J. Plemmons, editors, Proceedings of the Cornelius Lanczos International Centenary Conference, pages 267–272. SIAM, Philadelphia, 1994. (Cited on p. 279)
[54] S.L. Campbell. Singular Systems of Differential Equations. Pitman, London, 1980. (Cited on p. 277)
[55] S.L. Campbell. Singular Systems of Differential Equations II. Pitman, London, 1982. (Cited on p. 277)
[56] S.L. Campbell and C.D. Meyer Jr. Generalized Inverses of Linear Transformations. Dover, New York, 1991. (Cited on pp. 277, 373)
[57] T.F. Chan. An improved algorithm for computing the singular value decomposition. ACM Trans. Math. Software, 8:72–83, 1982. (Cited on p. 96)
[58] K. Chen. Matrix Preconditioning Techniques and Applications. Number 19 in Cambridge Monographs on Applied and Computational Mathematics. Cambridge University Press, Cambridge, 2005. (Cited on p. 227)
[59] E.W. Cheney. Introduction to Approximation Theory. Chelsea, New York, second edition, 1982. (Cited on pp. 163, 164)
[60] P.G. Ciarlet. Introduction to Numerical Linear Algebra and Optimisation. Cambridge Texts in Applied Mathematics. Cambridge University Press, Cambridge, 1989. (Cited on p. 2)
[61] J.-J. Climent, M. Neumann, and A. Sidi. A semi-iterative method for real spectrum singular linear systems with arbitrary index. J. Comput. Appl. Math., 87:21–38, 1997. (Cited on p. 279)
[62] J. Cullum and A. Greenbaum. Relations between Galerkin and norm minimizing iterative methods for solving linear systems. SIAM J. Matrix Anal. Appl., 17:223–247, 1996. (Cited on p. 209)
[63] J. Cullum and R.A. Willoughby. Lanczos Algorithms for Large Symmetric Eigenvalue Computations, Volumes 1 and 2. Birkhäuser, Boston, 1985. (Cited on p. 233)
[64] B.N. Datta. Numerical Linear Algebra and Applications. SIAM, Philadelphia, second edition, 2010. (Cited on p. 2)
[65] M.G. de Bruin. Generalized Continued Fractions and a Multidimensional Padé Table. PhD thesis, Amsterdam, 1974. (Cited on p. 309)
[66] M.G. de Bruin. Generalized Padé tables and some algorithms therein. In J. Gilewicz, editor, Padé Approximation and Convergence Acceleration Techniques, Marseille: CNRS, Paris, 1982. (Cited on p. 309)
[67] M.G. de Bruin. Some explicit formulae in simultaneous Padé approximation. Linear Algebra Appl., 63:271–281, 1984. (Cited on p. 309)
[68] M.G. de Bruin. Simultaneous Padé approximation and orthogonality. In C. Brezinski, A. Draux, A.P. Magnus, P. Maroni, and A. Ronveaux, editors, Polynômes orthogonaux et Applications, pages 74–83, Springer, Berlin, 1985. (Cited on p. 309)
[69] M.G. de Bruin. Simultaneous partial Padé approximants. J. Comput. Appl. Math., 21:343–355, 1988. (Cited on p. 309)
[70] J. De Leeuw. Applications of convex analysis to multidimensional scaling. In J.R. Barra, F. Brodeau, G. Romier, and B. Van Cutsem, editors, Recent Developments in Statistics, pages 133–145. North Holland, Amsterdam, 1977. (Cited on p. 283)
[71] J. De Leeuw. Convergence of the majorization method for multidimensional scaling. J. Classification, 5:160–180, 1998. (Cited on p. 283)
[72] J. De Leeuw and W.J. Heiser. Convergence of correction-matrix algorithms for multidimensional scaling. In J.C. Lingoes, E.E. Roscam, and I. Borg, editors, Geometric Representations of Relational Data, pages 735–752. Mathesis Press, Ann Arbor, MI, 1977. (Cited on p. 283)
[73] J.W. Demmel. Applied Numerical Linear Algebra. SIAM, Philadelphia, 1997. (Cited on p. 2)
[74] R.P. Eddy. Extrapolating to the limit of a vector sequence. In P.C.C. Wang, editor, Information Linkage between Applied Mathematics and Industry, pages 387–396. Academic Press, New York, 1979. (Cited on pp. 31, 42)
[75] M. Eiermann and O.G. Ernst. Geometric aspects of the theory of Krylov subspace methods. Acta Numer., 10:251–312, 2001. (Cited on p. 209)
[76] M. Eiermann and O.G. Ernst. A restarted Krylov subspace method for the evaluation of matrix functions. SIAM J. Numer. Anal., 44:2481–2504, 2006. (Cited on p. 324)
[77] M. Eiermann, I. Marek, and W. Niethammer. On the solution of singular linear systems of algebraic equations by semi-iterative methods. Numer. Math., 53:265–283, 1988. (Cited on p. 279)
[78] S.C. Eisenstat, H.C. Elman, and M.H. Schultz. Variational iterative methods for nonsymmetric systems of linear equations. SIAM J. Numer. Anal., 20:345–357, 1983. (Cited on pp. 206, 207)
[79] R. El-Moallem and H. Sadok. Vector extrapolation applied to algebraic Riccati equations arising in transport theory. Electron. Trans. Numer. Anal., 40:489–506, 2013. (Cited on p. 263)
[80] L. Eldén. The eigenvalues of the Google matrix. Technical Report LiTH-MAT-R-04-01, Linköping University, 2004. (Cited on p. 269)
[81] M. Engeli, M. Ginsburg, H. Rutishauser, and E. Stiefel. Refined iterative methods for the computation of the solution and the eigenvalues of self-adjoint boundary value problems. Mitt. Inst. Angew. Math. ETH, Zürich, Nr. 8, 1959. Basel–Stuttgart. (Cited on p. 213)
[82] I. Erdélyi. An iterative least-square algorithm suitable for computing partial eigensystems. SIAM J. Numer. Anal., 2:421–436, 1965. (Cited on p. 244)
[83] A. Essai. Weighted FOM and GMRES for solving nonsymmetric linear systems. Numer. Algorithms, 18:277–292, 1998. (Cited on pp. 203, 206)
[84] V. Faber and T.A. Manteuffel. Necessary and sufficient conditions for the existence of a conjugate gradient method. SIAM J. Numer. Anal., 21:352–362, 1984. (Cited on p. 219)
[85] R. Fletcher. Conjugate gradient methods for indefinite systems. In G.A. Watson, editor, Proceedings of the Dundee Biennial Conference on Numerical Analysis (1975), volume 506 of Lecture Notes in Mathematics, pages 73–89. Springer, Heidelberg, 1976. (Cited on p. 224)
[86] W.F. Ford and A. Sidi. An algorithm for a generalization of the Richardson extrapolation process. SIAM J. Numer. Anal., 24:1212–1232, 1987. (Cited on pp. 105, 328)
[87] W.F. Ford and A. Sidi. Recursive algorithms for vector extrapolation methods. Appl. Numer. Math., 4:477–489, 1988. Originally appeared as Technical Report No. 400, Computer Science Dept., Technion–Israel Institute of Technology (1986). (Cited on pp. 177, 179, 180, 181, 183, 184, 331)
[88] R. Freund and N. Nachtigal. QMR: A quasi-minimal residual method for non-Hermitian linear systems. Numer. Math., 60:315–339, 1991. (Cited on p. 226)
[89] R. Freund and S. Ruscheweyh. On a class of Chebyshev approximation problems which arise in connection with a conjugate gradient type method. Numer. Math., 48:525–542, 1986. (Cited on p. 170)
[90] R.W. Freund. A transpose-free quasi-minimal residual algorithm for non-Hermitian linear systems. SIAM J. Sci. Comput., 14:470–482, 1993. (Cited on p. 226)
[91] R.W. Freund, M.H. Gutknecht, and N.M. Nachtigal. An implementation of the look-ahead Lanczos algorithm for non-Hermitian matrices, Parts I and II. Technical Report 90-11, Massachusetts Institute of Technology, Cambridge, MA, 1990. (Cited on p. 222)
[92] E. Gallopoulos and Y. Saad. Efficient solution of parabolic equations by Krylov approximation methods. SIAM J. Sci. Stat. Comput., 13:1236–1264, 1992. (Cited on p. 324)
[93] W. Gander, M.J. Gander, and F. Kwok. Scientific Computing: An Introduction Using Maple and MATLAB. Number 11 in Texts in Computational Science and Engineering. Springer, New York, 2014. (Cited on pp. xii, 2)
[94] W. Gander, G. Golub, and D. Gruntz. Solving linear equations by extrapolation. In J.S. Kowalik, editor, Supercomputing, volume F62 of NATO ASI Series. Springer-Verlag, Berlin, 1990. (Cited on p. 200)
[95] F.R. Gantmacher. The Theory of Matrices, Volumes 1 and 2. Chelsea, New York, 1959. Translated from the Russian with revisions from the author. (Cited on pp. 2, 320)
[96] E. Gekeler. On the solution of systems of equations by the epsilon algorithm of Wynn. Math. Comput., 26:427–436, 1972. (Cited on p. 175)
[97] J. Gilewicz. Approximants de Padé. Number 667 in Lecture Notes in Mathematics. Springer-Verlag, New York, 1978. (Cited on p. 103)
[98] D. Gleich, P. Glynn, G. Golub, and C. Greif. Three results on the PageRank vector: Eigenstructure, sensitivity, and the derivative. In A. Frommer, M.W. Mahoney, and D.B. Szyld, editors, Web Information Retrieval and Linear Algebra Algorithms, number 07071 in Dagstuhl Seminar Proceedings, Dagstuhl, Germany, 2007. (Cited on p. 270)
[99] I. Goldhirsch, S.A. Orszag, and B.K. Maulik. An efficient method for computing leading eigenvalues and eigenvectors of large unsymmetric matrices. J. Sci. Comput., 2:33–58, 1987. (Cited on p. 259)
[100] G.H. Golub and C. Greif. Arnoldi-type algorithms for computing stationary distribution vectors, with applications to PageRank. Technical Report SCCM-04-15, Scientific Computing/Computational Mathematics Program, Stanford University, 2004. (Cited on p. 279)
[101] G.H. Golub and C. Greif. An Arnoldi-type method for computing PageRank. BIT, 46:759–771, 2006. (Cited on p. 279)
[102] G.H. Golub and W. Kahan. Calculating the singular values and pseudo-inverse of a matrix. SIAM J. Numer. Anal., Series B, 2:205–224, 1965. (Cited on p. 96)
[103] G.H. Golub and C.F. Van Loan. Matrix Computations. The Johns Hopkins University Press, Baltimore, fourth edition, 2013. (Cited on pp. 2, 116, 233, 367) [104] G.H. Golub and C.D. Meyer Jr. Using the QR factorization and group inversion to compute, differentiate, and estimate the sensitivity of stationary probabilities for Markov chains. SIAM J. Algebraic Discrete Methods, 7:273–281, 1986. (Cited on p. 273) [105] G.H. Golub and H.A. van der Vorst. Eigenvalue computation in the 20th century. J. Comput. Appl. Math., 123:35–65, 2000. (Cited on p. 233) [106] W.B. Gragg. The Padé table and its relation to certain algorithms of numerical analysis. SIAM Rev., 14:1–62, 1972. (Cited on pp. 103, 179) [107] P.R. Graves-Morris. Vector valued rational interpolants I. Numer. Math., 42:331–348, 1983. (Cited on pp. 107, 109, 306) [108] P.R. Graves-Morris. Vector valued rational interpolants II. IMA J. Numer. Anal., 4:209– 224, 1984. (Cited on pp. 306, 310) [109] P.R. Graves-Morris. Solution of integral equations using generalized inverse, functionvalued Padé approximants, I. J. Comput. Appl. Math., 32:117–124, 1990. (Cited on p. 317) [110] P.R. Graves-Morris. Extrapolation methods for vector sequences. Numer. Math., 61:475– 487, 1992. (Cited on pp. 107, 126) [111] P.R. Graves-Morris. A new approach to acceleration of convergence of a sequence of vectors. Numer. Algorithms, 11:189–201, 1996. (Cited on p. 126) [112] P.R. Graves-Morris and C.D. Jenkins. Vector valued rational interpolants III. Constructive Approximation, 2:263–289, 1986. (Cited on pp. 107, 306) [113] P.R. Graves-Morris and E.B. Saff. A de Montessus theorem for vector valued rational interpolants. In P.R. Graves-Morris, E.B. Saff, and R.S. Varga, editors, Rational Approximation and Interpolation, number 1105 in Springer Lecture Notes in Mathematics, pages 227–242. Springer-Verlag, Heidelberg, 1984. (Cited on pp. 126, 306, 309, 311) [114] P.R. Graves-Morris and E.B. Saff. 
Row convergence theorems for generalised inverse vector-valued Padé approximants. J. Comput. Appl. Math., 23:63–85, 1988. (Cited on pp. 126, 306, 307) [115] P.R. Graves-Morris and E.B. Saff. Divergence of vector-valued rational interpolants to meromorphic functions. Rocky Mountain J. Math., 21:245–261, 1991. (Cited on pp. 306, 309) [116] P.R. Graves-Morris and E.B. Saff. An extension of a row convergence theorem for vector Padé approximants. J. Comput. Appl. Math., 34:315–324, 1991. (Cited on pp. 126, 306) [117] H.L. Gray, T.A. Atchison, and G.V. McWilliams. Higher order G-transformations. SIAM J. Numer. Anal., 8:365–381, 1971. (Cited on p. 105) [118] A. Greenbaum. Iterative Methods for Solving Linear Systems. SIAM, Philadelphia, 1997. (Cited on pp. 2, 191, 218, 219, 226) [119] M. Gulliksson. On the modified Gram–Schmidt algorithm for weighted and constrained linear least squares problems. BIT, 35:453–468, 1995. (Cited on pp. 363, 365) [120] M. Gulliksson and P.-Å. Wedin. Modifying the QR-decomposition to constrained and weighted linear least squares. SIAM J. Matrix Anal. Appl., 13:1298–1313, 1992. (Cited on pp. 363, 365)


[121] M.H. Gutknecht. Changing the norm in conjugate gradient type algorithms. SIAM J. Numer. Anal., 30:40–56, 1993. (Cited on p. 209) [122] M.H. Gutknecht. Lanczos-type solvers for nonsymmetric linear systems of equations. Acta Numer., 6:271–397, 1997. (Cited on p. 209) [123] L. Guttman. A general nonmetric technique for finding the smallest coordinate space for a configuration of points. Psychometrika, 33:469–506, 1968. (Cited on p. 283) [124] M. Hafez, S. Palaniswamy, G. Kuruvilla, and M.D. Salas. Applications of Wynn’s epsilon-algorithm to transonic flow calculations. AIAA paper 87-1143. In AIAA 8th Computational Fluid Dynamics Conference, Honolulu, Hawaii, 1987. (Cited on p. 265) [125] L.A. Hageman and D.M. Young. Applied Iterative Methods. Academic Press, New York, 1981. (Cited on pp. 2, 125) [126] M. Hanke and M. Hochbruck. A Chebyshev-like semiiteration for inconsistent linear systems. Electron. Trans. Numer. Anal., 1:89–103, 1993. (Cited on p. 279) [127] J.F. Hart, E.W. Cheney, C.L. Lawson, H.J. Maehly, C.K. Mesztenyi, J.R. Rice, H.J. Thacher Jr., and C. Witzgall, editors. Computer Approximations. Wiley, New York, 1968. Second edition, Krieger, Malabar, FL, 1978. (Cited on p. 328) [128] T.H. Haveliwala and S.D. Kamvar. The second eigenvalue of the Google matrix. Technical report, Stanford University, 2003. (Cited on p. 269) [129] T. Håvie. Generalized Neville type extrapolation schemes. BIT, 19:204–213, 1979. (Cited on p. 328) [130] P. Henrici. Elements of Numerical Analysis. Wiley, New York, 1964. (Cited on pp. 42, 187) [131] P. Henrici. Applied and Computational Complex Analysis, volume 1. Wiley, New York, 1974. (Cited on p. 106) [132] P. Henrici. Applied and Computational Complex Analysis, volume 2. Wiley, New York, 1977. (Cited on p. 106) [133] M. Hestenes and E. Stiefel. Methods of conjugate gradients for solving linear systems. J. Res. Nat. Bur. Standards, 49:409–436, 1952. (Cited on p. 213) [134] N.J. Higham.
The scaling and squaring method for the matrix exponential revisited. SIAM J. Matrix Anal. Appl., 26:1179–1193, 2005. (Cited on p. 324) [135] N.J. Higham. Functions of Matrices: Theory and Computation. SIAM, Philadelphia, 2008. (Cited on p. 320) [136] N.J. Higham and A.H. Al-Mohy. Computing matrix functions. Acta Numer., 19:159–208, 2010. (Cited on p. 324) [137] M. Hochbruck and Ch. Lubich. On Krylov subspace approximations to the matrix exponential operator. SIAM J. Numer. Anal., 34:1911–1925, 1997. (Cited on p. 324) [138] R.A. Horn and C.R. Johnson. Matrix Analysis. Cambridge University Press, Cambridge, 1985. (Cited on pp. 2, 56, 202) [139] R.A. Horn and C.R. Johnson. Topics in Matrix Analysis. Cambridge University Press, Cambridge, 1991. (Cited on pp. 2, 320, 387)


[140] S.H.M.J. Houben. Periodic steady state computation with the Poincaré-map method. In K. Antreich, R. Bulirsch, A. Gilg, and P. Rentrop, editors, Modeling, Simulation and Optimization of Integrated Circuits, volume 146 of International Series of Numerical Mathematics, pages 101–119, Springer, Basel, 2003. (Cited on p. 266) [141] S.H.M.J. Houben and J.M. Maubach. Periodic steady-state analysis of free-running oscillators. Technical Report RANA 00-16, Dept. of Mathematics and Computing Science, Eindhoven University of Technology, 2000. (Cited on p. 266) [142] A.S. Householder. The Theory of Matrices in Numerical Analysis. Blaisdell, New York, 1964. (Cited on p. 2) [143] I. Ipsen and C.D. Meyer. The idea behind Krylov methods. Amer. Math. Monthly, 105:889–899, 1998. (Cited on p. 279) [144] I.C.F. Ipsen. Numerical Matrix Analysis: Linear Systems and Least Squares. SIAM, Philadelphia, 2009. (Cited on p. 2) [145] A. Jameson, W. Schmidt, and E. Turkel. Numerical solution of the Euler equations by finite volume methods using Runge-Kutta time-stepping schemes. AIAA paper 81-1259. In AIAA 14th Fluid and Plasma Dynamics Conference, Palo Alto, California, 1981. (Cited on p. 265) [146] K. Jbilou and A. Messaoudi. Block extrapolation methods with applications. Appl. Numer. Math., 106:154–164, 2016. (Cited on p. 31) [147] K. Jbilou, L. Reichel, and H. Sadok. Vector extrapolation enhanced TSVD for linear discrete ill-posed problems. Numer. Algorithms, 51:195–208, 2009. (Cited on p. 263) [148] K. Jbilou and H. Sadok. Some results about vector extrapolation methods and related fixed-point iterations. J. Comput. Appl. Math., 36:385–398, 1991. (Cited on pp. 56, 175) [149] K. Jbilou and H. Sadok. LU-implementation of the modified minimal polynomial extrapolation method. IMA J. Numer. Anal., 19:549–561, 1999. (Cited on pp. 77, 93, 95) [150] K. Jbilou and H. Sadok. Vector extrapolation methods. Applications and numerical comparison. J. Comput. Appl. Math., 122:149–165, 2000.
(Cited on p. 120) [151] K. Jbilou and H. Sadok. Matrix polynomial and epsilon-type extrapolation methods with applications. Numer. Algorithms, 68:107–119, 2015. (Cited on p. 31) [152] W. Kahan. Inclusion theorems for clusters of eigenvalues of Hermitian matrices. Technical report, Department of Computer Science, University of Toronto, 1967. (Cited on pp. 244, 249, 393, 396) [153] S.D. Kamvar, T.H. Haveliwala, C.D. Manning, and G.H. Golub. Extrapolation methods for accelerating PageRank computations. In Proceedings of the Twelfth International World Wide Web Conference, pages 261–270. ACM, 2003. (Cited on pp. 78, 271) [154] S. Kaniel. Estimates for some computational techniques in linear algebra. Math. Comput., 20:369–378, 1966. (Cited on pp. 233, 249) [155] S. Kaniel and J. Stein. Least-square acceleration of iterative methods for linear equations. J. Optim. Theory Appl., 14:431–437, 1974. (Cited on pp. 31, 40) [156] H.C. Kao. Some aspects of bifurcation structure of laminar flow in curved ducts. J. Fluid Mech., 243:519–539, 1992. (Cited on p. 265) [157] C.T. Kelley. Iterative Methods for Linear and Nonlinear Equations. SIAM, Philadelphia, 1995. (Cited on pp. 2, 230, 231, 232)


[158] D. Kershaw. QD algorithms and algebraic eigenvalue problems. Linear Algebra Appl., 54:53–75, 1983. (Cited on p. 187) [159] U. Kirsch. Reduced basis approximations of structural displacements for optimal design. AIAA J., 29:1751–1758, 1991. (Cited on p. 318) [160] U. Kirsch. Improved stiffness-based first-order approximation for structural optimization. AIAA J., 33:143–150, 1995. (Cited on p. 318) [161] U. Kirsch and P.Y. Papalambros. Exact and accurate solutions in the approximate reanalysis of structures. AIAA J., 39:2198–2205, 2001. (Cited on p. 318) [162] P. Kunkel and V. Mehrmann. Differential-Algebraic Equations: Analysis and Numerical Solution. European Mathematical Society, Zürich, 2006. (Cited on p. 277) [163] P. Lancaster and M. Tismenetsky. The Theory of Matrices. Academic Press, London, 1985. (Cited on p. 320) [164] C. Lanczos. An iteration method for the solution of the eigenvalue problem of linear differential and integral operators. J. Res. Nat. Bur. Standards, 45:255–282, 1950. (Cited on p. 224) [165] C. Lanczos. Solution of systems of linear equations by minimized iterations. J. Res. Nat. Bur. Standards, 49:33–53, 1952. (Cited on p. 220) [166] C. Lanczos. Applied Analysis. Prentice Hall, Englewood Cliffs, NJ, 1956. (Cited on p. 381) [167] A.N. Langville and C.D. Meyer. Fiddling with PageRank. Technical report, Department of Mathematics, North Carolina State University, 2003. (Cited on p. 269) [168] A.N. Langville and C.D. Meyer. Deeper inside PageRank. Technical report, Department of Mathematics, North Carolina State University, 2004. (Cited on p. 269) [169] H. Le Ferrand. Convergence of the topological ε-algorithm for solving systems of nonlinear equations. Numer. Algorithms, 3:273–283, 1992. (Cited on p. 175) [170] R.B. Lehoucq, D.C. Sorensen, and C. Yang. ARPACK USERS GUIDE: Solution of Large Scale Eigenvalue Problems by Implicitly Restarted Arnoldi Methods. SIAM, Philadelphia, 1998. (Cited on p. 259) [171] C. Lemaréchal.
Une méthode de résolution de certains systèmes non linéaires bien posés. C. R. Acad. Sci. Paris, 272 A:747–776, 1971. (Cited on p. 52) [172] D. Levin. Development of non-linear transformations for improving convergence of sequences. Internat. J. Comput. Math., B3:371–388, 1973. (Cited on p. 341) [173] D. Levin and A. Sidi. Two new classes of nonlinear transformations for accelerating the convergence of infinite integrals and series. Appl. Math. Comput., 9:175–215, 1981. Originally appeared as a Tel Aviv University preprint in 1975. (Cited on p. 341) [174] J. Liesen and Z. Strakoš. Krylov Subspace Methods: Principles and Analysis. Oxford University Press, Oxford, 2012. (Cited on p. 2) [175] G.G. Lorentz. Approximation by incomplete polynomials (problems and results). In E.B. Saff and R.S. Varga, editors, Padé and Rational Approximations, pages 289–302. Academic Press, New York, 1977. (Cited on p. 163) [176] W.V. Lovitt. Linear Integral Equations. Dover, New York, reprinted edition, 1950. (Cited on p. 316)


[177] S. Lubkin. A method of summing infinite series. J. Res. Nat. Bur. Standards, 48:228–254, 1952. (Cited on p. 341) [178] D.G. Luenberger. Linear and Nonlinear Programming. Addison–Wesley, Reading, MA, second edition, 1984. (Cited on p. 196) [179] K. Mahler. Perfect systems. Compositio Math., 19:95–166, 1968. (Cited on p. 309) [180] T.A. Manteuffel. Adaptive procedure for estimating parameters for the nonsymmetric Tchebychev iteration. Numer. Math., 31:183–203, 1978. (Cited on p. 244) [181] J.C. Mason and D.C. Handscomb. Chebyshev Polynomials. Chapman & Hall/CRC, Boca Raton, FL, 2003. (Cited on p. 381) [182] A.C. Matos. Acceleration results for the vector E-algorithm. Numer. Algorithms, 1:237–260, 1991. (Cited on p. 332) [183] J.B. McLeod. A note on the ε-algorithm. Computing, 7:17–24, 1971. (Cited on p. 109) [184] J.A. Meijerink and H.A. van der Vorst. An iterative solution method for linear equations systems of which the coefficient matrix is a symmetric M-matrix. Math. Comput., 31:148–162, 1977. (Cited on p. 227) [185] M. Mešina. Convergence acceleration for the iterative solution of the equations X = AX + f. Comput. Methods Appl. Mech. Engrg., 10:165–173, 1977. (Cited on pp. 31, 40, 42) [186] A. Messaoudi. Matrix extrapolation algorithms. Linear Algebra Appl., 256:49–73, 1997. (Cited on p. 31) [187] G. Meurant. The Lanczos and Conjugate Gradient Algorithms: From Theory to Finite Precision Computations. Number 19 in Software, Environments, and Tools. SIAM, Philadelphia, 2006. (Cited on p. 2) [188] C.D. Meyer. Matrix Analysis and Applied Linear Algebra. SIAM, Philadelphia, 2000. (Cited on p. 2) [189] C.D. Meyer and G.W. Stewart. Derivatives and perturbations of eigenvectors. SIAM J. Numer. Anal., 25:679–691, 1988. (Cited on p. 273) [190] R.E. Mickens. Oscillations in Planar Dynamic Systems, volume 37 of Series on Advances in Mathematics for Applied Sciences. World Scientific, London, 1996. (Cited on p. 319) [191] C. Moler and C. Van Loan.
Nineteen dubious ways to compute the exponential of a matrix. SIAM Rev., 20:801–836, 1978. (Cited on p. 324) [192] C. Moler and C. Van Loan. Nineteen dubious ways to compute the exponential of a matrix, twenty-five years later. SIAM Rev., 45:3–49, 2003. (Cited on p. 324) [193] N. Nachtigal. A Look-Ahead Variant of Lanczos Algorithm and Its Application to the Quasi-Minimal Residual Method for Non-Hermitian Linear Systems. PhD thesis, Massachusetts Institute of Technology, 1991. (Cited on p. 222) [194] R.B. Nelson. Simplified calculation of eigenvector derivatives. AIAA J., 14:1201–1205, 1976. (Cited on p. 273) [195] N. Osada. A convergence acceleration method for some logarithmically convergent sequences. SIAM J. Numer. Anal., 27:178–189, 1990. (Cited on p. 341) [196] N. Osada. Acceleration methods for vector sequences. J. Comput. Appl. Math., 38:361–371, 1991. (Cited on pp. 341, 342)


[197] N. Osada. Extensions of Levin’s transformations to vector sequences. Numer. Algorithms, 2:121–132, 1992. (Cited on p. 341) [198] N. Osada. Extrapolation methods for some singular fixed point sequences. Numer. Algorithms, 3:335–344, 1992. (Cited on p. 341) [199] N. Osada. Vector sequence transformations for the acceleration of logarithmic convergence. J. Comput. Appl. Math., 66:391–400, 1996. (Cited on p. 341) [200] A.M. Ostrowski. On the convergence of the Rayleigh quotient iteration for the computation of characteristic roots and vectors, I. Arch. Ration. Mech. Anal., 1:233–241, 1958. (Cited on p. 393) [201] A.M. Ostrowski. On the convergence of the Rayleigh quotient iteration for the computation of characteristic roots and vectors, II. Arch. Ration. Mech. Anal., 2:423–428, 1959. (Cited on p. 393) [202] L. Page, S. Brin, R. Motwani, and T. Winograd. The PageRank citation ranking: Bringing order to the web. Technical Report 1999-66, Stanford InfoLab, 1999. (Cited on p. 271) [203] C.C. Paige. The Computation of Eigenvalues and Eigenvectors of Very Large Sparse Matrices. PhD thesis, University of London, 1971. (Cited on pp. 233, 249) [204] C.C. Paige and M.A. Saunders. Solution of sparse indefinite systems of linear equations. SIAM J. Numer. Anal., 12:617–629, 1975. (Cited on p. 219) [205] C.C. Paige and M.A. Saunders. LSQR: An algorithm for sparse linear equations and sparse least squares. ACM Trans. Math. Software, 8:43–71, 1982. (Cited on p. 219) [206] B. Parlett. Convergence of the QR algorithm. Numer. Math., 7:187–193, 1965. (Cited on p. 187) [207] B.N. Parlett. The Rayleigh quotient iteration and some generalizations for nonnormal matrices. Math. Comput., 28:679–693, 1974. (Cited on pp. 393, 396) [208] B.N. Parlett. The Symmetric Eigenvalue Problem. Number 20 in Classics in Applied Mathematics. SIAM, Philadelphia, second edition, 1998. First published in 1980. (Cited on pp. 2, 233, 238, 243, 244, 249, 366, 393) [209] B.N. Parlett and W.
Kahan. On the convergence of a practical QR algorithm. In IFIP Congress (1), pages 114–118, 1968. (Cited on p. 396) [210] B.N. Parlett, D.R. Taylor, and Z.A. Liu. A look-ahead Lanczos algorithm for unsymmetric matrices. Math. Comput., 44:105–124, 1985. (Cited on p. 222) [211] B.P. Pugachev. Acceleration of the convergence of iterative processes and a method of solving systems of nonlinear equations. U.S.S.R. Comput. Math. Math. Phys., 17:199–207, 1978. (Cited on p. 31) [212] W.C. Pye and T.A. Atchison. An algorithm for the computation of the higher order G-transformation. SIAM J. Numer. Anal., 10:1–7, 1973. (Cited on p. 105) [213] N. Rajeevan, K. Rajkopal, and G. Krishna. Vector-extrapolated fast maximum likelihood estimation for emission tomography. IEEE Trans. Med. Imaging, 11:9–20, 1992. (Cited on p. 263) [214] A. Ralston and P. Rabinowitz. A First Course in Numerical Analysis. McGraw-Hill, New York, second edition, 1978. (Cited on pp. 2, 42, 187, 391)


[215] C.R. Rao and S.K. Mitra. Generalized Inverses of Matrices and Applications. John Wiley, New York, 1971. (Cited on pp. 277, 373) [216] Lord Rayleigh. The Theory of Sound. Macmillan, New York, second revised edition, 1937. (Cited on p. 242) [217] K.C. Reddy and J.L. Jacocks. A locally implicit scheme for the Euler equations. AIAA paper 87-1144. In AIAA 8th Computational Fluid Dynamics Conference, Honolulu, Hawaii, 1987. (Cited on p. 265) [218] T.J. Rivlin. Chebyshev Polynomials: From Approximation Theory to Algebra and Number Theory. Wiley, New York, second edition, 1990. (Cited on p. 381) [219] D.E. Roberts. On the convergence of vector Padé approximants. J. Comput. Appl. Math., 70:95–109, 1996. (Cited on p. 306) [220] D.E. Roberts. The vector epsilon algorithm—A residual approach. Numer. Algorithms, 29:209–227, 2002. (Cited on p. 306) [221] D.E. Roberts. A vector generalisation of de Montessus’ theorem for the case of polar singularities on the boundary. J. Approx. Theory, 116:141–168, 2002. (Cited on pp. 126, 306) [222] Ch. Roland, R. Varadhan, and C.E. Frangakis. Squared polynomial extrapolation methods with cycling: An application to the positron emission tomography problem. Numer. Algorithms, 44:159–172, 2007. (Cited on p. 61) [223] G. Rosman, A.M. Bronstein, M.M. Bronstein, A. Sidi, and R. Kimmel. Fast multidimensional scaling using vector extrapolation. Technical Report CIS-2008-01, Computer Science Department, Technion–Israel Institute of Technology, 2008. (Cited on pp. 263, 280, 283) [224] G. Rosman, M.M. Bronstein, A.M. Bronstein, and R. Kimmel. Nonlinear dimensionality reduction by topologically constrained isometric embedding. Internat. J. Comput. Vision, 89:56–68, 2010. (Cited on p. 263) [225] G. Rosman, L. Dascal, A. Sidi, and R. Kimmel. Efficient Beltrami image filtering via vector extrapolation methods. SIAM J. Imaging Sci., 2:858–878, 2009. (Cited on p. 263) [226] C.S. Rudisill. Derivatives of eigenvalues and eigenvectors for a general matrix. 
AIAA J., 12:721–722, 1974. (Cited on p. 273) [227] C.S. Rudisill and Y.Y. Chu. Numerical methods for evaluating the derivatives of eigenvalues and eigenvectors. AIAA J., 13:834–837, 1975. (Cited on p. 274) [228] H. Rutishauser. Anwendungen des Quotienten-Differenzen-Algorithmus. Z. Angew. Math. Phys., 5:496–508, 1954. (Cited on p. 105) [229] H. Rutishauser. Der Quotienten-Differenzen-Algorithmus. Z. Angew. Math. Phys., 5:233–251, 1954. (Cited on p. 105) [230] H. Rutishauser. Der Quotienten-Differenzen-Algorithmus. Birkhäuser, Basel, 1957. (Cited on p. 105) [231] H. Rutishauser. Computational aspects of F.L. Bauer’s simultaneous iteration method. Numer. Math., 13:4–13, 1969. (Cited on p. 258) [232] H. Rutishauser. Simultaneous iteration method for symmetric matrices. Numer. Math., 16:205–223, 1970. (Cited on p. 258)


[233] Y. Saad. On the rates of convergence of the Lanczos and the block-Lanczos methods. SIAM J. Numer. Anal., 17:687–706, 1980. (Cited on pp. 233, 249) [234] Y. Saad. Variations on Arnoldi’s method for computing eigenelements of large unsymmetric matrices. Linear Algebra Appl., 34:269–295, 1980. (Cited on p. 233) [235] Y. Saad. Krylov subspace methods for solving large unsymmetric linear systems. Math. Comput., 37:105–126, 1981. (Cited on p. 191) [236] Y. Saad. The Lanczos biorthogonalization algorithm and other oblique projection methods for solving large unsymmetric systems. SIAM J. Numer. Anal., 19:485–506, 1982. (Cited on pp. 191, 222, 223) [237] Y. Saad. Chebyshev acceleration techniques for solving nonsymmetric eigenvalue problems. Math. Comput., 42:567–588, 1984. (Cited on p. 258) [238] Y. Saad. Analysis of some Krylov subspace approximations to the matrix exponential operator. SIAM J. Numer. Anal., 29:209–228, 1992. (Cited on pp. 324, 325) [239] Y. Saad. Iterative Methods for Sparse Linear Systems. SIAM, Philadelphia, second edition, 2003. (Cited on pp. 2, 191, 226) [240] Y. Saad. Numerical Methods for Large Eigenvalue Problems. SIAM, Philadelphia, revised edition, 2011. (Cited on pp. 2, 233, 249, 259) [241] Y. Saad and M.H. Schultz. GMRES: A generalized minimal residual algorithm for solving nonsymmetric linear systems. SIAM J. Sci. Stat. Comput., 7:856–869, 1986. (Cited on pp. 206, 207) [242] P. Sablonnière. Comparison of four algorithms accelerating the convergence of a subset of logarithmic fixed point sequences. Numer. Algorithms, 1:177–197, 1991. (Cited on p. 341) [243] P. Sablonnière. Asymptotic behaviour of iterated modified Δ² and θ₂ transforms on some slowly convergent sequences. Numer. Algorithms, 3:401–410, 1992. (Cited on p. 341) [244] H. Sadok. About Henrici’s transformation for accelerating vector sequences. J. Comput. Appl. Math., 29:101–110, 1990. (Cited on p. 42) [245] E.B. Saff.
An extension of Montessus de Ballore theorem on the convergence of interpolating rational functions. J. Approx. Theory, 6:63–67, 1972. (Cited on p. 355) [246] E.B. Saff and R.S. Varga. On incomplete polynomials. In L. Collatz, G. Meinardus, and H. Werner, editors, Numerische Methoden der Approximationstheorie, Band 4, volume 42, pages 281–298. Birkhäuser, Basel, 1978. (Cited on pp. 162, 163) [247] E.B. Saff and R.S. Varga. Uniform approximation by incomplete polynomials. Internat. J. Math. Math. Sci., 1:407–420, 1978. (Cited on p. 163) [248] E.B. Saff and R.S. Varga. The sharpness of Lorentz’s theorem on incomplete polynomials. Trans. Amer. Math. Soc., 249:163–186, 1979. (Cited on p. 163) [249] E.B. Saff and R.S. Varga. On incomplete polynomials II. Pacific J. Math., 92:161–172, 1981. (Cited on p. 163) [250] W.H.A. Schilders. Application of vector extrapolation and conjugate gradient type methods to the semiconductor device problem. In M.S. Moonen, G.H. Golub, and B.L.R. De Moor, editors, Linear Algebra for Large Scale and Real-Time Applications, volume 232 of NATO ASI Series, pages 415–416, Kluwer Academic, Boston, 1992. (Cited on p. 263)


[251] W.H.A. Schilders, P.A. Gough, and K. Whight. Extrapolation techniques for improved convergence in semiconductor device simulation. In Proceedings of the Eighth International NASECODE Conference, pages 94–95, Vienna, 1992. (Cited on p. 263) [252] J.R. Schmidt. On the numerical solution of linear simultaneous equations by an iterative method. Philos. Mag., 7:369–383, 1941. (Cited on p. 99) [253] C. Schneider. Vereinfachte Rekursionen zur Richardson-Extrapolation in Spezialfällen. Numer. Math., 24:177–184, 1975. (Cited on p. 328) [254] S. Serra-Capizzano. Jordan canonical form of the Google matrix: A potential contribution to the PageRank computation. SIAM J. Matrix Anal. Appl., 27:305–312, 2005. (Cited on p. 270) [255] D. Shanks. Nonlinear transformations of divergent and slowly convergent sequences. J. Math. Phys., 34:1–42, 1955. (Cited on pp. 99, 103) [256] Y. Shapira, M. Israeli, and A. Sidi. An automatic multigrid method for the solution of sparse linear systems. In The Sixth Copper Mountain Conference on Multigrid Methods: NASA Conference Publication 3224, Part 2, pages 567–582, 1993. (Cited on p. 263) [257] Y. Shapira, M. Israeli, and A. Sidi. Towards automatic multigrid algorithms for SPD, nonsymmetric and indefinite problems. SIAM J. Sci. Comput., 17:439–453, 1996. (Cited on p. 263) [258] Y. Shapira, M. Israeli, A. Sidi, and U. Zrahia. Preconditioning spectral element schemes for definite and indefinite problems. Numer. Methods Partial Differential Equations, 15:535–543, 1999. (Cited on p. 263) [259] A. Sidi. Convergence properties of some nonlinear sequence transformations. Math. Comput., 33:315–326, 1979. (Cited on p. 341) [260] A. Sidi. Analysis of convergence of the T-transformation for power series. Math. Comput., 35:833–850, 1980. (Cited on p. 341) [261] A. Sidi. Convergence and stability properties of minimal polynomial and reduced rank extrapolation algorithms. SIAM J. Numer. Anal., 23:197–209, 1986. Originally appeared as NASA TM-83443 (1983).
(Cited on pp. 49, 50, 119, 137) [262] A. Sidi. Extrapolation vs. projection methods for linear systems of equations. J. Comput. Appl. Math., 22:71–88, 1988. (Cited on pp. 45, 54, 126, 146, 198) [263] A. Sidi. On extensions of the power method for normal operators. Linear Algebra Appl., 120:207–224, 1989. (Cited on pp. 119, 126, 142) [264] A. Sidi. Application of vector extrapolation methods to consistent singular linear systems. Appl. Numer. Math., 6:487–500, 1990. (Cited on pp. 278, 279) [265] A. Sidi. On a generalization of the Richardson extrapolation process. Numer. Math., 57:365–377, 1990. (Cited on p. 332) [266] A. Sidi. Efficient implementation of minimal polynomial and reduced rank extrapolation methods. J. Comput. Appl. Math., 36:305–337, 1991. Originally appeared as NASA TM-103240 ICOMP-90-20 (1990). (Cited on pp. 65, 71, 76, 77, 95, 397) [267] A. Sidi. Development of iterative techniques and extrapolation methods for Drazin inverse solution of consistent or inconsistent singular linear systems. Linear Algebra Appl., 167:171–203, 1992. (Cited on p. 279)


[268] A. Sidi. Convergence of intermediate rows of minimal polynomial and reduced rank extrapolation tables. Numer. Algorithms, 6:229–244, 1994. (Cited on pp. 49, 119, 126, 127, 148, 151, 153) [269] A. Sidi. Rational approximations from power series of vector-valued meromorphic functions. J. Approx. Theory, 77:89–111, 1994. (Cited on pp. 149, 285, 297, 316, 345) [270] A. Sidi. Application of vector-valued rational approximations to the matrix eigenvalue problem and connections with Krylov subspace methods. SIAM J. Matrix Anal. Appl., 16:1341–1369, 1995. (Cited on pp. 149, 236, 243, 244, 251, 252, 253, 258, 259, 285, 297) [271] A. Sidi. Extension and completion of Wynn’s theory on convergence of columns of the epsilon table. J. Approx. Theory, 86:21–40, 1996. (Cited on pp. 99, 102, 126) [272] A. Sidi. A complete convergence and stability theory for a generalized Richardson extrapolation process. SIAM J. Numer. Anal., 34:1761–1778, 1997. (Cited on p. 129) [273] A. Sidi. Krylov subspace methods for eigenvalues with special properties and their analysis for normal matrices. Linear Algebra Appl., 280:129–162, 1998. (Cited on pp. 119, 126, 142, 251, 252, 253, 256) [274] A. Sidi. A unified approach to Krylov subspace methods for the Drazin-inverse solution of singular nonsymmetric linear systems. Linear Algebra Appl., 298:99–113, 1999. (Cited on p. 279) [275] A. Sidi. A new algorithm for the higher order G-transformation. Preprint, Computer Science Department, Technion–Israel Institute of Technology, 2000. (Cited on p. 104) [276] A. Sidi. DGMRES: A GMRES type algorithm for Drazin-inverse solution of singular nonsymmetric linear systems. Linear Algebra Appl., 335:189–204, 2001. (Cited on p. 279) [277] A. Sidi. A convergence and stability study of the iterated Lubkin transformation and the θ-algorithm. Math. Comput., 72:419–433, 2003. (Cited on p. 341) [278] A. Sidi. Practical Extrapolation Methods: Theory and Applications.
Number 10 in Cambridge Monographs on Applied and Computational Mathematics. Cambridge University Press, Cambridge, 2003. (Cited on pp. 99, 102, 103, 104, 105, 106, 126, 187, 272, 273, 328, 341, 342) [279] A. Sidi. Approximation of largest eigenpairs of matrices and applications to PageRank computation. Technical Report CS-2004-16, Computer Science Dept., Technion–Israel Institute of Technology, 2004. (Cited on pp. 59, 78, 266, 271, 279) [280] A. Sidi. A new approach to vector-valued rational interpolation. J. Approx. Theory, 130:177–187, 2004. (Cited on pp. 272, 345, 347, 349) [281] A. Sidi. Algebraic properties of some new vector-valued rational interpolants. J. Approx. Theory, 141:142–161, 2006. (Cited on pp. 272, 345, 349, 353) [282] A. Sidi. Unified treatment of regula falsi, Newton–Raphson, secant, and Steffensen methods for nonlinear equations. J. Online Math. Appl., 6, 2006. (Cited on p. 42) [283] A. Sidi. A de Montessus type convergence study for a vector-valued rational interpolation procedure. Israel J. Math., 163:189–215, 2008. (Cited on pp. 272, 345, 355) [284] A. Sidi. A de Montessus type convergence study of a least-squares vector-valued rational interpolation procedure. J. Approx. Theory, 155:75–96, 2008. (Cited on pp. 272, 345, 355, 357)


[285] A. Sidi. Vector extrapolation methods with applications to solution of large systems of equations and to PageRank computations. Comput. Math. Appl., 56:1–24, 2008. (Cited on pp. 59, 78, 156, 266, 271, 279) [286] A. Sidi. Methods for acceleration of convergence (extrapolation) of vector sequences. In B.W. Wah, editor, Wiley Encyclopedia of Computer Science and Engineering, volume 3, pages 1828–1846. Wiley, Hoboken, NJ, 2009. ISBN: 978-0-471-38393-2. (Cited on p. 31) [287] A. Sidi. A de Montessus type convergence study of a least-squares vector-valued rational interpolation procedure II. Comput. Methods Funct. Theory, 10:223–247, 2010. (Cited on pp. 272, 345, 355) [288] A. Sidi. Acceleration of convergence of general linear sequences by the Shanks transformation. Numer. Math., 119:725–764, 2011. (Cited on p. 99) [289] A. Sidi. Review of two vector extrapolation methods of polynomial type with applications to large-scale problems. J. Comput. Sci., 3:92–101, 2012. (Cited on p. 31) [290] A. Sidi. SVD-MPE: An SVD-based vector extrapolation method of polynomial type. Appl. Math., 7:1260–1278, 2016. Special issue on Applied Iterative Methods. (Cited on pp. 31, 43, 50) [291] A. Sidi. A de Montessus type convergence study for a vector-valued rational interpolation procedure of epsilon class. Jaen J. Approx., to appear, 2017. (Cited on pp. 272, 345, 355) [292] A. Sidi. Minimal polynomial and reduced rank extrapolation methods are related. Adv. Comput. Math., 43:151–170, 2017. (Cited on pp. 31, 65, 71, 77, 83) [293] A. Sidi. On a vectorized version of a generalized Richardson extrapolation process. Numer. Algorithms, 74:937–949, 2017. (Cited on p. 332) [294] A. Sidi and J. Bridger. Convergence and stability analyses for some vector extrapolation methods in the presence of defective iteration matrices. J. Comput. Appl. Math., 22:35–61, 1988. (Cited on pp. 49, 119, 148, 149, 151, 244) [295] A. Sidi and M.L. Celestina. 
Convergence acceleration for vector sequences and applications to computational fluid dynamics. Lewis Research Center, NASA TM-101327, 1988. AIAA 28th Aerospace Sciences Meeting paper AIAA-90-0338. (Cited on p. 265) [296] A. Sidi and W.F. Ford. Quotient-difference type generalizations of the power method and their analysis. J. Comput. Appl. Math., 32:261–272, 1990. (Cited on p. 184) [297] A. Sidi, W.F. Ford, and D.A. Smith. Acceleration of convergence of vector sequences. Lewis Research Center, NASA TM-82931, 1982. (Cited on p. xiv) [298] A. Sidi, W.F. Ford, and D.A. Smith. Acceleration of convergence of vector sequences. Lewis Research Center, NASA TP-2193, 1983. (Cited on p. xiv) [299] A. Sidi, W.F. Ford, and D.A. Smith. Acceleration of convergence of vector sequences. SIAM J. Numer. Anal., 23:178–196, 1986. Originally appeared as NASA TP-2193 (1983). (Cited on pp. xiv, 31, 32, 48, 49, 50, 93, 119, 126, 128, 130) [300] A. Sidi and M. Israeli. A hybrid extrapolation method: The Richardson–Shanks transformation, 1989. Unpublished research. (Cited on p. 273) [301] A. Sidi and Y. Kanevsky. Orthogonal polynomials and semi-iterative methods for the Drazin-inverse solution of singular linear systems. Numer. Math., 93:563–581, 2003. (Cited on p. 279)


[302] A. Sidi and V. Kluzner. A Bi-CG-type iterative method for Drazin-inverse solution of singular inconsistent nonsymmetric linear systems of arbitrary index. Electron. J. Linear Algebra, 6:72–94, 1999/2000. (Cited on p. 279) [303] A. Sidi and Y. Shapira. Upper bounds for convergence rates of vector extrapolation methods on linear systems with initial iterations. Technical Report 701, Computer Science Department, Technion–Israel Institute of Technology, 1991. Appeared also as NASA Technical Memorandum 105608, ICOMP-92-09 (1992). (Cited on pp. 58, 156, 161, 166, 168, 171, 172, 173, 230, 379) [304] A. Sidi and Y. Shapira. Upper bounds for convergence rates of acceleration methods with initial iterations. Numer. Algorithms, 18:113–132, 1998. (Cited on pp. 58, 161, 166, 229, 230, 379) [305] S. Skelboe. Computation of the periodic steady-state response of nonlinear networks by extrapolation methods. IEEE Trans. Circuits Syst., 27:161–175, 1980. (Cited on pp. 31, 175, 266) [306] D.A. Smith, W.F. Ford, and A. Sidi. Extrapolation methods for vector sequences. SIAM Rev., 29:199–233, 1987. Erratum: SIAM Rev., 30:623–624, 1988. (Cited on pp. 31, 42, 109, 175) [307] P. Sonneveld. CGS, a fast Lanczos-type solver for nonsymmetric linear systems. SIAM J. Sci. Stat. Comput., 10:36–52, 1989. (Cited on p. 226) [308] D.C. Sorensen. Implicit application of polynomial filters in a k-step Arnoldi method. SIAM J. Matrix Anal. Appl., 13:357–385, 1992. (Cited on p. 259) [309] G.W. Stewart. Introduction to Matrix Computations. Academic Press, New York, 1973. (Cited on p. 2) [310] G.W. Stewart. Matrix Algorithms, Volume I: Basic Decompositions. SIAM, Philadelphia, 1998. (Cited on pp. 2, 367) [311] G.W. Stewart. Matrix Algorithms, Volume II: Eigensystems. SIAM, Philadelphia, 2001. (Cited on p. 2) [312] E.L. Stiefel. Relaxationsmethoden bester Strategie zur Lösung linearer Gleichungssysteme. Comment. Math. Helv., 29:157–179, 1955. (Cited on p. 213) [313] J. Stoer and R. Bulirsch.
Introduction to Numerical Analysis. Springer-Verlag, New York, third edition, 2002. (Cited on pp. 2, 42, 346, 354, 367) o. Orthogonal Polynomials, volume 23 of American Mathematical Society Collo[314] G. Szeg˝ quium Publications. American Mathematical Society, Providence, RI, 1939. (Cited on pp. 378, 385, 386) [315] H. Tal-Ezer. Polynomial approximation of functions of matrices and applications. J. Sci. Comput., 4:25–60, 1989. (Cited on p. 324) [316] H. Tal-Ezer. High degree polynomial interpolation in Newton form. SIAM J. Sci. Stat. Comput., 12:648–667, 1991. (Cited on p. 324) [317] H. Tal-Ezer. On restart and error estimation for Krylov approximation of w = f (A)v. SIAM J. Sci. Comput., 29:2426–2441, 2007. (Cited on p. 324) [318] R.C.E. Tan. Accelerating the convergence of an iterative method for derivatives of eigensystems. J. Comput. Phys., 67:230–235, 1986. (Cited on p. 276)


[319] R.C.E. Tan. Computing derivatives of eigensystems by the topological ε-algorithm. Appl. Numer. Math., 3:539–550, 1987. (Cited on p. 276)
[320] R.C.E. Tan. Computing derivatives of eigensystems by the vector ε-algorithm. IMA J. Numer. Anal., 7:485–494, 1987. (Cited on p. 276)
[321] R.C.E. Tan. Implementation of the topological ε-algorithm. SIAM J. Sci. Stat. Comput., 9:839–848, 1988. (Cited on p. 114)
[322] R.C.E. Tan. Some acceleration methods for iterative computation of derivatives of eigenvalues and eigenvectors. Internat. J. Numer. Methods Engrg., 28:1505–1519, 1989. (Cited on p. 276)
[323] N.M. Temme. Asymptotic Methods for Integrals, volume 6 of Series in Analysis. World Scientific, Singapore, 2015. (Cited on p. 332)
[324] L.N. Trefethen and D. Bau III. Numerical Linear Algebra. SIAM, Philadelphia, 1997. (Cited on pp. 2, 367)
[325] H.A. van der Vorst. Bi-CGSTAB: A fast and smoothly converging variant of Bi-CG for the solution of nonsymmetric linear systems. SIAM J. Sci. Stat. Comput., 13:631–644, 1992. (Cited on p. 226)
[326] H.A. van der Vorst. Iterative Krylov Methods for Large Linear Systems. Number 13 in Cambridge Monographs on Applied and Computational Mathematics. Cambridge University Press, Cambridge, 2003. (Cited on pp. 2, 191, 226)
[327] J. van Iseghem. Vector Padé approximants. In R. Vichnevetsky and J. Vignes, editors, Numerical Mathematics and Applications, pages 73–77. North Holland, Amsterdam, 1985. (Cited on p. 309)
[328] J. van Iseghem. An extended cross-rule for vector Padé approximants. Appl. Numer. Math., 2:143–155, 1986. (Cited on p. 309)
[329] J. van Iseghem. Approximants de Padé Vectoriels. PhD thesis, Université de Lille I, 1987. (Cited on p. 309)
[330] J. van Iseghem. Vector orthogonal relations, vector QD-algorithm. J. Comput. Appl. Math., 19:141–150, 1987. (Cited on p. 309)
[331] A.H. Van Tuyl. Acceleration of convergence of a family of logarithmically convergent sequences. Math. Comput., 63:229–246, 1994. (Cited on p. 341)
[332] R. Varadhan and Ch. Roland. Squared extrapolation methods (SQUAREM): A new class of simple and efficient numerical schemes for accelerating the convergence of the EM algorithm. Department of Biostatistics working paper, Johns Hopkins University, 2004. (Cited on p. 61)
[333] R.S. Varga. Matrix Iterative Analysis. Number 27 in Springer Series in Computational Mathematics. Springer-Verlag, New York, second edition, 2000. (Cited on pp. 2, 161, 268)
[334] P.K.W. Vinsome. ORTHOMIN: An iterative method for solving sparse sets of simultaneous linear equations. In Proceedings of the Fourth Symposium on Reservoir Simulation, pages 149–159, 1976. (Cited on p. 207)
[335] Y.V. Vorobyev. Method of Moments in Applied Mathematics. Gordon and Breach, New York, 1965. (Cited on p. 191)


[336] H.F. Walker. Implementation of the GMRES method using Householder transformations. SIAM J. Sci. Stat. Comput., 9:152–163, 1988. (Cited on p. 204)
[337] H.F. Walker. Residual smoothing and peak/plateau behavior in Krylov subspace methods. Appl. Numer. Math., 19:279–286, 1995. (Cited on p. 209)
[338] J.L. Walsh. Interpolation and Approximation, volume 20 of American Mathematical Society Colloquium Publications. American Mathematical Society, Providence, RI, third edition, 1960. (Cited on p. 355)
[339] D.S. Watkins. Fundamentals of Matrix Computations. John Wiley & Sons, New York, second edition, 2002. (Cited on p. 2)
[340] R. Weiss. Convergence Behavior of Generalized Conjugate Gradient Methods. PhD thesis, University of Karlsruhe, 1990. (Cited on pp. 89, 209)
[341] R. Weiss. Properties of generalized conjugate gradient methods. Numer. Linear Algebra Appl., 1:45–63, 1994. (Cited on p. 209)
[342] Chun Wen, Ting-Zhu Huang, De-An Wu, and Liang Li. The finest level acceleration of multilevel aggregation for Markov chains. Internat. J. Numer. Anal. Modeling, Series B, 2:27–41, 2011. (Cited on p. 263)
[343] L.B. Wigton, N.J. Yu, and D.P. Young. GMRES acceleration of computational fluid dynamics codes. In Proceedings of the 7th AIAA Computational Fluid Dynamics Conference, pages 66–74, Washington, D.C., 1985. AIAA. (Cited on pp. 230, 265)
[344] J.H. Wilkinson. The Algebraic Eigenvalue Problem. Clarendon Press, Oxford, 1965. (Cited on pp. 2, 233, 244)
[345] J. Wimp. Sequence Transformations and Their Applications. Academic Press, New York, 1981. (Cited on pp. 103, 106, 332, 339)
[346] Y.S. Wong and M. Hafez. Conjugate gradients methods applied to transonic finite difference and finite element calculations. AIAA J., 20:1526–1534, 1982. (Cited on p. 265)
[347] Baisheng Wu, Zhengguang Li, and Shunhua Li. The implementation of a vector-valued rational approximate method in structural reanalysis problems. Comput. Methods Appl. Mech. Engrg., 192:1773–1784, 2003. (Cited on p. 318)
[348] Baisheng Wu and Huixiang Zhong. Summation and perturbation solutions to non-linear oscillations. Acta Mech., 154:121–127, 2002. (Cited on pp. 318, 320)
[349] Baisheng Wu and Huixiang Zhong. Application of vector-valued rational approximations to a class of non-linear oscillations. Internat. J. Non-Linear Mech., 38:249–254, 2003. (Cited on pp. 318, 320)
[350] P. Wynn. On a device for computing the e_m(S_n) transformation. Mathematical Tables and Other Aids to Computation, 10:91–96, 1956. (Cited on pp. 99, 103)
[351] P. Wynn. On a procrustean technique for the numerical transformation of slowly convergent sequences and series. Proc. Cambridge Philos. Soc., 52:663–671, 1956. (Cited on pp. 341, 342)
[352] P. Wynn. Acceleration techniques for iterated vector and matrix problems. Math. Comput., 16:301–322, 1962. (Cited on p. 106)
[353] P. Wynn. Continued fractions whose coefficients obey a noncommutative law of multiplication. Arch. Ration. Mech. Anal., 12:273–312, 1963. (Cited on p. 109)


[354] P. Wynn. General purpose vector epsilon algorithm procedures. Numer. Math., 6:22–36, 1964. (Cited on p. 106)
[355] P. Wynn. On the convergence and stability of the epsilon algorithm. SIAM J. Numer. Anal., 3:91–122, 1966. (Cited on p. 126)
[356] P. Wynn. Upon systems of recursions which obtain among the quotients of the Padé table. Numer. Math., 8:264–269, 1966. (Cited on p. 103)
[357] D.M. Young and K.C. Jea. Generalized conjugate gradient acceleration of nonsymmetrizable iterative methods. Linear Algebra Appl., 34:159–194, 1980. (Cited on pp. 206, 207)
[358] S. Yungster. Shock wave/boundary layer interactions in premixed hydrogen-air hypersonic flows: a numerical study. AIAA paper 91-0413. In AIAA 29th Aerospace Sciences Meeting, Reno, Nevada, 1991. (Cited on p. 265)
[359] S. Yungster. Numerical study of shock-wave/boundary-layer interactions in premixed combustible gases. AIAA J., 30:2379–2387, 1992. (Cited on p. 265)
[360] L. Zhou and H.F. Walker. Residual smoothing techniques for iterative methods. SIAM J. Sci. Comput., 15:297–312, 1994. (Cited on p. 209)
[361] Xiaojing Zhu, Chunjing Li, and Chuanqing Gu. A new method for computing the matrix exponential operation based on vector valued rational approximations. J. Comput. Appl. Math., 236:2306–2316, 2012. (Cited on p. 324)

Index

ρ algorithm, 342
  topological, 343
  vector, 343
θ algorithm, 341
  generalized, 341
  vector, 342
Aitken Δ²-process, 52
Arnoldi–Gram–Schmidt orthogonalization, 204
Bi-CG, 224–226
Cauchy–Schwarz inequality, 9
Cayley–Hamilton theorem, 7, 32
CG, 213–217
  preconditioning of, 228
CGNE, 217
CGNR, 217
characteristic polynomial, 32
Chebyshev polynomials, 381
condition number, 12
CR, 213–217
Drazin inverse, 277
Drazin inverse solution, 277
E-algorithm, 328
eigenpair, 7
eigenpair derivatives, 273
  iterative computation, 273
eigenvalue, 7
  algebraic multiplicity of, 7
  multiple, 7
    defective, 7
    nondefective, 7
  simple, 7
eigenvector, 7
  generalized, 20
eigenvectors with known eigenvalues, 266
  hybrid method for, 272
  PageRank vector, 268
epsilon algorithms, 99–118
  scalar epsilon algorithm, see SEA
  topological epsilon algorithm, see TEA
  vector epsilon algorithm, see VEA
ETA2, 116
ETEA1, 116
extrapolation table, 119
FGREP, 328, 329
  determinant representation of, 329
  vectorized, 329
    FGREP1, 330
    FGREP2, 330
FGREP1, 330
  determinant representation of, 330
FGREP2, 330
  convergence theory for, 332
  determinant representation of, 331
first generalized Richardson extrapolation process, see FGREP
FOM, 203–206
  Hermitian case of
    three-term recursion for, 209–213
  relation of GMR to, 208
  short recurrence for, 219
FS-algorithm, 328
full orthogonalization method, see FOM
generalized inverse vector-valued Padé approximants, 305
  convergence theory for, 305
  determinant representation for, 307
generalized MPE, 44
GMR, 197, 200, 206–208
  Hermitian case of
    three-term recursion for, 209–213
  relation of FOM to, 208
  short recurrence for, 219
GMRES, 207
Google Web matrix, 269
Gram–Schmidt orthogonalization (GS), 361
group inverse, 277
group inverse solution, 277
Henrici’s method, 42
Hölder inequality, 9
IMMPE, 349
IMPE, 348
inner product, 12
  Euclidean, 13
  weighted, 13
inverse power method, 391–396
  with a fixed shift, 391–392
  with variable shifts, 392–396
    convergence theory, 393–396
ITEA, 349
iterative majorization, 280
iterative methods for linear systems, 16–27
  alternating direction implicit (ADI), 24
  convergence of, 17, 25
  error formulas for, 19–22
  Gauss–Seidel, 23
  Jacobi, 23
  Richardson iteration, 23
  successive overrelaxation (SOR), 24
    optimal SOR, 24
  symmetric Gauss–Seidel, 23
  symmetric SOR (SSOR), 24
iterative methods for nonlinear systems, 15
Jacobi polynomials, 385–386
Jordan canonical form, 17
Jordan factorization, 17
Krylov subspace, 197
Krylov subspace methods for eigenvalue problems, 235–259
  Kaniel–Paige–Saad convergence theory, 249
  method of Arnoldi, 239
  method of Lanczos, 245
  treatment of special eigenpairs, 250
Krylov subspace methods for linear systems, 197–230
  error analysis for, 200–203
  method of Arnoldi for linear systems, see FOM
  method of generalized minimal residuals, see GMR
  method of Lanczos, 198
  vector extrapolation methods and, 198
Krylov subspace methods for nonlinear systems, 230–232
Lanczos biorthogonalization, 221–223, 245
least-squares problems, 13, 67–71
  constrained, 69
  unconstrained, 67
Lindstedt–Poincaré method, 319
LU factorization, 92
matrix
  defective, 7
  diagonal, 4
  diagonalizable, 7
  Hermitian, 4
  Hessenberg, 5
  irreducible, 14
  irreducibly diagonally dominant, 14
  M, 14
  nondefective, 7
  nondiagonalizable, 7
  nonnegative, 14
  normal, 4
  null space of, 6
  permutation, 4
  positive, 14
  positive definite, 8
  positive semidefinite, 8
  range of, 6
  rank of, 6
  reducible, 14
  singular
    Drazin inverse of, 277
    group inverse of, 277
    index of, 277
  skew-Hermitian, 4
  skew-symmetric, 4
  stochastic, 268
  strictly diagonally dominant, 14
  symmetric, 4
  trace of, 5
  triangular, 5
  unitary, 4
matrix functions, 320
  Krylov subspaces and, 324
matrix norms, 10
  Frobenius, 11
  multiplicative, 10
  natural, 10
    l_p-induced, 11
  Schur, 11
McLeod’s theorem, 109
method of conjugate gradients, see CG
method of conjugate residuals, see CR
method of Lanczos for linear systems, 220–224
minimal polynomial, 33
  with respect to a vector, 34
minimal polynomial extrapolation, see MPE
minimal residual method (MR), 196
MMPE, 42
  algorithm for, 92–95
  compact representation of, 55
  derivation of, 42
  determinant representation of, 49
  error estimation for, 95
  finite termination of, 45
  recursion relations for, 179–180
  row convergence of, 122–126
    generalized, 148–151
modified Gram–Schmidt orthogonalization (MGS), 71, 364
  reorthogonalization and, 365
modified minimal polynomial extrapolation, see MMPE
monic polynomial, 32
Moore–Penrose generalized inverse, 373–375
  connection with least-squares problems, 374
  SVD representation of, 374
MPE, 39
  algorithm for, 71–78
  compact representation of, 55
  derivation of, 39
  determinant representation of, 49
  error bounds for
    simple, 156–157
    via orthogonal polynomials, 158–170
  error estimation for, 74
  finite termination of, 45
  further algorithm for, 78–81
  recursion relations for, 181–183
  relation of RRE to, 86–89
  row convergence of, 122–127
    generalized, 148–153
multidimensional scaling (MDS), 279
  SMACOF, 280
    iterative majorization, 280
normal equations, 13
orthogonal complement, 6
orthogonal polynomials, 377–380
Padé table, 102
PageRank vector, 268
peak-plateau phenomenon, 88, 209
Perron–Frobenius theorem, 268
polynomial extrapolation methods, 31–64
  compact representations of, 55
  cycling of, 57–63
    frozen coefficients, 59
    full, 58
    parallel, 62
  derivation of, 39–45
  determinant representations of, 49–55
  finite termination of, 45–47
  minimal polynomial extrapolation, see MPE
  modified minimal polynomial extrapolation, see MMPE
  numerical stability of, 56
  reduced rank extrapolation, see RRE
  SVD-based minimal polynomial extrapolation, see SVD-MPE
power method, 389–391
preconditioning, 226
projection methods
  eigenvalue problems and, 233–235
  linear systems and, 191–197
QR factorization, 66, 361–366
Rayleigh quotient, 241, 387
  properties of, 387–389
Rayleigh quotient power method, 241
reduced rank extrapolation, see RRE
regular splitting, 14
Ritz pair, 235
Ritz value, 235
Ritz vector, 235
RRE, 40
  algorithm for, 71–78
  compact representation of, 55
  derivation of, 40
  determinant representation of, 49
  error bounds for
    simple, 156–157
    via orthogonal polynomials, 158–170
  error estimation for, 74
  finite termination of, 45
  further algorithm for, 78–81
  recursion relations for, 181–183
  relation of MPE to, 86–89
  row convergence of, 122–127
    generalized, 148–153
  stagnation of, 86
Samelson inverse, 106
SEA, 99–106
  application of, 106
  determinant representation of, 102
Shanks transformation, 100–106
  algorithms for, 103–106
    epsilon algorithm, 103
    FS/qd algorithm, 104
simultaneous Padé approximants, 308
  convergence theory for, 310
  determinant representation for, 309
  directed, 310
singular linear systems, 277
singular value, 8, 92, 368
singular value decomposition, see SVD
singular vector, 92, 368
SMACOF, 280
SMMPE, 285
SMPE, 285
  compact formula for, 289
spectral radius, 8
STEA, 285
steady-state solution, 263
steepest descent (SD) method, 196
Stein–Rosenberg theorem, 25
stochastic matrix, 268
stress function, 281
SVD, 92, 367–371
  full, 367
  reduced, 369
SVD-based minimal polynomial extrapolation, see SVD-MPE
SVD-MPE, 43
  algorithm for, 95–98
  compact representation of, 56
  derivation of, 43
  determinant representation of, 49
  error estimation for, 98
  finite termination of, 45
Sylvester determinant identity, 179
TEA, 110–118
  implementation via ETEA1, ETEA2, 115
  implementation via STEA1, STEA2, 117
  recursion relations for, 179–180
  row convergence of, 122–126
    generalized, 148–151
TEA1, 110
  determinant representation of, 111
TEA2, 112
  determinant representation of, 113
VEA, 106–110
  determinant representation of, 107
  finite termination of, 109
vector norms, 9
  equivalence of, 9
  l_p-norms, 9
vector-valued rational approximations, 285–303, 313–325
  algebraic properties of, 290
  convergence of, 297
  derivation of, 286
  determinant representations of, 287
  fixed-point iterations and, 313
  Fredholm integral equations and, 316
  Krylov subspace methods for eigenvalue problems and, 314
  matrix functions and, 320
  nonlinear differential equations and, 318
    Lindstedt–Poincaré method, 319
  reanalysis of structures and, 317
  reproducing property of, 292
  SMMPE, 285
  SMPE, 285
  STEA, 285
vector-valued rational interpolation methods, 345–357
  convergence theory for, 354
  determinant representations of, 352
  development of, 345
  IMMPE, 349
  IMPE, 348
  ITEA, 349
  limiting properties of, 350
  projection properties of, 351
  reproducing properties of, 353
  symmetry properties of, 352

CS&E

Vector Extrapolation Methods with Applications is the first book fully dedicated to the subject of vector extrapolation methods. It is a self-contained, up-to-date, and state-of-the-art reference on the theory and practice of the most useful methods. It covers all aspects of the subject, including development of the methods, their convergence study, numerically stable algorithms for their implementation, and their various applications. It also provides complete proofs in most places. As an interesting application, the author shows how these methods give rise to rational approximation procedures for vector-valued functions in the complex plane, a subject of importance in model reduction problems among others. This book is intended for numerical analysts, applied mathematicians, and computational scientists and engineers in fields such as computational fluid dynamics, structures, and mechanical and electrical engineering, to name a few. Since it provides complete proofs in most places, it can also serve as a textbook in courses on acceleration of convergence of iterative vector processes, for example.

Vector Extrapolation Methods with Applications


Avram Sidi is Professor Emeritus of Numerical Analysis in the Computer Science Department at the Technion-Israel Institute of Technology and the former holder of the Technion Administration Chair in Computer Science. He has published extensively in various areas of numerical analysis and approximation theory, such as convergence acceleration, numerical integration, rational approximation, and asymptotic analysis, convergence acceleration being a major area. He is also the author of the book Practical Extrapolation Methods: Theory and Applications (Cambridge University Press, 2003), which deals exclusively with the acceleration of convergence of scalar sequences. His research has involved the development of novel numerical methods of high accuracy, their rigorous mathematical analysis, design of efficient algorithms for their implementation, and their application to difficult problems. His methods and algorithms are being used successfully in various scientific and engineering disciplines.


An important problem that arises in different disciplines of science and engineering is that of computing limits of sequences of vectors of very large dimension. Such sequences arise, for example, in the numerical solution of systems of linear and nonlinear equations by fixed-point iterative methods, and their limits are simply the required solutions to these systems. The convergence of these sequences, which is very slow in many cases, can be accelerated successfully by using suitable vector extrapolation methods.

AVRAM SIDI

CS17 ISBN 978-1-611974-95-9


Society for Industrial and Applied Mathematics 3600 Market Street, 6th Floor Philadelphia, PA 19104-2688 USA +1-215-382-9800 • Fax: +1-215-386-7999 [email protected] • www.siam.org


Computational Science and Engineering
