MATRIX POSITIVITY

Matrix positivity is a central topic in matrix theory: properties that generalize the notion of positivity to matrices arose from a large variety of applications, and many have also taken on notable theoretical significance, either because they are natural or unifying or because they support strong implications. This is the first book to provide a comprehensive and up-to-date reference of important material on matrix positivity classes, their properties, and their relations. The matrix classes emphasized in this book include the classes of semipositive matrices, P-matrices, inverse M-matrices, and copositive matrices. This self-contained reference will be useful to a large variety of mathematicians, engineers, and social scientists, as well as graduate students. The generalizations of positivity and the connections observed provide a unique perspective, along with theoretical insight into applications and future challenges. Direct applications can be found in data analysis, differential equations, mathematical programming, computational complexity, models of the economy, population biology, dynamical systems, and control theory.

CHARLES R. JOHNSON is Class of 1961 Professor of Mathematics at College of William and Mary. He received his Ph.D. from The California Institute of Technology in 1972. Professor Johnson has published nearly 500 papers and 12 books and received several prizes.

RONALD L. SMITH is Professor Emeritus at the University of Tennessee, Chattanooga. He received his Ph.D. in Mathematics at Auburn University. He received the Thomson-Reuters Original List of Highly Cited Researchers Award. The conference “Recent Advances in Linear Algebra and Graph Theory” was held in his honor.

MICHAEL J. TSATSOMEROS is Professor of Mathematics at Washington State University. He received his Ph.D. in Mathematics at the University of Connecticut in 1990. He has served on the Board of Directors and Advisory and Journal Committees of the International Linear Algebra Society. He is co-Editor-in-Chief of the Electronic Journal of Linear Algebra and Associate Editor of Linear Algebra and Its Applications.
CAMBRIDGE TRACTS IN MATHEMATICS

GENERAL EDITORS
J. BERTOIN, B. BOLLOBÁS, W. FULTON, B. KRA, I. MOERDIJK, C. PRAEGER, P. SARNAK, B. SIMON, B. TOTARO

A complete list of books in the series can be found at www.cambridge.org/mathematics. Recent titles include the following:

187. Convexity: An Analytic Viewpoint. By B. Simon
188. Modern Approaches to the Invariant Subspace Problem. By I. Chalendar and J. R. Partington
189. Nonlinear Perron-Frobenius Theory. By B. Lemmens and R. Nussbaum
190. Jordan Structures in Geometry and Analysis. By C.-H. Chu
191. Malliavin Calculus for Lévy Processes and Infinite-Dimensional Brownian Motion. By H. Osswald
192. Normal Approximations with Malliavin Calculus. By I. Nourdin and G. Peccati
193. Distribution Modulo One and Diophantine Approximation. By Y. Bugeaud
194. Mathematics of Two-Dimensional Turbulence. By S. Kuksin and A. Shirikyan
195. A Universal Construction for Groups Acting Freely on Real Trees. By I. Chiswell and T. Müller
196. The Theory of Hardy’s Z-Function. By A. Ivić
197. Induced Representations of Locally Compact Groups. By E. Kaniuth and K. F. Taylor
198. Topics in Critical Point Theory. By K. Perera and M. Schechter
199. Combinatorics of Minuscule Representations. By R. M. Green
200. Singularities of the Minimal Model Program. By J. Kollár
201. Coherence in Three-Dimensional Category Theory. By N. Gurski
202. Canonical Ramsey Theory on Polish Spaces. By V. Kanovei, M. Sabok, and J. Zapletal
203. A Primer on the Dirichlet Space. By O. El-Fallah, K. Kellay, J. Mashreghi, and T. Ransford
204. Group Cohomology and Algebraic Cycles. By B. Totaro
205. Ridge Functions. By A. Pinkus
206. Probability on Real Lie Algebras. By U. Franz and N. Privault
207. Auxiliary Polynomials in Number Theory. By D. Masser
208. Representations of Elementary Abelian p-Groups and Vector Bundles. By D. J. Benson
209. Non-homogeneous Random Walks. By M. Menshikov, S. Popov and A. Wade
210. Fourier Integrals in Classical Analysis (Second Edition). By C. D. Sogge
211. Eigenvalues, Multiplicities and Graphs. By C. R. Johnson and C. M. Saiago
212. Applications of Diophantine Approximation to Integral Points and Transcendence. By P. Corvaja and U. Zannier
213. Variations on a Theme of Borel. By S. Weinberger
214. The Mathieu Groups. By A. A. Ivanov
215. Slenderness I: Foundations. By R. Dimitric
216. Justification Logic. By S. Artemov and M. Fitting
217. Defocusing Nonlinear Schrödinger Equations. By B. Dodson
218. The Random Matrix Theory of the Classical Compact Groups. By E. S. Meckes
219. Operator Analysis. By J. Agler, J. E. McCarthy, and N. J. Young
220. Lectures on Contact 3-Manifolds, Holomorphic Curves and Intersection Theory. By C. Wendl
221. Matrix Positivity. By C. R. Johnson, R. L. Smith and M. J. Tsatsomeros
Matrix Positivity

CHARLES R. JOHNSON
College of William and Mary

RONALD L. SMITH
University of Tennessee at Chattanooga

MICHAEL J. TSATSOMEROS
Washington State University
University Printing House, Cambridge CB2 8BS, United Kingdom One Liberty Plaza, 20th Floor, New York, NY 10006, USA 477 Williamstown Road, Port Melbourne, VIC 3207, Australia 314–321, 3rd Floor, Plot 3, Splendor Forum, Jasola District Centre, New Delhi – 110025, India 79 Anson Road, #06–04/06, Singapore 079906 Cambridge University Press is part of the University of Cambridge. It furthers the University’s mission by disseminating knowledge in the pursuit of education, learning, and research at the highest international levels of excellence. www.cambridge.org Information on this title: www.cambridge.org/9781108478717 DOI: 10.1017/9781108778619 © Charles R. Johnson, Ronald L. Smith, and Michael J. Tsatsomeros 2020 This publication is in copyright. Subject to statutory exception and to the provisions of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press. First published 2020 Printed in the United Kingdom by TJ International, Padstow Cornwall A catalogue record for this publication is available from the British Library. Library of Congress Cataloging-in-Publication Data ISBN 978-1-108-47871-7 Hardback Cambridge University Press has no responsibility for the persistence or accuracy of URLs for external or third-party internet websites referred to in this publication and does not guarantee that any content on such websites is, or will remain, accurate or appropriate.
To the many students who have helped me with mathematics through my REU program at William and Mary. They have made it fun for me as well as enhanced greatly what I have been able to do. Charles R. Johnson

To my parents and siblings; to Linda, my wife and best friend, whose help on this book was invaluable to me; and to Kevin, Brad, Reid, and Sarah, our children who gave us six wonderful grandchildren. Ronald L. Smith

To Móσχα, Katerina, and Joanne. Your inspiration and support make it all possible. Michael J. Tsatsomeros
Contents

Preface
List of Symbols

1 Background
1.1 Purpose
1.2 Matrices
1.3 Convexity
1.4 Theorem of the Alternative

2 Positivity Classes
2.1 Introduction
2.2 (Entry-wise) Positive Matrices (EP)
2.3 M-Matrices (M)
2.4 Totally Positive Matrices (TP)
2.5 Positive Definite Matrices (PD)
2.6 Strictly Copositive Matrices (SC)
2.7 Doubly Nonnegative Matrices (DN)
2.8 Completely Positive Matrices (CP)
2.9 P-Matrices (P)
2.10 H+-Matrices (H+)
2.11 Monotone Matrices (Mon)
2.12 Semipositive Matrices (SP)
2.13 Relationships among the Classes

3 Semipositive Matrices
3.1 Definitions and Elementary Properties
3.2 Sums and Products of SP Matrices
3.3 Further Structure and Sign Patterns of SP Matrices
3.4 Spectral Theory of SP Matrices
3.5 Minimally Semipositive Matrices
3.6 Linear Preservers
3.7 Geometric Mapping Properties of SP Matrices
3.8 Strictly Semimonotone Matrices

4 P-Matrices
4.1 Introduction
4.2 Notation, Definitions, and Preliminaries
4.3 Basic Properties of P-Matrices
4.4 The Eigenvalues of P-Matrices
4.5 Special Classes of P-Matrices
4.6 The P-Problem: Detection of P-Matrices
4.7 The Recursive Construction of All P-Matrices
4.8 On Matrices with Nonnegative Principal Minors
4.9 Some Applications of P-Matrices
4.10 Miscellaneous Related Facts and References

5 Inverse M-Matrices
5.1 Introduction
5.2 Preliminary Facts
5.3 Diagonal Closure
5.4 Diagonal Lyapunov Solutions
5.5 Additive Diagonal Closure
5.6 Power Invariant Zero Pattern
5.7 Sufficient Conditions for a Positive Matrix to Be IM
5.8 Inverse M-Matrices Have Roots in IM
5.9 Partitioned IM Matrices
5.10 Submatrices
5.11 Vanishing Almost Principal Minors
5.12 The Path Product Property
5.13 Triangular Factorizations
5.14 Sums, Products, and Closure
5.15 Spectral Structure of IM Matrices
5.16 Hadamard Products and Powers, Eventually IM Matrices
5.17 Perturbation of IM Matrices
5.18 Determinantal Inequalities
5.19 Completion Theory
5.20 Connections with Other Matrix Classes
5.21 Graph Theoretic Characterizations of IM Matrices
5.22 Linear Interpolation Problems
5.23 Newton Matrices
5.24 Perron Complements of IM Matrices
5.25 Products of IM Matrices
5.26 Topological Closure of IM Matrices
5.27 Tridiagonal, Triangular, and Reducible IM Matrices
5.28 Ultrametric Matrices

6 Copositive Matrices
6.1 Introduction
6.2 Basic Properties of Copositive Matrices in Mn(R)
6.3 Characterizations of C, SC, and C+ Matrices
6.4 History
6.5 Spectral Properties
6.6 Linear Preservers

References
Index
Preface
Properties that generalize the notion of positivity (of scalars) to matrices have arisen from a large variety of applications. Many have also taken on notable theoretical significance, either because they are natural or unifying or because they support strong implications. All three authors have written extensively on matricial generalizations of positivity, over a long period of time. Some of these generalizations are already the subject of one or more books, sometimes with excellent and modern treatments. But, for a variety of reasons, others are not yet treated comprehensively in readily available form. We feel that a good reference for these will be a useful contribution. This is the purpose of the present work.

After a review of relevant background in Chapter 1, we discuss the most prominent generalizations of positivity in Chapter 2. The containment and other relationships among these are given, as well as useful book references for some. It is shown that all of them are contained among the “semipositive” matrices. There are, of course, variations upon these prominent generalizations too, and there are often weak and strong versions of the generalizations.

In subsequent chapters, particular generalizations of positivity are studied in detail, when we feel that we can make significant and novel contributions. These include semipositive matrices in Chapter 3, P-matrices in Chapter 4, inverse M-matrices in Chapter 5, and copositive matrices in Chapter 6. Then we conclude with a lengthy list of references for these areas.

We thank Wenxuan Ding, Yuqiao Li, Yueqiao Zhang, and Megan Wendler for their help in the proofreading and preparation of parts of the manuscript.
Symbols

• R, C  Fields of real, complex numbers
• Rn, Cn  Column n-vectors of real, complex numbers
• Rn+  Nonnegative orthant in Rn (all n-vectors of nonnegative numbers)
• Mm,n(F)  The m-by-n matrices over field F; skipping F means F = C
• Mn(F) = Mn,n(F), Mn = Mn(C) = Mn,n(C)
• Mn({−1, 0, 1})  The n-by-n matrices with entries in {−1, 0, 1}
• Mn({−1, 1})  The n-by-n matrices with entries in {−1, 1}
• XT, X∗  Transpose and conjugate transpose of a complex array X
• X†  Moore–Penrose inverse of X ∈ Mm,n(C)
• X#  Group inverse of X ∈ Mm,n(C)
• X > Y (X ≥ Y)  Real arrays X, Y, every entry of X − Y is positive (nonnegative)
• [X, Y]  Matrix interval (Y ≥ X), all real matrices Z such that X ≤ Z ≤ Y
• n = {1, 2, . . . , n}
• A ◦ B  Hadamard (entry-wise) product of A, B ∈ Mm,n(F)
• A(k)  k-th Hadamard power A ◦ A ◦ · · · ◦ A, A ∈ Mm,n(F)
• index(A)  Index of A
• Δn  Unit simplex in Rn
• J  All ones square matrix
• e  All ones column vector
• Tr(A)  Trace of A ∈ Mn(F)
• Q(x) = xTAx  Associated quadratic form, A ∈ Mn(C), x ∈ Cn
• adj A  Adjoint of A ∈ Mm,n(C)
• R(A)  Range of A ∈ Mm,n(C)
• rank(A)  Rank of A ∈ Mm,n(C)
• Null(A)  (Right) null space of A ∈ Mm,n(C)
• nullity(A)  Dimension of Null(A)
• σ(A)  Spectrum (eigenvalues) of A ∈ Mn(C)
• ρ(A) = max{|λ| : λ ∈ σ(A)}  Spectral radius of A ∈ Mn(C)
• q(A)  Positive eigenvalue of minimum modulus of an M-matrix A
• diag(d1, d2, . . . , dn)  The n-by-n diagonal matrix with diagonal entries d1, d2, . . . , dn
• |α|  Cardinality of α ⊆ n
• αc = n \ α  Complement of α ⊆ n in n
• A[α, β]  Submatrix of A lying in rows α ⊆ m and columns β ⊆ n
• A[α] = A[α, α]  Principal submatrix of A lying in rows α ⊆ n
• A(α, β) = A[αc, βc]
• A(i) = A({i})
• A/A[α]  Schur complement of A[α] in A ∈ Mn(F), α ⊆ n
• ppt(A, α)  Principal pivot transform of A ∈ Mn(C) relative to α ⊆ n
• FA  Cayley transform of A ∈ Mn(C)
• P(A/A[α])  Perron complement of A[α] in A ∈ Mn(F), α ⊆ n
• S+(A) = {x ∈ Rn : x ≥ 0 and Ax > 0}, A ∈ Mm,n(R)
• R+(A) = A S+(A), A ∈ Mm,n(R)
• K+(A) = {x ∈ Rn : x ≥ 0 and Ax ≥ 0}, A ∈ Mm,n(R)
• [A; B] ∈ Mm+p,n(R)  A ∈ Mm,n(R) stacked on top of B ∈ Mp,n(R)
• [A B] ∈ Mm,n+q(R)  A ∈ Mm,n(R) and B ∈ Mm,q(R) placed side by side
• M(A)  Comparison matrix of A ∈ Mn(C)
• D(A)  Directed graph of A ∈ Mn(C)
• S(A)  Signed directed graph of A ∈ Mn(R)
• LCP(q, M)  The Linear Complementarity Problem, M ∈ Mn(R), q ∈ Rn
• m(A)  Measure of irreducibility of A ∈ Mn(C)
• U(A)  Upper path product bound for A ∈ Mn(C)
• I(A, B)  Interval from A ∈ Mn(R) to B ∈ Mn(R)
• V(A, B)  Vertex matrices derived from A, B ∈ Mn(R)
Matrix Classes

• C (Cn)  Copositive matrices (n-by-n)
• CP  Completely positive matrices
• Dn  n-by-n invertible diagonal matrices
• D+n  n-by-n positive diagonal matrices in Dn
• DN  Doubly nonnegative matrices
• DP  Doubly positive matrices
• EIM  Eventually inverse M-matrices
• EP  (Entry-wise) positive matrices
• IM  Inverse M-matrices
• IMD  Dual of IM
• IS  Identically signed class of matrices
• LSP  Left semipositive matrices
• Mnk  A ∈ Mn(R) with nonzero diagonal entries and length of longest cycle in D(A) ≤ k
• M  M-matrices
• MSP  Minimally semipositive matrices
• N  (Entry-wise) nonnegative matrices
• N+  Nonnegative matrices with positive main diagonal
• ND  Negative definite matrices
• NSD  Negative semidefinite matrices
• Pn  Group of n-by-n permutation matrices
• Pnk  Set of real P-matrices in Mnk
• P  P-matrices
• PM  Matrices all of whose powers are P-matrices
• P0  P0-matrices
• PD  Positive definite matrices
• PSD  Positive semidefinite matrices
• PSPP  Purely strict path product matrices
• PP  Path product matrices
• RSP  Redundantly semipositive matrices
• Sn = Sn(R)  Set of symmetric matrices in Mn(R)
• Snk  Set of A ∈ Mn(R) all of whose cycles in S(−A) are signed negatively
• SC (SCn)  Strictly copositive matrices (n-by-n)
• SIM  Symmetric inverse M-matrices
• sN  Symmetric nonnegative matrices
• SN  Seminegative matrices
• SNN  Seminonnegative matrices
• SNP  Seminonpositive matrices
• SP  Semipositive matrices
• SPP  Strict path product matrices
• SZ  Semizero matrices
• TN  Totally nonnegative matrices
• TP  Totally positive matrices
• TSPP  Totally strict path product matrices
• Z  Z-matrices
1 Background
1.1 Purpose Here, we review some well-known mathematical concepts that are helpful in developing the ideas of this work. These are primarily in the areas of matrix analysis, convexity, and an important theorem of the alternative (linear inequalities). In the latter two cases, we mention all that is needed. In the case of matrices, see the general reference [HJ13], or a good elementary linear algebra book, for facts or notation we use without further explanation.
1.2 Matrices

1.2.1 Matrix and Vector Notation

We use Rn (Cn) to denote the set of all n-component real (complex) vectors, thought of as columns, and Mm,n(F) to denote the m-by-n matrices over a general field F. Skipping the field means F = C, and Mn,n(F) is abbreviated to Mn(F). Inequalities, such as >, ≥, are to be interpreted entry-wise, so that x > 0 means that all entries of a vector x are positive. If all entries of x are nonnegative, but not all zero, we write x ≥ 0, x ≠ 0. The transpose of A = [aij] ∈ Mm,n is denoted by AT and the conjugate transpose (or Hermitian adjoint) by A∗. The Hermitian and skew-Hermitian parts of A are, respectively, denoted by

H(A) = (A + A∗)/2  and  S(A) = (A − A∗)/2.

The spectrum, or set of eigenvalues, of A ∈ Mn is denoted by σ(A) and the spectral radius by ρ(A) = max{|λ| : λ ∈ σ(A)}.
Submatrices play an important role in analyzing matrix structure. For m = {1, 2, . . . , m} and n = {1, 2, . . . , n}, denote by A[α, β] the submatrix of A ∈ Mm,n(F) lying in rows α ⊆ m and columns β ⊆ n. If m = n and α = β, we abbreviate A[α, α] to A[α] and refer to it as a principal submatrix of A; we call A[α] a proper principal submatrix if α is a proper subset of n. A submatrix of A ∈ Mn of the form A[k] for some k ≤ n is called a leading principal submatrix. A submatrix may also be indicated by deletion of row and column indices, and for this round brackets are used. For example, A(α, β) = A[αc, βc], in which the complementation is relative to m and n, respectively. If an index set is a singleton i, we abbreviate A({i}), for example, to A(i) in case m = n. Given a square submatrix A[α, β] of A ∈ Mm,n, we refer to det(A[α, β]) as a minor of A, or as a principal minor if α = β. The determinant of a leading principal submatrix is referred to as a leading principal minor. By convention, if α = ∅, then det(A[α]) = 1.

1.2.2 Gershgorin’s Theorem

For A = [aij] ∈ Mn(C), for each i = 1, 2, . . . , n, define

Ri(A) = Σj≠i |aij|

and the discs

Γi(A) = {z ∈ C : |z − aii| ≤ Ri(A)}.

Gershgorin’s Theorem then says that σ(A) ⊆ Γ1(A) ∪ · · · ∪ Γn(A). This has the implication that a diagonally dominant matrix A ∈ Mn(C) (i.e., |aii| > Ri(A), i = 1, 2, . . . , n) has nonzero determinant. In particular, the determinant is of the same sign as the product of the diagonal entries, in the case of real matrices.

1.2.3 Perron’s Theorem

If A ∈ Mn(R), A > 0, then very strong spectral properties follow, as initially observed by Perron (Perron’s Theorem). These include

• ρ(A) ∈ σ(A); (1.2.1)
• the multiplicity of ρ(A) is one; (1.2.2)
• λ ∈ σ(A), |λ| = ρ(A) ⇒ λ = ρ(A); (1.2.3)
• there is a right (left) eigenvector of A with all entries positive; (1.2.4)
• 0 < B ≤ A, B ≠ A implies ρ(B) < ρ(A); (1.2.5)
• no eigenvector (right or left) of A has all nonnegative entries besides those associated with ρ(A). (1.2.6)
There are slightly weaker statements for various sorts of nonnegative matrices; see [HJ91, chapter 8].
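These conclusions are easy to observe on any particular positive matrix. The following is a minimal numerical sketch (using NumPy; the matrix chosen is only an illustration, not one from the text) that checks (1.2.1)–(1.2.4) for one example.

```python
import numpy as np

# An arbitrary entry-wise positive matrix; Perron's Theorem applies to it.
A = np.array([[1.0, 2.0],
              [3.0, 1.0]])

eigvals, eigvecs = np.linalg.eig(A)
rho = max(abs(eigvals))                                      # spectral radius

k = int(np.argmax(eigvals.real))
assert np.isclose(eigvals[k].real, rho)                      # (1.2.1): rho(A) is an eigenvalue
assert all(abs(mu) < rho for mu in np.delete(eigvals, k))    # (1.2.2)-(1.2.3): simple, strictly dominant

v = eigvecs[:, k].real
v = v if v[0] > 0 else -v                                    # normalize the sign
assert np.all(v > 0)                                         # (1.2.4): an entry-wise positive eigenvector
print(rho, v)
```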
1.2.4 Schur Complements

If

A = [A11 A12; A21 A22] ∈ Mn

and A22 is square and invertible, the Schur complement (of A22 in A) is A/A22 = A11 − A12A22−1A21. More generally, if α ⊆ n is an index set, then A/A[α] = A(α) − A[αc, α]A[α]−1A[α, αc] is the Schur complement of A[α] in A if A[α] is invertible. Schur complements enjoy many nice properties, such as det A = det A[α] det(A/A[α]), which motivates the notation. Reference [Zha05] is a good reference on Schur complements and contains a detailed discussion of Schur complements in positivity classes.
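The determinant identity det A = det A[α] det(A/A[α]) can be checked quickly on an example; the sketch below (NumPy; the matrix and index set are illustrative) computes a Schur complement directly from the definition.

```python
import numpy as np

A = np.array([[4.0, 1.0, 2.0],
              [1.0, 3.0, 0.0],
              [2.0, 0.0, 5.0]])

alpha = [2]          # the index set alpha (0-based indices here)
comp  = [0, 1]       # its complement

A_aa  = A[np.ix_(alpha, alpha)]
schur = A[np.ix_(comp, comp)] - A[np.ix_(comp, alpha)] @ np.linalg.inv(A_aa) @ A[np.ix_(alpha, comp)]

# det A = det A[alpha] * det(A / A[alpha])
assert np.isclose(np.linalg.det(A), np.linalg.det(A_aa) * np.linalg.det(schur))
print(schur)
```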
1.3 Convexity

1.3.1 Convex Sets in Rn and Mm,n(R)

A convex combination of a collection of elements of Rn is a linear combination whose coefficients are nonnegative and sum to 1. A subset S of Rn or Mm,n(R) is convex if it is closed under convex combinations. It suffices to know closure for pairs of elements, and geometrically this means that a set is convex if the line segment joining any two elements lies in the set. An extreme point of a convex set is one that cannot be written as a convex combination of two distinct points in the set. The generators of a convex set are a minimal set of elements from which any element of the set may be written as a convex combination. For finite dimensions (our case), the extreme points are the generators. If there are finitely many, the convex set is called polyhedral. A (convex) cone is just a convex set that is closed under linear combinations whose coefficients are nonnegative (no constraint on the sum). A cone may also
be finitely generated (polyhedral cross section) or not. The dual of a cone C is just the set CD = {x : x · y ≥ 0 whenever y ∈ C},
in which · denotes an inner product defined on the underlying space. The convex hull of a collection of points is just the set of all possible convex combinations. This is a convex set, and any convex set is the convex hull of its generators.
1.3.2 Helly’s Theorem The intersection of finitely many convex sets is again a convex set (possibly empty). Given a collection of convex sets (possibly infinite) in a d dimensional space, the intersection of all of them is nonempty if and only if the intersection of any d + 1 of them is nonempty. This is known as Helly’s Theorem [Hel1923] and can provide a powerful tool to show the existence of a solution to a system of equations.
1.3.3 Hyperplanes and Separation of Convex Sets In an inner product space of dimension d, such as Rd, a special kind of convex set is a half-space Ha = {x : a · x ≥ 0}, in which a ≠ 0 is a fixed element of the inner product space. Any intersection of half-spaces is a convex set, and any closed convex set is an intersection of half-spaces. The complement of a half-space, Hac = {x : a · x < 0}, is an open half-space. The set {x : a · x = 0} is a (d − 1)-dimensional hyperplane that separates the two half-spaces Ha and Hac. If two convex sets S1, S2 in a d-dimensional space do not intersect, then they may be separated by a (d − 1)-dimensional hyperplane, so that S1 ⊆ Ha and S2 ⊆ H−a. If S1 and S2 are both closed or both open, we may use nonintersecting open half-spaces: S1 ⊆ Hac and S2 ⊆ H−ac. If S1 and S2 do intersect, but only in a subset of a hyperplane, then such a hyperplane may be used to separate them: S1 ⊆ Ha and S2 ⊆ H−a.
For further background on convex sets, see the general reference [Rock97].
1.4 Theorem of the Alternative In the theory of linear inequalities (or optimization), there is a variety of statements saying that exactly one of two systems of linear inequalities has a solution. Such statements, and there are many variations, are called Theorems of the Alternative (and there are such theorems in even more general contexts). Such statements can be a powerful tool for showing that one system of inequalities has a solution, by ruling out the other; however, they generally are not able to provide any particular solution. The book [Man69] provides a nice discussion of several theorems of the alternative and relations among them. A particular version of the theorem of the alternative that is especially useful for us is the following:

Theorem 1.4.1 Let A ∈ Mm,n(R). Then, either (i) there is an x ∈ Rn, x ≥ 0, such that Ax > 0 or (ii) there is a y ∈ Rm, y ≥ 0, y ≠ 0, such that yTA ≤ 0, but not both.
yT Ax
Corollary 1.4.2 Let A ∈ Mn (R) and AT = A. Then, either (i) there is an x ∈ Rn , x ≥ 0, such that Ax > 0 or (ii) there is a y ∈ Rm , y ≥ 0, y = 0, such that Ay ≤ 0, but not both.
2 Positivity Classes
2.1 Introduction There are a remarkable number of ways that the notion of positivity for scalars has been generalized to matrices. Often, these represent the differing needs of applications, or differing natural aspects of the classical notion. Typically, the various ways involve the entries, transformational characteristics, the minors, the quadratic form, and combinations thereof. Our purpose here is to identify each of the generalizations and some of their basic characteristics. Usually there are natural variations that we also identify and relate. Several of these generalizations have been treated, in some depth, in book or survey form elsewhere; if so, we give some of the most prominent or accessible references. Our purpose in this work is to then treat in subsequent chapters those generalizations for which there seems not yet to be sufficiently general treatment in one place. The order in which we give the generalizations is roughly grouped by type. We then summarize the containments among the positivity classes; one of them includes all the others.
2.2 (Entry-wise) Positive Matrices (EP) One of the most natural generalizations of a positive number is reflected in the entries of a matrix. In general, adjectives, such as positive and nonnegative, refer to the entries of a matrix. The ways in which n-by-n positive matrices generalize the notion of a positive number are indicated in Perron’s theorem; see Section 1.2.3. Careful treatments of the theory of positive matrices may be found in several sources, such as [HJ13]. There are many further facts. Because of Frobenius’s work on nonnegative matrices, the general theory is referred to as Perron– Frobenius theory.
There are important variants on positive matrices. For their closure, the nonnegative matrices, the spectral radius is still an eigenvalue, but all the other conclusions must be weakened somewhat. For nonnegative and irreducible matrices (those for which PTAP ≠ [A11 A12; 0 A22], with A11 and A22 square and nonempty, for any permutation matrix P), only property (1.2.3) must be weakened to allow other eigenvalues on the spectral circle. And, if some power, Aq, is positive, the stronger conclusions remain valid. The following are simple examples that illustrate what might occur.

• [1 1; 1 1]  The Perron root (spectral radius) is strictly dominant when the matrix is entry-wise positive.
• [1 1; 1 0]  It remains so if the matrix is not positive but some power is positive (primitive matrix).
• [0 1; 1 0]  The Perron root remains of multiplicity 1, but there may be ties for spectral radius, when the matrix is irreducible but not primitive.
• [1 0; 0 1], [1 1; 0 1]  The Perron root may be multiple, and geometrically so or not, when the matrix is reducible.
• [1 0; 0 0]  Or the Perron root may still have multiplicity 1, even when the matrix is reducible.
• [0 1; 0 0]  The Perron root may be 0.
The nonnegative orthant in Rn is the closed cone that contains all entrywise nonnegative vectors and is denoted by Rn+ . The n-by-n nonnegative matrices (A ≥ 0) are simply those that map the nonnegative orthant in Rn into itself (ARn+ ⊆ Rn+ ). The n-by-n positive matrices (A > 0, or A ∈ EP that stands for entry-wise positive) are those that map the nonzero elements of Rn+ into the interior of this cone. Matrices that map other cones of Rn with special structure into themselves emulate the spectral structure of nonnegative matrices, and this has been studied from several points of view in some detail. Also studied have been matrices with some negative entries that still enjoy some or all of the above Perron conclusions or their Frobenius weakenings: A ∈ Mn (R) is eventually nonnegative (positive) if there is an integer k such that Ap ≥ 0 (> 0) for all p ≥ k.
2.3 M-Matrices (M) A matrix A = [aij] ∈ Mn(R) is called a Z-matrix (A ∈ Z) if aij ≤ 0 for all i ≠ j. Thus, a Z-matrix A may be written A = αI − P in which P is a nonnegative
matrix. If α > ρ(P), then A is called a (non-singular) M-matrix (A ∈ M). In several sources ([BP94, HJ91, FP62, NP79, NP80]) there are long lists of rather diverse-appearing conditions that are equivalent to A being an M-matrix, provided A is a Z-matrix. A modest list is the following:

1. A is positive stable, i.e., all eigenvalues of A are in the open right half-plane;
2. The leading principal minors of A are positive;
3. All principal minors of A are positive, i.e., A is a P-matrix;
4. A−1 exists and is a nonnegative matrix;
5. A is a semipositive matrix;
6. There exists a positive diagonal matrix D such that AD is row diagonally dominant;
7. There exist positive diagonal matrices D and E such that EAD is row and column diagonally dominant;
8. For each k = 1, . . . , n, the sum of the k-by-k principal minors of A is positive;
9. A has an L–U factorization in which L and U have positive diagonal entries.

In addition, M-matrices are closed under positive scalar multiplication, extraction of principal submatrices and of Schur complements, and the so-called Fan product (which is the entry-wise or Hadamard product, except that the off-diagonal signs are retained). They are not closed under either addition or matrix multiplication. M-matrices, A = [aij] ∈ Mn(R), also satisfy classical determinantal inequalities, such as

10. Hadamard’s inequality: det A ≤ a11a22 · · · ann;
11. Fischer’s inequality: det A ≤ det A[α] det A[αc], α ⊆ {1, . . . , n};
12. Koteljanskii’s inequality: det A[α ∪ β] det A[α ∩ β] ≤ det A[α] det A[β], α, β ⊆ {1, . . . , n}.

A complete description of all such principal minor inequalities is given in [Joh98]. M-matrices are not only positive stable, but, among Z-matrices, when all real eigenvalues are in the right half-plane, all eigenvalues are as well. They also have positive diagonal Lyapunov solutions, i.e., there is a positive diagonal matrix D such that DA + ATD is positive definite. If A = I − P is an irreducible M-matrix and we assume that ρ(P) < 1, then A is invertible and A−1 = I + P + P2 + · · · . Because P is irreducible, A−1 is positive. If A had been reducible, A−1 would be nonnegative. A matrix is inverse M (IM) if it is the inverse of an M-matrix; equivalently, if it is a nonnegative invertible matrix whose inverse is a Z-matrix. Among nonnegative matrices,
the inverse M-matrices have a great deal of structure, and Chapter 5 is devoted to developing that structure for the first time in one place.
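Several of the listed equivalent conditions are immediate to check numerically for a given Z-matrix. The following is a minimal sketch (NumPy; the tridiagonal example and function names are illustrative) that verifies conditions 2 and 4 for one matrix.

```python
import numpy as np

def is_Z_matrix(A, tol=1e-12):
    """Off-diagonal entries are nonpositive."""
    A = np.asarray(A, dtype=float)
    return np.all(A - np.diag(np.diag(A)) <= tol)

def is_M_matrix(A, tol=1e-12):
    """For a Z-matrix, test two of the equivalent conditions listed above:
    positive leading principal minors (2) and a nonnegative inverse (4)."""
    A = np.asarray(A, dtype=float)
    if not is_Z_matrix(A, tol):
        return False
    n = A.shape[0]
    leading_minors_positive = all(np.linalg.det(A[:k, :k]) > tol for k in range(1, n + 1))
    inverse_nonnegative = np.all(np.linalg.inv(A) >= -tol)
    return leading_minors_positive and inverse_nonnegative

A = np.array([[ 2.0, -1.0,  0.0],
              [-1.0,  2.0, -1.0],
              [ 0.0, -1.0,  2.0]])
print(is_M_matrix(A))            # True: a classical tridiagonal M-matrix
```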
2.4 Totally Positive Matrices (TP) A much stronger positivity requirement than that a matrix be positive is that all of its minors be positive. Such a matrix is called totally positive (TP). Such matrices arise in a remarkable variety of ways, and, of course, they also have very strong properties. The eigenvalues are positive and distinct, and the eigenvectors are highly structured in terms of the signs of their entries relative to the order in which the eigenvalue lies relative to the other eigenvalues. They always have an L–U factorization in which every minor of L and U is positive, unless it is identically 0. The determinantal inequalities of Hadamard, Fischer, and Koteljanskii (in Section 2.3) are satisfied, as well as many different ones. Transformationally, if A is TP, Ax cannot have more sign changes than x, and, of course, the TP matrices are closed under matrix multiplication. Though the definition requires many minors to be positive, because of Sylvester’s determinantal identity, relatively few need be checked; the contiguous minors (both index sets are consecutive) suffice, and even the initial minors (those contiguous minors of which at least one index set begins with index 1) suffice. The initial minors are as numerous as the entries. There are several comprehensive sources available for TP (and related) matrices, including the most recent book [FJ11]. Prior sources include the book [GK1935] and the survey [And80]. There are a number of natural variants on TP matrices. A totally nonnegative matrix, TN, is one in which all minors are nonnegative. TN is the topological closure of TP, and the properties are generally weaker. Additional variants are TPk and TNk in which each k-by-k submatrix is TP (respectively, TN).
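The initial-minor criterion is easy to apply directly. The following sketch (NumPy; the Vandermonde-type example is a standard TP matrix, and the function names are illustrative) enumerates the initial minors, which are as numerous as the entries, and checks their signs.

```python
import numpy as np

def initial_minors(A):
    """All initial minors of A: determinants of contiguous square submatrices
    in which at least one of the two index sets starts at index 1."""
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    minors = []
    for k in range(1, min(m, n) + 1):
        for i in range(m - k + 1):            # row block starts anywhere, columns are 1..k
            minors.append(np.linalg.det(A[i:i + k, 0:k]))
        for j in range(1, n - k + 1):         # rows are 1..k, column block starts past 1
            minors.append(np.linalg.det(A[0:k, j:j + k]))
    return minors                             # there are exactly m*n of them

def is_totally_positive(A, tol=1e-12):
    """Positivity of the initial minors is the sufficiency criterion mentioned above."""
    return all(d > tol for d in initial_minors(A))

V = np.array([[1.0, 1.0, 1.0],                # Vandermonde matrix with nodes 1 < 2 < 3,
              [1.0, 2.0, 4.0],                # a classical totally positive example
              [1.0, 3.0, 9.0]])
print(is_totally_positive(V))                 # True
```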
2.5 Positive Definite Matrices (PD) Perhaps the most prominent positivity class is defined by the quadratic form for Hermitian matrices. Matrix A ∈ Mn(C) is called positive definite (A ∈ PD) if A is Hermitian (A∗ = A) and, for all 0 ≠ x ∈ Cn, x∗Ax > 0. There are several good sources on PD matrices, including [HJ13, Joh70, Bha07]. The PD matrices are closed under addition and positive scalar multiplication (they form a cone in Mn(C)), and under the Hadamard product, but not
under conventional matrix multiplication. They are closed under extraction of principal submatrices and Schur complements. Among Hermitian matrices, all eigenvalues positive, all leading principal minors positive, and all principal minors positive are each equivalent to being PD. A matrix A ∈ Mn(C) is PD if and only if A = B∗B, with B ∈ Mn(C) nonsingular; B may always be taken to be upper triangular (Cholesky factorization). The standard variation on PD is positive semidefinite matrices, PSD, in which the quadratic form is only required to be nonnegative. PSD is the closure of PD, and most of the weakened properties follow from this. Of course, negative definite (ND = −PD) and negative semidefinite (NSD = −PSD) are not generalizations of positivity. Another important variation is, of course, that the Hermitian part H(A) = (A + A∗)/2 be PD. In the case of Mn(R) this just means that the quadratic form is positive, but the matrix is not required to be symmetric. Another variation is that H(A) be PSD. There are some references that consider the former class, e.g., [Joh70, Joh72, Joh73, Joh75a, Joh75b, Joh75c, BaJoh76].
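These equivalent characterizations are easy to run numerically. The sketch below (NumPy; the matrix and function name are illustrative) uses the Cholesky criterion and, separately, the leading principal minors.

```python
import numpy as np

def is_positive_definite(A):
    """For a real symmetric A: np.linalg.cholesky returns a lower-triangular L
    with A = L L^T and raises LinAlgError exactly when A is not PD."""
    A = np.asarray(A, dtype=float)
    if not np.allclose(A, A.T):
        return False
    try:
        np.linalg.cholesky(A)
        return True
    except np.linalg.LinAlgError:
        return False

A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
print(is_positive_definite(A))                               # True
print(all(np.linalg.det(A[:k, :k]) > 0 for k in (1, 2)))     # leading principal minors positive
```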
2.6 Strictly Copositive Matrices (SC) An important generalization of PSD matrices is the copositive matrices (C), for which it is only required that the quadratic form be nonnegative on nonnegative vectors: A ∈ Mn(R) is copositive if AT = A and xTAx ≥ 0 for all x ≥ 0. Much theory has been developed about the subtle class of copositive matrices. Because there is no comprehensive reference, Chapter 6 is devoted to this class. Variations include the strictly copositive matrices (SC), for which positivity of the quadratic form is required on nonnegative, nonzero vectors, and the copositive+ matrices (C+), the copositive matrices for which x ≥ 0 and xTAx = 0 imply Ax = 0. Also, the real matrices with Hermitian part in C or SC could be considered.
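Deciding copositivity exactly is hard in general, but numerical evidence on small examples is easy to gather by sampling the quadratic form on the unit simplex. The following is a rough probe only (NumPy; the example matrix is illustrative: it is copositive but not PSD), not a decision procedure.

```python
import numpy as np

def min_form_on_simplex(A, samples=20000, seed=0):
    """Approximate the minimum of x^T A x over the unit simplex by random
    sampling.  A nonnegative minimum is evidence (not proof) of copositivity;
    a clearly negative value certifies that A is not copositive."""
    A = np.asarray(A, dtype=float)
    rng = np.random.default_rng(seed)
    X = rng.dirichlet(np.ones(A.shape[0]), size=samples)   # random points of the simplex
    return np.einsum('ij,jk,ik->i', X, A, X).min()

A = np.array([[0.0, 1.0],
              [1.0, 0.0]])         # x^T A x = 2*x1*x2 >= 0 on x >= 0, but A is not PSD
print(min_form_on_simplex(A))      # close to 0 and never negative
```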
2.7 Doubly Nonnegative Matrices (DN) The intersection of the cone of (symmetric) nonnegative matrices and the cone of PD matrices in Mn (R) is the cone of doubly nonnegative matrices (DN). Natural variations include the closure of DN (nonnegative and PSD) and the doubly positive matrices (DP), which are positive and PD. The most natural
of the three classes, DN, lies properly between the other two. Membership in DN is straightforward to check, but the generators of the cone are difficult to determine.
2.8 Completely Positive Matrices (CP) A strong variation on positive semidefinite, as well as (doubly) nonnegative matrices, is the notion of complete positivity: A completely positive matrix (CP) A ∈ Mn (R) is a symmetric matrix that can be written as A = BBT with B ≥ 0. The minimum number of columns of B in such a representation is the so-called CP-rank of A; it may be greater than the rank of A. The CP notion strengthens one of the characterizations of PSD, namely, that a PSD matrix may be written as BBT (with no other restrictions on B). The CP matrices have some clear structure; they form a cone in Mn (R), whose generators are the rank 1, nonnegative PSD matrices. However, verifying membership in CP is still quite difficult. There is a book reference [BS-M03] on the CP matrices, as well as more recent literature. The CP matrices constitute the cone theoretic dual of the copositive matrices, one contained in and the other containing the PSD cone.
2.9 P-Matrices (P) Matrix A ∈ Mn (C) is a P-matrix (A ∈ P), if all its principal minors are positive. Several prior classes, namely, the M-matrices, inverse M-matrices, totally positive, and positive definite matrices are contained in P. P-matrices are very important for the linear complementarity problem (and of interest for a variety of other reasons). Despite the fact that the definition involves many fewer minors, checking whether a matrix is a P-matrix, though finite, is generally much more costly than checking whether a matrix is TP. Nonetheless, there are important characterizations of P-matrices, covered in Chapter 4. References about properties of P-matrices include [FP62], [FP66], [FP67], and [HJ91]. Classical variations include the P0 -matrices (all principal minors nonnegative) and Pk -matrices, 0 < k ≤ n, in which all k-by-k principal submatrices are P-matrices. Lying strictly between the TN matrices and the P-matrices are the intersection of nonnegative matrices and the P-matrices, the
nonnegative P-matrices. The positive P-matrices lie between the TP matrices and the P-matrices. The inverses of M-matrices are contained in the nonnegative P-matrices.
2.10 H+-Matrices (H+) Matrix A = [aij] ∈ Mn(C) is column diagonally dominant if |ajj| > Σi≠j |aij| for each j = 1, 2, . . . , n. A is row diagonally dominant if AT is column diagonally dominant. If there is a diagonal matrix D such that AD is row diagonally dominant, then A is called an H-matrix, which is equivalent to the existence of a diagonal matrix E such that EA is column diagonally dominant, or that there are diagonal matrices E and D such that EAD is both row and column diagonally dominant. Matrix A ∈ Mn(R) is an H+-matrix if A is an H-matrix with positive diagonal entries. The M-matrices are contained in the H+-matrices, and the H+-matrices are positive stable. The H+-matrices are of greatest interest when the entries are real.
2.11 Monotone Matrices (Mon) Matrix A ∈ Mm,n(R) is called monotone if whenever Ax ≥ 0, we have x ≥ 0. If A is monotone and Ax = 0, we also have A(−x) = 0, so that x ≥ 0 and −x ≥ 0, which means x = 0. Thus, A must have linearly independent columns, and if m = n, A must be invertible. Thus if A ∈ Mn(R) is monotone, A−1 exists and A−1 is nonnegative (if y ≥ 0 and x = A−1y, then Ax = y ≥ 0, so that x ≥ 0). So Av ≥ Au implies v ≥ u, a motivation for the name monotone.
2.12 Semipositive Matrices (SP) Matrix A ∈ Mm,n(R) is semipositive (A ∈ SP) if there is an x ≥ 0 such that Ax > 0 (a weakening of the condition that characterizes entry-wise positive matrices). Chapter 3 is devoted to their study, and they are the most general class listed.
2.13 Relationships among the Classes If we interpret each of the above classes in the strict sense over R and all matrices are square (for example, DN means PD∩EP and H+ ⊆ Mn (R)), then we
may give a complete list of containments among the classes. Most of these are straightforward, except that SP contains all of the classes, using Theorem 1.4.1 appropriately. Containments are described in the following schema:

SP ⊇ P ⊇ PD ⊇ DN ⊇ CP,  P ⊇ TP,  P ⊇ H+ ⊇ M,
SP ⊇ SC ⊇ PD,
SP ⊇ EP ⊇ TP,  EP ⊇ DN,
SP ⊇ Mon ⊇ M.
That SP ⊇ P is proven in Chapter 4, using Theorem 1.4.1 and the characterization that says a real P-matrix is one that cannot weakly reverse all the signs of any real vector. That SP ⊇ SC uses Theorem 1.4.1 similarly. That SP ⊇ EP is trivial, as is SP ⊇ Mon. Other containments are known, using traditional characterizations; all are strict, and there are no other containments.
3 Semipositive Matrices
Our purpose here is to develop the theory of semipositive matrices in some depth. No other comprehensive general reference is available. As seen in Chapter 2, they include the strong versions of all the traditional positivity classes.
3.1 Definitions and Elementary Properties Recall that A ∈ Mm,n (R) is positive, A > 0, if and only if Ax > 0 for every nonzero x ≥ 0, x ∈ Rn . If the quantification on x is changed to existence, we have the notion of a semipositive (SP) matrix. Definition 3.1.1 For A ∈ Mm,n (R), let S+ (A) = {x ∈ Rn: x ≥ 0 and Ax > 0}
and R+ (A) = A S+ (A).
Of course, R+(A) ⊆ Rm+ (and its interior), S+(A) ⊆ Rn+, the nonnegative orthants in Rm and Rn, respectively. Also, R+(A) ≠ ∅ if and only if S+(A) ≠ ∅. If S+(A) ≠ ∅, then A is called semipositive (A ∈ SP).
Note that, by continuity, if A is SP, then there exists x > 0 such that Ax > 0. Definition 3.1.2 In a similar way, we may define • • • •
seminonnegative (SNN) if there exists nonzero x ≥ 0 such that Ax ≥ 0; seminegative (SN): −A is SP; seminonpositive (SNP): −A is SNN; semizero (SZ): A has a nonnegative, nonzero null vector. We note that
Remark 3.1.3 If A is SP, then S+ (A) is a nonempty cone in Rn+ and R+ (A) is a nonempty cone in Rm +.
Of course, there are also left versions of these concepts, in which vectors are multiplied on the left. For example, left semipositive (left SP, LSP) means that there is a y ≥ 0 such that yTA > 0, and a prefix “L” means the left-hand version; equivalently, AT is SP.

Definition 3.1.4 We call a matrix C ∈ Mr,m(R) row positive if C ≥ 0 has at least one positive entry in each row. Note that the row positive matrices are exactly those that map the positive orthant of Rm to that of Rr.

Recall that B ∈ Mn,s(R) is monotone if for x ∈ Rs, Bx ≥ 0 implies x ≥ 0. If s = n, then B is invertible and B−1 ≥ 0 (see Section 2.11). Note also that if A ≥ 0 is invertible, then both A and AT are row positive.

Theorem 3.1.5 If A ∈ Mm,n(R) is SP, then CAB is SP whenever C ∈ Mr,m(R) is row positive and B ∈ Mn(R) is monotone.

Proof If A is SP, let x ∈ S+(A). Choose u = B−1x ∈ Rn so that Bu = x; then u ≥ 0. Now CABu = CAx = Cy with y = Ax > 0. Because C is row positive, Cy > 0, and, thus, u ∈ S+(CAB), which means CAB is SP.

Observation 3.1.6 If A ∈ Mm,n(R) and A ≥ 0, then A is SP if and only if A is row positive. Moreover, if A is row positive, S+(A) includes the entire positive orthant of Rn. Also, if A ∈ Mn(R) is monotone, then A is SP. Furthermore, no broader classes than those mentioned preserve SP.

This allows us to identify the most natural automorphisms of SP matrices: permutation equivalence and positive diagonal equivalence. Let Pn denote the n-by-n permutation matrices, Dn the invertible n-by-n diagonal matrices, and D+n the positive diagonal matrices in Dn. Note that Pn and D+n are both contained in the row positive and monotone matrices; so are the monomial matrices, PnD+n, which are the n-by-n matrices with exactly one positive entry in each row and column.

Corollary 3.1.7 Let A ∈ Mm,n(R) and let P ∈ Pm and Q ∈ Pn. Then, PAQ is SP if and only if A is SP.

Proof Because A = PT(PAQ)QT, it suffices to show that A being SP implies that PAQ is SP. But, because P is row positive and Q is monotone, this follows from Theorem 3.1.5.

Corollary 3.1.8 Let A ∈ Mm,n(R), and let D ∈ D+m and E ∈ D+n. Then DAE is SP if and only if A is SP.

Proof Because D+n is inverse closed and A = D−1(DAE)E−1, it suffices to show that A being SP implies that DAE is SP. But, again, as D is row positive and E is monotone, this follows from Theorem 3.1.5.
Remark 3.1.9 If A ∈ Mm,n(R) and R and S are monomial matrices of appropriate size, then RAS is SP if and only if A is SP. This follows from the preceding two facts.

Some simple consequences of the definition, through S+(A), include the following. Let Eij ∈ Mm,n(R) denote the matrix whose (i, j) entry equals 1 and all other entries are zero. If A ∈ Mm,n(R) is SP and Bt = A + tEij, then S+(Bt1) ⊆ S+(Bt2) for t1 ≤ t2, so that Bt is SP for all t ≥ 0. In fact, for t > 0, S+(A) ⊆ S+(Bt), with strict containment, unless A > 0 and the interior of S+(A) is already the positive orthant. Also, if A is SP, then each row of A must have at least one positive entry, else S+(A) = ∅.

Rows and columns play different and essentially dual roles in SP matrices. Let us refer to a submatrix of A ∈ Mm,n(R) obtained by deleting a column (row) as a column- (row-) deleted submatrix of A.

Theorem 3.1.10 Let A ∈ Mm,n(R). (1) If there is a column-deleted submatrix of A that is SP, then A is SP. (2) If A is SP, then any nonempty, row-deleted submatrix of A is SP.

Proof (1) If B is a column-deleted submatrix of A, we may suppose, by Corollary 3.1.7, that B occupies the first k columns of A. Then, if x ∈ S+(B) ⊆ Rk, we have [x 0]T ∈ S+(A), which is nonzero, as x is. The augmentation is with n − k zeroes. (2) If B is a row-deleted submatrix of A, then S+(A) ⊆ S+(B), so that S+(B) is nonempty and B is SP.
Theorem 3.1.13 The following statements about A ∈ Mm,n(R) are equivalent: (1) A is SP; (2) the cone generated by the columns of A intersects the positive orthant in Rm; (3) there is an x ∈ Rn such that x > 0 and Ax > 0; (4) S+(A) has nonempty interior; (5) there is a b ∈ Rm, b > 0, such that Ax = b has a positive solution; (6) there is a positive diagonal matrix D such that AD has positive row sums; (7) there exist positive diagonal matrices D and E such that EAD has constant positive row sums; (8) AT is not SNP.

Proof Any element of S+(A) gives a linear combination of the columns of A that is in the positive orthant, so (1) implies (2). A sufficiently small positive perturbation of the coefficients of the linear combination shows that (2) implies (3). An x satisfying (3) lies in the interior of S+(A), so that (3) implies (4). Choose b in the image of the interior of S+(A) to see that (4) implies (5). Let D = diag(x) to show that (5) implies (6). If b is the row sum vector of AD, let E = diag(b)−1 to show that (6) implies (7). Because of (7), the first alternative of Theorem 1.4.1 is satisfied. The fact that the second alternative cannot be satisfied is statement (8). That (8) implies (1) is another application of the Theorem of the Alternative, to complete the proof.

That S+(A) is nonempty means that it has a nonempty interior. A corresponding statement is not so for SNN or SNP matrices.
Example 3.1.14 Let A = [1 −1; −1 1]. Then A is SNN (use x = [1 1]T), but not SP. If x = [x1 x2]T, then Ax = [x1 − x2; x2 − x1]. If x ≥ 0, x ≠ 0, then at least one component of Ax is negative, unless x1 = x2, in which case Ax = 0. The set {x : x1 = x2 ≥ 0} has an empty interior in R2.

Example 3.1.15 Let A = [1 −1; 1 −1]. Then A is SP, as [1 0]T ∈ S+(A), but AT is not SP. So it is possible to be SP and not left SP, or vice versa.

Definition 3.1.16 If A is both SP and left SP (LSP), then it is called symmetrically SP. A symmetric matrix that is SP is, of course, symmetrically SP, but a symmetrically SP matrix need not be symmetric.

We note that if A ∈ Mn(R) is SP and invertible, then A−1 is also SP.

Theorem 3.1.17 Suppose that A ∈ Mn(R) is invertible. Then A is SP if and only if A−1 is SP.
Proof Because of Theorem 3.1.13 (3), an SP matrix A is one such that there are x, y > 0 with y = Ax. But then x = A−1 y and vice versa.
Of course, a square SP matrix need not be invertible, e.g., [1 1; 1 1].
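Conditions (5)–(7) of Theorem 3.1.13 are constructive once a point of S+(A) is in hand. The sketch below (NumPy/SciPy; the matrix and function name are illustrative) finds such a point by linear programming and forms the diagonal scalings D and E.

```python
import numpy as np
from scipy.optimize import linprog

def semipositivity_certificate(A):
    """Find x > 0 with Ax > 0 via the feasibility problem x >= 0, Ax >= e,
    then form the diagonal scalings in the spirit of conditions (6)-(7)."""
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    res = linprog(np.zeros(n), A_ub=-A, b_ub=-np.ones(m), bounds=[(0, None)] * n)
    if not res.success:
        return None                        # A is not semipositive
    x = res.x + 1e-9                       # nudge into the open orthant; Ax stays > 0
    D = np.diag(x)                         # AD has positive row sums, namely Ax
    b = (A @ D).sum(axis=1)
    E = np.diag(1.0 / b)                   # EAD has constant row sums equal to 1
    return x, D, E

A = np.array([[ 3.0, -1.0],
              [-1.0,  2.0]])
x, D, E = semipositivity_certificate(A)
print(np.round((E @ A @ D).sum(axis=1), 12))   # all ones
```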
3.2 Sums and Products of SP Matrices If A,B ∈ Mm,n (R) are SP, then A + B is defined, and we may ask (1) if it is SP and (2) which matrices in Mm,n (R) may be generated in this way? If A ∈ Mm,n (R) and B ∈ Mn,p (R) are SP, then AB is defined, and we may ask (1) if AB is SP and (2) which matrices in Mm,p (R) may be generated in this way? Not surprisingly, neither the sum, nor the product, of two SP matrices need to be SP. Example 3.2.1 Let
A = [1 −5; 0 1],  B = [1 1; 1 1].

Then A, AT, and B are all SP, as A[6 1]T = [1 1]T, AT[1 6]T = [1 1]T, and B > 0. However,

A + AT = [2 −5; −5 2]

is not SP, as no nonnegative combination of its columns can be positive, and

AB = [−4 −4; 1 1]

is not SP, as it has a negative row.

In order to give sufficient conditions so that the product or sum of SP matrices be SP, we exploit S+(A) and R+(A). Now, we may give a general sufficient condition for a product of SP matrices to be SP.

Theorem 3.2.2 Let A ∈ Mm,n(R), C ∈ Mr,m(R), and B ∈ Mn,s(R) be SP. Then, (1) if R+(B) ∩ S+(A) ≠ ∅, then AB is SP; (2) if A(R+(B) ∩ S+(A)) ∩ S+(C) ≠ ∅, then CAB is SP.

Proof (1) If u ∈ R+(B) ∩ S+(A), then u = Bx > 0 for some x ≥ 0, and Au > 0. Thus ABx = Au > 0, and so AB is SP.
(2) Let z ∈ A(R+(B) ∩ S+(A)) ∩ S+(C). Then z = Au, where u ∈ R+(B) ∩ S+(A). Because u ∈ R+(B) ∩ S+(A), as in (1), we have u = Bx > 0 for some x ≥ 0. It follows that CABx = CAu = Cz > 0, and so CAB is SP.

Of course, it may happen that (a) one or both of A and B are not SP, or (b) that both are, and the sufficient conditions of Theorem 3.2.2 are not met, and yet AB still be SP.

Example 3.2.3 Consider

A = [−1 −1; −1 0],  B = [−1 −1; 0 1],  AB = [1 0; 1 1].

Note that as A and B are not SP, R+(B) ∩ S+(A) = ∅, yet AB ∈ SP. Also, consider

C = [1 1; 0 1],  E = [1 0; 1 0],  CE = [2 0; 1 0].

Note that S+(C) comprises the nonnegative vectors with positive second entry, S+(E) comprises the nonnegative vectors with positive first entry, and thus R+(E) = ES+(E) comprises the vectors with positive first entry and zero second entry. Thus R+(E) ∩ S+(C) = ∅, yet CE ∈ SP.

It seems not so straightforward to characterize when AB is SP in terms of individual characteristics of A and B. The interaction of characteristics is key. If we begin with a general matrix in Mm,n(R) and ask if it is the product of two SP matrices, the answer is clearer, though nontrivial. To see what happens, a technical lemma is useful.

Lemma 3.2.4 For m ≥ 2, n ≥ 1, suppose that C ∈ Mm,n(R). If {v1, w} ⊆ Rm is a linearly independent set and v2 ∈ Rn is such that {Cv2, w} is a linearly independent set, then there exist A ∈ Mm(R) and B ∈ Mm,n(R) such that Av1 = w, Bv2 = w and C = AB.

Proof Choose A ∈ Mm(R) to be an invertible matrix such that Av1 = w and Aw = Cv2 ≠ 0. Set B = A−1C, so that C = AB. Then, Bv2 = A−1Cv2 = w and the stated requirements are fulfilled.

Theorem 3.2.5 If m ≥ 2, n ≥ 1, and C ∈ Mm,n(R), then there exist A ∈ Mm(R) and B ∈ Mm,n(R), both SP, such that C = AB.

Proof If rank C ≥ 1, positive vectors v1, v2, and w may be chosen, so as to fulfill the hypothesis of Lemma 3.2.4. The positivity of these vectors means that
A and B are SP, completing the proof of this theorem. If C = 0 and m ≥ 2, we may choose A ∈ Mm(R) to be the matrix each of whose rows is (1, −1, 0, . . . , 0), and B ∈ Mm,n(R) to be the all-ones matrix, so that C = AB. Because A and B both have positive columns, they are SP.

Remark 3.2.6 The requirement that m ≥ 2 in Theorem 3.2.5 is necessary. Suppose that C ∈ M1,n(R) and all the entries of C are negative. If C = AB and A is 1-by-1, then either the lone entry of A is negative, in which case A is not SP, or B has all negative entries and is therefore not SP. However, if C ∈ M1,n(R), n ≥ 2, either C has some positive entries and is, itself, SP, in which case C = CI is an SP factorization of C, or C ≤ 0. If C ≤ 0, then we may choose A = [1 −1] and B ∈ M2,n(R), B > 0, so that AB = C, with both A and B SP. Of course, if m = n = 1 and C ≤ 0, then C = AB, with both A, B ∈ M1,1(R) and SP, is not possible. But C = AB, with A ∈ M1,2(R) and B ∈ M2,1(R) and both A, B SP, is possible. Suppose C = (−c) with c > 0. Then let A = [1 −1] and B = [1; 1 + c].

Together with Theorem 3.2.5, the latter part of the remark above gives the following complete result.

Theorem 3.2.7 Suppose that C ∈ Mm,n(R). Then, there is a k ≥ 1, and A ∈ Mm,k(R) and a B ∈ Mk,n(R), both SP, such that C = AB. If m = 1 and C has no positive entries, we must have k ≥ 2, and k = 2 suffices. Otherwise, k ≤ m is possible.

For the sum of two SP matrices there are similar sufficient conditions.

Lemma 3.2.8 If A, B ∈ Mm,n(R) are SP, then S+(A) ∩ S+(B) ⊆ S+(A + B).

Proof Let A, B be SP and x ∈ S+(A) ∩ S+(B). Then (A + B)x = Ax + Bx > 0 and so x ∈ S+(A + B).

Theorem 3.2.9 If A, B ∈ Mm,n(R) are SP, and if S+(A) ∩ S+(B) ≠ ∅, then A + B is SP.

Proof This follows directly from the above lemma. Of course, again this sufficient condition is not necessary.
Example 3.2.10 Consider

A = [−1 −1; 2 2],  B = [2 2; −1 −1],  A + B = [1 1; 1 1].

Note that as A and B are not SP, S+(A) ∩ S+(B) = ∅, yet A + B ∈ SP.

Now, which matrices occur as the sum of two SP matrices? In this case, there is no flexibility in the dimensions of the summands; they must be exactly the same as the desired sum. This means that there are two restrictions.

Theorem 3.2.11 Suppose that C ∈ Mm,n(R), with n ≥ 2. Then, there exist SP matrices A, B ∈ Mm,n(R) such that C = A + B.

Proof First, pick A so that the entries of the first column are all 1s, the second column is the second column of C with one subtracted from each entry, and all of the other columns match the columns of C. Then set the first column of B to be the first column of C with 1 subtracted from each entry, let the second column have every entry equal to 1, and make all of the other columns have entries equal to 0. Then C = A + B, and A and B are both SP, as each has a positive column.

Remark 3.2.12 The requirement that n ≥ 2 in Theorem 3.2.11 is necessary. If n = 1 and C has a negative entry, then for C = A + B, A or B must have a negative entry in the same position, thus a negative row, and not be SP. So, when n = 1, C is the sum of two SP matrices if and only if C itself is SP.

Despite it being easy to understand when a matrix is the sum of two SP matrices, it cannot so often be done if both left and right SP are required of the summands. For sums, the 0 matrix of any size is an easy counterexample. If A + B = 0 and A and B are both left SP and SP, then A = −B, so that A is both left and right SN as well. However, then A is SP and left SN, and therefore left SNP as well, which contradicts the Theorem of the Alternative 1.4.1.
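The construction in the proof of Theorem 3.2.11 is explicit and easy to carry out and verify computationally. The following sketch (NumPy/SciPy; the function names and sample C are illustrative) builds the two SP summands and checks them.

```python
import numpy as np
from scipy.optimize import linprog

def is_semipositive(A):
    """x >= 0 with Ax > 0 exists iff x >= 0 with Ax >= e is feasible (scale x)."""
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    res = linprog(np.zeros(n), A_ub=-A, b_ub=-np.ones(m), bounds=[(0, None)] * n)
    return res.success

def sp_summands(C):
    """Split C (with at least two columns) as C = A + B, A and B both SP,
    following the construction in the proof of Theorem 3.2.11."""
    C = np.asarray(C, dtype=float)
    A, B = C.copy(), np.zeros_like(C)
    A[:, 0] = 1.0                  # first column of A is all ones, so A is SP
    A[:, 1] = C[:, 1] - 1.0
    B[:, 0] = C[:, 0] - 1.0
    B[:, 1] = 1.0                  # second column of B is all ones, so B is SP
    return A, B

C = np.array([[-3.0,  0.0, 2.0],
              [ 1.0, -4.0, 0.0]])
A, B = sp_summands(C)
assert np.allclose(A + B, C) and is_semipositive(A) and is_semipositive(B)
print(A, B, sep="\n")
```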
3.3 Further Structure and Sign Patterns of SP Matrices

If A ∈ Mm,n(R) and B ∈ Mp,n(R), we may form the stacked matrix

[A; B] ∈ Mm+p,n(R),

whose first m rows are the rows of A and whose last p rows are the rows of B. Because of Theorem 3.1.10, in order for [A; B] to be SP, we must have both A and B be SP. If A and B are both SP, what further do we need in order that [A; B] be SP?
Lemma 3.3.1 For A ∈ Mm,n(R) and B ∈ Mp,n(R), S+([A; B]) = S+(A) ∩ S+(B).

Proof Observe that x ∈ S+([A; B]) if and only if x ≥ 0, Ax > 0, and Bx > 0.
From this follows a useful observation.

Theorem 3.3.2 If A ∈ Mm,n(R) and B ∈ Mp,n(R) are SP, then [A; B] is SP if and only if S+(A) ∩ S+(B) ≠ ∅.

Proof By definition, [A; B] is SP when S+([A; B]) ≠ ∅; the result follows from the above lemma.

If p = m, we note that the condition in Theorem 3.3.2 is the same as the sufficient condition for A + B to be SP in Theorem 3.2.9. This means that

Corollary 3.3.3 If A, B ∈ Mm,n(R) are SP, and [A; B] is SP, then A + B is SP.

This could also be proven by writing A + B as [I I][A; B] and applying Theorem 3.2.2 (1). Thus, products, sums, and stacking are closely related. Another view would be the intersection of half-spaces; see Section 1.3.3. Of course, Theorem 3.3.2 shows how the semipositive cone is diminished as we add rows. If we add positive rows, there is no diminution at all, as the semipositive cone of a positive row is the entire nonnegative orthant, save the 0 vector. This means

Corollary 3.3.4 Suppose that A ∈ Mm,n(R) and B ∈ Mp,n(R) with B > 0. Then [A; B] is SP if and only if A is SP.

What, then, about matrices of the form [A B]? Suppose that A ∈ Mm,n(R) and B ∈ Mm,q(R), and we consider the side-by-side matrix [A B] ∈ Mm,n+q(R). An important fact is the following.

Theorem 3.3.5 Suppose A ∈ Mm,n(R) and B ∈ Mm,q(R). If A or B is SP, then [A B] is SP.
Proof If A is SP, let x ∈ S+(A) and define y = [x; 0] ∈ Rn+q, the vector x followed by q zeros. Then y ≥ 0 and [A B]y = Ax > 0, and thus [A B] is SP. The proof when B is SP is similar.

By Corollary 3.1.7, if any subset of the columns of A ∈ Mm,n(R) forms an SP matrix, then it follows from Theorem 3.3.5 that A is SP. This has already
3.3 Further Structure and Sign Patterns
23
been mentioned in Theorem 3.1.10 (1). Because a positive column vector is trivially an SP matrix, a special case is mentioned in Remark 3.1.11. Corollary 3.3.6 If A ∈ Mm,n (R) has a positive column, then A is SP. The previous corollary has an interesting and far-reaching generalization. We say that A ∈ Mm,n (R) has a positive front if the columns may be permuted so that, in each row, there is a nonzero entry and the first nonzero entry is positive. These positive entries are called fronted positives. Of course, a matrix with a positive column has a positive front, but it may happen that there is a positive front without a positive column. Example 3.3.7 Let
⎡
⎤ 1 0 −1 A = ⎣0 −1 1 ⎦ . 0 −1 1
Then the interchange of columns 2 and 3 yields ⎡ ⎤ 1 −1 0 A = ⎣0 1 −1⎦ , 0 1 −1 so that A has a positive front. It is easy to see that A is SP as ⎡ ⎤ 2 ⎣1⎦ ∈ S+ (A ) 0 and so A is SP by Corollary 3.1.7. It turns out that the existence of a positive front, like a positive column, is always sufficient for SP. As we shall see, this is a minimal condition in terms of the sign pattern alone. Theorem 3.3.8 If A ∈ Mm,n (R) has a positive front, then A is SP. Proof If A can be permuted to a matrix A that has a positive front, by Corollary 3.1.7, it is enough to show that A is semipositive because semipositivity is not changed under permutation. If A has z zero columns as its first z columns, set the first z entries of v to be zero. Set the remaining d entries of v as xd , xd−1 , . . . , x, where x is a fixed number to be specified later. Each row of the product A v will be a polynomial of degree less than or equal to d, whose first entry is positive (because the first nonzero entry in each row is positive, and there are no zero rows because A has a positive front). Because the coefficient of
24
Semipositive Matrices
the monomial with the largest degree in each of these polynomials is positive, the limit as x approaches infinity for all of these polynomials is infinity. Therefore there is some x > 0 where each of these polynomials are greater than zero, which implies that A is semipositive, and thus A is semipositive. We note that it is not difficult to test for the existence of a positive front [DGJT16]. An algorithm that does this is described next. Algorithm PF To determine if an m-by-n matrix A has a positive front, first check that the matrix has no zero row. If it does, this matrix does not have a positive front. If it does not, construct a sequence of matrices with A1 := A. Construct A2 by removing the 0 columns in A1 . Afterwards for k > 1, construct Ak+1 recursively in the following fashion: 1. Determine if Ak contains a column without a negative entry. If all p columns of Ak contain a negative entry, this matrix does not have a positive front. Otherwise, choose a j-th column of Ak , which we will denote by akj , which has no negative entries. Set Ak+1 by deleting the j-th column of Ak , as well as any row in Ak in which akj contains a positive entry (leaving the rest of the matrix), and repeat step 1, unless this deletion will result in the loss of the entire matrix. If this is the case, set Ak as the “final matrix” and proceed to step 2. 2. Define a function C(akj ) that takes the vector akj and returns the original column of A associated with akj . 3. If A has r columns of entirely 0, set the first r columns of A to be 0. Then, set the next column to be C(a3j ), the next column to be C(a4j ), . . . , and the next column to be C(akj ), where Ak is the final matrix. If there are columns of A that have not been mapped to by this C function and are nonzero, set them in an arbitrary order after the C(akj ) column. Note that this process must terminate in at most n steps because after n repetitions of step 1, the entire matrix will be “deleted.” Algorithm PF terminates (in A ) if and only if A has a positive front [DGJT16, theorem 4.5]. By the sign pattern of A = [aij ] ∈ Mm,n (R), we mean an m-by-n array A of signs (+, −, 0) in which aij > 0 ( 0, a sort-of diagonal dominance. This gives the following theorem. Theorem 3.3.10 An m-by-n sign pattern allows SP if and only if each row contains a +.
3.4 Spectral Theory of SP Matrices If we consider only square matrices A ∈ Mn (R), it is natural to ask what is special about the spectra of SP matrices, either just the multi-set of eigenvalues, or, more precisely, the similarity class (Jordan) structure. Interestingly, there is almost nothing special, which we show here. Nearly all matrices in Mn (R) are similar to an SP matrix.
26
Semipositive Matrices
Lemma 3.4.1 Every non-scalar matrix in Mn (R) is similar to a real matrix with positive first column. Proof Unless one is scalar and the other is not, two matrices in M2 (R) are similar if and only if they have the same trace and determinant. Furthermore, unless the diagonal entries are the eigenvalues, both off-diagonal entries will be nonzero. Thus, a simple calculation shows that the claim of the lemma is valid in the 2-by-2 case. Now, let A ∈ Mn (R) be non-scalar. It is clearly similar to a non-diagonal matrix B ∈ Mn (R), which, in turn, has a 2-by-2 non-scalar principal submatrix that we may assume, without loss of generality, is in the first two rows and columns and has a positive first column. Now, if the remaining entries in the first column of B are all nonzero, the proof is completed by performing similarity on B by a diagonal matrix of ± 1s, so as to adjust the signs of the entries in the first column to be all positive and complete the proof. If not all entries in the first column are nonzero, we may make them so, by a sequence of similarities of the form ⎤ ⎡ I 0 0 ⎥ ⎢ 1 0 ⎢0 0⎥ ⎦, ⎣ 1 1 0 0 I in which the second I may not be present. Then, we may proceed as in the case of a totally nonzero first column to complete the proof. This is the key to the observation of interest. Theorem 3.4.2 Except for nonpositive scalar matrices, every matrix in Mn (R) is similar to an SP matrix. Proof If A ∈ Mn (R) is not a scalar matrix, then according to Lemma 3.4.1, there is a matrix with positive column in its similarity class. By Corollary 3.3.6, this matrix is SP. Because any positive diagonal matrix is SP, positive scalar matrices are, as well, which completes the proof. From this we have Corollary 3.4.3 Let n be given and suppose that = {λ1 , . . . , λn } is a multiset of complex numbers that is the spectrum of a real matrix. Then is the spectrum of an SP matrix in Mn (R), unless n = 1 and λ1 ≤ 0. Recall that a multi-set = {λ1 , . . . , λn } of complex numbers is the spectrum of a real matrix if the multiplicity in of any non-real complex number is the same as its conjugate.
3.5 Minimally Semipositive Matrices
27
3.5 Minimally Semipositive Matrices Recall that A ∈ Mm,n (R) is called minimally semipositive (MSP) if it is SP and no column-deleted submatrix is SP. Otherwise, an SP matrix is called redundantly semipositive (RSP). These two concepts may be viewed in terms of S+ (A). Lemma 3.5.1 If A ∈ Mm,n (R) is SP, then A is RSP if and only if there are vectors in S+ (A) with 0 components. Column j may be deleted from A to leave an SP matrix if and only if there is a vector in S+ (A) whose j-th component is 0. Conversely, A is MSP if and only if all components of vectors in S+ (A) are positive. Proof There is an x ∈ S+ (A) with xj = 0 if and only if column j of A is redundant to the semipositivity of A. Both claims of the lemma follow from this. 2 2 −3 −1 2 0 Example 3.5.2 Matrix A = is SP, as e = [1 1 1]T ∈ S+ (A). 0 −2 3
But, deletion of column 2 or 3 leaves a matrix with a row of nonpositive 2 num
−3 bers. Deletion of column 1 leaves a matrix with the square submatrix −3 2 . Because the second row is the negative of the first, it cannot be SP, so that deletion of column 1 also leaves a matrix that is not SP. Thus, A is MSP. Notice that for the matrix A above, ⎡ ⎤ 1/2 0 1/2 A−1 = ⎣1/4 1/2 1/4⎦ , 1/6 1/3 1/2 so that not only is A invertible, but A−1 ≥ 0. An SP matrix may have any disparity between the number of rows and columns. In trying to understand the special structure of MSP matrices, an important fact is that there cannot be more columns than rows. But, first we notice that any SP matrix with a nontrivial null space must be RSP. Lemma 3.5.3 Suppose that A ∈ Mm,n (R) is SP and that Null(A) = {0}. Then A is RSP. Proof Let x = 0 be an element of Null(A), y ∈ S+ (A), and define y(t) = y + tx. Note that Ay(t) = Ay > 0. If y ≯ 0, we are done, using Lemma 3.5.1. If y > 0, then y(t) ≥ 0 for sufficiently small t, i.e., there is a T > 0 such that for 0 ≤ t ≤ T, y(t) ≥ 0. Because we may assume without loss of generality that x has some negative entries (else, replace x with −x), there is a t > 0 such
28
Semipositive Matrices
that y(t) ≥ 0 and y(t) ≯ 0. But, then y(t) ∈ S+ (A) has 0 entries, so that, by Lemma 3.5.1 again, A is RSP. Remark 3.5.4 Observe that if A ∈ Mn (R) is invertible and A−1 ≥ 0, then A−1 is row positive (no row can be 0, as rank(A−1 ) = n). Furthermore, if A−1 is row positive and x > 0, x ∈ Rn , then A−1 x > 0. It follows from Lemma 3.5.3 that columns may be deleted from an RSP matrix so that an SP matrix with linearly independent columns results. Such an SP matrix may or may not be MSP. 1 1 Example 3.5.5 Let A = 1 2 . Then, A is SP with linearly independent 13 columns, but A is also RSP, as both its columns are positive. The important consequence of Lemma 3.5.3 is the following. Theorem 3.5.6 If A ∈ Mm,n (R) is MSP, then A has linearly independent columns, so that n ≤ m. Proof The linearly independent columns follow from Lemma 3.5.3, and then n ≤ m follows as well. Another useful fact has a proof similar to that of Lemma 3.5.3, though it is independent. Lemma 3.5.7 Let A ∈ Mm,n (R) be SP. Then A is MSP if and only if for x ∈ Rn , Ax > 0 implies x > 0. Proof Suppose A is MSP, Ax > 0, and that x ∈ R has a nonpositive entry. If x ≥ 0, then x ∈ S+ (A) and by Lemma 3.5.1, A is RSP, a contradiction. Further, if x has a negative entry, choose u > 0, u ∈ S+ (A). Then there is a convex combination w = tu + (1 − t)x, 0 < t < 1, of u and x with w ≥ 0 and at least one entry 0. But, as Aw = tAu + (1 − t)Ax > 0, we again arrive at a contradiction, via Lemma 3.5.1. So, the forward implication is verified. On the other hand, also by Lemma 3.5.1, if Ax > 0 implies x > 0, the fact that A is SP means that S+ (A) = 0 and no vector in S+ (A) has a 0 component. So A is MSP, and the proof is complete. Note that Lemma 3.5.7 says something different from the fact claimed in Lemma 3.5.1. Now, an important characterization of square MSP matrices may be given. Corollary 3.5.8 If A ∈ Mn (R), then A is MSP if and only if A−1 exists and A−1 ≥ 0.
3.5 Minimally Semipositive Matrices
29
Proof Suppose that A is MSP. It follows from Theorem 3.5.6 that A is invertible. Because A is invertible, every vector in Rn is in the range of A. Let y ∈ Rn , y > 0 and let x ∈ Rn be such that Ax = y. By Lemma 3.5.7, x > 0. This means that y > 0 implies A−1 y = x > 0 and, because y > 0 is arbitrary, that A−1 ≥ 0. Conversely, suppose that A−1 ≥ 0, and choose z > 0. Then w = A−1 x > 0, and Aw = z > 0, so that A is SP. But if Ax = y > 0, then x = A−1 y > 0, so that by Lemma 3.5.7, A is MSP, completing the proof. In case m > n, A ∈ Mn (R) may still be MSP, in which case A will have full column rank and, thus, have left inverses. However, not every left inverse need to be nonnegative, unlike the square case.
Example 3.5.9 Matrix A = 11 ∈ M2,1 (R) is MSP, and L = [2 −1] is a left inverse that is not nonnegative. However, A does have
nonnegative left inverses 3 , then A is not SP, but M is as well, such as M = [1/2 1/2]. Also, if A = −1 still a nonnegative left inverse of A. This means that non-square MSP matrices may not be so crisply characterized, as in Corollary 3.5.8. However, there is still a strong result. Theorem 3.5.10 Let A ∈ Mn (R) be SP. Then, A is MSP if and only if A has a nonnegative left inverse. Proof First suppose that A is SP and has a nonnegative left inverse L. Suppose x ∈ S+ (A), so that y = Ax > 0. Then 0 < Ly = LAx = x, so that by Lemma 3.5.1, A is MSP, as no entry of a vector in S+ (A) is 0. On the other hand, suppose that A is MSP, so that A has left inverses by Theorem 3.5.6. Let Aj be A with the j-th column deleted, j = 1, 2, . . . , n. Because A is MSP, none of these is SP. By the Theorem of the Alternative 1.4.1 and because A is MSP, there is a yj ∈ Rm , yj ≥ 0, = 0 with yTj Aj ≤ 0. Let L ∈ Mn,m (R) be the matrix whose j-th row is yTj so that L ≥ 0. Then, the off-diagonal entries of LA are nonpositive. Let x ∈ Rn , x > 0, be such that Ax > 0 (which exists as A is MSP), and we have LAx > 0. This means that LA ∈ Mn (R) is a nonsingular M-matrix; see Chapter 2. Then, B = (LA)−1 ≥ 0. Because L ≥ 0, BL ≥ 0, as well. But (BL)A = B(LA) = (LA)−1 LA = I, so that A has a nonnegative left inverse, to complete the proof. As was discussed in Chapter 2, there are many variations on the notion of “monotonicity” for matrices. We might call the one in Lemma 3.5.7 strong monotonicity (for not necessarily square matrices): A ∈ Mm,n (R) is strongly
30
Semipositive Matrices
monotone if for x ∈ Rn , Ax > 0 implies x > 0. We may summarize our understanding of MSP as follows. Theorem 3.5.11 Suppose that A ∈ Mm,n (R) is SP. Then, the following are equivalent: (1) A is MSP. (2) A has a nonnegative left inverse. (3) A is strongly monotone. In the square case, recalling that a matrix A is called monotone if Ax ≥ 0 implies x ≥ 0, there is the crisper statement. Theorem 3.5.12 If A ∈ Mn (R), the following are equivalent: (1) A is MSP. (2) A−1 exists, and A−1 ≥ 0. (3) A is monotone. We have already discussed the sign patterns that allow or require SP in Section 3.3. We mention here the corresponding results for the concepts of this section: RSP and MSP. The sign patterns that allow MSP [JMS] are quite subtle. Some, of course, are built upon the SP cases. The sign patterns that allow RSP are those with a + in every row and for which there is a column that does not contain the only + of any row. The sign patterns that require RSP are those with a positive front, so that there is a column containing none of the frontal positives. To require RSP, a sign pattern must have a positive front, and every column must eliminate any positive front. So there must be a positive front, and each column must contain a frontal positive that is the only + in its row. The sign patterns that allow MSP are more complicated to describe and are not fully understood when 0s are present [JMS]. There are, as one may guess from the results of this section, closely related to sign patterns that allow a nonnegative inverse [JLR, etc.].
3.6 Linear Preservers Suppose that E ∈ Mm,n (F) is a class of matrices. If L : Mm,n (F) → Mm,n (F) is a linear transformation such that L(E) ⊆ E, then L is called a linear preserver of E. Understanding the linear preservers of E can provide insight into class E. Often additional assumptions are made to obtain definitive results; for example, onto (L(E) = E), invertible (L is invertible), L of a special form (e.g., L(x) = RXS, R ∈ Mn (F) and S ∈ Mm (F) invertible).
3.6 Linear Preservers
31
Here, we are interested in the linear preservers of the class SP, and of MSP, in Mm,n (F), in several senses. We will be able to use several of the results already developed. We first consider SP preservers of the form L(X): Mm,n (R) → Mm,n (R) with L(X) = RXS
for fixed
R ∈ Mm (R)
and S ∈ Mn (R).
(3.6.1)
We already know that if R is row positive and S is monotone, then such a transformation preserves SP (Theorem 3.1.5). Are there other possibilities? First is a theorem from [DGJJT16]; another proof is given in [CKS18a]. Theorem 3.6.1 Let L be a linear operator on Mm,n (R) of the form (3.6.1). Then L is an into linear preserver of SP if and only if either R is row positive and S is monotone, or −R is row positive and −S is monotone. Proof We need only verify necessity. Note that a rank one matrix xyT is SP if and only if x > 0 and y has at least one positive entry, or x < 0 and y has at least one negative entry. If X = xyT , with x > 0 and at least one entry of y is positive, then X is SP and L(X) = RXS = (Rx)(ST y)T is SP, per hypothesis. Because x > 0 is otherwise arbitrary, this means that either Rx > 0 or Rx < 0, or that either R is row positive, or −R is row positive. Assume the former. Then, ST y has at least one positive entry for every y that has at least one positive entry. This gives that, if ST y has no positive entries, then y has no positive entries, i.e., ST is monotone. But, as S is square, S must be monotone. The case in which −R is row positive similarly yields that −S is monotone, completing the proof. Recall that a matrix B ∈ Mn (R) is monomial if B ≥ 0 and has exactly one nonzero (i.e., positive) entry in each row and column. This is the same as saying that it is of the form B = DP with D ∈ D+ n and P ∈ Pn or B = QE with Q ∈ Pn and E ∈ D+ . The important feature of monomial matrices is that they are the n intersection of the row positive and the monotone matrices in Mn (R). Lemma 3.6.2 A matrix B ∈ Mn (R) is monomial if and only if B is both row positive and monotone. Proof Obviously, a monomial matrix is both row positive and inverse nonnegative. Suppose now that X is row positive and monotone. Then by Theorem 3.5.12, X is also inverse nonnegative. Let ei denote the standard basis vectors; that is, ei has a 1 in its i-th entry and 0s everywhere else. Because X is row positive, it must take each ei to a nonnegative sum of the ej ; that is, Xei = rji ej for some rji ≥ 0. Because X is inverse nonnegative, we must have that for each ej , there exists vj ≥ 0 such that Xvj = ej . Note, however, that this is impossible unless for some ei , Xei = rji ej . If this were not true, then Xvj would always have some positive entry that was not the j-th entry because
32
Semipositive Matrices
it cannot cancel out, due to their being no negatives involved. Thus, for every ej there exists ei such that Xei = rji ej . But this is exactly the same as claiming that X has exactly one non-zero entry in each row and column, and so X is monomial. With this, we may characterize the onto linear SP-preservers. Theorem 3.6.3 A linear operator on Mm,n (R) of the form (3.6.1) is an onto preserver of SP if and only if both R and S are monomial or −R and −S are monomial. Proof Suppose that L is an onto preserver of SP. First, note that the set of SP matrices contains a basis for Mm,n . We can see this by noting that every matrix that has a 2 in one entry and a 1 in every other entry is SP, so each is contained in the image of L. These matrices form a basis for Mm,n . Thus, L is an invertible map, which means that R and S must be invertible, and L−1 (A) = R−1 AS−1 . Because L is an onto preserver of SP, both it and its inverse are into preservers of SP, so R and S must be both row positive and inverse nonnegative (or −R and −S ), and thus monomial. Conversely, if R and S are both monomial, then L(A) = RAS is an into preserver of SP, as is L−1 (A) = R−1 AS−1 , so L is an onto preserver of SP. Now that we have characterized linear SP-preservers, both into and onto, of the form (3.6.1), we turn to linear preservers of MSP of this form. Notice that the onto preservers of SP, of this form, are also onto preservers of MSP. The same is not so for into preservers of MSP. First, we need a particular fact, not so obviously related to SP or MSP. Lemma 3.6.4 Let v ∈ Rn and w ∈ Rm be nonzero vectors and let n > m. If v has both a positive entry and a negative entry, then there exists a matrix B ∈ Mm,n (R) of full row rank such that B ≥ 0 and Bv = w. The same holds if 0 = v ≥ 0 and w > 0. Proof First, suppose that v has a negative entry. Without loss of generality, suppose that the first k entries of v are positive, and the last n−k are nonnegative. Denote by wi ∈ Rm the vector with 1 for the first m−i+1 entries, and 0 for every other entry. Set Bei = ri wi for 1 ≤ i ≤ m, where ei ∈ Rn is the i-th standard basis vector, and ri = 1 for i > 1, and is yet to be determined for i = 1. Note that this forces B to have full row rank. Now, set Bei = 0 for m < i < n. Consider the matrix B, where we complete this construction by picking Ben = 0. Choose r1 large enough that Bv > w, then set Ben = v1n (w − Bv), where vn is the last entry of v. Because vn < 0, this means that Ben ≥ 0, and we have that Bv = w.
3.6 Linear Preservers
33
Now, suppose that v ≥ 0. Similar to the above construction, set Bei = ri wi for 1 < i ≤ m, and set Bei = 0 for i > m. Let B be the end of this construction, supposing that we set Be1 = 0. Take the ri small enough that Bv < w, then set Be1 = v11 (w − Bv), so that Bv = w. Recall that it is necessary, for B ∈ Mm,n (R) to be MSP, that m ≥ n. It is convenient to consider linear MSP preservers of the form (3.6.1) in the two possible cases: m > n and m = n. The first case is m > n: Theorem 3.6.5 Let L be a linear operator on Mm,n (R), with m > n > 1, of form (2.6.1). Then (i) L is an into preserver of MSP if and only if R is monomial and S is monotone, or −R is monomial and −S is monotone; (ii) L is an onto preserver of MSP if and only if R and S are monomial, or −R and −S are monomial. Proof (i) First, recall that A is MSP if and only if A is SP and A has a nonnegative left inverse. Now, suppose that A is MSP and let B be a nonnegative left inverse for A. Then if R is monomial and S is inverse nonnegative, S−1 BR−1 is nonnegative because each matrix is nonnegative. Moreover, this forms a left inverse for RAS. Because R is row positive and S is inverse nonnegative, we also know that RAS is SP. Thus, RAS is MSP. If S is also taken to be monomial, then we can obviously get every MSP matrix B in the image of L because we can simply set A = R−1 BS−1 , which is MSP if B is, so that L(A) = B. Conversely, suppose that L(A) preserves MSP in the into sense. First, note that R and S must be invertible. If they were not, then RAS would not have full column rank, and therefore could not have a left inverse. Now, suppose that neither R nor −R is inverse nonnegative. Then there exists some vector v ≯ 0 and v 0 such that Rv ≥ 0. Let w ≯ 0, and consider Sw. Let B ≥ 0 be an n-by-m matrix such that Bv = Sw. Such a matrix must exist, by the previous lemma. Pick a basis {vi } for Rm , starting with v1 = v and v2 > 0, and let zi = Bvi . Note that z2 > 0 because B ≥ 0, with full row rank. Pick an m-by-n matrix A so that Azi = vi . Then A is a right inverse of B, and A is SP because Az2 = v2 . Thus, A is MSP. But, RASw = RABv = RAz1 = Rv ≥ 0, while w ≯ 0. Thus, RAS is not MSP, and so R or −R must be inverse nonnegative. (ii) Suppose that R is inverse nonnegative, and suppose that S was not inverse nonnegative. Then there exists some u ≥ 0 such that S−1 u = w ≯ 0. Set z ≥ 0 such that Rz ≥ 0. Such a vector exists because R is inverse nonnegative. Then
34
Semipositive Matrices
choose some n-by-m matrix B ≥ 0 with full row rank such that Bz = u. Note that such a matrix must exist, by the previous lemma. Now, pick a basis {ui } for Rn beginning with u1 = u. Because B has full row rank, there exist linearly independent vectors zi such that Bzi = ui , with z1 = z. Set A to be an m-by-n matrix such that Aui = zi . Then A is a right inverse of B and A is SP because Au = z. Thus, A is MSP. But, RASw = RAu = Rz ≥ 0, despite the fact that w ≯ 0. Thus, RAS cannot be MSP, so S must be inverse nonnegative. A similar proof shows that if −R is inverse nonnegative, so is −S. Finally, we must show that R is row positive, which comes down to just showing that R ≥ 0 because R is invertible. Suppose that R had a negative entry. Let it be the (i, j) entry. Choose an m-by-n matrix A such that, except for the j-th row, every row has exactly one entry that is a 1 and all others 0, and every column has at least one 1 entry. This is possible, due to the fact that m > n. If we then choose the j-th row of A to be positive, A will be MSP because A ≥ 0 with a positive entry in each row, making it SP, and no column-deleted submatrix could be SP because it would then have a zero row. Note that by making each entry of the j-th row of A large enough, we can force the i-th row of XA to be any vector v < 0, as long as the entries of v are large enough. Now, pick a vector w < 0 such that wS < 0. Such a vector must exist because R is inverse nonnegative, and thus so is S, which means S is SP. Pick A so that v = rw for some large enough scalar r. Then RAS will have a negative i-th row, and thus cannot be SP. Therefore, R must be row positive, and so R is monomial. Now, suppose that L(A) = RAS is an onto linear preserver of MSP. L must be invertible as shown above, so L−1 (A) = R−1 AS−1 , and L(A) = RAS are both into linear preservers. That means that R is monomial and S is both inverse nonnegative and row positive, and thus also monomial. Corollary 3.6.6 Let C ∈ Mr,m (R), B ∈ Mn,s (R). Then CA (AB) is SP, for every A ∈ Mm,n (R) that is SP if and only if C is row positive (B is monotone). The second case is m = n: The requirement that R be monomial is not necessary. In this case, R need only be monotone, because of Corollary 3.5.8 (MSP is equivalent to inverse nonnegativity, which is monotonicity, and monotonicity is closed under product). This should be compared with the non-square case. Preserving SP will occur automatically. This is revealed in the proof of Theorem 3.6.5 when A, with one row deleted is asked to be MSP. This could not happen when m = n. Thus, we have Theorem 3.6.7 Let L be a linear operator on Mn (R) of the form (3.6.1). Then (i) L is an into preserver of MSP if and only if R and S are monotone, or −R and −S are monotone;
3.6 Linear Preservers
35
(ii) L is an onto preserver of MSP if and only if R and S are monomial, or −R and −S are monomial. If we only ask that L : Mm,n (R) → Mm,n (R) be an onto linear preserver of SP, with no requirement of special form, it turns out that the special form (3.6.1) is necessary for linear, onto, preservation of SP. But, this requires substantial proof. We begin with some needed lemmas. Lemma 3.6.8 If L : Mm,n (R) → Mm,n (R) is an onto linear preserver of SP, then L(X) is SP if and only if X is SP. Proof The result follows from the definition of an onto preserver, and the fact that the SP class of matrices forms a basis. Lemma 3.6.9 If L : Mm,n (R) → Mm,n (R) is an onto linear preserver of SP, then L is also an onto linear preserver of LSP. Proof First, note that L is an onto preserver of the set of matrices that are not left SNP because this is the same as the set of matrices that are SP, by the Theorem of the Alternative 1.4.1. But, L must be invertible because the set of SP matrices contains a basis for Mm,n , so L in fact must be an onto preserver of left SNP. L is a homeomorphism, as it and its inverse are continuous, and the set of left SNP matrices is the closure of the set of left SN matrices, so L must also be an onto preserver of left SN. But the set of left SN matrices is exactly the negative of the set of left SP matrices, so if one is preserved in the onto sense by a linear map, so must the other. Thus, L preserves left SP. The above reasoning can be used to show that a linear map preserving one class of matrices must also preserve others. Now that we realize that onto linear preservers of SP are also LSP preservers, we can transfer any fact about the columns of matrices after L is applied to them, to the rows, as well. To show that an onto linear preserver of SP preserves nonnegative matrices, as well, we need an observation about the relationship between nonnegative matrices and SP matrices that we could have made earlier. Lemma 3.6.10 A matrix A ∈ Mm,n (R) satisfies A ≥ 0 if and only if, for every SP matrix B ∈ Mm,n (R), B + A is also SP. Proof First, suppose A is nonnegative. Then for any v ≥ 0, Av ≥ 0. Thus, if Bv > 0, then (A + B)v = Av + Bv > 0, so A + B is also SP. Conversely, suppose that A is not nonnegative. Then A has a negative entry. Suppose it to be the (i, j) entry. Pick B so that the j-th column of B is positive and all other entries are negative and larger in absolute value than any entry of A. Further,
36
Semipositive Matrices
have the (i, j) entry of B smaller in absolute value than the (i, j) entry of A. Then B is SP because it has a positive column, but A + B is not SP because every entry in the i-th row must be negative. We may now see that onto linear SP-preservers are onto linear preservers of the nonnegative matrices. Lemma 3.6.11 If L : Mm,n (R) → Mm,n (R) is an onto linear preserver of SP, then it is also an into linear preserver of nonnegativity, that is, if A ∈ Mm,n (R) and A ≥ 0, then L(A) ≥ 0. Proof Let A ≥ 0. Then for any SP matrix B, A + B is SP, so L(A + B) = L(A) + L(B) is SP. Note also that L(B) is SP. Further, for any SP matrix C there exists an SP matrix B such that L(B) = C. Thus, for any SP matrix C, L(A) + C is SP. By the previous lemma, then, L(A) ≥ 0. We now know that if we take the matrix Eij , an element of a basis of Mm,n (R), and apply a onto linear preserver L of SP to it, L(E) ≥ 0. Further, any fact about the columns of L(Eij ), resulting from this, applies to its rows, as well. Then, we get a general SP-preserver result. Theorem 3.6.12 Suppose L : Mm,n (R) → Mm,n (R). Then, L is an onto linear preserver of SP if and only if L is of the form (3.6.1) for some monomial matrices R and S. Proof We already know the reverse implication. Now suppose L is an onto linear preserver of SP. First, we know that L(Eij ) ≥ 0, by the previous lemma and the invertibility of L. Suppose that L(Eij ) has positive entries in more than one row. Note that L is SP (this i Eij must be SP because i Eij is the matrix E cannot be SP with 1s in the j-th row and 0s elsewhere). However, L ij i =k for any k because i =k Eij is not SP, and L is an onto preserver of SP. Now, each L(Eij ) must have a positive entry in at least one row. If any had positive entries in entries in two rows, there would be one Ekj that only had positive rows that another Eij already has a positive entry. But then, L E would ij i =k be SP because every row would have a positive entry, and no entry would be negative, so the sum of each column would be positive. Thus, each L(Eij ) can only have positive entries in one row. By the fact that L is also a preserver of left SP, each L(Eij ) can only have positive entries in one column as well. Thus, each L(Eij ) is a matrix with exactly one positive entry, which cannot be shared by any other L(Ekl ), due to the invertibility of L. Next, suppose that the L(Eij ) for fixed j are sent to matrices with the positive entry in different columns. Then consider a matrix A with a small positive
3.6 Linear Preservers
37
j-th column, and large negative entries everywhere else. If the entries in the j-th column of A are not sent to the same column of L(A), then we will end up with a matrix whose positive entries are not all in the same column. Because we can make the negative entries as large as we wish, we can force L(A) to not be SP, even though A was. Thus, all entries in the same column of A must be sent to the same column of L(A). Again, this must also be true for rows. The above now forces L to be a composition of Hadamard multiplication by some positive matrix, and permutation of rows and columns. Because permutation matrices are all monomial, we can assume that L is just Hadamard multiplication by some positive matrix, so that L(Eij ) = rij Eij . Consider the matrix ⎡ ⎤ 0 ··· 0 1 1 0 ··· 0 ⎢. .. .. .. .. .. ⎥ ⎢ .. . . . . .⎥ ⎢ ⎥ ⎢0 · · · 0 1 1 0 · · · 0⎥ ⎢ ⎥ ⎢ ⎥ ⎢0 · · · 0 1 −x 0 · · · 0⎥ A=⎢ ⎥. ⎢0 · · · 0 −x 1 0 · · · 0⎥ ⎢ ⎥ ⎢0 · · · 0 1 1 0 · · · 0⎥ ⎢ ⎥ .. .. .. .. .. ⎥ ⎢ .. ⎣. . . . . .⎦ 0 ··· 0 1 1 0 ··· 0 Note particularly that A is SP if and only if x < 1. Then ⎡. .. . . ⎢. ⎢0 · · · L(A) = ⎢ ⎢0 · · · ⎣ .. .. . .
.. . ri,j −xri+1,j .. .
.. . −xri,j+1 ri+1,j+1 .. .
.. . ··· ··· .. .
.. ⎤ .⎥ 0⎥ ⎥. 0⎥ ⎦ .. .
L(A) is SP if and only if A is SP, so L(A) is SP if and only if x < 1. Because all the other columns are zero besides j and j + 1, and all other rows of j and j + 1 are positive except i and i + 1, whether L(A) is SP depends entirely on those four entries. Setting x = 1, we see that we must have ri,j = ari,j+1 and ri+1, j = ari+1, j+1 . But we can repeat this for every pair of columns and rows to find the number we multiply an entry of some column by is in a fixed ratio with the number we multiply the entry of a different column in the same row by. This is just multiplication on the right by a positive diagonal matrix. Again, we can apply the fact that L must be left-SP preserving to show that the ratio of rows must also be in a fixed ratio, and so L is just multiplication on the left and right by positive diagonal matrices. But these are also monomial matrices,
38
Semipositive Matrices
so any onto linear preserver of SP can be written as L(A) = RAS for monomial R and S. For into linear preservers of SP, the special form (3.6.1) need not follow. Example 3.6.13 Consider the linear map L : M2 (R) → M2 (R) given by a b a b L = . c d a b−a This map preserves SP and is neither of the form (3.6.1) nor invertible. For the argument A to be SP, either a or b must be positive. If a > 0, the first column of L(A) is positive, so that L(A) is SP. If b > 0 and a ≤ 0, then the second column of L(A) is positive, and L(A) is SP. Thus, L preserves SP. Because 1 0 1 0 L = , 0 0 1 −1 L can increase rank and so cannot be of form (3.6.1). More generally, consider the map on Mm,n (R) given by ⎡ ⎤ a11 a12 ··· a1n ⎢ ⎥ n−1 ⎢a a1k ⎥ ⎢ 11 a12 − a11 · · · a1n − ⎥ ⎢ ⎥ k=1 ⎢ ⎥. L([aij ]) = ⎢ . .. .. ⎥ . ⎢ . ⎥ . . ⎢ ⎥ n−1 ⎣ ⎦ a1k a11 a12 − a11 · · · a1n − k=1
Because any SP matrix has a positive entry in the first row, L[a − ij] has a positive column, so that SP is preserved. But, as L can increase rank, L is not of form (3.6.1). Remark 3.6.14 The maps in the example above are not invertible. It is an open question what is the least strong regularity condition on an into SP preserver that it be of form (3.6.1), or what all into linear SP preservers are.
3.7 Geometric Mapping Properties of SP Matrices One would not expect that many strong properties of entry-wise nonnegative matrices generalize to SP matrices. There are, however, some cone-theoretic and Perron–Frobenius-type implications of semipositivity. This section contains such connections of SP matrices to cones of nonnegative vectors and cone
3.7 Geometric Mapping Properties of SP Matrices
39
invariance. More results of this type, including conditions under which an SP matrix has a positive eigenvalue, or leaves a proper cone invariant can be found in [Tsa16] and [ST18].
3.7.1 Preliminaries We will consider the following set, which is the closure of S+ (A). Definition 3.7.1 For A ∈ Mm,n (R), let K+ (A) = {x ∈ Rn : x ≥ 0 and Ax ≥ 0}. Note that A is SP if and only if AK+ (A) contains a positive vector, in which case we refer to K+ (A) as the semipositive cone of A. Many of the results on K+ (A) herein are stated and indeed hold for arbitrary A ∈ Mm,n (R). Our interest, however, is in semipositive matrices. The goal is to compute and study K+ (A) as a convex cone in Rn . For that purpose, we review below some basic material on generalized inverses and cones. The Moore–Penrose inverse of A ∈ Mm,n (R) is denoted by A† , and the group inverse of A is denoted by A# . Their defining properties are, respectively, AA† A = A, A† AA† = A† , (AA† )T = AA† , (A† A)T = A† A, and AA# A = A, A# AA# = A# , AA# = A# A. We let the range of A be denoted by R(A) and its null space by Null(A). While the Moore–Penrose inverse exists for all matrices A, the group inverse of a square matrix A ∈ Mn (R) exists if and only if rank(A) = rank(A2 ) (equivalently, Null(A) = Null(A2 )). It is known that A# exists if and only if the R(A) and Null(A) are complementary subspaces of Rn . We call A ∈ Mn (R) range symmetric if R(A) = R(AT ) or, equivalently, if † A = A# . The following will be used in some of the proofs below: R(A† ) = R(AT ), Null(A† ) = Null(AT ), R(A# ) = R(A), and Null(A# ) = Null(A). Recall that the topological interior of Rn+ comprises all positive vectors in n R and is denoted by int Rn+ . We use x > 0 and x ∈ int Rn+ interchangeably. The following geometric concepts will also be used in the sequel. The dual of a set S ⊆ Rn is S∗ = {z ∈ Rn : zT y ≥ 0 for all y ∈ S}. A nonempty convex set K ⊆ Rn is said to be a cone if αK ⊆ K for all α ≥ 0. A cone K is called a proper cone if it is (i) closed (in the Euclidean space Rn ),
40
Semipositive Matrices
(ii) pointed (i.e., K ∩ (−K) = {0}), and (iii) solid (i.e., the topological interior of K, int K, is nonempty). A polyhedral cone K ⊆ Rm is a cone consisting of all nonnegative linear combinations of a finite set of vectors in Rm , which are called the generators of K. Thus, K is polyhedral if and only if K = X Rn+ for some X ∈ Mm,n (R); when m = n and X is invertible, K = X Rn+ is called a simplicial cone in Rn+ . Note that simplicial cones in Rn are proper cones. A cone K is called an acute cone if pT q ≥ 0 for all p, q ∈ K. In terms of its dual, acuteness is equivalent to the inclusion K ⊆ K ∗ . A dual notion is that of obtuseness; K is called an obtuse cone if K ∗ ⊆ K. K is called a self-dual cone if it is both acute and obtuse. Rn+ is an example of a self-dual cone. We will have occasion to use the following results. The first result on the consistency of linear equations is quite well known; see, e.g., [B-IG03]. Lemma 3.7.2 Let A ∈ Mm,n (R) with b ∈ Rm . Then the system of linear equations Ax = b has a solution if and only if AA† b = b. In such a case, the general solution is given by x = A† b + z, where z ∈ Null(A). A version of the separating hyperplane theorem, which will be used below, is recalled next. For its proof, see [Man69]. / K. Then there Theorem 3.7.3 Let K ⊆ Rn be a closed convex set and b ∈ exists c ∈ Rn and a real number α such that cT b < α ≤ cT x for all x ∈ K. 3.7.2 Cones Associated with SP Matrices First is a fundamental factorization of SP matrices into the product of a positive and an inverse positive matrix. Theorem 3.7.4 A ∈ Mm,n (R) is SP if and only if there exist positive matrices X ∈ Mn (R) and Y ∈ Mm,n (R) such that X is invertible and A = YX −1 . Proof Let A be semipositive, i.e., there exist positive vectors x ∈ int Rn+ and y ∈ int Rm + such that Ax = y. Define the matrices X = xeT + I ∈ Mn (R),
Y = yeT + A ∈ Mm,n (R),
where e generically denotes the all-ones vector of appropriate size, and > 0 is chosen sufficiently small to have Y > 0. Then the result follows from the facts that AX = Y and that X > 0 is invertible because its eigenvalues are and eT x + . For the converse, assume there are positive matrices X ∈ Mn (R) and
3.7 Geometric Mapping Properties of SP Matrices
41
Y ∈ Mm,n (R) such that A = YX −1 . Let u ∈ int Rn+ and set v = Xu ∈ int Rn+ . Then Av = YX −1 v = Yu ∈ int Rm + , showing that A is SP. The following theorem shows that SP matrices act like nonnegative matrices on polyhedral subcones of Rn+ . Theorem 3.7.5 A ∈ Mm,n (R) is SP if and only if there exist proper polyhedral cone K1 ⊆ Rn+ and polyhedral cone K2 ⊆ int Rm + ∪ {0} such that AK1 = K2 . Proof Let A be SP. Consider the matrices X, Y in the proof of Theorem 3.7.4 such that A = YX −1 and let K1 = X Rn+
and K2 = Y Rn+ .
Because X is positive and invertible, K1 is simplicial and thus a proper cone in Rn+ . Because Y > 0, K2 is a polyhedral cone in int Rm + ∪ {0}. We also have that AK1 = YX −1 X R+ = K2 . For the converse, suppose there exists a proper cone K1 ⊆ Rn+ and a polyhedral cone K2 ⊆ int Rm + ∪ {0} such that AK1 = K2 . As K1 ⊆ Rn+ is proper, it is solid and so there is x ∈ intK1 ⊆ int Rn+ . It follows that Ax ∈ K2 \ {0} ⊆ int Rm + . Thus A is SP. Next, we turn our attention to K+ (A) and its dual, K+ (A)∗ . n Theorem 3.7.6 Let A ∈ Mm,n (R). Then K+ (A)∗ = AT (Rm + ) + R+ . n Proof Let y = AT u + v, where u ∈ Rm + , v ∈ R+ , and let x ∈ K+ (A). Then x ≥ 0 and Ax ≥ 0. We have xT y = xT AT u + v = (Ax)T u + xT v ≥ 0, n ∗ ∗ so that AT (Rm + ) + R+ ⊆ K+ (A) . Conversely, let y ∈ K+ (A) . Suppose that m n y∈ / AT (R+ )+R+ . Then by Theorem 3.7.3, there exists a vector p and a number α such that n pT y < α ≤ pT AT u + v for all u ∈ Rm + and v ∈ R+ .
By setting u = 0 and v = 0, we then have α ≤ 0. Replacing u by tu and v by tv, for t > 0, one has n α ≤ tpT AT u + v for all u ∈ Rm + and v ∈ R+ . Then α n ≤ pT AT u + v for all u ∈ Rm + and v ∈ R+ . t Letting t → ∞, we have n pT y < α ≤ 0 ≤ pT AT u + v for all u ∈ Rm + and v ∈ R+ .
42
Semipositive Matrices
By setting u = 0 and v = 0 separately, we obtain pT v ≥ for all v ≥ 0 and (Ap)T u ≥ 0 for all u ≥ 0, showing that p ≥ 0 and Ap ≥ 0, i.e., p ∈ K+ (A). Because pT y < 0, we arrive at a contradiction to y ∈ K+ (A)∗ , completing the proof of the reverse inclusion. Corollary 3.7.7 Let A ∈ Mm,n (R) be SP. Then K+ (A) is a proper polyhedral cone in Rn . Proof First, K+ (A) is clearly a convex set that is closed under nonnegative scaling. Because A is semipositive, K+ (A) contains a positive vector. That is, K+ (A) is a nontrivial cone in Rn . Let now x ∈ int Rn+ such that Ax ∈ int Rm + and let X, Y, K1 , K2 as in the proof of Theorem 3.7.5, so that AK1 = K2 . Hence, K1 is a simplicial and consequently a proper polyhedral cone that is contained in K+ (A). It follows that K+ (A) contains a solid cone and therefore K+ (A) is solid. Also, K+ (A) ⊆ Rn+ and consequently K is pointed. Last, K+ (A) is clearly a closed set. Thus, K+ (A) is proper cone in Rn . Next, recall that any nonempty subset of Rn is a polyhedral cone if and only if its dual set is a polyhedral cone; see e.g., [BP94, chapter 1, theorem (2.5)(c)]. By Theorem 3.7.6, K+ (A)∗ = [AT In ] Rn+ , that is, K+ (A)∗ is the cone generated by the n columns of AT and the columns of the n-by-n identity matrix. It follows that KA∗ , and thus K+ (A), are polyhedral cones. In what follows, we determine a necessary and sufficient condition for K+ (A) to be self-dual. Corollary 3.7.8 Let A ∈ Mm,n (R) be SP. Then the cone K+ (A) is acute. K+ (A) is obtuse if and only if A ≥ 0. K+ (A) is a self-dual cone if and only if A ≥ 0. Proof Let x ∈ K+ (A) be fixed and y ∈ K+ (A) be arbitrary. Then x ≥ 0 and y ≥ 0 so that xT y ≥ 0, showing that x ∈ K+ (A)∗ . Thus K+ (A) is an acute cone. Next, let A ≥ 0. Then K+ (A) = Rn+ and so K+ (A)∗ = Rn+ . Thus K+ (A)∗ ⊆ K+ (A), i.e., K+ (A) is obtuse. Conversely, let K+ (A) be obtuse, i.e., K+ (A)∗ ⊆ K+ (A). Then n n AT (Rm + ) + R+ ⊆ K+ (A) ⊆ R+ , n T so that for every u ∈ Rm + and for every v ∈ R+ , one has A u+v ≥ 0. By setting m T v = 0, we then have A u ≥ 0 for every u ∈ R+ . This means that AT ≥ 0, i.e., A ≥ 0. This proves the second statement.
3.7 Geometric Mapping Properties of SP Matrices
43
In [Tsa16, remark 4.6] it is remarked that if a matrix B maps K+ (A) into the nonnegative orthant, then the inclusion K+ (A) ⊆ K+ (B) holds. In the next result, among other sufficient conditions that guarantee such an inclusion, we consider the converse question. Theorem 3.7.9 Let A, B ∈ Mm,n (R) with Null(A) ⊆ Null(B). Consider the following statements: (a) (b) (c) (d) (e)
BA† ≥ 0. Ax ≥ 0 ⇒ Bx ≥ 0. B(K+ (A)) ⊆ Rn+ . K+ (A) ⊆ K+ (B). There exists W ≥ 0 such that B = WA.
Then (a) ⇒ (b) ⇒ (c) ⇒ (d) ⇒ (e). Suppose further, that AA† ≥ 0. Then all the statements are equivalent. Proof (a) ⇒ (b) Let y = Ax ≥ 0. Then x = A† y+z, for some z ∈ Null(A) ⊆ Null(B). Thus, Bx = BA† y ≥ 0 because BA† ≥ 0. (b) ⇒ (c) Let x ∈ K+ (A) so that x ≥ 0 and Ax ≥ 0. Then Bx ≥ 0 and so (c) holds. (c) ⇒ (d) Let x ∈ K+ (A) so that x ≥ 0. Then Bx ≥ 0 and so x ∈ K+ (B). (d) ⇒ (e) Let K+ (A) ⊆ K+ (B). Then (K+ (B))∗ ⊆ (K+ (A))∗ so that by Theorem 3.7.6, we have the following inclusion: n T m n BT (Rm + ) + R+ ⊆ A (R+ ) + R+ . m T T In particular, for every u ∈ Rm + , there exists v ∈ R+ such that B u = A v. By m substituting the standard basis elements of R+ for the vector u, it then follows that there exists a matrix V such that V ≥ 0 and BT = AT V. By setting V = W T , we obtain (e). Let us now assume that AA† ≥ 0. Then BA† = WAA† ≥ 0, showing that (e) ⇒ (a) and thus all the statements are equivalent.
There is a nice relationship between K+ (A) and K+ (A−1 ) when A is invertible. We obtain this as a consequence of the next general result involving the Moore–Penrose inverse. Theorem 3.7.10 Let A ∈ Mn (R) be range symmetric and AA† ≥ 0. Then A(K+ (A)) + Null AT = {y : y ∈ Rn+ + Null AT and A† y ≥ 0}. T Proof Because A is range symmetric, A† commutes withA. Also, Null A = † T Null(A ). Let y = Ax + z, where x ∈ K+ (A) and z ∈ Null A . Then x ≥ 0 and
44
Semipositive Matrices Ax ≥ 0 so that one has y ∈ Rn+ + Null AT . Also, A† y = A† Ax = AA† x ≥ 0, proving the inclusion A(K+ (A)) + Null AT ⊆ {y : y ∈ Rn+ + Null AT and A† y ≥ 0}. On the other hand, suppose that y = u + v, where u ≥ 0 and v ∈ Null AT . Also, let A† y ≥ 0. On letting w = A† u = A† y ≥ 0, we have that u = Aw + z for some z ∈ Null AT . Also, Aw = AA† u = A† Au ≥ 0, so that w ∈ K+ (A). We then have y = u + v = Aw + (z + v) ∈ A(K+ (A)) + Null AT , proving the inclusion in the reverse direction. The following is an easy consequence of the result above, applied to an invertible matrix. Corollary 3.7.11 Let A ∈ Mn (R) be invertible. Then A(K+ (A)) = K+ (A−1 ). In the next result, we present a class of matrices that map the cone K+ (A) into the cone K+ (B). Lemma 3.7.12 Let A, B ∈ Mm,n (R) such that R(A) ⊆ R(B) and B† ≥ 0. Then B† A(K+ (A)) ⊆ K+ (B). In particular, if A† ≥ 0, then the cone K+ (A) is invariant under A† A. Proof Let x ∈ K+ (A) so that x ≥ 0 and Ax ≥ 0. Set y = B† Ax. Then y ≥ 0 and By = BB† Ax = Ax ≥ 0, where we have made use of the fact that BB† A = A because R(A) ⊆ R(B). The second part is a simple consequence of the first part.
−1
Remark 3.7.13 Let A = 10 00 so that K+ (A) = Rn+ . Let B = −1 0 0 .
0 0 = [2 0]T . Then Then R(A) ⊆ R(B). Note that B† = 12 −1 0. Let x −1 0 x0 ∈ K+ (A). However, B† Ax0 = [−1 − 1]T 0. Hence B† A(K+ (A)) K+ (B), showing that the condition B† ≥ 0 is indispensable in the result above.
3.7 Geometric Mapping Properties of SP Matrices 45 1 0 0 0 0 0 Let A = 1 0 0 so that K+ (A) = Rn+ . Let B = 13 0 1 0 . Then [0 1 0]T ∈ 010 110 2 −1 1 R(A) and does not belong to R(B). It may be verified that B† = −1 2 1 . 0
0 0
If x0 = [1 0 0]T , then x0 ∈ K+ (A) but B† Ax0 0, so that B† A(K+ (A)) K+ (B). This shows that the range inclusion condition R(A) ⊆ R(B) cannot be removed. We continue with a representation of the semipositive cone K+ (A), when AA† ≥ 0. Theorem 3.7.14 Let A ∈ Mm,n (R) with AA† ≥ 0. Then K+ (A) = Rn+ ∩ (A† Rm + + Null(A)). Proof Let x ∈ K+ (A) and y = Ax ∈ Rm + . Then x ≥ 0 and, by Lemma 3.7.2, one has x = A† y + z for some z ∈ Null(A), proving one-way inclusion. If m † x ∈ Rn+ ∩ (A† Rm + + Null(A)), then x ≥ 0 and x = A u + v for some u ∈ R+ and m v ∈ Null(A). Then, because AA† ≥ 0, one has Ax = AA† u ∈ AA† Rm + ⊆ R+ . Thus x ∈ K+ (A), proving the reverse inclusion. Corollary 3.7.15 Let A ∈ Rn be invertible. Then K+ (A) = Rn+ ∩ A−1 (Rn+ ). 2 1 −1
1 T † † Remark 3.7.16 Let A = −2 −1 1 . Then A = 12 A so that AA = 1 −1
1 6 −1 1 0. Set x0 = A† e1 + v, where e1 = [1 0]T and v = 12 [0 1 1]T ∈ Null(A). Then x0 ∈ R3+ ∩ (A† R2+ + Null(A)). However, Ax0 = 12 [1 − 1]T 0. This shows that the representation in Theorem 3.7.14 is not valid without the assumption that AA† ≥ 0.
3.7.3 Intervals of Semipositive Matrices Entry-wise nonnegativity induces a natural partial order among matrices of the same size, namely, Y ≥ X if and only if Y − X ≥ 0. When Y ≥ X, one can consider the matrix interval [X, Y] that comprises all matrices Z such that X ≤ Z ≤ Y. We recall a result on intervals of inverse positive matrices; see Rohn [Roh87, theorem 1]. Theorem 3.7.17 A ∈ Mn (R) and B ∈ Mn (R) are inverse positive if and only if each X ∈ [A, B] is inverse positive.
46
Semipositive Matrices
It is easy to observe that if A ≤ C and if A is semipositive, then C is semipositive. This may be paraphrased by the statement that if A ≤ C, then K+ (A) ⊆ K+ (C). It prompts us to ask the following question: Let A be semipositive. What are the possible choices for semipositive matrices C ≤ A? In this regard, we will consider diagonal and rank-one perturbations of a semipositive matrix A of the form A − D or A − uvT , where D is a nonnegative diagonal matrix and u, v are nonnegative vectors. A matrix interval [C, E] will be referred to as a semipositive interval if each matrix X such that C ≤ X ≤ E is semipositive. A similar nomenclature is adopted for minimal semipositivity. Theorem 3.7.18 Let A ∈ Mn (R) be semipositive and let D ∈ Mn (R) be a nonnegative diagonal matrix with diagonal entries dj , j = 1, 2, . . . n. Then [A − D, A] is a semipositive interval if and only if there exists positive vector x ∈ K+ (A) such that for each j = 1, 2, . . . , n, 0 ≤ dj
0 and so [A − D, A] is a semipositive interval. Conversely, if D is a nonnegative diagonal matrix with diagonal entries dj and [A − D, A] is a semipositive interval, then there exists positive x such that Ax > Dx ≥ 0. It follows that x ∈ K+ (A) and 0 ≤ dj < (Ax)j /xj . Theorem 3.7.19 Let A ∈ Mn (R) be semipositive and let D ∈ Mn (R) be a nonnegative diagonal matrix with diagonal entries dj , j = 1, 2, . . . n. Let α = max{xT Ax : x2 = 1, x ∈ K+ (A)}
and
δ = min{dj : j = 1, 2, . . . n}.
If [A − D, A] is a semipositive interval, then δ < α. Proof First observe that because A is a semipositive matrix, the quantity α is a well-defined positive number because the maximum is taken over the intersection of the convex cone K+ (A) and the numerical range of A, which is a compact set.
3.7 Geometric Mapping Properties of SP Matrices
47
If [A − D, A] is semipositive, then there exists x > 0 with x2 = 1 and (A − D)x > 0. That is, Ax > Dx ≥ 0 and so α ≥ xT Ax > xT Dx ≥ δ because the numerical range of D is contained in [δ, ∞). Remark 3.7.20 If A is semipositive, the quantity α in Theorem 3.7.19 is positive, which implies that the numerical range of A intersects the positive real A+AT > 0. The converse is not true, even if one assumes line and so λmax 2 that A + AT is a nonnegative matrix, as can be seen by the non-semipositive
matrix A = −11 01 . Theorem 3.7.21 Let A ∈ Mn (R) be semipositive and let u, v ∈ Rn+ . If [A − uvT , A] is a semipositive interval, then there exists x ∈ K+ (A) such that T T u x v x < xT Ax. then there exists x > 0 such that Proof If[A − uvT , A] is semipositive, T T A − uv x > 0. That is, Ax > v x u ≥ 0, and so x ∈ K+ (A) with xT Ax > vT x uT x . Theorem 3.7.22 Let A ∈ Mn (R) be semipositive and let e denote the all-ones vector in Rn . (a) Let A† ≥ 0, u, v ∈ Rn+ , u ∈ R(A) such that vT A† u < 1. Then [A − uvT , A] is a semipositive interval. (b) Let A† ≥ 0 and set B = A† = [bij ]. Suppose that ni,j=1 bij < 1 and e ∈ R(A). Then [A − eeT , A] is a semipositive interval. (c) Let A be minimally semipositive and set C = A−1 = [cij ]. Suppose that n T i,j=1 cij < 1. Then A − ee is a minimally semipositive matrix so that T [A − ee , A] is a minimal semipositive interval. Proof (a) Let x = A† u ≥ 0. Note that because u ∈ R(A), we have AA† u = u. Then A − uvT x = AA† u − vT A† u u = u − vT A† u u = 1 − vT A† u u ≥ 0, i.e., A−uvT is semipositive. It follows that [A−uvT , A] is a semipositive interval.
48
Semipositive Matrices
(b) Set u = v = e. Then u, v ≥ 0, u ∈ R(A) and 1 − vT A† u = 1 − eT A† e = 1 −
n
bij > 0.
i,j=1
The proof now follows from part (a). (c) Because A is minimally semipositive, it follows that A−1 exists and −1 A ≥ 0. By the Sherman–Woodbury formula (e.g., see [HJ91, p. 19]) for the inverse of a rank-one perturbation, we have −1 1 = A−1 + A−1 eeT A−1 , A − eeT μ −1 ≥ 0 (so that A − eeT where μ = 1 − eT A−1 e > 0. Clearly, A − eeT is also minimally semipositive). By Theorem 3.7.17, [A − eeT , A] is inverse positive. In particular, every such matrix is minimally semipositive, completing the proof.
3.8 Strictly Semimonotone Matrices Semimonotone matrices A ∈ Mn (R) are defined as those matrices for which the operation Ax does not negate all the positive entries of any nonzero, entry-wise nonnegative vector x. If Ax preserves at least one positive entry for every such x, we refer to A as strictly semimonotone. In that respect, strictly semimonotone matrices generalize the class of P-matrices to be studied in Chapter 4, which preserve the sign of a nonzero entry of every nonzero x ∈ Rn . In fact, it follows that every P-matrix is strictly semimonotone. More important, strictly semimonotone matrices are intimately related to matrix semipositivity, prompting us to review their properties in this section. For instance, as we will see below, A is strictly semimonotone if and only if A itself and all of its proper principal submatrices are semipositive. We continue with a formal discussion of strictly semimonotone matrices. Additional results, history, and details can be found in [CPS92] and [TW19]. Definition 3.8.1 A matrix A ∈ Mn (R) is strictly semimonotone if for each nonzero, nonnegative x ∈ Rn , there exists k ∈ {1, 2, . . . , n} such that xk > 0 and (Ax)k > 0. The following simple observations can be made immediately. First, if ek is the k-th column of the n × n identity matrix and A ∈ Mn (R) is strictly
3.8 Strictly Semimonotone Matrices
49
semimonotone, then the k-th entry of Aek must be positive, that is, every strictly semimonotone matrix must have positive diagonal entries. Moreover, by letting x be any vector whose entries indexed by some index set α ⊆ {1, 2, . . . , n} are positive and whose other entries are zero, we can see that (Ax)k > 0 for some k ∈ α. Thus, A[α]x[α] has a positive entry, and so A[α] is strictly semimonotone. Hence, every principal submatrix of a strictly semimonotone matrix must be semimonotone. It is then easy to see that the following theorem holds. Theorem 3.8.2 A matrix A ∈ Mn (R) is strictly semimonotone if and only if every proper principal submatrix of A is strictly semimonotone, and for every x > 0, there exists k ∈ n such that (Ax)k > 0. Some important facts about strictly semimonotone matrices are summarized in the next theorem. Theorem 3.8.3 (a) (b) (c) (d)
Every square positive matrix is strictly semimonotone. Every P-matrix is strictly semimonotone. All strictly copositive matrices are strictly semimonotone. A ∈ Mn (R) is strictly semimonotone if and only if A and all its proper principal submatrices are semipositive. (e) A ∈ Mn (R) is strictly semimonotone if and only if AT is strictly semimonotone. (f) A ∈ Mn (R) is strictly semimonotone if and only if the Linear Complementarity Problem, LCP(q, A) (see Section 4.9.1) has a unique solution for every q ≥ 0.
Proof To show (a), let A be a positive matrix. For any 0 = x ≥ 0, if xk > 0, then because A does not contain any nonpositive entries, (Ax)k > 0. Next, we show (b). If A is a P-matrix, then for each x = 0, there exists a k such that xk (Ax)k > 0 (see Theorem 4.3.4). Thus, because this holds for every 0 = x ≥ 0, A is strictly semimonotone. To show (c), suppose A is a strictly copositive, then for any 0 = x ≥ 0, we have that xT Ax > 0. Thus, there must be an index k such that xk > 0 and (Ax)k > 0, and so A is strictly semimonotone. For a proofs of clauses (d)–(f), see [CPS92]. The proofs of the following result are straightforward. Theorem 3.8.4 Let A ∈ Mn (R) be a strictly semimonotone matrix. If B ∈ Mn (R) is a nonnegative matrix, then A + B is strictly semimonotone.
50
Semipositive Matrices
Theorem 3.8.5 Suppose that all proper principal submatrices of A ∈ Mn (R) are strictly semimonotone. If A has a row of nonnegative entries or a column of nonnegative entries, then A is strictly semimonotone. Proof We will show that if A ∈ Mn (R) has all proper principal submatrices strictly semimonotone and if it has a column of nonnegative entries, then A is strictly semimonotone. The case when A has a row of nonnegative entries would then follow from the Theorem 3.8.3. Let a1 , a2 , . . . , an ∈ Rn be the columns of A and suppose, without loss of generality, that a1 ≥ 0. Suppose A is not strictly semimonotone. Then, because all proper principal submatrices arestrictly semimonotone, there exists an x = [x1 x2 . . . , xn ]T > 0 such that Ax ≤ 0. Let y = [0 x2 x3 . . . , xn ]T and note that there must exist k ∈ {2, 3, . . . , n} such that (Ay)k > 0, because the principal submatrix A[{2, 3, . . . , n}] is strictly semimonotone. However, because Ax ≤ 0 and x1 a1 ≥ 0, we have ⎡ ⎤ 0 ⎢x2 ⎥ ⎥
⎢ ⎢ ⎥ Ay = a1 a2 a3 · · · an ⎢x3 ⎥ ⎢.⎥ ⎣ .. ⎦ xn = 0a1 + x2 a2 + x3 a3 + · · · + xn an = Ax − x1 a1 ≤ 0, a contradiction. The following two results are simple consequences of the definition of strict semimonotonicity. Theorem 3.8.6 Let P ∈ Mn (R) be a permutation matrix. Then A ∈ Mn (R) is strictly semimonotone if and only if PAPT is strictly semimonotone. Theorem 3.8.7 Let A ∈ Mn (R) and let D = diag (d1 , d2 , . . . , dn ) be a diagonal matrix with di > 0. Then the following statements are equivalent. (a) A is strictly semimonotone. (b) DA is strictly semimonotone. (c) AD is strictly semimonotone. Theorem 3.8.8 Let A ∈ Mn (R) be a block upper triangular matrix with square diagonal blocks. Then A is strictly semimonotone if and only if each diagonal block is strictly semimonotone.
3.8 Strictly Semimonotone Matrices
51
Proof (⇐) Suppose A is a strictly semimonotone block upper triangular matrix with square diagonal blocks. Each such block is a principal submatrix of A and so, by Theorem 3.8.2, it is also strictly semimonotone. (⇒) Let A be a block upper triangular matrix with square diagonal blocks that are strictly semimonotone. We will proceed by induction on the number of diagonal blocks. The base case is when A is a two-block upper triangular matrix with both diagonal blocks strictly semimonotone; i.e., A = A01 AB2 ∈ Mn (R), where A1 ∈ Mr (R) and A2 ∈ Mn−r (R) are strictly semimonotone. Suppose A is not strictly semimonotone. Take a set α ⊆ {1, 2, . . . , n} of minimum cardinality for which the principal submatrix A[α] is not strictly semimonotone. Note that we must have α = α1 ∪ α2 where α1 ⊆ {1, 2, . . . , r} is nonempty and α2 ⊆ {r + 1, r + 2, . . . , n} is nonempty; otherwise, the principal submatrix A[α] would be strictly semimonotone. Then, B˜ A[α1 ] , A[α] = 0 A[α2 ] which is also a two-block upper triangular matrix with the diagonal blocks strictly semimonotone. Now, the minimum cardinality of α implies that there must be an x > 0 such that A[α]x ≤ 0. However, this would imply that A[α2 ]z ≤ 0 for some positive vector z, a contradiction because A[α2 ] is strictly semimonotone. Hence, A must be strictly semimonotone and the base case holds. Now suppose that every block upper triangular matrix with k diagonal blocks all of which are strictly semimonotone is strictly semimonotone. Consider a matrix A with k + 1 diagonal blocks. We know that the submatrix of A containing the first k diagonal blocks is strictly semimonotone by the inductive hypothesis, so we can consider this submatrix to be a single diagonal block. Then A consists of two diagonal blocks, which are each strictly semimonotone. Thus, A is strictly semimonotone. Theorem 3.8.9 Suppose A ∈ Mn (R) with all proper principal submatrices being strictly semimonotone. Then A is strictly semimonotone if and only if for all diagonal matrices D ∈ Mn (R) with nonnegative diagonal entries, A + D does not have a positive nullvector. Proof (⇒) We will prove the contrapositive. Suppose there exists a diagonal matrix D with nonnegative diagonal entries such that A + D has a positive null vector.
52
Semipositive Matrices
Then there exists an x > 0 such that (A + D)x = 0. So Ax = −Dx ≤ 0. Thus, A is not strictly semimonotone. (⇐) Suppose A is not strictly semimonotone. We want to show that there exists a diagonal matrix D with nonnegative diagonal entries such that A+D has a positive null vector. Because A is not strictly semimonotone (but all proper principal submatrices are), there exists an x > 0 such that Ax ≤ 0. Now, let D = [dij ] be a diagonal matrix where dii = −
(Ax)i . xi
Note D has nonnegative diagonal entries and Dx = −Ax. Thus, (A + D)x = 0. Because x > 0, A + D has a positive null vector. Let [0, I] denote all n × n diagonal matrices whose diagonal entries are in [0, 1]. Theorem 3.8.10 A matrix A ∈ Mn (R) is strictly semimonotone if and only if for all T ∈ [0, I], T + (I − T)A has no null vector x such that 0 = x ≥ 0. Proof (⇒) The proof is by contradiction. Suppose that A is strictly semimonotone and suppose that for some T ∈ [0, I] there exists a 0 = x ≥ 0 such that (T + (I − T)A)x = 0. Let y = Ax. Because A is strictly semimonotone, there exists a k such that xk > 0 and yk = (Ax)k > 0. Notice that (T + (I − T)A)x = 0
⇒
tkk xk + (1 − tkk )(Ax)k = 0
⇒
tkk (xk − yk ) = −yk .
Now, if xk = yk , then 0 = yk , a contradiction because yk > 0. Thus, xk −yk = 0 and so yk tkk = − . xk − yk Now suppose xk − yk > 0. In this case, because tkk ≥ 0, we get that yk − ≥ 0 ⇒ −yk ≥ 0 ⇒ yk ≤ 0, xk − yk a contradiction because yk > 0. Finally, suppose that xk − yk < 0. In this case, because tkk ≤ 1, we get that yk − ≤ 1 ⇒ −yk ≥ xk − yk ⇒ 0 ≥ xk , xk − yk
3.8 Strictly Semimonotone Matrices
53
a contradiction because xk > 0. Thus, no such null vector can exist. (⇐) We will prove the contrapositive. Suppose that A is not strictly semimonotone. Then there exists a 0 = x ≥ 0 such that xk (Ax)k ≤ 0 for each k. Let y = Ax. Note, if xk > 0, we must have yk = 0 or yk < 0. If xk = 0, then we can have yk = 0, yk < 0, or yk > 0. Now, we will find a T ∈ [0, I] such that (T + (I − T)A)x = 0. Let ⎧ ⎪ 0 if xi > 0 and yi = 0 ⎪ ⎪ ⎪ yi ⎪ ⎪ ⎪ if xi > 0 and yi < 0 − ⎪ ⎨ xi − yi tii = 0 if xi = 0 = yi ⎪ ⎪ ⎪ ⎪ ⎪ 1 if xi = 0 and yi > 0 ⎪ ⎪ ⎪ ⎩1 if xi = 0 and yi < 0. yi ≤ 1. Suppose not. If Notice that if xi > 0 and yi < 0, then 0 ≤ − xi −y i yi < 0, then because x − y > 0, −y < 0 and so y − xi −y i i i i ≥ 0, a contradiction. i yi If − xi −y > 1, then −yi > xi − yi , and so xi < 0, a contradiction. Hence, we i see that 0 ≤ tii ≤ 1. Also, notice that
((I − T)Ax)i = (1 − tii )yi = −tii xi = −(Tx)i . Hence, (T + (I − T)A)x = 0. For proofs of the following results, we refer the reader to [TW19]. Theorem 3.8.11 Given the spectrum σ of any real n × n matrix with positive trace, there exists a strictly semimonotone matrix A ∈ Mn (R) such that σ (A) = σ . A signature matrix S ∈ Mn (R) is a diagonal matrix whose diagonal entries belong to {−1, 1}. Theorem 3.8.12 [cf. Theorem 4.3.8] The following are equivalent for A ∈ Mn (R): (a) A is a P-matrix. (b) SAS is semipositive for all signature matrices S. (c) SAS is strictly semimonotone for all signature matrices S. In [TW19], the term almost (strictly) semimonotone A ∈ Mn (R) is introduced when all proper principal submatrices of A are (strictly) semimonotone
54
Semipositive Matrices
and there exists an x > 0 such that Ax < 0 (Ax ≤ 0). Then the following results are shown. Theorem 3.8.13 If A ∈ Mn (R) is almost semimonotone, then −A is MSP Theorem 3.8.14 Suppose A ∈ Mn (R) (n > 1) is almost strictly semimonotone and not semimonotone. Then A−1 exists and A−1 < 0.
4 P-Matrices
4.1 Introduction Recall that A ∈ Mn (C) is called a P-matrix (A ∈ P), if all its principal minors are positive, i.e., det A[α] > 0 for all α ⊆ n. The P-matrices encompass such notable classes as the positive definite matrices (PD), the (inverse) M-matrices (M, IM), the totally positive matrices (TP), as well as the H+ -matrices. As we will see, P-matrices are semipositive (SP). The study of P-matrices originated in the context of these classes in the work of Ostrowski, Fan, Koteljanskii, Gantmacher and Krein, Taussky, Fiedler and Pták, Tucker, as well as Gale and Nikaido. Some classical references to this early work include [FP62, FP66, GN65, GK1935, Tau58]. The first systematic study of P-matrices appears in the work of Fiedler and Pták [FP66]. Since then, the class P and its subclasses have proven a fruitful research subject, judged by the attention received in the matrix theory community and the continuing interest generated by the applications of P-matrices in the mathematical sciences. P-matrices play an important role in a wide range of applications, including the linear complementarity problem, global univalence of maps, linear differential inclusion problems, interval matrices, and computational complexity. Some of these applications are discussed in this chapter; see also [BEFB94, CPS92, BP94, Parth96]. Of particular concern is the ability to decide as efficiently as possible whether an n-by-n matrix is in P or not, referred to as the P-problem. It has received attention largely due to its inherent computational complexity. Motivated by the P-problem and questions about the spectra of P-matrices, in this chapter we address the need to construct (generic and special) P-matrices for purposes of experimentation, as well as theoretical and algorithmic development. To this
56
P-Matrices
end, in this chapter we provide a review of (i) basic properties of P-matrices and operations that preserve them, (ii) techniques to generate special and generic P-matrices, (iii) numerical methods to detect P-matrices, and (iv) manifestations of P-matrices in various mathematical contexts. This approach affords us the opportunity to review well-known results, some of them presented under new light, as well as to bring forth some less known and some newer results on P-matrices. Other comprehensive treatments of P-matrix theory can be found in [BP94, Fie86, HJ91]. This chapter unfolds as follows: Section 4.2 contains preliminary material specific to this chapter. Section 4.3 reviews basic properties and characterizations of P-matrices, including mappings of P into itself. Section 4.4 contains facts and questions about the eigenvalues of P-matrices. Section 4.5 discusses P-matrices with additional properties, including a closer examination of a special subclass of the P-matrices (mimes) that encompasses the M-matrices and their inverses. Section 4.6 provides an algorithmic resolution of the general P-problem, as well as approaches suitable for the detection of special subclasses of the P-matrices. Section 4.7 combines results from previous sections to provide a method that can generate every P-matrix. Section 4.8 concerns the topological closure of P, specifically, the adaptation of results for P to the class of matrices with nonnegative principal minors. Section 4.9 reviews manifestations and applications of P-matrices in various mathematical contexts. The chapter concludes with Section 4.10 in which further P-matrix considerations, generalizations, and related facts are collected, as well as comments on work not covered in this book.
4.2 Notation, Definitions, and Preliminaries 4.2.1 Matrix Transforms Definition 4.2.1 Given a nonempty α ⊆ n and provided that A[α] is invertible, we define the principal pivot transform of A ∈ Mn (C) relative to α as the matrix ppt (A, α) obtained from A by replacing A[α]
by A[α]−1 ,
A[α, α] by − A[α]−1 A[α, α],
A[α, α] by A[α, α]A[α]−1
and A[α] by A/A[α].
By convention, ppt (A, ∅) = A. To illustrate this definition, when α = {1, 2, . . . , k} (0 < k < n), we have that A[α]−1 −A[α]−1 A[α, α] . ppt (A, α) = A/A[α] A[α, α]A[α]−1
4.2 Notation, Definitions, and Preliminaries
57
The effect of applying a principal pivot transform is as follows. Suppose that A ∈ Mn (C) is partitioned in blocks as A11 A12 (4.2.1) A= A21 A22 and further suppose that A11 is invertible. Consider then the principal pivot transform relative to the leading block (A11 )−1 −(A11 )−1 A12 B= . (4.2.2) A21 (A11 )−1 A22 − A21 (A11 )−1 A12
T
T The matrices A and B are related as follows: If x = x1T x2T and y = yT1 yT2 in Cn are partitioned conformally to A, then x y y x A 1 = 1 if and only if B 1 = 1 . x2 y2 x2 y2 For a review of the properties and applications of the principal pivot transform, see [Tsa00]. Definition 4.2.2 For A ∈ Mn (C) with −1 ∈ σ (A), the fractional linear map FA = (I + A)−1 (I − A) is called the Cayley transform of A. The map A → FA is an involution, namely, A = (I + FA )−1 (I − FA ). 4.2.2 More Matrix Classes Below we recall and introduce some matrix classes referenced in this chapter. • We let PM denote the class of matrices all of whose positive integer powers are in P. • A positive stable matrix A ∈ Mn (C) is a matrix all of whose eigenvalues lie in the open right-half plane. • A = [aij ] ∈ Mn (C) is row diagonally dominant if for all i ∈ n, |aii | > |aij |. j =i
Note that in our terminology the diagonal dominance is strict. Due to Gershgorin’s Theorem (see Chapter 1), row diagonally dominant matrices with positive diagonal entries are positive stable.
58
P-Matrices
• We call A = [aij ] ∈ Mn (R) a B-matrix if for each i ∈ n and all j ∈ n\{i}, n k=1
aik > 0
and
n 1 aik > aij ; n k=1
namely, the row sums of a B-matrix are positive, and the row averages dominate the off-diagonal entries. The properties and applications of B-matrices are studied in [Pen01]. • A Z-matrix (Z) is a square matrix all of whose off-diagonal entries are nonpositive. An invertible M-matrix (M) is a positive stable Z-matrix or, equivalently, a semipositive Z-matrix. An inverse M-matrix (IM) is the inverse of an M-matrix. An MMA-matrix is a matrix all of whose positive integer powers are irreducible M-matrices (see Section 4.2.3 for the definition of irreducibility). An M-matrix A can be written as A = sI − B, where B ≥ 0 and s ≥ ρ(B). The Perron–Frobenius Theorem applied to B and BT implies that A possesses right and left nonnegative eigenvectors x, y, respectively, corresponding to the eigenvalue s − ρ(B). We refer to x and y as the (right) Perron eigenvector and the left Perron eigenvector of A, respectively. When B is also irreducible, s−ρ(B) is a simple eigenvalue of A, and we may take x > 0 and y > 0. • The comparison matrix of A = [aij ] ∈ Mn (C), denoted by M(A) = [bij ], is defined by −|aij | if i = j bij = |aii | otherwise. • We call A ∈ Mn (C) an H-matrix if M(A) is an M-matrix. • The following class of matrices was defined by Pang in [Pan79a, Pan79b], extending notions introduced in the work of Mangasarian and Dantzig. Such matrices were called hidden Minkowski matrices in [Pan79b]; we adopt a different name for them, indicative of their matricial nature and origin. Definition 4.2.3 Consider a matrix A ∈ Mn (R) of the form A = (s1 I − P1 ) (s2 I − P2 )−1 , where s1 , s2 ∈ R , P1 , P2 ≥ 0, such that for some vector u ≥ 0, P1 u < s1 u and P2 u < s2 u. We call A a mime, which is an acronym for M-matrix and Inverse M-matrix Extension, because the class of mimes contains the M-matrices (by taking
4.2 Notation, Definitions, and Preliminaries
59
P2 = 0, s2 = 1) and their inverses (by taking P1 = 0, s1 = 1). We refer to the nonnegative vector u above as a common semipositivity vector of A. • We conclude this section with the definition of a type of orthogonal matrix to be used in generating matrices all of whose powers are in P. An n-by-n Soules matrix R is an orthogonal matrix with columns {w1 , w2 , . . . , wn } such that w1 > 0 and R RT ≥ 0 for every
= diag(λ1 , λ2 , . . . , λn ) with λ1 ≥ λ2 ≥ · · · ≥ λn ≥ 0. Soules matrices can be constructed starting with an arbitrary positive vector w1 such that w1 2 = 1; for details see [ENN98, Sou83]. 4.2.3 Sign Patterns and Directed Graphs We call a diagonal matrix S whose diagonal entries belong to {−1, 1} a signature matrix. Note that S = S−1 ; thus we refer to SAS as a signature similarity of A. Matrix A ∈ Mn (R) is called sign nonsingular if every matrix with the same sign pattern as A (see Section 3.3) is nonsingular. The directed graph, D(A), of A = [aij ] ∈ Mn (C) consists of the set of vertices {1, . . . , n} and the set of directed edges (i, j) connecting vertex i to vertex j if and only if aij = 0. We say D(A) is strongly connected if any two distinct vertices i, j are connected by a path of edges (i, i1 ), (i1 , i2 ), . . . , (ik−1 , ik ), (ik , j). When D(A) is strongly connected, we refer to A as an irreducible matrix. A cycle of length k in D(A) consists of edges (i1 , i2 ), (i2 , i3 ), . . . , (ik−1 , ik ), (ik , i1 ), where the vertices i1 , i2 , . . . , ik are distinct. The nonzero diagonal entries of A correspond to cycles of length 1 in D(A). The signed directed graph of A ∈ Mn (R), S(A), is obtained from D(A) by labeling each edge (i, j) with the sign of aij . We define the sign of a cycle on the vertices {i1 , i2 , . . . , ik } as above to be the sign of the product ai1 i2 ai2 i3 · · · aik−1 ik aik i1 . We denote by Mnk (k ≤ n) the set of matrices A ∈ Mn (R) with nonzero diagonal entries such that the length of the longest cycle in D(A) is no more than k. For matrices in Mnk , we adopt the following notation: A ∈ Pkn if A ∈ Mnk is a P-matrix; A ∈ Snk if all the cycles in S(−A) are signed negatively.
60
P-Matrices
4.3 Basic Properties of P-Matrices A basic observation comes first. Observation 4.3.1 A block triangular matrix with square diagonal blocks is a P-matrix if and only if each diagonal block is a P-matrix. In particular, the direct sum of P-matrices is a P-matrix. Proof The result follows from the following two facts. The determinant of a block triangular matrix with square diagonal blocks is the product of the determinants of the diagonal blocks. Also, if A is a block triangular matrix with square diagonal blocks, all principal submatrices of A are block triangular matrices whose diagonal blocks are principal submatrices of A. We proceed with a review of transformations that map P into itself. Theorem 4.3.2 Let A ∈ Mn (C) be a P-matrix (A ∈ P). Then the following hold. (1) (2) (3) (4) (5) (6) (7) (8) (9) (10)
AT ∈ P. QAQT ∈ P for every permutation matrix Q. SAS ∈ P for every signature matrix S. DAE ∈ P for all diagonal matrices D, E such that DE has positive diagonal entries. A + D ∈ P for all diagonal matrices D with nonnegative diagonal entries. A[α] ∈ P for all nonempty α ⊆ n. A/A[α] ∈ P for all α ⊆ n. ppt (A, α) ∈ P for all α ⊆ n. In particular, when a = n, we obtain that ppt (A, n) = A−1 ∈ P. I + FA ∈ P and I − FA ∈ P. T I + (I − T) A ∈ P for all T ∈ [0, I].
Proof Clauses (1)–(4) and (6) are direct consequences of determinantal properties and the definition of a P-matrix. A (5) Notice that if A = [aij ] ∈ P, then ∂∂det aii = det A[{i}] > 0, that is, det A is an increasing function of the diagonal entries. Thus, as the diagonal entries of D are added in succession to the diagonal entries of A, the determinant of A and, similarly, all principal minors of A remain positive. (7) Because A ∈ P, A[α] is invertible and A−1 , up to a permutation similarity, has the block representation (see, e.g., [HJ13, (0.7.3) p. 18]) (A/A[α])−1 −A[α]−1 A[α, α](A/A[α])−1 . −(A/A[α])−1 A[α, α]A[α]−1 (A/A[α])−1
4.3 Basic Properties of P-Matrices
61
Therefore, every principal submatrix of A−1 is of the form (A/A[α])−1 for some α ⊆ n and has determinant det A[α]/ det A > 0. This shows that A−1 ∈ P and, in turn, that (A/A[α])−1 and thus A/A[α] are P-matrices for every α ⊆ n. (8) Let A ∈ P and consider first the case where α is a singleton; without loss of generality assume that α = {1}. Let B = ppt (A, α) = [bij ]. By definition, the principal submatrices of B that do not include entries from the first row coincide with the principal submatrices of A/A[α] and thus, by (7), have positive determinants. The principal submatrices of B that include entries from the first row of B are equal to the corresponding principal submatrices of the matrix B obtained from B using b11 = (A[α])−1 > 0 as the pivot and eliminating the nonzero entries below it. Notice that b11 b11 −b11 A[α, α] 1 0 −b11 A[α, α] = . B = A/A[α] −A[α, α] I A[α, α]b11 0 A[α] That is, B is itself a P-matrix for it is block upper triangular, and the diagonal blocks are P-matrices. It follows that all the principal minors of B are positive and thus B ∈ P. Next, consider the case α = {i1 , i2 , . . . , ik } ⊆ n with k ≥ 1. By our arguments so far, the sequence of matrices A0 = A, Aj = ppt (Aj−1 , {ij }), j = 1, 2, . . . , k is well defined and comprises P-matrices. Moreover, from the uniqueness of B = ppt (A, α) shown in [Tsa00, theorem 3.1], it follows that Ak = ppt (A, α) = B and thus B ∈ P. (9) A ∈ P is nonsingular and −1 ∈ σ (A); otherwise, (5) would be violated for D = I. Hence FA is well defined. It can also be verified that I + FA = 2(I + A)−1 and I − FA = 2(I + A−1 )−1 . As addition of positive diagonal matrices and inversion are operations that preserve P-matrices (see (5) and (8)), it follows that I − FA and I + FA are P-matrices. (10) Let A ∈ P and T = diag(t1 , . . . , tn ) ∈ [0, I]. Because T and I − T are diagonal, ti det (((I − T)A) [α]) . det(TI + (I − T)A) = α⊆n i ∈α
As ti ∈ [0, 1] and A ∈ P, all summands in this determinantal expansion are nonnegative. Unless T = 0, in which case TI + (I − T)A = A, one of these summands is positive. Hence det(A) > 0. The exact same argument can be applied to any principal submatrix of A, proving that A ∈ P.
62
P-Matrices
Remarks 4.3.3 The following comments pertain to Theorem 4.3.2. (i) It is straightforward to argue that each one of the clauses (1)–(5), (7), and (10) represents a necessary and sufficient condition that A be a P-matrix. (ii) By repeated application of (8), it follows that any one principal pivot transform of A is a P-matrix if and only if every principal pivot transform of A is a P-matrix; see [Tsa00]. (iii) A more extensive analysis of the relations among the Cayley transforms of P-matrices and other matrix positivity classes is undertaken in [FT02]. We continue with more basic facts and characterizations of P-matrices. First is the classical characterization of real P-matrices as those matrices that do not “reverse the sign of any nonzero real vector.” Theorem 4.3.4 If A ∈ Mn (R), then A ∈ P if and only if for each nonzero x ∈ Rn , there exists j ∈ n such that xj (Ax)j > 0. Proof Suppose that A ∈ Mn (R) and that there exists x ∈ Rn such that for all j ∈ n, xj (Ax)j ≤ 0. Then there exists positive diagonal matrix D such that Ax = −Dx, i.e., (A + D)x = 0. By Theorem 4.3.2 (5), A is not a P-matrix. This proves that when A ∈ Mn (C) is a P-matrix, then for every nonzero x ∈ Rn , there exists j ∈ n such that xj (Ax)j > 0. Suppose now that A ∈ Mn (R) and that for each nonzero x ∈ Rn , there exists j ∈ n such that xj (Ax)j > 0. Notice that the same holds for every principal submatrix A[α] of A, by simply considering vectors x such that x[α] = 0. Thus all the real eigenvalues of A[α] are positive, for all nonempty α ⊆ n. As complex eigenvalues come in conjugate pairs, it follows that all the principal minors of A are positive. Theorem 4.3.5 Let A ∈ Mn (R). Then the following are equivalent. (i) A ∈ P. (ii) Ax = 0 for any x ∈ Rn with no zero entries. (iii) For each nonzero x ∈ Rn , there exists diagonal matrix D such that xT DAx > 0. Proof Clearly (i) implies (ii) because A is invertible. To see that (ii) implies (i), if x ∈ Rn has no zero entries and Ax = 0, then by Theorem 4.3.4, A ∈ P. That (i) implies (iii), also follows from Theorem 4.3.4: Let x ∈ Rn be nonzero. Then, there exists j ∈ n such that xj (Ax)j > 0. Let D = diag(d1 , d2 , . . . , dn ) be a diagonal matrix with dk = 1 for all k = j. Then we can choose dj > 0 sufficiently large to have xT DAx > 0, showing (i) implies (iii). We can complete the proof by showing that not (ii) implies not (iii). Indeed, if Ax = 0 for some x ∈ Rn with nonzero entries, then xT DAx = 0 for all diagonal matrices D.
4.3 Basic Properties of P-Matrices
63
Theorem 4.3.6 Let A ∈ Mn (R) be a P-matrix. Then A is a semipositive matrix. Proof We will prove the contrapositive. By Theorem 1.4.1, if A is not semipositive, then there exists a nonzero x ≥ 0 such that AT x ≤ 0. Then by Theorem 4.3.4, AT is not a P-matrix and thus A is not a P-matrix. Not every semipositive matrix is a P-matrix as shown by the following simple counterexample. Example 4.3.7 The matrix A=
1 2 3 −1
maps the all-ones vector in R2 to a positive vector, but A is not a P-matrix. There is, however, a characterization of real P-matrices via semipositivity, first observed in [Al-No88]. It is provided next, along with a simpler proof. Theorem 4.3.8 A ∈ Mn (R) is a P-matrix if and only if for every signature matrix S ∈ Mn (R), SAS is semipositive. Proof Let A ∈ P and S ∈ Mn (R) be a signature matrix. By Theorem 4.3.2 (3), SAS ∈ P. If SAS is not semipositive, then by Theorem 1.4.1, there is a nonzero x ≥ 0 such that SAT Sx ≤ 0. Thus SAT S ∈ P and so SAS ∈ P, a contradiction that shows that SAS is semipositive. Conversely, suppose that for every signature matrix S ∈ Mn (R), SAS is semipositive. By Theorem 1.4.1, for every signature S and every nonzero x ≥ 0, SAT Sx = y ≤ 0. Because AT (Sx) = Sy, we have that Sx and Sy have at least one nonzero entry of the same sign. By Theorem 4.3.4, AT and thus A are P-matrices. We conclude this section with a characterization from [JT95] of real P-matrices in factored form. Recall that the matrix interval [0, I] contains all diagonal matrices with entries between 0 and 1, inclusive. Theorem 4.3.9 Let A = BC−1 ∈ Mn (R). Then A ∈ P if and only if the matrix TB + (I − T)C is invertible for every T ∈ [0, I]. Proof If A ∈ P, by Theorem 4.3.2 (10), TI + (I − T)BC−1 ∈ P for every T ∈ [0, I]. Thus TC + (I − T)B is invertible for every T ∈ [0, I]. For the converse, suppose TI + (I − T)BC−1 is invertible for every T ∈ [0, I] and, by way of contradiction, suppose BC−1 ∈ P. By Theorem 4.3.4, there exists nonzero x ∈ Rn such that yj xj ≤ 0 for every j ∈ n, where y = BC−1 x.
64
P-Matrices
Consider then T = diag(t1 , t2 , . . . , tn ), where for each j ∈ n, tj ∈ [0, 1] is selected so that tj xj + (1 − tj )yj = 0. It follows that Tx + (I − T)BC−1 x = 0, a contradiction that completes the proof. Corollary 4.3.10 Let A ∈ Mn (R). Then A ∈ P if and only if the matrix T + (I − T)A is invertible for every T ∈ [0, I]. Proof The result follows from Theorem 4.3.9 by taking B = I and C = A−1 .
4.4 The Eigenvalues of P-Matrices We begin with some basic observations about the eigenvalues of a P-matrix. Theorem 4.4.1 Let A ∈ Mn (C) be a P-matrix with characteristic polynomial f (t) = det(tI − A). Then the following hold. (i) f (t) is a monic polynomial with real coefficients that alternate in sign. (ii) The spectrum of A is closed under complex conjugation, i.e., σ (A) = σ (A). (iii) For each nonempty α ⊂ n, every real eigenvalue of A[α] is positive. Proof (i) The coefficient of tk (k = 0, 1, . . . , n − 1) in f (t) is (−1)n−k times the sum of all k × k principal minors of A. Thus, its sign is (−1)n−k . (ii) By (i), f (t) is a polynomial over the real numbers so its non-real roots come in complex conjugate pairs. (iii) If for some nonempty α ⊂ n, λ ≤ 0 were an eigenvalue of A[α], then A[α] + |λ|I would be a singular matrix, violating Theorem 4.3.2 (5) applied to the P-matrix A[α]. The above theorem leads to the following spectral characterization of a P-matrix. Theorem 4.4.2 Let A ∈ Mn (C) and its principal submatrices have real characteristic polynomials. Then A ∈ P if and only if every real eigenvalue of every principal submatrix of A is positive. Proof If A ∈ P so are its principal submatrices and thus their real eigenvalues are positive by Theorem 4.4.1 (iii). For the converse, let the real eigenvalues of A and its principal submatrices be positive. Then the principal minors of A are
4.4 The Eigenvalues of P-Matrices
65
Figure 4.1 The spectrum of a Q-matrix cannot intersect the shaded sector about the negative real axis.
positive because they are the products of their respective eigenvalues, which are either positive or come in complex conjugate pairs. Although the real eigenvalues of a P-matrix A ∈ Mn (C) are positive, other eigenvalues of A may lie in the left half-plane. In fact, as shown in [Her83], P-matrices may have all but one (when n is even), or all but two (when n is odd) of their eigenvalues in the left half-plane. There is, however, a notable restriction on the location of the eigenvalues of a P-matrix presented in Theorem 4.4.4 and illustrated in the Figure 4.1. It is attributed to Kellogg [Kel72], and it regards the class of matrices whose characteristic polynomials have real coefficients with alternating signs. Definition 4.4.3 Matrix A ∈ Mn (C) is called a Q-matrix if for each k ∈ n, the sum of all k × k principal minors of A is positive. Every P-matrix is indeed a Q-matrix. (This concept of a Q-matrix differs from the one used in the Linear Complementarity Problem literature; see Figure 4.1.) Theorem 4.4.4 Let A ∈ Mn (C) be a Q-matrix, and let Arg(z) ∈ (−π, π] denote the principal value of z ∈ C. Then π . (4.4.1) σ (A) ⊂ z ∈ C : | Arg(z) | < π − n Remark 4.4.5 A graph-theoretic refinement of the above result is shown in [JOTvdD93]: If A ∈ Pkn , where k < n, then π π < Arg(z) < π − . σ (A) ⊂ z ∈ C : − π + n−1 n−1 Some related refinements were conjectured in [HB83] and studied in [HJ86b] and [Fang89]. Recall that some of the better known subclasses of P, like the positive definite matrices and the totally positive matrices, are invariant under the
66
P-Matrices
taking of powers, and the eigenvalues of their members are positive numbers. Consequently, it is natural to ask: Question 4.4.6 Are the eigenvalues of a matrix all of whose powers are P-matrices necessarily positive? Question 4.4.6 is posed in [HJ86b], where it is associated with spectral properties of Q-matrices. More specifically, the following questions about complex matrices are posed in [HJ86b]: Question 4.4.7 Is every matrix all of whose powers are Q-matrices co-spectral to a matrix all of whose powers are P-matrices? Question 4.4.8 Is every Q-matrix similar to a P-matrix? Question 4.4.9 Is every Q-matrix co-spectral to a P-matrix? Question 4.4.10 Is every diagonalizable Q-matrix similar to a P-matrix? Another related question is posed in [HK03, question 6.2]: Question 4.4.11 Do the eigenvalues of a matrix A such that A and A2 are P-matrices necessarily have positive real parts? At the time of writing this book, Questions 4.4.6–4.4.11 remain unanswered. Two results in the direction of resolving these questions are paraphrased below. Proposition 4.4.12 [TT18] Let A ∈ Mn (R) such that t A2 + (1 − t) A ∈ P for every t ∈ [0, 1]. Then A is positive stable. Proposition 4.4.13 [Kus16] Let A ∈ Mn (R) be such that for some permutation matrix P, the leading principal submatrices of PAPT and the squares of the submatrices are P-matrices. Then A is a positive stable P-matrix. We conclude this section with a classic result on “diagonal stabilization” of matrices with positive leading principal minors (and thus P-matrices); see Fischer and Fuller [FF58] and Ballantine [Bal70]. Theorem 4.4.14 Let A ∈ Mn (R) such that det A[k] > 0 for all k = 1, 2, . . . , n. Then there exists a diagonal matrix D with positive diagonal entries such that all the eigenvalues of DA are positive and simple.
4.5 Special Classes of P-Matrices Below is a list of propositions tantamount to methods that can be used to generate P-matrices with special properties.
4.5 Special Classes of P-Matrices
67
Proposition 4.5.1 Every row diagonally dominant matrix A ∈ Mn (R) with positive diagonal entries is a positive stable P-matrix. Proof Every principal submatrix A[α] is row diagonally dominant with positive diagonal entries, and thus, by Gershgorin’s theorem, A[α] is positive stable. In particular, every real eigenvalue of A[α] is positive. As the complex eigenvalues come in conjugate pairs, it follows the det A[α] > 0 for all nonempty α ⊂ n. Proposition 4.5.2 Let A ∈ Mn (R) be such that A + AT is positive definite. Then A is a positive stable P-matrix. Proof Every principal submatrix A[α] + A[α]T of A + AT is also positive definite. Thus for every x ∈ C| α| , x∗ A[α] x = x∗
A[α] + A[α]T A[α] − A[α]T x + x∗ x 2 2
has positive real part. It follows that every eigenvalue of A[α] has positive real part, and thus every principal submatrix of A is positive stable. As in the proof of Proposition 4.5.1, this implies that A is a positive stable P-matrix. Proposition 4.5.3 [Pen01] Let A ∈ Mn (R) be a B-matrix. Then A ∈ P. Proposition 4.5.4 For every nonsingular matrix A ∈ Mn (C), the matrices A∗ A, AA∗ are positive definite and thus P-matrices. Proposition 4.5.5 Given a square matrix B ≥ 0 and s > ρ(B), A = sI − ρ(B) ∈ P. In particular, A is an M-matrix and thus positive stable. Proof Consider B = [bij ] ≥ 0 and s > ρ(B). By [HJ13, corollary 8.1.20], s > ρ(B[α]) for every nonempty α ⊆ n. Thus every principal submatrix A[α] of A is positive stable, and as in the proof of Proposition 4.5.1, it must have positive determinant. Proposition 4.5.6 Let B, C ∈ Mn (R) be row diagonally dominant matrices with positive diagonal entries. Then BC−1 ∈ P. Proof Let B and C be as prescribed, and notice that by the Levy–Desplanques theorem (see [HJ13, corollary 5.6.17]), TB + (I − T)C is invertible for every T ∈ [0, I]. Thus, by Theorem 4.3.9, BC−1 ∈ P. Remark 4.5.7 It is easy to construct examples showing that the above result fails to be true when B or C are not real matrices. We continue with two results on mimes from [Pan79a]. The original proofs of these results in [Pan79a] rely on (hidden) Leontief matrices and principal pivot
68
P-Matrices
transforms. In particular, the proof of Proposition 4.5.9 below is attributed to Mangasarian [Man78], who also used ideas from mathematical programming. We include here shorter proofs that are based on what is considered standard M-matrix and P-matrix theory. Proposition 4.5.8 Let A ∈ Mn (R) be a mime. Then A ∈ P. Proof Let A be a mime and s1 , s2 , P1 , P2 and u ≥ 0 be as in Definition 4.2.3. Then, for every T ∈ [0, I], the matrix C = T(s2 I − P2 ) + (I − T)(s1 I − P1 ) is a Z-matrix and Cu > 0 (i.e., C is a semipositive Z-matrix). This means that C is an M-matrix. In particular, C is invertible for every T ∈ [0, I]. By Theorem 4.3.9, we conclude that A = (s1 I − P1 )(s2 I − P2 )−1 ∈ P. Proposition 4.5.9 Let B ≥ 0 with ρ(B) < 1. Let {ak }m k=1 be a sequence such that 0 ≤ ak+1 ≤ ak ≤ 1 for all k = 1, . . . , m − 1. Then A=I+
m
ak Bk ∈ P.
k=1
If m is infinite, under the additional assumption that ∞ k=1 ak is convergent, we can still conclude that A ∈ P. More specifically, A is a nonnegative matrix and a mime. Proof Consider the matrix C = A(I − B) and notice that C can be written as C = I − G, where G ≡ B −
m
ak Bk − Bk+1 .
k=1
First we show that G is nonnegative: Indeed, as 0 ≤ ak+1 ≤ ak ≤ 1, we have G ≥ B−
m
a k Bk +
k=1
=B−
m
= B − a1 B + am B
ak+1 Bk+1 + am Bm+1
k=1
a k Bk +
k=1
m−1
m
k=2 m+1
ak Bk + am Bm+1 = (1 − a1 )B + am Bm+1 ≥ 0.
Next we show that ρ(G) < 1. For this purpose, consider the function f (z) = z −
m
ak zk − zk+1
k=1
= z(1 − a1 ) + z2 (a1 − a2 ) + · · · + zm (am−1 − am ) + am zm+1 ,
4.5 Special Classes of P-Matrices
69
in which expression all the coefficients are by assumption nonnegative. Thus |f (z)| ≤ f (|z|). However, for |z| < 1, we have f (|z|) = |z| −
m
ak |z|k − |z|k+1 ≤ |z|.
k=1
That is, for all |z| < 1, |f (z)| ≤ f (|z|) ≤ |z|. We can now conclude that for every λ ∈ σ (B), |f (λ)| ≤ |λ| ≤ ρ(B) < 1; that is, ρ(G) < 1. We have thus shown that A = (I − G)(I − B)−1 , where B, G ≥ 0 and ρ(B), ρ(G) < 1. Also, as B ≥ 0, we may consider its Perron vector u ≥ 0. By construction, u is also an eigenvector of G corresponding to ρ(G). That is, there exists vector u ≥ 0 such that Bu = ρ(B) u < u
and Gu = ρ(G) u < u.
Thus A is a mime and so it belongs to P by Proposition 4.5.8. Proposition 4.5.10 Let B ≥ 0. Then etB ∈ P for all t ∈ [0, 1/ρ(B)). Proof It follows from Proposition 4.5.9 by taking m = ∞ and ak =
1 k! .
H-matrices with positive diagonal entries are mimes and thus P-matrices by the results in [Pan79a]. We include a short direct proof of this fact next. Proposition 4.5.11 Let A ∈ Mn (R) be an H-matrix with positive diagonal entries. Then A ∈ P. Proof As M(A) is an M-matrix, by [BP94, theorem 2.3 (M35 ), chapter 6], there exists a diagonal matrix D ≥ 0 such that AD is strictly row diagonally dominant. As AD has positive diagonal entries, the result follows from Theorem 4.3.2 (4) and Proposition 4.5.1. We now turn our attention to P-matrices generated via sign patterns. Proposition 4.5.12 Snk ⊆ Pkn ⊆ P. Proof Recall that by the definitions in Section 4.2.3, when A = [aij ] ∈ Sn,k , the longest cycle in the directed graph of A is k ≥ 1; the cycles of odd (resp., even) length in the signed directed graph of A are positive (resp., negative).
70
P-Matrices
In particular, all the diagonal entries of A are positive. The terms in the standard expansion of det A are of the form (−1)sign(σ ) a1σ (1) . . . anσ (n) ,
(4.5.1)
where σ is a member of the symmetric permutation group on n elements. However, σ can be uniquely partitioned into its permutation cycles, some of odd and some of even length. In fact, (−1)sign(σ ) = (−1)n+q , where q is the number of odd cycles. As a consequence, the quantity in (4.5.1) is nonnegative. One such term in det A consists of the diagonal entries of A, which are positive, and thus the sign of that term is (−1)2n . It follows that det A > 0. Notice also that a similar argument can be applied to every principal minor of A. Thus A belongs to Pn,k . Remark 4.5.13 The argument in the proof above is essentially contained in the proof of [EJ91, theorem 1.9]. Also, notice that the matrices in Sn,k are sign nonsingular and sometimes referred to as qualitative P-matrices. Example 4.5.14 The sign pattern ⎡ ⎤ + + + ⎣ − + + ⎦ 0 − + belongs to S3,2 , and thus every matrix with this sign pattern is a P-matrix. Proposition 4.5.15 Let A ∈ Mn (R) be sign nonsingular and B ∈ Mn (R) any matrix with the same sign pattern as A. Then BA−1 , B−1 A ∈ P. Proof Because A and B are sign-nonsingular having the same sign pattern, we have that C = TB + (I − T)A is also sign-nonsingular for every T ∈ [0, I]. Thus, by Theorem 4.3.9, BA−1 ∈ P. The conclusion for B−1 A follows from a result in [JT95] dual to the one quoted in Theorem 4.3.9. Example 4.5.16 The sign pattern ⎡ − ⎢ + ⎢ ⎣ 0 +
⎤ − 0 − − − 0 ⎥ ⎥ + − − ⎦ 0 + −
is sign nonsingular. Thus, by Proposition 4.5.15, for A and B with this sign pattern given by,
4.5 Special Classes of P-Matrices ⎡ ⎤ −1 −2 0 −1 −1 −1 0 −1 ⎢ 1 −1 −3 ⎥ ⎢ 1 −1 −1 0 0 ⎥, B = ⎢ A=⎢ ⎣ 0 ⎣ 0 2 −2 −1 1 −1 −2 ⎦ 1 0 1 −1 3 0 1 −2 ⎡
71 ⎤ ⎥ ⎥, ⎦
we have that ⎡
BA−1
33 1 ⎢ −8 ⎢ = 23 ⎣ −21 −9
⎤ 7 −6 1 45 −14 −10 ⎥ ⎥ ∈ P. 6 31 −9 ⎦ −4 −13 6
Remark 4.5.17 Some sufficient conditions for a matrix A to be a P-matrix based on its signed directed graph are provided in [JNT96]. In particular, if A = [aij ] is sign symmetric (i.e., aij aji ≥ 0) and if its undirected graph is a tree (i.e., a connected acyclic graph), then necessary and sufficient conditions that A be a P-matrix are obtained.
4.5.1 Matrices All of Whose Powers Are P-Matrices As is well known, if A is positive definite, then so are all of its powers. Thus positive definite matrices belong to PM . By the Cauchy–Binet formula for the determinant of the product of two matrices (see [HJ13]), it follows that powers of totally positive matrices are totally positive and thus belong to PM . Below is a constructive characterization of totally positive matrices found in [GP96] (see also [Fal01] and [FJ11]), which therefore allows us to generate matrices in PM . Theorem 4.5.18 Matrix A ∈ Mn (R) is totally positive if and only if there exist positive diagonal matrix D and positive numbers li , uj (i, j = 1, 2, . . . , k), k = (n2 ), such that A = FDG, where
F = En (lk ) En−1 (lk−1 ) . . . E2 (lk−n+2 )
× En (lk−n+1 ) . . . E3 (lk−2n+4 ) . . . [En (l1 )] and
T T G = EnT (u1 ) En−1 (u2 ) EnT (u3 ) . . . E2T (uk−n+2 ) . . . En−1 (uk−1 )EnT (uk ) ; here Ek (β) = I + βEk,k−1 , where Ek,k−1 denotes the (k, k − 1) elementary matrix.
72
P-Matrices
Another subclass of PM is the MMA-matrices, introduced in [FHS87] (recall that these are matrices all of whose positive integer powers are irreducible M-matrices). The results in [ENN98] (see also [Stu98]), combined with the notion of a Soules matrix, allow us to construct all symmetric (inverse) MMAmatrices: Theorem 4.5.19 Let A ∈ Mn (R) be an invertible and symmetric matrix. Then A is an MMA-matrix if and only if A−1 = R RT , where R is a Soules matrix and = diag(λ1 , . . . , λn ) with λ1 > λ2 ≥ · · · ≥ λn > 0. It has also been shown in [FHS87] that for every MMA-matrix B ∈ Mn (R) there exists a positive diagonal matrix D such that A = D−1 BD is a symmetric MMA-matrix. In [HS86], D is found to be 1/2 −1/2 1/2 −1/2 D = diag x1 y1 , x2 y2 , . . . , xn1/2 y−1/2 , n where x, y are the unit strictly positive right and left Perron eigenvectors of B, respectively. Thus, we have a way of constructing arbitrary MMA-matrices as follows: Determine an orthogonal Soules matrix R starting with an arbitrary unit positive vector w1 . Choose a matrix as in Theorem 4.5.19 and form R RT . Then A = R −1 RT is a symmetric MMA-matrix with unit left and ˆ right Perron eigenvectors equal to w1 . Choose now positive diagonal matrix D −1 T −1 −1 ˆ ˆ ˆ ˆ and let B = DR R D . Then B is an MMA-matrix having Dw1 and D w1 are right and left Perron vectors, respectively. Remark 4.5.20 Given a matrix A in PM , so are AT , D−1 AD (where D is a positive diagonal matrix), QAQT (where Q is a permutation matrix), and SAS (where S is a signature matrix). 4.5.2 More on Mimes It has been shown in [Pan79a] that the notion of a mime A as in Definition 4.2.3 is tantamount to A being “hidden Z” and a P-matrix at the same time. More relevant to our context is the following result, reproven here using the language and properties of M-matrices. Proposition 4.5.21 Let A ∈ Mn (R). Then A is a mime if and only if (1) AX = Y for some Z-matrices X and Y, and (2) A and X are semipositive. Proof Clearly, if A is a mime as in Definition 4.2.3, then (1) holds with the roles of X and Y being played by (s2 I − P2 ) and (s1 I − P1 ), respectively. That
4.5 Special Classes of P-Matrices
73
(2) holds follows from the fact that z = Xu > 0 (i.e., X is semipositive) and Yu > 0, where u ≥ 0 is a common semipositivity vector of A. We then have that Az = YX −1 Xu = Yu > 0; that is, A is also semipositive. For the converse, suppose (1) and (2) hold. Then X is an invertible M-matrix. As A is assumed semipositive, Ax > 0 for some x ≥ 0. Let u = X −1 x. Then, Yu = Ax > 0; that is, Y is also semipositive and so an M-matrix as well. In fact, u is a common semipositivity vector of A and thus A is a mime. Remarks 4.5.22 (i) Based on the above result, we can assert that given a mime A and a positive diagonal matrix D, A + D and DA are also mimes. (ii) Principal pivot transforms, permutation similarities, Schur complementation, and extraction of principal submatrices leave the class of mimes invariant; see [Pan79a, Tsa00]. Recall that the mimes form a subclass of the P-matrices that includes the M-matrices, the H-matrices with positive diagonal entries, as well as their inverses. The following is another large class of mimes mentioned in [Pan79b]. Theorem 4.5.23 Every triangular P-matrix is a mime. Proof We prove the claim by induction on the order k of A. If k = 1 the result is obviously true. Assume the claim is true for k = n − 1; we will prove it for k = n. For this purpose, consider the triangular P-matrix A11 a , A= 0 a22 where A11 is an (n − 1)-by-(n − 1) P-matrix, a ∈ Rn−1 and a22 > 0. By the inductive hypothesis and Proposition 4.5.21, there exist Z-matrices X11 , Y11 and nonnegative vector u1 ∈ Rn−1 such that A11 X11 = Y11 , Y11 u1 > 0, and X11 u1 > 0. Consider then the Z-matrix X11 −X11 u1 X= , 0 x22 where x22 > 0 is to be chosen. Then let A11 X11 −Y11 u1 + x22 a Y = AX = . 0 a22 x22 T
that x22 > 0 can be chosen so that Y is a Z-matrix. Let also u = Notice T u1 u2 . Choosing u2 > 0 small enough, we have that Xu > 0 and Yu > 0. Thus A is a mime by Proposition 4.5.21.
74
P-Matrices
Theorem 4.5.24 Let A be a mime. Then A can be factored into A = BC−1 , where B, C are row diagonally dominant matrices with positive diagonal entries. Proof Suppose that A = (s1 I − P1 )(s2 I − P2 )−1 is a mime with semipositivity vector u that by continuity can be chosen to be positive (u > 0). Let D = diag(u1 , . . . , un ). Define B = (s1 I −P1 )D and C = (s2 I −P2 )D so that Be > 0 and Ce > 0, where e is the all ones vector. Notice that as B and C are Z-matrices with positive diagonal entries, they are indeed row diagonally dominant, and A = BC−1 . Remark 4.5.25 Not all P-matrices are mimes as shown by the construction of a counterexample in Pang [Pan79b].
4.6 The P-Problem: Detection of P-Matrices Of particular concern in applications is the ability to decide, as efficiently as possible, whether an n-by-n matrix is in P or not. This is referred to as the P-problem. This problem has received attention due to its inherent computational complexity, as well as due to the connection of P-matrices to the linear complementarity problem (see Section 4.9.1) and to self-validating methods for its solution (see [CSY01, JR99, RR96, Rum01]). The P-problem is indeed NP-hard. Specifically, the P-problem has been shown by Coxson [Cox94, Cox99] to be co-NP-complete. That is, the time complexity of discovering that a given matrix has a negative principal minor is NP-complete. Checking the sign of each principal minor is indeed a task of exponential complexity. A strategy has been developed in [Rum03] for detecting P-matrices, which is not a priori exponential; however, it can be exponential in the worst case. The most efficient to date comprehensive method for the P-problem is a recursive O(2n ) algorithm developed in [TL00], which is simple to implement and lends itself to computation in parallel; it is presented in detail in Section 4.6.2. For certain matrix classes discussed in Section 4.6.1, the P-problem is computationally straightforward, in some cases demanding only low-degree polynomial time complexity.
4.6 The P-Problem: Detection of P-Matrices
75
4.6.1 Detecting Special P-Matrices To detect a positive definite matrix A one needs to check if A = A∗ and whether the eigenvalues of A are positive or not. To detect an M-matrix A, one needs to check if A is a Z-matrix and whether A is positive stable or not. An alternative O(n3 ) procedure for testing if a Z-matrix is an M-matrix is described in [Val91]. To detect an inverse M-matrix A, one can apply a test for M-matrices to A−1 . Similarly, one needs to apply a test for M-matrices to M(A) to determine whether a matrix A with positive diagonal entries is an H-matrix or not. Recall that an H-matrix is also characterized by the existence of a positive diagonal matrix D such that AD is row diagonally dominant. This is why an H-matrix is also known in the literature as a generalized diagonally dominant matrix. Based on this characterization of H-matrices, an iterative method to determine whether A is an H-matrix or not is developed in [LLHNT98]; this iterative method also determines a diagonal matrix D with the aforementioned scaling property. More methods to recognize H-matrices have since been developed, see, e.g., [AH08]. To determine a totally positive matrix, Fekete’s criterion [FP1912] may be used: Let S = {i1 , i2 , . . . , ik } with ij < ij+1 , (j = 1, 2, . . . , k − 1), and define the dispersion of S to be ik − i1 − k + 1 if k > 1 d(S) = 0 if k = 1. Then A ∈ Mn (R) is totally positive if and only if det A[α|β] > 0 for all α, β ⊆ n with |α| = |β| and d(α) = d(β) = 0. That is, one only needs to check for positivity minors whose row and column index sets are contiguous. Fekete’s criterion is improved in [GP96] as follows: Theorem 4.6.1 Matrix A ∈ Mn (R) is totally positive if and only if for every k ∈ n (a) det A[α|k] > 0 for all α ⊆ n with |α| = k and d(α) = 0, (b) det A[k|β] > 0 for all β ⊆ n with |β| = k and d(β) = 0. A complete discussion for the recognition of total positivity can be found in [FJ11, chapter 3]. The task of detecting mimes is less straightforward and can be based on Proposition 4.5.21. As argued in [Pan79b] and by noting that, without loss of any generality, the semipositivity vector of X in Proposition 4.5.21 can be taken to be the all ones vector e (otherwise, our considerations apply with X, Y
76
P-Matrices
replaced by XD, YD for a suitable positive diagonal matrix D), the following two-step test for mimes is applicable [Pan79b]: Step 1. Determine whether A is semipositive or not by solving the linear program minimize eT x subject to x ≥ 0 and Ax ≥ e
If this program is infeasible, then A is not semipositive and thus not a mime; stop. Step 2. Check the consistency of the linear inequality system xij ≤ 0 (i, j = 1, 2, . . . , n, i = j), n aik xkj ≤ 0 (i, j = 1, 2, . . . , n, i = j), k=1 n x (i = 1, 2, . . . , n). j=1 ij > 0
If this system is inconsistent, A is not a mime; stop. Otherwise, A is a mime for it satisfies the conditions of Proposition 4.5.21. Remark 4.6.2 The task of detecting matrices in PM is an open problem, related to the problem of characterizing matrices in PM and their spectra; see Section 4.4 of this chapter.
4.6.2 Detection of General P-Matrices The exhaustive check of all 2n − 1 principal minors of A ∈ Mn (R) using Gaussian elimination is an O(n3 2n ) task; see [TL00]. Next we will describe an alternative test for complex P-matrices, which was first presented in [TL00] and was shown to be O(2n ) when applied to real P-matrices. The following theorem (cf. Theorem 4.3.2 (7) and (8)) is the theoretical basis for the subsequent algorithm. Although the result is stated and proved in [TL00] for real P-matrices, its proof is valid for complex P-matrices and included here for completeness. Theorem 4.6.3 Let A ∈ Mn (C), α ⊆ n with |α| = 1. Then A ∈ P if and only if A[α] > 0, A[α] ∈ P, and A/A[α] ∈ P. Proof Without loss of generality, assume that α = {1}. Otherwise we can apply our considerations to a permutation similarity of A. If A = [aij ] ∈ P, then A[α] ∈ P and A[α] ∈ P. Also A/A[α] ∈ P by Theorem 4.3.2 (7). For the converse, assume that A[α] = [a11 ], A[α], and A/A[α] are P-matrices. Using
4.6 The P-Problem: Detection of P-Matrices
77
a11 > 0 as the pivot, we can row reduce A to obtain a matrix B with all of its off-diagonal entries in the first column equal to zero. As is well known, B[α] = A/A[α]. That is, B is a block triangular matrix whose diagonal blocks are P-matrices. It follows readily that B ∈ P. The determinant of any principal submatrix of A that includes entries from the first row of A coincides with the determinant of the corresponding submatrix of B and is thus positive. The determinant of any principal submatrix of A with no entries from the first row coincides with a principal minor of A[α] and is also positive. Hence A ∈ P. The following is an implementation of a recursive algorithm for the P-problem suggested by Theorem 4.6.3. For its complexity analysis, see [GT06a, TL00]. Algorithm P(A) 1. 2. 3. 4. 5.
Input A = [aij ] ∈ Mn (C). If a11 > 0 output “Not a P-matrix” stop. Evaluate A/A[α], where α = {1}. Call P(A(α)) and P(A/A[α]). Output “This is a P-matrix.”
Next is a Matlab function implementing Algorithm P(A) for general complex matrices (see the remark below for online availability). function [r] = ptest(A) % Return r=1 if ‘A’ is a P-matrix (r=0 otherwise). n = length(A); if ∼(A(1,1)>0), r = 0; elseif n==1, r = 1; else b = A(2:n,2:n); d = A(2:n,1)/A(1,1); c = b - d*A(1,2:n); r = ptest(b) & ptest(c); end Remark 4.6.4 The algorithm P(A) uses Schur complementation and submatrix extraction in a recursive manner to compute (up to) 2n quantities. In the course of P(A), if any of these quantities is not positive, the algorithm terminates, declaring that A is not a P-matrix; otherwise it is a P-matrix. No further use of these 2n quantities is made in P(A), even when they all have to be
78
P-Matrices
computed; they are in fact overwritten. In [GT06a] an algorithm is developed (MAT2PM) that uses the underlying technique and principles of P(A) in order to compute all the principal minors of an arbitrary complex matrix. In [GT06b], a method is presented (PM2MAT) that constructs recursively a matrix from a set of prescribed principal minors, when this is possible. Matlab implementations of the algorithms in [GT06a, GT06b], as well as ptest, are maintained in www.math.wsu.edu/math/faculty/tsat/matlab.html.
4.7 The Recursive Construction of All P-Matrices The results in this section are based on material introduced in [TZ19]. Of particular interest in the study of P-matrices are two related problems: (i) Recognize whether or not a given matrix is a P-matrix (P-problem; see Section 4.6). (ii) Provide a constructive characterization of P-matrices, namely, a method that can generate every P-matrix. Both problems are central to computational challenges arising in the Linear Complementarity Problem (LCP) (see Section 4.9.1) and exemplified by the following facts: • LCP has a unique solution if and only if the coefficient matrix is a P-matrix [CPS92]. • The problem of deciding if a given matrix is a P-matrix is co-NP-complete [Cox94]. • The complexity of solving the LCP when the coefficient matrix is a P-matrix is presently unknown. If the problem of solving the LCP with a P-matrix were NP-hard, then the complexity classes NP and co-NP would coincide [MP91]. One interesting method to construct real P-matrices is to form products A = BC−1 , where B, C ∈ Mn (R) are row diagonally dominant matrices with positive diagonal entries (see Proposition 4.5.6). This was first observed in [JT95] and is also the subject of study in [Mor03] and [MN07], where such matrices are referred to as “hidden prdd.” At an Oberwolfach meeting [Ober], C. R. Johnson stated the above result and raised the question whether or not all real P-matrices can be factored this way, namely, they are hidden prdd. A counterexample was provided in [Mor03], along with a polynomial algorithm to detect hidden prdd matrices. Further study of related classes was pursued in [MN07].
4.7 The Recursive Construction of All P-Matrices
79
We shall see that the solution to problem (ii) above is intrinsically related to problem (i). Using a recursion based on rank-one perturbations of P-matrices, we shall be able to reverse the steps of the recursive algorithm P(A) in Section 4.6.2 that detects P-matrices in order to construct every P-matrix. 4.7.1 Rank-One Perturbations in Constructing P-Matrices The main idea in Theorem 4.6.3 allows one to construct recursively any and all members of the class of P-matrices of any given size. This is presented in the next two theorems and algorithm. Proposition 4.7.1 Let Aˆ ∈ Mn (C) be a P-matrix, a ∈ C and x, y ∈ Cn . Then the following are equivalent. a xT (i) A = −y ˆA is a P-matrix. (ii) a > 0 and Aˆ + 1a yxT is a P-matrix. Proof The equivalence follows from Theorem 4.6.3 and the fact that Aˆ + 1a yxT is the Schur complement of a = A[{1}] in A. Proposition 4.7.1 suggests the following recursive process to construct an n × n complex P-matrix, n ≥ 2. Algorithm P-CON Choose A1 > 0 For k = 1 : n − 1, given the k × k P-matrix Ak
T 1. Choose (x(k) , y(k) ) ∈ Ck × Ck and ak > 0 such that Ak + a1k y(k) x(k) is a P-matrix. 2. Form the (k + 1) × (k + 1) matrix Ak+1 =
ak
x(k)
−y(k)
Ak
T
.
Output A = An is a P-matrix. The recursive nature of every P-matrix is formally shown in the following theorem. Theorem 4.7.2 Every matrix constructed via P-CON is a P-matrix. Conversely, every P-matrix A ∈ Mn (C) can be constructed via P-CON. Proof By Proposition 4.7.1, each of the matrices Ak+1 (k = 1, 2, . . . , n − 1) in P-CON, including A1 , is a P-matrix. We use induction to prove the converse.
80
P-Matrices
The base case is trivial. Assume every P-matrix in Mn−1 (C) can be constructed via P-CON. Let A ∈ Mn (C) be any P-matrix partitioned as a A12 A = 11 , A21 A22 where A22 ∈ Mn−1 (C). By inductive hypothesis, A22 is a P-matrix constructible via P-CON. Because A is a P-matrix, by Corollary 4.7.1, A/a11 = A22 − 1 a11 A21 A12 is a P-matrix, and An = A is constructible via P-CON with an−1 = a11 > 0, x(n−1) = AT12 and y(n−1) = −A21 .
4.7.2 Constructing P-Matrices Our further development and application of algorithm P-CON in this section will be guided by the following considerations. (1) The choice of x(k) , y(k) ∈ Ck × Ck and ak > 0 in P-CON must be made T such that Ak + a1k y(k) x(k) has positive principal minors. Given that the primary interest in applications concerns real P-matrices, we will offer an implementation of P-CON that constructs real P-matrices. Therefore, we execute Step 1 of P-CON by choosing a pair of real vectors x(k) , y(k) randomly, and subsequently choose ak > 0 sufficiently large to ensure T Ak + a1k y(k) x(k) is a P-matrix. (2) Theorem 4.7.2 and P-CON deal with complex P-matrices. To proceed with construction of non-real P-matrices, it is implicit in the condition that T Ak + a1k y(k) x(k) be a P-matrix that, although the j-th entries of x(k) , y(k) can be non-real, their product must be real. We will illustrate such a construction in Example 4.7.7. Pursuant to consideration (1) above, the lemma and theorem that follow facilitate the choice of ak in the implementation of P-CON. n Lemma 4.7.3 Let A ∈ M n (C) be invertible and y, x ∈ C . Then det A + yxT = xT A−1 y + 1 det(A).
Proof Because I 0 I + yxT xT 1 0
y 1
I xT
−1 0 I 0 I + yxT = T x 0 1 1 I y , = T 0 x y+1
I y 1 −xT
0 1
4.7 The Recursive Construction of All P-Matrices
81
we have y I y , = det 1 0 xT y + 1
I + yxT det 0
i.e., det I + yxT = xT y + 1. Thus, det A + yxT = det(A) det I + A−1 yxT = xT A−1 y + 1 det(A). Theorem 4.7.4 Let A ∈ Mn (C) be a P-matrix, y, x ∈ Cn and a > 0. Then A + 1a yxT is a P-matrix if and only if for every α ⊆ n, we have (x[α])T (A[α])−1 y[α] ∈ R and a > max − (x[α])T (A[α])−1 y[α] . α⊆n
Proof For every α ⊆ n, det(A[α]) > 0 because A is a P-matrix. Notice that the principal submatrices of A + 1a yxT are of the form A[α] + 1a y[α](x[α])T . First we prove sufficiency: Suppose that for every α ⊆ n, (x[α])T (A[α])−1 y[α] ∈ R and a > max − (x[α])T (A[α])−1 y[α] . α⊆n
By Lemma 4.7.3, for any α ⊆ n, we have 1 1 T T −1 det A[α] + y[α](x[α]) = (x[α]) (A[α]) y[α] + 1 det(A[α]) > 0 a a Therefore, A + 1a yxT is a P-matrix. Next we prove necessity. If A + 1a yxT is a P-matrix, then for every α ⊆ n,
1 det A[α] + y[α](x[α])T a
=
1 T −1 (x[α]) (A[α]) y[α] + 1 det(A[α]) > 0, a
which necessitates that (x[α])T (A[α])−1 y[α] ∈ R and 1a (x[α])T (A[α])−1 y[α]+ 1 > 0, i.e., a > max − (x[α])T (A[α])−1 y[α] . α⊆n
Next, we incorporate the bound in Theorem 4.7.4 in Matlab code (PCON) that constructs real P-matrices.
82
P-Matrices
PCON function [A]=pcon(N) % Input N is the size of the desirable real P-matrix to be generated A=rand(1); % or A=abs(normrnd(0,1)); random 1x1 P-matrix for j=1:N-1 x=-1+2.*rand(j,1); % random entries in [-1,1]; or use x=normrnd(0,1,[j 1]); y=-1+2.*rand(j,1); %
or use y=normrnd(0,1,[j 1]);
[m,n]=size(A); v=1:n; a=0.01; % or a=abs(normrnd(0,1)); for k=1:n C = nchoosek(v,k); [p,q]=size(C); for i=1:p B=A(C(i,:), C(i,:)); b=-(x(C(i,:))). ∗inv(B)∗y(C(i,:)); if b>a a=b; end end end a=1.01∗a; % or a=(1+abs(normrnd(0,1)))*a; A=[a
x. ; -y
A];
end
Remark 4.7.5 The following remarks clarify the functionality of the implementation of P-CON. • The complexity of P-CON is exponential because of the computation of the lower bound for a = ak in Theorem 4.7.4, which requires a maximum be computed among all n-choose-k submatrices. • In the code provided above, random choices are uniformly distributed; normal distribution commands are commended out. • The lower bound for the parameter ak provided by Theorem 4.7.4 is strict, hence our choice of ak is larger than (but kept close to) the lower bound to minimize the chance of diagonal dominance. • Given that P-matrices are preserved under positive scaling of the rows and columns, there is no loss of generality in restricting the (uniformly distributed) random choice of the entries of x(k) and y(k) to be in [−1, 1].
4.7 The Recursive Construction of All P-Matrices
83
• We have experimented with normal and uniform distributions for the random choice of the parameters and vector entries. We have, however, observed no discernible difference in the nature of the generated P-matrices. • As desirable, in the experiments we have run, the matrices generated display no symmetry, no sign pattern, and no diagonal dominance because none of these traits are imposed by P-CON. We conclude by illustrating the functionality of P-CON with some generated examples of real P-matrices. We also include an example of a non-real P-matrix generated via P-CON. Example 4.7.6 The first two examples are P-matrices generated by execution of P-CON with random variables that are uniformly distributed. ⎡ ⎤ 0.8944 −0.1366 0.9951 0.6232 ⎢ 0.0287 0.0101 −0.7969 −0.2183⎥ ⎢ ⎥, ⎣−0.7889 0.8908 0.0101 0.8127⎦ 0.7249 −0.0026 −0.2578 0.5102 ⎤ ⎡ 5.5491 0.0613 0.6648 0.1950 −0.3294 ⎢0.4015 0.4512 −0.6777 0.5162 0.7422⎥ ⎥ ⎢ ⎥ ⎢ 0.2984 40.4328 −0.1956 0.2413⎥ . ⎢0.0948 ⎥ ⎢ ⎣0.1547 −0.3711 0.6913 0.2783 −0.1952⎦ 0.2808 0.4117 0.2373 −0.9657 0.6841 In the next two examples random variables were normally distributed. ⎡ ⎤ 2.7122 1.1093 −0.8637 0.0774 ⎢1.2141 1.3140 0.3129 −0.8649⎥ ⎢ ⎥, ⎣1.1135 0.0301 0.3185 −1.7115⎦ 0.0068 0.1649
⎡
0.1022
5.7061 ⎢ 0.8655 ⎢ ⎢ ⎢ 0.1765 ⎢ ⎣−0.7914 1.3320
1.3703
⎤ 2.5260 1.6555 0.3075 −1.2571 1.7977 −0.1332 −0.7145 1.3514⎥ ⎥ ⎥ 0.2248 1.6112 0.9642 0.5201⎥ . ⎥ 0.5890 0.0200 1.6021 −0.9792⎦ 0.2938 0.0348 1.1564 0.8314
Example 4.7.7 In this example, we explicitly illustrate the construction of a 4 × 4 non-real P-matrix using P-CON. The choices are made deliberately to satisfy the necessary conditions and are not randomly generated. Choose A1 = 2, x(1) = 2 − i, y(1) = 4 + 2i and a1 = 1 so that A1 +
1 (1) (1) T y x = 12 a1
84
P-Matrices
1 2−i is a 1 × 1 P-matrix. Thus A2 = −4−2i is a P-matrix. 2 1
Choose x(2) = −1−2i , y(2) = 2 −i and a2 = 3 so that 3−i
3+i
1 (2) (2) T 0.1667 2.1667 − 2.1667i y x = −4.3333 − 4.3333i 5.3333 a2 3 −1−2i 3−i is a P-matrix. Thus A3 = − 12 +i 1 2−i is a P-matrix. −4−2i 2 −i −3−i 2i (3) (3) 1−i 1+i Choose x = 1 , y = and a3 = 3.5 so that 2 A2 +
−3
3
1 (3) (3) T y x a3 ⎡ ⎤ 3.5714 −1.5714 − 1.4286i 3.0000 − 0.8095i = ⎣−0.7857 + 0.7143i 1.5714 2.0952 − 1.0952i⎦ −3.0000 − 0.8095i −4.1905 − 2.1905i 1.9365 ⎡ ⎤ 1
A3 +
3.5
⎢ −2i is a P-matrix. Thus A4 = ⎣ −1+i 2 3
−i 1+i 3 3 −1−2i 3−i ⎥ 1 2−i ⎦ − 12 +i −3−i −4−2i 2
is a P-matrix.
4.8 On Matrices with Nonnegative Principal Minors Many of the properties and characterizations of P-matrices extend to P0 -matrices, namely, the class P0 of matrices whose principal minors are nonnegative. Some of these extensions are straightforward consequences of the definitions and some are based on the following lemma, which implies that P0 is the topological closure of P in Mn (C). Lemma 4.8.1 Let A ∈ Mn (C). Then A ∈ P0 if and only if A + I ∈ P for all > 0. A Proof Suppose A = [aij ] ∈ P0 so that ∂∂det aii = det A[{i}] ≥ 0; that is, det A is a non-decreasing function of each of the diagonal entries. Thus, as each of the diagonal entries of A is increased in succession by > 0, the determinant of A+I must remain nonnegative. In addition, the coefficients of the characteristic polynomial of A, det(tI − A), alternate from nonnegative to non-positive and thus does not have any negative roots. That is, A has no negative eigenvalues and so A+I is nonsingular, implying that det(A+I) > 0. A similar argument applies to every principal submatrix of A, proving that A + I ∈ P for every > 0.
4.9 Some Applications of P-Matrices
85
Conversely, by continuity of the determinant as a function of the entries, if A + I is a P-matrix for every > 0, as approaches zero, the principal minors of A must be nonnegative. The following facts about P0 -matrices can be obtained from the definitions, as well as the corresponding facts about P-matrices and the above lemma. Theorem 4.8.2 Let A ∈ Mn (C) and its principal submatrices have real characteristic polynomials. Then A is a P0 -matrix if and only if every real eigenvalue of every principal submatrix of A is nonnegative. Theorem 4.8.3 Let A ∈ Mn (C) be a P0 -matrix and let Argz ∈ (−π , π ] denote the principal value of z ∈ C. Then π σ (A) ⊂ z ∈ C : | Arg(z) | ≤ π − . (4.8.1) n Theorem 4.8.4 Let A ∈ Mn (C) be a P0 -matrix. Then (1) A/A[α] ∈ P0 for all α ⊆ n such that A[α] is invertible. (2) ppt (A, α) ∈ P0 for all α ⊆ n such that A[α] is invertible. In particular, if A is invertible, then A−1 ∈ P0 . Theorem 4.8.5 A ∈ Mn (R) is a P0 -matrix if and only if for each x ∈ Rn , there exists j ∈ n such that xj (Ax)j ≥ 0.
4.9 Some Applications of P-Matrices We provide a brief description of some notable manifestations of P-matrices.
4.9.1 Linear Complementarity Problem Let M ∈ Mn (R) and q ∈ Rn be given. The Linear Complementarity Problem, denoted by LCP (q, M) , is to find z ∈ Rn such that z≥0 q + Mz ≥ 0 zT (q + Mz) = 0. An equivalent formulation of the LCP (q, M) is as the quadratic program minimize subject to
zT (q + Mz)
q + Mz ≥ 0 and z ≥ 0.
86
P-Matrices
Many of the matrix positivity classes discussed herein appear in the Linear Complementarity Problem literature. For a guide to these classes and their roles, see [Cott10]. Most relevant to this chapter is the following theorem, whose proof can be found in [CPS92] or [BP94]. Theorem 4.9.1 The LCP (q, M) has a unique solution for each vector q ∈ Rn if and only if M ∈ P. Remarks 4.9.2 (i) Related to Theorem 4.9.1, it can be shown that LCP (q, M) has a unique solution for certain 4n + 1 vectors q ∈ Rn , then it has unique solution for all q. Hence P-matrices can be characterized by the uniqueness of the solution to a finite number of Linear Complementarity Problems; see [Mu71]. (ii) The solution to the LCP (q, M) can be found by pivoting methods (e.g., Lemke’s algorithm) in exponential time, or by iterative methods (e.g., interior point methods), which are better suited for large-scale problems. A comprehensive presentation of numerical methods for the LCP (q, M) , as well as their adaptations to special matrix classes, can be found in [CPS92]. (iii) Several computational issues arise in the solution of the LCP (q, M) [Meg88], which have implications in the development of numerical methods and general complexity theory. For example: • [P-problem] Detect a P-matrix: It provides numerical validation for the existence and uniqueness of a solution to the LCP (q, M) . • [P-LCP] Solve the LCP (q, M) given that M ∈ P. • [P-LCP*] Solve the LCP (q, M) or exhibit a nonpositive principal minor of M. Facts and information: • If P-LCP or P-LCP* is NP-hard, then NP = coNP [MP91]. • Comprehensive information on P-LCP can be found in the dissertation [Rus07]. • It is not known whether or not P-LCP can be achieved in polynomial time. It belongs to a complexity class with some other important problems, e.g., [TFNP] – problems for Total Functions from NP.
4.9.2 Univalence Given a differential function F : R −→ Rn , where R is a closed rectangle in Rn , consider its Jacobian F (x). The Gale–Nikaido Theorem [GN65] states that
4.10 Miscellaneous Related Facts and References
87
if F (x) ∈ P for all x ∈ R, then F is univalent, i.e., injective. An account of univalence theorems can be found in [Parth83]. It has also been shown that F (x) ∈ P for merely all x on the boundary of R is sufficient to imply univalence of F [GZ79]. Subsequent work in [GR00] extended the connection between univalence and P-matrices to nonsmooth functions.
4.9.3 Interval Matrices In the context of mathematical programming and elsewhere, the analysis of matrix intervals and interval equations play an important role and are intimately related to P-matrices. In particular, it is known that [A, B] consists exclusively of n-by-n invertible matrices (i.e., [A, B] is a regular interval) if and only if for each of the 2n matrices T ∈ [0, I] having diagonal entries in {0, 1}, TA+(I−T)B is invertible [Roh89]. Related work is presented in [Roh91]. The case of matrix intervals that consist exclusively of P-matrices is considered in [BG84] and [RR96] (see also [JR99]). It has been shown that [A, B] consists entirely of P-matrices (i.e., [A, B] is a P-matrix interval) if and only if for every nonzero x ∈ Rn , there exists j ∈ n such that xj (Cx)j > 0 for every C ∈ [A, B] (cf. Theorem 4.3.4).
4.9.4 Linear Differential Inclusion Problems Consider the Linear Differential Inclusion (LDI) system x˙ ∈ x, x(0) = x0 , ⊂ Mn (R) and, in particular, the Diagonal Norm-bound LDI (DNLDI), which is a linear differential system with uncertain, time-varying (bounded) feedback gains. It leads to being of the form = {A + B(I − E)−1 C : ≤ 1, = diagonal}. The following result is in [Gha90] (see also [BEFB94]). Theorem 4.9.3 DNLDI is well posed if and only if (I + E)−1 (I − E) ∈ P.
4.10 Miscellaneous Related Facts and References We summarize and provide references to facts about P-matrices that are related to the themes of this chapter.
88
P-Matrices
• There are several interesting generalizations of the notion of a P-matrix. − P-tensors are defined in [DLQ18] using an extension of the notion of non-reversal of sign of nonzero real vectors (cf. Theorem 4.3.4). They are shown to have properties and be related to the tensor positivity classes they generalize analogously to P-matrices. − In [KS14], the authors generalize the results in [JT95] by replacing the inverse appearing in Theorem 4.3.9 with the Moore–Penrose or the group inverse of the factor C. − In [SGR99], the notion of the row-P-property of a set of matrices is introduced and related to the unique solvability of nonlinear complementarity problems. − Generalizations of P-matrices to block forms based on extensions of Theorem 4.3.9 are undertaken and compared in [ES98]. Convex sets of P-matrices and block P-matrices, as well as (Schur) stability are considered in [ES00] and [EMS02]. − P-matrix concepts and properties (like the non-reversal of the sign of a real nonzero vector, solutions to LCP-type problems and analogues to the relations between Z-matrices, M-matrices, P-matrices, and semipositivity) have been extended to transformations on Euclidean Jordan Algebras; see [GST04, TG05, GS06, GTR12]. • Sufficient conditions for positive stability of almost symmetric and almost diagonally dominant P-matrices are developed in [TSOA07]. • In [JK96], the authors consider matrices A some of whose entries are not specified and raise the problem of when do the unspecified entries can be chosen so that the completed matrix is in P. They show that if the specified entries include all of the diagonal entries, if every fully specified principal submatrix is in P, and if the unspecified entries are symmetrically placed (i.e., (i, j) entry is specified if and only if the (j, i) entry is specified), then the matrix A has a completion to a P-matrix. In [DH00], the authors extend the aforementioned class of partially specified matrices that have a P-matrix completion by providing strategies to achieve its conditions. In [DH00, JK96], partially specified matrices that cannot be completed to be in P are also discussed. General necessary and sufficient conditions for the completion of a partially specified matrix to a P-matrix are not known to date. Finally, completion problems of subclasses of P are studied in [FJTU00, GJSW84, Hog98a, Hog98b, JS96]. • In [BHJ85, HJ86a], the authors study linear transformations of P into P. In [BHJ85] it is shown that the linear transformations that map P onto itself are necessarily compositions of transposition, permutation similarity,
4.10 Miscellaneous Related Facts and References
•
•
•
•
•
• •
89
signature similarity, and positive diagonal equivalence (cf. Theorem 4.3.2 (1)–(4), respectively). Under the assumption that the kernel of a linear transformation L intersects trivially the set of matrices with zero diagonal entries, it is shown in [HJ86a] that the linear transformations that map P into itself (n ≥ 3) are compositions of transposition, permutation similarity, signature similarity, positive diagonal equivalence, and the map A → A+D, where D is a diagonal matrix whose diagonal entries are nonnegative linear combinations of the diagonal entries of A. There is an open problem of studying when the Hadamard (i.e., entrywise) product of two P-matrices is a P-matrix. A related problem concerns Hadamard product of inverse M-matrices. In particular, it is conjectured that the Hadamard square of two inverse M-matrices is an inverse M-matrix; see [Neu98, WZZ00]. The P-problem (detecting a P-matrix) is considered in [Rum03], where the sign-real spectral radius and interval matrix regularity are used to develop necessary and sufficient conditions for a real matrix to be in P. As a result, a not a priori exponential method for checking whether a matrix is in P or not is given. An algorithmic characterization of P-matrices via the Newton-min algorithm, which is an iterative method for the solution of the Linear Complementarity Problem, is presented in [GG19]. Factorization of real matrices into products of P-matrices are considered in [JOvdD03]. In particular, it is shown that every A ∈ Mn (R) can be written as the product of three P-matrices, provided that det A > 0. The Leading Implies All (LIA) class of matrices comprises all real square matrices for which the positivity of the leading principal minors implies that all principal minors are positive. Thus LIA is contained in P. LIA matrices are studied in [JN13]. Necessary conditions for a real matrix to be in P in terms of row and column sums are presented in [Szu90]. Given a P-matrix A ∈ Mn (R), the quantity α(A) = min { max xi (Ax)i } x∞ 1≤i≤n
and its role in the error analysis for the Linear Complementarity Problem is studied in [MP90] and [XZ02].
5 Inverse M-Matrices
5.1 Introduction An n-by-n real matrix A = [aij ] is called an M-matrix (A ∈ M) if (1) it is of the form A = αI − B, where B has nonnegative entries, and (2) α > ρ(B) in which ρ(B) is the Perron–Frobenius eigenvalue (spectral radius) of B. Thus, an M-matrix A has two key features: the sign pattern aii > 0, i = 1, . . . , n and aij ≤ 0, i = j (such matrices are called Z-matrices, A ∈ Z), and the property that all eigenvalues have positive real part (such matrices are called positive stable). It is easily seen that an M-matrix A has a positive eigenvalue with minimum modulus, denoted by q(A). Equivalently, it can be shown that a Z-matrix A is an M-matrix if and only if it is invertible and A−1 ≥ 0. Those matrices C that are inverses of M-matrices are called inverse M-matrices (C ∈ IM) and comprise a large class of stable nonnegative matrices. Note that an inverse M-matrix is a nonnegative matrix whose inverse is a Z-matrix. A number of equivalent properties can be substituted for (2) above, such as those given in [Ple77, NP79, NP80]. We note that characterizations of inverse M-matrices are much harder to come by than those for M-matrices, and we present several that have come up in the literature. Over the past half-century, M-matrices have had considerable attention, in large part because of the frequency with which they arise in applications [BP94], and a great deal is known about them and several generalizations, e.g., [FP62, NP79, NP80, Ple77, PB74]. M-matrices arise in iterative methods in numerical analysis of dynamical systems, finite difference methods for partial differential equations, input-output production and growth models in economics, linear complementarity problems in operations research, and
5.2 Preliminary Facts
91
Markov processes in probability and statistics [BP94]. A significant amount of attention has focused on inverse M-matrices. Again, this is, in part, due to applications [Wil77]. In addition to inverse problems involving M-matrices, inverse M-matrices themselves arise in a number of applications such as the Ising model of ferromagnetism [Ple77], taxonomy [BP94], and random energy models in statistical physics [Fan60]. Notation We denote the n-by-n entry-wise nonnegative matrices by N (or that they are ≥ 0), and those with positive main diagonal by N+ . The n-by-n diagonal matrices with positive diagonal entries are denoted by D+ n . the n-by-n real matrices whose principal minors are positive by P, the n-by-n matrices with nonpositive off-diagonal entries by Z, the n-by-n M-matrices by M, and the n-by-n inverse M-matrices by IM. Much is known about each of these important classes [And80, BP94, FP62, FP67, Joh78, Ple77, NP79, NP80]. If A ∈ N, then A ∈ IM if and only if A−1 ∈ Z, just as it is the case that if B ∈ Z, then B ∈ M if and only if B−1 ∈ N. Thus the classes Z and N are dual to each other in this context. All inequalities between matrices or vectors of the same size, such as A ≥ B, will be entry-wise. Furthermore, “≥” gives an appropriate partial order for the classes M and IM as A, B ∈ M satisfy B ≥ A if and only if A−1 , B−1 ∈ IM satisfy A−1 ≥ B−1 . Also, A ∈ M and B ≥ A imply B ∈ M. (Both are straightforward calculations.) There is no corresponding fact for IM, except Theorem 5.5.2; see also Theorem 5.17.23.
5.2 Preliminary Facts A number of facts follow from the definition of M-matrices; many of these are presented without proof. Several equivalences for M-matrices are listed below for future use. Theorem 5.2.1 [BP94] Let A ∈ Z ∩ Mn (R). The following are equivalent. (i) (ii) (iii) (iv) (v)
A is an M-matrix. A = αI − B in which B ≥ 0 and α > ρ(B). A is invertible and A−1 ≥ 0. A has positive principal minors. T There is D ∈ D+ n such that DA + A D is positive definite, i.e., A has a diagonal Lyapunov solution.
Note that (iv) was proved in [FP62], and from this it follows that if A ∈ IM, then det A > 0 and A has positive diagonal entries.
92
Inverse M-Matrices 5.2.1 Multiplicative Diagonal Closure
It is a familiar fact that the M-matrices are closed under positive diagonal multiplication; that is, if D if is diagonal with positive diagonal entries and A ∈ M, then DA ∈ M and AD ∈ M. The same is true of inverse M-matrices, for precisely the same reason. Corollary 5.2.2 If D ∈ D and B ∈ IM, then DB ∈ IM and BD ∈ IM. Proof Because DB ∈ N and is invertible, it suffices to note that (DB)−1 = B−1 D−1 ∈ Z, so that (DB)−1 ∈ M and DB ∈ IM. The same argument applies to BD.
5.2.2 Multiplicative Closure Neither M nor IM is closed under multiplication. Corollary 5.2.3 If A ∈ N is invertible, then A ∈ IM if and only if A−1 ∈ Z. Corollary 5.2.4 If A, B ∈ IM, then AB ∈ IM if and only if (AB)−1 ∈ Z. Example 5.2.5 Consider the IM matrices ⎡ ⎤ ⎡ ⎤ 21 14 13 24 20 13 A = ⎣13 24 20⎦ , B = ⎣13 24 14⎦ . 14 13 24 14 13 24 AB is not IM because the (1, 3) entry of (AB)−1 is positive. Remark 5.2.6 Multiplicative closure can be shown to hold for n = 2 (because A, B, and AB have positive determinant).
5.2.3 Additive Closure Another parallel between IM and M is that neither is a cone because neither is closed under addition. However, M is closed under addition in the 2-by-2 case. Theorem 5.2.7 If A, B ∈ IM, then A + B ∈ IM if and only if A + B is invertible and (A + B)−1 ∈ Z. Additive closure does not even hold for IM matrices of order 2. Example 5.2.8 Consider the IM matrix 3 5 A= . 1 3
5.3 Diagonal Closure
93
If B = AT , then A + B is not even invertible. However, if det(A + B) > 0, then A + B is IM in the 2-by-2 case. Remark 5.2.9 We note that, in case n = 2, for A, B ∈ M, A + B ∈ M if and only if A−1 + B−1 ∈ IM. This is not the case for n > 2, as indicated by the pair ⎡ ⎤ ⎡ ⎤ 1 0 0 1 0 0 A = ⎣−1 1 0⎦ , B = ⎣0 1 −3⎦ −1 −1 1
0 0
1
/ IM. There are, however, further for which A + B ∈ M, but A−1 + B−1 ∈ −1 interesting relationships between A+ B and A−1 + B to be noticed. For −1 −1 example, if A, B ∈ P, det(AB) · det A + B = det(A + B). Also, if A is lower and B is upper triangular, then the signs of the leading principal minors of A−1 + B−1 and A + B are the same, so that A + B ∈ M if A−1 + B−1 ∈ P. From Theorem 5.2.1 and the cofactor form of the inverse, we have Theorem 5.2.10 If A ∈ N, then A ∈ IM if and only if det A > 0 and either det A(i, j) > 0 or sgn det A(i, j) = (−1)i+j+1 for −1 ≤ i, j ≤ n, i = j. IM is closed under permutation similarity and translation. Theorem 5.2.11 If P is a permutation matrix, then A ∈ IM if and only if PAPT ∈ IM. Theorem 5.2.12 A ∈ IM if and only if AT ∈ IM. Our next fact is immediate from the corresponding property for M-matrices. Theorem 5.2.13 Each matrix A ∈ IM is positive stable.
5.3 Diagonal Closure Recall that C = [cij ] is said to be diagonally dominant of its rows if |cii | > |cij |, i = 1, . . . , n. j =i
If CT is diagonally dominant of its rows, then C is said to be diagonally dominant of its columns. The M-matrices possess an important (and characterizing) latent diagonal dominance property: if A ∈ Z, then A ∈ M if and only if there is a diagonal D with nonnegative diagonal entries such that AD is diagonally dominant of its rows. Alternatively, DA may be made diagonally dominant of its columns, and, furthermore, DAE may be made diagonally dominant of
94
Inverse M-Matrices
both its rows and its columns. On the other hand, inverse M-matrices are not diagonally dominant of its rows (columns) in general. However, the inverse M-matrices do possess analogous dominance properties, which are dual (in a certain sense) rather than being exactly the same. The matrix C is said to be diagonally dominant of its row entries if |cii | > |cij |, j = i, i = 1, . . . , n. Similarly, C is said to be diagonally dominant of its column entries if is diagonally dominant of its row entries. It is noted in [FP62, Joh77] and used in [Joh77, Wil77] that this weaker sort of dominance is latent in inverse M-matrices, just as the stronger sort is in M-matrices. We prove that fact here. Our approach is to scale the columns of A ∈ M so that the result is diagonally dominant of its rows, and then show, by inspection of the cofactors, that the inverse of the resulting matrix is diagonally dominant of its column entries.
CT
Theorem 5.3.1 [BP94] If A ∈ IM∩Mn (R), then the following are equivalent. (i) There is a D ∈ D+ n such that DA is diagonally dominant of its column entries. (ii) There is an E ∈ D+ n such that AE is diagonally dominant of its row entries. (iii) There are G, H ∈ D+ n such that GAH is diagonally dominant of both its row and column entries. Proof Let A ∈ IM so that B = A−1 ∈ M. It follows from the Perron–Frobenius Theorem [HJ13] that there is a D ∈ D+ n such that R = BD is diagonally dominant of its rows. Let S = R−1 = D−1 A = [sij ] and suppose i, j ∈ n with i = j. Then it follows from the cofactor expansion of R−1 and from R ∈ Z that |sii | − |sji | =
det R[n \ {i}] + det[R[n \ {i}, n \ {j}] det T = det R det R
in which T is obtained from R[n \ {i}] by adding the column ±R[n \ {i}, i] to the j-th column. Because R is diagonally dominant of its rows and has positive diagonal, det T > 0, and so |sii | − |sji | > 0. Thus, S = D−1 A is diagonally dominant of its column entries and (i) holds. The proof of (ii) is similar, and (iii) follows from (i) and (ii). Of course, because of Theorem 5.3.1, any IM matrix may be diagonally scaled to one with 1s on the diagonal (thereby obtaining a normalized inverse M-matrix). In fact, the scaled IM matrix may be taken to have 1s on the offdiagonal and entries < 1 [JS01a].
5.4 Diagonal Lyapunov Solutions
95
5.4 Diagonal Lyapunov Solutions It is known that M-matrices have diagonal Lyapunov solutions; that is, for T each B ∈ M, there is a D ∈ D+ n such that DB + B D is positive definite. (This follows from the fact that an M-matrix may be scaled to have both row and column diagonal dominance [JS01a].) This allows us to prove the analogous fact for IM matrices. Theorem 5.4.1 For each A ∈ IM ∩ Mn (R), there is a D ∈ D+ n such that DA + AT D is positive definite. Proof A = B−1 for some B ∈ M. Let D ∈ D+ such that DB + BT D is −1 T n −1 positive definite. Equivalently, DA + A D is positive definite. Hence, −1 T T −1 T A DA + A D A = DA + A D is also positive definite. It is worth noting, moreover, that the set of all possible diagonal Lyapunov solutions of A (which is a cone) is the same as that for B = A−1 . Because principal submatrices of positive definite matrices are positive definite, it follows that the possession of a diagonal Lyapunov solution is a property inherited under the extraction of principal submatrices. Because the possession of a positive definite Lyapunov solution implies positive stability, which implies a positive determinant for a real matrix, it means that an M-matrix is a P-matrix, a familiar fact. The same argument applies to Theorem 5.4.1. Corollary 5.4.2 The class IM ⊆ P. We note that Z∩P = M, but N∩P = IM. Within the class Z, the existence of a diagonal Lyapunov solution characterizes M (Theorem 5.2.1). Theorem 5.4.1 may be interpreted as saying that the existence of a diagonal Lyapunov solution is necessary for A ∈ N to be in IM. It is not a sufficient condition, however. Example 5.4.3 ⎡ ⎤ 18 1 6 A = ⎣ 1 18 6 ⎦ 6 6 18 has the identity as a Lyapunov solution, but its inverse is not in Z. It should be noted that, for n = 2, a matrix A ∈ N is in IM if and only if det A > 0 (just as B ∈ Z is in M if and only if det B > 0 for n = 2), but this easy characterization is atypical.
96
Inverse M-Matrices
5.5 Additive Diagonal Closure Although it was easy to see the closure of IM under positive diagonal multiplication from the corresponding fact for M, it is somewhat more subtle that the parallel extends to addition. It is known that for A ∈ M ∩ Mn (R) and D ∈ D+ n , A + D ∈ M. This fact may be seen, for example, from diagonal Lyapunov solution, from diagonal-dominance characterizations, or from the von Neumann expansion (I − A)−1 = I + A + A2 + · · · if ρ(A) < 1. The latter is given in [JS01a]. Here, we provide a proof based upon diagonal dominance characterizations. Theorem 5.5.1 If A ∈ IM and Eii is the n-by-n matrix with a 1 in the (i, i) position and 0s elsewhere, then A + tEii ∈ IM for any t ≥ 0. Proof Let A ∈ IM and let A−1 = [αij ]. Without loss of generality, we may assume that i = 1. Let B = A + te1 eT1 in which e1 denotes the first standard basis vector. Then, from [HJ13, p. 19], we have 1 A−1 e1 eT1 A−1 1 + eT1 A−1 e1 ⎡ ⎤ α11 ⎢ ⎥ 1 ⎢α21 ⎥ = A−1 − ⎢ . ⎥ [α11 α12 · · · α1n ] 1 + α11 ⎣ .. ⎦
B−1 = A−1 −
⎡
− ⎢+ ⎢ ⎢ ⎢+ −1 =A +⎢ ⎢. ⎢ ⎣. +
αn1 + − − . . −
+ . . − . . − . . . . . . . . − · ·
⎤ + −⎥ ⎥ ⎥ −⎥ ⎥ .⎥ ⎥ .⎦ −
= A−1 + C. α11 α11 |α1j | = c1j and |αj1 | ≥ 1+α |αj1 | = cj1 . Thus, For j = 1, |α1j | = 1+α 11 11 −1 B ∈ Z, which implies B ∈ Z and completes the proof.
Closure under addition of a nonnegative diagonal matrix follows immediately. Theorem 5.5.2 If A ∈ IM ∩ Mn (R) and D ∈ D+ n , then A + D ∈ IM.
5.5 Additive Diagonal Closure
97
This observation may also be made via an inductive argument based upon a characterization, to be given later, of how an inverse M-matrix may be embedded in another of larger dimension. Remark 5.5.3 We note that only the matrices in D+ n have the above property; i.e., if E is such that A + E ∈ IM for all A ∈ IM, then E must be a nonnegative diagonal matrix. This is straightforward, and we leave details to the reader. Because Theorem 5.5.2 indicates that we may increase diagonal entries and stay in IM, and Theorem 5.3.1 indicates that, in some sense, off-diagonal entries must be smaller than diagonal entries, a natural question to address is whether we may decrease off-diagonal entries and remain IM. This is essentially the case for M because of the diagonal-dominance characterization. That is, if any off-diagonal entry of A ∈ M is decreased in absolute value (increased algebraically, but not past 0), the result is still in M. Unfortunately, this is not the case for IM as illustrated by A and B in the next example, which differ only in the (1, 3) entry. Example 5.5.4 ⎡
⎤ 4 2 1 A = ⎣1 4 2⎦ ∈ IM 2 1 4
⎡
⎤ 4 2 0 and B = ⎣1 4 2⎦ ∈ / IM. 2 1 4
If, however, all the off-diagonal entries in some row (or column) are decreased by a common factor of scale, the outcome is different. Theorem 5.5.5 Suppose that A = [aij ] ∈ IM and that Ak (θ ) = [aij (θ )] is defined by aij (θ) =
θaij
for i = k and j = k,
aij
otherwise.
Then Ak (θ) ∈ IM for 0 ≤ θ ≤ 1 and 1 ≤ k ≤ n. Proof Write Ak (θ) = DA + E where D is the diagonal matrix agreeing with I except for a θ in the k-th diagonal position, and E is the zero matrix except for (1 − θ)akk in the k-th diagonal position. Then, for k > 0, Ak (θ ) ∈ IM by application of Corollary 5.2.2 and Theorem 5.5.2. The case of θ = 0 may be handled by taking limits. An implication of Theorem 5.5.5 is worth noting. Corollary 5.5.6 If A ∈ IM and B is a principal submatrix of A, then B ∈ IM.
98
Inverse M-Matrices Proof Multiple applications to A and AT of Theorem 5.5.5 for θ = 0 leave a matrix in IM that is permutation similar to the direct sum of B and D for some D in D. Its inverse is in M and is permutation similar to the direct sum of B−1 and D−1 . Because B−1 ∈ Z, it follows that B ∈ IM. Closure of IM under submatrix extraction was originally demonstrated in [Mark72] by very different means and parallels the corresponding fact for M. Because (1) principal minors are determinants of principal submatrices, and (2) matrices in Mand IM have positive determinants, we note that Corollary 5.5.6 implies Corollary 5.4.2.
5.6 Power Invariant Zero Pattern Another property of inverse M-matrices has been noted (and even generalized) by several authors [LN80, SRPM82] and is of interest when some off-diagonal entries are 0. That is, the zero-nonzero patterns of inverse M-matrices are power invariant. (This is in contrast with the fact that there is no special restriction on off-diagonal zero entries of M-matrices.) We state this result, which has been proven elsewhere. Theorem 5.6.1 Suppose that A ∈ IM, and let k be a positive integer. Then, the (i, j) entry of Ak is zero if and only if the (i, j) entry of A is zero. We know from Theorem 5.2.1 that IM ⊆ N+ . In N+ , the property of having a power invariant zero pattern is purely combinatorial and, in fact, is equivalent to A and A2 having the same zero pattern. Furthermore, it should be noted that (1) the values of the diagonal entries are immaterial to the question of a power invariant zero pattern and (2) if there are zeros, a power invariant zero pattern must be reducible, and each irreducible component must be strictly positive. In view of Theorems 5.5.2 and 5.3.1, a natural question to raise is: for which A ∈ N+ (or, equivalently, N) does there exist a diagonal D with nonnegative diagonal entries such that A + D ∈ IM? Theorem 5.6.1 gives a necessary condition and shows that not all A ∈ N+ may be put in IM by increasing the diagonal. However, this power invariance of the zero pattern is the only restriction, and except for it, the diagonal is the crucial feature associated with membership in IM, just as it is for M. Thus, the question is purely combinatorial, and its answer, by appealing to the von Neumann expansion, generalizes Theorem 5.6.1. Theorem 5.6.2 Suppose that A ∈ N+ ∩ Mn (R). Then there is a D ∈ D+ n such 2 that A + D ∈ IM if and only if A and A have the same zero pattern.
5.6 Power Invariant Zero Pattern
99
Proof If A + D ∈ IM for some D ∈ D+ n , then the necessity of the same zero pattern in A2 and A follows from Theorem 5.6.1 (which may also be deduced from the calculations to be given below). On the other hand, if A and A2 have the same zero pattern, then the pattern of A is power invariant, and it suffices to show that α ≥ 0 may be chosen so that (αI + A)−1 exists and is in Z. First, suppose that (αI + A)−1 exists and is in Z. Next, suppose that α > ρ(A). Then, 1 1 3 1 1 2 −1 I − A + 2A − 3A + ··· . (αI + A) = α α α α It then suffices to show that α can be chosen so that 1 1 3 1 2 diag A − 2A + 3A − ··· < I α α α and
1 1 A − A2 + 2 A3 − · · · α α
(i)
≥ 0.
For then we would have (αI + A)−1 ∈ Z. Note that (i) is equivalent to diag A(αI + A)−1 < I,
(ii)
(i )
and (ii) is equivalent to A2 (αI + A)−1 ≤ A.
(ii )
Because (αI + A)−1 → 0 as α → ±∞, it is clear that (i ) is satisfied for sufficiently large α. Furthermore, because A has power invariant zero pattern, A2 (αI+A)−1 and A have the same zero pattern, so that the fact that the left-hand side of (ii ) can be made arbitrarily small by the choice of α > 0 means that (ii ) may be satisfied. Because the convergence of (αI + A)−1 and (i) and (ii) may all be satisfied for α sufficiently large, they may all be satisfied simultaneously, which completes the proof. Question 5.6.3 [Joh82] Consider A ∈ N such that αI + A has power invariant zero pattern for α > 0. According to Theorem 5.5.2 and Theorem 5.6.2, there exists a real number α0 such that αI + A ∈ IM for α > α0 and αI + A ∈ / IM for α ≤ α0 . How may α0 , a function of A, be characterized? The corresponding M-matrix question has a pleasantly simple answer (namely, if and only if α > ρ(A)) with no combinatorial requirement. The combinatorial portion of the answer is straightforward in the case of IM (Theorem 5.6.2); however, the analytical portion (determination of α0 (A)) does not appear to be as simple (and not as neatly related to the spectrum). If an answer were available, a characterization of A ∈ N+ that lie in IM would
100
Inverse M-Matrices
follow: for example, D ∈ IM could be chosen so that diag(DA) = I and P = DA − I; then α0 (P) could be compared to 1 to provide an answer (along with a check of power invariance). This question was answered in [JS07a, theorem 4.16.2]. Question 5.6.4 [Joh82] Because A, B ∈ Z, B − A ∈ N, and A ∈ M imply B ∈ M, it seems natural to ask, more generally, under what circumstances are A, B ∈ N, B − A ∈ N, and A ∈ IM imply B ∈ IM? Theorems 5.5.2, 5.5.5 (and the discussion preceding), and 5.6.2 give special cases, but there may be a more encompassing answer. In general, we might ask: given A ∈ N, which satisfies some condition necessary for A ∈ IM, how might the entries of A be adjusted so that the resulting matrix lies in IM?
5.7 Sufficient Conditions for a Positive Matrix to Be IM Thus far, the conditions we have discussed have primarily been only necessary conditions for A ∈ IM. One paper, which also contains several other facts about IM, primarily concentrates on developing a very special condition sufficient for a positive matrix to be an inverse M-matrix [Wil77]. Suppose now that a positive matrix A = [aij ] is diagonally scaled so that aii = 1, i = 1, . . . , n, and aij < 1, i = j. We know that any inverse M-matrix may be scaled this way and that the original matrix is IM if and only if A is. (Such scalings, it should be noted, are not unique because alternate ones may be obtained by diagonal similarity.) Further, let amax = maxi =j aij , amin = mini =j aij , and define t by a2max = tamin + (1 − t)a2min . Under the stated conditions on A, we have 0 < amin ≤ amax < 1. We state but do not re-prove the main result of [Wil77] with t as defined above. Theorem 5.7.1 Let A = [aij ] > 0 be such that aii = 1, i = 1, . . . , n, and aij < 1, i = j. Then A ∈ IM if t−1 ≥ n − 2. Furthermore, the stated condition is tight; if aij = amin , i = j, except for aij = aji = amax when i = n − 1, n and 1 ≤ j ≤ n − 2, then A ∈ IM if and only if t−1 ≥ n − 2. At the opposite extreme from considering when a positive A is IM is the interesting question: which triangular A ∈ N+ are inverse M-matrices? Of course, a triangular matrix in Z must be an M-matrix, but this is now another point of difference between M and IM. If A ∈ N+ is triangular, then A need
5.8 Inverse M-Matrices Have Roots in IM
101
not be in IM; it still depends on the minors, which may have either sign. One extreme case to consider is which (0, 1) matrices are in IM. It is easy to see that such a matrix must have 1s on the diagonal and be permutation similar to a triangular matrix, but this still is not enough to characterize such matrices. In [LN80] the (0, 1) inverse M-matrices are characterized, and the presentation of the characterization is graph-theoretic. Another interesting feature of triangular matrices in either M or IM, and another analogy between the two classes, is noted in [Mark79]. If some power of A ∈ IM(B ∈ M) is permutation similar to a triangular matrix, then A (B) must be also (and by the same permutation). Necessary and sufficient conditions for an invertible, triangular, normalized SPP matrix (see Section 5.12) to be IM are given in Theorem 5.12.6. Factorizations LDU for inverse M-matrices have been studied. There the central question, which at first glance appears harder but is conceivably easier, is: which A ∈ N+ are (finite) products of inverse M-matrices? Clearly, any product of inverse M-matrices is in N+ , but this does not cover N+ , which leaves an intriguing question. The special case of inverse M-matrices that are inverses of tridiagonal matrices is treated in [Lew80].
5.8 Inverse M-Matrices Have Roots in IM In general, matrices in N+ do not necessarily have square roots (or higherorder roots) in N+ (or even N), although any power of a matrix in N+ is in N+ . However, matrices in IM have arbitrary roots not only in N+ but in IM. As we have often seen, this is parallel to a corresponding fact for M-matrices. Theorem 5.8.1 If B ∈ M, then, for each integer k ≥ 1, there is a matrix B1/k q such that (B1/k )k = B and B1/k ∈ M, q = 0, 1, . . . , k. Proof We may write B = αI −P where P ≥ 0 and α > ρ(P). Thus, (1/α)B = I−(1/α)P, and, because it suffices to prove the theorem for (1/α)P, we assume, without loss of generality, that α = 1. For a scalar t, we have ∞ i t x (1 − x) = (−1) i
i=0
for |x| < 1. Because ρ(P) < 1, it follows that (I − P) =
∞ t i P. (−1)i i i=0
102 Because
Inverse M-Matrices t = t(t − 1) . . . (t − i + 1)/i! , i
we have, for 0 < t < 1, such that i t (−1) < 0, i = 1, 2, . . . , i i i t and − ∞ i=1 (−1) i P converges to a nonnegative matrix f (P), whose spectral radius 1 − [1 − ρ(P)]T is less than 1. Thus, for 0 ≤ t ≤ 1, (I − P)T = 1 − f (P) ∈ M. Setting t = 1/k yields the asserted result. Corollary 5.8.2 If A ∈ IM, then, for each integer k ≥ 1, there is a matrix k q A1/k satisfying A1/k = A and A1/k ∈ IM, q = 0, 1, . . . , k. −1 Proof Let A = B−1 , B ∈ M, and A1/k = B1/k , where B1/k is the matrix given by Theorem 5.8.1. Thus, matrices in M(IM) have arbitrary k-th roots within the class IM (M). We note that, because they are high-order powers of matrices in N+ , this is another explanation of why elements of IM have power invariant 0-patterns. Question 5.8.3 [Joh82] Besides powers of elements of IM, are there any other invertible elements of N+ that, for each k = 1, 2, . . . , have k-th roots in N+ ? (Note that although IM is closed under the extraction of particular roots, powers of IM matrices are not necessarily IM because powers of M-matrices are not necessarily M.) However, a power of an element of IM does lie in N+ and has arbitrary roots in N+ because of Corollary 5.8.2.) If not, then IM would be characterized by the fact that powers of its elements have arbitrary roots in N+ . A corresponding question is the following: must each element of N+ which has arbitrary roots in N+ have a root in IM? Question 5.8.4 [Joh82] The invertible elements of N+ which satisfy Hadamard’s inequality form a semigroup. What is the structure of this semigroup, and how does it compare with the semigroup of products of elements of IM?
5.9 Partitioned IM Matrices Summary. Throughout the rest of this chapter we assume that all relevant inverses exist. By considering necessary and/or sufficient conditions for
5.9 Partitioned IM Matrices
103
A ∈ IM when A is written in partitioned form, a great deal can be learned. For instance, Sylvester’s matrix of “bordered” minors can be defined in terms of Schur complements, and the conformal form for A−1 can be defined in terms of inverses of matrices in N+ . The partitioned form of A−1 together with the fact that M-matrices are closed under extraction of principal submatrices and Schur complementation allow us to characterize IM matrices and to determine a number of inequalities involving partitions of IM matrices. Schur’s formula is derived as well as a special case of Sylvester’s identity for determinants; these, in turn, are used to provide necessary and sufficient conditions for a nonnegative matrix to be IM (see Theorems 5.9.7, 5.9.8, 5.9.9). It is noted that necessary and sufficient conditions for a matrix to be IM are generally much harder to come by than those for M-matrices.
5.9.1 Sylvester’s Matrix of Bordered Minors If A is square and A[α] is invertible, recall that the Schur complement of A[α] in A, denoted A/A[α], is defined by A/A[α] = A[α c ] − A[α c , α]A[α]−1 A[α, α c ]. Let α = k. (Due to Theorem 5.2.11, there is no difference between α = k and a general α.) It was shown in [CH69] that if A/A[α] = B = [bij ], then, for k + 1 ≤ i, j ≤ n, bij =
sij det A[α + i, α + j] = , det A[α] det A[α]
in which S = [sij ] is Sylvester’s matrix of “bordered” minors [Gan59], i.e., A[α] A[α, j] . sij = det A[i, α] aij Thus, S = (det A[α])(A/A[α]). 5.9.2 Schur Complement Form of the Inverse We shall use the Schur complement form of the inverse [HJ13] given in the following form. Let the square matrix A be partitioned as A[α] A[α, α c ] (5.9.1) A= A[α c ] A[α c , α] in which A, A[α], and A[α c ] are all invertible. Then
104
Inverse M-Matrices (A/A[α c ])−1 −A[α]−1 A[α, α c ](A/A[α])−1 −(A/A[α])−1 A[α c , α]A[α]−1 (A/A[α])−1 −(A/A[α c ])−1 A[α, α c ]A[α c ]−1 (A/A[α c ])−1 . = −A[α c ]−1 A[α c , α](A/A[α c ])−1 (A/A[α])−1 (5.9.2)
A−1 =
We now make use of the fact that M-matrices are closed under extraction of principal submatrices and under extraction of Schur complements (Schur complementation) [HJ13, HJ91]. A12 in which A11 and Theorem 5.9.1 Let A ≥ 0 be partitioned as A = AA11 A 21 22 A22 are non-void principal submatrices of A. Then, A ∈ IM if and only if (i) (ii) (iii) (iv) (v) (vi)
A/A11 ∈ IM; A/A22 ∈ IM; (A11 )−1 A12 (A/A11 )−1 (A/A11 )−1 A21 (A11 )−1 (A22 )−1 A21 (A/A22 )−1 (A/A22 )−1 A12 (A22 )−1
≥ 0; ≥ 0; ≥ 0; ≥ 0.
Proof For necessity, suppose A ∈ IM, and consider the Schur complement form of its inverse. Because A−1 ∈ M and M-matrices are closed under extraction of principal submatrices, (A/A11 )−1 and (A/A22 )−1 are in M, and (i) and (ii) follow. Statements (iii)–(vi) follow because A−1 ∈ Z. For sufficiency, observe that (i) and (ii) and either (iii) and (iv) or (v) and (vi) ensure that A−1 ∈ Z. This completes the proof. Corollary 5.9.2 IM matrices are closed under extraction of Schur complements. Corollary 5.9.3 IM matrices are closed under extraction of principal submatrices. These follow from Theorem 5.9.1 and the Schur complement form of (A−1 )−1 , respectively. In turn, Theorem 5.9.5 implies that Corollary 5.9.4 IM matrices have positive principal minors. Notice also that Theorem 5.9.1 allows us to zero out any row or column of an IM matrix off the diagonal and remain IM. That is, if
a11 A(t) = tA21
A12 , A22
a A = 11 A21
A12 ∈ IM, A22
5.9 Partitioned IM Matrices and B =
a11 A12 0 A22
105
, then
B−1 =
(a11 )−1 0
−(a11 )−1 A12 (A22 )−1 ∈Z (A22 )−1
because a11 , A12 (A22 )−1 ≥ 0. This fact can also be shown by applying Corollary 5.2.2 and Theorem 5.5.1, i.e., multiply the first column of A by some t, 0 < t < 1, a11 − ta11 > 0 to the (1, 1) entry to obtain the then add a11 A12 IM matrix A(t) = tA21 A22 . By continuity, a11 0
A12 A22
−1
(a11 )−1 = 0
−(a11 )−1 A12 (A22 )−1 ∈ IM. (A22 )−1
Matrix B ∈ M if and only if B/B11 , B/B22 ∈ M, but this does not hold for IM matrices. (When partitioned as in Theorem 5.9.1, the N+ matrix ⎡
⎤ 1 1 1 A = ⎣1 3 4 ⎦ 1 2 3 in which A11 = a11 satisfies A11 , A22 , A/A11 , A/A22 are IM provides a counterexample [Ima84] because A ∈ / IM. Additional conditions are given so that A/A11 , A/A22 ∈ IM implies that A ∈ IM.) We also have A12 in which A11 and Theorem 5.9.5 Let A ≥ 0 be partitioned as A = AA11 A 21 22 A22 are non-void principal submatrices of A. Then, if A ∈ IM, (i) (ii) (iii) (iv) (v) (vi) (vii) (viii) (ix) (x) (xi) (xii)
A11 ∈ IM; A/A11 ∈ IM; A22 ∈ IM; A/A22 ∈ IM; (A11 )−1 A12 ≥ 0; A21 (A11 )−1 ≥ 0; (A22 )−1 A21 ≥ 0; A12 (A22 )−1 ≥ 0; A12 (A/A11 )−1 ≥ 0; (A/A11 )−1 A21 ≥ 0; A21 (A/A22 )−1 ≥ 0; (A/A22 )−1 A12 ≥ 0.
106
Inverse M-Matrices
Proof Observe that (i)–(iv) follow from the preceding remarks. Then, (v)–(xii) follow from (iii)–(vi) of Theorem 5.9.1 upon multiplying by the appropriate choice of A/A11 , A/A22 , A11 , or A22 , which completes the proof. For IM matrices of order 2 or 3, it is obvious from the remarks preceding Theorem
that we can zero out any reducing block (given a block matrix B C5.9.5 A= D E , the block C, respectively, D, is a reducing block provided A, B, E are square) or zero out either triangular part and remain IM. However, these properties do not hold in general. Example 5.9.6 Consider the IM matrix ⎤ ⎡ 20 8 11 5 ⎢19 20 19 12⎥ ⎥ A=⎢ ⎣17 8 20 5 ⎦ . 14 9 14 20 Neither B =
20
8 0 0 19 20 0 0 17 8 20 5 14 9 14 20
nor C =
20
0 0 0 19 20 0 0 17 8 20 0 14 9 14 20
is in IM because the (4, 1)
entry of the inverse of each is positive. For other characterizations of IM matrices, we will utilize Schur’s formula [HJ13], which states that det A = (det A[α])(det A/A[α]) provided A[α] is invertible. We will also apply a special case of Sylvester’s identity for determinants, which follows. Let A be an n-by-n matrix, α ⊆ N , and suppose |α| = k. Define the (n − k)by-(n − k) matrix B = [bij ] by setting bij = det A[α + i, α + j], for every i, j ∈ α c . Then Sylvester’s identity for determinants (see [HJ13]) states that for each δ, γ ⊆ α c , with |δ| = |γ | = m, det B[δ, γ ] = (det A[α])m−1 det A[α ∪ δ, α ∪ γ ].
(5.9.3)
Then, let A be an n-by-n matrix partitioned as follows: ⎡
a11 A = ⎣a21 a31
aT12 A22 aT32
⎤ a13 a23 ⎦ , a33
(5.9.4)
5.9 Partitioned IM Matrices
107
in which A22 is (n − 2)-by-(n − 2), and a11 , a33 are scalars. Define the matrices T a12 a13 a21 A22 A22 a23 a11 aT12 , C= , D= , E= T . B= a21 A22 A22 a23 a31 aT32 a32 a33 If we let b = det B, c = det C, d = det D, and e = det E, then it follows that b c det = det A22 det A. d e Hence, provided det A22 = 0, we have det A =
det B det E − det C det D . det A22
(5.9.5)
With certain nonnegativity/positivity assumptions IM matrices can be characterized in terms of Schur complements [JS01a]. Theorem 5.9.7 Let A = [aij ] ≥ 0. Then A ∈ IM if and only if A has positive diagonal entries, all Schur complements are nonnegative, and all Schur complements of order 1 are positive. In fact, these conditions can be somewhat relaxed. Theorem 5.9.8 Let A = [aij ] ≥ 0. Then A ∈ IM if and only if (i) A has at least one positive diagonal entry, (ii) all Schur complements of order 2 are nonnegative, and (iii) all Schur complements of order 1 are positive. Proof For necessity, assume that A ∈ IM. Then A has positive diagonal entries and, because IM matrices are closed under extraction of Schur complements, each Schur complement is nonnegative. So we just need to show those of order 1 are positive. But this follows from Schur’s formula because A has positive principal minors. For sufficiency, suppose (i), (ii), and (iii) hold, say, aii > 0. Observe that (by considering all Schur complements A[{i, j}]/aii in which j = i), (iii) implies that A has positive diagonal entries. Claim. All principal minors of A are positive. Proof of Claim. If A is 1-by-1, then the claim certainly holds. So, inductively, assume the claim holds for all matrices of order < n satisfying (i), (ii), and (iii) A12 . Thus, all principal submatrices of order < n have posand let A = Aa11 21 A22 itive determinant, and it suffices to prove that det A > 0. By Schur’s formula, det A = (det A22 )(a11 − A12 (A22 )−1 A21 ). The inductive hypothesis implies
108
Inverse M-Matrices
det A22 > 0 and thus the positivity of det A follows from (iii), completing the proof of the claim. Now let A−1 = B = [bij ] and consider bij , i = j, and assume, without loss of generality, that i < j. Define the sequences α = < 1, . . . , i − 1, i + 1, . . . , j − 1, j + 1, . . . , n >, α1 = < 1, . . . , i − 1, i + 1, . . . , j − 1, j + 1, . . . , n, i >, and α2 = < 1, . . . , i − 1, i + 1, . . . , j − 1, j + 1, . . . , n, j > . Then, bij = (−1)i+j
det A(j, i) det A
det A[α1 , α2 ] det A det A[α] = −(aij − A[i, α](A[α])−1 A[α, j]) det A ≤ 0.
= (−1)i+j (−1)n−i−1 (−1)n−j
The latter inequality holds because A/A[α] is a Schur complement of order 2 and hence is nonnegative. Thus, A−1 ∈ Z which implies A ∈ IM, and completes the proof. From the latter part of the proof of Theorem 5.9.8 we obtain another characterization of IM matrices. Theorem 5.9.9 Let A = [aij ] ≥ 0. Then A ∈ IM if and only if (i) det A > 0 and (ii) for each principal submatrix B of order n − 2, det B > 0 and A/B ≥ 0. As noted in Corollaries 5.9.2 and 5.9.3, IM matrices are closed under extraction of Schur complements and under extraction of principal submatrices. Conversely, if A ≥ 0 with principal submatrix B and both B and A/B are IM, then A is not necessarily IM.
5.10 Submatrices Summary. In this section almost principal submatrices are defined and signs of almost principal minors (APMs) of M and IM matrices are established. The operations of inverting and extracting a principal submatrix are shown to lead to entry-wise inequalities when these operations are applied to M and IM
5.10 Submatrices
109
matrices. For normalized IM matrices, it is shown that if a principal (resp., almost principal) submatrix is properly contained in another of the same type, then the magnitude of the larger principal (resp., almost principal) minor is smaller than that of the smaller, i.e., “larger” minors are smaller. It is also shown that if the inverse of a proper principal submatrix contains a block of zeros, then the inverse of the matrix itself has a larger block of zeros in related positions. Moreover, if a proper principle submatrix of an IM matrix contains a 0, then so does some principal submatrix of the same size. Finally, the last statement holds for general invertible matrices, and analogs of these results are shown to hold for positive definite matrices.
5.10.1 Principal and Almost Principal Submatrices Square submatrices that are defined by index sets differing in only one index, or the minors that are their determinants, are called almost principal submatrices. For simplicity we abbreviate “almost principal minor” (“principal minor”) to APM (PM). APMs are special for a variety of reasons including that, in the co-factor form of the inverse, they are exactly the numerators of off-diagonal entries of inverses of principal submatrices. So, if A ∈ M or A ∈ IM, an APM is 0 if and only if an off-diagonal entry of the inverse of a principal submatrix equals 0. Using the informal notation α + i (α − i) to denote the augmentation of the set α by i ∈ / α (deletion of i ∈ α from α), almost principal submatrices are of the form A[α + i, α + j], i, j ∈ / α (A[α − i, α − j], i, j ∈ α), i = j. All PMs in A ∈ M or A ∈ IM are positive. Because of inheritance (of the property IM under extraction of principal submatrices [Joh82]), the sign of every nonzero APM in A ∈ M or A ∈ IM is determined entirely by its position. Specifically, if α ⊆ N and i, j ∈ α, sgn(det A[α − i, α − j]) equals (−1)r+s if A ∈ M and (−1)r+s+1 if A ∈ IM in which r (resp., s) is the number of indices in α less than or equal to i (resp., j). An analogous statement can be made concerning det A[α + i, α + j]. For an individual minor of an IM matrix that is neither principal nor an APM, there is no constraint upon the sign. Our purpose here is to present more subtle information about APMs of an IM matrix: certain inequalities and relations among those that may be 0. To clarify our interest, consider the following example. Example 5.10.1 Consider the IM matrix ⎤ 1 .5 .4 .2 ⎢.8 1 .8 .4⎥ ⎥ A=⎢ ⎣.6 .5 1 .4⎦ . .2 .2 .25 1 ⎡
110
Inverse M-Matrices
Notice that the only vanishing APMs are the determinants of A[{1, 2}, {2, 3}], A[{1, 2}, {2, 4}], A[{1, 2, 3}, {2, 3, 4}], and A[{1, 2, 4}, {2, 3, 4}]. Thus, the (1, 3) entry of each of A[{1, 2, 3}]−1 and A[{1, 2, 4}]−1 is 0 while both the (1, 3) and (1, 4) entries of A−1 are 0 and these are the only entries that vanish in the inverse of any principal submatrix. The preceding example leads to three questions. Question 5.10.2 If the inverse of a proper principal submatrix of an IM matrix contains a block of 0s, does this imply that the inverse of the matrix itself has a larger block of 0s (and, somehow, in related positions)? Question 5.10.3 If the inverse of an IM matrix contains a block of 0s, does this imply that there is a block of 0s in the inverse of some other principal submatrix? Question 5.10.4 If the inverse of a proper principal submatrix of an IM matrix contains a 0, does this imply that the inverse of some other proper principal submatrix of the same size must also contain a 0? The answers to Questions 5.10.2 and 5.10.4 are certainly not in the affirmative for invertible matrices in general. Example 5.10.5 Consider the invertible matrix ⎡
1 ⎢3 A=⎢ ⎣2 4
2 1 3 3
4 2 4 2
⎤ 3 1⎥ ⎥. 1⎦ 5
The (3, 1) minor of A[{1, 2, 3}] is 0, but no other minor of A is 0. In fact, no other minor of any principal submatrix of A is 0. Of course, there may be “isolated” 0s in the inverse of an IM matrix, and examples are easily found in which there is a single 0 entry in its inverse. All three questions above will be answered affirmatively later, with precise descriptions of these phenomena. Moreover, Question 5.10.3 will be answered for invertible matrices in general. Because of Jacobi’s determinantal identity, there are often analogous statements about matrices in M. It should be noted that besides PMs and APMs no other minors have deterministic signs throughout M or throughout IM; analogously, there seem to be no results like those we present beyond PMs and APMs.
5.10 Submatrices
111
5.10.2 Inverses and Principal Submatrices The two operations of inverting and extracting a principal submatrix do not, of course, in general commute when applied to a given matrix (for which both are defined). There are, however, several interesting, elementary, entry-wise inequalities when these operations are applied to A ∈ M or A ∈ IM (for which they are always defined) in various orders. We record these here for reference and reflection. (Compare to inequalities in the positive semidefinite ordering for positive definite matrices [JS01b].) Throughout ≤ or < should be interpreted entry-wise. Theorem 5.10.6 If A ∈ M or A ∈ IM and ∅ = α ⊆ n, then (i) (A−1 [α])−1 ≤ A[α]; and (ii) A[α]−1 ≤ A−1 [α]. Proof Notice that (i) holds for A ∈ M because (A−1 [α])−1 = A/A[α c ] = A[α] − A[α, α c ]A[α c ]−1 A[α c , α] ≤ A[α]. (The latter inequality holds because A[α c ]−1 ≥ 0.) (i) holds for A ∈ IM by the same argument because A[α c ]−1 A[α c , α] ≥ 0. (The former fact has been noted previously [JS01b], while the latter fact does not seem to have appeared in the literature.) Because (ii) for A ∈ M (A ∈ IM) is just a restatement of (i) for A ∈ IM (A ∈ M), the theorem holds. It follows from Theorem 5.10.6(ii) that if there are some 0 off-diagonal entries in A[α]−1 , A ∈ IM, then there are 0 entries in A−1 in the corresponding positions. In particular, if a certain APM in A[α] vanishes, then a larger (i.e., more rows and columns) APM in a corresponding position vanishes in A. Actually, more can be said, as we shall see later. A hint of this is the following. An entry of a square matrix is, itself, an APM; if an entry of A ∈ IM is 0, it is known that A must be reducible [JS11] and thus, if n > 2, other entries (i.e. other APMs of the same size) must be 0.
5.10.3 Principal and Almost Principal Minor Inequalities
It follows specifically from the observations of the last section that if A ∈ IM, α ⊆ β ⊆ n, and the APM det A[α + i, α + j] = 0, then det A[β + i, β + j] = 0 when i, j ∉ β. Thus, a "smaller" vanishing APM implies that any "larger" one containing it also vanishes. This suggests that there may, in general, be inequalities between such minors. We note, in advance, that Theorem 5.10.6(ii) gives some inequalities, but we give additional ones here.
We first give inequalities for the normalized case that may be paraphrased as saying that "larger" minors are smaller. The first inequality is a special case of Fischer's inequality, while the second was derived in [JS01b].

Theorem 5.10.7 If A ∈ IM is normalized and ∅ ≠ α ⊆ β ⊆ n − {i, j}, then (i) det A[β] ≤ det A[α]; and (ii) |det A[β + i, β + j]| ≤ |det A[α + i, α + j]|.

Proof Noting that the determinantal inequalities of Fischer and Hadamard hold for inverse M-matrices [HJ91], we have det A[β] ≤ det A[β − α] det A[α] ≤ det A[α], which establishes (i). Now assume that i ≠ j, B = A[β + i + j], and γ = α + i + j. Then B ∈ IM also, and, from Theorem 5.10.6(ii), we have |(B^{-1})_{ij}| ≤ |(B[γ]^{-1})_{ij}|, or equivalently,

\left| \frac{\det B[\beta + i, \beta + j]}{\det B} \right| \le \left| \frac{\det B[\alpha + i, \alpha + j]}{\det B[\gamma]} \right|.

Thus,

|\det B[\beta + i, \beta + j]| \le \frac{\det B}{\det B[\gamma]}\, |\det B[\alpha + i, \alpha + j]| \le (\det B[\beta + i + j - \gamma])\, |\det B[\alpha + i, \alpha + j]| \ \text{(by Fischer)} \le |\det B[\alpha + i, \alpha + j]| \ \text{(by Hadamard)}.

Because B is a principal submatrix of A, (ii) follows.

Because A = [aij] ∈ IM may be normalized via multiplication by D = diag(a11, . . . , ann)^{-1}, we may easily obtain nonnormalized inequalities from Theorem 5.10.7.

Theorem 5.10.8 If A = [aij] ∈ IM and ∅ ≠ α ⊆ β ⊆ n − {i, j}, then (i) det A[β] ≤ det A[α] \prod_{i ∈ β−α} a_{ii}; and (ii) |det A[β + i, β + j]| ≤ |det A[α + i, α + j]| \prod_{i ∈ β−α} a_{ii}.
5.11 Vanishing Almost Principal Minors From prior discussion we know that, for A ∈ IM, if an entry of A[α]−1 is 0, then a corresponding entry of A−1 is 0. However, if α ⊆ n properly, one quickly finds that, unlike for general matrices, it is problematic to construct an example in which an entry of A[α]−1 is 0 and just one entry of A−1 (the corresponding one) is 0. We present here two results that show there is a good
reason for this. In one, it is shown that any block of 0s in A[α]^{-1} implies a "larger" block of 0s in A^{-1}, and in the other it is shown that any 0 entry in A[α]^{-1} implies 0s in certain other matrices A[β]^{-1} when |β| = |α| < n. Both results lead to interpretations in terms of rank of submatrices of IM matrices rather like the row/column inclusion results for positive semidefinite and other matrices, noted in [Joh98]. Because a 0 entry in A ∈ IM implies that A is reducible and thus has other 0 entries (if n > 2), it follows that if A ∈ IM and A[α]^{-1} has a 0 entry, |α| = 2, then A^{-1} is reducible and the 0 entry is actually part of a 0 block in A^{-1}. This generalizes substantially even when A ∈ IM is positive and answers Question 5.10.2.

Theorem 5.11.1 Suppose that A ∈ IM and that γ = n − i for some i ∈ n. Then, if A[γ]^{-1} has a p-by-q 0 submatrix, A^{-1} has either a p-by-(q + 1) or a (p + 1)-by-q 0 submatrix. Specifically, if A[γ]^{-1}[α, β] = 0 in which |α| = p and |β| = q, then either A^{-1}[α, β + i] = 0 or A^{-1}[α + i, β] = 0.

Proof Let A ∈ IM and γ = n − i for some i ∈ n. Without loss of generality, assume that A has the partitioned form

A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}   (5.11.1)

in which A11 = aii for some i, 1 ≤ i ≤ n, and A22 = A[γ]. If an invertible matrix A is partitioned as in (5.11.1) with A22 invertible, then

A^{-1} = \begin{bmatrix} s^{-1} & -s^{-1}u^T \\ -s^{-1}v & A_{22}^{-1} + s^{-1}vu^T \end{bmatrix}   (5.11.2)

in which s = A/A22, u^T = A12 A22^{-1}, and v = A22^{-1} A21. Thus, if A22 ∈ IM, it is easily seen (and was first noticed in [Joh82]) that A ∈ IM if and only if (i) s > 0, (ii) u^T ≥ 0, (iii) v ≥ 0, and (iv) A22^{-1} ≤ −s^{-1}vu^T, except for diagonal entries. Now A22^{-1} ∈ M and hence is in Z. Assume that A22^{-1} has a p-by-q submatrix of 0s, say A22^{-1}[α, β] = 0 in which |α| = p and |β| = q. If vr = 0 for all r ∈ α, then A^{-1}[α, β + i] is a p-by-(q + 1) 0 submatrix of A^{-1}. On the other hand, if vr ≠ 0 for some r ∈ α, then it follows from (iv) that us = 0 for all s ∈ β and hence A^{-1}[α + i, β] is a (p + 1)-by-q 0 submatrix of A^{-1}. This completes the proof.
In regard to Example 5.10.1, we see that for γ = {1, 2, 3} or {1, 2, 4}, a 1-by-1 block of 0s in A[γ ]−1 leads to a 1-by-2 block of 0s in A−1 . It is easy to construct examples such that both possibilities for the 0 block of A−1 occur and also such that exactly one of the possibilities occurs.
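A minimal Python sketch (ours; the particular lower-triangular construction is just a convenient way to produce the required zero pattern, not an example from the text) illustrating Theorem 5.11.1:

import numpy as np

# A lower-triangular M-matrix C; its inverse A = C^{-1} is an IM matrix.
C = np.array([[ 1.,  0.,  0.,  0.],
              [-1.,  2.,  0.,  0.],
              [-1., -1.,  3.,  0.],
              [-1., -1., -1.,  4.]])
A = np.linalg.inv(C)

gamma = [0, 1, 2]                                  # gamma = n - {4} (0-based indices)
Agam_inv = np.linalg.inv(A[np.ix_(gamma, gamma)])
Ainv = np.linalg.inv(A)

# A[gamma]^{-1} has a 1-by-2 zero block in row {1}, columns {2,3} (1-based),
# and A^{-1} has the larger 1-by-3 zero block in row {1}, columns {2,3,4}.
assert np.allclose(Agam_inv[0, 1:], 0.0)
assert np.allclose(Ainv[0, 1:], 0.0)
print("the zero block of A[gamma]^{-1} enlarges to a wider zero block of A^{-1}")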
Let the measure of irreducibility, m(A), of an invertible n-by-n matrix A be defined as m(A) = max(p,q)∈S (p + q) in which S = { (p, q) | A contains a p-by-q off-diagonal zero submatrix }. Thus, an n-by-n matrix A is reducible if and only if m(A) = n. Moreover, in Theorem 5.11.1, we have shown if B is an (n − 1)-by-(n − 1) principal that if B is a k-by-k submatrix of A ∈ IM, then m A−1 ≥ m B−1 + 1. In general, principal submatrix of A ∈ IM, then m A−1 ≥ m B−1 +n−k. (Note that the latter statement implies that if B is a reducible principal submatrix of A ∈ IM, then A is reducible also. And, in particular, if A ∈ IM has a 0 entry, then A is reducible, a fact noted previously.) The question is: when are these inequalities in fact equalities? In order to establish a converse to Theorem 5.11.1 we will need the following result on complementary nullities. This fact has origins in [Gan59] and has been refined, for example, in [JL92]. Let Null(A) denote the (right) null space of a matrix A, nullity(A) denote the nullity of A, i.e., the dimension of Null(A), and rank(A) denote the rank of A. Theorem 5.11.2 (Complementary Nullities) Let A be an n-by-n invertible matrix and ∅ = α, β ⊆ n with α ∩ β = ∅. Then, Null(A−1 [α, β]) = Null(A[β c , α c ]). We use Theorem 5.11.1 to prove a general result on 0 patterns of inverses. Theorem 5.11.3 Let ∅ = α, β ⊆ n with α ∩ β = ∅, and let A ∈ Mn (F) where F is an arbitrary field. If A−1 [α, β] = 0, then, for any γ ⊆ n such that α ∩ γ = ∅, β ∩ γ = ∅, (α ∪ β)c ⊆ γ , and A[γ ] is invertible, we have A[γ ]−1 [α ∩ γ , β ∩ γ ] = 0. Proof Suppose that A ∈ Mn (F) and A−1 [α, β] = 0 in which ∅ = α, β ⊆ n with α ∩ β = ∅. Then nullity(A−1 [α, β]) = |β|; hence, by complementary nullity, nullity(A[β c , α c ]) = |β|. Therefore, rank(A[β c , α c ]) = |α c | − |β| = n − |α| − |β| = |(α ∪ β)c |. Further, suppose that γ ⊆ n satisfies α ∩ γ = ∅, β ∩ γ = ∅, (α ∪ β)c ⊆ γ , and A[γ ] is invertible. Because γ − α ∩ γ = γ ∩ α c and γ − β ∩ γ = γ ∩ β c, rank(A[γ ][γ − β ∩ γ , γ − α ∩ γ ]) ≤ rank (A[β c , α c ]) = |(α ∪ β)c |.
Thus, nullity(A[γ ][γ − β ∩ γ , γ − α ∩ γ ]) ≥ |γ − α ∩ γ | − |(α ∪ β)c | = |β ∩ γ | + |(α ∪ β)c | − |(α ∪ β)c | = |β ∩ γ |. Hence, by complementary nullity, nullity(A[γ ]−1 [α ∩ γ , β ∩ γ ]) ≥ |β ∩ γ |. This last inequality must be an equality, i.e., A[γ ]−1 [α ∩ γ , β ∩ γ ] = 0, which completes the proof. We can make similar statements concerning the minors of A and A[γ ] even if A is singular. It is easy to show that the condition (α ∪ β)c ⊆ γ is necessary. For instance, in Example 5.10.1, A−1 [α, β] = 0 in which α = {1} and β = {3, 4}. But if γ = α ∪ β (and thus α ∩ γ = ∅ and β ∩ γ = ∅, while it is not the case that (α ∪ β)c ⊆ γ ), then A[γ ]−1 has no 0 entries. We are now able to answer Question 5.10.3. Corollary 5.11.4 Let A ∈ IM with m A−1 = p + q, say A−1 [α, β] = 0 in which ∅ = α,β ⊆ n with |α| = p and |β| = q. Further, let γ = N −i for some i ∈ N, and assume that α∩γ = ∅ and β∩γ = ∅. Then, A[γ ]−1 [α∩γ , β∩γ ] = 0 if and only if (α ∪ β)c ⊆ γ . Proof Assume the hypothesis holds and observe that α ∩ β = ∅ because A ∈ IM. First, suppose that A[γ ]−1 [α ∩ γ , β ∩ γ ] = 0. Then m A−1 ≥ |α ∩ γ | + |β ∩ γ |. By the remarks after Theorem 5.11.1, m A−1 ≥ m A[γ ]−1 + n − |γ | and so |α| + |β| ≥ |α ∩ γ | + |β ∩ γ | + n − |γ |. Rearranging, we have (|α| − |α ∩ γ |) + (|β| − |β ∩ γ |) + |γ | ≥ n or equivalently, |α − γ | + |β − γ | + |γ | ≥ n. The latter holds (and with equality) if and only if (α ∪ β)c ⊆ γ .
The converse follows from Theorem 5.11.3 because all principal submatrices of A are invertible. In regard to Example 5.10.1, A−1 [α, β] = 0 in which α = {1} and β = {3, 4}. For γ = {1, 2, 3} or {1, 2, 4}, we have α ∩ γ = ∅, β ∩ γ = ∅, and (α ∪ β)c ⊆ γ . In each case A[γ ]−1 [α ∩ γ , β ∩ γ ] = 0. On the other hand, for γ = {1, 3, 4}, we have α ∩ γ = ∅, β ∩ γ = ∅, but (α ∪ β)c ⊆ γ does not hold, and A[γ ]−1 [α ∩ γ , β ∩ γ ] = 0. So we see that the inequality m A−1 ≥ m B−1 +1 where B is an (n−1)-by(n − 1) principal submatrix of A (noted in the discussion after Theorem 5.11.1) is an equality as long as B = A[γ ] in which α ∩ γ = ∅, β ∩ γ = ∅, and (α ∪β)c ⊆ γ . Also, Example 5.10.1 (with γ = {2, 3, 4}) shows that, otherwise, the inequality may be strict. We next discuss vanishing APMs and in response to Question 5.10.4 establish the fact that they imply that other APMs, of the same size, vanish. We will use the following lemma. Lemma 5.11.5 Let A = [aij ] be an n-by-n IM matrix (n ≥ 3) and j ∈ n. If ai1 j , ai2 j , . . . , ait j = 0, then, for all k ∈ / {j, i1 , . . . , it }, either akj = 0 or ai1 k , ai2 k , . . . , ait k = 0. Proof This follows from the fact [Joh82, Wil77] that if A is an n-by-n IM matrix (n ≥ 3), then for all i, j, k ∈ n, aik akj ≤ aij akk . Theorem 5.11.6 Let A = [aij ] be an n-by-n IM matrix (n ≥ 3), let i, j, k be distinct indices in n, and let δ be a subset of n − {i, j, k}. Then, if det A[δ + i, δ + j] = 0, (i) det A[δ + i, δ + k] = 0 or (ii) det A[δ + k, δ + j] = 0. Proof Let A = [aij ] be an n-by-n IM matrix (n ≥ 3), let i, j, k be distinct indices in n, let δ be a subset of n − {i, j, k}, and assume that det A[δ + i, δ + j] = 0. If δ = ∅, the result follows because A[{i, j, k}] must be reducible. So assume that δ = ∅. By permutation similarity we may assume that i = i1 , δ = {i2 , . . . , ip−1 }, and j = ip . Let A1 = A[{i1 , . . . , ip }]. Because 0 = det A[δ + i, δ + j] = det B where B = A[{i1 , . . . , ip−1 }, {i2 , . . . , ip }], we see that the (ip , i1 ) minor of A1 is 0. Further, because A[δ] is a (p−2)-by-(p−2) principal submatrix of A lying in the lower left corner of B = [b1 , . . . , bp−1 ], p−2 {b1 , . . . , bp−2 } is linearly independent and thus bp−1 = i=1 βi bi . If bp−1 = 0, then, by Lemma 5.11.5, either akip = akj = 0 or ai1 k , ai2 k , . . . , aip−1 k = 0. The former case implies det A[δ + k, δ + j] = 0, while the latter implies
det A[δ + i, δ + k] = 0. So assume bp−1 > 0. Then βi = 0 for some i, 1 ≤ i ≤ p − 2. By simultaneous permutation of rows and columns indexed by δ, we may assume βi = 0. Let k = ip+1 and consider the submatrix A2 = A[{i1 , . . . , ip+1 }] of A. Because A1 is a principal submatrix of A2 , the (ip , i1 ) minor of A2 is 0 also, i.e., 0 = det A[{i1 , . . . , ip−1 , ip+1 }, {i2 , . . . , ip+1 }] = det A3 . Now the (p − 1)-by-(p − 1) submatrix lying in the upper left corner of A3 is B and det B = 0. Therefore, applying Sylvester’s identity for determinants to det A3 , we see that either (1) det A[{i1 , . . . , ip−1 }, {i3 , . . . , ip+1 }] = 0, or (2) det A[{i2 , . . . , ip−1 , ip+1 }, {i2 , . . . , ip }] = 0. Case I. Suppose (1) holds. Then, 0 = det A[{i1 , . . . , ip−1 }, {i3 , . . . , ip+1 }] = det[b2 , . . . , bp−2 , bp−1 , bp ] ⎤ ⎡ p−2 β i b i , bp ⎦ = det ⎣b2 , . . . , bp−2 , i=1
= det[b2 , . . . , bp−2 , β1 b1 , bp ] = (−1)p−3 β1 det[b1 , b2 , . . . , bp−2 , bp ]. Because β1 = 0, 0 = det[b1 , b2 , . . . , bp−2 , bp ] = det A[{i1 , . . . , ip−1 }, {i2 , . . . , ip−1 , ip+1 }] = det A[δ + i, δ + k]. This establishes (i). Case II. Suppose (2) holds. Then, 0 = det A[{i2 , . . . , ip−1 , ip+1 }, {i3 , . . . , ip+1 }] = det A[δ + k, δ + j]. This establishes (ii) and completes the proof. We can also prove Theorem 5.11.6 using Theorems 5.11.1 and 5.11.3 as follows. Let ρ = δ + i + j and τ = ρ + k. Because det A[δ + i, δ + j] = 0,
A[ρ]−1 [i, j] = 0. So, by Theorem 5.11.1, either A[τ ]−1 [i, j + k] = 0 or A[τ ]−1 [i + k, j] = 0. Suppose that A[τ ]−1 [i, j + k] = A[τ ]−1 [α, β] = 0. Let γ = δ + i + k = τ − j. Then, in A[δ + i + j + k], we have α ∩ γ = ∅, β ∩ γ = ∅, and (α ∪ β)c = δ ⊆ γ . Thus, by Theorem 5.11.3, A[τ ]−1 [α ∩ γ , β ∩ γ ] = A[γ ]−1 [i, k] = 0, which implies that det A[δ + i, δ + k] = 0. Similarly, it can be shown that if A[γ ]−1 [i + k, j] = 0, then det A[δ + k, δ + j] = 0. In regard to Example 5.10.1, the {1, 2},{2, 3} minor of A vanishes. Hence, by Theorem 5.11.6, either the {1, 2},{2, 4} minor or the {2, 4},{2, 3} minor must vanish. The former was true. M-matrices and IM matrices are known for their similarities with the positive definite matrices, and these results are no exception. In the positive definite matrices Theorem 5.11.6 has an obvious analog (namely that the δ + i, δ + j minor is 0 if and only if the δ + j, δ + i minor is 0), while the implication of Corollary 5.11.4 that follows from Theorem 5.11.3 has an identical analog. Less obvious is an analog to Theorem 5.11.1, yet there is one. Recall that positive semidefinite matrices have an interesting property that may be called “row and column inclusion” [Joh98]. If A is an n-by-n positive semidefinite matrix, then, for any index set α ⊆ n and any index i ∈ / n − α, the row A[i, α] (and thus the column A[α, i])lies in the row space of A[α] (column space of A[α]). Of course, this is interesting only in the event that A[α] is singular, and, so, there is no complete analog for IM matrices in which every principal submatrix is necessarily invertible. However, just as there are slightly weakened analogs in the case of totally nonnegative matrices “row or column inclusion” [Joh98]), Theorem 5.11.1 implies certain analogs for IM matrices; now the role of principal submatrices is replaced by almost principal submatrices. We will need the following lemma. Here, Row(A) (Col(A)) denotes the row (column) space of a matrix A. Lemma 5.11.7 Let the n-by-n matrix B have the partitioned form
B = \begin{bmatrix} C & d \\ e^T & f \end{bmatrix}

in which f is a scalar and rank(C) = n − 2. Then, if d ∉ Col(C) and e^T ∉ Row(C), B is invertible.

Proof Because d ∉ Col(C), rank([C d]) = n − 1, and because e^T ∉ Row(C), [e^T f] ∉ Row([C d]). This implies rank(B) = n, i.e., B is invertible.

Corollary 5.11.8 Let A be an n-by-n IM matrix, α ⊆ n, and i, j ∈ n − α. Then, for each k ∉ α + i + j, either A[k, α + j] lies in the row space of A[α + i, α + j] or A[α + i, k] lies in the column space of A[α + i, α + j].
Proof The corollary certainly holds if A[α + i, α + j] is invertible. So assume not. Then det A[α + i, α + j] = 0 and rank (A[α + i, α + j]) = |α|. Thus, A[α + i + j]−1 has a 0 in the (i, j) position, and if k ∈ / α + i + j, it follows from −1 Theorem 5.11.1 that A[α + i + j + k] also has a 0 in the (i, j) position. Hence, det A[α + i + k, α + j + k] = 0. With B = A[α + i + k, α + j + k] and C = A[α + i, α + j], we see that the result follows upon applying Lemma 5.11.7. We note that in the statement of Corollary 5.11.8 the almost principal submatrix A[α + i, α + j] could be replaced by A[α, α + j] in the row case and by A[α +i, α] in the column case to yield a stronger, but less symmetric, statement. In the row or column inclusion results for totally nonnegative matrices, a bit more is true, the either/or statement is validated either always by rows or always by columns (or both), and this is (trivially) also so in the positive semidefinite case. Thus, it is worth noting that such phenomenon does not carry over to the almost principal IM case. Example 5.11.9 Consider the IM matrix ⎤ ⎡ 14 4 1 1 4 ⎢ 6 16 4 4 6 ⎥ ⎥ ⎢ ⎥ ⎢ A = ⎢ 6 6 14 4 6 ⎥ . ⎥ ⎢ ⎣ 6 6 4 14 6 ⎦ 4 4 1 1 14 The hypothesis of the corollary is satisfied with α = {2}, i = 1, and j = 3, and the conclusion is satisfied for k = 4 by columns and not rows and for k = 5 by rows and not columns. Examples in which satisfaction is always via columns or always via rows are easily constructed.
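The assertions of Example 5.11.9 are easy to confirm numerically; the following sketch (ours, assuming NumPy) checks row-or-column inclusion with α = {2}, i = 1, j = 3 for k = 4 and k = 5:

import numpy as np

A = np.array([[14.,  4.,  1.,  1.,  4.],
              [ 6., 16.,  4.,  4.,  6.],
              [ 6.,  6., 14.,  4.,  6.],
              [ 6.,  6.,  4., 14.,  6.],
              [ 4.,  4.,  1.,  1., 14.]])

rows, cols = [0, 1], [1, 2]          # alpha + i = {1, 2}, alpha + j = {2, 3} (1-based)
M = A[np.ix_(rows, cols)]            # singular: det = 4*4 - 1*16 = 0

def in_row_space(v, M):
    return np.linalg.matrix_rank(np.vstack([M, v])) == np.linalg.matrix_rank(M)

def in_col_space(v, M):
    return in_row_space(v, M.T)

for k in (3, 4):                                    # k = 4, 5 in 1-based indexing
    row_ok = in_row_space(A[k, cols], M)            # A[k, alpha+j] in Row(A[alpha+i, alpha+j]) ?
    col_ok = in_col_space(A[rows, k], M)            # A[alpha+i, k] in Col(A[alpha+i, alpha+j]) ?
    print("k =", k + 1, ": rows", row_ok, ", columns", col_ok)
    assert row_ok or col_ok                         # Corollary 5.11.8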
5.12 The Path Product Property Summary. In this section, for nonnegative matrices, the path product (PP) property is defined as well as the strict path product (SPP) property. It is shown that every IM matrix is SPP and, for n ≤ 3, every SPP matrix is IM. However, the latter statement does not hold for n > 3. It is noted that every normalized PP (normalized SPP) matrix A has a normalized matrix Aˆ that is positive diagonally similar to A in which all (resp., off-diagonal) entries are ≤ 1 (resp. < 1). A characterization of invertible, triangular, normalized SPP matrices is given in terms of path products, thereby answering a question posed in [Joh82]. It is noted that PP matrices can be used to establish a number of facts about IM
matrices. The notions of purely strict path product (PSPP) and totally strict path product (TSPP) are introduced.
5.12.1 (Normalized) PP and SPP Matrices
Let A = [aij] be an n-by-n nonnegative matrix with positive diagonal entries. We call A a path product (PP) matrix if, for any triple of indices i, j, k ∈ N, aij ajk ≤ aik ajj, and a strict path product (SPP) matrix if there is strict inequality whenever i ≠ j and k = i [JS01a]. In [JS01a] it was noted that any IM matrix is SPP and that for n ≤ 3 (but not greater) the two classes are the same. See also [Wil77]. For a PP matrix A and any path i1 → i2 → · · · → ik−1 → ik in the complete graph Kn on n vertices, we have

a_{i_1 i_2} a_{i_2 i_3} \cdots a_{i_{k-1} i_k} \le a_{i_1 i_k}\, a_{i_2 i_2} a_{i_3 i_3} \cdots a_{i_{k-1} i_{k-1}}   (5.12.1)

and, if in addition, A is SPP, then the inequality is strict. We call (5.12.1) the path product inequalities and, if i1 = ik in (5.12.1), the cycle product inequalities. We call a product a_{i_1 i_2} a_{i_2 i_3} \cdots a_{i_{k-1} i_k} an (i1, ik) path product (of length k − 1) and, if ik = i1, an (i1, ik) cycle product (of length k − 1). In (5.12.1), if ik = i1, we see that the product of entries around any cycle is no more than the corresponding diagonal product, i.e.,

a_{i_1 i_2} a_{i_2 i_3} \cdots a_{i_{k-1} i_1} \le a_{i_1 i_1} a_{i_2 i_2} \cdots a_{i_{k-1} i_{k-1}}.   (5.12.2)
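The PP and SPP conditions are straightforward to test directly from the defining inequalities; a minimal Python sketch (ours; the helper names are not from the text) follows, with the sanity check that the inverse of a randomly generated M-matrix passes:

import numpy as np

def is_pp(A, tol=1e-12):
    # Path product: a_ij * a_jk <= a_ik * a_jj for every triple of indices.
    n = A.shape[0]
    return all(A[i, j] * A[j, k] <= A[i, k] * A[j, j] + tol
               for i in range(n) for j in range(n) for k in range(n))

def is_spp(A, tol=1e-12):
    # Strict path product: PP, with a_ij * a_ji < a_ii * a_jj whenever i != j.
    n = A.shape[0]
    strict = all(A[i, j] * A[j, i] < A[i, i] * A[j, j] - tol
                 for i in range(n) for j in range(n) if i != j)
    return is_pp(A, tol) and strict

# Sanity check: the inverse of an M-matrix (an IM matrix) should be SPP.
rng = np.random.default_rng(1)
P = rng.random((5, 5))
B = (np.max(np.abs(np.linalg.eigvals(P))) + 1.0) * np.eye(5) - P   # M-matrix
A = np.linalg.inv(B)                                               # IM matrix
print(is_pp(A), is_spp(A))                                         # expect: True True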
It follows that, in a normalized PP matrix, no cycle product is more than 1 and that, in a strictly normalized PP matrix, all cycles of length two or more have product less than 1. From this we have [JS01a]. Theorem 5.12.1 If A is a normalized PP (resp. normalized SPP) matrix, then there is a normalized PP (resp. normalized SPP) matrix Aˆ positive diagonally similar to A in which all (resp. off-diagonal) entries are ≤ (resp. 0. So assume n > 1. Because M-matrices are closed underextraction of principal submatrices, we may A11 A12 assume (inductively) that A = A21 A22 in which A11 = L11 U11 with L11 (U11 )
being an (n − 1)-by-(n − 1) lower- (upper-) triangular M-matrix. Therefore, because A has positive principal minors L U 0 U11 U12 L11 U12 L = 11 11 A = LU = 11 L21 1 0 unn L21 U11 L21 U12 + unn in which unn = A/A11 > 0 (by Schur’s formula). Thus, L11 U12 , L21 U11 ≤ 0. −1 −1 , U11 ≥ 0, it follows that U12 , L21 ≤ 0. Thus, L, U ∈ Z. And Because L11 because # " " # −1 −1 −1 − u1nn U11 U12 U11 L11 0 −1 −1 L = and U = −1 1 0 1 −L21 L11 unn are both nonnegative, L, U ∈ M. Similarly, it can be shown that A has a ULfactorization within the class of M-matrices. Observe that if A = LU (UL) in which L, U ∈ M, then A−1 = U −1 L−1 (L−1 U −1 ) with L−1 , U −1 ∈ IM. Thus, IM matrices also have LU- and UL-factorizations within their class. However, it is not the case that if L, U are, respectively, lower- and upper-triangular IM matrices, then LU and/or UL is IM. Example 5.13.2 Consider the IM matrices ⎡
L = \begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 4 & 1 & 1 \end{bmatrix}, \qquad U = \begin{bmatrix} 1 & 1 & 4 \\ 0 & 1 & 4 \\ 0 & 0 & 1 \end{bmatrix}.

Neither LU nor UL is IM because the inverse of the former is positive in the (2, 1) entry, and the inverse of the latter is positive in the (3, 2) entry.
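The claims of Example 5.13.2 can be verified numerically (our sketch, assuming NumPy):

import numpy as np

L = np.array([[1., 0., 0.],
              [1., 1., 0.],
              [4., 1., 1.]])
U = np.array([[1., 1., 4.],
              [0., 1., 4.],
              [0., 0., 1.]])

inv_LU = np.linalg.inv(L @ U)
inv_UL = np.linalg.inv(U @ L)
print(inv_LU[1, 0])   # positive (2,1) entry, so LU is not IM
print(inv_UL[2, 1])   # positive (3,2) entry, so UL is not IM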
5.14 Sums, Products, and Closure Summary. In this section it is shown that sums, products, and positive integer powers of IM matrices need not be IM and are, in fact, IM if and only if the offdiagonal entries of its inverse are nonpositive. It is also noted that if A ∈ IM, then At ∈ IM for all 0 ≤ t < 1. Throughout, let A1 , A2 , . . . , Ak be n-by-n IM matrices in which k is a positive integer. It is natural to ask about closure under a variety of operations. Neither the product AB nor the sum A + B need generally be in IM; see Section 5.2. In addition, the conventional powers At , t > 1, are nonnegative but need not be in IM for n > 3. Recall that At , when t is not an integer, is defined naturally for an M-matrix, via power series, and thus for an IM matrix, as in Section 5.8, in [Joh82], or, equivalently, via principal powers, as in [JS11].
Example 5.14.1 Consider the IM matrix

A = \begin{bmatrix} 90 & 59 & 44 & 71 \\ 56 & 96 & 48 & 88 \\ 64 & 60 & 88 & 84 \\ 42 & 43 & 36 & 95 \end{bmatrix}.
Because the inverse of A^3 has a positive (1, 4) entry, A^3 is not an IM matrix. However, in each of these cases, there is an aesthetic condition for the result to be IM, even when the number k of summands or factors is more than two. Note that, because A1, A2, . . . , Ak ≥ 0 (entry-wise), necessarily A1 + A2 + · · · + Ak, A1 A2 . . . Ak ≥ 0. The latter of these is necessarily invertible, while the first need not be. But, invertibility plus nonnegativity means that the result is IM if and only if the inverse has nonpositive off-diagonal entries. This gives

Theorem 5.14.2 Let A1, A2, . . . , Ak be n-by-n IM matrices and let k be a positive integer. Then, A1 + A2 + · · · + Ak (resp. A1 A2 . . . Ak), provided it is invertible, is IM if and only if the off-diagonal entries of its inverse are nonpositive.

Also,

Theorem 5.14.3 If t ≥ 1 is an integer, then A^t is IM if and only if the off-diagonal entries of the inverse of A^t are nonpositive.

The remaining issue of "conventional" powers (roots) A^t, 0 < t < 1, has an interesting resolution. Based upon a power series argument given in [Joh82] and in Section 5.8, the M-matrix B = A^{-1} always has a k-th root B^{1/k} (that is, (B^{1/k})^k = B), k = 1, 2, . . . , that is an M-matrix. The same argument or continuity gives a natural B^t = (A^{-1})^t, 0 ≤ t < 1, that is an M-matrix. As the laws of exponents are valid, this means

Theorem 5.14.4 If A is IM, then for each t, 0 ≤ t < 1, there is a natural A^t such that A^t is IM. If t = p/q, with p and q positive integers and 0 ≤ p ≤ q, then (A^t)^q = A^p.
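A numerical check of Example 5.14.1, together with an illustration of Theorem 5.14.4 (our sketch; it assumes SciPy's fractional_matrix_power returns the principal power, which for an IM matrix agrees with the power-series definition used here):

import numpy as np
from scipy.linalg import fractional_matrix_power

A = np.array([[90., 59., 44., 71.],
              [56., 96., 48., 88.],
              [64., 60., 88., 84.],
              [42., 43., 36., 95.]])

def off_diag_nonpositive(M, tol=1e-9):
    return np.all(M - np.diag(np.diag(M)) <= tol)

print(off_diag_nonpositive(np.linalg.inv(A)))              # True: A is IM
print(np.linalg.inv(np.linalg.matrix_power(A, 3))[0, 3])   # positive, so A^3 is not IM

# Theorem 5.14.4: a fractional power such as A^(1/2) should again be IM.
A_half = np.real(fractional_matrix_power(A, 0.5))          # principal power; discard numerical imaginary dust
print(off_diag_nonpositive(np.linalg.inv(A_half)))         # expect True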
5.15 Spectral Structure of IM Matrices Let A ∈ IM, and let σ (A) denote the spectrum of A. It follows from Perron– Frobenius theory that |λ| ≤ ρ(A) for all λ ∈ σ (A) with equality only for
λ = ρ(A) and that the M-matrix B = A−1 = αI − P in which P ≥ 0 and α > ρ(P). Thus, q(B) ≡
\frac{1}{\rho(A)} = \alpha - \rho(P)

is the eigenvalue of B with minimum modulus, σ(B) is contained in the disc {z ∈ C : |z − α| ≤ ρ(P)}, and Re(λ) ≥ q(B) for all λ ∈ σ(B) with equality only for λ = q(B). Moreover, σ(B) is contained in the open wedge

W_n \equiv \left\{ z = re^{i\theta} : r > 0,\ |\theta| < \frac{\pi}{2} - \frac{\pi}{n} \right\}

in the right half-plane if n > 2, and in (0, ∞) if n = 2 [HJ91]. So σ(A) is contained in the wedge W_n if n > 2 and in (0, ∞) if n = 2. Under the transformation f(z) = 1/z, circles are mapped to circles and lines to lines (see [MH06] for details). This in turn implies that σ(A) is contained in the disc {z ∈ C : |z − β| ≤ R} in which β = \frac{\alpha}{\alpha^2 - (\rho(P))^2} and R = \frac{\rho(P)}{\alpha^2 - (\rho(P))^2}.
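The inclusion disc just described is easy to check numerically; the sketch below (ours, assuming NumPy) uses the smallest admissible α in the splitting B = αI − P:

import numpy as np

rng = np.random.default_rng(2)
P0 = rng.random((6, 6))
a0 = np.max(np.abs(np.linalg.eigvals(P0))) + 0.5
B = a0 * np.eye(6) - P0                  # an M-matrix
A = np.linalg.inv(B)                     # an IM matrix

alpha = np.max(np.diag(B))               # smallest alpha with P = alpha*I - B >= 0
P = alpha * np.eye(6) - B
rho = np.max(np.abs(np.linalg.eigvals(P)))

beta = alpha / (alpha**2 - rho**2)
R = rho / (alpha**2 - rho**2)
eigs = np.linalg.eigvals(A)
print(np.all(np.abs(eigs - beta) <= R + 1e-9))   # expect True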
5.16 Hadamard Products and Powers, Eventually IM Matrices Summary. For m-by-n matrices A = [aij ] and B = [bij ], the Hadamard (entry-wise) product A ◦ B is defined by A ◦ B = [aij bij ]. Many facts about the Hadamard product may be found in [Hor90, Joh82]. In this section, it is shown that IM matrices are closed under Hadamard product if and only if n ≤ 3. Also, it is shown that, for an IM matrix A, the t-th Hadamard power A(t) = [atij ] is IM for all real t ≥ 1 [Che07a], and that this is not necessarily the case if n > 3 and 0 < t < 1. The dual of the class of IM matrices is discussed. In [JS01a], it was noted that any IM matrix is SPP and that for n ≤ 3 (but not greater) the two classes are the same. See also [Wil77]. Our interest here lies in further relating these two classes via consideration of Hadamard powers. Motivated, in part, by the main result of [Che07a], we questioned whether an SPP matrix A becomes IM as t increases without bound. Toward this end, IM matrices are shown to be PSPP (see Section 5.12). Further, it is shown that TSPP matrices (those SPP matrices in which all path products are strict) become inverse M-matrices as t increases without bound but not conversely. However, the PSPP condition is shown to be necessary and sufficient to ensure that A(t) eventually becomes IM. In the process the notion of path product triple is introduced for an IM matrix A. (The presence of such a triple implies that a certain entry of the inverse of A vanishes.) We begin by establishing the fact that IM matrices are closed under Hadamard product for n ≤ 3.
5.16.1 Hadamard Products and Powers Because n-by-n IM matrices A and B are entry-wise nonnegative, it is natural to ask whether A ◦ B is again IM. For n ≤ 3, this is so, as IM is equivalent to SPP for n = 3 (Section 5.12), and the Hadamard product of SPP matrices is SPP; and for n = 1, 2, the claim is immediate. It has long been known that such a Hadamard product is not always IM. Examples for n = 6 (and hence for larger n) may be [Joh77, Joh78], and more recently, it was noted in [WZZ00] that the two 4-by-4 symmetric matrices ⎡ ⎤ ⎡ ⎤ 3 2 1 3 1 1 1 1 ⎢ 2 2 1 2⎥ ⎢ 1 2 2 2⎥ ⎢ ⎥ ⎥ A=⎢ ⎣ 1 2 3 3⎦ and B = ⎣ 1 1 1 1⎦ 1 2 3 4
3 2 1 4
are IM, while A ◦ B is not. Thus, the Hadamard product, A ◦ B, of two IM matrices A and B need not be IM for n > 3, which entirely resolves the question of dimensions in which there is Hadamard product closure. Theorem 5.16.1 The n-by-n IM matrices are closed under Hadamard product if and only if n ≤ 3. This leaves the question of which pairs of IM matrices have an IM Hadamard product. Counterexamples seem not so common, in part because the Hadamard product is SPP, but also because the ideas in [JS07a] indicate that a bounded multiple of I (at worst) need be added to make the Hadamard product IM. Nonetheless, better descriptions of such pairs would be of interest. In [CFJ01, JS07a] the dual of the IM matrices was defined to be IMD = {A ∈ Mn (R) : A ◦ B ∈ IM for all B ∈ IM}. It is not hard to see that IMD ⊆ IM, but an effective characterization of IMD would also be of interest. It was noted in [Joh77, FP62] that if A and B are M-matrices, then the Hadamard product A ◦ B−1 is also an M-matrix. A real n-by-n matrix A is diagonally symmetrizable if there exists a diagonal matrix D with positive diagonal entries such that DA is symmetric. In [Joh77] it was shown that if the M-matrix A is diagonally symmetrizable, then q A ◦ A−1 = 1. For an M-matrix A, −1 . For instance, there has been a great deal of interest in bounds for q A ◦ A q A ◦ A−1 ≤ 1 was proved in [FJMN85], and moreover, it was asked whether q A ◦ A−1 ≥ 1n . This latter question answered 2 affirmatively in [FJMN85], was−1 ≥ n . Also, for M-matrices A and and the authors conjectured that q A ◦ A B, a lower bound for q A ◦ B−1 was determined. The conjecture in [FJMN85] was later established independently in [Son00, Yon00, Che04b].
For A = [aij ] ∈ IM, another (more special) natural question is whether A(2) ≡ A ◦ A ∈ IM also [Neu98]. This was also conjectured elsewhere. A constructive proof for A(2) when n = 4 was given by the authors. More generally, is A(k) = A ◦ A ◦ · · · ◦ A ∈ IM for all positive integers k? This question was settled in [Che04a]. Theorem 5.16.2 All positive integer Hadamard powers of an IM matrix are IM. In [WZZ00], as well as elsewhere, it was then conjectured that, for an IM matrix A, the t-th Hadamard power of A is IM for all real t ≥ 1. This was then proven in [Che07a]. Theorem 5.16.3 If A is an IM matrix and t ≥ 1, A(t) is IM. For n ≥ 4 and 0 < t < 1, Hadamard powers A(t) need not be IM. Example 5.16.4 Consider the IM matrix ⎤ ⎡ 1 0.001 0.064 0.027 ⎢ 0.064 1 0.064 0.274625⎥ ⎥. A=⎢ ⎣ 0.001 0.008 1 0.216 ⎦ 0.003375 0.027 0.216 1 The Hadamard cube root of A, given by ⎡ ⎤ 1 0.1 0.4 0.3 ⎢ 0.4 1 0.4 0.65⎥ ⎥, A(1/3) = ⎢ ⎣ 0.1 0.2 1 0.6 ⎦ 0.15 0.3 0.6 1 is not IM because the 2, 3 entry of its inverse is positive. The question of which IM matrices satisfy A(t) is IM for all t > 0 is still open.
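Example 5.16.4 can be reproduced directly (our sketch, assuming NumPy; is_im is our own helper name):

import numpy as np

A13 = np.array([[1.,   0.1, 0.4, 0.3 ],
                [0.4,  1.,  0.4, 0.65],
                [0.1,  0.2, 1.,  0.6 ],
                [0.15, 0.3, 0.6, 1.  ]])
A = A13 ** 3                      # the matrix of Example 5.16.4 (entry-wise cube)

def is_im(M, tol=1e-9):
    Minv = np.linalg.inv(M)
    off = Minv - np.diag(np.diag(Minv))
    return np.all(M >= -tol) and np.all(off <= tol)

print(is_im(A))                                  # True
print(is_im(A13), np.linalg.inv(A13)[1, 2])      # False, and the (2,3) entry of the inverse is positive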
5.16.2 Eventually IM Matrices As noted in Section 5.12, if A is SPP and t > 0, it is clear that A(t) is SPP. In fact, a Hadamard product of any two SPP matrices is again SPP, and this remains so for the variants of SPP to be studied in this section. This raises the question as to whether A(t) eventually becomes IM as t increases without bound. That is, is there a T ≥ 0 such that A(t) is IM for all t > T? In this case we say that A is an eventually inverse M-matrix (EIM). Our question, then,
is: which nonnegative matrices are EIM? It is clear that it is necessary that an EIM matrix be SPP, but it is not sufficient.

Example 5.16.5 Consider the normalized SPP matrix

A = \begin{bmatrix} 1 & 0.5 & 0.7 & 0.4 \\ 0.5 & 1 & 0.5 & 0.25 \\ 0.7 & 0.5 & 1 & 0.5 \\ 0.4 & 0.25 & 0.5 & 1 \end{bmatrix}.

Because the (2, 4) cofactor of A^{(t)} is c^{(t)}_{24} = [(0.5)^t − (0.35)^t][(0.4)^t − (0.35)^t], which is positive for all t > 0, we see that A is not EIM.
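The closed form for this cofactor can be checked against a direct computation (our sketch, assuming NumPy):

import numpy as np

A = np.array([[1.,  0.5,  0.7, 0.4 ],
              [0.5, 1.,   0.5, 0.25],
              [0.7, 0.5,  1.,  0.5 ],
              [0.4, 0.25, 0.5, 1.  ]])

def cofactor(M, i, j):
    sub = np.delete(np.delete(M, i, axis=0), j, axis=1)
    return (-1) ** (i + j) * np.linalg.det(sub)

for t in (0.5, 1.0, 3.0, 10.0):
    c24 = cofactor(A ** t, 1, 3)                        # the (2,4) cofactor of A^(t)
    closed_form = (0.5**t - 0.35**t) * (0.4**t - 0.35**t)
    print(t, round(c24, 8), round(closed_form, 8))      # the two agree and stay positive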
We also address the question of whether there is some converse to the statement that IM implies (some kind of) SPP. In order to answer these questions, we refine further the path product conditions for IM matrices and identify the appropriate additional necessary conditions. These involve the arrangement of occurrences of equality in the path product inequalities (5.12.4). We say that (i, j, k) is a path product equality triple (for the SPP matrix A) if equality occurs in (5.12.4). For instance, (2, 3, 4) is a path product equality triple in Example 5.16.5. Note that the path product equality triples of A and a normalized version of A are the same. When n ≥ 4, we will see that path product equalities in an IM matrix (which can occur) necessarily imply others. In view of previous work [JS01b], this is not surprising, as a path product equality implies that a 2-by-2 almost principal minor is 0, so that a certain 3-by-3 principal submatrix has a 0 in its inverse. By [JS01b] this implies further 0s in the inverse of the full matrix (though it is not necessary that all inverse 0s stem from path product equalities).

Example 5.16.6 Consider the IM matrix

A = \begin{bmatrix} 1 & 0.4 & 0.4 & 0.3 \\ 0.4 & 1 & 0.5 & 0.5 \\ 0.4 & 0.4 & 1 & 0.6 \\ 0.4 & 0.4 & 0.4 & 1 \end{bmatrix}.

Matrix A contains no path product equalities. However, A^{-1} has a 0 in the (1, 4) position. It was noted in [JS01a, theorem 4.3.1] that if A is an n-by-n SPP matrix, then there exist positive diagonal matrices D and E such that B = DAE in which B is a normalized SPP matrix. Thus, B^{(t)} = D^t A^{(t)} E^t for t ≥ 0 and, because it is apparent that SPP matrices are closed under positive diagonal equivalence,
A(t) is SPP if and only if B(t) is. Therefore, it suffices to consider normalized SPP matrices when studying (positive) Hadamard powers of SPP matrices. In that M-matrices are closed under positive diagonal equivalence, the same can be said for IM matrices. First, we identify the case in which no path product equalities occur, i.e., all inequalities (5.12.1) are strict, and call such SPP matrices totally strict path product (totally SPP). Observe that totally SPP matrices are necessarily positive. Totally SPP is not necessary for EIM (just consider the IM matrix B = A[{2, 3, 4}] of Example 5.16.6), but we will see (Theorem 5.16.11) that totally SPP is sufficient for EIM. Recall (Section 5.12) that if (5.12.4) is satisfied by an SPP matrix, we say that A is purely strict path product matrix (PSPP). We will see (Theorem 5.16.7) that, in PSPP matrices, path product equalities force certain cofactors to vanish. Notice that condition (5.12.4) does not hold for Example 5.16.5 because a24 = 0.25 = (0.5)(0.5) = a23 a34 while a14 = 0.4 > (0.7)(0.5) = a13 a34 and a21 = 0.5 > (0.5)(0.7) = a23 a31 . We show that any IM matrix is purely SPP (Theorem 5.16.7) and that EIM is equivalent to purely SPP (Theorem 5.16.12). We note that purely SPP and SPP coincide (vacuously) when n ≤ 3 and that generally the TSPP matrices are contained in the purely SPP matrices (vacuously). Last, but important, observe that, if A is totally (purely) SPP, then so is any normalization of A. Thus, in trying to show a totally (purely) SPP matrix is IM, by positive diagonal equivalence, we may, and do, assume that A is normalized. Thus, the basic PPinequalities are aij ajk ≤ aik for all i, j, k. Theorem 5.16.7 Any IM matrix is purely SPP. Proof Without loss of generality, let A = [aij ] be a normalized IM matrix. If n ≤ 3, then A is purely SPP vacuously. So we may assume that n ≥ 4. Assume that aik = aij ajk for the distinct indices i, j, k of N and let m ∈ N − {i, j, k}. Consider the principal submatrix ⎡
A[\{m, j, k, i\}] = \begin{bmatrix} 1 & a_{mj} & a_{mk} & a_{mi} \\ a_{jm} & 1 & a_{jk} & a_{ji} \\ a_{km} & a_{kj} & 1 & a_{ki} \\ a_{im} & a_{ij} & a_{ik} & 1 \end{bmatrix}
of A. This submatrix is IM by inheritance. So (via, for instance, the special case of Sylvester’s identity for determinants given in Section 5.9.1), the (k, i) cofactor cki = (amk − amj ajk )(aim − aij ajm ) ≤ 0. Hence, by condition (5.12.4), either amk = amj ajk or aim = aik akm . Thus, A is purely SPP.
A graph may be constructed on the path product equality triples with an edge corresponding to coincidence of the first two or last two indices. This can be informative about the structure of path product equalities, but all we need here is the following lemma. The important idea is that the occurrence of path product equalities ensure that certain submatrices have rank one and, moreover, these rank one submatrices are large enough to guarantee that certain almost principal minors vanish. We make use of the well-known fact that an n-by-n matrix is singular if it contains an s-by-t submatrix of rank r such that s + t ≥ n + r + 1. Lemma 5.16.8 If (i, j, k) is a path product equality for the n-by-n purely SPP matrix A, n ≥ 4, then det A({k}, {i}) = 0, i.e., the (k, i) cofactor of A vanishes. Moreover, for all real t, det A(t) ({k}, {i}) = 0. Proof Without loss of generality, assume that A is an n-by-n normalized, purely SPP matrix, n ≥ 4, and that (i, j, k) is a path product equality for A. By permutation similarity, we may assume that (i, j, k) = (1, 2, 3) so that a13 = a12 a23 . Then, because A is purely SPP, for each c ∈ n − {1, 2, 3}, either a1c = a12 a2c or ac3 = ac2 a23 . Without loss of generality, assume that { c | c ∈ n − {1, 2, 3} and a1c = a12 a2c } = ∅. By permutation similarity, we may assume, again without loss of generality, that, for some q ∈ n − {1, 2, 3}, (i) a1c = a12 a2c , c = 3, . . . , q; and (ii) a1c = a12 a2c , c = q + 1, . . . , n. Note that, for c ∈ {3, . . . , q} and r ∈ {q+1, . . . , n} ⊆ n−{1, 2, c}, (i) implies (by condition (5.12.4)) that either a1r = a12 a2r or arc = ar2 a2c . Hence, by (ii), we have (iii) arc = ar2 a2c , r = q + 1, . . . , n, c = 3, . . . , q. If q = n and B = A[{1, 2}, {2, . . . , q}], then (i) implies that [a12 a13 . . . a1q ], the first row of B, is a12 times [1 a23 a24 . . . a2q ], the second row of B. Thus, B is a 2-by-(n − 1) rank one submatrix of A({3}, {1}). Because 2 + (n − 1) = n + 1 ≥ n + 1 = (n − 1) + 1 + 1, A({3}, {1}) is singular. On the other hand, if 3 ≤ q < n, let C = A[{1, 2, q + 1, . . . , n}, {2, . . . , q}]. Then, (i) implies that [a12 a13 . . . a1q ], the first row of C, is a12 times [1 a23 a24 . . . a2q ], the second row of C. Now consider [ar2 ar3 . . . arq ], the rth row of C, r = q + 1, . . . , n. It follows from (iii) that the rth row of C is ar2 times the second row of C, r = q + 1, . . . , n. Hence,
C is a (n − q + 2)-by-(q − 1) rank one submatrix of A({3}, {1}). Because (n − q + 2) + (q − 1) = n + 1 ≥ n + 1 = (n − 1) + 1 + 1, A({3}, {1}) is singular in this case also. Thus, det A({3}, {1}) = 0 in either case, completing the proof of the first part. The second part follows by the same argument because (5.12.4) holds for A if and only if it holds for A(t) for any real number t. Now using Lemma 5.16.8, we have Corollary 5.16.9 If (i, j, k) is a path product equality for the n-by-n IM matrix A, n ≥ 4, then (A−1 )ki = 0. Theorem 5.16.10 Let A be an n-by-n SPP matrix. Then there exists T > 0 such that det A(t) > 0 for all t > T. Proof Without loss of generality, let A be an n-by-n normalized SPP matrix, and let maxi =j aij = M < 1. Denote the set of permutations of n by Sn and the identity permutation by id. Then, det A(t) = sgn(τ )at1,τ (1) at2,τ (2) . . . atn,τ (n) τ ∈Sn
= 1 + \sum_{\substack{\tau \in S_n \\ \tau \neq \mathrm{id}}} \operatorname{sgn}(\tau)\, a^t_{1,\tau(1)} a^t_{2,\tau(2)} \cdots a^t_{n,\tau(n)}
> 1 - \sum_{\substack{\tau \in S_n \\ \tau \neq \mathrm{id}}} a^t_{1,\tau(1)} a^t_{2,\tau(2)} \cdots a^t_{n,\tau(n)}
> 1 - (n! - 1)M^t.

It is clear that there exists T > 0 such that for all t > T, 1 − (n! − 1)M^t > 0, completing the proof.

Theorem 5.16.11 Let A be an n-by-n totally SPP matrix. Then there exists T > 0 such that A^{(t)} ∈ IM for all t > T.

Proof Without loss of generality, let A be an n-by-n normalized, totally SPP matrix. As in the proof of Theorem 5.16.10, let max_{i≠j} aij = M < 1, let Sn denote the set of permutations of n, and let id denote the identity permutation. Then, it follows from Theorem 5.16.10 that there exists T1 > 0 such that, for all t > T1, det A^{(t)} > 0. Now let i, j ∈ n with i ≠ j and let N1 = n − {i, j}. Consider c^{(t)}_{ij}, the (i, j) cofactor of A^{(t)}. Without loss of generality, assume that i < j. Then, if

B = \begin{bmatrix} A^{(t)}[N_1] & A^{(t)}[N_1, i] \\ A^{(t)}[j, N_1] & a^t_{ji} \end{bmatrix},

c^{(t)}_{ij} = (-1)^{i+j} \det A^{(t)}(\{i\}, \{j\}) = (-1)^{i+j}(-1)^{n-i-1}(-1)^{n-j} \det B = (-1)\det B
= - \sum_{\tau \in S_{n-1}} \operatorname{sgn}(\tau)\, b_{1,\tau(1)} b_{2,\tau(2)} \cdots b_{n-1,\tau(n-1)}
= -a^t_{ji} - \sum_{\substack{\tau \in S_{n-1} \\ \tau \neq \mathrm{id}}} \operatorname{sgn}(\tau)\, b_{1,\tau(1)} b_{2,\tau(2)} \cdots b_{n-1,\tau(n-1)}
\le -a^t_{ji} + \sum_{\substack{\tau \in S_{n-1} \\ \tau \neq \mathrm{id}}} b_{1,\tau(1)} b_{2,\tau(2)} \cdots b_{n-1,\tau(n-1)}.

Now let

Ω_1 = { τ ∈ S_{n−1} | τ ≠ id and τ(n − 1) = n − 1 } = { τ ∈ S_{n−1} | τ ≠ id and b_{n−1,τ(n−1)} = a^t_{ji} }

and

Ω_2 = { τ ∈ S_{n−1} | τ(n − 1) ≠ n − 1 } = { τ ∈ S_{n−1} | b_{n−1,τ(n−1)} ≠ a^t_{ji} },

so that {id}, Ω_1, and Ω_2 form a partition of S_{n−1}. Thus,

\sum_{\substack{\tau \in S_{n-1} \\ \tau \neq \mathrm{id}}} b_{1,\tau(1)} b_{2,\tau(2)} \cdots b_{n-1,\tau(n-1)} = \sum_{\tau \in Ω_1} b_{1,\tau(1)} b_{2,\tau(2)} \cdots b_{n-1,\tau(n-1)} + \sum_{\tau \in Ω_2} b_{1,\tau(1)} b_{2,\tau(2)} \cdots b_{n-1,\tau(n-1)}.   (1)
Each term of the first summation on the right-hand side of (1) is the product of bn−1,τ (n−1) = atji and a “non-identity” term of the expansion of det A[N1 ] (because τ = id). Hence, each term in this summation has a factor of the form (2)
atji atsi1 ati1 i2 . . . atik−1 ik atik s
in which s, i1 , . . . , ik are distinct indices in N1 and k ≥ 1. Therefore, the cycle product (2) has at least three terms. Because the factors of this term distinct from atji are < 1, each term in the first summation is < atji M 2p and there are |Sn−2 | − 1 = (n − 2)! −1 such terms. On the other hand, each term of the second summation on the right-hand side of (1) has a factor of the form (3)
atji1 ati1 i2 ati2 i3 . . . atik−1 ik atik i
in which i1 , . . . , ik are distinct indices in N1 with k ≥ 1. Hence, the path product (3) has at least two terms. Let mji denote the maximum (j, i) path product given by (3). Therefore, each term in the second summation is ≤ mji , and there are (n − 1)! −(n − 2)! = (n − 2)((n − 2)! ) such terms. Notice that, because A is totally SPP, mji < aji . Thus, (t)
cij ≤ −atji + ((n − 2)! −1) atji M 2p + (n − 2)((n − 2)! ) mtji t mji = −atji 1 − ((n − 2)! −1) M 2p − (n − 2)((n − 2)! ) . aji Because M < 1 and mji < aji , there exists Tij > 0 such that, for all t > Tij , c(t) ij ≤ 0. Let T2 = maxi =j Tij . Then, for all t > max(T1 , T2 ) = T, the inverse of A(t) has non-positive off-diagonal entries and, hence, A(t) ∈ IM, completing the proof. Our main result characterizes eventually inverse M-matrices and, in a certain sense, provides a converse to the statement that IM implies (some kind of) SPP. Theorem 5.16.12 Let A be an n-by-n nonnegative matrix. Then, A is purely SPP if and only if A ∈ EIM. Proof We observe that, for either of the two properties in question, purely SPP or EIM, we may assume that A is an n-by-n normalized SPP matrix. Necessity of the condition purely SPP then follows from Theorem 5.16.7 and the previously noted fact that the path product inequalities (and equalities) are preserved for any positive Hadamard power t. For sufficiency, let A be an n-by-n normalized, purely SPP matrix. It follows from Theorem 5.16.10 that there exists T1 > 0 such that det A(t) > 0 for all t > T1 . So we are left to show that, for some T > 0, the off-diagonal entries of (A(t) )−1 are nonpositive for all t > T. To this end, let i, j ∈ n with i = j, and N1 = n − {i, j}. If aji = ajk aki for some k ∈ N1 , then it follows from (t) Lemma 5.16.8 that c(t) ij , the i, j cofactor of A , vanishes for all positive p. Let Tij = 1 in this case. On the other hand, suppose that aji > ajk aki for all k ∈ N1 . Following the proof of Theorem 5.16.10, let maxi =j aij = M < 1 and let mji denote the maximum (j, i) path product given by (3). That mji < aji follows from (2). Hence, (1) implies that there is a positive constant (t) Tij such that, for all t > Tij , cij , the (i, j) cofactor of A(t) is ≤ 0. Letting T2 = maxi =j Tij and T = max(T1 , T2 ), we see that for all t > T, A(t) is invertible
and its inverse has non-positive off-diagonal entries; that is, A(t) is IM. So A is EIM, completing the proof. Remark 5.16.13 Suppose that we have a totally (purely) SPP matrix that is not IM. Because the totally (purely) SPP matrices are closed under any positive Hadamard power, extraction of a small enough Hadamard root will produce a totally (purely) SPP matrix in which P must be arbitrarily large, while raising the matrix to a large enough Hadamard power will produce a totally (purely) SPP matrix in which P may be taken to be an arbitrarily small positive number. In fact, T can be 0. Immediately from Theorem 5.16.10, we have Corollary 5.16.14 If A is an IM-matrix, then there exists T > 0 such that A(t) ∈ IM for all t > T. Recall that TSPP matrices are necessarily positive (see Section 5.12). While the condition TSPP is sufficient for EIM, positive EIM matrices are not necessarily TSPP (see Example 5.12.7). Rather, the correct necessary and sufficient condition, PSPP (see Section 5.12), was given in [JS07a] for positive matrices. The general (nonnegative) case follows by the same argument, and we have Theorem 5.16.15 Let A be a nonnegative n-by-n matrix. A is EIM if and only if A is PSPP. Theorem 5.16.16 For an n-by-n nonnegative matrix A, either (i) there is no t > 0 such that A(t) is IM or (ii) there is a critical value T > 0 such that A(t) is IM for all t > T and A(t) is not IM for all 0 ≤ t < T. The situation for IM matrices (which includes the symmetric ones) should be contrasted with doubly nonnegative matrices (DN), i.e., those matrices that are symmetric positive semi-definite and entry-wise nonnegative. Note that a symmetric IM matrix is DN. For the n-by-n DN matrices as a class, there is a critical exponent T such that A ∈ DN implies A(t) ∈ DN for all t ≥ T and T is a minimum over all DN matrices [HJ91]. That critical exponent is n − 2 [Hor90, HJ91]. All positive integer Hadamard powers of DN matrices are DN (because the positive semi-definite matrices are closed under Hadamard product), but it is possible for non-integer powers to leave the class, until the power increases to n − 2. This, curiously, cannot happen for (symmetric) IM matrices, as the “critical exponent” for the entire IM class is simply equal to 1.
As mentioned, a positive matrix (certainly) need not be SPP, and SPP matrices need not be IM. However, it is worth noting that the addition of a multiple of the identity can “fix” both of these failures. If A > 0 is n-by-n, it is shown in [JS07a] that there is an α ≥ 0 such that αI + A is SPP; in addition, either A is already SPP or a value β > 0 may be calculated such that αI + A is SPP for all α > β. Moreover, if A ≥ 0 is n-by-n and SPP, then there is a minimal β ≥ 0 such that αI + A is IM for all α > β. In fact, if A is normalized SPP, which may always be arranged (Section 5.12), then β ≤ n − 2. This means that if we consider A ◦ B in which A and B are in IM, then either A ◦ B will be IM (if, for example, one of the matrices is already of the form: a large multiple of I plus an IM) or may be made IM by the addition of a positive diagonal matrix (that is not too big).
5.17 Perturbation of IM Matrices Summary. Here we discuss the effect of several types of perturbation and their effect on an IM matrix. These perturbations are positive rank one perturbations, positive diagonal perturbations, and perturbations induced by interval matrices. How may a given IM matrix be altered so as to remain IM, and how may a nonIM matrix be changed so as to become IM? As mentioned in Theorem 5.5.2, the addition of a nonnegative diagonal matrix to an IM matrix results in an IM matrix. We add that some nonnegative matrices that are not IM may be made IM via a nonnegative diagonal addition. If A is the inverse of an M-matrix that has no zero minors, then each entry (column or row) of A may be changed, at least a little, so as to remain IM. By linearity of the determinant, the set of possibilities for a particular entry (column or row) is an interval (convex set), which suggests the question of determination of this interval (convex set).
5.17.1 Positive Rank One Perturbations We begin by discussing positive rank one perturbation of a given IM matrix. There is the following nice result, found in [JS11, theorem 9.1]. Theorem 5.17.1 Let A be an IM matrix, let p and q be arbitrary nonnegative vectors, and for t ≥ 0, define x = Ap, yT = qT A,
and s = 1 + tq^T Ap. We then have
(i) (A + txy^T)^{-1} = A^{-1} − (t/s) pq^T is an M-matrix;
(ii) (A + txy^T)^{-1} x = (1/s) p ≥ 0 and y^T (A + txy^T)^{-1} = (1/s) q^T ≥ 0;
(iii) y^T (A + txy^T)^{-1} x = (1/s) q^T Ap < 1/t, t > 0.
This perturbation result may be used to show very simply that so-called strictly ultrametric matrices [Fie98a] are IM (see Section 5.28). If we consider a particular column of an IM matrix, then the set of replacements of that column that result in an IM matrix is a convex set. This convex set may be viewed as the intersection of n^2 − n + 2 half-spaces [JS11, theorem 1.2]. Without loss of generality, we may assume the column is the last, so that, partitioned by columns, A(x) = [a1 a2 . . . an−1 x], with a1, a2, . . . , an−1 ≥ 0. Then the half-spaces are given by the linear constraints
x ≥ 0,
(−1)^{i+j+1} det A(x)(i, j) ≥ 0, 1 ≤ i ≤ n, 1 ≤ j < n, i ≠ j,
and det A(x) > 0. A similar analysis may be given for a single off-diagonal entry. It may be taken to be the (1, n) entry, so that a11 x A(x) = A21 A22 with x a scalar. Now the interval for x is determined by the inequalities (−1)i+j+1 det A(x)(i, j) ≥ 0, 1 < i ≤ n, 1 ≤ j < n, i = j, and det A(x) > 0. In [JS11] conditions are given on A = [a1 a2 . . . an−1 ]
such that there exists an x ≥ 0 so that A(x) = [a1 a2 . . . an−1 x] is IM. If A is re-partitioned as

A = \begin{bmatrix} A_{11} \\ a_{21} \end{bmatrix}
in which A11 is square, the conditions are the following subset of those in [JS11, theorem 2.4]: (i) A11 is IM; (ii) a21 ≥ 0; (iii) a21 A−1 11 ≥ 0. 5.17.2 Positive Diagonal Perturbations What about diagonal perturbation? By Theorem 5.5.2, if D ∈ D, then A + D is IM whenever A is. So what if A is not IM? If A is irreducible, we need to assume that A > 0. Is anything else necessary for A + D to be IM for some D ∈ D? Interestingly, we will see that an A > 0 may always be made PP and SPP by a sufficiently large diagonal addition. Necessary and sufficient conditions for a nonnegative matrix to be IM are as follows. Lemma 5.17.2 If A is an n-by-n nonnegative matrix, then A is IM if and only if (i) its determinant and its maximal proper principal minors are positive and (ii) its maximal almost principal minors are signed as those of an inverse M-matrix. Here we show that a positive matrix may be made PP by well-defined additions to the diagonal. We establish a determinantal inequality for normalized SPP matrices. For normalized IM matrices, this inequality provides an interesting dominance relation between the principal minors and certain associated APMs. We then show that the addition of the identity matrix to any 4-by-4 normalized SPP-matrix results in an IM matrix. This 4-by-4 fact may be generalized; we show that any n-by-n SPP matrix can be made IM by the addition of any scalar matrix sI such that s ≥ n − 3. (In fact, there is a lower bound on
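Lemma 5.17.2 translates into a finite test involving only the maximal minors; the following Python sketch (ours; the helper names are not from the text) implements it and exercises it on one IM and one non-IM example:

import numpy as np

def minor(M, row_out, col_out):
    return np.linalg.det(np.delete(np.delete(M, row_out, axis=0), col_out, axis=1))

def is_im_by_minors(A, tol=1e-10):
    # Lemma 5.17.2: det and maximal proper principal minors positive, and
    # maximal almost principal minors signed as those of an inverse M-matrix,
    # i.e., every off-diagonal cofactor is nonpositive.
    n = A.shape[0]
    if np.any(A < -tol) or np.linalg.det(A) <= tol:
        return False
    if any(minor(A, i, i) <= tol for i in range(n)):
        return False
    return all((-1) ** (i + j) * minor(A, i, j) <= tol
               for i in range(n) for j in range(n) if i != j)

rng = np.random.default_rng(3)
P = rng.random((5, 5))
B = (np.max(np.abs(np.linalg.eigvals(P))) + 1.0) * np.eye(5) - P
print(is_im_by_minors(np.linalg.inv(B)))                 # True: inverses of M-matrices pass
print(is_im_by_minors(np.array([[1., 2.], [2., 1.]])))   # False: not an IM matrix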
s that is less than or equal to n − 3.) The latter result has implications for pairs of IM matrices whose Hadamard product is IM.
5.17.3 Positive, Path Product, and Inverse M-Matrices
Our first two propositions clarify the connection between positive matrices and positive PP matrices.

Proposition 5.17.3 Let A = [aij] be an n-by-n positive matrix and suppose that [JS01a, (1.1)] holds for all distinct i, j, k. Then, aij aji ≤ aii ajj.

Proof Suppose that i, j ∈ n with i ≠ j. Then, for all k ≠ i, j, we have

a_{ij} a_{ji} \le \frac{a_{ii} a_{kj}}{a_{ki}} \cdot \frac{a_{jj} a_{ki}}{a_{kj}} = a_{ii} a_{jj},

because a_{ki} a_{ij} ≤ a_{kj} a_{ii} and a_{kj} a_{ji} ≤ a_{jj} a_{ki}.

We note that Proposition 5.17.3 shows that, as in prior work [JS07b], the distinctness of the indices need not be required in the definition of PP.

Proposition 5.17.4 Let A = [aij] be an n-by-n matrix such that aij > 0, i ≠ j, and aii = 0, i = 1, . . . , n. Then, there is a unique, minimum positive diagonal matrix D such that A + D is PP.

Proof Let D = diag(d1, . . . , dn) in which dk is given by

d_k = \max_{\substack{i \ne j \\ i, j \ne k}} \frac{a_{ik} a_{kj}}{a_{ij}}.

This assures that a_{ij} ≥ a_{ik} a_{kj}/d_k for triples {i, j, k} that are distinct, and according to [JS07a, proposition 1] those for which k = i, but i ≠ j, follow.

A direct consequence of Proposition 5.17.4 is that the addition of a (minimal) nonnegative diagonal matrix or a minimal positive scalar matrix converts a square positive matrix into a PP matrix. A priori this diagonal matrix is not bounded, even in terms of n, but it is bounded in terms of the sizes and ratios among entries. We also note that, depending on the zero pattern of the entries, we may not be able to convert a nonnegative matrix with zero diagonal to a PP matrix with positive diagonal matrix addition. For instance, if A = [aij]
is such a matrix and for distinct indices i, j, k, aij ajk > 0 while aik = 0, then A+D∈ / PP for any positive diagonal matrix D. We next establish a determinantal inequality for certain normalized SPP matrices. For normalized IM matrices, the inequality shows that the dominance relationship between diagonal and associated off-diagonal entries extends to all proper principal minors and certain associated APMs. Observe that a special case of this inequality is the known fact that a row (column) diagonally dominant M matrix has an inverse that is diagonally dominant of its column (row) entries. The theorem establishes the following useful fact: a normalized SPP matrix is IM provided that the proper principal minors and the APMs are appropriately signed. (In this case, the determinant is positive.) The following theorem and corollary were crucial to the proof of the main result of [Che07a]. Theorem 5.17.5 Let A = [aij ] be an n-by-n normalized SPP matrix, n ≥ 2, whose proper principal minors are positive and whose APMs are signed as those of an IM-matrix. Then, (i) for any nonempty proper subset α of n = {1, 2, . . . , n} and for any indices i ∈ α and j ∈ /α det A[α] > max{| det A[α − i + j, α]|, | det A[α, α − i + j]|}; (5.17.1) (ii) det A > 0; and (iii) A is IM. Proof Because the theorem obviously holds for n = 2, we assume henceforth that n ≥ 3. To establish (5.17.1), it suffices to prove that det A[n − 1] > | det A[n − 1, n − 1]|. First, suppose that a1n = 0, and let S = { i | a1i = 0 }, k denote the cardinality of S, and T = { i | a1i = 0, i ∈ n − 1}. If k = n − 1, then [JS07a, (1.5)] obviously holds. So we may assume that S and T are each nonempty and form a partition of n − 1. Let i ∈ T and j ∈ S. Then, by the PP conditions, a1i aij ≤ a1j = 0. Because a1i = 0, this implies aij = 0 for i ∈ T and j ∈ S, and, in addition, a1j = 0 for j ∈ S, hence, A[n − 1, n − 1] has an (n−k)-by-k zero submatrix. It follows from the Frobenius–König Theorem that det A[n − 1, n − 1] = 0, which completes the proof of this case. Now suppose that a1n > 0 and n is odd (the even case is analogous). Because the almost principal minors of A are signed as those of an IM matrix, it follows that the n, 1 cofactor of A, (−1)n+1 det A[n − 1, n − 1] ≤ 0, and, hence, det A[n − 1, n − 1] ≤ 0 (because n is odd). If det A[n − 1, n − 1] = 0,
then the result certainly holds. So assume that det A[n − 1, n − 1] < 0 (here, as above, the row index set is {1, . . . , n − 1} and the column index set is {2, . . . , n}). Then,

|\det A[\{1, \dots, n-1\}, \{2, \dots, n\}]| = -\det A[\{1, \dots, n-1\}, \{2, \dots, n\}]
= -\sum_{i=1}^{n-1} (-1)^{i+n-1} a_{in} \det A[\{1, \dots, n-1\} - i, \{2, \dots, n-1\}]   (expanding by the last column)
= \sum_{i=1}^{n-1} (-1)^{i+1} a_{in} \det A[\{1, \dots, n-1\} - i, \{2, \dots, n-1\}]
= a_{1n} \Big\{ \det A[\{2, \dots, n-1\}] + \sum_{i=2}^{n-1} (-1)^{i+1} \frac{a_{in}}{a_{1n}} \det A[\{1, \dots, n-1\} - i, \{2, \dots, n-1\}] \Big\}
< \det A[\{2, \dots, n-1\}] + \sum_{i=2}^{n-1} (-1)^{i+1} \frac{a_{in}}{a_{1n}} \det A[\{1, \dots, n-1\} - i, \{2, \dots, n-1\}]   (because 0 < a_{1n} < 1)
\le |\det A[\{2, \dots, n-1\}]| + \sum_{i=2}^{n-1} (-1)^{i+1} a_{i1} \det A[\{1, \dots, n-1\} - i, \{2, \dots, n-1\}]

(because each term in the summand has the same sign as the (i, 1) cofactor of A[{1, . . . , n − 1}] and hence is nonpositive, and because a_{in} ≥ a_{i1} a_{1n})

= \det A[\{1, \dots, n-1\}],

which establishes (5.17.1). Applying (5.9.5) and (5.17.1), we see that (ii) (and hence (iii)) holds, completing the proof.
We also have the following dominance relation between the principal minors and certain associated APMs of a normalized IM matrix. Corollary 5.17.7 Let A = [aij ] be an n-by-n normalized IM matrix, n ≥ 2. Then, for α a nonempty proper subset of n = {1, 2, . . . , n} and for any indices i ∈ α and j ∈ / α, (5.17.1) holds. Examining the proof of Theorem 5.17.5 carefully, we see that we did not need all proper principal minors and all APMs to be appropriately signed for the matrix to have positive determinant, i.e., we have Proposition 5.17.8 Let A be an n-by-n nonnegative matrix, n ≥ 3. Then, A is IM if and only if A is SPP and all principal minors and all APMs, each of orders n − 1 and n − 2, are signed as those of an IM matrix. The following theorem, based primarily on path product inequalities, motivated this work, and we will see that it is a special case of more general results that follow. Theorem 5.17.9 Let A be a 4-by-4 normalized SPP matrix. Then A + I is IM. Furthermore, A + sI, s < 1, need not be IM. Proof By Theorem 5.17.5 and permutation similarity, to show A + I is IM, it suffices to show that a particular 3-by-3 APM is necessarily properly signed. We show that the (4, 1) APM is nonnegative: ⎡
⎤ a12 a13 a14 det(A + I)(4, 1) = det ⎣1 + 1 a23 a24 ⎦ a32 1 + 1 a34 = 4a14 + a12 a23 a34 + a13 a32 a24 − 2a12 a24 − 2a13 a34 − a14 a23 a32 = 2(a14 − a12 a24 + a14 − a13 a34 ) + a12 a23 a34 + a13 a32 a24 − a14 a23 a32 ≥ 2(a14 − a12 a24 + a14 − a13 a34 ) + a12 a23 a32 a34 + a13 a32 a23 a34 − a14 a23 a32 (by PP) If the sum of the last three terms is nonnegative, then the determinant is nonnegative by the path product inequalities. Otherwise, = 2(a14 − a12 a24 + a14 − a13 a34 ) + (a12 a34 + a13 a34 − a14 )a23 a32 ≥ 2(a14 − a12 a24 + a14 − a13 a34 ) + (a12 a34 + a13 a34 − a14 ) = 2(a14 − a12 a24 ) + (a14 − a13 a34 ) + a12 a34 ≥ 0.
For the second claim, let 0 < ε, s < 1 and let ⎡ 1 1− 1− ⎢ 1
⎤ 1− 1 − ⎥ ⎥. 1 − ⎦ 1
Then A+sI is SPP. Further, for “small,” det (A+sI)(4, 1) ≈ (s+1)(s−1) < 0 so that A + sI is not IM, completing the proof. Let A = [aij ] be an n-by-n normalized SPP matrix, n ≥ 3, and, for i = j, let ⎧ n ⎪ 1 ⎪ ⎪ aik akj , aij = 0 ⎨ aij (A) uij = k=1 k =i,j ⎪ ⎪ ⎪ ⎩0, aij = 0. (A)
(Notice that aij = 0 implies aik akj = 0 for all k.) Let U(A) = max uij . We i =j
call U(A) the upper path product bound for A. The following straightforward lemma will prove useful. Lemma 5.17.10 Let A = [aij ] be an n-by-n normalized SPP matrix, n ≥ 3. Then, (i) for any i, j ∈ N with i = j, n
aik akj ≤ U(A)aij ;
k=1 k =i,j
(ii) U(A) ≤ n − 2. Moreover, U(A) = n − 2 if and only if, for some i, j ∈ N (i = j), aij = 0 and aik akj = aij for all k = i, j. Proof The conclusions follow from the definition of U(A) and [JS07a, (1.1)]. The next two theorems generalize Theorem 5.17.9. Theorem 5.17.11 Let A = [aij ] be an n-by-n normalized SPP matrix, n ≥ 3, and let L = max{U(A), 1}. Then A + sI is IM for all s ≥ L − 1. Furthermore, L − 1 may not be replaced by a smaller value. Proof By induction on n. If n = 3, then U(A) ≤ 1 so that L = 1. Because A is SPP, A = A + 0I = A + (L − 1)I is IM and thus A + sI is IM for all L ≥ L1 , where establishing the first claim for n = 3. Proceeding inductively, it follows that the (n − 1)-by-(n − 1) principal minors of B = A + (L − 1)I
144
Inverse M-Matrices
are positive because for any principal submatrix A1 of A, A1 + (L1 − 1)I is IM so that A1 + (L − 1)I is IM, as L ≥ L1 , where L1 = max{U(A1 ), 1}. From Theorem 5.17.5, it follows that, if the entries of adj B are properly signed, then det B > 0 and B is IM. Thus, for the first claim, it suffices to verify that an (n − 1)-by-(n − 1) APM is properly signed. By permutation similarity, it suffices to check any particular one, say the complement of the (1, 2)-entry, which should be nonnegative. Its value is ⎡ ⎤ b31
⎢ .. ⎥ b21 det B({1, 2}) − b23 . . . b2n adj B({1, 2}) ⎣ . ⎦ . bn1 Division by det B({1, 2}) means that we need ⎡
b21 ≥ b23
⎤ b31
⎢ . ⎥ . . . b2n B({1, 2})−1 ⎣ .. ⎦ .
(5.17.2)
bn1 Let B({1, 2})−1 have entries cij , i, j = 3, . . . , n. By induction, C = [cij ] is an M-matrix. The right-hand side of (5.17.2) is n
b2i cij bj1 =
b2i cij bj1 +
i =j
i,j=3
n
b2i cii bi1 .
(5.17.3)
i=3
Because cij ≤ 0, i = j, the first term on the right-hand side of (5.17.3) is not more than b2i cij bji bi1 by path product. i =j
Because of Fischer’s inequality [HJ91] applied to the IM matrix B({1, 2}), we have det B({1, 2}) ≤ bii det B({1, 2, i}) = L det B({1, 2, i}) so that L1 ≤ det B({1,2,i}) det B({1,2}) = cii . Combining, we obtain n n
b2i cij bj1 =
i=3 j=3
≤
n n i=3 j=3 j =i n n
n b2i cij bj1 + (b2i cii bi1 + b2i cii bii bi1 − b2i cii bii bi1 ) i=3
b2i cij bji bi1 +
i=3 j=3
n (1 − bii )b2i cii bi1 i=3
(because bj1 = aj1 ≥ aji ai1 = bji bi1 ≥ 0 and cij ≤ 0, i = j ) =
n i=3
b2i bi1
n j=3
cij bji +
n (1 − L)b2i cii bi1 i=3
5.17 Perturbation of IM Matrices =
n
b2i bi1 (1 + (1 − L)cii )
i=3
⎛
⎝because
n
⎞ cij bji = the i, i entry of CC−1 ⎠
j=3
≤
n
145
b2i bi1 1 + (1 − L)
i=3
1 L
1 b2i bi1 L n
=
i=3
n 1 = a2i ai1 L i=3
1 ≤ (U(A) a21 ) L ≤ a21 , which completes the proof of the first claim. If the normalized SPP matrix ⎡ 1 1 − 1 − ... 1 − ⎢ 1 ... ⎢ ⎢ .. .. .. ⎢ . . . ⎢ A=⎢ . . . . ⎢. .. .. .. ⎢. ⎢. .. .. ⎢. . ⎣. . 1 ... ...
⎤ 1− 1 − ⎥ ⎥ .. ⎥ ⎥ . ⎥ ⎥ ⎥ 1 − ⎥ ⎥ ⎥ 1 − ⎦ 1
in which 0 < < 1 and C = [cij ] is the cofactor matrix of B = A + sI, then, for “small,” cn1 ≈ (−1)n+1 (−1)n ((1 + s)n−2 − (n − 2)(1 + s)n−3 ) = −(1 + s)n−3 (s − (n − 3)). So, if 0 < s < L − 1 ≤ n − 3, det B > 0 and cn1 > 0. Hence, the 1, n entry of B−1 is positive. Thus, B is not IM, establishing the second claim. From Lemma 5.17.10 (ii), we have Theorem 5.17.12 Let A be an n-by-n normalized SPP matrix, n ≥ 3. Then A + sI is IM for all s ≥ n − 3.
146
Inverse M-Matrices
A consequence of Theorem 5.17.5 is the following. Corollary 5.17.13 Let A be an n-by-n nonnegative matrix with positive diagonal entries, and let D and E be positive diagonal matrices such that DE = (n − 2)[diag(A)]−1 . Then, if DAE − (n − 3)I is SPP, A is IM. Theorem 5.17.14 Let A > 0 be given and let D and E be positive diagonal matrices such that DE = (n − 2)[diag(A)]−1 . Then, if DAE − (n − 3)I is PP, A ∈ IM. Theorem 5.17.15 Let A be an n-by-n normalized SPP matrix, n ≥ 3. Then, there is a minimal real number s0 (A) such that for all s ≥ s0 (A), A + sI is IM. Proof Let A = [aij ] be a normalized SPP matrix, s be a real number, and cij (s) denote the (i, j) cofactor of A + sI, i = j. Then, cij (s) = (−1)i+j det (A + sI)(j, i) aij A[i, σ ] = − det A[σ , j] (A + sI)[σ ] = −aij (1 + s)n−2 + lower order terms in (1 + s) in which σ = n − {i, j}. Because cij (s) → −∞ as s → ∞, there is a smallest real number sij (A) such that cij (s) ≤ 0 for all s ≥ sij (A). Let s0 (A) = max sij (A). Then, by Theorem 5.17.12, A + sI is IM for all s ≥ s0 (A). i =j
Example 5.17.16 Consider the 4-by-4 normalized SPP matrix ⎤ ⎡ 1 0.10 0.40 0.30 ⎢0.40 1 0.40 0.65⎥ ⎥. A=⎢ ⎣0.10 0.20 1 0.60⎦ 0.15 0.30 0.60 1 As seen in Section 5.2, A is not IM (the (2, 3)-entry of A−1 is positive). By actual calculation, U(A) = a131 (a32 a21 + a34 a41 ) = 1.7 > 1. Hence, A + sI is √ ≈ 0.18. IM for all s ≥ 0.7. In fact, A + sIis IM if and only if s ≥ 1537−25 80 If C is a class of matrices, the Hadamard dual of C , denoted IM D , is the set of matrices A ∈ C such that A ∈ SC if and only if A ◦ B ∈ C for all B ∈ C . In [HJ91] an example was given to show that the Hadamard product of two independent inverse M-matrices need not be IM. Later a 4-by-4 symmetric example was given [WZZ00]. Because the Hadamard product of two inverse M-matrices is inverse M when n ≤ 3 (Section 5.16), this fully clarifies when
5.17 Perturbation of IM Matrices
147
the class of IM matrices is itself contained its Hadamard dual. Obviously, the positive diagonal matrices and J, the all ones matrix, lie in the Hadamard dual of the n-by-n IM matrices for any positive integer n. Because IM matrices are closed under positive diagonal multiplication, the latter implies that the positive rank one matrices also lie in the dual. Beyond this, the question is: do the n-by-n IM matrices, n ≥ 4, have anything else in their Hadamard dual? It is noted in Section 5.2 that, if A = [aij ] is an n-by-n IM matrix, then there exist positive diagonal matrices D and E such that A = DA1 E in which A1 = [αij ] is a normalized inverse M-matrix. Because D(A ◦ B)E = A ◦ (DBE) for positive diagonal matrices D and E, we only need to test A with normalized IM matrices B to see if A is in the Hadamard dual. Observe that (DAE) ◦ (FBG) = DF(A ◦ B)GE for positive diagonal matrices D, E, G, and H and that A ◦ B is normalized SPP provided A and B are Lemma 5.17.17 If A is an n-by-n normalized IM matrix, then (A+(n−3)In ) ∈ IMD . Proof Let A be an n-by-n normalized IM matrix and B be an n-by-n IM matrix. Then, there exist positive diagonal matrices D and E such that B = DB1 E in which B1 is a normalized IM matrix. Thus, (A + (n − 3)In ) ◦ B = D[A ◦ B1 + (n − 3)In ]E and the result follows from the remarks above. Theorem 5.17.18 If A is an n-by-n IM matrix and D and E are positive diagonal matrices such that A1 = DAE is normalized, then A + (n − 3)D−1 E−1 ∈ IMD . Proof It follows from Lemma 5.17.20 that A1 + (n − 3)In ∈ IMD . Hence, A + (n − 3)D−1 E−1 = D−1 (A1 + (n − 3)In )E−1 ∈ IMD . The result follows from Theorem 5.17.12. As we noted previously, the Hadamard product of two IM matrices need not be IM for n > 3. However, because of Lemma 5.17.20, the Hadamard product of any IM matrix with many others is IM. Theorem 5.17.18 identifies a number of matrices in IMD , and any IM-matrix may be changed to one in IMD by addition of an appropriate scalar matrix.
5.17.4 Intervals of IM Matrices Our last perturbation result pertains to IM interval matrices. If A, B ∈ Mn (R), the interval from A to B, denoted by I(A, B) is the set of matrices C = [cij ] ∈ Mn (R) satisfying min{aij , bij } ≤ cij ≤ max{aij , bij } for all i and j, while the set of vertices (vertex matrices) derived from A and B, denoted V(A, B), is the set
148
Inverse M-Matrices
of matrices C = [cij ] ∈ Mn (R) such that cij = aij or bij for all i and j. If C = [cij ] in which cij = min{aij , bij } (cij = max{aij , bij }), i, j = 1, . . . , n, then C is called the left endpoint matrix (right endpoint matrix). Note that there are at 2 most 2n distinct vertex matrices. We were motivated by the following question raised by J. Garloff [Gar]: given the interval determined by two IM matrices, when are all matrices in the interval IM? We fully answer this question by showing that all matrices in the interval are IM if and only if V(A, B) ⊆ IM. Then, by example, we show that there are limitations upon strengthening the answer and note that the same technique generalizes our answer to any identically signed class (IS) (a class of matrices defined by the signs weakly or strongly of a certain collection of minors.) The signs may be weak or strong. One might suspect that the interval between any two IM matrices is contained in the class of IM matrices if the two matrices are comparable with respect to the usual entry-wise partial ordering. An example is given to show that this conjecture is false (although it does hold for M-matrices [Fan60]). See Section 5.10 for a discussion of almost principal minors of M and IM matrices. Recall that a totally positive (nonnegative) matrix is a matrix with all minors positive (nonnegative). We denote the totally positive (nonnegative) matrices by TP (TN). Thus, M (IM, TP, TN) matrices, all have identically signed principal and almost principal minors. Of course, some of the defining inequalities may be weak inequalities and others strong. (For example, the almost principal minor inequalities for an M-matrix are weak, while the principal minor inequalities are strong.) Note that each of the classes mentioned so far is an IS class. Example 5.17.19 Consider the IM matrices ⎡
⎤ ⎡ ⎤ 1 .4 .3 1 .9 .6 A = ⎣.6 1 .6⎦ ≤ B = ⎣.6 1 .6⎦ . .4 .6 1 .4 .6 1 ⎤ 1 .6 .3 The matrix C = ⎣.6 1 .6⎦ satisfies A ≤ C ≤ B (entry-wise), yet C is not .4 .6 1 an IM matrix because (C−1 )13 > 0. ⎡
The conclusion of Example 5.17.19 remains true even if we restrict the set of matrices under consideration to the matrices whose inverses are tridiagonal M-matrices (under the entry-wise partial ordering). A line in a matrix is a single row or column.
5.17 Perturbation of IM Matrices
149
Lemma 5.17.20 Suppose that A = [aij ] and B = [bij ] ∈ IM, and that aij = bij except perhaps for the entries in one line. Then, for 0 ≤ t ≤ 1, tA + (1 − t)B ∈ IM. Proof Because A and B are both IM, all their corresponding principal and almost principal minors agree in sign (perhaps weakly in the almost principal case). Because the only entries that vary lie in, at most, a single line, each minor of tA+(1−t)B is a linear function of t. Thus, each principal or almost principal minor of tA + (1 − t)B, 0 ≤ t ≤ 1, agrees in sign (perhaps weakly in the almost principal case) with the corresponding minor of A (or B). Because all key minors are correctly signed, this implies that tA + (1 − t)B, 0 ≤ t ≤ 1, is an IM matrix that completes the proof. An immediate consequence of Lemma 5.17.20 is the following fact that we will use. Corollary 5.17.21 Suppose that A = [aij ] and B = [bij ] ∈ IM, and that aij = bij for (i, j) = (r, s), i.e., A and B differ in at most the r, s entry. Then, tA + (1 − t)B ∈ IM for 0 ≤ t ≤ 1. In the event that A and B differ in only one entry, the interval determined by A and B is the same as the line segment determined by them. If they differ in more entries (even in the same line), then the interval is a larger set than the line segment because independent variation in each entry is allowed. Moreover, the lemma does not necessarily hold if two matrices differ in more than a line. Example 5.17.22 Consider the IM matrices ⎡ ⎤ ⎡ ⎤ 1 0 0 1 .6 0 A = ⎣.6 1 .6⎦ ≤ B = ⎣.6 1 0⎦ . .4 .6 1 .4 .6 1 For 0 < t < 1, ⎡
⎤ 1 .6(1 − t) 0 tA + (1 − t)B = ⎣.6 1 .6t⎦ .4 .6 1 cannot possibly be IM because it is irreducible, but contains a zero entry (see [Joh82]). Theorem 5.17.23 Let A, B ∈ Mn (R). Then V(A, B) ⊆ IM if and only if I(A, B) ⊆ IM.
150
Inverse M-Matrices
Proof First assume that V(A, B) ⊆ IM. Then A and B are both in IM. Let C = [cij ] ∈ I(A, B). If we replace the (1, 1) entry of each matrix in V0 = V(A, B) with c11 to obtain a new set of matrices V1 , it follows from the corollary that V1 ⊆ IM. Similarly, if we replace the (1, 2) entry of each matrix in V1 with c12 to obtain the set of matrices V2 , then V2 ⊆ IM. Proceeding in this manner, filling in the matrices left to right by rows with entries from C, we replace the (i, j) entry of each matrix in Vn(i−1)+j−1 with cij and the resulting set of matrices Vn(i−1)+j ⊆ IM and have half as many elements as Vn(i−1)+j−1 . In the last step of this process, we obtain the set of matrices Vn2 whose sole element is C. Because Vn2 ⊆ IM, C is an IM matrix. Thus, I(A, B) ⊆ IM, and because the converse is obvious, this completes the proof. Theorem 5.17.18, V(A, B) consists of A, B, and 1Note that, in regard 1 to .9 .3 .4 .6 .6 1 .6 ∈ IM, and .6 1 .6 ∈ IM. Thus, we see that we cannot expect the .4 .6 1 .4 .6 1 statement of the theorem to hold for the entry-wise partial order if “all vertices” is replaced by all but one of the vertices. Moreover, as the following example shows, we cannot expect the statement to hold for the “checkerboard” partial ordering, i.e., A B ⇔ (−1)i+k aik ≤ (−1)i+k bik , i, k = 1, . . . , n, either, even if the two endpoints of the interval (the “corner matrices” [Gar96]) are IM. Notice that this latter order corresponds to the sign pattern of the inverse of a TP-matrix. Example 5.17.24 Consider the IM matrices ⎡ ⎤ ⎡ 1 .9 .6 1 A = ⎣.6 1 .6⎦ B = ⎣.6 .4 .6 1 .4 1 .9 .6 The matrix C = .6 1 .6 satisfies A C B, because
(C−1 )32
.4 .3 1
⎤ .4 .6 1 .6⎦ . .3 1 yet C is not an IM matrix
> 0.
Last, we consider the partial order A B, which corresponds to the sign pattern of the inverse of an IM matrix, namely A B ⇔ −aik ≤ −bik , i = k; aik ≤ bik , i = k, i, k = 1, . . . , n, then Theorem 5.17.23 again shows that we cannot expect the statement to hold even if the two corner matrices are IM. In light of this example, it is not obvious that there exists a partial order for which it suffices to check only the corners or any subset of the vertices as was the case for TP-matrices [Gar82] (in which case it was only necessary to check
5.18 Determinantal Inequalities
151
the two corner matrices) and invertible TN-matrices [Gar96] (in which case it was only necessary to test a maximum of 22n−1 vertices). We close by noting that the proof of Lemma 5.17.20 shows that for A, B that lie in any particular IS class and differ in at most one line, the line segment joining them lies fully in the same IS class. Using the corresponding corollary, a corresponding theorem is that for any particular IS class, it is necessary (obviously) and sufficient, for the entire interval to be in that IS class, that the vertices be in that IS class.
5.18 Determinantal Inequalities Classical determinantal inequalities associated with the names of Hadamard, Fischer, Koteljanskii, and Szasz have long been known for M-matrices. Because of Jacobi’s determinantal identity, it is an easy exercise to show that these also hold for IM matrices. The most general of these, associated with the name Koteljanskii, is Theorem 5.18.1 If J, K are index sets contained in n and A ∈ IM is n-by-n, then det A[J ∪ K] det A[J ∩ K] ≤ det A[J] det A[K]. Proof As in the work of Koteljanskii, this may be proven using Sylvester’s determinantal inequality, permutation similarity invariance of IM and the fact that symmetrically placed almost principal minors of A ∈ IM are weakly of the same sign. The inequalities of Hadamard, Fischer, and Szasz may be deduced from Theorem 5.18.1 and are stated below. Corollary 5.18.2 (Hadamard) If A ∈ IM, then det A ≤ ni=1 aii . Corollary 5.18.3 (Fischer) If A ∈ IM and J ⊆ n, then det A ≤ det A[J] det A[J c ]. Corollary 5.18.4 (Szasz) If A ∈ IM and k denotes the product of all the nk principal minors of A of size k-by-k, then 1 ≥ (2 ) ( in which
n i
1 n−1 1 )
≥ (3 ) (
1 n−1 2 )
≥ · · · ≥ n
denotes the i-th binomial coefficient, i = 0, 1, . . . , n.
There are inequalities among products of principal minors of any A ∈ IM besides those in Theorem 5.18.1 and its chordal generalizations. Recently, all
152
Inverse M-Matrices
such inequalities have been characterized [FHJ98]. In order to understand this result, consider two collections α = {α1 , . . . , αp } and β = {β1 , . . . , βp } of index sets, αi , βj ⊆ n, i, j ∈ {1, . . . , p}. For any index set J ⊆ n and any collection α, define the two functions fα (J) ≡ the number of sets αi such that J ⊆ αi and Fα (J) ≡ the number of sets αi such that αi ⊆ J. For the two collections α, β, the following two set-theoretic axioms are important: (ST0)
fα ({i}) = fβ ({i}), i = 1, . . . , n
and (ST2)
Fα (J) ≥ Fβ (J), for all J ⊆ N.
A third axiom (ST1) arises only in the characterization of determinantal inequalities for M-matrices. The result for IM is then Theorem 5.18.5 The following statements about two collections α, β of index sets are equivalent: (i)
t det A[αi ] i=1 is bounded over all A ∈ IM; t i=1 det A[βi ] t t i=1 det A[αi ] ≤ i=1 det A[βi ] for all A
∈ IM; and (ii) (iii) the pair of collections α, β satisfy ST0 and ST2. The proof is given in [FHJ98]. The above results leave only the question of whether there are inequalities involving some non-principal minors in a matrix A ∈ IM. Because IM matrices are P-matrices, there are some obvious inequalities, e.g., det A[α + i, α + j] det A[α + j, α + i] ≤ det A[α + i] det A[α + j] whenever i, j ∈ / α. There is also a family of nontrivial inequalities involving almost principal minors of IM matrices that extend those of Theorem 5.11.1. These inequalities exhibit a form of monotonicity already known for principal minors. Recall that if α ⊆ β ⊆ n, then det A[β] ≤ det A[α] akk for A ∈ IM. k∈β−α
5.19 Completion Theory
153
This just follows from det A[β] ≤ det A[α] det A[β − α] and Hadamard’s inequality (both are special cases of Theorem 5.19.1). If A were normalized, the product of diagonal entries associated with β − α would disappear and the above inequality could be paraphrased “bigger minors are smaller.” The same holds for our new inequalities, which generalize those given in Theorem 5.11.1. Theorem 5.18.6 Let α ⊆ β ⊆ n, and suppose that A = [aij ] ∈ IM is n-by-n. Then, if det A[α + i, α + j] = 0, |detA[β + i, β + j]| det A[β ∪ {i, j}] aii ≤ ≤ det A[β − α] ≤ |detA[α + i, α + j]| det A[α ∪ {i, j}] i∈β−α
whenever i = j, i, j ∈ / β. If det A[α + i, α + j] = 0, then det A[β + i, β + j] = 0, also, for i = j, i, j ∈ / β. We note that other determinantal inequalities for IM matrices were given in [Che07b].
5.19 Completion Theory A partial matrix is an array with some entries specified, and the other, unspecified, entries free to be chosen. A completion of a partial matrix is the conventional matrix resulting from a particular choice of values for the unspecified entries. [Joh90] is a good reference on matrix completion problems. One topic of interest is the completion of partial PP (SPP) matrices, i.e., nonnegative matrices such that every specified path satisfies the PP (SPP) conditions in [JS01a, theorem 4.1]. We make the assumption throughout that all diagonal entries are 1s because PP matrices are invariant under positive diagonal scaling. The SPP matrix completion problem is fundamental in considering the (difficult) IM matrix completion problem [JS96]. Here, a partial IM-matrix is a partial nonnegative matrix in which each fully specified principal submatrix is IM, and we wish to determine whether a given partial IM matrix can be completed to IM. Because every IM matrix is SPP, for such a completion to exist it is necessary for the partial IM matrix to be partial SPP and for any IM matrix completion to be SPP. Thus, the set of SPP completions (if they exist) of a partial IM matrix is a place to start in the search for an IM matrix completion, and it represents a narrowing of the superset of possible completions. From [JS01a], we have
154
Inverse M-Matrices
Theorem 5.19.1 Every partial PP (SPP) matrix has a PP (SPP) matrix completion. Partial IM matrices are not necessarily partial PP as shown by the following example. Example 5.19.2 Consider the partial IM matrix ⎡ ⎤ 1 12 ? 19 ⎢ 1 1 1 ?⎥ 2 2 ⎥ A=⎢ ⎣? 1 1 1 ⎦ . 2 2 1 ? 12 1 9 A is not partial PP because a12 a23 a34 = 18 > 19 = a14 . Thus, A cannot be completed to a PP matrix and, because every IM matrix is PP, no IM matrix completion exists. In fact, even if a partial IM matrix is partial SPP, it may not have an IM matrix completion. For instance, if in Example 5.12.7, we let the (1, 4) and (4, 1) entries be unspecified, it can be shown [JS96] that no IM matrix completion exists. A chordal graph is k-chordal [Gol80] if no two distinct maximal cliques intersect in more than k vertices. In [JS96] the symmetric IM (SIM) completion problem was studied, and it was shown that, for partial SIM matrices, 1-chordal graphs guarantee SIM completion. Theorem 5.19.3 Let G be a 1-chordal graph on n vertices. Then every n-byn partial SIM matrix A, the graph of whose specified entries is G, has a SIM completion. Moreover, there is a unique SIM completion A1 of A whose inverse entries are 0 in every unspecified position of A, and A1 is the unique determinant maximizing SIM completion of A. However, this is not true for chordal graphs in general. Consider the partial SIM matrix (which has a 3-chordal graph with two maximal cliques) ⎤ ⎡ 9 1 2 1 40 x 5 5 ⎢9 1 3⎥ 3 ⎥ ⎢ ⎢ 40 1 10 2 8 ⎥ ⎢1 1 3 3⎥ A = ⎢ 5 10 1 5 4 ⎥ . ⎥ ⎢ 1 1 ⎢2 1 14 ⎥ ⎦ ⎣5 2 5 3 3 1 x 1 8 4 4 In [JS99], using (5.9.11) and the cofactor form of the inverse, it was shown that no value of x yields a SIM completion.
5.20 Connections with Other Matrix Classes
155
Cycle conditions are derived that are necessary for the general IM completion problem. Further, these conditions are shown to be necessary and sufficient for completability of a partial symmetric IM matrix, the graph of whose specified entries is a cycle. Theorem 5.19.4 Let ⎡
1 ⎢a ⎢ 1 ⎢ ⎢ ⎢ A=⎢ ⎢ ⎢ ⎢ ⎣ an
a1 1 a2
?
an a2 · ·
? · · · · ·
· · an−1
an−1 1
⎤ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎦
be an n-by-n partial SIM matrix, n ≥ 4. Then A has a SIM completion if and only if the cycle conditions
aj ≤ ai , i = 1, . . . , n
j =i
are satisfied A block graph is a graph built from cliques and (simple) cycles as follows: starting with a clique or cycle, sequentially articulate the “next” clique or simple cycle at no more than one vertex of the current graph. A completability criterion for partial SIM matrices with block graphs is as follows. Theorem 5.19.5 Let A be an n-by-n partial SIM matrix, the graph of whose specified entries is a block graph. Then A has a SIM completion if and only if all minimal cycles in G satisfy the cycle conditions. In addition, other graphs for which partial SIM matrices have SIM completions are discussed.
5.20 Connections with Other Matrix Classes Let A ≥ 0. A is totally nonnegative (totally positive) if all minors of all orders of A are nonnegative (positive); further, A is oscillatory if A is totally nonnegative and a power of A is totally positive [Gan59]. In [Mark72] the class of totally nonnegative matrices whose inverses are M-matrices is characterized as follows
156
Inverse M-Matrices
Theorem 5.20.1 Suppose A is an invertible, totally nonnegative matrix. Then A−1 is an M-matrix if and only if det A(i, j) = 0 for i + j = 2k, in which k is a positive integer, and i = j. A special class of oscillatory matrices that are IM is also investigated: A Jacobi matrix is a square matrix A = [aij ] with real entries satisfying aij = 0 for |i − j| > 1. Further, A is a normal Jacobi matrix if ai,i+1 , ai+1,i < 0 for i = 1, 2, . . . , n − 1 [Lew80]. In [Lew89] the results in [Mark72] are extended as follows. Theorem 5.20.2 Let A be a square matrix. Consider the following three conditions. (i) A−1 is totally nonnegative. (ii) A is an M-matrix. (iii) A is a Jacobi matrix. Then any two of the three conditions imply the third. Theorem 5.20.3 Let A be a square matrix. Consider the following three conditions. (i) A−1 is oscillatory. (ii) A is an M-matrix. (iii) A is a normal Jacobi matrix. Then any two of the three conditions imply the third. Theorem 5.20.4 Let A be an M-matrix. Then A−1 is totally nonnegative if and only if A is a Jacobi matrix, and A−1 is oscillatory if and only if A is a normal Jacobi matrix. In [Pen95] the class of totally nonnegative matrices that are IM are fully characterized as follows. Theorem 5.20.5 Let A be an invertible totally nonnegative n-by-n matrix. Then the following properties are equivalent. (i) (ii) (iii) (iv)
A−1 is an M-matrix. det A(i, j) = 0 if |i − j| = 2. A−1 is a tridiagonal matrix. For any k ∈ {1, 2, . . . , n − 1}, det A([i1 , . . . , ik ], [j1 , . . . , jk ]) = 0 if |i1 − jl | > 1 for some l ∈ {1, . . . , k}.
We have the following characterization of IM matrices that are totally positive [Pen95].
5.21 Graph Theoretic Characterizations
157
Theorem 5.20.6 Let A be an invertible n-by-n M-matrix. Then the following properties are equivalent. (i) A−1 is a totally positive matrix. (ii) A is a tridiagonal matrix.
5.21 Graph Theoretic Characterizations of IM Matrices In [LN80] graph theoretic characterizations of (0, 1)-matrices that are IM matrices were obtained. A matrix A ∈ Rn×n is essentially triangular if for some permutation matrix P, P−1 AP is a triangular matrix. Let A be an n-by-n matrix and G(A) = (V(A) = N, E(A)) be its associated (adjacency) graph, i.e., (i, j) ∈ E(A) if and only if aij = 0, where V(A) is its set of vertices and E(A) is its set of edges. A path from vertex i to vertex j is called an (i,j)-path; a path of length k, k being the number of directed edges in the path, is called a k-path; and a path of length k from i to j is called an (i, j|k)-path. Let H denote the join of the (disjoint) graphs G1 and G2 in which G1 consists of a single vertex and G2 consists of a 2-path. (H is called a buckle.) Then, Theorem 5.21.1 An essentially triangular (0, 1)-matrix A is IM if and only if the associated graph is a partial order and, for every i, j, k, if an (i, j|k)-path in G(A) is a maximal (i, j)-path, then it is a unique k-path. Theorem 5.21.2 A (0, 1)-matrix A is IM if and only if G(A) induces a partial order on its vertices that contains no buckles as induced subgraphs. In [KB85] we have the following result that relates the zero pattern of a transitive, invertible (0, 1) matrix and that of its inverse. Theorem 5.21.3 Suppose that A = [aij ] is an n-by-n transitive, invertible (0, 1) matrix with B = [bij ] = A−1 . If i = j and aij = 0, then bij = 0. For a matrix B, let BIM denote the set BIM = {α ∈ R : αI + B ∈ IM}. BIM is a (possibly empty) ray on the real line, bounded from below, and is nonempty if and only if B is nonnegative and transitive. Then, Theorem 5.21.4 Let B be an n-by-n nonnegative transitive matrix. Then the ray BIM is closed, i.e., BIM = [α0 , ∞) for some α0 ≥ 0, if and only if there exists β0 ∈ R such that / IM, (i) B + β0 I ∈ (ii) B+β0 I is positive stable and so are its principal submatrices of order n−1.
158
Inverse M-Matrices
The authors express the infimum of BIM as a maximal root of a polynomial that depends on the matrix B, differentiating between the cases in which BIM is open or closed. In [Lew89] the following generalization was obtained. Consider a nonnegative square matrix A = [aij ] and its associated adjacency graph G(A) = (V(A), E(A)). The weight of an edge (i, j) of G(A) is then aij . The weight of a directed path is the product of the weights of its edges. The weight of a set of paths is the sum of the weights of its paths. For essentially triangular IM matrices, the following result was proved. Theorem 5.21.5 Let A be an essentially triangular nonnegative invertible matrix, and let A1 be its normalized matrix. Then A1 ∈ IM if and only if G(A1 ) is a partial order and the weight of the collection of even paths between any two vertices in G(A) does not exceed the weight of its collection of odd paths. A graph-theoretic characterization of general IM matrices was then given. Theorem 5.21.6 A nonnegative square matrix A is IM if and only if (i) For every two distinct vertices i and j in G(A), the sum of the even (i, j)paths does not exceed that of the odd (i, j)-paths. (ii) All principal minors of A are positive. A special class of inverse IM matrices was also characterized.
5.22 Linear Interpolation Problems The linear interpolation problem for a class of matrices C asks for which vectors x, y there exists a matrix A ∈ C such that Ax = y. In [JS01c] the linear interpolation problem is solved for several important classes of matrices, one of which is M (and hence it is solved for IM also). In addition, a transformational characterization is given for M-matrices that refines the known such characterization for P-matrices. These results were extended to other classes of matrices in [Smi01].
5.23 Newton Matrices For an n-by-n real matrix A with eigenvalues λ1 , λ2 , . . . , λn let Sk = λi1 . . . λik i1 0, (A + D)−1 A ≥ 0 for each positive diagonal matrix D, (I + cA)−1 ≤ I for all c > 0, and cA2 (I + cA)−1 ≤ A for all c > 0.
The theorem allows the authors to characterize nilpotent matrices on the boundary of IM. Denote by Lrn the set of all nonnegative r-by-n matrices, r ≤ n, which contain exactly one nonzero entry in each column. If the dimensions are not specified, write simply L. In [FM88b] a matrix A belonging to IM was shown to have an explicit form. Specifically, they prove Theorem 5.26.3 An n-by-n matrix A is in IM if and only if there exists a permutation matrix P, a diagonal matrix D with positive diagonal entries, a matrix B ∈ IM, and a matrix Q ∈ L without a zero row such that ⎤ ⎡ 0 UBQ UBV + W D−1 PAPT D = ⎣0 QT BQ QT BV ⎦ 0
0
0
for some nonnegative matrices U, V, and W. In the partitioning, any one or two of the three block rows (and their corresponding block columns) can be void. The preceding theorem allows the characterization of singular matrices k in IM. If A is a square matrix, the smallest integer k for which rank A = rank k+1 is called the index of A, denoted by index(A) = k. The Drazin inverse A D of A is the unique matrix A such that (i) Ak+1 AD = Ak ; (ii) AD AAD = AD ; (iii) AAD = AD A. In [FM90] the Drazin inverse of a matrix belonging to IM is determined and, for such a matrix A, the arrangement of the nonzero entries of the powers A2 , A3 , . . . is shown to be invariant.
5.27 Tridiagonal, Triangular, and Reducible IM Matrices In [Ima83] IM matrices whose nonzero entries have particular patterns were studied. First, tridiagonal IM matrices were characterized as follows.
162
Inverse M-Matrices
Theorem 5.27.1 Let A be a nonnegative, invertible, tridiagonal n-by-n matrix. Then the following statements are equivalent: (i) A is an IM matrix. (ii) All principal minors of A are positive, and A is the direct sum of matrices of the following types: (a) diagonal matrices, (b) 2-by-2 positive matrices, or (c) matrices of the form ⎡ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ A=⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎣
d1 0 0 · · · · · · · 0
a1 d2 b1 0
0 0 d3 0 0
·
where at = 0 and u = even.
· s−1 s
· 0 a2 d4 b2 0
·
· 0 0 d5 0 ·
·
·
0 a3 d6 · ·
·
·
0 0 · · · ·
·
·
·
· · · · · · · · · · · · · 0 bu
0 · · · · · · · 0 at ds
when s is odd, and bu = 0 and t =
s 2
⎤ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎦
when s is
A nonempty subset K of Rn , which is closed under addition and nonnegative scalar multiplication, is called a (convex) cone in Rn . If a cone K is also closed topologically, has a nonempty interior, and satisfies K ∩ (−K) = ∅, then K is called a proper cone. If A is a real matrix, then the set K(A) = {Ax : x ≥ 0} is a polyhedral cone, i.e., a set of nonnegative linear combinations of a finite set S of vectors in Rn . Using these notions, the following geometric characterization of upper triangular IM matrices was given [Ima83]. Theorem 5.27.2 Let A be a nonnegative upper triangular n-by-n matrix with 1s along the diagonal. Then the following statements are equivalent. (i) A is an IM matrix. (ii) Aek − ek ∈ K(Ae1 , Ae2 , . . . , Aek−1 ) for k = 2, . . . , n. For certain types of reducible matrices, both necessary conditions and sufficient conditions for the reducible matrix to be IM are provided in [Ima84].
5.28 Ultrametric Matrices
163
5.28 Ultrametric Matrices A = [aij ] is a strictly ultrametric matrix [MMS94] if A is real symmetric and (entry-wise) nonnegative and satisfies (i) aik ≥ min(aij , ajk ) for all i, j, k; and (ii) aii > maxk =i aik for all i. In [MMS94] it was shown that if A = [aij ] is strictly ultrametric, then A is invertible and A−1 = [αij ] is a strictly diagonally dominant Stieltjes matrix satisfying αij = 0 if and only if aij = 0. The proof utilized tools from topology and real analysis. Then, a simpler proof that relied on tools from linear algebra was given in [NV94]. Such statements, and some following, also follow easily from [JS11, theorem 9.1], giving a yet simpler proof. In [Fie98b] new characterizations of matrices whose inverse is a weakly diagonally dominant symmetric M-matrix are obtained. The result in [MMS94] is shown to follow as a corollary. In [Fie98b] connections between ultrametric matrices and Euclidean point spaces are explored. See also [Fie76]. There has been a great deal of work devoted to generalizations of ultrametric matrices. In [NV95] and [MNST95], nonsymmetric ultrametric (called generalized ultrametric) matrices were independently defined and characterized. Necessary and sufficient conditions are given for regularity, and in the case of regularity, the inverse is shown to be a row and column diagonally dominant M-matrix. In [MNST96] nonnegative matrices whose inverses are M-matrices with unipathic graphs (a digraph is unipathic if there is at most one simple path from any vertex i to any vertex k) are characterized. A symmetric ultrametric matrix A is special (symmetric ultrametric matrix) [Fie00] if, for all i, aii = max aik . k =i
In [Fie00] graphical characterizations of special ultrametric matrices are obtained. In [MSZ03], by using dyadic trees, a new class of nonnegative matrices is introduced, and it is shown that their inverses are column diagonally dominant M-matrices. In [Stu98] a polynomial time spectral decomposition algorithm to determine whether a specified, real symmetric matrix is a member of several closely related classes is obtained. One of these classes is the class of strictly ultrametric matrices. In [TV99] it is shown that a classical inequality due to Hardy can be interpreted in terms of symmetric ultrametric matrices, and then a generalization of this inequality can be derived for a particular case.
6 Copositive Matrices
6.1 Introduction ( ) Let n = x ∈ Rn : x ≥ 0 and eT x = 1 be the unit simplex of Rn . For convenience in this chapter, we let Mn ({−1, 0, 1}) and Mn ({−1, 1}) denote the n-by-n matrices whose entries lie in the indicated sets. Definition 6.1.1 A symmetric matrix A ∈ Mn (R) is (i) Copositive (A ∈ C) if x ∈ n implies xT Ax ≥ 0; (ii) Strictly copositive (A ∈ SC) if x ∈ n implies xT Ax > 0; (iii) Copositive-plus (A ∈ C+ ) if A is copositive and x ∈ n , xT Ax = 0 imply Ax = 0. Of course, the requirement x ∈ n may be replaced by x ∈ Rn , 0 = x ≥ 0. Note that SC ⊆ C+ ⊆ C. The classes C and SC are broader than the positive semidefinite (PSD) and positive definite (PD) classes, respectively, as requirements are placed upon fewer vectors. If sN denotes the symmetric, nonnegative matrices in Mn (R), then both PSD, sN ⊆ C, so that PSD + sN ⊆ C. Copositive matrices were first defined by Motzkin in 1952 [Mot52]. Much of the early, formative work on copositivity ([Mot52, Dia62, Bau66, Bau67, Bas68]) focused on whether C = PSD + sN for general n (Diananda’s conjecture, [Dia62]). For n = 2, C = PSD ∪ sN, and C = PSD + sN for n = 3, 4 [Dia62]. However, the question was settled negatively by an example of Alfred Horn, reported in [Hal67].
6.1 Introduction
165
Example 6.1.2 Let A ∈ M5 (R) be the symmetric circulant ⎤ ⎡ 1 −1 1 1 −1 ⎢ −1 1 −1 1 1 ⎥ ⎥ ⎢ ⎥ ⎢ A = ⎢ 1 −1 1 −1 1 ⎥. ⎥ ⎢ ⎣ 1 1 −1 1 −1 ⎦ −1 1 1 −1 1 Then A ∈ C (see the general result in [Hal67]), but A is not the sum of a PSD matrix and a nonnegative matrix, as any symmetric decrease in a pair of symmetrically placed off-diagonal entries, or in a diagonal entry, leaves a matrix that is not PSD. The matrix itself has two negative eigenvalues and is not PSD. Counterexamples to the Diananda conjecture for n > 5 may be constructed by direct summation, so that C ⊆ PSD + sN if and only if n ≤ 4. A copositive matrix not in PSD + sN is called an exceptional (copositive) matrix. The copositive matrices form a convex cone in Mn (R), and the quadratic form xT Ax of a matrix on an extremal of this cone is called an extreme copositive quadratic form. The important Example 6.1.2 has only entries in {−1, 1}. This has brought a lot of attention to C ∩ Mn ({−1, 1}) and C ∩ Mn ({−1, 0, 1}). In the former case, the diagonal entries must all be 1, and in the latter, they must be 0 or 1. (But, then, we need only check the principal submatrix based upon the diagonal entries that are equal to 1.) But, in general, the case in which all are 1 has seen the most interest. These matrices have a nice characterization [HP73]. A lemma of independent interest is useful. Lemma 6.1.3 Let AT = A ∈ Mn (R) have all diagonal entries equal to 1. If every principal submatrix of A, in which all off-diagonal entries are strictly less than 1, is copositive, then A is copositive. Proof If every off-diagonal entry of A is strictly less than 1, then the hypothesis includes the desired conclusion (A ∈ C), and there is nothing to do. If not, suppose that aij ≥ 1 and that x ∈ Pn . Then, we show that there is a y = [yk ] ∈ n , with yi yj = 0, such that yT Ay ≤ xT Ax. Successive application of this observation yields that any value of the quadratic form of A on n by sums of values of the quadratic form (via nonnegative vectors) on principal submatrices in which all off-diagonal entries are < 1. Because these are all copositive, it will follow that A is copositive. Suppose that x ∈ n , xi xj > 0 and aij ≥ 1. Without loss of generality, we can assume i < j. We construct y ∈ n such that yi yj = 0 and yT Ay ≤ xT Ax. Let t = xi + xj and define
166
Copositive Matrices x(u) = [x1 . . . xi−1 u xi+1 . . . xj−1 t − u xj+1 . . . xn ]T .
Now, f (u) = x(u)T Ax(u) is a concave function of u on [0, t] (linear, if aij = 1). So the minimum of f , 0 ≤ u ≤ t, occurs at an endpoint of [0, t]. Let y be the corresponding vector, and the necessary observation is proven. Theorem 6.1.4 Suppose that A ∈ Mn ({−1, 1}) with all diagonal entries 1. Then A ∈ C, unless there is a 3-by-3 principal submatrix of A that is ⎡ ⎤ 1 −1 −1 ⎣−1 1 −1⎦ . −1 −1 1 Proof By the Lemma 6.1.3, we need only check principal submatrices, all of whose off-diagonal entries are −1. If there is one of size 3-by-3 or more, there is one of size 3-by-3. But the displayed matrix is not in C, by evaluating the quadratic form at e; and, then, A ∈ C. On the other hand, if there is no such principal submatrix of size 3-by-3 or more, then A ∈ C by the Lemma 6.1.3. Theorem 6.1.5 Suppose that A ∈ Mn ({−1, 0, 1}) with all diagonal entries equal to 1. Then A ∈ C, unless there is a 3-by-3 principal submatrix of A that is either ⎡ ⎤ 1 −1 −1 ⎣−1 1 −1⎦ −1 −1
1
or permutation similar to ⎡
⎤ 1 −1 −1 ⎣−1 1 0 ⎦. −1 0 1 Proof Let B be the first displayed matrix in the hypothesis and C be the second. Suppose A is copositive. Because neither B nor C is copositive, neither can be a principal submatrix of A. On the other hand, if A is not copositive, it follows from the lemma that A contains a principal submatrix M, which is not copositive and has all off-diagonal entries 0 or −1. If M contains a row with at least two −1s, then M contains a principal submatrix that is permutation similar to B or C. If each row of M contains at most one −1, then M is clearly the direct sum of matrices of the form
1 −1 and 1 . −1 1 So M is positive semidefinite, hence, copositive.
6.2 Basic Properties of Copositive Matrices
167
In the ensuing paragraphs, we summarize the basic theory of cones and their connections with copositive matrices. The topological interior of C is SC. Equivalently, the topological closure of SC is C. C is a closed convex cone in the class of symmetric matrices. C has a nonempty interior and is pointed, i.e., C ∩ (−C) = {0}. C is not polyhedral, i.e., it is not the case that C is the intersection of a finite number of half-spaces. A copositive matrix Q is an extreme (copositive) matrix if and only if whenever Q = Q1 + Q2 , in which Q1 and Q2 are copositive, then Q1 = tQ and Q2 = (1 − t)Q for some t ∈ [0, 1]. Other work in the 1960s focused on extreme copositive quadratic forms. The polar (dual) K ∗ of a cone K ⊆ Rn is the set of all x ∈ Rn such that yT x ≥ 0 for all y ∈ K. K is said to be self-polar if K = K ∗ . A matrix is copositive with respect to a cone K if xT Ax ≥ 0 for all x ∈ K.
6.2 Basic Properties of Copositive Matrices in Mn (R) The copositive matrices form a bigger class than the PSD matrices, but they are a natural generalization that is analogous in many ways. We record some basic properties here. Lemma 6.2.1 (Inheritance) Let A ∈ Mn (R) and α ⊆ {1, 2, . . . , n}. If A ∈ C (SC), then the principal submatrix A[α] ∈ C (SC). Proof Let A ∈ C of order n and suppose there exists a vector x ≥ 0, x = 0
such that xT A[α]x 0. Then, for the n-by-1 vector y = 0x , y Ay = T
xT
0
x 0
A[α] A[α, α c ] A[α c , α] A[α c ]
= xT A[α]x 0, a contradiction. The strictly copositive case is similar. Corollary 6.2.2 If A ∈ C (SC), then the diagonal entries of A are nonnegative (positive), and so, Tr(A) ≥ 0 (Tr(A) > 0). Because xT (A + B)x = xT Ax + xT Bx and xT (t A)x = t xT Ax when t ∈ R, we have Lemma 6.2.3 If A, B ∈ Mn (R) and A, B ∈ C (SC), then A + B ∈ C (SC) and t A ∈ C (SC) for t > 0. Thus, C and SC both form a cone in Mn (R). Of course, SC ⊆ C.
168
Copositive Matrices
Example 6.2.4 Notice that 0 1 A= 1 1
lies in C, but not SC. For x = 10 , xT Ax = 0. Also, A ∈ C+ , as Ax = 0. Example 6.2.5 Notice that
1 −1 A= −1 1 lies in C, as it is PSD, but not SC as eT Ae = 0, where e = [1 1]T . However, A ∈ C+ , as xT Ax = 0 implies that x is a multiple of e, and Ae = 0. Because the vector e > 0, and eT Ae is just the sum of the entries of A, we have Lemma 6.2.6 If A ∈ Mn (R) and A ∈ C (SC), then the sum of the entries of A is nonnegative (positive). Recall that a transformation in which A ∈ Mn (R) → CT AC for an invertible C ∈ Mn (R) is called a congruence of A. If A is PSD, any congruence of A is again PSD. However, this is not so for C. Example 6.2.7 If
1 2 A= , 2 1 then A is strictly copositive, due to positivity. However, if 1 0 D= , 0 −1 then
1 −2 D AD = , −2 1 T
which is not copositive. So, to maintain copositivity, restrictions must be made on congruence, especially if we wish to have an automorphism. First, because we wish the matrix carrying the congruence to map nonnegative vectors to nonnegative vectors, we have Lemma 6.2.8 If A ∈ Mn (R) ∩ C and C ∈ Mn (R), C ≥ 0, then CT AC ∈ C.
Proof x
6.2 Basic Properties of Copositive Matrices 169 CT AC x = (Cx)T A(Cx) ≥ 0 because Cx ≥ 0. So CT AC ∈ C.
T
Recall the definition from Chapter 3 that C ∈ Mr,m (R) is called row positive if C ≥ 0 and C has at least one positive entry in each row. Similarly, for SC, we have Lemma 6.2.9 If A ∈ Mn (R) ∩ SC and C ∈ Mn (R) is row positive, then CT AC ∈ SC. Proof Suppose 0 = x ≥ 0, and consider xT CT ACx = yT Ay, for y = Cx. Because Cx ≥ 0, Cx = 0 because C is row positive, yT Ay > 0, and CT AC is SC. Remark 6.2.10 For every x ∈ Rn with some negative entries, there is an A ∈ SC such that xT Ax < 0. To see this, it suffices to consider n = 2 and A with small positive diagonal entries and a large positive off-diagonal entry. Recall that a monomial matrix is a square matrix with exactly one positive entry in each row and column. Now, as nonnegative monomial matrices are those that map the nonnegative orthant onto itself, we have Theorem 6.2.11 If A, C ∈ Mn (R) and C is monomial, then CT AC ∈ C (SC) if and only if A ∈ C (SC). Moreover, no broader class of congruences is an automorphism (of either class). Proof By Lemma 6.2.9, A ∈ C (SC) implies CT AC ∈ C (SC), as monomial matrices are row positive. Because the inverse of a monomial matrix is monomial, the converse follows, as well. Because the monomial matrices are the largest class of matrices F such that both F and F −1 map the nonnegative orthant to itself, Remark 6.2.10 ensures that no larger class of congruences works. Two useful special cases are the following. Corollary 6.2.12 If A ∈ Mn (R) and P is a permutation matrix, then PT AP ∈ C (SC) if and only if A ∈ C (SC). T Corollary 6.2.13 If A ∈ Mn (R) and D ∈ D+ n , then D AD ∈ C (SC) if and only if A ∈ C (SC).
Because the diagonal entries must be positive (nonnegative) in order that a matrix be SC (C), the positive diagonal congruence automorphism may be used to normalize the diagonal entries of a matrix to be classified as SC (resp., C) to all be 1 (resp., 0 or 1). In the case of copositivity, it is useful to note that if a diagonal entry is 0, then the off-diagonal entries in that row and column must
170
Copositive Matrices
be nonnegative. This is an analog of a familiar fact about 0 diagonal entries of PSD matrices: the corresponding rows and columns must be 0. Theorem 6.2.14 If A = [aij ] ∈ Mn (R) ∩ C and aii = 0, then aij , aji ≥ 0, j = 1, 2, . . . , n. Proof If aii = 0 and aij < 0, then the principal submatrix
0 −s A[{i, j}] = −s t
with s > 0 and t ≥ 0. By
Lemma 6.2.1, this 2-by-2 matrix must be intyC. But, its quadratic form on xy ∈ R2 is ty2 −2sxy. Choosing y > 0 and x > 2s makes the quadratic form negative, a contradiction that completes the proof. Because of the inheritance requirement in Lemma 6.2.1, this means that A ∈ Mn (R) is in C if and only if off-diagonal entries are nonnegative in rows with 0 diagonal entries and the principal submatrix induced by its positive diagonal entries is in C. Of course, if all diagonal entries are 0, the matrix must be nonnegative and, thus, copositive. This means that to check membership in C or SC, we need only consider matrices with all diagonal entries 1. We have seen that each of the three classes C, SC, and C+ is closed under addition, positive scalar multiplication (thus, each is a cone), principal submatrix extraction (inheritance), and monomial congruence (thus, closed under permutation similarity). Copositive matrices are not closed under inversion or Schur complementation. Conditions for the inverse to be copositive can be found in [HS10, lemma 7.7]. A sufficient condition for the Schur complement to be copositive can be found in [JR05, theorem 4]. Example 6.2.15 If ⎡
⎤ 1 2 0 A = ⎣2 1 2⎦ , 0 2 1 then A ∈ C, the (2, 2) entry of A−1 = −1/7 and A/A[{1, 2}] = −5/3. So A−1 ∈ / C and A/A[{1, 2}] ∈ / C.
6.3 Characterizations of C, SC, and C+ Matrices In the fundamental copositive paper [CHL70b], C(SC, C+ ) matrices are characterized using inductive criteria for copositivity. In [Had83], a more general
6.3 Characterizations of C, SC, and C+ Matrices
171
criterion provides simpler coordinate-free proofs of the results of [CHL70b]. To begin, we simply state the main results of [CHL70b] (because the proofs are a bit tedious). Definition 6.3.1 [Had83] The symmetric matrix A ∈ Mn (R) is called (strictly) copositive of order m, 1 ≤ m ≤ n, if and only if every principal submatrix of order m is (strictly) copositive. There are corresponding definitions with respect to PD and PSD. Theorem 6.3.2 [CHL70b] Let M ∈ Mn (R). Then M ∈ C+ if and only if for some permutation matrix P, A B T P MP = T D B in which (i) A ∈ Mr (R) is PSD, 0 ≤ r ≤ n; ˆ (ii) B = ABˆ for some B; T ˆ ˆ (iii) D − B AB ∈ SC (hence D ∈ SC). Theorem 6.3.3 [CHL70b] Let the symmetric matrix M ∈ Mn (R) be copositive of order n − 1. Then M ∈ / C if and only if (i) adj M ≥ 0, and (ii) det M < 0 (in which case, M is PSD of order n − 1, M −1 ≤ 0, and M has one negative eigenvalue that is of minimum magnitude and has a positive eigenvector). Example 6.3.4 Consider the symmetric matrix ⎡ ⎤ 1 −1 −1 M = ⎣−1 1 −1⎦ , −1 −1 1 which is C (but not SC) of order 2. Then ⎡ ⎤ 0 2 2 adj M = ⎣2 0 2⎦ ≥ 0, and 2 2 0 / C. det M = −4 < 0 so that M −1 ≤ 0. From Theorem 6.3.3, it follows that M ∈ Further, M ∈ PSD of order 2 and M has exactly one negative eigenvalue, namely −1, of minimum modulus and with corresponding positive eigenvector e, the all 1s vector.
172
Copositive Matrices
Similarly, we have Theorem 6.3.5 [CHL70b, theorem 5.3.5] Let the symmetric matrix M ∈ / SC if and only if Mn (R) be SC of order n − 1. Then M ∈ (i) adj M > 0, and (ii) det M ≤ 0 (in which case, M is PD of order n − 1, so that rank(M) ≥ n − 1). Furthermore, if M ∈ / SC, then either (a) M ∈ / C, which is true if and only if det M < 0 (in which case M has exactly one negative eigenvalue of strictly minimum magnitude with a positive eigenvector), or (b) M ∈ C, which is true if and only if det M = 0 (in which case M ∈ PSD, rank M = n − 1, and the eigenvector associated with 0 is positive). Example 6.3.6 Consider the symmetric matrix ⎡
⎤ 3 −2 −1 M = ⎣−2 2 0 ⎦. −1 0 1 M is PD (and hence, SC) of order 2, det M = 0, and adj M = 2J (in which J is the all 1s matrix). So, by contraposition, M ∈ C and the eigenvector associated with 0 is v = e. The following definition will be of use. Definition 6.3.7 [Had83] Let A be a symmetric matrix, and let B be a strictly copositive matrix of the same order. The pair A, B is called a (strictly) codefinite pair if and only if Ax = λBx, x > 0 implies λ ≥ 0 (λ > 0). Now we establish the results of [Had83] for C (SC) matrices. Theorem 6.3.8 [Had83] Let A ∈ Mn (R) be copositive of order n − 1. The following statements are equivalent: (i) A is strictly copositive. (ii) There is a strictly copositive matrix B of order n such that the pair A, B is (strictly) codefinite. (iii) For every strictly copositive matrix B of order n the pair A, B is (strictly) codefinite.
6.3 Characterizations of C, SC, and C+ Matrices
173
Proof (i) ⇒ (iii): Let B be SC of order n. Suppose Ax = λBx, x > 0. Then, xT Ax = λxT Bx, and thus λ ≥ 0(λ > 0). (iii) ⇒ (ii): This is obvious. (ii) ⇒ (i): Suppose the form xT Ax assumes negative (nonpositive) values for certain x > 0, x = 0. Because A is (strictly) copositive of(order n−1, such x are ) necessarily > 0. Consider the form xT Ax on the set D = x : x ≥ 0, xT Bx = 1 . It assumes its minimum at a point xˆ of the relative interior of D. The minimum is negative (nonpositive). From the necessary condition for a minimum of xT Ax on D it follows that Aˆx = λBˆx with λ = xˆ T Aˆx. From (ii) λ ≥ 0(or λ > 0), which leads to a contradiction. The next theorem is the main result of [CHL70a]. Theorem 6.3.9 [Had83] Let A ∈ Mn (R) be copositive of order n − 1. Then the following statements are equivalent: (i) (ii) (iii) (iv)
A ∈ C. For every b > 0 there is an x > 0 such that Ax = λb, λ < 0. The matrix −A−1 exists and is nonnegative. det A < 0 and adj A ≥ 0.
Proof For the matrix B in Theorem 6.3.8 one can choose the identity or matrices B = bbT of rank 1 in which the vector b > 0. (i) ⇔ (ii): This follows from Theorem 6.3.8, by contraposition. (ii) ⇒ (iii): Assume that (ii) holds and that there is a y such that Ay = 0. Because Ax = 0 has a solution for b > 0, it follows that yT b = 0 for b > 0 so that y = 0. Hence −A−1 exists. For every b > 0 the vector x = A−1 b < 0. Thus, −A−1 ≥ 0, establishing (iii). (iii) ⇒ (ii): Assume that (iii) holds and that b > 0. Then let λ = −1 and x = −A−1 b so that Ax = A(−A−1 b) = −b = λb, λ < 0. (iv) ⇒ (iii): This is obvious. (iii) → (iv): Assume that (iii) holds. Choose B = I in Theorem 5.3.8. Then there is x > 0 such that Ax = λx, λ < 0. Assume that A has a second negative eigenvalue, Ay = μy, yT x = 0, y = 0, μ < 0. Therefore, the vector y has positive and negative components because x > 0, yT x = 0. Hence, there is α > 0 such that x + αy ≥ 0 and x + αy has at least one zero component. Because A is copositive of order n − 1, 0 ≤ (x + αy)T A(x + αy) = λxT x + μα 2 yT y < 0,
174
Copositive Matrices
a contradiction. Thus, A has exactly one negative eigenvalue, which implies det A < 0. (iv) then follows from (iii). For completeness, the corresponding result for strictly copositive matrices is included. The proof is similar to that of Theorem 6.3.9. Theorem 6.3.10 [Had83] Let A ∈ Mn (R) be strictly copositive of order n − 1. Then the following statements are equivalent: (i) (ii) (iii) (iv)
A ∈ SC. For every b > 0, there is an x > 0 such that Ax = λb, λ ≤ 0. The matrix −A−1 exists and is negative. det A ≤ 0 and adj A > 0.
For a symmetric matrix A ∈ Mn (R), Q(x) = xT Ax denotes its associated quadratic form, x denotes the Euclidean norm of x, and Rn+ denotes the set of all x in Rn such that x ≥ 0, x = 0. In the following a spectral criterion for strict copositivity is established. Theorem 6.3.11 [Kap00] Let A ∈ Mn (R) be a symmetric matrix. Then A is strictly copositive if and only if every principal submatrix B of A has no eigenvector v > 0 associated with an eigenvalue λ ≤ 0. Proof Let the principal submatrices of A have the property stated in the theorem, and let Q(x0 ) ≤ 0 for some x0 in Rn+ . Some, but not all, components of x0 may be 0. For proper numbering, we can assume that x0 = [a1 . . . am 0 . . . 0]T , where 1 ≤ m ≤ n, and ai > 0 for i = 1, . . . , m. The vector x0 cannot be the unique one at which Q has a nonpositive value; among all such vectors, we can choose one at which m has its least value, and we assume that our x0 has this property. We consider the case m > 1. Let y = [yk ] be an arbitrary vector of Rm and let Q0 (y) = Q(y1 , . . . , ym , 0, . . . , 0). Let yo = [a1 . . . am ]T . We consider the function Q0 (y) restricted to the set E = {y ∈ Rm : y = 1, y ≥ 0}. By multiplying y0 by a positive scalar, we can ensure that y0 is in E. The vector y0 must then be in the relative interior of E because a vector in the relative boundary of E would have less than m nonzero components. Thus, the function Q0 (y) on E has positive values on the relative boundary of E and has a nonpositive value at one point in the relative interior. Accordingly, the function must have a negative or zero absolute minimum at some vector v in the relative interior, so that v > 0. But the vector v would be an eigenvector of a principal submatrix of A with a negative or a zero associated
6.3 Characterizations of C, SC, and C+ Matrices
175
eigenvalue. By hypothesis, there can be no such vector. Accordingly, Q(x) > 0 in Rn+ . For the case m = 1, the principal submatrix [a11 ] would have the eigenvector v = (1) with a nonpositive eigenvalue, and the same conclusion follows. Conversely, let Q(x) > 0 in Rn+ , and let some principal submatrix B of A have an eigenvector v > 0 with a nonpositive associated eigenvalue. We can assume that B is obtained from A by deleting the rows and columns following the m-th and write v = [a1 . . . am ]T , x0 = [a1 . . . am 0 . . . 0]T . Then Q(x0 ) = vT Bv ≤ 0, contrary to hypothesis. Therefore, no principal submatrix of A can have an eigenvector v > 0 with a negative or a zero associated eigenvalue. This establishes the theorem. Note that the matrix M of Example 6.3.4 has eigenvector ⎡
⎤ 2√ v1 = ⎣2 + √6⎦ > 0 2+ 6 √ / SC. with associated eigenvalue λ1 = 2 − 6 ≤ 0. Hence, M ∈ For copositivity, an analogous proof yields Theorem 6.3.12 [Kap00] Let A ∈ Mn (R) be a symmetric matrix. Then A is copositive if and only if every principal submatrix B of A has no eigenvector v > 0 associated with an eigenvalue λ < 0. For completeness, we state Motzkin’s determinantal test for strict copositivity [CHL70b], as well as that of the corresponding test for copositivity due to Keller. Theorem 6.3.13 (Motzkin) [CHL70b] A symmetric matrix is strictly copositive if and only if each principal submatrix for which the cofactors of the last row are all positive has a positive determinant. (This includes the positivity of the diagonal entries.) Theorem 6.3.14 (Keller) [CHL70b] A symmetric matrix is copositive if and only if each principal submatrix for which the cofactors of the last row are all nonnegative has a nonnegative determinant. For example, the matrix M in Example 6.3.4 has cofactors 2, 2, 3 in the last row and det M = −4 < 0. Hence, M is not copositive.
176
Copositive Matrices
For n ≤ 3, explicit entry-wise inequalities are given [Had83] that provide necessary and sufficient conditions for a matrix to lie in C(SC). There are such formulas for n = 4, but the inequalities are much more complicated [PF93].
6.4 History Now we summarize a number of papers on copositivity. In [Bau66], it was shown that for n ≥ 2, if Q(x) = xT Qx in which x = [x1 . . . xn ]! ≥ 0 is an extreme copositive quadratic form, then for any i, i = 1, . . . , n, Q has a zero u = [u1 . . . un ]T ≥ 0 in which ui = 0. It was also shown that extreme forms in n variables can be used to construct extreme forms in m > n variables. In [Bau67] a class of extreme copositive forms in five variables was introduced and shown to provide extreme copositive forms in any number of variables m > n. In [Bas68] and [HP73], a number of results were established about relations with the class H of symmetric matrices in which every entry is ±1. In [HP73], these results were extended to the class E of symmetric matrices in which every entry is ±1 or 0, and each diagonal entry is 1. Specifically, those matrices in E , which are, respectively, copositive, copositive-plus, or positive semidefinite, are characterized. In [HN63] it was shown that the class Cof copositive forms and the class B of completely positive forms (i.e., real forms that can be written as a sum of squares of nonnegative real forms) are convex cones that are dual to each other and that PSD and sN are self-dual. Also, the extreme copositive forms of PSD, sN, B , and PSD + sN were determined, as well as some extreme forms of C. Horn’s form is an example of an extreme copositive form that does not belong to PSD + sN. In [Bau66] necessary and sufficient conditions for a quadratic form xT Ax with |aij | = 1 to be an extreme copositive form are obtained. Let S denote the set of symmetric matrices with 1s on the diagonal and off-diagonal entries 0 or ±1. In [HN63] the matrices in S that lie in C (resp. C+ , PSD) are characterized, as well as the copositive matrices in S that lie on extreme rays of C. Let S0 denote the set of symmetric matrices with 1s on the diagonal and with ±1 off-diagonal. Matrices in S0 that are (1) copositive, (2) extreme rays of the cone of copositive matrices, and (3) positive semidefinite are characterized in terms of certain graphs [HP73]. Copositivity has surfaced in a variety of areas of applied mathematics, which include game theory [Bas68, Gad58, Lan90], mathematical programming
[Ber81, CHL70a, CHL70b, Val86, Val88, Val89a, Val89b], the theory of inequalities [Dia62], block designs [HN63], optimization problems [Dan90], and control theory [HS10]. More recently, there has been renewed interest in copositivity due to applications in modeling [CKL92, Had83] and linear complementarity problems [Ber81, MND01]. In particular, copositive matrices are of interest in solving the following linear complementarity problem as it arises in game theory [CHL70a]: find a solution z to the system w = q + Mz, where M is an n-by-n copositive matrix, subject to w ≥ 0, z ≥ 0, zT w = 0. In [Val86], the basic theory of copositive matrices is reviewed and supplemented. Finite criteria for C, C+, and SC matrices are derived using principal pivoting and compared with existing determinantal tests for C+ matrices. Last, C+ matrices of order 3 are characterized. In [Val89a], a real n-by-n matrix is defined to be almost C (almost C+, almost SC) if it is not C (C+, SC), but all its principal submatrices of order n − 1 are. Using principal pivoting and quadratic programming, the above classes of matrices are studied, with necessary and sufficient conditions given for almost C+ matrices. In [Val89b], criteria based on quadratic programming are given for C, C+, and SC matrices, and these are compared with existing criteria. The copositivity of a symmetric matrix of order n is shown to be equivalent to the copositivity of two symmetric matrices of order n − 1, provided that the matrix has a row whose off-diagonal entries are nonpositive [PF93]. Based on this result, criteria are derived for copositive matrices of orders 3 and 4. As one might expect, one of the primary areas of interest has been detection, i.e., given a general (square) matrix, determining whether it lies in the cone of copositive matrices. This is an NP-hard problem [Bom96]. Papers on detection include [Dan90, CHL70b, Kap01, HS10, Bas68, Bom00, Bom08, Bom87, Bom96]. Additional algorithms for detection of copositive matrices were developed in [XY11, Bom00, Bom96]. In [ACE95] simplices and barycentric coordinates are used in determining criteria for copositive matrices. Special copositive matrices were introduced in [Gow89, MND01, MP98]. For the quadratic form xT Qx to be positive unless x = 0, subject to the linear inequality constraints Ax ≥ 0, it is sufficient that there exists a copositive matrix C such that Q − AT CA is positive definite [MJ81]. The main result of [MJ81] shows this condition is also necessary. Recent survey papers are [HS10] (on variational aspects of copositive matrices), [IS00] (for conditionally definite matrices), [Dur10] (for copositive programming), [BS-M03] (for completely copositive cones), and [Dic11] (for the geometry of the copositive cone and its dual cone – the cone of completely positive matrices). The interior of the completely positive cone
is characterized in [Dic10], using the set of zeros in the nonnegative orthant of the form. The (strictly) copositive matrix completion problem was solved in [HJR05].
6.5 Spectral Properties

In 1969, the earliest spectral result on copositive matrices was obtained [HH69]. Specifically, it was shown that a copositive matrix A has the Perron property, i.e., its spectral radius ρ(A) is an eigenvalue. In 2005, among other things, it was shown that a copositive matrix must have a positive vector in the subspace spanned by the eigenvectors corresponding to the nonnegative eigenvalues [JR05]. Moreover, in 2008, necessary conditions were given that ensure that the Perron root of a copositive matrix has a positive eigenvector [Bom08].

Theorem 6.5.1 [HH69] Let A be a copositive matrix. Then its largest eigenvalue r satisfies r ≥ |λ|, in which λ is any other eigenvalue of A; i.e., A has the Perron property.

Proof If every eigenvalue of A is nonnegative, the claim is immediate, so let λ < 0 and Ax = λx in which ||x|| = 1. Write x = y − z in which y ≥ 0, z ≥ 0, and yT z = 0, so that ||y + z|| = ||y − z|| = 1. Because r is the maximum of the quadratic form of A over unit vectors,

r ≥ (y + z)T A(y + z) = 2yT Ay + 2zT Az − (y − z)T A(y − z) ≥ −λ = |λ|,

where the last inequality uses yT Ay ≥ 0 and zT Az ≥ 0 (copositivity) and (y − z)T A(y − z) = xT Ax = λ.
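As a quick numerical illustration of Theorem 6.5.1 (the construction and names below are ours, purely for demonstration), a matrix of the form PSD plus entrywise nonnegative is copositive, and its largest eigenvalue should dominate the modulus of every other eigenvalue:

import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((5, 5))
P = B @ B.T                           # positive semidefinite
N = rng.random((5, 5))
N = (N + N.T) / 2                     # symmetric, entrywise nonnegative
A = P + N                             # copositive (PSD + nonnegative)

eigs = np.linalg.eigvalsh(A)          # eigenvalues in ascending order
print(eigs)
print(eigs[-1] >= np.abs(eigs).max() - 1e-12)  # Perron property: True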
6.6 Linear Preservers

Recall the opening paragraph of Section 3.6 for the notion and utility of studying the linear preservers of a matrix class. In this section, which is based on [FJZ19], we are interested in identifying linear transformations on symmetric matrices that preserve copositivity, either in the into or onto sense. Let Sn = Sn (R) be the n(n + 1)/2-dimensional subspace of Mn (R) consisting of symmetric matrices. In this section, we use a subscript to indicate the size of copositive matrices, if useful; that is, Cn and SCn denote C ∩ Sn and SC ∩ Sn , respectively. In considering the action of linear transformations on copositive matrices, it suffices to consider linear transformations L : Sn → Sn (and it can be convenient because there are fewer variables to consider). However, it may also be
convenient to consider L to be a linear map on Mn (R). We shall do so interchangeably. We say that such a linear transformation preserves copositivity if A ∈ C implies L(A) ∈ C, and similarly for strict copositivity. More precisely, such an L is an into copositivity preserver. If L(C) = C, we have an onto copositivity preserver. Our purpose here is to better understand both types of linear copositivity preservers. The into preservers of the PSD matrices are not fully understood, and characterizing them is recognized to be a difficult problem. The onto linear preservers of PSD are straightforwardly known to be the congruences by a fixed invertible matrix [Schn65].

Certain natural kinds of linear transformations are more amenable to preserver analysis. We say that L is a linear transformation of standard form on Mn (R) if there are fixed matrices R, S ∈ Mn (R) such that L(A) = RAS (or L(A) = RAT S). Such a linear transformation is invertible if and only if R and S are invertible matrices. More generally, L is a linear transformation on Mn (R) if L(A) = [lij (A)], in which lij is a linear functional in the entries of A. It is known that an invertible linear transformation that preserves rank is of standard form [Hndbk06, Pier92], and there are useful variations upon this sufficient condition.

Both the "onto" and especially the "into" copositivity linear preserver problems appear subtle. For example, in [Pok] a conjecture of N. Johnston is relayed: any (into) copositivity preserver is of the form

X → Σi AiT X Ai ,
in which Ai ≥ 0. Because any such map is (clearly) a copositivity preserver, this is a natural (though optimistic) conjecture. However, it is false even for 2-by-2 matrices. Suppose that a linear map on S2 is given by
[a, c; c, b] → [a, a + b + 2c; a + b + 2c, b].
If the argument is copositive, then a + b + 2c ≥ 0, because it is the value of the quadratic form of the argument at [1, 1]T, so that the image is nonnegative and, thus, copositive. But the conjectured form is a PSD preserver (by virtue of being a sum of congruences), and the fact that [10, −1; −1, 10] is PSD, while its image [10, 18; 18, 10] is not, shows that our linear map is not of the conjectured form, though it is a copositivity preserver.
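A short numerical check of this counterexample (the function name is ours):

import numpy as np

def L(A):
    # The map above: [a, c; c, b] -> [a, s; s, b], where s = a + b + 2c is the
    # quadratic form of the argument at (1, 1).
    a, c, b = A[0, 0], A[0, 1], A[1, 1]
    s = a + b + 2 * c
    return np.array([[a, s], [s, b]])

A = np.array([[10.0, -1.0], [-1.0, 10.0]])
print(np.linalg.eigvalsh(A))     # [ 9. 11.]: A is PSD
print(np.linalg.eigvalsh(L(A)))  # [-8. 28.]: L(A) is not PSD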
6.6.1 Linear Copositivity Preservers of Standard Form
Because our arguments are always in Sn , a linear transformation of standard form is of the form L(A) = RART , or L(A) = RAT RT , where R ∈ Mn (R). Here we characterize both the into and onto preservers of copositivity of standard form. Such a transformation is invertible on Sn if and only if R is invertible. First, a useful lemma about copositive matrices. We say that v ∈ Rn , n ≥ 2, is a vector of mixed sign if v has both positive and negative entries. Nothing is assumed about 0 entries.

Lemma 6.6.1 For each vector v ∈ Rn of mixed sign, there is a matrix A ∈ Cn such that vT Av < 0.

Proof By the permutation similarity invariance of Cn , we may assume that v = [v1 , v2 , . . . , vn ]T with v1 v2 < 0. Then, let A ∈ Cn be the matrix whose (1, 2) and (2, 1) entries are 1 and whose remaining entries are 0,
so that vT Av = 2v1 v2 < 0, as claimed.

Theorem 6.6.2 Suppose that L is an into linear preserver of Cn of standard form. Then, L(A) = ST AS, with S ∈ Mn (R) and S ≥ 0.

Proof Because L preserves symmetry and is of standard form, L(A) = ST AS with S ∈ Mn (R). Suppose that there is an x ≥ 0, x ∈ Rn , such that Sx is of mixed sign. Then, by Lemma 6.6.1, the argument A may be chosen so that 0 > (Sx)T A(Sx) = xT ST ASx, and ST AS = L(A) is not in Cn . Hence, L is not a copositivity preserver. Thus, for any x ≥ 0, Sx must be weakly uniformly signed. This means that S ≥ 0 or S ≤ 0. Because S appears twice in the congruence (replacing S by −S leaves L unchanged), we may take it to be the former.

Because Cn contains a basis of Sn , a linear map on Sn that is an onto copositivity preserver must be an invertible linear map, and the inverse map must also be an onto preserver. If the map is of standard form, the inverse map just corresponds to inverting S and ST (which, of course, must be invertible). Thus, we have that S−1 ≥ 0, as well as S ≥ 0 (or S−1 ≤ 0 and S ≤ 0). It is well known that this happens if and only if S is a monomial matrix, the product of
a permutation matrix and a positive diagonal matrix. Taken together, this gives the following characterization of linear transformations of standard form that map Cn onto itself.

Theorem 6.6.3 Suppose that L is an onto linear preserver of Cn of standard form. Then, L(A) = ST AS, in which S ∈ Mn (R) and S is monomial.

Because Cn forms a cone, we note that (i) any sum of into Cn preservers is again an into Cn preserver, though a sum of preservers of standard form may no longer be of standard form, and (ii) the sum of onto Cn preservers need no longer be onto. Also, it follows from the proven forms that both into and onto copositivity preservers of standard form are also (into and onto, respectively) PSD preservers.
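The following sketch gives rough numerical evidence for Theorem 6.6.3 on a copositive matrix mentioned in Section 6.4, the Horn matrix. The sampling routine and names are ours, and a nonnegative sampled minimum is only evidence of copositivity, not a certificate:

import numpy as np

rng = np.random.default_rng(1)

def min_form_on_nonneg(A, samples=20000):
    # Heuristic: minimum of x^T A x over random nonnegative unit vectors.
    X = rng.random((samples, A.shape[0]))
    X /= np.linalg.norm(X, axis=1, keepdims=True)
    return np.min(np.einsum('ij,jk,ik->i', X, A, X))

H = np.array([[ 1, -1,  1,  1, -1],   # the Horn matrix: copositive,
              [-1,  1, -1,  1,  1],   # but not PSD + nonnegative
              [ 1, -1,  1, -1,  1],
              [ 1,  1, -1,  1, -1],
              [-1,  1,  1, -1,  1]], dtype=float)

P = np.eye(5)[[2, 0, 4, 1, 3]]              # permutation matrix
S = P @ np.diag([1.0, 2.0, 0.5, 3.0, 1.5])  # a monomial matrix

print(min_form_on_nonneg(H) >= -1e-9)           # evidence that H is copositive
print(min_form_on_nonneg(S.T @ H @ S) >= -1e-9) # ...and so is S^T H S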
6.6.2 Hadamard Multiplier Cn Preservers

Recall that the Hadamard, or entry-wise, product of two matrices A = [aij ] and B = [bij ] of the same size is defined and denoted by A ◦ B = (aij bij ). If we consider a fixed n-by-n matrix H, then a natural type of linear transformation on n-by-n matrices A is given by

L(A) = H ◦ A.    (6.6.1)

We may ask for which H such transformations are (into) Cn preservers. Also recall that B is completely positive (CP) if B = FFT with F n-by-k and F ≥ 0. Thus, for F = [f1 , f2 , . . . , fk ] partitioned by columns, B = f1 f1T + f2 f2T + · · · + fk fkT . Then, the CP matrices, which also form a cone, are special PSD matrices. It is known that the n-by-n CP matrices are the cone-theoretic dual of Cn [HN63], as Tr BT A = Σi fiT Afi , which is nonnegative if A ∈ Cn . Now, consider a linear transformation of the form (6.6.1) with H a CP matrix of the form H = Σi hi hiT , hi ∈ Rn , hi ≥ 0, i = 1, . . . , k. If A ∈ Cn , then H ◦ A = Σi (hi hiT ) ◦ A and xT (H ◦ A) x = Σi (x ◦ hi )T A(x ◦ hi ), which is nonnegative for x ≥ 0.

Theorem 6.6.4 A linear transformation of the form (6.6.1) is an into copositive preserver if and only if H is CP.
Proof Sufficiency follows from the calculation above: the quadratic form of H ◦ A on a nonnegative vector is a sum of quadratic forms of A on nonnegative vectors. On the other hand, if H is not CP, then, because of the known duality, eT (H ◦ A) e = Tr HT A < 0 for some A ∈ Cn , and H ◦ A ∉ Cn .

If H is CP, H = Σi hi hiT , hi ≥ 0, let Di = diag(hi ). Then H ◦ A = Σi Di ADi , with Di ≥ 0, so that a linear transformation of the form (6.6.1) is also a sum of into transformations of standard form. With the exception of H being a rank-one, positive, symmetric matrix, such a transformation will not be onto.
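A small numerical sanity check of the two identities used above (all data below are randomly generated and purely illustrative):

import numpy as np

rng = np.random.default_rng(2)
n, k = 4, 3

hs = rng.random((k, n))                 # h_i >= 0
H = sum(np.outer(h, h) for h in hs)     # a CP matrix H = sum_i h_i h_i^T
A = rng.standard_normal((n, n))
A = (A + A.T) / 2                       # arbitrary symmetric A

# H o A = sum_i D_i A D_i with D_i = diag(h_i) ...
lhs = H * A
rhs = sum(np.diag(h) @ A @ np.diag(h) for h in hs)
print(np.allclose(lhs, rhs))            # True

# ... and x^T (H o A) x = sum_i (x o h_i)^T A (x o h_i).
x = rng.random(n)
print(np.isclose(x @ lhs @ x, sum((x * h) @ A @ (x * h) for h in hs)))  # True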
6.6.3 General Linear Maps on Sn

Let L(A) = [lij (A)] in which each entry lij is a linear functional in the entries of A. Symmetry requires that the functionals lij and lji be the same. It is possible to design such maps that are copositivity preservers (and, in a similar way, PSD preservers).

Let zij(k) ∈ Rn with zij(k) ≥ 0 and zij(k) = zji(k). If A ∈ Cn , then zij(k)T A zij(k) ≥ 0. Define

lij (A) = Σk zij(k)T A zij(k)

and L(A) = [lij (A)]. Then, for A ∈ Cn , L(A) ≥ 0 and L(A)T = L(A), so that L(A) ∈ Cn and L is an into Cn preserver. PSD (into) preservers may be designed in a similar way. For example, let L(A) = diag(l1 (A), . . . , ln (A)) with li (A) = zi* A zi , zi ∈ ℂn , so that if A is PSD, L(A) is a nonnegative diagonal matrix and, thus, PSD.

We note that, with this machinery, it is possible to design into, not onto, but invertible Cn preservers. Here is an example. For

A = [a, b; b, c],

let

l11 (A) = [1, 1] A [1, 1]T + [1, 0] A [1, 0]T ,
l12 (A) = l21 (A) = [1, 1] A [1, 1]T ,

and

l22 (A) = [1, 1] A [1, 1]T + [0, 1] A [0, 1]T .

Then,

L([a, b; b, c]) = [2a + 2b + c, a + 2b + c; a + 2b + c, a + 2b + 2c],
and, for A ∈ C2 , L(A) ≥ 0 and L(A) is PSD. So L is a C2 preserver. Because l12 ≥ 0, it is only into. However, L is invertible, as

L−1 ([x, z; z, y]) = [x − z, (3z − x − y)/2; (3z − x − y)/2, y − z].

We note that more elaborate maps may be designed, including the possibility of negative off-diagonal entries.

6.6.4 PSD Preservers That Are Not Copositivity Preservers

Of course, a PSD preserver need not be a copositivity preserver, and we note here a famous example of a PSD preserver that is not a copositivity preserver. The Choi map [Choi75] is a linear transformation from M3 (R) to M3 (R) that preserves PSD, but is not of any typical type. It is defined by

L([a11 , a12 , a13 ; a21 , a22 , a23 ; a31 , a32 , a33 ]) = [2a11 + 2a22 , −a12 , −a13 ; −a21 , 2a22 + 2a33 , −a23 ; −a31 , −a32 , 2a33 + 2a11 ].

It is known, and easily checked, that any 3-by-3 (symmetric) PSD matrix is transformed to another PSD matrix. Of course, L is not a fixed congruence, nor onto. However, note that a copositive matrix with 0 diagonal and positive off-diagonal entries is transformed to a matrix that is not copositive. In general, copositivity preservers need not be PSD preservers, and PSD preservers need not be copositivity preservers. However, we conjecture that onto copositivity preservers are also onto PSD preservers.

6.6.5 Onto Linear Copositivity Preservers

We first make a fundamental observation about onto preservers of Cn .

Theorem 6.6.5 Let L : Sn → Sn be a linear transformation that maps Cn onto Cn . Then L is invertible, and L−1 maps Cn onto Cn .
Proof Because the copositive matrices include the standard basis for the symmetric matrices, their span is all of Sn ; since L(Cn ) = Cn , the image of L contains a spanning set of Sn , which means that the map is invertible. Because L(Cn ) = Cn , application of L−1 to both sides of the equality yields the desired statement.

Now we may see that linear onto preservers of Cn also preserve several related sets. For B ∈ Cn , let

C(B) = {A ∈ Cn : ∃ α > 0 : B − αA ∈ Cn } and R = {B ∈ Cn : C(B) = Cn }.
Corollary 6.6.6 Let L : Sn → Sn be a linear transformation that maps Cn onto Cn . Then L also preserves (in the onto sense)

(a) the boundary copositive matrices;
(b) the interior of Cn ;
(c) SCn ; and
(d) R.
Proof Items (a) and (b) follow because the copositive matrices are the closure of the strictly copositive matrices, and an invertible linear transformation is continuous. Item (c) follows because SCn is the interior of Cn . Now we show (d). Let B ∈ R and A ∈ C. Then B − αA ∈ C for some α > 0. So, L(B − αA) ∈ C, and L(B) − αL(A) ∈ C. Because L is onto, L(A) runs over C when A runs over C. So L(B) ∈ R, and L(R) ⊆ R. Similarly, L−1 (R) ⊆ R, and applying L gives R ⊆ L(R). Thus, R ⊆ L(R) ⊆ R, implying that L(R) = R.

Because both a fixed permutation similarity (equivalently, congruence) and a fixed positive diagonal congruence are onto copositivity preservers, we have that any monomial congruence is an onto copositivity preserver.

Conjecture. The onto copositivity preservers are exactly the fixed monomial congruences.

In a moment, we prove this conjecture in the 2-by-2 case. However, a proof, without additional assumptions, appears subtle. We have already shown that the conjecture holds if (1) the map is of standard form (Section 6.6.1). However, each of the following alternative additional assumptions is sufficient:

(2) the map is an (onto) PSD preserver;
(3) the map is rank nonincreasing (in which case it is of standard form [Loe89, MM59]);
(4) each of the component maps lij is a function of only one entry of A;
(5) each of the component maps lij has nonnegative coefficients.

We now study the onto 2-by-2 copositivity preservers. A symmetric matrix

A = [a, b; b, c] ∈ S2

is copositive if and only if a ≥ 0, c ≥ 0, and b ≥ −√(ac). The matrix A is strictly copositive if all the inequalities are strict. From now on, we assume that L : S2 → S2 is a linear transformation that maps C2 onto C2 , and we use the following notation:

L([1, 0; 0, 0]) = [α11 , α12 ; α12 , α22 ] =: α,
L([0, 0; 0, 1]) = [γ11 , γ12 ; γ12 , γ22 ] =: γ,
L([0, 1; 1, 0]) = [β11 , β12 ; β12 , β22 ] =: β,

so that

L([a, b; b, c]) = aα + bβ + cγ .
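For experimenting with this setup, here is a hedged Python sketch (the names are ours) encoding the entry-wise 2-by-2 copositivity test and the expansion L(A) = aα + bβ + cγ :

import numpy as np

def is_copositive_2x2(A, tol=1e-12):
    # [a, b; b, c] is copositive iff a >= 0, c >= 0, and b >= -sqrt(a c).
    a, b, c = A[0, 0], A[0, 1], A[1, 1]
    return a >= -tol and c >= -tol and b >= -np.sqrt(max(a, 0.0) * max(c, 0.0)) - tol

def apply_L(alpha, beta, gamma, A):
    # L([a, b; b, c]) = a*alpha + b*beta + c*gamma, by linearity on the basis
    # E11, E12 + E21, E22 of S2.
    a, b, c = A[0, 0], A[0, 1], A[1, 1]
    return a * alpha + b * beta + c * gamma

# Example data: the images S^T Eij S under the monomial S = [[0, sqrt(2)], [sqrt(3), 0]].
alpha = np.array([[0.0, 0.0], [0.0, 2.0]])                    # L(E11)
gamma = np.array([[3.0, 0.0], [0.0, 0.0]])                    # L(E22)
beta  = np.array([[0.0, np.sqrt(6.0)], [np.sqrt(6.0), 0.0]])  # L(E12 + E21)

A = np.array([[1.0, -0.5], [-0.5, 1.0]])
print(is_copositive_2x2(A), is_copositive_2x2(apply_L(alpha, beta, gamma, A)))  # True True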
Lemma 6.6.7 We have α11 α22 = 0 and γ11 γ22 = 0.

Proof We show the first claim; the proof of the second is analogous. Because [1, 0; 0, 0] lies on the boundary of C2 , so does α, by Corollary 6.6.6. Hence either α11 α22 = 0 or

α = [α11 , −√(α11 α22 ); −√(α11 α22 ), α22 ]

with α11 > 0 and α22 > 0. In order to get a contradiction, suppose that α has this latter form.

• Suppose that β11 β22 ≠ 0. Then, by Corollary 6.6.6,

β = [β11 , −√(β11 β22 ); −√(β11 β22 ), β22 ].
Because α and β are linearly independent, we have α11 β22 − α22 β11 ≠ 0. Thus,

det(aα + bβ ) = ab (α11 β22 + β11 α22 − 2√(α11 α22 β11 β22 ))

is positive if ab > 0. Hence, for a > 0 and b > 0, aα + bβ has positive diagonal entries and positive determinant and, thus, is strictly copositive. Therefore, for a > 0, b > 0, and c < 0 sufficiently close to 0, A is not copositive, and L(A) is copositive.

• Suppose that β11 = 0. Then, β12 ≥ 0.
– If β12 > 0, let a > 0 and c < 0 be so that aα11 + cγ11 ≥ 0 and aα22 + cγ22 ≥ 0. Then, for b > 0 large enough, A is not copositive, and L(A) ≥ 0 is copositive.
– If β12 = 0, then β22 > 0. For a > 0, c < 0 sufficiently close to 0, and b > 0 large enough, A is not copositive, and L(A) is copositive.
• The proof is similar if β22 = 0.

Lemma 6.6.8 We have α11 = α12 = β11 = β22 = γ22 = γ12 = 0 or α22 = α12 = β11 = β22 = γ11 = γ12 = 0.

Proof By Lemma 6.6.7, α11 α22 = 0 and γ11 γ22 = 0.

• Suppose that α11 = 0. We will show that α12 = β11 = β22 = γ22 = γ12 = 0, which implies α22 , β12 , γ11 > 0.
– If β11 ≠ 0, then for b < 0 and a, c > 0 with b < −c γ11 /β11 and a > b²/c, A is copositive, and L(A) is not copositive, as its 1, 1 entry is negative. So β11 = 0 and, then, β12 ≥ 0.
– Suppose that α12 = 0 (and β11 = 0).
◦ If β22 ≠ 0, for b > 0, c = 0, and a < 0 sufficiently close to 0, A is not copositive, and L(A) is copositive. If γ22 ≠ 0, then γ11 = 0 and γ12 ≥ 0; thus, for c > 0, b = 0, and a < 0 sufficiently small, A is not copositive, and L(A) is copositive. Thus, β22 = γ22 = 0.
◦ If α11 = α12 = β22 = γ22 = β11 = 0 and γ12 ≠ 0, then γ12 > 0. For a = 0, c > 0 large, and b < 0 sufficiently small, A is not copositive, and L(A) is copositive. Thus, γ12 = 0.
– Suppose that α12 ≠ 0 (and β11 = 0). Then α12 > 0.
◦ If α22 ≠ 0 or β22 = 0, for a > 0, c = 0, and b < 0 sufficiently close to 0, A is not copositive and L(A) is copositive.
◦ If α22 = 0 and β22 ≠ 0, for b < 0 and a, c > 0 with b < −c γ22 /β22 and a > b²/c, A is copositive, and L(A) is not copositive, as its 2, 2 entry is negative.
• With similar arguments, we show that if α22 = 0, then α12 = β11 = β22 = γ11 = γ12 = 0.

Theorem 6.6.9 A linear transformation L : S2 → S2 maps C2 onto C2 if and only if

L([1, 0; 0, 0]) = [α11 , 0; 0, 0], L([0, 0; 0, 1]) = [0, 0; 0, γ22 ], L([0, 1; 1, 0]) = [0, β12 ; β12 , 0],   (6.6.2)

or

L([1, 0; 0, 0]) = [0, 0; 0, γ22 ], L([0, 0; 0, 1]) = [α11 , 0; 0, 0], L([0, 1; 1, 0]) = [0, β12 ; β12 , 0],   (6.6.3)

for some α11 > 0, γ22 > 0, and β12 = √(α11 γ22 ).
Proof The sufficiency is obvious. Now we show the necessity. From Lemma 6.6.8 it follows that either (6.6.2) or (6.6.3) holds for some α11 ≥ 0, γ22 ≥ 0, and β12 ≥ 0. Suppose that (6.6.2) holds; the proof is similar if (6.6.3) holds. Because L is a linear transformation that maps C2 onto C2 , it follows that α11 γ22 β12 ≠ 0. Thus, we just need to see that β12 = √(α11 γ22 ). Let a, c > 0 and b = −√(ac). Then [a, b; b, c] is copositive, and

L([a, b; b, c]) = [aα11 , −√(ac) β12 ; −√(ac) β12 , cγ22 ]

is copositive if and only if −√(ac) β12 ≥ −√(ac α11 γ22 ), which implies β12 ≤ √(α11 γ22 ).
Suppose that 0 < β12