A Survey of Matrix Theory and Matrix Inequalities [Revised ed.] 048667102X, 9780486671024

Written for advanced undergraduate students, this highly regarded book presents an enormous amount of information in a c

291 61 10MB

English Pages 208 [197] Year 2010

Recommend Papers

Matrix Inequalities [1 ed.] 3540437983, 9783540437987

The main purpose of this monograph is to report on recent developments in the field of matrix inequalities, with emphasi

406 34 998KB Read more

Dilations, Linear Matrix Inequalities, the Matrix Cube Problem and Beta Distributions [1 ed.] 9781470449476, 9781470434557

An operator $C$ on a Hilbert space $\mathcal H$ dilates to an operator $T$ on a Hilbert space $\mathcal K$ if there is a

133 62 1012KB Read more

Advances in Matrix Inequalities 3030760464, 9783030760465

This self-contained monograph unifies theorems, applications and problem solving techniques of matrix inequalities. In a

366 9 3MB Read more

A Matrix Knot Invariant

444 49 449KB Read more

Scalar, Vector, and Matrix Mathematics: Theory, Facts, and Formulas - Revised and Expanded Edition 9781400888252

The essential reference book on matrices—now fully updated and expanded, with new material on scalar and vector mathemat

113 36 7MB Read more

A First Course in Random Matrix Theory 1108488080, 9781108488082

The real world is perceived and broken down as data, models and algorithms in the eyes of physicists and engineers. Data

562 56 4MB Read more

INTRODUCTION TO MATRIX THEORY 9783030804817, 303080481X

324 114 2MB Read more

The Matrix in Theory 9042016396, 9789042016392

The Matrix trilogy continues to split opinions widely, polarising the downright dismissive and the wildly enthusiastic.

457 13 1MB Read more

Differential Linear Matrix Inequalities: In Sampled-Data Systems Filtering and Control [1st ed. 2023] 3031297539, 9783031297533

This book is entirely devoted to sampled-data control systems analysis and design from a new point of view, which has at

181 17 19MB Read more

Quantum Geometry, Matrix Theory, and Gravity 9781009440783, 9781009440776

115 19 8MB Read more

A Survey of Matrix Theory and Matrix Inequalities [Revised ed.]
048667102X, 9780486671024

Author / Uploaded
Marvin Marcus
Henryk Minc

0 0 0
Like this paper and download? You can publish your own PDF file online for free in a few minutes! Sign Up

File loading please wait...

Citation preview

A SURVEY OF MATRIX THEORY AND MATRIX INEQUALITIES

A SURVEY OF

MATRIX THEORY AND

MATRIX INEQUALITIES

MARVIN MARCUS HENRYK MINC University of California, Santa Barbara

ALLYN AND BACON, INC., BOSTON, 1964

THIS BOOK IS PART OF THE ALLYN AND BACON SERIES IN ADVANCED MATH• EMATICS, CONSULTING EDITOR, IRVING KAPLANSKY, UNIVERSITY OF CHICAGO.

Copyright @ 1964 by Allyn and Bacon, Inc., 150 Tremont Street, Boston. All rights reserved. No part of this book may be reproduced in any form, by mimeograph or any other means, without permission in writing from the publishen.

Library of Congress Catalog Card Number: 64-14269. Printed in the United States of America.

To Arlen and CatheriM

Preface

is a subject that has lately gained considerable popularity both as a suitable topic for undergraduate students and as an interesting area for mathematical research. Apart from aesthetic reasons this is a result of the widespread use of matrices in both pure mathematics and the physical and social sciences. In this book we have tried within relatively few pages to provide a survey of a substantial part of the field. The first chapter starts with the assumption that the reader has never seen a matrix before, and then leads within a short space to many ideas which are of current research interest. In order to accomplish this within limited space, certain proofs that can be found in any of a very considerable list of books have been left out. However, l't"e have included many of the proofs of classical theorems as well as a substantial number of proofs of results that arc in the current research literature. We have found that a subset of the material covered in Chapter I is ample for a one-year course at the third- or fourth-year undergraduate level. Of course, the instructor must provide some proofs but this is in accordance with our own personal preference not to regurgitate 11definition-theorem-proof" from the printed page. There are many ideas covered in Chapter I that are not standard but reflect our own prejudices: e.g., Kronecker products, compound and induced matrices, quadratic relations, permanents, incidence matrices, (v, k, ~) configurations, generalizations of commutativity, property L. The second chapter begins with a survey of elementary properties of convex sets and polyhedrons and goes through a proof of the Dirkhoff theorem on doubly stochastic matrices. .From this we go on to the properties of convex functions and a list of classical inequalities. This material is MoDERN MATRIX THEORY

vii

Dill

Preface

then put together to yield many of the interesting matrix inequalities of Weyl, Fan, Kantorovich, and others. The treatment is along the lines developed by these authors and their successors and many of the proofs are included. This chapter contains an account of the classical PerronFrobenius-'Wielandt theory of indecomposable nonnegative matrices and ends with some very recent results on stochastic matrices. The third chapter is concerned with a variety of results on the localization of the characteristic roots of a matrix in terms of simple functions of its entries or of entries of a related matrix. The presentation is essentially in historical order, and out of the vast number of results in this field we have tried to cull those that seemed to be interesting or to have some hope of being useful. In general it is hoped that this book will prove valuable in some respect to anyone who has a question about matrices or matrix inequalities.

MARviN J\.IARcus HBNRYK MINe

Contents

I. SURVEY OF MATRIX THEORY 1. INTRODUCTORY CONCEPtS M atrica and t.'C!Iclor•. M tdriz operotimY. lrwer8e. .Vatriz and rH!Ciur operatUm.. Ezampla. Trompose. Direct 1um and block mulliplication.

Ezamp/M.

Kronecker produd.

E:zomple.

I. NUMBERS ASSOCIATED WITH MATRICES NotatU1n. Submatrit-a. PermutaU.ona. DetnmiMnb. Tile quadratic rekltiom among 1ubdetenniMnt&. E.zampla. Compound matrias. Symmetric fundiona; trace. Permanents. EZJample. Prop~rtit• of permanent~. lndttad matricu. Cha~ teristic polyrunnial. Ezampla. Cltarac~riltic roota. Ezampla. RtJnk. Linear comlriTUJtioM. Ezampk. Linear dtpe711k?'IU; dimemU1n. E:zomple.

3. UNEAR EQUAnONS AND CANONICAL FORMS lntrodudilm and notation. E~tn.ry operation8. Example. Elementary nw.trica. Example. Hermite normaL form. Example. Uu of the Hermite normalfrmn in •ob1ing A.z =b. Exampk. Elementary column operationa and matrica.

Ezamplu.

Char-

aderiatic v«tort. Ezamplu. ConvenliorUI for polynomial arul irr tegrol matricu. Det.trminantal divUior•. Exarn.ple•. Equivalence. Ezamp/,1. Invariant foetor•. Elementary diuiaor•. Ez:,. amplu. Smi&h nmmal form. Ezampk. Si"'ilaritg. Exampia. Elememary di~Mor• and nmilarily. Example. Minimal polynomial. Companion malriz. Exmnpla. lrreducibilitu. Similoritylo a diagonal rMtriz. Examp/M. ix

Contents 59

4. SPECIAL CLASSES OF MATRICES, COMMUTATIVITY Bilinear fundional. Examplu. Inner product. Exampk. Orthogonality. Example. N&rmal matrices. Examplu. Circulant. Unitary aima1arily. Example. Poaitive defimu matrica. Exampk. F14ndiom of normal matricu. Examplu. Exponential of a matrix. Funcli07U of an arbitrary matrix. Example. IUpreaentation of a matrix aa a function of otMT' matricu. Ezamplea. Simultaneom reduction of commuting matricu. Commutatiuity. Example. Quaai-rommutativity. Examp~. Properly L. Examp~. Mu~~ ruulll on commutativity.

5. CONGRUENCE

81

Triple diagonal form. Congruence and~ Example. Relatiomhip t.o quadratic f01'J'n8. Exa~ ple. Congruence propertiu. Hermitian congruence. Example. Triangular product repreaent4tion. Examp~. Conjunctive reduction of akew-hermitian matrica. Conjunctiw reduction of two Mrmitian matriaa. Definitiom.

operation~~.

II. CONVEXITY AND MATRICES 1. CONVEX SETS

93

Definition&. Exam~. Intersection property. Examp~. Corwez polyludrona. Example. Birkhoff tlwn-em. Simplex. Exam~. Dimemion. Example. Lin«quenct>s m

(II + kk -

1) . 8t>quences .m 1.t . It IS

u~ful st't

to han' 80tnt' notation for two otht'r SE'quence &'ts. The first is the n .... of nil St'qUt'llCt'S of J..· distinct. int{'J.tt'l"S cho~n from 1' ... ' n. Thus D.... is ohtaint•d from Qt ... by a~~ciating with r-ach &'quence a E Q~: ... the k! sr-qut'nrt'S obtaint'd by l't'onlt'ring the intt>gt>rs in all possiblt> ways. Ht>nre D ....

has(~) k! =

11! 1 (11-

k)! SE'qut'nct's in it. The other set

isS~: ...

which is simply tht> tn it is sometimes rom·l'nil'nt. to dt'sigm\t{' tht> product. a;,a,-.· · ·a;, by a... In this notation it is l'asy t.o writ.

(i ~ p), and

S1e. 2

Numbns Associated with Matrie1s

13

It

BcP> = Acp)

+ i•l,J,., I; cJAw,

(c; E F),

then d(B) = d(A). Similarly, if BW = A W, (i ¢ p), and

then d(B) = d(A). In other words, adding multiples of other rows (columns) to a fixed row (column) does not alter the determinant of a matrix. i

!.4.7 If A . (r = 1, · ··,n;r ~ p),thend(A)

• =E

i•l

c;d(Z;). Thislatterformulatogether

with 2.4.2 says that the function d(A) is an alternatifl{l multilinear function of the rows.

!.4.8 (Expension by 1 row or column)

!.4.9 If A E M.(F) and B = (b,1) where bi; = (-l)i+ld(A(jlt)), (i,j = 1, · · ·, n), then BA = AB = d(A)l •. The matrix B is called the adjoint of A, written adj (A). Some authors call B the adjugate of A.

!.4.1 0 Let v, k, " be positive integers satisfying 0 < " < k < v. A matrix A E AI.(1) with zeros and ones as entries satisfying AA.,. = (k - A) I, + >J where J E Jf.(l) has every entry 1 is called a (v,k,A)-matrix. The entries of A are 0 and 1 and hence (AA") .. =

•

E af, is just the sum of the entries ,_,

in row i. Hence AJ = kJ, AA'~' =

A (AT -

~ J) A -l

(k- A)I.

= (k -

=k

A + kAJ,

>.)I.,

-1 " ( A f'

-

k" J ) = k(k 1-

>.) (kAT - >J).

It follows that the adjoint of A is the matrix d(A) . " B = k(k - A) (kA - >J).

We shall also derive an interesting relationship connecting the integers .,, k, and >.. We have

14 A2'A

Survey of Mdtrix Theory

Ch. I

= A-1(AA")A = A- 1((k-

+ 'AJ)A = (k- 'A)I. + u-•JA. (k - 'A) I. + 'Ak- A. Thus J A 2'A =

'A)I.

1./ From AJ = kJ we have ATA = (k - 'A)J 'Ak- 1vJA (note that J 2 = vJ). Now (AJ)T = kJ so kJA = (k- 'A)J 'Air 1vJA. Thus JA = rJ where c = (k - 'A)/(k - 'Ak-•v). Then JAJ = rJ 2 = cvJ, kJ 2 = cvJ, k = c and finally we have 'A = k(k - 1)/(v - 1).

+ +

2.4.11

(l.eplace expansion theorem)

Let 1

< r < n,

let a be a fixed

sequence in Q,,,., and for any {j = (p., · · ·, Pr) E Q,,,. let s(p) d(A) = ( -l)•

, = 1•1 I: p,. Then

>. (-l)• 8 or n > a+ t = n + 1. Thus every n

product

n b,. is zero and d(B) = Oo But d(A)

i•l

and hence d(A)

= ±d(B), by 2.4.2,

= Oo

!.7 Compound matrices If A E JJ.... (F) and 1

< r < min (m,n),

or rth adjugate of A is the (

then the rth compound matrix

~) X (;)matrix whose entries are d(A [al~]),

a E Qr ... , Jj E Qr ... arranged lexicographically in a and ~0 This matrix will be designated by Cr(A)o Another notation for it is IAIo For example, if A = (a,;) E lU.. (F) and r = 2, then d(A [1, 211, 2]) ( Ct(A) = d(A [1, 311, 2]) d(A [2, 311, 2])

i.7o1

d(A [1, 211, 3]) d(A [1, 311, 3]) d(A [2, 311, 3])

d(A [1, 212, 3])) d(A [1, 312, 3]) o d(A [2, 312, 3])

If A E .M..... (F) and B E J.lf .. ,.(F) and 1 < r

P•,i is thus

.'t§. ·-·

just the probability that the particles distribute themselves one at each point in &ome order.

2.11

Propertl• of p•••nents

Some properties of p(A) for A E M .(F) follow. 2.11.1

..

If D and L are diagonal matrices and P and Qare permutation ma-

trices (all in M.(F)) 1 then p(PAQ) = p(A), p(DAL) =

(.rr, D,iL..,) p(A).

2.11.2 p(A") = p(A), p(A*) == ii{Af, p(cA) = c"P(A) for any scalar c. 1.11.3 p(P) == 1 for any permutation matrix P E "V.(F), p(O"·") 2.11.• If for some p, A

..

=I: C~iJ j•l

where

ZJ

= 0.

E Mt .• (F), c; E F,

(j == 1, .. ·, n), and Z1 E M .(F) is the matrix whose pth row isz1 and whose rth row is A. j•ia

Survey of Matrix Theory (1)

a,11>

yields the term x-•. The rest of 8 can be completed by using a diagonal in the rows and columns not already used; that is, using a diagonal from (>.1,.- A)(i~, · · ·, i,._,li~, · · ·, i,._,),

corresponding to a permutation rp of the set complementary to i~, · · · , i,._,, But since t1 holds i., · · ·, i,._, fixed, it is clear that sgn t1 = sgn rp. Moreover, we want the coefficient of >.•-•, so we can only choose constant terms {not involving >.) in

(>.I. - A) (it, · · ·, i,._,lit, · · ·, i,._,) to complete &. Thus the coefficient of >.•-• that arises from the product 2.14(1) is just d( -A(i11

.. ·,

i,._,lit, .. ·, i,._,))

= (-l)'d(A{i.,

· .. , i,._,lit, · .. , i,._,)),

Each choice of integers {j E Q,._,,,. thereby yields a term >.•-•( -l)'d(A [ala]) where a E Q, .• is the set complementary to {j in {1, · · ·, n}. Thus the complete coefficient of ).,._, is

I:

( -1)'

aEQ•.•

d(A [ala])

= {-l)'E,(A).

2.14.2 I£ A is a 2-square matrix over C, then Ea(A) = tr (A), E:~(A) = d(A) and the characteristic polynomial is p(>.) = >.1 - tr (A)>. d(A).

+

2.15 Charederistic roots The n roots of the characteristic polynomial p(>.) = d(>.l. - A), each counted with its proper multiplicity, are called the characteristic roota of A. The set of characteristic roots of A will be denoted by >.(A) ; and if all the characteristic roots are real, we will choose the notation so that

Xt(A) > >.:~(A) > · · · > >.,.(A). The reason for the notation E,(A) is now clear: for, in general, p(>.)

= d(>.l,.

ft

- A)

= n

j•l

(>. - >.1(A))

ll

=

I:

1•0

(1)

(2)

>.•-•( -l)'E,(X1(A), ···,>.,.(A)).

Thus comparing 2.15(2) and 2.13(1) we see that E,(A) = E,().t(A), · · ·, ).,.(A)).

(3)

In other words, the sum of all the t-square principalsubdeterminants of A is just the tth elementary symmetric function of the characteristic roots of A. Some important consequences of this are

Sec. 2

Numbers Associated with Matrices tr (A)

" Ai(A); = l:" a" = l: ·-·

d(A)

23

= E"(Aa(A),

i•l

· · ·, A.(A)) =

n"

·-·

(4) Ai(A).

(5)

The numbers A.(A), · · ·, A.(A) go under several names: eigenvalues, secular values, latent root&, proper oo.lues. Some elementary properties of characteristic roots are listed.

1.1.5.1

~) E M.(C)

If A = ( : foP

where BE Mp(C) and DE M.,(C),

then the n characteristic roots of A are the p characteristic roots of B taken together with the q characteristic roots of D: A(A)

=

{A.(B), · · ·, Ap(B), A1 (D), · .. , A,(D)}.

au 0

!.15.! If

Gta

tJa

• ..

Cit.

A=

E M.(C) 0

.. .

0

a..

in which the entries below the main diagonal are 0, then the characteristic roots of A are au, · · · , the main diagonal entries of A. 2.15.3 If a E C and A E Jf,.(C), then al. +A has characteristic roots a + Ai(A), (i = 1, · · ·, n), and aA has characteristic roots aAi(A), (i = 1, · · ·, n). Thus we can write suggestively (if not properly) that A(al. + bA) = a+ bA(A) whenever a and bare in C. 2.15.4 If S E }.{.(C) and S is nonsingular, then A(S-•AS) = A(A) for any A E . lf.(C). . !.15.5 If A E .tlf.(C), then the characteristic roots of AP are Xf(A), (i = 1, · · ·, n), for any positive integer p.

a..,

2.15.6 If A E . .V .. (C), then A is nonsingular if and only if A,(A) (i - 1, · · ·, n). This follows immediately from 2.4.5 and 2.15(5).

~ 0,

2.15.7 If A E Af.(C), A is nonsingular, and p F- 0 is an integer, then the characteristic roots of AP are Af(A), (i = 1, · · ·, n). 2.15.8 If A E M.(C), ao, ···,a. arc numbers in C, and B = a.l,. a 1A + . · · + a..A "', then the characteristic roots of B are

ae + a1Ai(A)

+ t.~tAf{A) + ·· · + a.M"(A),

equation

(i - 1, · · ·, n).

numbers in C, and aoi.. + aaA + = o•.•, then any characteristic root r of A must satisfy th•!

2.15.9 If A E M .(C),

... + a.A•

+

ae, · · ·,a. are

24

Ch. I

Survey of Matrix Theory

2.15.10 If A,. E .M".(C), (p = 1, · · ·, m), and n 11 are positive integers,

then X(

f· A

11 )

p-1

= p-1 UX(A,).

In other words, the characteristic roots of

a direct sum of matrices is just the set of all of the characteristic roots of the individual summands taken together. 2.15.11 If A E ~f ,(C) and B E M ,(C), then the characteristic roots of the Kronecker product A ® B E M ,,(C) are the pq numbers >..,(A)XJ(B), (i ... 1' ... , p; j = 1' ... 'q).

< k < n,

2.15.12

If A E .M.(C) and 1

C..,(A} are

the(~) products N,(A) 1

then the characteristic roots of

· · · >.u(A} where

< ia < it < · · · < i~: < n.

In the notation of 2.1,

~(C~:(A}) = ~(A)

(ia, · · ·, i~:} E Q.., .• ; thus

tr (C~:(A)) = E~:(Xt(A), ···,>-.(A)) =

i

= 1•1 n N,(A)

for w

=

.J:. . Xt,(A).

2.15.13 If A E ~f.(C}, then the characteristic roots of A • are the complex conjugates of the characteristic roots of A. Thus >.,(A*) = 'Mil), (i = 1, .. ·, n); >.(AT) = >.(A), >.,(A} = >::(A), (i = 1, .. ·, n). 2.15.14 If A E M .. (C) and 1 P.,(A) are the ( n

+~ -

< k < n,

then the characteristic roots of

1) products Xia(A) · · · N.(A) where

1 < it In the notation of 2.1,

< is < · · · < i~: < n.

~(P~:(A)) - Xt,(A) -

for w = (it,···, i~:) E G~c,,.; thus

•

D >.,,(A)

·-·

tr (P~:(A)) - ~ ... ~(A). This latter function is called the completely trymmetric function of the numbers X1(A), · · ·, >-.(A} and is usually designated by h~c(X 1 (A), ···,>.,.(A)). 2.15.15 If A E .M,.,,.(C), BE .M " ... (C), (n > m), and ~(X) - d(>-1.. - AB}, B(X) = d(Xl n - BA), then B(X) = X"-.p(X). This implies that the nonzero characteristic roots of A B and BA are the same. 2.15.16 If A, E Jlf.(C), (t- I, .. ·, k), and A~ 1 = A.;A,, (i,j = 1, · · ·, k), and p(X~, >..,, · · ·, X~c) is a polynomial with complex coefficients in the indeterminates >-t, · · ·, x..,, then the characteristic roots of B = p(A 1, A,, ···,A~:} are the numbers fJJ = p(alJ, a2h • • ·, a~:1), (j .. 1, · · ·, n),

Stc. 2

Numbers Assodated with Matrices

25

where au, (j -= 1, · · ·, n), are the characteristic roots of A, in some fixed order. For example, if A1 and Aa commute, then the characteristic roots of A tAt are auau, a12aa, · · ·, a1,.a2n where au, · · ·, a1,. and aat, · · ·, at. are the characteristic roots of A 1 and A a in some order.

2.15.17 A somewhat more general result along these lines is the Frobenius theorem: I£ A and B are n-square matrices over C and both A and B commute with the commutator AB - BA, and if /('At, 'At) is any polynomial in the indeterminates 'At, 'A2 ·with complex coefficients, then there is an ordering of the characteristic roots of A and B, (3,, (i - 1, · · ·, n), such that the characteristic roots of f(A, B) are !(a,, f3i), (i = 1, · · ·, n).

a,,

2.15.18 If A E 1lf.(C) and A is nilpotent (i.e., A"= o..... for some positive integer r), then >.,(A) - 0, (i =- 1, · · ·, n). This is a consequence of 2.15.5. 2.16 Examples 2.16.1 Let A be the (v,k,>.)-matrix in 2.4.10. According to 2.15.3 the characteristic roots of AA 7 are of the form (k - 'A) + >.>..,(J). The characteristic polynomial of J is easily computed to be x•·- 1(x - t•) and hence the characteristic roots of AA 7 are k 'A(v - 1) with multiplicity 1 and k - 'A with multiplicity v - 1. According to 2.15{5) and 2.4.1 d2 (A) = (k - >.)-l(k 'A(v - 1)). But it was also proved in 2.4.10 that

+

+

'A = k(k - 1)/(v - 1)

and hence tfi(A) = (k - >.)- 1k2• Thus the formula for the adjoint of A given in 2.4.10 simplifies to B == ±(k - 'A)c.-a>1 2(kAT- >J).

2.16.2 Let P E .iV.(C) be the permutation matrix

P

0 1 0 0 0 1

0 0 0 0

0 0 0 1 0 0

0 0

= (Pu)-

letS E .Jf.(C) be the matrix defined by S.,t = n-tJtgt, (k = 1, · · ·, n; p = 1, ... , n), where 8 == e'aw',. = cos (2r/n) i sin (211/n). \Ve show that 8 is nonsingular, that = and that

+

s-• s•'

s- PS = diag (8"1

1,

8"-2,

Thus, according to 2.15.2 and 2.15.4, 1, 8, 81, roots of P. Now

• • ·, • • ·,

8, 1). 8"- 1 are the characteristic

Ch. I

= n-•

.. "

,l;

1 if k

Survey of Matrix Thm7

s _E.

•-1

p(A,).

l:" A,, then

i•l

2.17.8 (Sylvester's law) If A E M.... (F) and BE M•.,(F), then p(A) + p(B) - n < p(AB) m) is a vector space of dimension (n - l)m + 1. Then there exists a matrix BE Af..... (C) such that p(B) - 1 and (bn, · • ·, ba.. , · · ·, b.1, • · ·, b....) E U. For, let V be the set of all mn-tuples which have 0 coordinates after the first n coordinates: (en, · · ·, C111, 0, · · ·, 0). Now the dimension of Vis clearly n so that

dim (U

n

V) =dim U +dim V- dim (U + V) = (n - 1)m + 1 + n - dim (U + V)

>

(n - 1)m

= n- m

+1+n -

+ 1 > 1.

nm

Thus there must be a nonzero matrix BE M •.•(C) which is zero outside the first row. Hence p(B) = 1.

3 Linear Equations end Canonical Forms 3.1

lntrocluctlon end not.tlon

A system of linear equations over F, II

I: 4iJX; = b,, i-1

(i = 1, · · ·, m),

(1)

in which a,; E F and b, E F, (i = 1, · · ·, m; j = 1, · · ·, n), and x;, (j = 1, · · ·, n), are the unknowns, can be abbreviated to

Ax= b

(2)

in which A= (aii) E M .... (F), xis then-tuple (x1, ... ,z..), and b is the

See. 3

Lirwar &,ustions antl Canonical Forms

31

m-tuple (bt, · · ·, b,..). The problem is: given the matrix A and them-tuple b find all n-tuples x for which 3.1(2) is true. From 1.4(1) we can express 3.1(2) as an equation involving the columns of A: (3)

The matrix in M.....+1 (F) whose first n columns are the column.'l of A in order and whose (n + l)st column is them-tuple b is called the augmented matrix of the system 3.1(1) and is denoted by [A :b]. ThP ~yst.C'm 3.1(1) is said to be homoge-neotts whenever b = (),...In case h ¢. 0,., the system 3.1(1) is said to be nonhomogeneous. If 3.1(2) is nonhomogeneous, then Ax = 0. is called the associated homogeneous system. If the system is homogeneous, then x = o.. is called the trivial solution to 3.1(2). The essential facts about linear equations follow. The system of equations 3.1(2) has a solution, if and only if

3.1.1

bE (Am, ···,A~w """ Ec•>.; Eii>'+cw = E ., E, E~ea> are the three types of elementary matrices. Any elementary column operation can be achie\•ed by a corresponding post-multiplication by an elementary matrix analogously to 3.4.

Suppose A E .tV.... (F) where F is R or C and p(A) = k. Then there exists a sequence of elementary row and column operations that reduce A to a matrix D E .V.... (F) 1 0 0 0 1 0

3.10.1

D-

0 .

0

1

0

...

0

.................... 0

in which the only nonzero entries in D are l's in positions (j,j), (j = 1, ... 'k). This follows instantly from the Hermite form of A given in 3.6. Simply eliminate the nonzero elements outside of columns n1, · · ·, nt by type II colwnn operations on B using the columns numbered n., · · · , n~r. Follow this by type I column operations to put the l's in the upper left block.

Ch.I

38

3.11

S111vey •I Matrix Theory

Exempla

3.11.1 The matrix Bin 3.9 is reduced to diagonal form by the succession of column operations: nm-a(e), n.(k - X).) 1, i > 1) (en may be a unit} (i - 2, · · ·, n), (i = 2, · · ·, m).

Now the whole procedure can be repeated with the matrix C in which the elementary operations are performed on rows 2, · · ·, m and columns 2, · · ·, n: the result will be a matrix D which is 0 in rows &lld columns 1 and 2 except for the (1, 1) and (2,2) entries. Since cu was a divisor of c,;, (i > 1, i > 1), it follows that cu is a divisor of every element of D (this property isn't lost by elementary operations). Thus D looks like 0 d

Cu

0

D-

0

0 (L E M....a.-•(K)),

L

0

0

and c11 ld, cnll•h dll,;, (i -= 1, · · ·, m - 2; i - 1, · · ·, n - 2). Clearly this process can be continued until we obtain a matrix H E M.... (K) in which ht~lh'+l.•+t, h'+•.'+• J1l! 0, (i - 1, · · ·, p - 1), hii = 0 otherwise. Moreover, His equivalent to A. By 3.17.3 and 3.17.5, p(H) = p(A) and Hand A have the same determinantal divisors. Now p(H) = p - p(A) - r. Let 1 < k < r and observe that the only nonzero k-square subdeterminants ~

of Hare of the form II

·-1

k

At,... The gcd of all such products is

_II "'" [recall

·-·

A..lh•+•·•+l• (i = 1, · · · 1 r - 1)]. By definition of the invariant factors in 3.19, k

..

n h.. = in• l qi, i•l

(i = 1, · · · 1 r).

SuriM.J of M

Ch. I

46

am ThtMy

Thus

Au - 9h Auhts = 9t911 80

ka = q., · · . , ""

Au • · • "" .. 9• · · • 9r;

• • •,

= q,.. Hence H = B and the argument is complete.

3.!3 Exemple This is a continuation of the computation of the determinantal divisors of the (v,k,~)-matrix considered in 3.18. I..et qa, · · ·, q., be the invariant factors of A 80 that/.t = 9• • • • 9tr, (k = 1, · · ·, v), and 9•19•+~• (i = 1, · · ·, v- 1). Now .h._ k(k _ ~)C-1)11 q., = /-t = (k- ~)C-'>11 - k(k- ~). Moreover, q,lk(k - "), (i - 1, · · ·, v - 1), and q1 · · · 9-• = /-1 = (k- ~)C-')1 1 • It follows that each q,, (i = 1, · · ·, v- 1), is a power of (k - ~)and moreover must divide k(k - ~) [in which gcd (k, k - ") = 1]. Thus 9• .. 1 or q, = k - ~. (i = 1, · · ·, v - 1). This fact together with 9• · · · 9•-• = (k - ~).• + R'. Next scan the entries of R' for those of highest degree m- 1 (the coefficients may all be 0) and write

R

=

R_>..•

+ R._.x...- 1 + R".

Continue this until there are no terms left:

R = R.x• + R.-.x•-•

+ · · · + Rt>.. + Ro.

For example, if

- ().'-'A +2

R-

).2) 1 ,

then

R =

(~ ~) >..• + (~ ~) x• + ( _~ ~)'A+(~ ~).

The matrix S in 3.24(2) is given by S = RJJ- + R-tB- 1 + · · · + RtB + Ro. A proof of 3.24.1 can be made now along the following lines: From ('Al.- A) = P(>J.- B)Q we write p-•(>J.- A) ... (AI.- B)Q

= Q>..- BQ.

(3)

If we replace the indeterminate >..on the right by the matrix A (exactly as wu done above in determining 8) in both sides of 3.24(3), we obtain W A - BW ... 0. Here W is the matrix obtained by replacing 'A on the right by A in Q. Let R be the inverse of Q, as before. In other words, R.'A-Q R_1>..-- 1Q R 1>.Q RoQ = I.. On the other hand, 'A'Q = Q'A', (t = 0, · · ·, m), and hence R..Q>..m + R-tQ'A- 1 + · · · + R 1Q'A + RoQ = I •. If we replace 'A by A on the right in the last expression, we obtain

+

+ ·· · +

+

"' R,WA' :E

e-o

(4)

= 1,..

From W A - BW we obtain in succession W A' and hence 3.24(4) becomes

= B'W,

(t - 0, · · ·, m),

Survey of M alrix Theory

Ch. I

48

f R,B•) W = 1,.. (•-o On the other hand,

s = L"' RJJ' and thus sw = 1,., w = s-•, and finally 1•0

A

=

JV- 1Bll'

= SBS- 1•

Conversely, if there exists some nonsingular matrix T E M ,.(F) for which A - TBT- 1, then >J,. -A .. }../. - TBT-• = T(M,.- B)T- 1• Now d(T) is a nonzero element of F and, by 3.17.2, T is a unit matrix in F[}..]. Hence}../,. -A and}../,. - Bare equivalent. 3.!4.3 We sum up the above. Two matrices A and B in llf ,.(F) are similar over F, if and only if their characteristic matrices }..[,. - A and }../,. - B are equivalent over F[}..]; i.e., if and only if their characteristic matrices have the same invariant factors (see 3.20.1). The invariant factors of the characteristic matrix of A called the similarity irwariant8 of A.

are

3.25 3.!5.1

Examples The matrices A

= (!

!) and B = (~

real numbers. For, M, - A

}.. -=.\).

= (}.. ..=-/

}..[,- B =

(~

}..

~)are similar over the

02); A are

the determinantal divisors of >.I., ft - gcd (}.. - 1, 1) - 1, f., = }..(}.. - 2). The determinantal divisors of }../t - B are clearly J: gcd (}..,}.. - 2) = 1, !~ - }..(}.. - 2). Hence ft = J:, ft = !2 and, by 3.24.3, A and B are similar over the real numbers. In fact, using 3.24.2, we can find the matrix S such that A ... SBS- 1• For, the matrices P and Q in 3.24(1) may be taken to be

p ==

(-! -~: + ~! !).

Q=i ( Then

-}.. 2

+13}.. -

1

-}..2

+13}.. -

3)

.

Sec. 3

Liuar Equations and Canonical Forms

49

and hence

-!) (~ ~r + (~ -3)3 (00 0)2 + ( -11 (-! !)·

s- (~ =-

Of course, S can be l'.asily computed by other methods.

3.25.2 Let

A-

0 1 0 0 0 0 0 0 0 0 -1 2

0 0 1 0

0 0 0 0 0 0

0 1 0 0 0 0 -3 4

1 0 0 1 -3 2

The elementary divisors of 'Ale- A, in which A is regarded as a matrix over R, are ('A - 1) 2, ('A 2 + 1) 2• If A is regarded as a matrix over C, then the elementary divisors of 'A/s - A are ('A - 1)2, ('A - i)2, ('A + i) 2• Consider the matrix B = (

0 21) +( 0~ 0~ 0~ ~)1

-1

-1

0

-2 0

The elementary divisors of 'AI, - B (over R) are directly computed to be ('A - 1)2, ('A 2 + 1) 2• Hence, by 3.20.1, the matrices 'AI, -A and 'Als - B are equivalent over R['A]. It follows by 3.24.3 that A and B arc similar over R.

3.26 Ele111ent.ry divl1011 end 1i111ilerity

3.26.1

Two matrices A and B in M .(F) are similar over F if and only if 'AI. -A and 'AI.- B have the same elementary divisors (see 3.20.1, 3.24.3). 3.!6.! If A and B are in J.lf.(R), then A and B are similar over C if and only if they are similar over R.

3.26.3

u

then the set of elementary divisors of 'AI. - B is the totality of elementary

Ch.l

50

Survey of Matrix Theory

divisors of all the >J,., - A, taken together, (i = 1, · · · , m). The proof of this statement is deferred to 3.29.5.

"' " Ai is similar 3.!6.4 If Ai is similar to B, over F, (i = 1, · · ·, m), then I: 1

·-

"' B, over F. to E" i•l

3.!6.5 If A is in M.(F), then A is similar over F to A". For, >J.- AT= ('AI.- A)" and therefore A" and A have the same similarity invariants.

3.!7 Example Consider the matrix B over R given in 3.25.2. Then, as in 3.26.3, m .. 2,

A,=(-~ ~} and

As=

( 000 00I 001 0)01 . -1 0 -2 0

The elementary divisor of 'Ais- At is ('A - 2) 1 and the elementary divisor of >J4 - A1 is ('A 2 + 1) 2• By 3.26.3 the elementary divisors of Ma - Bare ('A - 2) 1 and ('A1 1)1•

+

3.!8 Mlnl11111l polynomial If A E M.(F), then the monic polynomial (/)('A) E F['A] of least degree for which (/)(A) = o•.• is called the minimal polynomial of A. We give below the essential facts about the minimal polynomial and related notions. 3.!8.1 The minimal polynomial of A divides any polynomial /('A) for which J(A) - o•.•. 3.!8.! (Cayley-Hamilton theorem) If 1/I('A) is the characteristic polynomial of A, then -/~(A) = o•.• and thus the minimal polynomial of A divides the characteristic polynomial; a matrix for which the minimal polynomial is equal to the characteristic polynomial is called nonderogatory, otherwise derogatory.

Proof: By 2.4.9 we have (M.- A) adj ('AI. -A) = -/I('A)I•. Clearly adj (M. - A) is a matrix with polynomial entries of degree not exceeding n - 1. Let

&c.3

Linear Equations and Canonical Forms

51

adj (U.. - A) = B._.x•-• + · · · + BsX + Bo, (BJ E Jf,.(F);j =- 0, · · ·, n- 1), and #(X) - c.X"

Then

+ · · · + caX + Co,

(XI,. - A)(B-~x·-•

(c; E F;j = 1, · · ·, n).

+ ··· + B.x + Bo) =

(c,.X•

+ ··· + c,X + 4)1•.

Comparing coefficients, we obtain

B,._1

= c.I.

B,._, - AB-1 = c-•1•

Bo

- AB1 -ABo

=

c11,.

=

c.J•.

Multiplying the first of these equalities by A •, the second by A ·-•, the jth by A •-H- 1, and adding them together yielda

o.... = c,.A• + c"_.A•-• + · · · + c1A +Col"

=~(A).

3.18.3 The minimal polynomial of A is equal to ita similarity invariant of highest degree. Proof: Let 't'(X) denote the minimal polynomial and let q,.(") be the similarity invariant of highest degree. That is, q"(A) is the in\•ariant factor of highest degree of the characteristic matrix ).]• - A. The entries of Q(A) = adj (U,.,- A) are all the (n- I)-square subdeterminanta of >J.- A. Hence the (n- l)st determinantal divisor of M,.- A,J-~(X), is jUBt the gcd of all the entries of Q(X): Q(X) = f,._.(X)D(X) where the entries of D(") are relatively prime. If /(X) is the characteristic polynomial of A, then by definition, /(X) = q,.(")/-l(A) and hence f,.-l(X)D(")()J,. -A) Q(X){M. - A) - /(")!,. - q"(X)/..-l(X)I,.. Clearly /.-a(X) J4 0 [e.g., d((M.- A)(lll)) is a polynomial ?lith leading term x-•] and hence

D(")(U. -A) - q,.(")l,.. If we express both sides of 3.28(1) as polynomials with matrix coefficients (see 3.24.2) and replace A by A, then it follows that q.(A) - o•.•. Hence by 3.28.1, 't'(X)Iq,.("). We set q,.(X) = rp(X)h(X). From rp(A) = 0,.,. we conclude by the division theorem for polynomials that rp(")/,. .. C(X)(Xl,. -A) where C(X) is a polynomial in " with matrix coefficients. Thus, from 3.28(1), D().)(>J.. -A) - q.(X)l,. =- h(X)rp(X)/,. = J&().)C(X){).f. -A). It follows from the uniqueness of division that D(X) - h().)C(X).

(2)

H we now regard D().) and C().) in 3.28(2) as matrices with polynomial

Ch. I

52

Survey of Matrix TMory

entries, we conclude that h(~) is a divisor of the entries of D(~). But these are relatively prime and hence h(~) is a unit. Since q.. (~) and~(~) are both monic we obtain h(~) = 1 and cp(~) = q.. (~).

+

3.28.4 The minimal polynomial of A B, [A E .M,.(F), B E .llf.. (F)], is the least common multiple of the minimal polynomials of A and B. Recall that the least common multiple of two polynomials is the monic polynomial of least degree that both divide. 3.28.5 The matrix A E },f,.(F) is nonderogatory (see 3.28.2), if and only if the first n - 1 similarity invariants of A are 1.

3.29 Companion matrix If p(~) E

-

F[~], (p(~) = x.~:

0 0

1 0

0 1

l

~ a.~:-JX~;-- 1 ), i•l

then the matrix 0 0

0

0

0

0 a.t-2

a.t-t

is called the companion matrix of the polynomial p(~) and is designated by C(p(~)); if p(~) = ~ - ao, then C(p(~)) = (ao) E },/ a(F). The matrix C(p(~))

is nonderogatory and its characteristic and minimal polynomials both equal p(~). 3.29.1

If p(~) =

(~

- a) 4, (a E F), then

a

1

a 1 H(p(~)) =-

0

is called the hypercompanion matrix of the polynomial (~ - a).t:; if p(X) .. X- a, then H(p(~)) - (a) E llfa(F). The only nonzero entries in H(p(~)) occur on the main diagonal and immediately above it. The matrix H (p(~)) 1s nonderogatory and the characteristic and minimal polynomials both equal p(~). The following are three results of great importance.

&c. 3

Linear Eq11ations and Canonical Forms

53

3.!9.! If A E J.f .. (F), then A is similar over F to the direct sum of the companion matrices of its nonunit similarity invariants. Proof: Suppose qi(X), (i = 1, · · ·, n), arc the similarity invariants of A: q1(X) = · · ·

"

n = _I:

J-k+l

= q~:(X) = 1 and degqi(X) = mi >

mJ; and qJ(X)Iqi+•(X), (j = 1, · · ·, n -

1, (j

= k + 1,

· · ·, n);

1) (see 3.22). Let Qi =

+

C(q,(X)), (i ... k 1, · · ·, n), and since (see 3.29) Q, is nonderogatory with minimum polynomial equal to q,(X), we conclude that}.].," - Q, is equivalent over F[X] to diag (1, · · ·, 1, q,(X)). It follows immediately that "

= l•k+l _I:" 'AI"" ft

- Q, is equivalent over F[X] to diag ·-T"tl (1, · · ·, 1, q.~:+I(X), · · ·, q"{X)) = diag (q.(X), · · ·, q,.(X)). It is ob\rious that the invariant factors of this last matrix are just the qi(X), (i = 1, · · ·, n),

'AI,. - . y-· Q,

and hence XI. -

" I:"

i-t+l

Q, is equivalent over F[X] to XI. -A. The result

now follows from 3.24.1. 3.29.3 (Frobenius, or retionel, anonlcel form) If A E .M.(F), then A is similar over F to the direct sum of the companion matrices of the elementary divisors of 'AI" - ..4.. Proof: Let p,(>..), (t = 1, · · ·, r), be all the different primes that occur in the factorizat.ion into primes of the nonunit similarity invariants of A, q.~:•• (X), · • ·, q,.(X). Let e•• (>..), · · ·, e&""(X) be the positive powers of the various p,(>..) that occur in factorization of q,(X); i.e., q,(X) = e,.(>..) · · · ei,.,(X). Let Q, = C(q,(X)), (i = k 1, · · ·, n), and set deg q,(>..) ,. k,. We show first that 'AI~;, - Q, is equivalent over F[X] to

+

., diag (1, · · ·, 1, e.,(x)). ,_.I:'

(1)

For, the determinantal divisors of C(qi(>..)) are justfa(X) = · · · = f~:;-1 (X) = 1 and /.,(>..) = q,(.X). On the other hand, the set of (k, - I)-square subdeterminants of the matrix in 3.29(1) clearly contains each of the polynomials

...II

eit(X) == d,.{X), (r-= 1, · · ·, mi). The e,,(X), (t = I, · · ·, m,),

t-l,t)IOfr

are powers of distinct prime polynomials and hence the gcd of the dr(X), (r .... 1, · · ·, m,), is 1. It follows that the determinantal divisors of the matrix 3.29(1) are also just j.(X) = · · · == ft,-1(>..) == 1 and/.,(>..) = qi(>..). Hence, from 3.20.1, 'AI~;; - Q, is equivalent over F[X] to the matrix in 1, · · ·, n). Then 3.29(1). Next, let c~e = deg e,1(X), (t = 1, · · ·, mi; i == k diag (1, · · ·, 1, tt~(X)) is equivalent over F[>..] to 'Ale., - C(e.:~(~)) (they both obviously have the same determinantal divisors) and hence we con-

+

elude that XI .. -

" Q, is equivalent O\'er F[>..] to y-·

i•T+I

Ch.l

54

Survey of Matrix Theory

"

"

...

Hence from 3.24.1 it follows that yo· Q, is similar to yo· l:" C(e;,(~)) i-T+t i-T+t ,_, and from 3.29.2 the result follows. 3.!9.4 (Jorden normal form) If A E J&f.. (C), then the elementary divisors of ~I.- A are of the form (~- a)t, (k > 0), and A is similar over C to the direct sum of the hypercompanion matrices of all the elementary divisors of ~I. - A. This direct sum is called the Jordan normal form. Since the Jordan normal form is a triangular matrix, the numbers appearing along the main diagonal are the characteristic roots of A. Proof: Each of the two matrices H ((~ - a)") and C((~ - a)") is nonderogarory (see 3.29 and 3.29.1). Moreover, each matrix has the polynomial (~ - a)t as both its minimal and characteristic polynomial. Hence the matrices H((~- a)tc) and C((~- a)tc) have precisely the same similarity invariants. It follows (see 3.29.2) that they are similar. The result is now an immediate consequence of 3.29.3. 3.!9.5 (Elementary dlvi10rs of e direct sum) We are now in a position to prove 3.26.3. We prove that if A = B C where Band Care p- and q-equare respectively, then the elementary divisors of )J. - A are the elementary divisors of ~I, - B together with those of )J9 - C. Let S = {ei(~). i = 1, · · ·, m}, denote the totality of elementary divisors of ~JP- Band ~I,- C; each of the ei(~) is a power of a prime polynomial. Scan S for the highest powers of all the distinct prime polynomials and multiply these together to obtain a polynomial q.. (X). Delete from S all the ei(X) used so far and scan what remains of S for all the highest powers of the distinct prime polynomials. Multiply these together to obtain q,._t(X). Continue in this fashion until Sis exhausted to obtain polynomials Qt+t{X), · · ·, q,.(X) (this is the process described in 3.20.2). It is clear from this method of construction that Qt+t(~) lqt+li(X). It is also clear that the sum of the degrees of the q;(X), (j = k + 1, · · ·, n), is just p + q = n. Let Q be the n-square direct sum of the companion matrices of the q;(X), (j = k + 1, · · ·, n). It is easy to compute that the similarity invariants of Q are the q;(X), (j = k + 1, · · ·, n), and hence the elementary divisors of )J. - Q are just the polynomials in S. Thus, by 3.29.3, Q is similar to the direct sum of the companion matrices of the polynomials in S. By 3.29.3, B and C are each similar to the direct sum of the companion matrices of the elementary divisors of )J P - Band ~I 9 - C respectively. Thus A is similar to the direct sum of the companion matrices of the polynomials inS. Hence XI .. - A and M.- Q have the same elementary divisors and the proof is complete.

+

Sec. 3

Lintar Equations and Canonical Forms

55

3. 30 Exam pi• 3.30.1 is

The companion matrix of the polynomial p(~) - ~· - ~~

C(p(X))-

+~-

1

G 10) 0 -1

1 1

3.30.1 The companion matrix of p(~) - ).8

-

- (~ -

.

+ 3~' - 4).1 + 3).1 1) 1 ().1 + 1) 1

2~11

2).

+1

18

E-

0 0 0 0 0 -1

1

0 0 0 0 2

0 1 0 0 0

0 0 1 0 0

0 0 0 1 0

0 0 0 0 1

E Me(R).

-3 4 -3 2

The only nonunit invariant factor of )J• - E (according to 3.29, E is nonderogatory) is p(>.). Hence E is already in the form given by 3.29.2. The elementary divisors in R[~] of >J, - E are Pt(~) = (~ - 1) 1 -= X' - 2X + 1 and Pt(~) -= (X1 + 1) 1 = X' + 2X' + 1. Thus C{pt(X)) = ( _ ~ ; }

C(Pt(~)) •

(

0 1 0 0

0 1

o o

o

-1 0

-2

D·

The Frobenius normal form in 3.29.3 states that E is similar over R to the direct sum 1 :I 0 0 -1 -21I

·----------·'·-----------------1

:I

0

!

0 1 0 0 0 0

:I : -1 0

0 0 1 0 0 1

-2 0

If E is regarded as a matrix in llfe(C), then the elementary divisors of E are

h1(X) = (X - 1) 2, #&,(X) = (X - i) 1 , lb(X} = (X

+ •)'.

Suruey of Matrix Theory

Ch. I 56 The corresponding hypercompanion matrices are H(h1(X)) ..

(~ ~}

(~

!}

H(ht(X))

==

H(ha(X))

= ( -~ -~}

The result 3.29.4 states that E is similar over C to its Jordan normal form:

H(h1(X))

+H(ht(X)) +H(ha(X)) ==

!

0

1 1 0 1

:

0

------- -------:---------i

0

1 : . :

0

I

:

0

·------~-------1----------

: 0 :I

.

3.31

0

: :I

1

.

Irreducibility

U A E M .. (F) and there exists a nonsingular S E M .(F) for which SAS-• == C

+D

where C E .llfp(F), D E M,(F), p + q = n, then A is said to be reducible over F. Otherwise, A is irreducible over F. 3.31.1 If A E M .(F), then A is irreducible over F, if and only if A is nonderogatory and the characteristic polynomial d(XI" - A) is a power of a prime polynomial. Thus the hypercompanion matrix 1/((X- a)•) is always irreducible. This implies that the Jordan normal form is "best" in the sense that no individual block appearing along the main diagonal is similar over F to a direct sum of smaller blocks. Similarly, the blocks down the main diagonal in the Frobenius normal form in 3.29.3 are companion matrices of powers of prime polynomials (the elementary divisors of XI. - A) and are therefore irreducible.

3.32 Similarity to a diagonal matrix A matrix A E .M.(F) is similar to a diagonal matrix over F under the following conditions: 3.32.1

If and only if all the elementary divisors of XI. - A linear.

Stc. 3

linear Equations and Canonical Forms

57

3.3!.! If and only if the minimal polynomial of A has distinct linear factors in F["]. 3.3!.3 If and only if A has a set of n linearly independent characteristic vectors (this requires, of course, that the characteristic roots of A lie in F). 3.3!.4 IfF = C and A bas distinct characteristic roots. 3.3!.5 If and only if d(B) ;II! 0 where B = (bi;) E M ,.(F) and b,1 tr (A'+'), (i,j = 0, · · ·, p- 1), and pis the degree of the minimal polynomial of A. Proof of 3.32.1. If the elementary divisors of "I• - A are linear, then the rational canonical form of A (see 3.29.3) is diagonal. Conversely, if A is similar to a diagonal matrix D, then, by 3.26.1, the matrices).],.- A and "/,. - D have the same elementary divisors and, by 3.26.3, the elementary divisors of )J,. - D are linear. Proof of 3.32.2. The definition of elementary divisors (sec 3.20) implies that the elementary divisors of a matrix in Jf.(F["]) are linear if and only if its invariant factor of highest degree has distinct linear factors in F["]. The result follows by 3.28.3 and 3.23.1. Proof of 3.32.3. Note that if S E Jl,-f ,.(F) and D = diag ("'' · · ·, >...), then (SD) -= ";SW. Now, suppose that B- 1AS = D. Then AS= SD, (AS)W = (SD)Ci>, A sw = ";SW J

and, since Sis nonsingular, the vectors S< 1), • • ·, S..; ~ ~h (j = 1, · · ·, r), we must have c1 = · · · - Cr - 0. But this implies that Zr+a = 0, a contradiction.

3.33 Exempla 3.33.1

Let p(>..)

= ~· -

+ 2~1 -

2~1

2~

+ 1 E C[~].

Then A =

C(p(~))

=- (

~i

-1 2

! ~)·

-2 2

The only nonunit similarity invariant of A is p(>..) itself which can be factored (over C[>..]) into (>.. - 1)1 (>.. + i)(>.. - i). Since p(X) is not a product of distinct linear factors, it follows from 3.32.2 that A is not similar to a diagonal matrix over C. However A is similar over C to the direct sum of the hypercompanion matrices of the elementary divisors of M. - A: (~- 1)1, ~ + i, ~- i. Thus A is similar over C to

(~_ _q_ _~---). I

0 :I

-~

0

0 • ~

3.33.! Although a matrix A E M .(C) is not necessarily similar to a diagonal matrix, it is similar to a matrix in which the off-diagonal elements are arbitrarily small. For, we can obtain a nonsingular S for which s-•AS is Jordan normal form: bu bu 0 bn s-•AS- B

0

bu

0

0 0

= ba-1,11

0

Let E = diag (1, 8, · · ·,

a•-•),

b•.• (8

> 0), then

Sec. 4

Special Classes of Matrices; Commutativity bu ah11 0 0 ba abta 0

59

0 0

(SE)- 1A(SE) - E- 1BE""" &ba-l,a

0 0 b•• so that the off-diagonal elements can be made small by choosing I small.

4 Special Classes of Matrlces1 CoMmutativity 4.1

Bilinur functlonel

Let U be a vector space (see 2.18) and let (J be a function on ordered pairs of vectors from U with values in C satisfying (i) (ii)

(J(cu1 + dttt, v) = c/3(u., v) fj(u, cv1 dv,) = c/3(u, v.)

+

+ ~(Ut, v),

+ d/3(u, Vt)

for all vectors u~, Ut, Vt, Vt, u, and v and all numbel'8 c and d. Then (J is called a bilinear functional. H statement (ii) is changed to (ii)' /J(U, CV1

+ dvt)

- ~fl(u, V1)

+ a,3(u, Vt),

then fJ is ca1led a conjugate bilinear functionol.

4.1

Exe•pl•

4.1.1 If U is the totality of n-vectors over C, i.e., U • C• and.d is any complex matrix, then (J(u, v) -

• E aiJ''UiVIJ i,j•l

where u = (u., · · ·, u.), v """ (v., · · ·, v.), is a conjugate bilinear functional.

.t.l.! As indicated in the example 2.2.1, Jf.(C) can be regarded as a vector space of n 1-tuples. The function fj(A,B) • tr (AB*) is a conjugate bilinear functional. This follows immediately from the properties of the trace given in 2.8.1.

4.3

Inn• product

If U is a vector space and (J is a conjugate bilinear functional which satisfies (i) (ii)

(J(u, v)

fJ(u, u)

== fj(v, u),

> 0 with equality if and only if u

- 0 (ft is positive deftniiB),

Surt•ty of Matrix Theory

Ch.l

60

then~ is called an inner product on U. The vector space U together with the functional~ is called a unitary space. In case the underlying field is the real number field R then it will be assumed that ~ takes on just real values

and condition (i) becomes simply ~(tt, v) = ~(v, tt). In this case U together with~ is called a Ettelidean space. The notation (u, v) is commonly used for the inner product IJ(tt, v). The nonnegative number (u, u) 112 is called the length of tt and is designated by Ilull· The term mmn of u is also used to designate Ilull.

4.4 Exemple In the vector space Ct• of n-tuples of complex numbers, (u, v) =

I:"

i•l

u;iJ;,

(u = (u~, · · ·, u..), v- (v~, · · ·, v,.)),

is an inner product.

4.5 Orthogonality If u~, · · ·, u. are vectors in U and (u,, u;) = 0 for i ~ j, then Ut, • • ·, u. are said to be pairuJise orthogonal. If, in addition llu•ll = 1, (i = 1, · · ·, n), then tt~, · · ·, u.. is an orthonormal set. In general, if llull = I, then u is a unit vector. Some results of great importance for a unitary or Euclidean space follow.

4.5.1 (Gram-Schmidt orthonormalizetion process) If x~, · · ·, x.. are linearly independent, then there exists an orthonormal set !t, · · ·, f,. such that (k -

1, .. ·, n).

(1)

Moreover, if Ua, · · ·, g,. is any other orthonormal set satisfying (g 1, • • ·, g~c) = (xa, · · ·, x~c), (k = I, · · ·, n), then g, = c,.f,. where jc,j = 1, (i,.. I, · · ·, n). This result, together with 3.11.2, implies that any orthonormal set of k vectors in ann-dimensional space (k < n) can be augmented to an orthonormal basis.

4.5.2 To construct the sequence/~, ···,/.. in 4.5.1: Step 1. Set/a = Xtllxtll-•. Step 2. If /a, · · ·, !" have been constructed, define d; (j == 1, · · ·, k), and set u~r:+l = X~r:+a

k

-

I: j•l

d;/;. Then

= (x~c+t, /;),

Sec. 4

Special Classes of Matrices; Commutativity

61

4.5.3 (C.uchy-Schw•rz inequ•lity) If u and v are any two vectors, then

l(u, v)l < llullllvll with equality if and only if u and v are linearly dependent. Proof: If u = O, the inequality becomes an equality. Otherwise let (v, u)

!lull'

w = v-

u.

Then (

w, u)

(v, u)

""' (v, u) - ilull'

(u,

u) ,..

0

and

llwll' =

(w, v) =

= llvll' But

'W~ 0 ; 1 '·

llwll' > 0

and therefore

or, since

llvll' - jj~ 1j} (u, v)

llvll' > l(u, v)l' - !lull' !lull, llvll, and l(u, v)l are nonnegative, l(u, v)l < llull llvll.

The case of equality occurs, if and only if either u

= 0 or w = v - f~ 1jl u

-

0; i.e., if and only if u and v are linearly dependent.

4.5.4 (Triangle inequality) If u and v arc any two vectors, then

llu +vii < !lull + llvll with equality only if one of the vectors is a nonnegative multiple of the other. Proof:

llu +vii' ....

(u

+ v, u + v)

llull 2 + (u, v) + (v, u) + !lull' == !lull' + 2Re((u, v)) + l!vll' < llull' + 2j(u, v)l + llvll' < !lull' + 2llullllvll + llvll 2 , (by 4.5.3), 2 = (!lull + llvll> • -=

The equality occurs if and only if u and v are linearly dependent and Re((u, v)) - I(u, v)l; i.e., if and only if one of the vectors is a nonnegative multiple of the other. .4.5.5 (Pythagorean theorem) x E (u~,

R

If

u., · · ·, u. are orthonormal vectors and ft

· · ·, u.), then x = .E (x, u,)u, and llxll'- .l: ICx, u,)l'. •-1 •-1

Ch.I

62

Survey of Matrix Thtory

-4.6 Example As an example of the idea of inner product we will obtain two interesting formulas for the inner products of two skew-symmetric (2. 7.3) and two symmetric (2.12. 7) products of n-vectors, (u. 1\ · · · 1\ u,., Vt 1\ · · · 1\ Vr) and (u. · · · u,., Vt • • • vr). \Ve are, as usual, using the inner product described in 4.4 for all vectors considered. Let A E .M..... (C) and BE .Mr ... (C) be the matrices whose rows are Ut, • • ·, u,. and Vt, • • ·, Vr respectively. Then, by 2.4.14, (Ut 1\ · · · 1\

u,., Vt

1\ · · · 1\ Vr)

=

.~ .• d(A [1, · · ·, rlw]) d(B[1, .. ·, rlw])

= ... ~..• d(A[1, · .. ,rlw])d(B*[wl1, · · ·,r]) = d(AB*).

But AB* E .M,.(C) and by matrix multiplication the (s,t) entry of AB* is just (u., v,). Hence (Ut

1\ · · · 1\ Ur,

1\ · · · 1\

Vt

Vr)

(s, t = 1, · · ·, r).

= d((u., v,)),

(1}

The situation for the symmetric product is analogous. For, by 2.11.7, (Ut • • •

u,., VJ

• • • Vr)

..,ef = wef

p(A[l, · · ·, rlw]) p(B[l, · · ·, rlw])fp(w)

=

p(A[l, .. ·, rlw]) p(B*[wl1, · · ·, r])fp(w}

= p(AB*) = p((u,, v,)),

(s, t

=

1, · · ·, r).

Hence (Ut • • •

u,.,

Vt • • • Vr)

= p((u., v,)),

(s, t = 1, · · ·, r).

(2)

From the formulas 4.6(1), 4.6(2), 2. 7(8), and 2.12(1), we have that if H E M.(C), then (Cr(H)ul 1\ • · • 1\ Ur,

Vt

1\ · · · 1\

Vr)

= d((Hu,, VJ))

(3)

and (4)

-4.7 No.,... I matrica Let A E M .. (C). Then:

-4.7.1

A is normal if AA*

= A*A.

-4.7.2 A is hermitian if A • = A. -4.7.3 A is unitary if A*= A- 1• -4.7.-4 A is skew-hermitian if A • = -A.

Stc. 4

SjJ«ial Classts of Matrices; Commutativity

63

In case A E M .(R), then the corresponding terms are as follows.

A is real normal if AA"

4.7.5

4.7.6 A is 311mmetric if A"

=

A"A.

= A.

A is orthogonal if A" = A -•. 4. 7.8 A is skew-8ymmetric if A" = -A. 4.7.7

4.7.9 If in what follows a n>sult holds for the real as well as the complex case, we will not restate it for the real case. \Ve will however sometimes consider matrices A E Jf.(C) that are complex ttymmetric: A" = A; oomplex orthogonal: A"= A- 1 ; complex skew-symmetric: AT= -A. \Vhen these arise we will carefully state exactly what is going on in order to avoid confusion with the real case. Some elementary results concerning these classes of matrices are listed. 4. 7.10 Hennitian or unitary or skew-hermitian matrices are normal.

4. 7.11 (i

.

If A,. are hermitian (skew-hennitian) and ai are real numbers,

= 1,

· · ·, m), then

I: a~, is hermitian (skew-hermitian). i•l

4.7.11 If A is normal or hermitian or unitary, then so is A• for any integer p (in case p is negative, then A is of course assumed nonsingular). If pis odd and A is skew-hermitian, then AP is skew-hermitian. 4.7.13 If A and Bare unitary, then AB is unitary. If A and Bare normal and AB - BA, then AB is normal. 4.7.14 Let (u, v)

=

l:"

uiiJi be the inner product in the space C• and

i-1

suppose A E !tf .(C). The fact that the equality (Au, v) = (u, A *t1) holds for all u and fl in C• implies that A is (i) (ii) (iii)

hermitian, if and only if (Au, v) - (u, Av) for all u and v in C• or (Au, u) is a real number for all u in C•; unitary, if and only if IIAull = llull for all u E C•; skew-hermitian, if and only if (Au, fl) = -(u, Av) for all u and 11 in C•. II

Similarly, if the inner product in the space R• is taken to be (u, t1) and A E M.(R), then (Au, v) - (u, A Ttl) and it follows that A is (iv)

(v) (vi)

.I: u,f1,

·-·

symmetric, if and only if (Au, v) = (u, Av) for all u and v in R•; orthogonal, if and only if IIAull = Ilull for all u E R•; skew-symmetric, if and only if (Au, v) - -(u, Av) for all u and v in R•.

In what follows we will use the inner products for in 4.7.14 unless otherwise stated.

c- and R• described

64

Ch. I

4.7.15 If A E M.(C), then A = H

+ iK

Survey of Matrix Theory

where H =

A +A* and 2

.. K = A-2i A* are both herDUtlan.

4.7.16 If A E M .. (R), then A = S . and T metric

+T

. whihS m e = A+A". 18 sym2

= A -A". 18 s kew-symmetri'e. 2

4.7.17 If A E M .. (C) and u E C• is a characteristic vector of A corresponding to the characteristic root r, then Au= ru and hence (Au, u) = (ru, u) = r(u, u). From this we can easily see the following results. 4. 7.18 If A is hermitian or real symmetric, then any characteristic root of A is real [see 4.7.14(i), (iv)]. 4. 7.19 If A is unitary or real orthogonal, then any characteristic root of A has absolute value 1 [see 4.7.14(ii), (v)]. 4.7.!0 If A is skew-hermitian or real skew-symmetric, then any characteristic root of A has a zero real part [see 4.7.l4{iii), (vi)].

4. 7.!1

A normal matrix A E M .. (C) is hermitian if and only if the charac-

teristic roots of A are real.

4.7.!! A normal matrix A E Ma(C) is skew-hermitian if and only if the characteristic roots of A have zero real part. 4.7.!3 A normal matrix A E M .. (C) is unitary if and only if the characteristic roots of A have absolute value 1. 4.7.!4 A matrix A E M .. (C) is unitary if and only if the rows (columns) of A form an orthonormal set of vectors in C•. 4.8 Ex111pla 4.8.1 The criterion in 4.7.24 provides a method of constructing a unitary or orthogonal matrix with a preassigned unit vector as first row. For, given any unit vector A(l) in C• orR", we can use the Gram-Schmidt process (see 4.5.2) to construct vectors A + t" (t< ..+t)let + e 'e2 + · · · + e"'e,.)

= e1e2 = c

Moreover, for

8 ~

1

1 1e,.

21ea

2

1e1

(3)

t (4)

66 whereas

Ch.I

Survey of Matrix Theory

llu.ll' = (u,, u.) .... n.

(5)

Hence combining the statements 4.8(3), 4.8( 4), 4.8(5) we can say that P has an orthonormal set of characteristic vectors v, ... u.-,Jv'n such that Pv, = e'v,, (t .... 1, · · ·, n). (6)

4.9 Circulant As an interesting application of example 4.8.3, we can analyze completely the structure of the characteristic roots and characteristic vectors for any matrix of the form Co

A-

It is clear that A '""'

•-1

I: c,P• and hence using the notation of 4.8.3

c-o

•-1

Av. = I: c,P'v.

•-o = ( •-o E c,~·) "•· 11

1

Thus if"'(>.) is the polynomial eo+ c.x + · · · + C.-t>.•-•, then Av, = t/l(~)v, and A has characteristic vectors Vt, • • • , v.. corresponding respectively to the characteristic roots t/f(e), · · ·, "'(e•). Any polynomial in the matrix n-1

P, A

= t•O I: c,P', is called a circulant.

4.10 Unit.ry shnilerity 4.10.1 If A E M.(C) and U E M.(C) and U is unitary, then U* = U- 1 and (U*AU);; = (AU, U), (i,j = 1, · · ·, n). Suppose that A has an orthonormal set of characteristic vectors u., · · · , u. corresponding respectively to characteristic roots rt, · · ·, r•. Let U be the matrix whose jth column is the n-tuple u;, (j = 1, · · ·, n). Then U*U = I .. so U is unitary and (U*AU),; = (AU, U) = (Au;, Ui) = (rfU;, u,) = r;llii· Hence AU = diag (r., · · ·, r .. ). Thus a matrix is similar to a diagonal matrix

u•

Sec. 4

Spet:Ud Classes of Matrices; Commutativity

67

via a unitary matrix if and only if it has an orthonormal set of characteristic vectors (see also 3.12.3). 4.10.! (Schur trlanguleriutlon theorem) If A E .~.V.(C), then there exists a unitary matrix U E M .(C) such that T == UAU* is an upper triangular matrix with the characteristic roots of A along the main diagonal. H A E lf.(R) and A has real characteristic roots, then U may be chosen to be a real orthogonal matrix. The matrix A is normal, if and only if T is diagonal. Proof: We use induction on n. Let x be a characteristic vector of A of unit length, Ax = X1x. Let V be any unitary matrix with x for its first column (see 4.8.1). Then (V*AV) 0. 4.12. 3 For a normal matrix A there is a close relationship between the singular values of A and the characteristic roots of A. If A E .M.. (C) is

Ch. I

70

Survey of Matrix Theory

normal, then the singular values of A are the characteristic roots of A in absolute value: vxi(A*A) = 1>-;(A)I, (j = 1, · · ·, n). 4.1!.4 If A E M"(C) is hermitian and p(A) = p, then A is positive semidefinite, if and only if there exists a permutation matrix Q E M "(C) such that d(Q'~'AQ[l, · · ·, kll, · · ·, k]) > 0, (k = 1, · · ·, p). (1) If p = n, then 4.12(1) holds for k = 1, · · ·, n, if and only if A is positive definite. 4.1!.5 A hermitian matrix is positive definite (semidefinite), if and only if every principalsubmatrix is positive definite (nonnegative) hermitian. 4.1 !.6 The Kronecker product, compound matrix, and induced matrix inherit certain properties from the component matrices. If A E Af,.(C) and x., · · ·, x~: are characteristic vectors of A corresponding respectively to the characteristic roots r1, · · · , r~:, then the skew-symmetric product Xt

A ... A

Xt

(see 2.7.3) is a characteristic vector of C~:(A) corresponding to the product r~:; the symmetric product Xt • • • x~: (see 2.12.7) is a characteristic vector of P~:(A) corresponding to Tt • • • r.~:. (Of course if the x; are linearly dependent then Xt A · · · A X1r = 0; hence in the case of C~:(A) we assume the x; are linearly independent.)

r1 • • •

4.1!.7 If A and Bare normal matrices in M,.(C) and M.(C) respectively, then (i) (ii)

A ® B is normal in M .."(C); if 1 < r < n, then Cr(A) is a normal

Pr(A) is a normal ( n

+; -

(~)-square matrix over C and

1)-square matrix over C.

Moreover, if A is any one of hermitian, positive definite hermitian, nonnegative hermitian, skew-hermitian (r odd), unitary, then so are Cr(A) and Pr(A ). If A and B both have any of the preceding properties (with the exception of the skew-hermitian property), then so does the mn-square matrix A 00 B. 4.13

Ex1mple

Let {j be a conjugate bilinear functional defined on ordered pairs of vectors from C". Let ua, · · ·, u,. be a basis for C" and define an associated matrix A E •",.(C) by aii = fj(ui, u;). Then we can easily prove that {j is positive

Sec. 4

Special Classes of Matrices; Commutativity

71

definite if and only if A is positive definite hermitian. This justifies the use of the term "positive definite" in both casee. For if x = (x1, • • ·, x.) is any vector in C• and u = IJ(u,

,. E ,_, x,ui, then

f:. z,-u-., f:. x,-u.) == E IJ(u,, u1)x~1 =- f:, a,JX~J·

u) = fJ {

·-·

According to 4.12.1,

,_,

,.

i,j•l

.l: a.JX~J > 0 for x .,.,

'·1-1

o. if and only if A is positive

jill

definite hennitian.

...14 Functions of nonn•l . .trlca H A E M .(C) is normal, then let U be a unitary matrix such that U* AU = diag (r., · · ·, r .. ) where r;, (j = 1, · · ·, n), are the characteristic roots of A. If j(z) is a function defined for z = r;, (j = 1, · · ·, n), then set j(A) = U diag (J(r.), · · ·,/(r.))U•.

In case /(z) == a.za.

(1)

+ Clir-•zl:'- + ·· · + a.z + ao, 1

/(A) == U diag (/(r.), · .. ,/(r.))U* = =

u {t•O Ea, diag (71, E'

t•O

.. ·,

r:.>) u•

a,U(diag (r., · · ·, r.))'U*

= t•O E' a,( U diag (r.,

(2)

· · ·, r .. ) U*)'

i

== E a,A'. t•O

Hence the definition 4.14(1) agrees with 2.13.1. Note that, by 4.14(1), j(A) has /(r1), • • ·,f(r,,.) as characteristic roots. It is also the case that when A is normal A* is a polynomial in A. H A E Af.( C) is hermitian positive semidefinite and /(z) = z 11•, (m > 0), then f(A) satisfies (J(A))• = A. If /(z) = lzu'"l, then j(A) = B is a uniquely determined positive semidefinite hermitian matrix and satisfies B• = A. The matrix B is sometimes written A 11"'. Of particular interest is the square root of A, A 111•

4.14.1

...15

Ea••pl..

...15.1 If J is the matrix in M .(R) all of whose entries are 1 and p > O, p nq > 0, then the matrix A = pi" qJ is positive definite. In fact, the matrix A is a circulant (see 4.9) in which

+

+

Ch. I

72 Co= p + q, A = (p + q)I.

(Ct

Survty of Matrix Theory

= C:a = · · · = C•-1

= q):

+ qP + qP2 + · · · + qP•-•.

The characteristic roots of A are the numbers (see also 2.15.3) Tt = (p + q) + qe• + qeli + ... + qe.. - r)• and let A be the k-square hypercompanion matrix of p(~) (see 3.29.1 ). Iff(~) is a polynomial, then j(A) is defmed by /( ) r

j'(r)

.LJ!2

1!

2!

pt-ll(r) (k- I)!

/"(r)

f(A) =

E

.M~:(C),

(1)

2!

. £J!l

0

1! .f(r)

where JW(r) is the tth derivative of/(~) evaluated at r (J•o'(>..) = f(~)). If B is any nmtrix in Jlf ,.(C), then 8 is similar over C to the direct sum of the hypercompanion matrices of all the elementary divisors of ~I. - B (see 3.29.4):

Then f(B) is defined by J(B) = S- 1 (

.E. J(A ,)) S.

(2)

azzl

More generally, iff(>.) is any function for whichfC'l(r), (t = 0, · · ·, e(r) - 1),

is defined for each cha.racteristic root r of B where (>. - r)., 1'l is the elementary divisor of highest degree involving>. - r, then the formulas 4.17(1),(2) define f(B). In the case of a normal matrix this definition coincides with those in 4.14 and 4.16.

4.18

Example

Suppose B E ill "(C) and /(~) is a polynomial in C[>.] such that /(0} = 0 if 0 is a characteristic root of B. Then there exists a polynomial q(>.) E .] for which q(B) = f(B) and q(O) = 0. For, in the notation of 4.17, it suffices to choose q(>.) so that qW(r) = Jj). 5.10.1 In order to compute the upper triangular matrix Pin 5.10.1, the Gram-Schmidt process (see 4.5.2) is used. Step 1. By a sequence of elementary row operations E~, · · ·, E.. and corresponding conjunctive column operations (see 5.8.1) reduce A to I,.; i.e., E: · · · ETAE, · · · E,. = I •. Then A = Q*Q where Q = (E, · · · E..)- 1• Step 2. The columns Qm, · · ·, Q ::z

35 Ul + 3V2 Ut, V2 Ut + V2 1 32 Ut - G Ua,

where

u.

= (~,

-~, ~}

Ut

= (a~' 3 ~,

-

3 ~2}

form an orthonormal set. Hence

3 p ..

0

5 3

V2 3 0

2

3

V2 -6 1

V2

is the upper triangular matrix satisfying P* P = A.

ua = ( 0,

~~ ~)

References

87

5.11 Conjunctive reduction of slcew-hermltian matrica If A E .M.. (C) and A • = -A then H = -iA is hermitian and, by 5.8.2, there exists a nonsingular P E 11! .. (C) such that

+(-/. _,) +

-iP*AP = P*HP = I~c 0..-r,n-r where r = p(A) and k is the index of - iA. Thus A is hermitely congruent to

i/11

+{-iJr-~c) +

On-r,11-t'•

5.13 ConJunctive reduction of two hennitian metrices If A and B are n-square hermitian matrices and A is positive definite hermitian, then there exists a nonsingular P E .J.ll.(C) such that

P*AP =I.. , P* BP = diag (k., · · · , k.) in which k,, (i = 1, · · ·, n), are the characteristic roots of A-1B. If A and B are in }Jf"(R), then P may be chosen in ..U .(R). Let A 111 be the positive definite hermitian square root of A (see 4.14.1). Then X(A -JB) = ).(A li2(A -lB)A -t12) = ).(A -ll:tBA -lit).

The matrix A -mBA -•.rt is hermitian and hence (see 4. 7.18) has real characteristic roots. Thus the numbers k1, · · · , k,. are real.

References Some of the material in this chapter can be found in many of the books listed in Part A of the Bibliography that follows. ~o explicit references are given in such cases. In other cases, however, a n-ference is given by a number preceded by A or by 13 according to whether the work referl't'd to ii:S u. book or a research paper. Separate reference lists and bibliographies are provided for each chapter. §1.9. [A27], p. 81. §1.10. Ibid., p. 89; [B6]; [B15]. §2.4.10. [Bl6]. §2.4.14. [All], vol. I, p. 9; for the history of this theorem and the respective claims by Cauchy and Binet see [A29], vol. I, pp. 118-131.

88

Ch. I

Survey of Matrix Theory

§2.5. [A18], p. 309. §2. 7. [A27], p. 86; [A48], p. 64. §2.7.2. The Sylvester-Franke theorem, as stated here, is due to Sylvester. Franke generalized this result to a theorem on a minor of the rth compound matrix ([A46], p. 108). ActuaUy the Sylvester-Fra.nke theorem given here is virtually due to Cauchy ([A29], vol. I, p. 118). §§2.9-2.ll. [Bl7]. §2.12. [A27], p. 81. §2.12.7. [Bll]. §2.15.n. [A27], p. 84. §2.15.12. Ibid., p. 87. §2.15.14. Ibid., p. 86. §2.15.16. [AS], p. 246. §2.15.17. Ibid., p. 252. §2.16.2. [Al7], p. 94. §2.16.3. [B5]. §2.17.1. [A27], p. 11. §3.28.2. Ibid., p. 18. §3.32.5. [B4]. §4.9. [Al7], p. 94. §t.10.2. [A27], p. 76. §'.10.6. [Bl8]. §'.10.7. [B8]. §'.12.2. [A2], p. 134. §'.12.3. Ibid., p. 134. §§4.12.4, 4.12.5. [A47], pp. 91-93. §4.12.7. [A48], ch. V; [Bl7]. §4.17. [A48], p. 116; [All], pp. 98-100. §4.19.1. [All], vol. I, p. 278. §4.19.2. Ibid., p. 279. §4.19.3. [All], vol. II, p. 4. §4.19.4. [A16], p. 169; [All], vol. I, p. 276. §4.19.5. [All], vol. II, p. 4. §4.19.6. Ibid., p. 6. 1§4.19.7, 4.19.8. [Bl]. §4.19.9. [B10]. §§4.20.2, 4.20.3. [B2]. §4.21.3. [All], vol. I, p. 292. §4.22.2. [B3]. §4.22.3. [A5], p. 241; [All], vol. I, p. 222. §4.22.4. [A48], p. 106. §§4.24.1-4.24.3. [B3]. §§4.25, 4.26. [B19]. §§4.26.1-4.26.3. Ibid. §§4.27.1-4.27.3. Ibid. §4.28.1. [B23]. §4.28.2. [B22].

Bibliograp~

89

§4.28.3. §4.28.4-. §4.28.5.

[B7]. [B20]. [B9]. §4..28.6. [B20]. 14..28. 7. [821]. 14.28.8. [B21]. f5.2. [Bl4], p. 246

Bibliography P~rt

A. Boob

1. Aitken, A. C., DetermiMnll and mab'icu, 4th ed., Oliver and Boyd, Edinburgh

(1946).

2. Amir-Mooz, A. R., and Fass, A. L., Elementa of linear apaus, Pergamon, 3. 4. 5. 6. 7. 8. 9. 10. 11. 12. 13. 14. 15. 16. 17. 18.

New York (1962). Bellman, R., Introductitm to matrix analym, McGraw-Hill, New York (1960). Bode";g, E., Malri.J: C'.alculua, 2nd ed., North-Holland, Amsterdam (1959). Browne, E. T., /mrotludion to Uae theory of determinants and mab"icer, University of North Carolina, Chapel Hill (1958). Faddeeva, V. N., Cotnpul.atwnal rMthoda oflint41' algebra, Do\'er, New York (1958). Ferrar, W. I... , Finite~~ Clarendon, Oxford (1951). Finkbeiner, D. T., lmrodud&on to matrice8 and limar tramfoT'11'14tionr, Freeman, San Francisco (1960). Frazer, R. A., Duncan, W. J., and Collar, A. R., Ekmentary matricu, Cambridge University, wndon (1938). Fuller, L. E., Basic ma.trix Uaeury, Prentice-Hall, Englewood Cliffs, N.J. (1962). Gantmacher, F. R., The theoru of matricu, vola. I and II (trans., K. A. Hirsch), Chelsea, New York (1959). Gelfand, I. M., Lectures on linear algebra, Interscience, New York (1961). Gracub, W., Limare Algebra, Springer, Berlin (1958). Grobner, W., MatrizenreC'hnu1f9, Oldenburg, Munich (1956). Hadley, G., Limar algebra, Addison-Wesley, Reading, Mass. (1961). Halmos, P.R., Finite dimenriMtal t'eclor &paces, 2nd ed., Van ~ostrand, Princeton (1958). Hamburger, H. L., and Grimshaw, M. E., Einear tramforma~wns in n-dimenBional t'er-tor 8pace, Cambridge University, London (1951). Hodge, W. V. D., and Pedoe, D., .Metlwdlofalgebraic(Je(ml£try, vol.l,Cambridge University, London (1947).

90

Ch. I

Survey of Matrix Theory

19. Hoffman, K., and Kunze, R., Linear algebra, Prentice-Hall, Englewood Cliffs, N.J. (1961). 20. Hohn, F. E., Elementary f1UJiri:e algebra, Macmillan, New York (1958). 21. Jacobson, N., Lecturu in abatTact algebra, vol. II, Van Nostrand, New York (1953). 22. Jaeger, A., lnt1'oduction to analytic gemnelry and limM' algebra, Holt, Rinehart and Winston, New York (1960). 23. Jung, H. W. E., ~Ualrizen und Delmninanten, Fachbuchverlag, uipzig (1952). 24. Kowalewski, G., Einfuhru"ff in die Determinantenlheorie, Chelsea, New York (1948). 25. Kuiper, N.H., LimM algebra and geometry, North-Holland, Amsterdam (1962). 26. MacDufJee, C. C., Vectors and mt.ltricu, Mathematical Association of America, Menasha, Wise. (1943). ?:1. MacDuffee, C. C., Til£ thury offfUJiricea, Chelsea, New York (1946). 28. Mirsky, L., An int1'oduction tolineM' algebra, Oxford University, Oxford (1955). 29. Muir, T., Th£qry of determinanta, vol.s. I-IV, Dover, New York (1960). 30. Muir, T., Ctmtributiona to the lautory of detenninanta 1900-19to, Blackie, London (1930). 31. Muir, T., and Metzler, W. H., A treatiae on the themy of determinants, Dover, New York (1960). 32. Murdoch, D. C., Lin« a} (5) are called open half-8pQ.Ce8, while the sets {x E Ul(x, z)

< a}

and

{x E Ul(x, z)

>

a}

(6)

are closed half-8pacr8. Half-spaces, open or closed, are convex. Thus a hyperplane divides U into three disjoint complementary convex sets: the hyperplane itself and two open half-spaces.

1.1 Examples 1.1.1 The set D .. of real n-square matrices whose row sums and column sums are all equal to 1 clearly form a convex set in M ,.(R). For, if A = (a11), B = (b,;) and

then, for any 8,

_I:"

~-·

(8ai;

+ (1 -

8)bi;)

= 8 I:. a,; + (1 ~-· = 8 + (1 - 8) =

Similarly, the column sums of 8A

+ (1 -

8)

1.

E"

b,1

i•l

8)B are all equal

to 1. U

See. 1

ConHx &ts

gs

J E M .(R) denotes the matrix all of whose entries are equal to 1, then A E D., if and only if JA = AJ = J. ~ .!.! 18

The set of all matrices in •.U .(R) each of whose entries is nonnegative convex.

1. 3

Intersection property

The intersection of the members of any family of convex subsets of U is convex.

1.4.1 The set of all vectors of R• with nonnegative coordinates (the n()nnegative orthant of R") is convex. For, this set is the intenection of n

closed half-spaces {z E R"l(z,

ft)

> 0,

k = 1, · · ·, n},

where et denotes the n-tuple whose kth coordinate is 1 and whose other n - 1 coordinates are 0. (See 1.2.19.)

1.4.! A matrix A E 111 .(R) is called doubly atochatic, if all ita entries are nonnegative and all its row sums and all its column sums are equal to l. The set of all n-square doubly stochastic matrices, n., is convex, since it is the intersection of D,. (see 1.2.1) and the set in 1.2.2.

1.4.3 Let WE Jf.(C) be a unitary matrix. Then the matrix S E M.. (R) whose (i,j) entry is IJV i;l 2 is doubly stochastic. For, from 1.4.7.24, the rows and columns of W are unit vectors. Such a doubly stochastic matrix is called ortkostochastic. 1.4.4 The matrix

s=

·(~0 ~1 ~)1

is doubly stochastic but not orthostochastic. This is easily seen as follows. IW.;1 1 = Bij, then W would have the form

If W were a unitary matrix for which

w = (=: 0

But

z1%t =

~~

Za

:.) 1/a

0, by 1.4.7.24, and this is incompatible with

lz,l 1 = l:ttl 1 = t·

Ch.II

Convexity and Mdlrues

1.5 Convex polyheclroM

f. IJ;z of elements of U is called a convex combination or a convex BUm of xs, · · ·, x.,, if all the 81 are nonnegative and ,.f 81 = 1. A linear combination

i•l

1

..

The set of all such convex combinations of x., · · ·, x., is called the convex polyhedron spanned by xs, · · ·, x., and will be denoted by H(xs, · · ·, x.,). More generally, if XC U, then H(X) will denote the totality of convex combinations of finite sets of elements of X. The set H(X) is called the convex hull of X. A convex polyhedron is a convex set. Some elementary properties of the convex hull are: (i) if XC Y, then H(X) C H(Y); (ii) for any X, H(H(X)) = H(X); (iii) if x E H (xs, · · ·, x.,), then H(x, x., · · ·, Xp) = H(xa, • • ·, x,).

1.5.1

There exists a unique finite set of elements y., · · ·, Yr, Yi E H(xs, · · ·, x.,), (j = 1, · · ·, r), such that Yi fl H(yt, • • ·, Yi-h Yi+h • • ·, Yr) and H (y., · · ·, Yr) = H (xa, · · ·, x.,). \Ve prove this statement in two parts. The existence of the set y., · · ·, Yr is established by induction on the integer p. There is nothing to prove for p = 1. If no Xi is a convex combination of x,, (i :F j), then x., · · ·, x., will do for the choice of the y's. Otherwise we can assume without loss of generality that xs E H(xt, · · ·, x11). By induction there exist y., · · ·, Yr such that no y 1 is a convex combination of 1/i, (i :F j); and moreover, H(y., · · ·, Yr) = H(Xt 1 • • · , x.,). But Xt E H(Xt, • · ·, x,) = H(y., • • ·, Yr) so that H(y., · · ·, Yr) = H(x., · • ·, x,). To prove the uniqueness suppose that us, · · ·, u. are in H(x1, • • ·, x11), no u 1 is a convex combination of the Ui, (i :F j), and H(ua, · • ·, u,) = H(xa, • • ·, x,) = H(y., • · ·, Yr). Then Y• = r

u;

=E

r

f3Jc1Jh

E

•-•

t•l

pute that Y•

=

f3Jt

= 1, P11 > 0,

·-·f: (,..j:,

(j

=

= 1, CtJ

> 0 and

1, · · ·, s; t = 1, · · ·, r). We com-

a;f3Jt) y, and hence

(t = 2, · · ·, r). It follows that 1 =

•

•

.I; etlU;, jE a; ,.. •l

E•

i•l

a;/3;•

t ,..

a 1{j11 = 1,

< max Pn < i•l. ··· .•

..

,t.

aj{J1,

= 0,

1. Thus for

some k, 1 < k < s, /3~rs = 1 and therefore {311 , = 0, (t = 2, ... , r). Thus Y• = u•. Similarly, we prove that {yt, · · ·, Yr} is a subset of {us, · · ·, u,} and reversing the roles of the u's andy's completes the argument.

Sec. 1

Convu Sets

97

1.5.2 The unique elements Yt, · · ·, Yr in 1.5.1 are called the vertices of the polyhedron H (x1, • • ·, xp).

1.6 Ex•mple The set of real n-tuples {(at, · · ·, a,.)ja;

>

0,

forms a convex polyhedron whose vertices are e1, (a~, · · ·, a.) =

• • ·,

e. (see 1.4.1). For,

111

E

i-1

a;1JJ

and clearly no e; is a convex combination of the rematmng e., (i ~ j). The convexity of this set follows also from the fact that it is an intersection of a hyperplane with the nonnegative orthant.

1.7 BirkhoiJ theorem The set of all n-square doubly stochastic matrices, n.., forms a convex polyhedron with the permutation matrices as vertices. In order to prove 1.7 we first prove the following lemma. 1.7.1 (Frobenius-Konig theorem) Every diagonal of an n-square matrix A contains a zero element if and only if A has an a X t zero submatrix with s + t = n + 1. To prove the sufficiency assume that A [wiT] = 0•.• where wE Q•.•, T E Q,,,.. Suppose that there is a diagonal j of A without a zero element. Then in columns T the entries of j must lie in A (wjT] E M •-•,,(R), and hence t < n- s. Butt = n - s + 1, a contradiction. To prove the necessity we use induction on n. H A = o...., there is nothing to prove. Assume then that a,1 ¢ 0 for some i, j. Then every diagonal of A (ijj) must contain a zero element and, by the induction hypothesis, A(ijj) contains a u X v zero submatrix with u + v = (n - 1) + 1; i.e., a u X (n - u) zero submatrix. We permute the rows and the columns of A so that if the resulting matrix is denoted by B, then B[1, · · ·, uju + 1, · · ·, n] = o....._.. ; i.e., so that B is of the form B =

(~ ~)where X

v.._.(R). If all elements of any

E M .. (R) and Z E ..

diagonal of X are nonzero, then all diagonals of Z must contain a zero element. It follows that either all the diagonals of X or all those of Z contain a zero element. Suppose that the former is the case. Then, by the

Ch. II

98

induction hypothesis, X must contain a p X q zero submatrix with p + q = u + 1. But then the first u rows of B contain a p X (q + v) zero submatrix and p + (q + v) = (p + q) + n - u - u + 1 + n - u = n + 1. If every diagonal of Z contains a zero element, the proof is almost identical.

1.7.! We are now in position to prove the Birkhoff theorem. LetS E n.. We use induction on the number of positive entries in S. The theorem holds if S has exactly n positive entries; for then S is a permutation matrix. Suppose that S is not a permutation matrix. Then S has a diagonal with positive entries. For, suppose that S contains a zero submatri.x S[CdiT], Cd E Q•.• , ., E Q,.,.. Since the only nonzero elements in the rows Cd of S are in S[Cdl-r), each row sum of S[Cdl-r) is 1 and the sum of all elements in S[Cdl-r) iss. Similarly, the sum of all elements in S(Cdl-r] is t. Now, S[CdiT) and S(Cdl-r] are disjoint submatrices of S and the sum of all elements in S is n. Therefore s + t 0 for all nonzero x and if there exists p > 1 such that f(Ox) = 9,/(x) for all x E U and all fJ > 0 (i.e.,fis hmnogeneom of degree p), then h(x) = (J(x)) 1'" is also convex on all of U. (It is assumed that 8x E U whenever x E U and fJ > 0.} Moreover, if the inequality 2.1(1) for f is strict whenever 0 < fJ < 1 and neither x nor y is a positive multiple of the other, then under the same conditions the inequality 2.1(1) with f replaced by his strict. To see this, first note that if x is zero, then f(x) = 0 = h(x); otherwise h(x) > 0. It follows that if either of x or y is zero, then h(fJx + (1 - 9)y) = 6h(x) + (1 - 8)h(y). Thus for x andy not zero

'(~). "'( 8 hCx) + ~) = ( 1 ( 8 hCx) + o- e>rw)Y'p

'(hex>) = (htx>Yf =

~=

1

=

< ( 8f (hcx>) + 0. If f"(x) > 0 in the interior ofF, then J is strictly convex on F.

2.3.9 If J is a convex function defined on the polyhedron H(x., · · ·, x,) C U, then for all x E H(:r-o, · · ·, Xp) f(x)

(4)

min l(zi).

i•O,···,p

2.3.10 Let S be a convex subset of R" such that for any permutation matrix P E A/,.(R), Pu E S whenever u E S. A function I on S is symmetric, if I(Pu) = f(u) for all u E S and all permutation matrices P E Jf.(R). If f is a convex symmetric function on S and u (u., · · ·, u.) E R•, then

E u,)

f. Ui

I ( •-~ ' · · ., •-~

< l(u).

(5)

" u,fn is the value of every coordinate of To see this, observe that~

·-·

" P"u/n where P is the full cycle permutation matrix (see 1.2.16.2). Then I:

i•l

( t. Ui

t. Ui)

( t.

P"u)

== I "=-~·----i - ·- , ••• , i -I I n n n

II

=

I: f(u) i•l n

= f(u),

the next but last equality following from the symmetry of 1. It can be proved similarly that iff is a concave symmetric function, then

E u,

f. u,)

•l i•l I (i--;-' · · ., --;>I( u.)

(6)

!.4 Exampl• 2.4.1

The function l(t)

= {' log t 0

is strictly convex on the interval 0 tion of 2.3.8.

!.4.2 The function F(X)

...

= .1: ., "

i£ t > 0, if t=O

< t < oo. This is seen by direct applical(z,J), (X

= (z,J) E 0.),

where 1 is the

Sec.3

Classical lnequtJlities

105

function in 2•.t.l, is strictly convex on 0... This follows immediately from 2.3.6 and 2.4.1. !.4.3 The unit ball B,. in R~, is a convex set. For, B,. is by definition the set of all x E R" for which h(:r:) = ~) 111 < 1. Observe that the

(:£ •-1

function/i(t) = t2 is convex on R by 2.3.8 and hence, by 2.3.6,/(:r:) =

" X: :E i•l

is convex on R". From 2.3.7, it follows that (/(x))ll" = h(x) is convex and 2.3.5 completes the argument. The convexity of B,. is of course equivalent to the triangle inequality in R" (see 1.4.5.4). !.4.4 The unit n.-cube K .. , defined u the set of all x = (x~, · · ·, :r:,.) E R• for which l:r:s:l < 1, (i = 1, · · ·, n), is convex. To see this, observe that the function f,(x) = lxa:l = max {xi, -:r:,} is convex by 2.3.4. Then K,. is defined by /(:r:) < 1 wheref(x) = maxf,(x). An application of 2.3.5 proves the assertion. i The function f(x)

!.4.5

= (:f:. lx.:1•) 1111,

(p

&•1

> 1),

is a convex function

on all of R•. For, by 2.3.4, lxa:l is convex and, since p

> 1, 2.3.8 and 2.3.3

Jl

yield that lxil• is convex. Hence

:E lz,IP is convex on R~ by 2.3.6. It follows

i•l

from a direct application of 2.3. 7 that f(x) is convex. !.4.6

Let

f

denote the function in 2.4.5. Then clearly

on all of R• and it follows from 2.3.10 that

f is

symmetric

If u,l• < n~1 .'E lu•IP. &•1

·-·

Claule~llnequalitia

3 3.1

Pow• .......

If lit, • · ·, a. are positive real numbers, then their power mean of order r, mt,., is defined:

3.1.1

(. r'· nai

if r

i•l

mt,.=

cr :E~

~

n

= O, (1)

if r :F 0.

The basic result concenling power mE'.ans follows. For fixed a1, · • · , a. and

r

< s,

< ffi'la == • · · = a,.. ffitr

with equality, if and only if a1

(2)

106

Ch. II

Corwexily and Matrices

3.1.! 1,he following inequalities which are special cases of 3.1(2) will be used extensively: (arithmetic-geometric mean Inequality) (3) (4)

3.1.3 Let 1 0,

(i = 1, 2, · · ·, n); then

i.e., the function

-~ B,em•-•

""·•-•

is convex for 1

< p < 2 on the positive orthant of R•.

3.2 Symmetric Fundions 3.!.1 (Newton's lnequelitla) Let a, E R, (i = 1, · · ·, n), and let Er(a., · · ·, a.) be the rth elementary symmetric function of the a, [see 1.2.1(1)]. Let

Pr = Erl(;) be the rth weighted elementary symmetric junc-

tion of the a,. Then for fixed

a~,

· · ·, a. and 1 < r

pill > pll3 > . . . > p!l•

(2)

with equality if and only if the a, are equal. 3.!.3 Let a = (a., ···,a..) and let (a., · · ·, a., a,.+•>· If 1 < r < k < n, then ~

a

< p~+l(ll)

pt(a) - Pt+ 1(ll)

with equality if and only if a1 = · · · = tlt.+t·

denote the (n

+ 1)-tuple (3)

Su:.3

ClassiCIJl luqualitils

3.!.4 Let A. =

1(11

"t. ai/n and G. = { E ai)

•-1

i-1

3.2(3) becomes

11 "·

Then for r = 1, k =

n,

< (~)•+t (A,.)• G,. G-+t

with equality if and only if a1

= ···

a.+J.

=

3.!.5 Let a = (a., · · ·, a.) and b = (b., • · ·, b,.) be n-tuples of positive numbers and Er(a) denote the rth elementary symmetric function of a1, • · · , a.. Then [Er(a

+ b)]llr >

[Er(a)] 1'"

+ [E,(b)]"',

i.e., the function EV' is concave on the positive orthant of R•, with equality if and only if a = tb, (t > 0). 3.!.6 If a = (at, ···,a.) and b = (b., ···,b.. ) are n-tuples of nonnegative numbers each with at least r positive coordinates and 1 ( -

E,.(a) )•'' E,._,(a)

+(

E,.(b)

E,._,(b)

)'I•'

with equality if and only if a = tb, (t > 0); i.e., the function (E,fE,_,) 1'• is concave on its domain in the nonnegative orthant.

= Tl/A.•l (a) in which = I: 8t.il, · · · &i,.ar •·+·· ·+•·-'

3.!. 7 Let /(a) T,,.l.•>(a)

&,

=

!(:)

if k

( -1)' (:)

···

a!",

(iJ

-> O·' J. -

1' · · •, n),

> 0, if k

< o,

and k is any number, provided that if k is positive and not an integer, then n < k + 1. Then on the nonnegative orthant in R• the function f is convex for k < 0 and concave for k > 0. For k = 1, /(a) = E,(a) and for k = -1, f(a) = h,.(a) (see 1.2.15.14). 3.!.8 Let h,(a) = ~..

a.

be the nth completely symmetric function

(see 1.2.15.14o) of the n-tuple a = (a1, • • ·, a.) of positive numben. It follows from 3.2.7 that (~(a+ b))llr < (h,.(a))l/r + (h,(b))llr, i.e., the function ~ 1' is convex on the positive orthant.

Ch.II

1(8

Convtxity and Matrices

3.3 Hiilder inequelity

If x = (xa, · · ·, x,.), y = (y1, of R• and 0 < 8 < 1, then

f

J•l

XJ11J

> 0, (i =

y.) are vectors in the nonnegative orthant

< ( i•l f. xY')' (J•l f:. yJ'1 -•)•-'

with equality, if and only if linearly dependent. Let 81 > 0, a,

• • ·,

(xl", · · ·, x!11)

(1)

and (yl 1u-'>, · · ·, y!IU-'>) are

•

1, · · · , k), and .I: 9, = 1. Since the log function

•-1

is concave in (0, co), we have

Thus (2)

> 0 and 0 < 8 < 1 a'b 1_, < fJa + (1 - 9)b with equality if and only if either a= b or 9 = 0 or 1. Let 8 > 0 and in 3.3(2) set In particular, for any a, b

b=

(3)

yyu-'>

:E" yJIU -I> J•l

Sum up the resulting inequalities for j = 1, · · ·, n:

This gives the Holder inequality 3.3(1). The inequality clearly holds, if some (or all) of the XiJ 1/J are 0.

3.3.1

The inequality 3.3(1) can be extended to complex numbers: (4)

Sec . .3

Classieallnequalities

3.3.2 For 8

=l

109

the inequality 3.3(1) becomes

.E x;y; < ,_, I: x' ,_, "

(

"

l: YJ )lll. ,..1

)''' (

ll

This is a special case of the Cauchy-Schwarz inequality (see 1.4.5.3).

3.4 Mlnlcowslcl Inequality

If GiJ

> 0,

(i

= 1, · · ·, m;j =

1, · · ·, n;O

< fJ < 1), then

( E" ( I:. a,s)''')' < E"' ( I:" aJf')' j•l

i•l

·-·

(1)

j•l

with equality, if and only if the rank of the matrix (a,1) does not exceed 1. We apply the Holder inequality to prove 3.4(1), the Minkowski inequality. In 3.3(1) put x; = aii, y 1 ==

.. )(1-f)/f, and observe that (,_, .E a,1

t. a,; {f. a•;),, < (i:, alf)' (t (f a,;)'' )•_,,

,..

,..

J•l

J•l

(i = 1, .. ·, m).

·-·

Now, sum up with respect to i, noting that

.. I:" au ( I:.. a;;)(1-f)/f == E" ( E.. ai;)1"• I: i•l i•l i•l

.... i=l

Thus that iB,

GE II

(

·-·

3.4.1

Eaij)l/f)f ( ,!,1. Xi+ Yi = (

+ ! t Yi n1-1 Xi+ Yi

)1/tt (

y·

n

+ i!,fl Xi+ Yi

)1/rt

[by 3"1(3)],

'

I

c~l x.)''" + c~. y)lf") c~. (x; + Yi) )''".

3.5.2 (Kantorovich Inequality) Let f be a convex function on [X,., x.], where x. > · · · > X,. > 0. Suppose that tf(t) > 1 on [X,., XI] and let c > 0. On the (n - I)-simplex H(e 1, • • ·, e,.) = 8,._ 1 define

Then 1

< F(a) < max

The choice f(t) = r 1

1

{ex. + ~/(X.),

eX"

+ ~/(X,.)}

(2)

and c = (X 1X,.)- 1' 2 yields

[(X•)l/t X,. + (Xx: )112]2.

1 < :E a;X; :E a,x.- 1 < 4

The right side of 3.5(3) is achievable with a1 = (i ~ 1, n).

i,

~ =

i,

(3)

and a; = 0,

3.5.3 (ugrans• Identity)

a:) (f:. bf)(t a,-b,)2"'"' .t (a,-bj- a,-b,) ·-· ·-· ·-· ,,_.

( f,

i

2•

(4)

Yt > · · · > y,.,

x. + x'l + · · · + Xt :r. + X2 + · · · + x,.

::::;;

Y•

= y,

+ Yt + · · · + y~c,(k + Y2 + · · · + y,.,

>

Xt

> ··· >

x,., y 1

= 1, · · ·, n - 1),

then there exists S E n.. such that x = Sy. In fact S may be chosen to be orthostochastic (see 1.4.3).

&e. 3

Classical lneiJUtllitils

111

The vectors x = (5, 4, 3, 2) and 11 = (7, 4, 2, 1) satisfy the conditions of 3.5.4. We construct an S E ~such that x = Sy. Call the number of nonzero coordinates in 11 - z the diBCrepancy of x and 11; in this example the discrepancy of x and y is 3. Let

& - ( 1: •

~ ~ D· 1

•

Then

~::+27)

&y = (

Choose 8 so that the discrepancy of x and 8111 is less than that of x and 11; i.e., so that either 58+ 2 = 5 or -58+ 7 = 3. The theory guarantees that at least one of these values for 8lies in the interval [0, 1]. Let 8 = f. Then 8,.

:.

c0 0) 1 f0 0 f 0 f 0 0 0 0 1 0

E 0.

and

B~=(D which still ditJers from x in its last two entries.

&-

(~ ~ ~

ut

~

0 0 a 1-a)· 0 0 1-a a

Then

&St~~

- ( 3a

~

I )·

-3a+4

Choose a so that 3a + 1 = 3 and - 3a + 4 = 2. Again the theory guarantees the existence of such a in the interval [0, 1]. Thus a= f and

Ch.ll

112

St-

c00)

Convexity and Matrices

0 1 0 0 . 0 0 • i 0 0 i f

The matrix

8-M-(i

o)

0 1 I0 0 0 I l 0 t f

is in n. and has the property that Sy = x. In general, the method is to construct a sequence {81} of doubly stochastic matrices with 1's in all but two rows (the hth and the kth row) such that SiBi-l · · · S1y and x have at least one discrepancy less than sj-1 ••• Sly and x; k is the least number such that (Si-1 ... SJY)t - Xt < 0 and h is the largest number less than k and such that (81-1 • · • S1y)• x,. > 0.

4 Convex Functions end Metrlx lnequelities 4.1

Convex functions

of metrlca

Given a convex function f and a normal matrix A we obtain a result relating the values off on a set of quadratic forms for A to the values off on some subset of the characteristic roots of A. Some important theorems which amount to making special choices for A and f will follow in the remaining subsections of 4.1. 4.1.1 Let f be a function on Ct with values in R. If X = (X1, and S E 0.., then we may define a function g on 0.. to R by g(S) - j((Su, X), • · ·, (S · · · > a. (see 1.4.12.2), then

•

• IAJ! < n a" n i•l i•l

(k- 1 · .. n)

'

, '

IA•I > · · · > IA.I, (9)

with equality for k .. n. The proof of this statement fork- 1 is easy. Simply take Zt to be a unit characteristic vector of A corresponding to the characteristic root -"•· Then IA111 == (Az~, Az1) = (A*Az., z.). But A*A is nonnegative hermitian and ita largest characteristic root is af. Hence, by 4.1.5 with k - 1,

776

Cia. II

(A*Ax.,

Xa) -al < al.

Convexity and M atricts

..

Now apply the preceding to the matrix c.(A) .

The largest (in modulus) characteristic root of C~:(A) is

..

,_n >.;

(see

Observe that (see 1.2.7.1, 1.2.7.2) (C~:(A))*C~:(A) = C~r(A *)C~:(A) - C.t(A *A) and so the largest singular value of C~r(A) is

1.2.15.12).

..

n

i•l

a;. Thus the result for k - 1, applied to C~:(A), yields 4.1(9). The

= n because ld(A)I

equality in 4.1(9) occurs fork

- (d(A *A}) 111•

4.1.10 If A and B are nonnegative hermitian matrices, define E"(A), A,.(A ), B,(A) to be respectively the elementary symmetric function of the characteristic roots of A [see 1.2.1(1)], the completely symmetric function of the characteristic roots of A (see 1.2.15.14) and finally the function defined in 3.1.3 all evaluated on the characteristic roots of A. Then ( E 2 (A

Ep-.~:(A

+ B) ) 1'* > ( +B)

-

(n

E 2 (A) )lit+ ( A'2 (R) ) E,._.(A) E,._.(B)

>p>k>

1'•,

( 10)

1);

~1•(A +B) 0 and x E C• is a unit vector, then

4.3.1

1

d(A)

+ d(B).

(1)

This follows immediately from 4.1.8.

4.4.1 If A E M.(C) is nonnegative hermitian and wE Q••• , then d(A) < d(A[wlw])d(A(wlw)).

(2)

118

Ch. II

Convexity and Malricts

For let P be the diagonal matrix with -1 in the main diagonal positions corresponding to w and 1 in the remaining main diagonal positions. Then B = PAPT is also nonnegative hermitian and B[wlw] = A[wlw], B(wlw) = A(wlw), B[wlw) = -A[wlw), and B(wlw] = -A(wlw]. Then the Laplace B yield expansion theorem (see 1.2.4.11) and 4.1.8 applied to A

+

(d(2A [wlw])d(2A(wlw)))lln

= (d(A + B))lltt > (d(A)) 11" + (d(B)) 11• = 2(d(A))11n,

and 4.4(2) follows. 4.4.3

If A and B are in M .. (C), then lp(AB)I 2

< p(AA *)p(B*B)

(3}

with equality, if and only if one of the following occurs: (i) (ii)

a row of A or a column of B consists of 0, no row of A and no column of B consists of 0 and A • = BDP where Dis a diagonal matrix, and Pis a permutation matrix.

In particular, lp(A)I 2

< p(AA*),

lp(A)I 2

< p(A*A).

To see 4.4(3), apply the Schwarz inequality (see 1.4.5.3, 1.4.6(2), 1.2.12(1), 1.2.12.2) to obtain lp(AB)I 2 = I(P.. (AB)e. · · · e.. , e1 • • • e,.)l 2 = I(P,.(A)P.. (B)e. · · · e,., e1 • • • e.. )l 2 = I(P.(B)el · · · e,., P,.(A*)el · · · e.)l 2 < (P,.(B)el · · · e,., P .. (B)el · · · e")(P"(A*)e. · · · e,.,P.(A*)e1 • ··e.) = (P.(B*B)el · · · e,., e1 • · · e.)(P"(AA *)e1 • • • e,., e1 • • • e.. ) = p(AA *)p(B*B).

We note that, by 1.4.6(2), e1 · · · e. is a unit vector. The discussion of equality is omitted. 4.4.4 If A E M .. (C) is nonnegative hermitian, then d(A)

< p(A).

(4)

For, by 1.5.10.1, there exists X E AI.. (C) such that X is upper triangular and X*X =A. Hence, by 4.4.3, d(A)

= d(X*X)

= d(X*)d(X}

= p(X*)p(X) < v' p(X*X)p(X*X) = p(A).

4.4.5 If A E AI,.(C) is nonnegative hermitian with characteristic roots ).1 > · · · > X,. > 0 and w E Q~:,n, then (5}

Set:. 4

Convex Functions arul Matrix Inequalities

719

and d(A[wlw]) < p(A[wlw]). (6) The inequality 4.4(6) follows from 4.4(4) whereas 4.4(5) results from applying the fact (see 4·.1.5) that (C.(A )eia 1\ · · · 1\ e;., e,. 1\ · · • 1\ et.) is between the largest and smallest characteristic roots of C11(A) (see 1.2.15.12). We note that by 1.4.6(1), eta 1\ · · • 1\ ~is a unit vector. 4.4.6 Let A E M .(C) be nonnegative hermitian and let P. =

n

wECb.-

d(A [wlw]).

Then

Pt > P!/("i 1 ) > · · · > P!~\::l) > P..

(7)

To prove this observe first that the Hadamard determinant theorem (see 4.1.7) together with 1.2.7(6) applied to C.-t(A) implies (d(A))n-t = d(C-t(A))

(r

+ l)Nr-1Nr+1(r'N:- (r + l)Nr-tNr+t)-

1

for 1 < r < p(A ). Equality holds, if and only if p(A) = r or p(A) all nonzero characteristic roots of A are equal.

> r and

4.4.16 If A E M .. (C) is nonnegative hermitian, wEQ..-.t ... , then

"•+•+t(A) < "-+•(A[wlw]) + "'+t(A(wlw)), (0 < a < n - k - 1, 0 < t < k - 1).

4.5

Hacllmarcl proclud

If A and B are in M .(C), then their He&dam4Ttl. product is the matrix H

= A•B E },f.(C) whose (iJ) entry is a;1bii·

4.5.1 The Hadamard product A • B is a principal n-square submatrix of the Kronecker product A ® B.

Sec. 5

4.5.!

Nonnegative Matrices

121

If A and Bare nonnegative hermitian matrices, then so is A • Band

""(A)>."(B) < "t(A • R) < "l(A)>.1(B). To prove 4.5(1), directly apply 4.5.1, 1.2.1S.ll, and 4.4.7.

4.5.3 If A ( .tlf,.(C) is nonnegative hermitian and h,1 H = (h,;) is nonnegative hermitian and >-t(H)

.1(A)) 2•

For, H = A • AT is a principal suhmatrix of A negative hermitian. Apply 4.5.2.

® A" and is thereby non-

4.5.4 If A and B in M "(C) are nonnegative hermitian, then d(A • B)

+ d(A )d(B) > d(A) ill•"l b" + d(B) in•"l a,•.

5 Nonn•s•tive M.trices 5.1

Introduction

:Most of the inequalities and other results in the preceding sections of this chapter concern positive or nonnegative real numbers. The theorems in section 4 are mostly about positive definite or positive semidefinite matrices which are, in a sense, a. matrix generalization of the positive or nonnegative numbers. A more direct matrix analogue of a positive (nonnegative) number is a positive (nonnegative) matrix. A real n-square matrix A = (a;;) is called positive (nonnegative), if a,1 > 0 (a,;> 0) for i,j = 1, · · ·, n. 'Ve write A > 0 (A > 0). Similarly, ann-tuple x = (xh · · ·, x") E R" is poaitive (nonnegative) if x, > 0 (xi> 0), (i = 1, · · ·, n). We write x > 0, if x is positive and x > O, if x is nonnegative. Doubly stochastic matrices, which played such an important part in the preceding section, form a subset of the set of nonnegative matrices. We shall discuss doubly stochastic matrices in detail in S.ll and 5.12. Some obvious properties of nonnegative n-square matrices can be deduced from the fact that they form a convex set and a partially ordered set. The main interest, however, lies in their remarkable spectral properties. These were discovered by Perron for positive matrices and amplified and generalized for nonnegative matrices by Frobeniu.s. The essence of Frobenius' generalization lies in the observation that a positive square matrix is a special case of a nonnegative indecomposable matrix.

Ch./1

122

5.!

Conoexity and M altius

Indecomposable matrices

A nonnegative n-square matrix A = (ai;) (n > 1) is said to be decomposable if there exists an ordering (i~, · · ·, ip, j1, · · ·, j,) of (1, · · ·, n) such that tli.i, = 0, (a = 1, · · ·, p; {J = 1, · · ·, q). Otherwise A is indecomposable. In other words, A is decomposable, if there exists a permutation matrix P such that P APT = (

g ~) where Band D are square. Some writers use the

terms reducible and irreducible instead of decomposable and indecomposable. There is no direct connection between decomposability in the above sense and reducibility as defined in 1.3.31. Some preliminary properties of indecomposable n-square matrices follow. 5.!.1 Let e1, · · ·, e" be the n-vecrors defined in 1.2.19. Then A > 0 is decomposable, if and only if there exists a proper subset {j1, · · ·, jp} of {1, · · ·, n} such that (Aei., · · ·, Aei") C (ei., · · ·, ei").

For example, the full-cycle permutation matrix 0 1 0 0 0 1

... ...

0 0 0 0

0 0 0 1 0 0

... ...

0 1 0 0

(1)

is indecomposable because Pe; = e;-1 (eo is ro be interpreted as e..) and therefore (Pe;,, · · ·, Pei") C (~·., · · ·, ei) means (ei.-1, · · ·, e;"-1)

=

(ej,, · · ·, ei"),

an impossibility unless p = n. 5.!.! If A > 0 is indecomposable and y > 0 is an n-tuple with k zeros, (1 < k < n), then the number of zeros in (I,. + A )y is less than k. 5.!.3 If A is an n-square indecomposable nonnegative matrix, then (I,. + A )"- 1 is positive. 5.!..4 If A is an indecomposable nonnegative matrix and x is a nonzero, nonnegative n-vector, then Ax ¢ 0". 5.!.5 If an indecomposable nonnegative matrix has a positive characteristic root rand a corresponding nonnegative characteristic vector x, then X> 0. 5.!.6 Denote the (i,j) entry of A k by al11• A nonnegative square matrix A is indecomposable, if and only if for each (i,j) there exists k ""' k(i, j) such that a!J' > 0.

&c. 5

Normegati111 Matriets

123

Note that k is a function of i,j and that for no k need all a!1> be positive. For example, the permutation matrix P of 5.2.1 is indecomposable and yet, for any k, P" contains n 2 - n zeros.

5.1.7 If A > 0 is indecomposable, then for each (i,j) there exists an integer k, smaller than the degree of the minimal polynomial of A, such that

t4:) > 0.

5.3 Exempl• 5.3.1 If a doubly stochastic matrix Sis decomposable, then there exists a permutation matrix P such that PSPT is a direct sum of doubly stochastic matrices. For, if Sis n-square, there exists, by the definition of decomposability, a permutation matrix P such that

PSP'~'""" (~ ~).where B is u-square

and Dis (n - u)-square. Let cr(X) denote the sum of all the entries of the matrix X. Then cr(B) is equal to the sum of entrieJJ in the first u rows of PSPT and therefore cr(B) = u. Similarly, cr(D) = n - u. Hence cr(PSF) = cr(B) cr(C) cr(D) = 11 cr(C). But u(PSPT) = n as PSF E o•. Therefore cr(C) = 0 and, since C > 0, C must be a zero matrix and PSPT = B +D. The matrices B and D are clearly doubly stochastic.

+

+

5.3.1 If A

>0

+

is indecompoeable and B

(AB)ii = 0 for some i, j, then

E"

.t:-1

a;.~rb~t:j

> 0,

then AB

> 0.

For, if

""" 0 and, since all the b,l are

positive and the at.~t: are nonnegative, we must have aik = 0, (k = 1, · · ·, n). This contradicts the indecomposability of A.

5.-4 Fully lnclecotnpouble metrica If ann-square matrix A contains an' X (n- 1) zero submatrix, for some integer s, then A is said to be partly decomposable; otherwise fully indecomposable. In other words, A is partly decomposable, if and only if there exist permutation matrices P and Qsuch that P A Q = (

~ ~) where Band

D are square submatrices. Clearly every decomposable matrix is partly decomposable and every fully indecomposable matrix is indecomposable. For

example,(~

!) is partly decomposable but not decomposable.

5.-4.1 If A > 0 and p(A) = 0, then A is partly decomposable. For, by the Frobenius-Konig theorem (see 1.7.1), A must then contain an ' X (n - ' 1) zero matrix.

+

124

Ch. II

Con1.1exity and Matriees

5.-4.2 A fuUy indecomposable n-square matrix can contain at most P is the matrix in 5.2(1), then I,. + P is fully indecomposable and therefore I,. + P + B is also fully indecomposable

n(n - 2) zeros. In fact, if

for any nonnegative matrix B. 5.-4.3 If A is fully indecomposable, then there exists a number k such that A" > 0 for all p ~ k. The converse is not true; i.e., if an indecomposable matrix has the property that A P > 0 for all p > k, it is not necessarily fully indecomposable. For example,

A=(~

!}

(A 1

>

0).

5.-4.-4 If A is an n-square nonnegative matrix and AA r contains an s X t zero submatrix such that s + t = n - 1, then A is partly decomposable. It follows that if A is fully indecomposable t.hen so is AA r. 5.-4.5 If A > 0 is fully indecomposable and AAT has

>

0) zeros in its ith row, then the number of zeros in the ith row of A is greater than t. t (t

5.-4.6 If A > 0 is a fully indecomposable n-square matrix, then no row of A A r can have more than n - 3 zeros. 5.5 Perron-Frobenlus theorem We now state the fundamental theorem on nonnegative matrices. It is the generalized and amplified version, due to Frobenius, of Perron's theorem.

5.5.1 (i)

(ii) (iii)

(iv) (v)

Let A be an n-square nonnegative indecomposable matrix. Then: A has a real positive characteristic root r (the maximal root of A) which is a simple root of the characteristic equation of A. If ).i is any characteristic root of A, then l>.il < r. There exists a positive characteristic vector corresponding tor. If A has h characteristic roots of modulus r: Xo = r, >.1, · · ·, >.,._~, then these are the h distinct roots of >." - r" = 0; 1l is called the index of imprimitivity of A. If ~. >.., • • ·, x.-1 are all the characteristic roots of A and 8 = e'tw'", then XrA, • • ·, ).,._18 are Xo, >.1, · · ·, x.. -1 in some order. If h > 1, then there exists a permutation matrix P such that

0 0

A11

0

0 Au

··· ···

0 0 0 0

PAPT = 0 0 0 · · · 0 A~a-u Au 0 0 ··· 0 0 where the zero blocks down the main diagonal are square.

&,. 5

Nonntgatiw Matrius

125

Any proof of the Perron-Frobenius theorem is too long and too involved to be given here. We list, however, some important related results.

5.5.1 let P• denote the set of n-tuples with nonnegative entries. 14"'or a nonzero z E P• and a fixed nonnegative indecomposable n-square matrix A define: . (Ax)i ( ) rz==mm ,

a(z)

(x,

Xi

i

~

0),

(1)

(A..-\.

=max~. i

(2)

X;

In other words, r(x) is the largest real number a for which Ax - ax > 0 and s(x) is the least real number {j for which Ax- (Jx < 0. Then for any nonzero x E P 11 r(x) < r < a(z). (3)

Also

(4)

r = max r(x) = min a(z). zEP.

zEP.

5.5.3 If A > 0 is indecomposable, r is the maximal root of A and B(x) = adj (xi.- A), (x E R), thenB(x) for x > r.

> Oforx > r. Moreover (xi.- A)- 1 > 0

5.5.4 If A E J.lf.. (R) is the matrix defined above and C(:X) is the reduced adjoint matrix of 'AI" - A, i.e., C(X) = B(>.)//..-1 where /a-1 is the gcd of all (n - i)-square subdetenninants of XI .. - A (see 1.3.15), then C(r) > 0. Here 'A is of course an indeterminate over Rand C(r) is the matrix obtained from C('A) by replacing X by the maximal root r of A. 5.5.5 If A > 0 is indecomposable, then A cannot have two linearly independent positive characteristic vectors.

5.5.6 If A > 0 is an indecomposable matrix in M .(R), then the maximal characteristic root of A(wlw), (w E imal characteristic root of A.

Q"·•• 1 < k < n), is leas than the max-

II

Let u = (u1,

• • ·,

u.) and define u(u) to be the sum

.I:

J=l

UJ.

Let A

> 0 be

an indecomposable matrix with maximal root rand (unique) characteristic vector x corresponding tor chosen so that cr(x) = 1. If R > 0 is any nonzero matrix commuting with A, then Rx/cr(Rx) = x. For, Ax = rx and therefore RAx = rRx; i.e., A (Rx) = r(Rx). Since Rx ~ 0, it must be a characteristic vector of A corresponding to r and therefore a nonzero multiple of z (see 5.5.5). Hence Rx/cr(Rx) = x.

Ch.Il

1~

Convexity tJnd MtJtriels

5.7 Nonnegative m•lrlcet For nonnegative matrices which are not necessarily indecomposable the following versions of the results in 5.5 hold. 5.7.1 If A > 0, then A has a maximal nonnegative characteristic root r such that r > 1>-il for any characteristic root X; of A. 5. 7.I If A > 0, then there exists a nonnegative characteristic vector corresponding to the maximal characteristic root of A. 5.7.3 If A > 0 and B(x) = adj (xi. -A), then B(x) = (b,1(x)) > 0 for x > r; moreover, db,;(x)/dx > 0 for each i,j and x > r, where r denotes the maximal root of A. 5. 7.4 If C(>.) is the reduced adjoint matrix of >.I .. - A (see 5.5.4), then C(>.) > 0 for >. > r. We list below some further related results on nonnegative matrices. 5.7.5 If A > 0 has maximal root rand 'Y is any characteristic root of a matrix G E },f n(C) for which luiil < ai;, then hi < r. If, in addition, A is indecomposable and hi = r, i.e., 'Y = re"', then G = e"'DAD-• where ld,;l = 6i;, (i, j = 1, · · ·, n). 5.7.6 If A > 0 has maximal root r and k < n, then the maximal root of A(wlw), (w E Q~: ... , 1 < k < n), does not exceed r. 5.7.7 Let A and B(x) be the matrices defined in 5.7.3. Then A is decomposable if and only if B(r) has a zero entry on its main diagonal. 5.7.8 Let A be the matrix in 5.7.3. Then every principal subdeterminant of xi,. - A is positive if and only if x > r. 5.7.9 If A > 0 has a simple maximal root rand both A and A2' have positive characteristic vectors corresponding tor, then A is indecomposable.

5.8 Exampla 5.8.1 Let A and B be nonnegative matrices and suppose A • B is their Hadamard product (see 4.5). Let >..v(A ), X.v(B), >...,(A • B) denote the maximal roots of A, B, and A • B respectively. Then >...,(A • B)

...,(B).

For, by 4.5.1, A • B is a principal submatrix of A® Band the maximal root of A ® B is X.v(A )>.,(B) (see 1.2.15.11). It follows therefore from 5.7.6 that >...,(A • B) < X.v(A )>..v(B).

5.8.1 Let A

= (a,1)

E M ,.(R) be a nonnegative matrix with a maximal

&c. 5

NDrwgatiw Matrices

1Z7

root r not exceeding 1. Then a necessary and sufficient condition that there exist a permutation matrix P for which PAPT is triangular is that

n"

(1 - a.. ) = d(l,. -A).

i•l

(1)

For, the condition is clearly necessary. To prove the sufficiency consider a nonnegative matrix C = (c;;) E ll/,.(R), with a maximal root 8 not exceeding 1, and consider the function Ill

.....

f(t) = II (t -

d(tl .. - C).

C;;) -

We usc induction on m to show tbatf(t) > 0 fort> J'(t) =

E{II (t -

i•l

Cjj) -

jfdl

8.

(2)

Diiferentiate 5.8(2),

d{t/a-1 - C(ili))}·

(3)

Since t > Band a is not smaller than the maximal root of C(ili) (see 5.7.6), each summand in 5.8(3) is nonnegative by the induction hypothesis. Thus f'(t) > 0 fort> 8. Now,

since a > cii, (i = 1, · · ·, m), by 5.7.6. Hencej(t) > 0 fort > 8. Moreover, if Cis indecomposable then 8 > Ci;, (i = I, · · ·, m), (see 5.5.6) and/(s) > 0 and thus j(t) > 0 for t > B. It follows that if C is indecomposable, then /(I) > /(a) > 0; i.e., Ill

II (1 - cH)

i-1

> d(l ..

- C).

Now, choose a permutation matrix P such that PAPT = B

"' B,, = ):'

·-·

where each B, i8 1-square or possibly n,-square indecomposable. We show that the second alternative is impossible and therefore that P APT is triangular. Since the b,i are just a permutation of the ai;, 5.8(1) implies that

.

.

n

D (1- a ..)= i•l

"'

(1- b.")= d(l.- A)= d(J,.- B)= II d(I,..- B~:).

·-·

... 1

Supp08e that there exists a q such that B, by the above argument,

and therefore

. ·-· ..

n

·-1

a contradiction.

= C i8 indecomposable.

D (1 - c,,)

> d(l.,- B,)

d(lrt~~- B~c)

< n"

i•l

(1 - a.i),

Then,

Ch.II

128

Convexily and Matrices

5.9 Primitive metrlca let A be an indecomposable nonnegative matrix with maximal root r. Let h(A) be the number of characteristic roots of A each of which has the property that its modulus is equal to r. Then A is said to be primitive, if h(A) = 1. We recall that the number h(A) is called the index of imprimitivity of A (see 5.5.1).

> 0 is an indecomposable matrix with characteristic polynomial ).• + a.>-"' + · · · + a,>.n•, (a, ~ 0),

If A

5.9.1 then

h(A)

= gcd (n -

n1, n1 - nt, · · ·, n,_. - n,).

(1)

5.9.! A > 0 is primitive if and only if AP > 0 for some positive integer p. Moreover, if AP > 0, then A 9 > 0 for q > p. 5.9.3 If A > 0 is primitive, then A 9 is primitive for any positive integer q. 5.9.4 If A > 0 is an indecomposable matrix with index of imprimitivity h, then there exists a permutation matrix P such that

PAAPT

= E' " A,

·-·

where the A, are primitive matrices with the same maximal root. 5.9.5 A fully indecomposable matrix is primitive (see 5.4.5). The converse is not true.

5.9.6 If A > 0 is primitive, then there exists a positive integer p, (1 0. If aii > 0, (i = 1, · · ·, n), then A •-• > 0. 5.9.7 If A

>

lim

B where p(B) = 1 and BC'> is a positive multiple of the posi-

•-·

(A )•

=

0 is primitive with maximal root r (see 5.5.1), then

T

tive characteristic vector of A corresponding tor. It follows from this that

·-·

(i,j

= 1, · · ·, n).

(2)

Recall (see 5.2.6) that tJL•> is the (i,j) entry of A •. Moreover, if A > 0 is assumed only to be indecomposable, then 5.9(2) is equivalent to A being primitive.

5.10 Exemple 5.10.1 let P be the full-cycle permutation matrix in 5.2(1) and Ei,i the matrix with 1 in the (i,i) position and zero elsewhere. Then P + E,,, is

Sec. 5

Nonntgativt Matrices

129

primitive. For, P + Ei,i is an indecomposable matrix with a positive trace. Therefore in 5.9(1) the number n - n1 is equal to 1 and thus h = 1. More generally, if A is any indecomposable matrix and B any nonnegative matrix with a. positive trace, then A + B is primitive.

5.10.2 The least int.eger p for which a primitive matrix A > 0 satisfies AP > 0 is not in general less than (n - 1) 2 + 1 (see 5.9.6). For, consider the primitive matrix A = P + E,.,:a where Pis the full-cycle permutation matrix in 5.2.1 and E,.,t is the matrix with 1 in position (n,2) and 0 elsewhere. Then A 11 contains zero entries fork

5.11

< (n

- 1) 2

+ 1.

Doubly stochastic m1trlca

A matrix A > 0 is doubly stochastic, if all its row sums and all its column sums are equal to 1 (see 1.4.2). Clearly a nonnegative matrix A E M ,.(R) is doubly stochastic, if and only if J .A = AJ,. = J,. where J. E. .M,.(R) is the matrix all of whose entries are equal to 1/n. For example, permutation matrices as well as matrices of the form k- 1 V where V is a (v,k,X)-matrix (see 1.2.4.10) are doubly stochastic. The set of all n-square doubly stochastic matrices, 0., forms a convex polyhedron with the permutation matrices as vertices (see 1.7).

5.11.1

5.11.2 If S E 0., then for each integer k, (1 < k < n), there exists a diagonal d of 8 containing at least n - k + 1 elements not smaller than p., where 4k if n + k is even, { (n + k) 2 4k

p. --

(n

+ k) 2 -

1

if n

+ k is odd.

5.11.3 If SEn. and more than (1L- l)(n- 1)! terms in the expansion of p(S) have a common nonzero value, then A = J ... 5.11.~ Every S E o. can be expressed as a convex combination of fewer than (n - 1) 2 + 2 permutation matrices.

5.11.5 If S E n., then ((n - 1)2

+ 1) 1- .. < p(S) < 1.

The upper bound is attained, if and only if S is a permutation matrix.

5.11.6 (v1n cl• W1erclea conjecture) It is conjectured that if S E 0,., then 'P

(S)

>

n! - n•

1JJ

Ch. II

Convexity and Matrices

with equality, if and only if S = J ... This conjecture is still unresolved. Some partial answers are given below.

5.11.7 If A En.. and p(A) =min p(S), then BEO.

(i) p(A (iii)) = p(A ), if ai; '#- 0, (ii) p(A (iii)) = p(A) + {J, if ai; = 0 where fJ 5.11.8 If A E Ow satisfies p(A)

= min p(S), BEo.

>0

is independent of i,j.

then A is fully indecom-

posable. 5.11.9 If A E Ow satisfies p(A) = min p(S) and A &Eo.

> 0,

then A = J ...

5.11.1 0 If A E n,., (A '#- J ,.), and the elements of A - J. are sufficiently small in absolute value, then p(A) > p(J.). 5.11.11 If A E n. is indecomposable, then its index of imprimitivity, h, divides n. Moreover, there exist permutation matrices P and Q such that A

PAQ =

where A; E n,.,,., (j

= 1,

I:" A1

i•l

···,h).

5.11.12 By the Birkhoff theorem (see 1.7), every A En. can be represented (in general, not uniquely) as a convex combination of permutation matrices. Let {J(A) be the least number of permutation matrices necessary to represent A E n. as such a combination. Then, if A is indecomposable with index of imprimitivity k, {J(A)

Si;, (i,j = 1, · · ·, n), then p(A) > n!fn".

Nonnegative MDtrices

&c. 5

The equality holds only if A permanent is l.

131

= QJ .Q where Q is a

diagonal matrix whose

If S = (8;J) E 0,., then there exists a permutation (I E B. such that n 1 n ,.•.• c.i) >(1) .t•l - n• with equality, if and only if S = J ... }i'or, let P be the convex function on 0. defined in 2.4.2. Then if P is a full cycle pennutation matrix,

5.1 !.1

F(S) =

.! E1 F(PaS) n

a:-o

> F ("tl a:•O

pa:s) n

(2)

= F(J.B) = F(J.)

= nlog(~) with equality, if and only if pas = S for all a; i.e., if and only if 8 = J •. The first equality in 5.12(2) follows from the fact that F(QB) = F(S) for any permutation matrix Q. fl

By 1.7, there exist permutations •

max D •

i=l

Bi,•W >

(I

E S,. such that fl

l:

0 and we can consider max

•E&. &•1

a permutation matrix in n.. Clearly fl

max •E&

l: log ''·• 1•1

= max P

n ''·• l:/(liJ) i,i = F(S)

> nlog(~}

Ch.ll

132 It follows that max •ES,.

Con~~txity

and Matrices

n" ''·• > 1/n,. with equality, if and only if S = J,..

i•l

111

For, if max •ES.

n ''·• = 1/n"',

then F(S) = n log (1/n) and thus S """ J •.

i•l

Conversely, if

s=

111

J,., then clearly

n Bi,•(i) = 1/n" for all, E s,.. i•l

Observe that the above result is implied by the van der Waerden conjecture (see 5.11.6).

5.12.! If A E n,. is positive semidefinite symmetric, then there exists a positive semidefinite real symmetric matrix B such that BJ = J B = J and B 1 =A. Observe first that A has 1 as a characteristic root with (1/v;, · · ·, 1/~) as corresponding characteristic vector. Hence, by 1.4.10.5 there exists a real orthogonal matrix U such that A = U diag (1, ~:t, • • ·, ~.)U". Clearly all elements in the first column of U are 1/v;. Let B = U diag (1, ~:t,

• • ·,

~.)U".

Then B is positive semidefinite real symmetric and B 2 =A. It remains to prove that BJ = J B = J. Since J(lt> = -Vnum and U~'UO> = e1, we have UTJ(It) = v;UTU(I) = Vnea and therefore

BJ = U diag (1, v'>:1, · · ·, v'>:,.)UTJ = v'; U diag (1, vli1, · · ·, vli,.)e, """' Vnlle1

and thus BJ = J. Also JB

= (BJ)" = J" =

J.

5.1!.3 If A E n,. is positive semidefinite, then p

(A)> n!

- n•

with equality, if and only if A = J •. By 5.12.2, there exists a real symmetric matrix B such that B 2 J B = J. The inequality 4.4(3) gives (p(J B) ) 2

Now JB

< p(JJ*)p(B*B).

= J, JJ* = nJ, and B*B = B 2 = A. Therefore (p(J)) 2 < nttp(J)p(A)

and p(A)

,.rn

>~

nl

= ......:. n 11

- n• We omit the proof of the case of equality.

= A and

Sec. 5

Nonnegative Matrices

133

5.12.4 If A E a. is positive semidefinite symmetric with a.,i < 1/(n - 1), (i = 1, · · ·, n), then there exists a positive semidefinite symmetric B ( n. such that B 2 = A. w·e show that the condition aii < 1/(n - 1), (i = 1, ... 'n), implies that the matrix Bin 5.12.2 satisfies B ~ 0. For, suppose that bioi- < 0 for some io, io. Then I: bi.; = p. > 1 since E" bit; = 1. Then i~~

j-1

p.2 = (

l:. biti)t < { l:. b~) (n -

lr')G

JJifJe

1),

by 3.1(4). Thus 2

II

a.&io =

1 n - 1

L bt > I: btJ > ___E._ > - ,

; -•

i ~;.

a contradiction.

n - 1

5.13 Stoch•slic ~~~alrica A matrix A E Af.(R), A > 0, is row 8tocha8tic, if AJ = J where again J is the matrix in .lf.(R) all of whose entries are 1. In other l\'ords, every row sum of A is 1.

5.13.1 The set of row stochastic matrices is a polyhedron whose vertices are the n• matrices with exactly one entry 1 in each row. 5.13.2 The matrix A > 0 is row stochastic, if and only if the vector (1, • • ·, 1) is a characteristic vector of A corresponding to the characteristic rootl. 5.13.3 If A is row stochastic, then every characteristic root r of A satisfies lrl < 1. 5.13.4 If A is row stochastic, then >. - 1 appears only linearly in the list of elementary divisors of >../,. - A [i.e., (>. - t)~>, (k > 1), is never an elementary divisor of XI" - A ]. 5.13.5 If A is nonnegative and indecomposable with maximal root r, then there exists a diagonal matrix D = diag (d1, ···,d.), (d, > 0; i = 1, · · ·, n), such that r-•D-•AD is row stochastic.

134

Ch. II

Convexity and Matrices

References §1.4.4. This example is due to A. J. Hoffman ([B12]). §1.7. [Bl]. §1.7.1. [A7], p. 240; [B7]. §3.1.1. [A6], p. 26. §3.1.3. This is Beckenbacb's inequality ([At], p. 27). §§3.2.1, 3.2.2. [A6], p. 52. §§3.2.3, 3.2.4. [B2]. §3.2.5. [Bl7]. §3.2.6. [B2]. §3.2.7. [B31]. §3.3. [A6], p. 26; [Al], p. 19. §3.4. [A6], p. 31. §3.5.1. [A6], p. 21; [AI], p. 26. §3.5.2. [B9]. §3.5.3. [AI], p. 3. §3.5.4. [A6], p. 47. §3.6. The solution follows the method of proof of Lemma 2 in [A6], p. 47. §§4.1.1-4.1.4. [BIO]. §4.1.5. [B5]. §4.1.6. For the lower bound see [B5]; for the upper bound see [BI8]. §4.1.7. [Al], p. 63; [B14]. §4.1.8. Ibid., p. 70. §4.1.9. [ll30]; [828]. §4.1.10. For inequality 4.1(10) see [B2]; for ease k = 1 see also [B17]. §4.2. [B30]; [B4], I; [B12]. §4.3.1. [B9]. §4.3.2. [R15]. §4.4.2. This inequality is due to E. Fischer; [AS], p. 420. §§4.4.3, 4.4.4. [B23]. §4.4.5. For inequality 4.4{5) see [B18]. §4.4.6. This is Sznsz's inequality; [Al], p. 64. §4.4.7. [AS], p. 75. §4.4.8. This is Bergstrom's inequality; [Al], p. 67. §4.4.9. This is a generalization of Heinz's inequality; [Bl3]. §4.4.10. [BS]. §§4.4.11-4.4.13. [Bll]. §4.4.14. [B33]. §4.4.15. This is an unpublished result of Ky Fan. §4.4.16. These are Aronszajn's inequalities; [A5], p. 77.

Bibliography

135

114.5.I-t.5.a. [Bt6]. §4.5.4. This inequality is due to Schur; [A8], p. 422. 1§5.2.2-5.2.7. [A-1], p. 50. §§5.4.2-5.4.6. [B20]. §5.5.1. [A4], p. 53. §5.5.2. [B3]; [At], p. 81. §§5.5.3-5.5.6. [A4], ch. XIII. §5.6. [B21]. 1§5.7.1-5.7.9. [A4], ch. XIII. §5.8.1. [B16]. §5.8.2. [B21]. §§5.9.1-5.9.4. [A4), p. 80. §5.9.6. Ibid.; [B32]. §5.9.7. [A4], p. 81. §5.10.2. [B32]. §5.11.1. [Bl]. §5.11.2. [B19]; [B24]. §5.11.3. [B19]. §5.11.4. [B24]. §5.11.6. [B29); [B22]. §§5.11.7-5.11.10. [B22]. §§5.11.11-5.11.13. [B21]. 1§5.11.14, 5.U.15. These unpublished rerrulta are due to M. Marcus and M. Newman; see also [B25]. §5.12.1. [B19]. §5.12.2. Ibid.; [B26]. §5.12.3. [B23]; [B26]. §5.12.4. [B19]. §§5.13.2-5.13.5. [A4], p. 82.

Bibliography P•rt A. Boolcs 1. Beckenbach, E. F., and Bellman, R., Ineqoolwa, Springer, Berlin (1961). 2. Bonnescn, T., and Fenchel, W., T~ der konvexen KIJrper, Chelsea, Nf'w York (1948). 3. Eggleston, H. G., CMWexity, Cambridge University, London (1958). 4. Gantmacher, 1•. R., TM theory of rnatrius, vol. II, Ch£-lsea, New York (19.'59). 5. Ilamburger, H. L., and Grimshaw, M. E., Linear tran.vormatiooa in n-dimen8ional vector apace, Cambridge Univenity, London (1951).

136

Ch.ll

Conoexily and M alrices

6. Hardy, G. H., Littlewood, J. E., and P61ya, G., Inequalities, 2nd ed., Cambridge University, London (1952). 7. Konig, D., Theorie der endlichen und u~ichen Graphen, Chelsea, New York (1950). 8. Mirsky, L., An intTodudion to lin«JT algebra, Oxford University, Oxford (1955).

Part B. P1pers 1. Birkhoff, G., Tres observaciona sobre el algebra lineal, Univ. Nac. Tucum4n Rev. Ser. A, 6 (1946), 147-150. 2. Bullen, P., and Marcus, M., Symmetric means and malrU inequalities, Proc. Amer. Math. Soc., 12 (1961), 285-290. 3. Collatz, L., Eimchlieuungssatz filr die charakteri8tischen Zahlen von Alatrizen, Math. Z., 48 (1942), 221-226. 4. Fan, Ky, On a theorem of Weyl ~ing eigefwolues ojlinem" IJ"amjormolions, Proc. Nat. Acad. Sci., I, S6 (1949), 652-655; II, S6 (1950}, 31-35. 5. Fan, Ky, A minimum property of the eigenvalues of a hermitian tramforrnolilm, Amer. Math. Monthly, 60 (1953), 48-50. 6. Frobenius, G., Vber Alalrizen. aus po8iliven Elementen, S.-B. Kgl. Preuss. Akad. Wiss., I (1908), 471-476; II (1909), 514-518. 7. Frobenius, G., Vber Matrizen aus nicht nsgaliven Elementen, 8.-B. Kgl. Preuss. JUtad. Wiss. (1912), 456-477. 8. Hua, L. K., Inequalities ine,-olving ckterminanls (in Chinese), Acta Math. Sinica, 6 (1955), 463; 1\·lath. Rev., 17 (1956), 703. 9. Kantorovich, L. V., Fundional analysis and applied mathematics, Uspehi Mat. Nauk, 8 (1948), 89-185 [English trans.: Nat. Bur. Standards, Report No. 1509 (1952}]. 10. Marcus, M., Corwex functitms of quadratic forms, Duke Math. J., 24 (1957), 321-325. 11. Marcus, M., On subckterminanls of doubly rtochmlic matrices, IJlinois J. Math., 1 (1957), 583-590. 12. Marcus, M., Some properties and applications of doubly Btochastic matrices, Amer. Math. Monthly, 67 (1960}, 215-221. 13. Marcus, M., AnoUier extensWn of Heinz's inequality, J. Res. Nat. Bur. Standards, 66B (1961), 129-130. 14. Marcus, M., The pemU~nent analogue of 1M Hadamard determinant theorem, Bull. Amer. Math. Soc., 4 (1963), 494-496. 15. Marcus, M., and Khan, N. A., Some generalizations of Kantmovich's inequality, Portugal. Math., tO (1951), 33-38. 16. Marcus, M., and Khan, N. A., A note on the Hadamard product, Canad. Math. Bull., f (1959), 81-83. 17. Marcus, M., and Lopes, L., Inequalities/or symmetric functWr&8 and hermitian malrices, Canad. J. Math.,.? (1957), 305-312. 18. Marcus, 1\L, and :McGregor, J. L., Extremal properties of hermitian matrices, Canad. J. Math., 8 (1956), 524-531.

Bibliography

1:Jl

19. Marcus, M., and Mine, H., 8orM ruullt on doubly~ rnaericea, Proc. Amer. Math. Soc., 18 {1962), 571-579. 20. Marcus, M., and Mine, H., DiBjaint pairs of Ida and i11Cidmce matrices, lllinois J. Math., 7 (1963), 137-147. 21. Marcus, M., Mine, H., and Moyls, n., 8om.e reaulla on nonnegative matrices, J. Res. Nat. Bur. Standards, 658 (1961), 205-209. 22. Marcus, M., and Newman, M., On the minimum of the permt~mnl of a doubly .tochastic matrix, Duke Math. J., t6 (1959), 61-72. 23. Marcus, M., and Newman, M., Inequ.calitiu /M tht ~Junction, Ann. of Math., 75 (1962), 47-62. 24. Marcus, M., and Ree, R., Diogonola of doubly 8loclt4atil: matrice1, Quart. J. Math. Oxford Ser. (2), 10 (1959), 295-302. 25. Maxfield, J. E., and Mine, H., On U&e doubly~ motri2: diiJgonally ccm~ to a gillen mab'iz (submitted for publication). 26. Mine, H., A not.e on an inequality of M. MarCWJ and M. Newman, Proo. Amer. 27.

.28. 29. 30. 31.

Math. Soc., 14 (1963), 890-892. Mirsky, L., Results and problems in tM tkory of doubly atocl&astic malricea, Z. WahrscheinlichkeiMtheorie t (1963), 319-334. P6lya, G., &mark on Weyl's note: "lmqualitiu between the two kinds of eigmvaluu of a linetJI' lranaformation," Proc. Nat. Acad. Sci., 86 (1950), 49-51. van der Waerden, B. L., Aufgabe 45, Jber. Deutsch. Math. Verein., 85 (1926), 117. Weyl, H., Imqualitiu between Uae two ki?Ula of eigerwaluu of a liMar tram/ormation., Proc. Nat. Acad. Sci., 86 (1949), 408--411. Whiteley, J. N ., Som.e imqtUJlities ctmamitlf symmetric function~, Mathematik&,

5 (1958), 49-57. 32. Wielandt, H., l.lr&Urkgbare, nicl&t negative Matriun, Math. Z., 61 (1950), 642-648. 33. Wielandt, H., An extremum property of tuma of rigenrlalue.a, Proo. Amer. Math. Soc., 6 (1955), 106-110.

Localization of Characteristic Roots

III 1 Bounds for Cherecteristic Roots 1.1

Introduction

Un1ess otherwise stated, all matrices are 888Umed to be square and over the complex field, i.e., in J.U n(C). There is quite a lot of information about the characteristic roots of some special types of matrices. Thus diagonal and triangular matrices exhibit their characteristic roots on their main diagonal (see 1.2.15.2). All characteristic roots of nilpotent matrices are equal to 0 while all those of idempotent matrices (i.e., A 1 = A) are equal to 0 or 1. At least one of the characteristic roots of any row stochastic matrix is equal to 1 and all of them lie on or inside the unit circle. All the characteristic roots of a unitary matrix lie on the unit circle, those of a hermitian matrix lie on the real axis, while those of a skew-hermitian matrix lie on the imaginary axis (see 1.4.7.19, 1.4.7.18, 1.4.7.20). As far as the characteristic roots of a general matrix are concerned nothing specific can be said about their location: they can obviously lie anywhere in the complex plane. In this chapter we shall state and prove theorems on bounds for characteristic roots of a general matrix in terms of simple functions of its entries or of entries of a related matrix as well as 139

Ch. III

140

Localization of Characteristic Roots

theorems on bounds for characteristic roots of other classes of matrices. In the first section we shall discuss some results obtained in the first decade of this century. The first results that specifically give bounds for characteristic roots of a general (real) matrix are due to Ivar Bendixson and are dated 1900. It is true that a simple corollary to the so-called Hadamard theorem on nonsingularity of square matrices [which incidentally is due to L. Levy (1881) and in its general form to Desplanques (1887)] gives the well-known Ger8gorin result. However, that simple corollary was not enunciated until 1931 (in this country not until 1946) and the honor of being the first in the field of localization of characteristic roots must be Bendixson's. We shall use the standard notation of matrix theory and in 1.2, 1.3, and 1.4 the following special notation: If A = (a,i) E .M.. (C), let B = (b,i) = (A +A *)/2 and C = (c,i) = (A -A *)/2i, both of which are obviously hermitian. Let >.1, · · ·, >. .. , (1>-•1 > · · · > 1>-.1), #Jl > · · · > #Jn, '• > · · · > v., be the characteristic roots of A, B, C respectively and let g = max laajl, g' = max aJ

lb•il, g"

=

max iJ

a,i

lc•A·

1.2 Bendixson's theorems 1.2.1

If A E Jf.(R), then

IIm(>.,)l < g" ~n(n ;- 1). 1.2.2 If A E M.(R), then JJ.

< Re(>.,) < #Jl·

We shall not prove Bendixson's theorems here as we shall presently state and prove their generalizations to complex matrices by A. Hirsch. Hirsch's paper in the Acta Mathematica follows the French version of Bendixson's paper in that journal and is in the form of a letter to llendixson. Its style is somewhat amusing because of rather exaggerated eulogistic references to Bendixson's results and methods.

1.3 Hinch's theorems 1.3.1

If A E M.(C), then

1>-.1 < ng, IRe(>-,)1 < ng', IIm(>.,)l < ng". If A

+ A' is real, then this last inequality can be replaced by

Sec. 1

Bounds for Characteristic Roots

llm(X,)I

141

< g" ~n(n;

1)

Proof: Let x be a characteristic vector of unit length corresponding to the characteristic root >.,; i.e., Ax = X.x and (x, x) = 1. Then (Ax, x) = >.,(x, x) (A *x, x) = (x, Ax)

= >., = X,.

(1)

Hence Re( '\1\t ) = (Ax, x)

+2 (A •x, x)

= (A +A •

2

X, X

) -_ (Bz, X )

(2)

and '\ ) I m (1\'

=

(Ax, x) - (A •x, x) = (C ) x, x . 2i

(3)

Thus

111

< g I:

jx,jjx,j

i.i•l

=g

(.f. •-1 lxll)'·

Recall the inequality (11.3.1.2)

E kJ < ( i:, lx,j2)l/2 i•l

n

i•l

(4)

n

which yields

Similarly for Re(>.,) and Jm(X,). Now, if A + A' is real, then 1.3(3),

ai, =

aii -

< E laij 2

i

0; i

= 1,

· · ·, n).

x- 1CXe where

e = (1, 1, · · ·, 1) E C".

Let i1, · · ·, i,. be a permutation of I, · · ·, n such that r,, Then k

n IX,I

r"

> · · · > rl...

= 1, · · ·, n).

2 Regions Containing Cherederistic Roots of a General Matrix 2.1 2.1.1

Levy-Despl•nqua theorem

Brauer established 1.6.5 by first proving a result (see 2.2) which is of considerable interest in itself. rnfortunately, in this theorem Brauer was also anticipated, though obviously he was Wlaware of it, by S. A. Ger~gorin who in 1931 published the same result. Ger8gorin proved his theorem as a corollary to the Levy-Despla.nques theorem which in its origiual form-for certain real matrices-was obtained by Levy in 1881,

146

Ch. III

Loealization of Characteristic Roots

was generalized by Desplanques in 1887, and is usually referred to as the ••Hadamard theorem" since it appeared in a book by Hadamard in 190:J. This result had a most remarkable history. Before it was stated by Hadamard it was also obtained in 1900 by Minkowski with the same restrictions as Levy's result, and in that form it is still known as the Minkowski theorem. Since then it made regular appearances in the literature until 1949 when an article by Olga Taussky seems to have succeeded in stopping these periodic rediscoveries of the theorem. Incidentally, Gerigorin's generalized version of the Levy-Desplanques theorem is false. The correct generalization along the same lines was stated and proved by Olga Taussky (see 2.2.4).

i.1.i Weintroducethefollowingnotation:Pi = R, -laiii,Q; = T; -laiil· Most of the results in this section will be stated in terms of P; or Ri. Clearly analogous theorems hold for Qi or Ti. (levy-Oesplanques theorem) and

If A = (ai;) is a complex n-square matrix (i = 1 .. · n)

,

' '

(1)

then d(A) ¢ 0. Proof: Suppose d(A) = 0. Then the system Ax = 0 has a nontrivial solution x = (x1, · · ·, x,.) (see 1.3.1.5). Let r be the integer for which (i = 1, · · ·, n).

Then

which contradicts 2.1(1).

i.i Gerigorin elisa i.i.1

(Gerigorin's theorem) The characteristic roots of an n-equare complex matrix A lie in the closed region of the z-plane consisting of all the discs

lz - a,,j

< Pi,

(i

=a

1, · · ·, n).

(1)

Proof: Let"' be a characteristic root of A. Then d("J· -A) - 0 and, by the Uvy-Desplanques theorem, '"' - a;;j < P, for at least one i.

i.i.i The absolute value of each characteristic root "• of A is less than or equal to min (R, T). For, by 2.2.1, IX, - a..! < P; for some i. Therefore jX,j < P, + ja,,;j = Ri < R. Similarly, jX,I < T.

Sec. 2

Regions Containing Characteristic Roots of a Gtnnal Matrix

IX•I > k = min (laiil

!.2.3 and, if k

>

'

< min (R•,

ld(A)I

147

- P,), T•)

0, ld(A)I

> k•.

These are immediate consequences of 2.2.1. We quote now lltithout proof some related results.

2.2.4 Call an n-square complex matrix indecomposable if it cannot be brought by a simultaneous row and column permutation (see 11.5.2) to the

~}

fonn ( :

where X and Z are square submatrices and 0 is a zero

sub matrix. If A .... (aii) is an indecomposable matrix for which

laiil > P,,

(i- 1, · · ·, n), with strict inequality for at least one i, then d(A) pi! 0.

2.2.5 If H c•> is a set in the complex plane containing m Gersgorin discs 2.2(1) of a matrix A and He..> has no points in common with the remaining n - m discs, then H (•> contains exactly m characteristic roots of A. It follows that if A is a real matrix (or merely a complex matrix with real main diagonal entries and real coefficients in its characteristic polynomial) and the Gerigorin discs of A are disconnected, then all characteristic roots of A are real.

!.!.6 If A is an indecomposable complex matrix, then all the

c~

teristic roots of A lie inside the union of the discs 2.2(1) unle&& a characteristic root is a common boundary point of all n discs.

2.2.7 Let A - (a,J) be ann-square complex matrix and let K and k be two positive numbers, K > k, such that la,il < k for j i. Then all of the characteristic roots of A lie in the union of the discs (i - 1, · · ·, n).

2.2.1 If

x,

is a characteristic root of A with geometric multiplicity m (see 1.3.12), then)., lies in at least m Ger8gorin discs 2.2(1).

2.2.9 Let A be an n-square complex matrix and let u be a permutation on 1, ... , n. Then the characreristic roots of A lie in the union of the n regions

lz - Cl.:il L,, if i ~ v(i) where L. -

la.:.h>l - j,., I: lai;l. i"•(i)

148

Ch. III

Localization of Characteristic Root.1

2.2.10 Let A, u, P,, and L, be defined as in 2.2.9. Suppose that >.,is a characteristic root of A with geometric multiplicity m. (see 1.3.12). Then >.,lies in at least m of the regions

lz - a;,l L,, if i ~ u(i). 2.2.11 Let >., be a characteristic root of A = (a;;) E M,.(C) with geometric multiplicity m. If {J., · • ·, fj,. are positive numbers such that

" I:

·-·

1/(1

+ {3;) < m, then >., lies in at least one of the discs (i = 1, · · ·, n).

la••l > P, for at least r subscripts i,

!.!.1! If least r.

then the rank of A is at

!.!.13 If the rank of A -= (a;;) E llf .(C) is p(A ), then

t

!!ill < p(A)

(2)

•·• R; n

where R, =

I: la;il· We agree that 0/0, if it occurs on the left-band side

i•l

of 2.2(2), is to be interpreted as 0.

!.3 Example The characteristic roots of

7 + 3i 2

A = ( - 1 - 6i

-4 - 6i 7 4- 6i

-4 ) -2 - 6i 13- 3i

are 9, 9 + 9i, and 9 - 9i; their absolute values are 9 and 12.73. Hirsch's theorem 1.3.1 yields

1>-,1 < 40.03,

IRe(>.,)l < 39, 3

Schur's inequality 1.4.1 reads

IIm(>.,)l < 20.12.

I: l>-el 1 = 405
-el: 23.11, 23.11, 23.10, 22.55. The Ger8gorin discs

lz- 7 - 3il

P,P;, (i,j = 1, · · ·, n; i ~ j),

(I)

then d(A) ~ 0. Proof: Suppose that the inequalities 2.4(1) hold and d(A) - 0. Then the system Ax = 0 has a nontrivial solution x = (x 1, • • ·, x.. ) (see 1.3.1.5). Let Xr and x, be coordinates of x such that

' + 1' · .. ' n).

(i = 1 .. · r - 1 r

,

'

Note that x. ~ 0. For, if x, = 0, then x, - 0 for all i ~ r, and

" Or;X; = 0; i.e., Ax = 0 and, in particular, I:

O.,Xr

j•l

Xr ~

0. But

= 0. Thus a.,., = 0 which

contravenes 2.4(1). Hence x, rtf 0. The supposition d(A) - 0 implies therefore that

and

larrlla•• l < PrP., contradicting 2.4(1).

2.4.2 Each characteristic root of A ""' (a.:1) E

.~/.. (C)

lies in at least one

of the n(n - 1)/2 ovals of Cusini

-, - a1;l > PJ'; cannot hold for all i,j, (i '# j). 2.4.3 The following question arises naturally: For what val uP-" of k must

lz- a,;l !z- aiil

each characteristic root of A lie within at least one of the ( ~) regions

nt !z - Cli.i.l
l'lt..,l - la.:•l for some i. Applying the H6lder inequality [see 11.2.2(1)] to 2.5(3) we obtain

l'lt..el < max (la"l + Pi)«(la,,l + Q,)•-ar For a =

l

'

we have

=

max (RtTl-..).

'

1>-,1 < max (R,T,)1'2•

(5}

i

+ aP, + (1 - a}Q,); ' orem [see 1.6(1)]; a = 0 or 1 gives 1.6.5. !.5.5 1>-,1 < max (la.:•l

2.5.6 If 0 then d(A)

a= !

gives

< a < 1 and for all pairs of subscripts i, j, i la•.:lla;;l > PlQl-'"PjQ}-'",

~

(4)

Parker's the-

~ j,

(6)

0.

s11 > · · · > Bj,.. Then 2.5(1) is equivalent to '' < 1 while 2.5(6) is equivalent to 8;8; < 1. In particular, s~Bb < 1 and thus Ba < 1. If srr. = O, then BJ:, = 0, (i = 2, · · ·, n), and d(A) = auan · · · a.. .. ~ 0. Otherwise 0 < '• < 1. Let q = v sift! sa. Let B Proof: Let'',. PiQ:-'"/Ia~tl and let Bit>

be the matrix obtained from A by multiplying the ktth row and the ktth

column by q. For B use the same notation as for A but with primes. Then P~. = qPt,., Q~. = qQ~cu la~..t.l = q2 la•.t.l and therefore Bk, = s~n/q = s1,s... < 1. Further, for i > 1, ~~. < qs~c; < qs.. = v'Bt.Brr. < 1. Therefore, by 2.5.1, d(B) ~ 0 and, since d(B) = q2d(A), d(A) #- 0.

v

1.5.7 As an immediate consequence of 2.5.6 we obtain the following result which generalizes 2.2.2. Each characteristic root of A = (a,i) E 1lf,(C) lies on or inside at least one of the n(n - 1)/2 ovals of Cassini;

lz -

aHiiz - a;, I < (P,P;)a(Q,Q;) 1-«,

.For, if l'lt.., - a"ll'lt.., - aiil 2.5.6, d('lt..d" - A) ~ 0.

>

(i,i

= 1, · · ·, n; i ¢ j; 0

(PiPi)'"(Q,Q;) 1-« for all

i,i,

0 for all i, j, (i # j), and there exist positive numbers lt, · · · , t. such that (1)

(i '""" 1, · · ·, n),

then A is semista.ble. Proof: Let Xc be a characteristic root of .A and (xh · · ·, x.) a characteristic vector corresponding to >.c. Let 1/i = xift,, (i .... 11 • • · , 11) 1 and

IY-1 = max IY•I· i " Then XJ,1J, """ I:

J-1

ai;t;YJ

and

Therefore

< ;.,.. I: tp.J < -t.a..-1

by 3.3(1).

Thus lA,- o.-1 < -a_ and in the complex plane the characteristic root )., lies in the disc which passes through 0 and whose center is at the negative number a....

3.3.2 If the inequalities 3.3(1) are strict, then A is stable. 3.3.3 Rohrbach generalized 3.3.1 to complex matrices; the inequalities 3.3(1) are replaced in his theorem by (i ""' J1

• • •1

11).

3.3.4 If A is defined a.s in 3.3.1 and e = max ( -a,i), then 1>.,

'

+ el < e.

This corollary to 3.3.1 was deduced by Fr~chet who used it to prove his theorem on characteristic roots of row stochastic matrices (see 3.4.1).

3.3.5 If A

= (ali)

E M .(C) and (i - 1I ... I 11) I

{2)

then A is semistable. If the inequalities in 3.3(2) are strict, then A is stable. If A is real and satisfies 3.3(2), then A is stable if and only if it is nonsingular. These results follow directly from 2.2.1.

160

Ch. Ill

Localization of Characteristic Roots

3.3.6 Let A = (a,;) E M .(R), (a,, < 0, a,; > 0 for i #- j). Then A is stable if and only if there exist positive numbers x., · · · , x.. such that x-•AXe < 0 where X= diag (x1, • • ·, x.) and e = (1, · · ·, 1) E R". This result clearly implies 3.3.2. 3.3. 7 (lyapunov's theorem) The only known general result on stability of matrices is due to Lyapunov. It can be stated as follows. A matrbc A E jl/ .(C) is stable, if and only if there exists a negative definite (see 1.4.12) matrix H such that AH + HA • is positive definite. 3.3.8 The result 3.3.7 can be proved from the following result by Ostrowski and Schneider. Let A E M .(C). Then A has no purely imaginary characteristic root, if and only if there exists a hermitian matrix H such that A H + H A • is positive definite; and then A and H have the same number of characteristic roots with negative real parts.

3.3.9 A result of Taussky that also is a corollary of 3.3.8 follows. Let A E M .(C) and suppose that a;, (i ""' 1, · · ·, n ), are the characteristic roots of A and that they satisfy

n"

(a,+ ii.t) ~ 0. Then there

••• -1

+

exists a unique nonsingular hermitian matrix H satisfying A H HA • = I.. Moreover H has as many positive characteristic roots as there are a, with positive real parts.

3.3.1 0 The equation A H + HA • = I. is precisely the one considered in 1.1.10. Let U be a unitary matrix in M .. (C) for which UAU* = T is upper triangular with the characteristic roots a1, · · ·, a. appearing on the main diagonal of 7'. Then AH + HA*- I. becomes TK + KT* =I. where K = UII U*. This last system is equivalent to the equation (1. ® T + T ® I.)x = e (see 1.1.10). The matrix I .. ® 7' + T ®I .. E M,..(C) is triangular with a, + ii.t, (1 < i, k < n), as main diagonal elements. Thus a unique solution x exists if and only if d(I. ® T + T ® I.) =

n" (ai + ii.t) ~ 0. This accounts for the hypothesis in 3.3.9. ,,,-1 3.4 Row stochastic matrices Recall that P

= (p,;) E M .(R) is row stochastic if

P•s > 0, (i,j = 1, · · ·, n), and E" P•s- 1, (i = 1, · · ·, n), j•l

(see 11.5.13).

3.4.1 If P = (p,;) is a row stochastic matrix, >., is a characteristic root of P and w = min (p ..), then

'

Sec. 3

Charactnistic Roots of Special Types of M atrie1s

161

1>-e - wl < 1 - w. Proof: Consider A == (a.;) where a,; = Pii - ~'i in which &iJ is the Kronecker delta (see 1.1.1). Then A satisfies the conditions of 3.3.1 and thus, for any characteristic root P.• of A,

IP.• + el < E where

t

= ~ (-a") as before. Now,

'

and therefore IX•- (1 - e)l l -

t

-" - "'I < · · · < 1>-t. - 'YI· Now, (A- "(l .. )(A- -y/,.)•ishermitian with characteristic roots l>.t,(i = 1, · · ·, n). Hence

'YI', (3)

where the minimum is taken over all orthonormal sets of k vectors in C" (see 11.4.1.5). Formula 3.5(3) with 3.5(1) implies 3.5(2). The last part of the theorem is proved similarly.

Characteristic &ots •f Special Types of Matrices

Sec.3

163

3.5.8 Let A, x., · · ·, x., x1, · · ·, x~: be defined as in 3.5.7. Then there exists (t., • · ·, t~:) E Q~: ·" such that i .E• IX,,- -k1 J•l l:• (Axi, Xj)l 2 < J•l .E IIAxJ!I'- -k~~.I: E (Axil x;) i•l

,_,

r•

(4)

There also exists (t~, · · ·, ta:) E Q~r .• satisfying the reversed inequality of 3.5(4). Proof: In 3.5.7 set 1 1: 'Y = -k E (Axi, x;), i•l

8=

l: 11Ax;ll2 -

,.. •

1 I: (Ax;, x;) · k 1.i-1 •

r

-

1:

It follows that

I: II(A - "Yl.)x;ll' =

.i•l

6 and 3.S(.f.) follows from the first

part of 3.5.7. The second part of 3.5.7 implies the existence of subscripts satisfying the reversed inequality of 3.5(4). When k = 1, the result in 3.5.8 reduces to 3.5.5. For, 3.5(4) reduces to

IX,. -

(Ax1, Zt)l

< 11Ax.ll 2

-

I(Ax~,

x.)l 2 •

3.5.9 Let A E M ..(C), B E ltf.(C) be nonnal. Let x E c-, y E C• be nonzero vectors such that (Ax, x) - (By, y) and IIAxll > IIBYII- Then a circular disc in the complex plane containing all m characteristic roots of A contains at least one characteristic root of B.

If A and B are normal with characteristic roots o:1, • • •, a. and

3.5.1 0 {J1,

• • ·,

that

{J. respectively, then there exists an ordering

tJ.cu, · · ·,

/3.,., such

" Ia:,- {J.col 2 < IIA - Bll 2 where IIXII denotes the Euclidean norm I:

t•l

of the matrix X (see 1.2.8.2). 3.5.11 If .A is a normal matrix with characteristic roots a., · · · , a. and k is a positive real number, then {J~, · · ·, fJ. are characteristic roots of a normal matrix B satisfying IlA - Bll 0, (a, b, c, d, x, y E R)}, 2)

which contain fla, • • ·, fJ,.. 3.5.13 In the special case of 3.5.9 when a., · · ·, a .. are real and fla, • • ·, fJ,. are pure imaginary, it follows that A= nA

where A runs over all "hyperbolic" regions of the complex plane, {x

+ iylax +by+ c(x2 -

which contain the numbers ai

y 2)

+ d > 0, (a, b, c, d, x, y E R)},

+ {J;, (i, j = 1, · · ·, n).

3.6 Hermitian matrices All the characteristic roots of a hermitian matrix are real (see 1.4.7.18). Moreover, a hermitian matrix is normal and therefore the results of 3.5 apply to it. We specialize only one of these, 3.5.3, because of the historical interest; the version of 3.5.3 specialized to hermitian matrices was the first result of this kind. The rest of this section consists of more recent results. We quote them without proof. 3.6.1 If A is a symmetric real matrix and u = (ua, · · ·, u,.) an n-tuple of nonzero real numbers, then in any interval containing the numbers (Au)i/ui, (i = 1, · · ·, n), lies at least one characteristic root of A (see 3.5.3). 3.6.! Let A = (aij) E Jf.. (C) be hermitian with characteristic roots Xa > · · · > :>.... Let ca, • · ·, c.. _. and da, · · ·, d. be 2n - 1 real numbers such that Ci

> 1,

(i

= 1,

· · ·, n - 1),

(i = 1, .. ·, n).

(1)

(2)

Then

:>.i < di,

(i = 1, · · ·, n).

(3)

3.6.3 The theorem 3.6.2 holds, if conditions 3.6(1) and 3.6(2) are replaced by

c, > 0,

(i - 1, .. ·, n - 1), (i - 1, .. ·, n).

3.6.4 The hypothesis of 3.6.2 implies also that (i - 1, · .. , n).

Sec. 3

Chmacteristic Roots of Special Types of Matrices

165

3.6.5 Let A = (a,i) E lrf.(C), (n > 2), be a hermitian matrix with characteristic roots At > · · · > A.. Let p1, · · ·, p. be any 8 integers, {1 < 8 < n + 1), such that 0 == Po < Pt < · · · < p. < P.-+t = n. Define a hermitian matrix B = (b.i) E M.(C):

b·· = {aii, if for some k, 1 '' 0, otherwise.

< k 1;

" I:

t•l

(X, - ~,)

< 'Y1,

j

+ 1 < p.;

(h - 1, · · ·, n);

'Yi-'+1, if 1 < j - i + 1 < p,; > l'i+p,, if i + p. < n. 3.6.7 Let X1 > · · · > x. and P.• > · · · > ~·and let Nand K

(iv)

x, > l'i -

(v)

X,

be the sets

consisting of all n-tuples and (X,.u) + P.l, • · ·, X,(,.> + lola) respectively as .p runs over all permutations IP E s•. Let /l(N) and H(K) be the convex hulls of Nand K (see 11.1.5) and let L == ll(N) n H(K). If A and B are any symmetric real matrices with characteristic roots x~, · · ·, "·and 1-'s, • • ·, P.• respectively and if cr1 > · · · > cr. are the characteristic roots of A + B, then (cr1, • • ·, cr.) E L. If, in addition, either (X1

or

+ ~-'• 2, (see 1.5.2). let b1 C1 0 0 • • • • · '• • • 0 a,i12c,O ......... 0 0 aa ba ca o........ 0 L.............................

0 0

· • • • • • · • • •0 0.-l ba-1 C..-1 .......... o 0 a. b.

be an n-square complex Jacobi matrix and let Lr denote the principal submatrix L.. [1, · · ·, rl1, · · ·, r]. 3.7.1 If L .. E M .. (R) and a,-ct-1 > 0, (i - 2, · · ·, n), then (i) all characteristic roots of L. are real and simple; (ii) between any two characteristic roots of L. lies exactly one characteristic root of La-1; {iii) if > · · · > ~.. are the characteristic roots of L., then

"1

d(",I•-1 - L ..-1) is alternately positive and negative, (t - 1, · · ·, n). 3.7.2 If L .. E M ..{R) and «tCi-1 < 0, {i = 2, ···,n), then all the real characteristic roots of L. lie between the least and the greatest of the bi, these values included. If, in addition, b1 < · · · < b., then the characteristic roots of L. and those of L-1 cannot interlace; in fact between any two adjacent real characteristic roots of L. must lie an even number of real characteristic roots of if any.

L._.,

3.7.3 If L. E M.(C), bs, · · ·, b•-• are purely imaginary, Re(bt) ~ O, ai = -1, (j = 2, · · ·, n), and c;, {j - 1, · · ·, n - 1), are nonzero real numbers, then the number of characteristic roots of L. with positive real parts is equal to the number of positive elements in the sequence eo, eoc1, CoC1c2, • • ·, CoC1 • • • c•-1 where Ce = Re(b.). 3.7.-4 If L,. E Af.. (C), b1, ···,b.. -~ are purely imaginary, Re(b,.) ~ 0, a;= -1, (j = 2, · · ·, n), and CJJ {j = 1, · · ·, n - 1), are nonzero real numbers, then the number of characteristic roots of L. with positive real

Sec. 4

The Sprtad of a MtJtrix

167

part8 is equal to the number of positive elements in the sequence c., c,.c,.-1, · · ·, C-nCa-1 • • • c1 where c,. = Rr.(h,.). In particular, if c1, c1 , • •• , c,._t, -c. are positive, then B is stable (see 3.3).

4 The Spread of 1 Matrix 4.1

DeRnition

Let A E },J ,.(C), (n > 3), and let.~., · · ·, >.,.be the characteristic roots of A. The Bpreoo of A, written s(A), is defined by: s(A) = max~~~ - ~;I. iJ

4.2 Spreacl of • general ••trix 4.2.1

H A E M .(C) and s(A) is defined as in 4.1, then s(A)

< (2IIA U2 - ~ ltr A 12)11 2

with equality, if and only if A is a normal matrix with n - 2 of its characteristic roots equal to the arithmetic mean of the remaining two.

4.2.2 s(A) < V211AII. 4.2.3 H A E },f.(R) and a(A) is defined as in 4.1, then s(A)

< (2(1- ~) (trA) 2 -

4E.' (A)r 2

12

with equality, if and only if n - 2 of the characteristic roots of A are equal to the arithmetic mean of the remaining two.

4.3 $prucl of • nonnal ••trix 4.3.1

If A E M.(C) is normal and s(A) is defined as in 4o.l, then s(A)

> v'a max laa;l. iJ14i

4.3.2 If A E M.(C) is hermitian and s(A) is defined as in 4.1, then s(A)

> 2 max la'll· i,.i

This inequality is the best possible, in the sense that there exist hermitian matrices whose spread is equal to the absolute value of an off-diagonal element; e.g., if A = (

_! - ~) +I •-•,

then s(A) - 2 ...

21a1:~l.

168

C/1. Ill

Localization of Characteristic Roots

4.3.3 If A E 1l/.. (C) is normal, then (i)

(ii)

(iii) (iv)

2 112 2 > max ,.,., ((Re(ar,.) - Re(a,)) + Ia,., + 4,,.1 ) , s(A) > max (larr - a,,l' + (la,.,l - la.rl) 2)112, ,.,., s(A) > max (la,.,l + la,,.l), r ,., s(A) > max (!(larr - a,l 2 + l 1), - A, then r is strictly in the interior

5.2.8 If r is a characteristic root of A E .M.(C) and is an elementary divisor of of the field of values of A.

~/.

(~

References 111.2.1, 1.2.2. [B2]. 1§1.3.1, 1.3.2. [B19]. 11.3.3. [B5]. 1§1.4.1, 1.4.2. [B42]. 11.5.1. [B6]. §1.6.1. [B38]. §1.6.2. [B16]. §1.6.S. [B7). §1.6.4. [B36]; [814]. §1.6.5. [83], I. §1.6.6. [A7], p. 36; [B3], I. §1.7. [B40]. 12.1.1. [B3], I; [Bt7]; [B23]; [B9]; [A3], pp. 13--14; [826]; [B47). §2.1.2. [B23]; [B9]. §2.2.1. [B17]. 12.2.2. [B3], I; also see §1.6.6. §2.2.3. [B3], I. §2.2.4. [B47]. §2.2.5. [Bl7]. §2.2.6. [B46].

170

Ch. Ill

Localization of Characteristic Roots

§2.2.'7. [B33]. §2.2.8. [B44). §§2.2.9, 2.2.10. [B41]. §§2.2.11-2.2.13. [Bl3). §2.4.1. This result is due to Ostrowski (see [B30]). The method of proof follows Brauer's proof of 2.4.2 (see [B3], II, Theorem 11). §2.4.2. [B3], II. §§2.5.1-2.5.7. [B31]. §2.6. [B21]; [Bl2]. §3.1.1. [B16]. §3.1.2. [B22]. §3.1.3. [B32]. §3.1.4. [84). §3.1.5. [B34]. §3.1.6. [B25]. §§3.3.1, 3.3.2. [B45]. §3.3.3. [B39]. §3.3.4. [815]. §3.3.6. [ll40]. §3.3.7. [A4), pp. 276-277; [835]. §3.3.8. [835]. §3.3.9. [ll48]. §3.4.1. [BI5]. §3.4.2. [B3], IV. §3.4.3. [BlO]. §3.4.4. [83], IV. §§3.5.1, 3.5.2. [B37]. §3.5.3. [B50]; [B51]. §§3.5.4, 3.5.5. [B51]. §3.5.6. [850]. §§3.5.7-3.5.9. [Bl3]. §§3.5.10, 3.5.11. [B20]. §§3.5.12, 3.5.13. [B52]. §3.6.1. [B8]. §§3.6.2-3.6.6. [Bll]. §3.6.7. [B24]. §§3.7.1, 3.7.2. [Bl]. §§3.7.3, 3.7.4. [B43]. §§4.2.1-4.2.3. [B27]. §4.3.1. Ibid. §4.3.2. [B37]. §§4.3.3, 4.3.4. [B28]. §§5.2.1, 5.2.2. [849]. *5.2.3. [B29]. ~:>.2.4. [B49]. §5.2.7. I bid. §5.2.8. [Bl8].

Bibliography

171

Bibliography Part A. Boolct 1. Bellman, R., Introduction Co matrix analysis, McGraw-Hill, ~cw York (1960). 2. Gantmacher, F. R., The theqry of mabice3, vola. I, 11 (trans., K. A. Hirsch), Chelsea, New York (1959). 3. Hadamard, J., Let;orUJ sur la propagation du ondes, Chelsea, New York (1949). 4. Lyapunov, A., Probleme gbleral de Ia stabiliU du mouvement, Ann. of Math. Studies 17, Princeron University, Princeton, N.J. (1947). 5. MacDuft'ee, C. C., Til£ tJaeqry of matrices, Chelsea, New York (1946). 6. Parodi, .1\·1., La loc4lisation des vakurs caracUri8tiquu des matriaa et •• appl.,_ cations, Gauthier-Villars, Paris (1959). 7. Perron, 0., TIUJOrie der algebraischen Gleid&ungen, II (sweite Auftage), de Gruyter, Berlin (1933). 8. Todd, J. (ed.), Survey of numerical analysis, .McGraw-Hill, New York (1962).

P•rt B. P•pen roolt of trMliagonal f'l'l4lrica, Proc. Edinburgh Math. Soc., 1! (1961), Edinburgh Math. Notes No. 44, 5-7. Bendi.xson, 1., Sur lea racinu d'une lquation ftm.dtJmfnltlle, Acta Math., 16 (1902), 359-365. (Ofversigt af K. Yet. Akad. Forb. Stockholm, 57 (1900), 1099.) Brauer, A., LimiU for tk characteristic roots of a matrix, Duke Math. J., I, 1S (1946), 387-395; II, 14 (1947), 21-26; IV, 19 (1952), 73-91. Brauer, A., The tJ&eqrema of Ledermann and Ostrowski on positive matrice1, Duke Math. J., !4 (1957); 265-274. Bromwich, T. J. I'A, On tit£ rootB of t.he claaraderwic equation of a lifl«