MATRICES and LINEAR ALGEBRA
second edition
Hans Schneider, James Joseph Sylvester Professor of Mathematics, University of Wisconsin-Madison
George Phillip Barker, University of Missouri-Kansas City
DOVER PUBLICATIONS, INC., New York
Copyright © 1968 by Holt, Rinehart and Winston, Inc. Copyright © 1973 by Hans Schneider and George Phillip Barker. All rights reserved under Pan American and International Copyright Conventions.

Published in Canada by General Publishing Company, Ltd., 30 Lesmill Road, Don Mills, Toronto, Ontario. Published in the United Kingdom by Constable and Company, Ltd., 10 Orange Street, London WC2H 7EG.

This Dover edition, first published in 1989, is an unabridged, slightly corrected republication of the second edition (1973) of the work originally published by Holt, Rinehart and Winston, Inc., New York, in 1968.

Manufactured in the United States of America. Dover Publications, Inc., 31 East 2nd Street, Mineola, N.Y. 11501

Library of Congress Cataloging-in-Publication Data

Schneider, Hans. Matrices and linear algebra / Hans Schneider, George Phillip Barker. p. cm. Reprint. Originally published: 2nd ed. New York: Holt, Rinehart and Winston, 1973. Includes index. ISBN 0-486-66014-1. 1. Algebras, Linear. 2. Matrices. I. Barker, George Phillip. II. Title. [QA184.S38 1989] 512.9'434-dc19 89-30966 CIP
preface to the second edition

The primary difference between this new edition and the first one is the addition of several exercises in each chapter and a brand new section in Chapter 7. The exercises, which are both true-false and multiple choice, will enable the student to test his grasp of the definitions and theorems in the chapter. The new section in Chapter 7 illustrates the geometric content of Sylvester's Theorem by means of conic sections and quadric surfaces. We would also like to thank the correspondents and students who have brought to our attention various misprints in the first edition that we have corrected in this edition.

MADISON, WISCONSIN
KANSAS CITY, MISSOURI
OCTOBER 1972

H.S.
G.P.B.
preface to the first edition

Linear algebra is now one of the central disciplines in mathematics. A student of pure mathematics must know linear algebra if he is to continue with modern algebra or functional analysis. Much of the mathematics now taught to engineers and physicists requires it. It is for this reason that the Committee on Undergraduate Programs in Mathematics recommends that linear algebra be taught early in the undergraduate curriculum. In this book, written mainly for students in physics, engineering, economics, and other fields outside mathematics, we attempt to make the subject accessible to a sophomore or even a freshman student with little mathematical experience.

After a short introduction to matrices in Chapter 1, we deal with the solving of linear equations in Chapter 2. We then use the insight gained there to motivate the study of abstract vector spaces in Chapter 3. Chapter 4 deals with determinants. Here we give an axiomatic definition, but quickly develop the determinant as a signed sum of products. For the last thirty years there has been a vigorous and sometimes acrimonious discussion between the proponents of matrices and those of linear transformations. The controversy now appears somewhat absurd, since the level of abstraction that is appropriate is surely determined by the mathematical goal. Thus, if one is aiming to generalize toward ring theory, one should evidently stress linear transformations. On the other hand, if one is looking for the linear algebra analogue of the classical inequalities, then clearly matrices form the natural setting. From a pedagogical point of view, it seems appropriate to us, in the case of sophomore students, first to deal with matrices. We turn to linear transformations in Chapter 5. In Chapter 6, which deals with eigenvalues and similarity, we do some rapid switching between the matrix and the linear transformation points of view. We use whichever approach seems better at any given time. We feel that a student of linear algebra must acquire the skill of switching from one point of view to another to become proficient in this field. Chapter 7 deals with inner product spaces.

In Chapter 8 we deal with systems of linear differential equations. Obviously, for this chapter (and this chapter only) calculus is a prerequisite. There are at least two good reasons for including some linear differential equations in this linear algebra book. First, a student whose only model for a linear transformation is a matrix does not see why the abstract approach is desirable at all. If he is shown that certain differential operators are linear transformations also, then the point of abstraction becomes much more meaningful. Second, the kind of student we have in mind must become familiar with linear differential equations at some stage in his career, and quite often he is aware of this. We have found in teaching this course at the University of Wisconsin that the promise that the subject we are teaching can be applied to differential equations will motivate some students strongly.

We gratefully acknowledge support from the National Science Foundation under the auspices of the Committee on Undergraduate Programs in Mathematics for producing some preliminary notes in linear algebra. These notes were produced by Ken Casey and Ken Kapp, to whom thanks are also due. Some problems were supplied by Leroy Dickey and Peter Smith. Steve Bauman has taught from a preliminary version of this book, and we thank him for suggesting some improvements. We should also like to thank our publishers, Holt, Rinehart and Winston, and their mathematics editor, Robert M. Thrall. His remarks and criticisms have helped us to improve this book.

MADISON, WISCONSIN
JANUARY 1968

H.S.
G.P.B.
contents

Preface to the Second Edition
Preface to the First Edition

1 THE ALGEBRA OF MATRICES
1. MATRICES: DEFINITIONS
2. ADDITION AND SCALAR MULTIPLICATION OF MATRICES
3. MATRIX MULTIPLICATION
4. SQUARE MATRICES, INVERSES, AND ZERO DIVISORS
5. TRANSPOSES, PARTITIONING OF MATRICES, AND DIRECT SUMS

2 LINEAR EQUATIONS
1. EQUIVALENT SYSTEMS OF EQUATIONS
2. ROW OPERATIONS ON MATRICES
3. ROW ECHELON FORM
4. HOMOGENEOUS SYSTEMS OF EQUATIONS
5. THE UNRESTRICTED CASE: A CONSISTENCY CONDITION
6. THE UNRESTRICTED CASE: A GENERAL SOLUTION
7. INVERSES OF NONSINGULAR MATRICES

3 VECTOR SPACES
1. VECTORS AND VECTOR SPACES
2. SUBSPACES AND LINEAR COMBINATIONS
3. LINEAR DEPENDENCE AND LINEAR INDEPENDENCE
4. BASES
5. BASES AND REPRESENTATIONS
6. ROW SPACES OF MATRICES
7. COLUMN EQUIVALENCE
8. ROW-COLUMN EQUIVALENCE
9. EQUIVALENCE RELATIONS AND CANONICAL FORMS OF MATRICES

4 DETERMINANTS
1. INTRODUCTION AS A VOLUME FUNCTION
2. PERMUTATIONS AND PERMUTATION MATRICES
3. UNIQUENESS AND EXISTENCE OF THE DETERMINANT FUNCTION
4. PRACTICAL EVALUATION AND TRANSPOSES OF DETERMINANTS
5. COFACTORS, MINORS, AND ADJOINTS
6. DETERMINANTS AND RANKS

5 LINEAR TRANSFORMATIONS
1. DEFINITIONS
2. REPRESENTATION OF LINEAR TRANSFORMATIONS
3. REPRESENTATIONS UNDER CHANGE OF BASES

6 EIGENVALUES AND EIGENVECTORS
1. INTRODUCTION
2. RELATION BETWEEN EIGENVALUES AND MINORS
3. SIMILARITY
4. ALGEBRAIC AND GEOMETRIC MULTIPLICITIES
5. JORDAN CANONICAL FORM
6. FUNCTIONS OF MATRICES
7. APPLICATION: MARKOV CHAINS

7 INNER PRODUCT SPACES
1. INNER PRODUCTS
2. REPRESENTATION OF INNER PRODUCTS
3. ORTHOGONAL BASES
4. UNITARY EQUIVALENCE AND HERMITIAN MATRICES
5. CONGRUENCE AND CONJUNCTIVE EQUIVALENCE
6. CENTRAL CONICS AND QUADRICS
7. THE NATURAL INVERSE
8. NORMAL MATRICES

8 APPLICATIONS TO DIFFERENTIAL EQUATIONS
1. INTRODUCTION
2. HOMOGENEOUS DIFFERENTIAL EQUATIONS
3. LINEAR DIFFERENTIAL EQUATIONS: THE UNRESTRICTED CASE
4. LINEAR OPERATORS: THE GLOBAL VIEW

Answers
Symbols
Index
chapter 1

The Algebra of Matrices
1. MATRICES: DEFINITIONS

This book is entitled Matrices and Linear Algebra, and "linear" will be the most common mathematical term used here. This word has many related meanings, and now we shall explain what a linear equation is. An example of a linear equation is $3x_1 + 2x_2 = 5$, where $x_1$ and $x_2$ are unknowns. In general an equation is called linear if it is of the form

(1.1.1)
$$a_1x_1 + \cdots + a_nx_n = b,$$

where $x_1, \cdots, x_n$ are unknowns, and $a_1, \cdots, a_n$ and $b$ are numbers. Observe that in a linear equation, products such as $x_1x_2$ or $x_3^4$ and more general functions such as $\sin x_1$ do not occur. In elementary books a pair of equations such as
(1.1.2)
$$\begin{cases} 3x_1 - 2x_2 + 4x_3 = 1 \\ -x_1 + 5x_2 = -3 \end{cases}$$

is called a pair of simultaneous equations. We shall call such a pair a system of linear equations. Of course we may have more than three unknowns and more than two equations. Thus the most general system of m equations in n unknowns is
(1.1.3)
$$\begin{aligned} a_{11}x_1 + \cdots + a_{1n}x_n &= b_1 \\ \vdots \quad & \\ a_{m1}x_1 + \cdots + a_{mn}x_n &= b_m. \end{aligned}$$

The $a_{ij}$ are numbers, and the subscript $(i, j)$ denotes that $a_{ij}$ is the coefficient of $x_j$ in the ith equation. So far we have not explained what the coefficients of the unknowns are, but we have taken for granted that they are real numbers such as $2$, $\sqrt{2}$, or $\pi$. The coefficients could just as well be complex numbers. This case would arise if we considered the equations

$$\begin{aligned} ix_1 - (2+i)x_2 &= 1 \\ 2x_1 + (2-i)x_2 &= -i \\ x_1 + 2x_2 &= 3. \end{aligned}$$
Note that a real number is also a complex number (with imaginary part zero), but sometimes it is important to consider either all real numbers or all complex numbers. We shall denote the real numbers by R and the complex numbers by C. The reader who is familiar with abstract algebra will note that R and C are fields. In fact, most of our results could be stated for arbitrary fields. (A reader unfamiliar with abstract algebra should ignore the previous two sentences.) Although we are not concerned with such generality, to avoid stating most theorems twice we shall use the symbol F to stand for either the real numbers R or the complex numbers C. Of course we must be consistent in any particular theorem. Thus in any one theorem if F stands for the real numbers in any place, it must stand for the real numbers in all places. Where convenient we shall call F a number system.

In Chapter 2 we shall study systems of linear equations in greater detail. In this chapter we shall use linear equations only to motivate the concept of a matrix. Matrices will turn out to be extremely useful, not only in the study of linear equations but also in much else. If we consider the system of equations (1.1.2), we see that the arrays of coefficients

$$\begin{bmatrix} 3 & -2 & 4 \\ -1 & 5 & 0 \end{bmatrix}, \qquad \begin{bmatrix} 1 \\ -3 \end{bmatrix}$$

contain all the essential information. The horizontal array $[a_{i1} \;\; a_{i2} \;\; \cdots \;\; a_{in}]$
is called the ith row of A, and we shall denote it by $a_{i\bullet}$. Similarly, the vertical array

$$\begin{bmatrix} a_{1j} \\ \vdots \\ a_{mj} \end{bmatrix}$$

is called the jth column of A, and we shall often denote it by $a_{\bullet j}$. Observe that $a_{ij}$ is the entry of A occurring in the ith row and the jth column. If the matrix A has m rows and n columns it is called an m × n matrix. In particular, if m = n the matrix is called a square matrix. At times an n × n square matrix is referred to as a matrix of order n. Two other special cases are the m × 1 matrix, referred to as a column vector, and the 1 × n matrix, which is called a row vector. An example of the latter is
$$\begin{bmatrix} 2 & \pi i & -\tfrac{3}{4} + i\sqrt{5} \end{bmatrix}.$$

Usually we denote matrices by capital letters (A, B, and C), but sometimes the symbols $[a_{ij}]$, $[b_{kl}]$, and $[c_{pq}]$ are used. The entries of the matrix A will be denoted by $a_{ij}$, those of B by $b_{kl}$, and so forth. It is important to realize that matrices of different sizes are distinct objects, even when their entries agree. To emphasize this point, we make the following
• (1.1.6) DEFINITION Let A be an m × n matrix and B a p × q matrix. Then A = B if and only if m = p, n = q, and

$$a_{ij} = b_{ij}, \qquad i = 1, \cdots, m, \quad j = 1, \cdots, n.$$

Thus for each pair of integers m, n we may consider two sets of matrices. One is the set of all m × n matrices with entries from the set of real numbers R, and the other is the set of all m × n matrices with entries from the set of complex numbers C.
Then Ax = b is shorthand for

(1.1.7)
$$(2+i)x_1 + 0x_2 + (2-i)x_3 - 7x_4 = -2i.$$

In Section 3 we shall see that the left side of (1.1.7) may be read as the product of two matrices. Note that b and x are column vectors; b is a column vector with m elements and x is a column vector with n elements. This method of writing the linear equations concentrates attention upon the essential item, the coefficient array.
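For readers who want to experiment, this separation into a coefficient array and a right-hand side is easy to mirror on a computer. The following minimal Python sketch (our own illustration, not code from the book) stores the system (1.1.2) as A and b and evaluates the left sides of the equations at a candidate vector x:

```python
# The system (1.1.2), stored as its coefficient array A and
# right-hand side b, with the unknowns kept in a separate vector x.
A = [[3, -2, 4],    # 3x1 - 2x2 + 4x3 = 1
     [-1, 5, 0]]    # -x1 + 5x2      = -3
b = [1, -3]

def left_sides(A, x):
    """Evaluate a_i1*x1 + ... + a_in*xn for each equation i."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

x = [3, 0, -2]                  # a candidate assignment of the unknowns
print(left_sides(A, x) == b)    # True: this x solves the system
```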
EXERCISES

1. Find the matrices (A, b, x) corresponding to the following systems of equations.

(a) $2x_1 - 3x_2 = 4$; $4x_1 + 2x_2 = -6$.

(b) $7x_1 + 3x_2 - x_3 = 7$; $x_1 + x_2 = 8$; $19x_2 - x_3 = 17$.

(c) $2x + 3y - 5z + 7w = 11$; $y + 4z - 4w = 16$; $z + w = 5$.

(d) $2x + 3y = 6$; $z + 5w = 8$; $6x + 7w = 9$.

(e) $(3+2i)z_1 + (-2+4i)z_2 = 2+i$; $(4+4i)z_1 + (-7+7i)z_2 = 4-i$.

(f) $3z_1 + (4-4i)z_2 = 6$; $z_1 + (2+2i)z_2 = 7-i$.
2. What systems of equations correspond to the following pairs of matrices?

(a) $A = \begin{bmatrix} \cdot & \cdot \\ 1 & -2 \end{bmatrix}, \qquad b = \begin{bmatrix} \pi \\ -\sqrt{2} \end{bmatrix}$.

(b) $A = \begin{bmatrix} \cdot & \cdot \\ 7 & 2 \end{bmatrix}, \qquad b = \begin{bmatrix} \cdot \\ \cdot \end{bmatrix}$.

2. ADDITION AND SCALAR MULTIPLICATION OF MATRICES

We wish to see the effect on their corresponding matrices of adding two systems of equations. Consider the following two systems of equations with their corresponding matrices:
$$\begin{cases} 2x_1 - 3x_2 = 5 \\ 4x_1 + 5x_2 = 7 \end{cases} \qquad A = \begin{bmatrix} 2 & -3 \\ 4 & 5 \end{bmatrix}, \quad g = \begin{bmatrix} 5 \\ 7 \end{bmatrix},$$

$$\begin{cases} -\sqrt{2}\,x_1 + 3x_2 = 16 \\ -5x_1 - 7x_2 = 3 \end{cases} \qquad B = \begin{bmatrix} -\sqrt{2} & 3 \\ -5 & -7 \end{bmatrix}, \quad h = \begin{bmatrix} 16 \\ 3 \end{bmatrix}.$$

Adding the corresponding equations, we obtain a third system of equations,

$$\begin{cases} (2 + (-\sqrt{2}))x_1 + (-3+3)x_2 = 5 + 16 \\ (4 + (-5))x_1 + (5 + (-7))x_2 = 7 + 3, \end{cases}$$

and its matrices

$$C = \begin{bmatrix} 2-\sqrt{2} & -3+3 \\ 4-5 & 5-7 \end{bmatrix}, \qquad k = \begin{bmatrix} 5+16 \\ 7+3 \end{bmatrix}.$$

Here we see how C may be obtained directly from A and B without reference to the original system of equations as such. We simply add the entries in like position. Thus

$$c_{22} = a_{22} + b_{22} = 5 + (-7) = -2,$$

and we shall write C = A + B.
We shall define this sum A + B only when A and B have the same number of rows and columns. Two matrices with this property will be called additively conformable. For such matrices we make the following

• (1.2.1) DEFINITION If A and B are two m × n matrices, then the sum C = A + B is given by

$$c_{ij} = a_{ij} + b_{ij}, \qquad i = 1, \cdots, m, \quad j = 1, \cdots, n.$$
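Definition (1.2.1) translates directly into code. Here is a minimal Python sketch (the function name is ours, not the book's) that also enforces additive conformability:

```python
def add(A, B):
    """Entrywise sum per Definition (1.2.1): c_ij = a_ij + b_ij,
    defined only for additively conformable matrices."""
    if len(A) != len(B) or len(A[0]) != len(B[0]):
        raise ValueError("matrices are not additively conformable")
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

# The two coefficient matrices added in the text:
A = [[2, -3], [4, 5]]
B = [[-(2 ** 0.5), 3], [-5, -7]]
print(add(A, B))    # [[2 - sqrt(2), 0], [-1, -2]]
```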
• (1.2.5) DEFINITION OF SCALAR MULTIPLICATION Let A = $[a_{ij}]$ be an m × n matrix with entries in the number system F, and let α be an element of F. Then B = αA = Aα is defined by

$$b_{ij} = \alpha a_{ij}, \qquad i = 1, \cdots, m, \quad j = 1, \cdots, n.$$

Directly from the definition we have the following

• (1.2.6) THEOREM
(1) α(A + B) = αA + αB.
(2) (α + β)A = αA + βA.
(3) (αβ)A = α(βA).
(4) 1A = A.
To illustrate (1), let α = 2 and let A and B be additively conformable matrices. Computing α(A + B) and αA + αB entry by entry, we find that the two results agree; so in any such particular case α(A + B) = αA + αB. The proofs are straightforward and are left to the reader.

Although αA = Aα for α a scalar and A a matrix, if A is a row vector, we shall always write the product as αA. However, if B is a column vector, the product of B and α will be written as Bα.
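Part (1) of Theorem (1.2.6) can be replayed numerically for any particular choice of α, A, and B. A short Python sketch (the example matrices are our own, and scalar_mult is not a name from the book), reusing the add function given above:

```python
def scalar_mult(alpha, A):
    """B = alpha*A per Definition (1.2.5): b_ij = alpha * a_ij."""
    return [[alpha * a for a in row] for row in A]

alpha = 2
A = [[2, 0], [-1, 3]]
B = [[-1, 1], [1, -1]]

lhs = scalar_mult(alpha, add(A, B))                      # alpha(A + B)
rhs = add(scalar_mult(alpha, A), scalar_mult(alpha, B))  # alpha*A + alpha*B
print(lhs == rhs)    # True, as part (1) of Theorem (1.2.6) asserts
```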
EXERCISES

1. None of the following matrices are equal. Check! Which of them are additively conformable? Calculate the sum of all pairs of additively conformable matrices of the same order.

(a) $A = \cdots$  (b) $B = \cdots$  (c) $C = \cdots$  (d) $D = \cdots$  (e) $E = \cdots$  (f) $F = \cdots$  (g) $G = \cdots$  (h) $H = \cdots$
2. Add the following pairs of systems of equations by finding their matrices A, b, and x. After performing the matrix addition translate the result back into equation form. Check your result by adding the equations directly.

(a) $\begin{cases} 4x_1 + 7x_2 = 2 \\ -x_1 + 13x_2 = 3 \end{cases}$ and $\begin{cases} x_1 + x_2 = 0 \\ 4x_1 + 2x_2 = 17 \end{cases}$

(b) $\begin{cases} x_1 + 2x_2 - x_3 = 3 \\ 4x_1 + 2x_2 = 7 \\ 6x_1 + 3x_2 = 8 \end{cases}$ and $\begin{cases} 7x_2 + 8x_3 = -2 \\ x_1 - x_2 = 0 \\ x_1 - 3x_2 = 3 \end{cases}$

(c) $\begin{cases} 2x + y + z = 1 \\ x - 2y - z = 2 \\ x - y + z = 1 \end{cases}$ and $\begin{cases} -2x + 4y = 1 \\ \cdots \\ 14y - z = 3 \end{cases}$

(d) $\begin{cases} (3+2i)z_1 + (-2+4i)z_2 = 2+i \\ (4+4i)z_1 + (-7+7i)z_2 = 4-i \end{cases}$ and $\begin{cases} 3z_1 + (4-4i)z_2 = 6 \\ z_1 + (2+2i)z_2 = 7-i \end{cases}$

3. Check parts (3) and (4) of Theorem (1.2.2).

4. Prove part (5) of Theorem (1.2.2). [Hint: In this and exercise 3 use the definition of matrix addition as illustrated in the proof of part (2).]

5. Let C be an n × n matrix. The trace of C, tr(C), is defined to be the sum of the diagonal entries, that is, $\mathrm{tr}(C) = \sum_{i=1}^{n} c_{ii}$. Deduce the following results:
(a) tr(A + B) = tr(A) + tr(B).
(b) tr(kA) = k tr(A).

6. Find the scalar product of the given systems of equations with the scalar 3. First perform the multiplications directly. Then find the associated matrices and calculate the scalar product using the definition. Translate the result back into equations. Compare the respective answers.

(a) $\begin{cases} 4x_1 - 5x_2 + 7x_3 = 0 \\ 2x_1 + 6x_2 - x_3 = 1 \end{cases}$
(b) $\begin{cases} 2x_1 - 3x_2 + x_3 = \cdots \\ -x_1 + 2x_2 - x_3 = -4 \\ -3x_1 + 5x_3 = 12 \end{cases}$

7. Check the assertions following the definition of scalar multiplication.

8. For the system of equations

$$\begin{cases} iz_1 + 2iz_2 + (4-i)z_3 = 1 \\ z_1 + (2-i)z_2 + (1+i)z_3 = -i, \end{cases}$$

find the product of the corresponding matrix equation with the scalar $(1 - i)$.
3. MATRIX MULTIPLICATION

Suppose we now consider the system of equations

(1.3.1)
$$\begin{cases} y_1 = 3x_1 - 5x_2 \\ y_2 = 5x_1 + 3x_2 \end{cases}$$

so that if the product BA is to be defined we should have

(1.3.7)
$$BA = C = \begin{bmatrix} 3+5 & -5+3 \\ 3-10 & -5-6 \\ 0+15 & 0+9 \end{bmatrix}.$$
Observe that (1.3.3) was a system of three equations in the two unknowns $y_1$ and $y_2$, and that (1.3.1) had one equation for each of $y_1$ and $y_2$. Thus in our substitution the number of unknowns in (1.3.3) equals the number of equations in (1.3.1). In terms of matrices this means that the number of columns of B equals the number of rows of A. Further, after the substitution has been carried out in (1.3.6), it is clear that the number of equations equals the number of equations in (1.3.3), while the number of unknowns is the same as in (1.3.1). Thus our new matrix BA will have the same number of rows as B and the same number of columns as A.

With this in mind we shall call an m × n matrix B and an n′ × p matrix A multiplicatively conformable if and only if n = n′, that is, if the number of columns of B equals the number of rows of A. We shall define multiplication only for multiplicatively conformable matrices. Further, BA will be an m × p matrix; that is, BA will have as many rows as B and as many columns as A.

Keeping (1.3.7) and subsequent remarks in mind we make the following

• (1.3.8) DEFINITION Let B be an m × n matrix and A be an n × p matrix, so that B has as many columns as A has rows. Let

$$B = \begin{bmatrix} b_{11} & \cdots & b_{1n} \\ \vdots & & \vdots \\ b_{m1} & \cdots & b_{mn} \end{bmatrix}, \qquad A = \begin{bmatrix} a_{11} & \cdots & a_{1p} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{np} \end{bmatrix}.$$

Then the product

$$BA = C = \begin{bmatrix} c_{11} & \cdots & c_{1p} \\ \vdots & & \vdots \\ c_{m1} & \cdots & c_{mp} \end{bmatrix}$$

is given by $c_{ij} = b_{i1}a_{1j} + b_{i2}a_{2j} + \cdots + b_{in}a_{nj}$.
We can immediately verify the following special cases:
(1) row vector × matrix = row vector.
(2) matrix × column vector = column vector.
(3) row vector × column vector = 1 × 1 matrix (a scalar).
(4) column vector × row vector = matrix.

• (1.3.12) EXAMPLE Let

$$B = \begin{bmatrix} 1 & 1 & 2 \\ 1 & 2 & 3 \\ 1 & 4 & 9 \end{bmatrix}, \qquad A = \begin{bmatrix} 0 & 1 \\ 1 & -1 \\ 2 & 0 \end{bmatrix},$$

and C = BA. Then

$$C = \begin{bmatrix} 1\cdot 0 + 1\cdot 1 + 2\cdot 2 & 1\cdot 1 + 1\cdot(-1) + 2\cdot 0 \\ 1\cdot 0 + 2\cdot 1 + 3\cdot 2 & 1\cdot 1 + 2\cdot(-1) + 3\cdot 0 \\ 1\cdot 0 + 4\cdot 1 + 9\cdot 2 & 1\cdot 1 + 4\cdot(-1) + 9\cdot 0 \end{bmatrix} = \begin{bmatrix} 5 & 0 \\ 8 & -1 \\ 22 & -3 \end{bmatrix}.$$
We remark that C can be obtained from B and A by a row into column multiplication. Thus $c_{11}$ is the sum of the entrywise products going across the first row of B and down the first column of A, so in (1.3.12)

$$c_{11} = 1\cdot 0 + 1\cdot 1 + 2\cdot 2 = 5 = b_{1\bullet}a_{\bullet 1},$$

where $b_{1\bullet}$ is the first row of B and $a_{\bullet 1}$ is the first column of A. Similarly, $c_{32}$ can be obtained by going across the third row of B and down the second column of A. Again, in (1.3.12) we have

$$c_{32} = 1\cdot 1 + 4\cdot(-1) + 9\cdot 0 = -3 = b_{3\bullet}a_{\bullet 2}.$$

In general

$$(1.3.13) \qquad c_{ij} = \begin{bmatrix} b_{i1} & b_{i2} & \cdots & b_{in} \end{bmatrix} \begin{bmatrix} a_{1j} \\ a_{2j} \\ \vdots \\ a_{nj} \end{bmatrix} = b_{i\bullet}a_{\bullet j}.$$

The formula (1.3.13) holds not just in this one example but whenever the product C = BA is defined.
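Formula (1.3.13) is exactly a triple loop. The minimal Python sketch below (our own implementation, not code from the book) recomputes Example (1.3.12):

```python
def matmul(B, A):
    """C = BA per (1.3.13): c_ij = b_i1*a_1j + ... + b_in*a_nj,
    defined only for multiplicatively conformable B (m x n), A (n x p)."""
    m, n, p = len(B), len(A), len(A[0])
    if len(B[0]) != n:
        raise ValueError("matrices are not multiplicatively conformable")
    return [[sum(B[i][k] * A[k][j] for k in range(n)) for j in range(p)]
            for i in range(m)]

B = [[1, 1, 2], [1, 2, 3], [1, 4, 9]]
A = [[0, 1], [1, -1], [2, 0]]
print(matmul(B, A))    # [[5, 0], [8, -1], [22, -3]], as in Example (1.3.12)
```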
• (1.3.17) THEOREM If C = BA, then the jth column of C is a linear combination of the columns of B with coefficients from the jth column of A.

The reader should also check that in Example (1.3.12), and in general,

$$C = b_{\bullet 1}a_{1\bullet} + \cdots + b_{\bullet n}a_{n\bullet} = \sum_{k=1}^{n} b_{\bullet k}a_{k\bullet}.$$

Note that for each k, $b_{\bullet k}a_{k\bullet}$ is an m × p matrix.

Let us now return to some properties of matrix multiplication. We can now prove

• (1.3.18) THEOREM For all matrices A, B, and C, and any scalar α:
(1) A(BC) = (AB)C.
(2) A(B + C) = AB + AC.
(3) (A + B)C = AC + BC.
(4) α(AB) = (αA)B.
whenever all the products are defined.

PROOF OF (1) Let AB = D, BC = G, (AB)C = F, and A(BC) = H. We must show that F = H. Now

$$f_{ij} = \sum_k d_{ik}c_{kj} = \sum_k \Big(\sum_l a_{il}b_{lk}\Big)c_{kj} = \sum_l a_{il}\Big(\sum_k b_{lk}c_{kj}\Big) = \sum_l a_{il}g_{lj} = h_{ij}.$$

Hence F = H. The proofs of (2), (3), and (4) are similar and are left to the reader as an exercise.
EXERCISES

1. Check the following products.
(a) $\cdots$  (b) $\cdots$  (c) $\cdots$  (d) $\cdots$

2. Which of the matrices in exercise 1 of Section 2 are multiplicatively conformable? Calculate three such products.

3. Check parts (2) and (3) of Theorem (1.3.18). Use the definition.

4. Compute the following products:
(a) $\cdots$  (b) $\cdots$
$a_{r\bullet} \to a_{s\bullet}$ and $a_{s\bullet} \to a_{r\bullet}$ (read: row $a_{r\bullet}$ becomes row $a_{s\bullet}$ and row $a_{s\bullet}$ becomes row $a_{r\bullet}$).

Type II The multiplication of any row by a nonzero scalar λ, denoted symbolically by $a_{r\bullet} \to \lambda a_{r\bullet}$.

In the general case a matrix of type II is of the form

$$E_{II} = \begin{bmatrix} 1 & & & & \\ & \ddots & & & \\ & & \lambda & & \\ & & & \ddots & \\ & & & & 1 \end{bmatrix},$$

where λ ≠ 0 occurs in the rth row, and in the special example m = 4, r = 2,

$$E_{II} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & \lambda & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}.$$

In the general situation a matrix of type III is of the form

$$E_{III} = \begin{bmatrix} 1 & & & & \\ & \ddots & & \lambda & \\ & & \ddots & & \\ & & & \ddots & \\ & & & & 1 \end{bmatrix},$$

where λ occurs in the rth row of the sth column. Again for m = 4, r = 2, s = 3, we have

$$E_{III} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & \lambda & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}.$$
• (2.2.1) LEMMA $E_I$, $E_{II}$, and $E_{III}$ are nonsingular. In fact, $E_I^{-1} = E_I$,

$$E_{II}^{-1} = \begin{bmatrix} 1 & & & & \\ & \ddots & & & \\ & & 1/\lambda & & \\ & & & \ddots & \\ & & & & 1 \end{bmatrix}, \qquad E_{III}^{-1} = \begin{bmatrix} 1 & & & & \\ & \ddots & & -\lambda & \\ & & \ddots & & \\ & & & \ddots & \\ & & & & 1 \end{bmatrix}.$$

Thus the inverse of an elementary matrix of type I, II, or III is again an elementary matrix of the same type.

PROOF The proof is by direct computation and is left to the reader. If $E_{III}^{-1}$ is the matrix defined above, then the reader should check that $E_{III}E_{III}^{-1} = I = E_{III}^{-1}E_{III}$.

Remark Lemma (2.2.1) can easily be interpreted in terms of elementary row operations. For example, since $E_I^{-1} = E_I$, $E_I(E_IA) = A$. This asserts that if we interchange two rows of A to obtain B, then interchange the same two rows of B, we again obtain A.
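Lemma (2.2.1) invites exactly the direct computation it mentions. Below is a small, self-contained Python sketch (the constructor name is ours, and rows are indexed from 0 rather than 1) that builds an elementary matrix of type III and checks the claimed inverse:

```python
def identity(m):
    return [[1 if i == j else 0 for j in range(m)] for i in range(m)]

def matmul(B, A):
    return [[sum(B[i][k] * A[k][j] for k in range(len(A)))
             for j in range(len(A[0]))] for i in range(len(B))]

def E_III(m, r, s, lam):
    """Type III: lam in position (r, s), i.e. add lam times row s to row r."""
    E = identity(m)
    E[r][s] = lam
    return E

E = E_III(4, 1, 2, 5)         # lambda = 5 in row 1, column 2 (0-indexed)
E_inv = E_III(4, 1, 2, -5)    # the claimed inverse: -lambda in (1, 2)
print(matmul(E, E_inv) == identity(4))    # True
print(matmul(E_inv, E) == identity(4))    # True
```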
• (2.2.3) LEMMA Let A, B, and C denote m × n matrices. Then
(1) For all A, A ~ A.
(2) If A ~ B, then B ~ A.
(3) If A ~ B, and B ~ C, then A ~ C.

PROOF
(1) A = IA, and since I is an elementary matrix, part (1) follows.
(2) Let A ~ B. Then for suitable $E_1, \cdots, E_t$ we have B = PA, where $P = E_tE_{t-1}\cdots E_1$. It follows from Lemma (1.6.5) that P is nonsingular and $P^{-1} = E_1^{-1}E_2^{-1}\cdots E_t^{-1}$. As already noted, the inverse of each elementary matrix is an elementary matrix. Thus $A = P^{-1}B$ and B ~ A.
(3) Let A ~ B, B ~ C. Then B = PA and C = QB, where $P = E_s\cdots E_1$ and $Q = E_t\cdots E_{s+1}$. Hence $QP = E_t\cdots E_1$ and C = QPA. Thus A ~ C.

In view of this lemma we may say that A and B are row equivalent rather than A is row equivalent to B. Using Lemma (2.2.3) we can now prove a

• (2.2.4) THEOREM Let A ~ B, say B = PA, P a product of elementary matrices. Then the systems of equations Ax = b and PAx = Pb are equivalent. [Recall Definition (2.1.5) of equivalent systems of equations.]

PROOF Let c be a solution of Ax = b; that is, Ac = b. Obviously, PAc = Pb. Conversely, suppose PAc = Pb. P is nonsingular [see the proof of Lemma (2.2.3)]. Therefore $P^{-1}(PAc) = P^{-1}(Pb)$, or Ac = b.
EXERCISES

1. (a) Suppose $A = [a_{ij}]$ is an m × n matrix, and suppose that $B_1$, $B_2$, $B_3$ are each obtained from A by performing the elementary row operations $a_{r\bullet} \to a_{s\bullet}$ and $a_{s\bullet} \to a_{r\bullet}$; $a_{r\bullet} \to \lambda a_{r\bullet}$; and $a_{r\bullet} \to a_{r\bullet} + \lambda a_{s\bullet}$, respectively. Show that there are elementary matrices $E_I$, $E_{II}$, and $E_{III}$ of types I, II, and III, respectively, such that $B_1 = E_IA$, $B_2 = E_{II}A$, and $B_3 = E_{III}A$.
(b) Conversely, suppose $C_1 = E_IA$, $C_2 = E_{II}A$, and $C_3 = E_{III}A$, where $E_I$, $E_{II}$, and $E_{III}$ are elementary matrices of types I, II, and III, respectively. Show that $C_1$, $C_2$, and $C_3$ can each be obtained from A by an elementary row operation of types I, II, and III, respectively.
5. Let

$$D = \cdots, \quad E_1 = \cdots, \quad D' = \cdots, \quad E_2 = \cdots, \quad E_3 = \cdots,$$

and verify that $I = E_1D$ and $I = (E_3E_2)D'$. Does $DE_1 = I = D'(E_3E_2)$? Are D and D′ nonsingular?

6. Let D be an n × n diagonal matrix. Show that D is nonsingular if and only if $d_{11}d_{22}\cdots d_{nn} \neq 0$. (Hint: If $d_{11}d_{22}\cdots d_{nn} \neq 0$, can $d_{ii} = 0$? What is $D^{-1}$? Write $D^{-1}$ as the product of elementary matrices.)

7. Prove that elementary matrices of type II commute. Find two elementary matrices of type I that do not commute; find two elementary matrices of type III that do not commute.
8. Let

$$A = \cdots \qquad\text{and}\qquad B = \cdots.$$

Show that $A \sim B$. (Hint: Recall exercise 2.) Show also that $A^T \not\sim B^T$ ($\not\sim$ is read "not row equivalent"). The matrix $A^T$ is called the transpose of A and is obtained by interchanging the rows and columns of A.

9. (a) Obtain an elementary matrix of type I by premultiplying I by a sequence of four elementary matrices, $E_4$, $E_3$, $E_2$, and $E_1$, where $E_4$ is of type II and the others are of type III.
(b) Generalize (a): Show that any m × m elementary matrix of type I is a product $E_4E_3E_2E_1$, where $E_4$ is of type II and the others are of type III.
10. Prove the following.
(a) If A ~ I, then $A^{-1}$ ~ I (assume $A^{-1}$ exists).
(b) If A ~ I, then $A^T$ ~ I.

11. Prove or disprove (with a counterexample) this statement: If A and B are multiplicatively conformable matrices in row echelon form, then AB is in row echelon form.
3. ROW ECHELON FORM

We have seen one example of a matrix in row echelon form, the coefficient matrix in equation (2.1.9). Another is

$$\begin{bmatrix} 1 & \cdot & \cdot & \cdot & \cdot \\ 0 & 0 & 1 & \cdot & \cdot \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}.$$

We shall now show that any m × n matrix is row equivalent to a matrix in row echelon form. The proof will be accomplished by actually exhibiting a step-by-step process, using row operations, that systematically reduces A to the prescribed form. Such a step-by-step process is called an algorithm. (This particular algorithm is called Gaussian elimination.) Roughly speaking, we could program this process on a computer in such a way that, given the matrix A, the computer would print out the row echelon form of A. Those readers who have some experience with computers will recognize the following as an informal flow chart for the algorithm.
For convenience, the following abbreviations will be used in the chart: REF, row echelon form; and ero, elementary row operation. For readers who have not seen a flow chart before, we may briefly explain that the flow chart describes how the machine repeats a sequence of operations. We shall call each sequence a step. Initially k = 1. At the end of the step the machine reaches the instruction "put k := k + 1," and this will mean that in step 2, k = 2, and so on. Observe also that several instructions change the matrices $A_k$ and A. Using a convention observed by computers, we again call the new matrices so obtained $A_k$ and A. We shall use the same convention in the formal proof below.

START (put k = 1 and $A_1$ = A).
1. Does $A_k = 0$? If YES, A is in REF: STOP. If NO, continue.
2. Find the first nonzero column of $A_k$, say column p.
3. Find the first row of $A_k$ having a nonzero entry in column p.
4. Put this row first in $A_k$ (ero of type I).
5. Make the leading entry of the first row of $A_k$ equal to 1 (ero of type II).
6. Reduce all other entries of column p of A to zero by subtracting appropriate multiples of the first row of $A_k$ from the other rows of A (ero of type III).
7. Does k = m? If YES, A is in REF: STOP. If NO, partition A as $\begin{bmatrix} B \\ A_{k+1} \end{bmatrix}$, where B has k rows; put k := k + 1 and return to 1.

We now give an example to illustrate the algorithm. Let

$$A = \begin{bmatrix} 0 & 0 & 3 & -4 \\ 0 & -1 & 7 & -1 \\ 0 & -1 & 10 & -5 \end{bmatrix}.$$
Step 1
(a) and (b) We let k = 1 and put $A_1 = A$.
(c) Does $A_1 = 0$? No.
(d) The first nonzero column of $A_1$ is p = 2.
(e) The first row of $A_1$ having a nonzero entry in column 2 is row 2.
(f) Perform the operations $a_{1\bullet} \to a_{2\bullet}$ and $a_{2\bullet} \to a_{1\bullet}$. We now have

$$A = \begin{bmatrix} 0 & -1 & 7 & -1 \\ 0 & 0 & 3 & -4 \\ 0 & -1 & 10 & -5 \end{bmatrix}.$$

(g) Perform the operation $a_{1\bullet} \to -a_{1\bullet}$, so that

$$A = \begin{bmatrix} 0 & 1 & -7 & 1 \\ 0 & 0 & 3 & -4 \\ 0 & -1 & 10 & -5 \end{bmatrix}.$$

(h) Perform the operation $a_{3\bullet} \to a_{3\bullet} + a_{1\bullet}$. Then

$$A = \begin{bmatrix} 0 & 1 & -7 & 1 \\ 0 & 0 & 3 & -4 \\ 0 & 0 & 3 & -4 \end{bmatrix}.$$

(i) Does 1 = 3? No.
(j) Partition A,

$$A = \begin{bmatrix} a_{1\bullet} \\ A_2 \end{bmatrix}, \qquad A_2 = \begin{bmatrix} 0 & 0 & 3 & -4 \\ 0 & 0 & 3 & -4 \end{bmatrix},$$

and we loop back to (c) to begin step 2.

Step 2
(c) Does $A_2 = 0$? No.
(d) The first nonzero column of $A_2$ is p = 3.
(e) The first row of $A_2$ having a nonzero entry in column 3 is row 1.
(f) This operation is unnecessary.
(g) Perform the operation $a_{2\bullet} \to \tfrac{1}{3}a_{2\bullet}$ (the first row of $A_2$ is the second row of A), so that

$$A = \begin{bmatrix} 0 & 1 & -7 & 1 \\ 0 & 0 & 1 & -\tfrac{4}{3} \\ 0 & 0 & 3 & -4 \end{bmatrix}.$$

(h) Perform the operations $a_{1\bullet} \to a_{1\bullet} + 7a_{2\bullet}$ and $a_{3\bullet} \to a_{3\bullet} - 3a_{2\bullet}$. Thus

$$A = \begin{bmatrix} 0 & 1 & 0 & -\tfrac{25}{3} \\ 0 & 0 & 1 & -\tfrac{4}{3} \\ 0 & 0 & 0 & 0 \end{bmatrix}.$$

(i) Does 2 = 3? No.
(j) Partition A,

$$A = \begin{bmatrix} B \\ A_3 \end{bmatrix}, \qquad A_3 = \begin{bmatrix} 0 & 0 & 0 & 0 \end{bmatrix}.$$

(k) Now k = 3, and we loop back to (c) to begin step 3.

Step 3
(c) Does $A_3 = 0$? Yes.
(d) Stop. A is in row echelon form.
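The flow chart mechanizes readily. The sketch below is our own Python rendering of the algorithm (not code from the book), using exact fractions so the arithmetic matches the hand computation; applied to the matrix A of the example, it reproduces the row echelon form found above.

```python
from fractions import Fraction

def row_echelon(A):
    """Reduce A to row echelon form following the flow chart:
    rows 0..k-1 are finished, and A_k is the block of rows k..m-1."""
    A = [[Fraction(x) for x in row] for row in A]
    m, n = len(A), len(A[0])
    k = 0
    while k < m:
        # Find the first nonzero column p of A_k and its first nonzero row.
        pivot = next(((i, p) for p in range(n) for i in range(k, m)
                      if A[i][p] != 0), None)
        if pivot is None:                        # A_k = 0: A is in REF; stop
            break
        i, p = pivot
        A[k], A[i] = A[i], A[k]                  # ero of type I
        A[k] = [x / A[k][p] for x in A[k]]       # ero of type II
        for r in range(m):                       # ero of type III
            if r != k and A[r][p] != 0:
                A[r] = [x - A[r][p] * y for x, y in zip(A[r], A[k])]
        k += 1
    return A

A = [[0, 0, 3, -4], [0, -1, 7, -1], [0, -1, 10, -5]]
for row in row_echelon(A):
    print([str(x) for x in row])
# ['0', '1', '0', '-25/3'], ['0', '0', '1', '-4/3'], ['0', '0', '0', '0']
```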
We shall now state the theorem formally and give a proof based on the algorithm.

• (2.3.1) THEOREM Any m × n matrix A is row equivalent to a matrix in row echelon form.

PROOF
Step 1. If A = 0, then A is in row echelon form. So we may assume $A \neq 0$. Suppose $a_{\bullet p}$ is the first nonzero column of A (often p = 1) and that $a_{ip}$ is the first nonzero element in the column. Perform the operation $a_{i\bullet} \to a_{1\bullet}$ and $a_{1\bullet} \to a_{i\bullet}$. Again, if i = 1, this operation of type I is unnecessary. Next, perform the operation $a_{1\bullet} \to (a_{1p})^{-1}a_{1\bullet}$. Finally, calling the new first row $a_{1\bullet}$, perform the operations $a_{i\bullet} \to a_{i\bullet} - (a_{ip})a_{1\bullet}$, for $i \neq 1$, $1 < i \leq m$. In the new matrix A we observe that $a_{1p} = 1$, $a_{ip} = 0$ for $i \neq 1$, and l(1) = p. Partition A, separating the finished first row from the remaining rows, and note that the first l(k) columns of $A_{k+1}$ are zero. Thus the conditions for starting the next step are satisfied. We carry out r steps of this algorithm until either $A_{r+1} = 0$ or r = m. The resulting matrix, which will now be denoted by $A_R$, is in row echelon form and is row equivalent to the original matrix.

Remark A square matrix in row echelon form is obviously in upper triangular form (see exercise 6 of Section 1.3 for the definition). Thus the algorithm of (2.3.1) gives a method for reducing every matrix to upper triangular form by elementary row operations. However, if it is merely desired to reduce A to upper triangular form, not necessarily to row echelon form, then several steps of the algorithm we have given may be omitted (see exercise 5 of this section).

• (2.3.2) CAUTION Theorem (2.3.1) shows that for each matrix A there exists a matrix $A_R$ in row echelon form such that $A \sim A_R$. In Chapter 3 we shall show that $A_R$ is unique; that is, there is just one matrix in row echelon form row equivalent to A. Thus we shall be entitled to call $A_R$ the row echelon form of A. We shall use this terminology in the rest of this chapter, although the uniqueness of $A_R$ has not been proved. A reader concerned with logical precision should read Chapter 3 through Section 6 before continuing with the rest of this chapter. However, we recommend this course only for those readers with previous experience in linear algebra. Chapter 3 is rather abstract, and the reader will be aided in understanding it by a familiarity with the solution of linear equations, which we shall discuss in Section 4.

EXERCISES

1. Following the algorithm, reduce the following matrices to row echelon form:
(a) $\cdots$  (b) $\cdots$  (c) $\cdots$  (d) $\cdots$
4. HOMOGENEOUS SYSTEMS OF EQUATIONS

• (2.4.1) DEFINITION The system of equations Ax = b is homogeneous if and only if b = 0. We then write Ax = 0.

Let Ax = 0 be a homogeneous system of linear equations, and let $A_R$ be the row echelon form of A, which exists by Theorem (2.3.1). By Theorem (2.2.4) the system of equations $A_Rx = 0$ is equivalent to Ax = 0. Consequently we may assume from the outset that A itself is in row echelon form. It is clear that the system Ax = 0 always has at least one solution, x = 0. We call this solution the trivial solution, and it is usually of no interest. Before considering the general case, let us inspect an example:

(2.4.2)
$$Ax = \begin{bmatrix} 1 & 2 & 0 & 0 & 0 & -3 \\ 0 & 0 & 1 & 2 & 0 & -1 \\ 0 & 0 & 0 & 0 & 1 & -2 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} x_{l(1)} \\ x_{z(1)} \\ x_{l(2)} \\ x_{z(2)} \\ x_{l(3)} \\ x_{z(3)} \end{bmatrix} = 0.$$

If we write out the corresponding equations, we can separate the $x_{l(i)}$ from the $x_{z(j)}$. By putting the $x_{l(i)}$ on the left and the $x_{z(j)}$ on the right in the equations, and transforming to matrix-vector form, we obtain

(2.4.3)
$$\begin{bmatrix} x_{l(1)} \\ x_{l(2)} \\ x_{l(3)} \end{bmatrix} = -C\begin{bmatrix} x_{z(1)} \\ x_{z(2)} \\ x_{z(3)} \end{bmatrix} = -\begin{bmatrix} 2 & 0 & -3 \\ 0 & 2 & -1 \\ 0 & 0 & -2 \end{bmatrix}\begin{bmatrix} x_{z(1)} \\ x_{z(2)} \\ x_{z(3)} \end{bmatrix}.$$

It is easily checked that (2.4.2) and (2.4.3) yield the same solutions. In (2.4.3) call

$$\begin{bmatrix} x_{l(1)} \\ x_{l(2)} \\ x_{l(3)} \end{bmatrix} \quad\text{the } l \text{ vector} \qquad\text{and}\qquad \begin{bmatrix} x_{z(1)} \\ x_{z(2)} \\ x_{z(3)} \end{bmatrix} \quad\text{the } z \text{ vector}.$$
• (2.4.6) RESULT Let Ax = 0 be a homogeneous system of m linear equations in n unknowns and suppose that A is in row echelon form with r nonzero rows. Then the general solution of the system Ax = 0 has t = n − r arbitrary parameters that may be chosen to be the components of the z vector. The elements of the solution are linear expressions in the parameters.

To restate our results for matrices A that need not be in row echelon form we need a

• (2.4.7) TEMPORARY DEFINITION Let A be an m × n matrix and let $A_R$ be the row echelon form of A. The rank r of A is the number of nonzero rows of $A_R$.

We remark in passing that $r \leq \min(m, n)$. This fact will be useful later. The logically precise will have observed that this definition uses the uniqueness of the row echelon form of A. Uniqueness will be proved in Chapter 3, and this definition will be replaced by a better one.

By (2.3.1) every system of equations Ax = 0, where A need not be in row echelon form, is equivalent to the system $A_Rx = 0$, where $A_R$ is the row echelon form of A. Hence with this definition we are in a position to state the result (we do not call it a theorem, as a more precise version will appear in Chapter 3).

• (2.4.8) RESULT Let Ax = 0 be a system of m homogeneous linear equations in n unknowns. If the rank of A is r and t = n − r, then the general solution of the system has t arbitrary parameters. The elements of the solution are linear expressions in the parameters.

We shall now elucidate the last sentence of (2.4.6). We know that for each choice of z vector, there is but one solution of our homogeneous system of equations. Suppose we choose $x_{z(i)} = \gamma_i$ and apply the corollary; it will follow that a solution is

$$s = s^1\gamma_1 + \cdots + s^t\gamma_t.$$

This solution is the general solution by virtue of (2.4.6). For the sake of reference, we shall call the solutions $s^1, \cdots, s^t$ a set of basic solutions of the system Ax = 0. We can sum this up in the following

• (2.4.11) RESULT Let Ax = 0 be a set of m homogeneous equations in n unknowns. Let the rank of A be r; let $\{s^1, \cdots, s^t\}$, where t = n − r, be the set of basic solutions; and let $\gamma_1, \cdots, \gamma_t$ be scalars. Then the solution set S consists of all vectors s of the form

$$s = s^1\gamma_1 + \cdots + s^t\gamma_t.$$
Consider the system of equations Ax = 0, whose coefficient matrix has the row echelon form

$$A_R = \begin{bmatrix} 1 & -3 & 0 & 2 \\ 0 & 0 & 1 & -1 \end{bmatrix}.$$

Consequently, l(1) = 1, l(2) = 3, z(1) = 2, and z(2) = 4. Writing this system in the form of (2.4.5) we have

$$\begin{bmatrix} x_{l(1)} \\ x_{l(2)} \end{bmatrix} = -C\begin{bmatrix} x_{z(1)} \\ x_{z(2)} \end{bmatrix} = -\begin{bmatrix} -3 & 2 \\ 0 & -1 \end{bmatrix}\begin{bmatrix} x_{z(1)} \\ x_{z(2)} \end{bmatrix}.$$
70
LINEAR EQUATIONS
The general solution is
If we take 'Y 1
1, 'Y2
=
-2 we obtain a particular solution
[J
x•�
while if we take 'YI
=
2 and
'Y2
-3, we obtain a particular solution
=
It is not hard to prove that the general solution of our equations can be written in the form
For instance,
s• = x'(-3) + x2(2) and
We next treat the special case m
r :;: n.
;;:: n for this situation to exist.
s2
=
Since
r
x'(-2) + x2(1). � min(m, n) we must havt