Solutions Manual to Advanced Linear Algebra

by

Bruce N. Cooperstein

K23692_SM_Cover.indd 1

02/06/15 3:11 pm



Boca Raton London New York

CRC Press is an imprint of the Taylor & Francis Group, an informa business


CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742

© 2015 by Taylor & Francis Group, LLC. CRC Press is an imprint of Taylor & Francis Group, an Informa business.

No claim to original U.S. Government works
Printed on acid-free paper
Version Date: 20150506
International Standard Book Number-13: 978-1-4822-4888-3 (Ancillary)

This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may rectify in any future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.

For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged.

Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.
Visit the Taylor & Francis Web site at http://www.taylorandfrancis.com and the CRC Press Web site at http://www.crcpress.com


Contents

1  Vector Spaces
   1.1  Fields
   1.2  The Space Fⁿ
   1.3  Introduction to Vector Spaces
   1.4  Subspaces of Vector Spaces
   1.5  Span and Independence
   1.6  Bases and Finite Dimensional Vector Spaces
   1.7  Bases of Infinite Dimensional Vector Spaces
   1.8  Coordinate Vectors

2  Linear Transformations
   2.1  Introduction to Linear Transformations
   2.2  Range and Kernel of a Linear Transformation
   2.3  Correspondence and Isomorphism Theorems
   2.4  Matrix of a Linear Transformation
   2.5  The Algebra of L(V, W) and Mmn(F)
   2.6  Invertible Transformations and Matrices

3  Polynomials
   3.1  The Algebra of Polynomials
   3.2  Roots of Polynomials

4  Theory of a Single Linear Operator
   4.1  Invariant Subspaces of an Operator
   4.2  Cyclic Operators
   4.3  Maximal Vectors
   4.4  Indecomposable Linear Operators
   4.5  Invariant Factors and Elementary Divisors of a Linear Operator
   4.6  Canonical Forms
   4.7  Linear Operators on Real and Complex Vector Spaces

5  Inner Product Spaces
   5.1  Inner Products
   5.2  The Geometry of Inner Product Spaces
   5.3  Orthonormal Sets and the Gram-Schmidt Process
   5.4  Orthogonal Complements and Projections
   5.5  Dual Spaces
   5.6  Adjoints
   5.7  Normed Vector Spaces

6  Linear Operators on Inner Product Spaces
   6.1  Self-Adjoint Operators
   6.2  Spectral Theorems
   6.3  Normal Operators on Real Inner Product Spaces
   6.4  Unitary and Orthogonal Operators
   6.5  Positive Operators, Polar Decomposition and Singular Value Decomposition

7  Trace and Determinant of a Linear Operator
   7.1  Trace of a Linear Operator
   7.2  Determinants
   7.3  Uniqueness of the Determinant

8  Bilinear Maps and Forms
   8.1  Basic Properties of Bilinear Maps
   8.2  Symplectic Space
   8.3  Quadratic Forms and Orthogonal Space
   8.4  Orthogonal Space, Characteristic Two
   8.5  Real Quadratic Forms

9  Sesquilinear Forms and Unitary Spaces
   9.1  Basic Properties of Sesquilinear Forms
   9.2  Unitary Space

10  Tensor Products
   10.1  Introduction to Tensor Products
   10.2  Properties of Tensor Products
   10.3  The Tensor Algebra
   10.4  The Symmetric Algebra
   10.5  Exterior Algebra
   10.6  Clifford Algebra

11  Linear Groups and Groups of Isometries
   11.1  Linear Groups
   11.2  Symplectic Groups
   11.3  Orthogonal Groups, Characteristic Not Two
   11.4  Unitary Groups

12  Additional Topics
   12.1  Operator and Matrix Norms
   12.2  Moore-Penrose Inverse
   12.3  Nonnegative Matrices
   12.4  Location of Eigenvalues
   12.5  Functions of Matrices

13  Applications of Linear Algebra
   13.1  Least Squares
   13.2  Error Correcting Codes
   13.3  Ranking Web Pages

Chapter 1

Vector Spaces

1.1. Fields

1. Let z = a + bi, w = c + di with a, b, c, d ∈ R. Then |z|² = a² + b², |w|² = c² + d², and |z|²|w|² = (a² + b²)(c² + d²) = a²c² + a²d² + b²c² + b²d². On the other hand, zw = (ac − bd) + (ad + bc)i and |zw|² = (ac − bd)² + (ad + bc)² = a²c² + b²d² − 2abcd + a²d² + b²c² + 2abcd = a²c² + b²d² + a²d² + b²c². Thus |zw|² = |z|²|w|².

2. Part ii) of Theorem (1.1) follows from part i). Part iii) of Theorem (1.1): z z̄ = (a + bi)(a − bi) = a² − abi + (bi)a − (bi)(bi) = a² + b² = |z|².

3. Since C is a field and Q[i] is a subset of C it suffices to prove the following:

(i) If x, y ∈ Q[i] then x + y ∈ Q[i];
(ii) If x ∈ Q[i] then −x ∈ Q[i];
(iii) If x, y ∈ Q[i] then xy ∈ Q[i]; and
(iv) If x ∈ Q[i], x ≠ 0, then 1/x ∈ Q[i].

(i) We can write x = a + bi, y = c + di where a, b, c, d ∈ Q. Then a + c, b + d ∈ Q and consequently x + y = (a + c) + (b + d)i ∈ Q[i].

(ii) If x = a + bi with a, b ∈ Q then −a, −b ∈ Q and −(a + bi) = −a − bi ∈ Q[i].

(iii) If x = a + bi, y = c + di with a, b, c, d ∈ Q then ac, ad, bc, bd ∈ Q and therefore (a + bi)(c + di) = [ac − bd] + [ad + bc]i ∈ Q[i].

(iv) Assume that x = a + bi with a, b ∈ Q not both zero. Then in C we have (a + bi)⁻¹ = a/(a² + b²) − [b/(a² + b²)]i. However, if a, b ∈ Q then a² + b² ∈ Q, whence a/(a² + b²) and b/(a² + b²) are in Q. Thus x⁻¹ ∈ Q[i].

4. Write z = a + bi, w = c + di. Then z + w = (a + c) + (b + d)i, so the conjugate of z + w is (a + c) − (b + d)i. On the other hand, z̄ = a − bi, w̄ = c − di, so z̄ + w̄ = (a − bi) + (c − di) = (a + c) + [(−b) + (−d)]i = (a + c) − (b + d)i. Hence the conjugate of z + w equals z̄ + w̄.

5. This follows from part ii. of Theorem (1.1.1) since if c is a real number then c̄ = c. Therefore the conjugate of cz equals c̄ z̄ = c z̄.

6. a) The addition table is symmetric, which implies that addition is commutative. Likewise the multiplication table is symmetric, from which we conclude that multiplication is commutative.

b) The entry in the row indexed by 0 and the column indexed by the element i is i, for i ∈ {0, 1, 2, 3, 4}.

c) Every row of the addition table contains a 0. If the row is headed by the element a and the column in which the 0 occurs is indexed by b, then a + b = 0. This establishes the existence of a negative of a with respect to 0. Note that since there is only one zero in each row and column, the negative is unique.

d) The entry in the row indexed by 1 and the column indexed by the element i is i, for i ∈ {1, 2, 3, 4}.

e) Every row has a 1 in it. If the row is headed by the element a and the column in which the 1 occurs is indexed by c, then ac = 1. This establishes the existence of a multiplicative inverse of a with respect to 1. Note that since each row and column has only a single 1, the multiplicative inverse of a non-zero element with respect to 1 is unique.

7. The additive inverse (negative) of 2 is 3, so add 3 to both sides:

(3x + 2) + 3 = 4 + 3
3x + (2 + 3) = 2
3x + 0 = 2
3x = 2.

Now multiply by 2, since 2 · 3 = 1:

2(3x) = 2 · 2
(2 · 3)x = 4
1 · x = 4
x = 4.

The unique solution is x = 4.

8. 2x − (1 + 2i) = −ix + (2 + 3i). Add 1 + 2i to both sides to obtain the equation 2x = −ix + (3 + 4i). Add ix to both sides to obtain (2 + i)x = 3 + 4i. Divide by 2 + i to obtain the equation

x = (3 + 4i)/(2 + i).

After multiplying the complex number on the right hand side by (2 − i)/(2 − i) we obtain

x = (3 + 4i)(2 − i)/[(2 + i)(2 − i)].

After performing the arithmetic on the right hand side we get x = 2 + i.

9. Existence of an additive inverse, associativity of addition, the neutral character of 0, the existence of multiplicative inverses for all non-zero elements, the associativity of multiplication, the neutral character of 1 with respect to multiplication.

1.2. The Space Fⁿ

1. (2i, −2 + 2i, 4 − 2i)
2. (2, 6, −4)
3. (−6i, 2i, 8i)
4. (1 + 3i, 2, −1 + i)
5. (−3 + 2i, −2 − i, 1)

6. (1 + 2i, 3 + i, 5)
7. (0, 0, 0)
8. (4, 1, 1)
9. (1, 4, 2)
10. (3, 1, 2)
11. v = (3 − i, 3 + i)
12. v = (4, 2)

1.3. Introduction to Vector Spaces

1. Set x = 0. Then x + x = 0 + x = x. Now multiply by the scalar c to get c(x + x) = cx. After distributing we get cx + cx = cx. Set y = −(cx) and add it to both sides of the equation: y + (cx + cx) = y + cx. Use associativity on the left hand side to get (y + cx) + cx = y + cx, that is, 0 + cx = 0, whence cx = 0.

2. Assume cu = 0 but c ≠ 0. We prove that u = 0. Multiply by 1/c to get (1/c)(cu) = (1/c)0 = 0 by part iii) of Theorem (1.4). On the other hand, (1/c)(cu) = [(1/c)c]u = 1u = u. Thus, u = 0.

3. Since v + (−v) = 0 it follows that v is the negative of −v, that is, −(−v) = v.

4. Add the negative, −v, of the vector v to both sides of the equation v + x = v + y to obtain (−v) + [v + x] = (−v) + [v + y]. By the associativity property we have [(−v) + v] + x = [(−v) + v] + y. Now use the axiom that states (−v) + v = 0 to get 0 + x = 0 + y. Since 0 + x = x and 0 + y = y we conclude that x = y.

5. Multiply on the left hand side by the scalar 1/c:

(1/c)(cx) = (1/c)(cy).

Now make use of (M3) to get

[(1/c)c]x = [(1/c)c]y.

Since (1/c)c = 1 we get 1x = 1y and then by (M4) we conclude x = y.

6. Let f, g, h ∈ M(X, F) and a, b ∈ F be scalars. We show all the axioms hold.

(A1) For any x ∈ X, (f + g)(x) = f(x) + g(x). However, addition in F is commutative and therefore f(x) + g(x) = g(x) + f(x) = (g + f)(x). Thus, the functions f + g and g + f are identical.

(A2) For any x ∈ X, [(f + g) + h](x) = (f + g)(x) + h(x) = [f(x) + g(x)] + h(x). This is just a sum of elements in F. Addition in F is associative and therefore [f(x) + g(x)] + h(x) = f(x) + [g(x) + h(x)] = f(x) + (g + h)(x) = [f + (g + h)](x). Thus, the functions (f + g) + h and f + (g + h) are identical.

(A3) O is an identity for addition: (O + f)(x) = O(x) + f(x) = 0 + f(x) = f(x) and consequently, O + f = f.

(A4) Let −f denote the function from X to F such that (−f)(x) = −f(x). Then [(−f) + f](x) = (−f)(x) + f(x) = −f(x) + f(x) = 0 and therefore (−f) + f = O.

(M1) [a(f + g)](x) = a[(f + g)(x)] = a[f(x) + g(x)]. Now, a, f(x), g(x) are elements of F and the distributive axiom holds in F, and therefore a[f(x) + g(x)] = af(x) + ag(x) = (af)(x) + (ag)(x) = [(af) + (ag)](x). Thus, the functions a(f + g) and (af) + (ag) are identical.

(M2) [(a + b)f](x) = (a + b)f(x). Now a, b and f(x) are all elements of the field F, where the distributive axiom holds. Therefore (a + b)f(x) = af(x) + bf(x) = (af)(x) + (bf)(x) = [af + bf](x). This shows that the functions (a + b)f and af + bf are identical as required.

(M3) [(ab)f](x) = (ab)f(x). Since a, b, f(x) are in F and the multiplication in F is associative we have (ab)f(x) = a[bf(x)] = a[(bf)(x)] = [a(bf)](x). Thus, the functions (ab)f and a(bf) are equal.

(M4) (1f)(x) = 1f(x) = f(x) so 1f = f.

7. Let f, g, h ∈ M(X, V) and a, b ∈ F be scalars. We show all the axioms hold.

(A1) For any x ∈ X, (f + g)(x) = f(x) + g(x). However, addition in V is commutative and therefore f(x) + g(x) = g(x) + f(x) = (g + f)(x). Thus, the functions f + g and g + f are identical.

(A2) For any x ∈ X, [(f + g) + h](x) = (f + g)(x) + h(x) = [f(x) + g(x)] + h(x). This is just a sum of elements in V. Addition in V is associative and therefore [f(x) + g(x)] + h(x) = f(x) + [g(x) + h(x)] = f(x) + (g + h)(x) = [f + (g + h)](x). Thus, the functions (f + g) + h and f + (g + h) are identical.

(A3) O is an identity for addition: (O + f)(x) = O(x) + f(x) = 0 + f(x) = f(x) and consequently, O + f = f.

(A4) Let −f denote the function from X to V such that (−f)(x) = −f(x). Then [(−f) + f](x) = (−f)(x) + f(x) = −f(x) + f(x) = 0 and therefore (−f) + f = O.

(M1) [a(f + g)](x) = a[(f + g)(x)] = a[f(x) + g(x)]. Now, a is a scalar and f(x), g(x) ∈ V. By (M1) applied to V we have a[f(x) + g(x)] = af(x) + ag(x) = (af)(x) + (ag)(x) = [af + ag](x). Therefore the functions a(f + g) and af + ag are equal.

(M2) [(a + b)f](x) = (a + b)f(x). Now a, b are scalars and f(x) ∈ V. By axiom (M2) applied to V we have (a + b)f(x) = af(x) + bf(x) = (af)(x) + (bf)(x) = [af + bf](x). Thus, the functions (a + b)f and af + bf are equal.

(M3) [(ab)f](x) = (ab)f(x). Since a, b ∈ F and f(x) ∈ V we can apply (M3) for V and conclude that (ab)f(x) = a[bf(x)] = a[(bf)(x)] = [a(bf)](x). Thus, (ab)f = a(bf) as required.

(M4) Finally, (1f)(x) = 1f(x) = f(x) by (M4) applied to V and therefore 1f = f.
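The pointwise verifications in Exercises 6 and 7 above can also be checked mechanically. The following sketch (not part of the original manual) models M(X, F) for a small finite X with F = Q as Python dicts, and tests each axiom on a random sample; the set X and the sample coefficients are arbitrary choices.

```python
from fractions import Fraction
import random

# Functions from a finite set X to the field Q, represented as dicts x -> f(x).
X = ["p", "q", "r"]

def rand_fn():
    return {x: Fraction(random.randint(-5, 5), random.randint(1, 5)) for x in X}

def add(f, g):   # pointwise addition: (f + g)(x) = f(x) + g(x)
    return {x: f[x] + g[x] for x in X}

def smul(a, f):  # pointwise scalar multiplication: (a f)(x) = a f(x)
    return {x: a * f[x] for x in X}

f, g, h = rand_fn(), rand_fn(), rand_fn()
a, b = Fraction(3, 2), Fraction(-7, 4)
O = {x: Fraction(0) for x in X}  # the zero function

assert add(f, g) == add(g, f)                              # (A1)
assert add(add(f, g), h) == add(f, add(g, h))              # (A2)
assert add(O, f) == f                                      # (A3)
assert add(smul(Fraction(-1), f), f) == O                  # (A4)
assert smul(a, add(f, g)) == add(smul(a, f), smul(a, g))   # (M1)
assert smul(a + b, f) == add(smul(a, f), smul(b, f))       # (M2)
assert smul(a * b, f) == smul(a, smul(b, f))               # (M3)
assert smul(Fraction(1), f) == f                           # (M4)
print("all eight axioms hold on this sample")
```

Each assertion mirrors one labeled step of the proofs: the axioms of M(X, F) reduce pointwise to the corresponding field axioms of F.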

8. We demonstrate that the axioms all hold. So let u, u1, u2, u3 ∈ U, v, v1, v2, v3 ∈ V and c, d ∈ F. We write (u, v) for a typical element of U × V.

(A1) (u1, v1) + (u2, v2) = (u1 + u2, v1 + v2) and (u2, v2) + (u1, v1) = (u2 + u1, v2 + v1). Since addition in U and addition in V is commutative, u1 + u2 = u2 + u1 and v1 + v2 = v2 + v1. Thus, the vectors are equal. This establishes (A1).

(A2) [(u1, v1) + (u2, v2)] + (u3, v3) = ([u1 + u2] + u3, [v1 + v2] + v3) and (u1, v1) + [(u2, v2) + (u3, v3)] = (u1 + [u2 + u3], v1 + [v2 + v3]). Since addition in U and addition in V is associative, the results are identical. Thus, (A2) holds.

(A3) (u, v) + (0_U, 0_V) = (u + 0_U, v + 0_V) = (u, v). So, indeed, (0_U, 0_V) is an identity for U × V with the addition as defined.

(A4) Moreover, (u, v) + (−u, −v) = (u + [−u], v + [−v]) = (0_U, 0_V), so every vector has a negative.

(M1) c[(u1, v1) + (u2, v2)] = c(u1 + u2, v1 + v2) = (c[u1 + u2], c[v1 + v2]) = (cu1 + cu2, cv1 + cv2) = (cu1, cv1) + (cu2, cv2) = c(u1, v1) + c(u2, v2), since the distributive property holds in U and V.

(M2) For the second distributive property we have [c + d](u, v) = ([c + d]u, [c + d]v) = (cu + du, cv + dv) = (cu, cv) + (du, dv) = c(u, v) + d(u, v).

(M3) [cd](u, v) = ([cd]u, [cd]v) = (c[du], c[dv]) = c(du, dv) = c[d(u, v)].

(M4) Finally, 1(u, v) = (1u, 1v) = (u, v).

9. Let f, g, h ∈ ∏_{i∈I} Ui and a, b ∈ F.

(A1) For i ∈ I, (f + g)(i) = f(i) + g(i). Now f(i), g(i) ∈ Ui and addition in Ui is commutative and so f(i) + g(i) = g(i) + f(i) = (g + f)(i). Since i ∈ I is arbitrary, f + g = g + f.

(A2) For i ∈ I, [(f + g) + h](i) = (f + g)(i) + h(i) = [f(i) + g(i)] + h(i). Now f(i), g(i), h(i) ∈ Ui and addition in Ui is associative. Therefore, [f(i) + g(i)] + h(i) = f(i) + [g(i) + h(i)] = f(i) + [g + h](i) = [f + (g + h)](i). Since i ∈ I is arbitrary, (f + g) + h = f + (g + h).

(A3) [O + f](i) = O(i) + f(i) = 0_i + f(i) = f(i). So, O + f = f.

(A4) Let −f denote the function from I to ∪_{i∈I} Ui such that (−f)(i) = −f(i) for i ∈ I. Then [(−f) + f](i) = (−f)(i) + f(i) = −f(i) + f(i) = 0_i and therefore (−f) + f = O.

(M1) Since (M1) applies to Ui we have [a(f + g)](i) = a[(f + g)(i)] = a[f(i) + g(i)] = af(i) + ag(i) = (af)(i) + (ag)(i) = [af + ag](i). Thus, the functions a(f + g) and af + ag are equal.

(M2) [(a + b)f](i) = (a + b)f(i). Now a, b are scalars and f(i) ∈ Ui. By axiom (M2) applied to Ui we have (a + b)f(i) = af(i) + bf(i) = (af)(i) + (bf)(i) = [af + bf](i). Thus, the functions (a + b)f and af + bf are equal.

(M3) [(ab)f](i) = (ab)f(i). Since a, b ∈ F and f(i) ∈ Ui we can apply (M3) for Ui and conclude that (ab)f(i) = a[bf(i)] = a[(bf)(i)] = [a(bf)](i). Thus, (ab)f = a(bf) as required.

(M4) Finally, (1f)(i) = 1f(i) = f(i) by (M4) applied to Ui. Since i ∈ I is arbitrary, 1f = f.

10. (A1) Since for any two sets U, W we have U △ W = W △ U we have U + W = W + U.

(A2) We remark that for three sets U, W, Z, [U + W] + Z consists of those elements of U ∪ W ∪ Z which are contained in exactly 1 or 3 of these sets. This is also true of U + [W + Z], and therefore we have [U + W] + Z = U + [W + Z].

(A3) ∅ + U = (∅ ∪ U) \ (∅ ∩ U) = U \ ∅ = U. So, ∅ does act like a zero vector.

(A4) U + U = (U ∪ U) \ (U ∩ U) = U \ U = ∅. So, indeed, U is its own negative.

(M1) 0 · (U + W) = ∅; also, 0 · U = 0 · W = ∅ and ∅ + ∅ = ∅. Likewise 1 · (U + W) = U + W and 1 · U = U, 1 · W = W, so 1 · U + 1 · W = U + W = 1 · (U + W).

(M2) If a = b then a · U = b · U and a · U + b · U = a · U + a · U = ∅. On the other hand, a + b = 0 and 0 · U = ∅. Therefore we may assume a = 0, b = 1. Then (a + b) · U = 1 · U = U. On the other hand, 0 · U + 1 · U = ∅ + U = U.

(M3) If either a = 0 or b = 0 then both (ab) · U = ∅ and a · (b · U) = ∅. Therefore we may assume a = b = 1. Clearly, 1 · U = U = 1 · (1 · U).

(M4) This holds by the definition of 1 · U.

11. Here we write the column vectors as pairs (a, b) with a, b ∈ R⁺; the sum of vectors is the componentwise product and a scalar c acts by raising each component to the power c.

(A1) Since multiplication in R⁺ is commutative we have

(a1, b1) + (a2, b2) = (a1a2, b1b2) = (a2a1, b2b1) = (a2, b2) + (a1, b1).

(A2) Since multiplication in R⁺ is associative we have

[(a1, b1) + (a2, b2)] + (a3, b3) = ((a1a2)a3, (b1b2)b3) = (a1(a2a3), b1(b2b3)) = (a1, b1) + [(a2, b2) + (a3, b3)].

(A3) (1, 1) + (a, b) = (1a, 1b) = (a, b).

(A4) (1/a, 1/b) + (a, b) = ((1/a)a, (1/b)b) = (1, 1).

(M1) c[(a1, b1) + (a2, b2)] = c(a1a2, b1b2) = ((a1a2)^c, (b1b2)^c) = (a1^c a2^c, b1^c b2^c) = (a1^c, b1^c) + (a2^c, b2^c) = c(a1, b1) + c(a2, b2).

(M2) (c + d)(a, b) = (a^(c+d), b^(c+d)) = (a^c a^d, b^c b^d) = (a^c, b^c) + (a^d, b^d) = c(a, b) + d(a, b).

(M3) (cd)(a, b) = (a^(cd), b^(cd)) = ((a^d)^c, (b^d)^c) = c(a^d, b^d) = c[d(a, b)].

(M4) 1(a, b) = (a^1, b^1) = (a, b).

1.4. Subspaces of Vector Spaces

1. In order for W to contain the zero vector there must exist a, b such that

2a − 3b + 1 = 0
−2a + 5b = 0
2a + b = 0

which is equivalent to the linear system

2a − 3b = −1
−2a + 5b = 0
2a + b = 0

which is inconsistent. Thus, W is not a subspace.

2. We show that W is not a subspace by proving it is not closed under scalar multiplication. Specifically, W contains the vector f(1, 1) = 1 + 2X². However, we claim the vector 2(1 + 2X²) = 2 + 4X² does not belong to W. If it did then there would exist a, b such that f(a, b) = ab + (a − b)X + (a + b)X² = 2 + 4X². We must then have

a − b = 0
a + b = 4.

This has the unique solution a = b = 2. However, then ab = 4 ≠ 2, as required.

3. Suppose (x, y, z) ∈ W, so that 3x − 2y + 4z = 0. Then

3(cx) − 2(cy) + 4(cz) = c(3x − 2y + 4z) = c · 0 = 0

which implies that (cx, cy, cz) ∈ W and W is closed under scalar multiplication.

Suppose (x1, y1, z1), (x2, y2, z2) ∈ W, which implies that 3x1 − 2y1 + 4z1 = 0 = 3x2 − 2y2 + 4z2. It then follows that

3(x1 + x2) − 2(y1 + y2) + 4(z1 + z2) = [3x1 − 2y1 + 4z1] + [3x2 − 2y2 + 4z2] = 0.

This implies that (x1, y1, z1) + (x2, y2, z2) = (x1 + x2, y1 + y2, z1 + z2) ∈ W and W is closed under addition. Thus, W is a subspace.
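The two computations in Exercises 1 and 3 of Section 1.4 can be confirmed numerically. A small sketch (not part of the original manual; the sample vectors and scalar are arbitrary choices):

```python
from fractions import Fraction

# Exercise 1: the last two equations -2a + 5b = 0 and 2a + b = 0 force
# (adding them) 6b = 0, so b = 0 and then a = 0; test this against the
# first equation 2a - 3b = -1.
a, b = Fraction(0), Fraction(0)
assert -2 * a + 5 * b == 0 and 2 * a + b == 0   # the forced solution
assert 2 * a - 3 * b != -1                      # first row fails: inconsistent

# Exercise 3: W = {(x, y, z) : 3x - 2y + 4z = 0} is closed under the operations.
def in_W(v):
    x, y, z = v
    return 3 * x - 2 * y + 4 * z == 0

v1, v2, c = (2, 1, -1), (0, 2, 1), 5            # two members of W and a scalar
assert in_W(v1) and in_W(v2)
assert in_W(tuple(c * t for t in v1))                  # scalar multiplication
assert in_W(tuple(s + t for s, t in zip(v1, v2)))      # addition
print("system inconsistent; W closed under both operations")
```

Of course the check on two sample vectors is only a spot check; the closure argument in the solution holds for all members of W.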

8

Chapter 1. Vector Spaces

4. Set W = ∪U ∈ FU. In order to show that W is a 9. Clearly, Mf in (X, F) is a subset of M(X, F) so we subspace we have to show that it is closed under addition have to show that it is closed under addition and scalar and scalar multiplication. multiplication. Closed under scalar multiplication. Assume x ∈ W and c is a scalar. By the definition of W there is an X ∈ F such that x ∈ X. Since X is a subspace of V, cx ∈ X and hence cx ∈ ∪U ∈F U = W.

Closed under scalar multiplication. Suppose f ∈ Mf in (X, F) and c is a scalar. If c = 0 then cf is the zero map with support equal to the empty set, which belongs to Mf in (X, F). On the other hand, if c = 0 then Closed under addition. We need to show if x, y ∈ W then spt(cf ) = spt(f ) is finite and so cf ∈ Mf in (X, F). x + y ∈ W. By the definition of W there are subspaces Closed under addition. Assume f, g ∈ M (X, F). If f in X, Y ∈ F such that x ∈ x, y ∈ Y. By our hypothesis x ∈ / spt(f ) ∪ spt(g) the f (x) = g(x) = 0 and (f + there exists Z ∈ F such that X ∪ Y ⊂ Z. It then follows g)(x) = 0. This implies that spt(f +g) ⊂ spt(f )∪spt(g) that x, y ∈ Z. Since Z is a subspace of V, x + y ∈ Z. and is therefore finite. Thus, f + g ∈ M (X, F). f in Then x + y ∈ W and W is closed under addition as 10. Set W = {f ∈ M(X, F)|f (y) = 0 ∀y ∈ Y }. We claimed. need to show W is closed under scalar multiplication and 5. Assume u ∈ U, w ∈ W and c is a scalar. Then c(u + addition. w) = cu + cw. Since U is a subspace and u ∈ U it follows that cu ∈ U. Similarly, cw ∈ W. Then Closed under scalar multiplication. Suppose f ∈ W and c is scalar, y ∈ Y. Then (cf )(y) = cf (y) = c0 = 0. c(u + w) = cu + cw ∈ U + W. Since y is arbitrary, cf ∈ W. 6. Closed under scalar multiplication. Assume x ∈ U ∪ W and c is a scalar. Then either x ∈ U in which case cx ∈ U or x ∈ W and cx ∈ W. In either case, cx ∈ U ∪ W and U ∪ W is closed under scalar multiplication.

Not closed under addition. Since U is not a subset of W there exists u ∈ U, u ∈ / W. Since W is not a subset of U there is a w ∈ W, w ∈ / U. We claim that u + w ∈ / U ∪ W. For suppose to the contrary that v = u + w ∈ U ∪ W. Suppose v ∈ U. Then w = v − u is the difference of two vectors in U, whence w ∈ U contrary to assumption. Likewise, if v ∈ W then u = v−w ∈ W, a contradiction. So, u + w is not in U ∪ W and U ∪ W is not a subspace. a 0 7. Let W = { |a ∈ R}, X = { |b ∈ R} and 0 b c Y ={ |c ∈ R}. c a 0 8. Take X = { |a ∈ R}, Y = { |b ∈ R}, Z = 0 b c { |c ∈ R}. c

K23692_SM_Cover.indd 16

Closed under addition. Suppose f, g ∈ W and y ∈ Y. Then (f + g)(y) = f (y) + g(y) = 0 + 0 = 0. 11. This is clearly not a subspace since the zero vector does not belong to it. 12. Closed under scalar multiplication. Suppose f ∈ and c is a scalar. If c = 0 i∈I Ui so that spt(f ) is finite then cf is the zero vector of i∈I Ui which has support the empty set and is in i∈I Ui . On the other hand, if c = 0 then spt(cf ) = spt(f ) and so is finite and cf ∈ i∈I Ui . Closed under addition. Suppose f, g ∈ i∈I Ui . As in the proof of 14, spt(f + g) ⊂ spt(f ) ∪ spt(g) and so is a finite subset. Then f + g ∈ i∈I Ui .

13. Assume x ∈ X ∩ Y + Z. Then there are vectors y ∈ Y, z ∈ Z such that x = y + z. Then z = x − y. Since Y ⊂ X, y ∈ X and because X a subspace we can conclude that z = x − y ∈ X. In particular, z ∈ X ∩ Z. Thus, x ∈ Y + (X ∩ Z). This proves that X ∩ (Y + Z) ⊂ Y + (X ∩ Z).

02/06/15 3:11 pm

1.5. Span and Independence

9

Conversely, assume that u ∈ Y +(X ∩Z). Write u = y+ that Span(v1 , c12 v1 + v2 , c13 v1 + c23 v2 + v3 }) ⊂ z where y ∈ Y and z ∈ X ∩ Z. Then clearly, u ∈ Y + Z. Span(v1 , v2 , v3 ). On the other hand, since Y ⊂ X, z ∈ X ∩ Z, and X is a On the other hand, v2 = (−c12 )v1 + [c12 v1 + v2 subspace, we can conclude that u = y + z ∈ X. Thus, u ∈ X ∩ (Y + Z). This proves that Y + (X ∩ Z) ⊂ v3 = (−c13 + c12 c23 )v2 + (−c23 )(c12 v + 1 + v2 )+ X ∩ (Y + Z) and therefore we have equality. (c13 v1 + c23 v3 + v3 ). 14. We need to show that Modd (R, R) is closed under Thus, v , v , v are all linear combinations of v , c v + 1 2 3 1 12 1 addition and scalar multiplication. v24 and c13 v1 + c23 v2 + v3 . It now follows from exerClosed under addition. Assume f, g ∈ Modd (R, R) cise 1 that Span(v1 , v2 , v3 ) = Span(v1 , c12 v1 , c13 v1 + so that f (−x) = −f (x), g(−x) = −g(x). Then (f + c23 v2 + v3 ). g)(−x) = f (−x) + g(−x) = −f (x) + −g(x) = −(f + g)(x).

Closed under scalar multiplication. Assume f ∈ Modd (R, R) and c ∈ R. Then (cf )(−x) = cf (−x) = c[−f (x)] = (−cf )(x).

1.5. Span and Independence 1. We need to show that Span(X) ⊂ Span(Y ) and, conversely, that Span(Y ) ⊂ Span(X). Since by hypothesis, X ⊂ Span(Y ) it then follows that Span(X) ⊂ Span(Span(Y )) = Span(Y ). In exactly the same way we conclude that Span(Y ) ⊂ Span(X) and we obtain the desired equality.

5. If v = 0 then 1v = 0 and therefore 0 is linear dependent. On the other hand, if cv = 0, c = 0 then v = 0.

6. Assume v = γu. Since v = 0, γ = 0. Then (−γ)u + 1v = 0 is a non-trivial dependence relation on (u, v) and consequently, (u, v) is linearly dependent. Conversely, assume that (u, v) is linearly dependent and neither vector is 0. Suppose au + bv = 0. If a = 0 then bv = 0 which by exercise implies that v = 0, a contradiction. So, a = 0. In exactly the same way, b = 0. Now v = (− ab )u, u = (− ab )u so v is a multiple of u and u is a multiple of v. 7. Multiply all the non-zero vectors by 0 and the zero vector by 1. This gives a non-trivial dependence relation and so the sequence of vectors is linearly dependent.

2. Clearly, u, cu + v are both linear combinations of u, v and consequently, u, cu + v ∈ Span(u, v). On the other hand, v = (−c)u + (cu + v) is a linear combination of u and cu + v. Whence, both u and v are linear combinations of u, cu + v and so u, v ∈ Span(u, cu + v). Now by exercise 1 we have the equality Span(u, v) = Span(u, cu + v).

9. Extend a non-trivial dependence relation on the vectors of S0 to a dependence relation on S by setting the scalars equal to zero for every vector v ∈ S \ S0 .

3. Clearly cu, v ∈ Span(u, v). On the other hand, u = ( 1c )(cu) + 0v is in Span(cu, v). Thus, u, v ∈ Span(cu, v) and, using exercise 1, we get the equality Span(u, v) = Span(cu, v).

10. This is logically equivalent to exercise 9. Alternatively, assume that some subsequence of S is linearly dependent. Then by exercise 9 the sequence S is dependent, a contradiction.

8. If i < j and vi = vj then let ci = 1, cj = −1 and set all the other scalars equal to 0. In this way we obtain a non-trivial dependence relation.

4. Since v1 , c12 v1 + v2 , c13 v1 + c23 v2 + v3 11. Assume Span(u1 , . . . , uk ) ∩ Span(v1 , . . . , vl ) = are all linear combinations of v1 , v2 1, v3 it follows {0}. Suppose c1 , . . . , ck , d1 , . . . , dl are scalars and


c1u1 + · · · + ckuk + d1v1 + · · · + dlvl = 0.

Then

c1u1 + · · · + ckuk = (−d1)v1 + · · · + (−dl)vl.

The vector c1u1 + · · · + ckuk belongs to Span(u1, . . . , uk) while (−d1)v1 + · · · + (−dl)vl belongs to Span(v1, . . . , vl). By hypothesis the only common vector is the zero vector. Therefore

c1u1 + · · · + ckuk = 0

d1v1 + · · · + dlvl = 0.

However, since (u1, . . . , uk) is linearly independent we get c1 = · · · = ck = 0. Similarly, d1 = · · · = dl = 0. This implies that (u1, . . . , uk, v1, . . . , vl) is linearly independent.

On the other hand, suppose 0 ≠ w ∈ Span(u1, . . . , uk) ∩ Span(v1, . . . , vl). Then there are scalars c1, . . . , ck, d1, . . . , dl such that

w = c1u1 + · · · + ckuk

w = d1v1 + · · · + dlvl.

Note that since w ≠ 0 at least one ci and at least one dj is non-zero. Now

c1u1 + · · · + ckuk + (−d1)v1 + · · · + (−dl)vl = w − w = 0

is a non-trivial dependence relation and therefore (u1, . . . , uk, v1, . . . , vl) is linearly dependent.

12. Since w ∈ Span(u1, . . . , uk, v) there are scalars c1, . . . , ck, d such that

w = c1u1 + · · · + ckuk + dv.

If d = 0 then w ∈ Span(u1, . . . , uk), contrary to our hypothesis. Therefore, d ≠ 0. But then

v = (−c1/d)u1 + · · · + (−ck/d)uk + (1/d)w.

Thus, v ∈ Span(u1, . . . , uk, w) as required.

13. Let c1, c2, c3 be scalars, not all zero, such that c1(v1 + w) + c2(v2 + w) + c3(v3 + w) = 0. Claim c1 + c2 + c3 ≠ 0. Suppose otherwise. Then 0 = c1(v1 + w) + c2(v2 + w) + c3(v3 + w) = c1v1 + c2v2 + c3v3 + (c1 + c2 + c3)w = c1v1 + c2v2 + c3v3, contrary to the hypothesis that (v1, v2, v3) is linearly independent. It now follows that w = −(1/(c1 + c2 + c3))(c1v1 + c2v2 + c3v3) ∈ Span(v1, v2, v3).

1.6. Bases and Finite Dimensional Vector Spaces

1. Let B be a basis of V. B has 4 elements since dim(V ) = 4.

a) Suppose S spans V and |S| = 3. Since B is a basis, in particular, B is linearly independent. Now by the exchange theorem, 3 = |S| ≥ |B| = 4, a contradiction.

b) Suppose I is a linearly independent subset of V with five elements. Since B is a basis it spans V. By the exchange theorem, 5 = |I| ≤ |B| = 4, a contradiction.

2. If U ⊂ W then, since dim(U ) = dim(W ) = 3, we must have U = W by Theorem (1.22), which contradicts the assumption that U ≠ W. So, U is not contained in W (and similarly, W is not contained in U ). It now follows that the subspace U + W of V contains U as well as vectors not in U. Then 4 = dim(V ) ≥ dim(U + W ) > dim(U ) = 3, which implies that dim(U + W ) = 4. By part ii) of Theorem (1.22), U + W = V. Since W is not contained in U we know that U ∩ W is a proper subspace of U and hence dim(U ∩ W ) < dim(U ) = 3. It therefore suffices to prove that dim(U ∩ W ) ≥ 2. Let (u1, u2, u3) be a basis for U and (w1, w2, w3) be a basis for W with w3 ∉ U. If Span(w1, w3) ∩ U = {0} then by Exercise (1.5.11) the sequence (u1, u2, u3, w1, w3) is linearly independent, which contradicts the assumption that dim(V ) = 4. Thus, Span(w1, w3) ∩ U ≠ {0}. Since w3 ∉ U, any non-zero vector aw1 + bw3 which is in U must have a ≠ 0. By multiplying by 1/a we can say there is a c ∈ F such that w1 + cw3 ∈ U. In exactly the same way we can conclude that there must be a d ∈ F such that w2 + dw3 ∈ U. We claim that (w1 + cw3, w2 + dw3) is linearly independent. Suppose e1, e2 ∈ F and

e1(w1 + cw3) + e2(w2 + dw3) = 0.

Then e1w1 + e2w2 + (ce1 + de2)w3 = 0. However, since (w1, w2, w3) is linearly independent we must have e1 = e2 = 0. Since (w1 + cw3, w2 + dw3) is a linearly independent sequence from U ∩ W we conclude that dim(U ∩ W ) ≥ 2.

3. By Exercise (1.5.11) the sequence (u1, u2, w1, w2, w3) is linearly independent and so it suffices to show that (u1, u2, w1, w2, w3) is a spanning sequence. Toward that end consider an arbitrary vector v ∈ U + W. By the definition of U + W there are vectors u ∈ U and w ∈ W such that v = u + w. Since u ∈ U and (u1, u2) is a basis for U there are scalars a1, a2 ∈ F such that u = a1u1 + a2u2. Similarly, there are scalars b1, b2, b3 ∈ F such that w = b1w1 + b2w2 + b3w3. Thus, v = u + w = a1u1 + a2u2 + b1w1 + b2w2 + b3w3 and therefore (u1, u2, w1, w2, w3) is a spanning sequence as required.

4. We are assuming that dim(V ) = n, S = (v1, v2, . . . , vn) is a sequence of vectors from V and S spans V. We need to prove that S is linearly independent. For 1 ≤ j ≤ n let S − vj denote the sequence obtained from S by deleting vj. Suppose to the contrary that S is not linearly independent. Then for some j, 1 ≤ j ≤ n, vj is a linear combination of S − vj by Theorem (1.14). Then by Theorem (1.13), V = Span(S) = Span(S − vj). But then by the exchange theorem no independent sequence of V can have more than n − 1 vectors, which contradicts the assumption that bases have cardinality n.

5. We are assuming dim(V ) = n, S = (v1, v2, . . . , vm) with m > n and Span(S) = V. We have to prove that some subsequence of S is a basis of V. Toward that end, let S0 be a subsequence of S such that Span(S0) = V and the length of S0 is as small as possible. We claim that S0 is linearly independent and therefore a basis of V. Let S0 = (w1, w2, . . . , wk) and assume to the contrary that S0 is linearly dependent. Then for some j, 1 ≤ j ≤ k, wj is a linear combination of S0 − wj by Theorem (1.14). Then by Theorem (1.13), V = Span(S0) = Span(S0 − wj). However, this contradicts the assumption that S0 is a spanning subsequence of S of minimal length. Thus, S0 is linearly independent and a basis as claimed.

6. Assume dim(U ∩ W ) = l, dim(U ) − l = m, dim(W ) − l = n (so that dim(U ) = l + m and dim(W ) = l + n). Choose a basis (x1, . . . , xl) for U ∩ W. This can be extended to a basis of U : let (u1, . . . , um) be a sequence of vectors from U such that (x1, . . . , xl, u1, . . . , um) is a basis for U. Likewise there is a sequence of vectors (w1, . . . , wn) from W such that (x1, . . . , xl, w1, . . . , wn) is a basis for W. We claim that S = (x1, . . . , xl, u1, . . . , um, w1, . . . , wn) is a basis for U + W.

We first show that S is linearly independent. Suppose ai, 1 ≤ i ≤ l, bj, 1 ≤ j ≤ m, and ck, 1 ≤ k ≤ n are scalars such that

a1x1 + · · · + alxl + b1u1 + · · · + bmum + c1w1 + · · · + cnwn = 0.


Then

a1x1 + · · · + alxl + b1u1 + · · · + bmum = −(c1w1 + · · · + cnwn).   (1.1)

Note the vector on the left hand side of Equation (1.1) belongs to U and the vector on the right hand side belongs to W. Therefore, since we have equality, this vector belongs to U ∩ W = Span(x1, . . . , xl). However, since (x1, . . . , xl, w1, . . . , wn) is linearly independent, by Exercise (1.5.11) we must have Span(x1, . . . , xl) ∩ Span(w1, . . . , wn) = {0}. Consequently,

a1x1 + · · · + alxl + b1u1 + · · · + bmum = −(c1w1 + · · · + cnwn) = 0.

However, since (w1, . . . , wn) is linearly independent this implies that c1 = · · · = cn = 0. Also, since (x1, . . . , xl, u1, . . . , um) is linearly independent we get a1 = · · · = al = b1 = · · · = bm = 0. Thus, all ai, bj, ck are zero and the sequence S is linearly independent.

We next show that S spans U + W. Suppose v is an arbitrary vector in U + W. Then there are vectors u ∈ U and w ∈ W such that v = u + w. Since (x1, . . . , xl, u1, . . . , um) is a basis for U there are scalars ai, 1 ≤ i ≤ l, bj, 1 ≤ j ≤ m such that u = a1x1 + · · · + alxl + b1u1 + · · · + bmum. Since (x1, . . . , xl, w1, . . . , wn) is a basis for W there are scalars ci, 1 ≤ i ≤ l, dk, 1 ≤ k ≤ n such that w = c1x1 + · · · + clxl + d1w1 + · · · + dnwn. But now we have

v = u + w = (a1x1 + · · · + alxl + b1u1 + · · · + bmum) + (c1x1 + · · · + clxl + d1w1 + · · · + dnwn) = (a1 + c1)x1 + · · · + (al + cl)xl + b1u1 + · · · + bmum + d1w1 + · · · + dnwn.

Thus, S spans U + W as required and S is a basis of U + W. It follows that dim(U + W ) = l + m + n. Now

dim(U ) + dim(W ) = (l + m) + (l + n) = 2l + m + n = l + (l + m + n) = dim(U ∩ W ) + dim(U + W ).

7. Let (x1, x2, x3) be a basis for X and (y1, y2, y3) be a basis for Y. If X ∩ Y = {0} then (x1, x2, x3, y1, y2, y3) is linearly independent by Exercise 11 of Section (1.5). But this contradicts dim(V ) = 5. Thus, X ∩ Y ≠ {0}.

8. Since U + W = V, dim(U + W ) = dim(V ) = n. Making use of Exercise 6 we get


dim(U ∩ W ) = dim(U ) + dim(W ) − dim(U + W ) = k + (n − k) − n = 0.

Since dim(U ∩ W ) = 0, U ∩ W = {0} as required.

9. Inside F^6 set

X = {(x1, x2, x3, 0, 0, 0)^T | x1, x2, x3 ∈ F},

Y = {(x1, x2, x3, x4, 0, 0)^T | x1, x2, x3, x4 ∈ F},

Z = {(x1, x2, x3, x4, x5, 0)^T | x1, x2, x3, x4, x5 ∈ F}.

Note that dim(X) = 3, dim(Y ) = 4, dim(Z) = 5.

a) Set v1 = (0, 1, 0, 0, 0, 0)^T, v2 = (0, 0, 1, 0, 0, 0)^T, v3 = (1, 0, 0, 0, 0, 0)^T and

w1 = (1, 0, −1, 0, 0, 0)^T, w2 = (0, 1, −1, 0, 0, 0)^T, w3 = (1, 1, 1, 0, 0, 0)^T.

Note that Span(v1, v2, v3) = X and that w1, w2, w3 ∈ X. Therefore Span(v1, v2, v3) = Span(w1, w2, w3).

b) Let v1, v2, v3 be as in a) and now set

w1 = (1, 0, 0, 1, 0, 0)^T, w2 = (0, 1, 0, 1, 0, 0)^T, w3 = (0, 0, 1, 0, 0, 0)^T.

Now v1, v2, v3, w1, w2, w3 ∈ Y. By the argument of Exercise 2 we have dim[Span(v1, v2, v3) ∩ Span(w1, w2, w3)] = 2.

c) Let v1, v2, v3 be as in a) and now set

w1 = (0, 0, 0, 1, 0, 0)^T, w2 = (0, 0, 0, 0, 1, 0)^T, w3 = (0, 0, 1, 1, 1, 0)^T.

Set V = Span(v1, v2, v3), W = Span(w1, w2, w3). Since v1, v2, v3, w1, w2, w3 ∈ Z, V + W ⊂ Z and therefore

dim(V + W ) ≤ dim(Z) = 5.   (1.2)

On the other hand, (v1, v2, v3, w1, w2) is linearly independent and therefore

dim(V + W ) ≥ 5.   (1.3)

From Equations (1.2) and (1.3) we get dim(V + W ) = 5. Now use Exercise 6:

dim(V ) + dim(W ) = dim(V ∩ W ) + dim(V + W ).

Since dim(V ) = dim(W ) = 3 and dim(V + W ) = 5 we conclude that dim(V ∩ W ) = 1 as required.

10. a) There are 8 non-zero vectors and any one can be the first vector in a basis. Suppose we choose the vector v. The second vector must be linearly independent of v and therefore not a multiple of v. There are 3 multiples: 0, v, −v. So there are 9 − 3 = 6 choices for the second vector. Thus, there are 8 × 6 = 48 bases.

b) There are 24 non-zero vectors and any one can be the first vector in a basis. Suppose we choose the vector v. The second vector must be linearly independent of v and therefore not a multiple of v. There are 5 multiples:


0, v, 2v, 3v, 4v. So there are 25 − 5 = 20 choices for the second vector. Thus, there are 24 × 20 = 480 bases.

c) There are p^2 − 1 non-zero vectors and any one can be the first vector in a basis. Suppose we choose the vector v. The second vector must be linearly independent of v and therefore not a multiple of v. There are p multiples: 0, v, . . . , (p − 1)v. So there are p^2 − p choices for the second vector. Thus, there are (p^2 − 1)(p^2 − p) bases.
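The counts in Exercise 10 can be checked by brute force. The sketch below (an illustration added here, not part of the manual) enumerates ordered bases of F_p^2 for small primes and compares the count with (p^2 − 1)(p^2 − p):

```python
from itertools import product

def count_ordered_bases(p):
    """Count ordered bases (v, w) of F_p^2 by brute force: ordered pairs
    of non-zero vectors in which w is not a scalar multiple of v."""
    vectors = [v for v in product(range(p), repeat=2) if v != (0, 0)]
    count = 0
    for v in vectors:
        # the p scalar multiples of v (including the zero vector)
        multiples = {tuple((c * x) % p for x in v) for c in range(p)}
        # any non-zero vector outside these multiples completes a basis
        count += sum(1 for w in vectors if w not in multiples)
    return count

for p in (2, 3, 5):
    assert count_ordered_bases(p) == (p**2 - 1) * (p**2 - p)
```

For p = 3 and p = 5 this reproduces the values 48 and 480 computed in parts a) and b).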

11. Let dim(V ) = n and dim(U ) = k. If U = V we can take W = {0}, so we may assume k < n. Choose a basis BU = (u1, . . . , uk) for U. By part 1) of Theorem (1.24) we can expand BU to a basis B = (u1, . . . , un) for V. Set W = Span(uk+1, . . . , un). Then W is a complement to U.

12. If (v1, . . . , vk) ⊂ W then V = Span(v1, . . . , vk) ⊂ Span(W ) = W, a contradiction.

13. First assume that X ∩ Y = {0}. Let (x1, . . . , xk) be a basis for X and (y1, . . . , yk) a basis for Y. Set ui = xi + yi and U = Span(u1, . . . , uk). Then X ∩ U = Y ∩ U = {0} and X ⊕ U = Y ⊕ U = X ⊕ Y. Let W be a complement to X + Y in V and set Z = U ⊕ W.

Now assume X ∩ Y ≠ {0} and let (v1, . . . , vj) be a basis for X ∩ Y. Set s = k − j and let (x1, . . . , xs) be a sequence of vectors from X such that (v1, . . . , vj, x1, . . . , xs) is a basis of X, and similarly let (y1, . . . , ys) be a sequence of vectors from Y such that (v1, . . . , vj, y1, . . . , ys) is a basis for Y. Set ui = xi + yi, 1 ≤ i ≤ s, and U = Span(u1, . . . , us). Then X ∩ U = Y ∩ U = {0} and X ⊕ U = Y ⊕ U = X + Y. Let W be a complement to X + Y in V and set Z = U ⊕ W.

1.7. Bases of Infinite Dimensional Vector Spaces

1. First we show that {χx | x ∈ X} is linearly independent. If not, then there must exist a finite subset {χxi | 1 ≤ i ≤ n} which is dependent. Then there are scalars ci, not all zero, such that

f = Σ_{i=1}^{n} ci χxi

is the zero function. In particular, for each i, f (xi) = 0. However, f (xi) = ci. Thus, c1 = · · · = cn = 0, a contradiction. Thus, {χx | x ∈ X} is linearly independent.

Now we need to show that {χx | x ∈ X} spans. Let f ∈ Mfin(X, F) be a non-zero function. Then spt(f ) is non-empty but finite. Suppose spt(f ) = {x1, . . . , xn}. Set ci = f (xi). Then the functions f and Σ_{i=1}^{n} ci χxi are equal.

2. First of all suppose X is an n-dimensional vector space over Q, say with basis (v1, . . . , vn). Then there is a one-to-one set correspondence between X and Q^n, namely taking Σ_{i=1}^{n} ai vi to (a1, . . . , an)^T. Therefore the cardinality of X is the same as the cardinality of Q^n, which is the same as the cardinality of Q.

Suppose B is a basis of R as a vector space over Q. Let Pfin(B) be all the finite subsets of B. Then

R = ∪_{B∈Pfin(B)} Span(B).

As shown above, for any B ∈ Pfin(B), the cardinality of Span(B) is countable. For an infinite set X the cardinality of the set of finite subsets of X is the same as the cardinality of X. Also, if Y is an infinite set and for each y ∈ Y, Sy is a countable set, then ∪_{y∈Y} Sy has cardinality no greater than that of Y. It now follows that the cardinality of R is no greater than the cardinality of B. Since B is a subset of R we conclude that B and R have the same cardinality.

3. Choose a basis BU of U. This can be extended to a basis B for V. Let W = Span(B \ BU ). Then W is a complement to U in V.

4. Let B be a basis for V. Choose a subset Xn of B with cardinality n. Then Un = Span(B \ Xn) satisfies dim(V /Un) = n.
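Returning to Exercise 9 of Section 1.6, the intersection dimension can be checked mechanically with the formula of Exercise 6. The sketch below (added here, not from the manual) row-reduces over Q to compute ranks for the part c) vectors:

```python
from fractions import Fraction

def rank(rows):
    """Row-reduce a list of rational row vectors; return the rank."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(m[0])):
        piv = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if piv is None:
            continue
        m[r], m[piv] = m[piv], m[r]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                f = m[i][col] / m[r][col]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

# the vectors from Exercise 9 c)
v = [(0, 1, 0, 0, 0, 0), (0, 0, 1, 0, 0, 0), (1, 0, 0, 0, 0, 0)]
w = [(0, 0, 0, 1, 0, 0), (0, 0, 0, 0, 1, 0), (0, 0, 1, 1, 1, 0)]

dim_sum = rank(v + w)                  # dim(V + W)
dim_int = rank(v) + rank(w) - dim_sum  # dim(V ∩ W), by Exercise 6
assert (dim_sum, dim_int) == (5, 1)
```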


1.8. Coordinate Vectors

1a) Since dim(F2[x]) is 3 and there are three vectors in the sequence, by the half is good enough theorem it suffices to show the vectors are linearly independent. Thus, suppose c1, c2, c3 are scalars such that

c1(1 + x) + c2(1 + x²) + c3(1 + 2x − 2x²) = 0.

After distributing and collecting terms we obtain

(c1 + c2 + c3) + (c1 + 2c3)x + (c2 − 2c3)x² = 0.

This gives rise to the homogeneous linear system

c1 + c2 + c3 = 0
c1 + 2c3 = 0
c2 − 2c3 = 0.

This system has only the trivial solution c1 = c2 = c3 = 0. Thus, the vectors are linearly independent as required.

1b) To compute [1]F we need to determine c1, c2, c3 such that

(c1 + c2 + c3) + (c1 + 2c3)x + (c2 − 2c3)x² = 1.

This gives rise to the linear system

c1 + c2 + c3 = 1
c1 + 2c3 = 0
c2 − 2c3 = 0,

which has the unique solution [1]F = (−2, 2, 1)^T. In a similar manner [x]F = (3, −2, −1)^T and [x²]F = (2, −1, −1)^T.

2. Set cj = (c1j, c2j, c3j)^T, j = 1, 2, 3. This means that

uj = c1jv1 + c2jv2 + c3jv3.

If [x]B1 = (a1, a2, a3)^T then x = a1u1 + a2u2 + a3u3. It then follows that

x = a1(c11v1 + c21v2 + c31v3) + a2(c12v1 + c22v2 + c32v3) + a3(c13v1 + c23v2 + c33v3) = (a1c11 + a2c12 + a3c13)v1 + (a1c21 + a2c22 + a3c23)v2 + (a1c31 + a2c32 + a3c33)v3.

Consequently,

[x]B2 = (a1c11 + a2c12 + a3c13, a1c21 + a2c22 + a3c23, a1c31 + a2c32 + a3c33)^T,

that is, [x]B2 is the product of the 3 × 3 matrix whose columns are c1, c2, c3 with [x]B1.

3a) For i = 0, 1, 2, 3 we have fj(i) = 1 if i = j − 1 and fj(i) = 0 otherwise. We claim that this implies that (f1, f2, f3, f4) is linearly independent. For suppose c1f1 + c2f2 + c3f3 + c4f4 is the zero function. Substituting i = 0, 1, 2, 3 we get 0 = ci+1. Thus, all the ci = 0. Now since there are 4 vectors and dim(R3[x]) = 4 it follows that F is a basis for R3[x].

3b) Suppose g = c1f1 + c2f2 + c3f3 + c4f4. Then g(i) = c1f1(i) + c2f2(i) + c3f3(i) + c4f4(i) = ci+1 as required.
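The coordinate vectors in Exercise 1 b) can be verified by solving the stated 3 × 3 system exactly; the sketch below (an added check, not from the manual) uses Cramer's rule with rational arithmetic:

```python
from fractions import Fraction

def solve3(b):
    """Solve the system from Exercise 1b):
    c1 + c2 + c3 = b[0], c1 + 2 c3 = b[1], c2 - 2 c3 = b[2]."""
    A = [[1, 1, 1], [1, 0, 2], [0, 1, -2]]

    def det(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
              - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
              + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

    d = det(A)
    cols = []
    for j in range(3):
        # Cramer's rule: replace column j by the right-hand side
        M = [row[:] for row in A]
        for i in range(3):
            M[i][j] = b[i]
        cols.append(Fraction(det(M), d))
    return tuple(cols)

assert solve3([1, 0, 0]) == (-2, 2, 1)    # [1]_F
assert solve3([0, 1, 0]) == (3, -2, -1)   # [x]_F
assert solve3([0, 0, 1]) == (2, -1, -1)   # [x^2]_F
```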


4. [1]F = (1, 1, 1, 1)^T, [x]F = (0, 1, 2, 3)^T, [x²]F = (0, 1, 4, 9)^T, [x³]F = (0, 1, 8, 27)^T.
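By Exercise 3 b), the F-coordinates of a polynomial are simply its values at 0, 1, 2, 3, so the vectors in Exercise 4 can be checked by direct evaluation (a quick sketch, not part of the manual):

```python
# Coordinates relative to the basis F of Exercise 3: by part b),
# the coordinate vector of g is (g(0), g(1), g(2), g(3)).
def coords(g):
    return tuple(g(i) for i in range(4))

assert coords(lambda t: 1)    == (1, 1, 1, 1)
assert coords(lambda t: t)    == (0, 1, 2, 3)
assert coords(lambda t: t**2) == (0, 1, 4, 9)
assert coords(lambda t: t**3) == (0, 1, 8, 27)
```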

5. Assume that Span(u1 , . . . , uk ) = V and that c ∈ Fn . Let x ∈ V be the vector such that [x]B = c. By hypothesis x ∈ Span(u1 , . . . , uk ). By Theorem (1.29), c is a linear combination of ([u1 ]B , [u2 ]B , . . . , [uk ]B ), equivalently, c ∈ Span([u1 ]B , [u2 ]B , . . . , [uk ]B ). Conversely, assume that Span([u1 ]B , . . . , [uk ]B ) = Fn and that x ∈ V. Let c = [x]B . By hypothesis, c is a linear combination of ([u1 ]B , [u2 ]B , . . . , [uk ]B ). Then by Theorem (1.29) it follows that x is a linear combination of (u1 , u2 , . . . , uk ). 6. This follows from Exercise 5 and the half is good enough theorem.


Chapter 2

Linear Transformations

2.1. Introduction to Linear Transformations

1. Set e1 = (1, 0, 0)^T, e2 = (0, 1, 0)^T, e3 = (0, 0, 1)^T, so that (e1, e2, e3) is a basis of F³. Then T (ae1 + be2 + ce3) = a(1 + x + x²) + b(1 − x²) + c(−2 − x²).

2. We show the additive property is violated: T takes each of the matrices (1 0; 0 0) and (0 0; 0 1) to the zero matrix (0 0; 0 0). It should then be the case that T takes their sum (1 0; 0 1) to (0 0; 0 0) as well. However, T takes (1 0; 0 1) to (1 0; 0 0).

3. By definition, T (a(2, 0)^T + b(−3, 1)^T) = a(1, 0, 4)^T + b(0, 0, 5)^T. Now by Theorem (2.5) it follows that T is a linear transformation.

4. T ((x1, y1)^T + (x2, y2)^T) = T ((x1 + x2, y1 + y2)^T) = (e^{x1+x2}, e^{y1+y2})^T = (e^{x1}e^{x2}, e^{y1}e^{y2})^T = (e^{x1}, e^{y1})^T + (e^{x2}, e^{y2})^T = T ((x1, y1)^T) + T ((x2, y2)^T), where the addition on the right is the one in the target space. Likewise, T (c(x, y)^T) = T ((cx, cy)^T) = (e^{cx}, e^{cy})^T = ((e^x)^c, (e^y)^c)^T = c(e^x, e^y)^T = cT ((x, y)^T). By Theorem (2.5) it follows that T is a linear transformation.

5. Let u1, u2 ∈ U, c1, c2 ∈ F. Then (T ◦ S)(c1u1 + c2u2) = T (S(c1u1 + c2u2)). Since S is a linear transformation S(c1u1 + c2u2) = c1S(u1) + c2S(u2). It then follows that T (S(c1u1 + c2u2)) = T (c1S(u1) + c2S(u2)). Since T is a linear transformation T (c1S(u1) + c2S(u2)) = c1T (S(u1)) + c2T (S(u2)) = c1(T ◦ S)(u1) + c2(T ◦ S)(u2). It now follows that T ◦ S is a linear transformation.

6. Let v1, v2 ∈ V. Then by the definition of S + T we have (S + T )(v1 + v2) = S(v1 + v2) + T (v1 + v2). Since S, T are linear transformations


S(v1 + v2) = S(v1) + S(v2), T (v1 + v2) = T (v1) + T (v2).

We then have

S(v1 + v2) + T (v1 + v2) = [S(v1) + S(v2)] + [T (v1) + T (v2)] = [S(v1) + T (v1)] + [S(v2) + T (v2)] = (S + T )(v1) + (S + T )(v2).

Now let v ∈ V, c ∈ F. Then by the definition of S + T,

(S + T )(cv) = S(cv) + T (cv).

Since S, T are linear we have S(cv) = cS(v), T (cv) = cT (v). Then S(cv) + T (cv) = cS(v) + cT (v) = c[S(v) + T (v)] = c[(S + T )(v)].

7. Let v ∈ V and write v = x + y where x ∈ X and y ∈ Y.

a) P1(v) = x and P1(x) = x. Consequently, (P1 ◦ P1)(v) = P1(v). In exactly the same way, P2 ◦ P2 = P2.

b) Now (P1 + P2)(v) = P1(v) + P2(v) = x + y = v and therefore P1 + P2 is the identity transformation of V.

c) Note that P1(y) = P2(x) = 0. It now follows that (P1 ◦ P2)(v) = P1(P2(v)) = P1(y) = 0 and therefore P1 ◦ P2 = 0V. In a similar fashion P2 ◦ P1 = 0V.

8. By Exercise 7, P1 + P2 = IV. Then T = IV ◦ T = (P1 + P2) ◦ T = (P1 ◦ T ) + (P2 ◦ T ). Since P1 ◦ T and P2 ◦ T are linear transformations, by Lemma (2.2) T = (P1 ◦ T ) + (P2 ◦ T ) is a linear transformation.

9. Let v ∈ V be an arbitrary vector and set x = P1(v), y = P2(v). Then x ∈ X, y ∈ Y so x + y ∈ X + Y. By hypothesis, P1 + P2 = IV and therefore v = (P1 + P2)(v) = P1(v) + P2(v) = x + y. As v is arbitrary we conclude that V = X + Y. On the other hand, assume v ∈ X ∩ Y. Since v ∈ Y, P2(v) = v. Since v ∈ X, P1(v) = v. It then follows that (P1 ◦ P2)(v) = P1(P2(v)) = P1(v) = v. However, we are assuming that P1 ◦ P2 = 0V and therefore v = (P1 ◦ P2)(v) = 0. Thus, X ∩ Y = {0}.

10. Let B = (v1, . . . , vn) be a basis for V. Now T (B) = (T (v1), . . . , T (vn)) is a sequence of n vectors in W and dim(W ) = m < n. By the exchange theorem there are scalars c1, . . . , cn, not all zero, such that c1T (v1) + · · · + cnT (vn) = 0W. Since T is a linear transformation

0W = c1T (v1) + · · · + cnT (vn) = T (c1v1 + · · · + cnvn).

Since not all c1, c2, . . . , cn are zero and (v1, . . . , vn) is a basis, the vector v = c1v1 + · · · + cnvn ≠ 0V and satisfies T (v) = 0W.

11. Π(v1 + v2) = (v1 + v2) + W = (v1 + W ) + (v2 + W ) = Π(v1) + Π(v2). Π(cv) = (cv) + W = c(v + W ) = cΠ(v).

12. If wj ∈ R(T ) then there is a vector vj ∈ V such that T (vj) = wj. Now let w ∈ W be arbitrary. Since (w1, . . . , wm) is a spanning sequence for W there are scalars c1, . . . , cm such that

w = c1w1 + · · · + cmwm = c1T (v1) + · · · + cmT (vm).

Set v = c1v1 + · · · + cmvm. Since T is a linear transformation

T (v) = T (c1v1 + · · · + cmvm) = c1T (v1) + · · · + cmT (vm) = c1w1 + · · · + cmwm = w.

Since w is arbitrary this proves that R(T ) = W.

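The identities of Exercise 7 can be illustrated numerically. The sketch below uses a hypothetical example not taken from the text: V = R² with X the first coordinate axis and Y the second, so the projections become 2 × 2 matrices:

```python
# Projections onto X = span{(1,0)} and Y = span{(0,1)} inside V = R^2,
# represented as 2x2 matrices acting on column vectors.
P1 = [[1, 0], [0, 0]]
P2 = [[0, 0], [0, 1]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matadd(A, B):
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

I = [[1, 0], [0, 1]]
Z = [[0, 0], [0, 0]]

assert matmul(P1, P1) == P1  # P1 is idempotent      (Exercise 7 a)
assert matadd(P1, P2) == I   # P1 + P2 = I_V          (Exercise 7 b)
assert matmul(P1, P2) == Z   # P1 composed with P2 = 0 (Exercise 7 c)
```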

13. Since each T (vi) ∈ R(T ) and R(T ) is a subspace of W it follows that Span(T (v1), . . . , T (vn)) ⊂ R(T ), so we have to prove the reverse inclusion. Assume w ∈ R(T ). Then there exists v ∈ V such that T (v) = w. Since (v1, . . . , vn) is a basis for V there are scalars c1, . . . , cn such that v = c1v1 + · · · + cnvn. Then

w = T (v) = T (c1v1 + · · · + cnvn) = c1T (v1) + · · · + cnT (vn).

The last equality follows from Lemma (2.1). Clearly, c1T (v1) + · · · + cnT (vn) is in Span(T (v1), . . . , T (vn)), which completes the proof.

14. Let T : V → W be a linear transformation and let T (vj) = a1jw1 + · · · + amjwm. Consider the linear transformation

S = Σ_{i,j} aij Eij.

We claim that S = T. Towards this end it suffices to prove that S(vj) = T (vj) for all j. Now since Eik(vj) = 0W for k ≠ j we have

S(vj) = Σ_{i=1}^{m} aij Eij(vj) = Σ_{i=1}^{m} aij wi = a1jw1 + · · · + amjwm = T (vj)

as required.

On the other hand, suppose S = Σ_{i,j} aij Eij is the zero function from V to W which takes every vector of V to the zero vector of W. In particular, S(vj) = 0W. Thus, S(vj) = Σ_{i=1}^{m} aij Eij(vj) = Σ_{i=1}^{m} aij wi = 0W. However, since (w1, . . . , wm) is a basis of W we must have a1j = · · · = amj = 0. Since this holds for every j, 1 ≤ j ≤ n, we have all aij = 0 and the Eij are linearly independent. With what we have proved above, (E11, E21, . . . , Em1, E12, . . . , Em2, . . . , E1n, . . . , Emn) is a basis for L(V, W ). It now follows that dim(L(V, W )) = mn.

15. Let X consist of all pairs (A, φ) where A ⊂ B, φ is a linear transformation from Span(A) to W, and φ restricted to A is equal to f restricted to A. Order X as follows: (A, φ) ≤ (A′, φ′) if and only if A ⊂ A′ and φ′ restricted to Span(A) is equal to φ. We prove that every chain has an upper bound.

Thus, assume that C = {(Ai, φi) | i ∈ I} is a chain in X. Set A = ∪_{i∈I} Ai. Define φ as follows: if v ∈ Span(Ai) then φ(v) = φi(v). We need to show this is well defined. Suppose v ∈ Span(Ai) ∩ Span(Aj) for i, j ∈ I. Since C is a chain either Ai ⊂ Aj or Aj ⊂ Ai. Assume Ai ⊂ Aj. Also, since C is a chain, φj restricted to Ai is φi. It then follows that φj(v) = φi(v) since φi, φj are linear transformations.

Now by Zorn's lemma there exists a maximal element (A, φ). We need to show that A = B. Suppose to the contrary that A ≠ B, let v ∈ B \ A, and set A′ = A ∪ {v}. Define φ′ on Span(A) ⊕ Span(v) as follows: φ′(x + cv) = φ(x) + cf (v). Then φ′ is linear and φ′ restricted to A′ is equal to f restricted to A′. So, (A′, φ′) ∈ X, which contradicts the maximality of (A, φ). Therefore A = B as claimed.

16. Let c1, . . . , ck be scalars such that c1v1 + · · · + ckvk = 0V. We need to show that c1 = · · · = ck = 0. Applying T we get T (c1v1 + · · · + ckvk) = T (0V ) = 0W. By Lemma (2.1)

T (c1v1 + · · · + ckvk) = c1T (v1) + · · · + ckT (vk) = c1w1 + · · · + ckwk.

Since (w1, . . . , wk) is assumed to be linearly independent we have c1 = · · · = ck = 0 as required.
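Exercise 14's matrix units can be made concrete: identifying L(V, W ) with m × n matrices, the sketch below (an added illustration, not from the manual) checks for m = 2, n = 3 that an arbitrary matrix decomposes as Σ aij Eij, so the mn units span, matching dim L(V, W ) = mn = 6:

```python
m, n = 2, 3

def unit(i, j):
    """The matrix unit E_ij: 1 in row i, column j, zeros elsewhere."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(n)] for r in range(m)]

A = [[5, -1, 2], [0, 7, 3]]

# A = sum_{i,j} a_ij E_ij -- the decomposition used in Exercise 14
B = [[0] * n for _ in range(m)]
for i in range(m):
    for j in range(n):
        E = unit(i, j)
        for r in range(m):
            for c in range(n):
                B[r][c] += A[i][j] * E[r][c]

assert B == A
assert len([(i, j) for i in range(m) for j in range(n)]) == m * n  # mn units
```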


2.2. Range and Kernel of a Linear Transformation

1. There are three vectors in the basis for R(T ) and therefore rank(T ) = 3. Since dim(M23(R)) = 6, applying the rank-nullity theorem we get

rank(T ) + nullity(T ) = dim(M23(R))

3 + nullity(T ) = 6.

Thus, nullity(T ) = 3.

2. Ker(T ) = Span((x − a)(x − b), x(x − a)(x − b)) and the nullity of T is 2. Since dim(F3[x]) = 4, by the rank-nullity theorem we have rank(T ) = 2. We can see this directly by noting that T ((x − b)/(a − b)) = (1, 0)^T and T ((x − a)/(b − a)) = (0, 1)^T.

3. T (a + bx + cx² + dx³) = a(1, 1, 1, 1)^T + b(2, 3, 1, 2)^T + c(0, 1, −1, 0)^T + d(2, 1, 1, 2)^T. It follows that

R(T ) = Span((1, 1, 1, 1)^T, (2, 3, 1, 2)^T, (0, 1, −1, 0)^T, (2, 1, 1, 2)^T).

The first, second and fourth vectors are a basis for this subspace:

R(T ) = Span((1, 1, 1, 1)^T, (2, 3, 1, 2)^T, (2, 1, 1, 2)^T),

Ker(T ) = Span(2 − x + x²), rank(T ) = 3, and nullity(T ) = 1.

4. An arbitrary vector a + bx + cx² ∈ F2[x] is the image of (a b; c 0). This proves that T is surjective. T cannot be an isomorphism since dim(M22(F)) = 4 > 3 = dim(F2[x]).

5. The only solution to the associated homogeneous linear system in a, b, c is the trivial one a = b = c = 0. Therefore Ker(T ) = {0}, which implies that T is one-to-one. On the other hand, since dim(F³) = 3 and dim(M22(F)) = 4, the spaces cannot be isomorphic.

6. R(T ) = Span((1, 1, 1)^T, (−1, 1, 2)^T, (1, 1, 4)^T) = F³. Thus, T is surjective. By the half is good enough theorem for transformations, T is an isomorphism.

7. Let w ∈ W be arbitrary. Since T is onto there exists an element v ∈ V such that T (v) = w. Since S is onto there exists an element u ∈ U such that S(u) = v. Then (T ◦ S)(u) = T (S(u)) = T (v) = w. We have thus shown the existence of an element u ∈ U such that (T ◦ S)(u) = w. This proves that T ◦ S is surjective.

8. Suppose that (T ◦ S)(u) = (T ◦ S)(u′). Then T (S(u)) = T (S(u′)). Since T is one-to-one this implies that S(u) = S(u′). Since S is one-to-one we have u = u′. Thus, T ◦ S is one-to-one.
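The claims of Exercise 3 can be verified by direct arithmetic. The quick sketch below (an added check, not from the manual) confirms that the third column is −2·(first) + (second), that a non-zero 3 × 3 minor witnesses rank 3, and that T (2 − x + x²) = 0:

```python
cols = {
    '1':  (1, 1, 1, 1),   # image of 1
    'x':  (2, 3, 1, 2),   # image of x
    'x2': (0, 1, -1, 0),  # image of x^2
    'x3': (2, 1, 1, 2),   # image of x^3
}

def comb(coeffs):
    """Linear combination sum_k coeffs[k] * cols[k], componentwise."""
    return tuple(sum(coeffs[k] * v[i] for k, v in cols.items())
                 for i in range(4))

# the image of x^2 is a combination of the images of 1 and x ...
assert comb({'1': -2, 'x': 1, 'x2': 0, 'x3': 0}) == cols['x2']

# ... and T(2 - x + x^2) = 0, so 2 - x + x^2 spans the kernel
assert comb({'1': 2, 'x': -1, 'x2': 1, 'x3': 0}) == (0, 0, 0, 0)

# a non-zero 3x3 minor of (col 1, col x, col x^3) witnesses rank(T) = 3
M = [[cols['1'][i], cols['x'][i], cols['x3'][i]] for i in range(3)]
det = (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
     - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
     + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))
assert det != 0
```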

9. By Exercises 7 and 8 it follows that if S, T are isomorphisms then T ◦ S is a bijective function. By Exercise 5 of Section (2.1), T ◦ S is a linear transformation. Thus, T ◦ S is an isomorphism.


2.2. Range and Kernel of a Linear Transformation 10. Let BV = (v1 , . . . , vn ) be a basis of V and set wj = T (vj ). Since T is injective and BV is linearly independent BW = (w1 , . . . , wn ) is linearly independent by Theorem (2.11). Since V and W are isomorphic, dim(W ) = dim(V ) = n so that by the half is good enough theorem, BW is a basis for W. Now there exists a unique linear transformation S : W → V such S(wj ) = vj . Note that S ◦ T is a linear operator on V such that (S ◦ T )(vj ) = vj for all j. It then follows that S ◦ T = IV and hence S = T −1 . This proves that T −1 is a linear transformation.

Alternative proof (applies even when V is infinite dimensional): Assume w1, w2 ∈ W. We need to show that T⁻¹(w1 + w2) = T⁻¹(w1) + T⁻¹(w2). Set v1 = T⁻¹(w1) and v2 = T⁻¹(w2). Just as a reminder, vi, i = 1, 2, is the unique element of V such that T (vi) = wi. Since T is linear, T (v1 + v2) = T (v1) + T (v2) = w1 + w2. Therefore T⁻¹(w1 + w2) = v1 + v2 = T⁻¹(w1) + T⁻¹(w2).

Now assume that w ∈ W and c ∈ F. We need to show that T⁻¹(cw) = cT⁻¹(w). Set v = T⁻¹(w). Since T is linear, T (cv) = cT (v) = cw. We may therefore conclude that T⁻¹(cw) = cv = cT⁻¹(w).

11. Let BV = (v1, . . . , vn) be a basis for V. We know that R(T ) = Span(T (v1), . . . , T (vn)). Now let BW = (w1, . . . , wm) be a basis for W. By the exchange theorem, dim(V ) = n ≥ m = dim(W ).

12. Let BV = (v1, . . . , vn) be a basis for V and BW = (w1, . . . , wm) be a basis for W. Since T is injective, by Theorem (2.11), T (BV ) = (T (v1), . . . , T (vn)) is linearly independent. By the exchange theorem, dim(V ) = n ≤ m = dim(W ).

13. Let BW = (w1, . . . , wm) be a basis for W. Since T is surjective, for each i = 1, . . . , m there exists a vector vi such that T (vi) = wi. By Theorem (2.6) there exists a unique linear transformation S : W → V such that S(wi) = vi. Now T ◦ S : W → W is a linear transformation and (T ◦ S)(wi) = T (S(wi)) = T (vi) = wi. Since T ◦ S is a linear transformation and the identity when restricted to a basis of W, it follows that T ◦ S = IW. (Note: if T is not one-to-one then there will be other choices of (v1, . . . , vm) and therefore S will not be unique.)

14. If dim(V ) = dim(W ) then T is an isomorphism by the half is good enough theorem for linear transformations, and then T has a unique inverse (as a map) which is linear by Exercise 10. Therefore by Exercise 12 we may assume that dim(V ) < dim(W ). Let BV = (v1, . . . , vn) be a basis of V. Set wi = T (vi), i = 1, . . . , n. Since T is one-to-one the sequence (w1, . . . , wn) is linearly independent. We can then extend (w1, . . . , wn) to a basis (w1, . . . , wm) for W. Now there exists a unique linear transformation S : W → V such that S(wi) = vi for i ≤ n and S(wi) = 0V for i > n. Then S ◦ T : V → V is a linear transformation. Moreover, (S ◦ T )(vi) = S(T (vi)) = S(wi) = vi. It then follows that S ◦ T = IV.

15. To show S is well-defined we need to prove that if T1(v) = T1(v′) then T2(v) = T2(v′). If T1(v) = T1(v′) then T1(v − v′) = 0, that is, v − v′ ∈ Ker(T1). Since by hypothesis Ker(T1) = Ker(T2), v − v′ ∈ Ker(T2). Consequently, T2(v − v′) = 0. It then follows that T2(v) − T2(v′) = 0 and therefore T2(v) = T2(v′) as desired.

Now suppose u1, u2 ∈ R(T1) and c1, c2 are scalars. We need to show that S(c1u1 + c2u2) = c1S(u1) + c2S(u2). Toward that end, let v1, v2 ∈ V be such that T1(v1) = u1, T1(v2) = u2. Then T1(c1v1 + c2v2) = c1T1(v1) + c2T1(v2) = c1u1 + c2u2. Therefore S(c1u1 + c2u2) = T2(c1v1 + c2v2). Since T2 is linear, T2(c1v1 + c2v2) = c1T2(v1) + c2T2(v2) = c1S(T1(v1)) + c2S(T1(v2)) = c1S(u1) + c2S(u2).

16. Suppose first that Ker(T ) = {0}. Then T is injective and by the half is good enough theorem an isomorphism. Since the composition of injective functions is injective, T^k is also injective for all k ∈ N. In particular, T^n and T^{n+1} are injective and consequently, Ker(T^n) = Ker(T^{n+1}) = {0}. Moreover, T^n and T^{n+1} are surjective and therefore


Range(T^n) = Range(T^{n+1}) = V. Therefore we may assume that T is not injective.

Let m be a natural number and suppose v ∈ Ker(T^m). Then T^{m+1}(v) = T(T^m(v)) = T(0) = 0. Thus, Ker(T^m) ⊂ Ker(T^{m+1}). Next, assume w ∈ Range(T^{m+1}). Then there is a vector x ∈ V such that w = T^{m+1}(x) = T^m(T(x)) ∈ Range(T^m). Thus, Range(T^{m+1}) ⊂ Range(T^m).

It follows that nullity(T^{m+1}) ≥ nullity(T^m) and rank(T^{m+1}) ≤ rank(T^m). Consider the sequence of numbers (nullity(T), nullity(T^2), . . . , nullity(T^n)). Each is a natural number between 1 and n. If they are all distinct they are strictly increasing and we must have nullity(T^n) = n, that is, T^n is the zero map. But then T^{n+1} is also the zero map and in this case T^n = T^{n+1}. In the case that they are not all distinct we must have some m < n such that nullity(T^m) = nullity(T^{m+1}), so that Ker(T^m) = Ker(T^{m+1}). It then follows that Ker(T^k) = Ker(T^m) for all k ≥ m. In particular, Ker(T^m) = Ker(T^n) = Ker(T^{n+1}). By Theorem (2.9) it then follows that rank(T^n) = rank(T^{n+1}). Since Range(T^{n+1}) ⊂ Range(T^n) we conclude that Range(T^{n+1}) = Range(T^n).

17. Since dim(Range(T^n)) + dim(Ker(T^n)) = dim(V) it suffices to prove that Range(T^n) ∩ Ker(T^n) = {0}. From the proof of Exercise 16, Ker(T^k) = Ker(T^n) and Range(T^k) = Range(T^n) for all k ≥ n. Set S = T^n. It then follows that Ker(S^2) = Ker(S) and Range(S^2) = Range(S). So, assume that v ∈ Ker(S) ∩ Range(S). Then v = S(w) for some w ∈ V. Now S^2(w) = S(v) = 0 since v ∈ Ker(S). Thus, w ∈ Ker(S^2), whence w ∈ Ker(S). Thus, v = S(w) = 0 as required.

18. This follows immediately from Theorem (2.7) and the fact, established in the proof of Exercise 16, that Range(T^2) ⊂ Range(T) and Ker(T) ⊂ Ker(T^2).

19. a) Since TS = 0_{V→V} it follows that Range(S) ⊂ Ker(T), from which we conclude that rank(S) ≤ nullity(T) = n − k.

b) Let (v1, . . . , v_{n−k}) be a basis for Ker(T) and extend to a basis (v1, . . . , vn) of V. There exists a unique operator S such that S(vi) = vi if 1 ≤ i ≤ n − k and S(vi) = 0 if i > n − k. Then Range(S) = Ker(T), so that rank(S) = n − k and TS = 0_{V→V}.

20. a) Since ST = 0_{V→V} it follows that Range(T) ⊂ Ker(S). Consequently, nullity(S) = dim(Ker(S)) ≥ dim(Range(T)) = rank(T) = k. It then follows that rank(S) ≤ n − k.

b) Let (v1, . . . , vk) be a basis for Range(T) and extend to a basis (v1, . . . , vn). Let S be the unique linear operator such that S(vi) = 0 if 1 ≤ i ≤ k and S(vi) = vi if i > k. Then Range(S) = Span(v_{k+1}, . . . , vn), so that rank(S) = n − k. Since Range(T) ⊂ Ker(S), ST = 0_{V→V}.

21. Let dim(V) = n. Since T^2 = 0_{V→V} it follows that Range(T) ⊂ Ker(T), whence rank(T) = k ≤ nullity(T) = n − k. It then follows that 2k ≤ n, so that k ≤ n/2.

22. Let T be the unique linear operator such that T(vi) = v_{i+m} if 1 ≤ i ≤ m and T(vi) = 0 if m + 1 ≤ i ≤ n.

2.3. Correspondence and Isomorphism Theorems

1. By Theorem (2.19) we have

V/W = (X1 + W)/W ≅ X1/(X1 ∩ W)

V/W = (X2 + W)/W ≅ X2/(X2 ∩ W)

Thus, X1/(X1 ∩ W) ≅ X2/(X2 ∩ W).

2. If V = X1 ⊕ W then by Theorem (2.19) V/W = (X1 + W)/W ≅ X1/(X1 ∩ W) = X1 since X1 ∩ W = {0}. Similarly, V/W ≅ X2. Thus, X1 ≅ X2.
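The stabilization argument in Exercises 16 and 17 can be checked numerically. Below is a minimal sketch (the matrix N is a hypothetical example, not from the text): the ranks of successive powers are non-increasing and become constant once rank(N^m) = rank(N^{m+1}).

```python
# Ranks of powers of an operator stabilize: here N combines a nilpotent
# shift block with an invertible 1x1 block, so ranks drop 3, 2, 1 and
# then stay constant from the third power on.
import numpy as np

N = np.array([[0, 1, 0, 0],
              [0, 0, 1, 0],
              [0, 0, 0, 0],
              [0, 0, 0, 2]], dtype=float)

ranks = [np.linalg.matrix_rank(np.linalg.matrix_power(N, k)) for k in range(1, 6)]
print(ranks)  # [3, 2, 1, 1, 1]
```

In particular rank(N^4) = rank(N^5), matching the claim that Range(T^n) = Range(T^{n+1}) once n = dim(V).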

K23692_SM_Cover.indd 30

02/06/15 3:11 pm

3. Since f is not the zero map, by Theorem (2.17) it follows that V/Ker(f) ≅ Range(f). Since f is not the zero map, f must be surjective, that is, Range(f) = F.

4. Let u ∈ U. Then T(u) = u, which implies that S(u) = (T − I_V)(u) = 0. Now assume that v ∈ V is arbitrary. Since T(v + U) = v + U it follows that S(v) = (T − I_V)(v) = T(v) − v ∈ U. Then S^2(v) = S(S(v)) = 0.

5. a) Assume T(v) = v. Then (S + I_V)(v) = S(v) + I_V(v) = S(v) + v = v and therefore S(v) = 0. Conversely, if v ∈ Ker(S) then T(v) = (S + I_V)(v) = S(v) + I_V(v) = 0 + v = v.

b) Let v ∈ V. Then T(v) − v = T(v) − I_V(v) = (T − I_V)(v) = S(v). Since S^2 is the zero map, S(v) ∈ Ker(S) = U. Therefore T(v) − v ∈ U, which is equivalent to T(v + U) = v + U.

6. Define a map T : U ⊕ V → (U/X) ⊕ (V/Y) by

T(u, v) = (u + X, v + Y).

This map is surjective and has kernel X ⊕ Y. By Theorem (2.17), (U ⊕ V)/(X ⊕ Y) is isomorphic to (U/X) ⊕ (V/Y).

7. a) We need to show that Γ is closed under addition and scalar multiplication. Suppose v, w ∈ V. Then (v, T(v)) and (w, T(w)) are two typical elements of Γ. Since T is linear,

(v, T(v)) + (w, T(w)) = (v + w, T(v) + T(w)) = (v + w, T(v + w)) ∈ Γ.

Similarly, c(v, T(v)) = (cv, cT(v)) = (cv, T(cv)) ∈ Γ for any scalar c.

b) The subspace V1 = {(v, 0) | v ∈ V} is a complement to Γ in V ⊕ V and is isomorphic to V. It then follows from Theorem (2.19) that

(V ⊕ V)/Γ ≅ (V1 + Γ)/Γ = V1/(V1 ∩ Γ) = V1,

the last equality since V1 ∩ Γ = {0}.

8. Since U + W is a subspace of V, (U + W)/W is a subspace of V/W. By hypothesis, dim(V/W) = n and therefore dim((U + W)/W) ≤ n. By Theorem (2.19)

(U + W)/W ≅ U/(U ∩ W).

We therefore conclude that dim(U/(U ∩ W)) ≤ n. Now by Theorem (2.18)

(V/(U ∩ W))/(U/(U ∩ W)) ≅ V/U.

Thus, dim((V/(U ∩ W))/(U/(U ∩ W))) = dim(V/U) = m. It now follows that dim(V/(U ∩ W)) ≤ dim(U/(U ∩ W)) + dim(V/U) ≤ m + n.

2.4. Matrix of a Linear Transformation

1. Assume T is onto. It follows from Exercise 13 of Section (2.1) that Span(T(v1), . . . , T(vn)) = W. By Exercise 5 of Section (1.8) it follows that Span([T(v1)]_{BW}, . . . , [T(vn)]_{BW}) = F^m. However, the coordinate vectors [T(vj)]_{BW}, j = 1, . . . , n, are just the columns of A.

Conversely, assume the columns of A span F^m. Then ([T(v1)]_{BW}, . . . , [T(vn)]_{BW}) spans F^m, and it follows that (T(v1), . . . , T(vn)) spans W.

2. Assume T is injective. Then by Theorem (2.11) (T(v1), . . . , T(vn)) is linearly independent. Then by Theorem (1.30) ([T(v1)]_{BW}, . . . , [T(vn)]_{BW}) is linearly independent in F^m. However, ([T(v1)]_{BW}, . . . , [T(vn)]_{BW}) is the sequence of columns of the matrix A.

Conversely, assume that the sequence of columns of the matrix A is linearly independent in F^m. By definition of the matrix A this means that ([T(v1)]_{BW}, . . . , [T(vn)]_{BW}) is linearly independent in F^m. By Theorem (1.30) we can conclude that (T(v1), . . . , T(vn)) is linearly independent in W. Finally, since BV = (v1, . . . , vn) is a basis for V, by Theorem (2.11) it follows that T is injective.

3. The matrix

A =
[0 1]
[0 0]

is non-zero but A^2 = 0_{2×2}. Let T be the operator on R^2 which has matrix A with respect to the standard basis. Thus, T maps (x, y)^T to (y, 0)^T.

4. There are lots of examples. Here is one possible pair:

A =
[1 0]
[1 0],
B =
[ 1  1]
[−1 −1].

5. M_T(S, S) =
[2 2 1]
[1 1 0]
[1 0 0]
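Exercise 3's example can be checked directly — a quick numeric sketch:

```python
# A is non-zero, yet A squared is the zero matrix (A is nilpotent).
import numpy as np

A = np.array([[0, 1],
              [0, 0]])
A2 = A @ A
print((A != 0).any(), (A2 == 0).all())  # True True
```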

6. Let S_n be the standard basis of F^n and S_m the standard basis of F^m. Set A = M_T(S_n, S_m). Then for any vector v ∈ F^n, T(v) = Av.

7. Let T ∈ L(F^n, F^m) be such that A = M_T(S_n, S_m). Since the columns of A span F^m, by Exercise 1 T is surjective. By Exercise 13 of Section (2.2) there is a linear transformation S : F^m → F^n such that TS = I_{F^m}. Let B = M_S(S_m, S_n). It then follows that AB = M_{I_{F^m}}(S_m, S_m) = I_m.

8. Let T ∈ L(F^n, F^m) be such that A = M_T(S_n, S_m). Since the sequence of columns of A is linearly independent, by Exercise 2 the operator T is injective. Then by Exercise 14 of Section (2.2) there is a linear transformation S : F^m → F^n such that ST = I_{F^n}. Set B = M_S(S_m, S_n). Then BA = M_{I_{F^n}}(S_n, S_n) = I_n.

9. The reduced echelon form of A is

[1 0  3 0]
[0 1 −2 0]
[0 0  0 1]

Since every row is non-zero, the columns of A span Q^3. One such matrix is

[ 4 −2 −1]
[−5  3  2]
[ 0  0  0]
[ 2 −1 −1]

10. The reduced echelon form of A is

[1 0 0]
[0 1 0]
[0 0 1]
[0 0 0]

Since the columns of this matrix are linearly independent, the columns of A are linearly independent. One such matrix is

[ 2 −1 −2 0]
[−1  0  1 0]
[ 0 −1  1 0]

11. Since A = MT (BV , BW ) it follows that [T (v)]BW = A[v]BV . Suppose v ∈ Ker(T ) so that T (v) = 0W . Thus A[v]BV = 0 from which we conclude that [v]BV ∈ null(A). On the other hand if [v]BV ∈ null(A) then [T (v)]BW = 0 from which we conclude that T (v) = 0W and v ∈ Ker(T ).

2.5. The Algebra of L(V, W ) and Mmn(F)

1. Let

A =
[1 1]
[0 1],
B =
[1 0]
[1 1].

Then

AB =
[2 1]
[1 1],
BA =
[1 1]
[1 2].
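A quick check of this pair (as read from the displayed matrices): matrix multiplication is not commutative.

```python
# AB and BA differ for this pair of 2x2 matrices.
import numpy as np

A = np.array([[1, 1],
              [0, 1]])
B = np.array([[1, 0],
              [1, 1]])
print(A @ B)  # [[2 1], [1 1]]
print(B @ A)  # [[1 1], [1 2]]
```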

Now let (v1, v2) be linearly independent in V and extend to a basis B = (v1, . . . , vn) for V. Let S be the linear operator on V such that S(v1) = v1, S(v2) = v1 + v2 and S(vi) = vi for 3 ≤ i ≤ n. Let T be the linear operator on V such that T(v1) = v1 + v2, T(v2) = v2 and T(vi) = vi for 3 ≤ i ≤ n.

Then (ST)(v1) = S(T(v1)) = S(v1 + v2) = S(v1) + S(v2) = v1 + (v1 + v2) = 2v1 + v2, while (TS)(v1) = T(S(v1)) = T(v1) = v1 + v2 ≠ (ST)(v1).


2. Let (v1, v2) be linearly independent and extend to a basis B = (v1, . . . , vn) for V. Let S be the operator on V such that S(v1) = v1 + v2, S(v2) = v1 + v2 and S(vi) = 0_V for 3 ≤ i ≤ n. Let T be the operator on V such that T(v1) = v1 − v2, T(v2) = v1 − v2 and T(vi) = 0_V for 3 ≤ i ≤ n. If v ∈ B \ {v1, v2} then (ST)(v) = S(T(v)) = S(0) = 0. On the other hand,

(ST)(v1) = S(T(v1)) = S(v1 − v2) = S(v1) − S(v2) = (v1 + v2) − (v1 + v2) = 0,

(ST)(v2) = S(T(v2)) = S(v1 − v2) = 0.

3. Since 1a = a1, the identity of A is in C_A(a) and so C_A(a) has an identity. We need to show that C_A(a) is closed under addition, multiplication and scalar multiplication. Suppose b, c ∈ C_A(a). Then (b + c)a = ba + ca = ab + ac = a(b + c), so b + c ∈ C_A(a). We also have (bc)a = b(ca) = b(ac) = (ba)c = (ab)c = a(bc), and so bc ∈ C_A(a). Finally, if d is a scalar we have (db)a = d(ba) = d(ab) = a(db).

4. We prove the only non-zero ideal in M_nn(F) is M_nn(F) itself. Suppose J is an ideal and A, with entries a_ij, is in J and A is not the zero matrix. Then a_ij ≠ 0 for some i, j. Let E_ii be the matrix with zeros in all entries except the (i, i)-entry, which is a 1. The matrix E_ii A E_jj = a_ij E_ij is in J since J is an ideal. Since a_ij ≠ 0 we can multiply by its reciprocal, and therefore E_ij ∈ J.

Now let P_kl be the matrix obtained from the identity matrix by exchanging the k and l columns (rows). If B is an n × n matrix then P_kl B is the matrix obtained from B by exchanging the k and l rows, and B P_kl is the matrix obtained from B by exchanging the k and l columns. It then follows that P_ik E_ij P_jl = E_kl is in J. However, the matrices E_kl span M_nn(F) and it follows that J = M_nn(F). Since M_nn(F) and L(V, V) are isomorphic as algebras, the only ideals in L(V, V) are the zero ideal and all of L(V, V).

5. Clearly the sum of two upper triangular matrices is upper triangular. Consider the product of two upper triangular matrices A and B. The (i, j)-entry of AB is the dot product of the i-th row of A, (0, . . . , 0, a_ii, . . . , a_in), with the j-th column of B, (b_1j, . . . , b_jj, 0, . . . , 0)^T. If i > j then this product is zero, and AB is upper triangular.

6. U_nn(F) is a subspace. We only need to show that the diagonal entries of the product of a strictly upper triangular matrix and an upper triangular matrix are zero. Suppose A is strictly upper triangular and B is upper triangular. Then the (i, i)-entry of AB is

(0, . . . , 0, a_{i,i+1}, . . . , a_in) · (b_1i, . . . , b_ii, 0, . . . , 0)^T = 0,

since the non-zero entries do not overlap. Similarly the diagonal entries of BA are zero.

7. Assume T ∈ L(V, V) is not a unit. Then T is not injective by the half is good enough theorem, and Ker(T) ≠ {0}. Let v be a non-zero vector in Ker(T). Choose a basis (v1, . . . , vn) and let S be the operator such that S(vi) = v for all i. Then S is not the zero operator but Range(S) = Span(v). It then follows that TS is the zero operator.


2.6. Invertible Transformations and Matrices

1.
[ 4  5  2]
[ 2  3  1]
[−1 −1 −1]

2. S −1 (a + bx + cx2 ) = (b − c) + (2a − b)x + (−3a + b + c)x2 .

3. Set w_i = T(v_i). Assume that (w1, . . . , wn) is a basis for W. Then there exists a unique linear transformation S : W → V such that S(w_i) = v_i. Then ST(v_i) = v_i, and since ST is linear, ST = I_V. Similarly, TS = I_W. So in this case T is an isomorphism. On the other hand, assume T is an isomorphism. Then T is injective and so (w1, . . . , wn) is linearly independent. Also, T is surjective. However, Range(T) = Span(w1, . . . , wn). Thus, (w1, . . . , wn) is a basis of W.

4. This follows from Exercise 3.

5. From Exercise 4 we need to count the number of bases (v1, v2, v3) there are in F_2^3. There are 7 non-zero vectors and v1 can be any of these vectors. v2 can be any vector except 0 and v1, and so there are 8 − 2 = 6 choices for v2. Having chosen v1, v2 we can choose v3 to be any vector not in Span(v1, v2) = {0, v1, v2, v1 + v2}. So there are 8 − 4 = 4 choices for v3. So the number of bases is 7 × 6 × 4 = 168.

6. From Exercise 4 we need to count the number of bases (v1, v2, v3) there are in F_3^3. There are 26 non-zero vectors and v1 can be any of these vectors. v2 can be any vector except 0 and ±v1, and so there are 27 − 3 = 24 choices for v2. Having chosen v1, v2 we can choose v3 to be any vector not in Span(v1, v2). There are 9 vectors in Span(v1, v2), and so there are 27 − 9 = 18 choices for v3. So the number of bases is 26 × 24 × 18 = 11232 = 2^5 · 3^3 · 13.

7. This is the same as Exercise 7 of Section (2.5).

8. Assume S, T are invertible operators on the space V. Then

(ST)(T^{−1}S^{−1}) = S[T(T^{−1}S^{−1})] = S[(TT^{−1})S^{−1}] = S[I_V S^{−1}] = SS^{−1} = I_V,

(T^{−1}S^{−1})(ST) = T^{−1}[S^{−1}(ST)] = T^{−1}[(S^{−1}S)T] = T^{−1}[I_V T] = T^{−1}T = I_V.

9. By the distributive property, the map T ↦ ST is a linear operator on L(V, V). For any T ∈ L(V, V) we have

S^{−1}(ST) = (S^{−1}S)T = I_V T = T,

S(S^{−1}T) = (SS^{−1})T = I_V T = T.

So T ↦ ST is an invertible operator on L(V, V), with inverse T ↦ S^{−1}T.

10. (I_V − S)(I_V + S + · · · + S^{k−1}) = I_V + S + · · · + S^{k−1} − (S + · · · + S^{k−1} + S^k) = I_V − S^k = I_V, since S^k = 0_{V→V}. A similar calculation gives (I_V + S + · · · + S^{k−1})(I_V − S) = I_V.

11. Every operator T is similar to itself via the identity: I_V T I_V^{−1} = T. Therefore the relation is reflexive.

Assume the operators S and T are similar via Q, that is, T = QSQ^{−1}. Then S = Q^{−1}TQ = Q^{−1}T(Q^{−1})^{−1}. So, setting P = Q^{−1} we have S = PTP^{−1}. The relation of similarity is symmetric.

Assume S = QRQ^{−1} and T = PSP^{−1}. Then T = P(QRQ^{−1})P^{−1} = (PQ)R(Q^{−1}P^{−1}) = (PQ)R(PQ)^{−1}, which implies that similarity is a transitive relation.

12. Suppose T2 = QT1Q^{−1}. Then

MT2 (B, B) = MQ (B, B)MT1 (B, B)MQ−1 (B, B). Set A = MQ (B, B) then MQ−1 (B, B) = A−1 . Thus, MT2 (B, B) = AMT1 (B, B)A−1 and so the matrices are similar. 13. We are assuming there is an invertible matrix A such that MT2 (B, B) = AMT1 (B, B)A−1 . Let Q be the operator on V such that the matrix of Q with respect to B is A: MQ (B, B) = A. Let T = QT1 Q−1 . Then

MT(B, B) = MQ(B, B) MT1(B, B) MQ^{−1}(B, B) = MQ(B, B) MT1(B, B) MQ(B, B)^{−1} = A MT1(B, B) A^{−1} = MT2(B, B).

It follows that T = T2, and since T = QT1Q^{−1}, T2 and T1 are similar.

14. MT2(B, B) is similar to MT2(B′, B′). Since MT1(B, B) is similar to MT2(B′, B′) by hypothesis, by transitivity MT1(B, B) and MT2(B, B) are similar. Now by Exercise 13, T1 and T2 are similar operators.
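The count in Exercise 5 (and Exercise 6) can be brute-forced. This sketch checks linear independence by Gaussian elimination mod p over all ordered triples:

```python
# Count ordered bases of F_p^3 by brute force; the answers should match
# 7*6*4 = 168 for p = 2 and 26*24*18 = 11232 for p = 3.
from itertools import product

def rank_mod(rows, p, n):
    # Gaussian elimination over the field F_p
    rows = [list(r) for r in rows]
    rank = 0
    for col in range(n):
        piv = next((i for i in range(rank, len(rows)) if rows[i][col] % p), None)
        if piv is None:
            continue
        rows[rank], rows[piv] = rows[piv], rows[rank]
        inv = pow(rows[rank][col], -1, p)
        rows[rank] = [a * inv % p for a in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][col] % p:
                f = rows[i][col]
                rows[i] = [(a - f * b) % p for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank

def count_bases(p):
    vecs = list(product(range(p), repeat=3))
    return sum(1 for t in product(vecs, repeat=3) if rank_mod(t, p, 3) == 3)

print(count_bases(2), count_bases(3))  # 168 11232
```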


Chapter 3

Polynomials

3.1. The Algebra of Polynomials

1. x^2 + 1.

2. Assume F(x), G(x) are in J. Then there are a_i(x), b_i(x), i = 1, 2, such that

F(x) = a1(x)f(x) + b1(x)g(x),  G(x) = a2(x)f(x) + b2(x)g(x).

Then

F(x) + G(x) = [a1(x) + a2(x)]f(x) + [b1(x) + b2(x)]g(x) ∈ J.

3. Assume F(x) ∈ J and c(x) ∈ F[x]. We need to show c(x)F(x) ∈ J. Now there are a(x), b(x) ∈ F[x] such that F(x) = a(x)f(x) + b(x)g(x). Then

c(x)F(x) = [c(x)a(x)]f(x) + [c(x)b(x)]g(x) ∈ J.

4. Suppose d(x) is monic and has least degree in J. We want to show that every element of J is a multiple of d(x). Let f(x) ∈ J. Apply the division algorithm to write f(x) = q(x)d(x) + r(x), where either r(x) is the zero polynomial or deg(r(x)) < deg(d(x)). Suppose to the contrary that r(x) ≠ 0. By the definition of an ideal, −q(x)d(x) ∈ J. Since f(x), −q(x)d(x) ∈ J we get r(x) = f(x) − q(x)d(x) ∈ J. However, deg(r(x)) < deg(d(x)), which contradicts the minimality of the degree of d(x). So r(x) is the zero polynomial and d(x) divides f(x).

Now suppose d(x), d′(x) are both of minimal degree and monic. Then d(x) divides d′(x) and d′(x) divides d(x). This implies there is a scalar c ∈ F such that d′(x) = cd(x). Since both are monic, c = 1 and they are equal.

5. Let a(x) and b(x) be polynomials such that a(x)f(x) + b(x)g(x) = d(x). Writing f(x) = f′(x)d(x) and g(x) = g′(x)d(x), we then have

a(x)f′(x)d(x) + b(x)g′(x)d(x) = d(x),

whence

a(x)f′(x) + b(x)g′(x) = 1.

We therefore conclude that f′(x) and g′(x) are relatively prime.
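Exercises 4 and 5 can be illustrated with sympy's extended Euclidean algorithm. The polynomials below are hypothetical examples, not from the text; `gcdex` returns a(x), b(x), d(x) with a·f + b·g = d, and the quotients f/d, g/d come out relatively prime.

```python
# Bezout identity for polynomials over Q, and relative primality of the
# quotients by the gcd.
import sympy as sp

x = sp.symbols('x')
f = sp.expand((x - 1)**2 * (x + 2))   # hypothetical example
g = sp.expand((x - 1) * (x + 3))      # hypothetical example

a, b, d = sp.gcdex(f, g, x)           # a*f + b*g = d (monic gcd)
print(d)                              # x - 1
fq = sp.quo(f, d, x)
gq = sp.quo(g, d, x)
print(sp.gcd(fq, gq))                 # 1
```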

02/06/15 3:11 pm

6. Write f(x) = f′(x)d(x), g(x) = g′(x)d(x). Then f′, g′ are relatively prime. Let l(x) be the least common multiple of f(x) and g(x). Now

f(x)g(x)/d(x) = f′(x)g′(x)d(x)

is divisible by f(x) and g(x), and therefore l(x) divides f(x)g(x)/d(x). On the other hand, let l(x) = l′(x)d(x). Since f(x) = f′(x)d(x) divides l(x) = l′(x)d(x), we conclude that f′(x) divides l′(x). Similarly, g′(x) divides l′(x). Since f′(x) and g′(x) are relatively prime it then follows that f′(x)g′(x) divides l′(x). Consequently, f(x)g(x)/d(x) = f′(x)g′(x)d(x) divides l′(x)d(x) = l(x). It now follows that f(x)g(x)/d(x) is a scalar multiple of l(x).

7. If l(x) and l′(x) are both lcms of f(x), g(x) then l(x) | l′(x) and l′(x) | l(x), so that l′(x) = cl(x) for some c ∈ F. Since both are monic, c = 1 and l(x) = l′(x).

8. Without loss of generality we can assume that f(x) is monic. Let d(x) be the gcd of f(x) and g(x). Since f(x) is irreducible and d(x) divides f(x), either d(x) = 1 or d(x) = f(x). However, in the latter case f(x) = d(x) divides g(x), contrary to the hypothesis. Thus, d(x) = 1 and f(x), g(x) are relatively prime.

9. This follows from Exercise 6.

10. Let d(x) be the greatest common divisor of f(x) and g(x) in F[x] and assume, to the contrary, that d(x) ≠ g(x). Write f(x) = f′(x)d(x), g(x) = g′(x)d(x). As in the proof of Exercise 6, there are a(x), b(x) ∈ F[x] such that a(x)f′(x) + b(x)g′(x) = 1. Since F ⊂ K it follows that f′(x) and g′(x) are relatively prime in K[x]. But then the gcd of f(x) and g(x) in K[x] is d(x), contrary to the assumption that g(x) divides f(x) in K[x]. We therefore have a contradiction, and consequently d(x) = g(x) as required.

11. A polynomial g(x) divides f(x) if and only if g(x) has a factorization c p1(x)^{f1} · · · pt(x)^{ft} where c ∈ F and the fi are in N ∪ {0} with fi ≤ ei. There are then ei + 1 choices for fi, and hence (e1 + 1) × · · · × (et + 1) choices for (f1, . . . , ft). For any such choice there is a unique c such that the polynomial is monic.

3.2. Roots of Polynomials

1. Since non-real complex roots of a real polynomial come in conjugate pairs, the number of non-real complex roots of a real polynomial is always even. Therefore, if the degree of a real polynomial is 2n + 1, an odd number, there must be a real root.

2. x^4 + 5x^2 + 4. Another, which is irreducible over the rational numbers, is x^4 + 2x^2 + 2.

3. Note that complex conjugation satisfies

z̄ + w̄ = (z + w)‾,  z̄ w̄ = (zw)‾.    (3.1)

If λ is a root of f(x) then

λ^n + a_{n−1}λ^{n−1} + · · · + a1λ + a0 = 0.    (3.2)

Taking the complex conjugate of both sides of (3.2) and using (3.1) we obtain

λ̄^n + ā_{n−1}λ̄^{n−1} + · · · + ā1λ̄ + ā0 = 0.

Thus, λ̄ is a root of f̄(x).

4. x^4 − 6x^3 + 15x^2 − 18x + 10.

5. If 3 + 4i is a root of f(x) then so is its conjugate 3 − 4i. Then (x − [3 + 4i])(x − [3 − 4i]) = x^2 − 6x + 25 is a factor of f(x). Likewise x^2 − 6x + 25 is a factor of g(x).

6. Let n = max{deg(f(x)), deg(g(x))}. Then f(x), g(x) ∈ F(n)[x] and we can write f(x) = Σ_{i=0}^n ai x^i, g(x) = Σ_{i=0}^n bi x^i. Then f(x) + g(x) = Σ_{i=0}^n (ai + bi)x^i. By the definition of D we have

D(f(x)) = Σ_{i=1}^n i ai x^{i−1},  D(g(x)) = Σ_{i=1}^n i bi x^{i−1}.

Adding, we get

D(f(x)) + D(g(x)) = Σ_{i=1}^n (i ai + i bi)x^{i−1} = Σ_{i=1}^n i(ai + bi)x^{i−1} = D(f(x) + g(x)).
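The polynomial in Exercise 4 (as reconstructed, x^4 − 6x^3 + 15x^2 − 18x + 10) can be checked with sympy: its roots are the conjugate pairs 1 ± i and 2 ± i, so it factors over R into two irreducible quadratics.

```python
# Non-real roots of a real polynomial come in conjugate pairs.
import sympy as sp

x = sp.symbols('x')
f = x**4 - 6*x**3 + 15*x**2 - 18*x + 10
print(sp.factor(f))   # (x**2 - 2*x + 2)*(x**2 - 4*x + 5)
r = sp.roots(f)
print(r)              # roots 1+I, 1-I, 2+I, 2-I, each simple
```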

7. Assume f(x) = Σ_{i=0}^n ai x^i. Then cf(x) = Σ_{i=0}^n (c ai)x^i. We then have

D(cf(x)) = Σ_{i=1}^n i(c ai)x^{i−1} = Σ_{i=1}^n c(i ai)x^{i−1} = c Σ_{i=1}^n i ai x^{i−1} = cD(f(x)).

8. Let k, l be natural numbers. We first prove that D(x^{k+l}) = D(x^k)x^l + x^k D(x^l). By the definition of D we have

D(x^k)x^l + x^k D(x^l) = kx^{k−1}x^l + x^k(l x^{l−1}) = kx^{k+l−1} + lx^{k+l−1} = (k + l)x^{k+l−1} = D(x^{k+l}) = D(x^k · x^l).

We next prove that if g(x) ∈ F[x] and k is a natural number then D(x^k g(x)) = D(x^k)g(x) + x^k D(g(x)). Write g(x) = Σ_{i=0}^m bi x^i. Then x^k g(x) = Σ_{i=0}^m bi(x^k · x^i). By Exercises 6 and 7 we have

D(Σ_{i=0}^m bi(x^k · x^i)) = Σ_{i=0}^m bi D(x^k · x^i).

By what we showed immediately above,

Σ_{i=0}^m bi D(x^k · x^i) = Σ_{i=0}^m bi[D(x^k)x^i + x^k D(x^i)] = D(x^k) Σ_{i=0}^m bi x^i + x^k Σ_{i=0}^m bi D(x^i) = D(x^k)g(x) + x^k D(g(x)).

Now write f(x) = Σ_{j=0}^m aj x^j. Then f(x)g(x) = Σ_{j=0}^m aj x^j g(x). By Exercises 6 and 7 we have

D(Σ_{j=0}^m aj x^j g(x)) = Σ_{j=0}^m aj D(x^j g(x)).

Then by what we showed immediately above,

Σ_{j=0}^m aj D(x^j g(x)) = Σ_{j=0}^m aj[D(x^j)g(x) + x^j D(g(x))] = [Σ_{j=0}^m (j aj)x^{j−1}]g(x) + [Σ_{j=0}^m aj x^j]D(g(x)) = D(f(x))g(x) + f(x)D(g(x)).

9. Suppose α is a root of multiplicity at least two. Then (x − α)^2 divides f(x), so we can write f(x) = (x − α)^2 g(x). Then D(f(x)) = 2(x − α)g(x) + (x − α)^2 D(g(x)). Consequently (x − α) is a factor of D(f(x)) as well as of f(x), so they are not relatively prime.

On the other hand, suppose f(x) has n distinct roots α1, . . . , αn. Then f(x) = (x − α1) · · · (x − αn) and

D(f(x)) = Σ_{i=1}^n f(x)/(x − αi).

Evaluating D(f(x)) at αi we obtain (αi − α1) · · · (αi − α_{i−1})(αi − α_{i+1}) · · · (αi − αn) ≠ 0. Therefore (x − αi) is not a factor of D(f(x)), and f(x), D(f(x)) are relatively prime.

10. If i ≠ j then (x − αi) is a factor of Fj(x) and hence of fj(x), and therefore fj(αi) = 0. On the other hand, fj(αj) = Fj(αj)/Fj(αj) = 1. Now suppose

Σ_{j=1}^n cj fj(x) = 0.

Evaluating at αi we obtain

Σ_{j=1}^n cj fj(αi) = ci fi(αi) = ci.

Thus, ci = 0 for all i and B = (f1, . . . , fn) is linearly independent in F_{n−1}[x]. Since dim(F_{n−1}[x]) = n, B is a basis.

11. Set f(x) = Σ_{j=1}^n βj fj(x). Then f(x) satisfies the conditions. Suppose g(x) does also. Then αj is a root of f(x) − g(x) for all j. Since f(x) − g(x) ∈ F_{n−1}[x], if f(x) − g(x) ≠ 0 then it has at most n − 1 roots. Therefore f(x) − g(x) = 0 and g(x) = f(x).

12. If g(x) ∈ F_{n−1}[x] and g(αj) = βj then g(x) = Σ_{j=1}^n βj fj(x) by Exercise 11. Since the coefficient of fj(x) is g(αj), we conclude that

[g(x)]_B = (g(α1), . . . , g(αn))^T.

13. The coordinate vector of the constant function 1 is (1, . . . , 1)^T. The coordinate vector of x^k is (α1^k, . . . , αn^k)^T. It follows that the change of basis matrix from S to B is

M_{I_{F_{n−1}[x]}}(S, B) =
[1 α1 α1^2 . . . α1^{n−1}]
[1 α2 α2^2 . . . α2^{n−1}]
[. . .]
[1 αn αn^2 . . . αn^{n−1}]

When α1, . . . , αn are distinct this matrix is invertible.
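Exercises 12 and 13 can be checked numerically: with distinct nodes (the nodes below are hypothetical), the Vandermonde matrix is invertible, and applying it to the standard-basis coefficients of g recovers the values g(α_i) — exactly the coordinates of g in the Lagrange basis.

```python
# Vandermonde matrix at distinct nodes, and Lagrange-basis coordinates.
import numpy as np

alphas = np.array([0.0, 1.0, 2.0, 3.0])     # hypothetical distinct nodes
V = np.vander(alphas, increasing=True)       # rows (1, a, a^2, a^3)
print(round(np.linalg.det(V)))               # 12, non-zero since nodes distinct

coeffs = np.array([1.0, 0.0, 1.0, 0.0])      # g(x) = 1 + x^2 in the standard basis
values = V @ coeffs
print(values)                                # [1. 2. 5. 10.] = g(alphas)
```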

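Exercise 9's criterion is also easy to test with sympy (the polynomials are hypothetical examples): f has a repeated root exactly when f and D(f) share a non-trivial common factor.

```python
# gcd(f, f') detects repeated roots.
import sympy as sp

x = sp.symbols('x')
f_repeated = (x - 2)**2 * (x + 1)
f_distinct = (x - 1) * (x - 2) * (x - 3)

print(sp.gcd(f_repeated, sp.diff(f_repeated, x)))  # x - 2
print(sp.gcd(f_distinct, sp.diff(f_distinct, x)))  # 1
```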

Chapter 4

Theory of a Single Linear Operator

4.1. Invariant Subspaces of an Operator

1. T maps (x1, x2, x3)^T to (x1, x2, x1 + x2)^T.

2. Let (u1, . . . , uk) be a basis for U. Extend this to a basis (u1, . . . , un) of V. Let T be the operator such that T(ui) = un for all i.

3. x^2 − 3x + 2.

4. x^3 − 2x^2 − x + 2.

5. S(T(v)) = (ST)(v) = (TS)(v) = T(S(v)) = T(λv) = λT(v).

6. Let T0 : U → U be defined by T0(u) = T(u). Since T is invertible on V, in particular T is injective. Then T0 is injective. Since V is finite dimensional, U is finite dimensional. Consequently, by the half is good enough theorem, T0 is surjective. Now let u ∈ U be arbitrary. By what we have shown there is a u′ ∈ U such that T(u′) = u. Then T^{−1}(u) = T^{−1}(T(u′)) = (T^{−1}T)(u′) = I_V(u′) = u′ ∈ U. Thus, U is T^{−1}-invariant.

7. Suppose v ∈ E1 ∩ E−1. Then T(v) = v and T(v) = −v. Then v = −v, or 2v = 0. Since 2 ≠ 0, v = 0. So E1 ∩ E−1 = {0}. So we need to show that V = E1 + E−1.

Let v ∈ V be arbitrary. Set x = ½(v + T(v)) and y = ½(v − T(v)). We claim x ∈ E1, y ∈ E−1.

T(x) = T(½(v + T(v))) = ½T(v + T(v)) = ½(T(v) + T^2(v)) = ½(T(v) + v) = x.

T(y) = T(½(v − T(v))) = ½T(v − T(v)) = ½(T(v) − T^2(v)) = ½(T(v) − v) = −y.

Now v = x + y ∈ E1 + E−1.

8. In addition to R^3 and {0}, the invariant subspaces are Span((1, 1, 1)^T) and {(x1, x2, x3)^T | x1 + x2 + x3 = 0}.

9. The T-invariant subspaces are {0}, Span((1, 0, 0)^T), Span((1, 0, 0)^T, (0, 1, 0)^T), R^3.

10. This follows since for any polynomials f, g we have f(T) + g(T) = (f + g)(T) and f(T)g(T) = h(T), where h(x) = f(x)g(x).
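Exercise 7's decomposition can be checked for a concrete involution (the reflection below is a hypothetical example with T^2 = I): x = (v + Tv)/2 and y = (v − Tv)/2 satisfy Tx = x, Ty = −y, and v = x + y.

```python
# Eigen-decomposition of a vector under an involution T.
import numpy as np

T = np.array([[0, 1, 0],
              [1, 0, 0],
              [0, 0, -1]], dtype=float)   # a reflection: T @ T = I
assert np.allclose(T @ T, np.eye(3))

v = np.array([1.0, 2.0, 3.0])
x = (v + T @ v) / 2
y = (v - T @ v) / 2
print(np.allclose(T @ x, x), np.allclose(T @ y, -y), np.allclose(x + y, v))
# True True True
```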


11. We need to show: i) if f(x), g(x) ∈ Ann(T, v) then f(x) + g(x) ∈ Ann(T, v); and ii) if f(x) ∈ Ann(T, v) and h(x) ∈ F[x] then h(x)f(x) ∈ Ann(T, v).

i) [(f + g)(T)](v) = [f(T) + g(T)](v) = f(T)(v) + g(T)(v) = 0 + 0 = 0.

ii) [h(T)f(T)](v) = h(T)(f(T)(v)) = h(T)(0) = 0.

12. Let u ∈ U, w ∈ W. Then T(u + w) = T(u) + T(w). By hypothesis, T(u) ∈ U and T(w) ∈ W, and therefore T(u) + T(w) ∈ U + W. Thus, U + W is T-invariant.

Suppose x ∈ U ∩ W. Since x ∈ U and U is T-invariant, T(x) ∈ U. Similarly, T(x) ∈ W and hence T(x) ∈ U ∩ W.

13. We need to show: i) if f(x), g(x) ∈ Ann(T) then f(x) + g(x) ∈ Ann(T); and ii) if f(x) ∈ Ann(T) and h(x) ∈ F[x] then h(x)f(x) ∈ Ann(T).

i) Let v be an arbitrary vector. We then have

[(f + g)(T)](v) = [f(T) + g(T)](v) = f(T)(v) + g(T)(v) = 0 + 0 = 0.

Since v is an arbitrary vector, f + g ∈ Ann(T).

ii) Let v be an arbitrary vector. Then

[h(T)f(T)](v) = h(T)(f(T)(v)) = h(T)(0) = 0.

Thus, h(x)f(x) ∈ Ann(T).

14. Assume T has an eigenvector v with eigenvalue λ. Then µ_{T,v}(x) = x − λ. Since [µ_T(T)](v) = 0, it follows by Remark (4.4) that (x − λ) divides µ_T(x).

15. Assume v is an eigenvector with eigenvalue λ. Then T(v) = λv. We then have [T(v)]_B = [λv]_B = λ[v]_B. On the other hand, [T(v)]_B = M_T(B, B)[v]_B. Thus, we conclude that [v]_B is an eigenvector of the matrix M_T(B, B) with eigenvalue λ.

Conversely, assume [v]_B is an eigenvector of the matrix M_T(B, B) with eigenvalue λ. Then

M_T(B, B)[v]_B = λ[v]_B = [λv]_B.

On the other hand, M_T(B, B)[v]_B = [T(v)]_B. Thus, we have [T(v)]_B = [λv]_B. It then follows that T(v) = λv.

16. This is a consequence of the fact that f(A) = M_{f(T)}(B, B).

17. Let f(x) ∈ F[x] and let S be an operator on a finite dimensional vector space, with S′ the operator satisfying M_{S′}(B, B) = M_S(B, B)^{tr}. Suppose f(S) = 0_{V→V}. Then f(M_S(B, B)) = 0_{nn} by Exercise 16. Then f(M_S(B, B)^{tr}) = 0_{nn}, and consequently f(M_{S′}(B, B)) = 0_{nn}. Then by Exercise 16, f(S′) = 0_{V→V}. This implies that µ_{S′}(x) divides µ_S(x). However, the argument can be reversed, so that µ_S(x) divides µ_{S′}(x). Since both are monic they are equal.

18. Since T(v) = λv, v = (1/λ)T(v). Then

T^{−1}(v) = T^{−1}((1/λ)T(v)) = (1/λ)T^{−1}(T(v)) = (1/λ)(T^{−1}T)(v) = (1/λ)I_V(v) = (1/λ)v.

19. This follows since v is an eigenvector of T^k with eigenvalue λ^k. Also, if v is an eigenvector for an operator S with eigenvalue λ then v is an eigenvector for cS with eigenvalue cλ. Finally, if v is an eigenvector for operators S1, S2 with respective eigenvalues λ1, λ2 then v is an eigenvector of S1 + S2 with eigenvalue λ1 + λ2. Now if f(x) = a_m x^m + · · · + a1 x + a0, each operator a_k T^k has v as an eigenvector with eigenvalue a_k λ^k, and these add.

20. Apply S^{−1}TS to S^{−1}(v):

(S^{−1}TS)(S^{−1}(v)) = (S^{−1}T)(S(S^{−1}(v))) = (S^{−1}T)(I_V(v)) = (S^{−1}T)(v) = S^{−1}(T(v)) = S^{−1}(λv) = λS^{−1}(v).

So we applied S^{−1}TS to the vector S^{−1}(v) and the result is λS^{−1}(v). Thus, S^{−1}(v) is an eigenvector of S^{−1}TS with eigenvalue λ.
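Exercise 21's conclusion — ST and TS have the same non-zero eigenvalues — can be checked numerically for hypothetical rectangular S and T:

```python
# ST (4x4) and TS (2x2) share their non-zero spectrum.
import numpy as np

rng = np.random.default_rng(0)
S = rng.standard_normal((4, 2))
T = rng.standard_normal((2, 4))

ev_st = np.linalg.eigvals(S @ T)          # 4 eigenvalues, two of them ~0
ev_ts = np.linalg.eigvals(T @ S)          # 2 eigenvalues
nz = ev_st[np.abs(ev_st) > 1e-8]
match = np.allclose(np.sort_complex(nz), np.sort_complex(ev_ts))
print(match)  # True
```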

21. Note for any polynomial f(x) that f(ST)S = Sf(TS) and f(TS)T = Tf(ST). Let f(x) = µ_{ST}(x) and g(x) = µ_{TS}(x). Then f(ST) = 0_{V→V}, so f(ST)S = 0_{V→V}. By the above, Sf(TS) = 0_{V→V}, whence (TS)f(TS) = 0_{V→V}. It then follows that g(x) divides xf(x). In exactly the same way f(x) divides xg(x). In particular, f(x) and g(x) have the same non-zero roots.

4.2. Cyclic Operators

1. a) T(z) = (1, −1, 0)^T, T^2(z) = (3, −4, −2)^T, T^3(z) = (3, −3, −2)^T.

The matrix
[1  1  3]
[0 −1 −4]
[0  0 −2]
is invertible, and therefore (z, T(z), T^2(z)) is a basis for R^3.

µ_{T,z}(x) = x^3 − x^2 + x − 1.

b) µ_{T,u}(x) = x − 1.

2. T(z) = (0, 0, 0, 1)^T, T^2(z) = (0, 0, 1, 0)^T, T^3(z) = (0, 1, 0, 0)^T. So ⟨T, z⟩ contains the standard basis of R^4, and therefore ⟨T, z⟩ = R^4. Now T^4(z) = (−4, 0, −5, 0)^T and µ_{T,z}(x) = x^4 + 5x^2 + 4 = (x^2 + 1)(x^2 + 4).

3. Let z ∈ V be any non-zero vector. Then ⟨T, z⟩ is a T-invariant subspace which is not {0}. Therefore ⟨T, z⟩ = V and T is a cyclic operator.

4. There are lots of such operators. Here is a simple one:

T maps (x1, x2, x3, x4)^T to (x1, 2x2, 3x3, 4x4)^T.

If z = (1, 1, 1, 1)^T then ⟨T, z⟩ = R^4.

5. Since T is cyclic, µ_T(x) has degree 3. Suppose µ_T(x) has one real root, so that µ_T(x) = (x − λ)g(x) where g(x) is a real irreducible quadratic. Let x be a vector such that R^3 = ⟨T, x⟩. Then the T-invariant subspaces are R^3, {0}, ⟨T, (T − λI)x⟩ and ⟨T, g(T)(x)⟩.

Otherwise we can suppose µ_T(x) has three real roots. If there is only one distinct root, say λ, so that µ_T(x) = (x − λ)^3, in this case there are four T-invariant subspaces. Assume µ_T(x) = (x − α)^2(x − β), α ≠ β; then there are six T-invariant subspaces. Assume µ_T(x) = (x − α)(x − β)(x − γ), distinct. In this case there are eight T-invariant subspaces.

6. Let T have the following matrix with respect to the standard basis:

[1 1 0]
[0 1 1]
[0 0 1]

Here is another:

[1  0  0]
[0  0 −1]
[0  1  0]

7. Let T have the following matrix with respect to the standard basis:

[1 1 1]
[0 1 1]
[0 0 1]

8. Because T is cyclic, µ_T(x) has degree 4. The number of T-invariant subspaces can be computed from a factorization of µ_T(x). The possibilities are:

p(x)^2, where p(x) is a real irreducible quadratic. There are three T-invariant subspaces in this case.

p(x)(x − α)^2, where p(x) is a real irreducible quadratic and α ∈ R. There are six T-invariant subspaces in this case.

p(x)(x − α)(x − β), where p(x) is a real irreducible quadratic and α ≠ β are real numbers. There are eight T-invariant subspaces in this case.

(x − α)^4, α ∈ R. There are five T-invariant subspaces in this case.

(x − α)^3(x − β), where α ≠ β are real numbers. There are eight T-invariant subspaces in this case.

(x − α)^2(x − β)^2, where α ≠ β are real numbers. There are nine T-invariant subspaces in this case.

(x − α)^2(x − β)(x − γ), where α, β, γ are distinct real numbers. There are twelve T-invariant subspaces in this case.

(x − α)(x − β)(x − γ)(x − δ), where α, β, γ, δ are distinct real numbers. There are sixteen T-invariant subspaces in this case.

9. Let T be the operator on R^4 which has the following matrix with respect to the standard basis:

[ 0 1  0 0]
[−1 0  1 0]
[ 0 0  0 1]
[ 0 0 −1 0]

10. Let T be the operator on R^4 which has the following matrix with respect to the standard basis:

[2 0 0 0]
[0 2 0 0]
[0 0 2 0]
[0 0 0 3]

11. Let T be the operator on R^4 which has the following matrix with respect to the standard basis:

[1 0 0 0]
[0 2 0 0]
[0 0 3 0]
[0 0 0 4]

12. Set v0 = v and vi = T^i(v) for 1 ≤ i < n. By our hypothesis, (v0, . . . , v_{n−1}) is a basis for V. Assume S(v) = a0 v0 + · · · + a_{n−1} v_{n−1}. Set f(x) = a0 + a1 x + · · · + a_{n−1}x^{n−1} ∈ F_{n−1}[x]. We claim that S = f(T). Since S and f(T) are linear operators it suffices to show that S(vi) = f(T)(vi) for i = 0, . . . , n − 1.

We have constructed f(x) such that f(T)(v) = f(T)(v0) = S(v). Consider f(T)(vi) for 1 ≤ i ≤ n − 1:

f(T)(vi) = f(T)(T^i(v)) = T^i f(T)(v) = T^i S(v) = S T^i(v) = S(vi).
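Exercise 8's counts can be recomputed directly: for a cyclic operator the T-invariant subspaces correspond to the monic divisors of µ_T(x), and a factorization p1^{e1} · · · pt^{et} has (e1 + 1) · · · (et + 1) monic divisors.

```python
# Divisor counts for each exponent pattern of a quartic minimal polynomial.
from math import prod

cases = {
    "p^2":                  [2],
    "p*(x-a)^2":            [1, 2],
    "p*(x-a)*(x-b)":        [1, 1, 1],
    "(x-a)^4":              [4],
    "(x-a)^3*(x-b)":        [3, 1],
    "(x-a)^2*(x-b)^2":      [2, 2],
    "(x-a)^2*(x-b)*(x-c)":  [2, 1, 1],
    "(x-a)(x-b)(x-c)(x-d)": [1, 1, 1, 1],
}
counts = {k: prod(e + 1 for e in v) for k, v in cases.items()}
print(counts)
# 3, 6, 8, 5, 8, 9, 12, 16 respectively
```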


4.3. Maximal Vectors

1. a) µ_{T,e1}(x) = x^2 + 2x + 2, µ_{T,e2}(x) = x^3 − 2x − 4, µ_{T,e3}(x) = x^3 − 2x − 4. b) µ_T(x) = x^3 − 2x − 4. c) e2, e3 are maximal vectors.

2. µ_T(x) = (x − 2)^2(x − 1). e2 and e3 are maximal vectors for T.

3. µ_T(x) = x^4 − x^3 − x^2 − x − 2 = (x − 2)(x^3 + x^2 + x + 1). e1 is a maximal vector.

4. Clearly (v1) is linearly independent since an eigenvector is non-zero. Assume we have shown that (v1, . . . , vj) is linearly independent with j < k. We show that (v1, . . . , v_{j+1}) is linearly independent. Suppose that

c1 v1 + · · · + c_{j+1} v_{j+1} = 0    (4.1)

Apply T to get

c1 T(v1) + · · · + c_{j+1} T(v_{j+1}) = 0.

Using the fact that T(vi) = αi vi we get

c1 α1 v1 + · · · + cj αj vj + c_{j+1} α_{j+1} v_{j+1} = 0    (4.2)

Now multiply (4.1) by α_{j+1} and subtract it from (4.2) to get

c1(α1 − α_{j+1})v1 + · · · + cj(αj − α_{j+1})vj = 0    (4.3)

Thus, we obtain a dependence relation on (v1, . . . , vj). However, (v1, . . . , vj) is linearly independent and therefore

c1(α1 − α_{j+1}) = · · · = cj(αj − α_{j+1}) = 0.

Since the αi are distinct, for all i < j + 1, αi − α_{j+1} ≠ 0. Therefore

c1 = c2 = · · · = cj = 0.

It then follows that the dependence relation in (4.1) is

c_{j+1} v_{j+1} = 0.

Since v_{j+1} is non-zero, c_{j+1} = 0.

5. Since x^2 + 1, x + 1, x − 2 are relatively prime, it follows that (x^2 + 1)(x + 1)(x − 2) divides µ_T(x). Since T is an operator on R^4 the degree of µ_T(x) is at most four. We can then conclude that µ_T(x) = (x^2 + 1)(x + 1)(x − 2). Set v = v1 + v2 + v3. Since µ_{T,v1}(x), µ_{T,v2}(x) and µ_{T,v3}(x) are relatively prime, µ_{T,v}(x) is the product µ_{T,v1}(x)µ_{T,v2}(x)µ_{T,v3}(x) = (x^2 + 1)(x + 1)(x − 2) = µ_T(x). Thus, v is a maximal vector.

6. ⟨T, v1⟩ = Span(v1, v2), and every non-zero vector v in ⟨T, v1⟩ satisfies µ_{T,v}(x) = x^2 + 1 since x^2 + 1 is irreducible. v3 is an eigenvector with eigenvalue −1 and v4 is an eigenvector with eigenvalue 1. If w = w1 + w2 + w3 with w1, w2, w3 nonzero and w1 ∈ Span(v1, v2), w2 ∈ Span(v3) and w3 ∈ Span(v4), then µ_{T,w}(x) is divisible by (x^2 + 1)(x + 1)(x − 1) and w is therefore a maximal vector. On the other hand, if w1, w2 are non-zero but w3 is zero then µ_{T,w}(x) = (x^2 + 1)(x + 1). If w1, w3 are non-zero but w2 = 0 then µ_{T,w}(x) = (x^2 + 1)(x − 1). If w2, w3 ≠ 0, w1 = 0 then µ_{T,w}(x) = (x + 1)(x − 1).

7. If v ≠ 0 then µ_{T,v}(x) is a non-constant polynomial and divides µ_T(x), which is irreducible. Then µ_{T,v}(x) = µ_T(x) and v is a maximal vector.

8. Since µ_T(x) has degree 5 and dim(F_5^5) = 5, the operator T is cyclic. Note that x^5 − x = x(x − 1)(x − 2)(x − 3)(x − 4). Set fi(x) = (x^5 − x)/(x − i), where i = 0, 1, 2, 3, 4, and let v be a maximal vector. Set vi = fi(T)v. Then vi is an eigenvector with eigenvalue i, i = 0, 1, 2, 3, 4. Since these eigenvalues are distinct, (v1, . . . , v5) is linearly independent by Exercise 4. By the half is good enough theorem, (v1, . . . , v5) is a basis. A vector c1 v1 + · · · + c5 v5 is a maximal vector if and only if ci ≠ 0 for all i. Thus, there are 4^5 maximal vectors.

4.4. Indecomposable Linear Operators

1. This operator is decomposable since it is not cyclic. It has minimal polynomial x^2 + 2x + 1 = (x + 1)^2.

2. This operator is indecomposable. µ_{T,e3}(x) = x^3 + 3x^2 + 3x + 1 = (x + 1)^3. So the operator is cyclic and the minimum polynomial is a power of the irreducible polynomial x + 1.

3. This operator is indecomposable. µ_{T,e1}(x) = x^3 − 6x^2 + 12x − 8 = (x − 2)^3.

4. Set f(x) = µ_S(x). If T is in P(S) then there is a polynomial g(x) with degree of g(x) less than degree of f(x) such that T = g(S). Since g(x) is not the zero polynomial, f(x) is irreducible and deg(g) < deg(f), it must be the case that f(x), g(x) are relatively prime. Consequently, there are polynomials a(x), b(x) such that a(x)f(x) + b(x)g(x) = 1. Substituting S for x we get a(S)f(S) + b(S)g(S) = I_V. Since f(S) = 0_{V→V} and g(S) = T we have b(S)T = I_V. Thus, b(S) ∈ P(S) is an inverse to T.

5. Set fi(x) = µ_{T,vi}(x). Then fi(x) divides p(x)^m, and so there is a natural number mi ≤ m such that

7. Since the degree of f(x) = µ_T(x) is odd, it has a real root a. Thus, x − a divides f(x). However, since f(x) has a single distinct irreducible factor it follows that f(x) = (x − a)^{2n+1}.

8. Let f(x) = µ_T(x). Then either f(x) = (x − a)^{2n} for some n, or f(x) = p(x)^n for some real irreducible quadratic polynomial p(x). By the theory of cyclic operators, in the first case the number of T-invariant subspaces is 2n + 1, and in the latter case the number is n + 1.

9. Let f(x) = µ_T(x). Then either f(x) = (x − a)^4 or f(x) = g(x)^2, where g(x) is an irreducible polynomial of degree 2. In either case, let v be a maximal vector. In the first case, any vector w such that ⟨T, w⟩ ≠ V lies in ⟨T, (T − aI)(v)⟩, which has dimension three and contains p^3 vectors. Any other vector is maximal, and hence in this case there are p^4 − p^3 maximal vectors.

Suppose f(x) = g(x)^2. Now if ⟨T, w⟩ ≠ V then w belongs to ⟨T, g(T)(v)⟩, which has dimension 2 and p^2 vectors. In this case there are p^4 − p^2 maximal vectors.

10. Suppose T is indecomposable. Let µ_T(x) = f(x) = p(x)^m, where p(x) is irreducible. Let v be a maximal vector. The only proper T-invariant subspaces of V are ⟨T, p(T)^k(v)⟩ for 1 ≤ k ≤ m. Moreover, ⟨T, p(T)^{j+1}(v)⟩ is a subspace of ⟨T, p(T)^j(v)⟩, and hence ⟨T, p(T)(v)⟩ is the unique maximal proper T-invariant subspace.

On the other hand, suppose T is not indecomposable. Then there are T-invariant subspaces U and W such that V = U ⊕ W.
Let U be a proper maximal T −invariant fi (x) = p(x)mi . Let j be chosen such that mj ≥ mi for subspace of U (possibly {0}) and similarly choose W in i = 1, 2, . . . , n. Then the gcd of fi (x) is fj (x) = p(x)mj . W. Then U ⊕ W and U ⊕ W are two distinct proper However, the gcd(f1 , . . . , fn ) is µT (x). Thus, fj (x) = maximal T −invariant subspaces. µT (x) and vj is a maximal vector. 6. Since T is indecomposable, µT (x) = p(x)m for some irreducible polynomial p(x). By Exercise 5 there is a j such that vj is maximal. Since T is indecomposable, T is cyclic. Therefore V = T, vj . 7. Let f (x) = µT (x). Since T is indecomposable, deg(f ) = dim(V ) = 2n + 1. Since the degree of f (x) is
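Two of the computations above can be machine-checked. The following sympy sketch (the variable names are ours, not the text's) verifies the quartic factorization used for Exercise 3 of the previous section and the pairwise coprimality underlying Exercise 5 there.

```python
# Hedged verification sketch using sympy; illustrative only.
from sympy import symbols, expand, gcd

x = symbols('x')

# Exercise 3: (x - 2)(x^3 + x^2 + x + 1) should expand to x^4 - x^3 - x^2 - x - 2.
lhs = expand((x - 2)*(x**3 + x**2 + x + 1))
rhs = x**4 - x**3 - x**2 - x - 2
print(lhs == rhs)  # True

# Exercise 5: x^2 + 1, x + 1, x - 2 are pairwise relatively prime, so the
# lcm of the local minimal polynomials is their product.
factors = [x**2 + 1, x + 1, x - 2]
pairwise_coprime = all(gcd(f, g) == 1
                       for i, f in enumerate(factors)
                       for g in factors[i+1:])
print(pairwise_coprime)  # True
```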


4.5. Invariant Factors and Elementary Divisors of a Linear Operator


1. Set di = dim(Ui). Then (d1, ..., d5) = (12, 22, 28, 34, 38).

2. The invariant factors di(x), ordered so that d1 | d2 | d3 | d4, are

d1(x) = (x² − x + 1)²(x² + 1)
d2(x) = (x² − x + 1)²(x² + 1)²(x + 2)
d3(x) = (x² − x + 1)³(x² + 1)²(x + 2)²
d4(x) = (x² − x + 1)⁴(x² + 1)³(x + 2)²

and dim(V) = 44.

3. The elementary divisors and the invariant factors are x² + 1 and x² + 1.

4. There is a single elementary divisor and invariant factor, which is (x² + 1)².

5. The elementary divisors are x² + 1, x + 1 and x − 1. There is a single invariant factor, x⁴ − 1.

6. The elementary divisors are x, x, x − 1, x − 1 and the invariant factors are x² − x, x² − x.

7. Assume the minimum polynomial of T is p1(x)···pt(x) where the pi(x) are irreducible and distinct. Let Vi = {v ∈ V | pi(T)(v) = 0}. Then V = V1 ⊕ ··· ⊕ Vt. If each Vi is completely reducible then so is V. Consequently, we may reduce to the case that the minimum polynomial is irreducible. In this case, there are vectors v1, ..., vs such that µT,vi(x) = p(x) is irreducible and V = ⟨T, v1⟩ ⊕ ··· ⊕ ⟨T, vs⟩. Each space ⟨T, vj⟩ is T-irreducible. It follows from this that V is completely reducible.

8. Assume V is completely reducible and µT(x) = (x − α1)···(x − αt) where the αi are distinct. Set Vi = {v ∈ V : (T − αiIV)(v) = 0}. Then V = V1 ⊕ ··· ⊕ Vt. Let Bi be a basis for Vi and set B = B1 ∪ ··· ∪ Bt. Then MT(B, B) is a diagonal matrix.

Conversely, assume that T is diagonalizable. Then there exists a basis B consisting of eigenvectors. Let α1, ..., αt be the distinct eigenvalues and set Vi = {v ∈ V | T(v) = αiv}. Then V1 + ··· + Vt is a direct sum. Since B ⊂ V1 + ··· + Vt, we get V = V1 + ··· + Vt. Consequently, V = V1 ⊕ ··· ⊕ Vt. Thus, V is completely reducible. Now µT(x) = (x − α1)···(x − αt).

9. Let dim(V) = n and set Vi = {v ∈ V | pi(T)^n(v) = 0}, so that Vi is the pi-Sylow subspace. If there are infinitely many T-invariant subspaces in some Vi then there are clearly infinitely many T-invariant subspaces. Suppose, on the other hand, that each Vi has finitely many T-invariant subspaces. Suppose U is a T-invariant subspace. Set Ui = U ∩ Vi. Then U = U1 ⊕ ··· ⊕ Ut. It follows from this that there are only finitely many T-invariant subspaces.

10. Let dim(V) = n and let p1(x), ..., pt(x) be the distinct irreducible factors of µT(x). Set Vi = {v ∈ V | pi(T)^n(v) = 0}, so that Vi is the pi(x)-Sylow subspace of V, and V = V1 ⊕ ··· ⊕ Vt. Then T is cyclic if and only if T restricted to each Vi is cyclic. Also, V has finitely many T-invariant subspaces if and only if there are finitely many T-invariant subspaces in Vi for each i, by Exercise 9. Thus, we may assume that µT(x) = p(x)^m for some irreducible polynomial p(x). If T is cyclic in this case then the number of T-invariant subspaces is m + 1. On the other hand, suppose T is not cyclic. Then there are at least two elementary divisors (invariant factors). Thus, there are vectors v1, v2 with µT,v1(x) = p(x)^{m1} and µT,v2(x) = p(x)^{m2} such that


⟨T, v1⟩ ∩ ⟨T, v2⟩ = {0}. Set w1 = p(T)^{m1−1}(v1) and w2 = p(T)^{m2−1}(v2). Then µT,w1(x) = µT,w2(x) = p(x) and ⟨T, w1⟩ ∩ ⟨T, w2⟩ = {0}. Now the spaces ⟨T, w1 + cw2⟩, c ∈ F, are pairwise distinct and T-invariant. Since F is infinite there are infinitely many T-invariant subspaces.

11. Let V(p) = {v ∈ V | p(T)^m(v) = 0} and V(q) = {v ∈ V | q(T)^n(v) = 0}. Then V = V(p) ⊕ V(q). Since a(x)p(x)^m + b(x)q(x)^n = 1 it follows that a(T)p(T)^m + b(T)q(T)^n = IV. However, a(T)p(T)^m restricted to V(p) is the zero map and consequently b(T)q(T)^n restricted to V(p) is the identity. Then b(T)q(T)^n p(T) restricted to V(p) is identical to p(T) restricted to V(p). In a similar manner, b(T)q(T)^n restricted to V(q) is the zero map and a(T)p(T)^m restricted to V(q) is the identity map. Consequently a(T)p(T)^m q(T) restricted to V(q) is the same as q(T) restricted to V(q). Thus, f(T) restricted to V(p) is equal to the restriction of p(T) to V(p), and f(T) restricted to V(q) is equal to the restriction of q(T) to V(q).

Suppose now that x = vp + vq with vp ∈ V(p) and vq ∈ V(q). Then f(T)(x) = f(T)(vp + vq) = f(T)(vp) + f(T)(vq) = p(T)(vp) + q(T)(vq). If l = max(m, n) then f(T)^l(x) = p(T)^l(vp) + q(T)^l(vq) = 0 + 0 = 0. Thus, f(T)^l = 0V→V and f(T) is nilpotent.

12. Let p(x)^{e1}, ..., p(x)^{et} be the elementary divisors of T, where e1 ≤ e2 ≤ ··· ≤ et ≤ m. Then there are vectors v1, ..., vt such that µT,vi(x) = p(x)^{ei} and V = ⟨T, v1⟩ ⊕ ··· ⊕ ⟨T, vt⟩. Now U1 = ⟨T, p(T)^{e1−1}(v1)⟩ ⊕ ··· ⊕ ⟨T, p(T)^{et−1}(vt)⟩ and has dimension td, which proves a).

Now assume e_{k−1} < j − 1 ≤ e_k and e_{l−1} < j ≤ e_l. Clearly k ≤ l. Now

U_{j−1} = ⟨T, v1⟩ ⊕ ··· ⊕ ⟨T, v_{k−1}⟩ ⊕ ⟨T, p(T)^{e_k−j+1}(v_k)⟩ ⊕ ··· ⊕ ⟨T, p(T)^{e_t−j+1}(v_t)⟩   (4.4)

U_j = ⟨T, v1⟩ ⊕ ··· ⊕ ⟨T, v_{l−1}⟩ ⊕ ⟨T, p(T)^{e_l−j}(v_l)⟩ ⊕ ··· ⊕ ⟨T, p(T)^{e_t−j}(v_t)⟩   (4.5)

It follows from (4.4) that m_{j−1} = dim(U_{j−1}) = e1d + ··· + e_{k−1}d + [t − k + 1](j − 1)d. It follows from (4.5) that m_j = dim(U_j) = e1d + ··· + e_{l−1}d + [t − l + 1]jd.

Suppose k = l. Then m_j − m_{j−1} = dim(U_j) − dim(U_{j−1}) = [t − l + 1]d and (m_j − m_{j−1})/d = t − l + 1, which is the number of e_i that are greater than or equal to j.

Suppose k < l. Then e_k = ··· = e_{l−1} = j − 1. Making use of this we get that m_j = dim(U_j) = e1d + ··· + e_{k−1}d + (j − 1)d[l − k] + [t − l + 1]jd. Then

m_j − m_{j−1} = (j − 1)d[l − k] + [t − l + 1]jd − [t − k + 1](j − 1)d = [t − l + 1]d

and we again get (m_j − m_{j−1})/d = t − l + 1, which is equal to the number of e_i that are greater than or equal to j.
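The count in Exercise 12 — (m_j − m_{j−1})/d equals the number of elementary divisors p(x)^{e_i} with e_i ≥ j — can be sanity-checked numerically. The sketch below is our illustrative example with p(x) = x (so d = 1): a nilpotent operator with elementary divisors x, x², x², where m_j = dim Ker p(T)^j.

```python
import numpy as np

def jordan_nilpotent(size):
    """Nilpotent Jordan block: ones on the subdiagonal (the manual's
    lower-triangular convention)."""
    return np.eye(size, k=-1)

# elementary divisors x^1, x^2, x^2, i.e. block sizes e = (1, 2, 2)
sizes = [1, 2, 2]
N = np.zeros((5, 5))
pos = 0
for s in sizes:
    N[pos:pos+s, pos:pos+s] = jordan_nilpotent(s)
    pos += s

def null_dim(A):
    return A.shape[0] - np.linalg.matrix_rank(A)

m = [null_dim(np.linalg.matrix_power(N, j)) for j in range(3)]  # m_0, m_1, m_2
diffs = [m[j] - m[j-1] for j in (1, 2)]
counts = [sum(1 for e in sizes if e >= j) for j in (1, 2)]
print(diffs, counts)  # [3, 2] [3, 2]
```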



13. Since each invariant factor is the product of distinct elementary divisors, it follows that the characteristic polynomial of T is equal to the product of the elementary divisors of T. It therefore suffices to prove this in the case that µT(x) = p(x)^m for some irreducible polynomial p(x) of degree d. In this case dm = deg(p(x)^m) = deg(χT(x)) = dim(V), so that m = dim(V)/d.
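The criterion behind Exercises 7 and 8 above — complete reducibility exactly when µT(x) is a product of distinct linear factors — can be illustrated with a small sympy sketch. The matrices here are our own examples, not ones from the text.

```python
# Hedged sketch: distinct linear factors in the minimal polynomial
# force diagonalizability; a repeated factor does not.
from sympy import Matrix, eye

# Diagonalizable with eigenvalues 1, 1, 2: mu_T(x) = (x - 1)(x - 2).
M = Matrix([[1, 0, 0],
            [0, 1, 0],
            [0, 0, 2]])
mu = (M - eye(3)) * (M - 2*eye(3))       # (T - I)(T - 2I)
print(mu == Matrix.zeros(3, 3))          # True: mu_T splits with distinct roots

# A Jordan block with repeated factor (x - 1)^2 is NOT killed by x - 1 alone:
J = Matrix([[1, 0], [1, 1]])             # lower-triangular Jordan convention
print((J - eye(2)) == Matrix.zeros(2, 2))  # False: not completely reducible
```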

4.6. Canonical Forms

1.

2.

3.

4.

5.

6.

0 1 0 0 0 0 0 0 0 0 0 1 0 1 0 3 1 0 0 2 1 0 0 1 0 0 0


0 0 1 0 0 0 0 0 0 0

0 0 0 1 0 0 0 0 0 0

−1 0 0 −2 0 0 −2 0 0 −2 0 0 0 0 −1 0 1 −1 0 0 0 0 0 0 0 0 0 0 0 0

0 0 0 0 0 0 0 1 0 0

0 0 0 0 0 0 0 0 1 0

−4 4

0 −1 0 −2 1 −2 0 0 3 0 0 −2 0 1 0 2 0 0

0 0 2 1

0 1 0 0

0 0 1 1

0 0 0 0 0 0 0 0 0 0 0 0 0 −4 0 0 0 −4 1 0

1 1 0 0

0 1 0 0

1 0 0 7. 0 0 0

0 0 1 1

0 1 0 0 0 0 0 0 8. 044 , 0 0

0 1 1 0 , 0 0 1 0

0 0 1 1 0 0

0 0 0 1 1 0

0 0 0 0 0 0 0 0

0 0 0 1 0 0 1 0

0 0 0 1 0 0 1 1 0 0 1 1 0 0 1 0 1 1 0 0 0 0 0 0 , 0 0 0 0 1 0 0 0 0 0 1 1 0 0 0 0 0 1 0 0 , 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 , 0 0 0 1 1 0 0 0

0 0 1 1 0 0

0 0 0 1 1 0

0 0 0 0 0 0 0 1

0 0 0 0 1 1

0 0 0 0 0 1

0 0 0 0

9. The characteristic polynomial of a nilpotent operator on an n-dimensional vector space is xⁿ. It is completely reducible if and only if the minimum polynomial has distinct roots, and consequently we must have µT(x) = x. This implies that T = 0V→V.

10. Every vector v of V satisfies p(T)^e(v) = 0 and therefore p(T)^e = 0V→V.

0 0 0 −2 0 0 0

2 0 1 0 0 , 0 0 1

0

0 1 1 0

0 0 1 1

0 0 0 1

11. Let A be a square matrix. We have seen that if f(x) ∈ F[x] then f(A^tr) = f(A)^tr. Since also M_{f(S)}(B, B) = f(M_S(B, B)), it follows that S and S′ have the same minimum polynomial. Let p1(x), ..., ps(x) be the distinct irreducible polynomials dividing µS(x) = µS′(x) and let µS(x) = p1(x)^{m1}···ps(x)^{ms}. Let Vi = {v ∈ V | pi(S)^{mi}(v) = 0} and Vi′ = {v ∈ V | pi(S′)^{mi}(v) = 0}. Note that for any operator T, nullity(T) and nullity(MT(B, B)) are equal, and for any square matrix A, nullity(A) and nullity(A^tr) are equal. It follows that dim(Vi) = dim(Vi′). By Exercise 12 of Section (4.5) we can determine the elementary divisors of S divisible by pi(x) by determining the dimensions of Ker(pi(S)^k) for all


k ≤ mi, and likewise for the elementary divisors of S′. However, these numbers will all be the same and consequently S and S′ have the same elementary divisors and invariant factors.

12.
0 0 0 0 0 0 0 0 0 0 2 0 0

0

1

0 0 0 13. 0 0 0

0 0 0 0 0 0

0 0 0 0 0 0 0 −1 0 1 0 0

2

0 0 0 0 −1 1

0 0 0 0 0 −1

4.7. Linear Operators on Real and Complex Vector Spaces

1. i) implies ii). Assume T is completely reducible. Note that the assumption that T is completely reducible carries over to the restriction of T to any T-invariant subspace. Set f(x) = µT(x) and n = dim(V). Since f(x) ∈ C[x], f(x) splits into linear factors. Let α1, ..., αt be the distinct roots of f(x). Set Vi = {v ∈ V | (T − αiIV)^n(v) = 0}. Then V = V1 ⊕ ··· ⊕ Vt. We show that, in fact, T(v) = αiv for v ∈ Vi. Suppose to the contrary that there exists v ∈ Vi such that µT,v(x) = (x − αi)^k with k > 1. Then ⟨T, v⟩ is indecomposable. However, by the above remark ⟨T, v⟩ is completely reducible, so we have a contradiction. Thus, every non-zero vector in Vi is an eigenvector with eigenvalue αi. It now follows that µT(x) = (x − α1)···(x − αt).

ii) implies iii). Assume µT(x) = (x − α1)···(x − αt) with the αi distinct. Set Vi = {v ∈ V | T(v) = αiv}. Then V = V1 ⊕ ··· ⊕ Vt. For each i, T restricted to Vi is αiIVi. Choose a basis Bi for Vi. Each vector in Bi is an eigenvector with eigenvalue αi. Now set B = B1 ∪ ··· ∪ Bt. This is a basis of eigenvectors.

iii) implies iv). Let B be a basis of eigenvectors. Then MT(B, B) is a diagonal matrix, which must be the Jordan canonical form of T.

iv) implies i). Let B be a basis for V such that MT(B, B) is the Jordan canonical form of T. Since the Jordan canonical form is diagonal this implies that the basis B consists of eigenvectors. Let α1, ..., αt be the distinct eigenvalues and Vi = {v ∈ V | T(v) = αiv}. Then V = V1 ⊕ ··· ⊕ Vt. Suppose U is a T-invariant subspace of V. Set Ui = U ∩ Vi. Then U = U1 ⊕ ··· ⊕ Ut. Since T acts as a scalar when restricted to Vi, every subspace of Vi is T-invariant. Let Wi be any complement to Ui in Vi and set W = W1 ⊕ ··· ⊕ Wt. Then W is a T-invariant complement to U in V.

2. i) implies ii). Let α be an eigenvalue of T and let (x − α)^m be the exact power of x − α which divides µT(x). Let µT(x) = (x − α)^m f(x). Then f(x) and (x − α)^m are relatively prime. Set Vα = {v ∈ V | (T − αIV)^m(v) = 0} and Vf = {v ∈ V | f(T)(v) = 0}. Then V = Vα ⊕ Vf. It follows from our hypothesis that Vf = {0} and f(x) = 1. If T has more than one elementary divisor then V can be decomposed into a direct sum; therefore, by our hypothesis, there is only one elementary divisor and one Jordan block.

ii) implies i). If there is a single Jordan block of size n, say Jn(α), then T is cyclic, the minimum polynomial of T is (x − α)^n, and T is indecomposable.

3. The minimal polynomial is µT(x) = (x − 1)(x³ − 1). The characteristic polynomial is (x − 1)(x³ − 1)². The invariant factors are (x − 1)(x³ − 1) and x³ − 1. The elementary divisors are x − 1, (x − 1)², x² + x + 1, x² + x + 1.
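The claims of Exercise 3 can be double-checked by rebuilding the operator from its invariant factors as a direct sum of companion matrices. This sympy sketch assumes the standard companion-matrix convention (ones on the subdiagonal, coefficients in the last column); the helper is ours, not from the text.

```python
from sympy import Matrix, Poly, symbols, expand, eye, diag

x = symbols('x')

def companion(p):
    """Companion matrix of a monic polynomial p(x)."""
    c = Poly(p, x).all_coeffs()          # leading coefficient first
    n = len(c) - 1
    C = Matrix.zeros(n, n)
    for i in range(1, n):
        C[i, i - 1] = 1                  # ones on the subdiagonal
    for i in range(n):
        C[i, n - 1] = -c[n - i]          # last column carries the coefficients
    return C

# invariant factors claimed in Exercise 3
d1 = expand((x - 1)*(x**3 - 1))
d2 = x**3 - 1
M = diag(companion(d1), companion(d2))   # 7 x 7 operator

# characteristic polynomial = product of the invariant factors
print(expand(M.charpoly(x).as_expr() - (x - 1)*(x**3 - 1)**2) == 0)  # True
# (x - 1)(x^3 - 1) annihilates M, but x^3 - 1 alone does not
print((M - eye(7)) * (M**3 - eye(7)) == Matrix.zeros(7, 7))          # True
print((M**3 - eye(7)) != Matrix.zeros(7, 7))                         # True
```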


4.
1 0 0 0 0 0 0
0 1 0 0 0 0 0
0 1 1 0 0 0 0
0 0 0 0 −1 0 0
0 0 0 1 −1 0 0
0 0 0 0 0 0 −1
0 0 0 0 0 1 −1

5. Set ω = −1/2 + i√3/2 and ω² = ω̄ = −1/2 − i√3/2. Then the Jordan canonical form of T over the complex numbers is

1 0 0 0 0 0 0
0 1 0 0 0 0 0
0 1 1 0 0 0 0
0 0 0 ω 0 0 0
0 0 0 0 ω 0 0
0 0 0 0 0 ω² 0
0 0 0 0 0 0 ω²

6. We define S, T on C⁴ by matrices with respect to the standard basis:

S =
1 0 0 0
0 1 0 0
0 0 1 0
0 0 1 1

T =
1 0 0 0
1 1 0 0
0 0 1 0
0 0 1 1

Then χS(x) = χT(x) = (x − 1)⁴ and µS(x) = µT(x) = (x − 1)². The elementary divisors (invariant factors) of S are x − 1, x − 1, (x − 1)². The elementary divisors of T are (x − 1)², (x − 1)².

7. There are eight possibilities. They are:

J2(0) ⊕ J3(−2i) ⊕ J1(0) ⊕ J1(0) ⊕ J1(0)
J2(0) ⊕ J3(−2i) ⊕ J1(0) ⊕ J2(0)
J2(0) ⊕ J3(−2i) ⊕ J1(0) ⊕ J1(0) ⊕ J1(−2i)
J2(0) ⊕ J3(−2i) ⊕ J2(0) ⊕ J1(−2i)
J2(0) ⊕ J3(−2i) ⊕ J1(0) ⊕ J2(−2i)
J2(0) ⊕ J3(−2i) ⊕ J1(−2i) ⊕ J1(−2i) ⊕ J1(−2i)
J2(0) ⊕ J3(−2i) ⊕ J1(−2i) ⊕ J2(−2i)
J2(0) ⊕ J3(−2i) ⊕ J3(−2i)

8. The proof is by induction on n = dim(V). Let α1, ..., αs be the distinct eigenvalues of S and β1, ..., βt the distinct eigenvalues of T. For α an eigenvalue of S set V_{S,α} = {v ∈ V | (S − αIV)^n(v) = 0}. Since S and T commute, V_{S,α} is T-invariant. Likewise, define V_{T,β} for β = βj, 1 ≤ j ≤ t. Then V_{T,β} is S-invariant. As usual we have


V = V_{S,α1} ⊕ ··· ⊕ V_{S,αs} = V_{T,β1} ⊕ ··· ⊕ V_{T,βt}.

Suppose s > 1. Then we can apply induction and conclude that for each i, 1 ≤ i ≤ s, there is a basis Bi of V_{S,αi} such that the matrices of S and T restricted to V_{S,αi} with respect to Bi are in Jordan canonical form. Thus, we can assume that s = 1. In a similar way we can reduce to the case that t = 1. Let α = α1, β = β1. Set E_{S,α} = {v ∈ V | S(v) = αv}. Then E_{S,α} is T-invariant. It is then the case that there must be a vector v ∈ E_{S,α} which is also an eigenvector for T. Then Span(v) is S- and T-invariant. Let S̄ denote the transformation induced on V/Span(v) by S and similarly define T̄. By the induction hypothesis there exists a basis B̄ = (v̄1, ..., v̄_{n−1}) of V/Span(v) such that the matrices of S̄ and T̄ with respect to B̄ are in Jordan canonical form. For 1 ≤ i ≤ n − 1 let v_{i+1} be a vector in V such that v_{i+1} + Span(v) = v̄i, and set v1 = v. Then B = (v1, ..., vn) is a basis for


V, and MS(B, B) and MT(B, B) are in Jordan canonical form.

9.
−2 0 0 0
1 −2 0 0
0 0 2 0
0 0 1 2

10. Let B be a basis such that M = MT(B, B) is in Jordan canonical form. Let A be the diagonal part of M and B = M − A. Then A is a diagonal matrix and B is a strictly lower triangular matrix and hence is nilpotent. Let D be the operator such that MD(B, B) = A and N the operator such that MN(B, B) = B (so that N = T − D). Then D is a diagonalizable operator and N is a nilpotent operator. Note that for a Jordan block the diagonal part is a scalar matrix (a scalar times the identity), which clearly commutes with the nilpotent part. From this it follows that AB = BA and consequently DN = ND.

Suppose there exist polynomials d(x), n(x) such that d(T) = D and n(T) = N. We use this to prove uniqueness. Suppose also that D′ is diagonalizable, N′ is nilpotent, D′N′ = N′D′ and T = D′ + N′. Since T = D′ + N′ we have D′T = D′(D′ + N′) = (D′)² + D′N′ = (D′)² + N′D′ = (D′ + N′)D′ = TD′. Similarly, T and N′ commute. Since D and N are polynomials in T it follows that D and N commute with D′ and N′. From D′ + N′ = T = D + N we conclude that

D − D′ = N′ − N   (4.6)

The operator on the left of (4.6) is the difference of two commuting diagonalizable operators and so is diagonalizable. The operator on the right-hand side of (4.6) is the difference of two commuting nilpotent operators and therefore is nilpotent. However, the only nilpotent diagonalizable operator is the zero operator. Thus D − D′ = N′ − N = 0V→V, whence D = D′ and N = N′.

It therefore suffices to prove that there exist polynomials d(x) and n(x) such that d(T) + n(T) = T, d(T) is diagonalizable, and n(T) is nilpotent. Set f(x) = µT(x) and assume that f(x) = (x − α1)^{m1}···(x − αs)^{ms} with the αi distinct. Set Vi = {v ∈ V | (T − αiIV)^{mi}(v) = 0}, so that V = V1 ⊕ ··· ⊕ Vs. Now let fi(x) = f(x)/(x − αi)^{mi}. Then fi(x) and (x − αi)^{mi} are relatively prime and consequently there exist polynomials ai(x), bi(x) such that

ai(x)fi(x) + bi(x)(x − αi)^{mi} = 1.

Now ai(T)fi(T) acts as the zero operator on V1 ⊕ ··· ⊕ V_{i−1} ⊕ V_{i+1} ⊕ ··· ⊕ Vs and as the identity operator on Vi. Then αi·ai(T)fi(T) is the zero operator on V1 ⊕ ··· ⊕ V_{i−1} ⊕ V_{i+1} ⊕ ··· ⊕ Vs and acts as scalar multiplication by αi on Vi. Also, ai(T)fi(T)(T − αiIV) is the zero operator on V1 ⊕ ··· ⊕ V_{i−1} ⊕ V_{i+1} ⊕ ··· ⊕ Vs and is nilpotent on Vi.

Now set d(x) = Σ_{i=1}^s αi·ai(x)fi(x) and n(x) = Σ_{i=1}^s ai(x)fi(x)(x − αi). Then d(T) acts as scalar multiplication by αi on Vi for each i and n(T) acts as T − αiIV on each Vi. Consequently, d(T) is a diagonalizable operator, n(T) is a nilpotent operator and d(T) + n(T) = T. Since d(T), n(T) are polynomials in T we have d(T)n(T) = n(T)d(T).

11. Assume first that T has no real eigenvalues. Let p(x) be an irreducible factor of µT(x). Then p(x) has no real roots and therefore p(x) is a real irreducible quadratic. Suppose U is a T-invariant subspace. Let the elementary divisors of T restricted to U be p1(x)^{m1}, ..., ps(x)^{ms} (note that the pi(x) are not necessarily distinct). Each pi(x) is a real irreducible quadratic. Then the degree of pi(x)^{mi} is 2mi and the dimension of U is the sum 2m1 + ··· + 2ms = 2(m1 + ··· + ms).

12. Let T be the operator with matrix

0 1
−1 0

with respect to the standard basis. The minimum polynomial of T is x² + 1, which has no real roots. However, T² = −I_{R²}.

13. TS = S^{−1}(ST)S. Thus, TS and ST are similar. Therefore if ST is diagonalizable then so is TS.
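A minimal numeric illustration of Exercise 10, using the matrix from Exercise 9 (our numpy code, assuming nothing beyond the standard library of numpy calls): the diagonal of the Jordan form gives the diagonalizable part D, N = T − D is nilpotent, and the two commute.

```python
import numpy as np

# Jordan-form matrix from Exercise 9 (lower-triangular Jordan convention).
T = np.array([[-2, 0, 0, 0],
              [ 1, -2, 0, 0],
              [ 0, 0, 2, 0],
              [ 0, 0, 1, 2]], dtype=float)

D = np.diag(np.diag(T))    # diagonalizable part: diag(-2, -2, 2, 2)
N = T - D                  # strictly lower triangular, hence nilpotent

print(np.allclose(D @ N, N @ D))                      # True: D and N commute
print(np.allclose(np.linalg.matrix_power(N, 4), 0))   # True: N is nilpotent
```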


Chapter 5

Inner Product Spaces

5.1. Inner Products

1. Positive definite: If u = (a1, ..., an) then u · u = a1² + ··· + an², which is non-negative since each square is non-negative, and zero if and only if a1 = ··· = an = 0, that is, if and only if u = 0.

Symmetry: If u = (a1, ..., an) and v = (b1, ..., bn) then u · v = a1b1 + ··· + anbn = b1a1 + ··· + bnan = v · u, since for all real numbers a, b we have commutativity of multiplication: ab = ba.

Additivity: If u = (a1, ..., an), v = (b1, ..., bn) and w = (c1, ..., cn) then (u + v) · w = (a1 + b1)c1 + ··· + (an + bn)cn = (a1c1 + b1c1) + ··· + (ancn + bncn) = (a1c1 + ··· + ancn) + (b1c1 + ··· + bncn) = u · w + v · w. Essentially, additivity holds because multiplication distributes over addition in R.

Homogeneity: If γ ∈ R then (γu) · v = (γa1)b1 + ··· + (γan)bn = γ(a1b1) + ··· + γ(anbn) = γ(a1b1 + ··· + anbn) = γ(u · v). Essentially, homogeneity holds because multiplication in R is associative.

2. ⟨u, γv⟩ = \overline{⟨γv, u⟩} = \overline{γ⟨v, u⟩} = \overline{γ}·\overline{⟨v, u⟩} = \overline{γ}⟨u, v⟩.

3. ⟨u, v + w⟩ = \overline{⟨v + w, u⟩} = \overline{⟨v, u⟩ + ⟨w, u⟩} = \overline{⟨v, u⟩} + \overline{⟨w, u⟩} = ⟨u, v⟩ + ⟨u, w⟩.
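The identities of Exercises 2 and 3 can be spot-checked numerically for the standard Hermitian dot product ⟨u, v⟩ = Σ u_i·conj(v_i); the random vectors below are our own example.

```python
import numpy as np

rng = np.random.default_rng(0)
u = rng.standard_normal(4) + 1j*rng.standard_normal(4)
v = rng.standard_normal(4) + 1j*rng.standard_normal(4)
w = rng.standard_normal(4) + 1j*rng.standard_normal(4)
gamma = 2.0 - 3.0j

ip = lambda a, b: np.sum(a * np.conj(b))  # <a, b>, linear in first argument

print(np.isclose(ip(u, gamma*v), np.conj(gamma)*ip(u, v)))  # Exercise 2: True
print(np.isclose(ip(u, v + w), ip(u, v) + ip(u, w)))        # Exercise 3: True
```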

4. Positive definite: Let w = (w1, ..., wn). Then

⟨w, w⟩ = α1·w1\overline{w1} + ··· + αn·wn\overline{wn}   (5.1)

Each wi\overline{wi} is non-negative. Since αi > 0, also αi·wi\overline{wi} ≥ 0, and it is zero if and only if wi = 0. Then each term of (5.1) is non-negative. Therefore ⟨w, w⟩ is non-negative, and it is zero if and only if each term is zero, if and only if wi = 0 for all i, if and only if w = 0.

Conjugate symmetry: Suppose u = (u1, ..., un) and v = (v1, ..., vn). For each term we have \overline{αi·ui\overline{vi}} = αi·\overline{ui\overline{vi}} = αi·vi\overline{ui}, since αi is real. This implies that ⟨v, u⟩ = \overline{⟨u, v⟩}.

Additivity in the first argument: This holds since multiplication distributes over addition in F.

Homogeneity in the first argument: This holds since multiplication is associative and commutative; for each term we have αi(γui)\overline{vi} = αi[γ(ui\overline{vi})] = (αiγ)(ui\overline{vi}) = (γαi)(ui\overline{vi}) = γ[αi(ui\overline{vi})].

5. Positive definite: ⟨u, u⟩ = ⟨S(u), S(u)⟩_EIP ≥ 0 since ⟨ , ⟩_EIP is an inner product, and it equals zero if and only if S(u) = 0. However, S is an invertible operator and therefore S(u) = 0 if and only if u = 0.

Conjugate symmetry: ⟨u, v⟩ = ⟨S(u), S(v)⟩_EIP = \overline{⟨S(v), S(u)⟩_EIP} = \overline{⟨v, u⟩}.

Additivity in the first argument: ⟨u + v, w⟩ = ⟨S(u + v), S(w)⟩_EIP. Since S is linear and ⟨ , ⟩_EIP is additive in the first argument we have ⟨S(u + v), S(w)⟩_EIP = ⟨S(u) + S(v), S(w)⟩_EIP = ⟨S(u), S(w)⟩_EIP + ⟨S(v), S(w)⟩_EIP = ⟨u, w⟩ + ⟨v, w⟩.

Homogeneity in the first argument: By the linearity of S and the homogeneity of ⟨ , ⟩_EIP in the first argument we have ⟨γu, v⟩ = ⟨S(γu), S(v)⟩_EIP = ⟨γS(u), S(v)⟩_EIP = γ⟨S(u), S(v)⟩_EIP = γ⟨u, v⟩.

6. Positive definite: Let A have entries aij. The (j, j)-entry of A^tr·\overline{A} is Σ_{i=1}^n aij·\overline{aij}. It follows that ⟨A, A⟩ = Σ_{i=1}^n Σ_{j=1}^n aij·\overline{aij}. Since each aij·\overline{aij} ≥ 0, ⟨A, A⟩ ≥ 0. A term aij·\overline{aij} = 0 if and only if aij = 0, and therefore the sum is zero if and only if A = 0_{nn}.

Conjugate symmetry: The (j, j)-entry of A^tr·\overline{B} is Σ_{i=1}^n aij·\overline{bij} and therefore ⟨A, B⟩ = Σ_{i=1}^n Σ_{j=1}^n aij·\overline{bij}.

K23692_SM_Cover.indd 54

02/06/15 3:11 pm

5.1. Inner Products

47

In a similar fashion, ⟨B, A⟩ = Σ_{i=1}^n Σ_{j=1}^n bij·\overline{aij}. Since \overline{aij·\overline{bij}} = bij·\overline{aij}, the conjugate symmetry follows.

Additivity in the first argument: This follows from the formula and the fact that multiplication distributes over addition in F.

Homogeneity in the first argument: This follows from the formula and the fact that multiplication is associative in F.

7. ⟨ , ⟩ is an inner product. Throughout, u = (u1, u2), v = (v1, v2), w = (w1, w2) with u1, v1, w1 ∈ V1 and u2, v2, w2 ∈ V2.

Positive definite: ⟨u, u⟩ = ⟨u1, u1⟩1 + ⟨u2, u2⟩2. Each of these is non-negative. Moreover, we get zero if and only if ⟨u1, u1⟩1 = ⟨u2, u2⟩2 = 0, if and only if u1 = 0_{V1} and u2 = 0_{V2}.

Conjugate symmetry: ⟨u, v⟩ = ⟨u1, v1⟩1 + ⟨u2, v2⟩2 = \overline{⟨v1, u1⟩1} + \overline{⟨v2, u2⟩2} = \overline{⟨v1, u1⟩1 + ⟨v2, u2⟩2} = \overline{⟨v, u⟩}.

Additivity in the first argument: ⟨u + v, w⟩ = ⟨u1 + v1, w1⟩1 + ⟨u2 + v2, w2⟩2 = ⟨u1, w1⟩1 + ⟨v1, w1⟩1 + ⟨u2, w2⟩2 + ⟨v2, w2⟩2 = (⟨u1, w1⟩1 + ⟨u2, w2⟩2) + (⟨v1, w1⟩1 + ⟨v2, w2⟩2) = ⟨u, w⟩ + ⟨v, w⟩.

Homogeneity in the first argument: ⟨γu, v⟩ = ⟨γu1, v1⟩1 + ⟨γu2, v2⟩2 = γ⟨u1, v1⟩1 + γ⟨u2, v2⟩2 = γ(⟨u1, v1⟩1 + ⟨u2, v2⟩2) = γ⟨u, v⟩.

8. Assume that v = c1v1 + ··· + cnvn = 0. Then ⟨v, vj⟩ = 0 for all j. Using additivity and homogeneity in the first argument we get

c1⟨v1, vj⟩ + ··· + cn⟨vn, vj⟩ = 0.

This implies that (c1, ..., cn) is in the null space of the matrix A. It immediately follows that (v1, ..., vn) is linearly independent if and only if null(A) = {0}, if and only if A is invertible.

9. If ci > 0 for all i then ⟨ , ⟩ is an inner product by Exercise 4. Suppose some ci ≤ 0. Then ⟨ei, ei⟩ ≤ 0, which contradicts positive definiteness. Thus, if ⟨ , ⟩ is an inner product, then all ci > 0.

10. First note that if f, g ∈ V then spt(f) ∪ spt(g) is finite, so there are only finitely many non-zero terms in Σ_{i∈N} f(i)g(i).

Positive definite: Let I = spt(f). Then ⟨f, f⟩ = Σ_{i∈I} f(i)². Each f(i)² > 0 for i ∈ I. Therefore ⟨f, f⟩ ≥ 0, and the sum is positive unless I = ∅, that is, unless f is the zero function.

Symmetry: This follows since for all i ∈ spt(f) ∩ spt(g), f(i)g(i) = g(i)f(i), because multiplication of real numbers is commutative.
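The Gram-matrix criterion of Exercise 8 can be checked numerically. The sketch below (our example) builds the matrix A with entries ⟨vi, vj⟩ once for an independent list and once for a dependent one.

```python
import numpy as np

rng = np.random.default_rng(1)
V = rng.standard_normal((3, 4))          # rows: 3 independent vectors in R^4
G = V @ V.T                              # Gram matrix A_ij = <v_i, v_j>
print(np.linalg.matrix_rank(G) == 3)     # True: invertible, list independent

W = np.vstack([V, V[0] + V[1]])          # append a dependent fourth vector
G2 = W @ W.T                             # 4x4 Gram matrix
print(np.linalg.matrix_rank(G2) == 3)    # True: singular, list dependent
```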

Additivity in the first argument: Set J = [spt(f) ∩ spt(h)] ∪ [spt(g) ∩ spt(h)]. Let f, g, h ∈ V and i ∈ J. Then [f(i) + g(i)]h(i) = f(i)h(i) + g(i)h(i), since multiplication distributes over addition in R. It then follows that ⟨f + g, h⟩ = Σ_{i∈J} [f(i) + g(i)]h(i) = Σ_{i∈J} [f(i)h(i) + g(i)h(i)] = Σ_{i∈J} f(i)h(i) + Σ_{i∈J} g(i)h(i) = ⟨f, h⟩ + ⟨g, h⟩.

Homogeneity in the first argument: Set I = spt(f) ∩ spt(g). Then ⟨γf, g⟩ = Σ_{i∈I} [γf(i)]g(i) = Σ_{i∈I} γ(f(i)g(i)) = γ Σ_{i∈I} f(i)g(i) = γ⟨f, g⟩.

11. This is a real inner product.

Positive definite: Since ⟨v, v⟩ is a non-negative real number, ⟨v, v⟩_R = ⟨v, v⟩, and so it is non-negative and equal to zero if and only if v = 0.

Symmetry: Suppose ⟨v, w⟩ = a + bi with a, b ∈ R. Then ⟨v, w⟩_R = a. We then have ⟨w, v⟩ = \overline{⟨v, w⟩} = \overline{a + bi} = a − bi. Thus, ⟨w, v⟩_R = a = ⟨v, w⟩_R.

Additivity in the first argument: Suppose ⟨u, w⟩ = a + bi and ⟨v, w⟩ = c + di with a, b, c, d ∈ R. Then ⟨u, w⟩_R = a and ⟨v, w⟩_R = c. Now ⟨u + v, w⟩ = ⟨u, w⟩ + ⟨v, w⟩ = (a + bi) + (c + di) = (a + c) + (b + d)i. Thus, ⟨u + v, w⟩_R = a + c, as required.

Homogeneity in the first argument: Assume ⟨v, w⟩ = a + bi, so that ⟨v, w⟩_R = a. Let γ ∈ R. Then ⟨γv, w⟩ = γ⟨v, w⟩ = γ(a + bi) = (γa) + (γb)i. It follows that ⟨γv, w⟩_R = γa = γ⟨v, w⟩_R.

5.2. The Geometry of Inner Product Spaces

1. Let v, w ∈ u⊥, so that ⟨v, u⟩ = ⟨w, u⟩ = 0. By additivity in the first argument we have

⟨v + w, u⟩ = ⟨v, u⟩ + ⟨w, u⟩ = 0 + 0 = 0.

Thus, v + w ∈ u⊥. Now assume γ ∈ F. By homogeneity in the first argument we have ⟨γv, u⟩ = γ⟨v, u⟩ = γ × 0 = 0. So, if v ∈ u⊥ and γ is a scalar then γv ∈ u⊥.

2. Define a map fu : V → F by fu(v) = ⟨v, u⟩. Since ⟨ , ⟩ is additive and homogeneous in the first argument, the map fu is a linear transformation. Assume u ≠ 0. Then for a ∈ F, fu(a·u/⟨u, u⟩) = a, and consequently fu is surjective. Now by the rank-nullity theorem, dim(Ker(fu)) = n − 1. However, Ker(fu) = u⊥.

3. Assume w ∈ W ∩ W⊥. Then w ⊥ w, that is, ⟨w, w⟩ = 0. By positive definiteness, w = 0.

4. ⟨ax² + bx + c, x² + x + 1⟩ = ∫₀¹ (ax² + bx + c)(x² + x + 1) dx = ∫₀¹ [ax⁴ + (a + b)x³ + (a + b + c)x² + (b + c)x + c] dx = [a·x⁵/5 + (a + b)·x⁴/4 + (a + b + c)·x³/3 + (b + c)·x²/2 + cx]₀¹ =


a/5 + (a + b)/4 + (a + b + c)/3 + (b + c)/2 + c.

This reduces to finding the solutions to the homogeneous equation 47a + 65b + 110c = 0. A basis for (x² + x + 1)⊥ is ((110/47)x² − 1, (110/65)x − 1).

5. d(A, B) = √⟨A − B, A − B⟩. Here

A − B =
−4 −3
3 −4

We have to compute the trace of the product

(A − B)^tr (A − B) =
25 0
0 25

Thus, d(A, B) = √50 = 5√2.

6. ⟨A, I2⟩ is equal to the trace of A^tr, which is the same as the trace of A. So the orthogonal complement of I2 consists of all matrices with trace zero. A basis for this is

0 1
0 0
,
0 0
1 0
,
1 0
0 −1

7. The subspace of diagonal matrices in M22(R) has basis

1 0
0 0
,
0 0
0 1

The orthogonal complement of the first basis matrix consists of all matrices with a11 = 0, and the orthogonal complement of the second consists of all matrices with a22 = 0. Thus, the orthogonal complement of the space of diagonal matrices in M22(R) consists of all matrices with zeros on the diagonal.

8. d(x², x) = √⟨x² − x, x² − x⟩. Now ⟨x² − x, x² − x⟩ = ∫₀¹ (x² − x)² dx = ∫₀¹ (x⁴ − 2x³ + x²) dx = [x⁵/5 − x⁴/2 + x³/3]₀¹ = 1/5 − 1/2 + 1/3 = 1/30. Thus d(x², x) = √(1/30) = √30/30.

9. ⟨u − (⟨u, v⟩/‖v‖²)v, v⟩ = ⟨u, v⟩ − (⟨u, v⟩/‖v‖²)⟨v, v⟩ = ⟨u, v⟩ − (⟨u, v⟩/‖v‖²)‖v‖² = ⟨u, v⟩ − ⟨u, v⟩ = 0.

10. ⟨u + v, u + v⟩ = ⟨u, u⟩ + ⟨u, v⟩ + ⟨v, u⟩ + ⟨v, v⟩.

⟨u − v, u − v⟩ = ⟨u, u⟩ − ⟨u, v⟩ − ⟨v, u⟩ + ⟨v, v⟩.

⟨u + iv, u + iv⟩ = ⟨u, u⟩ − i⟨u, v⟩ + i⟨v, u⟩ + ⟨v, v⟩.

⟨u − iv, u − iv⟩ = ⟨u, u⟩ + i⟨u, v⟩ − i⟨v, u⟩ + ⟨v, v⟩.

We then have ‖u + v‖² − ‖u − v‖² + i‖u + iv‖² − i‖u − iv‖² = (2⟨u, v⟩ + 2⟨v, u⟩) + (2⟨u, v⟩ − 2⟨v, u⟩) = 4⟨u, v⟩.
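The polarization identity just derived in Exercise 10 can be verified numerically on random complex vectors (our example):

```python
import numpy as np

# Check: 4<u,v> = ||u+v||^2 - ||u-v||^2 + i||u+iv||^2 - i||u-iv||^2,
# with <u,v> linear in the first argument.
rng = np.random.default_rng(2)
u = rng.standard_normal(5) + 1j*rng.standard_normal(5)
v = rng.standard_normal(5) + 1j*rng.standard_normal(5)

ip = lambda a, b: np.sum(a * np.conj(b))
nsq = lambda a: ip(a, a).real            # squared norm

rhs = nsq(u + v) - nsq(u - v) + 1j*nsq(u + 1j*v) - 1j*nsq(u - 1j*v)
print(np.isclose(4*ip(u, v), rhs))       # True
```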

11. Set x = (x1, x2/√2, ..., xn/√n) and y = (y1, √2·y2, ..., √n·yn). Apply Cauchy-Schwarz:


2

(x · y)² ≤ (x · x)(y · y). The left-hand side is (Σ_{i=1}^n xiyi)², while the right-hand side is (Σ_{i=1}^n xi²/i)(Σ_{i=1}^n i·yi²).

12. a) Since d(u, v) = √⟨u − v, u − v⟩ we have d(u, v) ≥ 0, with equality if and only if u − v = 0, if and only if u = v.

b) This follows since ⟨−x, −x⟩ = ⟨x, x⟩ for any vector x.

c) d(u, w) = ‖u − w‖ = ‖(u − v) + (v − w)‖ ≤ ‖u − v‖ + ‖v − w‖ = d(u, v) + d(v, w) by Theorem (5.5).

13. Denote the 2 × 2 identity matrix by I2 and the 2 × 2 all-one matrix by J2. Easy calculations give ‖I2‖ = √2, ‖J2‖ = 2 and ⟨I2, J2⟩ = 2. Thus

cos θ = ⟨I2, J2⟩/(‖I2‖ ‖J2‖) = 2/(2√2) = √2/2.

The angle is π/4.

14. If ‖u + v‖² = ‖u‖² + ‖v‖² then u ⊥ v. Then (cu) ⊥ (dv), and by the general Pythagorean theorem

‖cu + dv‖² = ‖cu‖² + ‖dv‖² = c²‖u‖² + d²‖v‖².

15. It is a consequence of the assumption that ‖u‖₁ = ‖u‖₂ for all vectors u. Then ‖u + v‖₁² = ‖u + v‖₂². It then follows that 2⟨u, v⟩₁ = 2⟨u, v⟩₂, and so we get that ⟨ , ⟩₁ and ⟨ , ⟩₂ are identical.

16. Let a be the scalar such that z = y − ax is orthogonal to x. Then y = ax + z and ⟨y, y⟩ = ⟨ax + z, ax + z⟩ = |a|² + ⟨z, z⟩ ≥ |a|². On the other hand, ⟨y, x⟩⟨x, y⟩ = ⟨ax + z, x⟩⟨x, ax + z⟩ = a·\overline{a} = |a|².

5.3. Orthonormal Sets and the Gram-Schmidt Process

1. x2 = w2 − (⟨w2, x1⟩/⟨x1, x1⟩)·x1. Then

⟨x2, x1⟩ = ⟨w2, x1⟩ − (⟨w2, x1⟩/⟨x1, x1⟩)·⟨x1, x1⟩ = ⟨w2, x1⟩ − ⟨w2, x1⟩ = 0.

2. x3 = w3 − (⟨w3, x1⟩/⟨x1, x1⟩)·x1 − (⟨w3, x2⟩/⟨x2, x2⟩)·x2. Making use of additivity and homogeneity in the first argument and the fact that x1 ⊥ x2 we get

⟨x3, x1⟩ = ⟨w3, x1⟩ − (⟨w3, x1⟩/⟨x1, x1⟩)·⟨x1, x1⟩ − (⟨w3, x2⟩/⟨x2, x2⟩)·⟨x2, x1⟩ = ⟨w3, x1⟩ − ⟨w3, x1⟩ = 0.

⟨x3, x2⟩ is computed in exactly the same way.

3. Let u ∈ U and x ∈ W⊥. Since U ⊂ W, u ⊥ x. Since u is arbitrary, x ∈ U⊥. Since x is arbitrary we have W⊥ ⊂ U⊥.

4. Let (w1, ..., wk) be a basis for W and extend it to a basis (w1, ..., wn) for V. Apply the Gram-Schmidt process to obtain an orthonormal basis (x1, ..., xn) such that for all j ≤ n, Span(x1, ..., xj) = Span(w1, ..., wj). Then, in particular, (x1, ..., xk) is an orthonormal basis of W.


Clearly Span(x_{k+1},...,x_n) is contained in W⊥ and has dimension n−k. Thus, dim(W⊥) ≥ n−k. Since W ∩ W⊥ = {0}, dim(W⊥) ≤ n−k. Consequently we have equality.

5. This follows from the fact that W ∩ W⊥ = {0} and dim(W) + dim(W⊥) = n.

6. Assume dim(V) = n and dim(W) = k. Then dim(W⊥) = n−k by Theorem (5.11). By another application of Theorem (5.11) we get that dim((W⊥)⊥) = k. On the other hand, W ⊂ (W⊥)⊥. Since they have the same dimension we get equality.

7. Let x = u+w ∈ U+W and y ∈ U⊥ ∩ W⊥. Then ⟨x,y⟩ = ⟨u+w, y⟩ = ⟨u,y⟩ + ⟨w,y⟩ = 0 + 0 = 0. Thus, U⊥ ∩ W⊥ ⊂ (U+W)⊥.

On the other hand, since U ⊂ U+W, (U+W)⊥ ⊂ U⊥ by Exercise 3. Similarly, (U+W)⊥ ⊂ W⊥. Thus (U+W)⊥ ⊂ U⊥ ∩ W⊥ and we have equality.

Now apply the first part to U⊥ + W⊥:

(U⊥ + W⊥)⊥ = (U⊥)⊥ ∩ (W⊥)⊥ = U ∩ W.

It now follows from Exercise 6 that (U ∩ W)⊥ = U⊥ + W⊥.

8. For each j, Span(w1,...,wj) = Span(v1,...,vj). This implies that [wj]_B has the form (a_{1j},...,a_{jj},0,...,0)^tr. Therefore, the change of basis matrix from B′ to B, M_{I_V}(B′,B), is upper triangular. Reversing the roles of the bases, it also follows that M_{I_V}(B,B′) is upper triangular.

9. (1, x − 1/2, x² − x + 1/6).

10. Extend (v1,...,vk) to an orthonormal basis of V. Set ⟨u,vi⟩ = ci so that u = c1v1 + ··· + cnvn. Then

Σ_{i=1}^k |⟨u,vi⟩|² = Σ_{i=1}^k |ci|².

On the other hand,

‖u‖² = Σ_{i=1}^n |ci|² ≥ Σ_{i=1}^k |ci|².

We get equality if and only if c_{k+1} = ··· = c_n = 0, which occurs if and only if u is in the span of (v1,...,vk).

11. J2⊥ consists of all matrices (a b; c d) such that a+b+c+d = 0 and has basis

((1 −1; 0 0), (1 0; −1 0), (1 0; 0 −1)).

Applying Gram-Schmidt to this basis we obtain the following orthogonal basis:

((1 −1; 0 0), (1/2 1/2; −1 0), (1/3 1/3; 1/3 −1)).

The first matrix has norm √2, the second has norm √(3/2), and the last has norm √(4/3). Dividing the respective matrices by these numbers gives an orthonormal basis.

12. Set ai = ⟨x,vi⟩ and bi = ⟨y,vi⟩. Since (v1,...,vn) is an orthonormal basis we have

x = Σ_{i=1}^n ai·vi,  y = Σ_{i=1}^n bi·vi.

Again, since (v1,...,vn) is an orthonormal basis,

⟨x,y⟩ = Σ_{i=1}^n ai·b̄i = Σ_{i=1}^n ⟨x,vi⟩·conj(⟨y,vi⟩).
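The orthogonal basis produced in Exercise 11 can be verified numerically. The sketch below assumes the Frobenius inner product on 2×2 matrices (entrywise product sum), flattening each matrix to a 4-vector.

```python
# Verify Exercise 11: the three matrices lie in J2-perp (entries sum to 0),
# are pairwise orthogonal, and have squared norms 2, 3/2, 4/3.
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

x1 = [1, -1, 0, 0]          # (1 -1; 0 0)
x2 = [0.5, 0.5, -1, 0]      # (1/2 1/2; -1 0)
x3 = [1/3, 1/3, 1/3, -1]    # (1/3 1/3; 1/3 -1)

assert all(abs(sum(x)) < 1e-12 for x in (x1, x2, x3))      # in J2-perp
assert abs(dot(x1, x2)) < 1e-12
assert abs(dot(x1, x3)) < 1e-12
assert abs(dot(x2, x3)) < 1e-12
assert math.isclose(dot(x1, x1), 2)      # norm sqrt(2)
assert math.isclose(dot(x2, x2), 3 / 2)  # norm sqrt(3/2)
assert math.isclose(dot(x3, x3), 4 / 3)  # norm sqrt(4/3)
```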


Chapter 5. Inner Product Spaces

5.4. Orthogonal Complements and Projections

1. Proj_W(u) = (2, 2, 3)^tr, Proj_{W⊥}(u) = (−1, 1, 0)^tr.

2. Proj_W(J2) = (1 0; 0 1), Proj_{W⊥}(J2) = (0 1; 1 0).

3. Proj_W(x³) = (3/2)x² − (3/5)x + 1/20.

4. 5/3.

5. √30/3.

6. 2√15/5.

7. (1/35)(−20x² + 48x + 6).

8. Extend B to an orthonormal basis B′ = (w1,...,wn). We need to show that Q[wi]_S = [wi]_S for i ≤ k and Q[wi]_S = 0_n if i > k. Since S is an orthonormal basis, for any two vectors u, v we have ⟨u,v⟩ = [u]_S · [v]_S. Now for 1 ≤ i < j ≤ k we have ⟨wi,wj⟩ = 0, which implies that [wi]_S · [wj]_S = 0, and [wi]_S · [wi]_S = 1. This implies that A^tr[wi]_S = ei, the ith standard basis vector of R^k. Whence Q[wi]_S = AA^tr[wi]_S = Aei = [wi]_S as required. On the other hand, if k+1 ≤ j ≤ n then A^tr[wj]_S = 0_k and, consequently, Q[wj]_S = 0_n.

9. Q^tr = (AA^tr)^tr = (A^tr)^tr A^tr = AA^tr = Q, so Q is symmetric. Let v ∈ V and write v = w + u where w ∈ W, u ∈ W⊥. Then Proj_W(Proj_W(v)) = Proj_W(w) = w = Proj_W(v). It follows from this that Q² = Q, since Q is the matrix of Proj_W with respect to S.

10. Let u, v be in V. As stated in Exercise 8, ⟨u,v⟩ = [u]_S · [v]_S. Now suppose w ∈ W = Range(T) and u ∈ Ker(T). We claim that ⟨w,u⟩ = 0. By the above remark it suffices to show that [w]_S · [u]_S = 0, which is equivalent to [w]_S^tr [u]_S = 0. Since w ∈ Range(T) there exists a vector v ∈ V such that w = T(v). Then [w]_S = Q[v]_S. Since u ∈ Ker(T), Q[u]_S = 0_n. We now use this to compute [w]_S^tr [u]_S = 0:

[w]_S^tr [u]_S = (Q[v]_S)^tr [u]_S = [v]_S^tr Q^tr [u]_S = [v]_S^tr (Q[u]_S) = [v]_S^tr 0_n = 0.

It now follows that Ker(T) ⊂ W⊥. However, by the rank-nullity theorem dim(Ker(T)) = n − dim(W). By Theorem (5.11), dim(W⊥) = n − dim(W) and therefore Ker(T) = W⊥.

Now let w ∈ W. It remains to show that T(w) = w. Since Q² = Q it follows that T² = T. Let u ∈ V such that T(u) = w. Then T(w) = T²(u) = T(u) = w.

11. Assume W ⊥ U and let v ∈ V. Set w = Proj_W(v). Then w ∈ W. Since w ⊥ U it follows that Proj_U(w) = 0. Conversely, assume Proj_U ∘ Proj_W = 0_{V→V}. Let w ∈ W. Then Proj_W(w) = w, so Proj_U(w) = 0, which implies that w ∈ Ker(Proj_U) = U⊥. Since w is arbitrary we have W ⊂ U⊥ and therefore W ⊥ U.

12. Let w ∈ W and v ∈ W⊥ such that u = w + v. Then Proj_W(u) = w and ⟨Proj_W(u), Proj_W(u)⟩ = ⟨w,w⟩. On the other hand, since w ⊥ v, by the Pythagorean theorem ⟨u,u⟩ = ⟨w+v, w+v⟩ = ⟨w,w⟩ + ⟨v,v⟩ ≥ ⟨w,w⟩. Moreover, we get equality if and only if v = 0, if and only if u = w ∈ W.

13. Let w ∈ W and v ∈ W⊥ such that u = w + v. Then dist(u,W)² = ⟨v,v⟩. By the Pythagorean theorem, ⟨u,u⟩ = ⟨w+v, w+v⟩ = ⟨w,w⟩ + ⟨v,v⟩ ≥ ⟨v,v⟩. Moreover, we have equality if and only if w = 0, if and only if u = v ∈ W⊥.
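Exercises 8-9 turn on the fact that Q = AA^tr is symmetric and idempotent whenever A has orthonormal columns. A quick numerical sketch (the matrix A below is an illustrative choice, not from the text):

```python
# With A having orthonormal columns, Q = A A^tr is symmetric and Q^2 = Q.
import math

s = 1 / math.sqrt(2)
A = [[s, 0.0],
     [s, 0.0],
     [0.0, 1.0]]   # columns are orthonormal in R^3

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(row) for row in zip(*X)]

Q = matmul(A, transpose(A))
Q2 = matmul(Q, Q)

assert all(math.isclose(Q[i][j], Q[j][i]) for i in range(3) for j in range(3))
assert all(math.isclose(Q2[i][j], Q[i][j], abs_tol=1e-12)
           for i in range(3) for j in range(3))
```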


5.5. Dual Spaces

1. The matrix

[1 2 1 2]
[2 3 1 3]
[1 0 0 1]
[0 1 2 1]

is invertible, with inverse


[−5 3 0 1]
[−1 1 −1 0]
[−2 1 0 1]
[ 5 −3 1 −1].

This implies that the sequence B = ((1,2,1,0)^tr, (2,3,0,1)^tr, (1,1,0,2)^tr, (2,3,1,1)^tr) of column vectors is a basis for R⁴. Set

g1 = −5f1 + 3f2 + f4
g2 = −f1 + f2 − f3
g3 = −2f1 + f2 + f4
g4 = 5f1 − 3f2 + f3 − f4.

Then (g1, g2, g3, g4) is the basis of (R⁴)′ which is dual to B.

2. Since dim(L(V,W)) = dim(L(W′,V′)) it suffices to show that the map T → T′ is injective. Let T ∈ L(V,W) be a non-zero map. We need to show that T′ is non-zero. Since T is non-zero there exists v ∈ V such that w = T(v) ≠ 0_W. Extend w to a basis (w = w1,...,wm) for W. Define g: W → F by g(Σ_{i=1}^m ci·wi) = c1. Now [T′(g)](v) = [g ∘ T](v) = g(T(v)) = g(w) = 1. Thus, T′(g) ≠ 0.

3. Assume T is one-to-one and let f ∈ V′ be non-zero. Let (v2,...,vn) be a basis for Ker(f) and extend to a basis (v1,...,vn) for V such that f(v1) = 1. Set wi = T(vi), 1 ≤ i ≤ n. Since T is one-to-one, (w1,...,wn) is linearly independent. Extend to a basis (w1,...,wm) for W. Define g: W → F by g(Σ_{i=1}^m ai·wi) = a1. Then T′(g)(v1) = (g ∘ T)(v1) = g(T(v1)) = g(w1) = 1. Also, for i > 1, T′(g)(vi) = (g ∘ T)(vi) = g(T(vi)) = g(wi) = 0. Thus, T′(g) = f and T′ is onto.

Conversely, assume that T′ is onto and let v ∈ V be a non-zero vector. Let (v = v1,...,vn) be a basis of V and let f: V → F be defined by f(Σ_{i=1}^n ai·vi) = a1. Since T′ is onto there exists g ∈ W′ such that T′(g) = f. Then (T′(g))(v) = f(v) = 1, whence g(T(v)) = 1, which clearly implies that T(v) ≠ 0_W. Since v is arbitrary we can conclude that Ker(T) = {0_V} and T is injective.

Now assume that T is onto and let g ∈ W′ be non-zero. Then there exists w ∈ W such that g(w) = 1; let (w = w1,...,wm) be a basis for W such that (w2,...,wm) is a basis for Ker(g). Since T is onto, there exists v ∈ V such that T(v) = w. Now [T′(g)](v) = g(T(v)) = g(w) = 1. In particular, T′(g) is not the zero vector in V′; equivalently, g is not in Ker(T′). Since g is arbitrary in W′ it follows that T′ is injective.

Conversely, assume that Range(T) ≠ W. Let (w1,...,wk) be a basis for Range(T) and extend this to a basis (w1,...,wm) for W. Now define g: W → F by g(Σ_{i=1}^m ai·wi) = am. Note that Range(T) ⊂ Ker(g). This implies that T′(g) = 0_{V′} and therefore T′ is not injective.

4. This follows immediately from Exercise 3.

5. Set k = rank(T) and let (w1,...,wk) be a basis for Range(T). Extend to a basis B_W = (w1,...,wm) for W and let B_W′ = (g1,...,gm) be the basis of W′ dual to B_W. We claim that (T′(g1),...,T′(gk)) is a basis for Range(T′).

Suppose j > k. Then Range(T) ⊂ Ker(gj) and therefore T′(gj) = 0_{V′}. Thus, Range(T′) ⊂ Span(T′(g1),...,T′(gk)). Since each T′(gi) is in Range(T′) we have equality. It remains to show that (T′(g1),...,T′(gk)) is linearly independent.

For 1 ≤ i ≤ k let vi ∈ V such that T(vi) = wi. Then (v1,...,vk) is linearly independent since (T(v1),...,T(vk)) = (w1,...,wk) is linearly independent. Extend to a basis B_V = (v1,...,vn) where (v_{k+1},...,vn) is a basis for Ker(T). Let B_V′ = (f1,...,fn) be the basis of V′ which is dual to B_V. We claim that T′(gi) = fi for 1 ≤ i ≤ k, from which the result will follow.

Suppose j > k. Then vj ∈ Ker(T) and [T′(gi)](vj) = 0. Suppose j ≤ k, j ≠ i. Then [T′(gi)](vj) = gi(T(vj)) = gi(wj) = 0. Finally, [T′(gi)](vi) = gi(T(vi)) = gi(wi) = 1.
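The inverse displayed in Exercise 1 can be verified directly: each gᵢ, written in the dual basis (f1,...,f4) of the standard basis, must satisfy gᵢ(bⱼ) = δᵢⱼ on the four basis vectors bⱼ.

```python
# Verify Exercise 1 of Section 5.5: g_i(b_j) = 1 if i == j, else 0.
B = [(1, 2, 1, 0), (2, 3, 0, 1), (1, 1, 0, 2), (2, 3, 1, 1)]       # basis of R^4
G = [(-5, 3, 0, 1), (-1, 1, -1, 0), (-2, 1, 0, 1), (5, -3, 1, -1)]  # g1..g4

for i, g in enumerate(G):
    for j, b in enumerate(B):
        value = sum(gc * bc for gc, bc in zip(g, b))  # g_i applied to b_j
        assert value == (1 if i == j else 0)
```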


6. Assume dim(V) = n, dim(W) = m and rank(T) = k. By the rank-nullity theorem, nullity(T) = n − k. By Exercise 5, rank(T′) = k. Again by the rank-nullity theorem, nullity(T′) = m − k. Thus, nullity(T) = nullity(T′) if and only if n − k = m − k, if and only if n = m.

7. Let v ≠ 0_V, let (v = v1,...,vn) be a basis for V, and let (g1,...,gn) be the basis of V′ dual to (v1,...,vn). Then g1(v) = 1. Since (f1,...,fn) is a basis for V′ there exist scalars c1,...,cn such that

g1 = c1f1 + ··· + cnfn.

Then 1 = g1(v) = c1f1(v) + ··· + cnfn(v). This implies that for some i, fi(v) ≠ 0. It then follows that T(v) ≠ 0_n. Thus, T is injective, whence an isomorphism by the half is good enough theorem.

8. Since T is an isomorphism, by Exercise 4 T′ is an isomorphism. Since (π1,...,πn) is a basis for (Fⁿ)′ we can conclude that (T′(π1),...,T′(πn)) = (f1,...,fn) is a basis for V′.

9. We give an indirect proof. We first establish the existence of a natural isomorphism between V and (V′)′. Thus, let v ∈ V. Define a map Fv: V′ → F by Fv(f) = f(v). We claim that the map v → Fv is a linear transformation.

Let v, w ∈ V. We need to prove that F_{v+w} = Fv + Fw. Let f ∈ V′. Then F_{v+w}(f) = f(v+w) = f(v) + f(w) = Fv(f) + Fw(f) = (Fv + Fw)(f). Now let v ∈ V, c ∈ F. We need to prove that F_{cv} = cFv. For f ∈ V′ we have F_{cv}(f) = f(cv) = c·f(v) = c·Fv(f) = (cFv)(f).

Now let v ∈ V be non-zero. We have seen previously that there exists f ∈ V′ such that f(v) ≠ 0. Then Fv(f) = f(v) ≠ 0. Therefore the linear map v → Fv has trivial kernel and is injective. Since dim(V) = dim((V′)′), in fact, this map is an isomorphism.

Now let (F1,...,Fn) be the basis of (V′)′ which is dual to (f1,...,fn) and let (x1,...,xn) be the sequence of vectors in V such that F_{xi} = Fi. This sequence satisfies the requirements of the exercise.

10. Write U° = {f ∈ V′ | f(u) = 0 for all u ∈ U} for the annihilator of U. Suppose f, g ∈ U°, c ∈ F and u ∈ U. Then (f+g)(u) = f(u) + g(u) = 0 + 0 = 0. Since u ∈ U is arbitrary, U ⊂ Ker(f+g). Also (cf)(u) = c[f(u)] = c × 0 = 0. Thus, U ⊂ Ker(cf), and U° is a subspace of V′.

Let (u1,...,uk) be a basis for U and extend to a basis B = (u1,...,un) for V. Let (g1,...,gn) be the basis of V′ which is dual to B. Then U° = Span(g_{k+1},...,gn) and has dimension n − k.

11. Since U, W ⊂ U+W it follows that if f ∈ (U+W)° then f ∈ U° ∩ W°. On the other hand, suppose f ∈ U° ∩ W° and v ∈ U+W. Then there are u ∈ U, w ∈ W such that v = u+w. Then f(u+w) = f(u) + f(w) = 0 + 0 = 0 and f ∈ (U+W)°. Thus, we have equality.

Now suppose f ∈ U°, g ∈ W° and v ∈ U ∩ W. Then (f+g)(v) = f(v) + g(v) = 0 + 0 = 0. Thus, f+g ∈ (U∩W)°. This shows that U° + W° ⊂ (U∩W)°. We complete this with a dimension argument. Let dim(U) = k, dim(W) = l and dim(U∩W) = j. Then dim(U+W) = k + l − j. From Exercise 10 it follows that dim((U+W)°) = n − k − l + j. By the first part, U° ∩ W° = (U+W)°. Thus, dim(U° ∩ W°) = n − k − l + j.

Again by Exercise 10, dim(U°) = n − k and dim(W°) = n − l. Then

dim(U° + W°) = dim(U°) + dim(W°) − dim(U° ∩ W°) = (n−k) + (n−l) − (n−k−l+j) = n − j = dim((U∩W)°).

12. Clearly π is linear and injective. Since dim((U⊕W)′) = dim(U⊕W) = dim(U) + dim(W) = dim(U′) + dim(W′) = dim(U′ ⊕ W′), π is an isomorphism.


13. Let f ∈ L(X,F). Then (S ∘ T)′(f) = f ∘ (S ∘ T) = (f ∘ S) ∘ T = T′(S′(f)) = (T′ ∘ S′)(f). Thus (S ∘ T)′ = T′ ∘ S′.
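Since in coordinates the dual map is represented by the transpose matrix (compare Exercise 15), Exercise 13 mirrors the matrix identity (AB)^tr = B^tr A^tr. A quick check on arbitrary sample matrices:

```python
# (AB)^tr = B^tr A^tr, the coordinate version of (S o T)' = T' o S'.
A = [[1, 2], [3, 4]]
B = [[0, 1], [5, -2]]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(r) for r in zip(*X)]

assert transpose(matmul(A, B)) == matmul(transpose(B), transpose(A))
```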


14. Let f ∈ U° and u ∈ U. Then [T′(f)](u) = f(T(u)). Since U is T−invariant and u ∈ U, T(u) ∈ U. Since f ∈ U° we conclude that f(T(u)) = 0. Since u is arbitrary, T′(f) ∈ U°.

15. Let B = (v1,...,vn) be a basis for V and B′ = (f1,...,fn) the basis of V′ dual to B. Then M_{T′}(B′,B′) = M_T(B,B)^tr. It follows from this that the operators T ∈ L(V,V) and T′ ∈ L(V′,V′) have the same minimum polynomial (and elementary divisors and invariant factors).

16. i) Suppose g ∈ Ker(T′) and w = T(v). Then g(w) = g(T(v)) = [T′(g)](v) = 0. Since w is arbitrary, g ∈ Range(T)° and Ker(T′) ⊂ Range(T)°. On the other hand, suppose g ∈ Range(T)°. Let v ∈ V. Then [T′(g)](v) = g(T(v)). Since T(v) ∈ Range(T) and g ∈ Range(T)° we have g(T(v)) = 0. Since v ∈ V is arbitrary, T′(g) is the zero vector in V′ and g ∈ Ker(T′). Thus, we have equality.

ii) Assume f = T′(g) ∈ Range(T′) and v ∈ Ker(T). Then f(v) = [T′(g)](v) = g(T(v)) = g(0_W) = 0. Thus, f ∈ Ker(T)°. Now we can complete this with dimension arguments: by Exercise 5, dim(Range(T′)) = dim(Range(T)), and by Exercise 10, dim(Ker(T)°) = dim(V) − dim(Ker(T)) = dim(Range(T)).

iii) Let v ∈ Ker(T) and f = T′(g). Then f(v) = [T′(g)](v) = g(T(v)) = g(0_W) = 0. Since f and v are arbitrary, every functional in Range(T′) annihilates Ker(T). Once again we get equality by a dimension argument.

iv) Let w = T(v) and g ∈ Ker(T′). Then g(w) = g(T(v)) = [T′(g)](v) = 0, since T′(g) is the zero vector in V′. This implies that every g ∈ Ker(T′) annihilates Range(T). Equality follows from dimension arguments.

5.6. Adjoints

1. (2, 3, −1)^tr.

2. −420x² + 396x − 60.

3. (1 0; 0 −1).

4. Let v ∈ V and x ∈ X. Then

⟨(TS)(v), x⟩_X = ⟨v, (TS)*(x)⟩_V.

On the other hand,

⟨(TS)(v), x⟩_X = ⟨T(S(v)), x⟩_X = ⟨S(v), T*(x)⟩_W = ⟨v, S*(T*(x))⟩_V.

Thus, for all v ∈ V and x ∈ X we have

⟨v, (TS)*(x)⟩_V = ⟨v, (S*T*)(x)⟩_V.

This implies that (TS)*(x) = (S*T*)(x) for all x ∈ X, whence (TS)* = S*T*.

5. Let v ∈ V, w ∈ W. Then

⟨T(v), w⟩_W = ⟨v, T*(w)⟩_V = ⟨(T*)*(v), w⟩_W.

Since this holds for all w ∈ W we must have (T*)*(v) = T(v) for all v ∈ V, whence (T*)* = T.

6. Assume T(u) = λu with u ≠ 0. Then for all v ∈ V we have

⟨(T − λI_V)(u), v⟩ = 0.

This implies that

⟨u, (T − λI_V)*(v)⟩ = ⟨u, (T* − λ̄I_V)(v)⟩ = 0.

This implies that Range(T* − λ̄I_V) ⊂ u⊥ is a proper subspace of V. It must then be the case that Ker(T* − λ̄I_V) ≠ {0}, which implies that there exists an eigenvector of T* with eigenvalue λ̄.
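With the standard Hermitian inner product on Cⁿ, the adjoint of a matrix is its conjugate transpose, so Exercises 4-5 can be spot-checked numerically. The sample matrices below are arbitrary choices.

```python
# (TS)* = S* T* and (T*)* = T for the conjugate-transpose adjoint on C^2.
T = [[1 + 1j, 2], [0, 3 - 1j]]
S = [[2, 1j], [1, 1]]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def adjoint(X):
    # conjugate transpose
    return [[X[j][i].conjugate() for j in range(2)] for i in range(2)]

assert adjoint(matmul(T, S)) == matmul(adjoint(S), adjoint(T))
assert adjoint(adjoint(T)) == T
```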


7. Let w ∈ W and assume that T*(w) = 0_V. Then for all v ∈ V, ⟨v, T*(w)⟩_V = 0. Using the definition of T* we have for all v ∈ V that

⟨T(v), w⟩_W = 0.

Since T is invertible, in particular Range(T) = W. This implies that w = 0_W and, consequently, T* is injective, whence invertible. Exercise 3 applied to TT⁻¹ = I_V = T⁻¹T yields (T*)⁻¹ = (T⁻¹)*.

8. Suppose (T*T)(v) = 0_V. It then follows by the definition of T* that

⟨T(v), T(v)⟩_W = ⟨v, (T*T)(v)⟩_V = 0.

Since ⟨,⟩_W is positive definite we can conclude that T(v) = 0_W. However, T injective implies that v = 0_V. Thus, T*T is injective. Since V is finite dimensional it follows that T*T is bijective.

9. It follows from part i) of Theorem (5.22) that if T: V → W is surjective then T* is injective. By Exercise 5, (T*)* = T. Now it follows from Exercise 8 that TT*: W → W is bijective.

10. Let u ∈ U and w ∈ U⊥. Since U is T−invariant we have

⟨T(u), w⟩ = 0.

Making use of the definition of T* we then get, for all u ∈ U,

⟨u, T*(w)⟩ = 0.

This implies that T*(w) ∈ U⊥.

11. Suppose (T*T)(v) = 0. Then

⟨T(v), T(v)⟩ = ⟨v, T*(T(v))⟩ = 0.

By positive definiteness we have T(v) = 0.

12. S*(x,y) = (−y, x).

13. Let B_V be an orthonormal basis of V and B_W an orthonormal basis of W. Set A = M_T(B_V,B_W) and A* = M_{T*}(B_W,B_V). By Theorem (5.23), A* = Ā^tr, the conjugate transpose of A. Since rank(T) = rank(A), rank(T*) = rank(A*) and rank(A) = rank(Ā^tr), it follows that rank(T) = rank(T*).

14. Assume S exists. Then 1 = ⟨v1,v1⟩ = ⟨v1, S*(y)⟩ = ⟨S(v1), y⟩ = ⟨x,y⟩. This proves that ⟨x,y⟩ = 1.

Conversely, assume ⟨x,y⟩ = 1. Let (x2,...,xn) be a basis for y⊥. Since ⟨x,y⟩ = 1 ≠ 0, x ∉ y⊥, so that (x, x2,...,xn) is linearly independent. Set x1 = x and define S: V → V so that S(vi) = xi. Then S is invertible and S(v1) = x1 = x. It remains to show that S*(y) = v1. Let 2 ≤ j ≤ n. Then 0 = ⟨S(vj), y⟩ = ⟨vj, S*(y)⟩. Consequently, S*(y) ∈ Span(v2,...,vn)⊥ = Span(v1). There is then a scalar α such that S*(y) = αv1. However, 1 = ⟨x,y⟩ = ⟨S(v1), y⟩ = ⟨v1, S*(y)⟩ = ⟨v1, αv1⟩ = ᾱ. Thus, α = 1 as required.

5.7. Normed Vector Spaces

1. a) For v = (−4, 2, −1, −2)^tr: ‖v‖₁ = 9, ‖v‖₂ = 5, ‖v‖∞ = 4.

b) For v = (3, −6, 0, 2)^tr: ‖v‖₁ = 11, ‖v‖₂ = 7, ‖v‖∞ = 6.

2. Let x = (−4, 2, −1, 2)^tr and y = (3, −6, 0, 2)^tr. Then x − y = (−7, 8, −1, 0)^tr. Then d₁(x,y) = ‖x−y‖₁ = 16, d₂(x,y) = ‖x−y‖₂ = √114, and d∞(x,y) = ‖x−y‖∞ = 8.
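The numerical answers of Exercises 1-2, together with the comparison ‖v‖∞ ≤ ‖v‖₂ ≤ √2·‖v‖∞ in R² that underlies Exercise 6 below, can be checked in a few lines:

```python
# Verify the l1, l2 and l-infinity computations of Section 5.7, Exercises 1-2,
# and spot-check the two-dimensional norm comparison on sample points.
import math

def l1(v):   return sum(abs(a) for a in v)
def l2(v):   return math.sqrt(sum(a * a for a in v))
def linf(v): return max(abs(a) for a in v)

a = (-4, 2, -1, -2)
b = (3, -6, 0, 2)
assert (l1(a), l2(a), linf(a)) == (9, 5.0, 4)
assert (l1(b), l2(b), linf(b)) == (11, 7.0, 6)

x, y = (-4, 2, -1, 2), (3, -6, 0, 2)
d = [xi - yi for xi, yi in zip(x, y)]
assert d == [-7, 8, -1, 0]
assert l1(d) == 16 and linf(d) == 8
assert math.isclose(l2(d) ** 2, 114)

# ||v||_inf <= ||v||_2 <= sqrt(2) * ||v||_inf on sample points in R^2
for v in [(1.0, 0.0), (3.0, -4.0), (-2.5, 2.5), (0.3, 0.1)]:
    assert linf(v) <= l2(v) <= math.sqrt(2) * linf(v) + 1e-12
```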


3. 1.


4. 1) If x = y then d(x,y) = ‖0‖ = 0. On the other hand, if d(x,y) = 0 then ‖x−y‖ = 0, whence x−y = 0, so x = y.

2) This holds since d(y,x) = ‖y−x‖ = ‖(−1)(x−y)‖ = |−1|·‖x−y‖ = ‖x−y‖ = d(x,y).

3) d(x,z) = ‖x−z‖ = ‖(x−y)+(y−z)‖ ≤ ‖x−y‖ + ‖y−z‖ = d(x,y) + d(y,z).

5. Let x = (x1,...,xn)^tr. If x = 0 then clearly ‖x‖∞ = max{|x1|,...,|xn|} = 0. On the other hand, if ‖x‖∞ = max{|x1|,...,|xn|} = 0, then xi = 0 for all i and x = 0.

Let c be a scalar. Then ‖cx‖∞ = max{|cx1|,...,|cxn|} = max{|c||x1|,...,|c||xn|} = |c|·max{|x1|,...,|xn|} = |c|·‖x‖∞.

Now ‖x+y‖∞ = max{|x1+y1|,...,|xn+yn|}, and |xi+yi| ≤ |xi| + |yi| ≤ ‖x‖∞ + ‖y‖∞. Consequently, max{|x1+y1|,...,|xn+yn|} ≤ ‖x‖∞ + ‖y‖∞.

6. Let B²_r(x) denote the open ball centered at x with radius r with respect to the l2-norm and B^∞_r(x) the open ball centered at x with radius r with respect to the l∞-norm. We must show the following: i) for an arbitrary point y ∈ B²_r(0) there is a positive s such that B^∞_s(y) ⊂ B²_r(0); and ii) for an arbitrary point z ∈ B^∞_r(0) there is a positive t such that B²_t(z) ⊂ B^∞_r(0).

i) Set s = (r − ‖y‖₂)/(2√2). Suppose u ∈ B^∞_s(y). Then ‖u‖₂ = ‖y + (u−y)‖₂ ≤ ‖y‖₂ + ‖u−y‖₂. If u−y = (u1−y1, u2−y2)^tr then

‖u−y‖₂² = (u1−y1)² + (u2−y2)² < 2s² = (r − ‖y‖₂)²/4.

Consequently, ‖u−y‖₂ ≤ (r − ‖y‖₂)/2. Then ‖u‖₂ ≤ ‖y‖₂ + (r − ‖y‖₂)/2 = (r + ‖y‖₂)/2 < r.

ii) Assume z ∈ B^∞_r(0). Set t = (r − ‖z‖∞)/2 and assume v ∈ B²_t(z). We need to prove that v ∈ B^∞_r(0). Now ‖v‖∞ = ‖z + (v−z)‖∞ ≤ ‖z‖∞ + ‖v−z‖∞. Note that ‖v−z‖∞ ≤ ‖v−z‖₂ < t. Therefore, ‖v‖∞ ≤ ‖z‖∞ + (r − ‖z‖∞)/2 = (r + ‖z‖∞)/2 < r.

7. Let {x_k}_{k=1}^∞ be a Cauchy sequence with respect to the l1-norm. Assume x_k = (x_{1k},...,x_{nk})^tr. We claim for each i, 1 ≤ i ≤ n, that {x_{ik}}_{k=1}^∞ is a Cauchy sequence in R or C. Thus, let ε be a positive real number. Since {x_k}_{k=1}^∞ is a Cauchy sequence with respect to the l1-norm there exists a natural number N = N(ε) such that if p, q ≥ N then ‖x_q − x_p‖₁ < ε. Since ‖x_q − x_p‖₁ = Σ_{i=1}^n |x_{iq} − x_{ip}|, it follows for each i, 1 ≤ i ≤ n, and p, q ≥ N that |x_{iq} − x_{ip}| < ε, as required. Since R and C are complete we can conclude that for each i the sequence {x_{ik}}_{k=1}^∞ converges. Set x_i = lim_{k→∞} x_{ik} and set x = (x_1,...,x_n)^tr. We claim that lim_{k→∞} x_k = x. Toward that objective, assume ε is a positive real number. Since lim_{k→∞} x_{ik} = x_i there is a natural number M_i such that if k ≥ M_i then |x_i − x_{ik}| < ε/n. Now set M = max{M_1,...,M_n} and suppose k ≥ M. Then k ≥ M_i so that |x_i − x_{ik}| < ε/n, and therefore ‖x − x_k‖₁ = Σ_{i=1}^n |x_i − x_{ik}| < n × ε/n = ε, so that lim_{k→∞} x_k = x as claimed.

8. Let {x_k}_{k=1}^∞ be a Cauchy sequence with respect to the l∞-norm. Assume x_k = (x_{1k},...,x_{nk})^tr. We claim that for each i, 1 ≤ i ≤ n, {x_{ik}}_{k=1}^∞ is a Cauchy sequence in R or C. Thus, let ε be a positive real number. Since {x_k}_{k=1}^∞ is a Cauchy sequence with respect to the l∞-norm there exists a natural number N = N(ε) such that if p, q ≥ N then ‖x_q − x_p‖∞ < ε. Since ‖x_q − x_p‖∞ = max{|x_{1q} − x_{1p}|,...,|x_{nq} − x_{np}|}, we can conclude for each i, 1 ≤ i ≤ n, and p, q ≥ N that |x_{iq} − x_{ip}| < ε,


establishing our claim. Since R and C are complete we can conclude that for each i the sequence {x_{ik}}_{k=1}^∞ converges. Set x_i = lim_{k→∞} x_{ik} and set x = (x_1,...,x_n)^tr. We claim that lim_{k→∞} x_k = x. Toward that objective, assume ε is a positive real number. Since lim_{k→∞} x_{ik} = x_i there is a natural number M_i such that if k ≥ M_i then |x_i − x_{ik}| < ε. Now set M = max{M_1,...,M_n} and assume that k ≥ M. Then k ≥ M_i so that |x_i − x_{ik}| < ε. Since this is true for each i, 1 ≤ i ≤ n, it follows that ‖x − x_k‖∞ = max{|x_1 − x_{1k}|,...,|x_n − x_{nk}|} < ε, so that, indeed, lim_{k→∞} x_k = x.

9. i) The relation of equivalence is reflexive: take c = d = 1.

ii) The relation of equivalence is symmetric: assume ‖·‖ is equivalent to ‖·‖′ and let c, d be positive real numbers such that for every x, c‖x‖′ ≤ ‖x‖ ≤ d‖x‖′. Then (1/d)‖x‖ ≤ ‖x‖′ ≤ (1/c)‖x‖ for every x.

iii) The relation of equivalence is transitive: assume ‖·‖ is equivalent to ‖·‖′ and ‖·‖′ is equivalent to ‖·‖*. Let a, b be positive real numbers such that for every x, a‖x‖′ ≤ ‖x‖ ≤ b‖x‖′, and let c, d be positive real numbers such that for every x we have c‖x‖* ≤ ‖x‖′ ≤ d‖x‖*. Set e = ac and f = bd. Then for every x, e‖x‖* ≤ ‖x‖ ≤ f‖x‖*.

10. ‖e1+e2‖_p² = ‖e1−e2‖_p² = 2^{2/p}. Therefore

‖e1+e2‖_p² + ‖e1−e2‖_p² = 2 × 2^{2/p}.

On the other hand, 2(‖e1‖_p² + ‖e2‖_p²) = 4.

If p = 2 then 2 × 2^{2/p} = 2 × 2 = 4 and we have equality. Conversely, assume we have equality:

2 × 2^{2/p} = 4,

so that 2^{2/p} = 2, so that 2/p = 1 and p = 2, as asserted.

11. Let m = max{‖e1‖,...,‖en‖} and assume x = (x1,...,xn)^tr is in S₁∞. Then ‖x‖ = ‖Σ_{i=1}^n xi·ei‖ ≤ Σ_{i=1}^n ‖xi·ei‖ by the triangle inequality, and Σ_{i=1}^n ‖xi·ei‖ = Σ_{i=1}^n |xi|·‖ei‖ ≤ Σ_{i=1}^n m|xi| ≤ nm, since |xi| ≤ 1. Thus, S₁∞ is bounded. It remains to show that S₁∞ is closed.
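The parallelogram-law computation of Exercise 10 can be checked numerically for several values of p; equality holds exactly at p = 2.

```python
# In the l_p norm on R^2, ||e1+e2||^2 + ||e1-e2||^2 = 2(||e1||^2 + ||e2||^2)
# holds if and only if p = 2.
import math

def lp(v, p):
    return sum(abs(a) ** p for a in v) ** (1 / p)

e1, e2 = (1, 0), (0, 1)
plus, minus = (1, 1), (1, -1)

for p in (1, 2, 3, 4):
    lhs = lp(plus, p) ** 2 + lp(minus, p) ** 2   # equals 2 * 2^(2/p)
    rhs = 2 * (lp(e1, p) ** 2 + lp(e2, p) ** 2)  # equals 4
    assert math.isclose(lhs, 2 * 2 ** (2 / p))
    assert math.isclose(lhs, rhs) == (p == 2)
```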


Chapter 6

Linear Operators on Inner Product Spaces

6.1. Self-Adjoint Operators

1. This follows immediately since (S+T)* = S* + T* = S + T.

2. (γT)* = γ̄T* = γT since γ ∈ R and T is self-adjoint.

3. i) R* = [½(T+T*)]* = ½[T* + (T*)*] = ½[T* + T] = R.

S* = (½i[−T+T*])* = −½i[−T* + (T*)*] = −½i[−T* + T] = ½i[−T + T*] = S.

ii) R + iS = ½[T+T*] + ½i²[−T+T*] = ½[T+T*] − ½[−T+T*] = ½T + ½T* + ½T − ½T* = T.

iii) If T = R1 + iS1 with R1, S1 self-adjoint, then T* = R1* + (iS1)* = R1* − iS1* = R1 − iS1. Then T + T* = 2R1, so R1 = ½[T+T*] = R. Then iS1 = T − R1 = T − R = iS, so S1 = S.

4. T* = R − iS. Suppose RS = SR. Then

TT* = (R+iS)(R−iS) = R² + (iS)R − R(iS) + S² = R² + S² = (R−iS)(R+iS) = T*T,

and so T is normal. Conversely, assume that T is normal, so that TT* = T*T. Then

TT* = R² + i[SR − RS] + S² = T*T = R² + i[RS − SR] + S².

It follows that SR − RS = RS − SR, from which we conclude that 2SR = 2RS, SR = RS.

5. The dimension is n² as a real vector space.

6. Assume (ST)* = ST. Then ST = (ST)* = T*S* = TS. On the other hand, assume ST = TS. Then (ST)* = T*S* = TS = ST.

7. Let B = (v1,...,vn) be an orthonormal basis for V. Let S(vj) = T(vj) = vj for 3 ≤ j ≤ n. Set S(v1) = v2, S(v2) = v1; T(v1) = v1 + 2v2, T(v2) = 2v1 + 3v2. Since M_S(B,B) and M_T(B,B) are real symmetric, the operators S and T are self-adjoint by Theorem (6.1). However, M_{ST}(B,B) ≠ M_{TS}(B,B).

8. ‖T(v)‖² = ⟨T(v), T(v)⟩ = ⟨v, T*T(v)⟩. Since T is normal, ⟨v, T*T(v)⟩ = ⟨v, TT*(v)⟩ = ⟨T*(v), T*(v)⟩ = ‖T*(v)‖².

9. This follows from Exercise 8.

10. By Exercise 9 we have Ker(T) = Ker(T*). Then Range(T*) = Ker(T)⊥ = Ker(T*)⊥ = Range(T) by Theorem (5.22).

11. If TT* = T² then for v ∈ V, ⟨v, TT*(v)⟩ = ⟨v, T²(v)⟩. This implies that ⟨T*(v), T*(v)⟩ = ⟨T*(v), T(v)⟩, from which we conclude that


⟨T*(v), T*(v) − T(v)⟩ = 0 for all v ∈ V. Suppose T(v) = 0. Then ⟨T*(v), T*(v)⟩ = 0, so that by positive definiteness T*(v) = 0. Thus, Ker(T) ⊂ Ker(T*). However, by Exercise 13 of Section (5.6), rank(T) = rank(T*). Then nullity(T) = nullity(T*). This implies Ker(T) = Ker(T*). Set U = Ker(T) = Ker(T*). By Theorem (5.22), Range(T) = Ker(T*)⊥ = Ker(T)⊥ = Range(T*). Set W = Range(T). Note that U ∩ W = U ∩ U⊥ = {0}. Since dim(U) + dim(U⊥) = dim(V), we have V = U ⊕ W.

Let S be T restricted to W = Range(T) = Range(T*). Note for u, w ∈ W, ⟨S(u), w⟩ = ⟨T(u), w⟩ = ⟨u, T*(w)⟩ = ⟨u, S*(w)⟩. Since T*(w) ∈ W it must be the case that S* is the restriction of T* to W. It now follows that S² = SS*. Since S is invertible, S = S*. We have therefore shown that T restricted to W = Range(T) is equal to T* restricted to W. However, T restricted to U = Ker(T) is equal to T* restricted to U (both are the zero map). We can now conclude T = T* and T is self-adjoint.

12. Let U = Ker(T) and assume that U is proper in V. By Exercise 9, Ker(T*) = U. Since U is T*−invariant, by Theorem (5.22) it follows that U⊥ is T−invariant. Since T is nilpotent and T leaves U⊥ invariant, Ker(T|_{U⊥}) is not just the zero vector. But this contradicts U ∩ U⊥ = {0}. Thus, U = V.

13. (T − λI_V)* = T* − λ̄I_V. Since T and T* commute, and λI_V and λ̄I_V commute with all operators, T − λI_V and T* − λ̄I_V commute and T − λI_V is normal.

14. i) implies ii). Assume T is normal. Then by Exercise 9, Ker(T*) = Ker(T) = W. By Theorem (5.22), U = Range(T) = Ker(T*)⊥ = Ker(T)⊥ = W⊥.

ii) implies iii). Let B_U be an orthonormal basis for U and B_W an orthonormal basis for W. Also, set k = dim(U), l = dim(W). Then B = B_U ∪ B_W is an orthonormal basis for V. Let A denote M_T(B,B) and A* = M_{T*}(B,B). Since T = Proj_(U,W),

A = [I_k 0_{k×l}; 0_{l×k} 0_{l×l}].

Since A is a real symmetric matrix, A* = A^tr = A. Consequently, T* = T.

iii) implies i). A self-adjoint operator is always normal, so there is nothing to prove.
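Exercise 7 of Section (6.1) above constructs two self-adjoint operators whose products in the two orders differ. On the 2-dimensional piece that matters this can be checked directly with the matrices from that solution:

```python
# Both matrices are symmetric (so S and T are self-adjoint), yet ST != TS.
S = [[0, 1], [1, 0]]   # S(v1) = v2, S(v2) = v1
T = [[1, 2], [2, 3]]   # T(v1) = v1 + 2v2, T(v2) = 2v1 + 3v2

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(X):
    return [list(r) for r in zip(*X)]

assert transpose(S) == S and transpose(T) == T
assert matmul(S, T) != matmul(T, S)
```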


6.2. Spectral Theorems

1. Let α1,...,αs be the distinct eigenvalues of T, denote the minimum polynomial µ_T(x) = (x−α1)···(x−αs) of T by F(x), and set Fi(x) = F(x)/(x−αi). Also, let Vi = {v ∈ V | T(v) = αi·v} and Wi = V1 ⊕ ··· ⊕ Vi−1 ⊕ Vi+1 ⊕ ··· ⊕ Vs, so that V = Vi ⊥ Wi. Now (x−αi) and Fi(x) are relatively prime, so there are polynomials ai(x), bi(x) such that

ai(x)(x−αi) + bi(x)Fi(x) = 1.

Now let vi ∈ Vi, wi ∈ Wi. Since (T − αi·I_V)(vi) = 0 we have

bi(T)Fi(T)(vi) = [ai(T)(T − αi·I_V) + bi(T)Fi(T)](vi) = I_V(vi) = vi.

On the other hand, bi(T)Fi(T)(wi) = 0. Now set gi(x) = ᾱi·bi(x)Fi(x). Then gi(T)(vi) = ᾱi·vi and gi(T)(wi) = 0. Now set g(x) = g1(x) + ··· + gs(x). Then g(T) = T*.

2. i) implies ii). Let α1,...,αs be the distinct eigenvalues of T and set Vi = {v ∈ V | T(v) = αi·v}. Then V = V1 ⊥ ··· ⊥ Vs. Note that the eigenvalues of T* are


ᾱ1,...,ᾱs, and that {v ∈ V | T*(v) = ᾱi·v} = Vi. Note that every subspace of Vi is T− and T*−invariant.

Now let U be a T−invariant subspace and set Ui = U ∩ Vi. Then U = U1 ⊥ ··· ⊥ Us. It follows by the above remark that each Ui is T*−invariant, and therefore U is T*−invariant.

ii) implies iii). Assume U is T−invariant. By hypothesis, U is T*−invariant. Then by Exercise 10 of Section (5.6), U⊥ is T−invariant.

iii) implies i). Since U + U⊥ = V and U ∩ U⊥ = {0}, this implies that T is completely reducible. Since the field is the complex numbers this also implies that T is diagonalizable. Now let α1,...,αs be the distinct eigenvalues of T and set Vi = {v ∈ V | T(v) = αi·v}. We then have that V = V1 ⊕ ··· ⊕ Vs. Further, let Vi′ denote V1 ⊕ ··· ⊕ Vi−1 ⊕ Vi+1 ⊕ ··· ⊕ Vs, so that V = Vi ⊕ Vi′. Note that if X is a T−invariant subspace then X = X1 ⊕ ··· ⊕ Xs where Xi = X ∩ Vi.

Now set W = Vj⊥. Then W = W1 ⊕ ··· ⊕ Ws where Wi = W ∩ Vi. However, Wj = W ∩ Vj = {0} and therefore W ⊂ Vj′. However, since V = Vj ⊕ Vj⊥ = Vj ⊕ Vj′, we must have dim(W) = dim(Vj′). Consequently, Vj⊥ = W = Vj′. In particular, for i ≠ j, Vi ⊂ Vj⊥, that is, Vi ⊥ Vj. Now let Bi be an orthonormal basis for Vi, 1 ≤ i ≤ s, and set B = B1 ∪ ··· ∪ Bs. Then B is an orthonormal basis for V and M_T(B,B) is diagonal. It follows from Theorem (6.3) that T is normal.

3. A = (4 −i; i 4). Then Ā^tr = A; thus, T* = T. With respect to the orthonormal basis ((1/√2)(1, i)^tr, (1/√2)(1, −i)^tr) the matrix of T is (5 0; 0 3).

4. ((1/√3)(1,1,1)^tr, (1/√2)(1,−1,0)^tr, (1/√6)(1,1,−2)^tr).

5. Assume that b = c. Then (1,1,−2)^tr is an eigenvector with eigenvalue b = c. Then with respect to the orthonormal basis ((1/√3)(1,1,1)^tr, (1/√2)(1,−1,0)^tr, (1/√6)(1,1,−2)^tr) the matrix of T is diagonal with diagonal entries a, b, b, so that by the spectral theorem T is self-adjoint.

Conversely, assume that T is self-adjoint. Since (1,1,1)^tr and (1,−1,0)^tr are eigenvectors it must be the case that Span((1,1,1)^tr, (1,−1,0)^tr)⊥ = Span((1,1,−2)^tr) is T−invariant; equivalently, (1,1,−2)^tr is an eigenvector, say with eigenvalue d. However, (1,1,−2)^tr = (1,−1,0)^tr + 2(0,1,−1)^tr. Applying T we have

d(1,1,−2)^tr = T((1,−1,0)^tr) + 2T((0,1,−1)^tr) = b(1,−1,0)^tr + 2c(0,1,−1)^tr.


On the other hand, d(1,1,−2)^tr = d(1,−1,0)^tr + 2d(0,1,−1)^tr. Subtracting gives

(d−b)(1,−1,0)^tr + 2(d−c)(0,1,−1)^tr = (0,0,0)^tr.

Since ((1,−1,0)^tr, (0,1,−1)^tr) is linearly independent, b = c = d.

6. Assume T is self-adjoint. Since (1,1,1,1)^tr, (1,−1,1,−1)^tr, (1,1,−1,−1)^tr are eigenvectors, it follows that Span((1,1,1,1)^tr, (1,−1,1,−1)^tr, (1,1,−1,−1)^tr)⊥ = Span((1,−1,−1,1)^tr) is T−invariant; equivalently, (1,−1,−1,1)^tr is an eigenvector for T.

Conversely, assume (1,−1,−1,1)^tr is an eigenvector of T. By normalizing (dividing each vector by 2) we obtain an orthonormal basis of eigenvectors. We need to know that the corresponding eigenvalues are all real. If they are not all distinct then, since three are distinct and real, the fourth coincides with one of them and must be real. Therefore we may assume the eigenvalues are all distinct. Then the minimum polynomial of T has degree four and the eigenvalues are its roots. Since T is a real operator, the minimum polynomial is a real polynomial. If it had a complex root then it would have to have a second (the conjugate). However, since three of the roots are real, it must then be the case that the fourth root is real.

7. If T is self-adjoint we have seen that the eigenvalues are all real. Assume that T is normal and all its eigenvalues are real. Let B be an orthonormal basis such that M_T(B,B) is diagonal. Then M_T(B,B) is a real diagonal matrix. In particular, M_{T*}(B,B) = M_T(B,B)^tr = M_T(B,B), from which it follows that T* = T.

8. Let α1,...,αs be the distinct eigenvalues of S and β1,...,βt the distinct eigenvalues of T. Set Vi = {v ∈ V | S(v) = αi·v}. Then V = V1 ⊕ ··· ⊕ Vs. Claim each Vi is T−invariant: let v ∈ Vi. Then S(T(v)) = (ST)(v) = (TS)(v) = T(S(v)) = T(αi·v) = αi·T(v). Now set Wj = {u ∈ V | T(u) = βj·u}, so that V = W1 ⊕ ··· ⊕ Wt. Likewise, each Wj is S−invariant. Since Vi is T−invariant it follows that Vi = Vi1 ⊕ ··· ⊕ Vit where Vij = Vi ∩ Wj. Note that if either i ≠ i′ or j ≠ j′ then Vij ⊥ Vi′j′. Thus, V is the orthogonal direct sum of the Vij, 1 ≤ i ≤ s, 1 ≤ j ≤ t. Let Bij be an orthonormal basis of Vij and order the collection of bases Bij lexicographically. Then ∪_{i,j} Bij is an orthonormal basis of V. Thus, there exists an orthonormal basis of eigenvectors for each of S and T.

9. If T is invertible there is nothing to prove, so assume that Ker(T) ≠ {0}. Let α1 = 0, α2,...,αs be the distinct eigenvalues of T. Set Vi = {v ∈ V | T(v) = αi·v}. Since T is normal it is diagonalizable and V = V1 ⊕ ··· ⊕ Vs. Let i > 1 and v ∈ Vi. Then T^k(v) = αi^k·v ≠ 0 and v ∈ Range(T^k). Thus, Range(T^k) = V2 ⊕ ··· ⊕ Vs = Range(T), so rank(T^k) = rank(T). Then by the rank-nullity theorem nullity(T^k) = nullity(T). However, Ker(T) ⊂ Ker(T^k), and therefore we have equality: Ker(T) = Ker(T^k).

10. If T is completely reducible on a complex space then there exists a basis B = (v1,...,vn) of eigenvectors. Define ⟨Σ_{i=1}^n ai·vi, Σ_{j=1}^n bj·vj⟩ = Σ_{i=1}^n ai·b̄i. With respect to this inner product B is an orthonormal basis. By the spectral theorem T is normal.

11. Let B_U be an orthonormal basis of U consisting of eigenvectors of T|_U and B_{U⊥} an orthonormal basis of U⊥ consisting of eigenvectors of T|_{U⊥}. Since T|_U is self-adjoint, for each vector u ∈ B_U, T(u) = αu for some

02/06/15 3:11 pm

6.3. Normal Operators on Real Inner Product Spaces

63

c1 α1 c1 sume [u]B = ... . Then [T (u)]B = ... and cn α n cn Now B = BU BU ⊥ is an orthonormal basis of V consist2 2 T (u), u = α |c | + . . . α |c | ∈ R. 1 1 n n ing of eigenvectors of T. Thus T is normal. Since all the eigenvalues are real it follows that T is self-adjoint.

real number α. Since T|U ⊥ is self-adjoint, for each vector v ∈ BU ⊥ , T (v) = βv for some real number v.

12. This is false. Consider the operator which satisfies T (e1 ) = e1 , T (e2 ) = e2 , T (e3 ) = 2e3 , T (e4 ) = 2e4 where ei is the ith standard basis vector of R4 equipped with the dot product. T is self-adjoint. Now let U = Span(e1 , e3 ) and W = Span(e1 + e2 , e3 + e4 ). Then W is T −invariant (the vectors e1 + e2 , e3 + e4 are eigen- In all the following S is the standard orthonormal basis n vectors). Moreover, R4 = U ⊕ W. However, U ⊥ = of Rn . Span(e2 , e4 ). 1. Let T be the operator on R4 such that MT (S4 , S4 ) = 13. If T is self-adjoint then U = Range(T ) = 0 1 0 0 ⊥ ⊥ ⊥ Ker(T ) = W , whence W = U . Conversely, as- −1 0 0 0 sume that W = Ker(T )⊥ . Choose an orthonormal ba- 0 0 0 2. sis BU of U and an orthonormal basis BW of W . Since 0 0 −2 0 U ⊥ W it follows that B = BU BW is an orthonormal basis of V . Now MT (B, B) is diagonal with diLet T be the operator on R4 such that MT (S4 , S4 ) = agonal entries 0 and 1 from which we conclude that T 2. 0 1 0 0 is self-adjoint. Alternatively, the operator T is equal to P roj(U,W ) . By Exercise 14 of Section (6.1), T is self- −1 0 0 0. 0 0 0 1 adjoint if and only if W = U ⊥ . 0 0 −1 0 14. Since T is skew-Hermitian, T ∗ = −T. Assume v is 3. By Lemma (6.5) there is an orthonormal basis S = an eigenvector with eigenvalue α ∈ C \ R. Let v be a unit (v1 , v2 ) and real numbers α, β with β = 0 such that the vector with eigenvalue α = 0. Then α −β matrix of T with respect to S is A = . β α α = αv, v = αv, v = T (v), v α β . Then A is the matrix of T ∗ with Let A = −β α = v, T ∗ (v) = v, −T (v) = v, −αv respect to S. It therefore suffices to prove that there is a linear polynomial f (x) such that A = f (A). Set f (x) = −x + 2α. = −αv, v = −α. 4. Since T is normal it is completely reducible. Since the minimum polynomial is a real irreducible quadratic Thus, α + α = 0 which implies that the real part of α is the minimal T −invariant subspaces have dimension 2. It zero and α a pure imaginary number. 
follows that there are T −invariant subspaces U1 , . . . , Us 15. Let B = (v1 , . . . , vn ) be a orthonormal basis each of dimension 2 such that V = U1 ⊥ · · · ⊥ Us . of V such that MT (B, B) is diag{α1 , . . . , αn }. As- Assume that the roots of µT (x) are α ± iβ with β = 0.


Set A = [α −β; β α]. There are then orthonormal bases Bj of Uj such that M_{T|Uj}(Bj, Bj) = A. If we set B = B1 ∪ ··· ∪ Bs then B is an orthonormal basis of V and M_T(B, B) is the block diagonal matrix with s blocks equal to A.

Now set A′ = [α β; −β α]. Then the matrix of T* with respect to B is the block diagonal matrix with s blocks equal to A′. As we saw in Exercise 3, there is a real linear polynomial f(x) such that A′ = f(A). It then follows that M_{T*}(B, B) = f(M_T(B, B)), from which we conclude that T* = f(T).

5. Since T is completely reducible there are distinct real numbers α1, ..., αs and distinct real irreducible quadratics p1(x), ..., pt(x) such that µ_T(x) = (x − α1) ··· (x − αs) p1(x) ··· pt(x). Let the roots of pk(x) be ak ± i bk, where ak, bk ∈ R. Also, for 1 ≤ j ≤ s set gj(x) = µ_T(x)/(x − αj) and for 1 ≤ k ≤ t set hk(x) = µ_T(x)/pk(x). Further, for 1 ≤ j ≤ s set Uj = {v ∈ V | T(v) = αj v} and for 1 ≤ k ≤ t set Wk = {v ∈ V | pk(T)(v) = 0}. Then V = U1 ⊥ ··· ⊥ Us ⊥ W1 ⊥ ··· ⊥ Wt.

Let Sj be an orthonormal basis of Uj, 1 ≤ j ≤ s, and S′k an orthonormal basis of Wk such that the matrix of T restricted to Wk with respect to S′k is block diagonal with blocks equal to Ak = [ak −bk; bk ak].

Set A′k = [ak bk; −bk ak]. As we have seen in Exercise 3 there is a linear polynomial fk(x) such that A′k = fk(Ak).

Now, for each j, (x − αj) and gj(x) are relatively prime and consequently there are polynomials cj(x), dj(x) such that cj(x)(x − αj) + dj(x)gj(x) = 1. Set Fj(x) = αj dj(x)gj(x).

Also, for each k, pk(x) and hk(x) are relatively prime and so there are polynomials Ck(x) and Dk(x) such that Ck(x)pk(x) + Dk(x)hk(x) = 1. Set Gk(x) = fk(x)Dk(x)hk(x).

Now set f(x) = F1(x) + ··· + Fs(x) + G1(x) + ··· + Gt(x). Then f(T) = T*.

6. By Exercise 5 there exists a polynomial f(x) such that T* = f(T). Since (T*)* = T there is also a polynomial g(x) such that T = g(T*). Now assume that ST = TS. Then S commutes with f(T) = T*. Likewise, if S commutes with T* then S commutes with g(T*) = T.

7. The hypothesis implies that T is cyclic. Since TS = ST it follows that there is a polynomial f(x) such that S = f(T). Since µ_T(x) is quadratic, f(x) is linear, which implies that S ∈ Span(IV, T).

8. Under the given hypothesis, there are distinct real irreducible quadratic polynomials p1(x), ..., ps(x) such that µ_T(x) = p1(x) ··· ps(x) and, if Uj = {v ∈ V | pj(T)(v) = 0}, then dim(Uj) = 2. Moreover, V = U1 ⊥ ··· ⊥ Us. If U is a T-invariant subspace then U = (U ∩ U1) ⊕ ··· ⊕ (U ∩ Us). It therefore suffices to show that each Uj is S-invariant. Since ST = TS it follows that S commutes with pj(T). Assume now that v ∈ Uj. Then pj(T)(S(v)) = (pj(T)S)(v) = (S pj(T))(v) = S(pj(T)(v)) = S(0) = 0. Thus, S(v) ∈ Uj.

9. Continue with the notation of Exercise 8. It suffices to prove that S restricted to each Uj is normal. However, this follows from Exercise 7.

10. Under the given hypotheses, T is a cyclic operator. Therefore dim(C(T)) = dim(V) = deg(µ_T(x)), which is even.

11. We claim that dim(C(T)) = 8. Let S = (v1, v2, v3, v4) be an orthonormal basis such that M_T(S, S) = [1 −1 0 0; 1 1 0 0; 0 0 1 −1; 0 0 1 1]. For an operator S′ commuting with T, the vector S′(v1) can be chosen arbitrarily. Then S′(v2) = T(S′(v1)) − S′(v1). Likewise, S′(v3) can be chosen arbitrarily and S′(v4) = T(S′(v3)) − S′(v3).
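The identity behind Exercises 3 and 5, that the adjoint of such a 2 × 2 block is the linear polynomial f(x) = −x + 2α evaluated at the block, can be checked numerically (a sketch with arbitrary sample values, numpy assumed):

```python
import numpy as np

alpha, beta = 2.0, 5.0  # sample values; any real alpha, beta with beta != 0 work
A = np.array([[alpha, -beta],
              [beta, alpha]])

# f(x) = -x + 2*alpha sends A to its transpose, which is the matrix of T*.
def f(M):
    return -M + 2 * alpha * np.eye(2)

assert np.allclose(f(A), A.T)
```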


12. Since T is skew-symmetric it is normal, since T commutes with −T. Note that for any vector v we have ⟨T(v), v⟩ = ⟨v, T*(v)⟩ = ⟨v, −T(v)⟩, which implies that 2⟨T(v), v⟩ = 0. Thus, v ⊥ T(v). This implies that if v is an eigenvector then T(v) = 0. Since T is invertible there are no eigenvectors. Thus, a minimal T-invariant subspace has dimension 2. Let p(x) be a real irreducible quadratic polynomial dividing µ_T(x) with roots α ± iβ, where α, β ∈ R, β ≠ 0.

Let U be a 2-dimensional T-invariant subspace of V such that p(T) restricted to U is the zero operator. Let S = (u1, u2) be an orthonormal basis of U such that the matrix of T with respect to S is A = [α −β; β α]. Since T is a skew-symmetric operator the matrix A is skew-symmetric. But this implies that α = 0 and the roots of p(x) are purely imaginary.

6.4. Unitary and Orthogonal Operators

1. Suppose T(v) = 0. Then 0 = ∥T(v)∥ = ∥v∥. By positive definiteness we conclude that v = 0, so Ker(T) = {0} and T is injective. Since V is finite dimensional, by the "half is good enough" theorem T is bijective. Let v ∈ V and set u = T^{−1}(v). Then v = T(u). Therefore, ∥T^{−1}(v)∥ = ∥u∥ = ∥T(u)∥ since T is an isometry. However, T(u) = v, so we have shown that ∥T^{−1}(v)∥ = ∥v∥ and T^{−1} is an isometry.

2. Let S, T be isometries of V and let v be an arbitrary vector in V. Since S is an isometry, ∥S(T(v))∥ = ∥T(v)∥. Since T is an isometry, ∥T(v)∥ = ∥v∥. Thus, ∥(ST)(v)∥ = ∥S(T(v))∥ = ∥v∥ and ST is an isometry.

3. Let u be an arbitrary vector and assume [u]_S = (c1, ..., cn)^tr, so that u = c1 v1 + ··· + cn vn. Then ∥u∥² = Σ_{j=1}^n |cj|². By the definition of T, T(u) = Σ_{j=1}^n λj cj vj. Then

∥T(u)∥² = Σ_{j=1}^n |λj cj|² = Σ_{j=1}^n |λj|²|cj|² = Σ_{j=1}^n |cj|² = ∥u∥².

4. We recall that for a real finite dimensional inner product space and S an orthonormal basis, M_{T*}(S, S) = M_T(S, S)^tr. So, assume T is an isometry. Then T* = T^{−1}, whence A^{−1} = M_{T^{−1}}(S, S) = M_{T*}(S, S) = A^tr.

Conversely, assume A^{−1} = A^tr. Since A^{−1} = M_{T^{−1}}(S, S) and A^tr = M_{T*}(S, S), we can conclude that T^{−1} = T* and therefore T is an isometry.
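Exercises 1 and 4 can be illustrated numerically with a rotation matrix, the standard example of a real isometry (a sketch, numpy assumed):

```python
import numpy as np

theta = 0.7  # arbitrary angle
# An orthogonal matrix: the matrix of a real isometry in an orthonormal basis.
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta), np.cos(theta)]])

# Exercise 4: isometry <=> A^{-1} = A^tr.
assert np.allclose(np.linalg.inv(A), A.T)

# Exercise 1: norms are preserved by A and by its inverse.
v = np.array([3.0, -4.0])
assert np.isclose(np.linalg.norm(A @ v), np.linalg.norm(v))
assert np.isclose(np.linalg.norm(np.linalg.inv(A) @ v), np.linalg.norm(v))
```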

5. Let S be the operator such that S(uj) = vj. Then S is a unitary operator and, consequently, M_S(S1, S1) is a unitary matrix. However, M_{IV}(S2, S1) = M_S(S1, S1), and so the change of basis matrix, M_{IV}(S2, S1), is a unitary matrix.

6. Let S be the operator such that S(uj) = vj. Then S is an orthogonal operator and M_S(S1, S1) is an orthogonal matrix. However, M_{IV}(S2, S1) = M_S(S1, S1) and consequently, M_{IV}(S2, S1) is an orthogonal matrix.

7. Let V = C^n equipped with the usual inner product, let S be the operator on V given by multiplication by A, and let S = (e1, ..., en) be the standard (orthonormal) basis. Assume AĀ^tr = Ā^tr A. Since M_{S*}(S, S) = Ā^tr it follows that S is normal. By the complex spectral theorem there exists an orthonormal basis S′ consisting of eigenvectors for S; equivalently, M_S(S′, S′) is a diagonal matrix. Set Q = M_{IV}(S′, S). By Exercise 5, Q is a unitary matrix. Then Q^{−1}AQ = M_S(S′, S′) is a diagonal matrix.

Conversely, assume there is a unitary matrix Q such that Q^{−1}AQ is diagonal. Let T be the operator such that


M_T(S, S) = Q. T is a unitary operator and therefore, if T(ej) = uj, then S′ = (u1, ..., un) is an orthonormal basis. Moreover, the vectors uj are eigenvectors for S and, consequently, by the complex spectral theorem S is normal, that is, SS* = S*S. Since A = M_S(S, S) and Ā^tr = M_{S*}(S, S), it follows that AĀ^tr = Ā^tr A.

8. Let V = R^n equipped with the dot product, let S be the operator on V given by multiplication by A, and let S = (e1, ..., en) be the standard (orthonormal) basis. Assume A = A^tr. Since M_{S*}(S, S) = A^tr it follows that S is self-adjoint. By the real spectral theorem there exists an orthonormal basis S′ consisting of eigenvectors for S; equivalently, M_S(S′, S′) is a diagonal matrix. Set Q = M_{IV}(S′, S). By Exercise 6, Q is an orthogonal matrix. Then Q^tr AQ = Q^{−1}AQ = M_S(S′, S′) is a diagonal matrix.

Conversely, assume there is an orthogonal matrix Q such that Q^tr AQ = Q^{−1}AQ is diagonal. Let T be the operator such that M_T(S, S) = Q. T is an orthogonal operator and therefore, if T(ej) = uj, then S′ = (u1, ..., un) is an orthonormal basis. Moreover, the vectors uj are eigenvectors for S and, consequently, by the real spectral theorem, S is self-adjoint, that is, S* = S. Since A = M_S(S, S) and A^tr = M_{S*}(S, S), it follows that A^tr = A.

9. Assume T is a real isometry. Since TT* = T*T = IV it follows that T is a normal operator. Consequently, T is completely reducible and we can express V as an orthogonal sum U1 ⊥ ··· ⊥ Us ⊥ W1 ⊥ ··· ⊥ Wt of T-invariant subspaces, where dim(Uj) = 1, dim(Wk) = 2 and Wk does not contain an eigenvector of T. Moreover, there is an orthonormal basis Sk of Wk such that the matrix of T restricted to Wk with respect to Sk has the form [αk −βk; βk αk] with βk > 0.

Let uj ∈ Uj be a vector of norm 1. Since Uj is T-invariant, uj is an eigenvector of T. Suppose T(uj) = aj uj. Then 1 = ∥uj∥ = ∥T(uj)∥ = ∥aj uj∥ = |aj| ∥uj∥ = |aj|. Thus, aj = ±1.

Finally, assume that Sk = (v1k, v2k). Then T(v1k) = αk v1k + βk v2k. Since T is an isometry, 1 = ∥v1k∥² = ∥T(v1k)∥² = ∥αk v1k + βk v2k∥² = αk² + βk². Since βk > 0 there exists a unique θk, 0 < θk < π, such that αk = cos θk, βk = sin θk. Now by setting S = (u1, ..., us) ∪ S1 ∪ ··· ∪ St we obtain an orthonormal basis of V, and the matrix of T is block diagonal with entries aj, 1 ≤ j ≤ s, and 2 × 2 blocks Ak = [cos θk −sin θk; sin θk cos θk] for 1 ≤ k ≤ t.

Conversely, assume that there exists an orthonormal basis S such that M_T(S, S) is block diagonal and each block is either 1 × 1 with entry ±1 or 2 × 2 of the form [cos θ −sin θ; sin θ cos θ] for some θ, 0 ≤ θ < π.

Assume the basis has been ordered so that the first s blocks are 1 × 1 and the remaining blocks, A1, ..., At, are 2 × 2, with Ak = [cos θk −sin θk; sin θk cos θk]. Suppose S = (u1, ..., us) ∪ S1 ∪ ··· ∪ St, where Sk = (v1k, v2k) consists of two orthogonal unit vectors. By our assumption on M_T(S, S), T(uj) = aj uj where aj ∈ {−1, 1}. Also, T(v1k) = cos θk v1k + sin θk v2k and T(v2k) = −sin θk v1k + cos θk v2k. Then T(Sk) = (T(v1k), T(v2k)) is an orthonormal basis for Span(v1k, v2k). Thus, (T(u1), ..., T(us)) ∪ T(S1) ∪ ··· ∪ T(St) is an orthonormal basis of V and T is an isometry.

10. If T is an isometry then T*T = TT* = IV. If T is self-adjoint then T* = T and therefore T² = IV. It follows that the minimum polynomial of T divides x² − 1 = (x − 1)(x + 1). Consequently, the eigenvalues of T are all ±1. Since T is self-adjoint there exists an orthonormal basis S such that M_T(S, S) is diagonal. Since the eigenvalues of T are all ±1, the diagonal entries of M_T(S, S) are all ±1.

11. Assume that T is a self-adjoint operator. Then there exists an orthonormal basis S = (v1, ..., vn) consisting


of eigenvectors of T. Moreover, if T(vj) = aj vj then aj ∈ R. Suppose now that T² = IV. Then the minimum polynomial of T divides x² − 1 and all the aj ∈ {1, −1}. Then (T(v1), ..., T(vn)) is an orthonormal basis and T is an isometry.

Conversely, assume that T is an isometry. It follows from Exercise 10 that T² = IV.

12. Let T be the operator on C² which has matrix [i 0; 0 1] with respect to the standard basis.

13. Assume U is T-invariant. Since T is an isometry, by Exercise 1, T is bijective. In particular, T restricted to U is injective and, since U is T-invariant, T restricted to U is bijective. Assume w ∈ U⊥ and u ∈ U is arbitrary. We need to prove that T(w) ⊥ u. Since T restricted to U is bijective there exists v ∈ U such that T(v) = u. Now ⟨T(w), u⟩ = ⟨T(w), T(v)⟩ = ⟨T*T(w), v⟩. Since T is an isometry, T*T = IV and therefore ⟨T*T(w), v⟩ = ⟨w, v⟩ = 0.

14. Assume A is upper triangular and a unitary matrix. Then the diagonal entries of A are non-zero since A is invertible. The inverse of an upper triangular matrix is upper triangular. On the other hand, since A is unitary, A^{−1} = Ā^tr is lower triangular. So A^{−1} is both upper and lower triangular and hence diagonal, and therefore A is diagonal.

15. Let (u1, ..., uk) be an orthonormal basis of U1 and set u′j = R(uj). Then (u′1, ..., u′k) is an orthonormal basis of U2. Extend (u1, ..., uk) to an orthonormal basis (u1, ..., un) of V and extend (u′1, ..., u′k) to an orthonormal basis (u′1, ..., u′n) of V. Let S be the operator on V such that S(uj) = u′j for 1 ≤ j ≤ n. Then S restricted to U1 is equal to R and, since S takes an orthonormal basis of V to an orthonormal basis of V, S is an isometry.

16. Since the dimension of V is odd, S must have an eigenvector v. Since ∥v∥ = ∥S(v)∥, the eigenvalue of v is ±1. Then S²(v) = v.

17. Let S_U be an orthonormal basis of U and S_{U⊥} an orthonormal basis of U⊥. Then S = S_U ∪ S_{U⊥} is an orthonormal basis of V. M_T(S, S) is a diagonal matrix with ±1 on the diagonal and therefore T is an isometry and self-adjoint.

18. Set S = (u1, u2, u3, u4) = ((1,1,1,1)^tr, (1,1,−1,−1)^tr, (1,−1,1,−1)^tr, (1,−1,−1,1)^tr) with respect to the standard basis, and (v1, v2, v3, v4) = ((1,1,1,1)^tr, (1,1,−1,−1)^tr, (1,−1,1,−1)^tr, (1,0,0,0)^tr).

A necessary and sufficient condition for an operator Q to satisfy Q^{−1}SQ = T is that Q(vj) = aj uj with aj ≠ 0. If Q is an isometry then we must have Q(vj) = ±uj for j = 1, 2, 3 in order to preserve norms, and Q(v4) = ±(1/2)u4. However, since ⟨v3, v4⟩ = 1 we must have ⟨Q(v3), Q(v4)⟩ = 1, whereas Q(v3) and Q(v4) are orthogonal.

6.5. Positive Operators, Polar Decomposition and Singular Value Decomposition

1. Assume S is a positive operator and S² = T. Then ST = TS. Since both are self-adjoint there exists an orthonormal basis S = (v1, ..., vn) consisting of eigenvectors for both S and T. Let S(vj) = aj vj and T(vj) = bj vj, and assume the notation has been chosen so that bj ≠ 0 for j ≤ k and bj = 0 for j > k, so that Ker(T) = Span(vk+1, ..., vn). Now bj vj = T(vj) = S²(vj) = aj² vj. It follows that if j > k then aj = 0. If


j ≤ k then aj, bj > 0 and aj² = bj, so that aj is uniquely determined. Thus, S is unique.

2. Since T is normal there exists an orthonormal basis S = (v1, ..., vn) consisting of eigenvectors of T. Assume T(vj) = aj vj. Let bj ∈ C be such that bj² = aj and let S be the operator such that S(vj) = bj vj. Then S² = T.

3. Under the hypothesis there is an orthonormal basis S = (v1, v2) such that M_T(S, S) = [a −b; b a] with b > 0. Set r = √(a² + b²), A = a/r, B = b/r. Then A² + B² = 1 and B > 0, so that there exists a unique θ, 0 < θ < π, such that A = cos θ, B = sin θ. Let S be the operator such that M_S(S, S) = [√r cos(θ/2) −√r sin(θ/2); √r sin(θ/2) √r cos(θ/2)]. Then S² = T.

4. Under the hypothesis there are T-invariant subspaces U1, ..., Un, each of dimension two, such that for j ≠ k, Uj ⊥ Uk and V = U1 ⊕ ··· ⊕ Un. By Exercise 3, there exists an operator Sj such that Ker(Sj) = U1 ⊕ ··· ⊕ Uj−1 ⊕ Uj+1 ⊕ ··· ⊕ Un and such that Sj², restricted to Uj, is the restriction of T to Uj. Set S = S1 + ··· + Sn. Then S² = T.

5. Let S, T be positive operators. Then (S + T)* = S* + T* = S + T, and so S + T is self-adjoint. Now let v ∈ V. Then ⟨S(v), v⟩ ≥ 0 and ⟨T(v), v⟩ ≥ 0. It now follows that

⟨(S + T)(v), v⟩ = ⟨S(v) + T(v), v⟩ = ⟨S(v), v⟩ + ⟨T(v), v⟩ ≥ 0 + 0 = 0.

6. Since T is a positive operator it is self-adjoint. Since c is a real number, cT is self-adjoint. Now let v ∈ V. We then have ⟨(cT)(v), v⟩ = ⟨cT(v), v⟩ = c⟨T(v), v⟩ ≥ 0, the latter inequality since ⟨T(v), v⟩ ≥ 0 and c > 0.

7. Since T is a positive operator there exists an orthonormal basis S = (v1, ..., vn) of eigenvectors such that if T(vj) = aj vj then aj ≥ 0. Assume that T is invertible. Then all aj > 0. Let v = c1 v1 + ··· + cn vn ≠ 0. Then

⟨T(v), v⟩ = ⟨Σ_{j=1}^n aj cj vj, Σ_{j=1}^n cj vj⟩ = Σ_{j=1}^n aj cj².

Since aj > 0 and cj ∈ R, aj cj² ≥ 0. Moreover, some cj ≠ 0 and therefore Σ_{j=1}^n aj cj² > 0. On the other hand, suppose T is not invertible and v ∈ Ker(T), v ≠ 0. Then ⟨T(v), v⟩ = 0.

8. Since T is an invertible positive operator there exists an orthonormal basis S = (v1, ..., vn) such that T(vj) = aj vj with aj ∈ R+. Let S be the operator such that S(vj) = (1/aj) vj. Then S = T^{−1} since ST(vj) = vj for all j. Since 1/aj > 0 it follows that T^{−1} = S is a positive operator.

9. 1) [ , ] is positive definite: Assume v ≠ 0. Then [v, v] = ⟨T(v), v⟩ > 0 by Exercise 7.

2) [ , ] is additive in the first argument: [v1 + v2, w] = ⟨T(v1 + v2), w⟩ = ⟨T(v1) + T(v2), w⟩ = ⟨T(v1), w⟩ + ⟨T(v2), w⟩ = [v1, w] + [v2, w].

3) [ , ] is homogeneous in the first argument: [cv, w] = ⟨T(cv), w⟩ = ⟨cT(v), w⟩ = c⟨T(v), w⟩ = c[v, w].

4) [ , ] satisfies conjugate symmetry: [w, v] = ⟨T(w), v⟩ = ⟨w, T*(v)⟩. However, since T is a positive operator it is self-adjoint and T* = T. Thus ⟨w, T*(v)⟩ = ⟨w, T(v)⟩, which is the complex conjugate of ⟨T(v), w⟩ = [v, w].

10. [S(v), w] = ⟨T(S(v)), w⟩ = ⟨S(v), T(w)⟩ since T is self-adjoint. We therefore need to show that [v, (T^{−1}S*T)(w)] = ⟨S(v), T(w)⟩. By the definition of [ , ] we have

[v, (T^{−1}S*T)(w)] = ⟨T(v), (T^{−1}S*T)(w)⟩ = ⟨v, T(T^{−1}S*T)(w)⟩ = ⟨v, (TT^{−1})(S*T)(w)⟩ = ⟨v, S*(T(w))⟩ = ⟨S(v), T(w)⟩,

as was required.

11. Define [ , ] by [v, w] = ⟨T(v), w⟩. By Exercise 9 this is an inner product on V. Set S = RT. By Exercise 10 the adjoint, S′, of S with respect to [ , ] is S′ = T^{−1}S*T. Now S* = (RT)* = T*R* = TR since both R and T are self-adjoint. Thus, S′ = T^{−1}(TR)T = RT = S. Consequently, RT is self-adjoint with respect to [ , ] and therefore is diagonalizable with real eigenvalues. The operator TR is similar to RT since TR = T(RT)T^{−1}, and since RT is diagonalizable with real eigenvalues so is TR.

12. Assume T is a positive operator and let S = (v1, ..., vn) be an orthonormal basis for V such that T(vj) = aj vj, where aj ∈ R+ ∪ {0}. Now assume that T is an isometry. Then we must have |aj| = 1 for all j, whence aj = 1 for all j and T = IV.
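The square-root construction of Exercises 1–4 can be checked numerically: build the positive square root of a positive matrix from an orthonormal eigenbasis and verify its properties (a sketch, numpy assumed):

```python
import numpy as np

# A positive (symmetric, positive definite) matrix T ...
T = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# ... and its positive square root S, built by taking square roots of the
# eigenvalues in an orthonormal eigenbasis (as in Exercise 1).
w, Q = np.linalg.eigh(T)
S = Q @ np.diag(np.sqrt(w)) @ Q.T

assert np.allclose(S @ S, T)               # S^2 = T
assert np.allclose(S, S.T)                 # S is self-adjoint
assert np.all(np.linalg.eigvalsh(S) > 0)   # S is positive
```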


13. Since S, T are self-adjoint and ST = TS, it follows that ST is self-adjoint. Then there exists an orthonormal basis S = (v1, ..., vn) consisting of eigenvectors for S and for T. Set S(vj) = aj vj, T(vj) = bj vj. Since S, T are positive, aj, bj ≥ 0. Then aj bj ≥ 0. Now S is an orthonormal basis for V consisting of eigenvectors for ST, and ST(vj) = aj bj vj with aj bj ≥ 0. It follows that ST is a positive operator.

14. Let the operators S and T on R² be defined as multiplication by the following matrices, respectively: [2 0; 0 3] and [3 −1; −1 3].

15. Assume T is invertible. Then T*T is invertible and hence so is √(T*T). Then S = T(√(T*T))^{−1} is unique. On the other hand, if T is not invertible then T*T is not invertible and neither is √(T*T). In this case there are infinitely many isometries S which extend R, and so S is not unique.

16. Since T is not invertible, S is not unique. One solution is

(1/3)[1 √3−1 √3+1; −(√3+1) 1 √3−1; −(√3−1) −(√3+1) 1].

17. Let V = C^n, W = C^m and let T : V → W be the operator such that T(v) = Av. Let S_V denote the standard basis of V and S_W the standard basis of W. By Theorem (6.12) there are orthonormal bases B_V = (v1, ..., vn) and B_W = (w1, ..., wm) such that T(vj) = sj wj for 1 ≤ j ≤ r and T(vj) = 0_W if j > r. Let P = M_{IV}(B_V, S_V) and Q = M_{IW}(S_W, B_W). Assume 1 ≤ j ≤ r. Then

QAP e_j^V = QAvj = Q(T(vj)) = Q(sj wj) = sj Qwj = sj e_j^W.

If j > r then

QAP e_j^V = QAvj = Q(T(vj)) = Q 0_W = 0_W.

Thus, QAP has the required form.

18. We have shown that Ker(T) = Ker(T*T) and, similarly, Ker(T*) = Ker(TT*). Since T*T is a self-adjoint operator it is diagonalizable and V = Ker(T*T) ⊕ Range(T*T) = Ker(T) ⊕ Range(T*T). Similarly, V = Ker(T*) ⊕ Range(TT*). Note that since nullity(T) = nullity(T*) it follows that dim(Range(T*T)) = dim(Range(TT*)). Let S denote the restriction of the operator T to Range(T*T). Suppose v ∈ Range(T*T). Then there is a vector u ∈ V such that v = (T*T)(u). Then S(v) =


T(v) = (T(T*T))(u) = (TT*)(T(u)) ∈ Range(TT*). Thus, Range(S) ⊂ Range(TT*). However, S is injective since Ker(T) ∩ Range(T*T) = {0}. Since dim(Range(T*T)) = dim(Range(TT*)), in fact, S is an isomorphism.
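The identities used in Exercise 18, Ker(T) = Ker(T*T) and dim Range(T*T) = dim Range(TT*), are easy to illustrate with a singular matrix (a sketch, numpy assumed):

```python
import numpy as np

# A rank-2 (hence singular) real 3x3 matrix: the second row is twice the first.
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [0.0, 1.0, 1.0]])

r = np.linalg.matrix_rank
assert r(A) == 2
# rank(A^tr A) = rank(A A^tr) = rank(A); by rank-nullity the kernels agree.
assert r(A.T @ A) == r(A @ A.T) == r(A)
```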

Now assume that v ∈ Range(T*T) is an eigenvector with eigenvalue a. We claim that S(v) is an eigenvector of TT* with eigenvalue a; that is, applying TT* to S(v) yields aS(v):

(TT*)(S(v)) = (TT*)(T(v)) = [T(T*T)](v) = T[(T*T)(v)] = T(av) = aT(v) = aS(v),

as required.

19. Assume rank(T) = k. Since T is semi-positive there exists an orthonormal basis B = (v1, ..., vn) of eigenvectors for T with T(vj) = aj vj, where aj > 0 for 1 ≤ j ≤ k and T(vj) = 0 for j > k. Now T*T = T² since T is self-adjoint, and T²(vj) = aj² vj for 1 ≤ j ≤ k while T²(vj) = 0 for j > k. Then the singular values of T are √(aj²), 1 ≤ j ≤ k. However, since aj > 0, √(aj²) = aj.

20. Assume SP = PS. Multiplying on the left and on the right by S^{−1} we get PS^{−1} = S^{−1}P. Now, (SP)* = P*S* = PS^{−1} = S^{−1}P. Then (SP)*(SP) = (PS^{−1})(SP) = P². On the other hand, (SP)(SP)* = (SP)(PS^{−1}) = (PS)(S^{−1}P) = P², and SP is normal.

Conversely, assume SP is normal. Then P² = (SP)*(SP) = (SP)(SP)* = SP²S^{−1}. Thus, S commutes with P² and therefore leaves invariant each eigenspace of P². However, since P is positive the eigenspaces of P and the eigenspaces of P² are the same. Therefore S leaves the eigenspaces of P invariant. Let v be an eigenvector of P with eigenvalue a. Then (SP)(v) = S(P(v)) = S(av) = aS(v). On the other hand, (PS)(v) = P(S(v)) = aS(v), since S(v) is an eigenvector of P with eigenvalue a. Thus SP = PS.
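The matrix form of the singular value decomposition in Exercise 17 — unitary changes of bases P and Q making QAP "diagonal" with the singular values on the main diagonal — can be sketched numerically (numpy assumed; here Q = U^tr and P = V from numpy's SVD factorization A = U Σ V^tr):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))  # an arbitrary 3x4 real matrix

U, s, Vt = np.linalg.svd(A)      # A = U @ Sigma @ Vt
D = U.T @ A @ Vt.T               # the "diagonal" 3x4 matrix Q A P

# Expected shape: singular values s_1 >= s_2 >= s_3 on the main diagonal.
E = np.zeros((3, 4))
E[:3, :3] = np.diag(s)
assert np.allclose(D, E)

# The squared singular values are the eigenvalues of A A^tr.
assert np.allclose(np.sort(s**2), np.linalg.eigvalsh(A @ A.T))
```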


Chapter 7

Trace and Determinant of a Linear Operator

7.1. Trace of a Linear Operator
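The trace identities established in the exercises below — additivity, homogeneity, Trace(AB) = Trace(BA), and invariance under similarity — are easy to sanity-check numerically (a sketch, numpy assumed):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 4))
# Diagonal shift makes P invertible for this sample.
P = rng.standard_normal((4, 4)) + 4 * np.eye(4)

tr = np.trace
assert np.isclose(tr(A + B), tr(A) + tr(B))             # Exercise 1
assert np.isclose(tr(3.0 * A), 3.0 * tr(A))             # Exercise 2
assert np.isclose(tr(A @ B), tr(B @ A))                 # Theorem (7.1)
assert np.isclose(tr(np.linalg.inv(P) @ A @ P), tr(A))  # Exercise 3
```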

1. Let A have entries aij, 1 ≤ i, j ≤ n, and B have entries bij, 1 ≤ i, j ≤ n. Then A + B has entries aij + bij, 1 ≤ i, j ≤ n. The traces are given by

Trace(A) = a11 + ··· + ann
Trace(B) = b11 + ··· + bnn
Trace(A + B) = (a11 + b11) + ··· + (ann + bnn) = [a11 + ··· + ann] + [b11 + ··· + bnn] = Trace(A) + Trace(B).

2. Let A have entries aij, 1 ≤ i, j ≤ n. Then the entries of cA are caij, 1 ≤ i, j ≤ n. The traces are given by

Trace(A) = a11 + ··· + ann
Trace(cA) = (ca11) + ··· + (cann) = c[a11 + ··· + ann] = cTrace(A).

3. Assume P is an invertible n × n matrix, C is an n × n matrix, and set D = P^{−1}CP. We need to show that Trace(D) = Trace(C). Set A = CP and B = P^{−1}. By Theorem (7.1), Trace(AB) = Trace(BA). However, AB = (CP)P^{−1} = C(PP^{−1}) = CIn = C, whereas BA = P^{−1}CP = D.

4. The matrices M_T(B, B) and M_T(B′, B′) are similar, so by Corollary (7.1) they have the same trace.

5. Let B be a basis for V. Then Tr(ST) = Trace(M_{ST}(B, B)). However, M_{ST}(B, B) = M_S(B, B)M_T(B, B), so that Tr(ST) = Trace(M_S(B, B)M_T(B, B)). In exactly the same way, Tr(TS) = Trace(M_T(B, B)M_S(B, B)). By Theorem (7.1), Trace(M_S(B, B)M_T(B, B)) = Trace(M_T(B, B)M_S(B, B)).

6. Let B be a basis for V. Then Tr(cT) = Trace(M_{cT}(B, B)) = Trace(cM_T(B, B)). However, by Exercise 2, Trace(cM_T(B, B)) = cTrace(M_T(B, B)) = cTr(T).

7. Since (x1 + x2 + x3)² − (x1² + x2² + x3²) = 2(x1x2 + x1x3 + x2x3), we conclude that x1x2 + x1x3 + x2x3 = 0. Next note that

3x1x2x3 = (x1 + x2 + x3)(x1x2 + x1x3 + x2x3) − (x1 + x2 + x3)(x1² + x2² + x3²) + (x1³ + x2³ + x3³) = 0.

We may therefore assume that at least one of x1, x2, x3 is zero, and by symmetry that x3 = 0. Since x1x2 + x1x3 + x2x3 = 0 we further conclude that x1x2 = 0. But then


(x1 ± x2)² = x1² + x2² ± 2x1x2 = 0, from which we have x1 + x2 = x1 − x2 = 0 and x1 = x2 = 0.

8. Since A is similar to an upper triangular matrix we can assume that A is upper triangular. Let α1, α2, α3 be the diagonal entries of A. Then the diagonal entries of A² are α1², α2², α3² and the diagonal entries of A³ are α1³, α2³, α3³. It follows that

0 = Tr(A) = α1 + α2 + α3
0 = Tr(A²) = α1² + α2² + α3²
0 = Tr(A³) = α1³ + α2³ + α3³.

From Exercise 7, α1 = α2 = α3 = 0 and A is nilpotent.

9. This depends on proving that the only solution in complex numbers to the system of equations

x1 + ··· + xn = 0
x1² + ··· + xn² = 0
...
x1ⁿ + ··· + xnⁿ = 0

is x1 = x2 = ··· = xn = 0. The crux is to show that x1x2···xn = 0, from which it follows that one of the variables is zero, which by symmetry can be taken to be xn, and then apply induction. For the former, consult an abstract algebra book in which it is proven that the power sums x1 + ··· + xn, x1² + ··· + xn², ..., x1ⁿ + ··· + xnⁿ generate the ring of symmetric polynomials in x1, x2, ..., xn; in particular, x1x2···xn is a polynomial in them and therefore vanishes here.

10. Let B be a basis for V. Set A = M_T(B, B). We need to prove that A is the n × n zero matrix. Now Trace(M_S(B, B)A) = 0 for every operator S on V. However, M_S(B, B) ranges over all the matrices in Mnn(F). Let Eij be the matrix which has one non-zero entry, a one in the (i, j)-position. Let aij be the entry of A in the (i, j)-position. If k ≠ i then the kth row of Eij A is zero, whereas the ith row consists of the jth row of A, and therefore Trace(Eij A) = aji. Thus, for all i, j, aij = 0 and A is the zero matrix as required.

11. Since all the eigenvalues of A are real there exists a non-singular matrix Q such that Q^{−1}AQ is upper triangular. Assume the diagonal entries of Q^{−1}AQ are a1, ..., an. Then the diagonal entries of (Q^{−1}AQ)² = Q^{−1}A²Q are a1², ..., an². Then Trace(A²) = Trace(Q^{−1}A²Q) = a1² + ··· + an² ≥ 0.

12. Since T² = T, the minimum polynomial of T divides x² − x and all the eigenvalues of T are 0 or 1. Let B be a basis for V and set A = M_T(B, B). There exists a non-singular matrix Q such that Q^{−1}AQ is upper triangular with diagonal entries 0 or 1. Then Tr(T) = Trace(A) = Trace(Q^{−1}AQ) is a sum of the diagonal entries, whence a non-negative integer.

13. Let B be an orthonormal basis of V and set A = M_T(B, B). Then M_{T*}(B, B) = A^tr. Since A and A^tr have the same diagonal entries, Trace(A) = Trace(A^tr). Thus, Tr(T) = Trace(A) = Trace(A^tr) = Tr(T*).

14. Let B be an orthonormal basis of V and set A = M_T(B, B). Then M_{T*}(B, B) = Ā^tr. It now follows that Tr(T*) = Trace(Ā^tr) = Trace(Ā), which is the complex conjugate of Trace(A) = Tr(T).

15. Tr : L(V, V) → F is a non-zero linear transformation and consequently it is onto the one-dimensional space F. sl(V) = Ker(Tr) and so is a subspace, and by the rank-nullity theorem, dim(sl(V)) = dim(L(V, V)) − 1 = n² − 1.

16. If S = T*T then S is a semi-positive operator, which implies that it has real eigenvalues which are all non-negative and, since it is self-adjoint, there is an orthonormal basis B such that M_S(B, B) is diagonal. Let


A = MT(B, B) = diag{a1, ..., an}. Since the aj are eigenvalues, all ai ≥ 0. Then Tr(T) = Trace(A) = a1 + ... + an ≥ 0. On the other hand, the trace is zero if and only if a1 = ... = an = 0. However, this implies A is the zero matrix, whence T is the zero operator.

17. We do a proof by induction on n = dim(V). If n = 1 there is nothing to do. So assume the result is true for operators on spaces of dimension n − 1 and assume dim(V) = n. If T = 0_{V→V} then any basis works, so we may assume T ≠ 0_{V→V}. We first claim that there is a vector v such that T(v) ∉ Span(v). Otherwise, for each v there is λ_v ∈ F such that T(v) = λ_v v. We claim that λ_v is independent of the vector v. Of course, if w = cv then T(w) = cT(v) = cλ_v v = λ_v(cv) = λ_v w. Suppose on the other hand that (v, w) is linearly independent. Then λ_{v+w} v + λ_{v+w} w = λ_{v+w}(v + w) = T(v + w) = T(v) + T(w) = λ_v v + λ_w w. Then λ_v = λ_{v+w} = λ_w. Thus, T = λ I_V for some λ ∈ F. Then Tr(T) = nλ = 0. Since the characteristic of F is zero, λ = 0, contrary to our assumption that T ≠ 0_{V→V}.

Now choose v such that T(v) ∉ Span(v), set v1 = v and v2 = T(v), and extend to a basis B1 = (v1, v2, ..., vn). Let T(vj) = Σ_{i=1}^n a_ij vi and set aj = [T(vj)]_{B1}. Note that a1 = e2 = (0, 1, 0, ..., 0)^tr. Since Tr(T) = 0 and a_11 = 0, we must have Σ_{i=2}^n a_ii = 0. Now define an operator S on V such that S(v1) = v1 and S(vj) = Σ_{i=2}^n a_ij vi. Note that W = Span(v2, ..., vn) is S-invariant and that Tr(S|_W) = Σ_{i=2}^n a_ii = 0. Set S′ = S|_W. By the inductive hypothesis, there is a basis B_W = (w1, ..., w_{n−1}) for W such that the diagonal entries of M_{S′}(B_W, B_W) are all zero. Set u1 = v1, uj = w_{j−1} for 2 ≤ j ≤ n, and B = (u1, ..., un). Since T(vj) − S(vj) ∈ Span(v1) for 2 ≤ j ≤ n it follows that T(uj) − S(uj) ∈ Span(v1) for 2 ≤ j ≤ n. Consequently, for 2 ≤ j ≤ n, [T(uj)]_B − [S(uj)]_B ∈ Span(e1), where e1 = (1, 0, ..., 0)^tr. However, the (j, j)-entry of [S(uj)]_B is zero, whence the (j, j)-entry of [T(uj)]_B is zero. Thus, all the diagonal entries of MT(B, B) are zero as required.

18. Let Z be the set of n × n matrices all of whose diagonal entries are zero. Then Z is a subspace of M_nn(F) of dimension n^2 − n. For a matrix B define a map ad(B) : M_nn(F) → M_nn(F) by ad(B)(C) = BC − CB. This is a linear map and Ker(ad(B)) = C(B), the subalgebra of M_nn(F) which commutes with B. Note that if the minimum polynomial of B has degree n then dim(C(B)) = n and Range(ad(B)) has dimension n^2 − n.

Now let a1, ..., an be distinct elements of F and let B be the diagonal matrix with entries a1, ..., an. Then the minimum polynomial of B is (x − a1) ... (x − an), which has degree n, and therefore dim(Range(ad(B))) = n^2 − n. On the other hand, for any matrix C the diagonal entries of BC − CB are all zero. Thus, Range(ad(B)) = Z, so for A ∈ Z there is a matrix C such that A = BC − CB.

19. By Exercise 17 there exists a basis B such that A = MT(B, B) has diagonal entries all zero. By Exercise 18 there are matrices B, C such that A = BC − CB. Let R, S ∈ L(V, V) be such that M_R(B, B) = B and M_S(B, B) = C. Then T = RS − SR.
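The construction in Exercises 18 and 19 is effective: with B diagonal with distinct entries, the equation A = BC − CB can be solved entrywise, since (BC − CB)[i][j] = (b_i − b_j)C[i][j]. A minimal sketch using exact rational arithmetic (the function names are ours, not the book's):

```python
from fractions import Fraction

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator_factors(A):
    """Given a square matrix A with zero diagonal, return (B, C) with
    A = BC - CB: take B = diag(0, 1, ..., n-1), whose entries are distinct,
    and solve (b_i - b_j) * C[i][j] = A[i][j] for the off-diagonal C[i][j]."""
    n = len(A)
    b = [Fraction(i) for i in range(n)]
    B = [[b[i] if i == j else Fraction(0) for j in range(n)] for i in range(n)]
    C = [[Fraction(0) if i == j else Fraction(A[i][j]) / (b[i] - b[j])
          for j in range(n)] for i in range(n)]
    return B, C

A = [[0, 2, 3], [4, 0, 5], [6, 7, 0]]       # zero diagonal, i.e. trace zero part
B, C = commutator_factors(A)
BC, CB = matmul(B, C), matmul(C, B)
comm = [[BC[i][j] - CB[i][j] for j in range(3)] for i in range(3)]
assert comm == A    # A really is the commutator BC - CB
```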

7.2. Determinants



Chapter 7. Trace and Determinant of a Linear Operator

1. Let B be the matrix obtained from A by exchanging the first and i-th rows. Let the entries of A be a_kl and the entries of B be b_kl. Let A_kl be the matrix obtained from A by deleting the k-th row and l-th column, with B_kl defined similarly. Set M_kl = det(A_kl), M′_kl = det(B_kl), C_kl = (−1)^(k+l) M_kl and C′_kl = (−1)^(k+l) M′_kl.

Since B is obtained from A by exchanging the first and i-th rows, det(B) = −det(A). By Exercise 1, det(B) = b_11 C′_11 + ... + b_1n C′_1n. Note that b_1j = a_ij. Also, the matrix B_1j is obtained from the matrix A_ij by moving the first row to the (i−1)-st row. Note that this can be obtained by exchanging the first and second rows of A_ij, then exchanging the second and third rows of that matrix, and continuing until we exchange the (i−2)-nd and (i−1)-st rows. Thus, there are i − 2 exchanges, which implies that det(B_1j) = (−1)^(i−2) det(A_ij) = (−1)^i det(A_ij). Thus, M′_1j = (−1)^i M_ij. It then follows that C′_1j = (−1)^(1+j) M′_1j = (−1)^(1+j) (−1)^i M_ij = (−1)^(1+i+j) M_ij = −C_ij. Putting this together we get

−det(A) = det(B) = a_i1 (−C_i1) + ... + a_in (−C_in).

Multiplying by −1 we get det(A) = a_i1 C_i1 + ... + a_in C_in.

2. Let B = A^tr. Denote the (i, j)-entry of B by b_ij and the (i, j)-cofactor by C′_ij. By Exercise 3, det(B) = det(A). By Exercise 2, for any j,

det(B) = b_j1 C′_j1 + b_j2 C′_j2 + ... + b_jn C′_jn.

Since B is the transpose of A, b_ji = a_ij and C′_ji = C_ij. Thus, det(A) = det(B) = a_1j C_1j + a_2j C_2j + ... + a_nj C_nj.

3. Let B be an orthonormal basis of V. Then det(T) = det(MT(B, B)) and det(T*) = det(M_{T*}(B, B)). However, if A = MT(B, B) then M_{T*}(B, B) = A^tr. Then det(T*) = det(A^tr) = det(A) = det(T).

4. If v = (c1, ..., cn)^tr then J_n v is the vector all of whose entries are equal to c1 + ... + cn. Thus, J_n j_n = (n, ..., n)^tr = n j_n and j_n is an eigenvector with eigenvalue n. This proves i). It follows from the previous paragraph that J_n v_i = 0. The sequence (v1, ..., v_{n−1}) is linearly independent and spans a subspace of dimension n − 1 contained in null(J_n). Since null(J_n) is a proper subspace of R^n, Span(v1, ..., v_{n−1}) = null(J_n). This proves ii). iii) Since j_n ∉ Span(v1, ..., v_{n−1}), the sequence B = (v1, ..., v_{n−1}, j_n) is linearly independent. Since there are n vectors and dim(R^n) = n, B is a basis for R^n.

5. A j_n = (aI_n + bJ_n) j_n = (aI_n) j_n + (bJ_n) j_n = a j_n + (bn) j_n = [a + bn] j_n. Thus, j_n is an eigenvector of A with eigenvalue a + bn. On the other hand, A v_i = (aI_n + bJ_n) v_i = (aI_n) v_i + (bJ_n) v_i = a v_i.
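The expansion identities of Exercises 1 and 2, det(A) = a_i1 C_i1 + ... + a_in C_in for any row i, can be checked against the Leibniz formula; a small exact-arithmetic sketch (the helper names are ours):

```python
from fractions import Fraction
from itertools import permutations

def det(A):
    """Leibniz-formula determinant (exact, fine for small matrices)."""
    n = len(A)
    total = Fraction(0)
    for perm in permutations(range(n)):
        sign = 1
        for i in range(n):            # sign via inversion count
            for j in range(i + 1, n):
                if perm[i] > perm[j]:
                    sign = -sign
        prod = Fraction(1)
        for i in range(n):
            prod *= A[i][perm[i]]
        total += sign * prod
    return total

def minor(A, k, l):
    return [[A[i][j] for j in range(len(A)) if j != l]
            for i in range(len(A)) if i != k]

def expand_along_row(A, i):
    """det(A) = sum_j a_ij * C_ij, the identity of Exercise 1."""
    return sum((-1) ** (i + j) * A[i][j] * det(minor(A, i, j))
               for j in range(len(A)))

A = [[Fraction(x) for x in row]
     for row in [[2, 1, 5], [3, 4, 1], [7, 2, 6]]]
assert all(expand_along_row(A, i) == det(A) for i in range(3))
```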


This shows that each vi is an eigenvector with eigenvalue a. Thus, B = (v1 , . . . , vn−1 , jn ) is a basis of eigenvectors for A with eigenvalues a with multiplicity n−1 and a+bn with multiplicity 1. This implies that A is similar to the diagonal matrix with n − 1 entries equal to a and one



entry equal to a + bn. The determinant of this diagonal matrix is the product of the diagonal entries and is equal to a^(n−1)(a + bn). Since A is similar to this diagonal matrix, det(A) = a^(n−1)(a + bn).

6. We proceed by induction starting with n = 2 as the base case. Direct computation shows that

det [ 1    1
      α1   α2 ] = α2 − α1.

Thus, assume the result holds for n − 1. We will use properties of determinants to compute the determinant of

[ 1         1         ...  1
  α1        α2        ...  αn
  α1^2      α2^2      ...  αn^2
  ...
  α1^(n−1)  α2^(n−1)  ...  αn^(n−1) ].

We begin by working with the transpose

[ 1  α1  ...  α1^(n−1)
  1  α2  ...  α2^(n−1)
  ...
  1  αn  ...  αn^(n−1) ].

Subtract the first row from all the other rows to get the following matrix, which has the same determinant:

[ 1  α1       ...  α1^(n−1)
  0  α2 − α1  ...  α2^(n−1) − α1^(n−1)
  ...
  0  αn − α1  ...  αn^(n−1) − α1^(n−1) ].

Expanding along the first column, we can factor αk − α1 from the (k−1)-st row of the resulting (n−1) × (n−1) matrix, so the determinant is Π_{k=2}^{n} (αk − α1) times the determinant of

[ 1  α2 + α1  ...  α2^(n−2) + α2^(n−3)α1 + ... + α1^(n−2)
  1  α3 + α1  ...  α3^(n−2) + α3^(n−3)α1 + ... + α1^(n−2)
  ...
  1  αn + α1  ...  αn^(n−2) + αn^(n−3)α1 + ... + α1^(n−2) ].

We need to compute the determinant of this matrix. Take its transpose. Then subtract α1 times the (n−2)-nd row from the (n−1)-st row, then α1 times the (n−3)-rd row from the (n−2)-nd row, and continue, subtracting α1 times the second row from the third row and α1 times the first row from the second row. The matrix obtained is

[ 1         1         ...  1
  α2        α3        ...  αn
  α2^2      α3^2      ...  αn^2
  ...
  α2^(n−2)  α3^(n−2)  ...  αn^(n−2) ].

By the induction hypothesis, the determinant of this matrix is Π_{2≤j<k≤n} (αk − αj). Consequently the determinant of the original matrix is Π_{k=2}^{n} (αk − α1) · Π_{2≤j<k≤n} (αk − αj) = Π_{1≤j<k≤n} (αk − αj).

If Player 1 puts a number in position (i, j) with i > 2, then Player 2 places an arbitrary entry in a position (k, l) with k > 2. If Player 1 puts a number b in position (i, j) with i ∈ {1, 2}, then Player 2 puts the same number b in position (k, j) where {i, k} = {1, 2}. When the matrix is filled the first and second rows will be identical and the matrix will have determinant zero.

20. Taking determinants we get det(AB) = det(A)det(B) = det(B)det(A) = det(BA). On the other hand, since AB = −BA and n = 2k + 1 is odd, det(AB) = det(−BA) = (−1)^(2k+1) det(BA) = −det(BA). Therefore det(AB) = −det(AB) and so det(AB) is zero. Then either det(A) = 0 or det(B) = 0, so either A is not invertible or B is not invertible.
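The Vandermonde evaluation of Exercise 6 can be checked numerically against the Leibniz formula; a small exact-arithmetic sketch (the helper names are ours):

```python
from fractions import Fraction
from itertools import permutations

def det(A):
    """Leibniz-formula determinant (exact, fine for small matrices)."""
    n = len(A)
    total = Fraction(0)
    for perm in permutations(range(n)):
        sign = 1
        for i in range(n):            # sign via inversion count
            for j in range(i + 1, n):
                if perm[i] > perm[j]:
                    sign = -sign
        prod = Fraction(1)
        for i in range(n):
            prod *= A[i][perm[i]]
        total += sign * prod
    return total

def vandermonde(alphas):
    """Row i holds the i-th powers: rows (1,...,1), (a1,...,an), ..."""
    n = len(alphas)
    return [[Fraction(a) ** i for a in alphas] for i in range(n)]

alphas = [2, 3, 5, 7]
V = vandermonde(alphas)
prod = Fraction(1)
for j in range(4):
    for k in range(j + 1, 4):
        prod *= alphas[k] - alphas[j]   # the product over 1 <= j < k <= n
assert det(V) == prod
```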

7.3. Uniqueness of the Determinant

1. The proof is by induction on j − i. If j − i = 1 this follows from the definition of an alternating form. Assume the result for t and assume that u_i = u_j with j − i = t + 1. By Lemma (7.10)

Assume f, g ∈ L(V^2, W). We need to prove that f + g ∈ L(V^2, W). Let v1, v2, u ∈ V. Then

(f + g)(v1 + v2, u) = f(v1 + v2, u) + g(v1 + v2, u) = [f(v1, u) + f(v2, u)] + [g(v1, u) + g(v2, u)] = [f(v1, u) + g(v1, u)] + [f(v2, u) + g(v2, u)] = (f + g)(v1, u) + (f + g)(v2, u).

That (f + g)(u, v1 + v2) = (f + g)(u, v1) + (f + g)(u, v2) is proved in exactly the same way.

Now let u, v ∈ V and c a scalar. We need to prove (f + g)(cu, v) = (f + g)(u, cv) = c(f + g)(u, v).

(f + g)(cu, v) = f(cu, v) + g(cu, v) = cf(u, v) + cg(u, v) = c[f(u, v) + g(u, v)] = c[(f + g)(u, v)] = c(f + g)(u, v).

Likewise,

(f + g)(u, cv) = f(u, cv) + g(u, cv) = cf(u, v) + cg(u, v) = c[(f + g)(u, v)] = c(f + g)(u, v).

Now assume f ∈ L(V^2, W) and c is a scalar. We need to prove that cf ∈ L(V^2, W). Let v1, v2, u ∈ V. Then

02/06/15 3:12 pm

(cf)(v1 + v2, u) = c[f(v1 + v2, u)] = c[f(v1, u) + f(v2, u)] = cf(v1, u) + cf(v2, u) = (cf)(v1, u) + (cf)(v2, u).

In exactly the same way we prove that (cf)(u, v1 + v2) = (cf)(u, v1) + (cf)(u, v2). Finally, we need to show that if d is a scalar and u, v ∈ V then (cf)(du, v) = (cf)(u, dv) = d[(cf)(u, v)].

(cf)(du, v) = c[f(du, v)] = c[df(u, v)] = (cd)f(u, v) = (dc)f(u, v) = d[cf(u, v)] = d[(cf)(u, v)].

Similarly,

(cf)(u, dv) = c[f(u, dv)] = c[df(u, v)] = (cd)f(u, v) = (dc)f(u, v) = d[cf(u, v)] = d[(cf)(u, v)].

4. Again we prove this for m = 2; the general case is similar. We need to prove the following: i) If f, g ∈ Alt(V^2, W) and u ∈ V then (f + g)(u, u) = 0_W; and ii) If f ∈ Alt(V^2, W), c is a scalar and u ∈ V then (cf)(u, u) = 0_W.

i) (f + g)(u, u) = f(u, u) + g(u, u). Since f, g ∈ Alt(V^2, W) we have f(u, u) + g(u, u) = 0_W + 0_W = 0_W.

ii) (cf)(u, u) = c[f(u, u)] = c 0_W = 0_W.

5. Assume m > n = dim(V) and f ∈ Alt(V^m, W). Let (u1, ..., um) be a sequence from V. Since m is greater than the dimension of V the sequence is linearly dependent. By Lemma (7.11), f(u1, ..., um) = 0_W.

6. We prove this for (i, j) = (1, 2). We first show that f12 is alternating:

f12((a1, a2, a3, a4)^tr, (a1, a2, a3, a4)^tr) = a1 a2 − a2 a1 = 0.

We prove additivity in the first variable:

f12((a1, a2, a3, a4)^tr + (b1, b2, b3, b4)^tr, (c1, c2, c3, c4)^tr) = (a1 + b1)c2 − (a2 + b2)c1 = (a1 c2 + b1 c2) − (a2 c1 + b2 c1) = (a1 c2 − a2 c1) + (b1 c2 − b2 c1) = f12((a1, a2, a3, a4)^tr, (c1, c2, c3, c4)^tr) + f12((b1, b2, b3, b4)^tr, (c1, c2, c3, c4)^tr).

A similar argument shows that f12 is additive in the second argument.

We now prove that f12 has the scalar property in the first variable:

f12(c(a1, a2, a3, a4)^tr, (b1, b2, b3, b4)^tr) = f12((ca1, ca2, ca3, ca4)^tr, (b1, b2, b3, b4)^tr) = (ca1)b2 − (ca2)b1 = c(a1 b2) − c(a2 b1) = c[a1 b2 − a2 b1] = c f12((a1, a2, a3, a4)^tr, (b1, b2, b3, b4)^tr).
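The properties of f12 verified in Exercise 6 can be checked mechanically; a minimal sketch over integer 4-tuples (standing in for F^4):

```python
# f12(u, v) = u1*v2 - u2*v1, the alternating bilinear form of Exercise 6
def f12(u, v):
    return u[0] * v[1] - u[1] * v[0]

u, v, w = (1, 2, 3, 4), (5, 6, 7, 8), (9, 10, 11, 12)
c = 7
assert f12(u, u) == 0                           # alternating
s = tuple(ui + vi for ui, vi in zip(u, v))
assert f12(s, w) == f12(u, w) + f12(v, w)       # additive in the first slot
cu = tuple(c * ui for ui in u)
assert f12(cu, v) == c * f12(u, v)              # scalar property, first slot
assert f12(u, v) == -f12(v, u)                  # skew-symmetry follows
```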


The scalar property in the second variable is proved similarly.

7. Let e_i denote the i-th standard basis vector of F^4 and set e_ij = (e_i, e_j) for 1 ≤ i, j ≤ 4. Then any alternating form f ∈ Alt(V^2, F) is uniquely determined by its values on the e_ij. Next note that f_ij(e_kl) = δ_ik δ_jl. This implies that (f12, ..., f34) is linearly independent: Suppose

Chapter 8

Bilinear Maps and Forms

φ(u) > 0 for all nonzero vectors u ∈ M. Let (v1, ..., vm) be an orthogonal basis of M, which exists by Lemma (8.21). As in Exercise 16 we can assume that φ(vi) = 1 for 1 ≤ i ≤ m. Now suppose M1, M2 ∈ P are maximal with dimensions m_i, i = 1, 2. Let (v_{1,i}, ..., v_{m_i,i}) be an orthonormal basis of M_i. We claim that m1 = m2. Suppose to the contrary that m1 ≠ m2. We may assume without loss of generality that m1 < m2. Let σ be the linear map from U = Span(v_{1,2}, ..., v_{m1,2}) to M1 such that σ(v_{j,2}) = v_{j,1} for 1 ≤ j ≤ m1. Then σ is an isometry. By Witt's theorem there exists an isometry S of V such that S restricted to U is σ. Then S(M2) ∈ P and S(M2) properly contains M1, a contradiction. Thus, m1 = m2 and the map S is an isometry of V which takes M2 to M1.

18. By Lemma (8.21) there is an orthogonal basis (v1, v2, v3) for V. First suppose that at least two of φ(v1), φ(v2), φ(v3) are squares, say φ(v1) = a^2, φ(v2) = b^2. Then replacing v1 by (1/a)v1 and v2 by (1/b)v2 we can assume that φ(v1) = φ(v2) = 1. The following is well known and proved in a first course in abstract algebra (it depends on the fact that the multiplicative group of the field F_q is cyclic): for any c ∈ F_q there are d, e ∈ F_q such that c = d^2 + e^2. In particular, this applies to −φ(v3). Now if d^2 + e^2 = −φ(v3) then φ(dv1 + ev2 + v3) = 0. On the other hand, if φ(v1) and φ(v2) are non-squares in F_q then replace φ with dφ where d is a non-square. Then we can apply the above. Since the singular vectors of φ and dφ are the same, we are done.

19. The proof is by induction on n. If n ≤ 3 then this follows from Exercise 18. So assume that dim(V) = n > 3 and the result is true for all spaces of dimension less than n. Since dim(V) > 3, by Exercise 18 there is a singular vector v. By Lemma (8.24) there exists a singular vector w such that ⟨v, w⟩_φ = 1. Set U = Span(v, w)^⊥, which is non-degenerate of dimension n − 2. Set h = (n−3)/2. By the inductive hypothesis there exists a totally singular subspace M of U with dim(M) = h. Then M′ = M ⊕ Span(v) is a totally singular subspace of V with dim(M′) = (n−1)/2.

20. When m = 1 there are two singular subspaces of dimension one, each with q − 1 nonzero vectors, so there are altogether 2(q − 1) singular vectors when m = 1, which is equal to (q^m − 1)(q^(m−1) + 1). Assume we have shown that in a space of dimension 2m (maximal Witt index) there are (q^m − 1)(q^(m−1) + 1) singular vectors, and that dim(V) = 2(m + 1). Let v, w be singular vectors with ⟨v, w⟩ = 1 and set U = Span(v, w)^⊥. Let Δ(v) be the set of singular vectors u such that u ⊥ v and Span(u) ≠ Span(v), and Γ(v) the set of singular vectors z which are not orthogonal to v. Count |Γ(v)| first. Note that v^⊥ has dimension 2m + 1 and therefore there are q^(2m+2) − q^(2m+1) vectors in V \ v^⊥. Let x be any vector not orthogonal to v. Then in Span(v, x) there are q^2 − q vectors which are non-orthogonal to v. Consequently, the number of non-degenerate two-dimensional subspaces which contain v is (q^(2m+2) − q^(2m+1))/(q^2 − q) = q^(2m). Any such two-dimensional subspace contains q − 1 singular vectors which are not orthogonal to v, and we can conclude that

|Γ(v)| = (q − 1) q^(2m) = q^(2m+1) − q^(2m).

Now suppose u ∈ v^⊥ is singular with Span(u) ≠ Span(v). Then Span(u, v) is a totally singular two-dimensional space and contains q^2 − q such vectors outside Span(v). Now w^⊥ ∩ Span(u, v) is a one-dimensional singular subspace contained in U. By the inductive hypothesis there are (q^m − 1)(q^(m−1) + 1) singular vectors in U and therefore (q^m − 1)(q^(m−1) + 1)/(q − 1) such subspaces. It now follows that

|Δ(v)| = (q^2 − q) (q^m − 1)(q^(m−1) + 1)/(q − 1) = q(q^m − 1)(q^(m−1) + 1).

Finally, there are q − 1 singular vectors in Span(v). Thus, the number of singular vectors is

(q − 1) + q(q^m − 1)(q^(m−1) + 1) + q^(2m+1) − q^(2m).




After simplifying we get (q^(m+1) − 1)(q^m + 1).
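The count of Exercise 20 can be confirmed by brute force over small fields; a sketch using the hyperbolic quadratic form Q(x) = x1 x2 + x3 x4 + ... (our choice of model, consistent with maximal Witt index):

```python
from itertools import product

def singular_count(q, m):
    """Brute-force count of nonzero singular vectors of the quadratic form
    Q(x) = x1*x2 + x3*x4 + ... (m hyperbolic planes) over F_q."""
    count = 0
    for v in product(range(q), repeat=2 * m):
        if any(v) and sum(v[2 * i] * v[2 * i + 1] for i in range(m)) % q == 0:
            count += 1
    return count

# matches (q^m - 1)(q^(m-1) + 1) for small cases
for q in (2, 3):
    for m in (1, 2):
        assert singular_count(q, m) == (q ** m - 1) * (q ** (m - 1) + 1)
```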

21. This follows from the proof of Exercise 20.

22. It follows from Exercises 20 and 21 that the number of hyperbolic pairs is (q^m − 1)(q^(m−1) + 1)q^(2m−2). The result follows from a straightforward induction on m.

23. We do induction on m. Assume m = 1. Let x be an element of V such that φ(x) = 1, let y ∈ x^⊥, and set d = φ(y). There are then 2(q + 1) pairs (u, v) such that φ(u) = 1, φ(v) = d. For any such pair the linear map τ(ax + by) = au + bv is an isometry, and every isometry arises this way. Thus, |O(V, φ)| = 2(q + 1). Now assume that m > 1. We can show by induction that the number of singular vectors is (q^m + 1)(q^(m−1) − 1) and the number of hyperbolic pairs is q^(2m−2)(q^m + 1)(q^(m−1) − 1). Then we can show by induction that the number of sequences (x1, y1, ..., x_{m−1}, y_{m−1}) such that ⟨x_i, y_i⟩_φ = 1 for 1 ≤ i ≤ m − 1 and x_i ⊥ x_j, x_i ⊥ y_j, y_i ⊥ y_j for i ≠ j is q^(m(m−1))(q^m + 1)(q − 1) Π_{i=2}^{m−1} (q^(2i) − 1). The order of O(V, φ) is obtained by multiplying this number by 2(q + 1) to get 2q^(m(m−1))(q^m + 1) Π_{i=1}^{m−1} (q^(2i) − 1).

8.4. Orthogonal Space, Characteristic Two

1. Let v be a singular vector in (V, φ). By Lemma (8.29) there is a singular vector w such that ⟨v, w⟩_φ = 1, and then (v, w) is a basis for V. Suppose x = av + bw is in Rad(V, φ). Then, in particular, ⟨x, v⟩_φ = 0. However, ⟨x, v⟩_φ = ⟨av + bw, v⟩_φ = b, whence b = 0. Similarly, a = 0 and x = 0.

2. Suppose to the contrary that dim(M1) ≠ dim(M2). Then without loss of generality we can assume dim(M1) = m1 < m2 = dim(M2). Let M2′ be a subspace of M2 with dim(M2′) = m1, and choose bases (v_{1,1}, ..., v_{m1,1}) for M1 and (v_{1,2}, ..., v_{m1,2}) for M2′. Let σ be the linear transformation from M2′ to M1 such that σ(v_{i,2}) = v_{i,1}. Then σ is an isometry. By Witt's theorem there exists an isometry S of V such that S restricted to M2′ is σ. But then S(M2) is a totally singular subspace of V and S(M2) properly contains M1, a contradiction.

3. By Witt's theorem there exists an isometry S on V such that S(X) = Y. Then S(X^⊥) = Y^⊥, so that X^⊥ and Y^⊥ are isometric.

4. This follows immediately from Theorem (8.15).

5. Span((0, 0, 1)^tr) and Span((1, 1, 0)^tr).

6. Set B = B_X ∪ B_Y. Let

J = [ 0_m  I_m
      I_m  0_m ].

Suppose T is an operator on V and MT(B, B) = M. Then T is an isometry if and only if M^tr J M = J. Now let S be an operator on V and assume that S(X) = X, S(Y) = Y. Then

M_S(B, B) = [ M_X  0_m
              0_m  M_Y ].

From the above it follows that S is an isometry if and only if M_X^tr M_Y = I_m.

7. Let c be a scalar and v = v1 + v2 a vector in V1 ⊕ V2. Then

φ(cv) = φ(c(v1 + v2)) = φ(cv1 + cv2) = φ1(cv1) + φ2(cv2) = c^2 φ1(v1) + c^2 φ2(v2) = c^2 [φ1(v1) + φ2(v2)] = c^2 φ(v1 + v2) = c^2 φ(v).

Assume v = v1 + v2 and w = w1 + w2 with v1, w1 ∈ V1 and v2, w2 ∈ V2. Then

⟨v1 + v2, w1 + w2⟩_φ = φ([v1 + v2] + [w1 + w2]) − φ(v1 + v2) − φ(w1 + w2) = φ([v1 + w1] + [v2 + w2]) − φ(v1 + v2) − φ(w1 + w2) = φ1(v1 + w1) + φ2(v2 + w2) − [φ1(v1) + φ2(v2) + φ1(w1) + φ2(w2)] = [φ1(v1 + w1) − φ1(v1) − φ1(w1)] + [φ2(v2 + w2) − φ2(v2) − φ2(w2)] = ⟨v1, w1⟩_{φ1} + ⟨v2, w2⟩_{φ2}.

8. H1 ⊥ H2 is a non-degenerate space of dimension four and Witt index 2. It therefore suffices to prove that E1 ⊥ E2 has Witt index two. E1 ⊥ E2 is isometric to (F^4, φ) where

φ((a, b, c, d)^tr) = a^2 + ab + b^2 δ + c^2 + cd + d^2 δ.

Check that (1, 0, 1, 0)^tr ⊥ (0, 1, 0, 1)^tr and φ((1, 0, 1, 0)^tr) = φ((0, 1, 0, 1)^tr) = 0, so these two vectors span a totally singular two-dimensional subspace.
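The verification in Exercise 8 can be replayed over F_2 with δ = 1 (our simplifying assumption; in the book δ is chosen so that x^2 + x + δ is irreducible):

```python
# phi(a,b,c,d) = a^2 + ab + b^2*delta + c^2 + cd + d^2*delta over F_2, delta = 1
def phi(v):
    a, b, c, d = v
    return (a * a + a * b + b * b + c * c + c * d + d * d) % 2

def bil(u, w):
    """Polar form <u, w> = phi(u + w) - phi(u) - phi(w), computed mod 2."""
    s = tuple((ui + wi) % 2 for ui, wi in zip(u, w))
    return (phi(s) - phi(u) - phi(w)) % 2

u, w = (1, 0, 1, 0), (0, 1, 0, 1)
assert phi(u) == 0 and phi(w) == 0      # both vectors are singular
assert bil(u, w) == 0                   # and orthogonal: Span(u, w) is
                                        # a totally singular plane
```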

8.5. Real Quadratic Forms

1. (π, ν) = (1, 0).

2. (π, ν) = (2, 1).

3. ( 2 , −1 , 3 ). √ 3 2 0 3 √ 1 2 −2 √2 2

4. (−√ 2 , −1 , 1 ). 2 1 2 2

5. The number of congruence classes is equal to the number of triples (π, ν, ζ) ∈ N^3 such that π + ν + ζ = n. This is the binomial coefficient C(n + 2, 2) = (n + 1)(n + 2)/2.

6. a) Let m be the Witt index. We know the rank ρ = n, so π + ν = n. Also, m = min{π, ν} and therefore (π, ν) = (m, n − m) or (n − m, m). Since n is odd, one of π, ν is even and the other is odd, and therefore one of m, n − m is even and the other is odd. If det(A) < 0 then ν is odd and if det(A) > 0 then ν is even. Thus, m and the sign of det(A) determine π and ν. Note, for each m there are two classes of isometry forms.

b) If m is the Witt index and m < n/2 then we can have (π, ν) = (m, n − m) or (n − m, m). Thus, there are two isometry classes.

c) If m = n/2 then (π, ν) = (m, m) and there is a unique isometry class of forms.

7. That [ , ] is bilinear follows from the bilinearity of ⟨ , ⟩ and the linearity of T. We therefore have to show that [x, y] = [y, x]. However, [x, y] = ⟨x, T(y)⟩ = ⟨T(x), y⟩ since T is self-adjoint. Since ⟨ , ⟩ is symmetric, we have ⟨T(x), y⟩ = ⟨y, T(x)⟩ = [y, x] as required.

8. Fix y ∈ V and define F_y : V → R by F_y(x) = [x, y]. Then F_y ∈ V′ = L(V, R). Then there exists a unique vector T(y) ∈ V such that F_y(x) = ⟨x, T(y)⟩. We have defined a function T : V → V such that [x, y] = ⟨x, T(y)⟩. We have to show that T is linear, and then that it is a symmetric operator.

Let y1, y2 ∈ V. Then ⟨x, T(y1 + y2)⟩ = [x, y1 + y2] = [x, y1] + [x, y2] = ⟨x, T(y1)⟩ + ⟨x, T(y2)⟩ = ⟨x, T(y1) + T(y2)⟩. It follows that T(y1 + y2) = T(y1) + T(y2). Now suppose c is a scalar and y ∈ V. Then [x, cy] = ⟨x, T(cy)⟩. On the other hand, [x, cy] = c[x, y] = c⟨x, T(y)⟩ = ⟨x, cT(y)⟩, and consequently T(cy) = cT(y). Thus, T is an operator on V. Since [ , ] is a symmetric bilinear form we have ⟨x, T(y)⟩ = [x, y] = [y, x] = ⟨y, T(x)⟩ = ⟨T(x), y⟩, the latter equality since ⟨ , ⟩ is symmetric. This implies that T = T* and T is a symmetric operator.

9. i) implies ii). Since A is symmetric there exists an orthonormal basis B = (v1, ..., vn) of R^n consisting of eigenvectors. Assume A v_i = a_i v_i. Since v^tr A v > 0 for all nonzero v ∈ R^n, in particular v_i^tr A v_i = a_i ||v_i||^2 > 0, which implies that a_i > 0. Set u_i = (1/√a_i) v_i and let Q be the matrix with columns equal to the u_i. Then Q^tr A Q = I_n.

ii) implies iii). Assume A is congruent to the identity matrix. Then there is an invertible matrix P such that P^tr A P = I_n. Then (P^{−1})^tr I_n P^{−1} = A. Set Q = P^{−1}. Then A = Q^tr Q.

iii) implies i). Let v ∈ R^n. Then v^tr A v = (v^tr Q^tr)(Qv) = (Qv)^tr (Qv) = ||Qv||^2 > 0 unless Qv = 0. Since Q is invertible, Qv = 0 if and only if v = 0.
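The implication iii) ⇒ i) of Exercise 9 is easy to check concretely; a sketch with an arbitrary invertible Q of our choosing:

```python
from itertools import product

def transpose(M):
    return [list(r) for r in zip(*M)]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

Q = [[1, 2, 0], [0, 1, 3], [0, 0, 1]]   # invertible: unit upper triangular
A = matmul(transpose(Q), Q)             # condition iii): A = Q^tr Q
assert A == transpose(A)                # A is symmetric
# condition i): v^tr A v = ||Qv||^2 > 0 for every nonzero v (small grid check)
for v in product(range(-2, 3), repeat=3):
    if any(v):
        quad = sum(v[i] * A[i][j] * v[j] for i in range(3) for j in range(3))
        assert quad > 0
```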


Chapter 9

Sesquilinear Forms and Unitary Spaces

9.1. Basic Properties of Sesquilinear Forms

1. Let v ∈ V and c ∈ F. Then (T ∘ S)(cv) = T(S(cv)) = T(σ(c)S(v)) = τ(σ(c))T(S(v)) = (τ ∘ σ)(c)[T ∘ S](v). Thus, T ∘ S is a τ ∘ σ-semilinear map.

2. Let f, g ∈ S_σ(V). We need to prove that f + g ∈ S_σ(V). Let v1, v2, w ∈ V. Then

(f + g)(v1 + v2, w) = f(v1 + v2, w) + g(v1 + v2, w) = [f(v1, w) + f(v2, w)] + [g(v1, w) + g(v2, w)] = [f(v1, w) + g(v1, w)] + [f(v2, w) + g(v2, w)] = (f + g)(v1, w) + (f + g)(v2, w).

That (f + g)(w, v1 + v2) = (f + g)(w, v1) + (f + g)(w, v2) is proved in exactly the same way.

Now suppose v, w ∈ V and a ∈ F. Then

(f + g)(av, w) = f(av, w) + g(av, w) = af(v, w) + ag(v, w) = a[f(v, w) + g(v, w)] = a[(f + g)(v, w)].

Also,

(f + g)(v, aw) = f(v, aw) + g(v, aw) = σ(a)f(v, w) + σ(a)g(v, w) = σ(a)[f(v, w) + g(v, w)] = σ(a)[(f + g)(v, w)].

Thus, f + g ∈ S_σ(V).

Now assume f ∈ S_σ(V), c ∈ F. We must show that cf ∈ S_σ(V). Let v1, v2, w ∈ V. Then

(cf)(v1 + v2, w) = c[f(v1 + v2, w)] = c[f(v1, w) + f(v2, w)] = cf(v1, w) + cf(v2, w) = (cf)(v1, w) + (cf)(v2, w).

That (cf)(w, v1 + v2) = (cf)(w, v1) + (cf)(w, v2) is proved in exactly the same way. Finally, let a ∈ F and v, w ∈ V. Then

(cf)(av, w) = cf(av, w) = caf(v, w) = acf(v, w) = a[cf(v, w)] = a[(cf)(v, w)].

Also,

(cf)(v, aw) = cf(v, aw) = cσ(a)f(v, w) = (σ(a)c)f(v, w) = σ(a)[cf(v, w)] = σ(a)[(cf)(v, w)].
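The closure properties of Exercise 2 can be illustrated with V = C^2 and σ = complex conjugation (our model; entries are Gaussian integers so the floating-point complex arithmetic is exact):

```python
# a sesquilinear form from a 2x2 matrix A: f(u, v) = sum u_i A_ij conj(v_j)
def make_form(A):
    def f(u, v):
        return sum(u[i] * A[i][j] * v[j].conjugate()
                   for i in range(2) for j in range(2))
    return f

f = make_form([[1, 2], [0, 1]])
g = make_form([[3, 0], [1, 1j]])
h = lambda u, v: f(u, v) + g(u, v)          # the form f + g
u, v, w = (1 + 2j, 3), (2, 1 - 1j), (1j, 4)
a = 2 - 3j
uv = (u[0] + v[0], u[1] + v[1])
assert h(uv, w) == h(u, w) + h(v, w)                          # additivity
assert h((a * u[0], a * u[1]), v) == a * h(u, v)              # linear, slot 1
assert h(u, (a * v[0], a * v[1])) == a.conjugate() * h(u, v)  # semilinear, slot 2
```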

3. Assume A = M_f(B, B) = (a_ij),

[u]_B = (b1, ..., bn)^tr and [v]_B = (c1, ..., cn)^tr.

Then

f(u, v) = f(Σ_{i=1}^n b_i v_i, Σ_{j=1}^n c_j v_j) = Σ_{i=1}^n Σ_{j=1}^n b_i a_ij σ(c_j) = [u]_B^tr A σ([v]_B).

4. Let B = (v1, ..., vn) be a basis. Denote by g_i the σ-semilinear map from V to F given by g_i(w) = f(v_i, w). We claim that (g1, ..., gn) is linearly independent. For suppose Σ_{i=1}^n a_i g_i = 0_{V→F}. Then for all w ∈ V,

Σ_{i=1}^n a_i g_i(w) = Σ_{i=1}^n a_i f(v_i, w) = 0.

Consequently, f(Σ_{i=1}^n a_i v_i, w) = 0. Thus, Σ_{i=1}^n a_i v_i ∈ Rad_L(f) = {0_V}. Since (v1, ..., vn) is linearly independent it follows that a1 = ... = an = 0. Thus, (g1, ..., gn) is linearly independent as claimed. Since the dimension of the space of σ-semilinear maps from V to F is equal to dim(V) = n, it follows that (g1, ..., gn) is a basis for this space. Since F is a σ-semilinear map from V to F there are unique scalars b1, ..., bn such that F = Σ_{i=1}^n b_i g_i. Set v = b1 v1 + ... + bn vn. Then F(w) = f(v, w) for all w ∈ V.

5. i. Assume f is Hermitian and 1 ≤ i, j ≤ n. Then a_ji = f(v_j, v_i) = σ(f(v_i, v_j)) = σ(a_ij). Thus, A^tr = σ(A). Conversely, if A^tr = σ(A), suppose v = b1 v1 + ... + bn vn and w = c1 v1 + ... + cn vn. Then f(w, v) = [w]_B^tr A σ([v]_B). Since f(w, v) is a scalar we have

f(w, v) = [w]_B^tr A σ([v]_B) = σ([v]_B)^tr A^tr [w]_B = σ([v]_B^tr σ(A^tr) σ([w]_B)) = σ([v]_B^tr A σ([w]_B)) = σ(f(v, w)).

ii. This is proved exactly as i.

6. Let g_i be the σ-semilinear map such that g_i(v_i) = 1 and g_i(v_j) = 0 for j ≠ i. By Lemma (9.5) there is a vector v_i′ such that f(v_i′, w) = g_i(w) for all w. Thus, f(v_i′, v_i) = g_i(v_i) = 1 and f(v_i′, v_j) = g_i(v_j) = 0 for j ≠ i.

7. Assume σ^2 = I_F. Then f is Hermitian, hence reflexive. To see this, let v = Σ_{i=1}^n a_i v_i and w = Σ_{i=1}^n b_i v_i. Then

f(v, w) = Σ_{i=1}^n a_i σ(b_i).

On the other hand,

f(w, v) = Σ_{i=1}^n b_i σ(a_i).

If σ^2 = I_F then f(w, v) = σ(f(v, w)), so f is Hermitian and reflexive.

Assume dim(V) ≥ 2 and σ^2 ≠ I_F. Let a ∈ F with σ^2(a) ≠ a. Then

f(σ(a)v1 + v2, v1 − av2) = σ(a) − σ(a) = 0.

On the other hand,

f(v1 − av2, σ(a)v1 + v2) = σ^2(a) − a ≠ 0.

Thus, f is not reflexive.

8. We can view F as a two-dimensional vector space over E. The map tr_{F/E} is a linear map. Therefore either tr_{F/E} is the zero map or it is surjective. Suppose the characteristic of F is not two. Then the map a → −a is not an automorphism of F and there exists an a such that a + σ(a) ≠ 0. It follows that in this case the map is surjective. Suppose the characteristic is two. Since σ ≠ I_F there is an a such that a ≠ σ(a), which implies that a + σ(a) ≠ 0, and once again the map is surjective.


9.2. Unitary Space


1. Suppose (v1, ..., vn) is linearly dependent. Then for some k, v_k is a linear combination of (v1, ..., v_{k−1}). Say v_k = a1 v1 + ... + a_{k−1} v_{k−1}. Then f(v_k, v_k) = f(v_k, a1 v1 + ... + a_{k−1} v_{k−1}) = σ(a1) f(v_k, v1) + ... + σ(a_{k−1}) f(v_k, v_{k−1}) = 0, which contradicts the assumption that f(v_i, v_i) ≠ 0 for 1 ≤ i ≤ n.

2. If T is an isometry then f(w_i, w_j) = f(T(v_i), T(v_j)) = f(v_i, v_j) for all i and j. Conversely, assume that f(w_i, w_j) = f(v_i, v_j) for all i and j. Assume v = Σ_{i=1}^n a_i v_i and w = Σ_{i=1}^n b_i v_i. Then

f(v, w) = f(Σ_i a_i v_i, Σ_j b_j v_j) = Σ_{i,j} a_i σ(b_j) f(v_i, v_j).

On the other hand, T(v) = Σ_{i=1}^n a_i T(v_i) = Σ_{i=1}^n a_i w_i and, similarly, T(w) = Σ_{i=1}^n b_i w_i. Then

f(T(v), T(w)) = f(Σ_i a_i w_i, Σ_j b_j w_j) = Σ_{i,j} a_i σ(b_j) f(w_i, w_j) = f(v, w).

3. Since f is non-degenerate there exists a vector u such that f(v, u) ≠ 0. Now the result follows from Lemma (9.14).

4. Let x = av1 + bw1 and y = cv1 + dw1. Then f(x, y) = aσ(d) + bσ(c). By the definition of T we have T(x) = av2 + bw2 and T(y) = cv2 + dw2. We then have f(T(x), T(y)) = f(av2 + bw2, cv2 + dw2) = aσ(d) + bσ(c) = f(x, y).

5. Suppose to the contrary that dim(U1) ≠ dim(U2). Then without loss of generality we can assume dim(U1) = l < m = dim(U2). Let (v1, ..., vl) be a basis for U1 and (w1, ..., wm) a basis for U2. Let τ be the linear transformation from Span(w1, ..., wl) to U1 such that τ(w_i) = v_i. Then τ is an isometry. By Witt's theorem there is an isometry T of V such that T restricted to Span(w1, ..., wl) is τ. Set U2′ = T(U2). Then U2′ is a totally singular subspace of V and U2′ properly contains U1, contradicting the assumption that U1 is a maximal totally singular subspace.

6. By Witt's theorem there is an isometry T of V such that T(U1) = U2. Let x ∈ U2 and y ∈ T(U1^⊥). We claim that f(x, y) = 0, from which it will follow that T(U1^⊥) ⊂ U2^⊥. Since U2 = T(U1) there is a vector u ∈ U1 such that x = T(u). Since y ∈ T(U1^⊥) there is a v ∈ U1^⊥ such that y = T(v). Now f(x, y) = f(T(u), T(v)) = f(u, v) = 0. Now dim(U1^⊥) = dim(V) − dim(U1) = dim(V) − dim(U2) = dim(U2^⊥). Since dim(T(U1^⊥)) = dim(U1^⊥), it now follows that T(U1^⊥) = U2^⊥.

7. Let x be an anisotropic vector and let y ∈ x^⊥. Since N is surjective, we can assume that f(x, x) = 1 and f(y, y) = −1. Then v = x − y is isotropic.

8. We prove this by induction on n ≥ 2 (there is nothing to prove for n = 1). The base case is covered by Exercise 7. Assume that n > 2, that the result has been proved for all non-degenerate unitary spaces (W, g) with dim(W) < n, and that dim(V) = n. By Exercise 7 the space is isotropic. By Exercise 3 there exists a hyperbolic pair (x, y). U = Span(x, y) is non-degenerate and therefore U^⊥ is a non-degenerate subspace of dimension n − 2. By the inductive hypothesis there exists a totally isotropic subspace M of U^⊥ with dim(M) = (n−2)/2. Then M′ = M + Span(x) is a totally isotropic subspace and dim(M′) = (n−2)/2 + 1 = n/2.

9. Let I be the set of all isotropic vectors and set U = Span(I). If we can show that Span(I) = V then I contains a basis of V and we are done. Fix an isotropic vector x and let y ∈ V be arbitrary. If y is isotropic then y ∈ Span(I) and there is nothing to prove. So assume y is anisotropic. If f(x, y) ≠ 0 then Span(x, y) is non-degenerate and by Corollary (9.3) there is an isotropic


vector z ∈ Span(x, y) such that f(x, z) = 1. Then y ∈ Span(x, y) = Span(x, z) ⊂ Span(I). We may therefore assume that x ⊥ y. Now y^⊥ is non-degenerate and contains x, so there is an isotropic vector u ∈ y^⊥ such that f(x, u) = 1, and then U = Span(x, y, u) is a non-degenerate three-dimensional subspace. Let W be a two-dimensional subspace of U containing x such that W ≠ Span(x, y), Span(x, u). Then W is non-degenerate and by Corollary (9.3) there is an isotropic vector z ∈ W such that f(x, z) = 1 and W = Span(x, z). Then y ∈ U = Span(x, u, z) ⊂ Span(I).

10. The proof is by induction on n = dim(V). If dim(V) = 1 then any non-zero vector is an orthogonal basis. So assume n > 1 and the result is true for all non-degenerate spaces of dimension n − 1. Since (V, f) is non-degenerate there exists a vector v such that f(v, v) ≠ 0. Set U = v^⊥, a non-degenerate subspace of dimension n − 1. By the inductive hypothesis there exists an orthogonal basis (v1, ..., v_{n−1}) for U. Set v_n = v. Then (v1, ..., vn) is an orthogonal basis for V.
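The mechanism behind Exercise 7 is easy to see concretely; a sketch with the indefinite Hermitian form f(u, v) = u1 conj(v1) − u2 conj(v2) on C^2 (our model):

```python
def f(u, v):
    return u[0] * v[0].conjugate() - u[1] * v[1].conjugate()

x, y = (1, 0), (0, 1)
assert f(x, x) == 1 and f(y, y) == -1 and f(x, y) == 0   # norms 1, -1; x ⊥ y
v = (x[0] - y[0], x[1] - y[1])
assert f(v, v) == 0     # v = x - y is isotropic
```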


Chapter 10

Tensor Products

10.1. Introduction to Tensor Products

1. Assume V1, V2 are finite dimensional and suppose B1 = (v1, ..., vm), B2 = (w1, ..., wn). If f̂ exists then we must have

f̂(Σ_{i=1}^m a_i v_i, Σ_{j=1}^n b_j w_j) = Σ_{i=1}^m Σ_{j=1}^n a_i b_j f(v_i, w_j).

In fact, defining f̂ in this way gives a bilinear function with f̂ restricted to B1 × B2 equal to f. On the other hand, if V1 or V2 is not finite dimensional, then for any finite subsets B1′ of B1 and B2′ of B2 we can use this to define a bilinear map on Span(B1′) × Span(B2′). One can then use Zorn's lemma to show that these can be extended to all of V1 × V2. We omit the details.

2. Define a map θ : V1 × V2 → V2 ⊗ V1 by θ(x, y) = y ⊗ x. This map is bilinear. Since V1 ⊗ V2 is universal for V1 × V2 there exists a linear transformation T : V1 ⊗ V2 → V2 ⊗ V1 such that T(x ⊗ y) = y ⊗ x. In a similar way we get a linear transformation S : V2 ⊗ V1 → V1 ⊗ V2 such that S(y ⊗ x) = x ⊗ y. Then ST = I_{V1⊗V2} and TS = I_{V2⊗V1}.

3. Let x, x1, x2 ∈ V1, y, y1, y2 ∈ V2 and c ∈ F. Since f1 ∈ L(V1, F) we have

f(x1 + x2, y) = f1(x1 + x2)f2(y) = [f1(x1) + f1(x2)]f2(y).

By the distributive property in F we can conclude that [f1(x1) + f1(x2)]f2(y) = f1(x1)f2(y) + f1(x2)f2(y) = f(x1, y) + f(x2, y). In exactly the same way we can prove that f(x, y1 + y2) = f(x, y1) + f(x, y2).

Now to the scalar property: f(cx, y) = f1(cx)f2(y) = [cf1(x)]f2(y) = c[f1(x)f2(y)] = cf(x, y). Similarly, f(x, cy) = f1(x)f2(cy) = f1(x)[cf2(y)] = [f1(x)c]f2(y) = [cf1(x)]f2(y) = c[f1(x)f2(y)] = cf(x, y).

4. Let (v1, v2) be linearly independent in V and (w1, w2) be linearly independent in W and set x = v1 ⊗ w1 + v2 ⊗ w2. Then x is indecomposable. Suppose to the contrary that there are vectors v ∈ V and w ∈ W such that x = v ⊗ w. (v1, v2) and (w1, w2) can each be extended to independent sequences (v1, ..., vm) and (w1, ..., wn), respectively, such that v ∈ Span(v1, ..., vm) and w ∈ Span(w1, ..., wn). Write v = a1 v1 + ... + am vm and w = b1 w1 + ... + bn wn. Then

v ⊗ w = Σ_{i,j} a_i b_j v_i ⊗ w_j.

Since {v_i ⊗ w_j | 1 ≤ i ≤ m, 1 ≤ j ≤ n} is linearly independent we must have a1 b2 = 0 = a2 b1 and a1 b1 = 1 = a2 b2, which gives a contradiction.
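In coordinates, a pure tensor v ⊗ w in F^2 ⊗ F^2 is the rank-one matrix v w^tr, so Exercise 4 amounts to the fact that the identity matrix has rank two; a quick sketch:

```python
def outer(v, w):
    """The matrix of v ⊗ w in the standard bases: v w^tr."""
    return [[vi * wj for wj in w] for vi in v]

def add(M, N):
    return [[M[i][j] + N[i][j] for j in range(len(M[0]))]
            for i in range(len(M))]

v1, v2 = (1, 0), (0, 1)
w1, w2 = (1, 0), (0, 1)
x = add(outer(v1, w1), outer(v2, w2))   # v1⊗w1 + v2⊗w2 = identity matrix
d = x[0][0] * x[1][1] - x[0][1] * x[1][0]
assert d == 1   # nonzero determinant: x cannot equal any v⊗w
# every pure tensor has determinant zero
for v in [(1, 2), (3, -1)]:
    for w in [(2, 5), (0, 7)]:
        m = outer(v, w)
        assert m[0][0] * m[1][1] - m[0][1] * m[1][0] == 0
```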


5. Set W′ = Span(w1, . . . , wn), a finite dimensional subspace of W. Let (z1, . . . , zs) be a basis for W′. We can express each wj as a linear combination of (z1, . . . , zs): wj = Σ_{i=1}^{s} aij zi. Now Σ_{j=1}^{n} xj ⊗ wj = Σ_{j=1}^{n} Σ_{i=1}^{s} aij xj ⊗ zi. However, {xj ⊗ zi | 1 ≤ j ≤ n, 1 ≤ i ≤ s} is linearly independent and therefore each aij = 0, whence wj = 0_W for all j.

6. Before proceeding we claim that for any basis (x1, . . . , xs) of V and any z ∈ Z there are vectors yi ∈ W such that z = f(x1, y1) + · · · + f(xs, ys). Thus, assume that z = f(v1, w1) + · · · + f(vm, wm). We can express vj in terms of the basis (x1, . . . , xs): vj = Σ_{i=1}^{s} aij xi. Now set yi = Σ_{j=1}^{m} aij wj. Then by the bilinearity of f it follows that z = f(v1, w1) + · · · + f(vm, wm) = f(x1, y1) + · · · + f(xs, ys). Now by hypothesis b) it follows that for z ∈ Z there are unique yi ∈ W such that z = f(x1, y1) + · · · + f(xs, ys).

Now assume that g : V × W → X is a bilinear map. We need to show that there exists a unique linear map ĝ : Z → X such that ĝ ∘ f = g. Clearly the only possible way to define ĝ is as follows: suppose z = Σ_{i=1}^{s} f(xi, yi). Then ĝ(z) = Σ_{i=1}^{s} g(xi, yi). By the uniqueness of expression for z, ĝ is a well-defined function. However, we need to demonstrate that it is linear. Thus, let z, z′ ∈ Z. We need to prove that ĝ(z + z′) = ĝ(z) + ĝ(z′). Assume that z = Σ_{i=1}^{s} f(xi, yi) and z′ = Σ_{i=1}^{s} f(xi, yi′). Since f is bilinear we have z + z′ = Σ_{i=1}^{s} f(xi, yi + yi′). Then

ĝ(z + z′) = Σ_{i=1}^{s} g(xi, yi + yi′).

By the bilinearity of g we have

Σ_{i=1}^{s} g(xi, yi + yi′) = Σ_{i=1}^{s} [g(xi, yi) + g(xi, yi′)] = Σ_{i=1}^{s} g(xi, yi) + Σ_{i=1}^{s} g(xi, yi′) = ĝ(z) + ĝ(z′).

Next assume that z ∈ Z and c ∈ F. If z = Σ_{i=1}^{s} f(xi, yi) then by the bilinearity of f we have cz = Σ_{i=1}^{s} f(xi, cyi) so that

ĝ(cz) = Σ_{i=1}^{s} g(xi, cyi) = Σ_{i=1}^{s} c g(xi, yi) = c Σ_{i=1}^{s} g(xi, yi) = c ĝ(z).

7. By the universal property of the tensor product we know for all f ∈ B(V, W; Z) there exists a linear map f̂ : V ⊗ W → Z. Let θ denote the map f ↦ f̂ so that θ is a map from B(V, W; Z) to L(V ⊗ W, Z). We need to prove that θ is linear and bijective.

We first show that θ is additive: Assume f, g ∈ B(V, W; Z). Then f̂, ĝ are the unique linear maps from V ⊗ W to Z such that f̂(v ⊗ w) = f(v, w) and ĝ(v ⊗ w) = g(v, w). Now f̂ + ĝ is a linear map from V ⊗ W to Z. Computing (f̂ + ĝ)(v ⊗ w) we get

f̂(v ⊗ w) + ĝ(v ⊗ w) = f(v, w) + g(v, w) = (f + g)(v, w).

By uniqueness, θ(f + g) = f̂ + ĝ = θ(f) + θ(g).

We next show that homogeneity holds: if c is a scalar then θ(cf) = c θ(f). Indeed,

θ(cf)(v ⊗ w) = (cf)(v, w) = c f(v, w) = c f̂(v ⊗ w),

so by uniqueness θ(cf) = c f̂, as required.

It remains to show that θ is an isomorphism. Injectivity is easy: If f, g ∈ B(V, W; Z) are distinct then there exist v ∈ V, w ∈ W such that f(v, w) ≠ g(v, w). Then f̂(v ⊗ w) = f(v, w) ≠ g(v, w) = ĝ(v ⊗ w).

It remains to show surjectivity. Assume T ∈ L(V ⊗ W, Z). Define t : V × W → Z by t(v, w) = T(v ⊗ w). Since T is linear and the tensor product is bilinear it follows that t is bilinear. Thus, t ∈ B(V, W; Z). Clearly t̂ = T and so θ is surjective.

8. The map µ : F × V → V given by µ(c, v) = cv is bilinear. We therefore have a linear transformation from F ⊗ V to V such that c ⊗ v ↦ cv. Denote this map by T. Define S : V → F ⊗ V by S(v) = 1 ⊗ v. This map is linear. Consider the composition ST(c ⊗ v) = S(cv) = 1 ⊗ (cv) = c ⊗ v so that ST = I_{F⊗V}. Similarly, TS(v) = T(1 ⊗ v) = v and TS = I_V.

9. To avoid confusion we denote the tensor product of X and Y by X ⊗′ Y (as well as products of elements in this space). Define θ : X × Y → V ⊗ W by θ(x, y) = x ⊗ y (since this is in V ⊗ W there is no prime). By the universal property for the tensor product there exists a linear map θ̂ : X ⊗′ Y → V ⊗ W such that θ̂(x ⊗′ y) = θ(x, y) = x ⊗ y (there is no prime in the latter expression since this is the tensor product of the two elements in V ⊗ W). Since Z is the subspace of V ⊗ W generated by all elements x ⊗ y, Range(θ̂) = Z. We need to prove that θ̂ is injective. Suppose θ̂(u) = 0_{V⊗W} where u ∈ X ⊗′ Y. Then there exist a linearly independent sequence (x1, . . . , xm) in X, a linearly independent sequence (y1, . . . , yn) in Y and scalars aij such that u = Σ_{i,j} aij xi ⊗′ yj. We then have θ̂(Σ_{i,j} aij xi ⊗′ yj) = 0_{V⊗W}. However, θ̂(Σ_{i,j} aij xi ⊗′ yj) = Σ_{i,j} aij xi ⊗ yj (now this tensor product is in V ⊗ W). The vectors {xi ⊗ yj ∈ V ⊗ W | 1 ≤ i ≤ m, 1 ≤ j ≤ n} are linearly independent and therefore aij = 0 and u = 0_{X⊗′Y}.

10. Let (y1, . . . , yk) be a basis of Y1 ∩ Y2. Let (y1, . . . , yk, w1, . . . , wl) be a basis for Y1 and (y1, . . . , yk, z1, . . . , zm) be a basis for Y2. Suppose u ∈ (V ⊗ Y1) ∩ (V ⊗ Y2). Since u ∈ V ⊗ Y1 there are unique vectors u1, . . . , uk, v1, . . . , vl ∈ V such that

u = u1 ⊗ y1 + · · · + uk ⊗ yk + v1 ⊗ w1 + · · · + vl ⊗ wl.

On the other hand, since u ∈ V ⊗ Y2 there are unique vectors u1′, . . . , uk′, x1, . . . , xm such that

u = u1′ ⊗ y1 + · · · + uk′ ⊗ yk + x1 ⊗ z1 + · · · + xm ⊗ zm.

However, this implies that

(u1 − u1′) ⊗ y1 + · · · + (uk − uk′) ⊗ yk + v1 ⊗ w1 + · · · + vl ⊗ wl − x1 ⊗ z1 − · · · − xm ⊗ zm = 0_{V⊗W}.

Since (y1, . . . , yk, w1, . . . , wl, z1, . . . , zm) is linearly independent, by Exercise 5 we have u1 − u1′ = · · · = uk − uk′ = v1 = · · · = vl = x1 = · · · = xm = 0_V. Thus, u ∈ V ⊗ (Y1 ∩ Y2).


10.2. Properties of Tensor Products

1. If π fixes one of 1, 2 or 3 this follows from the fact that V ⊗ W is isomorphic to W ⊗ V by Exercise 2 of Section (10.1). We illustrate the proof for π = (123). Let f : V1 × V2 × V3 → V2 ⊗ V3 ⊗ V1 be given by f(x, y, z) = y ⊗ z ⊗ x. This is a 3-linear map. By the universal property of the tensor product we have a linear map f̂ : V1 ⊗ V2 ⊗ V3 → V2 ⊗ V3 ⊗ V1 such that f̂(x ⊗ y ⊗ z) = y ⊗ z ⊗ x.

Likewise we can define a trilinear map g : V2 × V3 × V1 → V1 ⊗ V2 ⊗ V3 given by g(y, z, x) = x ⊗ y ⊗ z. There is then a linear map ĝ : V2 ⊗ V3 ⊗ V1 → V1 ⊗ V2 ⊗ V3 such that ĝ(y ⊗ z ⊗ x) = x ⊗ y ⊗ z. Then ĝf̂ = I_{V1⊗V2⊗V3} and f̂ĝ = I_{V2⊗V3⊗V1}.
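At the matrix level, the bookkeeping behind these isomorphisms of iterated tensor products can be checked with the Kronecker product. The helper `kron` below is an illustrative stdlib implementation of ours, not from the text; the assertion mirrors the identification V1 ⊗ (V2 ⊗ V3) = (V1 ⊗ V2) ⊗ V3:

```python
def kron(A, B):
    """Kronecker product: block (i, j) of the result is A[i][j] * B."""
    return [[A[i][j] * B[k][l]
             for j in range(len(A[0])) for l in range(len(B[0]))]
            for i in range(len(A)) for k in range(len(B))]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 1]]
C = [[2, 5], [7, 1]]

# Associativity of the Kronecker product, the matrix analogue of
# V1 (x) (V2 (x) V3) = (V1 (x) V2) (x) V3 with compatible bases.
lhs = kron(kron(A, B), C)
rhs = kron(A, kron(B, C))
```
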

2. If vi ∈ Vi then (S1 ⊗ · · · ⊗ Sm)(v1 ⊗ · · · ⊗ vm) = S1(v1) ⊗ · · · ⊗ Sm(vm) ∈ R1 ⊗ · · · ⊗ Rm. Since V1 ⊗ · · · ⊗ Vm is generated by all vectors of the form v1 ⊗ · · · ⊗ vm it follows that Range(S1 ⊗ · · · ⊗ Sm) ⊂ R1 ⊗ · · · ⊗ Rm. We need to prove the reverse inclusion. Suppose ri ∈ Ri = Range(Si). Let vi ∈ Vi such that Si(vi) = ri. Then (S1 ⊗ · · · ⊗ Sm)(v1 ⊗ · · · ⊗ vm) =




S1(v1) ⊗ · · · ⊗ Sm(vm) = r1 ⊗ · · · ⊗ rm and consequently R1 ⊗ · · · ⊗ Rm ⊂ Range(S1 ⊗ · · · ⊗ Sm).

3. We do a proof by induction on m ≥ 2. For the base case let dim(Vi) = ni and dim(Ki) = ki for i = 1, 2. Let (v1, . . . , vk1) be a basis for K1 and extend to a basis (v1, . . . , vn1) for V1. Similarly, let (u1, . . . , uk2) be a basis for K2 and extend to a basis (u1, . . . , un2) for V2. Note that the sequence (S1(vk1+1), . . . , S1(vn1)) in W1 is linearly independent as is the sequence (S2(uk2+1), . . . , S2(un2)) in W2. As a consequence the set of vectors {S1(vi) ⊗ S2(uj) | k1+1 ≤ i ≤ n1, k2+1 ≤ j ≤ n2} is linearly independent. Assume now that x = Σ_{i,j} aij vi ⊗ uj ∈ K. Applying S1 ⊗ S2 we get

(S1 ⊗ S2)(Σ_{i,j} aij vi ⊗ uj) = Σ_{i,j} aij S1(vi) ⊗ S2(uj) = Σ_{i=k1+1}^{n1} Σ_{j=k2+1}^{n2} aij S1(vi) ⊗ S2(uj) = 0_{W1⊗W2}.

Since {S1(vi) ⊗ S2(uj) | k1+1 ≤ i ≤ n1, k2+1 ≤ j ≤ n2} is linearly independent, aij = 0 for k1+1 ≤ i ≤ n1, k2+1 ≤ j ≤ n2. This implies that x ∈ K1 ⊗ V2 + V1 ⊗ K2 = X1 + X2.

Assume the result is true for a tensor product of m spaces and that we have Si : Vi → Wi for 1 ≤ i ≤ m+1. Set V2′ = V2 ⊗ · · · ⊗ Vm+1 and S2′ = S2 ⊗ · · · ⊗ Sm+1. Then V1 ⊗ · · · ⊗ Vm+1 is equal to V1 ⊗ V2′ and S1 ⊗ · · · ⊗ Sm+1 = S1 ⊗ S2′. Set K2′ = Ker(S2′). Also set Y2 = K2 ⊗ V3 ⊗ · · · ⊗ Vm+1 and Yi = V2 ⊗ · · · ⊗ Vi−1 ⊗ Ki ⊗ Vi+1 ⊗ · · · ⊗ Vm+1. By the inductive hypothesis K2′ = Y2 + · · · + Ym+1. Also, by the base case, K = K1 ⊗ V2′ + V1 ⊗ K2′. However, V1 ⊗ K2′ = V1 ⊗ (Y2 + · · · + Ym+1) = X2 + · · · + Xm+1 while K1 ⊗ V2′ = X1. This completes the proof.

4. Let S : F^l → F^k be the transformation such that S(v) = Av and T : F^n → F^m be the transformation such that T(w) = Bw. Then A ⊗ B is the matrix of the transformation S ⊗ T : F^l ⊗ F^n → F^k ⊗ F^m with respect to the bases obtained by taking tensor products of the standard basis vectors in lexicographical order. Then rank(A ⊗ B) = dim(Range(S ⊗ T)). On the other hand, rank(A) = dim(Range(S)) and rank(B) = dim(Range(T)). By Exercise 2 the range of S ⊗ T is Range(S) ⊗ Range(T), which has dimension dim(Range(S)) × dim(Range(T)).

5. Let dim(V) = m, dim(W) = n. Assume neither S nor T is nilpotent. Let f(x) = µS(x) and g(x) = µT(x). Then f(x) does not divide x^m and g(x) does not divide x^n. Let p(x) ≠ x be an irreducible factor of f(x) and q(x) ≠ x be an irreducible factor of g(x). Let x ∈ V such that µ_{S,x}(x) = p(x) and y ∈ W such that µ_{T,y}(x) = q(x). Set X = ⟨S, x⟩ and Y = ⟨T, y⟩. Then X ⊗ Y is invariant under S ⊗ I_W and I_V ⊗ T and therefore under S ⊗ T = (S ⊗ I_W)(I_V ⊗ T). Set Z = X ⊗ Y and denote by (S ⊗ T)|_Z the restriction of S ⊗ T to Z. It follows from Exercise 2 that S ⊗ T is surjective, therefore injective, on Z. In particular, Ker((S ⊗ T)|_Z) is trivial. However, if S ⊗ T were nilpotent then for any invariant subspace U of V ⊗ W the kernel of the restriction of S ⊗ T to U would be non-trivial. Thus, S ⊗ T is not nilpotent.

6. Let vi ∈ V be an eigenvector of S with eigenvalue αi and wj an eigenvector of T with eigenvalue βj. Then vi ⊗ wj is an eigenvector of S ⊗ T with eigenvalue αiβj. Consequently, S ⊗ T is diagonalizable. On the other hand, since the eigenvalues αiβj are all distinct, the minimum polynomial of S ⊗ T is

Π_{i=1}^{m} Π_{j=1}^{n} (x − αiβj),

which has degree mn, and therefore S ⊗ T is cyclic.

7. For any cyclic diagonalizable operator S : V → V the operator S ⊗ S : V ⊗ V → V ⊗ V will not be cyclic. For example, let S : R^2 → R^2 be given by multiplication by

A =
[1 0]
[0 2].

Then



A ⊗ A =
[1 0 0 0]
[0 2 0 0]
[0 0 2 0]
[0 0 0 4].

So the eigenvalue 2 occurs with algebraic multiplicity 2 and the operator is not cyclic.

8. Let v ∈ V such that µ_{S,v}(x) = (x − α)^k and let w ∈ W such that µ_{T,w}(x) = (x − β)^l. It suffices to prove that S ⊗ T restricted to ⟨S, v⟩ ⊗ ⟨T, w⟩ has minimum polynomial dividing (x − αβ)^{kl}.

Set v1 = v and for i < k set vi+1 = (S − αI_V)vi. Similarly, set w1 = w and for j < l, wj+1 = (T − βI_W)wj. Then (v1, . . . , vk) is a basis for ⟨S, v⟩ and (w1, . . . , wl) is a basis for ⟨T, w⟩. Let S̄ be the restriction of S to ⟨S, v⟩ and T̄ the restriction of T to ⟨T, w⟩. The matrix of S̄ with respect to (v1, . . . , vk) is the k × k matrix

A =
[α 0 0 . . . 0]
[1 α 0 . . . 0]
[. . . . . . .]
[0 0 0 . . . α].

The matrix of T̄ with respect to (w1, . . . , wl) is the l × l matrix

B =
[β 0 0 . . . 0]
[1 β 0 . . . 0]
[. . . . . . .]
[0 0 0 . . . β].

The matrix A ⊗ B is a lower triangular kl × kl matrix with αβ on the diagonal. This implies the result.

9. Let c, d ∈ K and v = Σ_{i=1}^{n} ai ⊗_F vi. Then

(c + d)v = (c + d) Σ_{i=1}^{n} ai ⊗_F vi = Σ_{i=1}^{n} (c + d)ai ⊗_F vi = Σ_{i=1}^{n} (cai + dai) ⊗_F vi = Σ_{i=1}^{n} cai ⊗_F vi + Σ_{i=1}^{n} dai ⊗_F vi = c Σ_{i=1}^{n} ai ⊗_F vi + d Σ_{i=1}^{n} ai ⊗_F vi = cv + dv.

We need also to compute (cd)v:

(cd)v = (cd) Σ_{i=1}^{n} ai ⊗_F vi = Σ_{i=1}^{n} (cd)ai ⊗_F vi = Σ_{i=1}^{n} c(dai) ⊗_F vi = c Σ_{i=1}^{n} dai ⊗_F vi = c[d Σ_{i=1}^{n} ai ⊗_F vi] = c(dv).

10. That B is linearly independent follows from Exercise 5 of Section (10.1). We show that B spans V_K. Since every element of V_K is a sum of elements of the form c ⊗_F w for w ∈ V, it suffices to prove that such an element belongs to Span(B). Since B is a basis for V there exist scalars ai ∈ F such that w = a1v1 + · · · + anvn. Then

c ⊗_F w = c ⊗_F (a1v1 + · · · + anvn) = (ca1) ⊗_F v1 + · · · + (can) ⊗_F vn ∈ Span(B).
11. Let dim(V ) = m, dim(W ) = n. Then by Exercise 11, dimK (VK ) = m, dimK (WK ) = n. Then dim(L(V, W )) = mn = dimK (L(VK , WK ) = dimK (L(V, W )K . Thus, L(VK , WK ) and L(V, W )K are isomorphic as spaces over K. 12. Let B1 = (u1 , . . . , um ) be a basis for V1 and B2 = (w1 , . . . , wn ) be a basis for V2 . Let B be the basis of V1 ⊗ V2 consisting of the set {ui ⊗ wj |1 ≤ i ≤ m, 1 ≤ j ≤ n} ordered lexicographically. Let A = MS1 (B1 , B1 ) and B = MS2 (B2 , B2 ) so that MS1 ⊗S2 (B , B ) = A⊗B. Assume the (k, l)−entry of A is aij and the (k, l)−entry of B is bkl . Then the diagonal

02/06/15 3:12 pm

102

Chapter 10. Tensor Products

entries of A ⊗ B are aii bjj where 1 ≤ i ≤ m, 1 ≤ j ≤ n. Then The sum is then (a11 + · · · + amm )(b11 + · · · + bnn ) = T race(A)T race(B).

det(E1 )n . . . det(Ek )n det(F1 )m . . . det(Fl )m =

13. Suppose E is an elimination matrix and is upper triangular. Then E ⊗ In is also upper triangular with 1’s on the diagaonal and therefore det(E ⊗ In ) = 1 = 1n . Similarly, if E is lower triangular then det(E ⊗ In ) = 1. Suppose E is an exchange matrix, specifically obtained by exchanging the i and j rows of the identity matrix Im . Then E ⊗ In can be obtained from the identity matrix by exchanging rows n(i − 1) + k with row n(j − 1) + k for 1 ≤ k ≤ n. Therefore det(E ⊗ In ) = (−1)n = det(E)n . Finally, assume that E is a scaling matrix obtained by multiplying the i row of Im by c. Then E ⊗ In is obtained from the mn−identity matrix by multiplying rows n(i−1)+k by c for 1 ≤ k ≤ n and is therefore a diagonal matrix with n diagonal entries equal to c and the remaining equal to 1. Then det(E ⊗ In ) = cn = det(E)n .

[det(E1 ) . . . det(Ek )]n [det(F1 ) . . . det(Fl )]m = det(A)n det(B)m .

10.3. The Tensor Algebra 1. Suppose f1 , f2 ∈ ⊕i∈I Vi . Set I1 = spt(f1 ), I2 = spt(f2 ). Let J = I1 ∩ I2 , J = {j ∈ I|f2 (j) = −f1 (j)} and J ∗ = J \ J . Further set I1 = I1 \ J ∗ , I2 = I2 \ J ∗ . Then spt(f1 + f2 ) = I1 ∪ I2 \ J = I1 ∪ I2 ∪ J ∗ . We now compute G(f1 + f2 ) :

det(E1 ⊗In ) . . . det(Ek ⊗In )det(Im ⊗F1 ) . . . det(Im ⊗Fl ). By Exercise 14, det(Ei ⊗ In ) = det(Ei )n and det(Im ⊗ Fj ) = det(Fj )m .

K23692_SM_Cover.indd 110

gi ([f1 + f2 ](i)) =

i∈spt(f1 +f2 )

Let B1 = (u1 , . . . , um ) be a basis for V1 and B2 = (w1 , . . . , wn ) be a basis for V2 . Let B be the basis of V1 ⊗ V2 consisting of the set {ui ⊗ wj |1 ≤ i ≤ m, 1 ≤ j ≤ n} ordered lexicographically. Let A = MS1 (B1 , B1 ) and B = MS2 (B2 , B2 ) so that MS1 ⊗S2 (B , B ) = A ⊗ B. Then det(S1 ⊗ S2 ) = det(A ⊗ B).

det(A ⊗ B) =

G(f1 + f2 ) =

14. If either S1 or S2 is singular then so is S1 ⊗ S2 and then the result clearly holds. Therefore we may assume that Si is invertible for i = 1, 2.

Let E1 , . . . , Ek be m × m elementary matrices such that S1 = E1 E2 . . . Ek and let F1 , . . . , Fl be n×n elementary matrices such that S2 = F1 F2 . . . Fl . Then A ⊗ B = (E1 ⊗ In ) . . . (Ek ⊗ In )(Im ⊗ F1 ) . . . (Im ⊗ Fl ). Since the determinant is multiplicative we have

det(A ⊗ B) =

gi ([f1 + f2 ](i)) =

i∈I1 ∪I2 ∪J ∗

gi ([f1 + f2 ](i)) +

i∈I1

i∈I1

gi ([f1 + f2 ](i))+

i∈I2

gi ([f1 + f2 ](i)) =

i∈J ∗

gi (f1 (i) + f2 (i)) +

i∈I1

gi (f1 (i) + f2 (i))+

i=I2

gi (f1 (i) + f2 (i)) =

i∈J ∗

gi (f1 (i)) +

gi (f2 (i)) +

i∈I2

gi (f1 (i)) +

i∈I1

gi (f1 (i) + f2 (i)) =

i∈J ∗

gi (f2 (i))+

i∈I2

[gi (f1 (i)) + gi (f2 (i))

(10.1)

i∈J ∗




For i ∈ J′ we have f1(i) + f2(i) = 0 and therefore

Σ_{i∈J′} gi(f1(i) + f2(i)) = 0_W, that is, Σ_{i∈J′} gi(f1(i)) + Σ_{i∈J′} gi(f2(i)) = 0_W.   (10.2)

Adding (10.2) to (10.1) we obtain

Σ_{i∈I1′} gi(f1(i)) + Σ_{i∈J′} gi(f1(i)) + Σ_{i∈I2′} gi(f2(i)) + Σ_{i∈J′} gi(f2(i)) + Σ_{i∈J*} [gi(f1(i)) + gi(f2(i))].   (10.3)

Rearranging and combining the terms in (10.3) we get

Σ_{i∈I1′∪J*∪J′} gi(f1(i)) + Σ_{i∈I2′∪J*∪J′} gi(f2(i)).   (10.4)

Since spt(f1) = I1′ ∪ J* ∪ J′ and spt(f2) = I2′ ∪ J* ∪ J′ we get that (10.4) is equal to

Σ_{i∈spt(f1)} gi(f1(i)) + Σ_{i∈spt(f2)} gi(f2(i)) = G(f1) + G(f2).

Now suppose f ∈ ⊕_{i∈I} Vi and 0 ≠ c ∈ F is a scalar. Then spt(cf) = spt(f). Now

G(cf) = Σ_{i∈spt(cf)} gi([cf](i)) = Σ_{i∈spt(f)} gi(cf(i)) = Σ_{i∈spt(f)} c gi(f(i)) = c Σ_{i∈spt(f)} gi(f(i)) = cG(f).

2. Since S : V → W is surjective, each S ⊗ · · · ⊗ S : Tk(V) → Tk(W) is surjective by part i) of Lemma (10.2). It then follows by Lemma (10.3) that T(S) is surjective.

3. Since S : V → W is injective, each S ⊗ · · · ⊗ S : Tk(V) → Tk(W) is injective by part ii) of Lemma (10.2). It then follows by Lemma (10.3) that T(S) is injective.

4. This follows immediately from part iv. of Lemma (10.2).

5. The eigenvalues are: 8, 27, 125 (with multiplicity 1); 12, 20, 18, 50, 45, 75 (with multiplicity 3); and 30 (with multiplicity 6).

6. Let S(v1) = 2v1, S(v2) = 3v2. Then v1 ⊗ v2 and v2 ⊗ v1 are both eigenvectors of T2(S) with eigenvalue 6. Thus, T2(S) is not cyclic.

7. This is false unless RS = SR = 0_{V→V}. Even taking R = S = I_V gives a counterexample. In that case R + S = 2I_V and T(R) + T(S) = 2I_{T(V)}, and every vector in T(V) is an eigenvector with eigenvalue 2 for 2I_{T(V)}. On the other hand, vectors in T2(V) are eigenvectors of T2(2I_V) with eigenvalue 4.

8. This is false. For example, let V have dimension 3 with basis (v1, v2, v3). Set X = Span(v1, v2), Y = Span(v3) and S = Proj_{(X,Y)}. Then Ker(S) = Y. Now K2 = Ker(T2(S)) = V ⊗ Y + Y ⊗ V has dimension 6 and therefore T2(V)/K2 has dimension 3. On the other hand, dim(V/Y) = 2 and dim(T2(V/Y)) = 4.

9. This is an immediate consequence of the definition of the tensor product.

10. Assume S^l = 0_{V→V}. We claim that Tk(S)^{kl−k+1} = 0_{Tk(V)→Tk(V)}. To see this note that Tk(S) = (S ⊗ I_V ⊗ · · · ⊗ I_V)(I_V ⊗ S ⊗ · · · ⊗ I_V) . . . (I_V ⊗ I_V ⊗ · · · ⊗ S), where in each case there is a tensor product of k maps, one of which is S and all the others I_V. These maps commute and therefore

Tk(S)^{kl−k+1} = Σ S^{j1} ⊗ · · · ⊗ S^{jk}   (10.5)

(up to multinomial coefficients), where the sum is over all non-negative sequences (j1, . . . , jk) such that j1 + · · · + jk = kl − k + 1. By the generalized pigeon-hole principle some ji ≥ l, which implies that S^{ji} = 0_{V→V} and hence each term in (10.5) is zero.
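Exercise 10's nilpotency bound can be spot-checked for k = l = 2, where kl − k + 1 = 3 (the bound need not be sharp: here T2(S)^2 already vanishes). The helpers below are stdlib sketches of ours, not from the text:

```python
def kron(A, B):
    """Kronecker product: block (i, j) of the result is A[i][j] * B."""
    return [[A[i][j] * B[k][l]
             for j in range(len(A[0])) for l in range(len(B[0]))]
            for i in range(len(A)) for k in range(len(B))]

def matmul(A, B):
    return [[sum(A[i][t] * B[t][j] for t in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matpow(M, p):
    R = [[int(i == j) for j in range(len(M))] for i in range(len(M))]
    for _ in range(p):
        R = matmul(R, M)
    return R

S = [[0, 1], [0, 0]]   # S^2 = 0, so l = 2
K = kron(S, S)         # the matrix of T_2(S)
bound = 2 * 2 - 2 + 1  # kl - k + 1 = 3
vanishes = all(v == 0 for row in matpow(K, bound) for v in row)
```
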

11. The formula is Tr(Tk(S)) = Tr(S)^k. The proof is by induction on k. By Exercise 12 of Section (10.2), Tr(S ⊗ S) = Tr(S)^2. Assume the result is true for k: Tr(Tk(S)) = Tr(S)^k. Again by Exercise 12 of Section (10.2), Tr(Tk+1(S)) = Tr(Tk(S) ⊗ S) = Tr(Tk(S))Tr(S) = Tr(S)^k Tr(S) = Tr(S)^{k+1}.

12. We claim that det(Tk(S)) = det(S)^{kn^{k−1}}. We do induction on k ≥ 1. When k = 1 the result clearly holds. Also, for k = 2 we can apply Exercise 14 of Section (10.2) to obtain det(S ⊗ S) = det(S)^n det(S)^n = det(S)^{2n}, as required.

Assume now that det(Tk(S)) = det(S)^{kn^{k−1}}. Now Tk+1(S) = Tk(S) ⊗ S and the dimension of Tk(V) is n^k. Then by Exercise 14 we have

det(Tk+1(S)) = det(Tk(S) ⊗ S) = det(Tk(S))^n det(S)^{n^k} = (det(S)^{kn^{k−1}})^n det(S)^{n^k} = det(S)^{kn^k} det(S)^{n^k} = det(S)^{(k+1)n^k}.

10.4. The Symmetric Algebra

1. Let S = (v1, . . . , vn) be a sequence of vectors. We will denote by ⊗(S) the product v1 ⊗ · · · ⊗ vn. Also, for a permutation π of {1, . . . , n} we let π(S) = (vπ(1), . . . , vπ(n)). We therefore have to prove that ⊗(S) − ⊗(π(S)) ∈ I for all sequences S and permutations π.

We first prove the result for permutations (i, j) which exchange a pair i < j and leave all other k in {1, . . . , n} fixed. We prove this by induction on j − i. For the base case, j = i + 1, we have vi ⊗ vi+1 − vi+1 ⊗ vi ∈ I and then for every x ∈ Ti−1(V), y ∈ Tn−i−1(V), x ⊗ [vi ⊗ vi+1 − vi+1 ⊗ vi] ⊗ y ∈ I. Therefore ⊗(S) − ⊗((i, i+1)S) ∈ I.

Now assume that i < k and that for all j with i < j < k, ⊗(S) − ⊗((i, j)(S)) ∈ I. It suffices to prove that

vi ⊗ vi+1 ⊗ · · · ⊗ vk−1 ⊗ vk − vk ⊗ vi+1 ⊗ · · · ⊗ vk−1 ⊗ vi ∈ I.

By the induction hypothesis each of the following is in I:

vi ⊗ vi+1 ⊗ · · · ⊗ vk−1 ⊗ vk − vi+1 ⊗ vi ⊗ vi+2 ⊗ · · · ⊗ vk,
vi+1 ⊗ vi ⊗ vi+2 ⊗ · · · ⊗ vk − vi+1 ⊗ vk ⊗ vi+2 ⊗ · · · ⊗ vk−1 ⊗ vi,
vi+1 ⊗ vk ⊗ vi+2 ⊗ · · · ⊗ vk−1 ⊗ vi − vk ⊗ vi+1 ⊗ vi+2 ⊗ · · · ⊗ vk−1 ⊗ vi.

The sum of these is

vi ⊗ vi+1 ⊗ · · · ⊗ vk−1 ⊗ vk − vk ⊗ vi+1 ⊗ · · · ⊗ vk−1 ⊗ vi,

which is therefore in I.

We now use the fact that every permutation can be written as a product of transpositions, that is, permutations of type (i, j). We do induction on the number of transpositions needed to express the permutation π. We have already proved the base case. Suppose we have shown that if π can be expressed as a product of k transpositions then ⊗(S) − ⊗(π(S)) ∈ I, and assume that τ = σk+1 . . . σ1 is a product of k + 1 transpositions. Set γ = σk . . . σ1. Then

⊗(S) − ⊗(τ(S)) = [⊗(S) − ⊗(γ(S))] + [⊗(γ(S)) − ⊗(σk+1(γ(S)))].

[⊗(S) − ⊗(γ(S))] is in I by the inductive hypothesis and [⊗(γ(S)) − ⊗(σk+1(γ(S)))] ∈ I by the base case. Thus, ⊗(S) − ⊗(τ(S)) ∈ I as required.
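For a diagonal operator, the trace and determinant formulas of Exercises 11 and 12 of Section (10.3) can be verified directly, since T2(S) = S ⊗ S is then diagonal as well. A stdlib sketch (helper names are ours, not from the text):

```python
def kron(A, B):
    """Kronecker product: block (i, j) of the result is A[i][j] * B."""
    return [[A[i][j] * B[k][l]
             for j in range(len(A[0])) for l in range(len(B[0]))]
            for i in range(len(A)) for k in range(len(B))]

S = [[2, 0], [0, 3]]      # n = 2, trace 5, determinant 6
T2 = kron(S, S)           # T_2(S), a diagonal 4 x 4 matrix
tr_T2 = sum(T2[i][i] for i in range(4))
det_T2 = 1
for i in range(4):
    det_T2 *= T2[i][i]    # determinant of a diagonal matrix

k, n = 2, 2               # check Tr(S)^k and det(S)^(k * n^(k-1))
```
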





2. We use the identification of Sym(V) with F[x1, . . . , xn]. Under this identification Sym^k(V) corresponds to the subspace of homogeneous polynomials of degree k. A basis for this consists of all x1^{e1} . . . xn^{en} where e1, . . . , en are non-negative and e1 + · · · + en = k. This is equal to the number of ways that n bins can be filled with k balls (where some bins can be empty). This is a standard combinatorics problem. The answer is C(k+n−1, k) = C(k+n−1, n−1).

3. Let (v1, . . . , vn) be a basis of V consisting of eigenvectors of T with T(vi) = αi vi. Under the correspondence of Sym^k(V) with the homogeneous polynomials of F[v1, . . . , vn] of degree k, the monomial v1^{e1} . . . vn^{en}, where (e1, . . . , en) is a sequence of non-negative integers with e1 + · · · + en = k, is an eigenvector with eigenvalue α1^{e1} . . . αn^{en}.

4. The eigenvalues are 1, 2, 8, 16 with multiplicity 1 and 4 with multiplicity 2. This operator is not cyclic.

5. a2^3 − a2.

10.5. Exterior Algebra

1. Mimic the proof of Exercise 1 of Section (10.4).

2. We let [1, n] = {1, . . . , n}, Sn = {π : [1, n] → [1, n] | π is bijective} and Sn* = Sn \ {I_{[1,n]}}. Set W0 = (v1, . . . , vk). Denote by Sn(W0) the set {(vπ(1), . . . , vπ(k)) | π ∈ Sn} and set [B^k]′ = B^k \ Sn(W0). Also, for W = (w1, . . . , wk) ∈ B^k set ⊗(W) = w1 ⊗ · · · ⊗ wk. Finally, for W ∈ Sn(W0) and π ∈ Sn let (W, π) = ⊗(W) − sgn(π) ⊗ (π(W)).

A typical spanning vector of Jk has the form x1 ⊗ · · · ⊗ xi ⊗ y ⊗ y ⊗ z1 ⊗ · · · ⊗ zj where i + j + 2 = k. Express each of xr, y, zs as a linear combination of the basis B. Then x1 ⊗ · · · ⊗ xi ⊗ y ⊗ y ⊗ z1 ⊗ · · · ⊗ zj is a linear combination of ⊗(W), W ∈ [B^k]′, and (W, τ), W ∈ Sn(W0), τ ∈ Sn.

3. We continue with the notation introduced in the solution of Exercise 2 with k = n, so that W0 = B = (v1, . . . , vn). Our first claim is that

Span((W, π) | W ∈ Sn(B), π ∈ Sn*) = Span((B, π) | π ∈ Sn*).

Since (B, π) ∈ Span((W, π) | W ∈ Sn(B), π ∈ Sn*), we need only prove the inclusion

Span((W, π) | W ∈ Sn(B), π ∈ Sn*) ⊂ Span((B, π) | π ∈ Sn*),

equivalently, (W, π) ∈ Span((B, π) | π ∈ Sn*). Suppose W = σ(B). Then (B, σ) and (B, πσ) ∈ Span((B, π) | π ∈ Sn*). Taking the difference we obtain

(B, σ) − (B, πσ) = [⊗(B) − sgn(σ)(⊗(σ(B)))] − [⊗(B) − sgn(πσ)(⊗(πσ(B)))] = sgn(πσ)(⊗(π(W))) − sgn(σ)(⊗(W)) = sgn(σ)sgn(π)(⊗(π(W))) − sgn(σ)(⊗(W)) = −sgn(σ)(W, π),

which establishes the claim.

Next note that Span(⊗(W) | W ∈ [B^n]′) ∩ Span(⊗(W) | W ∈ Sn(B)) = {0}. Therefore, if ⊗(B) ∈ Jn then, in fact, ⊗(B) is a linear combination of (B, π), π ∈ Sn*. Suppose ⊗(B) ∈ Span((B, π) | π ∈ Sn*). Let σ ∈ Sn*. Then −(B, σ) + ⊗(B) ∈ Span((B, π) | π ∈ Sn*). However, −(B, σ) + ⊗(B) is ±⊗(σ(B)) and consequently ⊗(σ(B)) ∈ Span((B, π) | π ∈ Sn*). Since the set {⊗(σ(B)) | σ ∈ Sn} is linearly independent this would imply that Span((B, π) | π ∈ Sn*) has dimension n!.
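The balls-in-bins count in Exercise 2 of Section (10.4) can be confirmed by brute force: enumerate the exponent vectors directly and compare with the binomial formula. A stdlib sketch (the function name is ours):

```python
from math import comb
from itertools import product

def num_monomials(n, k):
    """Count exponent vectors (e1, ..., en) of non-negative integers
    with e1 + ... + en = k, i.e. dim Sym^k(V) when dim V = n."""
    return sum(1 for e in product(range(k + 1), repeat=n) if sum(e) == k)

n, k = 4, 3
counted = num_monomials(n, k)
formula = comb(k + n - 1, n - 1)
```
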





On the other hand, there are only n! − 1 generating vectors, so dim(Span((B, π) | π ∈ Sn*)) ≤ n! − 1. Thus, ⊗(B) ∉ Span((B, π) | π ∈ Sn*). This implies that ∧^n(V) has dimension at least one. On the other hand it has been established that ∧^n(V) has dimension at most one, and therefore exactly one.

Since ∧^n(V) has dimension one with basis v1 ∧ · · · ∧ vn there is a unique linear map F : ∧^n(V) → F such that F(v1 ∧ · · · ∧ vn) = 1. By the universal property of ∧^n(V) this implies that there is a unique alternating n-linear form f on V such that f(B) = 1.

4. Let (w1, . . . , wk) be a sequence of vectors from V. Then

∧^k(SR)(w1 ∧ · · · ∧ wk) = (SR)(w1) ∧ · · · ∧ (SR)(wk) = S(R(w1)) ∧ · · · ∧ S(R(wk)) = ∧^k(S)(R(w1) ∧ · · · ∧ R(wk)) = ∧^k(S)(∧^k(R)(w1 ∧ · · · ∧ wk)) = (∧^k(S) ∧^k(R))(w1 ∧ · · · ∧ wk).

5. Let B_V = (v1, . . . , vn) be a basis of V. For a sequence (u1, . . . , uk) we let ∧(u1, . . . , uk) = u1 ∧ · · · ∧ uk. We continue with the notation of Exercises 2 and 3. To show that ∧^k(S) is injective it suffices to show that the image of a basis of ∧^k(V) is linearly independent in ∧^k(W). Now {∧^k(Φ) | Φ ∈ Sn(v1, . . . , vk)} is a basis for ∧^k(V). Since S is injective, (w1, . . . , wn) = (S(v1), . . . , S(vn)) is linearly independent in W. Extend to a basis (w1, . . . , wm) for W. Then {∧^k(Ψ) | Ψ ∈ Sm(w1, . . . , wk)} is a basis for ∧^k(W). In particular, {∧^k(Ψ) | Ψ ∈ Sn((S(v1), . . . , S(vk)))} is linearly independent. However,

∧^k((S(vπ(1)), . . . , S(vπ(k)))) = S(vπ(1)) ∧ · · · ∧ S(vπ(k)) = ∧^k(S)(vπ(1) ∧ · · · ∧ vπ(k)).

Thus, {∧^k(S)(∧^k(Φ)) | Φ ∈ Sn(v1, . . . , vk)} is linearly independent and ∧^k(S) is injective.

6. If V is n-dimensional and S is nilpotent then S^n = 0_{V→V}. Since ∧(S)^n = ∧(S^n) = 0_{∧(V)→∧(V)}, ∧(S) is nilpotent.

7. Let (v1, . . . , vn) be a basis of V such that S(vi) = αi vi. Then vi1 ∧ · · · ∧ vik is an eigenvector of ∧^k(S) with eigenvalue αi1 . . . αik. Since {vi1 ∧ · · · ∧ vik | 1 ≤ i1 < · · · < ik ≤ n} is a basis for ∧^k(V) it follows that ∧^k(S) is diagonalizable.

8. det(∧^k(S)) = det(S)^{C(n−1, k−1)}. This can be proved by showing it holds for elementary operators (matrices).

9. Let S : R^4 → R^4 be the operator with matrix

[0 1 0 0]
[−1 0 0 0]
[0 0 3 4]
[0 0 −4 3].

Then the eigenvalues of S are ±i, 3 ± 4i. On the other hand, the eigenvalues of ∧^2(S) are 1, 25, −4 + 3i, 4 + 3i, −4 − 3i, 4 − 3i.

10. Let (v1, v2, v3, v4) be linearly independent and set x = v1 ∧ v2 + v3 ∧ v4. Then x ∧ x = 2(v1 ∧ v2 ∧ v3 ∧ v4).

11. Since multiplication is distributive, φ is additive in each variable. Since (cw) ∧ v = w ∧ (cv) = c(w ∧ v), in fact, φ is bilinear. In light of bilinearity, to show that φ is symmetric it suffices to show that it is so for decomposable vectors, that is, vectors of the form v1 ∧ · · · ∧ v2n. However,

(v1 ∧ · · · ∧ v2n) ∧ (w1 ∧ · · · ∧ w2n) = (−1)^{4n^2} (w1 ∧ · · · ∧ w2n) ∧ (v1 ∧ · · · ∧ v2n),

from which it follows that

φ(v1 ∧ · · · ∧ v2n, w1 ∧ · · · ∧ w2n) = φ(w1 ∧ · · · ∧ w2n, v1 ∧ · · · ∧ v2n).

We now need to show that φ is non-degenerate. For a subset α = {i1 < · · · < i2n} of [1, 4n] let vα = vi1 ∧ · · · ∧ vi2n.
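The eigenvalue list in Exercise 9 is exactly the set of pairwise products λiλj (i < j) of the eigenvalues of S, since ∧^2(S) acts on vi ∧ vj by multiplying by αiαj (Exercise 7). This can be confirmed with ordinary complex arithmetic (a stdlib sketch of ours, not from the text):

```python
from itertools import combinations

eigs_S = [1j, -1j, 3 + 4j, 3 - 4j]          # eigenvalues of S
eigs_wedge2 = [a * b for a, b in combinations(eigs_S, 2)]
expected = [1, 25, -4 + 3j, 4 + 3j, -4 - 3j, 4 - 3j]

def canon(zs):
    """Sort a list of complex numbers for order-insensitive comparison."""
    return sorted((complex(z) for z in zs), key=lambda z: (z.real, z.imag))
```
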


Also denote by α′ the complementary subset α′ = [1, 4n] \ α. Note that vα ∧ vβ = 0 if β ≠ α′ and vα ∧ vα′ = ±v1 ∧ · · · ∧ v4n. Now suppose that x = Σ_α cα vα, where the sum is taken over all subsets α of [1, 4n] of size 2n, and that x ≠ 0. Then some cα ≠ 0. Then x ∧ vα′ = ±cα v1 ∧ · · · ∧ v4n and φ(x, vα′) ≠ 0. In particular, x is not in the radical. Since x is arbitrary the radical of φ is just the zero vector.

12. We need to show that there are totally singular subspaces of dimension 3. Note that the singular vectors are precisely the decomposable vectors. Let (v1, . . . , v4) be a basis for V. Then Span(v1 ∧ v2, v1 ∧ v3, v1 ∧ v4) is a totally singular subspace of dimension 3. A second class of totally singular subspaces of dimension 3 is represented by Span(v1 ∧ v2, v1 ∧ v3, v2 ∧ v3).

13. As in the proof of Exercise 11, the form is bilinear. Now we have

(v1 ∧ · · · ∧ vn) ∧ (w1 ∧ · · · ∧ wn) = (−1)^{n^2} (w1 ∧ · · · ∧ wn) ∧ (v1 ∧ · · · ∧ vn) = −(w1 ∧ · · · ∧ wn) ∧ (v1 ∧ · · · ∧ vn)

since n is odd.

Suppose now that (v1, . . . , v2n) is a basis of V. For a subset α = {i1 < · · · < in} of [1, 2n] set vα = vi1 ∧ · · · ∧ vin. As in Exercise 11, if β ≠ α′ = [1, 2n] \ α then vα ∧ vβ = 0.

Let x = cα vα + cα′ vα′. Then

x ∧ x = cα cα′ vα ∧ vα′ + cα′ cα vα′ ∧ vα = cα cα′ vα ∧ vα′ − cα cα′ vα ∧ vα′ = 0.

It follows from the previous two paragraphs that x ∧ x = 0 for any vector x ∈ ∧^n(V) and therefore φ is alternating. The proof that φ is non-degenerate is exactly as in Exercise 11.

14. x^6 + 14x^4 + 96x^3 − 128x − 32.

15. x^3 + 6x^2 − 9.

16. x^6 − 3x^4 − 27x^3 − 9x^2 + 27.

10.6. Clifford Algebra

1. Let v be a non-zero vector in V. Set a = φ(v) < 0. Let c = √(−a). By replacing v, if necessary, by (1/c)v we can assume that φ(v) = −1. Since dim(V) = 1, V = Span(v). Then T(V) is isomorphic to R[x]. The ideal Iφ is generated by v ⊗ v + 1, which corresponds to the polynomial x^2 + 1. Thus, T(V)/Iφ is isomorphic to C.

2. Let α = {i1 < · · · < ik} and assume j ∉ α. Then vj vis = −vis vj for 1 ≤ s ≤ k. It follows by induction on k that vj vα = (−1)^k vα vj.

3. Continue with the notation of Exercise 2 and assume is = j. Then

vj vα = vj vi1 . . . vis . . . vik = (−1)^{s−1} vi1 . . . vis−1 vj vis . . . vik = (−1)^{s−1} vi1 . . . vis−1 vj vj vis+1 . . . vik = (−1)^{s−1} vi1 . . . vis−1 φ(vj) vis+1 . . . vik = (−1)^{s−1} φ(vj) vα\{j}.

4. Assume first that I is a homogeneous ideal of A = A0 ⊕ A1. Then I = (I ∩ A0) ⊕ (I ∩ A1), from which it immediately follows that I is generated as an ideal by (I ∩ A0) ∪ (I ∩ A1).

Conversely, assume X is a set of homogeneous elements of A and I consists of all elements of the form z = b1x1c1 + · · · + bkxkck + d1y1e1 + · · · + dlylel where xi ∈ X ∩ A0, yi ∈ X ∩ A1 and bi, ci, di, ei ∈ A. We need to show the homogeneous parts of z are in I. Write each bi as bi0 + bi1 where bit ∈ At, and similarly for ci, di and ei. Set

z0 = Σ_{i=1}^{k} (bi0 xi ci0 + bi1 xi ci1) + Σ_{i=1}^{l} (di0 yi ei1 + di1 yi ei0),

z1 = Σ_{i=1}^{k} (bi0 xi ci1 + bi1 xi ci0) + Σ_{i=1}^{l} (di0 yi ei0 + di1 yi ei1).

Then z0 ∈ A0, z1 ∈ A1 and z = z0 + z1. Now each of bis xi cit and dis yi eit belongs to I and therefore z0, z1 ∈ I and I is a homogeneous ideal.

5. Let (v1, v2) be an orthogonal basis for V. As in the solution to Exercise 1 we can assume that φ(v1) = φ(v2) = −1. Set i = v1, j = v2 and k = v1v2; then (1, i, j, k) is a basis for C(V). Note that i^2 = φ(v1) = −1 = φ(v2) = j^2. Since v1 ⊥ v2, ij = v1v2 = −v2v1 = −ji. By Exercise 3, ki = (ij)i = (−ji)i = −ji^2 = j. Similarly, jk = i = −kj, and k^2 = v1v2v1v2 = −v1^2 v2^2 = −1. It follows that C(V) is isomorphic to the division ring of quaternions.

6. Let (x, y) be a hyperbolic basis for V. Then φ(x + y) = φ(x) + φ(y) + ⟨x, y⟩_φ = 0 + 0 + 1 = 1. Therefore x^2 + y^2 + xy + yx = 1, and since x^2 = y^2 = 0 this gives xy + yx = 1. Since (1, x, y, xy) is a basis for C(V) and Span(x, y, xy, yx) contains {1, x, y, xy}, also (x, y, xy, yx) is a basis. Denote by

[a11 a12]
[a21 a22]

the vector a11 xy + a12 x + a21 y + a22 yx. Let us determine the product of a11 xy + a12 x + a21 y + a22 yx with b11 xy + b12 x + b21 y + b22 yx. We note the following:

x^2 = y^2 = x(xy) = y(yx) = (yx)x = (xy)y = 0,
x(yx) = x = (xy)x, y(xy) = y = (yx)y,
(xy)^2 = xy, (yx)^2 = yx.

Now the product of a11 xy + a12 x + a21 y + a22 yx and b11 xy + b12 x + b21 y + b22 yx is

(a11 b11 + a12 b21)xy + (a11 b12 + a12 b22)x + (a21 b11 + a22 b21)y + (a21 b12 + a22 b22)yx.

It follows that the map from a11 xy + a12 x + a21 y + a22 yx to

[a11 a12]
[a21 a22]

is an isomorphism of algebras and C(V) is isomorphic to M22(F).
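Exercise 6's multiplication table can be confirmed by representing x and y as the matrix units E12 and E21 (a choice consistent with, though not made explicit in, the solution): then xy = E11, yx = E22, and all the displayed relations reduce to 2 × 2 matrix multiplication:

```python
def matmul(A, B):
    return [[sum(A[i][t] * B[t][j] for t in range(2))
             for j in range(2)] for i in range(2)]

x = [[0, 1], [0, 0]]    # E12, a hypothetical representative of x
y = [[0, 0], [1, 0]]    # E21, a hypothetical representative of y
xy = matmul(x, y)       # E11
yx = matmul(y, x)       # E22
zero = [[0, 0], [0, 0]]
ident = [[1, 0], [0, 1]]
xy_plus_yx = [[xy[i][j] + yx[i][j] for j in range(2)] for i in range(2)]
```
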



Chapter 11

Linear Groups and Groups of Isometries 11.1. Linear Groups 1. The order of GL(V ) is equal to the number of bases of V . When dim(V ) = n and the field is Fq this is n

q( 2 )

n

i=1

(y1 , . . . , yj , w1 , . . . , wt , z1 , . . . , zt , q1 , . . . , qs ) is also a basis for V . Now let S be the linear transformation such that S(xi ) = yi for 1 ≤ i ≤ j

S(ui ) = wi for 1 ≤ i ≤ j S(vi ) = zi for 1 ≤ i ≤ j

(q i − 1).

S(pi ) = qi for 1 ≤ i ≤ s

Since SL(V ) is the kernel of det : GL(V ) → F∗q and the Then S is an invertible operator, S(U1 ) = W1 , S(U2 ) = determinant is surjective, |SL(V )| = |GL(V )|/|F∗q | = W2 . n

q( 2 )

n

i=1

(q i − 1) ×

n

n 1 = q( 2 ) (q i − 1). q−1 i=2

2. Let j = dim(U1 ∩ U2 ). Let (x1 , . . . , xj ) be a bass for U1 ∩ U2 and (y1 , . . . , yj ) a basis for W1 ∩ W2 . Set t = k − j. Let (u1 , . . . , ut ) be vectors in U1 such that (x1 , . . . , xj , u1 , . . . , ut is a basis for U1 and (v1 , . . . , vt ) vectors from U2 such that (x1 , . . . , xj , v1 , . . . , vt ) is a basis for U2 . Let (w1 , . . . , wt ) be vectors from W1 such that (y1 , . . . , yj , w1 , . . . , wt ) is a basis of W1 and (z1 , . . . , zt ) be vectors from W2 such that (y1 , . . . , yj , z1 , . . . , zt ) is a basis for W2 . Now (x1 , . . . , xj , u1 , . . . , ut , v1 , . . . , vt ) is a basis for U1 + U2 and (y1 , . . . , yj , w1 , . . . , wt , z1 , . . . , zt ) is a basis for W1 + W2 . Set s = n − [k + t] and let (p1 , . . . , ps ) be sequence of vectors such that (x1 , . . . , xj , u1 , . . . , ut , v1 , . . . , vt , p1 , . . . , ps ) is a basis for V and (q1 , . . . , qs ) a sequence of vector such that


3. Since every one dimensional subspace is an intersection of the k-dimensional subspaces which contain it we can conclude that every vector of V is an eigenvector. Then by the proof of Lemma (11.1), T ∈ Z(GL(V )).
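For small n and q the order formulas of Exercise 1 can be confirmed by brute force. The sketch below (an illustration, not from the text) counts the invertible and the determinant-one 2 × 2 matrices over F3 and compares with q^{n(n−1)/2} ∏ (q^i − 1).

```python
from itertools import product

def gl_order(n, q):
    """|GL_n(F_q)| = q^(n(n-1)/2) * prod_{i=1}^{n} (q^i - 1)."""
    r = q ** (n * (n - 1) // 2)
    for i in range(1, n + 1):
        r *= q ** i - 1
    return r

q = 3
dets = [(a * d - b * c) % q for a, b, c, d in product(range(q), repeat=4)]
gl_count = sum(1 for t in dets if t != 0)   # invertible 2x2 matrices over F_3
sl_count = sum(1 for t in dets if t == 1)   # determinant-one matrices
assert gl_count == gl_order(2, q)                 # 48
assert sl_count == gl_order(2, q) // (q - 1)      # |SL| = |GL|/(q - 1) = 24
```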

4. We may assume that H1 ≠ H2. Then dim(H1 ∩ H2) = n − 2 and P ⊂ H1 ∩ H2. Let P = Span(x1) and extend to a basis (x1, ..., xn−2) for H1 ∩ H2. Let xn−1 be a vector such that (x1, ..., xn−1) is a basis for H1 and let xn be a vector in H2 such that (x1, ..., xn−2, xn) is a basis for H2. Note that xn−1 ∉ H2 and xn ∉ H1. Let S ∈ χ(P, H1) and T ∈ χ(P, H2). For 1 ≤ i ≤ n − 2, S(xi) = T(xi) = xi so, in particular, ST(xi) = TS(xi) = xi. It therefore suffices to prove that ST(xj) = TS(xj) for j = n − 1, n. Now S(xn−1) = xn−1 and there is a scalar a such that S(xn) = xn + ax1. Similarly, T(xn) = xn and there is a scalar b such that T(xn−1) = xn−1 + bx1. Now TS(xn−1) = T(xn−1) = xn−1 + bx1,


TS(xn) = T(xn + ax1) = xn + ax1, ST(xn−1) = S(xn−1 + bx1) = xn−1 + bx1, ST(xn) = S(xn) = xn + ax1.

Since ST and TS agree on a basis, ST = TS.

5. Set x′n−1 = axn−1 − bxn and H′ = Span(x1, ..., xn−2, x′n−1). Then ST(x′n−1) = x′n−1. Since ST(xn) = xn + ax1 it follows that ST = TS ∈ χ(P, H′).

6. Let Pi = Span(xi), i = 1, 2. Extend (x1, x2) to a basis (x1, ..., xn−1) of H and then to a basis (x1, ..., xn) of V. Let S ∈ χ(P1, H) and T ∈ χ(P2, H). Now S(xj) = xj = T(xj) for 1 ≤ j ≤ n − 1. Moreover, there are scalars a, b such that S(xn) = xn + ax1 and T(xn) = xn + bx2. Then

ST(xn) = S(xn + bx2) = xn + ax1 + bx2,
TS(xn) = T(xn + ax1) = xn + bx2 + ax1.

Since ST and TS agree on a basis, ST = TS.

7. Set x′1 = ax1 + bx2 and P′ = Span(x′1). Now ST = TS is the identity when restricted to H and ST(xn) = xn + x′1, and therefore ST ∈ χ(P′, H).

8. Let (x1, ..., xn−2) be a basis of H1 ∩ H2 and let yi ∈ Pi, i = 1, 2, be nonzero vectors. Since y1 ∉ H2, in particular y1 ∉ H1 ∩ H2, so that (x1, ..., xn−2, y1) is linearly independent and therefore a basis of H1. Similarly, (x1, ..., xn−2, y2) is a basis of H2. Set W = Span(y1, y2). Note that if S is in the subgroup of GL(V) generated by χ(P1, H1) and χ(P2, H2) then S(v) = v for every vector v ∈ H1 ∩ H2. Consequently, the map S ↦ S|W is an injective homomorphism. Set BW = (y1, y2) and denote by π(S) the restriction of S to W.

Now assume S ∈ χ(P1, H1). Then S(y1) = y1 and there is a scalar a such that S(y2) = ay1 + y2. Therefore Mπ(S)(BW, BW) = (1 a; 0 1). If T ∈ χ(P2, H2) then T(y2) = y2 and there is a scalar b such that T(y1) = y1 + by2. Therefore Mπ(T)(BW, BW) = (1 0; b 1). It follows that the subgroup generated by χ(P1, H1) and χ(P2, H2) is isomorphic to SL2(F) = SL(W).

9. First assume that P1 + P2 ⊂ H1 ∩ H2. Assume Pi = Span(xi), i = 1, 2. Then (x1, x2) is linearly independent. Extend to a basis (x1, ..., xn−2) of H1 ∩ H2. Let yi ∈ Hi, i = 1, 2, be such that (x1, ..., xn−2, yi) is a basis of Hi. Let Si ∈ χ(Pi, Hi), i = 1, 2. Then Si(xj) = xj for i = 1, 2 and 1 ≤ j ≤ n − 2. Moreover, Si(yi) = yi for i = 1, 2. On the other hand, there are scalars a1, a2 such that S1(y2) = a1x1 + y2 and S2(y1) = y1 + a2x2. To prove that S1S2 = S2S1 we need only show that S1S2 and S2S1 agree on y1 and y2:

S1S2(y1) = S1(y1 + a2x2) = y1 + a2x2,
S2S1(y1) = S2(y1) = y1 + a2x2,
S1S2(y2) = S1(y2) = a1x1 + y2,
S2S1(y2) = S2(a1x1 + y2) = a1x1 + y2.

So we must now show that if P1 + P2 is not contained in H1 ∩ H2 then χ(P1, H1) and χ(P2, H2) do not commute. By Exercise 8 we can assume that either P1 ⊂ H2 or P2 ⊂ H1. Without loss of generality assume P1 ⊂ H2. Let P1 = Span(x), P2 = Span(y). Since x ∈ H1 ∩ H2 there is a basis (x = x1, ..., xn−2) for H1 ∩ H2. Since y ∉ H1 ∩ H2 the sequence (x1, ..., xn−2, y) is a basis of H2. Let xn−1 be a vector in H1 such that (x1, ..., xn−1) is a basis for H1. Now let Si ∈ χ(Pi, Hi) for i = 1, 2. Then S1(xj) = xj for 1 ≤ j ≤ n − 1 and there is a scalar a1 such that S1(y) = y + a1x1. On the other hand, S2(xj) = xj for 1 ≤ j ≤ n − 2, S2(y) = y, and there is a scalar a2 such that S2(xn−1) = xn−1 + a2y. We now compute S1S2(xn−1) and S2S1(xn−1):

S1S2(xn−1) = S1(xn−1 + a2y) = xn−1 + a2y + a1a2x,


S2S1(xn−1) = S2(xn−1) = xn−1 + a2y. Thus, S1S2 ≠ S2S1.

10. Let T ∈ χ(P, H) and y = S(x), x ∈ H. Then (STS⁻¹)(y) = (STS⁻¹)(S(x)) = ST(x) = S(x) = y.

On the other hand, assume w ∈ V, w ∉ S(H). Set u = S⁻¹(w) so that u ∉ H. Now (STS⁻¹)(w) − w = (STS⁻¹)(S(u)) − S(u) = S(T(u) − u). Since T ∈ χ(P, H), T(u) − u ∈ P and therefore S(T(u) − u) ∈ S(P), so that STS⁻¹ ∈ χ(S(P), S(H)). This proves that Sχ(P, H)S⁻¹ ⊂ χ(S(P), S(H)). Since this also applies to S⁻¹ we get the reverse inclusion and equality.

11.2. Symplectic Groups

1. If y ∈ x⊥ then Tx,c(y) = y. On the other hand, if y ∉ x⊥ then (Tx,c − IV)(y) = cf(y, x)x ∈ x⊥. Thus, Tx,c ∈ χ(Span(x), x⊥) and, therefore, is a transvection.

2. If y ∈ x⊥ then Tx,cTx,d(y) = y = Tx,c+d(y). On the other hand, assume f(y, x) = 1. Then

Tx,cTx,d(y) = Tx,c(y + dx) = y + dx + cx = y + (c + d)x = Tx,c+d(y).

3. If y ∈ x⊥ then Tbx,c(y) = y = Tx,b²c(y). On the other hand, assume f(y, x) = 1. Then

Tbx,c(y) = y + cf(y, bx)(bx) = y + b²cx = Tx,b²c(y).

4. If x ⊥ y then Span(x) + Span(y) ⊂ x⊥ ∩ y⊥. By Exercise 9 of Section (11.1), χ(x) = χ(Span(x), x⊥) and χ(Span(y), y⊥) = χ(y) commute.

5. This is an instance of Exercise 10 of Section (11.1).

6. Let (x1, ..., xn, y1, ..., yn) be a hyperbolic basis and let T ∈ Ψ(x1) act as follows:

T(x1) = x1,
T(y1) = y1 + Σ_{k=2}^{n} (akxk + bkyk) + γx1,
T(xj) = xj − bjx1 for 2 ≤ j ≤ n,
T(yj) = yj + ajx1 for 2 ≤ j ≤ n.

If S ∈ Sp(V) set x′j = S(xj) and y′j = S(yj). If T′ = STS⁻¹ then

T′(x′1) = x′1,
T′(y′1) = y′1 + Σ_{k=2}^{n} (akx′k + bky′k) + γx′1,
T′(x′j) = x′j − bjx′1 for 2 ≤ j ≤ n,
T′(y′j) = y′j + ajx′1 for 2 ≤ j ≤ n.

Thus, STS⁻¹ = T′ ∈ Ψ(x′1) = Ψ(S(x1)).

7. This follows from Exercise 6.

8. We continue with the notation of Exercise 6. For a = (a2, ..., an)^tr, b = (b2, ..., bn)^tr and γ ∈ F denote by T(a, b, γ) the operator with the action as in Exercise 6. Let also c = (c2, ..., cn)^tr, d = (d2, ..., dn)^tr ∈ F^{n−1} and δ ∈ F. It is a straightforward calculation to see that

T(a, b, γ)T(c, d, δ) = T(a + c, b + d, δ + γ − b^tr d + a^tr c) = T(c, d, δ)T(a, b, γ).

9. Let X = Span(x), Y1 = Span(y) and Y2 = Span(z) where f(y, x) = 1 = f(z, x). Set x1 = x and y1 = y and extend (x1, y1) to a hyperbolic basis


(x1, ..., xn, y1, ..., yn). Write z as a linear combination of this basis and note that since f(z, x) = f(y, x) = 1 the coefficient of y1 is 1. Thus,

z = y1 + Σ_{k=2}^{n} (akxk + bkyk) + γx1.

By Lemma (11.15), if T ∈ Ψ(x) and T(y) = T(y1) = z then T is unique.

10. Let (x = x1, ..., xn, y1, ..., yn) be a hyperbolic basis of V. Set S0 = (x1, ..., xn) and for 2 ≤ j ≤ n set Sj = (S0 ∪ {yj}) \ {xj} and set Mj = Span(Sj). Then ∩_{j=2}^{n} Mj = Span(x1).

11. This is the same as Exercise 10 of Section (8.2).

12. By definition 0 is an identity element, and every element v ∈ V is its own inverse. The definition of addition for α, β ∈ V \ {0} is symmetric in α and β and therefore addition is commutative. It remains to show that addition is associative. If α, β ∈ V \ {0} then clearly (α + β) + 0 = α + (β + 0). Thus, we may assume that our three elements are not zero. So let α, β, γ ∈ [1, 6]^{2}. There are several cases to consider. a) |α ∪ β ∪ γ| = 3. Then α + β = γ, β + γ = α, γ + α = β and (α + β) + γ = α + (β + γ) = 0. b) α ∪ β ∪ γ = [1, 6]. In this case we also have α + β = γ, β + γ = α, γ + α = β and (α + β) + γ = α + (β + γ) = 0. c) |α ∩ β ∩ γ| = 1. Then |α ∪ β ∪ γ| = 4 and (α + β) + γ = α + (β + γ) = [1, 6] \ (α ∪ β ∪ γ). d) α ∩ β ∩ γ = ∅ and |α ∪ β ∪ γ| = 4. Without loss of generality we can assume that α ∩ β = ∅ and γ ⊂ α ∪ β. Then (α + β) + γ = α + (β + γ) = (α ∪ β) \ γ. e) |α ∪ β ∪ γ| = 5. Then we can assume that α ∩ β = ∅ and |α ∩ γ| = 1, β ∩ γ = ∅. Let [1, 6] \ (α ∪ β ∪ γ) = {i} and α ∩ γ = {j}. Then (α + β) + γ = α + (β + γ) = {i, j}.

13. If f is bilinear then by its definition it is alternating since f(v, v) = 0 for every v ∈ V. Now f(v, w) = f(v + 0, w) = f(v, w) + 0 = f(v, w) + f(0, w). Also, for v, w ∈ V, f(v + w, 0) = f(v, 0) + f(w, 0) since both sides are zero. We may therefore assume that the three vectors are α, β, γ ∈ [1, 6]^{2}. Again there are several cases to consider. a) α = β. Then f(α + β, γ) = f(0, γ) = 0. On the other hand, f(α, γ) + f(β, γ) = f(α, γ) + f(α, γ) = 0. b) α = γ and α ∩ β = ∅. Then (α + β) ∩ γ = (α + β) ∩ α = ∅. Then f(α + β, γ) = f(α + β, α) = 0. On the other hand, f(α, γ) + f(β, γ) = f(α, α) + f(β, α) = 0 + 0, the latter since β ∩ α = ∅. In the remaining cases we can now assume that α, β, γ are distinct. c) α ∩ β = ∅, γ ⊂ α ∪ β. Then (α + β) ∩ γ = ∅ and f(α + β, γ) = 0. On the other hand, under these assumptions |α ∩ γ| = |β ∩ γ| = 1, and then f(α, γ) = f(β, γ) = 1 so that f(α, γ) + f(β, γ) = 0. d) α ∪ β ∪ γ = [1, 6]. Then α + β = γ so that f(α + β, γ) = 0 = f(α, γ) = f(β, γ). Consequently, f(α + β, γ) = f(α, γ) + f(β, γ). e) α ∩ β = ∅ = α ∩ γ, |β ∩ γ| = 1. Then |(α + β) ∩ γ| = 1 and f(α + β, γ) = 1. On the other hand f(α, γ) = 0, f(β, γ) = 1 and we again get equality. We can now assume that α, β, γ are distinct and |α ∩ β| = 1. f) We now treat the case that γ ⊂ α ∪ β, that is, γ = α + β. Then f(α, γ) = f(β, γ) = 1 and f(α + β, γ) = f(γ, γ) = 0 and we have the required equality. g) |α ∩ β ∩ γ| = 1. In this case (α + β) ∩ γ = ∅ so f(α + β, γ) = 0 whereas f(α, γ) = f(β, γ) = 1. h) α ∩ β ∩ γ = ∅, |α ∪ β ∪ γ| = 4. Then we may assume that α ∩ γ = ∅ and γ intersects β and α + β. Then f(α + β, γ) = 1, f(α, γ) = 0, f(β, γ) = 1 and we have equality. i) Finally, we have the case where γ is disjoint from α ∪ β. In this case f(α + β, γ) = 0 = f(α, γ) + f(β, γ) and we are done.

14. Since π ∈ S6 fixes 0, we have for any α ∈ [1, 6]^{2} that f(π(α), 0) = 0 = f(α, 0). Suppose α, β ∈ [1, 6]^{2}. Then α ∩ β = ∅ if and only if π(α) ∩ π(β) = ∅ and therefore f(π(α), π(β)) = f(α, β). Thus, S6 acts as isometries of the symplectic space (V, f). By Exercise 11, |Sp4(2)| = 2⁴(2⁴ − 1)(2² − 1) = 16 × 15 × 3 = 2⁴ × 3² × 5 = 720 = 6! = |S6|. Therefore, S6 is isomorphic to Sp4(2).
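The case analysis for bilinearity can be verified exhaustively on this finite model. The sketch below (an illustration; the set encoding is an assumption) takes 0 to be the empty set, addition as in Exercise 12, and f(α, β) = 1 exactly when |α ∩ β| = 1, then checks bilinearity and the order count for Sp4(2).

```python
from itertools import combinations
from math import factorial

OMEGA = frozenset(range(1, 7))
ZERO = frozenset()
V = [ZERO] + [frozenset(s) for s in combinations(range(1, 7), 2)]

def add(a, b):
    if a == b:
        return ZERO
    if a == ZERO:
        return b
    if b == ZERO:
        return a
    if len(a & b) == 1:
        return a ^ b            # symmetric difference
    return OMEGA - (a | b)      # disjoint pairs sum to the complementary pair

def f(a, b):
    return 1 if len(a & b) == 1 else 0

assert len(V) == 16  # so V is a 4-dimensional F_2-space
# bilinearity: f(a + b, c) = f(a, c) + f(b, c) over F_2, for all triples
assert all((f(add(a, b), c) - f(a, c) - f(b, c)) % 2 == 0
           for a in V for b in V for c in V)
# |Sp4(2)| = 2^4 (2^2 - 1)(2^4 - 1) = 720 = |S6|
assert 2 ** 4 * (2 ** 2 - 1) * (2 ** 4 - 1) == factorial(6)
```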



11.3. Orthogonal Groups, Characteristic Not Two

1. If x ∈ u⊥ ∩ y⊥ then x ∈ z⊥ and ρzρy(x) = x. On the other hand, τu,y(x) = x + ⟨x, y⟩φ u = x. It now suffices to show that ρzρy(y) = τu,y(y) = y + ⟨y, y⟩φ u. Set a = ⟨y, y⟩φ so that z = (a/2)u + y. Now ⟨z, z⟩φ = ⟨y, y⟩φ = ⟨z, y⟩φ. Then

ρz(y) = y − 2[⟨y, z⟩φ/⟨z, z⟩φ]z = y − 2z = y − 2((a/2)u + y) = y − au − 2y = −y − ⟨y, y⟩φ u.

It then follows that ρzρy(y) = ρz(−y) = y + ⟨y, y⟩φ u = τu,y(y).

2. Assume w = v + cu for some scalar c. Let x ∈ u⊥. Then

τu,w(x) = x + ⟨x, w⟩φ u = x + ⟨x, v + cu⟩φ u = x + ⟨x, v⟩φ u = τu,v(x).

Since τu,v and τu,w agree on u⊥ they are identical. Conversely, assume τu,v = τu,w. Let x ∈ u⊥. Then ⟨x, v⟩φ = ⟨x, w⟩φ so that ⟨x, v − w⟩φ = 0. Consequently, v − w ∈ Rad(u⊥) = Span(u).

3. We need to prove that if v ∈ u⊥ is a singular vector then τu,v is in the subgroup of Tu generated by all τu,x where x ∈ u⊥ and x is non-singular. Let x ∈ u⊥ be such that ⟨v, x⟩φ = ⟨x, x⟩φ. Since x is not orthogonal to v, the subspace Span(v, x) is non-degenerate. Let y be a vector in Span(v, x) such that y ⊥ x and ⟨v, y⟩φ = ⟨y, y⟩φ. It is then the case that v = x + y. By Lemma (11.25), τu,v = τu,xτu,y.

4. Without loss of generality we can assume ⟨u, v⟩φ = 1. Suppose x is a singular vector and ⟨u, x⟩φ = 1. By Lemma (11.28) there is a unique element γ ∈ Tu such that γ(v) = x. Then γTvγ⁻¹ = Tγ(v) = Tx. It follows that the subgroup generated by Tu, Tv contains Tx for every singular vector x. Consequently, the subgroup generated by Tu, Tv contains Ω(V). Since Tu, Tv are subgroups of Ω(V) we have equality.

5. Let H denote the subgroup of Ω(V) generated by χ(l1) ∪ χ(l2) ∪ χ(l3) ∪ χ(l4). We first point out that x1⊥ = Span(x1, x2, y2). We claim that Tx1 is generated by χ(l1) ∪ χ(l4). Let Hx1 be the subgroup of Tx1 generated by χ(l1) ∪ χ(l4). We need to prove that if w ∈ x1⊥ then τx1,w ∈ Hx1. Since w ∈ x1⊥ = Span(x1, x2, y2) there are scalars a, b, c such that w = ax1 + bx2 + cy2. By Exercise 2, τx1,w = τx1,bx2+cy2, so we can assume w = bx2 + cy2. By Lemma (11.25), τx1,bx2+cy2 = τx1,bx2 τx1,cy2 ∈ Hx1. In exactly the same way, Tx2 is the subgroup generated by χ(l1) ∪ χ(l2), Ty1 is equal to the subgroup generated by χ(l2) ∪ χ(l3), and Ty2 is generated by χ(l3) ∪ χ(l4).

Suppose now that u = cx1 + x2 is a vector in l1. Set σ = τcx1,y2. Then σ(x2) = x2 + ⟨x2, cx1⟩φ y2 + ⟨x2, y2⟩φ (cx1) = x2 + cx1. It therefore follows that σTx2σ⁻¹ = Tu is contained in H. In exactly the same way, if u ∈ l2 ∪ l3 ∪ l4 then Tu ⊂ H.

Now suppose u is an arbitrary singular vector. We must show that Tu ⊂ H. By the above we can assume that u ∉ l1 ∪ l2 ∪ l3 ∪ l4. Let ax1 + x2 be a vector in l1 such that ax1 + x2 ⊥ u and let by2 + y1 be a vector in l3 such that by2 + y1 ⊥ u. It must be the case that ax1 + x2 ⊥ by2 + y1, so that b = −a. Set z1 = ax1 + x2 and z2 = y1 − ay2. Then u ∈ Span(z1, z2). Since Tu = Tcu for any nonzero scalar c, we can assume that u = z1 + dz2 for some scalar d. Now set γ = τy2,dz2. Then γ(z1) = z1 + dz2 = u. Then Tu = γTz1γ⁻¹ ⊂ H. It now follows that Ω(V) ⊂ H and since H ⊂ Ω(V) we have equality.

6. Let σ ∈ L1 and u ∈ l1. We claim that σ(u) ∈ l1. It suffices to prove this for σ ∈ χ(l2) ∪ χ(l4). Suppose u = x1. If σ ∈ χ(l2) then σ(u) = σ(x1) = x1. On the other hand, if σ ∈ χ(l4), say σ = τx2,ay1, then


σ(u) = σ(x1) = τx2,ay1(x1) = x1 + ⟨x1, x2⟩φ ay1 + ⟨x1, ay1⟩φ x2 = x1 + ax2 ∈ l1. Similarly, if u = x2 then σ(x2) ∈ l1. Suppose u = x1 + ax2. Then σ(u) = σ(x1) + aσ(x2) ∈ l1 since σ(x1) and σ(x2) ∈ l1.

Suppose σ = τx1,ay2. We determine the matrix of σ restricted to l1 with respect to the basis (x1, x2). By what we have shown this is (1 a; 0 1). Similarly, the matrix of τx2,by1 with respect to (x1, x2) is (1 0; b 1). It follows that the restriction of L1 to l1 is isomorphic to SL2(F). However, this map is injective and therefore L1 is isomorphic to SL2(F). Similarly, L2 is isomorphic to SL2(F).

7. Since l1 ∩ l4 = Span(x1), χ(l1) and χ(l4) commute. Since l1 ∩ l2 = Span(x2), χ(l1) and χ(l2) commute. Since l2 ∩ l3 = Span(y1) it follows that χ(l2) and χ(l3) commute. Finally, since l3 ∩ l4 = Span(y2) we can conclude that χ(l3) and χ(l4) commute. It now follows that L1 and L2 commute.

8. We have seen above that L1 leaves l1 invariant and therefore for σ ∈ L1, σ(B) = B. If σ ∈ χ(l1) acts trivially on l1 then σ(B) = B. On the other hand, if σ ∈ χ(l3) then σ(B) ∩ B = ∅. It follows that B is a block of imprimitivity.

9. The Witt index of W is zero; that is, there are no singular vectors in W. In particular, for v = ax + y, φ(ax + y) = a² + d ≠ 0. Since a ∈ F is arbitrary, there are no roots in F of the quadratic polynomial X² + d, which implies that X² + d is irreducible in F[X].

10. Let u′ = (1 0; 0 0) and v′ = (0 0; 0 −1). Then u′, v′ are singular vectors in (M, q) and ⟨u′, v′⟩q = 1. The orthogonal complement of U in M is {(0 α; ᾱ 0) | α ∈ K}. Set x′ = (0 1; 1 0) so that q(x′) = 1. Set y′ = (0 ω; −ω 0). Then q(y′) = −ω² = d and

⟨x′, y′⟩q = q(x′ + y′) − q(x′) − q(y′) = (1 + ω)(1 − ω) − 1 + ω² = 0.

It now follows that q(au′ + bv′ + cx′ + ey′) = ab + c² + e²d and the linear map that takes u ↦ u′, v ↦ v′, x ↦ x′, y ↦ y′ is an isometry.

11. It suffices to prove that A · m ∈ M for A = (1 β; 0 1) and A = (1 0; γ 1) since these matrices generate SL2(K). Thus, assume m = (a α; ᾱ b) where a, b ∈ F and α ∈ K. If A = (1 β; 0 1) then

A · m = (a, aβ + α; aβ̄ + ᾱ, aββ̄ + ᾱβ + αβ̄ + b).

Since aβ̄ + ᾱ is the conjugate of aβ + α and aββ̄ + ᾱβ + αβ̄ + b ∈ F, it follows that A · m ∈ M. In a similar fashion, if A = (1 0; γ 1) then A · m ∈ M.

12. If A ∈ SL2(K) and m1, m2 ∈ M then TA(m1 + m2) = Ā^tr(m1 + m2)A = Ā^tr m1A + Ā^tr m2A = TA(m1) + TA(m2). If A ∈ SL2(K), m ∈ M and c ∈ F then TA(cm) = Ā^tr(cm)A = c(Ā^tr mA) = cTA(m). Thus, TA is a linear operator on M. Since det(A) = 1, also det(Ā^tr) = 1. Then det(Ā^tr mA) = det(Ā^tr)det(m)det(A) = det(m). Consequently, q(TA(m)) = −det(TA(m)) = −det(m) = q(m). Thus, TA is an isometry of (M, q).

13. Since the characteristic of F is not two, the center of SL2(K) is {±I2}. This acts trivially on M and is therefore in the kernel of the action. On the other hand, since |K| > 3, PSL2(K) is simple. This implies that either Range(T) is isomorphic to PSL2(K) or the action is trivial, which it is not. Let z = (0 α; ᾱ 0). It is straightforward to check that the action of τu′,z is the same as TA where A = (1 α; 0 1). Thus, Range(T) contains Tu′. By Remark (11.5), Ω(M, q) is generated by Tu′ ∪ Tv′, which are both contained in Range(T), and therefore Range(T) is isomorphic to PSL2(K).

11.4. Unitary Groups

1. Assume τ restricted to W is a transvection of W, say τ ∈ χ(X, X⊥ ∩ W) where X is an isotropic subspace of W. Let τ̂ be defined as follows: τ̂(w + u) = τ(w) + u where w ∈ W and u ∈ W⊥. Then τ̂ ∈ χ(X, X⊥) ⊂ Ω(V). If T ∈ Ω(W), express T as a product τ1 ··· τk where τi ∈ χ(Xi, Xi⊥ ∩ W). Then T̂ = τ̂1 ··· τ̂k ∈ Ω(V).

2. Let u, v be isotropic vectors such that f(u, v) = 1 and set B = (u, v). Let F4 = {0, 1, ω, ω + 1 = ω²}. The six anisotropic vectors are u + ωv, ωu + ω²v, ω²u + v, u + ω²v, ωu + v, ω²u + ωv. If T ∈ SU(V) then the matrix of T with respect to B is one of the following:

I2, (1 ω; 0 1), (1 0; ω² 1), (0 ω; ω² 0), (1 ω; ω² 0), (0 ω; ω² 1).

Apart from I2, none of these leaves the vector u + ωv fixed and so SU(V) is transitive on the six vectors.

3. Let w be an anisotropic vector in Span(u, v)⊥ and set W = Span(u, v, w). Then W is non-degenerate. By Lemma (11.37) there is an isometry T in Ω(W) such that T(u) = ωu, T(v) = ωv. Let τ be the isometry of V such that τ restricted to W is T and τ restricted to W⊥ is the identity. Then τ ∈ Ω(V).

4. Suppose first that Span(u) = Span(v). Let w be a singular vector such that f(u, w) = 1. By Exercise 3 there is a τ ∈ Ω(V) such that τ(u) = ωu, τ(w) = ωw. Then either τ(u) = v or τ²(u) = v. We may therefore assume that Span(u) ≠ Span(v). Next assume that f(u, v) ≠ 0. Let v′ ∈ Span(v) be such that f(u, v′) = 1 and set W = Span(u, v′). Then there is a σ ∈ Ω(V) such that σ restricted to W⊥ is the identity and σ(u) = ω²v′. It then follows by Exercise 3 that we can find a γ ∈ Ω(V) such that γ(v′) = ωv′. Then one of the isometries σ, γσ, γ²σ takes u to v. Finally, assume Span(u) ≠ Span(v) and u ⊥ v. There exists an isotropic vector x orthogonal to neither u nor v. By what we have shown there exist τ1, τ2 ∈ Ω(V) such that τ1(u) = x, τ2(x) = v. Set τ = τ2τ1. Then τ ∈ Ω(V) and τ(u) = v.

5. Let u be an isotropic vector and set U = Span(u). If X is a non-degenerate two dimensional subspace containing u then |I1(X)| = 3, one of which is U. The total number of two dimensional subspaces containing U is 21 and the number of two dimensional subspaces containing U and contained in u⊥ is five. Therefore there are 16 non-degenerate two dimensional subspaces containing U and 16 × 2 = 32 one-spaces Y in I1(V) with Y not orthogonal to U. Now let Y ∈ I1(V) with Y ⊥ U and assume Z is a totally isotropic two dimensional space containing U. Then |I1(Z)| = 5, one of which is U. On the other hand, Y⊥ ∩ Z ∈ I1(U⊥ ∩ Y⊥). Now U⊥ ∩ Y⊥ is a non-degenerate two dimensional subspace and contains three one dimensional subspaces in I1(V). Thus, there exist exactly three totally isotropic subspaces of dimension two containing U. We can now conclude that there are 3 × 4 = 12 isotropic one-spaces W in U⊥ with W ≠ U. We therefore have 1 + 12 + 32 = 45 one-spaces in I1(V).

6. This was proved in the course of Exercise 5.

7. This was also proved in the course of Exercise 5.


8. Proved in Exercise 5.

9. f(ax1 + bx2 + cy2 + dy1, ax1 + bx2 + cy2 + dy1) = ad̄ + dā + bc̄ + cb̄ = Tr(ad̄) + Tr(bc̄). Thus, ax1 + bx2 + cy2 + dy1 is isotropic if and only if Tr(ad̄) + Tr(bc̄) = 0.

10. If X is an anisotropic one dimensional space then X⊥ is a non-degenerate three dimensional subspace which contains 21 one dimensional subspaces. Moreover, if W = X⊥ then |I1(W)| = 9. Therefore |P ∩ L1(W)| = 21 − 9 = 12.

11. If X, Y are orthogonal and anisotropic then X + Y is non-degenerate, as is X⊥ ∩ Y⊥. A non-degenerate two dimensional subspace contains 3 isotropic one dimensional subspaces and two anisotropic one dimensional subspaces. Moreover, the two anisotropic one dimensional subspaces of a non-degenerate two dimensional subspace are orthogonal.

12. |P| = 85 − 45 = 40. Suppose X ∈ P. As we have seen in Exercise 10 there are 12 elements Y ∈ P with Y ⊥ X. Then there are two Z ∈ P with X ⊥ Z ⊥ Y. Therefore there are 40 × 12 × 2 × 1 four-tuples (X, Y, Z, W) from P that are mutually orthogonal. Since there are 4! permutations of {X, Y, Z, W}, the number of subsets of cardinality four of mutually orthogonal elements of P is (40 × 12 × 2 × 1)/(4 × 3 × 2 × 1) = 40.

13. Assume i ≠ j and let {i, j, k, m} = {1, 2, 3, 4}. Then P ∩ L1(Xi⊥ ∩ Xj⊥) = {Xk, Xm}. Consequently, if Y ∈ P \ l then Y is orthogonal to at most one of the Xi. Note that |P \ l| = 36. For each i there are 12 elements Z ∈ P such that Z ⊥ Xi. Three of these are Xj, Xk, Xm, and so there are nine elements Z of P \ l such that Z ⊥ Xi. This accounts for 9 × 4 = 36 elements of P \ l, which is all of them.


Chapter 12

Additional Topics

12.1. Operator and Matrix Norms

1. Let ∥·∥ be a norm on Fⁿ and let x be a non-zero vector in Fⁿ. Then Inx = x and so ∥Inx∥/∥x∥ = 1 for every non-zero vector x. Therefore ∥In∥ = 1.

2. ∥In∥F = Trace(In^tr In)^{1/2} = n^{1/2} = √n. By Exercise 1, ∥·∥F is not induced by any norm on Fⁿ.

3. ∥A∥F = (12² + 2² + 7² + 0²)^{1/2} = √197, ∥A∥1,1 = max{14, 7} = 14, ∥A∥∞,∞ = max{19, 2} = 19.

A^tr A = (12 7; 2 0)(12 2; 7 0) = (193 24; 24 4).

The characteristic polynomial of A^tr A is X² − 197X + 196 = (X − 1)(X − 196). The eigenvalues are 1 and 196. Thus, ρ(A^tr A) = 196 and ∥A∥2,2 = √196 = 14.

4. A^tr A = A² = (11 7 7; 7 11 7; 7 7 11). Trace(A^tr A) = 33. Therefore ∥A∥F = √33.


∥A∥1,1 = ∥A∥∞,∞ = 5. The eigenvalues of A^tr A are 4 with multiplicity two and 25 with multiplicity one. Then ρ(A^tr A) = 25 and ∥A∥2,2 = √25 = 5.

5. By Lemma (12.1) there is a positive real number M such that ∥T(x)∥W ≤ M∥x∥V. Suppose now that x0 ∈ V, T(x0) = y0 and ε is a positive real number. We have to show that there exists δ > 0 such that if ∥x − x0∥V < δ then ∥T(x) − T(x0)∥W < ε. Let γ = max{M, 1} and set δ = ε/γ. Now suppose ∥x − x0∥V < δ. Then ∥T(x) − T(x0)∥W = ∥T(x − x0)∥W ≤ M∥x − x0∥V < Mε/γ ≤ ε.

6. Since In² = In we have ∥In∥ = ∥In · In∥ ≤ ∥In∥ · ∥In∥, from which we conclude that ∥In∥ ≥ 1.
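The computations in Exercises 3 and 4 are easy to confirm numerically. The check below uses NumPy's norm conventions; the 3 × 3 matrix B is an assumption, recovered from the A² displayed in Exercise 4.

```python
import numpy as np

A = np.array([[12.0, 2.0], [7.0, 0.0]])
AtA = A.T @ A
assert np.allclose(AtA, [[193.0, 24.0], [24.0, 4.0]])
# eigenvalues of A^tr A: roots of X^2 - 197X + 196 = (X - 1)(X - 196)
assert np.allclose(np.sort(np.linalg.eigvalsh(AtA)), [1.0, 196.0])
assert np.isclose(np.linalg.norm(A, 2), 14.0)            # sqrt of rho(A^tr A)
assert np.isclose(np.linalg.norm(A, 'fro') ** 2, 197.0)  # Trace(A^tr A)

# Exercise 4: B is an assumption, reconstructed so that B^2 = A^tr A as shown
B = np.array([[3.0, 1.0, 1.0], [1.0, 3.0, 1.0], [1.0, 1.0, 3.0]])
assert np.allclose(B @ B, [[11, 7, 7], [7, 11, 7], [7, 7, 11]])
assert np.isclose(np.trace(B.T @ B), 33.0)
assert np.isclose(np.linalg.norm(B, 2), 5.0)  # eigenvalues of B^tr B: 25, 4, 4
```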

12.2. Moore-Penrose Inverse

1. Since µP(x) = x² − x it follows that P² = P, whence P³ = P. Thus, if X = P then (PI1) and (PI2) hold. Since P is Hermitian, P* = P and therefore (P²)* = P². It follows that if X = P then (PI3) and (PI4) hold, so P† = P.

2. If X = diag{1/d1, ..., 1/dr, 0, ..., 0} then DXD = D and XDX = X, so (PI1) and (PI2) hold. Since DX = XD = diag{1, ..., 1, 0, ..., 0} (rank r), (DX)* = DX and (XD)* = XD, so (PI3) and (PI4) also hold. Thus, X = diag{1/d1, ..., 1/dr, 0, ..., 0} is the Moore-Penrose inverse of D.

3. Set X = (1/⟨v, v⟩)(ā1, ..., ān). Then Xv = 1 and vXv = v, XvX = X, so (PI1) and (PI2) hold. Since Xv is a real scalar, (Xv)* = Xv and (PI4) holds. On the other hand, vX is the n × n matrix whose (i, j)-entry is (1/⟨v, v⟩)aiāj, which is a Hermitian matrix: (vX)* = vX, so (PI3) holds.

4. AA⁻¹A = A so (PI1) holds. A⁻¹AA⁻¹ = A⁻¹ so (PI2) holds. Since AA⁻¹ = A⁻¹A = In = In*, (PI3) and (PI4) hold. Thus, A† = A⁻¹.

5. Set X = C*(CC*)⁻¹. Then CX = Ir, so that XCX = X and CXC = C and (PI1) and (PI2) hold. Since CX = Ir, clearly (PI3) holds. On the other hand, XC = C*(CC*)⁻¹C and (XC)* = C*((CC*)⁻¹)*(C*)* = C*(CC*)⁻¹C = XC, so (PI4) holds.

6. P² = (AA†)² = AA†AA†. By (PI1), AA†A = A and therefore (AA†)² = AA† = P. By (PI3), P* = P so that (P²)* = P².

7. (Im − P)² = (Im − P)Im − (Im − P)P = (Im − P) + (P² − P) = Im − P. Also, (Im − P)* = Im* − P* = Im − P.

8. Set X = (A†)*. We want to show that X = (A*)†. A*XA* = A*(A†)*A* = (AA†A)* = A*, so (PI1) holds. XA*X = (A†)*A*(A†)* = (A†AA†)* = (A†)* = X, so (PI2) holds. (A*X)* = [A*(A†)*]* = A†A, and A*X = (A†A)* = A†A by (PI4) for A; hence A*X is Hermitian and (PI3) holds for X relative to A*. (XA*)* = AX* = A[(A†)*]* = AA†. Since AA† is Hermitian it follows that XA* is Hermitian and (PI4) holds.

9. Set B = A*A and X = A†(A*)† = A†(A†)*, using Exercise 8. We show that X is the Moore-Penrose inverse of B.

BXB = (A*A)[A†(A†)*](A*A) = (A*A)A†[(A†)*A*]A = (A*A)A†(AA†)*A = (A*A)A†(AA†)A = (A*AA†)(AA†A) = A*(AA†A) = A*A = B.

Thus, (PI1) holds.

XBX = [A†(A†)*](A*A)[A†(A†)*] = A†[(AA†)*(AA†)](A†)* = A†(AA†)(A†)* = A†(A†)* = X.

Consequently, (PI2) holds.

(BX)* = [(A*A)A†(A†)*]* = [A*(AA†)(A†)*]* = [(A†)*]*(AA†)*(A*)* = A†(AA†)A = A†A.

Since A†A is Hermitian it follows that (BX)* is Hermitian, whence BX is Hermitian and (PI3) holds.

(XB)* = {[A†(A†)*][A*A]}* = {A†[(A†)*A*]A}* = {A†(AA†)*A}* = [A†(AA†)A]* = (A†A)* = A†A,

which is Hermitian, so (PI4) holds.

10. Since A = AA†A = A(A†A), by (PI4) we have A* = [A(A†A)]* = (A†A)*A* = (A†A)A*. On the other hand, writing A = (AA†)A and using (PI3) we get A* = [(AA†)A]* = A*(AA†)* = A*(AA†).

11. (A*A)†A* = [A†(A*)†]A* by Exercise 9. By Exercise 8 we have

[A†(A*)†]A* = [A†(A†)*]A* = A†(AA†)* = A†(AA†)

by (PI3). By (PI2) this is equal to A†, as desired.

12. A†A = (A*A)†(A*A) by Exercise 11. Now A*A is Hermitian and therefore so is (A*A)† by Exercise 8. Moreover, A†A is Hermitian by (PI4). It follows that

A†A = (A†A)* = [(A*A)†(A*A)]* = (A*A)*[(A*A)†]* = (A*A)(A*A)†.

Since A*A = AA* we have (A*A)(A*A)† = (AA*)(AA*)† = A[A*(AA*)†] = AA† by Exercise 11.

13. The proof is by induction on n. The base case is trivial. Assume that (Aⁿ)† = (A†)ⁿ. We must show that (A^{n+1})† = (A†)^{n+1}. We show that (PI1)–(PI4) hold for X = (A†)^{n+1}. We make use of Exercise 12: A†A = AA†. As a consequence we have

A^{n+1}(A†)^{n+1}A^{n+1} = (AA†A)(Aⁿ(A†)ⁿAⁿ).

By the inductive hypothesis, Aⁿ(A†)ⁿAⁿ = Aⁿ. Furthermore, AA†A = A. Therefore

(AA†A)(Aⁿ(A†)ⁿAⁿ) = AAⁿ = A^{n+1},

and (PI1) holds.

(A†)^{n+1}A^{n+1}(A†)^{n+1} = (A†AA†)[(A†)ⁿAⁿ(A†)ⁿ] = A†(A†)ⁿ = (A†)^{n+1}.

Thus, (PI2) holds.

A^{n+1}(A†)^{n+1} = (AA†)^{n+1}.

Since AA† is Hermitian so is (AA†)^{n+1}, and (PI3) holds. In a similar fashion,

(A†)^{n+1}A^{n+1} = (A†A)[(A†)ⁿAⁿ] = (A†A)(A†A)ⁿ = (A†A)^{n+1}.

Since A†A is Hermitian it follows that (A†A)^{n+1} is Hermitian. This proves (PI4).

14. With B = λA and X = (1/λ)A† we show that (PI1)–(PI4) hold:

BXB = (λA)((1/λ)A†)(λA) = λ(AA†A) = λA = B,
XBX = ((1/λ)A†)(λA)((1/λ)A†) = (1/λ)(A†AA†) = (1/λ)A† = X,
BX = (λA)((1/λ)A†) = AA† = (AA†)*,
XB = ((1/λ)A†)(λA) = A†A = (A†A)*.

15. Assume A* = A†. Then (A*A)² = (A†A)² = A†A = A*A by Exercise 6. Conversely, assume that (A*A)² = A*A. Then it is straightforward to see that (A*A)† = A*A. From Exercise 11 it follows that A† = A*AA*. It then follows that A = AA†A = A(A*AA*)A = A(A*A)² = A(A*A) = AA*A. Thus, (PI1) holds with X = A*. Since A = AA*A, taking adjoints we get A* = A*AA*, so (PI2) holds as well. Since both AA* and A*A are Hermitian, (PI3) and (PI4) are satisfied. Therefore, in fact, A† = A*.
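The four Penrose identities used throughout this section, together with Exercise 8's identity (A*)† = (A†)*, can be spot-checked numerically; np.linalg.pinv computes the Moore-Penrose inverse. The random complex matrix below is purely an illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 3)) + 1j * rng.standard_normal((4, 3))
X = np.linalg.pinv(A)                        # the Moore-Penrose inverse A†

assert np.allclose(A @ X @ A, A)             # (PI1)
assert np.allclose(X @ A @ X, X)             # (PI2)
assert np.allclose((A @ X).conj().T, A @ X)  # (PI3): AA† is Hermitian
assert np.allclose((X @ A).conj().T, X @ A)  # (PI4): A†A is Hermitian
# Exercise 8: (A*)† = (A†)*
assert np.allclose(np.linalg.pinv(A.conj().T), X.conj().T)
```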


12.3. Nonnegative Matrices

1. Assume a^k_ij ≠ 0 and a^m_jl ≠ 0. The (i, l)-entry of A^{k+m} = A^kA^m is Σ_{p=1}^{n} a^k_ip a^m_pl ≥ a^k_ij a^m_jl > 0.

2. Assume j ∈ ∆(i) and l ∈ ∆(j). Then (i, j), (j, l) ∈ ∆. By Exercise 1, (i, l) ∈ ∆ so that l ∈ ∆(i).

3. Let l ∈ I. Suppose aml ≠ 0, so that m ∈ ∆(l). Then by Exercise 2, m ∈ ∆(i) = I. Consequently, Ael ∈ Span(ej | j ∈ I), as was to be shown.

4. Let A = (aij) and B = (bij) be n × n matrices and assume that {(i, j) | aij ≠ 0} = {(i, j) | bij ≠ 0}. Then a^k_ij ≠ 0 if and only if b^k_ij ≠ 0. The result follows from this.

5. (In + A)^{n−1} = Σ_{j=0}^{n−1} (n−1 choose j) Aʲ. Assume i ≠ j. Since (In + A)^{n−1} > 0, for some j, 1 ≤ j ≤ n − 1, the (i, j)-entry of Aʲ is positive. Fix i, 1 ≤ i ≤ n, and let 1 ≤ l ≤ n, l ≠ i. Then for some j, a^j_il > 0 and for some k, a^k_li > 0. It then follows that a^{j+k}_ii > 0. Thus, A is irreducible.

6. Let the entries of A be aij. It suffices to prove that if A is a positive m × n matrix and x is a non-zero nonnegative vector then Ax ≠ 0. Let x = (x1, ..., xn)^tr and assume that xj > 0. The ith entry of Ax is Σ_{k=1}^{n} aik xk ≥ aij xj > 0.

7. If ρ(A) = 0 then all the eigenvalues of A are zero and A is a nilpotent matrix, whence Aⁿ = 0nn. However, if A^k > 0 then (A^k)ⁿ > 0, which contradicts (A^k)ⁿ = (Aⁿ)^k = 0nn.

8. Assume x > 0 is an eigenvector for A, say Ax = λx. Since A is non-negative and x > 0, Ax > 0. In particular, λ > 0. Thus, ρ(A) > 0.

9. Assume Ad = λd where d is a positive vector, and let aij be the (i, j)-entry of A. Since d is an eigenvector with eigenvalue λ we have, for each i,

Σ_{j=1}^{n} aij dj = λdi.

Now the (i, j)-entry of D⁻¹AD is bij = (dj/di)aij. Then

Σ_{j=1}^{n} bij = Σ_{j=1}^{n} (dj/di)aij = (1/di) Σ_{j=1}^{n} aij dj = (1/di)(λdi) = λ.

It follows that the matrix (1/λ)D⁻¹AD is a row stochastic matrix. By Theorem (12.22) it follows that ρ((1/λ)D⁻¹AD) = 1. However, ρ((1/λ)D⁻¹AD) = ρ(D⁻¹AD)/λ = ρ(A)/λ. Thus, λ = ρ(A).

10. By Theorem (12.19) there are positive vectors x, y such that ∥x∥1 = 1 = y^tr x, Ax = ρx, A^tr y = ρy. By Theorem (12.21), lim_{k→∞} [(1/ρ)A]^k = xy^tr, a rank one positive matrix. It then follows that for some natural number k, A^k > 0.

11. Assume ω ∈ C, |ω| = 1 and ωzi = |zi| for all i. Then

|z1 + ··· + zn| = |ω||z1 + ··· + zn| = |ωz1 + ··· + ωzn| = | |z1| + ··· + |zn| | = |z1| + ··· + |zn|.

We prove the converse by induction on n. Assume n = 2. Let ω ∈ C, |ω| = 1, be such that ωz1 = a ∈ R⁺. We need to prove that ωz2 = |z2|, equivalently, ωz2 ∈ R⁺. Assume ωz2 = b + ci where b, c ∈ R. Then |z1 + z2|² = |(a + b) + ci|² = (a + b)² + c² = a² + 2ab + b² + c². On the other hand, (|z1| + |z2|)² = [a + √(b² + c²)]² = a² + b² + c² + 2a√(b² + c²). If c ≠ 0 or b < 0 then 2ab < 2a√(b² + c²). Therefore c = 0 and b ≥ 0, that is, ωz2 = |z2|.

Now assume that n > 2 and the result is true for n − 1 complex numbers. Assume ω ∈ C, |ω| = 1, is such that ωz1 ∈ R⁺. Replacing zi with ωzi, if necessary, we may assume that z1 ∈ R⁺, and we need to prove that zi ∈ R⁺ for every i. Suppose |z2 + ··· + zn| < |z2| + ··· + |zn|. Then |z1 + ··· + zn| ≤ |z1| + |z2 + ··· + zn| < |z1| + ··· + |zn|, a contradiction. Therefore |z1 + (z2 + ··· + zn)| = |z1| + |z2 + ··· + zn| and |z2 + ··· + zn| = |z2| + ··· + |zn|. It follows by the case n = 2 that z2 + ··· + zn ∈ R⁺. By the induction hypothesis, there is γ ∈ C, |γ| = 1, such that γzi = |zi| for 2 ≤ i ≤ n. Then γ(z2 + ··· + zn) = γz2 + ··· + γzn ∈ R⁺, which implies that γ = 1.
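Exercises 9–12 can be illustrated numerically: for a primitive nonnegative matrix, (1/ρ)^k A^k converges to the rank-one matrix xy^tr of Exercise 10, and a row stochastic matrix has spectral radius 1 (Theorem (12.22)). The matrices below are arbitrary examples, not taken from the text.

```python
import numpy as np

A = np.array([[0.0, 1.0], [1.0, 1.0]])            # nonnegative and irreducible
assert np.all(np.linalg.matrix_power(A, 2) > 0)   # A^2 > 0, so A is primitive

rho = max(abs(np.linalg.eigvals(A)))              # Perron root, (1 + sqrt 5)/2 here
P = np.linalg.matrix_power(A, 200) / rho ** 200   # approximates lim (A/rho)^k
assert np.all(P > 0)
assert np.linalg.matrix_rank(P) == 1              # the limit x y^tr has rank one

S = np.array([[0.2, 0.8], [0.5, 0.5]])            # row stochastic
assert np.isclose(max(abs(np.linalg.eigvals(S))), 1.0)  # rho(S) = 1
```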

12. Assume first that A > 0, Ax = λx, and |λ| = ρ(A). Then there is ω ∈ C, |ω| = 1, such that ωx = |x| > 0. Since Ax = λx we have |Ax| = |λx| = |λ||x| = ρ(A)|x|. It then follows that A|x| = ρ(A)|x|, so that |x| is a positive vector. Now assume A is nonnegative and primitive, λ ∈ Spec(A), λ ≠ ρ(A), and Ax = λx. By Exercise 10 there is a k such that A^k > 0. Then A^k x = λ^k x and so λ^k ∈ Spec(A^k). Moreover, ρ(A^k) = ρ(A)^k. By the case just proved, either λ^k = ρ(A)^k or |λ|^k < ρ(A)^k, so that |λ| < ρ(A). However, in the first case the geometric multiplicity of ρ(A)^k is greater than one, which contradicts Theorem (12.19).

13. Let p = (p_1, ..., p_n)^tr. Then ⟨p, j_n⟩ = Σ_{j=1}^n p_j. Thus, if p ≥ 0 then p is a probability vector if and only if ⟨p, j_n⟩ = 1.

14. Assume p_j = (p_{1j}, ..., p_{nj})^tr. Since (s_1, ..., s_t) is a sequence of nonnegative numbers, s_j p_{ij} ≥ 0 for every i and j. Then Σ_{j=1}^t p_{ij} s_j ≥ 0, so that s_1 p_1 + ... + s_t p_t is a nonnegative vector. Also

Σ_{i=1}^n Σ_{j=1}^t s_j p_{ij} = Σ_{j=1}^t s_j Σ_{i=1}^n p_{ij} = Σ_{j=1}^t s_j = 1.

15. By Lemma (12.7) (proved in Exercise 16) every column of s_1 A_1 + ... + s_t A_t is a probability vector and therefore s_1 A_1 + ... + s_t A_t is column stochastic. If the matrices are bi-stochastic then the same argument applies to A_1^tr, ..., A_t^tr.

16. If the columns of A are c_1, ..., c_n and p = (s_1, ..., s_n)^tr then Ap = s_1 c_1 + ... + s_n c_n. Now the result follows by Lemma (12.7) and Exercise 14.

17. It follows from Exercise 16 and the definition of matrix multiplication that the product of two column stochastic matrices is column stochastic. If A is stochastic then by induction it follows that A^k is stochastic.

18. We first prove that if p, q are probability vectors and ⟨p, q⟩ = 1 then p = q = e_i for some i. Assume to the contrary that p ≠ e_i for all i. Then there exists j such that 0 < p_j < 1. Then p_j q_j < q_j, from which we conclude that Σ_{j=1}^n p_j q_j < Σ_{j=1}^n q_j = 1, a contradiction. Let the columns of A^tr be a_1, ..., a_n and the columns of A^{-1} be b_1, ..., b_n. Since AA^{-1} = I_n it follows that ⟨a_i, b_i⟩ = 1, so that {a_1, ..., a_n} = {e_1, ..., e_n} and A is a permutation matrix.

19. Assume A has more than n nonzero entries; then some row has at least two nonzero entries. Without loss of generality we can assume the first row has two nonzero entries, which we note are both less than one. Assume they are in columns i and j. Since a_1i < 1 there must be an s ≠ 1 such that a_si ≠ 0, and since a_1j < 1 there must be a t ≠ 1 such that a_tj ≠ 0. Now there are two columns with at least 2 nonzero entries. Since each column sums to 1 there must be at least one nonzero entry in every column, and we have shown this matrix must have at least n + 2 nonzero entries.

20. Let A = [a b; c d]. Since the matrix is bistochastic, a + b = a + c, so that b = c and the matrix is symmetric. Also b + d = 1 = a + b, so a = d.

21. Since A is reducible it is permutation similar to a block matrix of the form [A_1 B_st; 0_ts A_2] where A_1 is an s × s matrix and A_2 is a t × t matrix with s + t = n.


The matrix A_1 is column stochastic and the matrix A_2 is row stochastic. Now the sum of each column of A_1 is 1 and there are s columns, so the sum of all the entries in A_1 is s. Since A is row stochastic, each row of A sums to 1. Let the rows of A_1 be a_1, ..., a_s. The sum of the entries in each a_i is at most 1. Suppose some row does not sum to 1. Then the sum of all the entries in all the rows of A_1 is less than s, a contradiction. Thus each row of A_1 sums to 1 and A_1 is a bistochastic matrix. Consequently, B_st = 0_st and then A_2 is bistochastic.

12.4. Location of Eigenvalues

1. Since A is a stochastic matrix, in particular, A is a real matrix, and A^tr is row stochastic. If δ = min{a_ii | 1 ≤ i ≤ n} then C_i(A) = R_i(A^tr) = 1 − a_ii ≤ 1 − δ for all i. Suppose z ∈ Spec(A). By the column version of Gershgorin's theorem, |z − a_ii| ≤ C_i(A) for some i, and then |z − δ| ≤ |z − a_ii| + (a_ii − δ) ≤ (1 − a_ii) + (a_ii − δ) = 1 − δ.

2. This implies that A is strictly diagonally dominant; consequently, A is invertible.

3. Since Γ_i(A) ∩ Γ_j(A) = ∅ for all i ≠ j, it follows from Theorem (12.26) that each Γ_i(A) contains exactly one eigenvalue of A. Hence the eigenvalues of A are distinct, and A is diagonalizable.

4. By Theorem (12.26) each Γ_i(A) contains an eigenvalue, whence no Γ_i(A) contains two. Now assume a ∈ R and w, w̄ are complex conjugates. Then |a − w| = |a − w̄|. If A has a non-real eigenvalue w, then w̄ is also an eigenvalue, since the characteristic polynomial is real. Suppose w ∈ Γ_i(A), so that |w − a_ii| ≤ R_i(A). But then |w̄ − a_ii| ≤ R_i(A) as well, so w̄ ∈ Γ_i(A), a contradiction. Thus, the eigenvalues of A are all real.

5. Since each of the matrices Q^{-1}AQ is similar to A, we have Spec(Q^{-1}AQ) = Spec(A). Therefore Spec(A) = Spec(Q^{-1}AQ) ⊂ Γ(Q^{-1}AQ) and, consequently, Spec(A) is contained in the intersection of all the Γ(Q^{-1}AQ).

6. This is proved just like Exercise 4.

7. Denote the columns of A by c_1, ..., c_n and let I = {i_1 < ... < i_k}. We prove that the sequence (c_{i_1}, ..., c_{i_k}) is linearly independent. Let A_{I,I} be the k × k matrix whose (s, t)-entry is a_{i_s i_t}. Then A_{I,I} is strictly diagonally dominant and therefore, by Theorem (12.32), invertible. This implies that the sequence of columns of A_{I,I} is linearly independent which, in turn, implies that the sequence (c_{i_1}, ..., c_{i_k}) is linearly independent. Thus, rank(A) ≥ k.

8. Assume to the contrary that |a_ii| ≤ C_i(A) for all i; we will obtain a contradiction. Since A is strictly diagonally dominant it follows that Σ_{i=1}^n |a_ii| > Σ_{i=1}^n R_i(A) = Σ_{i≠j} |a_ij| = Σ_{i=1}^n C_i(A) ≥ Σ_{i=1}^n |a_ii|, a contradiction.

9. Since multiplying the ith row by −1 changes both the sign of det(A) and of a_ii, we can assume that all a_ii > 0 and prove that det(A) > 0. The proof is by induction on n ≥ 2. If A = [a_11 a_12; a_21 a_22] with a_11 > |a_12| and a_22 > |a_21| then det(A) = a_11 a_22 − a_21 a_12 ≥ a_11 a_22 − |a_12||a_21| > 0.

Now assume the result is true for all strictly diagonally dominant (n − 1) × (n − 1) real matrices with positive diagonal entries, and assume that A is an n × n strictly diagonally dominant real matrix with positive diagonal entries. Let a_ij be the (i, j)-entry of A. We add multiples of the first column to the other columns in order to obtain zeros in the first row. This does not change the determinant. After performing these operations the (i, j)-entry with 1 < i, j is b_ij = a_ij − a_i1 a_1j / a_11. We claim that the (n − 1) × (n − 1) matrix B with (k, l)-entry equal to b_{k+1, l+1} is strictly diagonally dominant with positive diagonal entries. The diagonal entries satisfy

a_ii − a_1i a_i1 / a_11 ≥ a_ii − |a_1i||a_i1| / a_11 > a_ii − |a_i1| > 0,

since |a_1i| < a_11 and A is strictly diagonally dominant; this establishes the claim about the diagonal. We must now show that

a_ii − a_1i a_i1 / a_11 > Σ_{j≥2, j≠i} |a_ij − a_i1 a_1j / a_11|.

We illustrate with i = 2; the other cases follow in exactly the same way. We thus have to show that

a_22 − a_12 a_21 / a_11 > Σ_{j=3}^n |a_2j − a_21 a_1j / a_11|.

Since Σ_{j=3}^n |a_2j − a_21 a_1j / a_11| ≤ Σ_{j=3}^n |a_2j| + Σ_{j=3}^n |a_21 a_1j / a_11|, it suffices to show that

a_22 > Σ_{j=3}^n |a_2j| + Σ_{j=3}^n |a_21 a_1j / a_11| + |a_12 a_21 / a_11|.

Since A is strictly diagonally dominant, it suffices to prove that Σ_{j=2}^n |a_21 a_1j / a_11| < |a_21|. Now

Σ_{j=2}^n |a_21 a_1j / a_11| = |a_21| × (1/a_11) Σ_{j=2}^n |a_1j| < |a_21|,

since a_11 > Σ_{j=2}^n |a_1j|. Now det(A) = a_11 det(B). Since B is strictly diagonally dominant with positive diagonal entries, det(B) > 0 by the inductive hypothesis. Hence det(A) > 0.

12.5. Functions of Matrices

1. i. For any n × n matrices A, B, (Q^{-1}AQ)(Q^{-1}BQ) = Q^{-1}[AB]Q. The result follows by a straightforward induction on k.
ii. Q^{-1}(M_1 + M_2)Q = [Q^{-1}M_1 + Q^{-1}M_2]Q = Q^{-1}M_1 Q + Q^{-1}M_2 Q.

2. Assume f(x) = a_0 + a_1 x + ... + a_m x^m. Now f(B) = a_0 I_n + a_1 B + ... + a_m B^m. Then Q^{-1} f(B) Q = Q^{-1}[a_0 I_n + a_1 B + ... + a_m B^m]Q. By repeated application of Lemma (12.9) ii. it follows that Q^{-1} f(B) Q = Q^{-1}(a_0 I_n)Q + Q^{-1}(a_1 B)Q + ... + Q^{-1}(a_m B^m)Q = a_0 I_n + a_1 Q^{-1}BQ + ... + a_m Q^{-1} B^m Q. By Lemma (12.9) i., this is equal to a_0 I_n + a_1 Q^{-1}BQ + ... + a_m (Q^{-1}BQ)^m = f(Q^{-1}BQ).

3. There exists an invertible matrix Q such that Q^{-1}AQ is upper triangular. Since exp(Q^{-1}AQ) = Q^{-1} exp(A) Q, we can, without loss of generality, assume that A is upper triangular, with diagonal entries λ_1, ..., λ_n. Then exp(A) is upper triangular with e^{λ_1}, ..., e^{λ_n} on the diagonal. Consequently, χ_{exp(A)} = (x − e^{λ_1}) ... (x − e^{λ_n}).

4. It follows from Theorem (12.36) and the definition of the determinant that det(exp(A)) = e^{λ_1} ... e^{λ_n} = e^{λ_1 + ... + λ_n} = e^{Trace(A)}.

5. For an n × n complex matrix we have (A^n)* = (A*)^n. We also have (A + B)* = A* + B*, and for a real scalar c, (cA)* = cA*. It then follows that if f(x) ∈ R[x] and A is an n × n complex matrix then f(A)* = f(A*). Set f_n(x) = Σ_{j=0}^n (1/j!) x^j. Then f_n(A)* = f_n(A*). It then follows that

exp(A*) = lim_{n→∞} f_n(A*) = lim_{n→∞} f_n(A)* = exp(A)*.

6. By Exercise 5, exp(A)* = exp(A*) = exp(A).

7. In general, if AB = BA then exp(A)exp(B) = exp(A + B) = exp(B + A) = exp(B)exp(A). In particular, exp(A)exp(A)* = exp(A)exp(A*) = exp(A + A*) = exp(A* + A) = exp(A*)exp(A) = exp(A)* exp(A).
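The identity det(exp(A)) = e^{Trace(A)} of Exercise 4 is easy to check numerically with a truncated power series. A small sketch (pure Python, an arbitrary 2 × 2 example of my own):

```python
import math

def mat_mul(X, Y):
    """Product of two 2x2 matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_exp(A, terms=30):
    """exp(A) approximated by the truncated series sum_j A^j / j!."""
    E = [[1.0, 0.0], [0.0, 1.0]]   # running partial sum, starts at I
    P = [[1.0, 0.0], [0.0, 1.0]]   # running power A^j
    for j in range(1, terms):
        P = mat_mul(P, A)
        E = [[E[r][c] + P[r][c] / math.factorial(j) for c in range(2)]
             for r in range(2)]
    return E

A = [[0.3, 1.2], [-0.7, 0.4]]      # Trace(A) = 0.7
E = mat_exp(A)
det_E = E[0][0] * E[1][1] - E[0][1] * E[1][0]
```

With 30 terms the truncation error is far below floating point noise here, and det_E agrees with e^{0.7} to high precision.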


Chapter 13

Applications of Linear Algebra

13.1. Least Squares

1. By Lemma (13.1), if X is a {1, 3} inverse of A then AX = AA†. Conversely, suppose AX = AA†. Then AXA = AA†A = A by (PI1) for the Moore-Penrose inverse. Also, (AX)* = (AA†)* = AA† = AX.

2. Since z = Xb where X is a {1, 3} inverse of A, we have Az = AXb = AA†b, which is a least squares solution to Ax = b. Therefore, if Ay = Az then y is a least squares solution to Ax = b.

3. Since A = BC, A* = C*B*. Substituting this for A* in the normal equation A*Ax = A*b we get C*B*Ax = C*B*b. It then follows that C*(B*Ax − B*b) = 0_n. However, since C* is an n × r matrix with rank r, it must be the case that B*Ax − B*b = 0_r, which implies that B*Ax = B*b.

4. Clearly the columns of A are not multiples of each other and consequently the sequence is linearly independent. The reduced echelon form of the matrix [1 1 9; 1 −3 3; −2 2 −6] is I_3 and therefore b ∉ col(A). A* = [1 1 −2; 1 −3 2]. The matrix version of the normal equations is

[6 −6; −6 14] x = [24; 6],

x = [31/4; 15/4].

5. Clearly the columns of A are not multiples of each other and consequently the sequence is linearly independent. The reduced echelon form of the matrix [1 2 1; 1 1 −2; 1 3 7] is I_3, so b ∉ col(A). A* = [1 1 1; 2 1 3], so the matrix version of the normal equations is

[3 6; 6 14] x = [6; 21],

x = [−7; 9/2].
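The small normal systems in Exercises 4-6 can be checked mechanically. A minimal sketch (pure Python, Cramer's rule with exact rational arithmetic):

```python
from fractions import Fraction

def solve_2x2(a, b, c, d, e, f):
    """Solve [a b; c d] x = (e, f) exactly by Cramer's rule."""
    det = a * d - b * c
    return (Fraction(e * d - b * f, det), Fraction(a * f - e * c, det))

# Normal equations A^tr A x = A^tr b from Exercise 5:
x1, x2 = solve_2x2(3, 6, 6, 14, 6, 21)   # (-7, 9/2)
```

The same helper confirms the systems of Exercises 4 and 6: solve_2x2(6, -6, -6, 14, 24, 6) gives (31/4, 15/4) and solve_2x2(4, 2, 2, 4, 24, -12) gives (10, -8).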


6. Clearly the columns of A are not multiples of each other and consequently the sequence is linearly independent. The reduced echelon form of the matrix [1 1 2; 1 1 1; 1 1 3; 1 −1 18] is [1 0 0; 0 1 0; 0 0 1; 0 0 0], so b ∉ col(A). A* = [1 1 1 1; 1 1 1 −1], so the matrix version of the normal equations is

[4 2; 2 4] x = [24; −12],

x = [10; −8].

7. The reduced echelon form of A is [1 0 0; 0 1 0; 0 0 1; 0 0 0], so the sequence of columns of A is linearly independent. The reduced echelon form of the matrix [1 1 1 1; 1 1 1 2; 1 1 −1 −1; 1 −1 1 4] is I_4. Therefore b ∉ col(A). A* = [1 1 1 1; 1 1 1 −1; 1 1 −1 1]. The matrix version of the normal equations is

[4 2 2; 2 4 0; 2 0 4] x = [6; −2; 8],

x = [3/2; −5/4; 5/4].

8. The reduced echelon form of the matrix (A b) is [1 0 2 0; 0 1 −1 0; 0 0 0 1; 0 0 0 0]. It follows that the sequence of columns of A is linearly dependent and that b ∉ col(A). The matrix version of the normal equations is

[6 4 8; 4 6 2; 8 2 14] x = [−1; −3; 1].

Set B = [1 0; 0 1; 1 2; 2 1] and C = [1 0 2; 0 1 −1]. Then A = BC is a full rank decomposition of A, and A† = C†B†. Here

B† = (B^tr B)^{-1} B^tr = [3/10 −1/5 −1/10 2/5; −1/5 3/10 2/5 −1/10],

C† = C^tr (C C^tr)^{-1} = [1/3 1/3; 1/3 5/6; 1/3 −1/6],

A† = C†B† = [1/30 1/30 1/10 1/10; −1/15 11/60 3/10 1/20; 2/15 −7/60 −1/10 3/20].

Set z = A†b. The general least squares solution is z + y where y ∈ col(I_3 − A†A) = Span([2; −1; −1]).

9. The reduced echelon form of (A b) is [1 0 1 1 0; 0 1 1 −1 0; 0 0 0 0 1; 0 0 0 0 0]. Thus b ∉ col(A) and the sequence of columns of A is linearly dependent. Let A = BC be the full rank decomposition read off from the reduced echelon form, so that C = [1 0 1 1; 0 1 1 −1] and A† = C†B†. Since A†A is the orthogonal projection onto the row space of A, which equals the row space of C,

A†A = C†C = [1/3 0 1/3 1/3; 0 1/3 1/3 −1/3; 1/3 1/3 2/3 0; 1/3 −1/3 0 2/3].

Set z = A†b = [3/5; 2/5; 1; 1/5]. Then the general least squares solution is z + y where y is in the column space of I_4 − A†A, which is equal to Span([2; 0; −1; −1], [0; 2; −1; 1]).
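The full rank decomposition route of Exercise 8 can be reproduced exactly in a few lines. The sketch below (pure Python with exact rational arithmetic; B and C are the matrices reconstructed above) builds A† = C†B† and verifies the defining identity A A† A = A:

```python
from fractions import Fraction

def mul(X, Y):
    """Matrix product of two lists-of-rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(row) for row in zip(*X)]

def inv2(M):
    """Exact inverse of an invertible 2x2 matrix."""
    a, b = M[0]; c, d = M[1]
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

B = [[1, 0], [0, 1], [1, 2], [2, 1]]   # pivot columns of A
C = [[1, 0, 2], [0, 1, -1]]            # nonzero rows of rref(A)
A = mul(B, C)

Bt, Ct = transpose(B), transpose(C)
B_dag = mul(inv2(mul(Bt, B)), Bt)      # B-dagger = (B^tr B)^{-1} B^tr
C_dag = mul(Ct, inv2(mul(C, Ct)))      # C-dagger = C^tr (C C^tr)^{-1}
A_dag = mul(C_dag, B_dag)              # A-dagger = C-dagger B-dagger
```

The Penrose conditions A A† A = A and A† A A† = A† hold exactly, and A_dag[0][0] comes out to 1/30, matching the matrix displayed above.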

10. Thus, b ∉ col(A) and the sequence of columns of A is linearly dependent. Let A = BC be a full rank decomposition of A; then A† = C†B†. This leads to the normal equations, and after performing the multiplication on the right we obtain the least squares solution x.

11. Let Q = (1/2)[1 1 1; −1 1 1; 1 −1 1; −1 −1 1]. Then A = QR where R = [2 −1 1; 0 5 4; 0 0 3], and R^{-1} = [1/2 1/10 −3/10; 0 1/5 −4/15; 0 0 1/3]. The solution is x = R^{-1}(Q^tr b) = [4; 1; −2].

12. y = 2.06 + 3.01x.

13. y = 3.51 − 356x.

14. y = 2.92 − 1.88x + 1.20x^2.

15. y = .26 + .87x + .35x^2.

16. y = .35e^{1.55t}.

17. y = 2.53e^{−.14t}.

13.2. Error Correcting Codes

1. For an arbitrary word z = (c_1 ... c_n) let spt(z) = {i | c_i ≠ 0}, so that wt(z) = |spt(z)|. For words x and y we have spt(x + y) ⊂ spt(x) ∪ spt(y), and the result follows from this.

2. d(x, z) = wt(x − z) = wt([x − y] + [y − z]) ≤ wt(x − y) + wt(y − z) by Theorem (13.8) (Exercise 1). By the definition of distance, wt(x − y) + wt(y − z) = d(x, y) + d(y, z).

3. Part i. Assume d(w, x) = t where w = (a_1 ... a_n). Then x and w differ in exactly t places. There are (n choose t) ways to pick those places. If i is such a place then the ith component of x is not a_i and we have q − 1 choices for the ith component of x. Thus, the number of such x is (n choose t)(q − 1)^t.
ii. The closed ball B_r(w) is the disjoint union of {w}, {x | d(w, x) = 1}, ..., {x | d(w, x) = r}. The result now follows by part i.

4. For a vector v set φ(v) = v · v. This is a quadratic form. A self-dual code is a totally singular subspace. If q is odd then (F_q^n, φ) is non-degenerate and the Witt index is at most n/2. If q is even and n is even then (F_q^n, φ) is a non-degenerate hyperbolic orthogonal space and so has Witt index n/2. On the other hand, if q is even and n is odd then the orthogonal space (F_q^n, φ) is nonsingular: it has a radical of dimension one (the space spanned by the all one vector) which is spanned by a non-singular vector. In this case the Witt index is (n − 1)/2.

5. a) Let z denote the all one vector. Then z ∈ H. The map x → z + x is a bijection from the collection of words of weight t to the words of weight 7 − t.
b) Since the minimum weight is 3 there are no words of weight 1 or 2 and, by part a), no words of weight 5 or 6. So the weight of a word in H is in {0, 3, 4, 7}. There is a single word of weight 0 and a single word of weight 7. There are equally many words of weight 3 and of weight 4, and their total is 14, so there are 7 of each.

6. The parity check extends 0_7 to 0_8 and extends the all one vector of length 7 to the all one vector of length 8. Each of the 7 words of weight 4 is extended by adding a component equal to zero, so each of these gives rise to a word of weight 4. On the other hand, each of the 7 words of weight 3 is extended by adding a component equal to one, so each of these also gives rise to a word of weight 4. Therefore, in all, there are 14 words of weight four in H.

7. x · x = wt(x) × 1_{F_q}, which is equal to one if wt(x) is odd and zero if wt(x) is even.

8. x · y = |spt(x) ∩ spt(y)| × 1_{F_q}. Therefore x · y = 0 if |spt(x) ∩ spt(y)| has an even number of elements and is one if |spt(x) ∩ spt(y)| is odd.

9. Since wt(x) is even for every word in H it follows by Exercise 7 that x · x = 0 for every x ∈ H. Clearly if x = 0 and y is arbitrary then x · y = 0. On the other hand, since spt(x) ∩ spt(z) = spt(x) it follows from Exercise 8 that z · x = 0. We may therefore assume that wt(x) = wt(y) = 4 and prove that x · y = 0. By Exercise 8 we need to prove that |spt(x) ∩ spt(y)| is even. Suppose |spt(x) ∩ spt(y)| = 1. Then wt(x + y) = 6, which contradicts Exercise 6. Likewise, if |spt(x) ∩ spt(y)| = 3 then wt(x + y) = 2, again a contradiction.


10. Let x, w be distinct code words. Suppose y ∈ B_3(w) ∩ B_3(x). Then d(w, x) ≤ d(w, y) + d(y, x) ≤ 3 + 3 = 6, a contradiction. Therefore B_3(w) ∩ B_3(x) = ∅. By part ii. of Theorem (13.11), |B_3(w)| = 1 + 23 + (23 choose 2) + (23 choose 3) = 1 + 23 + 253 + 1771 = 2048 = 2^11. Since C has dimension 12, the number of words in C is 2^12. Since the balls of radius 3 centered at the code words are disjoint, it follows that

|∪_{w ∈ C} B_3(w)| = |C| × |B_3(w)| = 2^12 × 2^11 = 2^23 = |F_2^23|.

Thus, {B_3(w) | w ∈ C} is a partition of F_2^23.

11. The number of nonzero vectors in F_q^n is q^n − 1. Since Span(x) contains q − 1 nonzero vectors, the number of one dimensional subspaces of F_q^n is (q^n − 1)/(q − 1) = t. Thus, there are t columns in the parity check matrix H(n, q), so the length of the Hamming (n, q)-code is t.

We next claim that rank(H(n, q)) = n. Since there are n rows, rank(H(n, q)) ≤ n. On the other hand, if X_1, ..., X_n are one dimensional subspaces with X_1 + ... + X_n = F_q^n and X_i = Span(x_i) for 1 ≤ i ≤ n, then (x_1, ..., x_n) is a basis of F_q^n. Therefore rank(H(n, q)) ≥ n. Since the Hamming code is the null space of H(n, q), we can conclude it has dimension t − n.

We now prove the assertion about the minimum distance. Since any two columns are linearly independent, the minimum distance is at least 3. However, if X, Y, W are three distinct one dimensional subspaces such that X + Y = X + W = Y + W and X = Span(x), Y = Span(y), W = Span(w), then (x, y, w) is linearly dependent. Consequently, there are words of weight three. Thus, the minimum weight is exactly three.

It now follows that the balls B_1(w), where w is in the Hamming code, are disjoint, since the minimum distance is 3. By Theorem (13.11) the number of vectors in such a ball is 1 + t(q − 1) = 1 + (q^n − 1)/(q − 1) × (q − 1) = q^n. Since there are q^{t−n} code words we get

|∪_{w ∈ Ham(n,q)} B_1(w)| = |Ham(n, q)| × |B_1(w)| = q^{t−n} × q^n = q^t.

Thus, {B_1(w) | w ∈ Ham(n, q)} is a partition of F_q^t and Ham(n, q) is a perfect 1-error correcting code.

13.3. Ranking Web Pages

1. The matrix has the block form [A 0_{5×4}; 0_{4×5} B], where A is the 5 × 5 block recording the links among the first five pages and B is the 4 × 4 block recording the links among the last four.

2. The last column is a zero column and therefore its entries do not add up to 1.

3. L = [A 0_{5×3} (1/9)j_5; 0_{4×5} B (1/9)j_4], where j_5 is the all one 5-vector, j_4 is the all one 4-vector, and B is now the 4 × 3 matrix obtained by deleting the zero column: the dangling page's column is replaced by the uniform column (1/9)j_9.

4. Since Span(e_1, e_2, e_3, e_4, e_5) is invariant, the matrix is reducible.

5. Span([12/31; 4/31; 2/31; 6/31; 7/31; 0; 0; 0; 0]).
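The vector in Exercise 5 is a stationary (Perron) eigenvector of a column stochastic matrix, and such vectors are found in practice by power iteration. A sketch in pure Python, applied to a small column stochastic matrix of my own (not the 9-page example from the text):

```python
def iterate(M, p, steps=100):
    """Repeatedly apply the column stochastic matrix M to the
    probability vector p; for a primitive M this converges to the
    stationary vector."""
    n = len(M)
    for _ in range(steps):
        p = [sum(M[i][j] * p[j] for j in range(n)) for i in range(n)]
    return p

# A small column stochastic matrix (each column sums to 1).
M = [[0.0, 0.5, 0.5],
     [0.5, 0.0, 0.5],
     [0.5, 0.5, 0.0]]
p = iterate(M, [1.0, 0.0, 0.0])
```

For this symmetric doubly stochastic example the iteration converges to the uniform vector (1/3, 1/3, 1/3), and each iterate remains a probability vector, in line with Exercise 16 of Section 12.3.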


6. The Google matrix is the 9 × 9 matrix whose columns are, in order,

(1/36, 10/36, 1/36, 10/36, 10/36, 1/36, 1/36, 1/36, 1/36)^tr,
(29/72, 1/36, 29/72, 1/36, 1/36, 1/36, 1/36, 1/36, 1/36)^tr,
(1/36, 1/36, 1/36, 1/36, 1/36, 1/36, 1/36, 1/36, 1/36)^tr,
(10/36, 1/36, 29/72, 1/36, 29/72, 1/36, 1/36, 1/36, 1/36)^tr,
(28/36, 1/36, 1/36, 1/36, 1/36, 1/36, 1/36, 1/36, 1/36)^tr,
(1/36, 1/36, 1/36, 1/36, 1/36, 1/36, 7/9, 1/36, 1/36)^tr,
(1/36, 1/36, 1/36, 1/36, 1/36, 10/36, 1/36, 10/36, 10/36)^tr,
(1/36, 1/36, 1/36, 1/36, 1/36, 1/36, 29/72, 1/36, 29/72)^tr,
(4/36, 1/9, 1/9, 1/9, 1/9, 1/9, 1/9, 1/9, 1/9)^tr.
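Each column of a Google matrix blends a page's link distribution with uniform teleportation: G_j = d L_j + ((1 − d)/n) j_n. Judging from entries such as 28/36 = 3/4 + 1/36, the example above appears to use damping factor d = 3/4. The sketch below is my own illustration of the construction (the link sets are hypothetical, not the text's graph):

```python
from fractions import Fraction

def google_column(outlinks, n, d=Fraction(3, 4)):
    """Column of the Google matrix for a page with the given outlink set.

    A dangling page (no outlinks) is treated as linking uniformly to
    every page, so every entry of its column is 1/n.
    """
    base = (1 - d) / n                 # teleportation mass per entry
    if not outlinks:
        return [Fraction(1, n)] * n
    share = d / len(outlinks)          # link mass per outlink
    return [base + (share if i in outlinks else 0) for i in range(n)]

# A page linking only to page 0 in a 9-page web:
col = google_column({0}, 9)            # col[0] == 7/9, others == 1/36
```

Every such column sums to 1, so assembling one per page always yields a column stochastic matrix.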
