Student Solution Manual to accompany the 5th edition of Vector Calculus, Linear Algebra, and Differential Forms: A Unified Approach [5 ed.] 9780971576698

Detailed solutions to every odd-numbered exercise. "I particularly enjoyed problem A1.5. I thought I'd never s

201 77 6MB

English Pages 305 [315]

Table of contents :
Cover
Solutions for Chapter 0
Solutions for Chapter 1
Solutions for review exercises, chapter 1
Solutions for Chapter 2
Solutions for review exercises, chapter 2
Solutions for Chapter 3
Solutions for review exercises, chapter 3
Solutions for Chapter 4
Solutions for review exercises, chapter 4
Solutions for Chapter 5
Solutions for review exercises, chapter 5
Solutions for Chapter 6
Solutions for review exercises, chapter 6
Solutions for the Appendix
Index

Recommend Papers

Vector Calculus, Linear Algebra, and Differential Forms: A Unified Approach [5 ed.] 0971576688, 9780971576681

320 72 28MB Read more

Vector Calculus and Differential Equations

This book is intended to fill the needs of a full year sophomore course to complete a two-year sequence of calculus with

112 14 78MB Read more

Prelude to calculus and linear algebra

105 25 Read more

Differential Equations and Linear Algebra

714 40 8MB Read more

Vector Analysis Versus Vector Calculus (Instructor Solution Manual, Solutions) [1 ed.] 9781461421993, 1461421993

obtained thanks to https://t.me/HermitianSociety

221 58 2MB Read more

Vector and Geometric Calculus (Geometric Algebra & Calculus) 1480132454, 9781480132450

This textbook for the undergraduate vector calculus course presents a unified treatment of vector and geometric calculus

490 20 7MB Read more

Calculus 9th edition by Adams Solution Manual

Author(s): Adams Series: The God of Education Year: 2018

0 0 8MB Read more

Calculus, Vol. 2: Multi-Variable Calculus and Linear Algebra with Applications to Differential Equations and Probability [2 ed.] 9780471000075

An introduction to the calculus, with an excellent balance between theory and technique. Integration is treated before d

419 42 47MB Read more

Mechanical vibrations 5th edition solution manual

4,164 929 32KB Read more

Linear Algebra and Differential Equations using MATLAB

917 106 4MB Read more

Student Solution Manual to accompany the 5th edition of Vector Calculus, Linear Algebra, and Differential Forms: A Unified Approach [5 ed.]
9780971576698

Author / Uploaded
John H. Hubbard and Barbara Burke Hubbard

Similar Topics
Mathematics

0 0 0
Like this paper and download? You can publish your own PDF file online for free in a few minutes! Sign Up

File loading please wait...

Citation preview

Student Solution Manual to accompany the 5th edition of

Vector Calculus, Linear Algebra, and Differential Forms: A Unified Approach

John Hamal Hubbard

Barbara Burke Hubbard

Cornell University ´ Aix-Marseille Universite

Matrix Editions

Ithaca, NY 14850 MatrixEditions.com

Copyright 2015 by Matrix Editions 214 University Ave. Ithaca, NY 14850 www.MatrixEditions.com

All rights reserved. This book may not be translated, copied, or reproduced, in whole or in part, in any form or by any means, without prior written permission from the publisher, except for brief excerpts in connection with reviews or scholarly analysis. Printed in the United States of America 10 9 8 7 6 5 4 3 2

ISBN: 978-0-9715766-9-8

Contents

Note to students

iv

Chapter 0: Preliminaries

1

Chapter 1: Vectors, matrices, and derivatives 9 Chapter 1 review exercises 31 Chapter 2: Solving equations 40 Chapter 2 review exercises 74 Chapter 3: Higher partial derivatives, quadratic forms, and manifolds 85 Chapter 3 review exercises 122 Chapter 4: Integration 133 Chapter 4 review exercises 173 Chapter 5: Volumes of manifolds 182 Note on parametrizations 183 Chapter 5 review exercises 204 Chapter 6: Forms and vector calculus 207 Chapter 6 review exercises 266 Appendix A: Some harder proofs 277 Index 305

Note to Students

You should not attempt to use this manual with a previous edition of the textbook, as some sections of the textbook have been renumbered, new sections have been added, some exercises have been moved from one section to another, some earlier exercises have been discarded, and some new ones have been added. In this manual, we do not number every equation. When we do, we use (1), (2), . . . , starting again with (1) in a new solution.

This manual gives solutions to all odd-numbered exercises in the fifth edition of Vector Calculus, Linear Algebra, and Di↵erential Forms: A Unified Approach. Please think about the exercises before looking at the solutions! You will get more out of them that way. Some exercises are straightforward; their solutions are correspondingly straightforward. Others are “entertaining”. You may be tempted go straight to the solution manual to look for problems similar to those assigned as your homework, without reading the text. This is a mistake. You may learn some of the simpler material that way, but if you don’t read the text, soon you will find you cannot understand the solutions. The textbook contains a lot of challenging material; you really have to come to terms with the underlying ideas and definitions. Errata for this manual and for the textbook will be posted at www.matrixeditions.com/errata.html If you think you have found errors either in the text or in this manual, please email us at [email protected] or [email protected] so that we can make the errata lists as complete as possible.

John H. Hubbard and Barbara Burke Hubbard

iv

Solutions for Chapter 0

Solution 0.2.1: The negation in part a is false; the original statement is true: 5 = 4 +1, 13 = 9 +4, 17 = 16 + 1, 29 = 25 + 4, . . . , 97 = 81 + 16, . . . . If you divide a whole number (prime or not) by 4 and get a remainder of 3, then it is never the sum of two squares: 3 is not, 7 is not, 11 is not, etc. You may well be able to prove this; it isn’t all that hard. But the original statement about primes is pretty tricky. The negation in part b is false also, and the original statement is true: it is the definition of the mapping x 7! x2 being continuous. But in part c, the negation is true, and the original statement is false. Indeed, if you take ✏ = 1 and any > 0 and set x = 1 , y = 1 + 2 , then |y 2

x2 | = |y =

2

✓

x||y + x| ◆ 2 + > 1. 2

The original statement says that the function x 7! x2 is uniformly continuous, and it isn’t.

0.2.1 a. The negation of the statement is: There exists a prime number such that if you divide it by 4 you have a remainder of 1, but which is not the sum of two squares. b. The negation is: There exist x 2 R and ✏ > 0 such that for all > 0 there exists y 2 R with |y x| < and |y 2 x2 | ✏. c. The negation is: There exists ✏ > 0 such that for all x, y 2 R with |x y| < and |y 2 x2 | ✏. 0.3.1 a. (A ⇤ B) ⇤ (A ⇤ B)

b. (A ⇤ A) ⇤ (B ⇤ B)

> 0, there exist

c. A ⇤ A

0.4.1 a. No: Many people have more than one aunt, and some have none. b. No,

1 0

is not defined.

c. No. 0.4.3 Here are some examples: a. “DNA of” from people (excluding clones and identical twins) to DNA patterns. b. f (x) = x. 0.4.5 The following are well defined: g f : A ! C, h k : C ! C, k g : B ! A, k h : A ! A, and f k : C ! B. The others are not, unless some of the sets are subsets of the others. For example, f g is not because the codomain of g is C, which is not the domain of f , unless C ⇢ A. 0.4.7 a. f g(h(3)) = f g( 1) = f ( 3) = 8 b. f g(h(1)) = f g( 2) = f ( 5) = 25 0.4.9 If the argument of the square root is nonnegative, the square root can be evaluated, so the open first and the third quadrants are in the natural domain. The x-axis is not (since y = 0 there), but the y-axis with the origin removed is in the natural domain, since x/y = 0 there. 0.4.11 The function is defined for { x 2 R | also defined for the negative odd integers.

1 < x < 0, or 0 < x }. It is

0.5.1 Without loss of generality we may assume that the polynomial is of odd degree d and that the coefficient of xd is 1. Write the polynomial xd + ad Let A = |a0 | + · · · + |ad ad

d 1 1A

1|

d 1 1x

+ · · · + a0 .

+ 1. Then

+ ad

d 2 2A

+ · · · + a0  (A 1

1)Ad

1

,

2

Solutions for Chapter 0

and similarly, ad

1(

A)d

1

+ ad

A)d

2(

2

+ · · · + a0  (A

1)Ad

1

.

Therefore, p(A)

Ad

(A

1)Ad

1

> 0 and p( A)  ( A)d + (A

1)Ad

1

< 0.

By the intermediate value theorem, there must exist c with |c| < A such that p(c) = 0. 0.5.3 Suppose f : [a, b] ! [a, b] is continuous. Then the function g(x) = x f (x) satisfies the hypotheses of the intermediate value theorem (Theorem 0.5.9): g(a)  0 and g(b) 0. So there must exist x 2 [a, b] such that g(x) = 0, i.e., f (x) = x.

We give two solutions to Exercise 0.5.5.

0.5.5 First solution P1 P1 Let k=1 bk be a rearrangement of k=1 ak , which means that there exists a bijective map ↵ : N ! N such that b↵(k) = ak . P1 Since the series k=1 ak is absolutely convergent, we know it is converP1 gent. Let A = k=1 ak . We must show that ! m X (8✏ > 0)(9M ) m M =) bk A < ✏ . First choose N such that (n > N ) =)

n X

P1

N +1

ak

k=1

|ak | < ✏/2; in that case,

A 

k=1

1 X

k=n

|ak | 

1 X

k=N +1

|ak | < ✏/2.

Now set M = max{↵(1), . . . , ↵(N )}, i.e., M is so large that all of a1 , . . . , aN appear among b1 , . . . , bM . Then m > M =)

m X

k=1

bk

A 

N X

ak

A +

k=1

1 X

k=N +1

|ak | < ✏.

Second solution P1 The series k=1 ak + |ak | is a convergent series of positive numbers, so its sum is the sup of all finite sums of terms. P1 P1 If k=1 bk is a rearrangement of the same series, then k=1 bk + |bk | is P1 also a rearrangement of k=1 ak + |ak |. P1 P1 But the finite sums of terms of k=1 ak + |ak | and k=1 bk + |bk | are exactly the same set of numbers, so these two series converge to the same limit. The same argument says that 1 X

k=1

|ak | =

1 X

k=1

|bk |.

Solutions for Chapter 0

The first equality is justified because both series 1 1 X X (bk + |bk |) and |bk | k=1

Thus 1 X

k=1

bk =

1 X

3

1 X

(bk + |bk |)

k=1

k=1

|bk | =

1 X

1 X

(ak + |ak |)

k=1

k=1

|ak | =

1 X

ak .

k=1

k=1

converge, so their di↵erence does also.

Solution 0.6.1, part a: There are infinitely many ways of solving this problem, and many seem just as natural as the one we propose.

0.6.1 a. Begin by listing the integers between 1 and 1, then list the numbers between 2 and 2 that can be written with denominators  2 and which haven’t already been listed, then list the rationals between 3 and 3 that can be written with denominators  3 and haven’t already been listed, etc. This will eventually list all rationals. Here is the beginning of the list: 1, 0, 1, 2,

3 , 2

1 1 3 , , , 2, 3, 2 2 2

8 , 3

5 , 2

7 , 3

5 , 3

4 , 3

2 , 3

1 1 2 , , ,... 3 3 3

b. Just as before, list first the finite decimals in [ 1, 1] with at most one digit after the decimal point (there are 21 of them), then the ones in [ 2, 2] with at most two digits after the decimal, and which haven’t been listed earlier (there are 380 of these), etc.

Solution 0.6.3: The same argument holds if one uses the function g, with derivative ⇡ g 0 (x) = . 2(1 + cos2 ⇡x ) 2

0.6.3 It is easy to write a bijective map ( 1, 1) ! R. For instance, x ⇡x f (x) = or g(x) = tan . 1 x2 2 The derivative of the first mapping is f 0 (x) =

(1

x2 ) + 2x2 1 + x2 = >0 (1 x2 )2 (1 x2 )2

so the mapping is monotone increasing on ( 1, 1), hence injective. Since it is continuous on ( 1, 1) and lim f (x) =

x& 1

1 and

lim f (x) = +1,

x%+1

it is surjective by the intermediate value theorem. 0.6.5 a. To the right, if a 2 A is an element of a chain, then it can be continued: a, f (a), g f (a) , f g(f (a)) , . . . , and this is clearly the only possible continuation. Similarly, if b is an element of a chain, then the chain can be continued b, g(b), f g(b) , g f (g(b)) , . . . , and again this is the only possible continuation. Let us set A1 = g(B) and B1 = f (A). Let f1 : A1 ! B be the unique map such that f1 (g(b)) = b for all b 2 B, and let g1 : B1 ! A be the unique map such that g1 (f (a)) = a for all a 2 A. To the left, if a 2 A1 is an element of a chain, we can extend to f1 (a), a. Then if f (a1 ) 2 B1 , we can extend one further, to g1 (f1 (a)), and if g1 (f1 (a)) 2 A1 , we can extend further to f1 g1 (f1 (a)) ,

g1 f1 (a) ,

f1 (a),

a.

4

Solutions for Chapter 0

Either this will continue forever or at some point we will run into an element of A that is not in A1 or into an element of B that is not in B1 ; the chain necessarily ends there. b. As we saw, any element of A or B is part of a unique infinite chain to the right and of a unique finite or infinite chain to the left. Part c: (1) can be uniquely continued forever (2) can be continued to an element of A that is not in the image of g (3) can be continued to an element of B that is not in the image of f

c. Since every element of A and of B is an element of a unique chain, and since this chain satisfies either (1), (2), or (3) and these are exclusive, the mapping h is well defined. If h(a1 ) = h(a2 ) and a1 belongs to a chain of type 1 or 2, then so does f (a1 ), hence so does f (a2 ), hence so does a2 , and then h(a1 ) = h(a2 ) implies a1 = a2 , since f is injective. Now suppose that a1 belongs to a chain of type 3. Then a1 is not the first element of the list, so a1 2 A1 , and h(a1 ) = f1 (a1 ) is well defined. The element h(a2 ) is a part of the same chain, hence also of type 3, and h(a2 ) = f1 (a2 ). But then a1 = g(f1 (a1 )) = g(f1 (a2 )) = a2 . So h is injective. Now any element b 2 B belongs to a maximal chain. If this chain is of type 1 or 2, then b is not the first element of the chain, so b 2 B1 and h(g1 (b)) = f (g1 (b)) = b, so b is in the image. If b is in a chain of type 3, then b = f1 (g(b)) = h(g(b)), so again b is in the image of h, proving that h is surjective. d. This is the only entertaining part of the problem. In this case, there is only one chain of type 1, the chain of all 0’s. There are only two chains of type 2, which are f

+1 !

1 g 1 f 1 g 1 f 1 ! ! ! ! .... 2 2 4 4 8

1 g 1 f 1 g 1 f 1 ! ! ! ! .... 2 2 4 4 8 All other chains are of type 3, and end to the left with a number in (1/2, 1) or in ( 1, 1/2), which are the points in B f (A). Such a sequence might be 3 g 3 f 3 g 3 f 3 ! ! ! ! .... 4 4 8 8 16 Following our definition of h, we see that 8 if x = 0 >

: x if x 6= ±1/2k for some k 0. f

1!

The existence of this function h is Bernstein’s theorem. Solution 0.6.7: Two sets A and B have the same cardinality if there exists an invertible mapping A ! B. Here, finding such an invertible map would be difficult, so instead we use Bernstein’s theorem, which says that if there is an injective map A ! B and an injective map B ! A, then there is a invertible map A ! B.

0.6.7 a. There is an obvious injective map [0, 1) ! [0, 1) ⇥ [0, 1), given simply by g : x 7! (x, 0). We need to construct an injective map in the opposite direction; that is a lot harder. Take a point in (x, y) 2 [0, 1)⇥[0, 1), and write both coordinates as decimals: x = .a1 a2 a3 . . .

and y = .b1 b2 b3 . . . ;

Solutions for Chapter 0

Remember that injective = one to one surjective = onto bijective = invertible. Bernstein’s theorem is the object of Exercise 0.6.5.

5

if either number can be written in two ways, one ending in 0’s and the other in 9’s, use the one ending in 0’s. Now consider the number f (x, y) = .a1 b1 a2 b2 a3 b3 . . . . The mapping f is injective: using the even and the odd digits of f (x, y) allows you to reconstruct x and y. The only problem you might have is if f (x, y) could be written in two di↵erent ways, but as constructed f (x, y) will never end in all 9’s, so this doesn’t happen. Note that this mapping is not surjective; for instance, .191919 . . . is not in the image. But Bernstein’s theorem guarantees that since there are injective maps both ways, there is a bijection between [0, 1) and [0, 1) ⇥ [0, 1).

b. The proof of part a works also to construct a bijective mapping (0, 1) ! (0, 1) ⇥ (0, 1). But it is easy to construct a bijective mapping (0, 1) ! R, for instance x 7! cot(⇡x). If we compose these mappings, we find bijective maps R ! (0, 1) ! (0, 1) ⇥ (0, 1) ! R ⇥ R. c. We can use part b repeatedly to construct bijective maps Part c: The map (f, id) : R ⇥ R ! R

maps the first R to R ⇥ R and the second R to itself.

Solution 0.7.3, part b: If you had trouble with this, note that it follows from Proposition 0.7.5 (geometrical representation of multiplication of complex numbers) that

1

1 1 = , z |z|

since zz = 1, which has length 1. The modulus (absolute value) p of 3 + 4i is 25 = 5, so the modulus of (3 + 4i) 1 is 1/5. It follows from the second statement in Proposition 0.7.5 that the polar angle of z 1 is minus the polar angle of z, since the two angles sum to 0, which is the polar angle of the product zz 1 = 1. The polar angle of 3 + 4i is arccos(3/5).

f

(f, id)

(f, id)

R ! R ⇥ R ! (R ⇥ R) ⇥ R = R ⇥ R2 ! (R ⇥ R) ⇥ R2 (f, id)

= R ⇥ R3 ! (R ⇥ R) ⇥ R3 = R ⇥ R4 · · · .

Continuing this way, it is easy to get a bijective map R ! Rn .

0.6.9 It is not possible. For suppose it were, and consider as in equation 0.6.3 the decimal made up of the entries on the diagonal. If this number is rational, then the digits are eventually periodic, and if you do anything systematic to these digits, such as changing all digits that are not 7 to 7’s, and changing 7’s to 5’s, the sequence of digits obtained will still be eventually periodic, so it will represent a rational number. This rational number must appear someplace in the sequence, but it doesn’t, since it has a di↵erent kth digit than the kth number for every k. The only weakness in this argument is that it might be a number that can be written in two di↵erent ways, but it isn’t, since it has only 5’s and 7’s as digits. 0.7.1 “Modulus of z,” “absolute value of z,” and |z| are synonyms. “Real part of z” is the same as Re z = a. “Imaginary part of z” is the same as Im z = b. The “complex conjugate of z” is the same as z¯. p 0.7.3 a. The absolute value ofp2 + 4i is |2 + 4i| = 2 5. The argument (polar angle) of 2+4i is arccos 1/ 5, which you could also write as arctan 2. b. The absolute value of (3 + 4i) is arccos(3/5).

1

is 1/5. The argument (polar angle)

p c. The absolute value of (1 + i)5 is 4 2.pThe argument is 5⇡/4. (The complex number 1 + i has absolute value 2 and polar angle ⇡/4. De Moivre’s formula says how to compute these for (1 + i)5 .)

6

Solutions for Chapter 0

d. The absolute value of 1 + 4i is

p p 17; the argument is arccos 1/ 17.

0.7.5 Parts 1–4 are immediate. For part 5, we find (z1 z2 )z3 = (x1 x2

y1 y2 ) + i(y1 x2 + x1 y2 ) (x3 + iy3 )

= (x1 x2 x3

y1 y2 x3

+ i(x1 x2 y3

y1 x2 y3

x1 y2 y3 )

y1 y2 y3 + y1 x2 x3 + x1 y2 x3 ),

which is equal to z1 (z2 z3 ) = (x1 + iy1 )((x2 x3 = (x1 x2 x3

y2 y3 ) + i(y2 x3 + x2 y3 ))

x1 y2 y3

+ i(y1 x2 x3

y1 y2 x3

y1 x2 y3 )

y1 y2 y3 + x1 y2 x3 + x1 x2 y3 ).

Parts 6 and 7 are immediate. For part 8, multiply out: ✓ ◆ ✓ a b a2 b2 ab (a + ib) i 2 = 2 + 2 +i 2 2 2 2 2 2 a +b a +b a +b a +b a + b2

ab 2 a + b2

= 1 + i0 = 1.

◆

Part 9 is also a matter of multiplying out: z1 (z2 + z3 ) = (x1 + iy1 ) (x2 + iy2 ) + (x3 + iy3 ) = (x1 + iy1 )((x2 + x3 ) + i(y2 + y3 )) = x1 (x2 + x3 ) = x1 x2

y1 (y2 + y3 ) + i y1 (x2 + x3 ) + x1 (y2 + y3 )

y1 y2 + i(y1 x2 + x1 y2 ) + x1 x3

y1 y3 + i(y1 x3 + x1 y3 )

= z1 z2 + z1 z3 . Solution 0.7.7: Remember that the set of points such that the sum of their distances to two points is constant is an ellipse, with foci at those points.

0.7.7 a. The equation |z u| + |z v| = c represents an ellipse with foci at u and v, at least if c > |u v|. If c = |u v| it is the degenerate ellipse consisting of the segment [u, v], and if c < |u v| it is empty, by the triangle inequality, which asserts that if there is a z satisfying the equality, then c < |u

v|  |u

z| + |z

v| = c.

b. Set z = x + iy; the inequality |z| < 1 Re z becomes p x2 + y 2 < 1 x,

We should worry that squaring the equation might have introduced parasitic points, where p x2 + y 2 < 1 x. This is not the case, since 1 x is positive throughout the region.

corresponding to a region bounded by the curve of equation p x2 + y 2 = 1 x.

If we square this equation, we will get the curve of equation 1 x2 + y 2 = 1 2x + x2 , i.e., x = (1 y 2 ), 2 which is a parabola lying on its side. The original inequality corresponds to the inside of the parabola. p i± 1 8 0.7.9 a. The quadratic formula gives x = , so the solutions 2 are x = i and x = 2i.

Solutions for Chapter 0

7

b. In this case, the quadratic formula gives p p 1± 1 8 1±i 7 x2 = = . 2 2 Each of these numbers has two square roots, which we still need to find. One way, probably the best, is to use the polar form; this gives x2 = r(cos ✓ ± i sin ✓), where

p 1+7 p r= = 2, 2

1 p ⇡ 1.2094 . . . radians. 2 2

✓ = ± arccos

Thus the four roots are p 4 ± 2(cos ✓/2 + i sin ✓/2) and

±

p 4 2(cos ✓/2

i sin ✓/2).

c. Multiplying the first equation by (1 + i) and the second by i gives i(1 + i)x

(2 + i)(1 + i)y = 3(1 + i)

i(1 + i)x

y = 4i,

which gives (2 + i)(1 + i)y + y = 3

i,

Substituting this value for y then gives x =

1 i.e., y = i + . 3 7 3

8 3 i.

0.7.11 a. These are the vertical line x = 1 and the circle centered at the origin of radius 3. b. Use Z = X + iY as the variable in the codomain. Then (1 + iy)2 = 1

y 2 + 2iy = X + iY

gives 1 X = y 2 = Y 2 /4. Thus the image of the line is the curve of equation X = 1 (Y 2 /4), which is a parabola with horizontal axis. The image of the circle is another circle, centered at the origin, of radius 9, i.e., the curve of equation X 2 + Y 2 = 81. c. This time use Z = X + iY as the variable in the domain. Then the inverse image of the line = Re z = 1 is the curve of equation Re (X + iY )2 = X 2

Y 2 = 1,

which is a hyperbola. The inverse image of the curve p of equation |z| = 3 is 2 2 the curve of equation |Z | = |Z| = 3, i.e., |Z| = 3, the circle of radius p 3 centered at the origin. 0.7.13 a. The cube roots of 1 are 1,

2⇡ 2⇡ cos + i sin = 3 3

p 1 i 3 + , 2 2

b. The fourth roots of 1 are 1, i,

1,

4⇡ 4⇡ cos + i sin = 3 3 i.

1 2

p i 3 . 2

8

Solutions for Chapter 0

c. The sixth roots of 1 are p 1 3 1 1, 1, +i , 2 2 2

p 3 i , 2

p 1 3 +i , 2 2

1 2

p 3 i 2

0.7.15 a. The fifth roots of 1 are cos 2⇡k/5 + i sin 2⇡k/5, p 5

The point of the question is to find these numbers in some more manageable form. One possible approach is to set ✓ = 2⇡/5, and to observe that cos 4✓ = cos ✓. If you set x = cos ✓, this leads to the equation 2(2x2

p 5

1

Figure for solution 0.7.15, part b

for k = 0, 1, 2, 3, 4.

1)2

1 = x i.e., 8x4

8x2

x + 1 = 0.

This still isn’t too manageable, until you start asking what other angles satisfy cos 4✓ = cos ✓. Of course ✓ = 0 does, meaning that x = 1 is one root of our equation. But ✓ = 2⇡/3 does also, meaning that 1/2 is also a root. Thus we can divide: 8x4 (x

8x2 x + 1 = 4x2 + 2x 1)(2x + 1)

1,

and cos 2⇡/5 is the positive root of that quadratic equation, i.e., p p p 2⇡ 5 1 2⇡ 10 + 2 5 cos = , which gives sin = . 5 4 5 4 The fifth roots of 1 are now p p p p p p 5 1 10 + 2 5 5+1 10 2 5 1, ±i , ±i . 4 4 4 4 p b. It is straightforward to draw a line segment of length ( 5 1)/4: p construct a rectangle with sides 1 and 2, so the diagonal has length 5. Then subtract 1 and divide twice by 2, as shown in the figure in the margin.

Solutions for Chapter 1

1.1.1 a.



  1 2 3 + = 3 1 4





 2 1 = 1 3

c.

1 3

2 = 1

b. 2 

1 2

d.





 2 4 = 4 8

   3 3 1 4 + ~e1 = + = 2 2 0 2

Figure for Solution 1.1.1. From left: (a), (b), (c), and (d). 1.1.3 a. ~v 2 R3 b. L ⇢ R2 e. B0 ⇢ B1 ⇢ B2 , . . . . 2 3 1 617 6 . 7 Pn .7 1.1.5 6 6 . 7 = i=1 ~ei 415 1

2

c. C ⇢ R3 3

1 6 2 7 6 . 7 Pn 6 .. 7 = ei i=1 i~ 6 7 4n 15 n

d. x 2 C2 2

0 0 3 4 .. .

3

6 7 6 7 6 7 6 7 Pn 6 7= ei i=3 i~ 6 7 6 7 6 7 4n 15 n

1.1.7 As shown in the figure in the margin, the vector field in part a points straight up everywhere. Its length depends only on how far you are from the z-axis, and it gets longer and longer the further you get from the z-axis; it vanishes on the z-axis. The vector field in part b is simply rotation in the (x, y)-plane, like (f) in Exercise 1.1.6. But the z-component is down when z > 0 and up when z < 0. The vector field in part c spirals out in the (x, y)-plane, like (h) in Exercise 1.1.6. Again, the z-component is down when z > 0 and up when z < 0.

Solution 1.1.7. Top: The vector field in part a points up. Middle: The vector field in part b is rotation in the (x, y)-plane. Bottom: The vector field in part c spirals out in the (x, y)-plane.

1.1.9 The number 0 is in the set, since Re (w0) = 0. If a and b are in the set, then a+b is also in the set, since Re (w(a+b)) = Re (wa)+Re (wb) = 0. If a is in the set and c is a real number, then ca is in the set, since then Re (wca) = cRe (wa) = 0. So the set is a subspace of C. The subspace is a line in C with a polar angle ✓ such that ✓ + ' = k⇡/2, where ' is the polar angle of w and k is an odd integer. 1.2.1 a.

i. 2 ⇥ 3

ii. 2 ⇥ 2

iii. 3 ⇥ 2 9

iv. 3 ⇥ 4

v. 3 ⇥ 3

10

Solutions for Chapter 1

b. The matrices i and v can be multiplied on the right by the matrices iii, iv, v; the matrices ii and iii on the right by the matrices i and ii.  5 1.2.3 a. b. [ 6 16 2 ] 2 1.2.5 a. True: (AB)> = B > A> = B > A b. True: (A> B)> = B > (A> )> = B > A = B > A> c. False: (A> B)> = B > (A> )> = B > A 6= BA d. False: (AB)> = B > A> 6= A> B > 1.2.7 The matrices a and d have no transposes here. The matrices b and f are transposes of each other. The matrices c and e are transposes of each other. 2 3  1 2 1 1 1.2.9 a. A> = , B> = 4 0 1 5 0 0 2 3 1 0 1 1 b. (AB)> = B > A> = 4 0 0 5 1 1 2 3  > 1 1 1 0 1 c. (AB)> = = 40 05 1 0 1 1 1 d. The matrix multiplication A> B > is impossible.

1.2.11 The expressions b, c, d, f, g, and i make no sense. 1.2.13 The trivial case is when a = b = c = d = 0; then obviously ad bc = 0 and the matrix is not invertible. Let us suppose d 6= 0. (If we suppose that any other entry is nonzero, the proof would work the same way.) If ad = bc, then the first row is a multiple  b of bthe second: we can c dd write a = db c and b = db d, so the matrix is A = d . c d To show that A is not invertible,  we need to show that there is no matrix a0 b0 1 0 B = such that AB = . But if the upper left corner of c0 d0 0 1 b 0 0 AB is 1, then we have d (a c + c d) = 1, so the lower left corner, which is a0 c + c0 d, cannot be 0. 2 32 3 2 3 1 a b 1 x y 1 a + x az + b + y 1.2.15 4 0 1 c 5 4 0 1 z 5 = 4 0 1 c+z 5 0 0 1 0 0 1 0 0 1 So x =

a, z =

c, and y = ac

b.

Solutions for Chapter 1

11

1.2.17 With the labeling shown in the margin, the adjacency matrices are

a. 2

2

b. 1

1

3

2

2

4

3

Labeling for Solution 1.2.17

3 0 1 1 AT = 4 1 0 1 5 1 1 0

2 A2T = 4 1 1 2 10 A5T = 4 11 11

2 0 6 A2S = 4 2 0 2 0 6 16 5 AS = 4 0 16

2

2

0 2 2 0 0 2 2 0 16 0 16 0

3 1 1 2 15 1 2

2

0 61 AS = 4 0 1

3 2 3 3 A3T = 4 3 2 3 5 3 3 2 3

11 11 10 11 5 11 10

3 2 0 0 27 4 6 3 5 AS = 4 0 0 2 4 3 0 16 16 0 7 5 0 16 16 0

4 0 4 0

0 4 0 4

3 4 07 5 4 0

1 0 1 0

0 1 0 1

3 1 07 5 1 0

2

3 6 5 5 A4T = 4 5 6 5 5 5 5 6

2

8 0 6 A4S = 4 8 0

0 8 0 8

8 0 8 0

3 0 87 5 0 8

The diagonal entries of An are the number of walks we can take of length n that take us back to our starting point. c. In a triangle, by symmetry there are only two di↵erent numbers: the number an of walks of length n from a vertex to itself, and the number bn of walks of length n from a vertex to a di↵erent vertex. The recurrence relation relating these is an+1 = 2bn

and bn+1 = an + bn .

These reflect that to walk from a vertex V1 to itself in time n + 1, at time n we must be at either V2 or V3 , but to walk from a vertex V1 to a di↵erent vertex V2 in time n + 1, at time n we must be either at V1 or at V3 . If |an bn | = 1, then an+1 bn+1 = |2b2 (an + bn )| = |bn an | = 1.

d. Color two opposite vertices of the square black and the other two white. Every move takes you from a vertex to a vertex of the opposite color. Thus if you start at time 0 on black, you will be on black at all even times, and on white at all odd times, and there will be no walks of odd length from a vertex to itself. e. Suppose such a coloring in black and white exists; then every walk goes from black to white to black to white . . . , in particular the (B, B) and the (W, W ) entries of An are 0 for all odd n, and the (B, W ) and (W, B) entries are 0 for all even n. Moreover, since the graph is connected, for any pair of vertices there is a walk of some length m joining them, and then the corresponding entry is nonzero for m, m + 2, m + 4, . . . since you can

12

Solution 1.2.19, part b: There are three ways to go from Vi to an adjacent vertex Vj and back again in three steps: go, return, stay; go, stay, return; and stay, go, return. Each vertex Vi has three adjacent vertices that can play the role of Vj . That brings us to nine. The couch potato itinerary “stay, stay, stay” brings us to 10.

Solutions for Chapter 1

go from the point of departure to the point of arrival in time m, and then bounce back and forth between this vertex and one of its neighbors. Conversely, suppose the entries of An are zero or nonzero as described, and look at the top line of An , where n is chosen sufficiently large so that any entry that is ever nonzero is nonzero for An 1 or An . The entries correspond to pairs of vertices (V1 , Vi ); color in white the vertices Vi for which the (1, i) entry of An is zero, and in black those for which the (1, i) entry of An+1 is zero. By hypothesis, we have colored all the vertices. It remains to show that adjacent vertices have di↵erent colors. Take a path of length m from V1 to Vi . If Vj is adjacent to Vi , then there certainly exists a path of length m + 1 from V1 to Vj , namely the previous path, extended by one to go from Vi to Vj . Thus Vi and Vj have opposite colors. 1.2.19

2

1 61 6 60 6 61 a. B = 6 60 6 61 4 0 0 2 3 2 4 2 2 2 2 2 2 0 10 2 4 2 2 0 2 2 2 6 7 6 10 6 7 6 62 2 4 2 2 0 2 27 6 6 6 7 6 2 2 2 4 2 2 0 2 6 7 6 10 b. B 2 = 6 7 B3 = 6 62 0 2 2 4 2 2 27 6 6 6 7 6 62 2 0 2 2 4 2 27 6 10 4 5 4 2 2 2 0 2 2 4 2 6 0 2 2 2 2 2 2 4 6

Solution 1.2.21, part b: The first column of an adjacency matrix corresponds to vertex 1, the second to vertex 2, and so on, and the same for the rows. If the matrix for example, 2 is upper triangle, 3 1 1 1 0 60 0 1 07 6 7 4 0 0 1 1 5, you can go from 0 0 0 1 vertex 1 to vertex 2, but not from 2 to 1; from 2 to 3, but not from 3 to 2, and so on: once you have gone from a lower-numbered vertex to a higher-numbered vertex, there is no going back.

1 1 1 0 0 0 1 0

0 1 1 1 0 0 0 1

10 10 10 6 6 6 10 6

1 0 1 1 1 0 0 0 6 10 10 10 6 6 6 10

0 0 0 1 1 1 0 1 10 6 10 10 10 6 6 6

1 0 0 0 1 1 1 0 6 6 6 10 10 10 6 10

0 1 0 0 0 1 1 1

3 0 07 7 17 7 07 7 17 7 07 5 1 1

10 6 6 6 10 10 10 6

6 10 6 6 6 10 10 10

3 6 6 7 7 10 7 7 6 7 7 10 7 7 6 7 5 10 10

The diagonal entries of B 3 correspond to the fact that there are exactly 10 loops of length 3 going from any given vertex back to itself. 1.2.21 a. The proof is the same as with unoriented walks (Proposition 1.2.23): first we state that if Bn is the n ⇥ n matrix whose i, jth entry is the number of walks of length n from Vi to Vj , then B1 = A1 = A for the same reasons as in the proof of Proposition 1.2.23. Here again if we assume the proposition true for n, we have: (Bn+1 )i,j =

n X

(Bn )i,k (B1 )k,j =

k=1 n

So A = Bn for all n.

n X

(An )i,k Ak,j = (An+1 )i,j .

k=1

b. If the adjacency matrix is upper triangular, then you can only go from a lower number vertex to a higher number vertex; if it is lower triangular, you can only go from a higher number vertex to a lower number vertex. If it is diagonal, you can never go from any vertex to any other.

Solutions for Chapter 1

13



a 1 0 1.2.23 a. Let A = b 0 1

2

3 0 0 and let B = 4 1 0 5 . Then AB = I. 0 1

b. Whatever matrix one multiplies B by on the right, the top left corner of the resultant matrix will always be 0 when we need it to be 1. So the matrix B has no right inverse. Solution 1.3.1: The first question on one exam at Cornell University was “What is a linear transformation T : Rn ! Rm ?” Several students gave answers like “A function taking some vector ~ v 2 Rn and producing some vector ~ 2 Rm .” w This is wrong! The student who gave this answer may have been thinking of matrix multiplication. But a mapping from Rn to Rm need not be given by matrix multiplication: the map  consider x x2 ping 7! . In any case, y y2 defining a linear transformation by saying that it is given by matrix multiplication begs the question, because it does not explain why matrix multiplication is linear. The correct answer was given by another student in the class: “A linear transformation T : Rn ! Rm

is a mapping Rn ! Rm such that for all ~a, ~ b 2 Rn , T (~a + ~ b) = T (~a) + T (~ b),

and for all ~a 2 Rn and all scalar r, T (r~a) = rT (~a).” Linearity and the approximation of nonlinear mappings by linear mappings are key motifs of this book. You must know the definition, which gives you a foolproof way to check whether or not a given mapping is linear.

c. With A and B as in part a, write I > = I = AB = (AB)> = B > A> . So A> is a right inverse for B > , so B > has infinitely many right inverses. 4 2 1.3.1 a. Every linear  transformation T : R ! R is given by a 2⇥4 matrix. 1 0 1 2 For example, A = is a linear transformation T : R4 ! R2 . 3 2 1 7

b. Any row matrix 3 wide will do, for example, [1,

1, 2]; such a matrix

takes a vector in R and gives a number. 3

1.3.3 A. R4 ! R3

B. R2 ! R5

C. R4 ! R2

D. R4 ! R. 2 3 .25 6 .025 7 6 7 6 .025 7 6 7 6 .025 7 6 7 6 .025 7 6 7 6 .025 7 1.3.5 Multiply the original matrix by the vector ~v = 6 7, putting ~v 6 .025 7 6 7 6 .025 7 6 7 6 .025 7 6 7 6 .025 7 4 5 .025 .5 on the right. 1.3.7 It is enough to know what T gives when evaluated on the 2 2 3 2 3 2 3 3 1 1 0 0 1 1 6 standard basis vectors 4 0 5, 4 1 5, 4 0 5. The matrix of T is 4 2 3 0 0 1 1 0

three 3 0 27 5. 1 1

1.3.9 2No, T is3 not linear. were 2 If 3 it 2 3 linear, the matrix would be 2 1 1 2 7 [T ] = 4 1 2 0 5, but [T ] 4 1 5 = 4 0 5, which contradicts the defini1 1 1 4 5 tion of the transformation.

14

Solutions for Chapter 1

 cos ✓ sin ✓ 1.3.11 The rotation matrix is . This transformation takes sin ✓ cos ✓  cos ✓ ~e1 to , which is thus the first column of the matrix, by Theorem sin ✓  cos(90 ✓) = sin ✓ 1.3.4; it takes ~e2 to , which is the second column. sin(90 ✓) = cos ✓ One could also write this mapping as the rotation matrix of Example 1.3.9, applied to ✓:  cos( ✓) sin( ✓) . sin( ✓) cos( ✓) 1.3.13 The expressions a, e, f, and j are not well-defined compositions. For the others: b. C B : Rm ! Rn (domain Rm , codomain Rn ) d. B A C : Rk ! Rk h. A C

B : Rm ! Rm

g. B A : Rn ! Rk i. C

c. A C : Rk ! Rm

B A : Rn ! Rn

~ ) = A~v + A~ 1.3.15 We need to show that A(~v + w w and A(c~v) = cA~v. By Definition 1.2.4, Solution 1.3.15: With the dot product, introduced in the next section, the solution is simpler: Every entry in the vector A~ v is the dot product of ~ v with one of the rows of A. The dot product is linear with respect to both of its arguments, so the mapping A~ v is also linear with respect to ~ v.

(A~v)i = ~) A(~v + w

i

= =

n X

k=1 n X k=1 n X

i

=

n X

cos(2✓) sin(2✓)

sin(2✓) cos(2✓)

ai,k (v + w)k =

2

ai,k wk , and

n X

ai,k (vk + wk )

k=1

ai,k vk +

n X

ai,k wk = (A~v)i + (A~ w)i .

k=1

ai,k (cv)k =

k=1

(A~ w)i =

k=1

k=1

Similarly, A(c~v)



ai,k vk ,

n X

n X

ai,k cvk = c

k=1

n X

ai,k vk = c(A~v)i .

k=1

1.3.17  cos(2✓)2 + sin(2✓)2 cos(2✓) sin(2✓) sin(2✓) cos(2✓) = sin(2✓) cos(2✓) cos(2✓) sin(2✓) sin(2✓)2 + cos(2✓)2 

cos(2✓) sin(2✓)

sin(2✓) cos(2✓)

2

=



1 0 =I 0 1

1.3.19 By commutativity of matrix addition, AB+BA = BA+AB , so the 2 2 Jordan product is commutative. By non-commutativity of matrix multiplication: AB+BA C 2

+ C AB+BA ABC + BAC + CAB + CBA 2 = 2 4 A BC+CB + ABC + ACB + BCA + CBA 2 6= = 4 2

BC+CB A 2

Solutions for Chapter 1

15

so the Jordan product is not associative.

So far we have defined only determinants of 2 ⇥ 2 and 3 ⇥ 3 matrices; in Section 4.8 we will define the determinant in general.

~ , |~v|, |A|, and det A. (If A consists of a single 1.4.1 a. Numbers: ~v · w row, then A~v is also a number.) ~ and A~v (unless A consists of a single row). Vectors: ~v ⇥ w ~ , the vectors must each have three entries. In b. In the expression ~v ⇥ w the expression det A, the matrix A must be square. 1.4.3 To normalize a vector, divide it by its length. This gives: 2 3 2p 3  0 2 1 1 1 3 a. p 4 1 5 b. p c. p 4 2 5 7 17 4 58 31 5 02 3 2 31 1 1 1.4.5 a. cos(✓) = 1⇥1p3 @4 0 5 · 4 1 5A = p13 , so 0 1 ✓ ◆ 1 ✓ = arccos p ⇡ .95532. 3 b. cos(✓) = 0, so ✓ = ⇡/2 . 1.4.7 a. det = 1; the inverse is b. det = 0; no inverse



0 1 1 2

c. det = ad; if a, d 6= 0, the inverse is d. det = 0; no inverse 2 3 2 3 6yz 6 1.4.9 a. 4 3xz 5 b. 4 7 5 5xy 4

 1 d ad 0

b . a 2

3 2 c. 4 22 5 3

~ = 1.4.11 a. True by Theorem 1.4.5, because w Solution 1.4.11, part c: Our answer depended on the vectors chosen. In general, det[~a, ~ b,~c] =

det[~a,~c, ~ b];

the result here is true only because ~ are both sides are 0, since ~ v, w linearly dependent.

Solutions 1.4.13 and 1.4.15: Of course, the vectors must be in R3 for the cross product to be defined.

2~v.

~ ) is a number; |~u|(~v ⇥ w ~ ) is a number times a vector, b. False; ~u · (~v ⇥ w i.e., a vector. ~ = c. True: since w

~ ] = 0 and det[~u, w ~ , ~v] = 0. 2~v, we have det[~u, ~v, w

d. False, since ~u is not necessarily (in fact almost surely not) a multiple ~ ; the correct statement is |~u · w ~ |  |~u||~ of w w|. e. True. f. True. 2 3 2 3 2 3 2 3 xa a xbc xbc 0 1.4.13 4 xb 5 ⇥ 4 b 5 = 4 (xac xac) 5 = 4 0 5 . xc c xab xab 0 2 3 2 3 2 3 2 3 2 3 a d bf ce d a 1.4.15 4 b 5 ⇥ 4 e 5 = 4 cd af 5 = 4 e 5 ⇥ 4 b 5 = c f ae bd f c

2

ec 4 af bd

3 bf cd 5 ae

16

Solution 1.4.19, part b: We find it surprising that the diagonal vector ~ vn is almost orthogonal to all the standard basis vectors when n is large.

Solution 1.4.23, part a: This uses the fact that 12 + 22 + · · · + n2 =

n3 n2 n + + . 3 2 6

Solutions for Chapter 1

  x 2 1.4.17 a. It is the line of equation · = 2x y = 0. y 1   x 2 2 b. It is the line of equation · = 2x 4 4y + 12 = 0, y 3 4 which you can rewrite as 2x 4y + 8 = 0. p p 1.4.19 a. The length of ~vn is |~vn | = 1 + · · · + 1 = n. 1 b. The angle is arccos p , which tends to ⇡/2 as n ! 1. n p p p p 1.4.21 a. |A| = 1 + 1 + 4 = 6; |B| = 5; |~c| = 10  p p 2 0 b. |AB| = = 12  30 = |A||B| 2 2 p p p p |A~c| = 50  |A||~c| = 60; |B~c| = 13  |B||~c| = 50 r p n3 n2 n 1.4.23 a. The length is |~ wn | = 1 + 4 + · · · + n2 = + + . 3 2 6 k b. The angle ↵n,k is arccos q . n3 n2 n + + 3 2 6

c. In all three cases, the limit is ⇡/2. Clearly limn!1 ↵n,k = ⇡/2, since the cosine tends to 0. The limit of ↵n,n is limn!1 arccos q n3 3

n 2

+ n2 + n 6

= arccos 0 = ⇡/2.

The limit of ↵n,[n/2] is also ⇡/2, since it is the arccos of q

[n/2] n3 3

+

1.4.25 q ⇣q ⌘2 x21 + x22 y12 + y22

n2 2

+

n 6

,

which tends to 0 as n ! 1.

⇣ ⌘2 x1 y1 + x2 y2 = (x2 y1 )2 + (x1 y2 )2

= (x1 y2 x2 y1 )2 ✓q ◆2 q 2 2 2 2 2 so (x1 y1 + x2 y2 )  x1 + x2 y1 + y2 , so |x1 y1 + x2 y2 | 

2x1 x2 y1 y2 0,

q q x21 + x22 y12 + y22 .

1.4.27 a. To show that the transformation is linear, we need to show that ~ ) = T~a (~v) + T~a (~ T~a (~v + w w) and ↵T~a (~v) = T~a (↵~v). For the first, ~ ) = ~v + w ~ 2 ~a ·(~v + w ~ ) ~a = ~v + w ~ 2(~a ·~v +~a · w ~ )~a = T~a (~v)+T~a (~ T~a (~v + w w). For the second, ↵T~a (~v) = ↵~v

2↵(~a · ~v)~a = ↵~v

2(~a · ↵~v)~a = T~a (↵~v).

Solutions for Chapter 1

b. We have T~a (~a) =

17

~a, since ~a · ~a = a2 + b2 + c2 = 1:

T~a (~a) = ~a

The transformation T~a is the 3dimensional version of the transformation shown in Figure 1.3.4.

2(~a · ~a)~a = ~a

2~a =

~a.

If ~v is orthogonal to ~a, then T~a (~v) = ~v, since in that case ~a ·~v = 0. Thus T~a is reflection in the plane that goes through the origin and is perpendicular to ~a. c. The matrix of T~a is

2

M = [T~a (~e1 ) , T~a (~e2 ) , T~a (~e3 )] = 4

1

2a2 2ab 2ac

2ab 1 2b2 2bc

3 2ac 2bc 5 1 2c2

Squaring the matrix gives the 3 ⇥ 3 identity matrix: if you reflect a vector, then reflect it again, you are back to where you started. Solution 1.5.1: Note that part c is the same as part a.

1.5.1 a. The set { x 2 R | 0 < x  1 } is neither open nor closed: the point 1 is in the set, but 1 + ✏ is not for every ✏, showing it isn’t open, and 0 is not but 0 + ✏ is for every ✏ > 0, showing that the complement is also not open, so the set is not closed. b. open

c. neither

d. closed

e. closed

f. neither

g. Both. That the empty set / is closed is obvious. Showing that it is open is an “eleven-legged alligator” argument (see Section 0.2). Is it true that for every point of the empty set, there exists ✏ > 0 such that . . . ? Yes, because there are no points in the empty set.

Solution 1.5.3, part b: There is a smallest ✏i , because there are finitely many of them, and it is positive. If there were infinitely many, then there would be a greatest lower bound, but it could be 0. Part c: In fact, every closed set is a countable intersection of open sets.

Equation 1 uses the familiar form of the triangle inequality: if a = b + c, then |a|  |b| + |c|. Equation 2 uses the variant |a|

|b|

|c| .

1.5.3 a. Suppose Ai , i 2 I is some collection (probably infinite) of open S sets. If x 2 i2I Ai , then x 2 Aj for some j, and since Aj is open, there S exists ✏ > 0 such that B✏ (x) ⇢ Aj . But then B✏ (x) ⇢ i2I Ai . b. If A1 , . . . , Aj are open and x 2 \ki=1 Ai , then there exist ✏1 , . . . , ✏k > 0 such that B✏i (x) ⇢ Ai , for i = 1, . . . , k. Set ✏ to be the smallest of ✏1 , . . . , ✏k . Then B✏ (x) ⇢ B✏i (x) ⇢ Ai .

c. The infinite intersection of open sets ( 1/n, 1/n), for n = 1, 2, . . . , is not open; as n ! 1, 1/n ! 0 and 1/n ! 0; the set {0} is not open. ⇣ ⌘ 1.5.5 a. This set is open. Indeed, if you choose x y in your set, then p p 1 < x2 + y 2 < 2. Set np o p p r = min x2 + y 2 1, 2 x2 + y 2 > 0. ⇣ ⌘ ⇣ ⌘ u Then the ball of radius r around x is contained in the set, since if y v is in that ball, then, by the triangle inequality,     p u u x x x  +

r 1. (2) v y v y y

18

Solutions for Chapter 1

b. The ⇣ locus ⌘ xy 6= 0 is also open. It is the complement of the two axes, x so that if y is in the set, then r = min{|x|, |y|} > 0, and the ball B of ⇣ ⌘ ⇣ ⌘ u is B, then radius r around x is contained in the set. Indeed, if y v |u| = |x + u x| > |x| |u v, by the same argument.

Many students have found Exercise 1.5.5 difficult, even though they also thought it was obvious, but didn’t know how to say it. If this applies to you, you should check carefully where we used the triangle inequality, and how. Almost everything concerning inequalities requires the triangle inequality.

x| > |x|

r

0, so u is not 0, and neither is

c. This time our set is the x-axis, and it is closed. We will use the criterion that a set is closed if the limit of a convergent of elements ⇣ sequence ⌘ x n of the set is in the set (Proposition 1.5.17). If n 7! y is a sequence in n ⇣ ⌘ 0 the set, and converges to x y0 , then all yn = 0, so y0 = limn!1 yn = 0, and the limit is also in the set. d. The rational numbers are neitherpopen nor closed. Any rational number x is the limit of the numbers x + 2/n, which are all irrational, so the irrationals aren’t closed. Any irrational number is the limit of the finite decimals used to write it, which are all rational, so the rationals aren’t closed either. 1.5.7 a. The natural domain is R2 minus the union of the two axes; it is open. b. The natural domain is that part of R2 where x2 > y (i.e., the area “outside” the parabola of equation y = x2 ). It is open since its “fence” x2 belongs to its neighbor. c. The natural domain of ln ln x is {x|x > 1}, since we must have ln x > 0. This domain is open. d. The natural domain of arcsin is [ 1, 1]. Thus the natural domain of 3 2 arcsin x2 +y minus the open disc x2 + y 2 < 3. Since this domain is 2 is R the complement of an open disc it is closed (and not open, since it isn’t R2 or the empty set). e. The natural domain is all of R2 , which is open. f. The natural domain is R3 minus the union of the three coordinate planes of equation x = 0, y = 0, z = 0; it is open. 1.5.9 For any n > 0 we have ity (Theorem 1.4.9). Because n ! 1. So :

xi 

n X

1 X

|xi |.

i=1 P 1 i=1

1 X i=1

n X

xi 

1.5.11 a. First let us see that ⇣ (8✏ > 0)(9N )(n > N ) =) |an

|xi | by the triangle inequalPn xi converges, i=1 xi converges as i=1

i=1

⌘ a| < '(✏) =)

an converges to a .

Solutions for Chapter 1

19

Choose ⌘ > 0. Since limt!0 '(t) = 0, there exists > 0 such that when 0 < t  we have '(t) < ⌘. Our hypothesis guarantees that there exists N such that when n > N , then |an a|  '( ) = ⌘. Now for the converse: ⇣ ⌘ an converges to a =) (8✏ > 0)(9N )(n > N ) =) |an a| < '(✏) . For any ✏ > 0, we also have '(✏) > 0, so there exists N such that n > N =) |an

a| < '(✏).

b. The analogous statement for limits of functions is: Let ' : [0, 1) ! [0, 1) be a function such that limt!0 '(t) = 0. Let U ⇢ Rn , f : U ! Rm , and x0 2 U . Then limx!x0 f (x) = a if and only if for every ✏ > 0 there exists > 0 such that when x 2 U and |x x0 | < , we have |f (x) a| < '(✏). 1.5.13 Choose a point a 2 Rn C. Suppose that the ball B1/n (a) of radius 1/n around a satisfies B1/n (a) \ C 6= / for every n. Choose an in B1/n (a) \ C. Then the sequence n 7! an converges to a, since for any ✏ > 0 we can find N such that for n > N we have 1/n < ✏, so for n > N we have |a an | < 1/n < ✏. Then our hypothesis implies that a 2 C, a contradiction. Thus there exists N such that B1/N (a) \ C = / . This shows that Rn C is open, so C is closed. Solution 1.5.15: Here, x plays the role of the alligators in Section 0.2, and x 0 satisfying | 2 x| < plays the role of eleven-legged alligators; the conp clusion | x 5| < ✏ (i.e., that 5 is the limit) is the conclusion “are orange withpblue spots” and the conclusion | x 3| < ✏ (3 is the limit) is the conclusion “are black with white stripes”.

1.5.15 Both statements are true. To show that the first is true, we say: for every ✏ > 0, there exists > 0 such that for all x satisfying x 0 and p | 2 x| < , then | x 5| < ✏. For any ✏ > 0, choose = 1. Then there is no x 0 satisfying | 2 x| < . So for those nonexistent x satisfying p | 2 x| < 1, it is true that | x 5| < ✏. By the same argument the second statement is true. 1.5.17 Set a = limm!1 am , b = limm!1 bm

and c = limm!1 cm .

1. Choose ✏ > 0 and find M1 and M2 such that if m M1 then we have |am a|  ✏/2, and if m M2 then |bm b|  ✏/2. Set M = max(M1 , M2 ). If m > M , we have |am + bm

a

b|  |am

a| + |bm

b| 

✏ ✏ + = ✏. 2 2

So the sequence m 7! (am + bm ) converges to a + b. 2. Choose ✏ > 0. Find M1 such that if m

M1 ,

then |am

1 a|  inf 2

✓

◆ ✏ ,✏ . |c|

The inf is there to guard against the possibility that |c| = 0. In particular, if m M1 , then |am |  |a| + ✏. Next find M2 such that if ✏ m M2 , then |cm c|  . 2(|a| + ✏)

20

Solutions for Chapter 1

If M = max(M1 , M2 ) and m |cm am

M , then

ca| = |c(am

a) + (cm

c)am |

✏ ✏ + = ✏, 2 2 converges and the limit is ca.

 |c(am so the sequence m 7! cm am

a)| + |(cm

c)am | 

3. We can either repeat the argument above, or use parts 1 and 2 as follows: n n X X lim am · bm = lim am,i bm,i = lim (am,i bm,i ) m!1

m!1

=

i=1

n ⇣ X i=1

i=1

lim am,i

m!1

⌘⇣

m!1

n ⌘ X lim bm,i = ai bi = a · b.

m!1

i=1

4. Find C such that |am |  C for all m; saying that am is bounded means exactly that such a C exists. Choose ✏ > 0, and find M such that when m > M , then |cm | < ✏/C (this is possible since the cm converge to 0). Then when m > M we have ✏ |cm am | = |cm ||am |  C = ✏. C 1.5.19 a. Suppose I I

A + C = (I

A is invertible, and write

A) + C(I

A)

1

(I

A) = I + C(I

A)

1

A)

1 3

(I

A),

so (I

A + C)

1

= (I = (I

A)

1

A)

1

⇣ I + C(I I |

1

A)

(C(I

A)

⌘ 1

1

) + (C(I

A) {z

1 2

)

(C(I

geometric series if |C(I A)

1

| (x2 + y 2 ) ln(x2 + 2y 2 )

(x2 + y 2 ) ln(x2 + y 2 ),

which ⇣ ⌘ tends to 0 using the equation in the margin. f 00 = 0, then f is continuous.

So if we choose

d. The function is not continuous near the origin. Since ln 0 is undefined, the diagonal x + y = 0 is not part of the function’s domain of definition. However, the ⇣ function is ⌘defined at points arbitrarily close to that line, e.g., x 3 the point x + e 1/x . At this point we have ✓ ⇣ ⌘ ◆ 3 2 3 1 1 x2 + x + e 1/x ln x x + e 1/x x2 3 = , x |x| which tends to infinity as x tends to 0. But if we approach the origin along the x-axis (for instance), the function is x2 ln |x|, which tends to 0 as x tends to 0. 1.5.23 a. To say that limB!A (A B) 1 (A2 B 2 ) exists means that there is a matrix C such that for all ✏ > 0, there exists > 0, such that when |B A| < and B A is invertible, then (B

1

A)

(B 2

A2 )

C < ✏. 

2 0 = 2I. Write 0 2 B = I + H, with H invertible, and choose ✏ > 0. We need to show that there exists > 0 such that if |H| < , then b. We will show that the limit exists, and is

(I + H

I)

1

⇣ (I + H)2

I2

⌘

2I < ✏.

(1)

Indeed, (I + H

I)

1

⇣ (I + H)2

So if you set

I2

⌘

2I = H

1

= H

1

⇣ I 2 + IH + HI + H 2 ⇣ ⌘ 2H + H 2

I2

⌘

2I

2I = |H|.

= ✏, and |H|  , then equation 1 is satisfied.

c. We will show that the limit does not exist. In this case, we find A2 = H

1

(I 2 + AH + HA + H 2

=H

1

(AH + HA + H 2 ) = A + H

1

AH + H 2 .

If the limit exists, it must be 2A: choose H = ✏I so that H

1

=✏

(A + H

A)

1

(A + H)2

A+H is close to 2A.

1

2

AH + H = 2A + ✏I

I 2) 1

I; then

Solutions for Chapter 1

23

H

1

AH =



1/✏ 0



0 , you will find that 1    0 0 1 ✏ 0 0 1 = = 1/✏ 1 0 0 ✏ 1 0

But if you choose H = ✏

1 0

A.

So with this H we have A+H

1

AH + H 2 = A

A + ✏H,

which is close to the zero matrix. 1.6.1 Let B be a set contained in a ball of radius R centered at a point a. Then it is also contained in a ball of radius R + |a| centered at the origin; thus it is bounded. 1.6.3 The polynomial p(z) = 1 + x2 y 2 has no roots because 1 plus something positive cannot be 0. This does not contradict the fundamental theorem of algebra because although p is a polynomial in the real variables x and y, it is not a polynomial in the complex variable z: it is a polynomial in z and z¯. It is possible to write p(z) = 1 + x2 y 2 in terms of z and z¯. You can use z+z z z x= and y = , 2 2i and find p(z) = 1 +

z4

2|z|4 + z 4 16

(1)

but you simply cannot get rid of the z.

Solution 1.6.5: How did we land on the number 3? We started the computation, until we got to the expression |z|2 7, which we needed to be positive. The number 3 works, and 2 does not; 2.7 works too.

1.6.5 a. Suppose |z| > 3. Then (by the triangle inequality) |z|6

|q(z)|

|z|6

(4|z|4 + |z| + 2)

= |z|4 (|z|2

7)

(9

|z|6

(4|z|4 + |z|4 + 2|z|4 )

7) · 34 = 162.

b. Since p(0) = 2, but when |z| > 3 we have |p(z)| |z|6 |q(z)| 162, the minimum value of |p| on the disc of radius R1 = 3 around the origin must be the absolute minimum value of |p|. Notice there must be a point at which this minimum value is achieved, since |p| is a continuous function on the closed and bounded set |z|  3 of C. 1.6.7 Consider the function g(x) = f (x) mx. This is a continuous function on the closed and bounded set [a, b], so it has a minimum at some point c 2 [a, b]. Let us see that c 6= a and c 6= b. Since g 0 (a) = f 0 (a) m < 0, we have g(a + h) h!0 h lim

Thus for every ✏ > 0, there exists

g(a)

< 0.

> 0 such that 0 < |h|
0. h!0 h Express this again in terms of ✏’s and ’s, choose ✏ = g 0 (b)/2, and choose h satisfying < h < 0. As above, we have lim

g(b + h) g(b) g 0 (b) > > 0, h 2 and since h < 0, this implies g(b + h) < g(b). So c 2 (a, b), and in particular c is a minimum on (a, b), so g 0 (c) = f 0 (c) m = 0 by Proposition 1.6.12. 1.6.9 “Between 0 and a” means that if you plot a as a point in R2 in the usual way (real part of a on the x-axis, imaginary part on the y-axis), then a + buj lies on the line segment connecting the origin and the point a; see the figure in the margin. For this to happen, buj must be a negative real multiple of a, and we must have |buj | < |a|. Write a = r1 (cos !1 + i sin !1 ) b = r2 (cos !2 + i sin !2 ) u = p(cos ✓ + i sin ✓). Then a + buj = r1 (cos !1 + i sin !1 ) + r2 pj cos(!2 + j✓) + i sin(!2 + j✓) . Then buj will point in the opposite direction from a if 1 !2 + j✓ = !1 + ⇡ + 2k⇡ for some k, i.e., ✓ = (!1 j

!2 + ⇡ + 2k⇡),

and we find j distinct such angles by taking k = 0, 1, . . . , j 1. The condition |buj | < |a| becomes r2 pj < r1 , so we can take def

0 < p < (r1 /r2 )1/j = p0 . 1.6.11 Set p(x) = xk + ak

k 1 1x

+ · · · + a1 x + a0 with k odd. Choose

C = sup{1, |ak

1 |, . . . , |a0 |}

Solutions for Chapter 1

25

and set A = kC + 1. Then if x  p(x) = xk + ak

k 1 1x

k

Similarly, if x

1

+ · · · + a1 x + a0

k 1

 ( A) + CA = Ak

A we have

(kC

Ak + kCAk

+ ··· + C  Ak

A) =

1

1

 0.

A we have

p(x) = xk + ak k

k 1

(A) = Ak

1

k 1 1x

CA

+ · · · + a1 x + a0 ···

kC) = Ak

(A

Ak

C

1

kCAk

1

0.

Since p : [ A, A] ! R is a continuous function (Corollary 1.5.31) and we have p( A)  0 and p(A) 0, then by the intermediate value theorem there exists x0 2 [ A, A] such that p(x0 ) = 0. 1.7.1 a. f (a) = 0, f 0 (a) = cos(a) = 1, so the tangent is g(x) = x. b. f (a) = 12 , f 0 (a) =

c. f (a) = 1, f 0 (a) = 0

d. f (a) = 2, f (a) =

p

sin(a) = 23 , so the p 3 ⇡ g(x) = (x )+ 2 3

tangent is 1 . 2

sin(a) = 0, so the tangent is g(x) = 1. 1 a2

g(x) =

= 4(x

4, so the tangent is 1/2) + 2 =

4x + 4.

⇣ ⌘⇣ ⌘⇣ ⌘ 1.7.3 a. f 0 (x) = 3 sin2 (x2 + cos x) cos(x2 + cos x) 2x sin x ⇣ ⌘⇣ ⌘⇣ ⇣ ⌘ b. f 0 (x) = 2 cos((x + sin x)2 ) sin((x + sin x)2 ) 2(x + sin x) 1 + cos x ⇣ ⌘⇣ ⌘ c. f 0 (x) = (cos x)5 + sin x 4(cos x)3 ( sin(x)) = (cos x)5 4(sin x)2 (cos x)3 d. f 0 (x) = 3(x + sin4 x)2 (1 + 4 sin3 x cos x)

sin3 x(cos x2 ⇤ 2x) sin x2 (3 sin2 x cos x) (sin x2 sin3 x)(cos x) + 2 2 + sin(x) 2 + sin(x) 2 + sin(x) ✓ 3 ◆✓ ◆ x 3x2 (x3 )(cos x2 ⇤ 2x) f. f 0 (x) = cos sin x2 sin x2 (sin x2 )2

e. f 0 (x) =

1.7.5 a. Compute the partial derivatives: ⇣ ⌘ ⇣ ⌘ x x = p1 p D1 f x = and D f . 2 y y 2 x +y 2 x2 + y This gives ⇣ ⌘ 2 2 D1 f 21 = p =p 5 22 + 1

and D2 f

⇣ ⌘ 1 2 = p 1 = p . 1 2 5 2 22 + 1

26

Solutions for Chapter 1

⇣ ⌘ 1 , we have x2 + y < 0, so the function is not defined At the point 2 there, and neither are the partial derivatives. ⇣ ⌘ ⇣ ⌘ x = x2 + 4y 3 . This gives b. Similarly, D1 f x = 2xy and D f 2 y y ⇣ ⌘ ⇣ ⌘ D1 f 21 = 4 and D2 f 21 = 4 + 4 = 8; D1 f

⇣

1 2

c. Compute

⌘

=

4 and D2 f

⇣ ⌘ x = y ⇣ ⌘ D2 f x y = D1 f

⇣

1 2

⌘

= 1 + 4 · ( 8) =

31.

y sin xy x sin xy + cos y

y sin y.

This gives ⇣ ⌘ ⇣ ⌘ D1 f 21 = sin 2 and D2 f 21 = 2 sin 2 + cos 1 sin 1 ⇣ ⌘ ⇣ ⌘ 1 = 2 sin 2 and D f 1 = sin 2 + cos 2 2 sin 2 = cos 2 D1 f 2 2 2 d. Since

D1 f we have

⇣ ⌘ 2 4 x = xy + 2y 2 3/2 y 2(x + y ) ⇣ ⌘ 2 = p4 1 2 27 ⇣ ⌘ 1 = 36 p D1 f 2 10 5 D1 f

and D2 f

⇣ ⌘ 2x2 y + xy 3 x = , y (x + y 2 )3/2

⇣ ⌘ 2 = p10 ; 1 27 ⇣ ⌘ 12 1 = p . and D2 f 2 5 5 and D2 f

1.7.7 Just pile up the partial derivative vectors side by side: 2 3 sin x 0 h ⇣ ⌘i a. D~f x =4 2xy x2 + 2y 5 y 2 2x cos(x y) cos(x2 y) 2 3 p 2x 2 p 2y 2 h ⇣ ⌘i x +y x +y 5. b. D~f x =4 y x y 2y sin xy cos xy 2x sin xy cos xy 1.7.9 a. The derivative is an m ⇥ n matrix b. a 1 ⇥ 3 matrix (line matrix) c. a 4 ⇥ 1 matrix (vector 4 high) ⇥ ⇤ ⇥ 2 3 2 3⇤ 1.7.11 a. y cos(xy), x cos(xy) b. 2xex +y , 3y 2 ex +y   y x cos ✓ r sin ✓ c. d. 1 1 sin ✓ r cos ✓

sin 2

Solutions for Chapter 1

27

1.7.13 For the first part, |x| and mx are continuous functions, hence so is f (0 + h) f (0) mh = |h| mh. For the second, we have |h|

mh

mh

h |h| mh h mh = h h Solution 1.7.15, part a: Writing the numerator as a length is optional, but the length in the denominator is not optional: you cannot divide by a matrix. Since H is an n⇥m matrix, the [0] in limH![0] is the n ⇥ m matrix with all entries 0.

h

h

=

=

1

m when h < 0

=

1

m when h > 0.

The di↵erence between these values is always 2, and cannot be made small by taking h small. 1.7.15 a. There exists a linear transformation [DF (A)] such that F (A + H)

lim

H![0]

F (A) |H|

[DF (A)]H

= 0.

b. The derivative is [DF (A)]H = AH > +HA> . We found this by looking for linear terms in H of the di↵erence F (A + H)

F (A) = (A + H)(A + H)> = (A + H)(A> + H > )

AA> AA>

= AH > + HA> + HH > ; see Remark 1.7.6. The linear terms AH > +HA> are the derivative. Indeed, lim

H![0]

|(A + H)(A + H)>

AA> |H|

AH >

HA> |

|HH > | |H| |H > |  lim = lim |H| = 0. |H| |H| H![0] H![0] H![0]

= lim

1.7.17 The derivative of the squaring function is given by

Solution 1.7.17: This may look like a miracle: the expressions should not be equal, they should di↵er by terms in ✏2 . The reason why they are exactly equal here is that   2 0 0 0 0 = . ✏ 0 0 0

[DS(A)]H = AH + HA;  1 1 0 0 substituting A = and H = gives 0 1 ✏ 0        1 1 0 0 0 0 1 1 ✏ 0 0 0 ✏ 0 + = + = . 0 1 ✏ 0 ✏ 0 0 1 ✏ 0 ✏ ✏ 2✏ ✏ 

Computing (A + H)2 (A + H)2

A2 gives the same result;    1+✏ 2 1 2 ✏ 0 A2 = = . 2✏ 1+✏ 0 1 2✏ ✏ ~ ~

h 1.7.19 Since lim~h!0 |h| = [0], the derivative exists at the origin and is |~ h| the 0 linear transformation, represented by the n ⇥n matrix with all entries 0.

28

Solutions for Chapter 1

1.7.21 We will work directly from the definition of the derivative: det(I + H)

det(I) (h1,1 + h2,2 )

= (1 + h1,1 )(1 + h2,2 ) = h1,1 h2,2

h1,2 h2,1

1

(h1,1 + h2,2 )

h1,2 h2,1 .

Each hi,j satisfies |hi,j |  |H|, so we have | det(I + H) Thus

det(I) |H| lim

H!0

(h1,1 + h2,2 )|

| det(I + H)

=

det(I) |H|

|h1,1 h2,2 h1,2 h2,1 | 2|H|2  = 2|H|. |H| |H| (h1,1 + h2,2 )|

 lim 2|H| = 0. H!0

1.8.1 Three make sense: c. g f : R2 ! R2 ; the derivative is a 2 ⇥ 2 matrix d. f

e. f

g : R3 ! R3 ; the derivative is a 3 ⇥ 3 matrix f : R2 ! R;

the derivative is a 1 ⇥ 2 matrix

1.8.3 Yes: We have a composition of sine, the exponential function, and ⇣ ⌘ x the function y 7! xy, all of which are di↵erentiable everywhere.

1.8.5 One must also show that f g is di↵erentiable, working from the definition of the derivative. 1.8.7 Since [Df (x)] = [ x2 x1 + x3 x2 + x4 · · · xn 2 + xn xn 1 ], we have [D f ( (t) ] = [ t2 t + t3 t2 + t4 · · · tn 2 + tn tn 1 ]. In addition, 2 3 1 6 2t 7 7 [D (t)] = 6 4 ... 5. So the derivative of the function t ! f ( (t)) is ntn

[D(f )(t)] | {z }

deriv. of comp. at t

Solution 1.8.9: It is easiest to solve this problem without using the chain rule in several variables.

1

2

= [Df ( (t))] [D (t)] = t + | {z } | {z } deriv. of f at (t)

deriv. of at t

1.8.9 Clearly ⇣ ⌘ 0 2 D1 f x y = 2xy' (x

y 2 );

D2 f

n X1

i 1

it

i 1

(t

i+1

+t

i=2

⇣ ⌘ x = y

Thus ⇣ ⌘ 1 ⇣ ⌘ 1 x = 2y'0 (x2 y 2 ) D1 f x + D f 2 y y x y 1 ⇣ ⌘ = 2f x y . y

2y 2 '0 (x2

2y'0 (x2

!

)

+ nt2(n

y 2 ) + '(x2

1 y 2 ) + '(x2 y

1)

y 2 ).

y2 )

Solutions for Chapter 1

29

To use the chain rule, write f = k h g, where ⇣ ⌘ ⇣ 2 ⌘ ⇣ ⌘ ⇣ ⌘ x y2 , '(u) , g x h u y = y v = v

This leads to  0  h ⇣ ⌘i ' (u) 0 2x x Df y = [t, s] 0 1 0

2y 1

⇣ ⌘ k st = st.

= [2xt'0 (u),

Insert the values of the variables; you find ⇣ ⌘ ⇣ ⌘ x = 0 2 2 D1 f x = 2xy ' (x y ) and D f 2 y y

2y 2 '0 (x2

2yt'0 (u) + s].

y 2 ) + '(x2

y 2 ).

Now continue as above.

1.8.11 Using the chain rule in one variable, ✓ ◆✓ ◆ ✓ ◆✓ ◆ x+y 1(x y) 1(x + y) x+y 2y 0 0 D1 f = ' =' x y (x y)2 x y (x y)2 and

0

D2 f = ' so Solution 1.8.13: Recall (Proposition 1.7.18) that if T (A) = A 1 then T is a di↵erentiable map on the space of invertible matrices, and [DT (A)]H =

A

1

HA

1

2A5 + 4A5 + 2A5 A4 p  8A = 8 x2 + y 2 . 

We can’t write 2xy 4 as 2A5 and compute 2A5 2A5 = 0 because x may be negative.

◆✓

xD1 f + yD2 f = '

1(x

✓

y) ( 1)(x + y) (x y)2

x+y x y

◆✓

2xy (x y)2

◆

◆

0

='

0

+'

1.8.13 Call S(A) = A2 + A and T (A) = A

1

✓

✓

x+y x y

x+y x y

◆✓

◆✓

2x (x y)2

2yx (x y)2

. We have F = T

◆

◆

=0

S, so

[DF (A)]H = [DT (A2 + A)][DS(A)]H = [DT (A2 + A)](AH + HA + H)

Here we replace A by A2 + A and H by AH + HA + H.

2x5 + 4x3 y 2 2xy 4 (x2 + y 2 )2

x+y x y

0

.

Solution 1.9.1: Simplify p notation by setting A = ✓x2◆+ y 2 . Then, for instance, D1 f 0 0 is

✓

=

(A2 + A)

1

(AH + HA + H)(A2 + A)

1

.

It isn’t really possible to simplify this much. ⇣ ⌘ 1.9.1 Except at 00 , the partial derivatives of f are given by D1 f

⇣ ⌘ 2x5 + 4x3 y 2 2xy 4 x = y (x2 + y 2 )2

and D2 f

⇣ ⌘ 4x2 y 3 2x4 y + 2y 5 x = . y (x2 + y 2 )2

At the origin, they are both given by ✓ ◆ ✓ ◆ ⇣ ⌘ ⇣ ⌘ 1 h4 + 04 1 04 + h4 0 0 D1 f 0 = lim = 0; D2 f 0 = lim = 0. h!0 h h!0 h h2 + 02 02 + h2

Thus there are partial derivatives everywhere, and we need to check that they are continuous. The only problem is at the origin. One easy way to show this is to remember that p p |x|  x2 + y 2 and |y|  x2 + y 2 . Then both partial derivatives satisfy ⇣ ⌘ p (x2 + y 2 )5/2 |Di f x |  8 = 8 x2 + y 2 . 2 2 2 y (x + y )

30

Solutions for Chapter 1

Thus the limit of both partials at the origin is 0, so the partials are continuous and f is di↵erentiable everywhere.

Solution 1.9.3: The line matrix [a b] is the derivative; it is the linear transformation L of Proposition and Definition 1.7.9.

1.9.3 a. It means that there is a line matrix [a b] such that ✓ ◆ 1 sin(h21 h22 ) lim ah bh = 0. 1 2 ~ h21 + h22 h!~ 0 |~ h| b. Since f vanishes identically on both axes, both partials exist, and are 0 at the origin. In fact, D1 f vanishes on the x-axis and D2 f vanishes on the y-axis. c. Since D1 f (0) = 0 and D2 f (0) = 0, we see that f is di↵erentiable at the origin if and only if

Inequality 1, first inequality: For any x, we have | sin x|  |x|. The second inequality follows from 0  (x

y)2 = x2

2xy + y 2 ;

lim

2 |~ h|!0 (h1

and this is indeed true, since sin(h21 h22 )  h21 h22 

for any x, y 2 R we have

1 |xy|  (x2 + y 2 ). 2

sin(h21 h22 ) = 0, + h22 )(h21 + h22 )1/2 1 2 h + h22 4 1

2

and lim

|~ h|!0

1 (h21 + h22 )2 1 = lim (h2 + h22 )1/2 = 0. 4 (h21 + h22 )3/2 4 |h|!0 1

(1)

Solutions for review exercises, chapter 1

1.1 a. Not a subspace: ~0 is not on the line. b. Not a subspace: ~0 is not on the line. c. A subspace. V1

V2

1.3 If A and B are upper triangular matrices, then if i > j we know that ai,j = bi,j = 0. Using the definition of matrix multiplication: ci,j =

n X

ai,k bk,j ,

k=1

V4

V3 V3

V1

V2

V4

Labeling of vertices for Solution 1.5, parts b and c.

we see that if i > j, then in the summation either ai,k = 0 or bk,j = 0, Pn so ci,j = k=1 0 = 0. If for all i > j we have ci,j = 0, then C is upper triangular, so if A and B are upper triangular, AB is also upper triangular. 2 3 0 1 0 1.5 a. Labeling the vertices in the direction of the arrows: 4 0 0 1 5 1 0 0 2 3 0 1 1 1 60 0 1 07 b. With the labels shown in the figure: 4 5 0 0 0 0 0 0 1 0 2 3 0 1 0 1 60 0 1 07 c. Again, with the labeling as shown in the figure: 4 5 1 0 0 1 1 1 1 0 1.7 Except when x = 0, y 6= 0, there is no problem with continuity: the formula describes a product of continuous functions, divided by a function that doesn’t vanish. To understand the behavior near x = 0, y 6= 0, we can write |y|e

|y|/x2

x2

=

✓ 2 x 1+

|y| x2

+

|y| ⇣ 1 2

|y| x2

⌘2

+ ···

◆
0, and set 1 + k = ✏4 and x = ✏; the function becomes ✏3 1 (1 + k2 ) > . ✏4 ✏ Thus there are points near the origin where the function is arbitrarily large. But there are also points where the function is arbitrarily close to 0, taking x = y = ✏.

Solutions for review exercises, Chapter 1

Solution 1.21, part c: statement

The

lim u ln |u| = 0,

u!0

can be proved using l’Hˆ opital’s ln |u| rule applied to . We used 1/u this already in Solution 1.5.21.

Solution 1.23: For m = 3, we have   ⇡ .00059 a3 ⇡ e .00028 p ⇡ (.0006)2 + (.0003)2 p = 10 3 .36 + .09 < 10 3 .

35

c. Since limu!0 u ln |u| = 0, the limit is 0.

d. Let us look at the function on the line y = ✏, for some ✏ > 0. The function becomes (x2 + ✏2 )(ln |x| + ln ✏) = x2 ln |x| + ✏2 ln |x| + x2 ln ✏ + ✏2 ln ✏. When |x| is small, the first, third, and fourth terms are small. But the 3 second is not. If x = e 1/✏ , for instance, the second term is ✏2 1 = , 3 ✏ ✏ which will become arbitrarily large as ✏ ! 0. Thus along the curve given 3 by x = e 1/✏ and y = ✏, the function tends to 1 as ✏ ! 0. But along the curve x = y = ✏, the function tends to 0 as ✏ ! 0, so it has no limit. 1.23 Let ⇡n be ⇡ to n places (e.g., ⇡2 = 3.14). Then ⇡ ⇡n < 10 n . We can do the same with e. So:  p p ⇡ an < 2 · 102( n) = ( 2)10 n < 10 n+1 . e  ⇡ So the best we can say in general is that to get |an | < 10 m , e we need n = m + 1. When m = 3, we can get away with n = 3, since (.59 . . . )2 + (.28 . . . )2 < 1. But when m = 4, we really need n = 5, since (.92 . . . )2 + (.81 . . . )2 > 1. 1.25 We know by the fundamental theorem of algebra that p(z) must have at least one root (and we know by Corollary 1.6.15 that, counted with multiplicity, it has exactly ten). We need to find R such that if |z| R, then the leading term z 10 has larger absolute value that the absolute value of the sum 2z 9 + 3z 8 + · · · + 11. Being very sloppy with our inequalities, we see that |2z 9 + 3z 8 + · · · + 11|  10 · 11|z|9 , so if we take any R > 110, so that when |z| R, then |z|10 > 110|z 9 |, a root z0 of the polynomial cannot satisfy |z0 | R. If we are less sloppy with our inequalities, we can observe that the successive terms of 2z 9 + · · · + 11 are decreasing in absolute value as soon as |z| > 2. So |2z 9 + 3z 8 + · · · + 11|  2 · 10|z|9 ,

and we see that |z|10 > |2z 9 + 3z 8 + · · · + 11| if |z| > 20. p 1.27 We have x2 = |x|, so 1 (f (h) h!0 h lim

|h| , h!0 h

f (0)) = lim

which does not exist (since it is ±1 depending on whether h is positive or negative).

36

Solutions for review exercises, Chapter 1

For

p 3 x2 , we have 1 (f (h) h!0 h lim

h2/3 1 = lim 1/3 , h!0 h h!0 h

f (0)) = lim

p x4 = x2 is di↵erentiable. ⇣ ⌘ 1.29 a. Both partials exist at 00 and are 0, but the function is not di↵erentiable. Indeed, if it were, the derivative would necessarily be the Jacobian matrix, i.e., the 0 matrix, and we would have which tends to ±1. Of course,

lim

f (~h)

~ h!0

f (0) |~h|

[0, 0]~h

= 0.

But writing the definition out leads to

h21 h2 p = 0, 2 2 2 2 ~ h!0 (h1 + h2 ) h1 + h2 lim

which isn’t true: for instance, if you set h1 = h2 = t, the expression above becomes t3 p , 2 2 |t|3 Solution 1.29, part b: If f were di↵erentiable, then by the chain rule the composition would be differentiable, and it isn’t. For x small, sin x ⇡ x; see equation 3.4.6 for the Taylor polynomial of sin(x).

which does not become small as t ! 0.

b. This function is not di↵erentiable. If you set g(t) =

⇣ ⌘ t , then t

(f g)(t) = 2|t| is not di↵erentiable at t = 0, but g is di↵erentiable at t = 0, so f is not di↵erentiable at the origin, which is g(0). c. Here are two proofs of part c:

First proof This isn’t even continuous at the origin, although both partials exist there and are 0. But if you set x = t, y = t, then sin(xy) sin t2 1 = ! , x2 + y 2 2t2 2

as t ! 0.

Second proof This function is not di↵erentiable; although both partial derivatives exist at the origin, the function itself is not continuous at the origin. For example, along the diagonal, sin t2 1 = , 2 t!0 2t 2 but the limit along the antidiagonal is 1/2: lim

sin( t2 ) = t!0 2t2 lim

1 . 2

1.31 a. By the chain rule this is   1 1 2t 1 [Df (t)] = , 2 = 2 2 2t t + sin t t + sin(t ) t + sin(t2 )

1 . t + sin t

b. We defined f for t > 1, but we will analyze it for all values of t. The function f is increasing for all t > 0 and decreasing for all t < 0. First let

Solutions for review exercises, Chapter 1

Solution 1.31, part b: The function f tends to 1 as t tends to 0. To see this, note that if s is small, sin s is very close to s (see the margin note for Solution 1.29). So Z t2 Z t2 ds ds ⇡ s + sin s 2s t t ⌘ 1⇣ = ln |t2 | ln |t| 2 1 = (2 ln |t| ln |t| 2 ln |t| = . 2

37

us see that f increases for t > 0, i.e., that the derivative is strictly positive for all t > 0. To see this, put the derivative on a common denominator: [Df (t)] =

2t2 + 2t sin t t2 sin(t2 ) . (t2 + sin(t2 ))(t + sin t)

(1)

For x > 0, we have x > sin x (try graphing the functions x and sin x), so for t > 0, the denominator is strictly positive. We need to show that the numerator is also strictly positive, i.e., t2 + 2t sin t

sin(t2 ) > 0.

(2)

For ⇡ < t  ⇡, this is true: for ⇡ < t  ⇡, we know that t and sin t are both positive or both negative, so 2t sin t > 0, and we have t2 > sin(t2 ) for t 6= 0. (Do not worry about t2 < t for small t; if you set x = t2 , the formula x > sin x still applies for x > 0.) For t > ⇡, equation 2 is also true: in that case, t2 + 2t sin t

sin(t2 )

t(t + 2 sin t)

1

t(⇡

2)

1>⇡

1.

To show that the function is decreasing for t < 0, we must show that the derivative is negative. For t < 0, the numerator is still strictly positive, by the argument above. But the denominator is negative: t2 + sin(t2 ) is positive, but t + sin t is negative. 1.33 Set f (A) = A wish to compute Solution 1.33: Remember (see Example 1.7.17) that we cannot treat equation 1 as matrix multiplication. To treat the derivatives of f and g as matrices, we would have to identify Mat (n, n) 2 with Rn ; the derivatives would be 2 2 n ⇥n matrices. Instead we think of the derivatives as linear transformations.

1

and g(A) = AA> + A> A. Then F = f

[DF (A)]H = [Df

g, and we

g(A)]H = [Df (g(A))][Dg(A)]H

= [Df (AA> + A> A)] [Dg(A)]H . | {z }

(1)

new increment for Df

The linear terms in H of g(A + H)

g(A) = (A + H)(A + H)> + (A + H)> (A + H)

AA>

A> A

are AH > + HA> + A> H + H > A; this is [Dg(A)]H, which is the new increment for Df . We know from Proposition 1.7.18 that [Df (A)]H = A 1 HA 1 , which we will rewrite as [Df (B)]K =

B

1

KB

1

(2)

to avoid confusion. We substitute AH > + HA> + A> H + H > A for the increment K in equation 2 and g(A) = AA> + A> A for B. This gives [DF (A)]H = [Df (AA> + A> A)][Dg(A)]H ⇣ ⌘ 1⇣ ⌘⇣ ⌘ 1 = AA> + A> A AH > + HA> + A> H + H > A AA> + A> A . | {z }| {z }| {z } B

1

K

There is no obvious way to simplify this expression.

B

1

38

Solutions for review exercises, Chapter 1

1.35 a. All partial derivatives exist: away from the origin, they exist by Theorems 1.8.1 and 1.7.10; at the origin, they exist and are 0, since the function vanishes identically on all three coordinate axes. b. By Theorem 1.8.1, the function is di↵erentiable everywhere except at the origin. At the origin, the function is not di↵erentiable. In fact, it isn’t even continuous: the limit 0 1 t h3 1 lim f @ t A = lim 4 = lim does not exist. h!0 h!0 3h h!0 3h t

a c

b

1.37 Compute

2

A =

Figure for Solution 1.37. The outside surface has equation a2 +bc = 1; it is a hyperboloid of one sheet and describes the solutions to part b. The inside surface has equation a2 + bc = 1; it is a hyperboloid of two sheets and describes the solutions to part c.

✓

a b c d

◆2

=



a2 + bc b(a + d) c(a + d) bc + d2

a. To get the diagonal entries of A2 to be 0, we need a2 = d2 = bc. Since a2 = d2 , we have a = ±d; we will examine the cases a = d and a = d separately. If a = d 6= 0, we get a contradiction: to get the o↵-diagonal terms to be 0 we must have either b = 0 or c = 0, but then a2 must be 0, so a and  d must be 0. If a = d = 0, then either b = 0 or c = 0. Indeed, the 0 b 0 0 matrices and are matrices whose square is 0. 0 0 c 0 If a = d 6= 0, then the o↵-diagonal terms of A2 are 0. For the diagonal terms to be 0, we must have bc = a2 , so neither b nor c canbe 0 and we 2 a b 0 0 must have c = ba . Indeed, A = satisfies A2 = . a2 a 0 0 b b. If A2 = I, then

b(a + d) = c(a + d) = 0 Solution 1.37, part b: The locus of equation A2 = I is given by four equations in four unknowns, so you might expect it to be a union of finitely many points. This is not the case; the locus is mainly the surface of equation a2 + bc = 1: take any point of this surface and set d = a to find A such that A2 = I. There are two exceptional solutions not on this surface, namely ±I.

implies that either b = c = 0 or a + d = 0. If b = c = 0, then a2 = d2 = 1, so a = ±1, d = ±1, giving four solutions. If a = d, then b and c must satisfy bc = 1 a2 . c. As in part b, if A2 = I, then either a+d=0, or b = c = 0. The latter leads to a2 = d2 = 1, so there are no such solutions. So for all solutions d = a, a2 + bc = 1. 1.39 a. This is a case where it is much easier to think of first rotating the telescope so that it is in the (x, z)-plane, then changing the elevation, then rotating back. This leads to the following product of matrices: 2 32 32 3 cos ✓0 sin ✓0 0 cos ' 0 sin ' cos ✓0 sin ✓0 0 4 sin ✓0 cos ✓0 0 5 4 0 1 0 5 4 sin ✓0 cos ✓0 0 5 0 0 1 sin ' 0 cos ' 0 0 1 2 3 cos2 ✓0 cos ' sin2 ✓0 cos ✓0 sin ✓0 (cos ' 1) sin ' cos ✓0 = 4 cos✓0 sin ✓0 (cos ' 1) cos ' sin2 ✓0 + cos2 ✓0 sin ✓0 sin ' 5 sin ' cos ✓0 sin ' sin ✓0 cos '

You may wonder about the signs of the sin ! terms. Once the telescope is level, it is pointing in the direction of the x-axis. You, the astronomer

Solutions for review exercises, Chapter 1

39

rotating the telescope, are at the negative x end of the telescope. If you rotate it counterclockwise, as seen by you, the matrix is as we say. On the other hand, we are not absolutely sure that the problem is unambiguous as stated. b. It is best to think of first rotating the telescope into the (x, z)-plane, then rotating it until it is horizontal (or vertical), then rotating it on its own axis, and then rotating it back (in two steps). This leads to the following product of five matrices: 2

cos ✓0 4 sin ✓0 0

sin ✓0 cos ✓0 0

32 0 cos '0 054 0 1 sin '0

0 1 0

32 sin '0 1 0 540 cos '0 0

0 cos ! sin !

32 0 sin ! 5 4 cos !

cos '0 0 sin '0

0 1 0

32 sin '0 0 54 cos '0

cos ✓0 sin ✓0 0

sin ✓0 cos ✓0 0

3 0 0 5. 1

Solutions for Chapter 2

2.1.1 a.2 2.1.3  1 a. 4

3 40 1 2 5

2 3 x 4y5 2 3 32 z 3 b. 4 0 4 0 1 1 54 4 5 0 1

1 2 3



3 1 ! 6 0

0 1

1 2

2

1 2 3 c. 4 2 3 0 0 1 2 2

1 e. 4 2 1

1 3 4

2

1 b. 4 1 1

1 3 2

3 2 4 0 1 1 4 5 c. 4 1 0 1 2

1 0 1

3 2 5 1 0 0 1 5!4 0 1 0 3 0 0 1

3 2 1 4 1 0 1 2 5!4 0 1 1 9 0 0

1 3 d. 4 1 2 3 7

2

1 2 3

3 2 1 1 0 35 ! 40 1 2 0 0

5 2 0 6/5 1/5 0

3 2 1 1 2 5!4 0 1 0

3 1 15 2

3 2 25 1

7 2 3 0 2 0

0 1 0

3 1 25 1 3 0 05 1

3 6/5 1/5 5 0

2.1.5 You can undo “multiplying row i by m 6= 0” by “multiplying row i by 1/m” (which is possible because m 6= 0; see Definition 2.1.1). You can undo “adding row i to row j” by “subtracting row i from row j,” i.e., “adding ( row i) to row j”. You can undo “switching row i and row j” by “switching row i and row j” again. 2 3 1 0 0 2 2.1.7 Switching rows 2 and 3 of the matrix 4 0 0 1 1 5 brings it to 0 1 0 1 2 3 1 0 0 2 echelon form, giving 4 0 1 0 15. 0 0 1 1 2 3 1 1 0 1 The matrix 4 0 0 2 0 5 can be brought to echelon form by multiply0 0 0 1 2 3 1 1 0 0 ing row 2 by 1/2 and subtracting row 3 from row 1, giving 4 0 0 1 0 5 . 0 0 0 1 40

Solutions for Chapter 2

2

0 0 The matrix 4 1 0 0 1 the first and second 2 0 41 0 2 0 The matrix 4 0 0

2

0 1 40 0 0 0

by multiplying row 3 2 0 3 0 3 0 1 1 1 15 ! 40 0 0 1 2 0

Solution 2.1.9: This is the main danger in numerical analysis: adding (or subtracting) numbers of very di↵erent sizes loses precision.

41

3 0 0 5 can be brought to echelon form by switching first 0 rows, then the second and third rows: 3 2 3 2 3 0 0 1 0 0 1 0 0 0 05 ! 40 0 05 ! 40 1 05. 1 0 0 1 0 0 0 0 3 1 0 3 0 3 0 1 1 1 1 5 can be brought to echelon form 0 0 0 1 2

2 through by 1 0 0 1 0 0

3 1 0

1, then adding 3 2 0 3 0 1 15 ! 40 1 2 0

row 3 to row 2: 1 0 0 1 0 0

3 0 1 0 0 1

3 3 15. 2

2.1.9 The first problem occurs when you subtract 2 · 1010 from 1 to get from the second to the third matrix of equation 2.1.12 (second row, second entry). The 1 is “invisible” if computing only to 10 significant digits, and disappears in the subtraction: 1 20000000000 = 19999999999, which to 10 significant digits is 20000000000. Another “invisible” 1 is found in the second row, third entry. 2.2.1 a. The augmented matrix [A, ~b] corresponds to 2x + y + 3z = 1 x

y=1

x + y + 2z = 1. Since [A, ~b] row reduces to 2

1 40 0

0 1 0

1 1 0

3 0 05, 1

x and y are pivotal unknowns, and z is a nonpivotal unknown. b. If we list first the variable y, then z, then x, the system of equations becomes y + 3z + 2x = 1 y + x =1 y + 2z + x = 1. The corresponding matrix is 2 3 2 1 3 2 1 1 0 4 1 0 1 1 5 , which row reduces to 4 0 1 1 2 1 1 0 0

3 1 0 1 05. 0 1

This time y and z are the pivotal variables, and x is the nonpivotal variable.

42

Solutions for Chapter 2

2.2.3 a. Call the equations A, B, C, D. Adding A and B gives 2x + 4y 2w =0; comparing this with C gives 2w = z 5w + v, so 3w = z + v.

(1)

Comparing C and 2D gives 15w = 5z + 3v, which is compatible with equation 1 only if v = 0. So equation 1 gives 3w = z. Substituting 0 for v and 3w for z in each of the four equations gives z + 2y w = 0. b. Since you can choose arbitrarily the value of y and w, and they determine the values of the other variables, the family of solutions depends on two parameters. 2.2.5 a. This system has a solution for every value of a. If you row reduce  a 1 0 2 you may seem to get 0 a 1 3  1 0 1/a2 2/a + 3/a2 , 0 1 1/a 3/a which seems to indicate that there is a solution for any value of a except a = 0. However, obviously the system has a solution if a = 0; in that case, y = 2 and z = 3. The problem with the above row reduction is that if a = 0, then a can’t be used for a pivotal 1. If a = 0, the matrix row reduces 0 1 0 2 to . 0 0 1 3 b. We have two equations in three unknowns; there is no unique solution. 2 3 1 1 2 1 2.2.7 We can perform row operations to bring 4 1 1 a b 5 to 2 0 b 0 2 3 1 0 (2 + a)/2 (1 + b)/2 4 0 1 (2 a)/2 (1 b)/2 5 . 0 0 2+a+b 1+b a. There are then two possibilities. If a + b + 2 6= 0, the first three columns row reduce to the identity, and the system of equations has the unique solution x=

b(b + 1) , 2+a+b

y=

b2 3b + 2a , 2(2 + a + b)

z=

1+b . 2+a+b

If a + b + 2 = 0, then there are two possibilities to consider: either b + 1 = 0 or b + 1 6= 0. If b + 1 = 0, so that a = b = 1, the matrix row reduces to 2 3 1 0 1/2 0 4 0 1 3/2 1 5 . 0 0 0 0

In this case there are infinitely many solutions: the only nonpivotal variable is z, so we can choose its value arbitrarily; the others are x = z/2 and

Solutions for Chapter 2

43

y = 1 (3z)/2. If a + b + 2 = 0 and b + 1 6= 0, then there is a pivotal 1 in the last column, and there are no solutions. b. The first case, where a + b + 2 6= 0, corresponds to an open subset of the (a, b)-plane. The second case, where a = b = 1, corresponds to a closed set. The third is neither open nor closed. 2.2.9 Row reducing 2 1 1 1 3 1 5 1 6 1 4 1 2 2 2 2 5 4 9

k

n

k

1 7 1 7

3 2 1 1 27 60 5 gives 4 0 0 0

0 1 0 0

4 1/3 2/3 0

3 7/3 1/3 0

3 2 5/6 7 5. 1/6 + 1/2

There are then two possibilities: either 6= 1/2 or = 1/2. If 6 = 1/2, there will be a pivotal 1 in the last column (once we have divided by + 1/2), so there is no solution. But if = 1/2, there are infinitely many solutions: x4 and x5 are nonpivotal, so their values can be chosen arbitrarily, and then the values of x1 , x2 and x3 are given by x1 = 2 + 4x4

3x5

x2 = 5/6 + x4 /3

n+1 Figure for Solution 2.2.11, part b. This n ⇥ (n + 1) matrix represents a system A~x = ~ b of n equations in n unknowns. By the time we are ready to obtain a pivotal 1 at the intersection of the kth column (dotted) and kth row, all the entries on the kth row to the left of the kth column are 0, so we only need to place a 1 in position k, k and then justify that act by dividing all the entries on the kth row to the right of kth column by the (k, k) entry. There are n + 1 k such entries. If the (k, k) entry is 0, we go down the kth column until we find a nonzero entry. In computing the total number of computations, we are assuming the worse case scenario, where all entries of the kth column are nonzero.

0 0 1 0

7x5 /3

x3 = 1/6 + 2x4 /3 + x5 /3. 2.2.11 a. R(1) = 1 + 1/2 1/2 = 1, R(2) = 8 + 2 1 = 9. If we have one equation in one unknown, we need to perform one division. If we have two equations in two unknowns, we need two divisions to get a pivotal 1 in the first row (the 1 is free), followed by two multiplications and two additions to get a 0 in the first element of the second row (the 0 is free). One more division, multiplication and addition get us a pivotal 1 in the second row and a 0 for the second element of the first row, for a total of nine. b. As illustrated by the figure in the margin, we need n + 1 k divisions to obtain a pivotal 1 in the column k. To obtain a 0 in another entry of column k requires n + 1 k multiplications and n + 1 k additions. We need to do this for n 1 entries of column k. So our total is (n + 1 c. Using n X

Pn

(2n

k=1

k) + 2(n

k=1

k=

1)(n

1)(n + 1

n(n+ 1) , 2

k) = (2n

1)(n

k + 1).

compute

k + 1) =

n X

(2n

1)(n + 1)

k=1

= n(2n = n3 +

n X

(2n

1)k

k=1

1)(n + 1) n2 2

n . 2

(2n

1)

n(n + 1) 2

44

Solutions for Chapter 2

d.

2 3 7 + 3 2 6 2 3 Q(2) = 8 + 4 3 2 2 3 Q(3) = 27 + 9 3 2 Q(1) =

= 1, 7 2 = 9, 6 7 3 = 28. 6

2 The function R(n) Q(n) is the cubic polynomial 13 n3 n2 +p 3 n, with a 2 root at n = 2. Its derivative, n2 2n + 3 , has roots at 1 ± 1/3; it is strictly positive for n 2. So the function R(n) Q(n) is increasing as a function of n for n 2, and hence is strictly positive for n 3. Another way to show that Q(n) < R(n) is to note that for n 3,

R(n)

Q(n) =

1 3 n 3

2 n2 + n 3

1 (3n2 ) 3

2 2 n2 + n = n > 0. 3 3

e. For partial row reduction for a single column, the operations needed are like those for full row reduction (part b) except that we are just putting zeros below the diagonal, so we can replace n 1 in the total for full row reduction by n k, to get (n + 1

k) + 2(n

k)(n + 1

k) = (n

k + 1)(2n

2k + 1)

total operations (divisions, multiplications, and additions). f. Denote by P (n) the total computations needed for partial row reduction. By part e, we have P (n) =

n X

(n

k + 1)(2n

2k + 1).

k=1

Let

2 3 1 2 1 n + n n. 3 2 6 We will show by induction that P = P1 . Clearly, P (1) = P1 (1) = 1. If P (n) = P1 (n), we get: P1 (n) =

P (n + 1) =

n+1 X

(n

k + 2)(2n

2k + 3) = 1 +

k=1

n X

k=1

=1+

k=1 n X

(n

⌘⇣ k + 1) + 1 (2n k + 1)(2n

k=1

k=

n(n + 1) . 2

=1+

(n

k + 2)(2n

k=1

n ⇣ X =1+ (n

In line 4, we get the next-to-last term using

n X

|

{z

2k + 1) + }

P (n)

2 3 n + |3

1 2 n 2{z

1 n 6}

⌘ 2k + 1) + 2

+4n2

n X

(4n

4k + 5)

k=1

4

n2 + n + 5n 2

P1 (n) by inductive hypothesis

2 1 = (n + 1)3 + (n + 1)2 3 2

1 (n + 1) = P1 (n + 1). 6

2k + 3)

Solutions for Chapter 2

45

So the relation is true for all n 1. This can also be solved by direct computation (as in part c), using n X

k=1

n(n+ 1) k= 2

n X

and

k2 =

k=1

n(n + 1)(2n + 1) . 6

g. We need n k multiplications and n k additions for the row k, so the total number of operations for back substitution is B(n) = n2 n. h. So the total number of operations for n equations in n unknowns is 2 3 7 Q(n) = P (n) + B(n) = n3 + n2 n for all n 1. 3 2 6 2

3 4 2.3.1 The inverse of A is A 1 1 5. Now compute 3 2 32 3 2 3 3 1 4 1 2 0 2 2 5 4 1 1 1541 0 15 = 4 1 1 25. 2 1 3 1 1 1 2 1 4 3 =4 1 2

1 1 1

The columns of the product are the solutions to the three systems we were trying to solve. 2.3.3 For instance, 2 3 2 3 2 3   1 0 1 0  1 0 0 1 0 0 4 1 0 1 0 0 0 15 = , but 4 0 1 5 = 40 1 05. 0 1 0 0 1 0 1 0 0 0 0 0 0 0 0 2

3 2.3.5 a. Since A = 4 2 1

A

1

2

the solution is x = 3/8, y 2 3 1 3 b. Since 4 2 1 2 1 1 1 2 1 0 0 40 1 0 0 0 1

3/16 = 4 1/4 1/16

1/4 0 1/4

3 1/16 3/4 5 5/16

3 2 3 3 1 1 0 0 3/8 2 1 5 row reduces to 4 0 1 0 1/2 5, 1 1 0 0 1 1/8

1 1 1

= 1/2, z = 1/8. 3 1 0 0 0 1 0 5 row reduces to 0 0 1 3 3/16 1/4 1/16 1/4 0 3/4 5 , we have 1/16 1/4 5/16

and

2.3.7 It just so happens that 2 32 2 1 6 3 1 0 42 7 35 = 40 1 4 12 5 0 0

2

3/16 4 1/4 1/16

1/4 0 1/4

32 3 2 3 1/16 1 3/8 3/4 5 4 1 5 = 4 1/2 5 . 5/16 1 1/8

A = A 1: 3 0 0 5 . So by Proposition 2.3.1, we have 1

46

Solutions for Chapter 2

2

3 2 3 2 3 5 5 4 ~x = A 1 4 7 5 = A 4 7 5 = 4 6 5 . 11 11 9 2.3.9 a. 2 2 4 0 1 2 1 4 0 1 2 1 41 0 2

2

1 0 40 1 0 0

1 40 1 3 2 3 2 05 4 0 1 1

b.

3 2 0 3 2 0

The products will be 3 3 14 2 3 5 3 times the third row is subtracted from the first 0 4 3 3 2 4 6 5 the second row is multiplied by 2 0 4 3 3 2 0 4 5 the second and third rows are switched. 2 3

3 2 35 4 3, 14 35 4

2

1 40 1 2 3 2 1 0 0 1 40 2 05 40 0 0 1 1

3 2 0 3 4 0

3 2 35 4 3, 2 65 4

2

1 40 1 2 3 2 1 0 0 1 40 0 15 41 0 1 0 0

3 2 0 3 0 2

3 2 35 4 3 2 45. 3

2.3.11 Let A be an n ⇥ m matrix. Then

AE1 (i, x) has the same columns as A, except the ith, which is multiplied by x. AE2 (i, j, x) has the same columns as A except the jth, which is the sum of the jth column of A (contributed by the 1 in the (j, j)th position), and x times the ith column (contributed by the x in the (i, j)th position). AE3 (i, j) has the same columns as A, except for the ith and jth, which are switched. 2.3.13 Here is one way to show this. Denote by a the ith row and by b the jth row of our matrix. Assume we wish to switch the ith and the jth rows. Then multiplication on the left by E2 (i, j, 1) turns the ith row into a + b. Multiplication on the left by E2 (j, i, 1) then by E1 (j, 1) turns the jth row into a. Finally, we multiply on the left by E2 (i, j, 1) to subtract a from the ith row, making that row b. So we can switch rows by multiplying with only the first two types of elementary matrices. Here is a di↵erent explanation of the same argument: Compute the product      1 1 1 0 1 0 1 1 0 1 = . 0 1 0 1 1 1 0 1 1 0 This certainly shows that the 2 ⇥ 2 elementary matrix E3 (1, 2) can be written as a product of elementary matrices of type 1 and 2.

Solutions for Chapter 2

47

More generally, E2 (i, j, 1)E1 (j, 1)E2 (j, i, 1)E2 (i, j, 1) = E3 (i, j). 2.3.15 1. Suppose that [A|I] row reduces to [I|B]. This can be expressed as multiplication on the left by elementary matrices: Going from equation 1 to equation 2 in Solution 2.3.15: If [E][A|I] = [I|B],

then

[E][A] = I

[E][I] = B,

and

since the multiplication of each column of A and of I by [E] occurs independently of all other columns: [E]

[A|I] [EA|EI].

Ek . . . E1 [A|I] = [I|B].

(1)

The left and right sides of equation (1) give Ek . . . E1 A = I

and Ek . . . E1 I = B.

(2)

Thus B is a product of elementary matrices, which are invertible, so (by Proposition 1.2.15) B is invertible: B 1 = E1 1 . . . Ek 1 . Substituting the right equation of (2) into the left equation gives BA = I, so B is a left inverse of A. We don’t need to check that it is also a right inverse, but doing so is straightforward: multiplying BA = I by B 1 on the left and B on the right gives I=B

1

IB = B

1

(BA)B = (B

1

B)AB = AB.

So B is also a right inverse of A. 2. If [A|I] row reduces to [A0 |A00 ], where A0 is not the identity, then by Theorem 2.2.6 the equation A~xi = ~ei either has no solution or has infinitely many solutions for each i = 1, . . . , n. In either case, A is not invertible. 2.4.1 The only way you can write 2 3 2 3 2 3 2 3 a1 0 1 0 607 607 6 ... 7 6 a2 7 6 . 7 = a1 6 . 7 + · · · + ak 6 7 = 6 . 7 , 4 .. 5 4 .. 5 4 0 5 4 .. 5 0

Solution 2.4.3: The vectors in equation 1 are orthogonal, by Corollary 1.4.8; they are orthonormal because they are orthogonal and each vector has length 1.

Solution 2.4.5: Recall that only a very few, very special subsets of Rn are subspaces of Rn ; see Definition 1.1.5. Roughly, a subspace is a flat subset that goes through the origin: to be closed under multiplication, a subspace must contain the zero vector, so that 0~ v = ~0.

0

1

ak

is if a1 = a2 = · · · = ak = 0.

2.4.3 To make the basis orthonormal, each vector needs to be normalized to give it length 1. This is done by dividing each vectorby itsplength (see  p 1/p2 1/p2 equation 1.4.6). So the orthonormal basis is , . These 1/ 2 1/ 2 vectors form a basis of R2 because they are two linearly independent vectors in R2 ; they are orthogonal because p  p  1/p2 1/p2 · = 0. (1) 1/ 2 1/ 2 2.4.5 To show that Span (~v1 , . . . , ~vk ) is a subspace of Rn , we need to show that it is closed under addition and under multiplication by scalars. This follows from the computations c(a1 ~v1 + · · · + ak ~vk ) = ca1 ~v1 + · · · + cak ~vk ; (a1 ~v1 + · · · + ak ~vk ) + (b1 ~v1 + · · · + bk ~vk )

= (a1 + b1 )~v1 + · · · + (ak + bk )~vk .

48

Solutions for Chapter 2

To see that it is the smallest subspace that contains the ~vi , note that any subspace that contains the ~vi must contain their linear combinations, hence the smallest such subspace is Span (~v1 , . . . , ~vk ). 2.4.7 Both parts depend on the definition of an orthonormal basis: ⇢ 0 if i 6= j, ~vi · ~vj = 1 if i = j

Proof of part 1: Since ~v1 , . . . , ~vn form a basis, they span Rn : any ~x 2 Rn can be written ~x = a1 ~v1 + · · · + an ~vn . Take the dot product of both sides with ~vi : ~x · ~vi = a1 ~v1 · ~vi + · · · + an ~vn · ~vi = ai ~vi · ~vi = ai . Proof of part 2: ✓X ◆ ✓X ◆ X n n n X n n X |~x|2 = ai ~vi · aj ~vj = ai aj (~vi · ~vj ) = a2i . i=1

j=1

i=1 j=1

i=1

2.4.9 To see that condition 2 implies condition 3, first note that 2 =) 3 is logically equivalent to (not 3) =) (not 2). Now suppose {~v1 , . . . , ~vk } is a linearly dependent set spanning V , so by Definition 2.4.2, there exists a nontrivial solution to a1 ~v1 + · · · + ak ~vk = 0. Without loss of generality, we may assume that ak is nonzero (if it isn’t, renumber the vectors so that ak is nonzero). Using the above relation, we can solve for ~vk in terms of the other vectors: ✓ ◆ a1 ak 1 ~vk = ~v1 + · · · + ~vk 1 . ak ak This implies that {~v1 , . . . , ~vk } cannot be a minimal spanning set, because if we were to drop ~vk we could still form all the linear combinations as before. So (not 3) =) (not 2), and we are finished. To show that 3 =) 1: ~ 2 V , there exist The vectors ~v1 , . . . , ~vk span V , so for any vector w some numbers a1 , . . . , an such that ~. a1 ~v1 + · · · + an ~vn = w Thus, if we add this vector to ~v1 , . . . , ~vk , we will have a linearly dependent set because a1 ~v1 + · · · + an ~vn

~ =0 w

~ can is a nontrivial linear combination of the vectors that equals 0. Since w be any vector in V , the set {~v1 , . . . , ~vk } is a maximal linearly independent set.

Solutions for Chapter 2

49

2.4.11 a. For any n, we have n + 1 linear equations for the n + 1 unknowns a0,n , a1,n , . . . , an,n , which say ✓ ◆k ✓ ◆k ✓ ◆k ⇣ n ⌘k Z 1 0 1 2 1 a0,n + a1,n + a2,n + · · · + an,n = xk dx = , n n n n k+1 0 one for each k = 0, 1, . . . , n. These systems of linear equations are: • When n = 1

a0,1 1 + a1,1 1 = 1 a0,1 0 + a1,1 1 = 1/2

• When n = 2

a0,2 1 + a1,2 1 + a2,2 1 = 1 a0,2 0 + a1,2 (1/2) + a2,2 1 = 1/2 a0,2 0 + a1,2 (1/4) + a2,2 1 = 1/3

• When n = 3 The system of equations for n = 3 could be written as the augmented matrix [A|~ b]: 2 3 1 1 1 1 1 6 0 1/3 2/3 1 1/2 7 6 7. 4 0 1/9 4/9 1 1/3 5 0 1/27 8/27 1 1/4

a0,3 1 + a1,3 1 + a2,3 1 + a3,3 1 = 1

a0,3 0 + a1,3 (1/3) + a2,3 (2/3) + a3,3 1 = 1/2 a0,3 0 + a1,3 (1/9) + a2,3 (4/9) + a3,3 1 = 1/3 a0,3 0 + a1,3 (1/27) + a2,3 (8/27) + a3,3 1 = 1/4. b. These wouldn’t be too bad to solve by hand (although already the last would be distinctly unpleasant). We wrote a little Matlab m-file to do it systematically: function [N,b,c] = EqSp(n) N = zeros(n+1); % make an n+1 ⇥ n+1 matrix of zeros c=linspace(1,n+1,n+1); % make a place holder for the right side for i=1:n+1 for j=1:n+1 N(i,j)= ((j-1)/n)^(i-1); % put the right coefficients in the matrix end c(i)=1/c(i); % put the right entries in the right side end b=c’; % our c was a row vector, take its transpose c=N% this solves the system of linear equations ¯

If you write and save this file as ‘EqSp.m’, and then type [A,b,c]=EqSp(5), for the case when n = 5, you will get 2

1 60 6 60 A=6 60 6 40 0

1 1/5 1/25 1/125 1/625 1/3125

1 2/5 4/25 8/125 16/625 32/3125

1 3/5 9/25 27/125 81/625 243/3125

1 4/5 16/25 64/125 256/625 541/1651

3 1 17 7 17 7, 17 7 15 1

2

3 1 6 1/2 7 6 7 6 1/3 7 7 b=6 6 1/4 7 , 6 7 4 1/5 5 1/6

2

3 19/288 6 25/96 7 6 7 6 25/144 7 7 c=6 6 25/144 7 . 6 7 4 25/96 5 19/288

50

For instance, for n = 2, we have    1 1 1/2 1 = . 0 1 1/2 1/2

Solutions for Chapter 2

This corresponds to the equation Ac = b, where the matrix A is the matrix of coefficients for n = 5, and the vector c is the desired set of coefficients – the solutions when n = 5. When n = 1, 2, 3, the coefficients – i.e., the solutions to the systems of equations in part a – are 2 3 2 3 1/8  1/6 1/2 6 3/8 7 , 4 2/3 5 , 4 5. 1/2 3/8 1/6 1/8

R 1 dx The approximations to 0 1+x = log 2 = 0.69314718055995 . . . obtained with these coefficients are .75 for n = 1, 25 36 = .6944 . . . for n = 2, and 111 160 = .69375 for n = 3. c. If you compute 5 X

1 ai,5 ⇡ (i/5) + 1 i=0

Z

0

1

dx = log 2 = 0.69314718055995 . . . 1+x

you will find 0.69316302910053, which is a pretty good approximation for a Riemann sum with six terms. For instance, the midpoint Riemann sum gives 5

2 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 4

3 0.0118 0.1141 7 7 0.2362 7 7 1.2044 7 7 3.7636 7 7 10.3135 7 7 22.6521 7 7 41.7176 7 7 63.9006 7 7 82.5706 7 7 89.7629 7 7. 82.5829 7 7 63.9189 7 7 41.7345 7 7 22.6633 7 7 10.3191 7 7 3.7656 7 7 1.2050 7 7 0.2363 7 7 0.1141 5 0.0118

Coefficients when n = 20.

1X 1 ⇡ 0.69190788571594, 5 i=1 (2i 1)/10 which is a much worse approximation. But this scheme runs into trouble. All the coefficients are positive up to n = 7, but for n = 8 they are 2 6 6 6 6 6 6 6 6 6 6 6 4

3 2 248/7109 578/2783 7 6 7 6 111/3391 7 6 7 6 97/262 7 6 7 6 454/2835 7 ⇡ 6 7 6 97/262 7 6 7 6 111/3391 7 6 5 4 578/2783 248/7109

3 0.0349 0.2077 7 7 0.0327 7 7 0.3702 7 7 0.1601 7 7 0.3702 7 7 0.0327 7 5 0.2077 0.0349

and the approximation scheme starts depending on cancellations. This is much worse when n = 20, where the coefficients are as shown in the margin. Despite these bad sign variations, the Riemann sum works pretty well: the approximation to the integral above gives 0.69314718055995, which is ln 2 to the precision of the machine.

Solutions for Chapter 2

51

2

3 1 a a a 61 1 a a7 2.4.13 In the process of row reducing A = 4 5 , you will come 1 1 1 a 1 1 1 1 to the matrix 2 3 1 a a a 0 0 7 60 1 a 4 5. 0 1 a 1 a 0 0 1 a 1 a 1 a

If a = 1, the matrix will not row reduce to the identity, because you can’t choose a pivotal 1 in the second column, so one necessary condition for A to be invertible is that a 6= 1. Let us suppose that this is the case. We can now row reduce two steps further to find 2 3 2 3 1 0 a a 1 0 0 a 0 0 7 0 7 60 1 60 1 0 4 5 , and then 4 5 0 0 1 a 0 0 0 1 0 0 0 1 a 1 a 0 0 0 1 a The next step row reduces the matrix to the identity, so the matrix is invertible if and only if a 6= 1. ~ can be written as a linear combination of the ~vi in two 2.4.15 Suppose w P P ~ = ki=1 xi ~vi and w ~ = ki=1 yi ~vi , with xi 6= yi for at least one i. ways: w Then we can write k X i=1

xi ~vi

k X i=1

yi ~vi = ~0,

i.e.,

k X (xi

yi )~vi = ~0,

i=1

and it follows from Definition 2.4.2 that the ~vi are not linearly independent. In the other direction, if any vector can be written as a linear combination Pk Pk of the ~vi in at most one way, then ~0 = i=1 0~vi = i=1 xi ~vi implies that all the xi are 0, so the ~vi are linearly independent. 2.5.1 a. The vectors ~v1 and ~v3 are in the of A, since2A~v31 = ~0 and 2 kernel 3 2 2 A~v3 = ~0. But ~v2 is not, since A~v2 = 4 3 5. The vector 4 3 5 is in the 3 3 image of A. b. The matrix T represents a transformation from R5 to R3 ; it takes a ~ 4 has the right height vector in R5 and gives a vector in R3 . Therefore, w ~ 1 and w ~ 3 have the right height to be in the kernel (although it isn’t), and w to be in its image. Since the sum of the second and fifth columns of T is ~0, one element of 2 3 0 617 6 7 the kernel is 6 0 7. 4 5 0 1

52

Solutions for Chapter 2

2.5.3 nullity T = dim ker T = number of nonpivotal columns of T ; rank of T = dim image T = number of linearly independent columns of T = number of pivotal columns of T . rank T + nullity T = dim domain T 2.5.5 By Definition 1.1.5 of a subspace, we need to show that the kernel and the image of a linear transformation T are closed under addition and multiplication by scalars. These are straightforward computations, using the linearity of T . ~ 2 ker T , i.e., if T (~v) = ~0 and T (~ The kernel of T : If ~v, w w) = ~0, then ~ ) = T (~v) + T (~ T (~v + w w) = ~0 + ~0 = ~0 and T (a~v) = aT (~v) = a~0 = ~0,

~ 2 ker T and a~v 2 ker T . so ~v + w

~ = T (~ The image of T : If ~v = T (~v1 ), w w1 ), then ~v + w ~ = T (~ w1 ) + T (~v1 ) = T (~ w1 + ~v1 ) and a~v = aT (~v1 ) = T (a~v1 ). So the image is also closed under addition and multiplication by scalars.

Solution 2.5.7: All triples of columns of A are linearly independent except columns 3, 4, and 6.

2.5.7 a. n = 3. The last three columns of the matrix are clearly linearly independent, so the matrix has rank at least 3, and it has rank at most 3 because there can be at most three linearly independent vectors in R3 . b. Yes. For example, the first three columns are linearly independent, since the matrix composed of just those columns row reduces to the identity. c. The 3rd, 4th, and 6th columns are linearly dependent. d. You cannot choose freely the values of x1 , x2 , x5 . Since the rank of the matrix is 3, three variables must correspond to pivotal (linearly independent) columns. For the variables x1 , x2 , x5 to be freely chosen, i.e., nonpivotal, x3 , x4 , x6 would have to correspond to linearly independent columns. 2.5.9 a. The matrix of a linear transformation has as its columns the images of the standard basis vectors, in this case identified to the polynomials p1 (x) = 1, p2 (x) = x, p3 (x) = x2 . Since T (p1 )(x) = 0, T (p2 )(x) = x, T (p3 )(x) = 4x2 , 2 3 0 0 0 the matrix of T is 4 0 1 0 5. 0 0 4 2 3 2 3 0 0 0 0 1 0 b. The matrix 4 0 1 0 5 row reduces to 4 0 0 1 5. Thus the image 0 0 4 0 0 0 has dimension 2 (the number of pivotal columns) and has a basis made up

Solutions for Chapter 2

53

of the polynomials ax + 4bx2 , the linear combinations of the second and third columns of the matrix of T . The kernel has dimension 1 (the number of nonpivotal columns), and consists precisely of the constant polynomials. b rank 2

rank 2

a rank 1 rank 2

b

rank 3

2

rank 3

rank 3

rank 2

rank 2

a

1

rank 3

rank 1 rank 3

Figure for Solution 2.5.11 Top: On the hyperbola, the kernel of A has dimension 1 and its image has dimension 1. Elsewhere, the rank (dimension of the image) is 2, so by the dimension formula the kernel has dimension 0. The rank is never 0 or 3. Bottom: On the a-axis, the line a = b, and the line b = 2, the rank of B is 2, so its kernel has dimension 1. At the origin the rank is 1 and the dimension of the kernel is 2. Elsewhere, the kernel has dimension 0 and the rank is 3.

2.5.11 The sketch is shown in the margin. a. If ab 6= 2, then dim(ker (A)) = 0, so in that case the image has dimension 2. If ab = 2, the image and the kernel have dimension 1. b. By row operations, we can bring the matrix B to 2 3 1 2 a 40 b ab a 5 . 0 2a b a2 a We now separate the case b 6= 0 and b = 0. 2 3 1 2 a • If b = 0, the matrix is 4 0 0 a 5, which has rank 3 unless a = 0, 2 0 2a a a in which case it has rank 1. (Of course, since n = 3, rank 3 corresponds to dim ker = 0, and rank 1 corresponds to dim ker = 2.) • If b 6= 0, then we can do further row operations to bring the matrix to the form 2 3 1 2 a 40 b 5. ab a 2a b 2 0 0 a a a) b (ab The third entry in the third row factors as ab (a b)(2 b), so we have rank 2 (and dim ker = 1) if a = 0 or a = b or b = 2. Otherwise we have rank 3 (and dim ker = 0).

2.5.13 a. As in Example 2.5.14, we need to put the right side on a common denominator and consider the resulting system of linear equations. Row reduction then tells us for what values of a the system has no solutions. So: x 1 A0 B1 x + B0 = + 2 2 (x + 1)(x + ax + 5) x + 1 x + ax + 5 =

A0 x2 + aA0 x + 5A0 + B1 x2 + B1 x + B0 x + B0 . (x + 1)(x2 + ax + 5)

This gives x

1 = A0 x2 + aA0 x + 5A0 + B1 x2 + B1 x + B0 x + B0 ,

i.e., 32 3 5 0 1 1 A0 aA0 + B1 + B0 = 1 which we can write as 4 a 1 1 1 5 4 B1 5. 1 1 0 0 B0 A0 + B1 = 0, 2 3 1 0 0 a2 6 Row reduction gives 4 0 1 0 a 26 5, so the fraction in question cannot 0 0 1 a4 6a be written as a partial fraction when a = 6. 5A0 + B0 =

1

2

54

Solutions for Chapter 2

b. This does not contradict Proposition 2.5.13 because that proposition requires that p be factored as p (x) = (x

a1 )n1 · · · (x

ak )nk .

with the ai distinct. If you substitute 6 for a in x2 + ax + 5 you get x2 + 6x + 5 = (x + 1)(x + 5), so factoring p/q as p(x) x 1 A0 B1 x + B0 A0 B1 x + B0 = = + = + q(x) (x + 1)(x2 + ax + 5) x + 1 x2 + ax + 5 x + 1 (x + 1)(x + 5) does not meet that requirement; both terms contain (x + 1) in the denominator. You could avoid this by using a di↵erent factorization: p(x) x 1 x 1 A1 x + A0 B0 = = = + 2 2 2 q(x) (x + 1)(x + 6x + 5) (x + 1) (x + 5) (x + 1) x+5 2.5.15 We give two solutions. Solution 1 Since A and B are square, C = (AB) 1 is square. We have CAB = I and ABC = I, so (using the associativity of matrix multiplication) (CA)B = I, so CA is a left inverse of B, and A(BC) = I, so BC is a right inverse of A. Since (see the discussion after Corollary 2.2.7) a square matrix with a one-sided inverse (left or right) is invertible, and the one-sided inverse is the inverse, it follows that A and B are invertible. The second solution reproves the fact that if a square matrix has a left inverse it has a right inverse, and the two inverses are equal. By Proposition 1.2.15, since the matrices A 1 and AB are invertible, so is the product B = (A 1

1

A)B = A

1

(AB),

1

and B = (AB) A. Note that this uses associativity of matrix multiplication (Proposition 1.2.9). Recall that the nullity of a linear transformation is the dimension of its kernel.

Solution 2 Note first the following results: if T1 , T2 : Rn ! Rn are linear transformations, then 1. the image of T1 contains the image of T1 T2 , and 2. the kernel of T1 T2 contains the kernel of T2 . The first is true because, by the definition of image, for any vector ~v ~ such that (T1 T2 )(~ in img T1 T2 , there exists a vector w w) = ~v. Since T1 (T2 (~ w)) = ~v, the vector ~v is also in the image of T1 . The second is true because for any ~v 2 ker T2 , we have T2 (~v) = ~0. Since T1 (~0) = ~0, we see that (T1 T2 )(~v) = T1 T2 (~v) = T1 (~0) = ~0, so ~v is also in the kernel of T1 T2 . If AB is invertible, then the image of A contains the image of AB by statement 1. So A has rank n, hence nullity 0 by the dimension formula, so A is invertible. Since B = A 1 (AB), we have B 1 = (AB) 1 A. For B, one could argue that ker B ⇢ ker AB = {0}, so B has nullity 0, and thus rank n, so B is invertible. *2.5.17 a. Since p(0) = a + 0b + 02 c = 1, we have a = 1. Since p(1) = a + b + c = 4, we have b + c = 3. Since p(3) = a + 3b + 9c = 2, we have 3b + 9c = 3. It follows that c = 2 and b = 5.

Solutions for Chapter 2

Solution 2.5.17, part b: The polynomials themselves are almost certainly nonlinear, but Mx is linear since it handles polynomials that have already been evaluated at the given points.

55

b. Let Mx be the linear transformation from the space of Pn of polynomials of degree at most n to Rn+1 given by 2 3 2 3 p(x0 ) x0 6 .. 7 .. 5 4 p 7! 4 . 5 , where x = . . xn

p(xn )

Assume that the polynomial q is in the kernel of Mx . Then q vanishes at n + 1 distinct points, so that either q is the zero polynomial or it has degree at least n + 1. Since q cannot have degree greater than n, it must be the zero polynomial. So ker (Mx ) = {0}, hence Mx is injective, so by Corollary 2.5.10 it is surjective. It follows that there exists a solution of 2 3 a0 . Mx (p) = 4 .. 5 and that it is unique. an c. Take the linear transformation Mx0 from Pk to R2n+2 defined by 2 3 2 3 p(x0 ) a0 6 .. 7 . 7 6 6 . 7 6 .. 7 6 7 6 7 6 p(xn ) 7 6 an 7 0 0 7 p 7! 6 . If ker (M ) = {0}, then a solution of M (p) = 6 7 x x 6 p0 (x0 ) 7 6 b0 7 6 7 6 7 6 . 7 4 ... 5 4 .. 5 bn p0 (xn ) exists and so a value for k is 2n + 2 (as shown above). In fact this is the lowest value for k that always has a solution. 2 3  1 1 1 1 1 2.5.19 a. H2 = , H3 = 4 1/2 1/4 1/8 5. 1/2 1/4 1/3 1/9 1/27 2 1 1 1 ... 1 3 1/2n 7 7 1/3n 7. b. .. 7 5 . 1/n 1/n2 1/n3 . . . 1/nn c. If Hn is not invertible, there exist numbers a1 , . . . , an not all zero such that 6 1/2 6 1/3 Hn = 6 6 . 4 . .

1/4 1/9 .. .

1/8 1/27 .. .

... ... .. .

f~a (1) = · · · = f~a (n) = 0. But we can write + a2 xn 2 + · · · + an , xn and the only way this function can vanish at the integers 1, . . . , n is if the numerator vanishes at all the points 1, . . . , n. But it is a polynomial of degree n 1, and cannot vanish at n di↵erent points without vanishing identically. f~a (x) =

a1 xn

1

56

Solutions for Chapter 2

*2.5.21 a. First, we will show that if there exists S : Rn ! Rn such that T1 = S T2 then ker T2 ⇢ ker T1 : Indeed, if T2 (~v) = ~0, then (since S is linear), (S T2 )(~v) = T1 (~v) = ~0. Now we will show that if ker T2 ⇢ ker T1 then there exists a linear transformation S : Rn ! Rn such that T1 = S T2 : For any ~v 2 img T2 , choose ~v0 2 Rn such that T2 (~v0 ) = ~v, and set S(~v) = T1 (~v0 ). We need to show that this does not depend on the choice of ~v0 . If ~v10 also satisfies T2 (~v10 ) = v, then ~v0 ~v10 2 ker T2 ⇢ ker T1 , so T1 (~v0 ) = T1 (~v10 ), showing that S is well defined on img T2 . Now extend it to Rn in any way, for instance by choosing a basis for img T2 , extending it to a basis of Rn , and setting it equal to 0 on all the new basis vectors. b. If 9S : Rn ! Rn such that T1 = T2 S then img T1 ⇢ img T2 : ~ 2 img T1 there is a vector ~v such that T1 (~v) = w ~ (by the For each w ~ , so w ~ 2 img T2 . definition of image). If T1 = T2 S, then T2 (S(~v)) = w Conversely, we need to show that if img T1 ⇢ img T2 then there exists a linear transformation S : Rn ! Rn such that T1 = T2 S. Choose, for each i, a vector ~vi such that T2 (~vi ) = T1 (~ei ). This is possible, since img T2 img T1 . Set S = [~v1 , . . . , ~vn ]. Then T1 = T2 S, since (T2 S)(~ei ) = T2 (~vi ) = T1 (~ei ). 



0 1 It corresponds to the basis v1 = , 0 0    0 0 0 0 2 1 v3 = , v4 = . We have = 2v1 + v2 + 5v3 + 4v4 . 1 0 0 1 5 4   1 0 0 0 b. It corresponds to the basis v1 = , v2 = , 0 0 1 0    0 1 0 0 2 1 v3 = , v4 = . We have = 2v1 + 5v2 + v3 + 4v4 . 0 0 0 1 5 4 2.6.1

a.

2.6.3 02 31 a   1 0 1 B6 b 7C = a + b @ 4 5 A {v} c 0 1 0 d

  0 0 1 0 +c +d 1 1 0 1

1 0 , v2 = 0 0

 1 a+b c = 0 c+d a

2.6.5 a. The ith column of [RA ] is [RA ]~ei : 2 3 a    a b 1 0 a b 6b7 [RA ]~e1 = 4 5 , which corresponds to = . 0 0 0 0 0 c d 0 2 3 c 6d7 [RA ]~e2 = 4 5 , 0 0

which corresponds to



  c d 0 1 a b = . 0 0 0 0 c d

d b

Solutions for Chapter 2

2 3 0 607 [RA ]~e3 = 4 5 , a b 2 3 0 607 [RA ]~e4 = 4 5 , c d

57

which corresponds to



  0 0 0 0 a b = . a b 1 0 c d

which corresponds to



  0 0 0 0 a b = . c d 0 1 c d 

a b Similarly, the first column of [LA ] corresponds to c d    a b 0 1 a b 0 second column to , the third to c d 0 0 c d 1   a b 0 0 fourth to . c d 0 1 b. From part a we have p p |RA | = |LA | = 2a2 + 2b2 + 2c2 + 2d2 = 2|A|.



1 0 ; the 0 0 0 , and the 0

2.6.7 a. This is not a subspace, since 0 (the zero function) is not in it. (It is an affine subspace.) b. This is a subspace. If f, g satisfy the di↵erential equation, then so does af + bg: (af + bg)(x) = af (x) + bg(x) = axf 0 (x) + bxg0 (x) = x(af + bg)0 (x). c. This is not a vector subspace: 0 is in it, but that is not enough to make it a subspace. The function f (x) = x2 /4 is in it, but x2 = 4(x2 /4) is not, so it isn’t closed under multiplication by scalars. 2.6.9 a. Take any basis w1 , . . . , wn of V , and discard from the ordered set of vectors v1 , . . . , vk , w1 , . . . , wn any vectors wi that are linear combinations of earlier vectors. At all stages, the set of vectors obtained will span V , since they do when you start and discarding a vector that is a linear combination of others doesn’t change the span. When you are through, the vectors obtained will be linearly independent, so they satisfy Definition 2.6.11. b. The approach is identical: eliminate from v1 , . . . , vk any vectors that depend linearly on earlier vectors; this never changes the span, and you end up with linearly independent vectors that span V . 2.7.1 The successive vectors ~e1 , A~e1 , A2~e1 , . . . are    1 1 1 , , ,.... 0 1 2

58

Solutions for Chapter 2

The vectors 



1 0 

1 = 2



and

1 1

are linearly independent, and

 1 1 +2 , 0 1

or better,

Thus the polynomial p1 is p1 (t) = t2 Solution 2.7.3: The notation T (p) = p + xp00 is standard, but it may be confusing, because when applied (for example) to the polynomial x, we have x playing a double role – it is both the polynomial and the name of the variable. It would be heavier but perhaps clearer to write 00

(T (p))(x) = p(x) + xp (x).

A2~e1

2t + 1 = (t

2A~e1 + ~e1 = ~0. 1)2 .

2.7.3 a. We will give two solutions. def

First solution: In the basis {x} = 2 1 60 A=4 0 0 since

(1, x, x2 , x3 ), the matrix of T is 3 0 0 0 1 2 07 5, 0 1 6 0 0 1

T (1) = 1, T (x) = x, T (x2 ) = x2 + 2x, T (x3 ) = x3 + 6x2 .

Then, applied to the basis vectors 1, x, x2 , x3 , we have (T (1))(x) = 1 + x · 0 = 1,

Denote the basis p1 , . . . , p4 by {p}. The change of basis matrix [Pp!x ] is 2

1 60 S=4 0 0

(T (x))(x) = x + x · 0 = x,

(T (x2 ))(x) = x2 + x · 2

(T (x3 ))(x) = x3 + x · 6x.

2

1 0 6 S 1 AS = 4 0 0

1 1 0 0

3 1 17 5, 1 1

1 1 1 0

giving for our desired 32 1 0 0 1 1 1 0760 54 0 1 1 0 0 0 1 0

with inverse S

matrix

0 1 0 0

0 2 1 0

32 0 1 0760 54 6 0 1 0

1 1 0 0

1 1 1 0

1

2

1 60 =4 0 0

1 1 0 0

3 2 1 1 17 60 5=4 1 0 1 0

3 0 07 5, 1 1

0 1 1 0

0 1 0 0

2 2 1 0

Second solution: The other approach is to compute directly:

3 2 47 5. 6 1

T (1) = 1 T (1 + x) = 1 + x T (1 + x + x2 ) = 1 + x + x2 + 2x = (1 + x + x2 ) + 2(x + 1) 2

3

2

3

2

2

T (1 + x + x + x ) = 1 + x + x + x + 2x + 6x

= (1 + x + x2 + x3 ) + 6(x2 + x + 1)

4(x + 1)

2.

This expresses the images of the basis vectors pi under T as linear combinations of the pi , so the coefficients are the columns of the desired matrix, giving again 2 3 1 0 2 2 2 47 60 1 4 5. 0 0 1 6 0 0 0 1

Solutions for Chapter 2

59

b. We give a solution analogous to the first above. In the “standard basis” 1, x, . . . , xk of Pk , the matrix of T is the diagonal matrix 2 3 1 0 ... 0 60 2 ... 0 7 A=6 . .. 7 4 ... ... . . . . 5 0 0 ... k + 1

2

1 6 0 6 . 4 .. |

0

1 ... 1 ... .. . . . . 0 ... {z S

1

The change of basis matrix is analogous to the above, and 2 1 1 1 ... 32 32 3 0 1 0 ... 0 1 1 ... 1 60 2 1 ... 0760 2 ... 0 760 1 ... 17 6 6. . . 76 . . . 7 = 60 0 3 ... . . .. 7 .. .. 5 4 .. .. . . .. 5 6 . .. .. . . . 5 4 .. .. 4 .. . . . 0 0 ... k + 1 0 0 ... 1 1 0 0 0 ... {z }| {z } }| A

1 1 1 .. . k+1

S

3

7 7 7, 7 5

where S has 1’s on and above the diagonal and 0’s elsewhere, and S 1 has 1’s on the diagonal, 1’s on the “superdiagonal” immediately above the main diagonal, and 0’s elsewhere. The product has 1’s everywhere above the diagonal, and 1, . . . , k + 1 on the diagonal. 2.7.5 a. The recursive formula for the bn can be written      n bn 0 1 bn 1 0 1 b0 0 1 = = = bn+1 1 2 bn 1 2 b1 1 2

n



1 . 1

To compute the power ofpthe matrix, we use eigenvalues and eigenvectors. The eigenvalues are 1 ± 2, and a basis of eigenvectors is   1p 1p , . 1+ 2 1 2 This leads to the change of basis matrix  1p 1p S= , with inverse S 1+ 2 1 2

1

p p 2 p2 1 = 2+1 4

1 . 1

Indeed, you can check by multiplication that p p p    2 p2 1 1 0 1 1p 1p 1+ 2 0p = . 2+1 1 1 2 1+ 2 1 2 0 1 2 4 Thus S



0 1 1 2

n

1



0 1 1 2

n

S=



p 1+ 2 0p 0 1 2

and finally p n (1 + 2) 0p =S S 1 0 (1 2)n p p p p  ( 2 1)(1 + 2)n + ( 2 + 1)(1 2 p p n+1 p = 4 ( 2 1)(1 + 2) + ( 2 + 1)(1 

p n 2) p n+1 2)

n

=



(1 +

p n 2) 0

p (1 + 2)n p n+1 (1 + 2)

(1

(1 (1

0p , 2)n

p n 2) p n+1 . 2)

60

Solutions for Chapter 2

Multiplying this by



b0 b1



1 we find 1 p (1 + 2)n + (1 bn = 2 =

p n 2)

p b. Since |1 2| < 1, it will contribute practically nothing to b1000 , and p 1 1000 (1 + 2) has the same number of digits and the same leading digits 2 as b1000 . You will find that your calculator will refuse to evaluate this, but using logarithms base 10 for a change, you find log10 b1000 ⇡ 382.475, so b1000 has 382 digits, starting with 2983. We have

2

0 A2 = 4 1 1 2 1 A3 = 4 1 2

0 1 2 1 2 3

3 1 15 2 3 1 25 4

c. This time the relevant matrix description of the recursion is 2 3 2 32 3 2 3n 2 3 cn 0 1 0 cn 1 0 1 0 1 4 cn+1 5 = 4 0 0 1 5 4 cn 5 = 4 0 0 1 5 4 1 5 . cn+2 1 1 1 cn+1 1 1 1 1 2 3 0 1 0 Set A = 4 0 0 1 5. One can easily check that 1 1 1 A3

A2

A = I,

2 so the roots of the polynomial 3 These eigenvalues are approximately 1

⇡ 1.83926 . . .

and

A basis of eigenvectors is

1 are eigenvalues of the matrix.1

2,3

2

S=4

1 1 2 1

⇡

1 2 2 2

.419643 ± .6062907. 1 3 2 3

3 5

and we can compute the high powers as above by 2 n 3 0 0 1 n An = S 4 0 0 5 S 1. 2 n 0 0 3 1

How did we find the coefficients x = y = z = 3

1 for the equation

2

A + xA + yA + zI = 0? This equation corresponds to 2 1+z 1+y 4 1+x 2+x+z 2 + x + y 3 + 2x + y

3 2 1+x 0 2+x+y 5 = 40 4 + 2x + y + z 0

0 0 0

3 0 05. 0

A glance just at the top line immediately gives the answer. Normally one would not expect a system of nine equations in three unknowns to have a solution! The Cayley-Hamilton theorem (Theorem 4.8.27) says that every n ⇥ n matrix satisfies a polynomial equation of degree at most n, whereas you would expect that it would only satisfy an equation of degree n2 .

Solutions for Chapter 2

Computing S we find

1

61

is what computers are for, but when the dust has settled, 2 3 2 3 1 .4356 S 1 4 1 5 ⇡ 4 .28219 .359i 5 , 1 .28219 + .359i

and, as above, cn ⇡ .4356 ·

n 1

+ (.28219

n 2

.359i) ·

+ (.28219 + .359i) ·

n 3.

Finally, to find the number of digits of c1000 , only the term involving is relevant, and we resort to logarithms base 10:

1

log .4356 + 1000 log 1.83926 ⇡ 264.282. Thus c1000 has 265 digits.

Inequality 1 follows from the second paragraph of the proof of Theorem 2.7.10. Inequality 2 is true because A and B have only positive entries, A B, and ~ vB has only positive entries.

2.7.7 a. Let A and B be square matrices with A the leading eigenvalue of A and B the leading eigenvalue of B. First let us see that if A B > 0, then A vA and ~vB with B . Choose strictly positive unit eigenvectors ~ eigenvalues A and B ; then A B: 1

A

=|

vA | A~

= |A~vA |

z}|{

2

|A~vB |

z}|{

|B~vB | = |

vB | B~

=

B.

(This result is of interest in its own right.) Let An be the matrix obtained from A by adding 1/n to every entry. Then An > 0 and by Theorem 2.7.10, An has an eigenvector vn 2 , where is the set of unit vectors in the positive quadrant Q. Moreover, by the argument above, the corresponding eigenvalues n are nonincreasing and positive, so they have a limit 0. Since is compact, the sequence n 7! vn has a convergent subsequence that converges, say to v 2 (not necessarily in ). Passing to the limit in the equation An vn = n vn , we see that Av = v. b. By part a, the matrix A has an eigenvector v 2 with eigenvalue 0. The vector v is still an eigenvector for An , this time with eigenvalue n

. Thus in this case where An > 0, we must have v 2 , i.e., v > 0, and > 0, hence > 0. Moreover, if µ is any other eigenvalue of A, then µn is an eigenvalue of An , so |µ|n < n , hence |µ| < . n

2.8.1 The partial derivatives are D1 F1 =

sin(x

y), D2 F1 = sin(x

D1 F2 = cos(x + y)

y)

1,

1, D2 F2 = cos(x + y),

so D1,1 F1 = D2,2 F1 =

cos(x

y);

D1,2 F1 = D2,1 F1 = cos(x

D1,1 F2 = D2,2 F2 = D1,2 F2 = D2,1 F2 =

sin(x + y).

y)

62

Solutions for Chapter 2

Therefore, the absolute value of each second partial is bounded by 1, and there are eight second partials, so p [DF(u)] [DF(v)]  8 · 12 |u v|; p using this method we get M = 2 2, the same result as equation 2.8.62. 2.8.3 a. Use the triangle inequality: |x| = |x |y| = |y

y + y|  |x

x + x|  |y

y| + |y|

|x|

imply

x| + |x|

|y|

|y|  |x

y|

|x|  |x

y|

imply

|x|

|y|  |x

y|.

p b. Choose C > 0, and find x such that 1/(2 x) > C, which will happen for x > 0 sufficiently small. From the definition of the derivative, p p y x 1 p = lim . 2 x y!x y x Thus taking y > x sufficiently close to x, we will find p p | y x| (C 1)|y x|. Since this is possible for every C > 0, we see that there cannot exist a p p number M such that | y x|  M |y x| for all x, y > 0. 2.8.5 a. Newton’s method gives xn+1 =

2x3n + 9 , which leads to 3x2n

25 46802 = 2.083, x2 = = 2.08008, 12 22500 307548373003216 x3 = ⇡ 2.08008382306424. 147853836270000 x0 = 2, x1 =

b. In this case, h0 = We have f (x0 ) = 23

23 9 1 = , 2 · 22 12

 1 U1 = 2, 2 + . 6

1, f 0 (x0 ) = 3 · 22 = 12, and finally ✓ ◆ 1 M = sup |f 00 (x)| = 6 2 + = 13. 6 x2U1 9=

c. Just compute: |f (x0 )|M 1 · 13 1 = = 0.09027 < , |f 0 (x0 )|2 122 2 so we know that Newton’s method will converge – not that there was much doubt after part a. 2.8.7 a. The Matlab command newton([cos(x1)+x2-1.1; x1+cos(x1+x2)-.9], [0; 0], 5)

Solutions for Chapter 2

63

produces the output ◆ ✓ ◆ ✓ ◆ 0.10000000000000 0.10000000000000 0.09998746447058 , , , 0.10000000000000 0.10499583472197 0.10499458325724 ✓ ◆ ✓ ◆ 0.09998746440624 0.09998746440624 , . 0.10499458332900 0.10499458332900

✓

We see that the first nine significant digits are unchanged from the third to the fourth step, and that all 15 digits are preserved from the fourth to the fifth step. b. Set

Then

h ⇣ ⌘i  Df x = y

so

 

1 

⇣ ⌘  cos x + y 1.1 f x y = x + cos(x + y) .9 . sin x sin(x + y)

1

h ⇣ ⌘i  0 1 Df 00 = , 1 0

1 , sin(x + y)

h0 = a1 =

Now we estimate the Lipschitz ratio:  sin u1 1 sin u2 sin(u1 + v1 ) sin(u1 + v1 ) 1 sin(u2 + v2 ) |(u1

= (u1

|u1 u2 | u2 ) + (v1

v2 )| |(u1

2

u2 ) + 2((u1

u2 ) + (v1 p so we may take M = 5. The quantity

2 1/2

v2 ))

v2 )| p  5 (u1

0.1 . 0.1 1 sin(u2 + v2 )

u2 )2 + (v1

v2 )2

1/2

,

p p p 2 p 2 2 10 = 5 ( 2) = ⇡ .632 . . . 10 10 is not smaller than 1/2, so we cannot simply invoke Kantorovich. There are at least two ways around this. One  is to use the norm of the p 0 1 derivative rather than the length: the norm of is 1, not 2, so this 1 0 improves the estimate by a factor of 2, which is enough. Another is to look at the next step of Newton’s method, which gives |f (a1 )| ⇡ .005. We can use the same Lipschitz ratio. The derivative L = [Df (a1 )] is, by the same Lipschitz estimate, of the form  a 1+b 1+c d M |f (a0 )| [D~f (a0 )]

Using the norm rather than the length is justified by Theorem 2.9.8. The norm is defined in Definition 2.9.6.

0 u2 ) + (v1



1 2

with |a|, |b|, |c|, |d| all less than 0.1. So | det L| > .75, and we have |L 1 | < 2.2/.75 < 3. This gives p 2 M |f (a1 )| L 1 < 5 · (.005) · 9,

64

Solutions for Chapter 2

which is much less than 1/2.

Solution 2.8.9: This is an exceptionally easy case in which to apply Kantorovich’s theorem. What would we get if we used the norm (discussed in Section 2.9) instead of the length? Since the norm of the identity is 1, we p 1 would get a2 + b2 + c2  . We 4 don’t know how sharp this is, but 1/2 is certainly too big.

2.8.9 We will start Newton’s method at the origin, and try to solve the system of equations 0 1 0 1 0 1 x x + y2 a 0 F @ y A = @ y + z2 b A = @ 0 A . z z + x2 c 0

First, a bit of computation will show you that M = 2 is a global Lipschitz ratio for [DF (x)]. Next, the derivative p at the origin is the identity, so its inverse is also the identity, with length 3. So we find that the condition for Kantorovich’s theorem to work is p p p 1 1 a2 + b2 + c2 · ( 3)2 · 2  , i.e., a2 + b2 + c2  . 2 12 2.8.11 a. Set p(x) = x5 p(x) = 0 is

x

N (a) = a

6. One step of Newton’s method to solve a5 a 6 4a5 + 6 = . 5a4 1 5a4 1

In particular, x1 = N (2) = 134 79 , with a first Newton step h0 = b. To apply Kantorovitch’s theorem, we need to show that

Solution 2.8.11, part b: Since U1 = B|~h0 | (x1 ), the sup in equation (1) for M is over U 1 = [x1 |~ h0 |, x1 + |~ h0 |]  134 24 134 24 = , + 79 79 79 79   110 158 110 = , = ,2 . 79 79 79 For an initial guess x0 , Kantorovich’s theorem requires a Lipschitz ratio in U 1 ; for an initial guess x1 , it requires a Lipschitz ratio M1 in U 2 .

M p(2) < 1/2, (p0 (2))2

where M =

sup x2[110/79, 2]

24/79.

|p00 (x)|.

(1)

The second derivative of p is 20x3 ; since 20x3 is increasing, the supremum M is 20 · 23 = 160. We have p(2) = 24 and p0 (2) = 79. Unfortunately, 24 · 160 1 ⇡ .6152860119 > , 2 79 2

so Kantorovitch’s theorem does not apply. But it does apply at x1 . Using decimal approximations, we find p(x1 ) ⇡ 1.09777 and p0 (x1 ) ⇡ 27.0579. The supremum M1 of p00 (x) over U 2 is ✓ ◆ ✓ ◆3 134 134 M1 = p00 (x1 ) = p00 = 20 ⇡ 98, giving 79 79 M1 p(x1 ) 98 · 1.1 ⇡ ⇡ .148 < 1/2. (p0 (x1 ))2 (27)2

The original M works also: M p(x1 ) 160 · 1.097 ⇡ (p0 (x1 ))2 724 ⇡ .24 < 1/2.

2.8.13 a. Direct calculation gives  2u1 |[DF(u)] [DF(v)]| = u1  2(u1 = u1 p So p 5 is a global Lipschitz ratio. 10.

2 2u2 1 u2 1 v1 ) 2(u2 v1 u2



2v1 v1

v2 ) v2

2 2v2 1 v2 1 p = 5|u v|.

The second derivative approach gives

Solutions for Chapter 2

b. We have ✓ ◆  5 1 F = , 1 1 so

65

h ⇣ ⌘i  8 2 DF 51 = , 0 4

h ⇣ ⌘i DF 51

1

 1 4 = 32 0

2 , 8

   1 4 2 1 3/16 = 8 1 1/4 32 0 ✓ ◆  ✓ ◆ 5 3/16 77/16 x1 = + = , 1 1/4 5/4

~h0 =

Figure for Solution 2.8.13. The curve of equation x2 + y 2

2x

15 = 0

is ⇣ the ⌘ circle of radius 4 centered at 1 . The curve of equation 0 xy

x

y=0

is a hyperbola with asymptotes the lines x = 1 and y = 1. Exercise 2.8.13 uses Newton’s method to find an intersection of these curves. There are four such ⇣ ⌘ intersections, but starting at 51 leads to a sequence that rapidly converges to the intersection in the small shaded circle.

and |~h0 | = 5/16. For Kantorovich’s theorem to guarantee convergence, we must have p 84 p 1 2 2 5 , 32 2 p p which is true (and is still true if we use 10 instead of 5). c. By part b, there is a unique root in the disc of radius 5/16 around ✓ ◆ ✓ ◆ 77/16 4.8 ⇡ . 5/4 1.25 See the picture in the margin. 2.8.15 The system of equations to be solved is 1+a + a2 y = 0 ⇣⇣ ⌘ ⌘ y , F = a + (2 )y + az = 0 z a + (3 )z = 0 ⇣ ⌘ ⇣ ⌘ for the unknowns ( yz , ), starting at ( 00 , 1); the number a is a parameter. We have 0 1 a ⇣⇣ ⌘ ⌘ ⇣⇣ ⌘ ⌘ 0 ,1 = @ aA, F 0 , 1 = |a|p3. F 0 0 a Now compute

2 2 3 2 2 a 0 1 h a h ⇣⇣ ⌘ ⌘i ⇣⇣ ⌘ ⌘i y 0 DF z , =4 2 a y 5, DF 0 , 1 = 4 1 0 (3 ) z 0 h ⇣⇣ ⌘ ⌘i 1 The inverse DF 00 , 1 is computed by row reducing 2 2 2 3 1 0 0 0 1 a 0 1 1 0 0 60 1 0 4 1 a 5 0 0 0 0 1 0 to get 4 0 2 0 0 0 1 0 0 1 1 a2 This gives h ⇣⇣ ⌘ ⌘i DF 00 , 1

1 2

=1+

3 1 0 5. 0

0 a 2

a 2 1 2 a3 2

3 7 5

a2 1 a6 9 a2 a6 + + 1 + a4 + = + + a4 + . 4 4 4 4 4 4

66

Solutions for Chapter 2

Finally, h ⇣⇣ ⌘ DF yz1 , 1

1

⌘i

h ⇣⇣ ⌘ ⌘i DF yz2 , 2 2 2 3 0 0 0 = 4 0 y2 y1 5 1+ 2 0 z2 + z1 1+ 2 p 2 = 2( 1 y2 )2 + (z1 z2 )2 2 ) + (y1 ✓✓ ◆ ◆ ✓✓ ◆ ◆ p y1 y2  2 , 1 , 2 z1 z2

So the inequality in Kantorovich’s theorem will be satisfied if ✓ ◆ p 9 a2 a6 p 1 4 |a| 3 + +a + 2 . 4 4 4 2 Since the expression on the left is an increasing function of |a|, if it is satisfied for some particular a0 > 0, it will be satisfied for |a| < a0 . The best a0 is the solution of this 7th degree equation, a0 ⇡ .0906; arithmetic will show that a0 = .09 does satisfy the inequality. 1)2 , it is always true that

2.9.1 For the polynomial f (x) = (x f 0 (a)

f 0 (b) = 2(a

1)

2(b

1) = 2|a

b|,

so M = 2 is a Lipschitz ratio, and certainly the best ratio, realized at every pair of points. Since f (0)M 1·2 1 = = , f 0 (0)2 ( 2)2 2 inequality 2.8.53 of Kantorovich’s theorem is satisfied as an equality. One step of Newton’s method is an+1 = an +

(1 an )2 an + 1 = , 2(1 an ) 2

so if we assume by induction that an = 1 we find an+1 = Solution 2.9.3: In the second inequality, we choose one unit ~ v to make |A~ v| as large as possible, and another to make |B~ v| as large as possible. The sum we get this way is certainly at least as large as the sum we would get if we chose the same ~ v for both terms.

1

1/2n , which is true for n = 0,

1/2n + 1 =1 2

1/2n+1 ,

and hn = 1/2n+1 , which is exactly the worst case in Kantorovich’s theorem. 2.9.3 Just apply the definition of the norm, and the triangle inequality in the codomain: kA + Bk = sup |(A + B)~v| = sup |A~v + B~v| |~ v|=1

|~ v|=1

 sup |A~v| + |B~v|  sup |A~v| + sup |B~v| = kAk + kBk. |~ v|=1

|~ v|=1

|~ v|=1

Solutions for Chapter 2

67

2.9.5 a. In part b, we will solve this equation by Newton’s method. For  x x now, you might guess that there exists a solution of the form . x x Substitute that into the equation, to find  2   2x 2x2 x x 1 1 + = , i.e., 2x2 + x 1 = 0. 2x2 2x2 x x 1 1 This has two solutions, 1 and 1/2, which lead to two solutions of the original problem:   1 1 1/2 1/2 and . 1 1 1/2 1/2

Solution 2.9.5, part b: If we wanted to consider [DF (I)] and [DF (I)] 1 as matrices, we would have to consider an element of Mat (2, 2) as an element of R4 and the matrices in question would be 4⇥4, with 16 entries, making computing the length cumbersome.

b. We will use the length of matrices, but the norm for elements of L(Mat (2, 2), Mat (2, 2)): i.e., for linear maps from  Mat (2, 2)  to Mat (2, 2). 1 1 0 0 2 The equation we want to solve is X + X = , so the 1 1 0 0 relevant mapping is the (nonlinear) mapping F : Mat (2, 2) ! Mat (2, 2) given by  1 1 2 F (X) = X + X . 1 1 Clearly |F (I)| =



1 1

1 1

= 2. Since

[DF (I)](H) = HI + IH + H = 3H, we have [DF (I)] 1 H = H/3, i.e., [DF (I)] function id : Mat (2, 2) ! Mat (2, 2). Thus [DF (I)]

1

=

Inequality 1: The first inequality is the triangle inequality; the second inequality is Schwarz. Remember that we are taking the sup of H such that |H| = 1.

is 1/3 times the identity

1 , 3

which squared gives 1/9. Moreover, k[DF (A)] [DF (B)]k  2|A [DF (A)]

1

B|:

[DF (B)] = sup (AH + HA + H)

(BH + HB + H)

|H|=1

= sup H(A

B) + (A

B)H

|H|=1

 sup

⇣

H(A

B) + (A

|H|=1

⇣  sup |H||A |H|=1

B| + |A

So the Kantorovich inequality is satisfied: 2 · 1/9 · 2 < 1/2.

B)H

⌘

⌘ B||H| = 2|A

(1) B|.

68

Solutions for Chapter 2

2.9.7 First, let us compute the norm of f (A0 ) = kf (A0 )k = sup

|~ x|=1

= sup |~ x|=1

= sup |~ x|=1



1 1

1 1

p 2(x1



x1 x2



= sup |~ x|=1

1 1 

1 : 1 x1 x1

x2 x2

x2 )2

q 2(x21 + x22 )

4x1 x2 .

If we substitute cos ✓ for x1 and sin ✓ for x2 (since x21 + x22 = 1) this gives q p p sup 2(cos2 ✓ + sin2 ✓) 4 cos ✓ sin ✓ = sup 2 2 sin 2✓ = 4 = 2.

The following (virtually identical to equation 2.9.23) shows that the derivative of f is Lipschitz with Lipschitz ratio 2: ⇣ ⌘ k[Df (A1 )] [Df (A2 )]k = sup [Df (A1 )] [Df (A2 )] B = sup A1 B + BA1 + |B|=1



|B|=1

0 1 1 0

= sup |(A1

A2 )B + B(A1

 sup |A1

A2 ||B| + |B||A1

A2 B

BA2



0 1 1 0

A2 )|

|B|=1 |B|=1

 sup 2|B||A1 |B|=1

= 2|A1

A2 |

A2 |

A2 |.

Finally we need to compute the norm of [Df (A0 )] 1 63



8 1

1 8

1

=

1 63



8 1

1 : 8

2 8x x2 3 1 6 63 7 = sup 4 63 5 x1 8x2 |~ x|=1 + 63 s✓ 63 ◆2 ✓ ◆2 8x1 x2 8x2 x1 = sup + 63 63 |~ x|=1 q 1 = sup 64x21 16x1 x2 + x22 + 64x22 16x2 x1 + x21 |~ x|=1 63 q 1 = sup (x21 + x22 ) + 64(x21 + x22 ) 32x1 x2 . |~ x|=1 63

Switching again to sines and cosines, we get q 1 1p 9 1 sup 65(cos2 ✓ + sin2 ✓) 16 sin 2✓ = 65 + 16 = = . 63 63 63 7 1 2 Squaring this norm gives k[Df (A0 )] k ⇡ 0.0204.

Solutions for Chapter 2

69

Thus instead of 2 · |{z} 2.8 · |{z}

|f (A0 )|

M

(.256)2 | {z }

|[Df (A0 )]

= .367 < .5,

1 |2

obtained using the length (see equation 2.8.73 in the text), we have 2 · |{z} 2 · |{z}

|f (A0 )|

Solution 2.10.1: Remember, the inverse function theorem says that if the derivative of a mapping is invertible, the mapping is locally invertible.

M

0.0204 | {z }

k[Df (A0 )]

= 0.08166. 1 k2

2.10.1 Parts a, c, and d: The theorem does not guarantee that these functions are locally invertible with di↵erentiable inverse, since the derivative is not invertible: for the first two, because the derivative is not square; for the third, because all entries of the derivative are 0. (In fact, Exercise 2.10.4 shows that the functions are definitely not locally invertible with di↵erentiable inverse.)  ⇣ ⌘ 2 1 b. Invertible: the derivative at 11 is , which is invertible; 2 0 hence the mapping is invertible. 0 1 1 e. The same mapping as in part d is invertible at @ 1 A; the derivative 1 2 3 1 1 1 at that point is 4 2 0 0 5, which is invertible. 0 0 2

2.10.3 a. Because sin 1/x isn’t defined at x = 0, to see that this function is di↵erentiable at 0, you must apply the definition of the derivative: h/2 + h2 sin(1/h) 1 1 1 = + lim h sin = . h!0 h 2 h!0 h 2

f 0 (0) = lim

b. Away from 0, you can apply the ordinary rules of di↵erentiation, to find ✓ ◆ 1 1 1 1 1 1 1 0 2 f (x) = + 2x sin + x cos = + 2x sin cos . 2 x x x2 2 x x As x approaches 0, the term 2x sin x1 approaches 0, but the term cos x1 oscillates infinitely many times between 1 and 1. Even if you add 1/2, there still are x arbitrarily close to 0 where f 0 (x) < 0. Since f is decreasing at these points, the function f is not monotone in any neighborhood of 0, and thus it cannot have an inverse on any neighborhood of 0. c. It doesn’t contradict Theorem 2.10.2 because f is not increasing or decreasing. More important, it doesn’t contradict Theorem 2.10.7 (the inverse function theorem), because that theorem requires that the derivative be continuous, and although f is di↵erentiable, the derivative is not continuous.

70

Solutions for Chapter 2

2.10.5 a. Two di↵erent implicit functions exist, both for x  find them by computing y 2 + y + 3x + 1 = (y 2 + y + 1/4) + 3x + 3/4 = 0,

1/4. We

so that

(y + 1/2)2 =

Solution 2.10.5, part b: Does this seem contradictory? After all, we know that given x = 1/4 we can use p 4y = ± 3x 3/4 1/2

to compute y = 1/2, and here is the implicit function theorem telling us there’s no implicit function there! But the implicit function theorem doesn’t talk about implicit functions at points, it talks about neighborhoods. At x = 1/4 + ✏, with ✏ > 0, there is no implicit function.

3(x + 1/4), p giving the implicit functions y = ± 3x can’t be negative, we have x  1/4.

3/4

1/2. Since

3x

3/4

b. To check that this result ⇣agrees ⌘ with the implicit function theorem, x we compute the derivative of f y = y 2 + y + 3x + 1, getting h ⇣ ⌘i Df x = [3 2y + 1]. y Defining y implicitly as a function of x means considering y the pivotal variable, so the matrix corresponding to [D1 F(c), . . . , Dn k F(c)] is [2y +1], which is invertible if y 6= 1/2. This value for y gives x 6= ✓1/4. Given ◆ any ⇣ ⌘ 1/4 2 0 point x , there y0 satisfying y0 +y0 +3x0 +1 = 0 except the point 1/2 ⇣ ⌘ ⇣ ⌘ x 0 is a neighborhood of f x y0 in which f y = 0 expresses y implicitly as a function of x. c. Since y is the pivotal unknown, we write   1 2y + 1 3 1 3 1 L= , L = . 0 1 2y + 1 0 2y + 1 p

Equations 1: In the first equation we are using equation 2.10.26.

So at the point with coordinates x = 12 , y = 32 1 , we have  1 1 p3 1 p 13 L 1=p , |L 1 | = p 13, giving |L 1 |2 = . 3 3 3 0 3 The derivative of f is Lipschitz with Lipschitz ratio 2, so the largest radius R of a ball on which the implicit function g is guaranteed to exist by the implicit function theorem satisfies 1 3 2= , so R = . (1) 2R|L 1 |2 52 5 d. If we choose the point with coordinates x = 13 4 , y = 2 , we do a bit better; in this case  1 1 23 18 3 L 1= , giving |L 1 |2 = and R = . 6 0 6 18 92

0 1 ✓ 2 ◆ x ⇣ ⌘ x 2.10.7 Yes. Set F @ y A = . The function g(t) = 0t is an y t t ✓ ◆ ⇣ ⌘ g(t) appropriate implicit function, i.e., F = 00 in a neighborhood of t 0 1 0 @ 0 A. But the implicit function theorem does not say that an implicit 0

Solutions for Chapter 2

71

0 1 2 !3  0 x 2x 0 0 @ A 4 5 function exists: at 0 the derivative DF y = is 0 1 1 0 t  0 0 0 ; this matrix is not onto R2 . 0 1 1 ◆ ⇣ ⌘ ✓ x + y + sin(xy) 2.10.9 The function F x = maps the origin to the y sin(x2 + y)  1 1 origin, and [DF(0)] = . Certainly F is di↵erentiable with Lipschitz 0 1 ⇣ ⌘ a derivative. So it is locally invertible near the origin, and the point 2a will be in the domain of the inverse for |a| sufficiently small.

2.10.11 a. If you set up the two multiplications AHi,j and Hi,j A, you will see that T (Hi,j ) = (ai + aj )Hi,j . b. The Hi,j form a basis for Mat (n, n). Their images are all multiples of themselves, so they are linearly independent unless any multiple is 0, i.e., unless ai = aj for some i and j. It follows from the chain rule that the derivative of the inverse is the inverse of the derivative.

c. False. If there were such a di↵erentiable map, its derivative would be the inverse of T : Mat (n, n) ! Mat (n, n), where T (H) = BH + HB, and 2 3 1 0 0 B = 4 0 1 05. 0 0 1 But T is not invertible, since a1 =

Solution 2.10.15, part a: We can see that the columns of the derivative are linearly independent because the two entries of the first column have the same sign and the two entries of the second column have opposite signs, or because the determinant is ex (ey + e y ), which never vanishes.

1 and a2 = 1, so a1 = a2 . 0 1 x 2.10.13 False. For example, the function f @ y A = y + z satisfies the x hypotheses but does not express x in any way. For the implicit function theorem to apply, we would need D1 f 6= 0. h ⇣ ⌘i  x e ey 2.10.15 a. The derivative of F is given by DF x = x . y e e y The columns of this matrix are linearly independent, so the inverse function theorem guarantees that F is locally invertible. b. Di↵erentiate F [D(F

1

so

[DF

1

F = I using the chain rule:

F)(a)] = [DF 1

1

(F(a))][DF(a)] = [DF

(b)] = [DF(a)]

1

=

ea1 (ea2

1 +e

a2 )

1

(b)][DF(a)] = I,



e a2 ea1

ea2 . ea1

2.10.17 We will assume that the function f : [a, b] ! [c, d] is increasing. If it isn’t, replace f by f .

72

Solutions for Chapter 2

1. Pick a number y 2 [c, d], and consider the function fy : x 7! f (x) which is continuous, since f is. Since fy (a) = c

y  0 and fy (b) = d

y

y,

0,

by the intermediate value theorem there must exist a number x 2 [a, b] such that fy (x) = 0, i.e., f (x) = y. Moreover, this number x is unique: if x1 < x2 and both x1 and x2 solve the equation, then f (x1 ) = f (x2 ) = y, but f (x1 ) < f (x2 ) since f is increasing on [a, b]. So set g(y) 2 [a, b] to be the unique solution of fy (x) = 0. Then by definition we have 0 = fy g(y) = f g(y) This says that

y,

so f g(y) = y.

⇣ ⌘ f g f (x) = f (x),

for any x 2 [a, b]. But any y 2 [c, d] can be written y = f (x), so f (g(y)) = y for any y 2 [c, d]. We need to check that g is continuous. Choose y 2 (c, d) (the endpoints c and d are slightly di↵erent and a bit easier) and ✏ > 0. Consider the sequences n 7! x0n and n 7! x00n defined by x0n = g(y

1/n) and x00n = g(y + 1/n).

These sequences are respectively increasing and decreasing, and both are bounded, so they have limits x0 and x00 . Applying f (which we know to be continuous) gives ✓ ◆ 1 f (x0 ) = lim f (x0n ) = lim y =y n!1 n!1 n ✓ ◆ 1 f (x00 ) = lim f (x00n ) = lim y + = y. n!1 n!1 n

Thus x0 = x00 = g(y), and there exists N such that x00N x0N < ✏. Now set = 1/N ; if |z y| < , then g(z) 2 [x0N , x00N ], and in particular, |g(z) g(y)| < ✏. This is the definition of continuity. 2. Choose y 2 [c, d], and define two sequences n 7! x0n , n 7! x00n in [a, b] by induction, setting x00 = a, x000 = b, and ✓ 0 ◆ 8 x0n + x00n xn + x00n 00 00 0 > > , xn+1 = xn if y f ; < xn+1 = 2 2 ✓ 0 ◆ 0 00 > xn + x00n > : x00n+1 = xn + xn , x0n+1 = x0n if y < f . 2 2

Then both sequences converge to g(y). Indeed, the sequence n 7! x0n is nondecreasing and the sequence n 7! x00n is nonincreasing; and both are bounded, so they both converge to something. Moreover, x00n

x0n =

b

a 2n

,

Solutions for Chapter 2

73

so they converge to the same limit. Finally, f (x0n )  y

and f (x00n )

y

for all n; passing to the limit, this means that lim f (x0n ) = lim f (x00n )

y

n!1

n!1

y,

so all the inequalities are equalities. 3. Suppose y = f (x), and that f is di↵erentiable at x, with f 0 (x) 6= 0. Define the function k(h) by g(y + h) = x + k(h). Since g is continuous, limh!0 k(h) = 0. Now g(y + h) h

g(y)

=

x + k(h) h

x

=

k(h) f (x + k(h))

Take the limit as h ! 0. Since lim

h!0

k(h) f (x + k(h))

f (x)

=

1 , f 0 (x)

we see that g(y + h) h also exists, and the limits are equal. lim

h!0

g(y)

= g 0 (y)

f (x)

.

Solutions for review exercises, chapter 2

2.1 a. Row reduction gives 2

1 1 41 0 1 a

3 2 1 a 1 1 2 b 5 ! 40 1 1 b 0 a 1

We will consider the cases 2 1 40 0

3 2 3 1 a 1 0 2 b 3 b a5 ! 40 1 3 a b 5. 2 b a 0 0 3a 1 a(b a) a = 1/3 and a 6= 1/3. If a = 1/3, we get 3 0 2 b 1 1 3 b 5, 3 1 0 0 3 (b 13 )

If a = 1/3 and b = 1/3, there is no pivotal 1 in the 4th column or the 3rd column. If a = 1/3 and b 6= 1/3, then by further row reducing we can get a pivotal 1 in the 4th column.

so if a = 1/3 and b = 1/3, there are infinitely many solutions. If a = 1/3 and b 6= 1/3, then there are no solutions. If a 6= 1/3, we can further row reduce our original matrix to get a pivotal 1 in the third column. In that case, the system of equations has a unique solution. b. We have already done all the work: the matrix of coefficients (i.e., the matrix consisting of the first three columns) is invertible if and only if a 6= 1/3.

Solution 2.5, part b: If you did not assume that the inverse is upper triangular, you could argue  X Y that it is of the form , Z W where X, like A, is (n k)⇥(n k), and W , like B, is k ⇥ k. Then the multiplication    A C X Y I [0] = [0] B Z W [0] I

2.3 a. The first statement is true; the others are false. b. Statements 1, 2, and 4 are true if n = m.

says that BZ = [0]; since B is invertible, this says that Z = [0]. It also says that BW = I, i.e., that W = B 1 . So now we have   A C X Y [0] B [0] B 1  AX AY + CB 1 = , [0] I

by row operations. We can now add multiples of the last k rows to the first e which row reduces the whole matrix to the n k to cancel the entries of C, identity.  A 1 Y b. If you guess that there is an inverse of the form and [0] B 1 multiply out, you find 2 3 A C 4 5 [0] B .   1 1 A Y I A C +YB [0] B 1 [0] I

where AX = I

and

giving Y =

A

AY +CB 1

CB

1

.

1

= [0],

2.5 a. Saying that a matrix is invertible is equivalent to saying that it can e if be row reduced to the identity. Thus [A, C] can be reduced to [In k , C] and only if A is invertible; likewise, [[0], B] can be reduced to [[0], Ik ] if and only if B is invertible. So   e A C I C M= can be transformed into n k [0] B [0] Ik

So if A 1 C + Y B = [0], you will have found an inverse; this occurs when Y = A 1 CB 1 . 74

Solutions for review exercises, Chapter 2

75

2.7 a. A couple of row operations will bring 2 3 2 3 1 1 1 1 1 1 40 5. a 1 5 to 4 0 1 (a + 3)/4 2 a+2 a+2 0 0 (1 a)(a + 4)/4 Thus the matrix is invertible if and only if (1 a 6= 1 and a 6= 4.

a)(a + 4) 6= 0, i.e., if

b. This is an unpleasant row reduction of 2 3 1 1 1 1 0 0 40 a 1 0 1 05. 2 a+2 a+2 0 0 1

The final result is

(a

2 (a + 2)(a 1 4 2 1)(a + 4) 2a

2.9 a. The vectors



1)

3 0 a 1 a+4 1 5. (a + 4) a

 cos ✓ sin ✓ and are orthogonal since sin ✓ cos ✓   cos ✓ sin ✓ · =0 sin ✓ cos ✓

(Corollary 1.4.8). By Proposition 2.4.16, they are linearly independent, and two linearly independent vectors in R2 form a basis (Corollary 2.4.20). The length of each basis vector is 1: p p cos2 ✓ + sin2 ✓ = 1 and cos2 ✓ + ( sin ✓)2 = 1, so the basis is orthonormal, not just orthogonal. b. The argument in part a applies. Recall that an n ⇥ n matrix whose columns form an orthonormal basis of Rn is called an orthogonal matrix.

We saw in Example 1.3.9 that  cos ✓ sin ✓ sin ✓ cos ✓ corresponds to rotation by ✓ counterclockwise around the origin.

c. Let A = [~v1 , ~v2 ] be any orthogonal matrix. Then ~v1 is a unit vector, cos ✓ so it can be written for some angle ✓. The vector ~v2 must be orsin ✓  sin ✓ thogonal to ~v1 and of length 1. There are only two choices: and cos ✓  sin ✓ . The first gives positive determinant and corresponds to rotacos ✓ tion. Note that Proposition 1.4.14   says that to have positive determinant, cos ✓ sin ✓ must be clockwise from , which it always will be. sin ✓ cos ✓ The second choice gives a negative determinant and corresponds to reflection with respect to a line through the origin that forms angle ✓/2 with the x-axis (see Example 1.3.5).

76

Solutions for review exercises, Chapter 2

2.11 a. Write

Here we write I=



1 0

0 1

A=



1 2

2 1

5 A = 4

4 5

2

and so on.



2 2 3 1 607 7 as 6 405 1 2 3 1 627 7 as 6 425, 1 2 3 5 647 6 as 4 7 , 45 5

1 60 4 0 1 2

1 2 2 1 1 60 to 4 0 0



a b c d

2 3 a 6b7 as 4 5; the question is whether the columns of c d

3 13 14 7 5 are linearly independent. Since this matrix row reduces 14 13 3 3 6 2 77 5, it follows that I and A are linearly independent, and 0 0 0 0

5 4 4 5 0 1 0 0

A2 = 2A + 3I,

A3 = 7A + 6I.

The subspace spanned by I, A, A2 , A3 has dimension 2. b. Suppose B1 and B2 satisfy AB1 = B1 A and AB2 = B2 A. Then (xB1 + yB2 )A = xB1 A + yB2 A = xAB1 + yAB2 = A(xB1 + yB2 ).  a b So W is a subspace of Mat (2, 2). If you set B = and write out c d AB = BA as equations in a, b, c, d, you find a + 2c = a + 2b ,

b + 2d = 2a + b ,

2a + c = c + 2d ,

2b + d = 2c + d.

These equations pretty obviously describe a 2-dimensional subspace of Mat (2, 2), but to make it clear, write the equations b

c=0 ,

a

d=0 ,

a

d=0 ,

In other words, W is the kernel of the matrix 2 3 0 1 1 0 0 17 61 0 4 5 . This matrix row reduces to 1 0 0 1 0 1 1 0

b

c = 0.

2

1 60 4 0 0

0 1 0 0

0 1 0 0

so W has dimension 2, the number of nonpivotal columns. c. If B = aI + bA, then

3 1 07 5, 0 0

BA = (aI + bA)A = aA + bA2 = A(aI + bA) = AB, so clearly V ⇢ W . But they have the same dimension, so they are equal. Solution 2.13: The pivotal columns of T form a basis for Im T (see Theorem 2.5.4). By definition 2.2.5, a column of T is pivotal if the corresponding column of the row-reduced matrix Te contains a pivotal 1.

2.13 a. The matrix 2 3 1 1 3 6 2 42 1 0 4 15 4 1 6 16 5

row reduces to

2

3 1 0 1 10/3 1 4 0 1 2 8/3 1 5 . 0 0 0 0 0

The first two columns are a basis for the image, and the vectors 2 3 2 3 2 3 1 10/3 1 6 2 7 6 8/3 7 6 1 7 6 7 6 7 6 7 6 17,6 0 7,6 07 4 5 4 5 4 5 0 1 0 0 0 1

Solutions for review exercises, Chapter 2

77

form a basis for the kernel (see Theorem 2.5.6). Indeed we find dim(ker ) + dim(img) = 3 + 2 = 5. b. The matrix  2 1 3 6 2 2 1 0 4 1 If we had followed to the letter the procedure outlined in Example 2.5.7, and had not “cleared denominators”, we would have 1 instead of 4 in the first and third vector, and 1 instead of 2 in the second vector. The other nonzero entries would then be fractions.

So the image is all of R2 , tors) is 2 3 2 3 6 67 6 6 7 6 6 47,6 4 5 4 0 0

row reduces to



1 0 3/4 5/2 3/4 . 0 1 3/2 1 1/2

and a basis for the kernel (we cleared denomina3 2 5 27 6 7 6 07,6 5 4 2 0

3 3 27 7 07, 5 0 4

which has dim = 3.

Again, we find dim(ker ) + dim(img) = 3 + 2 = 5.  a b 2.15 Let A = . The equation A> A = I gives c d a2 + c2 = 1,

b2 + d2 = 1,



a So is a unit vector and can be written c   b a is a unit vector orthogonal to , so d c   sin ✓ or cos ✓

ab + cd = 0. 

cos ✓ sin ✓

for some ✓. The vector

it is either sin ✓ . cos ✓

In the first case, A is rotation by angle ✓ (Example 1.3.9); in the second it is reflection in the line with polar angle ✓/2 (see equation 1.3.11).

2

3 p(1) 6 .. 7 6 . 7 6 7 6 p(k) 7 6 7 T (p) = 6 0 7 6 p (1) 7 6 . 7 4 .. 5 p0 (k)

Map T for Solution 2.17

2.17 Let P2k 1 be the space of polynomials of degree at most 2k 1; it can be identified with R2k , by identifying c0 + c1 x + · · · + c2k 1 x2k 1 2 3 c0 . with 4 .. 5. We will apply the dimension formula to the linear mapping c2k 1 T : P2k 1 ! R2k defined in the margin. The kernel of T is {0}: indeed, a polynomial in the kernel can be factored as p(x) = q(x)(x

1)2 . . . (x

k)2 ,

and such a polynomial has degree at least 2k unless q = 0. Since p has degree at most 2k 1, it follows that q must be 0, hence p must be 0, so the kernel of T is {0}. So from the dimension formula, the image of T has dimension 2k, hence is all of R2k . So there is a unique solution to the equations p(1) = a1 , . . . , p(k) = ak and p0 (1) = b1 , . . . , p0 (k) = bk .

78

Solutions for review exercises, Chapter 2

2.19 Just compute away: T (af + bg) (x) = (x2 + 1)(af + bg)00 (x) = (x2 + 1)af 00 (x) + (x2 + 1)bg 00 (x) = (x2 + 1)af 00 (x)

x(af + bg)0 (x) + 2(af + bg)(x) xaf 0 (x)

xbg0 (x) + 2af (x) + 2bg(x)

xaf 0 (x) + 2af (x) + (x2 + 1)bg 00 (x)

xbg0 (x) + 2bg(x)

= aT (f ) + bT (g) (x). 2.21 Suppose that the vectors v1 , . . . , vn form a basis of V , and that w1 , . . . , wm are linearly independent. Then (by Proposition 2.6.15), {v} is one to one and onto, hence invertible, and {w} is one to one, so 1 m ! Rn is a one-to-one linear transformation, so the {w} : R {v} columns of its matrix are linearly independent. But these are m vectors in Rn , so m  n. Similarly, if v1 , . . . , vn form a basis of V , and w1 , . . . , wm span V , then 1 m n {w} : R ! R is onto, so the columns of its matrix are m vectors {v} n that span R . Then m n. Solution 2.23: If your copy of the textbook is the first printing, note that the statement of the exercise has been changed to the following: Let A be an n ⇥ m matrix. Show that if you row reduce the augmented matrix [A> |Im ] to get e> B e > ], the nonzero columns of [A e form a basis for the image of A, A e correspondand the columns of B e form ing to the zero columns of A a basis for the kernel of A.

2.23 Let A be an n ⇥ m matrix. Column reducing a matrix is the same as row reducing the transpose, then transposing back. Moreover, row reducing A can be done by multiplying A on the left by a product of elementary matrices, which is invertible. Further, the matrix [A> | I] is onto, so after row reduction there will be a leading 1 in every row. So for any matrix A we can find an invertible m ⇥ m matrix which we will call H > such that e> | B e > ]. H > [A> | I] = [A

Transposing gives the (n + m) ⇥ m matrix 2 



A AH H= I H

| 6 ~a1 · · · ~ak 6  | e 6 A = e =6 6 B 6 4 ⇤

~0

3

7 7 7 7, 7 | 7 ~b1 · · · ~bl 5 |

where k + l = m and by definition k is the index of the last column whose pivotal 1 is in some row whose index is at most n. The vectors ~a1 , . . . , ~ak are in the image of A, since they are columns of AH, hence linear combinations of the columns of A. Moreover they are linearly independent since each contains a pivotal 1 in a di↵erent row, with 0’s above, and they span the image since H is invertible. So they form a basis for the image of A. The vectors ~b1 , . . . , ~bl (the lower right “corner”) are the last l columns of H. The upper right “corner” says that each of the last l columns of AH is ~0. So ~b1 , . . . , ~bl are in ker A. These vectors are linearly independent (again because they contain pivotal 1’s in di↵erent rows, with 0’s above). They span the kernel by the dimension formula: m=k+l

and m = dim img A + dim ker A = k + dim ker A,

Solutions for review exercises, Chapter 2

79

so dim ker A = l. Thus ~b1 , . . . , ~bl form a basis of ker A. 2.25 It takes a bit of patience entering this problem into Matlab, much helped by using the computer to compute A3 symbolically. The answer ends up being 2 3 2.08008382305190 0.00698499417571 0.08008382305190 4 5. 0 1.91293118277239 0 0 0.17413763445522 2 Note that one step of Newton’s method gives 2 3 2 3 2 + 1/12 0 1/12 2.083 0 0.083 4 0 2 1/12 0 5=4 0 1.916 0 5 0 1/6 2 0 0.16 2

which is very close.

 2x1 1 2x2 1 1 2y1 1 2y2 Lipschitz constant for the derivative of f . 2.27 a. Since



b. The inverse of [Df (x)] is  1 2y 1 [Df (x)] 1 = 4xy 1 1 2x

= 2|x

which at x0 =

so we have

y|, we see that 2 is a

⇣ ⌘ 2 is 3

 1 6 1 , 23 1 4

    1 6 1 1 +5/23 2 + 5/23 = , so x1 = . 1 3/23 3 3/23 23 1 4 p The length of [Df (x0 )] 1 is 54/23, so p 54 1 2 |f (x0 )| [Df (x0 )] 1 M = 2 · 2 · 2 ⇡ 0.289 < , 23 2 so Newton’s method converges, to a point of the disc of radius p 34 ~ |h0 | = ⇡ .25 around x1 . 23 p ✓ ◆ 34 51/23 c. The disc centered at and with radius is guaranteed to 66/23 23 contain a unique root; see the figure in the margin. ~h0 =

Figure for Solution 2.27. The shaded region, described in part c, is guaranteed to contain a unique root.

2.29 First, note that kAk = sup |A~v| = |~ v|=1

sup ~ |w|=1,|~ v|=1

|~ w> ||A~v|

sup ~ |w|=1,|~ v|=1

|~ w> A~v|;

(1)

similarly, ~|= kA> k = sup |A> w ~ |w|=1

~ is a Equation 3: Since (A~ v)> w number, it equals its transpose.

sup ~ |~ v|=1,|w|=1

~| |~v> ||A> w

sup ~ |w|=1,|~ v|=1

~ |. |~v> A> w

(2)

Note also that ~ |~ w> A~v| = | (A~v)> w

>

~ | = |~v> A> w ~ |. | = |(A~v)> w

(3)

80

Solutions for review exercises, Chapter 2

Therefore, if we can show that the inequalities (1) and (2) are equalities ~ , we will have shown that kAk = kA> k. for appropriate ~v, w By definition, kAk = sup |A~v|, and since the set { ~v 2 Rn | |~v| = 1 } is |~ v|=1

compact, the maximum is realized: there exists a unit vector ~v0 such that A~v0 ~0 = |A~v0 | = kAk. Let w . Then |A~v0 | A~v0 ~ >w ~ 0 |A~v0 | = w ~ 0> ~ 0> A~v0 = |~ kAk = |A~v0 | = w |A~v0 | = w w0> A~v0 |. | 0{z } |A~v0 | 1

Therefore,

kAk =

sup ~ |w|=1,|~ v|=1

|~ w> A~v|

and by a similar argument, kA> k =

sup ~ |w|=1,|~ v|=1

~ |. |~v> A> w

It follows that kAk = kA> k. Solution 2.31: We cannot simply invoke the implicit function theorem in part a. Since D1 F (a) = 0, the theorem does not guarantee existence of a di↵erentiable implicit function expressing x and in terms of y and z, but it does not guarantee nonexistence either.

⇣ ⌘ 2.31 a. False. Suppose sin(xyz) z = 0 expresses x implicitly as g yz 0 1 ⇡/2 near a = @ 1 A. Set 1 0 1 0 1 0 ⇣y⌘1 x x g z F @ y A = sin(xyz) z and G @ y A = @ y A . z z z

Since F G goes from R3 to R, its derivative [DF G(a)] is a 1⇥3 (row) matrix.

Then F G = 0, so, by the chain rule, [DF (G(a))][DG(a)] = [0 0 0]. Writing this out in terms of matrices gives ⇣ ⌘ ⇣ ⌘3 2 1 1 0 D g D g h i 1 2 1 1 5 ⇡ ⇡ ⇡ ⇡ ⇡ 1 · 1 · cos · 1 · cos cos 1 40 = [0 0 0]. 1 0 | {z 2} |2 {z 2} |2 {z2 } 0 0 1 D1 F =yz cos xyz D2 F =xz cos xyz D3 F =xy cos xyz 1

This is a contradiction, since the third entry of the product is

1.

b. True. Since the partial derivative with respect to z at a is D3 F (a) = 1, which is invertible, the implicit function theorem guarantees that F (x) = 0 expresses z implicitly as a function g of x and y near a, and that it is di↵erentiable, with derivative given by equation 2.10.29. 2.33 False. This is a problem most easily solved using the chain rule, not the inverse function theorem. Call S the map S(A) = A2 . If such a mapping g exists, then g is an inverse function of S, so S g = I, and since

Solutions for review exercises, Chapter 2

81

g is di↵erentiable, we can apply the chain rule: h

⇣"

1 2 z" }| ✓ 3 DS g 0 DS

The derivative of the identity is the identity everywhere; this uses Theorem 1.8.1, part 2.

2 1

# ⌘i

{ ◆!#h ⇣ 0 3 Dg 3 0

0 3

⌘i

h ⇣ ⌘i 3 0 = D(S g) 0 3  h ⇣ ⌘i 3 0 = DI = I. 0 3

h ⇣ ⌘i 1 2 Thus if the desired function g exists, then DS is in2 1 vertible; if the derivative is not invertible, either g does not exist, or it exists but is not di↵erentiable. From Example 1.7.17, we know that [DS(A)]H = AH + HA, so h ⇣ 1 DS 2

2 1

 ⌘i 1 B= 2

Written out in coordinates, setting B = 

1 2

2 1



  a b a b 1 + c d c d 2

The transformation 

2 B+B 1 



1 2

a b , this gives c d

 2 a b+c =2 1 a d

 a b a b+c 7! c d a d

2 . 1

a+d b+c d

a+d . b+c d

(1)

is not invertible, since the entries are not linearly independent.2 Therefore g does not exist.

2



Note that the matrix of the (1) is not the matrix 2 linear transformation 3 1 1 1 0 6 1 a b+c a+d 0 0 17 7, since ; it is 6 4 a d b+c d 1 0 0 15 0 1 1 1 2

1 6 1 6 4 1 0

1 0 0 1

1 0 0 1

32 3 2 3 0 a a b+c 7 6 7 6 176 b 7 6 a + d 7 7. = 154 c5 4 a d 5 1 d b+c d

This matrix is clearly noninvertible, since the third column is a multiple of the second column.

82

If you want to find out whether a map f is locally invertible, the first thing to do is to see whether the derivative [Df ] (here, [DS]) is invertible. If it is not, then the chain rule tells you that there are two possibilities. Either there is no implicit function (the usual case), or there is one and it is not di↵erentiable. If [Df ] is invertible, the next step is to show that f is continuously di↵erentiable, so that the inverse function theorem applies. In this case, it is an easy consequence of Theorem 1.8.1 (rules for computing derivatives) that the squaring function S is continuously differentiable.

Solutions for review exercises, Chapter 2

h ⇣ ⌘i 1 2 Alternatively, you could show that the derivative DS is 2 1 not invertible by noticing that      1 2 1 1 1 1 1 2 0 0 + = , 2 1 0 1 0 1 2 1 0 0 so that



1 0

h ⇣ 1 1 2 ker Df 1 2

2 1

⌘i .

2.35 If the matrix [Df (xn )] is not invertible, one can pick any invertible matrix A, and use it instead of [Df (xn )] 1 at the troublesome iterate xn : xn+1 = xn

Af (xn ).

This procedure cannot produce a false root x⇤ , since then we would have x⇤ = x⇤

Af (x⇤ ),

which means that Af (x⇤ ) = 0. The requirement that A be invertible then implies that f (x⇤ ) = 0, so x⇤ is indeed a true root. Of course, we can not longer hope to have superconvergence, but the fact that our algorithm doesn’t break down is probably worth the sacrifice. 2.37 a. Suppose first that p1 and p2 have no common factors, and that

By the fundamental theorem of algebra (Theorem 1.6.14), this factoring is possible, but the roots may be complex.

p1 (x)q1 (x) + p2 (x)q2 (x) = 0; we need to show that then q1 and q2 are the zero polynomial. Write p1 (x) = (x

a1 ) . . . (x

ak1 ) and p2 (x) = (x

b1 ) . . . (x

bk2 ).

Since p1 (x)q1 (x) + p2 (x)q2 (x) = 0 and p1 (bi ) 6= 0 for all i = 1, . . . , k2 , we have q1 (b1 ) = · · · = q1 (bk2 ) = 0. Similarly, q2 (a1 ) = · · · = q2 (ak1 ) = 0. If all the ai are distinct, this shows that q2 is a polynomial of degree < k1 that vanishes at k1 points; so it must be the zero polynomial. The same argument applies to q1 if the bi are distinct. Actually, this still holds if the roots have multiplicities. Suppose that q2 is not the zero polynomial. In the equation (x

a1 ) . . . (x

ak1 )q1 (x) + (x

b1 ) . . . (x

bk2 )q2 (x) = 0,

cancel all the powers of (x ai ) that appear in both p1 (x) and q2 (x). There will be some such term (x aj ) left from p1 (x) after all cancellations, since p1 has higher degree than q2 , so if after cancellation we evaluate what is left at aj , the first summand will give 0 and the second will not. This is a contradiction and shows that q2 = 0. The same argument shows that q1 = 0.

Solutions for review exercises, Chapter 2

83

This proves the “if” part of (a). For the “only if” part, suppose that p1 (x) = (x

c)˜ p1 (x) and p2 (x) = (x

i.e., that p1 and p2 have the common factor (x p1 (x)˜ p2 (x)

p2 (x)˜ p1 (x) = (x

so q1 = p˜2 and q2 =

c)˜ p2 (x),

c). Then

c)(˜ p1 (x)˜ p2 (x)

p˜2 (x)˜ p1 (x)) = 0,

p˜1 provide a nonzero element of the kernel of T .

b. If p1 and p2 are relatively prime, then T is an injective (one to one) linear transformation between spaces of the same dimension, hence by the dimension formula (theorem 2.5.8) it is onto. In particular, the polynomial 1 is in the image. But if p1 and p2 have the common factor (x c), i.e., p1 (x) = (x

c)˜ p1 (x) and p2 (x) = (x

c)˜ p2 (x),

then any polynomial of the form p1 (x)q1 (x) + p2 (x)q2 (x) will vanish at c, and hence will not be a nonzero constant. Nathaniel Schenker suggested a di↵erent solution to Exercise 2.39: Row reduce A by multiplying by elementary matrices on the left. In the matrix BA, where B is a product of elementary matrices, there is a pivotal 1 in every row that is not all zeros, so by column operations (multiplication by elementary matrices D on the right; see Exercise 2.3.11), we can eliminate all the nonzero entries of nonpivotal columns. Now we have a matrix whose only entries are 0’s and pivotal 1’s. By further multiplication on left and right by elementary matrices we can move all the pivotal 1’s to make the matrix of e D, e the form Jk . Thus Jk = BA e and D e are products of where B elementary matrices. This gives e 1 Jk D e 1 , or A = QJk P 1 , A=B e e with Q = B 1 and P = D.

2.39 In the domain of A, choose vectors ~v1 , . . . , ~vn k that span the kernel of A (and hence are linearly independent, since the dimension of the kernel is n k, by the dimension formula (Theorem 2.5.8). Add further linearly independent vectors to make a basis ~v1 , . . . , ~vn k , ~vn k+1 , . . . , ~vn . (The theorem of the incomplete basis justifies our adding vectors to make a basis; see Exercise 2.6.9.) In the codomain, define the vectors ~n w

k+1

= A~vn

~n k+1 , . . . , w

= A~vn .

These vectors are linearly independent, because 0=

n X

~i = ai w

i=n k+1

n X

ai A~vi = A

i=n k+1

n X

ai ~vi

i=n k+1

!

Pn implies that i=n k+1 ai ~vi is in ker A, hence it is in the span of the vectors ~v1 , . . . , ~vn k . So we can write n X

i=n k+1

ai ~vi =

n Xk

bi ~vi ,

i=1

which implies that all ai (and all bi ) vanish. ~ n k+1 , . . . , w ~ n can also be completed to form a basis Hence w ~ 1, . . . , w ~n w

~ n k+1 , . . . , w ~ n. k, w

Set ~ n ]. P = [~v1 , . . . , ~vn ] and Q = [~ w1 , . . . , w The ith column of the matrix ~ 1, . . . , w ~ n ] is Q~ei = w ~ i , so Q = [w Q

1

(~ wi ) = Q

1

Q~ei = ~ei .

Both these matrices are invertible, and we have ⇢ 0 Q 1 AP (~ei ) = Q 1 A~vi = Q 1 (~ wi ) = ~ei So Q

1

AP = Jk , or equivalently, A = QJk P

1

.

if i  n if i > n

k k.

84

Solutions for review exercises, Chapter 2

2.41 Start with [A | I] and suppose that you have performed row operations so that there are pivotal 1’s in the first m columns. The matrix then has the form m

2

n

n

61 . 0 6 m 6 0 .. 1 6 6 6 4 m

0

m

m

⇤

⇤

⇤

⇤

n

m

0 1. 0 .. 0 1

3

7 7 7 7. 7 7 5

All the entries marked * are likely to be nonzero; any zeros make the computation shorter. In the general case, to get from the mth step to the (m + 1)th step, you need to perform – n divisions – n(n 1) multiplications – (n 1)(n 1) additions We have n divisions because there are n + 1 possibly nonzero entries in the (m + 1)th line; the pivotal 1 requires no division. We have n(n 1) multiplications because each of the n rows except the (m + 1)th requires n multiplications. You might think that by the same reasoning we have n(n 1) additions, but the m + 1st column of the future inverse does not require an addition because we are adding a number to 0, so we actually have (n 1)(n 1) additions. So to compute A 1 by bringing [A|I] to row echelon form you need ⇣ ⌘ n n + n(n 1) + (n 1)(n 1) = 2n3 2n2 + n operations.

There are at least two ways to compute A and then A 1 ~b, you will use 2n3

2n2 + n + 2n2

1~

b. If you first compute A

1

n = 2n3

operations. (The 2n2 n is for the matrix multiplication A 1 ~b.) But if you row reduce [A|~b] and consider the last column, we have seen (Exercise 2.2.11) that we need n3 + 12 n2 12 n operations, or 2 3 3 2 7 n + n n 3 2 6 operations if we use partial row reduction and substitution. Thus computing A 1 ~b by first computing A 1 and then performing the matrix multiplication requires almost 3 times as many operations as partial row reduction and substitution.

Solutions for Chapter 3

0

1 x1 3.1.1 The derivative of F @ x2 A = x21 + x22 + x23 1 is [2x1 2x2 2x3 ], x3 which is [0 0 0] only at the origin, which does not satisfy F (x) = 0. So the unit sphere given by x21 + x22 + x23 = 1 is a smooth surface.

3

3.1.3 If the straight line is not vertical, it is the graph of the function f : R ! R given by f (x) = a + bx, i.e., y = a + bx. Since b is a constant, every value of x gives one and only one value of y. If the straight line is vertical, it is the graph of the function g : R ! R given by g(y) = a, i.e., x = a; every value of y gives one and only one value of x. 3

3

3

Figure for Solution 3.1.5. The curves Xc for c=

3, 2.5, . . . , 3.

Note that X0 has a cusp at the origin, which is why it is not a smooth curve.

3.1.5 a. The derivative of x2 +y 3 is [2x, 3y 2 ]. Since this is a transformation R2 ! R, the only way it can fail to be onto is for it to vanish, which happens only if x = y = 0. This point is on X0 , so all Xc are smooth curves for c 6= 0. 2 3 But X0 is not a smooth curve p near the origin. The equation x +y = 0 can 3 be “solved” for x: x = ± y , and this is not a function near y = 0, for instance it isn’t defined for y > 0. It can also be solved for y: y = x2/3 . This is a function, but it is not di↵erentiable at 0. Thus X0 is not a smooth curve. b. These curves are shown in the figure at left. 3.1.7 a. Both ⇣ X⌘a and Yb are graphs of smooth functions representing z as functions of x y , so they are smooth surfaces. b. The intersection Xa \ Yb is a smooth curve if the derivative of 0 1 ✓ 2 ◆ x x + y3 + z a @ A F y = x+y+z b z

is onto (i.e., has rank 2) at every point of Xa \ Yb . The derivative is  2x 3y 2 1 [DF] = , 1 1 1  1 and the only way it can have rank < 2 is if all three columns are , i.e., 1 if 2x = 3y 2 = 1. So all the intersections Xa \ Yb are smooth curves p except the ones that go through the two vertical lines x = 1/2, y = ±1/ 3. The values of a and b for the surfaces Xa and Yb that contain such a point ✓ ◆3 1 1 0 1 p a = z + ± 1/2 p 4 3 @ ±1/ 3 A are ✓ ◆ , 1 1 z b=z+ ± p 2 3 85

86

“Precisely when” means “if and only if”.

Solutions for Chapter 3

p which occurs precisely when a b = 1/4 ± 2/(3 3). 0 1 0 1 1/2 1/2 p p At the points @ +1/ 3 A, @ 1/ 3 A, the plane Yb is the tangent plane z z to the surface Xa .

5

3.1.9 a. We have 4

2

x +x +y

4

✓ ◆2 ✓ 1 2 y = x + + y2 2 2

1 2

◆2

1 . 2

So we can take −2

2

−5

5

−2

2

−5

p(x) = x2 +

1 2

and q(y) = y 2

1 . 2

b. The graphs of p, p2 , q, and q 2 are shown at top and middle of the figure in the margin. The graph of the function F is shown at the bottom. You can see that the slices at x constant look like the graph of q 2 , raised or lowered, and similarly, slices at y constant look like the graph of p2 , raised or lowered, all with a single⇣ minimum. (Looking ahead to section 3.6, we ⌘ 0 see that the critical point 0 is a saddle point, where p goes up and q p goes down. The other critical points are x = 0, y = ±1/ 2, which are minima.) If you slice this graph parallel to the (x, y)-plane, you will get the curves of Figure 3.1.10 in the text. 0 1 st ⇣ ⌘ s 3.1.11 a. The union is parametrized by t 7! @ st2 A. st3

b. This means eliminating s and t from the equations x = st, y = st2 , and z = st3 . In this case, it is easy to see that t = y/x, s = x2 /y, so that an equation for X is y2 x= or xz = y 2 . z The set defined by this equation contains two lines that are not in X: the x-axis and the z-axis.

Figure for Solution 3.1.9, part b. Top: The graphs of p and p2 . Middle: The graphs of q and q 2 . Bottom: Graph of F .

c. The derivative of 0 1 x f @ y A = xz z

y2

is

2

4Df

!3 x y 5 = [z, z

2y, x],

which evidently never vanishes except at the origin. d. Just substitute into the equation xz = y 2 : r(1 + sin ✓)r(1

sin ✓) = r2 (cos ✓)2 .

The figure below shows the two parametrizations. The parametrization of part a, shown to the right, doesn’t look like much. The left side shows

Solutions for Chapter 3

87

the parametrization of part d; it certainly looks like a cone of revolution. Indeed, if you switch to the coordinates x = u + v, z = u v, the equation becomes u2 v 2 = y 2 . In the form u2 = y 2 + v 2 , you should recognize it as a cone of revolution around the u-axis.

Figure for Solution 3.1.11. Left: The parametrization for part d. Right: The parametrization for part a.

e. The set of noninvertible symmetric matrices ad

2

b = 0, so it is the same cone.

3.1.13 a. The equation is (x b. The equation of Pt is 20 1 x 4@ y A z



a b b d

has equation

a) · ~v = 0. 0

13 2 3 t 1 @ t2 A5 · 4 2t 5 = 0. t3 3t2

c. Writing the above equations in full, we find that the equations of Pt1 \ Pt2 are x + 2t1 y + 3t21 z = t1 + 2t31 + 3t51 x + 2t2 y + 3t22 z = t2 + 2t32 + 3t52 . Two planes x + a1 y + b1 z = c1 and x + a2 y + b2 z = c2 fail to intersect if they coincide or are parallel. They coincide if the equations are the same; they are parallel if x + a1 y + b1 z is a multiple of x + a2 y + b2 z. In either case, the matrix of coefficients  1 a1 b1 1 a2 b2 has row rank 1, so by Proposition 2.5.11 it has rank 1.

To be sure that the planes intersect in a line, we must check that the matrix of coefficients  1 2t1 3t21 1 2t2 3t22 has exactly two pivotal columns (i.e., rank 2). But if t1 6= t2 , the first two columns are linearly independent, so that is true. d. The intersection P1 \ P1+h has equations x + 2y + 3z = 6 x + 2(1 + h)y + 3(1 + h)2 z = (1 + h) + 2(1 + h)3 + 3(1 + h)5 = 6 + 22h + 36h2 + 32h3 + 15h4 + 3h5 .

88

Solutions for Chapter 3

To solve this system, we row reduce the matrix  1 2 3 6 . 1 2 + 2h 3 + 6h + 3h2 6 + 22h + 36h2 + 32h3 + 15h4 + 3h5 Note that when row reducing this matrix, we can get rid of one power of h (subtract the second row from the first and divide by h). This leads to  1 0 3 3h 16 36h 32h2 15h3 3h4 . 3 3 4 3 0 1 3 + 2h 11 + 18h + 16h2 + 15 2 h + 2h The equations encoded by this matrix have a perfectly good limit when h ! 0: x = 3z

16 and y =

3z + 11,

which are a parametric representation of the required limiting line. 3.1.15 Clearly if l1 = l2 + l3 + l4 , the only positions are those where the vertices are aligned. In that case, the position of x1 and the polar angle of the linkage determine the position in X2 ; thus X2 is a 3-dimensional manifold, which looks like R2 ⇥ S 1 . Similarly, X3 is a 5-dimensional manifold, which looks like R3 ⇥ S 2 . 3.1.17 Two circles with radii r1 and r2 , whose centers are a distance d apart, intersect at two points exactly if |r1

r2 | < d,

|r1 + r2 | > d.

If one of these inequalities is satisfied and the other is an equality, the circles are tangent and intersect in one point. The point x2 is on the circle of radius l1 around x1 and on the circle of radius l2 around x3 . These two circles therefore intersect at two points precisely if |l1

l2 | < |x1

x3 |,

|l1 + l2 | > |x1

x3 |.

Similarly, the point x4 is on the circle of radius l3 around x3 and the circle of radius l4 around x1 . These two circles intersect in two points precisely if |l3

l4 | < |x1

x3 |,

|l3 + l4 | > |x1

x3 |.

Under these conditions, there are two choices for x2 , and two choices for x4 , leading to four positions in all. 3.1.19 a. The space of 2 ⇥ 2 matrices of rank 1 is precisely the  set of a b 0 0 matrices A = such that det A = ad bc = 0, but A 6= . c d 0 0 Compute the derivative: h ⇣ ⌘i a b D det = [d c b a], c d so that the derivative is nonzero on M1 (2, 2).

Solutions for Chapter 3

89

b. This is similar but longer. The space M2 (3, 3) is the space of 3⇥3 matrices whose determinant vanishes, but which have two linearly independent columns. If two vectors 2 3 2 3 a1 b1 ~a = 4 a2 5 and ~b = 4 b2 5 a3 b3 are linearly independent, then ~a ⇥ ~b = 6 ~0 (see Exercise 1.4.13). Now we have 2 3 a1 b1 c1 det 4 a2 b2 c2 5 = a1 b2 c3 a1 c2 b3 b1 a2 c3 + b1 c2 a3 + c1 a2 b3 c1 b2 a3 , a3 b3 c3

so taking the variables in the order a1 , a2 , . . . , c3 , the derivative of the determinant is [b2 c3

c3 b2 ,

b1 c3 + c1 b3 , b1 c2

c1 b2 , . . . , a1 b2

b1 a2 ],

precisely the determinants of all the 2 ⇥ 2 submatrices you can make from the 3 ⇥ 3 matrix, or alternatively, precisely the coordinates of all the cross products of the columns. We just saw that at least one coordinate of such a cross product must be nonzero at a point in M2 (3, 3). 3.1.21 a. 0 Yes, 1 the theorem tells us that they are smooth surfaces. 0 1 We x x define f @ y A = ex + 2ey + 3ez a, so that Xa is defined by f @ y A = 0. z z This has solutions only for a > 0; for a  0, the set Xa is the empty set and hence is a smooth surface by our definition. If a > 0, we have to check that the derivative is onto at all points: 2 0 1 !3 x x 4Df y 5 = [ex 2ey 3ez ] 6= [0 0 0] for all @ y A 2 R3 . z z Thus Xa is a smooth surface. Solution 3.1.21, part b: Recall that we cannot use Theorem 3.1.10 to determine that a locus is not a smooth manifold; a locus defined by F(z) = 0 may be a smooth manifold even though [DF(z)] is not onto. But in this case, at these values of a and b the locus Xa,b is a single point, which is not a smooth curve.

b. No, Theorem 3.1.10 does not guarantee that they are smooth curves. We define F : R3 ! R2 as 0 1 ✓ x ◆ x e + 2ey + 3ez a @ A F y = , x+y+z b z so that the subsets in question are given by F(x) = 0. Then  x e 2ey 3ez [DF(x)] = , 1 1 1

which has a single pivotal column 0 if and only1if ex = 2ey = 3ez , i.e., x x = y + ln(2) = z + ln(3). A triple @ x ln(2) A satisfies both equations x ln(3)

90

Solutions for Chapter 3

of F(x) = 0 when ex = a/3 and 3x

ln(2)

ln(3) = b,

i.e.,

b + ln(6) . 3 Setting these two equations equal, we find that the subsets Xa,b may not be smooth curves when a = 3e(b+ln(6))/3 . x = ln(a/3) and x =

3.1.23 Manifolds are invariant under rotations (see Corollary 3.1.17), so we may assume that our linkage is on the x-axis, say at points ✓ ◆ ✓ ◆ ✓ ◆ ✓ ◆ a1 a2 a3 a4 def def def def a1 = , a2 = , a3 = , a4 = , 0 0 0 0 with a1 < a2 < a4 < a3 and a2 l1

l2 l4

l4

l1

l3 l2 l3

x2

x4 > l2

a2 = l2 , a4

a3 = l3 , a1

a4 = l4 .

Recall that X2 ⇢ (R2 )4 is the set defined by the equations l3

l2

a1 = l1 , a3

l3

Figure for Solution 3.1.23. The case where the candidate coordinates consist of x2 and x4 and two y-variables.

We can choose the values of the coordinate functions to be whatever we like; in particular, they can be 0. At the base position A, all the yi vanish.

|x1

x2 | = l1 ,

|x2

x3 | = l2 ,

|x3

x4 | = l3 ,

|x4

x1 | = l4 .

We will denote by A 2 X2 the position where xi = ai , for i = 1, 2, 3, 4. Let X2✏ ⇢ X2 be the subset where |xi ai | < ✏, for✓ i =◆1, 2, 3, 4. We will call xi our variables x1 , . . . , x4 , y1 , . . . , y4 , where xi = . yi The question is whether there exists ✏ > 0 such that X2✏ is the graph of a 1 C map representing four of the eight variables in terms of the other four. We will show that this is not the case: indeed, X2✏ is not the graph of any map (C 1 or otherwise) representing four of the eight variables in terms of the other four. There are many (8 choose 4, i.e., 70) possible choices of candidate “domain variables” (the active variables), which we will call candidate coordinates; they will be coordinates if the other variables are indeed functions of these. Let us make some preliminary simplifications, which will eliminate many of the candidates. Clearly y1 , y2 , y3 , y4 are not a possible set of coordinates, since we can add any fixed constant to x1 , x2 , x3 , x4 and stay in X2 . Also, no two linked xi can belong to any set of possible coordinates, since |xi+1 xi |  li . Any choice of three xi -variables will contain a pair of linked ones, so any set of coordinates must include either two x-variables and two y-variables, or one x-variable and three y-variables. The two cases are di↵erent. If our candidate coordinates consist of two x-variables (easily seen to be necessarily x2 and x4 ), and two y-variables, then in every neighborhood of the position A, there exists a position X where the two coordinate y-variables are 0, and the two non-coordinate y-variables are nonzero. Indeed, as shown by the figure in the margin, if you can choose x2 and x4 so that |x2 x4 | > l2 l3 , and then the linkage will not lie on a line. If ⇣⇣ ⌘ ⇣ ⌘ ⇣ ⌘ ⇣ ⌘⌘ x1 , x2 , x3 , x4 X= 2 X2✏ y1 y2 y3 y4

Solutions for Chapter 3

91

then X0 =

⇣⇣

⌘ ⇣ ⌘ ⇣ ⌘ ⇣ ⌘⌘ x1 , x2 , x3 , x4 2 X2✏ y1 y2 y3 y4

also and has the same values for the candidate coordinates. Thus these candidate coordinates fail: their values do not determine the other four variables. When the candidate consists of one x-coordinate and three y-coordinates, two y-coordinates must belong to non-linked points, say y1 and y3 (y2 and y4 would do just as well), and without loss of generality, we may assume that the third coordinate y-variable is y2 . There is then a position X where y1 = y3 = 0, y2 6= 0; it then follows that y4 6= 0 also. If ⇣⇣ ⌘ ⇣ ⌘ ⇣ ⌘ ⇣ ⌘⌘ x1 , x2 , x3 , x4 X= 2 X2✏ 0 y 0 y 2

4

then

X0 =

⇣⇣

⌘ ⇣ ⌘ ⇣ ⌘ ⇣ ⌘⌘ x1 , x2 , x3 , x4 2 X2✏ 0 y2 0 y4

also and has the same values for the candidate coordinates. Thus the candidate y1 , y2 , y3 , and xj are not coordinates for any j, as they do not determine y4 . The upshot is that there is no successful candidate for coordinate variables, so X2 is not a manifold near A. 0 1 x 3.1.25 a. Set F @ y A = z (x sin z)(sin z x + y). We need to check z ⇣ ⇣ ⌘⌘ ⇣ ⌘ u 2 R2 . This is that the equation F g u = 0 is satisfied for all v v straightforward: ⇣ ⇣ ⌘⌘ F g u = uv sin uv + u sin uv)(sin uv (sin uv + u) + (u + v) v = uv

2

4DF

!3 x y 5 = [ (sin z | z

(u)(v) = 0.

0 1 x b. We need to show that [DF (x)] 6= [0, 0, 0] for all x = @ y A 2 S; it is z actually true for all x 2 R3 . Indeed, using the product rule, we compute x + y) + (x {z D1 F (x)

sin z) (x sin z) 1 + cos z(sin z } | {z } | D2 F (x)

x + y) {z

D3 F (x)

cos z(x

sin z)]. }

When will all three entries be 0? For D2 F (x) to be 0, we need to have x sin z = 0, but then (looking at the first entry) y = 0, and then (looking at the third entry) 1 = 0, which is a contradiction. So the derivative of F never vanishes on R3 .

92

Solutions for Chapter 3

0 1 0 1 x x ⇣ ⌘ @yA c. Given a point x = @ y A 2 S, we need to show that g u = v z z can be solved for u, v. Clearly we can just solve the equations x = sin uv + u Part c illustrates that showing that a map is onto is equivalent to showing that some equation has a solution.

y =u+v z = uv. The first and third equations give u = x sin z, and the second then gives v = y x+sin z. We still need to check that this actually is a solution. Recall that the point x is a point of S, in particular F (x) = z

(x

sin z)(sin z

x + y) = z

uv = 0,

so the third equation is satisfied, and then from the definition of u the first is satisfied, and from the definition of v the second is satisfied. ⇣ ⌘ ⇣ ⌘ u2 , i.e., 1 d. To see that g is injective, suppose that g u = g v1 v2 sin u1 v1 + u1 = sin u2 v2 + u2 u1 + v1 = u2 + v2 u1 v1 = u2 v2 .

Saying that the derivative  ✓ ◆ Dg u : R2 ! R3 v is injective means that its image is a plane in R3 , not a line or a point. It mean ✓ does ◆ not ✓ ◆ that the map u 7! Dg u is injective. v v

The first and third equations imply u1 = u2 , and then the second equation implies v1 = v2 . Thus g is injective. h ⇣ ⌘i ⇣ ⌘ u 2 R2 , it is To show that the derivative Dg u is 1–1 at every v2 3 v 0 h ⇣ ⌘i  a enough to show that Dg u = 4 0 5 implies a = b = 0. Write v b 0 2 3 2 3 2 3 v cos uv + 1 u cos uv 0 h ⇣ ⌘i  a 4 5 + b4 5 = 405. Dg u = a 1 1 v b v u 0

The second line tells us that a = b, and the third line says we must have u = v. The first line then becomes u cos u2 + 1 = u cos u2 , a contradiction, so we must have a = b = 0. Thus the derivative of g is everywhere injective. ⇣ ⌘ 3.2.1 a. At 10 the derivative is [3 , 0], so the equation for the tangent line is 3(x 1) = 0, i.e., x = 1. b. At the same point, the tangent space is the kernel of [3 , 0], which is the line of equation x˙ = 0, i.e., the y-axis. ⇣ ⌘ ⇣ ⌘ u if and only if 3.2.3 A point x is in the tangent line to X at c y v ⇣ ⌘ x u 2 ker [2u, 3v 2 ], i.e., when 2u(x u) + 3v 2 (y v) = 0. y v

! X is the kernel of [2u, 3v 2 ], i.e., the 1-dimenc u v sional subspace of R2 of equation 2ux˙ + 3v 2 y˙ = 0.

The tangent space T

Solutions for Chapter 3

93

0

1 u A, the tangent plane to the surface of 3.2.5 a. At the point @ v 2 2 Au + Bv equation Ax2 + By 2 z = 0 has equation 2 3 x u 5 = 0. [ 2Au 2Bv 1]4 y v z (Au2 + Bv 2 )

[0 0

Applied to our three points, this means that P1 , P2 , P3 have equations 2 3 2 3 2 3 x x a x 5 = 0, [ 0 2Bb 1 ] 4 y 5 = 0, [ 2Aa 0 1]4 y 1 ] 4 y b 5 = 0, 2 z z Aa z Bb2 which expand to

z=0 2aAx

z = a2 A

2bBy z = b2 B. 0 1 a/2 This is easy to solve; we find q = @ b/2 A. 0 b. The volume of the tetrahedron is 2 3 a 0 a/2 1 ab 2 det 4 0 b b/2 5 = (a A + b2 B). 6 12 2 2 a A b B 0

⇣ ⌘ ⇣ ⌘ ⇣ ⌘ a , the point f x is the unique z such that 3.2.7 For any x near y b y1 0 1 0 0 1 x x h(y, z) @ y A is on the surface S. Similarly, @ g(x, z) A and @ y A are on S. z z z ⇣ ⌘ ⇣ ⌘ h(b, z) h(y, c) Note that f = z and g = y. After di↵erentiating these b c equations with respect to the appropriate variables (z for the first, y for the second) and using the chain rule, we discover that (D1 f )(D 02 h) 1= 1 and a (D1 g)(D1 h) = 1, where all derivatives are evaluated at a = @ b A. These c equations imply that 1 1 D1 f 6= 0, D1 g 6= 0, D1 h = , D2 h = . D1 g D1 f We can also combine the functions like this: ⇣ ⌘ ⇣ ⌘ c) = c and g h(b, z) = b. f h(y, y z

After di↵erentiating the first equation with respect to y and the second with respect to z (and using the chain rule) we get (D1 f )(D1 h) + (D2 f ) = 0 and (D1 g)(D2 h) + (D2 g) = 0.

94

Solutions for Chapter 3

Since D1 f and D1 g do not equal zero, D1 h =

D2 f D1 f

D2 g . D1 g

and D2 h =

Since we know that D1 f and D1 g do not equal 0, the equations for the tangent planes at a may be written as follows by linearizing the equations x = h(y, z), y = g(x, z), and z = f (x, y) and rearranging terms: D2 f 1 y˙ + z˙ D1 f D1 f

x˙ =

1 y˙ D1 g

x˙ =

D2 g z˙ D1 g

x˙ = (D1 h)y˙ + (D2 h)z. ˙ From the equations we derived above, we see that the first and third tangent planes are the same, as are the second and third. So all tangent planes are the same. 0 1 x0 3.2.9 The tangent plane to CX through a point @ y0 A has equation z0 ✓ ⇣ ⌘x x ⇣ ⌘y y ⇣ ⌘ ⇣ ⌘◆ z z0 0 0 x0 /z0 x0 /z0 + y D f x0 /z0 0 /z0 D1 f x + D f = x D f . 2 0 1 0 2 y0 /z0 y0 /z0 y0 /z0 y0 /z0 z0 z0 z02

Note that this is the plane through the 0 1 origin that contains the tangent line x0 /z0 to the curve X at the point @ y0 /z0 A. 1

3.2.11 a. If A 2 O(n), then A> A = I, so A> = A 1 . Moreover, if A, B 2 O(n), then (AB)> (AB) = B > A> AB = B > B = I; this shows that a product of orthogonal matrices is orthogonal. Since (A we see that A

1

1 >

) A

1

= (A> )> A

1

= AA

1

= I,

is also orthogonal.

b. If n 7! An is a convergent sequence of orthogonal matrices, converging to A1 , then Solution 3.2.11, part d: To compute the derivative [DF (A)], compute F (A + H)

F (A)

and discard all terms in the increment H that are quadratic or higher; see the remark after Example 1.7.17.

> A> 1 A1 = ( lim An )( lim An ) n!1

n!1

= |{z}

lim (A> n An ) = lim I = I,

n!1

n!1

Thm. 1.5.26, part 4

so the O(n) is closed in Mat (n, n). It is also bounded; all the columns are Pn of unit length, so i,j=1 a2i,j = n. c. Just compute: (A> A

I)> = (A> A)>

I = A> (A> )>

I = A> A

I.

d. We have [DF (A)]H = A> H + H > A; we need to show that given any M 2 S(n, n), there exists H 2 Mat (n, n) such that A> H + H > A = M .

Solutions for Chapter 3

95

Try H = 12 AM : since M = M > , we have ✓ ◆ ✓ ◆> 1 1 1 1 1 1 A> AM + AM A = A> AM + M > A> A = M + M > = M. 2 2 2 2 2 2 e. That O(n) is a manifold follows immediately from Theorem 3.1.10, and the characterization of the tangent space follows from Theorem 3.2.4: it is ker [DF (I)], which is the set of H 2 Mat (n, n) such that H +H > = [0], which is precisely the set of antisymmetric matrices. 3.3.1

D1 (D2 f ) = D1 (x2 + 2xy + z 2 ) = 2x + 2y D2 (D3 f ) = D2 (2zy) = 2z D3 (D1 f ) = D3 (2xy + y 2 ) = 0 D1 (D2 (D3 f )) = D1 (D2 ((2yz)) = D1 (2z) = 0.

3.3.3 We have 2 Pf,a (a + ~h) =

2 X X 1 1 DI f (a)~hI = D(0,0,0) f (a)h01 h02 h03 + I! 0!0! | {z } m=0 I2I3m f (a)

1 1 1 D(1,0,0) f (a)h11 h02 h03 + D(0,1,0) f (a)h01 h12 h03 + D(0,0,1) f (a)h01 h02 h13 1!0!0! 0!1!0! 0!0!1! 1 1 1 + D(2,0,0) f (a)h21 h02 h03 + D(1,1,0) f (a)h1 h2 h03 + D(1,0,1) f (a)h1 h02 h3 2!0!0! 1!1!0! 1!0!1! 1 1 1 + D(0,1,1) f (a)h01 h2 h3 + D(0,2,0) f (a)h01 h22 h03 + D(0,0,2) f (a)h01 h02 h23 ; 0!1!1! 0!2!0! 0!0!2! i.e., +

2 Pf,a (a + ~h) = f (a) + D(1,0,0) f (a)h1 + D(0,1,0) f (a)h2 + D(0,0,1) f (a)h3

1 + D(2,0,0) f (a)h21 + D(1,1,0) f (a)h1 h2 + D(1,0,1) f (a)h1 h3 2 1 1 + D(0,1,1) f (a)h2 h3 + D(0,2,0) f (a)h22 + D(0,0,2) f (a)h23 . 2 2 3.3.5 cardinality I1m = cardinality {(m)} = 1;

cardinality I2m = cardinality {(0, m), (1, m cardinality

I3m

= cardinality

1), . . . , (m, 0)}, = m + 1;

{(0, I2m ), (1, I2m 1 ), . . . , (m, I20 )}

= (m + 1) + m + · · · + 2 + 1 =

(m + 1)(m + 2) 2

In general, cardinality Inm = cardinality In0

1

+ cardinality In1

· · · + cardinality

Inm 11

1

+ ···

+ cardinality Inm 1 ,

96

Solutions for Chapter 3

i.e., cardinality Inm = cardinality Inm

1

+ cardinality Inm 1 .

(1)

Remark. This is reminiscent of the Pascal’s triangle recursion for binomials, ✓ ◆ ✓ ◆ ✓ ◆ m m 1 m 1 = + . (2) r r 1 r Indeed, setting m = n + k 1, r = n 1 gives ✓ ◆ ✓ ◆ ✓ ◆ n+k 1 n+k 2 n+k 2 = + . n 1 n 1 n 2 Thus the formula cardinality Ink =

✓

n

1+k n 1

◆

,

(3)

satisfies equation 1. Moreover, cardinality Ink satisfies the boundary conditions of Pascal’s triangle, namely, ✓ ◆ ⇣ ⌘ n 1 k k 0 cardinality I1 = 1 = 0 and cardinality In = 1 = , n 1 and these together with equation 2 completely specify the binomial coefficients. Thus equation 3 is true. 4

Here is a very nice di↵erent solution to Exercise 3.3.5, proposed by Nathaniel Schenker: Using the “stars and bars” method, consider the problem as counting the ways of allocating m indistinguishable balls to n distinguishable bins. Each configuration of the balls in the bins corresponds to an I 2 Inm , with each ball in bin j, for j 2 {1, . . . , n}, adding 1 to ij . Each I 2 Inm is represented by a sequence of length m + n 1 containing m stars (representing the m balls) and n 1 bars (which divide the stars into bins). For example, in I43 , we represent (0, 2, 0, 1) by | ⇤ ⇤| |⇤. Any choice of positions of the m stars in the sequence automatically implies the positions of the n 1 bars, and vice versa. Thus the cardinality 1 of Inm is the number of choices of positions for the stars, that is, m+n , m or, equivalently, the number of choices of positions for the bars, that is, m+n 1 . n 1 3.3.7 If I 2 Inm , then ~hI = hi11 hi22 · · · hinn , where i1 + i2 + · · · + in = m, the total degree. Therefore (x~h)I = xi1 hi11 xi2 hi22 · · · xin hinn ,

i.e.,

xi1 +···+in ~hI = xm ~hI .

3.3.9 a. Clearly this function has partial derivatives of all orders everywhere, except perhaps at the origin. It is also clear that both first partials exist at the origin, and are 0 there, since f vanishes identically on the axes.

Solutions for Chapter 3

97

Elsewhere, the partials are given by ⇣ ⌘ x4 y + 3x2 y 3 2xy 4 ⇣ ⌘ x5 x3 y 2 2x4 y x = D1 f x = and D f . 2 y y (x2 + y 2 )2 (x2 + y 2 )2

These partials are continuous on all of R2 , since the limits of the expressions above are 0 as x, y ! 0. In particular, f is of class C 1 . ⇣ ⌘ ⇣ ⌘ 0 b. and c. Since D1 f x = 0 for all x, and D f 2 0 y = 0 for all y, ⇣ ⌘ ⇣ ⌘ D1 (D1 f ) 00 = 0 and D2 (D2 f ) 00 = 0. Since D1 f

⇣ ⌘ ⇣ ⌘ 0 = 0 for all y and D f x = x for all x, 2 y 0 ⇣ ⇣ ⌘⌘ ⇣ ⇣ ⌘⌘ D2 D1 f y0 = 0 and D1 D2 f x = 1, 0

so the crossed partials are not equal.

d. This does not contradict Theorem 3.3.8 because the first partial derivative⇣D2⌘f is not di↵erentiable. The definition of D2 f being di↵erentiable at 0 = 0 0 is that there should exist a line matrix [a, b] such that lim

D2 f (0 + ~h)

[a, b]~h

D2 f (0)

= 0.

|~h|

|~ h|!0

(1)

  h1 0 ~ ~ We determine what [a, b] must be by considering h = and h = 0 h2 (and using the value for D2 f given in part a):  ⇣ ⌘ ⇣ ⌘ h D2 f h01 D2 f 00 [a, b] 1 0 |h1 ah1 | lim = h1 !0 |h1 | |h1 | D2 f lim

h2 !0

h h

!lim !

0 0

!

⇣

0 h2

⌘

D2 f

⇣ ⌘ 0 0

|h2 |

[a, b]



0 h2

=

|b||h2 | = 0. |h2 |

To satisfy equation (1), these imply that [a, b] = [1, 0]. But then  ⇣ ⌘ ⇣ ⌘ h 0 2h5 D2 f h D f [1, 0] 2 3 h h 0 h |h| (2h2 )2 3  p = = p2 = p 6= 0. h 2|h| 2|h| 2 2 h

Solution 3.3.11: f (i) denotes the ith derivative of f .

3.3.11 The jth derivative of the numerator of the fraction given in the hint is f (j) (a + h)

k X f (i) (a) i h (i j)! i=j

j

,

98

Solutions for Chapter 3

and the jth derivative of the denominator of the fraction given in the hint is k! hk j . (k j)! If we evaluate these two expressions at h = 0, they yield 0 for 0  j < k. Thus the hypotheses of l’Hˆ opital’s rule are satisfied until we take the kth derivative, at which point the numerator is 0 and the denominator is k! when evaluated at h = 0. Thus the limit in question is 0. For uniqueness, suppose p1 and p2 are two polynomials of degree  k such that f (a + h) p1 (a + h) f (a + h) p2 (a + h) lim = 0, lim = 0. k h!0 h!0 h hk By subtraction, we see that p1 (a + h)

lim

p2 (a + h) hk

h!0

= 0.

If p1 6= p2 , then p1 (a + h) p2 (a + h) = b0 + · · · + bk hk has a first nonvanishing term, i.e., there exists l  k such that b0 = · · · = bl 1 = 0 and bl 6= 0. Then the limit lim

p1 (a + h)

h!0

is zero only if limh!0

p2 (a + h)

hk

1 hk

l

bl hl + · · · + bk hk h!0 hk 1 = lim k l (bl + · · · + bk hk l ) h!0 h = lim

= 0, and that is not the case when l  k.

3.3.13 First, we compute the partial derivatives and second partials: 1+y 1+x D(1,0) f = p D(0,1) f = p 2 x + y + xy 2 x + y + xy p 1+y p x + y + xy · 0 (1 + y) 2 x+y+xy (1 + y)2 D(2,0) f = = 2(x + y + xy) 4(x + y + xy)3/2 p p x + y + xy · 1 (1+y)(1+x) 2(x + y + xy) (1 + y)(1 + x) 2 x+y+xy D(1,1) f = = 2(x + y + xy) 4(x + y + xy)3/2 p 1+x x + y + xy · 0 (1 + x) 2px+y+xy (1 + x)2 D(0,2) f = = . 2(x + y + xy) 4(x + y + xy)3/2 ✓ ◆ 2 At the point these are 3 ✓ ◆ ✓ ◆ 1 3 1 2 1 2 2 D(1,0) f = p = 1, D(0,1) f = = , 3 3 2 · 1 2 2 2 3+6 ✓ ◆ ✓ ◆ (1 3)2 2 · 1 ( 2)( 1) 2 2 D(2,0) f = = 1, D(1,1) f = = 0, 3 3 4·1 4 ✓ ◆ ( 1)2 1 2 D(0,2) f = = . 3 4·1 4

Solutions for Chapter 3

⇣ ⌘ 3 + v; i.e., the increment ~h is u v . The Taylor ✓ ◆ 2 polynomial of degree 2 of f at is then 1 u 12 v + 12 ( u2 14 v 2 ): 3 ✓ ◆ ✓ ◆ 1 1 2 1 1 2 2+u ! = |{z} 1 + u + v + u + v . |{z}! 3+v 2 2} 2 } 2 4 ! | {z | {z | {z } 3 ! ! 2 2 ! Write x =

P2 f,

99

f

3

D1,0 f

3

2 + u, y =

u1 v 0

D(0,1) f

2 u0 v 1 3

1 2 D(2,0) f

2 u2 v 0 3

1 2 D(0,2) f

2 u0 v 2 3

3.3.15 We know (Theorem 3.1.10) that there exist a neighborhood U ⇢ Rn of a and a C 1 map f : U ! Rn k with [Df (a)] onto, such that M \ U = { u | f (u) = 0 } . def

Further, by Theorem 3.2.4, Ta M = ker[Df (a)]. Thus U1 = F 1 (U ) is a neighborhood of 0 2 Rn . Define g : U1 ! Rn k by g(u) = f F(u). Then F

1

(M ) \ U1 = { u1 | g(u1 ) = 0 } .

Since [Dg(0)] = [D(f

F )(0)] = [Df (F (0))][DF (0)] = [Df (a)]A,

and both [Df (a)] and A are onto, [Dg(0)] is onto. But, since ~v 2 ker[Dg(0)] =) A~v 2 ker[Df (a)] =) ~v 2 A

1

ker[Df (a)] ,

we have that ker[Dg(0)] = A Here, g plays the role of F in the implicit function theorem, and h plays the role of g.

1

(ker[Df (a)]) = A

1

(Ta M )

is the span of ~e1 , . . . , ~ek . Thus we have D1 g(0) = · · · = Dk g(0) = 0, so [Dk+1 g(0), . . . , Dn g(0)] is invertible. Thus the implicit function theorem (Theorem 2.10.14) asserts that there exists a function h such that g(u1 ) = 0 expresses ⇣ ⌘ the last n k variables as a function h of the first k variables: x g h(x) = 0, where x corresponds to the first k variables. So locally F 1 (M ) is the graph of h. (“Locally” must be within U1 , but it may be smaller.) Moreover, by equation 2.10.29, [Dh(0)] =

[Dk+1 g(0), . . . , Dn g(0)]

1

[D1 g(0), . . . , Dk g(0)] = [0]. | {z } [0]

3.4.1 Using equation 3.4.9, with m = 1/2 and sin(x + y) playing the role of x in that equation, we have 1 + sin(x + y)

1/2

=1+

1 1 sin(x + y) + 4 sin(x + y) 2 2! 3

2

+ ··· .

Equation 3.4.6 tells us that sin(x + y) = x + y (x+y) plus higher degree 3! terms; discarding terms greater than 2 leaves just x + y. Therefore: p 1 1 1 + sin(x + y) = 1 + (x + y) x + y)2 + · · · . 2 8

100

Solutions for Chapter 3

We could also use the chain rule in the other direction, first developing the sine, then the square root: p p 1 1 1 + sin(x + y) = 1 + (x + y) + · · · = 1 + (x + y) (x + y)2 + · · · . 2 8 3.4.3 Set a = f (a + u) = Going from the first to the second line uses the binomial formula, equation 3.4.9.

⇣

⇣

2+u

2 3

⌘

and u =

⇣ ⌘ u . Then v

⌘1/2 3 + v + ( 2 + u)( 3 + v) = (1

2u

1 1 = 1 + ( 2u v + uv) ( 2u v + uv)2 + · · · 2 8 1 1 1 =1 u v + uv (4u2 + v 2 + 4uv + · · · ) + · · · 2 2 8 Therefore, 1 1 1 2 1 2 1 1 2 Pf,a (a+u) = 1 u v + uv u v uv = 1 u v 2 2 2 8 2 2

v + uv)1/2

1 2 u 2

1 2 v . 8

3.4.5 Write f (x) = A + Bx + Cx2 + R(x) with R(x) 2 o(h2 ). Then ⇣ ⌘ ⇣ ⌘ h af (0) + bf (h) + cf (2h) = h aA + b(A + Bh + Ch2 ) + c(A + 2Bh + 4Ch2 ) +h bR(h) + cR(2h) = hA(a + b + c) + h2 B(b + 2c) + h3 C(b + 4c) + h bR(h) + cR(2h) ; note that h bR(h) + cR(2h) 2 o(h3 ). On the other hand, Z 2h Z 2h h2 h3 f (t) dt = 2Ah + 4B + 8C + R(t) dt. 2 3 0 0 R 2h Note that 0 R(t) dt 2 o(h3 ). Thus we find

1 3 b + 2c = 2 with solution b = 4 . 3 8 b + 4c = 1 3 c= 3 Our error terms above show that with these numbers, the equation ⇣ ⌘ Z h h af (0) + bf (h) + cf (2h) f (t) dt 2 o(h3 ) a=

a+b+c=2

0

is satisfied by all functions f of class C 3 . 0 1 0 1 x ⇡/2 3.4.7 Let f @ y A = sin(xyz) z = 0 and a = @ 1 A. Then z 1 0 1 x D3 f @ y A = xy cos(xyz) 1, f (a) = 0, D3 f (a) = 1 6= 0, z

Solutions for Chapter 3

101

so [Df (a)] is onto, and by the implicit function theorem (short version), the equation f = 0 expresses one variable as a function of the other two. Since D3 f (a) = 1 is invertible, the long version of the implicit function theorem says that we can choose z as the “pivotal” or “passive” variable that is a ✓ ◆ ⇣ ⌘ ⇡/2 x function of the others: g y = z in a neighborhood of p = , where 1 g(p) = 1. By definition, ✓ ◆ 1 1 ⇡/2 + x˙ 2 Pg,p = |{z} 1 +D1 g(p)x˙ + D2 g(p)y˙ + D12 g(p)x˙ 2 + D1 D2 g(p)x˙ y˙ + D22 g(p)y˙ 2 , 1 + y˙ 2 2 g(p)

Of course you could denote the various components of the increment by h1 , h2 , h3 , or by u, v, w. But we recommend against writing x, y, z for those increments, which leads to confusion.

where x˙ denotes the increment in the x direction and y˙ the increment in the y direction. To simplify notation, denote the various coefficients by ↵1 , ↵2 , etc.: ✓ ◆ ⇡/2 + x˙ 2 Pg,p = 1 + ↵1 x˙ + ↵2 y˙ + ↵3 x˙ 2 + ↵4 x˙ y˙ + ↵5 y˙ 2 , (1) 1 + y˙ and note further that equation 2.10.29 tells us that D1 g = D2 g = 0:

If you didn’t use this argument, you would discover that ↵1 = ↵2 = 0 anyway, on coming to equation 3.

[Dg(p)] =

1

[D3 f (a)]

[D1 f (a), D2 f (a)] =

[ 1]

1

[0 0] = [0 0].

Thus equation 1 becomes ✓ ◆ ⇡/2 + x˙ 2 1 + z˙ = Pg,p = |{z} 1 + ↵3 x˙ 2 + ↵4 x˙ y˙ + ↵5 y˙ 2 . 1 + y˙ | {z } g(p)

Compute 0 1 ⇡/2 + x˙ f @ 1 + y˙ A = 1 + z˙ =

1

=

1

=

1

z˙

⇣⇣ ⇡ ⌘ ⌘ (1 + z) ˙ + sin + x˙ (1 + y)(1 ˙ + z) ˙ | {z } | 2 {z }

z-coordinate

⇣⇡

(2)

product of x,y,z coordinates

⇣ ⌘⌘ ⇡ ⇡ ⇡ z˙ + sin + x˙ + y˙ + z˙ + x˙ y˙ + x˙ z˙ + y˙ z˙ + x˙ y˙ z˙ 2 2 2 2 ⇣ ⌘ ⇡ ⇡ ⇡ z˙ + cos x˙ + y˙ + z˙ + x˙ y˙ + x˙ z˙ + y˙ z˙ + x˙ y˙ z˙ 2 2 2 ⌘2 1⇣ ⇡ ⇡ ⇡ z˙ + 1 x˙ + y˙ + z˙ + x˙ y˙ + x˙ z˙ + y˙ z˙ + x˙ y˙ z˙ + · · · , 2 2 2 | 2 {z }

(3)

Taylor polynomial of cosine

which gives 0 1 ⇡/2 + x˙ 2 @ Pf,a 1 + y˙ A = 1 + z˙

z˙

1 2 x˙ 2

⇡2 2 y˙ 8

⇡2 2 z˙ 8

⇡ x˙ y˙ 2

⇡ x˙ z˙ 2

⇡2 y˙ z. ˙ (4) 4

Substituting z˙ = ↵3 x˙ 2 + ↵4 x˙ y˙ + ↵5 y˙ 2 from equation 2 into equation 4 gives 0 1 ⇡/2 + x˙ 1 2 ⇡2 2 ⇡ 2 @ Pf,a 1 + y˙ A = (↵3 x˙ 2 + ↵4 x˙ y˙ + ↵5 y˙ 2 ) x˙ y˙ x˙ y. ˙ 2 8 2 2 Pg,p

Solution 3.4.7: We said above that we recommend using some other notation than x, y, z to denote increments to x, y, z. The real danger is in confusing z and the increment to z. If you do that you may be tempted to substitute z = 1 + z˙ for z˙ in equation 4, counting the 1 twice. In Example 3.4.9 we do insert the entire Taylor polynomial (degree 2) of g into x2 + y 3 + xyz 3 3 = 0, but note that we did not count the 1 twice; x2 became (1 + u)2 and y 3 became (1 + v)3 , but z 3 became a (1 + a1,0 u + a0,1 v + 2,0 u2 + · · · )3 , 2 3 not (1+1+a1,0 u+· · · ) . As long as you can keep variables and increments straight, it does not matter which you do.

102

Solutions for Chapter 3 2

2

(Note the handy fact that you don’t need to compute ⇡8 z˙ 2 , ⇡2 x˙ z, ˙ or ⇡4 y˙ z, ˙ since all terms are higher than quadratic.) 0 1 0 1 x ⇡/2 + x˙ 2 @ Since f @ y A = 0 for any values of x, y, z, then Pf,a 1 + y˙ A = 0; 2 z Pg,p setting all coefficients equal to zero gives ↵1 = ↵2 = 0,

1 , 2

↵3 =

↵4 =

⇡ 2

↵5 =

⇡2 , 8

so 1 2 x˙ 2

⇡ x˙ y˙ 2

⇡⇣ x 2

⇡⌘ (y 2

2 ˙ =1 Pg,p (p + x)

which we could also write as 1⇣ ⇡ ⌘2 2 Pg,p (x) = 1 x 2 2

⇡2 2 y˙ , 8

1)

⇡2 (y 8

1)2 .

3.4.9 Compute Taylor series of e

2 erf(x) = p ⇡

Z

x

0

t2

z }| { Z 1 1 X ( t2 )k 2 X ( 1)k x 2k dt = p t dt k! k! ⇡ 0 0 0 2 =p ⇡

1 X

k=0

(1)

( 1)k (x)2k+1 . (2k + 1)k!

n 1 2 X ( 1)k x2k+1 Set Pn (x) = p and (2k + 1)k! ⇡ k=0

alternating series because of ( 1)k

Solution 3.4.9: The function “erf” computes the area under the bell or “Gaussian” curve and is thus important in probability. The exchange of sum and integral in equation 1 is justified t2 )k because the series ( k! is absolutely uniformly convergent on [0, x].

En = erf(1/2)

Pn (1/2) =

z

}| { 1 2 X ( 1)k (1/2)2k+1 p . (2k + 1)k! ⇡ k=n

Since En is an alternating series with decreasing terms tending to 0, it converges, and for each n, 2 (1/2)2n+1 En  first term of the series En = p . ⇡ (2n + 1)n! Trying di↵erent values of n, we find that E2 may be too big but E3 is definitely small enough: |E2 | 

1 p > 10 160 ⇡

3

and |E3 | =

1 p < 10 2688 ⇡

So take n = 3; then 2 P3 (x) = p ⇡

✓ x

◆ 1 3 1 5 x + x . 3 10

3

.

Solutions for Chapter 3

Solution 3.4.11: First we use equation 3.4.9 with m = 1 to write 1 ⇡ 1 xz. 1 + zy Then we use equation 3.4.9 with m = 1/2 to compute the Taylor polynomial of (1 + (1

xz)(x + y))1/2 .

103

3.4.11 a. If we systematically ignore all terms of degree > 3, we can write ✓ ◆1/2 ⇣ ⌘1/2 x+y 1/2 1+ ⇡ (1 + (x + y)(1 xz)) = 1 + x + y x2 z xyz 1 + xz ⌘ 1⇣ ⌘2 1⇣ ⇡1+ x + y x2 z xyz x + y x2 z xyz 2 8 ✓ ◆✓ ◆ 1 1 1 3 + · (x + y)3 6 2 2 2 1 1 1 2 1 ⇡ 1 + (x + y) (x + y)2 x z xyz 2 8 2 2 1 + (x3 + y 3 + 3x2 y + 3xy 2 ). 16 0 1 0 b. The number D(1,1,1) f @ 0 A is exactly the coefficient of xyz, so it is 0 1/2. 3.5.1 We find signature (2, 1), since 4z 2 + 2yz

4xz + 2xy + x2 =

y2 3.5.3 a. x +xy 3y + 4 2

b. x2 + 2xy c.

2

y2 + y2

⇣ 4 z

y x ⌘2 1 + + (y + 2x)2 + x2 . 4 2 4

y2 ⇣ y ⌘2 = x+ 4 2

y 2 = (x + y)2

p !2 y 13 ; signature (1, 1). 2

p ( 2y)2 ;

signature (1, 1).

y2 y2 + zy 4 4 ✓ ◆ ⇣ y ⌘2 y2 2 = x+ yz + z + z 2 2 4 ⇣ ⌘2 y ⌘2 ⇣ y = x+ z + z 2 ; signature (2, 1). 2 2

x2 + xy + zy = x2 + xy +

3.5.5 a. The signature is (1, 1), since ✓ ◆ y2 y2 ⇣ y ⌘2 x2 + xy = x2 + xy + = x+ 4 4 2

⇣ y ⌘2 2

.

b. Let us introduce u = x + y, so that x = u y. In these variables, our quadratic form is ✓ ◆ ✓ ◆ uz u2 z2 uz u2 z2 2 (u y)y + yz = y uy yz + + + + + + 2 4 4 2 4 4 ⇣ ⌘ ⇣ ⌘ 2 2 u z u z = y + + . 2 2 2 2 Again the quadratic form has signature (1, 1), but this time it is degenerate.

104

Solutions for Chapter 3

Remark. Nathaniel Schenker points out that in part b, the algebra is simpler using a di↵erent change of variables: since xy + yz = (x + z)y, we can let u = (x + z y), so that x + z = u + y. Then the quadratic form is ✓ ◆ u2 u2 ⇣ u ⌘2 ⇣ u ⌘2 (u + y)y = y 2 + uy + = y+ . 4 4 2 2

Often a clever choice of variables can simplify computations. The advantage of the first solution is that it is a systematic approach that always works: when a quadratic form contains no squares, choose the first term and set the new variable u to be the sum of the variables involved (or, as we did in Example 3.5.7, the di↵erence). 4 3.5.7 First, assume Q is a positive definite quadratic form on Rn with signature (k, l): Q(x) = ↵1 (x) Each linear function ↵i : Rn ! R is a 1 ⇥ n row matrix, the ith row of the matrix T .

2

+ · · · + ↵k (x)

2

↵k+1 (x)

2

···

↵k+l (x)

2

with the ↵i linearly independent. We want to show that k = n and l = 0. Assume k < n. Then, by the 2 3 ↵1 . dimension formula, the k ⇥ n matrix T = 4 .. 5 satisfies ker T 6= {0}. Let ↵k ~y 6= 0 be in ker T . Then Q(~y)  0, so Q is not positive definite. This shows that Q positive definite implies k n. Since k + l  n with k 0, l 0, this shows that k = n and l = 0. In the other direction, if Q has signature (n, 0), then the 2 linear 3transfor↵1 (~x) 6 .. 7 n n mation T : R ! R given by the n ⇥ n matrix T (~x) = 4 . 5 is onto ↵n (~x) Rn , so by the dimension formula, ker T = 0. So for all ~x 6= 0 in Rn , at least one of the ↵i (~x) 6= 0, so squaring the functions ensures that Q(~x) = ↵1 (~x)

2

+ · · · + ↵n (~x)

2

> 0.

3.5.9 As in Example 3.5.7, we use the substitution u = a following computation: This result has important ramifications for physics: according to the theory of relativity, spacetime naturally carries a quadratic form of signature (1, 3). Thus this space of Hermitian matrices is a natural model for spacetime.

det H = ad

b2

d in the

c2

⇣ u ⌘2 = (u + d)d b2 c2 = d2 + ud + 2 ⇣ u ⌘2 ⇣ u ⌘2 2 2 = d+ b c 2 2 ✓ ◆2 ✓ ◆2 a+d a d = b2 c2 . 2 2 The signature of det H is thus (1, 3).

⇣ u ⌘2 2

b2

c2

Solutions for Chapter 3

3.5.11 a. Set A =



105

a b . Then c d

tr A2 = a2 + d2 + 2bc and

tr A> A = a2 + b2 + c2 + d2 .

b. The signature of tr(A> A) is (4, 0). The trace of A2 can be written a + d2 + 12 (b + c)2 12 (b c)2 , so tr(A2 ) has signature (3, 1). 2

3.5.13 a. First, suppose A is a symmetric matrix. Then the function ~v, w ~ 7! ~v> A~ w satisfies rule 1: (a1 ~v1 + a2 ~v2 )> A~ w = a1 ~v1> A~ w + a2 ~v2> A~ w. Since A is symmetric, and the transpose of a number equals the number, it satisfies rule 2: > ~v ~ > (~v> A)> = w ~ > A> ~v = w ~ > A~v. A~ w} = (~v> A~ w)> = w | {z

a number

In the other direction, given a symmetric bilinear function B, define the matrix A by ai,j = B(~ei , ~ej ). ~ ) = ~v> A~ and define BA by BA (~v, w w. The matrix A is symmetric since B is symmetric, and BA (~ei , ~ej ) = ~e> ej = ai,j = B(~ei , ~ej ), i A~ so the desired relation B = BA holds if applied to the standard basis vectors. But then it holds for the linear combinations of standard basis vectors, i.e., for all pairs of vectors. b. Let Q be a quadratic form: Q(~x) =

X

bi,j xi xj .

1ijn

Make a C with entries ci,i = bi,i and ci,j = 12 bi,j if i 6= j. Then we can take ~ ) 7! ~x> C w ~: B to be the symmetric bilinear function (~x, w 0 1 n X n X X X X ~x> C~x = xi @ ci,j xj A = ci,j xi xj = bi,j xi xj = Q(~x). i

j

i=1 j=1

c. There isn’t very much to show: Z 1 Z (a1 p1 (t) + a2 p2 (t))q(t) dt = a1 0

1ijn

1

p1 (t)q(t) dt + a2

0

0

and Z

0

1

p(t)q(t) dt =

Z

Z

0

1

q(t)p(t) dt.

1

p2 (t)q(t) dt,

106

Solutions for Chapter 3

d. If T : V ! W is any linear transformation, and B is a symmetric bilinear function on W , it is easy to check that the function B(T (~v1 ), T (~v2 )) is always a symmetric bilinear function on V . To find the matrix, compute B

⇣

⌘ Z ei ), p (~ej ) = p (~

1

ti+j

2

dt =

0

1 i+j

1

.

Thus the matrix is Solution 3.5.13, part d: This matrix is called the (k+1)⇥(k+1) Hilbert matrix; it is a remarkable fact that its inverse has only integer entries. You might confirm this in a couple of cases, but proving it is quite tricky.

2

6 1 6 2 6 . 6 . 4 .

1 k+1

3

1 2 1 3

.. .

... ... .. .

1 k+1 1 k+2

1 k+2

...

1 2k+1

1

.. .

7 7 7. 7 5

3.5.15 a. Both (p(t))2 and (p0 (t))2 are polynomials, whose coefficients are homogeneous quadratic polynomials in the coefficients of p. After integrating, the coefficient of tk in (p(t))2 (p0 (t))2 simply gets multiplied by 1/(k + 1). k X Explicitly: setting p = ai ti we get i=0

Q(p) =

k X k X

ai aj

k X k X

ai aj

i=0 j=0

b. We find Z 1⇣ (at2 + bt + c)2 0

1

ti+j

ijti+j

2

dt

0

i=0 j=0

=

Z

✓

1 i+j+1

⌘ (2at + b)2 dt =

17 2 a 15

ij i+j

1

2 2 b + c2 3

◆

.

3 2 ab + ac + bc. 2 3

It seems easiest to remove c, to find Q(p) =

✓ ◆2 b a c+ + 2 3

17 2 a 15

2 2 b 3

3 ab 2

The terms other than the first square are 56 2 a 45

11 2 b 12

11 ab. 6

It seems easier to remove b next, writing 11 11 (a + b)2 + a2 12 12

56 2 a . 45

1 2 a 9

1 2 b 4

1 ab. 3

Solutions for Chapter 3

107

1

We are left with −1

1

−1

Figure 1 for Solution 3.5.17, part a. The ellipse of equation 2

2

2x + 2xy + 3y = 1.

59 2 a . So the quadratic form is 180

✓ ◆2 a b c+ + 3 2

11 (a + b)2 12

59 2 a , 180

with signature (1, 2).

3.5.17 a. The quadratic form corresponding to this matrix is   2 1 x [x y] = 2x2 + 2xy + 3y 2 . 1 3 y Completing squares gives 2x2 + 2xy + 3y 2 = 2(x + y/2)2 + 5y 2 /2, so the quadratic form has signature (2, 0), and the curve of equation 2x2 + 2xy + 3y 2 = 1 is an ellipse, drawn in the margin (Figure 1).

3 2 1 0 b. The quadratic form corresponding to the matrix A = 4 1 2 1 5 is 0 1 2 ✓⇣ ◆ ⌘ ⇣ ⌘ ⇣ y 2 y 2 y ⌘2 2 2 2 2(x + y + z + xy + yz) = 2 x + + z+ + p , 2 2 2

Figure 2 for Solution 3.5.17, part b

z

2

which has signature (3, 0). This formula describes an ellipsoid. It is shown in Figure 2. 2 3 2 0 3 c. The quadratic form corresponding to the matrix 4 0 0 0 5 is 3 0 1 p 2x2 + 6xz z 2 = (x + 3z)2 ( 8z)2 . It is degenerate, of signature (1, 1), and the surface of equation 2x2 + 6xz z 2 = 1 is a hyperbolic cylinder. As above this can be parametrized by x = cosh s

3 sinh s p 8

y=t x

y

Figure 3 for Solution 3.5.17, part c. The hyperbolic cylinder.

sinh s z= p . 8 The resulting surface is shown in Figure 3; its intersection with the (x, z)-plane is shown in Figure 4. 2 3 2 4 3 d. The quadratic form corresponding to the matrix 4 4 1 3 5 is 3 3 1 2x2 + y 2 z 2 + 8xy 6xz + 6yz. Writing this as a sum or di↵erence of squares is a bit more challenging; we find ✓ ◆2 ✓ ◆2 3 9 71 y + 4x + y 14 x z + z2, 2 14 28

108

Solutions for Chapter 3

3

−3

3

so the quadratic form is of signature (2, 1). Again the surface of equation 2x2 + y 2 z 2 + 8xy 6xz + 6yz = 1 is a hyperboloid of one sheet, shown in Figure 5. (Parametrizing this hyperboloid was done as in part b, but the computations are quite a bit more unpleasant.)  1 2 e. The quadratic form corresponding to the matrix is 2 4 x2 + 4xy + 4y 2 = (x + 2y)2 .

−3

Figure 4. for Solution 3.5.17, part c. The hyperbola in the (x, z)-plane over which we are constructing the cylinder.

It is degenerate, of signature (1, 0), and the curve of equation (x + 2y)2 = 1 is the union of the two lines x + 2y = 1 and x + 2y = 1; see Figure 6. 3.6.1 a. Since all the partials of f are continuous, f is di↵erentiable on all of R3 . In fact, 2

4Df

!3 x y 5 = [ 2x + y z

x + sin y

2z ] ,

which vanishes at the origin,

and is thus a critical point by Definition 3.6.2. b. We have Figure 5 for Solution 3.5.17, part d. The hyperboloid of one sheet of part d.

2 Pf,0 (h) =

h22 + h23 2

and

3

h21 + h1 h2 + −3

3

h22 + h23 = 2

✓ ◆2 h2 h2 h1 + + 2 + h23 , 2 4

so the critical point has signature (3, 0) and is a local minimum by Theorem 3.6.8. 3.6.3 By Proposition 3.5.11 there exists a subspace W of Rn such that Q is negative definite on W . If a quadratic form on W is negative definite, then by Proposition 3.5.15, there exists a constant C > 0 such that

−3

Figure 6 for Solution 3.5.17, (e) The two lines x + 2y = 1 and x + 2y = 1 of part e. Solution 3.6.3: For |t| > 0 sufficiently small, r(t~ h) < C|~ h|2 . t2 so we have f (a + t~ h) t2

1 + h21 + h1 h2 +

f (a)

Q(~x) 

C|~x|2

for all ~x 2 Rn .

Thus if ~h 2 W , and t > 0, there exists C > 0 such that f (a + t~h) t2

f (a)

=

t2 Q(~h) + r(t~h)  t2

r(t~h) C|~h|2 + 2 , t

where t2 Q(~h) is the quadratic term of the Taylor polynomial, and r(~h) is the remainder. Since r(t~h) = 0, t!0 t2 lim

< 0.

it follows that f (a + t~h) < f (a) for |t| > 0 sufficiently small.

Solutions for Chapter 3

109

3.6.5 a. The critical points are the points where D1 f , D2 f , and D3 f vanish, i.e., the points satisfying the three equations y

z + yz = 0

x + z + xz = 0 y

x + xy = 0.

If we use the first two to express x and y in terms of z and substitute into the third equation, we get ✓ ◆ z z 2 = 0. z+1 z+1

A polynomial function like f is its own Taylor polynomial at the origin (but not elsewhere).

This leads to the 0 two roots 0 1 1 z = 0 and z = 0 2 points @ 0 A and @ 2 A. 0 2

2, and finally to the critical

b. The quadratic terms at the first point (the origin) are uv + vw see the margin note. Set u = s v, and get rid of u, to find ⇣ s ⌘2 1 sv v 2 + vw sw + vw = v w + w2 + s2 . 2 4

uw;

The origin is a saddle, with signature (2, 1). Here we turn the polynomial into a new polynomial, whose variables are 0 the 1 increments from the 2 point @ 2 A. 2

At the other critical point, set x=

2 + u,

y = 2 + v,

z=

2+w

and substitute into the function. We find 4

uv

vw + uw + uvw.

The quadratic terms are uv vw + uw, exactly the opposites of the quadratic terms at the first point, so again we have a saddle, this time with signature (1, 2).

Solution 3.6.7: Remember that 0

0

(f g) = gf + f g

0

(ef )0 = ef f 0 . In part b, we use equation 3.4.5 for the Taylor polynomial of ex .

3.6.7 a. We compute h ⇣ ⌘i h 2 Df x = 2x(1 + x2 + y 2 )ex y

y2

, 2y 1

2

(x2 + y 2 ) ex

y2

i

.

Thus D1 f = 0 when x = 0, and D2 f = 0 when y = 0 or x2 + y 2 = 1. Thus 2 2 the critical points x= ⇣ occur ⌘ ⇣ when ⌘ ⇣ 0 and ⌘ either y = 0 or x + y = 1, that 0 . is, at the points 00 , 01 , and 1

b. The second-degree terms of the Taylor approximation of f about the origin are x2 and y 2 . (Just multiply x2 + y 2 and eu = 1 + u + u2 /2 + · · · , where u = x2 y 2 , and take the second-degree terms.) Since x2 + y 2 is a positive definite quadratic form, f has a local minimum at the origin. In order to characterize the other two critical points, we first compute the

110

Solutions for Chapter 3

three second partial derivatives: ⇣ ⌘ ⇣ ⌘ 2 2 2 2 2 4 D12 f x = 2 + 10x + 2y + 4x y + 4x e(x y ⇣ ⌘ 2 2 (x2 y 2 ) D1 D2 f x y = 4xy(x + y )e ⇣ ⌘ ⇣ ⌘ 2 2 2 2 2 4 D22 f x = 2 2x 10y + 4x y + 4y e(x y

y2 )

y2 )

.

⇣ ⌘ ⇣ ⌘ 0 , we have D2 = 4/e, D D f = 0, D2 f = At both 01 and 1 2 1 2 1 Thus the second-degree terms of the Taylor polynomial are D12 f h21 2h2 = 1 2! e which give the quadratic form

and

D22 f h22 = 2!

4/e.

2h22 , e

2h21 2h22 e e at both points. This form has signature (1, 1), so f has a saddle at both of these critical points. 3.7.1 Let us call the three sides of the box x, 2x, and y. Then the problem is to maximize the volume 2x2 y, subject to the constraint that the surface area is 2(2x2 + xy + 2xy) = 10. This leads to the Lagrange multiplier problem [ 4xy

2x2 ] =

[ 4x + 3y

3x ] .

In the resulting two equations 4xy = (4x + 3y)

(1)

2

2x = (3x),

(2)

we see that one solution to equation 2 is x = 0, which is certainly not the maximum. The other solution is 2x = 3 . Substitute this value of into equation 1, to find 4x = 3y, and then the corresponding value of y in q terms of x into the constraint equation, to find x = maximum volume r 20 5 V = . 9 6

5 6.

This leads to the

3.7.3 a. Suppose we use Lagrange multipliers, computing the derivative of ' = x3 + y 3 + z 3 and the derivatives of the constraint functions F1 = x + y + z = 2 and F2 = x + y z = 3. Then at a critical point, D' = [3x2 , 3y 2 , 3z 2 ] = i.e.,

r

x = ±y = ±

1

+ 3

2

1 [1, 1, 1]

+

2 [1, 1,

r

and z = ±

1

1];

2

3

.

Solutions for Chapter 3

Solution 3.7.3, part a: Here it would be easier to find the critical points using a parametrization. In this case the constraints are linear, so finding a parametrization is easy. But usually finding a parametrization requires the implicit function and is computationally daunting, and less straightforward than using Lagrange multipliers (also computationally daunting).

Solution 3.7.3, part b: If you can find critical points, analyzing them using the Hessian matrix approach is systematic.

111

Adding the constraints gives 2(x + y) = 5, hence we must use the positive square root for both x and y. Similarly, subtracting the constraints gives 2z = 1, so for z we must use the negative square root: r r 1+ 2 1 2 x=y= and z = . 3 3 Substituting these values in the constraint functions tells us that 87 63 and , 1 = 2 = 32 32 0 1 5/4 which we can use to determine that there is a critical point at @ 5/4 A . 1/2 b. We can’t immediately conclude that this critical point is a minimum or a maximum of the constrained function because the domain of that function is not compact. We will analyze the critical point two ways: using a parametrization that incorporates the constraints, and using the augmented Hessian matrix. In this case, a parametrization is computationally much easier. First let us use a parametrization. If we write the constraint functions in matrix form and row reduce, we get   1 1 1 2 1 1 0 5/2 , which row reduces to ; 1 1 1 3 0 0 1 1/2 i.e.,

x + y = 5/2 and z = Substituting x = 5/2 (5/2

The augmented Hessian matrix H is defined in equation 3.7.43. Since F1 = x + y + z = 2 F2 = x + y [DF] =



1 1

z = 3, 1 1

1 . 1

3

y) + y 3

y and z =

1/2.

1/2 in the function x3 + y 3 + z 3 gives

1/8 = (5/2)3

3(5/2)2 y + 3(5/2)y 2

1/8,

which is the equation for a parabola; the fact that the coefficient for y 2 is positive tells us that the parabola is shaped like a cup, and therefore has a minimum. Now we will use the augmented Hessian matrix. Since the constraint functions F1 and F2 are linear, their second derivatives are 0, so the only nonzero entries of the matrix B of equation 3.7.42 come from the constrained function f , which has second derivatives 6x, 6y, and 6z: 2 3 6x 0 0 B = 4 0 6y 0 5 . 0 0 6z 0 1 5/4 Thus the augmented Hessian matrix at the critical point @ 5/4 A is 1/2 2 3 5 6x · 4 0 0 1 1 0 6y · 54 0 1 17 6 6 7 H=6 0 0 6z · 12 1 17, 4 5 1 1 1 0 0 1 1 1 0 0

112

Solutions for Chapter 3

which (denoting by u and v the two additional variables represented by that matrix) corresponds to the quadratic form 15 2 15 2 x + y 3z 2 2ux 2xv 2yu 2yv 2zu + 2zv. 2 2 After some (painful) arithmetic, we express this as a sum of squares: !2 !2 r r r r r r 15 2 2 15 2 2 + x u v + y u v 2 15 15 2 15 15 ✓ ◆2 p u v 1 80 2 p 3z + p + (u 9v)2 v , 15 15 3 3 which has signature (3, 2). Since m = 2, Theorem 3.7.13 tells us that (3, 2) = (p + 2, q + 2). So the signature of the constrained critical point is (1, 0), which means that the constrained critical point is a minimum. Solution 3.7.5: We will compute the volume in one octant, then multiply that volume by 8.

3.7.5 The volume of the parallelepiped in the first octant (where x y 0, z 0) is gotten by maximizing xyz subject to the constraint

0,

x2 + 4y 2 + 9z 2 = 9. The Lagrange multiplier theorem says that at such a maximum, there exists a number such that [yz, xz, xy] = [2x, 8y, 18z], i.e., we have the three equations yz = 2 x, xz = 8 y, xy = 18 z. Multiply the first by x, the second by y, and the third by z, to find xyz = 2 x2 = 8 y 2 = 18 z 2 . This leads to x2 = 4y 2 = 9z 2 , hence 3x2 = 9, or p p p 3 3 x = 3, y = , z= , 2 3 p and finally the maximum of xyz is 3/2. There are eight octants, so the p total volume is 4 3. 3.7.7 a. Using the hint, write z = 1c (1 ax by). The function now becomes ⇣ ⌘ 1 xyz = xy(1 ax by) = F x y , c with partial derivatives ⇣ ⌘ 1 y 2 D1 F x y = c (y 2axy by ) = c (1 2ax by), ⇣ ⌘ 1 x 2 D2 F x 2bxy) = (1 ax 2by). y = c (x ax c There are clearly four critical points: 0 1 0 1 0 1 0 1 1/(3a) 0 0 1/a @ 0 A , @ 1/b A , @ 0 A , @ 1/(3b) A . 1/c 0 0 1/(3c)

Solutions for Chapter 3

113

b. At these four points the Hessian matrices (matrices whose entries are second derivatives) are     1 0 1 1 1 1 2a/b 1 0 1 (2a)/(3b) 1/3 , , , . 1 0 1 2b/a 1/3 (2b)/(3a) c 1 0 c c c

By Proposition 3.5.17, each of these symmetric matrices uniquely determines a quadratic form; by Theorem 3.6.8, these quadratic forms can be used to analyze the critical points. The quadratic form corresponding to the first matrix is ⌘ 2 1⇣ xy = (x + y)2 (x y)2 , c 2c with signature (1, 1), so the first point is a saddle point. Since nothing made the z-coordinate special, the second and third points are also saddles. (This could also be computed directly.) The fourth corresponds to the quadratic form ! r r ✓ ◆ 2 a 2 b 2 2 ⇣ a 1 b ⌘2 3b 2 x + y + xy = x+ y + y . 3c b a 3c b 2 a 4a Thus the quadratic form is negative definite, and the function is a maximum. (You could also do this conceptually: The region ax + by + cz = 1, x, y, z

0

is compact, and the function is nonnegative there. Thus it must have a positive maximum; since the function is equal to 0 at the other critical points, this maximum must be the fourth critical point.) 3.7.9 If A is any square matrix, then QA (~a + ~h) QA (~a)

~h> A~a ~a> A~h = (~a + ~h)> A(~a + ~h) ~a> A~a ~h> A~a = ~a> A~a + ~h> A~a + ~a> A~h + ~h> A~h ~a> A~a ~h> A~a ~a> A~h = ~h> A~h.

~a> A~h

So 0  lim

~ h!~ 0

QA (~a + ~h)

QA (~a)

~h> A~a

~a> A~h

|~h|

|~h> A~h| |~h|2 A  lim =0 ~ ~ h!~ 0 h!~ 0 |~ |~h| h|

= lim

Thus the derivative of QA at ~a is the linear function ~h 7! ~h> A~a + ~a> A~h. In our case A is symmetric, so ~h> A~a = ~a> A~h, justifying equation 3.7.52. We can get the same result using the definition of the directional derivative. The directional derivative of QA is ⌘ ⌘ 1⇣ 1⇣ ~a + k~h · A(~a + k~h) ~a · A~a lim QA (~a + k~h) QA (~a) = lim k!0 k k!0 k ⇣ ⌘ = lim ~a · A~h + ~h · A~a + k~h · A~h (1) k!0 ~ ~ = ~a · (Ah) + h · A~a = ~a> A~h + ~h> A~a.

114

Solutions for Chapter 3

This linear function of ~h depends continuously on a, so the function QA is di↵erentiable, and its derivative is [DQA (~a)]~h = ~a> A~h + ~h> A~a. Using the properties of transposes on products and noting that A is symmetric, we see that (~h> A~a)> = ~a> A> ~h = ~a> A~h. Since ~h> A~a is a number, it is symmetric, so we also have ~h> A~a = ~a> A~h. Substituting this into the last line of equation 1 yields ⌘ 1⇣ lim QA (~a + k~h) QA (~a) = 2~a> A~h. k!0 k 3.7.11 The curve C is a hyperbola. Indeed, take a point of the plane curve C 0 of equation 2xy + 2x + 2y + 1 = 0, and set z = 1 + x + y. Then z 2 = x2 + y 2 + 1 + 2xy + 2x + 2y = x2 + y 2 , so this point is both on the plane and on the cone. The equation for C 0 can be written (x + 1)(y  + 1) = 1/2, 1 so C 0 is the hyperbola of equation xy = 1/2, translated by . 1 In particular, C stretches out to infinity, and there is no point furthest from the origin. But the hyperbola is a closed subset of R3 , and each branch contains a point closest to the origin, i.e., a local minimum. We are interested in the extrema of the function x2 + y 2 + z 2 (measuring squared distance from the origin), constrained to the two equations defining C. At a critical point of the constrained function, we must have [2x, 2y, 2z] =

1 [2x,

2y,

2z] +

2 [1,

1,

1].

The first two of these equations, 2x = 1 2x + 2 and 2y = 1 2y + 2 , imply that either 1 = 1 and 2 = 0 or that x = y. The hypothesis 2 = 0 gives z = 0, hence the first constraint (equation of the cone) becomes x2 + y 2 = 0, i.e., x = y = 0. This is incompatible with the second constraint. So we must have x = y. Then the two equations 2x2 z 2 = 0 and 2x z = 1 give 2x2 + 4x + 1 = 0. Solving this quadratic equation, we find the two roots p p 2+ 2 2+ 2 x1 = and x2 = . 2 2 These give the points p p 0 1 0 1 ( 2 p2)/2 ( 2 + p2)/2 @( 2 2)/2 A and @ ( 2 + p 2)/2 A . p 1 2 1+ 2 p The p distances squared from the origin to these points are 6 + 4 2 and 6 4 2 respectively. Since both of the local minima of the distance function are critical points, these two points are both local minima.

Solutions for Chapter 3

115

3.7.13 a. The partial derivatives of f are 0 1 0 1 x x D1 f @ y A = y 1, D2 f @ y A = x + 1, z z

0 1 x D3 f @ y A = 2z. z

The only point at which all three vanish is x0 , so this is the only critical point of f . Since f is a quadratic polynomial, its Taylor polynomial of degree two at x0 is itself: 0 1 0 1 1 + hx 1 + hx 2 @ 1 + hy A = f @ 1 + hy A Pf,x 0 0 + hz 0 + hz = ( 1 + hx )(1 + hy ) =

1 + hx

( 1 + hx ) + (1 + hy ) + (hz )2

hy + hx hy + 1

hx + 1 + hy + h2z

= 1 + hx hy + h2z . The quadratic terms yield the quadratic form ⌘ 1⇣ (hx + hy )2 (hx hy )2 + h2z , 4 which has signature (2, 1). Therefore x0 is a saddle of f (Definition 3.6.9). b. The function F is our constraint function. Its derivative is 2 !3 x 4DF y 5 = [ 2(x + 1) 2(y 1) 2z ] . z

By the Lagrange multiplier theorem, a constrained critical point on Sp2 (x0 ) occurs at a point where 8 2 2 !3 !3 > x x < y 1 = 2 (x + 1) 4Df y 5 = 4DF y 5 , i.e., x + 1 = 2 (y 1) . > : z z 2z = 2 z

The final equation says either = 1 or z = 0. If z 6= 0, solving the other equations yields x =0 1, y1=01. Then 1 the 1 1 p constraint function F shows that z = ± 2. Thus @ p1 A, @ p1 A are 2 2 constrained critical points of f . If z = 0, then the first two equations imply that = ±1/2 or = 0. But if = 0, then y = 1 and x = 1, yielding the point x0 , which is not on Sp2 (x0 ). Therefore = ±1/2. In the case = 1/2, we get y = x + 2, while if = 1/2, we have y = x. These, plus the equation of the sphere, lead to x = 0 or x = 2. Thus 0 1 0 1 0 1 0 1 0 2 0 2 @2A, @ 0A, @0A, @ 2A 0 0 0 0 are also constrained critical points of f .

116

Solutions for Chapter 3

c. We showed in part a that the only critical point in the interior of the ball is at x0 , and that this point is a saddle; therefore it cannot be an extremum. We compute f at the constrained critical points (on the surface of the ball) found in part b: 0 1 0 1 0 1 0 1 1 1 0 2 f @ p1 A = f @ p1 A = 3, f @ 2 A = f @ 0 A = 2, 2 2 0 0 0 1 0 1 0 2 f @ 0 A = 0, f @ 2A = 0 0 0 Therefore the maximum value of f on Bp2 (x0 ) is 3 and the minimum value is 0. Note that at the critical point x0 we have f (x0 ) = 1, which does indeed lie between the maximum and minimum values. 3.7.15 a. The set X1 =

n⇣ ⌘ x 2X y

x2 + y 2  4

o

⇢ R2

is closed and bounded, hence compact, and the function f has a minimum on X1 . ⇣ ⌘ ⇣ ⌘ ⇣ ⌘ The point 10 is on X1 , and f 10 = 1. Since at every x y 2 X X1 ⇣ ⌘ we have f x y > 4, the minimum of f on X1 is a global minimum. b. The Lagrange multiplier equation is ⇥ [2x, xy] = 3(x 1)2 ,

The second equation 2y =

2 y implies either

⇤ 2y .

2

• y = 0, which by y = (x 1)3 gives x = 1, so the first equation gives 2 = 0, a contradiction, or • = 1. Then 2x = 3(x 1)2 , which gives 2 x = x2 3

2x + 1, i.e., x2

which has no real solutions, since then x = Figure for Solution 3.7.15 The curve X ⇢ R2 . It is defined by ✓ ◆ 3 F x y 2 = 0. y = (x 1)

4 x + 1 = 0, 3 p 2 4/9 1. 3 ±

⇣ ⌘ 1 is the 0 closest point to the origin, hence the minimum of x2 + y 2 on X. To see this without a drawing, note that y 2 = (x 1)3 implies that 3 (x 0, hence x 1, so x2 + y 2 1, with the unique minimum at ⇣ ⌘ 1) 1 . 0 The Lagrange multiplier theorem (Theorem 3.7.5) requires that [DF ] be surjective. Here, h ⇣ ⌘i ⇥ ⇤ DF x = 3(x 1)2 , 2y y c. The curve is shown in the figure in the margin. The point

Solutions for Chapter 3

is not surjective at

117

⇣ ⌘ 1 . 0

3.8.1 The outcome “total 2” has probability “total 3” has probability “total 4” has probability “total 5” has probability “total 6” has probability “total 7” has probability “total 8” has probability “total 9” has probability “total 10” has probability “total 11” has probability “total 12” has probability

1/100 2/100 3/100 2/20+2/100; 2/20+ 3/100 2/20+4/100 1/4+4/100 2/20+2/100 2/20+1/100 2/100 1/100

3.8.3 a. We have E(f ) =

1 X

f (n) P({s}) =

n=1

Figure for Sol. 3.8.3, part c Daniel Bernoulli (1700–1782) Consider, Bernoulli wrote, a very poor man waiting for a drawing in which he is equally likely to get nothing or twenty thousand ducats: since the expectation is ten thousand ducats, would he be foolish if he sold his right to the drawing for nine thousand? “I do not think so,” Bernoulli wrote, “although I think that a very rich man would go against his interest if, given the chance of buying this right for that same price, he declined.” Bernoulli’s comment justifies insurance: the expected value of insurance is always less than the premiums you pay, but insuring against losses you can’t a↵ord still make sense.

1 X

n

n=1

1 1 2 3 = + 2 + 3 + · · · = 2, n 2 2 2 2

since |x| < 1, we have 1 X

xn =

n=1

1 1

x

,

so di↵erentiating gives

1 X

nxn

1

n=1

=

1 (1

x)2

,

which gives 1 X

n=1

nxn =

x (1

, x)2

hence

1 X

n=1

n·

1 1/2 = = 2. 2n (1/2)2

b. E(f ) =

1 X

n=1

f (n) P({s}) =

1 X

n=1

2n

1 =1 2n

c. In theory, you should be willing to ante up any sum less than the expected value, so for f , any sum less than 2. For g, the first thing to find out is how solvent the “bank” is. Obviously no bank can pay out the expected value, so to determine the real expectation you would need to know the maximum the bank will pay. Another thing to consider is how much you are willing to lose; consider the comments of Daniel Bernoulli. 3.8.5 a. The following computation shows that if ~v is in the kernel of A> A, it is in the kernel of A:

118

Solutions for Chapter 3

0 = ~v · ~0 = ~v · A> A~v

= ~v> A> A~v = (~v> A> )> · A~v = (A~v) · (A~v) = |A~v|2 .

(1)

Similarly, any vector in the kernel of AA> is in the kernel of A> . Equation 1: It is always true ~ =~ ~ since that A~ v·w v · A> w, ~ = (A~ ~ =~ ~ (A~ v) · w v)> w v> A> w

b. Since A> A and AA> are symmetric, by the spectral theorem there are orthonormal eigenbases ~v1 , . . . , ~vm of Rm for A> A and ~u1 , . . . , ~un of Rn for AA> . Suppose that A> A~vi =

~ =~ v · (A> w).

and AA> ~uj = µj ~uj ,

vi i~

labeled so that 1

···

k

>

k+1

= ··· =

m

= 0 and µ1

···

µl > µl+1 = · · · = µn = 0.

Let V be the span of ~v1 , . . . , ~vk and let W be the span of ~u1 , . . . , ~ul . For 1 def 1 def each i  k, set ~vi0 = p A~vi , and for each j  l set ~u0j = p A> ~uj . µj i Now let us check that the ~vi0 , i = 1, . . . , k are an orthonormal set in W and the ~u0j are an orthonormal set in V . Indeed, the ~vi0 are orthonormal: ~vi0 1 · ~vi0 2 = p =p =

s

1 i1 i2

1 i1 i2 i2 i1

(A~vi1 ) · (A~vi2 ) ~vi1 · (A> A~vi2 )

~vi1 · ~vi2 =

⇢

0

if i1 6= i2

1

if i1 = i2

and they do belong to W , because they are orthogonal to ~ul+1 , . . . , ~un : if j > l then 1 ~vi0 1 · ~uj = p

i1

1 (A~vi1 ) · ~uj = p

i1

(vi1 ) · A> uj = 0.

The same argument works for the ~u0j . Since an orthonormal set is linearly independent, we have k  l and ~ 1, . . . , w ~ n, l  k so k = l, and finally the basis ~v1 , . . . , ~vm and the basis w where ~ 1 = ~v10 , . . . , w ~ k = ~vk0 , w ~ k+1 = ~uk+1 , . . . , w ~ n = ~un w are bases answering the question. 3.9.1 a. If a = b = 1, so that the ellipse is a circle with radius 1, by part 2 2 b we have  = 1. If the ellipse is a circle of radius r, we have xr2 + yr2 = 1, and again using part b we have =

r8 1 = ; 6 3/2 r (r )

the larger the circle, the smaller the curvature.

Solutions for Chapter 3

119

b. The ellipse of equation

b

x2 y2 + 2 =1 2 a b a

Figure for Solution 3.9.1 The ellipse given by x2 y2 + 2 =1 2 a b

(shown in the figure at left) is the union of the graphs of bp 2 bp 2 f (x) = a x2 and g(x) = a x2 . a a ✓ ◆ ✓ ◆ x x Clearly the curvatures at and are equal, so we will f (x) g(x) treat only f . Rather than di↵erentiating f explicitly, we will use implicit di↵erentiation to find b2 x a2 y

y0 =

and y 00 =

b4 , a2 y 3

leading to (x) =

Solution 3.9.3: We discuss the e↵ect of the choice of point on the sign of the mean curvature at the end of the solution.

|y 00 |

1 + (y 0 )2

3/2

=

b4 1 4 2 3 a y 1 + b 4 x22 a y

3/2

=

a4 b4 a4 y 2 + b4 x2

3/2

.

(1)

x2 y2 Since we know that 2 + 2 = 1 gives the ellipse shown at left, we a b may wish to see whether this value for the curvature seems reasonable for certain values of a, b, and x. For example, if a = b = 1, the ellipse is the unit circle, which should have curvature 1. This is indeed what equation 1 gives. At x = 0 the curvature is b/a2 , while at x = a it is a/b2 , a reciprocal relationship that seems reasonable. 0 1 0 3.9.3 First, note that any point can be rotated to the point a = @ 0 A, so it 1 ⇣ ⌘ p x is enough to compute the curvatures there. Set z = f y = 1 x2 y 2 . Then near a the “best coordinates” are X = x, Y = y, Z = z 1, so that p Z = z 1 = 1 X 2 Y 2 1.

To determine the Taylor polynomial of f at the origin, we use equation 3.4.9: (1 + x)m = 1 + mx +

m(m 1) 2 m(m x + 2!

1)(m 3!

2)

x3 + · · ·

(1)

with m = 1/2, to get Note that the quadratic terms of equation 2 do not correspond to the quadratic terms of equation 1.

1 Z ⇡ 1 + ( X2 2

1 Y )+ 2 2

✓ 1

1 2

◆

1 ( X2 2!

The quadratic terms of the Taylor polynomial are X2 2

Y2 . 2

Y 2 )2 + · · ·

1.

(2)

120

Solutions for Chapter 3

Since in an adapted coordinate system the quadratic terms of the Taylor polynomial are 1 (A2,0 X 2 + 2A1,1 XY + A0,2 Y 2 ), 2 In this simple case, the “best coordinates” are automatic. Even if you did not use the X, Y, Z notation when you computed the Taylor polynomial you would have found no first-degree terms. Thus if you tried to apply Proposition 3.9.10 you would have run into q c = a21 + a22 = 0, and all the expressions that involve dividing by c or by c2 would not make sense. That proposition only applies to cases where at least one of a1 and a2 is nonzero.

we have A2,0 = A0,2 = then gives

1 and A1,1 = 0. Definition 3.9.7 of mean curvature

1 (A2,0 + A0,2 ) = 2 and Definition 3.9.8 of Gaussian curvature gives mean curvature =

1

Gaussian curvature = A2,0 A0,2 A21,1 = 1. 0 1 0 What if you had chosen to work near @ 0 A? In a neighborhood of this 1 ⇣ ⌘ p x 2 2 point, z = f y = 1 x y , so setting X = x, Y = y, Z = z + 1, p we would have Z = z + 1 = 1 X 2 Y 2 + 1, so that ✓ ◆ 1 1 1 1 2 2 Z⇡ 1 ( X Y ) 1 ( X 2 Y 2 )2 · · · + 1. 2 2 2 2! This gives a mean curvature of +1. Recall that in the discussion of “best coordinates for surfaces,” we said that an adapted coordinate system for a surface S at a “is a system where X and Y are coordinates with respect to an orthonormal basis of the tangent plane, and the Z-axis is the normal direction.” Of course there are two normal directions: pointing up on the Z-axis or pointing down. If we 0 1 0 assume that the normal is pointing up, then when we chose a = @ 0 A, we 1 chose a point at which the sphere is curving away from the direction of the normal,0 giving1a negative mean curvature. When we redid the computation 0 at a = @ 0 A, then the sphere was curving towards the direction of the 1 normal, giving a positive mean curvature. ⇣ ⌘ x 3.9.5 By equation 3.9.5, the curvature of the curve at f (x) is =

|f 00 (x)| . The surface satisfies (1 + (f 0 (x))2 )3/2

a1 = f 0 (x), a2 = 0, a2,0 = f 00 (x), a1,1 = a0,2 = 0, hence c2 = (f 0 (x))2 , so equation 3.9.38 says that the absolute value of the mean curvature is |f 00 (x)| |H| = , 2(1 + (f 0 (x))2 )3/2 which is indeed half the curvature as given by Proposition 3.9.2.

Solutions for Chapter 3 Solution 3.9.7 is intended to illustrate Figure 3.9.7. A goat tethered at the North Pole with a chain 2533 kilometers long would have less grass to eat (assuming grass would grow) than a goat tethered at the North Pole on a “flat” earth.

121

3.9.7 a. Remember (or look up) that the radius of the earth is 6366 km.3 Remember also that the inclination of the axis of the earth to the ecliptic is 23.45 . (Again, this isn’t exact, and the inclination varies periodically with a period of about 30 000 years and amplitude ⇡ .5 .) Thus the real radius of the arctic circle (the radius as measured in the plane containing the arctic circle) is 6366 sin(23.45 ) = 2533 km, and its circumference is 2⇡2533 ⇡ 15930 km. If the earth were flat, the circumference of a circle with the radius of the arctic circle as measured from the pole would be 2⇡2607.5 ⇡ 16 383 km. b. By the same computation as above, the equation for the angle (in radians) between the pole and a point of the circumference, with vertex at the center of the earth, is 2⇡6366✓

2⇡6366 sin ✓ = 1,

i.e., ✓

sin ✓ =

1 . 2⇡ · 6366

A computer will tell you that the solution to this equation is approximately ✓ ⇡ .05341 radians. If you don’t have a computer handy, a very good approximation is obtained by using two terms of the Taylor expansion for sine, which gives (6/(2⇡ · 6366))1/3 ⇡ .05313. The corresponding circle has radius on earth about 340 km. 3.9.9 a. The hypocycloid looks like the solid curve in the margin (top). Figure for Solution 3.9.9 The hypocycloid of part a is the curve inscribed in the dotted circle.

b. If you ride a 2-dimensional bicycle on the inside of the dotted circle, and the radius of your wheels is 1/4 of the radius of the circle, a dot on the rim will describe a hypocycloid, as shown in the bottom figure at left. c. Using Definition 3.9.5, the length is given by Z ⇡/2 q 4a (3 sin2 t cos t)2 + (3 sin t cos2 t)2 dt 0

= 12a = 6a

Z

Z

⇡/2

0 ⇡/2

q sin2 t cos2 t(sin2 t + cos2 t) dt

2 sin t cos t dt = 6a.

0

Figure for Solution 3.9.9 Part b: A dot on the rim of this 2-dimensional bicycle describes a hypocycloid.

The hypocycloid has four arcs, so the length of one arc is 3a/2. Note that the circle has circumference 2⇡a, so that the lengths of the circle and of the hypocycloid are close. 3.9.11 The map

h i ' : s 7! ~t(s), ~n(s), ~b(s) ,

s2I

3 Actually, the earth is flattened at the poles, and the radius ranges from 6356 kilometers at the poles to 6378 km at the equator, but the easy number to remember is that the circumference of the earth is 40 000 km, so the radius is approximately 40 000/(2⇡) ⇡ 6366 km.

122

Solutions for Chapter 3

(' for “phrenet”) is a mapping I ! O(3). Exercise 3.2.11 says that the tangent space to O(3) at the identity is the space of antisymmetric matrices. If we adapt our Frenet map ' to consider the map h i 1h i ~t(s), ~n(s), ~b(s) , s 2 I, : s 7! ~t(s0 ), ~n(s0 ), ~b(s0 )

part b of Exercise 3.2.11 tells us that this represents a parametrized curve in O(3), such that (s0 ) = I. Thus its derivative 0 (s0 ) is antisymmetric; temporarily let us call it h i 1h i ~t0 (s), ~n0 (s), ~b0 (s) . A = ~t(s0 ), ~n(s0 ), ~b(s0 ) This can be rewritten h i h i ~t0 (s), ~n0 (s), ~b0 (s) = ~t(s0 ), ~n(s0 ), ~b(s0 ) A, which, if A is the antisymmetric matrix 2 0  A = 4 0 0 ⌧ corresponds to

~t0 (s0 ) = 0

~n (s0 ) = ~b0 (s0 ) =

3 0 ⌧ 5, 0

(s0 )~n(s0 ) + ⌧ (s0 )~b(s0 )

(s0 )~t(s0 ) ⌧ (s0 )~n(s0 ).

Solutions for review exercises, chapter 3

3.1 a. The derivative of the mapping x3 + xy 2 + yz 2 + z 3 2

[ 3x + y

2

2xy + z

2

4 is

2

2yz + 3z ] ,

which only vanishes if x = y = z = 0. (Look at the first entry, then the second.) The origin isn’t on X, so X is a smooth surface. 0 1 1 b. At the point @ 1 A, the derivative of F is [ 4 3 5 ]. The tangent 1 plane to the surface is the plane of equation 4x + 3y + 5z = 12. The tangent space to the surface is the kernel of the derivative, i.e., it is the plane of equation 4x˙ + 3y˙ + 5z˙ = 0. 3.3 a. The space X is given by the equations 1)2 + y12 + z12

(x1

2

(x2 + 1) + (x1 2

2

x2 ) + (y1

2

y22

y2 ) + (z1

+ z22 z2 )2

1=0 1=0 4 = 0.

b. The derivative of the mapping R6 ! R3 defining X is

3 2(x1 1) 2y1 2z1 0 0 0 4 5. 0 0 0 2(x2 + 1) 2y2 2z2 2(x1 x2 ) 2(y1 y2 ) 2(z1 z2 ) 2(x2 x1 ) 2(y2 y1 ) 2(z2 z1 ) When

this matrix is

0

1 0 1 x1 1 @ y1 A = @ 1 A , z1 0

0

1 0 1 x2 1 @ y2 A = @ 1 A , z2 0

2

3 0 0 0 0 2 05, 4 0 0

0 2 0 40 0 0 4 0 0

(1)

(2)

which has rank 3 since the first, second, and fifth columns are linearly independent. So X is a manifold of dimension 3 near the points in equation 1. c. The tangent space is the kernel of the matrix (2) computed in part b, i.e., the set of 0 1 0 1 u1 u2 @ v1 A , @ v2 A w1 w2 such that 2v1 = 0, 2v2 = 0, 4(u1 subspace of R6 .

123

u2 ) = 0, which is a 3-dimensional

124

Solutions for review exercises, Chapter 3

3.5 The best way to deal with this is to use the Taylor series for sine and cosine given in Proposition 3.4.2. Set ⇡ ⇡ ⇡ x = + u, y = + v, z = + w, 6 4 3 and compute ✓ ◆ ✓ ◆ ✓ ◆ 3⇡ 3⇡ 3⇡ sin + u + v + w = sin cos(u + v + w) + cos sin(u + v + w) 4 4 4 p ✓ ◆ p ✓ ◆ 2 1 2 1 = 1 (u + v + w)2 u+v+w (u + v + w)3 + · · · 2 2 2 6 p ✓ ◆ 2 1 1 = 1 (u + v + w) (u + v + w)2 + (u + v + w)3 + · · · . Moral: If you can, compute 2 2 6

Taylor polynomials from known expansions rather than by computing partial derivatives.

Solution 3.7: The wrong way to go about this problem is to say 1 3 that sin(x2 + y) = x2 + y y 6 plus higher-degree terms, and then apply equation 3.4.7, saying that ⇣ ⌘ cos 1 + sin(x2 + y) 1 3 ⇡ cos(1 + x + y y ) 6 ✓ ◆2 1 1 3 =1 1 + x2 + y y 2 6

Thus the required Taylor polynomial is p ✓ ◆ 2 1 1 2 3 1 (u + v + w) (u + v + w) + (u + v + w) . 2 2 6 3.7 The margin note discusses the incorrect way to go about this. What we can do is use the formula cos(a + b) = cos a cos b

sin a sin b,

which gives cos 1 + sin(x2 + y) = cos 1 cos(sin(x2 + y))

sin 1 sin(sin(x2 + y)) .

2

+ ··· . This is WRONG (not just a harder way of going about things), because equation 3.4.7 is true only near the origin. At the origin, 1 + sin(x2 + y) = 1, so we cannot treat it as the x of equation 3.4.7, which must be 0 at the origin.

At the origin, sin(x2 + y) = 0, so now we can apply equations 3.4.6 and 3.4.7 (discarding terms higher than degree 3), to get ✓ ◆ (x2 + y)2 2 cos 1 + sin(x + y) = cos 1 1 2 ✓ ◆ 1 2 1 2 sin 1 (x2 + y) (x + y)3 (x + y)3 6 6 cos 1 sin 1 3 = cos 1 (sin 1)y (sin 1)x2 y 2 (cos 1)x2 y y . 2 3 3.9 Both first partials exist and are continuous for all homogeneous polynomials of degree 4, and ⇣ ⇣ ⌘⌘ ⇣ ⇣ ⌘⌘ D2 D1 f y0 = d and D1 D2 f x = b. 0 It follows that

⇣ ⇣ ⌘⌘ D2 D1 f 00

⇣ ⇣ ⌘⌘ D1 D2 f 00 =d

b,

and that the condition for the crossed partials to be equal is that d = b.

Solutions for review exercises, Chapter 3

125

3.11 a. The implicit function theorem says that this will happen if Solution 3.11: We are evaluating Dz (y cos z x sin z) 0 1 r at @ 0 A, not multiplying. 0

Dz (y cos z

0 1 r x sin z) @ 0 A = 0 =

y sin z 0 · sin 0

0 1 r x cos z @ 0 A 0 r cos 0 =

r 6= 0.

b. The hint ⇣ ⌘gives it away: since the x-axis is contained in the surface, we have gr x 0 = 0 for all x, so ⇣ ⌘ ⇣ ⌘ Dx1 gr 0r = Dx2 gr 0r = 0. For a picture of this surface, see Figure 3.9.9. 3.13 a. We have 2

3 1 x y det 4 1 y z 5 = xy + xz + yz x2 y 2 z 2 , 1 z x 2 3 0 x y which certainly is a quadratic form. But det 4 x 0 z 5 = 2xyz certainly y z 0 is not. 2 3 b. We have xy + xz + yz x2 y 2 z 2 = x y2 z2 z)2 , 4 (y so the signature is (0, 2), and the quadratic form is degenerate. 3.15 The quadratic form represented is ax2 + 2bxy + dy 2 . (if) If a+d > 0, at least one of a and d is positive, and then if ad b2 > 0, we must have both a > 0, d > 0. So we can write ✓ ◆ b2 2 ad b2 2 2 2 2 ax + 2bxy + dy = ax + 2bxy + y + y a a ✓ ◆2 b ad b2 2 =a x+ y + y , a a   x 0 which is strictly positive if 6= . y 0 ⇣ ⌘ ⇣ ⌘ (only if) If you apply the quadratic form to 10 and 01 , you get     1 1 0 0 a= ·G and d = ·G , 0 0 1 1

so if the quadratic form is positive definite, you find a > 0, d > 0, hence a + d > 0.  b Now apply the quadratic form to the vector , to find a  ✓  ◆ b a b b · = a(ad b2 ). a b d a

126

Solutions for review exercises, Chapter 3

Since this must also be positive, we find ad b2 > 0. ⇣ ⌘ ⇣ ⌘ 3.17 a. The critical points are 00 and 11 , since at those points the partial derivatives D1 f = 6x

6y and D2 f =

6x + 6y 2 vanish.

b. We have D12 f = 6, D1 D2 f = 6, and D22 f = 12y, so the second2 2 degree⇣term ⌘ of the Taylor polynomial is 3h1 6h1 h2 + 6yh2 . At the critical point 00 , the term 6yh22 vanishes, so Q(~h) = 3(h21

2h1 h2 ) = 3 (h1

h2 )2

(h2 )2 ;

the signature of this quadratic form is (1, 1), and the critical point is a saddle. ⇣ ⌘ At the critical point 11 , we have 6yh22 = 6h22 , so ⇣ ⌘ Q(~h) = 3 h21 2h1 h2 + 2h22 = 3 (h1 h2 )2 + h22 ; the signature of Q is (2, 0), and the critical point is a minimum. 3.19 a. The function is 1 + 2xyz x2 y 2 where all three partials vanish, i.e., where

Since x = yz, we have x2 = x(yz), and so on.

z 2 . The critical points occur

2yz

2x = 0

2xz

2y = 0

2xy

2z = 0.

It from these equations that x2 = y 2 = z 2 = xyz. One solution is 0 follows 1 0 @ 0 A. The other solutions are of the form 0 0 1 0 1 0 1 0 1 a a a a @aA, @ aA, @ aA, @ aA, a a a a

for an appropriate value of a. These are solutions of a = a2 in the first and third, and a = a2 in the second and fourth, so the critical points are 0 1 0 1 0 1 0 1 0 1 0 1 1 1 1 @0A, @1A, @ 1A, @ 1A, @ 1A. 0 1 1 1 1 b. At the origin, the quadratic terms are evidently x2 y 2 z 2 , so that point is a maximum. 0 1 1 At the point @ 1 A, if we set x = 1 + u, y = 1 + v, z = 1 + w, and 1 multiply out, keeping only the quadratic terms, we find 2(1+u)(1+v)(1+w) (1+u)2 (1+v)2 +(1+w)2 = This point is a (1, 2)-saddle.

(u v w)2 +(v+w)2 (v w)2 .

Solutions for review exercises, Chapter 3

127

0

1 1 At the point @ 1 A, if we set x = 1 + u, y = 1 + v, z = 1 multiply out, keeping only the quadratic terms, we find 2(1+u)( 1+v)( 1+w) (1+u)2 ( 1+v)2 +( 1+w)2 =

(u+v+w)2 +(v+w)2 (v w)2 .

This point is also a (1, 2)-saddle. Since the expression for the function is symmetric with respect to the variables, evidently the other two points behave the same way.

D 41

c

d

3.21 a. Consider the figure at left. The cosine law says e2 = a2 + d2

C A '

1 + w, and

e b a

We compute the area of the quadrilateral ABCD by computing the area of the two triangles, using the formula area = 12 ab sin ✓, where ✓ is the angle between sides a and b.

2bc cos .

b. The area of the quadrilateral is given by 12 (ad sin ' + bc sin ). c. Our quadrilateral satisfies the constraint a2 + d2

3 2

B Figure for solution 3.21.

2ad cos ' = b2 + c2

b2

2ad cos '

c2 + 2bc cos

= 0.

So the Lagrange multiplier theorem asserts that at the maximum of the area function, there is a number such that [ad cos ', bc cos ] = [2ad sin ',

2bc sin ].

(To find the critical point of the area, we can ignore the 1/2 in the formula for the area.) This immediately gives cot ' + cot = 0, i.e., opposite angles are supplementary (i.e., sum to 180 ). It follows from high school geometry that the quadrilateral can be inscribed in a circle; see the figure below. D

↵

c d

C

A '

C

e

c 2↵

b a A

B

B Figure for part c, Solution 3.21. Left: Our quadrilateral, inscribed in a circle. Right: An angle inscribed in a circle is half the corresponding angle at the center of the circle. This is key to proving that a quadrilateral can be inscribed in a circle if and only if opposite angles are supplementary, a result you may remember from high school.

128

Solutions for review exercises, Chapter 3

⇣ ⌘ p x = x2 + y 2 gives the following y quantities (see equation 3.9.33), which enter in Proposition 3.9.11: ⇣ ⌘ ⇣ ⌘ a b D1 f ab = a1 = p , D2 f ab = a2 = p , c = 1, 2 2 2 a +b a + b2 b2 ab a2 a2,0 = 2 , a = , a = . 1,1 0,2 (a + b2 )3/2 (a2 + b2 )3/2 (a2 + b2 )3/2 3.23 Computing the derivatives of f

Inserting these values into equation 3.9.37, we see that the Gaussian curvature is K=

a2 b2 a2 b2 = 0. 22 (a2 + b2 )3

By equation 3.9.38, the mean curvature is 1 H= p p . 2 2 a2 + b2 A goat grazing on the flank of a conical volcano, constrained by a chain of a given length, too short for him to reach the top, has access to the same amount of grass as a goat grazing on a flat surface, and constrained by the same length chain.

The quadratic form (1) is called the second fundamental form of a surface at a point. It follows from A21,1 = A0,2 A2,0 that A0,2 and A2,0 are either both positive or both negative. We could also see that the second fundamental form is degenerate by considering the first equation in Solution 3.28, substituting A2,0 for a, A1,1 for 2b, and A0,2 for c: the only way the quadratic form ✓ ◆2 b ac b2 2 a x+ y + y a a in the third line can be degenerate is if ac = b2 , i.e., A21,1 = A0,2 A2,0

Thus the mean curvature decreases as |a| and |b| increase. This is what one would expect: our surface is a cone, so it becomes flatter as we move away from the vertex of the cone. If we draw a circle on the cone near its vertex, and determine the minimal surface bounded by that circle, then repeat the procedure further and further away from the vertex, the di↵erence between the area of the disc on the cone and the area of the minimal surface will decrease with distance from the vertex. One way to explain why the Gaussian curvature is 0 is to look at equation 3.9.32. Our surface is a cone, which can be made out of a flat piece of paper, with no distortion. Thus the area of a disc around a point on the surface (a point not the vertex of the cone) is the same as the area of a flat disc with the same radius. Therefore, K must be 0. For another explanation, note that by Definition 3.9.8, Gaussian curvature 0 implies A21,1 = A0,2 A2,0 . If both A0,2 and A2,0 are positive, this means that the quadratic form 1 (A2,0 X 2 + 2A1,1 XY + A0,2 Y 2 ) 2 (see equation 3.9.28) can be written p p ( A2,0 X + A0,2 Y )2 = A2,0 X 2 + 2A1,1 XY + A0,2 Y 2 . If both A0,2 and A2,0 are negative, it can be written p p 2 A2,0 X A0,2 Y

(1)

Thus the quadratic form (second fundamental form) is degenerate. Why should the second fundamental form of a cone be degenerate? In adapted coordinates, a surface is locally the graph of a function whose domain is the tangent space and whose values are in the normal line; see equation 3.9.28. Thus the quadratic terms making up the second fundamental form are also a map from the tangent plane to the normal line. In

Solutions for review exercises, Chapter 3

129

the case of a cone, the tangent plane to a cone contains a line of the cone, so the quadratic terms vanish on that line. 3.25 This is similar to Exercise 3.7.14 but more complicated, since the second partials of the constraint function F do not all vanish. Therefore, when constructing the augmented Hessian matrix, one must take into account these second derivatives and the Lagrange multipliers corresponding to each critical point. We have D1 f = 2x, D2 f = 2y, D3 f = dz, so D1 D1 f = D2 D2 f = D3 D3 f = 2 D1 D2 f = D2 D1 f = 0 D2 D3 f = D3 D2 f = 0. Moreover, D1 F1 = 2x

D1 F2 =

1

D2 F1 = 2y

D2 F2 =

0

D3 F1 = 0

D3 F2 =

1,

So we have (where B is the matrix in equation 3.7.42) B1,1 = |{z} 2

D1 D1 f

B2,1 = |{z} 0

2 1 |{z}

0 2 |{z}

1 D1 D1 F1

=2

B3,1 = 0

0 1 |{z}

B2,2 = |{z} 2

2 1 |{z}

1 D2 D2 F1

2 D2 D2 F2

B3,3 = |{z} 2

1 D3 D3 F1

0 |{z}

2 D3 D3 F2

D2 D1 f

D2 D2 f

B2,3 = 0

D3 D3 f

1 D2 D1 F1

This gives the augmented Hessian 2 2 2 1 0 2 6 6 H=6 0 4 2x 1

1

0 2 |{z}

=0

0 |{z}

=2

2

1

0 |{z}

=2

2 D2 D1 F2

matrix 0 2 0 2y 0

2

2 D1 D1 F2

1

0 0 2 0 1

We now analyze the four critical points 0 1 0 1 0 1 0 0 1 @1A, @ 1A, @0A, 0 0 1

2x 2y 0 0 0 0

3 1 07 7 17. 5 0 0

1 1 @ 0A. 1

130

Solutions for review exercises, Chapter 3

First critical point

0 1 0 At the critical point @ 1 A, we 0 2 0 6 0 6 H=6 0 4 0 1

If we label our variables A

2AE

have 0 0 0 2 0

1

= 1,

0 0 2 0 1

0 2 0 0 0

2

= 0, so

3 1 07 7 17. 5 0 0

E, this corresponds to the quadratic form 4BD + 2CE + 2C 2 .

Completing squares gives ✓ ◆2 ✓ ◆2 p p E E p + 2A + 2A2 2C + p 2 2

(D + B)2 + (B

D)2 ,

with signature (3, 2). In this case, m = 2 (there are two constraints); therefore, by Theorem 3.7.13, the constrained critical point has signature (1, 0) and the point is a minimum. Second critical point0

1 0 At the critical point @ 1 A, the quadratic form is the same except that 0 we have +4BD rather than 4BD: 2AE + 4BD + 2CE + 2C 2 , which can be written ✓ ◆2 ✓ ◆2 p p E E p + 2A + 2A2 + (D + B)2 2C + p 2 2

(B

D)2 .

Again, the constrained critical point is a minimum, with signature (1, 0). Third critical point

0 1 1 At the critical point @ 0 A, we have 1 augmented Hessian matrix is 2 2 0 0 2 0 6 0 6 H=6 0 0 2 4 2 0 0 1 0 1 This corresponds to the quadratic form 2A2

2B 2 + 2C 2

4AD

1

2 0 0 0 0

= 2 and

2

3 1 07 7 17. 5 0 0 2AE + 2CE,

=

2, and the

Solutions for review exercises, Chapter 3

which can be written ✓ ◆2 ✓ ◆2 p p E E p + 2A 2C + p 2 2

131

2B 2

(D + A)2 + (A

D)2

This has signature (2, 3), so the constrained critical point has signature (0, 1); it is a maximum. Fourth critical point 0

1 1 At the critical point @ 0 A, we have 1 2 2 0 0 2 0 6 0 6 H=6 0 0 2 4 2 0 0 1 0 1

1

2 0 0 0 0

This corresponds to

2A2

2B 2 + 2C 2 + 4AD

which can be written ✓ ◆2 ✓ ◆2 p p E E p + 2A 2C + p 2 2

=

2

= 2, giving 3 1 07 7 17. 5 0 0

2AE + 2CE,

2B 2 + (A + D)2

(A

D)2 .

This has signature (2, 3), so the constrained critical point has signature (0, 1); it is a maximum. 3.27 Proposition 3.9.11 gives a formula for the mean curvature of a surface in terms of the Taylor polynomial, up to degree 2, of the function of which the surface is a graph. Consider Scherk’s surface as the graph of ✓ ◆ ⇣ ⌘ cos x x z = f y = ln = ln(cos x) ln(cos y). (1) cos y To get the equality marked (1) we use u2 + o(u2 ) 2

cos u = 1 and

sin u = u + o(u2 ). To get the equality marked (2) we use ln(1

x) =

x

x2 + o(x2 ). 2

By Proposition 3.9.10, the Taylor polynomial is computed at the origin, so we make the change of variables x = x0 + u, y = y0 + v. Then the Taylor polynomial, to degree 2, of the first term of equation (1) is ln(cos(x0 + u)) = ln(cos x0 cos u sin x0 sin u) ✓ ⇣ ⌘◆ sin x0 = ln cos x0 cos u sin u cos x0 ⇣ ⌘ sin x0 = ln(cos x0 ) + ln cos u sin u cos x0 ✓ ◆ sin x0 u2 (1) 2 = ln(cos x0 ) + ln 1 u + o(u ) cos x0 2 ✓ ◆ ✓ ◆2 sin x0 u2 1 sin x0 u2 (2) = ln(cos x0 ) u + u + + o(u2 ) cos x0 2 2 cos x0 2 ✓ ◆2 ! sin x0 u2 sin x0 = ln(cos x0 ) u 1+ + o(u2 ). cos x0 2 cos x0

132

Solutions for review exercises, Chapter 3

A similar formula holds for ln cos(y0 + v). Thus ln(cos x)

ln(cos y) = ln cos(x0 + u)

= ln

✓

ln cos(y0 + v) a

a2,0

a

1 2 z ✓ }| { ! z }| { z }| { ⇣ sin x ⌘2 ◆ cos x0 sin x0 sin y0 1 0 u +v + 1+ u2 cos y0 cos x0 cos y0 2 cos x0 ! ✓ ⇣ sin x ⌘2 ◆ 1 0 2 + 1+ v + o(u2 + v 2 ). 2 cos x0 | {z }

◆

a0,2

So by equation 3.9.38, the mean curvature is ✓ ◆⇣ ⌘ 1 2 2 a (1 + a ) 2a a a + a (1 + a ) 2,0 1 2 1,1 0,2 2 1 , 2(1 + c2 )3/2

where the second factor is ⇣ ⌘ a2,0 (1 + a22 ) 2a1 a2 a1,1 + a0,2 (1 + a21 ) ✓ ◆2 ! ✓ ◆2 ! sin x0 sin y0 = 1+ 1+ + cos x0 cos y0

1+

✓

sin x0 cos x0

◆2 !

1+

✓

sin y0 cos y0

◆2 !

= 0. 3.29 a. For any angle, B is symmetric: > > > > B > = (Q> i,j AQi,j ) = Qi,j A Qi,j = Qi,j AQi,j ,

This method for diagonalizing symmetric matrices is known as Jacobi’s method.

since A is symmetric. We have   bi,i bi,j cos ✓ = bj,i bj,j sin ✓

sin ✓ cos ✓



ai,i aj,i

ai,j aj,j



cos ✓ sin ✓

sin ✓ . cos ✓

If you carry out the multiplication, you will find bi,j = bj,i = (aj,j

ai,i ) sin ✓ cos ✓ + ai,j (cos2 ✓

sin2 ✓).

Thus if tan 2✓ =

2ai,j ai,i aj,j

we will have bi,j = bj,i = 0. b. This breaks into three parts: (1) If k 6= i, k 6= j, l 6= i, l 6= j, then ak,l = bk,l . (2) Suppose k 6= i and k 6= j. Then b2k,i + b2k,j = a2k,i + a2k,j . Indeed, both are the length squared of the projection of A~ek = B~ek onto the plane spanned by ~ei and ~ej ; which is also the plane spanned by (cos ✓)~ei + (sin ✓)~ej

and ( sin ✓)~ei + (cos ✓)~ej .

Solutions for review exercises, Chapter 3

133

(3) Suppose k 6= i and k 6= j. Then b2i,k + b2k,j = a2i,k + a2j,k . This follows from case (2), using that A and B are symmetric, i.e., ai,k = ak,i etc. It now follows that

X k6=l

b2k,l =

X

a2k,l

a2i,j + a2j,i ,

k6=l

and the desired result follws from taking only the strictly subdiagonal terms. c. Start with A0 = A, let (i0 , j0 ) be the position of the largest o↵def diagonal |ai,j |, find ✓0 so as to cancel it, set A1 = Q> 0 A0 Q0 . Now find the largest o↵-diagonal entry of A1 , then ✓1 and Q1 as above and set A2 = Q> 1 A1 Q1 . We have never seen a case where the sequence k 7! Ak fails to converge, but here is a proof that the sequence always has a convergent sequence. Set Pm = Q0 Q1 · · · Qm . This is a sequence in the orthogonal group O(n), so it has a convergent subsequence i 7! Pmi that converges, say to the orthogonal matrix P . Then P > AP is diagonal. 3.31 Using the Lagrange multipliers theorem, we find that the critical points of the given function occur when x2 x3 · · · xn = 2 x1

x1 x3 x4 · · · xn = 4 x2 .. .

x1 · · · xn

1

(1)

= 2n xn .

From the first two equations, we find that x2 · · · xn x1 x3 · · · xn x2 (x3 · · · xn )2 and x2 = = . 2 4 8 2 (Note we have assumed that none of the xi are zero, which will certainly be the case at the desired maximum.) Thus x3 . . . xn p = . 2 2 x1 =

(Actually, may be negative as well, but assuming it is positive has no e↵ect whatsoever on the computation; this should be clear as we proceed.) x1 Substituting this expression for back into (1), we find x2 = p . Substi2 tuting the results we have gathered into each equation successively, we find x1 that xi = p . We then substitute these results into the constraint equation i 2 2 x1 + 2x2 + · · · + nx2n = 1 to find nx21 = 1, or x1 = p1n . Thus xi = p1ni . Thus the maximum of our function is nn/21pn! , which can also be written ⇣ ⌘n p1 p1 . n n!

Solutions for Chapter 4

1 4.1.1 a. The area of a dyadic cube C 2 D3 (R2 ) is ( 213 )2 = 64 ; the area 1 2 of a cube C 2 D4 (R2 ) is ( 214 )2 = ( 16 ) ; the area of a cube C 2 D5 (R2 ) is ( 215 )2 . b. The volume of a dyadic cube C 2 D3 (R3 ) is ( 213 )3 ; the volume of a cube C 2 D4 (R3 ) is ( 214 )3 ; the volume of C 2 D5 (R3 ) is ( 215 )3 .

4.1.3 a. 2-dimensional volume (i.e., area) ( 213 )2 = 1/64. b. 3-dimensional volume ( 212 )3 = 1/64 c. 4-dimensional volume ( 213 )4 = ( 18 )4 d. 3-dimensional volume ( 213 )3 = ( 18 )3 For each cube, the first index gives unnecessary information. This index is k, which tells where the cube is. The number n of entries of k gives necessary information: if it is 2, the cube is in R2 , if it is 3, the cube is in R3 , and so on. But we do not need to know what the entries are to compute the n-dimensional volume. 4.1.5 a.

n X i=0

Z

R

b. x1[0,1) (x) |dx| =

Z

R

i=

n(n + 1) . 2

x1[0,1] (x) |dx| =

Z

R

x1(0,1] (x) |dx| =

Z

R

x1(0,1) (x) |dx| =

1 . 2

We will write out the first one in detail. Since our theory of integration is based on dyadic decompositions, we divide the interval from 0 to 1 into 2N pieces, as suggested by the figure at left. Let us compare upper and lower sums when the interval is [0, 1). For the lower sum (which corresponds to the left Riemann sum, since the function is increasing), the width of each piece is 21N , the height of the first is 0, that N of the second is 21N , and so on, ending with 2 2N 1 . This gives N

LN

Figure for Solution 4.1.5.

N

2 1 2 1 1 X i 1 X x1[0,1) (x) = N = i. 2 i=0 2N 22N i=0

For the upper sum (corresponding to the right Riemann sum), the height of the first piece is 21N , that of the second is 22N , and so on, ending with 2N = 1: 2N N

UN So

N

2 2 1 1 X i 1 X i+1 x1[0,1) (x) = N = . 2 i=1 2N 2N i=0 2N

UN

LN = 134

1 , 2N

Solutions for Chapter 4

135

which goes to 0 as N ! 1, so the function is integrable. Using the result from part a, we see that it has integral 1/2: ✓ ◆ ✓ ◆ 1 (2N 1)(2N ) 1 1 1 lim LN x1[0,1) (x) = lim · = lim = . N !1 N !1 22N N !1 2 2 2N +1 2 For the interval [0, 1], note that UN x1[0,1] (x) = UN x1[0,1) (x) +

1 ; 2N

LN x1[0,1] (x) = LN x1[0,1) (x) . Therefore,

Z

R

x1[0,1) (x) |dx| =

Z

R

x1[0,1] (x) |dx| =

1 . 2

The upper and lower sums for the interval (0, 1] are the same as those for [0, 1], and those for the interval (0, 1) are the same as those for [0, 1), so those integrals are equal also. c. This is fundamentally easy; the difficulty is all in seeing exactly how many dyadic intervals are contained in [0, a], and fiddling with the little bit at the right that is left over. We will require the notation bbc, called the floor of b, and which stands for the largest integer  b. Consider first the case of the closed interval, i.e., integrate x1[0,a] (x). Then we have UN x1[0,a] (x) =

b2N ac 1

X i=0

i+1 a + N 2N 2 2

and LN x1[0,a] (x) =

b2N ac 1

X i=0

i 22N

.

The last term in the upper sum is the contribution of the rightmost interval  N ◆ b2 ac b2N ac + 1 , , 2N 2N where the maximum value is a and is achieved at a. So we find UN x1[0,a] (x)

LN x1[0,a] (x) =

b2N ac 1

X i=0

1 a 2a + N  N, 22N 2 2

which clearly tends to 0 as N ! 1, so the function is integrable. To find the integral, we will use the lower sums (which are slightly less messy). We find LN x1[0,a] (x) =

b2N ac 1

X i=0

i 1 1 = 2N (b2N ac 22N 2 2

1)b2N ac

1 b2N ac 1 b2N ac a2 = . N N 2 2 2 2 For the other integrals, note that whether the interval is open or closed at 0 makes no di↵erence to the upper or lower sums. The interval being =

136

Solutions for Chapter 4

open or closed at a makes a di↵erence exactly when a = k/2N , in which case the upper sum contains an extra term when the endpoint is there. But this extra term is a/2N , and does not change the limit as N ! 1. d. Observe that x1[a,b] (x) = x1[0,b] (x) x1[0,a) (x), x1[a,b) (x) = x1[0,b) (x)

x1[0,a) (x),

x1(a,b] (x) = x1[0,b] (x)

x1[0,a] (x),

x1(a,b) (x) = x1[0,b) (x)

x1[0,a] (x).

By Proposition 4.1.14, all four right sides are integrable, and the integrals are all (b2 a2 )/2.

Z

Rn+m

4.1.7 As suggested in the text, write f1 = f1+ we have

f1 , f2 = f2+

f2 . Then

Z g |dn x||dm y| = f1+ (x) f1 (x) f2+ (y) f2 (y) |dn x||dm y| n+m R Z ⇣ ⌘ = f1+ (x)f2+ (y) f1 (x)f2+ (y) f1+ (x)f2 (y) + f1 (x)f2 (y) |dn x||dm y| n+m ZR Z = f1+ (x)f2+ (y)|dn x||dm y| f1 (x)f2+ (y)|dn x||dm y| Rn+m Rn+m Z Z f1+ (x)f2 (y)|dn x||dm y| + f1 (x)f2 (y)|dn x||dm y| Rn+m Rn+m ✓Z ◆ ✓Z ◆ ✓Z ◆ ✓Z ◆ = f1+ (x)|dn x| f2+ (y)|dm y| f1 (x)|dn x| f2+ (y)|dm y| Rn Rm Rn Rm ✓Z ◆✓Z ◆ ✓Z ◆✓Z ◆ f1+ (x)|dn x| f2 (y)|dm y| + f1+ (x)|dn x| f2+ (y)|dm y| Rn Rm Rn Rm ✓Z ◆ ✓Z ◆ = (f1+ (x) f1 (x))|dn x| (f2+ (y) f2 (y))|dm y| Rn Rm ! Z ! Z =

Rn

f1 |dn x|

Rm

f2 |dm y|

4.1.9 Using the inequality | sin a sin b|  |a b| (see the footnote in the text, page 245), we see that within each dyadic square C 2 DN (R2 ) in the interior of Q, we have | sin(x1

y1 )

sin(x2

y2 )|  |(x1 = |(x1  |x1 = 21

N

y1 )

(x2

y2 )|

x2 )

(y2

y1 )|

x2 | + |y2

.

y1 |  1/2N + 1/2N

Furthermore, at most 8(2N ) such squares touch the boundary of the square, where the oscillation is at most 1. Thus we have 1 22N 8 · 2N 2+8 UN (f ) LN (f )  N 1 2N + 2N = N , 2 2 2 2

Solutions for Chapter 4

137

which evidently goes to 0 as N ! 1. 4.1.11 If X ⇢ Rn is any set, and a is a number, we will denote n

N

aX = { ax | x 2 X } .

If C 2 DM (R ), then 2 C 2 DM N (Rn ). In particular, X UM (D2N f ) = MC (D2N f ) voln (C) C2DM (Rn )

=

X

M2

NC

(f ) voln (C)

C2DM (Rn )

= 2nN

X

MC (f ) voln (C) = 2nN UM +N (f ).

C2DM +N (Rn )

An exactly similar computation gives LM (D2N f ) = 2nN LM +N (f ). Putting these inequalities together, we find 2nN LM +N (f ) = LM (D2N f )  UM (D2N f ) = 2nN UM +N (f ).

The outer terms above can be made arbitrarily close, so the inner terms can also. It follows that D2N f is integrable, and Z Z D2N f (x)|dn x| = 2nN f (x)|dn x|. Rn

Rn

4.1.13 We have 2N |b a| 2 2N |b a| = N 2 2N

2  LN (1I ) 2N 2N |b a| + 2 2N |b a| 2  vol(I)  UN (1I )  = + N. N N 2 2 2 So taking the limit as N ! 1, we find vol(I) = |b a|. Observe that the argument for X⇥Y proves a stronger statement; it shows that if X has volume 0, and Y has any finite volume, then X ⇥ Y has volume 0. Case X [ Y : inequality 1 is Proposition 4.1.14, part 3; equality 2 is Proposition 4.1.14, part 1; equality 3: X and Y have volume 0.

4.1.15 a. X \ Y : If C \ (X \ Y ) 6= / then C \ X 6= / , so for a given ✏ we can use the same N as we did for X alone in equation 4.1.62. X ⇥ Y : Using Proposition 4.1.20 we see that for any ✏1 , it suffices to choose N to be the larger of N in Proposition 4.1.23 for X with ✏ = ✏1 , and N for Y with ✏ = 1. X [ Y : Whether or not X and Y are disjoint, we have so

1X[Y  1X + 1Y ,

Z

Rn

Z

⇣ ⌘ 1X (x) + 1Y (x) |dn x| 1 Rn Z Z = 1X (x)|dn x| + 1Y (x)|dn x|

1X[Y (x)|dn x|  2

Rn

= 0 + 0 = 0. 3

Rn

138

Solutions for Chapter 4

One could also argue that X [Y = X [(Y X) and apply Theorem 4.1.21 (which concerns disjoint sets), but that requires knowing Y X ⇢ Y and voln Y = 0 =) voln (Y X) = 0, which again requires using 1Y X  1Y and applying Proposition 4.1.14, part 3. b. Let X be the set in question. It intersects 2N + 1 squares of DN (R2 ), so 2N + 1 UN (1X ) = , which tends to 0 as N ! 1. 22N c. Let X be the set in question. It intersects 22N + 2N + 2N + 1 cubes of DN (R3 ), so UN (1X ) = In Exercises 4.1.17 and 4.1.19, we will suppose that we have made a decomposition 0 = x0 < x1 < · · · < xn = 1, and will try to see what sort of Riemann sum the proposed “integrand” gives.

22N + 2N +1 + 1 , 23N

which tends to 0 as N ! 1.

4.1.17 a. This is an acceptable but “degenerate” integrand: it will give 0, since n X1

(xi+1

xi )2 =

i=0

n X1

xi+1 (xi+1

i=0

|

{z

xi )

right Riemann R sum for 01 x dx

}

n X1

xi (xi+1

i=0

|

{z

xi ) .

left Riemann R sum for 01 x dx

}

b. This is an acceptable integrand. We have | sin(u) u| < |u|3 /6 when |u| < 1, since the power series for sine is alternating, with decreasing terms tending to 0. So as soon as all |xi+1 xi | < 1, we have n X1 i=0

sin(xi+1

xi )

n X1

(xi+1

i=0

xi ) 

n 1 1X (xi+1 6 i=0

xi )3 .

The right side tends to 0 as the decomposition becomes fine (for instance, it is smaller than theRsum in part a), so the two terms on the left have the 1 same limit, which is 0 1 dx = 1 0. c. This is not an acceptable integrand; it gives an infinite limit. Indeed, if we break up [0, 1] into n intervals of equal length, each will have length 1/n and we get n r X p 1 n = p = n, n n i=1 which goes to 1 as n ! 1. d. This is an acceptable integrand; the limit it gives is 1. Indeed, n n ⌘ nX1 X1⇣ X1 x2i+1 x2i = xi+1 (xi+1 xi ) + xi (xi+1 xi ). i=0

i=0

As in part a, each of these is a Riemann sum for to Z 1 2x dx = 12 02 . 0

i=0

R1 0

x dx, so the sum tends

Solutions for Chapter 4

139

An easier approach is to notice that the sum telescopes: n X1⇣

x2i+1

i=0

⌘ x2i = x21

x20 + x22

x21 + · · · + x2n

x2n

1

= x2n

x20 = 1

0 = 1.

e. This is similar: n X1

x3i+1

i=0

x3i =

n X1

(x2i+1 + xi+1 xi + x2i )(xi+1

xi ).

i=0

All three terms of the sum are Riemann sums for their sum is Z 1 3x2 dx = 13 03 .

R1 0

x2 dx, so the limit of

0

Again, this is easier if you notice that the sum telescopes: n X1

x3i+1

x3i = x31

i=0

Figure for Solution 4.1.19. This is the case N = 15 and M = 7.

x30 + x32

x31 + · · · + x3n

x3n

1

= x3n

x30 = 1

0 = 1.

4.1.19 Suppose we break [0, 1] ⇥ [0, 1] into N M congruent rectangles, as shown in the margin. Then the relevant Riemann sum is r ! p 1 1 M N M = . 2 |{z} N M N number of pieces

The limit depends on how fast Npgoes to infinity relative to M . If p p N = M , or if N is a multiple of M (N = k M ), then the Riemann sum tends to a finite limit (1 or 1/k respectively). If N is much smaller p than M , the limit is infinite, and if N is much bigger than M , the limit p is 0. Therefore, |b a|2 |c d| is not a reasonable integrand. 4.1.21 a. It is enough to show that

UN (f + g)  Un (f ) + UN (g) for every N , i.e., MC (f + g)  MC (f ) + MC (g)

The sequence is necessary because f + g might fail to achieve its supremum MC (f + g).

There exists a sequence n 7! xn in C such that

MC (f + g) = lim (f + g)(xn ). n!1

Then MC (f + g) = lim (f + g)(xn ) = lim f (xn ) + g(xn )  MC (f ) + MC (g). n!1

n!1

b. Let f be the indicator function of the rationals in [0, 1], and g the indicator function of the irrationals in [0, 1]: i.e., f = 1Q\[0,1] and g = 1(R Q)\[0,1] . Then 1 = U (f + g) < U (f ) + U (g) = 2.

140

Problem 4.2.1: This kind of question comes up frequently in real life. Some town discovers that 3 times as many children die from leukemia as the national average for towns of that size: 9 deaths rather than 3. This creates an uproar, and an intense hunt for the cause. But is there really a cause to discover? Among towns of that size, the number of leukemia cases will vary, and some place will have the maximum. The question is: with the number of towns that there are, would you expect to find one that far from the expected value?

Solutions for Chapter 4

4.2.1 The question asked in the exercise is a bit vague: you can’t reasonably ask “how likely is this particular outcome”. Every individual outcome is very unlikely (though some are more unlikely than others). The question which makes sense is to ask: how many standard deviations is the observed outcome from average? You can also ask whether the number of repetitions of the experiment (in this case 15) is large enough to use the figures of Figure 4.2.5. The quick answer is that these results are not very likely. If we chart results for each integer, the resulting bar graph does not look like a bell curve, as shown in the margin. To give a more detailed answer, we must compute the standard deviation for the experiment. Figuring out all the possible ways of getting all the various totals would be time consuming, so instead we use the fact that if an experiment with standard deviation

3 2 1 0

is repeated n times, the central limit

theorem asserts that the average is approximately distributed according to p the normal distribution with mean E and standard deviation / n. The expectation of getting heads when tossing a coin once is 1/2. To 0 1

2 3 4

5 6 7 8 9 10 11 12 13 14

Figure for Solution 4.2.1. None of the 14 coin tossses resulted in no heads appearing; heads appeared exactly once in two of the 14 toin cosses, and so on.

compute the variance we consider the two cases. If we get heads, then (f

E(f ))2 = (1

1/2)2 = 1/4; if we get tails, we also get (f

Var(f ) = E(f

E(f ))2 = (0

1/2)2 = 1/4. So

E(f ))2 = E(1/4 + 1/4) = 1/2(1/2) = 1/4

and (f ) = 1/2. So the standard deviation of the average results when tossing the coin 1 14 times is p . But we are interested in the actual number of heads, not 2 14 the average. Denote this random variable by T ; we multiply the standard deviation of the average results by 14 to get But is n = 14 large? If we were asked in a court of law, we would not be willing to affirm that a person reporting those results was lying or cheating. If 300 people each tossed a coin 14 times, one of them might very likely come up results more than 2.5 standard deviations from the expected value.

p 14 (T ) = ⇡ 1.9. 2 The expectation of T is 7, so (see Figure 4.2.5) for n large we should expect 68% of our results to be between 5.1 and 8.9 heads (within one standard deviation). The actual figure is 5/15 (but if we expanded the range to between 5 and 9, it would be 9/15 ⇡ 60%). For n large we would expect 95% to be within 3.2 and 10.8; the actual figure is 11/15 ⇡ 73.3%.

Solutions for Chapter 4

4.2.3 a. Z 1 1 E(f ) = f (x) 1[ 2a 1  3 a x a2 = = 6a a 6 Var(f ) =

Z

1

f (x)

141

a,a] (x) dx = 3

Z

a

f (x) a

2

1 dx = 2a

2

2

( a) a a a = + = 6a 6 6 3 2

E(f )

1 a

1 1[ 2a

a,a]

dx =

Z

a a

✓ x2

Z

a a

a2 3

x2 dx 2a

◆2

✓ ◆ 1 2a2 x2 a4 = x4 + dx 3 9 a 2a !  5 a  ✓ a 1 x 2 2 x3 a4 a 2 a5 = a + [x] a = 2a 5 3 3 9 2a 5 a a ✓ ◆ 1 2 1 4a4 = a4 + = . 5 9 9 45 r 4a4 2a2 (f ) = = p 45 3 5 Z

b. E(f ) = = Var(f ) =

Z 

Z Z

1

f (x)

1

1 1[ 2a

4 a

x 8a

1

a,a] (x) dx

3

= a

f (x)

1 a 6

4

a 8

( a) =0 8a

E(f ) 

7

x x dx = 2a 14a a r p a6 a3 7 (f ) = = 7 7 =

Solution 4.3.1, part a: The first equality in the second line is justified because by Proposition 4.1.14, part 3, Z Z f + 0 and f 0.

=

2

1 1[ 2a

a

Z

a,a] 7

= a

a 14a

a

f (x) a

dx =

Z

1 dx = 2a

a a

x3

1 dx 2a

2 a5 a5 + 3 3 9

Z

0

a a

2

◆

x3 dx 2a

1 dx 2a

7

( a) 2a7 a6 = = 14a 14a 7

4.3.1 a. If f is integrable, then f + and f are integrable (Corollary 4.1.15) so |f | = f + + f is integrable (Proposition 4.1.14, part a). Now Z

f =  =

Z Z

Z

✓Z ◆ ✓Z ◆ z}|{ = f+ f ✓Z ◆ ✓Z ◆ Z = f+ + f = f+ + f

Prop. 4.1.14, part 1

f+

f

f+ +

Z

f

|f |.

b. Converse counterexample: f = 1/2 1Q restricted to [0, 1] has |f | = 1/2 (integrable) but f is not integrable.

142

Solutions for Chapter 4

4.3.3 a. Since the circle is symmetric, consider only the upper right quadrant, and keep in mind that we are looking just for an upper bound, not for a sharp upper bound. As suggested in the hint, divide that quadrant into two by drawing the✓diagonal p ◆ through the origin. This line intersects ⇣ ⌘ 1/p2 the circle at the point . In the part of the circle from 01 to 1/ 2 the diagonal, the slope of the circle is always less than 1, so if we look at columns of cubes, the circle never intersects more than two p ⌘ in any one ⇣ cubes 1/ 2 , there are column. Starting at the origin, and going to the point 0 2N p 2

Solution 4.3.3: A first error to avoid is confusing the unit circle with the unit disc; the problem concerns a curve, not a surface. Second, if you looked carefully at equation 4.1.12 defining dyadic cubes, you may have realized that at level N = 0, six cubes are needed, rather than the obvious of the points ✓ ◆ four,✓because ◆ 0 and 1 . This is not im1 0 portant, since we are concerned with the number needed as N gets big.

columns. Since we can’t deal in fractions of cubes, take the fractional N

2 part of that number (denoted with square brackets, [ p ]), and add 1 for 2 good measure. Multiplying by two gives ✓ N ◆ 2 p +1 2 2

for that eighth of the circle. This is an over-estimate, since in many columns the circle intersects only one cube, but that doesn’t matter. In the second segment of ⇣ the⌘upper right quadrant of the circle, the slope is much steeper; indeed, at 10 , the tangent to the circle is vertical. So if we use the same columns, then as N ! 1 there is no limit to the number of cubes the circle will intersect. But if we rotate the circle by 90 , we have the same situation as✓ before, p ◆ with slope less than 1. (This method 1/p2 counts cubes at the point twice, once when we look at the circle 1/ 2 the normal way, and again when we have rotated it.) There are eight such segments in all, so for the whole circle we have an (admittedly generous) upper bound of ✓ N ◆ 2 16 p + 1 ⇡ 2N p . 16 2 2 b. Denote the unit circle by S 1 . We have

0  LN (1S 1 )  UN (1S 1 ) 

1 N 16 2 p 22N 2

and lim

N !1

Another solution to Exercise 4.3.5 is given in the margin on the next page.

1 N 16 2 p = 0. 22N 2

4.3.5 Let us apply Theorem 4.3.9, which says that if f is bounded with bounded support, and continuous except on a set of volume 0 (area in this case), then it is integrable. The function 1P f is bounded (by 1), and its support is in P , which is bounded. Moreover, it is continuous except on the boundary of P . This boundary is the union of the arc of parabola y = x2 , 1  x  1, which is the graph of a continuous function, hence has area 0, and the segment y = 1, 1  x  1, which is also the graph of a continuous function, hence also has area 0.

Solutions for Chapter 4

Solution 4.3.5: Another way to approach this problem is to claim that over a dyadic square C 2 DN (R2 ) in P that doesn’t intersect the boundary of P , the os2 cillation p of sin(y ) is strictly less than 2 2/2N . Bounding the oscillation of a function f usually involves bounding the derivative of f and applying✓the◆ mean value theorem. If f x = sin(y 2 ), y then 2 [Df ( x y )] = [0, 2y cos(y )] and 0 < 2y cos(y 2 ) < 2 for 0 < y < 1. The above result then follows from Corollary 1.9.2 and the fact that for two points x1 , x2 in such a dyadic square, p |x1 x2 |  2/2N . Now we estimate how many dyadic squares can intersect the boundary of P . Since when |x| < 1 the curve y = x2 has slope at most 2, it will intersect a vertical column of dyadic squares in at most three squares, and the horizontal line y = 1 will intersect exactly one square of the column. Moreover, the right edge will intersect 2N squares of DN (R2 ), so altogether, at most 5 · 2N dyadic squares will intersect the boundary, with total area 5 · 2 N . Thus for every ✏ > 0, there exists N such that the function has oscillation < ✏ except on a set of dyadic cubes of total area < ✏.

143

4.4.1 Clearly, if a set X has measure 0 using open boxes, then it also has measure 0 using closed boxes; the closures of the open boxes used to cover X also cover X. The volume of an open box equals the volume of its closure. The other direction is a little more difficult. For any closed box B, let B 0 be the open box with the same center but twice the sidelength. Note that voln (B 0 ) = 2n voln B. If X is a set such that there exists a sequence P of closed boxes with X ⇢ [Bi and voln Bi  ✏, then X ⇢ [Bi0 and P 0 n voln Bi  2 ✏. By taking ✏ sufficiently small, we can make 2n ✏ arbitrarily small. 4.4.3 A set X ⇢ Rn has measure 0 if for any ✏ > 0 it can be covered by P a sequence i 7! Bi of open boxes such i voln (Bi ) < ✏. If X is compact, then by the Heine Borel theorem (Theorem A3.3) it is covered by finitely many of these boxes; by Proposition 4.4.6, the union of these boxes has boundary of measure 0 (in fact, it has volume 0). Let B be the union of these boxes, then 1B is bounded with bounded support, continuous except on a set of measure 0, hence integrable, so for N sufficiently large we have UN (1B ) < ✏. So 0  LN (1X )  UN (1X )  UN (1B ) < ✏, and limN !1 (UN (1X )

LN (1X )) < ✏; since ✏ is arbitrary, the limit is 0.

4.4.5 We will prove it false by showing a counterexample. Let f be the function ⇢ 1 if x = pq is rational, written in lowest terms, and |x|  1 f (x) = q 0 if x is irrational, or |x| > 1,

and let g(x) = 0 for all x 2 R. Then g (obviously) and f (see Example 4.4.7) are integrable, both with integral 0, and g  f , but { x 2 R | g(x) < f (x) } = Q \ [0, 1], and Q \ [0, 1] does not have volume (see Example 4.4.3). 4.4.7 Let X ⇢ [0, 1] be the set of numbers that can be written using only the digits 1, . . . , 6. It takes six intervals of length 1/10th to cover X, namely [.1, .2],

[.2, .3],

...

, [.6, .7].

Similarly it takes 36 intervals of length 1/100th to cover X, and more generally 6n intervals of length 1/10n to cover X. Their total length is (6/10)n , with tends to 0 as n tends to infinity, so X has measure 0, in fact this proves that it has 1-dimensional volume 0. In this case we did not need Heine-Borel (as we did in Exercise 4.4.3) to show that measure 0 implies volume 0.

144

Solutions for Chapter 4

4.5.1 By Proposition 4.3.5, any bounded part of a line in the plane has 2-dimensional volume (area) 0, so the values of the function on a line do not a↵ect the value of the integral. 4.5.3 a. The integral becomes Z 0 ✓Z 2+y ⇣ ⌘ ◆ Z f x dx dy + y 1

Figure for Solution 4.5.3.

y

1

Z

2

p y

p y

0

! ⇣ ⌘ x f y dx dy.

b. Below the x-axis, the domain of integration would be unchanged, as shown in Figure 4.5.4 in the textbook. Above the x-axis, it would be the region bounded on the left by the “left branch” of the parabola y = x2 , bounded on the right by the “right branch” of the parabola y = (x 2)2 , and bounded above by the line y = 1. This domain of integration is shown in the figure in the margin. 4.5.5 a. We use induction over k. Suppose we know that relation i = ci i 1 allows us to express 2k+2 in terms of c2k+1 : 2k+2

= c2k+2

2k+1

= c2k+2 c2k+1

k

= ⇡k! . The 2k , c2k+2 , and 2k

2k .

The expression for ci is k Y 2j 1 2j j=1 {z }

ci = c2k = c0

Thus

|

and

|

for i even

k Y

2j . 2j + 1 j=1 {z }

ci = c2k+1 = c1

for i odd

1)! 2⇡ c0 c1 = . i! i Substituting this into the equation for 2k+2 gives ci ci

2k+2

1

=

(i

= c2k+2 c2k+1

2k

=

2⇡ ⇡ k ⇡ k+1 = , 2k + 2 k! (k + 1)!

as desired. b. Suppose we know that 2k 1

=

⇡k

1

(k 1)!22k (2k 1)!

1

.

Then, as before, we have 2k+1

= c2k+1 c2k

2k 1

=

2⇡ ⇡ k 1 (k 1)!22k 2k + 1 (2k 1)!

1

=

Multiplying top and bottom by 2k gives 2k+1

=

⇡ k k!22k+1 , (2k + 1)!

as desired.

⇡ k (k 1)!22k . (2k + 1)(2k 1)!

Solutions for Chapter 4

145

4.5.7 We will do z, y, x first: Z 1 Z 12 (1 x) Z 0

The integral is not Z 1 2y 3z

1.

Z

1/2

0

3.

Z

0

5.

Z

0

Z

1 2y

1 3 (1

1 3 (1

Z

2y)

0

1

Z

x 3z)

Z

1 3 (1

!

x 2y)

dx.

xyz dz dy

0

xyzdz dx dy

Z

x)

0

!

1 2y 3z

2.

1 2 (1

!

x 3z)

Z

1/3

0

!

xyz dx dz dy

0

1 3 (1

1 2 (1

0

! !

x 2y)

0

Z

xyz dz dy

!

!

dx.

Why not? The above, incorrect integral does not take into account all the available information. When we integrate first with respect to x, to determine the upper limit of integration we ask what is the most that x can be. Since x, y, z 0, it follows that x is biggest when y, z = 0; substituting 0 for y and z in the inequality x+2y +3z  1 gives x  1, so the upper limit of integration is 1. When we next integrate with respect to y, we are doing so for all values of x between 0 and 1. Given that constraint, y is biggest when z is 0, which gives x + 2y  1, or y  1/2(1 x). When we get to the third variable, which in this case is z, we are integrating for all permissible values of x and y, so the upper limit of integration is 1/3(1 x 2y). The five other ways of setting up the integral are:

0

1/2

!

x 2y)

0

Z

0

Solution 4.5.7: Note that since you are not asked to actually compute the integral, whether the function being integrated is xyz or something else is irrelevant.

Z

0

1 3 (1

4.

Z

0

!

Z

1 2 (1

3z)

0

1/3

Z

Z

1 2y 3z

0

Z

1 2 (1

!

xyz dx dy dz

0

1 3z

!

x 3z)

!

!

xyz dy dx dz

0

xyz dy dz dx.

0

4.5.9 As indicated in the text, the integral for the half above the diagonal is Z

0

+

1Z

1Z

x1

Z

0

x1 y1

0

1

z ✓Z

0

Z

1

x1

Z

1 x1 y1

for x2 to the left of the vertical line with x-coordinate x1 y1 x2 x1

}|

det

Z

(x2 y1

x1 y2 ) dy2

(x2 y1 x1 y2 ) dy2 + | {z }

✓Z |

0

1

{z

1 y1 x2 x1

◆ }

{ ◆ (x1 y2 x2 y1 ) dy2 dx2 dy1 dx1 | {z } + det

(1)

dx2 dy1 dx1 .

for x2 to the right of the vertical line with x-coordinate x1 ; det

The limit of integration yx1 x1 2 is the value of y2 on the line separating the shaded and unshaded regions, the line on which the first dart lands; since that line is given by y = xy11 x, we have y2 = xy11 x2 . We will compute the integral for the half of the square above the diagonal, and then add it to the integral for the bottom half, which by equation 4.5.30

146

Solutions for Chapter 4

13 is 108 . (Can you think of a way to take advantage of Fubini to make this computation easier?) Rather surprisingly, when we compute the total integral this way, we encounter some logarithms. The first inner integral in the first line is y1 x2  Z y1xx2 1 y22 x1 y1 x2 y 2 x2 (x2 y1 x1 y2 ) dy2 = x2 y1 y2 x1 = x2 y1 x1 1 22 2 0 x1 2x1 0

x22 y12 . 2x1 The second inner integral in the first line is  Z 1 y2 (x1 y2 x2 y1 ) dy2 = x1 2 x2 y1 y2 y1 x2 2 x =

1

=

x1 2

x2 y1

x1

1 y1 x2 x1

x22 y12 x2 y1 x1 + x2 y1 = 2 2x1 x1 2

x2 y1 +

y12 x22 . 2x1

So the iterated integral on the first line of equation 1 is x1 ◆ Z 1Z 1 x1 y12 x22 x1 x2 x22 y1 x32 y12 y1 x2 y1 + dx2 dy1 dx1 = + dy1 dx1 2 x1 2 2 3x1 0 0 x1 0 0 x1 ◆ ◆ Z 1Z 1✓ 2 Z 1 ✓Z 1 2 Z 1 2h i1 x1 x21 y1 x31 y12 x1 x1 = + dy dx = dy dx = ln y dx1 1 1 1 1 1 2y1 2y12 3y13 x1 x1 0 x1 0 x1 3y1 0 3  3  3 1 Z 1 2 Z 1⇣ 3 Z 1 2 1 x1 x1 x1 1 ⌘ x1 x 1 = ln x1 dx1 = ln x1 + · dx1 = dx1 = 1 = 9 x 27 0 27 0 |3 0 0 9 {z } | 9{z } 0 | {z 1}

Z

1Z

1Z

x1 y1

✓

f0 g

fg

(2)

f g0

Since ln 0 = 1, you might worry that in the third line of formula 2, the first term would lead to (03 /9) · ( 1), which is undefined. But as x1 ! 0, x31 goes to 0 much faster than ln x1 goes to 1.4 The iterated integral on the second line of equation 1 is ◆ Z 1 Z 1 Z 1 ✓Z 1 (x2 y1 x1 y2 ) dy2 dx2 dy1 dx1 0

=

x1

Z

0

=

Z

1Z

1Z

1

1

x1

0

=

Z

=

0

Z

x1

1

0

Z

x1 y1

Z

1

x1

1

✓ 4

1 4

Z

0

1 x1 y1

1 x1 y1

✓

 x2 y1 y2

x1 2

1

dx2 dy1 dx1 0

Z 1Z 1 2 x1 ⌘ x2 y1 dx2 dy1 dx1 = 2 2 0 x1 ◆ Z 1 2 x1 x21 y1 x21 y1 + dy1 dx1 = 2 y12 2 2y1 4 0 ◆  1 x21 x2 1 x21 x3 + 1 dx1 = x1 + 1 = 4 2 4 4 12 0

⇣ x2 y1

y1 2

x1 y22 2

x1 x2 2

1

x1 y1 2

1

1 4

x1 y1

dy1 dx1 dx1

x1

1 1 1 + = . 4 12 12

Powers dominate logarithms, and exponentials dominate powers. This follows almost immediately from l’Hˆ opital’s rule. For powers and exponentials, see Solution 6.10.7.

Solutions for Chapter 4

147

So the integral for the half above the diagonal is

1 1 13 + = , which 27 12 108

gives a total integral of 13/54. This computation should convince you that taking advantage of symmetries is worthwhile. But there is a better way to compute the integral for Since the line separating the the upper half, as we realized after working through the above computashaded and unshaded regions is tion. The computation is simpler if we integrate with respect to x2 first, y1 given by y = x, we have rather than y2 : x1 0 1 y1 x1 y2 Z 1 Z 0 Z 1 Z x11y2 Z 1 y2 = x2 , i.e., x2 = . y x1 y1 B C (x1 y2 x2 y1 ) dx2 + (x2 y1 x1 y2 ) dx2Ady2 dy2 dx1 . @ x1 y2 | | {z } {z } 0 x1 0 0 1 y

+ det

4.5.11 a. We have Z ⇡ ✓Z 0

y

⇡

det

◆ Z sin x sin x dx dy = |dx dy|, x x T

where T is the triangle 0  y  x  ⇡. b. Written the other way, this becomes ◆ Z ⇡ ✓Z x Z ⇡ sin x sin x dy dx = x dx = cos 0 x x 0 0 0

cos ⇡ = 2.

4.5.13 a. If D2 (D1 (f )) and D1 (D2 (f )) both exist and are continuous on U , and if there exists ✏ > 0 such that D2 (D1 (f ))(a)

D1 (D2 (f ))(a) > ✏,

(1)

then there exists > 0 such that on the square n⇣ ⌘ o x 2 R2 | a  x  a + , a  y  a + S= 1 1 2 2 y D2 (D1 (f )) D1 (D2 (f )) > ✏. In particular, Z D2 (D1 (f )) D1 (D2 (f )) |dx dy| > 2 ✏ > 0. S

Note that this uses the continuity of the second partials. b. Using Fubini’s theorem, these two integrals can be computed: ! Z Z Z a1 +

S

D2 D1 (f ) |dx dy| = Z

a2 +

D2 D1 (f ) dy

a1

dx

a2

⇣ ⇣ ⌘ ⇣ ⌘⌘ D1 (f ) a x+ D1 (f ) ax dx 2 2 a1 ⇣ ⌘ ⇣ ⌘ ⇣ ⌘ ⇣ ⌘ a1 a1 + a1 . 1+ =f a f f + f a2 + a2 + a2 a2 =

a1 +

148

Solutions for Chapter 4

Z

S

D1 D2 (f ) |dx dy| = =

Z

⇣

Z

a2 +

a2

!

a1 +

D1 D2 (f ) dx

a1

✓ ⇣ ⌘ D2 (f ) a1 y+

a2 +

a2

=f

Z

a1 + a2 +

dy

⇣ ⌘◆ D1 (f ) ay1 dy ⇣ ⌘ ⇣ ⌘ ⇣ ⌘ 1 1 f a a+ f a1a+ +f a a2 . 2 2

⌘

So the two integrals are equal. Thus if D2 (D1 (f )) and D1 (D2 (f )) both exist and are continuous on U , then the crossed partials are equal; the second partials cannot be continuous and satisfy inequality (1). c. The second partials of the function 8 2 y2 ⇣ ⌘ < xy x x2 + y 2 f x y =: 0

✓ ◆ ✓ ◆ x 0 if 6= y 0 otherwise

are not continuous at the origin, so although Solution 4.5.15: This region looks roughly like a plastic Easter egg, standing on one end, the bottom half z x2 + y 2 , the top half z  10 x2 y 2 ; the two halves join at z = 5.

D1 (D2 (f ))(0) > D2 (D1 (f ))(0) (see Example 3.3.9), there is no square on which one crossed partial is larger than the other. 4.5.15 Since the two halves are symmetric, we will just compute the bottom half and multiply by 2. The volume is given by the following integral. p In the second line we make the change of variables y = z u, so that p dy = z du: 0 1

2

Z

Z

5

0

Of course one does not need to use the order dx dy dz. This exercise would be much easier to do using cylindrical coordinates, discussed in Section 4.10.

=4

Z

5

0

=4

Z

0

5

p z

p z

Z

Z pz

y2

dx

p

z y2

p z p z

p z

hup z 1 2

!

u2

Z 5B BZ B dy dz = 2 B 0 B @

!

!

y 2 dy dz = 4

Z

0

5

✓Z

1

1

h ipz x p

p z

p z

p z

 i1 1 ⇡ z2 + arcsin u dz = 4 2 2 2 1

z

|

{z p

2

C C C dy Cdz C y2 } A

y2

z y2

!

◆ p zu2 z du dz

5

= 2⇡ 0

25 = 25⇡. 2

4.5.17 a. Rearranging the order of a set of numbers does not change the value of the rth smallest number. Thus Mr (x) satisfies the hypothesis of Exercise 4.3.4, and Z Z Mr (x) |dn x| = n! Mr (x) |dn x| . Qn 0,1

n P0,1

n Since 0  x1  x2 · · ·  xn  1 in P0,1 , the rth smallest coordinate in the second integral is xr , i.e., Mr (x) = xr . We can integrate with respect to

Solutions for Chapter 4

149

the various xi in di↵erent orders, but the order x1 , x2 , . . . , xn is convenient. With this order, the integral in question becomes ◆ ◆ Z 1 ✓ ✓Z x2 n! ··· xr dx1 · · · dxn . 0

0

Let us first consider how to do the integral if r  n 2 (and n > 2). After integrating with respect to xi for 1  i < r, the innermost integral is Z xi+2 xr xii+1 dxi+1 . i! 0 (This formula is valid for any i < r and i  n 2 even if r > n 2.) After we integrate with respect to xr 1 , the Mr function becomes relevant. The innermost integral in this case will be Z xr+1 r Z xr+1 r xr dxr rxr dxr = . (r 1)! r! 0 0 Thus, after integrating with respect to xi for r  i < n 2, the innermost integral is Z xi+2 i+1 x dxi+1 r i+1 . (i + 1)! 0 It is now simple to evaluate the whole integral. After using the above formula to get the result of the first n 2 integrations, our integral is ! Z 1 Z xn n 1 Z 1 n xn 1 dxn 1 x dxn r(n!) r n! r dxn = n! r n dxn = = . (n 1)! (n)! (n + 1)! n+1 0 0 0 n n!

Z

1

0

If r = n 1, then we can use our formula for the innermost integral after 2 integrations with i < r: ! ! Z xn n 2 Z 1 Z xn xn 1 xn 1 dxn 1 xnn 11 dxn 1 dxn = n! (n 1) dxn (n 2)! (n 1)! 0 0 0 ✓ ◆ Z 1 xnn dxn (n 1)(n!) n 1 r = n! (n 1) = = = . (n)! (n + 1)! n+1 n+1 0 For r = n, we can use the same formula: ! Z 1 Z xn n 2 Z 1 n 1 xn 1 xn dxn 1 xn xn dxn n! dxn = n! (n 2)! (n 1)! 0 0 0 ✓ ◆ Z 1 xnn dxn (n)(n!) n r = n! n = = = . (n)! (n + 1)! n+1 n+1 0

b. The minimum of a set of numbers is the same, regardless of the order of the numbers, so the min function also satisfies the hypothesis of Exercise 4.3.4, and ✓ ◆ ✓ ◆ Z Z b b b b n min 1, , . . . , |d x| = n! min 1, , . . . , |dn x| . n x x x x 1 n 1 n Qn P 0,1 0,1

150

Solutions for Chapter 4

n In P0,1 , xn > xi for i < n. Thus b/xn < b/xi for i < n (since b > 0). With this in mind, it is clear that min (1, b/x1 , . . . , b/xn ) = min (1, b/xn ) n in P0,1 . To evaluate the integral it is convenient to use the same order of integration that we chose in part a:

n!

Z

0

1

✓ ✓Z ···

x2

0

✓ ◆ ◆ ◆ b min 1, dx1 · · · dxn . xn

The expression for the innermost integral after integrating with respect to xi (i  n 2) will be nearly the same as that for part a, since the function we are integrating is only a function of xn (the expression is the same, except that min (1, b/xn ) replaces xr ): ⇣ ⌘ Z xi+2 min 1, b xi dxi+1 i+1 xn . i! 0 It is now simple to evaluate the integral, using the above expression for the innermost integral after n 2 integrations. First perform one additional integration: ! ✓ ◆ ✓ ◆ Z 1 Z xn Z 1 b xnn 21 dxn 1 b xnn 1 dxn n! min 1, dxn = n! min 1, . xn (n 2)! xn (n 1)! 0 0 0

b

a

To integrate the min function, we have to split the above into two integrals. For 0  xn  b, we have 1 < b/xn and for b  xn  1, we have b/xn < 1. So ✓ ◆ Z 1 Z b n 1 Z 1 n 2 b xnn 1 dxn xn dxn bxn dxn n! min 1, = n! + n! x (n 1)! (n 1)! (n 1)! n 0 0 b ✓ ◆ (n!)bn n! b bn = + n! (n 1)! n 1 n 1 =

Figure for Solution 4.5.19 The ellipse x2 y2 + 2 = 1. 2 a b In the second line we write (1 z 3 ) not (z 3 1) because |a b| cannot be negative; we can do so, of course, since (1

z 3 )2 = (z 3

1)2 .

bn (n

1) + nb n 1

nbn

=

nb n

bn . 1

4.5.19 Use the fact that the area bounded by the ellipse ⇡|ab|. This gives Z

1 1

⇣ Area bounded by ellipse given by =

Z

1

⇡(1

z 3 )(1 + z 3 ) dz = ⇡

1

 =⇡ z

z7 7

1

= 1

12 ⇡. 7

Z

x2 (z 3

1)2

(1

z 6 ) dz

1 1

+

x2 a2

+

y2 b2

= 1 is

⌘ y2 = 1 dz (z 3 + 1)2

Solutions for Chapter 4

151

4.6.1 a. For the unit square, we find ⇣ ⌘ ⇣ ⌘ ⇣ ⌘ ⇣ ⌘ 1 f 00 + f 01 + f 10 + f 11 36 ! ✓ ⇣ ⌘ ⇣ ⌘ ⇣ ⌘ ⇣ ⌘◆ ⇣ ⌘ 1/2 1/2 0 1 1/2 + 4 f 0 + f 1 + f 1/2 + f 1/2 + 16f 1/2 . 1 f 216

0 0 0

For the unit cube, we find ! ! ! ! +f

✓ +4 f +f ✓ + 16 f + 64f

1 0 0

1/2 0 0

!

0 1/2 1 0 1/2 1/2 1/2 1/2 1/2

+f

0 1 0

+f

1/2 1 0

!

!

+f

+f

!!

1 1 0

+f

!

1 1/2 1 1 1/2 1/2

!

!

+f

1/2 0 1

+f

!

0 0 1/2

+f

+f

0 0 1

1/2 0 1/2

!

1/2 1 1

+f

!

!

+f

+f

+f

!

1 0 1/2 1/2 1 1/2

1 0 1

!

+f 0 1/2 0

+f

!

!

0 1 1

!

0 1 1/2

+f

1/2 1/2 0

+f

!

x+f

+f

!

!

1 1/2 0

+f

+f

!

1 1 1

!

1 1 1/2 1/2 1/2 1

!◆

!◆

.

b. The formula for the unit square gives ✓⇣ ◆ ✓ ◆ ⇣2 2 2 2⌘ 1 1 1 1⌘ 1 1 1 128 16 1+ + + +4 + + + + 16 = 2+ + + ⇡ .524074. 36 2 2 3 3 5 3 5 2 36 3 15 2 The exact value of the integral is Z ⇣ ⌘ f x y |dx dy| = 3 ln 3 unit square

4 ln 2 ⇡ .523848.

4.6.3 If we set

X=

b

a 2

x+

a+b , 2

dX =

b

a 2

dx

and make the corresponding change of variables, we find ◆ Z b Z 1 ✓ b a a+b b a f (X) dX = f x+ dx. 2 2 2 a 1 This integral is approximated by ✓ ◆ p X b a a+b b a wi f xi + , 2 2 2 i=1 so to make the “same approximation” for Xi =

b

a 2

xi +

a+b 2

Rb a

f (X) dX, we need to use

and Wi =

b

a 2

wi .

152

Solutions for Chapter 4

4.6.5 a. By Fubini’s theorem, Z Z f (x)g(y) |dx dy| = [a,b]⇥[a,b]

[a,b]

=

Z

⇣Z

[a,b]

= =

[a,b]

⌘ f (x)g(y) |dx| |dy|

! Z g(y) |dy|

[a,b]

!

n X

n X

ci f (xi ) i=1 n X n X

!

f (x) |dx| !

ci g(xi )

i=1

ci cj f (xi )g(xj ).

i=1 j=1

b. Since both quadratic and cubic polynomials are integrated exactly by Simpson’s rule, this function is integrated exactly, giving  3 1 4 1 x y 1 = . 3 0 4 0 12

We already used equation 1 in equation 3.8.19.

4.6.7 a. In the expression ET (ET (f 2 )), we have summed f 2 (s) once for every T with s 2 T . So every s 2 S has been sampled the same (huge) number of times, and in the expectation we have divided by this huge number, to find the average of f 2 over S, i.e., ES (f 2 ). 2 2 For the next two, note that f T is a constant on T , so ET (f T ) = f T . By the same argument, 2

ET (f f T ) = f T ET (f ) = f T .

(1)

b. N N

1

⇣ ⇣ ET ET (f = = =

N

N

1 N

N

1 N

N

1

f T )2

⌘⌘

⇣ ET ET (f 2 ) ⇣ ET ET (f 2 ) ⇣ ES (f 2 )

⌘ 2 ET (2f f T ) + ET (f T )

2

= ET (f T )

2

⌘

(2)

⌘ 2 ET (f T )

c. Continuing equation 2, we have ⌘ N ⇣ 2 ES (f 2 ) ET (f T ) N 1 ⇣ ⌘ N 2 = VarS (f ) + ES (f ) N 1 But ES (f )

2

fT

! ⇣ ⌘ 2 VarT (f T ) + ET (f T ) .

so we are left with ⌘ N ⇣ VarS (f ) VarT (f T ) . N 1

Solutions for Chapter 4

153

d. The crucial issue is that VarT (f T ) =

The notation |T| means the number of elements of T.

1 VarS (f ). N

To see this, we need to develop out: ⌘2 1 X⇣ 1 X 1 X⇣ VarT (f T ) = f T ES (f ) = f (t) |T| |T| N t2T T 2T T 2T ⌘2 1 1 X⇣ 1 = f (s) ES (f ) = VarS (f ) N |S| N

(3) !2

ES (f )

s2S

Using equation (3), we have ✓ ◆ N 1 N N 1 VarS (f ) VarS (f ) = VarS (f ) = VarS (f ). N 1 N N 1 N

Note that one N goes with the m to make up the y; the other two give the area of the square.

4.7.1 a. Theorem 4.7.4 asserts that you can compute integrals by ✓ ◆ parn/N titioning the plane into squares with vertices at the points and m/N sidelength 1/N . The expression X 2 1 1 X X m nm/N 2 lim me nm/N = 2 e 3 N !1 N N N 0n,m 4.8.17 Set A = , so A A = . We need to find c d ab + cd b2 + d2 the maximal eigenvalue of A> A. To find the eigenvalues, we use the characteristic polynomial:  a2 c2 ab cd > ( ) = det( I A A) = det > A A ab cd b2 d2 a2

=( 2

=

c2 )(

2

2

b2

d2 )

( ab

cd)( ab

2

2

2 2

2 2

cd)

2 2

(a + b + c + d ) + a b + a d + b c + c2 d2 | {z } |A|2

(a2 b2 + 2abcd + c2 d2 ) 2

=

|A|2

(ad

bc)2 =

2

|A|2

So using the quadratic formula to solve 2

|A|2

we get Solution 4.8.19: Remember that the determinant of a matrix is a number, so multiplication of determinants is commutative. We have det(QJk P

1

)

= det Q det Jk det P

1

=0

(ad

bc)2 =

2

|A|2

(det A)2 .

(det A)2 = 0,

p |A|4 4(det A)2 = 2 To get the larger eigenvalue, take the positive square root. |A|2 ±

4.8.19 Using Exercise 2.39, the the problem is not too hard. a. Assume A has rank k = n 1. We want to show that the linear transformation [D det(A)] : Mat (n, n) ! R is not the zero linear transformation, i.e., that there is a matrix B such that [D det(A)]B 6= 0. Write [D det(A)]B as follows, where P , Q, Jk are as in Exercise 2.39:

since det Jk = 0; see (for instance) Theorem 4.8.9.

0

[D det(A)]B = [D det(QJk P

)]B = lim

h!0

1

1

1

+ hB h

+ hQQ BP P ) = lim h!0 h 1 1 det Q det P det(Jk + hQ BP ) = lim h!0 h 1 det(J BP ) k + hQ = det Q det P 1 lim . h!0 h = lim

det(QJk P

1

det QJk P

1

h!0

z }| det QJk P

⇣ det Q(Jk + hQ

1

{

1

h

Since Q and P 1 are invertible, we know that det Q 6= 0 and det P so we just need to show that there exists a matrix B satisfying det(Jk + hQ

1

BP ) 6= 0.

1

BP )P

1

⌘

6= 0,

Solutions for Chapter 4

157

Set K = Q 1 BP , so that now we need to show that det(Jk + hK) 6= 0, and let ki,j denote the i, jth entry of K, so that 2 hk hk1,2 ... hk1,n 3 1,1 hk2,n 7 6 hk2,1 1 + hk2,2 . . . 7, det(Jk + hQ 1 BP ) = det(Jk + hK) = det 6 .. .. .. 4 .. 5 . . . . hkn,1

We are assuming k = n 1, so the first column and first row of Jk consist of 0’s, so the first column and first row of Jk + hK equal the first column and first row of hK.

hkn,2

...

1 + hkn,n

Then (as you can easily check in the case n = 3 by working out an example), when we compute the determinant, there is a single term, the term hk1,1 , that is of degree 1 in h; all the other terms have at least h2 , and thus disappear after dividing by h and taking the limit as h ! 0. This yields 1 det(Jk + hQ h!0 h lim

1

BP ) = k1,1 .

Thus if B is any n ⇥ n matrix such that the (1, 1) entry of K = Q not 0, then

1

BP is

1

gives

[D det(A)]B 6= 0. 0 Clearly a matrix K 0 exists with k1,1 6= 0; setting B = QK 0 P 1 1 0 1 0 Q BP = Q QK P P = K .

Solution 4.8.19, part b: You can compute det(Jk +hK) = det(Jk +hQ

1

BP )

using development by the first column or Theorem 4.8.12, which says that you must choose in each term one element from each row and each column. In the first n k columns, every entry in the matrix has a factor of h, so each term of the determinant has a factor of h2 .

b. In this case, entries of the first two columns and first two rows of Jk + hK have an h. For instance, if n = 4 and k = n 2 = 2, we have 2 3 hk1,1 hk1,2 hk1,3 hk1,4 hk2,2 hk2,3 hk2,4 7 6 hk det(Jk + hK) = det 4 2,1 5 hk3,1 hk3,2 1 + hk3,3 hk3,4 hk4,1 hk4,2 hk4,3 1 + hk4,4 When you compute this, the lowest degree term in the expansion will have a factor hn k (for the case above, h2 ), giving lim

h!0

Thus if rank A  n

1 det(Jk + hQ h

1

BP ) = 0

2, we have [D det(A)]B = 0 for all B 2 Mat (n, n).

4.8.21 By definition, M (~ei ) = ~e

(i) .

Thus

M⌧ M (~ei ) = M⌧ (~e

(i) )

= ~e⌧ (

(i))

= ~e(⌧

)(i) ,

whereas M⌧

(~ei ) = ~e(⌧

)(i) .

Thus the linear transformations M⌧ M and M⌧ do the same things to the standard basis vectors (alternatively, the matrices have the same columns), so they coincide. 4.8.23 a. If A is orthogonal, 1 = det I = det A> A = (det A> )(det A) = (det A)2 , so det A = ±1.

158 Solution 4.8.23, part b: If we had introduced dot products in Cn , and had shown that orthogonal matrices are special cases of unitary matrices, the proof would be easier.

Solutions for Chapter 4

b. If is a real eigenvalue with eigenvector ~v 2 Rn , then, since A is orthogonal and preserves lengths, |~v| = |A~v| = | ||~v|,

so | | = 1.

If = a + ib 2 / R, we saw in Exercise 2.14 that Au = au Aw = bu + aw. Since A preserves all lengths, |u|2 + |w|2 = |Au|2 + |Aw|2

= a2 |u|2 + b2 |w|2

bw and

2ab(u · w) + b2 |u|2 + a2 |w|2 + 2ab(u · w)

= (a2 + b2 )(|u|2 + |w|2 ).

So we get | |2 = (a2 + b2 ) = 1, so indeed | | = 1.

c. The eigenvalues of A are +1, 1, and nonreal pairs e±i✓ . Since (Corollary 4.8.25) the determinant is the product of the eigenvalues with multiplicity, and since each pair of complex conjugate eigenvalues contributes +1 as do the eigenvalues +1, of course, we see that the only way that the determinant can be 1 is if 1 occurs as an eigenvalue with odd multiplicity.



cos ✓i sin ✓i sin ✓i cos ✓i gives rotation by ✓i in the plane spanned by v2i and v2i+1 ; see Example 1.3.9. The entry 1 is reflection with respect to some hyperplane. The matrix

The matrix T 1 AT : In the subspace spanned by the eigenvectors with eigenvalue 1, A is the identity. In the subspace spanned by the eigenvectors with eigenvalue 1, A is minus the identity, i.e., a reflection with respect to the origin. In the planes spanned by ~ v2i 1 , ~ v2i A is rotation by some angle.

d. We will prove a stronger result: that if an n ⇥ n matrix A is orthogonal, there exists an orthonormal basis ~v1 , . . . , ~vn of Rn such that if T = [~v1 . . . ~vn ], and we set ci = cos ✓i , si = sin ✓i , then we have: 2 3 c1 s1 6 7 6 7 s1 c1 6 7 6 7 6 7 .. 6 7 . 6 7  6 7 6 7 c s k k 6 7 6 7 s c 6 7 k k 6 7 1 6 7 1 T AT = 66 7 7 .. 6 7 6 7 . 6 7 6 7 6 7 1 6 7 6 7 6 7 1 6 7 6 7 .. 6 7 . 4 5 1

(all the blank entries are 0). Let A : Rn ! Rn be a linear transformation, and V ⇢ Rn . We will say that A is orthogonal on V if A(V ) ⇢ V and A preserves all dot products of elements of V . Work by induction on the dimension of a subspace of Rn on which A is orthogonal. Thus suppose that any subspace V ⇢ Rn with AV ⇢ V of dimension < m has an orthonormal basis in which A : V ! V has a matrix as above, and let us prove it for a subspace V of dimension m. By the fundamental theorem of algebra, every linear transformation has an eigenvalue (perhaps complex). Let ~v 2 Cn be an eigenvector of A : V ! V , i.e., A~v = ~v. Suppose first that is real, i.e., = ±1. Then ~ · ~v = 0, then A(~v? ) = ~v? : if w 0 = A~ w · A~v = ±A~ w · ~v.

Solutions for Chapter 4

159

But ~v? has dimension n 1. ~ 2 Rn . If = a + ib = ei✓ with b 6= 0, we can write ~v = ~u + i~ w with ~u, w We need to show three things: ~ are orthogonal 1. that ~u and w 2. that they have the same length, so they can be rescaled to be unit vectors forming an orthonormal basis of the plane P that they span. 3. if a third vector ~z is orthogonal to the plane P , then A~z is also orthogonal to P . For the first, since A is orthogonal it preserves all dot products, so b~ w) · (b~u + a~ w = (a2

~u · w ~ = A~u · A~ w = (a~u

~ ), b2 )(~u · w

and by subtracting (a2

~ ) = ~u · w ~ b2 )(~u · w

~ ) = ~u · w ~ from (a2 + b2 )(~u · w

~ ) = 0. Since b 6= 0 we see that ~u · w ~ = 0. we get b2 (~u · w For the second, |~u|2 = |A~u|2 = |a~u

b~ w|2 = a2 |~u|2 + b2 |~ w|2 ,

so (1 a2 )|~u|2 = b2 |~u|2 = b2 |~ w|2 , and since b 6= 0 we have |~u| = |~ w|. Thus ~ have the same length and are orthogonal to each other, the vectors ~u and w so they can be scaled to be an orthonormal basis of P . In that basis the matrix of A : P ! P is   a b cos ✓ sin ✓ = if = ei✓ . b a sin ✓ cos ✓ Again P ? is mapped to itself by A, and all dot products of P ? are preserved, and again P has lower dimension. 4.9.1 It doesn’t matter what A is, so long as it has positive and finite volume: since T has a triangular matrix, voln (T (A)) = | det T | = n!, voln (A) by Theorem 4.8.9. 4.9.3 a. By Fubini’s theorem, this volume is Z 1Z 1 zZ 1 z y Z 1Z 1 z dx dy dz = 1 z 0

0

0

0

=

Z

1

0

=

Z

0

y dy dz

0

1

 (1

1 (1 2

z)y 2

y2 2

z) dz =

1 z

dz 0



1 (1 6

1

z)3

= 0

1 6

160

Solution 4.9.3, part b: By “maps the tetrahedron T1 onto the tetrahedron T2 ” we mean that S is an onto map (in fact, it is bijective) that takes a point in T1 and gives a point in T2 ; for every point y 2 T2 there is a point x 2 T1 such that Sx = y.

Solutions for Chapter 4

2

2 b. The matrix S = 4 1 1 tetrahedron T2 . Thus

1 3 1

3 2 5 5 maps the tetrahedron T1 onto the 2

vol T2 = | det S| vol T1 =

33 11 = . 6 2

4.9.5 The function A is given by  f (x) g(x) A(x) = det 0 = f (x)g 0 (x) f (x) g 0 (x)

f 0 (x)g(x).

So its derivative is given by A0 (x) = f 0 (x)g 0 (x) + f (x)g 00 (x) = q(x) f (x)g(x)

f 0 (x)g 0 (x)

f 00 (x)g(x)

f (x)g(x) = 0.

Here we use (ab)0 = ab0 + ba0 .

So the function is constant: A(x) = A(0). 4.9.7 a. First let us see why there exists such a function and why it is unique. Existence is easy (since we have Theorem and Definition 4.8.1): the function | det T | satisfies properties 1–5. The proof of uniqueness is essentially the same as the proof of uniqueness for the determinant: to compute e (T ) from properties 1–5, column reduce T . Clearly operations of type 2 and 3 do not change the value of e . Operations of type 3 switch two columns, which by property 2 does not change the value of e . Operations of type 2 add a multiple of one column onto another, which by property 4 does not change the value either. By property 3, an operation of type 1 multiplies e (T ) by |µ|, the factor you multiply a column by. So keep track, for each column operation of type 1, of the numbers µ1 , . . . , µk that you multiply columns by. At the end we see that ⇢ 0 if T does not column reduce to the identity e (T ) = 1 1 e = (I) if T column reduces to the identity. |µ1 ···µk |

By property 5,

|µ1 ···µk |

1 1 e (I) = . |µ1 · · · µk | |µ1 · · · µk |

b. Now we show that the mapping T 7! voln (T (Q)) satisfies properties 1–5. Properties 1, 2, and 5 are straightforward, but 3 and 4 are quite delicate. Property 1: It is not actually necessary to prove that voln (T (Q)) 0, but it follows immediately from Definition 4.1.17 of voln and from the fact (Definition 4.1.1) that the indicator function is never negative. Property 2: Since the mapping Q 7! T (Q) consists of taking each point in Q, considering it as a vector, and multiplying it by T , changing the order

Solutions for Chapter 4

161

of the vectors does not change the result; it just changes the labeling of the vertices of T (Q). So voln (T (Q)) remains unchanged. s1 ~ v1 + (as1 + 1)~ v2

C

s1 ~ v1 + ~ v2

~v2 B s1 ~ v1 + as1 ~ v2

~ v1 + a~ v2

A ~ s1 ~v1 v1 Here T = [~ v1 , ~ v2 ], and the subset T (Q) is A [ B; the subset [~ v1 + a~ v2 , ~ v2 ](Q) is B [ C. We need to show that vol2 A [ B = vol2 B [ C, which it does, since B [ C = B [ (A + ~ v2 ) and (by Proposition 4.1.22), vol2 (A + ~ v2 ) = vol2 (A).

Properties 3 and 4 reflect properties that voln obviously ought to have if our definition of n-dimensional volume is to be consistent with our intuition about volume (not to mention the formulas for area and volume you learned in high school). The n vectors ~v1 , . . . , ~vn span an n-dimensional “parallelogram” (parallelogram in R2 , parallelepiped in R3 , . . . ). Saying that voln should have property 3 says exactly that if you multiply one side of such a parallelogram by a number a, keeping the others constant, it should multiply the volume by |a|. Saying that voln should have property 4 is a way of saying that translating an object should not change its volume: once we know that voln is invariant by translation, property 4 will follow. It isn’t crystal clear that our definition using dyadic decompositions actually has these properties. The main tool to do this problem is Proposition 4.1.22; we will apply it to prove property 4. Consider the following three subsets of Rn A = { s1 ~v1 + t~v2 + s3 ~v3 + · · · + sn ~vn | 0  si < 1, t 2 [0, as1 ) }

B = { s1 ~v1 + t~v2 + s3 ~v3 + · · · + sn ~vn | 0  si < 1, t 2 [as1 , 1) }

C = { s1 ~v1 + t~v2 + s3 ~v3 + · · · + sn ~vn | 0  si < 1, t 2 [1, 1 + as1 ) } . These are shown in the margin for n = 2. Then

[~v1 , . . . , ~vn ](Q) = A[B,

[~v1 +a~v2 , . . . , ~vn ](Q) = B[C,

C = A+~v2 .

Property 4 follows. For property 3, we will start by showing that it is true for a = 1/2, 1/3, 1/4, . . . . Choose m > 0, and define Ak by ⇢ k 1 k Ak,m = t~v1 + s2 ~v2 + s3 ~v3 + · · · + sn ~vn 0  si < 1, t< . m m Then for any m, the Ak,m are disjoint, Ak = A1 + the same volume, and m [

k=1

Ak = [~v1 , . . . , ~vn ](Q),

so

voln (Ak ) =

k 1 ~ m v1

so they all have

1 [~v1 , . . . , ~vn ](Q). m

Now take a > 0 to be any positive real. Choose m, and let p be the number such that p/m  a < (p + 1)/m. Then [pk=1 Ak ⇢ [a~v1 , . . . , ~vn ](Q) ⇢ [p+1 k=1 Ak ,

giving

p p+1 voln [~v1 , . . . , ~vn ](Q)  voln [a~v1 , . . . , ~vn ](Q)  voln [~v1 , . . . , ~vn ](Q). m m Clearly, the left and right terms can be made arbitrarily close by taking m sufficiently large, so p voln [a~v1 , . . . , ~vn ](Q) = lim voln [~v1 , . . . , ~vn ](Q) m!1 m = a voln [~v1 , . . . , ~vn ](Q).

162

Solutions for Chapter 4

Finally, if a < 0, we have ⇥ ⇤ [a~v1 , . . . , ~vn ](Q) = |a|~v1 , . . . , ~vn (Q)

so

a~v1 ,

voln [a~v1 , . . . , ~vn ](Q) = voln [ a~v1 , . . . , ~vn ](Q) = |a| voln [~v1 , . . . , ~vn ](Q). Property 5: By Proposition 4.1.20, I 7! voln [~e1 , . . . , ~en ](Q) = voln (Q) = 1. 4.10.1

Z

p R2 (x2 + y 2 ) dx dy =

DR

=

=

=

Z Z Z Z

Since x2 + y 2  R2 , y goes from

R R R R R R R

y 2 to Z p

p R2

R2 y 2

p

✓

y 2 . So Z

(x2 + y 2 ) dx dy =

R2 y 2

2(R2

y 2 )3/2 3

0 ⇣ 2 B2 R 1 @ 2R3 (1

R

R to R, and x goes from

p + 2 R2

y 2 (R )

3

⌘3/2

0 p 2 2 1  3 R y x @ A dy + y2 x p 3 R R2 y 2

R

y2 y2

◆

dy

r

2 2

y2 R y C ) A dy R2 R2

+ 2 R2 (1

y 2 3/2 (R ) ) + 2R3 3

r

1

1

⇣ y ⌘2 ⇣ y ⌘2 R

R

!

y Now set sin ✓ = R (which we can do because y goes from dy = R cos ✓ d✓, and continue: ◆ Z Z ⇡2 ✓ 2 2 2 2 4 4 2 (x + y ) dx dy = R cos ✓ + 2 cos ✓ sin ✓ d✓ ⇡ 3 DR 2 0 integrates to 0 Z ⇡2 z }| { 1 4 @ (1 + 2 cos 2✓ + cos2 2✓) + 1 (1 =R In the second and third lines of ⇡ 6 2 this equation, cos 2✓ and cos 4✓ in2

tegrate to 0 because 2✓ goes from ⇡ to ⇡, where cos is as often negative as positive.

integrates to 0

= R4 = R4

Z Z

⇡ 2 ⇡ 2 ⇡ 2 ⇡ 2

1 1 + + 6 12

z }| { cos 4✓ 12

dy. R to R) so that

1

cos2 2✓)A d✓

integrates to 0

1 + 2

1 4

z }| { cos 4✓ 4

!

d✓

1 R4 ⇡ d✓ = . 2 2

1 1 4.10.3 Since |z 2 2| = 2 has no solutions, squaring both sides of 1 1 2 |z | = does not add more points to the graph. Squaring both sides 2 2 1 2 1 2 gives |z 2 | = 4 , which, using de Moivre’s formula (equation 0.7.13), can be rewritten as

Solutions for Chapter 4 Remember (Definition 0.7.3) that |a + ib|2 = a2 + b2 ; it does not equal a2 + 2iab b2 .

163

1 = r2 cos2 ✓ + ir2 sin2 ✓ 4

1 2

2

= r4 cos2 2✓

r2 cos 2✓ +

1 + r4 sin2 2✓, 4

which gives the desired result: r2 (r2

Solution 4.10.5, part b: This linear transformation is given by 2 3 a 0 0 4 0 b 0 5. By Theorem 4.8.9, 0 0 c its determinant is abc. Solution 4.10.7: For example, if A is the identity, then XA is the unit ball: QI (~x) = ~x · ~x = |~x|2 . More generally, XA is an n-dimensional ellipsoid; we are computing the n-dimensional volume of such ellipsoids. The volumes of the unit balls are computed in Example 4.5.7.

i.e., r2 = cos 2✓.

cos 2✓) = 0,

⇣ ⌘ ⇣ ⌘ u = au , which maps v bv  a 0 the unit circle to the ellipse, i.e., the transformation given by ; see 0 b Example 4.9.9. The area of the ellipse is the area of the circle multiplied by | det T | = |ab|, so the area is ⇡|ab|. 0 1 0 1 u au b. This time, use the linear change of variables T @ v A = @ bv A. w cw This maps the unit sphere to the ellipsoid, whose volume is therefore 4⇡/3 multiplied by | det T | = |abc|. 4.10.5 a. Use the linear change of variables T

4.10.7 Let XA ⇢ Rn denote the subset QA (~x)  1. Suppose that there exists a symmetric matrix C such that C 2 = C > C = A. Note that p | det C| = det A. The equation QA (~x)  1 can be rewritten (C~x) · (C~x)  1: 1

QA (~x) = x · A~x = ~x · C 2~x = ~x> C > C~x = (C~x)> C~x = C~x · C~x.

In other words, the linear transformation C turns XA into the unit ball Bn ⇢ Rn : it takes a point in Rn satisfying x · Ax  1 and returns a point Cx 2 Rn satisfying |Cx|2  1. This gives voln (XA )

= |{z}

thm. 4.9.1

voln (Bn ) voln (Bn ) = p . | det C| det A

(1)

Checking that C exists requires the spectral theorem (Theorem 3.7.15) and also some notions about changes of basis. By Theorem 3.7.15, there exists an orthonormal basis ~v1 , . . . ~vn of Rn such that A~vi = i ~vi . Since the quadratic form Q(~v) = ~v · A~v is positive definite by definition, 0 < ~vi · A~vi = ~vi · Recall (Proposition and Definition 2.4.19) that T is an orthogonal matrix if and only if T > T = I: if T = [~ v1 , . . . , ~ vn ], then the i, jth entry of T > T is ~ vi> ~ vj = ~ vi · ~ vj . Since the ~ vi form an orthonormal basis, this dot product is 1 when i = j and 0 otherwise. Thus T > T = I.

vi i~

=

vi |2 , i |~

which implies that all the eigenvalues i are positive. Let T = [~v1 , . . . , ~vn ]. We need two properties of T : 1. Since T is an orthogonal matrix, T > T = I. 2. By Proposition 2.7.5, 2 0 ... 0 1 0 0 6 2 ... T 1 AT = 6 .. .. .. 4 ... . . . 0

0

...

n

3

7 7. 5

(2)

164

Solutions for Chapter 4

Let us set 2p 1

6 0 C=T6 4 .. . 0

C>

p0 .. . 0

2

... ... .. . ...

3 0 0 7 .. 7 5T p. n

1

2p

6 0 =T6 4 .. . 0

1

Then C is symmetric: 0 2p 3 1> 2p ... 0 1 p0 1 0 7 >C 0 B 6 0 6 2 ... 6 C =B = (T > )> 6 .. .. 7 .. @T 4 .. 5T A 4 .. |{z} . . . . . p Theorem 1.2.17 0 0 ... 0 n 2p 3 0 . . . 0 1 p 0 7 > 6 0 2 ... =T6 . . .. 7 . 4 . 5 T = C. .. .. . p. 0 0 ... n Moreover, 2p 1 p0 0 6 2 C2 = T 6 .. 4 .. . . 0 0 2p 1 p0 6 0 2 =T6 .. 4 .. . . 0 0

Solution 4.10.7: Our construction of the square root C of A is a special case of functional calculus. Given any real-valued function of a real variable, we can apply the function to real symmetric matrices by the same procedure: diagonalize the matrix using the spectral theorem, apply the function to the eigenvalues, and “undiagonalize” back. The same idea applied to linear operators in infinite-dimensional vector spaces is a fundamental part of functional analysis. Elliptical change of coordinates for Solution 4.10.9: 0 1 0 1 r x = ar cos ✓ cos ' E : @ ✓ A 7! @ y = br sin ✓ cos ' A ' z = cr sin '

... ... .. . ... ... ... .. . ...

p0 .. . 0 p0 .. . 0

3 2p 0 ... 1 p0 0 7 > 6 0 2 ... 6 .. 7 .. .. 5T | {zT} 4 ... . . . p I 0 0 . . . n 32 2 0 0 ... 1 0 7 > 6 0 2 ... 6 .. .. 7 .. 5 T = T 4 ... . . p. 0 0 ... n

2

... ... .. . ...

2

... ... .. . ...

3 0 0 7 > .. 7 5T . p. n

3> 0 0 7 > .. 7 5 T p. n

3 0 0 7 > .. 7 5T p. n

3

0 0 7 > T |{z} = A. .. 7 . 5 eq. (2)

n

This confirms our initial assumption that there exists a symmetric matrix C such that C 2 = A, and equation 1 is correct: the volume of XA is voln (Bn ) voln (XA ) = p , det A

where Bn is the unit ball in Rn .

4.10.9 Assume that a, b, and c are all positive, and use the “elliptical change of coordinates” E shown in the margin. The region V corresponds to the region 80 1 9 < r ⇡ ⇡= U = @ ✓ A 2 R3 0 < r  1, 0  ✓  , 0  ' < : 2 2; ' in our coordinates. Before we can use these coordinates to integrate the desired function, we must calculate | det[DE]|. To get [DE] from [DS] (calculated in equation 4.10.22), we need only multiply the first, second, and third rows of [DS] by a, b, and c respectively, to get 2 0 13 2 3 r a cos ✓ cos ' ar sin ✓ cos ' ar cos ✓ sin ' 4DE @ ✓ A5 = 4 b sin ✓ cos ' br cos ✓ cos ' br sin ✓ sin ' 5 . ' c sin ' 0 cr cos '

Solutions for Chapter 4

By equation 4.10.23, 2 0 13 r det 4DS @ ✓ A5 = r2 cos '. '

165

Theorem 4.8.8, together with equation 4.8.16, tells us that multiplying a row of a matrix by a scalar a scales the determinant of the matrix by a, so det[DE] = abc det[DS], and 2 0 13 r det 4DE @ ✓ A5 = abc r2 cos '. '

Integration is now a simple matter. First change coordinates under the multiple integral: Z Z xyz|dx dy dz| = (ar cos ✓ cos ')(br sin ✓ cos ')(cr sin ')abcr2 cos '|dr d✓ d'|. V

U

Rearranging terms and changing to an iterated integral yields 2

(abc)

Z

Z

1

0

⇡/2

0

✓Z

⇡/2

5

!

◆

3

r sin ✓ cos ✓ cos ' sin ' d✓ d'

0

dr.

This is quite simple to evaluate if one notes that cos ✓ d✓ = d sin ✓

and

sin ' d' = d cos '.

The result is (abc)2 /48. 4.10.11 Use spherical coordinates, to find Z

0

R

Z

2⇡

0

Z

⇡/2

2

r cos ' d' d✓ dr = ⇡/2

Z

R

0

=

Z

Z

0

R⇥

2⇡ ⇥

⇤⇡/2 r2 sin ' ⇡/2 d✓ dr

2

2r ✓

0

⇤2⇡ 0

dr =



R

4⇡r3 3

=

4R3 ⇡ . 3

=

⇡ . 6

0

4.10.13 Cylindrical coordinates are appropriate here. We find Z

0

2⇡

Z

0

1

Z

r

r dz dr d✓ = 2⇡

r2

Z

1

r(r

2

r ) dr = 2⇡

0

✓

1 3

4.10.15 a. The curve is drawn at left. b. The x-coordinate of the center of gravity is the point Figure for Solution 4.10.15 The curve of part a

x=

R ⇡/4 R cos 2✓

(r cos ✓)r dr d✓

⇡/4 0 R ⇡/4 R cos 2✓ ⇡/4 0

We now compute these two integrals.

r dr d✓

.

1 4

◆

166

Solutions for Chapter 4

The numerator gives Z Z ⇡/4 1 ⇡/4 1 cos3 2✓ cos ✓ d✓ = (cos 6✓ cos ✓ + 3 cos 2✓ cos ✓) d✓ 3 ⇡/4 12 ⇡/4 Z ⇡/4 1 = (cos 7✓ + cos 5✓ + 3 cos 3✓ + 3 cos ✓) d✓ 24 ⇡/4 p ✓ ◆ 2 1 1 = +1+3 . 24 7 5 The denominator gives Z

⇡/4 ⇡/4

Z

cos 2✓

0

1 r dr d✓ = 2

Z

⇡/4

cos2 2✓ d✓ =

⇡/4

⇡ . 8

p After some arithmetic, this comes out to x = (44 2)/(35⇡) ⇡ .600211 . . . . By symmetry, we have y = 0.

Figure for Solution 4.10.17 The shaded region is the region A, bounded by the arcs of the curves

4.10.17 a. The region A is shown in the figure in the margin. It is bounded by the arcs of the curves y = ex + 1 and y = ea (ex + 1) where 0  x  a, and the arcs of the curves y = e x + 1 and y = ea (e x + 1) where a  x  0. b. In this case, we can find an explicit inverse of : we need to solve, for u and v in terms of x and y, the system of equations u

y = ex + 1 and y = ea (ex + 1) where 0  x  a, and the arcs of the curves y=e where

x

a

+ 1 and y = e (e a  x  0.

x

v=x

u

e + ev = y

.

Write u = x + v, and substitute in the second equation, to find

+ 1)

ex+v + ev = ev (ex + 1) = y. Since y

0 in A and ex + 1 > 0 always, this leads to v = ln

An exactly analogous argument gives y u = ln x . e +1 c. We have

 h ⇣ ⌘i 1 x det D y = det u e

ex

1 = eu + ev . ev

The change of variables formula gives Z Z y|dx dy| = (eu + ev )(eu + ev ) |du dv| A

=

Qa aZ a

Z

0

e2u + 2eu ev + e2v du dv

0

= a(e2a

1) + 2(ea

1)2 .

y . +1

Solutions for Chapter 4

167

4.10.19

a. The image ✓ of◆0  u  1, v = 0, is the arc of parabola u parametrized by u 7! , i.e., the arc of the parabola y = x2 where u2 0  x  1. The image of 0◆  u  1, v = 1, is the arc of parabola parametrized ✓ u 1 by u 7! , i.e., the arc of the parabola y = (x + 1)2 + 1 where u2 + 1 1  x  0. The ⇣ image ⌘ of u = 0, 0  v  1, is the arc of parabola parametrized by v 2 , i.e., the arc of the parabola x = y 2 where 0  y  1. v 7! v

The image of ⌘u = 1, 0  v  1, is the arc of parabola parametrized ⇣ 2 by v 7! 11 +vv , i.e., the arc of the parabola x 1 = (y 1)2 where 1  y  2. These curves are shown in the margin. b. We need to show that is 1–1 so that be able ⇣ we ⌘ will ⇣ ⌘ to apply the u u 1 2 change of variables formula in part c. If v1 = v2 , we get u1

v12 = u2

v22

u21 + v1 = u22 + v2 . Figure for Solution 4.10.19 The shaded area is (Q). We have two vertical parabolas; the shaded region is inside the lower one and outside the upper one. There are also two horizontal parabolas; the shaded region is inside the rightmost one and outside the leftmost one.

These equations can be rewritten u2 = v12

u1 (u1

v22 = (v1

u2 )(u1 + u2 ) = v2

v2 )(v1 + v2 )

v1 .

In the region under consideration, v1 + v2 0 and u1 + u2 0. It follows from the first equation that if v1 6= v2 , then u1 6= u2 , and both v1 v2 and u1 u2 have the same sign. The same argument for the second equation says that if u1 6= u2 , then v1 6= v2 , and v1 v2 and u1 u2 have opposite sign. So u1 = u2 and v1 = v2 .  h ⇣ ⌘i 1 2v u c. We have det D = det = 1+4uv, which is positive v 2u 1 in Q, so Z Z Z 1Z 1 x |dx dy| = (u v 2 )(1 + 4uv)|du dv| = (u + 4u2 v v 2 4uv 3 ) du dv A

Q

= =

Z 

1

0



0

u2 4u3 v + 2 3

v 2v + 2 3

2

3

v 3

0

u=1

v2 u

2u2 v 3

dv = u=0

4 1

v 2

= 0

Z

0

1

✓

1 4v + 2 3

v2

◆ 2v 3 dv

1 . 3

0 1 0 1 r r cos ✓ 4.10.21 a. Let @ ✓ A = @ r sin ✓ A, and let the region B in (r, ✓, z)z z space be a subset on which is 1–1. Set A = (B). If f : A ! R is

168

Solutions for Chapter 4

integrable, then rf

r ✓ z

!!

is integrable over B, and

0 0 11 0 1 Z r x rf @ @ ✓ AA |dr d✓ dz| = f @ y A |dx dy dz|. B A z z

Z

In this case, ⇣ the ⌘ appropriate cylindrical coordinates consist of keeping x, and writing yz in polar coordinates. Then r is the distance to the x-axis, so the integral becomes Z b Z 2⇡ Z f (x) Z ⇡ b 4 r r2 dr d✓ dx = f (x) dx. 2 a a 0 0 b. Applying the formula above gives that the moment of inertia is Z Z ⇡ ⇡/2 ⇡ ⇡/2 4 cos x dx = cos2 x(1 sin2 x) dx 2 2 ⇡/2 ⇡/2 ◆ Z ⇡/2 ✓ ⇡ sin2 2x 2 = cos x dx 2 4 ⇡/2 ⇡ ⇣ ⇡ ⇡ ⌘ 3⇡ 2 = = . 2 2 8 16 4.11.1 Set f (x) = |x|p 1D (x), where D is the unit disc. Let An ⇢ R2 be the set 1/n < |x|  1 and set fn = |x|p 1An . Then each fn is L-integrable (in fact, R-integrable), 0  f1  f2  · · · , and limn!1 fn = f , so f is L integrable precisely if the sequence Z Z an = fn (x) |d2 x| = |x|p |d2 x| R2

An

is bounded. This sequence is easy to compute explicitly by passing to polar coordinates: ! ⇢ 2⇡ Z 2⇡ Z 1 1 1 np+2 if p 6= 2 p an = r r dr d✓ = p+2 2⇡ ln n if p = 2. 0 1/n Thus we see that f is integrable precisely if p > integral is 2⇡/(p + 2).

2; in that case, the

4.11.3 Using the change of variables for polar coordinates, we have the following, where A = (1, 1) ⇥ [0, 2⇡): ✓ ◆p ◆p Z Z ✓ x r cos ✓ |dx dy| = r |dr d✓| y r sin ✓ R2 B1 (0) A Z 2⇡ Z 1 Z 1 p+1 = r dr d✓ = 2⇡ rp+1 dr. 0

1

1

Solutions for Chapter 4

169

By Theorem 4.11.20, the existence of this last integral is equivalent to the existence of the original integral. If p 6= 2, we can write Z 1 1 Z n+1 1  p+2 n+1 X X r rp+1 dr = rp+1 dr = . p +2 n 1 n n=1 n=1

Our definition of integrability says that if this series converges, the integral exists. But the series telescopes: m  p+2 n+1 X r 1 = (m + 1)p+2 1 p + 2 p + 2 n n=1 and converges if p + 2 < 0, i.e., if p < 2. Moreover, since |x|p > 0, the integral is then precisely the sum of the series, so if p < 2, we have ✓ ◆p Z 1 x |dx dy| = . y p + 2 2 R B1 (0)

We now need to see that if p 2, the integral does not exist. By the monotone convergence theorem, the integral will fail to converge if Z m sup rp+1 dr = 1. m!1

1

There are two cases to consider: if p 6= 2, we just computed that integral, to find Z m 1 rp+1 dr = (m + 1)p+2 1 , p + 2 1 which tends to infinity with m if p + 2 > 0. If p = 2, then Z m 1 sup dr = sup ln m = 1. r m!1 1 m!1 4.11.5 Let An =

x 2 Rn Rn

2n  |x| < 2n+1 , so that B1 (0) = A0 [ A1 , [ · · · .

Then |x|p 1Rn

B1 (0)

=

1 X

m=0

|x|p 1Am ,

and so |x|p is L-integrable over Rn B1 (0) precisely if the series 1 Z X |x|p |dn x| is convergent. m=0

Am

The mapping 'm (x) = 2m x maps A1 to Am , and det[D'm ] = 2nm . So applying the change of variables formula gives Z Z Z |x|p |dn x = 2mp |x|p 2nm |dn x| = 2m(p+n) |x|p |dn x|. Am

A1

A1

170

Solutions for Chapter 4

Thus our series is 1 X

2m(n

p)

Z

A1

m=0

|x|p |dn x|,

where the integral is some constant. It is convergent exactly when n + p < 0,

i.e., if p
1. (See Exercise 4.11.3; note that the present p is half what is denoted p there.) This is seen as follows: take the eight subsets Am,1 , . . . , Am,8 obtained by reflecting Am with respect to both axes and both diagonals. These eight copies cover the complement of the square S where |x|  1, |y|  1, with some points covered twice when m > 1. Thus Z Z 1 1 |dx dy|  |dx dy| 2 2 p 2 2 p Am (x + y ) R2 S (x + y ) Z 1 8 |dx dy|. 2 + y 2 )p (x Am Clearly, integrability over R2 B1 (0) and over R2 S are equivalent, since 1/(x2 + y 2 )p is always integrable over S B1 (0). The case where m < 1 is more delicate. In that case, in Am we have x2  x2 + y 2  2x2 , so the integrability of 1/(x2 + y 2 )p is equivalent to integrability of 1/x2p . But this integral can be computed explicitly: ! Z 1 Z xm Z 1 1 dy dx = xm 2p dx, x2p 1 0 1 which is finite precisely if m

2p
(m + 1)/2.

4.11.9 There are no difficulties associated with infinities in this case: Z Z 1 i⇠x \ 1[ 1,1] (⇠) = 1[ 1,1] (x)e dx = ei⇠x dx =



R

1

i⇠x 1

e i⇠

i⇠

= 1

e

e i⇠

i⇠

=2

sin ⇠ . ⇠

T heonlyf ishypartisthecase Solution 4.11.11: This is a matter of pinning down the fact that exponentials win over polynomials.

⇠ = 0; but note that lim 2

⇠!0

sin ⇠ = 2 = 1\ [ 1,1] (0). ⇠

4.11.11 It is enough to show the result for any monomial xk11 · · · xknn . Note that |xki i |  (1 + |x|)ki , so |xk11 . . . xknn |  (1 + |x|)k ,

Solutions for Chapter 4

171

where k = k1 + · · · + kn . Thus it is enough to show that the integral Z 2 (1 + |x|)k e |x| |dn x| Rn

exists for every k Z

0. Developing out the power, it is enough to show that

Rn

|x|k e

|x|2

|dn x| exists for every k

0.

Let us break up Rn into the unit ball B and the sets Am =

x 2 Rn

2m < |x|  2m+1

.

We can then write |x|k = |x|k 1B (x) +

1 X

m=0

def

|x|k 1Am (x) = g(x) +

1 X

fm (x),

m=0

and g and all the fm are R-integrable. Therefore it is enough to prove that the series of numbers 1 Z X fm (x)|dn x| m=1

Rn

is convergent. The change of variable m : x 7! 2m x transforms the region A1 into the region Am , and the change of variables formula gives Z Z 2m 2 2m def n fm (x)|d x| = 2mn 2km |x|k e 2 |x| |dn x|  2mn 2km 2k e 2 = am . Rn

A1

The ratio test, applied to this series, gives 2(m+1)

am+1 2(m+1)n 2k(m+1) 2k e 2 2n+k 2n+k = = = , 2m am 2mn 2km 2k e 2 e3 22m e22(m+1) 22m which clearly tends to 0 as m tends to infinity, so the series converges. Here is another solution, by Vorrapan Chandee, when a freshman at Cornell: It is clearly enough to show the result for any monomial xk11 · · · xknn . By Fubini’s theorem, we have Z Z 2 2 2 xk11 · · · xknn e |x| |dn x| = xk11 e x1 · · · xknn e xn |dn x| n n R R ✓Z ◆ ✓Z ◆ k1 x21 kn x2n = x1 e |dx1 | · · · xn e |dxn | . R

R

The original integral exists if each integral in the above product R exists. 2 So it suffices to show that the one-dimensional integral R xk e x |dx| exists for k =R0, 1, 2, . . . . By a symmetry argument, we can reduce this to 2 2 1 showing that 0 xk e x dx exists. Indeed, xk e x is eventually dominated by e x , which is easily seen to be integrable over [0, 1). 4.11.13 a. For the first sequence, if the fk converge to anything, they must converge to the function 0. But if you take ✏ = 1/2, then for x = k + 1/2 we have fk (x) 0 = 1 > ✏.

172

Solutions for Chapter 4

1 For the second example, even if you take ✏ = 1 and xk = 2k , we have fk (xk ) 0 = k 1. For the third, let f1 be the function that is 1 on the rationals and 0 on the irrationals. Set xk = ak+1 and ✏ = 1/2. Then

|fk (xk )

f1 (xk )| = 1 > ✏.

b. Write p = a0 + a1 x + · · · + am xm . Suppose that k 7! pk converges uniformly to p. Choose ✏ > 0; our assumption says that there exists K such that when k K, then for any x we have |pk (x) p(x)| < ✏. But pk (x) p(x) is a polynomial, and the only bounded polynomials are the constants. Thus pk (x) p(x) = ck for the constant ck = a0,k a0 when k K. This proves that all the coefficients of the pk are eventually constant, except perhaps the constant term. Moreover, the inequality |pk (x)

p(x)| = |a0,k

ak | < ✏

clearly implies that k 7! ak,0 converges to ak . In the other direction we must show that if all the nonzero coefficients are constant, and the constant coefficient converges, then the sequence k 7! pk converges uniformly to p. Suppose the coefficients ai,k are eventually constant for i > 0, so we can set ai,k = ai

for k > K and i > 0,

and suppose a0,k converges to a0 . Set p = a0 + a1 x + · · · + am xm ; clearly, if k > K, then p(x)

pk (x) = a0

a0,k ,

which converges to 0; therefore, pk converges uniformly to p.

Solution 4.11.13, part c: The existence of M uses the hypothesis that A is bounded; any polynomial has a maximum on any bounded set.

c. If the two sequences of functions k 7! fk and k 7! gk converge uniformly to f and g, then the sequence k 7! fk + gk converges uniformly to f + g. Thus it is enough to show that if the sequence of numbers k 7! ak converges, say to a1 , then the sequence of functions k 7! ak xi 1A converges uniformly on R. Let M be the maximum of |x|i on A. Choose ✏ > 0. There exists K such that if k > K, then |ak a1 | < ✏, and then |ak xi

a1 xi |  ✏M.

This can be made arbitrarily small by choosing ✏ sufficiently small. 4.11.15 Since this is a telescoping series, it is enough to show that the last term of any partial sum tends to 0. Since x(ln x 1) is an antiderivative of ln x, the last term of ✓ ◆ n X1 Z 1/2i 1 1 1 | ln x| dx is ln n 1 = (n ln 2 + 1) . n n 2 2 2 i+1 i=1 1/2

Solutions for Chapter 4

Solution 4.11.17 illustrates the fact that if we allow improper integrals whose existence depends on cancellations, the change of variables formula is not true. The idea of doing integration theory without being able to make changes of variables is ridiculous.

173

This tends to 0 when n tends to infinity: 2n grows much faster than n. Z 1 1 1 4.11.17 After change of variables, this integral becomes sin du. u u 0 As an improper integral, this should mean Z A 1 1 lim sin du; A!1 0 u u

see equation 4.11.52. However, the integral inside the limit above does not exist; the integrand u1 sin u1 oscillates between the graph of 1/u and the graph of 1/u; near u = 0 the functions 1/u and 1/u are not integrable. 4.11.19 a. Let n 7! cn be an increasing sequence of numbers converging to b, and set gn (x) = |f (x)|1[a,cn ] .

Exercise 4.11.19 brings out the importance of the monotone convergence theorem and the dominated convergence theorem when proving results about Lebesgue integrals.

Rc Rb Since limc!b a |f (x)| dx exists, it follows that limn!1 a gn (x) dx exists, so hypothesis 4.11.64 of the monotone convergence theorem is satisfied for gn . So by that theorem, |f | is L-integrable. Now set fn (x) = f (x)1[a,cn ) (x). Then |fn (x)|  |f (x)|, so by the dominated convergence theorem, since f (x) = limn!1 fn (x), the function f is L-integrable, and Z cn Z b Z b lim f (x) dx. = lim fn (x) dx = f (x) dx. n!1

n!1

a

a

a

4.11.21 Let m  n. Define Bi ⇢ Rn as in equation 4.11.45, and let B0 be Pk the unit ball. Set fi = f 1Bi and gk = i=1 fi . For i 1, we have fi (x)

The sequence k 7! Z

Rn

n

gk (x)|d x| =

R

Rn

gk (x)|dn x| diverges because

k Z X

Rn

i=1

k

1 1B (x). 2|x|m i

1X 2 i=1

fi (x)|dn x|

Z

k

Bi

1 1 X (n n |d x| = 2 |x|m 2 i=1

m)(i 1)

Z

B1

1 |dn x|. |x|m

This is the sequence of partial sums of a divergent geometric series. Thus we are in the situation of part 2 of the monotone convergence theorem (Theorem 4.11.18): f = f0 + sup gk k

and 0  g1  g2  · · · .

R Since supk Rn gk (x)|dn x| = 1, we see that f is not integrable over Rn when m  n.

Solutions for review exercises, chapter 4

4.1 It is enough that UN (1C ) = LN (1C ) because by Lemma 4.1.9, for all M N, UN (1C )

UM (1C )

LM (1C )

LN (1C ).

Thus the upper sums and lower sums are all equal for M function 1C is integrable. But UN (1C ) = LN (1C ) is obviously true.

N , so the

4.3 1. False. For instance, multiplication by 2, i.e., the function [2] : Z ! Z, is not onto; its image is the even integers. 2. True. If A is onto, the image of Zn contains the standard basis vectors, so it spans Rn . In particular, the matrix A viewed as a linear transformation Rn ! Rn , is onto, hence also injective. But any element of the kernel of A : Zn ! Zn is also an element of the kernel of A : Rn ! Rn , so it must be 0. 3. True. The same argument works as for 2: if we view A as representing a linear transformation Rn ! Rn , then det A 6= 0 implies that A is injective, and so is its restriction to Zn . 4. False. The same example as part 1 is a counterexample here also: det[2] 6= 0 but [2] : Z ! Z is not onto. 4.5 There are many ways to approach this problem. One is to observe N 2N 1 X X k+j that 2 e N is a Riemann sum (unfortunately not dyadic) for the N j=1 k=1

function

f

⇣ ⌘ x = ex+y 1 [0,1]⇥[0,2] . y

By Proposition 4.1.16, this function is integrable, with integral ✓Z 2 ◆ ✓Z 1 ◆ ey dy ex dx = (e 1)(e2 1). 0

0

With the arguments at hand, this is a little shaky; if we only considered the numbers N = 2M we would be fine, but the other values of N won’t quite enter into our dyadic formalism. Another possibility is to write 1 ! 0 2N N 1 X k/N @ 1 X j/N A e e N N j=1 k=1

174

Solutions for review exercises, Chapter 4

175

and to observe that both sums are finite geometric series. Using rn+1 1 r (see equation 0.5.4), we see that the first sum is a + ar + · · · + arn = a N X

ek/N = e1/N

k=1

1

1

e(N +1)/N . 1 e1/N

We need to evaluate the limit 1 1/N 1 e(N +1)/N e . N !1 N 1 e1/N lim

Equation 1 uses l’Hˆ opital’s rule.

In this expression, e1/N ! 1 and e(N +1)/N ! e, so the limit that matters is 1 x 1 lim = lim = lim = 1, (1) N !1 N (1 e1/N ) x!0 1 ex x!0 ex giving the limit 1 1/N 1 e(N +1)/N e =e N !1 N 1 e1/N lim

1.

Exactly the same way, we find lim

N !1

So we have in all lim

N !1

N 1 X k/N e N k=1

2N X

ek/N = e2

1.

j=1

1 ! 0 2N X 1 @ ej/N A = (e N j=1

1)(e2

1).

4.7 Set µ(xi ) = µ1 if xi 2 A, and µ(xi ) = µ2 if xi 2 B, and compute as follows: R ✓Z ◆ Z x µ(xi ) |dn x| 1 n n C i xi (C) = = µ1 xi |d x| + µ2 xi |d x| M (C) M (A) + M (B) A B R R ✓ ◆ µ1 xi |dn x| µ2 xi |dn x| 1 = M (A) A + M (B) B M (A) + M (B) M (A) M (B) ⇣ ⌘ 1 = M (A)xi (A) + M (B)xi (B) . M (A) + M (B) Therefore,

x(C) =

M (A)x(A) + M (B)x(B) . M (A) + M (B)

1 4.9 P1 Choose ✏, and write X = [i=1 Bi , where the Bi are pavable sets with i=1 voln (Bi ) < ✏/2. For each i find a dyadic level Ni such that ✏ UNi 1Bi  voln (Bi ) + i+1 . 2

176

Solutions for review exercises, Chapter 4

Then at the level Ni , the set Bi is covered by finitely many dyadic cubes ✏ with total volume at most voln (Bi ) + i+1 . Now list first the dyadic cubes 2 that cover B1 , then the dyadic cubes (at a di↵erent level) that cover B2 , and so on. The total volume of all these cubes is at most 1 ⇣ X ✏ ⌘ ✏ ✏ voln (Bi ) + i+1  + = ✏. 2 2 2 i=1 4.11 a. We can write ◆ Z p2 ✓Z 2 sin(x + y) dy dx or p 2

Z

x2

The first leads to Z

p 2 p 2

the second leads to Z 2⇣ p cos(y y)

⇣ cos(x + x2 ) cos(y +

0

2

0

Z

p y p y

!

sin(x + y) dx dy.

⌘ cos(x + 2) dx;

p ⌘ y) dy =

Z

2

p 2 sin y sin y dy.

0

We believe that neither can be evaluated in elementary terms. Numerically, using Simpson’s rule with 50 subdivisions, the integral comes out to 2.4314344 . . . . Maple gives a solution using Fresnel integrals; the numerical value obtained this way is 2.4314328529574138146 . . . . b. This is much easier: The integral is ◆ ◆ Z 2 ✓Z 2 Z 2 ✓Z 2 56 2 2 2 2 4 (x + y ) dx dy = 4 (x + y ) dy dx = . 3 1 1 1 1 4.13 We have Z 1Z 2 ⇣ ⌘ Z x f y |dx dy| = 0

Z

0

1

Z

0

2

 Z 1 2 bx2 ax + + cxy dy = (2a + 2b + 2cy) dy 2 0 0 0 ⇥ ⇤1 = 2ay + 2by + cy 2 0 = 2a + 2b + c.

0

1

The other integral is longer to compute: Z 1Z 2 ⇣ ⇣ ⌘⌘2 x f y |dx dy| = (a2 + b2 x2 + c2 y 2 + 2abx + 2acy + 2bcxy) dx dy 0

=

Z

0

1

0

✓ ◆ 8b2 2a2 + + 2c2 y 2 + 4ab + 4acy + 4bcy dy 3

8b2 2c2 + + 4ab + 2ac + 2bc. 3 3 The Lagrange multiplier theorem says that at a minimum of the second integral, constrained so that the first integral is 1, there exists a number such that = 2a2 +

[ 4a + 4b + 2c

16b 3

+ 4a + 2c

4c 3

+ 2a + 2b ] =

[2 2 1].

Solutions for review exercises, Chapter 4

177

From this and the constraint we find the system of linear equations 16b + 4a + 2c 3 4c 2a + 2b + c = + 2a + 2b 3 2a + 2b + c = 1.

4a + 4b + 2c =

This could be solved by row reduction, but it is easier to observe that the first equation says that b = 0 andRthe second equation says that c = 0. 2 1R2 Thus a = 1/2 and the minimum is 0 0 12 |dx dy| = 12 . 4.15 In order for the equality to be true for all polynomials of degree  3, it is enough that it should be true for the polynomials 1, x, x2 , x3 . Thus we want the four equations Z 1 Z 1 dx dx p 1. = c(1 + 1) 3. x2 p = c(u2 + u2 ) 1 x2 1 x2 1 1 Z 1 Z 1 dx dx 2. xp = c(u u) 4. x3 p = c(u3 u3 ). 2 1 x 1 x2 1 1 Equations 2 and 4 are true for all c and all u, since both sides are 0. Equation 1 gives Z 1 dx 1 p 2c = = [arcsin x] 1 = ⇡, so c = ⇡/2. 1 x2 1 Equation 3: Computing the integral (using integration by parts), we get Z 1 Z 1p h p i1 x dx 2cu2 = xp = x 1 x2 + 1 x2 dx = ⇡. 1 1 x2 1 1

Solution 4.17: [D det(A)] is not the derivative of det A =  5. It is 1 3 the derivative at A = of 0 1 2 1 a BbC B the function det @ C = ad bc: cA d i.e., it is the row matrix [d

c

b

a],

evaluated at a = 1, b = 3, c = 2, d = 1.

Substituting c = ⇡/2 in 2cu2 = ⇡ gives ⇡u2 = ⇡, i.e., u = ±1. 2 3 a 6b7 4.17 [D det(A)]B = [1 2 3 1] 4 5 = a 2b 3c + d. c d ✓  ◆ .2 .6 a b 1 det A tr(A B) = 5 tr .4 .2 c d =

5( .2a + .6c + .4b

.2d) = a

2b

3c + d.

4.19 The easiest way to compute this integral (at least if one has already done Exercise 4.9.4) is to observe that the volume to be computed is the same as in Exercise 4.9.4, after scaling the x1 coordinate by n, the x2 coordinate by n/2, etc., until the coordinate xn is scaled by n/n = 1. This gives the volume 1 n n n nn · · ··· = . n! 1 2 n (n!)2 Here is the solution to Exercise 4.9.4:

178

Solutions for review exercises, Chapter 4

Let us call An the answer to the problem; we will try to set up an inductive formula for An . By Fubini’s theorem, we find Z 1 Z 1 x1 Z 1 Z 1 x1 Z 1 x1 x2 1 1 A1 = dx2 dx1 = , A2 = dx3 dx2 dx1 = , 2 6 0 0 0 0 0 and more generally Z 1Z 1 An = 0

0

x1

Z

0

We see that if we define Z t Z t x1 Z Bn (t) = 0

0

0

···

Z

···

Z

1 x1 x2

t x1 x2

then

1 x1 ··· xn

1

dxn . . . dx2 dx1 .

0

t x1 ··· xn

1

dxn . . . dx2 dx1 ,

(1)

0

An = Bn (1). Calculate as above B1 (t) = t, B2 (t) = t2 /2, B3 (t) = t3 /6. A natural guess is that Bn (t) = tn /n!. We will show this by induction: a look at formula 1 will show you that Z t Bn+1 (t) = Bn (t x1 ) dx1 . 0

Thus the following computation does the inductive step:  Z t t (t x1 )n+1 tn+1 Bn+1 (t) = Bn (t x1 ) dx1 = = . (n + 1)! 0 (n + 1)! 0 Thus 3

An = Bn (1) =

2

−2 −1

Figure for Solution 4.21 A cardioid

1 . n!

4.21 This curve is called a cardioid (as in “cardiac arrest”). It is shown in the margin. The area, by the change of variables theorem and Fubini’s theorem, is ! Z 2⇡ Z 1+sin ✓ Z 2⇡  2 1+sin ✓ Z 2⇡ r (1 + sin ✓)2 r dr d✓ = d✓ = d✓ 2 0 2 0 0 0 0  Z 2⇡ 2⇡ ✓ (sin ✓)2 = cos ✓ + d✓ 2 2 0 0 ⇡ 3⇡ =⇡+ = . 2 2 4.23 a. In spherical coordinates, Q corresponds to 80 1 9 < r = ⇡ ⇡ U = @ ✓ A 0 < r  1, 0  ✓  , 0  ' < . : 2 2; '

Solutions for review exercises, Chapter 4

179

With this in mind, we can write our integral as an iterated integral: ◆ ! Z Z 1 Z ⇡/2 ✓ Z ⇡/2 3 2 (x + y + z)|d x| = (r cos ✓ cos ' + r sin ✓ cos ' + r sin ')r cos ' d' d✓ dr. Q

0

Z

0

Solution 4.27, part c: This argument may p seem opaque. Why take x = 1/ 2? Since we were trying to find a point in the complement of a set of measure 0, why did we need a number with special properties? These are hard questions to answer. It should be clear that the convergence of the series depends on x being poorly approximated by rational numbers, in the sense that if you want to get close to x you will have to use a large denominator. In fact, part b says that most real numbers are like that. But actually finding one is a di↵erent matter. (If you are asked to “pick a number, any number”, you will most likely come up with a rational number, which has probability 0.) It often happens that we can prove that some property of numbers (like being transcendental) is shared by almost all numbers, without being able to give a single example, at least easily (see Section 0.6). The approximation of numbers by rationals is called Diophantine analysis, and one of its first results is that p quadratic irrationals, like 1/ 2, are poorly approximable; in part c we present a proof.

1

0

0

b. We can use a table of integrals to evaluate the more unpleasant sine and cosine portions of the integral, but it is easier to argue that the x, y, and z contributions to the total must be the same (symmetry), and then evaluate the z contribution: ! ◆ ! Z ⇡/2 ✓ Z ⇡/2 Z 1 Z ⇡/2  2 ⇡/2 3 3 sin ' r sin ' cos ' d' d✓ dr = r d✓ dr 2 0 0 0 0 0 Z 1 Z ⇡/2 3 ! r = d✓ dr 2 0 0 Z 1 3 ⇡r ⇡ = dr = . 4 16 0 The value of our original integral is three times this: 3⇡/16. 4.25 By symmetry, the center of gravity is on the z-axis. The z-coordinate z of the center of gravity is R R1 z |dx dy dz| z(⇡z) dz 1/3 2 A R z= = R0 1 = = . 1/2 3 |dx dy dz| (⇡z) dz A 0 p 4.27 a. The function 1/ |x a| is L-integrable over [0, 1] for every a 2 [0, 1] (or even outside): Z 1 Z a+1 Z 1 1 1 1 p p p dx = 4. dx  dx = |x a| |x a| |x| 0 a 1 1 Then since

Z 1 1 1 X X 1 1 1 p dx  4  4, k 2 0 2k |x ak | k=1 k=1 P1 Theorem 4.11.17 says that the series of functions k=1

1 2k

verges for almost all x, and that f is L-integrable on [0, 1].

1 |x ak |

p

con-

b. We did this in part a: “on a set of measure 0” is the same as “for almost all x”. Note, however, that when x is rational, the series does not converge, since we have a 0 in the denominator for one of the terms. So “almost all x” does not include any rational numbers. c. Let us take the rationals in [0, 1] in the following order: 1 1 2 1 3 1 2 , a4 = , a5 = , a6 = , a7 = , a8 = , a9 = , . . . , 2 3 3 4 4 5 5 taking first all those with denominator 1, then those with denominator 2, then those with denominator 3, etc. Note that if ak = p/q for p, q integers,

a1 = 0, a2 = 1, a3 =

180

Solutions for review exercises, Chapter 4

then k 2q 3, since there are at least two numbers for any denominator except 2 (accounting pfor the 3). Now take x = 1/ 2. Notice (this is the clever step) that 1 2q 2 p 2 The second inequality in equation 1: since p 1/ 2 < 1 and p/q  1, their sum is  2.

p q

1 p p + = |q 2 2 q

2p2 |

1

since it is a nonzero integer. Hence for any coprime integers p, q with p/q 2 [0, 1], we have x

p 1 = p q 2

p q

1

2q 2 p12

+

1 , 4q 2

p q

(1)

so Pq

Equation 2: the sum p=0 accounts for all the rational numbers with denominator q; for all q except q = 1, there are at most q 1 of them. There are exactly q 1 of them if q is prime. This sum becomes multiplication by q + 1 in the next step. The case where a quantity being summed has no index matching the index of the sum is discussed in a margin note in Section 0.1.

This leads to 1 X 1

p 2k |x k=1

p |x 1 ak |



1 ak |

q 1 X X q=1 p=0

 2q.

1 22q

3

2q =

1 X 2q(q + 1) q=1

22q

3

.

This series is easily seen to be convergent by the ratio test.

4.29 Write a = a1 + ia2 and b = b1 + ib2 , so that T (u) = au + b¯ u can be written T (x + iy) = (a1 + ia2 )(x + iy) + (b1 + ib2 )(x = a1 x

iy)

a2 y + b1 x + b2 y + i(a1 y + a2 x

b1 y + b2 x).

we are identifying C with R2 in the standard way, with x + iy written so equation 3 is equivalent to the matrix multiplication    x a + b1 a2 + b2 x T = 1 . y a2 + b2 a1 b1 y

a

b

(2)

(3) ⇣ ⌘ x , y

Thus

det T = (a1 +b1 )(a1 b1 ) (a2 +b2 )( a2 +b2 ) = a21 b21 +a22 b22 = |a|2 |b|2 . Now we show the result for the norm. Any complex number of absolute value 1 can be written |ei✓ | = 1. This, with the triangle inequality, gives Figure for Solution 4.29 One train represents aei✓ , the other be i✓ . There exists a value of ✓ for which the trains line up, i.e., for which aei✓ and be i✓ are multiples of each other by a positive real number.

kT k = sup aei✓ + be ✓

i✓

= |a + b|  |a| + |b|.

Think of two circles centered at the origin, one with radius |a|, the other with radius |b|: aei✓ travels around its circle in one direction, and be i✓ travels around its circle in the other direction, as illustrated in the figure at left. Clearly there is a value of ✓ where the two are on the same half-line from the origin through the circles; for this value of ✓, we see that aei✓ and be i✓ are multiples of each other by a positive real number, giving kT k = |a + b| = |a| + |b|.

Solutions for review exercises, Chapter 4

181

8 u=x > < v = xyz 4.31 a. Set > : w = y/(xz), so that A becomes the parallelepiped 1  u  b,

1  v  a,

1  w  c.

b. We need to express x, y, z in terms of u, v, w: x=u p y = vw v v 1 = p = xy u u vw

z= Set

r

v . w

1 0 pu 1 u vw C @vA=B @ 1r v A. w u w 0

Then 2

4D and

c. Z

1

b

Z

1

a

Z

1

c

2

det 4D

2 3 ! 6 u 6 5 v =6 6 4 w !3 u v 5= w

1 1 dw dv du = (a 2uw 2

1

0 r

1 u2

v w

1 w 2 v r 1 1 2u vw

r

p w v v w2/3

1 4u

0 r

1 4u

1)

Z

b

1

du u

! ✓Z

0 r

1 v 2 w p 1 v 3/2 2u w

r r v 1 = w vw c

1

dw w

◆

=

1 (a 2

3 7 7 7 7 5 1 . 2uw

1) ln b ln c.

4.33 Let ai be a list of the rationals in [0, 1]: for instance 0, 1, 1/2, 1/3, 2/3, . . . , and let X be the open set in R2 1 ✓ [ X= ai i=1

1 210+i

, ai +

1 210+i

◆

.

The set X is open, hence a countable disjoint union of intervals. and it contains all the rationals, so it is dense. But the sum of the lengths of the intervals is < 1/29 , hence tiny. Let f : X ! R be a continuous function that maps each connected component of X bijectively to ( 1, 1).

182

Solutions for review exercises, Chapter 4

We claim that the graph of f does not have volume. Indeed, for any N the set of cubes of DN (R) in [0, 1] that completely contain one of the intervals of X has total length at least 1 1/29 . To cover the part of the graph above these cubes requires 2 · 2N squares of DN (R2 ). Thus the upper sum of 1 f is at least 2(1 1/29 ), but the lower sum is 0, since every square of DN (R2 ) contains points not on the graph. So the graph does not have volume. It does have measure 0, since it is the countable union of the graphs of the functions defined on each component of X, and each of these does have measure (in fact, volume) 0. 4.35 Set def

Ak =

⇢

x 2 Rn

1 1  |x| < k 2k+1 2

Then |x|p 1B1 (0) = p

1 X

k=0

|x|p 1Ak ,

where the maps |x| 1Ak are Riemann integrable and nonnegative. To know that |x|p is integrable for B1 (0), we need to know that the series of integrals of the |x|p 1Ak converges. The map k : x 7! 2 k x maps A0 to Ak , and det[D k ] = 2 nk . Thus Z Z Z |x|p |dn x| = 2 nk |2 k x|p |dn x| = 2 k(n+p) |x|p |dn x|. Ak

A0

A0

Thus Z

B1 (0)

p

n

|x| |d x| =

1 X

k=0

2

k(n+p)

!Z

A0

|x|p |dn x|.

The series is a geometric series and converges if and only if n + p > 0, i.e., if p > n.

Solutions for Chapter 5

2

3 3 2 3 5.1.1 Set T = [~v1 , ~v2 , ~v3 ]; then T > T = 4 2 6 4 5 and det T > T = 30, 3 4 6 p so the 3-dimensional volume of the parallelogram is 30. 5.1.3 Here are two solutions. First solution Set T = [~v1 , . . . , ~vk ]. Since the vectors ~v1 , . . . , ~vk are linearly dependent, rank T < k. Further, img T > T ⇢ img T > , so rank T > T  rank T >

= |{z}

rank T < k.

Prop. 2.5.11

>

Since T T is a k ⇥ k matrix with rank < k, it is not invertible, hence its determinant is 0, so p volk P (~v1 , . . . , ~vk ) = det T > T = 0. Second solution Since the vectors ~v1 , . . . , ~vk are linearly dependent, there exist numbers a1 , . . . , ak not all 0 such that a1 ~v1 + · · · + ak ~vk = 0, i.e., 2 3 a1 . [~v1 , . . . , ~vk ] 4 .. 5 = 0. ak

Solution 5.1.5: Here is another approach. By equation 5.1.9, the matrix T > T is symmetric, so equation 1 shows that T > T is a symmetric matrix representing a quadratic form taking only nonnegative values, i.e., its signature is (l, 0) for some l  k. So T > T has l positive eigenvalues, and the others are zero, by Theorem 3.7.16. Since the determinant of a triangular matrix is the product of the eigenvalues (Theorem 4.8.9), and a symmetric matrix is diagonalizable (the spectral theorem), and the determinant is basis independent (Theorem 4.8.6), it follows that det T > T 0.

Set T = [~v1 , . . . , ~vk ]. Since T > T a = 0, it follows that ker (T > T ) 6= 0, so T > T is not invertible, and det(T > T ) = 0, so p volk P (~v1 , . . . , ~vk ) = det T > T = 0. 5.1.5 Note first that if ~v is a nonzero vector in Rk , (T > T ~v) · ~v = T > (T ~v)

>

~v = (T ~v)> T ~v = T ~v · T ~v

0.

(1)

Denote by A the k ⇥ k matrix T > T and let I be the k ⇥ k identity matrix. Consider the matrix tA + (1 t)I for t 2 [0, 1], which we can think of as A (when t = 1) being transformed to I (when t = 0). For ~v 6= 0 and 0  t < 1, we have tA + (1

t)I ~v · ~v = t A~ t) ~v · ~v > 0. | v{z· ~v} + (1 | {z } |{z} 0

>0

>0

This implies that, for 0 < t  1, ker tA + (1 t)I = 0 and thus that det tA + (1 t)I is never 0 when 0  t < 1. Since when t = 0, we have det tA + (1 t)I = 1, and when t = 1, det tA + (1 t)I = det A, it 183

184

Solutions for Chapter 5

follows that det A 0. (This uses the continuity of determinant, which is a polynomial function.)

z

Nathaniel Schenker suggests a third solution to Exercise 5.1.5: By the singular value decomposition (Theorem 3.8.1), T = P DQ> , where P and Q are, respectively n ⇥ n and k ⇥ k orthogonal matrices, and D is an n ⇥ k nonnegative rectangular diagonal matrix. Then T > T = QD> P > P DQ> = QD> In DQ> = QD> DQ> ,

a

r '

where the first equality is Theorem 1.2.17 and In is the n ⇥ n identity matrix. Then

y

✓

det(T > T ) = det Q det(D> D) det Q> = det Q det(D> D)(det Q)

x

1

= det(D> D),

where the first equality is Theorem 4.8.4 and the second is Corollary 4.8.5. Since D> D is diagonal with nonnegative entries, Theorem 4.8.9 implies that det(D> D) 0. Note on parametrizations

a r p

'

r cos '

0

0 ✓ r cos '

A number of exercises in Sections 5.2–5.4 involve finding parametrizations. The “small catalog of parametrizations” in Section 5.2 should help. In many cases we adapt one of the standard parametrizations (spherical, polar, cylindrical) to the problem at hand. Note also the advice given in Solution 5.1: To find a parametrization for some region, we ask, “how should we describe where we are at some point in the region?” Thus a first step is often to draw a picture. If you memorized the spherical, cylindrical, and polar coordinates maps without thinking about why those are good descriptions of “where we are at some point”, this is a good time to think about how these standard change of coordinates maps follow from high school trigonometry. Consider, for instance, the spherical coordinates map

x y

p

Top: In spherical coordinates, a point a is specified by its distance from the origin (r), its longitude (✓), and its latitude ('). Middle: This triangle lives in a plane perpendicular to the (x, y)plane; you might think of slicing an orange in half vertically. Bottom: This triangle lives in the (x, y)-plane.

0 1 0 1 r r cos ✓ cos ' S : @ ✓ A 7! @ r sin ✓ cos ' A ' r sin '

where r is the radius of the sphere, ✓ is longitude, and ' is latitude, as shown in the top figure at left (which you already saw as Figure 4.10.4). The middle figure at left shows the triangle whose vertices are a point a on the sphere, the origin, and the projection p of a onto the (x, y)plane. Since cos ' is adjacent side over hypotenuse, the length of the side between 0 and p is r cos '. In the bottom triangle, cos ✓ is adjacent side over hypotenuse r cos ', so the adjacent side (the side lying on the x-axis) is r cos ✓ cos '. This gives the first entry in the spherical coordinates map; it is the “x-coordinate” of the point p, which is the same as the x-coordinate of the point a. Since sin ✓ is opposite side over hypotenuse, the “y-coordinate” of p and of a is r sin ✓ cos '. The z-coordinate of a should be the distance

Solutions for Chapter 5

185

between a and p, as shown in the middle figure; that distance is indeed r sin '. Trigonometry was invented by the ancient Greeks for the purpose of doing these computations. 5.2.1 Condition 3 of Definition 5.2.3 requires that : (U X) ! M be one to one, of class C 1 , with locally✓Lipschitz derivative. To ◆ ✓ ◆ ✓ show ◆ that ✓ it◆is r1 r2 r1 r2 one to one, we must show that if = , then = . ✓1 ✓2 ✓1 ✓2 Set r1 cos ✓1 = r2 cos ✓2 r1 sin ✓1 = r2 sin ✓2 r1 = r2 .

U

The mapping is one to one on X but not on U .

The map is C 1 : we can differentiate it as many times as we like.

Since r1 = r2 by the third equation, cos ✓1 = cos ✓2 and sin ✓1 = sin ✓2 . Since we are restricting ✓ to 0 < ✓ < 2⇡, this implies that ✓1 = ✓2 . So is one to one on U X. Moreover, it is clearly not just C 1 but C 1 , and Proposition 2.8.9 says that the derivative of a C 2 mapping is Lipschitz, so has locally Lipschitz derivative. For condition 4, we must show that [D (u)] is one to one for all u in U X. We have 2 3 cos ✓ r sin ✓ h ⇣ ⌘i D ✓r = 4 sin ✓ r cos ✓ 5 . 1 0 It is one to one if x = 0, y = 0 is the only solution to 2 3 2 3 x cos ✓ yr sin ✓ 0 h ⇣ ⌘i  x r D ✓ = 4 x sin ✓ + yr cos ✓ 5 = 4 0 5 . y x 0

Since x = 0, we have yr cos ✓ = yr sin ✓ = 0; since in U X we have r > 0, we can divide by r to get y cos ✓ = y sin ✓ = 0. Since sin ✓ and cos ✓ do not simultaneously vanish, this implies that y = 0. 5.2.3 The map S is not a parametrization by Definition 3.1.18, but it is a parametrization by Definition 5.2.3. It is not difficult to show that S is equivalent to the spherical coordinates map using latitude and longitude (Definition 4.10.6). Any point S(x) is on a sphere of radius r, since r2 sin2 ' cos2 ✓ + r2 sin2 ' sin2 ✓ + r2 cos2 ' = r2 . Since 0  r  1, the mapping S is onto. The angle ' tells what latitude a point is on. Going from 0 to ⇡, it covers every possible latitude. The polar angle ✓ tells what longitude the point is on; going from 0 to 2⇡, it covers every possible longitude. However, S is not a parametrization by Definition 3.1.18, because it is not one to one. Trouble occurs in several places. If r = 0, then S(x) is the

186

Solutions for Chapter 5

origin, regardless of the values of ✓ and '. For ✓, trouble occurs when ✓ = 0 and ✓ = 2⇡; for fixed r and ', these two values of ✓ give the same point. For ', trouble occurs at 0 and ⇡; if ' = ⇡, the point is the south pole of a sphere of radius r, regardless of ✓, and if ' = 0, the point is the north pole of a sphere of radius r, regardless of ✓. By Definition 5.2.3, S is a parametrization, since the trouble occurs only on a set of 3-dimensional volume 0. In the language of Definition 5.2.3, for r

3

M =R ,

for ✓

for '

z }| { z }| { z }| { U = [0, 1) ⇥ [0, 2⇡] ⇥ [0, ⇡],

X = {0} ⇥ [0, 2⇡] ⇥ [0, ⇡] [ [0, 1) ⇥ {0, 2⇡} ⇥ [0, ⇡] | {z } | {z } trouble when r=0

trouble when ✓=0 or ✓=2⇡

[ [0, 1] ⇥ [0, 2⇡] ⇥ {0, ⇡} . | {z } trouble when '=0 or '=⇡

Trouble occurs on a union of five surfaces, which you can think of as five sides of a box whose sixth side is at infinity. The base of the box lies in the plane where r = 0; the sides of the base have length ⇡ and 2⇡. The base represents the set “trouble when r = 0.” Two parallel sides of the box stretching to infinity represent “trouble when ✓ = 0 or ✓ = 2⇡”. The other set of parallel sides represents “trouble when ' = 0 or ' = ⇡”. By Proposition 4.3.5 and Definition 5.2.1, these surfaces have vol3 = 0. Next let us check (condition 4 of Definition 5.2.3), that the derivative is By the dimension formula, the one to one for all u in U X. This will be true (Theorem 4.8.3) if and only derivative is invertible if it is one if the determinant of the derivative is not 0. The determinant is to one. 2 2 3 !3 r sin ' cos ✓ r sin ' sin ✓ r cos ' cos ✓ det 4DS ✓ 5 = det 4 sin ' sin ✓ r sin ' cos ✓ r cos ' sin ✓ 5 = 2r2 sin ', ' cos ' 0 r sin ' Polynomials are continuous by Corollary 1.5.31. Theorem 1.5.29 discusses combining continuous functions

which is 0 only if r = 0, or ' = 0, or ' = ⇡, which are not in U X. Next, we will show that S : (U X) ! R3 is of class C 1 with locally Lipschitz derivative. We know the first derivatives exist, since we just computed the derivative. They are continuous, since they are all polynomials in r, sine, and cosine, which are continuous. In fact, S : (U X) ! R3 is C 1 , since its derivatives of all order are polynomials in r, sine, and cosine, and are thus continuous. We do not need to check anything about Lipschitz conditions because Proposition 2.8.9 says that the derivative of a C 2 function is Lipschitz. Finally, 0 we 1 need to show that S(X) has 3-dimensional volume 0. If r = 0, r then S @ ✓ A is the origin, whatever the values of ✓ and '. If ✓ = 0 we have 0 1 '0 1 r r sin ' S @ 0 A = @ 0 A, which is a surface in the (x, z)-plane. If ✓ = 2⇡, we ' r cos '

Solutions for Chapter 5

187

0 1 0 get the same surface. If ' = 0, we get @ 0 A, the z-axis going from 0 to r 0 1 0 1; if ' = ⇡ @ 0 A, we get the z-axis going from 0 to 1. r 5.2.5 The set

U=

1 ✓ [ ai

1

, ai 2i+k

i=1

Solution 5.2.5: The point of showing that U can be parametrized is that we will be able to compute integrals over U using the parametrization. We can’t integrate a function f over U , since the upper and lower sums UN (f 1U ) and LN (f 1U ) do not have a common limit. But we can integrate f over J, since the boundary of J is the increasing sequence i 7! ai of numbers (including its limit) defined by ai =

i X

j=1

lj , i = 0, 1, . . . , 1.

Thus the boundary of J has 1dimensional volume 0.

1 2i+k

◆

is an open subset of R, hence indeed a 1-dimensional manifold. The pathological property of U is that its boundary has positive length, so that its indicator function is not integrable.5 But U is simply a countable union of disjoint open intervals. (Every open subset of R is a countable union of open intervals.) List these intervals I1 , I2 , . . . in order of length (which is not the order in which they appear in R). Let the length of Ii be li , and let 0 1 i 1 i X X Ji = @ li , li A , j=1

j=1

so that Ji is another interval of length li . Now let J = [1 i=1 Ji , and define : J ! U to be the map that translates Ji onto Ii . This is a parametrization according to Definition 5.2.3. Indeed, let X be empty, and check the conditions: 1.

(J) = U , by definition;

2.

(J

3.

X) = U , since X is empty;

: J X ! U is one to one, of class C 1 , with Lipschitz derivative. These are local conditions, and is locally a translation.

4.

0

5.

(X) has 1-dimensional volume 0, since X is empty.

(x) 6= 0 for all x 2 J, since

0

(x) = 1 everywhere.

5.2.7 a. Let ⇡ : Rn ! Rk denote the projection of Rn using the corresponding k coordinate functions. Choose ✏ > 0, and N so large that X

C2DN (Rn ) C\X6= /

⇣

1 2N |{z}

sidelength of C

⌘k

< ✏.

5 The boundary of U is the complement of U in [0, 1]: by Definition 1.5.10, a point x is in the boundary if each neighborhood of x intersects both U and its complement. Since the length of [0, 1] is 1, and the sum of the lengths of the intervals making up U is ✏, the boundary [0, 1] U has positive length.

188

Solutions for Chapter 5

Then ✏>

X

n

C2DN (R ) C\X6= /

✓

1 2N

◆k

X

k

C1 2DN (R ) C1 \⇡(X)6= /

✓

1 2N

◆k

,

since for every k-dimensional cube C1 2 DN (Rk ) such that C1 \ ⇡(X) 6= / , there is at least one n-dimensional cube C 2 DN (Rn ) with C ⇢ ⇡

1

(C1 ) such that C \ X 6= / .

Thus volk (⇡(X)) < ✏ for any ✏ > 0. b. If you tried to find an unbounded subset of R2 of length 0 whose projection onto the x-axis has positive length, you will have had trouble. What can be found is such a subset whose projection does not have length. For example, the subset of R2 consisting of points with x-coordinate p/q and y-coordinate q, with p and q integers satisfying 0  pq  1, is unbounded, since q can be arbitrarily large; it has length 0, since it consists of isolated points. Its projection onto the x-axis is the set of rational numbers in [0, 1], which (as we saw in Example 4.4.3) does not have defined volume.

l=

Z

b

a

5.3.1 a. Going back to Cartesian coordinates, set x(t) = r(t) cos ✓(t) and y(t) = r(t) sin ✓(t): ✓ ◆ ✓ ◆ x(t) r(t) cos ✓(t) (t) = = . y(t) r(t) sin ✓(t) Rb So the formula l = a |~ 0 (t)| dt (equation 5.3.4) gives

q

(r cos ✓)0 (t) + (r sin ✓)0 (t)

q

r0 (t)

2

Z b r⇣ = r0 (t) cos(✓(t)) a

=

Z

a

b

2

+ r(t)

2

dt

⌘2 ⇣ ⌘2 r(t) sin(✓(t))✓0 (t) + r0 (t) sin(✓(t))+ r cos(✓(t))✓0 (t) dt 2

✓0 (t)

2

dt.

When computing the first entry of 0 (t), remember that the derivative of cos ✓(t) is the derivative at t of the composition cos ✓, computed using the chain rule: (cos ✓)0 (t) = cos0 (✓(t))✓0 (t) = sin(✓(t))✓0 (t). Similarly, for the second entry, the derivative of sin ✓(t) is the derivative at t of the composition sin ✓, giving (sin ✓)0 (t) = sin0 (✓(t))✓0 (t) = cos(✓(t))✓0 (t). ✓ ↵t ◆ e cos t b. The spiral is the curve parametrized by (t) = , whose e ↵t sin t  e ↵t sin t ↵e ↵t cos t derivative is ~ 0 (t) = , so the length between e ↵t cos t ↵e ↵t sin t

Solutions for Chapter 5

Solution 5.3.1, part b: How do we know this curve is a spiral? Each point of the curve is known by polar coordinates; at time t the distance from the point to the origin is e ↵t and the polar angle is t. The beginning ✓ ◆ point, at t = 0, is the point 1 0 in the (x, y)plane – the point with coordinates x = e0 cos 0,

y = e0 sin 0.

As t increases, the polar angle increases and the distance from the origin decreases; as t ! 1, we have e ↵t ! 0.

189

t = 0 and t = b is Z b Z 0 |~ (t)| dt = 0

b

0

=

p ↵2 e

2↵t

+e

2↵t

dt =

Z

b

0

p (e ↵2 + 1

↵b

↵

1)

p 1 + ↵2 e

↵!0

↵b

↵

1)

dt

.

The limit of this length as ↵ ! 0 is b: expressing e nomial (equation 3.4.5) gives p (e lim ↵2 + 1

↵t

↵b

as its Taylor poly-

2 2 p (1 b↵ + b 2↵ ···) 1 2 = lim ↵ + 1 ↵!0 ↵ ⇣ ⌘ p 2 = lim ↵ + 1 b + ↵(higher degree terms)

↵!0

= b.

At ↵ = 0, the distance from the origin is 1, so the curve is a circle of radius 1 rather than a narrowing spiral. c. For every 2⇡ increment in t the spiral turns around the origin once: ✓(t) = ✓(t + 2⇡). So the spiral turns infinitely many times about the origin as t ! 1. The length does not tend to infinity, since p p ↵2 + 1 ↵2 + 1 ↵b lim l = lim (1 e )= . b!1 b!1 ↵ ↵ 5.3.3 a. Since

Solution 5.3.3: In the exercise, the order in which we listed the variables ✓ and ' in the domain is reversed, compared to Definition 4.10.6. You should think that here we have 0 1 0 1 r r cos ✓ cos ' S : @ ' A 7! @ r sin ✓ cos ' A . ✓ r sin '

x(t) = r(t) cos '(t) cos ✓(t), y(t) = r(t) cos '(t) sin ✓(t), z(t) = r(t) sin '(t), we have l=

r sin ' cos ✓ '0

2

r0 cos ' sin ✓ ✓0 )

2

+ (r0 cos ' sin ✓ + (r0 sin '

=

Solution 5.3.3, part b: Even though ✓(t) = tan t tends to infinity, the curve does not become long.

Z ⇣ (r0 cos ' cos ✓

r sin ' sin ✓ '0 + r0 cos ' cos ✓ ✓0 ) ⌘ 2 1/2 r cos ' '0 ) dt

Z p (r0 )2 + r2 ('0 )2 + r2 cos2 ' (✓0 )2 dt.

b. We get the integral Z as ( sin t)2 + (cos t)2 + (cos t)2 (cos t)2 0

5.3.5 a. The map

1 dt = (cos t)4

0 1 2r cos ✓ ⇣ ⌘ : ✓r 7! @ 3r sin ✓ A r2

Z

0

a

p p 2 dt = a 2.

190

Solutions for Chapter 5 2

2

parametrizes the locus given by the equation z = x4 + y9 (which is an elliptic paraboloid)6 ; it maps the part of the (r, ✓)-plane where 0  ✓  2⇡, r  a to the region of the paraboloid where z  a2 . Thus the surface area is given by the integral v 2 3 u  ✓h ⇣ ⌘i ◆ Z 2⇡Z as Z Z u 2 cos ✓ 2r sin ✓ 2⇡ a > u 2 cos ✓ 3 sin ✓ 2r 4 tdet det D ✓r [D ( ✓r )] dr d✓ = 3 sin ✓ 3r cos ✓ 5dr d✓ 2r sin ✓ 3r cos ✓ 0 0 0 0 0 2r 0 Z 2⇡ Z a q = 2r 9 + r2 (4 sin2 ✓ + 9 cos2 ✓) dr d✓. 0

0

This integral cannot be evaluated in terms of elementary functions, but it can be evaluated in terms of elliptic functions, using Maple (at least, version 7 or higher): r r Z 2⇡ Z a q 2 2 1 + a a2 2 2 2r 9 + r2 (4 sin ✓ + 9 cos2 ✓) dr d✓ = p 36 a ellipticK 5 2 9 + 4a 9 + 4a2 3 1 + a2 0 0 ! r r r r 1 + a2 5 a2 1 + a2 4 a2 + 81 elliptic⇡ , 5 + 16 a ellipticE 5 2 2 2 9 + 4a 4 9 + 4a 9 + 4a 9 + 4a2 ! r r p 1 + a2 2 a2 2 + 36 a ellipticE 5 9⇡ 1 + a . 9 + 4a2 9 + 4a2

The elliptic functions “ellipticK,” “ellipticE,” and “elliptic⇡” are tabulated functions; tables with these functions can be found, for example, in Handbook of Mathematical Functions, edited by Milton Abramowitz and Irene Stegun (Dover Publications, Inc.). The graph of the area as a function of a is shown in the margin. b. It is easier to begin by finding the volume of the region outside the elliptic paraboloid, and then subtract it from the volume of the elliptic cylinder. Call Ua the ellipse

Figure for Solution 5.3.5 The area of the surface as a function of a.

x2 y2 +  a2 . 4 9 Then the volume of the region above this ellipse in the plane z = 0 and beneath the paraboloid is ◆  Z ✓ 2 Z 2⇡Z a x y2 2 cos ✓ 2r sin ✓ + |dx dy| = det r2 dr d✓ = 3a4 ⇡. 3 sin ✓ 3r cos ✓ 4 9 Ua 0 0 The elliptic cylinder has total volume ⇡(6a2 )a2 = 6⇡a4 , and the di↵erence of the two is 3⇡a4 . 6

How did we find this parametrization? The surface cut horizontally at height ✓ ◆ a cos t z is an ellipse, so we were guided by the parametrization t 7! for the b sin t 2 2 ellipse xa2 + yb2 = 1; see Example 4.10.19. We got the r2 by noting that then ✓ ◆ x2 y2 ra cos t t 7! parametrizes the ellipse 2 + 2 = r2 . rb sin t a b

Solutions for Chapter 5

191

Here is another solution to part b, provided by Nathaniel Schenker and using Definition 5.3.2. In it the elliptic paraboloid is sliced 0 1horizontally: x 2 2 x y For z fixed, the area of the ellipse +  z is f @ y A = 6⇡z. Let M 4 9 z 0 1 0 denote the z-axis between z = 0 and z = a2 , parametrized as : t 7! @ 0 A, t 0  t  a2 . Then the volume of the region in question is 0 1 0 1 Z Z q x x 1@ A @ A f y d y = f ( (t)) det([D (t)]> [D (t)] |dt| M [0,a2 ] z z Z a2 p = 6⇡t 1 dt = 3⇡a4 0

Solution 5.3.7: This parametrization is inspired by the spherical change of coordinates 0 1 ✓ ◆ cos ✓ cos ' ✓ 7! @ sin ✓ cos ' A . ' sin '

5.3.7 a. The obvious way to parametrize the surface of the ellipsoid is 0 1 ✓ ◆ a cos ' cos ✓ ⇡ ⇡ ' = @ b cos ' sin ✓ A , < ' < , 0  ✓ < 2⇡. ✓ 2 2 c sin ' Z

for the unit sphere. Scaling a sphere in three directions with three scaling factors a, b, c gives an ellipsoid.

b. Using equation 5.3.33, we find that the area is ✓ ◆ 2⇡ Z ⇡/2 ✓ ! ⇣ ⌘◆ ! ⇣ ⌘ ' ' D1 ⇥ D2 d' d✓ ✓ ✓

0

=

Z

0

=

⇡/2

2⇡ Z

Z 2⇡Z 0

2

⇡/2 ⇡/2

⇡/2 q

3 2 3 a sin ' cos ✓ a cos ' sin ✓ 4 b sin ' sin ✓ 5 ⇥ 4 b cos ' cos ✓ 5 d' d✓ c cos ' 0

b2 c2 cos4 ' cos2 ✓ + a2 c2 cos4 ' sin2 ✓ + a2 b2 cos2 ' sin2 ' d' d✓.

⇡/2

5.3.9 a. We parametrize S by

0 ⇣ ⌘ x =@ y

standard approach and set up the integral as Z

1 x A. We could use our y 3 4 x +y

z

z }| { q (x + y + x3 + y 4 ) det[D (x)]> [D (x)]|dx dy| S v 0 2 u  Z 1 Z p1 y2 u 1 2 u 1 0 3x 3 4 = p 2 (x + y + x + y )tdet @ 0 1 4y 3 4 0 1 1 y 3x2

31 0 1 5A dx dy , 4y 3

but since we are in R3 it is probably easier computationally to use the length of the cross product to give the area of the parallelogram spanned !

!

by the partial derivatives D1 and D2 (see Proposition 1.4.20 or equation

192

Solutions for Chapter 5

5.3.33). Thus we have Z Z p 2 1

p

1

= Solution 5.3.9: In part b, we use polar coordinates, but the result will not be computable in elementary terms in any case.

Z

2 3 2 3 z 1 0 z }| { (x + y + x3 + y 4 ) 4 0 5 ⇥ 4 1 5 dx dy y2 3x2 4y 3

1 y 1

1 1

Z p1 p

y2

1 y2

p (x + y + x3 + y 4 ) (9x4 + 16y 6 + 1 dx dy.

b. It is easier to pass to polar coordinates, so that the integral will be over a rectangle. We find Z 2⇡ Z 1 p (r cos ✓ + r sin ✓ + r3 cos3 ✓ + r4 sin4 ✓) 9r4 cos4 ✓ + 16r6 sin6 ✓ + 1 r dr d✓. 0

0

We have no idea how to compute this integral symbolically. To evaluate it numerically, we need to choose a numerical method. The following sequence of Matlab instructions performs a midpoint Riemann sum, breaking the domain into 100 intervals on each side, and evaluating the function at the center of each rectangle. First, define the function:

function prob545=integrand (x,y) prob545 = x2 ⇤(cos(y)+sin(y)+x2 ⇤ (cos(y))3 +x3 ⇤ (sin(y))4 )⇤ sqrt(9⇤(x⇤cos(y))4 +16⇤(x⇤sin(y))6 +1); Then create ✓ a matrix A whose ◆ values are the values of this function (2i 1)/200 at the points , for i, j = 1, . . . , 200, using the following 2⇡(2j 1)/200 commands: Matlab has a command “dblquad” that should compute double integrals, but we could not get it to work, even on Matlab’s own example.

N = 100 for I = 1 : N, for J = 1 : N, A(I, J) = integrand((2⇤I-1)/(2⇤N),2⇤pi⇤(2⇤J-1)/(2⇤N)); end, end; Now sum the elements of the matrix, and multiply by 2⇡/(1002 ), the area of the small rectangles: sum(sum(A(1:N,1:N)))⇤2⇤pi/(N2 ) The answer the computer comes up with is 0.96117719450855. If you want a more precise answer, you might try the 2-dimensional Simpson’s method. The following Matlab program will do this. It yields the estimate 0.96143451150993. If you repeat the procedure with N = 201, you will find 0.96143438060954. It seems likely that the digits 0.9614343 are accurate; this is in agreement with the estimate (from Theorem 4.6.2) that says the error should be a constant times 1/N 4 , where the constant is computed from the partials of f up to order 4.

Solutions for Chapter 5

193

N=101 S(1)=1; S(N)=1; for i=1:(N-1)/2, S(2⇤i)=4; end for i=1:(N-3)/2, S(2*i+1)=2; end for i=1:(N-3)/2, S(2*i+1)=2; end for i=1:N, for j=1:N, SS(i,j) = S(i)*S(j); end end for I = 1:N, for J = 1:N, B(I,J) = integrand((I-1)/(N-1),2*pi*(J-1)/(N-1)); end, end; (sum(sum(SS(1:N,1:N).⇤B(1:N,1:N))))⇤2⇤pi/(9⇤(N-1)2 )

Solution 5.3.11, part a: More generally, the same sort of computation shows that when a surface is scaled by a, its Gaussian curvature is scaled by 1/a2 . The mean curvature is scaled by 1/a.

By Definition 3.9.8, the Gaussian curvature K of a surface at a point a is K(a) = A2,0 A0,2

A21,1 .

5.3.11 a. The Gaussian curvature K of the sphere S of radius R is 1/R2 . This computation is very 0 similar 1 to that in Solution 3.9.3. Any point can be 0 rotated to the point a = @ 0 A, so it is enough to compute the curvatures R there. Near a the “best coordinates” are X = x, Y = y, Z = z R, so that zp Z = R2

z

}| X2

Y

{ 2

R=R

r

1

X2 R2

Y2 R2

!

1 .

As in Solution 3.9.3, we use equation 3.4.9 to compute ✓ ✓ ◆ ◆ 1 X2 Y2 Z ⇡R 1+ + ··· 1 . 2 R2 R2 ✓ ◆ 1 X2 Y2 The quadratic terms of the Taylor polynomial are + , so 2 R R A1,1 = 0,

A2,0 = A0,2 = 1/R,

giving K = 1/R2 .

The area of the sphere of radius R is 4⇡R2 , so its total curvature K(S) is 4⇡, independent of R: Z 1 K(S) = |K(x)| |d2 x| = 2 4⇡R2 = 4⇡. R S b. The hyperboloid of equation z = x2 y 2 is parametrized by x and y. We computed the Gaussian curvature of this surface in Example 3.9.12, and found it to be

194

Solutions for Chapter 5

0

1

The parametrization in Solux 4 tion 5.3.11, part b is an example A= K@ y . 2 of parametrizing as a graph: (1 + 4x + 4y 2 )2 2 2 x y 0 1 ✓ ◆ x x =@ Thus our total curvature is A. y y 2 3 2 3 x2 y 2 Z Z 1 0 R2

| 4| 4 0 5 ⇥ 4 1 5 |dx dy| = 4 (1 + 4x2 + 4y 2 )2 2x 2y

R2

|dx dy| . (1 + 4x2 + 4y 2 )3/2

This integral exists (as a Lebesgue integral), and can be computed in polar coordinates (this uses Theorems 4.11.21 and 4.11.20), to yield ◆ Z 2⇡ ✓Z 1 Z 1 r 8r 4 dr d✓ = ⇡ dr = 2⇡. 2 3/2 (1 + 4r ) (1 + 4r2 )3/2 0 0 0 Solution 5.3.13, part a: This result was first due to Archimedes, who was so proud of it that he arranged for a sphere inscribed in a cylinder to adorn his tomb, in Sicily. One hundred thirtyseven years later, in 75 BC, Cicero (then the Roman questor of Sicily) found the tomb, “completely surrounded and hidden by bushes of brambles and thorns. I remembered having heard of some simple lines of verse which had been inscribed on his tomb, referring to a sphere and cylinder . . . . And so I took a good look round . . . . Finally I noted a little column just visible above the scrub: it was surmounted by a sphere and a cylinder.”

5.3.13 a. Using cylindrical coordinates, the mapping from the cylinder to the sphere can be written 0p 1 1 z 2 cos ✓ ⇣ ⌘ p f z✓ = @ 1 z 2 sin ✓ A ; z the cross product of the two partials is the vector p 2 p 3 2 3 2p 2 z cos ✓/p 1 z 2 p 1 z sin ✓ p1 4 1 z 2 cos ✓ 5 ⇥ 4 z sin ✓/ 1 z 2 5 = 4 1 0 1

3 z 2 cos ✓ z 2 sin ✓ 5 . z

This vector has length 1, so by equation 5.3.33 and Definition 5.3.1, for any subset C ⇢ S1 with volume, Z q vol2 f (C) = det[D (u)]> [D (u)] d2 u C Z Z 2 = 1d u = 1C d2 u = vol2 C. R2

C

b. The area of the polar caps and of the tropics is exactly the same as the area of the corresponding parts of the cylinder: Area of polar caps = 2(40000)2 (1

cos 23 270 )/(2⇡) ⇡ 42.06 ⇥ 106 km2

Area of tropics = 2(40000)2 (sin 23 270 )/(2⇡) ⇡ 203 ⇥ 106 km2 . Thus the tropics have roughly five times the area of the polar caps. c. As explained by the figure in the margin, the area in question is ⇣ A(r) = 2⇡R 1 2

= ⇡r2

r⌘ cos = 2⇡R2 R

⇡ 4 r + ··· . 12R2

✓

r2 2R2

r4 + ··· 24R4

◆

Solutions for Chapter 5

.r

0 1 a BbC 5.3.15 a. For the point @ A 2 R4 to be on the unit sphere, we must c d have a2 + b2 + c2 + d2 = 1, so a first step in showing that parametrizes S 3 is to show that the sum of the squares of the coordinates is 1:

h

✓

re

%

195

R cos2 cos2 ' cos2 ✓ + cos2 = cos2

Figure for Solution 5.3.13. Part c: A sphere of radius R inscribed in a cylinder. The angle ✓ at the center of the sphere corresponding to an arc of great circle of length r on a sphere of radius R is r/R. Thus the part of the cylinder that projects to it has length 2⇡R and height ⇣ r⌘ h = R 1 cos , R hence area ⇣ ⇣ r ⌘⌘ 2⇡R R 1 cos . R By part a, this equals the area of the polar cap of radius r. Equations 1: sin2 ' and (sin ')2 mean the same thing, as do cos2 ' and (cos ')2 ; the second is better notation but the first is more traditional.

2

4D 2

4D Computing this product is a little easier if you remember that A> A is always symmetric (see Exercise 1.2.16 and equation 5.1.9).

cos2 ' + cos2

cos2 ' sin2 ✓ + cos2 sin2 ' + sin2

sin2 ' + sin2

= cos2

+ sin2

= 1.

Next we must check the domains. Except possibly on sets of 3-dimensional volume 0 (individual points, for example), is one to one and onto? Think of the fourth coordinate, sin , as being the “height” of S 3 , analogous to the z-coordinate for the unit sphere in R3 . Clearly sin must run through [ 1, 1], and does run through these values exactly once as varies from ⇡/2 to ⇡/2. For any fixed value of 2 ( ⇡/2, ⇡/2), the other coordinates satisfy a2 + b2 + c2 + (sin )2 = 1, i.e., a2 + b2 + c2 = (cos )2 , (1) 0 1 a so @ b A runs through the 2-sphere in R3 of radius cos . For fixed sin , c the first three coordinates 0 are: 1 cos cos ✓ cos ' @ cos sin ✓ cos ' A , cos sin '

which is the parametrization of the sphere in R3 with radius cos by spherical coordinates (Definition 4.10.6). Thus is a parametrization of the 3-sphere in R4 . b. To compute vol3 (S 3 ) we use equation 5.3.3. First we compute 2 2 !3 !3> 2 !3 ✓ ✓ ✓ 4D ' 5 and 4D ' 5 4D ' 5: 2 !3 ✓ 6 ' 5=4

!3> 2 ✓ ' 5 =4 giving 2

4D

cos cos ' sin ✓ cos cos ' cos ✓ 0 0

cos cos ' sin ✓ cos sin ' cos ✓ sin cos ' cos ✓ !3> 2 ✓ ' 5 4D

cos sin ' cos ✓ cos sin ' sin ✓ cos cos ' 0

cos cos ' cos ✓ cos sin ' sin ✓ sin cos ' sin ✓

!3 2 2 ✓ cos cos2 ' 5 4 ' = 0 0

3 sin cos ' cos ✓ sin cos ' sin ✓ 7 5. sin sin ' cos

0 0 cos cos ' 0 sin sin ' cos

0 cos2 0

3 0 05. 1

3

5,

196

Solutions for Chapter 5

The determinant of this matrix is cos4 cos2 ', so applying equation 5.3.3 to our problem gives Z ⇡/2 Z ⇡/2 Z 2⇡ 3 vol3 S = cos2 cos ' d✓ d' d 2

To integrate cos we used the fact that cos2 has average 1/2 over one full period, which for cos2 is ⇡. You may notice that this result is the same as the value for vol3 S 3 given in Table 5.3.3.

In equation 1, the element of length |d1 x| is being evaluated on 1 the vector . We could also 2x  1 write this as . 2x

= 2⇡ = 2⇡

⇡/2

⇡/2

Z

Z

Z

2

⇡/2

⇡/2 ⇡/2 ⇡/2

0

⇡/2

cos2

cos ' d' d

⇡/2

⇡/2 [sin '] ⇡/2

2

cos

d = 4⇡

Z

⇡/2

cos2

⇡/2

d = 4⇡

⇡ 2

= 2⇡ . 5.3.17 It’s pretty easy to write down the appropriate integrals, but one of them is easy to evaluate in elementary terms, one is hard, and one is very hard. Denote by x and y the x and y components of the center of gravity. Then (by Definition 4.2.1) R R x |dx| y |dx| C x= R and y = RC . |dx| |dx| C C ✓ ◆ x The curve is parametrized by x 7! , so the integrals become x2  Z a Z a p 1 1 x|d x| dx = x 1 + 4x2 dx 2x 0 0  Z a Z a p 1 y|d1 x| dx = x2 1 + 4x2 dx (1) 2x 0 0  Z a Z ap 1 |d1 x| dx = 1 + 4x2 dx. 2x 0 0 The first of these is easy, setting 4x2 = u, so 8x dx = du. This leads to Z a p ⌘ 1 ⇣ x 1 + 4x2 dx = (1 + 4a2 )3/2 1 . 12 0

The third is a good bit harder. Substitute 2x = tan ✓, so that 2 dx = d✓/ cos2 ✓. Our integral becomes Z p Z Z 1 1 1 cos ✓ 2 1 + 4x dx = d✓ = d✓. 2 cos3 ✓ 2 (1 sin2 ✓)2 Now substitute sin ✓ = t, so that cos ✓ d✓ = dt, and the integral becomes ✓Z ◆ Z Z Z Z 1 dt 1 dt dt dt dt = + + + , 2 (1 t2 )2 8 1 t 1+t (1 t)2 (1 + t)2 using partial fractions. This is easy to integrate, giving 1 t 1 1+t + ln . 4 1 t2 8 1 t

Solutions for Chapter 5

197

Substituting back leads to ✓ p ◆ Z ap p 1 1 1 + 4x2 dx = a 1 + 4a2 + ln 2a + 1 + 4a2 . 2 2 0

(2)

The second integral is trickier yet. Begin by integrating by parts: Z Z p p p x x 1 x2 1 + 4x2 dx = · 8x 1 + 4x2 dx = (1 + 4x2 )3/2 (1 + 4x2 ) 1 + 4x2 dx 8 12 12 | {z } Z

A

x = (1 + 4x2 )3/2 12

1 12

Z p 1 + 4x2 dx | {z }

4 12

see equation 2

Z |

x2

p 1 + 4x2 dx . {z } A

Now insert the value computed in equation 2, and add the term on the far right to both sides of the equation, to get ✓ p ◆ Z p p 4 x 1 1 x2 1 + 4x2 dx = (1 + 4x2 )3/2 x 1 + 4x2 + ln 2x + 1 + 4x2 . 3 12 24 2 Z

0

a

Finally, we find p a x2 1 + 4x2 dx = (1 + 4a2 )3/2 16 This gives

We do not consider an ability to compute such complicated integrals an important mathematical skill. In general, it is far better to use Maple or similar software to carry out such computations.

x=

y=

1 32

✓ p ◆ p 1 2 2 a 1 + 4a + ln 2a + 1 + 4a . 2

(1 + 4a2 )3/2 1 1 p p 6 a 1 + 4a2 + 12 ln |2a + 1 + 4a2 | a 8 (1

⇣ p ⌘ p 1 + 4a2 )3/2 16 a 1 + 4a2 + 12 ln |2a + 1 + 4a2 | p p . a 1 + 4a2 + 12 ln |2a + 1 + 4a2 |

5.3.19 To go from the third to the last line of equation 5.3.49, first we do an integration by parts, with p (1 + r2 )3/2 du = 1 + r2 r dr and v = r, so that u = and dv = dr. 3 This gives Z Rp iR 4⇡ Z R 4⇡ h 4⇡ 1 + r2 r2 dr = (1 + r2 )3/2 r (1 + r2 )3/2 dr 3 3 0 0 0 ! Z Rp p 4⇡ 4⇡ 2 3/2 2 2 2 = R(1 + R ) 1 + r + 1 + r r dr . 3 3 0 Multiplying each side by 3/(4⇡) gives Z Rp Z 2 2 3/2 2 3 1 + r r dr = R(1 + R ) 0

0

R

p 1 + r2 dr

Z

0

R

p 1 + r2 r2 dr,

and adding the last term on the right to both sides gives Z Rp Z Rp 2 2 3/2 2 4 1 + r r dr = R(1 + R ) 1 + r2 dr. 0

0

(1)

198

Solutions for Chapter 5

Now we fiddle with the last term on the right side. To go from the d✓ first to the second line below, we set tan ✓ = r, so that dr = cos ✓ , and q q 1 1 sin ✓ = tan ✓ 1+tan2 ✓ = r 1+r2 : Z

0

R

p 1 + r2 dr = =

R

1 + r2 p dr = 1 + r2

z

1st

Z

from 1st term in braces, using trig. as described above

R

}|

{z

2nd

}| { 1 r2 p p dr dr 1 + r2 1 + r2 0 0 0 Z arctan R h p iR Z R p cos ✓ 2 d✓ r 1+r + 1 + r2 dr cos2 ✓ 0 0 0 | {z } | {z } Z

Z

R

from 2nd term in braces, by integration by parts

So, subtracting from both sides the last term on the second line, we get Z Rp Z arctan R p cos ✓ 2 1 + r2 dr = d✓ R 1 + R2 2 cos ✓ 0 0 Z arctan R p 1 2 cos ✓ = d✓ R 1 + R2 2 0 1 sin2 ✓ ✓ ◆ Z p 1 arctan R cos ✓ cos ✓ = + d✓ R 1 + R2 2 0 1 + sin ✓ 1 sin ✓  ✓ ◆ arctan R p 1 1 + sin ✓ = ln R 1 + R2 2 1 sin ✓ 0 2 3R p r + 1 + r2 p 7 p 16 6 1 + r2 7 = R 1 + R2 6ln p 7 24 1 + r2 r 5 p 1 + r2 0 p 2 p 2 1 (R + 1 + R ) = ln R 1 + R2 2 1 + R2 R2 p p = ln(R + 1 + R2 ) R 1 + R2 . Thus

Z

0

R

p 1 + r2 dr =

⌘ p 1 ⇣ ln R + 1 + R2 2

p R 1 + R2 . 2

(2)

Now we multiply equation 1 by ⇡, replacing the second term on the right side by its equivalent value in equation 2, to get ! Z Rp ⌘ Rp1 + R2 p 1 ⇣ 2 2 3/2 2 2 4⇡ 1 + r r dr = ⇡ R(1 + R ) ln R + 1 + R . 2 2 0 5.3.21 We will use equation 5.3.3 for computing the k-dimensional volume of a k-dimensional manifold in Rn , in this case a 2-dimensional manifold in C3 , which we can think of as being R6 .

Solutions for Chapter 5

199

Let us write z = s(cos ✓ + i sin ✓), so that s = |z|. (This uses equations

0.7.9 and 0.7.10.) Then using de Moivre’s formula s(cos ✓ + i sin ✓)

p

= sp (cos p✓ + i sin p✓),

we have 0

1 2 p sp cos p✓ ps p s sin p✓ B C 6 psp C ⇣ ⌘ B q h ⇣ ⌘i 6 q s cos q✓ B C 6 qs s =6 q C and D ✓s q ✓ =B s sin q✓ B C 6 qs @ r A 4 r s cos r✓ rs sr sin r✓ rsr

1

cos p✓ 1 sin p✓ 1 cos q✓ 1 sin q✓ 1 cos r✓ 1 sin r✓

3 psp sin p✓ psp cos p✓ 7 7 qsq sin q✓ 7 7. qsq cos q✓ 7 5 rsr sin r✓ rsr cos r✓

h ⇣ ⌘i> h ⇣ ⌘i Computing D ✓s D ✓s and taking the determinant gives h

⇣ ⌘i> h ⇣ ⌘i s s D ✓ ✓ z }| { p2 s2p 2 + q 2 s2q 2 + r2 s2r 2 0 det 0 p2 s2p + q 2 s2q + r2 s2r ⇣ ⌘2 = p2 s2p 1 + q 2 s2q 1 + r2 s2r 1 , D

so taking the square root of the determinant is easy. (This is the kind of thing that happens with complex numbers.) Thus if we call our surface S, equation 5.3.3 gives volk S = =

|z|1

r

0

0

Z Z

2⇡

Z



1

h ⇣ ⌘i> h ⇣ ⌘i det D ✓s D ✓s ds d✓ p2 s2p

1

+ q 2 s2q

1

+ r2 s2r

 2 2r 1 q 2 s2q r s = 2⇡ + + 2q 2r 0 0 ⇣p q ⌘ r = 2⇡ + + = ⇡(p + q + r). 2 2 2 p2 s2p 2p

1



1

ds d✓ ! 1 0

5.4.1 a. The image of the hyperboloid under the Gauss map is the part p 1 of the sphere where | cos '| p . Set a = arccos 1/ 1 + a2 . Using 2 1+a spherical coordinates, we find the area of this region to be Z

0

2⇡

Z

a

cos ' d' d✓ = 2⇡(2 sin a

a)

a = 4⇡ p . 1 + a2

200

Solutions for Chapter 5

Figure for Solution 5.4.1, part a: At right, we see the branch of the hyperbola of equation x2 + a2 z 2 = 1

z

a x=

where x > 0, together with representative normals, and the same normals anchored at the origin. These normals fill the arc a a ⇡ p AB = det (B > A> A)B = det B(B > A> A) = det(BB > ) det(A> A).

and set M = det A> A, because B is not square. Use the SVD (Theorem 3.8.1) to write B = P DQ> , with P , Q orthogonal and D rectangular diagonal. Then (using the fact that the determinant of an orthogonal matrix is ±1; see Exercise 4.8.23) det(AB)> AB = det QD> P A> AP DQ> = det D> (AP )> (AP )D.

e for the diagonal k ⇥ k submatrix of D, and AP g for the first k Write D columns of AP . Then we have so

e > (AP g )> (AP g )D, e D> (AP )> (AP )D = D

⇣ ⌘ ⇣ ⌘ ⇣ ⌘ e > (AP g )> (AP g )D e = (det D) e 2 det (AP g )> (AP g) . det (AB)> AB = det D

e is the product of the square By Theorem 3.8.1 and Corollary 4.8.25, det D > e 2 = det B > B. It remains roots of the eigenvalues of B B, hence (det D) > g g to show that det(AP ) (AP ) is bounded by a constant that depends only on A; but P is an orthogonal matrix and the orthogonal group is compact (Exercise 3.2.11), so def

M =

g )> (AP g) | sup | det (AP

P 2O(n)

is a number depending only on A.

6.6.9 The proof of Stokes theorem for manifolds with corners will essentially contain this exercise; most of the solution is in Proposition 6.10.7, which we repeat here.

Proposition 6.10.7: The idea is to use the functions defining M and X within M as coordinate functions of a new space; this has the result of straightening out M and @X, since these become sets where appropriate coordinate functions vanish.

Proposition in 6.10.7. Let M ⇢ Rn be a k-dimensional manifold, and let Y ⇢ M be a piece-with-corners. Then every x 2 Y is the center of a ball U in Rn such that there exists a di↵eomorphism F : U ! Rn satisfying F(U \ M ) = F(U ) \ Rk k

F(U \ Y ) = F(U ) \ R \ Z,

where Z is a region x1

0, . . . , xj

6.10.30 6.10.31

0 for some j  k.

Proof. At any x 2 Y , Definition 6.6.5 of a corner point and Definition 3.1.10 defining a manifold known by equations give us a neighborhood V ⇢ Rn of x, and a collection of C 1 functions V ! R with linearly independent derivatives: functions f1 , . . . , fn k defining M \ V and functions ge1 , . . . , gem (extensions of the coordinate functions gi , . . . , gm of the map g of Definition 6.6.5). The number m may be 0; this will happen if x is in the interior of Y .

Solutions for Chapter 6

235

In each case we extend the collection of their derivatives to a maximal collection of n linearly independent linear functions, by adding linearly independent linear functions 1 , . . . , k m : Rn ! R. These n functions define a map F : V ! Rn whose derivative at x is invertible. We can apply the inverse function theorem to say that locally, F is invertible, so locally it is a di↵eomorphism on some neighborhood U ⇢ V of x. Equations 6.10.30 and 6.10.31 then follow. ⇤ Using this, there are two things to show: the nonsmooth boundary has (k 1)- dimensional volume 0 and the smooth boundary has finite (k 1)dimensonal volume. It is enough to show that this is true in U , or even in 0 0 U 0 ⇢ U containing x such that the closure U is contained in U (and U is compact of course since it is closed and bounded). Indeed, by the HeineBorel theorem, since X is compact it is covered by finitely many such U 0 , so both properties are true for all of X if they are true for each U 0 . The nonsmooth boundary is a subset of a manifold of dimension < k 1 (see Proposition 5.2.2), since it is defined by at least two equations with surjective derivative. For the smooth boundary, consider the part where some gi in U 0 it is a subset of a compact piece of a k 1-dimensional manifold, parametrized by F 1 restricted to a (k 1)-dimensional subspace of Rn . So the (k 1)-dimensional volume of this part of the smooth boundary is an integral of a continuous function with compact support, hence finite. h ⇣ ⌘ ⇣ ⌘i 6.6.11 Suppose D gf x is onto; then there exists ~v 2 Rn such that h ✓ ◆ ⇣ ⌘i ⇣ ⌘ f D x ~v = 0 1 . The top entry says that ~v 2 Tx M , and the bottom g entry says that [Dg(x)] does not vanish on ~v, so [Dg(x)] : Tx M ! R is onto.  For the converse, suppose that [Dg(x)] : Tx M ! R is onto. Choose ~ w 2 Rn k+1 . We need to find ~a 2 Rn such that w  h ⇣ ⌘ ⇣ ⌘i ~ w f D g x ~a = . w Since [Df (x)] is onto we can find ~v1 such that ~, [Df (x)]v1 = w We have [Df (x)](~ v2 ) = 0 since ~ v2 2 Tx M and, by Theorem 3.2.4, Tx M = ker[Df (x)].

and since [Dg(x)] : Tx M ! R is onto we can find ~v2 2 Tx M with [Dg(x)]~v2 =hw ⇣ [Dg(x)]~ v1 . We will consider separately the top and bot⌘ ⇣ ⌘i f tom lines of D g x (~v1 + ~v2 ). The top line is ~ + [Df (x)](~v2 ) = w ~ [Df (x)](~v1 + ~v2 ) = [Df (x)](~v1 ) + [Df (x)](~v2 ) = w | {z } 0 ; see margin

The bottom line is

[Dg(x)](~v1 + ~v2 ) = [Dg(x)](~v1 ) + [Dg(x)](~v2 ) = [Dg(x)](~v1 ) + w

[Dg(x)](~v1 ) = w

236

Solutions for Chapter 6

Therefore if ~a = ~v1 + ~v2 , we have   h ⇣ ⌘ ⇣ ⌘i ~ [Df (x)] w ~a = , so D gf x : Rn ! Rn [Dg(x)] w

k+1

is onto.

6.7.1 a. ⇣ ⌘ d(x1 x3 dx3 ^ dx4 ) = D1 (x1 x3 )dx1 + D2 (x1 x3 )dx2 + D3 (x1 x3 )dx3 + D4 (x1 x3 )dx4 ^ dx3 ^ dx4 = (x3 dx1 + x1 dx3 ) ^ dx3 ^ dx4 = x3 dx1 ^ dx3 ^ dx4 . b. d(cos xy dx ^ dy) = ( y sin xy dx

d

~2 F

x sin xy dy) ^ dx ^ dy = 0.

6.7.3 When computing the exterior derivative, remember that any terms containing dxi ^ dxi are 0. a. ✓ ◆ ✓ ◆ ✓ ◆ y x y x =d dx + dy = d dx + d dy x2 + y 2 x2 + y 2 x2 + y 2 x2 + y 2 ✓ ◆ ✓ ◆ y y x x = D1 2 dx + D2 2 dy ^ dx + D1 2 dx + D2 2 dy ^ dy x + y2 x + y2 x + y2 x + y2

y 2 x2 (x2 + y 2 ) 2x2 dy ^ dx + dx ^ dy = 0. (x2 + y 2 )2 (x2 + y 2 )2 0 1 F1 b. Set F~3 = @ F2 A. Then F3 ⇣ ⌘ d F~3 = d F1 dy ^ dz F2 dx ^ dz + F3 dx ^ dy ✓ ◆ x y z =d dy ^ dz dx ^ dz + 2 dx ^ dy (x2 + y 2 + z 2 )3/2 (x2 + y 2 + z 2 )3/2 (x + y 2 + z 2 )3/2 =

=

(x2 + y 2 + z 2 )3/2 · 1 x 32 (x2 + y 2 + z 2 )1/2 · 2x dx ^ dy ^ dz (x2 + y 2 + z 2 )3 (x2 + y 2 + z 2 )3/2 3y 2 (x2 + y 2 + z 2 )1/2 dy ^ dx ^ dz (x2 + y 2 + z 2 )3 +

=

(x2 + y 2 + z 2 )3/2 3z 2 (x2 + y 2 + z 2 )1/2 dz ^ dz ^ dy (x2 + y 2 + z 2 )3

3(x2 + y 2 + z 2 )3/2

3(x2 + y 2 + z 2 )1/2 (x2 + y 2 + z 2 ) = 0. (x2 + y 2 + z 2 )3

6.7.5 a. Computing the exterior derivative from the definition means computing the integral on the right side of Z 1 d(z 2 dx ^ dy) Px (~v1 , ~v2 , ~v3 ) = lim 3 z 2 dx ^ dy. h!0 h @Px (h~ v1 ,h~ v2 ,h~ v3 ) Thus we must integrate the 2-form field z 2 dx ^ dy over the boundary of the 3-parallelogram spanned by h~v1 , h~v2 , h~v3 , i.e., over the six faces of the 3-parallelogram shown in the margin on the next page.

Solutions for Chapter 6

Solution 6.7.5, part a: If you just integrate everything in sight, this is a long computation. To make it bearable you need to (1) not compute terms that don’t need to be computed, since they will disappear in the limit, and (2) take advantage of cancellations early in the game. To interpret the answer, it also helps to know what one expects to find (more on that later).

We those six faces as follows, where 0  s, t  h, and 0 parametrize 1 x x = @ y A; we use Proposition 6.6.26 to determine which faces are taken z with a plus sign and which are taken with a minus sign: ⇣ ⌘ s = x + s~v + t~v , minus. (This is the base of the box shown at 1 2 t

1. left.)

⇣ ⌘ s = x + h~v + s~v + t~v , plus. (This is the top of the box.) 3 1 2 t ⇣ ⌘ s = x + s~v + t~v , plus. (This is the front of the box.) 1 3 t ⇣ ⌘ s = x + h~v + s~v + t~v , minus. (This is the back of the box.) 2 1 3 t ⇣ ⌘ s = x + s~v + t~v , minus. (This is the left side of the box.) 2 3 t ⇣ ⌘ s = x + h~v + s~v + t~v , plus. (This is the right side of the box.) 1 2 3 t

2. 3. 4. 5. 6.

h~ v3

h~ v3

h~ v2 x

h~ v1

x + h~ v1

Figure for solution 6.7.5. The 3-parallelogram spanned by h~ v1 , h~ v2 , h~ v3 .

Z

0

h

Z

0

=

h

Note that the two first faces are spanned by ~v1 and ~v2 (or translates thereof), the base taken with and the top taken with +. The next two faces, with opposite signs, are spanned by ~v1 and ~v3 , and the two sides, with opposite signs, are spanned by ~v2 and ~v3 . To integrate our form field over these parametrized domains we use Definition 6.2.1. The computations are tedious, but we do not have to integrate everything in sight. For one thing, common terms that cancel can be ignored; for another, anything that amounts to a term in h4 can be ignored, since h4 /h3 will vanish in the limit as h ! 0. We will compute in detail the integrals over the first two faces. The integral over the first face (the base) is the following, where v1,3 denotes the third entry of ~v1 and v2,3 denotes the third entry of ~v2 : s t

z 2 dx^dy eval. at

237

!

!

D

!

,D

z }| { z s }| t { 2 (z + sv1,3 + tv2,3 ) dx ^ dy (~v1 , ~v2 ) ds dt Z

0

h

Z

0

h

z2 |{z}

2 2 + s2 v1,3 + t2 v2,3 + 2stv1,3 v2,3 + 2szv1,3 + 2tzv2,3 (v1,1 v2,2 {z } | {z } | 2

term in h

The notation O(h) is discussed in Appendix A.11.

terms in h4

RhRh

v1,2 v2,1 ) ds dt

terms in h3

Note that the integral 0 0 z 2 ds dt will give a term in h2 , and the next Rh 2 three terms will give a term in h4 ; for example, 0 s2 v1,3 ds gives an h3 and Rh 2 2 4 s v1,3 dt gives an h, making h in all. These higher degree terms can be 0 disregarded; we will denote them below by O(h4 ). This gives the following integral over the first face: Z hZ h z 2 + 2szv1,3 + 2tzv2,3 + O(h4 ) (v1,1 v2,2 v1,2 v2,1 ) ds dt 0

0

=

h2 z 2 + h3 zv1,3 + h3 zv2,3 + O(h4 ) (v1,1 v2,2

v1,2 v2,1 ).

238

Solutions for Chapter 6

Before computing the integral over the second face (the top), notice that it is exactly like the first face except that it also has the term h~v3 . This face comes with a plus sign, while the first face comes with a minus sign, so when we integrate over the second face, the identical terms cancel each other. Thus we didn’t actually have to compute the integral over the first face at all! In computing the contribution of the first two faces to the integral, we need only concern ourselves with those terms in 2 (z + hv3,3 + sv1,3 + tv2,3 )2 that contain hv3,3 : i.e., h2 v3,3 , 2hsv3,3 v1,3 , 2htv3,3 v2,3 , and 2zhv3,3 . Integrating the first three would give terms in h4 , so the entire contribution to the integral of the first two faces is Z

Z

h

0

Z

0

h

2zhv3,3 + O(h4 ) (v1,1 v2,2

v1,2 v2,1 ) ds dt = (2zh3 v3,3 + O(h4 ))(v1,1 v2,2

v1,2 v2,1 ).

0

hZ

h

0

Similarly, the entire contribution of the second pair of faces is ⇣ ⌘⇣ ⌘ 2zhv2,3 +O(h4 ) (v1,1 v3,2 v3,1 v1,2 ) ds dt = 2zh3 v2,3 +O(h4 ) v1,1 v3,2 v3,1 v1,2 .

(The partial derivatives are di↵erent, so we have (v1,1 v3,2 v3,1 v1,2 ), not (v1,1 v2,2 v1,2 v2,1 ) as before.) And the contribution of the last pair of faces is

Z

0

h

Z

h

2zhv1,3 + O(h4 ) (v2,1 v3,2

v2,2 v3,1 ) ds dt = (2zh3 v1,3 + O(h4 ))(v2,1 v3,2

v2,2 v3,1 ).

0

Dividing by h3 and taking the limit as h ! 0 gives

d(z 2 dx ^ dy) Px (~v1 , ~v2 , ~v3 ) = (2zv3,3 )(v1,1 v2,2

v1,2 v2,1 )

+ (2zv1,3 )(v2,1 v3,2

Note that in this solution we have reversed our usual rule for subscripts, where the row comes first and column second.

(2zv2,3 )(v1,1 v3,2

v3,1 v1,2 )

v2,2 v3,1 ).

Now it helps to know what one is looking for. Since d(z 2 dx ^ dy) is a 3-form on R3 , it is a multiple of the determinant. If you compute 2 3 v1,1 v2,1 v3,1 det[~v1 , ~v2 , ~v3 ] = 4 v1,2 v2,2 v3,2 5 , v1,3 v2,3 v3,3 you will see that

d(z 2 dx ^ dy) Px (~v1 , ~v2 , ~v3 ) = 2z det[~v1 , ~v2 , ~v3 ]. b. Using Theorem 6.7.4, we have part 5

z}|{ d(z 2 dx ^ dy) = d z 2 ^ dx ^ dy part 4

z}|{ = (D1 z 2 dx + D2 z 2 dy + D3 z 2 dz) ^ dx ^ dy = 2z dz ^ dx ^ dy

Prop. 6.1.15

z}|{ =

2z dx ^ dy ^ dz.

Solutions for Chapter 6

239

6.7.7 a. The four edges of P e2 (h~e2 , h~e3 ) are parametrized by 0 1 2 3 0 1 2 3 0 0 0 0 1. t 7! @ 1 + h A + t 4 0 5 2. t 7! @ 1 A + t 4 0 5 0 h 0 h 0 1 2 3 0 1 2 3 0 0 0 0 3. t 7! @ 1 A + t 4 h 5 4. t 7! @ 1 A + t 4 h 5 . h 0 0 0

The second and third are taken with a minus sign, and the first and fourth with a plus sign. Only the first two contribute to the integral around the boundary, since dx3 returns 0 for vectors with no vertical component. They give Z 1 Z 1 ( 1 + h)2 h dt ( 1)2 h dt = 2h2 + h3 . 0

0

2

Divide by h and take the limit as h ! 0, to find

2.

b. We find

d(x22 dx3 ) = 2x2 dx2 ^ dx3

and 2x2 dx2 ^ dx3 P

e2 , ~e3 ) e2 (~

=

2·1 =

2.

6.7.9 First compute d!: ⇣ ⌘ d! = d p(y, z) dx + q(x, z) dy

= D2 p(y, z) dy ^ dx + D3 p(y, z) dz ^ dx + D1 q(x, z) dx ^ dy + D3 q(x, z) dz ^ dy;

comparing this with the formula d! = x dy ^ dz + y dx ^ dz, we see that x dy ^ dz = D3 q(x, z) dz ^ dy;

y dx ^ dz = D3 p(y, z) dz ^ dx;

i.e., D3 q(x, z) =

x;

i.e., D3 p(y, z) =

y.

In addition, 0 = D1 q(x, z) dx ^ dy

D2 p(y, z) dx ^ dy;

The functions q(x, z) = xz and p(y, z) = which gives us the 1-forms of the form !=

yz dx

i.e., D1 q(x, z) = D2 p(y, z). yz satisfy these constraints,

xz dy.

But since ddf = 0, the complete list of 1-forms satisfying the conditions is larger: !=

yz dx

xz dy + df,

where f is any (at least) C 2 function on R3 .

6.7.11 a. This is a restatement of Theorem 1.8.1, part 5. b. Any k-form can be written as a sum of k-forms of the form a(x) dxi1 ^ · · · ^ dxik ,

240

Solutions for Chapter 6

and any l-form can be written as a sum of l-forms of the form b(x) dxj1 ^ · · · ^ dxjl . The result then follows from Theorem 6.7.4, part 2 (the exterior derivative of the sum equals the sum of the exterior derivatives) and from Proposition 6.1.15 (distributivity of the wedge product). We will work it out in the case of a k-form ' = '1 + '2 and an l-form , where we assume that Theorem 6.7.9 is true for '1 , '2 , and . We have d ('1 + '2 ) ^

= d('1 ^ + '2 ^ ) = d('1 ^ ) + d('2 ^ ) ⇣ ⌘ ⇣ ⌘ = d'1 ^ + ( 1)k '1 ^ d + d'2 ^ + ( 1)k '2 ^ d = (d'1 + d'2 ) ^ = d('1 + '2 ) ^

+ ( 1)k '1 ^ d + '2 ^ d

+ ( 1)k ('1 + '2 ) ^ d .

c. To simplify notation, in the equation below we write a(x) and b(x) as a and b. Recall that we can write the wedge product of a 0-form f and a form ↵ either as f ↵ or as f ^ ↵. We now treat the function ab as the function f of Theorem 6.7.4, part 5:

d '^ To move the b next to the a in line [2] we use Proposition 6.1.15 (skew commutativity): since b is a 0-form we have

= d a dxi1 ^ · · · ^ dxik ^ b dxj1 ^ · · · ^ dxjl

= d(ab) ^ dxi1 ^ · · · ^ dxik ^ dxj1 ^ · · · ^ dxjl

[2]

= (a db + b da) ^ dxi1 ^ · · · ^ dxik ^ dxj1 ^ · · · ^ dxjl

[3]

= a db ^ dxi1 ^ · · · ^ dxik ^ dxj1 ^ · · · ^ dxjl

dxi1 ^ · · · ^ dxik ^ b

= ( 1)0·k b ^ dxi1 ^ · · · ^ dxik = b dxi1 ^ · · · ^ dxik .

To go to line [2] we also use equation 6.7.9. To go to line [3] we apply the formula d(f g) = f dg+g df (Theorem 6.7.9, with k = 0). To go to [4] we use the distributivity of the wedge product (Proposition 6.1.15). To go from [4] to [5] we use the commutativity of 0-forms. To go from [5] to [6] we again use skew commutativity, this time moving the 1-form db.

[1]

+ b da ^ dxi1 ^ · · · ^ dxik ^dxj1 ^ · · · ^ dxjl | {z }

[4]

+ d' ^ b ^ dxj1 ^ · · · ^ dxjl | {z }

[5]

d'

= a db ^ dxi1 ^ · · · ^ dxik ^ dxj1 ^ · · · ^ dxjl

= a ^ dxi1 ^ · · · ^ dxik ^( 1)k db dxj1 ^ · · · ^ dxjl +d' ^ | {z } | {z } '

= d' ^

d

k

+ ( 1) ' ^ d .

[6]

Solutions for Chapter 6

241

~ ), ~u · (~v ⇥ w ~ ), grad f (x) · ~v. Function: 6.8.1 a. Numbers: dx ^ dy(~v, w ~ ~ div F . Vector fields: grad f , curl F b.

~ grad f = rf ~ · F~ div F~ = r ~ ⇥ F~ curl F~ = r

df = Wrf ~ = Wgrad f = D1 f dx1 + D2 f dx2 + D3 f dx3

dWF~ = d

~ F

~ curl F

=

~ F ~ r⇥

= Mdiv F~ = Mr· ~ F ~

c. Of course more than one right answer is possible; for example, in (i), grad F~ could be changed to curl F~ . In (iii), F~ (~v1 , ~v2 , ~v3 ) is meaningful if it is in R4 . i.

grad f

iv. WF~ 6.8.3

ii. curl F~

iii.

v1 , ~v2 ) ~ (~ F

v.

vi.

unchanged

unchanged

2

3 2 3 2 3 2 3 2xy D1 y 0 grad f = 4 x2 5 ; curl F~ = 4 D2 5 ⇥ 4 x 5 = 4 z 5 ; 1 D3 xz 2 2 3 2 3 D1 y div F~ = 4 D2 5 · 4 x 5 = x. D3 xz

✓ ◆  x 2x = y 2y 0 1 2 3  x 1 ⇣ ⌘ sgn(x + y + z) 2x x = 1 ~ ~ @yA = 415 c. rf d. rf 2 2 x +y y 2y |x + y + z| z 1 sgn(x+y+z) 1 Note that |x+y+z| can be simplified to x+y+z . Of course there is no solution to part d if x + y + z = 0. ~ 6.8.5 a. rf

⇣ ⌘  0 x = y 2y

~ b. rf

6.8.7 a. We can write 2 the3 1-form ' = xy dx + z dy + yz dz as the work xy form WF~ , where F~ = 4 z 5. yz b. In the language of forms,

dWF~ = d(xy dx + z dy + yz dz) = d(xy) ^ dx + d(z) ^ dy + d(yz) ^ dz

= (D1 xy dx + D2 xy dy + D3 xy dz) ^ dx + (D1 z dx + D2 z dy + D3 z dz) ^ dy =

+ (D1 yz dx + D2 yz dy + D3 yz dz) ^ dz

x(dx ^ dy) + (z

1)(dy ^ dz).

(1)

242

Solutions for Chapter 6

Since a 2-form in R3 can be written G2 dx ^ dz + G3 dx ^ dy, 2 3 z 1 ~ 4 0 5. the last line of equation (1) can be written G ~ for G = x ~ ~ It can also be written as r⇥ ~ F ~ , since G is precisely the curl of F : 2 3 2 3 2 3 2 3 D1 xy D2 yz D3 z z 1 ~ ⇥ F~ = 4 D2 5 ⇥ 4 z 5 = 4 D1 yz + D3 xy 5 = 4 0 5 . r D3 yz D1 z D2 xy x ~ G

Solution 6.8.9, part a: A less direct way to show this is to say that the exterior derivative of df = Wgrad f = W F

1

F2

= (F1 dx + F2 dy) must be 0: d(F1 dx + F2 dy) = D2 F1 dy ^ dx + D1 F2 dx ^ dy = 0,

i.e., D1 F2

D2 F1 = 0.

= G1 dy ^ dz

6.8.9 a. If F~ = grad f , then D1 f = F1 and D2 f = F2 , so that D2 F1 = D2 D1 f and D1 F2 = D1 D2 f ; these crossed partials are equal since f is of class C 2 (Corollary 3.3.10). b. We have changed the statement of the exercise to “Show that this is not necessarily true if we only assume that all second partials of f exist everywhere. ” The solution is: It is possible for the first partial derivatives to have partial derivatives without being di↵erentiable, and then their crossed partials are not necessarily equal. See Example 3.3.9. 6.8.11 a. W2 x 3 = x dx + y dy + z dz and 4y5 z

d(x dx + y dy + z dz) = dx ^ dx + dy ^ dy + dz ^ dz = 0. 2 3 x Indeed, a function f exists such that grad f = 4 y 5; it is the function z 0 1 x 2 2 2 x +y +z f @yA = . 2 z b. 2 x 3 = x dy ^ dz + y dz ^ dx + z dx ^ dy; the exterior derivative is 4y5 z

d(x dy ^ dz + y dz ^ dx + z dx ^ dy)

= dx ^ dy ^ dz + dy ^ dz ^ dx + dz ^ dx ^ dy = 3 dx ^ dy ^ dz.

6.8.13 a.

2

3 2 3 2 2 3 x2 y D1 x y div 4 2yz 5 = 4 D2 5 · 4 2yz 5 = 2xy 2z x3 y 2 D3 x3 y 2 2 2 3 2 3 2 2 3 2 3 3 x y D1 x y 2x y + 2y curl 4 2yz 5 = 4 D2 5 ⇥ 4 2yz 5 = 4 3x2 y 2 5 x3 y 2 D3 x3 y 2 x2

Solutions for Chapter 6

b.

243

2

3 2 3 2 3 sin xz D1 sin xz div 4 cos yz 5 = 4 D2 5 · 4 cos yz 5 = z cos xz xyz D3 xyz

z sin yz + xy

2

3 2 3 2 3 2 3 sin xz D1 sin xz xz + y sin yz curl 4 cos yz 5 = 4 D2 5 ⇥ 4 cos yz 5 = 4 x cos xz yz 5 . xyz D3 xyz 0

c. Curl of the vector field in part a

We will compute dWF~ = curl F~ , the flux form field of curl F~ , and from that we will compute curl F~ . Since WF~ = x2 y dx

If you attempt to compute this using random vectors ~ v1 , ~ v2 , you will regret it.

2yz dy + x3 y 2 dz

is a 1-form, its exterior derivative is a 2-form. By Definition 6.7.1, we have Z 1 dWF~ = lim 2 x2 y dx 2yz dy + x3 y 2 dz. h!0 h @(Px (h~ v1 ,h~ v2 )) Rather than computing this with vectors ~v1 , ~v2 , remember (Theorem 6.1.8) that any 2-form can be written in terms of the elementary 2-forms. So dWF~ =

~ curl F

= a dx ^ dy + b dy ^ dz + c dx ^ dz

(1)

for some coefficients a, b, c. Theorem 6.1.8 says that to determine the coefficients, we should evaluate dWF~ on the standard basis vectors. Thus to determine the coefficient of dx^dy we will integrate WF~ over the oriented boundary of the parallelogram spanned by h~e1 , h~e2 , computing Z 1 lim x2 y dx 2yz dy + x3 y 2 dz. h!0 h2 @P (h~ e1 ,h~ e2 ) x To do this, we parametrize each edge of the parallelogram shown in the figure at left and give it a plus or minus depending on its orientation: 1. Px (h~e1 )

parametrized by

1 (t)

= x + t~e1

2. Px+h~e1 (h~e2 )

parametrized by

2 (t)

= x + h~e1 + t~e2

3. Px+h~e2 (h~e1 )

parametrized by

3 (t)

= x + h~e2 + t~e1

4. Px (h~e2 )

parametrized by

4 (t)

= x + t~e2

First, we will determine the orientation of each edge. Using Proposition 6.6.26, we have @Px (h~e1 , h~e2 ) = ( 1)0 Px+h~e1 (h~e2 )

Px (h~e2 ) + ( 1)1 Px+h~e2 (h~e1 )

Px (h~e1 ) ,

so edges 1 and 2 come with a plus sign, while 3 and 4 get a minus.

244 h~e2

Solutions for Chapter 6

3

4

2

We must integrate x2 y dx 2yz dy + x3 y 2 dz over each parametrized edge, but note that for the first edge (and the third), we are only concerned with x2 y dx, since dy(~e1 ) = 0 and dz(~e1 ) = 0. If you don’t see this, recall (equation 6.4.30) how we integrate forms. To integrate x2 y dx

2yz dy + x3 y 2 dz

over the first edge, we integrate x2 y dx x

1

h~e1

Figure for solution 6.8.13. The parallelogram spanned by h~e1 , h~e2 . In our calculations, we denote 0 by 1 x, y, z the entries of x: x x = @ y A. x

The h in the denominator might be worrisome – what will happen as h ! 0? But we will see that it cancels with a term from another edge. That is the point of having an oriented boundary!

first edge parametrized by 1 . But Thus for edge 1 we compute 1 h2

Z

1 x y dx = 2 h @(Px (h~ e1 ,h~ e2 )) 2

! Dt 1 (t)

Z

! 1 (t))

2yz dy + x3 y 2 dz(Dt = ~e1 .

x2 y eval. at h

0

over the

1 (t)

z }| { (x + t)2 y

dx(~e1 ) dt

Z h 1 x2 y + 2xyt + t2 y dt h2 0 1 = 2 x2 yh + xyh2 + · · · h =

Notice that t2 y gives a term in h3 ; once we divide by h2 it will be a term in h, which will go to 0 as h ! 0. So we can ignore it. The terms that count x2 y for edge 1 are + xy, both taken with a +. h For edge 2 we are only concerned with 2yz dy, since dx(~e2 ) = 0 and dz(~e2 ) = 0. We compute Z h 1 2yz 2tz dt, h2 0 which gives

2yz h

z.

x2 y A similar computation for edge 3 gives the terms + xy + x2 , each h 2yz term taken with a minus sign; for edge 4 we get z, also taken with h a minus sign. Thus we have Z 1 lim x2 y dx 2yz dy + x3 y 2 dz h!0 h2 @(P (h~ e1 ,h~ e2 )) x =

x2 y 2yz + xy + h | {z } | h {z from edge 1

=

2

x .

x2 y z xy } | h {z

from edge 2

from edge 3

2yz x2 + +z } | h {z }

from edge 4

We can substitute this for a in equation 1: ~ curl F

=

x2 dx ^ dy + b dy ^ dz + c dx ^ dz.

Recall that F~ = F1 dy ^ dz F2 dx ^ dz + F3 dx ^ dy, so x2 should be the third entry of curl F~ , which is indeed what we got in part a (with considerably less e↵ort!).

Solutions for Chapter 6

245

For the coefficient of dx ^ dz we integrate over the boundary of the parallelogram spanned by h~e1 , h~e3 . To save work, we note that for edges 1 and 3 we are interested only in x2 y dx, and for edges 2 and 4 we are only interested in x3 y 2 dz. We get @ Px (h~e1 , h~e3 ) =

x2 y x3 y 2 x2 y + xy + + 3x2 y 2 xy | h {z } | h {z } | h {z } from edge 1

from edge 2

x3 y 2 = 3x2 y 2 . h

from edge 3

Since in the formula for F~ the coefficient of dx ^ dz is F2 , the second entry of curl F~ should be 3x2 y 2 , which is what we got in part a. A similar computation involving ~e2 , ~e3 gives the coefficient for dy ^ dz, i.e., the first entry of curl F~ . Div of the vector field in part a Now let’s compute the divergence of the vector field in part a, by computing d F~ = Mdiv F~ . Since ~ F

h~e3

h~e3

is a 2-form, d F~ is a 3-form and can be written ↵ dx ^ dy ^ dz for some coefficient ↵ = div F~ . We will compute d F~ by integrating x2 y dy ^ dz + 2yz dx ^ dz + x3 y 2 dx ^ dy

h~e2 x

= x2 y dy ^ dz + 2yz dx ^ dz + x3 y 2 dx ^ dy

h~e1

over the oriented boundary of the parallelogram spanned by h~e1 , h~e2 , h~e3 , shown at left. The boundary consists of0six1faces, which we parametrize as follows, x where 0  s, t  h, and x = @ y A; we use Proposition 6.6.26 to determine z which faces are taken with a plus sign and which are taken with a minus sign: ⇣ ⌘ s = x + s~e + t~e , minus (the base of the box shown at left) 1. 1 2 t ⇣ ⌘ s = x + h~e + s~e + t~e , plus (the top of the box) 2. 3 1 2 t ⇣ ⌘ s = x + s~e + t~e , plus (the front of the box) 3. 1 3 t ⇣ ⌘ s = x + h~e + s~e + t~e , minus (the back of the box) 4. 2 1 3 t ⇣ ⌘ s = x + s~e + t~e , minus (the left side of the box) 5. 2 3 t ⇣ ⌘ s = x + h~e + s~e + t~e , plus (the right side of the box) 6. 1 2 3 t

x + h~e1

The parallelogram spanned by h~e1 , h~e2 , h~e3 .

For the integral over the first face, we are concerned only with the term x3 y 2 dx ^ dy, since dy ^ dz(~e1 , ~e2 ) = 0 and dx ^ dz(~e1 , ~e2 ) = 0. So we have x3 y 2 dx^dy eval. at

1 h3

Z

@Px (h~ e1 ,h~ e2 )

x3 y 2 dx ^ dy =

1 h3

Z

0

h

Z

0

h

s t

!

!

D

!

,D

z }| { z s }| t { 3 2 (x + s) (y + t) dx ^ dy (~e1 , ~e2 ) ds dt

246

Solutions for Chapter 6

After discarding everything that will give terms in h4 , dividing by h3 , and giving the correct orientation, we are left with the following for the first face: x3 y 2 h

3x2 y 2 2

x3 y.

For the top of the box we get the same terms, with opposite sign. For the third face (the front), we get 2yz + y. h For the fourth face (the back), we get 2yz h

y

2z.

For the fifth and sixth faces, we are concerned only with x2 y dy ^ dz. The fifth face gives x2 y h

x2 . 2

For the sixth face, we get x2 y x2 + 2xy + . h 2 After cancellations, this leaves 2xy d

~ F

2z. Thus

= Mdiv F~ = (2xy

By equation 6.5.13, div F~ = 2xy

2z) dx ^ dy ^ dz.

2z, which is what we got in part a.

Curl and div of the vector field in part b We do not work out curl and div of the second vector field, since the procedure is the same and you can check your results from part b. 6.9.1 a. To show that the pullback T ⇤ is linear if T is a linear transformation, we need to show that for k-forms ' and and scalar a, T ⇤ (' + ) = T ⇤ (') + T ⇤ ( ); T ⇤ (a') = aT ⇤ '. For the first: We use Definition 6.1.5 of addition of k-forms to go from line 1 to line 2.

T ⇤ (' + )(~v1 , . . . , ~vk ) = (' + ) T (~v1 ), . . . , T (~vk ) = ' T (~v1 ), . . . , T (~vk ) + ⇤

= T '(~v1 , . . . , ~vk ) + T

⇤

T (~v1 ), . . . , T (~vk ) (~v1 , . . . , ~vk ).

For the second: T ⇤ a'(~v1 , . . . , ~vk ) = a' T (~v1 , . . . , ~vk ) = aT ⇤ '(~v1 , . . . , ~vk ).

Solutions for Chapter 6

247

b. To show that the pullback by a C 1 mapping f is linear, we have ⇣ ⌘ ⇣ ⌘ f ⇤ (' + ) Px (~v1 , . . . , ~vk ) = (' + ) Pf (x) [Df (x)]~v1 , . . . , [Df (x)]~vk ⇣ ⌘ ⇣ ⌘ = ' Pf (x) [Df (x)]~v1 , . . . , [Df (x)]~vk + Pf (x) [Df (x)]~v1 , . . . , [Df (x)]~vk ⇣ ⌘ ⇣ ⌘ = (f ⇤ ') Px (~v1 , . . . , ~vk ) + (f ⇤ ) Px (~v1 , . . . , ~vk ) and

⇣ ⌘ ⇣ ⌘ f ⇤ (a') Px (~v1 , . . . , ~vk ) = (a') Pf (x) [Df (x)]~v1 , . . . , [Df (x)]~vk ⇣ ⌘ = af ⇤ ' Px (~v1 , . . . , ~vk ) .

6.9.3 a. (f ⇤ W⇠ )Px (~v) = W⇠ Pf (x) [Df (x)]~v = ⇠ f (x) · [Df (x)]~v = ⇠ f (x) ⇣ ⌘> ⇣ ⌘ ~v = [Df (x)]> ⇠ f (x) · ~v = [Df (x)]> ⇠ f (x)

>

[Df (x)]~v

= W[Df ]> ⇠ f Px (~v),

so f ⇤ W⇠ = W[Df ]> ⇠ f .

f⇤

⇠

b. We have n = m, so suppose U ⇢ Rn is open, f : U ! Rn is C 1 and ⇠ is a vector field on Rn . Then ⇣ ⌘ Px (~v1 , . . . , ~vn 1 ) = ⇠ Pf (x) [Df (x)]~v1 , . . . , [Df (x)]~vn 1 h i = det ⇠ f (x) , [Df (x)]~v1 , . . . , [Df (x)]~vn 1 h i = det [Df (x)][Df (x)] 1 ⇠ f (x) [Df (x)]~v1 , . . . , [Df (x)]~vn 1 ⇣ h i⌘ = det [Df (x)] [Df (x)] 1 ⇠ (x) , ~v1 , . . . , ~vn 1 ⇥ ⇤ = det[Df (x)] det [Df (x)] 1 ⇠ f (x) , ~v1 , . . . , ~vn 1 =

So f ⇤

⇠

=

(det[Df ])[Df ]

(det[Df ])[Df ]

1 (⇠

1 (⇠

f)

f)

Px (~v1 , . . . , ~vn

= det[Df ]

[Df ]

1)

1 (⇠

f)

e[i,j] be the matrix obtained from A by replacing the (i, j)th c. Let A entry by 1, and all other entries of the ith row and jth column by 0. For instance, 2 3 2 3 a1 b1 c1 0 b1 c1 e[3,1] = 4 0 b2 c2 5 if A = 4 a2 b2 c2 5 then A a3 b3 c3 1 0 0 Note that

e[i,j] = ( 1)i+j det A[i,j] . det A

(1)

Recall (Exercise 4.8.18) that Cramer’s rule says that if A~x = ~b and if Ai (~b) denotes the matrix A where the ith column has been replaced by ~b,

248

Solutions for Chapter 6

then xi det A = det Ai (~b). When A is invertible, so that det A 6= 0, then Cramer’s rule says that 2 det A (~b) 3 2 3 1 det A1 (~b) det A 6 7 ~ 6 7 .. .. A4 (2) 5 = b, i.e., A 4 5 = (det A)~b. . . ~ det An (b) det An (~b) | det {zA } ~ x

In the second equation in (2), both sides are continuous functions of A, so the equation is true on the closure of invertible matrices, which is all matrices. e[k,i] = det Ai (~ek ). Then Cramer’s Apply this to ~b = ~ek , noting that det A rule says that 2 e[k,1] 3 det A .. 6 7 A4 5 = (det A)~ek , . e[k,n] det A

so, using equation 1, 2 e[1,1] · · · det A e[n,1] 3 det A .. .. 6 7 A adj(A) = A 4 5 = (det A)[~e1 , . . . , ~en ] = (det A)I. . ··· . e[1,n] · · · det A e[n,n] det A Equality 1 is the definition of the flux. Equality 2 introduces AA 1 = I. Equality 3 factors out A. Equality 4 is

Now we need to show that f ⇤ ⇠ = adj[Df ](⇠ and any invertible n ⇥ n matrix A we have v1 , . . . , A~vn 1 ) ⇠ (Pf (x) (A~

f ).

For any vector ⇠ 2 Rn

= det[⇠(f (x)), A~v1 , . . . , A~vn 1

1]

= det[AA 1 ⇠(f (x)), A~v1 , . . . , A~vn 1 ] 2 ⇣ ⌘ = det A[A 1 ⇠(f (x)), ~v1 , . . . , ~vn 1 ]

det(AB) = det A det B.

3

Equality 5 is multilinearity: multiplying one column of a matrix by a number multiplies the determinant by the same number. Equality 6 is (det A)A 1 = adj A, which follows from equation 3.

= det A det[A 4

= det[(det A)A 5

1

⇠(f (x)), ~v1 , . . . , ~vn 1

(3)

⇠(f (x)), ~v1 , . . . , ~vn

= det[(adj(A)⇠(f (x)), ~v1 , . . . , ~vn 6

(4)

1] 1]

1 ].

and again in both sides the first and last expressions are continuous functions of A, hence valid for all matrices. Applying this to A = [Df (x)] we find f ⇤ ⇠ = adj[Df ](⇠ f ) . 6.10.1 Give U its standard orientation as an open subset of R3 , and @U the boundary orientation. Since d(z dx ^ dy + y dz ^ dx + x dy ^ dz) = dz ^ dx ^ dy + dy ^ dz ^ dx + dx ^ dy ^ dz we see that Z @U

= 3 dx ^ dy ^ dz,

1 (z dx ^ dy + ydz ^ dx + x dy ^ dz) = 3

Z

U

dx ^ dy ^ dz,

Solutions for Chapter 6

249

which is the volume of U .

Z

U

6.10.3 We will compute the integral in two ways. First we compute it directly. Let be the inverse of the projection mentioned in the text. The projection preserves orientation, so does too, and we may use to perform our integration. Let U be the region we wish to integrate over and set 80 1 9 < x1 = V = @ x2 A x1 , x2 , x3 0, and x1 + x2 + x3  a . : ; x3

Since (V ) = U , Definition 6.2.1 tells us that ✓ ◆ Z ! ! ! x1 dx2 ^ dx3 ^ dx4 = x1 dx2 ^ dx3 ^ dx4 D1 , D2 , D3 |dx1 dx2 dx3 |. V

Since

0

0

1

x1 @ x2 A = B @ x3

we have

x1 dx2 ^ dx3 ^

a

x1 x2 x3 x1 x2

! dx4 (D1

! , D2

1

2

1 C 6 0 A, with derivative 4 0 x3 1

! , D3

2

0 ), = x1 det 4 0 1

1 0 1

0 1 0 1

3 0 15 = 1

3 0 07 5, 1 1

x1 ,

so we may write our integral as an iterated integral and evaluate: Z a Z a x1Z a x1 x2 Z a Z a x1 x1 dx3 dx2 dx1 = x1 (a x1 x2 ) dx2 dx1 0

0

0

0

=

Z

0

a

0

=

x1 )2

x1 (a

 1 a2 x21 2 2

2

dx1

2ax31 x4 + 1 3 4

a

= 0

a4 . 24

We can get the same result using the generalized Stokes’s theorem. Let X be the region in R4 bounded by the three-dimensional manifolds given by equations x1 = 0, x2 = 0, x3 = 0, x4 = 0, and x1 + x2 + x3 + x4 = a. Also let @X1 , . . . , @X5 be the portions of these manifolds that bound X, respectively. We assume all are given orientations so that the projection of @X5 onto (x1 , x2 , x3 )-coordinate space is orientation preserving. We may now apply the generalized Stokes’s theorem (Theorem 6.10.2), which tells us that Z 5 Z X x1 dx2 ^ dx3 ^ dx4 = d(x1 dx2 ^ dx3 ^ dx4 ) i=1

@Xi

X

=

Z

X

dx1 ^ dx2 ^ x3 ^ x4 .

The integrals over the first four portions, @X1 , . . . , @X4 , contribute nothing to the sum; the form x1 dx2 ^ dx3 ^ dx4 is uniformly zero over these manifolds. We will spell it out for the first, @X1 .

250

Solutions for Chapter 6

Let ' : R3 ! R4 be the map

0 1 1 0 x2 x B C ' : @ x3 A 7! @ 2 A ; x3 x4 x4 0

this mapping parametrizes the boundary of X1 . The function x1 is 0 on this manifold, so the form evaluates to zero everywhere: 2 3 1 0 0 ! ! ! x1 dx2 ^ dx3 ^ dx4 (Dx2 ', Dx3 ', Dx4 ') = 0 det 4 0 1 0 5 = 0. 0 0 1

Similarly, the boundaries of X2 , X3 , and X4 contribute nothing to the integral. We now have Z Z x1 dx2 ^ dx3 ^ dx4 = dx1 ^ dx2 ^ x3 ^ x4 , @X5

X

and our problem has been reduced to calculating an integral over the region 80 1 9 x1 > > < = B x2 C X = @ A x1 , x2 , x3 , x4 0, x1 + x2 + x3 + x4  a . > > : x3 ; x4 To compute this integral, we first determine the orientation of X. All 4forms in R4 are of the form f (x) dx1 ^dx2 ^x3 ^x4 = f (x) det. Assume X is 2 3 1 6 ~ = 417 oriented by sgn k det, for some constant k 6= 0. Since the vector N 5 1 1 points out of X on @X5 , Definition 6.6.21 tells us that @X5 is oriented by ⇣ ⌘ ~ , ~v1 , ~v2 , ~v3 . sgn k det N Since we wish the map

✓ ! ! ~ , Dx , Dx k det N 1 2

to preserve orientation, 2 1 ◆ ! 61 , Dx3 = k det 4 1 1

we require that 3 1 0 0 0 1 07 5= 0 0 1 1 1 1

4k

be positive. This is satisfied if k = 1, so that sgn det orients X. Now consider the identity parameterization of X. The partial derivatives with respect to x1 , . . . , x4 are ~e1 , . . . , ~e4 . Since det evaluated on these vectors gives 1, the identity parameterization is orientation reversing. So Z Z dx1 ^ dx2 ^ x3 ^ x4 = |d4 x|, [I(X)]

X

Solutions for Chapter 6

Z aZ 0

0

a x1Z a x1 x2Z a 0

0

251

where I is the identity map on X. As an iterated integral this is Z a Z a x1Z a x1 x2 x1 x2 x3 dx4 dx3 dx2 dx1 = (a x1 x2 x3 ) dx3 dx2 dx1 0

=

0

Z aZ 0

=

Z

0

a x1

0

a

(a

0

(a

x1 2

x1 )3 dx1 = 6

x2 )2

dx2 dx1

a4 , 24

the same result as before. 6.10.5 First we will do this using Stokes’s theorem. Set ! = x dy ^ dz + y dz ^ dx + z dx ^ dy, and call A the octant of solid ellipsoid given by 0  x, y, z

and

x2 y2 z2 + 2 + 2  1. 2 a b c

Observe that d! = 3 dx ^ dy ^ dz. Next, observe that the integral of x dy ^ dz + y dz ^ dx + z dx ^ dy over any subset of the coordinate planes vanishes. For instance, on the (x, y)-plane, only the term in dx ^ dy could contribute, and it doesn’t, since z = 0 there. R R So S ! = A dx ^ dy ^ dz. The octant A is the image 2 of the3first octant in the unit sphere under a 0 0 the linear transformation 4 0 b 0 5, with determinant abc. So 0 0 c ✓ ◆✓ ◆ 1 4⇡ abc ⇡ vol3 (A) = abc = . 8 3 6 Z abc ⇡ Finally, ! = 3 vol3 (A) = . 2 S Now we repeat the exercise using the parametrization 0 1 a cos ' cos ✓ ⇣ ⌘ ✓ @ A with 0  ✓  ⇡ , 0  '  ⇡ . ' = b cos ' sin ✓ 2 2 c sin ' This parametrization is orientation preserving, and we have 2 3 a cos ' sin ✓ a sin ' cos ✓ h ⇣ ⌘i ✓ D ' = 4 b cos ' cos ✓ b sin ' sin ✓ 5 . 0 c cos '

252

Solutions for Chapter 6

Thus Z

S

=

x dy ^ dz + y dz ^ dx + z dx ^ dy = Z

⇡ 2

0

Z

⇡ 2

a cos ' cos ✓ det

0

+ b cos ' sin ✓ det + c sin ' det Z

Z







b cos ' cos ✓ 0

b sin ' sin ✓ c cos '

0 a cos ' sin ✓

c cos ' a sin ' cos ✓

a cos ' sin ✓ b cos ' cos ✓

a sin ' cos ✓ b sin ' sin ✓

!

d✓ d'.

⇣ abc cos3 ' cos2 ✓ + abc cos3 ' sin2 ✓ + abc sin2 ' cos ' sin2 ✓ 0 0 ⌘ + abc sin2 ' cos ' cos2 ✓ d' d✓ Z ⇡2 Z ⇡2 = abc cos3 ' + sin2 ' cos ' d' d✓ =

⇡ 2

⇡ 2

0

= abc Solution 6.10.7: Recall that the “bump” function R : Rn ! R is given by ( 2 2 e 1/(R x| ) if |x|2  R2 0

2

2

if |x| > R .

⇡ 2

Z

0

⇡ 2

cos ' d' = abc

0

i ⇡2 ⇡h ⇡ sin ' = abc . 2 2 0

6.10.7 This is an application of the principle that “exponentials win over polynomials”: suppose that p is a polynomial of some degree m, then ex = 1 or, taking inverses, x!1 |p(x)| lim

lim p(x)e

x

x!1

= 0.

To see this, note that the power series for ex contains a term xm+1 /(m + 1)! and all the other terms are positive, and |p(x)|  Cxm for an appropriate C and x sufficiently large, so ex x!1 |p(x)|

xm+1 = 1. x!1 C(m + 1)!xm

lim

lim

Restating this one more time, for any poynomials p and q we have lim

u&0

p(u) e q(u)

1/u

= 0.

The function e

1/(R2 |x|2 )

,

for |x| < R

is the composition of g : x 7! R2 |x|2 and the function u 7! e map g is C 1 , so the issue is to show that the map ⇢ 1/u e if u > 0 F : u 7! 0 if u  0 is C 1 , or equivalently if for all k the right-derivatives satisfy lim F (k) (u) = 0,

u&0

1/u

. The

Solutions for Chapter 6

253

to fit with the left-derivatives, which are obviously all 0. We have 1 F (u) = 2 e u 0

1/u

✓

00

, F (u) =

2 1 + 4 u3 u

◆

e

1/u

,...,

and it is not hard to see that all derivatives are of this form, hence indeed limu&0 f (k) (u) = 0. 6.10.9 a. Let A be an n ⇥ k matrix, and consider the map Mat (n, k) ! R given by B 7! det A> B. Viewed as a function of the columns of B, this mapping is k-linear and alternating. It is alternating because exchanging two columns of B exchanges the same two columns of A> B, and hence changes the sign of det A> B. To see that it is multilinear, write h B 0 = ~b1 , . . . , ~bi

~0 ~ ~ 1 , bi , bi+1 , . . . , bk

i ,

h B 00 = ~b1 , . . . , ~bi

~ 00 ~ ~ 1 , bi , bi+1 , . . . , bk

i ,

and h B = ~b1 , . . . , ~bi

~0 1 , ↵bi

i + ~b0i , ~bi+1 , . . . , ~bk .

Then A> B is obtained from A> B 0 and A> B 00 by the same process of keeping all the columns except the ith, and replacing its ith column by ↵ · the ith column of A> B 0 + so that det A> B = ↵ det A> B 0 +

· the ith column of A> B 00 ,

det A> B 00 .

b. By part a, B 7! det A> B defines an element of ' 2 Ak (Rn ). By Theorem 6.1.8, it is of the form '=

X

1i1 ai1 ,1 . . . ai1 ,k 6 .. 7 . .. = det 4 ... . . 5 aik ,1

...

aik ,k

254

Solutions for Chapter 6

c. Thus '(~a1 , . . . , ~ak ) = det A> A =

X

1i1 1. Then SR oriented by the outwardpointing normal and the unit sphere S1n 1 with the orientation given by the inward-pointing normal form the oriented boundary of the region def

B1,R = { ~x | 1  |~x|  R } . So by Stokes’s theorem (giving S1n 1 the orientation of the outward-pointing normal, hence the minus sign), we have Z Z Z d F~n = 0, ~n ~n = F F n SR

If the word flux means what it supposed to mean, the fact that ~n through the unit the flux of F sphere is the volume of the unit sphere should be obvious.

1

S1n

1

B1,R

by equation 6.7.21. When R < 1, the same proof works: S1n 1 oriented by the outwardn 1 pointing normal and SR oriented by the inward-pointing normal form the oriented boundary of def

In all cases

R

n SR

since for any x 2

1

S1n 1

BR,1 = { ~x | R  |~x|  1 } . R ~n = S n 1 F ~n , and the latter gives voln F 1

and any

~v1 , . . . , ~vn

1

2 T~x S1n

1

,

1

S1n

1

,

280

Solutions for review exercises, Chapter 6

the vector ~x is orthogonal to all the ~vi , so v1 , . . . , ~vn 1 ) ~n P~ x (~ F To see how we got from line 2 to line 3 in equation 1, look at equation 5.1.9 giving the product T > T . The entry in the first row, first column is |~x| = 1, and all other entries of the first row and first column are 0. Solution 6.33: For the notation {u}P see the margin note next to Proposition and Definition 2.6.20.

= det[~x, ~v1 , . . . , ~vn 1 ] q = det[~x, ~v1 , . . . , ~vn 1 ]> [~x, ~v1 , . . . , ~vn q = det[~v1 , . . . , ~vn 1 ]> [~v1 , . . . , ~vn 1 ] = voln

1

P~x (~v1 , . . . , ~vn

1]

(1)

1 ).

6.33 Choosing one basis {u} of a real vector space and declaring it direct orients the vector space, according to the rule ⌦({u0 }) = sgn det P

if {u0 } = {u}P.

Thus choosing one direct basis {v} for V (which is oriented, so this is meaningful) and one direct basis {w} for W , and declaring that the basis ({v}, {w}) of V ⇥ W is direct orients V ⇥ W . The thing we need to check is that if we had chosen some other direct bases {v0 }, {w0 } for V and W we would get the same orientation of V ⇥ W . Since {v0 } and {w0 } are direct, the matrices P, Q such that {v0 } = {v}P and {w0 } = {w}Q both have positive determinant. Then  P 0 [{v0 }, {w0 }] = [{v}, {w}] 0 Q  P 0 and det = det P det Q = 1. Thus {v0 }, {w0 } is indeed direct. 0 Q

Solutions for the Appendix

By “mth digit” we mean the mth digit to the right of the decimal point.

A1.1 Suppose x and y are k-close for all k and that x < y; moreover we can suppose that x and y are both positive. This means that for all k, either [x] k = [y] k or [y] k = [x] k + 10 k . Since x 6= y, there must be a first digit in which they di↵er, say the mth. Saying that they are m-close means that the mth digit of x is one less than the mth digit of y; in particular, the mth digit of x is not a 9. What does saying that they are (m + 1)-close tell us? Adding 10 (m+1) to [x] (m+1) can only produce a carry in the mth position if the (m + 1)st digit of x is a 9 and the (m + 1)st digit of y is a 0, and in that case they are (m + 1)-close. Saying that they are (m + 2)-close says that the (m + 2)nd digit of x and y are respectively 9 and 0 as well. Continuing this way, we see that all the digits of x after the mth are 9’s, and all the digits of y after the mth are 0’s; moreover, the mth digit of x is one less than the mth digit of y. That is exactly the condition for x and y to be equal. A1.3 a. Let a and b be positive finite decimals. We want to consider a/b, as computed by long division, so we may assume (by multiplying both by an appropriate power of 10) that they are both integers. When writing the digits of the quotient after the decimal point, the next digit depends only on the current remainder, which is a number < b. Thus the same integer must appear twice, and the digits of the quotient between two such appearances will then be repeated indefinitely. Suppose we carry out long division to k digits after the decimal point, obtaining a quotient qk and a remainder rk . These numbers then satisfy a = qk b + 10

k

rk .

By equation A1.5, b(a/b) is sup inf b([a/b]l ) = sup inf bql = sup bqk . k

A remainder is always smaller than the divisor, so rk < b.

l k

k

l k

k

where the inf was dropped since the sequence k 7! bqk is increasing, so the inf is the first element. But we just saw that a qk b = 10 k rk < 10 k b. Thus a is the least upper bound of the qk b: it is an upper bound since 10 k rk 0, and it is a least upper bound since 10 k rk will be arbitrarily small for k large. b. Suppose x 2 R satisfies x > 0; then x > 10 p for some p, since it must have a first nonzero digit. Define inv x = inf (1/[x]k ). k>p

We need to show that x inv(x) = 1. By equation A1.5, this product of real numbers means x inv(x) = sup inf [x]l [inv(1/x)]l = sup[x]k [inv(1/x)]k , k

l>k

k

281

282

Solutions for the Appendix

since again the sequence l 7! [x]l [inv(1/x)]l is increasing. So we must show that sup[x]k [inv(1/x)]k = 1. k

n

Suppose x < 10 , and choose an integer p. Since inv(x) = inf k 1/[x]k , there exists q such that [inv(x)]k 1/[x]k ]  10 (p+n) when k > q. Then [x]k [inv(1/x)]k

1 = [x]k [inv(1/x)]k = [x]k [inv(1/x)]k

[x]k (1/[x]k ) (1/[x]k )  x10

(p+n)

 10

p

,

when k > q. Thus supk [x]k [inv(1/x)]k is p-close to 1 for every p 2 N, which means that it is 1. c. If x < 0, define inv(x) = inv( x). We need to check that for x < 0, we have x inv(x) = 1, and there is nothing to it: We are here assuming the rule of signs that ( a)b = a( b) =

(ab);

how do we know that? Since ( a)b and a( b) are D-continuous functions, and they agree on the finite decimals, they agree everywhere.

x inv(x) = x

inv( x) = ( x) inv ( x) = 1, | {z } | {z } + since x 0, so the equation has exactly one real root, and it is simple. Solve the equation to find !1/3 p 65 ± 3969 u= . 54 u6

284

Solutions for the Appendix

This gives two expressions for x: !1/3 ! p p 65 + 3969 4 65 + 3969 + 54 9 54

1/3

+

1 = 2 (surprise!) 3

and 65

!1/3 p 3969 4 + 54 9

65

! p 3969 54

1/3

+

1 = 2. (another surprise!) 3

It certainly isn’t obvious that those funny expressions are equal to 2, or to each other, but it is true (try them on your calculator). This is “explained” by the sentence following equation A2.13 and by Exercise A2.2. The complex roots can be obtained from the complex cube roots: ! ! 1/3 p p ! p p ! p 1/3 65 + 3969 1± 3 4 65 + 3969 1⌥ 3 1 1+i 3 + + = . 54 2 9 54 2 3 2 We do not claim that this result is obvious, but a calculator will easily confirm it, and it could also be obtained by long division: x3

x2 x x 2

2

= x2 + x + 1,

whose roots are indeed the nonreal cubic roots of 1. ⇣ y

a ⌘3 ⇣ +a y 3

A2.3 We have ⌘ ⇣ a 2 ⇣ a⌘ +b y +c = y 3 3

2ay 2 a2 y + 3 9 2 2 a y 2a y = y3 + + 9 9 ✓ ◆ 2 a 3 =y +y b + 3 = y3

✓ a⌘ 2 y 3

◆ 2ay a2 + +ay 2 3 9

ay 2 2a2 y a3 2a2 y a3 + + ay 2 + + by 3 9 27 3 9 a3 2a2 y a3 ab + + by +c 27 3 9 3 2a3 ab + c. 27 3

2a2 y a3 + + by 3 9

ab +c 3

ab +c 3

A2.5 Here the appropriate substitution is x = u + 7/(3u), which leads to ✓ ◆3 ✓ ◆ 7 7 u+ 7 u+ + 6 = 0. 3u 3u As above, after multiplying out, simplifying, multiplying through by u3 , and setting v = u3 , you find the quadratic equation v 2 + 6v +

343 = 0. 27 p

This time, the roots of the quadratic are complex: v1 = 3 + i 109 3 and p v2 = 3 i 109 3 . One approach (usually the only one) is to pass to polar

Solutions for the Appendix

285

coordinates; in this case, you might “just happen to observe” that the cube p 10 3 roots of 3 + i 9 are p p p 2 3 3 3 1 5 3 u1 = 1 + i u2 = +i and u3 = i . 3 2 6 2 6 These lead to u1 +

7 = 2, 3u1

u2 +

7 = 3u2

3 and u3 +

Indeed, it is easy to check that (x + 3)(x

2) = x3

1)(x

A2.7 a. This is a straightforward computation: ⇣ ⇣ ⇣ ⇣ a ⌘4 a ⌘3 a ⌘2 x +a x +b x +c x 4 4 4

=

x4

ax3

+ 38 a2 x2

+ax3

3 2 2 4a x 2

1 3 16 a x 3 3 + 16 a x 1 2 abx

+bx

7 = 1. 3u3 7x + 6.

a⌘ +d 4

1 4 + 256 a

1 4 64 a 1 + 16 ba2 1 4 ac

+cx

+d =

4

x

+(

3 2 8a

2

+ b)x

3 +( a8

ab 2

+ c)x

+(

3a4 256

+

ba2 16

ac 4

+ d)

So we see that p= q= r=

3 2 a +b 8 a3 ab +c 8 2 3a4 ba2 ac + + d. 256 16 4

b. The equation x2 y + p/2 = 0 is the definition of y, and if you substitute it into the second, you get ⇣ p ⌘2 x2 + + qx + r 2

p2 = x4 + px2 + qx + r = 0. 4

c. When m = 1, the curve is a circle; when m < 0, it is a hyperbola; when m > 0, it is an ellipse. 2

d. It is exactly the curve y 2 + qx + r p4 = 0, which corresponds to m = 1. ⇣ ⌘ e. and f. If the equation fm x y = 0 is a union of two lines, then at ⇣ ⌘ x the point y where the lines intersect it cannot represent x implicitly as

a function of y or y implicitly as a function of x. (We discuss this sort of thing in detail in Section 2.10.) It follows that both partial derivatives of

286

Solutions for the Appendix

⇣ ⌘ x satisfy the equations y ⇣ 2 p p⌘ + m x2 y + =0 4 2 2y m = 0

fm must vanish at that point, so m and ⇣ ⌘ 2 fm x y = y + qx + r

q + 2mx = 0. This gives m q and x = , 2 2m and substituting these values into the first equation, multiplying through by 4m, and collecting terms gives y=

m3

2pm2 + (p2

4r)m + q 2 = 0. 2

“The ratios . . . must be equal”: If ↵1 x2 +

1x

+

1

= A(↵2 x2 +

2x

then ↵1 = ↵2

1 2

=

1 2

.

+

2 ),

g. The two functions y 2 + qx + r p4 and x2 y + p2 , when restricted to a diagonal lk , are two quadratic functions that vanish at the same two points. But any quadratic function of t that vanishes at two points a and b is of the form A(t a)(t b). In other words, any two such functions are multiples of each other. Thus we have ⇣ p2 p⌘ 2 k(x x1 ) + y1 + qx + r = A x2 k(x x1 ) y1 + 4 2 for some number A. It follows that the ratios of the coefficients of x2 , x, and constants must be equal, which leads to the equations given in the exercise. Use the first and second, and the first and third, to express k3 as a quadratic polynomial in k, and equate these polynomials to get ⇣ p⌘ p2 k2 x21 y1 + = y12 + qx1 + r. 2 4 h. The two parabolas to intersect are y = x2 represented in the figure below.

2 and x = 3

y2 ,

slope -k1 x3 y3

-5.0

slope k3 slope k2 -5.0

5.0

slope k1

x1 y1

slope -k3 x2 y2 slope -k2

Figure for solution A2.7, part h. The parabolas to intersect in order to solve x4

4x2 + x + 1 = 0.

Solutions for the Appendix

287

The resolvent cubic is m3 + 8m2 + 12m + 1 = 0. If we set m = t to eliminate the square term, we find the equation

8/3

28 187 t+ = t3 + ↵t + = 0. 3 27 This is best solved by the trigonometric formula of Exercise A2.6, which gives !! r r 4↵ 1 3 3 t= cos arccos . 3 3 4↵ ↵ t3

With our values of ↵ and , this gives the roots t1 ⇡ 2.57816998074169, Since m = t m1 ⇡ ⇣

x1 y1

⌘

⇡

✓

0.08849668592498,

t2 ⇡

3.37429792821080,

t3 ⇡ 0.79612794746911.

8/3, this gives m2 ⇡

6.04096459487747,

m3 ⇡

1.87053871919756.

The corresponding points are ◆ ⇣ ⌘ ✓ ◆ ⇣ ⌘ ✓ ◆ 5.64992908800994 0.08276823877167 0.26730267321838 x x 2 3 , y ⇡ , y ⇡ . 0.04424834296249 2 3.02048229743873 3 0.93526935959878 If we insert these values into the equation for k, we find

k1 ⇡ 0.29748392549006,

k2 ⇡ 2.45783738169910,

k3 ⇡ 1.36767639418013.

To find the roots, we must intersect the lines given by the equations yi = ±ki (x xi ), two at a time, of course. For instance, if we intersect y y1 = k1 (x x1 ) and y y2 = k2 (x x2 ), we find the point ✓ ◆ ✓ ◆ X1 1.76401492519458 ⇡ . Y1 1.11174865630925 y

If we intersect y ✓

y1 = k1 (x x1 ) and y y3 = k3 (x ◆ ✓ ◆ X2 2.06149885068464 ⇡ . Y2 2.24977751137410

✓

y1 = k1 (x x1 ) and y y3 = k3 (x ◆ ✓ ◆ X3 0.39633853101445 ⇡ . Y3 1.84291576883330

If we intersect y

x3 ), we find

x3 ), we find

Finally, if we intersect y y1 = k1 (x x1 ) and y y2 = k2 (x find ✓ ◆ ✓ ◆ X4 0.69382245650451 ⇡ . Y4 1.51861039885004

x2 ), we

The points are labeled by the quadrant they are in. Indeed, the four numbers X1 , X2 , X3 , X4 are (approximations to) the roots of the original polynomial.

288

Solutions for the Appendix

Now let us solve the equation x4 + 4x3 + x 1 = 0. If we set x = y this becomes y 4 6y 2 + 9y 5 = 0, with resolvent cubic

1,

m3 + 12m2 + 56m + 81 = 0. To solve the resolvent cubic, we need to eliminate the term in m2 , by setting m = t 4, giving the equation t3 + 8t + 15 = t3 + P t + Q = 0. Then set t = u P/(3u) and v = u3 ; as in equation A2.11, this leads to the quadratic equation v 2 + Qv

P3 = 0. 27

(1)

One root of this equation is v ⇡ 16.17254074438183, and the three cube roots of v are u1 ⇡ 2.52886755571821, u2 ⇡ u3 ⇡

1.26443377785910 + 2.19006354605823 i, 1.26443377785910

2.19006354605823 i.

(If we choose the other root of equation 1, we would get di↵erent values of ui but the same values of ti ; see the text immediately following equation A2.13). These now give t1 ⇡ 1.47437711368740, t2 ⇡ t3 ⇡

0.73718855684370 + 3.10327905690479i, 0.73718855684370

and finally, using m = t

4, we get

m1 ⇡

2.52562288631260,

m3 ⇡

4.73718855684370

m2 ⇡

3.10327905690479i,

4.73718855684370 + 3.10327905690479 i, 3.10327905690479 i.

We have now solved the resolvent cubic, and can give the points of intersection of the diagonals of the quadrilateral formed by the roots: ✓ ◆ ✓ ◆ x1 1.78173868489526 ⇡ , y1 1.26281144315630 ✓ ◆ ✓ ◆ x2 0.66468621310792 + 0.43542847826296i ⇡ , y2 2.36859427842185 + 1.55163952845240i ✓ ◆ ✓ ◆ x3 0.66468621310792 0.43542847826296i ⇡ . y3 2.36859427842185 1.55163952845240i

Solutions for the Appendix

289

We can now compute the slopes of two of the pairs of diagonals, given by k1,1 ⇡ 1.58922084252397, k1,2 =

k1,1 ,

k2,1 ⇡ 2.28038824159147 k2,2 =

0.68042778863371 i,

k2,1 .

We are finally in a position to intersect the pairs of diagonals; the xcoordinate of each point of intersection is given by the formula k1,i x1

k2,j x2 y1 + y2 , k1,i k2,j

for appropriate values of i and j;

the four values given by i = 1, 2, j = 1, 2 give the four roots of the equation y4

6y 2 + 9y

5 = 0.

They are 0.79461042126199 + 0.68042778863371i, 0.79461042126199

0.68042778863371i,

1.48577782032949, 3.07499866285346, and finally, the roots of the original equation x4 + 4x3 + x are found from x = y

1=0

1 to be

0.20538957873801 + 0.68042778863371i, 0.20538957873801

0.68042778863371i,

0.48577782032949, 4.07499866285346. A5.1 First note that f (u + h)

f (u)

[Df (u)]h = 

Z 1⇣ [Df (u + th)]h

Z

0 1

0

M |th|↵ |h| dt =

⌘ [Df (u)]h dt

M |h|↵+1 . ↵+1

To show that [Df (a1 )] is invertible, we will use the following lemma: Lemma 1. If P and Q are n ⇥ n matrices with Q invertible, and |I

PQ

1

|  k < 1,

then P is invertible, and |P

1

|

1 1

k

|Q

1

|.

290

Solutions for the Appendix

Proof. Set X = I P Q 1 ; then I series. Thus P 1 exists, and 1

P

1

=Q

X = PQ

1

is invertible by geometric

(I + X + X 2 + · · · ),

so |P

1

|  |Q =

The quantity on the third line is  k by hypothesis.

1

1

| + |Q

X| + · · ·  |Q

|Q 1 | 1  |Q 1 |X| 1 k

1

1

|(1 + |X| + · · · )

|. ⇤

To apply this to show that [Df (a1 )] is invertible, note that ⇣ ⌘ I [Df (a1 )][Df (a0 )] 1 = [Df (a0 )] [Df (a1 )] [Df (a0 )]  M |h0 |↵ [Df (a0 )]  M [Df (a0 )]

1

1

1

f (a0 )

↵

[Df (a0 )]

1

 k. k

Moreover, note that k as defined in the exercise is necessarily strictly less than 1; in fact, it is less than 1/2; see the figure in the margin. So Lemma 1 says that [Df (a1 )] 1 exists and

.5

1

[Df (a1 )]



1 1

k

[Df (a0 )]

1

.

It follows that we can define ↵ 1

0 0 The equation ✓ ◆↵ k =1 (↵ + 1)(1 k)

k

represents k implicitly as a function of ↵ for 0  ↵  1 (and in fact for all ↵ 0). By implicit di↵erentiation, one can show that this function is increasing; more specifically, it increases from 0 to 1/2 as ↵ increases from 0 to 1.

h1 =

1

[Df (a1 )]

f (a1 ) and a2 = a1 + h1 .

We need three estimates for the size of f (a1 ): |f (a1 )| = f (a1 ) f (a0 ) [Df (a0 )]h0 8 M > > |h |↵+1 > >↵+1 0 > > > > < ↵+1 M  [Df (a0 1 )]f (a0 ) > ↵+1 > > > > > > > : M [Df (a 1 )]f (a0 ) ↵ |h0 |, 0 ↵+1 where the three lines on the right after the inequality are all equal. We find the size of h1 : |h1 | = [Df (a1 )]

1 1

|f (a1 )|

[Df (a0 )] M ↵ [Df (a0 )] 1 f (a0 ) |h0 | 1 k ↵+1 k |h0 |  . 1 k ↵+1 Next we show that the modified Kantorovich inequality 

[D~f (a0 )]

1 ↵+1

|~f (a0 )|↵ M  k

Solutions for the Appendix

291

holds for a1 : [Df (a1 )]

1 1+↵

✓

|[Df (a0 )] 1 k

1

|

◆1+↵ ✓

M |f (a1 )| M  [Df (a0 )] 1 f (a0 ) ↵+1 ✓ ◆1+↵ ✓ ◆↵ |[Df (a0 )] 1 | k|f (a0 )|  M 1 k ↵+1 ✓ ◆1+↵ ✓ ◆↵ 1 k  k 1 k ↵+1 ✓ ◆↵ 1 k 1 k  k k = k. 1 k (↵ + 1)(1 k) 1 k ↵

↵+1

◆↵

M

So we have shown that the method may be iterated; that

so lim aj = a0 +

P

|hm+1 | 

k |hm |, k)(↵ + 1)

(1

hj converges to a point a; and that M |hm |↵+1 , ↵+1

|f (am+1 )| 

so f (a) = 0. Finally, we show the rate of convergence: m ⇣ X 1 1 [Df (am+1 )][Df (a0 )] = [Df (aj )]

⌘ [Df (aj+1 )] [Df (a0 )]

j=0

so 1

1

[Df (am+1 )][Df (a0 )]

R

 =

m X j=0

M |hj |↵ [Df (a0 )]

M [Df (a0 )] 1

b. R1

.



b1

1

So, again by Lemma 1, [Df (am+1 )] Figure for solution A8.1. Since R1 is the largest number such that g and g1 coincide on BR1 (b), and |b1 b| = R1 , we have g1 (b1 ) = g(b1 ), but any neighborhood of b1 includes points in BR (b) BR1 (b), where g and g1 do not coincide.

1



⇣

k

1

⇣

f (a0 )

↵

1



1

⌘↵
b, you have a contradiction: k divides k!a evenly, of course, but it divides neither b nor m. Thus e is not rational. A12.3 a. Using Theorem A12.1 (and equation 3.4.6), and setting k = 2, we get Z 1 h sin h = h + (h t)2 ( cos t) dt 2! 0 (where

Solution A12.3, part b: The first inequality in the third line uses 0  (x

y)2 = x2

2xy + y 2 ,

hence xy  (x2 + y 2 )/2. The second inequality uses our assumption that x2 + y 2  1/4.

cos t is the third derivative of sin t). Thus Z 1 xy sin xy = xy + (xy t)2 ( cos t) dt. 2! 0

b. Using part a, the bound is Z Z 1 xy 1 xy (xy t)2 cos t dt  (xy t)2 dt 2! 0 2! 0  xy 1 (xy t)3 1 = = (xy)3 2! 3 6 0 ✓ 2 ◆ ✓ ◆3 2 3 1 x +y 1 1   . 6 2 6 8 A12.5 a. Clearly sgn y is di↵erentiable when y 6= 0. The chain rule guarantees that f is continuously di↵erentiable unless y = 0 or the quantity p under the square root is  0. But x + x2 + y 2 0 and it only vanishes if y = 0 and x 0. So to show that f is continuously di↵erentiable on the complement of the half-line y = 0, x  0, we need to show that f is continuously di↵erentiable on the open half-line where y = 0 and x > 0. In a neighborhood of a

296

Solutions for the Appendix

◆ x0 satisfying x0 > 0 and y0 = 0, we can write (using the Taylor y0 polynomial given in equation 3.4.9, with m = 1/2 and n = 1), point We can factor out the x to write r p y2 2 2 x+ x + y = x+x 1 + 2 x because x in a neighbor✓ is positive ◆ x0 hood of . y0

✓

p x + x2 + y 2 = =

r

x+x 1+

y2 = x2

y2 + o y2 . 2x

✓ ✓ 2 ◆◆ 1 y2 y x+x 1+ + o 2 2x x2

⇣ ⌘ p Since y 2 = (sgn y)y, we see that in a neighborhood of x00 with x0 > 0 we have ✓ ◆ ⇣ ⌘ y y x p + o(y) = p + o(y). f y = (sgn y)(sgn y) 2 x 2 x The function f vanishes identically on the positive x-axis, so we see that f is di↵erentiable on the positive x-axis, with h ⇣ ⌘i  1 p . Df x = 0 0 2 x

To see that f is of class C 1 on the positive x-axis, we need to show that the partial derivatives are continuous there. This is a straightforward but messy computation using the chain rule: if y 6= 0 we find p ⇣ ⌘ x x2 + y 2 x q D1 f y = sgn y p p 4 x2 + y 2 12 ( x + x2 + y 2 ) which tends to 0 when y tends to 0 since after cancellations there is a q p x x2 + y 2 left over. The partial derivative with respect to y is ⇣ ⌘ y y q D2 f x = sgn y (sgn y)y +o(y). p y = sgn y p 2 1 4x 2px 4 x + y 2 2 ( x + x2 + y 2 )

Here we are dealing with the case n = 2, k = 0, so Theorem A12.5 says that the remainder should be X 1 DI f (c) h~I I! 1

Again the sgn(y)’s cancel, the y’s cancel, one power of 2 cancels, as well p 1 as an x, leaving p . So f is continuously di↵erentiable on the open 2 x half-line y = 0 and x > 0, i.e., on the complement of the half-line y = 0, x  0. b. We have

f (a) =

s

1+

p 1 + ✏2 , 2

So

I2I2

= [D1 f (c) D2 f (c)]~ h.

f (a + ~h) which tends h⇣ to 2 as⌘ ✏ ⇣! 1 , i.e., any c 2 ✏

s

f (a + ~h) = + s

f (a) = 2

1+

1+

p 1 + ✏2 . 2

p 1 + ✏2 , 2

0,⌘i and cannot be [Df (c)]h for any c 2 [a, a + ~h], 1 , since [Df (c)] remains bounded. To see that ✏

Solutions for the Appendix

297

[Df (c)] remains bounded, first we compute 2 1 p 2x 2 h ⇣ ⌘i 2 + x +y x Df y = 4sgn y q , p x+ x2 +y 2 2 2

sgn y q 2

y

p

x2 +y 2

p

x+

x2 +y 2 2

3

5.

⇣ ⌘ For c = cc1 2 [a, a + ~h], we have c1 = 1, and the derivative does not 2 blow up because the denominators do not tend to 0 as c2 ! 0: s

p ( 1) + ( 1)2 + c22 2

!

as c2 !0

s

p 1+ 1 6= 0 2

c. Theorem A12.5 requires that f be of class C k+1 on [a, a + h]; in our case k = 0 and we need f of class C 1 on [a, a + ~h], which it isn’t: the line of discontinuity of f crosses [a, a + ~h], so we do not have [a, a + ~h] ⇢ U . ⇣ ⌘ ⇣ ⌘ ⇣ ⌘ x = x2 + y 2 1, so at 0 we A14.1 a. We have f x = y and F y 1 ⇣ y ⌘ 0 have = 1/2, and at 1 we have = 1/2. We have Lf,F

0 1 x @yA = y

(x2 + y 2

⇣ ⌘ ✓ x x = y x2 + y 2 h ⇣ ⌘i  1 0 0 D = , 1 2 2 The two functions

◆

h ⇣ ⌘i  1 0 D = 1 2

are ✓ ⇣ ⌘ x x p 1 u = u+1 1

and

1

1)

0 . 2

2

x2

◆

for

⇣ ⌘ 0 1

◆

for

⇣

and 2

⇣ ⌘ ✓ x p x u = u+1

x2

0 1

⌘

This gives ✓ ◆ ✓ ◆ x x f1 (x) and = p =g 1 0 1 x2 where

p g1 (x) = + 1

x2 ,

✓ ◆ ✓ ◆ x x p f2 (x), = =g 2 0 1 x2 g2 (x) =

p 1

x2 .

298

Solutions for the Appendix

Then ⇣ ⌘ x =f fe1 u

⇣ ⌘ x =F Fe1 u ⇣ ⌘ x =F Fe2 u and

⇣ ⌘ p p x x2 , fe2 = f u + 1 x2 1 u =+ u+1 2 = ✓ ◆ ⇣ ⌘ p x 2 x =F p 2 = x + u + 1 x2 1=u 1 u 2 u+1 x ✓ ◆ ⇣ ⌘ p x 2 x =F 2 p = x + u + 1 x2 1=u 2 u u + 1 x2

0 1 x ⇣ ⌘ x Lfe1 ,Fe1 @ u A = fe1 u 0 1 x ⇣ ⌘ x Lfe2 ,Fe2 @ u A = fe2 u

To find the Hessian matrix of Lfe1 ,Fe1 , it is easiest to develop the function as a Taylor polynomial

1 1 2 (u x2 ) u u 2 8 to degree 2 (rather than computing the second derivatives of Lfe1 ,Fe1 directly). 1+

p u= u+1 u=

p u+1

x2

⇣ ⌘! x 1 u

u = Lf,F

x2

2

u = Lf,F

⇣ ⌘! x u .

1 2 3 0 1 0 0 The Hessian matrix of Lfe1 ,Fe1 at @ 0 A is 4 0 1/4 1 5, giving 1/2 0 1 0 ✓ ◆ 2 B2 B A2 2BC = A2 + 2C + 4C 2 , 4 2 with signature (1, 2).

0

1 2 0 1 0 The Hessian matrix of Lfe2 ,Fe2 at @ 0 A is 4 0 1/4 1/2 0 1 ✓ ◆ 2 B2 B A2 + 2BC = A2 + 2C 4C 2 , 4 2

with signature (2, 1). The Hessian matrix of Lf,F ture (1, 2). The Hessian matrix of Lf,F (2, 1).

0

0

1 2 0 1 at @ 1 A is 4 0 1/2 0

0 1 2

0

3 0 1 5, giving 0

3 0 2 5, with signa0

1 2 3 0 1 0 0 at @ 1 A is 4 0 1 2 5, with signature 1/2 0 2 0

⇣ ⌘ 0 is a maximum, with signature (p = 0, q = 1), so 1 ⇣ ⌘ 0 is a minimum, with signature (p + 1, q + 1) = (1, 2). The point 1 The point

(p = 1, q = 0), so (p + 1, q + 1) = (2, 1). ⇣ ⌘ ⇣ ⌘ x = x2 + y 2 1. 2 b. We have f x = y ax and F y y The Lagrange multiplier equations for the critical points are [ 2ax 1] = [2x 2y] and x2 + y 2 = 1.

Solutions for the Appendix

299

The equation 2ax = 2 x leads to two possibilities, x = 0 or = first possibility leads to the same two critical points as in part a: 0 1 0 1 0 1 0 1 x 0 x 0 @ y A = @ 1 A and @ y A = @ 1 A . 1/2 1/2

a. The

(1)

These solutions exist for all values of a. In addition, when |a| > 1/2, there are two more solutions, coming from the equality = a: 0 1 0 p 1 0 1 0 p 1 x + 1 1/(4a2 ) x 1 1/(4a2 ) @yA = @ A and @ y A = @ A . (2) 1/2a 1/2a a a Let us begin by treating the solutions in (1). We have 0 1 x Lf,F @ y A = y ax2 (x2 + y 2 1) ⇣ ⌘ ✓ x x = y x2 + y 2 h ⇣ ⌘i  1 0 0 D = , 1 2 2

1

◆

h ⇣ ⌘i  1 0 D = 1 2

0 . 2

Equations 3–6 involve only the constraint and are identical with part a: the two functions 1 and 2 are ◆ ⇣ ⌘ ✓ ⇣ ⌘ x x = p for 01 (3) 1 u 2 u+1 x and

⇣ ⌘ ✓ x p x 2 u = u+1

This gives ✓ ◆ ✓ ◆ x x = p = ge1 (x) and 1 0 1 x2 where

Then ⇣ ⌘ x =f fe1 u ⇣ ⌘ x =f fe2 u

⇣ ⌘ x =F Fe1 u ⇣ ⌘ x =F Fe2 u

p g1 (x) = + 1

x2

◆

for

⇣

0 1

⌘

✓ ◆ ✓ ◆ x x p = = ge2 (x), 2 0 1 x2 x2 ,

g2 (x) =

p 1

x2 .

(4)

(5)

(6)

⇣ ⌘ p x 2 ax2 , u =+ u+1 x ⇣ ⌘ p x u + 1 x2 ax2 2 u = ✓ ◆ ⇣ ⌘ p x 2 x =F p 2 = x + u + 1 x2 1=u 1 u 2 u+1 x ✓ ◆ ⇣ ⌘ p x 2 x =F p = x2 + u + 1 x2 1=u 2 u 2 u+1 x 1

300

Solutions for the Appendix

and

0 1 x ⇣ ⌘ x Lfe1 ,Fe1 @ u A = fe1 u

u=

1

= Lf,F 0 1 x ⇣ ⌘ x Lfe2 ,Fe2 @ u A = fe2 u = Lf,F

To compute the second derivatives of Lfe1 ,Fe1

at the origin, it is easiest to develop the function as a Taylor polynomial 1 (u x2 ) 2 to degree 2. 1+

1 2 u 8

ax2

u

The Hessian matrix of Lfe1 ,Fe1 2 4

1

2a 0 0

(1 + 2a)A2

0 1/4 1 B2 4

x2

ax2

u

⇣ ⌘! x u u=

2

p u+1

p u+1

x2

ax2

u

⇣ ⌘! x u .

1 0 at @ 0 A is 1/2

3 0 15, 0

0

giving

2BC =

(1 + 2a)A2

✓

B + 2C 2

◆2

+ 4C 2 ,

with signature (1, 2) if a > 1/2, and signature (2, 1) if a < 1/2. For a = 1/2, the form is degenerate. 0 1 0 The Hessian matrix of Lfe2 ,Fe2 at @ 0 A is 1/2 2 3 1 2a 0 0 4 0 1/4 1 5 , giving 0 1 0 ✓ ◆2 2 B B 2 2 (1 2a)A + 2BC = (1 2a)A + 2C 4C 2 , 4 2 with signature (1, 2) if a > 1/2, and signature (2, 1) if a < 1/2. For a = 1/2, the form is degenerate. 0 1 2 3 0 1 2a 0 0 The Hessian matrix of Lf,F at @ 1 A is 4 0 1 2 5, again 1/2 0 2 0 with signature (1, 2) if a > 1/2, and signature (2, 1) if a < 1/2. For a = 1/2, the form is degenerate.0 1 2 3 0 1 2a 0 0 The Hessian matrix of Lf,F at @ 1 A is 4 0 1 2 5, again with 1/2 0 2 0 signature (1, 2) if a > 1/2, and signature (2, 1) if a < 1/2. For a = 1/2, the form is degenerate.

Solutions for the Appendix

301

Thus, for the critical points in (1), the signatures of ⇣ the⌘quadratic ⇣ forms ⌘ 0 0 , as given by the Hessians of Lf,F and Lfe,Fe agree at both 1 and 1 Theorem 3.7.13 demands. For a > 1/2, both points are maxima, and ⇣ ⌘for a < 1/2 both points are minima. For 1/2 < a < 1/2, the point 01 is ⇣ ⌘ 0 is a minimum. a maximum and the point 1 Now we will consider the critical points 1 1 0 1 0 q 0 1 0 q x x + 1 4a12 1 4a12 C C @yA = B @yA = B @ @ 1/2a A and 1/2a A a a of equation 2, which exist only if |a| 1/2. 0 >q + 1 4a12 B The Hessian matrix of Lf,F at @ 1/2a a 2 2 3 0 2a 2 0 2x 6 4 0 2 2y 5 = 6 4 q 0 1 2x 2y 0 ⌥2 1 4a2

r

2aB 2 ⌥ 4 1

If the square root on the left is the positive square root, then that on the right is the negative square root, and vice versa: the top sign in ⌥ goes with the top sign in ±.

1

C A is

0 2a

q ⌥2 1 1/a

1/a

The corresponding quadratic form ✓ ◆ 1 2 1 1 2 2 AC + BC = 2a B + BC + C 4a2 a a2 4a4 r ✓ 1 1 2 3 6 C ± 2a 1 AC + a 1 2a3 4a2 ✓ ◆ a3 1 + 1 A2 2 4a2

1 4a2

0

1 4a2

◆

3

7 7. 5

2

A

!

has signature (2, 1) if a > 1/2 and signature (1, 2) if a < 1/2. Thus both of the new critical points are minima if a > 1/2 and maxima if a < 1/2. Now the exercise requires us to find the signature of Lfe,Fe . If a > 1/2, then for both critical points, we have ✓ ◆ ✓ ◆ x x p = , u 1 + u x2 since in that case y = 1/2a < 0. Then ⇣ ⌘ ⇣ ⌘ p x =f x fe u u + 1 x2 ax2 , u = so 0 1 x ⇣ ⌘ p x Lfe,Fe @ u A = fe2 u u= u + 1 x2 ax2

u,

and we need to compute its Hessian at the two critical points; in this case it is easier to compute the Taylor polynomial to degree 2 at the critical

302

v u u tB + 1 =

=

s

Solutions for the Appendix

point, checking along the way that the terms of degree 1 drop out as they should.8 Thus set r 1 x= 1 + A, u = 0 + B, = a + C, 4a2 to find !2 !2 r r 1 1 1 +A a 1 +A ( a + C)B) 4a2 4a2 r

1 +B 4a2 0 1 @ 1 1+ 2a 2

1 4a2

2A 1

2

4a B

2

r

A2

8a A 1

! r 1 1 2 a 1 + 2A 1 + A + aB BC 4a2 4a2 ! !2 1 r 1 1 1 A 4a2 A2 4a2 B 8a2 A 1 4a2 8 4a2

r 1 1 a+ 2aA 1 aA2 + aB BC + o(A2 + B 2 + C 2 ). 4a 4a2 Indeed, the linear terms in A and B do vanish, and the quadratic terms are (after a bit of computation) p a(4a2 1)A2 + a3 B 2 2a2 4a2 1AB + a3 B 2 BC ✓ ◆2 a 1 1 = a(4a1 1) A p (B + C)2 + (B C)2 . 4 4 4a2 1 Recall that a > 1/2, so this quadratic form has signature (2, q 1), representing a minimum, as was required. Note that for the point 1 + 4a12 the computation is the same, just interpreting the square root throughout as the negative square root. In the case a < 1/2, we need the other possible , namely ✓ ◆ ✓ ◆ x x p = , since y = 1/2a > 0. u 1 + u x2 This leads to

0 1 x p Lfe,Fe @ u A = + u + 1

x2

ax2

u.

Rather surprisingly, this leads to the Taylor polynomial ! !2 1 r r 1 @ 1 1 1 1 A 1+ 4a2 B 8a2 A 1 4a2 A2 4a2 B 8a2 A 1 2a 2 4a2 8 4a2 r 1 1 a+ 2aA 1 aA2 + aB BC + o(A2 + B 2 + C 2 ); 4a 4a2 0

8

In the earlier part of part b we did not use Taylor polynomials, but it probably would have been easier to do so.

Solutions for the Appendix

303

the pminus sign in front of the first term is still there. Indeed, the square root + u + 1 x2 is the positive one, and since a < 1/2 (and in particular, a < 0), we must take the negative square root of 4a2 when factoring it out of the square root. Thus the quadratic form is again ✓ ◆2 a 1 1 a(4a2 1) A p (B + C)2 + (B C)2 , 4 4 4a2 1 which this time has signature (1, 2), since a < Theorem 3.7.13.

1/2. Thus again we confirm

A15.1 By Definition 3.9.7, H = (1/2)(A2,0 + A0,2 ). Equation 3.9.36 gives us expressions for A2,0 and A0,2 . It is now straightforward to show p that H is given by the desired expression. Remember we have set c = a21 + a22 , so that a21 + a22 = c2 . Thus we have 1 (A2,0 + A0,2 ) 2 1 1 a2,0 a22 2a1,1 a1 a2 + a0,2 a21 a2,0 a21 + 2a1,1 a1 a2 + a0,2 a22 2 2 1/2 2 2c (1 + c ) 2c (1 + c2 )3/2 ⇣ ⌘ 1 2 2 2 2 2 (1 + c ) a a 2a a a + a a + a a + 2a a a + a a 2,0 1,1 1 2 0,2 2,0 1,1 1 2 0,2 2 1 1 2 3/2 2c2 (1 + c2 ) ⇣ ⌘ 1 2 2 2 2 2 2 2 2 2 a (a + a ) + a (a + a ) + c a a c 2a a a + c a a 2,0 2 0,2 1 2,0 2 1,1 1 2 0,2 1 1 2 3/2 2c2 (1 + c2 ) ⇣ ⌘ 1 2 2 a + a + a a + a a 2a a a 2,0 0,2 2,0 0,2 1,1 1 2 2 1 3/2 2 (1 + c2 ) ⇣ ⌘ 1 2 2 a (1 + a ) 2a a a + a (1 + a ) 2,0 1,1 1 2 0,2 2 1 . 3/2 2 (1 + c2 )

H= = = = = =

In going to the next-to-last line, one c2 in the 2c2 in the denominator cancels the c2 and a21 + a22 in the numerator. A16.1 a. Since 0  sin x  1 for x 2 [0, ⇡], it follows that 0  sinn x  sinn for x 2 [0, ⇡]. So cn =

Z

⇡

sinn x dx
c2n+1 , we have r p 2n + 1 2 ⇡ 1 + o(1) n but

r

2n + 1 p = 2 1 + o(1) , n

C 2 1 + o(1) ,

so C 2  2⇡ 1 + o(1) .

We also have so C 2 2⇡ 1 + o(1) . p p n So if there exists C such that n! = C n ne 1 + o(1) , then C = 2⇡. c2n < c2n

1,

A16.3 It is not really the object of this exercise, but let us see by induction that the number of ways of picking k things from m, called “m choose k,” is given by the formula ✓ ◆ m! m = k k!(m k)!

Solutions for the Appendix

305

which you probably saw in high school and in Theorem 6.1.10. One way of seeing this is the relation leading to Pascal’s triangle: ✓ ◆ ✓ ◆ ✓ ◆ m m 1 m 1 = + , k k 1 k which expresses the fact that to choose k things among m, you must either choose k 1 things among the first m 1, and then choose the last also, or choose k things among the first m 1, and then not choose the last. Suppose that ✓ the ◆ formula is true for all m 1 and all k, with the conm vention that = 0 if k < 0 or k > m. Then the inductive step is k ✓ ◆ ✓ ◆ ✓ ◆ m m 1 m 1 = + k k 1 k (m 1)! (m 1)! + 1)!(m k)! k!(m k 1)! ✓ ◆ (m 1)! 1 1 = + (k 1)!(m k 1)! m k k =

=

(k

(k

(m 1)! 1)!(m k

m m! = . 1)! k(m k) k!(m k)!

Now to our question. Since the coin is being tossed 2n times, there are 22n possible sequences of tosses; saying that the coin is a fair coin is exactly saying that all such sequences have the same probability 1/22n . The number of sequences corresponding to n + k heads is exactly the number of ways of choosing n + k tosses among the 2n, i.e., 2n choose n + k: ✓ ◆ (2n)! 2n = . n+k (n + k)!(n k)! p p We need to sum this over all k such that n + a n  n + k  n + b n. A21.1 a. For any x1 , x2 2 Rn , we have ⇣ 0  f (x2 ) = inf |x2 y|  inf |x2 x1 | + |x1 y2X

y2X

⌘ y| = |x2

x1 | + f (x1 ).

The same statement holds if we exchange x1 and x2 , so |f (x1 )

f (x2 )|  |x1

x2 |.

This proves that f is continuous. If x 2 X, there exists a sequence i 7! yi converging to x, so inf y2X |x yi | = 0, hence f (x) = 0. But if x 2 / X, then there exists ✏ > 0 such that |x y| > ✏ for all y 2 X, hence f (x) ✏. b. There isn’t much to show: this function is bounded by 1 and has support in [0, 1]; it is continuous, (that was the point of including 0 and 1 in X 0 ), it satisfies f  0 everywhere, and f (x) = 0 precisely if x 2 / [0, 1] or if x is in the unpavable set X 0 .

306

Solutions for the Appendix

A21.3 This has nothing to do with what Rn X✏ might be; to lighten def notation, set Z = Rn X and denote by d(x, Z) the distance between x def and the nearest point in Z: d(x, Z) = inf y2Z |x y|, Then by the triangle inequality d(x, Z)  |x

d(y, Z)  |y

so d(x, Z)

d(y, Z)  |x

and finally |d(x, Z) (1

d(y, Z)|  |x

d(x, Z))

(1

x| + d(x, Z) d(y, Z)

d(x, Z)  |x

f (y)|  |x

y|,

y|. It then follows immediately that

d(y, Z)) = d(y, Z)

If f, g : Rn ! R satisfy |f (x)

y|,

y| + d(y, Z)

y| and |g(x)

d(x, Z)  |x, y|. g(y)|  |x

y|,

then | sup(f, g)(x) sup(f, g)(y)||  |x y|, as you can check by looking at the four cases where each of sup(f, g)(x) and sup(f, g)(y) is realized by f or g. Applying this to the case f (z) = 1 d(x, Z) and g(z) = 0, we get h(x) h(y)|  |x y|. Now we will show that h(x) = 1 when x 2 / X✏ , and 0  h(x) < 1 when x 2 X✏ . By definition, h = 1 on Z = Rn X✏ , which is closed. Thus if z2 / Rn X✏ , there is a ball Bz (r) of some radius 0 < r < 1 around z such that Bz (r) \ Rn X✏ = / , and h(z)  1 r. Thus Rn X✏ is the set of z 2 Rn such that h(z) = 0. The number h(z) is the supremum of 1 d(z, Z) and 0, so of course h(z) 0, and since d(z, Z) 0, we have h(z)  1. 1 A21.5 The hypotheses of Theorem 4.11.20 imply that is bijective and 1 of class C with Lipschitz derivative. So the direction “ =) ” asserts that if

(f |

)| det[D ]| {z }

like f in original statement

Z

U

⇣

(f

(f

is integrable on U , then ⌘ 1 )| det[D ] | det[D

1

]| = (f

)

1

| det[D ]

1

|| det[D

1

]|

is integrable on V , and Z h ⇣ ⌘i 1 1 )(x) det[D (x)] |dn x| = (f )(y) det D (y) |det[D 1 (y)]| |dn y| ZV Z 1 = f (y) det[D( )(y)] |dn y| = f (y) |dn y|. V

V

A22.1 To simplify notation, we will set ' = dxi1 ^ · · · ^ dxik . Thus we want to show that the definition of the wedge product justifies going from the last line of equation A22.5 to the preceding line, i.e., (df ^') P0 (~v1 , . . . , ~vk+1 ) =

k+1 X

( 1)i

i=1

1

b i , . . . , ~vk+1 ). [Df (0)]~vi '(~v1 , . . . , ~v

Solutions for the Appendix

307

Since f is a 0-form, df is a 1-form, and ' is a k-form. So Definition 6.1.12 of the wedge product says X df ^ '(~v1 , . . . , ~vk+1 ) = sgn( ) df (~v (1) )'(~v (2) , . . . , ~v (k+1) ). (1) shu✏es 2Perm(1,k)

Equation 1: The (i) indicate that these shu✏es are those permutations such that within each group, the vectors come with indices in ascending order.

We have 1 + k possible (1, k)-shu✏es. As in the discussion following Definition 6.1.12, we use a vertical bar to separate the vector on which the 1-form df is being evaluated from the vectors on which the k-form ' is being evaluated, and we use a hat to denote an omitted vector: b 1 , ~v2 , . . . , ~vk+1 ~v1 ~v b 2 , . . . , ~vk+1 ~v2 ~v1 , ~v .. .

b k+1 , ~vk+1 ~v1 , ~v2 , . . . , ~v

which we can write

b i , . . . , ~vk+1 ~vi ~v1 , . . . , ~v

for i = 1, . . . , k + 1. When i is odd, the signature of the permutation is +1; when i is even, the signature is 1. So we can rewrite equation 1 as df ^ '(~v1 , . . . , ~vk+1 ) =

k+1 X

( 1)i

i=1

1

b i , . . . , ~vk+1 ). df (~vi )'(~v1 , . . . , ~v

(2)

Now we can rewrite df as [Df ]. To evaluate df at 0, we change the (~v1 , . . . , ~vk+1 ) on the left side of equation 2 to P0 (~v1 , . . . , ~vk+1 ): df ^' P0 (~v1 , . . . , ~vk+1 ) =

Solution A23.1: The [0] in F

1

([0]) ⇢ Rn ⇥ Mat (n, n)

is the zero matrix among symmetric n ⇥ n matrices.

k+1 X

( 1)i

i=1

1

b i , . . . , ~vk+1 ). ([Df (0)]~vi )'(~v1 , . . . , ~v

A23.1 By definition, the space of rigid motions of Rn is the space of maps T : Rn ! Rn of the form x 7! Ax + b, where A 2 Mat (n, n) satisfies A> A = I. Thus we can think of the space of rigid motions as the subset F 1 ([0]) ⇢ Rn ⇥ Mat (n, n), where F (b, A) = A> A I, and it is important to consider the codomain of F as the space of symmetric n⇥n matrices, not the space of all n ⇥ n matrices. Then F is a map from a space of dimension n2 + n to a space of dimension n(n + 1)/2. By part d of Exercise 3.2.11, we know that the derivative of F is surjective at all the points where A> A = I. So, by Theorem 3.1.10, F 1 ([0]) is a manifold of dimension n + n2

n(n + 1) n(n + 1) = . 2 2

A23.3 This is immediate from Heine-Borel. The manifold M is locally a graph. We have seen many times what this means: for every x 2 M , there exist

308

Solutions for the Appendix

• subspaces E1 , E2 ⇢ Rn with E1 spanned by p standard basis vectors and E2 spanned by the other n p standard basis vectors, • open subsets U ⇢ E1 and V ⇢ E2 , • A C 1 mapping g : U ! V ,

such that M \ (U ⇥ V ) = (g). Renumber the variables so that E1 ✓ corre◆ y sponds to the first basis vectors and E2 to the last, and write x = . z Choose r(x) > 0 such that Br(x) (y) ⇢ U , and call Zx the graph of the restriction of the graph of g to the concentric ball of radius r/2. Then the closure of Zx , which is the graph of the restriction of g to the closed ball of radius r/2, is a compact subset of M . The Zy form an open cover of Y , so by Heine-Borel there is a finite subcover: there exist x1 , . . . , xm such that Y ⇢ Zx1 [ · · · [ Zxm . If each of the f (Z xi ) has q-dimensional volume 0, then so does their union, which contains f (Y ).

Index

composition, 54

exterior derivative, computing, 233

computational “tricks”, 16, 34

Bold page numbers indicate a page where a term is defined, formally or informally. Page numbers in italics indicate that the term is used in a theorem, proposition, in lemma, or corollary.

A> A always symmetric, 194 det A> T A 0, 182 absolute convergence, 2 adjacency matrix, 11 alternating series, 102 angle inscribed in circle, 126

cone, parametrization of, 269

floor, 134

continuity 21, 22, 27, 29, 31, 36, 38 convergence, 2, 19

flux, 223, 253 of electric field, 271

cross product, 15

Fresnel integrals, 175

curvature Gaussian, 127, 191, 192 mean, 119, 127, 191, 299 total, 192

Fubini’s theorem, 158

cylindrical coordinates, 164, 192, 212, 252, 269 derivative, 28 as linear transformation, 37 function of one variable, 25 of identity matrix, 81 of inverse, 71

functional analysis, 163 functional calculus, 163 fundamental theorem of algebra, 23 gambling, 116 Gauss map, 197 Gauss’s law, 271 Gaussian bell curve, 102 Gaussian curvature, 127, 191, 192 scaling, 191 geometric series, 181

Archimedes, 192

determinant, 15 basis independent, 182 det A> A 0, 182 using to check orientation, 269

associativity, 209

diagonal vector, 16

Handbook of Mathematical Functions, 189

augmented Hessian matrix, 110

diagonalizable matrix, 163, 182

helicoid, parametrization of, 205

di↵erentiability, 69

antisymmetry, 152 approximation, 178

gradient, 269 gravitation, 271

Bernoulli, Daniel, 116

Diophantine analysis, 178

Hermitian matrix, model for spacetime, 104

Bernstein’s theorem, 4, 5

distributivity, 208–209

Hessian matrix, 294, 296–297

best coordinates for surfaces, 119

dominated convergence theorem, 171

Hilbert matrix, 105

binomial, 96

dot product, 15

homogeneous polynomial, 123 hyperbolic cylinder, 106

boundary of compact manifold, 273 e, proof not rational, 291 cardinality of R and Rn , 5

eigenvalues, 59, 61, 182

cardioid, 177

eigenvectors 59, 61

hyperboloid, 107, 192 image under Gauss map, 197 of one sheet, 107

Cayley-Hamilton theorem, 60

electromagnetic field, 273

hypocycloid, 120

center of gravity, 164, 178, 194, 252

elementary forms, 207–208

chain rule, 266

elementary matrices, 47

identity matrix, derivative of, 81

change of coordinates cylindrical, 164, 192, 212, 252, 269 elliptical, 163 polar, 187, 190, 213, 255 spherical, 183, 225, 256

ellipse, 6, 118, 161 area bounded by, 149 elliptical change of coordinates, 163

image of composition, 54

ellipsoid, 161 parametrization, 189

improper integral, 171 and change of variables, 171

change of variables, 256 improper integrals, 171

elliptic function, 188, 189 elliptic⇡, 189 ellipticE, 189 ellipticK, 189

incompressibility, 275

Cicero, 192

implicit function theorem, 80 neighborhoods, 70

increment, notation for, 100, 101

compact manifold has empty boundary, 273

elliptic paraboloid, 188, 189

integral of form over k-parallelogram, 213 of vector-valued function, 256

elliptical change of coordinates, 163

integrand, 137

compactness, 110

erf function, 102

intermediate value theorem, 2

complex numbers, 217

exponentials, 145, 169, 248

closed set, 17, 18, 19

309

310 inverse, derivative of, 71

Index: page numbers in italics indicate theorems, propositions, etc.

inverse square laws, 271

mean curvature, 119, 127, 191, 299 scaling, 191 sign of, 119

polynomials dominated by exponentials, 169, 248 homogeneous, 123

invertibility, determining, 81

midpoint Riemann sum, 190

power series, 137

invertible matrix, 285

monotone convergence theorem, 171

irrational numbers, 18 proof e irrational, 291

multilinearity, 152

powers dominate logarithms, 145, 248 dominated by exponentials, 145, 248

natural domain, 1, 18

probability, 102

Jacobi’s method, 131

natural log, 22, 203

pullback, 243

Jordan product, 14, 15

negating mathematical statements, 1

inverse function theorem, 69

Kantorovich’s theorem, weaker condition, 285–287

Newton’s method when inequality not satisfied, 63

quadratic form, 103, 104, 106, 109 quadrilateral inscribed in circle, 126

norm of matrix, 63, 66 rank, 227

k-parallelogram, integrating form over, 213

notation for increments, 100, 101 for sequences, 33

kernel of composition, 54

numerical analysis, loss of precision, 41

rational numbers, 178 countable, 3 neither open nor closed, 18

Lagrange multiplier, 109

onto function and existence of solution, 92

reflection, 17, 33, 75

Lagrange multiplier theorem, 109, 111, 175

open set, 17, 18

relativity, theory of, 104

orientation, 214–221, 226, 228, 229 boundary of 2-parallelogram, 240 boundary of 3-parallelogram, 241–242 by transverse vector field, 214 checking using cross product, 269 checking using determinant, 269

rigid motion, 303

k-close, 277

Lebesgue integration, 171 limit of functions, 19 of sequence, 19 linear transformation, 13 right and wrong definitions, 13

orthogonal matrix, 75, 94

logarithm (ln), 22, 203 dominated by powers, 145

parabola, 6, 7

Lorentz pseudolength, 257

paraboloid, 188, 189

Lorentz transformation, 257

parallelepiped, 33 contained in ellipsoid, 111

m choose k, 300–301

parametrization boundary of 2-parallelogram, 240 boundary of 3-parallelogram, 241–242 cone, 269 ellipsoid, 189 finding, 183, 188, 204 helicoid, 205 surface of ellipsoid, 189

magnetic monopole, 275 manifold, when locus is not, 89, 90 Matlab, 49, 62, 190–191 matrix augmented Hessian matrix, 110 elementary, 47 Hermitian, 104 Hessian, 294, 296–297 Hilbert, 105 invertible, 285 orthogonal, 75, 94 symmetric, 104, 163, 303 upper triangular, 31

partial derivatives 25, 26, 29, 36, 109 partial fractions, 54 partial row reduction, cost, 44 Pascal’s triangle, 96, 301 permutations, 209

matrix factorization, 155

piece-with-boundary straightening boundary, 231

matrix multiplication, 10, 13, 31, 37

polar angle, 5, 215–216

Maxwell’s equations, 257

polar coordinates, 187, 190, 213, 255

ratio test, 169, 179

rotation, 14, 33, 38–39, 75 row reduction, 42–47 saddle, 109 second fundamental form, 127 sequence, notation ambiguous, 33 shu✏es, 209 signature of quadratic form, 103, 104, 109, 182 Simpson’s method, 191 singular value decomposition (SVD), 231 skew commutativity, 210-211, 237 smooth manifold when locus is not, 89, 90 spacetime, 104 spectral theorem, 163, 182 spherical coordinates, 183, 225 spiral, 187 squaring function for matrices, 27 standard deviation, 139 subspace, 31, 47 symmetric matrix, 104, 163, 182, 303 symmetry, 140, 145, 146, 170, 178, 253 Taylor polynomials computing, 123 error to avoid, 123

Taylor polynomials, cont. point where evaluated, 123 Taylor series of vector field, 253 telescope, 38 telescoping series, 171 telescoping sum, 138 total curvature, 192 transverse vector field, 214 triadic Cantor set, 206 triangle inequality, 17, 18 variant, 17 trigonometry, 183 units, 275 vector field Taylor series of, 253 vector-valued function integral, 256 integrating, 256 volume, 186 volume 0, 136 when X ⇥ Y has, 136 wedge product, 302–303 proof of properties, 208–211 properties, 208–210 work of vector field, 222, 252–253, 254

311