
A Complete Solution Guide to Principles of Mathematical Analysis

by Kit-Wing Yu, PhD [email protected]

Copyright © 2018 by Kit-Wing Yu. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the author.

ISBN: 978-988-78797-0-1 (eBook)
ISBN: 978-988-78797-1-8 (Paperback)


About the author

Dr. Kit-Wing Yu received his B.Sc. (1st Hons), M.Phil. and Ph.D. degrees in Mathematics at the HKUST and his PGDE (Mathematics) at the CUHK. Since graduating, he has taught mathematics at United Christian College for at least seventeen years, and he has been in charge of the mathematics panel there since 2002. He was also a part-time tutor (2002–2005) and then a part-time course coordinator (2006–2010) of the Department of Mathematics at the OUHK. Besides teaching, Dr. Yu has been a marker of the HKAL Pure Mathematics and HKDSE Mathematics (Core Part) examinations for over thirteen years. Between 2012 and 2014, he was invited to be a judge member of the World Olympic Mathematics Competition (China). On the research side, he has published over twelve research papers in international mathematical journals, including some well-known journals such as J. Reine Angew. Math., Proc. Roy. Soc. Edinburgh Sect. A and Kodai Math. J. His research interests are inequalities, special functions and Nevanlinna's value distribution theory.


Preface

Professor Walter Rudin^a is the author of the classical and famous textbooks Principles of Mathematical Analysis, Real and Complex Analysis, and Functional Analysis. (People commonly call them "Baby Rudin", "Papa Rudin" and "Grandpa Rudin" respectively.) Undoubtedly, they have had an important and far-reaching impact on the study of mathematical analysis at university level since their publication.

In my memory, Principles of Mathematical Analysis was the standard textbook when I was a year 2 undergraduate mathematics student many years ago, and I read Chapters 1–7 plus 11 again when I prepared for my Ph.D. qualifying examination. In my personal experience, the chapters in these books are well organized and the expositions of the theorems are clear, precise and well written. I believe that you will agree with me after reading them.

Although Principles of Mathematical Analysis has not been updated since 1976, many instructors nowadays still choose it as the standard textbook or as one of the main reference books in their analysis courses. I guess the "beauty" of the book described above is one of the reasons for this choice. Furthermore, many people believe that the best way to study mathematics is by working through examples and exercises, so the excellent and well-designed exercises provided in the book are another main factor that attracts us. This is easy to confirm: type a few keywords into Google and you will find many people discussing these exercises and looking for their solutions on various platforms, for example the Math Stack Exchange: https://math.stackexchange.com/

Actually, Professor Roger Cooke wrote "Solutions Manual to Walter Rudin's Principles of Mathematical Analysis" in 1976, see https://minds.wisconsin.edu/handle/1793/67009. However, a glance at the files reveals that the readability of that manual is rather low because of the lack of equation numbers and supporting illustrations. More importantly, some of its solutions appear to be incomplete. These observations gave "birth" to this book: I decided to write the solutions myself. In other words, this book is not a collection of solutions from others; rather, I use my own words and my own ways to solve and prove the problems. In fact, I have written a complete solution guide so that it helps every mathematics student and instructor to understand the ideas and applications of the theorems in Rudin's book.

As a mathematics instructor at a college, I understand that the growth of a mathematics student depends largely on how hard he or she works on exercises. When your instructor asks you to do some exercises from Rudin, you should not read my solutions unless you have tried your best to prove them yourself.

a https://en.wikipedia.org/wiki/Walter_Rudin.


The features of this book are as follows:

• It covers all 285 exercises with detailed and complete solutions. As a matter of fact, my solutions show every detail, every step and every theorem that I applied. That is why this book has over 390 pages!
• There are 55 illustrations and 3 tables for explaining the mathematical concepts or ideas behind the questions or theorems.
• Sections are added to each chapter so as to increase the readability of the exercises.
• Different colors are used frequently in order to highlight or explain problems, lemmas, remarks and the main points or formulas involved, or to show the steps of manipulation in some complicated proofs (ebook only).
• Necessary lemmas with proofs and references are provided, because some questions require additional mathematical concepts which are not covered by Rudin.
• Three appendices are included which further explain and supplement some theories in Chapters 10 and 11.

Since the solutions are written solely by me, you may find typos or mistakes. If you do find one, please send your comments or opinions to [email protected], and I will post updated errata on my website https://sites.google.com/view/yukitwing/. Finally, if the sales of this book are good (I hope so), I will start to write a solution guide for Real and Complex Analysis soon.

Kit-Wing Yu
February 2018

List of Figures

2.1 The neighborhoods N_h(q) and N_r(p).
2.2 Convex sets and nonconvex sets.
2.3 The sets N_h(x), N_{h/2}(x) and N_{q_m}(x_k).
2.4 The construction of the shrinking sequence.
3.1 The Cantor set.
4.1 The graph of g on [a_n, b_n].
4.2 The sets E and I_{n_i}.
4.3 The graphs of [x] and (x).
4.4 An example for α = √2 and n = 5.
4.5 The distance from x ∈ X to E.
4.6 The graph of a convex function f.
4.7 The positions of the points p, p + κ, q − κ and q.
5.1 The zig-zag path of the process in (c).
5.2 The zig-zag path induced by the function f in Case (i).
5.3 The zig-zag path induced by the function g in Case (i).
5.4 The zig-zag path induced by the function f in Case (ii).
5.5 The zig-zag path induced by the function g in Case (ii).
5.6 The geometrical interpretation of Newton's method.
8.1 The graph of the continuous function y = f(x) = (π − |x|)² on [−π, π].
8.2 The graphs of the two functions f and g.
8.3 A geometric proof of 0 < sin x ≤ x on (0, π/2].
8.4 The graph of y = |sin x|.
8.5 The winding number of γ around an arbitrary point p.
8.6 The geometry of the points z, f(z) and g(z).
9.1 An example of the range K of f.
9.2 The set of q ∈ K such that (∇f_3)(f^{-1}(q)) = 0.
9.3 Geometric meaning of the implicit function theorem.
9.4 The graphs around the four points.
9.5 The graphs around (0, 0) and (1, 0).
9.6 The graph of the ellipse X² + 4Y² = 1.
9.7 The definition of the function φ(x, t).
9.8 The four regions divided by the two lines αx1 + βx2 = 0 and αx1 − βx2 = 0.
10.1 The compact convex set H and its boundary ∂H.
10.2 The figures of the sets U_i, W_i and V_i.
10.3 The mapping T : I² → H.
10.4 The mapping T : A → D.
10.5 The mapping T : A° → D°.
10.6 The mapping T : S → Q.
10.7 The open sets Q_{0.1}, Q_{0.2} and Q.
10.8 The mapping T : I³ → Q³.
10.9 The mapping τ_1 : Q² → I².
10.10 The mapping τ_2 : Q² → I².
10.11 The mapping τ_2 : Q² → I².
10.12 The mapping Φ : D → R² \ {0}.
10.13 The spherical coordinates for the point Σ(u, v).
10.14 The rectangles D and E.
10.15 An example of the 2-surface S and its boundary ∂S.
10.16 The unit disk U as the projection of the unit ball V.
10.17 The open cells U and V.
10.18 The parameter domain D.
10.19 The figure of the Möbius band.
10.20 The "geometric" boundary of M.
11.1 The open square R_δ((p, q)) and the neighborhood N_{√2 δ}((p, q)).
B.1 The plane angle θ measured in radians.
B.2 The solid angle Ω measured in steradians.
B.3 A section of the cone with apex angle 2θ.

List of Tables

6.1 The number of intervals & end-points and the length of each interval for each E_n.
9.1 Expressions of x around four points.
9.2 Expressions of y around four points.

Contents

Preface
List of Figures
List of Tables

1 The Real and Complex Number Systems
  1.1 Problems on rational numbers and fields
  1.2 Properties of supremums and infimums
  1.3 An index law and the logarithm
  1.4 Properties of the complex field
  1.5 Properties of Euclidean spaces
  1.6 A supplement to the proof of Theorem 1.19

2 Basic Topology
  2.1 The empty set and properties of algebraic numbers
  2.2 The uncountability of irrational numbers
  2.3 Limit points, open sets and closed sets
  2.4 Some metrics
  2.5 Compact sets
  2.6 Further topological properties of R
  2.7 Properties of connected sets
  2.8 Separable metric spaces and bases and a special case of Baire's theorem

3 Numerical Sequences and Series
  3.1 Problems on sequences
  3.2 Problems on series
  3.3 Recursion formulas of sequences
  3.4 A representation of the Cantor set
  3.5 Cauchy sequences and the completions of metric spaces

4 Continuity
  4.1 Properties of continuous functions
  4.2 The extension, the graph and the restriction of a continuous function
  4.3 Problems on uniformly continuous functions
  4.4 Further properties of continuous functions
  4.5 Discontinuous functions
  4.6 The distance function ρ_E
  4.7 Convex functions
  4.8 Other properties of continuous functions

5 Differentiation
  5.1 Problems on differentiability of a function
  5.2 Applications of Taylor's theorem
  5.3 Derivatives of higher order and iteration methods
  5.4 Solutions of differential equations

6 The Riemann-Stieltjes Integral
  6.1 Problems on Riemann-Stieltjes integrals
  6.2 Definitions of improper integrals
  6.3 Hölder's inequality
  6.4 Problems related to improper integrals
  6.5 Applications and a generalization of integration by parts
  6.6 Problems on rectifiable curves

7 Sequences and Series of Functions
  7.1 Problems on uniform convergence of sequences of functions
  7.2 Problems on equicontinuous families of functions
  7.3 Applications of the (Stone-)Weierstrass theorem
  7.4 Isometric mappings and initial-value problems

8 Some Special Functions
  8.1 Problems related to special functions
  8.2 Index of a curve
  8.3 Stirling's formula

9 Functions of Several Variables
  9.1 Linear transformations
  9.2 Differentiable mappings
  9.3 Local maxima and minima
  9.4 The inverse function theorem and the implicit function theorem
  9.5 The rank of a linear transformation
  9.6 Derivatives of higher order

10 Integration of Differential Forms
  10.1 Integration over sets in R^k and primitive mappings
  10.2 Generalizations of partitions of unity
  10.3 Applications of Theorem 10.9 (Change of Variables Theorem)
  10.4 Properties of k-forms and k-simplexes
  10.5 Problems on closed forms and exact forms
  10.6 Problems on vector fields

11 The Lebesgue Theory
  11.1 Further properties of integrable functions
  11.2 The Riemann integrals and the Lebesgue integrals
  11.3 Functions of classes L and L²

Appendix
A A proof of Lemma 10.14
B Solid angle subtended by a surface at the origin
C Proofs of some basic properties of a measure

Index

Bibliography

CHAPTER 1

The Real and Complex Number Systems

Unless the contrary is explicitly stated, all numbers that are mentioned in these exercises are understood to be real.

1.1 Problems on rational numbers and fields

Problem 1.1 Rudin Chapter 1 Exercise 1.

Proof. Assume that r + x was rational. Then it follows from Definition 1.12(A1), (A4) and (A5) that x = (r + x) − r is also rational, a contradiction. Similarly, if rx was rational, then it follows from Definition 1.12(M1), (M4) and (M5) that
  x = (rx)/r
is also rational, a contradiction. This ends the proof of the problem. □

Problem 1.2 Rudin Chapter 1 Exercise 2.

Proof. Assume that √12 was rational, so that √12 = m/n, where m and n are co-prime integers. Then we have m² = 12n². Since 3 divides m² and 3 is prime, m is divisible by 3. Let m = 3k for some integer k. Then we have m² = 9k², and this shows that
  k² = 4n²/3.
Since k² is an integer, 3 divides 4n² and hence n is divisible by 3. This contradicts the fact that m and n are co-prime, completing the proof of the problem. □

Problem 1.3 Rudin Chapter 1 Exercise 3.

Proof. Since x ≠ 0, there exists 1/x ∈ F such that x · (1/x) = 1.

(a) Therefore, it follows from Definition 1.12(M2), (M3), (M4) and (M5) that xy = xz implies that y = z.

(b) Similarly, it follows from Definition 1.12(M2), (M3), (M4) and (M5) that xy = x implies that y = 1.

(c) Similarly, it follows from Definition 1.12(M2), (M3), (M4) and (M5) that xy = 1 implies that y = 1/x.

(d) Since 1/x ∈ F, there exists 1/(1/x) ∈ F such that (1/x) · (1/(1/x)) = 1. Now we have (1/x) · (1/(1/x)) = (1/x) · x (= 1), so Proposition 1.15(a) implies that 1/(1/x) = x.

This completes the proof of the problem. □

1.2 Properties of supremums and infimums

Problem 1.4 Rudin Chapter 1 Exercise 4.

Proof. Since E ⊂ S, the definitions give α ≤ x and x ≤ β for all x ∈ E. As E is nonempty, we may pick some x ∈ E, and then Definition 1.5(ii) implies that α ≤ β. This completes the proof of the problem. □

Problem 1.5 Rudin Chapter 1 Exercise 5.

Proof. Theorem 1.19 says that R is an ordered set with the least-upper-bound property. Since A is a non-empty subset of R and A is bounded below, inf A exists in R by Definition 1.10. Furthermore, −A is a non-empty subset of R. Let y be a lower bound of A, i.e., y ≤ x for all x ∈ A. Then we have −x ≤ −y for all x ∈ A. Thus −y is an upper bound of −A and sup(−A) exists in R by Definition 1.10. Let α = inf A and β = sup(−A). By definition, we have y ≤ β for all y ∈ −A, where y = −x for some x ∈ A. It implies that x = −y ≥ −β for all x ∈ A, so −β is a lower bound of A and then −β ≤ α. Similarly, we have α ≤ x for all x ∈ A, so that −α ≥ −x for all x ∈ A. It implies that −α is an upper bound of −A, so β ≤ −α and then −β ≥ α. Hence we have α = −β, i.e., inf A = − sup(−A). This completes the proof of the problem. □
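The identity inf A = − sup(−A) proved above is easy to sanity-check numerically: for a finite set, sup and inf reduce to max and min. The following Python sketch is only an illustration (the sample sets are arbitrary), not part of the original solution.

```python
# Numerical sanity check of inf A = -sup(-A) on finite samples,
# where sup and inf reduce to max and min.
import random

random.seed(0)
for _ in range(5):
    A = [random.uniform(-10, 10) for _ in range(20)]   # a bounded, nonempty sample
    neg_A = [-x for x in A]                            # the set -A
    assert abs(min(A) - (-max(neg_A))) < 1e-12         # inf A == -sup(-A)
print("inf A = -sup(-A) holds on all samples")
```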

1.3 An index law and the logarithm

Problem 1.6 Rudin Chapter 1 Exercise 6.

Proof.
(a) Since b^m > 0 and n ∈ N, Theorem 1.21 implies that there exists one and only one real y such that y^n = b^m. Similarly, there exists one and only one real z such that z^q = b^p. We have
  y^{nq} = (y^n)^q = (b^m)^q = b^{mq} = b^{pn} = (b^p)^n = (z^q)^n = z^{qn},
which implies that y = z, i.e., (b^m)^{1/n} = (b^p)^{1/q}.

(b) Let b^r = b^{m/n} and b^s = b^{p/q}. Without loss of generality, we may assume that n and q are positive. Then the corollary of Theorem 1.21 implies that
  b^{r+s} = b^{(mq+np)/(nq)} = (b^{mq+np})^{1/(nq)} = (b^{mq} × b^{np})^{1/(nq)} = (b^{mq})^{1/(nq)} × (b^{np})^{1/(nq)} = b^{m/n} × b^{p/q} = b^r × b^s.

(c) By definition, B(r) = {b^t | t ∈ Q, t ≤ r}, where r ∈ Q. It is clear that b^r ∈ B(r), so it is a nonempty subset of R. Since b > 1, we have b^t ≤ b^r for all t ≤ r, so that b^r is an upper bound of B(r). Therefore, Theorem 1.19 and Definition 1.10 show that sup B(r) exists in R. Now we show that b^r = sup B(r). If 0 < γ < b^r, then γ is obviously not an upper bound of B(r) because b^r ∈ B(r). By Definition 1.8, we have b^r = sup B(r).

(d) By part (c), we know that b^x, b^y and b^{x+y} are all well-defined in R. By definition, we have
  B(x) = {b^r | r ∈ Q, r ≤ x},  B(y) = {b^s | s ∈ Q, s ≤ y},  B(x + y) = {b^t | t ∈ Q, t ≤ x + y}.

Before continuing the proof, we need to show several results:

Lemma 1.1
For every real x and y, we define B(x, y) = B(x) × B(y) = {b^r × b^s | r, s ∈ Q, r ≤ x, s ≤ y}. Then we have b^x × b^y = sup B(x, y).

Proof of Lemma 1.1. By definition, b^x and b^y are upper bounds of B(x) and B(y) respectively, so we have b^r ≤ b^x and b^s ≤ b^y for every b^r ∈ B(x) and b^s ∈ B(y). Therefore, we have b^r × b^s ≤ b^x × b^y for every b^r × b^s ∈ B(x, y). In other words, b^x × b^y is an upper bound of B(x, y). Let 0 < α < b^x × b^y. Then we have α/b^x < b^y. We define the number p = (1/2)(α/b^x + b^y). It is obvious from this definition that α/b^x < p < b^y. By α/b^x < p, we have α/p < b^x and so there exists b^r ∈ B(x) such that
  α/p < b^r.    (1.1)
Similarly, the inequality p < b^y implies that there exists b^s ∈ B(y) such that
  p < b^s.    (1.2)
Now inequalities (1.1) and (1.2) show that α < b^r × b^s for some b^r × b^s ∈ B(x, y). Hence α is not an upper bound of B(x, y) and we have b^x × b^y = sup B(x, y), completing the proof of the lemma. □

Lemma 1.2
Let S be a set of positive real numbers which is bounded above, and let S^{−1} = {x^{−1} | x ∈ S}. Then we have
  sup S = 1 / inf S^{−1}.

Proof of Lemma 1.2. Suppose that α is an upper bound of S, i.e., 0 < x ≤ α for all x ∈ S. Then we have 0 < α^{−1} ≤ x^{−1} for all x^{−1} ∈ S^{−1}. Hence the result follows directly from the definitions of the least upper bound and the greatest lower bound. □
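Lemma 1.2 can likewise be spot-checked on finite sets of positive reals, where sup S and inf S^{−1} reduce to max and min. The short Python check below uses arbitrary sample data and is only an illustration.

```python
# Check Lemma 1.2 on finite sets of positive reals:
# sup S = 1 / inf {1/x : x in S}.
import random

random.seed(1)
for _ in range(5):
    S = [random.uniform(0.1, 100.0) for _ in range(50)]  # positive and bounded above
    S_inv = [1.0 / x for x in S]
    assert abs(max(S) - 1.0 / min(S_inv)) < 1e-9
print("sup S = 1/inf S^{-1} holds on all samples")
```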

Lemma 1.3
For every real x, we have b^{−x} = 1/b^x.

Proof of Lemma 1.3. We have two facts:

– Fact 1. If b > 1, then b^{1/n} > 1 for every positive integer n. Otherwise, 0 < b^{1/n} < 1 implies that 0 < b = (b^{1/n})^n < 1^n = 1 by Theorem 1.21, a contradiction.

– Fact 2. If m and n are positive integers such that n > m, then b^{1/m} > b^{1/n}. Otherwise, it follows from Fact 1 that 1 < b^{1/m} ≤ b^{1/n}, and so it implies that
  b = (b^{1/m})^m ≤ (b^{1/n})^m < (b^{1/n})^n = b,
a contradiction.

Let r and s be rational. Define A(x) = {b^s | s ∈ Q, s ≥ x}. We next want to prove that sup B(x) = inf A(x). In fact, it is clear that sup B(x) ≤ inf A(x) by the definitions. Suppose that D = inf A(x) − sup B(x) and assume that D > 0. By Fact 2 above, b^{1/n} − 1 is decreasing as n increases, so there exists a positive integer N such that
  b^x (b^{1/n} − 1) < D
for all n ≥ N. By Theorem 1.20(b), we see that there exist r, s ∈ Q such that x − 1/(2n) < r ≤ x ≤ s < x + 1/(2n).

Problem 1.7 Rudin Chapter 1 Exercise 7.

Proof.
(b) Note that b^{1/n} > 1; otherwise, we have b = (b^{1/n})^n < 1 which is impossible. The result follows by replacing b by the real number b^{1/n} in part (a) and Problem 1.6(a).

(c) If t > 1 and n > (b − 1)/(t − 1), then part (b) implies that
  b − 1 ≥ n(b^{1/n} − 1) > [(b − 1)/(t − 1)] × (b^{1/n} − 1)
and so b^{1/n} < t.

(d) Let w be a number such that b^w < y. Let t = y · b^{−w}. It is easy to check that t > 1. If n is sufficiently large, then we have n > (b − 1)/(t − 1). Hence it follows from parts (c) and (b) that b^{1/n} < t = y · b^{−w}, and thus b^{w + 1/n} < y for sufficiently large n.

(e) Let w be a number such that b^w > y. Let t = y^{−1} · b^w. It is obvious that t > 1. If n is sufficiently large, then we have n > (b − 1)/(t − 1). Hence it follows from part (c) and then part (b) that
  n(y^{−1} b^w − 1) > b − 1 ≥ n(b^{1/n} − 1)
and thus b^{w − 1/n} > y for sufficiently large n.

(f) We have A = {w ∈ R | b^w < y}. Since x is the least upper bound of A, we have w ≤ x for all w ∈ A. If b^x < y, then part (d) implies that b^{x + 1/n} < y for sufficiently large n and so x + 1/n ∈ A. Therefore, we have x + 1/n ≤ x and then 1/n ≤ 0, a contradiction. Similarly, if b^x > y, then x ∉ A and so w < x for all w ∈ A. Now part (e) implies that b^{x − 1/n} > y for sufficiently large n, so we have w ≤ x − 1/n for all w ∈ A, i.e., x − 1/n is an upper bound of A which is smaller than x, a contradiction. Hence b^x = y.

1.4 Properties of the complex field

Problem 1.8 Rudin Chapter 1 Exercise 8.

Proof. Assume that an order could be defined in C which turns it into an ordered field. Since i ≠ 0, Definition 1.17 gives either i > 0 or i < 0. If i > 0, then i² > i · 0, which implies that −1 > 0, a contradiction. Similarly, the case i < 0 is impossible. □

Problem 1.9 Rudin Chapter 1 Exercise 9.

Proof. We check Definition 1.5 in this case. Let z = a + bi, w = c + di ∈ C. If a ≠ c, then we have either z < w or z > w. If a = c and b ≠ d, then we have either z < w or z > w. If a = c and b = d, then we have z = w. Therefore, this relation satisfies Definition 1.5(i). Let z = a + bi, w = c + di and q = e + fi be complex numbers such that z < w and w < q. Since z < w, we have either a < c, or a = c and b < d. Similarly, since w < q, we have either c < e, or c = e and d < f. Combining the above two results, we get either a < e, or a = e and b < f. This proves Definition 1.5(ii). Hence this turns C into an ordered set.

Let S = {z = a + bi | b ∈ R, a < 0} ⊂ C. Since −1 ∈ S, it is clear that S is not empty. We will show that sup S does not exist in C. Assume that w = c + di was the least upper bound of S for some c, d ∈ R. That is, z ≤ w for all z ∈ S. If c ≥ 0, then the definition of the dictionary order implies that z < ζ = 0 + (d − 1)i < w for all z ∈ S, contradicting the fact that w = sup S. If c < 0, then we have c < c/2 so that ζ = c/2 + di ∈ S and w < ζ, contradicting the fact that w = sup S. Hence this ordered set does not have the least-upper-bound property, completing the proof of the problem. □

Problem 1.10 Rudin Chapter 1 Exercise 10.

Proof. If v ≥ 0, then we have
  z² = a² + 2abi − b² = u + 2[(|w|² − u²)/4]^{1/2} i = u + (|w|² − u²)^{1/2} i = u + (v²)^{1/2} i = u + vi = w.
If v ≤ 0, then we have
  (z̄)² = a² − 2abi − b² = u − (|w|² − u²)^{1/2} i = u − (v²)^{1/2} i = u − (−v)i = u + vi = w.
By the above results, we see that every non-zero complex number w has two complex square roots ±z, where z is defined as in the question. However, when w = 0, then u = v = 0, which imply that a = b = 0 and thus z = −z = 0. Hence the complex number 0 has only one complex square root, which is 0 itself, completing the proof of the problem. □

Problem 1.11 Rudin Chapter 1 Exercise 11.

Proof. Let |z| = r > 0. We define z = rw, where w = z/r. Then it is easy to see that this expression satisfies the required conditions. Assume r1 w1 = r2 w2, where r1 > 0, r2 > 0 and |w1| = |w2| = 1. Then we have r1/r2 = w2/w1, which leads to
  r1/r2 = |w2/w1| = |w2|/|w1| = 1.
Hence we have r1 = r2 and then w1 = w2. This completes the proof of the problem. □

Problem 1.12 Rudin Chapter 1 Exercise 12.

Proof. This result follows from induction and Theorem 1.33(e). □

Problem 1.13 Rudin Chapter 1 Exercise 13.

Proof. Since |x| = |x − y + y| ≤ |x − y| + |y| by Theorem 1.33(e), we have |x| − |y| ≤ |x − y|. Similarly, since |y| = |y − x + x| ≤ |y − x| + |x| by Theorem 1.33(e), we have −|x − y| ≤ |x| − |y|. Hence these two results together imply the desired result, completing the proof of the problem. □

Problem 1.14 Rudin Chapter 1 Exercise 14.

Proof. It follows from Definition 1.32 that
  |1 + z|² + |1 − z|² = (1 + z)(1 + z̄) + (1 − z)(1 − z̄) = 2(1 + |z|²)
holds. This completes the proof of the problem. □
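The identity just proved is a one-line numerical check. The following Python snippet is illustrative only and verifies |1 + z|² + |1 − z|² = 2(1 + |z|²) for randomly chosen z.

```python
# Check |1+z|^2 + |1-z|^2 = 2(1 + |z|^2) for random complex z.
import random

random.seed(2)
for _ in range(1000):
    z = complex(random.uniform(-5, 5), random.uniform(-5, 5))
    lhs = abs(1 + z) ** 2 + abs(1 - z) ** 2
    rhs = 2 * (1 + abs(z) ** 2)
    assert abs(lhs - rhs) < 1e-9
print("identity verified for 1000 random z")
```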

Problem 1.15 Rudin Chapter 1 Exercise 15.

Proof. By the proof of the Schwarz inequality (Theorem 1.35), equality holds if and only if each term |Ba_j − Cb_j|² in the sum Σ_{j=1}^{n} |Ba_j − Cb_j|² is zero, i.e.,
  |Ba_j − Cb_j|² = 0
  (Ba_j − Cb_j)(conjugate of (Ba_j − Cb_j)) = 0
  B²|a_j|² + C²|b_j|² − 2BC Re(a_j b̄_j) = 0.    (1.3)
By Theorem 1.33(d), we have Re(a_j b̄_j) ≤ |a_j b̄_j|, and so the relation (1.3) implies that
  0 ≥ B²|a_j|² + C²|b_j|² − 2BC|a_j||b_j| = (B|a_j| − C|b_j|)².
Hence equality holds if and only if B|a_j| = C|b_j|, i.e., if and only if |a_j|/|b_j| is a constant for j = 1, 2, …, n. This completes the proof of the problem. □

1.5 Properties of Euclidean spaces

Problem 1.16 Rudin Chapter 1 Exercise 16.

Proof. Let m be the mid-point of x and y and u = z − m = (u1, u2, …, u_k). Geometrically, these conditions say that the three vectors x, y and u form an isosceles triangle with sides r, r and d. In other words, u must satisfy the equations
  u · (x − y) = 0  and  |u|² = r² − d²/4.    (1.4)


(a) Since |x − y| = d > 0, we have from Theorem 1.37(b) that x and y are distinct, and then we may assume without loss of generality that x1 ≠ y1. If u2, u3, …, u_k are arbitrary, then we define
  u1 = −[u2(x2 − y2) + · · · + u_k(x_k − y_k)] / (x1 − y1).
Then it is clear that this vector u = (u1, u2, …, u_k) satisfies the equation u · (x − y) = 0. Since u2, u3, …, u_k are arbitrary, there are infinitely many u1 and then u. Next, suppose that u is a vector satisfying the equation u · (x − y) = 0. Since 2r > d, we must have r² − d²/4 > 0. If |u|² ≠ r² − d²/4, then we consider the vector ũ defined by
  ũ = [√(r² − d²/4) / |u|] u.
Thus it is easy to check that ũ satisfies both equations (1.4). In fact, this proves that there are infinitely many u satisfying both equations (1.4) and hence there are infinitely many z such that |z − x| = |z − y| = r.

(b) If 2r = d, then we have r² − d²/4 = 0 and so |u|² = 0. It follows from Theorem 1.37(b) that u = 0, i.e., z = m. This proves the uniqueness of such z.

(c) If 2r < d, then we have r² − d²/4 < 0 and it is clear that there is no u such that |u|² < 0. Hence there is no such z in this case.

When k = 1 or 2, the results will be different from those when k ≥ 3, and the analysis is given as follows:

• Case (i): k = 2. If 2r > d, then
  u = [√(r² − d²/4) / |ũ|] ũ
are the only vectors satisfying the equations (1.4), where
  ũ = ±(−((x2 − y2)/(x1 − y1)) u2, u2).
If 2r = d, then u = 0 is the only vector such that u · (x − y) = 0 and |u|² = 0. If 2r < d, then there is no such u. Hence, there are two such z if 2r > d, exactly one such z if 2r = d, and no such z if 2r < d.

• Case (ii): k = 1. Then there is no point u satisfying both u(x1 − y1) = 0 and |u|² = r² − d²/4 if 2r ≠ d. However, u = 0 is the only point such that u(x1 − y1) = 0 and |u|² = 0 if 2r = d. Hence, there is no such z if 2r ≠ d and only one such z if 2r = d.

This completes the proof of the problem. □
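The construction in part (a) is concrete enough to run numerically. The Python sketch below (the test vectors x, y and the radius r are arbitrary choices, and NumPy is assumed to be available) builds z = m + u with u orthogonal to x − y and |u|² = r² − d²/4, then checks |z − x| = |z − y| = r in R³.

```python
# Construct a point z in R^3 with |z - x| = |z - y| = r, following the
# recipe of the proof: z = m + u with u ⟂ (x - y) and |u|^2 = r^2 - d^2/4.
import numpy as np

x = np.array([1.0, 2.0, 0.0])     # arbitrary test data
y = np.array([4.0, -2.0, 1.0])
d = np.linalg.norm(x - y)
r = 0.75 * d                      # any r with 2r > d works

m = (x + y) / 2.0                 # midpoint of x and y
v = np.array([0.0, 0.0, 1.0])     # any vector not parallel to x - y
u = v - (v @ (x - y)) / ((x - y) @ (x - y)) * (x - y)   # project out x - y
u *= np.sqrt(r**2 - d**2 / 4.0) / np.linalg.norm(u)     # rescale to the right length
z = m + u

print(np.linalg.norm(z - x), np.linalg.norm(z - y), r)  # all three agree
```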


9

1.6. A supplement to the proof of Theorem 1.19

Proof. Let x = (x1 , x2 , . . . , xk ) and y = (y1 , y2 , . . . , yk ). Then we have 2

2

|x + y| + |x − y| =

k X j=1

2

2

[(xj + yj ) + (xj − yj ) ] = 2

k X j=1

x2j

+2

k X

yj2 = 2|x|2 + 2|y|2 .

j=1

Suppose that x and y are sides of a parallelogram. Then x + y and x − y are the diagonals of the parallelogram and the above result can be interpreted as follows: the sum of the squares of the lengths of the diagonals (left-hand side) is double to the sum of the squares of the lengths of the sides (right-hand side). This completes the proof of the problem.  Problem 1.18 Rudin Chapter 1 Exercise 18.

Proof. We define x = (x1 , x2 , . . . , xk ), where xj ∈ R for j = 1, 2, . . . , k. If x1 = x2 = · · · = xk = 0, then the element y = (1, 0, . . . , 0) satisfies the requirements that y 6= 0 and x · y = 0. Otherwise, without loss of generality, we may assume that x1 6= 0. If we define   x2 + x3 + · · · + xk , 1, . . . , 1 , y= − x1 then we still have y 6= 0 and x · y = 0. Let x be a non-zero real number. If y 6= 0, then Proposition 1.16(b) implies that x · y 6= 0. Hence this is not true if k = 1, finishing the proof of the problem.  Problem 1.19 Rudin Chapter 1 Exercise 19.

Proof. Let 3c = 4b − a and 3r = 2|b − a|. Since |x|2 = x · x (Definition 1.36), we have the following relations: |x − a| = 2|x − b| ⇔ |x|2 − 2a · x + |a|2 = 4|x|2 − 8b · x + 4|b|2 ⇔ 3|x|2 − 8b · x + 2a · x + 4|b|2 − |a|2 = 0

⇔ 9|x|2 − 24b · x + 6a · x + 12|b|2 − 3|a|2 = 0 ⇔ (3x − 4b + a) · (3x − 4b + a) = 4(b − a)(b − a)     4 1 4 1 4 ⇔ x − b + a · x − b + a = (b − a)(b − a) 3 3 3 3 9 1 2 4 ⇔ x − b + a = |b − a| 3 3 3 ⇔ |x − c| = r,

completing the proof of the problem.

1.6

A supplement to the proof of Theorem 1.19

Problem 1.20 Rudin Chapter 1 Exercise 20.

Proof. Let us recall the first two properties now: (I) α 6= ∅ and α 6= Q.



Chapter 1. The Real and Complex Number Systems

10

(II) If p ∈ α and q ∈ Q such that q < p, then q ∈ α. Suppose that “(III) If p ∈ α, then p < r for some r ∈ α” is deleted in the definition of a cut, see p. 17. Then it is easy to see that Step 2 and Step 3 are still satisfied. For Step 4, we still have the definition of addition of cuts: if α, β ∈ R, then α + β = {r + s | r ∈ α, s ∈ β}. We define 0∗ = {p ∈ Q | p ≤ 0}. It is clear that 0∗ satisfies (I) and (II), but it has the maximum element 0. It is also obvious that the addition so defined satisfies axioms (A1) to (A3). Let α ∈ R. If r ∈ α and s ∈ 0∗ , then either r + s < r or r + s = r which imply that r + s ∈ α, i.e., α + 0∗ ⊆ α. For any p ∈ α, we always have p + 0 = p so that p ∈ α + 0∗ , i.e., α ⊆ α + 0∗ . Hence we have α = α + 0∗ and the addition satisfies (A4). Let α = {p ∈ Q | p < 0} ∈ R. Assume that there was β ∈ R such that α + β = 0∗ . Since α 6= ϕ, we have p ∈ α and then β contains an element q ∈ Q such that p + q = 0. Since p is a negative rational, q must be a positive rational. By Theorem 1.20(b), we have 0 < q1 < q for some q1 ∈ Q. By (II), we have q1 ∈ β. However, we have p + q1 > 0 so that p + q1 ∈ / 0∗ by definition. This contradicts the assumption and hence we have the fact that the addition does not satisfy (A5). This completes the proof of the problem. 

CHAPTER

2

Basic Topology

2.1

The empty set and properties of algebraic numbers

Problem 2.1 Rudin Chapter 2 Exercise 1.

Proof. Assume that there was a set A such that ∅ is not a subset of it. Then ∅ contains an element which is not in A. However, ∅ has no element by definition. Hence no such element exists and ∅ must  be a subset of every set. This completes the proof of the problem. Problem 2.2 Rudin Chapter 2 Exercise 2.

Proof. We don’t use the hint to prove the result. For each positive integer n, we let Pn be the set of all polynomials of degree less than or equal to n with integer coefficients. Then it is easy to check that the mapping f : Pn → Zn+1 defined by f (a0 z n + a1 z n−1 + · · · + an−1 z + an ) = (a0 , a1 , . . . , an ) is bijective. By Example 2.5, it is clear that Z is countable. Thus it follows from Theorem 2.13 that Zn+1 is countable and thus Pn is also countable. For each p(z) ∈ Pn , we let Bp(z) be the set of all roots of p(z). Since a polynomial p(z) ∈ Pn has at most n (distinct) roots, Bp(z) is a finite set. By the corollary to Theorem 2.12, the set [ Bp(z) S= p(z)∈Pn

is at most countable. Since every positive integer is an algebraic number (consider the polynomial z − n), the set of all algebraic numbers A is infinite. Since A is a subset of S, we have A is countable, completing the proof of the problem.  Problem 2.3 Rudin Chapter 2 Exercise 3.

Proof. Assume that all real numbers were algebraic. Then Problem 2.2 implies that R is countable which contradicts the corollary of Theorem 2.43. Hence there exist real numbers which are not algebraic. This completes the proof of the problem.  11

Chapter 2. Basic Topology

2.2

12

The uncountability of irrational numbers

Problem 2.4 Rudin Chapter 2 Exercise 4.

Proof. The set of all irrational real numbers R \ Q is uncountable. Otherwise the corollary of Theorem 2.12 implies that R = (R \ Q) ∪ Q is countable, a contradiction to the corollary of Theorem 2.43. This completes the proof of the problem. 

2.3

Limit points, open sets and closed sets

Problem 2.5 Rudin Chapter 2 Exercise 5.

Proof. We let E0 = and

n1 o n = 1, 2, . . . , n

n o 1 E1 = 2 + n = 1, 2, . . . , n E = E0 ∪ E1 ∪ E2 .

n o 1 E2 = 4 + n = 1, 2, . . . n

It is clear that E0 , E1 and E2 are bounded. Furthermore, by Example 2.21(e), we know that each of E0 , E1 and E2 has exactly one limit point, namely 0, 2 and 4 respectively. Hence the E satisfies the  required conditions. This completes the proof of the problem. Problem 2.6 Rudin Chapter 2 Exercise 6.

Proof. Let p ∈ (E ′ )′ and Nr (p) be a neighborhood of p for some r > 0. Since p is a limit point of E ′ , Definition 2.18(b) implies that there exists a point q 6= p in Nr (p) such that q ∈ E ′ . By the definition of E ′ , q is a limit point of E and thus there exists a point s 6= q in Nh (q) such that s ∈ E for every h > 0. If we take h = 21 min(d(p, q), r − d(p, q)), then we have Nh (q) ⊂ Nr (p). Furthermore, we must have s 6= p. Otherwise, we have p ∈ Nh (q). If h = 12 d(p, q), then this fact implies that 1 d(p, q) < h = d(p, q), 2 a contradiction. If h = 21 (r − d(p, q)), then we have r − d(p, q) ≤ d(p, q), but d(p, q) < h = 21 (r − d(p, q)) which implies that 2d(p, q) < r − d(p, q), a contradiction again. Thus what we have shown is that every neighborhood Nr (p) of p contains a point s 6= p such that s ∈ E. By Definition 2.18(b), p is a limit point of E and hence p ∈ E ′ and E ′ is closed by Definition 2.18(d). See Figure 2.1 for the neighborhoods Nh (q) and Nr (p) below.

13

2.3. Limit points, open sets and closed sets

Figure 2.1: The neighborhoods Nh (q) and Nr (p).

We first show that (E)′ ⊆ E ′ . Suppose that p ∈ (E)′ . Then p is a limit point of E = E ′ ∪ E, so for every r > 0, there exists q ∈ Nr (p) and q 6= p such that q ∈ E = E ′ ∪ E. If q ∈ E, then p is already a limit point of E and thus p ∈ E ′ . If q ∈ E ′ , then the argument in the previous paragraph can be applied to obtain that p ∈ E ′ . So we have (E)′ ⊆ E ′ . Conversely, suppose that p ∈ E ′ and Nr (p) is a neighborhood of p for some r > 0. Since p is a limit point of E, Definition 2.18(b) implies that there exists a point q 6= p in Nr (p) such that q ∈ E. Recall that E = E ∪ E ′ , we must have q ∈ E and thus p is a limit point of E. Hence E ′ ⊆ (E)′ and then E ′ = (E)′ . The sets E and E ′ may have different limit points. For example, we consider the set E = {1, 12 , 13 , . . .} whose only limit point is 0, i.e., E ′ = {0}. Since E ′ has only one element, Theorem 2.20 implies that  (E ′ )′ = ∅. This completes the proof of the problem. Problem 2.7 Rudin Chapter 2 Exercise 7.

Proof. (a) If x ∈ Bn , then x ∈ Bn or x ∈ Bn′ . If x ∈ Bn , then x ∈ Ai for some i ∈ {1, 2, . . . , n}, so x ∈ Ai . Therefore we have n [ x∈ Ai . i=1

Bn′ ,

If x ∈ then x is a limit point of Bn . We claim that x ∈ A′i for some i ∈ {1, 2, . . . , n}. Assume that x 6∈ A′i for all i ∈ {1, 2, . . . , n}. Then for each i ∈ {1, 2, . . . , n}, there exists a neighborhood Nri (x) of x for some ri > 0 such that Nri (p) ∩ Ai = ∅. Let r = min {ri } > 0.a Then we must 1≤i≤n

have Nr (x) ∩ Ai = ∅ for all i ∈ {1, 2, . . . , n}, so Nr (x) ∩ Bn = ∅ by Remarks 2.11, contradicting the fact that x is a limit point of Bn . Hence we must have x ∈ A′i for some i ∈ {1, 2, . . . , n} and

a We

can define this r because there are only finitely many subsets, namely A1 , A2 , . . . , An , of a metric space.

Chapter 2. Basic Topology

14

then Bn ⊆ Conversely, if x ∈

n [

i=1

n [

Ai .

(2.1)

i=1

Ai , then x ∈ Ai and thus x ∈ Ai or x ∈ A′i for some i ∈ {1, 2, . . . , n}. If

x ∈ Ai , then we have x ∈ Bn ⊆ Bn . If x ∈ A′i , then x is a limit point of Ai . Thus there exists y ∈ Nr (x) for every r > 0 and y 6= x such that y ∈ Ai . Since Ai ⊆ Bn by definition, x is also a limit point of Bn , i.e., x ∈ Bn′ ⊆ Bn . Both cases together imply that n [

i=1

Ai ⊆ Bn .

(2.2)

Hence the above set relations (2.1) and (2.2) imply that Bn =

n [

Ai ,

i=1

for n = 1, 2, . . .. (b) The result follows from a similar argument of part (a). However, the inclusion may be proper. For example, we consider Ai = { 1i } for all i ∈ N, so we have B = {1, 21 , 13 , . . . , }. By the corollary of Theorem 2.20, we have A′i = ∅ and thus Ai = { 1i } for all i ∈ N. Thus we have o n 1 1 Ai = 1, , , . . . , 2 3 i=1 ∞ [

but B = {0, 1, 12 , 13 , . . .}. Hence we have B ⊃ This completes the proof of the problem.

∞ [

Ai .

i=1



Problem 2.8 Rudin Chapter 2 Exercise 8.

Proof. Let p = (p1 , p2 ) ∈ E. Since E is open in R2 , we have Ns (p) ⊆ E for some s > 0. In other words, there exists q = (q1 , q2 ) ∈ Ns (p) such that (q1 , q2 ) 6= (p1 , p2 ). Let r > 0. If s ≤ r, then it is obvious that (q1 , q2 ) ∈ Ns (p) ⊆ Nr (p) in this case. If r < s, then we consider the point p′ = (p1 + 21 r, p2 ). Since 0
d1(3, 1) = 4 > 2 = d1(3, 2) + d1(2, 1). Hence d1 is not a metric.

• For d2: It is easy to check that the function d2 satisfies Definition 2.15(a) and (b). For any non-negative real numbers a and b, we have a + b ≤ a + 2√(ab) + b. This certainly implies that √(a + b) ≤ √a + √b, and so for any x, y, z ∈ R,
  d2(x, y) = √|x − y| ≤ √(|x − z| + |z − y|) ≤ √|x − z| + √|z − y| = d2(x, z) + d2(z, y).
Hence d2 is a metric.

• For d3 and d4: The function d3 is not a metric because d3(1, −1) = |1² − (−1)²| = 0. Similarly, the function d4 is not a metric either, because d4(1, 1) = |1 − 2| = 1.

• For d5: It is clear that d5 satisfies Definition 2.15(a) and (b). To show that d5 also satisfies Definition 2.15(c), we need a lemma first:

Lemma 2.1
Suppose that a, b and c are non-negative real numbers. If a ≤ b + c, then we have
  a/(1 + a) ≤ b/(1 + b) + c/(1 + c).

17

2.5. Compact sets Proof of Lemma 2.1. Since 0 ≤ a ≤ b + c, we have 1 ≤ 1 + a ≤ 1 + b + c and then 1+a−1 1 1 b c b c a = = 1− ≤1− = + ≤ + , 1+a 1+a 1+a 1+b+c 1+b+c 1+b+c 1+b 1+c completing the proof of the lemma.



Now if we put a = |x − y|, b = |x − z| and c = |y − z| into Lemma 2.1, we immediately have the result that d5 (x, y) ≤ d5 (x, z) + d5 (z, y). Hence d5 is a metric. Now we end the proof of the problem.

2.5



Compact sets

Problem 2.12 Rudin Chapter 2 Exercise 12. S Proof. Let K = {0, 1, 21 , 13 , . . .} and {Gα } be a collection of open subsets of R such that K ⊆ α Gα . Then we must have 0 ∈ Gα1 for some α1 . Since Gα1 is open in R, 0 is an interior point of Gα1 by Definition 2.18(f). Thus there exists an interval (a, b), where a < 0 < b, such that (a, b) ⊆ Gα1 . By Theorem 1.20(a) (the Archimedean property), there exists a positive integer N such that N b > 1, i.e., b > N1 . Therefore, we have n1 ∈ (a, b) ⊆ Gα1 for all positive integers n ≥ N . We rewrite K = {1, 12 , . . . , N1−1 } ∪ {0, N1 , N1+1 , . . .}. Now it follows from the previous paragraph that n

0,

o 1 1 , , . . . ⊆ Gα1 . N N +1

In addition, since {1, 21 , . . . , N1−1 } is a finite set, there are finitely many Gα2 , Gα3 , . . . , Gαm such that

Hence we have

n

1 1 o 1, , . . . , ⊆ Gα2 ∪ Gα3 ∪ · · · ∪ Gαm . 2 N −1 K⊆

m [

Gαi ,

i=1

i.e., K is compact by Definition 2.32, completing the proof of the problem.



Problem 2.13 Rudin Chapter 2 Exercise 13.

Proof. We define n o n1 o 1 1 1 K0 = 0, 1, , , . . . and Kn = + m = n, n + 1, . . . , 2 3 n m

where n = 1, 2, . . .. We also define K = 1 n

∞ [

Kn . n=0 1 1 2 , 3 , . . .}

By definition, K0 and each Kn have limit point 0 and

respectively. Therefore we have {0, 1, ⊆ K ′. We claim that K ′ = K0 . Let p ∈ R be a limit point of K. If p < 0, then we define δ = |p| 2 and thus (p − δ, p + δ) ∩ K = ∅. If p > 2, then we define δ = p−2 and thus (p − δ, p + δ) ∩ K = ∅. If 1 < p ≤ 2, 2

Chapter 2. Basic Topology

18

then we define δ = 12 min(p − 1, 2 − p) and thus the set (p − δ, p + δ) ∩ K contain only finitely many points of K. Hence we have shown that K ′ ⊆ [0, 1]. Next, we suppose that p ∈ [0, 1]\K0 . By this assumption, there exists a positive integer k such that 1 1 1 2 < p < k1 . Since n1 + n1 ≥ n1 + m and n1 + n1 ≥ m +m for all m ≥ n, k+m is the maximum of the set 1 1 1 1 , k1 ). If δ = 12 ( k1 − p), Kk+m ∪ Kk+m+1 ∪ · · · . Define δ = 2 min(p − k+1 , k − p), so (p − δ, p + δ) ⊂ ( k+1 then we have 2k − 1 3p 2 1 1 3 1 = p−δ = > − > − 2 2 2k 2 k+1 k 2(k + 1) k+m 1 k+1

2

for all m > 4(k+1) 2k−1 − k. In this case, the interval (p − δ, p + δ) contains only finitely many points of 1 K1 ∪ K2 ∪ · · · ∪ Km−1 . Similarly, if δ = 21 (p − k+1 ), then we have p−δ =

1 1 2 p + > > 2 2(k + 1) k+1 k+m

for all m > k+2. In this case, the interval (p−δ, p+δ) contains only finitely many points of K1 ∪· · ·∪Km−1 . Now both cases show that the interval (p − δ, p + δ) can only contain finitely many points of K and thus it is not a limit point of K by Theorem 2.20. Therefore we have K ′ = K0 ⊂ K so that K is closed. Since |x| ≤ 2 for all x ∈ K, the set K is a bounded set. Hence it follows from Theorem 2.41 (the Heine-Borel theorem) that K is compact, completing the proof of the problem.  Problem 2.14 Rudin Chapter 2 Exercise 14.

Proof. For each n = 2, 3, . . ., we consider the interval Gn = ( n1 , 1). If x ∈ (0, 1), then it follows from Theorem 1.20(a) (the Archimedean property) that there exists a positive integer n such that nx > 1, i.e., x ∈ Gn . Furthermore, we have ∞ [ Gn , (0, 1) ⊆ i=2

i.e., {G2 , G3 , . . .} is an open cover of the segment (0, 1). Assume that {Gn1 , Gn2 , . . . , Gnk } was a finite subcover of (0, 1), where n1 , n2 , . . . , nk are positive integers and 2 ≤ n1 < n2 < · · · < nk . By definition, we have Gn1 ⊆ Gn2 ⊆ · · · ⊆ Gnk and so (0, 1) ⊆

k [

i=1

Gni ⊆ Gnk ,

contradicting to the fact that 2n1k ∈ (0, 1) but 2n1k ∈ / ( n1k , 1). Hence {G2 , G3 , . . .} does not have a finite subcover for (0, 1). This completes the proof of the problem. 

2.6

Further topological properties of R

Problem 2.15 Rudin Chapter 2 Exercise 15.

Proof. We take the metric space to be the real number line R. • Example 1. Let En = [n, ∞), where n = 1, 2, 3, . . .. It follows from the corollary of Theorem 2.23 that each En is closed in R. Furthermore, if 1 ≤ n1 < n2 < · · · < nk , then we have k \

i=1

Eni = [nk , ∞) 6= ∅,

2.6. Further topological properties of R

19 but

∞ \

n=1

En = ∅ because if x is a real number such that x ∈

contradiction.

∞ \

n=1

En , then x > n for all n ∈ N, a

• Example 2. Let Fn = (0, n1 ), where n = 1, 2, 3, . . .. It is clear that each Fn is bounded. If 1 ≤ n1 < n2 < · · · < nk , then we have  1  Fni = 0, 6= ∅, nk i=1 k \

but

∞ \

n=1

Fn = ∅ because if x is a real number such that x ∈

contradiction.

∞ \

Fn , then x
0, there exists q ∈ E and q 6= p such that |p − q| < r. In particular, if we take r = n1 , where n is a large positive integer, then we have 0 q2 − > q 2 − δ > 2. (2.4) n n n n Thus the inequalities (2.5) and (2.6) imply that p ∈ E and hence E is closed by Definition 2.18(d). Assume that E was compact in Q. For each n = 1, 2, . . ., we let the sets Vn = {p ∈ Q | 2− n1 < p2 < 3}, q  q  √ √  Vn− = − 3, − 2 − n1 and Vn+ = 2 − n1 , 3 . Then it is easy to see that p2 > q −

Vn = Vn− ∪ Vn+ . √ √ √ √ If p ∈ E, then either − 3 < p < − 2 or 2 < p < 3 which implies that either r r √ √ 1 1 − 3 0. By definition, we have A = {q ∈ X | d(p, q) < δ} and B = {q ∈ X | d(p, q) > δ} which are obviously disjoint open sets by Theorem 2.19. By the result of part (b), A and B are separated. (d) Let X be a connected metric space with at least two (distinct) points. Let the two points be a and b. Assume that X was countable, i.e., X = {a, b, x3 , x4 , . . .}. We define r = d(a, b). Since a and b are distinct, it follows from Definition 2.15(a) that r > 0. Since [0, r] = (0, r) ∪ {0, r}, (0, r) must be uncountable by the corollary of Theorem 2.43. Therefore there exists δ ∈ (0, r) such that d(a, x) 6= δ for all x ∈ X. Now if we define A = {x ∈ X | d(a, x) < δ} and B = {x ∈ X | d(a, x) > δ}, then they are nonempty (a ∈ A and b ∈ B) and the result of (c) shows that A and B are separated. However, X = A∪B so that X is not connected by Definition 2.45. This contradiction proves the problem. We complete the proof of the problem.



Chapter 2. Basic Topology

22

Problem 2.20 Rudin Chapter 2 Exercise 20.

Proof. Let E be a connected set. Assume that E = A ∪ B, where A ∩ B = A ∩ B = ∅. Since A ∩ E ⊆ A and B ∩ E ⊆ B, we have A ∩ E ⊆ A and B ∩ E ⊆ B. Since A ⊆ A, we have A ∩ B ⊆ A ∩ B = ∅. Since E ⊆ E = A ∪ B, we have E = E ∩ E = E ∩ (A ∪ B) = (E ∩ A) ∪ (E ∩ B). Since (A ∩ E) ∩ (B ∩ E) ⊆ A ∩ B = ∅ and (B ∩ E) ∩ (A ∩ E) ⊆ B ∩ A = ∅, we have E is disconnected which is a contradiction. Hence E must be connected. However, the interior of a connected set may not be connected. We prove the following lemma first: Lemma 2.3 Suppose that A and B are connected sets and A ∩ B 6= ∅. Then the set E = A ∪ B is also connected.

Proof of Lemma 2.3. Assume that E was not connected. By definition, there are separated sets U and V such that U ∪ V = E = A ∪ B. Let UA = U ∩ A and VA = V ∩ A. Since U and V are separated, we have UA ∩ VA ⊆ (U ∩ A) ∩ (V ∩ A) = (U ∩ V ) ∩ (A ∩ A) = ∅ ∩ A = ∅ and similarly UA ∩ VA = ∅. Since UA ∪ VA = A, A is not connected which contradicts the hypothesis. Therefore we have either UA = ∅ or VA = ∅.

(2.7)

Similarly, the sets UB = U ∩ B and VB = V ∩ B will imply that B is not connected, a contradiction again. Therefore we have either UB = ∅ or VB = ∅.

(2.8)

Now we verify that the cases (2.7) and (2.8) will induce a contradiction. For examples, if UA = ∅ and UB = ∅, then U ∩ A = ∅ and U ∩ B = ∅ but they imply that U = U ∩ E = U ∩ (A ∪ B) = (U ∩ A) ∪ (U ∩ B) = ∅ which contradicts U 6= ∅; if UA = ∅ and VB = ∅, then A ∩ U = ∅ and B ∩ V = ∅ which give (A ∩ B) ∩ U = ∅ and (A ∩ B) ∩ V = ∅ respectively. Thus these imply that ∅ = [(A ∩ B) ∩ U ] ∪ [(A ∩ B) ∩ V ] = (A ∩ B) ∩ (U ∪ V ) = (A ∩ B) ∩ E = A ∩ B which contradicts the fact that A ∩ B 6= ∅. Other cases can be done similarly, so we have the  desired result that E is connected. Let’s go back to the proof of the problem. For example, we consider the disks A = {x ∈ R2 | |x| ≤ 1} and B = {x ∈ R2 | |x − (2, 0)| ≤ 1}. Define E = A ∪ B. By Definition 2.17, both A and B are convex so that we obtain from Problem 2.21(c) that A and B are connected. It is clear that x = (1, 0) is the (only) common point of A and B. Hence it follows from Lemma 2.3 that E is connected. However, it is easy to check from the definition that E ◦ = A◦ ∪ B ◦ , where A◦ = {x ∈ R2 | |x| < 1} and B ◦ = {x ∈ R2 | |x − 2| < 1}. Since A◦ and B ◦ are clearly separated,  E ◦ is not connected. We end the proof of the problem.

23

2.7. Properties of connected sets Problem 2.21 Rudin Chapter 2 Exercise 21.

Proof. (a) By definition, we have A0 = {t ∈ R | p(t) ∈ A} and B0 = {t ∈ R | p(t) ∈ B}. If t ∈ A0 ∩ B0 , then t ∈ A0 and t ∈ B0 . Since t ∈ B0 , we have p(t) ∈ B. Since t ∈ A0 , we have t ∈ A0 or t ∈ A′0 . If t ∈ A0 , then p(t) ∈ A. Thus p(t) ∈ A ∩ B ⊆ A ∩ B and then A and B are not separated, a contradiction. Suppose that t ∈ A′0 . Then t is a limit point of A0 . Therefore we have s ∈ Nh (t) and s 6= t such that s ∈ A0 for every h > 0. Since s ∈ Nh (t), we have |s − t| < h and thus |p(s) − p(t)| = |(−s + t)a + (s − t)b| ≤ |s − t|(|a| + |b|) < h(|a| + |b|). In fact, this implies that p(Nh (t)) = Nh(|a|+|b|)(p(t)) so that p(s) ∈ Nh(|a|+|b|)(p(t)). If p(s) = p(t), then we have a = b which contradicts the fact that A and B are separated. Thus we have p(s) 6= p(t). Recall that s ∈ A0 so that p(s) ∈ A. Hence we have actually shown that p(t) is a limit point of A, i.e., p(t) ∈ A which gives p(t) ∈ A ∩ B and then A and B are not separated, a contradiction. (b) Assume that p(t) ∈ A ∪ B for all t ∈ (0, 1). Since p(0) = a and p(1) = b, we have p(t) ∈ A ∪ B for all t ∈ [0, 1] and thus [0, 1] ⊆ A0 ∪ B0 by definition. Let C0 = [0, 1] ∩ A0 and D0 = [0, 1] ∩ B0 . Then C0 and D0 are nonempty because p(0) = a ∈ A and p(1) = b ∈ B. Furthermore, since A0 and B0 are separated, we have C0 ∩ D0 = ∅ and C0 ∩ D0 = ∅. Therefore [0, 1] is not connected which contradicts Theorem 2.47. Hence we have p(t0 ) ∈ / A ∪ B for some t0 ∈ (0, 1). (c) By Definition 2.17, a set S ⊆ Rk is said to be convex if, for all x, y ∈ S and all t ∈ (0, 1), we have (1 − t)x + ty ∈ S, see Figure 2.2.

Figure 2.2: Convex sets and nonconvex sets.



Assume that S was not connected. Then we have S = A ∪ B, where A and B are separated. Pick a ∈ A and b ∈ B. By part (b), there exists t0 ∈ (0, 1) such that p(t0 ) ∈ / A ∪ B = S, so S is not convex which is contrary to our hypothesis. Hence S must be connected. This completes the proof of the problem. 

2.8

Separable metric spaces and bases and a special case of Baire’s theorem

Problem 2.22 Rudin Chapter 2 Exercise 22.

Proof. Let Qk = {(q1 , q2 , . . . , qk ) | q1 , q2 , . . . , qk ∈ Q}. Since Q is countable, Theorem 2.13 shows that Qk is countable. To prove that Rk is separable, we must show that every nonempty open subset of Rk contains at least one element in Qk .c Let S be an open set of Rk and p = (p1 , p2 , . . . , pk ) ∈ S. By Definition 2.18(f), there exists h > 0 such that B(p, h) ⊆ S. By definition, B(p, h) = {x ∈ Rk | d(x, p) < h}. If we h h h take x = (p1 − 2k , p2 − 2k , . . . , pk − 2k ), then we have d(x, p) =

    ( Σ_{i=1}^{k} |p_i − h/(2k) − p_i|² )^{1/2} = √k · h/(2k) = h/(2√k) ≤ h/2 < h

so that x ∈ B(p, h). Since p_i − h/(2k) < p_i, where i = 1, 2, . . . , k, Theorem 1.20(b) shows that there exist q_i ∈ Q such that p_i − h/(2k) < q_i < p_i, where i = 1, 2, . . . , k. Let q = (q_1, q_2, . . . , q_k). By Definition 2.15(c), we have

    d(q, p) ≤ d(q, x) + d(x, p) < h/2 + h/2 = h.

Hence we have q ∈ B(p, h) ⊆ S and Qk is dense in Rk . This ends the proof of the problem.


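The density argument above is easy to check numerically. The following Python sketch is a supplementary illustration, not part of the original solution; the function name rational_point_near and the per-coordinate tolerance are our own choices. It produces a point of Q^k within any prescribed distance h of a given p ∈ R^k, mirroring the coordinate-wise choice of the q_i in the proof.

    from fractions import Fraction
    import math

    def rational_point_near(p, h):
        # return a point with rational coordinates within Euclidean distance h of p
        k = len(p)
        tol = h / (2 * math.sqrt(k))              # per-coordinate error budget
        return [Fraction(x).limit_denominator(int(1 / tol) + 1) for x in p]

    p = [math.pi, math.e, -0.75]
    q = rational_point_near(p, 1e-6)
    dist = math.sqrt(sum((float(qi) - xi) ** 2 for qi, xi in zip(q, p)))
    print(q)
    print(dist)                                   # strictly less than 1e-6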

Problem 2.23 Rudin Chapter 2 Exercise 23.

Proof. Let X be a separable metric space. Since X is separable, it contains a countable dense subset. Let C = {x1 , x2 , . . .} be a countable dense subset of X. Let B = {Nqm (xk )} be a collection of subsets of X, where qm ∈ Q+ and xk ∈ C. (Here Q+ is the set of all positive rational numbers.) By Theorem 2.19, each Nqm (xk ) is open in X. Since C and Q+ are countable, the set B is also countable by Theorem 2.13. Suppose that x ∈ X and G is an open subset of X containing x. Since G is open, x is an interior point of G and thus we have Nh (x) ⊆ G for some h > 0, see the black dotted circle in Figure 2.3. Since C is dense in X, there exists xk ∈ C such that xk ∈ N h (x), see the red dotted circle in Figure 2 2.3. It is clear from Theorem 1.20(b) that d(x, xk ) < qm < h2 for some qm ∈ Q+ so that x ∈ Nqm (xk ). Since d(x, xk ) < qm < h2 , for every y ∈ Nqm (xk ), we have d(y, x) ≤ d(y, xk ) + d(xk , x) < qm + qm
0, since X 6= ∅, we can pick x1 ∈ X. Then we can choose x2 ∈ X such that d(x1 , x2 ) ≥ δ. Otherwise, we have d(x1 , x) < δ for all x ∈ X which means that X ⊆ Nδ (x1 ).

In this case, we replace δ by δ/2 and choose x_2 ∈ X such that d(x_1, x_2) ≥ δ/2. Otherwise, we have X ⊆ N_{δ/2}(x_1). Then the process can be repeated but it must stop after a finite number of steps because

    X ⊆ N_{δ/2^n}(x_1)

for all nonnegative integers n. However, this implies that X = ∅, a contradiction. Let j be a positive integer. Having chosen x1 , . . . , xj ∈ X, choose xj+1 ∈ X, if possible, so that d(xi , xj+1 ) ≥ δ for i = 1, . . . , j. In other words, x1 , x2 . . . , xj+1 are distinct elements. This process must stop after a finite number of steps. Otherwise, we have the infinite subset E = {x1 , x2 , . . .} of X and the hypothesis guarantees that E has a limit point x ∈ X. By Theorem 2.20, every neighborhood of x contains infinitely many points of E. Therefore we have xj , xj+1 , . . . ∈ N δ (x) for some positive integer j 2 so that d(xi , x) < δ2 for all i = j, j + 1, . . .. By this, we have d(xj , xj+1 ) ≤ d(xj , x) + d(x, xj+1 )
0. By Theorem 1.20(a) (the Archimedean property), there exists a positive integer n such that n1 < h. Fix this n and we follow from the relation (2.9) that x ∈ N n1 (xnk ) for some k ∈ {1, 2, . . . , mn }. By definition, we have d(x, xnk )
0 such that Nδ (x) ⊆ Gn . Let m be a positive integer greater than n. If xm ∈ Nδ (x), then we have xm ∈ Gn and so xm ∈ G1 ∪ · · · ∪ Gn ∪ · · · ∪ Gm

which implies that xm ∈ / Fm , a contradiction. Therefore x is not a limit point of E which is a contradiction. Hence we have the desired result that X is compact. This completes the proof of the problem.  d We

don’t require that ni = mi in general.


Problem 2.27 Rudin Chapter 2 Exercise 27.

Proof. Suppose that E ⊆ Rk , E is uncountable, and let P be the set of all condensation points of E. • At most countably many points of E are not in P . This statement is equivalent to the statement that “P c ∩ E is at most countable.” By Problem 2.22, we know that Rk is separable and then Problem 2.23 implies that it has a countable base. Let {Vn } be a countable base of Rk , let W be the union of those Vn for which E ∩ Vn is at most countable.

Let p ∈ P . Assume that p ∈ W . Since W is open by Theorem 2.24(a), we have p ∈ Vk ⊆ W for some positive integer k because {Vn } is a base of X (see Problem 2.23). Since Vk is open, we have Nh (p) ⊆ Vk for some h > 0. By definition of W , E ∩ Vk is at most countable and so is E ∩ Nh (p), but this contradicts the fact that Nh (p) ∩ E is uncountable. Hence we have p ∈ W c so that P ⊆ W c. If p ∈ W c , then p ∈ / W . Assume that Nh (p) was a neighborhood of p such that Nh (p) ∩ E has at most countably many points for some h > 0. Since {Vn } is a base of X and Nh (p) is open, there exists Vk such that p ∈ Vk ⊆ Nh (p) for some positive integer k. Since Nh (p) ∩ E is at most countable, Vk ∩ E is at most countable too and this implies that p ∈ Vk ⊆ W, a contradiction. Hence Nh (p) ∩ E must be uncountable for all h > 0 so that p ∈ P , i.e., W c ⊆ P . Now the above two paragraphs show our desired result that P = W c.

(2.11)

• P is perfect. Since W is open, P = W c is closed by the relation (2.11) and Theorem 2.23.e Next, we let x ∈ P and Nh (x) be a neighborhood of x, where h > 0. Assume that Nh (x) ∩ P = {x}. This means that if y ∈ Nh (x) \ {x}, then y ∈ P c and it is easy to get from the relation (2.11) that y ∈ W . Thus we have Nh (x) \ {x} ⊆ W . Since Nh (x) = (Nh (x) \ {x}) ∪ {x}, we have Nh (x) ⊆ W ∪ {x} and so Nh (x) ∩ E ⊆ (W ∪ {x}) ∩ E ⊆ (W ∩ E) ∪ {x}. (2.12) By definition, W is the union of those Vn for which E ∩ Vn is at most countable. By the corollary of Theorem 2.12, we have W ∩ E is at most countable and then (2.12) implies that Nh (x) ∩ E is at most countable too. This implies that x ∈ / P , a contradiction. Hence there exists y ∈ Nh (x) and y 6= x such that y ∈ P for every h > 0. In other words, x is a limit point of P and we obtain from Definition 2.18(h) that P is perfect. This completes the proof of the problem.



Problem 2.28 Rudin Chapter 2 Exercise 28.

e We can prove that P is closed directly without using the relation (2.11): Let p be a limit point of P . Then for every r > 0, the neighborhood Nr (p) of p contains a point q 6= p such that q ∈ P . By definition, q is a condensation point of E, so the neighborhood Nh (q) contains uncountably many points of E, where h = 12 min(d(p, q), r − d(p, q)). Since we have Nh (q) ⊂ Nr (p) (see Figure 2.1 for clarification), p is also a condensation point of E. That is p ∈ P and so P is closed.



Proof. Let E be a closed set in a separable metric space X. If E is at most countable, then we have E = ∅ ∪ E. By Theorem 2.23, the empty set is closed. Since ∅ contains no point, it contains no isolated point. Thus the empty set must be perfect and we are done in this case. Suppose that E is uncountable. If we read the proof of Problem 2.27 in detail, then it can be seen that the proof does not depend on the metric space in which P is embedded. In fact, what the proof requires is that the space X has a countable base and this is automatically satisfied because of Problem 2.23. Therefore the set P of all condensation points of E is perfect. By definition, a condensation point of E must be a limit point of E so that P ⊆ E ′ . Since E is closed, we have P ⊆E by Theorem 2.27(b). Let F = E \ P . By Problem 2.27 again, the set F must be at most countably many points. Hence we have E =P ∪F and this completes the proof of the problem.



Problem 2.29 Rudin Chapter 2 Exercise 29.

Proof. Let G be an open set in R and x ∈ G. We shall construct the segments with the required properties. The idea of the construction is that for each x ∈ G, we have to find the maximal segment containing x and show that such segment is a subset of G. Next, we have to show that two maximal segments are disjoint. The construction is divided into several steps. • Step 1: Let Ex = {y ∈ R | (x, y) ⊆ G} and Fx = {y ∈ R | (y, x) ⊆ G}. Since G is open, we have x ∈ (x − h, x + h) ⊆ G for some h > 0 so that x + h ∈ Ex , x − h ∈ Fx and thus Ex and Fx are nonempty. We define ax = inf Fx , bx = sup Ex and the segment Ix = (ax , bx ).f

(2.13)

• Step 2: We show that Ix ⊆ G. By definition, we have x ∈ Ix . Let p ∈ Ix \ {x}. Then we have either ax < p < x or x < p < bx . Suppose that ax < p < x. Then it follows from Theorem 1.20(b) (or by Problem 2.22) that ax < q < p for some q ∈ Q. Since ax is the greatest lower bound of Fx , we must have y < q for some y ∈ Fx so that (q, x) ⊂ (y, x) ⊆ G. (2.14)

Since q < p < x, we obtain from the relation (2.14) that p ∈ G so that (ax , x) ⊆ G. The other side is similar and we omit the details here. By these, we have proven Step 2 that Ix ⊆ G. For each x ∈ G, we have x ∈ Ix ⊆ G which implies that [ G= Ix . x∈G

• Step 3: The segment defined in (2.13) is the maximal segment containing x. To show this property, we must have ax ∈ / G. Otherwise, we have (ax − h, ax + h) ⊆ G for some h > 0 as G is an open set. This implies that (ax − h, x) ⊆ G so that ax − h ∈ Fx and then we have ax < ax − h, but ax is the greatest lower bound of Fx . This is clearly a contradiction. Hence we have ax ∈ / G. Similarly, we must have bx ∈ / G. Now if Ix was not the maximal segment containing x, then we have Ix ⊆ (a, b) ⊆ G, where a < ax or bx < b. These imply that ax ∈ G or bx ∈ G, but both lead to a contradiction. f Here

ax and bx can be possibly −∞ and +∞ respectively.


• Step 4: If r, s ∈ G, then we have either Ir = Is or Ir ∩ Is = ∅.

Let p ∈ Ir ∩ Is . Then it follows from Theorem 2.24(a) that Ir ∪ Is is an open set containing the point p. By Step 3 above, we have Ir = Ir ∪ Is and Is = Ir ∪ Is which imply that Ir = Is . We have shown from Step 1 to Step 4 that the open set G is a union of disjoint segments. By Theorem 1.20(b), each Ix contains at least one rational number. Since Q is countable, the union of disjoint segments is at most countable. This finishes the proof of the problem.  Problem 2.30 Rudin Chapter 2 Exercise 30.

Proof. Assume that the interior of every F_n was empty, that is, F_n◦ = ∅ for every n. Then we have ⋃_{n=1}^∞ F_n◦ = ∅ and so (⋃_{n=1}^∞ F_n◦)^c = R^k. By Theorem 2.22, we have

    ⋂_{n=1}^∞ (F_n◦)^c = ( ⋃_{n=1}^∞ F_n◦ )^c = R^k.    (2.15)

By Problem 2.9(d), we have (F_n◦)^c = \overline{F_n^c}, the closure of F_n^c. By this and the relation (2.15), we have ⋂_{n=1}^∞ \overline{F_n^c} = R^k which implies that

    R^k = \overline{F_n^c}    (2.16)

for every n. For every n, since Fn is closed, Fnc is open and thus this and the relation (2.16) deduce that each Fnc is a dense open subset of Rk . As suggested by Rudin, we can “imitate” the proof of Theorem 2.43 to obtain the required result. When we read the proof of Theorem 2.43 closely, we see that the core idea of it is to construct a shrinking sequence of nonempty compact sets Kn so that the corollary of Theorem 2.36 can be applied to obtain a contradiction. Now we follow this idea to construct such a shrinking sequence of nonempty compact sets in the following paragraph, see also Figure 2.4 for the idea of constructing the shrinking sequence.

Figure 2.4: The construction of the shrinking sequence.



Let G be an open set of Rk .g Since F1c is a dense subset of Rk , there exists p ∈ F1c such that p ∈ G, see Problem 2.22 and its footnote. Thus we have G1 = F1c ∩ G 6= ∅. Since F1c is open in Rk , G1 must be open in Rk by Theorem 2.24(c). Let p1 ∈ G1 .h Since G1 is open, we have Nr1 (p1 ) ⊆ G1 for some r1 > 0. Without loss of generality, we may assume that Nr1 (p1 ) ⊆ G1 . Since F2c is a dense subset of Rk , the set G2 = F2c ∩ Nr1 (p1 ) is a nonempty open subset of Rk . Let p2 ∈ G2 . Then we can choose r2 > 0 small enough such that Nr2 (p2 ) ⊆ G2 . By definition, we have Nr2 (p2 ) ⊆ G2 = F2c ∩ Nr1 (p1 ) ⊂ Nr1 (p1 ). Now we can continue this process to obtain the following shrinking sequence · · · ⊂ Nr3 (p3 ) ⊂ Nr2 (p2 ) ⊂ Nr1 (p1 ).

(2.17)

By Theorem 2.27(a), each \overline{N_{r_n}(p_n)} is closed. It is clear that each \overline{N_{r_n}(p_n)} is bounded, so Theorem 2.41 (the Heine-Borel theorem) implies that each member in (2.17) is compact. Therefore the corollary of Theorem 2.36 shows that ⋂_{n=1}^∞ \overline{N_{r_n}(p_n)} is nonempty. Since it is true that \overline{N_{r_n}(p_n)} ⊆ F_n^c for every n, we have

    ⋂_{n=1}^∞ \overline{N_{r_n}(p_n)} ⊆ ⋂_{n=1}^∞ F_n^c,

which says that the set ⋂_{n=1}^∞ F_n^c is nonempty. However, this and Theorem 2.22 lead to the result that

    ( ⋃_{n=1}^∞ F_n )^c ≠ ∅.

In other words, we have ⋃_{n=1}^∞ F_n ≠ R^k which contradicts our hypothesis. Hence there is at least one F_n having a nonempty interior. This completes the proof of the problem. 

g The construction of the required shrinking sequence can be “seen” from Figure 2.4.
h Here we don’t assume that p = p1.

CHAPTER

3

Numerical Sequences and Series

3.1

Problems on sequences

Problem 3.1 Rudin Chapter 3 Exercise 1.

Proof. Suppose that {s_n} converges to s. By Definition 3.1, for every ε > 0, there is an integer N such that n ≥ N implies that |s_n − s| < ε. By Problem 1.13, we have ||s_n| − |s|| ≤ |s_n − s| < ε for all n ≥ N. Hence the sequence {|s_n|} converges to |s|. However, the converse is not true. For example, we let s_n = (−1)^n so that {s_n} is divergent, but |s_n| = 1 → 1. This completes the proof of the problem. 

Problem 3.2 Rudin Chapter 3 Exercise 2.

Proof. We have

    lim_{n→∞} (√(n² + n) − n) = lim_{n→∞} (n² + n − n²)/(√(n² + n) + n) = lim_{n→∞} 1/(√(1 + 1/n) + 1) = 1/2.

This completes the proof of the problem.


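As a quick numerical illustration of this limit, the following Python snippet (a supplementary sketch, not part of the original solution; the sample values of n are arbitrary) evaluates √(n² + n) − n for increasing n, and the printed values approach 1/2 in agreement with the computation above.

    import math

    for n in [10, 10**3, 10**5, 10**7]:
        value = math.sqrt(n * n + n) - n   # the n-th term of the sequence
        print(n, value)
    # the printed values tend to 0.5 as n grows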

Problem 3.3 Rudin Chapter 3 Exercise 3.

Proof. It follows from s_1 = √2 < 2 and induction that

    s_{k+1} = √(2 + s_k) < √(2 + 2) = 2

when s_k < 2. Hence we have s_n < 2 for all n = 1, 2, 3, . . .. By a similar argument, we can show that s_n > 0 for all n = 1, 2, 3, . . .. Therefore we have

    0 < s_n < 2

for all n = 1, 2, 3, . . . so that {s_n} is bounded. Since s_n − 2 < 2 − 2 = 0 and

    s_{n+1} − s_n = √(2 + s_n) − s_n = −(s_n − 2)(s_n + 1)/(√(2 + s_n) + s_n),

we have s_{n+1} − s_n > 0 so that {s_n} is strictly increasing. By Theorem 3.14 (Monotone Convergence Theorem), we have {s_n} converges. This completes the proof of the problem. 
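A brief numerical illustration of the monotone convergence just proved (a supplementary sketch, not part of the original solution): iterating s_{n+1} = √(2 + s_n) from s_1 = √2 gives an increasing sequence bounded by 2, and the terms approach 2, the unique positive fixed point of s = √(2 + s).

    import math

    s = math.sqrt(2)              # s_1 = sqrt(2)
    for n in range(1, 11):
        print(n, s)
        s = math.sqrt(2 + s)      # s_{n+1} = sqrt(2 + s_n)
    # the printed values increase and approach 2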

Problem 3.4 Rudin Chapter 3 Exercise 4.

Proof. By checking the first few terms, we see that

    s_1 = 0, s_2 = 0, s_3 = 1/2, s_4 = 1/4, s_5 = 3/4, s_6 = 3/8, s_7 = 7/8, . . . .

Therefore we can show by induction that

    s_{2m} = (2^{m−1} − 1)/2^m and s_{2m+1} = (2^m − 1)/2^m,

where m = 1, 2, . . .. By Definition 3.16, we have E = {1/2, 1}. Hence we have

    lim sup_{n→∞} s_n = 1 and lim inf_{n→∞} s_n = 1/2,

completing the proof of the problem.


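The closed forms above can be confirmed by running the recursion itself. The Python sketch below is a supplementary illustration; it uses the recursion of Rudin's Exercise 4 (s_1 = 0, s_{2m} = s_{2m−1}/2, s_{2m+1} = 1/2 + s_{2m}) and prints the terms, whose even-indexed and odd-indexed subsequences approach 1/2 and 1 respectively.

    s = [None, 0.0]                   # s[1] = 0; index 0 is unused
    for n in range(2, 20):
        if n % 2 == 0:
            s.append(s[n - 1] / 2)    # s_{2m} = s_{2m-1} / 2
        else:
            s.append(0.5 + s[n - 1])  # s_{2m+1} = 1/2 + s_{2m}
    print([round(x, 4) for x in s[1:]])
    # even-indexed terms approach 1/2, odd-indexed terms approach 1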

Problem 3.5 Rudin Chapter 3 Exercise 5.

Proof. Let s = lim sup_{n→∞} (a_n + b_n), a = lim sup_{n→∞} a_n and b = lim sup_{n→∞} b_n. Suppose that the sum on the right is not of the form ∞ − ∞.

• Case (i): When a = +∞ or b = +∞. The inequality holds trivially in this case.

• Case (ii): When a < +∞ and b < +∞. By Theorem 3.17(a), there exists a subsequence {a_{n_k} + b_{n_k}} such that s = lim_{k→∞} (a_{n_k} + b_{n_k}).

  – If any one of {a_{n_k}} and {b_{n_k}} converges, say {a_{n_k}}, then the equation b_{n_k} = (a_{n_k} + b_{n_k}) − a_{n_k} implies that {b_{n_k}} also converges and it follows from Theorem 3.3(a) and Definition 3.16 that

        s = lim_{k→∞} (a_{n_k} + b_{n_k}) = lim_{k→∞} a_{n_k} + lim_{k→∞} b_{n_k} ≤ a + b.

  – If both {a_{n_k}} and {b_{n_k}} diverge but {a_{n_k} + b_{n_k}} converges,a then we consider

        α = lim sup_{k→∞} a_{n_k} and β = lim sup_{k→∞} b_{n_k}.

    By Definition 3.16, we have

        lim sup_{k→∞} a_{n_k} ≤ lim sup_{n→∞} a_n and lim sup_{k→∞} b_{n_k} ≤ lim sup_{n→∞} b_n

a For example, a_{n_k} = (−1)^{k+1} and b_{n_k} = (−1)^k so that a_{n_k} + b_{n_k} = 0 for all positive integers k.


so that α and β are finite. By Theorem 3.17(a) again, there exists a subsequence {a_{n_{k_j}}} of the divergent sequence {a_{n_k}} such that lim_{j→∞} a_{n_{k_j}} = α.

We note that {b_{n_k}} diverges, so it may happen that its subsequence also diverges. However, we recall from Definition 3.5 that a sequence {p_n} converges to p if and only if every subsequence of {p_n} converges to p. Since s = lim_{k→∞} (a_{n_k} + b_{n_k}), the subsequence {a_{n_{k_j}} + b_{n_{k_j}}} converges to this s too. Thus it follows from Theorem 3.3(a) that

    lim_{j→∞} b_{n_{k_j}} = lim_{j→∞} [(a_{n_{k_j}} + b_{n_{k_j}}) − a_{n_{k_j}}] = lim_{j→∞} (a_{n_{k_j}} + b_{n_{k_j}}) − lim_{j→∞} a_{n_{k_j}} = s − α.

By this, we know that the subsequence {b_{n_{k_j}}} of the divergent sequence {b_{n_k}} is convergent. Hence we obtain from the definition of s and Theorem 3.3(a) that

    lim sup_{n→∞} (a_n + b_n) = lim_{k→∞} (a_{n_k} + b_{n_k})
                              = lim_{j→∞} (a_{n_{k_j}} + b_{n_{k_j}})
                              = lim_{j→∞} a_{n_{k_j}} + lim_{j→∞} b_{n_{k_j}}
                              ≤ α + β
                              ≤ a + b
                              = lim sup_{n→∞} a_n + lim sup_{n→∞} b_n.

This completes the proof of the problem.
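As a quick sanity check of the inequality, and of the footnoted example showing that it can be strict, the following Python sketch (a supplementary illustration; the sample size and tail length are arbitrary choices) approximates lim sup by the maximum of a late tail of a long finite sample for a_n = (−1)^{n+1} and b_n = (−1)^n.

    N = 10000
    tail = N // 2                      # approximate lim sup by the sup of a late tail
    a = [(-1) ** (n + 1) for n in range(1, N + 1)]
    b = [(-1) ** n for n in range(1, N + 1)]

    def approx_limsup(seq):
        return max(seq[tail:])

    print(approx_limsup([x + y for x, y in zip(a, b)]))   # 0
    print(approx_limsup(a) + approx_limsup(b))            # 1 + 1 = 2
    # 0 <= 2, and the inequality is strict for this pair of sequences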

3.2



Problems on series

Problem 3.6 Rudin Chapter 3 Exercise 6.

Proof. (a) We have

    a_n = √(n+1) − √n = (√(n+1) − √n)(√(n+1) + √n)/(√(n+1) + √n) = 1/(√(n+1) + √n) ≥ 1/(2√(n+1)) = 1/(2(n+1)^{1/2}) ≥ 0.

By Theorem 3.28, Σ 1/(n+1)^{1/2} diverges. By Theorem 3.25 (Comparison Test), Σ a_n diverges.

(b) We have

    a_n = (√(n+1) − √n)/n = 1/(n(√(n+1) + √n)) ≤ 1/(2n√n) = 1/(2n^{3/2}).

Since Σ 1/n^{3/2} converges by Theorem 3.28, the comparison test shows that Σ a_n converges.

(c) Since 3/2 > 1, we have (3/2)^n ≥ (3/2)^2 > 2 for n ≥ 2. Then it can be shown by induction that n < (3/2)^n for every positive integer n, so that n^{1/n} < 3/2 and hence

    0 ≤ a_n = (n^{1/n} − 1)^n < (1/2)^n

for every n. Since Σ (1/2)^n converges by Theorem 3.26, Theorem 3.25 (Comparison Test) shows that Σ a_n converges.

(d) If |z| ≤ 1, then |1 + z^n| ≤ 1 + |z|^n ≤ 2 so that |a_n| = 1/|1 + z^n| ≥ 1/2 whenever a_n is defined. Thus a_n does not tend to 0 and Σ a_n diverges by Theorem 3.23. If |z| > 1, then the triangle inequality implies that |z^n| ≤ |1 + z^n| + 1 so that

    |a_n| = 1/|1 + z^n| ≤ 1/(|z|^n − 1).

Since |z| > 1, we have |z| = 1 + δ for some δ > 0. Let N be the least positive integer such thatb

    N ≥ log 2 / log(1 + δ).

Then for all positive integers n such that n > N, we have |z|^n = (1 + δ)^n > 2 so that

    |a_n| ≤ 1/(|z|^n − 1) ≤ 2/|z|^n.

Hence it follows from Theorem 3.25 (Comparison Test) that Σ a_n converges. This completes the proof of the problem. 
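For a concrete feel of parts (a) and (b), the short Python script below (a supplementary illustration; the cut-off values of N are arbitrary) prints partial sums: the telescoping sums in (a) equal √(N+1) − 1 and grow without bound, while the partial sums in (b) level off.

    import math

    def partial_sum(term, N):
        s = 0.0
        for n in range(1, N + 1):
            s += term(n)
        return s

    a = lambda n: math.sqrt(n + 1) - math.sqrt(n)          # part (a)
    b = lambda n: (math.sqrt(n + 1) - math.sqrt(n)) / n    # part (b)

    for N in [10**2, 10**4, 10**6]:
        print(N, partial_sum(a, N), partial_sum(b, N))
    # the a-sums grow like sqrt(N); the b-sums stabilize near a finite limit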

Problem 3.7 Rudin Chapter 3 Exercise 7.

P √ Proof. Suppose that an ≥ 0 for all positive integers n and an converges. Since 0 ≤ ( an − n1 )2 , we have √ an 1 1 ≤ (an + 2 ). n 2 n P 1 By Theorem 3.28, n2 converges. By the assumption and Theorem 3.25 (Comparison Test), we see that X √an n

converges. This completes the proof of the problem. Problem 3.8 Rudin Chapter 3 Exercise 8.

b The

base of the logarithm here is supposed to be 10.



35

3.2. Problems on series

Proof. Suppose that {bn } is increasing.c Since {bn } is bounded, there exists a positive number M such that −M ≤ bn ≤ M for all positive integers n. Therefore we have −M |an | ≤ |an |bn ≤ M |an | and then |an bn | ≤ M |an | P for all positive integers n. Since an converges, M |an | also converges and it follows from Theorem P  3.25 (Comparison Test) that an bn converges. This completes the proof of the problem.d P

Problem 3.9

Rudin Chapter 3 Exercise 9.

Proof. 3

1 3 (a) Since | (n+1) n3 | = (1 + n ) , we have

(n + 1)3  1 3 α = lim sup = 1. = lim sup 1 + 3 n n n→∞ n→∞

Hence we have R = 1. 2n+1 2 , we have (b) Since (n+1)! = n+1 2n n!

2n+1 (n+1)! α = lim sup 2n = lim sup n→∞

n!

n→∞

2 = 0. n+1

Hence we have R = ∞. 2n+1 2 n 2 ) , we have (c) Since (n+1) = 2( n+1 2n n2

2n+1 2 n 2 (n+1) α = lim sup 2n = lim sup 2( ) = 2. n+1 n→∞ n→∞ n2

Hence we have R = 12 .

(n+1)3 1 n+1 3 (d) Since 3n+1 = 3 ( n ) , we have n3 3n

(n+1)3 1 n+1 3 1 n+1 α = lim sup 3 n3 = lim sup ( ) = . n 3 n→∞ n→∞ 3 3n

Hence we have R = 3. We complete the proof of the problem.



Problem 3.10 Rudin Chapter 3 Exercise 10.

Proof. Suppose that {ank } is the subsequence of {an } such that ank 6= 0, where nk are positive integers such p that n1 < n2 < · · · . The subsequence is infinite by the given condition. Since |ank | ≥ 1, we have nk |ank | ≥ 1 so that p α = lim sup n |an | ≥ 1. n→∞

Hence we have R = c The d This

1 α

≤ 1 which is the desired result.

case for decreasing sequences is similar, so we omit the details here. result is well-known as Abel’s Test.



Chapter 3. Numerical Sequences and Series

36

Problem 3.11 Rudin Chapter 3 Exercise 11.

Proof. (a) If the sequence {an } is not bounded, then 1 an = → 1. 1 + an 1 + a1n P an It follows from Theorem 3.23 that 1+an diverges. If the sequence {an } is bounded, then an > 0 implies that there is a positive real number M such that an < M for all positive integers n and thus 1 + an ≤ 1 + M . Therefore we have an an ≥ >0 1 + an 1+M and Theorem 3.25 (Comparison Test) yields that X

diverges.

an 1 + an

(b) Since an > 0, we have sN +k ≥ sN +j for any fixed positive integer N and for all j = 1, 2, . . . , k. Therefore we have sN1+k ≤ sN1+j for j = 1, 2, . . . , k. Hence we have aN +1 aN +k aN +1 aN +2 aN +k + ···+ ≥ + + ··· + sN +1 sN +k sN +k sN +k sN +k 1 (aN +1 + aN +2 + · · · + aN +k ) = sN +k 1 = (sN +k − sN ) sN +k sN =1− . sN +k

(3.1)

Since {sk } diverges, {sN +k } diverges. Since an > 0 for all n ∈ N, we have lim sN +k = +∞.

k→∞

(3.2)

Combining the inequality (3.1) and the limit (3.2), we have lim

k→∞

Assume that

a

P an sn

N +1

sN +1

+ ···+

 sN  1 aN +k  ≥ lim 1 − = 1 − sN lim = 1. k→∞ k→∞ sN +k sN +k sN +k

(3.3)

was convergent. By Theorem 3.22, there exists an integer N such that aN +1 am 1 + ···+ ≤ sN +1 sm 2

if m ≥ N + 1, but this contradicts the result (3.3) if we take m → ∞ in the above inequality. Hence ∞ X an s n=1 n

diverges.

37

3.2. Problems on series

(c) Since sn ≥ sn−1 for every positive integer n ≥ 2, we have 1 1 sn − sn−1 an − = ≥ 2. sn−1 sn sn−1 sn sn

(3.4)

Since an > 0, we have sn > 0. This and the inequality (3.4) together imply that n X ak

k=1

s2k

X 1 1 1 1 2 a1 X ak 1 1 + + − + − < . ≤ = 2 2 s1 sk a1 sk−1 sk a1 s1 sn a1 n

=

n

k=2

k=2

Hence the partial sums of the series {

P an

s2n }

are bounded and it follows from Theorem 3.24 that X an s2n

converges.

(d) Since an > 0, we have n2 an < 1 + n2 an and thus 1 an ≤ 2. 1 + n2 an n By Theorem 3.28,

P

1 n2

converges. By Theorem 3.25 (Comparison Test), the series

converges. However, the convergence of the series

X

an 1 + n2 an

X

an 1 + nan

depends on the choice of the sequence {an }. P – If an = 1, then we have an diverges and it follows from Theorem 3.28 that the series X

diverges. – To construct a sequence {an } so that terms

P 1 n

1 1+n

an diverges but and

an 1 + nan

P

an 1+nan

converges, we note that the

an are of the same magnitude. Therefore, if there are “too many” terms 1+na in the series, then n P1 the series must be divergent because n diverges. This observation motivates the construction of such sequence {an } as follows: Define {an } by  k 2 , if n = 2k , where k = 0, 1, 2, . . .; an = 0, otherwise.

For examples, a1 = 1, a2 = 2, a3 = 0, a4 = 4, a5 = a6 = a7 = 0, a8 = 8 and a9 = · · · = a15 = 0, a16 = 16, . . . . It is clear that

∞ X

an =

n=1

By Theorem 3.26, the series

∞ X

k=0

2k

and

∞ X



X 2k an = . 1 + nan 1 + 22k n=1 X

2k

k=0

Chapter 3. Numerical Sequences and Series

38

P 1 2k 1 diverges. Since 1+2 is convergent by Theorem 3.26, we obtain from Theorem 2k ≤ 2k and 2k 3.25 (Comparison Test) that the series X

converges.

2k 1 + 22k

This completes the proof of the problem.



Problem 3.12 Rudin Chapter 3 Exercise 12.

Proof. (a) If m < n, then rn < rm so that the sequence {rn } is strictly decreasing and r1m < r1n . It follows from this that am am+1 an am am+1 an + + ···+ > + + ···+ rm rm+1 rn rm rm rm rm − rn+1 = rm rn+1 =1− rm rn >1− . (3.5) rm P an We use similar argument as in the proof of Problem 3.11(b). Assume that rn was convergent. Since an > 0 for every n ∈ N, it is easy to see that rn > 0 for all n ∈ N. By Theorem 3.22, there is an integer N such that am an 1 + ···+ ≤ (3.6) rm rn 2 if n > m ≥ N . Therefore, by putting the inequality (3.6) into the inequality (3.5), we obtain that 1−

rn 1 ≤ rN 2

(3.7)

if n > N . Recall that {rn } is strictly decreasing and rn > 0 for all n ∈ N. Therefore, we apply Theorem 3.14 (Monotone Convergence Theorem) to {rn } to get the result that lim rn = 0.

n→∞

Hence we deduce from this and the inequality (3.7) that  rn  1 ≤ , 1 = lim 1 − n→∞ rN 2

a contradiction and so the required result follows. (b) Since √

rn −

√ √ √ √ ( rn − rn+1 )( rn + rn+1 ) rn − rn+1 an √ =√ > √ , rn+1 = √ √ √ rn + rn+1 rn + rn+1 2 rn

the desired inequality follows. Since n X √ √ ak √ √ < 2( r1 − rn+1 ) < 2 r1 , rk k=1

P an √ it follows from Theorem 3.24 that rn converges. This completes the proof of the problem.



39

3.2. Problems on series Problem 3.13 Rudin Chapter 3 Exercise 13.

P bn be two absolutely convergent series. Let cn be the Cauchy product of the n X P two series, where cn = ak bn−k . We have to show that |cn | converges and we follow part of the idea Proof. Let

P

an and

P

k=0

of the proof of Theorem 3.50. n X P P P P Since an and bn converge absolutely, we let A = |an | and B = |bn |. Put An = |ak |, n X

Bn =

k=0

|bk | and Cn =

n X

k=0

k=0

|ck |. Now for all n ≥ 0, |an | and |bn | are nonnegative terms of An and Bn

respectively. This implies that An ≤ A and Bn ≤ B for all n ≥ 0 and then we have |Cn | =

n X

k=0

|ck |

  ≤ |a0 ||b0 | + |a0 ||b1 | + |a1 ||b0 | + · · · + |a0 ||bn | + |a1 ||bn−1 | + · · · + |an ||b0 | = |a0 |Bn + |a1 |Bn−1 + · · · + |an |B0 ≤ |a0 |B + |a1 |B + · · · + |an |B = An B ≤ AB.

Hence {Cn } is a bounded sequence (bounded by AB) and we deduce from Theorem 3.24 that converges. We finish the proof of the problem.

P

|cn | 

Problem 3.14 Rudin Chapter 3 Exercise 14.

Proof. (a) Given ǫ > 0, there exists a positive integer N such that |sn − s| < 2ǫ for all n ≥ N . Fix this N , we have (s − s) + · · · + (s 0 N −1 − s) + (sN − s) + · · · + (sn − s) |σn − s| = n+1 1 1 ≤ (|s0 − s| + · · · + |sN −1 − s|) + (|sN − s| + · · · + |sn − s|) {z } n+1 n+1 | 1 ≤ (|s0 − s| + · · · + |sN −1 − s|) + n+1 1 < (|s0 − s| + · · · + |sN −1 − s|) + n+1

(n − N + 1) terms

n−N +1 ǫ · n+1 2 ǫ . 2

(3.8)

Let M = max(|s0 − s|, . . . , |sN −1 − s|) and N ′ be the least integer such that N ′ > 2M ǫ − 2. Since N M < 2ǫ so that inequality is fixed, M and then N ′ are fixed too. Then for all n ≥ N ′ + 1, we have n+1 (3.8) implies that |σn − s| < ǫ for all n ≥ N ′ + 1. This shows that lim σn = s.

(b) Let sn = (−1)n . Then {sn } is obviously divergent. Since σn = 0 if n is odd and σn = even, we have lim σn = 0. n→∞

1 n+1

if n is

Chapter 3. Numerical Sequences and Series

40

(c) Such sequence {sn } must satisfy two conditions: – Condition (1): It must contain a divergent subsequence. 1

– Condition (2): The growth of the sum s0 + s1 + · · · + sn = O(n k ) for some k ≥ 2 as n → ∞e so that s0 + s1 + · · · + sn →0 σn = n+1 as n → ∞. Now we define the sequence {sn } by  1 n 3 = k, if n = k 3 , where k = 1, 2, . . .; sn = 1 otherwise. n2 , This sequence {sn } satisfies Condition (1) because sk 3 = k → ∞ as k → ∞ so that lim sup sn = ∞. Next, for any positive integer n, let k be the largest positive integer such that k 3 ≤ n < (k + 1)3 . Then we have 0 ≤ s0 + s1 + · · · + sn ≤

k X

(k+1)3

X

m+

m=1

m=1

3

(k+1) X 1 k(k + 1) 1 = + m2 2 m2 m=1

so that

" # (k+1)3 X 1 k(k + 1) 1 s0 + s1 + · · · + sn . (3.9) ≤ 3 + 0 ≤ σn = n+1 k +1 2 m2 m=1 P 1 By Theorem 3.28, m2 converges. Since the right-hand side of the inequality (3.9) tends to 0 as k → ∞, it follows from the remarkf above Theorem 3.20 that lim σn = 0.

n→∞

That is, this sequence {sn } satisfies Condition (2). Hence this prove part (c). (d) It is clear that the equation is true for n = 1. Assume that it is also true for n = k, where k is a positive integer. Then for n = k + 1 we have i 1 X 1 hX jaj + (k + 1)ak+1 jaj = k + 2 j=1 k + 2 j=1 k

k+1

1 [(k + 1)(sk − σk ) + (k + 1)ak+1 ] k+2 1 [(k + 1)sk+1 − (k + 1)σk ] = k+2 1 = [(k + 1)sk+1 + sk+1 − (k + 2)σk+1 ] k+2 = sk+1 − σk+1 . =

Hence the induction shows that the expression is true for all positive integers. Since lim(nan ) = 0, we obtain from part (a) that n

1 X kak = 0. n→∞ n + 1 lim

k=1

e Let f (n) and g(n) be two functions defined on N. One writes f (n) = O(g(n)) as n → ∞ if and only if there is a positive constant M such that for all sufficiently large values of n, we have |f (n)| ≤ M |g(n)|. f That is, if 0 ≤ x ≤ s for n ≥ N , where N is some fixed number, and if s → 0, then x → 0. n n n n

41

3.2. Problems on series Let lim σn = σ. Then it follows from Theorem 3.3(a) that lim sn = lim

n→∞

n→∞

 σn +

 1 X kak = σ + 0 = σ. n+1 n

k=1

Hence {sn } converges to σ. (e) If m < n, then we have (m + 1)(σn − σm ) +

n X

i=m+1

(sn − si )

s + s + · · · + s s0 + s1 + · · · + sm  0 1 n = (m + 1) + [(sn − sm+1 ) + · · · + (sn − sn )] − n+1 m+1 m+1 (s0 + s1 + · · · + sn ) + (n − m)sn − (s0 + s1 + · · · + sn ) = n+1 m+1−n−1 = (s0 + s1 + · · · + sn ) + (n − m)sn n+1 = (n − m)(sn − σn ) which yields the desired formula. Since i ≥ m + 1, we have n − i ≤ n − m − 1 and that

1 i+1



1 m+2

so

|sn − si | = |sn − sn−1 + sn−1 − sn−2 + · · · + si+1 − si |

≤ |sn − sn−1 | + |sn−1 − sn−2 | + · · · + |si+1 − si | M M M ≤ + + ···+ n n − 1 i + 1} | {z (n − i) terms

M M + ···+ ≤ i+1 i+1 (n − i) = M i+1 (n − m − 1) ≤ M. m+2

Fix ǫ > 0. Since (m + mǫ + ǫ + 1 + ǫ) − (m + mǫ + ǫ) = 1 + ǫ > 1, there exists an integer n such that m + mǫ + ǫ ≤ n < m + mǫ + ǫ + 1 + ǫ and the inequalities are equivalent to m≤

n−ǫ < m + 1. 1+ǫ

It is easy to see that the above inequalities imply that m+1 1 ≤ n−m ǫ Since |sn − si | ≤

n−m−1 m+2 M ,

and

n−m−1 < ǫ. m+2

we have |sn − si | < M ǫ and hence lim sup |sn − σ| ≤ M ǫ. n→∞

Since ǫ was arbitrary, lim sn = σ. This completes the proof of the problem. Problem 3.15 Rudin Chapter 3 Exercise 15.



Chapter 3. Numerical Sequences and Series

42

Proof. We prove the theorems one by one: • Proof of generalized Theorem 3.22: Let a = (a1 , a2 , . . . , ak ). We further let, for each positive n X ai converges to a ∈ Rk if and integer n, an = (an1 , an2 , . . . , ank ), where an1 , . . . , ank ∈ R. Now only if

n X

i=1

g

aij converges to aj for each j = 1, 2, . . . , k. By Theorem 3.22, we have

i=1

n X

aij converges

i=1

to aj if and only if for every ǫ there is an integer Nj such that m X ǫ aij < k

(3.10)

i=n

if m ≥ n ≥ Nj . If

n X

ai converges, then it follows from the inequality (3.10) and

i=1

m m m m X X X X ai ≤ ai1 + ai2 + · · · + aik i=n

that

i=n

for m ≥ n ≥ N = max(N1 , . . . , Nk ).

i=n

i=n

m X ǫ ǫ ai < + · · · + ≤ ǫ k k i=n

(3.11)

Conversely, if the inequality (3.11) holds for m ≥ n ≥ N , then since m m X X ai , aij ≤ i=n

i=n

for each j = 1, 2, . . . , k, we have

m X aij < ǫ i=n

for m ≥ n ≥ N . By Theorem 3.22 again, n X

n X

aij converges to aj for each j = 1, 2, . . . , k so that

i=1

ai converges to a.

i=1

• Proof of generalized Theorem 3.23: We take m = n in the inequality (3.11), then it becomes |an | < ǫ for all n ≥ N . Since an = (an1 , an2 , . . . , ank ), we have q a2n1 + a2n2 + · · · + a2nk < ǫ for all n ≥ N . By Definition 3.1, we have

lim (a2n1 + a2n2 + · · · + a2nk ) = 0.

n→∞

Since a2nj ≥ 0 for 1 ≤ j ≤ k, we have lim anj = 0 for 1 ≤ j ≤ k, i.e., n→∞

lim an = 0.

n→∞

n X g This holds because we have aij − aj ≤ i=1

n X ai − a ≤ i=1

n n X X ai1 − a1 + · · · + aik − ak , where j = 1, 2, . . . , k. i=1

i=1

43

3.2. Problems on series • Proof of generalized Theorem 3.25(a): Given ǫ > 0, there exists N ≥ N0 such that m ≥ n ≥ N implies m X ck ≤ ǫ k=n

by the Cauchy criterion. It follows from Theorem 1.37(e) that m m m X X X ak ≤ |ak | ≤ ck < ǫ k=n

k=n

k=n

for all m ≥ n ≥ N . By the generalized Theorem 3.22, we have

P

ak converges.

• Proof of generalized p Theorem 3.33: If α < 1, then we can choose β so that α < β < 1 and an integer N such that n |an | < β for n ≥ N . That is, for all n ≥ N , we have |an | < β n .

P n Since 0 < β < β converges by Theorem 3.26. Hence it follows from the generalized Theorem P1, 3.25(a) that |an | converges.

If α > 1, then we obtain from Theorem 3.17(a) that there is a sequence {nk } such that p nk |ank | → α.

Hence |an | > 1 for infinitely many values of n which contradicts the generalized Theorem 3.23. • Proof of generalized Theorem 3.34: If part (a) holds, we can find β < 1 and an integer N such that a n+1 0, we follow from the facts b0 ≥ b1 ≥ b2 ≥ · · · and lim bn = 0 that there is an integer n→∞ N such that ǫ bN ≤ √ . (3.14) 2 kM Let q ≥ p ≥ N . Now we have bn − bn+1 ≥ 0 for all nonnegative integers n. It follows from this, inequalities (3.13), (3.14) and the facts that (a + b)2 ≤ (|a| + |b|)2 for every a, b ∈ R, we have q X an bn n=p

Chapter 3. Numerical Sequences and Series

44

! q q X X = ank bn an1 bn , . . . , n=p n=p q−1 ! q−1 X X = Ank (bn − bn+1 ) + Aqk bq − A(p−1)k bp An1 (bn − bn+1 ) + Aq1 bq − A(p−1)1 bp , . . . , n=p n=p v #2 u k " q−1 uX X u =u Anj (bn − bn+1 ) + Aqj bq − A(p−1)j bp uj=1 n=p {z } | t This is b. {z } | This is a.

v # u k " q−1 2 uX X t ≤ Anj (bn − bn+1 ) + Aqj bq − A(p−1)j bp n=p j=1 v #2 " q−1 u k uX X t 2 (bn − bn+1 ) + bq + bp M ≤ j=1

n=p

v u k uX = t (4M 2 b2 ) p

j=1

√ ≤ 2M bN k ≤ ǫ.

Now the convergence of the series

P

an bn follows immediately from the generalized Theorem 3.22.

• Proof of generalized Theorem 3.45: The assertion follows from the inequality m m X X |ak | ak ≤ k=n

k=n

plus the generalized Theorem 3.22.

• Proof of generalized Theorem 3.47: Let An = An + Bn =

n X

n X

ak and Bn =

k=0

n X

bk . Then we acquire

k=0

(ak + bk ).

(3.15)

k=0

Since lim An = A and lim Bn = B, we see from the expression (3.15) that n→∞

n→∞

lim (An + Bn ) = A + B.

n→∞

The proof of the second assertion is similar. P ′ • Proof of generalized Theorem 3.55: Let an be a rearrangement with partial sums s′n . Given ǫ > 0, there exists an integer N such that m ≥ n ≥ N implies that m X i=n

|ai | < ǫ.

(3.16)

Now choose a positive integer p such that the integers 1, 2, . . . , N are all contained in the set k1 , k2 , . . . , kp (here we use the notation of Definition 3.52). Then if n > p, the vectors a1 , . . . , aN will cancel in the difference sn − s′n so that the inequality (3.16) implies that |sn − s′n | < ǫ.

Hence {s′n } converges to the same sum as {sn }. This completes the proof of the problem.



45

3.3. Recursion formulas of sequences

3.3

Recursion formulas of sequences

Problem 3.16 Rudin Chapter 3 Exercise 16.

Proof. √ (a) Since α > 0, it can be shown by induction that xn > α > 0 for all positive integers n.h By this, we have  −α + x2  x2 + α n ≤ 0. − xn = − xn+1 − xn = n 2xn 2xn √ Thus {xn } decreases monotonically. Since α < xn ≤ x1 for all positive integers n, the sequence {xn } is bounded and Theorem 3.14 (Monotone Convergence Theorem) implies that {xn } converges. Let x = lim xn . Then Theorem 3.3 implies that α 1 lim xn+1 = lim xn + n→∞ n→∞ 2 xn 1 α x= x+ 2√ x x = ± α. Since xn > 0 for all positive integers n, we have x > 0 and hence x =

√ α, as desired.

(b) We have √ √ 1 1 (xn − α)2 ǫ2 α √ = n . xn + − α= · ǫn+1 = xn+1 − α = 2 xn 2 xn 2xn √ √ As shown in part (a) that xn > α > 0 for all positive integers n. Thus if β = 2 α, then we have ǫn+1 = By this, it is clear that ǫ2 < β n = k + 1, we have

 ǫ1 2 . β

ǫk+2
1, we have α − 1 > 0 and then √ √ α−1 √ (3.17) xn+1 − α = ( α − xn ) 1 + xn for all positive integers n. Therefore the expression (3.17) implies that √ √ √ ( α − 1)2 (x2n−1 − α), x2n+1 − α = (1 + x2n )(1 + x2n−1 ) √ √ √ ( α − 1)2 x2n+2 − α = (x2n − α). (1 + x2n+1 )(1 + x2n )

(3.18)

for all positive integers n. Since x1 > α and x2 < α, we can show by induction and the expressions (3.18) that √ √ x2n−1 > α and x2n < α for all positive integers n.



Lemma 3.2 For every positive integer n, we have x2n+1 − x2n−1 =

2(α − x22n−1 ) 1 + α + 2x2n−1

and x2n+2 − x2n =

2(α − x22n ) . 1 + α + 2x2n

(3.19)

Proof of Lemma 3.2. We note that α + xn − xn−1 1 + xn n−1 α + α+x 1+xn−1 − xn−1 = n−1 1 + α+x 1+xn−1

xn+1 − xn−1 =

α(1 + xn−1 ) + α + xn−1 − xn−1 1 + xn−1 + α + xn−1 2(α − x2n−1 ) = . 1 + α + 2xn−1 =

Hence it is easily seen that the expressions (3.19) follows from this. It is time to return to the proof of Problem 3.17.



47

3.3. Recursion formulas of sequences

(a) Let n be a positive integer. By Lemma 3.1, we have x22n−1 − α > 0. Hence it follows from this and Lemma 3.2 that x2n+1 < x2n−1 . (b) Similarly, Lemma 3.1 implies that x22n − α < 0 and we obtain from Lemma 3.2 that x2n+1 > x2n . (c) By Lemma 3.1 and part (a), {x2n−1 } is monotonically decreasing, so Theorem 3.14 (Monotone Convergence Theorem) shows that {x2n−1 } converges. Similarly, since Lemma 3.1 and part (b) imply that {x2n } is monotonically increasing, we obtain from Theorem 3.14 (Monotone Convergence Theorem) that {x2n } converges. Furthermore, it can be deduced easily from the expressions (3.19) that √ lim x2n−1 = lim x2n = α. (3.20) n→∞

n→∞

By the triangle inequality, we have |xm − xn | ≤ |xm −

√ √ α| + | α − xn |

(3.21)

for positive integers m and n. Hence it follows from the limits (3.20) and the inequality (3.21) that {xn } is a Cauchy sequence. By Theorem 3.11(c), the sequence {xn } converges. Let lim xn = x. Then we have α + x  n lim xn+1 = lim n→∞ n→∞ 1 + xn α+x x= 1+x √ x = ± α. √ Since xn > 0 for every positive integer n, we have x > 0 and hence x = α. √ (d) Put ǫn = |xn − α|. We know from the expression (3.17) that √ √ (1 − α)2 |1 − α| ǫn = ǫn−1 . ǫn+1 = 1 + xn (1 + xn )(1 + xn−1 ) By the definition of xn , we have (1 + xn+1 )(1 + xn ) = 1 + α + 2xn > 1 + α so that √ √ √ (1 − α)2 α−1 (1 − α)2 ǫn+1 = ǫn−1 < ǫn−1 = √ ǫn−1 (1 + xn )(1 + xn−1 ) 1+α α+1 which implies that  √α − 1 n ǫ1 ǫ2n+1 < √ α+1

 √α − 1 n and ǫ2n+2 < √ ǫ2 , α+1

(3.22)

where n is a positive integer. To compare the rapidity of convergence of the process with the√one described in Problem 3.16, we √ 3 3 < 10 take the same example that α = 3 and x1 = 2. Then we have √3−1 , ǫ1 = |2 − 3| < 10 and 3+1 √ 1 5 ǫ2 = | 3 − 3| < 10 . By these and the inequalities (3.22), we have ǫ2n+1
p α.

Proof of Lemma 3.3. Assume that xk > theorem, if 0 < x < 1, then we have

√ p α for some positive integer k. By the binomial

(1 − x)p = 1 − px +

p(p − 1) 2 x − · · · > 1 − px. 2

(3.23)

Let y = 1 − x. The inequality (3.23) becomes

Next, we put y =

√ pα xk

y p > 1 − p(1 − y).

(3.24)

into the inequality (3.24) to get √  p α α > 1 − p 1 − p xk xk √   p α α >1− p p 1− xk xk √ 1 α xk − p α > xk − p−1 p pxk √ p−1 α xk + p−1 > p α p pxk √ xk+1 > p α,

completing the proof of the lemma.



Lemma 3.4 The sequence {xn } is monotonically decreasing.

Proof of Lemma 3.4. It is clear from the definition of xn that 1 1 α xn+1 − xn = − xn + p−1 = p−1 (α − xpn ) p pxn pxn for every positive integer n. By Lemma 3.3, we have α − xpn < 0 so that xn+1 − xn < 0 for every positive integer n. In other words, the sequence {xn } is monotonically decreasing. 

49

3.4. A representation of the Cantor set

Now we can continue our proof of Problem 3.18. By Lemmae 3.3 and 3.4, the sequence {xn } is √ monotonically decreasing and bounded below by p α. Hence it follows from Theorem 3.14 (Monotone Convergence √ Theorem) that {xn } converges and the analysis preceding Lemma 3.3 gives the limit of it must be p α. This completes the proof of the problem. 

3.4

A representation of the Cantor set

Problem 3.19 Rudin Chapter 3 Exercise 19.

∞ o n X αn , αn ∈ {0, 2} . Recall that the Cantor set P is defined by Proof. Let E = x(a) x(a) = n 3 n=1

P = [0, 1] \

∞ [

m=1

3m−1 [−1  k=0

3k + 1 3k + 2  , , 3m 3m

see equation (2.24) on [21, p. 42], see Figure 3.1.i

Figure 3.1: The Cantor set. 3k+2 m−1 − 1. In other words, x ∈ / P if and only if x ∈ ( 3k+1 3m , 3m ) for some m = 1, 2, . . . and k = 0, 1, . . . , 3 ∞ X βn Suppose that b = {βn } and x(b) = , where βn ∈ {0, 1, 2}. We want to show that x(b) ∈ / P if 3n n=1 and only if βn = 1 for some positive integer n. Then the previous paragraph says that x(b) ∈ / P if and only if ∞ X βn  3k + 1 3k + 2  (3.25) ∈ , 3n 3m 3m n=1

for some positive integer m and k = 0, 1, . . . , 3m−1 − 1. Fix this m and it is obvious that the relation (3.25) is equivalent to ∞ X βn ∈ (3k + 1, 3k + 2). (3.26) n−m 3 n=1 Since ∞ m−1 ∞ X X X βn βn m−n = β 3 + β + n m n−m n−m 3 3 n=1 n=1 n=m+1

and each m − n is a positive integer for n = 1, 2, . . . , m − 1, we have each 3m−n is divisible by 3 and then we have m−1 X βn 3m−n = 3N (3.27) n=1

i The

figure can be found in https://en.wikipedia.org/wiki/Cantor_set.

Chapter 3. Numerical Sequences and Series for some positive integer N . Let γm = βm + expression (3.27) that

50

∞ X

βn , so we have from the relation (3.26) and the n−m 3 n=m+1

3N + γm ∈ (3k + 1, 3k + 2).

(3.28)

Since 0 ≤ βn ≤ 2 for all n, we get from Theorem 3.26 that 0 ≤ γm ≤ 2 +

∞ X

n=m+1

2 3n−m

  1 1 = 2 1 + + 2 + · · · = 3. 3 3

If γm = 0 or 3, then 3N + γm = 3N or 3N + 3 which contradicts the relation (3.28). In fact, the bounds of γm force that N = k and 1 < γm < 2. Since 0≤

∞ X

βn ≤ 1, n−m 3 n=m+1

we must have 0 < βm < 2 for this fixed positive integer m. However, we acquire βm ∈ {0, 1, 2} which implies that βm = 1 for a positive integer m. Hence we have shown that x(b) ∈ / P if and only if βn = 1 for some positive integer n and this means that E=P as required. We finish the proof of the problem.

3.5



Cauchy sequences and the completions of metric spaces

Problem 3.20 Rudin Chapter 3 Exercise 20.

Proof. Given ǫ > 0, there is an integer N1 such that d(pni , p) < 2ǫ for ni ≥ N1 . Since {pn } is a Cauchy sequence, there is an integer N2 such that d(pn , pm ) < 2ǫ for m, n ≥ N2 . Put N = max(N1 , N2 ). Then for all n ≥ N , we have ǫ ǫ d(pn , p) ≤ d(pn , pni ) + d(pni , p) < + = ǫ. 2 2 Hence the full sequence {pn } converges to p, completing the proof of the problem.



Problem 3.21 Rudin Chapter 3 Exercise 21. T∞ Proof. Let E = 1 En . Since each En is nonempty, we can construct a sequence {pn }, where pn ∈ En . Since Em ⊆ En if m ≥ n, we have pm ∈ En (3.29) if m ≥ n. We first show that the sequence {pn } is convergent. Given that ǫ > 0. Since each En is bounded, we know from Definition 3.9 that each diam En is well-defined. Besides, since diam En → 0 as n → ∞, there is an integer N such that diam En < ǫ for all n ≥ N . In particular, we take n = N and we obtain from Definition 3.9 that Sn = {d(p, q) | p, q ∈ En } and d(p, q) ≤ sup SN = diam EN < ǫ (3.30)

51

3.5. Cauchy sequences and the completions of metric spaces

for p, q ∈ EN .j Since En ⊇ En+1 , we have Em ⊆ En ⊆ EN so that pm , pn ∈ EN

(3.31)

for any integers m, n with m ≥ n ≥ N . Therefore it follows from the inequality (3.30) and the relation (3.31) that d(pn , pm ) < ǫ for all m ≥ n ≥ N which shows that {pn } is a Cauchy sequence by Definition 3.8. Since X is a complete metric space, Definition 3.12 ensures that the sequence {pn } converges to a point p ∈ X. Next, we prove that E = {p}. To this end, we first show that p ∈ E. Define A to be the subset of N such that n ∈ A if and only if pn = p. We also define B to be the complement of A so that n ∈ B if and only if pn 6= p. By definition, we know that n ∈ A if and only if p ∈ En . (However, n ∈ B does not imply that p ∈ / En .) Now there are two cases for consideration: • Case (i): B is finite. Then we have B = {n1 , n2 , . . . , nk }, where n1 < n2 < · · · < nk . This implies that pn = p for all n ≥ nk + 1 and thus p ∈ En for all n ≥ nk + 1. Since Enk+1 ⊆ Enk ⊆ Enk−1 ⊆ · · · ⊆ En1 , we have p ∈ En for all n ∈ B. Hence we have p ∈ E in this case. • Case (ii): B is countable. We want to show that p is a limit point of each En , so we fix the integer n first. Since pm → p as m → ∞, we get from Theorem 3.2(a) that if Nr (p) is any neighborhood of p, then there is an integer N such that pm ∈ Nr (p) for all m ≥ N . Since B is countable, we can pick an element m ∈ B with m ≥ n and m ≥ N such that the relation (3.29) yields pm ∈ En . By definition, we have pm 6= p and hence it follows from Definition 2.18(b) that p is a limit point of each En . Since En is closed for each positive integer n, we have p ∈ En for each positive integer n and thus p ∈ E in this case too. Now we have shown that p ∈ E and we shall prove the uniqueness of p. Assume that p′ ∈ E but p 6= p. Then we have d(p, p′ ) > 0 and p, p′ ∈ En for every positive integer n. By Theorem 1.20(b), there is q ∈ Q which does not depend on n such that 0 < q < d(p, p′ ). Therefore Definition 3.9 implies that ′

diam En ≥ q > 0 for every positive integer n, a contradiction. Hence we must have p = p′ , i.e., E = {p}, completing the proof of the problem.  Problem 3.22 Rudin Chapter 3 Exercise 22.

Proof. Basically, we follow the idea of proof of Problem 2.30. Let G be an open set of X. Since G1 is a dense subset of X, there exists p ∈ G1 such that p ∈ G. Thus the set F1 = G1 ∩ G is nonempty. Since G1 is open in X, F1 must be open in X by Theorem 2.24(c). Let p1 ∈ F1 . (Here we don’t assume that p = p1 .) Since F1 is an open subset, we have E1 = Nr1 (p1 ) ⊆ F1 ⊆ G1 for some r1 > 0. Without loss of generality, we may assume that E1 = Nr1 (p1 ) ⊆ F1 ⊆ G1 . j In

fact, the inequality (3.30) holds for all n ≥ N with EN replaced by En .

Chapter 3. Numerical Sequences and Series

52

Since G2 is a dense subset of X, the set F2 = G2 ∩ E1 is a nonempty open subset of X. Let p2 ∈ F2 . Then we can choose r2 > 0 small enough such that E2 = Nr2 (p2 ) ⊆ F2 ⊆ G2 . By definition, we have E2 = Nr2 (p2 ) ⊆ F2 = G2 ∩ E1 = G2 ∩ Nr1 (p1 ) ⊂ Nr1 (p1 ) = E1 . Now we can continue this process to obtain the following shrinking sequence · · · ⊂ E3 ⊂ E2 ⊂ E1 .

(3.32)

Since each En is actually a neighborhood, En is closed and bounded. Furthermore, the sequence (3.32) implies that {rn } is a strictly decreasing sequence of positive real numbers (here 0 is the greatest lower bound of {rn }) and it follows from Theorem 3.14 (Monotone Convergence Theorem) that lim rn = 0. n→∞ Now it is clear that, for each positive integer n, diam En = sup{d(p, q) | p, q ∈ En } = 2rn , so we must have lim diam En = lim (2rn ) = 0.

n→∞

n→∞

Since X is a nonempty complete metric space, Problem 3.21 implies that ∞ \

n=1

En = {p}

for some p ∈ X. Since p ∈ En ⊆ Gn for each positive integer n, we have p ∈ proof of the problem.

T∞ 1

Gn , completing the 

Problem 3.23 Rudin Chapter 3 Exercise 23.

Proof. We follow the hint. For any positive integers m and n, we have d(pn , qn ) ≤ d(pn , pm ) + d(pm , qm ) + d(qm , qn )

or d(pm , qm ) ≤ d(pn , pm ) + d(pn , qn ) + d(qm , qn ) (3.33)

which implies that |d(pn , qn ) − d(pm , qm )| ≤ d(pn , pm ) + d(qm , qn ). Since {pn } and {qn } are Cauchy sequences in a metric space X, there is an integer N such that d(pn , pm )
0. Since {Pn } is a Cauchy sequence in X ∗ , there exists an integer Nǫ that ǫ ∆(Pr , Ps ) < 4 (1)

(2)

for every r, s ≥ Nǫ . By the definition of ∆, there exists a positive integer Nǫ

such

such that

ǫ d(prt , pst ) < 4

(3.36)

(2)

for t ≥ Nǫ . Now for each positive integer k, the sequence {pk1 , pk2 , . . .} ∈ Pk is a Cauchy sequence in X, so there exists an integer Nk such that d(pkn , pkm )
0. By the proof of Lemma 3.5, we see that if m, t ≥ Nǫ , then we have 3ǫ (3.38) d(pm , pt ) < . 4

55

3.5. Cauchy sequences and the completions of metric spaces (3)

(3)

(1)

log

1

We put m ≥ Nǫ . Since Nǫ = max(Nǫ , ⌈2 + log 2ǫ ⌉), we have from the triangle inequality and the inequality (3.37) that

1 2m



ǫ 4.

d(pmt , pt ) ≤ d(pmt , pm ) + d(pm , pt ) = d(pmt , pmNm ) + d(pm , pt ) 1 < m + d(pm , pt ) 2

Then we obtain

(3.39)

if t ≥ Nm . (4) (3) (4) (3) (4) (3) Let Nǫ = max(Nm , Nǫ ). Now for every t ≥ Nǫ , since m ≥ Nǫ and t ≥ Nǫ ≥ Nǫ , it follows from the inequality (3.38) that 3ǫ . 4

d(pm , pt ) < Recall that this chosen m gives inequality (3.39) that

1 2m



d(pmt , pt ) < (3)

ǫ 4

(4)

and t ≥ Nǫ

≥ Nm , so we obtain from these and the

1 ǫ 3ǫ + d(pm , pt ) < + =ǫ 2m 4 4

(3.40)

(4)

if m ≥ Nǫ and t ≥ Nǫ . Therefore we can easily see from the inequality (3.40) and the definition ∆(Pm , P ) = lim d(pmt , pt ) that ∆(Pm , P ) → 0 as m → ∞. t→∞

Hence we have shown that a Cauchy sequence {Pn } in X ∗ converges to P ∈ X ∗ . By Definition 3.12, X ∗ is complete. (d) The sequence {p, p, . . .} is the required Cauchy sequence. Since {p, p, . . .} ∈ Pp and {q, q, . . .} ∈ Pq for all p, q ∈ X, we get from the part (b) that ∆(Pp , Pq ) = lim d(p, q) = d(p, q). n→∞

In other words, this shows that ϕ : X → X ∗ defined by ϕ(p) = Pp is an isometry. (e) To prove that ϕ(X) is dense in X ∗ , we must show that every neighborhood of P ∈ X ∗ contains ϕ(p) ∈ ϕ(X). To this end, let P ∈ X ∗ and {pn } ∈ P . Given that ǫ > 0. Since {pn } is a Cauchy sequence in X, there exists a positive integer N such that d(pm , pn ) < 2ǫ for all m, n ≥ N . By part (d), we have ϕ(pN ) = PpN . Recall that PpN contains the Cauchy sequence all of whose terms are pN , so it follows from this and part (b) that ǫ ∆(P, ϕ(pN )) = ∆(P, PpN ) = lim d(pn , pN ) ≤ < ǫ. n→∞ 2 In other words, it means that the neighborhood of P with radius ǫ contains an element ϕ(pN ) ∈ ϕ(X). Hence ϕ(X) is dense in X ∗ . Suppose that X is complete. Let P ∈ X ∗ and {pn } ∈ P . Since {pn } is a Cauchy sequence in the complete metric space X, we have pn → p for some p ∈ X. Thus we have ∆(P, Pp ) = lim d(pn , p) = 0. n→∞

Hence we have P = Pp and then ϕ(X) = X ∗ . This completes the proof of the problem. Problem 3.25 Rudin Chapter 3 Exercise 25.



Chapter 3. Numerical Sequences and Series

56

Proof. Now we have X = Q. By Problem 3.24(e), Q is dense in Q∗ . By the footnote c in Chapter 2, we have Q∗ = Q = R, finishing the proof of the problem.



CHAPTER

4

Continuity

4.1

Properties of continuous functions

Problem 4.1 Rudin Chapter 4 Exercise 1.

Proof. The answer is no! For example, let f : R → R be defined by f (0) = 1 and f (x) = 0 for x ∈ R \ {0}. Then it is easy to check that lim [f (x + h) − f (x − h)] = 0 h→0

for every x ∈ R. However, f is not continuous at 0. To see this, we pick ǫ = x ∈ R with 0 < |x − 0| < δ always imply that |f (x) − f (0)| = |0 − 1| = 1 >

1 2

and for every δ > 0, those

1 . 2

By Definition 4.5, f is not continuous at 0. We finish the proof of the problem.



Problem 4.2 Rudin Chapter 4 Exercise 2.

Proof. Since x ∈ E implies f (x) ∈ f (E) ⊆ f (E), we have E ⊆ f −1 (f (E)). By Theorem 2.27(a), f (E) is closed in Y . Thus the corollary of Theorem 4.8 ensures that f −1 (f (E)) is closed in X. By Theorem 2.27(c), we must have E ⊆ f −1 (f (E)) and hence we have f (E) ⊆ f (E) for every set E ⊆ X. To prove the second assertion, we consider the example f : R \ {0} → R defined by f (x) =

1 . x2

Take E = Z, so E = Z and f (E) = f (Z) = {1, 212 , 312 . . .}, but f (E) = f (Z) = {0} ∪ {1, 212 , 312 . . .}. Hence we have the desired result that f (E) ⊂ f (E), completing the proof of the problem.

 57

Chapter 4. Continuity

58

Problem 4.3 Rudin Chapter 4 Exercise 3.

Proof. By definition, we have Z(f ) = {p ∈ X | f (p) = 0} = f −1 ({0}). Since {0} is a closed set in R, the corollary of Theorem 4.8 implies that Z(f ) is closed in X. We complete  the proof of the problem. Problem 4.4 Rudin Chapter 4 Exercise 4.

Proof. By definition, the statement “f (E) is dense in f (X)” is equivalent to the statement “every point of f (X) is a limit point of f (E)” which is also equivalent to f (X) ⊆ (f (E))′ ⊆ f (E). Since it is clear that f (E) ⊆ f (X), the last statement is equivalent to f (E) = f (X).

(4.1)

The direction f (E) ⊆ f (X) is obvious. For the other direction, we note from Problem 4.2 that f (E) ⊆ f (E) for every set E ⊆ X. If E is a dense subset of X, then we have E = X and this implies that f (X) ⊆ f (E) which is the relation (4.1). Let p ∈ X. Thus it is a limit point of E. Since E is dense in X, there exists a sequence {pn } in E such that lim pn = p. Since f and g are continuous at p, it follows from Theorem 4.2 that n→∞

lim f (pn ) = f (p) and

n→∞

lim g(pn ) = g(p).

n→∞

Since f (pn ) = g(pn ) for all positive integers n, we have f (p) = lim f (pn ) = lim g(pn ) = g(p), n→∞

n→∞

completing the proof of the problem.

4.2



The extension, the graph and the restriction of a continuous function

Problem 4.5 Rudin Chapter 4 Exercise 5.

Proof. We have f : E ⊆ R → R and E is closed in R. By Theorem 2.23, E c is open in R and Problem 2.29 tells us that E c is the union of an at most countable collection of disjoint segments, possibly including (−∞, a) or (b, ∞) for some a, b ∈ R.a We define g : R → R to be the function such that g(x) = f (x) for all x ∈ E. To define g on E c , we let (an , bn ) be one of the disjoint segments of E c , where an < bn . There are two cases: a For

examples, if E = [0, 1], then we have E c = (−∞, 0) ∪ (1, ∞); if E = (−∞, 0] ∪ [1, +∞), then we have E c = (0, 1).

59

4.2. The extension, the graph and the restriction of a continuous function • Case (i): E c does not contain (−∞, a) or (b, ∞). We define g on [an , bn ] by  if x = an ;  f (an ), m(x − an ) + f (an ) if an < x < bn ; g(x) =  f (bn ), if x = bn , where

m=

f (bn ) − f (an ) . b n − an

In other words, the graph of g is a straight line connecting (an , f (an )) and (bn , f (bn )) with slope m, see Figure 4.1 for an illustration.

Figure 4.1: The graph of g on [an , bn ].

• Case (ii): E c contains (−∞, a) or (b, ∞). We consider the case (−∞, a) only because the case (b, +∞) can be done similarly. We define g on (−∞, a] by g(x) = f (a) for all x ∈ (−∞, a]. In other words, the graph of g is a constant function. Since a straight line is continuous on its domain, the only points of uncertainty of continuity of g(x) are at the endpoints x = an and x = bn . Given ǫ > 0 small enough and suppose that m 6= 0. Since f is continuous at an , there exists a δ > 0 such that |f (x) − f (an )| < ǫ ǫ if |x − an | < δ. Let δ1 = min(δ, |m| ).b Then, for |x − an | < δ1 , we have

|g(x) − g(an )| = = < b We







|m(x − an ) + f (an ) − f (an )|, if an ≤ x < an + δ1 ; |f (x) − f (an )|, if an − δ1 < x < an ,

|m| · |x − an |, |f (x) − f (an )|,

if an ≤ x < an + δ1 ; if an − δ1 < x < an ,

ǫ, if an ≤ x < an + δ1 ; ǫ, if an − δ1 < x < an .

note that m can be negative, so we take its absolute value to make

ǫ |m|

positive.

Chapter 4. Continuity

60

If m = 0, then we take δ1 = min(ǫ, δ) in the above analysis and we get  |f (an ) − f (an )|, if an ≤ x < an + δ1 ; |g(x) − g(an )| = |f (x) − f (an )|, if an − δ1 < x < an ,  0, if an ≤ x < an + δ1 ; = |f (x) − f (an )|, if an − δ1 < x < an ,  ǫ, if an ≤ x < an + δ1 ; < ǫ, if an − δ1 < x < an . Hence we have |g(x) − g(an )| < ǫ

if |x − an | < δ1 and then g is continuous at x = an by Definition 4.5. Since the continuity of g in the case for the endpoint x = bn and the case for E c containing (−∞, a) or (b, +∞) can be done similarly as above, we skip the details here. Note that the word “closed” cannot be omitted in the above result. We modify Example 4.27(c) to be  −x − 2, if x < 0; f (x) = x + 2, if x > 0.

Then f has a simple discontinuity at x = 0 and is continuous at every other point of the open set (−∞, 0) ∪ (0, +∞). Since f (0+) = 2 and f (0−) = −2, this makes f impossible to have a continuous extension. Suppose that E ⊆ R is a closed set and f : E → Rk is a vector-valued function defined by f (x) = (f1 (x), f2 (x), . . . , fk (x)),

where each fi : E → R is a real function defined on E. By Theorem 4.10(a), f is continuous on E if and only if each of the functions f1 , . . . , fk is continuous on E. By the previous analysis, each fi has a continuous extension gi . Therefore, the vector-valued function g : R → Rk defined by g(x) = (g1 (x), g2 (x), . . . , gk (x)) is continuous on R by Theorem 4.10(a). For each 1 ≤ i ≤ k, we have gi (x) = fi (x) for all x ∈ E and this implies that g(x) = f (x) for all x ∈ E. This completes the proof of the problem.



Problem 4.6 Rudin Chapter 4 Exercise 6.

Proof. Let f : E → Y , E and Y are metric spaces. By definition, since E × f (E) = {(x, f (y)) | x, y ∈ E}, we have graph (f ) = {(x, f (x)) | x ∈ E} ⊆ E × f (E) ⊆ E × Y.

Before we start to prove the result, there are a few points to note. We know that E and Y may not be subsets of Rk for some positive integer k, so we cannot apply any result relevant to Rk (e.g. Theorems 2.41, 4.10, 4.15 and etc.) directly in this problem. Instead, the strategy we use here is that we define a metric on the set E × Y and we consider the mapping g : E → E × f (E) defined by g(x) = (x, f (x)). If we can show that g is continuous, then g(E) = graph (f ) is compact by Theorem 4.14. The first step is to define the metric in E × Y induced by the metrics dE and dY . Let p1 , p2 ∈ E × Y . Thus we have p1 = (x1 , y1 ) and p2 = (x2 , y2 ) for some x1 , x2 ∈ E and y1 , y2 ∈ Y , We define dE×Y (p1 , p2 ) = dE (x1 , x2 ) + dY (y1 , y2 ),

(4.2)

where dE and dY are the metrics in the spaces E and Y respectively. We must check Definition 2.15:


• Since p_1 = p_2 if and only if x_1 = x_2 and y_1 = y_2, if and only if d_E(x_1, x_2) = 0 and d_Y(y_1, y_2) = 0, we have d_{E×Y}(p_1, p_2) = 0 if and only if p_1 = p_2.

• Since d_E(x_1, x_2) = d_E(x_2, x_1) and d_Y(y_1, y_2) = d_Y(y_2, y_1), we have d_{E×Y}(p_1, p_2) = d_{E×Y}(p_2, p_1).

• For any p_1, p_2, p_3 ∈ E × Y, we have

d_{E×Y}(p_1, p_2) = d_E(x_1, x_2) + d_Y(y_1, y_2)
                 ≤ d_E(x_1, x_3) + d_E(x_3, x_2) + d_Y(y_1, y_3) + d_Y(y_3, y_2)
                 = d_{E×Y}(p_1, p_3) + d_{E×Y}(p_2, p_3).

These show that the definition (4.2) is indeed a metric on E × Y. By Example 2.16, we know that every subset Y of a metric space X is a metric space in its own right, with the same distance function. Hence the expression (4.2) is also a metric on E × f(E). The second step is to show that g is continuous on E. Let p be a point of E and let ε > 0 be given. Since f is continuous on E, it is continuous at p by Definition 4.5. Thus there exists a δ > 0 such that

d_Y(f(x), f(p)) < ε/2    (4.3)

if d_E(x, p) < δ and x ∈ E. We let δ′ = min(ε/2, δ). Then we still have the inequality (4.3) if d_E(x, p) < δ′ and x ∈ E. Hence it follows from the definition (4.2) and the inequality (4.3) that

d_{E×Y}(g(x), g(p)) = d_E(x, p) + d_Y(f(x), f(p)) < δ′ + ε/2 ≤ ε/2 + ε/2 = ε.

Given ε > 0, we let δ = ε/m². If |x| < δ, then we have

|f(x, mx) − f(0, 0)| = |m²x/(1 + m⁴x²)| ≤ m²|x| < m²δ = ε.

By Definition 4.5, the restriction f|_{E′} is continuous at the origin. The case of the continuity of g|_{E′} is similar to that of f|_{E′}, so we omit the details here. This completes the proof of the problem.
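For completeness, here is a quick numerical check (my own addition, not part of the text) of the behaviour of the two functions of Rudin's Exercise 4.7, f(x, y) = xy²/(x² + y⁴) and g(x, y) = xy²/(x² + y⁶), both taken to be 0 at the origin: along the curve x = y² the values of f stay at 1/2, so f cannot be continuous at (0, 0), while g blows up along x = y³, even though every restriction to a straight line is continuous.

f = lambda x, y: x * y**2 / (x**2 + y**4)
g = lambda x, y: x * y**2 / (x**2 + y**6)

for t in (1e-1, 1e-2, 1e-3):
    # f = 1/2 along x = y^2, while g = 1/(2t) grows without bound along x = y^3
    print(f(t**2, t), g(t**3, t))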




4.3 Problems on uniformly continuous functions

Problem 4.8 Rudin Chapter 4 Exercise 8.

Proof. Since E is bounded in R, Theorem 1.19 guarantees that inf E and sup E exist in R. Let them be a and b respectively so that E ⊆ [a, b]. Since f is uniformly continuous on E, if we take ε = 1, then there exists a δ > 0 such that |f(x) − f(y)| < 1 for all x, y ∈ E with |x − y| < δ. Let ∆ = 2(b − a)/δ and let N be the least positive integer such that N ≥ ∆. Then we define the intervals

I_n = [a + (n − 1)δ/2, a + nδ/2],

where n = 1, 2, . . . , N. By definition, we have

I_1 = [a, a + δ/2]  and  I_N = [a + (N − 1)δ/2, a + Nδ/2],

where a + Nδ/2 ≥ b. See Figure 4.2.

Figure 4.2: The sets E and I_{n_i}.

Since the width of each I_n is δ/2, we have

|f(x) − f(y)| < 1    (4.6)

for all x, y ∈ I_n ∩ E.^d In the following discussion, we suppose that n_1, n_2, . . . , n_k are positive integers such that 1 ≤ n_1 < n_2 < · · · < n_k ≤ N and I_{n_i} ∩ E ≠ ∅ for i = 1, 2, . . . , k. Therefore, we can pick and fix an element x_{n_i} ∈ I_{n_i} ∩ E and it follows from the triangle inequality and the inequality (4.6) that

|f(x)| ≤ |f(x) − f(x_{n_i})| + |f(x_{n_i})| < 1 + |f(x_{n_i})|    (4.7)

for all x ∈ I_{n_i} ∩ E, where i = 1, 2, . . . , k. Let M = max(|f(x_{n_1})|, |f(x_{n_2})|, . . . , |f(x_{n_k})|). It is clear that

(I_{n_1} ∩ E) ∪ (I_{n_2} ∩ E) ∪ · · · ∪ (I_{n_k} ∩ E) = E.

If x ∈ E, then we have x ∈ I_{n_i} ∩ E for some i = 1, 2, . . . , k and it follows from the inequality (4.7) that |f(x)| < 1 + M. Hence f is bounded on E.

We know that f(x) = x is a real uniformly continuous function on R (we just need to take δ = ε in Definition 4.18). However, it is obviously not bounded on R, completing the proof of the problem.

^d Note that it may happen that I_n ∩ E = ∅ for some n ∈ {1, 2, . . . , N}.
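The next Python snippet is my own numerical illustration of the covering argument above, using the uniformly continuous function f(x) = √x on the bounded set E = (0, 1); for ε = 1 the choice δ = 1 works because |√x − √y| ≤ √|x − y| < 1 whenever |x − y| < 1. The particular f, E and δ are assumptions made only for this demonstration.

import numpy as np

f = np.sqrt
a, b, delta = 0.0, 1.0, 1.0

N = int(np.ceil(2 * (b - a) / delta))        # number of intervals of width delta/2
edges = a + np.arange(N + 1) * delta / 2.0
E = np.linspace(1e-6, 1 - 1e-6, 10_001)      # a fine sample of E

M = 0.0
for left, right in zip(edges[:-1], edges[1:]):
    pts = E[(E >= left) & (E <= right)]
    if pts.size:                             # I_n ∩ E may be empty
        M = max(M, abs(f(pts[0])))           # fix one point x_n in I_n ∩ E
print(max(abs(f(E))), "<=", 1 + M)           # sup |f| on the sample stays below 1 + M
assert max(abs(f(E))) <= 1 + M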


Problem 4.9 Rudin Chapter 4 Exercise 9.

Proof. Let us restate the two requirements first:

• Statement 1: To every ε > 0 there exists a δ > 0 such that diam f(E) < ε for all E ⊆ X with diam E < δ.

• Statement 2: To every ε > 0 there exists a δ > 0 such that d_Y(f(p), f(q)) < ε for all p, q ∈ E ⊆ X with d_X(p, q) < δ.

Recall from Definition 3.9 that if E is a nonempty subset of a metric space X and

S_E = {d_X(p, q) | p, q ∈ E}  and  S_{f(E)} = {d_Y(P, Q) | P, Q ∈ f(E)} = {d_Y(f(p), f(q)) | p, q ∈ E},

then we have

diam E = sup S_E  and  diam f(E) = sup S_{f(E)}.    (4.8)

Suppose that Statement 1 is true. Let p, q ∈ E. If d_X(p, q) < δ, then Definition 1.7 implies that δ is greater than any element of S_E and so the first expression in (4.8) implies that diam E < δ. Since Statement 1 is true, we have diam f(E) < ε. By the second expression in (4.8) again, the latter inequality implies that d_Y(f(p), f(q)) < ε. This shows Statement 2 is true.

Next, we suppose that Statement 2 is true. If diam E < δ, then the first expression in (4.8) implies that d_X(p, q) < δ for all p, q ∈ E and thus d_Y(f(p), f(q)) < ε for all p, q ∈ E. By this and the second expression in (4.8), we have diam f(E) < ε. Therefore, Statement 1 is true.

Hence Statement 1 and Statement 2 are equivalent, completing the proof of the problem.

Problem 4.10 Rudin Chapter 4 Exercise 10.

Proof. Assume that f was not uniformly continuous on X. Then it follows from Definition 4.18 that there exists an ε > 0 such that for every δ > 0, we have d_Y(f(p), f(q)) ≥ ε for some p, q ∈ X with d_X(p, q) < δ. For each positive integer n, if we take δ = 1/n, then there are sequences {p_n}, {q_n} in X such that

d_X(p_n, q_n) < 1/n  but  d_Y(f(p_n), f(q_n)) ≥ ε.    (4.9)

Furthermore, we know from the inequality (4.9) that pn 6= qn for all positive integers n. Since X is a compact metric space, Theorem 3.6(a)e implies that a subsequence of {pn }, namely {pnk }, converges to p ∈ X. By the triangle inequality, we have dX (qnk , p) ≤ dX (qnk , pnk ) + dX (pnk , p)
0 such that dY (f (p), f (q)) < ǫ for all p, q ∈ X and dX (p, q) < δ. Since {xn } is a Cauchy sequence in X, there is an integer N such that dX (xn , xm ) < δ if m, n ≥ N . Hence we have dY (f (xn ), f (xm )) < ǫ if m, n ≥ N , i.e., {f (xn )} is a Cauchy sequence by Definition 3.8. Alternative proof of Problem 4.13: Let p ∈ X \ E. Since E is a dense subset of X, there is a sequence {pn } in E such that lim pn = p. n→∞

By Theorem 3.11(a), {pn } is a Cauchy sequence in X. Since f is uniformly continuous on E, our Problem 4.11 implies that {f (pn )} is also a Cauchy sequence in R and we follow from Theorem 3.11(c) that {f (pn )} converges to P ∈ R. Suppose that there is another sequence {qn } in E which converges to p. By the previous analysis, we know that {f (qn )} converges to Q ∈ R. Now we have the following result about P and Q: Lemma 4.1 We have P = Q.

Proof of Lemma 4.1. Given that δ > 0. Since pn → p and qn → p as n → ∞, there exist positive integers N1 and N2 such that dX (pn , p)
0 such that |f (a) − f (b)| < ǫ for all a, b ∈ E and dX (a, b) < δ. Thus if we take a = pn and b = qn , then there exists a positive integer N such that dX (pn , qn ) < δ for all n ≥ N . Therefore we have for all n ≥ N , |f (pn ) − f (qn )| < ǫ.

(4.11)

By Definition 3.1, lim [f (pn ) − f (qn )] = 0

n→∞

and it follows from Theorem 3.3(a) that P = Q. This completes the proof of the lemma.




We can continue the proof of the problem. We define g : X → R by

g(p) = { f(p),              if p ∈ E;
         lim_{n→∞} f(p_n),  if p ∈ X \ E, where {p_n} ⊆ E and lim_{n→∞} p_n = p.    (4.12)

By Lemma 4.1, the definition (4.12) is well-defined. It remains to show that g is continuous at every p ∈ X. At the first glance, one may think that it is not necessary to consider points in E because f is uniformly continuous on E. However, the definition only says that |f (x) − f (p)|
0. • Case (i): p ∈ E. In this case, g(p) = f (p). Thus it is easy to see that there exists a δ > 0 such that ǫ |g(x) − g(p)| = |f (x) − f (p)| < 3 for all x ∈ E and dX (x, p) < δ. Thus, without loss of generality, we may assume that x ∈ X \ E. Let dX (x, p) < δ. Since E is dense in X, we have {xn } ⊆ E such that lim xn = x. Since we have n→∞

g(x) = lim f (xn ), there exists a positive integer N such that n→∞

dX (xN , x) + dX (x, p) < δ

and |g(x) − f (xN )|
0 and f (1) < 1 which imply that g(0) = 0 − f (0) < 0 and g(1) = 1 − f (1) > 0.

By Theorem 4.23, there exists c ∈ (0, 1) such that g(c) = 0, i.e., f (c) = c. This completes the proof of  the problem.
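For illustration only (this is not part of the proof), the fixed point whose existence is guaranteed above can be located numerically by bisection applied to g(x) = x − f(x); the sample map f below is my own choice of a continuous function from [0, 1] into [0, 1].

import math

def f(x):
    return math.cos(x) / 2 + 0.25      # continuous, maps [0, 1] into [0, 1]

def fixed_point(f, lo=0.0, hi=1.0, tol=1e-12):
    g = lambda x: x - f(x)             # here g(0) <= 0 <= g(1)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

c = fixed_point(f)
print(c, abs(c - f(c)))                # |c - f(c)| is essentially zero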


Problem 4.15 Rudin Chapter 4 Exercise 15.

Proof. Let f : R → R be a continuous open mapping. Assume that f was not monotonic. In other words, there exists a < c < b such that either f (a) > f (c) and f (c) < f (b)

(4.21)

or f(a) < f(c) and f(c) > f(b).

(4.22)

If inequalities (4.21) hold, then we consider the restriction of f (see Problem 4.7) to the compact set [a, b]. By Theorem 4.16, there exist p, q ∈ [a, b] such that f attains its maximum and minimum at p and q respectively. Certainly, we have q ∈ (a, b) because of the inequalities (4.21).

• Case (i): p ∈ (a, b). By definition, we have

f(q) ≤ f(x) ≤ f(p)    (4.23)

for all x ∈ (a, b) so that

f((a, b)) ⊆ [f(q), f(p)].    (4.24)

By Theorem 2.47, (a, b) is connected. Thus Theorem 4.22 implies that f ((a, b)) is a connected subset of R. Since f (p), f (q) ∈ f ((a, b)) and the inequalities (4.23), Theorem 2.47 again implies that [f (q), f (p)] ⊆ f ((a, b)). (4.25) Now we obtain from the set relations (4.24) and (4.25) that f ((a, b)) = [f (q), f (p)]. Since f is an open map and (a, b) is an open set in R, f ((a, b)) = [f (q), f (p)] is open in R which is a contradiction. • Case (ii): p ∈ / (a, b). Then f attains its maximum only at the end points a or b. However, by applying similar argument as obtaining the set relations (4.24) and (4.25), we can show that either f ((a, b)) = [f (q), f (a)) or f ((a, b)) = [f (q), f (b)). Since both [f (q), f (a)) and [f (q), f (b)) are not open, it contradicts the hypothesis that f is an open map. Therefore, inequalities (4.21) cannot hold. Similarly, inequalities (4.22) cannot hold. Hence, f must  be monotonic. This completes the proof of the problem.
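A small numerical illustration (mine, not from the text) of why the openness hypothesis fails for a non-monotonic map: f(x) = x² is continuous but attains an interior minimum on (−1, 1), so f((−1, 1)) = [0, 1) contains its endpoint 0 and is therefore not open in R.

import numpy as np

x = np.linspace(-1, 1, 20_001)[1:-1]   # a fine sample of the open interval (-1, 1)
values = x ** 2
print("min value:", values.min())       # 0.0, the minimum is attained (at x = 0)
print("max value:", values.max())       # < 1, the supremum 1 is not attained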

4.5 Discontinuous functions

Problem 4.16 Rudin Chapter 4 Exercise 16.

Proof. Both the functions f(x) = [x] and g(x) = (x) have simple discontinuities at all integers. To see this, let n ∈ Z; then we have

f(n+) = lim_{x→n+} f(x) = n,   f(n−) = lim_{x→n−} f(x) = n − 1

and

g(n+) = lim_{x→n+} g(x) = 0,   g(n−) = lim_{x→n−} g(x) = 1.


Let us show that f and g are continuous on R \ Z. Let x ∈ R \ Z. Then we have x ∈ (k − 1, k) for some k ∈ Z so that f (x) = [x] = k − 1 and g(x) = (x) = x − [x] = x − k + 1. In other words, f (x) is constant on each (k − 1, k), so f is continuous on each (k − 1, k). Similarly, since g(x) is a polynomial of x on each (k − 1, k), g is continuous on each (k − 1, k). See Figure 4.3 for the graphs of the functions [x] (the blue line) and (x) (the red line):

Figure 4.3: The graphs of [x] and (x).

This completes the proof of the problem.
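The following short Python check (added here purely as an illustration) evaluates the two functions just to the left and right of an integer and reproduces the one-sided limits computed above.

import math

n, h = 3, 1e-9
floor_right = math.floor(n + h)              # -> n
floor_left  = math.floor(n - h)              # -> n - 1
frac_right  = (n + h) - math.floor(n + h)    # -> 0
frac_left   = (n - h) - math.floor(n - h)    # -> 1
print(floor_right, floor_left, frac_right, frac_left)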



Problem 4.17 Rudin Chapter 4 Exercise 17.

Proof. By Definition 4.26, there are two ways in which a function can have a simple discontinuity, so we consider the cases separately. • Case (i): f (x+) 6= f (x−). Let E+ = {x | f (x−) < f (x+)} ⊆ (a, b). Since f (x−) < f (x+), Theorem 1.20(b) implies the existence of a rational number p such that f (x−) < p < f (x+). Therefore this p satisfies condition (a). Fix this p, Theorem 1.20(b) again implies the existence of a δ > 0 such that f (x−) + δ < p. By Definition 4.25, for every ǫ > 0, there exists a δ > 0 such that |f (t) − f (x−)| < ǫ for all a < t < x and x − t < δ. Since the ǫ can be chosen such as f (x−) + ǫ < p, we have f (t) < p for all t such that x − δ < t < x. Since there exists a rational number q such that x − δ < q < t, condition (b) is satisfied. Similarly, there exists a rational number r satisfying condition (c). Now we have shown that for each x ∈ E+ , we have associated a triple (p, q, r) of rational numbers satisfying conditions (a) to (c). By Theorem 2.13, the set of all such triples is countable. Suppose that the triple (p, q, r) is associated with x and y in E+ . Assume that x < y. Then we have x < t0 < y. Since x < t < r < b implies f (t) > p, we have f (t0 ) > p. However, since a < q < t < y


implies f(t) < p, we have f(t0) < p. Thus a contradiction occurs and we cannot have x < y. The inequality y < x can be shown to be impossible by a similar argument. Therefore, we have x = y. In other words, each triple is associated with at most one point of E+. Hence the set E+ is at most countable. If we define E− = {x | f(x+) < f(x−)} ⊆ (a, b), then the above analysis can also be applied to show that E− is at most countable.

• Case (ii): f(x+) = f(x−) ≠ f(x). Let F+ = {x | f(x+) = f(x−) < f(x)} ⊆ (a, b). By applying the argument in Case (i), it can be shown that each point x ∈ F+ is associated with a triple (p, q, r) of rational numbers such that (1) f(x−) = f(x+) < p < f(x) and a < q < t < x or x < t < r < b implies that f(t) < p, and (2) each such triple (p, q, r) is associated with at most one point of F+. By Theorem 2.13, F+ is at most countable. Next, if we define F− = {x | f(x+) = f(x−) > f(x)} ⊆ (a, b), then it is clear from the above analysis that it is also at most countable.

Since the set of points at which f has a simple discontinuity, namely G, is the union of E+ , E− , F+ and F− and each set is at most countable, G is also at most countable by Theorem 2.13, completing the  proof of the problem. Problem 4.18 Rudin Chapter 4 Exercise 18.

Proof. The function in question is called Thomae's function, named after Carl Johannes Thomae; see https://en.wikipedia.org/wiki/Thomae%27s_function for more details.

Let α be irrational. We want to show that for every ε > 0, there exists a δ > 0 such that |f(x) − f(α)| = |f(x)| < ε for all x ∈ R with |x − α| < δ. Given ε > 0. By Theorem 1.20(a) (the Archimedean property), we let n be the least positive integer such that 1/n < ε. Now it is clear from the definition that f(x) = 0 if x is irrational, and so the crux of the proof is to construct δ so that no rational number in the interval (α − δ, α + δ) has denominator ≤ n. The construction of the number δ is as follows. Let k be a positive integer ≤ n. Since α is irrational, kα is irrational too. By applying the proof of Theorem 1.20(b), we can find an integer m_k such that m_k < kα < m_k + 1, which implies that

α ∈ I_k = (m_k/k, (m_k + 1)/k).

Since I_k is an open interval whose endpoints are consecutive multiples of 1/k, the interval I_k does not contain a rational number with k as its denominator. For this k, we define δ_k = min(|m_k/k − α|, |α − (m_k + 1)/k|). Since (α − δ_k, α + δ_k) ⊂ I_k, the interval (α − δ_k, α + δ_k) does not contain a rational number with k as its denominator either. Next, we define δ = min(δ_1, δ_2, . . . , δ_n) and we consider the interval (α − δ, α + δ). By the above analysis, this interval does not contain a rational number whose denominator is ≤ n. If x ∈ (α − δ, α + δ) and x = p/q, then we must have q ≥ n + 1 and thus

|f(x)| = 1/q ≤ 1/(n + 1) < 1/n < ε,

as required. By definition, f is continuous at every irrational number, i.e., lim_{x→α} f(x) = f(α) = 0. See Figure 4.4 for an example with α = √2 and n = 5.


Figure 4.4: An example for α = √2 and n = 5.
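As a hands-on illustration (not part of the original argument), the sketch below evaluates Thomae's function at rational approximations of α = √2: the closer the rational is to α, the larger its denominator must be, so the function values shrink, in line with the continuity proof above.

from fractions import Fraction
from math import sqrt

def thomae(q: Fraction) -> Fraction:
    # q = m/n in lowest terms (n > 0)  ->  f(q) = 1/n
    return Fraction(1, q.denominator)

alpha = sqrt(2)
for digits in range(1, 8):
    r = Fraction(round(alpha * 10**digits), 10**digits)   # a rational near alpha
    print(float(r - alpha), thomae(r))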

Let x = m/n. To prove that f has a simple discontinuity at x, we check Definitions 4.25 and 4.26 directly. We need the following result:

Lemma 4.2
The set R \ Q is also dense in R.

Proof of Lemma 4.2. Let x, y ∈ R and x < y. Suppose first that 0 ∉ (x, y). We consider x/√2 < y/√2. By Theorem 1.20(b), there is a rational q with x/√2 < q < y/√2, and then √2·q is an irrational number lying in (x, y).

Lemma 4.3
Suppose that r_k = p_k/q_k ≠ m/n for all k and r_k → m/n, where p_k ∈ Z and q_k ∈ N. Then we have q_k → +∞ as k → ∞.


Proof of Lemma 4.3. Assume that the set {q_k} was bounded. Then there is a positive integer N such that 1 ≤ q_k ≤ N for all k. If {p_k} is bounded, then {r_k} takes only finitely many values, which contradicts the hypothesis that r_k → m/n with r_k ≠ m/n for all k. Thus we have p_k → +∞ as k → ∞. However, this means that r_k → +∞ as k → ∞, a contradiction again. Hence we have q_k → +∞ as k → ∞.

Now we return to the proof of the problem. By the definition of f, we have {f(r_k)} = {1/q_k}. By Lemma 4.3, we have 1/q_k → 0 as k → ∞. In other words, we have

lim_{k→∞} f(r_k) = 0.

By Definition 4.25, we have f(x+) = 0. Similarly, we also have f(x−) = 0. Since f(x) = f(m/n) = 1/n, we have

f(x+) = f(x−) ≠ f(x).

By Definition 4.26, f has a simple discontinuity at every rational point. This competes the proof of the problem.  Problem 4.19 Rudin Chapter 4 Exercise 19.

Proof. For every rational r, let Er = {x ∈ R | f (x) = r}. By the hypothesis, Er is closed in R. Assume that f was not continuous at x0 . By Theorems 4.2 and 4.6, there exists a sequence {xn } in R such that xn → x0 and xn 6= x0 for all n but f (xn ) 9 f (x0 ). Thus there exists a subsequence {xnk } of {xn } such that |f (xnk ) − f (x0 )| ≥ ǫ

for some ǫ > 0 and for all k. For the convenience of the discussion, we may rename this subsequence as {xn } so that f (xn ) − f (x0 ) ≥ ǫ for all n, i.e., f (xn ) ≥ f (x0 ) + ǫ for all n. By Theorem 1.20(b), there is a rational r such that f (xn ) ≥ f (x0 ) + ǫ > r > f (x0 ) for all n. By the assumption, we have f (tn ) = r for some tn between x0 and xn . Since xn → x0 as n → ∞, we must have tn → x0 as n → ∞. By this and the fact that tn ∈ Er for all n, we know that x0 is a limit point of Er . Since Er is closed, x0 ∈ Er and thus f (x0 ) = r which contradicts our choice of r.  Hence f is continuous, completing the proof of the analysis.

4.6 The distance function ρE

Problem 4.20 Rudin Chapter 4 Exercise 20.

Proof. (a) By definition, it is clear that ρE (x) ≥ 0.

Suppose that x ∈ E. Then x ∈ E or x ∈ E ′ . If x ∈ E, then it is clear that ρE (x) = 0. If x ∈ E ′ , then x is a limit point of E. Thus for every N n1 (x), there is a point zn ∈ N n1 (x) but zn 6= x such that zn ∈ E. By the choice of N n1 (x), we have d(x, zn ) < n1 which gives 0 ≤ ρE (x)
0, N a contradiction. This proves our claim. In addition, it is easy to see that z 6= x because x ∈ / E. ρE (x) ≥

Next, we let Nδ (x) be a neighborhood of x for some δ > 0. Then there exists a positive integer n such that n1 < δ so that N n1 (x) ⊂ Nδ (x).

The preceding analysis makes sure that there exists a z ∈ Nδ (x) such that z 6= x and z ∈ E. Hence, it follows from Definition 2.18(b) that x is a limit point of E. Hence we have x ∈ E ′ ⊆ E, completing the proof of part (a).

Figure 4.5: The distance from x ∈ X to E.

(b) Since ρE(x) ≤ d(x, z) ≤ d(x, y) + d(y, z) for all x, y ∈ X and z ∈ E, taking the infimum over z ∈ E gives ρE(x) ≤ d(x, y) + ρE(y).

(4.26)

Similarly, we have ρE (y) ≤ d(y, z) ≤ d(y, x) + d(x, z) for all x, y ∈ X which implies that ρE (y) ≤ d(x, y) + ρE (x).

(4.27)

Combining inequalities (4.26) and (4.27), we have |ρE(x) − ρE(y)| ≤ d(x, y) for all x, y ∈ X. Given ε > 0, let δ = ε/2. Then for all x, y ∈ X with d(x, y) < δ, we have

|ρE(x) − ρE(y)| ≤ d(x, y) < δ = ε/2 < ε.

By Definition 4.18, ρE is a uniformly continuous function on X. This finishes the proof of the problem.
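The following Python snippet (my own addition, using a sample finite, hence closed, set E ⊂ R) checks numerically the 1-Lipschitz inequality |ρE(x) − ρE(y)| ≤ d(x, y) that underlies the uniform continuity just proved.

import random

E = [-2.0, 0.0, 0.5, 3.0]                       # a sample closed subset of R
rho = lambda x: min(abs(x - z) for z in E)

random.seed(0)
for _ in range(10_000):
    x, y = random.uniform(-5, 5), random.uniform(-5, 5)
    assert abs(rho(x) - rho(y)) <= abs(x - y) + 1e-12
print("1-Lipschitz inequality verified on 10000 random pairs")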




4.6. The distance function ρE Problem 4.21 Rudin Chapter 4 Exercise 21.

Proof. We define the function ρF : K → R by ρF (x) = inf d(x, z), z∈F

where x ∈ K. Since F is closed, we have F = F by Theorem 2.27(b). Since F ∩ K = ∅, we have F ∩K = ∅ and Problem 4.20(a) implies that ρF (x) 6= 0 for all x ∈ K. Therefore, we follow from Problem 4.20(b) that ρF is a positive (uniformly) continuous function on K. By Theorem 4.16, there exists an a ∈ K such that ρF (a) = min ρF (x). Since ρF is positive, we must have ρF (a) > 0 so that there exists a δ > 0 x∈K

such that 0 < δ < ρF (a) ≤ ρF (q) ≤ d(p, q) for all p ∈ F and q ∈ K. Suppose that K = {1, 2, 3, . . .} and F = {1 + 21 , 2 + 13 , . . .} Then both K and F are closed as well as 1 ∈ F for every n ∈ N, we have K ∩ F = ∅, but they are not compact. Since n ∈ K and n + n+1  d n, n +

1 1  = →0 n+1 n

as n → ∞. Thus our conclusion fails in this case and this ends the proof of the problem.



Problem 4.22 Rudin Chapter 4 Exercise 22.

Proof. By definition, we have ρA ≥ 0 and ρB ≥ 0 on X. We claim that ρA (x) + ρB (x) > 0

(4.28)

for all x ∈ X. Since A and B are closed, it follows from Problem 4.20(a) that ρA (x) = 0 if and only if x ∈ A and ρB (x) = 0 if and only if x ∈ B. Since A ∩ B = ∅, there is no point x ∈ X such that ρA (x) = ρB (x) = 0 which yields the result (4.28). By Problem 4.20(b), both ρA and ρB are continuous functions on X. By Theorem 4.9, we see immediately that f is a continuous function on X. To find the range of f , we note that 0 ≤ ρA (x) ≤ ρA (x) + ρB (x). Combining this and the inequality (4.28), we have ρA (x) ≤ 1. 0≤ ρA (x) + ρB (x) Since f (a) = 0 if a ∈ A and f (b) = 1 if b ∈ B, we have f (X) = [0, 1]. Since f (p) = 0 if and only if ρA (p) = 0 and A is closed, we deduce from Problem 4.20(a) that ρA (p) = 0 if and only if p ∈ A. Hence f (p) = 0 precisely on A. The result that f (p) = 1 precisely on B can be proven similarly, so we omit the details here. Now the above result can be applied to obtain a converse of Problem 4.3: Let A ⊆ X be closed. Then A = Z(f ) for some real continuous function f . To see why it is true, we consider two cases:

Chapter 4. Continuity

76

• Case (i): A = X. Then the function f : X → R defined by f (x) = 0 for all x ∈ X satisfies the required conditions. • Case (ii): A ⊂ X. Then we take any point y ∈ X \ A. Thus B = {y} is closed and A ∩ B = ∅. So the f defined in this problem is the desired continuous real function with the property that Z(f ) = A. Let V = f −1 ([0, 12 )) and W = f −1 (( 21 , 1]). It is obvious that V and W are disjoint because if x ∈ V ∩ W , then f (x) ∈ [0, 12 ) and f (x) ∈ ( 12 , 1] which is impossible. Since  1 1 h 1 = [0, 1] ∩ − , 0, 2 2 2

and

1

i 1 3 , 1 = [0, 1] ∩ , , 2 2 2

Theorem 2.30 implies that both [0, 12 ) and ( 12 , 1] are open sets in [0, 1]. By Theorem 4.8, we know that V and W are disjoint open sets in X. Finally, if p ∈ A, then f (p) = 0 so that p ∈ V, i.e., A ⊆ V . Similarly, if p ∈ B, then f (p) = 1 so that p ∈ W , i.e., B ⊆ W . This completes the proof of  the problem.

4.7

Convex functions

Problem 4.23 Rudin Chapter 4 Exercise 23.

Proof. Before we prove the results, let’s look at the graph of a convex function first:

Figure 4.6: The graph of a convex function f . From Figure 4.6, we see that the graph of a convex function is below the straight line connecting the points (x, f (x)) and (y, f (y)). Furthermore, the graph indicates that the equality holds if and only if x = y. Now we are going to prove the assertions one by one. • f is continuous in (a, b). Suppose that p, q ∈ (a, b) and p < q. We first show that

77

4.7. Convex functions Lemma 4.4 The function f is bounded in [p, q].

p−a b−q Proof of Lemma 4.4. Let 0 < r < min( q−p 4 , 4 , 4 ), Mp,q = max(f (p), f (q)) and t ∈ (p, q). t−p It is obvious that [p+r, q −r] ⊂ (a, b). If λ = q−p , then we have 0 < λ < 1 and λq +(1−λ)p = t. Therefore, the definition implies that

f (t) = f (λq + (1 − λ)p) ≤ λf (q) + (1 − λ)f (p)

≤ λMp,q + (1 − λ)Mp,q = Mp,q

for all t ∈ (p, q). In other words, f is bounded above in [p, q]. r Since a < p − r < p < q, if t ∈ (p, q) and λ = t−p+r , then 0 < λ < 1 and we have p = (1 − λ)(p − r) + λt. By the definition of a convex function, we have f (p) = f ((1 − λ)(p − r) + λt) ≤ (1 − λ)f (p − r) + λf (t) so that 1 1−λ f (p) − f (p − r) λ λ t−p+r t−p = f (p) − f (p − r) r r q−p f (p − r) > f (p) − r

f (t) ≥

for all t ∈ (p, q). In other words, f is bounded below in [p, q]. Hence there is a positive number Np,q such that |f (x)| ≤ Np,q for all x ∈ [p, q], completing the proof of the lemma.  Let’s return to the proof of the problem. Given that ǫ > 0 and x is a fixed number in the interval (a, b). Then Theorem 1.20(b) ensures that there exist p, q ∈ (a, b) such that a < p < x < q < b. Suppose that 0 < κ < 12 min(x − p, q − x). Then we have a < p < p + κ < x < q − κ < q < b. See Figure 4.7 for the positions of the points p, p + κ, q − κ and q.

Figure 4.7: The positions of the points p, p + κ, q − κ and q. Case (i): Let y be a real number such that p < p + κ < x < y < q − κ < q. If λ = x−p y−p , then it is easy to check that 0 < λ < 1 and x = (1 − λ)p + λy. Thus we have y − p ≥ κ and Lemma 4.4 implies that

Chapter 4. Continuity

78

f (x) ≤ (1 − λ)f (p) + λf (y)

= f ((1 − λ)p + λy) f (x) − f (y) ≤ (1 − λ)[f (p) − f (y)] y−x [f (p) − f (y)] = y−p 2Np,q (y − x) < κ

(4.29)

(4.30)

for some positive number Np,q . Similarly, if λ = y−x q−x , then it is easy to check that 0 < λ < 1 and y = (1 − λ)x + λq. Thus we have q − x > κ and Lemma 4.4 implies that f (y) ≤ (1 − λ)f (x) + λf (q) = f ((1 − λ)x + λq)

f (y) − f (x) ≤ λ[f (q) − f (x)] y−x [f (q) − f (x)] = q−x 2Np,q < (y − x). κ

(4.31)

By combining the inequalities (4.29) and (4.31), we always have |f (y) − f (x)|
δ for all p ∈ K and q ∈ F .

Now we consider the open ball B(z, δ) = {x ∈ Rk | |x − z| < δ}. Assume that there was x ∈ Rk such that x ∈ B(z, δ) ∩ (K + C). Then we have |x − z| < δ and x = p + y for some p ∈ K and y ∈ C. For this y, we have q = z − y ∈ F which implies x = p + y = p + z − q. Hence, by the definition of the chosen δ, we have δ < |p − q| = |x − z| < δ, a contradiction.

(b) We ∈ Z, the set (k − 1, k) is open in R. By Theorem 2.24(a), [ have C2 = {nα | n ∈ C1 }. For every k[ (k−1, k) is open in R. Since R\C1 = (k−1, k), C1 is closed in R by Theorem 2.23. By similar k∈Z

k∈Z

argument, it can be shown that C2 is closed. By definition, we have C1 + C2 = {h + kα | h, k ∈ C1 }.

Chapter 4. Continuity

82

Now the mapping f : C1 + C2 → C1 × C1 defined by f (h + kα) = (h, k) shows that C1 + C2 is equivalent to a subset of C1 × C1 . Since C1 × C1 is countable and C1 ⊂ C1 + C2 , it follows from Theorem 2.8 that C1 + C2 is countable.h For a positive integer n, we consider the fractional part {nα} = nα − [nα]. To prove the density of the set, we need a preliminary result first: Lemma 4.6 Suppose that α is irrational and 0 < θ < 1. For every ǫ > 0, there exists a positive integer k such that |{kα} − θ| < ǫ. Hence, if h = [kα], then we have |kα − h − θ| < ǫ.

Proof of Lemma 4.6. We note that if m 6= n, then {mα} 6= {nα}. Otherwise, we have mα − [mα] = nα − [nα] which implies that α=

[mα] − [nα] m−n

is rational, a contradiction. Given that ǫ > 0 and choose θ such that 0 < θ < 1. Recall Dirichlet’s theorem that for any irrational α, there exist integers h and k such that |kα−h| < ǫ. Now we have either kα > h or kα < h. Suppose that kα > h. Since |kα − h| = kα − h = {kα} + [kα] − h, h and [kα] are integers and {kα}, we have [kα] = h and 0 < {kα} < ǫ.

(4.41)

Now we consider the following sequence {kα}, {2kα}, {3kα}, . . .. Since kα = [kα] + {kα}, we have mkα = m[kα] + m{kα} for every integer m. Thus we have m{kα} = {mkα}

if and only if

{kα}
0 in (a, b), we have f (x2 ) − f (x1 ) > 0 which means that f is strictly increasing in (a, b). To prove the second assertion, we prove the following result first: Lemma 5.1 If f is strictly increasing in (a, b), then f is one-to-one in (a, b).

Proof of Lemma 5.1. If x1 , x2 ∈ (a, b) and x1 6= x2 , then either x1 < x2 or x1 > x2 . If x1 < x2 , then since f is strictly increasing in (a, b), we have f (x1 ) < f (x2 ). Similarly, if x1 > x2 , then since f is strictly increasing in (a, b), we have f (x1 ) > f (x2 ). In both cases, we have f (x1 ) 6= f (x2 ). By Definition 2.2, f is one-to-one in (a, b), completing the proof of Lemma  5.1. 85

Chapter 5. Differentiation

86

Suppose that c and d are real numbers such that a < c < d < b. By Lemma 5.1, f is one-to-one in [c, d]. Since f is differentiable in (a, b), it is also continuous in (a, b) by Theorem 5.2. In particular, f is continuous on [c, d]. Let E = f ([c, d]). By Theorem 4.17, the inverse function g : E → [c, d] of f is well-defined and g(f (x)) = x. (5.1) Besides, g is continuous on E. Let t, y ∈ E and t 6= y. Then there exist s, x ∈ [c, d] such that f (s) = t and f (x) = y. Now we have from the expression (5.1) that φ(t) =

g(t) − g(y) g(f (s)) − g(f (x)) s−x = = . t−y f (s) − f (x) f (s) − f (x)

(5.2)

Since t → y if and only if s → x and f ′ (x) > 0 in (a, b), the expression (5.2) and Theorem 4.4(c) imply that g ′ (f (x)) = g ′ (y) = lim φ(t) = lim t→y

t→y

g(t) − g(y) s−x = lim = lim s→x f (s) − f (x) s→x t−y

1 f (s)−f (x) s−x

=

1 , f ′ (x)

where x ∈ [c, d]. For any x ∈ (a, b), it is obvious that we can find c, d such that x ∈ [c, d] ⊂ (a, b),  g ′ (f (x)) = f ′1(x) is true for every x ∈ (a, b). This completes the proof of the problem. Problem 5.3 Rudin Chapter 5 Exercise 3.

Proof. Since g is differentiable in R, f is also differentiable in R by Theorem 5.3(a). In fact, we acquire f ′ (x) = 1 + ǫg ′ (x). Since −M ≤ g ′ (x) ≤ M in R, we have 1 − ǫM ≤ f ′ (x) ≤ 1 + ǫM. Thus if ǫ is small enough, then we have 1 − ǫM > 0 so that f ′ (x) > 0 in R. By Problem 5.2, we have f is strictly increasing in R and then it is one-to-one in R. This completes the proof of the problem.  Problem 5.4 Rudin Chapter 5 Exercise 4.

2

n+1

Proof. Let f (x) = C0 x + C1 x2 + · · · + Cn xn+1 be a polynomial defined in R. In particular, f is continuous on [0, 1] and differentiable in (0, 1). By Theorem 5.10, there exists a α ∈ (0, 1) such that f (1) − f (0) = f ′ (α). Since f (1) = f (0) = 0, the equation f ′ (x) = C0 + C1 x + · · · + Cn−1 xn−1 + Cn xn = 0 has at least one real root in (0, 1) which is the desired result, completing the proof of the problem. Problem 5.5 Rudin Chapter 5 Exercise 5.



87

5.1. Problems on differentiability of a function

Proof. Since f is differentiable in (0, +∞), it is continuous on (0, +∞) by Theorem 5.2. In particular, f is continuous on [x, x + 1] and differentiable in (x, x + 1) for every x > 0. By Theorem 5.10 and the definition of g, we have g(x) = f (x + 1) − f (x) = (x + 1 − x)f ′ (y) = f ′ (y)

(5.3)

for some y ∈ (x, x + 1). If x → +∞, then we have y → +∞ and so f ′ (y) → 0. Hence it follows from the expression (5.3) that g(x) → 0 as x → +∞. This finishes the proof of the problem.  Problem 5.6 Rudin Chapter 5 Exercise 6.

Proof. By Theorem 5.3 and condition (b), g is differentiable in (0, +∞) and g ′ (x) =

xf ′ (x) − f (x) . x2

We note that g ′ (x) ≥ 0 if and only if

xf ′ (x) − f (x) ≥ 0.

For every x > 0, conditions (a) and (b) imply that f is continuous in [0, x] and differentiable in (0, x). By Theorem 5.10, we have f (x) − f (0) = (x − 0)f ′ (ξ)

for some ξ ∈ (0, x). By condition (c), we have f (x) = xf ′ (ξ). By condition (d), we have f ′ (ξ) ≤ f ′ (x) and thus f (x) ≤ xf ′ (x). Therefore, we have g ′ (x) ≥ 0 for every x > 0 and Theorem 5.11(a) implies that  g is monotonically increasing in (0, +∞). Problem 5.7 Rudin Chapter 5 Exercise 7.

Proof. Since f (x), g(x) are differentiable, g ′ (x) 6= 0 and f (x) = g(x) = 0, it follows from Theorem 4.4(c) that f (t)−f (x) f (t) f (t) − f (x) f ′ (x) t−x lim = lim = lim g(t)−g(x) = ′ t→x g(t) t→x g(t) − g(x) t→x g (x) t−x

which is our desired result. This completes the proof of the problem.



Problem 5.8 Rudin Chapter 5 Exercise 8.

Proof. Since f ′ is continuous in [a, b] and [a, b] is compact, Theorem 4.19 implies that f ′ is uniformly continuous in [a, b]. By Definition 4.18, there exists a δ > 0 such that |f ′ (t) − f ′ (x)| < ǫ

(5.4)

for every x, t ∈ [a, b] with |t − x| < δ. Suppose that t < x. Then we deduce from Theorem 5.10 that there exists a ξ ∈ (t, x) such that f (x) − f (t) = f ′ (ξ). (5.5) x−t

Since 0 < |ξ − x| < |t − x| < δ, the inequality (5.4) and the expression (5.5) together give f (x) − f (t) − f ′ (x) = |f ′ (ξ) − f (x)| < ǫ x−t

Chapter 5. Differentiation

88

which is our desired result in the case t < x. Since the case for x < t is similar, we omit the details here. We claim that this also holds for vector-valued functions: f ′ (x) is continuous on [a, b] and ǫ > 0. Then there exists δ > 0 such that f (t) − f (x) − f ′ (x) < ǫ t−x whenever 0 < |t − x| < δ and x, t ∈ [a, b]. To prove this result, suppose that f (x) = (f1 (x), f2 (x), . . . , fk (x)), then Remark 5.16 implies that f is differentiable on [a, b] if and only if each f1 , . . . , fk is differentiable on [a, b]. Furthermore, we apply Theorem 4.10 to get the result that f ′ (x) is continuous on [a, b] if and only if each f1′ , . . . , fk′ is continuous on [a, b] and thus they are uniformly continuous on [a, b] by Theorem 4.19. In other words, for each i = 1, 2, . . . , k, there exists δi > 0 such that f (t) − f (x) ǫ i i − fi′ (x) < √ t−x k

(5.6)

whenever 0 < |t − x| < δi and t, x ∈ [a, b]. Let δ = min(δ1 , . . . , δk ). Hence we follow from the inequalities (5.6) and the proof of Theorem 4.10 that for every 0 < |t − x| < δ and t, x ∈ [a, b], we have f (t) − f (x)  f (t) − f (x)  fk (t) − fk (x) 1 1 − f ′ (x) = − f1′ (x), . . . , − fk′ (x) t−x t−x t−x v u k 2 uX fi (t) − fi (x) ≤t − fi′ (x) t − x i=1 v u k uX ǫ2 0 and then f must be differentiable in (x − δ, x + δ). Since our h can be chosen arbitrarily so that (x − h, x + h) ⊆ (x − δ, x + δ), we can conclude that F (h) is differentiable as a function of h and Theorem 5.5b gives F ′ (h) = f ′ (x + h) − f ′ (x − h). By Theorem 5.13, we have F (h) F ′ (h) f ′ (x + h) − f ′ (x − h) f (x + h) + f (x − h) − 2f (x) = lim = lim ′ = lim . 2 h→0 G(h) h→0 G (h) h→0 h→0 h 2h

(5.9)

lim

On the other hand, we follow from Definition 5.1 that 1 ′′ (f (x) + f ′′ (x)) 2 1h f ′ (x + h) − f ′ (x) f ′ (x) − f ′ (x − h) i = lim + lim h→0 2 h→0 h h f ′ (x + h) − f ′ (x − h) . = lim h→0 2h

f ′′ (x) =

(5.10)

By comparing the limits (5.9) and (5.10), we have the desired result. This finishes the proof of the problem.  Problem 5.12 Rudin Chapter 5 Exercise 12.

Proof. We have f (x) = so that ′

f (x) =





−3x2 , if x < 0; 3x2 , if x > 0,

−x3 , if x ≤ 0; x3 , if x > 0 ′′

and f (x) =

To compute f ′ (0), we consider f ′ (0+) and f ′ (0−) so that f ′ (0+) = lim

t→0 t>0



−6x, if x < 0; 6x, if x > 0.

f (t) − f (0) t3 f (t) − f (0) −t3 = lim = 0 and f ′ (0−) = lim = lim = 0. t→0 t t→0 t→0 t t−0 t−0 t>0

t0 ′′

′′

t0

f ′′ (t) − f ′′ (0) −6t = lim = −6. t→0 t→0 t t−0

and f (3) (0−) = lim t 0, we

n→∞

 π =1 (5.11) f (xn ) = x0n sin(|xn |−c ) = sin 2nπ + 2 for all positive integers n so that lim f (xn ) = 1 6= f (0). Thus f is not continuous at x = 0 in n→∞ this case. 1 – Case (iii): a < 0. Similarly, we consider the sequence {xn } defined by xn = 1 . π (2nπ+ 2 ) c

Instead of the expression (5.11), we have   π  π − ac π − ac sin 2nπ + (5.12) f (xn ) = xan sin(|xn |−c ) = 2nπ + = 2nπ + 2 2 2 for all positive integers n. Since a < 0 and c > 0, − ac > 0 and then the expression (5.12) implies that lim f (xn ) = ∞ = 6 f (0). Thus f is not continuous at x = 0 in this case. n→∞

Hence we establish the result that f is continuous if and only if a > 0. (b) Note that we have φ(x) =

xa sin (|x|−c ) − 0 f (x) − f (0) = = xa−1 sin (|x|−c ) x−0 x−0

(5.13)

and so f ′ (0) = lim φ(x) = lim xa−1 sin (|x|−c ). x→0

x→0

– Case (i): a ≤ 1. We consider the sequence {xn } defined by xn = have lim xn = 0. By a similar argument as in part (a), we have

1

1

c (nπ+ π 2)

. Since c > 0, we

n→∞

a−1 a−1    π π − c π − c sin nπ + φ(xn ) = xna−1 sin(|xn |−c ) = nπ + = (−1)n nπ + 2 2 2 for all positive integers n. Now if a = 1, then

(5.14)

φ(x2k ) = 1 and φ(x2k+1 ) = −1.

By Theorem 4.2, lim φ(xn ) does not exist in this case! If a < 1, then − a−1 > 0 and the c n→∞

expression (5.14) implies that φ(x2k ) → +∞ and φ(x2k+1 ) → −∞ as k → ∞. Thus lim φ(xn ) n→∞

does not exist in this case! Therefore, we have shown that if f ′ (0) exists, then a > 1.

Chapter 5. Differentiation

92

– Case (ii): a > 1. Then since −1 ≤ | sin(|x|−c )| ≤ 1 for all x ∈ [−1, 1], we have 0 ≤ |φ(x)| ≤ |x|a−1 . Since lim |x|a−1 = 0, we have lim |φ(x)| = 0 and thus lim φ(x) = 0. In other words, we have x→0

x→0

f ′ (0) = 0 in this case.

x→0

(c) Since xa and sin(|x|−c ) are differentiable in [−1, 1] \ {0}, Theorem entiable in [−1, 1] \ {0}. Suppose that a > 1. By the result of part  a−1 [a sin(x−c ) − cx−c cos(x−c )],  x ′ f (x) = 0,  a−1 x [a sin(−x)−c − c(−x)−c cos(−x)−c ],

5.3(b) implies that f is differ(b) and Theorem 5.5, we have if 0 < x ≤ 1; if x = 0; if −1 ≤ x < 0.

(5.15)

– Case (i): a ≥ 1 + c. Then we have a > 1 and a − 1 − c ≥ 0. Therefore, we can deduce from the expressions (5.15) that |f ′ (x)| ≤ |axa−1 sin(|x|−c )| + |cxa−1−c | cos(|x|−c )| ≤ a|x|a−1 + c|x|a−1−c ≤ a + c for all x ∈ [−1, 1]. This shows that f ′ is bounded by a + c in this case.

1

– Case (ii): a < 1 + c. Consider the sequence {xn } defined by xn = (2nπ)− c > 0. Then we −c have sin(x−c n ) = sin 2nπ = 0, cos(xn ) = cos 2nπ = 1 and so f ′ (xn ) = −c(xn )a−1−c = −c(2nπ)

c+1−a c

→ −∞

as n → ∞. Therefore, f ′ is unbounded on [−1, 1] in this case. Hence we have shown that f ′ is bounded if and only if a ≥ 1 + c. (d) By the expressions (5.15), we know that f ′ is continuousc for all x ∈ [−1, 1] \ {0}, so we check whether f ′ is continuous at x = 0 or not. – Case (i): a > 1 + c. Then we have a > 1 and a − 1 − c > 0. Since |f ′ (x)| ≤ |axa−1 sin(|x|−c )| + |cxa−1−c cos(|x|−c )| ≤ a|x|a−1 + c|x|a−1−c , we have lim |f ′ (x)| = 0 which implies that lim f ′ (x) = 0. By the expression (5.15), we have x→0

x→0

lim f ′ (x) = f ′ (0), so f ′ is continuous on [−1, 1] by Theorem 4.6.

x→0

– Case (ii): a < 1 + c. Then we know from part (c) that f ′ is unbounded on [−1, 1]. Assume that f ′ was continuous on the compact set [−1, 1]. By Theorem 4.15, f ′ is bounded on [−1, 1], a contradiction. Thus it is not continuous on [−1, 1]. 1

– Case (iii): a = 1 + c. Then we consider the sequence {xn } defined by xn = (2nπ)− c > 0. −c Then we have sin(x−c n ) = sin(2nπ) = 0, cos(xn ) = cos(2nπ) = 1 and so f ′ (xn ) = −c(xn )a−1−c = −c. Since f ′ (0+) = lim f ′ (xn ) = −c 6= 0 = f ′ (0), f ′ is not continuous at x = 0 in this case. n→∞

Hence we have shown that f ′ is continuous if and only if a > 1 + c. (e) It follows from the expressions (5.15) that f ′ (x) − f ′ (0) x−0 xa−1 [a sin (|x|−c ) − c|x|−c cos(|x|−c )] = x−0 a−2 −c =x [a sin (|x| ) − c|x|−c cos(|x|−c )]

ϕ(x) =

and so f ′′ (0) = lim ϕ(x) = lim xa−2 [a sin (|x|−c ) − c|x|−c cos(|x|−c )]. x→0

c By

x→0

[21, Eqn. (49), Chap. 8] again, we assume that the function cos x is differentiable in R.

93

5.1. Problems on differentiability of a function – Case (i): a ≤ 2 + c. We consider the sequences {xn } and {yn } defined by xn = yn =

1

1

[(2n+1)π] c

1

1

(2nπ) c

and

. Since c > 0, we have lim xn = 0 and lim yn = 0. By a similar argument as n→∞

n→∞

in part (b), we have ϕ(xn ) = xna−2 [a sin (|xn |−c ) − c|xn |−c cos(|xn |−c )] = (2nπ)−

a−2 c

= −c(2nπ)

[a sin(2nπ) − c(2nπ) cos(2nπ)]

c+2−a c

(5.16)

and ϕ(yn ) = yna−2 [a sin (|yn |−c ) − c|yn |−c cos(|yn |−c )] = [(2n + 1)π]−

= c[(2n + 1)π]

a−2 c

{a sin(2n + 1)π − c[(2n + 1)π] cos(2n + 1)π}

c+2−a c

(5.17)

for all positive integers n. If a = c + 2, then the expressions (5.16) and (5.17) imply that ϕ(xn ) = −c and ϕ(yn ) = c respectively. Since ϕ(xn ) 6= ϕ(yn ), f ′′ (0) does not exist by Theorem 4.2. If a < c + 2, then we obtain from the expression (5.16) that lim ϕ(xn ) = −∞. n→∞

Thus f ′′ (0) does not exist by Theorem 4.2.

– Case (ii): a > 2 + c. Then since −1 ≤ | sin(x−c )| ≤ 1 and −1 ≤ | cos(x−c )| ≤ 1 for all x ∈ [−1, 1], we have 0 ≤ |ϕ(x)| ≤ a|x|a−2 + c|x|a−2−c . Since lim |x|a−1 = 0 and lim |x|a−2−c = 0, we have lim |ϕ(x)| = 0 and thus lim ϕ(x) = 0. In x→0

x→0

other words, we have f ′′ (0) = 0 in this case.

x→0

x→0

Hence we have shown that f ′′ (0) exists if and only if a > 2 + c. (f) Suppose that a > 2 + c. By the result of part (e) and the expressions (5.15), we have   a(a − 1)xa−2 − c2 xa−2−2c sin(x−c )     if 0 < x ≤ 1;  −c(2a − 1 − c)xa−2−c cos(x−c ), ′′ 0, if x = 0; f (x) =  a−2  2 −2c −c  a(a − 1) − c (−x) x sin(−x)    −c(2a − 1 − c)xa−2 (−x)−c cos(−x)−c , if −1 ≤ x < 0.

(5.18)

– Case (i): a ≥ 2 + 2c. Then we have a − 2 ≥ 2c > 0, a − 2 − c ≥ c > 0 and a − 2 − 2c ≥ 0. Therefore, we can deduce from the expressions (5.18) that |f ′′ (x)| ≤ [a(a − 1)xa−2 − c2 xa−2−2c ] sin(|x|−c ) + c(2a − 1 − c)xa−2−c cos(|x|−c ) ≤ a(a − 1)|x|a−2 + c2 |x|a−2−2c + c(2a − 1 − c)|x|a−2−c

≤ a(a − 1) + c2 + c(2a − 1 − c) = a(a − 1) + c(2a − 1)

for all x ∈ [−1, 1]. This shows that f ′′ is bounded by a(a − 1) + c(2a − 1) in this case.

– Case (ii): a < 2 + 2c and 2a − 1 − c 6= 0. If a ≥ 2, then we consider the sequence {xn } 1 defined by xn = (2nπ + π2 )− c > 0 so that we have  1 π = 1 and sin(x−c ) = sin 2n + n 2

Thus we have

 1 cos(x−c ) = cos 2n + π = 0. n 2

  a−2−c sin(x−c cos(x−c f ′′ (xn ) = a(a − 1)xna−2 − c2 xa−2−2c n ) − c(2a − 1 − c)xn n ) n 2−a 2+2c−a     π c π c = a(a − 1) 2nπ + − c2 2nπ + . 2 2

Chapter 5. Differentiation Since

2−a c

≤ 0 and

2+2c−a c

94 > 0, we have f ′′ (xn ) → −∞ as n → +∞.

1

If a < 2, then we consider the sequence {yn } defined by yn = (2nπ)− c > 0 so that we have sin(yn−c ) = sin(2nπ) = 0 and cos(yn−c ) = cos(2nπ) = 1. Thus we have   f ′′ (yn ) = a(a − 1)yna−2 − c2 yna−2−2c sin(yn−c ) − c(2a − 1 − c)yna−2−c cos(yn−c ) = −c(2a − 1 − c)(2nπ)

c+2−a c

.

Since c+2−a > 1 and 2a − 1 − c 6= 0, we have f ′′ (yn ) → −∞. Therefore, f ′′ is unbounded in c this case. – Case (iii): a < 2 + 2c and 2a − 1 − c = 0. Then we have from    a(a − 1)xa−2 − c2 xa−2−2c sin(x−c ), 0, f ′′ (x) =   a(a − 1) − c2 (−x)−2c xa−2 sin(−x)−c ,

the expressions (5.18) that if 0 < x ≤ 1; if x = 0; if −1 ≤ x < 0.

If a ≥ 2, then we can apply the same sequence {xn } as in Case (ii) so that we have 2−a 2+2c−a   π c π c f ′′ (xn ) = a(a − 1) 2nπ + − c2 2nπ + . 2 2

(5.19)

2+2c−a Since 2−a > 0, we have f ′′ (xn ) → −∞ as n → +∞. c ≤ 0 and c If a < 2, then the expression (5.19) can be rewritten as

Since case.

2−a c

2−a   π 2 i π c h a(a − 1) − c2 2nπ + . f ′′ (xn ) = 2nπ + 2 2

> 0, we have f ′′ (xn ) → −∞ as n → +∞. Therefore, f ′′ is also unbounded in this

Hence we have shown that f ′′ is bounded if and only if a ≥ 2 + 2c. (g) By the expressions (5.18), we know that f ′′ is continuous for all x ∈ [−1, 1] \ {0}, so we check whether f ′′ is continuous at x = 0 or not. – Case (i): a > 2 + 2c. Then we have a > 2 and a − 2 − c > a − 2 − 2c > 0. Since |f ′′ (x)| ≤ |a(a − 1)xa−2 − c2 xa−2−2c sin(|x|−c )| + |c(2a − 1 − c)xa−2−c cos(|x|−c )| ≤ a(a − 1)|x|a−2 + c2 |x|a−2−2c + c(2a − 1 − c)|x|a−2−c ,

we have lim |f ′′ (x)| = 0 which implies that lim f ′′ (x) = 0. By the expressions (5.18) again, x→0

x→0

we have lim f ′′ (x) = f ′′ (0), so f ′′ is continuous at 0 by Theorem 4.6. x→0

– Case (ii): a < 2 + 2c. Then we know from part (f) that f ′′ is unbounded on [−1, 1]. Assume that f ′′ was continuous on the compact set [−1, 1]. By Theorem 4.15, f ′′ is bounded on [−1, 1], a contradiction. Thus it is not continuous on [−1, 1]. 1

– Case (iii): a = 2+2c. Then we consider the sequence {xn } defined by xn = (2nπ + π2 )− c > 0. −c Then we have sin(x−c n ) = 1, cos(xn ) = and so  π −2 f ′′ (xn ) = a(a − 1)xna−2 − c2 xa−2−2c = a(a − 1) 2nπ + − c2 . n 2

Since f ′′ (0+) = lim f ′′ (xn ) = −c2 6= 0 = f ′′ (0), f ′′ is not continuous at x = 0 in this case. n→∞

Hence we have shown that f ′′ is continuous if and only if a > 2 + 2c. This completes the proof of the problem.



95

5.1. Problems on differentiability of a function Problem 5.14 Rudin Chapter 5 Exercise 14.

Proof. We prove the results one by one. • f is convex if and only if f ′ is monotonically increasing. – Suppose that f is convex in (a, b). Then it follows from Problem 4.23 that f (t) − f (s) f (u) − f (s) f (u) − f (t) ≤ ≤ t−s u−s u−t

(5.20)

whenever a < s < t < u < b. Similarly, we have f (u) − f (t) f (v) − f (t) f (v) − f (u) ≤ ≤ u−t v−t v−u

(5.21)

whenever a < t < u < v < b. Combining the inequalities (5.20) and (5.21), we have f (u) − f (t) f (v) − f (u) f (t) − f (s) ≤ ≤ t−s u−t v−u

(5.22)

whenever a < s < t < u < v < b. Since f is differentiable in (a, b), we have from Definition 5.1 that f ′ (x) = f ′ (x+) = f ′ (x−) for every x ∈ (a, b). In particular, we have f ′ (s) = f ′ (s+) and f ′ (u) = f ′ (u+). Thus these and the inequalities (5.22) together imply that f ′ (s) = f ′ (s+) = lim

t→s t>s

f (t) − f (s) f (v) − f (u) ≤ v→u lim = f ′ (u+) = f ′ (u) t−s v − u v>u

if a < s < u < b. Hence it means that f ′ is monotonically increasing in (a, b). – Suppose that f ′ is monotonically increasing in (a, b). Let a < x < b, a < y < b and 0 < λ < 1, we want to show that f (λx + (1 − λ)y) ≤ λf (x) + (1 − λ)f (y). (5.23) If t = λx + (1 − λ)y, then it is easily to see that 0 0. By the hypothesis, f is a twice-differentiable real function in [x, x + 2h]. In other words, the function f satisfies the conditions of Theorem 5.15 (Taylor’s Theorem). Put P (t) = f (x) + f ′ (x)(t − x) and so Theorem 5.15 (α = x and β = x + 2h) implies that f (x + 2h) = P (x + 2h) + d We

f ′′ (ξ) (x + 2h − x)2 = f (x) + f ′ (x) · (2h) + 2f ′′ (ξ)h2 2

apply the theory of Chap. 6 here to obtain that f (x) = cx + d.

(5.28)

97

5.2. Applications of Taylor’s theorem

for some ξ ∈ (x, x + 2h). Now we rewrite (5.28) to get f ′ (x) =

1 [f (x + 2h) − f (x)] − hf ′′ (ξ) 2h

and then

 1  |f (x + 2h)| + |f (x)| + h|f ′′ (ξ)| (5.29) 2h for some ξ ∈ (x, x + 2h). By the definitions of M0 , M1 and M2 , the inequality (5.29) implies that |f ′ (x)| ≤

|f ′ (x)| ≤ hM2 +

M0 , h

(5.30)

where x ∈ (a, +∞).

q 0 Since h is arbitrary and M0 , M2 are positive numbers, we may take h = M M2 > 0 in the inequality √ √ (5.30) to get |f ′ (x)| ≤ 2 M0 M2 on (a, +∞). Since it is true for all x ∈ (a, +∞), we have M1 ≤ 2 M0 M2 which gives M12 ≤ 4M0 M2 (5.31) as desired. Consider the given example in the hint, we acquire f ′ (x) =



4x 4x

(x2 +1)2

(−1 < x < 0), (0 < x < ∞),

and f ′′ (x) =

(

4 4−12x2 (x2 +1)3

(−1 < x < 0), (0 < x < ∞).

Since f ′ (x) → 0 and f ′′ (x) → 4 as x → 0, the footnote of Problem 5.9 shows that f ′ (0) = 0 and f ′′ (0) = 4. Thus we have   (−1 < x < 0), (−1 < x < 0),  4  4x 4 (x = 0), 0 (x = 0), (5.32) and f ′′ (x) = f ′ (x) =  4−12x2  4x (0 < x < ∞), (0 < x < ∞). (x2 +1)2 (x2 +1)3

From the definition of f and the expressions (5.32), we have M0 = 1, M1 = 4 and M2 = 4 which give the equality (5.31). We claim that the inequality M12 ≤ 4M0 M2 also holds for vector-valued functions. Suppose that f = (f1 , f2 , . . . , fn ), where f1 , f2 , . . . , fn are twice-differentiable real functions on (a, +∞). By Remarks 5.16, the vector-valued function f is also twice-differentiable on (a, +∞). Let, further, M0 , M1 and M2 be the least upper bounds of |f (x)|, |f ′ (x)| and |f ′′ (x)| on (a, +∞) respectively. For any c ∈ (a, +∞), we consider the function F : R → R defined by F (x) = f1′ (c)f1 (x) + f2′ (c)f2 (x) + · · · + fn′ (c)fn (x). Since each fk is twice-differentiable real function on (a, +∞), Theorem 5.3 implies that F is also a twice-differentiable real √ function on (a, +∞). Therefore, the previous analysis shows that, instead of the inequality |f ′ (x)| ≤ 2 M0 M2 , we have p |F ′ (x)| ≤ 2 M0 M2 , (5.33)

where M0 , M1 and M2 are the least upper bounds of |F (x)|, |F ′ (x)| and |F ′′ (x)|, respectively, on (a, +∞). Next, let’s recall the Cauchy-Schwarz inequality for vectors: If u = (u1 , u2 , . . . , un ) and v = (v1 , v2 , . . . , vn ), then we have q q |u1 v1 + u2 v2 + · · · + un vn | ≤ u21 + u22 + · · · + u2n · v12 + v22 + · · · + vn2 . (5.34) Apply the result (5.34) to |F (x)| and |F ′′ (x)| to get

|F (x)| = |f1′ (c)f1 (x) + f2′ (c)f2 (x) + · · · + fn′ (c)fn (x)| q p ≤ [f1′ (c)]2 + [f2′ (c)]2 + · · · + [fn′ (c)]2 · [f1 (x)]2 + [f2 (x)]2 + · · · + [fn (x)]2

Chapter 5. Differentiation

98

≤ M1 · M0 and |F ′′ (x)| = |f1′ (c)f1′′ (x) + f2′ (c)f2′′ (x) + · · · + fn′ (c)fn′′ (x)| q q ≤ [f1′ (c)]2 + [f2′ (c)]2 + · · · + [fn′ (c)]2 · [f1′′ (x)]2 + [f2′′ (x)]2 + · · · + [fn′′ (x)]2 ≤ M1 · M2

on (a, +∞). Therefore, it follows from these and the inequality (5.33) that |F ′ (x)| ≤ 2

p p p M0 M2 ≤ 2 M1 · M0 · M1 · M2 = 2M1 M0 M2 ,

(5.35)

where x ∈ (a, +∞). In particular, we may take x = c in the inequality (5.35) to obtain [f1′ (c)]2 + [f2′ (c)]2 + · · · + [fn′ (c)]2 ≤ 2M1

p M0 M2 .

(5.36)

Since the right-hand side of the inequality √ (5.36) is independent of the choice of c, we let c run through all values in (a, +∞) so that M12 ≤ 2M1 M0 M2 which implies that the desired inequality (5.31). Finally, the equality also holds for vector-valued functions if we consider f (x) = (f (x), 0, . . . , 0), where f (x) is the given example in the hint. This completes the proof of the problem.  Problem 5.16 Rudin Chapter 5 Exercise 16.

Proof. By the hypothesis, we have |f ′′ (x)| ≤ K on (0, ∞) for some positive K. Let M0 (a) =

sup |f (x)|,

M1 (a) =

x∈(a,∞)

sup |f ′ (x)|

and M2 (a) =

x∈(a,∞)

sup |f ′′ (x)|.

x∈(a,∞)

Thus Problem 5.15 implies that M12 (a) ≤ 4M2 (a)M0 (a) ≤ 4KM0 (a).

(5.37)

Since f (x) → 0 as x → ∞, we have M0 (a) → 0 as a → ∞. Otherwise, there was a sequence {an }, a positive integer N and a ǫ > 0 such that an → ∞ as n → ∞ but M0 (an ) > ǫ for all n ≥ N . Since M0 (an ) is the least upper bound of |f (x)| on (an , ∞), there exist xn ∈ (an , ∞) such that |f (xn )| > ǫ for all n ≥ N , contradicting to the fact that f (x) → 0 as n → ∞. Therefore, it follows from the inequality (5.37) that lim M12 (a) ≤ 4K lim M0 (a) = 0.

a→∞

a→∞

(5.38)

It is clear that a → ∞ implies that x → ∞. Since |f (x)| ≤ M1 (a) for all x ∈ (a, ∞), we follow from this and the inequality (5.38) that lim |f ′ (x)| ≤ lim M1 (a) = 0. x→∞



a→∞

Hence we have f (x) → 0 as x → ∞, completing the proof of the problem. Problem 5.17 Rudin Chapter 5 Exercise 17.



99

5.2. Applications of Taylor’s theorem

Proof. By Theorem 5.15, we have f (1) = P (1) +

f (3) (s) (1 − 0)3 3!

and f (−1) = P (−1) +

f (3) (t) (0 − 1)3 3!

(5.39)

for some s ∈ (0, 1) and t ∈ (−1, 0). By the hypothesis, we have P (1) = f (0) + f ′ (0) · 1 +

f ′′ (0) f ′′ (0) 2 ·1 = 2 2

and P (−1) = f (0) + f ′ (0) · (−1) +

f ′′ (0) f ′′ (0) · (−1)2 = . 2 2

Therefore, the expressions (5.39) imply that 1=

f ′′ (0) f (3) (s) + 2 6

and 0 =

f ′′ (0) f (3) (t) − 2 6

so that f (3) (s) + f (3) (t) = 6. Assume that f (3) (x) < 3 for all x ∈ (−1, 1). Then we have f (3) (s) < 3 and f (3) (t) < 3 so that f (3) (s) + f (3) (t) < 6, a contradiction. Hence we have f (3) (x) ≥ 3 for some x ∈ (−1, 1). This end the proof of the problem.



Problem 5.18 Rudin Chapter 5 Exercise 18.

Proof. We can prove this version of Taylor’s theorem by induction. For n = 1, we have 0

P (β) +

X f (i) (α) Q(1−1) (α) (β − α)1 = (β − α)i + Q(α)(β − α) = f (α) − [f (α) − f (β)] = f (β). (1 − 1)! i! i=0

Therefore, the statement is true for n = 1. Assume that it is true for n = k, i.e., k−1

f (β) = P (β) +

X f (i) (α) Q(k−1) (α) Q(k−1) (α) (β − α)k = (β − α)i + (β − α)k . (k − 1)! i! (k − 1)! i=0

(5.40)

For n = k + 1, since f (t) − f (β) = (t − β)Q(t), we have f ′ (t) = (t − β)Q′ (t) + Q(t) f ′′ (t) = (t − β)Q′′ (t) + 2Q′ (t) .. .. . . f (k) (t) = (t − β)Q(k) (t) + kQ(k−1) (t).

(5.41)

Therefore, it follows from the assumption (5.40) and the expression (5.41) that k

P (β) +

X f (i) (α) Q(k) (α) Q(k) (α) (β − α)k+1 = (β − α)i + (β − α)k+1 k! i! k! i=0 k−1 X

f (k) (α) Q(k) (α) f (i) (α) (β − α)i + (β − α)k + (β − α)k+1 i! k! k! i=0 " # k−1 X f (i) (α) Q(k) (α) Q(k−1) (α) i k+1 k (β − α) + − (β − α) + (β − α) = i! k! (k − 1)! i=0 =

Chapter 5. Differentiation

100

Q(k) (α) (β − α)k+1 k! k−1 X f (i) (α) Q(k−1) (α) (β − α)i + (β − α)k = i! (k − 1)! i=0 +

= f (β).

Thus the statement is also true for n = k + 1. Hence we follow from induction that it is true for all positive integers n, completing the proof of the problem.  Problem 5.19 Rudin Chapter 5 Exercise 19.

Proof. (a) Let αn < 0 < βn . Since f ′ (0) exists, |βn − αn | ≥ |βn | and |βn − αn | ≥ |αn |, we have f (β ) − f (α ) n n |Dn − f ′ (0)| = − f ′ (0) βn − αn f (β ) − f (0) − β f ′ (0) f (α ) − f (0) − α f ′ (0) n n n n ≤ + βn − αn βn − αn f (β ) − f (0) f (α ) − f (0) n n ≤ − f ′ (0) + − f ′ (0) . βn αn

(5.42) (5.43)

Since f ′ (0) exists, Definition 5.1 and Theorem 4.2 imply that

f (pn ) − f (0) = f ′ (0) n→∞ pn − 0 lim

for every sequence {pn } in (−1, 1) such that pn 6= 0 and lim pn = 0. In particular, we have n→∞

f (βn ) − f (0) f (αn ) − f (0) = lim = f ′ (0). n→∞ n→∞ αn βn lim

Hence we deduce from these and the inequality (5.43) that lim Dn = f ′ (0). n→∞

(b) Since 0 < αn < βn and

n { βnβ−α } n

is bounded, we have

βn βn −αn

≤ M for some positive M so that

M M 1 ≤ < . βn − αn βn αn Therefore, these and the inequality (5.42) imply that f (α ) − f (0) − β f ′ (0) f (β ) − f (0) − β f ′ (0) n n n n |Dn − f ′ (0)| < M + M βn αn f (β ) − f (0) f (α ) − f (0) n n = M − f ′ (0) + M − f ′ (0) . βn αn

Hence, a similar argument as in part (a) gives the desired result that lim Dn = f ′ (0). n→∞



(c) Since f is continuous in (−1, 1) and −1 < αn < βn < 1, f is continuous on [αn , βn ] and differentiable in (αn , βn ). By Theorem 5.10, we have f (βn ) − f (αn ) = (βn − αn )f ′ (ξn ) Dn = f ′ (ξn ),

(5.44)

101

5.2. Applications of Taylor’s theorem where ξn ∈ (αn , βn ). Again, by the continuity of f ′ , we have lim f ′ (ξn ) = f ′ (0) and so the n→∞

expression (5.44) implies that

lim Dn = f ′ (0).

n→∞

Recall from Example 5.6(b) that the function f : (−1, 1) → R defined by  2 x sin x1 , (x 6= 0); f (x) = 0, (x = 0) is differentiable in (−1, 1), but f ′ is not continuous at 0, i.e., lim f ′ (x) 6= f ′ (0) = 0. Let αn = and βn =

1 2nπ .

x→0

1 2nπ+ π 2

Then we have −1 < αn < βn < 1, αn → 0 and βn → 0 as n → ∞. Since

f (αn ) = α2n sin

sin(2nπ + π2 ) 1 1 = = αn (2nπ + π2 )2 (2nπ + π2 )2

and f (βn ) = βn2 sin

1 sin(2nπ) = = 0, βn (2nπ)2

we have Dn =

 π −4n −1 f (βn ) − f (αn ) = = π 2 × (4n) 2nπ + βn − αn (2nπ + 2 ) 2 2nπ + π2

so that lim Dn = −

n→∞

2 6= f ′ (0). π

This completes the proof of the problem.



Problem 5.20 Rudin Chapter 5 Exercise 20.

Proof. When we read the Eqn. (24) of Theorem 5.15, we may modify it to the following form: Suppose f = (f1 , f2 , . . . , fm ) : [a, b] → Rm , where f1 , f2 , . . . , fm are real functions on [a, b], n is a positive integer, f (n−1) is continuous on [a, b], f (n) exists for every t ∈ (a, b). Let α, β be distinct points of [a, b], and define P(t) = (P1 (t), P2 (t), . . . , Pm (t)) =

n−1 X k=0

where Pi (t) =

n−1 X k=0

f (k) (α) (t − α)k , k!

(5.45)

(k)

fi (α) (t − α)k for i = 1, 2, . . . , m. Then there exists a point x ∈ (α, β) such that k! f (n) (x) |f (β) − P(β)| ≤ (β − α)n . n!

(5.46)

We see that the inequality (5.46) follows immediately from Theorem 5.15 (Taylor’s Theorem) if m = 1. For the general case, we let 1 z= [f (β) − P(β)] |f (β) − P(β)|

which is clearly a unit vector. Let, further, ϕ(t) = z · f (t), where t ∈ [α, β]. Then it is easy to see that ϕ is a real-valued continuous function on [α, β] which is differentiable in (α, β). Thus it is the case when m = 1, so we deduce from Theorem 5.15 that

where

ϕ(n) (x) |ϕ(β) − Q(β)| ≤ (β − α)n , n! Q(β) =

n−1 X k=0

ϕ(k) (α) (β − α)k . k!

(5.47)

Chapter 5. Differentiation

102

Since ϕ(k) (t) = z · f (k) (t), where k = 0, 1, . . . , n, we have Q(β) = z ·

n−1 X

f (k) (α) (β − α)k = z · P(β). k!

k=0

(5.48)

Hence, it follows from the inequality (5.47) and the expression (5.48) that z · f (n) (x) |z · [f (β) − P(β)]| ≤ (β − α)n n! f (n) (x) 1 [f (β) − P(β)] · [f (β) − P(β)] ≤ |z| (β − α)n |f (β) − P(β)| n! f (n) (x) 1 |f (β) − P(β)|2 ≤ (β − α)n |f (β) − P(β)| n! f (n) (x) |f (β) − P(β)| ≤ (β − α)n n!

which is exactly our desired inequality (5.46). This finishes the proof of the problem.

5.3



Derivatives of higher order and iteration methods

Problem 5.21 Rudin Chapter 5 Exercise 21.

Proof. We prove the last assertion only because it implies all the other cases. We note from Theorem 8.6(b) that the exponential function ex has derivatives of all orders on R, so we construct the desired function based on this property. Let E be a non-empty closed subset of R. Then E = R\E is an open subset of R. Recall from Problem 2.29 that every open set in R is the union of an at most countable collection of disjoint segments. Thus we suppose that ∞ [ E= (ak , bk ), k=1

where (a1 , b1 ), (a2 , b2 ), . . . are disjoint. belong to E, i.e., ak , bk ∈ E. We start with the function

e

By this definition, we know that all the endpoints ak and bk

g(x) =



1

e− x2 , if x > 0; 0, if x ≤ 0.

(5.49)

By Theorem 8.6(c), we have $g(x) = 0$ if and only if $x \le 0$. Furthermore, it follows from Theorem 5.5 (Chain Rule) and Theorem 8.6(b) that $g$ is differentiable at all non-zero $x$. We claim that $g$ is also differentiable at $0$. To this end, it follows from the definition (5.49) and Theorem 8.6(f) that
$$g'(0+) = \lim_{x \to 0^+} \frac{g(x) - g(0)}{x - 0} = \lim_{x \to 0^+} \frac{e^{-\frac{1}{x^2}}}{x} = 0$$
and
$$g'(0-) = \lim_{x \to 0^-} \frac{g(x) - g(0)}{x - 0} = \lim_{x \to 0^-} \frac{0}{x} = 0.$$

If we take $N$ to be a positive integer such that
$$N > 1 + \Big\lceil \frac{1}{\log A}\,\log\frac{\epsilon(1 - A)}{|x_2 - x_1|} \Big\rceil$$
(note that $\log A < 0$), then we have
$$\frac{|x_2 - x_1|}{1 - A}\,A^{N-1} < \epsilon,$$
so this and the inequality (5.53) imply that for $m \ge n \ge N$, we have $|x_m - x_n| < \epsilon$. Thus $\{x_n\}$ is a Cauchy sequence, so $\lim_{n\to\infty} x_n = x$ exists. Since $f$ is continuous on $\mathbb{R}$, we have $\lim_{n\to\infty} f(x_n) = f(x)$ and then
$$x = \lim_{n\to\infty} x_{n+1} = \lim_{n\to\infty} f(x_n) = f(x).$$

Hence x is a fixed point of f . (d) The process described in part (c) can be “visualized” by the zig-zag path, where the points (xn , xn+1 ) are replaced by the points (pn−1 , pn ) (i.e., the red crosses on the blue curves). g For

example, we have xn0 +2 = f (xn0 +1 ) = f (xn0 ) = xn0 +1 .
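The iteration in part (c) is easy to experiment with. The sketch below is a minimal illustration using a concrete contraction of my own choosing, $f(x) = \cos(x)/2$ (so $|f'(x)| = |\sin x|/2 \le A = 1/2 < 1$); it is not the function of the problem, which is arbitrary subject to $|f'| \le A < 1$.

```python
import math

def f(x):
    # Sample contraction (illustrative assumption): |f'(x)| = |sin(x)|/2 <= 1/2 < 1 on all of R.
    return math.cos(x) / 2.0

x = 1.0                      # arbitrary starting point x_1
for n in range(1, 21):
    x = f(x)                 # x_{n+1} = f(x_n)
    print(n + 1, x)

# The printed values settle down to the unique fixed point x = f(x) ~ 0.450184,
# and |x_{n+1} - x_n| shrinks at least geometrically with ratio A = 1/2.
```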


Figure 5.1: The zig-zag path of the process in (c).

Problem 5.23 Rudin Chapter 5 Exercise 23.

Proof. Since $\alpha$, $\beta$ and $\gamma$ are fixed points of $f$, they are the solutions of the equation $x^3 - 3x + 1 = 0$. By using a scientific calculator, we can see easily that $\alpha = -1.87939\ldots$, $\beta = 0.34730\ldots$ and $\gamma = 1.53209\ldots$. We define $g(x) = f(x) - x$. It is obvious that
$$g(x) = \frac{x^3 - 3x + 1}{3} = 0 \quad\text{if and only if}\quad f(x) = x.$$

In other words, fixed points of f (x) are the only zeros of g(x).h Thus α, β and γ are the only zeros of g(x). (a) Suppose that x1 < α. Since g(−2) = − 31 < 0, the continuity of g implies that g(x) < 0 on (−∞, α). In particular, we have g(x1 ) < 0. Now we apply induction to prove that {xn } is a strictly decreasing sequence. For n = 1, we have x2 − x1 = f (x1 ) − x1 = g(x1 ) < 0, i.e., x2 < x1 . The statement is true for n = 1. Assume that it is true for n = k for some positive integer k, i.e., xk+1 < xk . For n = k + 1, we note that g ′ (x) = x2 − 1 > 0 on (−∞, −1), so Theorem 5.10 implies that g is strictly increasing on (−∞, −1). Therefore, our assumption gives the result that g(xk+1 ) < g(xk ) which is equivalent to xk+2 − xk+1 < xk+1 − xk < 0. Thus it is also true for n = k + 1. By induction, it is true for all positive integers n. h The following proof in this problem uses mainly the continuity property of the function g and induction. In fact, one may prove the problem by using mainly Theorem 5.11 and the differentiability and the convexity of the function f (Problems 4.23 and 5.14).


Next, we show that $\{x_n\}$ is unbounded. Assume that $\{x_n\}$ were bounded. Since it is monotonic, Theorem 3.14 implies that $\{x_n\}$ is convergent; let $\lim_{n\to\infty} x_n = \theta$. It is clear that $\theta < \alpha$. Furthermore, it follows from this and Theorem 4.2 that
$$\theta = \lim_{n\to\infty} x_{n+1} = \lim_{n\to\infty} f(x_n) = f(\theta). \tag{5.54}$$

Therefore, we arrive at a conclusion that θ is the fourth fixed point of f , a contradiction to our hypothesis. Hence, we have xn → −∞ as n → ∞. (b) Suppose that α < x1 < γ. We note that α < β < γ, so there are three cases: – Case (i): x1 = β. Then we have x2 = f (x1 ) =

x31 + 1 3β = = β. 3 3

Thus we have xn = β for all positive integers n and we are done in this case. – Case (ii): x1 ∈ (β, γ). We claim that xn ∈ (β, γ) and xn+1 < xn for all positive integers n. We prove the claim by induction. For the first assertion, it follows from 0 < β < x1 that f (β) < f (x1 ) which is equivalent to β < x2 . For the second assertion, since g(β) = g(γ) = 0 and g(x) 6= 0 on (β, γ), we must have g(x1 ) 6= 0. Since 13 + 1 1 g(1) = −1=− , 3 3 the continuity of g guarantees that g(x) < 0 on (β, γ). In particular, we have g(x1 ) < 0 and then f (x1 ) < x1 which is equivalent to x2 < x1 . Thus the statements are true for n = 1. Assume that they are true for n = k for some positive integer k, i.e., xk ∈ (β, γ) and xk+1 < xk . For n = k + 1, it follows from 0 < β < xk that f (β) < f (xk ) so that β < xk+1 . Since xk+1 ∈ (β, γ), we have g(xk+1 ) < 0 so that f (xk+1 ) < xk+1 which is equivalent to xk+2 < xk+1 . Therefore, the statements are true for n = k + 1. Hence the claim follows from induction. Since {xn } is bounded, Theorem 3.14 implies that {xn } converges. Let lim xn = η. By using n→∞

similar argument as obtaining the result (5.54), we have

η = lim xn+1 = lim f (xn ) = f (η). n→∞

n→∞

Therefore, η is also a fixed point of f and η ∈ [β, γ). Hence we have η = β in this case.

– Case (iii): x1 ∈ (α, β). We claim that xn ∈ (α, β) and xn < xn+1 for all positive integers n. Since the proof of this part is very similar to that of Case (ii) above, we omit the details here. Since {xn } is bounded, Theorem 3.14 implies that {xn } converges. Let lim xn = τ . By using n→∞

similar argument as obtaining the result (5.54), we have

τ = lim xn+1 = lim f (xn ) = f (τ ). n→∞

n→∞

Therefore, τ is also a fixed point of f and τ ∈ (α, β]. Hence we have τ = β in this case.


(c) Suppose that γ < x1 . Since g(2) = 1 > 0, the continuity of g implies that g(x) > 0 on (γ, ∞). In particular, we have g(x1 ) > 0. Now by applying similar argument as in part (a), we can show that {xn } is a strictly increasing sequence and it is unbounded. Hence we must have xn → +∞ as n → ∞. This completes the proof of the problem.
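The three behaviours in (a)–(c) can be reproduced by simply running the iteration $x_{n+1} = (x_n^3 + 1)/3$. A minimal sketch; the starting values $-2$, $0$, $1$ and $2$ are my own illustrative choices.

```python
def f(x):
    return (x * x * x + 1) / 3.0

# Fixed points: alpha ~ -1.87939, beta ~ 0.34730, gamma ~ 1.53209.
for x1 in (-2.0, 0.0, 1.0, 2.0):           # one starter in each regime
    xs = [x1]
    for _ in range(25):
        xs.append(f(xs[-1]))
    print("x1 =", x1, ":", [round(v, 5) for v in xs[:5]], "...", xs[-1])

# x1 = -2 runs off towards -infinity, x1 = 0 and x1 = 1 both settle at beta ~ 0.34730
# (increasing and decreasing respectively), and x1 = 2 runs off towards +infinity.
```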



Problem 5.24 Rudin Chapter 5 Exercise 24.

Proof. By Problem 3.16, we see that the convergence of the recursion generated by $f$ is much more rapid than that generated by $g$. Take the same starter $x_1 > \sqrt{\alpha}$ in the recursions of $f$ and $g$. By the inequalities (5.51) to (5.53), the rate of the convergence depends on the bound in the recursion formula. It is easy to check that
$$f(x) - \sqrt{\alpha} = \frac{1}{2}\Big(x + \frac{\alpha}{x}\Big) - \sqrt{\alpha}
= \Big(\frac{x}{2} - \frac{\sqrt{\alpha}}{2}\Big) - \Big(\frac{\sqrt{\alpha}}{2} - \frac{\alpha}{2x}\Big)
= \frac{1}{2}(x - \sqrt{\alpha}) + \frac{\sqrt{\alpha}}{2}\Big(\frac{\sqrt{\alpha}}{x} - 1\Big),$$
so that
$$f(x) - \sqrt{\alpha} = \frac{1}{2}\Big(1 - \frac{\sqrt{\alpha}}{x}\Big)(x - \sqrt{\alpha}) \tag{5.55}$$
and
$$g(x) - \sqrt{\alpha} = \frac{\alpha + x}{1 + x} - \sqrt{\alpha} = \frac{\alpha + x - \sqrt{\alpha} - \sqrt{\alpha}\,x}{1 + x} = \frac{1 - \sqrt{\alpha}}{1 + x}(x - \sqrt{\alpha}). \tag{5.56}$$

Let $\{x_n(f)\}$ and $\{x_n(g)\}$ be the two sequences generated by $f$ and $g$ respectively, where $x_1(f) = x_1(g)$. (Note that $x_n(f) \ne x_n(g)$ in general.) Then it follows from the formulas (5.55) and (5.56) that
$$|x_{n+1}(f) - \sqrt{\alpha}| = \frac{1}{2}\Big|1 - \frac{\sqrt{\alpha}}{x_n(f)}\Big| \cdot |x_n(f) - \sqrt{\alpha}| \tag{5.57}$$
and
$$|x_{n+1}(g) - \sqrt{\alpha}| = \Big|\frac{1 - \sqrt{\alpha}}{1 + x_n(g)}\Big| \cdot |x_n(g) - \sqrt{\alpha}|. \tag{5.58}$$

• Case (i): $\alpha > 1$. By Problem 3.16, Lemma 3.1, Problem 3.17(a) and (b) and the fact that $x_1 > \sqrt{\alpha} > 1$, we consider the functions $f$ and $g$ defined on the interval $[1, x_1]$. Therefore, the expressions (5.57) and (5.58) imply that
$$|x_{n+1}(f) - \sqrt{\alpha}| = \frac{1}{2^n}\,|x_1 - \sqrt{\alpha}| \prod_{k=1}^{n} \Big|1 - \frac{\sqrt{\alpha}}{x_k(f)}\Big| \tag{5.59}$$
and
$$|x_{n+1}(g) - \sqrt{\alpha}| = |x_1 - \sqrt{\alpha}| \prod_{k=1}^{n} \Big|\frac{1 - \sqrt{\alpha}}{1 + x_k(g)}\Big| \tag{5.60}$$
for all positive integers $n$. Since $\{x_n(f)\}$ decreases monotonically (see Problem 3.16(a)), we have from the expression (5.59) that
$$|x_{n+1}(f) - \sqrt{\alpha}| \le \Big(\frac{x_1 - \sqrt{\alpha}}{2x_1}\Big)^n |x_1 - \sqrt{\alpha}|. \tag{5.61}$$


Similarly, since $x_n(g) \in [1, x_1]$, we have from the expression (5.60) that
$$\Big(\frac{\sqrt{\alpha} - 1}{1 + x_1}\Big)^n |x_1 - \sqrt{\alpha}| \le |x_{n+1}(g) - \sqrt{\alpha}| \le \Big(\frac{\sqrt{\alpha} - 1}{2}\Big)^n |x_1 - \sqrt{\alpha}|. \tag{5.62}$$

We compare the magnitudes of the two constants $\frac{x_1 - \sqrt{\alpha}}{2x_1}$ and $\frac{\sqrt{\alpha} - 1}{1 + x_1}$. In fact, if $x_1$ is chosen to be close to $\sqrt{\alpha}$,$^i$ then $x_1 - \sqrt{\alpha}$ is very small so that $\frac{x_1 - \sqrt{\alpha}}{2x_1} < \frac{\sqrt{\alpha} - 1}{1 + x_1} < 1$. Therefore, it follows from the inequalities (5.61) and (5.62) that $|x_{n+1}(f) - \sqrt{\alpha}| < |x_{n+1}(g) - \sqrt{\alpha}|$.

This explains why the rate of convergence of {xn (f )} is much more rapid than that of {xn (g)} if n is large enough. For example, take α = 10, then we have √ √ ( α, α) = (3.162277660168379, 3.162277660168379). Note that

7 89 15761 , x3 (f ) = , x4 (f ) = ,... 2 28 4984 and its zig-zag path is shown in Figure 5.2j , where the green, orange and purple dots are  89 15761   7   7 89  , , , and 5, 2 2 28 28 4984 x1 (f ) = 5,

x2 (f ) =

respectively.

Figure 5.2: The zig-zag path induced by the function f in Case (i).

Similarly, we have x1 (g) = 5,

x2 (g) =

15 , 6

x3 (g) =

75 , 21

x4 (g) =

286 , 96

x5 (g) =

1245 ,... 381

and its zig-zag path is shown in Figure 5.3, where the green, orange, purple and black dots are  285 1245   15   15 75   75 285  , , and , , , 5, 6 6 21 21 96 96 381

109

5.3. Derivatives of higher order and iteration methods

Figure 5.3: The zig-zag path induced by the function g in Case (i).

respectively. In conclusion, √ √ the pattern of the zig-zag path induced by the function f is approaching to the fixed point ( 10, 10) from one-side, but the one induced by the function g goes near the fixed point “alternatively”. • Case (ii): 0 < α < 1. Take α =

1 10

and x1 = 15 . Then we have

√ √ ( α, α) = (0.4472135954999579, 0.4472135954999579).

Therefore, we have x1 (f ) =

1 , 5

x2 (f ) =

7 , 20

x3 (f ) =

89 280

and x4 (f ) =

15761 49840

and its zig-zag path is shown in Figure 5.4, where the green, orange and purple dots are

respectively.

1 7  , , 5 20

 7 89  , 20 280

and

 89 15761  , 280 49840

Figure 5.4: The zig-zag path induced by the function f in Case (ii). i This j The

can be done by an initial estimation. figures are drawn by “desmos”.

Chapter 5. Differentiation

110

Similarly, we acquire x1 (g) =

1 , 5

x2 (g) =

1 , 4

x3 (g) =

7 , 25

x4 (g) =

19 , 64

x5 (g) =

127 ,... 415

and its zig-zag path is shown in Figure 5.3, where the green, orange, purple and black dots are

respectively.

1 1 , , 5 4

1 7  , , 4 25

 7 19  , 25 64

and

 19 127  , 64 415

Figure 5.5: The zig-zag path induced by the function g in Case (ii).

In this case, the patterns of the zig-zag paths induced by both functions $f$ and $g$ approach the fixed point $\big(\sqrt{\tfrac{1}{10}}, \sqrt{\tfrac{1}{10}}\big)$ from one side. This is the main difference between Case (i) and Case (ii).

We complete the proof of the problem.
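The difference in speed between the two recursions is easy to reproduce numerically. The following minimal sketch takes $\alpha = 10$ and $x_1 = 5$, as in Case (i) above, and prints the errors of both sequences.

```python
import math

alpha, x1 = 10.0, 5.0
root = math.sqrt(alpha)

def f(x):                      # the recursion of Problem 3.16
    return (x + alpha / x) / 2.0

def g(x):                      # the recursion of Problem 3.17
    return (alpha + x) / (1.0 + x)

xf = xg = x1
for n in range(1, 9):
    xf, xg = f(xf), g(xg)
    print(n, abs(xf - root), abs(xg - root))

# The error of {x_n(f)} roughly squares at each step (7/2, 89/28, 15761/4984, ... as above),
# while the error of {x_n(g)} only shrinks by a roughly constant factor at each step.
```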



Problem 5.25 Rudin Chapter 5 Exercise 25.

Proof. (a) The given formula can be rewritten as
$$f'(x_n) = \frac{0 - f(x_n)}{x_{n+1} - x_n}. \tag{5.63}$$
Therefore, it means that the slope of the graph of $f$ at $x_n$ (the left-hand side of the formula (5.63)) equals the slope of the straight line passing through $(x_n, f(x_n))$ and $(x_{n+1}, 0)$ (the right-hand side of the formula (5.63)). In other words, $x_{n+1}$ is the point where the tangent line to the graph of $f$ at $(x_n, f(x_n))$ meets the $x$-axis. See the following figure:


Figure 5.6: The geometrical interpretation of Newton’s method.

(b) We prove the result by induction. By the continuity of $f$ and the fact that $\xi$ is the unique point in $(a, b)$ at which $f(\xi) = 0$, we have $f(x_1) > 0$. Otherwise, we would have $f(t) = 0$ for some $t \in [x_1, b)$, contradicting the uniqueness of $\xi$. Since $f'(x) > 0$ for all $x \in [a, b]$, we have
$$\frac{f(x_1)}{f'(x_1)} > 0$$
and so $x_2 < x_1$. This shows that the case $n = 1$ is true. Assume that it is also true for $n = k$ for some positive integer $k$, i.e., $x_{k+1} < x_k$. Note that $x_{k+1} < x_k$ is equivalent to $f(x_k) > 0$. For $n = k + 1$, we have
$$x_{k+2} = x_{k+1} - \frac{f(x_{k+1})}{f'(x_{k+1})}.$$

(c) It is clear that the function $f$ satisfies the conditions of Theorem 5.15 (Taylor's theorem) with $\alpha = x_n$ and $\beta = \xi$, so there is a $t \in (\xi, x_n)$ such that
$$0 = f(\xi) = f(x_n) + f'(x_n)(\xi - x_n) + \frac{f''(t)}{2}(\xi - x_n)^2,$$
that is,
$$\frac{f(x_n)}{f'(x_n)} = (x_n - \xi) - \frac{f''(t)}{2f'(x_n)}(x_n - \xi)^2.$$
Consequently,
$$x_{n+1} - \xi = x_n - \xi - \frac{f(x_n)}{f'(x_n)} = \frac{f''(t)}{2f'(x_n)}(x_n - \xi)^2.$$

(d) Since $f'(x) \ge \delta > 0$ and $0 \le f''(x) \le M$ for all $x \in [a, b]$, it follows from part (c) that
$$0 \le x_{n+1} - \xi \le \frac{M}{2\delta}(x_n - \xi)^2 = A(x_n - \xi)^2,$$
which implies that
$$0 \le x_{n+1} - \xi \le \frac{1}{A}\,[A(x_1 - \xi)]^{2^n} \tag{5.64}$$
as required. When we compare these inequalities (5.64) with Problems 3.16 and 3.18, we see that the recursion formula in Problem 3.18, which is
$$x_{n+1} = \frac{p-1}{p}\,x_n + \frac{\alpha}{p}\,x_n^{-p+1},$$
corresponds to applying Newton's method to the function $f(x) = x^p - \alpha$. Actually, we have $f'(x) = px^{p-1}$ so that
$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)} = x_n - \frac{x_n^p - \alpha}{px_n^{p-1}} = \frac{p-1}{p}\,x_n + \frac{\alpha}{p}\,x_n^{-p+1}. \tag{5.65}$$

Putting $p = 2$, the formula (5.65) reduces to Problem 3.16 as a special case, and the error estimate
$$\epsilon_{n+1} < \beta\Big(\frac{\epsilon_1}{\beta}\Big)^{2^n}$$
there is equivalent to the error estimate (5.64) deduced from Newton's method.

(e) It is clear that $g(\xi) = \xi$ if and only if $f(\xi) = 0$. In other words, the fixed point $\xi$ of $g$ is exactly the zero of $f$. Hence, Newton's method of computing $\xi$ as described in parts (a) to (c) amounts to finding the fixed point of the function $g$. Since
$$g'(x) = \frac{f(x)f''(x)}{[f'(x)]^2},$$
we have
$$|g'(x)| = \frac{|f(x)f''(x)|}{[f'(x)]^2} \le \frac{M}{\delta^2}\,|f(x)|,$$
so that
$$\lim_{x\to\xi}|g'(x)| \le \lim_{x\to\xi}\frac{M}{\delta^2}\,|f(x)| = 0,$$
which implies that $\lim_{x\to\xi} g'(x) = 0$.

(f) It is obvious that $f(x) = 0$ if and only if $x = 0$. Since $f'(x) = \frac{1}{3}x^{-\frac{2}{3}}$, we have $f'(x) \to 0$ as $x \to \infty$. This shows that the function $f$ does not satisfy one of the conditions of Newton's method: $f'(x) \ge \delta > 0$ for some positive constant $\delta$ and for all $x \in (-\infty, \infty)$. Therefore, Newton's method fails in this case. In fact, we have
$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)} = x_n - 3x_n = -2x_n,$$
so that $x_{n+1} = (-2)^n x_1$. Since $x_1 \in (0, \infty)$, the sequence $\{x_n\}$ is divergent. This ends the proof of the problem.
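Both the rapid convergence of parts (a)–(d) and the failure in part (f) can be seen with a few lines of code. The sketch below is a minimal implementation of the Newton iteration; the sample target $f(x) = x^2 - 2$ in the first run is my own illustrative choice, while the second run uses the cube-root function from part (f).

```python
import math

def newton(fn, dfn, x1, steps):
    xs = [x1]
    for _ in range(steps):
        x = xs[-1]
        xs.append(x - fn(x) / dfn(x))   # x_{n+1} = x_n - f(x_n)/f'(x_n)
    return xs

# Well-behaved example (illustrative assumption): f(x) = x^2 - 2, zero at sqrt(2).
print(newton(lambda x: x * x - 2.0, lambda x: 2.0 * x, 2.0, 5))

# Part (f): f(x) = x^(1/3). Newton gives x_{n+1} = -2 x_n, so the iterates
# oscillate in sign and blow up instead of converging.
def cbrt(x):
    return math.copysign(abs(x) ** (1.0 / 3.0), x)

def dcbrt(x):
    return abs(x) ** (-2.0 / 3.0) / 3.0

print(newton(cbrt, dcbrt, 0.1, 8))
```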



Problem 5.26 Rudin Chapter 5 Exercise 26.

Proof. We follow the hint. Fix $x_0 \in [a, b]$ and let
$$M_0(f) = \sup_{x\in[a,x_0]}|f(x)| \quad\text{and}\quad M_0(f') = \sup_{x\in[a,x_0]}|f'(x)|.$$
Since $|f'(x)| \le A|f(x)|$ on $[a, b]$, we always have $M_0(f') \le AM_0(f)$. Now for any $x \in [a, x_0]$, since $f$ is differentiable in $[a, b]$, it is clear that it is continuous on $[a, x]$ and differentiable in $(a, x)$. By Theorem 5.10, we have $|f(x) - f(a)| = |(x - a)f'(t)|$ for some $t \in (a, x)$. By our hypothesis, this implies that
$$|f(x)| \le |f'(t)|(x_0 - a) \le M_0(f')(x_0 - a) \le A(x_0 - a)M_0(f). \tag{5.66}$$


Let $x_0$ be chosen so that $A(x_0 - a) < \frac{1}{2}$. Assume that $M_0(f) \ne 0$. Then $A(x_0 - a) < \frac{1}{2}$ and the inequality (5.66) imply that
$$|f(x)| < \frac{M_0(f)}{2}$$
for all $x \in [a, x_0]$, and this gives the contradiction that $M_0(f) \le \frac{M_0(f)}{2}$. Hence we must have $M_0(f) = 0$ and consequently $f(x) = 0$

on [a, x0 ]. Now we choose x0 , x1 , . . . , xn such that a < x0 < x1 < · · · < xn < b with A(x0 − a)
a x>a 4 f (x) ≡ 0 on [0, a). Hence, the other solutions must be in the form  0, if 0 ≤ x < a; f (x) = 1 2 (x − a) , if a ≤ x < b. 4 As a final remark, since f (x) 6≡ problem.

x2 4 ,

it is impossible that a = 0 and b = ∞.l This finishes the proof of the 

Problem 5.28 Rudin Chapter 5 Exercise 28.

Proof. A solution of the initial-value problem for systems of differential equations y′ = Φ(x, y),

y(a) = c = (c1 , c2 , . . . , ck ) (αi ≤ ci ≤ βi ; i = 1, 2, . . . , k)

(5.68)

is, by definition, a differentiable function f on [a, b] such that f (a) = c and f ′ = Φ(x, f )

(a ≤ x ≤ b).

We claim that the initial-value problem (5.68) has at most one solution if there is a constant A such that |Φ(x, y2 ) − Φ(x, y1 )| ≤ A|y2 − y1 | (5.69) whenever (x, y1 ), (x, y2 ) ∈ R, where R is a (k + 1)-cell. To prove the claim, let y1 and y2 be two solutions of the initial-value problem (5.68). Define the function f (x) = y2 (x) − y1 (x). It is clear that f is differentiable on [a, b], f (a) = y2 (a) − y1 (a) = c − c = 0 and |f ′ (x)| = |y2′ (x) − y1′ (x)| = |Φ(x, y2 ) − Φ(x, y1 )| ≤ A|y2 − y2 | = A|f (x)| on [a, b]. By using the vector-valued form of Problem 5.26, we have f (x) = 0 for all x ∈ [a, b]. Hence, we have y2 (x) = y1 (x) for all x ∈ [a, b] and the uniqueness of solutions of the initial-value problem (5.68) follows immediately.  This completes the proof of the problem. k Such

an a must exist. Otherwise, if f (x) > ǫ > 0 for some ǫ > 0 and for all x > 0, then the continuity of f gives 0 = f (0) = lim f (x) ≥ ǫ, x→0

a contradiction. l We shall remark that Problem 5.27 is actually part of the so-called Picard’s existence theorem and the inequality in the problem is called the Lipschitz condition. See [5, §2.4, §2.8].

115

5.4. Solutions of differential equations

Problem 5.29 Rudin Chapter 5 Exercise 29.

Proof. Let y be a solution of the differential equation with the initial conditions  (k) y + gk (x)y (k−1) + · · · + g2 (x)y ′ + g1 (x)y = f (x), y(a) = c1 , y ′ (a) = c2 , . . . , y (k−1) (a) = ck .

(5.70)

′ Suppose that y1 = y, y2 = y1′ , y3 = y2′ , . . . , yk = yk−1 , yk′ = y (k) . Then the differential equation (5.70) is equivalent to

yk = f (x) − [g1 (x)y + g2 (x)y ′ + · · · + gk (x)y (k−1) ] = f (x) −

k X

gj (x)yj .

j=1

Besides, the initial conditions y (j−1) (a) = cj are equivalent to the conditions yj (a) = cj respectively, where j = 1, 2, . . . , k. Therefore, what we have shown is that the differential equation (5.70) is equivalent to the initial-value problem  k X   ′ gj (x)yj , (j = 1, 2, . . . , k − 1); yj = yj+1 , yk′ = f (x) − (5.71) j=1   yj (a) = cj (j = 1, 2, . . . , k).

We note that the initial-value problem (5.71) is a special case of the system of differential equations in Problem 5.28 with φ1 = y2 ,

φ2 = y3 , . . . , φk−1 = yk

and φk = f (x) −

k X

gj (x)yj .

j=1

Assume that u and v were two solutions of the differential equation (5.70). Then they induce two solutions u = (u1 , u2 , . . . , uk ) = (u, u′ , . . . , u(k−1) ) and v = (v1 , v2 , . . . , vk ) = (v, v ′ , . . . , v (k−1) ) of the initial-value problem (5.71) with Φ(x, y) = (φ1 , φ2 , . . . , φk ) =

y2 , y3 , . . . , yk , f (x) −

k X j=1

!

gj (x)yj .

To complete our problem, we have to show that the solutions u and v satisfy the inequality (5.69) for some positive constant A. Since g1 , g2 , . . . , gk are continuous real functions on [a, b], Theorem 4.16 ensures that maximums of g1 , g2 , . . . , gk can be found in [a, b]. Let Mj = max gj (x) and define x∈[a,b]

M = max(M1 , M2 , . . . , Mk ). Therefore, it follows from the definition (5.71), Theorem 1.35 and Definition 1.36 that |Φ(x, u) − Φ(x, v)| = |u′ − v′ |

= |(u′1 , u′2 , . . . , u′k ) − (v1′ , v2′ , . . . , vk′ )| ! ! k k X X gj (x)vj gj (x)uj − v2 , v3 , . . . , vk , f (x) − = u2 , u3 , . . . , uk , f (x) − j=1 j=1 ! k X gj (x)(vj − uj ) = u2 − v2 , . . . , uk − vk , j=1

Chapter 5. Differentiation

=

(



(



(

116

2

2

(u2 − v2 ) + · · · + (uk − vk ) + 2

2

"

k X j=1

(u2 − v2 ) + · · · + (uk − vk ) + M

2

"

#2 ) 21

gj (x)(vj − uj ) k X j=1

(u2 − v2 )2 + · · · + (uk − vk )2 + M 2 × k

#2 ) 21

(uj − vj )

k X j=1

(uj − vj )2

) 12

i 12 h ≤ (1 + kM 2 ) × (u1 − v1 )2 + (u2 − v2 )2 + · · · + (uk − vk )2 1 2

≤ A|u − v|, 1

where A = (1 + kM 2 ) 2 > 0. Hence, Problem 5.28 implies that u = v, as desired. This completes the  proof of the problem.

CHAPTER

6

The Riemann-Stieltjes Integral

6.1

Problems on Riemann-Stieltjes integrals

Problem 6.1 Rudin Chapter 6 Exercise 1.

Proof. It is clear that f is bounded on [a, b] and is discontinuous only at x0 ∈ [a, b]. Since α is continuous at x0 , it follows from Theorem 6.10 that f ∈ R(α). Let P = {t0 , t1 , . . . , tn } be a partition of [a, b] and Ii = [ti−1 , ti ], where i ∈ {1, . . . , n}. Let, further, x0 ∈ Ii for some i. By Definition 6.2, we have mi = inf f (x) = 0 (i = 1, 2, . . . , n). x∈Ii

Thus we have L(P, f, α) =

n X

mi ∆αi = 0

i=1

and then sup L(P, f, α) = 0 which gives

Z

b

f dα = 0.

a

Since f ∈ R(α), we must have

Z

b

f dα =

a

Z

b

f dα = 0

a

as desired. We complete the proof of the problem.



Problem 6.2 Rudin Chapter 6 Exercise 2.

Proof. Assume that f (x0 ) 6= 0 for some x0 ∈ [a, b]. Since f ≥ 0 on [a, b], we have f (x0 ) > 0 and the continuity of f implies that f (x0 ) >0 f (x) > 2 on (x0 − δ, x0 + δ) ⊂ [a, b] for some δ > 0. Let [c, d] ⊂ (x0 − δ, x0 + δ) ⊂ [a, b], where c < d. By Theorem 6.21 (Second Fundamental Theorem of Calculus), we know that Z

b

0 dx = 0

and

a

Z

a

117

b

1 dx = b − a.

Chapter 6. The Riemann-Stieltjes Integral By Theorem 6.12(b) and (c), we have Z b Z c Z d Z f dx = f dx + f dx + a

a

c

118

b

d

f dx ≥ 0 +

Z

d

f dx + 0 ≥

c

(d − c)f (x0 ) >0 2

which is a contradiction. Hence we have f (x) = 0 for all x ∈ [a, b], completing the proof of the problem.  Problem 6.3 Rudin Chapter 6 Exercise 3.

Proof. Suppose that P = {x0 , x1 , . . . , xn } is a partition of [−1, 1]. By Theorem 6.4, we have for j = 1, 2, 3, U (P ∗ , f, βj ) − L(P ∗ , f, βj ) ≤ U (P, f, βj ) − L(P, f, βj ) if P ∗ is a refinement of P . Thus, without loss of generality, we may some positive integer k, where 0 < k < n. Then we have   β1 (xi ) − β1 (xi−1 ) = 0, 1, ∆(β1 )i = β1 (xi ) − β1 (xi−1 ) =  β1 (xi ) − β1 (xi−1 ) = 0,   β2 (xi ) − β2 (xi−1 ) = 0, 1, ∆(β2 )i = β2 (xi ) − β2 (xi−1 ) =  β2 (xi ) − β2 (xi−1 ) = 0,   β3 (xi ) − β3 (xi−1 ) = 0, 1 , ∆(β3 )i = β3 (xi ) − β3 (xi−1 ) =  2 β3 (xi ) − β3 (xi−1 ) = 0,

assume that 0 ∈ P . Let xk = 0 for if i = 1, 2, . . . , k; if i = k + 1; if i = k + 2, . . . , n,

(6.1)

if i = 1, 2, . . . , k − 1; if i = k; if i = k + 1, . . . , n,

(6.2)

if i = 1, 2, . . . , k − 1; if i = k, k + 1; if i = k + 2, . . . , n.

(6.3)

By the definitions (6.1), (6.2) and (6.3), we have L(P, f, β1 ) = L(P, f, β2 ) =

n X

i=1 n X

mi ∆(β1 )i = mk+1 ,

U (P, f, β1 ) =

n X

Mi ∆(β1 )i = Mk+1 ,

i=1

mi ∆(β2 )i = mk ,

U (P, f, β2 ) =

n X

Mi ∆(β2 )i = Mk ,

i=1

i=1 n X

mk + mk+1 mi ∆(β3 )i = L(P, f, β3 ) = , 2 i=1

U (P, f, β3 ) =

n X

Mi ∆(β3 )i =

i=1

Mk + Mk+1 2

which imply that U (P, f, β1 ) − L(P, f, β1 ) = Mk+1 − mk+1 , U (P, f, β2 ) − L(P, f, β2 ) = Mk − mk ,

(6.4) (6.5)

U (P, f, β3 ) − L(P, f, β3 ) =

(6.6)

(Mk+1 − mk+1 ) + (Mk − mk ) . 2

(a) By Theorem 6.6 and the expression (6.4), f ∈ R(β1 ) if and only if for every ǫ > 0, there exists a partition P such that Mk+1 − mk+1 < ǫ. (6.7) Suppose that such a partition P exists. If we take δ = xk+1 > 0, then for all x with x > 0 and x − 0 < δ, we have |f (x) − f (0)| ≤ Mk+1 − mk+1 < ǫ. (6.8)

By Definition 4.25, this means that f (0+) = f (0).

Conversely, suppose that f (0+) = f (0). Then for every ǫ > 0, there exists a δ > 0 such that |f (x) − f (0)| < ǫ

(6.9)

119

6.1. Problems on Riemann-Stieltjes integrals for all x with 0 < x < δ. Let P = {−1, 0, δ, 1} be a partition of [−1, 1]. Thus the expression (6.4) and inequality (6.9) give U (P, f, β1 ) − L(P, f, β1 ) = sup f (x) − inf f (x) x∈[0,δ]

x∈[0,δ]

≤ sup f (x) − f (0) + inf f (x) − f (0) x∈[0,δ]

x∈[0,δ]

≤ǫ+ǫ = 2ǫ.

Since ǫ is arbitrary, we know from Theorem 6.6 that f ∈ R(β1 ). Hence we have shown that f ∈ R(β1 ) if and only if f (0+) = f (0).

If the inequality (6.7) holds for the partition P , then it follows from Theorem 6.7(c) that Z f (tk+1 ) − f dβ1 < ǫ,

(6.10)

where tk+1 ∈ [0, xk+1 ]. Hence we deduce from the inequalities (6.8) and (6.10) that, for all tk+1 ∈ [0, xk+1 ], Z Z f (0) − f dβ1 ≤ f (0) − f (tk+1 ) + f (tk+1 ) − f dβ1 < ǫ + ǫ = 2ǫ.

Since ǫ is arbitrary, it means that

as desired.

Z

f dβ1 = f (0)

(b) We claim that f ∈ R(β2 ) if and only if f (0−) = f (0) and then Z f dβ2 = f (0). Since the proof of this is almost identical to that in part (a) with applying the expressions (6.2) and (6.5), we omit the details of its proof here. (c) By the analysis in part (a), we see that Mk+1 − mk+1 < ǫ if and only if f (0+) = f (0). Similarly, the analysis in part (b) implies that Mk − mk < ǫ if and only if f (0−) = f (0). By these, we obtain from the expression (6.6) that U (P, f, β3 ) − L(P, f, β3 )
0, there exists a partition P = {x0 , x1 , . . . , xn } such that U (P, f ) − L(P, f ) < ǫ.

(6.11)

By Definition 6.1, we have Mi = 1 and mi = 0 for all i = 1, 2, . . . , n. Therefore, we have U (P, f ) =

n X i=1

Mi ∆xi = xn − x0 = b − a

and L(P, f ) =

n X

mi ∆xi = 0.

i=1

However, their difference is b − a which contradicts the assumption (6.11). Hence we must have f ∈ /R on [a, b] for any a < b, completing the proof of the problem.  Problem 6.5 Rudin Chapter 6 Exercise 5.

Proof. Define the function f (x) =



1, if x ∈ [a, b] and x is rational; −1, if x ∈ [a, b] and x is irrational.

Then it is bounded on [a, b]. Now f 2 (x) = 1 for all x ∈ [a, b] so that f 2 ∈ R on [a, b]. However, using similar argument as in the proof of Problem 6.4, we can show that f 6∈ R on [a, b]. For the second assertion, we know from the hypothesis that there exist constants M and m such that m ≤ f (x) ≤ M for all x ∈ [a, b]. Define φ : [m3 , M 3 ] → R by 1

φ(y) = y 3 . Then the function φ is continuous on [m3 , M 3 ] and φ(f 3 (x)) = f (x) on [a, b]. By Theorem 6.11, we have f ∈ R(α) on [a, b]. This completes the proof of the problem.



Problem 6.6 Rudin Chapter 6 Exercise 6.

Proof. Recall the definition of the Cantor set P : Suppose that E0 , E1 , E2 , . . . are the intervals defined in Sec. 2.44. Then we have ∞ \ P = En . n=1

Table 6.1 below shows the number of intervals, the number of the end-points and the length of each interval for each En . We apply Theorem 6.6 to show that f ∈ R on [0, 1]. In fact, our goal is to construct a suitable partition P = {x0 , x1 , . . . , xn } of [0, 1] such that U (P, f ) − L(P, f ) =

n X i=1

(Mi − mi )∆xi < ǫ

(6.12)

121

6.1. Problems on Riemann-Stieltjes integrals Interval E0 E1 E2 .. .

Number of intervals 20 21 22 .. .

Number of end-points including 0 and 1 1

2 22 23 .. .

1 30 1 31 1 32

En

2n

2n+1

1 3n

Length of each interval

.. .

Table 6.1: The number of intervals & end-points and the length of each interval for each En . for every ǫ > 0. Since f is bounded on [0, 1], there are constants M and m such that m ≤ f (x) ≤ M for all x ∈ [0, 1]. Suppose that n is a large positive integer such that 1+

ǫ log 8(M−m)

log 32

< n.

(6.13)

Now for each En , we let 0 = a1 < b1 < a2 < b2 < · · · < a2n < b2n = 1

be the end-points contained in En .a By the information given in Table 6.1, we know two facts about the end-points: • Fact 1. For k = 1, 2, . . . , 2n and j = 1, 2, . . . , 2n − 1, we have b k − ak =

1 3n

and aj+1 − bj ≤

1 3n−1

(6.14)

• Fact 2. If x ∈ P , then it follows from the definition of P and the footnote a that x ∈ [ak , bk ] for some positive integer k. Furthermore, by the hypotheses, f is continuous at every point x ∈ [0, 1] \ P . However, it does not mean that f is discontinuous at every point of P . In fact, the only thing that we can say is that “f is possibly discontinuous at points of P .” Now we can start to construct our partition based on Fact 1. Let δ > 0 be a number such that δ < 2·31 n . By the observations (6.14), we define a partition Pn (δ) by Pn (δ) = {0 = a1 < b1 + δ < a2 − δ < b2 + δ < · · · < a2n − δ < b2n = 1}.

(6.15)

Since there are totally 2n+1 distinct points in Pn (δ), they make the interval [0, 1] into 2n+1 − 1 intervals.b We consider 2n+1 X−1 (Mi − mi )∆xi . (6.16) U (Pn (δ), f ) − L(Pn (δ), f ) = i=1

Next, we estimate the magnitude of (6.16) by using Fact 2. In fact, we split the summation (6.16) into two parts: One part consists of only intervals on which f is continuous and the other part consisting of intervals that f might contain discontinuities. By the definition (6.15) of Pn (δ) and Fact 2, we know that [a1 , b1 ] ⊂ [a1 , b1 + δ], [a2n , b2n ] ⊂ [a2n − δ, b2n ] and [ak , bk ] ⊂ [ak − δ, bk + δ] a We

note that each [ak , bk ] is an interval in En , where k = 1, 2, . . . , n. should be careful that they are not the intervals in En !

b You

Chapter 6. The Riemann-Stieltjes Integral

122

for k = 2, 3, . . . , 2n − 1. In other words, all possible discontinuities of f must fall into the intervals with odd indices of the summation (6.16). Hence we can rewrite the summation (6.16) as U (Pn (δ), f ) − L(Pn (δ), f ) =

n 2X −1

s=1

|

n

(M2s − m2s )∆x2s + {z

corresponding to the continuity of f n 2X −1

≤ (M − m)

s=1

}

2 X r=1

|

(M2r−1 − m2r−1 )∆x2r−1 {z

corresponding to possibly the discontinuity of f n

2 X ∆x2s + (M − m) ∆x2r−1 .

} (6.17)

r=1

In order to find the estimate of the inequality (6.17), we must find the bounds of the summations in (6.17). To this end, we obtain from the expressions (6.14) that ∆x2s = x2s − x2s−1 = (as+1 − δ) − (bs + δ) = as+1 − bs − 2δ ≤

1 − 2δ, 3n−1

where s = 1, 2, . . . , 2n − 1. Since δ is any number less than 2·31 n , we can further assume that " # 3 1 1 ǫ 0< ≤δ< − . 2 3n 2(2n − 1)(M − m) 2 · 3n

(6.18)

(6.19)

Then we get from the expression (6.18) that n 2X −1

(M − m)

s=1

∆x2s ≤ 3(M − m)

n 2X −1 

s=1

n

2X −1 3ǫ 2δ  3ǫ 1 ≤ − = . n − 1) 3n 3 2(2 2 s=1

(6.20)

Similarly, we note from the expressions (6.14) that ∆x1 = x1 − x0 = b1 − a1 + δ =

1 + δ, 3n

∆x2r−1 = x2r−1 − x2r−2 = (br + δ) − (ar − δ) = br − ar + 2δ = ∆x2n+1 −1 = x2n+1 −1 − x2n+1 −2 = b2n − a2n + δ =

1 + 2δ, 3n

(6.21)

1 + δ, 3n

where r = 2, 3 . . . , 2n − 1. Therefore, it follows from the inequality (6.13), the upper bound of δ and the expressions (6.21) that 2n  2n   2 n−1 X X 1 ǫ ∆x2r−1 < (M − m) < . (M − m) + 2δ ≤ 4(M − m) n 3 3 2 r=1 r=1

(6.22)

Hence it follows from the inequalities (6.20) and (6.22) that U (Pn (δ), f ) − L(Pn (δ), f ) < 2ǫ holds for every n satisfying the inequality (6.13) and every δ satisfying the inequality (6.19). By Theorem  6.6, we have f ∈ R on [0, 1]. This completes the proof of the problem.

6.2

Definitions of improper integrals

Problem 6.7 Rudin Chapter 6 Exercise 7.

Proof. We use a red symbol on the left-hand side of the definition because we want to emphasize that it is a “new” definition of integral.

123

6.2. Definitions of improper integrals

(a) Since f ∈ R on [0, 1] and 0 < c < 1, Theorem 6.12(c) implies that f ∈ R on [0, c] and on [c, 1]. In addition, f ∈ R on [0, 1] also implies that f is bounded on [0, 1] by Definition 6.1. Thus, we let |f (x)| ≤ M on [0, 1] for some M > 0. Therefore, we follow from Theorem 6.12(b) and (c) that Z Z Z 1 1 c f dx = f dx − f dx ≤ M c 0 c 0

so that

Z

Since

Z Z 1 1 lim f dx = lim M c = 0. f dx − c→0 0 c→0 c c>0

1

c>0

f dx is a constant, we have

0

Z

1

f dx = lim

c→0 c>0

0

Z

1

f dx =

Z

1

f dx.

0

c

Hence this “new” definition of integral agrees with the old one. ∞  [ 1 1i 1 (b) Let (0, 1] = , k1 ]. Then we have , . Define f (x) = (−1)k+1 (k + 1) for all x ∈ ( k+1 k+1 k k=1

Z

1 k

1 k+1

1 1  (−1)k+1 = − . f (x) dx = (−1)k+1 (k + 1) k k+1 k

(6.23)

1 It is clear that c ∈ ( k+1 , k1 ] for one and only one positive integer k, so this and expression (6.23) imply that 1 Z 1 Z 1 Z 12 Z k−1 Z k1 f (x) dx = f (x) dx + f (x) dx + · · · + f (x) dx + f (x) dx c

1 2

1 3

k

Z

c

1

f (x) dx −

Z

c

=1− 1 k

f (x) dx = 1 −

(−1) 1 1 + − ··· + + 2 3 k−1 (−1)k 1 1 + − ··· + . 2 3 k−1

Since c → 0 if and only if k → ∞, we have Z

Z

1 k

c

1 k

f (x) dx

c

(6.24)

1 k

c

f (x) dx → 0

as c → 0 and then we deduce from the expression (6.24) and Theorem 3.43 that Z 1 Z 1 1 1 f (x) dx = 1 − + − · · · = ln 2. f (x) dx = lim c→0 c 2 3 0 c>0 1 , k1 ], we follow from However, we have |f (x)| = k + 1 ≥ 0 for any positive integer k. Since c ∈ ( k+1 the expression (6.23) that Z 1 Z 1 Z 1 k k−1 X X1 1 |f (x)| dx ≥ |f (x)| dx ≥ |f (x)| dx = = . 1 1 j j c k+1 k j=1 j=1

By the fact that c → 0 if and only if k → ∞ and Theorem 3.28, we have Z 1 lim |f (x)| dx c→0 c>0

does not exist.

c

Chapter 6. The Riemann-Stieltjes Integral

124

We end our analysis of the problem here.
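The example in (b) is easy to check numerically. The sketch below is a minimal check that uses the piecewise-constant structure of $f$, so the integrals over $(1/K, 1]$ reduce to partial sums.

```python
import math

# f(x) = (-1)^(k+1) (k+1) on (1/(k+1), 1/k], so the integral of f over that piece
# equals (-1)^(k+1)/k, while the integral of |f| over the same piece equals 1/k.
def tail_integrals(K):
    signed = sum((-1) ** (k + 1) / k for k in range(1, K))
    absolute = sum(1.0 / k for k in range(1, K))
    return signed, absolute

for K in (10, 100, 1000, 10000):
    s, a = tail_integrals(K)
    print(K, s, a)

print("log 2 =", math.log(2))
# The signed integrals converge to log 2 as c = 1/K -> 0, while the integrals
# of |f| grow without bound, matching the conclusion of part (b).
```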



Problem 6.8 Rudin Chapter 6 Exercise 8.

Proof. Let k be a positive integer such that 1 ≤ k < n. Since f decreases monotonically on [1, ∞), we have f (k + 1) ≤ f (x) ≤ f (k) (6.25) for all x ∈ [k, k + 1]. Recall that f ∈ R on [1, b] for every b > 1, so we know from Theorem 6.12(c) and the inequalities (6.25) that f ∈ R on [k, k + 1] which certainly imply f (k + 1) ≤

Z

k+1

f (x) dx ≤ f (k).

k

(6.26)

By the inequalities (6.26) and the hypothesis f (x) ≥ 0,c we have ! n−1 n−1 n−1 X X Z k+1 X f (k + 1) ≤ f (x) dx ≤ f (k) k=1

k

k=1

n X

k=2 n X k=2

f (k) ≤ f (k) ≤

Since f (x) ≥ 0 on [1, ∞), the sequences (Z

Z

k=1

n

f (x) dx ≤

1

Z

n

1

f (x) dx ≤

)

n

f (x) dx

1

and

n−1 X

k=1 n X

f (k) f (k).

(6.27)

k=1

(

n X

)

f (k)

k=1

(6.28)

are increasing monotonically. If the series converges, then we deduce from the right-hand inequality in (6.27) that the sequence of integrals in (6.28) is bounded above. By Theorem 3.14, it converges. Conversely, if the sequence of integrals converges, then the left-hand inequality in (6.27) guarantees that the series in (6.28) is also bounded above. By Theorem 3.14 again, it converges. This completes the proof  of the problem. Problem 6.9 Rudin Chapter 6 Exercise 9.

Proof. The result is stated as follows: Integration by Parts for Improper Integrals. Suppose that F and are differentiable functions on Z G ∞ [a, ∞), F ′ = f ∈ R and G′ = g ∈ R on [a, ∞). If lim F (b)G(b) and f (x)G(x) dx exist, then b→∞

Z



a

F (x)g(x) dx

a

also exists and we have Z ∞ a

c The

F (x)g(x) dx = lim F (b)G(b) − F (a)G(a) − b→∞

hypothesis f (x) ≥ 0 is used in the last set of inequalities.

Z

a



f (x)G(x) dx.

(6.29)

125

6.2. Definitions of improper integrals

Proof of Integration by Parts for Improper Integrals. Let a be fixed and a < b. By Theorem 6.22 (Integration by parts), we have Z b Z b F (x)g(x) dx = F (b)G(b) − F (a)G(a) − f (x)G(x) dx. (6.30) a

a

Taking b → ∞ to the right-hand side of the formula (6.30), since both Z ∞ lim F (b)G(b) and f (x)G(x) dx b→∞

a

exist, they guarantee that the right-hand side of the formula (6.30) exists. Therefore, this shows that Z ∞ F (x)g(x) dx a

exists and then the formula (6.29) holds.



Now we return to the proof of the problem. Put a = 0, F (x) = sin x and G(x) = (6.29), we get Z ∞ Z ∞ sin x sin b cos x dx = lim −0− dx 2 b→∞ −(1 + x) 1 + b 1 +x 0 0 Z ∞ Z ∞ sin x cos x dx. dx = 1 + x (1 + x)2 0 0

1 1+x

into the formula

By Theorem 6.21 (Second Fundamental Theorem of Calculus), we have Z b Z b  1 1 1 1  sin x =1− dx ≤ dx = − − − 2 2 (1 + x) (1 + x) 1 + b 1 + 0 1 + b 0 0

so that

Z

In other words,



0

Z b sin x sin x dx = lim dx ≤ 1. (1 + x)2 b→∞ 0 (1 + x)2 Z



0

converges absolutely. To show that the integral

Z

sin x dx (1 + x)2



0

cos x dx 1+x

does not converge absolutely, notice that Z nπ+ π2 n Z (k+1)π X 2 | cos x| | cos x| dx ≥ dx (k−1)π 1 + x 1+x 0 2 k=2 Z (k+1)π n X 2 2 | cos x| dx ≥ (k + 1)π + 2 (k−1)π 2 k=2

≥ ≥

n X

4 (k + 1)π + 2

k=2 n X

4 π

k=2

1 k+2

for every positive integer n. By Theorem 3.28, we know that n X

k=2

1 k+2

diverges and hence this completes the proof of the problem.



Chapter 6. The Riemann-Stieltjes Integral

6.3

126

H¨ older’s inequality

Problem 6.10 Rudin Chapter 6 Exercise 10.

Proof. (a) The claim is certainly true if u = 0 or v = 0. Therefore, we assume that both u and v are positive. By Theorem 8.6(b) and (c), we have (ex )′′ = ex ≥ 0. Then Problem 5.14 implies that ex is convex on R and so p

1

uv = elog(uv) = e p log(u

)+ q1 log(v q )



up vq 1 log up 1 log vq e + e = + . p q p q

(6.31)

By Problem 4.23, we know that the equality of f (λx + (1 − λ)y) ≤ λf (x) + (1 − λ)f (y) holds if and only if x = y. Hence the equality of the inequality (6.31) holds if and only if log(up ) = log(v q ) which is equivalent to up = v q .d (b) Let u = f and v = g. By part (a), we have fg ≤

1 p 1 q f + g . p q

(6.32)

Since f, g ∈ R(α) on [a, b], Theorem 6.13(a) shows that f g ∈ R(α) on [a, b]. Furthermore, there are constants mf , mg , Mf and Mg such that mf ≤ |f (x)| ≤ Mf

and mg ≤ |g(x)| ≤ Mg

on [a, b]. Since the functions φp (x) = xp and φq (x) = xq are obviously continuous on [mf , Mf ] and [mg , Mg ] respectively, Theorem 6.11 guarantees that f p , g q ∈ R(α) on [a, b]. Therefore, it follows from these, the inequality (6.32) and Theorem 6.12(a) that Z

a

b

f g dα ≤

Z

a

b

1

1 1  f p + g p dα = p q p

Z

b

f p dα +

a

1 q

Z

b

g q dα = a

1 1 + = 1. p q

By part (a), we know that the equality holds if and only if f p = g q . (c) Suppose that f and g are complex-valued functions in R(α) on [a, b]. Let F (x) = ( Z

a

|f (x)| b

) p1

|f (x)|p dα

and G(x) = ( Z

Since f, g ∈ R(α), we deduce from Theorem 6.13(b) that |f (x)|, |g(x)| ∈ R(α) on [a, b]. d Note

that the inequality proven in (a) is called Young’s inequality.

a

|g(x)| b

) q1 .

|g(x)|q dα

127

6.3. H¨ older’s inequality Then Theorem 6.12(a) implies that F, G ∈ R(α) on [a, b]. Now it is obvious that F (x) ≥ 0, G(x) ≥ 0 on [a, b] and Z b Z b F p (x) dα = Gq (x) dα = 1. a

a

Hence we obtain from part (b) and then Theorem 6.13(b) that Z b 1≥ F (x)G(x) dα (Z

|f (x)|p dα

a

(Z

a

) p1 ( Z

b

b p

|f (x)| dα

|g(x)|q dα

a

) p1 ( Z

a

a

) q1

b



q

b

|f (x)||g(x)| dα

a

Z b ≥ f (x)g(x) dα a

) q1

b

Z

|g(x)| dα

which is our desired inequality. By part (b), the equality holds if and only if F p = Gq which is equivalent to |g(x)|q |f (x)|p = Z b . Z b p q |f (x)| dα |g(x)| dα a

(d)

a

– Case (i): “Improper” integrals described in Problem 6.7. Suppose that f and g are real functions on (0, 1] and f, g ∈ R on [c, 1] for every c > 0. Then Theorem 6.13(a) implies that f g ∈ R on [c, 1]. By part (c), we have Z (Z ) p1 ( Z ) q1 1 1 1 0≤ f (x)g(x) dα ≤ |f (x)|p dα |g(x)|q dα . c c c

(6.33)

If any integral on the right-hand side of the inequality (6.33) is divergent when c → 0, then H¨ older’s inequality certainly holds in this case. If Z 1 Z 1 Z 1 Z 1 lim |f (x)|p dα = |f (x)|p dα and lim |g(x)|q dα = |g(x)|q dα, c→0 c>0

c

c→0 c>0

0

then we obtain from Theorem 4.2 that Z 1 Z 1 |f (x)|p dα |f (x)|p dα = lim n→∞

cn

and

0

lim

n→∞

0

c

Z

1

cn

|g(x)|q dα =

Z

1

0

|g(x)|q dα

for every sequence {cn } in [c, 1] such that cn 6= c and lim cn = c. By the inequality (6.33), n→∞ we have Z ) q1 (Z ) p1 ( Z 1 1 1 q p |g(x)| dα 0 ≤ lim f (x)g(x) dα ≤ lim |f (x)| dα n→∞ c n→∞ cn cn n (Z ) p1 (Z ) q1 1

= lim

n→∞

=

(Z

0

so that Theorem 4.2 ensures that

cn

1

|f (x)|p dα

1 p

× lim

n→∞

) p1 ( Z

|f (x)| dα

Z 1 f (x)g(x) dα 0

0

1 q

cn

) 1q

|g(x)| dα

|g(x)|q dα 0 there exists a partition P = {x0 , x1 , . . . , xn } such that U (P, f, α) − L(P, f, α) < f Here

we change the independent variable of g from t to x.

ǫ2 . M −m

(6.44)

Chapter 6. The Riemann-Stieltjes Integral

130

By Definition 6.2, we have U (P, f, α) − L(P, f, α) =

n X i=1

(Mi − mi )∆αi .

Hence it follows from the inequalities (6.43) and (6.44) that kf − gk22 ≤ (M − m)[U (P, f, α) − L(P, f, α)] < ǫ2 which implies that kf − gk2 < ǫ. This completes the proof of the problem.

6.4



Problems related to improper integrals

Problem 6.13 Rudin Chapter 6 Exercise 13.

Proof. (a) Put t2 = u. It follows from Theorem 6.21 (Integration by Parts) that f (x) = =

Z

x+1

sin(t2 ) dt

x Z (x+1)2 x2

=−

Z

du sin u √ 2 u

(x+1)2

x2

d(cos u) √ 2 u

cos(x2 ) cos(x + 1)2 − − = 2x 2(x + 1)

Z

(x+1)2

x2

cos u 3

4u 2

du.

Hence, for x > 0, we get from the expression (6.45) and Theorem 6.13(b) that cos(x2 ) cos(x + 1)2 Z (x+1)2 cos u − − |f (x)| = 3 du 2x 2(x + 1) 2 2 4u x Z 2 (x+1) cos u 1 1 + + ≤ 3 du 2x 2(x + 1) x2 4u 2 Z (x+1)2 1 1 cos u ≤ + + 3 du 4u 2 2x 2(x + 1) x2 2 Z (x+1) 1 1 1 + + ≤ 3 du 2x 2(x + 1) 4u 2 x2 1 1 1 1 1 + − − = 2x 2(x + 1) 2 x + 1 x 1 = . x Assume that cos x2 and cos(x + 1)2 attained 1 at the same time. Then we have x2 = kπ

and (x + 1)2 = lπ,

(6.45)

(6.46)

131

6.4. Problems related to improper integrals where l > k > 0. This leads 2x + 1 = (l − k)π and then x =

(l − k)π − 1 . Therefore, we have 2

[(l − k)π − 1]2 4 2 2 4kπ = (l − k) π − 2(l − k)π + 1 x2 =

(l − k)2 π 2 − 2(l + k)π + 1 = 0.

(6.47)

However, the equation (6.47) means that π is an algebraic number which is a contradiction. Hence the inequality (6.46) is strict, i.e., |f (x)| < x1 for x > 0. (b) If we define

Z

cos(x + 1)2 − 2x r(x) = x+1

(x+1)2

x2

then it follows from the expression (6.45) that

cos u 3

4u 2

du,

2xf (x) = cos(x2 ) − cos[(x + 1)2 ] + r(x).

(6.48)

(6.49)

Hence we have from the expression (6.48) that 1 + 2x |r(x)| ≤ x+1

Z

(x+1)2

x2

du 4u

3 2

=

 1 1 2 2 1 −x − < = x+1 x+1 x x+1 x

as required. (c) Let’s rewrite the expression (6.49) as   1 1 sin x + + r(x). 2xf (x) = 2 sin x2 + x + 2 2

(6.50)

By the result of part (b), we know that r(x) → 0 as x → ∞. Thus it follows from the expression (6.50) that −1 ≤ xf (x) ≤ 1 for all x > 0. We claim that lim sup xf (x) = 1 x→∞

and

lim inf xf (x) = −1. x→∞

To prove this claim, we note from the result of part (b) that 2xf (x) has the same magnitude as cos(x2 ) − cos[(x + 1)2 ]

(6.51) √ as x → ∞. Put xn = n 2π into the term (6.51) and then apply the periodicity of cos x to get √ √   cos(x2n ) − cos[(xn + 1)2 ] = cos(2n2 π) − cos 2n2 π + n 8π + 1 = 1 − cos n 8π + 1 . q It is clear that the number α = π2 is irrational. By Lemma 4.6, we know that the set S = {kα − h | k ∈ N, h ∈ Z}

g It

1 , then for every ǫ > 0 there exists an is dense in R. In other words, we consider the number 12 − 2π g integer N , sequences {hm } ⊂ Z and {km } ⊂ N such that 1 ǫ 1  − (6.52) km α − hm − < 2 2π 2π

is obvious that

lim km = ∞.

m→∞

Chapter 6. The Riemann-Stieltjes Integral

132

for all m ≥ N . It is clear that the inequality (6.52) is equivalent to √ km 8π + 1 − (2hm π + π) < ǫ

for all m ≥ N . Therefore, we deduce from this, the periodicity and the continuity of cos x that √ lim cos(km 8π + 1) = lim cos(2hm π + π) = cos π = −1. m→∞

m→∞

In other words, we have lim sup xf (x) = 1. x→∞

The case for lim inf xf (x) = −1 x→∞

is similar, so we omit the details here. (d) Let N be a positive integer. We consider Z

N 2

sin t dt =

0

N −1 Z k+1 X k=0

=

N −1 X

sin t2 dt

k

f (k)

k=0

# cos(k + 1)2 r(k) cos k 2 − + = f (0) + 2k 2k 2k k=1 # " N −1 N −1 cos(k + 1)2 1 X r(k) 1 X cos k 2 . + − = f (0) + 2 k 2 k k N −1 X

"

k=1

Since

(6.53)

k=1

cos(k + 1)2 cos(k + 1)2 cos(k + 1)2 − =− , k+1 k k(k + 1)

we follow from the expression (6.53) that Z

N

N −1 1 X r(k) 1 + sin t dt = f (0) + 2 k 2

cos N 2 cos 1 − N −1

2

0

k=1

!



N −1 1 X cos k 2 . 2 k(k − 1)

(6.54)

k=2

c for a constant c, so we have k −1 N −1 N −1 N X r(k) X r(k) X c . ≤ < k k k2

Recall from part (b) that |r(k)|
0, there is a trigonometric polynomial P such that |P (x) − f (x)|
0, there exists an integer N such that Z π N 1 X 1 ǫ P (x + nα) − P (t) dt < N 2π 3 −π n=1

(8.69)

for all n ≥ N . Therefore, we deduce from the expressions (8.67), (8.68) and the inequality (8.69) with the integer N defined in the inequality (8.69) that for all n ≥ N , we have Z π N 1 X 1 f (x + nα) − f (t) dt N 2π −π n=1 Z π N N 1 X 1 X 1 P (t) dt = [f (x + nα) − P (x + nα)] + P (x + nα) − N N n=1 2π −π n=1 Z π 1 − [f (t) − P (t)] dt 2π −π Z π N N 1 X 1 X 1 [f (x + nα) − P (x + nα)] + P (x + nα) − P (t) dt ≤ N N 2π −π n=1 n=1 Z 1 π + [f (t) − P (t)] dt 2π −π

Chapter 8. Some Special Functions

196

Z π N ǫ 1 X 1 < |f (x + nα) − P (x + nα)| + + |f (t) − P (t)| dt. N n=1 3 2π −π Finally, we apply the inequality (8.66) to the inequality (8.70), so we establish Z π N 1 X ǫ ǫ ǫ 1 f (x + nα) − f (t) dt < + + = ǫ. N 2π 3 3 3 −π n=1 Hence we have the desired result and this completes the proof of the problem. Problem 8.20 Rudin Chapter 8 Exercise 20.

Proof. For examples, if m = 1, 2, 3, 4, then we have  if 1 ≤ x ≤ 2;   (x − 1) log 2,        (3 − x) log 2 + (x − 2) log 3, if 2 ≤ x ≤ 3; f (x) =   (4 − x) log 3 + (x − 3) log 4, if 3 ≤ x ≤ 4;        (5 − x) log 4 + (x − 4) log 5, if 4 ≤ x ≤ 5  (x − 1) log 2, if 1 ≤ x ≤ 2;          3 23    log x + log 2 , if 2 ≤ x ≤ 3;   2 3  =   34 4   x + log log , if 3 ≤ x ≤ 4;   3 43         5    log 5 x + log 4 , if 4 ≤ x ≤ 5 4 4 5

and

g(x) =

Therefore, their graphs are given by

 x − 1, if        x   − 1 + log 2, if    2 x   − 1 + log 3, if   3         x − 1 + log 4, if 4

1 2

≤ x < 32 ;

3 2

≤ x < 52 ;

5 2

≤ x < 72 ;

7 2

≤ x < 92 .

(8.70)



197

8.1. Problems related to special functions

Figure 8.2: The graphs of the two functions f and g. Now for a positive integer n, we have Z

n

f (x) dx =

1

= =

n−1 X

Z

m+1

m=1 m n−1 X Z m+1 m=1 n−1 X

m

f (x) dx [(m + 1 − x) log m + (x − m) log(m + 1)] dx

1 [log m + log(m + 1)] 2 m=1

1 1 log(n − 1)! + log(n!) 2 2 1 = log(n!) − log n. 2

=

(8.71)

Chapter 8. Some Special Functions

198

Furthermore, we have Z

n

g(x) dx =

1

Z

3 2

g(x) dx +

Z

n

g(x) dx +

n− 21

1

n−1 X

m=2

Z

m+ 12

g(x) dx

m− 21

n−1  X Z m+ 12  x 1 1 1 + log n + − 1 + log m dx = − 1 8 8n 2 m m=2 m− 2

=

n−1 X 1 1 1 − + log n + log m 8 8n 2 m=2

1 1 1 − + log n + log(n − 1)! 8 8n 2 1 1 1 = log(n!) − log n + − . 2 8 8n =

(8.72)

Combining the expressions (8.71) and (8.72), we have for x ≥ 1, Z n Z n 1 1 g(x) dx. f (x) dx = log(n!) − log n > − + 2 8 1 1 By Theorem 6.22, we have Z n Z log x dx = n log n − 1 · log 1 − 1

n

x d(log x) = n log n −

1

Z

(8.73)

n 1

dx = n log n − n + 1.

(8.74)

Since f (x) ≤ log x ≤ g(x) if x ≥ 1, we follow from the inequality (8.73) and the expression (8.74) that Z n Z n Z n f (x) dx ≤ log x dx ≤ g(x) dx 1

1

1

1 1 1 log(n!) − log n ≤ n log n − n + 1 < log(n!) − log n + 2 2 8

which deduce the inequalities  7 1 log n + n < 1 < log(n!) − n + 8 2

(8.75)

for n = 2, 3, 4, . . .. Hence, by taking exponential to each part of the inequalities (8.75), we have eventually the desired formula 7 n! e 8 < n n √ < e. (e) n This completes the proof of the problem.



Problem 8.21 Rudin Chapter 8 Exercise 21.

Proof. Since |Dn (t)| is an even function on [−π, π], Theorem 6.12(c) and Theorem 6.19 give "Z # Z π Z Z 0 1 1 π 1 π sin(n + 21 )t Ln = |Dn (t)| dt + |Dn (t)| dt = |Dn (t)| dt = dt. 2π −π π 0 π 0 sin 2t 0

(8.76)

Z

(8.77)

By Theorem 6.19 again with t = 2x, the expression (8.76) implies that 2 Ln = π

0

π 2

sin(2n + 1)x dx. sin x

We know that 0 < sin x ≤ x on (0, π2 ], see Figure 8.3 for a geometric proof of this.

199

8.1. Problems related to special functions

Figure 8.3: A geometric proof of 0 < sin x ≤ x on (0, π2 ]. Thus we can deduce from the formula (8.77) that Ln ≥

2 π

Z

0

π 2

sin(2n + 1)x dx. x

(8.78)

Apply Theorem 6.19 with y = (2n + 1)x to the inequality (8.78), we yield Ln ≥ It is clear that

2 π

Z

0

(2n+1) π 2

Z n−1 Z sin y 2 nπ sin y 2 X (k+1)π sin y dy ≥ dy = dy. y π 0 y π y kπ

(8.79)

k=0

1 1 ≥ y (k + 1)π

on [kπ, (k + 1)π], where k = 0, 1, . . . , n − 1. Therefore, the inequality (8.79) implies that Ln ≥

Z (k+1)π n−1 2 X 1 | sin y| dy. π2 k + 1 kπ

For each k ∈ {0, 1, 2, . . . , n − 1}, we acquire Z

(k+1)π



(8.80)

k=0

| sin x| dx =

Z

π

sin x dx,

0

see Figure 8.4 for this geometric meaning.

Figure 8.4: The graph of y = | sin x|.

(8.81)

Chapter 8. Some Special Functions

200

By substituting the expression (8.81) into the inequality (8.80) and applying Problem 8.9(a), we obtain Z π n−1 n 4 2 X 1 4 X1 ≥ 2 log n. sin x dx = π2 k+1 0 π2 k π

Ln ≥

k=0

k=1

This finishes the analysis of the problem.



Problem 8.22 Rudin Chapter 8 Exercise 22.

Proof. We follow the given hint. Let f (x) = 1 + Since

∞ X α(α − 1) · · · (α − n + 1) n x . n! n=1

(8.82)

α − n α(α − 1) · · · (α − n) n! × x = lim lim sup |x| = |x|, (n + 1)! α(α − 1) · · · (α − n + 1) n→∞ n + 1 n→∞

we follow from Theorem 3.34(a) that the series converges if |x| < 1. By Theorem 8.1, the function defined by the series (8.82) is differentiable in (−1, 1) and f ′ (x) =

∞ X α(α − 1) · · · (α − n + 1) n−1 x . (n − 1)! n=1

Thus we follow from the definition (8.82) and the derivative (8.83) that (1 + x)f ′ (x) = (1 + x) =

∞ X α(α − 1) · · · (α − n + 1) n−1 x (n − 1)! n=1

∞ ∞ X α(α − 1) · · · (α − n + 1) n−1 X α(α − 1) · · · (α − n + 1) n x + x (n − 1)! (n − 1)! n=1 n=1

∞ ∞ X α(α − 1) · · · (α − n + 1) n−1 X α(α − 1) · · · (α − n + 1) n =α+ x + x (n − 1)! (n − 1)! n=2 n=1

=α+ =α+

∞ ∞ X α(α − 1) · · · (α − n) n X α(α − 1) · · · (α − n + 1) n x + x n! (n − 1)! n=1 n=1 ∞  X α(α − 1) · · · (α − n + 1)  α − n + 1 xn (n − 1)! n n=1

=α+α

∞ X α(α − 1) · · · (α − n + 1) n x n! n=1

= αf (x).

By [21, Eqn. (38), p. 180] and Theorem 5.5, we have f ′ (x) d (log f (x)) = dx f (x) d α (log f (x)) = dx 1+x Z x α dt log f (x) − log f (0) = 0 1+t log f (x) = α log(1 + x)

(8.83)

201

8.2. Index of a curve f (x) = (1 + x)α .

(8.84)

Hence, by the definition (8.82) and the expression (8.84), we have the desired result that ∞ X α(α − 1) · · · (α − n + 1) n (1 + x) = 1 + x . n! n=1 α

(8.85)

By replacing x and α by −x and −α respectively in the expression (8.85), we have (1 − x)−α = 1 + =1+

∞ X (−α)(−α − 1) · · · (−α − n + 1) (−x)n n! n=1 ∞ X α(α + 1) · · · (α + n − 1) n x . n! n=1

(8.86)

In particular, we suppose that −1 < x < 1 and α > 0 in the equation (8.86). By Theorem 8.18(a), we know that Γ((n + α − 1) + 1) Γ(n + α) = Γ(α) Γ(α) (α + n − 1)Γ(α + n − 1) = Γ(α) = ···

(α + n − 1) · · · (α + 1)αΓ(α) Γ(α) = (α + n − 1) · · · (α + 1)α.

=

(8.87)

By substituting the identity (8.87) into the expression (8.86), we get (1 − x)−α = 1 +

∞ ∞ X Γ(n + α) n X Γ(n + α) n x = x , n!Γ(α) n!Γ(α) n=0 n=1

where −1 < x < 1 and α > 0. This completes the proof of the problem.

8.2



Index of a curve

Problem 8.23 Rudin Chapter 8 Exercise 23.

Proof. We follow the given hint. Since γ is continuously differentiable and γ(t) 6= 0 on [a, b], the function γ′ γ is well-defined and continuous on [a, b]. Thus ϕ : [a, b] → C to be the function given by ϕ(x) =

γ′ γ

∈ R on [a, b] by Theorem 6.8 and we can define

Z

a

x

γ ′ (t) dt. γ(t)

(8.88)

By Theorem 6.20 (First Fundamental Theorem of Calculus), the function ϕ is differentiable on [a, b] and γ ′ (t) ϕ′ (t) = . (8.89) γ(t)

Chapter 8. Some Special Functions

202

Furthermore, we have ϕ(a) = 0. Let f (t) = γ(t)e−ϕ(t) , where t ∈ [a, b]. It is clear that f is differentiable on [a, b] and we deduce from the expression (8.89) that f ′ (t) = γ ′ (t)e−ϕ(t) + γ(t)[−ϕ′ (t)]e−ϕ(t) = γ ′ (t)e−ϕ(t) − γ ′ (t)e−ϕ(t) = 0 for all t ∈ (a, b). By Theorem 5.11(b), f is a constant on (a, b). Since f is continuous on [a, b], it must be a constant on [a, b]. Since γ(a) = γ(b), we have eϕ(b) = eϕ(a) = e0 = 1.

(8.90)

By [21, Eqn. (52), p. 183], Theorem 8.7(a) and (c), the expression (8.90) implies that ϕ(b) = 2nπi for some integer n. We note that ϕ(b) = 2πiInd (γ), so we have 2πiInd (γ) = 2nπi which is equivalent to Ind (γ) = n for some integer n. This proves the first assertion. For the second assertion, if γ(t) = eint , a = 0 and b = 2π, then we have Z 2π Z 2π 1 n ineint Ind (γ) = dt = dt = n. 2πi 0 eint 2π 0 Now the number Ind (γ) is called the winding number of γ around 0 because it counts the total number of times that the curve γ travels counterclockwise around the origin 0. This number certainly depends on the orientation of the curve, so it is negative if the curve travels around the point clockwise. See Figure 8.5 about the winding number of γ around an arbitrary point p.

Figure 8.5: The winding number of γ around an arbitrary point p. We end the analysis of the problem here.



Problem 8.24 Rudin Chapter 8 Exercise 24.

Proof. We follow the given hint. Let 0 ≤ c < ∞. It is clear that γ + c : [a, b] → C is still a continuously differentiable closed curve. Since γ does not intersect the negative real axis, we must have γ(t) + c 6= 0

203

8.2. Index of a curve

for every t ∈ [a, b]. Thus γ + c satisfies all the hypotheses of Problem 8.23 and it is meaningful to talk about Ind (γ + c). We define f : [0, ∞) → Z by f (c) = Ind (γ + c) =

1 2πi

Z

b

a

γ ′ (t) dt, γ(t) + c

where c ∈ [0, ∞). We claim that f is a continuous function of c. To this end, we check the definition of continuity (Definition 4.5). Given that ǫ > 0 and c ∈ [0, ∞). It is easy to see that # " 1 Z b γ ′ (t) γ ′ (t) |f (x) − f (c)| = − dt 2πi a γ(t) + x γ(t) + c Z b ′ 1 γ ′ (t) γ (t) ≤ − dt 2π a γ(t) + x γ(t) + c Z b 1 γ ′ (t)(c − x) ≤ (8.91) dt. 2π a (γ(t) + x)(γ(t) + c)

Since γ and γ ′ are continuous on [a, b], we know from Theorem 4.16 (Extreme Value Theorem) that there exist real numbers m, m′ , M and M ′ such that 0 < m ≤ |γ(t)| ≤ M

and m′ ≤ |γ ′ (t)| ≤ M ′

(8.92)

for all t ∈ [a, b].i If M ′ = 0, then γ ′ (t) = 0 for all t ∈ [a, b] which imply that γ = A for some constant A on (a, b). Since γ is continuous on [a, b], we have γ = A on [a, b]. However, γ cannot be a closed curve in this case, a contradiction. Thus we must have M ′ > 0. Furthermore, we note that c > 0 and x ≥ 0 imply that m+c>m

and m + x ≥ m,

so we deduce easily from the inequalities (8.91) and (8.92) that 1 |f (x) − f (c)| ≤ 2π If we take δ =

2πm2 M ′ (b−a) ǫ

Z

a

b

M ′ (b − a) M′ |c − x|. |c − x| dt < (m + c)(m + x) 2πm2

(8.93)

> 0, then we obtain from the inequality (8.93) that |f (x) − f (x)| < ǫ

for all x ∈ [0, ∞) with |x − c| < δ. Hence we prove our claim that f is continuous on [0, ∞). Next, the definition of f and the inequalities (8.92) also imply that Z b ′ 1 M ′ (b − a) γ (t) |f (c)| ≤ . dt ≤ 2π a γ(t) + c 2π(m + c)

Thus we have

lim |f (c)| = 0.

c→∞

Since the range of f is Z, we have f (c) = 0 for all c ∈ [0, ∞). In particular, we take c = 0 to get f (0) = Ind (γ) = 0, as required. This completes the proof of the problem. i Note

that it is possible that m′ = 0, but m > 0 because γ(t) 6= 0 on [a, b].



Chapter 8. Some Special Functions

204

Problem 8.25 Rudin Chapter 8 Exercise 25.

Proof. We follow the given hint. Let γ = γγ21 . Since γ1 : [a, b] → C and γ2 : [a, b] → C are continuously differentiable closed curves and γ1 (t)γ2 (t) 6= 0 for every t ∈ [a, b], the function γ : [a, b] → C is also a continuously differentiable closed curve and γ(t) 6= 0 for every t ∈ [a, b]. Now the inequality |γ1 (t) − γ2 (t)| < |γ1 (t)| shows that |1 − γ(t)| < 1 on [a, b], so we have 0 < γ(t) < 2 for all t ∈ [a, b]. Next, by Problem 8.24, we have Ind (γ) = 0. In addition, a direct computation shows that γ′ γ′ γ′ = 2− 1 γ γ2 γ1 which gives Ind (γ) =

1 2πi

Z

a

b

1 γ ′ (t) dt = γ(t) 2πi

Z

a

b

1 γ2′ (t) dt − γ2 (t) 2πi

Z

a

b

γ1′ (t) dt = Ind (γ2 ) − Ind (γ1 ). γ1 (t)

Hence we have Ind (γ1 ) = Ind (γ2 ) as required, completing the proof of the problem.



Problem 8.26 Rudin Chapter 8 Exercise 26.

Proof. For all t ∈ [0, 2π], we obtain from the triangle inequality that δ < |γ(t)| ≤ |γ(t) − P1 (t)| + |P1 (t)| < so that |P1 (t)| >

δ + |P1 (t)| 4

3δ δ > >0 4 2

on [0, 2π]. By this and the triangle inequality, we have |P1 (t) − P2 (t)| = |P1 (t) − γ(t) + γ(t) − P2 (t)| ≤ |P1 (t) − γ(t)| + |P2 (t) − γ(t)|
0 such that |γ(t) − x| > η for all t ∈ [0, 2π] and x ≤ 0. Put κ = min(δ, η). Then both |γ(t)| > κ and |γ(t) − x| > κ are valid on [0, 2π] and on x ≤ 0. If P (t) is a trigonometric polynomial such that κ |P (t) − γ(t)| < , 4 then the triangle inequality implies that |γ(t) − x| ≤ |γ(t) − P (t)| + |P (t) − x| and then it gives κ 3κ = >0 4 4 for all t ∈ [0, 2π]. Therefore, the range of P (t) does not intersect the negative real axis. By Problem 8.24, we have Ind (P ) = 0. By definition, we have |P (t) − x| ≥ |γ(t) − x| − |γ(t) − P (t)| > κ −

Ind (γ) = 0 which extends the result of Problem 8.24 to any closed curve (not necessarily differentiable) in C with domain [0, 2π] and γ(t) 6= 0 for every t ∈ [0, 2π]. • Extension of Problem 8.25. Suppose that γ1 and γ2 are two closed curves in C with domain [0, 2π], and γ1 (t)γ2 (t) 6= 0 on [0, 2π]. Then there exist δ1 > 0 and δ2 > 0 such that |γ1 (t)| > δ1

and |γ2 (t)| > δ2

on [0, 2π]. By Theorem 8.15 (Stone–Weierstrass theorem), there exists trigonometric polynomials P1 and P2 such that δ2 δ1 and |γ2 (t) − P2 (t)| < |γ1 (t) − P1 (t)| < 4 4 on [0, 2π]. Furthermore, we suppose that |γ1 (t) − γ2 (t)| < |γ1 (t)| for every t ∈ [0, 2π]. Thus there exists a δ3 > 0 such that |γ1 (t)| − |γ1 (t) − γ2 (t)| > δ3 for all t ∈ [0, 2π]. Let δ = min(δ1 , δ2 , δ3 ). Then we have

δ δ , |γ2 (t) − P2 (t)| < and |γ1 (t)| − |γ1 (t) − γ2 (t)| > δ 4 4 for all t ∈ [0, 2π]. By these and the triangle inequality, we have |γ1 (t) − P1 (t)|
0 with ǫ < |c| there exists a R > 0 such that |z −n f (z) − c| < ǫ for all |z| ≥ R which is equivalent to |γr (t) − crn eint | = |f (reit ) − crn eint | < ǫ|rn eint | < |crn eint |

(8.94)

for all r ≥ R and 0 ≤ t ≤ 2π. Now we apply Problem 8.25 to the inequality (8.94) and then the definition, we get Ind (γr (t)) = Ind (crn eint ) =

1 2πi

Z



0

cinrn eint dt = n. crn eint

Therefore, we have Ind (γr (t)) = n for all sufficiently large r. (c) Let p, r ∈ [0, ∞). Define d = |p − r| ≥ 0 and I(p, d) = [min(0, p − d), p + d]. Now we want to show that for every ǫ > 0, there exists a δ > 0 such that r ∈ I(p, d) and |p − r| < δ imply thatj |Ind (γp ) − Ind (γr )| < ǫ.

(8.95)

Next, we define the set K(r, d) = {aeit | a ∈ I(r, d), 0 ≤ t ≤ 2π}. Since f : C → C is a continuous function and the set K(p, 0) is compact, Theorem 4.16 (Extreme Value Theorem) ensures that the value m = min |f (peit )| t∈[0,2π]

is finite. Given that ǫ > 0 with m > ǫ. By Theorem 4.19, f is uniformly continuous on K(p, d). Thus there exists a δ > 0 such that for all z1 , z2 ∈ K(p, d) with |z1 − z2 | < δ implies |f (z1 ) − f (z2 )| < ǫ.

(8.96)

In particular, we may assume that z1 = peit and z2 = reit , where p is fixed and r varies. Thus the inequality (8.96) implies that |f (peit ) − f (reit )| < ǫ < m ≤ |f (peit )| j The

introduction of the set I(p, d) makes sure that if r ∈ I(p, d), then r ≥ 0.

(8.97)

207

8.2. Index of a curve for all r ∈ I(p, d) with |p − r| < δ and 0 ≤ t ≤ 2π. By definition, the inequality (8.97) can be rewritten as |γp (t) − γr (t)| < |γp (t)| for all r ∈ I(p, d) with |p − r| < δ and 0 ≤ t ≤ 2π. By Problem 8.25 or Problem 8.26, we have Ind (γp ) = Ind (γr )

(8.98)

for all r ∈ I(p, d) with |p − r| < δ. In other words, the identity (8.98) means that the inequality (8.94) holds, i.e., the function Ind (γp ) is continuous at p ∈ [0, ∞). Recall from Problem 8.23 that Ind (γr ) ∈ Z. Since [0, ∞) is connected by Theorem 2.47, we can see from Theorem 4.22 and part (c) that Ind (γr )([0, ∞))

is also connected. By part (b), we know that Ind (γr ) = n for all sufficiently large r and thus we must have Ind (γr ) = n for every r ∈ [0, ∞) and for some n ∈ Z. In particular, we have Ind (γ0 ) = n > 0 which contradicts the result of (a). Hence we have f (z) = 0 for at least one complex number z. This completes the proof of the problem.  Problem 8.28 Rudin Chapter 8 Exercise 28.

Proof. For 0 ≤ r ≤ 1 and 0 ≤ t ≤ 2π, we put

γr(t) = g(re^{it}) and ψ(t) = e^{−it} γ1(t).   (8.99)

Assume that g(z) ≠ −z for every z ∈ T. Then we have

ψ(t) ≠ −1   (8.100)

for every t ∈ [0, 2π]. Since |g(z)| = 1 for every z ∈ D, we have

|ψ(t)| = |e^{−it} γ1(t)| = |e^{−it} g(e^{it})| = 1 ≠ 0   (8.101)

for every t ∈ [0, 2π], i.e., ψ maps [0, 2π] into the unit circle T. By the two facts (8.100) and (8.101), the range of ψ does not intersect the negative real axis (the only point of T on the negative real axis is −1). Furthermore, we have ψ(0) = γ1(0) = g(1) and ψ(2π) = e^{−2πi} γ1(2π) = g(e^{2πi}) = g(1), so ψ is a closed curve. In conclusion, the curve ψ satisfies the hypotheses of Problem 8.24 or Problem 8.26. Therefore, we must have

Ind(ψ) = 0.   (8.102)

By a similar argument as in Problem 8.27(c), we know that Ind(γr) is a continuous function of r on [0, 1]. Since [0, 1] is connected, Theorem 4.22 implies that Ind(γr)([0, 1]) is also connected. Recall that Ind(γr) ∈ Z, so it must be a fixed integer for all r ∈ [0, 1]. To derive a contradiction from this result, we compute the values of Ind(γ0) and Ind(γ1). Since γ0(t) = g(0) ≠ 0 is a non-zero constant, the definition of the winding number of a closed curve (see Problem 8.23) gives

Ind(γ0) = (1/2πi) ∫_0^{2π} 0/g(0) dt = 0.

To find Ind(γ1), we need a lemma first.

Lemma 8.1
Let α, β : [a, b] → C be closed curves with α(t) ≠ 0 and β(t) ≠ 0 for every t ∈ [a, b]. Let γ : [a, b] → C be defined by the pointwise product γ(t) = α(t)β(t). Then we have Ind(γ) = Ind(α) + Ind(β).


Proof of Lemma 8.1. We suppose that α and β are continuously differentiable. Then it is easy to check that γ is also a continuously differentiable closed curve and γ(t) ≠ 0 for every t ∈ [a, b]. By definition and the fact that γ′ = α′β + β′α, we have

Ind(γ) = (1/2πi) ∫_a^b (γ′/γ) dt = (1/2πi) ∫_a^b (α′/α) dt + (1/2πi) ∫_a^b (β′/β) dt = Ind(α) + Ind(β).
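The same discrete approximation of the winding number used earlier gives a quick numerical illustration of Lemma 8.1 (this check is an addition to the text; the curves α and β below are arbitrary choices).

    import cmath, math

    def winding_number(curve, samples=20000):
        # accumulate the change of argument of curve(t) over [0, 2*pi]
        total, prev = 0.0, curve(0.0)
        for k in range(1, samples + 1):
            cur = curve(2 * math.pi * k / samples)
            total += cmath.phase(cur / prev)
            prev = cur
        return round(total / (2 * math.pi))

    alpha = lambda t: 3 * cmath.exp(2j * t)        # Ind(alpha) = 2
    beta  = lambda t: cmath.exp(-1j * t) + 0.2     # Ind(beta) = -1
    gamma = lambda t: alpha(t) * beta(t)           # pointwise product

    print(winding_number(alpha), winding_number(beta), winding_number(gamma))

The printed values 2, −1 and 1 are consistent with Ind(γ) = Ind(α) + Ind(β).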

Next, suppose that α and β are not differentiable, so that γ may not be differentiable. However, the numbers Ind(α), Ind(β) and Ind(γ) are still well-defined by Problem 8.26. Let

M1 = max_{t∈[0,2π]} |α(t)| and M2 = max_{t∈[0,2π]} |β(t)|.

Since α, β and γ are non-zero on [0, 2π], there exists a small δ > 0 such that

|α(t)| > δ, |β(t)| > δ and |γ(t)| > δ

for every t ∈ [0, 2π].

• Case (i): M1 + M2 ≥ 1. By Theorem 8.15 (Stone–Weierstrass theorem), there are trigonometric polynomials P1 and P2 such that |P1(t) − α(t)|
0} and E = W1 ∪ W2; in other words, E is the union of the (open) first and second quadrants, which is shaped like a horseshoe. Define the real-valued function f : E → R by

f(x, y) = y if (x, y) ∈ W1, and f(x, y) = −y if (x, y) ∈ W2.

Since W1 and W2 are open, E is open. Obviously, E is not convex. Furthermore, we have (D1 f)(x, y) = 0 for every (x, y) ∈ E. However, we have f(1, 1) = 1 and f(−1, 1) = −1, so that f depends on the first coordinate. This completes the proof of the problem.
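A small finite-difference check of this counterexample (an illustrative addition; the step size and sample points are arbitrary) confirms that D1 f vanishes throughout E even though f takes different values at (1, 1) and (−1, 1).

    def f(x, y):
        # f = y on the open first quadrant W1, f = -y on the open second quadrant W2
        if x > 0 and y > 0:
            return y
        if x < 0 and y > 0:
            return -y
        raise ValueError("(x, y) is not in E")

    def D1f(x, y, h=1e-6):
        # central-difference approximation of the first partial derivative
        return (f(x + h, y) - f(x - h, y)) / (2 * h)

    for p in [(1.0, 1.0), (2.5, 0.3), (-1.0, 1.0), (-0.7, 4.0)]:
        print(p, D1f(*p))              # approximately 0 at every sample point
    print(f(1.0, 1.0), f(-1.0, 1.0))   # 1.0 and -1.0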

Problem 9.11 Rudin Chapter 9 Exercise 11.

Proof. By [21, Eqn. (34), p. 217], we have, for x ∈ R^n,

∇(fg)(x) = ∑_{i=1}^{n} (Di(fg))(x) ei.   (9.12)

Since Di(fg)(x) = f(Di g)(x) + g(Di f)(x), the expression (9.12) reduces to

∇(fg)(x) = ∑_{i=1}^{n} [f(x)(Di g)(x) + g(x)(Di f)(x)] ei
         = f(x) ∑_{i=1}^{n} (Di g)(x) ei + g(x) ∑_{i=1}^{n} (Di f)(x) ei
         = f(x)(∇g)(x) + g(x)(∇f)(x).

Thus we have

∇(fg) = f∇g + g∇f.   (9.13)

Suppose that f ≠ 0 in R^n. Since 1 = f · (1/f) and ∇(1) = 0, we put g = 1/f in the identity (9.13) to obtain

0 = f ∇(1/f) + (1/f) ∇f, that is, ∇(1/f) = −f^{−2} ∇f.

Hence, we complete the proof of the problem.
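Both identities can be confirmed symbolically; the short SymPy script below (an illustrative addition, with arbitrary sample functions) checks ∇(fg) = f∇g + g∇f and ∇(1/f) = −f^{−2}∇f in three variables.

    import sympy as sp

    x1, x2, x3 = sp.symbols('x1 x2 x3')
    X = (x1, x2, x3)
    f = x1**2 * sp.exp(x2) + x3 + 5   # arbitrary sample functions
    g = sp.sin(x1) + x2 * x3

    grad = lambda h: sp.Matrix([sp.diff(h, v) for v in X])

    print(sp.simplify(grad(f * g) - (f * grad(g) + g * grad(f))))   # zero vector
    print(sp.simplify(grad(1 / f) + f**(-2) * grad(f)))             # zero vector

Both printed results are the zero vector.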

9.3



Local maxima and minima

Problem 9.12 Rudin Chapter 9 Exercise 12.

Proof. The range K of f is in fact the (ring) torus generated by rotating the circle (x − b)² + z² = a² in the xz-plane about the z-axis; see Figure 9.1 for an example produced by WolframAlpha (https://www.wolframalpha.com/) with a = 1 and b = 2.

Figure 9.1: An example of the range K of f .

(a) By definition, we have

(∇f1)(x) = (D1 f1)(x)e1 + (D2 f1)(x)e2 = (−a sin s cos t, −(b + a cos s) sin t),

where x = (s, t). Thus (∇f1)(x) = 0 if and only if

sin s cos t = 0 and (b + a cos s) sin t = 0.   (9.14)
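As a symbolic cross-check of this computation (an addition to the solution), one can differentiate the parametrization directly with SymPy and verify that the gradient vanishes at the four parameter points identified below.

    import sympy as sp

    s, t, a, b = sp.symbols('s t a b', positive=True)
    f1 = (b + a * sp.cos(s)) * sp.cos(t)

    grad_f1 = (sp.diff(f1, s), sp.diff(f1, t))
    print(grad_f1)   # (-a*sin(s)*cos(t), -(a*cos(s) + b)*sin(t)), the gradient above

    for pt in [(0, 0), (0, sp.pi), (sp.pi, 0), (sp.pi, sp.pi)]:
        print(pt, [sp.simplify(e.subs({s: pt[0], t: pt[1]})) for e in grad_f1])  # both entries 0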


Since b > a > 0, we have (b + a cos s) > 0 for any s, so the second equation in (9.14) implies that t = 0 or π. In both cases, the first equation in (9.14) shows that s = 0 or s = π. Hence there are exactly four points p ∈ K such that (∇f1 )(f −1 (p)) = 0. In fact, they are the images of the four points (0, 0), (0, π), (π, 0) and (π, π) which are f (0, 0) = (a + b, 0, 0),

f (0, π) = (−a − b, 0, 0),

f (π, 0) = (b − a, 0, 0)

and f(π, π) = (−(b − a), 0, 0).

(b) Again, by definition, we have (∇f3)(x) = (D1 f3)(x)e1 + (D2 f3)(x)e2 = (a cos s, 0), where x = (s, t). Thus (∇f3)(x) = 0 if and only if cos s = 0, if and only if s = π/2 or s = 3π/2. Therefore, the required set is given by

f1(π/2, t) = b cos t, f2(π/2, t) = b sin t, f3(π/2, t) = a
or
f1(3π/2, t) = b cos t, f2(3π/2, t) = b sin t, f3(3π/2, t) = −a.   (9.15)

Geometrically, the locus of the left-hand side in (9.15) is the circle center at z = a with radius b and the locus of the right-hand side in (9.15) is the circle center at z = −a with radius b. The graph of the loci can be seen in Figure 9.2.

Figure 9.2: The set of q ∈ K such that (∇f3 )(f −1 (q)) = 0. (c) For any (s, t) ∈ [0, 2π] × [0, 2π], we have −(a + b) ≤ f1 (s, t) ≤ a + b.


Since f1(0, 0) = a + b,

f1 (0, π) = −(a + b),

f1 (π, 0) = b − a

and f1 (π, π) = −(b − a),

the points (0, 0) and (0, π) correspond to the local maximum (a + b, 0, 0) and the local minimum (−(a + b), 0, 0) of f1 respectively. Finally, it is easy to see that any of the remaining two points is neither a local maximum or a local minimum. For f3 (s, t), we know that −a ≤ f3 (s, t) ≤ a

for all (s, t) ∈ [0, 2π] × [0, 2π]. Since

f3(3π/2, t) = −a and f3(π/2, t) = a

for every t ∈ [0, 2π], the points a and −a are obviously the local maximum and the local minimum of f3 respectively.

(d) We have g : R → R³. By definition, we have

g(t) = f(t, λt) = ((b + a cos t) cos λt, (b + a cos t) sin λt, a sin t)   (9.16)

which implies that

g′(t) = (−λ(b + a cos t) sin λt − a sin t cos λt, λ(b + a cos t) cos λt − a sin t sin λt, a cos t).

Therefore, we have

|g′(t)|² = g′(t) · g′(t)
        = [−λ(b + a cos t) sin λt − a sin t cos λt]² + [λ(b + a cos t) cos λt − a sin t sin λt]² + a² cos² t
        = λ²(b + a cos t)² + a² sin² t + a² cos² t
        = a² + λ²(b + a cos t)².

This proves the second assertion of this part.
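This simplification can be checked symbolically; the SymPy sketch below (an addition for verification purposes) expands g′(t) · g′(t) and compares it with a² + λ²(b + a cos t)².

    import sympy as sp

    t, a, b, lam = sp.symbols('t a b lambda', positive=True)
    g = sp.Matrix([(b + a*sp.cos(t)) * sp.cos(lam*t),
                   (b + a*sp.cos(t)) * sp.sin(lam*t),
                   a * sp.sin(t)])
    gp = g.diff(t)
    speed2 = sp.trigsimp(sp.expand(gp.dot(gp)))
    print(speed2)
    print(sp.simplify(speed2 - (a**2 + lam**2 * (b + a*sp.cos(t))**2)))   # should print 0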

For the first assertion, we note from the definition (9.16) that if g(u) = g(v) for u, v ∈ R, then we have sin u = sin v, which means that u = v + 2kπ for some k ∈ Z. Since b + a cos t ≥ b − a > 0 for every t ∈ [0, 2π], we have

(b + a cos u) cos λu = (b + a cos v) cos λv
(b + a cos v) cos λ(v + 2kπ) = (b + a cos v) cos λv
cos λ(v + 2kπ) = cos λv
λ(v + 2kπ) = λv + 2mπ
kλ = m

for some m ∈ Z. If k ≠ 0, then λ = m/k ∈ Q, a contradiction. Thus k = 0 and m = 0, so we have u = v, i.e., g is 1-1.

Finally, we show that g(R) is dense in K and we divide its proof into several steps: – Step 1: Rephrasing the problem. Pick x ∈ K. For every ǫ > 0, there exists a t ∈ R such that |g(t) − x| < ǫ. To this end, suppose that x = ((b + a cos p) cos q, (b + a cos p) sin q, a sin p), where p, q ∈ R. By the definition (9.16), we have |g(t) − x|2 = [(b + a cos t) cos λt − (b + a cos p) cos q]2

+ [(b + a cos t) sin λt − (b + a cos p) sin q]2 + (a sin t − a sin p)2 .

(9.17)


– Step 2: Simplification of the expression (9.17). Now the expression in the right-hand side of (9.17) is too complicated for computation, so we need to simplify it. Since sin t and cos t are periodic functions, if we take t = p + 2nπ for some integer n, then the expression (9.17) reduces to |g(p + 2nπ) − x|2 = [(b + a cos p) cos λ(p + 2nπ) − (b + a cos p) cos q]2

+ [(b + a cos p) sin λ(p + 2nπ) − (b + a cos p) sin q]2 n 2  ≤ (b + a)2 cos λ(p + 2nπ) − 2mπ − cos q   2 o + sin λ(p + 2nπ) − 2mπ − sin q

(9.18)

for some m ∈ Z. (The reason why the term 2mπ is inserted in the inequality (9.18) will be clear very soon in Step 3 and Step 4 below.) – Step 3: Transformation of the problem. The expression in the right-hand side of the inequality (9.18) is quite simple now, but we can do much better. In fact, by [21, Eqn. (49), p. 182] and Definition 5.1, we have | cos t − cos x| ≤ |t − x| and | sin t − sin x| ≤ |t − x| as t → x. Thus we can further reduce the inequality (9.18) to |g(p + 2nπ) − x|2 ≤ 2(b + a)2 |λ(p + 2nπ) − 2mπ − q|2 .

(9.19)

If we can show that for every ǫ > 0, there exist m, n ∈ Z such that ǫ ,c |2nλπ − 2mπ + λp − q| < √ 2(b + a)

(9.20)

then the inequalities (9.19) and (9.20) imply that |g(p + 2nπ) − x| < ǫ

(9.21)

for some integer n. Let’s summarize what we have done so far. We have shown that our original problem |g(t) − x| < ǫ follows immediately from the validity of the inequality (9.20) for a sequence of integers, so Step 4 comes into play. – Step 4: Application of the Kronecker’s Approximation Theorem (Lemma 4.6). To show that (9.20) holds, we need a previous result in Chapter 4. That is the Kronecker’s Approximation Theorem (Lemma 4.6): Let 0 < θ < 1. Given ǫ > 0, there exists n ∈ N such that ǫ , (9.22) |nλ − h − θ| < √ 2 2π(b + a) where a, b are constants with 0 < a < b and h = [nλ] ∈ Z. We notice that the θ in the inequality (9.22) can be taken to be any real number. In particular, we substitute θ = q−λp 2π in the inequality (9.22) to get ǫ |2nλπ − 2hπ + (λp − q)| < √ 2(b + a) which is exactly the inequality (9.20) with m = [nλ]. c This

inequality may not hold without the existence of the term 2mπ. For example, if p = q = √ √ have |2 2nπ + π2 ( 2 − 1)| which cannot be arbitrary small.

π 2

and λ =



2, then we

223

9.3. Local maxima and minima Hence the set g(R) is dense in K.

This completes the proof of the problem.d



Problem 9.13 Rudin Chapter 9 Exercise 13.

Proof. Suppose that f : R → R3 is given by f (t) = (f1 (t), f2 (t), f3 (t)). Now the differentiability of f implies the differentiability of f1 , f2 and f3 by Remark 5.16. Since f (t)·f (t) = 1, we have f1 (t)f1 (t) + f2 (t)f2 (t) + f3 (t)f3 (t) = 1 and thus d [f (t) · f (t)] = 0 dt d [f1 (t)f1 (t) + f2 (t)f2 (t) + f3 (t)f3 (t)] = 0 dt 2f1′ (t)f1 (t) + 2f2′ (t)f2 (t) + 2f3′ (t)f3 (t) = 0 (f1′ (t), f2′ (t), f3′ (t)) · (f1 (t), f2 (t), f3 (t)) = 0 f ′ (t) · f (t) = 0 as required, completing the proof of the problem.



Problem 9.14 Rudin Chapter 9 Exercise 14.

Proof. (a) By simple computation, we have (D1 f )(x, y) =

x2 (x2 + 3y 2 ) (x2 + y 2 )2

and (D2 f )(x, y) =

−2x3 y , (x2 + y 2 )2

where (x, y) 6= (0, 0). Thus we have 0 ≤ |(D1 f )(x, y)| ≤

x2 (3x2 + 3y 2 ) 3x2 = 2 ≤3 2 2 2 (x + y ) x + y2

and the A.M. ≥ G.M. implies that

p x2 (2|x||y|) x2 (2 x2 y 2 ) x2 (x2 + y 2 ) x2 0 ≤ |(D2 f )(x, y)| = 2 = ≤ = = 1. (x + y 2 )2 (x2 + y 2 )2 (x2 + y 2 )2 x2 + y 2

If (x, y) = (0, 0), then we have (D1 f )(0, 0) = lim

t→0

f (0, t) − f (0, 0) f (t, 0) − f (0, 0) = 1 and (D2 f )(0, 0) = lim = 0. t→0 t t

Hence both D1 f and D2 f are bounded in R2 . d This

problem can also be found in [6, Exericses 11.5, pp. 265, 266].

(9.23)

Chapter 9. Functions of Several Variables

224

(b) Let u = u1 e1 + u2 e2 , where u21 + u22 = 1. By [21, Eqn. (39), p. 217], we know that f ((0, 0) + t(u1 , u2 )) − f (0, 0) t f (tu1 , tu2 ) = lim t→0 t t3 u31 = lim 2 2 t→0 t(t u1 + t2 u2 2)

(Du f )(0, 0) = lim

t→0

= u31 . Since u is a unit vector, |u31 | ≤ |u|3 ≤ 1 and the result follows. (c) We have γ : R → R2 with γ(0) = (0, 0) and |γ ′ (0)| > 0. Let γ(t) = (γ1 (t), γ2 (t)). Then we have g(t) = f (γ(t)) =

γ13 (t) . + γ22 (t)

γ12 (t)

(9.24)

Since γ is differentiable in R, γ1 and γ2 are differentiable in R. Thus it is easy to see from the righthand side in the expression (9.24) that g is differentiable at every point t with (γ1 (t), γ2 (t)) 6= (0, 0). Now the remaining case is the differentiability of g at the point a where γ1 (a) = γ2 (a) = 0. In this case, we have g(a) = f (γ(a)) = f (0, 0) = 0 which gives h γ (t) − γ (a) i3 1 1 γ13 (t) g(t) − g(a) t − a , =h = γ1 (t) − γ1 (a) i2 h γ2 (t) − γ2 (a) i2 t−a (t − a)[γ12 (t) + γ22 (t)] + t−a t−a

(9.25)

where t 6= a. Here we must impose an addition assumption that γ1 (a) 6= 0

or γ2 (a) 6= 0.

Thus it follows from the expression (9.25) that g ′ (a) =

[γ1′ (a)]3 . [γ1′ (a)]2 + [γ2′ (a)]2

(9.26)

This proves our first assertion. For the second assertion, suppose that γ is continuously differentiable on R, i.e., γ ′ is continuous on R. Then both γ1′ and γ2′ are continuous on R. By the expression (9.24) again, we know that g ′ is continuous at every point t with (γ1 (t), γ2 (t)) 6= (0, 0). It remains to check the continuity of g ′ at every point a such that γ1 (a) = γ2 (a) = 0. To this end, we establish from the expression (9.24) that g ′ (t) =

γ14 (t)γ1′ (t) + 3γ12 (t)γ22 (t)γ1′ (t) − 2γ13 (t)γ2 (t)γ2′ (t) . [γ12 (t) + γ22 (t)]2

Then we follow from the equation (9.26) that γ14 (t)γ1′ (t) + 3γ12 (t)γ22 (t)γ1′ (t) − 2γ13 (t)γ2 (t)γ2′ (t) t→a [γ12 (t) + γ22 (t)]2 ( h γ (t) − γ (a) i2 h γ (t) − γ (a) i2 h γ (t) − γ (a) i4 2 2 1 1 1 1 γ1′ (t) + 3 γ1′ (t) = lim t→a t−a t−a t−a

lim g ′ (t) = lim

t→a

225

9.3. Local maxima and minima ) h γ (t) − γ (a) i3 h γ (t) − γ (a) i nh γ (t) − γ (a) i2 h γ (t) − γ (a) i2 o−2 1 1 2 2 1 1 2 2 ′ −2 γ2 (t) × + t−a t−a t−a t−a

[γ1′ (a)]5 + 3[γ1′ (a)]3 [γ2′ (a)]2 − 2[γ1′ (a)]3 [γ2′ (a)]2 {[γ1′ (a)]2 + [γ2′ (a)]2 }2 [γ1′ (a)]3 = ′ [γ1 (a)]2 + [γ2′ (a)]2 =

= g ′ (a). Hence g ′ is continuous at a and our desired result follows.e (d) Assume that f was differentiable at (0, 0). Let u = u1 e1 + u2 e2 . By [21, Eqn. (40), p. 218] and the limits (9.23), we acquire (Du f )(0, 0) = (D1 f )(0, 0)u1 + (D2 f )(0, 0)u2 = u1 which contradicts the result of part (b). Hence, we end the analysis of the problem.



Problem 9.15 Rudin Chapter 9 Exercise 15.

Proof. (a) If x = 0 or y = 0, then the inequality clearly holds. Suppose that x 6= 0 and y 6= 0. Apply the A.M. ≥ G.M. to the positive numbers x4 and y 2 , we get the desired result. (b) By direct computation, we have gθ (t) = f (t cos θ, t sin θ) = t2 − 2t3 cos2 θ sin θ − gθ′ (t) = 2t − 6t2 cos2 θ sin θ − gθ′′ (t) = 2 − 12t cos2 θ sin θ −

4t4 cos6 θ sin2 θ , (t2 cos4 θ + sin2 θ)2

16t3 cos6 θ sin4 θ , (t2 cos4 θ + sin2 θ)3 48t2 cos6 θ sin4 θ(sin2 θ − t2 cos4 θ) . (t2 cos4 θ + sin2 θ)4

Therefore, we have gθ (0) = 0,

gθ′ (0) = 0

and gθ′′ (0) = 2.

Hence, for each θ ∈ [0, 2π], the function gθ has a strict local minimum at t = 0. (c) By direct computation, it is clear that f (x, x2 ) = −x4 < 0 = f (0, 0) so that the point (0, 0) is not a local minimum for f . This completes the proof of the problem.



e We remark that the condition |γ ′ (0)| > 0 has not been used in our argument, so the author wonders that this may happen to be a typo and the correct condition should be |γ ′ (t)| > 0 for all t ∈ R. However, the author can’t find a counterexample to show that the condition |γ ′ (t)| > 0 for all t ∈ R is necessary for the first assertion. Thus, can someone find a differentiable curve γ in R2 with γ(0) = 0, |γ ′ (0)| > 0 and γ(1) = 0 (take a = 1 for example), but g is differentiable at 1?

Chapter 9. Functions of Several Variables

9.4

226

The inverse function theorem and the implicit function theorem

Problem 9.16 Rudin Chapter 9 Exercise 16.

Proof. We remark that this exercise was also been discussed in Hardy’s book [10, p. 236]. In this book, Hardy used the function φ(x) = αx + x2 sin x1 instead of the function f (t) Rudin used here. By [21, Example 5.6(b), p. 106] and the fact that 1 lim t sin , t

t→0

we have f (0) = 0

 1 f (t) − f (0) = 1. = lim 1 + 2t sin t→0 t→0 t t

and f ′ (0) = lim

Furthermore, for all t 6= 0, we have

f ′ (t) = 1 + 4t sin

1 1 − 2 cos . t t

(9.27)

1 Therefore, we have |f ′ (t)| ≤ 7 for all t ∈ (−1, 1). Now we put t = 2kπ into the derivative (9.27), where k ∈ N and k → ∞, so  1  4 f′ =1+ sin(2kπ) − 2 cos(2kπ) = −1 (9.28) 2kπ 2kπ and then  1  lim f ′ = −1 6= f ′ (0), k→∞ 2kπ i.e., f ′ is not continuous at 0. Assume that f was one-to-one in (−δ, δ) for some δ > 0. Since f is continuous on (−δ, δ), it follows from Lemma 6.1 that f is monotonic on any [a, b] ⊆ (−δ, δ), where a < b.

• f is monotonically increasing in (−δ, δ). Suppose that t is an arbitrary point in (−δ, δ). Then we have  ≥ 0, if x − t > 0; f (x) − f (t) ≤ 0, if x − t < 0. Both cases imply that

f (x) − f (t) ≥0 x−t for all x, t ∈ (−δ, δ) with x 6= t. Therefore, we must have

(9.29)

f ′ (t) ≥ 0 on (−δ, δ). If the positive integer k is chosen such that contradictory to the result (9.28).

1 2kπ

∈ (0, δ), then the inequality (9.29) is

• f is monotonically decreasing in (−δ, δ). Instead of the inequality (9.29), we have f (x) − f (t) ≤0 x−t

(9.30)

for all x, t ∈ (−δ, δ) with x 6= t. In this case, we have

f ′ (t) ≤ 0 1 ∈ (0, δ), then on (−δ, δ). Now if we take the positive integer k to be large enough so that (2k+1)π we derive from the expression (9.27) that   1 4 f′ sin(2k + 1)π − 2 cos(2k + 1)π = 1 + 2 = 3 (9.31) =1+ (2k + 1)π (2k + 1)π

which gives a contradiction.

227

9.4. The inverse function theorem and the implicit function theorem

Hence f is not one-to-one in any neighborhood of 0 and this completes the proof of the problem.



Problem 9.17 Rudin Chapter 9 Exercise 17.

Proof. (a) Since f12 (x, y) + f22 (x, y) = e2x (cos2 y + sin2 y) = e2x , the range of f is R2 \ {(0, 0)}. (b) Now we have D1 f1 = ex cos y,

D2 f1 = −ex sin y,

so that [f ′ (x, y)] =



D1 f2 = ex sin y

ex cos y ex sin y

−ex sin y ex cos y

where (x, y) ∈ R2 . Hence we have

x e cos y Jf (x, y) = det[f (x, y)] = x e sin y

and D2 f2 = ex cos y 

,

(9.32)

−ex sin y = e2x 6= 0 ex cos y



for every (x, y) ∈ R2 .

By Theorem 9.36, the linear operator f ′ (x, y) is invertible for every (x, y) ∈ R2 and then we deduce from Theorem 9.24 (The Inverse Function Theorem) that there exists a neighborhood of (x, y) such that f is one-to-one. However, we note that f (x, y + 2π) = (ex sin(y + 2π), ex cos(y + 2π)) = (ex sin y, ex cos y) = f (x, y),

so f is not one-to-one on R2 . (c) We have b = f (0, π3 ) = (cos π3 , sin π3 ) = ( 12 ,



3 2 ).

Now we want the formula

g(ex cos y, ex sin y) = (x, y) holds in a neighborhood of a. It is easy to see that if p = ex cos y and q = ex sin y, then we have p q x = log p2 + q 2 and y = tan−1 , p where − π2 < y
0; f ( u, √ 0) = (u, 0), f (0, −u) = (u, 0), if u < 0. Hence we have shown that f (R2 ) = R2 . (b) Since D1 f1 = 2x, we have

D2 f1 = −2y,

D1 f2 = 2y

and D2 f2 = 2x,

(9.38)

2x −2y = 4(x2 + y 2 ). Jf (x, y) = det[f ′ (x, y)] = 2y 2x

Thus Jf (x, y) 6= 0 if and only if (x, y) 6= (0, 0), and then it follows from Theorem 9.24 (Inverse Function Theorem) that every point of R2 \ {(0, 0)} has a neighborhood in which f is one-to-one. However, it is clear that f is not one-to-one because f (1, −1) = (0, 2) = f (−1, 1). (c) Suppose that a = (2, 1). Then we have b = f (2, 1) = (3, 4). Let g be the continuous inverse of f , defined in a neighborhood of b, such that g(b) = a. By the analysis of part (a), we know the explicit formula for g: s√ s√ ! u2 + v 2 + u u2 + v 2 − u . , g(u, v) = 2 2 On the one hand, we note from the partial derivatives (9.38) that   2x −2y [f ′ (x, y)] = 2y 2x

Chapter 9. Functions of Several Variables so that

 s

  [f ′ (g(u, v))] = 2   

and then

[f ′ (g(u, v))]−1

230

√ u2 + v 2 + u 2 s√ u2 + v 2 − u 2

s

√ u2 + v 2 − u − 2 s√ u2 + v 2 + u 2

 s√ u2 + v 2 + u   1 2  s = √ √ 2 u2 + v 2  2  u + v2 − u − 2

On the other hand, we have s   2 1 u √ √ D1 g 1 = +1 , 4 u2 + v 2 + u u2 + v 2 s   2 1 u √ √ D1 g 2 = −1 , 4 u2 + v 2 − u u2 + v 2

1 D2 g 1 = 4 1 D2 g 2 = 4

     

s√ u2 + v 2 − u 2 s√ 2 u + v2 + u 2

s s



  .  

(9.39)

2 v √ , ×√ 2 2 2 u +v +u u + v2 2 v √ . ×√ 2 2 2 u +v −u u + v2

After simplification, these expressions become s√ s√ 1 1 u2 + v 2 + u u2 + v 2 − u , D2 g 1 = √ , D1 g 1 = √ 2 2 2 u2 + v 2 2 u2 + v 2 s√ s√ u2 + v 2 − u u2 + v 2 + u 1 1 D1 g 2 = − √ , D2 g 2 = √ . 2 2 2 u2 + v 2 2 u2 + v 2 Therefore, we have  s√ u2 + v 2 + u   1 2  s [g′ (u, v)] = √ √ 2 u2 + v 2  2  u + v2 − u − 2

s√ u2 + v 2 − u 2 s√ 2 u + v2 + u 2

     

which is exactly the matrix (9.39). Finally, it is easy to see that [f ′ (a)] = [f ′ (2, 1)] =



4 −2 2 4



 q 13 1  q2 and [g′ (b)] = [g′ (3, 4)] = 10 − 72

This completes the proof of the problem.

q

q

7 2

13 2



. 

Problem 9.19 Rudin Chapter 9 Exercise 19.

Proof. The system of equations is equivalent to the mapping f : R3+1 → R3 defined by f (x, y, z, u) = (3x + y − z + u2 , x − y + 2z + u, 2x + 2y − 3z + 2u).

(9.40)

231

9.4. The inverse function theorem and the implicit function theorem

If a = (x, y, z) = (0, 0, 0) and b = u = 0, then we have f (a, b) matrix at (0, 0, 0, 0) is given by  3 1 −1 [f ′ (0, 0, 0, 0)] =  1 −1 2 2 2 −3

The determinants of  3 1  1 −1 2 2

are

the submatrices    0 3 −1 0 1 ,  1 2 1 , 2 2 −3 2 −12,



1 −1  −1 2 2 −3 21,

= (0, 0, 0). Therefore, its corresponding  0 1 . 2

 0 1  2

and

(9.41)



3 1  1 −1 2 2

 −1 2  −3

3 and 0

respectively. Hence, it follows from Theorem 9.28 (Implicit Function Theorem) that the system of equations can be solved for x, y, u in terms of z; for x, z, u in terms of y; for y, z, u in terms of x; but not for  x, y, z in terms of u. This completes the proof of the problem. Problem 9.20 Rudin Chapter 9 Exercise 20.

Proof. Let us restate the implicit function theorem in the case n = m = 1 first. Suppose that E ⊆ R2 is an open set, f : E → R is a C ′ -mapping and (a, b) is a point in R2 such that f (a, b) = 0. Suppose, further, that ∂x f (a, b) 6= 0.f Then there exists an open set U ⊆ R2 and an interval I ⊆ R with (a, b) ∈ U and b ∈ I, having the following property: For every y ∈ I corresponds a unique x ∈ R such that (x, y) ∈ U

and f (x, y) = 0.

If this x is defined to be g(y), where g : I ⊆ R → R, then the function g is C ′ , g(b) = a, f (g(y), y) = 0 for y ∈ I and ∂y f (a, b) . (9.42) g ′ (b) = − ∂x f (a, b) We can interpret the implicit function theorem in two approaches: • Approach 1: By the explanation of Apostol [1, pp. 373, 374], the expression f (x, y) = 0 does not necessarily represent a function. Then one may ask when the relation can be solved explicitly for x in terms of y. The implicit function theorem solves this problem locally. Geometrically, it means that given a point (a, b) such that f (a, b) = 0, if ∂x f (a, b) 6= 0, then there will be an interval I of b such that the relation f (x, y) = 0 is in fact a function in I. In other words, we can find a continuously differentiable function g : I → R implicitly such that f (g(y), y) = 0, i.e., x can be solved explicitly in terms of g in this neighborhood. • Approach 2: Another way to look at the theorem is that the level curve S = {(x, y) ∈ R2 | f (x, y) = 0}

(9.43)

is locally a graph of a function. Here the word “locally” means that for every (a, b) ∈ S, there exist an interval I of b and an open set U ⊆ R2 of (a, b) such that U ∩ S is the graph of a continuously differentiable function x = g(y), i.e., U ∩ S = {(x, y) = (g(y), y) | y ∈ I}.

Chapter 9. Functions of Several Variables

232

Figure 9.3: Geometric meaning of the implicit function theorem.

Furthermore, the slope of the tangent to the curve at the point (a, b) is given by the derivative (9.42). Since ∂x f (a, b) 6= 0, we see that the tangent is not vertical. For instance, if f (x, y) = x2 + y 2 − 1, then the level curve S defined by (9.43) is the unit circle in R2 , see Figure 9.3. Around the point A(x1 , y1 ), x can be expressed in terms of y, i.e., x = g(y) = p 1 − y 2 . Furthermore, the slope of the tangent at A is given by g ′ (x1 ) = −

y1 2y1 =− . 2x1 x1

However, there is no such function around the point B(0, 1) because ∂x f (0, 1) = 0 so that g ′ is not well-defined at B. We complete the analysis of the problem.



Problem 9.21 Rudin Chapter 9 Exercise 21.

Proof. (a) By [21, Eqn. (34), p. 217], we have ∇f (x, y) = (6x2 −6x)e1 +(6y 2 +6y)e2 . Therefore, ∇f (x, y) = 0 if and only if x2 − x = 0 and y 2 + y = 0 if and only if x = 0, 1 and y = 0, −1. Hence the four points are (0, 0), (0, −1), (1, 0) and (1, −1). To find the local extreme of the function f , we need a result from calculus of several variables, see [1, Theorem 13.11, p. 379]. Lemma 9.2 Let f be a real-valued function with continuous second-order partial derivatives at a stationary point (a, b) ∈ R2 . Let A = ∂xx f (a, b), B = ∂xy f (a, b), C = ∂yy f (a, b) and   A B ∆ = det = AC − B 2 . B C (a) If ∆ > 0 and A > 0, then f has a locally minimum at (a, b). (b) If ∆ > 0 and A < 0, then f has a locally maximum at (a, b). (c) If ∆ < 0, then f has a saddle point at (a, b).

f Here

the notation ∂x f means

∂f . ∂x

233

9.4. The inverse function theorem and the implicit function theorem Now we have ∂xx f = 12x − 6, ∂yy f = 12y + 6 and ∂xy f = 0. – At (0, 0): A = −6, B = 0 and C = 6. Since ∆ = −36 < 0, (0, 0) is a saddle point by Lemma 9.2(c). – At (0, −1): A = −6, B = 0 and C = −6. Since ∆ = 36 > 0 and A < 0, (0, −1) is a local maximum by Lemma 9.2(b). – At (1, 0): A = 6, B = 0 and C = 6. Since ∆ = 36 > 0 and A > 0, (0, −1) is a local minimum by Lemma 25(a). – At (1, −1): A = 6, B = 0 and C = −6. Since ∆ = −36 < 0, (1, −1) is a saddle point by Lemma 9.2(c). The behaviours of the four points are shown in Figures 9.4(a) to (d) below.g

(a) The saddle point (0, 0).

(b) The local maximum point (0, −1).

(c) The local minimum point (1, 0).

(d) The saddle point (1, −1).

Figure 9.4: The graphs around the four points. (b) By computation, we have f (x, y) = (x + y)(2x2 − 2xy + 2y 2 − 3x + 3y). By definition, we have S = {(x, y) ∈ R2 | f (x, y) = 0}

= {(x, y) ∈ R2 | (x + y)(2x2 − 2xy + 2y 2 − 3x + 3y) = 0}.

g The graphs in Figures 9.4 and 9.5 are produced by using the free online software “3D Surface Plotter”, see https://academo.org/demos/3d-surface-plotter/.

Chapter 9. Functions of Several Variables

234

Therefore, f (x, y) = 0 if and only if or 2x2 − 2xy + 2y 2 − 3x + 3y = 0.

x+y =0

(9.44)

In other words, points of S are exactly the zeros of one of the equations in (9.44). – Case (i): x as a function of y. For every y, we have from the leftmost equation in (9.44) always gives the solution x = −y. (9.45) Recall the fact that D1 f (x, y) = 6x(x − 1),

so Theorem 9.28 (Implicit Function Theorem) implies that for every x ∈ R \ {0, 1}, x can be expressed as a function of y locally. When x = 0, it follows from the rightmost equation in (9.44) that 3 y = 0 or y = − ; 2 when x = 1, we obtain 1 y = −1 or y = . 2 Therefore, at points  1  3 1, , (9.46) (0, 0), 0, − , (1, −1) and 2 2 x might not possibly be expressed as a function of y. To check whether x can be solved in terms of y around these four points, we rewrite the rightmost equation in (9.44) as 2x2 − (3 + 2y)x + (2y 2 + 3y) = 0 so that x=

(3 + 2y) ±

p (3 + 2y)2 − 8(2y 2 + 3y) . 4

(9.47)

By the forms of solutions (9.45) and (9.47), we have the following table: Expressions of x p −y (3 + 2y) + 3(1 − 2y)(3 + 2y) p 4 (3 + 2y) − 3(1 − 2y)(3 + 2y) 4

y→0 0

y → − 23 3 2

y → −1 1

y → 21 − 21

3 2

0

1

1

0

0

− 12

1

Table 9.1: Expressions of x around four points. Hence we can conclude from Table 9.1 that x cannot be expressed uniquely as a function of y around these four points. For instance, as y → 0, both p (3 + 2y) − 3(1 − 2y)(3 + 2y) x = −y and x = 4 tend to 0. This means that x has two different expressions around the point (0, 0). – Case (ii): y as a function of x. Similarly, for every x, we have y = −x and for − 21 ≤ x ≤ 32 , we have y=

(2x − 3) ±

p 3(3 − 2x)(1 + 2x) . 4

(9.48)

(9.49)

235

9.4. The inverse function theorem and the implicit function theorem Since D2 f (x, y) = 6y(y + 1), we know from Theorem 9.8 (Implicit Function Theorem) that for every y ∈ R \ {−1, 0}, y can be expressed as a function of x locally and the only points of uncertainly are  1  3  − , −1 , (1, −1), (0, 0) and ,0 . (9.50) 2 2 By the forms of solutions (9.48) and (9.49), we have the following table: Expressions of y p −x (2x − 3) + 3(3 − 2x)(1 + 2x) p 4 (2x − 3) − 3(3 − 2x)(1 + 2x) 4

x → − 12 1 2

x→1 −1

x→0 −1

x → 23 − 23

−1

1 2

−1

−1

−1

−1

− 32

−1

Table 9.2: Expressions of y around four points. Hence, by similar analysis as in Case (i), we can conclude from Table 9.2 that y cannot be expressed uniquely as a function of x around the four points. Finally, we see easily from the sets of points (9.46) and (9.50) that there is no neighborhoods around the points (0, 0) and (1, −1) such that f (x, y) = 0 cannot be solved for y in terms of x or for x in terms of y.  This completes the proof of the problem. Problem 9.22 Rudin Chapter 9 Exercise 22.

Proof. We follow the flow of the proof of Problem 9.21. (a) By [21, Eqn. (34), p.217], we have ∇f (x, y) = 6(x2 + y 2 − x)e1 + 6y(2x + 1)e2 . Therefore, ∇f (x, y) = 0 if and only if x2 + y 2 − x = 0

and y(2x + 1) = 0

if and only ifh x2 − x = 0

and y = 0

if and only if x = 0, 1 and y = 0. Hence ∇f (x, y) = 0 only at the two points (0, 0) and (0, 1). Since ∂xx f = 12x − 6,

∂yy f = 12x + 6 and ∂xy f = 12y.

– At (0, 0): A = −6, B = 0 and C = 6. Since ∆ = −36 < 0, (0, 0) is a saddle point by Lemma 9.2(c). – At (1, 0): A = 6, B = 0 and C = 18. Since ∆ = 108 > 0 and A > 0, (1, 0) is a local minimum by Lemma 9.2(a). The behaviours of the four points are shown in Figures 9.5(a) and (b) below. h It

is impossible that x = − 12 because y 2 +

3 4

= 0 gives non-real y.

Chapter 9. Functions of Several Variables

236

(b) The local minimum point (1, 0).

(a) The saddle point (0, 0).

Figure 9.5: The graphs around (0, 0) and (1, 0). (b) By definition, we have S = {(x, y) ∈ R2 | f (x, y) = 2x3 − 3x2 + 6y 2 x + 3y 2 = 0}. – Case (i): x as a function of y. We note from Theorem 9.28 (Implicit Function Theorem) that if D1 f (x, y) 6= 0, then x in terms of y. Therefore, points of S that have no neighborhoods in which the equation f (x, y) = 0 cannot be solved for x in terms of y must satisfy D1 f (x, y) = 0.

(9.51)

By the result of part (a), we know that D1 f (x, y) = 0 is equivalent to y 2 = x − x2 . Put this into the equation f (x, y) = 0 to reduce it to 3x − 4x3 = 0 which gives x = 0 or If x = 0, then y = 0. If x =



√ 3 2 ,

x=±

√ 3 . 2

then we have p √ 2 3−3 y=± . 2 √

Now the point x = − 23 is rejected because y 2 = −2 43−3 < 0. In conclusion, x might possibly be expressed as a function of y around the points p √  √3 2 3 − 3 . (9.52) ,± (0, 0) and 2 2 It is well-known that the discriminant ∆ of the cubic equation 2x3 − 3x2 + 6y 2 x + 3y 2 = 0

(9.53)

is given byi h  i 3 2 ∆ = −108y 2(16y 4 + 24y 2 − 3) = −108y 2 16 y 2 + − 12 < 0, 4

so the equation (9.53) has one real root and a pair of complex conjugate roots for every real y. Hence x cannot be expressed uniquely as a function of y at the points (9.52). i See

https://en.wikipedia.org/wiki/Cubic_function.

237

9.5. The rank of a linear transformation – Case (ii): y as a function of x. Similarly, we consider D2 f (x, y) = 0 which is equivalent to 6y(2x + 1) = 0. If x = − 12 , then we have  1  f − , y = −1 6= 0, 2

so (− 12 , y) 6∈ S for every real y. If y = 0, then f (x, 0) = 0 if and only if x = 0 or x = 23 . As a result, y might possibly be expressed as a function of x around the points (0, 0) and For fixed x, we have y2 =

3

2

 ,0 .

(9.54)

x2 (3 − 2x) 3(2x + 1)

so that − 12 < x ≤ 32 for real solutions y. Hence, this shows that at the points (9.54), y cannot be solved in terms of x. We complete the analysis of the problem.



Problem 9.23 Rudin Chapter 9 Exercise 23.

Proof. It is obvious that f (0, 1, −1) = 0. By definition, we have D1 f = 2xy1 + ex ,

D2 f = x2

and D3 f = 1

and then D1 f (0, 1, −1) = 1 6= 0,

D2 f (0, 1, −1) = 0 and D3 f (0, 1, −1) = 1.

(9.55)

Therefore, we deduce from Theorem 9.28 (Implicit Function Theorem) that there exists a differentiable function g in a neighborhood of (1, −1) in R2 such that g(1, −1) = 0 and f (g(y1 , y2 ), y1 , y2 ) = 0. To find (D1 g)(1, −1) and (D2 g)(1, −1), we derive from the note following [21, Eqn. (65), p.226] that (D1 f )(0, 1, −1)(D1g)(1, −1) = −(D2 f )(0, 1, −1) and (D1 f )(0, 1, −1)(D2 g)(1, −1) = −(D3 f )(0, 1, −1). Hence we obtain from the values (9.55) that (D1 g)(1, −1) = 0 and (D2 g)(1, −1) = −1. This completes the proof of the problem.

9.5

The rank of a linear transformation

Problem 9.24 Rudin Chapter 9 Exercise 24.



Chapter 9. Functions of Several Variables

238

Proof. We have the mapping f : R2 \ {(0, 0)} → R2 . Thus for every (x, y) 6= (0, 0), since D1 f1 =

4xy 2 , (x2 + y 2 )2

we have

D2 f1 =

−4x2 y , (x2 + y 2 )2

D1 f2 =



Now it is easy to see that

(x2

and D2 f2 =

x(x2 − y 2 ) , (x2 + y 2 )2

 −4x2 y (x2 + y 2 )2  . x(x2 − y 2 )  (x2 + y 2 )2

4xy 2  2 (x + y 2 )2 [f ′ (x, y)] =   y(y 2 − x2 ) (x2 + y 2 )2

det[f ′ (x, y)] =

y(y 2 − x2 ) (x2 + y 2 )2

(9.56)

1 [(4xy 2 )x(x2 − y 2 ) + (4x2 y)y(y 2 − x2 )] = 0 + y 2 )4

for any (x, y) 6= (0, 0). By Theorem 9.36, the matrix (9.56) is not invertible. Then we deduce from [15, Theorem 8, p. 112] that the column vectors of [f ′ (x, y)] are dependent. Since R([f ′ (x, y)]) is a vector space in R2 , we have rank ([f ′ (x, y)]) = dim R([f ′ (x, y)]) ≤ 2. Since R([f ′ (x, y)]) is spanned by the column vectors of [f ′ (x, y)] (see [21, p. 210]), we follow from Theorem 9.3(a) that rank ([f ′ (x, y)]) = 2 is impossible. Therefore, we have either rank ([f ′ (x, y)]) = 0 and rank ([f ′ (x, y)]) = 1. Suppose that rank ([f ′ (x, y)]) = 0. Then Definition 9.30 says that [f ′ (x, y)](u, v) = (0, 0) for all (u, v) ∈ R2 \ {(0, 0)}. If x implies that  4xy 2  (x2 + y 2 )2 [f ′ (x, y)](1, −1) =   y(y 2 − x2 ) (x2 + y 2 )2

6= −y, then we choose u = 1 and v = −1 so that the matrix (9.56)  −4x2 y       1 4xy(x + y) 0 1 (x2 + y 2 )2   6= . = 2 x(x2 − y 2 )  −1 (x − y)(x2 − y 2 ) 0 (x + y 2 )2 (x2 + y 2 )2

If x = −y, then we choose u = v = 1 so that the matrix (9.56) gives 1 [f (x, −x)](1, −1) = 4 4x ′



4x3 0

4x3 0



1 1



=

 2 !  0 6= . x 0 0

Hence it is impossible to have rank ([f ′ (x, y)]) = 0 and so rank ([f ′ (x, y)]) = 1. This answers the first assertion. For the second assertion, let X = f1 (x, y) and Y = f2 (x, y). Since X 2 + 4Y 2 = f12 (x, y) + 4f22 (x, y) =

x4 − 2x2 y 2 + y 4 4x2 y 2 x4 + 2x2 y 2 + y 4 + = = 1, (x2 + y 2 )2 (x2 + y 2 )2 (x2 + y 2 )2

the range of f is a subset of the ellipse with radii 1 and 9.6.

1 2

on the X and Y axes respectively, see Figure

239

9.5. The rank of a linear transformation

Figure 9.6: The graph of the ellipse X 2 + 4Y 2 = 1.

Now we show that the range of f is exactly the ellipse X 2 + 4Y 2 = 1. Suppose that (X, Y ) is a point on the ellipse such that x2 − y 2 xy X= 2 and Y = 2 (9.57) 2 x +y x + y2 for some x and y with (x, y) 6= (0, 0). Fix x = 1. Then we have X = y=±

r

1−y 2 1+y 2

which implies that

1−X , 1+X

(9.58)

where X 6= −1. By the expressions (9.58) and the fact that X 2 + 4Y 2 = 1, when X 6= −1, we have 

f 1, ±

r

r 1−X  1 − X  , f2 1, ± 1+X 1+X q 1−X ! 1−X ± 1 − 1+X 1+X = , 1−X 1 + 1+X 1 + 1−X 1+X   1p = X, ± 1 − X2 2 = (X, Y ).

1−X   = f1 1, ± 1+X

r

If X = −1, then we deduce from the definition (9.57) that x = 0 and thus Y = 0. In this case, we have f (0, 2) = (f1 (0, 2), f2 (0, 2)) =

 02 − 22 02 + 2

, 2

0·2  = (−1, 0). 02 + 22

Hence the range of f is exactly the graph of the ellipse X 2 + 4Y 2 = 1. This shows the second assertion  and thus completes the proof of the problem. Problem 9.25 Rudin Chapter 9 Exercise 25.

Proof.

Chapter 9. Functions of Several Variables

240

(a) Recall the definitions from the proof of Theorem 9.32 that Y1 = R(A), {y1 , . . . , yr } is a basis of Y1 , zi ∈ Rn is defined by Azi = yi (9.59) for 1 ≤ i ≤ r, and a linear mapping S : Y1 → Rn is given by S(c1 y1 + · · · + cr yr ) = c1 z1 + · · · + cr zr

(9.60)

for all scalars c1 , . . . , cr . We note from Theorem 9.3(c) that Rn has a basis containing {y1 , . . . , yr }. Let such a basis be {y1 , . . . , yr , xr+1 , . . . , xn }. Then we have x = c1 y1 + · · · + cr yr + cr+1 xr+1 + · · · + cn xn for some scalars c1 , . . . , cn . By Definition 9.30, we have Ax ∈ Y1 = R(A) so that Ax = c1 y1 + · · · + cr yr .

(9.61)

By using the expressions (9.59) and (9.61), we have ASAx = AS(c1 y1 + · · · + cr yr ) = A(c1 z1 + · · · + cr zr ) = c1 y1 + · · · + cr yr = Ax.

(9.62)

Thus it deduces from the expression (9.62) that SASAx = SAx for every x ∈ Rn . Hence SA is a projection in Rn .

For the second assertion, since SA is a projection in Rn , we follow from the property [21, p. 228] that every x ∈ Rn has a unique representation of the form x = x1 + x2 ,

(9.63)

where x1 ∈ R(SA) and x2 ∈ N (SA).

To finish the proof of this part, we have to prove two steps: – Step 1: R(SA) = R(S). On the one hand, if z ∈ R(S), then we have z = Sy for some y ∈ Y1 = R(A). Since y = Aw for some x ∈ Rn , we have z = Sy = SAx. Thus z ∈ R(SA), i.e., R(S) ⊆ R(SA). On the other hand, if z ∈ R(SA), then z = SAy for some y ∈ Rn . By definition, Ay ∈ Y1 so that z ∈ R(S), i.e., R(SA) ⊆ R(S) and then R(SA) = R(S). – Step 2: N (SA) = N (A). On the one hand, if x ∈ N (A), then Ax = 0 which shows clearly that SAx = S0 = 0. In other words, x ∈ N (SA), i.e., N (A) ⊆ N (SA). On the other hand, if x ∈ N (SA), then we have SAx = 0 and it follows from the expression (9.62) that Ax = ASAx = A0 = 0. Thus, x ∈ N (A), i.e., N (SA) ⊆ N (A) and then N (SA) = N (A). Hence the vectors x1 and x2 in the representation (9.63) are elements of R(S) and N (A) respectively.

241

9.6. Derivatives of higher order

(b) We divide the proof into two steps: – Step 1: dim R(S) = dim R(A). Let y, y′ ∈ Y1 be such that S(y) = S(y′ ). Then we have S(c1 y1 + · · · + cr yr ) = S(c′1 y1 + · · · + c′r yr )

(9.64)

for some scalars c1 , . . . , cr , c′1 , . . . , c′r . By definition, we have from the expression (9.64) that c1 z1 + · · · + cr zr = c′1 z1 + · · · + c′r zr .

(9.65)

Apply A to both sides of the expression (9.65), we have c1 y1 + · · · + cr yr = c′1 y1 + · · · + c′r yr . Since {y1 , . . . , yr } is a basis of Y1 , we have c1 = c′1 , . . . , cr = c′r so that S is an one-to-one linear mapping. Therefore, the mapping S : Y1 → S(Y1 ) is actually an isomorphism j and we have dim R(S) = dim S(Y1 ) = dim Y1 = dim R(A) = r.

(9.66)

– Step 2: dim N (A) = n − r. We know from part (a) that {y1 , . . . , yr , xr+1 , . . . , xn } is a basis of Rn . For every x ∈ Rn , we have x = c1 y1 + · · · + cr yr + cr+1 xr+1 + · · · + cn xn for some scalars c1 , . . . , cr . In particular, if x ∈ N (A), then we have Ax = 0 and by the expression (9.61), we have c1 y1 + · · · + cr yr = Ax = 0. Since {y1 , . . . , yr } is a basis of Y1 , it gives c1 = c2 = · · · = cr = 0. Thus every x ∈ N (A) is a linear combination of xr+1 , . . . , xn , i.e., N (A) is spanned by {xr+1 , . . . , xn }. Next, it is clear that the set {xr+1 , . . . , xn } is linearly independent, so it is a basis of N (A) and then dim N (A) = n − r, (9.67) as desired. Combining the numbers (9.66) and (9.67), we have dim N (A) + dim R(A) = n − r + r = n. This completes the proof of the problem.

9.6



Derivatives of higher order

Problem 9.26 Rudin Chapter 9 Exercise 26.

j A linear mapping f : V → W between two vector spaces is called an isomorphism if it is one-to-one and onto. See, for example, [15, p. 155].

Chapter 9. Functions of Several Variables

242

Proof. By Theorem 7.18, there exists a real continuous function on R which is nowhere differentiable. Let this function be g. Define f : R2 → R by f (x, y) = g(x). Then D1 f (x, y) does not exist for every x(, y) ∈ R2 , but D12 f (x, y) = 0. We have completed the proof of the problem.  Problem 9.27 Rudin Chapter 9 Exercise 27.

Proof. We have f (x, y) =

(a) If (x, y) 6= (0, 0), then we have D1 f (x, y) =

 0,   

if (x, y) = (0, 0);

xy(x2 − y 2 )   , if (x, y) 6= (0, 0).  x2 + y 2

y(x4 + 4x2 y 2 − y 4 ) (x2 + y 2 )2

and D2 f (x, y) =

x(x4 − 4x2 y 2 − y 4 ) . (x2 + y 2 )2

(9.68)

(9.69)

Thus it is obvious that f, D1 f and D2 f are continuous at every point (x, y) 6= (0, 0).

Next we have to check their continuity at the point (0, 0). By the A.M. ≥ G.M., we have xy(x2 − y 2 ) |x2 − y 2 | . |f (x, y) − f (0, 0)| = ≤ x2 + y 2 2

We observe that

|x2 − y 2 | →0 2 as (x, y) → (0, 0), so this means that f is continuous at (0, 0). By the definition (9.68) and the A.M. ≥ G.M., we have D1 f (0, 0) = lim

t→0

f (t, 0) − f (0, 0) f (0, t) − f (0, 0) = 0 and D2 f (0, 0) = lim = 0. t→0 t t

Therefore, we have from the expressions (9.69) that

and

(x2 + y 2 )2 − 2y 4 y(x4 + 4x2 y 2 − y 4 ) |D1 f (x, y) − D1 f (0, 0)| = ≤ |y| ≤ |y| → 0 2 2 2 (x + y ) (x2 + y 2 )2

(x2 + y 2 )2 − 2x4 x(x4 − 4x2 y 2 − y 4 ) ≤ |x| |D2 f (x, y) − D2 f (0, 0)| = ≤ |x| → 0 (x2 + y 2 )2 (x2 + y 2 )2

as (x, y) → (0, 0). Thus D1 f and D2 f are continuous at (0, 0). (b) We have

and

 if (x, y) = (0, 0);  0, y(x4 + 4x2 y 2 − y 4 ) D1 f (x, y) = , if (x, y) 6= (0, 0).  (x2 + y 2 )2

 if (x, y) = (0, 0);  0, x(x4 − 4x2 y 2 − y 4 ) D2 f (x, y) = , if (x, y) 6= (0, 0).  (x2 + y 2 )2

(9.70)

(9.71)

243

9.6. Derivatives of higher order For (x, y) 6= (0, 0), we see that D12 f (x, y) = D21 f (x, y) =

(x2 − y 2 )(x4 + 10x2 y 2 + y 4 ) x6 + 9x4 y 2 − 9x2 y 4 − y 6 = . (x2 + y 2 )3 (x2 + y 2 )3

(9.72)

For (x, y) = (0, 0), we have D2 f (t, 0) − D2 f (0, 0) t−0 = lim = 1, t→0 t t D1 f (0, t) − D1 f (0, 0) −t − 0 D21 f (0, 0) = lim = lim = −1. t→0 t→0 t t

D12 f (0, 0) = lim

t→0

(9.73) (9.74)

Therefore, both D12 f and D21 f exist at every point of R2 . In addition, we deduce from the expressions (9.72) that they are continuous at every point except possibly the origin (0, 0). However, if x = y = t, then we have D12 f (t, t) = D21 f (t, t) = 0 so that |D12 f (t, t) − D12 f (0, 0)| = 1 and |D21 f (t, t) − D21 f (0, 0)| = 1, i.e., they are discontinuous at (0, 0). (c) The results have already been shown in the limits (9.73) and (9.74). We end the proof of the problem.



Problem 9.28 Rudin Chapter 9 Exercise 28.

Proof. For t < 0, we have p  (0p≤ x ≤ |t|);  −x, p p ϕ(x, t) = −ϕ(x, |t|) = x − 2 |t|, ( |t| ≤ x ≤ 2 |t|);  0, (otherwise).

Now we can divide the xt-plane into six regions, see Figure 9.7:

Figure 9.7: The definition of the function ϕ(x, t).

(9.75)

Chapter 9. Functions of Several Variables In fact, we have

244

 x,   √    −x + 2 t −x, p ϕ(x, t) =    x − 2 |t|,   0,

Region 1; Region 2; Region 3; Region 4; Regions 5 and 6.

(9.76)

Therefore, it is easy to see from the definition (9.76) ϕ(x, t) is continuous on each so we only p region, p √ that √ need to check its continuity on the curves x = t, x = 2 t when t ≥ 0 and x = |t|, x = 2 |t| when t < 0.k √ • Case (i): On the curve x = t. Since √ ϕ( t+, t) = lim ϕ(x, t) = √ √ x→ t√ t 0 if (D12 f )(a) > 0 and Q(x) < 0 if (D12 f )(a) < 0. Hence it follows from this and the expression (9.95) that – f has a local minimum at a if (D12 f )(a) > 0 and det Ha > 0, – f has a local maximum at a if (D12 f )(a) < 0 and det Ha > 0. • Case (ii): det Ha < 0. By the form (9.97), Q(x) can be expressed in the form Q(x) = γ(αx1 + βx2 )(αx1 − βx2 ),

(9.98)

where α, β, γ are some constants and γ 6= 0. Now it is clear from the form (9.98) that Q(x) = 0 if and only if αx1 + βx2 = 0 or αx1 − βx2 = 0. Besides, the lines αx1 + βx2 = 0 and αx1 − βx2 = 0 divides the x1 x2 -plane into four regions. See Figure 9.8 for details.t

Figure 9.8: The four regions divided by the two lines αx1 + βx2 = 0 and αx1 − βx2 = 0. By direct computation, we have Q(1, 0) = Q(−1, 0) = α2 γ

and Q(0, 1) = Q(0, −1) = −β 2 γ.

When γ > 0, then Q(x) > 0 in Regions I and III and Q(x) < 0 in Regions II and IV; when γ < 0, then Q(x) < 0 in Regions I and III and Q(x) > 0 in Regions II and IV. Hence f has a saddle point at a. • Case (iii): det Ha = 0. Then the point a may be a local maximum, a local minimum or a saddle point. This finishes the discussion of the first assertion. We start to prove the second assertion. We consider the function f : E ⊆ Rn → R, where E is a neighborhood of a. Suppose that n X (Di f )(a)ei = 0 ∇f (a) = i=1

t Here

we suppose that the slope of the line αx1 + βx2 = 0 is positive. The other case can be done similarly.

253

9.6. Derivatives of higher order

and not all second-order derivatives of f are 0 at a. By these and the results from Problem 9.30, we have f (a + x) − f (a) = =

X

1≤s1 +···+sn ≤2 n X

(D1s1 · · · Dnsn f )(a) s1 x1 · · · xsnn + r(x) s1 ! · · · sn !

1 (Di Dj f )(a)xi xj + r(x). 2 i,j=1

(9.99)

Similar as in the case of R2 , we define Ha = [(Dij f )(a)]

(9.100)

to be the Hessian matrix at the point a, where (Dij f )(a) is the entry appears in the ith row and jth column of the matrix (9.100). By this, the expression (9.99) can be written in the form f (a + x) − f (a) = where



  x= 

1 T x Ha x + r(x), 2 x1 x2 .. . xn

(9.101)

    

and xT is the transpose of the vector x. Recall that a symmetric matrix A (i.e., AT = A) is called positive definite if the quadratic form xT Ax > 0 for all x 6= 0. To finish our proof, we need to find bounds of the number xT Ax when A is symmetric. For this purpose, some basic results about eigenvalues of a matrix are needed, see [15, Theorem 5, §7.2, Chap. 7, p. 405] or [3, Exercise 7, §5, Chap. 7, p. 266]: Lemma 9.3 Let A be an n × n symmetric matrix. Then a quadratic form xT Ax is (a) positive definite if and only if the eigenvalues of A are all positive, (b) negative definite if and only if the eigenvalues of A are all negative, or (c) indefinite if and only if A has both positive and negative eigenvalues.

Lemma 9.3 provides a quick way to check whether a symmetric matrix A is positive definite or negative definite. Another simple way to show that a matrix is positive definite is by using its submatrices: A real symmetric n × n matrix A is positive definite if and only if det Ai > 0 for each i = 1, 2, . . . , n, where Ai is the upper left i × i submatrix of A. We need one more lemma, see [3, Proposition 5.7, §5, Chap. 7, p. 255]: Lemma 9.4 Spectral Theorem (real case) Let T be a symmetric operator on a real vector space V with a positive definite bilinear form. Then there is an orthonormal basis of V consisting of eigenvectors of T .

Let λ1 , λ2 , . . . , λn be eigenvalues of an n× n symmetric and positive definite matrix A with real entries and x1 , . . . , xn be their corresponding eigenvectors. By Lemma 9.3(a), λ1 , λ2 , . . . , λn are real and positive.

Chapter 9. Functions of Several Variables

254

By Lemma 9.4, without loss of generality, we may assume that the set {x1 , . . . , xn } is an orthonormal basis of Rn . Given non-zero x ∈ Rn , we have x = c1 x1 + · · · + cn xn for some scalar c1 , . . . , cn . It follows that xT Ax =

n X i=1

ci xi

n n n n  X T  X  X T  X λi c2i ≥ λ|x|2 , ci λi xi = ci xi ci xi = A i=1

i=1

i=1

(9.102)

i=1

where λ = min (λi ) > 0. Now it is time to determine whether f has a local maximum, or a local 1≤i≤n

minimum, or a saddle point, at a as follows: • Case (i): Ha is positive definite. By the corollary of Theorem 9.41, the matrix Ha is symmetric. Put A = Ha into the inequality (9.102) so that the expression (9.101) becomes f (a + x) − f (a) ≥

1 λ|x|2 + r(x), 2

where x is non-zero and so close to 0. Since r(x) |x|2 → 0 as x → 0, for that r(x) λ 2 < |x| 2

(9.103) λ 2

> 0 there exists a δ > 0 such (9.104)

for 0 < |x| < δ. By the equivalent form

λ λ − |x|2 < r(x) < |x|2 2 2

of the inequality (9.104), the inequality (9.103) can be reduced to f (a + x) − f (a) > 0 for 0 < |x| < δ. This means that f has a local minimum at the point a. • Case (ii): Ha is negative definite. In this case, Lemma 9.3(b) says that λ1 , . . . , λn are all negative. Instead of the inequalities (9.102) and (9.103), we have xT Ha x ≤ ρ|x|2 and f (a + x) − f (a) ≤

ρ 2 |x| + r(x), 2

where x is non-zero and so close to 0, and ρ = max (λi ) < 0. Again, since 1≤i≤n

|ρ| 2

> 0 there exists a δ > 0 such that

for 0 < |x| < δ. By the equivalent form −

r(x) |ρ| 2 < |x| 2

(9.105) r(x) |x|2

→ 0 as x → 0, for (9.106)

|ρ| 2 |ρ| 2 |x| < r(x) < |x| 2 2

of the inequality (9.106), we see that the inequality (9.105) induces f (a + x) − f (a) ≤

ρ + |ρ| 2 ρ 2 |x| + r(x) < |x| = 0 2 2

for 0 < |x| < δ. In other words, f has a local maximum at the point a. • Case (iii): Ha is indefinite. In this case, the point a may be a local maximum, a local minimum or a saddle point in this case. This proves our second assertion, completing the proof of the problem.



CHAPTER

10

Integration of Differential Forms

10.1

Integration over sets in Rk and primitive mappings

Problem 10.1 Rudin Chapter 10 Exercise 1.

Proof. Let H be a compact convex set in Rk , supp (f ) ⊆ H and H ◦ 6= ∅, where H ◦ denotes the interior of H (see Problem 2.9). If f ∈ C (H), we extend f to a function on I k containing H by setting f (x) = 0 for all x ∈ I k \ H, and define Z Z f,

f=

(10.1)

Ik

H

where

I k = {(x1 , . . . , xk ) | 0 ≤ xi ≤ 1, i = 1, 2, . . . , k}. As suggested by the hint, we are going to show that f can be approximated by functions F that are continuous on Rk and supp (F ) ⊆ H. Before that, we need to show the existence of the integral on the right-hand side in (10.1). Denote ∂H to be the boundary of the compact convex set H. Since H ◦ 6= ∅, we define the function ρ : I k → R by  ρ∂H (x), if x ∈ H; ρ(x) = 0, if x ∈ I k \ H,  inf{|x − y| | y ∈ ∂H}, if x ∈ H; = (10.2) 0, if x ∈ I k \ H. By Problem 4.20(a), ρ∂H (x) = 0

if and only if x ∈ ∂H = ∂H.

By this and Problem 4.20(b), ρ is (uniformly)    ϕ(t) =  

continuous on I k . Suppose 0 < δ < 1, put 1, if δ ≤ t < +∞; t , if 0 < t < δ; δ 0, if t = 0.

Now it is clear that ϕ is continuous [0, +∞). Define F : I k → R by F (x) = ϕ(ρ(x))f (x). Since f ∈ C (H), F is obviously continuous on H. Since f (x) = 0 on I k \ H, F is continuous on I k \ H too. However, it is not clear whether F is continuous on ∂H or not. In fact, the answer to this question is affirmative: Let p ∈ ∂H and {xn } be a sequence in I k \ ∂H converging to p. By the continuity of the functions ϕ and ρ, we observe that lim ϕ(ρ(xn )) = ϕ(ρ(p)) = 0. n→∞

255

Chapter 10. Integration of Differential Forms

256

Since H is compact, we have p ∈ H so that f (p) is well-defined. Furthermore, Theorem 4.15 implies that f is bounded by a positive constant M on H. Therefore, we have 0 ≤ lim |F (xn )| = lim |ϕ(ρ(xn ))f (xn )| ≤ M lim |ϕ(ρ(xn ))| = M |ϕ(ρ(p))| = 0 n→∞

n→∞

n→∞

which means that lim F (xn ) = 0 = ϕ(ρ(p))f (p) = F (p).

n→∞

Hence F is also continuous on ∂H and then F ∈ C (I k ).a Lemma 10.1 Put y = (x1 , . . . , xk−1 ) ∈ I k−1 . Let S = {xk ∈ [0, 1] | F (y, xk ) 6= f (y, xk )}. Then the set S is either empty or is a line segment whose length does not exceed δ.

Proof of Lemma 10.1. Suppose that S 6= ∅. Then we have F (y, xk ) 6= f (y, xk ) for some xk ∈ [0, 1]. By definition, this means that 0 ≤ ϕ(ρ(y, xk )) < 1

and f (y, xk ) 6= 0.

In addition, these conditions are equivalent to 0 ≤ ρ(y, xk ) < δ

and f (y, xk ) 6= 0.

(10.3)

By the second condition in (10.3), we must have (y, xk ) 6∈ I k \ H. Thus, (y, xk ) ∈ H so that the first condition in (10.3) reduces to 0 < ρ(y, xk ) = ρ∂H (y, xk ) < δ. Since H is convex, the line segment joining the point (y, xk ) and any point on ∂H is still inside H. See Figure 10.1 for details. Therefore, S must be a line segment whose length is less than δ. This completes the proof of the lemma. 

Figure 10.1: The compact convex set H and its boundary ∂H. a It

should be noted that since f may not be continuous on I k , we can’t conclude that lim f (xn ) = f (p).

n→∞

10.1. Integration over sets in Rk and primitive mappings

257

Let return to the proof of the problem. Since 0 ≤ ϕ ≤ 1, it follows that Z 1 |Fk (y, xk ) dxk − fk (y, xk )| dxk ≤ δkf k, |Fk−1 (y) − fk−1 (y)| ≤

(10.4)

0

where kf k = max |f (x)|. As δ → 0, the inequality (10.4) exhibits fk−1 as a uniform limit (with respect x∈I k

to δ) of a sequence of continuous functions {Fk−1 } on I k−1 . This proves that fk−1 ∈ C ( I k−1 ) and also the existence of the integral on the right-hand side in (10.1). Furthermore, if F = Fk and f = fk , then we can rewrite the inequality (10.4) as Z Z F− f ≤ δkf k Ik Ik which is true no matter what the order of the k single integrations is. As we have shown above, F ∈ C (I k ), so L(F ) = L′ (F ) Z by Theorem 10.2. Hence the same is true for f , i.e., f is unaffected by any change of the order of integration. This completes the proof of the problem.



Problem 10.2 Rudin Chapter 10 Exercise 2.

Proof. By Definition 10.3, we have supp (f ) = {(x, y) ∈ R2 | f (x, y) 6= 0} ⊆ R2 . Let Si = supp (ϕi ) ⊆ (2−i , 21−i ) ⊂ (0, 1) for each i ∈ N and S=

∞ [

Si .

i=1

We need a lemma: Lemma 10.2 We have f (x, y) 6= 0 if and only if x ∈ Si and y ∈ Si ∪ Si−1 for some i ∈ N. Proof of Lemma 10.2. By the hypothesis, we note that Si ∩ Sj = ∅ for all i, j ∈ N and i 6= j. In other words, the supports {Si } are mutually disjoint. Suppose that x ∈ Si and y ∈ Si ∪ Si−1 for some i ∈ N. Then we have ϕi (x) 6= 0, ϕj (x) = 0 for all j 6= i and either ϕi (y) 6= 0 or ϕi−1 (y) 6= 0 so that f (x, y) = [ϕi−1 (x) − ϕi (x)]ϕi−1 (y) + [ϕi (x) − ϕi+1 (x)]ϕi (y) = −ϕi (x)ϕi−1 (y) + ϕi (x)ϕi (y) = [ϕi (y) − ϕi−1 (y)]ϕi (x) 6= 0.

Next, suppose that x 6∈ Si or y 6∈ Si for all i ∈ N. Then we have ϕi (x) = 0

or ϕi (y) = 0

for all i ∈ N. In this case, the definition of f gives f (x, y) = 0. This completes the proof of the  lemma.

Chapter 10. Integration of Differential Forms

258

By Lemma 10.2, if f (x, y) 6= 0, then (x, y) ∈ Si × (Si ∪ Si−1 ) ⊂ (0, 1) × (0, 1) for some i ∈ N. This means that S = {(x, y) ∈ R2 | f (x, y) 6= 0} ⊂ (0, 1) × (0, 1). (10.5)

Since S ⊂ (0, 1) × (0, 1) ⊂ [0, 1] × [0, 1] and [0, 1] × [0, 1] is closed in R2 , Theorem 2.27(c) implies that S ⊆ [0, 1] × [0, 1]. Since S = supp (f ), we have supp (f ) ⊆ [0, 1] × [0, 1]

which means supp (f ) is a bounded set in R2 . By Theorem 2.27(a), supp (f ) is a closed set in R2 . Hence we conclude from Theorem 2.41 (Heine-Borel Theorem) that supp (f ) is compact in R2 . This proves our first assertion. For the second assertion, we consider two cases: • Case (i): f is continuous at (x, y) 6= (0, 0). Let (x, y) 6= (0, 0). If both x and y are nonzero, then it is clear that x ∈ Si ⊆ (2−i , 21−i ) and y ∈ Sj ⊆ (2−j , 21−j ) for some i, j ∈ N. Denote η = min(|x − 2−i |, |x − 21−i |, |y − 2−j |, |y − 21−j |). Then we have Nη ((x, y)) ⊆ (2−i , 21−i ) × (2−j , 21−j ). Furthermore, for every (p, q) ∈ Nη ((x, y)), we see that p 6∈ Sr for all r 6= i and q 6∈ St for all t 6= j. These mean that ϕr (p) = 0 for all r 6= i and ϕt (q) = 0 for all t 6= j, and they imply that  0, if i < j;    ϕi (p)ϕi (q), if i = j; (10.6) f (p, q) = −ϕi (p)ϕi−1 (q), if i = j + 1;    0, if i > j + 1

for all (p, q) ∈ Nη ((x, y)).b By the definition (10.6), we always have  0,    |ϕi (x)ϕi (y) − ϕi (p)ϕi (q)|, |f (x, y) − f (p, q)| = |ϕi (x)ϕi−1 (y) − ϕi (p)ϕi−1 (q)|,    0,

if if if if

i < j; i = j; i = j + 1; i>j+1

(10.7)

for all (x, y), (p, q) ∈ Nη ((x, y)). Since each ϕi is continuous on R, it is also continuous on (2−i , 21−i ). Hence, given ǫ > 0, it is easily seen from the expressions in (10.7) that we can always find 0 < δ < η small enough so that |f (x, y) − f (p, q)| < ǫ

if |(x, y) − (p, q)| < δ. In other words, f is continuous at every (x, y) whenever x and y are nonzero. The remaining cases x 6= 0 and y = 0 or x = 0 and y 6= 0 can be done similarly, so we don’t repeat the details here and we simply conclude that f is continuous at every point (x, y) 6= (0, 0). • Case (ii): f is discontinuous at (0, 0). Next we show that f is discontinuous at (0, 0). Consider the sequence of points {(2−k , 2−k )}. It is obvious that (2−k , 2−k ) → (0, 0) as k → ∞. For each k ∈ N, since 2k 6∈ (2−i , 21−i ) for all i ∈ N, we have 2k 6∈ supp (ϕi ) for all i ∈ N. In other words, we have ϕi (2k ) = 0 for all i ∈ N and then f (2−k , 2−k ) = 0

(10.8)

b By the hypothesis, we just require that S = supp (ϕ ) is in (2−k , 21−k ) for each k ∈ N, not the whole interval. Thus k k it may happen that p ∈ (2−i , 21−i ) but not in Si or q ∈ (2−j , 21−j ) but not in Sj . As a result, it is still possible that f (p, q) = 0 in the second or third case in the definition (10.6). However, this does not give any trouble to our argument.

10.1. Integration over sets in Rk and primitive mappings

for all k ∈ N. Since ∫_R ϕ_k = 1 for every k ∈ N, we have
$$1=\int_{\mathbb R}\varphi_k=\int_{2^{-k}}^{2^{1-k}}\varphi_k(x)\,dx\le\int_{2^{-k}}^{2^{1-k}}\max_{x\in S_k}\varphi_k(x)\,dx=\max_{x\in S_k}\varphi_k(x)\int_{2^{-k}}^{2^{1-k}}dx=2^{-k}\times\max_{x\in S_k}\varphi_k(x).$$
Thus we have
$$\max_{x\in S_k}\varphi_k(x)\ge 2^{k}$$
for every k ∈ N. By Theorem 2.27(a), S_k is closed. Since S_k ⊂ (0, 1) for all k ∈ N, each S_k is bounded and Theorem 2.41 (Heine-Borel Theorem) implies that each S_k is compact. Since ϕ_k : S_k → R is continuous, Theorem 4.16 (Extreme Value Theorem) ensures that ϕ_k attains its maximum on S_k, and this maximum is at least 2^k. Let p_k ∈ S_k be a point at which the maximum is attained, so that ϕ_k(p_k) ≥ 2^k. Consider the sequence of points {(p_k, p_k)}. By the construction, it is clear that p_k ∈ (2^{−k}, 2^{1−k}) and so (p_k, p_k) → (0, 0) as k → ∞. Since p_k ∈ (2^{−k}, 2^{1−k}) and supp (ϕ_{k+1}) ⊂ (2^{−k−1}, 2^{−k}), we have ϕ_{k+1}(p_k) = 0, so the definition of f gives
$$f(p_k,p_k)=[\varphi_k(p_k)-\varphi_{k+1}(p_k)]\varphi_k(p_k)=[\varphi_k(p_k)]^2\ge 2^{2k},$$
giving
$$f(p_k,p_k)\to\infty\tag{10.9}$$

as k → ∞. By comparing the two results (10.8) and (10.9), we conclude that f is discontinuous at (0, 0). This proves the second assertion.^c

To prove the equations in the third assertion, we basically follow the setting of Definition 10.1. If y ∉ Sᵢ for all i ∈ N, then ϕᵢ(y) = 0 for all i ∈ N, so f(x, y) = 0 and hence ∫_R f(x, y) dx = 0. Thus
$$\int_{\mathbb R}dy\int_{\mathbb R}f(x,y)\,dx=\int_{0}^{1}dy\int_{\mathbb R}f(x,y)\,dx=\sum_{i=1}^{\infty}\int_{2^{-i}}^{2^{1-i}}dy\int_{\mathbb R}f(x,y)\,dx.\tag{10.10}$$

On each Sᵢ, we know that f(x, y) = [ϕᵢ(x) − ϕ_{i+1}(x)]ϕᵢ(y) and thus
$$\int_{2^{-i}}^{2^{1-i}}dy\int_{\mathbb R}f(x,y)\,dx=\int_{2^{-i}}^{2^{1-i}}\varphi_i(y)\left(\int_{\mathbb R}\varphi_i(x)\,dx-\int_{\mathbb R}\varphi_{i+1}(x)\,dx\right)dy=\int_{2^{-i}}^{2^{1-i}}\varphi_i(y)(1-1)\,dy=0.\tag{10.11}$$
Combining the expressions (10.10) and (10.11), we get
$$\int_{\mathbb R}dy\int_{\mathbb R}f(x,y)\,dx=0.$$

Similarly, if x ∉ Sᵢ for all i ∈ N, then ϕᵢ(x) = 0 for all i ∈ N, so f(x, y) = 0 and hence ∫_R f(x, y) dy = 0. Thus
$$\int_{\mathbb R}dx\int_{\mathbb R}f(x,y)\,dy=\int_{0}^{1}dx\int_{\mathbb R}f(x,y)\,dy=\sum_{i=1}^{\infty}\int_{2^{-i}}^{2^{1-i}}dx\int_{\mathbb R}f(x,y)\,dy.\tag{10.12}$$

^c The limit (10.9) also shows that f is unbounded in every neighborhood of (0, 0).

On each Sᵢ, we have
$$f(x,y)=\begin{cases}\varphi_i(x)[\varphi_i(y)-\varphi_{i-1}(y)], & \text{if } i\neq 1;\\ \varphi_1(x)\varphi_1(y), & \text{if } i=1.\end{cases}$$
Thus we obtain
$$\int_{\mathbb R}f(x,y)\,dy=\begin{cases}\varphi_i(x)\displaystyle\int_{\mathbb R}[\varphi_i(y)-\varphi_{i-1}(y)]\,dy, & \text{if } i\neq 1;\\ \varphi_1(x)\displaystyle\int_{\mathbb R}\varphi_1(y)\,dy, & \text{if } i=1,\end{cases}=\begin{cases}0, & \text{if } i\neq 1;\\ \varphi_1(x), & \text{if } i=1.\end{cases}\tag{10.13}$$

Combining the expressions (10.12) and (10.13), we obtain
$$\int_{\mathbb R}dx\int_{\mathbb R}f(x,y)\,dy=\sum_{i=1}^{\infty}\int_{2^{-i}}^{2^{1-i}}dx\int_{\mathbb R}f(x,y)\,dy=\int_{1/2}^{1}dx\int_{\mathbb R}f(x,y)\,dy+\sum_{i=2}^{\infty}\int_{2^{-i}}^{2^{1-i}}dx\int_{\mathbb R}f(x,y)\,dy$$
$$=\int_{1/2}^{1}\varphi_1(x)\,dx+\sum_{i=2}^{\infty}\int_{2^{-i}}^{2^{1-i}}0\,dx=1.$$
This completes the proof of the problem.
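The mechanism behind the two different iterated integrals is easy to see numerically. The sketch below is only an illustration, with one assumption: it takes a concrete (hypothetical) choice of the bumps ϕ_k, namely tent functions of integral 1 supported in (2^{−k}, 2^{1−k}), and truncates the series at K terms. For the truncated sum, every inner integral in x is 0, while the inner integral in y equals ϕ_1(x) − ϕ_{K+1}(x); the negative unit of mass hides near 0 and escapes as K → ∞, which is exactly why the full series gives 0 for one order of integration and 1 for the other.

```python
import numpy as np

def tent(t, a, b):
    """Triangular bump supported on (a, b), normalised so that its integral is 1."""
    m, half = 0.5 * (a + b), 0.5 * (b - a)
    return np.where((t > a) & (t < b), (1.0 - np.abs(t - m) / half) / half, 0.0)

def phi(k, t):
    """phi_k: a nonnegative bump with integral 1 supported inside (2^-k, 2^(1-k))."""
    return tent(t, 2.0 ** (-k), 2.0 ** (1 - k))

K = 6                                   # truncation level of the series defining f
N = 200000
mid = (np.arange(N) + 0.5) / N          # midpoint grid on [0, 1] for quadrature

def f(xv, yv):
    """f_K(x, y) = sum_{n=1}^{K} [phi_n(x) - phi_{n+1}(x)] * phi_n(y)."""
    return sum((phi(n, xv) - phi(n + 1, xv)) * phi(n, yv) for n in range(1, K + 1))

# The inner integral in x vanishes for every fixed y, since each phi_n has integral 1.
for yv in (0.6, 0.3, 0.1):
    print("int f(x, %.1f) dx  ~ %8.4f" % (yv, f(mid, yv).mean()))

# The inner integral in y equals phi_1(x) - phi_{K+1}(x): a unit of positive mass
# stays on (1/2, 1) while the compensating negative unit concentrates near 0.
for xv in (0.55, 0.75, 0.95):
    print("int f(%.2f, y) dy ~ %8.4f   phi_1(%.2f) = %.4f"
          % (xv, f(xv, mid).mean(), xv, phi(1, xv)))
```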



Problem 10.3 Rudin Chapter 10 Exercise 3.

Proof. (a) We modify the proof of Theorem 10.7 to obtain the result. In fact, put F₁ = F. Since F′(0) = I and F₁ = F, we also have F′₁(0) = I. By the assumption, we know that F₁ ∈ C′(V₁) for some neighborhood V₁ ⊆ E of 0. Then we still have [21, Eqn. (19), p. 249] in the case m = 1:
$$F_1'(0)e_1=\sum_{i=1}^{n}(D_1F_i)(0)e_i,\tag{10.14}$$

where F₁, . . . , F_n are real C′-functions in V₁ (the components of the mapping F₁). Recall from the proof of Theorem 10.7 that there is a k such that 1 ≤ k ≤ n, (D₁F_k)(0) ≠ 0 and B₁ is defined to be the flip that interchanges 1 and this k. Next, we apply the fact F′₁(0) = I to the formula (10.14) to obtain e₁ = (D₁F₁)(0)e₁ + (D₁F₂)(0)e₂ + · · · + (D₁F_n)(0)e_n. Thus, we have (D₁F₁)(0) = 1 and (D₁Fᵢ)(0) = 0 for all i = 2, 3, . . . , n. This means that k = 1 (= m), so B₁ = I. Since F′₁(0) = I gives (D₁F₁)(0) = 1 and (D_jF₁)(0) = 0 for j = 2, . . . , n, we obtain from the expression G₁(x) = x + [F₁(x) − x₁]e₁ that G₁ ∈ C′(V₁), G₁ is primitive and

G′1 (0) = I.

Therefore, we have F′1 (0) = G′1 (0) = B1 = I.


Assume that F′ᵢ(0) = G′ᵢ(0) = Bᵢ = I, Gᵢ ∈ C′(Vᵢ) and Gᵢ is primitive for all 1 ≤ i ≤ m. By [21, Eqn. (21), p. 250], we still have
$$F_{m+1}(y)=B_m\,F_m\circ G_m^{-1}(y)=F_m\circ G_m^{-1}(y)\qquad(y\in V_{m+1}).\tag{10.15}$$
Apply Theorem 9.15 (Chain Rule) and the induction hypothesis to the expression (10.15) to get
$$F_{m+1}'(0)=F_m'\big(G_m^{-1}(0)\big)\,(G_m^{-1})'(0)=F_m'(0)\,(G_m^{-1})'(0)=(G_m^{-1})'(0).\tag{10.16}$$
By using [21, Eqn. (52), p. 223] in the proof of Theorem 9.24 (Inverse Function Theorem) and then the induction hypothesis, we acquire that
$$(G_m^{-1})'(0)=\big(G_m'(G_m^{-1}(0))\big)^{-1}=(G_m'(0))^{-1}=I^{-1}=I.$$

Thus the expression (10.16) implies that F′_{m+1}(0) = I. By Theorem 9.17, we have
$$F_{m+1}'(0)e_{m+1}=\sum_{i=m+1}^{n}(D_{m+1}F_i)(0)e_i.\tag{10.17}$$

Recall from the proof of Theorem 10.7 that there is a k such that m + 1 ≤ k ≤ n, (D_{m+1}F_k)(0) ≠ 0 and B_{m+1} is defined to be the flip that interchanges m + 1 and this k. Since F′_{m+1}(0) = I, we deduce from the formula (10.17) that e_{m+1} = (D_{m+1}F_{m+1})(0)e_{m+1} + · · · + (D_{m+1}F_n)(0)e_n, so that (D_{m+1}F_{m+1})(0) = 1 and (D_{m+1}Fᵢ)(0) = 0 for i = m + 2, . . . , n. As a result, we have k = m + 1 and then B_{m+1} = I. Similarly, since F′_{m+1}(0) = I, we obtain from the expression G_{m+1}(x) = x + [F_{m+1}(x) − x_{m+1}]e_{m+1} that G′_{m+1}(0) = I. To sum up, induction shows that F′ᵢ(0) = G′ᵢ(0) = Bᵢ = I, Gᵢ ∈ C′(Vᵢ) and Gᵢ is primitive for every i. The expression (10.15), with y = G_m(x), is equivalent to F_m(x) = F_{m+1}(G_m(x))

(x ∈ Um ).

(10.18)

By applying this with m = 1, 2 . . . , n − 1, we establish that F1 = F2 ◦ G1 = F3 ◦ G2 ◦ G1 = · · · = Fn ◦ Gn−1 ◦ · · · ◦ G1 in some neighborhood of 0. By [21, Eqn. (18), p. 249], we have Fn (x) = Pn−1 x + αn (x)en ,

(10.19)

so Fn is also primitive by Definition 10.5. Hence we just rename Fn by Gn in the formula (10.19) so as to obtain our desired result. (b) Let F : R2 → R2 be the mapping defined by F(x, y) = (y, x). Assume that F = G2 ◦ G1 , where each Gi : Vi ⊆ R2 → R2 is a primitive C ′ -mapping in some neighborhood of 0, Gi (0) = 0 and G′i (0) is invertible. By Theorem 9.15 (Chain Rule), we have F′ (0) = G′2 (G1 (0))G′1 (0) = G′2 (0)G′1 (0).

(10.20)

Now we have
$$F'(0)=\begin{pmatrix}0 & 1\\ 1 & 0\end{pmatrix}.\tag{10.21}$$

Since G1 is a primitive, we have either G1 (x, y) = (x, g1 (x, y))

or G1 (x, y) = (g1 (x, y), y),

where g₁ : V₁ → R is a real function which is differentiable at 0. By direct computation, we have either
$$G_1'(0)=\begin{pmatrix}1 & 0\\ a & b\end{pmatrix}\qquad\text{or}\qquad G_1'(0)=\begin{pmatrix}a & b\\ 0 & 1\end{pmatrix},$$
where a and b are real numbers. Similarly, we have either
$$G_2'(0)=\begin{pmatrix}1 & 0\\ c & d\end{pmatrix}\qquad\text{or}\qquad G_2'(0)=\begin{pmatrix}c & d\\ 0 & 1\end{pmatrix},$$
where c and d are real numbers. Thus the right-hand side of the expression (10.20) is one of the following four matrices:
$$\begin{pmatrix}1 & 0\\ c+ad & bd\end{pmatrix},\qquad \begin{pmatrix}c+ad & bd\\ a & b\end{pmatrix},\qquad \begin{pmatrix}a & b\\ ac & bc+d\end{pmatrix},\qquad \begin{pmatrix}ac & bc+d\\ 0 & 1\end{pmatrix},$$
and none of them can equal the matrix (10.21): the first has entry 1 in the (1, 1) position and the last has entry 1 in the (2, 2) position, whereas (10.21) has zeros there; in the middle two, matching the entries forces b = 0 together with bd = 1, and a = 0 together with ac = 1, respectively, which is impossible. Hence, the mapping F is not the composition of any two primitive mappings. We complete the proof of the problem.

Problem 10.4 Rudin Chapter 10 Exercise 4.

Proof. By direct computation, we have (G2 ◦ G1 )(x, y) = G2 (G1 (x, y)) = G2 (ex cos y − 1, y)

= (ex cos y − 1, (1 + ex cos y − 1) tan y) = (ex cos y − 1, ex sin y)

= F(x, y)

for every (x, y) ∈ R² at which tan y is defined (in particular, in a neighborhood of (0, 0)). Suppose that g₁, g₂ : R² → R are given by g₁(x, y) = eˣ cos y − 1 and g₂(x, y) = (1 + x) tan y. Then we have G₁(x, y) = (0, y) + (eˣ cos y − 1, 0) = ye₂ + g₁(x, y)e₁ and G₂(x, y) = (x, 0) + (0, (1 + x) tan y) = xe₁ + g₂(x, y)e₂. By Definition 10.5, G₁ and G₂ are primitive. By Definition 9.38, we have
$$J_{G_1}(x,y)=\begin{pmatrix}e^x\cos y & -e^x\sin y\\ 0 & 1\end{pmatrix},\qquad J_{G_2}(x,y)=\begin{pmatrix}1 & 0\\ \tan y & (1+x)\sec^2 y\end{pmatrix},\qquad J_F(x,y)=\begin{pmatrix}e^x\cos y & -e^x\sin y\\ e^x\sin y & e^x\cos y\end{pmatrix}.$$


Thus we have
$$J_{G_1}(0,0)=J_{G_2}(0,0)=J_F(0,0)=\begin{pmatrix}1 & 0\\ 0 & 1\end{pmatrix}.$$

Let D = {(x, y) ∈ R² | x² + y² ≤ 1} be the unit disk. For (x, y) ∈ D, the point (u, v) = H₂(x, y) = (x, eˣ sin y) satisfies e^{2u} − v² = e^{2x} cos² y ≥ 0, so H₁(u, v) = (√(e^{2u} − v²) − 1, v) is well defined at every such point. Then we have

(H1 ◦ H2 )(x, y) = H1 (H2 (x, y))

= H₁(x, eˣ sin y) = (√(e^{2x} − (eˣ sin y)²) − 1, eˣ sin y) = (eˣ cos y − 1, eˣ sin y)   (using cos y > 0, since |y| ≤ 1 < π/2 on D)

= F(x, y)

in the unit disk D, completing the proof of the problem.
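Both factorizations can be sanity-checked numerically at random points near the origin. The sketch below is only an illustration (it assumes NumPy), not part of the proof; the five maps are exactly those used above.

```python
import numpy as np

def F(x, y):   return (np.exp(x) * np.cos(y) - 1.0, np.exp(x) * np.sin(y))
def G1(x, y):  return (np.exp(x) * np.cos(y) - 1.0, y)                 # primitive in the 1st slot
def G2(u, v):  return (u, (1.0 + u) * np.tan(v))                       # primitive in the 2nd slot
def H2(x, y):  return (x, np.exp(x) * np.sin(y))                       # primitive in the 2nd slot
def H1(u, v):  return (np.sqrt(np.exp(2.0 * u) - v ** 2) - 1.0, v)     # primitive in the 1st slot

rng = np.random.default_rng(0)
pts = rng.uniform(-0.6, 0.6, size=(2000, 2))     # points inside the unit disk

def err(a, b):
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))

print(max(err(F(x, y), G2(*G1(x, y))) for x, y in pts))   # ~ 1e-16
print(max(err(F(x, y), H1(*H2(x, y))) for x, y in pts))   # ~ 1e-16
```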

10.2  Generalizations of partitions of unity

Problem 10.5 Rudin Chapter 10 Exercise 5.

Proof. Here is the analogue of Theorem 10.8: Suppose K is a compact subset of a metric space X with metric d, and {V_α} is an open cover of K. Then there exist functions ψ₁, . . . , ψ_s ∈ C(X) such that (a) 0 ≤ ψᵢ ≤ 1 for 1 ≤ i ≤ s; (b) each ψᵢ has its support in some V_α; and (c) ψ₁(x) + · · · + ψ_s(x) = 1 for every x ∈ K. When we read Rudin's proof carefully, we see that the key steps of the construction there are the relations [21, Eqns. (26) & (27), p. 251]. More precisely, the relation [21, Eqn. (26), p. 251] tells us that

W(xᵢ) \ B(xᵢ) ≠ ∅  and  B(xᵢ) ∩ W(xᵢ)ᶜ = ∅.   (10.22)

How do the relations (10.22) motivate the construction of the required functions? We observe a few points first. Functions of the type in Problem 4.22 are constructed from two disjoint nonempty closed sets A and B. Since B(xᵢ) and W(xᵢ)ᶜ are disjoint nonempty closed subsets of Rⁿ, they satisfy this condition. After constructing each ϕᵢ ∈ C(Rⁿ), the relation [21, Eqn. (27), p. 251] plays the role of "gluing" them together to give functions satisfying the unity requirement. These observations give us a "direction" for proving our result. In fact, we first construct sets having properties similar to the relations [21, Eqns. (26) & (27), p. 251]. Since K is compact, {V_α} has a finite subcover, so there is a positive integer s such that K ⊆ V₁ ∪ V₂ ∪ · · · ∪ V_s. We need some results from topology:

Lemma 10.3. If X is a metric space, then it is Hausdorff.


Proof of Lemma 10.3. See [18, p. 98] for the definition of a Hausdorff space. Let x, y ∈ X with x ≠ y. Let δ = d(x, y), U = N_{δ/2}(x) and V = N_{δ/2}(y). Then it is easy to check that U ∩ V = ∅, proving the lemma.



Since K is a subset of X, it is also Hausdorff by [18, Theorem 17.11, p. 100]. Furthermore, it follows from Lemma 10.3 and [18, Theorem 32.3, p. 202] that K is normal.^d Now, by using Step 1 of the proof of Theorem 36.1 in [18, p. 225], we can show that it is possible to find open coverings {U₁, . . . , U_s} and {W₁, . . . , W_s} of K such that
$$\overline{U_i}\subset W_i\subset\overline{W_i}\subset V_i\tag{10.23}$$
for each i = 1, . . . , s. Therefore, the sets Uᵢ and Wᵢ are what we need; see Figure 10.2 for the sets Uᵢ, Wᵢ and Vᵢ.

Figure 10.2: The figures of the sets Ui , Wi and Vi . Next, we apply Problem 4.20 to construct the functions satisfying the requirements of the partitions of unity. To do this, let Ai = U i and Bi = Wic , where i = 1, . . . , s. Then the relations (10.23) show that Ai and Bi are disjoint nonempty closed sets. By Problem 4.20, we consider the continuous function ϕi : X → [0, 1] defined by ϕi (x) =

ρBi (x) . ρAi (x) + ρBi (x)

Then we have ϕi (x) = 0 precisely on Bi , ϕi (x) = 1 precisely on Ai and 0 ≤ ϕi (x) ≤ 1 for all x ∈ X. In c addition, since ϕ−1 i ((0, 1]) = Bi = Wi , Definition 10.3 and relations (10.23) imply that supp (ϕi ) = ϕ−1 i ((0, 1]) = W i ⊂ Vi . d See

[18, p. 195] for the definition of a normal space.

265

10.2. Generalizations of partitions of unity

Let W = W1 ∪ W2 ∪ · · · ∪ Ws . Define ϕ : W → R by ϕ(x) =

s X

ϕi (x).

i=1

If x ∈ W , then x ∈ Wi for some i which means that ϕi (x) > 0. As a result, we have ϕ(x) > 0

(10.24)

for all x ∈ W . Let Y = W c . By Theorem 2.24(a), W is open in X and then Theorem 2.23 ensures that Y is closed in X. Since K and Y are disjoint nonempty closed subsets of X, Problem 4.20 again implies that the function f : X → [0, 1] defined by f (x) =

ρY (x) ρK (x) + ρY (x)

is a continuous function on X, 0 ≤ f (x) ≤ 1 for all x ∈ X, f (x) = 0 precisely on Y and f (x) = 1 precisely on K. Finally, for each i = 1, . . . , s, we define the function ψi : X → [0, 1] by  ϕi (x)f (x)   , if x ∈ W ;  ϕ(x) (10.25) ψi (x) =    0, if x ∈ X \ W . We claim that the functions ψ1 , . . . , ψs satisfy conditions (a) to (c):e

• Proof of condition (a). To this end, we follow from the facts 0 ≤ ϕi (x) ≤ 1, 0 ≤ f (x) ≤ 1 for all x ∈ X, the inequality (10.24) and the definition (10.25) that 0 ≤ ψi (x) ≤ 1 on X. This proves condition (a). • Proof of condition (b). The definition (10.25) and the inequality (10.24) imply that ψ(x0 ) 6= 0 0 )f (x0 ) if and only if ϕi (xϕ(x 6= 0 if and only if ϕi (x0 )f (x0 ) 6= 0 if and only if ϕi (x0 ) 6= 0 and f (x0 ) 6= 0. 0) By the definition of ϕi , we have ϕi (x0 ) 6= 0

if and only if

x0 ∈ Bic = Wi .

(10.26)

Similarly, the definition of f implies that f (x0 ) 6= 0 if and only if x0 6∈ Y . Since Y = W c , we have f (x0 ) 6= 0 if and only if

x0 ∈ W.

(10.27)

In conclusion, statements (10.26) and (10.27) tell us that ψi (x) 6= 0 if and only if

x ∈ Wi .

Therefore, we deduce from this and the relations (10.23) that supp (ψi ) = {x ∈ X | ψi (x) 6= 0} = W i ⊂ Vi . This is exactly condition (b). • Proof of condition (c). Since {W1 , . . . , Ws } is an open cover of K, if x ∈ K, then x ∈ W and it follows from the definition (10.25) that s X

ψi (x) =

i=1

This shows condition (c). e Such

functions are called bump functions in X.

s 1 f (x) X ϕi (x) = · ϕ(x) = 1. ϕ(x) i=1 ϕ(x)

Chapter 10. Integration of Differential Forms

266

To finish our proof, we have to show that each function ψi is continuous on X. It is trivial to see that ψi is continuous on W and on X \ W by its definition (10.25). Thus what is left is to show that it is continuous on the boundary of W . Recall that the boundary of a set A, denoted by ∂A, is defined to be A \ A◦ . Let a ∈ ∂W . Since W is open, a ∈ X \ W f and then ψi (a) = 0. Let {an } be a sequenceg in W such that lim an = a.

n→∞

Since f is continuous on X and f (x) = 0 on Y = W c , we have lim f (an ) = f (a) = 0.

n→∞

By the inequality (10.24), we know that preceding Theorem 3.20 that

ϕi (x) ϕ(x)

is bounded by 1 on W . Then we obtain from the remark

lim ψi (an ) = 0 = ψi (a).

n→∞

Hence, by Theorem 4.2 and Definition 4.5, ψi is also continuous on ∂W and then on X, i.e., ψi ∈ C (X) for i = 1, 2, . . . , s. We end the proof of the problem.  Problem 10.6 Rudin Chapter 10 Exercise 6.

Proof. As a remark, we note that a function f is called smooth if it has derivatives of all orders in its domain. Firstly, we recall the function in Problem 8.1: f (x) =



1

e− x2 , 0,

(x 6= 0); (x = 0).

This is a function satisfying the conditions that f (x) > 0 for all x ∈ R, f ∈ C ∞ (R) and f (m) (0) = 0 for all m = 1, 2, . . ..h Secondly, we define the function g : R → R by  √ f ( x), if x > 0; g(x) = 0, if x ≤ 0,  −1 e x , if x > 0; = 0, if x ≤ 0. This function is nonnegative on R. Furthermore, it can be shown in the same way as the proof of Problem 8.1 that this function g has derivatives of all orders for all real x (i.e., g ∈ C ∞ (R)) and g (m) (0) = 0 for all m = 1, 2, . . .). Thirdly, we define the function h : Rn → R by h(x) =

g(2 − |x|) . g(2 − |x|) + g(|x| − 1)

(10.28)

If |x| < 2, then we have g(2 − |x|) > 0. If |x| ≥ 2, then we have g(2 − |x|) > 0 but g(|x| − 1) > 0. Thus we always have g(2 − |x|) + g(|x| − 1) > 0 f See

[18, Exercise 19(a), p. 102] a sequence exists because of the equivalent definition of the boundary of a set: a ∈ ∂A if and only if Nδ (a) ∩ A 6= ∅ and Nδ (a) ∩ Ac 6= ∅ for every δ > 0. h The notation C ∞ (Rn ) denotes the set of all infinitely differentiable functions on Rn for some positive integer n. g Such

267

10.3. Applications of Theorem 10.9 (Change of Variables Theorem)

on Rn . Apart from this, since g(|x| − 1) = 0 for |x| ≤ 1, we have h(x) = 1 for |x| ≤ 1. Similarly, we have h(x) = 0 for |x| ≥ 2 and 0 ≤ h(x) ≤ 1 for 1 ≤ |x| ≤ 2. By repeated applications of Theorems 9.15 (Chain Rule), 9.21 and the fact that g has derivatives of all orders, we conclude that h ∈ C ∞ (Rn ). It is time to construct our desired function based on the function (10.28). We follow the proof of Theorem 10.8. Without loss of generality, we may assume further that Bri (xi ) ⊂ W2ri (xi ) ⊂ W2ri (xi ) ⊂ Vα(xi )

and K ⊆ Br1 (x1 ) ∪ · · · ∪ Brs (xs ),

(10.29)

where xi ∈ K, Bri (xi ) and W2ri (xi ) are open balls, centered at xi , with rational radii ri and 2ri respectively, 1 ≤ i ≤ s. Now we define ϕi : Rn → R and ϕ : Rn → R by ϕi (x) = h

x ri

and ϕ(x) =

s X

ϕi (x)

(10.30)

i=1

respectively. By the construction of the function h, we know that ϕi (x) = 1 if x ∈ Bri (xi ), ϕi (x) = 0 if x ∈ (W2ri (xi ))c and 0 ≤ ϕi (x) ≤ 1 on Rn . Besides, if x ∈ K, then x ∈ Bri (xi ) for some i so that ϕi (x) > 0. Thus we always have ϕ(x) > 0 on K and we can further define the function ψi : Rn → R by ψi (x) =

ϕi (x) . ϕ(x)

(10.31)

We check that the functions ψ1 , . . . , ψs satisfy conditions (a) to (c): • Proof of condition (a). It follows from the definitions (10.30) and (10.31) trivially. • Proof of condition (b). Since ϕi (x) = 0 on (W2ri (xi ))c , we have {x ∈ Rn | ϕi (x) 6= 0} ⊂ W2ri (x). Therefore, we follow from the left-most relation in (10.29) that supp (ϕi ) ⊂ W2ri (x) ⊂ Vα(xi ) . This verifies condition (b). • Proof of condition (c). For every x ∈ K, we have s X

s

1 X ϕi (x) = 1 ψi (x) = ϕ(x) i=1 i=1

so that condition (c) is satisfied. This completes the proof of the problem.
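A quick numerical sketch of the bump (10.28) can help build intuition. The code below is an illustration only, assuming NumPy and taking n = 1 with the concrete g(t) = e^{−1/t} (t > 0) used above; it tabulates h and confirms that h ≡ 1 on [−1, 1], h ≡ 0 outside (−2, 2) and 0 < h < 1 in between, which is exactly the behaviour used to build the functions ϕᵢ in (10.30).

```python
import numpy as np

def g(t):
    """g(t) = exp(-1/t) for t > 0 and 0 for t <= 0; smooth on all of R."""
    t = np.asarray(t, dtype=float)
    out = np.zeros_like(t)
    pos = t > 0
    out[pos] = np.exp(-1.0 / t[pos])
    return out

def h(x):
    """The C-infinity bump of (10.28): 1 for |x| <= 1, 0 for |x| >= 2."""
    r = np.abs(x)
    return g(2.0 - r) / (g(2.0 - r) + g(r - 1.0))

xs = np.array([0.0, 0.5, 1.0, 1.2, 1.5, 1.8, 2.0, 3.0])
print(np.round(h(xs), 4))   # 1, 1, 1, then values strictly between 0 and 1, then 0, 0
```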

10.3

Applications of Theorem 10.9 (Change of Variables Theorem)

Problem 10.7 Rudin Chapter 10 Exercise 7.

Proof.



Chapter 10. Integration of Differential Forms

268

(a) Recall that Qk = {x = (x1 , . . . , xk ) ∈ Rk | x1 + · · · + xk ≤ 1

and x1 , . . . , xk ≥ 0}.

(10.32)

Let S = {0, e1 , . . . , ek }. Then the smallest convex subset of Rk containing S is exactly the convex hull of S. If we denote the convex hull of S by Conv (S), then we have Conv (S) = {c0 0 + c1 e1 + · · · + ck ek | c0 + c1 + · · · + ck = 1

and c0 , c1 , . . . , ck ≥ 0}.

(10.33)

We have to prove that Qk = Conv (S). To this end, we suppose that x ∈ Conv (S). Then we have x = c0 0 + c1 e 1 + · · · + ck e k , where c0 + c1 + · · · + ck = 1 and c0 , c1 , . . . , ck ≥ 0. If x1 = c1 , x2 = c2 , . . . , xk = ck , then since c0 0 = 0, we have x = c1 e1 + · · · + ck ek = (c1 , . . . , ck ) = (x1 , . . . , xk ), where x1 + · · · + xk ≤ 1 and x1 , . . . , xk ≥ 0. By definition (10.32), we have x ∈ Qk . Conversely, we suppose that x ∈ Qk , so x = (x1 , . . . , xk ) = x1 e1 + · · · + xk ek , where x1 + · · · + xk ≤ 1 and x1 , . . . , xk ≥ 0. Let c0 = 1 − (x1 + · · · + xk ), c1 = x1 , . . . , ck = xk . We can see from these that c0 + c1 + · · · + ck = 1, c0 , c1 . . . , ck ≥ 0 and x = x1 e1 + · · · + xk ek = c0 0 + c1 e1 + · · · + ck ek . By definition (10.33), we have x ∈ Conv (S). Hence, we establish Qk = Conv (S). (b) Let f : X → Y , where X is a convex set. Without loss of generality, we may assume that Y = f (X). Our purpose is to show that Y is convex. Suppose that u, v ∈ Y . By Definition 10.26, we know that f (x) = f (0) + Ax for some A ∈ L(X, Y ). Thus there are a, b ∈ X such that u = f (a) = f (0) + Aa

and v = f (b) = f (0) + Ab.

Let 0 < λ < 1. Since X is convex, we have λa + (1 − λ)b ∈ X and then λu + (1 − λ)v = f (0) + A[λa + (1 − λ)b] = f (λa + (1 − λ)b) ∈ Y. In other words, Y is also convex which is our desired result. This finishes the proof of the problem.



Problem 10.8 Rudin Chapter 10 Exercise 8.

Proof. Before proving the result, it is believed that Rudin missed the few words “and (1, 1) to (4, 5)” at the end of the second sentence in the question. By Definition 10.26, since T (0, 0) = (1, 1), the affine map T : I 2 → H has the form T (x) = T (0) + Ax = (1, 1) + Ax, where A is a 2 × 2 matrix and I 2 = {(x, y) | 0 ≤ x ≤ 1, 0 ≤ y ≤ 1}. Let   a b A= . c d

269

10.3. Applications of Theorem 10.9 (Change of Variables Theorem)

Then we have T



x y



=



1 1



+



a c

b d



x y



=



ax + by + 1 cx + dy + 1



.

(10.34)

In particular, since T (1, 0) = (3, 2) and T (0, 1) = (2, 4), we obtain from the expression (10.34) that         a+1 3 b+1 2 = and = c+1 2 d+1 4 which imply that a = 2, b = c = 1 and b = 3. Therefore, the affine map T is given by          x 1 2 1 x 2x + y + 1 T = + = . y 1 1 3 y x + 3y + 1

(10.35)

See 10.3 for the mapping T .

Figure 10.3: The mapping T : I 2 → H. By the representation (10.35) and Definition 9.38, we have   2 1 ′ JT (x) = det[T (x)] = det =6−1=5 1 3

(10.36)

for every x ∈ I 2 . Let I ◦ and H ◦ be the interiors of the square I 2 and the parallelogram H respectively. To evaluate the integral α, one may think that Theorem 10.9 should be applied to the function f : H → R defined by f (x, y) = ex−y .

Although the map T satisfies the hypotheses of the theorem (i.e., T ∈ C (I ◦ ) by Theorem 9.21 and the fact that T ∈ C (I 2 ); T : I ◦ → H ◦ is bijective by the fact that A is invertible; and JT (x) 6= 0 on I ◦ ), the function f fails in this case: f (x, y) 6= 0 for every (x, y) ∈ H ◦ so that supp (f ) = H ◦ = H which is compact in R2 , but

supp (f ) * T (I ◦ ) = H ◦ .

Therefore, we have to seek a variation of Theorem 10.9 which is applicable to this situation. In fact, the version considered by Fitzpatrick ([8, Theorem 19.9, p. 506]) serves this purpose:i Lemma 10.4 Suppose that U is open in Rn and the mapping T : U → Rn is a smooth change of variables. Let D be an open Jordan domain such that K = D ∪ ∂D ⊆ U . Then T (K) is a Jordan domain with the property that for any continuous function f : T (K) → R, the following formula holds: Z Z f (y) dy = f (T (x))|JT (x)| dx. T (K)

K

i T is called smooth if it is one-to-one and J (x) 6= 0 on U . Besides, a bounded subset of Rn is said to be a Jordan T domain if its boundary has Jordan content 0, see [8, p. 492].

Chapter 10. Integration of Differential Forms

270

To apply Lemma 10.4, we need to extend the domain of T from the unit square I 2 to the open 2-cell R = (0, 2) × (0, 2). Now it is easy to check that T is also one-to-one and JT (x) 6= 0 on R2 . By [8, Exercise 1, p. 496], the unit square I 2 is a Jordan domain. By definition, it means that ∂I 2 has Jordan content 0. Since ∂I ◦ = ∂I 2 , the set I ◦ is an open Jordan domain. Furthermore, we have 2

K = I ◦ ∪ ∂I ◦ = I 2 ⊂ R2 and then T (K) = T (I 2 ) = H. Since f is obviously continuous on H, we follow from Lemma 10.4 that Z Z Z Z f (T (x))|JT (x)| dx. (10.37) f (T (x))|JT (x)| dx = f (y) dy = ex−y dx dy = α= I2

T −1 (H)

H

H

By the formula (10.35), we have f (T (x)) = f (2u + v + 1, x + 3v + 1) = eu−2v and then it deduces from the formula (10.37) that α=5

Z

1

0

Z

1

eu−2v du dv.

(10.38)

0

Since eu−2v ∈ C (I 2 ), Theorem 10.2j says that the order of the integration on the right-hand side of (10.38) does not matter. Hence we obtain from the integral (10.38) that α=5

Z

0

1

Z

1

eu−2v du dv = 5

0

Z

0

1

eu du

 Z

0

1

 5 e−2v dv = (e − 1)(e−2 + 1), 2

completing the proof of the problem.
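As a numerical cross-check (not part of the solution), the two sides of the change-of-variables identity can be compared directly. The short script below assumes NumPy; it uses the affine map T(u, v) = (2u + v + 1, u + 3v + 1) and |J_T| = 5 from (10.35)–(10.36), estimating the left-hand side by Monte Carlo over H and the right-hand side by a midpoint rule on the unit square.

```python
import numpy as np

# Right-hand side: 5 * ∫∫_{I^2} e^{u-2v} du dv by the midpoint rule.
n = 1000
u = (np.arange(n) + 0.5) / n
U, V = np.meshgrid(u, u, indexing="ij")
rhs = 5.0 * np.mean(np.exp(U - 2.0 * V))          # mean * area(I^2) * |J_T|

# Left-hand side: ∫∫_H e^{x-y} dx dy by Monte Carlo, testing (x, y) ∈ H = T(I^2)
# through the inverse of (10.35): u = (3(x-1) - (y-1))/5, v = (2(y-1) - (x-1))/5.
rng = np.random.default_rng(1)
P = rng.uniform([1.0, 1.0], [5.0, 6.0], size=(2_000_000, 2))   # a box containing H
uu = (3.0 * (P[:, 0] - 1.0) - (P[:, 1] - 1.0)) / 5.0
vv = (2.0 * (P[:, 1] - 1.0) - (P[:, 0] - 1.0)) / 5.0
inside = (uu >= 0) & (uu <= 1) & (vv >= 0) & (vv <= 1)
lhs = 20.0 * np.mean(np.where(inside, np.exp(P[:, 0] - P[:, 1]), 0.0))  # 20 = box area

print(lhs, rhs)    # the two estimates agree to Monte Carlo accuracy
```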



Problem 10.9 Rudin Chapter 10 Exercise 9.

Proof. Rudin said that the interval is from (0, 0) to (0, a), but the end point is a typo, see [1, Example 1, p. 418]. Let A = {(r, θ) | 0 ≤ r ≤ a, 0 ≤ θ ≤ 2π} and D = {(x, y) | x2 + y 2 = r} be the rectangle in the rθ-plane and the closed disc with center (0, 0) and radius a respectively. See Figure 10.4 below:

Figure 10.4: The mapping T : A → D. 2 is a kind of Fubini’s Theorem: If f (x, y) ! I = [a, b] × [c, d], where a, b, c and d are some constants, ! = Zg(x)h(y) and Z b Z d h(y) dy . See, for instance, [24, Theorem 3.10, p. 58]. g(x) dx f (x, y) dx dy = then we have j It

I2

a

c

271

10.3. Applications of Theorem 10.9 (Change of Variables Theorem)

Let T : A → D be defined by 2

2

T (r, θ) = (r cos θ, r sin θ).

2

Since (r cos θ) + (r sin θ) = r , T maps A into D. Given any (x, y) ∈ D. If x 6= 0, then let θ = tan−1 p (thus θ 6= π2 , 3π x2 + y 2 which imply that 2 ) and r =

y x

T (r, θ) = (x, y).

If x = 0 and y = r, then

 π T r, = (0, r). 2

If x = 0 and y = −r, then

 3π  T r, = (0, −r). 2

Therefore, the mapping T is onto. Next, suppose that A◦ = {(r, θ) | 0 < r < a, 0 < θ < 2π} which is open by Problem 2.9(a). We consider the mapping T : A◦ → D0 , (10.39)

where D0 = {(x, y) | x2 + y 2 < a2 } \ {(r, 0) | 0 ≤ r ≤ a}.k See Figure 10.5 for the illustration.

Figure 10.5: The mapping T : A◦ → D0 . We check the hypotheses of Theorem 10.9 for the map (10.39): • Let (r, θ), (r′ , θ′ ) ∈ A◦ . If T (r, θ) = T (r′ , θ′ ), then we have (r cos θ, r sin θ) = (r′ cos θ′ , r′ sin θ′ ) which gives r cos θ = r′ cos θ′ and r sin θ = r′ sin θ′ . Thus tan θ = tan θ′ and either θ′ = θ or θ′ = π + θ. However, the relation θ′ = π + θ implies that r cos θ = r′ cos(π + θ) = −r′ cos θ

and r sin θ = r′ sin(π + θ) = −r′ sin θ.

(10.40)

Since r > 0 and r′ > 0, there is no θ such that the two equations in (10.40) hold simultaneously. Therefore, we have θ′ = θ and then r = r′ . In other words, the mapping T : A◦ → D0 is one-to-one. • For the Jacobian of T , since



[T (r, θ)] = we have for every (r, θ) ∈ A◦ .



cos θ sin θ

−r sin θ r cos θ



,

JT (r, θ) = det[T ′ (r, θ)] = r 6= 0

k We notice that T (0, θ) = (0, 0) for every 0 ≤ θ ≤ 2π; T (r, 0) = T (r, 2π) = (r, 0) for every 0 ≤ r ≤ a; and T (a, θ) = (a cos θ, a sin θ) for every 0 ≤ θ ≤ 2π which is the boundary of D.

Chapter 10. Integration of Differential Forms

272

Suppose that f ∈ C (D). Then f is a continuous and bounded function on the compact set D. By Theorem 4.15, f (D) is bounded too. Since supp (f ) = {(x, y) ∈ D | f (x, y) 6= 0} ⊆ f (D), the set supp (f ) is also bounded. Since supp (f ) is closed, Theorem 2.41 (Heine-Borel Theorem) shows that it is compact. In addition, since D0 ⊂ D, we know from Definition 4.5 that f is also continuous on D0 . Let fD0 be the restriction of f to D0 (see Problem 4.7 for the definition). Then supp (fD0 ) = {(x, y) ∈ D0 | f (x, y) 6= 0} is a closed subset of the compact set supp (f ), so supp (fD0 ) is also compact by Theorem 2.35. Furthermore, it is clear that supp (fD0 ) lies in D0 = T (A◦ ). By Theorem 10.9 with y = (x, y) and x = (r, θ), we must have Z Z Z (10.41) f (T (x))|JT (x)| dx. f (y) dy = f (x, y) dx dy = D0

A◦

D0

Since f ∈ C (D) and T, JT ∈ C (A), we have (f ◦ T ) × JT ∈ C (A) and then Theorem 10.2 implies that the mapping (f ◦ T ) × JT : A → R is integrable. By [17, Theorem 13.6, p. 110], the restriction (f ◦ T ) × JT : A◦ → R is also integrable and

Z

f (T (x))|JT (x)| dx =

Z

f (T (x))|JT (x)| dx.

(10.42)

A

A◦

Combining formulas (10.41) and (10.42), we obtain Z Z aZ f (x, y) dx dy = 0

D◦



f (T (r, θ))r dr dθ.

(10.43)

0

To summary, what we have shown in the previous paragraph is that if f ∈ C (D), then the equality (10.43) holds on the interior D0 of D. To remove this restriction, we proceed as in Example 10.4. In other words, we want to show something like [21, Eqn. (8), p. 248]. Since f ∈ C (D), we extend f to a function on I 2 by setting f (x, y) = 0 on I 2 \ D, and define Z Z (10.44) f (x, y) dx dy = f (x, y) dx dy. I2

D

2

Here I is the 2-cell defined by 2

−1 ≤ x, y ≤ 1.

Since f may be discontinuous on I , the existence of proof. To do this, suppose 0 < δ < 1 and letl    1, 1−t ϕδ (t) = ,   δ 0,

the integral on the right-hand side of (10.44) needs

if 0 ≤ t ≤ 1 − δ;

if 1 − δ < t ≤ 1;

(10.45)

if t > 1.

It is easy to see that the function (10.45) is continuous on [0, ∞). Define Fδ : I 2 → R by Fδ (x, y) = ϕδ (x2 + y 2 )f (x, y).

(10.46)

Here we have a result about this Fδ : Lemma 10.5 The function Fδ : I 2 → R defined by (10.46) is continuous and bounded on I 2 . l The functions ϕ and F defined in [21, Eqn. (5) & (6)] depend on δ, but Rudin didn’t mention this point clearly in the text.

273

10.3. Applications of Theorem 10.9 (Change of Variables Theorem) Proof of Lemma 10.5. If (x, y) ∈ D◦ , then we have Fδ (x, y) = f (x, y) which is clearly continuous on D◦ . Similarly, if (x, y) ∈ I 2 \ D, then we have Fδ (x, y) = 0 so that it is also continuous on D \ I 2 . Suppose that (x, y) ∈ ∂D. Then we have lim 2 2

x +y →1 x2 +y 2 1

0 × f (x, y) = 0.

These mean that Fδ is continuous on ∂D. Since D is compact, we conclude from Theorem 4.15  that Fδ ∈ C (I 2 ). Let’s return to the proof of the problem. For each x0 ∈ [−1, 1], let Sx0 = {y ∈ [−1, 1] | Fδ (x0 , y) 6= f (x0 , y)}. If Sx0 6= ∅, then Fδ (x0 , y) 6= f (x0 , y) for some y ∈ [−1, 1] which amounts to ϕ(x20 + y 2 ) 6= 1 and f (x0 , y) 6= 0 and then it is equivalent to 1 − δ ≤ x20 + y 2 ≤ 1 and f (x0 , y) 6= 0. √ In this case, the set Sx0 is a segment of length does not exceed 2(1 − 1 − δ). By definition (10.45), we have 0 ≤ ϕ ≤ 1 which shows that √ (10.47) |F1δ (x) − f1 (x)| ≤ 2(1 − 1 − δ) · max 2 |f (x, y)| (x,y)∈I

for all x ∈ [−1, 1], where F1δ (x) =

Z

1

Fδ (x, y) dy

and f1 (x) =

−1

Z

1

f (x, y) dy.

−1

As δ → 0, the inequality (10.47) implies that f1 is a uniform limit of a sequence of continuous functions {F1δ } on [−1, 1]. By Theorem 7.12, we have f1 ∈ C ([−1, 1]). This proves the existence of the integral on the right-hand side of (10.44). Furthermore, we follow from the inequality (10.47) that Z Z √ f (x, y) dx dy ≤ 22 (1 − 1 − δ) · max 2 |f (x, y)|. Fδ (x, y) dx dy − I2 (x,y)∈I I2 Since Fδ ∈ C (I 2 ), the order of integration of

Z



I2

is irrelevant and then the same is true for

Z

f.

I2

Hence, by the definition (10.44), f is integrable over D. To get our final conclusion, we need a lemma (see [17, Theorem 13.6, p. 110]): Lemma 10.6 Let S be a bounded set in Rn ; let f : S → R be a bounded continuous function. If f is integrable over S, then f is integrable over S ◦ , and Z Z f. f= S

S◦

Chapter 10. Integration of Differential Forms

274

Now the closed disk D is bounded and f : D → R is a bounded continuous function. By the above analysis, f is integrable over D, so Lemma 10.6 shows that Z

f (x, y) dx dy =

D

Z

f (x, y) dx dy.

(10.48)

D◦

Hence, we establish from the equalities (10.43) and (10.48) that Z

f (x, y) dx dy =

Z

a

0

D

Z



f (T (r, θ))r dr dθ.

(10.49)

0

This completes the proof of the problem.



Problem 10.10 Rudin Chapter 10 Exercise 10.

Proof. Let’s make clear what is the meaning of “f decreases sufficiently rapidly as |x| + |y| → ∞”. In fact, it means that lim

|x|+|y|→∞

|(x2 + y 2 )f (x, y)| = 0

(10.50)

or equivalently |f (x, y)| ≤

(x2

A + y 2 )1+c

(10.51)

for some positive constants A and c and for all large |x| + |y|. If we put x = r cos θ and y = r sin θ, then the inequality (10.51) becomes A |f (r cos θ, r sin θ)| ≤ 2+2c (10.52) r for all large enough r. Define fb : [0, b] → R by fb(r) = r

so that

Z



f (T (r, θ)) dθ = r

0

Z

0

b

Z



f (r cos θ, r sin θ) dθ

0

Z



f (T (r, θ))r dr dθ = 0

Z

0

where b > 0. Now we show that the improper integral Z



0

b

fb(r) dr,

fb(r) dr

exists. To this end, we need a comparison result of improper integrals. Lemma 10.7 Suppose that F : (a, ∞) → R is increasing. If there exists a constant M such that F (x) ≤ M for all x ∈ (a, ∞), then lim F (x) exists and x→∞

lim F (x) ≤ M.

x→∞

275

10.3. Applications of Theorem 10.9 (Change of Variables Theorem) Proof of Lemma 10.7. Suppose that E = F ((a, ∞)) = {y = F (x) | x > a} ⊆ R. By the hypothesis, we know that E is bounded by M . By Theorem 1.19, R is an ordered field with the least-upper-bound property. By Definition 1.10, E has a least upper bound in R. Suppose that S = sup E. Given ǫ > 0, then the number S − ǫ is not an upper bound of E, so we have S − ǫ < F (x0 ) (10.53) for some x0 > a. Since F is increasing, the inequality (10.53) shows that S − ǫ < F (x0 ) ≤ F (x) ≤ S for all x > x0 . In other words, we have |F (x) − S| < ǫ for all x > x0 . By Definition 4.33, we have lim F (x) = S.

x→∞

Since S = sup E, it is trivial that S ≤ M which is our desired result.



Lemma 10.8 Let f (x) and g(x) be continuous functions and a be a constant. Suppose that 0 ≤ g(x) ≤ f (x) Z ∞ Z ∞ for x ≥ a. If f (x) dx exists, then the limit g(x) dx also exists and a

a

Z

a



g(x) dx ≤

Proof of Lemma 10.8. Suppose that Z y F (y) = f (x) dx

Z



f (x) dx.

a

and G(y) =

a

Z

y

g(x) dx,

a

where y ≥ a. Since f (x) ≥ 0 and g(x) ≥ 0 for all x ≥ a, the functions F and G are increasing. Since 0 ≤ g(x) ≤ f (x) for all x ≥ a, we have G(y) ≤ F (y) for all y ≥ a. By the hypothesis, the number Z M=



(10.54)

f (x) dx

a

is finite. Since we always have F (y) ≤ M for all y ≥ a, we deduce from the inequality (10.54) that G(y) ≤ M for all y ≥ a. Hence the desired results follow immediately from Lemma 10.8.



Chapter 10. Integration of Differential Forms

276

Lemma 10.9 The improper integral

Z

exists.



|fb(r)| dr

0

Proof of Lemma 10.9. Let N be a fixed positive integer such that the inequality (10.52) holds for all r ≥ N . If b > N , then we get from Theorems 6.12(c) and 6.13(b) that Z

0

b

|fb(r)| dr =





Z

N

0

Z

N

0

Z

N

0

|fb(r)| dr +

|fb(r)| dr +

|fb(r)| dr +

Z

b

N Z b N

Z

b

N

|fb(r)| dr Z 2π f (r cos θ, r sin θ) dr r 0 2πA dr r1+2c

(10.55)

Since fb is continuous on the compact interval [0, N ] and N is fixed, Theorem 4.14 implies that |fb(r)| ≤ m on [0, N ] for a positive constant m. Thus the inequality (10.55) becomes 0≤

Since

Z

b

0

Z

b

N

and

|fb(r)| dr ≤ mN +

Z

b

N

2πA dr. r1+2δ

πA  1 1  2πA dr = − r1+2δ δ r2N r2b

πA 1  πA  1 − 2b = 2N , 2N b→∞ δ r r δr Lemma 10.8 implies that the limit Z ∞ |fb(r)| dr lim

0

exists. This completes the proof of this lemma.



Since 0 ≤ |fb(r)| − fb(r) ≤ 2|fb(r)| for every 0 ≤ r ≤ b, it follows from Lemmas 10.8 and 10.9 that Z ∞ fb(r) dr 0

exists. Next, since D is the closed disk centered at the origin with radius a, D becomes the whole plane R2 as a → ∞. Hence, we deduce from the equality (10.49) that Z

f (x, y) dx dy =

R2

Z

0



Z



f (T (r, θ))r dr dθ.

0

Now the function f : R2 → R given by f (x, y) = exp(−x2 − y 2 ) is clearly a continuous and bounded function on the closed disk D with center at (0, 0) and radius a so that f ∈ C (D). Furthermore, f is a function satisfying the condition (10.50). Therefore, on the one hand, we have Z ∞ Z 2π Z exp(−x2 − y 2 ) dx dy = f (T (r, θ))r dr dθ R2

0

0

277

10.3. Applications of Theorem 10.9 (Change of Variables Theorem) =

Z

0

1 = 2 1 = 2 = π.

Z



Z

0

Z



2

e−r r dr dθ

0 " 2π Z ∞

#

2

e−r d(r2 ) dθ

0



0



 2 ∞ − e−r dθ 0

(10.56)

On the other hand, we have Z

R2

exp(−x2 − y 2 ) dx dy =

Z



2

!

e−x dx

−∞

×

Z



2

e−y dy

−∞

!

=

Z



−∞

2

!2

e−x dx

.

(10.57)

Combining the expressions (10.56) and (10.57), we have Z ∞ √ 2 e−x dx = π −∞

which is exactly formula (101) of Chap. 8 (see [21, p. 194]). This completes the proof of our problem.  Problem 10.11 Rudin Chapter 10 Exercise 11.

Proof. Let S = {(s, t) | 0 < s < ∞, 0 < t < 1} and Q = {(u, v) | u > 0, v > 0} be the strip in the (s, t)-plane and the positive quadrant in the (u, v)-plane respectively. Define T : S → R by T (s, t) = (s − st, st).

(10.58)

Since s > 0 and 0 < t < 1, s − st = s(1 − t) > 0 and st > 0. Thus we have T (S) ⊆ Q. If T (s, t) = T (s′ , t′ ), then the definition (10.58) gives s − st = s′ − s′ t′ and st = s′ t′ which show that s = s′ and t = t′ v . immediately. Thus T is one-to-one. Besides, given u > 0 and v > 0, we define s = u + v and t = u+v Then we have (s, t) ∈ S and  v v  T (s, t) = (s − st, st) = u + v − (u + v) = (u, v). , (u + v) u+v u+v Therefore, we have T (S) = Q, i.e., the map T : S → Q is onto. See Figure 10.6 for the mapping T .

Figure 10.6: The mapping T : S → Q. It is obvious from the definition (10.58) that   ∂u ∂u  1−t  ∂s ∂t  JT (s, t) = det  ∂v = det  ∂v t ∂s ∂t

−s s



= s(1 − t) + st = s.

Chapter 10. Integration of Differential Forms

278

This proves our second assertion. For x > 0, y > 0, we consider the function f : Q → R and the integral given by Z Z ux−1 v y−1 f (u, v) = u · v and α = f (u, v) du dv = ux−1 e−u v y−1 e−v du dv e e Q Q respectively. Obviously, S is an open set and T is an one-to-one (in fact, bijective) C ′ -mapping. Although the function f (u, v) is clearly continuous on Q, its support is supp (f ) = {(u, v) ∈ Q | f (u, v) 6= 0} = Q ∪ {(u, 0) | u ≥ 0} ∪ {(0, v) | v ≥ 0} which consists of the positive quadrant plus the nonnegative u and v axes. Thus the set supp (f ) is unbounded (and hence not compact in R2 by Theorem 2.41 (Heine-Borel Theorem)) and does not lie in T (S) = Q. In other words, the hypotheses of Theorem 10.9 do not satisfy in this case. To overcome these two problems, we try to consider subsets of S and Q and then we apply Lemma 10.4 to such subsets. To this end, for every small enough ǫ > 0, we consider the set Sǫ = {(s, t) | ǫ < s < ǫ−2 , ǫ < t < 1 − ǫ} ⊂ S. Let Aǫ = T (ǫ, ǫ), Bǫ = T (ǫ, 1 − ǫ), Cǫ = T (ǫ−2 , ǫ) and Dǫ = T (ǫ−2 , 1 − ǫ). Then direct computation gives Aǫ = (ǫ(1 − ǫ), ǫ2 ),

Bǫ = (ǫ2 , ǫ(1 − ǫ)),

Cǫ = (ǫ−2 (1 − ǫ), ǫ−1 ) and Dǫ = (ǫ−1 , ǫ−2 (1 − ǫ)).

Now let Qǫ be the interior of the convex hull of the points Aǫ , Bǫ , Cǫ and Dǫ in the (u, v)-plane. That is, if Hǫ = {λ1 Aǫ + λ2 Bǫ + λ3 Cǫ + λ4 Dǫ | λi ≥ 0 for i = 1, 2, 3, 4 and λ1 + λ2 + λ3 + λ4 = 1}, then we have Qǫ = Hǫ◦ ⊂ Q. For examples, if ǫ = 0.1, then the four points are A0.1 = (0.09, 0.01),

B0.1 = (0.01, 0.09),

C0.1 = (90, 10) and D0.1 = (10, 90);

if ǫ = 0.2, then they are A0.2 = (0.16, 0.04),

B0.2 = (0.04, 0.16),

C0.2 = (20, 5) and D0.2 = (5, 20).

See Figure 10.7 for the open sets Q0.1 , Q0.2 and Q.

Figure 10.7: The open sets Q0.1 , Q0.2 and Q.

(10.59)

279

10.3. Applications of Theorem 10.9 (Change of Variables Theorem)

By definition, we know that both Sǫ and Qǫ are open and bounded in R2 . Next, we have Sǫ ⊂ Sδ

and Qǫ ⊂ Qδ

if 0 < δ < ǫ. As ǫ → 0, we know from the representations of Sǫ and Qǫ that Sǫ → S

and Qǫ → Q.

Furthermore, by the construction of the sets Sǫ and Qǫ , it is easily seen that the restriction T : Sǫ → Qǫ is a bijective C ′ -mapping. Since Sǫ is a rectangle in R2 , it follows from [8, Exercise 1, p. 496] that it is an open Jordan domain. Since Kǫ = Sǫ ∪ ∂Sǫ = Sǫ ⊂ S, we know from the definition (10.59) of Qǫ that T (Kǫ ) = Qǫ = Hǫ . Now our function f is obviously continuous on Hǫ , so we may apply Theorem 10.9 to the restriction f : Hǫ → R to obtain Z Z f (u, v) du dv = f (T (s, t))|JT (s, t)| ds dt. (10.60) Hǫ



On the one hand, we know that Z

f (u, v) du dv =

Z

ǫ−2 (1−ǫ)

u

x−1 −u

e

du

ǫ2



! Z

ǫ−2 (1−ǫ)

v

y−1 −v

e

dv

ǫ2

!

(10.61)

and on the other hand, we have Z

f (T (s, t))|JT (s, t)| ds dt =

Z

ǫ



1−ǫ Z ǫ−1 ǫ

sx+y−1 ty−1 (1 − t)x−1 e−s ds dt.

(10.62)

By substituting the integrals (10.61) and (10.62) into the integral relation (10.60), we have ! Z −2 ! "Z # Z ǫ−2 (1−ǫ) ǫ (1−ǫ) 1−ǫ x−1 −u y−1 −v y−1 x−1 u e du v e dv = t (1 − t) dt ǫ2

ǫ2

ǫ

× and since

Z

Z

ǫ−1

sx+y−1 e−s ds

ǫ

!

ǫ−1

sx+y−1 e−s ds > 0

ǫ

for every ǫ > 0, we must have Z

ǫ

1−ǫ

t

y−1

x−1

(1 − t)

dt =

Z

ǫ−2 (1−ǫ)

u

x−1 −u

e

du

ǫ2

Z

ǫ−1

s

! Z

ǫ−2 (1−ǫ)

v

y−1 −v

ǫ2

x+y−1 −s

e

ds

ǫ

for every ǫ > 0. Recall from Definition 8.17 that the integral Z ∞ tx−1 e−t dt 0

converges for all x > 0, so we can conclude that lim

ǫ→0

Z

ǫ−2 (1−ǫ)

ǫ2

ux−1 e−u du =

Z

0



ux−1 e−u du = Γ(x),

e

dv

!

(10.63)

Chapter 10. Integration of Differential Forms

lim

ǫ→0

Z

ǫ−2 (1−ǫ)

v

y−1 −v

e

dv =

ǫ2

lim

ǫ→0

Z

Z

280 ∞

v x−1 e−v dv = Γ(y),

0

ǫ−1

sx+y−1 e−s ds =

Z



sx+y−1 e−s ds = Γ(x + y).

0

ǫ

Since Γ(x) is nonzero on (0, ∞), they imply that the integral on the left-hand side of the relation (10.63) also converges and we have Z

1

0

ty−1 (1 − t)x−1 dt = lim

ǫ→0

Z

1−ǫ ǫ

ty−1 (1 − t)x−1 dt =

Γ(x)Γ(y) Γ(x + y)

which is exactly formula (96) of Chap. 8. This completes the proof of the problem.
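The identity just derived is easy to verify numerically. The sketch below assumes only the standard library's math.gamma; it compares a simple midpoint-rule value of the Beta integral with Γ(x)Γ(y)/Γ(x + y) for a few sample exponents.

```python
import math

def beta_quadrature(x, y, n=200000):
    """Midpoint rule for ∫_0^1 t^(y-1) (1-t)^(x-1) dt, valid for x, y > 0."""
    h = 1.0 / n
    return h * sum(((k + 0.5) * h) ** (y - 1.0) * (1.0 - (k + 0.5) * h) ** (x - 1.0)
                   for k in range(n))

for x, y in [(2.0, 3.0), (1.5, 2.5), (1.7, 4.2)]:
    gamma_side = math.gamma(x) * math.gamma(y) / math.gamma(x + y)
    print(x, y, round(beta_quadrature(x, y), 6), round(gamma_side, 6))
```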



As a remark to Problem 10.11, Rudin mentioned that Theorem 10.9 has to be extended to cover the case of improper integrals. The proof we present here does not use this approach. If the reader wants to apply such extension to prove the problem, we recommend the following version of Theorem 10.9 which can be found in [25, Theorem 1, p. 156]: Lemma 10.10 Suppose that T : U → V is a diffeomorphism of the open set U ⊆ Rn onto the open set V ⊆ Rn and f : U → R is integrable on all measurable compact subsets of U . If the improper integral Z f (y) dy

V

converges, then the improper integral Z f (T (x))|JT (x)| dx U

converges and has the same value.

Problem 10.12 Rudin Chapter 10 Exercise 12.

Proof. See Figure 10.8 for the example T : I 3 → Q3 .

Figure 10.8: The mapping T : I 3 → Q3 . We divide the proof into several steps:

281

10.3. Applications of Theorem 10.9 (Change of Variables Theorem)

• Proof of the formula. The formula can be shown by induction. The case k = 1 is trivial. Assume that the formula is true for k = n, i.e., n X i=1

xi = 1 −

n Y

i=1

(1 − ui ).

(10.64)

If k = n + 1, then we follow from the formula (10.64) and the definition that n+1 X

n X

xi = xn+1 +

xi

i=1

i=1

= xn+1 + 1 −

n Y

(1 − ui )

i=1

= 1 + (1 − u1 ) · · · (1 − un )un+1 − = 1 + un+1

n Y

i=1

=1+ =1−

n Y

n Y (1 − ui )

i=1

n Y (1 − ui ) (1 − ui ) − i=1

(1 − ui )(un+1 − 1)

i=1 n+1 Y i=1

(1 − ui ).

Thus the formula is also true in the case k = n+ 1. Hence we follow from induction that the formula is true for all positive integers k. • The surjectivity of the map T : I k → Qk . Since 0 ≤ ui ≤ 1 for all i = 1, 2, . . . , k, we have xi ≥ 0 for all i = 1, 2, . . . , k. By the formula (10.64), we have k X i=1

xi ≤ 1,

so T maps I k into Qk . Given (x1 , . . . , xk ) ∈ Qk . If points

m X i=1

xi 6= 1 for all m = 1, 2, . . . , k − 1, then the

u1 = x1 , x2 u2 = , 1 − x1 x3 x3 u3 = , = (1 − x1 )(1 − x2 ) 1 − x1 − x2 ·················· , xk xk uk = = (1 − x1 ) · · · (1 − xk−1 ) 1 − x1 − · · · − xk−1 imply that T (u1 , . . . , uk ) = (x1 , . . . , xk ). If m is the least positive integer in the set {1, 2, . . . , k − 1} such that m X

xi = 1,

i=1

then we have

m−1 X i=1

xi < 1 and xm+1 = xm+2 = · · · = xk = 0.

(10.65)

Chapter 10. Integration of Differential Forms

282

By these and the formula (10.64), the point (u1 , . . . , uk ) defined by u1 = x1 ,

u2 =

m−1  X  x2 xi , , . . . , um = xm ÷ 1 − 1 − x1 i=1

ui = xi ,

where i = m + 1, . . . , k, implies that T (u1 , . . . , um , um+1 , . . . , uk ) = (x1 , x2 , . . . , xm , 0, . . . , 0). If we have k X

xi = 1 and

i=1

k−1 X

xi < 1,

i=1

then the point (u1 , . . . , uk ) defined by  X  x2 u1 = x1 , u2 = xi , . . . , uk = xk ÷ 1 − 1 − x1 i=1 k−1

implies that T (u1 , . . . , uk ) = (x1 , x2 , . . . , xk ). Thus it means that the mapping T is onto. • The injectivity of the mapping T : I → Q. Let I and Q be the interiors of I k and Qk respectively, i.e., I = {(u1 , . . . , uk ) ∈ Rk | 0 < u1 , . . . , uk < 1} (10.66) and k o n X xi < 1 . Q = (x1 , . . . , xk ) ∈ Rk x1 , . . . , xk > 0,

(10.67)

i=1

We first show that the mapping T is one-to-one in I. To this end, let u = (u1 , . . . , uk ) and v = (v1 , . . . , vk ) be points in I such that T (u) = x = (x1 , . . . , xk ) and T (v) = y = (y1 , . . . , yk ). If T (u) = T (v), then we have x1 = y1 , x2 = y2 , . . . , xk = yk . Note that 0 < ui < 1 and 0 < vi < 1 for 1 ≤ i ≤ k. By definition, the relation x1 = y1 implies trivially that u1 = v1 . Similarly, the inequality 0 < u1 = v1 < 1 and the relation x2 = y2 show that u2 = v2 . By using the inequalities 0 < ui = vi < and the relations xi = yi for 1 ≤ i ≤ k − 1, it deduces immediately from the definition that xk = yk . Hence T is one-to-one in I. Next, we show that T (I) = Q. Let x ∈ Q so that x = (x1 , . . . , xk ), where x1 , . . . , xk > 0 and k X xk < 1. If we define u1 , . . . , uk by the equalities (10.65), then we obtain immediately that i=1

T (u1 , . . . , uk ) = (x1 , . . . , xk ). Thus T (I) = Q, as required. In addition, it is clearly that the inverse S : Q → I is well-defined by the formulas (10.65). • Computation of the JT (u) and JS (x). To find JT (u), we note that T (u1 , . . . , uk ) = (u1 , (1 − u1 )u2 , . . . , (1 − u1 ) · · · (1 − uk−1 )uk ).

(10.68)

283

10.3. Applications of Theorem 10.9 (Change of Variables Theorem) Apply Theorem 9.17 to (10.68), we get  1 0    −u2 1 − u1  .. ..   . . [T ′ (u)] =    k−1  k−1 Y Y  (1 − ui )uk (1 − ui )uk −  − i=1 i6=2

i=2



0

···

0

0 .. .

··· .. .

0 .. .

         k−1  Y  ··· (1 − ui ) 

k−1 Y i=1 i6=3



(1 − ui )uk

i=1

which is a lower triangular matrix. It is well-known that the determinant of a triangular matrix is the product of the entries on the diagonal (see [15, Theorem 2, 167]), so we have JT (u) = det[T ′ (u)] = 1 × (1 − u1 ) × [(1 − u1 )(1 − u2 )] × · · · × = (1 − u1 )k−1 (1 − u2 )k−2 · · · (1 − uk−1 ),

k−1 Y i=1

(1 − ui ) (10.69)

where u ∈ I. To find JS (x), we notice that S(x1 , . . . , xk ) =

! x2 xk x1 , . ,..., 1 − x1 1 − x1 − · · · − xk−1

Similarly, we apply Theorem 9.17 to the expression (10.70), we get  1 0 0    1 x2  0 2  (1 − x ) 1 − x1 1  .. .. ..  . . . [S ′ (x)] =     xk xk xk   k−1 k−1 k−1   2  2  2 X X X  xi xi xi 1− 1− 1− i=1

i=1

i=1

(10.70)

···

0

···

0

..

.. .

.

···



1 

1−

k−1 X i=1

and therefore,

xi

2

              

JS (x) = det[S ′ (x)] 1 1 × ···× =1× 1 − x1 1 − x1 − · · · − xk−1 = [(1 − x1 )(1 − x1 − x2 ) · · · (1 − x1 − x2 · · · − xk−1 )]−1 , where x ∈ Q. We end the proof of the problem.



Problem 10.13 Rudin Chapter 10 Exercise 13.

Proof. Let T : I k → Qk be the mapping as defined in Problem 10.12. Then I is an open set, T : I → Q is bijective and we deduce from the definition (10.66) and the formula (10.69) that JT (u) 6= 0

Chapter 10. Integration of Differential Forms

284

for all u ∈ I. Define the function f : Q → R by f (x) = xr11 xr22 · · · xrkk which is clearly continuous on Q. By definition (10.67), we know that supp (f ) = {x ∈ Q | f (x) 6= 0} = Q = Qk which is compact by Theorem 2.41 (Heine-Borel Theorem). Hence, it follows from Theorem 10.9 and the formula (10.69) that Z Z xr11 xr22 · · · xrkk dx = f (T (u))|JT (u)| du Q I Z = f (T (u))|(1 − u1 )k−1 (1 − u2 )k−2 · · · (1 − uk−1 )| du1 du2 · · · duk .

(10.71)

I

By the definition of x, the integral (10.71) reduces to Z Z xr11 xr22 · · · xrkk dx = f (T (u))|(1 − u1 )k−1 (1 − u2 )k−2 · · · (1 − uk−1 )| du1 du2 · · · duk Q I Z = ur11 [(1 − u1 )u2 ]r2 [(1 − u1 )(1 − u2 )u3 ]r3 · · · [(1 − u1 ) · · · (1 − uk−1 )uk ]rk I

× |(1 − u1 )k−1 (1 − u2 )k−2 · · · (1 − uk−1 )| du1 du2 · · · duk Z o n = ur11 · · · urkk (1 − u1 )r2 +···+rk (1 − u2 )r3 +···+rk · · · (1 − uk−1 )rk I

(1 − u1 )k−1 (1 − u2 )k−2 · · · (1 − uk−1 ) du1 du2 · · · duk "Z # "Z 1

=

0

ur11 (1

× ··· ×

"Z

1

k+r2 +···+rk −1

− u1 )

0

1

rk−1 uk−1 (1

du1 ×

1+rk

− uk−1 )

0

#

duk−1 ×

ur22 (1 Z

k+r3 ···+rk −2

− u2 )

1

0

urkk

!

duk .

du2

# (10.72)

Finally, we apply Theorem 8.20 to each integral on the right-hand side of (10.72) and then Theorem 8.18(b), cancellations happen and it further reduces to Z Γ(r1 + 1)Γ(k + r2 + · · · + rk ) Γ(r2 + 1)Γ(k − 1 + r3 + · · · + rk ) × xr11 xr22 · · · xrkk dx = Γ(k + r1 + · · · + rk + 1) Γ(k + r2 + · · · + rk ) Q Γ(rk−1 + 1)Γ(2 + rk ) Γ(rk + 1)Γ(1) × Γ(3 + rk−1 + rk ) Γ(2 + rk ) Γ(r1 + 1)Γ(r2 + 1) · · · Γ(rk + 1) = Γ(k + r1 + · · · + rk + 1) r1 !r2 ! · · · rk ! . = (k + r1 + · · · + rk ) × ···×

(10.73)

In particular, if we take r1 = · · · = rk = 0 into the integral (10.73), then we get Z 1 dx = k! Qk which gives the volume of the k-simplex Qk . This completes the proof of the problem.

10.4

Properties of k-forms and k-simplexes

Problem 10.14 Rudin Chapter 10 Exercise 14.



285

10.4. Properties of k-forms and k-simplexes

Proof. Recall from Definition 9.33 that if {j1 , . . . , jk } is an ordered k-tuple of distinct integers, then we have k q−1 Y Y sgn (jq − jp ). (10.74) s(j1 , . . . , jk ) = q=2 p=1

Let k = 2. Then we have s(j1 , j2 ) = sgn (j2 − j1 ) =



1, if j2 > j1 ; −1, if j2 < j1 .

If j2 > j1 , then dxj1 ∧ dxj2 is an increasing 2-index so that ε(j1 , j2 ) = 1. Similarly, if j2 < j1 , then dxj1 ∧ dxj2 = − dxj2 ∧ dxj1 so that ε(j1 , j2 ) = −1 by [21, Eqn. (42), p. 256]. Therefore, we have ε(j1 , j2 ) = s(j1 , j2 ). Assume that we have ε(j1 , . . . , jm ) = s(j1 , . . . , jm ),

(10.75)

where 2 ≤ m < k. Let k = m + 1. By definition (10.74) and the equality (10.75), we have s(j1 , . . . , jm , jm+1 ) =

m+1 Y Y q−1

q=2 p=1

=

m Y

p=1

sgn (jq − jp )

sgn (jm+1 − jp ) ×

= ε(j1 , . . . , jm ) ×

m Y

p=1

m q−1 Y Y

q=2 p=1

sgn (jq − jp )

sgn (jm+1 − jp ).

(10.76)

To make the (m + 1)-tuple {j1 , . . . , jm , jm+1 } of distinct integers into an increasing (m + 1)-index, we can first make the m-tuple {j1 , . . . , jm } into an increasing m-index, namely {jr1 , . . . , jrm } where jr1 < · · · < jrm , and then add the integer jm+1 to that increasing m-index to produce the increasing (m + 1)-index. Suppose that jr1 < · · · < jrs < jm+1 < jrs+1 < · · · < jrm . (10.77) Then we have ε(j1 , . . . , jm , jm+1 ) = ε(j1 , . . . , jm ) × (−1)m−s−1 .

(10.78)

By the inequalities (10.77), we have m Y

p=1

sgn (jm+1 − jp ) =

m Y

p=1

sgn (jm+1 − jrp ) = (−1)m−s−1 .

(10.79)

Hecnce we follow from the equalities (10.76), (10.78) and (10.79) that s(j1 , . . . , jm , jm+1 ) = ε(j1 , . . . , jm , jm+1 ) which implies that the statement is true for k = m + 1. By induction, formula (46) is true for all integers  k. This completes the proof of the problem. Problem 10.15 Rudin Chapter 10 Exercise 15.

Chapter 10. Integration of Differential Forms

286

Proof. Suppose that ω and λ are represented in the standard presentation: X X ω= aI (x) dxI and λ = bJ (x) dxJ , I

(10.80)

J

where the summations in (10.80) extend over all increasing k-indices I and m-indices J respectively. By Definition 10.17, we have the (k + m)-form in an open set E ⊆ Rn X X ω∧λ= aI (x)bJ (x) dxI ∧ dxJ and λ ∧ ω = bJ (x)aI (x) dxJ ∧ dxI , I,J

I,J

where I and J range independently over their aI (x)bJ (x) = bJ (x)aI (x) for every x ∈ E. Therefore, our result follows immediately if we can show that dxI ∧ dxJ = (−1)km dxJ ∧ dxI

(10.81)

for each increasing k-indices I and increasing m-indices J. Suppose that I and J have an element in common, then we know from [21, Eqn. (43), p. 256] that dxI ∧ dxJ = dxJ ∧ dxI = 0 so that the formula (10.81) holds. Next, we suppose that {i1 , . . . , ik } and {j1 , . . . , jm } are increasing k-indices and m-indices respectively with no element in common. By repeated application of the anticommutative relation [21, Eqn. (42), p. 256], we have dxI ∧ dxJ = ( dxi1 ∧ · · · ∧ dxik ) ∧ ( dxj1 ∧ · · · ∧ dxjm ) = ( dxi1 ∧ · · · ∧ dxik−1 ∧ (−1) dxj1 ) ∧ ( dxik ∧ dxj2 ∧ · · · ∧ dxjm )

= ( dxi1 ∧ · · · ∧ dxik−2 ∧ (−1)2 dxj1 ∧ dxik−1 ) ∧ ( dxik ∧ dxj2 ∧ · · · ∧ dxjm ) = ((−1)k dxj1 ∧ dxi1 ∧ · · · ∧ dxik−1 ) ∧ ( dxik ∧ dxj2 ∧ · · · ∧ dxjm )

= ((−1)2k dxj1 ∧ dxj2 ∧ dxi1 ∧ · · · ∧ dxik−2 ) ∧ ( dxik−1 ∧ dxik ∧ dxj3 ∧ · · · ∧ dxjm ) .. .

= (−1)km ( dxj1 ∧ dxj2 ∧ · · · ∧ dxjm ) ∧ ( dxi1 ∧ dxi2 ∧ · · · ∧ dxik )

= (−1)km dxJ ∧ dxI

which is exactly the equality (10.81). This completes the proof of the problem. Problem 10.16 Rudin Chapter 10 Exercise 16.

Proof. Let k = 2. Then we get from Definition 10.29 that ∂σ = [p1 , p2 ] − [p0 , p2 ] + [p0 , p1 ] which gives ∂ 2 σ = ∂[p1 , p2 ] − ∂[p0 , p2 ] + ∂[p0 , p1 ] = [p2 ] − [p1 ] − ([p2 ] − [p0 ]) + [p1 ] − [p0 ] = 0. Let k = 3. Similarly, we obtain from Definition 10.29 that ∂σ = [p1 , p2 , p3 ] − [p0 , p2 , p3 ] + [p0 , p1 , p3 ] − [p0 , p1 , p2 ]



287

10.4. Properties of k-forms and k-simplexes

which implies that ∂ 2 σ = ∂[p1 , p2 , p3 ] − ∂[p0 , p2 , p3 ] + ∂[p0 , p1 , p3 ] − [p0 , p1 , p2 ]

= [p2 , p3 ] − [p1 , p3 ] + [p1 , p2 ] − ([p2 , p3 ] − [p0 , p3 ] + [p0 , p2 ]) + [p1 , p3 ] − [p0 , p3 ] + [p0 , p1 ] − ([p1 , p2 ] − [p0 , p2 ]) + [p0 , p1 ])

= 0.

For the general case, let σi and σij be the (k − 1)-simplex and (k − 2)-simplex obtained by deleting pi and pi plus pj from σ respectively, where i < j. That is σi = [p0 , . . . , pi−1 , pi+1 , . . . , pk ] and σij = [p0 , . . . , pi−1 , pi+1 , . . . , pj−1 , pj+1 . . . , pk ] where i, j = 0, . . . , k and i < j. Now each σij occurs exactly twice in ∂ 2 σ, one from deleting the pj first and then the pi next, and the other one from deleting the pi first and then the pj next. We claim that the resulting (k − 2)-simplex have opposite sign. To this end, we notice that the positions of pi and pj in the oriented affine k-simplex σ first: σ = [ p0 , . . . , pi−1 , pi , pi+1 , . . . , pj−1 , pj , pj+1 , . . . , pk ]. | {z } i terms before pi

|

{z

j terms before pj

(10.82)

}

• Case (i): Delete the pj from the expression (10.82). We have σj = [ p0 , . . . , pi−1 , pi , pi+1 , . . . , pj−1 , pj+1 , . . . , pk ] | {z }

(10.83)

i terms before pi

and this contributes a factor (−1)j . We observe from the expression (10.83) that there are i terms before the pi , so when we delete the pi from the expression (10.83), we have σij = [p0 , . . . , pi−1 , pi+1 , . . . , pj−1 , pj+1 , . . . , pk ] and this contributes another factor (−1)i . Thus we obtain (−1)i+j σij in this way. • Case (ii): Delete the pi from (10.82). We get σi = [p0 , . . . , pi−1 , pi+1 , . . . , pj−1 , pj , pj+1 , . . . , pk ] | {z }

(10.84)

(j − 1) terms before pj

and this contributes a factor (−1)i . Now we notice from the form (10.84) that there are (j − 1) terms before the pj , so if we delete the pj from the expression (10.84), then we have σij = [p0 , . . . , pi−1 , pi+1 , . . . , pj−1 , pj+1 , . . . , pk ] and this contributes another factor (−1)j−1 . Thus we obtain (−1)i+j−1 σij in this way. This proves our claim and then completes the proof of our problem. Problem 10.17 Rudin Chapter 10 Exercise 17.



Chapter 10. Integration of Differential Forms

288

As a remark to Definition 10.28, we recall from Rudin’s explanation [21, p. 268] that the notation “+” used in the chain J 2 = τ1 + τ2 does not mean the addition of mappings. In fact, if we denote Ωk (E) to be the collection of all k-forms in E and Φ : D ⊂ Rk → E ⊆ Rn is a k-surface in E, then we define Φ : Ωk (E) → R to bem Z Φ(ω) =

ω.

Φ

Hence the “+” in the expression J 2 = τ1 + τ2 means that 2

J (ω) = τ1 (ω) + τ2 (ω) =

Z

ω+

τ1

Z

ω

(10.85)

τ2

for every 2-form ω. To avoid any ambiguity, we write τ1 +c τ2 to replace the original affine chain τ1 + τ2 with the meaning shown in the integrals (10.85). Proof. We first find the explicit representations of τ1 and τ2 . Note that τ1 is characterized by τ1 (0) = 0,

τ1 (e1 ) = e1

and τ1 (e2 ) = e1 + e2 .

By [21, Eqn. (80), p. 267], we know that τ2 = [0, e2 + e1 , e2 ] and then it is characterized by τ2 (0) = 0,

τ2 (e1 ) = e1 + e2

and τ2 (e2 ) = e2 .

By [21, Eqn. (78), p. 266], we acquire the mappings τ1 : Q2 → R2 and τ2 : Q2 → R2 by

for all u ∈ Q2 , where A=

τ1 (u) = Au

and τ2 (u) = Bu



and B =

1 0

1 1





1 1

0 1

(10.86) 

.

(10.87)

Next, we find τ1 (Q2 ) and τ2 (Q2 ). By Definition 10.26, Q2 = {ae1 + be2 | a, b ≥ 0, a + b ≤ 1}. Therefore, by this and the mappings (10.86) with the matrices (10.87), we deduce that      1 1 a a+b τ1 (ae1 + be2 ) = = = (a + b)e1 + be2 0 1 b b

(10.88)

which imply that τ1 (Q2 ) = {(a + b)e1 + be2 | a, b ≥ 0, a + b ≤ 1}

(10.89)

and it is the “lower right” half of the unit square I 2 , see Figure 10.9:

Figure 10.9: The mapping τ1 : Q2 → I 2 . m We use different colors for Φ so as to make clear that it has “different” meanings in “different” situations: For Φ, it means a k-surface and for Φ, it means a function.

289

10.4. Properties of k-forms and k-simplexes

Similarly, we have τ2 (ae1 + be2 ) =



1 0 1 1



a b



=



a a+b



= ae1 + (a + b)e2

(10.90)

which imply that τ2 (Q2 ) = {ae1 + (a + b)e2 | a, b ≥ 0, a + b ≤ 1}

(10.91)

and it is the “upper left” half of the unit square I 2 , see Figure 10.10:

Figure 10.10: The mapping τ2 : Q2 → I 2 . Thus we follow from the ranges (10.89) and (10.91) that I 2 = τ1 (Q2 ) ∪ τ2 (Q2 ) and the interiors of τ1 (Q2 ) and τ2 (Q2 ) are disjoint. By the matrices (10.87), since det A = det B = 1 > 0, τ1 and τ2 are obviously one-to-one mappings of class C ′′ . Furthermore, we know from Definition 10.31 that J 2 has the positively oriented boundary. Hence it is reasonable to say that J 2 the positively oriented square in R2 . See Figure 10.11 for the orientation of J 2 , where the red and green arrows connecting the points (0, 0) and (1, 1) have opposite orientation so that they cancel each other.

Figure 10.11: The mapping τ2 : Q2 → I 2 . Finally, we compute ∂J 2 and ∂(τ1 − τ2 ). By routine computation and [21, Eqn. (80), p. 267], we have ∂J 2 = ∂τ1 + ∂τ2 = [e1 , e1 + e2 ] − [0, e1 + e2 ] + [0, e1 ] − ([e2 , e1 + e2 ] − [0, e1 + e2 ] + [0, e2 ]) = [e1 , e1 + e2 ] + [0, e1 ] − [e2 , e1 + e2 ] − [0, e2 ] = [0, e1 ] + [e1 , e1 + e2 ] + [e1 + e2 , e2 ] + [e2 , 0]

Chapter 10. Integration of Differential Forms

290

which is the sum of 4 oriented affine 1-simplexes.n Similarly, we have ∂(τ1 − τ2 ) = ∂τ1 − ∂τ2 = [e1 , e1 + e2 ] − [0, e1 + e2 ] + [0, e1 ]

+ ([e2 , e1 + e2 ] − [0, e1 + e2 ] + [0, e2 ]) = [e1 , e1 + e2 ] + [0, e1 ] + [e2 , e1 + e2 ] + [0, e2 ] − 2[0, e1 + e2 ].

Therefore, we have ∂J 2 6= ∂(τ1 − τ2 )

and this completes the proof of the problem.



Problem 10.18 Rudin Chapter 10 Exercise 18.

Proof. Recall from Definition 10.26 that Q3 = {ae1 + be2 + ce3 | a, b, c ≥ 0, a + b + c ≤ 1} and σ1 is characterized by σ1 (0) = 0,

σ1 (e1 ) = e1 ,

σ1 (e2 ) = e1 + e2

and σ1 (e3 ) = e1 + e2 + e3 .

By [21, Eqn. (78), p. 266], we have the mapping σ1 : Q3 → R3 defined by σ1 (u) = A1 u for all u ∈ Q3 , where



 1 1 1 1 . 0 1

1 A1 =  0 0

(10.92)

Since det A1 = 1 > 0, σ1 is positively oriented. The five permutations of (1, 2, 3) other than (1, 2, 3) are (1, 3, 2),

(2, 1, 3),

(2, 3, 1),

(3, 1, 2) and (3, 2, 1).

By Problem 10.14, we have s(1, 3, 2) = −1,

s(2, 1, 3) = −1,

s(2, 3, 1) = 1,

s(3, 1, 2) = 1

and s(3, 2, 1) = −1.

Therefore, we have σ2 = −[0, e1 , e1 + e3 , e1 + e2 + e3 ] = [0, e1 , e1 + e2 + e3 , e1 + e3 ],

σ3 = −[0, e2 , e1 + e2 , e1 + e2 + e3 ] = [0, e2 , e1 + e2 + e3 , e1 + e2 ], σ4 = [0, e2 , e2 + e3 , e1 + e2 + e3 ], σ5 = [0, e3 , e1 + e3 , e1 + e2 + e3 ], σ6 = −[0, e3 , e2 + e3 , e1 + e2 + e3 ] = [0, e3 , e1 + e2 + e3 , e2 + e3 ] so that

A2 = \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix}, A3 = \begin{pmatrix} 0 & 1 & 1 \\ 1 & 1 & 1 \\ 0 & 1 & 0 \end{pmatrix}, A4 = \begin{pmatrix} 0 & 0 & 1 \\ 1 & 1 & 1 \\ 0 & 1 & 1 \end{pmatrix}, A5 = \begin{pmatrix} 0 & 1 & 1 \\ 0 & 0 & 1 \\ 1 & 1 & 1 \end{pmatrix} and A6 = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 1 & 1 \\ 1 & 1 & 1 \end{pmatrix}.



Since det A2 = det A3 = det A4 = det A5 = det A6 = 1 > 0, σ2 , . . . , σ6 are positively oriented by Definition 10.26. Put J 3 = σ1 + · · · + σ6 . By Definition 10.29, we have

∂J 3 = \sum_{m=1}^{6} ∂σm

= [e1 , e1 + e2 , e1 + e2 + e3 ] − [0, e1 + e2 , e1 + e2 + e3 ] + [0, e1 , e1 + e2 + e3 ] − [0, e1 , e1 + e2 ]



− [e1 , e1 + e3 , e1 + e2 + e3 ] − [0, e1 + e3 , e1 + e2 + e3 ] + [0, e1 , e1 + e2 + e3 ] − [0, e1 , e1 + e3 ]



+ [e2 , e2 + e3 , e1 + e2 + e3 ] − [0, e2 + e3 , e1 + e2 + e3 ] + [0, e2 , e1 + e2 + e3 ] − [0, e2 , e2 + e3 ]



− [e3 , e2 + e3 , e1 + e2 + e3 ] − [0, e2 + e3 , e1 + e2 + e3 ] + [0, e3 , e1 + e2 + e3 ] − [0, e3 , e2 + e3 ] = [e1 , e1 + e2 , e1 + e2 + e3 ] − [0, e1 , e1 + e2 ] − [e1 , e1 + e3 , e1 + e2 + e3 ] + [0, e1 , e1 + e3 ]



− [e2 , e1 + e2 , e1 + e2 + e3 ] − [0, e1 + e2 , e1 + e2 + e3 ] + [0, e2 , e1 + e2 + e3 ] − [0, e2 , e1 + e2 ] + [e3 , e1 + e3 , e1 + e2 + e3 ] − [0, e1 + e3 , e1 + e2 + e3 ] + [0, e3 , e1 + e2 + e3 ] − [0, e3 , e1 + e3 ]

− [e2 , e1 + e2 , e1 + e2 + e3 ] + [0, e2 , e1 + e2 ] + [e2 , e2 + e3 , e1 + e2 + e3 ] − [0, e2 , e2 + e3 ] + [e3 , e1 + e3 , e1 + e2 + e3 ] − [0, e3 , e1 + e3 ] − [e3 , e2 + e3 , e1 + e2 + e3 ] + [0, e3 , e2 + e3 ]

 

(10.93)

which consists of exactly 12 oriented affine 2-simplexes. By the matrix representation (10.92), A1 u = x if and only if

\begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} a \\ b \\ c \end{pmatrix} = \begin{pmatrix} x1 \\ x2 \\ x3 \end{pmatrix}

if and only if

a + b + c = x1 , b + c = x2 and c = x3 .

Thus x ∈ σ1 (Q3 ) if and only if 0 ≤ x3 ≤ x2 ≤ x1 ≤ 1. Similarly, we have

A2 u = x if and only if 0 ≤ x2 ≤ x3 ≤ x1 ≤ 1,
A3 u = x if and only if 0 ≤ x3 ≤ x1 ≤ x2 ≤ 1,

A4 u = x if and only if 0 ≤ x1 ≤ x3 ≤ x2 ≤ 1, A5 u = x if and only if 0 ≤ x2 ≤ x1 ≤ x3 ≤ 1,

A6 u = x if and only if 0 ≤ x1 ≤ x2 ≤ x3 ≤ 1.

These mean that x ∈ σ2 (Q3 ) if and only if 0 ≤ x2 ≤ x3 ≤ x1 ≤ 1,

x ∈ σ3 (Q3 ) if and only if 0 ≤ x3 ≤ x1 ≤ x2 ≤ 1,

x ∈ σ4 (Q3 ) if and only if 0 ≤ x1 ≤ x3 ≤ x2 ≤ 1,

x ∈ σ5 (Q3 ) if and only if 0 ≤ x2 ≤ x1 ≤ x3 ≤ 1,

x ∈ σ6 (Q3 ) if and only if 0 ≤ x1 ≤ x2 ≤ x3 ≤ 1.

In other words, we have σ1 (Q3 ) = {(x1 , x2 , x3 ) | 0 ≤ x3 ≤ x2 ≤ x1 ≤ 1},

σ2 (Q3 ) = {(x1 , x2 , x3 ) | 0 ≤ x2 ≤ x3 ≤ x1 ≤ 1},

σ3 (Q3 ) = {(x1 , x2 , x3 ) | 0 ≤ x3 ≤ x1 ≤ x2 ≤ 1},

σ4 (Q3 ) = {(x1 , x2 , x3 ) | 0 ≤ x1 ≤ x3 ≤ x2 ≤ 1},

σ5 (Q3 ) = {(x1 , x2 , x3 ) | 0 ≤ x2 ≤ x1 ≤ x3 ≤ 1},

σ6 (Q3 ) = {(x1 , x2 , x3 ) | 0 ≤ x1 ≤ x2 ≤ x3 ≤ 1}.

(10.94)



Hence it follows from the ranges (10.94) that they have disjoint interiors and I 3 = σ1 (Q3 ) ∪ σ2 (Q3 ) ∪ · · · ∪ σ6 (Q3 ). This completes the proof of the problem.
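The tiling of I 3 can also be checked numerically. The sketch below (my own illustration, using the matrices A1 , . . . , A6 as reconstructed above) confirms that each det Ai = 1 and that a random point of the open unit cube satisfies exactly one of the six chains of inequalities in (10.94).

import numpy as np

mats = {                      # A_i written by rows; keys are the orderings x_{p1} >= x_{p2} >= x_{p3}
    (1, 2, 3): [[1, 1, 1], [0, 1, 1], [0, 0, 1]],
    (1, 3, 2): [[1, 1, 1], [0, 1, 0], [0, 1, 1]],
    (2, 1, 3): [[0, 1, 1], [1, 1, 1], [0, 1, 0]],
    (2, 3, 1): [[0, 0, 1], [1, 1, 1], [0, 1, 1]],
    (3, 1, 2): [[0, 1, 1], [0, 0, 1], [1, 1, 1]],
    (3, 2, 1): [[0, 1, 0], [0, 1, 1], [1, 1, 1]],
}
assert all(np.isclose(np.linalg.det(M), 1) for M in mats.values())

rng = np.random.default_rng(1)
for x in rng.random((1000, 3)):
    hits = sum(x[p1 - 1] >= x[p2 - 1] >= x[p3 - 1] for (p1, p2, p3) in mats)
    assert hits == 1          # the point lies in exactly one region of (10.94)
print("det A_i = 1 for all i and the six regions (10.94) tile the unit cube")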



Problem 10.19 Rudin Chapter 10 Exercise 19.

Proof. By rewriting the expression (10.93), we get the following new expression for ∂J 3 :

∂J 3 = {[e1 , e1 + e2 , e1 + e2 + e3 ] − [e1 , e1 + e3 , e1 + e2 + e3 ] − [0, e2 , e2 + e3 ] + [0, e3 , e2 + e3 ]}
     + {[0, e1 , e1 + e3 ] − [0, e3 , e1 + e3 ] − [e2 , e1 + e2 , e1 + e2 + e3 ] + [e2 , e2 + e3 , e1 + e2 + e3 ]}
     + {[e3 , e1 + e3 , e1 + e2 + e3 ] − [e3 , e2 + e3 , e1 + e2 + e3 ] − [0, e1 , e1 + e2 ] + [0, e2 , e1 + e2 ]}.   (10.95)

For r = 0, 1 and i = 1, 2, 3, the Bri : R2 → R3 are C ′′ -mappings, so each βri is an affine 2-chain of class C ′′ . Since J 2 = τ1 + τ2 , we deduce from [21, Eqn. (88), p. 270] that

βri = Bri (τ1 + τ2 ) = Bri (τ1 ) + Bri (τ2 ).   (10.96)

Let b1ri = Bri (τ1 ) and b2ri = Bri (τ2 ), where r = 0, 1, i = 1, 2, 3. Then the expression (10.96) can be rewritten as βri = b1ri + b2ri . (10.97) Now we must find the explicit forms of the oriented affine 2-simplexes b1ri and b2ri in the definition (10.97) in order to compute each (−1)i (β0i − β1i ). To this end, we recall the representations (10.88) and (10.90) first: τ1 (ae1 + be2 ) = (a + b)e1 + be2 = ((a + b), b) and τ2 (ae1 + be2 ) = ae1 + (a + b)e2 = (a, a + b). • Computation of −(β01 − β11 ). For b101 and b201 , since b101 (ae1 + be2 ) = (0, a + b, b) and b201 (ae1 + be2 ) = (0, a, a + b), we have b101 (0) = 0,

b101 (e1 ) = e2

and b101 (e2 ) = e2 + e3 ,

b201 (0) = 0,

b201 (e1 ) = e2 + e3

and b201 (e2 ) = e3 .

By [21, Eqn. (77), p. 266], we have b101 = [0, e2 , e2 + e3 ] and b201 = [0, e2 + e3 , e3 ], so the definition (10.97) gives β01 = [0, e2 , e2 + e3 ] + [0, e2 + e3 , e3 ]. Similarly, since b111 (ae1 + be2 ) = (1, a + b, b) and b211 (ae1 + be2 ) = (1, a, a + b), we have b111 (0) = e1 ,

b111 (e1 ) = e1 + e2

and b111 (e2 ) = e1 + e2 + e3 ,

(10.98)


b211 (0) = e1 ,

b211 (e1 ) = e1 + e2 + e3

and b211 (e2 ) = e1 + e3 .

By [21, Eqn. (77), p. 266] again, we have b111 = [e1 , e1 + e2 , e1 + e2 + e3 ] and b211 = [e1 , e1 + e2 + e3 , e1 + e3 ], so the definition (10.97) gives

β11 = [e1 , e1 + e2 , e1 + e2 + e3 ] + [e1 , e1 + e2 + e3 , e1 + e3 ].

(10.99)

Therefore, we deduce from the two expressions (10.98) and (10.99) that (−1)1 (β01 − β11 ) = [e1 , e1 + e2 , e1 + e2 + e3 ] + [e1 , e1 + e2 + e3 , e1 + e3 ] − [0, e2 , e2 + e3 ] − [0, e2 + e3 , e3 ] = [e1 , e1 + e2 , e1 + e2 + e3 ] − [e1 , e1 + e3 , e1 + e2 + e3 ] −[0, e2 , e2 + e3 ] + [0, e3 , e2 + e3 ]

which is exactly the first brackets in the expression (10.95). • Computation of β02 − β12 . For b102 and b202 , since b102 (ae1 + be2 ) = (a + b, 0, b) and b202 (ae1 + be2 ) = (a, 0, a + b), we have b102 (0) = 0,

b102 (e1 ) = e1

and b102 (e2 ) = e1 + e3 ,

b202 (0) = 0,

b202 (e1 ) = e1 + e3

and b202 (e2 ) = e3 .

Since b102 = [0, e1 , e1 + e3 ] and b202 = [0, e1 + e3 , e3 ], we have β02 = [0, e1 , e1 + e3 ] + [0, e1 + e3 , e3 ].

(10.100)

Now we have b112 (ae1 + be2 ) = (a + b, 1, b) and b212 (ae1 + be2 ) = (a, 1, a + b) which imply that b112 = [e2 , e1 + e2 , e1 + e2 + e3 ] and b212 = [e2 , e1 + e2 + e3 , e2 + e3 ]. Thus we have β12 = [e2 , e1 + e2 , e1 + e2 + e3 ] + [e2 , e1 + e2 + e3 , e2 + e3 ].

(10.101)

Combining the expressions (10.100) and (10.101), we obtain β02 − β12 = [0, e1 , e1 + e3 ] + [0, e1 + e3 , e3 ] − [e2 , e1 + e2 , e1 + e2 + e3 ] − [e2 , e1 + e2 + e3 , e2 + e3 ] = [0, e1 , e1 + e3 ] − [0, e3 , e1 + e3 ] − [e2 , e1 + e2 , e1 + e2 + e3 ] +[e2 , e2 + e3 , e1 + e2 + e3 ]

which is exactly the second brackets in the expression (10.95). • Computation of −(β03 − β13 ). The computation of (−1)3 (β03 − β13 ) can be done similarly as above. We have b103 (ae1 + be2 ) = (a + b, b, 0) and b203 (ae1 + be2 ) = (a, a + b, 0) which imply that b103 (0) = 0,

b103 (e1 ) = e1

and b103 (e2 ) = e1 + e2 ,

b203 (0) = 0,


b203 (e1 ) = e1 + e2

and b203 (e2 ) = e2 .

These mean that b103 = [0, e1 , e1 + e2 ] and b203 = [0, e1 + e2 , e2 ], so we have β03 = [0, e1 , e1 + e2 ] + [0, e1 + e2 , e2 ].

(10.102)

β13 = [e3 , e1 + e3 , e1 + e2 + e3 ] + [e3 , e1 + e2 + e3 , e2 + e3 ].

(10.103)

Similarly, we have

Therefore we follow from the expressions (10.102) and (10.103) that −(β03 − β13 ) = −[0, e1 , e1 + e2 ] − [0, e1 + e2 , e2 ] + [e3 , e1 + e3 , e1 + e2 + e3 ] + [e3 , e1 + e2 + e3 , e2 + e3 ] = [e3 , e1 + e3 , e1 + e2 + e3 ] − [e3 , e2 + e3 , e1 + e2 + e3 ] −[0, e1 , e1 + e2 ] + [0, e2 , e1 + e2 ]

which is the third brackets in the expression (10.95). Hence, by the above computations, we get our desired result that

∂J 3 = \sum_{i=1}^{3} (−1)^i (β0i − β1i ).

This ends the proof of the problem.

10.5 Problems on closed forms and exact forms

Problem 10.20 Rudin Chapter 10 Exercise 20.

Proof. Suppose that E is an open set in Rn , f ∈ C ′ (E), ω is a k-form of class C ′ in E and Φ is a (k + 1)-chain of class C ′′ in E. By Theorem 10.20(a), we have d(f ω) = ( df ) ∧ ω + (−1)0 f dω = ( df ) ∧ ω + f dω.

(10.104)

Applying Theorem 10.33 (Stokes' Theorem) to the left-hand side of the expression (10.104), we get

∫_{∂Φ} f ω = ∫_Φ d(f ω) = ∫_Φ ( df ) ∧ ω + ∫_Φ f dω,

which implies the desired equality

∫_Φ f dω = ∫_{∂Φ} f ω − ∫_Φ ( df ) ∧ ω.

Let n = 1 and k = 0 in the above consideration. Let, further, E be an open set in R containing the interval [a, b], where a and b are real numbers with a < b. Now we consider the oriented affine 1-simplex Φ : [0, 1] → R, where Φ(0) = a and Φ(1) = b. By Definition 10.29, we have ∂Φ = [b] − [a]



which is an oriented 0-simplex. If ω = g is a 0-form of class C ′ in E, then f g is also a 0-form of class C ′ in E. Thus, by the equation just preceding Theorem 10.27 (see [21, p. 267]), we acquire

∫_{∂Φ} f ω = ∫_{+b} f g + ∫_{−a} f g = f (b)g(b) − f (a)g(a).   (10.105)

On the other hand, it is clear that f dg and ( df )g are 1-forms by Definition 10.18. In addition, we obtain from [21, Eqn. (59)] that dg = g ′ (x) dx, where x = Φ(t); thus we see from [21, Eqn. (35), p. 254] that

∫_Φ f dg = ∫_Φ f (x)g ′ (x) dx   (here f (x)g ′ (x) plays the role of a(x) in [21, Eqn. (34), p. 254])
         = ∫_{[0,1]} f (Φ(t))g ′ (Φ(t))Φ′ (t) dt
         = ∫_0^1 f (Φ(t))g ′ (Φ(t))Φ′ (t) dt.

If Φ is supposed to be strictly increasing on [0, 1], then [21, Eqn. (39), p. 133] (with Φ(t) playing the role of ϕ(y) there) implies that

∫_0^1 f (Φ(t))g ′ (Φ(t))Φ′ (t) dt = ∫_a^b f (u)g ′ (u) du.   (10.106)

Similarly, we have df = f ′ (x) dx, where x = Φ(t). Then it follows from [21, Eqn. (35), p. 254] that

∫_Φ ( df ) ∧ ω = ∫_Φ g( df ) = ∫_Φ g(x)f ′ (x) dx = ∫_{[0,1]} g(Φ(t))f ′ (Φ(t))Φ′ (t) dt = ∫_0^1 g(Φ(t))f ′ (Φ(t))Φ′ (t) dt.

By [21, Eqn. (39), p. 133] again, we obtain

∫_0^1 g(Φ(t))f ′ (Φ(t))Φ′ (t) dt = ∫_a^b g(u)f ′ (u) du.   (10.107)

Combining the equalities (10.105) to (10.107), we establish that

∫_a^b f (u)g ′ (u) du = f (b)g(b) − f (a)g(a) − ∫_a^b g(u)f ′ (u) du

which is exactly Theorem 6.22 (Integration by Parts). This completes the proof of the problem.
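Since the chain of identities above reduces to the classical integration by parts formula, it is easy to spot-check symbolically. The following sketch is my own illustration; the particular f and g are arbitrary sample choices.

import sympy as sp

u = sp.symbols('u')
f = sp.sin(u) + u**2          # sample f of class C'
g = sp.exp(-u)                # sample g of class C'
a, b = 0, 1
lhs = sp.integrate(f * sp.diff(g, u), (u, a, b))
rhs = f.subs(u, b) * g.subs(u, b) - f.subs(u, a) * g.subs(u, a) - sp.integrate(g * sp.diff(f, u), (u, a, b))
assert sp.simplify(lhs - rhs) == 0
print("integration by parts holds for the sample f and g")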



Problem 10.21 Rudin Chapter 10 Exercise 21.

Proof. (a) Since x = r cos t and y = r sin t, the direct computation of the formula [21, Eqn. (113), p. 277] is given as follows:

∫_γ η = ∫_0^{2π} (r^2 cos^2 t + r^2 sin^2 t)/r^2 dt = ∫_0^{2π} dt = 2π.

However, it follows from Theorem 10.20 and [21, Eqn. (59), p. 260] that

dη = d( x/(x^2 + y^2) ) ∧ dy + ( x/(x^2 + y^2) ) d^2 y − d( y/(x^2 + y^2) ) ∧ dx − ( y/(x^2 + y^2) ) d^2 x
   = [ ((x^2 + y^2) − x(2x))/(x^2 + y^2)^2 dx + (−x(2y))/(x^2 + y^2)^2 dy ] ∧ dy
     − [ (−y(2x))/(x^2 + y^2)^2 dx + ((x^2 + y^2) − y(2y))/(x^2 + y^2)^2 dy ] ∧ dx.   (10.108)

By the anticommutative relation ( dy ∧ dx = − dx ∧ dy) and [21, Eqn. (43), p. 256], we deduce from the expression (10.108) that

dη = [ ((x^2 + y^2) − x(2x))/(x^2 + y^2)^2 + ((x^2 + y^2) − y(2y))/(x^2 + y^2)^2 ] dx ∧ dy
     + [ (−x(2y))/(x^2 + y^2)^2 dy ∧ dy − (−y(2x))/(x^2 + y^2)^2 dx ∧ dx ]
   = (1/(x^2 + y^2)^2) (−2xy dy ∧ dy + 2xy dx ∧ dx)
   = 0.
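The vanishing of dη can also be confirmed symbolically: the coefficient of dx ∧ dy in dη is ∂/∂x( x/(x^2 + y^2) ) + ∂/∂y( y/(x^2 + y^2) ), which simplifies to 0. A minimal sympy sketch (my own check) follows.

import sympy as sp

x, y = sp.symbols('x y')
r2 = x**2 + y**2
coeff = sp.diff(x / r2, x) + sp.diff(y / r2, y)   # coefficient of dx ^ dy in d(eta)
assert sp.simplify(coeff) == 0
print("d(eta) = 0 on R^2 minus the origin")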

(b) Let D = {(t, u) | 0 ≤ t ≤ 2π, 0 ≤ u ≤ 1} and Φ : D → R2 \ {0} be given as in the hint. Since γ : [0, 2π] → R2 \ {0} and Γ : [0, 2π] → R2 \ {0} are C ′′ -mappings, Φ is a 2-surface in R2 \ {0} by Definition 10.10. The geometric interpretation of the mapping Φ is given in Figure 10.12, where each arrow represents an interval [γ(t), Γ(t)] for some t ∈ [0, 2π] which does not contain the origin 0. (In fact, Φ is a homotopy between γ and Γ; see [18, p. 323].)

Figure 10.12: The mapping Φ : D → R2 \ {0}. By a similar analysis as in Example 10.32, we know that ∂Φ = Φ(∂D) = σ1 + σ2 + σ3 + σ4 ,

(10.109)

where σ1 (t) = Φ(t, 0) = Γ(t), σ2 (u) = Φ(2π, u) = (1 − u)Γ(2π) + uγ(2π), σ3 (t) = Φ(2π − t, 1) = γ(2π − t), σ4 (u) = Φ(0, 1 − u) = uΓ(0) + (1 − u)γ(0).

Since γ(0) = γ(2π) and Γ(0) = Γ(2π), we follow from [21, Eqns. (77) & (80), pp. 266–267] in Definition 10.26 that σ2 = [Γ(0), γ(0)] = −[γ(0), Γ(0)] = −σ4 .


Similarly, by direct application of [21, Eqn. (35), p. 254], we obtain

∫_{σ3} ω = − ∫_γ ω

for every 1-form ω. In other words, we get from the relation (10.109) that

∂Φ = Γ − γ.   (10.110)

By Theorem 10.33 (Stokes' Theorem) and part (a), we get

∫_{∂Φ} η = ∫_Φ dη = ∫_Φ 0 = 0.   (10.111)

Hence it follows from the expressions (10.110) and (10.111) that

∫_{σ1} η + ∫_{σ2} η + ∫_{σ3} η + ∫_{σ4} η = 0,
∫_Γ η − ∫_{σ4} η − ∫_γ η + ∫_{σ4} η = 0,
∫_Γ η = ∫_γ η.   (10.112)

Now, by the result of part (a), we have the desired result from (10.112):

∫_Γ η = 2π.

(c) Suppose that Γ(t) = (a cos t, b sin t), where a > 0, b > 0 are fixed. Now Γ is a C ′′ -curve in R2 \ {0} with parameter interval [0, 2π] and Γ(0) = Γ(2π). Let [γ(t), Γ(t)] be the interval joining the points γ(t) and Γ(t) for each t ∈ [0, 2π]. Since [γ(t), Γ(t)] does not contain the origin 0, we deduce from part (b) that

∫_Γ η = 2π,
∫_{[0,2π]} (x dy − y dx)/(x^2 + y^2) = 2π,
∫_0^{2π} (ab cos^2 t + ab sin^2 t)/(a^2 cos^2 t + b^2 sin^2 t) dt = 2π,
∫_0^{2π} ab/(a^2 cos^2 t + b^2 sin^2 t) dt = 2π.
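The last identity can be spot-checked numerically; the following sketch is my own illustration, with arbitrarily chosen sample values of a and b.

import numpy as np
from scipy.integrate import quad

for a, b in [(1.0, 2.0), (0.5, 3.0), (2.0, 2.0)]:
    integrand = lambda t, a=a, b=b: a * b / (a**2 * np.cos(t)**2 + b**2 * np.sin(t)**2)
    value, _ = quad(integrand, 0.0, 2.0 * np.pi)
    assert abs(value - 2.0 * np.pi) < 1e-8
print("the integral equals 2*pi for each sample pair (a, b)")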

(d) Recall from [21, Remark 10.35(a), p. 275] that a 1-form

ω = \sum_{i=1}^{n} fi (x) dxi

is exact in an open set E ⊆ Rn if and only if there is a function (0-form) g ∈ C ′ (E) such that

(Di g)(x) = fi (x)   (x ∈ E, 1 ≤ i ≤ n).   (10.113)

Consider n = 2 and E to be any convex open set in which x ≠ 0; in other words, E does not intersect the y-axis, so E lies either entirely in the left or entirely in the right half plane. By the definition of η, we have

f1 (x, y) = −y/(x^2 + y^2) and f2 (x, y) = x/(x^2 + y^2).

Let g : E → R be defined by

g(x, y) = arctan(y/x).

Since (d/dx)(arctan x) = 1/(1 + x^2), we can easily see that

(∂/∂x) arctan(y/x) = 1/(1 + y^2/x^2) · (−y/x^2) = −y/(x^2 + y^2)

and

(∂/∂y) arctan(y/x) = x/(x^2 + y^2).

Now the function g satisfies the conditions (10.113), so the definition gives

η = dg = d( arctan(y/x) )

in E. Similarly, we suppose that F is any convex open set in which y ≠ 0; now F does not intersect the x-axis, so F lies either entirely in the upper or entirely in the lower half plane. A direct computation shows that

(∂/∂x)( −arctan(x/y) ) = −y/(x^2 + y^2) and (∂/∂y)( −arctan(x/y) ) = x/(x^2 + y^2),

thus we have

d( −arctan(x/y) ) = (∂/∂x)( −arctan(x/y) ) dx + (∂/∂y)( −arctan(x/y) ) dy
                  = −y/(x^2 + y^2) dx + x/(x^2 + y^2) dy
                  = (x dy − y dx)/(x^2 + y^2)
                  = η.

By the result of Example 10.36, η is not exact in R2 \ {0}, but the above analysis shows that η is exact locally in R2 \ {0}, so it is reasonable to denote η = dθ for some (locally defined) 0-form θ.

(e) We write [0, 2π] = I1 ∪ I2 ∪ · · · ∪ I6 , where Ii = [(i − 1)π/3, iπ/3] and i = 1, 2, . . . , 6. Now

γ([0, π/3]), γ([2π/3, π]), γ([π, 4π/3]) and γ([5π/3, 2π])

lie in E, while

γ([π/3, 2π/3]) and γ([4π/3, 5π/3])

lie in F. Therefore, it follows from (d) that

∫_γ η = \sum_{i=1}^{6} ∫_{γ(Ii)} η
      = ( ∫_{γ(I1)} + ∫_{γ(I3)} + ∫_{γ(I4)} + ∫_{γ(I6)} ) d( arctan(y/x) ) + ( ∫_{γ(I2)} + ∫_{γ(I5)} ) d( −arctan(x/y) ).   (10.114)


Applying Theorem 10.33 (Stokes' Theorem) to the integrals on the right-hand side of the expression (10.114) and then using [21, Eqn. (62), p. 261], we obtain

∫_γ η = arctan(tan t) |_0^{π/3} + arctan(tan t) |_{2π/3}^{π} + arctan(tan t) |_{π}^{4π/3} + arctan(tan t) |_{5π/3}^{2π}
        − arctan(cot t) |_{π/3}^{2π/3} − arctan(cot t) |_{4π/3}^{5π/3}
      = (π/3) × 4 − arctan( tan(π/2 − t) ) |_{π/3}^{2π/3} − arctan( tan(π/2 − t) ) |_{4π/3}^{5π/3}
      = 4π/3 − (π/2 − t) |_{π/3}^{2π/3} − (π/2 − t) |_{4π/3}^{5π/3}
      = 4π/3 + 2π/3
      = 2π.

This means that part (d) implies part (b).

1 2πi

Z



0

Γ′1 + iΓ′2 dt Γ1 + iΓ2

Z 2π ′ 1 Γ1 + iΓ′2 Γ1 − iΓ2 = · dt 2πi 0 Γ1 + iΓ2 Γ1 − iΓ2 Z 2π Z 2π 1 Γ1 Γ′2 − Γ′1 Γ2 Γ1 Γ′1 + Γ2 Γ′2 1 dt + dt. = 2 2 2π 0 Γ1 + Γ2 2πi 0 Γ21 + Γ22 Since η =

x dy−y dx x2 +y 2 ,

(10.115)

Definition 10.11 implies that 1 2π

Z

Γ

η=

1 2π

Z

0



Γ1 Γ′2 − Γ2 Γ1 dt Γ21 + Γ22

(10.116)

which is exactly the real part of the complex number (10.115). Furthermore, since Γ(2π) = Γ(0), we have Z 2π Z 2π 2π Γ1 Γ′1 + Γ2 Γ′2 2 2 2 2 d(ln(Γ + Γ )) = ln[Γ (x) + Γ (x)] dt = (10.117) = 0. 1 2 1 2 Γ21 + Γ22 0 0 0

Hence we reach the desired result by comparing the expressions (10.115) to (10.117). This completes the proof of the problem.



Problem 10.22 Rudin Chapter 10 Exercise 22.

Proof. It should be noted that the equations of x, y and z expressed in terms of u and v are actually the spherical coordinates of the point (x, y, z) on the unit sphere. Let’s “see” the spherical coordinates for the point Σ(u, v) in Figure 10.13.



Figure 10.13: The spherical coordinates for the point Σ(u, v).

(a) By Definition 10.18, we see that y z x dζ = d 3 dy ∧ dz + d 3 dz ∧ dx + d 3 dx ∧ dy. r r r

(10.118)

Since x ∂ x ∂ x ∂ x d 3 = dx + dy + dz r ∂x r3 ∂y r3 ∂z r3 r2 − 3x2 3xy 3xz = dx − 5 dy − 5 dz, 6 r r r y ∂ y ∂ y ∂ y dx + dy + dz d 3 = r ∂x r3 ∂y r3 ∂z r3 3xy r2 − 3y 2 3yz = − 5 dx + dy − 5 dz, 6 r r r z ∂ z ∂ z ∂ z dx + dy + dz d 3 = r ∂x r3 ∂y r3 ∂z r3 3xz 3yz r2 − 3z 2 = − 5 dx − 5 dy + dz, r r r6 we apply these, the anticommutative relation and the facts dx ∧ dx = dy ∧ dy = dz ∧ dz = 0 to the expression (10.118) to obtain dζ =

3r2 − 3(x2 + y 2 + z 2 ) 3r2 − 3r2 · dx ∧ dy ∧ dz = · dx ∧ dy ∧ dz = 0 r6 r6

in R3 \ {0}. (b) We remark that the first place where the concept of the area of a 2-surface in R3 occurs is Sec. 10.46, not Sec. 10.43., so it is believed that this is a typo.
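For part (a), the claim dζ = 0 amounts to the vanishing of ∂/∂x(x/r^3) + ∂/∂y(y/r^3) + ∂/∂z(z/r^3) on R3 \ {0}, which can be confirmed symbolically; the short sketch below is my own check.

import sympy as sp

x, y, z = sp.symbols('x y z')
r = sp.sqrt(x**2 + y**2 + z**2)
div = sp.diff(x / r**3, x) + sp.diff(y / r**3, y) + sp.diff(z / r**3, z)
assert sp.simplify(div) == 0
print("d(zeta) = 0 on R^3 minus the origin")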


10.5. Problems on closed forms and exact forms Suppose that E ⊆ D is a compact set and S = ΣE : E → R3 . Then S is also a 2-surface in R3 , of class C ′′ . Notice that ∂(y, z) cos u sin v sin u cos v = = sin2 u cos v, 0 ∂(u, v) − sin u ∂(z, x) − sin u 0 = (10.119) = sin2 u sin v, ∂(u, v) cos u cos v − sin u sin v ∂(x, y) cos u cos v − sin u sin v = = sin u cos u. sin u cos v ∂(u, v) cos u sin v Therefore, it follows from the Jacobians (10.119) and Definition 10.11 that Z Z Z ∂(z, x) ∂(y, z) sin u sin v · du ∧ dv + du ∧ dv sin u cos v · ζ= ∂(u, v) ∂(u, v) S S S Z ∂(x, y) du ∧ dv cos u · + ∂(u, v) S Z Z Z ∂(y, z) ∂(z, x) ∂(x, y) = sin u cos v · du dv + sin u sin v · du dv + cos u · du dv ∂(u, v) ∂(u, v) ∂(u, v) E E E Z Z Z = sin3 u cos2 v du dv + sin3 u sin2 v du dv + sin u cos2 u du dv E E E Z sin u du dv. (10.120) = E

By Definition 10.46, since N(u, v) = (sin2 u cos v)e1 + (sin2 u sin v)e2 + (sin u cos u)e3 , we have A(S) =

Z

S

|N(u, v)| du dv =

Z

sin u du dv.

(10.121)

E

Hence our result follows immediately from the expressions (10.120) and (10.121). (c) By a bit algebra, we ∂(y, z) = ∂(t, s) ∂(z, x) = ∂(t, s) ∂(x, y) = ∂(t, s)

know that g ′ (t)h2 (s) g(t)h′2 (s) = g ′ (t)g(t)[h2 (s)h′3 (s) − h′2 (s)h3 (s)], ′ ′ g (t)h3 (s) g(t)h3 (s) g ′ (t)h3 (s) g(t)h′3 (s) = g ′ (t)g(t)[h3 (s)h′1 (s) − h1 (s)h′3 (s)], g ′ (t)h1 (s) g(t)h′1 (s) g ′ (t)h1 (s) g(t)h′1 (s) = g ′ (t)g(t)[h1 (s)h′2 (s) − h′1 (s)h2 (s)]. ′ ′ g (t)h2 (s) g(t)h2 (s)

Therefore, it follows from the Jacobians (10.122) and Definition 10.11 that Z Z n 1 g 2 (t)g ′ (t)h1 (s)[h2 (s)h′3 (s) − h′2 (s)h3 (s)] ζ= 3 × 2 2 2 3 Φ I 2 g (t)[h1 (s) + h2 (s) + h3 (s)] 2 + g 2 (t)g ′ (t)h2 (s)[h3 (s)h′1 (s) − h1 (s)h′3 (s)]

o + g 2 (t)g ′ (t)h3 (s)[h1 (s)h′2 (s) − h′1 (s)h2 (s)] dt ds Z n g ′ (t) h1 (s)[h2 (s)h′3 (s) − h′2 (s)h3 (s)] = 3 × 2 (s) + h2 (s) + h2 (s)] 2 2 g(t)[h I 1 2 3 o ′ + h2 (s)[h3 (s)h1 (s) − h1 (s)h′3 (s)] + h3 (s)[h1 (s)h′2 (s) − h′1 (s)h2 (s)] dt ds

(10.122)

Chapter 10. Integration of Differential Forms =

Z

g ′ (t)



′ ′ 3 × h1 (s)h2 (s)h3 (s) − h1 (s)h2 (s)h3 (s) g(t)[h21 (s) + h22 (s) + h23 (s)] 2  h′1 (s)h2 (s)h3 (s) − h1 (s)h2 (s)h′3 (s) + h1 (s)h′2 (s)h3 (s) − h′1 (s)h2 (s)h3 (s) dt ds

I2

+ = 0.


(d) We follow the given hint. Since E is a closed rectangle, we have E = [a, b] × [c, d] for some constants a, b, c and d with 0 < a < b < π and 0 < c < d < 2π, see Figure 10.14 below:

Figure 10.14: The rectangles D and E. Consider the 3-surface Ψ : [0, 1] × E → R3 \ {0} given by Ψ(t, u, v) = [1 − t + tf (u, v)]Σ(u, v), where (u, v) ∈ E, 0 ≤ t ≤ 1. For fixed v, define Eu = {u ∈ [0, π] | (u, v) ∈ E} = [a, b]

and the mapping Φ : [0, 1] × [a, b] → R3 \ {0} given by

Φ(t, u) = Ψ(t, u, v). Now [0, 1] × [a, b] is a 2-cell, thus it is compact by Theorem 2.40 and Φ is a 2-surface of class C ′′ with parameter domain [0, 1] × [a, b]. Since v is fixed, we have Φ(t, u) = (x, y, z), where x = g(t, u) cos v sin u,

y = g(t, u) sin v sin u,

z = g(t, u) cos u

and g(t, u) = [1 − t + tf (u, v)].

Since 0 ≤ t ≤ 1 and f (u, v) > 0 on D, g(t, u) > 0 on D.

By a similar argument as in part (c), instead of the Jacobians (10.122), we haver ∂(y, z) gt sin v sin u sin v(g cos u + gu sin u) = = −ggt sin v, gu cos u − g sin u gt cos u ∂(t, s) ∂(z, x) gt cos u gu cos u − g sin u = = ggt cos v, gt cos v sin u cos v(g cos u + gu sin u) ∂(t, s) ∂(x, y) gt cos v sin u cos v(g cos u + gu sin u) = = 0. gt sin v sin u sin v(g cos u + gu sin u) ∂(t, s)

r Here

we denote gt =

∂g(t,u) ∂t

and gu =

∂g(t,u) . ∂u

(10.123)


10.5. Problems on closed forms and exact forms Thus, by the Jacobians (10.123), we have Z Z Z ζ= g 2 gt sin v cos v sin u du dv − Φ

[0,1]×[a,b]

g 2 gt sin v cos v sin u du dv

[0,1]×[a,b]

= 0.

We notice that the same thing holds when u is fixed. By the definition of Ψ, we get ∂Ψ = Ψ(0, u, v) − Ψ(1, u, v) + Ψ(t, a, v) − Ψ(t, b, v) + Ψ(t, u, c) − Ψ(t, u, d) = S(u, v) − Ω(u, v) + Φa (t, v) − Φb (t, v) + Φc (t, u) − Ψd (t, u),

(10.124)

where Φc and Φd are the mappings defined by (t, u) 7→ Ψ(t, u, c) and (t, u) 7→ Ψ(t, u, d) respectively, while Φa and Φb are the mappings defined by (t, v) 7→ Ψ(t, a, v) and (t, v) 7→ Ψ(t, b, v) respectively. By the above analysis, we know that Z Z Z η= η=

Z

η = 0.

(10.125)

Φd

Φc

Φb

Φa

η=

Since f ∈ C ′′ (D), Ψ is a 3-chain of class C ′′ in R3 \ {0}. In addition, since ζ is a 2-form of class C ′ in R3 \ {0}, Theorem 10.33 (Stokes’ Theorem) implies that Z Z dζ = ζ. (10.126) Ψ

∂Ψ

By part (a), dζ = 0 so that the integral (10.126) shows that Z ζ = 0.

(10.127)

∂Ψ

Hence we follow from the relations (10.124), the integrals (10.125) and the expression (10.127) that Z ζ 0= Z Z Z Z Z∂Ψ Z ζ ζ− ζ+ ζ− ζ+ ζ− = Φd Φc Φb Φa Ω S Z Z ζ ζ− = S



and it is equivalent to Z



ζ=

Z

ζ = A(S). S

For a better illustration of the set S = ΣE , by the analysis in Example 10.32, we have ∂S = ∂(ΣE ) = Σ(∂E) = γ1 + γ2 + γ3 + γ4 , where γ1 (u) = Σ(u, c) = (sin u cos c, sin u sin c, cos u), γ2 (v) = Σ(b, v)



= (sin b cos v, sin b sin v, cos b), γ3 (u) = Σ(b + a − u, d) = (sin(b + a − u) cos d, sin(b + a − u) sin d, cos(b + a − u)), γ4 (v) = Σ(a, c + d − v)

= (sin a cos(c + d − v), sin a sin(c + d − v), cos a),

with a ≤ u ≤ b and c ≤ v ≤ d. In particular, we consider the example that E = [ π4 , π2 ] × [ π2 , π]. Then we have

where π4 ≤ u ≤ π2 and Figure 10.15 below:

γ1 (u) = (0, sin u, cos u), γ2 (v) = (cos v, sin v, 0),    3π 3π γ3 (u) = − sin − u , 0, cos −u , 4 4  3π  √2  3π  √2   √2 , cos −v , sin −v , γ4 (v) = 2 2 2 2 2

π 2

≤ v ≤ π. The corresponding 2-surface S and its boundary ∂S is shown in

Figure 10.15: An example of the 2-surface S and its boundary ∂S.



(e) Let V = {(x, y, z) | x2 + y 2 > 0, z ∈ R} = R3 \ {(0, 0, z) | z ∈ R}

which is an open set in R3 . In other words, V is the 3-dimensional space with deleted z-axis. Now −

z r

and η

are 0- and 1-forms respectively, we follow from Theorem 10.20(a), Definition 10.18 and Problem 10.21(a) that h  z i z dλ = − d ∧ η + (−1)0 ∧ dη r r  xz yz r2 − z 2   x dy − y dx  dx + 3 dy − dz ∧ = r3 r r3 x2 + y 2  2 2 z x +y x dy − y dx  = 3 dx ∧ dy − dz ∧ r r3 x2 + y 2 z x y = 3 dx ∧ dy − 3 dz ∧ dy + 3 dz ∧ dx r r r x dy ∧ dz + y dz ∧ dx + z dx ∧ dy = r3 = ζ. Hence ζ is exact in the open set V . (f) Note that Ω(u, v) = f (u, v)Σ(u, v) = (f (u, v) sin u cos v, f (u, v) sin u sin v, f (u, v) cos u), where (u, v) ∈ E. By the hypothesis E = [a, b] × [c, d] made in part (b), if we further assume that 0 < a < b < π, then it is easy to see that [f (u, v) sin u cos v]2 + [f (u, v) sin u sin v]2 = f 2 (u, v) sin2 u > 0 which means Ω ⊆ V.

(10.128)

Recall that f ∈ C ′′ (D) and Σ is a 2-surface of class C ′′ in R3 \ {0}, we have Ω is a 2-surface of class C ′′ in R3 \ {0} too. Since λ is clearly a 1-form of class C ′ in V , ζ = dλ in V by part (d) and the subset relation (10.128), the exactness of ζ still hold in Ω. Hence it follows from Theorem 10.33 (Stokes’ Theorem) that Z Z Z Z z (10.129) η. dλ = λ= ζ= ∂Ω r Ω ∂Ω Ω Similarly, the definition of S implies that S ⊆ V . Since S is a 2-surface of class C ′′ in R3 \ {0}, Theorem 10.33 (Stokes’ Theorem) again implies that Z Z Z Z z ζ= dλ = (10.130) η. λ= S S ∂S ∂S r Now we want to show that the two right-most integrals in the relations (10.129) and (10.130) are equal. Such a proof is presented in two steps below. – Step 1: Analysis of

z r

on ∂Ω and ∂S. We notice that if (x, y, z) ∈ ΣE (u, v), then x = sin u cos v,

y = sin u sin v,

z = cos u

(10.131)

Chapter 10. Integration of Differential Forms so that


z cos u = = cos u. r 1

Similarly, if (x, y, z) ∈ Ω(u, v), then x = f (u, v) sin u cos v,

y = f (u, v) sin u sin v,

z = f (u, v) cos u,

where f (u, v) > 0 so that f (u, v) cos u z = = cos u. r f (u, v) In other words,

z r

is the same at ΣE (u, v) as at Ω(u, v). By part (b), we know that S = ΣE

and thus

z r

is the same at ∂S as at ∂Ω.

– Step 2: Analysis of ζ on ∂Ω and ∂S. By Figure 10.14, we know that E does not intersect u = 0, therefore we have  y η = d arctan x by Problem 10.21(d).s On ∂S, we have arctan Similarly, on ∂Ω, we have arctan

 sin u sin v  y = arctan(tan v) = v. = arctan x sin u cos v

 f (u, v) sin u sin v  y = arctan(tan v) = v. = arctan x f (u, v) sin u cos v

Thus the 1-form η is the same at ∂S as at ∂Ω too. Hence we deduce from the above analysis and part (b) that Z Z Z Z Z Z z z ζ= ζ = A(S) η= η= λ= λ= Ω S ∂S ∂Ω ∂Ω r ∂S r which is our expected result. (g) The answer is affirmative. Let L be a straight line through the origin. Recall that the spherical coordinates for a unit vector x = (x, y, z) are given by the formulas (10.131). It is well-knownt that the matrices   cos u − sin u 0 Rxy (u) =  sin u cos u 0  , 0 0 1   cos v 0 sin v 0 1 0 , Rzx (v) =  − sin v 0 cos v   1 0 0 Ryz (w) =  0 cos w − sin w  0 sin w cos w represent rotations by angle u, v and w counterclockwise in the (x, y)-plane, the (z, x)-plane and the (y, z)-plane respectively.

s Note

that E does not intersect v = 0, so the formula η = d − arctan t See, for instance, [16, pp. 328 - 332].

x y



is also applicable to get the same result.


10.5. Problems on closed forms and exact forms Lemma 10.11 Let T : R3 → R3 be the transformation with matrix M defined by M = Rzx (−u)Rxy (−v). Then T transforms the straight line L onto the z-axis.

Proof of Lemma 10.11. We have M = Rzx (−u)Rxy (−v)   cos u 0 − sin u  1 0 = 0 sin u 0 cos u  cos u cos v cos u sin v cos v =  − sin v sin u cos v sin u sin v

cos v − sin v 0

sin v cos v 0 

− sin u , 0 cos u

 0 0  1 (10.132)

where 0 ≤ u ≤ π and 0 ≤ v ≤ 2π. Then direct computation shows that T (x) = Mx  cos u cos v =  − sin v sin u cos v   0 =  0 . 1

cos u sin v cos v sin u sin v

  − sin u sin u cos v   sin u sin v  0 cos u cos u

In other words, the mapping T transforms L onto the z-axis, completing the proof of the  lemma. Let E = R3 \ L. Since T is bijectiveu , we consider the mapping TE = T : E → V which is also bijective. Since λ is of class C ′ in V and the formula (10.132) of M implies that T and then TE is of class C ′′ , we deduce from Theorem 10.22(c) and part (e) that d(λTE ) = ( dλ)TE = ζTE . Since ζ is a 2-form in V , ζTE is a 2-form in E. We claim that ζ = ζTE . To this end, we get from the formula (10.132) that    cos u cos v cos u sin v − sin u x  y  cos v 0 T (x) =  − sin v sin u cos v sin u sin v cos u z   (cos u cos v)x + (cos u sin v)y + (− sin u)z . (− sin v)x + (cos v)y = (sin u cos v)x + (sin u sin v)y + (cos u)z

Following the notations used in Definition 10.21, we have

t1 (x) = (cos u cos v)x + (cos u sin v)y + (− sin u)z, u This

can be easily checked by the matrix form (10.132).



t2 (x) = (− sin v)x + (cos v)y, t3 (x) = (sin u cos v)x + (sin u sin v)y + (cos u)z which imply that dt1 = (cos u cos v) dx + (cos u sin v) dy − sin u dz, dt2 = − sin u dx + cos v dy,

(10.133)

dt3 = (sin u cos v) dx + (sin u sin v) dy + cos u dz. Thus we obtain from [21, Eqn. (67), p.262] that

ζTE = [(cos u cos v)x + (cos u sin v)y + (− sin u)z] dt2 ∧ dt3

+ [(− sin u)x + (cos v)y] dt3 ∧ dt1 + [(sin u cos v)x + (sin u sin v)y + (cos u)z] dt1 ∧ dt2 .

(10.134)

We need to compute dt2 ∧ dt3 ,

dt3 ∧ dt1

and

dt1 ∧ dt2 .

To do this, we know from the formulas (10.133) that dt2 ∧ dt3 = [− sin v dx + cos v dy] ∧ [(sin u cos v) dx + (sin u sin v) dy + cos u dz] = −(sin u sin2 v + sin u cos2 v) dx ∧ dy − (sin v cos u) dx ∧ dz + (cos u cos v) dy ∧ dz

= (− sin u) dx ∧ dy − (sin v cos u) dx ∧ dz + (cos u cos v) dy ∧ dz, dt3 ∧ dt1 = [(sin u cos v) dx + (sin u sin v) dy + cos u dz]

(10.135)

∧ [(cos u cos v) dx + (cos u sin v) dy − sin u dz]

= (sin u cos u sin2 v) dx ∧ dy + (sin u2 cos v) dz ∧ dx

− (sin u cos u sin2 v) dx ∧ dy − (sin2 u sin v) dy ∧ dz

+ (cos2 u cos v) dz ∧ dx − (cos2 u sin v) dy ∧ dz

= (cos v) dz ∧ dx − (sin v) dy ∧ dz

(10.136)

and dt1 ∧ dt2 = [(cos u cos v) dx + (cos u sin v) dy − sin u dz] ∧ [− sin v dx + cos v dy] = (cos u cos2 v + cos u sin2 v) dx ∧ dy + (sin u sin v) dz ∧ dx + (sin u cos v) dy ∧ dz

= (cos u) dx ∧ dy + (sin u sin v) dz ∧ dx + (sin u cos v) dy ∧ dz. Now we show that r is invariant under the rotation Rzx (−u)Rxy (−v): Lemma 10.12 1

If we denote rTE = (t21 + t22 + t23 ) 2 , then we have rTE = r.

(10.137)


10.5. Problems on closed forms and exact forms Proof of Lemma 10.12. By direct computation, we have rT2 E = t21 + t22 + t23 = [(cos u cos v)x + (cos u sin v)y + (− sin u)z]2 + [(− sin v)x + (cos v)y]2 + [(sin u cos v)x + (sin u sin v)y + (cos u)z]2 = (cos2 u cos2 v)x2 + 2(cos u cos v)[(cos u sin v)y + (− sin u)z]x + [(cos u sin v)y + (− sin u)z]2 + (sin2 v)x2 − (2 sin v cos v)xy + (cos2 v)y 2 + (sin2 u cos2 v)x2

+ (2 sin u cos v)[(sin u sin v)y + (cos u)z]x + [(sin u sin v)y + (cos u)z]2

= x2 + 2(cos u cos v)[(cos u sin v)y + (− sin u)z]x + [(cos u sin v)2 y 2 − (2 cos u sin u sin v)yz + (− sin u)2 z 2 ] − (2 sin v cos v)xy + (cos2 v)y 2 + (sin2 u cos2 v)x2 + (2 sin u cos v)[(sin u sin v)y + (cos u)z]x + [(sin u sin v)2 y 2 + (2 sin u cos u sin v)yz + (cos u)2 z 2 ] = x2 + y 2 + z 2 + (2 cos2 u sin v cos v − 2 sin v cos v + 2 sin2 u cos v sin v)xy (−2 cos u sin u sin v + 2 cos u sin u sin v)yz + (−2 sin u cos u cos v + 2 sin u cos u cos v)zx = x2 + y 2 + z 2 = r2 which certainly gives rTE = r and this completes the proof of the lemma.



Now we return to the proof of the problem. After putting the identities (10.135), (10.136) and (10.137) into the 2-form (10.134) and using Lemma 10.11, we have rT3 E ζTE = [(cos u cos v)x + (cos u sin v)y + (− sin u)z] × [−(sin u) dx ∧ dy + (sin v cos u) dz ∧ dx + (cos u cos v) dy ∧ dz] + [(− sin v)x + (cos v)y] × [(cos v) dz ∧ dx + (− sin v) dy ∧ dz]

+ [(sin u cos v)x + (sin u sin v)y + (cos u)z] × [(cos u) dx ∧ dy + (sin u sin v) dz ∧ dx + (sin u cos v) dy ∧ dz] n  = (cos u cos v)x + (cos u sin v)y + (− sin u)z × (− sin u) o + [(sin u cos v)x + (sin u sin v)y + (cos u)z] × (cos u) dx ∧ dy n  + (cos u cos v)x + (cos u sin v)y + (− sin u)z × (cos u cos v)   + (− sin v)x + (cos v)y × (− sin v) o   + (sin u cos v)x + (sin u sin v)y + (cos u)z × (sin u cos v) dy ∧ dz n  + (cos u cos v)x + (cos u sin v)y + (− sin u)z × (sin v cos u)   + (− sin v)x + (cos v)y × (cos v) o   (sin u cos v)x + (sin u sin v)y + (cos u)z × (sin u sin v) dz ∧ dx = z dx ∧ dy + x dy ∧ dz + y dz ∧ dx

= r3 ζ

which implies that ζTE = ζ.

(10.138)

As we have shown that d(λTE ) = ζTE holds in E = R3 \ L, this and the identity (10.138) imply that ζ = d(λTE ),



i.e., ζ is exact in E. This completes the proof of the problem.
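The key geometric fact used in part (g), Lemma 10.11, is easy to test numerically: the matrix M = Rzx(−u)Rxy(−v) of (10.132) sends the unit vector with spherical angles (u, v) to e3. The following sketch is my own verification; the function names Rxy and Rzx simply mirror the notation of the solution.

import numpy as np

def Rxy(a):   # rotation by angle a in the (x, y)-plane
    return np.array([[np.cos(a), -np.sin(a), 0.0],
                     [np.sin(a),  np.cos(a), 0.0],
                     [0.0,        0.0,       1.0]])

def Rzx(a):   # rotation by angle a in the (z, x)-plane
    return np.array([[ np.cos(a), 0.0, np.sin(a)],
                     [ 0.0,       1.0, 0.0      ],
                     [-np.sin(a), 0.0, np.cos(a)]])

rng = np.random.default_rng(2)
for u, v in rng.random((100, 2)) * np.array([np.pi, 2.0 * np.pi]):
    M = Rzx(-u) @ Rxy(-v)                                   # the matrix of (10.132)
    x = np.array([np.sin(u) * np.cos(v), np.sin(u) * np.sin(v), np.cos(u)])
    assert np.allclose(M @ x, [0.0, 0.0, 1.0])              # M sends the direction of L to e_3
print("M = Rzx(-u) Rxy(-v) maps the line L onto the z-axis")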



Problem 10.23 Rudin Chapter 10 Exercise 23.

Proof. (a) By applying Theorem 10.20 repeatedly, we have d[(−1)i−1 (rk )−k xi dx1 ∧ · · · ∧ dxi−1 ∧ dxi+1 ∧ · · · ∧ dxk ]

= d[(−1)i−1 (rk )−k xi ] ∧ dx1 ∧ · · · ∧ dxi−1 ∧ dxi+1 ∧ · · · ∧ dxk

+ (−1)0 (−1)i−1 (rk )−k xi ∧ d( dx1 ∧ · · · ∧ dxi−1 ∧ dxi+1 ∧ · · · ∧ dxk )

= d[(−1)i−1 (rk )−k xi ] ∧ dx1 ∧ · · · ∧ dxi−1 ∧ dxi+1 ∧ · · · ∧ dxk

+ (−1)i−1 (rk )−k xi [( d2 x1 ) ∧ dx2 ∧ · · · ∧ dxi−1 ∧ dxi+1 ∧ · · · ∧ dxk | {z } It is 0.

+(−1)1 dx1 ∧ d( dx2 ∧ · · · ∧ dxi−1 ∧ dxi+1 ∧ · · · ∧ dxk )]

= d[(−1)i−1 (rk )−k xi ] ∧ dx1 ∧ · · · ∧ dxi−1 ∧ dxi+1 ∧ · · · ∧ dxk

+ (−1)i (rk )−k xi dx1 ∧ d( dx2 ∧ · · · ∧ dxi−1 ∧ dxi+1 ∧ · · · ∧ dxk ) = ··· = d[(−1)i−1 (rk )−k xi ] ∧ dx1 ∧ · · · ∧ dxi−1 ∧ dxi+1 ∧ · · · ∧ dxk .

(10.139)

By [21, Eqn. (59), p.260], we have d[(−1)i−1 (rk )−k xi ] = (−1)i−1

k X ∂  xi  dxj . ∂xj rkk j=1

(10.140)

It is clear that  ∂r k ∂rk x  −xi ∂xkj −xi krkk−1 ∂x −xi krkk−1 rkj  −kxi xj j   = = = k+2 , if j 6= i;  2k 2k 2k  rk rk rk  rk

∂  xi  =  ∂xj rkk  ∂r k   rkk − xi ∂xki rk − kx2i rkk−2 r2 − kx2   = k = k k+2 i ,  2k 2k rk rk rk

if j = i.

Therefore, it follows from the summation (10.140) that i−1

d[(−1)

−k

(rk )

i−1

xi ] = (−1)

k X rk2 − kx2i −kxi xj dx + dxj i k+2 rk rkk+2 j=1

!

j6=i

and then the expression (10.139) with an application of the anticommutative relation shows that d[(−1)i−1 (rk )−k xi dx1 ∧ · · · ∧ dxi−1 ∧ dxi+1 ∧ · · · ∧ dxk ] ! k 2 2 X −kxi xj i−1 rk − kxi = (−1) dxi + dxj ∧ dx1 ∧ · · · ∧ dxi−1 ∧ dxi+1 ∧ · · · ∧ dxk rkk+2 rkk+2 j=1 j6=i

r2 − kx2 = (−1)i−1 k k+2 i dxi ∧ dx1 ∧ · · · ∧ dxi−1 ∧ dxi+1 ∧ · · · ∧ dxk rk


10.5. Problems on closed forms and exact forms = (−1)2i−2 =

rk2 − kx2i dx1 ∧ · · · ∧ dxk rkk+2

rk2 − kx2i dx1 ∧ · · · ∧ dxk rkk+2

(10.141)

Hence we deduce from the definition of ωk and the relation (10.141) that dωk =

k X i=1

d[(−1)i−1 (rk )−k xi dx1 ∧ · · · ∧ dxi−1 ∧ dxi+1 ∧ · · · ∧ dxk ]

k X rk2 − kx2i = dx1 ∧ · · · ∧ dxk rkk+2 i=1

=

krk2 − k(x21 + · · · + x2k ) dx1 ∧ · · · ∧ dxk rkk+2

=0 in Ek . (b) We have fk : Ek → R. Note that the gradient of fk at x is given by (∇fk )(x) =

k X

(Di fk )(x)ei .

(10.142)

i=1

– Step 1: fk satisfies the equations given in the hint. By Theorem 6.20 (First Fundamental Theorem of Calculus), if i 6= k, then we have Z xk  k−3 ∂  rk (Di fk )(x) = (−1) (1 − s2 ) 2 ds ∂xi −1 xk  k−3 ∂( ) x2  2 = (−1)k rk 1 − 2k ∂xi rk k−3 2  −xi xk x1 + · · · + x2k−1  2 = (−1)k · rk3 rk2 k

= (−1)k+1

k−3 xi xk rk−1 . rkk

(10.143)

Similarly, if i = k, then we have (Dk fk )(x) = (−1)k

∂rk  k−3 rk − xk ∂x x2k  2 k 1 − rk2 rk2

2 k rk

= (−1)

= (−1)k

− x2k (rk2 − x2k ) · rk3 rkk−3

k−3 2

k−1 rk−1 . rkk

(10.144)

By substituting the relations (10.143) and (10.144) into the definition (10.142) and then consider its dot product with x, we get x · (∇fk )(x) = =

k X

xi (Di fk )(x)

i=1

k−1 X i=1

(−1)k+1

k−3 x2i xk rk−1 x rk−1 k k k−1 + (−1) rkk rkk

Chapter 10. Integration of Differential Forms = (−1)k+1

312 k−1 x rk−1 xk rk−1 k k k−1 + (−1) rkk rkk

= 0.

(10.145)

– Step 2: ωk = d(fk ωk−1 ) for k = 2, . . . , n. We notice from Theorem 10.20, part (a) and Definition 10.17 that d(fk ωk−1 ) = ( dfk ) ∧ ωk−1 + (−1)0 fk ∧ dωk−1 = ( dfk ) ∧ ωk−1 # " k X 1 = k−1 (Di fk )(x) dxi rk−1 i=1 # " k−1 X (−1)i−1 xi dx1 ∧ · · · ∧ dxi−1 ∧ dxi+1 ∧ · · · ∧ dxk−1 ∧ i=1

=

1 h

k−1 rk−1

(−1)1−1 x1 (D1 fk )(x) dx1 ∧ dx2 ∧ · · · ∧ dxk−1 {z } | There are (k − 2) terms.

+(−1)2−1 x2 (D2 fk )(x) dx2 ∧ dx1 ∧ dx3 ∧ · · · ∧ dxk−1 | {z } There are (k − 2) terms.

k−2

+ · · · + (−1) +

(

xk−1 (Dk−1 fk )(x) dxk−1 ∧ dx1 ∧ · · · ∧ dxk−2 {z } | There are (k − 2) terms.

1 k−1 rk−1

i

(Dk fk )(x) dxk

#) " k−1 X i−1 . (−1) xi dx1 ∧ · · · ∧ dxi−1 ∧ dxi+1 ∧ · · · ∧ dxk−1 ∧ {z } | i=1

(10.146)

There are (k − 2) terms.

By applying the anticommutative relation (k − 2)-times to the red brackets in the expression (10.146) and then the formula (10.144), it becomes (−1)2k−2

=

1 rkk

1 k−1 rk−1

·

k−1 X i=1

k−1 k−1 X rk−1

rkk

i=1

(−1)i−1 xi dx1 ∧ · · · ∧ dxi−1 ∧ dxi+1 ∧ · · · ∧ dxk

(−1)i−1 xi dx1 ∧ · · · ∧ dxi−1 ∧ dxi+1 ∧ · · · ∧ dxk .

(10.147)

Now it remains to simplify the blue brackets in the expression (10.146). In fact, by the equation (10.145), the blue brackets are equivalent to −

xk xk (Dk fk )(x) dx1 ∧ · · · ∧ dxk−1 = (−1)k+1 k dx1 ∧ · · · ∧ dxk−1 . k−1 rk rk−1

(10.148)

Hence we follow from substituting the expressions (10.147) and (10.148) back into the expression (10.146) that d(fk ωk−1 ) = (−1)k+1 +

=

xk dx1 ∧ · · · ∧ dxk−1 rkk

k−1 1 X (−1)i−1 xi dx1 ∧ · · · ∧ dxi−1 ∧ dxi+1 ∧ · · · ∧ dxk rkk i=1

k 1 X (−1)i−1 xi dx1 ∧ · · · ∧ dxi−1 ∧ dxi+1 ∧ · · · ∧ dxk rkk i=1


10.5. Problems on closed forms and exact forms = ωk which is our desired property.

(c) We know from Problems 10.22 and 10.23 that ω2 = η and ω3 = ζ. By Examples 10.36 and 10.37, we know that Z Z ζ = 4π 6= 0, η = 2π 6= 0 and Σ

γ

where γ and Σ are parametrizations of the unit circle and the unit sphere in R2 and R3 respectively. Furthermore, we conclude from the discussion parts of Examples 10.36 and 10.37 and the facts ∂γ = ∂Σ = 0 that η and ζ are not exact in R2 \ {0} and R3 \ {0} respectively. Thus it is reasonable to say that the answer in this part is negative.

Let n ≥ 2. We basically follow a part of the argument as in [9, §6.1]. Consider the (n − 1)-sphere S n−1 of radius 1 in Rn defined by S n−1 = {x ∈ Rn | kxk = 1}. Let ω = rnn ωn . Then we have ω = ωn on S n−1 so that ω(S n−1 ) = ωn (S n−1 ) by Definition 10.11, i.e., Z

ωn =

S n−1

Z

ω. S n−1

Since ωn is, by definition, a (n − 1)-form defined in En , ω is also a (n − 1)-form in En . We know that S n−1 = ∂Dn , where Dn = {x ∈ Rn | kxk ≤ 1} is the closed unit ball in Rn (see, for example, [11, p. 253]). Since S n−1 can be treated as an k-simplex in Rn of class C ′′ in En , we apply Theorem 10.33 (Stokes’ Theorem) to S n−1 and ω to get Z Z Z Z dω. (10.149) ω= ω= ωn = S n−1

S n−1

∂Dn

Dn

Lemma 10.13 We have dω = n dx1 ∧ · · · ∧ dxn .

Proof of Lemma 10.13. By Definition 10.18, we have n  X (−1)i−1 xi dx1 ∧ · · · ∧ dxi−1 ∧ dxi+1 ∧ · · · ∧ dxn dω = d i=1

=

n X i=1

=

n X i=1

(−1)i−1 dxi ∧ dx1 ∧ · · · ∧ dxi−1 ∧ dxi+1 ∧ · · · ∧ dxn (−1)i−1 (−1)i−1 dx1 ∧ · · · ∧ dxi−1 ∧ dxi ∧ dxi+1 ∧ · · · ∧ dxn

= n dx1 ∧ · · · ∧ dxn which is the required result.





Now we apply Lemma 10.13 to the relation (10.149) and then using Definition 10.44 (Volume elements), we obtain Z

S n−1

ω=n

Z

n

Dn

dx1 ∧ · · · ∧ dxn = Vol (Dn ) =

π2 6= 0, n Γ( 2 + 1)

(10.150)

where Γ is the Gamma function. For the second assertion, assume that λ was a (n − 2)-form defined in En ⊆ Rn such that ωn = dλ.

(10.151)

By Problem 10.16 or Remarks 10.35(c), we know that ∂S n−1 = ∂ 2 Dn = 0. Therefore, we follow from this fact, Theorem 10.33 (Stokes’ Theorem) and the exactness (10.151) that Z Z Z Z λ=0 λ= dλ = ωn = ∂S n−1

S n−1

S n−1

∂ 2 Dn

which contradicts the result (10.150). Hence ωn is not exact in En .v (d) We have the following generalizations: – A generalization of Problem 10.22(c). Suppose that g ∈ C ′′ ([0, 1]), hi ∈ C ′′ ([0, 1]n−2 ) and g > 0. Suppose further that Φ(s1 , . . . , sn−2 , t) = (x1 , . . . , xn ) is a (n − 1)-surface with parameter domain I n−1 given by xi = g(t)hi (s1 , . . . , sn−2 ) (1 ≤ i ≤ n).

(10.152)

Then we have Z

ωn = 0.

(10.153)

Φ

To prove the formula (10.153), we need to compute the Jacobians 

∂x1 ∂s1

    ..   .   ∂xi−1   ∂s1  ∂(x1 , . . . , xi−1 , xi+1 , . . . , xn )  = det  ∂x i+1  ∂(s1 , . . . , sn−2 , t)   ∂s1    ..  .     ∂xn ∂s1 v One

∂x1 ∂sn−2

∂x1 ∂t

···

.. . ∂xi−1 ∂sn−2

.. . ∂xi−1 ∂t

···

∂xi+1 ∂sn−2

∂xi+1 ∂t

..

.. .

.. .

∂xn ∂sn−2

∂xn ∂t

··· ..

.

.

···



            .           

can prove the same result by generalizing the argument used in part (b), but the steps are very cumbersome.


10.5. Problems on closed forms and exact forms By the definition (10.152), we have 

g(t)

∂h1 ∂s1

· · · g(t)

    ..   .    ∂hi−1   g(t)  ∂s1 ∂(x1 , . . . , xi−1 , xi+1 , . . . , xn )  = det   ∂(s1 , . . . , sn−2 , t)  g(t) ∂hi+1  ∂s1     ..  .     ∂hn g(t) ∂s1 

..

.. .

.

Z

Φ

ωn =

Z

I n−1

g ′ (t)hi−1

· · · g(t)

∂hi+1 ∂sn−2

g ′ (t)hi+1

..

.. .

.

    ..   .     ∂hi−1   ∂s1  × det   ∂h i+1   ∂s  1    ..  .     ∂hn ∂s1

···

∂h1 ∂sn−2

h1

..

.. .

.. .

···

∂hi−1 ∂sn−2

hi−1

···

∂hi+1 ∂sn−2

hi+1

..

.. .

.. .

∂hn ∂sn−2

hn

.

.

···

.. .



                          

∂hn g ′ (t)hn ∂sn−2  ∂h1 ··· h1  ∂sn−2   .. ..  ..  . . .     ∂hi−1  ··· hi−1   ∂sn−2    ∂hi+1 ··· hi+1   ∂sn−2     . . .. . . . . .      ∂hn ··· hn ∂sn−2

· · · g(t)

j=1

∂h1 ∂s1

.. .

∂hi−1 ∂sn−2

n X g ′ (t) (−1)i−1 hi n  n2 X i=1 h2j g(t)



g ′ (t)h1

· · · g(t)

∂h1  ∂s1    ..   .     ∂hi−1   ∂s1  n−2 ′ =g (t)g (t) det   ∂h i+1    ∂s1    ..  .     ∂hn ∂s1

so that

∂h1 ∂sn−2

              ds1 · · · dsn−2 dt.            

Chapter 10. Integration of Differential Forms Since the summation in the above integral is along the first column  ∂h1 h1  ∂s1    . .. det  .  ..    ∂hn hn ∂s1

316 just the expansion of the following determinant ···

∂h1 ∂sn−2

..

.. .

.

···

∂hn ∂sn−2

h1



   ..   .     hn

and Theorem 9.34(d) implies that it is actually 0. Hence we have Z Z g ′ (t) ωn = n  n × (0) ds1 · · · dsn−2 dt = 0 X Φ I n−1 2 2 hj g(t) j=1

which proves the formula (10.153). – A generalization of Problem 10.22(d). Our first step is to construct a (n − 1)-surface in En whose role is similar to that of Σ in Example 10.32. Let Dn be the (n − 1)-cell [0, π]n−2 × [0, 2π]. Suppose that Σn−1 : Dn → En ⊆ Rn \ {0} is the (n − 1)-surface defined by Σn−1 (φ1 , . . . , φn−1 ) = (x1 , . . . , xn−1 , xn ), where x1 = cos φ1 , x2 = sin φ1 cos φ2 , x3 = sin φ1 sin φ2 cos φ3 , .. .

(10.154)

xn−2 = sin φ1 · · · sin φn−3 cos φn−2 , xn−1 = sin φ1 · · · sin φn−2 cos φn−1 , xn = sin φ1 · · · sin φn−2 sin φn−1

and 0 ≤ φ1 , . . . , φn−2 ≤ π, 0 ≤ φn−1 ≤ 2π. (See [4] for further details of the derivation of Σn−1 .) By direct computation, we know from the definitions (10.154) that x21 + x22 + · · · + x2n = 1. Thus the range of Σn−1 is the (n − 1)-sphere S n−1 , i.e., Σn−1 (Dn ) = S n−1 .

(10.155)

Next, suppose E is a closed rectangle in Dn with edges parallel to those of Dn . In other words, we have E = [a1 , b1 ] × · · · × [an−1 , bn−1 ], where ai and bi are some constants with 0 < ai < bi < π for 1 ≤ i ≤ n − 2 and 0 < an−1 < bn−1 < 2π. Let f ∈ C ′′ (Dn ) and f > 0 on Dn . Let, further, that Ω be the (n − 1)-surface with parameter domain E, defined by Ω(φ1 , . . . , φn−1 ) = f (φ1 , . . . , φn−1 )Σn−1 (φ1 , . . . , φn−1 ). Now we want to prove


10.5. Problems on closed forms and exact forms Lemma 10.14 We have

Z

ωn =



Z

ωn = An−1 (S), S

n−1 where S and An−1 (S) denote the restriction ΣE and the “area” of S.

Since the proof of Lemma 10.14 is quite lengthy, we present its proof in Appendix A. – A special case of Lemma 10.14. We claim that Lemma 10.15 For n ≥ 2, we have

Z

n

ωn =

Σn−1

where

n 2π 2 Γ( n 2)

2π 2 , Γ( n2 )

is the surface area of the (n − 1)-sphere of radius 1, see [4, p. 66].

Proof of Lemma 10.15. The case for n = 2 is done in Example 10.36. So we prove the case for n ≥ 3 by induction. By the formula (A.16), we have Z ωn = An−1 (Σn−1 ) Σn−1

=

Z

0

π

···

Z

0

π

Z



0

sin φn−2 sin2 φn−3 · · · sinn−2 φ1 dφ1 · · · dφn−1

= (2π)I1 × · · · × In−2 , where In−2 =

(10.156) Z

π

sinn−2 x dx.

0

When n = 3, we obtain from the formula (A.9) that Z π Z sin x dx = 4π. ω3 = 2πI1 = 2π Σ2

0

√ By Theorem 8.18(a) and the fact that Γ( 12 ) = π, we have 3

3

3

2π 2 4π 2 2π 2 √ = 4π. 1 = 3 = 1 π Γ( 2 ) 2 Γ( 2 ) Thus the statement is true for n = 3. Assume that it is also true for n = k ≥ 3, i.e., Z k 2π 2 ωk = . Γ( k2 ) Σk−1 For n = k + 1, we follow from the formula (A.9), the assumption and the properties of Γ(x) that Z ωk+1 = [(2π)I1 × · · · × Ik−2 ] × Ik−1 Σk Z  = ωk × Ik−1 Σk−1 k

=

2π 2 × Ik−1 Γ( k2 )



 2π m   if k = 2m;  Γ(m) × I2m−1 , = m+ 1  2π 2  × I2m , if k = 2m + 1,  Γ(m + 21 ) √  2π m 2(m − 1)! π   , ×  Γ(m) Γ(m + 12 ) = 1 2m Γ(m + 21 ) π 2π m+ 2    √ × m , × 1 π 2 m! Γ(m + 2 )  1 m+  2π 2   , if k = 2m; Γ(m + 21 ) =   2π m+1  , if k = 2m + 1, m!  1  2π m+ 2  , if k = 2m;  Γ(m + 21 ) =  2π m+1   , if k = 2m + 1, Γ(m + 1)

if k = 2m; if k = 2m + 1,

k+1

2π 2 . = Γ( k+1 2 )

Hence the statement is still true for n = k + 1 and this finishes the proof of the lemma. This completes the proof of the problem.
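As a numerical cross-check of Lemma 10.15 (my own illustration, not part of the proof), the closed form 2π^(n/2)/Γ(n/2) reproduces the familiar surface areas of the unit circle, the unit 2-sphere, and the unit 3-sphere in R4.

import math

def unit_sphere_area(n):
    # "area" of the (n-1)-sphere S^{n-1} in R^n, as in Lemma 10.15
    return 2.0 * math.pi ** (n / 2.0) / math.gamma(n / 2.0)

assert math.isclose(unit_sphere_area(2), 2.0 * math.pi)       # circle
assert math.isclose(unit_sphere_area(3), 4.0 * math.pi)       # 2-sphere
assert math.isclose(unit_sphere_area(4), 2.0 * math.pi ** 2)  # 3-sphere in R^4
print("2*pi^(n/2) / Gamma(n/2) matches the known areas for n = 2, 3, 4")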

 

Problem 10.24 Rudin Chapter 10 Exercise 24.

Proof. Let x, y ∈ E and x ≠ y. Since E is convex, the oriented affine 2-simplex σ = [p, x, y] lies in E. Furthermore, we know from Definition 10.30 that σ is of class C ′′ because the identity mapping is of class C ′′ . By Theorem 10.33 (Stokes' Theorem) and the fact that dω = 0, we have

0 = ∫_σ dω = ∫_{∂σ} ω.   (10.157)

By Theorem 10.29, we have ∂σ = [x, y] − [p, y] + [p, x], so the integral (10.157) implies that

∫_{[x,y]} ω − ∫_{[p,y]} ω + ∫_{[p,x]} ω = 0,

that is, writing f (x) = ∫_{[p,x]} ω,

f (y) − f (x) = ∫_{[x,y]} ω.   (10.158)

By definition, [x, y] is the straight line segment in E joining the points x and y. Let γ : [0, 1] → E be the 1-surface in E ⊆ Rn defined by γ(t) = (1 − t)x + ty = ((1 − t)x1 + ty1 , . . . , (1 − t)xn + tyn ). By Definition 10.11 (or Example 10.12(a)), we have

∫_{[x,y]} ω = ∫_0^1 \sum_{i=1}^{n} ai (γ(t)) (∂/∂t)((1 − t)xi + tyi ) dt
            = \sum_{i=1}^{n} ∫_0^1 ai ((1 − t)x + ty)(yi − xi ) dt
            = \sum_{i=1}^{n} (yi − xi ) ∫_0^1 ai ((1 − t)x + ty) dt.   (10.159)

Combining the expressions (10.158) and (10.159), we have

f (y) − f (x) = \sum_{i=1}^{n} (yi − xi ) ∫_0^1 ai ((1 − t)x + ty) dt.   (10.160)

Next, by [21, Eqn. (25), p. 215] and the expression (10.160), we have

(Dj f )(x) = lim_{s→0} ( f (x + sej ) − f (x) ) / s
           = lim_{s→0} (1/s)(xj + s − xj ) ∫_0^1 aj ((1 − t)x + t(x + sej )) dt
           = lim_{s→0} ∫_0^1 aj (x + tsej ) dt
           = ∫_0^1 aj (x) dt
           = aj (x)

for j = 1, . . . , n. Hence if we define the real function f : E ⊆ Rn → R by

f (x) = ∫_{[p,x]} ω,

then it is of class C ′ in E and

df = \sum_{i=1}^{n} (Di f )(x) dxi = \sum_{i=1}^{n} ai (x) dxi = ω

in E. This completes our proof of the problem.
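The construction in this problem can be illustrated numerically. In the sketch below (my own example; the potential g, the choice E = R3 and the helper names g, a and f are hypothetical), we take the exact 1-form ω = dg and check that the function f (x) = ∫_{[p,x]} ω computed from (10.160) recovers g up to the constant g(p).

import numpy as np
from scipy.integrate import quad

def g(x):                        # a sample potential on the convex set E = R^3
    return x[0]**2 * x[1] + np.sin(x[2])

def a(x):                        # coefficients a_i = D_i g, so omega = sum a_i dx_i is closed (and exact)
    return np.array([2.0 * x[0] * x[1], x[0]**2, np.cos(x[2])])

def f(x, p=np.zeros(3)):         # f(x) = int_{[p,x]} omega, via (10.160)
    x = np.asarray(x, dtype=float)
    return sum((x[i] - p[i]) * quad(lambda t, i=i: a((1.0 - t) * p + t * x)[i], 0.0, 1.0)[0]
               for i in range(3))

rng = np.random.default_rng(3)
for x in rng.normal(size=(5, 3)):
    assert abs(f(x) - (g(x) - g(np.zeros(3)))) < 1e-6
print("f(x) = int_[p,x] omega recovers the potential g up to the constant g(p)")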



Problem 10.25 Rudin Chapter 10 Exercise 25.

Proof. This problem relates the concepts "exactness" and "independence of path" of 1-forms. Suppose that

ω = \sum ai (x) dxi .

Furthermore, a space X is said to be locally connected at x if for every neighborhood U of x, there exists a connected neighborhood V such that x ∈ V ⊆ U.

Chapter 10. Integration of Differential Forms

320

If X is locally connected at each x, then X is called a locally connected space. We have the following result about locally connected spaces ([18, Theorem 25.3, p. 161]): Lemma 10.17 If X is a locally connected space, then each component of an open set U of X is also open in X. We know, by checking the definition of locally connectedness directly, that Rn is locally connected. Since the set E in our question is supposed to be open in Rn , Lemmas 10.16 and 10.17 imply that [ E= Eα , α

where each Eα is an open (in Rn ) connected component of E. We claim that ω is exact in Eα . To this end, fix pα ∈ Eα . Similar to Problem 10.24, we define a function fα : Eα → R by Z ω. fα (x) = [pα ,x]

For any x, y ∈ Eα , let γ = [x, y] − [pα , y] + [pα , x]. Then γ is a closed curve in Eα and the hypothesis shows that Z Z Z Z Z 0= ω= ω= ω− ω+ ω. γ

[x,y]−[pα ,y]+[pα ,x]

[x,y]

Thus we have fα (x) − fα (y) =

Z

[pα ,y]

[pα ,x]

ω

[x,y]

which is exactly the relation (10.158). By imitating the remaining part of the argument in Problem 10.24, we conclude that ω = dfα in Eα . Finally, if we define the function f : E → R such that the restriction of f to Eα is fα , i.e., f |Eα = fα , then we have ω = df in E. This ends the proof of the problem.



Problem 10.26 Rudin Chapter 10 Exercise 26.

Proof. We follow the given hint. Let E = R3 \ {0}. Then E is obviously open in R3 . Define γ : [0, 1] → E to be a closed curve in E, of class C ′ . Therefore, there is a 2-surface Φ : D → E such that ∂Φ = γ, where D is a compact subset of R2 . Since ω is a 1-form in E of class C ′ and Φ is of class C ′′ in E, it follows from Theorem 10.33 (Stokes' Theorem) and then the hypothesis dω = 0 that

∫_γ ω = ∫_{∂Φ} ω = ∫_Φ dω = 0.

By Problem 10.25, we establish that ω is exact in E = R3 \ {0}, finishing the proof of the problem. Problem 10.27 Rudin Chapter 10 Exercise 27.





Proof. Let E = (p1 , q1 ) × (p2 , q2 ) × (p3 , q3 ) be the open 3-cell in R3 . By Theorem 10.20 and then [21, Eqn. (59), p. 260], we have dλ = d(g1 dx + g2 dy) = ( dg1 ) ∧ dx + (−1)0 g1 ∧ d2 x + ( dg2 ) ∧ dy + (−1)0 g2 ∧ d2 y  ∂g  ∂g ∂g1 ∂g1  ∂g2 ∂g2  1 2 = dx + dy + dz ∧ dx + dx + dy + dz ∧ dy. ∂x ∂y ∂z ∂x ∂y ∂z

(10.161)

Since the anticommutative relation and dx ∧ dx = dy ∧ dy = 0 (see [21, Eqn. (42) & (43), p.256]), we obtain from the expression (10.161) that ∂g1 ∂g2 ∂g1 ∂g2 dy ∧ dx + dx ∧ dy + dz ∧ dx + dz ∧ dy ∂y ∂x ∂z ∂z  ∂g ∂g1 ∂g1  ∂g2 2 dx ∧ dy + = − dz ∧ dx + dz ∧ dy. ∂x ∂y ∂z ∂z

dλ =

(10.162)

When x and y are fixed, since f2 ∈ C ′ (E), we have f2 ∈ C ′ ((p3 , q3 )). In particular, f2 ∈ R on [c, z] and f2 is continuous at z, so Theorem 6.20 (First Fundamental Theorem of Calculus) implies that Z z Z y ∂g1 ∂ ∂ = f2 (x, y, s) ds − f3 (x, t, c) dt = f2 (x, y, z) − 0 = f2 (x, y, z). (10.163) ∂z ∂z c ∂z b Similarly, we have ∂g2 ∂ =− ∂z ∂z and ∂ ∂y



Z

Z

c

z

f1 (x, y, s) ds = −f1 (x, y, z)

y

f3 (x, t, c) dt

b

!

(10.164)

= −f3 (x, y, c).

(10.165)

Finally, we have to evaluate ∂ ∂x



Z

c

z

f1 (x, y, s) ds

!

and

∂ ∂y

Z

c

z

!

f2 (x, y, s) ds .

To do this, we need Theorem 9.42, so we have to check its hypotheses. For the first integral, we fix y and let F1 (x, s) = f1 (x, y, s). • Since F1 (x, s) is defined for a ≤ x ≤ r1 , c ≤ s ≤ r3 for some r1 < q1 and r3 < q3 . (This is the hypothesis (a) in Theorem 9.42.) • For every (fixed) x ∈ [a, r1 ], the condition f1 ∈ C ′ (E) implies that F1 (x, s) ∈ R on [c, r3 ]. (This is the hypothesis (c) in Theorem 9.42.) 1 • Since f1 ∈ C ′ (E), ∂F ∂s is a uniformly continuous function on [a, r1 ] × [c, r3 ]. By Definition 4.18, for every ǫ > 0, there exists a δ > 0 such that ∂ ∂ − F1 (x, s) F1 (x, s)