English Pages 247 [289] Year 2020
Introduction to Analysis in One Variable Michael Taylor Math. Dept., UNC E-mail address: [email protected]
2010 Mathematics Subject Classification. 26A03, 26A06, 26A09, 26A42 Key words and phrases. real numbers, complex numbers, irrational numbers, Euclidean space, metric spaces, compact spaces, Cauchy sequences, continuous function, power series, derivative, mean value theorem, Riemann integral, fundamental theorem of calculus, arclength, exponential function, logarithm, trigonometric functions, Euler’s formula, Weierstrass approximation theorem, Fourier series, Newton’s method
Contents

Preface
Some basic notation

Chapter 1. Numbers
 §1.1. Peano arithmetic
 §1.2. The integers
 §1.3. Prime factorization and the fundamental theorem of arithmetic
 §1.4. The rational numbers
 §1.5. Sequences
 §1.6. The real numbers
 §1.7. Irrational numbers
 §1.8. Cardinal numbers
 §1.9. Metric properties of R
 §1.10. Complex numbers

Chapter 2. Spaces
 §2.1. Euclidean spaces
 §2.2. Metric spaces
 §2.3. Compactness
 §2.4. The Baire category theorem

Chapter 3. Functions
 §3.1. Continuous functions
 §3.2. Sequences and series of functions
 §3.3. Power series
 §3.4. Spaces of functions
 §3.5. Absolutely convergent series

Chapter 4. Calculus
 §4.1. The derivative
 §4.2. The integral
 §4.3. Power series
 §4.4. Curves and arc length
 §4.5. The exponential and trigonometric functions
 §4.6. Unbounded integrable functions

Chapter 5. Further Topics in Analysis
 §5.1. Convolutions and bump functions
 §5.2. The Weierstrass approximation theorem
 §5.3. The Stone-Weierstrass theorem
 §5.4. Fourier series
 §5.5. Newton's method
 §5.6. Inner product spaces

Appendix A. Complementary results
 §A.1. The fundamental theorem of algebra
 §A.2. More on the power series of (1 − x)^b
 §A.3. π² is irrational
 §A.4. Archimedes' approximation to π
 §A.5. Computing π using arctangents
 §A.6. Power series for tan x
 §A.7. Abel's power series theorem
 §A.8. Continuous but nowhere-differentiable functions

Bibliography
Index
Preface
This is a text for students who have had a three-course calculus sequence, and who are ready for a course that explores the logical structure of this area of mathematics, which forms the backbone of analysis. It is intended for a one-semester course. An accompanying text, Introduction to Analysis in Several Variables [13], can be used in the second semester of a one-year sequence.

The main goal of Chapter 1 is to develop the real number system. We start with a treatment of the "natural numbers" N, obtaining its structure from a short list of axioms, the primary one being the principle of induction. Then we construct the set Z of all integers, which has a richer algebraic structure, and proceed to construct the set Q of rational numbers, which are quotients of integers (with a nonzero denominator). After discussing infinite sequences of rational numbers, including the notions of convergent sequences and Cauchy sequences, we construct the set R of real numbers, as ideal limits of Cauchy sequences of rational numbers. At the heart of this chapter is the proof that R is complete, i.e., Cauchy sequences of real numbers always converge to a limit in R. This provides the key to studying other metric properties of R, such as the compactness of (nonempty) closed, bounded subsets.

We end Chapter 1 with a section on the set C of complex numbers. Many introductions to analysis shy away from the use of complex numbers. My feeling is that this forecloses the study of way too many beautiful results that can be appreciated at this level. This is not a course in complex analysis. That is for another course, and with another text (such as [14]). However, I think the use of complex numbers in this text serves both to simplify the treatment of a number of key concepts, and to extend their scope in natural and useful ways.
In fact, the structure of analysis is revealed more clearly by moving beyond R and C, and we undertake this in Chapter 2. We start with a treatment of n-dimensional Euclidean space, R^n. There is a notion of Euclidean distance between two points in R^n, leading to notions of convergence and of Cauchy sequences. The spaces R^n are all complete, and again closed bounded sets are compact. Going through this sets one up to appreciate a further generalization, the notion of a metric space, introduced in §2.2. This is followed by §2.3, exploring the notion of compactness in a metric space setting.

Chapter 3 deals with functions. It starts in a general setting, of functions from one metric space to another. We then treat infinite sequences of functions, and study the notion of convergence, particularly of uniform convergence of a sequence of functions. We move on to infinite series. In such a case, we take the target space to be R^n, so we can add functions. Section 3.3 treats power series. Here, we study series of the form

(0.0.1)  ∑_{k=0}^∞ a_k (z − z_0)^k,

with a_k ∈ C and z running over a disk in C. For results obtained in this section, regarding the radius of convergence R and the continuity of the sum on D_R(z_0) = {z ∈ C : |z − z_0| < R}, there is no extra difficulty in allowing a_k and z to be complex, rather than insisting they be real, and the extra level of generality will pay big dividends in Chapter 4. One section in Chapter 3 is devoted to spaces of functions, illustrating the utility of studying spaces beyond the case of R^n.

Chapter 4 gets to the heart of the matter, a rigorous development of differential and integral calculus. We define the derivative in §4.1, and prove the Mean Value Theorem, making essential use of compactness of a closed, bounded interval and its consequences, established in earlier chapters. This result has many important consequences, such as the Inverse Function Theorem, and especially the Fundamental Theorem of Calculus, established in §4.2, after the Riemann integral is introduced. In §4.3, we return to power series, this time of the form

(0.0.2)  ∑_{k=0}^∞ a_k (t − t_0)^k.

We require t and t_0 to be in R, but still allow a_k ∈ C. Results on radius of convergence R and continuity of the sum f(t) on (t_0 − R, t_0 + R) follow from material in Chapter 3. The essential new result in §4.3 is that one can obtain the derivative f′(t) by differentiating the power series for f(t) term by term. In §4.4 we consider curves in R^n, and obtain a formula for arc length for a smooth curve. We show that a smooth curve with nonvanishing
velocity can be parametrized by arc length. When this is applied to the unit circle in R² centered at the origin, one is looking at the standard definition of the trigonometric functions,

(0.0.3)  C(t) = (cos t, sin t).

We provide a demonstration that

(0.0.4)  C′(t) = (− sin t, cos t)

that is much shorter than what is usually presented in calculus texts.

In §4.5 we move on to exponential functions. We derive the power series for the function e^t, introduced to solve the differential equation dx/dt = x. We then observe that with no extra work we get an analogous power series for e^{at}, with derivative a e^{at}, and that this works for complex a as well as for real a. It is a short step to realize that e^{it} is a unit speed curve tracing out the unit circle in C ≈ R², so comparison with (0.0.3) gives Euler's formula

(0.0.5)  e^{it} = cos t + i sin t.

That the derivative of e^{it} is i e^{it} provides a second proof of (0.0.4). Thus we have a unified treatment of the exponential and trigonometric functions, carried out further in §4.5, with details developed in numerous exercises. Section 4.6 extends the scope of the Riemann integral to a class of unbounded functions.

Chapter 5 treats further topics in analysis. The topics center around approximating functions, via various infinite sequences or series. Topics include polynomial approximation of continuous functions, Fourier series, and Newton's method for approximating the inverse of a given function.

We end with a collection of appendices, covering various results related to material in Chapters 4–5. The first one gives a proof of the fundamental theorem of algebra, that every nonconstant polynomial has a complex root. The second explores the power series of (1 − x)^b, in more detail than done in §4.3, of use in §5.2. There follow three appendices on the nature of π and its numerical evaluation, an appendix on the power series of tan x, and one on a theorem of Abel on infinite series, and related results. We also study continuous functions on R that are nowhere differentiable.

Our approach to the foundations of analysis, outlined above, has some distinctive features, which we point out here.

1) Approach to numbers. We do not take an axiomatic approach to the presentation of the real numbers. Rather than hypothesizing that R has specified algebraic and metric properties, we build R from more basic objects
(natural numbers, integers, rational numbers) and produce results on its algebraic and metric properties as propositions, rather than as axioms. In addition, we do not shy away from the use of complex numbers. The simplifications this use affords range from amusing (construction of a regular pentagon) to profound (Euler's identity, computing the Dirichlet kernel in Fourier series), and such uses of complex numbers can be readily appreciated by a student at the level of this sort of analysis course.

2) Spaces and geometrical concepts. We emphasize the use of geometrical properties of n-dimensional Euclidean space, R^n, as an important extension of metric properties of the line and the plane. Going further, we introduce the notion of metric spaces early on, as a natural extension of the class of Euclidean spaces. For one interested in functions of one real variable, it is very useful to encounter such functions taking values in R^n (i.e., curves), and to encounter spaces of functions of one variable (a significant class of metric spaces). One implementation of this approach involves defining the exponential function for complex arguments and making a direct geometrical study of e^{it}, for real t. This allows for a self-contained treatment of the trigonometric functions, not relying on how this topic might have been covered in a previous course, and in particular for a derivation of the Euler identity that is very much different from what one typically sees.

We follow this introduction with a record of some standard notation that will be used throughout this text.

Acknowledgment. During the preparation of this book, I have been supported by a number of NSF grants, most recently DMS-1500817.
Some basic notation
R is the set of real numbers.

C is the set of complex numbers.

Z is the set of integers.

Z+ is the set of integers ≥ 0.

N is the set of integers ≥ 1 (the "natural numbers").

Q is the set of rational numbers.

x ∈ R means x is an element of R, i.e., x is a real number.

(a, b) denotes the set of x ∈ R such that a < x < b.

[a, b] denotes the set of x ∈ R such that a ≤ x ≤ b.

{x ∈ R : a ≤ x ≤ b} denotes the set of x in R such that a ≤ x ≤ b.

[a, b) = {x ∈ R : a ≤ x < b} and (a, b] = {x ∈ R : a < x ≤ b}.
z̄ = x − iy if z = x + iy ∈ C, x, y ∈ R.

Ω̄ denotes the closure of the set Ω.

f : A → B denotes that the function f takes points in the set A to points in B. One also says f maps A to B.

x → x_0 means the variable x tends to the limit x_0.

f(x) = O(x) means f(x)/x is bounded. Similarly g(ε) = O(ε^k) means g(ε)/ε^k is bounded.

f(x) = o(x) as x → 0 (resp., x → ∞) means f(x)/x → 0 as x tends to the specified limit.

S = sup_n |a_n| means S is the smallest real number that satisfies S ≥ |a_n| for all n. If there is no such real number, then we take S = +∞.

lim sup_{k→∞} |a_k| = lim_{n→∞} ( sup_{k≥n} |a_k| ).
Chapter 1
Numbers
One foundation for a course in analysis is a solid understanding of the real number system. Texts vary on just how to achieve this. Some take an axiomatic approach. In such an approach, the set of real numbers is hypothesized to have a number of properties, including various algebraic properties satisfied by addition and multiplication, order axioms, and, crucially, the completeness property, sometimes expressed as the supremum property.

This is not the approach we will take. Rather, we will start with a small list of axioms for the natural numbers (i.e., the positive integers), and then build the rest of the edifice logically, obtaining the basic properties of the real number system, particularly the completeness property, as theorems.

Sections 1.1–1.3 deal with the integers, starting in §1.1 with the set N of natural numbers. The development proceeds from axioms of G. Peano. The main one is the principle of mathematical induction. We deduce basic results about integer arithmetic from these axioms. A high point is the fundamental theorem of arithmetic, presented in §1.3.

Section 1.4 discusses the set Q of rational numbers, deriving the basic algebraic properties of these numbers from the results of §§1.1–1.3. Section 1.5 provides a bridge between §1.4 and §1.6. It deals with infinite sequences, including convergent sequences and "Cauchy sequences." This prepares the way for §1.6, the main section of this chapter. Here we construct the set R of real numbers, as "ideal limits" of rational numbers. We extend basic algebraic results from Q to R. Furthermore, we establish the result that R is "complete," i.e., Cauchy sequences in R always have limits. Section 1.7 provides examples of irrational numbers, such as √2, √3, √5, ...
Section 1.8 deals with cardinal numbers, an extension of the natural numbers N, that can be used to "count" elements of a set, not necessarily finite. For example, N is a "countably" infinite set, and so is Q. We show that R is "uncountable," and hence much larger than N or Q.

Section 1.9 returns to the real number line R, and establishes further metric properties of R and various subsets, with an emphasis on the notion of compactness. The completeness property established in §1.6 plays a crucial role here. Section 1.10 introduces the set C of complex numbers and establishes basic algebraic and metric properties of C. While some introductory treatments of analysis avoid complex numbers, we embrace them, and consider their use in basic analysis too precious to omit.

Sections 1.9 and 1.10 also have material on continuous functions, defined on a subset of R or C, respectively. These results give a taste of further results to be developed in Chapter 3, which will be essential to material in Chapters 4 and 5.
1.1. Peano arithmetic

In Peano arithmetic, we assume we have a set N (the natural numbers). We assume given 0 ∉ N, and form Ñ = N ∪ {0}. We assume there is a map

(1.1.1)  s : Ñ → N,

which is bijective. That is to say, for each k ∈ N, there is a j ∈ Ñ such that s(j) = k, so s is surjective; and furthermore, if s(j) = s(j′) then j = j′, so s is injective. The map s plays the role of "addition by 1," as we will see below. The only other axiom of Peano arithmetic is that the principle of mathematical induction holds. In other words, if S ⊂ Ñ is a set with the properties

(1.1.2)  0 ∈ S,  k ∈ S ⇒ s(k) ∈ S,

then S = Ñ.

Actually, applying the induction principle to S = {0} ∪ s(Ñ), we see that it suffices to assume that s in (1.1.1) is injective; the induction principle ensures that it is surjective.

We define addition x + y, for x, y ∈ Ñ, inductively on y, by

(1.1.3)  x + 0 = x,  x + s(y) = s(x + y).

Next, we define multiplication x · y, inductively on y, by

(1.1.4)  x · 0 = 0,  x · s(y) = x · y + x.

We also define

(1.1.5)  1 = s(0).
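The inductive definitions (1.1.3)–(1.1.5) are directly executable. The following toy sketch, not part of the text's development, assumes an invented encoding in which an element of Ñ is a nested tuple whose nesting depth is the number it represents; all the names (`s`, `add`, `mul`, `encode`, `decode`) are hypothetical, chosen for this illustration.

```python
# A toy model of the Peano definitions (1.1.1)-(1.1.5). An element of
# N-tilde = N ∪ {0} is encoded as a nested tuple; nesting depth = the number.

ZERO = ()                # 0, the element adjoined to N

def s(x):
    """The successor map of (1.1.1): "addition by 1"."""
    return (x,)

ONE = s(ZERO)            # (1.1.5): 1 = s(0)

def add(x, y):
    """x + y, defined inductively on y as in (1.1.3)."""
    if y == ZERO:
        return x                    # x + 0 = x
    return s(add(x, y[0]))          # x + s(y') = s(x + y')

def mul(x, y):
    """x · y, defined inductively on y as in (1.1.4)."""
    if y == ZERO:
        return ZERO                 # x · 0 = 0
    return add(mul(x, y[0]), x)     # x · s(y') = x · y' + x

def encode(n):
    """Ordinary int -> nested-tuple numeral."""
    x = ZERO
    for _ in range(n):
        x = s(x)
    return x

def decode(x):
    """Nested-tuple numeral -> ordinary int."""
    n = 0
    while x != ZERO:
        x, n = x[0], n + 1
    return n
```

For instance, `decode(add(encode(2), encode(3)))` evaluates to 5, and the commutativity proved below in Proposition 1.1.4 can be spot-checked by comparing `add(a, b)` with `add(b, a)` on small inputs.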
We now establish the basic laws of arithmetic.

Proposition 1.1.1. x + 1 = s(x).

Proof. x + 1 = x + s(0) = s(x + 0) = s(x).

Proposition 1.1.2. 0 + x = x.

Proof. Use induction on x. First, 0 + 0 = 0. Now, assuming 0 + x = x, we have 0 + s(x) = s(0 + x) = s(x).

Proposition 1.1.3. s(y + x) = s(y) + x.

Proof. Use induction on x. First, s(y + 0) = s(y) = s(y) + 0. Next, we have

s(y + s(x)) = s(s(y + x)),  s(y) + s(x) = s(s(y) + x).

If s(y + x) = s(y) + x, the two right sides are equal, so the two left sides are equal, completing the induction.

Proposition 1.1.4. x + y = y + x.

Proof. Use induction on y. The case y = 0 follows from Proposition 1.1.2. Now, assuming x + y = y + x for all x ∈ Ñ, we must show s(y) has the same property. In fact, x + s(y) = s(x + y) = s(y + x), and by Proposition 1.1.3 the last quantity is equal to s(y) + x.

Proposition 1.1.5. (x + y) + z = x + (y + z).

Proof. Use induction on z. First, (x + y) + 0 = x + y = x + (y + 0). Now, assuming (x + y) + z = x + (y + z) for all x, y ∈ Ñ, we must show s(z) has the same property. In fact,

(x + y) + s(z) = s((x + y) + z),  x + (y + s(z)) = x + s(y + z) = s(x + (y + z)),

and we perceive the desired identity.
Remark. Propositions 1.1.4 and 1.1.5 state the commutative and associative laws for addition.

We now establish some laws for multiplication.

Proposition 1.1.6. x · 1 = x.

Proof. We have x · s(0) = x · 0 + x = 0 + x = x, the last identity by Proposition 1.1.2.

Proposition 1.1.7. 0 · y = 0.

Proof. Use induction on y. First, 0 · 0 = 0. Next, assuming 0 · y = 0, we have 0 · s(y) = 0 · y + 0 = 0 + 0 = 0.

Proposition 1.1.8. s(x) · y = x · y + y.

Proof. Use induction on y. First, s(x) · 0 = 0, while x · 0 + 0 = 0 + 0 = 0. Next, assuming s(x) · y = x · y + y, for all x, we must show that s(y) has this property. In fact,

s(x) · s(y) = s(x) · y + s(x) = (x · y + y) + (x + 1),
x · s(y) + s(y) = (x · y + x) + (y + 1),

and the identity then follows via the commutative and associative laws of addition, Propositions 1.1.4 and 1.1.5.

Proposition 1.1.9. x · y = y · x.

Proof. Use induction on y. First, x · 0 = 0 = 0 · x, the latter identity by Proposition 1.1.7. Next, assuming x · y = y · x for all x ∈ Ñ, we must show that s(y) has the same property. In fact,

x · s(y) = x · y + x = y · x + x,
s(y) · x = y · x + x,

the last identity by Proposition 1.1.8.

Proposition 1.1.10. (x + y) · z = x · z + y · z.

Proof. Use induction on z. First, the identity clearly holds for z = 0. Next, assuming it holds for z (for all x, y ∈ Ñ), we must show it holds for s(z). In fact,

(x + y) · s(z) = (x + y) · z + (x + y) = (x · z + y · z) + (x + y),
x · s(z) + y · s(z) = (x · z + x) + (y · z + y),

and the desired identity follows from the commutative and associative laws of addition.

Proposition 1.1.11. (x · y) · z = x · (y · z).

Proof. Use induction on z. First, the identity clearly holds for z = 0. Next, assuming it holds for z (for all x, y ∈ Ñ), we have

(x · y) · s(z) = (x · y) · z + x · y,

while

x · (y · s(z)) = x · (y · z + y) = x · (y · z) + x · y,

the last identity by Proposition 1.1.10 (and 1.1.9). These observations yield the desired identity.

Remark. Propositions 1.1.9 and 1.1.11 state the commutative and associative laws for multiplication. Proposition 1.1.10 is the distributive law. Combined with Proposition 1.1.9, it also yields z · (x + y) = z · x + z · y, used above.

We next demonstrate the cancellation law of addition:

Proposition 1.1.12. Given x, y, z ∈ Ñ,

(1.1.6)  x + y = z + y =⇒ x = z.

Proof. Use induction on y. If y = 0, (1.1.6) obviously holds. Assuming (1.1.6) holds for y, we must show that

(1.1.7)  x + s(y) = z + s(y)

implies x = z. In fact, (1.1.7) is equivalent to s(x + y) = s(z + y). Since the map s is assumed to be one-to-one, this implies that x + y = z + y, so we are done.

We next define an order relation on Ñ. Given x, y ∈ Ñ, we say

(1.1.8)  x < y ⇐⇒ y = x + u, for some u ∈ N.

Similarly there is a definition of x ≤ y. We have x ≤ y if and only if y ∈ Rx, where

(1.1.9)  Rx = {x + u : u ∈ Ñ}.

Other notation is

y > x ⇐⇒ x < y,  y ≥ x ⇐⇒ x ≤ y.
Proposition 1.1.13. If x ≤ y and y ≤ x, then x = y.

Proof. The hypotheses imply

(1.1.10)  y = x + u,  x = y + v,  u, v ∈ Ñ.

Hence x = x + u + v, so, by Proposition 1.1.12, u + v = 0. Now, if v ≠ 0, then v = s(w), so u + v = s(u + w) ∈ N. Thus v = 0, and u = 0.

Proposition 1.1.14. Given x, y ∈ Ñ, either

(1.1.11)  x < y,  or  x = y,  or  y < x,

and no two can hold.

Proof. That no two of (1.1.11) can hold follows from Proposition 1.1.13. It remains to show that one must hold. Take y ∈ Ñ. We will establish (1.1.11) by induction on x. Clearly (1.1.11) holds for x = 0. We need to show that if (1.1.11) holds for a given x ∈ Ñ, then either

(1.1.12)  s(x) < y,  or  s(x) = y,  or  y < s(x).

Consider the three possibilities in (1.1.11). If either y = x or y < x, then clearly y < s(x) = x + 1. On the other hand, if x < y, we can use the implication

(1.1.13)  x < y =⇒ s(x) ≤ y

to complete the proof of (1.1.12). See Lemma 1.1.17 for a proof of (1.1.13).

We can now establish the cancellation law for multiplication.

Proposition 1.1.15. Given x, y, z ∈ Ñ,

(1.1.14)  x · y = x · z, x ≠ 0 =⇒ y = z.

Proof. If y ≠ z, then either y < z or z < y. Suppose y < z, i.e., z = y + u, u ∈ N. Then the hypotheses of (1.1.14) imply

x · y = x · y + x · u,  x ≠ 0,

hence, by Proposition 1.1.12,

(1.1.15)  x · u = 0,  x ≠ 0.

We thus need to show that (1.1.15) implies u = 0. In fact, if not, then we can write u = s(w) and x = s(a), with w, a ∈ Ñ, and we have

(1.1.16)  x · u = x · s(w) = x · w + x = x · w + s(a) = s(x · w + a) ∈ N.

This contradicts (1.1.15), so we are done.
Remark. Note that (1.1.16) implies (1.1.17)
x, y ∈ N =⇒ x · y ∈ N.
We next establish the following variant of the principle of induction, called the well-ordering property of Ñ.

Proposition 1.1.16. If T ⊂ Ñ is nonempty, then T contains a smallest element.

Proof. Suppose T contains no smallest element. Then 0 ∉ T. Let

(1.1.18)  S = {x ∈ Ñ : x < y, ∀ y ∈ T}.

Then 0 ∈ S. We claim that

(1.1.19)  x ∈ S =⇒ s(x) ∈ S.

Indeed, suppose x ∈ S, so x < y for all y ∈ T. If s(x) ∉ S, we have s(x) ≥ y0 for some y0 ∈ T. On the other hand (see Lemma 1.1.17 below),

(1.1.20)  x < y0 =⇒ s(x) ≤ y0.

Thus, by Proposition 1.1.13,

(1.1.21)  s(x) = y0.

It follows that y0 must be the smallest element of T. Thus, if T has no smallest element, (1.1.19) must hold. The induction principle then implies that S = Ñ, which implies T is empty.

Here is the result behind (1.1.13) and (1.1.20).

Lemma 1.1.17. Given x, y ∈ Ñ,

(1.1.22)  x < y =⇒ s(x) ≤ y.

Proof. Indeed, x < y ⇒ y = x + u with u ∈ N, hence u = s(v), so

y = x + s(v) = s(x + v) = s(x) + v,

hence s(x) ≤ y.

Remark. Proposition 1.1.16 has a converse, namely, the assertion

(1.1.23)  T ⊂ Ñ nonempty =⇒ T contains a smallest element

implies the principle of induction:

(1.1.24)  (0 ∈ S ⊂ Ñ, k ∈ S ⇒ s(k) ∈ S) =⇒ S = Ñ.
To see this, suppose S satisfies the hypotheses of (1.1.24), and let T = Ñ \ S. If S ≠ Ñ, then T is nonempty, so (1.1.23) implies T has a smallest element, say x1. Since 0 ∈ S, x1 ∈ N, so x1 = s(x0), and we must have

(1.1.25)  x0 ∈ S,  s(x0) ∈ T = Ñ \ S,

contradicting the hypotheses of (1.1.24).

Exercises

Given n ∈ N, we define ∑_{k=1}^n a_k inductively, as follows.

(1.1.26)  ∑_{k=1}^1 a_k = a_1,  ∑_{k=1}^{n+1} a_k = (∑_{k=1}^n a_k) + a_{n+1}.

Use the principle of induction to establish the following identities.

1. Linear series

(1.1.27)  2 ∑_{k=1}^n k = n(n + 1).

2. Quadratic series

(1.1.28)  6 ∑_{k=1}^n k^2 = n(n + 1)(2n + 1).

3. Geometric series

(1.1.29)  (a − 1) ∑_{k=1}^n a^k = a^{n+1} − a,  if a ≠ 1.

Here, we define the powers a^n inductively by

(1.1.30)  a^1 = a,  a^{n+1} = a^n · a.

4. We also set a^0 = 1 if a ∈ N. Verify that ∑_{k=0}^n a_k = a_0 + ∑_{k=1}^n a_k, and

(1.1.31)  (a − 1) ∑_{k=0}^n a^k = a^{n+1} − 1,  if a ≠ 1.

5. Given k ∈ N, show that

2^k ≥ 2k,

with strict inequality for k > 1.
6. Show that, for x, x′, y, y′ ∈ Ñ,

x < x′, y ≤ y′ =⇒ x + y < x′ + y′,

and x · y < x′ · y′, if also y′ > 0.

7. Show that the following variant of the principle of induction holds:

(1 ∈ S ⊂ N, k ∈ S ⇒ s(k) ∈ S) =⇒ S = N.

Hint. Consider {0} ∪ S ⊂ Ñ.

More generally, with Rx as in (1.1.9), show that, for x ∈ N,

(x ∈ S ⊂ Rx, k ∈ S ⇒ s(k) ∈ S) =⇒ S = Rx.

Hint. Use induction on x.

8. With a^n defined inductively as in (1.1.30) for a ∈ Ñ, n ∈ N, show that, if also m ∈ N,

a^m a^n = a^{m+n},  (a^m)^n = a^{mn}.

Hint. Use induction on n.
1.2. The integers

An integer is thought of as having the form x − a, with x, a ∈ Ñ. To be more formal, we will define an element of Z as an equivalence class of ordered pairs (x, a), x, a ∈ Ñ, where we define

(1.2.1)  (x, a) ∼ (y, b) ⇐⇒ x + b = y + a.

We claim (1.2.1) is an equivalence relation. In general, an equivalence relation on a set S is a specification s ∼ t for certain s, t ∈ S, which satisfies the following three conditions.

(a) Reflexive. s ∼ s, ∀ s ∈ S.
(b) Symmetric. s ∼ t ⇐⇒ t ∼ s.
(c) Transitive. s ∼ t, t ∼ u =⇒ s ∼ u.

We will encounter various equivalence relations in this and subsequent sections. Generally, (a) and (b) are quite easy to verify, and we will concentrate on verifying (c).

Proposition 1.2.1. The relation (1.2.1) is an equivalence relation.

Proof. We need to check that

(1.2.2)  (x, a) ∼ (y, b), (y, b) ∼ (z, c) =⇒ (x, a) ∼ (z, c),
i.e., that, for x, y, z, a, b, c ∈ Ñ,

(1.2.3)  x + b = y + a, y + c = z + b =⇒ x + c = z + a.

In fact, the hypotheses of (1.2.3), and the results of §1.1, imply

(x + c) + (y + b) = (z + a) + (y + b),

and the conclusion of (1.2.3) then follows from the cancellation property, Proposition 1.1.12.

Let us denote the equivalence class containing (x, a) by [(x, a)]. We then define addition and multiplication in Z to satisfy

(1.2.4)  [(x, a)] + [(y, b)] = [(x, a) + (y, b)],  [(x, a)] · [(y, b)] = [(x, a) · (y, b)],

where

(x, a) + (y, b) = (x + y, a + b),  (x, a) · (y, b) = (xy + ab, ay + xb).

To see that these operations are well defined, we need:

Proposition 1.2.2. If (x, a) ∼ (x′, a′) and (y, b) ∼ (y′, b′), then

(1.2.5)
(x, a) + (y, b) ∼ (x′ , a′ ) + (y ′ , b′ ),
and (1.2.6)
(x, a) · (y, b) ∼ (x′ , a′ ) · (y ′ , b′ ).
Proof. The hypotheses say (1.2.7)
x + a′ = x′ + a,
y + b′ = y ′ + b.
The conclusions follow from results of §1.1. In more detail, adding the two identities in (1.2.7) gives

x + a′ + y + b′ = x′ + a + y′ + b,

and rearranging, using the commutative and associative laws of addition, yields

(x + y) + (a′ + b′) = (x′ + y′) + (a + b),

implying (1.2.5). The task of proving (1.2.6) is simplified by going through the intermediate step

(1.2.8)
(x, a) · (y, b) ∼ (x′ , a′ ) · (y, b).
If x′ > x, so x′ = x + u, u ∈ N, then also a′ = a + u, and our task is to prove (xy + ab, ay + xb) ∼ (xy + uy + ab + ub, ay + uy + xb + ub), which is readily done. Having (1.2.8), we apply similar reasoning to get (x′ , a′ ) · (y, b) ∼ (x′ , a′ ) · (y ′ , b′ ), and then (1.2.6) follows by transitivity.
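The pair arithmetic of (1.2.4), together with the equivalence (1.2.1), can be modeled directly. In the sketch below, which assumes ordinary nonnegative ints as stand-ins for elements of Ñ and invents all the function names (`canon`, `equiv`, `add_pairs`, `mul_pairs`, `neg`), each class [(x, a)] is stored via the canonical representative obtained by cancelling the common part min(x, a), so it has the form (u, 0), (0, 0), or (0, v).

```python
# A sketch of the construction of Z as classes of pairs (x, a), x, a in N-tilde.

def canon(p):
    """Canonical representative of the class [(x, a)]."""
    x, a = p
    m = min(x, a)
    return (x - m, a - m)

def equiv(p, q):
    """(x, a) ~ (y, b) <=> x + b = y + a, as in (1.2.1)."""
    return p[0] + q[1] == q[0] + p[1]

def add_pairs(p, q):
    """(x, a) + (y, b) = (x + y, a + b), as in (1.2.4)."""
    return canon((p[0] + q[0], p[1] + q[1]))

def mul_pairs(p, q):
    """(x, a) · (y, b) = (xy + ab, ay + xb), as in (1.2.4)."""
    (x, a), (y, b) = p, q
    return canon((x * y + a * b, a * y + x * b))

def neg(p):
    """-[(x, a)] = [(a, x)], as in (1.2.11) below."""
    return canon((p[1], p[0]))
```

Here (2, 5) represents the integer 2 − 5 = −3; for example, `mul_pairs((2, 5), (4, 0))` returns (0, 12), matching (−3) · 4 = −12.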
Similarly, it is routine to verify the basic commutative, associative, etc. laws incorporated in the next proposition. To formulate the results, set

(1.2.9)  m = [(x, a)], n = [(y, b)], k = [(z, c)] ∈ Z.

Also, define

(1.2.10)  0 = [(0, 0)],  1 = [(1, 0)],

and

(1.2.11)  −m = [(a, x)].

Proposition 1.2.3. We have

(1.2.12)
m + n = n + m,  (m + n) + k = m + (n + k),
m + 0 = m,  m + (−m) = 0,
mn = nm,  m(nk) = (mn)k,
m · 1 = m,  m · 0 = 0,
m · (−1) = −m,  m · (n + k) = m · n + m · k.

To give an example of a demonstration of these results, the identity mn = nm is equivalent to

(xy + ab, ay + xb) ∼ (yx + ba, bx + ya).

In fact, commutative laws for addition and multiplication in Ñ imply xy + ab = yx + ba and ay + xb = bx + ya. Verification of the other identities in (1.2.12) is left to the reader.

We next establish the cancellation law for addition in Z.

Proposition 1.2.4. Given m, n, k ∈ Z,

(1.2.13)
m + n = k + n =⇒ m = k.
Proof. We give two proofs. For one, we can add −n to both sides and use the results of Proposition 1.2.3. Alternatively, we can write the hypotheses of (1.2.13) as x+y+c+b=z+y+a+b and use Proposition 1.1.12 to deduce that x + c = z + a.
Note that it is reasonable to set

(1.2.14)  m − n = m + (−n).

This defines subtraction on Z.

There is a natural injection

(1.2.15)  N ↪ Z,  x ↦ [(x, 0)],

whose image we identify with N. Note that the map (1.2.15) preserves addition and multiplication. There is also an injection x ↦ [(0, x)], whose image we identify with −N.

Proposition 1.2.5. We have a disjoint union:

(1.2.16)  Z = N ∪ {0} ∪ (−N).

Proof. Suppose m ∈ Z; write m = [(x, a)]. By Proposition 1.1.14, either

a < x,  or  x = a,  or  x < a.

In these three cases,

x = a + u, u ∈ N,  or  x = a,  or  a = x + v, v ∈ N.

Then, either

(x, a) ∼ (u, 0),  or  (x, a) ∼ (0, 0),  or  (x, a) ∼ (0, v).

We define an order on Z by:

(1.2.17)  m < n ⇐⇒ n − m ∈ N.

We then have:

Corollary 1.2.6. Given m, n ∈ Z, then either

(1.2.18)  m < n,  or  m = n,  or  n < m,

and no two can hold.

The map (1.2.15) is seen to preserve order relations. Another consequence of (1.2.16) is the following.

Proposition 1.2.7. If m, n ∈ Z and m · n = 0, then either m = 0 or n = 0.

Proof. Suppose m ≠ 0 and n ≠ 0. We have four cases:

m > 0, n > 0 =⇒ mn > 0,
m < 0, n < 0 =⇒ mn = (−m)(−n) > 0,
m > 0, n < 0 =⇒ mn = −m(−n) < 0,
m < 0, n > 0 =⇒ mn = −(−m)n < 0,
the first by (1.1.17), and the rest with the help of Exercise 3 below. This finishes the proof.

Using Proposition 1.2.7, we have the following cancellation law for multiplication in Z.

Proposition 1.2.8. Given m, n, k ∈ Z,

(1.2.19)  mk = nk, k ≠ 0 =⇒ m = n.

Proof. First, mk = nk ⇒ mk − nk = 0. Now mk − nk = (m − n)k; see Exercise 3 below. Hence mk = nk =⇒ (m − n)k = 0. Given k ≠ 0, Proposition 1.2.7 implies m − n = 0. Hence m = n.
Exercises

1. Verify Proposition 1.2.3.

2. We define ∑_{k=1}^n a_k as in (1.1.26), this time with a_k ∈ Z. We also define a^k inductively as in Exercise 3 of §1.1, with a^0 = 1 if a ≠ 0. Use the principle of induction to establish the identity

∑_{k=1}^n (−1)^{k−1} k = −m  if n = 2m,
                       m + 1  if n = 2m + 1.

3. Show that, if m, n, k ∈ Z,

−(nk) = (−n)k,  and  mk − nk = (m − n)k.

Hint. For the first part, use Proposition 1.2.3 to show that nk + (−n)k = 0. Alternatively, compare (a, x) · (y, b) with (x, a) · (y, b).

4. Deduce the following from Proposition 1.1.16. Let S ⊂ Z be nonempty and assume there exists m ∈ Z such that m < n for all n ∈ S. Then S has a smallest element.
Hint. Given such m, let S̃ = {(−m) + n : n ∈ S}. Show that S̃ ⊂ N and deduce that S̃ has a smallest element.

5. Show that Z has no smallest element.
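Exercise 2's alternating-sum identity is easy to check numerically for small n; a sanity check only, since the exercise asks for a proof by induction. The function name below is invented for this sketch.

```python
# Spot-check of the identity in Exercise 2 of this section's exercises.

def alt_sum(n):
    """Compute 1 - 2 + 3 - ... ± n, i.e. the sum of (-1)^(k-1) k for k = 1..n."""
    return sum((-1) ** (k - 1) * k for k in range(1, n + 1))

for m in range(0, 100):
    assert alt_sum(2 * m) == -m           # n = 2m
    assert alt_sum(2 * m + 1) == m + 1    # n = 2m + 1
```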
1.3. Prime factorization and the fundamental theorem of arithmetic

Let x ∈ N. We say x is composite if one can write

(1.3.1)
x = ab,
a, b ∈ N,
with neither a nor b equal to 1. If x ̸= 1 is not composite, it is said to be prime. If (1.3.1) holds, we say a|x (and that b|x), or that a is a divisor of x. Given x ∈ N, x > 1, set (1.3.2)
Dx = {a ∈ N : a|x, a > 1}.
Thus x ∈ Dx , so Dx is non-empty. By Proposition 1.1.16, Dx contains a smallest element, say p1 . Clearly p1 is a prime. Set (1.3.3)
x = p1 x1,  x1 ∈ N,  x1 < x.
The same construction applies to x1 , which is > 1 unless x = p1 . Hence we have either x = p1 or (1.3.4)
x1 = p2 x2 ,
p2 prime , x2 < x1 .
Continue this process, passing from xj to xj+1 as long as xj is not prime. The set S of such xj ∈ N has a smallest element, say xµ−1 = pµ , and we have (1.3.5)
x = p1 p2 · · · pµ ,
pj prime.
This is part of the Fundamental Theorem of Arithmetic:

Theorem 1.3.1. Given x ∈ N, x ≠ 1, there is a unique product expansion

(1.3.6)  x = p1 · · · pµ,

where p1 ≤ · · · ≤ pµ are primes.

Only uniqueness remains to be established. This follows from:

Proposition 1.3.2. Assume a, b ∈ N, and p ∈ N is prime. Then

(1.3.7)
p|ab =⇒ p|a or p|b.
We will deduce this from:

Proposition 1.3.3. If p ∈ N is prime and a ∈ N is not a multiple of p, or more generally if p, a ∈ N have no common divisors > 1, then there exist m, n ∈ Z such that

(1.3.8)
ma + np = 1.
Proof of Proposition 1.3.2. Assume p is a prime that does not divide a. Pick m, n such that (1.3.8) holds. Multiplying (1.3.8) by b gives mab + npb = b. Thus, if p|ab, i.e., ab = pk, we have
p(mk + nb) = b,
so p|b, as desired.

To prove Proposition 1.3.3, let us set
(1.3.9) Γ = {ma + np : m, n ∈ Z}.
Clearly Γ satisfies the following criterion.

Definition. A nonempty subset Γ ⊂ Z is a subgroup of Z provided
(1.3.10) a, b ∈ Γ =⇒ a + b, a − b ∈ Γ.

Proposition 1.3.4. If Γ ⊂ Z is a subgroup, then either Γ = {0}, or there exists x ∈ N such that
(1.3.11) Γ = {mx : m ∈ Z}.
Proof. Note that n ∈ Γ ⇔ −n ∈ Γ, so, with Σ = Γ ∩ N, we have a disjoint union Γ = Σ ∪ {0} ∪ (−Σ). If Σ ̸= ∅, let x be its smallest element. Then we want to establish (1.3.11), so set Γ0 = {mx : m ∈ Z}. Clearly Γ0 ⊂ Γ. Similarly, set Σ0 = {mx : m ∈ N} = Γ0 ∩ N. We want to show that Σ0 = Σ. If y ∈ Σ \ Σ0 , then we can pick m0 ∈ N such that m0 x < y < (m0 + 1)x, and hence y − m0 x ∈ Σ is smaller than x. This contradiction proves Proposition 1.3.4.
Proof of Proposition 1.3.3. Taking Γ as in (1.3.9), pick x ∈ N such that (1.3.11) holds. Since a ∈ Γ and p ∈ Γ, we have
a = m0 x, p = m1 x,
for some mj ∈ Z. The assumption that a and p have no common divisor > 1 implies x = 1. We conclude that 1 ∈ Γ, so (1.3.8) holds.

Exercises
1. Prove that there are infinitely many primes.
Hint. If {p1, . . . , pm} were a complete list of primes, consider x = p1 · · · pm + 1. What are its prime factors?

2. Referring to (1.3.10), show that a nonempty subset Γ ⊂ Z is a subgroup of Z provided
(1.3.12) a, b ∈ Γ =⇒ a − b ∈ Γ.
Hint. Given (1.3.12), a ∈ Γ ⇒ 0 = a − a ∈ Γ ⇒ −a = 0 − a ∈ Γ.

3. Let n ∈ N be a 12-digit integer. Show that if n is not prime, then it must be divisible by a prime p < 10^6.

4. Determine whether the following number is prime:
(1.3.13) 201367.
Hint. This is for the student who can use a computer.

5. Find the smallest prime larger than the number in (1.3.13).
Hint. Same as above.
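Exercises 3–5 invite a computation. As in Exercise 3, to test n it suffices to try divisors up to √n; here is a sketch (function names are ours):

```python
def is_prime(n):
    """Trial division: n > 1 is prime iff no d with 1 < d*d <= n divides it."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def next_prime(n):
    """Smallest prime strictly larger than n."""
    n += 1
    while not is_prime(n):
        n += 1
    return n

print(is_prime(201367))    # settles Exercise 4
print(next_prime(201367))  # settles Exercise 5
```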
1.4. The rational numbers

A rational number is thought of as having the form m/n, with m, n ∈ Z, n ≠ 0. Thus, we will define an element of Q as an equivalence class of ordered pairs m/n, m ∈ Z, n ∈ Z \ {0}, where we define
(1.4.1) m/n ∼ a/b ⇐⇒ mb = an.
Proposition 1.4.1. This is an equivalence relation.

Proof. We need to check that
(1.4.2) m/n ∼ a/b, a/b ∼ c/d =⇒ m/n ∼ c/d,
i.e., that, for m, a, c ∈ Z, n, b, d ∈ Z \ {0},
(1.4.3) mb = an, ad = cb =⇒ md = cn.
Now the hypotheses of (1.4.3) imply (mb)(ad) = (an)(cb), hence (md)(ab) = (cn)(ab).
We are assuming b ≠ 0. If also a ≠ 0, then ab ≠ 0, and the conclusion of (1.4.3) follows from the cancellation property, Proposition 1.2.8. On the other hand, if a = 0, then m/n ∼ a/b ⇒ mb = 0 ⇒ m = 0 (since b ≠ 0), and similarly a/b ∼ c/d ⇒ cb = 0 ⇒ c = 0, so the desired implication also holds in that case.

We will (temporarily) denote the equivalence class containing m/n by [m/n]. We then define addition and multiplication in Q to satisfy
(1.4.4) [m/n] + [a/b] = [(m/n) + (a/b)], [m/n] · [a/b] = [(m/n) · (a/b)],
where
(m/n) + (a/b) = (mb + na)/(nb), (m/n) · (a/b) = (ma)/(nb).
To see that these operations are well defined, we need:

Proposition 1.4.2. If m/n ∼ m′/n′ and a/b ∼ a′/b′, then
(1.4.5) (m/n) + (a/b) ∼ (m′/n′) + (a′/b′),
and
(1.4.6) (m/n) · (a/b) ∼ (m′/n′) · (a′/b′).
Proof. The hypotheses say
(1.4.7) mn′ = m′n, ab′ = a′b.
The conclusions follow from the results of §1.2. In more detail, multiplying the two identities in (1.4.7) yields man′b′ = m′a′nb, which implies (1.4.6). To prove (1.4.5), it is convenient to establish the intermediate step
(1.4.8) (m/n) + (a/b) ∼ (m′/n′) + (a/b).
This is equivalent to
(mb + na)/(nb) ∼ (m′b + n′a)/(n′b),
hence to
(mb + na)n′b = (m′b + n′a)nb,
or to
mn′bb + nn′ab = m′nbb + n′nab.
This in turn follows readily from (1.4.7). Having (1.4.8), we can use a similar argument to establish that (m′/n′) + (a/b) ∼ (m′/n′) + (a′/b′), and then (1.4.5) follows by transitivity of ∼.
From now on, we drop the brackets, simply denoting the equivalence class of m/n by m/n, and writing (1.4.1) as m/n = a/b. We also may denote an element of Q by a single letter, e.g., x = m/n.

There is an injection
(1.4.9) Z ↪ Q, m ↦ m/1,
whose image we identify with Z. This map preserves addition and multiplication. We define
(1.4.10) −(m/n) = (−m)/n,
and, if x = m/n ≠ 0 (i.e., m ≠ 0 as well as n ≠ 0), we define
(1.4.11) x⁻¹ = n/m.

The results stated in the following proposition are routine consequences of the results of §1.2.

Proposition 1.4.3. Given x, y, z ∈ Q, we have
x + y = y + x, (x + y) + z = x + (y + z),
x + 0 = x, x + (−x) = 0,
x · y = y · x, (x · y) · z = x · (y · z),
x · 1 = x, x · 0 = 0,
x · (−1) = −x, x · (y + z) = x · y + x · z.
Furthermore,
x ≠ 0 =⇒ x · x⁻¹ = 1.

For example, if x = m/n, y = a/b with m, n, a, b ∈ Z, n, b ≠ 0, the identity x · y = y · x is equivalent to (ma)/(nb) ∼ (am)/(bn). In fact, the identities ma = am and nb = bn follow from Proposition 1.2.3. We leave the rest of Proposition 1.4.3 to the reader.

We also have cancellation laws:

Proposition 1.4.4. Given x, y, z ∈ Q,
(1.4.12) x + y = z + y =⇒ x = z.
Also,
(1.4.13) xy = zy, y ≠ 0 =⇒ x = z.
Proof. To get (1.4.12), add −y to both sides of x + y = z + y and use the results of Proposition 1.4.3. To get (1.4.13), multiply both sides of x · y = z · y by y⁻¹.

It is natural to define subtraction and division on Q:
(1.4.14) x − y = x + (−y),
and, if y ≠ 0,
(1.4.15) x/y = x · y⁻¹.
We now define the order relation on Q. Set
(1.4.16) Q⁺ = {m/n : mn > 0},
where, in (1.4.16), we use the order relation on Z, discussed in §1.2. This is well defined. In fact, if m/n = m′/n′, then mn′ = m′n, hence (mn)(m′n′) = (mn′)², and therefore mn > 0 ⇔ m′n′ > 0.

Results of §1.2 imply that
(1.4.17) Q = Q⁺ ∪ {0} ∪ (−Q⁺)
is a disjoint union, where −Q⁺ = {−x : x ∈ Q⁺}. Also, clearly
(1.4.18) x, y ∈ Q⁺ =⇒ x + y, xy, x/y ∈ Q⁺.
We define
(1.4.19) x < y ⇐⇒ y − x ∈ Q⁺,
and we have, for any x, y ∈ Q, either
(1.4.20) x < y, or x = y, or y < x,
and no two can hold. The map (1.4.9) is seen to preserve the order relations. In light of (1.4.18), we see that
(1.4.21) given x, y > 0, x < y ⇐⇒ x/y < 1.

5. Show that, given ε ∈ Q, ε > 0, there exists n ∈ N such that
ε > 1/n.

6. Work through the proof of the following.
Assertion. If x = m/n ∈ Q, then x² ≠ 2.
Hint. We can arrange that m and n have no common factors. Then
(m/n)² = 2 ⇒ m² = 2n² ⇒ m even (m = 2k)
⇒ 4k² = 2n² ⇒ n² = 2k² ⇒ n even.
Contradiction? (See Proposition 1.7.2 for a more general result.)

7. Given xj, yj ∈ Q, show that
x1 < x2, y1 ≤ y2 =⇒ x1 + y1 < x2 + y2.
Show that
0 < x1 < x2, 0 < y1 ≤ y2 =⇒ x1 y1 < x2 y2.
1.5. Sequences

In this section, we discuss infinite sequences. For now, we deal with sequences of rational numbers, but we will not explicitly state this restriction below. In fact, once the set of real numbers is constructed in §1.6, the results of this section will be seen to hold also for sequences of real numbers.

Definition. A sequence (aj) is said to converge to a limit a provided that, for any n ∈ N, there exists K(n) such that
(1.5.1) j ≥ K(n) =⇒ |aj − a| < 1/n.
We write aj → a, or a = lim aj, or perhaps a = lim_{j→∞} aj. Here, |a| is defined by
(1.5.2) |a| = a if a ≥ 0, −a if a < 0.

Proposition 1.5.1. If aj → a and bj → b, then aj + bj → a + b and aj bj → ab. If furthermore bj ≠ 0 for all j and b ≠ 0, then aj/bj → a/b.

Definition. A sequence (aj) is Cauchy provided that, for any n ∈ N, there exists K(n) such that
(1.5.10) j, k ≥ K(n) =⇒ |aj − ak| < 1/n.

Proposition 1.5.2. Each Cauchy sequence is bounded.

Proposition 1.5.3. If (aj) and (bj) are Cauchy, so are (aj + bj) and (aj bj). If furthermore |bj| ≥ c for all j, for some c > 0, then (aj/bj) is Cauchy.

The following proposition is a bit deeper than the first three.

Proposition 1.5.4. If (aj) is bounded, i.e., |aj| ≤ M for all j, then it has a Cauchy subsequence.
Figure 1.5.1. Nested intervals containing aj for infinitely many j
Proof. We may as well assume M ∈ N. Now, either aj ∈ [0, M] for infinitely many j or aj ∈ [−M, 0] for infinitely many j. Let I1 be one of these two intervals containing aj for infinitely many j, and pick j(1) such that aj(1) ∈ I1. Write I1 as the union of two closed intervals of equal length, sharing only the midpoint of I1. Let I2 be one of them with the property that aj ∈ I2 for infinitely many j, and pick j(2) > j(1) such that aj(2) ∈ I2. Continue, picking Iν ⊂ Iν−1 ⊂ · · · ⊂ I1, of length M/2^{ν−1}, containing aj for infinitely many j, and picking j(ν) > j(ν − 1) > · · · > j(1) such that aj(ν) ∈ Iν. See Figure 1.5.1 for an illustration of a possible scenario.

Setting bν = aj(ν), we see that (bν) is a Cauchy subsequence of (aj), since, for all k ∈ N,
|b_{ν+k} − bν| ≤ M/2^{ν−1}.

Here is a significant consequence of Proposition 1.5.4.

Proposition 1.5.5. Each bounded monotone sequence (aj) is Cauchy.
Proof. To say (aj) is monotone is to say that either (aj) is increasing, i.e., aj ≤ aj+1 for all j, or (aj) is decreasing, i.e., aj ≥ aj+1 for all j. For the sake of argument, assume (aj) is increasing.

By Proposition 1.5.4, there is a subsequence (bν) = (aj(ν)) which is Cauchy. Thus, given n ∈ N, there exists K(n) such that
(1.5.11) µ, ν ≥ K(n) =⇒ |aj(ν) − aj(µ)| < 1/n.
Since (aj) is increasing, if j(ν) ≤ j, k ≤ j(µ), then aj(ν) ≤ aj, ak ≤ aj(µ), so |aj − ak| ≤ |aj(µ) − aj(ν)|. Hence (aj) itself is Cauchy.

Proposition 1.5.6. If |b| < 1, then b^j → 0.

Proof. It suffices to treat 0 < b < 1, since |b^j| = |b|^j. Write b = 1/(1 + y), with y > 0. We claim that
cj = (1 + y)^j ≥ 1 + jy, for all j ≥ 1.
In fact, this clearly holds for j = 1, and if it holds for j = k, then c_{k+1} ≥ (1 + y)(1 + ky) > 1 + (k + 1)y. Hence, by induction, the estimate is established. Consequently,
bj ≤ 1/(1 + jy) < 1/n whenever j > n/y,
so bj → 0.

Proposition 1.5.6 enables us to establish the following result on geometric series.

Proposition 1.5.7. If |x| < 1 and aj = 1 + x + · · · + x^j, then
aj → 1/(1 − x).
Proof. Note that x·aj = x + x² + · · · + x^{j+1}, so (1 − x)aj = 1 − x^{j+1}, i.e.,
aj = (1 − x^{j+1})/(1 − x).
The conclusion follows from Proposition 1.5.6.
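The convergence in Proposition 1.5.7 is easy to observe numerically. Here is a short sketch computing aj = 1 + x + · · · + x^j with exact rational arithmetic (the use of Python's Fraction is our choice, not the text's):

```python
from fractions import Fraction

x = Fraction(1, 3)
limit = 1 / (1 - x)      # 1/(1 - x) = 3/2
a = Fraction(0)
for j in range(30):
    a += x ** j          # a is now 1 + x + ... + x**j
print(limit - a)         # the exact error, x**30/(1 - x), is tiny
```

The error after the partial sum through x^j is exactly x^{j+1}/(1 − x), matching the proof above.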
Note in particular that
(1.5.13) 0 < x < 1 =⇒ 1 + x + · · · + x^j < 1/(1 − x).

Here is an example of a Cauchy sequence to which these results apply. Consider
(1.5.14) aj = Σ_{ℓ=0}^{j} 1/ℓ!.
For j ≥ 1 we have (n + j)! ≥ n! (n + 1)^j. Using (1.5.13), we have
(1.5.15) a_{n+j} − a_n < (1/(n + 1)!) · 1/(1 − 1/(n + 1)) = (1/n!) · (1/n).
Hence (aj) is Cauchy. Taking n = 2, we see that
(1.5.16) j > 2 =⇒ 2½ < aj < 2¾.
Proposition 1.5.8. The sequence (1.5.14) cannot converge to a rational number.

Proof. Assume aj → m/n with m, n ∈ N. By (1.5.16), we must have n > 2. Now, write
(1.5.17) m/n = Σ_{ℓ=0}^{n} 1/ℓ! + r, r = lim_{j→∞} (a_{n+j} − a_n).
Multiplying both sides of (1.5.17) by n! gives
(1.5.18) m(n − 1)! = A + r · n!,
where
(1.5.19) A = Σ_{ℓ=0}^{n} n!/ℓ! ∈ N.
Thus (1.5.18) forces r · n! ∈ N, while (1.5.15) implies
(1.5.20) 0 < r · n! ≤ 1/n.
This contradiction proves the proposition.
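The sequence (1.5.14) and the bounds (1.5.15)–(1.5.16) can be checked directly with exact rational arithmetic; a sketch:

```python
from fractions import Fraction
from math import factorial

def a(j):
    """The sequence (1.5.14): a_j = sum of 1/l! for l = 0, ..., j."""
    return sum(Fraction(1, factorial(l)) for l in range(j + 1))

# (1.5.16): for j > 2, 2 1/2 < a_j < 2 3/4.
for j in range(3, 20):
    assert Fraction(5, 2) < a(j) < Fraction(11, 4)

# (1.5.15) with n = 5: a_{5+j} - a_5 < (1/5!) * (1/5) = 1/600.
for j in range(1, 20):
    assert a(5 + j) - a(5) < Fraction(1, factorial(5) * 5)
print(float(a(19)))   # close to the limit e = 2.71828...
```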
Exercises

1. Show that
lim_{k→∞} k/2^k = 0,
and more generally, for each n ∈ N,
lim_{k→∞} k^n/2^k = 0.
Hint. See Exercise 5.

2. Show that
lim_{k→∞} 2^k/k! = 0,
and more generally, for each n ∈ N,
lim_{k→∞} 2^{nk}/k! = 0.

3. Suppose a sequence (aj) has the property that there exist r < 1 and K ∈ N such that
j ≥ K =⇒ |a_{j+1}/a_j| ≤ r.
Show that aj → 0 as j → ∞. How does this result apply to Exercises 1 and 2?

4. If (aj) satisfies the hypotheses of Exercise 3, show that there exists M < ∞ such that
Σ_{j=1}^{k} |aj| ≤ M, ∀ k.
Remark. This yields the ratio test for infinite series.

5. Show that you get the same criterion for convergence if (1.5.1) is replaced by
j ≥ K(n) =⇒ |aj − a| < 5/n.
Generalize, and note the relevance for the proof of Proposition 1.5.1. Apply the same observation to the criterion (1.5.10) for (aj) to be Cauchy.
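Exercises 1–4 can be explored numerically. For aj = j³/2^j, the ratio |a_{j+1}/a_j| = ((j + 1)/j)³ · (1/2) eventually stays below, say, r = 3/4, so the terms die off geometrically and the partial sums stay bounded; a sketch:

```python
# Terms a_j = j**3 / 2**j, as in Exercise 1 (with n = 3).
terms = [j ** 3 / 2 ** j for j in range(1, 60)]

# The ratios a_{j+1}/a_j eventually stay below r = 3/4 < 1 ...
ratios = [terms[i + 1] / terms[i] for i in range(len(terms) - 1)]
print(max(ratios[10:]))   # well below 3/4 from some point on

# ... so the terms tend to 0 and the partial sums stay bounded.
print(terms[-1])          # essentially 0
print(sum(terms))         # bounded, as Exercise 4 predicts
```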
The following three exercises discuss continued fractions. We assume
(1.5.21) aj ∈ Q, aj ≥ 1, j = 1, 2, 3, . . . ,
and set
(1.5.22) f1 = a1, f2 = a1 + 1/a2, f3 = a1 + 1/(a2 + 1/a3), . . . .
Having fj, we obtain fj+1 by replacing aj by aj + 1/aj+1. In other words, with
(1.5.23) fj = φj(a1, . . . , aj),
given explicitly by (1.5.22) for j = 1, 2, 3, we have
(1.5.24) fj+1 = φj+1(a1, . . . , aj+1) = φj(a1, . . . , aj−1, aj + 1/aj+1).

6. Show that
f1 < fj, ∀ j ≥ 2, and f2 > fj, ∀ j ≥ 3.
Going further, show that
(1.5.25) f1 < f3 < f5 < · · · < f6 < f4 < f2.

7. If also bj+1, b̃j+1 ≥ 1, show that
(1.5.26) φj+1(a1, . . . , aj, bj+1) − φj+1(a1, . . . , aj, b̃j+1) = φj(a1, . . . , aj−1, bj) − φj(a1, . . . , aj−1, b̃j),
with
(1.5.27) bj = aj + 1/bj+1, b̃j = aj + 1/b̃j+1,
bj − b̃j = 1/bj+1 − 1/b̃j+1 = (b̃j+1 − bj+1)/(bj+1 b̃j+1).
Note that bj, b̃j ∈ (aj, aj + 1]. Iterating this, show that, for each ℓ = j − 1, . . . , 1, (1.5.26) equals
(1.5.28) φℓ(a1, . . . , aℓ−1, bℓ) − φℓ(a1, . . . , aℓ−1, b̃ℓ),
with
(1.5.29) bℓ = aℓ + 1/bℓ+1, b̃ℓ = aℓ + 1/b̃ℓ+1,
bℓ − b̃ℓ = 1/bℓ+1 − 1/b̃ℓ+1 = −(bℓ+1 − b̃ℓ+1)/(bℓ+1 b̃ℓ+1).
Finally, (1.5.26) equals b1 − b̃1. Show that |bℓ − b̃ℓ| decreases as ℓ decreases, and that
(1.5.30) |b1 − b̃1| = δ =⇒ |bℓ − b̃ℓ| ≥ δ =⇒ bℓ b̃ℓ ≥ 1 + δ, for ℓ ≤ j,
hence
(1.5.31) δ ≤ (1 + δ)^{−(j−1)} |bj − b̃j| ≤ (1 + δ)^{−(j−1)}.
Show that this implies
(1.5.32) δ² ≤ 1/(j − 1).
Given x = [(aj)] ∈ R, we say x ∈ R⁺ (and we say x > 0) provided that (1.6.11) holds, i.e., for some n, K ∈ N,
j ≥ K =⇒ aj ≥ 1/n,
and we say x ∈ R⁻ (and we say x < 0) provided
(1.6.12) j ≥ K =⇒ aj ≤ −1/n, for some n, K ∈ N.
Clearly x > 0 if and only if −x < 0. It is also clear that the map Q ↪ R in (1.6.6) preserves the order relation.

Thus we have the disjoint union
(1.6.13) R = R⁺ ∪ {0} ∪ R⁻, R⁻ = −R⁺.
Also, clearly
(1.6.14) x, y ∈ R⁺ =⇒ x + y, xy ∈ R⁺.
As in (1.4.19), we define
(1.6.15) x < y ⇐⇒ y − x ∈ R⁺.
If x = [(aj)] and y = [(bj)], we see from (1.6.11)–(1.6.12) that
(1.6.16) x < y ⇐⇒ for some n, K ∈ N, j ≥ K ⇒ bj − aj ≥ 1/n (i.e., aj ≤ bj − 1/n).
The relation (1.6.15) can also be written y > x. Similarly we define x ≤ y and y ≤ x, in the obvious fashions.

The following results are straightforward.

Proposition 1.6.4. For elements of R, we have
(1.6.17) x1 < y1, x2 < y2 =⇒ x1 + x2 < y1 + y2,
(1.6.18) x < y ⇐⇒ −y < −x,
(1.6.19) 0 < x < y, a > 0 =⇒ 0 < ax < ay,
(1.6.20) 0 < x < y =⇒ 0 < y⁻¹ < x⁻¹.
Proof. The results (1.6.17) and (1.6.19) follow from (1.6.14); consider, for example, a(y − x). The result (1.6.18) follows from (1.6.13). To prove (1.6.20), first we see that x > 0 implies x−1 > 0, as follows: if −x−1 > 0, the identity x · (−x−1 ) = −1 contradicts (1.6.14). As for the rest of (1.6.20), the hypotheses imply xy > 0, and multiplying both sides of x < y by a = (xy)−1 gives the result, by (1.6.19).
As in (1.5.2), define |x| by
(1.6.21) |x| = x if x ≥ 0, −x if x < 0.
Note that
(1.6.22) x = [(aj)] =⇒ |x| = [(|aj|)].
It is straightforward to verify
(1.6.23) |xy| = |x| · |y|, |x + y| ≤ |x| + |y|.

We now show that R has the Archimedean property.

Proposition 1.6.5. Given x ∈ R, there exists k ∈ Z such that
(1.6.24) k − 1 < x ≤ k.
Proof. It suffices to prove (1.6.24) assuming x ∈ R⁺; otherwise, work with −x. Say x = [(aj)], where (aj) is a Cauchy sequence of rational numbers. By Proposition 1.5.2, there exists M ∈ Q such that |aj| ≤ M for all j. By Proposition 1.4.5, we have M ≤ ℓ for some ℓ ∈ N. Hence the set S = {ℓ ∈ N : ℓ ≥ x} is nonempty. As in the proof of Proposition 1.4.5, taking k to be the smallest element of S gives (1.6.24).

Proposition 1.6.6. Given any real ε > 0, there exists n ∈ N such that ε > 1/n.

Proof. Using Proposition 1.6.5, pick n > 1/ε and apply (1.6.20). Alternatively, use the reasoning given above (1.6.8).

We are now ready to consider sequences of elements of R.

Definition. A sequence (xj) converges to x if and only if, for any n ∈ N, there exists K(n) such that
(1.6.25) j ≥ K(n) =⇒ |xj − x| < 1/n.
In this case, we write xj → x, or x = lim xj. The sequence (xj) is Cauchy if and only if, for any n ∈ N, there exists K(n) such that
(1.6.26) j, k ≥ K(n) =⇒ |xj − xk| < 1/n.

We note that it is typical to phrase the definition above in terms of picking any real ε > 0 and demanding that, e.g., |xj − x| < ε for large j. The equivalence of the two definitions follows from Proposition 1.6.6. As in Proposition 1.5.2, we have that every Cauchy sequence is bounded.
It is clear that, if each xj ∈ Q, then the notion that (xj) is Cauchy given above coincides with that in §1.5. If also x ∈ Q, the notion that xj → x also coincides with that given in §1.5. Here is another natural but useful observation.

Proposition 1.6.7. If each aj ∈ Q and x ∈ R, then
(1.6.27) aj → x ⇐⇒ x = [(aj)].

Proof. First assume x = [(aj)]. In particular, (aj) is Cauchy. Now, given m, we have from (1.6.16) that
(1.6.28) |x − ak| < 1/m ⇐⇒ there exist K, n such that j ≥ K ⇒ |aj − ak| < 1/m − 1/n,
and the latter holds whenever there exists K such that j ≥ K ⇒ |aj − ak| < 1/(2m). On the other hand, since (aj) is Cauchy, for each m ∈ N there exists K(m) such that j, k ≥ K(m) ⇒ |aj − ak| < 1/(2m). Hence
k ≥ K(m) =⇒ |x − ak| < 1/m.
This shows that x = [(aj)] ⇒ aj → x.

For the converse, if aj → x, then (aj) is Cauchy, so we have [(aj)] = y ∈ R. The previous argument implies aj → y. But |x − y| ≤ |x − aj| + |aj − y| for all j, so x = y. Thus aj → x ⇒ x = [(aj)].

Next, the proof of Proposition 1.5.1 extends to the present case, yielding:

Proposition 1.6.8. If xj → x and yj → y, then
(1.6.29) xj + yj → x + y,
and
(1.6.30) xj yj → xy.
If furthermore yj ≠ 0 for all j and y ≠ 0, then
(1.6.31) xj/yj → x/y.
So far, statements made about R have emphasized similarities of its properties with corresponding properties of Q. The crucial difference between these two sets of numbers is given by the following result, known as the completeness property. Theorem 1.6.9. If (xj ) is a Cauchy sequence of real numbers, then there exists x ∈ R such that xj → x.
Proof. Take xj = [(ajℓ : ℓ ∈ N)], with ajℓ ∈ Q. Using (1.6.27), take bj = aj,ℓ(j) ∈ Q such that
(1.6.32) |xj − bj| ≤ 2^{−j}.
Then (bj) is Cauchy, since |bj − bk| ≤ |xj − xk| + 2^{−j} + 2^{−k}. Now, let
(1.6.33) x = [(bj)].
It follows that
(1.6.34) |xj − x| ≤ |xj − bj| + |x − bj| ≤ 2^{−j} + |x − bj|,
and hence xj → x.
If we combine Theorem 1.6.9 with the argument behind Proposition 1.5.4, we obtain the following important result, known as the Bolzano-Weierstrass theorem.

Theorem 1.6.10. Each bounded sequence of real numbers has a convergent subsequence.

Proof. If |xj| ≤ M, the proof of Proposition 1.5.4 applies without change to show that (xj) has a Cauchy subsequence. By Theorem 1.6.9, that Cauchy subsequence converges.

Similarly, adding Theorem 1.6.9 to the argument behind Proposition 1.5.5 yields:

Proposition 1.6.11. Each bounded monotone sequence (xj) of real numbers converges.

A related property of R can be described in terms of the notion of the "supremum" of a set.

Definition. If S ⊂ R, one says that x ∈ R is an upper bound for S provided x ≥ s for all s ∈ S, and one says
(1.6.35) x = sup S
provided x is an upper bound for S and further x ≤ x′ whenever x′ is an upper bound for S.

For some sets, such as S = Z, there is no x ∈ R satisfying (1.6.35). However, there is the following result, known as the supremum property.

Proposition 1.6.12. If S is a nonempty subset of R that has an upper bound, then there is a real x = sup S.
Proof. We use an argument similar to the one in the proof of Proposition 1.5.4. Let x0 be an upper bound for S, pick s0 ∈ S, and consider
I0 = [s0, x0] = {y ∈ R : s0 ≤ y ≤ x0}.
If x0 = s0, then already x0 = sup S. Otherwise, I0 is an interval of nonzero length L = x0 − s0. In that case, divide I0 into two equal intervals, having in common only the midpoint; say I0 = I0ℓ ∪ I0r, where I0r lies to the right of I0ℓ.

Let I1 = I0r if S ∩ I0r ≠ ∅, and otherwise let I1 = I0ℓ. Note that S ∩ I1 ≠ ∅. Let x1 be the right endpoint of I1, and pick s1 ∈ S ∩ I1. Note that x1 is also an upper bound for S. Continue, constructing Iν ⊂ Iν−1 ⊂ · · · ⊂ I0, where Iν has length 2^{−ν}L, such that the right endpoint xν of Iν satisfies
(1.6.36) xν ≥ s, ∀ s ∈ S,
and such that S ∩ Iν ≠ ∅, so there exist sν ∈ S such that
(1.6.37) xν − sν ≤ 2^{−ν}L.
The sequence (xν) is bounded and monotone (decreasing), so, by Proposition 1.6.11, it converges; xν → x. By (1.6.36), we have x ≥ s for all s ∈ S, and by (1.6.37) we have x − sν ≤ 2^{−ν}L. Hence x satisfies (1.6.35).

We turn to infinite series Σ_{k=0}^∞ ak, with ak ∈ R. We say this series converges if and only if the sequence of partial sums
(1.6.38) Sn = Σ_{k=0}^{n} ak
converges:
(1.6.39) Σ_{k=0}^∞ ak = A ⇐⇒ Sn → A as n → ∞.

The following is a useful condition guaranteeing convergence.

Proposition 1.6.13. The infinite series Σ_{k=0}^∞ ak converges provided
(1.6.40) Σ_{k=0}^∞ |ak| < ∞,
i.e., there exists B < ∞ such that Σ_{k=0}^{n} |ak| ≤ B for all n.
Proof. The triangle inequality (the second part of (1.6.23)) gives, for ℓ ≥ 1,
(1.6.41) |S_{n+ℓ} − Sn| = |Σ_{k=n+1}^{n+ℓ} ak| ≤ Σ_{k=n+1}^{n+ℓ} |ak|,
and we claim this tends to 0 as n → ∞, uniformly in ℓ ≥ 1, provided (1.6.40) holds. In fact, if the right side of (1.6.41) fails to go to 0 as n → ∞, there exist ε > 0, infinitely many nν → ∞, and ℓν ∈ N such that
(1.6.42) Σ_{k=nν+1}^{nν+ℓν} |ak| ≥ ε.
We can pass to a subsequence and assume nν+1 > nν + ℓν. Then
(1.6.43) Σ_{k=n1+1}^{nν+ℓν} |ak| ≥ νε,
for all ν, contradicting the bound by B that follows from (1.6.40). Thus (1.6.40) ⇒ (Sn) is Cauchy. Convergence follows, by Theorem 1.6.9.

When (1.6.40) holds, we say the series Σ_{k=0}^∞ ak is absolutely convergent.
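For instance, Σ 1/k² is absolutely convergent: the partial sums of |ak| = 1/k² stay below 2 (since 1/k² ≤ 1/(k(k−1)) = 1/(k−1) − 1/k for k ≥ 2), so Proposition 1.6.13 applies. A numerical sketch:

```python
# Partial sums S_n of 1/k**2; (1.6.40) holds since they stay below B = 2.
S = 0.0
for k in range(1, 100001):
    S += 1.0 / (k * k)
    assert S < 2.0      # the bound B of Proposition 1.6.13
print(S)                # approaches pi**2/6 = 1.6449...
```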
The following result on alternating series gives another sufficient condition for convergence.

Proposition 1.6.14. Assume ak > 0, ak ↘ 0. Then
(1.6.44) Σ_{k=0}^∞ (−1)^k ak
is convergent.

Proof. Denote the partial sums by Sn, n ≥ 0. We see that, for m ∈ N,
(1.6.45) S_{2m+1} ≤ S_{2m+3} ≤ S_{2m+2} ≤ S_{2m}.
Iterating this, we have, as m → ∞,
(1.6.46) S_{2m} ↘ α, S_{2m+1} ↗ β, β ≤ α,
and
(1.6.47) S_{2m} − S_{2m+1} = a_{2m+1},
which tends to 0, hence α = β, and convergence is established.
Here is an example:
Σ_{k=0}^∞ (−1)^k/(k + 1) = 1 − 1/2 + 1/3 − 1/4 + · · ·
is convergent.
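The interleaving (1.6.45)–(1.6.46) is visible numerically for this example; a sketch:

```python
# Partial sums S_n of sum (-1)**k / (k+1), the example above.
S, sums = 0.0, []
for k in range(0, 2000):
    S += (-1) ** k / (k + 1)
    sums.append(S)

# (1.6.45): odd-indexed partial sums increase, even-indexed decrease,
# and each odd one lies below the nearby even ones.
assert all(sums[2*m+1] <= sums[2*m+3] <= sums[2*m+2] <= sums[2*m]
           for m in range(900))
print(sums[-1])   # near log 2 = 0.693..., cf. the exercises in Section 4.5
```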
This series is not absolutely convergent (cf. Exercise 6 below). For an evaluation of this series, see exercises in §4.5 of Chapter 4.

Exercises

1. Verify Proposition 1.6.3.

2. If S ⊂ R, we say that x ∈ R is a lower bound for S provided x ≤ s for all s ∈ S, and we say
(1.6.48) x = inf S,
provided x is a lower bound for S and further x ≥ x′ whenever x′ is a lower bound for S. Mirroring Proposition 1.6.12, show that if S ⊂ R is a nonempty set that has a lower bound, then there is a real x = inf S.

3. Given a real number ξ ∈ (0, 1), show it has an infinite decimal expansion, i.e., there exist bk ∈ {0, 1, . . . , 9} such that
(1.6.49) ξ = Σ_{k=1}^∞ bk · 10^{−k}.
Hint. Start by breaking [0, 1] into ten subintervals of equal length, and picking one to which ξ belongs.

4. Show that if 0 < x < 1,
(1.6.50) Σ_{k=0}^∞ x^k = 1/(1 − x) < ∞.
Hint. As in (1.4.23), we have
Σ_{k=0}^{n} x^k = (1 − x^{n+1})/(1 − x), x ≠ 1.
The series (1.6.50) is called a geometric series.

5. Assume ak > 0 and ak ↘ 0. Show that
(1.6.51) Σ_{k=1}^∞ ak < ∞ ⇐⇒ Σ_{k=0}^∞ bk < ∞,
where
(1.6.52) bk = 2^k a_{2^k}.
Hint. Use the following observations:
(1/2)b2 + (1/2)b3 + · · · ≤ (a3 + a4) + (a5 + a6 + a7 + a8) + · · · ,
(a3 + a4) + (a5 + a6 + a7 + a8) + · · · ≤ b1 + b2 + · · · .

6. Deduce from Exercise 5 that the harmonic series 1 + 1/2 + 1/3 + 1/4 + · · · diverges, i.e.,
(1.6.53) Σ_{k=1}^∞ 1/k = ∞.
7. Deduce from Exercises 4–5 that
(1.6.54) p > 1 =⇒ Σ_{k=1}^∞ 1/k^p < ∞.
For now, we take p ∈ N. We will see later that (1.6.54) is meaningful, and true, for p ∈ R, p > 1.

8. Given a, b ∈ R \ 0, k ∈ Z, define a^k as in Exercise 4 of §1.4. Show that
a^{j+k} = a^j a^k, a^{jk} = (a^j)^k, (ab)^j = a^j b^j, ∀ j, k ∈ Z.

9. Given k ∈ N, show that, for xj ∈ R,
xj → x =⇒ xj^k → x^k.
Hint. Use Proposition 1.6.8.

10. Given xj, x, y ∈ R, show that
xj ≥ y ∀ j, xj → x =⇒ x ≥ y.

11. Given the alternating series Σ(−1)^k ak as in Proposition 1.6.14 (with ak ↘ 0), with sum S, show that, for each N,
Σ_{k=0}^{N} (−1)^k ak = S + rN, |rN| ≤ |a_{N+1}|.
12. Generalize Exercises 5–6 of §1.5 as follows. Suppose a sequence (aj) in R has the property that there exist r < 1 and K ∈ N such that
j ≥ K =⇒ |a_{j+1}/a_j| ≤ r.
Show that there exists M < ∞ such that
Σ_{j=1}^{k} |aj| ≤ M, ∀ k ∈ N.
Conclude that Σ_{k=1}^∞ ak is convergent.

13. Show that, for each x ∈ R,
Σ_{k=1}^∞ (1/k!) x^k
is convergent.

The following exercises deal with the sequence (fj) of continued fractions associated to a sequence (aj) as in (1.5.21), via (1.5.22)–(1.5.24), leading to Exercises 6–8 of §1.5.

14. Deduce from (1.5.25) that there exist fo, fe ∈ R such that
f_{2k+1} ↗ fo, f_{2k} ↘ fe, fo ≤ fe.

15. Deduce from (1.5.33) that fo = fe (= f, say), and hence
fj → f, as j → ∞,
i.e., if (aj) satisfies (1.5.21),
φj(a1, . . . , aj) → f, as j → ∞.
We denote the limit by φ(a1, . . . , aj, . . . ).

16. Show that φ(1, 1, . . . , 1, . . . ) = x solves x = 1 + 1/x, and hence
φ(1, 1, . . . , 1, . . . ) = (1 + √5)/2.
Note. The existence of such x implies that 5 has a square root, √5 ∈ R. See Proposition 1.7.1 for a more general result.
17. Take x ∈ (1, ∞) \ Q, and define the sequence (aj) of elements of N as follows. First, a1 = [x], where [x] denotes the largest integer ≤ x. Then set
x2 = 1/(x − a1) ∈ (1, ∞), a2 = [x2],
and, inductively,
x_{j+1} = 1/(xj − aj) ∈ (1, ∞), a_{j+1} = [x_{j+1}].
Show that x = φ(a1, . . . , aj, . . . ).

18. Conversely, suppose aj ∈ N and set x = φ(a1, . . . , aj, . . . ). Show that the construction of Exercise 17 recovers the sequence (aj).
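Exercises 14–18 can be explored numerically. Here is a sketch computing fj = φj(a1, . . . , aj) by back-substitution, together with the digit-extraction of Exercise 17 (function names are ours):

```python
def phi(a):
    """f_j = phi_j(a_1, ..., a_j), evaluated from the bottom up."""
    v = a[-1]
    for c in reversed(a[:-1]):
        v = c + 1 / v
    return v

# Exercise 16: phi(1, 1, 1, ...) approaches (1 + sqrt(5))/2.
golden = (1 + 5 ** 0.5) / 2
print(abs(phi([1] * 40) - golden))   # very small

def expand(x, j):
    """Exercise 17: recover a_1, ..., a_j from x in (1, infinity)."""
    a = []
    for _ in range(j):
        k = int(x)       # [x], the largest integer <= x
        a.append(k)
        x = 1 / (x - k)
    return a

print(expand(2 ** 0.5 + 1, 6))   # expansion of 1 + sqrt(2): all 2's
```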
1.7. Irrational numbers

There are real numbers that are not rational. One, called e, is given by the limit of the sequence (1.5.14); in standard notation,
(1.7.1) e = Σ_{ℓ=0}^∞ 1/ℓ!.
This number appears naturally in the theory of the exponential function, which plays a central role in calculus, as exposed in §5 of Chapter 4. Proposition 1.5.8 implies that e is not rational.

One can approximate e to high accuracy. In fact, as a consequence of (1.5.15), one has
(1.7.2) e − Σ_{ℓ=0}^{n} 1/ℓ! ≤ (1/n!) · (1/n).
For example, one can verify that
(1.7.3) 120! > 6 · 10^{198},
and hence
(1.7.4) e − Σ_{ℓ=0}^{120} 1/ℓ! < 10^{−200}.
In a fraction of a second, a personal computer with the right program can perform a highly accurate approximation to such a sum, yielding
2.7182818284 5904523536 0287471352 6624977572 4709369995 9574966967 6277240766 3035354759 4571382178 5251664274 2746639193 2003059921 8174135966 2904357290 0334295260 5956307381 3232862794 3490763233 8298807531 · · ·
accurate to 190 places after the decimal point.

A number in R \ Q is said to be irrational. We present some other common examples of irrational numbers, such as √2. To begin, one needs to show that √2 is a well defined real number. The following general result includes this fact.

Proposition 1.7.1. Given a ∈ R⁺, k ∈ N, there is a unique b ∈ R⁺ such that b^k = a.

Proof. Consider
(1.7.5) S_{a,k} = {x ≥ 0 : x^k ≤ a}.
Then S_{a,k} is a nonempty bounded subset of R. Note that if y > 0 and y^k > a, then y is an upper bound for S_{a,k}. Hence 1 + a is an upper bound for S_{a,k}. Take b = sup S_{a,k}. We claim that b^k = a. In fact, if b^k < a, it follows from
Exercise 9 of §1.6 that there exists b1 > b such that b1^k < a, hence b1 ∈ S_{a,k}, so b < sup S_{a,k}, a contradiction. Similarly, if b^k > a, there exists b0 < b such that b0^k > a, hence b0 is an upper bound for S_{a,k}, so b > sup S_{a,k}, again a contradiction.

We write
(1.7.6) b = a^{1/k}.

Now for a list of some irrational numbers:

Proposition 1.7.2. Take a ∈ N, k ∈ N. If a^{1/k} is not an integer, then a^{1/k} is irrational.

Proof. Assume a^{1/k} = m/n, with m, n ∈ N. We can arrange that m and n have no common prime factors. Now
(1.7.7) m^k = a n^k,
so
(1.7.8) n | m^k.
Thus, if n > 1 and p is a prime factor of n, then p|m^k. It follows from Proposition 1.3.2, and induction on k, that p|m. This contradicts our arrangement that m and n have no common prime factors, and concludes the proof.

Noting that 1² = 1, 2² = 4, 3² = 9, we have:

Corollary 1.7.3. The following numbers are irrational:
(1.7.9) √2, √3, √5, √6, √7, √8.

A similar argument establishes the following more general result.

Proposition 1.7.4. Consider the polynomial
(1.7.10) p(z) = z^k + a_{k−1} z^{k−1} + · · · + a1 z + a0, aj ∈ Z.
Then
(1.7.11) z ∈ Q, p(z) = 0 =⇒ z ∈ Z.
Proof. If z ∈ Q but z ∉ Z, we can write z = m/n with m, n ∈ Z, n > 1, and m and n containing no common prime factors. Now multiply the identity p(z) = 0 by n^k, to get
(1.7.12) m^k + a_{k−1} m^{k−1} n + · · · + a1 m n^{k−1} + a0 n^k = 0, aj ∈ Z.
It follows that n divides m^k, so, as in the proof of Proposition 1.7.2, m and n must have a common prime factor. This contradiction proves Proposition 1.7.4.
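Proposition 1.7.4 yields a practical irrationality test: a rational root of the monic polynomial (1.7.10) must be an integer, and an integer root must divide a0. Here is a sketch, assuming a0 ≠ 0 (the helper name is ours):

```python
def integer_roots(coeffs):
    """Integer roots of z**k + a_{k-1} z**(k-1) + ... + a_0, where
    coeffs = [a_{k-1}, ..., a_1, a_0] and a_0 != 0.

    By Proposition 1.7.4 these are the only possible rational roots,
    and each must divide a_0."""
    a0 = coeffs[-1]
    def p(z):
        v = 1                      # Horner evaluation of the monic polynomial
        for c in coeffs:
            v = v * z + c
        return v
    roots = []
    for d in range(1, abs(a0) + 1):
        if a0 % d == 0:
            for z in (d, -d):
                if p(z) == 0:
                    roots.append(z)
    return sorted(roots)

# z**2 - 2 has no integer root, so sqrt(2) is irrational:
print(integer_roots([0, -2]))    # []
# z**2 - 3z + 2 = (z - 1)(z - 2):
print(integer_roots([-3, 2]))    # [1, 2]
```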
Note that Proposition 1.7.2 deals with the special case
(1.7.13) p(z) = z^k − a, a ∈ N.

Remark. The existence of solutions to p(z) = 0 for general p(z) as in (1.7.10) is harder than Proposition 1.7.1, especially when k is even. For the case of odd k, see Exercise 1 of §1.9. For the general result, see Appendix A.1.

The real line is thick with both rational numbers and irrational numbers. By (1.6.27), given any x ∈ R, there exist aj ∈ Q such that aj → x. Also, given any x ∈ R, there exist irrational bj such that bj → x. To see this, just take aj ∈ Q, aj → x, and set bj = aj + 2^{−j}√2. In a sense that can be made precise, there are more irrational numbers than rational numbers. Namely, Q is countable, while R is uncountable. See §1.8 for a treatment of this.

Perhaps the most intriguing irrational number is π. See Chapter 4 for material on π, and Appendix A.3 for a proof that it is irrational.

Exercises

1. Let ξ ∈ (0, 1) have a decimal expansion of the form (1.6.49), i.e.,
(1.7.14) ξ = Σ_{k=1}^∞ bk · 10^{−k}, bk ∈ {0, 1, . . . , 9}.
Show that ξ is rational if and only if (1.7.14) is eventually repeating, i.e., if and only if there exist N, m ∈ N such that
k ≥ N =⇒ b_{k+m} = bk.

2. Show that
Σ_{k=1}^∞ 10^{−k²}
is irrational.

3. Making use of Proposition 1.7.1, define a^p for real a > 0, p = m/n ∈ Q. Show that if also q ∈ Q,
a^p a^q = a^{p+q}.
Hint. You might start with a^{m/n} = (a^{1/n})^m, given n ∈ N, m ∈ Z. Then you need to show that if k ∈ N,
(a^{1/nk})^{mk} = (a^{1/n})^m.
You can use the results of Exercise 8 in §1.6.
4. Show that, if a, b > 0 and p ∈ Q, then
(ab)^p = a^p b^p.
Hint. First show that (ab)^{1/n} = a^{1/n} b^{1/n}.

5. Using Exercises 3 and 4, extend (1.6.54) to p ∈ Q, p > 1.
Hint. If ak = k^{−p}, then bk = 2^k a_{2^k} = 2^k (2^k)^{−p} = 2^{−(p−1)k} = x^k, with x = 2^{−(p−1)}.

6. Show that √2 + √3 is irrational.
Hint. Square it.

7. Specialize the proof of Proposition 1.7.2 to a demonstration that 2 has no rational square root, and contrast this argument with the proof of such a result suggested in Exercise 6 of §1.4.

8. Here is a way to approximate √a, given a ∈ R⁺. Suppose you have an approximation xk to √a,
xk − √a = δk.
Square this to obtain xk² + a − 2xk√a = δk², hence
√a = x_{k+1} − δk²/(2xk), x_{k+1} = (a + xk²)/(2xk).
Then x_{k+1} is an improved approximation, as long as |δk| < 2xk. One can iterate this. Try it on
√2 ≈ 7/5, √3 ≈ 7/4, √5 ≈ 9/4.
How many iterations does it take to approximate these quantities to 12 digits of accuracy?
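The iteration of Exercise 8 is easy to run; a sketch (the iteration count is what the exercise asks you to observe):

```python
# Iterate x_{k+1} = (a + x_k**2)/(2 x_k) from the suggested starting points.
for a, x in [(2, 7/5), (3, 7/4), (5, 9/4)]:
    k = 0
    while abs(x * x - a) > 1e-11:   # roughly 12 correct digits in x
        x = (a + x * x) / (2 * x)
        k += 1
    print(a, x, k)   # k stays small: the convergence is quadratic
```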
1.8. Cardinal numbers

We return to the natural numbers considered in §1.1 and make contact with the fact that these numbers are used to count objects in collections. Namely, let S be some set. If S is empty, we say 0 is the number of its elements. If S is not empty, pick an element out of S and count "1." If there remain other elements of S, pick another element and count "2." Continue. If you pick a final element of S and count "n," then you say S has n elements. At least, that is a standard informal description of counting. We wish to restate this a little more formally, in the setting where we can apply the Peano axioms.
In order to do this, we consider the following subsets of N. Given n ∈ N, set
(1.8.1) In = {j ∈ N : j ≤ n}.
While the following is quite obvious, it is worthwhile recording that it is a consequence of the Peano axioms and the material developed in §1.1.

Lemma 1.8.1. We have
(1.8.2) I1 = {1}, I_{n+1} = In ∪ {n + 1}.

Proof. Left to the reader.

Now we propose the following
Definition. A nonempty set S has n elements if and only if there exists a bijective map φ : S → In.

A reasonable definition of counting should permit one to demonstrate that, if S has n elements and it also has m elements, then m = n. The key to showing this from the Peano postulates is the following.

Proposition 1.8.2. Assume m, n ∈ N. If there exists an injective map φ : Im → In, then m ≤ n.

Proof. Use induction on n. The case n = 1 is clear (by Lemma 1.8.1). Assume now that N ≥ 2 and that the result is true for n < N. Then let φ : Im → IN be injective. Two cases arise: either there is an element j ∈ Im such that φ(j) = N, or not. (Also, there is no loss of generality in assuming at this point that m ≥ 2.)

If there is such a j, define ψ : I_{m−1} → I_{N−1} by
ψ(ℓ) = φ(ℓ) for ℓ < j,
ψ(ℓ) = φ(ℓ + 1) for j ≤ ℓ < m.
Then ψ is injective, so m − 1 ≤ N − 1, and hence m ≤ N. On the other hand, if there is no such j, then we already have an injective map φ : Im → I_{N−1}. The induction hypothesis implies m ≤ N − 1, which in turn implies m ≤ N.

Corollary 1.8.3. If there exists a bijective map φ : Im → In, then m = n.

Proof. We see that m ≤ n and n ≤ m, so Proposition 1.1.13 applies.

Corollary 1.8.4. If S is a set, m, n ∈ N, and there exist bijective maps φ : S → Im, ψ : S → In, then m = n.
Proof. Consider ψ ◦ φ−1 .
Definition. If either S = ∅ or S has n elements for some n ∈ N, we say S is finite.
The next result implies that any subset of a finite set is finite.
Proposition 1.8.5. Assume n ∈ N. If S ⊂ In is nonempty, then there exists m ≤ n and a bijective map φ : S → Im.
Proof. Use induction on n. The case n = 1 is clear (by Lemma 1.8.1). Assume the result is true for n < N. Then let S ⊂ IN. Two cases arise: either N ∈ S or N ∉ S. If N ∈ S, consider S′ = S \ {N}, so S = S′ ∪ {N} and S′ ⊂ IN−1. The inductive hypothesis yields a bijective map ψ : S′ → Im (with m ≤ N − 1), and then we obtain φ : S′ ∪ {N} → Im+1, equal to ψ on S′ and sending the element N to m + 1. If N ∉ S, then S ⊂ IN−1, and the inductive hypothesis directly yields the desired bijective map.
Proposition 1.8.6. The set N is not finite.
Proof. If there were an n ∈ N and a bijective map φ : In → N, then, by restriction, there would be a bijective map ψ : S → In+1 for some subset S of In, hence by the results above a bijective map ψ̃ : Im → In+1 for some m ≤ n < n + 1. This contradicts Corollary 1.8.3.
The next result says that, in a certain sense, N is a minimal set that is not finite.
Proposition 1.8.7. If S is not finite, then there exists an injective map Φ : N → S.
Proof. We aim to show that there exists a family of injective maps φn : In → S, with the property that
(1.8.3) φn |Im = φm ,  ∀ m ≤ n.
We establish this by induction on n. For n = 1, just pick some element of S and call it φ1(1). Now assume this claim is true for all n < N. So we have φN−1 : IN−1 → S injective, but not surjective (since we assume S is not finite), and (1.8.3) holds for n ≤ N − 1. Pick x ∈ S not in the range of φN−1. Then define φN : IN → S so that
(1.8.4) φN(j) = φN−1(j) for j ≤ N − 1,  φN(N) = x.
Having the family φn , we define Φ : N → S by Φ(j) = φn (j) for any n ≥ j. Two sets S and T are said to have the same cardinality if there exists a bijective map between them; we write Card(S) = Card(T ). If there exists an injective map φ : S → T , we write Card(S) ≤ Card(T ). The following result, known as the Schroeder-Bernstein theorem, implies that Card(S) = Card(T ) whenever one has both Card(S) ≤ Card(T ) and Card(T ) ≤ Card(S). Theorem 1.8.8. Let S and T be sets. Suppose there exist injective maps φ : S → T and ψ : T → S. Then there exists a bijective map Φ : S → T . Proof. Let us say an element x ∈ T has a parent y ∈ S if φ(y) = x. Similarly there is a notion of a parent of an element of S. Iterating this gives a sequence of “ancestors” of any element of S or T . For any element of S or T , there are three possibilities: a) The set of ancestors never terminates. b) The set of ancestors terminates at an element of S. c) The set of ancestors terminates at an element of T . We denote by Sa , Ta the elements of S, T , respectively for which case a) holds. Similarly we have Sb , Tb and Sc , Tc . We have disjoint unions S = Sa ∪ Sb ∪ Sc ,
T = Ta ∪ Tb ∪ Tc .
Now note that φ : Sa → Ta ,
φ : Sb → Tb ,
ψ : Tc → Sc
are all bijective. Thus we can set Φ equal to φ on Sa ∪ Sb and equal to ψ−1 on Sc, to get a desired bijection.
The terminology above suggests regarding Card(S) as an object (a cardinal number). Indeed, if S is finite we set Card(S) = n if S has n elements. A set that is not finite is said to be infinite. We can also have a notion of cardinality of infinite sets. A standard notation for the cardinality of N is
(1.8.5) Card(N) = ℵ0 .
Here are some other sets with the same cardinality:
Proposition 1.8.9. We have
(1.8.6) Card(Z) = Card(N × N) = Card(Q) = ℵ0 .
Figure 1.8.1. Counting N × N
Proof. We can define a bijection of N onto Z by ordering elements of Z as follows: 0, 1, −1, 2, −2, 3, −3, · · · . We can define a bijection of N onto N × N by ordering elements of N × N as follows: (1, 1), (1, 2), (2, 1), (3, 1), (2, 2), (1, 3), · · · . See Figure 1.8.1. We leave it to the reader to produce a similar ordering of Q. An infinite set that can be mapped bijectively onto N is called countably infinite. A set that is either finite or countably infinite is called countable. The following result is a natural extension of Proposition 1.8.5. Proposition 1.8.10. If X is a countable set and S ⊂ X, then S is countable. Proof. If X is finite, then Proposition 1.8.5 applies. Otherwise, we can assume X = N, and we are looking at S ⊂ N, so there is an injective map φ : S → N. If S is finite, there is no problem. Otherwise, by Proposition
1.8.7, there is an injective map ψ : N → S, and then Theorem 1.8.8 implies the existence of a bijection between S and N.
There are sets that are not countable; they are said to be uncountable. The following is a key result of G. Cantor.
Proposition 1.8.11. The set R of real numbers is uncountable.
Proof. We may as well show that (0, 1) = {x ∈ R : 0 < x < 1} is uncountable. If it were countable, there would be a bijective map φ : N → (0, 1). Expand the real number φ(j) in its infinite decimal expansion:
(1.8.7) φ(j) = ∑_{k=1}^∞ a_{jk} · 10^{−k} ,  a_{jk} ∈ {0, 1, . . . , 9}.
Now set
(1.8.8) bk = 2 if a_{kk} ≠ 2,  bk = 3 if a_{kk} = 2,
and consider
(1.8.9) ξ = ∑_{k=1}^∞ bk · 10^{−k} ,  ξ ∈ (0, 1).
It is seen that ξ is not equal to φ(j) for any j ∈ N, contradicting the hypothesis that φ : N → (0, 1) is onto.
A common notation for the cardinality of R is
(1.8.10) Card(R) = c.
We leave it as an exercise to the reader to show that
(1.8.11) Card(R × R) = c.
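The diagonal construction in the proof of Proposition 1.8.11 can be tried out concretely. The following Python sketch (the function name and sample digit rows are ours, for illustration only) applies the rule (1.8.8) to a finite list of decimal digit expansions and produces digits b_k that disagree with row j in the j-th position.

```python
# Illustration of the diagonal argument in Proposition 1.8.11.
# Each row lists the first few decimal digits a_{j1}, a_{j2}, ... of phi(j).
def diagonal_escape(digit_rows):
    """Return digits b_k as in (1.8.8): b_k = 2 if a_{kk} != 2, else 3,
    so the resulting number differs from row k in the k-th digit."""
    return [2 if row[k] != 2 else 3 for k, row in enumerate(digit_rows)]

rows = [
    [1, 4, 1, 5, 9],   # digits of phi(1)
    [2, 7, 1, 8, 2],   # digits of phi(2)
    [3, 3, 3, 3, 3],   # digits of phi(3)
    [5, 0, 2, 2, 5],   # digits of phi(4)
    [6, 1, 8, 0, 3],   # digits of phi(5)
]
b = diagonal_escape(rows)
# xi = sum b_k 10^{-k} differs from phi(j) in the j-th digit, for every j
for j, row in enumerate(rows):
    assert b[j] != row[j]
```

Since every b_k lies in {2, 3}, the resulting ξ has a unique decimal expansion, which is why the disagreement in one digit really forces ξ ≠ φ(j).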
Further development of the theory of cardinal numbers requires a formalization of the notions of set theory. In these notes we have used set theoretical notions rather informally. Our use of such notions has gotten somewhat heavier in this last section. In particular, in the proof of Proposition 1.8.7, the innocent looking use of the phrase “pick x ∈ S . . . ” actually assumes a weak version of the Axiom of Choice. For an introduction to the axiomatic treatment of set theory we refer to [5].
Exercises
1. What is the cardinality of the set P of prime numbers?
2. Let S be a nonempty set and let T be the set of all subsets of S. Adapt the proof of Proposition 1.8.11 to show that Card(S) < Card(T), i.e., there is not a surjective map φ : S → T.
Hint. There is a natural bijection of T and T̃, the set of functions f : S → {0, 1}, via f ↔ {x ∈ S : f(x) = 1}. Given φ̃ : S → T̃, describe a function g : S → {0, 1}, not in the range of φ̃, taking a cue from the proof of Proposition 1.8.11.
3. Finish the proof of Proposition 1.8.9.
4. Use the map f(x) = x/(1 + x) to prove that Card(R+) = Card((0, 1)).
5. Find a one-to-one map of R onto R+ and conclude that Card(R) = Card((0, 1)).
6. Use an interlacing of infinite decimal expansions to prove that Card((0, 1) × (0, 1)) = Card((0, 1)).
7. Prove (1.8.11).
8. Let m ∈ Z, n ∈ N, and consider Sm,n = {k ∈ Z : m + 1 ≤ k ≤ m + n}. Show that Card Sm,n = n.
Hint. Produce a bijective map In → Sm,n.
9. Let S and T be sets. Assume
Card S = m,  Card T = n,  S ∩ T = ∅,
with m, n ∈ N. Show that Card S ∪ T = m + n.
Hint. Produce bijective maps S → Im and T → Sm,n, leading to a bijection S ∪ T → Im+n.
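The diagonal function asked for in Exercise 2 can be tested on a small finite set. In this Python sketch (encoding and names are ours: a “subset” of S is a dict S → {0, 1}), the function g(x) = 1 − φ̃(x)(x) disagrees with φ̃(x) at x, so g is not in the range of φ̃.

```python
# Exercise 2's diagonal function on a small finite set S.
def diagonal_function(S, phi):
    """Given phi: x -> (dict S -> {0,1}), return g with g(x) = 1 - phi(x)(x).
    Then g differs from phi(x) at the point x, for every x in S."""
    return {x: 1 - phi[x][x] for x in S}

S = [0, 1, 2]
# An arbitrary attempted surjection S -> (functions S -> {0, 1})
phi = {
    0: {0: 0, 1: 1, 2: 0},
    1: {0: 1, 1: 1, 2: 1},
    2: {0: 0, 1: 0, 2: 0},
}
g = diagonal_function(S, phi)
assert all(g[x] != phi[x][x] for x in S)   # g disagrees with phi(x) at x
assert all(g != phi[x] for x in S)         # hence g is outside the range
```

For finite S this merely confirms Card(T) = 2^Card(S) > Card(S); the same one-line construction is what drives the general proof.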
1.9. Metric properties of R
We discuss a number of notions and results related to convergence in R. Recall that a sequence of points (pj) in R converges to a limit p ∈ R (we write pj → p) if and only if for every ε > 0 there exists N such that
(1.9.1) j ≥ N =⇒ |pj − p| < ε.
A set S ⊂ R is said to be closed if and only if (1.9.2)
pj ∈ S, pj → p =⇒ p ∈ S.
The complement R \ S of a closed set S is open. Alternatively, Ω ⊂ R is open if and only if, given q ∈ Ω, there exists ε > 0 such that Bε (q) ⊂ Ω, where (1.9.3)
Bε (q) = {p ∈ R : |p − q| < ε},
so q cannot be a limit of a sequence of points in R \ Ω. In particular, the interval (1.9.4)
[a, b] = {x ∈ R : a ≤ x ≤ b}
is closed, and the interval (1.9.5)
(a, b) = {x ∈ R : a < x < b}
is open. We define the closure S̄ of a set S ⊂ R to consist of all points p ∈ R such that Bε(p) ∩ S ≠ ∅ for all ε > 0. Equivalently, p ∈ S̄ if and only if there exists an infinite sequence (pj) of points in S such that pj → p. For example, the closure of the interval (a, b) is the interval [a, b]. An important property of R is completeness, which we recall is defined as follows. A sequence (pj) of points in R is called a Cauchy sequence if and only if (1.9.6)
|pj − pk | −→ 0,
as j, k → ∞.
It is easy to see that if pj → p for some p ∈ R, then (1.9.6) holds. The completeness property is the converse, given in Theorem 1.6.9, which we recall here. Theorem 1.9.1. If (pj ) is a Cauchy sequence in R, then it has a limit. Completeness provides a path to the following key notion of compactness. A nonempty set K ⊂ R is said to be compact if and only if the following property holds. (1.9.7)
Each infinite sequence (pj ) in K has a subsequence that converges to a point in K.
It is clear that if K is compact, then it must be closed. It must also be bounded, i.e., there exists R < ∞ such that K ⊂ BR (0). Indeed, if K is not bounded, there exist pj ∈ K such that |pj+1 | ≥ |pj | + 1. In such a case, |pj − pk | ≥ 1 whenever j ̸= k, so (pj ) cannot have a convergent subsequence. The following converse statement is a key result. Theorem 1.9.2. If a nonempty K ⊂ R is closed and bounded, then it is compact. Clearly every nonempty closed subset of a compact set is compact, so Theorem 1.9.2 is a consequence of: Proposition 1.9.3. Each closed bounded interval I = [a, b] ⊂ R is compact. Proof. This is a direct consequence of the Bolzano-Weierstrass theorem, Theorem 1.6.10. Let K ⊂ R be compact. Since K is bounded from above and from below, we have well defined real numbers (1.9.8)
b = sup K,
a = inf K,
the first by Proposition 1.6.12, and the second by a similar argument (cf. Exercise 2 of §1.6). Since a and b are limits of elements of K, we have a, b ∈ K. We use the notation (1.9.9)
b = max K,
a = min K.
We next discuss continuity. If S ⊂ R, a function (1.9.10)
f : S −→ R
is said to be continuous at p ∈ S provided (1.9.11)
pj ∈ S, pj → p =⇒ f (pj ) → f (p).
If f is continuous at each p ∈ S, we say f is continuous on S. The following two results give important connections between continuity and compactness. Proposition 1.9.4. If K ⊂ R is compact and f : K → R is continuous, then f (K) is compact. Proof. If (qk ) is an infinite sequence of points in f (K), pick pk ∈ K such that f (pk ) = qk . If K is compact, we have a subsequence pkν → p in K, and then qkν → f (p) in R. This leads to the second connection.
Proposition 1.9.5. If K ⊂ R is compact and f : K → R is continuous, then there exists p ∈ K such that (1.9.12)
f (p) = max f (x), x∈K
and there exists q ∈ K such that (1.9.13)
f (q) = min f (x). x∈K
Proof. Since f (K) is compact, we have well defined numbers (1.9.14)
b = max f (K),
a = min f (K),
a, b ∈ f (K).
So take p, q ∈ K such that f (p) = b and f (q) = a.
The next result is called the intermediate value theorem. Proposition 1.9.6. Take a, b, c ∈ R, a < b. Let f : [a, b] → R be continuous. Assume (1.9.15)
f (a) < c < f (b).
Then there exists x ∈ (a, b) such that f (x) = c. Proof. Let (1.9.16)
S = {y ∈ [a, b] : f (y) ≤ c}.
Then a ∈ S, so S is a nonempty, closed (hence compact) subset of [a, b]. Note that b ∉ S. Take
(1.9.17) x = max S.
Then a < x < b and f(x) ≤ c. If f(x) < c, then there exists ε > 0 such that a < x − ε < x + ε < b and f(y) < c for x − ε < y < x + ε. Thus x + ε/2 ∈ S, contradicting (1.9.17).
Returning to the issue of compactness, we establish some further properties of compact sets K ⊂ R, leading to the important result, Proposition 1.9.10 below.
Proposition 1.9.7. Let K ⊂ R be compact. Assume X1 ⊃ X2 ⊃ X3 ⊃ · · · form a decreasing sequence of closed subsets of K. If each Xm ≠ ∅, then ∩m Xm ≠ ∅.
Proof. Pick xm ∈ Xm. If K is compact, (xm) has a convergent subsequence, xmk → y. Since {xmk : k ≥ ℓ} ⊂ Xmℓ, which is closed, we have y ∈ ∩m Xm.
Corollary 1.9.8. Let K ⊂ R be compact. Assume U1 ⊂ U2 ⊂ U3 ⊂ · · · form an increasing sequence of open sets in R. If ∪m Um ⊃ K, then UM ⊃ K for some M.
Proof. Consider Xm = K \ Um .
Before getting to Proposition 1.9.10, we bring in the following. Let Q denote the set of rational numbers. The set Q ⊂ R has the following “denseness” property: given p ∈ R and ε > 0, there exists q ∈ Q such that |p − q| < ε. Let
(1.9.18) ℛ = {B_{rj}(qj) : qj ∈ Q, rj ∈ Q ∩ (0, ∞)}.
Note that Q is countable, i.e., it can be put in one-to-one correspondence with N. Hence ℛ is a countable collection of balls. The following lemma is left as an exercise for the reader.
Lemma 1.9.9. Let Ω ⊂ R be a nonempty open set. Then
(1.9.19) Ω = ∪{B : B ∈ ℛ, B ⊂ Ω}.
To state the next result, we say that a collection {Uα : α ∈ A} covers K if K ⊂ ∪α∈A Uα. If each Uα ⊂ R is open, it is called an open cover of K. If B ⊂ A and K ⊂ ∪β∈B Uβ, we say {Uβ : β ∈ B} is a subcover. This result is called the Heine-Borel theorem.
Proposition 1.9.10. If K ⊂ R is compact, then it has the following property.
(1.9.20) Every open cover {Uα : α ∈ A} of K has a finite subcover.
Proof. By Lemma 1.9.9, it suffices to prove the following.
(1.9.21) Every countable cover {Bj : j ∈ N} of K by open intervals has a finite subcover.
For this, we set
(1.9.22) Um = B1 ∪ · · · ∪ Bm
and apply Corollary 1.9.8.
Exercises
1. Consider a polynomial p(x) = x^n + a_{n−1} x^{n−1} + · · · + a1 x + a0. Assume each aj ∈ R and n is odd. Use the intermediate value theorem to show that p(x) = 0 for some x ∈ R.
We describe the construction of a Cantor set. Take a closed, bounded interval [a, b] = C0. Let C1 be obtained from C0 by deleting the open middle third interval, of length (b − a)/3. At the jth stage, Cj is a disjoint union of 2^j closed intervals, each of length 3^{−j}(b − a). Then Cj+1 is obtained from Cj by deleting the open middle third of each of these 2^j intervals. We have C0 ⊃ C1 ⊃ · · · ⊃ Cj ⊃ · · · , each a closed subset of [a, b].
2. Show that
(1.9.23) C = ∩_{j≥0} Cj
is nonempty, and compact. This is the Cantor set.
3. Suppose C is formed as above, with [a, b] = [0, 1]. Show that points in C are precisely those of the form
(1.9.24) ξ = ∑_{j=1}^∞ bj 3^{−j} ,  bj ∈ {0, 2}.
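The characterization (1.9.24) gives a computable membership test for the Cantor set with [a, b] = [0, 1]: a point lies in C exactly when it has a ternary expansion avoiding the digit 1. The Python sketch below (function name and digit depth are ours) checks this with exact rational arithmetic.

```python
from fractions import Fraction

def in_cantor(x, depth=30):
    """Test whether x in [0, 1) has a base-3 expansion using only digits
    {0, 2} in its first `depth` digits, as in (1.9.24). Fraction keeps
    the arithmetic exact, avoiding floating-point drift."""
    x = Fraction(x)
    for _ in range(depth):
        x *= 3
        digit = int(x)          # leading ternary digit
        if digit == 1:
            # a digit 1 is allowed only when the tail is .1000... = .0222...
            return x == 1
        x -= digit
    return True

# 1/4 = 0.020202..._3 lies in C although it is not an interval endpoint
assert in_cantor(Fraction(1, 4))
assert in_cantor(Fraction(1, 3)) and in_cantor(Fraction(2, 3))
assert not in_cantor(Fraction(1, 2))   # 1/2 = 0.111..._3
```

The example 1/4 ∈ C also shows that C contains points besides the endpoints of the deleted intervals.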
4. If p, q ∈ C (and p < q), show that the interval [p, q] must contain points not in C. One says C is totally disconnected.
5. If p ∈ C, ε > 0, show that (p − ε, p + ε) contains infinitely many points in C. Given that C is closed, one says C is perfect.
6. Show that Card(C) = Card(R).
Hint. With ξ as in (1.9.24), show that
ξ ↦ η = ∑_{j=1}^∞ (bj/2) 2^{−j}
maps C onto [0, 1].
Remark. At this point, we mention the Continuum Hypothesis. If S ⊂ R is uncountable, then Card S = Card R. This hypothesis has been shown not to be amenable to proof or disproof, from the standard axioms of set theory. See [4]. However, there is a large class of sets for which the conclusion holds. For example, it holds whenever S ⊂ R is uncountable and compact. See Exercises 7–9 in §2.3 of Chapter 2 for further results along this line.
7. Show that Proposition 1.9.6 implies Proposition 1.7.1.
8. In the setting of Proposition 1.9.6 (the intermediate value theorem), in which f : [a, b] → R is continuous and f(a) < c < f(b), consider the following.
(a) Divide I = [a, b] into two equal intervals Iℓ and Ir, meeting at the midpoint α0 = (a + b)/2. Select I1 = Iℓ if f(α0) ≥ c, I1 = Ir if f(α0) < c. Say I1 = [x1, y1]. Note that f(x1) < c, f(y1) ≥ c.
(b) Divide I1 into two equal intervals I1ℓ and I1r, meeting at the midpoint (x1 + y1)/2 = α1. Select I2 = I1ℓ if f(α1) ≥ c, I2 = I1r if f(α1) < c. Say I2 = [x2, y2]. Note that f(x2) < c, f(y2) ≥ c.
(c) Continue. Having Ik = [xk, yk], of length 2^{−k}(b − a), with f(xk) < c, f(yk) ≥ c, divide Ik into two equal intervals Ikℓ and Ikr, meeting at the midpoint αk = (xk + yk)/2. Select Ik+1 = Ikℓ if f(αk) ≥ c, Ik+1 = Ikr if f(αk) < c. Again, Ik+1 = [xk+1, yk+1] with f(xk+1) < c and f(yk+1) ≥ c.
(d) Show that there exists x ∈ (a, b) such that xk ↗ x, yk ↘ x, and f(x) = c.
This method of approximating a solution to f(x) = c is called the bisection method.
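The scheme in steps (a)–(c) of Exercise 8 translates directly into code. Here is a minimal Python sketch (the function name, test function, and step count are ours); it keeps f(x_k) < c and f(y_k) ≥ c while halving the interval at each step.

```python
def bisect(f, a, b, c, steps=60):
    """Bisection scheme of Exercise 8: maintain x_k < y_k with
    f(x_k) < c and f(y_k) >= c; each step halves y_k - x_k."""
    x, y = a, b
    for _ in range(steps):
        mid = (x + y) / 2
        if f(mid) >= c:
            y = mid        # select the left half, as in the exercise
        else:
            x = mid        # select the right half
    return (x + y) / 2

# Solve f(x) = 2 for f(x) = x**3 on [0, 2]; here f(0) < 2 < f(2)
root = bisect(lambda t: t ** 3, 0.0, 2.0, 2.0)
assert abs(root ** 3 - 2.0) < 1e-9
```

After k steps the interval has length 2^{-k}(b − a), so 60 steps already exhaust double-precision accuracy on an interval of length 2.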
1.10. Complex numbers
A complex number is a number of the form
(1.10.1) z = x + iy,  x, y ∈ R,
where the new object i has the property
(1.10.2) i^2 = −1.
We denote the set of complex numbers by C. We have R ↪ C, identifying x ∈ R with x + i0 ∈ C. We define addition and multiplication in C as follows. Suppose w = a + ib, a, b ∈ R. We set
(1.10.3) z + w = (x + a) + i(y + b),  zw = (xa − yb) + i(xb + ya).
See Figures 1.10.1 and 1.10.2 for illustrations of these operations. It is routine to verify various commutative, associative, and distributive laws, parallel to those in Proposition 1.4.3. If z ≠ 0, i.e., either x ≠ 0 or y ≠ 0, we can set
(1.10.4) z^{−1} = 1/z = x/(x^2 + y^2) − i y/(x^2 + y^2),
and verify that zz^{−1} = 1.
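The rules (1.10.3)–(1.10.4) can be transcribed directly into code. The Python sketch below (function names ours) implements them on pairs (x, y), rather than leaning on Python's built-in complex type, so each line mirrors one of the displayed formulas.

```python
def c_add(z, w):
    """(1.10.3): (x + iy) + (a + ib), with z = (x, y), w = (a, b)."""
    (x, y), (a, b) = z, w
    return (x + a, y + b)

def c_mul(z, w):
    """(1.10.3): (x + iy)(a + ib) = (xa - yb) + i(xb + ya)."""
    (x, y), (a, b) = z, w
    return (x * a - y * b, x * b + y * a)

def c_inv(z):
    """(1.10.4): z^{-1} = x/(x^2 + y^2) - i y/(x^2 + y^2), for z != 0."""
    x, y = z
    r2 = x * x + y * y
    return (x / r2, -y / r2)

i = (0.0, 1.0)
assert c_mul(i, i) == (-1.0, 0.0)      # i^2 = -1, as in (1.10.2)
z = (3.0, 4.0)
p = c_mul(z, c_inv(z))                 # z z^{-1} should be 1, up to rounding
assert abs(p[0] - 1.0) < 1e-12 and abs(p[1]) < 1e-12
```

The check z z^{-1} = 1 is only exact up to floating-point rounding, which is why a tolerance appears in the last assertion.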
Figure 1.10.1. Addition in the complex plane
For some more notation, for z ∈ C of the form (1.10.1), we set
(1.10.5) z̄ = x − iy,  Re z = x,  Im z = y.
We say z̄ is the complex conjugate of z, Re z is the real part of z, and Im z is the imaginary part of z.
We next discuss the concept of the magnitude (or absolute value) of an element z ∈ C. If z has the form (1.10.1), we take a cue from the Pythagorean theorem, giving the Euclidean distance from z to 0, and set
(1.10.6) |z| = √(x^2 + y^2).
Note that
(1.10.7) |z|^2 = z z̄.
With this notation, (1.10.4) takes the compact (and clear) form
(1.10.8) z^{−1} = z̄/|z|^2 .
We have
(1.10.9) |zw| = |z| · |w|,
Figure 1.10.2. Multiplication by i in C
for z, w ∈ C, as a consequence of the identity (readily verified from the definition (1.10.5))
(1.10.10) \overline{zw} = z̄ · w̄.
In fact, |zw|^2 = (zw)(\overline{zw}) = z w z̄ w̄ = z z̄ w w̄ = |z|^2 |w|^2. This extends the first part of (1.6.23) from R to C. The extension of the second part also holds, but it requires a little more work. The following is the triangle inequality in C.
Proposition 1.10.1. Given z, w ∈ C,
(1.10.11) |z + w| ≤ |z| + |w|.
Proof. We compare the squares of each side of (1.10.11). First,
(1.10.12) |z + w|^2 = (z + w)(z̄ + w̄) = |z|^2 + |w|^2 + wz̄ + zw̄ = |z|^2 + |w|^2 + 2 Re(zw̄).
Now, for any ζ ∈ C, Re ζ ≤ |ζ|, so Re(zw̄) ≤ |zw̄| = |z| · |w|, so (1.10.12) is
(1.10.13) ≤ |z|^2 + |w|^2 + 2|z| · |w| = (|z| + |w|)^2 ,
and we have (1.10.11).
We now discuss matters related to convergence in C. Parallel to the real case, we say a sequence (zj) in C converges to a limit z ∈ C (and write zj → z) if and only if for each ε > 0 there exists N such that
(1.10.14) j ≥ N =⇒ |zj − z| < ε.
Equivalently,
(1.10.15) zj → z ⇐⇒ |zj − z| → 0.
It is easily seen that
(1.10.16) zj → z ⇐⇒ Re zj → Re z and Im zj → Im z.
The set C also has the completeness property, given as follows. A sequence (zj) in C is said to be a Cauchy sequence if and only if
(1.10.17) |zj − zk| → 0, as j, k → ∞.
It is easy to see (using the triangle inequality) that if zj → z for some z ∈ C, then (1.10.17) holds. Here is the converse:
Proposition 1.10.2. If (zj) is a Cauchy sequence in C, then it has a limit.
Proof. If (zj) is Cauchy in C, then (Re zj) and (Im zj) are Cauchy in R, so, by Theorem 1.6.9, they have limits.
We turn to infinite series ∑_{k=0}^∞ ak, with ak ∈ C. We say this converges if and only if the sequence of partial sums
(1.10.18) Sn = ∑_{k=0}^n ak
converges:
(1.10.19) ∑_{k=0}^∞ ak = A ⇐⇒ Sn → A as n → ∞.
The following is a useful condition guaranteeing convergence. Compare Proposition 1.6.13.
Proposition 1.10.3. The infinite series ∑_{k=0}^∞ ak converges provided
(1.10.20) ∑_{k=0}^∞ |ak| < ∞,
i.e., there exists B < ∞ such that ∑_{k=0}^n |ak| ≤ B for all n.
Proof. The triangle inequality gives, for ℓ ≥ 1,
(1.10.21) |S_{n+ℓ} − S_n| = |∑_{k=n+1}^{n+ℓ} ak| ≤ ∑_{k=n+1}^{n+ℓ} |ak|,
which tends to 0 as n → ∞, uniformly in ℓ ≥ 1, provided (1.10.20) holds (cf. (1.6.42)–(1.6.43)). Hence (1.10.20) ⇒ (Sn) is Cauchy. Convergence then follows, by Proposition 1.10.2.
As in the real case, if (1.10.20) holds, we say the infinite series ∑_{k=0}^∞ ak is absolutely convergent. An example to which Proposition 1.10.3 applies is the following power series, giving the exponential function e^z:
(1.10.22) e^z = ∑_{k=0}^∞ z^k/k! ,  z ∈ C.
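The series (1.10.22) is absolutely convergent for every z, since ∑ |z|^k/k! < ∞, so Proposition 1.10.3 applies. Its partial sums can be checked numerically; the Python sketch below (our code, using the standard library's cmath only for comparison) accumulates terms by the recursion term_{k+1} = term_k · z/(k+1).

```python
import cmath

def exp_partial(z, n):
    """Partial sum S_n = sum_{k=0}^{n} z^k / k! of the series (1.10.22)."""
    term, total = 1.0 + 0.0j, 0.0 + 0.0j
    for k in range(n + 1):
        total += term
        term *= z / (k + 1)
    return total

z = 1.0 + 1.0j
assert abs(exp_partial(z, 30) - cmath.exp(z)) < 1e-14
```

Thirty terms already reproduce cmath.exp to machine precision for |z| of moderate size, reflecting the factorial decay of the coefficients.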
Compare Exercise 13 of §1.6. The exponential function is explored in depth in §4.5 of Chapter 4.
We turn to a discussion of polar coordinates on C. Given a nonzero z ∈ C, we can write
(1.10.23) z = rω,  r = |z|,  ω = z/|z|.
Then ω has unit distance from 0. If the ray from 0 to ω makes an angle θ with the positive real axis, we have (1.10.24)
Re ω = cos θ,
Im ω = sin θ,
by definition of the trigonometric functions cos and sin. Hence (1.10.25)
z = r cis θ,
where (1.10.26)
cis θ = cos θ + i sin θ.
If also (1.10.27)
w = ρ cis φ,
ρ = |w|,
then (1.10.28)
zw = rρ cis(θ + φ),
as a consequence of the identity (1.10.29)
cis(θ + φ) = (cis θ)(cis φ),
which in turn is equivalent to the pair of trigonometric identities
(1.10.30) cos(θ + φ) = cos θ cos φ − sin θ sin φ,  sin(θ + φ) = cos θ sin φ + sin θ cos φ.
There is another way to write (1.10.25), using the classical Euler identity
(1.10.31) e^{iθ} = cos θ + i sin θ.
Then (1.10.25) becomes
(1.10.32) z = r e^{iθ}.
The identity (1.10.29) is equivalent to
(1.10.33) e^{i(θ+φ)} = e^{iθ} e^{iφ}.
We will present a self-contained derivation of (1.10.31) (and also of (1.10.30) and (1.10.33)) in Chapter 4, §§4.4–4.5. The analysis there includes a precise description of what “angle θ” means. We next define closed and open subsets of C, and discuss the notion of compactness. A set S ⊂ C is said to be closed if and only if (1.10.34)
zj ∈ S, zj → z =⇒ z ∈ S.
The complement C \ S of a closed set S is open. Alternatively, Ω ⊂ C is open if and only if, given q ∈ Ω, there exists ε > 0 such that Bε (q) ⊂ Ω, where (1.10.35)
Bε (q) = {z ∈ C : |z − q| < ε},
so q cannot be a limit of a sequence of points in C \ Ω.
We define the closure S̄ of a set S ⊂ C to consist of all points p ∈ C such that Bε(p) ∩ S ≠ ∅ for all ε > 0. Equivalently, p ∈ S̄ if and only if there exists an infinite sequence (pj) of points in S such that pj → p.
Parallel to (1.9.7), we say a nonempty set K ⊂ C is compact if and only if the following property holds.
(1.10.36) Each infinite sequence (pj) in K has a subsequence that converges to a point in K.
As in §1.9, if K ⊂ C is compact, it must be closed and bounded. Parallel to Theorem 1.9.2, we have the converse.
Proposition 1.10.4. If a nonempty K ⊂ C is closed and bounded, then it is compact.
Proof. Let (zj) be a sequence in K. Then (Re zj) and (Im zj) are bounded, so Theorem 1.6.10 implies the existence of a subsequence such that Re zjν and Im zjν converge. Hence the subsequence (zjν) converges in C. Since K is closed, the limit must belong to K.
If S ⊂ C, a function f : S −→ C
(1.10.37)
is said to be continuous at p ∈ S provided pj ∈ S, pj → p =⇒ f (pj ) → f (p).
(1.10.38)
If f is continuous at each p ∈ S, we say f is continuous on S. The following result has the same proof as Proposition 1.9.4. Proposition 1.10.5. If K ⊂ C is compact and f : K → C is continuous, then f (K) is compact. Then the following variant of Proposition 1.9.5 is straightforward. Proposition 1.10.6. If K ⊂ C is compact and f : K → C is continuous, then there exists p ∈ K such that |f (p)| = max |f (z)|,
(1.10.39)
z∈K
and there exists q ∈ K such that |f (q)| = min |f (z)|.
(1.10.40)
z∈K
There are also straightforward extensions to K ⊂ C of Propositions 1.9.7–1.9.10. We omit the details. But see §2.1 of Chapter 2 for further extensions.
Exercises
We define π as the smallest positive number such that cis π = −1. See Chapter 4, §§4.4–4.5 for more on this matter.
1. Show that
ω = cis(2π/n) =⇒ ω^n = 1.
For this, use (1.10.29). In conjunction with (1.10.25)–(1.10.28) and Proposition 1.7.1, use this to prove the following:
Given a ∈ C, a ≠ 0, n ∈ N, there exist z1, . . . , zn ∈ C such that z_j^n = a.
2. Compute
((1/2) + (√3/2) i)^3 ,
and verify that
(1.10.41) cos(π/3) = 1/2,  sin(π/3) = √3/2.
3. Find z1, . . . , zn such that
(1.10.42) z_j^n = 1,
explicitly in the form a + ib (not simply as cis(2πj/n)), in case
(1.10.43) n = 3, 4, 6, 8.
Hint. Use (1.10.41), and also the fact that the equation u_j^2 = i has solutions
(1.10.44) u1 = 1/√2 + i/√2,  u2 = −u1.
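For Exercise 3, the candidate roots cis(2πj/n) are easy to generate and check numerically. The Python sketch below (function name ours) verifies z_j^n = 1 for n = 3, 4, 6, 8 and recovers the value cos(π/3) = 1/2 from (1.10.41).

```python
import cmath, math

def roots_of_unity(n):
    """The n solutions of z^n = 1, namely cis(2*pi*j/n), j = 0, ..., n-1."""
    return [cmath.exp(2j * math.pi * j / n) for j in range(n)]

for n in (3, 4, 6, 8):
    for z in roots_of_unity(n):
        assert abs(z ** n - 1) < 1e-12

# The primitive 6th root cis(pi/3) is 1/2 + i sqrt(3)/2, matching (1.10.41)
w = roots_of_unity(6)[1]
assert abs(w.real - 0.5) < 1e-12 and abs(w.imag - math.sqrt(3) / 2) < 1e-12
```

Of course this does not replace the exercise, which asks for the roots in closed form a + ib; it only confirms the answers one obtains from (1.10.41) and (1.10.44).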
4. Take the following path to finding the 5 solutions to
(1.10.45) z_j^5 = 1.
One solution is z1 = 1. Since z^5 − 1 = (z − 1)(z^4 + z^3 + z^2 + z + 1), we need to find 4 solutions to z^4 + z^3 + z^2 + z + 1 = 0. Write this as
(1.10.46) z^2 + z + 1 + 1/z + 1/z^2 = 0,
which, for
(1.10.47) w = z + 1/z,
becomes
(1.10.48) w^2 + w − 1 = 0.
Use the quadratic formula to find 2 solutions to (1.10.48). Then solve (1.10.47), i.e., z^2 − wz + 1 = 0, for z. Use these calculations to show that
cos(2π/5) = (√5 − 1)/4.
The roots zj of (1.10.45) form the vertices of a regular pentagon. See Figure 1.10.3.
5. Take the following path to explicitly finding the real and imaginary parts of a solution to z^2 = a + ib. Namely, with x = Re z, y = Im z, we have
x^2 − y^2 = a,  2xy = b,
Figure 1.10.3. Regular pentagon, a = (√5 − 1)/4.
and also
x^2 + y^2 = ρ = √(a^2 + b^2),
hence
x = √((ρ + a)/2),  y = b/(2x),
as long as a + ib ≠ −|a|.
6. Taking a cue from Exercise 4 of §1.6, show that
(1.10.49) 1/(1 − z) = ∑_{k=0}^∞ z^k ,  for z ∈ C, |z| < 1.
7. Show that
1/(1 − z^2) = ∑_{k=0}^∞ z^{2k} ,  for z ∈ C, |z| < 1.
8. Produce a power series expansion in z, valid for |z| < 1, for 1/(1 + z^2).
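The expansions in Exercises 6–8 are all geometric series; substituting z² or −z² for z in (1.10.49) produces the other two. A quick numerical check in Python (our code):

```python
def geometric_partial(z, n):
    """Partial sum 1 + z + ... + z^n of the geometric series in (1.10.49)."""
    total, term = 0j, 1 + 0j
    for _ in range(n + 1):
        total += term
        term *= z
    return total

z = 0.3 + 0.4j                       # |z| = 0.5 < 1
assert abs(geometric_partial(z, 60) - 1 / (1 - z)) < 1e-12
# Substituting z -> -z^2 gives the expansion of 1/(1 + z^2) of Exercise 8
assert abs(geometric_partial(-z * z, 60) - 1 / (1 + z * z)) < 1e-12
```

The tail of the partial sum is bounded by |z|^{n+1}/(1 − |z|), which is why 60 terms suffice here.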
Chapter 2
Spaces
In Chapter 1 we developed the real number line R, and established a number of metric properties, such as completeness of R, and compactness of closed, bounded subsets. We also produced the complex plane C, and studied analogous metric properties of C. Here we examine other types of spaces, which are useful in analysis.
Section 2.1 treats n-dimensional Euclidean space, Rn. This is equipped with a dot product x · y ∈ R, which gives rise to a norm |x| = √(x · x). Parallel to (1.6.23) and (1.10.11) of Chapter 1, this norm satisfies the triangle inequality. In this setting, the proof goes through an inequality known as Cauchy’s inequality. Then the distance between x and y in Rn is given by d(x, y) = |x − y|, and it satisfies a triangle inequality. With these structures, we have the notion of convergent sequences and Cauchy sequences, and can show that Rn is complete. There is a notion of compactness for subsets of Rn, similar to that given in (1.9.7) and in (1.10.36) of Chapter 1, for subsets of R and of C, and it is shown that nonempty, closed bounded subsets of Rn are compact.
Analysts have found it useful to abstract some of the structures mentioned above, and apply them to a larger class of spaces, called metric spaces. A metric space is a set X, equipped with a distance function d(x, y), satisfying certain conditions (see (2.2.1)), including the triangle inequality. For such a space, one has natural notions of a convergent sequence and of a Cauchy sequence. The space may or may not be complete. If not, there is a construction of its completion, somewhat similar to the construction of R as the completion of Q in §1.6 of Chapter 1. We discuss the definition and some basic properties of metric spaces in §2.2. There is also a natural notion of compactness in the metric space context, which we treat in §2.3.
Most metric spaces we will encounter are subsets of Euclidean space. One exception introduced in this chapter is the class of infinite products; see (2.3.9). Another important class of metric spaces beyond the Euclidean space setting consists of spaces of functions, which will be treated in §3.4 of Chapter 3.
2.1. Euclidean spaces
The space Rn, n-dimensional Euclidean space, consists of n-tuples of real numbers:
(2.1.1) x = (x1, . . . , xn) ∈ Rn,  xj ∈ R, 1 ≤ j ≤ n.
The number xj is called the jth component of x. Here we discuss some important algebraic and metric structures on Rn . First, there is addition. If x is as in (2.1.1) and also y = (y1 , . . . , yn ) ∈ Rn , we have (2.1.2)
x + y = (x1 + y1 , . . . , xn + yn ) ∈ Rn .
Addition is done componentwise. Also, given a ∈ R, we have
(2.1.3) ax = (ax1, . . . , axn) ∈ Rn.
This is scalar multiplication. We also have the dot product,
(2.1.4) x · y = ∑_{j=1}^n xj yj = x1 y1 + · · · + xn yn ∈ R,
given x, y ∈ Rn. The dot product has the properties
(2.1.5) x · y = y · x,  x · (ay + bz) = a(x · y) + b(x · z),  x · x > 0 unless x = 0.
Note that
(2.1.6) x · x = x1^2 + · · · + xn^2 .
We set
(2.1.7) |x| = √(x · x),
which we call the norm of x. Note that (2.1.5) implies
(2.1.8) (ax) · (ax) = a^2 (x · x),
hence
(2.1.9) |ax| = |a| · |x|,  for a ∈ R, x ∈ Rn.
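The definitions (2.1.4), (2.1.7), and the distance d(x, y) = |x − y| introduced just below translate directly into code. A minimal Python sketch (function names ours):

```python
import math

def dot(x, y):
    """Dot product (2.1.4): the sum of the products x_j y_j."""
    return sum(xj * yj for xj, yj in zip(x, y))

def norm(x):
    """Norm (2.1.7): |x| = sqrt(x . x)."""
    return math.sqrt(dot(x, x))

def dist(x, y):
    """Distance (2.1.10): d(x, y) = |x - y|."""
    return norm([xj - yj for xj, yj in zip(x, y)])

x, y = [3.0, 0.0, 4.0], [1.0, 2.0, 2.0]
assert norm(x) == 5.0                                        # sqrt(9 + 16)
assert abs(norm([2 * t for t in x]) - 2 * norm(x)) < 1e-12   # (2.1.9), a = 2
assert dist(x, y) == norm([2.0, -2.0, 2.0])
```

The homogeneity check is the numerical counterpart of (2.1.8)–(2.1.9), with a = 2.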
Taking a cue from the Pythagorean theorem, we say that the distance from x to y in Rn is (2.1.10)
d(x, y) = |x − y|.
For us, (2.1.7) and (2.1.10) are simply definitions. We do not need to depend on the Pythagorean theorem. Significant properties will be derived below, without recourse to the Pythagorean theorem. A set X equipped with a distance function is called a metric space. We will consider metric spaces in general in the next section. Here, we want to show that the Euclidean distance, defined by (2.1.10), satisfies the “triangle inequality,” (2.1.11)
d(x, y) ≤ d(x, z) + d(z, y),
∀ x, y, z ∈ Rn .
This in turn is a consequence of the following, also called the triangle inequality. Proposition 2.1.1. The norm (2.1.7) on Rn has the property (2.1.12)
|x + y| ≤ |x| + |y|,
∀ x, y ∈ Rn .
Proof. We compare the squares of the two sides of (2.1.12). First,
(2.1.13) |x + y|^2 = (x + y) · (x + y) = x · x + x · y + y · x + y · y = |x|^2 + 2 x · y + |y|^2 .
Next, (2.1.14)
(|x| + |y|)2 = |x|2 + 2|x| · |y| + |y|2 .
We see that (2.1.12) holds if and only if x · y ≤ |x| · |y|. Thus the proof of Proposition 2.1.1 is finished off by the following result, known as Cauchy’s inequality. Proposition 2.1.2. For all x, y ∈ Rn , (2.1.15)
|x · y| ≤ |x| · |y|.
Proof. We start with the chain (2.1.16)
0 ≤ |x − y|2 = (x − y) · (x − y) = |x|2 + |y|2 − 2x · y,
which implies (2.1.17)
2x · y ≤ |x|2 + |y|2 ,
∀ x, y ∈ Rn .
If we replace x by tx and y by t−1 y, with t > 0, the left side of (2.1.17) is unchanged, so we have (2.1.18)
2x · y ≤ t2 |x|2 + t−2 |y|2 ,
∀ t > 0.
Now we pick t so that the two terms on the right side of (2.1.18) are equal, namely
(2.1.19) t^2 = |y|/|x| ,  t^{−2} = |x|/|y| .
(At this point, note that (2.1.15) is obvious if x = 0 or y = 0, so we will assume that x ̸= 0 and y ̸= 0.) Plugging (2.1.19) into (2.1.18) gives (2.1.20)
x · y ≤ |x| · |y|,
∀ x, y ∈ Rn .
This is almost (2.1.15). To finish, we can replace x in (2.1.20) by −x = (−1)x, getting
(2.1.21) −(x · y) ≤ |x| · |y|,
and together (2.1.20) and (2.1.21) give (2.1.15).
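The inequalities just proved are easy to spot-check numerically. The Python sketch below (our code; the helpers and the choice of random test vectors are ours) verifies Cauchy's inequality (2.1.15) and the triangle inequality (2.1.12) on many random vectors; the small tolerance only guards against floating-point rounding.

```python
import math, random

def dot(x, y):
    # Dot product as in (2.1.4)
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    # Norm as in (2.1.7)
    return math.sqrt(dot(x, x))

random.seed(0)
for _ in range(1000):
    n = random.randint(1, 6)
    x = [random.uniform(-10, 10) for _ in range(n)]
    y = [random.uniform(-10, 10) for _ in range(n)]
    # Cauchy's inequality (2.1.15)
    assert abs(dot(x, y)) <= norm(x) * norm(y) + 1e-9
    # Triangle inequality (2.1.12)
    s = [a + b for a, b in zip(x, y)]
    assert norm(s) <= norm(x) + norm(y) + 1e-9
```

Such a check is no substitute for the proof, but it is a useful sanity test when implementing inner products and norms.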
We now discuss a number of notions and results related to convergence in Rn. First, a sequence of points (pj) in Rn converges to a limit p ∈ Rn (we write pj → p) if and only if
(2.1.22) |pj − p| −→ 0,
where | · | is the Euclidean norm on Rn, defined by (2.1.7), and the meaning of (2.1.22) is that for every ε > 0 there exists N such that
(2.1.23) j ≥ N =⇒ |pj − p| < ε.
If we write pj = (p1j, . . . , pnj) and p = (p1, . . . , pn), then (2.1.22) is equivalent to
(p1j − p1)^2 + · · · + (pnj − pn)^2 −→ 0, as j → ∞,
which holds if and only if |pℓj − pℓ| −→ 0 as j → ∞, for each ℓ ∈ {1, . . . , n}. That is to say, convergence pj → p in Rn is equivalent to convergence of each component. A set S ⊂ Rn is said to be closed if and only if (2.1.24)
pj ∈ S, pj → p =⇒ p ∈ S.
The complement Rn \ S of a closed set S is open. Alternatively, Ω ⊂ Rn is open if and only if, given q ∈ Ω, there exists ε > 0 such that Bε (q) ⊂ Ω, where (2.1.25)
Bε (q) = {p ∈ Rn : |p − q| < ε},
so q cannot be a limit of a sequence of points in Rn \ Ω.
An important property of Rn is completeness, a property defined as follows. A sequence (pj) of points in Rn is called a Cauchy sequence if and only if
(2.1.26) |pj − pk| → 0, as j, k → ∞.
Again we see that (pj) is Cauchy in Rn if and only if each component is Cauchy in R. It is easy to see that if pj → p for some p ∈ Rn, then (2.1.26) holds. The completeness property is the converse.

Theorem 2.1.3. If (pj) is a Cauchy sequence in Rn, then it has a limit, i.e., (2.1.22) holds for some p ∈ Rn.

Proof. Since convergence pj → p in Rn is equivalent to convergence in R of each component, the result is a consequence of the completeness of R. This was proved in Chapter 1.

Completeness provides a path to the following key notion of compactness. A nonempty set K ⊂ Rn is said to be compact if and only if the following property holds:
(2.1.27) Each infinite sequence (pj) in K has a subsequence that converges to a point in K.
It is clear that if K is compact, then it must be closed. It must also be bounded, i.e., there exists R < ∞ such that K ⊂ BR(0). Indeed, if K is not bounded, there exist pj ∈ K such that |pj+1| ≥ |pj| + 1. In such a case, |pj − pk| ≥ 1 whenever j ≠ k, so (pj) cannot have a convergent subsequence. The following converse statement is a key result.

Theorem 2.1.4. If a nonempty K ⊂ Rn is closed and bounded, then it is compact.

Proof. If K ⊂ Rn is closed and bounded, it is a closed subset of some box
(2.1.28) B = {(x1, ..., xn) ∈ Rn : a ≤ xk ≤ b, ∀ k}.
Clearly every closed subset of a compact set is compact, so it suffices to show that B is compact. Now, each closed bounded interval [a, b] in R is compact, as shown in §1.9 of Chapter 1, and (by reasoning similar to the proof of Theorem 2.1.3) the compactness of B follows readily from this.

We establish some further properties of compact sets K ⊂ Rn, leading to the important result, Proposition 2.1.8 below. This generalizes results established for n = 1 in §1.9 of Chapter 1. A further generalization will be given in §2.3.
Proposition 2.1.5. Let K ⊂ Rn be compact. Assume X1 ⊃ X2 ⊃ X3 ⊃ ··· form a decreasing sequence of closed subsets of K. If each Xm ≠ ∅, then ∩m Xm ≠ ∅.

Proof. Pick xm ∈ Xm. Since K is compact, (xm) has a convergent subsequence, xmk → y. Since {xmk : k ≥ ℓ} ⊂ Xmℓ, which is closed, we have y ∈ ∩m Xm.

Corollary 2.1.6. Let K ⊂ Rn be compact. Assume U1 ⊂ U2 ⊂ U3 ⊂ ··· form an increasing sequence of open sets in Rn. If ∪m Um ⊃ K, then UM ⊃ K for some M.

Proof. Consider Xm = K \ Um.

Before getting to Proposition 2.1.8, we bring in the following. Let Q denote the set of rational numbers, and let Qn denote the set of points in Rn all of whose components are rational. The set Qn ⊂ Rn has the following "denseness" property: given p ∈ Rn and ε > 0, there exists q ∈ Qn such that |p − q| < ε. Let
(2.1.29) R = {Br(q) : q ∈ Qn, r ∈ Q ∩ (0, ∞)}.
Note that Q and Qn are countable, i.e., they can be put in one-to-one correspondence with N. Hence R is a countable collection of balls. The following lemma is left as an exercise for the reader.

Lemma 2.1.7. Let Ω ⊂ Rn be a nonempty open set. Then
(2.1.30) Ω = ∪{B : B ∈ R, B ⊂ Ω}.

To state the next result, we say that a collection {Uα : α ∈ A} covers K if K ⊂ ∪α∈A Uα. If each Uα ⊂ Rn is open, it is called an open cover of K. If B ⊂ A and K ⊂ ∪β∈B Uβ, we say {Uβ : β ∈ B} is a subcover.

Proposition 2.1.8. If K ⊂ Rn is compact, then it has the following property:
(2.1.31) Every open cover {Uα : α ∈ A} of K has a finite subcover.
Proof. By Lemma 2.1.7, it suffices to prove the following:
(2.1.32) Every countable cover {Bj : j ∈ N} of K by open balls has a finite subcover.
To see this, write R = {Bj : j ∈ N}. Given the cover {Uα}, pass to {Bj : j ∈ J}, where j ∈ J if and only if Bj is contained in some Uα. By (2.1.30), {Bj : j ∈ J} covers K. If (2.1.32) holds, we have a subcover {Bℓ : ℓ ∈ L} for some finite L ⊂ J. Pick αℓ ∈ A such that Bℓ ⊂ Uαℓ. Then {Uαℓ : ℓ ∈ L} is the desired finite subcover advertised in (2.1.31).
Finally, to prove (2.1.32), we set
(2.1.33) Um = B1 ∪ ··· ∪ Bm
and apply Corollary 2.1.6.
Exercises

1. Identifying z = x + iy ∈ C with (x, y) ∈ R2 and w = u + iv ∈ C with (u, v) ∈ R2, show that the dot product satisfies z · w = Re z w̄. In light of this, compare the proof of Proposition 2.1.1 with that of Proposition 1.10.1 in Chapter 1.

2. Show that the inequality (2.1.12) implies (2.1.11).

3. Given x, y ∈ Rn, we say x is orthogonal to y (x ⊥ y) provided x · y = 0. Show that, for x, y ∈ Rn,
x ⊥ y ⇐⇒ |x + y|² = |x|² + |y|².

4. Let e1, v ∈ Rn and assume |e1| = |v| = 1. Show that e1 − v ⊥ e1 + v.
Hint. Expand (e1 − v) · (e1 + v). See Figure 2.1.1 for the geometrical significance of this, when n = 2.

5. Let S1 = {x ∈ R2 : |x| = 1} denote the unit circle in R2, and set e1 = (1, 0) ∈ S1. Pick a ∈ R such that 0 < a < 1, and set u = (1 − a)e1. See Figure 2.1.2. Then pick v ∈ S1 such that v − u ⊥ e1, and set b = |v − e1|. Show that
(2.1.34) b = √(2a).
Hint. Note that 1 − a = u · e1 = v · e1, hence a = 1 − v · e1. Then expand b² = (v − e1) · (v − e1).
Figure 2.1.1. Right triangle in a circle
6. Recall the approach to (2.1.34) in classical Euclidean geometry, using similarity of triangles, leading to
a/b = b/2.
What is the relevance of Exercise 4 to this? In classical Euclidean geometry, the point v is constructed as the intersection of a line (through u, perpendicular to the line from 0 to e1) and the circle S1. What is the advantage of the material developed here (involving completeness of R) over the axioms of Euclid in guaranteeing that this intersection exists?

7. Prove Lemma 2.1.7.

8. Use Proposition 2.1.8 to prove the following extension of Proposition 2.1.5.

Proposition 2.1.9. Let K ⊂ Rn be compact. Assume {Xα : α ∈ A} is a collection of closed subsets of K. Assume that for each finite set B ⊂ A, ∩α∈B Xα ≠ ∅. Then ∩α∈A Xα ≠ ∅.

Figure 2.1.2. Geometric construction of b = √(2a)
Hint. Consider Uα = Rn \ Xα.

9. Let K ⊂ Rn be compact. Show that there exist x0, x1 ∈ K such that
|x0| ≤ |x|, ∀ x ∈ K,
|x1| ≥ |x|, ∀ x ∈ K.
We say
|x0| = min_{x∈K} |x|, |x1| = max_{x∈K} |x|.
2.2. Metric spaces

A metric space is a set X, together with a distance function d : X × X → [0, ∞), having the properties that
(2.2.1) d(x, y) = 0 ⇐⇒ x = y,
d(x, y) = d(y, x),
d(x, y) ≤ d(x, z) + d(y, z).
The third of these properties is called the triangle inequality. We sometimes denote this metric space by (X, d).

An example of a metric space is the set of rational numbers Q, with d(x, y) = |x − y|. Another example is X = Rn, with
d(x, y) = √((x1 − y1)² + ··· + (xn − yn)²).
This was treated in §2.1.

If (xν) is a sequence in X, indexed by ν = 1, 2, 3, ..., i.e., by ν ∈ Z⁺, one says
(2.2.2) xν → y ⇐⇒ d(xν, y) → 0, as ν → ∞.
One says (xν) is a Cauchy sequence if and only if
(2.2.3) d(xν, xµ) → 0 as µ, ν → ∞.
One says X is a complete metric space if every Cauchy sequence converges to a limit in X. Some metric spaces are not complete; for example, Q is not complete. You can take a sequence (xν) of rational numbers such that xν → √2, which is not rational. Then (xν) is Cauchy in Q, but it has no limit in Q.

If a metric space X is not complete, one can construct its completion X̂ as follows. Let an element ξ of X̂ consist of an equivalence class of Cauchy sequences in X, where we say
(2.2.4) (xν) ∼ (x′ν) ⇐⇒ d(xν, x′ν) → 0.
We write the equivalence class containing (xν) as [xν]. If ξ = [xν] and η = [yν], we can set
(2.2.5) d̂(ξ, η) = lim_{ν→∞} d(xν, yν),
and verify that this is well defined, and makes X̂ a complete metric space. Details are provided at the end of this section. If the completion of Q is constructed by this process, you get R, the set of real numbers. This construction was carried out in §1.6 of Chapter 1.

There are a number of useful concepts related to the notion of closeness. We define some of them here. First, if p is a point in a metric space X and
r ∈ (0, ∞), the set
(2.2.6) Br(p) = {x ∈ X : d(x, p) < r}
is called the open ball about p of radius r. Generally, a neighborhood of p ∈ X is a set containing such a ball, for some r > 0.
A set S ⊂ X is said to be closed if and only if
(2.2.7) pj ∈ S, pj → p =⇒ p ∈ S.
The complement X \ S of a closed set is said to be open. Alternatively, U ⊂ X is open if and only if
(2.2.8) q ∈ U =⇒ ∃ ε > 0 such that Bε(q) ⊂ U,
so q cannot be a limit of a sequence of points in X \ U.
We state a couple of straightforward propositions, whose proofs are left to the reader.

Proposition 2.2.1. If Uα is a family of open sets in X, then ∪α Uα is open. If Kα is a family of closed subsets of X, then ∩α Kα is closed.

Given S ⊂ X, we denote by S̄ (the closure of S) the smallest closed subset of X containing S, i.e., the intersection of all the closed sets Kα ⊂ X containing S. The following result is straightforward.

Proposition 2.2.2. Given S ⊂ X, p ∈ S̄ if and only if there exist xj ∈ S such that xj → p.

Given S ⊂ X, p ∈ X, we say p is an accumulation point of S if and only if, for each ε > 0, there exists q ∈ S ∩ Bε(p), q ≠ p. It follows that p is an accumulation point of S if and only if each Bε(p), ε > 0, contains infinitely many points of S. One straightforward observation is that all points of S̄ \ S are accumulation points of S.

If S ⊂ Y ⊂ X, we say S is dense in Y provided S̄ ⊃ Y. The interior of a set S ⊂ X is the largest open set contained in S, i.e., the union of all the open sets contained in S. Note that the complement of the interior of S is equal to the closure of X \ S.

We next define the notion of a connected space. A metric space X is said to be connected provided that it cannot be written as the union of two disjoint nonempty open subsets. The following is a basic example. Here, we treat I as a stand-alone metric space.

Proposition 2.2.3. Each interval I in R is connected.

Proof. Suppose A ⊂ I is nonempty, with nonempty complement B ⊂ I, and both sets are open. (Hence both sets are closed.) Take a ∈ A, b ∈ B; we can
assume a < b. (Otherwise, switch A and B.) Let ξ = sup{x ∈ [a, b] : x ∈ A}. This exists, by Proposition 1.6.12 of Chapter 1. Now we obtain a contradiction, as follows. Since A is closed, ξ ∈ A. (Hence ξ < b.) But then, since A is open, ξ > a, and furthermore there must be a neighborhood (ξ − ε, ξ + ε) contained in A. This would imply ξ ≥ ξ + ε. Contradiction.
See the next chapter for more on connectedness, and its connection to the Intermediate Value Theorem.

Construction of the completion of (X, d)
As indicated earlier in this section, if (X, d) is a metric space, we can construct its completion (X̂, d̂). This construction can be compared to that done to pass from Q to R in §1.6 of Chapter 1. Elements of X̂ consist of equivalence classes of Cauchy sequences in X, with equivalence relation given by (2.2.4). To verify that (2.2.4) defines an equivalence relation, we need to show that the relation specified there is reflexive, symmetric, and transitive. The first two properties are completely straightforward. As for the third, we need to show that
(2.2.9) (xν) ∼ (x′ν), (x′ν) ∼ (x′′ν) =⇒ (xν) ∼ (x′′ν).
In fact, the triangle inequality for d gives
(2.2.10) d(xν, x′′ν) ≤ d(xν, x′ν) + d(x′ν, x′′ν),
from which (2.2.9) readily follows. We write the equivalence class containing (xν) as [xν].
Given ξ = [xν] and η = [yν], we propose to define d̂(ξ, η) by
(2.2.11) d̂(ξ, η) = lim_{ν→∞} d(xν, yν).
To obtain a well defined d̂ : X̂ × X̂ → [0, ∞), we need to verify that the limit on the right side of (2.2.11) exists whenever (xν) and (yν) are Cauchy in X, and that the limit is unchanged if (xν) and (yν) are replaced by (x′ν) ∼ (xν) and (y′ν) ∼ (yν).
First, we show that dν = d(xν, yν) is a Cauchy sequence in R. The triangle inequality for d gives
dν = d(xν, yν) ≤ d(xν, xµ) + d(xµ, yµ) + d(yµ, yν),
hence
dν − dµ ≤ d(xν, xµ) + d(yµ, yν),
and the same upper estimate applies to dµ − dν, hence to |dν − dµ|. Thus the limit on the right side of (2.2.11) exists.
Next, with d′ν = d(x′ν, y′ν), we have
d′ν = d(x′ν, y′ν) ≤ d(x′ν, xν) + d(xν, yν) + d(yν, y′ν),
hence
d′ν − dν ≤ d(x′ν, xν) + d(yν, y′ν),
and the same upper estimate applies to dν − d′ν, hence to |d′ν − dν|.
These observations establish that d̂ : X̂ × X̂ → [0, ∞) is well defined. We next need to show that it makes X̂ a metric space. First,
(2.2.12) d̂(ξ, η) = 0 =⇒ lim_{ν→∞} d(xν, yν) = 0 =⇒ (xν) ∼ (yν) =⇒ ξ = η.
Next, the symmetry d̂(ξ, η) = d̂(η, ξ) follows from (2.2.11) and the symmetry of d. Finally, if also ζ = [zν] ∈ X̂, then
(2.2.13) d̂(ξ, ζ) = lim_ν d(xν, zν) ≤ lim_ν [d(xν, yν) + d(yν, zν)] = d̂(ξ, η) + d̂(η, ζ),
so d̂ satisfies the triangle inequality.
To proceed, we have a natural map
(2.2.14) j : X → X̂, j(x) = (x, x, x, ...).
It is clear that for each x, y ∈ X,
(2.2.15) d̂(j(x), j(y)) = d(x, y).
From here on, we will simply identify a point x ∈ X with its image j(x) ∈ X̂, using the notation x ∈ X̂ (so X ⊂ X̂). It is useful to observe that if (xk) is a Cauchy sequence in X, then
(2.2.16) ξ = [xk] =⇒ lim_{k→∞} d̂(ξ, xk) = 0.
In fact,
(2.2.17) d̂(ξ, xk) = lim_{ν→∞} d(xν, xk) → 0 as k → ∞.
From here we have the following.

Lemma 2.2.4. The set X is dense in X̂.

Proof. Given ξ ∈ X̂, say ξ = [xν]; the fact that xν → ξ in (X̂, d̂) follows from (2.2.16).

We are now ready for the following analogue of Theorem 1.6.9 of Chapter 1.
Proposition 2.2.5. The metric space (X̂, d̂) is complete.

Proof. Assume (ξk) is Cauchy in (X̂, d̂). By Lemma 2.2.4, we can pick xk ∈ X such that d̂(ξk, xk) ≤ 2^{−k}. We claim (xk) is Cauchy in X. In fact,
(2.2.18) d(xk, xℓ) = d̂(xk, xℓ) ≤ d̂(xk, ξk) + d̂(ξk, ξℓ) + d̂(ξℓ, xℓ) ≤ d̂(ξk, ξℓ) + 2^{−k} + 2^{−ℓ},
so
(2.2.19) d(xk, xℓ) → 0 as k, ℓ → ∞.
Since (xk) is Cauchy in X, it defines an element ξ = [xk] ∈ X̂. We claim ξk → ξ. In fact,
(2.2.20) d̂(ξk, ξ) ≤ d̂(ξk, xk) + d̂(xk, ξ) ≤ d̂(xk, ξ) + 2^{−k},
and the fact that d̂(xk, ξ) → 0 as k → ∞ follows from (2.2.17). This completes the proof of Proposition 2.2.5.
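The completion construction can be imitated concretely for X = Q. In the Python sketch below (our own illustration; the names sqrt2_seq, const_seq, and dhat are ours), a point of the completion is represented by one member of its equivalence class — a function ν ↦ xν ∈ Q — and d̂ of (2.2.11) is approximated by evaluating a late term of d(xν, yν), with exact rational arithmetic:

```python
from fractions import Fraction

def sqrt2_seq(n):
    # Babylonian iteration x_{k+1} = (x_k + 2/x_k)/2, exact in Q;
    # this Cauchy sequence of rationals represents sqrt(2) in Q-hat = R
    x = Fraction(2)
    for _ in range(n):
        x = (x + 2 / x) / 2
    return x

def const_seq(q):
    # the image j(q) of a rational q under (2.2.14)
    return lambda n: Fraction(q)

def dhat(xi, eta, n=8):
    # d-hat(xi, eta) = lim d(x_nu, y_nu); approximated by a late term
    return abs(xi(n) - eta(n))

# d-hat between [sqrt(2)-sequence] and the embedded rational 3/2,
# close to |sqrt(2) - 3/2| ≈ 0.0858
print(float(dhat(sqrt2_seq, const_seq(Fraction(3, 2)))))
```

The iteration converges quadratically, so a handful of terms already pins down d̂ to many digits; a full implementation would of course have to work with the whole equivalence class.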
Exercises

1. Prove Proposition 2.2.1.

2. Prove Proposition 2.2.2.

3. Suppose the metric space (X, d) is complete, and (X̂, d̂) is constructed as indicated in (2.2.4)–(2.2.5), and described in detail in (2.2.9)–(2.2.17). Show that the natural inclusion j : X → X̂ is both one-to-one and onto.

4. Show that if p ∈ Rn and R > 0, the ball BR(p) = {x ∈ Rn : |x − p| < R} is connected.
Hint. Suppose BR(p) = U ∪ V, a union of two disjoint open sets. Given q1 ∈ U, q2 ∈ V, consider the line segment ℓ = {tq1 + (1 − t)q2 : 0 ≤ t ≤ 1}.

5. Let X = Rn, but replace the distance
d(x, y) = √((x1 − y1)² + ··· + (xn − yn)²)
by
d1(x, y) = |x1 − y1| + ··· + |xn − yn|.
Show that (X, d1) is a metric space. In particular, verify the triangle inequality. Show that a sequence pj converges in (X, d1) if and only if it converges in (X, d).

6. Show that if U is an open subset of (X, d), then U is a union of open balls.

7. Let S ⊂ X be a dense subset. Let B = {Br(p) : p ∈ S, r ∈ Q⁺}, with Br(p) defined as in (2.2.6). Show that if U is an open subset of X, then U is a union of balls in B. That is, if q ∈ U, there exists B ∈ B such that q ∈ B ⊂ U.

Given a nonempty metric space (X, d), we say it is perfect if it is complete and has no isolated points. Exercises 8–10 deal with perfect metric spaces.

8. Show that if p ∈ X and ε > 0, then Bε(p) contains infinitely many points.

9. Pick distinct p0, p1 ∈ X, and take positive r0 < (1/2)d(p0, p1). Show that X0 = B̄r0(p0) and X1 = B̄r0(p1) are disjoint perfect subsets of X (i.e., are each perfect metric spaces).

10. Similarly, take distinct p00, p01 ∈ X0 and distinct p10, p11 ∈ X1, and sufficiently small r1 > 0 such that Xjk = B̄r1(pjk) for k = 0, 1 are disjoint perfect subsets of Xj. Continue in this fashion, producing Xj1···jk+1 ⊂ Xj1···jk, closed balls of radius rk ↘ 0, centered at pj1···jk+1. Show that you can define a function
φ : ∏_{ℓ=1}^∞ {0, 1} → X, φ((j1, j2, j3, ...)) = lim_{k→∞} pj1j2···jk.
Show that φ is one-to-one, and deduce that Card(X) ≥ Card(R). A metric space X is said to be separable if it has a countable dense subset.
11. Let X be a separable metric space, with a dense subset S = {pj : j ∈ N}. Produce a function
ψ : X → ∏_{ℓ=1}^∞ N
as follows. Given x ∈ X, choose a sequence (pjν) of points in S such that pjν → x. Set ψ(x) = (j1, j2, j3, ...). Show that ψ is one-to-one, and deduce that Card(X) ≤ Card(R).
2.3. Compactness

We return to the notion of compactness, defined in the Euclidean context in (2.1.27). We say a (nonempty) metric space X is compact provided the following property holds:
(2.3.1) Each sequence (xk) in X has a convergent subsequence.
We will establish various properties of compact metric spaces, and provide various equivalent characterizations. For example, it is easily seen that (2.3.1) is equivalent to:
(2.3.2) Each infinite subset S ⊂ X has an accumulation point.
The following property is known as total boundedness.

Proposition 2.3.1. If X is a compact metric space, then
(2.3.3) Given ε > 0, ∃ finite set {x1, ..., xN} such that Bε(x1), ..., Bε(xN) covers X.
Proof. Take ε > 0 and pick x1 ∈ X. If Bε(x1) = X, we are done. If not, pick x2 ∈ X \ Bε(x1). If Bε(x1) ∪ Bε(x2) = X, we are done. If not, pick x3 ∈ X \ [Bε(x1) ∪ Bε(x2)]. Continue, taking xk+1 ∈ X \ [Bε(x1) ∪ ··· ∪ Bε(xk)], if Bε(x1) ∪ ··· ∪ Bε(xk) ≠ X. Note that, for 1 ≤ i, j ≤ k,
i ≠ j =⇒ d(xi, xj) ≥ ε.
If one never covers X this way, consider S = {xj : j ∈ N}. This is an infinite set with no accumulation point, so property (2.3.2) is contradicted.

Corollary 2.3.2. If X is a compact metric space, it has a countable dense subset.

Proof. Given ε = 2^{−n}, let Sn be a finite set of points xj such that {Bε(xj)} covers X. Then C = ∪n Sn is a countable dense subset of X.
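The greedy selection in the proof of Proposition 2.3.1 can be run directly on a finite sample standing in for X. The sketch below (our own code, not from the text) produces centers that are pairwise at distance ≥ ε, while their ε-balls cover the sample:

```python
import random

def dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

def greedy_net(points, eps):
    # Mimic the proof: keep any point not yet within eps of a chosen center
    centers = []
    for p in points:
        if all(dist(p, c) >= eps for c in centers):
            centers.append(p)
    return centers

random.seed(1)
sample = [(random.random(), random.random()) for _ in range(2000)]
net = greedy_net(sample, 0.25)
# every sample point is eps-covered, and centers are eps-separated
assert all(any(dist(p, c) < 0.25 for c in net) for p in sample)
assert all(dist(a, b) >= 0.25
           for i, a in enumerate(net) for b in net[i + 1:])
print(len(net), "centers suffice")
```

On a genuinely compact space the proof shows this process must terminate; on a non-totally-bounded space it would run forever.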
Here is another useful property of compact metric spaces, which will eventually be generalized even further, in (2.3.6) below.

Proposition 2.3.3. Let X be a compact metric space. Assume K1 ⊃ K2 ⊃ K3 ⊃ ··· form a decreasing sequence of closed subsets of X. If each Kn ≠ ∅, then ∩n Kn ≠ ∅.

Proof. Pick xn ∈ Kn. If (2.3.1) holds, (xn) has a convergent subsequence, xnk → y. Since {xnk : k ≥ ℓ} ⊂ Knℓ, which is closed, we have y ∈ ∩n Kn.

Corollary 2.3.4. Let X be a compact metric space. Assume U1 ⊂ U2 ⊂ U3 ⊂ ··· form an increasing sequence of open subsets of X. If ∪n Un = X, then UN = X for some N.

Proof. Consider Kn = X \ Un.
The following is an important extension of Corollary 2.3.4. Note how this generalizes Proposition 2.1.8.

Proposition 2.3.5. If X is a compact metric space, then it has the property:
(2.3.4) Every open cover {Uα : α ∈ A} of X has a finite subcover.
Proof. Let C = {zj : j ∈ N} ⊂ X be a countable dense subset of X, as in Corollary 2.3.2. Given p ∈ Uα, there exist zj ∈ C and a rational rj > 0 such that p ∈ Brj(zj) ⊂ Uα. Hence each Uα is a union of balls Brj(zj), with zj ∈ C ∩ Uα, rj rational. Thus it suffices to show that
(2.3.5) Every countable cover {Bj : j ∈ N} of X by open balls has a finite subcover.
Compare the argument used in Proposition 2.1.8. To prove (2.3.5), we set
Un = B1 ∪ ··· ∪ Bn
and apply Corollary 2.3.4.

The following is a convenient alternative to property (2.3.4):
(2.3.6) If Kα ⊂ X are closed and ∩α Kα = ∅, then some finite intersection is empty.
Considering Uα = X \ Kα, we see that (2.3.4) ⇐⇒ (2.3.6).
The following result, known as the Heine-Borel theorem, completes Proposition 2.3.5.
Theorem 2.3.6. For a metric space X, (2.3.1) ⇐⇒ (2.3.4).

Proof. By Proposition 2.3.5, (2.3.1) ⇒ (2.3.4). To prove the converse, it will suffice to show that (2.3.6) ⇒ (2.3.2). So let S ⊂ X and assume S has no accumulation point. We claim: such S must be closed. Indeed, if z ∈ S̄ and z ∉ S, then z would have to be an accumulation point. To proceed, say S = {xα : α ∈ A}, and set Kα = S \ {xα}. Then each Kα has no accumulation point, hence Kα ⊂ X is closed. Also ∩α Kα = ∅. Hence, if (2.3.6) holds, there exists a finite set F ⊂ A such that ∩α∈F Kα = ∅. Hence S = ∪α∈F {xα} is finite, so indeed (2.3.6) ⇒ (2.3.2).

Remark. So far we have that for every metric space X,
(2.3.1) ⇐⇒ (2.3.2) ⇐⇒ (2.3.4) ⇐⇒ (2.3.6) =⇒ (2.3.3).
We claim that (2.3.3) implies the other conditions if X is complete. Of course, compactness implies completeness, but (2.3.3) may hold for incomplete X, e.g., X = (0, 1) ⊂ R.

Proposition 2.3.7. If X is a complete metric space with property (2.3.3), then X is compact.

Proof. It suffices to show that (2.3.3) ⇒ (2.3.2) if X is a complete metric space. So let S ⊂ X be an infinite set. Cover X by balls B1/2(x1), ..., B1/2(xN). One of these balls contains infinitely many points of S, and so does its closure, say X1 = B̄1/2(y1). Now cover X by finitely many balls of radius 1/4; their intersection with X1 provides a cover of X1. One such set contains infinitely many points of S, and so does its closure X2 = B̄1/4(y2) ∩ X1. Continue in this fashion, obtaining
X1 ⊃ X2 ⊃ X3 ⊃ ··· ⊃ Xk ⊃ Xk+1 ⊃ ···, Xj ⊂ B̄_{2^{−j}}(yj),
each containing infinitely many points of S. Pick zj ∈ Xj. One sees that (zj) forms a Cauchy sequence. If X is complete, it has a limit, zj → z, and z is seen to be an accumulation point of S.

Remark. Note the similarity of this argument with the proof of the Bolzano–Weierstrass theorem in Chapter 1.
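The refinement argument in the proof of Proposition 2.3.7 can be imitated on X = [0, 1]: repeatedly cover by smaller balls and keep one holding "infinitely many" points of S. The sketch below (our own code) inspects a finite sample of S, uses a majority rule as the finite stand-in for "infinitely many", and halves an interval at each stage; it is an illustration, not a proof:

```python
def accumulation_point(seq, n_sample=10000, steps=30):
    # seq: generator of points of S in [0, 1]; we inspect a prefix
    pts = [next(seq) for _ in range(n_sample)]
    lo, hi = 0.0, 1.0
    for _ in range(steps):
        mid = (lo + hi) / 2
        left = [p for p in pts if lo <= p <= mid]
        if 2 * len(left) >= len(pts):      # keep a "heavy" half
            pts, hi = left, mid
        else:
            pts, lo = [p for p in pts if mid < p <= hi], mid
    return (lo + hi) / 2

def example():
    n = 1
    while True:        # S = {1/n : n >= 1}, accumulating at 0
        yield 1.0 / n
        n += 1

print(accumulation_point(example()))   # a small number near 0
```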
If Xj, 1 ≤ j ≤ m, is a finite collection of metric spaces, with metrics dj, we can define a Cartesian product metric space
(2.3.7) X = ∏_{j=1}^m Xj, d(x, y) = d1(x1, y1) + ··· + dm(xm, ym).
Another choice of metric is δ(x, y) = √(d1(x1, y1)² + ··· + dm(xm, ym)²). The metrics d and δ are equivalent, i.e., there exist constants C0, C1 ∈ (0, ∞) such that
(2.3.8) C0 δ(x, y) ≤ d(x, y) ≤ C1 δ(x, y), ∀ x, y ∈ X.
A key example is Rm, the Cartesian product of m copies of the real line R.
We describe some important classes of compact spaces.

Proposition 2.3.8. If Xj are compact metric spaces, 1 ≤ j ≤ m, so is X = ∏_{j=1}^m Xj.

Proof. If (xν) is an infinite sequence of points in X, say xν = (x1ν, ..., xmν), pick a convergent subsequence of (x1ν) in X1, and consider the corresponding subsequence of (xν), which we relabel (xν). Using this, pick a convergent subsequence of (x2ν) in X2. Continue. Having a subsequence such that xjν → yj in Xj for each j = 1, ..., m, we then have a convergent subsequence in X.

The following result is useful for analysis on Rn.

Proposition 2.3.9. If K is a closed bounded subset of Rn, then K is compact.

Proof. This has been proved in §2.1. There it was noted that the result follows from the compactness of a closed bounded interval I = [a, b] in R, which in turn was proved in §1.9 of Chapter 1. Here, we just note that compactness of [a, b] is also a corollary of Proposition 2.3.7.

We next give a slightly more sophisticated result on compactness. The following extension of Proposition 2.3.8 is a special case of Tychonov's Theorem.

Proposition 2.3.10. If {Xj : j ∈ Z⁺} are compact metric spaces, so is X = ∏_{j=1}^∞ Xj. Here, we can make X a metric space by setting
(2.3.9) d(x, y) = ∑_{j=1}^∞ 2^{−j} dj(pj(x), pj(y)) / (1 + dj(pj(x), pj(y))),
where pj : X → Xj is the projection onto the jth factor. It is easy to verify that, if xν ∈ X, then xν → y in X, as ν → ∞, if and only if, for each j, pj(xν) → pj(y) in Xj.

Proof. Following the argument in Proposition 2.3.8, if (xν) is an infinite sequence of points in X, we obtain a nested family of subsequences
(2.3.10) (xν) ⊃ (x1ν) ⊃ (x2ν) ⊃ ··· ⊃ (xjν) ⊃ ···
such that pℓ(xjν) converges in Xℓ, for 1 ≤ ℓ ≤ j. The next step is a diagonal construction. We set
(2.3.11) ξν = xνν ∈ X.
Then, for each j, after throwing away a finite number N (j) of elements, one obtains from (ξν ) a subsequence of the sequence (xj ν ) in (2.3.10), so pℓ (ξν ) converges in Xℓ for all ℓ. Hence (ξν ) is a convergent subsequence of (xν ).
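The weighting 2^{−j} in (2.3.9) makes the metric on the infinite product computable to any accuracy from finitely many factors: the tail of the series beyond the Jth term is at most 2^{−J}. The sketch below (our own illustration, taking each factor Xj = [0, 1] with dj(s, t) = |s − t|) evaluates a truncation of (2.3.9):

```python
def product_metric(x, y, J=30):
    # x, y: maps j -> coordinate in the j-th factor X_j = [0, 1]
    total = 0.0
    for j in range(1, J + 1):
        dj = abs(x(j) - y(j))               # d_j on the j-th factor
        total += 2.0 ** (-j) * dj / (1.0 + dj)
    return total                            # within 2^-J of the true value

x = lambda j: 0.0          # the constant-0 sequence
y = lambda j: 1.0 / j      # the sequence (1, 1/2, 1/3, ...)
print(product_metric(x, y))
```

Note that two sequences agreeing in their first J coordinates are automatically within 2^{−J} of each other, which is exactly why convergence in this metric is coordinatewise convergence.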
Exercises

1. Let φ : [0, ∞) → [0, ∞) have the following properties:
φ(0) = 0,
φ(s) < φ(s + t) ≤ φ(s) + φ(t), for s ≥ 0, t > 0.
Prove that if d(x, y) is symmetric and satisfies the triangle inequality, so does δ(x, y) = φ(d(x, y)).

2. Show that the function d(x, y) defined by (2.3.9) satisfies (2.2.1).
Hint. Consider φ(r) = r/(1 + r).

3. In the setting of (2.3.7), let
δ(x, y) = {d1(x1, y1)² + ··· + dm(xm, ym)²}^{1/2}.
Show that
δ(x, y) ≤ d(x, y) ≤ √m δ(x, y).
4. Let X be a metric space, p ∈ X, and let K ⊂ X be compact. Show that there exist x0, x1 ∈ K such that
d(x0, p) ≤ d(x, p), ∀ x ∈ K,
d(x1, p) ≥ d(x, p), ∀ x ∈ K.
Show that there exist y0, y1 ∈ K such that
d(q0, q1) ≤ d(y0, y1), ∀ q0, q1 ∈ K.
We say diam K = d(y0, y1).

5. Let X be a metric space that satisfies the total boundedness condition (2.3.3), and let X̂ be its completion. Show that X̂ is compact.
Hint. Show that X̂ also satisfies condition (2.3.3).

6. Deduce from Exercises 10 and 11 of §2.2 that if X is a compact metric space with no isolated points, then Card(X) = Card(R). Note how this generalizes the result on Cantor sets in Exercise 6, §1.9, of Chapter 1.

In Exercises 7–9, X is an uncountable compact metric space (so, by Exercise 11 of §2.2, Card X ≤ Card R).

7. Define K ⊂ X as follows:
x ∈ K ⇐⇒ Bε(x) is uncountable, ∀ ε > 0.
Show that
(a) K ≠ ∅.
Hint. Cover X with B1(pj), 1 ≤ j ≤ N0. At least one is uncountable; call it X0. Cover X0 with X0 ∩ B1/2(pj), 1 ≤ j ≤ N1, pj ∈ X0. At least one is uncountable; call it X1. Continue, obtaining uncountable compact sets X0 ⊃ X1 ⊃ ···, with diam Xj ≤ 2^{1−j}. Show that ∩j Xj = {x} with x ∈ K.

8. In the setting of Exercise 7, show that
(b) K is closed (hence compact), and
(c) K has no isolated points.
Hint for (c). Given x ∈ K, show that, for each ε > 0, there exists δ ∈ (0, ε) such that Bε(x) \ Bδ(x) is uncountable. Apply Exercise 7 to this compact metric space.

9. Deduce from Exercises 6–8 that Card K = Card R. Hence conclude that Card X = Card R.
2.4. The Baire category theorem

If X is a metric space, a subset U ⊂ X is said to be dense if Ū = X, and a subset S ⊂ X is said to be nowhere dense if S̄ contains no nonempty open set. Consequently, S is nowhere dense if and only if X \ S̄ is dense. Also,
a set U ⊂ X is dense in X if and only if U intersects each nonempty open subset of X. Our main goal here is to prove the following.

Theorem 2.4.1. A complete metric space X cannot be written as a countable union of nowhere dense subsets.

Proof. Let Sk ⊂ X be nowhere dense, k ∈ N. Set
(2.4.1) Tk = ∪_{j=1}^k S̄j,
so Tk are closed, nowhere dense, and increasing. Consider
(2.4.2) Uk = X \ Tk,
which are open, dense, and decreasing. Clearly
(2.4.3) ∪k Sk = X =⇒ ∩k Uk = ∅,
so to prove the theorem, we show that there exists p ∈ ∩k Uk.
To do this, pick p1 ∈ U1 and ε1 > 0 such that Bε1(p1) ⊂ U1. Since U2 is dense in X, we can then pick p2 ∈ Bε1(p1) ∩ U2 and ε2 ∈ (0, ε1/2) such that
(2.4.4) B̄ε2(p2) ⊂ Bε1(p1) ∩ U2.
Continue, producing pk ∈ Bεk−1(pk−1) ∩ Uk and εk ↘ 0 such that
(2.4.5) B̄εk(pk) ⊂ Bεk−1(pk−1) ∩ Uk,
which is possible at each stage because Uk is dense in X, and hence intersects each nonempty open set. Note that
(2.4.6) pℓ ∈ B̄εℓ(pℓ) ⊂ B̄εk(pk), ∀ ℓ > k.
It follows that
(2.4.7) d(pℓ, pk) ≤ εk, ∀ ℓ > k,
so (pk) is Cauchy. Since X is complete, this sequence has a limit p ∈ X. Since each B̄εk(pk) is closed, (2.4.6) implies
(2.4.8) p ∈ B̄εk(pk) ⊂ Uk, ∀ k.
This finishes the proof.
Theorem 2.4.1 is called the Baire category theorem. The terminology arises as follows. We say a subset Y ⊂ X is of first category provided Y is a countable union of nowhere dense sets. If Y is not a set of first category, we say it is of second category. Theorem 2.4.1 says that if X is a complete metric space, then X is of second category.
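A concrete special case: taking X = R and Sk = {qk} single points (each nowhere dense), Theorem 2.4.1 shows R is not a countable union of points, i.e., R is uncountable. The nested-ball construction of the proof can be carried out explicitly: shrink a closed interval to dodge each listed point in turn. The sketch below (our own code, with exact rational arithmetic) handles a finite listing:

```python
from fractions import Fraction

def avoid_all(qs):
    # Nested closed intervals in [0, 1], each dodging one more q_k,
    # mirroring the choice of B-bar_{eps_k}(p_k) inside U_k = R \ {q_k}.
    lo, hi = Fraction(0), Fraction(1)
    for q in qs:
        third = (hi - lo) / 3
        # keep the left or right closed third, whichever misses q
        if not (lo <= q <= lo + third):
            hi = lo + third
        else:
            lo = hi - third
    return lo, hi   # every point of [lo, hi] avoids all listed q's

qs = [Fraction(k, 7) for k in range(8)]   # a finite "listing"
lo, hi = avoid_all(qs)
assert all(not (lo <= q <= hi) for q in qs)
print(lo, hi)
```

Run on an enumeration of all of Q ∩ [0, 1], the same nesting produces (in the limit) an irrational point, which is essentially Cantor's diagonal argument recast in Baire's language.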
Chapter 3
Functions
The playing fields for analysis are spaces, and the players themselves are functions. In this chapter we develop some frameworks for understanding the behavior of various classes of functions. We spend about half the chapter studying functions f : X → Y from one metric space (X) to another (Y), and about half specializing to the case Y = Rn.
Our emphasis is on continuous functions, and §3.1 presents a number of results on continuous functions f : X → Y, which by definition have the property
xν → x =⇒ f(xν) → f(x).
We devote particular attention to the behavior of continuous functions on compact sets. We bring in the notion of uniform continuity, a priori stronger than continuity, and show that
f continuous on X ⇒ f uniformly continuous on X,
provided X is compact. We also introduce the notion of connectedness, and extend the intermediate value theorem given in §1.9 of Chapter 1 to the setting where X is a connected metric space, and f : X → R is continuous.
In §3.2 we consider sequences and series of functions, starting with sequences (fj) of functions fj : X → Y. We study convergence and uniform convergence. We move to infinite series
∑_{j=1}^∞ fj(x),
in case Y = Rn, and discuss conditions on fj yielding convergence, absolute convergence, and uniform convergence. Section 3.3 introduces a special class
of infinite series, power series,
∑_{k=0}^∞ ak z^k.
Here we take ak ∈ C and z ∈ C, and consider conditions yielding convergence on a disk DR = {z ∈ C : |z| < R}. This section is a prelude to a deeper study of power series, as it relates to calculus, in Chapter 4.
In §3.4 we study spaces of functions, including C(X, Y), the set of continuous functions f : X → Y. Under certain hypotheses (e.g., if either X or Y is compact) we can take
D(f, g) = sup_{x∈X} dY(f(x), g(x))
as a distance function, making C(X, Y) a metric space. We investigate conditions under which this metric space can be shown to be complete. We also investigate conditions under which certain subsets of C(X, Y) can be shown to be compact. Unlike §§3.1–3.3, this section will not have much impact on Chapters 4–5, but we include it to indicate further interesting directions that analysis does take.
Material in §§3.2–3.3 brings in some basic results on infinite series of numbers (or more generally elements of Rn). Section 3.5 puts this material in a more general context, and shows that a number of key results can be deduced from a general "dominated convergence theorem."
3.1. Continuous functions

Let X and Y be metric spaces, with distance functions dX and dY, respectively. A function f : X → Y is said to be continuous at a point x ∈ X if and only if
(3.1.1) xν → x in X =⇒ f(xν) → f(x) in Y,
or, equivalently, for each ε > 0, there exists δ > 0 such that
(3.1.2) dX(x, x′) < δ =⇒ dY(f(x), f(x′)) < ε,
that is to say,
(3.1.3) Bδ(x) ⊂ f⁻¹(Bε(f(x))),
where the balls Bε (y) and Bδ (x) are defined as in (2.2.6) of Chapter 2. Here we use the notation f −1 (S) = {x ∈ X : f (x) ∈ S}, given S ⊂ Y . We say f is continuous on X if it is continuous at each point of X. Here is an equivalent condition.
Proposition 3.1.1. Given f : X → Y, f is continuous on X if and only if
(3.1.4) U open in Y =⇒ f⁻¹(U) open in X.
Proof. First, assume f is continuous. Let U ⊂ Y be open, and assume x ∈ f⁻¹(U), so f(x) = y ∈ U. Given that U is open, pick ε > 0 such that Bε(y) ⊂ U. Continuity of f at x forces the image of Bδ(x) to lie in the ball Bε(y) about y, if δ is small enough, hence to lie in U. Thus Bδ(x) ⊂ f⁻¹(U) for δ small enough, so f⁻¹(U) must be open.
Conversely, assume (3.1.4) holds. If x ∈ X, and f(x) = y, then for all ε > 0, f⁻¹(Bε(y)) must be an open set containing x, so f⁻¹(Bε(y)) contains Bδ(x) for some δ > 0. Hence f is continuous at x.

We record the following important link between continuity and compactness. This extends Proposition 1.9.4 of Chapter 1.

Proposition 3.1.2. If X and Y are metric spaces, f : X → Y continuous, and K ⊂ X compact, then f(K) is a compact subset of Y.

Proof. If (yν) is an infinite sequence of points in f(K), pick xν ∈ K such that f(xν) = yν. Since K is compact, we have a subsequence xνj → p in X, and then yνj → f(p) in Y.

If f : X → R is continuous, we say f ∈ C(X). A useful corollary of Proposition 3.1.2 is:

Proposition 3.1.3. If X is a compact metric space and f ∈ C(X), then f assumes a maximum and a minimum value on X.

Proof. We know from Proposition 3.1.2 that f(X) is a compact subset of R, hence bounded. Proposition 1.6.1 of Chapter 1 implies f(X) ⊂ R has a sup and an inf, and, as noted in (1.9.9) of Chapter 1, these numbers are in f(X). That is, we have
(3.1.5) b = max f(X), a = min f(X).
Hence a = f(x0) for some x0 ∈ X, and b = f(x1) for some x1 ∈ X.
For later use, we mention that if X is a nonempty set and f : X → R is bounded from above, disregarding any notion of continuity, we set

(3.1.6)    sup_{x∈X} f(x) = sup f(X),

and if f : X → R is bounded from below, we set

(3.1.7)    inf_{x∈X} f(x) = inf f(X).

If f is not bounded from above, we set sup f = +∞, and if f is not bounded from below, we set inf f = −∞. Given a set X, f : X → R, and xn ∈ X, we set

(3.1.8)    lim sup_{n→∞} f(xn) = lim_{n→∞} ( sup_{k≥n} f(xk) ),

and

(3.1.9)    lim inf_{n→∞} f(xn) = lim_{n→∞} ( inf_{k≥n} f(xk) ).
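The tail suprema and infima in (3.1.8)–(3.1.9) are easy to experiment with numerically. A minimal Python sketch (the sequence a_k is an illustrative choice, not from the text):

```python
def tail_sup(seq, n):
    # sup over k >= n of the finite portion of the sequence we hold
    return max(seq[n:])

def tail_inf(seq, n):
    return min(seq[n:])

# Illustrative sequence: a_k = (-1)^k + 1/(k+1), with
# lim sup a_k = 1 and lim inf a_k = -1.
a = [(-1) ** k + 1.0 / (k + 1) for k in range(2000)]

# As in (3.1.8), the tail sups decrease toward lim sup = 1; the tail
# infs approximate lim inf = -1 (the inf over an infinite tail is -1).
sups = [tail_sup(a, n) for n in (0, 10, 100, 1000)]
infs = [tail_inf(a, n) for n in (0, 10, 100, 1000)]
print(sups, infs)
```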
We return to the notion of continuity. A function f ∈ C(X) is said to be uniformly continuous provided that, for any ε > 0, there exists δ > 0 such that

(3.1.10)    x, y ∈ X, d(x, y) ≤ δ =⇒ |f(x) − f(y)| ≤ ε.

More generally, if Y is a metric space with distance function dY, a function f : X → Y is said to be uniformly continuous provided that, for any ε > 0, there exists δ > 0 such that

(3.1.11)    x, y ∈ X, dX(x, y) ≤ δ =⇒ dY(f(x), f(y)) ≤ ε.

An equivalent condition is that f have a modulus of continuity, i.e., a monotonic function ω : [0, 1) → [0, ∞) such that δ ↘ 0 ⇒ ω(δ) ↘ 0, and such that

(3.1.12)    x, y ∈ X, dX(x, y) ≤ δ ≤ 1 =⇒ dY(f(x), f(y)) ≤ ω(δ).

Not all continuous functions are uniformly continuous. For example, if X = (0, 1) ⊂ R, then f(x) = sin 1/x is continuous, but not uniformly continuous, on X. The following result is useful, for example, in the development of the Riemann integral in Chapter 4.

Proposition 3.1.4. If X is a compact metric space and f : X → Y is continuous, then f is uniformly continuous.

Proof. If not, there exist ε > 0 and xν, yν ∈ X such that dX(xν, yν) ≤ 2^{−ν} but

(3.1.13)    dY(f(xν), f(yν)) ≥ ε.

Taking a convergent subsequence xνj → p, we also have yνj → p. Now continuity of f at p implies f(xνj) → f(p) and f(yνj) → f(p), contradicting (3.1.13).

If X and Y are metric spaces and f : X → Y is continuous, one-to-one, and onto, and if its inverse g = f⁻¹ : Y → X is continuous, we say f is a homeomorphism. Here is a useful sufficient condition for producing homeomorphisms.
Proposition 3.1.5. Let X be a compact metric space. Assume f : X → Y is continuous, one-to-one, and onto. Then its inverse g : Y → X is continuous.

Proof. If K ⊂ X is closed, then K is compact, so by Proposition 3.1.2, f(K) ⊂ Y is compact, hence closed. Now if U ⊂ X is open, with complement K = X \ U, we see that f(U) = Y \ f(K), so U open ⇒ f(U) open, that is,

U ⊂ X open =⇒ g⁻¹(U) open.

Hence, by Proposition 3.1.1, g is continuous.
We next define the notion of a connected space. A metric space X is said to be connected provided that it cannot be written as the union of two disjoint nonempty open subsets. The following is a basic class of examples.

Proposition 3.1.6. Each interval I in R is connected.

Proof. This is Proposition 2.2.3 of Chapter 2.

We say X is path-connected if, given any p, q ∈ X, there is a continuous map γ : [0, 1] → X such that γ(0) = p and γ(1) = q. The following is an easy consequence of Proposition 3.1.6.

Proposition 3.1.7. Every path-connected metric space X is connected.

Proof. If X = U ∪ V with U and V open, disjoint, and both nonempty, take p ∈ U, q ∈ V, and let γ : [0, 1] → X be a continuous path from p to q. Then [0, 1] = γ⁻¹(U) ∪ γ⁻¹(V) would be a disjoint union of nonempty open sets, which by Proposition 3.1.6 cannot happen.

The next result is known as the Intermediate Value Theorem. Note that it generalizes Proposition 1.9.6 of Chapter 1.

Proposition 3.1.8. Let X be a connected metric space and f : X → R continuous. Assume p, q ∈ X, and f(p) = a < f(q) = b. Then, given any c ∈ (a, b), there exists z ∈ X such that f(z) = c.

Proof. Under the hypotheses, A = {x ∈ X : f(x) < c} is open and contains p, while B = {x ∈ X : f(x) > c} is open and contains q. Since X is connected, A ∪ B cannot be all of X; any point in its complement has the desired property.
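For X = [a, b] ⊂ R, Proposition 3.1.8 is the familiar intermediate value theorem, and it can be made constructive by bisection: each step keeps a subinterval on which f changes sign. A minimal Python sketch (the function and interval are illustrative choices, not from the text):

```python
def bisect(f, a, b, tol=1e-10):
    """Locate z with f(z) = 0, given continuous f with f(a) < 0 < f(b).

    Bisection: halve the interval, keeping the half on which f still
    changes sign, so a root stays trapped inside [a, b] at every step.
    """
    assert f(a) < 0 < f(b)
    while b - a > tol:
        m = 0.5 * (a + b)
        if f(m) < 0:
            a = m
        else:
            b = m
    return 0.5 * (a + b)

# f(x) = x^2 - 2 changes sign on [0, 2], so f has a root there (sqrt 2).
root = bisect(lambda x: x * x - 2.0, 0.0, 2.0)
print(root)
```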
Exercises

1. If X is a metric space, with distance function d, show that

|d(x, y) − d(x′, y′)| ≤ d(x, x′) + d(y, y′),

and hence d : X × X → [0, ∞) is continuous.

2. Let pn(x) = x^n. Take b > a > 0, and consider pn : [a, b] → [a^n, b^n]. Use the intermediate value theorem to show that pn is onto.

3. In the setting of Exercise 2, show that pn is one-to-one, so it has an inverse qn : [a^n, b^n] → [a, b]. Use Proposition 3.1.5 to show that qn is continuous. The common notation is qn(x) = x^{1/n}, x > 0.
Note. This strengthens Proposition 1.7.1 of Chapter 1.

4. Let f, g : X → C be continuous, and let h(x) = f(x)g(x). Show that h : X → C is continuous.

5. Define pn : C → C by pn(z) = z^n. Show that pn is continuous for each n ∈ N.
Hint. Start at n = 1, and use Exercise 4 to produce an inductive proof.

6. Let X, Y, Z be metric spaces. Assume f : X → Y and g : Y → Z are continuous. Define g ∘ f : X → Z by g ∘ f(x) = g(f(x)). Show that g ∘ f is continuous.

7. Let fj : X → Yj be continuous, for j = 1, 2. Define g : X → Y1 × Y2 by g(x) = (f1(x), f2(x)). Show that g is continuous.

We present some exercises that deal with functions that are semicontinuous. Given a metric space X and f : X → [−∞, ∞], we say f is lower semicontinuous provided f⁻¹((c, ∞]) ⊂ X is open, ∀ c ∈ R. We say f is upper semicontinuous provided f⁻¹([−∞, c)) is open, ∀ c ∈ R.

8. Show that f is lower semicontinuous ⇐⇒ f⁻¹([−∞, c]) is closed, ∀ c ∈ R, and f is upper semicontinuous ⇐⇒ f⁻¹([c, ∞]) is closed, ∀ c ∈ R.

9. Show that f is lower semicontinuous ⇐⇒ xn → x implies lim inf f(xn) ≥ f(x). Show that f is upper semicontinuous ⇐⇒ xn → x implies lim sup f(xn) ≤ f(x).

10. Given S ⊂ X, show that

χS is lower semicontinuous ⇐⇒ S is open,
χS is upper semicontinuous ⇐⇒ S is closed.

Here, χS(x) = 1 if x ∈ S, 0 if x ∉ S.

11. If X is a compact metric space, show that

f : X → R is lower semicontinuous =⇒ min f is achieved.

In Exercises 12–18, we take

(3.1.14)    H = ∏_{j=1}^∞ [0, 1],    K = ∏_{j=1}^∞ {0, 1}.
Both these sets are compact spaces, with metrics given as in Proposition 2.3.10. (Since infinite series crop up, one might decide to look at these exercises after reading the next section.)

12. Show that, for 0 < t < 1, the map

(3.1.15)    Ft : H → [0, ∞),

given by

(3.1.16)    Ft(a) = ∑_{j=1}^∞ aj t^j,

is continuous. Here a = (a1, a2, . . . , aj, . . . ), aj ∈ [0, 1].

13. Put an ordering on K by saying a < b (with a as above and b = (b1, b2, . . . )) provided that, if aj = bj for all j < N but aN ≠ bN, then aN < bN (hence aN = 0 and bN = 1). Show that

a < b =⇒ Ft(a) < Ft(b), provided 0 < t < 1/2,

and

a < b =⇒ F_{1/2}(a) ≤ F_{1/2}(b).

14. Deduce from Exercise 13 that

Ft : K → [0, ∞) is one-to-one, provided 0 < t < 1/2.

Setting Ct = Ft(K), deduce that

Ft : K → Ct is a homeomorphism, for 0 < t < 1/2.

15. Look at the construction of Cantor sets described in the exercises for §1.9, and show that C_{1/3} = F_{1/3}(K) is of the form constructed there.

16. Show that F_{1/2} : K → [0, 1] is onto, but not one-to-one.
Hint. Take the infinite binary expansion of a number ξ ∈ [0, 1], noting that 1 = ∑_{j=1}^∞ 2^{−j}.

17. Given a, b ∈ H, show that

ψ_{ab} : [0, 1] → H,    ψ_{ab}(s) = sa + (1 − s)b,

is continuous. Deduce that H is connected.

18. Given a, b ∈ K, a ≠ b, show that there exist Oa, Ob ⊂ K, both open in K, such that

a ∈ Oa, b ∈ Ob,    Oa ∩ Ob = ∅,    Oa ∪ Ob = K.
Figure 3.1.1. Graph of y = sin 1/x
We say K is totally disconnected. Deduce that

Ct is totally disconnected, ∀ t ∈ (0, 1/2).

19. Consider the function

f : (0, 1/4] → R,    f(x) = sin 1/x.

A self-contained presentation of the function sin θ is given in Chapter 4. Here we stipulate that sin : R → R is continuous and periodic of period 2π, and sin(±π/2) = ±1. See Figure 3.1.1. Show that f is continuous, but not uniformly continuous. How does this result mesh with Proposition 3.1.4?

20. Let G = {(x, sin 1/x) : 0 < x ≤ 1/4} be the graph depicted in Figure 3.1.1. Set X = G ∪ {(0, y) : −1 ≤ y ≤ 1}. Show that X is a compact subset of R². Show that X is connected, but not path-connected.
3.2. Sequences and series of functions

Let X and Y be metric spaces, with distance functions dX and dY, respectively. Consider a sequence of functions fj : X → Y, which we denote (fj). To say (fj) converges at x to f : X → Y is simply to say that fj(x) → f(x) in Y. If such convergence holds for each x ∈ X, we say (fj) converges to f on X, pointwise.

A stronger type of convergence is uniform convergence. We say fj → f uniformly on X provided

(3.2.1)    sup_{x∈X} dY(fj(x), f(x)) → 0, as j → ∞.

An equivalent characterization is that, for each ε > 0, there exists K ∈ N such that

(3.2.2)    j ≥ K =⇒ dY(fj(x), f(x)) ≤ ε, ∀ x ∈ X.

A significant property of uniform convergence is that passing to the limit preserves continuity.

Proposition 3.2.1. If fj : X → Y is continuous for each j and fj → f uniformly, then f : X → Y is continuous.

Proof. Fix p ∈ X and take ε > 0. Pick K ∈ N such that (3.2.2) holds. Then pick δ > 0 such that

(3.2.3)    x ∈ Bδ(p) =⇒ dY(fK(x), fK(p)) < ε,

which can be done since fK : X → Y is continuous. Together, (3.2.2) and (3.2.3) imply

(3.2.4)    x ∈ Bδ(p) ⇒ dY(f(x), f(p)) ≤ dY(f(x), fK(x)) + dY(fK(x), fK(p)) + dY(fK(p), f(p)) ≤ 3ε.

Thus f is continuous at p, for each p ∈ X.
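A numerical illustration of why uniformity matters in Proposition 3.2.1 (the grid and the sample functions x^j and x^j/j on [0, 1] are illustrative choices, not from the text): each fj(x) = x^j is continuous, but the pointwise limit is discontinuous at x = 1, so the convergence cannot be uniform; by contrast, x^j/j → 0 uniformly.

```python
# Grid approximation of sup-distances on [0, 1].
grid = [k / 1000.0 for k in range(1001)]

def sup_dist(f, g):
    return max(abs(f(x) - g(x)) for x in grid)

# Pointwise limit of x**j on [0, 1]: 0 for x < 1, and 1 at x = 1.
f_lim = lambda x: 1.0 if x == 1.0 else 0.0

# sup |f_j - f| stays near 1, so there is no uniform convergence ...
sup_f = [sup_dist(lambda x, j=j: x ** j, f_lim) for j in (1, 10, 100)]
# ... while sup |x**j / j| = 1/j -> 0, which is uniform convergence.
sup_g = [sup_dist(lambda x, j=j: x ** j / j, lambda x: 0.0) for j in (1, 10, 100)]
print(sup_f, sup_g)
```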
We next consider Cauchy sequences of functions fj : X → Y. To say (fj) is Cauchy at x ∈ X is simply to say (fj(x)) is a Cauchy sequence in Y. We say (fj) is uniformly Cauchy provided

(3.2.5)    sup_{x∈X} dY(fj(x), fk(x)) → 0, as j, k → ∞.

An equivalent characterization is that, for each ε > 0, there exists K ∈ N such that

(3.2.6)    j, k ≥ K =⇒ dY(fj(x), fk(x)) ≤ ε, ∀ x ∈ X.
If Y is complete, a Cauchy sequence (fj) will have a limit f : X → Y. We have the following.

Proposition 3.2.2. Assume Y is complete, and fj : X → Y is uniformly Cauchy. Then (fj) converges uniformly to a limit f : X → Y.

Proof. We have already seen that there exists f : X → Y such that fj(x) → f(x) for each x ∈ X. To finish the proof, take ε > 0, and pick K ∈ N such that (3.2.6) holds. Then taking k → ∞ yields

(3.2.7)    j ≥ K =⇒ dY(fj(x), f(x)) ≤ ε, ∀ x ∈ X,

yielding the uniform convergence.
If, in addition, each fj : X → Y is continuous, we can put Propositions 3.2.1 and 3.2.2 together. We leave this to the reader. It is useful to note the following phenomenon in case, in addition, X is compact.

Proposition 3.2.3. Assume X is compact, fj : X → Y continuous, and fj → f uniformly on X. Then

(3.2.8)    K = f(X) ∪ ⋃_{j≥1} fj(X) ⊂ Y is compact.

Proof. Let (yν) ⊂ K be an infinite sequence. If there exists j ∈ N such that yν ∈ fj(X) for infinitely many ν, convergence of a subsequence to an element of fj(X) follows from the known compactness of fj(X). Ditto if yν ∈ f(X) for infinitely many ν. It remains to consider the situation yν ∈ f_{jν}(X), jν → ∞ (after perhaps taking a subsequence). That is, suppose yν = f_{jν}(xν), xν ∈ X, jν → ∞. Passing to a further subsequence, we can assume xν → x in X, and then it follows from the uniform convergence that

(3.2.9)    yν → y = f(x) ∈ K.

We move from sequences to series. For this, we need some algebraic structure on Y. Thus, for the rest of this section, we assume

(3.2.10)    fj : X → R^n,

for some n ∈ N. We look at the infinite series

(3.2.11)    ∑_{k=0}^∞ fk(x),
and seek conditions for convergence, which is the same as convergence of the sequence of partial sums,

(3.2.12)    Sj(x) = ∑_{k=0}^j fk(x).

Parallel to Proposition 1.6.13 of Chapter 1, we have convergence at x ∈ X provided

(3.2.13)    ∑_{k=0}^∞ |fk(x)| < ∞,

i.e., provided there exists Bx < ∞ such that

(3.2.14)    ∑_{k=0}^j |fk(x)| ≤ Bx, ∀ j ∈ N.
In such a case, we say the series (3.2.11) converges absolutely at x. We say (3.2.11) converges uniformly on X if and only if (Sj) converges uniformly on X. The following sufficient condition for uniform convergence is called the Weierstrass M test.

Proposition 3.2.4. Assume there exist Mk such that |fk(x)| ≤ Mk, for all x ∈ X, and

(3.2.15)    ∑_{k=0}^∞ Mk < ∞.

Then the series (3.2.11) converges uniformly on X, to a limit S : X → R^n.

Proof. This proof is similar to that of Proposition 1.6.13 of Chapter 1, but we review it. We have

(3.2.16)    |S_{m+ℓ}(x) − S_m(x)| ≤ | ∑_{k=m+1}^{m+ℓ} fk(x) | ≤ ∑_{k=m+1}^{m+ℓ} |fk(x)| ≤ ∑_{k=m+1}^{m+ℓ} Mk.

Now (3.2.15) implies σm = ∑_{k=0}^m Mk is uniformly bounded, so (by Proposition 1.6.11 of Chapter 1) σm ↗ β for some β ∈ R⁺. Hence

(3.2.17)    |S_{m+ℓ}(x) − S_m(x)| ≤ σ_{m+ℓ} − σm ≤ β − σm → 0, as m → ∞,

independent of ℓ ∈ N and x ∈ X. Thus (Sj) is uniformly Cauchy on X, and uniform convergence follows by Proposition 3.2.2.
Figure 3.2.1. Functions f1 and g1 , arising in Exercises 1–2
Bringing in Proposition 3.2.1, we have the following.

Corollary 3.2.5. In the setting of Proposition 3.2.4, if also each fk : X → R^n is continuous, so is the limit S.
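The Weierstrass M test is easy to check numerically. A sketch (the series ∑ sin(kx)/k² is an illustrative choice, not from the text): with Mk = 1/k², the Cauchy estimate (3.2.17) bounds |S_{m+ℓ}(x) − S_m(x)| by the tail ∑_{k>m} Mk, independently of x.

```python
import math

def partial_sum(x, j):
    # S_j(x) = sum_{k=1}^{j} sin(k x) / k^2, dominated by M_k = 1/k^2
    return sum(math.sin(k * x) / k ** 2 for k in range(1, j + 1))

def tail(m, cutoff=10 ** 5):
    # numerical stand-in for the tail sum_{k > m} 1/k^2
    return sum(1.0 / k ** 2 for k in range(m + 1, cutoff))

# The gap |S_2000 - S_100| stays below the x-independent tail bound.
xs = [-3.0, -1.0, 0.5, 2.0]
gap = max(abs(partial_sum(x, 2000) - partial_sum(x, 100)) for x in xs)
print(gap, tail(100))
```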
Exercises

1. For j ∈ N, define fj : R → R by

f1(x) = x / (1 + x²),    fj(x) = f1(jx).

See Figure 3.2.1. Show that fj → 0 pointwise on R. Show that, for each ε > 0, fj → 0 uniformly on R \ (−ε, ε). Show that (fj) does not converge uniformly to 0 on R.

2. For j ∈ N, define gj : R → R by

g1(x) = x / √(1 + x²),    gj(x) = g1(jx).
Show that there exists g : R → R such that gj → g pointwise. Show that g is not continuous on all of R. Where is g discontinuous?

3. Let X be a compact metric space. Assume fj, f : X → R are continuous and fj(x) ↗ f(x), ∀ x ∈ X. Prove that fj → f uniformly on X. (This result is called Dini's theorem.)
Hint. For ε > 0, let Kj(ε) = {x ∈ X : f(x) − fj(x) ≥ ε}. Note that Kj(ε) ⊃ K_{j+1}(ε) ⊃ · · · . What about ∩_{j≥1} Kj(ε)?

4. Take gj as in Exercise 2 and consider

∑_{k=1}^∞ (1/k²) gk(x).

Show that this series converges uniformly on R, to a continuous limit.

5. Take fj as in Exercise 1 and consider

∑_{k=1}^∞ (1/k) fk(x).

Where does this series converge? Where does it converge uniformly? Where is the sum continuous?
Hint. For use in the latter questions, note that, for ℓ ∈ N, ℓ ≤ k ≤ 2ℓ, we have fk(1/ℓ) ∈ [1/2, 1].
3.3. Power series

An important class of infinite series is the class of power series

(3.3.1)    ∑_{k=0}^∞ ak z^k,

with ak ∈ C. Note that if z1 ≠ 0 and (3.3.1) converges for z = z1, then there exists C < ∞ such that

(3.3.2)    |ak z1^k| ≤ C, ∀ k.

Hence, if |z| ≤ r|z1|, r < 1, we have

(3.3.3)    ∑_{k=0}^∞ |ak z^k| ≤ C ∑_{k=0}^∞ r^k = C/(1 − r) < ∞,

the last identity being the classical geometric series computation. (Compare (1.10.49) in Chapter 1.) This yields the following.

Proposition 3.3.1. If (3.3.1) converges for some z1 ≠ 0, then either this series is absolutely convergent for all z ∈ C, or there is some R ∈ (0, ∞) such that the series is absolutely convergent for |z| < R and divergent for |z| > R.

We call R the radius of convergence of (3.3.1). In case of convergence for all z, we say the radius of convergence is infinite. If R > 0 and (3.3.1) converges for |z| < R, it defines a function

(3.3.4)    f(z) = ∑_{k=0}^∞ ak z^k,    z ∈ DR,

on the disk of radius R centered at the origin,

(3.3.5)    DR = {z ∈ C : |z| < R}.

Proposition 3.3.2. If the series (3.3.4) converges in DR, then it converges uniformly on D_S for all S < R, and hence f is continuous on DR, i.e., given zn, z ∈ DR,

(3.3.6)    zn → z =⇒ f(zn) → f(z).

Proof. For each z ∈ DR, there exists S < R such that z ∈ D_S, so it suffices to show that f is continuous on D_S whenever 0 < S < R. Pick T such that S < T < R. We know that there exists C < ∞ such that |ak T^k| ≤ C for all k. Hence

(3.3.7)    z ∈ D_S =⇒ |ak z^k| ≤ C (S/T)^k.
Since

(3.3.8)    ∑_{k=0}^∞ (S/T)^k < ∞,

the Weierstrass M-test, Proposition 3.2.4, applies, to yield uniform convergence on D_S. Since

(3.3.9)    ak z^k is continuous, ∀ k,

continuity of f on D_S follows from Corollary 3.2.5.
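One can watch the uniform convergence of Proposition 3.3.2 on a smaller disk. A sketch for the geometric series (R = 1; the values S = 0.9 and N = 200 are illustrative choices): on |z| ≤ S the tail after the z^N term is at most S^{N+1}/(1 − S), uniformly in z.

```python
import cmath

S, N = 0.9, 200

def partial(z, n):
    # sum_{k=0}^{n} z^k, a partial sum of the geometric series 1/(1 - z)
    return sum(z ** k for k in range(n + 1))

# Sample points on the circle |z| = S, where the tail is largest.
zs = [S * cmath.exp(2j * cmath.pi * t / 12) for t in range(12)]
worst = max(abs(1.0 / (1.0 - z) - partial(z, N)) for z in zs)
bound = S ** (N + 1) / (1.0 - S)   # uniform tail bound on |z| <= S
print(worst, bound)
```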
More generally, a power series has the form

(3.3.10)    f(z) = ∑_{n=0}^∞ an (z − z0)^n.

It follows from Proposition 3.3.1 that to such a series there is associated a radius of convergence R ∈ [0, ∞], with the property that the series converges absolutely whenever |z − z0| < R (if R > 0), and diverges whenever |z − z0| > R (if R < ∞). We identify R as follows:

(3.3.11)    1/R = lim sup_{n→∞} |an|^{1/n}.

This is established in the following result, which complements Propositions 3.3.1–3.3.2.

Proposition 3.3.3. The series (3.3.10) converges whenever |z − z0| < R and diverges whenever |z − z0| > R, where R is given by (3.3.11). If R > 0, the series converges uniformly on {z : |z − z0| ≤ R′}, for each R′ < R. Thus, when R > 0, the series (3.3.10) defines a continuous function

(3.3.12)    f : DR(z0) → C,

where

(3.3.13)    DR(z0) = {z ∈ C : |z − z0| < R}.

Proof. If R′ < R, then there exists N ∈ Z⁺ such that

n ≥ N =⇒ |an|^{1/n} < 1/R′ =⇒ |an|(R′)^n < 1.

Thus

(3.3.14)    |z − z0| < R′ < R =⇒ |an (z − z0)^n| ≤ |(z − z0)/R′|^n,

for n ≥ N, so (3.3.10) is dominated by a convergent geometric series in D_{R′}(z0).
For the converse, we argue as follows. Suppose R′′ > R, so infinitely many |an|^{1/n} ≥ 1/R′′, hence infinitely many |an|(R′′)^n ≥ 1. Then

|z − z0| ≥ R′′ > R =⇒ infinitely many |an (z − z0)^n| ≥ |(z − z0)/R′′|^n ≥ 1,

forcing divergence for |z − z0| > R. The assertions about uniform convergence and continuity follow as in Proposition 3.3.2.

It is useful to note that we can multiply power series with radius of convergence R > 0. In fact, there is the following more general result on products of absolutely convergent series.

Proposition 3.3.4. Given absolutely convergent series

(3.3.15)    A = ∑_{n=0}^∞ αn,    B = ∑_{n=0}^∞ βn,

we have the absolutely convergent series

(3.3.16)    AB = ∑_{n=0}^∞ γn,    γn = ∑_{j=0}^n αj β_{n−j}.

Proof. Take Ak = ∑_{n=0}^k αn, Bk = ∑_{n=0}^k βn. Then

(3.3.17)    Ak Bk = ∑_{n=0}^k γn + Rk,

with

(3.3.18)    Rk = ∑_{(m,n)∈σ(k)} αm βn,    σ(k) = {(m, n) ∈ Z⁺ × Z⁺ : m, n ≤ k, m + n > k}.

Hence

(3.3.19)    |Rk| ≤ ∑_{m≤k/2, k/2≤n≤k} |αm| |βn| + ∑_{k/2≤m≤k, n≤k} |αm| |βn| ≤ A ∑_{n≥k/2} |βn| + B ∑_{m≥k/2} |αm|,

where

(3.3.20)    A = ∑_{n=0}^∞ |αn| < ∞,    B = ∑_{n=0}^∞ |βn| < ∞.

It follows that Rk → 0 as k → ∞. Thus the left side of (3.3.17) converges to AB and the right side to ∑_{n=0}^∞ γn. The absolute convergence of (3.3.16)
follows by applying the same argument with αn replaced by |αn| and βn replaced by |βn|.

Corollary 3.3.5. Suppose the following power series converge for |z| < R:

(3.3.21)    f(z) = ∑_{n=0}^∞ an z^n,    g(z) = ∑_{n=0}^∞ bn z^n.

Then, for |z| < R,

(3.3.22)    f(z) g(z) = ∑_{n=0}^∞ cn z^n,    cn = ∑_{j=0}^n aj b_{n−j}.
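The product formula (3.3.22) can be checked numerically. A sketch (the coefficient sequences are illustrative choices: an = 1 gives f(z) = 1/(1 − z) and bn = 2^{−n} gives g(z) = 1/(1 − z/2), both convergent for |z| < 1):

```python
# Cauchy product check for (3.3.22): c_n = sum_{j<=n} a_j b_{n-j}.
a = [1.0] * 60
b = [0.5 ** n for n in range(60)]
c = [sum(a[j] * b[n - j] for j in range(n + 1)) for n in range(60)]

# Evaluate the truncated series at a point well inside the disk.
z = 0.3
f = sum(an * z ** n for n, an in enumerate(a))
g = sum(bn * z ** n for n, bn in enumerate(b))
h = sum(cn * z ** n for n, cn in enumerate(c))
print(f * g, h)   # the two values agree up to truncation and rounding
```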
The following result, which is related to Proposition 3.3.4, has a similar proof. (For still more along these lines, see §3.5.)

Proposition 3.3.6. If ajk ∈ C and ∑_{j,k} |ajk| < ∞, then ∑_j ajk is absolutely convergent for each k, ∑_k ajk is absolutely convergent for each j, and

(3.3.23)    ∑_{j=0}^∞ ( ∑_{k=0}^∞ ajk ) = ∑_{k=0}^∞ ( ∑_{j=0}^∞ ajk ) = ∑_{j,k} ajk.

Proof. Clearly the hypothesis implies ∑_j |ajk| < ∞ for each k, and also ∑_k |ajk| < ∞ for each j. It also implies that there exists B < ∞ such that

S_N = ∑_{j=0}^N ∑_{k=0}^N |ajk| ≤ B, ∀ N.

Now S_N is bounded and monotone, so there exists a limit, S_N ↗ A < ∞ as N ↗ ∞. It follows that, for each ε > 0, there exists N ∈ N such that

∑_{(j,k)∈C(N)} |ajk| < ε,    C(N) = {(j, k) ∈ Z⁺ × Z⁺ : j > N or k > N}.

Note that if M, K ≥ N, then

| ∑_{j=0}^M ( ∑_{k=0}^K ajk ) − ∑_{j=0}^N ∑_{k=0}^N ajk | ≤ ∑_{(j,k)∈C(N)} |ajk|,

hence

| ∑_{j=0}^M ( ∑_{k=0}^∞ ajk ) − ∑_{j=0}^N ∑_{k=0}^N ajk | ≤ ∑_{(j,k)∈C(N)} |ajk|.

Therefore

| ∑_{j=0}^∞ ( ∑_{k=0}^∞ ajk ) − ∑_{j=0}^N ∑_{k=0}^N ajk | ≤ ∑_{(j,k)∈C(N)} |ajk|.
We have a similar result with the roles of j and k reversed, and clearly the two finite sums agree. It follows that

| ∑_{j=0}^∞ ( ∑_{k=0}^∞ ajk ) − ∑_{k=0}^∞ ( ∑_{j=0}^∞ ajk ) | < 2ε, ∀ ε > 0,

yielding (3.3.23).

Using Proposition 3.3.6, we demonstrate the following. (Thanks to Shrawan Kumar for this argument.)
Proposition 3.3.7. If (3.3.10) has a radius of convergence R > 0, and z1 ∈ DR(z0), then f(z) has a convergent power series about z1:

(3.3.24)    f(z) = ∑_{k=0}^∞ bk (z − z1)^k,    for |z − z1| < R − |z1 − z0|.

Proof. There is no loss in generality in taking z0 = 0, which we will do here, for notational simplicity. Setting f_{z1}(ζ) = f(z1 + ζ), we have from (3.3.10)

(3.3.25)    f_{z1}(ζ) = ∑_{n=0}^∞ an (ζ + z1)^n = ∑_{n=0}^∞ ∑_{k=0}^n an (n choose k) ζ^k z1^{n−k},

the second identity by the binomial formula. Now,

(3.3.26)    ∑_{n=0}^∞ ∑_{k=0}^n |an| (n choose k) |ζ|^k |z1|^{n−k} = ∑_{n=0}^∞ |an| (|ζ| + |z1|)^n < ∞,

provided |ζ| + |z1| < R, which is the hypothesis in (3.3.24) (with z0 = 0). Hence Proposition 3.3.6 gives

(3.3.27)    f_{z1}(ζ) = ∑_{k=0}^∞ ( ∑_{n=k}^∞ an (n choose k) z1^{n−k} ) ζ^k.

Hence (3.3.24) holds, with

(3.3.28)    bk = ∑_{n=k}^∞ an (n choose k) z1^{n−k}.

This proves Proposition 3.3.7. Note in particular that

(3.3.29)    b1 = ∑_{n=1}^∞ n an z1^{n−1}.
For more on power series, see §4.3 of Chapter 4.
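The radius formula (3.3.11) can also be estimated numerically, using tail suprema of |an|^{1/n} as in (3.1.8). A sketch (the coefficient choices are illustrative: an = n² has R = 1, an = 2^n has R = 1/2):

```python
# Estimating 1/R = lim sup |a_n|^(1/n) from a finite tail of coefficients.
def radius_estimate(coeff, n0, n1):
    tail_sup = max(coeff(n) ** (1.0 / n) for n in range(n0, n1))
    return 1.0 / tail_sup

r1 = radius_estimate(lambda n: float(n * n), 500, 1000)   # near 1
r2 = radius_estimate(lambda n: 2.0 ** n, 500, 1000)       # near 1/2
print(r1, r2)
```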
Exercises

1. Let ak ∈ C. Assume there exist K ∈ N, α < 1 such that

(3.3.30)    k ≥ K =⇒ |a_{k+1}/ak| ≤ α.

Show that ∑_{k=0}^∞ ak is absolutely convergent.
Note. This is the ratio test.

2. Determine the radius of convergence R for each of the following power series. If 0 < R < ∞, try to determine when convergence holds at points on |z| = R.

(3.3.31)    ∑_{n=0}^∞ z^n,    ∑_{n=1}^∞ z^n/n,    ∑_{n=1}^∞ z^n/n²,
            ∑_{n=1}^∞ z^n/n!,    ∑_{n=1}^∞ z^n/2^n,    ∑_{n=1}^∞ z^{2^n}/2^n,
            ∑_{n=1}^∞ n z^n,    ∑_{n=1}^∞ n² z^n,    ∑_{n=1}^∞ n! z^n.
3. Prove Proposition 3.3.6.

4. We have seen that

(3.3.32)    1/(1 − z) = ∑_{k=0}^∞ z^k,    |z| < 1.

Find power series in z for

(3.3.33)    1/(z − 2),    1/(z + 3).

Where do they converge?

5. Use Corollary 3.3.5 to produce a power series in z for

(3.3.34)    1/(z² + z − 6).

Where does the series converge?
6. As an alternative to the use of Corollary 3.3.5, write (3.3.34) as a linear combination of the functions (3.3.33).

7. Find the power series in z for

1/(1 + z²).

Hint. Replace z by −z² in (3.3.32).

8. Given a > 0, find the power series in z for

1/(a² + z²).
3.4. Spaces of functions

If X and Y are metric spaces, the space C(X, Y) of continuous maps f : X → Y has a natural metric structure, under some additional hypotheses. We use

(3.4.1)    D(f, g) = sup_{x∈X} d(f(x), g(x)).

This sup exists provided f(X) and g(X) are bounded subsets of Y, where to say B ⊂ Y is bounded is to say d : B × B → [0, ∞) has bounded image. In particular, this supremum exists if X is compact.

Proposition 3.4.1. If X is a compact metric space and Y is a complete metric space, then C(X, Y), with the metric (3.4.1), is complete.

Proof. That D(f, g) satisfies the conditions to define a metric on C(X, Y) is straightforward. We check completeness. Suppose (fν) is a Cauchy sequence in C(X, Y), so, as ν → ∞,

(3.4.2)    sup_{k≥0} sup_{x∈X} d(f_{ν+k}(x), fν(x)) ≤ εν → 0.

Then in particular (fν(x)) is a Cauchy sequence in Y for each x ∈ X, so it converges, say to g(x) ∈ Y. It remains to show that g ∈ C(X, Y) and that fν → g in the metric (3.4.1). In fact, taking k → ∞ in the estimate above, we have

(3.4.3)    sup_{x∈X} d(g(x), fν(x)) ≤ εν → 0,

i.e., fν → g uniformly. It remains only to show that g is continuous. For this, let xj → x in X and fix ε > 0. Pick N so that εN < ε. Since fN is
continuous, there exists J such that j ≥ J ⇒ d(fN(xj), fN(x)) < ε. Hence

j ≥ J ⇒ d(g(xj), g(x)) ≤ d(g(xj), fN(xj)) + d(fN(xj), fN(x)) + d(fN(x), g(x)) < 3ε.

This completes the proof.
In case Y = R, we write C(X, R) = C(X). The distance function (3.4.1) can then be written

(3.4.4)    D(f, g) = ∥f − g∥_sup,    ∥f∥_sup = sup_{x∈X} |f(x)|.

Here ∥f∥_sup is a norm on C(X). Generally, a norm on a vector space V is an assignment f ↦ ∥f∥ ∈ [0, ∞), satisfying

(3.4.5)    ∥f∥ = 0 ⇔ f = 0,    ∥af∥ = |a| ∥f∥,    ∥f + g∥ ≤ ∥f∥ + ∥g∥,

given f, g ∈ V and a a scalar (in R or C). A vector space equipped with a norm is called a normed vector space. It is then a metric space, with distance function D(f, g) = ∥f − g∥. If the space is complete, one calls V a Banach space. In particular, by Proposition 3.4.1, C(X) is a Banach space, when X is a compact metric space.

The next result is a special case of the Arzela-Ascoli Theorem. To state it, we say a modulus of continuity is a strictly monotonically increasing, continuous function ω : [0, ∞) → [0, ∞) such that ω(0) = 0.

Proposition 3.4.2. Let X and Y be compact metric spaces, and fix a modulus of continuity ω(δ). Then

(3.4.6)    Cω = { f ∈ C(X, Y) : d(f(x), f(x′)) ≤ ω(d(x, x′)) ∀ x, x′ ∈ X }

is a compact subset of C(X, Y).

Proof. Let (fν) be a sequence in Cω. Let Σ be a countable dense subset of X, as in Corollary 2.3.2 of Chapter 2. For each x ∈ Σ, (fν(x)) is a sequence in Y, which hence has a convergent subsequence. Using a diagonal construction similar to that in the proof of Proposition 2.3.10 of Chapter 2, we obtain a subsequence (φν) of (fν) with the property that φν(x) converges in Y, for each x ∈ Σ, say

(3.4.7)    φν(x) → ψ(x),

for all x ∈ Σ, where ψ : Σ → Y.

So far, we have not used (3.4.6). This hypothesis will now be used to show that φν converges uniformly on X. Pick ε > 0. Then pick δ > 0 such that ω(δ) < ε/3. Since X is compact, we can cover X by finitely many balls
Bδ(xj), 1 ≤ j ≤ N, xj ∈ Σ. Pick M so large that φν(xj) is within ε/3 of its limit for all ν ≥ M (when 1 ≤ j ≤ N). Now, for any x ∈ X, picking ℓ ∈ {1, . . . , N} such that d(x, xℓ) ≤ δ, we have, for k ≥ 0, ν ≥ M,

(3.4.8)    d(φ_{ν+k}(x), φν(x)) ≤ d(φ_{ν+k}(x), φ_{ν+k}(xℓ)) + d(φ_{ν+k}(xℓ), φν(xℓ)) + d(φν(xℓ), φν(x)) ≤ ε/3 + ε/3 + ε/3.

Thus (φν(x)) is Cauchy in Y for all x ∈ X, hence convergent. Call the limit ψ(x), so we now have (3.4.7) for all x ∈ X. Letting k → ∞ in (3.4.8) we have uniform convergence of φν to ψ. Finally, passing to the limit ν → ∞ in

(3.4.9)    d(φν(x), φν(x′)) ≤ ω(d(x, x′))

gives ψ ∈ Cω.

We want to restate Proposition 3.4.2, bringing in the notion of equicontinuity. Given metric spaces X and Y, and a set of maps F ⊂ C(X, Y), we say F is equicontinuous at a point x0 ∈ X provided

(3.4.10)    ∀ ε > 0, ∃ δ > 0 such that ∀ x ∈ X, f ∈ F, dX(x, x0) < δ =⇒ dY(f(x), f(x0)) < ε.

We say F is equicontinuous on X if it is equicontinuous at each point of X. We say F is uniformly equicontinuous on X provided

(3.4.11)    ∀ ε > 0, ∃ δ > 0 such that ∀ x, x′ ∈ X, f ∈ F, dX(x, x′) < δ =⇒ dY(f(x), f(x′)) < ε.

Note that (3.4.11) is equivalent to the existence of a modulus of continuity ω such that F ⊂ Cω, given by (3.4.6). It is useful to record the following result.

Proposition 3.4.3. Let X and Y be metric spaces, F ⊂ C(X, Y). Assume X is compact. Then

(3.4.12)    F equicontinuous =⇒ F is uniformly equicontinuous.
Proof. The argument is a variant of the proof of Proposition 3.1.4. In more detail, suppose there exist xν, x′ν ∈ X, ε > 0, and fν ∈ F such that d(xν, x′ν) ≤ 2^{−ν} but

(3.4.13)    d(fν(xν), fν(x′ν)) ≥ ε.

Taking a convergent subsequence xνj → p ∈ X, we also have x′νj → p. Now equicontinuity of F at p implies that there exists N < ∞ such that

(3.4.14)    d(g(xνj), g(p)) < ε/2, ∀ j ≥ N, g ∈ F,

contradicting (3.4.13).

Putting together Propositions 3.4.2 and 3.4.3 then gives the following.

Proposition 3.4.4. Let X and Y be compact metric spaces. If F ⊂ C(X, Y) is equicontinuous on X, then it has compact closure in C(X, Y).
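Equicontinuity can fail: a standard example (not from the text) is the family of maps x ↦ sin(nx), n ∈ N, in C([0, 1], R), which is not equicontinuous at x0 = 0, so Proposition 3.4.4 does not apply to it. A sketch producing, for each δ, a member of the family that moves by 1 inside (0, δ):

```python
import math

def witness(delta):
    # Pick n > pi / (2 delta); then x = pi / (2 n) lies in (0, delta)
    # while |sin(n x) - sin(0)| = 1, so no single delta can serve in
    # (3.4.10) for every member of the family {x -> sin(n x)}.
    n = int(math.pi / (2.0 * delta)) + 1
    x = math.pi / (2.0 * n)
    return x, abs(math.sin(n * x) - math.sin(0.0))

results = [witness(d) for d in (0.1, 1e-3, 1e-6)]
print(results)
```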
Exercises

1. Let X and Y be compact metric spaces. Show that if F ⊂ C(X, Y) is compact, then F is equicontinuous. (This is a converse to Proposition 3.4.4.)

2. Let X be a compact metric space, and r ∈ (0, 1]. Define Lip^r(X, R^n) to consist of continuous functions f : X → R^n such that, for some L < ∞ (depending on f),

|f(x) − f(y)| ≤ L dX(x, y)^r, ∀ x, y ∈ X.

Define a norm

∥f∥_r = sup_{x∈X} |f(x)| + sup_{x,y∈X, x≠y} |f(x) − f(y)| / d(x, y)^r.

Show that Lip^r(X, R^n) is a complete metric space, with distance function Dr(f, g) = ∥f − g∥_r.

3. In the setting of Exercise 2, show that if 0 < r < s ≤ 1 and f ∈ Lip^s(X, R^n), then

∥f∥_r ≤ C ∥f∥_sup^{1−θ} ∥f∥_s^θ,    θ = r/s ∈ (0, 1).

4. In the setting of Exercise 2, show that if 0 < r < s ≤ 1, then

{f ∈ Lip^s(X, R^n) : ∥f∥_s ≤ 1} is compact in Lip^r(X, R^n).

5. Let X be a compact metric space, and define C(X) as in (3.4.4). Take

P : C(X) × C(X) → C(X),    P(f, g)(x) = f(x)g(x).

Show that P is continuous.
3.5. Absolutely convergent series

Here we look at results on infinite series of numbers (or vectors), related to material in Sections 3.2 and 3.3. We concentrate on absolutely convergent series. Rather than looking at a series as a sum of ak for k ∈ N, we find it convenient to consider the following setting. Let Z be a countably infinite set, and take a function

(3.5.1)    f : Z → R^n.

We say f is absolutely summable, and write f ∈ ℓ¹(Z, R^n), provided there exists M < ∞ such that

(3.5.2)    ∑_{k∈F} |f(k)| ≤ M, for each finite set F ⊂ Z.

In notation used in §§3.2–3.3, we would have f(k) denoted by fk, k ∈ N (or maybe k ∈ Z⁺), but we use the notation f(k) here. If f ∈ ℓ¹(Z, R^n), we say the series

(3.5.3)    ∑_{k∈Z} f(k) is absolutely convergent.

Also we would like to write the characterization (3.5.2) as

(3.5.4)    ∑_{k∈Z} |f(k)| < ∞.

Of course, implicit in (3.5.3)–(3.5.4) is that ∑_{k∈Z} f(k) and ∑_{k∈Z} |f(k)| are well defined elements of R^n and [0, ∞), respectively. We will see shortly that this is the case.

To start, we note that, by hypothesis (3.5.2), if f ∈ ℓ¹(Z, R^n), the quantity

(3.5.5)    M(f) = sup { ∑_{k∈F} |f(k)| : F ⊂ Z finite }

is well defined, and M(f) ≤ M. Hence, given ε > 0, there is a finite set Kε(f) ⊂ Z such that

(3.5.6)    ∑_{k∈Kε(f)} |f(k)| ≥ M(f) − ε.

These observations yield the following.

Proposition 3.5.1. If f ∈ ℓ¹(Z, R^n), then

(3.5.7)    F ⊂ Z \ Kε(f) finite =⇒ ∑_{k∈F} |f(k)| ≤ ε.

This leads to:

Corollary 3.5.2. If f ∈ ℓ¹(Z, R^n) and A, B ⊃ Kε(f) are finite, then

(3.5.8)    | ∑_{k∈A} f(k) − ∑_{k∈B} f(k) | ≤ 2ε.
To proceed, we bring in the following notion. Given subsets Fν ⊂ Z (ν ∈ N), we say Fν → Z provided that, if F ⊂ Z is finite, there exists N = N(F) < ∞ such that ν ≥ N ⇒ Fν ⊃ F. Since Z is countable, we see that there exist sequences Fν → Z such that each Fν is finite.

Proposition 3.5.3. Take f ∈ ℓ¹(Z, R^n). Assume Fν ⊂ Z are finite and Fν → Z. Then there exists SZ(f) ∈ R^n such that

(3.5.9)    lim_{ν→∞} ∑_{k∈Fν} f(k) = SZ(f).

Furthermore, the limit SZ(f) is independent of the choice of finite Fν → Z.

Proof. By Corollary 3.5.2, the sequence Sν(f) = ∑_{k∈Fν} f(k) is a Cauchy sequence in R^n, so it converges to a limit we call SZ(f). As for the independence of the choice, note that if also F′ν are finite and F′ν → Z, we can interlace Fν and F′ν.

Given Proposition 3.5.3, we set

(3.5.10)    ∑_{k∈Z} f(k) = SZ(f),    for f ∈ ℓ¹(Z, R^n).

Note in particular that, if f ∈ ℓ¹(Z, R^n), then |f| ∈ ℓ¹(Z, R), and

(3.5.11)    ∑_{k∈Z} |f(k)| = M(f),

defined in (3.5.5). (These two results illuminate (3.5.3)–(3.5.4).)

Remark. Proposition 3.5.3 contains Propositions 1.6.13 and 1.10.3 of Chapter 1. It is stronger than those results, in that it makes clear that the order of summation is irrelevant.

Our next goal is to establish the following result, known as a dominated convergence theorem.

Proposition 3.5.4. For ν ∈ N, let fν ∈ ℓ¹(Z, R^n), and let g ∈ ℓ¹(Z, R). Assume

(3.5.12)    |fν(k)| ≤ g(k), ∀ ν ∈ N, k ∈ Z,

and

(3.5.13)    lim_{ν→∞} fν(k) = f(k), ∀ k ∈ Z.
Then f ∈ ℓ¹(Z, R^n) and

(3.5.14)    lim_{ν→∞} ∑_{k∈Z} fν(k) = ∑_{k∈Z} f(k).

Proof. We have ∑_{k∈Z} |g(k)| = ∑_{k∈Z} g(k) = M < ∞. Parallel to (3.5.6)–(3.5.7), for each ε > 0, we can take a finite set Kε(g) ⊂ Z such that ∑_{k∈Kε(g)} g(k) ≥ M − ε, and hence

(3.5.15)    F ⊂ Z \ Kε(g) finite =⇒ ∑_{k∈F} g(k) ≤ ε =⇒ ∑_{k∈F} |fν(k)| ≤ ε, ∀ ν ∈ N,

the last implication by (3.5.12). In light of Proposition 3.5.3, we can restate this conclusion as

(3.5.16)    ∑_{k∈Z\Kε(g)} |fν(k)| ≤ ε, ∀ ν ∈ N.

Bringing in (3.5.13), we also have

(3.5.17)    ∑_{k∈F} |f(k)| ≤ ε, for each finite F ⊂ Z \ Kε(g),

and hence

(3.5.18)    ∑_{k∈Z\Kε(g)} |f(k)| ≤ ε.

On the other hand, since Kε(g) is finite,

(3.5.19)    lim_{ν→∞} ∑_{k∈Kε(g)} fν(k) = ∑_{k∈Kε(g)} f(k).

It follows that

(3.5.20)    lim sup_{ν→∞} |SZ(fν) − SZ(f)| ≤ lim sup_{ν→∞} |S_{Kε(g)}(fν) − S_{Kε(g)}(f)| + lim sup_{ν→∞} |S_{Z\Kε(g)}(fν) − S_{Z\Kε(g)}(f)| ≤ 2ε,

for each ε > 0, hence

(3.5.21)    lim sup_{ν→∞} |SZ(fν) − SZ(f)| = 0,

which is equivalent to (3.5.14).

Here is one simple but basic application of Proposition 3.5.4.
Corollary 3.5.5. Assume f ∈ ℓ^1(Z, R^n). For ν ∈ N, let Fν ⊂ Z and assume Fν → Z. One need not assume that Fν is finite. Then

(3.5.22) lim_{ν→∞} ∑_{k∈Fν} f(k) = ∑_{k∈Z} f(k).

Proof. Apply Proposition 3.5.4 with g(k) = |f(k)| and fν(k) = χ_{Fν}(k) f(k).

The following result recovers Proposition 3.3.6.

Proposition 3.5.6. Let Y and Z be countable sets, and assume f ∈ ℓ^1(Y × Z, R^n), so

(3.5.23) ∑_{(j,k)∈Y×Z} |f(j, k)| = M < ∞.
Then, for each j ∈ Y,

(3.5.24) ∑_{k∈Z} f(j, k) = g(j)

is absolutely convergent,

(3.5.25) g ∈ ℓ^1(Y, R^n),

and

(3.5.26) ∑_{(j,k)∈Y×Z} f(j, k) = ∑_{j∈Y} g(j),

hence

(3.5.27) ∑_{(j,k)∈Y×Z} f(j, k) = ∑_{j∈Y} ( ∑_{k∈Z} f(j, k) ).

Proof. Since ∑_{k∈Z} |f(j, k)| is dominated by (3.5.23), the absolute convergence in (3.5.24) is clear. Next, if A ⊂ Y is finite, then

(3.5.28) ∑_{j∈A} |g(j)| ≤ ∑_{j∈A} ∑_{k∈Z} |f(j, k)| ≤ M,

so g ∈ ℓ^1(Y, R^n). Furthermore, if Aν ⊂ Y are finite, then

(3.5.29) ∑_{j∈Aν} g(j) = ∑_{(j,k)∈Fν} f(j, k), Fν = Aν × Z,

and Aν → Y ⇒ Fν → Y × Z, so (3.5.26) follows from Corollary 3.5.5.
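The interchange of summation in (3.5.27) can be observed numerically. The following sketch is not part of the formal development; the geometric weights are an illustrative choice of an ℓ^1 function on Y × Z = N × N, and the truncation level N is an assumption chosen so the tail is far below double precision.

```python
# Numerical check of (3.5.27): sum f(j, k) = (-1)^(j+k) 2^-j 3^-k over a
# large finite rectangle, and compare with the iterated sum over rows.
# The weights are illustrative; any absolutely summable f would do.

def f(j, k):
    return (-1) ** (j + k) * 2.0 ** (-j) * 3.0 ** (-k)

N = 60  # truncation level; the neglected tail is below double precision

# Sum over the finite rectangle {0..N-1} x {0..N-1}, in an arbitrary order.
rect = sum(f(j, k) for j in range(N) for k in range(N))

# Iterated sum, as in (3.5.24)-(3.5.27): first over k, then over j.
iterated = sum(sum(f(j, k) for k in range(N)) for j in range(N))

# Closed form: (sum_j (-1/2)^j)(sum_k (-1/3)^k) = (2/3)(3/4) = 1/2.
assert abs(rect - 0.5) < 1e-12 and abs(iterated - rect) < 1e-12
```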
We next examine implications for multiplying two absolutely convergent series, extending Proposition 3.3.4.
Proposition 3.5.7. Let Y and Z be countable sets, and assume f ∈ ℓ^1(Y, C), g ∈ ℓ^1(Z, C). Define

(3.5.30) f × g : Y × Z → C, (f × g)(j, k) = f(j)g(k).

Then

(3.5.31) f × g ∈ ℓ^1(Y × Z, C).

Proof. Given a finite set F ⊂ Y × Z, there exist finite A ⊂ Y and B ⊂ Z such that F ⊂ A × B. Then

(3.5.32) ∑_{(j,k)∈F} |f(j)g(k)| ≤ ∑_{(j,k)∈A×B} |f(j)g(k)| = ( ∑_{j∈A} |f(j)| )( ∑_{k∈B} |g(k)| ) ≤ M(f)M(g),

where M(f) = ∑_{j∈Y} |f(j)| and M(g) = ∑_{k∈Z} |g(k)|.
We can apply Proposition 3.5.6 to f × g to deduce:

Proposition 3.5.8. In the setting of Proposition 3.5.7,

(3.5.33) ∑_{(j,k)∈Y×Z} f(j)g(k) = ( ∑_{j∈Y} f(j) )( ∑_{k∈Z} g(k) ).

In case Y = Z = N, we can then apply Proposition 3.5.3, with Z replaced by N × N, f replaced by f × g, and

(3.5.34) Fν = {(j, k) ∈ N × N : j + k ≤ ν},

and recover Proposition 3.3.4, including (3.3.16).
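In the same spirit, one can watch the diagonal exhaustion (3.5.34) reproduce the product of two absolutely convergent series, as in (3.5.33). The sketch below uses two geometric series as illustrative choices (the particular x and y are assumptions, not from the text).

```python
# Sketch of (3.5.33)-(3.5.34) with f(j) = x^j, g(k) = y^k, |x|, |y| < 1:
# summing f(j)g(k) over F_N = {j + k <= N} (the Cauchy product, grouped
# by n = j + k) recovers the product of the two sums.

x, y = 0.5, -0.25
N = 100  # enough terms that the remainder is below double precision

# Partial sum over the diagonal sets, i.e. sum_n sum_{j=0}^n x^j y^(n-j).
diag = sum(x ** j * y ** (n - j) for n in range(N) for j in range(n + 1))

# Product of the two (truncated) series.
prod = sum(x ** j for j in range(N)) * sum(y ** k for k in range(N))

assert abs(diag - prod) < 1e-12
# Geometric closed form: 1/((1 - x)(1 - y)) = 1/(0.5 * 1.25) = 1.6.
assert abs(prod - 1.6) < 1e-12
```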
Chapter 4
Calculus
Having developed foundational material on numbers, spaces, and functions, we proceed further into the heart of analysis, with a rigorous development of calculus for functions of one real variable. Section 4.1 introduces the derivative, establishes basic identities like the product rule and the chain rule, and also obtains some important theoretical results, such as the Mean Value Theorem and the Inverse Function Theorem. One application of the latter is the study of x^{1/n}, for x > 0, which leads more generally to x^r, for x > 0 and r ∈ Q.

Section 4.2 brings in the integral, more precisely the Riemann integral. A major result is the Fundamental Theorem of Calculus, whose proof makes essential use of the Mean Value Theorem. Another topic is the change of variable formula for integrals (treated in some exercises).

In §4.3 we treat power series, continuing the development from §3.3 of Chapter 3. Here we treat such topics as term by term differentiation of power series, and formulas for the remainder when a power series is truncated. An application of such remainder formulas is made to the study of convergence of the power series about x = 0 of (1 − x)^b.

Section 4.4 studies curves in Euclidean space R^n, with particular attention to arc length. We derive an integral formula for arc length. We show that a smooth curve can be reparametrized by arc length, as an application of the Inverse Function Theorem. We then take a look at the unit circle S^1 in R^2. Using the parametrization of part of S^1 as (t, √(1 − t^2)), we obtain a power series for arc length, as an application of material from §4.3 on power series of (1 − x)^b, with b = −1/2, and x replaced by t^2. We also bring in
the trigonometric functions, having the property that (cos t, sin t) provides a parametrization of S^1 by arc length.

Section 4.5 goes much further into the study of the trigonometric functions. Actually, it begins with a treatment of the exponential function e^t, observes that such treatment extends readily to e^{at}, given a ∈ C, and then establishes that e^{it} provides a unit speed parametrization of S^1. This directly gives Euler's formula e^{it} = cos t + i sin t, and provides for a unified treatment of the exponential and trigonometric functions. We also bring in log as the inverse function to the exponential, and we use the formula x^r = e^{r log x} to generalize results of §4.1 on x^r from r ∈ Q to r ∈ R, and further, to r ∈ C.

In §4.6 we give a natural extension of the Riemann integral from the class of bounded (Riemann integrable) functions to a class of unbounded "integrable" functions. The treatment here is perhaps a desirable alternative to discussions one sees of "improper integrals."
4.1. The derivative

Consider a function f, defined on an interval (a, b) ⊂ R, taking values in R or C. Given x ∈ (a, b), we say f is differentiable at x, with derivative f′(x), provided

(4.1.1) lim_{h→0} (f(x + h) − f(x))/h = f′(x).

We also use the notation

(4.1.2) df/dx (x) = f′(x).

A characterization equivalent to (4.1.1) is

(4.1.3) f(x + h) = f(x) + f′(x)h + r(x, h), r(x, h) = o(h),

where

(4.1.4) r(x, h) = o(h) means r(x, h)/h → 0 as h → 0.

Clearly if f is differentiable at x then it is continuous at x. We say f is differentiable on (a, b) provided it is differentiable at each point of (a, b). If also g is defined on (a, b) and differentiable at x, we have

(4.1.5) d/dx (f + g)(x) = f′(x) + g′(x).

We also have the following product rule:

(4.1.6) d/dx (fg)(x) = f′(x)g(x) + f(x)g′(x).

To prove (4.1.6), note that

(f(x + h)g(x + h) − f(x)g(x))/h = ((f(x + h) − f(x))/h) g(x) + f(x + h) ((g(x + h) − g(x))/h).

We can use the product rule to show inductively that

(4.1.7) d/dx x^n = n x^{n−1},

for all n ∈ N. In fact, this is immediate from (4.1.1) if n = 1. Given that it holds for n = k, we have

d/dx x^{k+1} = d/dx (x · x^k) = (dx/dx) x^k + x (d/dx x^k) = x^k + k x^k = (k + 1) x^k,
completing the induction. We also have

(1/h)(1/(x + h) − 1/x) = −1/(x(x + h)) → −1/x^2, as h → 0,

for x ≠ 0, hence

(4.1.8) d/dx (1/x) = −1/x^2, if x ≠ 0.

From here, we can extend (4.1.7) from n ∈ N to all n ∈ Z (requiring x ≠ 0 if n < 0). A similar inductive argument yields

(4.1.9) d/dx f(x)^n = n f(x)^{n−1} f′(x),

for n ∈ N, and more generally for n ∈ Z (requiring f(x) ≠ 0 if n < 0).

Going further, we have the following chain rule. Suppose f : (a, b) → (α, β) is differentiable at x and g : (α, β) → R (or C) is differentiable at y = f(x). Form G = g ∘ f, i.e., G(x) = g(f(x)). We claim

(4.1.10) G = g ∘ f =⇒ G′(x) = g′(f(x)) f′(x).

To see this, write

(4.1.11) G(x + h) = g(f(x + h)) = g(f(x) + f′(x)h + r_f(x, h)) = g(f(x)) + g′(f(x))(f′(x)h + r_f(x, h)) + r_g(f(x), f′(x)h + r_f(x, h)).

Here,

r_f(x, h)/h → 0 as h → 0,

and also

r_g(f(x), f′(x)h + r_f(x, h))/h → 0, as h → 0,

so the analogue of (4.1.3) applies.

The derivative has the following important connection to maxima and minima.

Proposition 4.1.1. Let f : (a, b) → R. Suppose x ∈ (a, b) and

(4.1.12) f(x) ≥ f(y), ∀ y ∈ (a, b).

If f is differentiable at x, then f′(x) = 0. The same conclusion holds if f(x) ≤ f(y) for all y ∈ (a, b).
Figure 4.1.1. Illustration of the Mean Value Theorem
Proof. Given (4.1.12), we have

(4.1.13) (f(x + h) − f(x))/h ≤ 0, ∀ h ∈ (0, b − x),

and

(4.1.14) (f(x + h) − f(x))/h ≥ 0, ∀ h ∈ (a − x, 0).

If f is differentiable at x, both (4.1.13) and (4.1.14) must converge to f′(x) as h → 0, so we simultaneously have f′(x) ≤ 0 and f′(x) ≥ 0.

We next establish a key result known as the Mean Value Theorem. See Figure 4.1.1 for an illustration.

Theorem 4.1.2. Let f : [a, b] → R. Assume f is continuous on [a, b] and differentiable on (a, b). Then there exists ξ ∈ (a, b) such that

(4.1.15) f′(ξ) = (f(b) − f(a))/(b − a).
Proof. Let g(x) = f(x) − κ(x − a), where κ denotes the right side of (4.1.15). Then g(a) = g(b). The result (4.1.15) is equivalent to the assertion that

(4.1.16) g′(ξ) = 0

for some ξ ∈ (a, b). Now g is continuous on the compact set [a, b], so it assumes both a maximum and a minimum on this set. If g has a maximum at a point ξ ∈ (a, b), then (4.1.16) follows from Proposition 4.1.1. If not, the maximum must be g(a) = g(b), and then g must assume a minimum at some point ξ ∈ (a, b). Again Proposition 4.1.1 implies (4.1.16).

We use the Mean Value Theorem to produce a criterion for constructing the inverse of a function. Let

(4.1.17) f : [a, b] → R, f(a) = α, f(b) = β.

Assume f is continuous on [a, b], differentiable on (a, b), and

(4.1.18) 0 < γ0 ≤ f′(x) ≤ γ1 < ∞, ∀ x ∈ (a, b).

Then (4.1.15) implies

(4.1.19) γ0(b − a) ≤ β − α ≤ γ1(b − a).

We can also apply Theorem 4.1.2 to f, restricted to an interval [x1, x2] ⊂ [a, b], to get

(4.1.20) γ0(x2 − x1) ≤ f(x2) − f(x1) ≤ γ1(x2 − x1), if a ≤ x1 < x2 ≤ b.

It follows that

(4.1.21) f : [a, b] → [α, β] is one-to-one.

The intermediate value theorem implies f : [a, b] → [α, β] is onto. Consequently f has an inverse

(4.1.22) g : [α, β] → [a, b], g(f(x)) = x, f(g(y)) = y,

and (4.1.20) implies

(4.1.23) γ0(g(y2) − g(y1)) ≤ y2 − y1 ≤ γ1(g(y2) − g(y1)), if α ≤ y1 < y2 ≤ β.

The following result is known as the Inverse Function Theorem.

Theorem 4.1.3. If f is continuous on [a, b] and differentiable on (a, b), and (4.1.17)–(4.1.18) hold, then its inverse g : [α, β] → [a, b] is differentiable on (α, β), and

(4.1.24) g′(y) = 1/f′(x), for y = f(x) ∈ (α, β).

The same conclusion holds if in place of (4.1.18) we have

(4.1.25) −γ1 ≤ f′(x) ≤ −γ0 < 0, ∀ x ∈ (a, b),

except that then β < α.
Proof. Fix y ∈ (α, β), and let x = g(y), so y = f(x). From (4.1.22) we have, for h small enough,

x + h = g(f(x + h)) = g(f(x) + f′(x)h + r(x, h)),

i.e.,

(4.1.26) g(y + f′(x)h + r(x, h)) = g(y) + h, r(x, h) = o(h).

Now (4.1.23) implies

(4.1.27) |g(y1 + r(x, h)) − g(y1)| ≤ (1/γ0)|r(x, h)|,

provided y1, y1 + r(x, h) ∈ [α, β], so, with h̃ = f′(x)h, and y1 = y + h̃, we have

(4.1.28) g(y + h̃) = g(y) + h̃/f′(x) + o(h̃),

yielding (4.1.24) from the analogue of (4.1.3).
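A minimal numerical sketch of (4.1.24): the function f(x) = x^3 on [1, 2] and the step sizes below are illustrative choices (not from the text), with inverse g(y) = y^{1/3}.

```python
# Check of (4.1.24) for the illustrative choice f(x) = x^3 on [1, 2],
# whose inverse is g(y) = y^(1/3): a difference quotient of g at
# y = f(x) should approximate 1/f'(x) = 1/(3 x^2).

def f(x):
    return x ** 3

def g(y):
    return y ** (1.0 / 3.0)

x = 1.5
y = f(x)
h = 1e-6

dq = (g(y + h) - g(y)) / h      # difference quotient of the inverse
assert abs(dq - 1.0 / (3 * x ** 2)) < 1e-6
```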
Remark. If one knew that g were differentiable, as well as f, then the identity (4.1.24) would follow by differentiating g(f(x)) = x, applying the chain rule. However, an additional argument, such as given above, is necessary to guarantee that g is differentiable.

Theorem 4.1.3 applies to the functions

(4.1.29) p_n(x) = x^n, n ∈ N.

By (4.1.7), p_n′(x) > 0 for x > 0, so (4.1.18) holds when 0 < a < b < ∞. We can take a ↘ 0 and b ↗ ∞ and see that

(4.1.30) p_n : (0, ∞) → (0, ∞) is invertible,

with differentiable inverse q_n : (0, ∞) → (0, ∞). We use the notation

(4.1.31) x^{1/n} = q_n(x), x > 0,

so, given n ∈ N,

(4.1.32) x > 0 =⇒ x = x^{1/n} · · · x^{1/n} (n factors).

Note. We recall that x^{1/n} was constructed, for x > 0, in Chapter 1, §1.7, and its continuity discussed in Chapter 3, §3.1.

Given m ∈ Z, we can set

(4.1.33) x^{m/n} = (x^{1/n})^m, x > 0,
and verify that (x^{1/kn})^{km} = (x^{1/n})^m. Thus we have x^r defined for all r ∈ Q, when x > 0. We have

(4.1.34) x^{r+s} = x^r x^s, for x > 0, r, s ∈ Q.

See Exercises 3–5 in §1.7 of Chapter 1.

Applying (4.1.24) to f(x) = x^n and g(y) = y^{1/n}, we have

(4.1.35) d/dy y^{1/n} = 1/(n x^{n−1}), y = x^n, x > 0.

Now x^{n−1} = y/x = y^{1−1/n}, so we get

(4.1.36) d/dy y^r = r y^{r−1}, y > 0,

when r = 1/n. Putting this together with (4.1.9) (with m in place of n), we get (4.1.36) for all r = m/n ∈ Q.

The definition of x^r for x > 0 and the identity (4.1.36) can be extended to all r ∈ R, with some more work. We will find a neat way to do this in §4.5.

We recall another common notation, namely

(4.1.37) √x = x^{1/2}, x > 0.

Then (4.1.36) yields

(4.1.38) d/dx √x = 1/(2√x).

In regard to this, note that, if we consider

(4.1.39) (√(x + h) − √x)/h,

we can multiply numerator and denominator by √(x + h) + √x, to get

(4.1.40) 1/(√(x + h) + √x),

whose convergence to the right side of (4.1.38) for x > 0 is equivalent to the statement that

(4.1.41) lim_{h→0} √(x + h) = √x,

i.e., to the continuity of x ↦ √x on (0, ∞). Such continuity is a consequence of the fact that, for 0 < a < b < ∞, n = 2,

(4.1.42) p_n : [a, b] → [a^n, b^n]

is continuous, one-to-one, and onto, so, by the compactness of [a, b], its inverse is continuous. Thus we have an alternative derivation of (4.1.38).
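The identity (4.1.39)–(4.1.40) also checks out in floating point; the point x = 2 and the step sizes below are illustrative assumptions.

```python
# The algebra behind (4.1.39)-(4.1.40): the difference quotient of
# sqrt(x) equals 1/(sqrt(x+h) + sqrt(x)) exactly, and as h -> 0 both
# approach the right side of (4.1.38), namely 1/(2 sqrt(x)).
from math import sqrt, isclose

x = 2.0
for h in (1e-3, 1e-5, 1e-7):
    dq = (sqrt(x + h) - sqrt(x)) / h
    # (4.1.40): the same quantity, written without cancellation
    assert isclose(dq, 1.0 / (sqrt(x + h) + sqrt(x)), rel_tol=1e-6)

# Convergence to (4.1.38):
h = 1e-7
assert abs((sqrt(x + h) - sqrt(x)) / h - 1 / (2 * sqrt(x))) < 1e-6
```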
If I ⊂ R is an interval and f : I → R (or C), we say f ∈ C^1(I) if f is differentiable on I and f′ is continuous on I. If f′ is in turn differentiable, we have the second derivative of f:

(4.1.43) d^2 f/dx^2 (x) = f″(x) = d/dx f′(x).

If f′ is differentiable on I and f″ is continuous on I, we say f ∈ C^2(I). Inductively, we can define higher order derivatives of f, f^{(k)}, also denoted d^k f/dx^k. Here, f^{(1)} = f′, f^{(2)} = f″, and if f^{(k)} is differentiable,

(4.1.44) f^{(k+1)}(x) = d/dx f^{(k)}(x).

If f^{(k)} is continuous on I, we say f ∈ C^k(I).

Sometimes we will run into functions of more than one variable, and will want to differentiate with respect to each one of them. For example, if f(x, y) is defined for (x, y) in an open set in R^2, we define partial derivatives,

(4.1.45) ∂f/∂x (x, y) = lim_{h→0} (f(x + h, y) − f(x, y))/h, ∂f/∂y (x, y) = lim_{h→0} (f(x, y + h) − f(x, y))/h.

We will not need any more than the definition here. A serious study of the derivative of a function of several variables is given in the companion [13] to this volume, Introduction to Analysis in Several Variables.

We end this section with some results on the significance of the second derivative.

Proposition 4.1.4. Assume f is differentiable on (a, b), x0 ∈ (a, b), and f′(x0) = 0. Assume f′ is differentiable at x0 and f″(x0) > 0. Then there exists δ > 0 such that

(4.1.46) f(x0) < f(x) for all x ∈ (x0 − δ, x0 + δ) \ {x0}.

We say f has a local minimum at x0.

Proof. Since

(4.1.47) f″(x0) = lim_{h→0} (f′(x0 + h) − f′(x0))/h,

the assertion that f″(x0) > 0 implies that there exists δ > 0 such that the difference quotient on the right side of (4.1.47) is > 0 for all nonzero h ∈ [−δ, δ]. Hence

(4.1.48) −δ ≤ h < 0 =⇒ f′(x0 + h) < 0, 0 < h ≤ δ =⇒ f′(x0 + h) > 0.

This plus the mean value theorem imply (4.1.46).
Remark. Similarly,

(4.1.49) f″(x0) < 0 =⇒ f has a local maximum at x0.

These two facts constitute the second derivative test for local maxima and local minima.

Let us now assume that f and f′ are differentiable on (a, b), so f″ is defined at each point of (a, b). Let us further assume

(4.1.50) f″(x) > 0, ∀ x ∈ (a, b).

The mean value theorem, applied to f′, yields

(4.1.51) a < x0 < x1 < b =⇒ f′(x0) < f′(x1).

Here is another interesting property.

Proposition 4.1.5. If (4.1.50) holds and a < x0 < x1 < b, then

(4.1.52) f(s x0 + (1 − s) x1) < s f(x0) + (1 − s) f(x1), ∀ s ∈ (0, 1).

Proof. For s ∈ [0, 1], set

(4.1.53) g(s) = s f(x0) + (1 − s) f(x1) − f(s x0 + (1 − s) x1).

The result (4.1.52) is equivalent to

(4.1.54) g(s) > 0 for 0 < s < 1.

Note that

(4.1.55) g(0) = g(1) = 0.

If (4.1.54) fails, g must assume a minimum at some point s0 ∈ (0, 1). At such a point, g′(s0) = 0. A computation gives

g′(s) = f(x0) − f(x1) − (x0 − x1) f′(s x0 + (1 − s) x1),

and hence

(4.1.56) g″(s) = −(x0 − x1)^2 f″(s x0 + (1 − s) x1).

Thus (4.1.50) ⇒ g″(s0) < 0. Then (4.1.49) ⇒ g has a local maximum at s0. This contradiction establishes (4.1.54), hence (4.1.52).

Remark. The result (4.1.52) implies that the graph of y = f(x) over [x0, x1] lies below the chord, i.e., the line segment from (x0, f(x0)) to (x1, f(x1)) in R^2. We say f is convex.
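The convexity inequality (4.1.52) is easy to sample numerically. The function f(x) = x^4 and the interval [1, 3] below are illustrative choices (f″ = 12x^2 > 0 there), not taken from the text.

```python
# Sampling the convexity inequality (4.1.52) for the illustrative
# choice f(x) = x^4 on [x0, x1] = [1, 3], where f'' > 0.

def f(x):
    return x ** 4

x0, x1 = 1.0, 3.0
for i in range(1, 10):
    s = i / 10.0
    chord = s * f(x0) + (1 - s) * f(x1)
    # the graph lies strictly below the chord at interior points
    assert f(s * x0 + (1 - s) * x1) < chord
```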
Exercises

Compute the derivative of each of the following functions. Specify where each of these derivatives is defined.

1. √(1 + x^2),

2. (x^2 + x^3)^{−4},

3. √(1 + x^2)/(x^2 + x^3)^4.

4. Let f : [0, ∞) → R be a C^2 function satisfying

(4.1.57) f(x) > 0, f′(x) > 0, f″(x) < 0, for x > 0.
Show that

(4.1.58) x, y > 0 =⇒ f(x + y) < f(x) + f(y).

5. Apply Exercise 4 to

(4.1.59) f(x) = x/(1 + x).

Relate the conclusion to Exercises 1–2 in §2.3 of Chapter 2. Give a direct proof that (4.1.58) holds for f in (4.1.59), without using calculus.

6. If f : I → R^n, we define f′(x) just as in (4.1.1). If f(x) = (f_1(x), . . . , f_n(x)), then f is differentiable at x if and only if each component f_j is, and f′(x) = (f_1′(x), . . . , f_n′(x)). Parallel to (4.1.6), show that if g : I → R^n, then the dot product satisfies

d/dx (f(x) · g(x)) = f′(x) · g(x) + f(x) · g′(x).

7. Establish the following variant of Proposition 4.1.5. Suppose (4.1.50) is weakened to

(4.1.60) f″(x) ≥ 0, ∀ x ∈ (a, b).
Show that, in place of (4.1.52), one has

(4.1.61) f(s x0 + (1 − s) x1) ≤ s f(x0) + (1 − s) f(x1), ∀ s ∈ (0, 1).
Hint. Consider f_ε(x) = f(x) + ε x^2.

8. The following is called the generalized mean value theorem. Let f and g be continuous on [a, b] and differentiable on (a, b). Then there exists ξ ∈ (a, b) such that

[f(b) − f(a)] g′(ξ) = [g(b) − g(a)] f′(ξ).

Show that this follows from the mean value theorem, applied to

h(x) = [f(b) − f(a)] g(x) − [g(b) − g(a)] f(x).
9. Take f : [a, b] → [α, β] and g : [α, β] → [a, b] as in the setting of the Inverse Function Theorem, Theorem 4.1.3. Write (4.1.24) as

(4.1.62) g′(y) = 1/f′(g(y)), y ∈ (α, β).

Show that f ∈ C^1((a, b)) =⇒ g ∈ C^1((α, β)), i.e., the right side of (4.1.62) is continuous on (α, β). Show inductively that, for k ∈ N, f ∈ C^k((a, b)) =⇒ g ∈ C^k((α, β)).

Example. Show that if f ∈ C^2((a, b)), then (having shown that g ∈ C^1) the right side of (4.1.62) is C^1 and hence

g″(y) = −(1/f′(g(y))^2) f″(g(y)) g′(y).

10. Let I ⊂ R be an open interval and f : I → R differentiable. (Do not assume f′ is continuous.) Assume a, b ∈ I, a < b, and f′(a) < u < f′(b). Show that there exists ξ ∈ (a, b) such that f′(ξ) = u.

Hint. Reduce to the case u = 0, so f′(a) < 0 < f′(b). Show that then f|_{[a,b]} has a minimum at a point ξ ∈ (a, b).
4.2. The integral

In this section, we introduce the Riemann version of the integral, and relate it to the derivative. We will define the Riemann integral of a bounded function over an interval I = [a, b] on the real line. For now, we assume f is real valued.

To start, we partition I into smaller intervals. A partition P of I is a finite collection of subintervals {J_k : 0 ≤ k ≤ N}, disjoint except for their endpoints, whose union is I. We can order the J_k so that J_k = [x_k, x_{k+1}], where

(4.2.1) x_0 < x_1 < · · · < x_N < x_{N+1}, x_0 = a, x_{N+1} = b.

We call the points x_k the endpoints of P. We set

(4.2.2) ℓ(J_k) = x_{k+1} − x_k, maxsize(P) = max_{0≤k≤N} ℓ(J_k).

We then set

(4.2.3) Ī_P(f) = ∑_k sup_{J_k} f(x) ℓ(J_k), I̲_P(f) = ∑_k inf_{J_k} f(x) ℓ(J_k).

Here,

sup_{J_k} f(x) = sup f(J_k), inf_{J_k} f(x) = inf f(J_k),

and we recall that if S ⊂ R is bounded, sup S and inf S were defined in §1.6 of Chapter 1; cf. (1.6.35) and (1.6.48). We call Ī_P(f) and I̲_P(f) respectively the upper sum and lower sum of f, associated to the partition P. See Figure 4.2.1 for an illustration. Note that I̲_P(f) ≤ Ī_P(f). These quantities should approximate the Riemann integral of f, if the partition P is sufficiently "fine."

To be more precise, if P and Q are two partitions of I, we say P refines Q, and write P ≻ Q, if P is formed by partitioning each interval in Q. Equivalently, P ≻ Q if and only if all the endpoints of Q are also endpoints of P. It is easy to see that any two partitions have a common refinement; just take the union of their endpoints, to form a new partition. Note also that refining a partition lowers the upper sum of f and raises its lower sum:

(4.2.4) P ≻ Q =⇒ Ī_P(f) ≤ Ī_Q(f), and I̲_P(f) ≥ I̲_Q(f).

Consequently, if P_j are any two partitions and Q is a common refinement, we have

(4.2.5) I̲_{P_1}(f) ≤ I̲_Q(f) ≤ Ī_Q(f) ≤ Ī_{P_2}(f).
Figure 4.2.1. Upper and lower sums associated to a partition
Now, whenever f : I → R is bounded, the following quantities are well defined:

(4.2.6) Ī(f) = inf_{P∈Π(I)} Ī_P(f), I̲(f) = sup_{P∈Π(I)} I̲_P(f),

where Π(I) is the set of all partitions of I. We call I̲(f) the lower integral of f and Ī(f) its upper integral. Clearly, by (4.2.5), I̲(f) ≤ Ī(f). We then say that f is Riemann integrable provided I̲(f) = Ī(f), and in such a case, we set

(4.2.7) ∫_I f(x) dx = ∫_a^b f(x) dx = I̲(f) = Ī(f).

We will denote the set of Riemann integrable functions on I by R(I).

We derive some basic properties of the Riemann integral.

Proposition 4.2.1. If f, g ∈ R(I), then f + g ∈ R(I), and

(4.2.8) ∫_I (f + g) dx = ∫_I f dx + ∫_I g dx.
Proof. If J_k is any subinterval of I, then

sup_{J_k} (f + g) ≤ sup_{J_k} f + sup_{J_k} g, and inf_{J_k} (f + g) ≥ inf_{J_k} f + inf_{J_k} g,

so, for any partition P, we have Ī_P(f + g) ≤ Ī_P(f) + Ī_P(g). Also, using common refinements, we can simultaneously approximate Ī(f) and Ī(g) by Ī_P(f) and Ī_P(g), and ditto for Ī(f + g). Thus the characterization (4.2.6) implies Ī(f + g) ≤ Ī(f) + Ī(g). A parallel argument implies I̲(f + g) ≥ I̲(f) + I̲(g), and the proposition follows.

Next, there is a fair supply of Riemann integrable functions.

Proposition 4.2.2. If f is continuous on I, then f is Riemann integrable.

Proof. Any continuous function on a compact interval is bounded and uniformly continuous (see Propositions 3.1.1 and 3.1.3 of Chapter 3). Let ω(δ) be a modulus of continuity for f, so

(4.2.9) |x − y| ≤ δ =⇒ |f(x) − f(y)| ≤ ω(δ), ω(δ) → 0 as δ → 0.

Then

(4.2.10) maxsize(P) ≤ δ =⇒ Ī_P(f) − I̲_P(f) ≤ ω(δ) · ℓ(I),

which yields the proposition.
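Upper and lower sums are easy to compute directly for a monotone function, since the sup and inf over each subinterval are attained at its endpoints. The following sketch uses the illustrative choice f(x) = x^2 on [0, 1] with N equal subintervals; both sums converge to 1/3, and their gap here is exactly 1/N.

```python
# Upper and lower sums (4.2.3) for f(x) = x^2 on I = [0, 1], with N
# equal subintervals.  On [x_k, x_{k+1}], x^2 has inf at x_k and sup at
# x_{k+1} (monotonicity), so the Darboux sums reduce to endpoint values.

def darboux_sums(N):
    xs = [k / N for k in range(N + 1)]
    lower = sum(xs[k] ** 2 * (xs[k + 1] - xs[k]) for k in range(N))
    upper = sum(xs[k + 1] ** 2 * (xs[k + 1] - xs[k]) for k in range(N))
    return lower, upper

lo, up = darboux_sums(1000)
assert lo < 1.0 / 3.0 < up      # the integral is squeezed in between
assert up - lo < 1e-2           # the gap is 1/N = 0.001 here
```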
We denote the set of continuous functions on I by C(I). Thus Proposition 4.2.2 says C(I) ⊂ R(I).

The proof of Proposition 4.2.2 provides a criterion on a partition guaranteeing that Ī_P(f) and I̲_P(f) are close to ∫_I f dx when f is continuous. We produce an extension, giving a condition under which Ī_P(f) and Ī(f) are close, and I̲_P(f) and I̲(f) are close, given f bounded on I. Given a partition P_0 of I, set

(4.2.11) minsize(P_0) = min{ℓ(J_k) : J_k ∈ P_0}.

Lemma 4.2.3. Let P and Q be two partitions of I. Assume

(4.2.12) maxsize(P) ≤ (1/k) minsize(Q).

Let |f| ≤ M on I. Then

(4.2.13) Ī_P(f) ≤ Ī_Q(f) + (2M/k) ℓ(I), I̲_P(f) ≥ I̲_Q(f) − (2M/k) ℓ(I).

Proof. Let P_1 denote the minimal common refinement of P and Q. Consider on the one hand those intervals in P that are contained in intervals in Q and on the other hand those intervals in P that are not contained in intervals in Q. Each interval of the first type is also an interval in P_1. Each interval of the second type gets partitioned, to yield two intervals in P_1. Denote by P_1^b the collection of such divided intervals. By (4.2.12), the lengths of the intervals in P_1^b sum to ≤ ℓ(I)/k. It follows that

|Ī_P(f) − Ī_{P_1}(f)| ≤ ∑_{J∈P_1^b} 2M ℓ(J) ≤ 2M ℓ(I)/k,

and similarly |I̲_P(f) − I̲_{P_1}(f)| ≤ 2M ℓ(I)/k. Therefore

Ī_P(f) ≤ Ī_{P_1}(f) + (2M/k) ℓ(I), I̲_P(f) ≥ I̲_{P_1}(f) − (2M/k) ℓ(I).

Since also Ī_{P_1}(f) ≤ Ī_Q(f) and I̲_{P_1}(f) ≥ I̲_Q(f), we obtain (4.2.13).
The following consequence is sometimes called Darboux's Theorem.

Theorem 4.2.4. Let P_ν be a sequence of partitions of I into ν intervals J_{νk}, 1 ≤ k ≤ ν, such that maxsize(P_ν) → 0. If f : I → R is bounded, then

(4.2.14) Ī_{P_ν}(f) → Ī(f) and I̲_{P_ν}(f) → I̲(f).

Consequently,

(4.2.15) f ∈ R(I) ⟺ Ī(f) = I̲(f) = lim_{ν→∞} ∑_{k=1}^{ν} f(ξ_{νk}) ℓ(J_{νk}),

for arbitrary ξ_{νk} ∈ J_{νk}, in which case the limit is ∫_I f dx.

Proof. As before, assume |f| ≤ M. Pick ε > 0. Let Q be a partition such that

Ī(f) ≤ Ī_Q(f) ≤ Ī(f) + ε, I̲(f) ≥ I̲_Q(f) ≥ I̲(f) − ε.

Now pick N such that

ν ≥ N =⇒ maxsize(P_ν) ≤ ε minsize(Q).

Lemma 4.2.3 yields, for ν ≥ N,

Ī_{P_ν}(f) ≤ Ī_Q(f) + 2M ℓ(I) ε, I̲_{P_ν}(f) ≥ I̲_Q(f) − 2M ℓ(I) ε.

Hence, for ν ≥ N,

Ī(f) ≤ Ī_{P_ν}(f) ≤ Ī(f) + [2M ℓ(I) + 1] ε, I̲(f) ≥ I̲_{P_ν}(f) ≥ I̲(f) − [2M ℓ(I) + 1] ε.

This proves (4.2.14).
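For a continuous integrand, (4.2.15) says the choice of sample points ξ_{νk} does not affect the limit. The sketch below checks this for the illustrative choice f(x) = sin x on [0, π] (exact integral 2); the `tag` parameter, which places the sample point within each subinterval, is a device introduced here for the demonstration.

```python
# Sums as on the right side of (4.2.15), with the sample point placed at
# a fixed relative position `tag` in each of nu equal subintervals.
from math import sin, pi

def riemann_sum(f, a, b, nu, tag):
    # tag in [0, 1]: 0 = left endpoint, 0.5 = midpoint, 1 = right endpoint
    dx = (b - a) / nu
    return sum(f(a + (k + tag) * dx) * dx for k in range(nu))

for tag in (0.0, 0.37, 0.5, 1.0):
    assert abs(riemann_sum(sin, 0.0, pi, 2000, tag) - 2.0) < 1e-2
```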
Remark. The sums on the right side of (4.2.15) are called Riemann sums, approximating ∫_I f dx (when f is Riemann integrable).

Remark. A second proof of Proposition 4.2.1 can readily be deduced from Theorem 4.2.4.

One should be warned that, once such a specific choice of P_ν and ξ_{νk} has been made, the limit on the right side of (4.2.15) might exist for a bounded function f that is not Riemann integrable. This and other phenomena are illustrated by the following example of a function which is not Riemann integrable. For x ∈ I, set

(4.2.16) ϑ(x) = 1 if x ∈ Q, ϑ(x) = 0 if x ∉ Q,

where Q is the set of rational numbers. Now every interval J ⊂ I of positive length contains points in Q and points not in Q, so for any partition P of I we have Ī_P(ϑ) = ℓ(I) and I̲_P(ϑ) = 0, hence

(4.2.17) Ī(ϑ) = ℓ(I), I̲(ϑ) = 0.

Note that, if P_ν is a partition of I into ν equal subintervals, then we could pick each ξ_{νk} to be rational, in which case the limit on the right side of (4.2.15) would be ℓ(I), or we could pick each ξ_{νk} to be irrational, in which case this limit would be zero. Alternatively, we could pick half of them to be rational and half to be irrational, and the limit would be (1/2) ℓ(I).

Associated to the Riemann integral is a notion of size of a set S, called content. If S is a subset of I, define the "characteristic function"

(4.2.18) χ_S(x) = 1 if x ∈ S, 0 if x ∉ S.

We define "upper content" cont⁺ and "lower content" cont⁻ by

(4.2.19) cont⁺(S) = Ī(χ_S), cont⁻(S) = I̲(χ_S).

We say S "has content," or "is contented" if these quantities are equal, which happens if and only if χ_S ∈ R(I), in which case the common value of cont⁺(S) and cont⁻(S) is

(4.2.20) m(S) = ∫_I χ_S(x) dx.
It is easy to see that

(4.2.21) cont⁺(S) = inf { ∑_{k=1}^{N} ℓ(J_k) : S ⊂ J_1 ∪ · · · ∪ J_N },

where J_k are intervals. Here, we require S to be in the union of a finite collection of intervals.

There is a more sophisticated notion of the size of a subset of I, called Lebesgue measure. The key to the construction of Lebesgue measure is to cover a set S by a countable (either finite or infinite) set of intervals. The outer measure of S ⊂ I is defined by

(4.2.22) m*(S) = inf { ∑_{k≥1} ℓ(J_k) : S ⊂ ∪_{k≥1} J_k }.

Here {J_k} is a finite or countably infinite collection of intervals. Clearly

(4.2.23) m*(S) ≤ cont⁺(S).

Note that, if S = I ∩ Q, then χ_S = ϑ, defined by (4.2.16). In this case it is easy to see that cont⁺(S) = ℓ(I), but m*(S) = 0. In fact, (4.2.22) readily yields the following:

(4.2.24) S countable =⇒ m*(S) = 0.

We point out that we can require the intervals J_k in (4.2.22) to be open. Consequently, since each open cover of a compact set has a finite subcover,

(4.2.25) S compact =⇒ m*(S) = cont⁺(S).
See the material at the end of this section for a generalization of Proposition 4.2.2, giving a sufficient condition for a bounded function to be Riemann integrable on I, in terms of the upper content of its set of discontinuities, in Proposition 4.2.11, and then, in Proposition 4.2.12, a refinement, replacing upper content by outer measure.

It is useful to note that ∫_I f dx is additive in I, in the following sense.

Proposition 4.2.5. If a < b < c, f : [a, c] → R, f_1 = f|_{[a,b]}, f_2 = f|_{[b,c]}, then

(4.2.26) f ∈ R([a, c]) ⟺ f_1 ∈ R([a, b]) and f_2 ∈ R([b, c]),

and, if this holds,

(4.2.27) ∫_a^c f dx = ∫_a^b f_1 dx + ∫_b^c f_2 dx.

Proof. Since any partition of [a, c] has a refinement for which b is an endpoint, we may as well consider a partition P = P_1 ∪ P_2, where P_1 is a partition of [a, b] and P_2 is a partition of [b, c]. Then

(4.2.28) Ī_P(f) = Ī_{P_1}(f_1) + Ī_{P_2}(f_2), I̲_P(f) = I̲_{P_1}(f_1) + I̲_{P_2}(f_2),

so

(4.2.29) Ī_P(f) − I̲_P(f) = {Ī_{P_1}(f_1) − I̲_{P_1}(f_1)} + {Ī_{P_2}(f_2) − I̲_{P_2}(f_2)}.

Since both terms in braces in (4.2.29) are ≥ 0, we have equivalence in (4.2.26). Then (4.2.27) follows from (4.2.28) upon taking finer and finer partitions, and passing to the limit.

Let I = [a, b]. If f ∈ R(I), then f ∈ R([a, x]) for all x ∈ [a, b], and we can consider the function

(4.2.30) g(x) = ∫_a^x f(t) dt.
If a ≤ x_0 ≤ x_1 ≤ b, then

(4.2.31) g(x_1) − g(x_0) = ∫_{x_0}^{x_1} f(t) dt,

so, if |f| ≤ M,

(4.2.32) |g(x_1) − g(x_0)| ≤ M |x_1 − x_0|.

In other words, if f ∈ R(I), then g is Lipschitz continuous on I.

Recall from §4.1 that a function g : (a, b) → R is said to be differentiable at x ∈ (a, b) provided there exists the limit

(4.2.33) lim_{h→0} (1/h)[g(x + h) − g(x)] = g′(x).

When such a limit exists, g′(x), also denoted dg/dx, is called the derivative of g at x. Clearly g is continuous wherever it is differentiable.

The next result is part of the Fundamental Theorem of Calculus.

Theorem 4.2.6. If f ∈ C([a, b]), then the function g, defined by (4.2.30), is differentiable at each point x ∈ (a, b), and

(4.2.34) g′(x) = f(x).

Proof. Parallel to (4.2.31), we have, for h > 0,

(4.2.35) (1/h)[g(x + h) − g(x)] = (1/h) ∫_x^{x+h} f(t) dt.

If f is continuous at x, then, for any ε > 0, there exists δ > 0 such that |f(t) − f(x)| ≤ ε whenever |t − x| ≤ δ. Thus the right side of (4.2.35) is within ε of f(x) whenever h ∈ (0, δ]. Thus the desired limit exists as h ↘ 0. A similar argument treats h ↗ 0.
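Theorem 4.2.6 can be watched numerically: approximate g(x) = ∫_a^x f(t) dt by Riemann sums and form a difference quotient of g. The choice f = cos, along with the sum size and step below, are illustrative assumptions; the midpoint sampling is just one admissible choice of ξ_{νk}.

```python
# Numerical illustration of Theorem 4.2.6 with f = cos on [0, x]:
# g(x) = integral of cos over [0, x], computed by a midpoint Riemann
# sum, and (g(x+h) - g(x))/h should approximate f(x) = cos(x).
from math import cos

def g(x, n=20000):
    dx = x / n
    return sum(cos((k + 0.5) * dx) * dx for k in range(n))

x, h = 1.0, 1e-3
dq = (g(x + h) - g(x)) / h
assert abs(dq - cos(x)) < 1e-3   # difference quotient recovers f(x)
```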
The next result is the rest of the Fundamental Theorem of Calculus.

Theorem 4.2.7. If G is differentiable and G′(x) is continuous on [a, b], then

(4.2.36) ∫_a^b G′(t) dt = G(b) − G(a).

Proof. Consider the function

(4.2.37) g(x) = ∫_a^x G′(t) dt.

We have g ∈ C([a, b]), g(a) = 0, and, by Theorem 4.2.6,

(4.2.38) g′(x) = G′(x), ∀ x ∈ (a, b).

Thus f(x) = g(x) − G(x) is continuous on [a, b], and

(4.2.39) f′(x) = 0, ∀ x ∈ (a, b).

We claim that (4.2.39) implies f is constant on [a, b]. Granted this, since f(a) = g(a) − G(a) = −G(a), we have f(x) = −G(a) for all x ∈ [a, b], so the integral (4.2.37) is equal to G(x) − G(a) for all x ∈ [a, b]. Taking x = b yields (4.2.36).

The fact that (4.2.39) implies f is constant on [a, b] is a consequence of the Mean Value Theorem. This was established in §4.1; see Theorem 4.1.2. We repeat the statement here.

Theorem 4.2.8. Let f : [a, β] → R be continuous, and assume f is differentiable on (a, β). Then ∃ ξ ∈ (a, β) such that

(4.2.40) f′(ξ) = (f(β) − f(a))/(β − a).

Now, to see that (4.2.39) implies f is constant on [a, b], if not, ∃ β ∈ (a, b] such that f(β) ≠ f(a). Then just apply Theorem 4.2.8 to f on [a, β]. This completes the proof of Theorem 4.2.7.

We now extend Theorems 4.2.6–4.2.7 to the setting of Riemann integrable functions.

Proposition 4.2.9. Let f ∈ R([a, b]), and define g by (4.2.30). If x ∈ [a, b] and f is continuous at x, then g is differentiable at x, and g′(x) = f(x).

The proof is identical to that of Theorem 4.2.6.

Proposition 4.2.10. Assume G is differentiable on [a, b] and G′ ∈ R([a, b]). Then (4.2.36) holds.
Proof. We have

(4.2.41) G(b) − G(a) = ∑_{k=0}^{n−1} [G(a + (b − a)(k + 1)/n) − G(a + (b − a)k/n)] = ((b − a)/n) ∑_{k=0}^{n−1} G′(ξ_{kn}),

for some ξ_{kn} satisfying

(4.2.42) a + (b − a)k/n < ξ_{kn} < a + (b − a)(k + 1)/n,

as a consequence of the Mean Value Theorem. Given G′ ∈ R([a, b]), Darboux's theorem (Theorem 4.2.4) implies that as n → ∞ one gets G(b) − G(a) = ∫_a^b G′(t) dt.
Note that the beautiful symmetry in Theorems 4.2.6–4.2.7 is not preserved in Propositions 4.2.9–4.2.10. The hypothesis of Proposition 4.2.10 requires G to be differentiable at each x ∈ [a, b], but the conclusion of Proposition 4.2.9 does not yield differentiability at all points. For this reason, we regard Propositions 4.2.9–4.2.10 as less "fundamental" than Theorems 4.2.6–4.2.7. There are more satisfactory extensions of the fundamental theorem of calculus, involving the Lebesgue integral, and a more subtle notion of the "derivative" of a non-smooth function. For this, we can point the reader to Chapters 10–11 of the text [12], Measure Theory and Integration.

So far, we have dealt with integration of real valued functions. If f : I → C, we set f = f_1 + i f_2 with f_j : I → R and say f ∈ R(I) if and only if f_1 and f_2 are in R(I). Then

(4.2.43) ∫_I f dx = ∫_I f_1 dx + i ∫_I f_2 dx.

There are straightforward extensions of Propositions 4.2.5–4.2.10 to complex valued functions. Similar comments apply to functions f : I → R^n.
Complementary results on Riemann integrability
Here we provide a condition, more general than Proposition 4.2.2, which guarantees Riemann integrability.

Proposition 4.2.11. Let f : I → R be a bounded function, with I = [a, b]. Suppose that the set S of points of discontinuity of f has the property

(4.2.44)  cont⁺(S) = 0.
Then f ∈ R(I).

Proof. Say |f(x)| ≤ M. Take ε > 0. As in (4.2.21), take intervals J1, . . . , JN such that S ⊂ J1 ∪ · · · ∪ JN and Σ_{k=1}^N ℓ(Jk) < ε. In fact, fatten each Jk so that S is contained in the interior of this collection of intervals. Consider a partition P0 of I, whose intervals include J1, . . . , JN, amongst others, which we label I1, . . . , IK. Now f is continuous on each interval Iν, so, subdividing each Iν as necessary, hence refining P0 to a partition P1, we arrange that sup f − inf f < ε on each such subdivided interval. Denote these subdivided intervals I1′, . . . , IL′. It readily follows that

(4.2.45)  0 ≤ Ī_{P1}(f) − I_{P1}(f) < 2εM + εℓ(I).

Since ε can be taken arbitrarily small, this establishes f ∈ R(I).

Proposition 4.2.12. Let f : I → R be a bounded function, with I = [a, b]. Suppose that the set S of points of discontinuity of f has the following property: for each ε > 0, there is a countable collection of open intervals {Jk} such that

(4.2.46)  S ⊂ ∪_{k≥1} Jk, and Σ_{k≥1} ℓ(Jk) < ε.

Then f ∈ R(I).

Proof. Say |f(x)| ≤ M. Take ε > 0. This time, take a countable collection of open intervals {Jk} such that S ⊂ ∪_{k≥1} Jk and Σ_{k≥1} ℓ(Jk) < ε. Now f is continuous at each p ∈ I \ S, so there exists an interval Kp, open (in I), containing p, such that sup_{Kp} f − inf_{Kp} f < ε. Now {Jk : k ∈ N} ∪ {Kp : p ∈ I \ S} is an open cover of I, so it has a finite subcover, which we denote {J1, . . . , JN, K1, . . . , KM}. We have

(4.2.47)  Σ_{k=1}^N ℓ(Jk) < ε, and sup_{Kj} f − inf_{Kj} f < ε, ∀ j ∈ {1, . . . , M}.

Let P be the partition of I obtained by taking the union of all the endpoints of Jk and Kj in (4.2.47). Let us write

P = {Lk : 0 ≤ k ≤ µ} = (∪_{k∈A} Lk) ∪ (∪_{k∈B} Lk),

where we say k ∈ A provided Lk is contained in an interval of the form Kj for some j ∈ {1, . . . , M}, as in (4.2.47). Consequently, if k ∈ B, then Lk ⊂ Jℓ for some ℓ ∈ {1, . . . , N}, so

(4.2.48)  ∪_{k∈B} Lk ⊂ ∪_{ℓ=1}^N Jℓ.

We therefore have

(4.2.49)  Σ_{k∈B} ℓ(Lk) < ε, and sup_{Lj} f − inf_{Lj} f < ε, ∀ j ∈ A.

It follows that

(4.2.50)  0 ≤ Ī_P(f) − I_P(f) < Σ_{k∈B} 2M ℓ(Lk) + Σ_{j∈A} ε ℓ(Lj) < 2εM + εℓ(I).

Since ε can be taken arbitrarily small, this establishes that f ∈ R(I).
Remark. Proposition 4.2.12 is part of the sharp result that a bounded function f on I = [a, b] is Riemann integrable if and only if its set S of points of discontinuity satisfies (4.2.46). Standard books on measure theory, including [6] and [12], establish this.

We give an example of a function to which Proposition 4.2.11 applies, and then an example for which Proposition 4.2.11 fails to apply, but Proposition 4.2.12 applies.

Example 1. Let I = [0, 1]. Define f : I → R by f(0) = 0,

(4.2.51)  f(x) = (−1)^j for x ∈ (2^{−(j+1)}, 2^{−j}], j ≥ 0.

Then |f| ≤ 1 and the set of points of discontinuity of f is

(4.2.52)  S = {0} ∪ {2^{−j} : j ≥ 1}.

It is easy to see that cont⁺ S = 0. Hence f ∈ R(I). See Exercises 16–17 below for a more elaborate example to which Proposition 4.2.11 applies.

Example 2. Again I = [0, 1]. Define f : I → R by

(4.2.53)  f(x) = 0 if x ∉ Q,  f(x) = 1/n if x = m/n, in lowest terms.
Then |f| ≤ 1 and the set of points of discontinuity of f is

(4.2.54)  S = I ∩ Q.

As we have seen below (4.2.23), cont⁺ S = 1, so Proposition 4.2.11 does not apply. Nevertheless, it is fairly easy to see directly that

(4.2.55)  Ī(f) = I(f) = 0,

so f ∈ R(I). In fact, given ε > 0, f ≥ ε only on a finite set, hence

(4.2.56)  Ī(f) ≤ ε, ∀ ε > 0.

As indicated below (4.2.23), (4.2.46) does apply to this function, so Proposition 4.2.12 applies.

Example 2 is illustrative of the following general phenomenon, which is worth recording.

Corollary 4.2.13. If f : I → R is bounded and its set S of points of discontinuity is countable, then f ∈ R(I).

Proof. By virtue of (4.2.24), Proposition 4.2.12 applies.
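The finiteness observation behind (4.2.56) can be checked directly: the set where the function of (4.2.53) is ≥ ε consists exactly of the fractions in [0, 1] with reduced denominator ≤ 1/ε. A Python sketch (the threshold ε = 1/10 is an illustrative choice):

```python
# For the function f of (4.2.53) (f(m/n) = 1/n in lowest terms, f = 0 off Q),
# the set {x in [0,1] : f(x) >= eps} consists of fractions with reduced
# denominator <= 1/eps, hence is finite.  Count them for eps = 1/10.
from fractions import Fraction

eps = Fraction(1, 10)
qmax = int(1 / eps)  # only denominators n <= 10 can have f(m/n) = 1/n >= eps

points = {Fraction(m, n) for n in range(1, qmax + 1) for m in range(0, n + 1)}
# Fraction reduces m/n to lowest terms automatically, so the set has no
# duplicates, and every element x = m/n in it satisfies f(x) = 1/n >= 1/10.
print(len(points))  # the (finite) number of points where f >= 1/10
```

This is the Farey-type count 1 + Σ_{n≤10} φ(n) = 33 points, so the upper sums of f are indeed ≤ ε up to a finite correction, giving (4.2.56).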
Here is another useful sufficient condition for Riemann integrability.

Proposition 4.2.14. If f : I → R is bounded and monotone, then f ∈ R(I).

Proof. It suffices to consider the case that f is monotone increasing. Let PN = {Jk : 1 ≤ k ≤ N} be the partition of I into N intervals of equal length. Note that sup_{Jk} f ≤ inf_{Jk+1} f. Hence

(4.2.57)  Ī_{PN}(f) ≤ Σ_{k=1}^{N−1} (inf_{Jk+1} f) ℓ(Jk) + (sup_{JN} f) ℓ(JN) ≤ I_{PN}(f) + 2M ℓ(I)/N,

if |f| ≤ M. Taking N → ∞, we deduce from Theorem 4.2.4 that Ī(f) ≤ I(f), which proves f ∈ R(I).

Remark. It can be shown that if f is monotone, then its set of points of discontinuity is countable. Given this, Proposition 4.2.14 is also a consequence of Corollary 4.2.13. By contrast, the function ϑ in (4.2.16) is discontinuous at each point of I.
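For a monotone increasing f on a uniform partition, the gap between upper and lower Darboux sums actually telescopes to (f(b) − f(a)) ℓ(I)/N, consistent with the 2M ℓ(I)/N bound of (4.2.57). A quick Python check (the step function below is an illustrative choice, not from the text):

```python
# Upper minus lower Darboux sum for a monotone increasing f on a uniform
# partition: on each closed cell the sup is the right-endpoint value and the
# inf the left-endpoint value, so the gap telescopes to
# (f(b) - f(a)) * (b - a) / N.
import math

def darboux_gap(f, a, b, N):
    xs = [a + (b - a) * k / N for k in range(N + 1)]
    upper = sum(f(xs[k + 1]) * (xs[k + 1] - xs[k]) for k in range(N))
    lower = sum(f(xs[k]) * (xs[k + 1] - xs[k]) for k in range(N))
    return upper - lower

# A monotone increasing function with jumps (illustrative choice).
f = lambda x: x + math.floor(4 * x) / 4

for N in (8, 80, 800):
    print(N, darboux_gap(f, 0.0, 1.0, N), (f(1.0) - f(0.0)) / N)
```

Note that the jumps of f cost nothing extra: monotonicity alone forces the gap to shrink like 1/N.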
We mention some alternative characterizations of Ī(f) and I(f), which can be useful. Given I = [a, b], we say g : I → R is piecewise constant on I (and write g ∈ PK(I)) provided there exists a partition P = {Jk} of I such that g is constant on the interior of each interval Jk. Clearly PK(I) ⊂ R(I). It is easy to see that, if f : I → R is bounded,

(4.2.58)  Ī(f) = inf {∫_I f1 dx : f1 ∈ PK(I), f1 ≥ f},
          I(f) = sup {∫_I f0 dx : f0 ∈ PK(I), f0 ≤ f}.

Hence, given f : I → R bounded,

(4.2.59)  f ∈ R(I) ⇔ for each ε > 0, ∃ f0, f1 ∈ PK(I) such that f0 ≤ f ≤ f1 and ∫_I (f1 − f0) dx < ε.

This can be used to prove

(4.2.60)  f, g ∈ R(I) =⇒ fg ∈ R(I),

via the fact that

(4.2.61)  fj, gj ∈ PK(I) =⇒ fj gj ∈ PK(I).
In fact, we have the following, which can be used to prove (4.2.60), based on the identity 2fg = (f + g)² − f² − g².

Proposition 4.2.15. Let f ∈ R(I), and assume |f| ≤ M. Let φ : [−M, M] → R be continuous. Then φ ◦ f ∈ R(I).

Proof. We proceed in steps.

Step 1. We can obtain φ as a uniform limit on [−M, M] of a sequence φν of continuous, piecewise linear functions. Then φν ◦ f → φ ◦ f uniformly on I. A uniform limit g of functions gν ∈ R(I) is in R(I) (see Exercise 9). So it suffices to prove Proposition 4.2.15 when φ is continuous and piecewise linear.

Step 2. Given φ : [−M, M] → R continuous and piecewise linear, it is an exercise to write φ = φ1 − φ2, with φj : [−M, M] → R monotone, continuous, and piecewise linear. Now φ1 ◦ f, φ2 ◦ f ∈ R(I) ⇒ φ ◦ f ∈ R(I).
Step 3. We now demonstrate Proposition 4.2.15 when φ : [−M, M] → R is monotone and Lipschitz. By Step 2, this will suffice. So we assume

−M ≤ x1 < x2 ≤ M =⇒ φ(x1) ≤ φ(x2) and φ(x2) − φ(x1) ≤ L(x2 − x1),

for some L < ∞. Given ε > 0, pick f0, f1 ∈ PK(I) as in (4.2.59). Then φ ◦ f0, φ ◦ f1 ∈ PK(I), and

φ ◦ f0 ≤ φ ◦ f ≤ φ ◦ f1,  ∫_I (φ ◦ f1 − φ ◦ f0) dx ≤ L ∫_I (f1 − f0) dx ≤ Lε.

This proves φ ◦ f ∈ R(I).
For another characterization of R(I), we can deduce from (4.2.58) that, if f : I → R is bounded,

(4.2.62)  Ī(f) = inf {∫_I φ1 dx : φ1 ∈ C(I), φ1 ≥ f},
          I(f) = sup {∫_I φ0 dx : φ0 ∈ C(I), φ0 ≤ f},

and this leads to the following variant of (4.2.59).

Proposition 4.2.16. Given f : I → R bounded, f ∈ R(I) if and only if for each ε > 0, there exist φ0, φ1 ∈ C(I) such that

(4.2.63)  φ0 ≤ f ≤ φ1, and ∫_I (φ1 − φ0) dx < ε.
Exercises

1. Let c > 0 and let f : [ac, bc] → R be Riemann integrable. Working directly with the definition of integral, show that

(4.2.64)  ∫_a^b f(cx) dx = (1/c) ∫_{ac}^{bc} f(x) dx.

More generally, show that

(4.2.65)  ∫_{a−d/c}^{b−d/c} f(cx + d) dx = (1/c) ∫_{ac}^{bc} f(x) dx.

2. Let f : I × S → R be continuous, where I = [a, b] and S ⊂ Rⁿ. Take φ(y) = ∫_I f(x, y) dx. Show that φ is continuous on S.
Hint. If fj : I → R are continuous and |f1(x) − f2(x)| ≤ δ on I, then

(4.2.66)  |∫_I f1 dx − ∫_I f2 dx| ≤ ℓ(I)δ.
3. With f as in Exercise 2, suppose gj : S → R are continuous and a ≤ g0(y) < g1(y) ≤ b. Take φ(y) = ∫_{g0(y)}^{g1(y)} f(x, y) dx. Show that φ is continuous on S.
Hint. Make a change of variables, linear in x, to reduce this to Exercise 2.

4. Let φ : [a, b] → [A, B] be C¹ on a neighborhood J of [a, b], with φ′(x) > 0 for all x ∈ [a, b]. Assume φ(a) = A, φ(b) = B. Show that the identity

(4.2.67)  ∫_A^B f(y) dy = ∫_a^b f(φ(t)) φ′(t) dt,

for any f ∈ C([A, B]), follows from the chain rule and the Fundamental Theorem of Calculus. The identity (4.2.67) is called the change of variable formula for the integral.
Hint. Replace b by x, B by φ(x), and differentiate. Going further, using (4.2.62)–(4.2.63), show that f ∈ R([A, B]) ⇒ f ◦ φ ∈ R([a, b]) and (4.2.67) holds. (This result contains that of Exercise 1.)

5. Show that, if f and g are C¹ on a neighborhood of [a, b], then

(4.2.68)  ∫_a^b f(s)g′(s) ds = −∫_a^b f′(s)g(s) ds + [f(b)g(b) − f(a)g(a)].

This transformation of integrals is called "integration by parts."
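The integration by parts identity (4.2.68) is easy to sanity-check numerically. A Python sketch with the illustrative choices f(s) = s², g(s) = s³ on [0, 1] (both sides then equal 3/5):

```python
# Check the integration by parts identity (4.2.68) numerically:
#   int_a^b f g' ds = -int_a^b f' g ds + [f(b)g(b) - f(a)g(a)],
# using midpoint Riemann sums.  f(s) = s**2, g(s) = s**3 are illustrative.

def midpoint(h_func, a, b, n=100000):
    h = (b - a) / n
    return h * sum(h_func(a + (k + 0.5) * h) for k in range(n))

f  = lambda s: s**2
fp = lambda s: 2 * s       # f'
g  = lambda s: s**3
gp = lambda s: 3 * s**2    # g'

a, b = 0.0, 1.0
lhs = midpoint(lambda s: f(s) * gp(s), a, b)
rhs = -midpoint(lambda s: fp(s) * g(s), a, b) + (f(b) * g(b) - f(a) * g(a))
print(lhs, rhs)  # both close to 3/5
```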
6. Let f : (−a, a) → R be a C^{j+1} function. Show that, for x ∈ (−a, a),

(4.2.69)  f(x) = f(0) + f′(0)x + (f″(0)/2)x² + · · · + (f^{(j)}(0)/j!)x^j + Rj(x),

where

(4.2.70)  Rj(x) = ∫_0^x ((x − s)^j / j!) f^{(j+1)}(s) ds.

This is Taylor's formula with remainder.
Hint. Use induction. If (4.2.69)–(4.2.70) holds for 0 ≤ j ≤ k, show that it holds for j = k + 1, by showing that

(4.2.71)  ∫_0^x ((x − s)^k / k!) f^{(k+1)}(s) ds = (f^{(k+1)}(0)/(k+1)!) x^{k+1} + ∫_0^x ((x − s)^{k+1}/(k+1)!) f^{(k+2)}(s) ds.

To establish this, use the integration by parts formula (4.2.68), with f(s) replaced by f^{(k+1)}(s), and with appropriate g(s). See §4.3 for another approach. Note that another presentation of (4.2.70) is

(4.2.72)  Rj(x) = (x^{j+1}/(j+1)!) ∫_0^1 f^{(j+1)}((1 − t^{1/(j+1)})x) dt.

For another demonstration of (4.2.70), see the proof of Proposition 4.3.4.

7. Assume f : (−a, a) → R is a C^j function. Show that, for x ∈ (−a, a), (4.2.69) holds, with

(4.2.73)  Rj(x) = (1/(j − 1)!) ∫_0^x (x − s)^{j−1} [f^{(j)}(s) − f^{(j)}(0)] ds.

Hint. Apply (4.2.70) with j replaced by j − 1. Add and subtract f^{(j)}(0) to the factor f^{(j)}(s) in the resulting integrand.

8. Given I = [a, b], show that

(4.2.74)  f, g ∈ R(I) =⇒ fg ∈ R(I),

as advertised in (4.2.60).

9. Assume fk ∈ R(I) and fk → f uniformly on I. Prove that f ∈ R(I) and

(4.2.75)  ∫_I fk dx −→ ∫_I f dx.
10. Given I = [a, b], Iε = [a + ε, b − ε], assume fk ∈ R(I), |fk| ≤ M on I for all k, and

(4.2.76)  fk −→ f uniformly on Iε,

for all ε ∈ (0, (b − a)/2). Prove that f ∈ R(I) and (4.2.75) holds.

11. Use the fundamental theorem of calculus and results of §4.1 to compute

(4.2.77)  ∫_a^b x^r dx,  r ∈ Q \ {−1},

where −∞ < a < b < ∞ if r ≥ 0 and 0 < a < b < ∞ if r < 0. See §4.5 for (4.2.77) with r = −1.

12. Use the change of variable result of Exercise 4 to compute

(4.2.78)  ∫_0^1 x √(1 + x²) dx.
13. We say f ∈ R(R) provided f|_{[k,k+1]} ∈ R([k, k + 1]) for each k ∈ Z, and

(4.2.79)  Σ_{k=−∞}^{∞} ∫_k^{k+1} |f(x)| dx < ∞.

If f ∈ R(R), we set

(4.2.80)  ∫_{−∞}^{∞} f(x) dx = lim_{k→∞} ∫_{−k}^{k} f(x) dx.

Formulate and demonstrate basic properties of the integral over R of elements of R(R).

14. This exercise discusses the integral test for absolute convergence of an infinite series, which goes as follows. Let f be a positive, monotonically decreasing, continuous function on [0, ∞), and suppose |ak| = f(k). Then

Σ_{k=0}^{∞} |ak| < ∞ ⇐⇒ ∫_0^∞ f(x) dx < ∞.

Prove this.
Hint. Use

Σ_{k=1}^{N} |ak| ≤ ∫_0^N f(x) dx ≤ Σ_{k=0}^{N−1} |ak|.

15. Use the integral test to show that, if p > 0,

Σ_{k=1}^{∞} 1/k^p < ∞ ⇐⇒ p > 1.
Note. Compare Exercise 7 in §1.6 of Chapter 1. (For now, p ∈ Q⁺. Results of §4.5 allow one to take p ∈ R⁺.)
Hint. Use Exercise 11 to evaluate I_N(p) = ∫_1^N x^{−p} dx, for p ≠ 1, and let N → ∞. See if you can show ∫_1^∞ x^{−1} dx = ∞ without knowing about log N.
Subhint. Show that ∫_1^2 x^{−1} dx = ∫_N^{2N} x^{−1} dx.

In Exercises 16–17, C ⊂ [a, b] is the Cantor set introduced in the exercises for §1.9 of Chapter 1. As in (1.9.23) of Chapter 1, C = ∩_{j≥0} Cj.

16. Show that cont⁺ Cj = (2/3)^j (b − a), and conclude that cont⁺ C = 0.

17. Define f : [a, b] → R as follows. We call an interval of length 3^{−j}(b − a), omitted in passing from C_{j−1} to Cj, a "j-interval." Set

f(x) = 0, if x ∈ C,
f(x) = (−1)^j, if x belongs to a j-interval.

Show that the set of discontinuities of f is C. Hence Proposition 4.2.11 implies f ∈ R([a, b]).

18. Let fk ∈ R([a, b]) and f : [a, b] → R satisfy the following conditions.

(a)  |fk| ≤ M < ∞, ∀ k,
(b)  fk(x) −→ f(x), ∀ x ∈ [a, b],
(c)  given ε > 0, there exists Sε ⊂ [a, b] such that cont⁺ Sε < ε, and fk → f uniformly on [a, b] \ Sε.

Show that f ∈ R([a, b]) and

∫_a^b fk(x) dx −→ ∫_a^b f(x) dx, as k → ∞.
Remark. In the Lebesgue theory of integration, there is a stronger result, known as the Lebesgue dominated convergence theorem. See Exercises 12–14 in §4.6 for more on this.

19. Recall that one ingredient in the proof of Theorem 4.2.7 was that if f : (a, b) → R, then

(4.2.81)  f′(x) = 0 for all x ∈ (a, b) =⇒ f is constant on (a, b).

Consider the following approach to proving (4.2.81), which avoids use of the Mean Value Theorem.
(a) Assume a < x0 < y0 < b and f(x0) ≠ f(y0). Say f(y0) = f(x0) + A(y0 − x0), and we may as well assume A > 0.

(b) Divide I0 = [x0, y0] into two equal intervals, I0ℓ and I0r, meeting at the midpoint ξ0 = (x0 + y0)/2. Show that either f(ξ0) ≥ f(x0) + A(ξ0 − x0) or f(y0) ≥ f(ξ0) + A(y0 − ξ0). Set I1 = I0ℓ if the former holds; otherwise, set I1 = I0r. Say I1 = [x1, y1].

(c) Inductively, having Ik = [xk, yk], of length 2^{−k}(y0 − x0), divide it into two equal intervals, Ikℓ and Ikr, meeting at the midpoint ξk = (xk + yk)/2. Show that either f(ξk) ≥ f(xk) + A(ξk − xk) or f(yk) ≥ f(ξk) + A(yk − ξk). Set I_{k+1} = Ikℓ if the former holds; otherwise set I_{k+1} = Ikr.

(d) Show that xk ↗ x, yk ↘ x, x ∈ [x0, y0], and that, if f is differentiable at x, then f′(x) ≥ A. Note that this contradicts the hypothesis that f′(x) = 0 for all x ∈ (a, b).
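The bisection in parts (b)–(c) of Exercise 19 is easy to carry out by machine. A sketch, with the illustrative choice f(x) = x² on [0, 1] and A = 1, so the scheme must converge to a point where f′(x) = 2x ≥ 1:

```python
# Bisection scheme from Exercise 19: maintain an interval [x, y] on which
# f(y) - f(x) >= A * (y - x); halve it, keeping a half where the inequality
# persists.  f(x) = x**2 and A = 1 are illustrative choices.

def bisect_for_steep_point(f, x, y, A, steps=50):
    assert f(y) - f(x) >= A * (y - x)
    for _ in range(steps):
        m = 0.5 * (x + y)
        if f(m) - f(x) >= A * (m - x):   # left half satisfies the inequality
            y = m
        else:                            # then the right half must
            x = m
    return 0.5 * (x + y)

p = bisect_for_steep_point(lambda x: x * x, 0.0, 1.0, 1.0)
print(p)  # limit point; there f'(p) = 2p >= A, i.e. p >= 1/2
```

For this f the scheme homes in on p = 1/2, the boundary case 2p = A.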
4.3. Power series

In §3.3 of Chapter 3 we introduced power series, of the form

(4.3.1)  f(z) = Σ_{k=0}^{∞} ak (z − z0)^k,

with ak ∈ C, and established the following.

Proposition 4.3.1. If the series (4.3.1) converges for some z1 ≠ z0, then either this series is absolutely convergent for all z ∈ C or there is some R ∈ (0, ∞) such that the series is absolutely convergent for |z − z0| < R and divergent for |z − z0| > R. The series converges uniformly on

(4.3.2)  DS(z0) = {z ∈ C : |z − z0| < S},

for each S < R, and f is continuous on DR(z0).

Recall that R is called the radius of convergence of the power series (4.3.1). We now restrict attention to cases where z0 ∈ R and z = t ∈ R, and apply calculus to the study of such power series. We emphasize that we still allow the coefficients ak to be complex numbers.

Proposition 4.3.2. Assume ak ∈ C and

(4.3.3)  f(t) = Σ_{k=0}^{∞} ak t^k

converges for real t satisfying |t| < R. Then f is differentiable on the interval −R < t < R, and its derivative is given by

(4.3.4)  f′(t) = Σ_{k=1}^{∞} k ak t^{k−1},

the latter series being absolutely convergent for |t| < R.

We first check absolute convergence of the series (4.3.4). Let S < T < R. Convergence of (4.3.3) implies there exists C < ∞ such that

(4.3.5)  |ak| T^k ≤ C, ∀ k.

Hence, if |t| ≤ S,

(4.3.6)  |k ak t^{k−1}| ≤ (C/S) k (S/T)^k,

which readily yields absolute convergence. (See Exercise 1 below.) Hence

(4.3.7)  g(t) = Σ_{k=1}^{∞} k ak t^{k−1}
is continuous on (−R, R). To show that f′(t) = g(t), by the fundamental theorem of calculus, it is equivalent to show

(4.3.8)  ∫_0^t g(s) ds = f(t) − f(0).

The following result implies this.

Proposition 4.3.3. Assume bk ∈ C and

(4.3.9)  g(t) = Σ_{k=0}^{∞} bk t^k

converges for real t satisfying |t| < R. Then, for |t| < R,

(4.3.10)  ∫_0^t g(s) ds = Σ_{k=0}^{∞} (bk/(k+1)) t^{k+1},

the series being absolutely convergent for |t| < R.

Proof. Since, for |t| < R,

(4.3.11)  |(bk/(k+1)) t^{k+1}| ≤ R |bk t^k|,

convergence of the series in (4.3.10) is clear. Next, write g(t) = SN(t) + RN(t),

(4.3.12)  SN(t) = Σ_{k=0}^{N} bk t^k,  RN(t) = Σ_{k=N+1}^{∞} bk t^k.

As in the proof of Proposition 3.3.2 in Chapter 3, pick S < T < R. There exists C < ∞ such that |bk T^k| ≤ C for all k. Hence

(4.3.13)  |t| ≤ S ⇒ |RN(t)| ≤ C Σ_{k=N+1}^{∞} (S/T)^k = C εN → 0, as N → ∞,

so

(4.3.14)  ∫_0^t g(s) ds = Σ_{k=0}^{N} (bk/(k+1)) t^{k+1} + ∫_0^t RN(s) ds,

and, for |t| ≤ S,

(4.3.15)  |∫_0^t RN(s) ds| ≤ ∫_0^t |RN(s)| ds ≤ C R εN.

This gives (4.3.10).
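Propositions 4.3.2–4.3.3 can be watched in action on the geometric series, where ak = 1, f(t) = 1/(1 − t), and term-by-term differentiation should produce 1/(1 − t)². A short Python check:

```python
# Term-by-term differentiation (Proposition 4.3.2) on the geometric series:
# the partial sums of sum_{k>=1} k t^(k-1) should approach 1/(1-t)^2
# for |t| < 1.

def partial_derivative_series(t, N):
    return sum(k * t**(k - 1) for k in range(1, N + 1))

t = 0.5
target = 1.0 / (1.0 - t) ** 2  # = 4.0 at t = 0.5
for N in (10, 20, 60):
    print(N, partial_derivative_series(t, N), target)
```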
Second proof of Proposition 4.3.2. As shown in Proposition 3.3.7 of Chapter 3, if |t1| < R, then f(t) has a convergent power series about t1:

(4.3.16)  f(t) = Σ_{k=0}^{∞} bk (t − t1)^k = b0 + b1(t − t1) + (t − t1)² Σ_{k=0}^{∞} b_{k+2} (t − t1)^k,

for |t − t1| < R − |t1|, with

(4.3.17)  b1 = Σ_{n=1}^{∞} n an t1^{n−1}.

Now Proposition 4.3.1 applies to g(t) = Σ_{k=0}^{∞} b_{k+2} (t − t1)^k. Hence

lim_{t→t1} (f(t) − f(t1))/(t − t1) = b1 + lim_{t→t1} (t − t1) g(t) = b1,

as desired.
Remark. The definition of (4.3.10) for t < 0 follows standard convention. More generally, if a < b and g ∈ R([a, b]), then

∫_b^a g(s) ds = −∫_a^b g(s) ds.

More generally, if we have a power series about t0,

(4.3.18)  f(t) = Σ_{k=0}^{∞} ak (t − t0)^k, for |t − t0| < R,

then f is differentiable for |t − t0| < R and

(4.3.19)  f′(t) = Σ_{k=1}^{∞} k ak (t − t0)^{k−1}.

We can then differentiate this power series, and inductively obtain

(4.3.20)  f^{(n)}(t) = Σ_{k=n}^{∞} k(k−1)···(k−n+1) ak (t − t0)^{k−n}.

In particular,

(4.3.21)  f^{(n)}(t0) = n! an.

We can turn (4.3.21) around and write

(4.3.22)  an = f^{(n)}(t0)/n!.
This suggests the following method of taking a given function and deriving a power series representation. Namely, if we can, we compute f^{(k)}(t0) and propose that

(4.3.23)  f(t) = Σ_{k=0}^{∞} (f^{(k)}(t0)/k!) (t − t0)^k,

at least on some interval about t0. To take an example, consider

(4.3.24)  f(t) = (1 − t)^{−r},

with r ∈ Q (but −r ∉ N), and take t0 = 0. (Results of §4.5 will allow us to extend this analysis to r ∈ R.) Using (4.1.36), we get

(4.3.25)  f′(t) = r(1 − t)^{−(r+1)},

for t < 1. Inductively, for k ∈ N,

(4.3.26)  f^{(k)}(t) = [∏_{ℓ=0}^{k−1} (r + ℓ)] (1 − t)^{−(r+k)}.

Hence, for k ≥ 1,

(4.3.27)  f^{(k)}(0) = ∏_{ℓ=0}^{k−1} (r + ℓ) = r(r+1)···(r+k−1).

Consequently, we propose that

(4.3.28)  (1 − t)^{−r} = Σ_{k=0}^{∞} (ak/k!) t^k,  |t| < 1,

with

(4.3.29)  a0 = 1,  ak = ∏_{ℓ=0}^{k−1} (r + ℓ), for k ≥ 1.

We can verify convergence of the right side of (4.3.28) by using the ratio test:

(4.3.30)  |a_{k+1} t^{k+1}/(k+1)!| / |ak t^k/k!| = ((k + r)/(k + 1)) |t|.

This computation implies that the power series on the right side of (4.3.28) is absolutely convergent for |t| < 1, yielding a function

(4.3.31)  g(t) = Σ_{k=0}^{∞} (ak/k!) t^k,  |t| < 1.

It remains to establish that g(t) = (1 − t)^{−r}.
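Before analyzing the remainder, one can at least observe numerically that the proposed series (4.3.28)–(4.3.29) matches (1 − t)^{−r}. A Python sketch with the illustrative values r = 1/2, t = 0.3; the loop builds a_k/k! incrementally via the ratio in (4.3.30):

```python
# Partial sums of the proposed binomial series (4.3.28)-(4.3.29):
#   (1 - t)^(-r) =? sum_k (a_k / k!) t^k,  a_0 = 1, a_k = r(r+1)...(r+k-1).
# r = 0.5, t = 0.3 are illustrative choices.

def binomial_series_partial(r, t, N):
    total, term = 1.0, 1.0           # term = (a_k / k!) t^k, starting at k = 0
    for k in range(1, N + 1):
        term *= (r + k - 1) / k * t  # a_k/k! = (a_{k-1}/(k-1)!) * (r+k-1)/k
        total += term
    return total

r, t = 0.5, 0.3
target = (1.0 - t) ** (-r)
for N in (2, 5, 20):
    print(N, binomial_series_partial(r, t, N), target)
```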
We take up this task, on a more general level. Establishing that the series

(4.3.32)  Σ_{k=0}^{∞} (f^{(k)}(t0)/k!) (t − t0)^k

converges to f(t) is equivalent to examining the remainder Rn(t, t0) in the finite expansion

(4.3.33)  f(t) = Σ_{k=0}^{n} (f^{(k)}(t0)/k!) (t − t0)^k + Rn(t, t0).

The series (4.3.32) converges to f(t) if and only if Rn(t, t0) → 0 as n → ∞. To see when this happens, we need a compact formula for the remainder Rn, which we proceed to derive. It seems to clarify matters if we switch notation a bit, and write

(4.3.34)  f(x) = f(y) + f′(y)(x − y) + · · · + (f^{(n)}(y)/n!)(x − y)^n + Rn(x, y).

We now take the y-derivative of each side of (4.3.34). The y-derivative of the left side is 0, and when we apply ∂/∂y to the right side, we observe an enormous amount of cancellation. There results the identity

(4.3.35)  (∂Rn/∂y)(x, y) = −(1/n!) f^{(n+1)}(y)(x − y)^n.

Also,

(4.3.36)  Rn(x, x) = 0.

If we concentrate on Rn(x, y) as a function of y and look at the difference quotient [Rn(x, y) − Rn(x, x)]/(y − x), an immediate consequence of the mean value theorem is that, if f is real valued,

(4.3.37)  Rn(x, y) = (1/n!)(x − y)(x − ξn)^n f^{(n+1)}(ξn),

for some ξn between x and y. This is known as Cauchy's formula for the remainder. If f^{(n+1)} is continuous, we can apply the fundamental theorem of calculus to (4.3.35)–(4.3.36), and obtain the following integral formula for the remainder in the power series.

Proposition 4.3.4. If I ⊂ R is an interval, x, y ∈ I, and f ∈ C^{n+1}(I), then the remainder Rn(x, y) in (4.3.34) is given by

(4.3.38)  Rn(x, y) = (1/n!) ∫_y^x (x − s)^n f^{(n+1)}(s) ds.
This works regardless of whether f is real valued. Another derivation of (4.3.38) arose in the exercise set for §4.2.

The change of variable x − s = t(x − y) gives the integral formula

(4.3.39)  Rn(x, y) = (1/n!)(x − y)^{n+1} ∫_0^1 t^n f^{(n+1)}(ty + (1 − t)x) dt.

If we think of this integral as 1/(n + 1) times a weighted mean of f^{(n+1)}, we get the Lagrange formula for the remainder,

(4.3.40)  Rn(x, y) = (1/(n + 1)!)(x − y)^{n+1} f^{(n+1)}(ζn),

for some ζn between x and y, provided f is real valued. The Lagrange formula is shorter and neater than the Cauchy formula, but the Cauchy formula is actually more powerful. The calculations in (4.3.43)–(4.3.54) below will illustrate this.

Note that, if I(x, y) denotes the interval with endpoints x and y (e.g., (x, y) if x < y), then (4.3.38) implies

(4.3.41)  |Rn(x, y)| ≤ (|x − y|/n!) sup_{ξ∈I(x,y)} |(x − ξ)^n f^{(n+1)}(ξ)|,

while (4.3.39) implies

(4.3.42)  |Rn(x, y)| ≤ (|x − y|^{n+1}/(n + 1)!) sup_{ξ∈I(x,y)} |f^{(n+1)}(ξ)|.

In case f is real valued, (4.3.41) also follows from the Cauchy formula (4.3.37), and (4.3.42) follows from the Lagrange formula (4.3.40).

Let us apply these estimates with f as in (4.3.24), i.e.,

(4.3.43)  f(x) = (1 − x)^{−r},

and y = 0. By (4.3.26),

(4.3.44)  f^{(n+1)}(ξ) = a_{n+1} (1 − ξ)^{−(r+n+1)},  a_{n+1} = ∏_{ℓ=0}^{n} (r + ℓ).

Consequently,

(4.3.45)  f^{(n+1)}(ξ)/n! = bn (1 − ξ)^{−(r+n+1)},  bn = a_{n+1}/n!.

Note that

(4.3.46)  b_{n+1}/bn = (n + 1 + r)/(n + 1) → 1, as n → ∞.

Let us first investigate the estimate of Rn(x, 0) given by (4.3.42) (as in the Lagrange formula), and see how it leads to a suboptimal conclusion.
(The impatient reader might skip (4.3.47)–(4.3.50) and go to (4.3.51).) By (4.3.45), if n is sufficiently large that r + n + 1 > 0,

(4.3.47)  sup_{ξ∈I(x,0)} |f^{(n+1)}(ξ)|/(n + 1)! = |bn|/(n + 1)  if −1 ≤ x ≤ 0,
          sup_{ξ∈I(x,0)} |f^{(n+1)}(ξ)|/(n + 1)! = (|bn|/(n + 1)) (1 − x)^{−(r+n+1)}  if 0 ≤ x < 1.

Thus (4.3.42) implies

(4.3.48)  |Rn(x, 0)| ≤ (|bn|/(n + 1)) |x|^{n+1}  if −1 ≤ x ≤ 0,
          |Rn(x, 0)| ≤ (|bn|/(n + 1)) (1 − x)^{−r} (x/(1 − x))^{n+1}  if 0 ≤ x < 1.

Note that, by (4.3.46),

cn = |bn|/(n + 1) =⇒ c_{n+1}/cn = (|b_{n+1}|/|bn|)((n + 1)/(n + 2)) → 1 as n → ∞,

so we conclude from the first part of (4.3.48) that

(4.3.49)  Rn(x, 0) −→ 0 as n → ∞, if −1 < x ≤ 0.

On the other hand, x/(1 − x) is < 1 for 0 ≤ x < 1/2, but not for 1/2 ≤ x < 1. Hence the factor (x/(1 − x))^{n+1} decreases geometrically for 0 ≤ x < 1/2, but not for 1/2 ≤ x < 1. Thus the second part of (4.3.48) yields only

(4.3.50)  Rn(x, 0) −→ 0 as n → ∞, if 0 ≤ x < 1/2.

This is what the remainder estimate (4.3.42) yields. To get the stronger result

(4.3.51)  Rn(x, 0) −→ 0 as n → ∞, for |x| < 1,

we use the remainder estimate (4.3.41) (as in the Cauchy formula). This gives

(4.3.52)  |Rn(x, 0)| ≤ |bn| · |x| sup_{ξ∈I(x,0)} |x − ξ|^n / |1 − ξ|^{n+1+r},

with bn as in (4.3.45). Now

(4.3.53)  0 ≤ ξ ≤ x < 1 =⇒ (x − ξ)/(1 − ξ) ≤ x,
          −1 < x ≤ ξ ≤ 0 =⇒ |x − ξ|/(1 − ξ) ≤ |x − ξ| ≤ |x|.

The first conclusion holds since it is equivalent to x − ξ ≤ x(1 − ξ) = x − xξ, hence to xξ ≤ ξ. The second conclusion in (4.3.53) holds since ξ ≤ 0 ⇒ 1 − ξ ≥ 1. We deduce from (4.3.52)–(4.3.53) that

(4.3.54)  |x| < 1 =⇒ |Rn(x, 0)| ≤ |bn| · |x|^{n+1}.
Using (4.3.46) then gives the desired conclusion (4.3.51). We can now conclude that (4.3.28) holds, with ak given by (4.3.29). For another proof of (4.3.28), see Exercise 14.

There are some important examples of power series representations for which one does not need to use remainder estimates like (4.3.41) or (4.3.42). For example, as seen in Chapter 1, we have

(4.3.55)  Σ_{k=0}^{n} x^k = (1 − x^{n+1})/(1 − x),

if x ≠ 1. The right side tends to 1/(1 − x) as n → ∞, if |x| < 1, so we get

(4.3.56)  1/(1 − x) = Σ_{k=0}^{∞} x^k,  |x| < 1,

without further ado, which is the case r = 1 of (4.3.28)–(4.3.29). We can differentiate (4.3.56) repeatedly to get

(4.3.57)  (1 − x)^{−n} = Σ_{k=0}^{∞} ck(n) x^k,  |x| < 1, n ∈ N,

and verify that (4.3.57) agrees with (4.3.28)–(4.3.29) with r = n. However, when r ∉ Z, such an analysis of Rn(x, 0) as made above seems necessary.

Let us also note that we can apply Proposition 4.3.3 to (4.3.56), obtaining

(4.3.58)  ∫_0^x dy/(1 − y) = Σ_{k=0}^{∞} x^{k+1}/(k + 1),  |x| < 1.

Material covered in §4.5 will produce another formula for the right side of (4.3.58).

Returning to the integral formula for the remainder Rn(x, y) in (4.3.34), we record the following variant of Proposition 4.3.4.

Proposition 4.3.5. If I ⊂ R is an interval, x, y ∈ I, and f ∈ C^n(I), then

(4.3.59)  Rn(x, y) = (1/(n − 1)!) ∫_y^x (x − s)^{n−1} [f^{(n)}(s) − f^{(n)}(y)] ds.

Proof. Do (4.3.34)–(4.3.38) with n replaced by n − 1, and then write

R_{n−1}(x, y) = (f^{(n)}(y)/n!)(x − y)^n + Rn(x, y).
Remark. An advantage of (4.3.59) over (4.3.38) is that for (4.3.59), we need only f ∈ C^n(I), rather than f ∈ C^{n+1}(I).
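The gap between the Lagrange-style estimate (4.3.48) and the Cauchy-style bound (4.3.54) shows up clearly in numbers: for x beyond 1/2 the Lagrange bound blows up, while the true remainder, and the bound |bn| |x|^{n+1}, go to 0. A Python sketch with the illustrative values r = 1/2, x = 0.7:

```python
# Compare the true remainder |R_n(x,0)| for f(x) = (1-x)^(-r) with the
# Lagrange-style bound (second part of (4.3.48)) and the Cauchy-style bound
# |b_n| |x|^(n+1) from (4.3.54).  r = 0.5, x = 0.7 are illustrative choices.

def remainder_and_bounds(r, x, n):
    b = r                      # b_0 = a_1/0! = r; then b_k = b_{k-1}(r+k)/k
    term, partial = 1.0, 1.0   # partial sum of (a_k/k!) x^k through k = n
    for k in range(1, n + 1):
        term *= (r + k - 1) / k * x
        partial += term
        b *= (r + k) / k
    true_rem = abs((1.0 - x) ** (-r) - partial)
    lagrange = b / (n + 1) * (1.0 - x) ** (-r) * (x / (1.0 - x)) ** (n + 1)
    cauchy = b * x ** (n + 1)
    return true_rem, lagrange, cauchy

for n in (5, 20, 40):
    print(n, remainder_and_bounds(0.5, 0.7, n))
```

As n grows, the first and third numbers shrink while the second explodes, which is the point of (4.3.50) versus (4.3.51).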
Exercises

1. Show that (4.3.6) yields the absolute convergence asserted in the proof of Proposition 4.3.2. More generally, show that, for any n ∈ N, r ∈ (0, 1),

Σ_{k=1}^{∞} k^n r^k < ∞.

Hint. Refer to the ratio test, discussed in §3.3 (Exercise 1) of Chapter 3.

2. A special case of (4.3.18)–(4.3.21) is that, given a polynomial p(t) = an t^n + · · · + a1 t + a0, we have p^{(k)}(0) = k! ak. Apply this to Pn(t) = (1 + t)^n. Compute Pn^{(k)}(t) using (4.1.7) repeatedly, then compute Pn^{(k)}(0), and use this to establish the binomial formula:

(1 + t)^n = Σ_{k=0}^{n} (n choose k) t^k,  (n choose k) = n!/(k!(n − k)!).

3. Find the coefficients bk in the power series

1/√(1 − x⁴) = Σ_{k=0}^{∞} bk x^k.

Show that this series converges to the left side for |x| < 1.
Hint. Take r = 1/2 in (4.3.28)–(4.3.29) and set t = x⁴.

4. Expand

∫_0^x dy/√(1 − y⁴)

in a power series in x. Show this holds for |x| < 1.

5. Expand

∫_0^x dy/√(1 + y⁴)

as a power series in x. Show that this holds for |x| < 1.
6. Expand

∫_0^1 dt/√(1 + xt⁴)

as a power series in x. Show that this holds for |x| < 1.

7. Let I ⊂ R be an open interval, x0 ∈ I, and assume f ∈ C²(I) and f′(x0) = 0. Use Proposition 4.3.4 to show that

f″(x0) > 0 ⇒ f has a local minimum at x0,
f″(x0) < 0 ⇒ f has a local maximum at x0.

Compare the proof of Proposition 4.1.4.

8. Note that

√2 = 2√(1 − 1/2).

Expand the right side in a power series, using (4.3.28)–(4.3.29). How many terms suffice to approximate √2 to 12 digits?
9. In the setting of Exercise 8, investigate series that converge faster, such as series obtained from

√2 = (3/2)√(1 − 1/9) = (10/7)√(1 − 1/50).

10. Apply variants of the methods of Exercises 8–9 to approximate √3, √5, √7, and √1001.

11. Given a rational approximation xn to √2, write

√2 = xn√(1 + δn).

Assume |δn| ≤ 1/2. Then set

x_{n+1} = xn(1 + (1/2)δn),  2 = x²_{n+1}(1 + δ_{n+1}).
Estimate δ_{n+1}. Does the sequence (xn) approach √2 faster than a power series? Apply this method to the last approximation in Exercise 9.

12. Assume F ∈ C([a, b]), g ∈ R([a, b]), F real valued, and g ≥ 0 on [a, b]. Show that

∫_a^b g(t)F(t) dt = (∫_a^b g(t) dt) F(ζ),

for some ζ ∈ (a, b). Show how this result justifies passing from (4.3.39) to (4.3.40).
Hint. If A = min F, B = max F, and M = ∫_a^b g(t) dt, show that

AM ≤ ∫_a^b g(t)F(t) dt ≤ BM.
13. Recall that the Cauchy formula (4.3.37) for the remainder Rn(x, y) was obtained by applying the Mean Value Theorem to the difference quotient

[Rn(x, y) − Rn(x, x)]/(y − x).

Now apply the generalized mean value theorem, described in Exercise 8 of §4.1, with f(y) = Rn(x, y), g(y) = (x − y)^{n+1}, to obtain the Lagrange formula (4.3.40).

14. Here is an approach to the proof of (4.3.28) that avoids formulas for the remainder Rn(x, 0). Set

fr(t) = (1 − t)^{−r},  gr(t) = Σ_{k=0}^{∞} (ak/k!) t^k,  for |t| < 1,

with ak given by (4.3.29). Show that, for |t| < 1,

fr′(t) = (r/(1 − t)) fr(t), and (1 − t) gr′(t) = r gr(t).

Then show that

(d/dt)[(1 − t)^r gr(t)] = 0,

and deduce that fr(t) = gr(t).
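The scheme of Exercise 11 converges very fast. A Python sketch, starting from the Exercise 9 approximation x0 = 10/7, with δn recovered at each step from 2 = xn²(1 + δn):

```python
# Exercise 11's iteration for sqrt(2): write 2 = x_n^2 (1 + delta_n),
# i.e. delta_n = 2/x_n^2 - 1, and set x_{n+1} = x_n (1 + delta_n / 2).
# One finds delta_{n+1} is of size delta_n^2, so convergence is quadratic.

x = 10.0 / 7.0               # starting approximation from Exercise 9
for n in range(5):
    delta = 2.0 / (x * x) - 1.0
    print(n, x, delta)
    x = x * (1.0 + 0.5 * delta)

print(abs(x * x - 2.0))      # essentially machine precision after 5 steps
```

Here δ0 = −1/50, δ1 ≈ −10⁻⁴, δ2 ≈ −2.5·10⁻⁹, confirming δ_{n+1} ≈ −δn²/4, far faster than the geometric decay of a power series.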
4.4. Curves and arc length

The term "curve" is commonly used to refer to a couple of different, but closely related, objects. In one meaning, a curve is a continuous function from an interval I ⊂ R to n-dimensional Euclidean space:

(4.4.1)  γ : I −→ Rⁿ,  γ(t) = (γ1(t), . . . , γn(t)).

We say γ is differentiable provided each component γj is, in which case

(4.4.2)  γ′(t) = (γ1′(t), . . . , γn′(t)).

γ′(t) is the velocity of γ, at "time" t, and its speed is the magnitude of γ′(t):

(4.4.3)  |γ′(t)| = √(γ1′(t)² + · · · + γn′(t)²).

We say γ is smooth of class C^k provided each component γj(t) has this property. One also calls the image of I under the map γ a curve in Rⁿ. If u : J → I is continuous, one-to-one, and onto, the map

(4.4.4)  σ : J −→ Rⁿ,  σ(t) = γ(u(t))

has the same image as γ. We say σ is a reparametrization of γ. We usually require that u be C¹, with C¹ inverse. If γ is C^k and u is also C^k, so is σ, and the chain rule gives

(4.4.5)  σ′(t) = u′(t)γ′(u(t)).
Let us assume I = [a, b] is a closed, bounded interval, and γ is C¹. We want to define the length of this curve. To get started, we take a partition P of [a, b], given by

(4.4.6)  a = t0 < t1 < · · · < tN = b,

and set

(4.4.7)  ℓP(γ) = Σ_{j=1}^{N} |γ(tj) − γ(t_{j−1})|.

See Figure 4.4.1. We will massage the right side of (4.4.7) into something that looks like a Riemann sum for ∫_a^b |γ′(t)| dt. We have

(4.4.8)  γ(tj) − γ(t_{j−1}) = ∫_{t_{j−1}}^{tj} γ′(t) dt
                            = ∫_{t_{j−1}}^{tj} [γ′(tj) + γ′(t) − γ′(tj)] dt
                            = (tj − t_{j−1})γ′(tj) + ∫_{t_{j−1}}^{tj} [γ′(t) − γ′(tj)] dt.

We get

(4.4.9)  |γ(tj) − γ(t_{j−1})| = (tj − t_{j−1})|γ′(tj)| + rj,

with

(4.4.10)  |rj| ≤ ∫_{t_{j−1}}^{tj} |γ′(t) − γ′(tj)| dt.
Figure 4.4.1. Approximating ℓ(γ) by ℓP (γ)
Now if γ′ is continuous on [a, b], so is |γ′|, and hence both are uniformly continuous on [a, b]. We have

(4.4.11)  s, t ∈ [a, b], |s − t| ≤ h =⇒ |γ′(t) − γ′(s)| ≤ ω(h),

where ω(h) → 0 as h → 0. Summing (4.4.9) over j, we get

(4.4.12)  ℓP(γ) = Σ_{j=1}^{N} |γ′(tj)|(tj − t_{j−1}) + RP,

with

(4.4.13)  |RP| ≤ (b − a)ω(h), if each tj − t_{j−1} ≤ h.

Since the sum on the right side of (4.4.12) is a Riemann sum, we can apply Theorem 4.2.4 to get the following.

Proposition 4.4.1. Assume γ : [a, b] → Rⁿ is a C¹ curve. Then

(4.4.14)  ℓP(γ) −→ ∫_a^b |γ′(t)| dt  as maxsize P → 0.
We call this limit the length of the curve γ, and write

(4.4.15)  ℓ(γ) = ∫_a^b |γ′(t)| dt.

Note that if u : [α, β] → [a, b] is a C¹ map with C¹ inverse, and σ = γ ◦ u, as in (4.4.4), we have from (4.4.5) that |σ′(t)| = |u′(t)| · |γ′(u(t))|, and the change of variable formula (4.2.67) for the integral gives

(4.4.16)  ∫_α^β |σ′(t)| dt = ∫_a^b |γ′(t)| dt,

hence we have the geometrically natural result

(4.4.17)  ℓ(σ) = ℓ(γ).
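Proposition 4.4.1 can be watched numerically: the polygonal lengths ℓP(γ) increase toward ∫|γ′|. A Python sketch using the illustrative curve γ(t) = (t, t²/2) on [0, 1] (the parabola discussed at the end of this section), with uniform partitions:

```python
# Polygonal approximation ell_P(gamma) vs the arc length integral (4.4.15)
# for gamma(t) = (t, t^2/2) on [0,1], where |gamma'(t)| = sqrt(1 + t^2).
import math

gamma = lambda t: (t, 0.5 * t * t)
speed = lambda t: math.hypot(1.0, t)          # |gamma'(t)|

def polygonal_length(N):
    pts = [gamma(k / N) for k in range(N + 1)]
    return sum(math.dist(pts[k], pts[k + 1]) for k in range(N))

# Reference value of the integral, by a fine midpoint sum.
M = 200000
integral = sum(speed((k + 0.5) / M) for k in range(M)) / M

for N in (4, 16, 64):
    print(N, polygonal_length(N), integral)
```

Chords are never longer than the arcs they subtend, so the polygonal lengths approach the integral from below.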
Given such a C¹ curve γ, it is natural to consider the length function

(4.4.18)  ℓγ(t) = ∫_a^t |γ′(s)| ds,  ℓγ′(t) = |γ′(t)|.

If we assume also that γ′ is nowhere vanishing on [a, b], Theorem 4.1.3, the inverse function theorem, implies that ℓγ : [a, b] → [0, ℓ(γ)] has a C¹ inverse

(4.4.19)  u : [0, ℓ(γ)] −→ [a, b],

and then σ = γ ◦ u : [0, ℓ(γ)] → Rⁿ satisfies

(4.4.20)  σ′(t) = u′(t)γ′(u(t)) = (1/ℓγ′(s)) γ′(u(t)),  for t = ℓγ(s), s = u(t),

since the chain rule applied to u(ℓγ(t)) = t yields u′(ℓγ(t))ℓγ′(t) = 1. Also, by (4.4.18), ℓγ′(s) = |γ′(s)| = |γ′(u(t))|, so

(4.4.21)  |σ′(t)| ≡ 1.
Then σ is a reparametrization of γ, and σ has unit speed. We say σ is a reparametrization by arc length.

We now focus on that most classical example of a curve in the plane R², the unit circle

(4.4.22)  S¹ = {(x, y) ∈ R² : x² + y² = 1}.

We can parametrize S¹ away from (x, y) = (±1, 0) by

(4.4.23)  γ+(t) = (t, √(1 − t²)),  γ−(t) = (t, −√(1 − t²)),

on the intersection of S¹ with {(x, y) : y > 0} and {(x, y) : y < 0}, respectively. Here γ± : (−1, 1) → R², and both maps are smooth. In fact, we can
take γ± : [−1, 1] → R2 , but these functions are not differentiable at ±1. We can also parametrize S 1 away from (x, y) = (0, ±1), by √ √ (4.4.24) γℓ (t) = (− 1 − t2 , t), γr (t) = ( 1 − t2 , t), again with t ∈ (−1, 1). Note that ′ γ+ (t) = (1, −t(1 − t2 )−1/2 ),
(4.4.25) so
t2 1 = . 2 1−t 1 − t2 Hence, if ℓ(t) is the length of the image γ+ ([0, t]), we have ∫ t 1 √ (4.4.27) ℓ(t) = ds, for 0 < t < 1. 1 − s2 0 The same formula holds with γ+ replaced by γ− , γℓ , or γr . ′ |γ+ (t)|2 = 1 +
(4.4.26)
We can evaluate the integral (4.4.27) as a power series in t, as follows. As seen in §3,
(4.4.28) (1 − r)^{−1/2} = Σ_{k=0}^∞ (a_k/k!) r^k, for |r| < 1,
where
(4.4.29) a₀ = 1, a₁ = 1/2, a_k = (1/2)(3/2)···(k − 1/2).
The power series converges uniformly on [−ρ, ρ], for each ρ ∈ (0, 1). It follows that
(4.4.30) (1 − s²)^{−1/2} = Σ_{k=0}^∞ (a_k/k!) s^{2k}, |s| < 1,
uniformly convergent on [−a, a] for each a ∈ (0, 1). Hence we can integrate (4.4.30) term by term to get
(4.4.31) ℓ(t) = Σ_{k=0}^∞ (a_k/k!) t^{2k+1}/(2k + 1), 0 ≤ t < 1.
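As a numerical sanity check (an addition here, not part of the text), the series (4.4.31) can be summed directly; since ℓ(t) is the circular arc over [0, t], it should agree with arcsin t. The sketch below uses the recurrence a_{k+1} = (k + 1/2)a_k noted later in Exercise 7 of §4.5.

```python
import math

def ell(t, n_terms=40):
    """Partial sum of (4.4.31): sum_k (a_k/k!) t^(2k+1)/(2k+1),
    where a_0 = 1 and a_(k+1) = (k + 1/2) a_k."""
    total = 0.0
    coeff = 1.0  # holds a_k / k!
    for k in range(n_terms):
        total += coeff * t ** (2 * k + 1) / (2 * k + 1)
        coeff *= (k + 0.5) / (k + 1)  # a_k/k!  ->  a_(k+1)/(k+1)!
    return total

print(ell(0.5), math.asin(0.5))  # the two values agree closely
```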
One can use (4.4.27)–(4.4.31) to get a rapidly convergent infinite series for the number π, defined as (4.4.32)
π is half the length of S 1 .
See Exercise 7 in §4.5. Since S 1 is a smooth curve, it can be parametrized by arc length. We will let C : R → S 1 be such a parametrization, satisfying (4.4.33)
C(0) = (1, 0),
C ′ (0) = (0, 1),
Figure 4.4.2. The circle C(t) = (cos t, sin t)
so C(t) traverses S 1 counter-clockwise, as t increases. For t moderately bigger than 0, the rays from (0, 0) to (1, 0) and from (0, 0) to C(t) make an angle that, measured in radians, is t. This leads to the standard trigonometrical functions cos t and sin t, defined by (4.4.34)
C(t) = (cos t, sin t),
when C is such a unit-speed parametrization of S 1 . See Figure 4.4.2. We can evaluate the derivative of C(t) by the following device. Applying d/dt to the identity (4.4.35)
C(t) · C(t) = 1
and using the product formula gives (4.4.36)
C ′ (t) · C(t) = 0.
Since both |C(t)| ≡ 1 and |C′(t)| ≡ 1, (4.4.36) allows only two possibilities. Either
(4.4.37) C′(t) = (sin t, − cos t),
or
(4.4.38) C′(t) = (− sin t, cos t).
Since C′(0) = (0, 1), (4.4.37) is not a possibility. This implies
(4.4.39) (d/dt) cos t = − sin t, (d/dt) sin t = cos t.
We will derive further important results on cos t and sin t in §4.5.
One can think of cos t and sin t as special functions arising to analyze the length of arcs in the circle. Related special functions arise to analyze the length of portions of a parabola in R², say the graph of
(4.4.40) y = x²/2.
This curve is parametrized by
(4.4.41) γ(t) = (t, t²/2),
so
(4.4.42) γ′(t) = (1, t).
In such a case, the length of γ([0, t]) is
(4.4.43) ℓ_γ(t) = ∫_0^t √(1 + s²) ds.
Methods to evaluate the integral in (4.4.43) are provided in §4.5. See Exercise 10 of §4.5. The study of lengths of other curves has stimulated much work in analysis. Another example is the ellipse
(4.4.44) x²/a² + y²/b² = 1,
given a, b ∈ (0, ∞). This curve is parametrized by
(4.4.45) γ(t) = (a cos t, b sin t).
In such a case, by (4.4.39), γ′(t) = (−a sin t, b cos t), so
(4.4.46) |γ′(t)|² = a² sin² t + b² cos² t = b² + η sin² t, η = a² − b²,
and hence the length of γ([0, t]) is
(4.4.47) ℓ_γ(t) = b ∫_0^t √(1 + σ sin² s) ds, σ = η/b².
If a ≠ b, this is called an elliptic integral, and it gives rise to a more subtle family of special functions, called elliptic functions. Material on this can be found in Chapter 6 of [14], Introduction to Complex Analysis.
We end this section with a brief discussion of curves in polar coordinates. We define a map
(4.4.48) Π : R² −→ R², Π(r, θ) = (r cos θ, r sin θ).
We say (r, θ) are polar coordinates of (x, y) ∈ R2 if Π(r, θ) = (x, y). Now, Π in (4.4.48) is not bijective, since (4.4.49)
Π(r, θ + 2π) = Π(r, θ),
Π(r, θ + π) = Π(−r, θ),
and Π(0, θ) is independent of θ. So polar coordinates are not unique, but we will not belabor this point. The point we make is that an equation (4.4.50)
r = ρ(θ),
ρ : [a, b] → R,
yields a curve in R2 , namely (with θ = t) (4.4.51)
γ(t) = (ρ(t) cos t, ρ(t) sin t),
a ≤ t ≤ b.
The circle (4.4.34) corresponds to ρ(θ) ≡ 1. Other cases include
(4.4.52) ρ(θ) = a cos θ, −π/2 ≤ θ ≤ π/2,
yielding a circle of radius a/2 centered at (a/2, 0) (see Exercise 6 below), and
ρ(θ) = a cos 3θ,
yielding a figure called a three-leaved rose. See Figure 4.4.3. To compute the arc length of (4.4.51), we write γ(t) = (x(t), y(t)), with x(t) = ρ(t) cos t and y(t) = ρ(t) sin t. By (4.4.39),
(4.4.54) x′(t) = ρ′(t) cos t − ρ(t) sin t, y′(t) = ρ′(t) sin t + ρ(t) cos t,
hence
(4.4.55) x′(t)² + y′(t)² = ρ′(t)² cos² t − 2ρ(t)ρ′(t) cos t sin t + ρ(t)² sin² t + ρ′(t)² sin² t + 2ρ(t)ρ′(t) sin t cos t + ρ(t)² cos² t = ρ′(t)² + ρ(t)².
Therefore
(4.4.56) ℓ(γ) = ∫_a^b |γ′(t)| dt = ∫_a^b √(ρ(t)² + ρ′(t)²) dt.
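As a quick numerical illustration (an addition, not from the text), (4.4.56) can be approximated by a Riemann sum; for the circle ρ(θ) ≡ 1 on [0, 2π] it should return 2π.

```python
import math

def polar_arc_length(rho, drho, a, b, n=100000):
    """Midpoint-rule approximation of (4.4.56):
    l(gamma) = integral_a^b sqrt(rho(t)^2 + rho'(t)^2) dt."""
    h = (b - a) / n
    return sum(
        math.sqrt(rho(a + (i + 0.5) * h) ** 2 + drho(a + (i + 0.5) * h) ** 2) * h
        for i in range(n)
    )

# rho(theta) = 1 parametrizes the unit circle; its length is 2*pi.
L = polar_arc_length(lambda t: 1.0, lambda t: 0.0, 0.0, 2 * math.pi)
print(L)
```

The same routine applies to the three-leaved rose of Exercise 7, with rho(t) = a*cos(3*t) and drho(t) = -3*a*sin(3*t).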
Exercises

1. Let γ(t) = (t², t³). Compute the length of γ([0, t]).
Figure 4.4.3. Three-leafed rose: r = a cos 3θ
2. With a, b > 0, the curve γ(t) = (a cos t, a sin t, bt) is a helix. Compute the length of γ([0, t]).

3. Let
γ(t) = (t, (2√2/3) t^{3/2}, t²/2).
Compute the length of γ([0, t]).

4. In case b > a for the ellipse (4.4.45), the length formula (4.4.47) becomes
ℓ_γ(t) = b ∫_0^t √(1 − β² sin² s) ds, β² = (b² − a²)/b² ∈ (0, 1).
Apply the change of variable x = sin s to this integral (cf. (4.2.46)), and write out the resulting integral.
5. The second half of (4.4.49) is equivalent to the identity
(cos(θ + π), sin(θ + π)) = −(cos θ, sin θ).
Deduce this from the definition (4.4.32) of π, together with the characterization of C(t) in (4.4.34) as the unit speed parametrization of S^1, satisfying (4.4.33). For a more general identity, see (4.5.44).

6. The curve defined by (4.4.52) can be written
γ(t) = (a cos² t, a cos t sin t), −π/2 ≤ t ≤ π/2.
Peek ahead at (4.5.44) and show that
γ(t) = (a/2 + (a/2) cos 2t, (a/2) sin 2t).
Verify that this traces out a circle of radius a/2, centered at (a/2, 0).

7. Use (4.4.56) to write the arc length of the curve given by (4.4.53) as an integral. Show this integral has the same general form as (4.4.46)–(4.4.47).

8. Let γ : [a, b] → R^n be a C^1 curve. Show that ℓ(γ) ≥ |γ(b) − γ(a)|, with strict inequality if there exists t ∈ (a, b) such that γ(t) does not lie on the line segment from γ(a) to γ(b).
Hint. To get started, show that, in (4.4.7), ℓ_P(γ) ≥ |γ(b) − γ(a)|.

9. Consider the curve C(t) = (cos t, sin t), discussed in (4.4.33)–(4.4.38). Note that the length ℓ_C(t) of C([0, t]) is t, for t > 0. Show that
C(π/2) = (0, 1), C(π) = (−1, 0), C(2π) = (1, 0).

10. In the setting of Exercise 9, compute |C(t) − (1, 0)|. Then deduce from Exercise 8 that, for 0 < t ≤ π/2,
1 − cos t
Note from the power series that e^t > 0 for t ≥ 0. Since e^{−t} = 1/e^t, we see that e^t > 0 for all t ∈ R. Since de^t/dt = e^t > 0, the function is monotone
increasing in t, and since d²e^t/dt² = e^t > 0, this function is convex. (See Proposition 4.1.5 and the remark that follows it.) Note that, for t > 0,
(4.5.18) e^t = 1 + t + t²/2 + ··· > 1 + t ↗ +∞,
as t ↗ ∞. Hence
(4.5.19) lim_{t→+∞} e^t = +∞.
Since e^{−t} = 1/e^t,
(4.5.20) lim_{t→−∞} e^t = 0.
As a consequence,
(4.5.21) exp : R −→ (0, ∞)
is one-to-one and onto, with positive derivative, so there is a smooth inverse
(4.5.22) L : (0, ∞) −→ R.
We call this inverse the natural logarithm:
(4.5.23) log x = L(x).
See Figures 4.5.1 and 4.5.2 for graphs of x = e^t and t = log x. Applying d/dt to
(4.5.24) L(e^t) = t
gives
(4.5.25) L′(e^t)e^t = 1, hence L′(e^t) = 1/e^t,
i.e.,
(4.5.26) (d/dx) log x = 1/x.
Since log 1 = 0, we get
(4.5.27) log x = ∫_1^x dy/y.
An immediate consequence of (4.5.17) (for a, b ∈ R) is the identity
(4.5.28) log xy = log x + log y, x, y ∈ (0, ∞).
We move on to a study of e^z for purely imaginary z, i.e., of
(4.5.29) γ(t) = e^{it}, t ∈ R.
This traces out a curve in the complex plane, and we want to understand which curve it is. Let us set
(4.5.30) e^{it} = c(t) + is(t),
Figure 4.5.1. Exponential function
with c(t) and s(t) real valued. First we calculate |e^{it}|² = c(t)² + s(t)². For x, y ∈ R,
(4.5.31) z = x + iy =⇒ z̄ = x − iy =⇒ zz̄ = x² + y² = |z|².
It is elementary that
(4.5.32) z, w ∈ C =⇒ (zw)‾ = z̄ w̄ =⇒ (z^n)‾ = z̄^n, and (z + w)‾ = z̄ + w̄.
Hence
(4.5.33) (e^z)‾ = Σ_{k=0}^∞ z̄^k/k! = e^{z̄}.
In particular,
(4.5.34) t ∈ R =⇒ |e^{it}|² = e^{it} e^{−it} = 1.
γ ′ (t) = ieit =⇒ |γ ′ (t)| ≡ 1,
176
4. Calculus
Figure 4.5.2. Logarithm
so γ(t) moves at unit speed on the unit circle. We have (4.5.36)
γ(0) = 1,
γ ′ (0) = i.
Thus, for moderate t > 0, the arc from γ(0) to γ(t) is an arc on the unit circle, pictured in Figure 4.5.3, of length ∫ t (4.5.37) ℓ(t) = |γ ′ (s)| ds = t. 0
In other words, γ(t) = eit is the parametrization of the unit circle by arc length, introduced in (4.4.33). As in (4.4.34), standard definitions from trigonometry give (4.5.38)
cos t = c(t),
sin t = s(t).
Thus (4.5.30) becomes (4.5.39)
eit = cos t + i sin t,
which is Euler’s formula. The identity (4.5.40)
d it e = ieit , dt
4.5. The exponential and trigonometric functions
177
Figure 4.5.3. The circle eit = c(t) + is(t)
applied to (4.5.39), yields d d cos t = − sin t, sin t = cos t. dt dt Compare the derivation of (4.4.39). We can use (4.5.17) to derive formulas for sin and cos of the sum of two angles. Indeed, comparing (4.5.41)
(4.5.42)
ei(s+t) = cos(s + t) + i sin(s + t)
with (4.5.43)
eis eit = (cos s + i sin s)(cos t + i sin t)
gives (4.5.44)
cos(s + t) = (cos s)(cos t) − (sin s)(sin t), sin(s + t) = (sin s)(cos t) + (cos s)(sin t).
Further material on the trigonometric functions is developed in the exercises below. Remark. An alternative approach to Euler’s formula (4.5.39) is to take the power series for eit , via (4.5.7), and compare it to the power series for
cos t and sin t, given in (4.4.62). This author regards the demonstration via (4.5.33)–(4.5.37), which yields a direct geometrical description of the curve γ(t) = eit , to be more natural and fundamental than one via the observation of coincident power series. For yet another derivation of Euler’s formula, we can set (4.5.45)
cis(t) = cos t + i sin t,
and use (4.5.41) (relying on the proof in (4.4.39)) to get
(4.5.46) (d/dt) cis(t) = i cis(t), cis(0) = 1.
Then the uniqueness result (4.5.9)–(4.5.13) implies that cis(t) = e^{it}.
Exercises.

1. Show that
(4.5.47) |t| < 1 ⇒ log(1 + t) = Σ_{k=1}^∞ ((−1)^{k−1}/k) t^k = t − t²/2 + t³/3 − ··· .
Hint. Rewrite (4.5.27) as
log(1 + t) = ∫_0^t ds/(1 + s),
expand
1/(1 + s) = 1 − s + s² − s³ + ··· , |s| < 1,
and integrate term by term.
2. In §4.4, π was defined to be half the length of the unit circle S^1. Equivalently, π is the smallest positive number such that e^{πi} = −1. Show that
e^{πi/2} = i, e^{πi/3} = 1/2 + (√3/2) i.
Hint. See Figure 4.5.4.

3. Show that cos² t + sin² t = 1, and 1 + tan² t = sec² t, where
tan t = sin t / cos t, sec t = 1 / cos t.
Figure 4.5.4. Regular hexagon, a = eπi/3
4. Show that
(d/dt) tan t = sec² t = 1 + tan² t,
(d/dt) sec t = sec t tan t.

5. Evaluate
∫_0^y dx/(1 + x²).
Hint. Set x = tan t.

6. Evaluate
∫_0^y dx/√(1 − x²).
Hint. Set x = sin t.
7. Show that
∫_0^{1/2} dx/√(1 − x²) = π/6.
Use (4.4.27)–(4.4.31) to obtain a rapidly convergent infinite series for π.
Hint. Show that sin π/6 = 1/2. Use Exercise 2 and the identity e^{πi/6} = e^{πi/2} e^{−πi/3}. Note that a_k in (4.4.29)–(4.4.31) satisfies a_{k+1} = (k + 1/2)a_k. Deduce that
(4.5.48) π = Σ_{k=0}^∞ b_k/(2k + 1), b₀ = 3, b_{k+1} = (1/4)·((2k + 1)/(2k + 2))·b_k.
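The recursion in (4.5.48) makes this series easy to evaluate; each term is roughly a quarter of the previous one. A short numerical check (an illustration added here, not part of the text):

```python
import math

def pi_series(n_terms=30):
    """Partial sum of (4.5.48): pi = sum_k b_k/(2k+1),
    with b_0 = 3 and b_(k+1) = (1/4) * (2k+1)/(2k+2) * b_k."""
    b = 3.0
    total = 0.0
    for k in range(n_terms):
        total += b / (2 * k + 1)
        b *= 0.25 * (2 * k + 1) / (2 * k + 2)
    return total

print(pi_series())  # about 30 terms already give machine precision
```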
8. Set
cosh t = (e^t + e^{−t})/2, sinh t = (e^t − e^{−t})/2.
Show that
(d/dt) cosh t = sinh t, (d/dt) sinh t = cosh t,
and
cosh² t − sinh² t = 1.
9. Evaluate
∫_0^y dx/√(1 + x²).
Hint. Set x = sinh t.

10. Evaluate
∫_0^y √(1 + x²) dx.
11. Using Exercise 4, verify that
(d/dt)(sec t + tan t) = sec t (sec t + tan t),
(d/dt)(sec t tan t) = sec³ t + sec t tan² t = 2 sec³ t − sec t.

12. Next verify that
(d/dt) log |sec t| = tan t,
(d/dt) log |sec t + tan t| = sec t.
13. Now verify that
∫ tan t dt = log |sec t|,
∫ sec t dt = log |sec t + tan t|,
2 ∫ sec³ t dt = sec t tan t + ∫ sec t dt.
(Here and below, we omit the arbitrary additive constants in indefinite integrals.) See the next exercise, and also Exercises 40–43 for other approaches to evaluating these and related integrals.

14. Here is another approach to the evaluation of ∫ sec t dt. We evaluate
I(u) = ∫_0^u dv/√(1 + v²)
in two ways.
(a) Using v = sinh y, show that
I(u) = ∫_0^{sinh⁻¹ u} dy = sinh⁻¹ u.
(b) Using v = tan t, show that
I(u) = ∫_0^{tan⁻¹ u} sec t dt.
Deduce that
∫_0^x sec t dt = sinh⁻¹(tan x), for |x| < π/2.
In §4.1, x^r was defined for x > 0 and r ∈ Q. Now we define x^r for x > 0 and r ∈ C, as follows:
(4.5.49) x^r = e^{r log x}.
18. Show that if r = n ∈ N, (4.5.49) yields x^n = x ··· x (n factors).

19. Show that if r = 1/n, x^{1/n} defined by (4.5.49) satisfies
x = x^{1/n} ··· x^{1/n} (n factors),
and deduce that x^{1/n}, defined by (4.5.49), coincides with x^{1/n} as defined in §4.1.

20. Show that x^r, defined by (4.5.49), coincides with x^r as defined in §4.1, for all r ∈ Q.

21. Show that, for x > 0,
x^{r+s} = x^r x^s, and (x^r)^s = x^{rs}, ∀ r, s ∈ C.
22. Show that, given r ∈ C,
(d/dx) x^r = r x^{r−1}, ∀ x > 0.

22A. For y > 0, evaluate
∫_0^y cos(log x) dx and ∫_0^y sin(log x) dx.
Hint. Deduce from (4.5.49) and Euler's formula that cos(log x) + i sin(log x) = x^i. Use the result of Exercise 22 to integrate x^i.

23. Show that, given r, r_j ∈ C, x > 0,
r_j → r =⇒ x^{r_j} → x^r.

24. Given a > 0, compute
(d/dx) a^x, x ∈ R.

25. Compute
(d/dx) x^x, x > 0.

26. Prove that
x^{1/x} −→ 1, as x → ∞.
Hint. Show that
(log x)/x −→ 0, as x → ∞.
27. Verify that
∫_0^1 x^x dx = ∫_0^1 e^{x log x} dx
= ∫_0^∞ e^{−y e^{−y}} e^{−y} dy
= Σ_{n=0}^∞ ∫_0^∞ ((−1)^n/n!) y^n e^{−(n+1)y} dy.
28. Show that, if α > 0, n ∈ N,
∫_0^∞ y^n e^{−αy} dy = (−1)^n F^{(n)}(α),
where
F(α) = ∫_0^∞ e^{−αy} dy = 1/α.
29. Using Exercises 27–28, show that
∫_0^1 x^x dx = Σ_{n=0}^∞ (−1)^n (n + 1)^{−(n+1)} = 1 − 1/2² + 1/3³ − 1/4⁴ + ··· .
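The identity in Exercise 29 (sometimes called the "sophomore's dream") is easy to test numerically. The sketch below (an addition, not from the text) compares a partial sum of the series with a midpoint-rule value of the integral:

```python
import math

def xx_series(n_terms=20):
    """Partial sum of sum_(n>=0) (-1)^n (n+1)^(-(n+1)) from Exercise 29."""
    return sum((-1) ** n * (n + 1) ** (-(n + 1.0)) for n in range(n_terms))

def xx_integral(n=100000):
    """Midpoint-rule approximation of integral_0^1 x^x dx,
    with x^x = exp(x log x), extended by the value 1 at x = 0."""
    h = 1.0 / n
    return sum(
        math.exp((i + 0.5) * h * math.log((i + 0.5) * h)) * h for i in range(n)
    )

print(xx_series(), xx_integral())  # the two values agree closely
```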
Some special series

30. Using (4.5.47), show that
Σ_{k=1}^∞ (−1)^{k−1}/k = log 2.
Hint. Using properties of alternating series, show that if t ∈ (0, 1),
Σ_{k=1}^N ((−1)^{k−1}/k) t^k = log(1 + t) + r_N(t), |r_N(t)| ≤ t^{N+1}/(N + 1),
and let t ↗ 1.

31. Using the result of Exercise 5, show that
Σ_{k=0}^∞ (−1)^k/(2k + 1) = π/4.
Hint. Exercise 5 implies
tan⁻¹ y = Σ_{k=0}^∞ ((−1)^k/(2k + 1)) y^{2k+1}, for −1 < y < 1.
Use an argument like that suggested for Exercise 30, taking y ↗ 1.
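Both of these limits converge slowly, with the error bounded by the first omitted term, as the alternating series estimate predicts. A short numerical check (an addition to the text):

```python
import math

def alt_harmonic(N):
    """Partial sum of sum_(k>=1) (-1)^(k-1)/k (Exercise 30)."""
    return sum((-1) ** (k - 1) / k for k in range(1, N + 1))

def leibniz(N):
    """Partial sum of sum_(k>=0) (-1)^k/(2k+1) (Exercise 31)."""
    return sum((-1) ** k / (2 * k + 1) for k in range(N + 1))

# By the alternating series estimate, each error is at most the next term.
print(alt_harmonic(1000), math.log(2))
print(4 * leibniz(1000), math.pi)
```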
Alternative approach to exponentials and logs

An alternative approach is to define log : (0, ∞) → R first and derive some of its properties, and then define the exponential function Exp : R → (0, ∞) as its inverse. The following exercises describe how to implement this. To start, we take (4.5.27) as a definition:
(4.5.50) log x = ∫_1^x dy/y, x > 0.
32. Using (4.5.50), show that
(4.5.51) log(xy) = log x + log y, ∀ x, y > 0.
Also show
(4.5.52) log(1/x) = − log x, ∀ x > 0.

33. Show from (4.5.50) that
(4.5.53) (d/dx) log x = 1/x, x > 0.
34. Show that log x → +∞ as x → +∞. (Hint. See the hint for Exercise 15 in §4.2.) Then show that log x → −∞ as x → 0.

35. Deduce from Exercises 33 and 34, together with Theorem 4.1.3, that
log : (0, ∞) −→ R
is one-to-one and onto, with a differentiable inverse. We denote the inverse function
Exp : R −→ (0, ∞),
and also set e^t = Exp(t).

36. Deduce from Exercise 32 that
(4.5.54) e^{s+t} = e^s e^t, ∀ s, t ∈ R.
Note. (4.5.54) is a special case of (4.5.17).

37. Deduce from (4.5.53) and Theorem 4.1.3 that
(4.5.55) (d/dt) e^t = e^t, ∀ t ∈ R.
As a consequence,
(4.5.56) (d^n/dt^n) e^t = e^t, ∀ t ∈ R, n ∈ N.
38. Note that e⁰ = 1, since log 1 = 0. Deduce from (4.5.56), together with the power series formulas (4.3.34) and (4.3.40), that, for all t ∈ R, n ∈ N,
(4.5.57) e^t = Σ_{k=0}^n t^k/k! + R_n(t),
where
(4.5.58) R_n(t) = (t^{n+1}/(n + 1)!) e^{ζ_n},
for some ζ_n between 0 and t.

39. Deduce from Exercise 38 that
(4.5.59) e^t = Σ_{k=0}^∞ t^k/k!, ∀ t ∈ R.
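The remainder estimate (4.5.58) can be observed numerically. The sketch below (an addition, not part of the text) compares the partial sum in (4.5.57) with exp, and with the bound |R_n(t)| ≤ (|t|^{n+1}/(n+1)!) e^{|t|} that follows since |e^{ζ_n}| ≤ e^{|t|}:

```python
import math

def exp_partial(t, n):
    """Partial sum sum_(k=0)^n t^k/k! from (4.5.57)."""
    term, total = 1.0, 1.0
    for k in range(1, n + 1):
        term *= t / k
        total += term
    return total

t, n = 2.0, 15
# From (4.5.58): |R_n(t)| <= |t|^(n+1)/(n+1)! * e^|t|.
bound = abs(t) ** (n + 1) / math.factorial(n + 1) * math.exp(abs(t))
print(exp_partial(t, n), math.exp(t), bound)
```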
Remark. Exercises 35–39 develop e^t only for t ∈ R. At this point, it is natural to segue to (4.5.6) and from there to arguments involving (4.5.7)–(4.5.17), and then on to (4.5.29)–(4.5.41), renewing contact with the trigonometric functions.

Some more trigonometric integrals

The next few exercises treat integrals of the form
(4.5.60) ∫ R(cos θ, sin θ) dθ.
Here and below, typically R is a rational function of its arguments.

40. Using the substitution x = tan θ/2, show that
dθ = 2 dx/(1 + x²), cos θ = (1 − x²)/(1 + x²), sin θ = 2x/(1 + x²).
Hint. With α = θ/2, use
cos 2α = 2 cos² α − 1, and sec² α = 1 + tan² α.
41. Deduce that (4.5.60) converts to
(4.5.61) 2 ∫ R((1 − x²)/(1 + x²), 2x/(1 + x²)) dx/(1 + x²).

42. Use this approach to compute
∫ (1/sin θ) dθ, and ∫ (1/cos θ) dθ.
Compare the second result with that from Exercise 13.
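The half-angle identities of Exercise 40 are easy to confirm numerically (a check added here, not in the text):

```python
import math

def half_angle_errors(theta):
    """Errors in cos = (1-x^2)/(1+x^2) and sin = 2x/(1+x^2), x = tan(theta/2)."""
    x = math.tan(theta / 2)
    e_cos = abs(math.cos(theta) - (1 - x * x) / (1 + x * x))
    e_sin = abs(math.sin(theta) - 2 * x / (1 + x * x))
    return e_cos, e_sin

errs = [half_angle_errors(th) for th in (0.3, 1.0, 2.5, -1.2)]
print(errs)  # all errors at round-off level
```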
43. Use the substitution t = sin θ to show that, for k ∈ Z⁺,
∫ sec^{2k+1} θ dθ = ∫ dt/(1 − t²)^{k+1}.
Compare what you get by the methods of Exercises 40–42, and also (for k = 0, 1) those of Exercise 13.
Hint. sec^{2k+1} θ = (cos θ)/(1 − sin² θ)^{k+1}.

We next look at integrals of the form
(4.5.62) ∫ R(cosh u, sinh u) du.

44. Using the substitution x = tanh u/2, show that
du = 2 dx/(1 − x²), cosh u = (1 + x²)/(1 − x²), sinh u = 2x/(1 − x²),
and obtain an analogue of (4.5.61).
45. Looking back at Exercise 14, complement the identity
(d/dt) sinh⁻¹(tan t) = sec t
with
(d/du) tan⁻¹(sinh u) = 1/cosh u = sech u.
Compare the resulting formula for ∫ sech u du with what you get via Exercise 44.

We next look at integrals of the form
(4.5.63) ∫ R(x, √(1 + x²)) dx.
This extends results arising in Exercise 14.
47. Using the substitution x = sinh u, show that √ dx = cosh u du, 1 + x2 = cosh u, and deduce that (4.5.63) converts to ∫ R(sinh u, cosh u) cosh u du, a class of integrals to which Exercise 44 applies. 48. Apply the substitution x = sin θ to integrals of the form ∫ √ R(x, 1 − x2 ) dx, and work out an analogue of Exercise 46. 49. We next look at arc length on the graph of t = log x. Consider the curve γ : (0, ∞) → R2 given by γ(x) = (x, log x). ∫x Taking ℓγ (x) = ℓ(γ([1, x]) = 1 |γ ′ (s)| ds (as in (4.4.18)), for x > 1, show that ∫ x√ dy ℓγ (x) = 1 + y2 . y 1 Use methods developed in Exercises 46 and/or 47 (complemented by those developed in Exercises 41 and 44) to evaluate this arc length. For another perspective, examine the arc length of σ(t) = (t, et ). The gamma function 50. Show that
∫ Γ(x) =
∞
e−t tx−1 dt
0
defines Γ : (0, ∞) −→ R as a continuous function. This is called the gamma function. 51. Show that, for x > 0, Γ(x + 1) = xΓ(x).
4.5. The exponential and trigonometric functions
∫
Hint. Write Γ(x + 1) = −
0
∞(
189
d −t ) x e t dt, dt
and integrate by parts.

52. Show that Γ(1) = 1, and deduce from Exercise 51 that, for n ∈ N,
Γ(n + 1) = n!
Remark. The gamma function has many uses in analysis. Further material on this can be found in Chapter 3 of [13], and a good deal more can be found in Chapter 4 of [14].

Exponentials and factorials

53. Looking at the power series for e^n, show that e^n > n^n/n!, or equivalently
n! > (n/e)^n.

54. We consider some more precise estimates on n!. In preparation for this, establish that, for 1 ≤ a < b < ∞,
∫_a^b log x dx = [x log x − x]_a^b.
Also, using the fact that (d²/dx²) log x < 0, show that
(1/2)[log a + log(a + 1)] < ∫_a^{a+1} log x dx < log(a + 1/2).

55. Note that, for n ≥ 2, log n! = log 2 + ··· + log n. Using Exercise 54, show that
∫_1^n log x dx > log 2 + ··· + log(n − 1) + (1/2) log n,
and hence n! < (n/e)^n e √n. Similarly, show that
n! > e^{(n+1/2) log(n+1/2)} e^{−(n+1/2)} e^{−(3/2) log(3/2)} e^{3/2}.
Using
log(n + 1/2) = log n + log(1 + 1/2n), and log(1 + δ) > δ − (1/2)δ², for 0 < δ < 1,
deduce that
n! > (n/e)^n [(2/3)^{3/2} e] √n.

Remark. A celebrated result known as Stirling's formula says
(4.5.64) n! ∼ (n/e)^n √(2πn),
as n → ∞, in the sense that the ratio of the two sides tends to 1. We have
(2/3)^{3/2} e < √(2π) < e.
Compute each of these three quantities to 5 digits of accuracy.
The gamma function is useful for proving (4.5.64). A proof can be found in Appendix A.3 of [14].
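The three constants compared in the Remark, and the Stirling ratio itself, can be computed directly (a numerical aside added here, not part of the text):

```python
import math

# The three quantities from the Remark, each to 5 digits:
c_lower = (2.0 / 3.0) ** 1.5 * math.e   # (2/3)^(3/2) e, lower-bound constant
c_stirling = math.sqrt(2 * math.pi)     # sqrt(2 pi), Stirling's constant
c_upper = math.e                        # e, upper-bound constant
print(round(c_lower, 5), round(c_stirling, 5), round(c_upper, 5))

# Sanity check of (4.5.64): n! / ((n/e)^n sqrt(2 pi n)) -> 1 as n grows.
n = 60
ratio = math.factorial(n) / ((n / math.e) ** n * math.sqrt(2 * math.pi * n))
print(ratio)
```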
4.6. Unbounded integrable functions

There are lots of unbounded functions we would like to be able to integrate. For example, consider f(x) = x^{−1/2} on (0, 1] (defined any way you like at x = 0). Since, for ε ∈ (0, 1),
(4.6.1) ∫_ε^1 x^{−1/2} dx = 2 − 2√ε,
this has a limit as ε ↘ 0, and it is natural to set
(4.6.2) ∫_0^1 x^{−1/2} dx = 2.
Sometimes (4.6.2) is called an “improper integral,” but we do not consider that to be a proper designation. Here, we define a class R#(I) of not necessarily bounded “integrable” functions on an interval I = [a, b], as follows. First, assume f ≥ 0 on I, and for A ∈ (0, ∞), set
(4.6.3) f_A(x) = f(x) if f(x) ≤ A, A if f(x) > A.
We say f ∈ R#(I) provided
(4.6.4) f_A ∈ R(I), ∀ A < ∞, and ∃ a uniform bound ∫_I f_A dx ≤ M.
If f ≥ 0 satisfies (4.6.4), then ∫_I f_A dx increases monotonically to a finite limit as A ↗ +∞, and we call the limit ∫_I f dx:
(4.6.5) ∫_I f_A dx ↗ ∫_I f dx, for f ∈ R#(I), f ≥ 0.
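The truncation (4.6.3) can be watched numerically for f(x) = x^{−1/2} on [0, 1], where one can compute exactly that ∫_0^1 min(x^{−1/2}, A) dx = 2 − 1/A, increasing to the limit 2. A sketch (added here as an illustration, not part of the text):

```python
def truncated_integral(A, n=200000):
    """Midpoint-rule value of integral_0^1 f_A dx for f(x) = x**-0.5,
    with f_A = min(f, A) as in (4.6.3)."""
    h = 1.0 / n
    return sum(min(((i + 0.5) * h) ** -0.5, A) * h for i in range(n))

# The truncated integrals increase with A toward 2; exactly, they equal 2 - 1/A.
vals = [truncated_integral(A) for A in (10.0, 100.0, 1000.0)]
print(vals)
```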
We also use the notation ∫_a^b f dx, if I = [a, b]. If I is understood, we might just write ∫ f dx. It is valuable to have the following.

Proposition 4.6.1. If f, g : I → R⁺ are in R#(I), then f + g ∈ R#(I), and
(4.6.6) ∫_I (f + g) dx = ∫_I f dx + ∫_I g dx.
Proof. To start, note that (f + g)_A ≤ f_A + g_A. In fact,
(4.6.7) (f + g)_A = (f_A + g_A)_A.
Hence (f + g)_A ∈ R(I) and ∫(f + g)_A dx ≤ ∫f_A dx + ∫g_A dx ≤ ∫f dx + ∫g dx, so we have f + g ∈ R#(I) and
(4.6.8) ∫(f + g) dx ≤ ∫f dx + ∫g dx.
On the other hand, if B > 2A, then (f + g)_B ≥ f_A + g_A, so
(4.6.9) ∫(f + g) dx ≥ ∫f_A dx + ∫g_A dx,
for all A < ∞, and hence
(4.6.10) ∫(f + g) dx ≥ ∫f dx + ∫g dx.
Together, (4.6.8) and (4.6.10) yield (4.6.6).

Next, we take f : I → R and set
(4.6.11) f = f⁺ − f⁻, f⁺(x) = f(x) if f(x) ≥ 0, 0 if f(x) < 0.
Then we say
(4.6.12) f ∈ R#(I) ⇐⇒ f⁺, f⁻ ∈ R#(I),
and set
(4.6.13) ∫_I f dx = ∫_I f⁺ dx − ∫_I f⁻ dx,
where the two terms on the right are defined as in (4.6.5). To extend the additivity, we begin as follows.

Proposition 4.6.2. Assume that g ∈ R#(I) and that g_j ≥ 0, g_j ∈ R#(I), and
(4.6.14) g = g₀ − g₁.
Then
(4.6.15) ∫g dx = ∫g₀ dx − ∫g₁ dx.
Proof. Take g = g⁺ − g⁻ as in (4.6.11). Then (4.6.14) implies
(4.6.16) g⁺ + g₁ = g₀ + g⁻,
which by Proposition 4.6.1 yields
(4.6.17) ∫g⁺ dx + ∫g₁ dx = ∫g₀ dx + ∫g⁻ dx.
This implies
(4.6.18) ∫g⁺ dx − ∫g⁻ dx = ∫g₀ dx − ∫g₁ dx,
which yields (4.6.15).

We now extend additivity.

Proposition 4.6.3. Assume f₁, f₂ ∈ R#(I). Then f₁ + f₂ ∈ R#(I) and
(4.6.19) ∫_I (f₁ + f₂) dx = ∫_I f₁ dx + ∫_I f₂ dx.
Proof. If g = f₁ + f₂ = (f₁⁺ − f₁⁻) + (f₂⁺ − f₂⁻), then
(4.6.20) g = g₀ − g₁, g₀ = f₁⁺ + f₂⁺, g₁ = f₁⁻ + f₂⁻.
We have g_j ∈ R#(I), and then
(4.6.21) ∫(f₁ + f₂) dx = ∫g₀ dx − ∫g₁ dx
= ∫(f₁⁺ + f₂⁺) dx − ∫(f₁⁻ + f₂⁻) dx
= ∫f₁⁺ dx + ∫f₂⁺ dx − ∫f₁⁻ dx − ∫f₂⁻ dx,
the first equality by Proposition 4.6.2, the second tautologically, and the third by Proposition 4.6.1. Since
(4.6.22) ∫f_j dx = ∫f_j⁺ dx − ∫f_j⁻ dx,
this gives (4.6.19).
If f : I → C, we set f = f₁ + if₂, f_j : I → R, and say f ∈ R#(I) if and only if f₁ and f₂ belong to R#(I). Then we set
(4.6.23) ∫f dx = ∫f₁ dx + i ∫f₂ dx.
Similar comments apply to f : I → R^n. Given f ∈ R#(I), we set
(4.6.24) ∥f∥_{L¹(I)} = ∫_I |f(x)| dx.
We have, for f, g ∈ R#(I), a ∈ C,
(4.6.25) ∥af∥_{L¹(I)} = |a| ∥f∥_{L¹(I)},
and
(4.6.26) ∥f + g∥_{L¹(I)} = ∫_I |f + g| dx ≤ ∫_I (|f| + |g|) dx = ∥f∥_{L¹(I)} + ∥g∥_{L¹(I)}.
Note that, if S ⊂ I,
(4.6.27) cont⁺(S) = 0 =⇒ ∫_I |χ_S| dx = 0,
where cont⁺(S) is defined by (4.2.21). Thus, to get a metric, we need to form equivalence classes. The set of equivalence classes [f] of elements of R#(I), where
(4.6.28) f ∼ f̃ ⇐⇒ ∫_I |f − f̃| dx = 0,
forms a metric space, with distance function
(4.6.29) D([f], [g]) = ∥f − g∥_{L¹(I)}.
However, this metric space is not complete. One needs the Lebesgue integral to obtain a complete metric space. One can see [6] or [12]. We next show that each f ∈ R#(I) can be approximated in L¹ by a sequence of bounded, Riemann integrable functions.

Proposition 4.6.4. If f ∈ R#(I), then there exist f_k ∈ R(I) such that
(4.6.30) ∥f − f_k∥_{L¹(I)} −→ 0, as k → ∞.
Proof. If we separately approximate Re f and Im f by such sequences, then we approximate f, so it suffices to treat the case where f is real. Similarly, writing f = f⁺ − f⁻, we see that it suffices to treat the case where f ≥ 0 on I. For such f, simply take
(4.6.31) f_k = f_A, A = k,
with f_A as in (4.6.3). Then (4.6.5) implies
(4.6.32) ∫_I f_k dx ↗ ∫_I f dx,
and Proposition 4.6.3 gives
(4.6.33) ∫_I |f − f_k| dx = ∫_I (f − f_k) dx = ∫_I f dx − ∫_I f_k dx → 0 as k → ∞.

So far, we have dealt with integrable functions on a bounded interval. Now, we say f : R → R (or C, or R^n) belongs to R#(R) provided f|_I ∈ R#(I) for each closed, bounded interval I ⊂ R and
(4.6.34) ∃ A < ∞ such that ∫_{−R}^R |f| dx ≤ A, ∀ R < ∞.
In such a case, we set
(4.6.35) ∫_{−∞}^∞ f dx = lim_{R→∞} ∫_{−R}^R f dx.
One can similarly define R#(R⁺).
Exercises

1. Let f : [0, 1] → R⁺ and assume f is continuous on (0, 1]. Show that
f ∈ R#([0, 1]) ⇐⇒ ∫_ε^1 f dx is bounded as ε ↘ 0.
In such a case, show that
∫_0^1 f dx = lim_{ε→0} ∫_ε^1 f dx.
2. Let a > 0. Define p_a : [0, 1] → R by p_a(x) = x^{−a} if 0 < x ≤ 1. Set p_a(0) = 0. Show that
p_a ∈ R#([0, 1]) ⇐⇒ a < 1.

3. Let b > 0. Define q_b : [0, 1/2] → R by
q_b(x) = 1/(x |log x|^b),
if 0 < x ≤ 1/2. Set q_b(0) = 0. Show that
q_b ∈ R#([0, 1/2]) ⇐⇒ b > 1.

4. Show that if a ∈ C and if f ∈ R#(I), then
af ∈ R#(I), and ∫af dx = a ∫f dx.
Hint. Check this for a > 0, a = −1, and a = i.

5. Show that
f ∈ R(I), g ∈ R#(I) =⇒ fg ∈ R#(I).
Hint. Use (4.2.53). First treat the case f, g ≥ 1, f ≤ M. Show that in such a case, (fg)_A = (f_A g_A)_A, and (fg)_A ≤ M g_A.

6. Compute
∫_0^1 log t dt.
Hint. To compute ∫_ε^1 log t dt, first compute (d/dt)(t log t).
7. Given g ∈ R(I), show that there exist g_k ∈ PK(I) such that
∥g − g_k∥_{L¹(I)} −→ 0.
Given h ∈ PK(I), show that there exist h_k ∈ C(I) such that
∥h − h_k∥_{L¹(I)} −→ 0.

8. Using Exercise 7 and Proposition 4.6.4, prove the following: given f ∈ R#(I), there exist f_k ∈ C(I) such that
∥f − f_k∥_{L¹(I)} −→ 0.

9. Recall Exercise 4 of §4.2. If φ : [a, b] → [A, B] is C^1, with φ′(x) > 0 for all x ∈ [a, b], then
(4.6.36) ∫_A^B f(y) dy = ∫_a^b f(φ(t))φ′(t) dt,
for each f ∈ C([a, b]), where A = φ(a), B = φ(b). Using Exercise 8, show that (4.6.36) holds for each f ∈ R#([a, b]).
197
10. If f ∈ R# (R), so (4.6.34) holds, prove that the limit exists in (4.6.35). 11. Given f (x) = x−1/2 (1 + x2 )−1 for x > 0, show that f ∈ R# (R+ ). Show that ∫ ∞ ∫ ∞ dx 1 dy √ . = 2 2 1+x 1 + y4 x 0 0 12. Let fk ∈ R# ([a, b]), f : [a, b] → R satisfy (a)
|fk | ≤ g,
∀ k, for some g ∈ R# ([a, b]),
(b)
Given ε > 0, ∃ contented Sε ⊂ [a, b] such that ∫ g dx < ε, and fk → f uniformly on [a, b] \ Sε . Sε
Show that f ∈ R# ([a, b]) and ∫ b ∫ b fk (x) dx −→ f (x) dx, a
as k → ∞.
a
13. Let g ∈ R# ([a, b]) be ≥ 0. Show that for each ε > 0, there exists δ > 0 such that ∫ S ⊂ [a, b] contented, cont S < δ =⇒ g dx < ε. S
Hint. With gA defined as in (4.6.3), pick A such that Then pick δ < ε/2A.
∫
gA dx ≥
∫
g dx − ε/2.
14. Deduce from Exercises 12–13 the following. Let fk ∈ R# ([a, b]), f : [a, b] → R satisfy (a)
|fk | ≤ g,
∀ k, for some g ∈ R# ([a, b]),
(b)
Given δ > 0, ∃ contented Sδ ⊂ [a, b] such that cont Sδ < δ,
and fk → f uniformly on [a, b] \ Sδ .
Show that f ∈ R# ([a, b]) and ∫ b ∫ b f (x) dx, fk (x) dx −→ a
as k → ∞.
a
Remark. Compare Exercise 18 of §4.2. As mentioned there, the Lebesgue theory of integration has a stronger result, known as the Lebesgue dominated convergence theorem.
198
4. Calculus
√ 15. Given g(s) = 1/ 1 − s2 , show that g ∈ R# ([−1, 1]), and that ∫ 1 ds √ = π. 1 − s2 −1 Compare the arc length formula (4.4.27). √ 16. Given f (t) = 1/ t(1 − t), show that f ∈ R# ([0, 1]), and that ∫ 1 dt √ = π. t(1 − t) 0 Hint. Set t = s2 .
Chapter 5
Further Topics in Analysis
In this final chapter we apply results of Chapters 3 and 4 to a selection of topics in analysis. One underlying theme here is the approximation of a function by a sequence of “simpler” functions. In §5.1 we define the convolution of functions on R, ∫ ∞ f ∗ u(x) = f (y)u(x − y) dy, −∞
and give conditions on a sequence (fn ) guaranteeing that fn ∗ u → u as n → ∞. In §5.2 we treat the Weierstrass approximation theorem, which states that each continuous function on a closed, bounded interval [a, b] is a uniform limit of a sequence of polynomials. We give two proofs, one using convolutions and one using the uniform convergence on [−1, 1] of the power series of (1 − x)b , whenever b > 0, which we establish in Appendix A.2. (Here, we take b = 1/2.) Section 5.3 treats a far reaching generalization, known as the Stone-Weierstrass theorem. A special case, of use in §5.4, is that each continuous function on T1 is a uniform limit of a sequence of finite linear combinations of the exponentials eikθ , k ∈ Z. Section 5.4 introduces Fourier series, f (θ) =
∞ ∑
ak eikθ .
k=−∞
A central question is when this holds with ∫ π 1 f (θ)e−ikθ dθ. ak = 2π −π 199
200
5. Further Topics in Analysis
This is the Fourier inversion problem, and we examine several aspects of this. Fourier analysis is a major area in modern analysis. Material treated here will provide a useful background (and stimulus) for further study. We mention that Chapter 3 of [14] deals with Fourier series on a similar level as here, while making contact with complex analysis, and also including treatments of the Fourier transform and Laplace transform. Progressively more advanced treatments of Fourier analysis can be found in [13], Chapter 7, [6], Chapter 8, and [15], Chapter 3. Section 5.5 treats the use of Newton’s method to solve f (ξ) = y for ξ in an interval [a, b] given that f (a) − y and f (b) − y have opposite signs and that |f ′′ (x)| ≤ A, |f ′ (x)| ≥ B > 0, ∀ x ∈ [a, b]. It is seen that if an initial guess x0 is close enough to ξ, then Newton’s method produces a sequence (xk ) satisfying k
|xk − ξ| ≤ Cβ 2 ,
for some β ∈ (0, 1).
It is extremely useful to have such a rapid approximation of the solution ξ. This chapter also has an appendix, §5.6, dealing with inner product spaces. This class of space generalizes the notion of Euclidean spaces, considered in §2.1, from finite to infinite dimensions. The results are of use in our treatment of Fourier series, in §5.4.
5.1. Convolutions and bump functions If u is bounded and continuous on R and f is integrable (say f ∈ R(R)) we define the convolution f ∗ u by ∫ ∞ (5.1.1) f ∗ u(x) = f (y)u(x − y) dy. −∞
Clearly (5.1.2)
∫ |f | dx = A,
|u| ≤ M on R =⇒ |f ∗ u| ≤ AM on R.
Also, a change of variable gives (5.1.3)
f ∗ u(x) =
∫
∞
−∞
f (x − y)u(y) dy.
We want to analyze the convolution action of a sequence of integrable functions fn on R that satisfy the following conditions: ∫ ∫ (5.1.4) fn ≥ 0, fn dx = 1, fn dx = εn → 0, R\In
5.1. Convolutions and bump functions
201
where (5.1.5)
In = [−δn , δn ],
δn → 0.
Let u ∈ C(R) be supported on a bounded interval [−A, A], or more generally, assume u ∈ C(R),
(5.1.6)
|u| ≤ M on R,
and u is uniformly continuous on R, so with δn as in (5.1.5), |x − x′ | ≤ 2δn =⇒ |u(x) − u(x′ )| ≤ ε˜n → 0.
(5.1.7)
We aim to prove the following. Proposition 5.1.1. If fn ∈ R(R) satisfy (5.1.4)–(5.1.5) and if u ∈ C(R) is bounded and uniformly continuous (satisfying (5.1.6)–(5.1.7)), then un = fn ∗ u −→ u,
(5.1.8)
Proof. To start, write
∫
un (x) = ∫ (5.1.9)
uniformly on R, as n → ∞.
fn (y)u(x − y) dy
∫
fn (y)u(x − y) dy +
=
fn (y)u(x − y) dy
R\In
In
= vn (x) + rn (x). Clearly |rn (x)| ≤ M εn ,
(5.1.10) Next, (5.1.11)
∀ x ∈ R.
∫ vn (x) − u(x) =
fn (y)[u(x − y) − u(x)] dy − εn u(x), In
so (5.1.12)
|vn (x) − u(x)| ≤ ε˜n + M εn ,
∀ x ∈ R,
|un (x) − u(x)| ≤ ε˜n + 2M εn ,
∀ x ∈ R,
hence (5.1.13) yielding (5.1.8).
Here is a sequence of functions (fn ) satisfying (5.1.4)–(5.1.5). First, set ∫ 1 1 2 n (x2 − 1)n dx, (x − 1) , An = (5.1.14) gn (x) = An −1
202
5. Further Topics in Analysis
and then set (5.1.15)
fn (x) = gn (x), 0,
|x| ≤ 1, |x| ≥ 1.
It is readily verified that such (fn ) satisfy (5.1.4)–(5.1.5). We will use this sequence in Proposition 5.1.1 for one proof of the Weierstrass approximation theorem, in the next section. The functions fn defined by (5.1.14)–(5.1.15) have the property (5.1.16)
fn ∈ C n−1 (R).
Furthermore, they have compact support, i.e., vanish outside some compact set. We say f ∈ C0k (R),
(5.1.17)
provided f ∈ C k (R) and f has compact support. The following result is useful. Proposition 5.1.2. If f ∈ C0k (R) and u ∈ R(R), then f ∗ u ∈ C k (R), and (provided k ≥ 1) (5.1.18)
d f ∗ u(x) = f ′ ∗ u(x). dx
Proof. We start with the case k = 0, and show that f ∈ C00 (R), u ∈ R(R) =⇒ f ∗ u ∈ C(R). In fact, by (5.1.3), ∫ f ∗ u(x + h) − f ∗ u(x) =
[f (x + h − y) − f (x − y)]u(y) dy −∞ ∫ ∞ ≤ sup |f (x + h) − f (x)| |u(y)| dy, ∞
x
−∞
which clearly tends to 0 as h → 0. From here, it suffices to treat the case k = 1, since if f ∈ C0k (R), then ∈ C0k−1 (R), and one can use induction on k. Using (5.1.3), we have ∫ ∞ f ∗ u(x + h) − f ∗ u(x) (5.1.19) = gh (x − y)u(y) dy, h −∞
f′
where (5.1.20)
gh (x) =
1 [f (x + h) − f (x)]. h
We claim that (5.1.21)
f ∈ C01 (R) =⇒ gh → f ′ uniformly on R, as h → 0.
5.1. Convolutions and bump functions
Given this,
(5.1.22)
∫
∞
−∞
203
∫
f ′ (x − y)u(y) dy −∞ ∫ ∞ ′ ≤ sup |gh (x) − f (x)| |u(y)| dy, ∞
gh (x − y)u(y) dy −
−∞
x
which yields (5.1.18). It remains to prove (5.1.21). Indeed, the fundamental theorem of calculus implies ∫ 1 x+h ′ f (y) dy, (5.1.23) gh (x) = h x if h > 0, so (5.1.24)
|gh (x) − f ′ (x)| ≤
|f ′ (y) − f ′ (x)|,
sup x≤y≤x+h
if h > 0, with a similar estimate if h < 0. This yields (5.1.21).
We say (5.1.25)
f ∈ C ∞ (R) provided f ∈ C k (R) for all k,
and similarly f ∈ C0∞ (R) provided f ∈ C0k (R), for all k. It is useful to have some examples of functions in C0∞ (R). We start with the following. Set G(x) = e−1/x , 2
(5.1.26)
if x > 0, if x ≤ 0.
0,
Lemma 5.1.3. G ∈ C ∞ (R). Proof. Clearly g ∈ C k for all k on (0, ∞) and on (−∞, 0). We need to check its behavior at 0. The fact that G is continuous at 0 follows from (5.1.27)
e−y −→ 0, 2
as y → ∞.
Note that (5.1.28)
G′ (x) =
2 −1/x2 e , if x > 0, x3 0, if x < 0.
also (5.1.29)
G′ (0) = lim
h→0
G(h) = 0, h
as a consequence of (5.1.30)
ye−y −→ 0, 2
as y → ∞.
204
5. Further Topics in Analysis
Clearly G′ is continuous on (0, ∞) and on (−∞, 0). The continuity at 0 is a consequence of y 3 e−y −→ 0, 2
(5.1.31)
as y → ∞.
The existence and continuity of higher derivatives of G follows a similar pattern, making use of y k e−y −→ 0, 2
(5.1.32)
as y → ∞,
for each k ∈ N. We leave the details to the reader.
Corollary 5.1.4. Set g(x) = G(x)G(1 − x).
(5.1.33)
Then g ∈ C0∞ (R). In fact, g(x) ̸= 0 if and only if 0 < x < 1. Exercises 1. Let f ∈ R(R) satisfy (5.1.34)
∫ f ≥ 0,
f dx = 1,
and set
(x)
, n ∈ N. n Show that Proposition 5.1.1 applies to the sequence fn . (5.1.35)
fn (x) = nf
2. Take (5.1.36)
1 2 f (x) = e−x , A
∫
∞
A=
e−x dx. 2
−∞
Show that Exercise 1 applies to this case. √ Note. In [13] it is shown that A = π in (5.1.36). 3. Modify the proof of Lemma 5.1.3 to show that, if G1 (x) = e−1/x , if x > 0, 0,
if x ≤ 0,
then G1 ∈ C ∞ (R). 4. Establish whether each of the following functions is in C ∞ (R). 1 φ(x) = G(x) sin , if x ̸= 0, x 0, if x = 0.
5.2. The Weierstrass approximation theorem
205
1 ψ(x) = G1 (x) sin , if x ̸= 0, x 0, if x = 0. Here G(x) is as in (5.1.26) and G1 (x) is as in Exercise 3.
5.2. The Weierstrass approximation theorem The following result of Weierstrass is a very useful tool in analysis. Theorem 5.2.1. Given a compact interval I, any continuous function f on I is a uniform limit of polynomials. Otherwise stated, our goal is to prove that the space C(I) of continuous (real valued) functions on I is equal to P(I), the uniform closure in C(I) of the space of polynomials. We will give two proofs of this theorem. Our starting point for the first proof will be the result that the power series for (1−x)a converges uniformly on [−1, 1], for any a > 0. This is established in Appendix A.2, and we will use it, with a = 1/2. From the identity x1/2 = (1 − (1 − x))1/2 , we have x1/2 ∈ P([0, 2]). More to the point, from the identity ( )1/2 |x| = 1 − (1 − x2 ) , (5.2.1) √ √ we have |x| ∈ P([− 2, 2]). Using |x| = b−1 |bx|, for any b > 0, we see that |x| ∈ P(I) for any interval I = [−c, c], and also for any closed subinterval, hence for any compact interval I. By translation, we have (5.2.2)
|x − a| ∈ P(I)
for any compact interval I. Using the identities 1 1 1 1 (5.2.3) max(x, y) = (x + y) + |x − y|, min(x, y) = (x + y) − |x − y|, 2 2 2 2 we see that for any a ∈ R and any compact I, (5.2.4)
max(x, a), min(x, a) ∈ P(I).
We next note that P(I) is an algebra of functions, i.e., (5.2.5)
f, g ∈ P(I), c ∈ R =⇒ f + g, f g, cf ∈ P(I).
Using this, one sees that, given f ∈ P(I), with range in a compact interval J, one has h ◦ f ∈ P(I) for all h ∈ P(J). Hence f ∈ P(I) ⇒ |f | ∈ P(I), and, via (5.2.3), we deduce that (5.2.6)
f, g ∈ P(I) =⇒ max(f, g), min(f, g) ∈ P(I).
206
5. Further Topics in Analysis
Suppose now that I ′ = [a′ , b′ ] is a subinterval of I = [a, b]. With the notation x+ = max(x, 0), we have ( ) (5.2.7) fII ′ (x) = min (x − a′ )+ , (b′ − x)+ ∈ P(I). This is a piecewise linear function, equal to zero off I \ I ′ , with slope 1 from a′ to the midpoint m′ of I ′ , and slope −1 from m′ to b′ . Now if I is divided into N equal subintervals, any continuous function on I that is linear on each such subinterval can be written as a linear combination of such “tent functions,” so it belongs to P(I). Finally, any f ∈ C(I) can be uniformly approximated by such piecewise linear functions, so we have f ∈ P(I), proving the theorem. For the second proof, we bring in the sequence of functions fn defined by (5.1.14)–(5.1.15), i.e., first set ∫ 1 1 2 n (x2 − 1)n dx, (5.2.8) gn (x) = (x − 1) , An = An −1 and then set fn (x) = gn (x),
(5.2.9)
0,
|x| ≤ 1, |x| ≥ 1.
It is readily verified that such (fn ) satisfy (5.1.4)–(5.1.5). We will use this sequence in Proposition 5.1.1 to prove that if I ⊂ R is a closed, bounded interval, and f ∈ C(I), then there exist polynomials pn (x) such that pn −→ f,
(5.2.10)
uniformly on I.
To start, we note that by an affine change of variable, there is no loss of generality in assuming that [ 1 1] (5.2.11) I= − , . 4 4 Next, given I as in (5.2.11) and f ∈ C(I), it is easy to extend f to a function 1 u(x) = 0 for |x| ≥ . 2 Now, with fn as in (5.2.8)–(5.2.9), we can apply Proposition 5.1.1 to deduce that ∫ (5.2.13) un (x) = fn (y)u(x − y) dy =⇒ un → u uniformly on R.
(5.2.12)
u ∈ C(R),
Now |x| ≤ (5.2.14)
1 =⇒ u(x − y) = 0 for |y| > 1 2 ∫ =⇒ un (x) =
gn (y)u(x − y) dy,
5.2. The Weierstrass approximation theorem
207
that is, (5.2.15)
|x| ≤
1 =⇒ un (x) = pn (x), 2
where
∫ pn (x) = ∫
(5.2.16) =
gn (y)u(x − y) dy gn (x − y)u(y) dy.
The last identity makes it clear that each pn (x) is a polynomial in x. Since (5.2.13) and (5.2.15) imply [ 1 1] (5.2.17) pn −→ u uniformly on − , , 2 2 we have (5.2.10). Exercises 1. As in Exercises 1–2 of §5.1, take ∫ ∞ 1 2 2 f (x) = e−x , A = e−x dx, A −∞ (x) . fn (x) = nf n Let u ∈ C(R) vanish outside [−1, 1]. Let ε > 0 and take n ∈ N such that sup |fn ∗ u(x) − u(x)| < ε. x
Approximate fn by a sufficient partial sum of the power series ∞ n ∑ 1 ( x 2 )k fn (x) = − 2 , A k! n k=0
and use this to obtain a third proof of Theorem 5.2.1. Remark. A fourth proof of Theorem 5.2.1 is indicated in Exercise 8 of §5.4. 2. Let f be continuous on [−1, 1]. If f is odd, show that it is a uniform limit of finite linear combinations of x, x3 , x5 , . . . , x2k+1 , . . . . If f is even, show it is a uniform limit of finite linear combinations of 1, x2 , x4 , . . . , x2k , . . . . 3. If g is continuous on [−π/2, π/2], show that g is a uniform limit of finite linear combinations of sin x, sin2 x, sin3 x, . . . , sink x, . . . .
208
5. Further Topics in Analysis
Hint. Write g(x) = f (sin x) with f continuous on [−1, 1]. 4. If g is continuous on [−π, π] and even, show that g is a uniform limit of finite linear combinations of 1, cos x, cos2 x, . . . , cosk x, . . . . Hint. cos : [0, π] → [−1, 1] is a homeomorphism. 5. Assume h : R → C is continuous, periodic of period 2π, and odd, so (5.2.18)
h(x + 2π) = h(x),
h(−x) = −h(x),
∀ x ∈ R.
Show that h is a uniform limit of finite linear combinations of sin x, sin x cos x, sin x cos2 x, . . . , sin x cosk x, . . . . Hint. Given ε > 0, find δ > 0 and continuous hε , satisfying (5.2.18), such that sup |h(x) − hε (x)| < ε,
hε (x) = 0 if |x − jπ| < δ, j ∈ Z.
x
Then apply Exercise 4 to g(x) = hε (x)/ sin x.
5.3. The Stone-Weierstrass theorem A far reaching extension of the Weierstrass approximation theorem, due to M. Stone, is the following result, known as the Stone-Weierstrass theorem. Theorem 5.3.1. Let X be a compact metric space, A a subalgebra of CR (X), the algebra of real valued continuous functions on X. Suppose 1 ∈ A and that A separates points of X, i.e., for distinct p, q ∈ X, there exists hpq ∈ A with hpq (p) ̸= hpq (q). Then the closure A is equal to CR (X). We will derive this from the following lemma. Lemma 5.3.2. Let A ⊂ CR (X) satisfy the hypotheses of Theorem 5.3.1, and let K, L ⊂ X be disjoint, compact subsets of X. Then there exists gKL ∈ A such that (5.3.1)
gKL = 1 on K,
0 on L,
and 0 ≤ gKL ≤ 1 on X.
Proof of Theorem 5.3.1 To start, take f ∈ CR (X) such that 0 ≤ f ≤ 1 on X. Set (5.3.2) K = {x ∈ X : f (x) ≥ 32 },
U = {x ∈ X : f (x) > 13 },
Lemma 5.3.2 implies that there exists g ∈ A such that 1 1 on K, g = 0 on L, and 0 ≤ g ≤ . (5.3.3) g= 3 3
L = X \ U.
5.3. The Stone-Weierstrass theorem
209
Then 0 ≤ g ≤ f ≤ 1 on X, and more precisely (5.3.4)
2 0 ≤ f − g ≤ , on X. 3
We can apply such reasoning with f replaced by f − g, obtaining g2 ∈ A such that ( 2 )2 (5.3.5) 0 ≤ f − g − g2 ≤ , on X, 3 and iterate, obtaining gj ∈ A such that, for each k, (5.3.6)
0 ≤ f − g − g2 − · · · − gk ≤
( 2 )k 3
, on X.
This yields f ∈ A whenever f ∈ CR (X) satisfies 0 ≤ f ≤ 1. It is an easy step to see that f ∈ CR (X) ⇒ f ∈ A.
Proof of Lemma 5.3.2. We present the proof in six steps. Step 1. Let f ∈ A and assume φ : R → R is continuous. If sup |f | ≤ A, we can apply the Weierstrass approximation theorem to get polynomials pk → φ uniformly on [−A, A]. Then pk ◦ f → φ ◦ f uniformly on X, so φ ◦ f ∈ A. Step 2. Consequently, if fj ∈ A, then (5.3.7)
1 1 max(f1 , f2 ) = |f1 − f2 | + (f1 + f2 ) ∈ A, 2 2
and similarly min(f1 , f2 ) ∈ A. Step 3. It follows from the hypotheses that if p, q ∈ X and p ̸= q, then there exists fpq ∈ A, equal to 1 at p and to 0 at q. Step 4. Apply an appropriate continuous φ : R → R to get gpq = φ ◦ fpq ∈ A, equal to 1 on a neighborhood of p and to 0 on a neighborhood of q, and satisfying 0 ≤ gpq ≤ 1 on X. Step 5. Let L ⊂ X be compact and fix p ∈ X \ L. By Step 4, given q ∈ L, there exists gpq ∈ A such that gpq = 1 on a neighborhood Oq of p, equal to 0 on a neighborhood Ωq of q, satisfying 0 ≤ gpq ≤ 1 on X.
210
5. Further Topics in Analysis
Now {Ωq } is an open cover of L, so there exists a finite subcover Ωq1 , . . . , ΩqN . Let gpL = min gpqj ∈ A.
(5.3.8)
1≤j≤N
Taking O = ∩N 1 Oqj , an open neighborhood of p, we have (5.3.9)
gpL = 1 on O,
0 on L,
and 0 ≤ gpL ≤ 1 on X.
Step 6. Take K, L ⊂ X disjoint, compact subsets. By Step 5, for each p ∈ K, there exists gpL ∈ A, equal to 1 on a neighborhood Op of p, and equal to 0 on L. Now {Op } covers K, so there exists a finite subcover Op1 , . . . , OpM . Let gKL = max gpj L ∈ A.
(5.3.10)
1≤j≤M
We have (5.3.11)
gKL = 1 on K,
0 on L,
and 0 ≤ gKL ≤ 1 on X,
as asserted in the lemma. Theorem 5.3.1 has a complex analogue.
Theorem 5.3.3. Let X be a compact metric space, A a subalgebra (over C) of C(X), the algebra of complex valued continuous functions on X. Suppose 1 ∈ A and that A separates the points of X. Furthermore, assume f ∈ A =⇒ f ∈ A.
(5.3.12) Then the closure A = C(X).
Proof. Set AR = {f + f : f ∈ A}. One sees that Theorem 5.3.1 applies to AR . Here are a couple of applications of Theorems 5.3.1–5.3.3. Corollary 5.3.4. If X is a compact subset of Rn , then every f ∈ C(X) is a uniform limit of polynomials on Rn . Corollary 5.3.5. The space of trigonometric polynomials, given by N ∑
(5.3.13)
k=−N
is dense in
C(S 1 ).
ak eikθ ,
5.3. The Stone-Weierstrass theorem
211
Exercises 1. Prove Corollary 5.3.4. 2. Prove Corollary 5.3.5, using Theorem 5.3.3. Hint. eikθ eiℓθ = ei(k+ℓ)θ , and eikθ = e−ikθ . 3. Use the results of Exercises 4–5 in §5.2 to provide another proof of Corollary 5.3.5. Hint. Use cosk θ = ((eiθ + e−iθ )/2)k , etc. 4. Let X be a compact metric space, and K ⊂ X a compact subset. Show that A = {f |K : f ∈ C(X)} is dense in C(K). 5. In the setting of Exercise 4, take f ∈ C(K), ε > 0. Show that there exists g1 ∈ C(X) such that sup |g1 − f | ≤ ε, K
and sup |g1 | ≤ sup |f |. X
K
6. Iterate the result of Exercise 5 to get gk ∈ C(X) such that sup |gk − (f − g1 − · · · − gk−1 )| ≤ 2−k , K
sup |gk | ≤ 2−(k−1) . X
7. Use the results of Exercises 4–6 to show that, if f ∈ C(K), then there exists g ∈ C(X) such that g|K = f .
212
5. Further Topics in Analysis
5.4. Fourier series We work on T1 = R/(2πZ), which under θ 7→ eiθ is equivalent to S 1 = {z ∈ C : |z| = 1}. Given f ∈ C(T1 ), or more generally f ∈ R(T1 ) (or still more generally, if f ∈ R# (T1 ), defined as in §4.6 of Chapter 4), we set, for k ∈ Z, ∫ 2π 1 ˆ f (θ)e−ikθ dθ. (5.4.1) f (k) = 2π 0 We call fˆ(k) the Fourier coefficients of f . We say (5.4.2)
f ∈ A(T ) ⇐⇒ 1
∞ ∑
|fˆ(k)| < ∞.
k=−∞
Our first goal is to prove the following. Proposition 5.4.1. Given f ∈ C(T1 ), if f ∈ A(T1 ), then ∞ ∑ (5.4.3) f (θ) = fˆ(k)eikθ . k=−∞
∑ ˆ |f (k)| < ∞, the right side of (5.4.3) is absolutely and Proof. Given uniformly convergent, defining ∞ ∑ (5.4.4) g(θ) = fˆ(k)eikθ , g ∈ C(T1 ), k=−∞
and our task is to show that f ≡ g. Making use of the identities ∫ 2π 1 eiℓθ dθ = 0, if ℓ ̸= 0, 2π 0 (5.4.5) 1, if ℓ = 0, we get fˆ(k) = gˆ(k), for all k ∈ Z. Let us set u = f − g. We have (5.4.6)
u ∈ C(T1 ),
u ˆ(k) = 0,
∀ k ∈ Z.
It remains to show that this implies u ≡ 0. To prove this, we use Corollary 5.3.5, which implies that, for each v ∈ C(T1 ), there exist trigonometric polynomials, i.e., finite linear combinations vN of {eikθ : k ∈ Z}, such that (5.4.7)
vN −→ v uniformly on T1 .
Now (5.4.6) implies
∫ u(θ)vN (θ) dθ = 0,
∀ N,
T1
and passing to the limit, using (5.4.7), gives ∫ (5.4.8) u(θ)v(θ) dθ = 0, ∀ v ∈ C(T1 ). T1
5.4. Fourier series
213
Taking v = u gives
∫ |u(θ)|2 dθ = 0,
(5.4.9) T1
forcing u ≡ 0, and completing the proof.
We seek conditions on f that imply (5.4.2). Integration by parts for f ∈ C 1 (T1 ) gives, for k ̸= 0, ∫ 2π 1 i ∂ −ikθ fˆ(k) = f (θ) (e ) dθ 2π 0 k ∂θ (5.4.10) ∫ 2π 1 = f ′ (θ)e−ikθ dθ, 2πik 0 hence ∫ 2π 1 ˆ (5.4.11) |f (k)| ≤ |f ′ (θ)| dθ. 2π|k| 0 If f ∈ C 2 (T1 ), we can integrate by parts a second time, and get ∫ 2π 1 ˆ (5.4.12) f (k) = − f ′′ (θ)e−ikθ dθ, 2πk 2 0 hence ∫ 2π 1 |fˆ(k)| ≤ |f ′′ (θ)| dθ. 2πk 2 0 In concert with ∫ 2π 1 |f (θ)| dθ, (5.4.13) |fˆ(k)| ≤ 2π 0 which follows from (5.4.1), we have (5.4.14)
|fˆ(k)| ≤
2π(k 2
+ 1)
Hence (5.4.15)
∫
1
f ∈ C 2 (T1 ) =⇒
2π [
] |f ′′ (θ)| + |f (θ)| dθ.
0
∑
|fˆ(k)| < ∞.
We will sharpen this implication below. We start with an interesting example. Consider (5.4.16)
f (θ) = |θ|,
−π ≤ θ ≤ π,
and extend this to be periodic of period 2π, yielding f ∈ C(T1 ). We have ∫ π ∫ 1 π 1 −ikθ ˆ |θ|e dθ = θ cos kθ dθ f (k) = 2π −π π 0 (5.4.17) 1 = −[1 − (−1)k ] 2 , πk
214
5. Further Topics in Analysis
for k ̸= 0, while fˆ(0) = π/2. This is clearly a summable series, so f ∈ A(T1 ), and Proposition 5.4.1 implies that, for −π ≤ θ ≤ π, ∑ 2 π |θ| = − eikθ 2 πk 2 k odd (5.4.18) ∞ π 4∑ 1 = − cos(2ℓ + 1)θ. 2 π (2ℓ + 1)2 ℓ=0
Now, evaluating this at θ = 0 yields the identity ∞ ∑
(5.4.19)
ℓ=0
1 π2 . = (2ℓ + 1)2 8
Using this, we can evaluate ∞ ∑ 1 S= , k2
(5.4.20)
k=1
as follows. We have ∞ ∑ 1 = k2
(5.4.21)
k=1
=
∑ k≥1 odd
π2 8
+
1 4
1 + k2 ∞ ∑ ℓ=1
∑ k≥2 even
1 k2
1 , ℓ2
hence S − S/4 = π 2 /8, so (5.4.22)
∞ ∑ 1 π2 = . k2 6 k=1
We see from (5.4.17) that if f is given by (5.4.16), then fˆ(k) satisfies (5.4.23)
|fˆ(k)| ≤
C . k2 + 1
This is a special case of the following generalization of (5.4.15). Proposition 5.4.2. Let f be Lipschitz continuous and piecewise C 2 on T1 . Then (5.4.23) holds. Proof. Here we are assuming f is C 2 on T1 \ {p1 , . . . , pℓ }, and f ′ and f ′′ have limits at each of the endpoints of the associated intervals in T1 , but f is not assumed to be differentiable at the endpoints pℓ . We can write f as a sum of functions fν , each of which is Lipschitz on T1 , C 2 on T1 \ pν , and fν′ and fν′′ have limits as one approaches pν from either side. It suffices to show that each fˆν (k) satisfies (5.4.23).
5.4. Fourier series
215
Now g(θ) = fν (θ + pν − π) is singular only at θ = π, and gˆ(k) = fˆν (k)eik(pν −π) , so it suffices to prove Proposition 5.4.2 when f has a singularity only at θ = π. In other words, f ∈ C 2 ([−π, π]), and f (−π) = f (π). In this case, we still have (5.4.10), since the endpoint contributions from integration by parts still cancel. A second integration by parts gives, in place of (5.4.12), ∫ π 1 i ∂ −ikθ ˆ f (k) = f ′ (θ) (e ) dθ 2πik −π k ∂θ ∫ (5.4.24) ] 1 [ π ′′ −ikθ ′ ′ f (θ)e dθ + f (π) − f (−π) , =− 2πk 2 −π
which yields (5.4.23). We next make use of (5.4.5) to produce results on with the following.
∫ T1
|f (θ)|2 dθ, starting
Proposition 5.4.3. Given f ∈ A(T1 ), ∫ ∑ 1 |f (θ)|2 dθ. (5.4.25) |fˆ(k)|2 = 2π T1
More generally, if also g ∈ A(T1 ), ∫ ∑ 1 (5.4.26) g (k) = fˆ(k)ˆ f (θ)g(θ) dθ. 2π T1
Proof. Switching order of summation and integration and using (5.4.5), we have ∫ ∫ ∑ 1 1 f (θ)g(θ) dθ = fˆ(j)ˆ g (k)e−i(j−k)θ dθ 2π 2π T1 T1 j,k (5.4.27) ∑ g (k), = fˆ(k)ˆ k
giving (5.4.26). Taking g = f gives (5.4.25).
We will extend the scope of Proposition 5.4.3 below. Closely tied to this is the issue of convergence of SN f to f as N → ∞, where ∑ (5.4.28) SN f (θ) = fˆ(k)eikθ . |k|≤N
Clearly f ∈ A(S 1 ) ⇒ SN f → f uniformly on T1 as N → ∞. Here, we are interested in convergence in L2 -norm, where ∫ 1 2 (5.4.29) ∥f ∥L2 = |f (θ)|2 dθ. 2π T1
216
5. Further Topics in Analysis
Given f ∈ R(T1 ), this defines a “norm,” satisfying the following result, called the triangle inequality: (5.4.30)
∥f + g∥L2 ≤ ∥f ∥L2 + ∥g∥L2 .
See Appendix 5.6 for details on this. Behind these results is the fact that (5.4.31)
∥f ∥2L2 = (f, f )L2 ,
where, when f and g belong to R(T1 ), we define the inner product ∫ 1 (5.4.32) (f, g)L2 = f (θ)g(θ) dθ. 2π S1
Thus the content of (5.4.25) is that ∑ (5.4.33) |fˆ(k)|2 = ∥f ∥2L2 , and that of (5.4.26) is that ∑ (5.4.34) g (k) = (f, g)L2 . fˆ(k)ˆ The left side of (5.4.33) is the square norm of the sequence (fˆ(k)) in ℓ2 . Generally, a sequence (ak ) (k ∈ Z) belongs to ℓ2 if and only if ∑ (5.4.35) ∥(ak )∥2ℓ2 = |ak |2 < ∞. There is an associated inner product (5.4.36)
((ak ), (bk )) =
∑
ak bk .
As in (5.4.30), one has (see §5.6) (5.4.37)
∥(ak ) + (bk )∥ℓ2 ≤ ∥(ak )∥ℓ2 + ∥(bk )∥ℓ2 .
As for the notion of L2 -norm convergence, we say (5.4.38)
fν → f in L2 ⇐⇒ ∥f − fν ∥L2 → 0.
There is a similar notion of convergence in ℓ2 . Clearly (5.4.39)
∥f − fν ∥L2 ≤ sup |f (θ) − fν (θ)|. θ
In view of the uniform convergence SN f → f for f ∈ A(T1 ) noted above, we have (5.4.40)
f ∈ A(T1 ) =⇒ SN f → f in L2 , as N → ∞.
The triangle inequality implies (5.4.41) ∥f ∥L2 − ∥SN f ∥L2 ≤ ∥f − SN f ∥L2 ,
5.4. Fourier series
217
and clearly (by Proposition 5.4.3) ∥SN f ∥2L2 =
(5.4.42)
N ∑
|fˆ(k)|2 ,
k=−N
so ∥f − SN f ∥L2 → 0 as N → ∞ =⇒ ∥f ∥2L2 =
(5.4.43)
∑
|fˆ(k)|2 .
We now consider more general functions f ∈ R(T1 ). With fˆ(k) and SN f defined by (5.4.1) and (5.4.28), we define RN f by (5.4.44) Note that
∫ T1
f (θ)e−ikθ
(5.4.45)
f = SN f + RN f. ∫ dθ = T1 SN f (θ)e−ikθ dθ for |k| ≤ N , hence (f, SN f )L2 = (SN f, SN f )L2 ,
and hence (SN f, RN f )L2 = 0.
(5.4.46) Consequently, (5.4.47)
∥f ∥2L2 = (SN f + RN f, SN f + RN f )L2 = ∥SN f ∥2L2 + ∥RN f ∥2L2 .
In particular, ∥SN f ∥L2 ≤ ∥f ∥L2 .
(5.4.48)
We are now in a position to prove the following. Lemma 5.4.4. Let f, fν belong to R(T1 ). Assume lim ∥f − fν ∥L2 = 0,
(5.4.49)
ν→∞
and, for each ν, (5.4.50)
lim ∥fν − SN fν ∥L2 = 0.
N →∞
Then lim ∥f − SN f ∥L2 = 0.
(5.4.51)
N →∞
Proof. Writing f − SN f = (f − fν ) + (fν − SN fν ) + SN (fν − f ), and using the triangle inequality, we have, for each ν, (5.4.52) ∥f − SN f ∥L2 ≤ ∥f − fν ∥L2 + ∥fν − SN fν ∥L2 + ∥SN (fν − f )∥L2 . Taking N → ∞ and using (5.4.48), we have (5.4.53)
lim sup ∥f − SN f ∥L2 ≤ 2∥f − fν ∥L2 , N →∞
for each ν. Then (5.4.49) yields the desired conclusion (5.4.51).
218
5. Further Topics in Analysis
Given f ∈ C(T1 ), we have trigonometric polynomials fν → f uniformly on T1 , and clearly (5.4.50) holds for each such fν . Thus Lemma 5.4.4 yields the following. (5.4.54)
f ∈ C(T1 ) =⇒ SN f → f in L2 , and ∑ |fˆ(k)|2 = ∥f ∥2L2 .
Lemma 5.4.4 also applies to many discontinuous functions. Consider, for example f (θ) = 0
(5.4.55)
for − π < θ < 0,
1 for 0 < θ < π.
We can set, for ν ∈ N, for − π ≤ θ ≤ 0, 1 νθ for 0 ≤ θ ≤ , ν (5.4.56) 1 1 1 for ≤θ≤π− , ν ν 1 ν(π − θ) for π − ≤ θ ≤ π. ν 1 Then each fν ∈ C(T ). (In fact, fν ∈ A(T1 ), by Proposition 5.4.2.). Also, one can check that ∥f − fν ∥2L2 ≤ 2/ν. Thus the conclusion in (5.4.54) holds for f given by (5.4.55). fν (θ) = 0
More generally, any piecewise continuous function on T1 is an L2 limit of continuous functions, so the conclusion of (5.4.54) holds for them. To go further, let us consider the class of Riemann integrable functions. A function f : T1 → R is Riemann integrable provided f is bounded (say |f | ≤ M ) and, for each δ > 0, there exist piecewise constant functions gδ and hδ on T1 such that ∫ ( ) (5.4.57) gδ ≤ f ≤ hδ , and hδ (θ) − gδ (θ) dθ < δ. T1
Then (5.4.58)
hδ (θ) dθ.
gδ (θ) dθ = lim
f (θ) dθ = lim
δ→0
δ→0
T1
∫
∫
∫
T1
T1
Note that we can assume |hδ |, |gδ | < M + 1, and so ∫ ∫ 1 M +1 2 |f (θ) − gδ (θ)| dθ ≤ |hδ (θ) − gδ (θ)| dθ 2π π T1 T1 (5.4.59) M +1 δ, < π
5.4. Fourier series
219
so gδ → f in L2 -norm. A function f : T1 → C is Riemann integrable provided its real and imaginary parts are. In such a case, there are also piecewise constant functions fν → f in L2 -norm, giving the following. Proposition 5.4.5. We have (5.4.60)
f ∈ R(T1 ) =⇒ SN f → f in L2 , and ∑ |fˆ(k)|2 = ∥f ∥2L2 .
This is not the end of the story. Lemma 5.4.4 extends to unbounded functions on T1 that are square integrable, such as 1 0 ν, ∑
∥fµ − fν ∥2L2 =
|ak |2 → 0 as ν → ∞.
ν 0, C < ∞, such that |φ| ≤ δ =⇒ |f (θ + φ) − f (θ)| ≤ C|φ|α .
(5.4.90)
Proposition 5.4.9 implies the following. older continuous at θ, with Proposition 5.4.10. Let f ∈ R# (T1 ). If f is H¨ some exponent α ∈ (0, 1], then (5.4.89) holds. Proof. We have (5.4.91)
f (θ + φ) − f (θ) ≤ C ′ |φ|−(1−α) , sin φ/2
for |φ| ≤ δ. Since sin φ/2 is bounded away from 0 for φ ∈ [−π, π] \ [−δ, δ], the hypothesis (5.4.87) holds. We now look at the following class of piecewise regular functions, with jumps. Take points pj , (5.4.92)
−π = p0 < p1 < · · · < pK = π.
Take functions (5.4.93)
fj : [pj , pj+1 ] −→ C,
H¨older continuous with exponent α > 0, for 0 ≤ j ≤ K − 1. Define f : T1 → C by f (θ) = fj (θ),
if pj < θ < pj+1 ,
(5.4.94)
fj (pj+1 ) + fj+1 (pj+1 ) , if θ = pj+1 . 2 By convention, we take K ≡ 0 (recall that π ≡ −π in T1 ).
Proposition 5.4.11. With f as specified above, we have (5.4.95)
SN f (θ) −→ f (θ),
∀ θ ∈ T1 .
5.4. Fourier series
225
Proof. If θ ∈ / {p0 , . . . , pK }, this follows from Proposition 5.4.10. It remains to consider the case θ = pj for some j. Note that (5.4.96)
SN Rφ f = Rφ SN f,
where Rφ f (θ) = f (θ + φ). Hence there is no loss of generality in taking pj = 0. Parallel to (5.4.96), we have (5.4.97)
SN T f = T SN f,
T f (θ) = f (−θ).
Hence 1 SN f (0) = SN (f + T f )(0). 2 However, f + T f is H¨older continuous at θ = 0, with value 2f (0), so Proposition 5.4.10 implies 1 SN (f + T f )(0) −→ f (0), as N → ∞. (5.4.99) 2 This gives (5.4.95) for θ = pj = 0. (5.4.98)
If f is given by (5.4.92)–(5.4.94) and α ∈ (0, 1), we say f ∈ PCα (T1 ).
(5.4.100)
If instead m ∈ N and such fj in (5.4.93) belong to C m ([pj , pj+1 ]), we say f ∈ PCm (T1 ).
(5.4.101)
Let us take a closer look at the following function χ, which belongs to PC (T1 ) for all m ∈ N: m
(5.4.102)
χ(θ) = 0 1
for − π < θ < 0, for 0 < θ < π,
with χ(0) = χ(±π) = 1/2. A calculation (cf. Exercise 3 below) gives ] 1 1 [ (5.4.103) χ(0) ˆ = , χ(k) ˆ = 1 − (−1)k , k ̸= 0. 2 2πi Hence M 1 2 ∑ sin(2ℓ + 1)θ (5.4.104) SN χ(θ) = + , N = 2M + 1. 2 π 2ℓ + 1 ℓ=1
See Figures 5.4.2–5.4.3 for the graphs of SN χ(θ), with N = 11 and N = 21. These graphs illustrate the following phenomena regarding SN χ(θ). (I) If J is a closed interval in T1 that is disjoint from the points where χ jumps, then (5.4.105)
sup |SN χ(θ) − χ(θ)| −→ 0, as N → ∞. θ∈J
226
5. Further Topics in Analysis
Figure 5.4.2. Graph of SN χ(θ), N = 11.
(II) Near a point of discontinuity, SN χ(θ) overshoots χ(θ), by an amount that does not decay as N → ∞. This overshot is accompanied by an oscillatory behavior in SN χ(θ), that decays as θ moves away from the jump.
The first phenomenon is a special case of Riemann localization of convergence of Fourier series. The second is a special case of the Gibbs phenomenon. We aim to establish some results that justify these observations. For this, it will be convenient to bring in the following class of functions. Definition. Given f ∈ R(T1 ), we say f ∈ BV(T1 ) provided there exist fν ∈ C 1 (T1 ) and A < ∞ such that (5.4.106)
∥fν′ ∥L1 ≤ A,
and fν → f in L2 -norm.
We write (5.4.107)
∥f ∥TV = inf{A : (5.4.106) holds}, ∥f ∥BV = ∥f ∥sup + ∥f ∥TV .
5.4. Fourier series
227
Figure 5.4.3. Graph of SN χ(θ), N = 21.
In connection with the use of ∥f ∥sup , note that (5.4.108)
|fν (θ)| ≤ A + |fν (θ0 )|,
for all θ0 , θ ∈ T1 , and integrating over θ0 gives (5.4.109)
sup |fν | ≤ 2πA + ∥fν ∥L1 ,
hence, in the limit, if f ∈ BV(T1 ), (5.4.110)
sup |f | ≤ 2π∥f ∥TV + ∥f ∥L1 .
Example. We have (5.4.111)
PC1 (T1 ) ⊂ BV(T1 ),
and, for f ∈ PC1 (T1 ), as in (5.4.92)–(5.4.94), K−1 ∑ ∫ pj+1 (5.4.112) ∥f ∥TV = |fj′ (θ)| dθ + J(f ), j=0
pj
228
5. Further Topics in Analysis
where J(f ) is the sum of the absolute values of the jumps of f across the points pj , 0 ≤ j ≤ K − 1. In case f = χ, as in (5.4.102), ∥χ∥TV = 2,
(5.4.113)
∥χ∥BV = 3.
Here is a useful general estimate. Proposition 5.4.12. Given f ∈ BV(T1 ), k ̸= 0, |fˆ(k)| ≤
(5.4.114)
1 ∥f ∥TV . 2π|k|
Proof. Take fν as in (5.4.106). Then, for each k, fˆ(k) = lim fˆν (k).
(5.4.115)
ν→∞
Meanwhile, (5.4.116)
1 2πik
fˆν (k) =
∫
fν′ (θ)e−ikθ dθ,
T1
so (5.4.117)
|fˆν (k)| ≤
1 A, 2π|k|
∀ k ∈ Z \ 0.
Taking ν → ∞ gives (5.4.114).
We apply Proposition 5.4.12 to study the behavior of SN f given f ∈ BV(T1 ). We can write ∫ ∫ 1 1 SN f (θ) − f (θ) = bθ (φ) sin N φ dφ + gθ (φ) cos N φ dφ 2π 2π (5.4.118) T1 T1 = R1N (θ) + R2N (θ). We have an easy general estimate on R2N (θ), namely f ∈ BV(T1 ) ⇒ ∥gθ ∥TV ≡ ∥f ∥TV (5.4.119)
⇒ sup |R2N (θ)| ≤ θ
1 ∥f ∥TV . 2πN
We turn to an estimate on R2N (θ), under the hypothesis that (5.4.120)
f ∈ BV(T1 ) and f = 0 for |θ − θ0 | < a,
where θ0 ∈ T1 and a ∈ (0, π). Picking r ∈ (0, 1), we have (5.4.121)
gθ (φ) = f (θ + φ) − f (θ) = 0 for |θ − θ0 | ≤ ra, |φ| < (1 − r)a,
hence (5.4.122)
bθ (φ) = gθ (φ) cot
φ = 0, for |θ − θ0 | ≤ ra, |φ| < (1 − r)a. 2
5.4. Fourier series
229
Since the only singularity of cot φ/2 in T1 is at φ = 0, we have (5.4.123)
bθ ∈ BV(T1 ),
∀ |θ − θ0 | < a,
and (5.4.124)
∥bθ ∥BV ≤ C(r, a)∥f ∥BV ,
for |θ − θ0 | ≤ ra,
when r ∈ (0, 1). Hence, by Proposition 5.4.12, C ′ (r, a) ∥f ∥BV , for |θ − θ0 | ≤ ra, N when (5.4.120) holds. Putting this together with (5.4.119), we have the following. (5.4.125)
|R1N (θ)| ≤
Proposition 5.4.13. If f ∈ BV(T1 ) satisfies (5.4.120), then, for each r ∈ (0, 1), there exists C = C(r, a) such that C (5.4.126) sup |SN f (θ) − f (θ)| ≤ ∥f ∥BV . N |θ−θ0 |≤ra We get the same sort of conclusion if (5.4.120) is modified to read (5.4.127)
f ∈ BV(T1 ), g ∈ C 2 (T1 ), and f = g for |θ − θ0 | < a,
since g ∈ C 2 (T1 ) ⇒ |ˆ g (k)| ≤ C/k 2 . We then replace ∥f ∥BV on the right side of (5.4.126) by ∥f ∥BV + ∥g∥C 2 . This has the following consequence, explaining Phenomenon I for SN χ(θ). Corollary 5.4.14. If we take f ∈ PC2 (T1 ),
(5.4.128)
and J is a closed interval in T1 that is disjoint from the set of points where f has a jump, then there exists C = C(J, f ) such that C (5.4.129) sup |SN f (θ) − f (θ)| ≤ . N θ∈J We next address Phenomenon II. We take f ∈ BV(T1 ) and return to ∫ (5.4.130) SN f (θ) = fθ (φ)DN (φ) dφ, fθ (φ) = f (φ + θ), T1
with DN (θ) as in (5.4.83). It will be convenient to relate this to a slightly different family of operators, namely ∫ (5.4.131) SN f (θ) = fθ (φ)EN (φ) dφ, T1
with (5.4.132)
EN (φ) =
1 sin N φ . π φ
230
5. Further Topics in Analysis
Note that (5.4.133)
DN (φ) = EN (φ) +
1 1 γ(φ) sin N φ + cos N φ, 2π 2π
where (5.4.134)
γ(φ) = cot
φ 2 − 2 φ
is a smooth, odd function of φ on [−π, π] (which, as a function on T1 , has a jump at π). We have 1 ∫ C fθ (φ)γ(φ) sin N φ dφ ≤ ∥fθ γ∥TV 2π N T1 (5.4.135) C ≤ ∥f ∥BV , N and 1 ∫ C fθ (φ) cos N φ dφ ≤ ∥f ∥TV , (5.4.136) 2π N T1
hence C sup SN f (θ) − SN f (θ) ≤ ∥f ∥BV . N θ∈T1
(5.4.137)
We now specialize to f = χ, given by (5.4.102). We have ∫ 1 π−θ sin N φ (5.4.138) SN χ(θ) = dφ, if 0 ≤ θ ≤ π. π −θ φ Note that SN χ(θ) −
(5.4.139)
1 is odd in θ ∈ [−π, π], 2
so an analysis of (5.4.138) suffices. Since (sin N φ)/φ is even, we have ∫ ∫ 1 θ sin N φ 1 π−θ sin N φ (5.4.140) SN χ(θ) = dφ + dφ, π 0 φ π 0 φ for θ ∈ [0, π]. We bring in the following special function: ∫ 2 x sin y (5.4.141) G(x) = dy, π 0 y and deduce that (5.4.142)
1 1 SN χ(θ) = G(N θ) + G(N (π − θ)), for θ ∈ [0, π]. 2 2
5.4. Fourier series
231
The function G(x) is called the sine-integral. It can be evaluated accurately for x in a bounded interval by taking the power series ∞
y 2k sin y ∑ (−1)k = , y (2k + 1)!
(5.4.143)
k=0
and integrating term by term: (5.4.144)
G(x) =
∞ 2 ∑ (−1)k x2k+1 . π 2k + 1 (2k + 1)! k=0
Regarding the behavior of G(x) as x → +∞, note that (5.4.145)
G(N π) =
N −1 2 ∑ (−1)k ak , π k=0
where (5.4.146)
∫
(k+1)π
ak = kπ
| sin y| dy ↘ 0, y
as k ↗ ∞.
It follows that G(x) tends to a finite limit as x → +∞. In fact, (5.4.147)
lim G(x) = 1.
x→∞
This can be seen from
(π )
(Nπ )
1 or SN χ(0) = G(N π), 2 2 2 together with (5.4.137) and Proposition 5.4.11, which together imply (π ) (π ) 1 (5.4.149) SN χ →χ = 1, SN χ(0) → χ(0) = , as N → ∞. 2 2 2 From here, another look at (5.4.145)–(5.4.146) yields (5.4.148)
SN χ
=G
,
C , for x > 0. x See Exercise 12 below for a more precise result. (5.4.150)
|G(x) − 1| ≤
See Figure 5.4.4 for the graph of G(x). We see that G(x) overshoots its limiting value of 1. In fact, as one can check using (5.4.144)–(5.4.146), (5.4.151)
Gmax = G(π) ≈ 1.1789797444 · · · .
Combining the identity (5.4.142) with (5.4.137), we have the following incisive description of SN χ on [0, π]. Proposition 5.4.15. For χ given by (5.4.102), we have ] 1[ C (5.4.152) sup SN χ(θ) − G(N θ) + G(N (π − θ)) ≤ . 2 N 0≤θ≤π
232
5. Further Topics in Analysis
Figure 5.4.4. Graph of the sine-integral, G(x)
We can extend the analysis of Proposition 5.4.15 to negative θ using the fact that both χ(θ) − 1/2 (hence SN χ(θ) − 1/2) and G(x) are odd functions of their arguments. Combining this observation with Proposition 5.4.15 and with (5.4.150), we have the following. Corollary 5.4.16. Let I ⊂ (−π, π) be C = C(I) < ∞ such that 1 (5.4.153) sup SN χ(θ) − − 2 θ∈I
a closed interval. Then there exists 1 C G(N θ) ≤ . 2 N
We now move beyond SN χ and examine the Gibbs phenomenon for a more general class of functions with jumps. We take f ∈ PC2 (T1 ).
(5.4.154)
Assume f has a jump at θ = 0, and (5.4.155)
lim f (θ) = 0,
θ↗0
lim f (θ) = 1.
θ↘0
Let I ⊂ (−π, π) be a closed interval, containing 0, but disjoint from the other jumps of f. We can write

(5.4.156) $f - \chi = g + h,$

with h ∈ BV(𝕋¹), h ≡ 0 on a neighborhood of I, and g satisfying the hypotheses of Proposition 5.4.2. It follows from Propositions 5.4.2 and 5.4.13 that

(5.4.157) $\sup_{\theta\in I} \big| S_N(g+h)(\theta) - g(\theta) \big| \le \frac{C}{N},$

hence

(5.4.158) $\sup_{\theta\in I} \big| S_N(f-\chi)(\theta) - (f-\chi)(\theta) \big| \le \frac{C}{N}.$
Combining this with Corollary 5.4.16 yields the following.

Proposition 5.4.17. Assume f ∈ PC²(𝕋¹) satisfies (5.4.155), and let I ⊂ (−π, π) be a closed interval that is disjoint from the jumps of f other than at θ = 0. Then

(5.4.159) $S_N f(\theta) = \frac{1}{2} + \frac{1}{2}\, G(N\theta) + (f - \chi)(\theta) + O\Big(\frac{1}{N}\Big),$

uniformly in θ ∈ I.

We leave it as an exercise to extend Proposition 5.4.17 to the case where

(5.4.160) $\lim_{\theta\nearrow 0} f(\theta) = a, \qquad \lim_{\theta\searrow 0} f(\theta) = b.$
Using the fact that SN Rφ f = Rφ SN f for φ ∈ T1 , one can also analyze the Gibbs phenomenon for SN f (θ) near other jumps of f ∈ PC2 (T1 ).
Exercises

1. Prove (5.4.80).

2. Prove (5.4.97).

3. Compute χ̂(k) when

(5.4.161) $\chi(\theta) = \begin{cases} 1 & \text{for } 0 < \theta < \pi, \\ 0 & \text{for } -\pi < \theta < 0. \end{cases}$

Then use (5.4.60) to obtain another proof of (5.4.22).

4. Apply Proposition 5.4.11 to χ in (5.4.161), when θ = 0, π/2, π. Use the computation at θ = π/2 to show the following (compare Exercise 31 in Chapter 4, §4.5):

$\frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots.$

5. Apply (5.4.60) when f(θ) is given by (5.4.16). Use this to show that

$\sum_{k=1}^{\infty} \frac{1}{k^4} = \frac{\pi^4}{90}.$
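Both series in Exercises 4 and 5 can be checked by direct summation; a small numerical sketch (truncation points chosen ad hoc for illustration):

```python
import math

# Exercise 4: the Leibniz series; the partial-sum error is about 1/(4M)
M = 200000
leibniz = 4 * sum((-1) ** k / (2 * k + 1) for k in range(M))

# Exercise 5: sum of 1/k^4; the tail beyond K is about 1/(3*K^3)
K = 10000
zeta4 = sum(1.0 / k ** 4 for k in range(1, K + 1))

# leibniz approximates pi to about 5 digits; zeta4 matches pi^4/90 closely
```

Note the sharp contrast in convergence rates: the Leibniz series needs hundreds of thousands of terms for a few digits, while the 1/k⁴ series converges rapidly.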
6. Use Proposition 5.4.10 in concert with Proposition 5.4.2 to demonstrate that (5.4.3) holds when f is Lipschitz and piecewise C² on 𝕋¹, without recourse to Corollary 5.3.5 (whose proof in §5.3 uses the Stone-Weierstrass theorem). Use this in turn to prove Proposition 5.4.1, without using Corollary 5.3.5.

7. Use the results of Exercise 6 to give a proof of Corollary 5.3.5 that does not use the Stone-Weierstrass theorem.
Hint. As in the end of the proof of Theorem 5.2.1, each f ∈ C(𝕋¹) can be uniformly approximated by a sequence of Lipschitz, piecewise linear functions.

Recall that Corollary 5.3.5 states that each f ∈ C(𝕋¹) can be uniformly approximated by a sequence of finite linear combinations of the functions e^{ikθ}, k ∈ ℤ. The proof given in §5.3 relied on the Weierstrass approximation theorem, Theorem 5.2.1, which was used in the proof of Theorems 5.3.1 and
5.3.3. Exercise 7 indicates a proof of Corollary 5.3.5 that does not depend on Theorem 5.2.1.

8. Give another proof of Theorem 5.2.1, as a corollary of Corollary 5.3.5.
Hint. You can take I = [−π/2, π/2]. Given f ∈ C(I), you can extend it to f ∈ C([−π, π]), vanishing at ±π, and identify such f with an element of C(𝕋¹). Given ε > 0, approximate f uniformly to within ε on [−π, π] by a finite sum

$\sum_{k=-N}^{N} a_k e^{ik\theta}.$

Then approximate e^{ikθ} uniformly to within ε/(2N + 1) by a partial sum of the power series for e^{ikθ}, for each k ∈ {−N, . . . , N}.

9. Let f ∈ C(𝕋¹). Assume there exist f_ν ∈ A(𝕋¹) and B < ∞ such that f_ν → f uniformly on 𝕋¹ and

$\sum_{k=-\infty}^{\infty} |\hat f_\nu(k)| \le B, \qquad \forall\, \nu.$
Show that f ∈ A(𝕋¹).
Hint. Use (5.4.69), with f replaced by f_ν.

11. Recall the Dirichlet kernel D_N(θ), defined by (5.4.76). Show that

(5.4.162) $D_N(\theta) = \frac{1}{2\pi} \sum_{k=-N}^{N} \cos k\theta = \frac{1}{2\pi}\Big( 1 + 2 \sum_{k=1}^{N} \cos k\theta \Big).$
Try to find trigonometric identities, not involving the complex exponentials e^{ikθ}, that take one from (5.4.162) to the key identity (5.4.77), i.e.,

(5.4.163) $D_N(\theta) = \frac{1}{2\pi}\, \frac{\sin(N + 1/2)\theta}{\sin(\theta/2)}.$
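Though Exercise 11 asks for a trigonometric proof, the identity itself is easy to test numerically before attempting one; a quick sketch:

```python
import math

def DN_sum(theta, N):
    # right side of (5.4.162): (1/2pi) * (1 + 2 * sum of cos(k*theta))
    return (1 + 2 * sum(math.cos(k * theta) for k in range(1, N + 1))) / (2 * math.pi)

def DN_closed(theta, N):
    # closed form (5.4.163); valid away from theta = 0 (mod 2*pi)
    return math.sin((N + 0.5) * theta) / (2 * math.pi * math.sin(theta / 2))

# the two expressions agree to machine precision at generic points
for N in (1, 5, 20):
    for theta in (0.3, 1.0, 2.5, -1.7):
        assert abs(DN_sum(theta, N) - DN_closed(theta, N)) < 1e-12
```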
Remark. One point of this exercise is to highlight the advantages of being able to work with complex-valued functions, which, in (5.4.78)–(5.4.79), allow one to reduce the calculation of DN (θ) to a simple geometric series. The reader is challenged to establish (5.4.163) without benefit of this use of complex numbers. Feel free to throw in the towel if you get stuck! 12. Recall the sine-integral
$G(x) = \frac{2}{\pi} \int_0^x \frac{\sin y}{y}\, dy.$

Using (5.4.147) and integration by parts, show that, for x > 0,

$1 - G(x) = \frac{2}{\pi} \int_x^\infty \frac{\sin y}{y}\, dy
= \frac{2}{\pi} \Big[ \frac{\cos x}{x} - \int_x^\infty \frac{\cos y}{y^2}\, dy \Big]
= \frac{2}{\pi} \Big[ \frac{\cos x}{x} + \frac{\sin x}{x^2} - 2 \int_x^\infty \frac{\sin y}{y^3}\, dy \Big].$
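The integration-by-parts identities in Exercise 12 give an explicit tail bound: since |sin|, |cos| ≤ 1 and ∫_x^∞ y⁻³ dy = 1/(2x²), they imply |1 − G(x)| ≤ (2/π)(1/x + 2/x²). A numerical sketch (Simpson's rule; the sample points and step count are ad hoc):

```python
import math

def G(x, n=40000):
    # Simpson's rule for G(x) = (2/pi) * integral_0^x sin(y)/y dy
    f = lambda y: 1.0 if y == 0.0 else math.sin(y) / y
    h = x / n
    s = f(0.0) + f(x) + sum((4 if i % 2 else 2) * f(i * h) for i in range(1, n))
    return (2.0 / math.pi) * s * h / 3.0

# check the bound |1 - G(x)| <= (2/pi) * (1/x + 2/x^2) at a few points
for x in (5.0, 20.0, 80.0):
    bound = (2.0 / math.pi) * (1.0 / x + 2.0 / x ** 2)
    assert abs(1.0 - G(x)) <= bound
```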
5.5. Newton's method

Here we describe a method to approximate the solution to

(5.5.1) $f(\xi) = 0.$
We assume f : [a, b] → ℝ is continuous and f ∈ C²((a, b)). We assume it is known that f vanishes somewhere in (a, b). For example, f(a) and f(b) might have opposite signs. We take x₀ ∈ (a, b) as an initial guess of a solution to (5.5.1), and inductively construct the sequence (x_k), going from x_k to x_{k+1} as follows. Replace f by its best linear approximation at x_k,

(5.5.2) $g(x) = f(x_k) + f'(x_k)(x - x_k),$

and solve g(x_{k+1}) = 0. This yields

(5.5.3) $x_{k+1} - x_k = -\frac{f(x_k)}{f'(x_k)},$

or

(5.5.4) $x_{k+1} = x_k - \frac{f(x_k)}{f'(x_k)}.$
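As a concrete illustration (the example function is our choice, not from the text), the iteration (5.5.4) applied to f(x) = x² − 2 on [1, 2] converges to √2 in a handful of steps:

```python
import math

def newton(f, fprime, x0, steps=6):
    """Iterate (5.5.4): x_{k+1} = x_k - f(x_k) / f'(x_k)."""
    x = x0
    for _ in range(steps):
        x = x - f(x) / fprime(x)
    return x

# f(x) = x^2 - 2 has the root xi = sqrt(2) in (1, 2); here f' = 2x > 0 on [1, 2]
root = newton(lambda x: x * x - 2, lambda x: 2 * x, x0=1.5)
# root agrees with sqrt(2) to machine precision
```

Tracking the successive errors |x_k − √2| (about 9e-2, 2e-3, 1.3e-6, then machine precision) shows the error roughly squaring at each step, as the estimate derived below predicts.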
See Figure 5.5.1 for an illustration. Naturally, we need to assume f′(x) is bounded away from 0 on (a, b). This production of the sequence (x_k) is Newton's method, and as we will see, under appropriate hypotheses it converges quite rapidly to ξ.

We want to give a condition guaranteeing that |x_{k+1} − ξ| < |x_k − ξ|. Say

(5.5.5) $x_k = \xi + \delta.$

Then (5.5.4) yields

(5.5.6) $x_{k+1} - \xi = \delta - \frac{f(\xi+\delta)}{f'(\xi+\delta)} = \frac{f'(\xi+\delta)\delta - f(\xi+\delta)}{f'(\xi+\delta)}.$

Now the mean value theorem implies

(5.5.7) $f(\xi+\delta) - f(\xi) = f'(\xi+\tau\delta)\delta, \quad \text{for some } \tau \in (0,1).$

Since f(ξ) = 0, we get from (5.5.6) that

(5.5.8) $x_{k+1} - \xi = \frac{f'(\xi+\delta) - f'(\xi+\tau\delta)}{f'(\xi+\delta)}\, \delta.$

A second application of the mean value theorem gives

(5.5.9) $f'(\xi+\delta) - f'(\xi+\tau\delta) = (1-\tau)\delta f''(\xi+\gamma\delta),$
Figure 5.5.1. One iteration of Newton’s method
for some γ ∈ (τ, 1), hence

(5.5.10) $x_{k+1} - \xi = (1-\tau)\, \frac{f''(\xi+\gamma\delta)}{f'(\xi+\delta)}\, \delta^2, \qquad \tau \in (0,1),\ \gamma \in (\tau,1).$

Consequently,

(5.5.11) $|x_{k+1} - \xi| \le \sup_{0<\gamma<1} \Big| \frac{f''(\xi+\gamma\delta)}{f'(\xi+\delta)} \Big|\, \delta^2.$

Appendix A.2 establishes the absolute and uniform convergence on [−1, 1] of the power series of (1 − x)^b, for b > 0. This is useful in the proof of the Weierstrass approximation theorem in Chapter 5. Appendix A.3 shows that π² is irrational. Appendix A.4 discusses a method of numerically evaluating π that goes back to Archimedes. Appendix A.5 discusses calculations of π using arctangents. Appendix A.6 treats the power series for tan x, whose coefficients require a more elaborate derivation than those for sin x and cos x. Appendix A.7 discusses a theorem of Abel, giving the optimal condition under which a power series in t with radius of convergence 1 can be shown to converge uniformly in t ∈ [0, 1], as well as related issues regarding convergence of infinite series. Appendix A.8 discusses the existence of continuous functions on ℝ that are nowhere differentiable.
A.1. The fundamental theorem of algebra

The following result is the fundamental theorem of algebra.

Theorem A.1.1. If p(z) is a nonconstant polynomial (with complex coefficients), then p(z) must have a complex root.

Proof. We have, for some n ≥ 1, a_n ≠ 0,

(A.1.1) $p(z) = a_n z^n + \cdots + a_1 z + a_0 = a_n z^n \big( 1 + O(z^{-1}) \big), \qquad |z| \to \infty,$
which implies

(A.1.2) $\lim_{|z|\to\infty} |p(z)| = \infty.$

Picking R ∈ (0, ∞) such that

(A.1.3) $\inf_{|z|\ge R} |p(z)| > |p(0)|,$

we deduce that

(A.1.4) $\inf_{z\in\mathbb{C}} |p(z)| = \inf_{|z|\le R} |p(z)|.$

Since D_R = {z : |z| ≤ R} is compact and p is continuous, there exists z₀ ∈ D_R such that

(A.1.5) $|p(z_0)| = \inf_{z\in\mathbb{C}} |p(z)|.$
The theorem hence follows from:
Lemma A.1.2. If p(z) is a nonconstant polynomial and (A.1.5) holds, then p(z₀) = 0.

Proof. Suppose to the contrary that

(A.1.6) $p(z_0) = a \ne 0.$

We can write

(A.1.7) $p(z_0 + \zeta) = a + q(\zeta),$

where q(ζ) is a (nonconstant) polynomial in ζ, satisfying q(0) = 0. Hence, for some k ≥ 1 and b ≠ 0, we have q(ζ) = bζ^k + ··· + b_n ζ^n, i.e.,

(A.1.8) $q(\zeta) = b\zeta^k + O(\zeta^{k+1}), \qquad \zeta \to 0,$

so, uniformly on S¹ = {ω : |ω| = 1},

(A.1.9) $p(z_0 + \varepsilon\omega) = a + b\omega^k \varepsilon^k + O(\varepsilon^{k+1}), \qquad \varepsilon \searrow 0.$

Pick ω ∈ S¹ such that

(A.1.10) $\frac{b}{|b|}\, \omega^k = -\frac{a}{|a|},$

which is possible since a ≠ 0 and b ≠ 0. In more detail, since −(a/|a|)(|b|/b) ∈ S¹, Euler's identity implies

$-\frac{a}{|a|}\, \frac{|b|}{b} = e^{i\theta},$

for some θ ∈ ℝ, so we can take ω = e^{iθ/k}.
Given (A.1.10),

(A.1.11) $p(z_0 + \varepsilon\omega) = a\Big( 1 - \Big|\frac{b}{a}\Big|\varepsilon^k \Big) + O(\varepsilon^{k+1}),$

which contradicts (A.1.5) for ε > 0 small enough. Thus (A.1.6) is impossible. This proves Lemma A.1.2, hence Theorem A.1.1.
Now that we have shown that p(z) in (A.1.1) must have one root, we can show it has n roots (counting multiplicity).

Proposition A.1.3. For a polynomial p(z) of degree n, as in (A.1.1), there exist r₁, . . . , r_n ∈ ℂ such that

(A.1.12) $p(z) = a_n (z - r_1) \cdots (z - r_n).$

Proof. We have shown that p(z) has one root; call it r₁. Dividing p(z) by z − r₁, we have

(A.1.13) $p(z) = (z - r_1)\tilde p(z) + q,$

where p̃(z) = a_n z^{n−1} + ··· + ã₀ and q is a polynomial of degree < 1, i.e., a constant. Setting z = r₁ in (A.1.13) yields q = 0, so

(A.1.14) $p(z) = (z - r_1)\tilde p(z).$

Since p̃(z) is a polynomial of degree n − 1, the result (A.1.12) follows by induction on n.

The numbers r_j, 1 ≤ j ≤ n, in (A.1.12) are called the roots of p(z). If k of them coincide (say with r_ℓ) we say r_ℓ is a root of multiplicity k. If r_ℓ is distinct from r_j for all j ≠ ℓ, we say r_ℓ is a simple root.
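The factorization (A.1.12) can be illustrated numerically: choose a leading coefficient and roots (the values below are arbitrary), form the product, and check that the resulting polynomial vanishes exactly at the chosen roots while growing like |a_n||z|^n for large |z|, as in (A.1.1)–(A.1.2).

```python
# Illustration of (A.1.12): a polynomial is determined by its leading
# coefficient and its roots. These roots are chosen arbitrarily.
an = 2.0
roots = [1.0, -0.5 + 0.5j, 2j]

def p(z):
    # p(z) = an * (z - r1) * (z - r2) * (z - r3)
    val = an
    for r in roots:
        val *= (z - r)
    return val

# p vanishes at each root r_j ...
for r in roots:
    assert abs(p(r)) < 1e-12
# ... and |p(z)| is large for large |z|, consistent with (A.1.2)
assert abs(p(100.0)) > 1e5
```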
A.2. More on the power series of (1 − x)^b

In §4.3 of Chapter 4, we showed that

(A.2.1) $(1-x)^b = \sum_{k=0}^{\infty} \frac{a_k}{k!}\, x^k,$

for |x| < 1, with

(A.2.2) $a_0 = 1, \qquad a_k = \prod_{\ell=0}^{k-1} (-b+\ell), \quad \text{for } k \ge 1.$

There we required b ∈ ℚ, but in §4.5 of Chapter 4 we defined y^b, for y > 0, for all b ∈ ℝ (and for y ≥ 0 if b > 0), and noted that such a result extends. Here, we prove a further result, when b > 0.
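The coefficients a_k/k! in (A.2.1)–(A.2.2) satisfy a simple one-term recursion, so the endpoint behavior established below can be checked numerically: at x = 1 and x = −1 the series should approach (1−1)^b = 0 and 2^b. A sketch with b = 1/2 (chosen arbitrarily; the truncation point K is ad hoc):

```python
import math

b = 0.5
K = 40000
c = 1.0            # c_k = a_k / k!, starting from c_0 = 1
sum_abs = abs(c)   # running sum of |a_k| / k!
at_plus1 = c       # partial sum of (A.2.1) at x = +1
at_minus1 = c      # partial sum of (A.2.1) at x = -1
sign = 1
for k in range(K):
    c *= (k - b) / (k + 1)   # from (A.2.2): a_{k+1} = (k - b) * a_k
    sum_abs += abs(c)
    at_plus1 += c
    sign = -sign
    at_minus1 += sign * c
# sum_abs stays bounded (for b = 1/2 it approaches 2);
# at_plus1 -> (1-1)^b = 0, and at_minus1 -> 2^b = sqrt(2)
```

The convergence at x = 1 is slow (the terms decay like k^{-1-b}), which is why the truncation K is taken large.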
Proposition A.2.1. Given b > 0, a_k as in (A.2.2), the identity (A.2.1) holds for x ∈ [−1, 1], and the series converges absolutely and uniformly on [−1, 1].

Proof. Our main task is to show that

(A.2.3) $\sum_{k=0}^{\infty} \frac{|a_k|}{k!} < \infty,$

if b > 0. This implies that the right side of (A.2.1) converges absolutely and uniformly on [−1, 1] and its limit, g(x), is continuous on [−1, 1]. We already know that g(x) = (1 − x)^b on (−1, 1), and since both sides are continuous on [−1, 1], the identity also holds at the endpoints. Now, if k − 1 > b,

(A.2.4) $\frac{a_k}{k!} = -\frac{b}{k} \prod_{1\le\ell\le b} \Big(1 - \frac{b}{\ell}\Big) \prod_{b<\ell\le k-1} \Big(1 - \frac{b}{\ell}\Big),$

We substitute y = b/(2x) into the first equation, obtaining
(A.4.10) $x^2 - \frac{b^2}{4x^2} = a,$
then set u = x² and get

(A.4.11) $u^2 - au - \frac{b^2}{4} = 0,$

whose positive solution is

(A.4.12) $u = \frac{a}{2} + \frac{1}{2}\sqrt{a^2 + b^2}.$
Then

(A.4.13) $x = \sqrt{u}, \qquad y = \frac{b}{2\sqrt{u}}.$
Taking a = C_k, b = S_k, and knowing that C_k² + S_k² = 1, we obtain

(A.4.14) $S_{k+1} = \frac{S_k}{2\sqrt{U_k}},$
with

(A.4.15) $U_k = \frac{1 + C_k}{2} = \frac{1 + \sqrt{1 - S_k^2}}{2}.$
Then

(A.4.16) $\text{Area}\, P_{2^k} = 2^{k-1} S_k.$

Alternatively, with P_k = Area P_{2^k}, we have

(A.4.17) $P_{k+1} = \frac{P_k}{\sqrt{U_k}}.$

As we show below, π is approximated to 15 digits of accuracy in 25 iterations of (A.4.14)–(A.4.17), starting with S₂ = 1 and P₂ = 2.

First, we take a closer look at the error estimate in (A.4.3). Note that

(A.4.18) $\pi - \text{Area}\, P_n = \frac{n}{2} \Big( \frac{2\pi}{n} - \sin\frac{2\pi}{n} \Big),$

and that

(A.4.19) $\delta - \sin\delta = \frac{\delta^3}{3!} - \frac{\delta^5}{5!} + \cdots < \frac{\delta^3}{3!}, \quad \text{for } 0 < \delta < 6,$

so

(A.4.20) $\pi - \text{Area}\, P_n < \frac{2\pi^3}{3} \cdot \frac{1}{n^2}, \quad \text{for } n \ge 2.$

Thus we can take c = 2π³/3 in (A.4.3) for n ≥ 2, and this is asymptotically sharp. From (A.4.20) with n = 2²⁵, we have

(A.4.21) $\pi - P_{25} < \frac{2\pi^3}{3} \cdot 2^{-50}.$
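The doubling recursion (A.4.14)–(A.4.17) is easy to run; a sketch in Python, starting from the inscribed square (S₂ = 1, P₂ = 2) and iterating up to the 2²⁵-gon:

```python
import math

S = 1.0   # S_2 = sin(2*pi/2^2) = 1, for the inscribed square
P = 2.0   # P_2 = area of the inscribed square
for k in range(2, 25):                          # 23 doublings: P_2 -> P_25
    U = (1.0 + math.sqrt(1.0 - S * S)) / 2.0    # (A.4.15)
    S = S / (2.0 * math.sqrt(U))                # (A.4.14)
    P = P / math.sqrt(U)                        # (A.4.17)
# P is now the area of the inscribed 2^25-gon, within roughly 2e-14 of pi
```

Each pass through the loop doubles the number of sides; the error estimate (A.4.20) predicts the gap π − P shrinks by a factor of 4 per step, which the computed values confirm.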