English Pages 640 [733] Year 1993
Introduction: Sets and Functions

The student who wishes to use this book successfully should have a sound background in elementary calculus and linear algebra and some exposure to multivariable calculus. Adequate preparation is normally obtained from two years of undergraduate mathematics. Also required is a basic knowledge of sets and functions, for which the necessary concepts are summarized in this introduction. This material should be read briefly and then consulted as needed.

Set theory is the starting point for much of mathematics and is itself a vast and complicated subject. For brevity and better understanding, we begin our study somewhat intuitively. The reader who is interested in the subtleties of set theory can consult the supplement at the end of this introduction.

A set is a collection of "objects" or "things" called members of the set. For example, the collection of positive integers $1, 2, 3, \ldots$ forms a set. Likewise, the rational numbers (fractions) $p/q$ form a set. If $S$ is a set, and $x$ is a member of $S$, we write $x \in S$. A subset of the set $S$ is a set $A$ such that every element of $A$ is also a member of $S$; symbolically, this relationship is denoted $(x \in A) \Rightarrow (x \in S)$, where the symbol $\Rightarrow$ denotes "implies." When $A$ is a subset of $S$, we write $A \subset S$. Sometimes the notation $A \subseteq S$ is used for what we denote as $A \subset S$. We can also define equality of sets by stating that $A = B$ means $A \subset B$ and $B \subset A$; that is, $A$ and $B$ have the same elements.

The empty set, denoted $\emptyset$, is the set with no members. For example, the set of integers $n$ such that $n^2 = -1$ is empty.

One method of specifying a set is to list its members in braces. Thus we write $\mathbb{N} = \{1, 2, 3, \ldots\}$ to denote the set of all positive integers and $\mathbb{Z} = \{\ldots, -3, -2, -1, 0, 1, 2, 3, \ldots\}$ for the set of all integers. An example of a subset of $\mathbb{N}$ is the set of even numbers; it is written $A = \{2, 4, 6, \ldots\} = \{x \in \mathbb{N} \mid x \text{ is even}\} \subset \mathbb{N}$.
We read $\{x \in \mathbb{N} \mid x \text{ is even}\}$ as "the set of all members $x$ of $\mathbb{N}$ such that $x$ is even."

Here is an important notational distinction. If $S$ is a set and $a \in S$, then $\{a\}$ denotes the subset of $S$ consisting of the single element $a$. Thus $\{a\} \subset S$, while $a \in S$.

Let $S$ be a given set and let $A \subset S$ and $B \subset S$. Define $A \cup B = \{x \in S \mid x \in A \text{ or } x \in B\}$, which is read "the set of all $x \in S$ that are members of $A$ or $B$ (or both)." The set $A \cup B$ is called the union of $A$ and $B$. Similarly, one can form the union of a family of sets. For example, let $A_1, A_2, \ldots$ be subsets of $S$ and let $\bigcup_{i=1}^{\infty} A_i = \{x \in S \mid x \in A_i \text{ for some } i\}$; this is sometimes written $\bigcup \{A_1, A_2, A_3, \ldots\}$. Note that $A \cup B$ is the special case with $A_1 = A$, $A_2 = B$, and $A_i = \emptyset$ for $i > 2$. Similarly, we form the intersections $A \cap B = \{x \in S \mid x \in A \text{ and } x \in B\}$ and $\bigcap_{i=1}^{\infty} A_i = \{x \in S \mid x \in A_i \text{ for all } i\}$. Figure I-1 illustrates these operations.
FIGURE I-1 (a) Subset; (b) union; (c) intersection
For $A \subset S$ and $B \subset S$, we form the complement of $A$ relative to $B$ by defining $B \setminus A = \{x \in B \mid x \notin A\}$, where $x \notin A$ means $x$ is not contained in $A$. See Figure I-2.
FIGURE I-2 Complement
As in Worked Example I.1 at the end of this introduction, we see that $B \setminus (A_1 \cup A_2) = (B \setminus A_1) \cap (B \setminus A_2)$ and that $B \setminus (A_1 \cap A_2) = (B \setminus A_1) \cup (B \setminus A_2)$ for any sets $A_1, A_2, B \subset S$. This is an example of a "set identity." Other examples are given in the exercises.

Given sets $A$ and $B$, define the Cartesian product $A \times B$ of $A$ and $B$ to be the set of all ordered pairs $(a, b)$ with $a \in A$ and $b \in B$; i.e., $A \times B = \{(a, b) \mid a \in A \text{ and } b \in B\}$. See Figure I-3.
FIGURE I-3 Cartesian product
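The set operations above can be tried out on small finite sets. The following is a minimal Python sketch, not part of the original text; the particular sets `S`, `A1`, `A2`, and `B` are hypothetical choices made only for illustration. It forms a union, an intersection, and a Cartesian product, and checks the two set identities for relative complements on this one example (which is of course not a proof).

```python
from itertools import product

# Hypothetical ambient set and subsets, chosen only for illustration.
S = set(range(1, 11))
A1 = {n for n in S if n % 2 == 0}   # even members of S
A2 = {n for n in S if n % 3 == 0}   # multiples of 3 in S
B = {n for n in S if n <= 8}

union = A1 | A2
intersection = A1 & A2

# The two set identities for relative complements (B \ A is written B - A).
lhs_union = B - (A1 | A2)
rhs_union = (B - A1) & (B - A2)
lhs_inter = B - (A1 & A2)
rhs_inter = (B - A1) | (B - A2)

# Cartesian product of two small sets as a set of ordered pairs.
cartesian = set(product({1, 2}, {"a", "b"}))

print(sorted(union), sorted(intersection))
print(lhs_union == rhs_union, lhs_inter == rhs_inter)
print(sorted(cartesian))
```

Python's built-in `set` type models finite sets directly, so each line mirrors one definition from the text.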
Let $S$ and $T$ be given sets. A function $f : S \to T$ consists of two sets $S$ and $T$ together with a "rule" that assigns to each $x \in S$ a specific element of $T$, denoted $f(x)$. One often writes $x \mapsto f(x)$ to denote that $x$ is mapped to the element $f(x)$. For example, the function $f(x) = x^2$ may be specified by saying $x \mapsto x^2$. Figure I-4 depicts this function with $S = T$ the set of all real numbers; this set is denoted $\mathbb{R}$ and will be introduced carefully in Chapter 1. For now, we use it informally.

FIGURE I-4 The function $x \mapsto x^2$
Note. In this book, the terms "mapping," "map," "function," and "transformation" are all synonymous.

For a function $f : S \to T$, the set $S$ is called the domain or source of $f$ and $T$ is called the target of $f$. The range, or image, of $f$ is the subset of $T$ defined by $f(S) = \{f(x) \in T \mid x \in S\}$. The graph of $f$ is the set $\{(x, f(x)) \in S \times T \mid x \in S\}$, as in Figure I-5.
FIGURE I-5 Graph of a function
Someone paying careful attention to logical foundations may object to using colloquial language, such as "rule," and would be happier to define a function from $S$ to $T$ as a subset $R$ of $S \times T$ with these two properties:

1. Each member of $S$ occurs as the first component of some member of $R$, and
2. Two members of $R$ with the same first component are identical; that is, the first component $x$ determines the second component $f(x)$, as in Figure I-5.
A function $f : S \to T$ is called one-to-one or an injection if whenever $x_1 \ne x_2$, then $f(x_1) \ne f(x_2)$. Thus a function is one-to-one when no two distinct elements of $S$ are mapped to the same element of $T$. Equivalently, $f$ is one-to-one when for each $y \in T$, the equation $f(x) = y$ has at most one solution $x \in S$. An extreme example of a function which is not one-to-one (if $S$ has more than one element) is a constant function, a function $f : S \to T$ such that $f(x_1) = f(x_2)$ for all $x_1, x_2 \in S$. See Figure I-6.

Note. In definitions it is a convention that "if" stands for "if and only if." The latter is often written "iff," or $\Leftrightarrow$. In theorems it is absolutely necessary to distinguish between "if," "only if," and "iff."

We say that $f : S \to T$ is onto, or is a surjection, when for every $y \in T$, there is an $x \in S$ such that $f(x) = y$, in other words, when the range equals the target.
FIGURE I-6 Constant function
Another way of saying this is that for each $y \in T$, the equation $f(x) = y$ has at least one solution $x \in S$.

It should be noted that the choice of $S$ and $T$ is part of the definition of $f$, and whether or not $f$ is one-to-one or onto depends on that choice. For example, let $f$ be defined by $f(x) = x^2$. Then $f$ is one-to-one and onto when $S$ and $T$ consist of all real numbers $x$ such that $x \ge 0$, is one-to-one but not onto when $S$ is all those $x$ such that $x \ge 0$ and $T$ is all real numbers, and is neither when $S$ and $T$ consist of all real numbers.

For $f : S \to T$ and $A \subset S$, we define $f(A) = \{f(x) \in T \mid x \in A\}$, and for $B \subset T$ we define $f^{-1}(B)$ to be the set $\{x \in S \mid f(x) \in B\}$. We call $f(A)$ the image of $A$ under $f$ and $f^{-1}(B)$ the inverse image, or preimage, of $B$ under $f$.
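For finite sets, the definitions of one-to-one, onto, image, and preimage can be checked mechanically. The sketch below is an illustration only: the helper names `image`, `preimage`, `is_one_to_one`, and `is_onto` are hypothetical, and the finite sets stand in for the real-number domains discussed above. It shows how the same rule $x \mapsto x^2$ changes character as the source and target change.

```python
def image(f, A):
    """f(A) = {f(x) : x in A}."""
    return {f(x) for x in A}

def preimage(f, S, B):
    """f^{-1}(B) = {x in S : f(x) in B}; f need not be invertible."""
    return {x for x in S if f(x) in B}

def is_one_to_one(f, S):
    """On a finite set S, f is one-to-one iff no two elements share a value."""
    return len(image(f, S)) == len(S)

def is_onto(f, S, T):
    """f is onto T iff the range f(S) equals the target T."""
    return image(f, S) == set(T)

def square(x):
    return x * x

# Finite stand-ins for the three choices of source and target in the text.
nonneg = {0, 1, 2}
both_signs = {-2, -1, 0, 1, 2}
squares = {0, 1, 4}
bigger_target = {0, 1, 2, 3, 4}

print(is_one_to_one(square, nonneg), is_onto(square, nonneg, squares))
print(is_one_to_one(square, nonneg), is_onto(square, nonneg, bigger_target))
print(is_one_to_one(square, both_signs), is_onto(square, both_signs, bigger_target))
print(preimage(square, both_signs, {4}))
```

The preimage of $\{4\}$ has two elements here, which illustrates that $f^{-1}(B)$ makes sense even when $f$ is not one-to-one.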
Note. We can form $f^{-1}(B)$ for a set $B \subset T$ even though $f$ might not be one-to-one or onto.
If $f : S \to T$ is one-to-one and onto, then for each $y \in T$ there is a unique solution $x \in S$ to $f(x) = y$. Thus there is a unique function, denoted $f^{-1} : T \to S$ (not to be confused with the operation $f^{-1}(B)$ defined in the previous paragraph or $1/f$), such that $f(f^{-1}(y)) = y$ for all $y \in T$ and $f^{-1}(f(x)) = x$ for all $x \in S$. We call $f^{-1}$ the inverse function of $f$. A one-to-one and onto map is also called a bijection, or a one-to-one correspondence.
Note. In calculus we learn how important the choice of domain (source) is in forming the inverse function. For instance, to form $\sin^{-1} = \arcsin$, we cut the domain and regard $\sin$ as a map $\sin : [-\pi/2, \pi/2] \to [-1, 1]$ on which it is a bijection. Then $\sin^{-1} : [-1, 1] \to [-\pi/2, \pi/2]$ is defined. Consult your calculus text for more examples.
The map $f : S \to S$ defined by $f(x) = x$ for all $x \in S$ is called the identity mapping on $S$. One should distinguish the identity mappings for different sets. For example, one sometimes uses the notation $1_S$ for the identity mapping on $S$. Clearly, $1_S$ is one-to-one and onto.
For two functions $f : S \to T$ and $g : T \to U$, the composition $g \circ f : S \to U$ is defined by $(g \circ f)(x) = g(f(x))$, as shown in Figure I-7. For example, if $f : \mathbb{R} \to \mathbb{R}$ is defined by $f(x) = x^2$ and $g : \mathbb{R} \to \mathbb{R}$ by $g(x) = x + 3$, then $g \circ f : x \mapsto x^2 + 3$ and $f \circ g : x \mapsto (x + 3)^2$ (here $S$, $T$, and $U$ consist of all real numbers). In particular, note that $f \circ g \ne g \circ f$.

FIGURE I-7 Composition of mappings
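The example $f(x) = x^2$, $g(x) = x + 3$ can be reproduced in a few lines. This is a minimal sketch; the helper name `compose` is a hypothetical choice, not notation from the text.

```python
def compose(g, f):
    """Return the composition g o f, i.e. the map x -> g(f(x))."""
    return lambda x: g(f(x))

def f(x):
    return x ** 2

def g(x):
    return x + 3

g_of_f = compose(g, f)   # x -> x^2 + 3
f_of_g = compose(f, g)   # x -> (x + 3)^2

for x in (0, 1, 2):
    print(x, g_of_f(x), f_of_g(x))
```

Evaluating both compositions at a single point such as $x = 2$ already shows that they differ, so the order of composition matters.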
Note. In calculus, we learn that compositions are important for, among other things, the chain rule. The same is true in this book.

Sometimes we wish to restrict our attention to just some elements on which a function is defined. This process is called restricting a function. More formally, if we have a mapping $f : S \to T$ and $A \subset S$, we consider a new function denoted $f|A : A \to T$ defined by $(f|A)(x) = f(x)$ for all $x \in A$. We call $f|A$ the restriction of $f$ to $A$ and $f$ an extension of $f|A$.

A set $A$ is called finite if we can display all of its elements as follows: $A = \{a_1, a_2, \ldots, a_n\}$ for some integer $n$. A set that is not finite is called infinite. For example, the set of all positive integers $\mathbb{N} = \{1, 2, \ldots\}$ is an infinite set. It can be difficult to decide if one infinite set has more elements than another infinite set. For instance, it is not clear at first if there are more rational or irrational numbers. To make this notion precise, we say that two sets $A$ and $B$ have the same number of elements (or have the same cardinality) if there exists a mapping $f : A \to B$ that is one-to-one and onto. If an infinite set has the same number of elements as the set of positive integers $\{1, 2, \ldots\}$, then it is called denumerable. A set that is either finite or denumerable is said to be countable; otherwise, it is called uncountably infinite, or just uncountable. An example of an uncountable set is the set of all real numbers between 0 and 1. (We shall prove this in Chapter 1.)

Let $S$ be a set. A sequence in $S$ is a mapping $f : \mathbb{N} \to S$. Thus we associate to each integer $n$ an element of $S$, namely $f(n)$. One often suppresses the fact
that we have a function by simply considering a sequence as the image elements, say $x_1, x_2, x_3, \ldots$, or alternatively we write "the sequence $x_n$" or $(x_n)_{n=1}^{\infty}$. We call $y_1, y_2, \ldots$ a subsequence of $x_1, x_2, \ldots$ if there is a function $g : \mathbb{N} \to \mathbb{N}$ such that for every $i \in \mathbb{N}$, $y_i = x_{g(i)}$, and if for $i < j$, $g(i) < g(j)$. In other words, a subsequence is obtained by "throwing out" elements of the original sequence and naturally ordering the elements that remain. For example, the sequence $y_n = (2n)^2$ is a subsequence of $x_n = n^2$. Here $g(n) = 2n$. Sometimes one writes a subsequence of $x_n$ as $x_{n_i}$, where the notation $g(i) = n_i$ reminds us that the $n_i$ are chosen from among the $n$'s.

An important method for proving statements indexed by the positive integers $\mathbb{N}$ is the technique of induction. A property $P(i)$ is true for all $i \in \mathbb{N}$ if:
1. $P(1)$ is true (base case), and
2. For every $n \in \mathbb{N}$, if $P(n)$ is true then $P(n + 1)$ is true (induction step).

The same technique also applies to $\{0, 1, \ldots\}$ with the base case replaced by:

1'. $P(0)$ is true.
We shall have more to say about the basis for the natural numbers in §1.1.
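The subsequence example given earlier in this introduction, $y_n = (2n)^2$ inside $x_n = n^2$ with index map $g(n) = 2n$, can be spelled out as follows. This is a small sketch with hypothetical function names `x`, `g`, and `y`; it checks only finitely many indices, so it illustrates the definition rather than proving anything.

```python
def x(n):
    """The original sequence x_n = n^2, indexed by n = 1, 2, 3, ..."""
    return n ** 2

def g(i):
    """The index-selecting map g(i) = 2i; it must be strictly increasing."""
    return 2 * i

def y(i):
    """The subsequence y_i = x_{g(i)}."""
    return x(g(i))

indices = [g(i) for i in range(1, 8)]
strictly_increasing = all(a < b for a, b in zip(indices, indices[1:]))

print([x(n) for n in range(1, 8)])   # first terms of x_n
print([y(i) for i in range(1, 8)])   # first terms of the subsequence
print(strictly_increasing)
```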
Supplement on the Axioms of Set Theory¹

There is no rigorous mathematics today that does not use concepts from set theory. For this reason we started with set theory in this text. The purpose of this supplement is to help bridge the gap between the approach in this text and that in more formal set theory courses using a book like Halmos's Naive Set Theory.² Any introduction to set theory has to take into account the following points:

1. The concept of a set is so basic that it is impossible to define it in terms of more basic notions.
2. For this reason, we specify the concept of a set with axioms, but the axiomatic method may not be familiar to the student.
3. Axiomatic set theory involves logic, but some concepts of logic may not be familiar either.
In view of these circumstances, the most effective approach, and the one used in this text, is to start working with the intuitive concept of a set and come [...]

¹ This supplement was written with the help of Istvan Fary.
² Halmos, Paul R., 1960. Naive Set Theory. New York: D. Van Nostran [...]

[...] $x > y$ means $y < x$. Properties 11, 12, and 14 combine to give the following observation.
1.1.1 Law of Trichotomy If $x$ and $y$ are elements of an ordered field, then exactly one of the relations $x < y$, $x = y$, or $x > y$ holds.
Chapter 1 The Real Line and Euclidean Space
There are other systems besides the real numbers in which some of these axioms play a role. For example, axioms 1 through 4 define a commutative group. Axioms 1 through 9 excluding axioms 5 and 8 define a ring; for instance, this concept is appropriate for algebraic operations on the set of all $n \times n$ matrices. Notice that the set of all subsets of a given set $S$, where $A \le B$ is defined to mean that $A \subset B$, forms a partially ordered set and that this ordering fails to satisfy property 14.
1.1.2 Proposition In an ordered field the following properties hold:

i. Unique identities If $a + x = a$ for every $a$, then $x = 0$. If $a \cdot x = a$ for every $a$, then $x = 1$.
ii. Unique inverses If $a + x = 0$, then $x = -a$. If $ax = 1$, then $x = a^{-1}$.
iii. No divisors of zero If $xy = 0$, then $x = 0$ or $y = 0$.
iv. Cancellation laws for addition If $a + x = b + x$, then $a = b$. If $a + x \le b + x$, then $a \le b$.
v. Cancellation for multiplication If $ax = bx$ and $x \ne 0$, then $a = b$. If $ax \ge bx$ and $x > 0$, then $a \ge b$.
vi. $0 \cdot x = 0$ for every $x$.
vii. $-(-x) = x$ for every $x$.
viii. $-x = (-1)x$ for every $x$.
ix. If $x \ne 0$, then $x^{-1} \ne 0$ and $(x^{-1})^{-1} = x$.
x. If $x \ne 0$ and $y \ne 0$, then $xy \ne 0$ and $(xy)^{-1} = x^{-1}y^{-1}$.
xi. If $x \le y$ and $0 \le z$, then $xz \le yz$. If $x \le y$ and $z \le 0$, then $yz \le xz$.
xii. If $x \le 0$ and $y \le 0$, then $xy \ge 0$. If $x \le 0$ and $y \ge 0$, then $xy \le 0$.
xiii. $0 < 1$.
xiv. For any $x$, $x^2 \ge 0$.
Proofs of ii, vi, viii, xiii, and xiv are given at the end of the chapter to illustrate the ideas. The purpose of the axioms of an ordered field is to isolate the key properties we need for manipulation of algebraic equalities and inequalities. For example,
§1.1 Ordered Fields and the Number Systems
here is a typical manipulation: By 1.1.2xiv, for any numbers $a$ and $b$, $(a - b)^2 \ge 0$. Thus $a^2 - 2ab + b^2 \ge 0$, and so

$$ab \le \tfrac{1}{2}(a^2 + b^2)$$

for any numbers $a$, $b$. This type of manipulation is important in analysis and will reappear later.
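The inequality $ab \le \frac{1}{2}(a^2 + b^2)$ can be spot-checked with exact rational arithmetic. The sketch below is only a sanity check on a hypothetical finite sample of rationals, not a substitute for the argument from the axioms; it also checks the familiar fact that equality occurs exactly when $a = b$, since then $(a - b)^2 = 0$.

```python
from fractions import Fraction
from itertools import product

# A small, arbitrary sample of rational numbers (an ordered field).
samples = [Fraction(p, q) for p in range(-5, 6) for q in (1, 2, 3)]

# Pairs (a, b) that would violate ab <= (a^2 + b^2)/2; none are expected.
violations = [
    (a, b)
    for a, b in product(samples, repeat=2)
    if a * b > (a * a + b * b) / 2
]

# Equality should hold exactly when a == b.
equality_matches = all(
    (a * b == (a * a + b * b) / 2) == (a == b)
    for a, b in product(samples, repeat=2)
)

print(len(violations), equality_matches)
```

`Fraction` is used instead of floating point so that every comparison is exact.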
1.1.3 Example Using the axioms and properties of an ordered field given in this section, prove that $a^2 - b^2 = (a - b)(a + b)$.

Solution By the distributive law,

$$(a - b)(a + b) = (a - b) \cdot a + (a - b) \cdot b.$$

Now use commutativity and the distributive law again, along with $a - b = a + (-b)$:

$$(a - b) \cdot a + (a - b) \cdot b = a \cdot (a - b) + b \cdot (a - b) = a^2 + a \cdot (-b) + b \cdot a + b \cdot (-b).$$

Now $a \cdot (-b) = a \cdot (-1) \cdot b = (-1)ab = -(ab)$ by 1.1.2viii, associativity, and commutativity. Similarly, $b \cdot (-b) = -b^2$. Thus $(a - b)(a + b)$ equals

$$a^2 + a \cdot (-b) + b \cdot a + b \cdot (-b) = a^2 - (ab) + ba - b^2$$
$$= a^2 - ab + ab - b^2 \quad \text{(by axiom 5)}$$
$$= a^2 - b^2 \quad \text{(by axioms 3 and 4).} \quad \blacksquare$$
1.1.4 Example In an ordered field prove that if $0 \le x < y$, then $x^2 < y^2$.

Solution If $0 \le x < y$, then $0 \le x \le y$, and so by 1.1.2xi, $x^2 \le yx$. By the same reasoning, $x \le y$ implies $xy \le y^2$. Thus

$$x^2 \le yx = xy \ \text{(by commutativity)} \ \le y^2,$$

and so $x^2 \le y^2$. We now need to exclude the possibility that $x^2$ equals $y^2$. But if $x^2 = y^2$ then

$$x^2 - y^2 = 0 \quad \text{(add } -y^2 \text{ to each side)}$$
$$(x - y)(x + y) = 0 \quad \text{(by Example 1.1.3)}$$

By hypothesis we have $0 \le x$ and $y > 0$. Now $x + y \ne 0$, since $x + y = 0$ would imply $y = (-x) \le 0$ so that $y \le 0$, which is impossible by the law of trichotomy. By cancellation for multiplication, $(x - y)(x + y) = 0$ implies that $x - y = 0$, i.e., that $x = y$. But we are given $x < y$, and so this case is excluded, as desired. $\blacksquare$
Other familiar symbols may be introduced. The magnitude, or absolute value, of a number $x$ is denoted $|x|$ and in an ordered field is defined by $|x| = x$ if $x \ge 0$ and $|x| = -x$ if $x < 0$. The distance between two elements $x$ and $y$ is $|x - y|$. Certain elementary properties of absolute value and distance are worth singling out both for their own value and because we will generalize them to other settings later. Their proofs are straightforward, and so they are omitted.
1.1.5 Proposition

i. $|x| \ge 0$ for every $x$.
ii. $|x| = 0$ if and only if $x = 0$.
iii. $|xy| = |x|\,|y|$.
iv. $|x + y| \le |x| + |y|$ (triangle inequality).
v. $\big|\,|x| - |y|\,\big| \le |x - y|$ (alternative triangle inequality).

For instance, to prove iv one considers all of the cases: For example, if $x \ge 0$, $y \le 0$, and $x + y \ge 0$, then $|x + y| = x + y \le x + y - y = x = |x| \le |x| + |y|$.
1.1.6 Proposition If $d(x, y) = |x - y|$, then

i. $d(x, y) \ge 0$.
ii. $d(x, y) = 0$ if and only if $x = y$.
iii. $d(x, y) = d(y, x)$.
iv. $d(x, y) \le d(x, z) + d(z, y)$ (triangle inequality).
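The four properties of the distance $d(x, y) = |x - y|$ can be checked exhaustively on a finite grid of rationals. This is a hedged sketch: the grid `points` is an arbitrary choice, and passing the checks on finitely many points illustrates the statement without proving it.

```python
from fractions import Fraction
from itertools import product

def d(x, y):
    """Distance between two numbers, d(x, y) = |x - y|."""
    return abs(x - y)

# Arbitrary finite grid of rationals from -2 to 2 in steps of 1/4.
points = [Fraction(p, 4) for p in range(-8, 9)]

nonnegative = all(d(x, y) >= 0 for x, y in product(points, repeat=2))
zero_iff_equal = all((d(x, y) == 0) == (x == y) for x, y in product(points, repeat=2))
symmetric = all(d(x, y) == d(y, x) for x, y in product(points, repeat=2))
triangle = all(
    d(x, y) <= d(x, z) + d(z, y) for x, y, z in product(points, repeat=3)
)

print(nonnegative, zero_iff_equal, symmetric, triangle)
```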
The name triangle inequality comes from the corresponding inequality when $x$ and $y$ are not numbers but points in the plane and $d(x, y)$ is the usual Euclidean distance between them. In that case, it expresses the familiar fact that the sum of the lengths of two sides of a triangle is always at least the length of the third side. We will treat this more general situation in §1.6.
The Number Systems

We briefly summarize the various number systems to review and establish notation.
The Natural Numbers

The natural numbers are the counting numbers $1, 2, 3, 4, \ldots$. Appending zero gives the set $\mathbb{N} = \{0, 1, 2, 3, \ldots\}$ of nonnegative integers. The identities 0 and 1 are in $\mathbb{N}$, and addition and multiplication are defined. However, we do not have negatives or reciprocals. With the usual ordering, $\mathbb{N}$ satisfies all of the axioms 1 to 16 except 4 and 8. The most outstanding fact about $\mathbb{N}$ is that the elements of $\mathbb{N}$ are the numbers we use for counting. There is a first one, 0, and for each there is a definite "next" obtained by adding 1. Repeating the process of adding 1 generates all of $\mathbb{N}$. This is formalized by the

Principle of Mathematical Induction If $S$ is a subset of $\mathbb{N}$ such that $0 \in S$ and $k + 1$ is in $S$ whenever $k$ is in $S$, then $S = \mathbb{N}$.
As the reader already saw in the introductory chapter, this is a very powerful tool not only in number theory, but throughout mathematics. Some consequences of this principle are worth pointing out. Certainly the set $\mathbb{N}$ has a smallest element, namely 0. The same is true of every nonempty subset of $\mathbb{N}$. We say that $\mathbb{N}$ is well-ordered by $\le$ or that it satisfies the well-ordering property. A grammatically unfortunate but common expression is to describe $\le$ as a well-ordering on $\mathbb{N}$.
1.1.7 Proposition $\mathbb{N}$ is well-ordered by the relation $\le$. That is, $\mathbb{N}$ has the well-ordering property:

Well-Ordering Property If $S$ is a nonempty subset of $\mathbb{N}$, then there is a smallest element in $S$; i.e., there is an $s_0 \in S$ such that $s_0 \le x$ for every $x$ in $S$.
The proof of this, given at the end of the chapter, illustrates a way of using induction that may be new to the reader. While a formal proof is necessary, the validity of the well-ordering property for the natural numbers should be clear intuitively. As a consequence of the well-ordering property, one can show that the natural numbers also satisfy the
Principle of Complete Induction If $S \subset \mathbb{N}$ is a subset such that $\{x \in \mathbb{N} \mid x < n\} \subset S$ implies $n \in S$, then $S = \mathbb{N}$.

[...]

[...] $\varepsilon > 0$, we must prove that there is an integer $N$ such that if $n \ge N$ then $|1/n - 0| < \varepsilon$. It will be so provided that $1/N < \varepsilon$, and so it suffices to choose $N > 1/\varepsilon$, which is possible by the Archimedean property. $\blacksquare$

The next example is a bit more complicated but uses essentially the same idea.
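1.2.13 Example Show that $\sqrt{n^2 + 1}/n! \to 0$ as $n \to \infty$.

Solution To show that $\sqrt{n^2 + 1}/n!$ gets small as $n$ gets large, we estimate it:

$$0 < \frac{\sqrt{n^2 + 1}}{n!} \le \frac{\sqrt{2n^2}}{n!} = \frac{\sqrt{2}\,n}{n!} = \frac{\sqrt{2}}{(n - 1)!} \le \frac{\sqrt{2}}{n - 1}.$$

Thus, given $\varepsilon > 0$ choose $N$ such that $N > \sqrt{2}/\varepsilon + 1$. Then $n \ge N$ implies

$$0 < \frac{\sqrt{n^2 + 1}}{n!} \le \frac{\sqrt{2}}{n - 1} \le \frac{\sqrt{2}}{N - 1} < \varepsilon.$$

[...]

[...] $\varepsilon > 0$ as the allowed error and see just how large $N$ must be to guarantee $|1/2^n - 0| < \varepsilon$:

$$\left|\frac{1}{2^n} - 0\right| < \varepsilon \iff \frac{1}{2^n} < \varepsilon \iff \frac{1}{\varepsilon} < 2^n \iff \log\frac{1}{\varepsilon} < n \log 2 \iff n > \frac{\log(1/\varepsilon)}{\log 2}.$$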
Selecting any $N > \log(1/\varepsilon)/\log 2$ yields the result. Notice it is important that each step of the computation be reversible, since we really need $n \ge N$ to imply $|1/2^n - 0| < \varepsilon$. The finicky reader may legitimately object to this proof on the grounds that it uses the logarithm, which has not yet been rigorously introduced. Here is another method for such readers.

Method 2 This method uses an earlier result and illustrates a useful trick. We compare our sequence with known sequences. By Example 1.1.11, we know that $0 < 1/2^n < 1/n$ for every $n > 0$. We also know by 1.2.12 that $1/n \to 0$. Given $\varepsilon > 0$, fix $N$ large enough so that $1/n < \varepsilon$ whenever $n \ge N$. Then $n \ge N$ implies $0 < 1/2^n < 1/n < \varepsilon$. Thus $|1/2^n - 0| < \varepsilon$.

Notice that the value of $N$ obtained by this method is $1/\varepsilon$. This is probably quite a bit larger than the best possible $N$ obtained in method 1. For $\varepsilon = 0.001$, we have obtained here that $N = 1000$ (or better, 1001) works. The first method showed that any $N$ larger than $(\log 1000)/\log 2 \approx 9.97$ would work. We have a terribly inefficient value of $N$, but a much simpler argument. For theoretical purposes, such estimates are often a good idea. For computational purposes the effort expended in finding the more efficient estimate might be worthwhile. A second thing to note is the trick of comparing a sequence under study with more easily understood sequences. $\blacksquare$
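The two estimates for $N$ can be compared numerically for $\varepsilon = 0.001$. The following sketch uses hypothetical function names and exact rational arithmetic. As an assumption of this sketch, the efficient $N$ is found by direct search on $1/2^n$ rather than by evaluating logarithms, so it sidesteps both floating-point rounding and the objection about the logarithm.

```python
from fractions import Fraction
from math import floor, log

def n_by_direct_search(eps):
    """Smallest N with 1/2**N < eps; since 1/2**n decreases, it works for all n >= N."""
    n = 0
    while Fraction(1, 2 ** n) >= eps:
        n += 1
    return n

def n_by_comparison(eps):
    """An integer N > 1/eps, so that 1/2**n < 1/n < eps whenever n >= N."""
    return floor(1 / eps) + 1

eps = Fraction(1, 1000)
n1 = n_by_direct_search(eps)   # the efficient estimate (Method 1)
n2 = n_by_comparison(eps)      # the simple but wasteful estimate (Method 2)

print(n1, n2)
print(log(1000) / log(2))      # the Method 1 threshold, roughly 9.97
```

The gap between the two values of `N` reflects the trade-off described in the text: the comparison argument is simpler, while the direct computation is sharper.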
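The arithmetic of limits is used as in calculus to evaluate more complex limits:

1.2.15 Example Evaluate $\displaystyle\lim_{n \to \infty} \frac{n^2 - 3n + 6}{3n^2 + 4n + 2}$.

Solution

$$\lim_{n \to \infty} \frac{n^2 - 3n + 6}{3n^2 + 4n + 2} = \lim_{n \to \infty} \frac{1 - 3/n + 6/n^2}{3 + 4/n + 2/n^2}$$
$$= \frac{\lim_{n \to \infty} 1 - 3 \lim_{n \to \infty} 1/n + 6 \left(\lim_{n \to \infty} 1/n\right)^2}{\lim_{n \to \infty} 3 + 4 \lim_{n \to \infty} 1/n + 2 \left(\lim_{n \to \infty} 1/n\right)^2}$$
$$= \frac{1 - 3 \cdot 0 + 6 \cdot 0^2}{3 + 4 \cdot 0 + 2 \cdot 0^2} = \frac{1}{3}$$

(since we know that $\lim_{n \to \infty} 1/n = 0$). $\blacksquare$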
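The value $1/3$ can be corroborated by evaluating the terms exactly for increasing $n$. This is a numerical illustration only; the function name `term` and the sample indices $n = 10, 10^2, \ldots, 10^6$ are hypothetical choices.

```python
from fractions import Fraction

def term(n):
    """The n-th term (n^2 - 3n + 6) / (3n^2 + 4n + 2), as an exact fraction."""
    return Fraction(n * n - 3 * n + 6, 3 * n * n + 4 * n + 2)

limit = Fraction(1, 3)

# Distance from the claimed limit at n = 10, 100, ..., 10**6.
errors = [abs(term(10 ** k) - limit) for k in range(1, 7)]

for k, err in enumerate(errors, start=1):
    print(10 ** k, float(err))
```

The printed errors shrink roughly like $1/n$, which is consistent with the terms $3/n$ and $4/n$ that dominate the correction.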
§1.2 Completeness and the Real Number System
Next we discuss the construction of the real numbers. The following result is basic.
1.2.16 Theorem There is a "unique" complete ordered field, called the real number system.
The real number system is denoted $\mathbb{R}$. In this text, $+\infty$ and $-\infty$ are not included in $\mathbb{R}$; for other purposes, such as in complex variables and in integration theory, however, it is useful to adjoin $\pm\infty$.

In the theorem above, "uniqueness" means that any two systems satisfying axiom groups I-III and the completeness property 1.2.9 can be put into a one-to-one correspondence that is compatible with addition, multiplication, and the order relation. By compatibility with addition, for example, we mean that the number in the second system corresponding to the sum of the two numbers from the first system is the sum of the corresponding two numbers in the second system. Such a correspondence is called an isomorphism.

We shall not present the proof of Theorem 1.2.16 but rather shall give the basic construction of $\mathbb{R}$ here and leave the detailed verification of the properties as an exercise. We start with the ordered field of rational numbers $\mathbb{Q}$. Consider the set of sequences

$$S = \{(x_1, x_2, \ldots) \mid \text{each } x_n \in \mathbb{Q} \text{ and the sequence is increasing and bounded above}\}.$$

Call two members of $S$ equivalent if every upper bound for one sequence is also an upper bound for the other. The set of all sequences that are equivalent to a given sequence is called an equivalence class. One checks that this divides (or partitions) the space $S$ into disjoint subsets. Let $\mathbb{R}$ be the set of all the equivalence classes in $S$. The rationals are regarded as a subset of $\mathbb{R}$ by identifying a rational number $r$ with the class of all sequences equivalent to the sequence $(r, r, r, \ldots)$. Roughly speaking, the reals are the limits of all increasing sequences of rationals that are bounded above. Now one has to define addition, multiplication, inequality, and so on, and verify all of the axioms. This is a straightforward but somewhat tedious job. As part of this job one checks consistency: if $x \in \mathbb{R}$ and if $(x_1, x_2, \ldots)$ is a sequence in the class $x$, then this sequence really does converge to $x$.

The existence of $\mathbb{R}$ can also be shown by verifying that the usual decimal expansions have the required properties, although this method has some subtle difficulties. Two other standard methods are a geometrical approach due to Richard Dedekind and an analytic approach due to Georg Cantor. Dedekind takes for his basic objects "cuts," which are partitions of the rationals into pairs
of subsets $(A, B)$ with $x < y$ for all $x$ in $A$ and $y$ in $B$. Cantor proceeds as we do, looking directly at sequences, and creates limits for them by considering as his basic objects equivalence classes of sequences.

A consequence of the Archimedean property from 1.2.11 is that the rational numbers are dense in $\mathbb{R}$. This means that given any real number $x$, we can find rational numbers as close to it as we want. This confirms our intuitive picture of the rational numbers as being densely scattered along the line.
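1.2.17 Proposition $\mathbb{Q}$ is dense in $\mathbb{R}$. That is,

i. If $x$ and $y$ are in $\mathbb{R}$ and $x < y$, then there is an $r \in \mathbb{Q}$ with $x < r < y$.
ii. If $x \in \mathbb{R}$, $\varepsilon \in \mathbb{R}$ and $\varepsilon > 0$, then there is an $r \in \mathbb{Q}$ with $|x - r| < \varepsilon$.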
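One common way to produce the rational number in part i is to use the Archimedean property to pick an integer $n > 1/(y - x)$ and then take the first multiple of $1/n$ to the right of $x$. The sketch below follows that idea. It is an assumption of this illustration, not necessarily the argument the text gives, and the function name `rational_between` is hypothetical. Inputs are converted to exact fractions, so floating-point inputs are treated as the exact binary values they store.

```python
from fractions import Fraction
from math import floor

def rational_between(x, y):
    """Return a rational r with x < r < y, for numbers x < y."""
    x, y = Fraction(x), Fraction(y)
    if not x < y:
        raise ValueError("need x < y")
    n = floor(1 / (y - x)) + 1   # an integer with n > 1/(y - x), so n(y - x) > 1
    m = floor(n * x) + 1         # the integer with n*x < m <= n*x + 1 < n*y
    return Fraction(m, n)

r = rational_between(Fraction(1, 3), Fraction(1, 2))
print(r)

s = rational_between(1.41, 1.42)
print(s, float(s))
```

Because $n(y - x) > 1$, the interval from $nx$ to $ny$ has length greater than 1 and therefore contains the integer $m$, which is why $m/n$ lands strictly between $x$ and $y$.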
The rational numbers are countable, and from 1.2.17, there are enough of them to approximate every real number. Does this mean that $\mathbb{R}$ is countable? Perhaps surprisingly, the answer is no. This is yet another result of Georg Cantor. At the end of the chapter we give one of the more traditional arguments for showing this. This argument is often called the diagonal argument and depends on the representation of the reals as infinite decimal expansions.
1.2.18 Theorem The unit interval $]0, 1[$ in $\mathbb{R}$ is uncountable.

A simple rescaling by a straight line function $f(x) = a + (b - a)x$ puts $]0, 1[$ into one-to-one correspondence with the interval $]a, b[$. Thus, any interval $]a, b[$ is uncountable. Thus, although 1.2.17 shows that any such interval contains infinitely many rational numbers, 1.2.18 shows that it must also contain infinitely many, in fact uncountably many, irrational numbers.
1.2.19 Corollary If $x$ and $y$ are in $\mathbb{R}$ and $x < y$, then the interval $]x, y[$ contains countably many rational numbers and uncountably many irrational numbers.

We conclude with a less obvious sequence that we will come back to later as a source of examples.
1.2.20 Proposition: The harmonic sequence Let $x_1 = 1$ and $x_n = 1 + 1/2 + \cdots + 1/n$, $n = 2, 3, \ldots$. Then $x_n$ is monotone increasing but is unbounded above and so does not converge (we write $x_n \to \infty$).
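The slow but unbounded growth of the harmonic sequence can be observed with exact arithmetic. The sketch below checks the classical grouping estimate $x_{2^k} \ge 1 + k/2$ for small $k$. This estimate is a standard way to see unboundedness and is brought in here as an assumption; the text's own proof may proceed differently. The function name `harmonic` is hypothetical.

```python
from fractions import Fraction

def harmonic(n):
    """The n-th term x_n = 1 + 1/2 + ... + 1/n, as an exact fraction."""
    return sum(Fraction(1, k) for k in range(1, n + 1))

# Pairs (k, x_{2^k}) for k = 0, ..., 8.
checks = [(k, harmonic(2 ** k)) for k in range(9)]

for k, value in checks:
    print(2 ** k, float(value), float(1 + Fraction(k, 2)))

monotone = all(harmonic(n) < harmonic(n + 1) for n in range(1, 50))
print(monotone)
```

Since the lower bound $1 + k/2$ grows without limit as $k$ increases, no number can serve as an upper bound for the whole sequence.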
In the next section we will be studying the least upper bound property. Some authors like to use it as the axiom of completeness. In such a treatment the monotone convergence property appears as a theorem. In our treatment, which we chose for its intuitive appeal, it is the least upper bound property that is a theorem!
Exercises for §1.2

1. In Example 1.2.10, let $\lambda = \lim_{n \to \infty} x_n$.
   a. Show that $\lambda$ is a root of $\lambda^2 - \lambda - 2 = 0$.
   b. Find $\lim_{n \to \infty} x_n$.
2. Show that $3^n/n!$ converges to 0.
3. Let $x_n = \sqrt{n^2 + 1} - n$. Compute $\lim_{n \to \infty} x_n$.
4. Let $x_n$ be a monotone increasing sequence such that $x_{n+1} - x_n \le 1/n$. Must $x_n$ converge?
5. Let $\mathbb{F}$ be an ordered field in which every strictly monotone increasing sequence bounded above converges. Prove that $\mathbb{F}$ is complete.
§1.3 Least Upper Bounds

The completeness axiom can be put into several other equivalent forms. To state these, we need some further terminology.
1.3.1 Definitions Let $S \subset \mathbb{R}$. A number $b$ is called an upper bound for $S$ if for all $x \in S$, we have $x \le b$. A number $b$ is called a least upper bound of $S$ if, first, $b$ is an upper bound of $S$ and, second, $b$ is less than or equal to every other upper bound of $S$. See Figure 1.3-1. The least upper bound of $S$ (also called the supremum of $S$) is denoted $\sup S$, $\sup(S)$, $\operatorname{lub} S$, or $\operatorname{lub}(S)$. If $S \subset \mathbb{R}$ is not bounded above (has no upper bound), we say that $\sup S$ is infinite and write $\sup S = +\infty$.¹

¹ If $S$ is empty, then one defines $\sup(S)$ to be $\infty$.
FIGURE 1.3-1 Least upper bound (labels: the set $S$, an upper bound, the least upper bound)
The set $]a, b[ = \{x \in \mathbb{R} \mid a < x < b\}$ is called an open interval, while the set $[a, b] = \{x \in \mathbb{R} \mid a \le x \le b\}$ is called a closed interval. For example, the closed interval $[0, 1]$, the open interval $]0, 1[$, and all of the rationals less than 1 have a least upper bound of 1. These are relatively simple sets. It is good to keep in mind that the theory developed in this section also applies to complicated sets, such as the set of numbers in $[0, 1]$ with a decimal expansion $0.a_1a_2a_3\cdots$ where $a_i$ is 5, 6, or 7 for each $i$. This set is not as readily visualized.
Note. In this text, open intervals are denoted as $]a, b[$ rather than $(a, b)$. This "European" convention avoids confusion with ordered pairs.

There can be at most one least upper bound for a set $S$. Indeed, if $b$ and $b'$ are both least upper bounds, then since $b$ is less than or equal to every other upper bound, $b \le b'$. Similarly, $b' \le b$, and so we conclude that $b = b'$. Thus, we may speak of the least upper bound.

The least upper bound of a set need not be a member of that set. For example, the least upper bound of $S = ]0, 1[$, namely 1, does not belong to $S$. A set need not have any upper bound. For example, the whole real number system has no upper bound, and the set of positive integers has no upper bound. In the "degenerate" case of the empty set $\emptyset$, we regard any number as an upper bound.

If $b$ is an upper bound for the set $S$ and $b \in S$, then $b$ is the least upper bound. To see this, note that if $d$ is any upper bound for $S$, then since $b \in S$, $b \le d$, as required. A useful alternative to the definition of least upper bound is stated next.
1.3.2 Proposition Let $S \subset \mathbb{R}$ be nonempty. Then $b \in \mathbb{R}$ is the least upper bound of $S$ iff $b$ is an upper bound and for every $\varepsilon > 0$ there is an $x \in S$ such that $x > b - \varepsilon$.

The proof is found at the end of this chapter. However, the theorem should be quite "obvious," because $b$ sits just at the "top" (that is, to the "right") of the
set $S$ and there are no "gaps" between it and the set $S$, so that for any $\varepsilon > 0$ we can take $x$ just below $b$ within a distance $\varepsilon$. [Warning: This sort of plausibility argument is intended to give you a feel for the statement; do not confuse it with a rigorous proof.]

Similarly, a lower bound for a set $S$ is a number $b$ such that $b \le x$ for all $x \in S$. Also, $b$ is called a greatest lower bound iff it is a lower bound and for any lower bound $c$ of $S$, $c \le b$. As with least upper bounds, greatest lower bounds are unique if they exist. The greatest lower bound is sometimes called the infimum and is denoted by $\inf S$, $\inf(S)$, $\operatorname{glb} S$, or $\operatorname{glb}(S)$. As in 1.3.2, a number $c$ is the greatest lower bound for a set $S$ iff $c$ is a lower bound and for every $\varepsilon > 0$ there is an $x \in S$ such that $x < c + \varepsilon$. If $S \subset \mathbb{R}$ is not bounded below, or is empty, we write $\inf(S) = -\infty$. There are a few more or less obvious properties stated in the next proposition.
1.3.3 Proposition Suppose $A \subset B \subset \mathbb{R}$ and $A$ and $B$ are nonempty. Then $\inf B \le \inf A \le \sup A \le \sup B$.

In 1.3.3 we assume that the inf's and sup's exist, or we use the conventions $-\infty < x < +\infty$ for any $x \in \mathbb{R}$. The more subtle question of existence is dealt with in the next theorem.
1.3.4 Theorem In $\mathbb{R}$ the following hold:

i. Least upper bound property Let $S$ be a nonempty set in $\mathbb{R}$ that has an upper bound. Then $S$ has a least upper bound in $\mathbb{R}$.
ii. Greatest lower bound property Let $P$ be a nonempty set in $\mathbb{R}$ that has a lower bound. Then $P$ has a greatest lower bound in $\mathbb{R}$.

The proof is at the end of the chapter. However, this result should also be fairly apparent. Indeed, if a bounded subset of $\mathbb{R}$ had no least upper bound, there would be a "hole" at the top of the set, and a sequence of members of $S$ increasing toward that hole would not converge to an element in $\mathbb{R}$. Similarly, we must have property ii. Using the methods of the proof, it is not difficult to show that conditions i and ii are each equivalent to the completeness axiom for an ordered field.
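As a numerical companion to the least upper bound property, a supremum can be approximated by bisection between a member of the set and an upper bound. The sketch below does this for the set $S = \{x \mid x^2 + x < 3\}$ of Example 1.3.5. It rests on assumptions that are not part of the theorem: that $S$ is an interval (so a non-member lying above a member is automatically an upper bound), that floating-point arithmetic is accurate enough, and that the starting values $0$ and $3$ are a member and an upper bound respectively. The function names are hypothetical.

```python
from math import sqrt

def in_S(x):
    """Membership test for S = {x : x^2 + x < 3}."""
    return x * x + x < 3

def approx_sup(member, lo, hi, steps=60):
    """Bisection sketch: lo stays in the set, hi stays an upper bound.

    Assumes the set is an interval, so a midpoint that is not a member
    (and lies above the member lo) is an upper bound of the set.
    """
    for _ in range(steps):
        mid = (lo + hi) / 2
        if member(mid):
            lo = mid
        else:
            hi = mid
    return hi

estimate = approx_sup(in_S, 0.0, 3.0)
closed_form = (sqrt(13) - 1) / 2   # the larger root of x^2 + x = 3

print(estimate, closed_form)
```

Each step halves the gap between a member of $S$ and an upper bound of $S$, which mirrors the intuition that there is no "hole" at the top of the set.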
1.3.5 Example Consider the set $S = \{x \in \mathbb{R} \mid x^2 + x < 3\}$. Find $\sup S$ and $\inf S$.
Chapter 1 The Real Line and Euclidean Space
Solution Consider the graph of $y = x^2 + x$ (Figure 1.3-2). From elementary calculus we see that for $x = -1/2$, $y$ is a minimum. Thus $S$ may be pictured as shown in Figure 1.3-2.
FIGURE 1.3-2 The set $S$ in Example 1.3.5 (graph of $y = x^2 + x$, with minimum at $(-\tfrac{1}{2}, -\tfrac{1}{4})$)
The sup and inf clearly occur when $x^2 + x = 3$, or, from the quadratic formula, when
$$x = \frac{-1 \pm \sqrt{1 + 12}}{2} = \frac{-1 \pm \sqrt{13}}{2}.$$
Thus
$$\sup S = \frac{\sqrt{13} - 1}{2} \quad \text{and} \quad \inf S = \frac{-\sqrt{13} - 1}{2}.$$
A sequence $x_n$ is called a Cauchy sequence if for every $\varepsilon > 0$, there is an integer $N$ (depending on $\varepsilon$) such that $|x_n - x_m| < \varepsilon$ whenever $n \ge N$ and $m \ge N$. This condition means intuitively that the sequence "bunches up"; that is, all the elements of the sequence are arbitrarily close to one another sufficiently far out in the sequence. If $x_n$ converges to $x$, we claim that $x_n$ is a Cauchy sequence. Indeed, given $\varepsilon > 0$ choose $N$ so that $|x_n - x| < \varepsilon/2$ if $n \ge N$. If $n, m \ge N$, then
$$|x_n - x_m| = |x_n - x + x - x_m| \le |x_n - x| + |x - x_m| < \varepsilon/2 + \varepsilon/2 = \varepsilon,$$
which proves our assertion. Here we have used the triangle inequality $|y + z| \le |y| + |z|$. We summarize:
1.4.2 Proposition Every convergent sequence is a Cauchy sequence.
The next two theorems are central to much of analysis. The first is the basis for many of the ideas about compactness in Chapter 3. It exploits the observation that even when a sequence does not converge, some parts of it might. Consider the sequence $1, -1, 1, -1, 1, -1, \ldots$ defined for $n = 0, 1, 2, 3, \ldots$ by $x_n = (-1)^n$. This sequence certainly does not converge, but if we look only at the terms with even index, $x_0, x_2, x_4, \ldots$, we find the constant sequence $1, 1, 1, \ldots$, which certainly converges. If we look only at the terms with odd index, $x_1, x_3, x_5, \ldots$, we find another constant sequence $-1, -1, -1, \ldots$, which also converges. We say that we have found convergent subsequences. A subsequence of a sequence $x_0, x_1, x_2, \ldots$ is a sequence formed by leaving out some of the terms of the sequence while keeping others without changing their order. More precisely, we select indices $n(1) < n(2) < n(3) < \cdots$ and take the sequence with these indices:
$$x_{n(1)}, x_{n(2)}, x_{n(3)}, \ldots.$$
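The index-selection description of a subsequence can be sketched in a few lines. This is an illustration only; the helper name `subsequence` is hypothetical, and the check that the indices strictly increase mirrors the requirement $n(1) < n(2) < n(3) < \cdots$.

```python
from itertools import count, islice


def x(n):
    """The sequence x_n = (-1)^n for n = 0, 1, 2, ..."""
    return (-1) ** n


def subsequence(term, indices):
    """Yield term(n(1)), term(n(2)), ... for strictly increasing indices."""
    previous = None
    for n in indices:
        if previous is not None and n <= previous:
            raise ValueError("indices must be strictly increasing")
        previous = n
        yield term(n)


# Even indices 0, 2, 4, ... and odd indices 1, 3, 5, ...
even_terms = list(islice(subsequence(x, count(0, 2)), 6))
odd_terms = list(islice(subsequence(x, count(1, 2)), 6))
print(even_terms)  # constant sequence of 1's
print(odd_terms)   # constant sequence of -1's
```

Both selections give constant, hence convergent, subsequences even though the full sequence does not converge.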
Theorem 1.4.3 is our first encounter with a very important observation known as the Bolzano-Weierstrass Property. If a sequence in $\mathbb{R}$ stays bounded, it must cluster somewhere and have a subsequence (possibly many) that converges to some point in $\mathbb{R}$. This depends heavily on completeness. The limit that ought to be there must not be missing. The second theorem is closely linked to the first and supplies the version of completeness that turns out to be appropriate for generalization.
1.4.3 Theorem Every bounded sequence in $\mathbb{R}$ has a subsequence that converges to some point in $\mathbb{R}$.
1.4.4 Theorem Every Cauchy sequence in $\mathbb{R}$ converges to an element of $\mathbb{R}$.
1.4.5 Corollary If $a$ and $b$ are in $\mathbb{R}$ and $a < b$, then every sequence of points in the interval $[a, b] = \{x \mid a \le x \le b\}$ has a subsequence that converges to some point in $[a, b]$.
The property expressed in Theorem 1.4.4 is called Cauchy completeness. It is often given as the definition of completeness, but does not, by itself, substitute for the least upper bound axiom. However, together with the Archimedean principle, it does. A Cauchy-complete Archimedean ordered field satisfies the least upper bound property. (Without the Archimedean property it might not. A bounded, decreasing sequence such as $1, 1/2, 1/3, 1/4, 1/5, \ldots$ might not converge to anything. The only likely candidate for a limit is 0, but without the Archimedean property it need not converge to 0. For a similar reason it would not be a Cauchy sequence.)
One intuition for Theorem 1.4.4 runs as follows. If we ignore the first $N$ terms of a Cauchy sequence, we know that the remaining terms will be bunched together. As we disregard more and more terms, the remainder of the sequence becomes more tightly grouped and squeezes down to some limiting number, the limit of the sequence. To see more precisely how this is done requires more care, and so the actual proof is our only recourse. The following lemmas contain the strategy for proving Theorem 1.4.4 from 1.4.3.
1.4.6 Lemma Every Cauchy sequence is bounded.
1.4.7 Lemma If a subsequence of a Cauchy sequence converges to $x$, then the sequence itself converges to $x$.
… $\varepsilon > 0$, if we choose $N$ so that $1/2^{N-1} < \varepsilon$, then $x_n$ satisfies the definition of a Cauchy sequence. •
Exercises for §1.4
1. Let $x_n$ satisfy $|x_n - x_{n+1}| < 1/3^n$. Show that $x_n$ converges.
2. Show that the sequence $x_n = e^{\sin(5n)}$ has a convergent subsequence.
3. Find a bounded sequence with three subsequences converging to three different numbers.
4. Let $x_n$ be a Cauchy sequence. Suppose that for every $\varepsilon > 0$ there is some $n > 1/\varepsilon$ such that $|x_n| < \varepsilon$. Prove that $x_n \to 0$.
5. True or false: If $x_n$ is a Cauchy sequence, then for $n$ and $m$ large enough, $d(x_{n+1}, x_{m+1}) \le d(x_n, x_m)$.
§1.5 Cluster Points; lim inf and lim sup
A useful tool in the study of convergent sequences is the notion of a cluster point:
1.5.1 Definition A point $x$ is called a cluster point of the sequence $x_n$ if for every $\varepsilon > 0$ there are infinitely many values of $n$ with $|x_n - x| < \varepsilon$.
For example, both 1 and $-1$ are cluster points of the sequence $1, -1, 1, -1, 1, -1, \ldots$; notice that this sequence does not converge. However, the next proposition shows that there is a relationship between convergence and cluster points.
1.5.2 Proposition Let $x_n$ be a sequence in $\mathbb{R}$ and let $x \in \mathbb{R}$.
i. $x$ is a cluster point of $x_n$ if and only if for each $\varepsilon > 0$ and for each $N$, there is an index $n > N$ with $|x_n - x| < \varepsilon$.
ii. $x$ is a cluster point of $x_n$ if and only if there is a subsequence of $x_n$ that converges to $x$.
iii. $x_n \to x$ if and only if every subsequence of $x_n$ converges to $x$.
iv. $x_n \to x$ if and only if the sequence is bounded and $x$ is its only cluster point.
v. $x_n \to x$ if and only if every subsequence of $x_n$ has a further subsequence that converges to $x$.
Consider the sequence $1, 0, -1, 1, 0, -1, 1, 0, -1, \ldots$. This sequence also does not converge. It has three cluster points: 1, 0, and $-1$. Of these, 1 and $-1$ are particularly interesting, being the largest and smallest of the cluster points. It seems obvious at first, less obvious after more thought, that every bounded sequence of real numbers has largest and smallest cluster points. We want to try to capture this in a definition for general sequences in $\mathbb{R}$.
1.5.3 Definition Let $x_n$ be a sequence in $\mathbb{R}$ that is bounded above. The limit superior of $x_n$ is the largest cluster point of $x_n$, i.e., the sup of the set of cluster points. If the set of cluster points is empty, we set $\limsup x_n = -\infty$; if the sequence is not bounded above, we set $\limsup x_n = +\infty$. Similarly, if $x_n$ is bounded below, the limit inferior of $x_n$ is the smallest cluster point of $x_n$, i.e., the inf of the set of cluster points. If the set of cluster points is empty, we set $\liminf x_n = +\infty$, and if $x_n$ is not bounded below, we set $\liminf x_n = -\infty$.
The limit superior is denoted by $\limsup x_n$, $\limsup_{n \to \infty} x_n$, or $\overline{\lim}\, x_n$, and the limit inferior by $\liminf x_n$, $\liminf_{n \to \infty} x_n$, or $\underline{\lim}\, x_n$.
1.5.4 Examples
a. For the sequence $1, 0, -1, 1, 0, -1, 1, 0, -1, \ldots$, $\liminf x_n = -1$ and $\limsup x_n = 1$.
b. The sequence $x_n = 1/n$ converges to 0, which is the only cluster point. Thus $\lim x_n = \limsup x_n = \liminf x_n = 0$.
c. Let $x_n = 0$ if $n$ is even and $x_n = n$ if $n$ is odd; i.e., $x_n = 0, 1, 0, 3, 0, 5, 0, 7, \ldots$. Here, $\liminf x_n = 0$. The sequence is not bounded above, and so we put $\limsup x_n = +\infty$.
d. Let $x_n = n$, i.e., $0, 1, 2, 3, 4, \ldots$. Here, $\limsup x_n = \liminf x_n = +\infty$.
e. Even if a sequence is bounded, the lim inf is not necessarily the infimum of the set $\{x_1, x_2, x_3, \ldots\}$, nor need the lim sup be its supremum. This is so because we are interested only in cluster points, and any finite number of the entries can be thrown away without changing the cluster points.
For a more complicated example, consider the sequence $x_n$ defined to be
$$x_n = \begin{cases} 1 + 1/j & \text{if } n = 5j \\ 1 - 1/j & \text{if } n = 5j + 1 \\ 0 & \text{if } n = 5j + 2 \\ -1 + 1/j & \text{if } n = 5j + 3 \\ -1 - 1/j & \text{if } n = 5j + 4 \end{cases}$$
where $j = 2, 3, 4, 5, 6, 7, 8, \ldots$ and $n$ begins with $n = 10, 11, 12, 13, \ldots$. The sequence is as shown in Figure 1.5-1.
FIGURE 1.5-1 The sequence of Example 1.5.4, part e
There are subsequences converging to $-1$, 0, and 1. Subsequences approach both $+1$ and $-1$ from both above and below. Thus, $\limsup x_n = 1$ and $\liminf x_n = -1$. A somewhat simpler example is:
f. $x_n = (-1)^n (1 + 1/n)$, $n = 1, 2, 3, 4, \ldots$. The first few terms are
$$-2, \ \tfrac{3}{2}, \ -\tfrac{4}{3}, \ \tfrac{5}{4}, \ -\tfrac{6}{5}, \ \tfrac{7}{6}, \ -\tfrac{8}{7}, \ \tfrac{9}{8}, \ \ldots.$$
The odd-index terms increase to $-1$, and the even-index terms decrease to 1, as in Figure 1.5-2. Thus, $\liminf x_n = -1$ and $\limsup x_n = 1$. •
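The terms listed for $x_n = (-1)^n(1 + 1/n)$ can be reproduced with exact rational arithmetic. The snippet below is only a sketch using Python's standard `fractions` module; the names `x`, `terms`, `odd`, and `even` are illustrative.

```python
from fractions import Fraction


def x(n):
    """x_n = (-1)^n (1 + 1/n) as an exact fraction, for n = 1, 2, 3, ..."""
    return (-1) ** n * (1 + Fraction(1, n))


terms = [x(n) for n in range(1, 9)]
odd = [x(n) for n in range(1, 40, 2)]    # x_1, x_3, x_5, ...
even = [x(n) for n in range(2, 40, 2)]   # x_2, x_4, x_6, ...

print([str(t) for t in terms])
# The odd-index terms stay below -1 and increase; the even-index terms
# stay above 1 and decrease.
print(all(a < b < -1 for a, b in zip(odd, odd[1:])))
print(all(a > b > 1 for a, b in zip(even, even[1:])))
```

A finite list of terms cannot prove the limits, but it makes the two monotone subsequences approaching $-1$ and $1$ visible.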
FIGURE 1.5-2 The sequence of Example 1.5.4, part f
Keep this picture in mind as we proceed. The key idea is that a cluster point of a sequence remains a cluster point if finitely many terms are discarded. If $x$ is a cluster point of $x_1, x_2, x_3, x_4, \ldots$, then it is also a cluster point of $x_{n+1}, x_{n+2}, x_{n+3}, \ldots$. In using lim inf and lim sup, it is sometimes useful to have a direct $\varepsilon$-$N$ characterization.
1.5.5 Proposition Let $x_n$ be a sequence in $\mathbb{R}$. Then
i. If $x_n$ is bounded below, a number $a$ is equal to $\liminf x_n$ if and only if
a. For all $\varepsilon > 0$, there is an $N$ such that $a - \varepsilon < x_n$ whenever $n \ge N$, and
b. For all $\varepsilon > 0$ and for all $M$, there is an $n > M$ with $x_n < a + \varepsilon$.
ii. If $x_n$ is bounded above, a number $b$ is equal to $\limsup x_n$ if and only if
a. For all $\varepsilon > 0$, there is an $N$ such that $x_n < b + \varepsilon$ whenever $n \ge N$, and
b. For all $\varepsilon > 0$ and for all $M$, there is an $n > M$ with $b - \varepsilon < x_n$.
An examination of Figure 1.5-2 should make this proposition plausible. Using it, we can develop yet another characterization of lim inf and lim sup. For a sequence $x_1, x_2, x_3, \ldots$ in $\mathbb{R}$, let $S_n = \{x_{n+1}, x_{n+2}, x_{n+3}, \ldots\}$, $a_n = \inf S_n$, and $b_n = \sup S_n$. (Recall that if $S_n$ is not bounded above, then $\sup S_n = +\infty$, and if it is not bounded below, then $\inf S_n = -\infty$.) Then $S_1 \supset S_2 \supset S_3 \supset S_4 \supset S_5 \supset \cdots$. If $n \le k$, then $S_k \subset S_n$, so that $a_k \le b_k \le b_n$. If $k \le n$, then $S_n \subset S_k$, and so $a_k \le a_n \le b_n$. Thus, the $a$'s and $b$'s are arranged as
$$a_1 \le a_2 \le a_3 \le \cdots \le b_3 \le b_2 \le b_1.$$
(Note that $+\infty$ and $-\infty$ are still allowed as values for $a_n$ and $b_n$.) The reader should compute the first few of these for Example 1.5.4, part f.
1.5.6 Proposition If $x_n$ is a sequence in $\mathbb{R}$, then
$$\limsup x_n = \inf\{\sup\{x_{n+1}, x_{n+2}, x_{n+3}, \ldots\} \mid n = 1, 2, 3, \ldots\}$$
$$\liminf x_n = \sup\{\inf\{x_{n+1}, x_{n+2}, x_{n+3}, \ldots\} \mid n = 1, 2, 3, \ldots\}.$$
Setting $S_n = \{x_{n+1}, x_{n+2}, x_{n+3}, \ldots\}$, $b_n = \sup S_n$, and $a_n = \inf S_n$, we can rewrite this as
$$\limsup x_n = \inf_n \{\sup S_n\} = \inf\{b_1, b_2, b_3, b_4, \ldots\}$$
$$\liminf x_n = \sup_n \{\inf S_n\} = \sup\{a_1, a_2, a_3, a_4, \ldots\}.$$
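The tail bounds $a_n = \inf S_n$ and $b_n = \sup S_n$ can be tabulated for the sequence $x_n = (-1)^n(1 + 1/n)$ of Example 1.5.4, part f. The sketch below works over a finite window $x_{n+1}, \ldots, x_{\text{horizon}}$ (the parameter `horizon` and the function name `tail_bounds` are assumptions made for illustration). For this particular sequence the window values agree with the true inf and sup of the infinite tail, because the smallest term of each tail is its first odd-index term and the largest is its first even-index term; for a general sequence a finite window gives only an estimate.

```python
from fractions import Fraction


def x(n):
    """x_n = (-1)^n (1 + 1/n) as an exact fraction, for n = 1, 2, 3, ..."""
    return (-1) ** n * (1 + Fraction(1, n))


def tail_bounds(n, horizon=200):
    """Return (a_n, b_n) computed over the window x_{n+1}, ..., x_{horizon}."""
    window = [x(k) for k in range(n + 1, horizon + 1)]
    return min(window), max(window)


bounds = [tail_bounds(n) for n in range(1, 7)]
for n, (a_n, b_n) in enumerate(bounds, start=1):
    print(n, a_n, b_n)
```

The printed $a_n$ increase toward $-1$ and the $b_n$ decrease toward $1$, in line with the arrangement of the $a$'s and $b$'s and with $\liminf x_n = -1$, $\limsup x_n = 1$.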
Some general properties follow.
1.5.7 Proposition Let $x_n$ be a given sequence in $\mathbb{R}$. Then
i. $\liminf x_n \le \limsup x_n$.
ii. If $x_n \le M$ for all $n$, then $\limsup x_n \le M$.
iii. If $M \le x_n$ for all $n$, then $\liminf x_n \ge M$.
iv. $\limsup x_n = +\infty$ if and only if $x_n$ is not bounded above.
v. $\liminf x_n = -\infty$ if and only if $x_n$ is not bounded below.
vi. If $x$ is a cluster point of $x_n$, then $\liminf x_n \le x \le \limsup x_n$.
vii. If $a = \liminf x_n$ is finite, then it is a cluster point of $x_n$.
viii. If $b = \limsup x_n$ is finite, then it is a cluster point of $x_n$.
ix. $x_n \to x \in \mathbb{R}$ if and only if $\limsup x_n = \liminf x_n = x \in \mathbb{R}$.
Exercises for §1.5
1. Let $x_n = 3 + (-1)^n (1 + 1/n)$. Calculate $\liminf x_n$ and $\limsup x_n$.
2. Find a sequence $x_n$ with $\limsup x_n = 5$ and $\liminf x_n = -3$.
3. Let $x_n$ be a sequence with $\limsup x_n = b \in \mathbb{R}$ and $\liminf x_n = a \in \mathbb{R}$. Show that $x_n$ has subsequences $u_n$ and $l_n$ with $u_n \to b$ and $l_n \to a$.
4. Let $\limsup x_n = 2$. True or false: If $n$ is large enough, then $x_n > 1.99$.
5. True or false: If $\limsup x_n = b$, then for $n$ large enough, $x_n \le b$.
§1.6 Euclidean Space
In this book we work primarily with one-, two-, or three-dimensional Euclidean space. In many applications, and in developing analysis in three-space, higher dimensional spaces naturally occur. Therefore, it is important to treat the general case, but we usually fall back on the case of one-, two-, or three-space for visualization and intuition. Let us begin with the formal definition.
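1.6.1 Definition Euclidean $n$-space, denoted $\mathbb{R}^n$, consists of all ordered $n$-tuples of real numbers. Symbolically,
$$\mathbb{R}^n = \{(x_1, \ldots, x_n) \mid x_1, \ldots, x_n \in \mathbb{R}\}.$$
Thus, $\mathbb{R}^n = \mathbb{R} \times \cdots \times \mathbb{R}$ ($n$ times) is the Cartesian product of $\mathbb{R}$ with itself $n$ times. Elements of $\mathbb{R}^n$ are generally denoted by single letters that stand for $n$-tuples such as $x = (x_1, \ldots, x_n)$, and we speak of $x$ as a point in $\mathbb{R}^n$. Addition and scalar multiplication of $n$-tuples are defined by
$$(x_1, \ldots, x_n) + (y_1, \ldots, y_n) = (x_1 + y_1, \ldots, x_n + y_n)$$
and
$$a(x_1, \ldots, x_n) = (ax_1, \ldots, ax_n) \quad \text{for } a \in \mathbb{R}.$$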
The geometric meaning of these operations is reviewed in Figure 1.6-1 for $n = 3$. The spaces $\mathbb{R}^2$ and $\mathbb{R}^3$ and more generally $\mathbb{R}^n$ are examples of a structure from linear algebra called a vector space. Just as we did with fields, we single out the vital properties of the familiar spaces, $\mathbb{R}^2$ and $\mathbb{R}^3$, and turn them into a definition. Basically, a vector space is a collection of objects called vectors that we can add, subtract, and multiply by numbers.
1.6.2 Definition A real vector space $V$ is a set of elements called vectors, with given operations of vector addition $+ : V \times V \to V$ and scalar multiplication $\cdot : \mathbb{R} \times V \to V$ such that:
i. $v + w = w + v$ for every $v$ and $w$ in $V$. (commutativity)
ii. $(v + u) + w = v + (u + w)$. (associativity)
iii. There is a zero vector $0$ such that $v + 0 = v$ for every $v$ in $V$. (zero vector)
iv. For each $v$ in $V$ there is a vector $-v$ such that $v + (-v) = 0$. (negatives)
v. For $\lambda \in \mathbb{R}$ and $v, w$ in $V$, $\lambda \cdot (v + w) = \lambda \cdot v + \lambda \cdot w$. (distributivity)
vi. For $\lambda, \mu \in \mathbb{R}$ and $v$ in $V$, $\lambda(\mu \cdot v) = (\lambda\mu) \cdot v$. (associativity)
vii. For $\lambda, \mu \in \mathbb{R}$ and $v$ in $V$, $(\lambda + \mu) \cdot v = \lambda \cdot v + \mu \cdot v$. (distributivity)
viii. $1 \cdot v = v$ for every $v \in V$. (multiplicative identity)
FIGURE 1.6-1 Addition and scalar multiplication
A vector space over a general field $\mathbb{F}$ is defined in the same way by substituting $\mathbb{F}$ for $\mathbb{R}$ throughout. In analysis we are concerned mostly with vector spaces over $\mathbb{R}$, which we call real vector spaces, and over the complex numbers $\mathbb{C}$, which are called complex vector spaces. (See §1.8 for a discussion of $\mathbb{C}$.) A subset of $V$ is called a subspace (or linear subspace or vector subspace) if it is itself a vector space with the same operations.
1.6.3 Lemma If $W$ is a subset of a vector space $V$ over a field $\mathbb{F}$, then $W$ is a vector subspace over $\mathbb{F}$ if and only if $\lambda v + \mu w \in W$ whenever $\lambda$ and $\mu$ are in $\mathbb{F}$ and $v$ and $w$ are in $W$.
The reader can supply the proof or review the relevant section of a linear algebra text.
FIGURE 1.6-2 Hyperplane and affine hyperplane
In particular, an $(n - 1)$-dimensional linear subspace of $\mathbb{R}^n$ is called a hyperplane. An affine hyperplane is a set $x + H$, where $H$ is a hyperplane and $x \in \mathbb{R}^n$ and where $x + H$ means the set of all $x + y$ as $y$ ranges through $H$; thus $x + H = \{x + y \mid y \in H\}$. See Figure 1.6-2.
1.6.4 Theorem Euclidean $n$-space with the operations of addition and scalar multiplication previously defined is a vector space of dimension $n$.
The proof is a straightforward check of the axioms for a vector space, which we shall leave for Exercise 16 at the end of the chapter. This theorem should be no surprise; after all, a vector space is an abstraction of the basic properties of vectors in Euclidean space. We can show that $\mathbb{R}^n$ has dimension $n$ by exhibiting a basis with $n$ vectors; for example, the standard basis $\{e_1 = (1, 0, \ldots, 0),\ e_2 = (0, 1, 0, \ldots, 0),\ \ldots,\ e_n = (0, 0, \ldots, 0, 1)\}$. Using the standard basis, the components of $x = (x_1, \ldots, x_n)$ are just $x_1, \ldots, x_n$. Using another basis for $\mathbb{R}^n$, the components would be different. This means that if $e_1, \ldots, e_n$ denotes the standard basis, $x = \sum_{i=1}^n x_i e_i$, but if $f_1, \ldots, f_n$ is another basis, $x = \sum_{i=1}^n y_i f_i$ for some (possibly different) numbers $y_1, \ldots, y_n$. We have not defined the terms "basis" or "component" in general. We refer you to your linear algebra course.
1.6.5 Definition The length or norm of a vector $x$ in $\mathbb{R}^n$ is defined by
$$\|x\| = \left\{ \sum_{i=1}^n x_i^2 \right\}^{1/2}$$
where $x = (x_1, \ldots, x_n)$. The distance between two vectors $x$ and $y$ is the real number
$$d(x, y) = \|x - y\| = \left\{ \sum_{i=1}^n (x_i - y_i)^2 \right\}^{1/2}.$$
The inner product of $x$ and $y$ is defined by
$$\langle x, y \rangle = \sum_{i=1}^n x_i y_i.$$
Thus we have $\|x\|^2 = \langle x, x \rangle$. In $\mathbb{R}^3$, the reader is familiar with another expression for $\langle x, y \rangle$, namely, $\langle x, y \rangle = \|x\| \, \|y\| \cos\theta$, where $\cos\theta$ is the cosine of the angle formed by $x$ and $y$. See Figure 1.6-3.
FIGURE 1.6-3 Length and inner product
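The three formulas of Definition 1.6.5 translate directly into code. The following is a minimal sketch for vectors stored as tuples of floats; the function names `inner`, `norm`, and `dist` and the sample vectors are illustrative choices, not notation from the text.

```python
import math


def inner(x, y):
    """Inner product <x, y> = sum of x_i * y_i."""
    return sum(a * b for a, b in zip(x, y))


def norm(x):
    """Length ||x|| = sqrt(<x, x>)."""
    return math.sqrt(inner(x, x))


def dist(x, y):
    """Distance d(x, y) = ||x - y||."""
    return norm([a - b for a, b in zip(x, y)])


# Hypothetical sample vectors in R^3.
x = (1.0, 2.0, 2.0)
y = (3.0, 0.0, 4.0)

# Cosine of the angle between x and y, from <x, y> = ||x|| ||y|| cos(theta).
cos_theta = inner(x, y) / (norm(x) * norm(y))
print(norm(x), norm(y), inner(x, y), dist(x, y), cos_theta)
```

Here $\|x\| = 3$, $\|y\| = 5$, and $\langle x, y \rangle = 11$, so $\cos\theta = 11/15$; the value lies in $[-1, 1]$, as the Cauchy-Schwarz inequality guarantees.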
We now summarize the basic properties of these operations:
1.6.6 Theorem For vectors in $\mathbb{R}^n$, we have
I. Properties of the inner product:
i. $\langle x, x \rangle \ge 0$. (positivity)
ii. $\langle x, x \rangle = 0$ iff $x = 0$. (nondegeneracy)
iii. $\langle x, y + w \rangle = \langle x, y \rangle + \langle x, w \rangle$. (distributivity)
iv. $\langle \alpha x, y \rangle = \alpha \langle x, y \rangle$ for $\alpha \in \mathbb{R}$. (multiplicativity)
v. $\langle x, y \rangle = \langle y, x \rangle$. (symmetry)
II. Properties of the norm:
i. $\|x\| \ge 0$. (positivity)
ii. $\|x\| = 0$ iff $x = 0$. (nondegeneracy)
iii. $\|\alpha x\| = |\alpha| \, \|x\|$ for $\alpha \in \mathbb{R}$. (multiplicativity)
iv. $\|x + y\| \le \|x\| + \|y\|$. (triangle inequality)
III. Properties of the distance:
i. $d(x, y) \ge 0$. (positivity)
ii. $d(x, y) = 0$ iff $x = y$. (nondegeneracy)
iii. $d(x, y) = d(y, x)$. (symmetry)
iv. $d(x, y) \le d(x, z) + d(z, y)$. (triangle inequality)
IV. The Cauchy-Schwarz inequality:
$$|\langle x, y \rangle| \le \|x\| \, \|y\|.$$
FIGURE 1.6-4 Triangle inequality ($\|x + y\| \le \|x\| + \|y\|$)
Properties i-v of the inner product are almost obvious algebraically. Once we have the Cauchy-Schwarz inequality, the properties of the norm and distance follow easily. Here is one proof of the Cauchy-Schwarz inequality; another is given in the theorem proofs section. We want to prove that $|\langle x, y \rangle| \le \|x\| \, \|y\|$. The cases $x = 0$ or $y = 0$ are trivial, and so we can assume $x \ne 0$ and $y \ne 0$. Rescaling $x$ and $y$ by rewriting $|\langle x, y \rangle| \le \|x\| \, \|y\|$ as $|\langle x/\|x\|, y/\|y\| \rangle| \le 1$, we can assume $\|x\| = \|y\| = 1$; then we need to prove that $|\langle x, y \rangle| \le 1$. Consistent with the Pythagorean theorem, we claim that
$$\|\langle x, y \rangle y\|^2 + \|x - \langle x, y \rangle y\|^2 = \|x\|^2 = 1,$$
since $\langle x, y \rangle y$ should be the projection of the vector $x$ along the vector $y$, as in Figure 1.6-3. To verify this algebraically, write
$$\begin{aligned} \|\langle x, y \rangle y\|^2 + \|x - \langle x, y \rangle y\|^2 &= \langle x, y \rangle^2 \|y\|^2 + \langle x - \langle x, y \rangle y,\ x - \langle x, y \rangle y \rangle \\ &= \langle x, y \rangle^2 \|y\|^2 + \|x\|^2 - 2\langle x, y \rangle^2 + \langle x, y \rangle^2 \|y\|^2 \\ &= \|x\|^2 \end{aligned}$$
since $\|y\| = 1$. Therefore
$$|\langle x, y \rangle|^2 = \|\langle x, y \rangle y\|^2 \le \|\langle x, y \rangle y\|^2 + \|x - \langle x, y \rangle y\|^2 = \|x\|^2 = 1,$$
and so the Cauchy-Schwarz inequality follows. This proof also shows that if the Cauchy-Schwarz inequality becomes an equality, then $x$ and $y$ are linearly dependent. (The term $\|x - \langle x, y \rangle y\|^2$ in the last display is zero in this case, and so $x = \langle x, y \rangle y$.) Note that IV follows just from the properties I. Many of these properties should also be evident geometrically. For example, iv in II and III expresses the fact that the length of one side of a triangle is less than or equal to the sum of the lengths of the other sides (Figure 1.6-4). A set with a function $d$ obeying the rules of group III is called a metric space. A vector space with a norm obeying the rules in group II is called a normed space, and a vector space with an inner product obeying the rules in group I is an inner product space. We will look at these concepts in §1.7 and show that properties III follow from II, and properties II follow from I. Therefore, all that is needed is to show that the proposed inner product on $\mathbb{R}^n$ satisfies the properties in I. (Aside: The famous inequality of group IV should properly be called the Cauchy-Bunyakovskii-Schwarz inequality.) Generalizing the concepts from $\mathbb{R}^3$, we call $x, y \in \mathbb{R}^n$ orthogonal iff $\langle x, y \rangle = 0$. Two subspaces $S$ and $T$ are orthogonal iff $\langle x, y \rangle = 0$ for all $x \in S$ and $y \in T$. If in addition $S$ and $T$ span $\mathbb{R}^n$, then they are called orthogonal complements.
An observation that will be helpful to us when studying sequences in $\mathbb{R}^n$ is the following:
1.6.7 Proposition If $v = (v_1, \ldots, v_n)$ and $w = (w_1, \ldots, w_n)$ are vectors in $\mathbb{R}^n$ and we let $\rho(v, w) = \max\{|v_1 - w_1|, |v_2 - w_2|, \ldots, |v_n - w_n|\}$, then
$$\rho(v, w) \le \|v - w\| \le \sqrt{n}\, \rho(v, w).$$
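A quick numerical experiment can make the two-sided estimate of Proposition 1.6.7 plausible. The sketch below draws random vectors (the dimension `n = 5`, the seed, and the sampling range are arbitrary assumptions) and checks both inequalities up to a small floating-point tolerance; it is a sanity check, not a proof.

```python
import math
import random


def rho(v, w):
    """Largest coordinate separation max |v_i - w_i|."""
    return max(abs(a - b) for a, b in zip(v, w))


def euclid(v, w):
    """Euclidean distance ||v - w||."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(v, w)))


random.seed(0)
n = 5
tol = 1e-12  # allowance for rounding error in floating point
checks = []
for _ in range(1000):
    v = [random.uniform(-10, 10) for _ in range(n)]
    w = [random.uniform(-10, 10) for _ in range(n)]
    r, e = rho(v, w), euclid(v, w)
    checks.append(r <= e + tol and e <= math.sqrt(n) * r + tol)

print(all(checks))

# The upper bound is attained when all coordinates differ by the same amount.
print(rho((0, 0, 0), (1, 1, 1)), euclid((0, 0, 0), (1, 1, 1)), math.sqrt(3))
```

The last line shows a case of equality on the right: for $v = (0,0,0)$ and $w = (1,1,1)$, $\rho(v, w) = 1$ and $\|v - w\| = \sqrt{3}$.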
1.6.8 Example Find the length of the line segment joining $(1, 1, 1)$ to $(3, 2, 0)$.
Solution This length is the length of the vector $(3, 2, 0) - (1, 1, 1) = (2, 1, -1)$, which represents the vector from $(1, 1, 1)$ to $(3, 2, 0)$. The length is
$$\|(2, 1, -1)\| = \sqrt{2^2 + 1^2 + (-1)^2} = \sqrt{6}.$$
•
1.6.9 Example In $\mathbb{R}^3$, find the orthogonal complement of the line $x = y = z/2$ (or $x_1 = x_2 = x_3/2$ in different notation).
Solution This line, call it $l$, is the one-dimensional subspace spanned by the vector $(1, 1, 2)$ (see Figure 1.6-5). The orthogonal complement is a plane (through the origin, since it is a subspace) and so has an equation of the form
$$Ax + By + Cz = 0, \quad \text{i.e.,} \quad \langle (A, B, C), (x, y, z) \rangle = 0.$$
Thus, $(A, B, C)$ is normal to the plane, but $(1, 1, 2)$ is a vector perpendicular to the plane, so that the orthogonal complement sought is the plane $x + y + 2z = 0$. •
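The answer to Example 1.6.9 can be checked by picking two vectors in the plane $x + y + 2z = 0$ and confirming that they are orthogonal to $(1, 1, 2)$ and that, together with it, they span $\mathbb{R}^3$. The particular vectors $(1, -1, 0)$ and $(2, 0, -1)$ below are one hypothetical choice among many, and `det3` is an illustrative helper.

```python
def inner(x, y):
    """Standard inner product on R^3."""
    return sum(a * b for a, b in zip(x, y))


def det3(r0, r1, r2):
    """Determinant of the 3x3 matrix with rows r0, r1, r2."""
    return (r0[0] * (r1[1] * r2[2] - r1[2] * r2[1])
            - r0[1] * (r1[0] * r2[2] - r1[2] * r2[0])
            + r0[2] * (r1[0] * r2[1] - r1[1] * r2[0]))


direction = (1, 1, 2)                    # spans the line x = y = z/2
plane_basis = [(1, -1, 0), (2, 0, -1)]   # assumed basis: both satisfy x + y + 2z = 0

# Each plane vector is orthogonal to the line's direction vector.
print([inner(direction, b) for b in plane_basis])

# A nonzero determinant shows the three vectors span R^3.
print(det3(direction, *plane_basis))
```

Orthogonality of the two subspaces plus spanning is exactly the definition of orthogonal complements given above.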
FIGURE 1.6-5 The line $l$ and its orthogonal complement in Example 1.6.9
Exercises for §1.6
1. If $\|x + y\| = \|x\| + \|y\|$, show that $x$ and $y$ lie on the same ray from the origin.
2. What is the angle between $(3, 2, 2)$ and $(0, 1, 0)$?
3. Find the orthogonal complement of the plane spanned by $(3, 2, 2)$ and $(0, 1, 0)$ in $\mathbb{R}^3$.
4. Describe the sets $B = \{x \in \mathbb{R}^3 \mid \|x\| \le 3\}$ and $Q = \{x \in \mathbb{R}^3 \mid \|x\| < 3\}$.
5. Find the equation of the line through $(1, 1, 1)$ and $(2, 3, 4)$. Is this line a linear subspace?
§1.7 Norms, Inner Products, and Metrics
Among the properties of $\mathbb{R}^n$ is its metric property, giving a way of measuring distance between points. A metric space is a set $M$ equipped with a function $d : M \times M \to \mathbb{R}$ that gives a reasonable way of measuring the distance between two elements of $M$. The notation and language are meant to recall the familiar distance between points in Euclidean 1-, 2-, or 3-space but not to be restricted to that.
1.7.1 Definition A metric space $(M, d)$ is a set $M$ and a function $d : M \times M \to \mathbb{R}$ such that
i. $d(x, y) \ge 0$ for all $x, y \in M$. (positivity)
ii. $d(x, y) = 0$ if and only if $x = y$. (nondegeneracy)
iii. $d(x, y) = d(y, x)$ for every $x, y \in M$. (symmetry)
iv. $d(x, y) \le d(x, z) + d(z, y)$ for all $x, y, z \in M$. (triangle inequality)
1.7.2 Examples
a. We saw in 1.1.6 that the real line $\mathbb{R}$ itself is a metric space with metric defined by $d(x, y) = |x - y|$. Readers who are familiar with complex numbers will recognize that $\mathbb{C}$ is also a metric space with $d(z, w) = |z - w|$; see §1.8 for details. Similarly, 1.6.6 shows that $\mathbb{R}^n$ is a metric space (with the standard metric).
b. Discrete metric Let $M$ be any set and let $d(x, y) = 0$ if $x = y$ and $d(x, y) = 1$ if $x \ne y$. Then $d$ is a metric on $M$.
c. Word metric This example is useful in combinatorics and computer science. Let $M$ be the set of all "words," each word consisting of an ordered 8-tuple of 0s and 1s,
$$w = (w_1, w_2, w_3, w_4, w_5, w_6, w_7, w_8) \quad \text{with each } w_k = 0 \text{ or } 1.$$
Let $d(v, w)$ = the number of places in which $v$ and $w$ are different. For example, if $v = 01100011$ and $w = 00110101$, then $d(v, w) = 4$, since $v$ and $w$ differ in the second, fourth, sixth, and seventh bits. Then $d$ is a metric on $M$. (In establishing its properties it may help to check that $d(v, w) = \sum_{k=1}^{8} |v_k - w_k|$.)
d. Bounded metric If $d$ is a metric on a set $M$ and $\rho(x, y)$ is defined by $\rho(x, y) = d(x, y)/(1 + d(x, y))$, then $\rho$ is also a metric on $M$, as one can verify (see Exercise 10 for Chapter 1). This new metric has the property that $\rho(x, y) < 1$ for all $x, y \in M$; i.e., $\rho$ is bounded by 1. •
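The discrete, word, and bounded metrics of Examples 1.7.2 are easy to experiment with. The following sketch represents words as strings of the characters 0 and 1; the function names `discrete`, `word_metric`, and `bounded` are illustrative assumptions. It recomputes the worked example and brute-force checks the triangle inequality on all 3-bit words (a small stand-in for the 8-bit case).

```python
from itertools import product


def discrete(x, y):
    """Discrete metric: 0 if the points agree, 1 otherwise."""
    return 0 if x == y else 1


def word_metric(v, w):
    """Number of places in which the words v and w differ."""
    if len(v) != len(w):
        raise ValueError("words must have the same length")
    return sum(a != b for a, b in zip(v, w))


def bounded(d):
    """Turn a metric d into the bounded metric d / (1 + d)."""
    def rho(x, y):
        return d(x, y) / (1 + d(x, y))
    return rho


v, w = "01100011", "00110101"
print(word_metric(v, w))
# 1-based positions where v and w differ
print([k for k, (a, b) in enumerate(zip(v, w), start=1) if a != b])

# Brute-force triangle inequality over all triples of 3-bit words.
words = ["".join(bits) for bits in product("01", repeat=3)]
triangle_ok = all(
    word_metric(a, c) <= word_metric(a, b) + word_metric(b, c)
    for a in words for b in words for c in words
)
print(triangle_ok)

# The bounded metric built from |x - y| never reaches 1.
rho = bounded(lambda x, y: abs(x - y))
print(rho(0, 1000))
```

The differing positions come out as the second, fourth, sixth, and seventh bits, matching the count $d(v, w) = 4$ in the example.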
Although metric spaces are often vector spaces, this is not always the case, as the discrete metric space shows. We turn now to concepts that work only in vector spaces.
1.7.3 Definition A normed space (or normed vector space) $(V, \|\cdot\|)$ is a vector space $V$ and a function $\|\cdot\| : V \to \mathbb{R}$ called a norm such that
i. $\|v\| \ge 0$ for all $v \in V$. (positivity)
ii. $\|v\| = 0$ if and only if $v = 0$. (nondegeneracy)
iii. $\|\lambda v\| = |\lambda| \, \|v\|$ for every $v \in V$ and every scalar $\lambda$. (multiplicativity)
iv. $\|v + w\| \le \|v\| + \|w\|$. (triangle inequality)
1.7.4 Examples
a. Real numbers We saw in 1.1.5 that $\mathbb{R}$ is a normed space with $\|x\| = |x|$. Readers familiar with complex numbers will recognize that $\mathbb{C}$ is a normed space with $\|z\| = |z|$.
b. Taxicab norm Consider the space $\mathbb{R}^2$, but instead of the usual norm on it, set $\|(x, y)\|_1 = |x| + |y|$. Then $\|\cdot\|_1$ is a norm on $\mathbb{R}^2$. The name comes from the associated distance. If $P = (x, y)$ and $Q = (a, b)$, then
$$d_1(P, Q) = \|P - Q\|_1 = |x - a| + |y - b|.$$
This is the sum of the vertical and horizontal separations. You must travel this distance to get from $P$ to $Q$ if you always travel parallel to the axes (stay on the streets in a taxicab). See Figure 1.7-1.
FIGURE 1.7-1 The taxicab metric
c. Supremum norm Let $M$ = all real-valued functions on the interval $[0, 1]$ that are bounded. That is, let $M = \{f : [0, 1] \to \mathbb{R} \mid$ there is a number $B$ with $|f(x)| \le B$ for every $x \in [0, 1]\}$. For each $f$ in $M$, $f([0, 1])$ is a bounded subset of $\mathbb{R}$, and so $\{|f(x)| \mid x \in [0, 1]\}$ is also. It then has a finite least upper bound and
$$\|f\|_\infty = \sup\{|f(x)| \mid x \in [0, 1]\}$$
defines a function $\|\cdot\|_\infty : M \to \mathbb{R}$. The set $M$ is a vector space, and $\|\cdot\|_\infty$ is a norm on it. We will thoroughly examine this norm in Chapter 5. •
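The taxicab distance and the supremum norm can both be explored numerically. In the sketch below, `taxicab` implements $d_1$ exactly, while `sup_norm_on_grid` only samples $|f|$ at finitely many points of $[0, 1]$, so in general it gives a lower estimate of $\|f\|_\infty$ rather than the supremum itself. The function names, the sample points, and the test functions $x$ and $x^2$ are assumptions made for illustration.

```python
def taxicab(p, q):
    """Taxicab distance d_1(P, Q) = |x - a| + |y - b| in R^2."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def sup_norm_on_grid(f, samples=1001):
    """Estimate sup{|f(x)| : x in [0, 1]} from equally spaced sample points.

    Sampling can miss the true supremum, so this is a lower estimate in
    general; it is exact whenever the supremum occurs at a grid point.
    """
    return max(abs(f(k / (samples - 1))) for k in range(samples))


# Hypothetical points P = (1, 2) and Q = (4, 6): 3 blocks across, 4 blocks up.
print(taxicab((1, 2), (4, 6)))

# Sup distance between f(x) = x and g(x) = x^2 on [0, 1]; the largest
# vertical gap x - x^2 is 1/4, attained at the grid point x = 1/2.
sup_distance = sup_norm_on_grid(lambda x: x - x * x)
print(sup_distance)
```

The taxicab distance 7 exceeds the Euclidean distance 5 between the same two points, as expected when travel is restricted to directions parallel to the axes.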
As we remarked earlier in the discussion of $\mathbb{R}^n$, norms always produce metrics.
1.7.5 Proposition If $(V, \|\cdot\|)$ is a normed vector space and $d(v, w)$ is defined by
$$d(v, w) = \|v - w\|,$$
then $d$ is a metric on $V$.
The metrics that norms produce sometimes give a measure of the difference between two vectors. We saw one example already in the taxicab metric. In the example of the space of bounded functions, the metric is
$$d(f, g) = \|f - g\|_\infty = \sup\{|f(x) - g(x)| \mid 0 \le x \le 1\}.$$
Thus, the metric given by the sup norm is the largest vertical separation between the graphs, as in Figure 1.7-2.
FIGURE 1.7-2 The sup distance between functions is the largest vertical distance between their graphs
Although many interesting metrics come from norms, not all do. Unless the vector space consists only of the zero vector, choose a nonzero $v$ and note that $\|\lambda v\| = |\lambda| \, \|v\| \to \infty$ as $|\lambda| \to \infty$, so that $d(\lambda v, 0)$ becomes arbitrarily large. Therefore, Examples 1.7.2b and d, whose metrics are bounded by 1, cannot be produced from norms.
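1.7.6 Definition A real vector space $V$ with a function $\langle \cdot, \cdot \rangle : V \times V \to \mathbb{R}$ is called an inner product space if
i. $\langle v, v \rangle \ge 0$ for all $v \in V$. (positivity)
ii. $\langle v, v \rangle = 0$ if and only if $v = 0$. (nondegeneracy)
iii. $\langle \lambda v, w \rangle = \lambda \langle v, w \rangle$ for all $v, w \in V$ and for all $\lambda \in \mathbb{R}$. (multiplicativity)
iv. $\langle v, w + u \rangle = \langle v, w \rangle + \langle v, u \rangle$ for all $v, w, u \in V$. (distributivity)
v. $\langle v, w \rangle = \langle w, v \rangle$ for all $v, w \in V$. (symmetry)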
The most important example is $\mathbb{R}^n$ with the standard inner product, but there are others.
1.7.7 Example Continuous functions Let $V$ be the space of all continuous real-valued functions on the interval $[0, 1]$. (Continuity is studied in detail in Chapter 4. For now we use the notion informally, assuming that the reader is familiar with it from calculus.) As the reader is probably aware, sums of continuous functions are continuous; we shall soon verify this officially. Thus, $V$ is a vector space, usually denoted by $C([0, 1])$, and is of great interest in analysis. We will confirm in Chapter 4 that any continuous function on $[0, 1]$ is bounded, so that the supremum norm defined in Example 1.7.4c makes sense on $C([0, 1])$. We will see in Exercise 30 at the end of this chapter that this norm cannot be associated with an inner product. There is, however, a very interesting inner product on $C([0, 1])$ not associated with the supremum norm. Products of continuous functions are continuous, and continuous functions are integrable. We will verify these assertions later, but for now we will assume them and define an inner product on $C([0, 1])$ by
$$\langle f, g \rangle = \int_0^1 f(x) g(x)\, dx.$$
It is straightforward (given the basic properties of integrals) to check that this is an inner product. It will be important to us in Chapter 10 in the study of Fourier series and Fourier analysis. •
Inner products, when available, are very powerful tools. As in $\mathbb{R}^n$, they supply a notion of orthogonality: Two vectors $v$ and $w$ are orthogonal if $\langle v, w \rangle = 0$.
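They also give us one of the most useful inequalities in analysis, the Cauchy-Schwarz (or Cauchy-Bunyakovskii-Schwarz) inequality. Before turning to it we collect some easy but useful facts.
1.7.8 Proposition If $\langle \cdot, \cdot \rangle$ is an inner product on a real vector space $V$, then
i. $\langle \lambda v + \mu w, u \rangle = \lambda \langle v, u \rangle + \mu \langle w, u \rangle$
ii. $\langle u, \lambda v + \mu w \rangle = \lambda \langle u, v \rangle + \mu \langle u, w \rangle$
iii. $\langle v, \lambda w \rangle = \lambda \langle v, w \rangle$
iv. $\langle 0, w \rangle = \langle w, 0 \rangle = 0$.
The proof is straightforward and is left to the reader.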
1.7.9 Cauchy-Schwarz Inequality If $(V, \langle \cdot, \cdot \rangle)$ is an inner product space, then
$$|\langle v, w \rangle| \le \sqrt{\langle v, v \rangle}\, \sqrt{\langle w, w \rangle}$$
for every $v$ and $w$ in $V$.
The proof we gave in the preceding section is valid in this abstract context, too! Besides other uses, this inequality supplies the key to showing that every inner product produces a norm.
1.7.10 Proposition If $(V, \langle \cdot, \cdot \rangle)$ is an inner product space and $\|\cdot\|$ is defined for $v$ in $V$ by $\|v\| = \sqrt{\langle v, v \rangle}$, then $\|\cdot\|$ is a norm on $V$.
The main point here is that one gets the triangle inequality as an easy consequence of the Cauchy-Schwarz inequality:
$$\begin{aligned} \|v + w\|^2 &= \langle v + w, v + w \rangle = \|v\|^2 + 2\langle v, w \rangle + \|w\|^2 \\ &\le \|v\|^2 + 2|\langle v, w \rangle| + \|w\|^2 \\ &\le \|v\|^2 + 2\|v\| \, \|w\| + \|w\|^2 = (\|v\| + \|w\|)^2, \end{aligned}$$
and so $\|v + w\| \le \|v\| + \|w\|$.
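For polynomials, the inner product $\langle f, g \rangle = \int_0^1 f(x) g(x)\, dx$ of Example 1.7.7 can be computed exactly, which gives a concrete check of the Cauchy-Schwarz and triangle inequalities. The sketch below is an illustration under stated assumptions: polynomials are stored as coefficient lists (constant term first), the helper names `poly_mul`, `poly_add`, and `inner` are hypothetical, and the sample functions $f(x) = 1 - 2x$ and $g(x) = x^2$ are arbitrary choices.

```python
import math
from fractions import Fraction


def poly_mul(p, q):
    """Coefficients of the product of two polynomials."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_add(p, q):
    """Coefficients of the sum of two polynomials."""
    n = max(len(p), len(q))
    return [(p[k] if k < len(p) else 0) + (q[k] if k < len(q) else 0)
            for k in range(n)]


def inner(p, q):
    """<p, q> = integral over [0, 1] of p(x) q(x) dx, term by term."""
    return sum(c / (k + 1) for k, c in enumerate(poly_mul(p, q)))


f = [Fraction(1), Fraction(-2)]              # f(x) = 1 - 2x
g = [Fraction(0), Fraction(0), Fraction(1)]  # g(x) = x^2
one = [Fraction(1)]                          # the constant function 1

print(inner(f, g), inner(f, f), inner(g, g))

# Cauchy-Schwarz in squared form, so the comparison stays exact.
print(inner(f, g) ** 2 <= inner(f, f) * inner(g, g))

# Triangle inequality for the norm ||v|| = sqrt(<v, v>).
h = poly_add(f, g)
print(math.sqrt(inner(h, h)) <= math.sqrt(inner(f, f)) + math.sqrt(inner(g, g)))

# f is orthogonal to the constant function 1.
print(inner(f, one))
```

Here $\langle f, g \rangle = -1/6$, $\langle f, f \rangle = 1/3$, and $\langle g, g \rangle = 1/5$, so $\langle f, g \rangle^2 = 1/36 \le 1/15$. The last line also shows that $1 - 2x$ and the constant function 1 are orthogonal in this inner product.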
Exercises for §1.7
1. In $C([0, 1])$ find $d(f, g)$ where $f(x) = 1$ and $g(x) = x$ for both the sup norm and the norm given in Example 1.7.7.
2. Identify the space of polynomials of degree $n - 1$ in a real variable with $\mathbb{R}^n$. In the corresponding Euclidean metric, calculate $d(f, g)$ where $f(x) = 1$ and $g(x) = x$.
3. Put the inner product $\langle f, g \rangle = \int_0^1 f(x) g(x)\, dx$ on $C([0, 1])$. Verify the Cauchy-Schwarz inequality for $f(x) = 1$ and $g(x) = x$.
4. Using the inner product in Exercise 3, verify the triangle inequality for $f(x) = x$ and $g(x) = x^2$.
5. Show that $\|\cdot\|_\infty$ is not the norm defined by the metric in Example 1.7.7.
§1.8 The Complex Numbers
Note. The following material is important in analysis in general but is not required for the bulk of this book, except for Chapter 10. You will see much more of this in courses on complex variables; see, for instance, Basic Complex Analysis by J. Marsden and M. Hoffman, 2d ed., W. H. Freeman, New York, 1987.
The real numbers $\mathbb{R}$ were created to supply a continuous unbroken line of numbers suitable for the needs of geometry and calculus. However, there are still a few problems: While we have supplied a square root of 2, so that the equation $x^2 - 2 = 0$ has a solution, $x^2 + 2 = 0$ has none. Complex numbers supply solutions to such equations. The set of complex numbers $\mathbb{C}$ is created following the clue given by the quadratic formula, which states that the equation $ax^2 + bx + c = 0$ with $a \ne 0$ has solutions $x = (-b \pm \sqrt{b^2 - 4ac})/2a$. For the equation $x^2 + x - 6 = 0$ this produces the solutions $x = 2$ and $x = -3$. For the equation $x^2 + 2x + 5 = 0$, it suggests $x = -1 \pm 2\sqrt{-1}$. In school we are sometimes taught to conclude that there is no solution. But suppose we try to find one in a spirit of mathematical playfulness. If these values of $x$ are inserted into the equation and manipulated according to the rules of arithmetic with the proviso that $(\sqrt{-1})^2 = -1$, they do indeed "solve" the equation. Since the quadratic formula always gives real solutions if $b^2 - 4ac \ge 0$ and suggests "solutions" $z = (-b/2a) \pm (\sqrt{4ac - b^2}/2a)\sqrt{-1}$ if $b^2 - 4ac < 0$, we are led to "play" with objects $z = x + y\sqrt{-1}$ where $x$ and $y$ are real. To simplify notation, we use the symbol $i$ to stand for $\sqrt{-1}$. We then manipulate the expressions $x + iy$ according to the rules of arithmetic with the proviso that $i^2 = -1$. If $x, y, a, b \in \mathbb{R}$, then we find that
$$\begin{aligned} (x + iy) + (a + ib) &= (x + a) + i(y + b) \\ (x + iy) + (0 + i0) &= x + iy \\ (x + iy) + (-x - iy) &= 0 + i0 \\ (x + iy) \cdot (a + ib) &= (ax - by) + i(bx + ay) \\ (x + iy) \cdot (1 + i0) &= x + iy. \end{aligned}$$
Can we divide? At least formally, we can write
$$\frac{1}{x + iy} = \frac{1}{x + iy} \cdot \frac{x - iy}{x - iy} = \frac{x}{x^2 + y^2} + i\, \frac{-y}{x^2 + y^2}.$$
If at least one of $x$ or $y$ is not 0, then this is an allowable expression in our system. Direct multiplication confirms that
$$\left( \frac{x}{x^2 + y^2} + i\, \frac{-y}{x^2 + y^2} \right) \cdot (x + iy) = (1 + i0).$$
The complex numbers are given a logical basis and a geometric realization by identifying the construction $x + iy$ with the ordered pair $(x, y)$ and then with a point in the Euclidean plane of analytic geometry. The horizontal axis consisting of points $(x, 0)$ is thus identified with the real numbers by $x$ "=" $x + i0$ "=" $(x, 0)$. It is referred to as the real axis. The vertical axis, consisting of points $(0, y)$, is traditionally referred to as the imaginary axis. The point $(0, 1)$ is identified with $i$ and is often called the imaginary unit. The rules of addition and multiplication become definitions:

$$(x, y) + (a, b) = (x + a, y + b)$$
$$(x, y) \cdot (a, b) = (ax - by, bx + ay).$$
It is a tedious but routine chore to check that the plane with these operations, and with $(0, 0)$ as additive identity and $(1, 0)$ as multiplicative identity, is a field. (It satisfies properties 1-10 of the field axioms given in §1.1.) The reciprocal of a nonzero point is given, as suggested in our formula for $1/(x + iy)$, by

$$(x, y)^{-1} = \left(\frac{x}{x^2 + y^2}, \frac{-y}{x^2 + y^2}\right).$$
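The ordered-pair definitions above can be tried out directly. The following is a minimal sketch, not part of the text's development: the function names `add`, `mul`, and `reciprocal` are illustrative choices, and exact rational arithmetic (Python's `fractions.Fraction`) is assumed so that the identities can be checked without rounding.

```python
from fractions import Fraction

def add(z, w):
    """(x, y) + (a, b) = (x + a, y + b)."""
    return (z[0] + w[0], z[1] + w[1])

def mul(z, w):
    """(x, y) * (a, b) = (ax - by, bx + ay)."""
    x, y = z
    a, b = w
    return (a * x - b * y, b * x + a * y)

def reciprocal(z):
    """(x, y)^(-1) = (x/(x^2 + y^2), -y/(x^2 + y^2)) for (x, y) != (0, 0)."""
    x, y = z
    n = x * x + y * y
    if n == 0:
        raise ZeroDivisionError("(0, 0) has no reciprocal")
    return (Fraction(x) / n, Fraction(-y) / n)

i = (0, 1)
z = (3, 4)
print(mul(i, i))                        # (-1, 0): a square root of -1
print(mul(z, reciprocal(z)) == (1, 0))  # True
```

The check `mul(i, i)` is the pair form of $i^2 = -1$, and the last line is the pair form of the direct multiplication that confirms the reciprocal formula.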
Furthermore, the real numbers can be identified with the horizontal axis by $x \leftrightarrow (x, 0)$, and everything agrees: $x + y \leftrightarrow (x, 0) + (y, 0)$ and $xy \leftrightarrow (xy, 0)$. Finally, we check that $(0, 1) \cdot (0, 1) = (-1, 0) \leftrightarrow -1$, so that we have a square root for $-1$. The resulting field is called the complex numbers and is denoted by $\mathbb{C}$. However, $\mathbb{C}$ cannot be an ordered field. In an ordered field we know that $z^2 \geq 0$ for every element $z$ and that $-1 < 0$. But in $\mathbb{C}$ we have an element $i$ with $i^2 = -1$. Consequently, there is no way to introduce an ordering in $\mathbb{C}$ that has all of the order properties in the definition of an ordered field.

Identification of $\mathbb{C}$ with the plane $\mathbb{R}^2$ gives it a geometric reality. Addition is the same as vector addition in the plane and follows the parallelogram law, as in Figure 1.8-1.

FIGURE 1.8-1 Complex numbers are added like vectors, illustrating $(a + bi) + (c + di) = (a + c) + (b + d)i$
The geometric meaning of multiplication becomes clear if we use polar coordinates. Suppose $z = (r\cos\theta, r\sin\theta) = r\cos\theta + ir\sin\theta$ and $w = (\rho\cos\varphi, \rho\sin\varphi)$ [...] $\lambda\langle v, w\rangle$ and $\langle\lambda v, w\rangle = \bar{\lambda}\langle v, w\rangle$. This causes no substantial difference if one is consistent. The statement of the Cauchy-Schwarz inequality still holds for complex spaces, although the proof is trickier. (The vectors must be "rotated" first by multiplication by a complex scalar of norm 1 so that the appropriate coefficients used in the proof are real.) The equation $\|z\|^2 = \langle z, z\rangle$ defines a norm on $V$ as before. The argument for the triangle inequality is this:

$$\begin{aligned}
\|x + y\|^2 &= \langle x + y, x + y\rangle = \langle x, x\rangle + \langle x, y\rangle + \langle y, x\rangle + \langle y, y\rangle \\
&= \|x\|^2 + \langle x, y\rangle + \overline{\langle x, y\rangle} + \|y\|^2 \\
&= \|x\|^2 + 2\,\mathrm{Re}(\langle x, y\rangle) + \|y\|^2 \\
&\leq \|x\|^2 + 2|\langle x, y\rangle| + \|y\|^2 \\
&\leq \|x\|^2 + 2\|x\|\,\|y\| + \|y\|^2 \quad \text{(by Cauchy-Schwarz)} \\
&= (\|x\| + \|y\|)^2.
\end{aligned}$$
•
Exercises for §1.8

1. Express the following complex numbers in the form $a + ib$:
   a. $(2 + 3i) + (4 + i)$
   b. $(2 + 3i)/(4 + i)$
   c. $1/i + 3/(1 + i)$

2. Find the real and imaginary parts of the following, where $z = x + iy$:
   a. $(z + 1)/(2z - 5)$
   b. $z^3$

3. Is it true that $\mathrm{Re}(zw) = (\mathrm{Re}\, z)(\mathrm{Re}\, w)$?

4. What is the complex conjugate of $(8 - 2i)^{10}/(4 + 6i)^5$?

5. Does $z^2 = |z|^2$? If so, prove this equality. If not, for what $z$ is it true?

6. Assuming either $|z| = 1$ or $|w| = 1$ and $\bar{z}w \neq 1$, prove that
$$\left|\frac{z - w}{1 - \bar{z}w}\right| = 1.$$

7. Prove that
   a. [...]
   b. $e^z \neq 0$ for any complex number $z$.
   c. $|e^{i\theta}| = 1$ for each real number $\theta$.
   d. $(\cos\theta + i\sin\theta)^n = \cos n\theta + i\sin n\theta$.

8. Show that
   a. $e^z = 1$ iff $z = k2\pi i$ for some integer $k$.
   b. $e^{z_1} = e^{z_2}$ iff $z_1 - z_2 = k2\pi i$ for some integer $k$.

9. For what $z$ does the sequence $z_n = nz^n$ converge?
Theorem Proofs for Chapter 1

1.1.2 Proposition In an ordered field the following properties hold:

ii. Unique inverses If $a + x = 0$, then $x = -a$, and if $ax = 1$, then $x = a^{-1}$.

vi. $0 \cdot x = 0$ for every $x$.

viii. $-x = (-1)x$ for every $x$.

xiii. $0 < 1$.

xiv. For any $x$, $x^2 \geq 0$.

Proof To prove ii, suppose $a + x = 0$. Then

$$-a = -a + 0 = -a + (a + x) = (-a + a) + x = 0 + x = x,$$

and so $-a = x$, as claimed. Likewise, if $ax = 1$, then $a^{-1} = a^{-1} \cdot 1 = a^{-1}(ax) = (a^{-1}a)x = 1x = x$.

For vi: $0 \cdot x = (0 + 0)x = 0 \cdot x + 0 \cdot x$, and so

$$0 = 0 \cdot x + (-0 \cdot x) = (0 \cdot x + 0 \cdot x) + (-0 \cdot x) = 0 \cdot x + (0 \cdot x + (-0 \cdot x)) = 0 \cdot x + 0 = 0 \cdot x.$$

For viii: $x + (-1) \cdot x = 1 \cdot x + (-1) \cdot x = (1 + (-1)) \cdot x = 0 \cdot x = 0$ by vi. Thus, $(-1) \cdot x = -x$ by ii.

For xiii, suppose $1 \leq 0$. Then $1 + (-1) \leq 0 + (-1)$, and so $0 \leq -1$. We could then use property 16: Since $0 \leq -1$ and $0 \leq -1$, we get $0 \leq (-1) \cdot (-1) = -(-1) = 1$. Therefore, $1 \leq 0$ and $0 \leq 1$, and so $1 = 0$ by property 12, in contradiction to property 10.

To prove xiv, consider two cases: If $x \geq 0$, then $x^2 = x \cdot x \geq 0$, by axiom 16. If $x < 0$, then $x^2 = (-(-x))(-(-x)) = (-1)^2(-x)^2$, by vii and viii. But $(-1)^2 = 1$, since $0 = (-1)(-1 + 1) = (-1)^2 + (-1) \cdot 1 = (-1)^2 - 1$. Thus $x^2 = (-x)^2 \geq 0$. •
1.1.7 Proposition $\mathbb{N}$ is well-ordered by the relation $\leq$.

Proof Suppose $S \subset \mathbb{N}$ is a set with no smallest element. Let $T = \mathbb{N} \setminus S$. We use induction to show that $T = \mathbb{N}$ and hence $S = \emptyset$. If 0 were in $S$ it would have to be a smallest element. Thus, $0 \in T$. Instead of attacking $T$ directly we consider $T_0 = \{n \in \mathbb{N} \mid \{0, 1, 2, \ldots, n\} \subset T\}$. Since $T_0 \subset T$, we will be done if we show $T_0 = \mathbb{N}$. The argument just given shows $0 \in T_0$. Suppose $k \in T_0$. Then $\{0, 1, 2, \ldots, k\} \subset T$. If $k + 1$ were in $S$ it would be a smallest element in $S$, since all natural numbers smaller than $k + 1$ are in $T$. Thus $k + 1$ must also be in $T$. Since $\{0, 1, 2, \ldots, k, k + 1\} \subset T$, we have $k + 1 \in T_0$. Thus $T_0 = \mathbb{N}$ by induction. •
1.2.2 Sandwich Lemma Suppose $x_n \to L$, $y_n \to L$, and $x_n \leq z_n \leq y_n$ for all $n$ (it is enough to assume that there is an $N_0$ such that $x_n \leq z_n \leq y_n$ whenever $n \geq N_0$). Then $z_n \to L$.

Proof Indeed, let $\varepsilon > 0$ be given. Then

1. There is an $N_0$ such that $x_n \leq z_n \leq y_n$ whenever $n \geq N_0$,

2. There is an $N_1$ such that $|x_n - L| < \varepsilon$ whenever $n \geq N_1$, and

3. There is an $N_2$ such that $|y_n - L| < \varepsilon$ whenever $n \geq N_2$.

Selecting $N = \max(N_0, N_1, N_2)$, then whenever $n \geq N$,

$$-\varepsilon < x_n - L \leq z_n - L \leq y_n - L < \varepsilon,$$

and so $|z_n - L| < \varepsilon$, as needed. •
1.2.5 Proposition If $x_n$ is a sequence in $\mathbb{R}$ and $x_n \to x$ and $x_n \to y$, then $x = y$.

Proof Let $x_n$ converge to both $x$ and $y$. Write

$$|x - y| \leq |x - x_n| + |x_n - y|$$

by the triangle inequality. If $|x - y| > 0$, then using $|x - y|/2$ as our $\varepsilon$, we can choose $N$ so large that $|x - x_n| < |x - y|/2$ and $|x_n - y| < |x - y|/2$ if $n \geq N$. Thus we would conclude that $|x - y| < |x - y|$, which cannot be. Hence $|x - y| = 0$ and so $x = y$. •
1.2.6 Proposition A convergent sequence is bounded.

Proof If $x_n \to x$, there is an $N$ with $|x_n - x| < 1$ whenever $n \geq N$. Thus $|x_n| \leq |x| + 1$ when $n \geq N$. Letting $M = \max\{|x|, |x_1|, |x_2|, \ldots, |x_{N-1}|\} + 1$, we get $|x_n| \leq M$ for every $n$. •
1.2.7 Limit Theorem for Sequences Suppose $x_n \to x$, $y_n \to y$, and $\lambda$ is constant. Then

i. $x_n + y_n \to x + y$.

ii. $\lambda x_n \to \lambda x$.

iii. $x_n y_n \to xy$.

iv. if $y_n \neq 0$ and $y \neq 0$, then $x_n/y_n \to x/y$.
Proof The proofs illustrate the use of the triangle inequality.

i. Let $\varepsilon > 0$. Select $N_1$ so that $|x_n - x| < \varepsilon/2$ whenever $n \geq N_1$ and select $N_2$ so that $|y_n - y| < \varepsilon/2$ whenever $n \geq N_2$. If $N = \max(N_1, N_2)$, then whenever $n \geq N$, we have

$$|(x_n + y_n) - (x + y)| = |(x_n - x) + (y_n - y)| \leq |x_n - x| + |y_n - y| < \varepsilon/2 + \varepsilon/2 = \varepsilon,$$

and so $x_n + y_n \to x + y$.

ii. This is a special case of iii in which $y_n$ is a constant sequence, and so it suffices to show iii.

iii. First write

$$|x_n y_n - xy| = |x_n y_n - x_n y + x_n y - xy| \leq |x_n y_n - x_n y| + |x_n y - xy| = |x_n|\,|y_n - y| + |y|\,|x_n - x|.$$

Since the sequence $x_n$ converges, it is bounded. There is a constant $M$ such that $|x_n| < M$ for every $n$. Select $N_1$ so that $|x_n - x| < \varepsilon/[2(|y| + 1)]$ whenever $n \geq N_1$, and select $N_2$ so that $|y_n - y| < \varepsilon/[2(M + 1)]$ whenever $n \geq N_2$. Let $N = \max(N_1, N_2)$. If $n \geq N$, then

$$|x_n y_n - xy| \leq |x_n|\,|y_n - y| + |y|\,|x_n - x| < \frac{M\varepsilon}{2(M + 1)} + \frac{|y|\varepsilon}{2(|y| + 1)} < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$$

Thus $x_n y_n \to xy$.

iv. Since $y \neq 0$, we have $|y|/2 > 0$. There is an $N_1$ such that $|y_n - y| < |y|/2$ whenever $n \geq N_1$, so that $|y_n| > |y|/2$. For such $n$ we have

$$\left|\frac{x_n}{y_n} - \frac{x}{y}\right| = \left|\frac{x_n y - x y_n}{y_n y}\right| = \left|\frac{(x_n - x)y + x(y - y_n)}{y_n y}\right| \leq \frac{|y|\,|x_n - x|}{|y_n y|} + \frac{|x|\,|y - y_n|}{|y_n y|} \leq \frac{2}{|y|}|x_n - x| + \frac{2|x|}{|y|^2}|y_n - y|.$$

Now select $N_2$ so that $|x_n - x| < \varepsilon|y|/4$ whenever $n \geq N_2$, and select $N_3$ so that $|y_n - y| < \varepsilon|y|^2/4|x|$ whenever $n \geq N_3$ (if $x = 0$, the second term above vanishes and no such $N_3$ is needed). If $N = \max(N_1, N_2, N_3)$ and $n \geq N$, then

$$\left|\frac{x_n}{y_n} - \frac{x}{y}\right| < \frac{2}{|y|} \cdot \frac{\varepsilon|y|}{4} + \frac{2|x|}{|y|^2} \cdot \frac{\varepsilon|y|^2}{4|x|} = \varepsilon.$$

Therefore, $x_n/y_n \to x/y$. •
1.2.11 Proposition Complete ordered fields are Archimedean.

Proof Let $\mathbb{F}$ be a complete ordered field and consider $x \in \mathbb{F}$. We must show that there is an integer $N$ such that $x < N$. If not, the monotone sequence $1, 2, 3, \ldots$ is bounded above by $x$ and so converges by the monotone sequence property. We assert that this sequence cannot converge to a number, say $y$. If it did, for any $\varepsilon > 0$ there would be an $N$ such that for $n \geq N$,

$$1 = |n + 1 - n| \leq |n + 1 - y| + |y - n| < 2\varepsilon,$$

by the triangle inequality. This gives a contradiction if $\varepsilon < 1/2$. •
1.2.17 Proposition $\mathbb{Q}$ is dense in $\mathbb{R}$. That is,

i. If $x$ and $y$ are in $\mathbb{R}$ and $x < y$, there is an $r \in \mathbb{Q}$ with $x < r < y$.

ii. If $x \in \mathbb{R}$ and $\varepsilon > 0$, there is an $r \in \mathbb{Q}$ with $|x - r| < \varepsilon$.

Proof The proof illustrates the interplay of the Archimedean property and the well-ordering property of $\mathbb{N}$. To establish i, suppose $x < y$, so that $y - x > 0$. By the Archimedean property of $\mathbb{R}$, there is an integer $n$ with $0 < 1/n < y - x$. Again by the Archimedean property (version 2), there is an integer $k$ with $k/n > x$. By the well-ordering property, there is a smallest such positive $k$. Using it, $(k - 1)/n \leq x < k/n$. If $r = k/n$, then

$$x < r = \frac{k - 1}{n} + \frac{1}{n} \leq x + \frac{1}{n} < x + (y - x) = y,$$

so that $x < r < y$, as required. Part ii follows from part i by letting $y = x + \varepsilon$. Readers are asked to supply the details themselves. •
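The two steps of the proof of 1.2.17i (first choose $n$, then the smallest admissible $k$) are constructive and can be sketched in code. This is only an illustration under stated assumptions: the function name `rational_between` is hypothetical, the inputs are assumed to be exact `Fraction` values (standing in for real numbers), and $k$ is taken to be the smallest integer with $k/n > x$, which also handles negative $x$.

```python
from fractions import Fraction
import math

def rational_between(x, y):
    """Return r = k/n with x < r < y, following the two steps of the proof."""
    if not x < y:
        raise ValueError("need x < y")
    # Archimedean step: an integer n with 0 < 1/n < y - x.
    n = math.floor(1 / (y - x)) + 1
    # Well-ordering step: the smallest integer k with k/n > x,
    # so that (k - 1)/n <= x < k/n.
    k = math.floor(n * x) + 1
    return Fraction(k, n)

print(rational_between(Fraction(1, 3), Fraction(1, 2)))  # 3/7
```

For $x = 1/3$ and $y = 1/2$ the sketch picks $n = 7$ and $k = 3$, and indeed $1/3 < 3/7 < 1/2$.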
1.2.18 Theorem The unit interval $]0, 1[$ in $\mathbb{R}$ is uncountable.

Proof Suppose $x_1, x_2, x_3, \ldots$ were proposed as an enumeration (listing) of the points in $]0, 1[$. We will show that there must be at least one $x$ in $]0, 1[$ that does not appear in the list so that no one-to-one correspondence of $\mathbb{N}$ with $]0, 1[$ is possible. Write out each $x_k$ as a decimal expansion:

$$x_1 \leftrightarrow 0.a_{11}a_{12}a_{13}a_{14}a_{15}\ldots$$
$$x_2 \leftrightarrow 0.a_{21}a_{22}a_{23}a_{24}a_{25}\ldots$$
$$x_3 \leftrightarrow 0.a_{31}a_{32}a_{33}a_{34}a_{35}\ldots$$

where each $a_{ik}$ is a digit from 0 through 9 and repeating 9s are chosen by preference over terminating decimal expansions. That is, $1/4 = 0.2499999\ldots$. Let $x = 0.b_1b_2b_3b_4\ldots$, where $b_k = 5$ if $a_{kk} \neq 5$ and $b_k = 6$ if $a_{kk} = 5$. Then $x \in\, ]0, 1[$ and $x$ is not in the list, since it is different from $x_k$ in the $k$th decimal place. •
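The diagonal rule $b_k = 5$ if $a_{kk} \neq 5$, $b_k = 6$ if $a_{kk} = 5$ can be illustrated on a finite table of digits. The sketch below is only a finite stand-in for the infinite list in the proof; the function name `diagonal_digits` and the sample rows are made up for illustration.

```python
def diagonal_digits(rows):
    """rows[k] holds the first digits of the (k+1)st listed number.

    Returns the digits b_1 b_2 ... chosen by the rule in the proof:
    5 if the diagonal digit is not 5, and 6 if it is 5.
    """
    return "".join("6" if row[k] == "5" else "5" for k, row in enumerate(rows))

# Hypothetical first five digits of five listed numbers.
rows = ["14159", "25000", "33333", "24999", "71828"]
b = diagonal_digits(rows)
print("0." + b)  # 0.56555

# The new number differs from the k-th listed number in the k-th place.
print(all(b[k] != rows[k][k] for k in range(len(rows))))  # True
```

Because only the digits 5 and 6 are used, the constructed expansion never ends in repeating 0s or 9s, which is why no ambiguity between two expansions of the same number can arise.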
1.2.19 Corollary If $x$ and $y$ are in $\mathbb{R}$ and $x < y$, then the interval $]x, y[$ contains countably many rational numbers and uncountably many irrational numbers.

Proof This follows from 1.2.17 and 1.2.18. •
1.2.20 Proposition: The harmonic sequence Let $x_1 = 1$ and $x_n = 1 + 1/2 + \cdots + 1/n$, $n = 2, 3, \ldots$. Then $x_n$ is monotone increasing but is unbounded above and so does not converge (we write $x_n \to \infty$).

Proof Note first that

$$x_2 = 1 + 1/2,$$
$$x_4 = 1 + 1/2 + 1/3 + 1/4 > 1 + 1/2 + 1/4 + 1/4 = 1 + 1/2 + 1/2,$$
$$x_8 = 1 + 1/2 + 1/3 + 1/4 + 1/5 + 1/6 + 1/7 + 1/8 > 1 + 1/2 + 1/2 + 4/8 = 1 + 1/2 + 1/2 + 1/2.$$

In general, induction gives $x_{2^n} \geq 1 + (n/2)$, and so $x_n$ is unbounded. •
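The bound $x_{2^n} \geq 1 + n/2$ can be checked exactly for small $n$. This is a sanity check, not a proof; the helper name `harmonic` is an illustrative choice and exact rational arithmetic is assumed.

```python
from fractions import Fraction

def harmonic(m):
    """x_m = 1 + 1/2 + ... + 1/m, computed exactly."""
    return sum(Fraction(1, j) for j in range(1, m + 1))

# Check x_{2^n} >= 1 + n/2 for n = 0, 1, ..., 7.
for n in range(8):
    assert harmonic(2 ** n) >= 1 + Fraction(n, 2)

print(harmonic(4))  # 25/12
```

The growth is slow: the bound only guarantees that $x_m$ exceeds $1 + n/2$ once $m$ reaches $2^n$, which matches the doubling pattern used in the proof.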
1.3.2 Proposition Let $S \subset \mathbb{R}$ be nonempty. Then $b \in \mathbb{R}$ is the least upper bound of $S$ iff $b$ is an upper bound and for every $\varepsilon > 0$ there is an $x \in S$ such that $x > b - \varepsilon$.

Proof First, suppose $b = \mathrm{lub}(S) = \sup(S)$ and $\varepsilon > 0$. We must produce an $x \in S$ such that $b < x + \varepsilon$. If there were no such $x$, we would have $b \geq x + \varepsilon$ for every $x \in S$; that is, $b - \varepsilon \geq x$. Thus $b - \varepsilon$ is an upper bound strictly less than $b$ and therefore $b$ is not the least upper bound, which contradicts our hypothesis.

Conversely, suppose $b$ satisfies the given condition. Let $d$ be an upper bound of $S$. According to the definition of $\sup(S)$, we must show that $b \leq d$. Suppose in fact, $b > d$. Let $\varepsilon = b - d$. Then $d = b - \varepsilon$ and $d \geq x$ for all $x \in S$ implies $b - \varepsilon \geq x$ or $b \geq x + \varepsilon$, and so our condition fails. Thus the supposition that $b > d$ is wrong, and we may then conclude that $b \leq d$, as required. This completes the argument. •
Note. In this proof, we used the following basic principle of logic: Showing that a statement $P$ implies a statement $Q$ (in symbols $P \Rightarrow Q$) is equivalent to showing that $\sim\! Q \Rightarrow\, \sim\! P$ where $\sim\! Q$ is the negation of $Q$. We call $\sim\! Q \Rightarrow\, \sim\! P$ the contrapositive of $P \Rightarrow Q$, whereas $Q \Rightarrow P$ is the converse. While proving $P \Rightarrow Q$ is equivalent to proving $\sim\! Q \Rightarrow\, \sim\! P$, it need not be the same as $Q \Rightarrow P$, as simple examples like $(x = -1) \Rightarrow (x^2 = 1)$ show.

1.3.3 Proposition Suppose $\emptyset \neq A \subset B \subset \mathbb{R}$. Then

$$\inf B \leq \inf A \leq \sup A \leq \sup B.$$

Proof For example, to prove that $\inf B \leq \inf A$, let $b = \inf B$, so that $b \leq x$ for all $x \in B$. Thus $b \leq x$ for all $x \in A$ also, since $A \subset B$. Thus $b$ is a lower bound for $A$, so that $b \leq \inf A$. •
1.3.4 Theorem In $\mathbb{R}$ the following hold:

i. Least upper bound property Let $S$ be a nonempty set in $\mathbb{R}$ that has an upper bound. Then $S$ has a least upper bound in $\mathbb{R}$.

ii. Greatest lower bound property Let $P$ be a nonempty set in $\mathbb{R}$ that has a lower bound. Then $P$ has a greatest lower bound in $\mathbb{R}$.

Proof

i. We shall give two proofs here. See Exercise 46 at the end of this chapter for a third proof.
Method 1 Let $M$ be an upper bound and let $n$ be a positive integer. Keeping $n$ fixed, consider the sequence $M - 1/2^n, M - 2/2^n, M - 3/2^n, \ldots$ that steps down an amount $1/2^n$ from $M$ each time we proceed in the sequence. Choose the first integer $k$ such that $M - k/2^n$ fails to be an upper bound. Call this integer $k_n$. There must be such a $k_n$, since $S \neq \emptyset$ and $\mathbb{R}$ satisfies the Archimedean property. Let $b_n = M - k_n/2^n$, so that $b_n$ is not an upper bound but $b_n + 1/2^n$ is an upper bound. Since we have decreased the size of the steps by a factor of 2 each time $n$ is increased by 1, $b_1 \leq b_2 \leq b_3 \leq \cdots$. Since $b_n \leq M$ for all $n$, we can invoke completeness to ensure that $b_n \to b$ for some $b$. Also, $b \geq b_n$ (why?). We claim that $b$ is the least upper bound. To show this, let $x \in S$. If $x > b$, let $\varepsilon = x - b$ and, by Example 1.2.14, choose $n$ so large that $1/2^n < \varepsilon$. Then

$$x = b + \varepsilon \geq b_n + \varepsilon > b_n + \frac{1}{2^n}.$$

Thus $b_n + 1/2^n$ is not an upper bound, a contradiction. Similarly, for any $\varepsilon > 0$ there must be an $x \in S$ such that $b - x < \varepsilon$, so that $b$ is the least upper bound, by 1.3.2.
Method 2 Since $S \neq \emptyset$, we can choose some $x \in S$. Let us write $y \geq S$ if $y$ is an upper bound of $S$. For $k = 0, 1, 2, \ldots$, let $N_k$ be the smallest natural number such that $x + N_k/2^k \geq S$; $N_k$ exists by the well-ordering property. Let $y_k = x + N_k/2^k$. Note that $N_{k+1} = 2 \cdot N_k$ or $N_{k+1} = 2 \cdot N_k - 1$. Thus $y_k$ is a decreasing sequence bounded below by $x$.

By the completeness property of $\mathbb{R}$, $y_k \to y$ for some $y \in \mathbb{R}$. We shall show that $y$ is the least upper bound for $S$. First, let us demonstrate that it is an upper bound. Suppose that $z \in S$ and $z > y$. Since $y_k \to y$, there is some $k$ such that $y_k < z$. This is impossible, since each $y_k$ is an upper bound for $S$. Now we need to show that $y$ is the smallest upper bound. By Proposition 1.3.2, it suffices to prove that for any given $\varepsilon > 0$ there is a $z \in S$ so that $y < z + \varepsilon$. Pick $k$ such that $1/2^k < \varepsilon$, which is possible by Example 1.2.14. By the choice of $y_k$, there is some $z \in S$ such that $z > y_k - 1/2^k$. But $y \leq y_k$ and so $z > y - 1/2^k > y - \varepsilon$, and the proof of i is complete.
ii. Consider the set $-P = \{-x \mid x \in P\}$. Since $P$ is bounded below, $-P$ is bounded above, so by i, $-P$ has a least upper bound $c \in \mathbb{R}$. From the definition we find that $-c$ is the required greatest lower bound. •
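The dyadic construction in Method 2 of part i can be imitated numerically. The sketch below is an illustration under stated assumptions: the set $S$ is represented only through a hypothetical test `is_upper_bound(y)` that decides whether $y \geq S$, the starting point `x` is assumed to lie in $S$, and exact rationals are used. The names `dyadic_upper_bounds` and `is_upper_bound` are not from the text.

```python
from fractions import Fraction

def dyadic_upper_bounds(x, is_upper_bound, steps):
    """Return [y_0, ..., y_steps] with y_k = x + N_k / 2**k, where N_k is
    the smallest natural number making y_k an upper bound of S."""
    # N_0: search upward from 0 (terminates because S is bounded above).
    N = 0
    while not is_upper_bound(x + N):
        N += 1
    ys = [Fraction(x) + N]
    for k in range(1, steps + 1):
        # N_k is either 2*N_{k-1} - 1 or 2*N_{k-1}, as noted in the proof.
        N = 2 * N
        if N > 0 and is_upper_bound(x + Fraction(N - 1, 2 ** k)):
            N -= 1
        ys.append(x + Fraction(N, 2 ** k))
    return ys

# Example: S = {t rational : t*t < 2}, with x = 1 in S.
# A rational y is an upper bound of S exactly when y > 0 and y*y >= 2.
ys = dyadic_upper_bounds(1, lambda y: y > 0 and y * y >= 2, 4)
print([str(y) for y in ys])  # ['2', '3/2', '3/2', '3/2', '23/16']
```

The values decrease toward $\sqrt{2}$, the least upper bound of this particular $S$ in $\mathbb{R}$, while each $y_k$ remains an upper bound.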
We remarked earlier that properties i and ii are each equivalent to the completeness axiom for an ordered field. We have shown one-half of the implication, namely, that the completeness axiom implies i and ii for an ordered field. Exercise 11 asks the reader to prove that i and ii both imply the completeness axiom.
1.4.2 Proposition Every convergent sequence is a Cauchy sequence.

Proof Let $x_n \to x$. For $\varepsilon > 0$ choose $N$ so that $|x_n - x| < \varepsilon/2$ if $n \geq N$. Thus, if $n, m \geq N$, $|x_n - x_m| \leq |x_n - x| + |x - x_m| < \varepsilon/2 + \varepsilon/2 = \varepsilon$, by the triangle inequality. Thus $x_n$ is Cauchy. •
1.4.3 Theorem Every bounded sequence in $\mathbb{R}$ has a subsequence that converges to some point in $\mathbb{R}$.

Proof Suppose $x_n$ is a bounded sequence in $\mathbb{R}$. There is an integer $M$ with $-M < x_n < M$ for every $n$. Bisect the interval $[-M, M]$ into two half intervals $[-M, 0]$ and $[0, M]$. At least one of these must contain $x_n$ for infinitely many indices $n$. Call it $I_0$ and select $n_0$ with $x_{n_0} \in I_0$. Now split $I_0$ in half and let $I_1$ be one of the portions that contains $x_n$ for infinitely many indices $n$. As there are infinitely many $x_n$'s available, we can select $n_1 > n_0$ with $x_{n_1}$ in $I_1$. Continue in this way to obtain subintervals, indices, and points such that

1. $I_0 \supset I_1 \supset I_2 \supset I_3 \supset \cdots$.

2. $I_k = [a_k, b_k]$ with $b_k - a_k = M/2^k$.

3. $n_0 < n_1 < n_2 < n_3 < \cdots$.

4. $x_{n_k} \in I_k$. (See Figure 1.P-1.)

Consider the sequence $a_1, a_2, a_3, \ldots$. Since $I_{k+1} \subset I_k \subset [-M, M]$, we have $-M \leq a_k \leq a_{k+1} \leq M$ for every $k$. The sequence is monotone increasing and bounded and so must converge to some number $x$. For each $k$ we have

$$|x_{n_k} - x| \leq |x_{n_k} - a_k| + |a_k - x| \leq M/2^k + |a_k - x|.$$

Let $\varepsilon > 0$. By Example 1.2.14, $1/2^k \to 0$ as $k \to \infty$, so that there is an integer $K_1$ such that $1/2^k < \varepsilon/(2M)$ whenever $k \geq K_1$. Since $a_k \to x$, there is an integer $K_2$ such that $|a_k - x| < \varepsilon/2$ whenever $k \geq K_2$. Let $K = \max(K_1, K_2)$. If $k \geq K$, then $|x_{n_k} - x| < \varepsilon/2 + \varepsilon/2 = \varepsilon$. Therefore $\lim_{k \to \infty} x_{n_k} = x$, and so the subsequence $x_{n_k}$ converges to $x$. •

FIGURE 1.P-1 The bisection process is used to show that every sequence in $[-M, M]$ has a convergent subsequence
1.4.5 Corollary If $a$ and $b$ are in $\mathbb{R}$ and $a < b$, then every sequence of points in the interval $[a, b] = \{x \mid a \leq x \leq b\}$ has a subsequence that converges to some point in $[a, b]$.

Proof If $x_n$ is a sequence of points in $[a, b]$, then $a \leq x_n \leq b$ for every $n$. The sequence is bounded, and so by Theorem 1.4.3 there is a subsequence $x_{n_1}, x_{n_2}, x_{n_3}, \ldots$ and there is a point $x$ in $\mathbb{R}$ such that $\lim_{k \to \infty} x_{n_k} = x$. By Proposition 1.2.3, we have $a \leq x \leq b$, and so $x \in [a, b]$. •
In the text we showed how to get Theorem 1.4.4 from the following lemmas.
1.4.6 Lemma Every Cauchy sequence is bounded.

Proof Suppose $x_n$ is a Cauchy sequence. Then there is an $N$ such that $|x_n - x_k| < 1$ whenever $n \geq N$ and $k \geq N$. Therefore, $|x_n| \leq |x_N| + 1$ for $n = N, N + 1, N + 2, \ldots$. If we put $M = \max\{|x_1|, |x_2|, \ldots, |x_N|\} + 1$, then $|x_n| \leq M$ for every $n$. •

1.4.7 Lemma If a subsequence of a Cauchy sequence converges to $x$, then the sequence itself converges to $x$.

Proof Suppose $x_n$ is a Cauchy sequence and that it has a subsequence $x_{n_k}$ convergent to $x$. Suppose $\varepsilon > 0$. There is an $N$ such that $|x_n - x_m| < \varepsilon/2$ whenever $n \geq N$ and $m \geq N$. Since $x_{n_k}$ is a subsequence, there is an $n_0 > N$ with $|x_{n_0} - x| < \varepsilon/2$. If $m > N$, then

$$|x_m - x| = |x_m - x_{n_0} + x_{n_0} - x| \leq |x_m - x_{n_0}| + |x_{n_0} - x| < \varepsilon/2 + \varepsilon/2 = \varepsilon.$$

[...]

i. $x$ is a cluster point of $x_n$ if and only if for each $\varepsilon > 0$ and for each $N$, there is an $n > N$ with $|x_n - x| < \varepsilon$.

ii. $x$ is a cluster point of $x_n$ if and only if there is a subsequence of $x_n$ that converges to $x$.

iii. $x_n \to x$ if and only if every subsequence of $x_n$ converges to $x$.

iv. $x_n \to x$ if and only if the sequence is bounded and $x$ is its only cluster point.

v. $x_n \to x$ if and only if every subsequence of $x_n$ has a further subsequence that converges to $x$.
Proof

i. If $x$ is a cluster point and $\varepsilon > 0$, then there are infinitely many indices with $|x_n - x| < \varepsilon$. Only finitely many are smaller than $N$, and so one of them must be larger than $N$. In the other direction, select indices inductively: pick $n_1$ with $|x_{n_1} - x| < \varepsilon$, then $n_2 > n_1$ with $|x_{n_2} - x| < \varepsilon$, then $n_3 > n_2$ with $|x_{n_3} - x| < \varepsilon$, and so forth. We get $n_1 < n_2 < n_3 < \cdots$ with $|x_{n_j} - x| < \varepsilon$ and so $x$ is a cluster point.

ii. To prove this part, we refine the selection process just done. If $x$ is a cluster point, select an index $n_1$ with $|x_{n_1} - x| < 1$. Using i, select $n_2 > n_1$ with $|x_{n_2} - x| < 1/2$. Continuing in this way using i repeatedly, select indices $n_1 < n_2 < n_3 < n_4 < \cdots$ with $|x_{n_k} - x| < 1/k$. This gives a subsequence that converges to $x$, since if $\varepsilon > 0$ we know there is a $K$ with $1/K < \varepsilon$. Thus, $k \geq K$ implies $|x_{n_k} - x| < 1/k \leq 1/K < \varepsilon$. In the other direction, if there is a subsequence converging to $x$ and $\varepsilon > 0$, then there is an index such that the points of the subsequence past that one are closer than $\varepsilon$ to $x$. Thus $x$ is a cluster point.

iii. If $x_n \to x$ and $x_{n_j}$ is a subsequence, let $\varepsilon > 0$ and select $N$ so that $|x_n - x| < \varepsilon$ whenever $n \geq N$. Since $n_1 < n_2 < n_3 < \cdots$, induction shows that $n_k \geq k$. Hence $k \geq N$ implies $n_k \geq N$, and so $|x_{n_k} - x| < \varepsilon$. The subsequence converges to $x$. To show the other direction, we prove the contrapositive. Namely, if $x_n$ were a sequence that did not converge to $x$, then there would be an $\varepsilon > 0$ such that for each $N$ there is an index past $N$ with $|x_n - x| > \varepsilon$. Use this repeatedly to select $n_1 < n_2 < n_3 < \cdots$ with $|x_{n_j} - x| > \varepsilon$. This gives a subsequence that does not converge to $x$.

iv. [...] there would be an $\varepsilon > 0$ and a subsequence $x_{k(j)}$ such that $|x - x_{k(j)}| \geq \varepsilon$ for each $j$. Since that subsequence is bounded, it would have a convergent sub-subsequence. The limit of that sub-subsequence would be a cluster point of the original sequence different from $x$, but there are no such points. Therefore, $x_k \to x$.

v. If $x_n \to x$, then every subsequence converges to $x$ by iii, and so we do not need to pass to a sub-subsequence. If every subsequence has a sub-subsequence convergent to $x$, then $x$ is a cluster point of every subsequence and hence of $x_n$. If there were any other cluster point $y \neq x$, then there would be a subsequence convergent to it by ii. By iv, $y$ would be the only cluster point of that subsequence, and it could not have a sub-subsequence convergent to $x$. Hence $x$ must be the only cluster point of $x_n$. Therefore $x_n \to x$, by iv. •
1.5.5 Proposition Let $x_n$ be a sequence in $\mathbb{R}$. Then

i. If $x_n$ is bounded below, a number $a$ is equal to $\liminf x_n$ if and only if

   a. For all $\varepsilon > 0$, there is an $N$ such that $a - \varepsilon < x_n$ whenever $n \geq N$, and

   b. For all $\varepsilon > 0$ and for all $M$, there is an $n > M$ with $x_n < a + \varepsilon$.

ii. If $x_n$ is bounded above, a number $b$ is equal to $\limsup x_n$ if and only if

   a. For all $\varepsilon > 0$, there is an $N$ such that $x_n < b + \varepsilon$ whenever $n \geq N$, and

   b. For all $\varepsilon > 0$ and for all $M$, there is an $n > M$ with $b - \varepsilon < x_n$.

Proof [...] $\varepsilon > 0$, there is a cluster point $x$ such that $x - \varepsilon/2 < a$; also, $a \leq x$ for all cluster points. If $x$ is a cluster point, then there are infinitely many $x_n$ with $|x_n - x| < \varepsilon/2$, and so such $x_n$ satisfy $x_n - \varepsilon < a$, which proves ib. If ia fails, then there is an $\varepsilon > 0$ and infinitely many $x_n$ satisfying $x_n < a - \varepsilon$. In fact, choose a subsequence $x_{n_k}$ with $x_{n_k} < a - \varepsilon$. Since $x_n$ is bounded below, $x_{n_k}$ is bounded and so has a convergent subsequence converging to, say, $x$. Then $x \leq a - \varepsilon$, so that $x$ is a cluster point smaller than $a$, giving a contradiction. The converse is proved in the same way, and the proof of ii is similar. •
1.5.6 Proposition If $x_n$ is a sequence in $\mathbb{R}$, then

$$\limsup x_n = \inf\{\sup\{x_{n+1}, x_{n+2}, x_{n+3}, \ldots\} \mid n = 1, 2, 3, \ldots\}$$
$$\liminf x_n = \sup\{\inf\{x_{n+1}, x_{n+2}, x_{n+3}, \ldots\} \mid n = 1, 2, 3, \ldots\}.$$

Proof We prove the equality for $\limsup x_n$. First of all, we can assume that $x_n$ is bounded above, for otherwise both sides would be $+\infty$ (with the understanding that $\inf\{\infty\} = \infty$). Let $b = \limsup x_n$ and $b_n = \sup\{x_{n+1}, x_{n+2}, \ldots\}$ as in the text, so that $b_1 \geq b_2 \geq \cdots$ and $b_n \to \inf\{b_1, b_2, \ldots\}$ (if $b_n$ is unbounded below, then both sides would be $-\infty$, a case we leave to the reader). We need to prove that $b_n \to b$. For every $\varepsilon > 0$ there is an $N$ such that $x_n < b + \varepsilon$ for $n \geq N$, by 1.5.5iia. Thus, $b_n \leq b + \varepsilon$ for $n \geq N$. On the other hand, since $b - \varepsilon < x_n$ for arbitrarily large $n$ (by 1.5.5iib), we get $b - \varepsilon \leq b_n$ for every $n$. Thus, for large $n$, $b - \varepsilon \leq b_n \leq b + \varepsilon$, and so $b_n \to b$. •
1.5.7 Proposition Let $x_n$ be a given sequence in $\mathbb{R}$. Then

i. $\liminf x_n \leq \limsup x_n$.

ii. If $x_n \leq M$ for all $n$, then $\limsup x_n \leq M$.

iii. If $M \leq x_n$ for all $n$, then $\liminf x_n \geq M$.

iv. $\limsup x_n = +\infty$ if and only if $x_n$ is not bounded above.

v. $\liminf x_n = -\infty$ if and only if $x_n$ is not bounded below.

vi. If $x$ is a cluster point of $x_n$ then $\liminf x_n \leq x \leq \limsup x_n$.

vii. If $a = \liminf x_n$ is finite, then it is a cluster point of $x_n$.

viii. If $b = \limsup x_n$ is finite, then it is a cluster point of $x_n$.

ix. $x_n \to x \in \mathbb{R}$ if and only if $\limsup x_n = \liminf x_n = x \in \mathbb{R}$.
Proof

i. Given a sequence $x_n$ in $\mathbb{R}$, let $S_n = \{x_{n+1}, x_{n+2}, \ldots\}$, $a_n = \inf S_n$ (set $a_n = -\infty$ if $S_n$ is unbounded below), and $b_n = \sup S_n$ (set $b_n = \infty$ if $S_n$ is unbounded above). From the discussion preceding the statement of Proposition 1.5.6 in the main text, each $b_k$ is an upper bound for $\{a_1, a_2, a_3, \ldots\}$. Thus, $\liminf x_n = \sup\{a_1, a_2, a_3, \ldots\} \leq b_k$ for any $k$. This holds for every $k$, and so $\liminf x_n \leq \inf\{b_1, b_2, b_3, \ldots\} = \limsup x_n$.

ii. Since $x_n \leq M$ for all $n$, $\sup S_n \leq M$ for all $n$, and so $\limsup x_n = \inf_n(\sup S_n) \leq M$. The proof of iii is similar.

iv. If $x_n$ is not bounded above, $\limsup x_n = +\infty$, by definition. The converse follows from ii. The proof of v is similar.

vi. Suppose $x$ is a cluster point of $x_n$. Then for every $\varepsilon > 0$ and every $n$ there is an index $k > n$ with $x - \varepsilon < x_k < x + \varepsilon$. That is, $S_n \cap\, ]x - \varepsilon, x + \varepsilon[\; \neq \emptyset$. Thus, for every $n$, $x - \varepsilon \leq \sup S_n$ and $\inf S_n \leq x + \varepsilon$. Since this holds for every $\varepsilon > 0$, we obtain

$$\inf S_n \leq x \leq \sup S_n \quad \text{for all } n,$$

and so

$$\liminf x_n = \sup_n(\inf S_n) \leq x \leq \inf_n(\sup S_n) = \limsup x_n.$$

vii. By definition, $a = \sup\{a_1, a_2, a_3, \ldots\}$, where $a_n = \inf\{x_{n+1}, x_{n+2}, \ldots\}$. Clearly, $a_1 \leq a_2 \leq a_3 \leq \cdots \leq a$. Since $a$ is finite and equal to the least upper bound of the $a_k$, they must converge upward to it. Let $\varepsilon$ and $N$ be given. There is an $N_0$ such that $a - \varepsilon < a_n \leq a$ whenever $n \geq N_0$. Selecting $n > \max(N, N_0)$, we get $a - \varepsilon < \inf S_n \leq a$. Thus, $S_n$ must intersect the interval $]a - \varepsilon, a + \varepsilon[$. There is an index $k > n \geq N$ with $|x_k - a| < \varepsilon$, and so $a$ is a cluster point.

viii. The proof is left as an exercise.

ix. If $x_n \to x$, then for all $\varepsilon > 0$ there is an $N$ such that $x - \varepsilon < x_n < x + \varepsilon$ for all $n \geq N$. Thus by 1.5.5i, $\liminf x_n = x$ and by 1.5.5ii, $\limsup x_n = x$. Conversely, if the conditions of 1.5.5 hold with $x = a = b$, let $N$ be the greater of the $N$'s for 1.5.5ia and 1.5.5iia, so that $x - \varepsilon < x_n < x + \varepsilon$ if $n \geq N$, and thus $x_n \to x$ as $n \to \infty$. •
1.6.6 Theorem For vectors in $\mathbb{R}^n$, we have

I. Properties of the inner product:
   i. $\langle x, x\rangle \geq 0$. (positivity)
   ii. $\langle x, x\rangle = 0$ iff $x = 0$. (nondegeneracy)
   iii. $\langle x, y + w\rangle = \langle x, y\rangle + \langle x, w\rangle$. (distributivity)
   iv. $\langle \alpha x, y\rangle = \alpha\langle x, y\rangle$ for $\alpha \in \mathbb{R}$. (multiplicativity)
   v. $\langle x, y\rangle = \langle y, x\rangle$. (symmetry)

II. Properties of the norm:
   i. $\|x\| \geq 0$. (positivity)
   ii. $\|x\| = 0$ iff $x = 0$. (nondegeneracy)
   iii. $\|\alpha x\| = |\alpha|\,\|x\|$ for $\alpha \in \mathbb{R}$. (multiplicativity)
   iv. $\|x + y\| \leq \|x\| + \|y\|$. (triangle inequality)

III. Properties of the distance:
   i. $d(x, y) \geq 0$. (positivity)
   ii. $d(x, y) = 0$ iff $x = y$. (nondegeneracy)
   iii. $d(x, y) = d(y, x)$. (symmetry)
   iv. $d(x, y) \leq d(x, z) + d(z, y)$. (triangle inequality)

IV. The Cauchy-Schwarz inequality: $|\langle x, y\rangle| \leq \|x\|\,\|y\|$.

Proof Each property in I is a routine check; IV will be proved in 1.7.9. II and III will be proved as general consequences of I in 1.7.5, 1.7.9 and 1.7.10. •
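1.6.7 Proposition If $v = (v_1, \ldots, v_n)$ and $w = (w_1, \ldots, w_n)$ are vectors in $\mathbb{R}^n$ and we let $\rho(v, w) = \max\{|v_1 - w_1|, |v_2 - w_2|, \ldots, |v_n - w_n|\}$, then

$$\rho(v, w) \leq \|v - w\| \leq \sqrt{n}\,\rho(v, w).$$

Proof Clearly, for each $i$,

$$|v_i - w_i| = \sqrt{|v_i - w_i|^2} \leq \sqrt{\sum_{j=1}^{n} |v_j - w_j|^2} = \|v - w\|,$$

so that $\rho(v, w) \leq \|v - w\|$. Also,

$$\|v - w\| = \sqrt{\sum_{j=1}^{n} |v_j - w_j|^2} \leq \sqrt{\sum_{j=1}^{n} \rho(v, w)^2} = \sqrt{n\rho(v, w)^2} = \sqrt{n}\,\rho(v, w),$$

which proves the result. •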
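A quick numerical check of the two inequalities in 1.6.7 may help fix the roles of $\rho$ and the Euclidean norm. This is a sketch for one pair of vectors, not a proof; the function names `rho` and `euclid` are illustrative choices.

```python
import math

def rho(v, w):
    """Largest coordinate difference: max |v_i - w_i|."""
    return max(abs(a - b) for a, b in zip(v, w))

def euclid(v, w):
    """Euclidean distance ||v - w||."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(v, w)))

v, w = (1, 2, 3), (4, 6, 3)
n = len(v)
print(rho(v, w), euclid(v, w))  # 4 5.0
print(rho(v, w) <= euclid(v, w) <= math.sqrt(n) * rho(v, w))  # True
```

Here the coordinate differences are 3, 4, and 0, so $\rho = 4$ and $\|v - w\| = 5$, which sits between $4$ and $4\sqrt{3}$.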
1.7.5 Proposition If $(V, \|\cdot\|)$ is a normed vector space and $d(v, w)$ is defined by $d(v, w) = \|v - w\|$, then $d$ is a metric on $V$.

Proof

i. $d(v, w) \geq 0$, since $d(v, w) = \|v - w\| \geq 0$.

ii. To show $d(v, w) = 0 \iff v = w$, note that $d(v, w) = 0 \iff \|v - w\| = 0 \iff v - w = 0 \iff v = w$.

iii. $d(v, w) = d(w, v)$, since $d(v, w) = \|v - w\| = \|-(w - v)\| = |-1|\,\|w - v\| = \|w - v\| = d(w, v)$.

iv. For the triangle inequality, $d(x, y) = \|x - y\| = \|(x - z) + (z - y)\| \leq \|x - z\| + \|z - y\| = d(x, z) + d(z, y)$. •
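1.7.9 Cauchy-Schwarz Inequality If $(V, \langle\cdot,\cdot\rangle)$ is an inner product space, then

$$|\langle x, y\rangle| \leq \sqrt{\langle x, x\rangle}\,\sqrt{\langle y, y\rangle}$$

for every $x$ and $y$ in $V$.

Proof If either $x$ or $y$ is 0, then $\langle x, y\rangle = 0$, and so the inequality holds. Therefore, we can assume $x \neq 0$ and $y \neq 0$. Then $\langle x, x\rangle > 0$ and $\langle y, y\rangle > 0$. For any $\alpha$ in $\mathbb{R}$, we have

$$0 \leq \langle \alpha x + y, \alpha x + y\rangle = \alpha^2\langle x, x\rangle + 2\langle x, y\rangle\alpha + \langle y, y\rangle$$

by using the basic properties of an inner product. Setting $\langle x, x\rangle = a$, $2\langle x, y\rangle = b$, and $\langle y, y\rangle = c$, this becomes

$$a\alpha^2 + b\alpha + c \geq 0$$

for every $\alpha$ in $\mathbb{R}$. We know (since $a > 0$) that this quadratic expression has a minimum and that it occurs when $\alpha = -b/2a$. Therefore we insert that value for $\alpha$ and obtain

$$a\left(\frac{b^2}{4a^2}\right) + b\left(\frac{-b}{2a}\right) + c \geq 0; \quad \text{i.e.,} \quad c \geq \frac{b^2}{4a}.$$

Since $a > 0$, this means that $b^2 \leq 4ac$; i.e., $|\langle x, y\rangle|^2 \leq \langle x, x\rangle\langle y, y\rangle$. Taking square roots gives what we want. •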
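The proof's key step, evaluating the quadratic $a\alpha^2 + b\alpha + c$ at $\alpha = -b/2a$, can be traced on a concrete pair of vectors. The sketch below assumes the standard inner product on $\mathbb{R}^n$ and exact rational arithmetic; the names `inner` and `quadratic_minimum` are illustrative, not from the text.

```python
from fractions import Fraction

def inner(x, y):
    """Standard inner product on R^n."""
    return sum(a * b for a, b in zip(x, y))

def quadratic_minimum(x, y):
    """Minimum over real alpha of <alpha x + y, alpha x + y>, assuming x != 0."""
    a = Fraction(inner(x, x))
    b = Fraction(2 * inner(x, y))
    c = Fraction(inner(y, y))
    alpha = -b / (2 * a)                      # the minimizing value -b/2a
    return a * alpha * alpha + b * alpha + c  # equals c - b^2/(4a)

x, y = (1, 2, 2), (3, 0, 4)
print(quadratic_minimum(x, y))                          # 104/9
print(inner(x, y) ** 2 <= inner(x, x) * inner(y, y))    # True
```

For these vectors $a = 9$, $b = 22$, $c = 25$; the minimum $104/9$ is nonnegative, which is equivalent to $b^2 \leq 4ac$, that is, $121 \leq 225$.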
1.7.10 Proposition If $(V, \langle\cdot,\cdot\rangle)$ is an inner product space and $\|\cdot\|$ is defined for $v$ in $V$ by

$$\|v\| = \sqrt{\langle v, v\rangle},$$

then $\|\cdot\|$ is a norm on $V$.

Proof The only nontrivial property is the triangle inequality, which was verified in the text. •
Worked Examples for Chapter 1

Example 1.1 For real numbers, prove that

a. $x \leq |x|$, $-|x| \leq x$.

b. $|x| \leq a$ iff $-a \leq x \leq a$, where $a \geq 0$.

c. $|x + y| \leq |x| + |y|$.

Solution

a. If $x \geq 0$, then $|x| = x$; if $x < 0$, then $|x| \geq x$, since $|x| \geq 0$. In either case, $x \leq |x|$. The other assertion is similar.

b. If $x \geq 0$, then we must show that $0 \leq x \leq a$ iff $-a \leq x \leq a$, which is obvious. Similarly, if $x < 0$, then the assertion becomes $0 \leq -x \leq a$ iff $-a \leq x \leq a$, which is again obvious. Here we use the fact that if $c < 0$, then $0 \leq x \leq y$ iff $0 \geq cx \geq cy$.

c. By a, $-|x| \leq x \leq |x|$ and $-|y| \leq y \leq |y|$. Adding, we obtain $-(|x| + |y|) \leq x + y \leq |x| + |y|$. Then, by b, $|x + y| \leq |x| + |y|$. In addition, this can be proved by cases as indicated in the text. Note that this is also a special case of the triangle inequality in $\mathbb{R}^n$; see Theorem 1.6.6IIiv. •
Example 1.2 Let $S$ be a nonempty set in $\mathbb{R}$ bounded above and $x = \sup(S)$. Show that there is a sequence $x_1, x_2, \ldots$ in $S$ such that $x_k \to x$.

Solution For each $k$, use Proposition 1.3.2 to find an $x_k \in S$ such that $x_k \leq x < x_k + 1/k$. Then $x_k \to x$, since for a given $\varepsilon > 0$, we choose $N \geq 1/\varepsilon$; then $k \geq N$ implies $x_k \leq x < x_k + \varepsilon$; i.e., $|x - x_k| < \varepsilon$. •
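Example 1.3 For numbers $x_1, \ldots, x_n$, $y_1, \ldots, y_n$, and $z_1, \ldots, z_n$, show that

$$\left(\sum_{i=1}^{n} x_i y_i z_i\right)^4 \leq \left(\sum_{i=1}^{n} x_i^4\right)\left(\sum_{i=1}^{n} y_i^2\right)^2\left(\sum_{i=1}^{n} z_i^4\right).$$

Solution The Cauchy-Schwarz inequality (Theorem 1.6.6IV) says that

$$\left(\sum x_i y_i\right)^2 \leq \left(\sum x_i^2\right)\left(\sum y_i^2\right).$$

Applying this to the numbers $w_i = x_i z_i$ and $y_i$ gives

$$\left(\sum x_i y_i z_i\right)^2 \leq \left(\sum x_i^2 z_i^2\right)\left(\sum y_i^2\right).$$

Applying this again to $x_i^2$ and $z_i^2$ gives

$$\left(\sum x_i^2 z_i^2\right)^2 \leq \left(\sum x_i^4\right)\left(\sum z_i^4\right); \quad \text{i.e.,} \quad \sum (x_i z_i)^2 \leq \left(\sum x_i^4\right)^{1/2}\left(\sum z_i^4\right)^{1/2},$$

and so

$$\left(\sum x_i y_i z_i\right)^2 \leq \left(\sum x_i^4\right)^{1/2}\left(\sum z_i^4\right)^{1/2}\left(\sum y_i^2\right).$$

Squaring both sides, the result is obtained. (We have used the fact that if $a, b \geq 0$, then $a \leq b$ iff $a^2 \leq b^2$.) •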
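As a sanity check on the inequality of Example 1.3, both sides can be evaluated for sample integers. This is a minimal sketch with made-up data; the function name `example_1_3_sides` is hypothetical.

```python
def example_1_3_sides(xs, ys, zs):
    """Return (left side, right side) of the inequality in Example 1.3."""
    lhs = sum(x * y * z for x, y, z in zip(xs, ys, zs)) ** 4
    rhs = (sum(x ** 4 for x in xs)
           * sum(y ** 2 for y in ys) ** 2
           * sum(z ** 4 for z in zs))
    return lhs, rhs

lhs, rhs = example_1_3_sides((1, 2), (3, -1), (2, 5))
print(lhs, rhs)    # 256 1089700
print(lhs <= rhs)  # True
```

With a single term, for instance $x_1 = y_1 = z_1 = 1$, both sides equal 1, so the inequality cannot be improved by a constant factor.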
Example 1.4 Suppose $x \in \mathbb{R}$ and $x > 0$; show that there is an irrational number between 0 and $x$.

Solution If $x$ is rational, then since $\sqrt{2}$ is irrational (Exercise 2), so is $x/\sqrt{2}$ (why?) and is between 0 and $x$. On the other hand, if $x$ is irrational, then $x/2$ is irrational (why?) and lies between 0 and $x$. •
Example 1.5 Let $A$ and $B$ be nonempty sets in $\mathbb{R}$ that are bounded above. Let $a = \sup(A)$ and $b = \sup(B)$ and let the set $C$ be defined by $C = \{xy \mid x \in A, y \in B\}$. Show that, in general, $ab \ne \sup(C)$. If $a < 0$ and $b < 0$, prove that $ab = \inf(C)$. If $a > 0$ and $b > 0$ and $A$, $B$ have only positive elements, prove that $ab = \sup(C)$.

Solution As a specific instance, let $A = \{x \in \mathbb{R} \mid -10 < x < -1\} = \,]-10, -1[$ and $B = \,]0, 1/2[$, so that $a = -1$, $b = 1/2$, and $ab = -1/2$. But $C = \,]-5, 0[$ and $\sup(C) = 0$.

Now we prove that if $a < 0$ and $b < 0$, then $ab = \inf(C)$. For this, we use the analogue of Proposition 1.3.2 for greatest lower bounds. First, let $x \in A$ and $y \in B$. We want to show $xy \ge ab$. But $x \le a$ and $y \le b$, or $-x \ge -a \ge 0$ and $-y \ge -b \ge 0$, and so (using ordered field axiom III.16 for $\mathbb{R}$), $(-x)(-y) \ge (-a)(-b)$; i.e., $xy \ge ab$. Given $\varepsilon > 0$, we seek $x \in A$ and $y \in B$ so that $ab > xy - \varepsilon$, or $|ab - xy| < \varepsilon$. Choose $x$ and $y$ so that $a < x + \varepsilon/[2(|b| + 1)]$, $b < y + \varepsilon/[2|a|]$, and $b < y + 1$. Then, since $|uv| = |u|\,|v|$ and $|y| < |b| + 1$, we get

$$|ab - xy| \le |ab - ay| + |ay - xy| = |a|\,|b - y| + |a - x|\,|y| < |a|\,\frac{\varepsilon}{2|a|} + \frac{\varepsilon}{2(|b| + 1)}\,(|b| + 1) = \varepsilon$$

(using the triangle inequality). The last assertion can be proven in an analogous way. •
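The counterexample in Example 1.5 can be explored numerically. The sketch below samples the open intervals $]-10, -1[$ and $]0, 1/2[$ on a finite grid (so it only approximates the sets) and looks at the resulting products; the helper name `open_interval_samples` and the grid size are assumptions chosen for illustration.

```python
def open_interval_samples(lo, hi, n):
    """Return n evenly spaced points strictly inside the open interval ]lo, hi[."""
    step = (hi - lo) / (n + 1)
    return [lo + step * k for k in range(1, n + 1)]

# Finite samples of A = ]-10, -1[ and B = ]0, 1/2[ from Example 1.5.
A = open_interval_samples(-10.0, -1.0, 200)
B = open_interval_samples(0.0, 0.5, 200)

# Sampled version of C = {xy : x in A, y in B}.
C = [x * y for x in A for y in B]

a_times_b = -1.0 * 0.5  # sup(A) * sup(B) = -1/2
largest_sampled_product = max(C)
```

Every sampled product lies in $]-5, 0[$, and the largest one already exceeds $ab = -1/2$, which is consistent with $\sup(C) = 0 \ne ab$.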
Exercises for Chapter 1

1. For each of the following sets $S$, find $\sup(S)$ and $\inf(S)$ if they exist:
a. $\{x \in \mathbb{R} \mid x^2 < 5\}$
b. $\{x \in \mathbb{R} \mid x^2 > 7\}$
c. $\{1/n \mid n \text{ an integer}, n > 0\}$
d. $\{-1/n \mid n \text{ an integer}, n > 0\}$
e. $\{.3, .33, .333, \ldots\}$

2. Review the proof that $\sqrt{2}$ is irrational. Generalize this to $\sqrt{k}$ for $k$ a positive integer that is not a perfect square.

3.
a. Let $x \ge 0$ be a real number such that for any $\varepsilon > 0$, $x \le \varepsilon$. Show that $x = 0$.
b. Let $S = \,]0, 1[$. Show that for each $\varepsilon > 0$ there exists an $x \in S$ such that $x < \varepsilon$.

4. Show that $d = \inf(S)$ iff $d$ is a lower bound for $S$ and for any $\varepsilon > 0$ there is an $x \in S$ such that $d \ge x - \varepsilon$.

5. Let $x_n$ be a monotone increasing sequence bounded above and consider the set $S = \{x_1, x_2, \ldots\}$. Show that $x_n$ converges to $\sup(S)$. Make a similar statement for decreasing sequences.

6. Let $A$ and $B$ be two nonempty sets of real numbers with the property that $x \le y$ for all $x \in A$, $y \in B$. Show that there exists a number $c \in \mathbb{R}$ such that $x \le c \le y$ for all $x \in A$, $y \in B$. Give a counterexample to this statement for rational numbers (it is, in fact, equivalent to the completeness axiom and is the basis for another way of formulating the completeness axiom known as Dedekind cuts).
7. For nonempty sets $A, B \subset \mathbb{R}$, let $A + B = \{x + y \mid x \in A \text{ and } y \in B\}$. Show that $\sup(A + B) = \sup(A) + \sup(B)$.

8. For nonempty sets $A, B \subset \mathbb{R}$, determine which of the following statements are true. Prove the true statements and give a counterexample for those that are false:
a. $\sup(A \cap B) \le \inf\{\sup(A), \sup(B)\}$.
b. $\sup(A \cap B) = \inf\{\sup(A), \sup(B)\}$.
c. $\sup(A \cup B) \ge \sup\{\sup(A), \sup(B)\}$.
d. $\sup(A \cup B) = \sup\{\sup(A), \sup(B)\}$.
9. Let $x_n$ be a bounded sequence of real numbers and $y_n = (-1)^n x_n$. Show that $\limsup y_n \le \limsup |x_n|$. Need we have equality? Make up a similar inequality for $\liminf$.

10. Verify that the bounded metric in Example 1.7.2d is indeed a metric.

11. Show that i and ii of Theorem 1.3.4 both imply the completeness axiom for an ordered field.

12. In an inner product space show that
a. $2\|x\|^2 + 2\|y\|^2 = \|x + y\|^2 + \|x - y\|^2$ (parallelogram law).
b. $\|x + y\|\,\|x - y\| \le \|x\|^2 + \|y\|^2$.
c. $4\langle x, y\rangle = \|x + y\|^2 - \|x - y\|^2$ (polarization identity).
Interpret these results geometrically in terms of the parallelogram formed by $x$ and $y$.

13. What is the orthogonal complement in $\mathbb{R}^4$ of the space spanned by $(1, 0, 1, 1)$ and $(-1, 2, 0, 0)$?
14.
a.
Prove Lagrange's identity

$$\left(\sum_{i=1}^n x_i y_i\right)^2 = \left(\sum_{i=1}^n x_i^2\right)\left(\sum_{i=1}^n y_i^2\right) - \sum_{i<j} (x_i y_j - x_j y_i)^2.$$
I 9 1}
and B = {d((x,y),(0,0)) I (x,y) ES}. Find
For each $x \in \mathbb{R}$ satisfying $x \ge 0$, prove the existence of $y \in \mathbb{R}$ such that $y^2 = x$.

30. Let $V$ be the vector space $C([0, 1])$ with the norm $\|f\|_\infty = \sup\{|f(x)| \mid x \in [0, 1]\}$. Show that the parallelogram law fails and conclude that this norm does not come from any inner product. (Refer to Exercise 12.)
31. Let $A, B \subset \mathbb{R}$ and $f : A \times B \to \mathbb{R}$ be bounded. Is it true that

$$\sup\{f(x, y) \mid (x, y) \in A \times B\} = \sup\{\sup\{f(x, y) \mid x \in A\} \mid y \in B\}$$

or, the same thing in different notation,

$$\sup_{(x, y) \in A \times B} f(x, y) = \sup_{y \in B}\left(\sup_{x \in A} f(x, y)\right)?$$

32.
a. Give a reasonable definition for what $\lim_{n \to \infty} x_n = \infty$ should mean.
b. Let $x_1 = 1$ and define inductively $x_{n+1} = (x_1 + \cdots + x_n)/2$. Prove that $x_n \to \infty$.

33.
a. Show that $(\log x)/x \to 0$ as $x \to \infty$. (You may consult your calculus text and use, for example, l'Hôpital's rule.)
b. Show that $n^{1/n} \to 1$ as $n \to \infty$.

Exercises 34-45 concern complex numbers.

34. Express the following complex numbers in the form $a + bi$:
a. $(2 + 3i)(4 + i)$
b. $(8 + 6i)^2$
c. $(1 + 3/(1 + i))^2$

35. What is the complex conjugate of $(3 + 8i)^4/(1 + i)^{10}$?
36. Find the solutions to:
a. $(z + 1)^2 = 3 + 4i$.
b. $z^4 - i = 0$.

37. Find the solutions to $z^2 = 3 - 4i$.
38. If $a$ is real and $z$ is complex, prove that $\operatorname{Re}(az) = a \operatorname{Re} z$ and that $\operatorname{Im}(az) = a \operatorname{Im} z$. Generally, show that $\operatorname{Re} : \mathbb{C} \to \mathbb{R}$ is a real linear map; that is, that $\operatorname{Re}(az + bw) = a \operatorname{Re} z + b \operatorname{Re} w$ for $a, b$ real and $z, w$ complex.

39. Find the real and imaginary parts of the following, where $z = x + iy$:
a. $1/z^2$
b. $1/(3z + 2)$

40.
a. Fix a complex number $z = x + iy$ and consider the linear mapping $\varphi_z : \mathbb{R}^2 \to \mathbb{R}^2$ (that is, of $\mathbb{C} \to \mathbb{C}$) defined by $\varphi_z(w) = z \cdot w$ (that is, multiplication by $z$). Prove that the matrix of $\varphi_z$ in the standard basis $(1, 0), (0, 1)$ of $\mathbb{R}^2$ is given by

$$\begin{pmatrix} x & -y \\ y & x \end{pmatrix}.$$

41. Show that $\operatorname{Re}(iz) = -\operatorname{Im}(z)$ and that $\operatorname{Im}(iz) = \operatorname{Re}(z)$ for all complex numbers $z$.

42. Letting $z = x + iy$, prove that $|x| + |y| \le \sqrt{2}\,|z|$.

43. If $a, b \in \mathbb{C}$, prove the parallelogram identity: $|a - b|^2 + |a + b|^2 = 2(|a|^2 + |b|^2)$.

44. Prove Lagrange's identity for complex numbers:
Deduce the Cauchy inequality from your proof.

45. Show that if $|z| > 1$ then $\lim_{n \to \infty} z^n/n = \infty$.
46. Prove that each nonempty set $S$ of $\mathbb{R}$ that is bounded above has a least upper bound as follows: Choose $x_0 \in S$ and $M_0$ an upper bound. Let $a_0 = (x_0 + M_0)/2$. If $a_0$ is an upper bound, let $M_1 = a_0$ and $x_1 = x_0$; otherwise let $M_1 = M_0$ and choose $x_1 > a_0$, $x_1 \in S$. Repeat, generating sequences $x_n$ and $M_n$. Prove that they both converge to $\sup(S)$.
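The bisection construction in Exercise 46 can be tried out on a finite set, where both tests ("is $a_n$ an upper bound?" and "pick an element above $a_n$") are computable. The sketch below is only an illustration under that assumption; the function name `bisect_sup`, the step count, and the sample set are hypothetical choices, and it does not replace the convergence proof the exercise asks for.

```python
def bisect_sup(S, x0, M0, steps=60):
    """Run the bisection of Exercise 46 on a finite collection S.

    x always stays an element of S and M always stays an upper bound of S;
    each step at least halves the gap M - x.
    """
    x, M = x0, M0
    for _ in range(steps):
        a = (x + M) / 2
        if all(s <= a for s in S):
            # a is an upper bound: lower M, keep x.
            M = a
        else:
            # a is not an upper bound: move x to an element of S above a.
            x = next(s for s in S if s > a)
    return x, M

# Sample set: S = {1 - 1/n : n = 1, ..., 1000}, with x0 = 0 in S and M0 = 5.
S = [1 - 1 / n for n in range(1, 1001)]
x_final, M_final = bisect_sup(S, S[0], 5.0)
```

For this finite sample the two sequences squeeze the largest element of `S`, which plays the role of $\sup(S)$.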
Chapter 2 The Topology of Euclidean Space

In this chapter we begin our study of those basic properties of $\mathbb{R}^n$ that are important for the notion of a continuous function. We will study open sets, which generalize open intervals on $\mathbb{R}$, and closed sets, which generalize closed intervals. The study of open and closed sets constitutes the beginnings of topology. This study will be continued in Chapter 3. Most of the material in this chapter depends only on the basic properties of the distance function and so makes sense in a general metric space. Recall that the distance function $d$ for $\mathbb{R}^n$ is given by

$$d(x, y) = \left\{\sum_{i=1}^n (x_i - y_i)^2\right\}^{1/2}$$

and that the basic properties of $d$ are

1. $d(x, y) \ge 0$.
2. $d(x, y) = 0$ iff $x = y$.
3. $d(x, y) = d(y, x)$.
4. $d(x, y) \le d(x, z) + d(z, y)$ (triangle inequality).

Recall also that a set $M$ together with a distance function $d$ satisfying these properties is called a metric space.
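The Euclidean distance function recalled at the start of Chapter 2 is easy to compute directly. The following is a minimal sketch; the name `d` mirrors the text, while representing points of $\mathbb{R}^n$ as equal-length tuples is an assumption of this illustration.

```python
import math

def d(x, y):
    """Euclidean distance between two points of R^n given as equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("points must have the same number of coordinates")
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

# Spot checks of the four basic properties on sample points.
p, q, r = (0.0, 0.0), (3.0, 4.0), (1.0, -2.0)
nonnegative = d(p, q) >= 0
zero_on_diagonal = d(p, p) == 0
symmetric = d(p, q) == d(q, p)
# A small tolerance guards against rounding in the triangle inequality.
triangle = d(p, q) <= d(p, r) + d(r, q) + 1e-12
```

Such checks on a few points illustrate the properties but, of course, do not prove them.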
§2.1 Open Sets

To define open sets, we first introduce the notion of an $\varepsilon$-disk in a metric space. As usual, our primary example will be $\mathbb{R}^n$.

2.1.1 Definition Let $(M, d)$ be a metric space. For each fixed $x \in M$ and $\varepsilon > 0$, the set $D(x, \varepsilon) = \{y \in M \mid d(x, y) < \varepsilon\}$ is called the $\varepsilon$-disk about $x$ (also called the $\varepsilon$-neighborhood or $\varepsilon$-ball about $x$). See Figure 2.1-1. A set $A \subset M$ is said to be open if for each $x \in A$, there exists an $\varepsilon > 0$ such that $D(x, \varepsilon) \subset A$. A neighborhood of a point in $M$ is an open set containing that point.

FIGURE 2.1-1 The $\varepsilon$-disk
Note that the empty set $\emptyset$ and the whole space $M$ are open. It is important to realize that the $\varepsilon$ required in the definition of an open set may depend on $x$. For example, the unit square in $\mathbb{R}^2$ not including the "boundary" is open, but the $\varepsilon$-neighborhoods get smaller as we approach the boundary. However, the $\varepsilon$ cannot be zero for any $x$. See Figure 2.1-2.

Consider an open interval in $\mathbb{R} = \mathbb{R}^1$, say, $]0, 1[$. Indeed, this is an open set (see Figure 2.1-3). However, if we look upon the set as being in $\mathbb{R}^2$ (as a subset of the $x$ axis), it is no longer open. Thus, for a set to be open it is essential to specify which $\mathbb{R}^n$ and, in general, which metric space it is to be considered a subset of.

There are numerous examples of sets that are not open. The unit disk $\{x \in \mathbb{R}^2 \mid \|x\| \le 1\}$ is such an example. This set is not open because for a point on the "boundary" (that is, points $x$ with $\|x\| = 1$), every $\varepsilon$-disk contains points that do not lie in the set. See Figure 2.1-4.

FIGURE 2.1-2 An open set

FIGURE 2.1-3 $]0, 1[$ is open in $\mathbb{R}$ but not in $\mathbb{R}^2$

FIGURE 2.1-4 A nonopen set
2.1.2 Proposition In a metric space, each $\varepsilon$-disk $D(x, \varepsilon)$ is open.

The main idea for the proof is contained in Figure 2.1-5. Notice in this figure that the size of the disk about the point $y \in D(x, \varepsilon)$ gets smaller as $y$ gets closer to the boundary. The theorem should be "intuitively" clear from this picture.

FIGURE 2.1-5 Some ideas useful for the proof of Proposition 2.1.2
Some basic laws that open sets obey are the following:
2.1.3 Proposition In a metric space $(M, d)$,

i. The intersection of a finite number of open subsets of $M$ is open.
ii. The union of an arbitrary collection of open subsets of $M$ is open.
iii. The empty set $\emptyset$ and the whole space $M$ are open.

To appreciate the difference between assertions i and ii, note that the intersection of an arbitrary family of open sets need not be open. For example, in $\mathbb{R}^1$, a single point (which is not an open set) is the intersection of the collection of all open intervals containing it (why?).

Note. A set with a specified collection of subsets (called, by definition, open sets) obeying the rules in Proposition 2.1.3, and containing the empty set and the whole space, is called a topological space. We shall not deal with general topological spaces in this book, but primarily with the cases of $\mathbb{R}^n$ and metric spaces. However, much of what is said in Chapters 2, 3, and 4 does apply to the more general setting.
2.1.4 Example Let $S = \{(x, y) \in \mathbb{R}^2 \mid 0 < x < 1\}$. Show that $S$ is open.

Solution In Figure 2.1-6 we see that about each point $(x, y) \in S$ we can draw the disk of radius $r = \min\{x, 1 - x\}$, and it is entirely contained in $S$; you can see this from the picture and can prove it using the triangle inequality. Hence, by definition, $S$ is open. •

FIGURE 2.1-6 Any point of this set has a disk about it also lying in the set
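The radius $r = \min\{x, 1 - x\}$ from Example 2.1.4 can be tested empirically by sampling points of the disk and checking that they stay in the strip. This is only a sketch: the function names, the sample centers, and the slight shrinking factor (added so that floating-point rounding cannot push a sample onto the edge of the disk) are assumptions of the illustration.

```python
import math
import random

def in_strip(p):
    """Membership test for S = {(x, y) : 0 < x < 1}."""
    return 0 < p[0] < 1

def safe_radius(p):
    """The radius r = min{x, 1 - x} used in Example 2.1.4."""
    return min(p[0], 1 - p[0])

def sample_open_disk(center, r, rng):
    """Return a random point of the open disk D(center, r)."""
    theta = rng.uniform(0.0, 2.0 * math.pi)
    # rng.random() lies in [0, 1); the factor keeps samples strictly inside despite rounding.
    rho = r * rng.random() * (1 - 1e-9)
    return (center[0] + rho * math.cos(theta), center[1] + rho * math.sin(theta))

def disk_stays_in_strip(center, trials=2000, seed=0):
    """Check that sampled points of D(center, r) all lie in the strip S."""
    rng = random.Random(seed)
    r = safe_radius(center)
    return all(in_strip(sample_open_disk(center, r, rng)) for _ in range(trials))
```

A call such as `disk_stays_in_strip((0.25, 3.0))` samples the disk of radius $0.25$ about $(0.25, 3)$; the triangle inequality argument in the text is what guarantees the result for every point of the disk, not just the sampled ones.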
2.1.5 Example Let $S = \{(x, y) \in \mathbb{R}^2 \mid 0 < x \le 1\}$. Is $S$ open?

Solution No, because any disk about $(1, 0) \in S$ contains points $(x, 0)$ with $x > 1$. •
2.1.6 Example Let $A \subset \mathbb{R}^n$ be open and $B \subset \mathbb{R}^n$. Define $A + B = \{x + y \in \mathbb{R}^n \mid x \in A \text{ and } y \in B\}$. Prove that $A + B$ is open.

Solution Let $w \in A + B$. There are points $x \in A$ and $y \in B$ with $w = x + y$. Since $A$ is open, there is an $\varepsilon > 0$ such that $D(x, \varepsilon) \subset A$. We claim that $D(w, \varepsilon) \subset A + B$. Suppose $z \in D(w, \varepsilon)$. Then $\|z - w\| = \|z - (x + y)\| < \varepsilon$. But $\|z - (x + y)\| = \|(z - y) - x\|$, so $z - y \in D(x, \varepsilon) \subset A$. Since $y \in B$, this forces $z = (z - y) + y$ to be in $A + B$. Thus, $D(w, \varepsilon) \subset A + B$ and hence $A + B$ is an open set. •

Note that we have in effect shown that if $A$ is open then $A + \{y\}$ is open. It follows that $A + B = \bigcup_{y \in B}(A + \{y\})$ is also open.
2.1.7 Example Let $M$ be any set, and define the discrete metric $d_0$ on $M$ by $d_0(x, y) = 0$ if $x = y$ and $d_0(x, y) = 1$ if $x \ne y$. Prove that every set $A \subset M$ is open.

Solution Indeed, given $x \in A$, $D(x, 1/2) = \{x\} \subset A$, and so by definition, $A$ is open. •
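For a small finite set, the claim of Example 2.1.7 can be checked exhaustively. In the sketch below, `d0` follows the definition in the text, while `disk`, `is_open`, and `all_subsets` are hypothetical helper names; `is_open` tests the definition of openness with the single radius $1/2$ used in the solution.

```python
from itertools import combinations

def d0(x, y):
    """Discrete metric: 0 if the points coincide, 1 otherwise."""
    return 0 if x == y else 1

def disk(M, d, x, eps):
    """The eps-disk D(x, eps) = {y in M : d(x, y) < eps}."""
    return {y for y in M if d(x, y) < eps}

def is_open(A, M, d, eps=0.5):
    """Check that D(x, eps) is contained in A for every x in A."""
    return all(disk(M, d, x, eps) <= A for x in A)

def all_subsets(M):
    """Yield every subset of the finite set M."""
    items = list(M)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield set(combo)

M = {"a", "b", "c", "d"}
every_subset_open = all(is_open(A, M, d0) for A in all_subsets(M))
```

With the discrete metric, the disk of radius $1/2$ about a point is that point alone, while any radius larger than 1 gives all of `M`.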
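Exercises for §2.1

1. Show that $\mathbb{R}^2 \setminus \{(0, 0)\}$ is open in $\mathbb{R}^2$.

2. Let $S = \{(x, y) \in \mathbb{R}^2 \mid xy > 1\}$. Show that $S$ is open.

3. Let $A \subset \mathbb{R}$ be open and $B \subset \mathbb{R}^2$ be defined by $B = \{(x, y) \in \mathbb{R}^2 \mid x \in A\}$. Show that $B$ is open.

4. Let $B \subset \mathbb{R}^n$ be any set. Define $C = \{x \in \mathbb{R}^n \mid d(x, y) < 1 \text{ for some } y \in B\}$. Show that $C$ is open.

5. Let $A \subset \mathbb{R}$ be open and $B \subset \mathbb{R}$. Define $AB = \{xy \in \mathbb{R} \mid x \in A \text{ and } y \in B\}$. Is $AB$ necessarily open?

6. Show that $\mathbb{R}^2$ with the taxicab metric has the same open sets as it does with the standard metric.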
§2.2 Interior of a Set

2.2.1 Definition Let $M$ be a metric space and $A \subset M$. A point $x \in A$ is called an interior point of $A$ if there is an open set $U$ such that $x \in U \subset A$. The interior of $A$ is the collection of all interior points of $A$ and is denoted $\operatorname{int}(A)$. This set might be empty.

The condition on $x$ is equivalent to the following: There is an $\varepsilon > 0$ such that $D(x, \varepsilon) \subset A$ (see Exercise 5 at chapter's end). For example, the interior of a single point in $\mathbb{R}^n$ is empty. The interior of the unit disk in $\mathbb{R}^2$, including its boundary, is the unit disk without its boundary.

We can describe the interior of a set in a somewhat different manner as follows: The interior of $A$ is the union of all open subsets of $A$ (the reader is asked to show this in Exercise 23 at the end of this chapter). Thus, by Proposition 2.1.3 or directly, $\operatorname{int}(A)$ is open. Hence, $\operatorname{int}(A)$ is the largest open subset of $A$. Therefore, if there are no nonempty open subsets of $A$, then $\operatorname{int}(A) = \emptyset$. Also, it is evident that $A$ is open iff $\operatorname{int}(A) = A$.
2.2.2 Example Let $S = \{(x, y) \in \mathbb{R}^2 \mid 0 < x \le 1\}$. Find $\operatorname{int}(S)$.

Solution To determine the interior points, we locate points about which it is possible to draw an $\varepsilon$-disk entirely contained in $S$. From Figure 2.1-6, we see that these are points $(x, y)$ where $0 < x < 1$. Thus, $\operatorname{int}(S) = \{(x, y) \mid 0 < x < 1\}$. •

2.2.3 Example Is it true that $\operatorname{int}(A) \cup \operatorname{int}(B) = \operatorname{int}(A \cup B)$?

Solution No. On the real line, let $A = [0, 1]$ and $B = [1, 2]$. Then $\operatorname{int}(A) = \,]0, 1[$ (why?) and $\operatorname{int}(B) = \,]1, 2[$, so that $\operatorname{int}(A) \cup \operatorname{int}(B) = \,]0, 1[\, \cup \,]1, 2[\, = \,]0, 2[\, \setminus \{1\}$, while $\operatorname{int}(A \cup B) = \operatorname{int}[0, 2] = \,]0, 2[$. •

2.2.4 Example Is it true in a general metric space $(M, d)$ that $\operatorname{int}\{y \in M \mid d(y, x_0) \le r\} = \{y \in M \mid d(y, x_0) < r\}$ where $x_0 \in M$ and $r > 0$ are given?

Solution No. For example, consider any set $M$ with the discrete metric, $x_0 \in M$ and $r = 1$. Then $\{y \in M \mid d(y, x_0) \le 1\} = M$, and so its interior is all of $M$. On the other hand, $\{y \mid d(y, x_0) < 1\} = \{x_0\}$, which is not $M$ if $M$ has more than one point. •
Exercises for §2.2

1. Let $S = \{(x, y) \in \mathbb{R}^2 \mid xy \ge 1\}$. Find $\operatorname{int}(S)$.
2.
Le~.S = {(x,y, z) E JR3 I 0
0.
Show that
$D(x_0, r) \subset \operatorname{int}\{y \in M \mid d(y, x_0) \le r\}$.

§2.3 Closed Sets

2.3.1 Definition A set $B$ in a metric space $M$ is said to be closed if its complement (that is, the set $M \setminus B$) is open.

For example, a single point in $\mathbb{R}^n$ is a closed set. The set in $\mathbb{R}^2$ consisting of the unit disk with its boundary circle is closed. Roughly speaking, a set is closed when it contains its "boundary points" (this intuition will be made precise in §2.6). See Figure 2.3-1. Our intuition can be hard to interpret for complicated sets, and so the technical definition 2.3.1 is then used.

FIGURE 2.3-1 Examples of closed sets

It is possible to have a set that is neither open nor closed. For example, in $\mathbb{R}^1$, a half-open interval $]0, 1]$ is neither open nor closed. Thus, even if we know that $A$ is not open, we cannot conclude that it is closed or not closed. The next theorem is analogous to Proposition 2.1.3.
2.3.2 Proposition In a metric space $(M, d)$,

i. The union of a finite number of closed subsets is closed.
ii. The intersection of an arbitrary family of closed subsets is closed.
iii. The whole space $M$ and the empty set $\emptyset$ are closed.

This theorem follows from Proposition 2.1.3 by noting that unions and intersections are interchanged when we take complements (see the Introduction). The proof is left to the reader (Exercise 22 at the end of the chapter), who should also show that in the first assertion, the finite union cannot be replaced by arbitrary unions.
2.3.3 Example Let $S = \{(x, y) \in \mathbb{R}^2 \mid 0 < x \le 1, 0 \le y \le 1\}$. Is $S$ closed?

Solution See Figure 2.3-2. Intuitively, $S$ is not closed, because the portion of its boundary on the $y$-axis is not in $S$. Also, the complement is not open because any $\varepsilon$-disk about a point on the $y$-axis, such as the point $(0, 1/2)$, will intersect $S$ (and hence not be in $\mathbb{R}^2 \setminus S$). •

FIGURE 2.3-2 Is this set closed?

2.3.4 Example Let $S = \{(x, y) \in \mathbb{R}^2 \mid x^2 + y^2 \le 1\}$. Is $S$ closed?

Solution Yes; $S$ is the unit disk, including its boundary. The complement is an open set, because for $(x, y) \in \mathbb{R}^2 \setminus S$, the disk of radius $\varepsilon = \sqrt{x^2 + y^2} - 1$ will be entirely contained in $\mathbb{R}^2 \setminus S$ (Figure 2.3-3). •

FIGURE 2.3-3 The complement of the set $S$ in Example 2.3.4 is open

2.3.5 Example Show that any finite set in $\mathbb{R}^n$ is closed.

Solution Single points are closed, and so the assertion follows from Proposition 2.3.2i. •
2.3.6 Example Let $(M, d)$ be a metric space and $A \subset M$ be a finite set. Let $B = \{x \in M \mid d(x, y) \le 1 \text{ for some } y \in A\}$. Show that $B$ is closed.

Solution We will show that $M \setminus B$ is open. Let $z \in M \setminus B$ so that $d(z, y) > 1$ for all $y \in A$. Let $y_1, \ldots, y_N$ denote the points of $A$, so that $d(z, y_i) > 1$ for $i = 1, \ldots, N$. Let $\varepsilon$ be the minimum of $d(z, y_1) - 1, \ldots, d(z, y_N) - 1$, so that $\varepsilon > 0$. If $d(x, z) < \varepsilon/2$, the triangle inequality gives $d(x, y_i) + d(x, z) \ge d(y_i, z)$, or $d(x, y_i) \ge d(y_i, z) - \varepsilon/2$, which, by construction of $\varepsilon$, is strictly greater than 1. Thus, $D(z, \varepsilon/2) \subset M \setminus B$, and so $M \setminus B$ is open and thus $B$ is closed. •
Exercises for §2.3

1. Let $S = \{(x, y) \in \mathbb{R}^2 \mid x \ge 1 \text{ and } y \ge 1\}$. Is $S$ closed?

2. Let $S = \{(x, y) \in \mathbb{R}^2 \mid x = 0, 0 < y < 1\}$. Is $S$ closed?

3. Redo Example 2.3.5 directly, this time showing that the complement is open.

4. Let $A \subset \mathbb{R}^n$ be arbitrary. Show that $\mathbb{R}^n \setminus (\operatorname{int} A)$ is closed.

5. Let $S = \{x \in \mathbb{R} \mid x \text{ is irrational}\}$. Is $S$ closed?

6. Give an alternative solution of Example 2.3.6 by showing that $B$ is a union of finitely many closed sets.
§2.4 Accumulation Points

Another useful way to determine whether or not a set is closed is based on the concept of an accumulation point.

2.4.1 Definition A point $x$ in a metric space $M$ is called an accumulation point of a set $A \subset M$ if every open set $U$ containing $x$ contains some point of $A$ other than $x$.

In other words, an accumulation point of a set $A$ is a point such that other points of $A$ are arbitrarily close by. Accumulation points are also referred to as cluster points. Using Proposition 2.1.2, the statement that $x$ is an accumulation point of $A$ is equivalent to the statement that for every $\varepsilon > 0$, $D(x, \varepsilon)$ contains some point $y$ of $A$ with $y \ne x$. For example, in $\mathbb{R}^1$, a set consisting of a single point has no accumulation points and the open interval $]0, 1[$ has all points of $[0, 1]$ as accumulation points. Note that an accumulation point of a set need not lie in that set. The definitions of accumulation point and closed set are closely related, as shown by the next theorem.

2.4.2 Theorem A set $A \subset M$ is closed iff the accumulation points of $A$ belong to $A$.

A set need not have any accumulation points (a single point and the set of integers in $\mathbb{R}^1$ are examples), in which case Theorem 2.4.2 still applies and we can conclude that the set is closed. Theorem 2.4.2 is intuitively clear because the property of being closed means, roughly speaking, that a set contains all points on its "boundary," and such points are accumulation points. This sort of rough argument has a pitfall, and one has to be more careful, since some sets are sufficiently complicated that our intuition may fail us. For example, consider $A = \{1/n \in \mathbb{R} \mid n = 1, 2, 3, \ldots\} \cup \{0\}$. This is a closed set (verify!) and its only accumulation point is $0$, which lies in $A$. But our intuition about "boundary" is not very clear for this set; hence, the need for more careful arguments.
2.4.3 Example Let $S = \{x \in \mathbb{R} \mid x \in [0, 1] \text{ and } x \text{ is rational}\}$. Find the accumulation points of $S$.

Solution The set of accumulation points consists of all points in $[0, 1]$. Indeed, let $y \in [0, 1]$ and $D(y, \varepsilon) = \,]y - \varepsilon, y + \varepsilon[$ be a neighborhood of $y$. We
can find rational points in [O, l] arbitrarily close to y (other than y) and in particular in D(y, e). Hence y is an accumulation point. Any point y . 0, there is a y E A such that d(x,y) < €. Thus, again, a = 0. Conversely, if o = 0 and x (/. A, then for any c > 0 there is a y E A such that d(x,y) < € (by a property of inf), so that xis an accumulation point of A and thus x E cl(A). •
Solution First, let x E cl(A) and
Exercises for §2.5

1. Find the closure of $A = \{(x, y) \in \mathbb{R}^2 \mid x > y^2\}$.

2. Find the closure of $\{1/n \mid n = 1, 2, 3, \ldots\}$ in $\mathbb{R}$.

3. Let $A = \{(x, y) \in \mathbb{R}^2 \mid x \text{ is rational}\}$. Find $\operatorname{cl}(A)$.

4.
a. For $A \subset \mathbb{R}^n$, show that $\operatorname{cl}(A) \setminus A$ consists entirely of accumulation points of $A$.
b. Need it be all of them?

5. In a general metric space $M$, let $A \subset D(x, r)$ for some $x \in M$ and $r > 0$. Show that $\operatorname{cl}(A) \subset B(x, r) = \{y \in M \mid d(x, y) \le r\}$.
§2.6 Boundary of a Set

If we consider the unit disk in $\mathbb{R}^2$, we know what we would like to call the boundary; the obvious choice is the unit circle. For more complicated sets, such as the rationals, it is not as clear what the boundary should be. Therefore a precise definition is needed.

2.6.1 Definition For a given set $A$ in a metric space $(M, d)$, the boundary is defined to be the set $\operatorname{bd}(A) = \operatorname{cl}(A) \cap \operatorname{cl}(M \setminus A)$.

Sometimes the notation $\partial A = \operatorname{bd}(A)$ is used. Since the intersection of two closed sets is closed, $\operatorname{bd}(A)$ is a closed set. Also, note that $\operatorname{bd}(A) = \operatorname{bd}(M \setminus A)$. From Proposition 2.5.2, we deduce that the boundary is also described as follows:

2.6.2 Proposition Let $A \subset M$. Then $x \in \operatorname{bd}(A)$ iff for every $\varepsilon > 0$, $D(x, \varepsilon)$ contains points of $A$ and of $M \setminus A$ (these points might include $x$ itself). See Figure 2.6-1.

FIGURE 2.6-1 Disks about boundary points contain points in the set and points not in the set

The original definition states that $\operatorname{bd}(A)$ is the border between $A$ and $M \setminus A$. This is also what Proposition 2.6.2 is asserting, and therefore it should be intuitively clear.
2.6.3 Example Let $A = \{x \in \mathbb{R} \mid x \in [0, 1] \text{ and } x \text{ is rational}\}$. Find $\operatorname{bd}(A)$.

Solution $\operatorname{bd}(A) = [0, 1]$, since, for any $\varepsilon > 0$ and $x \in [0, 1]$, $D(x, \varepsilon) = \,]x - \varepsilon, x + \varepsilon[$ contains both rational and irrational points. The reader should also verify that $\operatorname{bd}(A) = [0, 1]$ using the original definition of $\operatorname{bd}(A)$. This example shows that if $A \subset B$, it does not necessarily follow that $\operatorname{bd}(A) \subset \operatorname{bd}(B)$. (Let $A$ be as in this example and $B = [0, 1]$ in $\mathbb{R}$.) •

2.6.4 Example If $x \in \operatorname{bd}(A)$, must $x$ be an accumulation point?

Solution No. For example, let $A = \{0\} \subset \mathbb{R}$. Then $A$ has no accumulation points, but $\operatorname{bd}(A) = \{0\}$. •

2.6.5 Example Let $S = \{(x, y) \in \mathbb{R}^2 \mid x^2 - y^2 > 1\}$. Find $\operatorname{bd}(S)$.

Solution The set $S$ is sketched in Figure 2.6-2. Clearly, $\operatorname{bd}(S)$ consists of the hyperbola $x^2 - y^2 = 1$. •

FIGURE 2.6-2 The set $S$ in Example 2.6.5
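Exercises for §2.6

1. Find $\operatorname{bd}(A)$ where $A = \{1/n \in \mathbb{R} \mid n = 1, 2, 3, \ldots\}$.

2. If $x \in \operatorname{cl}(A) \setminus A$, then show that $x \in \operatorname{bd}(A)$. Is the converse true?

3. Find $\operatorname{bd}(A)$ where $A = \{(x, y) \in \mathbb{R}^2 \mid x \le y\}$.

4. Is it always true that $\operatorname{bd}(A) = \operatorname{bd}(\operatorname{int} A)$?

5. Let $A \subset \mathbb{R}$ be bounded and nonempty and let $x = \sup(A)$. Is $x \in \operatorname{bd}(A)$?

6. Prove that the boundary of a set in $\mathbb{R}^2$ with the standard metric is the same as it would be with the taxicab metric.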
§2.7 Sequences

The definition of convergence of a sequence in $\mathbb{R}^n$ is similar to that of convergence of a sequence of real numbers. In fact, the definition is meaningful in a metric space.

2.7.1 Definition Let $(M, d)$ be a metric space and $x_k$ a sequence of points in $M$. We say that $x_k$ converges to a point $x \in M$, written $\lim_{k \to \infty} x_k = x$ or $x_k \to x$ as $k \to \infty$, provided that for every open set $U$ containing $x$, there is an integer $N$ such that $x_k \in U$ whenever $k \ge N$. See Figure 2.7-1.

This definition coincides with the usual $\varepsilon$-$N$ definition, as the next theorem shows.

2.7.2 Proposition A sequence $x_k$ in $M$ converges to $x \in M$ iff for every $\varepsilon > 0$ there is an $N$ such that $k \ge N$ implies $d(x, x_k) < \varepsilon$.

Let us specialize our discussion of convergence to $\mathbb{R}^n$. In this case, Definition 2.7.1 reads: A sequence of points $v_k$ in $\mathbb{R}^n$ converges to $v$ if for every $\varepsilon > 0$ there is an $N$ such that $\|v_k - v\| < \varepsilon$ whenever $k \ge N$. Indeed, this is so just because $d(v, v_k) = \|v - v_k\|$. Also note that for $n = 1$, we recover the definition of convergence on $\mathbb{R}$. The definition is essentially the same for convergence in any other normed space: If $V$ is a normed space and $v_k$ is a sequence in $V$, then $v_k$ converges to $v \in V$ if $\|v_k - v\| \to 0$ as $k \to \infty$. In $\mathbb{R}^2$, the situation appears as in Figure 2.7-1.

FIGURE 2.7-1 Convergence of a sequence in $\mathbb{R}^2$

Much of what we did in one dimension still makes sense in $\mathbb{R}^n$, but some does not. Most noticeably, there is no natural order imposed on $\mathbb{R}^n$, and so the discussion of monotone sequences does not apply. Much of the rest goes through by replacing absolute values by norms or distances throughout. For example, since normed spaces are vector spaces we still know how to add and multiply by numbers and so we can examine the arithmetic of sequences.
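2.7.3 Proposition Suppose $v_k$ and $w_k$ are sequences of vectors in a normed space (such as $\mathbb{R}^n$), $\lambda_k$ is a sequence of numbers in $\mathbb{R}$, $\lambda \in \mathbb{R}$ is a constant number, and $u$ is a constant vector. If $v_k \to v$, $w_k \to w$, and $\lambda_k \to \lambda$, then

i. $v_k + w_k \to v + w$.
ii. $\lambda v_k \to \lambda v$.
iii. $\lambda_k u \to \lambda u$.
iv. $\lambda_k v_k \to \lambda v$.
v. If $\lambda_k \ne 0$ and $\lambda \ne 0$, then $\dfrac{1}{\lambda_k} v_k \to \dfrac{1}{\lambda} v$.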
For some computations, we use the specific representation of vectors in $\mathbb{R}^n$ in terms of a finite number of coordinates. If $v$ and $v_k$ are in $\mathbb{R}^n$, then we write $v = (v^1, \ldots, v^n)$ and $v_k = (v_k^1, v_k^2, \ldots, v_k^n)$, where each $v^i$ and each $v_k^i$ lie in $\mathbb{R}$.

2.7.4 Proposition $v_k \to v$ in $\mathbb{R}^n$ if and only if each sequence of coordinates converges to the corresponding coordinate of $v$ as a sequence in $\mathbb{R}$. That is, $\lim_{k \to \infty} v_k = v$ in $\mathbb{R}^n$ iff $\lim_{k \to \infty} v_k^i = v^i$ in $\mathbb{R}$ for each $i = 1, 2, \ldots, n$.

This can be written a bit more compactly as

$$\lim_{k \to \infty} (v_k^1, \ldots, v_k^n) = \left(\lim_{k \to \infty} v_k^1, \ldots, \lim_{k \to \infty} v_k^n\right).$$

For the case of $\mathbb{R}^2$, Proposition 2.7.4 should be clear from Figure 2.7-1. Note that this proposition does not make sense in the general metric space setting.
2.7.5 Example Show that the sequence of vectors $v_k = (1/k, 1/k^2)$ converges to $(0, 0)$ in $\mathbb{R}^2$ as $k \to \infty$.

Solution The component sequences $1/k$ and $1/k^2$ each converge to $0$. By Proposition 2.7.4, the vectors $(1/k, 1/k^2)$ converge to $(0, 0)$ in $\mathbb{R}^2$. •

We can use sequences to determine whether a set is closed. The method is as follows:
2.7.6 Proposition

i. A set $A \subset M$ is closed iff for every sequence $x_k \in A$ that converges in $M$, the limit lies in $A$.
ii. For a set $B \subset M$, $x \in \operatorname{cl}(B)$ iff there is a sequence $x_k \in B$ with $x_k \to x$.

One should note that the sequences in i and ii are allowed to be trivial; that is, $x_k = x$ for all $k$ is allowed.
2.7.7 Example Let $x_n \in \mathbb{R}^m$ be a convergent sequence with $\|x_n\| \le 1$ for all $n$. Show that the limit $x$ also satisfies $\|x\| \le 1$. If $\|x_n\| < 1$, then must we have $\|x\| < 1$?
I IIYII $ 1} is closed. Hence, by Proposition 2.7.6i, Xn E B and Xn --+ x implies x E B. This is not true if $ is replaced by 0, proving that A is open.
iii. The empty set $\emptyset$ and the whole space $M$ are open directly from the definition. •
2.4.2 Theorem A set $A \subset M$ is closed iff the accumulation points of $A$ belong to $A$.

Proof First, suppose $A$ is closed. Then $M \setminus A$ is open. Thus if $x \in M \setminus A$, there is an $\varepsilon > 0$ such that $D(x, \varepsilon) \subset M \setminus A$; i.e., $D(x, \varepsilon) \cap A = \emptyset$. Thus $x$ is not an accumulation point, and so $A$ contains all its accumulation points.

Conversely, suppose $A$ contains all its accumulation points. Let $x \in M \setminus A$. Since $x$ is not an accumulation point and $x \notin A$, there is an $\varepsilon > 0$ such that $D(x, \varepsilon) \cap A = \emptyset$; i.e., $D(x, \varepsilon) \subset M \setminus A$. Hence $M \setminus A$ is open, and so $A$ is closed. •
2.5.2 Proposition For $A \subset M$, $\operatorname{cl}(A)$ consists of $A$ plus the accumulation points of $A$. That is, $\operatorname{cl}(A) = A \cup \{\text{accumulation points of } A\}$.

Proof Let $B = A \cup \{x \mid x \text{ is an accumulation point of } A\}$. By 2.4.2, any closed set containing $A$ also contains $B$. If $B$ is closed, then it will therefore be the smallest closed set containing $A$, so that $B = \operatorname{cl}(A)$. To show that $B$ is closed, we use 2.4.2; let $y$ be an accumulation point of $B$. If $\varepsilon > 0$, then $D(y, \varepsilon)$ contains other points of $B$. If $z$ is such a point, then either $z \in A$ or $z$ is an accumulation point of $A$. In the latter case, $D(z, \varepsilon - d(z, y))$ is an open set containing $z$, and so, by 2.4.1, it contains some other points of $A$, necessarily distinct from $y$. Thus $y$ is an accumulation point of $A$, and so $y \in B$. Thus $B$ is closed. •
2.6.2 Proposition Let $A \subset M$. Then $x \in \operatorname{bd}(A)$ iff for every $\varepsilon > 0$, $D(x, \varepsilon)$ contains points of $A$ and of $M \setminus A$ (these points might include $x$ itself).

Proof Let $x \in \operatorname{bd}(A) = \operatorname{cl}(A) \cap \operatorname{cl}(M \setminus A)$. Either $x \in A$ or $x \in M \setminus A$. If $x \in A$, then, by Proposition 2.5.2, $x$ is an accumulation point of $M \setminus A$, and the conclusion follows. The case $x \in M \setminus A$ and the converse are similar. •

2.7.2 Proposition A sequence $x_k$ in $M$ converges to $x \in M$ iff for every $\varepsilon > 0$ there is an $N$ such that $k \ge N$ implies $d(x, x_k) < \varepsilon$.

Proof Suppose $x_k \to x$ and $\varepsilon > 0$. Since $D(x, \varepsilon)$ is open, there is an $N$ such that $k \ge N$ implies $x_k \in D(x, \varepsilon)$, or $d(x, x_k) < \varepsilon$, as required. Conversely, suppose the condition holds and $U$ is a neighborhood of $x$. Find $\varepsilon > 0$ such that $D(x, \varepsilon) \subset U$. Then there is an $N$ such that $k \ge N$ implies $d(x, x_k) < \varepsilon$, that is, $x_k \in D(x, \varepsilon) \subset U$, and so $x_k \to x$, by definition. •
2.7.3 Proposition Suppose $v_k$ and $w_k$ are sequences of vectors in a normed space (such as $\mathbb{R}^n$), $\lambda_k$ is a sequence of numbers in $\mathbb{R}$, $\lambda \in \mathbb{R}$ is a constant number, and $u$ is a constant vector. If $v_k \to v$, $w_k \to w$, and $\lambda_k \to \lambda$, then

i. $v_k + w_k \to v + w$.
ii. $\lambda v_k \to \lambda v$.
iii. $\lambda_k u \to \lambda u$.
iv. $\lambda_k v_k \to \lambda v$.
v. If $\lambda_k \ne 0$ and $\lambda \ne 0$, then $\dfrac{1}{\lambda_k} v_k \to \dfrac{1}{\lambda} v$.

Proof In each case the proof parallels that of 1.2.7, and so we leave the adaptation to the reader. To save effort, note that, as in 1.2.7, iv implies each of ii and iii, and so only proofs of i, iv, and v need be shown. •
2.7.4 Proposition $v_k \to v$ in $\mathbb{R}^n$ if and only if each sequence of coordinates converges to the corresponding coordinate of $v$ as a sequence in $\mathbb{R}$. That is, $\lim_{k \to \infty} v_k = v$ in $\mathbb{R}^n$ iff $\lim_{k \to \infty} v_k^i = v^i$ in $\mathbb{R}$ for each $i = 1, 2, \ldots, n$.

Proof If $\delta(v, v_k) = \max\{|v^1 - v_k^1|, \ldots, |v^n - v_k^n|\}$, then $\delta(v, v_k) \le \|v - v_k\| \le \sqrt{n}\,\delta(v, v_k)$ by 1.6.7. Suppose $v_k \to v$ in $\mathbb{R}^n$ and $1 \le j \le n$. Letting $\varepsilon > 0$, there is an integer $K$ such that $\|v - v_k\| < \varepsilon$ whenever $k \ge K$. Therefore, for such $k$, we have $|v^j - v_k^j| \le \delta(v, v_k) \le \|v - v_k\| < \varepsilon$, and so $v_k^j \to v^j$ in $\mathbb{R}$.

The proof of the converse shows where the finiteness of $n$ is vital. Suppose that for each $j = 1, 2, \ldots, n$, we have $\lim_{k \to \infty} v_k^j = v^j$. Let $\varepsilon > 0$. For any $\varepsilon_0 > 0$, there are integers $K_1, K_2, K_3, \ldots, K_n$ such that

$|v^1 - v_k^1| < \varepsilon_0$ whenever $k \ge K_1$,
$|v^2 - v_k^2| < \varepsilon_0$ whenever $k \ge K_2$,
$\vdots$
$|v^n - v_k^n| < \varepsilon_0$ whenever $k \ge K_n$.

There are only finitely many such $K_i$, and so one of them is largest. Let $K = \max(K_1, K_2, \ldots, K_n)$. For $k \ge K$, 1.6.7 again gives $\|v - v_k\| \le \sqrt{n}\,\delta(v, v_k) < \sqrt{n}\,\varepsilon_0$. If this is done with $\varepsilon_0 = \varepsilon/\sqrt{n}$, we obtain $\|v - v_k\| < \varepsilon$ whenever $k \ge K$, and so $v_k \to v$ in $\mathbb{R}^n$. •
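The two-sided estimate $\delta(v, w) \le \|v - w\| \le \sqrt{n}\,\delta(v, w)$ that drives the proof of Proposition 2.7.4 can be spot-checked numerically. The sketch below is only illustrative: the function names, the sampling range, and the tolerance for rounding are assumptions, and random trials are no substitute for the reference to 1.6.7.

```python
import math
import random

def delta(v, w):
    """Largest coordinate difference max_i |v^i - w^i|."""
    return max(abs(a - b) for a, b in zip(v, w))

def norm_diff(v, w):
    """Euclidean norm of v - w."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(v, w)))

def comparison_holds(v, w, tol=1e-12):
    """Check delta <= ||v - w|| <= sqrt(n) * delta, up to a rounding tolerance."""
    n = len(v)
    dl, nm = delta(v, w), norm_diff(v, w)
    return dl <= nm + tol and nm <= math.sqrt(n) * dl + tol

def random_check(n=4, trials=1000, seed=0):
    """Test the comparison on random pairs of points in R^n."""
    rng = random.Random(seed)
    for _ in range(trials):
        v = [rng.uniform(-5, 5) for _ in range(n)]
        w = [rng.uniform(-5, 5) for _ in range(n)]
        if not comparison_holds(v, w):
            return False
    return True
```

For $v = (0, 0, 0)$ and $w = (1, 1, 1)$ the upper bound is attained: $\delta = 1$ and $\|v - w\| = \sqrt{3}$.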
2.7.6 Proposition

i. A set $A \subset M$ is closed iff for every sequence $x_k \in A$ that converges in $M$, the limit lies in $A$.
ii. For a set $B \subset M$, $x \in \operatorname{cl}(B)$ iff there is a sequence $x_k \in B$ with $x_k \to x$.

Proof
i. First, suppose $A$ is closed. Assume $x_k \to x$ and $x \notin A$. Then $x$ is an accumulation point of $A$, for any neighborhood of $x$ contains $x_k \in A$ for $k$ large. Hence, $x \in A$, by Theorem 2.4.2. Conversely, we shall use Theorem 2.4.2 to show that $A$ is closed. Let $x$ be an accumulation point of $A$, and choose $x_k \in D(x, 1/k) \cap A$. Then $x_k \to x$, since for any $\varepsilon > 0$, we can choose $N \ge 1/\varepsilon$; then $k \ge N$ implies $x_k \in D(x, \varepsilon)$; see Figure 2.P-1. Hence, by hypothesis, $x \in A$, and so $A$ is closed.
ii. The argument here is similar and we shall leave it as an exercise. •

FIGURE 2.P-1 Accumulation points of a set
2.8.3 Proposition A convergent sequence in a normed or metric space is bounded.
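Proof We proceed as in 1.2.6. If $x_n \to x$, there is an $N$ such that $d(x_n, x) < 1$ if $n \ge N$. Thus $x_n \in D(x, 1)$ if $n \ge N$. Let $R = \max\{1, d(x_1, x), \ldots, d(x_{N-1}, x)\}$; then $d(x_n, x) \le R$ for all $n$, and so the sequence is bounded. •
Proof (that a sequence in $\mathbb{R}^n$ converges iff it is a Cauchy sequence) Suppose $x_k \to x$. Given $\varepsilon > 0$, choose $N$ so that $k \ge N$ implies $\|x_k - x\| < \varepsilon/2$. Then, for $k, l \ge N$, $\|x_k - x_l\| = \|(x_k - x) + (x - x_l)\| \le \|x_k - x\| + \|x - x_l\| < \varepsilon/2 + \varepsilon/2 = \varepsilon$ by the triangle inequality. Thus, $x_k$ is a Cauchy sequence. Conversely, suppose $x_k$ is a Cauchy sequence. Since $|x_k^i - x_l^i| \le \|x_k - x_l\|$, the components are also Cauchy sequences on the real line. By completeness of $\mathbb{R}$ and Theorem 1.4.4, $x_k^i$ converges to, say, $x^i$. By Proposition 2.7.4, $x_k$ converges to $x = (x^1, \ldots, x^n)$.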
•
2.8.7 Proposition If $x_k$ is a sequence in a metric space $M$ and $x \in M$, then
i. $x$ is a cluster point iff for every $\varepsilon > 0$ and for each integer $N$, there is a $k > N$ with $d(x_k, x) < \varepsilon$.
ii. $x$ is a cluster point iff there is a subsequence convergent to $x$.
iii. $x_k \to x$ if and only if every subsequence converges to $x$.
iv. $x_k \to x$ iff every subsequence of $x_k$ has a further subsequence that converges to $x$.
Proof This proof is analogous to the proof of 1.5.2, and we leave the adaptation to the reader.
•
2.9.2 Theorem Let $V$ be a complete normed space (such as $\mathbb{R}^n$). A series $\sum x_k$ in $V$ converges iff for every $\varepsilon > 0$, there is an $N$ such that $k \ge N$ implies $\|x_k + x_{k+1} + \cdots + x_{k+p}\| < \varepsilon$ for all integers $p = 0, 1, 2, \ldots$.
Proof Let $s_k = \sum_{i=1}^{k} x_i$. By completeness, $\sum x_k$ converges iff $s_k$ is a Cauchy sequence. This is true iff for every $\varepsilon > 0$ there is an $N$ such that $l \ge N$ implies $\|s_l - s_{l+q}\| < \varepsilon$ for all $q = 1, 2, \ldots$. But $\|s_{l+q} - s_l\| = \|x_{l+1} + \cdots + x_{l+q}\|$, and so the result follows with $k = l + 1$ and $p = q - 1 = 0, 1, 2, \ldots$. •
2.9.3 Theorem In a complete normed space, if $\sum x_k$ converges absolutely, then $\sum x_k$ converges.
Proof This follows from Theorem 2.9.2 and the triangle inequality:
$\|x_k + x_{k+1} + \cdots + x_{k+p}\| \le \|x_k\| + \|x_{k+1}\| + \cdots + \|x_{k+p}\|$. •
2.9.4 Theorem
i. Geometric series: If $|r| < 1$, then $\sum_{n=0}^{\infty} r^n$ converges to $1/(1 - r)$, and it diverges (does not converge) if $|r| \ge 1$.
ii. Comparison test: If $\sum_{k=1}^{\infty} a_k$ converges, $a_k \ge 0$, and $0 \le b_k \le a_k$, then $\sum_{k=1}^{\infty} b_k$ converges; if $\sum_{k=1}^{\infty} c_k$ diverges, $c_k \ge 0$, and $0 \le c_k \le d_k$, then $\sum_{k=1}^{\infty} d_k$ diverges.
iii. p-series test: $\sum_{n=1}^{\infty} 1/n^p$ converges if $p > 1$ and diverges to $\infty$ (that is, the partial sums increase without bound) if $p \le 1$.
iv. Ratio test: Suppose that $\lim_{n\to\infty} |a_{n+1}/a_n|$ exists and is strictly less than 1. Then $\sum_{n=1}^{\infty} a_n$ converges absolutely. If the limit is strictly greater than 1, then the series diverges. If the limit equals 1, then the test is inconclusive.
v. Root test: Suppose that $\lim_{n\to\infty} |a_n|^{1/n}$ exists and is strictly less than 1. Then $\sum_{n=1}^{\infty} a_n$ converges absolutely. If the limit is strictly greater than 1, then the series diverges; if the limit equals 1, then the test is inconclusive.
vi. Integral test: If $f$ is continuous, nonnegative, and monotone decreasing on $[1, +\infty[$, then $\sum_{n=1}^{\infty} f(n)$ and $\int_1^{\infty} f(x)\,dx$ converge or diverge together.
vii. Ratio comparison test: Let $\sum_{i=1}^{\infty} a_i$ and $\sum_{i=1}^{\infty} b_i$ be series, with $b_i > 0$ for all $i$. If (a) $|a_i| \le b_i$ for all $i$, or if $\lim_{i\to\infty} |a_i|/b_i < \infty$, and if (b) $\sum_{i=1}^{\infty} b_i$ is convergent, then $\sum_{i=1}^{\infty} a_i$ is convergent. If (c) $a_i \ge b_i$ for all $i$, or $\lim_{i\to\infty} a_i/b_i > 0$, and if (d) $\sum_{i=1}^{\infty} b_i$ is divergent, then $\sum_{i=1}^{\infty} a_i$ is divergent.
viii. Alternating series: If $\sum_{i=1}^{\infty} a_i$ is such that the $a_i$ alternate in sign, are decreasing in absolute value, and $a_i \to 0$ as $i \to \infty$, then the series converges. The error in approximating the sum by $S_n = \sum_{i=1}^{n} a_i$ is no greater than $|a_{n+1}|$.
Proof
i. We shall use the following:
Power Lemma
$\lim_{n\to\infty} r^n = \infty$ if $r > 1$; $\lim_{n\to\infty} r^n = 1$ if $r = 1$; $\lim_{n\to\infty} r^n = 0$ if $0 \le r < 1$.
Proof First consider the case $r > 1$. We write $r$ as $1 + s$, where $s > 0$. If we expand $r^n = (1 + s)^n$, we get $r^n = 1 + ns + (\text{other positive terms})$. Therefore, $r^n \ge 1 + ns$, which goes to $\infty$ as $n \to \infty$. Second, if $r = 1$, then $r^n = 1$ for all $n$, and so $\lim_{n\to\infty} r^n = 1$. Finally, if $0 \le r < 1$, then, excluding the easy case $r = 0$, we let $p = 1/r$, so that $p > 1$, and hence $\lim_{n\to\infty} p^n = \infty$. Therefore, $\lim_{n\to\infty} r^n = \lim_{n\to\infty} 1/p^n = 0$.
By elementary algebra,
$1 + r + r^2 + \cdots + r^n = \dfrac{1 - r^{n+1}}{1 - r}$
if $r \ne 1$. By the power lemma, $r^{n+1} \to 0$ as $n \to \infty$ if $|r| < 1$, and $|r|^n \to \infty$ if $|r| > 1$. Thus we have convergence if $|r| < 1$ and divergence if $|r| > 1$. Obviously, $\sum_{n=0}^{\infty} r^n$ diverges if $|r| = 1$, since $r^n \not\to 0$.
ii. The partial sums of the series $\sum_{k=1}^{\infty} a_k$ form a Cauchy sequence, and thus the partial sums of the series $\sum_{k=1}^{\infty} b_k$ also form a Cauchy sequence, since for any $k$ and $p$ we have $b_k + b_{k+1} + \cdots + b_{k+p} \le a_k + a_{k+1} + \cdots + a_{k+p}$. Hence $\sum_{k=1}^{\infty} b_k$ converges. A positive series can diverge only to $+\infty$, and so, given $M > 0$, we can find $k_0$ such that $k \ge k_0$ implies that $c_1 + c_2 + \cdots + c_k \ge M$. Therefore, for $k \ge k_0$, $d_1 + d_2 + \cdots + d_k \ge M$, so that $\sum_{k=1}^{\infty} d_k$ also diverges to $\infty$.
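iii. First suppose that $p \le 1$; in this case $1/n^p \ge 1/n$ for all $n = 1, 2, \ldots$. Therefore, by ii, $\sum_{n=1}^{\infty} (1/n^p)$ will diverge if the harmonic series $\sum_{n=1}^{\infty} (1/n)$ diverges. But we proved this in 1.2.20. Now suppose that $p > 1$. If we let
$s_k = \dfrac{1}{1^p} + \dfrac{1}{2^p} + \dfrac{1}{3^p} + \cdots + \dfrac{1}{k^p}$,
then $s_k$ is an increasing sequence of positive real numbers. On the other hand,
$s_{2^k - 1} = \dfrac{1}{1^p} + \left(\dfrac{1}{2^p} + \dfrac{1}{3^p}\right) + \left(\dfrac{1}{4^p} + \dfrac{1}{5^p} + \dfrac{1}{6^p} + \dfrac{1}{7^p}\right) + \cdots + \left(\dfrac{1}{(2^{k-1})^p} + \cdots + \dfrac{1}{(2^k - 1)^p}\right) \le \dfrac{1}{1^p} + \dfrac{2}{2^p} + \dfrac{4}{4^p} + \cdots + \dfrac{2^{k-1}}{(2^{k-1})^p} = \sum_{j=0}^{k-1} \left(\dfrac{1}{2^{p-1}}\right)^j < \dfrac{1}{1 - 1/2^{p-1}}$.
Since $s_k \le s_{2^k - 1}$, the increasing sequence $s_k$ is bounded above, and so the series converges.
iv. Suppose that $\lim_{n\to\infty} |a_{n+1}/a_n| = r < 1$. Choose $r'$ such that $r < r' < 1$ and $N$ such that $n \ge N$ implies that $|a_{n+1}/a_n| < r'$. Then $|a_{N+p}| < (r')^p |a_N|$ for $p = 1, 2, \ldots$, and since $\sum_p (r')^p |a_N|$ is a convergent geometric series, $\sum_{k=1}^{\infty} a_k$ converges absolutely by i and ii. If $\lim_{n\to\infty} |a_{n+1}/a_n| = r > 1$, choose $r'$ such that $1 < r' < r$ and $N$ such that $n \ge N$ implies that $|a_{n+1}/a_n| > r'$. Hence $|a_{N+p}| > (r')^p |a_N|$, and so $\lim_{n\to\infty} |a_n| = \infty$, whereas the limit would have to be zero if the sum converged. Thus $\sum_{k=1}^{\infty} a_k$ diverges. To see that the test fails if $\lim_{n\to\infty} |a_{n+1}/a_n| = 1$, consider the series $1 + 1 + 1 + \cdots$, and $\sum_{n=1}^{\infty} 1/n^p$ for $p > 1$. In both cases, $\lim_{n\to\infty} |a_{n+1}/a_n| = 1$, but the first series diverges and the second converges.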
v. Suppose that $\lim_{n\to\infty} (|a_n|)^{1/n} = r < 1$. Choose $r'$ such that $r < r' < 1$ and $N$ such that $n \ge N$ implies that $|a_n|^{1/n} < r'$; in other words, $|a_n| < (r')^n$. The series $|a_1| + |a_2| + \cdots + |a_{N-1}| + (r')^N + (r')^{N+1} + \cdots$ converges to $|a_1| + |a_2| + \cdots + |a_{N-1}| + (r')^N/(1 - r')$, and so, by ii, $\sum_{k=1}^{\infty} a_k$ converges. If $\lim_{n\to\infty} (|a_n|)^{1/n} = r > 1$, choose $1 < r' < r$ and $N$ such that $n \ge N$ implies that $|a_n|^{1/n} > r'$, or, in other words, $|a_n| > (r')^n$. Hence, $\lim_{n\to\infty} |a_n| = \infty$, and therefore, $\sum_{k=1}^{\infty} a_k$ diverges. To show that the test fails when $\lim_{n\to\infty} (|a_n|)^{1/n} = 1$, observe that, by elementary calculus,
$\lim_{n\to\infty} \left(\dfrac{1}{n}\right)^{1/n} = 1$ and $\lim_{n\to\infty} \left(\dfrac{1}{n^2}\right)^{1/n} = 1$
(take logarithms and note that $(\log x)/x \to 0$ as $x \to \infty$). But $\sum_{n=1}^{\infty} 1/n$ diverges and $\sum_{n=1}^{\infty} 1/n^2$ converges.
vi. For this part, we accept some elementary facts about integrals from calculus (see §4.8 for a review). Write $a_n = f(n)$. In Figure 2.P-2a, the rectangles of areas $a_1, a_2, \ldots, a_n$ enclose more area than that under the curve from $x = 1$ to $x = n + 1$. Therefore,
$a_1 + a_2 + \cdots + a_n \ge \int_1^{n+1} f(x)\,dx$.
If we now consider Figure 2.P-2b and take the area from $x = 1$ to $x = n$, we have
$a_2 + a_3 + \cdots + a_n \le \int_1^{n} f(x)\,dx$.
FIGURE 2.P-2 Inequalities needed for the integral test
Adding $a_1$ to both sides gives
$a_1 + a_2 + a_3 + \cdots + a_n \le a_1 + \int_1^{n} f(x)\,dx$.
Combining the two results yields
$\int_1^{n+1} f(x)\,dx \le a_1 + a_2 + \cdots + a_n \le a_1 + \int_1^{n} f(x)\,dx$.
Provided that the integral $\int_1^{\infty} f(x)\,dx$ is finite, the right-hand inequality implies that the series $\sum_{n=1}^{\infty} a_n$ is also finite, by the completeness property of $\mathbb{R}$. But if $\int_1^{\infty} f(x)\,dx$ is infinite, the left-hand inequality shows that the series is also infinite. Hence, the series and the integral converge or diverge together.
vii. For instance, suppose that $\lim_{i\to\infty} |a_i|/b_i = M < \infty$, with all $b_i > 0$. Then for large enough $i$, we have $|a_i|/b_i < M + 1$, i.e., $|a_i| < (M + 1)b_i$. If $\sum b_i$ converges, so does $\sum (M + 1)b_i$, and hence $\sum a_i$ converges, by the comparison test. The other cases are similar.
viii. Let $\sum_{i=1}^{\infty} a_i$ be an alternating series. Assuming $a_1 > 0$, and if we let $b_i = (-1)^{i+1} a_i$, then all the $b_i$ are positive, and our series is $b_1 - b_2 + b_3 - b_4 + b_5 - \cdots$. In addition, we have $b_1 > b_2 > b_3 > \cdots$ and $\lim_{i\to\infty} b_i = 0$. Each even partial sum $S_{2n}$ can be grouped as $(b_1 - b_2) + (b_3 - b_4) + \cdots + (b_{2n-1} - b_{2n})$, which is a sum of positive terms, and so we have $S_2 \le S_4 \le S_6 \le \cdots$. The odd partial sums $S_{2n+1}$ can be grouped as $b_1 - (b_2 - b_3) - (b_4 - b_5) - \cdots - (b_{2n} - b_{2n+1})$, which is a sum of negative terms (except for the first), so that we have $S_1 \ge S_3 \ge S_5 \ge \cdots$. Next, we note that $S_{2n+1} = S_{2n} + b_{2n+1} \ge S_{2n}$. Thus, the even partial sums $S_{2n}$ form an increasing sequence that is bounded above by any member of the decreasing sequence of odd partial sums. By the monotone sequence property, the sequence $S_{2n}$ approaches a limit, $S_{\text{even}}$. Similarly, the decreasing sequence $S_{2n+1}$ approaches a limit, $S_{\text{odd}}$. Thus we have $S_2 \le S_4 \le S_6 \le \cdots \le S_{2n} \le \cdots \le S_{\text{even}} \le S_{\text{odd}} \le \cdots \le S_{2n+1} \le \cdots \le S_3 \le S_1$. Since $S_{2n+1} - S_{2n} = a_{2n+1}$, which approaches zero as $n \to \infty$, the difference $S_{\text{odd}} - S_{\text{even}}$ is no greater than $S_{2n+1} - S_{2n}$ for every $n$, and so it must be zero; i.e., $S_{\text{odd}} = S_{\text{even}}$. Call this common value $S$. Thus
$|S_{2n} - S| \le |S_{2n} - S_{2n+1}| = b_{2n+1} = |a_{2n+1}|$
and
$|S_{2n+1} - S| \le |S_{2n+1} - S_{2n+2}| = b_{2n+2} = |a_{2n+2}|$,
and so each difference $|S_n - S|$ is no greater than $|a_{n+1}|$. Since $a_{n+1} \to 0$, we get $S_n \to S$ as $n \to \infty$. This argument also shows that each tail of an alternating series is no greater than the first term omitted from the partial sum.
•
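The error estimate in part viii can be seen on a concrete series. The sketch below uses the alternating harmonic series $\sum (-1)^{i+1}/i$, whose sum is $\log 2$; the choice of this series and the function names are illustrative assumptions, not taken from the text.

```python
import math


def alternating_partial_sums(n_terms):
    """Partial sums S_1, ..., S_n of the series sum_{i>=1} (-1)^(i+1) / i."""
    sums, total = [], 0.0
    for i in range(1, n_terms + 1):
        total += (-1) ** (i + 1) / i
        sums.append(total)
    return sums


def error_bound_holds(n_terms):
    """Check |S_n - S| <= |a_{n+1}| = 1/(n+1) for n = 1, ..., n_terms, with S = log 2."""
    limit = math.log(2.0)
    tol = 1e-12  # allowance for floating-point rounding
    sums = alternating_partial_sums(n_terms)
    return all(abs(s - limit) <= 1.0 / (n + 1) + tol
               for n, s in enumerate(sums, start=1))
```

The even partial sums increase and the odd partial sums decrease toward the common limit, as in the proof; `alternating_partial_sums(4)` shows this ordering on the first four sums.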
Worked Examples for Chapter 2
Example 2.1 Let $S = \{(x_1, x_2) \in \mathbb{R}^2 \mid |x_1| \le 1, |x_2| < 1\}$. Is $S$ open or closed or neither? What is the interior of $S$?
Solution $S$ is not open, since there is no neighborhood around any point of $S$ with $x_1 = 1$ that is entirely contained in $S$. See Figure 2.E-1. On the other hand, $S$ is not closed, since
$\mathbb{R}^2 \setminus S = \{(x_1, x_2) \in \mathbb{R}^2 \mid |x_1| > 1 \text{ or } |x_2| \ge 1\}$
and no neighborhood around a point of $\mathbb{R}^2 \setminus S$ with $|x_1| \le 1$ and $x_2 = 1$ is contained in $\mathbb{R}^2 \setminus S$.
FIGURE 2.E-1 Determine the topological properties of this set
Alternatively, we see that $S$ is not closed by noting that the sequence $(0, 1 - 1/n)$ converges but the limit point $(0, 1)$ does not lie in $S$ (see Proposition 2.7.6). We assert that $\mathrm{int}(S) = \{(x_1, x_2) \in \mathbb{R}^2 \mid |x_1| < 1, |x_2| < 1\}$. We check this by showing that the members of this set are the interior points of $S$. If $|x_1| < 1$ and $|x_2| < 1$, then the disk with center $(x_1, x_2)$ and radius $r = \min\{1 - |x_1|, 1 - |x_2|\}$ lies in $S$. As we have seen, the other points of $S$ are not interior points. As the student becomes more familiar with this type of argument, some of the details may be omitted. •
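The radius $r = \min\{1 - |x_1|, 1 - |x_2|\}$ used in Example 2.1 can be explored numerically. This is only a sampling sketch, not a proof: it tests finitely many points near the edge of the disk, and the helper names and the shrink factor 0.999 are assumptions made for illustration.

```python
import math


def in_S(p):
    """Membership in S = {(x1, x2) : |x1| <= 1, |x2| < 1}."""
    x1, x2 = p
    return abs(x1) <= 1 and abs(x2) < 1


def interior_radius(p):
    """The radius r = min{1 - |x1|, 1 - |x2|} from the worked example."""
    x1, x2 = p
    return min(1 - abs(x1), 1 - abs(x2))


def disk_inside_S(p, samples=360):
    """Sample points just inside the circle of radius r about p and test them against S."""
    r = interior_radius(p)
    if r <= 0:
        return False  # no positive radius is available at such a point
    for k in range(samples):
        t = 2 * math.pi * k / samples
        q = (p[0] + 0.999 * r * math.cos(t), p[1] + 0.999 * r * math.sin(t))
        if not in_S(q):
            return False
    return True
```

For a point such as $(1, 0)$, which lies in $S$ but not in its interior, `interior_radius` returns 0, matching the observation that every disk about it leaves $S$.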
Example 2.2 Show that if $x$ is an accumulation point of a set $S \subset \mathbb{R}^n$, then every open set containing $x$ contains infinitely many points of $S$.
Solution We use proof by contradiction. Suppose there were an open set $U$ containing $x$ and containing only finitely many points of $S$. Let $x_1, x_2, \ldots, x_m$ be the points of $S$ in $U$ other than $x$. Since $U$ is open, there is an $r > 0$ with $D(x, r) \subset U$. Let $\varepsilon$ be the minimum of $r$ and the numbers $d(x, x_1), d(x, x_2), \ldots, d(x, x_m)$, so that $\varepsilon > 0$. Then $D(x, \varepsilon)$ contains no points of $S$ other than $x$, which contradicts the fact that $x$ is an accumulation point of $S$. The reader should also supply a direct proof of this result. •
Example 2.3 If $x = \sup(S)$ for $S \subset \mathbb{R}$, show that $x \in \mathrm{cl}(S)$.
Solution By Proposition 2.5.2, it suffices to show either that $x \in S$, or that $x$ is an accumulation point of $S$. By Proposition 1.3.2, for any $\varepsilon > 0$ there is a $y \in S$ with $d(x, y) < \varepsilon$. This means that if $x \notin S$, then $x$ is an accumulation point of $S$. •
Example 2.4 Show that a sequence in a metric space can converge to at most one point (limits are unique).
Solution Let $x_k \to x$ and $x_k \to y$. Given $\varepsilon > 0$, choose $N$ such that $k \ge N$ implies $d(x_k, x) < \varepsilon/2$, and $M$ such that $k \ge M$ implies $d(x_k, y) < \varepsilon/2$. If $k \ge N$ and $k \ge M$, then $d(x, y) \le d(x, x_k) + d(x_k, y) < \varepsilon$ (by the triangle inequality). Since $0 \le d(x, y) < \varepsilon$ holds for every $\varepsilon > 0$, $d(x, y) = 0$, and so $x = y$. •
Example 2.5 "Big Oh" and "little oh" notation Write $f = O(g)$ if $g(x) > 0$ for $x \in \mathbb{R}$ sufficiently large and $f(x)/g(x)$ is bounded for $x$ sufficiently large. Write $f = o(g)$ if $f/g$ goes to zero as $x$ goes to $+\infty$. Also write $f \sim g$ (read $f$ is asymptotic to $g$) if $f/g \to 1$ as $x \to \infty$. Prove the following:
1. $x^2 + x = O(x^2)$.
2. $x^2 + x \sim x^2$.
3. $\exp(\sqrt{\log x}) = o(x)$.
Solution We note that if $f$ is asymptotic to $g$, then it will follow automatically that $f = O(g)$ (why?). Thus 1 will follow from 2. But 2 is easy, since we know that $(x^2 + x)/x^2 = 1 + 1/x$ goes to 1 as $x$ goes to infinity. To prove 3, we note that $\exp(\log x) = x$, so that $(\exp\sqrt{\log x})/x = \exp(\sqrt{\log x} - \log x)$. Since $\log x \to \infty$ as $x \to \infty$, $(\sqrt{\log x})/\log x \to 0$ as $x \to \infty$, so that for $x$ large, $\sqrt{\log x} \le (\log x)/2$ and hence, for $x$ large,
$\dfrac{\exp\sqrt{\log x}}{x} \le \dfrac{1}{x}\exp\left(\dfrac{\log x}{2}\right) = \dfrac{1}{\sqrt{x}}$,
which goes to zero as $x \to \infty$.
•
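The bound in part 3 of Example 2.5 is easy to tabulate. A minimal sketch follows; the function names are illustrative, and the threshold $x \ge e^4$ is an observation added here (it is where $\sqrt{\log x} \le (\log x)/2$ begins to hold), not something stated in the solution.

```python
import math


def ratio(x):
    """exp(sqrt(log x)) / x, the quotient that should tend to zero."""
    return math.exp(math.sqrt(math.log(x))) / x


def bound(x):
    """The comparison quantity 1 / sqrt(x) from the solution."""
    return 1.0 / math.sqrt(x)


def bound_holds(xs):
    """Check ratio(x) <= bound(x) for each x in xs (intended for x >= e^4)."""
    return all(ratio(x) <= bound(x) for x in xs)
```

Evaluating `ratio` at $10^4$, $10^8$, and $10^{16}$ shows the quotient shrinking, though far more slowly than $1/x$ itself.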
Example 2.6 Recall that one may define $e^x$ by
$e^x = 1 + x + \dfrac{x^2}{2!} + \dfrac{x^3}{3!} + \cdots$.
(By the ratio test, this series converges for all $x \in \mathbb{R}$. Hence this definition of $e^x$ makes sense.) Show that $e = e^1$ is an irrational number.
Solution Suppose that $e = a/b$ for integers $a$ and $b$. Let $k$ be an integer, $k > b$, and let $\alpha = k!(e - 2 - 1/2! - 1/3! - \cdots - 1/k!)$, so that $\alpha$ is a nonzero integer as well. Since $e = 2 + 1/2! + 1/3! + \cdots$,
$\alpha = \dfrac{1}{k+1} + \dfrac{1}{(k+1)(k+2)} + \cdots < \dfrac{1}{k+1} + \dfrac{1}{(k+1)^2} + \cdots = \dfrac{1}{k}$.
Thus $0 < \alpha < 1/k < 1$, which is impossible for an integer. Hence $e$ is irrational. •
Exercises for Chapter 2
… $\varepsilon > 0$, there is an $N$ such that $m \ge N$ implies $d(x_m, x) \le \varepsilon$ (this differs from Proposition 2.7.2 in that here "$< \varepsilon$" is replaced by "$\le \varepsilon$").
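12. Prove the following properties for subsets $A$ and $B$ of a metric space:
a. $\mathrm{int}(\mathrm{int}(A)) = \mathrm{int}(A)$.
b. $\mathrm{int}(A \cup B) \supset \mathrm{int}(A) \cup \mathrm{int}(B)$.
c. $\mathrm{int}(A \cap B) = \mathrm{int}(A) \cap \mathrm{int}(B)$.
13. Show that $\mathrm{cl}(A) = A \cup \mathrm{bd}(A)$.
14. Prove the following for subsets of a metric space $M$:
a. $\mathrm{cl}(\mathrm{cl}(A)) = \mathrm{cl}(A)$.
b. $\mathrm{cl}(A \cup B) = \mathrm{cl}(A) \cup \mathrm{cl}(B)$.
c. $\mathrm{cl}(A \cap B) \subset \mathrm{cl}(A) \cap \mathrm{cl}(B)$.
15. Prove the following for subsets of a metric space $M$:
a. $\mathrm{bd}(A) = \mathrm{bd}(M \setminus A)$.
b. $\mathrm{bd}(\mathrm{bd}(A)) \subset \mathrm{bd}(A)$.
c. $\mathrm{bd}(A \cup B) \subset \mathrm{bd}(A) \cup \mathrm{bd}(B) \subset \mathrm{bd}(A \cup B) \cup A \cup B$.
d. $\mathrm{bd}(\mathrm{bd}(\mathrm{bd}(A))) = \mathrm{bd}(\mathrm{bd}(A))$.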
16. Let $a_1 = \sqrt{2}$, $a_2 = (\sqrt{2})^{a_1}, \ldots, a_{n+1} = (\sqrt{2})^{a_n}$. Show that $a_n \to 2$ as $n \to \infty$. (You may use any relevant facts from calculus.)
17. If $\sum x_m$ converges absolutely in $\mathbb{R}^n$, show that $\sum x_m \sin m$ converges.
18. If $x, y \in M$ and $x \ne y$, then prove that there exist open sets $U$ and $V$ such that $x \in U$, $y \in V$, and $U \cap V = \emptyset$.
19. Define a limit point of a set $A$ in a metric space $M$ to be a point $x \in M$ such that $U \cap A \ne \emptyset$ for every neighborhood $U$ of $x$.
a. What is the difference between limit points and accumulation points? Give examples.
b. If $x$ is a limit point of $A$, then show that there is a sequence $x_n \in A$ with $x_n \to x$.
c. If $x$ is an accumulation point of $A$, then show that $x$ is a limit point of $A$. Is the converse true?
d. If $x$ is a limit point of $A$ and $x \notin A$, then show that $x$ is an accumulation point.
e. Prove: A set is closed iff it contains all of its limit points.
20. For a set $A$ in a metric space $M$ and $x \in M$, let $d(x, A) = \inf\{d(x, y) \mid y \in A\}$, and for $\varepsilon > 0$, let $D(A, \varepsilon) = \{x \mid d(x, A) < \varepsilon\}$.
a. Show that $D(A, \varepsilon)$ is open.
b. Let $A \subset M$ and $N_\varepsilon = \{x \in M \mid d(x, A) \le \varepsilon\}$, where $\varepsilon > 0$. Show that $N_\varepsilon$ is closed and that $A$ is closed iff $A = \bigcap \{N_\varepsilon \mid \varepsilon > 0\}$.
21. Prove that a sequence $x_k$ in a normed space is a Cauchy sequence iff for every neighborhood $U$ of 0, there is an $N$ such that $k, l \ge N$ implies $x_k - x_l \in U$.
22. Prove Proposition 2.3.2. (Hint: Use Exercise 12 of the Introduction.)
23. Prove that the interior of a set $A \subset M$ is the union of all the subsets of $A$ that are open. Deduce that $A$ is open iff $A = \mathrm{int}(A)$. Also, give a direct proof of the latter statement using the definitions.
24. Identify $\mathbb{R}^{n+m}$ with $\mathbb{R}^n \times \mathbb{R}^m$. Show that $A \subset \mathbb{R}^{n+m}$ is open iff for each $(x, y) \in A$, with $x \in \mathbb{R}^n$, $y \in \mathbb{R}^m$, there exist open sets $U \subset \mathbb{R}^n$, $V \subset \mathbb{R}^m$ with $x \in U$, $y \in V$ such that $U \times V \subset A$. Deduce that the product of open sets is open.
Prove that a set A family of £-disks.
26.
Define the sequence of numbers an by
c
M is open iff we can write A as the union of some
1
I
ao=l,a,=l+--, ... ,an=I+ 1 I +ao +an-I Show that an is a convergent sequence. Find the limit.
27.
Suppose an ?: 0 and an -+ 0 as n -+ oo. Given any c is a subsequence bn of an such that bn < €.
28.
Give examples of:
I::,
> 0, show that there
a.
An infinite set in R. with no accumulation points
b.
A nonempty subset of R. that is contained in its set of accumulation points
Exercises for Chapter 2
147
c.
A subset of JR that has infinitely many accumulation points but contains none of them
d.
A set A such that bd(A) = cl(A).
29. Let $A, B \subset \mathbb{R}^n$ and $x$ be an accumulation point of $A \cup B$. Must $x$ be an accumulation point of either $A$ or $B$?
30. Show that each open set in $\mathbb{R}$ is a union of disjoint open intervals. Is this sort of result true in $\mathbb{R}^n$ for $n > 1$, where we define an open interval as the Cartesian product of $n$ open intervals, $]a_1, b_1[ \times \cdots \times ]a_n, b_n[$?
31. Let $A'$ denote the set of accumulation points of a set $A$. Prove that $A'$ is closed. Is $(A')' = A'$ for all $A$?
32. Let $A \subset \mathbb{R}^n$ be closed and $x_n \in A$ be a Cauchy sequence. Prove that $x_n$ converges to a point in $A$.
33. Let $s_n$ be a bounded sequence of real numbers. Assume that $2s_n \le s_{n-1} + s_{n+1}$. Show that $\lim_{n\to\infty}(s_{n+1} - s_n) = 0$.
34. Let $x_n \in \mathbb{R}^k$ and $d(x_{n+1}, x_n) \le r\,d(x_n, x_{n-1})$, where $0 \le r < 1$. Show that $x_n$ converges.
35. Show that any family of disjoint nonempty open sets of real numbers is countable.
36. Let $A, B \subset \mathbb{R}^n$ be closed sets. Does $A + B = \{x + y \mid x \in A \text{ and } y \in B\}$ have to be closed?
37. For $A \subset M$, a metric space, prove that $\mathrm{bd}(A) = [A \cap \mathrm{cl}(M \setminus A)] \cup [\mathrm{cl}(A) \setminus A]$.
38. Let $x_k \in \mathbb{R}^n$ satisfy $\|x_k - x_l\| \le 1/k + 1/l$. Prove that $x_k$ converges.
39. Let $S \subset \mathbb{R}$ be bounded above and below. Prove that $\sup(S) - \inf(S) = \sup\{x - y \mid x \in S \text{ and } y \in S\}$.
40. Suppose in $\mathbb{R}$ that for all $n$, $a_n \le b_n$, $a_n \le a_{n+1}$, and $b_{n+1} \le b_n$. Prove that $a_n$ converges.
41. Let $A_n$ be subsets of a metric space $M$, $A_{n+1} \subset A_n$, and $A_n \ne \emptyset$, but assume that $\bigcap_{n=1}^{\infty} A_n = \emptyset$. Suppose $x \in \bigcap_{n=1}^{\infty} \mathrm{cl}(A_n)$. Show that $x$ is an accumulation point of $A_1$.
42. Let $A \subset \mathbb{R}^n$ and $x \in \mathbb{R}^n$. Define $d(x, A) = \inf\{d(x, y) \mid y \in A\}$. Must there be a $z \in A$ such that $d(x, A) = d(x, z)$?
43. Let $x_1 = \sqrt{3}, \ldots, x_n = \sqrt{3 + x_{n-1}}$. Compute $\lim_{n\to\infty} x_n$.
44. A set $A \subset \mathbb{R}^n$ is said to be dense in $B \subset \mathbb{R}^n$ if $B \subset \mathrm{cl}(A)$. If $A$ is dense in $\mathbb{R}^n$ and $U$ is open, prove that $A \cap U$ is dense in $U$. Is this true if $U$ is not open?
45. Show that $x \log x = o(~)$ as $x \to \infty$ (see Worked Example 2.5).
46. a. If $f = o(g)$ and if $g(x) \to \infty$ as $x \to \infty$, then show that $e^f = o(e^g)$ as $x \to \infty$.
…
a. If $a_n > 0$ and $\limsup_{n\to\infty} a_{n+1}/a_n < 1$, then $\sum a_n$ converges, and if $\liminf_{n\to\infty} a_{n+1}/a_n > 1$, then $\sum a_n$ diverges.
b. If $a_n \ge 0$ and if $\limsup_{n\to\infty} \sqrt[n]{a_n} < 1$ (respectively, $> 1$), then $\sum a_n$ converges (respectively, diverges).
c. In the ratio comparison test, can the limits be replaced by lim sup's?
49. Prove Raabe's test: If $a_n > 0$ and if $a_{n+1}/a_n \le 1 - A/n$ for some fixed constant $A > 1$ and $n$ sufficiently large, then $\sum a_n$ converges. Similarly, show that if $a_{n+1}/a_n \ge 1 - (1/n)$, then $\sum a_n$ diverges. Use Raabe's test to prove convergence of the hypergeometric series whose general term is
$a_n = \dfrac{\alpha(\alpha + 1) \cdots (\alpha + n - 1)\,\beta(\beta + 1) \cdots (\beta + n - 1)}{1 \cdot 2 \cdots n \cdot \gamma(\gamma + 1) \cdots (\gamma + n - 1)}$,
where $\alpha$, $\beta$, and $\gamma$ are nonnegative integers, $\gamma > \alpha + \beta$. Show that the series diverges if $\gamma < \alpha + \beta$.
50. Show that for $x$ sufficiently large, $f(x) = (x\cos^2 x + \sin^2 x)e^{x^2}$ is monotonic and tends to $+\infty$, but that neither the ratio $f(x)/(x^{1/2} e^{x^2})$ nor its reciprocal is bounded.
51. a. If $u_n > 0$, $n = 1, 2, \ldots$, show that
$\liminf \dfrac{u_{n+1}}{u_n} \le \liminf \sqrt[n]{u_n} \le \limsup \sqrt[n]{u_n} \le \limsup \dfrac{u_{n+1}}{u_n}$.
b. Deduce that if $\lim (u_{n+1}/u_n) = A$, then $\limsup \sqrt[n]{u_n} = A$.
c. Show that the converse of part b is false by use of the sequence $u_{2n} = u_{2n+1} = 2^{-n}$.
d. Calculate $\limsup \sqrt[n]{n!}/n$.
52. Test the following series for convergence.
a. $\sum_{k=1}^{\infty} \dfrac{e^{-k}}{\sqrt{k+1}}$
b. $\sum_{k=0}^{\infty} \dfrac{k}{k^2 + 1}$
c. $\sum_{n=0}^{\infty} \dfrac{\sqrt{n+1}}{n^2 - 3n + 1}$
d. $\sum_{k=1}^{\infty} \dfrac{\log(k+1) - \log k}{\tan^{-1}(2/k)}$
e. $\sum_{n=1}^{\infty} \sin(n^{-\alpha})$, $\alpha$ real, $> 0$
f. $\sum_{n=1}^{\infty} \dfrac{3}{\sqrt{n}}$
53. Given a set $A$ in a metric space, what is the maximum number of distinct subsets that can be produced by successively applying the operations closure, interior, and complement to $A$ (in any order)? Give an example of a set achieving your maximum.
Chapter 3 Compact and Connected Sets
In this chapter, we study two of the most important and useful kinds of sets in metric spaces and especially in $\mathbb{R}^n$. Intuitively, we want to say that a set in $\mathbb{R}^n$ is compact when it is closed and is contained in a bounded region, and that a set is connected when it is "in one piece." Figure 3-1 gives some examples. As usual, it is necessary to turn these ideas into rigorous definitions. In each case the most useful technical definition appears to be a little removed from our intuition, but in the end we will see that it is in good accord with it. The fruitfulness of these notions will be revealed in Chapter 4, where they will be applied to the study of continuous functions.
§3.1 Compactness
In this section we give the general definition and properties of compact sets in metric spaces. A criterion for recognizing compact sets, called the Heine-Borel theorem, states that a set in $\mathbb{R}^n$ is compact iff it is closed and bounded. This result, special to the metric space $\mathbb{R}^n$, is discussed in §3.2. Recall from our discussion of completeness of $\mathbb{R}^n$ in Chapter 1 that every bounded sequence has a convergent subsequence. This can be rephrased: If $A \subset \mathbb{R}^n$ is a closed and bounded set, then every sequence in $A$ has a subsequence converging to a point of $A$. Historically, this was recognized to be an important property of sets, and so was elevated to a definition. This property plays a crucial role in many basic theorems such as the existence of maxima and minima of continuous functions on closed intervals, as we shall see in Chapter 4.
3.1.1 Definition Let $M$ be a metric space. A subset $A \subset M$ is called sequentially compact if every sequence in $A$ has a subsequence that converges to a point in $A$.
FIGURE 3-1 Compact and connected sets in $\mathbb{R}^2$ (panels: compact, noncompact, connected, not connected)
This property is equivalent to another property, called compactness, that we shall now develop. This property is less obvious, and its equivalence to sequential compactness is far from clear, at least at first. Here is some terminology we need for our formal definition. Let $M$ be a metric space and $A \subset M$ a subset. A cover of $A$ is a collection $\{U_i\}$ of sets whose union contains $A$; it is an open cover if each $U_i$ is open. A subcover of a given cover is a subcollection of $\{U_i\}$ whose union also contains $A$ or, as we say, covers $A$; it is a finite subcover if the subcollection contains only a finite number of sets. Open covers are not necessarily countable collections of open sets. For example, the uncountable set of disks $\{D((x, 0), 1) \mid x \in \mathbb{R}\}$ in $\mathbb{R}^2$ covers the real axis, and the subcollection of all disks $D((n, 0), 1)$ centered at integer points on the real line forms a countable subcover. Note that the set of disks $D((2n, 0), 1)$ centered at even integer points on the real line does not form a subcover (why?).
3.1.2 Definition A subset A of a metric space M is called compact if every open cover of A has a finite subcover. Here is the first major result, which links compactness and sequential compactness.
3.1.3 Bolzano-Weierstrass Theorem A subset of a metric space is compact iff it is sequentially compact.
Some simple observations will help give a feel for compactness and for this theorem. First, a sequentially compact set must be closed. Indeed, if $x_n \in A$ converges to $x \in M$, then by assumption there is a subsequence converging to a point $x_0 \in A$; by uniqueness of limits, $x = x_0$, and so $A$ is closed. Second, a sequentially compact set $A$ must be bounded, for if not, there is a point $x_0 \in A$ and a sequence $x_n \in A$ with $d(x_n, x_0) \ge n$. Then $x_n$ cannot have any convergent subsequence. To show directly that a compact set is bounded, use the fact that for any $x_0 \in A$, the open balls $D(x_0, n)$, $n = 1, 2, \ldots$, cover $A$, so there is a finite subcover. Note that in the definitions, one can take $A = M$, in which case one just speaks of a compact metric space. We shall develop examples of compact spaces in due course. Another characterization of compactness relates to completeness. It is a useful technical tool used in the proof of the Bolzano-Weierstrass theorem.
3.1.4 Definition A set $A \subset M$ is called totally bounded if for each $\varepsilon > 0$ there is a finite set $\{x_1, \ldots, x_N\}$ in $M$ such that $A \subset \bigcup_{i=1}^{N} D(x_i, \varepsilon)$.
3.1.5 Theorem A metric space is compact iff it is complete and totally bounded.
Let $A \subset M$, and assume that $M$ is complete. If we apply this theorem to the metric space $A$, we conclude that $A$ is compact iff it is closed and totally bounded. In Theorem 3.1.5, a few things are obvious, others less obvious. First, note that $D(x_i, \varepsilon) \subset D(x_1, \varepsilon + d(x_i, x_1))$, so that if
$R = \varepsilon + \max\{d(x_2, x_1), \ldots, d(x_N, x_1)\}$,
then $A \subset D(x_1, R)$ and so a totally bounded set is bounded. This is consistent with our earlier remark that compact sets are bounded. At this stage we do not have effective methods for telling when a given set is compact. We will remedy this in the next section.
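Total boundedness can be made concrete for the unit square in $\mathbb{R}^2$. The following sketch builds a finite set of centers whose $\varepsilon$-disks cover $[0,1] \times [0,1]$ and computes the radius $R = \varepsilon + \max_i d(x_i, x_1)$ discussed above. The grid construction and the function names are illustrative assumptions; a grid of spacing at most $\varepsilon$ works because every point of the square is then within $\varepsilon\sqrt{2}/2 < \varepsilon$ of a grid point.

```python
import math


def epsilon_net_unit_square(eps):
    """Grid points x_1, ..., x_N whose eps-disks cover the square [0,1] x [0,1]."""
    m = math.ceil(1.0 / eps)  # steps per axis, so the spacing 1/m is at most eps
    return [(i / m, j / m) for i in range(m + 1) for j in range(m + 1)]


def is_covered(p, net, eps):
    """True if p lies in some open disk D(x_i, eps)."""
    return any(math.dist(p, c) < eps for c in net)


def bounding_radius(net, eps):
    """R = eps + max_i d(x_i, x_1), so that every D(x_i, eps) lies in D(x_1, R)."""
    return eps + max(math.dist(c, net[0]) for c in net)
```

With `eps = 0.25` the net has 25 points, and `bounding_radius` returns $0.25 + \sqrt{2}$, the distance between opposite corners plus $\varepsilon$.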
3.1.6 Example The entire real line $\mathbb{R}$ is not compact, for it is unbounded. Another reason is that
$\{D(n, 1) = \,]n - 1, n + 1[\; \mid n = 0, \pm 1, \pm 2, \ldots\}$
is an open cover of $\mathbb{R}$ but does not have a finite subcover (why?).
•
3.1.7 Example Let $A = \,]0, 1]$. Find an open cover with no finite subcover.
Solution Consider the open cover $\{]1/n, 2[\; \mid n = 1, 2, 3, \ldots\}$. (Why does the union contain all of $A$?) It clearly cannot have a finite subcover. This time, compactness fails because $A$ is not closed; the point 0 is "missing" from $A$. This collection is not a cover for $[0, 1]$; in fact any open cover for $[0, 1]$ must have a finite subcover, because, as we prove in the next section, $[0, 1]$ is compact. •
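The claim that the cover in Example 3.1.7 has no finite subcover can be made explicit: for any finite set of indices, the point $1/N$ with $N$ the largest index lies in $A$ but in none of the chosen intervals. A small sketch with exact rational arithmetic follows; the function names are illustrative assumptions.

```python
from fractions import Fraction


def uncovered_point(indices):
    """A point of A = ]0, 1] missed by the finitely many intervals ]1/n, 2[, n in indices."""
    largest = max(indices)
    # For every chosen n we have 1/n >= 1/largest, so 1/largest is not in ]1/n, 2[.
    return Fraction(1, largest)


def covers(point, indices):
    """True if point lies in ]1/n, 2[ for some n in indices."""
    return any(Fraction(1, n) < point < 2 for n in indices)
```

Adding one more interval with a larger index covers the returned point, but the enlarged finite family then misses a new, smaller point in the same way.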
3.1.8 Example Give an example of a bounded and closed set that is not compact.
Solution Let $M$ be any infinite set with the discrete metric: $d(x, y) = 0$ if $x = y$ and $d(x, y) = 1$ if $x \ne y$. Clearly, $M \subset D(x_0, 2)$ for any $x_0 \in M$, and so $M$ is bounded. Since it is already the entire metric space, it is closed. However, it is not compact. Indeed, $\{D(x, 1/2) \mid x \in M\}$ is an open cover with no finite subcover. •
3.1.9 Example A collection of closed sets $\{K_\alpha\}$ in a metric space $M$ is said to have the finite intersection property for $A$ if the intersection of any finite number of the $K_\alpha$ with $A$ is nonempty. Show that $A \subset M$ is compact iff every collection of closed sets with the finite intersection property for $A$ has nonempty intersection with $A$.
Solution First, assume $A$ is compact. Let $\{F_i\}$ be a collection of closed sets and let $U_i = M \setminus F_i$, so that $U_i$ is open. Suppose that $A \cap (\bigcap_i F_i) = \emptyset$. Taking complements, this means that the $U_i$ cover $A$. Since the covering is open, there is a finite subcovering, say, $A \subset U_1 \cup \cdots \cup U_N$. Then $A \cap (F_1 \cap \cdots \cap F_N) = \emptyset$, and so $\{F_i\}$ does not have the finite intersection property. Thus, if $\{F_i\}$ is a collection of closed sets with the finite intersection property, then $A \cap (\bigcap_i F_i) \ne \emptyset$. Conversely, let $\{U_i\}$ be an open covering of $A$ and let $F_i = M \setminus U_i$. Then $A \cap (\bigcap_i F_i) = \emptyset$, and so, by assumption, $\{F_i\}$ cannot have the finite intersection property for $A$. Thus, $A \cap (F_1 \cap \cdots \cap F_N) = \emptyset$ for some members $F_1, \ldots, F_N$ of the collection. Hence, $U_1, \ldots, U_N$ is the required finite subcover and thus $A$ is compact. •
Exercises for §3.1
1. Show that $A \subset M$ is sequentially compact iff every infinite subset of $A$ has an accumulation point in $A$.
2. Prove that $\{(x, y) \in \mathbb{R}^2 \mid 0 \le x < 1, 0 \le y \le 1\}$ is not compact.
3. Let $M$ be complete and $A \subset M$ be totally bounded. Show that $\mathrm{cl}(A)$ is compact.
4. Let $x_k \to x$ be a convergent sequence in a metric space and let $A = \{x_1, x_2, \ldots\} \cup \{x\}$.
a. Show that $A$ is compact.
b. Verify that every open cover of $A$ has a finite subcover.
5. Let $M$ be a set with the discrete metric. Show that any infinite subset of $M$ is noncompact. Why does this not contradict the statement in Exercise 4?
§3.2 The Heine-Borel Theorem
In Euclidean space we can easily tell if a set is compact from the following theorem:
3.2.1 Heine-Borel Theorem A set $A \subset \mathbb{R}^n$ is compact iff it is closed and bounded.
One half of this was already indicated in §3.1. In fact, a compact set is closed and bounded in any metric space. The converse must be special in view of Example 3.1.8. Indeed, it is not even obvious that the closed interval $[0, 1]$ in $\mathbb{R}$ is compact. In fact, $[0, 1]$ is compact, and one of the proofs of the Heine-Borel theorem begins by treating this case.
3.2.2 Example Determine which of the following are compact:
a. $\{x \in \mathbb{R} \mid x \ge 0\} \subset \mathbb{R}$
b. $[0, 1] \cup [2, 3] \subset \mathbb{R}$
c. $\{(x, y) \in \mathbb{R}^2 \mid x^2 + y^2 < 1\} \subset \mathbb{R}^2$
Solution
a. Noncompact, because it is unbounded.
b. Compact, because it is closed and bounded.
c. Noncompact, because it is not closed.
•
3.2.3 Example Let $x_k$ be a sequence of points in $\mathbb{R}^n$ with $\|x_k\| \le 3$ for all $k$. Show that $x_k$ has a convergent subsequence.
Solution The set $A = \{x \in \mathbb{R}^n \mid \|x\| \le 3\}$ is closed and bounded, and hence compact. Since $x_k \in A$, we can apply the Bolzano-Weierstrass theorem to obtain the conclusion. •
3.2.4 Example In the definition of a compact set, can "every" be replaced by "some"?
Solution No. Let $A = \mathbb{R}$, and let the open cover consist of the single open set $\mathbb{R}$. This has a finite subcover, namely, itself, but being unbounded, $\mathbb{R}$ is not compact. •
3.2.5 Example Let $A = \{0\} \cup \{1, 1/2, \ldots, 1/n, \ldots\}$. Show directly that $A$ satisfies the definition of compactness.
Solution Let $\{U_i\}$ be an arbitrary open cover of $A$. We must show that there is a finite subcover. The point 0 lies in one of the open sets; relabeling if needed, we can suppose that $0 \in U_1$. Since $U_1$ is open and $1/n \to 0$, there is an $N$ such that $1/N, 1/(N + 1), \ldots$ lie in $U_1$. Relabeling again if needed, suppose that $1 \in U_2, \ldots, 1/(N - 1) \in U_N$. Then $U_1, \ldots, U_N$ is a finite subcover, since it is a finite subcollection of the $\{U_i\}$ and it includes all of the points of $A$. Notice that if $A$ were the set $\{1, 1/2, \ldots\}$, then the argument would not work. In fact, this set is not closed, and so it is not compact. •
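The key step of Example 3.2.5 is that only finitely many points $1/n$ lie outside the open set containing 0. The sketch below assumes, for illustration only, that this open set is the interval $]-\varepsilon, \varepsilon[$ (any open set containing 0 contains such an interval); the function name is hypothetical.

```python
import math
from fractions import Fraction


def points_outside_U1(eps):
    """Points 1/n of A not in U_1 = ]-eps, eps[, for a positive Fraction eps."""
    largest_n = math.floor(1 / eps)  # 1/n >= eps exactly when n <= 1/eps
    return [Fraction(1, n) for n in range(1, largest_n + 1)]
```

Each returned point needs at most one further open set from the cover, so a subcover of size at most `1 + len(points_outside_U1(eps))` exists. For the set $\{1, 1/2, \ldots\}$ without 0 there is no such $U_1$ to absorb the tail, which is why the argument breaks down there.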
Exercises for §3.2
1. Which of the following sets are compact?
a. $\{x \in \mathbb{R} \mid 0 \le x \le 1 \text{ and } x \text{ is irrational}\}$
b. $\{(x, y) \in \mathbb{R}^2 \mid 0 \le x \le 1\}$
c. $\{(x, y) \in \mathbb{R}^2 \mid xy \ge 1\} \cap \{(x, y) \mid x^2 + y^2 < 5\}$
2. Let $r_1, r_2, r_3, \ldots$ be an enumeration of the rational numbers in $[0, 1]$. Show that there is a convergent subsequence.
3. Let $M = \{(x, y) \in \mathbb{R}^2 \mid x^2 + y^2 \le 1\}$ with the standard metric. Show that $A \subset M$ is compact iff $A$ is closed.
4. Let $A$ be a bounded set in $\mathbb{R}^n$. Prove that $\mathrm{cl}(A)$ is compact.
5. Let $A$ be an infinite set in $\mathbb{R}$ with a single accumulation point in $A$. Must $A$ be compact?
§3.3 Nested Set Property
The next theorem is an important consequence of the Bolzano-Weierstrass theorem.
3.3.1 Nested Set Property Let $F_k$ be a sequence of compact nonempty sets in a metric space $M$ such that $F_{k+1} \subset F_k$ for all $k = 1, 2, \ldots$. Then there is at least one point in $\bigcap_{k=1}^{\infty} F_k$.
If we let $U_k = M \setminus F_k$, then $\bigcup_{k=1}^{\infty} U_k \ne M$ is equivalent to $\bigcap_{k=1}^{\infty} F_k \ne \emptyset$. Thus, if $M$ is a metric space and the open sets $U_k$ are increasing, i.e., $U_{k+1} \supset U_k$, and have compact complements, then the union of the $U_k$ is not all of $M$.
FIGURE 3.3-1 Nested set property
3.3.2 Example Let $M$ be the unit sphere in $\mathbb{R}^3$, $M = \{(x, y, z) \mid x^2 + y^2 + z^2 = 1\}$ with the standard metric. Let $U_i$ be the portion of $M$ strictly below latitude $90° - 10/i$, $i = 1, 2, 3, \ldots$, as in Figure 3.3-2.
FIGURE 3.3-2 An increasing sequence of open sets on the unit sphere
The metric space $M$ is compact (why?), and, consistent with the preceding remarks, the union of the $U_i$ is not all of $M$, since it excludes the north pole. •
3.3.3 Example Verify the nested set property for $F_k = [0, 1/k] \subset \mathbb{R}$.
Solution Each $F_k$ is compact, and $F_{k+1} \subset F_k$. The intersection is $\{0\}$, which is nonempty. •
3.3.4 Example Is the nested set property true if "compact nonempty" is replaced by "open nonempty" or "closed nonempty"?
Solution No. Let $F_k = \,]k, \infty[$ or $[k, \infty[$.
•
3.3.5 Example A more exotic family of decreasing compact sets $F_n$, for which $\bigcap_{n=1}^{\infty} F_n$ is quite complicated, is obtained by removing successive triangles from a given triangle in the plane, as in Figure 3.3-3. •
FIGURE 3.3-3 Sierpinski's gasket
Exercises for §3.3
1. Verify the nested set property for $F_k = \{x \in \mathbb{R} \mid x \ge 0,\ 2 \le x^2 \le 2 + 1/k\}$.
2. Is the nested set property true if "compact nonempty" is replaced by "open bounded nonempty"?
3. Let $x_k \to x$ be a convergent sequence in a metric space. Verify the validity of the nested set property for $F_k = \{x_l \mid l \ge k\} \cup \{x\}$. What happens if $F_k = \{x_l \mid l \ge k\}$?
4. Let $x_k \to x$ be a convergent sequence in a metric space. Let $\mathcal{A}$ be a family of closed sets with the property that for each $A \in \mathcal{A}$, there is an $N$ such that $k \ge N$ implies $x_k \in A$. Prove that $x \in \bigcap \mathcal{A}$.
§3.4 Path-Connected Sets
The second important topic to be discussed in this chapter is connectedness. We know intuitively those sets we would like to call "connected." However, our intuition can fail in judging more complicated sets. For example, how do we decide whether the set $\{(x, \sin(1/x)) \mid x > 0\} \cup \{(0,y) \mid y \in [-1, 1]\} \subset \mathbb{R}^2$ is connected (see Figure 3.4-1)? Therefore, we seek a sound mathematical definition we can depend on.
FIGURE 3.4-1 Connected?
There are, in fact, two different (but closely related) notions of connectedness. The more intuitive and applicable of these is that of path-connectedness, and so we begin with it. Our definition must first define what is meant by a curve (or path) joining two points.
3.4.1 Definition We call a map
0 such that for all $y \in A$, $y \neq x$, we have $d(x,y) > \varepsilon$.
b. A set is called discrete if all its points are isolated. Give some examples. Show that a discrete set is compact iff it is finite.
15. Let $K_1 \subset M_1$ and $K_2 \subset M_2$ be path-connected (respectively, connected, compact). Show that $K_1 \times K_2$ is path-connected (respectively, connected, compact) in $M_1 \times M_2$.
16. If $x_k \to x$ in a normed space, prove that $\|x_k\| \to \|x\|$. Is the converse true? Use this to prove that $\{x \in \mathbb{R}^n \mid \|x\| \le 1\}$ is closed, using sequences.
17. Let $K$ be a nonempty closed set in $\mathbb{R}^n$ and $x \in \mathbb{R}^n \setminus K$. Prove that there is a $y \in K$ such that $d(x,y) = \inf\{d(x,z) \mid z \in K\}$. Is this true for open sets? Is it true in general metric spaces?
18. Let $F_n \subset \mathbb{R}$ be defined by $F_n = \{x \mid x \ge 0 \text{ and } 2 - 1/n \le x^2 \le 2 + 1/n\}$. Show that $\bigcap_{n=1}^{\infty} F_n \neq \varnothing$. Use this to show the existence of $\sqrt{2}$.
19. Let $V_n \subset M$ be open sets such that $\operatorname{cl}(V_n)$ is compact, $V_n \neq \varnothing$, and $\operatorname{cl}(V_n) \subset V_{n-1}$. Prove $\bigcap_{n=1}^{\infty} V_n \neq \varnothing$.
20. Prove that a compact subset $A$ of a metric space must be closed as follows: Let $x$ be in the complement of $A$. For each $y \in A$, choose disjoint neighborhoods $U_y$ of $y$ and $V_y$ of $x$. Consider the open cover $\{U_y\}_{y \in A}$ of $A$ to show the complement of $A$ is open.
21. a. Prove: a set $A \subset M$ is connected iff $\varnothing$ and $A$ are the only subsets of $A$ that are open and closed relative to $A$. (A set $U \subset A$ is called open relative to $A$ if $U = V \cap A$ for some open set $V \subset M$; "closed relative to $A$" is defined similarly.)
b. Prove that $\varnothing$ and $\mathbb{R}^n$ are the only subsets of $\mathbb{R}^n$ that are both open and closed.
Exercises for Chapter 3
22. Find two subsets $A, B \subset \mathbb{R}^2$ and a point $x_0 \in \mathbb{R}^2$ such that $A \cup B$ is not connected but $A \cup B \cup \{x_0\}$ is connected.
23. Let $\mathbb{Q}$ denote the rationals in $\mathbb{R}$. Show that both $\mathbb{Q}$ and the irrationals $\mathbb{R} \setminus \mathbb{Q}$ are not connected.
24. Prove that a set $A \subset M$ is not connected if we can write $A$ as the disjoint union of two sets $B$ and $C$ such that $B \cap A \neq \varnothing$, $C \cap A \neq \varnothing$, and neither of the sets $B$ or $C$ has a point of accumulation belonging to the other set.
25. Prove that there is a sequence of distinct integers $n_1, n_2, \ldots \to \infty$ such that $\lim_{k \to \infty} \sin n_k$ exists.
26. Show that the completeness property of $\mathbb{R}$ may be replaced by the Nested Interval Property: If $\{F_n\}_{n=1}^{\infty}$ is a sequence of closed bounded intervals in $\mathbb{R}$ such that $F_{n+1} \subset F_n$ for all $n = 1, 2, 3, \ldots$, then there is at least one point in $\bigcap_{n=1}^{\infty} F_n$.
27. Let $A \subset \mathbb{R}$ be a bounded set. Show that $A$ is closed iff for every sequence $x_n \in A$, $\limsup x_n \in A$ and $\liminf x_n \in A$.
28. Let $A \subset M$ be connected and contain more than one point. Show that every point of $A$ is an accumulation point of $A$.
29. Let $A = \{(x,y) \in \mathbb{R}^2 \mid x^4 + y^4 = 1\}$. Show that $A$ is compact. Is it connected?
30. Let $U_k$ be a sequence of open bounded sets in $\mathbb{R}^n$. Prove or disprove:
a. $\bigcup_{k=1}^{\infty} U_k$ is open.
b. $\bigcap_{k=1}^{\infty} U_k$ is open.
c. $\bigcap_{k=1}^{\infty} (\mathbb{R}^n \setminus U_k)$ is closed.
d. $\bigcap_{k=1}^{\infty} (\mathbb{R}^n \setminus U_k)$ is compact.
31. Suppose $A \subset \mathbb{R}^n$ is not compact. Show that there exists a sequence $F_1 \supset F_2 \supset F_3 \cdots$ of closed sets such that $F_k \cap A \neq \varnothing$ for all $k$ and
32. Let $x_n$ be a sequence in $\mathbb{R}^3$ such that $\|x_{n+1} - x_n\| \le 1/(n^2 + n)$, $n \ge 1$. Show that $x_n$ converges.
33. Baire category theorem. A set $S$ in a metric space is called nowhere dense if for each nonempty open set $U$, we have $\operatorname{cl}(S) \cap U \neq U$, or equivalently, $\operatorname{int}(\operatorname{cl}(S)) = \varnothing$. Show that $\mathbb{R}^n$ cannot be written as the countable union of nowhere dense sets.
34. Prove that each closed set $A \subset M$ is an intersection of a countable family of open sets.
35. Let $a \in \mathbb{R}$ and define the sequence $a_1, a_2, \ldots$ in $\mathbb{R}$ by $a_1 = a$, and $a_n = a_{n-1}^2 - a_{n-1} + 1$ if $n > 1$. For what $a \in \mathbb{R}$ is the sequence
a. Monotone?
b. Bounded?
c. Convergent?
Compute the limit in the cases of convergence.
36. Let $A \subset \mathbb{R}^n$ be uncountable. Prove that $A$ has an accumulation point.
37. Let $A, B \subset M$ with $A$ compact, $B$ closed, and $A \cap B = \varnothing$.
a. Show that there is an $\varepsilon > 0$ such that $d(x,y) > \varepsilon$ for all $x \in A$ and $y \in B$.
b. Is a true if $A$, $B$ are merely closed?
38. Show that $A \subset M$ is not connected iff there exist two disjoint open sets $U$, $V$ such that $U \cap A \neq \varnothing$, $V \cap A \neq \varnothing$, and $A \subset U \cup V$.
39. Let $F_1 = [0, 1/3] \cup [2/3, 1]$ be obtained from $[0, 1]$ by removing the middle third. Repeat, obtaining $F_2 = [0, 1/9] \cup [2/9, 1/3] \cup [2/3, 7/9] \cup [8/9, 1]$. In general, $F_n$ is a union of intervals and $F_{n+1}$ is obtained by removing the middle third of these intervals. Let $C = \bigcap_{n=1}^{\infty} F_n$, the Cantor set. Prove:
a. $C$ is compact.
b. $C$ has infinitely many points. [Hint: Look at the endpoints of $F_n$.]
c. $\operatorname{int}(C) = \varnothing$.
d. $C$ is perfect; that is, it is closed with no isolated points.
e. Show that $C$ is totally disconnected; that is, if $x, y \in C$ and $x \neq y$ then $x \in U$ and $y \in V$ where $U$ and $V$ are open sets that disconnect $C$.
40. Let $F_k$ be a nest of compact sets (that is, $F_{k+1} \subset F_k$). Furthermore, suppose each $F_k$ is connected. Prove that $\bigcap_{k=1}^{\infty} F_k$ is connected. Give an example to show that compactness is an essential condition and we cannot just assume that "$F_k$ is a nest of closed connected sets."
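The construction of the sets $F_n$ in Exercise 39 can be explored by hand with a short computation. The following Python sketch is only an illustration, not part of the exercise: the function names `remove_middle_thirds` and `cantor_stage` are hypothetical, and the convention $F_0 = [0, 1]$ is an assumption made here for convenience. Exact rational arithmetic is used so that the endpoints match the fractions in the text.

```python
from fractions import Fraction


def remove_middle_thirds(intervals):
    """Replace each closed interval [lo, hi] by its two outer closed thirds."""
    out = []
    for lo, hi in intervals:
        third = (hi - lo) / 3
        out.append((lo, lo + third))
        out.append((hi - third, hi))
    return out


def cantor_stage(n):
    """Closed intervals making up F_n, assuming F_0 = [0, 1] (a convention chosen here)."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        intervals = remove_middle_thirds(intervals)
    return intervals


F2 = cantor_stage(2)
# Total length of F_5; each step keeps two thirds of the previous length.
total_length = sum(hi - lo for lo, hi in cantor_stage(5))
```

Each $F_n$ here is a finite union of closed bounded intervals, which is consistent with part a, and the total length shrinking toward zero is at least suggestive of part c.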
...
Chapter 4 Continuous Mappings
To obtain interesting and useful theorems, it is often necessary to make restrictions on the mathematical objects one studies. In this chapter we require that the functions studied be continuous, and we will investigate some of the consequences of this restriction. In Chapter 6 we study an even stronger restriction, namely, that of differentiability.
§4.1 Continuity
First we examine the notion of continuity for real-valued functions on the real line $\mathbb{R}$. Figure 4.1-1a shows a continuous function, and Figure 4.1-1b shows a discontinuous one. A continuous function has the important property that when $x$ is close to $x_0$, $f(x)$ is close to $f(x_0)$ (as shown in Figure 4.1-1a). On the other hand, in Figure 4.1-1b, even if $x$ is arbitrarily close to $x_0$, $f(x)$ need not be close to $f(x_0)$. The reader should be familiar with these ideas from one-variable calculus. To define continuity in more precise and general terms, first the concept of the limit of a function at a point is defined. Let $(M, d)$ and $(N, \rho)$ be two metric spaces, $A \subset M$, and $f : A \to N$ a given mapping.
4.1.1 Definition Suppose that $x_0$ is an accumulation point of $A$. We say that $b \in N$ is the limit of $f$ at $x_0$, written
$$\lim_{x \to x_0} f(x) = b,$$
if given any $\varepsilon > 0$ there exists $\delta > 0$ (possibly depending on $f$, $x_0$, and $\varepsilon$) such that for all $x \in A$ satisfying $x \neq x_0$ and $d(x_0, x) < \delta$, we have $\rho(f(x), b) < \varepsilon$.
FIGURE 4.1-1 (a) Continuous function; (b) discontinuous function
Intuitively, this says that as $x$ approaches $x_0$, $f(x)$ approaches $b$. We also write $f(x) \to b$ as $x \to x_0$. (Compare this with the concept of the limit of a sequence.) Note that if $x_0$ is not an accumulation point, there will not be any other points of $A$ near $x_0$, in which case the condition becomes vacuous.
Note. Whenever statements are given for metric spaces, ask yourself: What is the corresponding statement for $\mathbb{R}^n$? Likewise, when given a statement or homework problem in $\mathbb{R}^n$, ask: is this special to $\mathbb{R}^n$, or does it work generally in a metric, normed, or vector space?
In general, $\lim_{x \to x_0} f(x)$ need not exist. For example, let $f : \mathbb{R} \setminus \{0\} \to \mathbb{R}$ be defined by $f(x) = 1$ if $x < 0$ and $f(x) = 2$ if $x > 0$. Then $0$ is an accumulation point of $\mathbb{R} \setminus \{0\}$ but $\lim_{x \to 0} f(x)$ doesn't exist. However, if $f(x) = 1$ when $x \neq 0$ and if $f(0) = 0$, then $\lim_{x \to 0} f(x) = 1$. Another example is $f : \mathbb{R} \setminus \{0\} \to \mathbb{R}$, $f(x) = \sin(1/x)$; this function oscillates faster and faster near $0$ and so cannot approach any limit there.
If $\lim_{x \to x_0} f(x)$ exists, then it is unique, and so we are justified in saying the limit of $f$ at $x_0$. To prove this, suppose $\lim_{x \to x_0} f(x) = b$ and $\lim_{x \to x_0} f(x) = b'$. To show that $b = b'$, let $\varepsilon > 0$ be given. Then there exist $\delta_1 > 0$ and $\delta_2 > 0$ such that $0 < d(x, x_0) < \delta_1$ implies $\rho(f(x), b) < \varepsilon/2$ and $0 < d(x, x_0) < \delta_2$ implies $\rho(f(x), b') < \varepsilon/2$. Let $\delta = \min\{\delta_1, \delta_2\}$. Since $x_0$ is an accumulation point of $A$, there is an $x \in A$ with $0 < d(x, x_0) < \delta$, and for such an $x$ we get $\rho(b, b') \le \rho(b, f(x)) + \rho(f(x), b') < \varepsilon/2 + \varepsilon/2 = \varepsilon$, by the triangle inequality. Thus, $\rho(b, b') < \varepsilon$ for any $\varepsilon > 0$ and so $\rho(b, b') = 0$, which means $b = b'$. Notice that in 4.1.1, if we had not required $x_0$ to be an accumulation point, then limits need not have been unique.
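The failure of $\sin(1/x)$ to have a limit at $0$ can be made concrete by evaluating it along two sequences that both tend to $0$. The Python sketch below is an illustration only; the sequence names `zeros` and `peaks` are hypothetical choices, and floating-point values are approximations of the exact ones.

```python
import math


def f(x):
    """f(x) = sin(1/x), defined for x != 0."""
    return math.sin(1.0 / x)


# Two sequences converging to 0 (names chosen here for illustration).
# Along `zeros`, 1/x is a multiple of pi, so f is (approximately) 0.
zeros = [1.0 / (k * math.pi) for k in range(1, 200)]
# Along `peaks`, 1/x is pi/2 plus a multiple of 2*pi, so f is (approximately) 1.
peaks = [1.0 / (2 * k * math.pi + math.pi / 2) for k in range(1, 200)]

values_on_zeros = [f(x) for x in zeros]
values_on_peaks = [f(x) for x in peaks]
```

Since the values stay near $0$ along one sequence and near $1$ along the other, no single $b$ can satisfy Definition 4.1.1 with, say, $\varepsilon = 1/4$.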
There are some particular cases of 4.1.1 in which special notation is used. For instance, suppose $f$ is defined (at least) on $A = \,]x_0, a] \subset \mathbb{R}$ for some $a > x_0$ and $f$ is real-valued. Then
$$\lim_{x \to x_0^+} f(x) = b$$
means the limit of $f$ with domain $A = \,]x_0, a]$. In other words, for every $\varepsilon > 0$ there is a $\delta > 0$ such that $|x - x_0| < \delta$ and $x > x_0$ implies $|f(x) - b| < \varepsilon$. Thus we are taking the limit of $f$ as $x$ approaches $x_0$ from the right. Similarly, we can define
$$\lim_{x \to x_0^-} f(x) = b,$$
the limit as $x$ approaches $x_0$ from the left. These are, for obvious reasons, called one-sided limits. It should now be clear to the reader how to define expressions like $\lim_{x \to \infty} f(x) = a$, and so forth. We are now ready to define continuity of a function at a point.
4.1.2 Definition Let $A \subset M$, $f : A \to N$, and $x_0 \in A$. We say that $f$ is continuous at $x_0$ if either $x_0$ is not an accumulation point of $A$ or $\lim_{x \to x_0} f(x) = f(x_0)$.
This definition requires the existence of $\lim_{x \to x_0} f(x)$ in addition to specifying its value. It can be rephrased as follows: $f$ is continuous at the point $x_0$ in its domain iff for all $\varepsilon > 0$, there is a $\delta > 0$ such that for all $x \in A$, $d(x, x_0) < \delta$ implies $\rho(f(x), f(x_0)) < \varepsilon$. In Definition 4.1.1, we needed to specify that $x \neq x_0$ because $f$ was not necessarily defined at $x_0$, but here there is no need to specify $x \neq x_0$, since our condition is certainly valid if $x = x_0$. In the important special case $f : A \subset \mathbb{R}^n \to \mathbb{R}^m$, observe that $f$ is continuous at $x_0 \in A$ iff for all $\varepsilon > 0$ there is a $\delta > 0$ such that for all $x \in A$ with $\|x - x_0\| < \delta$, we have $\|f(x) - f(x_0)\| < \varepsilon$.
4.1.3 Definition A function $f : A \subset M \to N$ is called continuous on the set $B \subset A$ if $f$ is continuous at each point of $B$. If we just say that $f$ is continuous, we mean that $f$ is continuous on its domain $A$.
There are other useful ways of formulating the notion of continuity. One of these, given in iii of the next theorem, is particularly significant because it involves only the topology (that is, the open sets), and so it would be applicable in more general situations.
4.1.4 Theorem Let $f : A \subset M \to N$ be a mapping. Then the following assertions are equivalent:
i. $f$ is continuous on $A$.
ii. For each convergent sequence $x_k \to x_0$ in $A$, we have $f(x_k) \to f(x_0)$.
iii. For each open set $U$ in $N$, $f^{-1}(U) \subset A$ is open relative to $A$; that is, $f^{-1}(U) = V \cap A$ for some open set $V$.
iv. For each closed set $F \subset N$, $f^{-1}(F) \subset A$ is closed relative to $A$; that is, $f^{-1}(F) = G \cap A$ for some closed set $G$.
Condition ii in this theorem has an analogous version for limits that can be proved in the same way. Namely, if $f : A \subset M \to N$ and $x_0$ is an accumulation point of $A$, then
$$\lim_{x \to x_0} f(x) = b \quad \text{if and only if} \quad \lim_{k \to \infty} f(x_k) = b$$
for every sequence $x_k \in A$ that converges to $x_0$, $x_k \neq x_0$. From this theorem it is evident that our definition of a continuous path, given in Chapter 3, coincides with continuity as we have defined it here. In §4.3 we shall introduce theorems that will enable us to readily establish the continuity of some of the more common functions.
We now briefly discuss the plausibility of the theorem. First of all, that i is the same as ii should be clear, for i means that $f(x)$ is near $f(x_0)$ if $x$ is near $x_0$, and ii is the same except that it lets $x$ approach $x_0$ via a sequence. Assertions iii and iv are also the same if we remember that open sets are complements of closed sets. Let us see what iii is telling us. Choose $U$ to be a small open set containing $f(x_0)$. That $f^{-1}(U)$ is open means that there is a whole open disk around $x_0$ contained in $f^{-1}(U)$. In this disk, $x$ is mapped to $U$, which represents points near $f(x_0)$. In other words, using $U$ as a measure of closeness of $f(x)$ to $f(x_0)$, if $x$ is near enough to $x_0$ (i.e., if $x \in f^{-1}(U)$), $f(x)$ will be near $f(x_0)$. This therefore represents the same idea expressed in i.
4.1.5 Example Let $f : \mathbb{R}^n \to \mathbb{R}^n$ be the identity function $x \mapsto x$. Show that $f$ is continuous.
Solution Let $x_0 \in \mathbb{R}^n$. By definition we must find $\delta > 0$ for given $\varepsilon > 0$ such that $\|x - x_0\| < \delta$ implies $\|f(x) - f(x_0)\| < \varepsilon$. If we choose $\delta = \varepsilon$, the definition becomes the statement that $\|x - x_0\|$
Solution Let $f : [0, 1] \to \mathbb{R}$ be defined by $f(x) = 1/x$ if $x > 0$ and $f(0) = 0$. Clearly, this function exhibits the same unboundedness property as does $1/x$ on $]0, 1]$. •
4.4.3 Example Verify Theorem 4.4.1 for $f(x) = x/(x^2 + 1)$ on $[0, 1]$.
Solution $f(0) = 0$, $f(1) = 1/2$. We shall verify explicitly that the maximum is at $x = 1$, and that the minimum is at $x = 0$. (Elementary calculus helps to determine this, but we shall give a direct verification.) First, as $0 \le x \le 1$, $x/(x^2 + 1) \ge 0$, since $x \ge 0$ and $x^2 + 1 \ge 1$, so that $f(x) \ge f(0)$ for $0 \le x \le 1$. Thus $0$ is the minimum. Next, note that $0 \le (x - 1)^2 = x^2 - 2x + 1$, so that $x^2 + 1 \ge 2x$ and hence, for $x \neq 0$,
$$\frac{x}{x^2 + 1} \le \frac{x}{2x} = \frac{1}{2},$$
so that $f(x) \le f(1) = 1/2$ and thus $x = 1$ is the maximum point.
•
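The direct verification in Example 4.4.3 can be sanity-checked numerically. The Python sketch below is only an illustration on a finite grid (the grid size of 1001 points is an arbitrary assumption), so it supports the argument rather than replacing it.

```python
def f(x):
    """f(x) = x / (x^2 + 1), the function of Example 4.4.3."""
    return x / (x * x + 1)


# A uniform grid on [0, 1]; the resolution is an arbitrary choice.
grid = [i / 1000 for i in range(1001)]
values = [f(x) for x in grid]

# Grid points where the smallest and largest sampled values occur.
x_min = grid[values.index(min(values))]
x_max = grid[values.index(max(values))]

# The key inequality from the solution: x^2 + 1 >= 2x on the grid.
inequality_holds = all(x * x + 1 >= 2 * x for x in grid)
```

On this grid the extreme values occur at the endpoints, in agreement with the minimum at $x = 0$ and the maximum at $x = 1$.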
4.4.4 Example Show that $x_0$ and $x_1$ in Theorem 4.4.1 need not be unique.
Solution Let $f(x) = 1$ for all $x \in [0, 1]$. Then any $x_0, x_1 \in [0, 1]$ will do. This example is of course rather trivial. More interesting ones have already been learned in calculus, such as $(x^2 - 1)^2$ on $[-2, 2]$, which has minima at $\pm 1$ and maxima at $\pm 2$. •
Exercises for §4.4
1. Give an example of a continuous and bounded function on all of $\mathbb{R}$ that does not attain its maximum or minimum.
2. Verify the maximum-minimum theorem for $f(x) = x^3 - x$ on $[-1, 1]$.
3. Let $f : K \subset \mathbb{R}^n \to \mathbb{R}$ be continuous on a compact set $K$ and let $M = \{x \in K \mid f(x) \text{ is the maximum of } f \text{ on } K\}$. Show that $M$ is a compact set.
4. Let $f : A \subset \mathbb{R}^n \to \mathbb{R}$ be continuous, $x, y \in A$, and $c : [0, 1] \to A \subset \mathbb{R}^n$ be a continuous curve joining $x$ and $y$. Show that along this curve, $f$ assumes its maximum and minimum values (among all values along the curve).
5. Is a version of the maximum-minimum theorem valid for the function $f(x) = (\sin x)/x$ on $]0, \infty[$? On $[0, \infty[$?
§4.5 The Intermediate Value Theorem
The intermediate value theorem should be known to the reader from elementary calculus. It states that a continuous function on an interval assumes all values between any two given elements of the range, as in Figure 4.5-1a. The discontinuous function $f$ in Figure 4.5-1b never assumes the value $1/2$. Roughly speaking, while a discontinuous function can jump from one value to another, a continuous function must pass through all intermediate values. Another way the intermediate value property can fail is if the domain $A$ is not connected, as illustrated by the continuous function in Figure 4.5-2. Thus the crucial assumptions are that $f$ be continuous and $f$ be defined on a connected region. We shall see that the proof of the intermediate value theorem is in fact quite simple, because of the way we have formulated the notion of connectedness.
4.5.1 Intermediate Value Theorem Let $M$ be a metric space, $A \subset M$, and $f : A \to \mathbb{R}$ be continuous. Suppose that $K \subset A$ is connected and $x, y \in K$. For every number $c \in \mathbb{R}$ such that $f(x) < c < f(y)$, there exists a point $z \in K$ such that $f(z) = c$. •
Since intervals (open or closed) are connected, the usual intermediate value theorem in calculus becomes a special case. However, notice that Theorem 4.5.1 is more general. It applies, for example, to continuous real-valued functions of several variables $f(x_1, \ldots, x_n)$ defined on all of $\mathbb{R}^n$, which is a connected set.
FIGURE 4.5-1 Intermediate value theorem
FIGURE 4.5-2 Continuous function with a disconnected domain
Since path-connected sets are connected, we get
4.5.2 Corollary Let $K \subset \mathbb{R}^n$ be path connected and $f : K \to \mathbb{R}$ be continuous. Let $x, y \in K$ and $f(x) < c < f(y)$. Then there is a $z \in K$ such that $f(z) = c$.
4.5.3 Example Prove Theorem 4.5.1 using the fact that $f(K)$ is connected and that connected sets in $\mathbb{R}$ are intervals.
Solution That $f(K)$ is connected comes from Theorem 4.2.1. Hence $f(K)$ is an interval, possibly unbounded. If $f(x), f(y) \in f(K)$ are such that $f(x) \le f(y)$, then $[f(x), f(y)] \subset f(K)$ since $f(K)$ is an interval. Thus, if $f(x) \le c \le f(y)$, then $c \in [f(x), f(y)] \subset f(K)$, and so $c = f(z)$ for some $z$. Another proof is given in the theorem proofs section of this chapter. •
4.5.4 Example Let $f(x)$ be a cubic polynomial. Show that $f$ has a (real) root $x_0$ (that is, $f(x_0) = 0$).
Solution We can write $f(x) = ax^3 + bx^2 + cx + d$, where $a \neq 0$. Suppose that $a > 0$. For $x$ large and positive, $ax^3$ is large (and positive) and will be bigger than the other terms, so that $f(x) > 0$ if $x$ is large. This requires some exact estimates but should be intuitively clear. To see it exactly, note that
$$ax^3 + bx^2 + cx + d = ax^3 \left(1 + \frac{b}{ax} + \frac{c}{ax^2} + \frac{d}{ax^3}\right)$$
and the factor in parentheses tends to $1$ as $x \to \infty$. Similarly, $f(x) < 0$ if $x$ is large and negative. Hence, we can apply the intermediate value theorem with $K = \mathbb{R}$ to conclude the existence of a point $x_0$ where $f(x_0) = 0$. •
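The existence argument in Example 4.5.4 suggests a way to actually locate a root: once points with $f < 0$ and $f > 0$ are known, repeatedly halving the interval keeps a sign change inside it. The Python sketch below illustrates this bisection idea; it is not the method of the text, the function name `bisect_root` is hypothetical, and the cubic coefficients are an arbitrary sample chosen here.

```python
def bisect_root(f, lo, hi, tol=1e-12):
    """Bisection, assuming f is continuous with f(lo) < 0 < f(hi)."""
    assert f(lo) < 0 < f(hi)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid   # keep the sign change inside [mid, hi]
        else:
            hi = mid   # keep the sign change inside [lo, mid]
    return (lo + hi) / 2


def cubic(x):
    # Sample coefficients a=1, b=-2, c=-5, d=6, i.e. (x - 1)(x + 2)(x - 3).
    return x**3 - 2 * x**2 - 5 * x + 6


# cubic(-10) < 0 and cubic(-1) > 0, so a root lies in between.
root = bisect_root(cubic, -10.0, -1.0)
```

The intermediate value theorem is what guarantees that every interval produced by the loop still contains a root.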
4.5.5 Example Let $f : [1, 2] \to [0, 3]$ be a continuous function satisfying $f(1) = 0$ and $f(2) = 3$. Show that $f$ has a fixed point. That is, show that there is a point $x_0 \in [1, 2]$ such that $f(x_0) = x_0$.
Solution Let $g(x) = f(x) - x$. Then $g$ is continuous, $g(1) = f(1) - 1 = -1$, and $g(2) = f(2) - 2 = 3 - 2 = 1$. Hence, by the intermediate value theorem, $g$ must vanish at some $x_0 \in [1, 2]$, and this $x_0$ is a fixed point for $f(x)$. •
Exercises for §4.5
1. What happens when you apply the method used in Example 4.5.4 to quadratic polynomials? To quintic polynomials?
2. Let $f : \mathbb{R}^n \to \mathbb{R}^m$ be continuous. Let $\Gamma = \{(x, f(x)) \mid x \in \mathbb{R}^n\}$ be the graph of $f$ in $\mathbb{R}^n \times \mathbb{R}^m$. Prove that $\Gamma$ is closed and connected. Generalize your result to metric spaces.
3. Let $f : [0, 1] \to [0, 1]$ be continuous. Prove that $f$ has a fixed point.
4. Let $f : [a, b] \to \mathbb{R}$ be continuous. Show that the range of $f$ is a bounded closed interval.
5. Prove that there is no continuous map taking $[0, 1]$ onto $]0, 1[$.
§4.6 Uniform Continuity Sometimes it is useful to have available a variant of the concept of continuity known as uniform continuity. This variant is mostly useful for technical reasons, such as labor-saving devices in proofs. The exact definition is as follows.
4.6.1 Definition Let $(M, d)$ and $(N, \rho)$ be metric spaces, $A \subset M$, $f : A \to N$, and $B \subset A$. We say that $f$ is uniformly continuous on the set $B$ if for every $\varepsilon > 0$ there is a $\delta > 0$ such that $x, y \in B$ and $d(x, y) < \delta$ imply $\rho(f(x), f(y)) < \varepsilon$.
The definition is similar to that for continuity, except that here we are required to choose $\delta$ to work for all $x, y$ once $\varepsilon$ is given. For continuity, we were required only to choose a $\delta$ once we were given $\varepsilon > 0$ and a particular $x_0$. Clearly, if $f$ is uniformly continuous, then $f$ is continuous. For example, consider $f : \mathbb{R} \to \mathbb{R}$, $f(x) = x^2$. Then $f$ is certainly continuous, but it is not uniformly continuous. Indeed, for $\varepsilon > 0$ and $x_0 > 0$ given, the $\delta > 0$ we need is at least as small as $\varepsilon/(2x_0)$ (why?), and so if we choose $x_0$ large, $\delta$ must get smaller; i.e., no single $\delta$ will do for all $x_0$. This phenomenon cannot happen on compact sets, as the next theorem shows.
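The shrinking of $\delta$ for $f(x) = x^2$ can be seen numerically. For $x_0 > 0$, the worst case of $|x^2 - x_0^2|$ with $|x - x_0| \le \delta$ occurs at $x = x_0 + \delta$, which gives $2x_0\delta + \delta^2$; setting this equal to $\varepsilon$ yields $\delta = \sqrt{x_0^2 + \varepsilon} - x_0$. This formula is a short calculation added here, not taken from the text, and the function name `best_delta` in the Python sketch below is hypothetical.

```python
import math


def best_delta(x0, eps):
    """Largest delta that works for f(x) = x**2 at x0 > 0 (see the lead-in)."""
    # Numerically stable form of sqrt(x0**2 + eps) - x0.
    return eps / (math.sqrt(x0 * x0 + eps) + x0)


eps = 0.1
points = (1.0, 10.0, 100.0, 1000.0)
deltas = {x0: best_delta(x0, eps) for x0 in points}
```

The values in `deltas` decrease toward zero as $x_0$ grows and always stay below $\varepsilon/(2x_0)$, so no single $\delta$ serves every $x_0$.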
4.6.2 Uniform Continuity Theorem Let $f : A \to N$ be continuous and let $K \subset A$ be a compact set. Then $f$ is uniformly continuous on $K$.
The use of merely bounded sets in this theorem will not do, for consider $f(x) = 1/x$ on the noncompact set $]0, 1]$. If we examine the proof that $f$ is continuous (Example 4.1.6), we see that $f$ is not uniformly continuous. Of course, we cannot extend $f$ to be continuous on the compact set $[0, 1]$, because $f$ is already unbounded.
4.6.3 Example Let $f : \,]0, 1] \to \mathbb{R}$ be defined by $f(x) = 1/x$. Show that $f$ is uniformly continuous on $[a, 1]$ for $a > 0$.
Solution This follows from the uniform continuity theorem since $[a, 1]$ is a compact set and $f$ is continuous on $]0, 1]$ and hence on $[a, 1]$. •
In the next section we shall be reviewing differential calculus. However, we wish to use some of the ideas from that area now, since they are relevant for the next result.
4.6.4 Example Let $f : \,]a, b[\, \to \mathbb{R}$ be differentiable, and suppose that there is a constant $M > 0$ such that $|f'(x)| \le M$ for all $x$ satisfying $a < x < b$. Here, $a$ or $b$ may be $\pm\infty$, and $f'$ stands for the derivative of $f$. Show that $f$ is uniformly continuous on $]a, b[$.
Solution The definition of uniform continuity asks us to estimate the difference $|f(x) - f(y)|$ in terms of $|x - y|$. This suggests using the mean value theorem. Indeed, $f(x) - f(y) = f'(x_0)(x - y)$ for some $x_0$ between $x$ and $y$. Hence
$$|f(x) - f(y)| \le M|x - y|;$$
a mapping with this property is called Lipschitz. (We will encounter this property several times again.) Given $\varepsilon > 0$, choose $\delta = \varepsilon/M$. Then $|x - y| < \delta$ implies
$$|f(x) - f(y)| < M \cdot \delta = \frac{M \cdot \varepsilon}{M} = \varepsilon.$$
Hence $f$ is uniformly continuous. •
The intuition suggested by this result may shed some light on uniform continuity. Specifically, this result says that if the slope of the graph of a function is bounded, then it is uniformly continuous. Unbounded slopes suggest that one should suspect a failure of uniform continuity. This is often a good guide when examining specific functions or their graphs. However, as Exercise 7 shows, this guide is not foolproof.
4.6.5 Example Show that $\sin x : \mathbb{R} \to \mathbb{R}$ is uniformly continuous.
Solution $d(\sin x)/dx = \cos x$ is bounded in absolute value by $1$, and so, by 4.6.4, $\sin x$ is uniformly continuous. •
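The Lipschitz bound behind Examples 4.6.4 and 4.6.5, namely $|\sin x - \sin y| \le |x - y|$, can be spot-checked on random samples. The Python sketch below is a numerical illustration only; the sampling range, the sample count, and the fixed seed are arbitrary assumptions.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the sample is reproducible

# Random pairs (x, y) drawn from an arbitrary range.
pairs = [(rng.uniform(-50, 50), rng.uniform(-50, 50)) for _ in range(10000)]

# Largest observed value of |sin x - sin y| / |x - y| over the sample.
worst_ratio = max(
    abs(math.sin(x) - math.sin(y)) / abs(x - y)
    for x, y in pairs
    if x != y
)
```

A ratio never exceeding $1$ (up to rounding) is what the bound $M = 1$ predicts, and with $M = 1$ the choice $\delta = \varepsilon$ works everywhere on $\mathbb{R}$.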
Exercises for §4.6
1. Demonstrate the conclusion in Example 4.6.3 directly from the definition.
2. Prove that $f(x) = 1/x$ is uniformly continuous on $[a, \infty[$ for $a > 0$.
3. Must a bounded continuous function on $\mathbb{R}$ be uniformly continuous?
4. If $f$ and $g$ are uniformly continuous maps of $\mathbb{R}$ to $\mathbb{R}$, must the product $f \cdot g$ be uniformly continuous? What if $f$ and $g$ are bounded?
5. Let $f(x) = |x|$. Show that $f : \mathbb{R} \to \mathbb{R}$ is uniformly continuous.
6. a. Show that $f : \mathbb{R} \to \mathbb{R}$ is not uniformly continuous iff there exist an $\varepsilon > 0$ and sequences $x_n$ and $y_n$ such that $|x_n - y_n| < 1/n$ and $|f(x_n) - f(y_n)| \ge \varepsilon$. Generalize this statement to metric spaces.
b. Use a on $\mathbb{R}$ to prove that $f(x) = x^2$ is not uniformly continuous.
7. Let $f(x) = \sqrt{x}$.
a. Show that $f$ is uniformly continuous on the interval $[0, 1]$.
b. Discuss the relationship between uniform continuity and bounded slopes in light of this example.
§4.7 Differentiation of Functions of One Variable We assume that the reader is familiar with the general ideas of calculus of functions of one variable. Our purpose here is to recall and sharpen some of the basic theoretical points. A few of the proofs are included right in the text, since many will be a review.
4.7.1 Definition Let the function $f$ be defined on some open interval containing $x_0 \in \mathbb{R}$. We say that $f$ is differentiable at $x_0$ if
$$f'(x_0) = \lim_{h \to 0} \frac{f(x_0 + h) - f(x_0)}{h}$$
exists. We call $f'(x_0)$ the derivative of $f$ at $x_0$.
Rewriting this condition as
$$\lim_{x \to x_0} \frac{f(x) - f(x_0) - f'(x_0) \cdot (x - x_0)}{x - x_0} = 0,$$
we see that the straight line $y = f(x_0) + f'(x_0)(x - x_0)$, called the tangent line to the graph of $f$ at $x_0$, is a good approximation to $f$ near $x_0$, and rewriting it as
$$\lim_{\Delta x \to 0} \left[\frac{f(x_0 + \Delta x) - f(x_0)}{\Delta x} - f'(x_0)\right] = 0,$$
we see that $f'(x_0)$, being the limit of the slopes of secant lines, can be interpreted as the slope of the tangent line to the graph of $f$ at $(x_0, f(x_0))$, as in Figure 4.7-1.
FIGURE 4.7-1 $f'(x_0)$ is the slope of the tangent line
Another way of writing the definition that avoids division by $\Delta x$ (and the exclusion $\Delta x \neq 0$) is: For any $\varepsilon > 0$, there is a $\delta > 0$ such that $|\Delta x| < \delta$ implies
$$|f(x_0 + \Delta x) - f(x_0) - f'(x_0)\Delta x| \le \varepsilon|\Delta x|.$$
We assume that the reader is familiar with the Leibniz notation $dy/dx$ for the derivative. Here is the differentiability-continuity relationship:
4.7.2 Proposition If $f$ is differentiable at $x_0$, then $f$ is continuous at $x_0$.
Proof Indeed,
$$\lim_{x \to x_0} f(x) = \lim_{x \to x_0} \left[\frac{f(x) - f(x_0)}{x - x_0}(x - x_0) + f(x_0)\right] = f'(x_0) \cdot 0 + f(x_0) = f(x_0). \quad •$$
Here is another proof of this proposition, which we shall need later: Given $\varepsilon > 0$, there is a $\delta > 0$ such that
$$|f(x_0 + \Delta x) - f(x_0) - f'(x_0)\Delta x| \le \varepsilon|\Delta x|$$
if $|\Delta x| < \delta$. Thus, by the triangle inequality,
$$|f(x_0 + \Delta x) - f(x_0)| \le |f'(x_0)\Delta x| + \varepsilon|\Delta x|.$$
Choosing $\varepsilon = 1$,
$$|f(x_0 + \Delta x) - f(x_0)| \le (|f'(x_0)| + 1)|\Delta x|$$
if $|\Delta x| < \delta$. This shows more than continuity; it shows that $f$ has the Lipschitz property at $x_0$; i.e., there exist constants $M > 0$ and $\delta_0 > 0$ such that
$$|f(x_0 + \Delta x) - f(x_0)| \le M|\Delta x|$$
if $|\Delta x| < \delta_0$. Continuity follows from this (given $\varepsilon > 0$, choose $\delta = \min\{\delta_0, \varepsilon/M\}$). This inequality means that the graph lies inside the "bow tie" region in Figure 4.7-2.
FIGURE 4.7-2 Geometric meaning of the Lipschitz property
4.7.3 Example Let $f(x) = |x|$. Show that $f$ is continuous but not differentiable at $x_0 = 0$.
Solution Indeed, the limit of the difference quotient,
$$\lim_{\Delta x \to 0} \frac{f(x_0 + \Delta x) - f(x_0)}{\Delta x} = \lim_{\Delta x \to 0} \frac{|\Delta x|}{\Delta x},$$
does not exist, since $|\Delta x|/\Delta x$ is $1$ if $\Delta x > 0$ and is $-1$ if $\Delta x < 0$. The function is clearly continuous at $x_0 = 0$ (let $\varepsilon = \delta$ in the definition of continuity). •
4.7.4 Example Let $f(x) = x^2$. Calculate $f'(x_0)$ using the definition.
Solution Here
$$\lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x} = \lim_{\Delta x \to 0} \frac{(x + \Delta x)^2 - x^2}{\Delta x} = \lim_{\Delta x \to 0} [2x + \Delta x] = 2x,$$
and so $f'(x) = 2x$; thus $f$ is differentiable at each $x \in \mathbb{R}$.
•
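The algebra in Example 4.7.4 says that for $f(x) = x^2$ the difference quotient equals $2x + \Delta x$ exactly. The Python sketch below checks this with exact rational arithmetic; the helper name `difference_quotient`, the sample point $x = 3$, and the step sizes are illustrative assumptions.

```python
from fractions import Fraction


def difference_quotient(f, x, dx):
    """(f(x + dx) - f(x)) / dx for dx != 0."""
    return (f(x + dx) - f(x)) / dx


def square(t):
    return t * t


x = Fraction(3)  # sample point, chosen arbitrarily
steps = [Fraction(1, 10**k) for k in range(1, 6)]
quotients = [difference_quotient(square, x, dx) for dx in steps]
```

Each quotient differs from $2x = 6$ by exactly the step $\Delta x$, so the quotients approach $6$ as $\Delta x \to 0$.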
Of course, in practical problems, we don't calculate derivatives using the definition. Rather, we use the familiar rules of calculus:
4.7.5 Theorem Suppose that $f$ and $g$ are differentiable at $x_0$ and that $k \in \mathbb{R}$. Then $kf$, $f + g$, $fg$, and $f/g$ (assuming $g(x_0) \neq 0$) are all differentiable at $x_0$ and
i. $(kf)'(x_0) = k(f'(x_0))$ (constant multiple rule)
ii. $(f + g)'(x_0) = f'(x_0) + g'(x_0)$ (sum rule)
iii. $(fg)'(x_0) = f(x_0)g'(x_0) + f'(x_0)g(x_0)$ (product rule)
iv. $\left(\dfrac{f}{g}\right)'(x_0) = \dfrac{g(x_0)f'(x_0) - f(x_0)g'(x_0)}{[g(x_0)]^2}$ (quotient rule)
The proof in each case is a straightforward limit manipulation the reader should write out. A bit trickier is:
4.7.6 Chain Rule If $f$ is differentiable at $x_0$ and $g$ is differentiable at $f(x_0)$, then $g \circ f$ is differentiable at $x_0$ and
$$(g \circ f)'(x_0) = g'(f(x_0))f'(x_0).$$
This proof is found at the end of the chapter. Of course it is via these rules (4.7.2, 4.7.5, and 4.7.6) that we learn how to differentiate, for example, rational functions.
If $f$ is differentiable on $]a, b[$ and $f'$ is continuous, we say that $f$ is of class $C^1$. Of course, we can differentiate $f'$ again, assuming that it is differentiable, to get the second derivative $f''$. If $f''$ is continuous, we say that $f$ is of class $C^2$, and so on.
4.7.7 Definition We say that a function $f$ defined in a neighborhood of $x_0$ is increasing (respectively, strictly increasing) at $x_0$ if there is an interval $]a, b[$ containing $x_0$ such that:
i. If $a < x < x_0$, then $f(x) \le f(x_0)$; respectively, $f(x) < f(x_0)$.
ii. If $x_0 < x < b$, then $f(x) \ge f(x_0)$; respectively, $f(x) > f(x_0)$.
Similarly, $f$ is decreasing (respectively, strictly decreasing) at $x_0$ if there is an interval $]a, b[$ containing $x_0$ such that:
i. If $a < x < x_0$, then $f(x) \ge f(x_0)$; respectively, $f(x) > f(x_0)$.
ii. If $x_0 < x < b$, then $f(x) \le f(x_0)$; respectively, $f(x) < f(x_0)$.
4.7.8 Theorem Assuming that $f$ is differentiable at $x_0$,
i. If $f$ is increasing at $x_0$, then $f'(x_0) \ge 0$.
ii. If $f$ is decreasing at $x_0$, then $f'(x_0) \le 0$.
iii. If $f'(x_0) > 0$, then $f$ is strictly increasing at $x_0$.
iv. If $f'(x_0) < 0$, then $f$ is strictly decreasing at $x_0$.
We defer the proof to the end of the chapter. Using this result, we can relate calculus to maxima and minima.
4.7.9 Proposition If $f : \,]a, b[\, \to \mathbb{R}$ is differentiable at $c \in \,]a, b[$ and $f$ has a maximum (or a minimum) at $c$, then $f'(c) = 0$.
Proof Let $f$ have a maximum at $c$. Then for $h > 0$, $[f(c + h) - f(c)]/h \le 0$; and so, letting $h \to 0$, $h > 0$, we get $f'(c) \le 0$. Similarly, for $h < 0$ we obtain $f'(c) \ge 0$. Hence $f'(c) = 0$. The case of a minimum is similar. •
We can now put this together with the maximum-minimum theorem from §4.4.
4.7.10 Rolle's Theorem If $f : [a, b] \to \mathbb{R}$ is continuous, $f$ is differentiable on $]a, b[$, and $f(b) = f(a) = 0$, then there is a number $c \in \,]a, b[$ such that $f'(c) = 0$.
Proof If $f(x) = 0$ for all $x \in [a, b]$, we can choose any $c$. Therefore assume that $f$ is not identically zero. From the maximum-minimum theorem (4.4.1), we know that there is a point $c_1$ where $f$ assumes its maximum and there is a point $c_2$ where $f$ assumes its minimum. Since $f$ is not identically zero and $f(a) = f(b) = 0$, at least one of $c_1$, $c_2$ lies in $]a, b[$. If $c_1 \in \,]a, b[$, we get $f'(c_1) = 0$, by Proposition 4.7.9; similarly for $c_2$. •
4.7.11 Mean Value Theorem If $f : [a, b] \to \mathbb{R}$ is continuous and differentiable on $]a, b[$, there is a point $c \in \,]a, b[$ such that $f(b) - f(a) = f'(c)(b - a)$.
Proof Let $\varphi(x) = f(x) - f(a) - (x - a)[f(b) - f(a)]/(b - a)$ (see Figure 4.7-3), and apply Rolle's theorem. •
FIGURE 4.7-3 The mean value theorem
,'
is continuous and if f' "". 0 on ]a,b[,
then f is constant.
Proof Applying the mean value theorem to/ on [a,x] gives a point c such that f(x) - J(a) = J'(c)(x - a) =0, so that /(x) =f(a) for all x E [a, b], and therefore/ is constant.
4.7.13 Example
•
Let/: ]a, b[-+ JR be differentiable and that IJ(x) - f(y)I :5 Mix - YI for all x,y E ]a, b[.
1/'(x)I :5 M.
Prove
Chapter 4 Continuous Mappings
202
Solution By the mean value theorem, /(x)-/(y) =/(c)(x- y) for some c E ]x,y[. Tal 0 for all x E ]a, b[ or f'(x) < 0 for all x E ]a, b[. Then J is strictly increasing or strictly decreasing on ]a,b[ and so f has an inverse function. The following theorem allows us to compute the derivative off- 1 in tenns of the derivative of'j.
4.7.15 Inverse Function Theorem Suppose that f : ]a,b[
-+ JR and that either f'(x) > 0 for all x E ]a,b[ or f'(x) < 0 for all x E ]a, b[. Then /: ]a, b[ -+ R is a bijection onto its range, 1- 1 .is differentiable on its domain, and (/- 1 )'(y) = 1/f'(x) wherej(x) =y. ·
Exercises for §4.7
203
Proposition 4.7.14 is also useful for discussions of maxima and minima. For instance:
4.7.16 Proposition Suppose that $f$ is continuous on $[a, b]$ and is twice differentiable on $]a, b[$, and that $x_0 \in \,]a, b[$.
i. If $f'(x_0) = 0$ and $f''(x_0) > 0$, then $x_0$ is a strict local minimum of $f$.
ii. If $f'(x_0) = 0$ and $f''(x_0) < 0$, then $x_0$ is a strict local maximum of $f$.
What is going on in i is that $f''(x_0) > 0$ implies that $f'$ is strictly increasing at $x_0$, and so $f' < 0$ just to the left of $x_0$ and $f' > 0$ just to the right. Now 4.7.14 can be used to get the result.
Exercises for §4.7
1. Give an example of a function defined on $\mathbb{R}$, which is continuous everywhere and which fails to be differentiable at exactly $n$ given points $x_1, \ldots, x_n$.
2. Does the mean value theorem apply to $f(x) = \sqrt{x}$ on $[0,1]$? Does it apply to $g(x) = \sqrt{|x|}$ on $[-1,1]$?
3. Let $f$ be a nonconstant polynomial such that $f(0) = f(1)$. Prove that $f$ has a local minimum or a local maximum point somewhere in the open interval $]0,1[$.
4. A rubber cube of incompressible material is pulled on all faces with a force $T$. The material stretches by a factor $v$ in two directions and contracts by a factor $v^{-2}$ in the other. By balancing forces, one can establish Rivlin's equation:
\[ v^3 - \frac{T}{2\alpha} v^2 + 1 = 0, \]
where $\alpha$ is a strictly positive constant (analogous to the spring constant for a spring). Show that Rivlin's equation has one (real) solution if $T < 3\sqrt[3]{2}\,\alpha$ and has three solutions if $T > 3\sqrt[3]{2}\,\alpha$.
5. Let $f$ be continuous on $[3,5]$ and differentiable on $]3,5[$, and suppose that $f(3) = 6$ and $f(5) = 10$. Prove that, for some point $x_0$ in the open interval $]3,5[$, the tangent line to the graph of $f$ at $x_0$ passes through the origin. Illustrate your result with a sketch.
§4.8 Integration of Functions of One Variable
Let $A \subset \mathbb{R}$ be a bounded set and $f : A \to \mathbb{R}$ a bounded function. When we say that we want to integrate the function $f$ over the set $A$, we mean that we would like to find the area under the graph of $f$ (see Figure 4.8-1). If $f \ge 0$, this is the literal area; in general, it is the signed area. To do this, note first, since $A$ is bounded, that there is a closed interval $[a,b] \supset A$. We consider $f$ to be defined over the whole interval $[a,b]$ by letting $f$ be zero on $[a,b] \setminus A$. Next, we partition $[a,b]$, which means that we pick an integer $n$ and points $x_0, x_1, \ldots, x_{n-1}, x_n$ in such a way that $a = x_0 < x_1 < \cdots < x_{n-1} < x_n = b$. Denote such a partition by $P$; that is, let $P = \{x_0, \ldots, x_n\}$. Then, form the two sums
\[ U(f,P) = \sum_{i=0}^{n-1} \bigl(\sup\{ f(x) \mid x \in [x_i, x_{i+1}] \}\bigr)(x_{i+1} - x_i) \]
and
\[ L(f,P) = \sum_{i=0}^{n-1} \bigl(\inf\{ f(x) \mid x \in [x_i, x_{i+1}] \}\bigr)(x_{i+1} - x_i), \]
called the upper and lower sums, respectively. The first sum is the sum over all intervals $[x_i, x_{i+1}]$ of the maximum (= sup) of $f$ in each interval times the length of that interval and has value equal to the area of the shaded region shown in Figure 4.8-1. Since $f$ is assumed to be bounded, the sup exists in each interval. The second sum is the sum over all intervals $[x_i, x_{i+1}]$ of the minimum (or inf) of $f$ in each interval times the length of that interval and is the hatched region shown in Figure 4.8-1. The boundedness of the function again guarantees that the inf exists. Since $f$ is bounded, say $-M \le f \le M$, we see that
\[ -(b - a)M \le L(f,P) \le U(f,P) \le (b - a)M \]
for any partition $P$ of $[a,b]$. Let
\[ S = \inf\{ U(f,P) \mid P \text{ is any partition} \} \]
and
\[ s = \sup\{ L(f,P) \mid P \text{ is any partition} \}. \]
If we again look at Figure 4.8-1, it seems reasonable to expect that as the size of the intervals in $P$ gets smaller, $U(f,P)$ decreases while $L(f,P)$ increases. In the limit of decreasing size of the intervals of $P$, the numbers $U(f,P)$ and $L(f,P)$ should converge to a common value. This leads us to the following definition.
FIGURE 4.8-1 The upper and lower sums for this function
4.8.1 Definition We say that $f$ is Riemann integrable (or just integrable, or that the integral exists, for short) if $s = S$. The common value $s = S$ is denoted by $\int_A f$ or by $\int_A f(x)\,dx$. If $A = [a,b]$, we write
\[ \int_a^b f(x)\,dx = \int_a^b f. \]
It should be noted that integrability does not really involve smoothness or continuity properties of $f$. In fact, some badly discontinuous functions can still be integrable.
One thing about this procedure may seem puzzling. Why do we insist that $\inf\{U(f,P)\} = \sup\{L(f,P)\}$? At first, we might think that this relation will always hold. However, this is not always the case, as the next example shows.
4.8.2 Example Calculate $\inf\{U(f,P)\}$ and $\sup\{L(f,P)\}$ for the function $f : [0,1] \subset \mathbb{R} \to \mathbb{R}$ defined by $f(x) = 1$ if $x$ is irrational and $f(x) = 0$ if $x$ is rational.
Solution We claim that $\inf\{U(f,P)\} = 1$ and $\sup\{L(f,P)\} = 0$. Indeed, on any interval, $f$ is always 1 at some points and zero at others, so that the inf on any interval is zero and the sup is 1. Therefore, the integral of this function over the set $[0,1]$ does not exist for our purposes. In more advanced work the integral of such a "pathological function" can be defined, but we shall be dealing mostly with "decent functions," for which the integral exists. •
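In contrast to the function of Example 4.8.2, for a well-behaved function the upper and lower sums squeeze together as the partition is refined. The sketch below illustrates this under simplifying assumptions that are ours, not the text's: the function is increasing (so the sup and inf on each subinterval sit at the endpoints), the partition is uniform, and the names `upper_lower_sums` and `square` are hypothetical.

```python
def upper_lower_sums(f, a, b, n):
    """Return (U, L) for an INCREASING f on the uniform partition of [a, b].

    For an increasing function the sup on [x_i, x_{i+1}] is f(x_{i+1})
    and the inf is f(x_i), so no search for extrema is needed.
    """
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    upper = sum(f(xs[i + 1]) * (xs[i + 1] - xs[i]) for i in range(n))
    lower = sum(f(xs[i]) * (xs[i + 1] - xs[i]) for i in range(n))
    return upper, lower


square = lambda x: x * x   # increasing on [0, 1]
for n in (10, 100, 1000):
    U, L = upper_lower_sums(square, 0.0, 1.0, n)
    print(n, L, U, U - L)
```

For this example the difference $U - L$ telescopes to $(f(1) - f(0))/n = 1/n$, so $s$ and $S$ are forced to agree; the common value is $1/3$.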
4.8.3 Example Suppose that $f : [a,b] \to \mathbb{R}$ is (Riemann) integrable and $f \ge 0$. Show that $\int_a^b f(x)\,dx \ge 0$.
Solution By definition, the integral is the infimum of sums of the form
\[ \sum_{i=0}^{n-1} \Bigl( \sup_{x \in [x_i, x_{i+1}]} f(x) \Bigr) \cdot (x_{i+1} - x_i) \]
over all partitions. But each of these sums $U(f,P)$ is nonnegative, since $f \ge 0$. Hence the integral is $\ge 0$, since the inf of a set of nonnegative numbers is also nonnegative. •
The first result is a theorem giving us some insight into which functions are integrable. We shall study the problem in greater depth in Chapter 8.
4.8.4 Theorem
i. If $f : [a,b] \to \mathbb{R}$ is bounded and is continuous at all but finitely many points of $[a,b]$, then it is integrable on $[a,b]$.
ii. Any increasing or decreasing function on $[a,b]$ is integrable on $[a,b]$.
We can also prove the following proposition, which gives some rules of integration.
4.8.5 Proposition
i. If $f$ is integrable on $[a,b]$ and $k \in \mathbb{R}$, then $kf$ is integrable on $[a,b]$ and $\int_a^b kf = k \int_a^b f$.
ii. If $f$ and $g$ are integrable on $[a,b]$, then $f + g$ is integrable on $[a,b]$ and $\int_a^b (f + g) = \int_a^b f + \int_a^b g$.
iii. If $f$ and $g$ are integrable on $[a,b]$ and $f(x) \le g(x)$ for all $x \in [a,b]$, then $\int_a^b f \le \int_a^b g$.
iv. If $f$ is integrable on $[a,b]$ and $[b,c]$, then $f$ is integrable on $[a,c]$ and $\int_a^c f = \int_a^b f + \int_b^c f$.
Each of these properties is plausible in terms of the interpretation of the integral as the signed area under the graph. Here is a useful consequence of iii. Note that $-|f| \le f \le |f|$, so that
\[ -\int_a^b |f(x)|\,dx \le \int_a^b f(x)\,dx \le \int_a^b |f(x)|\,dx. \]
i ==> ii Suppose that $x_k \to x_0$. To show that $f(x_k) \to f(x_0)$, let $\varepsilon > 0$; we must find an integer $N$ so that $k \ge N$ implies $\rho(f(x_k), f(x_0)) < \varepsilon$. To do this, choose $\delta > 0$ so that $d(x, x_0) < \delta$ implies $\rho(f(x), f(x_0)) < \varepsilon$. The existence of such a $\delta$ is guaranteed by the continuity of $f$. Then choose $N$ so that $k \ge N$ implies $d(x_k, x_0) < \delta$. This choice of $N$ yields the desired conclusion.
ii ==> iv Let $F \subset N$ be closed. To show that $f^{-1}(F)$ is closed in $A$, we note that a set $B$ is closed relative to $A$ iff for every sequence $x_k \in B$ that converges to a point $x \in A$, we necessarily have $x \in B$. Let $x_k \in f^{-1}(F)$ and let $x_k \to x$, where $x \in A$. We must show that $x \in f^{-1}(F)$. By ii, $f(x_k) \to f(x)$, and since $f(x_k) \in F$ and $F$ is closed, we conclude that $f(x) \in F$. Thus $x \in f^{-1}(F)$.
iv ==> iii If $U$ is open, let $F = N \setminus U$, which is closed. Then, by iv, $f^{-1}(F) = G \cap A$ for some closed set $G$. Thus $f^{-1}(U) = A \cap (M \setminus G)$, and so $f^{-1}(U)$ is open relative to $A$.
iii ==> i Given $x_0 \in A$ and $\varepsilon > 0$, we must find $\delta$ so that $d(x, x_0) < \delta$ implies $\rho(f(x), f(x_0)) < \varepsilon$. Since $D(f(x_0), \varepsilon)$ is an open set, $f^{-1}(D(f(x_0), \varepsilon))$ is open, by iii. Thus, by the definition of an open set and the fact that $x_0 \in f^{-1}(D(f(x_0), \varepsilon))$, there is a $\delta > 0$ such that $D(x_0, \delta) \cap A \subset f^{-1}(D(f(x_0), \varepsilon))$. This is another way of saying that $x \in A$ and $d(x, x_0) < \delta$ imply $\rho(f(x), f(x_0)) < \varepsilon$. •
To gain practice with these concepts, one might try proving (directly) other implications within the theorem, for example, i ==> iii, or ii ==> i.
4.2.1 Theorem Suppose that $f : M \to N$ is continuous and $K \subset M$ is connected. Then $f(K)$ is connected. Similarly, if $K$ is path-connected, so is $f(K)$.
Proof Suppose $f(K)$ is not connected. By definition, we can write $f(K) \subset U \cup V$, where $U \cap V \cap f(K) = \emptyset$, $U \cap f(K) \ne \emptyset$, $V \cap f(K) \ne \emptyset$, and $U$, $V$ are open sets. Now, $f^{-1}(U) = U' \cap K$ for some open set $U'$, and similarly, $f^{-1}(V) = V' \cap K$ for some open set $V'$. From the conditions on $U$, $V$, we see that $U' \cap V' \cap K = \emptyset$, $K \subset U' \cup V'$, $U' \cap K \ne \emptyset$, and $V' \cap K \ne \emptyset$. Thus $K$ is not connected, which proves the first assertion.
For the second part, let $z = f(x)$, $w = f(y)$ be two points in $f(K)$, where $x, y \in K$. Let $c(t)$ be a continuous curve joining $x$ and $y$. Then $d(t) = f(c(t))$ is a continuous curve joining $z$ and $w$. It is continuous because if $t_k \to t_0$, then $c(t_k) \to c(t_0)$, since $c$ is continuous; consequently $f(c(t_k)) = d(t_k) \to f(c(t_0)) = d(t_0)$, since $f$ is continuous. Thus $d$ is continuous, and hence $f(K)$ is path-connected. •
4.2.2 Theorem Suppose that $f : M \to N$ is continuous and $B \subset M$ is compact. Then $f(B)$ is compact.
Theorem Proofs for Chapter 4
Proof Let $y_k$ be a sequence in $f(B)$. By Theorem 3.1.3 it must be shown that $y_k$ has a subsequence converging to a point in $f(B)$. Let $y_k = f(x_k)$, for $x_k \in B$. Since $B$ is compact, there is a convergent subsequence, say, $x_{k_n} \to x$, for $x \in B$. Then, by Theorem 4.1.4ii, $f(x_{k_n}) \to f(x)$, and so $f(x_{k_n})$ is a convergent subsequence of $y_k$. •
4.3.1 Theorem Let $M$, $N$, and $P$ be metric spaces and suppose that $f : A \subset M \to N$ and $g : B \subset N \to P$ are continuous mappings with $f(A) \subset B$. Then $g \circ f : A \subset M \to P$ is continuous.
Proof Let $U \subset P$ be open. Then $(g \circ f)^{-1}(U) = f^{-1}(g^{-1}(U))$. Now, $g^{-1}(U) = U' \cap B$ for some $U'$ open, and $f^{-1}(U' \cap B) = f^{-1}(U')$, since $f(A) \subset B$. Since $f$ is continuous, $f^{-1}(U') = U'' \cap A$ for $U''$ open. Hence $(g \circ f)^{-1}(U) = U'' \cap A$ is open relative to $A$, and so $g \circ f$ is continuous, by Theorem 4.1.4. •
The other conditions of Theorem 4.1.4 could also be used to prove Theorem 4.3.1. Instead of proving Theorem 4.3.2, we shall confine ourselves to proving its corollary. The general case is similar; the only complexity is in the notation.
4.3.3 Corollary Let $A \subset M$, and let $x_0 \in A$ be an accumulation point of $A$.
i. Let $f : A \to V$ and $g : A \to V$ be continuous at $x_0$; then the sum $f + g : A \to V$ is continuous at $x_0$.
ii. Let $f : A \to \mathbb{R}$ and $g : A \to V$ be continuous at $x_0$; then the product $f \cdot g : A \to V$ is continuous at $x_0$.
iii. Let $f : A \to \mathbb{R}$ and $g : A \to V$ be continuous at $x_0$ with $f(x_0) \ne 0$; then $f$ is nonzero in a neighborhood $U$ of $x_0$ and the quotient $g/f : U \to V$ is continuous at $x_0$.
Proof
i. Let $x_0 \in A$ and suppose $\varepsilon > 0$ is given. Choose $\delta_1 > 0$ such that $d(x, x_0) < \delta_1$ implies $\|f(x) - f(x_0)\| < \varepsilon/2$ and $\delta_2 > 0$ such that $d(x, x_0) < \delta_2$ implies $\|g(x) - g(x_0)\| < \varepsilon/2$. Let $\delta$ be the minimum of $\delta_1$, $\delta_2$. Therefore, if $d(x, x_0) < \delta$, the triangle inequality gives
\[ \|(f + g)(x) - (f + g)(x_0)\| = \|f(x) - f(x_0) + g(x) - g(x_0)\| \le \|f(x) - f(x_0)\| + \|g(x) - g(x_0)\| < \varepsilon/2 + \varepsilon/2 = \varepsilon. \]
ii. Let $x_0 \in A$ and suppose $\varepsilon > 0$. Choose $\delta_1$ such that $d(x, x_0) < \delta_1$ implies $|f(x) - f(x_0)| < \varepsilon/[2(\|g(x_0)\| + 1)]$ and $|f(x)| \le |f(x_0)| + 1$ (why is this possible?). Also, choose $\delta_2$ such that $d(x, x_0) < \delta_2$ implies that $\|g(x) - g(x_0)\| < \varepsilon/[2(|f(x_0)| + 1)]$. Then for $\delta = \min(\delta_1, \delta_2)$, $d(x, x_0) < \delta$ implies (by the triangle inequality)
\[ \|fg(x) - fg(x_0)\| = \|f(x)g(x) - f(x)g(x_0) + f(x)g(x_0) - f(x_0)g(x_0)\| \le |f(x)|\,\|g(x) - g(x_0)\| + |f(x) - f(x_0)|\,\|g(x_0)\| \]
(since $\|av\| = |a|\,\|v\|$ for $v \in V$ and $a \in \mathbb{R}$). Continuing with this line of reasoning, we get
\[ \|fg(x) - fg(x_0)\| < \frac{(|f(x_0)| + 1)\varepsilon}{2(|f(x_0)| + 1)} + \frac{\|g(x_0)\|\varepsilon}{2(\|g(x_0)\| + 1)} \le \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \]
iii. By the proof of ii, it suffices to consider the case $1/f$, because $g/f = g \cdot (1/f)$. To show that $1/f$ is continuous, given $x_0 \in A$, choose $\delta_1$ such that $|f(x) - f(x_0)| \le |f(x_0)|/2$ for $\|x - x_0\| < \delta_1$. This is possible by the continuity of $f$. It follows that $|f(x)| \ge |f(x_0)|/2$. Now, given $\varepsilon > 0$, choose $\delta_2$ such that $\|x - x_0\| < \delta_2$ implies $|f(x) - f(x_0)| < \varepsilon |f(x_0)|^2/2$. Letting $\delta = \min(\delta_1, \delta_2)$, $\|x - x_0\| < \delta$ implies
\[ \left| \frac{1}{f(x)} - \frac{1}{f(x_0)} \right| = \left| \frac{f(x_0) - f(x)}{f(x_0) f(x)} \right| \le \frac{|f(x) - f(x_0)|}{|f(x_0)|^2/2} < \varepsilon. \]
This shows that $1/f(x)$ is continuous at $x_0$. •
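$\varepsilon > 0$, there is an $\eta > 0$ such that $|\Delta u| < \eta$ implies
\[ |k(u + \Delta u) - k(u) - k'(u)\Delta u| < \varepsilon |\Delta u|. \]
Given $\varepsilon > 0$, we must find a $\delta > 0$ such that
\[ |g(f(x + \Delta x)) - g(f(x)) - g'(f(x)) f'(x) \Delta x| < \varepsilon |\Delta x| \qquad (1) \]
if $|\Delta x| < \delta$. Here we have written $x$ for $x_0$ to simplify notation. Adding and subtracting $g'(f(x))\Delta f$, the left-hand side of (1) reads
\[ |\Delta g - g'(f(x))\Delta f + g'(f(x))\Delta f - g'(f(x)) f'(x)\Delta x| \le |\Delta g - g'(f(x))\Delta f| + |g'(f(x))|\,|\Delta f - f'(x)\Delta x| \qquad (2) \]
where $\Delta g = g(f(x + \Delta x)) - g(f(x)) = g(f(x) + \Delta f) - g(f(x))$ and $\Delta f = f(x + \Delta x) - f(x)$.
Since $f$ has the Lipschitz property at $x$ (see the comments following 4.7.2), $|\Delta f| < M|\Delta x|$ if $|\Delta x| < c$ for some $c > 0$. Since $g$ is differentiable at $f(x)$, given $\varepsilon_1 > 0$, there is a $\delta_1 > 0$ such that if $|\Delta f| < \delta_1$, then $|\Delta g - g'(f(x))\Delta f| < \varepsilon_1 |\Delta f|$. Thus we should choose $\delta$ so that $|\Delta f| < \delta_1$, i.e., $\delta \le \delta_1/M$. Then the right-hand side of (2) is bounded above by
\[ \varepsilon_1 |\Delta f| + |g'(f(x))|\,|\Delta f - f'(x)\Delta x| \le \varepsilon_1 M |\Delta x| + N |\Delta f - f'(x)\Delta x| \qquad (3) \]
where $N = |g'(f(x))|$. Given $\varepsilon_2 > 0$, choose $\delta_2$ so that $|\Delta f - f'(x)\Delta x| < \varepsilon_2 |\Delta x|$ if $|\Delta x| < \delta_2$. Then the right-hand side of (3) is bounded above by
\[ \varepsilon_1 M |\Delta x| + N \varepsilon_2 |\Delta x|. \qquad (4) \]
Now it is clear how to choose things: Given $\varepsilon > 0$, let $\varepsilon_1 = \varepsilon/2M$, $\varepsilon_2 = \varepsilon/2N$, and $\delta_1$, $\delta_2$ be the corresponding $\delta$'s. Let $\delta = \min(\delta_1/M, \delta_2)$. Then (4) shows that (1) holds. •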
4.7.8 Theorem Assuming that $f$ is differentiable at $x_0$,
i. If $f$ is increasing at $x_0$, then $f'(x_0) \ge 0$.
ii. If $f$ is decreasing at $x_0$, then $f'(x_0) \le 0$.
iii. If $f'(x_0) > 0$, then $f$ is strictly increasing at $x_0$.
iv. If $f'(x_0) < 0$, then $f$ is strictly decreasing at $x_0$.
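Proof i and ii follow from iv and iii (why?), and so we illustrate by proving iii. Thus, suppose $f'(x_0) > 0$. By definition,
\[ \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} = f'(x_0). \]
Let $\varepsilon = f'(x_0)$. There is a neighborhood $N$ of $x_0$ such that for $x \in N$ and $x \ne x_0$, we have
\[ \left| \frac{f(x) - f(x_0)}{x - x_0} - f'(x_0) \right| < \varepsilon, \]
so the difference quotient is positive on $N$; hence $f(x) > f(x_0)$ for $x > x_0$ and $f(x) < f(x_0)$ for $x < x_0$ in $N$. •
4.7.15 Inverse Function Theorem Suppose that $f : ]a,b[ \to \mathbb{R}$ and that either $f'(x) > 0$ for all $x \in ]a,b[$ or $f'(x) < 0$ for all $x \in ]a,b[$. Then $f : ]a,b[ \to \mathbb{R}$ is a bijection onto its range, $f^{-1}$ is differentiable on its domain, and $(f^{-1})'(y) = 1/f'(x)$, where $f(x) = y$.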
Proof Since $f$ is monotone by the preceding proposition, $f^{-1}$ exists, and by 4.2.1 and 4.3.5, its domain is an interval. First we show that $f^{-1}$ is continuous. Suppose that $f'(x) > 0$, so that $f$ is strictly increasing. (A similar proof works if $f$ is strictly decreasing.) If $U \subset ]a,b[$ is an open set, let $y \in f(U)$, so that $y = f(x)$ for some $x \in U$ and there is an open interval $]x_1, x_2[$ with $x \in ]x_1, x_2[$ and $]x_1, x_2[ \subset U$. Now $f(x_1) < f(x_2)$ and $f(]x_1, x_2[) \subset ]f(x_1), f(x_2)[$, since $f$ is strictly increasing. For any $c$ with $f(x_1) < c < f(x_2)$, there is some $z \in ]x_1, x_2[$ with $f(z) = c$, by the intermediate value theorem. Thus $f(]x_1, x_2[) = ]f(x_1), f(x_2)[ \subset f(U)$ and $y = f(x) \in ]f(x_1), f(x_2)[$. We have shown that $(f^{-1})^{-1}(U) = f(U)$ is open. Hence, by Theorem 4.1.4, $f^{-1}$ is continuous.
Now write $y = f(x)$, so that $x = f^{-1}(y)$; and $y_0 = f(x_0)$, so that $x_0 = f^{-1}(y_0)$. Since $f^{-1}$ is continuous, $\lim_{y \to y_0} x = x_0$. Then
\[ (f^{-1})'(y_0) = \lim_{y \to y_0} \frac{f^{-1}(y) - f^{-1}(y_0)}{y - y_0} = \lim_{y \to y_0} \frac{x - x_0}{f(x) - f(x_0)} = \lim_{y \to y_0} \frac{1}{\dfrac{f(x) - f(x_0)}{x - x_0}} = \frac{1}{f'(x_0)}. \] •
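The formula $(f^{-1})'(y_0) = 1/f'(x_0)$ just derived can be tested numerically. The following sketch assumes the illustrative function $f(x) = x^3 + x$ (our choice, with $f' > 0$ everywhere), inverts it by bisection with a hypothetical helper `inverse`, and compares a symmetric difference quotient of $f^{-1}$ with $1/f'(x_0)$.

```python
def inverse(f, y, lo, hi, tol=1e-13):
    """Solve f(x) = y for a strictly increasing f on [lo, hi] by bisection."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


f = lambda x: x ** 3 + x        # f'(x) = 3x^2 + 1 > 0, so f is strictly increasing
df = lambda x: 3 * x ** 2 + 1   # derivative computed by hand

x0 = 1.0
y0 = f(x0)                      # y0 = 2
h = 1e-5
# symmetric difference quotient of the inverse function at y0
numeric = (inverse(f, y0 + h, 0.0, 2.0) - inverse(f, y0 - h, 0.0, 2.0)) / (2 * h)
exact = 1 / df(x0)              # 1 / f'(x0)
print(numeric, exact)
```

The two printed values should agree to several decimal places; the agreement is limited by the step $h$ and the bisection tolerance, both of which are arbitrary choices here.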
4.7.16 Proposition Suppose that $f$ is continuous on $[a,b]$ and is twice differentiable on $]a,b[$, and that $x_0 \in ]a,b[$.
i. If $f'(x_0) = 0$ and $f''(x_0) > 0$, then $x_0$ is a strict local minimum of $f$.
ii. If $f'(x_0) = 0$ and $f''(x_0) < 0$, then $x_0$ is a strict local maximum of $f$.
Proof
i. If $f''(x_0) > 0$, then $f'(x)$ is increasing at $x_0$, and so there is a $\delta > 0$ such that $f'(x) < 0$ if $x_0 - \delta < x < x_0$ and $f'(x) > 0$ if $x_0 < x < x_0 + \delta$. Thus, by 4.7.14, $f(x) > f(x_0)$ if $x_0 - \delta < x < x_0$ and $f(x) > f(x_0)$ if $x_0 < x < x_0 + \delta$.
ii. The proof is similar. •
4.8.4 Theorem
i. If $f : [a,b] \to \mathbb{R}$ is bounded and is continuous at all but finitely many points of $[a,b]$, then it is integrable on $[a,b]$.
ii. Any increasing or decreasing function on $[a,b]$ is integrable on $[a,b]$.
Let us first prove:
Upper-Lower Integral Inequality For a bounded function $f : [a,b] \to \mathbb{R}$,
\[ \underline{\int_a^b} f \le \overline{\int_a^b} f. \]
Proof If $P$ and $P'$ are partitions of $[a,b]$ with $P \subset P'$, then $P'$ is called a refinement of $P$. To get our inequality, we first prove the following result:
Lemma If $P'$ is a refinement of $P$, then $L(f, [a,b], P') \ge L(f, [a,b], P)$ and $U(f, [a,b], P') \le U(f, [a,b], P)$.
Proof Let $P_0 = P$ and $P_1 = P \cup \{a_1\}$, where $a_1 \in P' \setminus P$, and let $P_i = P_{i-1} \cup \{a_i\}$, where $a_i \in P' \setminus P_{i-1}$, so that for some $t$, $P = P_0 \subset P_1 \subset P_2 \subset \cdots \subset P_t = P'$ and $P_{i+1}$ has one more point than $P_i$. If $P_i = \{x_0, x_1, \ldots, x_n\}$ with $x_j < x_{j+1}$ and $x_{k-1} < a_{i+1} < x_k$, then
0 and $f(x) = (x - x^k)/\log x$ for $0 < x < 1$ and $f(0) = 0$, $f(1) = 1 - k$. Show that $f : [0,1] \to \mathbb{R}$ is continuous. Is $f$ uniformly continuous?
$|x - y|^2$. Prove that $f$ is a constant.
Exercises for Chapter 4
31. Let $f(x) = x$ ... 0 and $B_{r_0} = \{x \in \mathbb{R}^n \mid \|x - x_0\| \le r_0\}$. Suppose that $B_{r_0} \subset A$. Prove that there is an $r > r_0$ such that $B_r \subset A$.
33. A set $A \subset \mathbb{R}^n$ is called relatively compact when $\mathrm{cl}(A)$ is compact. Prove that $A$ is relatively compact iff every sequence in $A$ has a subsequence that converges to a point in $\mathbb{R}^n$.
34. Assuming that the temperature on the surface of the earth is a continuous function, prove that on any great circle of the earth there are two antipodal points with the same temperature.
35. Let $f : \mathbb{R} \to \mathbb{R}$ be increasing and bounded above. Prove that the limit $\lim_{x \to \infty} f(x)$ exists.
36. Show that $\{(x, \sin(1/x)) \mid x > 0\} \cup (\{0\} \times [-1,1])$ in $\mathbb{R}^2$ is connected but not path-connected.
37. Prove the following intermediate value theorem for derivatives: If $f$ is differentiable at all points of $[a,b]$, and if $f'(a)$ and $f'(b)$ have opposite signs, then there is a point $x_0 \in ]a,b[$ such that $f'(x_0) = 0$.
38. A real-valued function defined on $]a,b[$ is called convex when the following inequality holds for $x, y$ in $]a,b[$ and $t$ in $[0,1]$:
\[ f(tx + (1 - t)y) \le t f(x) + (1 - t) f(y). \]
If $f$ has a continuous second derivative and $f'' > 0$, show that $f$ is convex.
39. Suppose $f$ is continuous on $[a,b]$, $f(a) = f(b) = 0$, and $x^2 f''(x) + 4x f'(x) + 2f(x) \ge 0$ for $x \in ]a,b[$. Prove that $f(x) \le 0$ for $x$ in $[a,b]$.
40.
Calculate d
f'
dx
dt } 0 I +x2 •
41. Prove that
\[ \sum_{k=1}^{n} k = \frac{n(n+1)}{2} \quad\text{and}\quad \sum_{k=1}^{n} k^2 = \frac{n(n+1)(2n+1)}{6}. \]
These formulas were used in Worked Example 4.6.
42. For $x > 0$, define $L(x) = \int_1^x (1/t)\,dt$. Prove the following, using this definition:
a. $L$ is increasing in $x$.
b. $L(xy) = L(x) + L(y)$.
c. $L'(x) = 1/x$.
d. $L(1) = 0$.
e. Properties c and d uniquely determine $L$. What is $L$?
43. Let $f : \mathbb{R} \to \mathbb{R}$ be continuous and set $F(x) = \int_0^{x^2} f(y)\,dy$. Prove that $F'(x) = 2x f(x^2)$. Give a more general theorem.
44. Let $f : [0,1] \to \mathbb{R}$ be Riemann integrable and suppose for every $a, b$ with $0 \le a < b \le 1$ there is a $c$, $a < c < b$, with $f(c) = 0$. Prove that $\int_0^1 f = 0$. Must $f$ be zero? What if $f$ is continuous?
45. Prove the following second mean value theorem. Let $f$ and $g$ be defined on $[a,b]$ with $g$ continuous, $f \ge 0$, and $f$ integrable. Then there is a point $x_0 \in ]a,b[$ such that
\[ \int_a^b f(x) g(x)\,dx = g(x_0) \int_a^b f(x)\,dx. \]
46. a. For complex-valued functions on an interval, prove the fundamental theorem of calculus.
b. Evaluate $\int_0^{\pi} e^{ix}\,dx$ using a.
Chapter 5 Uniform Convergence
Many important functions are defined using infinite sequences or series. To study such functions, we need to understand the concept of uniform convergence. To deal effectively with concrete situations and examples, we use specific tests for uniform convergence. Perhaps the most helpful such test is the Weierstrass M test for series. Another test is the Cauchy criterion, which is mainly of theoretical use. We also include the more specialized tests of Dirichlet and Abel. In connection with uniform convergence, we look again at the space of continuous functions, which we introduced briefly in §LL. In this space the "points" or "vectors" are functions, and convergence of a sequence corresponds to uniform convergence of these functions. The space is proved to be complete in the sense that Cauchy sequences converge. A second basic property of this space, called the Arzela-Ascoli theorem, establishes the
0 and to 1 if $x = 0$. The limit function is thus defined by $f(x) = 0$ if $x > 0$ and $f(0) = 1$. This is not a continuous function, even though each of the functions $f_k$ is continuous. What is going wrong? If we focus attention on any one value of $x$, then for large enough $k$, the values $f_k(x)$ are close to $f(x)$. However, the $k$ needed for any particular degree of accuracy depends very much on $x$. For $x$ near 0, very large values of $k$ will be needed to make $f_k(x)$ close to $f(x)$, and there will always be points $x$ still closer to 0 for which $f_k(x)$ is not close to $f(x)$. To remedy this, we introduce a stronger notion of convergence. We ask not only that the values of $f_k$ be close to those of $f$, but that they be so uniformly, that is, independently of $x$.
§5.1 Pointwise and Uniform Convergence
5.1.2 Definition Let $f_k : A \to N$ be a sequence of functions with the property that for every $\varepsilon > 0$ there is an integer $L$ such that $k \ge L$ implies $\rho(f_k(x), f(x)) < \varepsilon$ for all $x \in A$. Here, $\rho$ is the metric on $N$. Under these conditions, we say that the sequence $f_k$ converges uniformly to $f$ on $A$, and we write $f_k \to f$ (uniformly).
The important point to note here is that the choice of $L$ may depend on $\varepsilon$ but not on $x$. One must be able to choose $L$ in such a way that the same $L$ works everywhere on the set $A$. The notion of uniform convergence thus depends very much on the set $A$ being considered as domain. Convergence might be uniform on one domain but not on a larger domain. In the last example, the convergence of the functions $f_k$ to the limit $f$ is uniform on any interval $[a,1]$ with $0 < a < 1$. Given $\varepsilon > 0$, we need only select an integer $L$ large enough that $1/L < a$. If $k \ge L$, then $f_k(x) = 0 = f(x)$ for every $x$ in $[a,1]$. Thus $f_k \to f$ uniformly on $[a,1]$. However, if $a$ is very close to 0, then correspondingly large values of $L$ are required. If $\varepsilon < 1/2$, then it is not possible to select a single $k$ that makes $|f_k(x) - f(x)| < \varepsilon$ for every $x$ in $[0,1]$ at the same time. Thus, although the functions $f_k$ do converge to $f$ pointwise on $[0,1]$, they do not do so uniformly.
There is a good geometric way to visualize uniform convergence. The condition $\rho(f_k(x), f(x)) < \varepsilon$ for every $x$ means that $f_k(x)$ is always closer than a distance $\varepsilon$ to $f(x)$. This is easily described in terms of the graphs of the functions. The graph of $f_k$ must lie within an "$\varepsilon$ tube" around the graph of $f$ whenever $k \ge L$. See Figure 5.1-2.
FIGURE 5.1-2 Uniform closeness for $f : A \subset \mathbb{R} \to \mathbb{R}$
Perhaps another example will make the idea clearer. Consider the sequence of functions $f_k : \mathbb{R} \to \mathbb{R}$ defined by
\[ f_k(x) = \begin{cases} 0, & x < k, \\ 1, & x \ge k \end{cases} \]
($k = 1, 2, 3, \ldots$). Then $f_k \to 0$ (pointwise), because for each $x \in \mathbb{R}$, $f_k(x) = 0$ for $k$ large ($k > x$). However, $f_k$ does not converge to zero uniformly, for, no matter how large $k$ is, there are points $x$ such that $f_k(x) - 0$ is not small.
Observe that if $f_k \to f$ (uniformly), then $f_k \to f$ (pointwise). This is because for any $x \in A$ and $\varepsilon > 0$, we have an integer $L$ such that $\rho(f_k(x), f(x)) < \varepsilon$ if $k \ge L$, that is, $f_k(x) \to f(x)$.
We make similar definitions for a series of functions. Here we choose $N = V$ a normed vector space and let $g_k : A \to V$, so that addition makes sense.
5.1.3 Definition We say that the series $\sum_{k=1}^{\infty} g_k$ converges to $g$ pointwise, and write $\sum_{k=1}^{\infty} g_k = g$ (pointwise), if the sequence $s_k = \sum_{i=1}^{k} g_i$ of partial sums converges pointwise to $g$. Also, we say that $\sum_{k=1}^{\infty} g_k = g$ (uniformly) or $\sum_{k=1}^{\infty} g_k$ converges to $g$ uniformly if $s_k \to g$ (uniformly). For a sequence $f_k$ (or series $\sum g_k$), we say that $f_k$ (or $\sum g_k$) converges uniformly if there exists a function to which it converges uniformly.
The first basic property of uniform convergence is its preservation of continuity, given in the next result.
5.1.4 Proposition Let $A \subset M$ where $M$ is a metric space, let $f_k : A \to N$ be a sequence of continuous functions, and suppose that $f_k \to f$ (uniformly on $A$). Then $f$ is continuous on $A$.
Thus, uniform convergence is a strong enough condition to guarantee that the limiting function of a sequence of continuous functions is continuous. In view of the preceding examples, this should not be unreasonable.
5.1.5 Corollary If the functions $g_k : A \to V$ are continuous and $\sum_{k=1}^{\infty} g_k = g$ (uniformly), then $g$ is continuous.
This follows by applying the proposition to the sequence of partial sums. This result can be restated as the validity of interchanging limits and sums:
\[ \lim_{x \to x_0} \sum_{k=1}^{\infty} g_k(x) = \sum_{k=1}^{\infty} \lim_{x \to x_0} g_k(x). \]
5.1.6 Example Let $f_n : \mathbb{R} \to \mathbb{R}$ be defined by $f_n(x) = (\sin x)/n$. Show that $f_n \to 0$ uniformly as $n \to \infty$.
Solution We must show that $|f_n(x) - 0| = |f_n(x)|$ gets small independently of $x$ as $n \to \infty$. But $|f_n(x)| = |\sin x|/n \le 1/n$, which gets small independently of $x$ as $n \to \infty$. •
5.1.7 Example Show that the series for $\sin x$,
\[ \sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots \]
converges uniformly on the interval $[0,r]$ for any $r > 0$.
Solution We must show that the sequence of partial sums
\[ s_n(x) = \sum_{k=0}^{n} \frac{(-1)^k x^{2k+1}}{(2k+1)!} \]
converges uniformly. To do this, estimate the difference:
\[ |s_n(x) - \sin x| = \left| \sum_{k=n+1}^{\infty} \frac{(-1)^k x^{2k+1}}{(2k+1)!} \right| \le \sum_{k=n+1}^{\infty} \frac{r^{2k+1}}{(2k+1)!}. \]
The right-hand side is independent of $x$ and tends to 0 as $n$ tends to $\infty$, since it is the tail of a convergent series. Thus, the convergence of this series is uniform. Note that continuity of $\sin x$ follows from this, a result we know already. •
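The estimate in Example 5.1.7 can be observed numerically. The sketch below is illustrative only: the value $r = 3$, the sampling grid, the truncation of the infinite tail after 40 terms, and the function names are all assumptions of ours.

```python
import math


def partial_sum(x, n):
    """s_n(x) = sum_{k=0}^{n} (-1)^k x^(2k+1) / (2k+1)!"""
    return sum((-1) ** k * x ** (2 * k + 1) / math.factorial(2 * k + 1)
               for k in range(n + 1))


def tail(r, n, terms=40):
    """Truncated tail sum_{k=n+1}^{n+terms} r^(2k+1) / (2k+1)!
    (a numerical stand-in for the infinite tail in the estimate)."""
    return sum(r ** (2 * k + 1) / math.factorial(2 * k + 1)
               for k in range(n + 1, n + 1 + terms))


def sup_error(r, n, samples=500):
    """Largest sampled value of |s_n(x) - sin x| on [0, r]."""
    grid = [r * i / samples for i in range(samples + 1)]
    return max(abs(partial_sum(x, n) - math.sin(x)) for x in grid)


r = 3.0
for n in (2, 5, 8):
    print(n, sup_error(r, n), tail(r, n))
```

For each $n$ the sampled error stays below the $x$-independent tail, and both shrink rapidly as $n$ grows, which is what uniform convergence on $[0,r]$ requires.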
5.1.8 Example Let $f_n(x) = x^n$, $0 \le x \le 1$. Does $f_n$ converge uniformly? What if $0 \le x < 1$?
Solution First we determine the limit point by point. We have $f_n(0) = 0$ for all $n$, and $f_n(x) \to 0$ if $x < 1$, but $f_n(1) = 1$ for all $n$. Thus $f_n$ converges pointwise to
\[ f(x) = \begin{cases} 0, & x \ne 1, \\ 1, & x = 1. \end{cases} \]
It cannot converge uniformly, because this limit is not continuous (Figure 5.1-3). If $0 \le x < 1$, then $f_n$ converges pointwise to zero and 0 is a continuous function. However, the convergence is still not uniform, for given any $n$, $x^n \ge 1/2$ if $x$ is close enough to 1 (since $\lim_{x \to 1} x^n = 1$). •
FIGURE 5.1-3 The sequence $f_n(x) = x^n$
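The failure of uniform convergence of $x^n$ on $[0,1[$ can be made concrete by exhibiting, for each $n$, a point where $x^n$ equals $1/2$. This is a minimal sketch; the helper name `witness` is hypothetical.

```python
def witness(n):
    """A point x in [0, 1[ with x**n equal to 1/2 (up to rounding)."""
    return 0.5 ** (1.0 / n)


for n in (1, 10, 100, 1000):
    x = witness(n)
    print(n, x, x ** n)
```

The witness points drift toward 1 as $n$ grows, but the value $x^n$ at them never drops below $1/2$, so no single $n$ works for every $x$ once $\varepsilon < 1/2$.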
5.1.9 Example Consider the geometric series $\sum_{k=0}^{\infty} x^k$.
a. Show that this series converges pointwise to the function $g(x) = 1/(1 - x)$ for $x$ in the open interval $]-1,1[$.
b. If $0 < a < 1$, show that the convergence is uniform on $[-a,a]$.
c. Show that the convergence is not uniform on $]-1,1[$.
Solution
a. The partial sums are $s_n(x) = (1 - x^{n+1})/(1 - x)$, so the error is $|s_n(x) - g(x)| = |x|^{n+1}/|1 - x|$. For $0 < |x| < 1$ this is less than $\varepsilon$ as soon as $n > (\log(\varepsilon |1 - x|)/\log |x|) - 1$.
b. The error $|x|^{n+1}/|1 - x|$ is smaller for $-x$ than it is for $x$ (when $x > 0$), and for $0 < x < 1$,
\[ \frac{d}{dx} \left( \frac{x^{n+1}}{1 - x} \right) = \frac{x^n \,(n(1 - x) + 1)}{(1 - x)^2} \]
(why?). Since this is positive, we see that the error is largest on $[-a,a]$ at the right endpoint, $x = a$. If we select an $N$ large enough to work there, it will also work everywhere on $[-a,a]$, and the definition of uniform convergence on $[-a,a]$ will be met.
c. If $x > 1/2$, then the error, $|x^{n+1}/(1 - x)|$, will be larger than $2|x^{n+1}|$. Thus, for any particular value of $n$, we can make the error larger than any specified $\varepsilon$ between 0 and 2 by selecting $x$ in the open interval $]1/2,1[$ and larger than $(\varepsilon/2)^{1/(n+1)}$. The convergence is thus not uniform on all of $]-1,1[$. •
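Both halves of Example 5.1.9 can be explored numerically. In the sketch below, the value $a = 0.9$, the sampling grid, and the names `error`, `worst_on_interval`, and `bad_point` are illustrative assumptions; `bad_point` picks the midpoint between the threshold $\max(1/2, (\varepsilon/2)^{1/(n+1)})$ and 1, which is one of many valid choices.

```python
def error(x, n):
    """|s_n(x) - 1/(1 - x)| = |x|^(n+1) / |1 - x| for the geometric series."""
    return abs(x) ** (n + 1) / abs(1 - x)


def worst_on_interval(a, n, samples=2001):
    """Largest sampled error on [-a, a]; expected at the right endpoint x = a."""
    grid = [-a + 2 * a * i / (samples - 1) for i in range(samples)]
    return max(error(x, n) for x in grid)


def bad_point(n, eps):
    """A point of ]1/2, 1[ larger than (eps/2)^(1/(n+1)), for 0 < eps < 2."""
    threshold = max(0.5, (eps / 2) ** (1.0 / (n + 1)))
    return (1 + threshold) / 2


a = 0.9
for n in (10, 50, 100):
    print(n, worst_on_interval(a, n), error(a, n))   # uniform on [-a, a]

for n in (10, 1000):
    x = bad_point(n, 1.0)
    print(n, x, error(x, n))                         # error exceeds eps = 1
```

On $[-a,a]$ the worst error coincides with the error at $x = a$ and tends to 0, while on $]-1,1[$ a point with error above $\varepsilon$ can be found for every $n$.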
5.1.10 Example Let $f_n(x) = x^n/(1 + x^n)$ for $x$ in the interval $[0,2]$. Show that the sequence of functions $f_1, f_2, f_3, \ldots$ converges pointwise on the interval $[0,2]$ but that the convergence is not uniform.
Solution The denominator, $1 + x^n$, is greater than or equal to 1 for each $x$ in our domain, and so $|f_n(x)| \le x^n$. If $0 \le x < 1$, this tends to 0 as $n \to \infty$. Therefore $\lim_{n \to \infty} f_n(x) = 0$ if $0 \le x < 1$. If $x = 1$, then $f_n(x) = 1/2$ for every $n$. If $x > 1$, then $x^n \to \infty$ as $n \to \infty$. Since $f_n(x) = 1/(1 + (1/x^n))$, this leads to $\lim_{n \to \infty} f_n(x) = 1$ for $x > 1$. Thus $f_n$ tends pointwise on $[0,2]$ to the function $f(x)$ defined by $f(x) = 0$ for $0 \le x < 1$, $f(1) = 1/2$, and $f(x) = 1$ for $1 < x \le 2$. Each of the functions $f_n$ is continuous on $[0,2]$. If the convergence were uniform, then the limit function would also be continuous. Since $f$ has a discontinuity at $x = 1$, the convergence must not be uniform. The first few of these functions are shown plotted in Figure 5.1-4 together with the discontinuous limit function. •
FIGURE 5.1-4 This sequence $f_n$ converges pointwise but not uniformly
Exercises for §5.1
1. Let $f_n(x) = (x - 1/n)^2$, $0 \le x \le 1$. Does $f_n$ converge uniformly?
2. Let $f_n(x) = x - x^n$, $0 \le x \le 1$. Does $f_n$ converge uniformly?
3. Let $f_n : \mathbb{R} \to \mathbb{R}$ be uniformly continuous and let $f_n$ converge uniformly to $f$. Do you think that $f$ must be uniformly continuous? Discuss.
4. Let $f_n(x) = x^n$, $0 \le x \le 0.999$. Does $f_n$ converge uniformly?
5. Let
00
f(x) =
n/2
L nx(n.
1)2
0 $ x $ l.
n=I
Discuss how you might prove that $f$ is continuous.
§5.2 The Weierstrass M test
One reason there were few examples in the last section involving infinite series is that it is usually difficult to study uniform convergence directly unless a convenient formula can be found for the partial sums, as was the case with the geometric series. In this section we present some convenient tests for the uniform convergence of a series of functions. The first is a characterization of those series of functions that are uniformly convergent expressed in terms of the functions in the series without mentioning explicitly the limit function to which the series converges. The basic idea is to use the Cauchy condition for the convergence of the sequence of partial sums. Recall that every convergent sequence satisfies the Cauchy condition. These are the sequences that in some sense "ought" to converge. If the space in which the points lie is complete, then the limit actually exists and the sequence does converge.
5.2.1 Cauchy Criterion Let $N$ be a metric space with metric $\rho$, and let $A$ be a set. Suppose that $N$ is complete and $f_k : A \to N$ is a sequence of functions. Then $f_k$ converges uniformly on $A$ iff for every $\varepsilon > 0$ there is an $N$ such that $l, k \ge N$ implies $\rho(f_k(x), f_l(x)) < \varepsilon$ for all $x \in A$.
For series, the criterion reads: $\sum_{k=1}^{\infty} g_k$ converges uniformly on $A$ iff for every $\varepsilon > 0$ there is an $N$ such that $k \ge N$ implies
\[ \|g_k(x) + \cdots + g_{k+p}(x)\| < \varepsilon \]
for all $x \in A$ and all integers $p \ge 0$. As in §5.1, for series, the $g_k$ are assumed to take values in a normed vector space $V$. To use 5.2.1, we require $V$ to be complete. Of course, $V = \mathbb{R}^n$ or even $V = \mathbb{R}$ is a common choice. Using the Cauchy criterion, we can obtain the following important technique for determining the uniform convergence of a series.
5.2.2 Weierstrass M test Suppose that $V$ is a complete normed vector space and $g_k : A \to V$ are functions such that there exist constants $M_k$ with $\|g_k(x)\| \le M_k$ for all $x \in A$, and $\sum_{k=1}^{\infty} M_k$ converges. Then $\sum_{k=1}^{\infty} g_k$ converges uniformly (and absolutely).
It is not always possible to use the M test, but it is effective in the majority of cases. For more refined tests, see the Dirichlet and Abel tests in §5.9. In the M test, the constants $M_k$ give a bound on the "rate of convergence," the point being that the bound is independent of $x$. More exactly, the tail of the series $\sum_{k=1}^{\infty} g_k$, which represents the error, is bounded by that of $\sum_{k=1}^{\infty} M_k$, which tends to 0 independently of $x$.
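5.2.3 Example Show that the series $\sum_{n=1}^{\infty} g_n(x) = \sum_{n=1}^{\infty} (\sin nx)^2/n^2$ converges uniformly on $\mathbb{R}$.
Solution Let $M_n = 1/n^2$. Here, $|g_n(x)| \le M_n$, since $|\sin nx| \le 1$, and $\sum M_n$ converges. Hence, by Theorem 5.2.2, the series converges uniformly. •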
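The way the constants $M_n$ control the error independently of $x$ can be seen numerically for the series of Example 5.2.3. This is a sketch under stated assumptions: the sup over $x$ is approximated on a finite grid over one period, the infinite tails are replaced by finite ones, and the function names are ours.

```python
import math


def partial_sum(x, n):
    """s_n(x) = sum_{k=1}^{n} sin(kx)^2 / k^2"""
    return sum(math.sin(k * x) ** 2 / k ** 2 for k in range(1, n + 1))


def m_tail(n, upto):
    """sum_{k=n+1}^{upto} M_k with M_k = 1/k^2."""
    return sum(1.0 / k ** 2 for k in range(n + 1, upto + 1))


def sup_gap(n, m, samples=400):
    """Largest sampled |s_m(x) - s_n(x)| on [0, pi]; each term has period pi."""
    grid = [math.pi * i / samples for i in range(samples + 1)]
    return max(abs(partial_sum(x, m) - partial_sum(x, n)) for x in grid)


for n in (10, 50):
    print(n, sup_gap(n, 200), m_tail(n, 200))
```

The gap between partial sums never exceeds the corresponding tail of $\sum M_k$, whatever $x$ is, which is exactly the uniform Cauchy condition that the M test exploits.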
5.2.4 Example Prove that $f(x) = \sum_{n=0}^{\infty} \left( \dfrac{x^n}{n!} \right)^2$ is continuous on $\mathbb{R}$.
Solution Here we cannot choose an $M_n$ for the $n$th term, because $x$ is not bounded. We therefore do not expect uniform convergence on all of $\mathbb{R}$, but we can prove uniform convergence on each interval $[-a,a]$ by letting $M_n = (a^n/n!)^2$, which is an upper bound for the $n$th term on $[-a,a]$. The ratio test shows that $\sum M_n$ converges, since
\[ \frac{M_{n+1}}{M_n} = \left( \frac{a}{n+1} \right)^2 \]
converges to zero, which is less than 1. Hence, we have uniform convergence on $[-a,a]$, and so, by 5.1.5, we get continuity of $f$ on $[-a,a]$. Since $a$ was arbitrary, we get continuity on all of $\mathbb{R}$. •
5.2.5 Example Suppose that a sequence $f_n(x)$, $0 \le x \le 1$, converges uniformly and $f_n$ is differentiable. Must $f_n'(x)$ converge uniformly?
Solution The answer is no. In general, control on the derivatives gives control on the functions via the mean value theorem, but not vice versa. For example, let $f_n(x) = [\sin(n^2 x)]/n$. Then $f_n \to 0$ uniformly, but $f_n'(x) = n\cos(n^2 x)$ does not converge even pointwise (set $x = 0$, for example). Here is another example of this phenomenon. Let $g_n(x) = x^{n+1}/(n + 1)$ on $[0,1]$. Since $|g_n(x)| \le 1/(n + 1)$, $g_n \to 0$ uniformly. But $g_n'(x) = x^n$ does not converge uniformly, as was previously shown. •
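The first counterexample of Example 5.2.5 is easy to tabulate. In this minimal sketch the derivative is entered by hand, the sup is approximated on a grid of $[0,1]$, and the two-argument helper names `f` and `df` are our own convention.

```python
import math


def f(n, x):
    """f_n(x) = sin(n^2 x) / n"""
    return math.sin(n * n * x) / n


def df(n, x):
    """Derivative of f_n computed by hand: n * cos(n^2 x)."""
    return n * math.cos(n * n * x)


grid = [i / 1000 for i in range(1001)]
for n in (1, 10, 100):
    sup_f = max(abs(f(n, x)) for x in grid)
    print(n, sup_f, df(n, 0.0))
```

The sampled sup of $|f_n|$ is at most $1/n$ and so tends to 0, while the derivative at $x = 0$ equals $n$ and grows without bound.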
5.2.6 Example Suppose that $a_0, a_1, a_2, \ldots$ is a bounded sequence of real numbers. Show that the series
$$\sum_{k=0}^\infty \frac{a_k}{k!} x^k$$
converges to a continuous function.
Solution Fix a point $x_0$. We need to show convergence to a limit function $f(x_0)$ and continuity of $f$ at $x_0$. Let $A = \{x \mid |x| \le 2|x_0|\}$. Then $x_0 \in A$. The partial sums are polynomials and so are certainly continuous on the set $A$. If we show that the series converges uniformly on $A$, the sum will thus be continuous on $A$ and will certainly be so at $x_0$. Since $x_0$ was an arbitrary point, this will complete the proof. Uniform convergence on $A$ may be established by the Weierstrass M test. If $B$ is an upper bound for the numbers $|a_k|$, we can take $M_k = B \cdot 2^k |x_0|^k / k!$. With $g_k(x) = (a_k/k!)x^k$, we have $|g_k(x)| \le M_k$ for all $x$ in $A$. The ratio test shows that $\sum_{k=0}^\infty M_k$ converges, since $M_{k+1}/M_k = 2|x_0|/(k+1)$ and this tends to 0 as $k$ tends to infinity. The Weierstrass test applies and shows that our series converges uniformly on $A$ as we wanted. •
Exercises for §5.2
1. Discuss the convergence and uniform convergence of
a. $f_n(x) = x^n/(n + x^n)$, $x \ge 0$, $n = 1, 2, \ldots$.
b. $f_n(x) = e^{-x^2/n}/n$, $x \in \mathbb{R}$, $n = 1, 2, \ldots$.
2. Discuss the uniform convergence of $\sum_{n=1}^\infty 1/(x^2 + n^2)$.
3. Prove that $f(x) = \sum_{n=1}^\infty x^n/n^2$ is continuous on $[0, 1]$.
4. Discuss the uniform convergence of $\sum_{n=1}^\infty x^n/n^2$, $0 \le x \le 1$.
5. If $\sum_{n=1}^\infty a_n$ is absolutely convergent, prove that $\sum_{n=1}^\infty a_n \sin nx$ is uniformly convergent.
§5.3 Integration and Differentiation of Series
We saw in the last two sections that infinite series can often be shown to represent continuous functions. If we want to see how they can be used in the study of calculus, we will need to know how they behave with respect to differentiation and integration. If a sequence of functions $f_1, f_2, f_3, \ldots$ converges to a limit function $f$, under what circumstances does $f_1', f_2', f_3', \ldots$ converge to $f'$, or do the integrals $\int f_1, \int f_2, \int f_3, \ldots$ converge to $\int f$? The idea of uniform convergence supplies a tool that is often useful. The result for integrals is fairly straightforward.
5.3.1 Theorem Suppose that $f_1, f_2, f_3, \ldots$ are integrable functions on a closed bounded interval $[a, b]$, and that they converge uniformly to a limit function $f$ on $[a, b]$. Then $f$ is integrable on $[a, b]$ and
$$\lim_{n\to\infty} \int_a^b f_n(x)\,dx = \int_a^b f(x)\,dx.$$
The idea is that for large $n$, the values $f_n(x)$ are all uniformly close to $f(x)$. Thus, any Riemann sum for $f$ is close to the corresponding Riemann sum for $f_n$. The finite length of the interval is important at this step. Since $f_n$ is integrable, its Riemann sums are all close together if the mesh of the partitions is small enough, and so those for $f$ are also close together. In particular, the upper and lower sums for $f$ are close together. With appropriate details, this shows that $f$ is integrable. The assertion that the limit of the integrals is equal to the integral of the limit then follows from the observation that
$$\left|\int_a^b f_n(x)\,dx - \int_a^b f(x)\,dx\right| \le \int_a^b |f_n(x) - f(x)|\,dx.$$
Selecting $n$ large enough that $|f_n(x) - f(x)| < \varepsilon$ for all $x \in [a, b]$ makes this smaller than $(b - a)\varepsilon$ and gives us our assertion about limits. Again, the finite length of the interval is important. The details of the proof are supplied at the end of this chapter. If we apply this result to the partial sums of an infinite series of integrable functions that converges uniformly on a closed bounded interval $[a, b]$, we find that we can interchange the order of integration and summation.
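5.3.2 Corollary Suppose that the functions $g_k : [a, b] \to \mathbb{R}$ are Riemann integrable and $\sum_{k=1}^\infty g_k$ converges uniformly on $[a, b]$. Then we may interchange the order of integration and summation:
$$\int_a^b \sum_{k=1}^\infty g_k(x)\,dx = \sum_{k=1}^\infty \int_a^b g_k(x)\,dx.$$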
The corollary follows from Theorem 5.3.1 applied to the sequence of partial sums. Intuitively, the theorem should be fairly clear, because if $f_k$ is very close to $f$, then its integral (the area under the curve) should be close to that of $f$. But be careful here. Indeed, this result may be false if $f_k$ converges only pointwise! (See Example 5.3.5.)
Note. There is a theorem with a wider scope than 5.3.1, called Lebesgue's dominated convergence theorem. One version of this result (due to Ascoli) states that if $f_k$ converges pointwise to $f$ and the $f_k$ are uniformly bounded (that is, $|f_k(x)| \le M$ for all $k = 1, 2, \ldots$ and all $x \in [a, b]$), then the conclusion of Theorem 5.3.1 remains valid. We shall be content for most of this book with the more elementary form of the result in Theorem 5.3.1.
Can we take the same liberties with derivatives? The answer to the question of term-by-term differentiation of a uniformly convergent sequence or series is often no, as we saw in Example 5.2.5. This result is a good illustration of the sort of care that is often needed to turn an intuitively plausible statement into one of actual fact. Thus we need more assumptions than just uniform convergence. Sufficient conditions are given in the following theorem.
5.3.3 Theorem Let $f_k : \,]a, b[\, \to \mathbb{R}$ be a sequence of differentiable functions on the open interval $]a, b[$ converging pointwise to $f : \,]a, b[\, \to \mathbb{R}$. Suppose that the derivatives $f_k'$ are continuous and converge uniformly to a function $g$. Then $f$ is differentiable and $f' = g$.
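5.3.4 Corollary If the $g_k$ are differentiable, the $g_k'$ are continuous, $\sum_{k=1}^\infty g_k$ converges pointwise, and $\sum_{k=1}^\infty g_k'$ converges uniformly, then
$$\left(\sum_{k=1}^\infty g_k\right)' = \sum_{k=1}^\infty g_k'.$$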
As usual, the corollary follows by applying the theorem to the sequence of partial sums.
5.3.5 Example Give an example of a sequence $f_k : [0, 1] \to \mathbb{R}$ that converges to zero pointwise, but for which $\int_0^1 f_k\,dx$ does not converge to zero.
Solution Let $f_k$ have the graph in Figure 5.3-1. Thus, $f_k$ is such that $\int_0^1 f_k\,dx = 1$ for all $k = 1, 2, 3, \ldots$. Furthermore, for each $x$, $f_k(x) \to 0$ as $k \to \infty$ (clearly, if $x = 0$, and if $x > 0$, then $f_k(x) = 0$ as soon as $k > 1/x$). Thus $f_k \to 0$ pointwise, but $\int_0^1 f_k\,dx = 1$ for all $k$. •
FIGURE 5.3-1 This sequence converges to zero pointwise, but the corresponding sequence of integrals does not converge to zero. (The graph of $f_k$ is a spike with peak value $2k$ enclosing area 1.)
5.3.6 Example Let $f_n(x) = nx^2/(1 + nx^2)$, $-1 \le x \le 1$, $n = 1, 2, 3, \ldots$. Examine the conclusions of Theorem 5.3.3 in this case.
Solution As $n$ grows, the quantities $f_n(x)$ tend pointwise, but not uniformly, to the function $f(x)$ defined on $[-1, 1]$ by $f(x) = 1$ if $x \ne 0$ and $f(0) = 0$. The limit function $f$ is not even continuous, much less differentiable, at $x = 0$. Something must be going wrong with the derivatives. These are given by $f_n'(x) = 2nx/(1 + nx^2)^2$. These do converge pointwise to 0 on $[-1, 1]$, but not uniformly. For $n \ge 1$, we compute that $f_n'(1/n) = 2/(1 + 1/n)^2 \ge 1/2$. This example shows that some assumption such as uniform convergence of the derivatives is important in Theorem 5.3.3. Pointwise convergence is not enough. •
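Both counterexamples above can be explored numerically. The sketch below is illustrative only: the exact triangular shape of the spike (height $2k$ at $x = 1/(2k)$, zero outside $[0, 1/k]$) is an assumption consistent with the description of Figure 5.3-1, and `trapezoid` is a hypothetical helper written for this illustration.

```python
def spike(k, x):
    """Assumed triangular spike: height 2k at x = 1/(2k), zero outside [0, 1/k]."""
    if 0.0 <= x <= 0.5 / k:
        return 4.0 * k * k * x
    if 0.5 / k < x <= 1.0 / k:
        return 4.0 * k - 4.0 * k * k * x
    return 0.0

def trapezoid(g, a, b, m):
    """Composite trapezoid rule with m equal subintervals."""
    h = (b - a) / m
    inner = sum(g(a + i * h) for i in range(1, m))
    return h * (0.5 * g(a) + inner + 0.5 * g(b))

def dfn(n, x):
    """Derivative of f_n(x) = n x^2 / (1 + n x^2) from Example 5.3.6."""
    return 2.0 * n * x / (1.0 + n * x * x) ** 2

for k in (1, 4, 16):
    area = trapezoid(lambda x: spike(k, x), 0.0, 1.0, 6400)
    # The area stays near 1, the value at a fixed x > 0 is 0 once k > 1/x,
    # and f_k'(1/k) stays at or above 1/2.
    print(k, area, spike(k, 0.3), dfn(k, 1.0 / k))
```

The integrals do not shrink even though each fixed point eventually sees the value 0, and the derivatives in Example 5.3.6 stay bounded away from 0 near the origin.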
5.3.7 Example Verify that $\int_0^x e^t\,dt = e^x - 1$, using $e^x = \sum_{n=0}^\infty x^n/n!$ and Corollary 5.3.2.
Solution By the Weierstrass M test, $e^x = \sum_{n=0}^\infty x^n/n!$ converges uniformly on any finite interval. Thus, by Corollary 5.3.2 applied to the interval $[0, x]$,
$$\int_0^x e^t\,dt = \sum_{n=0}^\infty \int_0^x \frac{t^n}{n!}\,dt = \sum_{n=0}^\infty \frac{x^{n+1}}{(n+1)!} = \frac{x}{1!} + \frac{x^2}{2!} + \cdots = e^x - 1.$$
•
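5.3.8 Example
a. Sum the series $\sum_{n=1}^\infty x^n/n$, $|x| < 1$.
b. Sum $1 - 1/2 + 1/3 - 1/4 + \cdots$.
Solution
a. Recall that $\sum_{n=0}^\infty x^n = 1/(1 - x)$ for $|x| < 1$, and that the convergence is uniform on $[-c, c]$ for any $c$ with $0 < c < 1$, by the M test. Thus we may integrate term by term from 0 to $x$:
$$\sum_{n=0}^\infty \frac{x^{n+1}}{n+1} = -\log(1 - x); \quad \text{i.e.,} \quad \sum_{n=1}^\infty \frac{x^n}{n} = -\log(1 - x).$$
This is valid pointwise for all $x$ in $]-1, 1[$, since $c$ is arbitrary.
b. It is actually valid at $x = -1$ as well. To see this, recall that
$$\sum_{n=0}^N x^n = \frac{1 - x^{N+1}}{1 - x} = \frac{1}{1 - x} - \frac{x^{N+1}}{1 - x}.$$
Thus,
$$\sum_{n=0}^N \int_0^x t^n\,dt = \sum_{n=0}^N \frac{x^{n+1}}{n+1} = -\log(1 - x) - \int_0^x \frac{t^{N+1}}{1 - t}\,dt,$$
and hence
$$\left|\sum_{n=0}^N \frac{x^{n+1}}{n+1} + \log(1 - x)\right| = \left|\int_0^x \frac{t^{N+1}}{1 - t}\,dt\right|.$$
Letting $x = -1$, and using $1/(1 - t) \le 1$ for $-1 \le t < 0$, we get
$$\left|\sum_{n=0}^N \frac{(-1)^{n+1}}{n+1} + \log 2\right| \le \left|\int_0^{-1} t^{N+1}\,dt\right| = \frac{1}{N+2},$$
which tends to 0 as $N$ tends to $\infty$. Thus,
$$\log 2 = -\sum_{n=1}^\infty \frac{(-1)^n}{n} = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots.$$
•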
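5.3.9 Example
a. Expand $f(x) = 1/(1 + x^2)$ in a geometric series.
b. Integrate the series in a to prove
$$\tan^{-1} x = x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + \cdots \quad \text{for } |x| < 1.$$
c. Justify the formula of Euler:
$$\frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots.$$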
Solution
a. We expand $1/(1 + x^2)$ as a geometric series:
$$\frac{1}{1 + x^2} = \frac{1}{1 - (-x^2)} = 1 + (-x^2) + (-x^2)^2 + (-x^2)^3 + \cdots = 1 - x^2 + x^4 - x^6 + \cdots,$$
which is valid if $|-x^2| < 1$, that is, if $|x| < 1$. The convergence is uniform on $[-1 + \varepsilon, 1 - \varepsilon]$ for any $\varepsilon > 0$ by the M test.
b. Integrating from zero to $x$ gives
$$\int_0^x \frac{dt}{1 + t^2} = x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + \cdots,$$
but we know that the integral of $1/(1 + t^2)$ is $\tan^{-1} t$, and so
$$\tan^{-1} x = x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + \cdots \quad \text{for } |x| < 1.$$
c. If we set $x = 1$ and use $\tan^{-1} 1 = \pi/4$, we get Euler's formula:
$$\frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots;$$
but this is not quite justified, since the series for $\tan^{-1} x$ is valid only for $|x| < 1$. (It is plausible, though, since $1 - \frac{1}{3} + \frac{1}{5} - \cdots$, being an alternating series, converges.) To justify Euler's formula, we may use the finite form of the geometric series expansion:
$$\frac{1}{1 + t^2} = 1 - t^2 + t^4 - \cdots + (-1)^n t^{2n} + (-1)^{n+1} \frac{t^{2n+2}}{1 + t^2}.$$
Integrating from 0 to 1, we have
$$\frac{\pi}{4} = \tan^{-1} 1 = 1 - \frac{1}{3} + \frac{1}{5} - \cdots + \frac{(-1)^n}{2n + 1} + (-1)^{n+1} \int_0^1 \frac{t^{2n+2}}{1 + t^2}\,dt.$$
We will be finished if we can show that the last term goes to zero as $n \to \infty$. We have
$$0 \le \int_0^1 \frac{t^{2n+2}}{1 + t^2}\,dt \le \int_0^1 t^{2n+2}\,dt = \frac{1}{2n + 3} \to 0 \quad \text{as } n \to \infty.$$
•
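The remainder estimates obtained in Examples 5.3.8 and 5.3.9 are explicit, so they can be compared with the actual errors. The following sketch is an illustration only; the function names and the chosen cutoffs are assumptions.

```python
import math

def log2_partial(N):
    """1 - 1/2 + 1/3 - ... up to the term with denominator N + 1."""
    return sum((-1) ** n / (n + 1) for n in range(N + 1))

def quarter_pi_partial(n):
    """1 - 1/3 + 1/5 - ... + (-1)^n / (2n + 1)."""
    return sum((-1) ** k / (2 * k + 1) for k in range(n + 1))

for N in (10, 100, 1000):
    err_log = abs(log2_partial(N) - math.log(2))
    err_pi = abs(quarter_pi_partial(N) - math.pi / 4)
    # Each error is printed next to its bound: 1/(N + 2) and 1/(2N + 3).
    print(N, err_log, 1 / (N + 2), err_pi, 1 / (2 * N + 3))
```

Both series converge slowly, which matches bounds that decay only like $1/N$.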
5.3.10 Example Suppose that $a_0, a_1, a_2, a_3, \ldots$ is a bounded sequence of real numbers. Show that the series
$$\sum_{k=0}^\infty \frac{a_k}{k!} x^k$$
converges on $\mathbb{R}$ to a differentiable function $f(x)$, and that
$$f'(x) = \sum_{k=0}^\infty \frac{a_{k+1}}{k!} x^k.$$
Solution We saw in Example 5.2.6 that the series $\sum_{k=0}^\infty (a_k/k!)x^k$ converges on $\mathbb{R}$ to a continuous function $f(x)$ by considering the functions $g_k(x) = (a_k/k!)x^k$ on intervals of finite length. But $g_k'(x) = (k a_k/k!)x^{k-1} = (a_k/(k-1)!)x^{k-1}$, and so
$$\sum_{k=0}^\infty g_k'(x) = \sum_{k=1}^\infty \frac{a_k}{(k-1)!} x^{k-1} = \sum_{k=0}^\infty \frac{a_{k+1}}{k!} x^k.$$
The argument used in Example 5.2.6 may be applied again to show that this series converges uniformly on any interval of finite length to a function $g(x)$. Corollary 5.3.4 applies and shows that $f$ is differentiable and that $f'(x) = g(x)$. •
Exercises for §5.3
1. Investigate the validity of Theorem 5.3.1 for the sequence $f_n$ defined by $f_n(x) = \dfrac{nx}{1 + nx^2}$, $0 \le x \le 1$.
2. Show that the sequence $\{f_n\}$ defined by $f_n(x) = n^3 x^n (1 - x)$ converges pointwise to $f = 0$ on $[0, 1]$, and then use Theorem 5.3.1 to show that the convergence is not uniform.
3. Investigate the validity of Theorems 5.3.1 and 5.3.3 for $f_n(x) = \sqrt{n}\, x^n (1 - x)$ on $[0, 1]$. [Hint: Locate the maximum of $f_n(x)$.]
4. Verify that $\int_0^x \sin t\,dt = 1 - \cos x$, using
$$\sin x = \sum_{n=0}^\infty (-1)^n \frac{x^{2n+1}}{(2n+1)!}.$$
5. Verify that $\sin' x = \cos x$, using series.
6. Express the sum of the series $\sum_{n=1}^\infty x^n/n^2$ as an integral. What if $x = -1$?
§5.4 The Elementary Functions
A few of the most important functions of mathematical analysis are often referred to as "elementary functions." These include such functions as:
1. Exponential functions, such as $e^x$, $2^x$, and $10^x$.
2. The corresponding logarithms.
3. Power functions, such as $x^2$, $x^{13/17}$, and, more mysteriously, $x^\pi$; and closely related functions such as polynomials and rational functions.
4. Trigonometric functions, such as $\sin x$ and $\cos x$.
The calculus of these functions is usually developed in a beginning course in calculus and need not be completely repeated here; in fact, we have assumed that you know something about them already. We will, however, point out some of the places where there are potential difficulties and show how the ideas from the last few sections may be applied in this context.
Exponential Functions and Logarithms
Students encounter exponentiation early in their mathematical experience when they learn to abbreviate the repeated product of $n$ copies of a number $b$ as $b^n$. It is apparent that these positive integer exponents satisfy the fundamental property
$$b^n \cdot b^k = b^{n+k}.$$
Extension of this property quickly leads to useful definitions for exponentiation with negative integer exponents and with rational exponents:
$$b^{-n} = \frac{1}{b^n} \quad \text{and} \quad b^{p/q} = \sqrt[q]{b^p} = \left(\sqrt[q]{b}\right)^p.$$
There is no problem with any of these, at least when $b$ is positive, except for possible doubt about the existence of $q$th roots of positive real numbers $b$. We solved that problem in Chapter 1 with the study of the completeness of $\mathbb{R}$: we can define the $q$th root of $b$ to be the least upper bound of the set of all real numbers whose $q$th power is smaller than $b$.
Another subtle problem arises when we consider irrational exponents. What should be the meaning of $2^\pi$? We would like to have a well-behaved function of $x$, denoted by $2^x$, such that $2^{p/q} = (\sqrt[q]{2})^p$ if $x$ happens to be a rational number $p/q$. One way to attack this problem would be to observe that $\mathbb{Q}$ is dense in $\mathbb{R}$, so that any real number such as $\pi$ can be approximated by rational numbers. We could select a sequence $r_1, r_2, r_3, \ldots$ in $\mathbb{Q}$ with $r_n \to \pi$ as $n \to \infty$. We know what meaning we want for $2^{r_1}, 2^{r_2}, 2^{r_3}, \ldots$, and we want $2^x$ to be a continuous function of $x$, and so we should have $2^\pi = \lim_{n\to\infty} 2^{r_n}$. This is a good idea and not an unreasonable way to proceed for a numerical approximation of $2^\pi$. It presents difficulties as a basis for a definition:
1. Why does the limit exist?
2. Why is the limit independent of the choice of a sequence of rationals approximating $\pi$?
3. If we do the same thing for arbitrary real $x$, how does one establish differentiability and so forth for the resulting function $2^x$?
4. How does one establish the "laws of exponents" such as $2^x 2^y = 2^{x+y}$?
These problems are not insuperable, but they are not trivial, either. The attack on these questions most commonly used in a calculus course begins with the logarithm instead of the exponential function. The natural logarithm is introduced by means of the fundamental theorem of calculus as an antiderivative for the reciprocal function:
$$\ln(x) = \int_1^x \frac{1}{t}\,dt \quad \text{for } x > 0.$$
This automatically makes the natural logarithm a differentiable function of $x$ on the open half line $x > 0$ with derivative equal to $1/x$. The fundamental properties of the logarithm follow readily. The exponential function is then introduced as the inverse of the logarithm. Although perhaps not very intuitive, this approach has a number of advantages. Exercises 2 through 7 at the end of this section guide the reader through some of this development. We present here an approach that starts with the more familiar exponential function, but defines it in terms of an infinite series. The reader is probably familiar with the expansion of the elementary functions in power series called Taylor series or Maclaurin series. In particular, a familiar fact is that the exponential function can be expressed for all real $x$ as
$$e^x = \sum_{k=0}^\infty \frac{1}{k!} x^k = 1 + x + \frac{1}{2} x^2 + \frac{1}{3!} x^3 + \frac{1}{4!} x^4 + \cdots.$$
Our plan is to use this series to define a function, which we then show to behave in the way the exponential should.
5.4.1 Definition For every number $x$, set
$$\exp(x) = \sum_{k=0}^\infty \frac{1}{k!} x^k.$$
From 5.3.10, this series converges uniformly and absolutely on any bounded subset of $\mathbb{R}$ (or […] $1 + \int_0^x 1\,dt = 1 + x$.
This shows that $\exp(x) \to +\infty$ as $x \to +\infty$. Since $\exp(-x) = 1/\exp(x)$, we also get $\exp(x) \to 0$ as $x \to -\infty$. The combination of v, vi, and vii with the intermediate value theorem shows that exp maps $\mathbb{R}$ one-to-one onto $\mathbb{R}^+$. Continuity and the intermediate value theorem are used to obtain "onto."
Propositions 5.4.2 and 5.4.3 indicate that the function exp acts a lot like exponentiation. We now show that it really is. First, we define the special number $e$:
$$e = \exp(1) = \sum_{k=0}^\infty \frac{1}{k!} = 1 + 1 + \frac{1}{2} + \frac{1}{3!} + \cdots.$$
Using logarithms, introduced after the next heading, and l'Hôpital's rule, it can be shown that $e = \lim_{n\to\infty} (1 + (1/n))^n$. This supplies a link to such subjects as "continuously" compounded interest. From either of these characterizations or others, the value of $e$ can be computed to any desired degree of accuracy. The first few decimal places are $e \approx 2.71828\ 18284\ 59045\ 23536\ 02874\ 71352\ 66249 \ldots$
It may not be too surprising that $e$ is irrational, but more is true. It is not even algebraic. An algebraic number is a number that is the root of a polynomial with rational coefficients. Numbers that are not algebraic are called transcendental. That $e$ is transcendental was shown by Charles Hermite in 1873. (The same conclusion for $\pi$ came in 1882 from C. L. F. Lindemann.) The function $\exp(x)$ acts just the way $e$ raised to the "power" $x$ should. First, suppose that $n$ is a positive integer. Repeated application of 5.4.2 gives
$$\exp(n) = \exp(1 + 1 + \cdots + 1) = \exp(1) \cdot \exp(1) \cdots \exp(1) = (\exp(1))^n = e^n.$$
(The reader should be able to supply a proof by induction.) For negative integers, we invoke 5.4.3iv to conclude that $\exp(-n) = 1/\exp(n) = 1/e^n$. This is exactly what we are accustomed to writing as $e^{-n}$. Similarly,
$$(\exp(1/n))^n = \exp(1/n) \cdot \exp(1/n) \cdots \exp(1/n) = \exp((1/n) + (1/n) + \cdots + (1/n)) = \exp(1) = e.$$
Thus, $\exp(1/n)$ is an $n$th root for $e$. We are accustomed to writing this as $\exp(1/n) = \sqrt[n]{e} = e^{1/n}$. Putting this all together, we see that for integers $p$ and $q$ with $q > 0$, $\exp(p/q) = (\sqrt[q]{e})^p$, exactly the number we are accustomed to writing as $e^{p/q}$.
5.4.4 Proposition $\exp(x)$ is a differentiable function on $\mathbb{R}$ such that $\exp(x) = e^x$ for $x \in \mathbb{Q}$.
The function $\exp(x)$ thus serves as a quite reasonable definition for what $e^x$ should mean for any real $x$ (or, in fact, for any complex number). We now have a good definition for $e^x$, but our original challenge was $2^\pi$. Before attacking this problem, we turn our attention to logarithms.
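The series definition of exp and the identities derived from it lend themselves to a quick numerical check. This is a minimal sketch under stated assumptions: the function name `exp_series` and the fixed number of terms are choices made for illustration, and a finite partial sum only approximates the series.

```python
import math

def exp_series(x, terms=40):
    """Partial sum of sum_{k>=0} x^k / k! (Definition 5.4.1), built term by term."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= x / (k + 1)  # turns x^k / k! into x^(k+1) / (k+1)!
    return total

e = exp_series(1.0)
print(e)
# Law of exponents: exp(2) * exp(3) should agree with exp(5).
print(exp_series(2.0) * exp_series(3.0), exp_series(5.0))
# Rational exponent: exp(3/2) should agree with (sqrt(e))^3.
print(exp_series(1.5), math.sqrt(e) ** 3)
```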
Natural Logarithms
The exponential function $\exp(x) = e^x$ has been defined to produce a differentiable function from $\mathbb{R}$ one-to-one onto $\mathbb{R}^+$. It is strictly increasing, with a derivative that is strictly positive at every real $x$. In particular, the derivative is never 0. By the single-variable inverse function theorem (4.7.15), there is a differentiable inverse function taking $\mathbb{R}^+$ one-to-one onto $\mathbb{R}$. This function is called the natural logarithm function and is denoted $\ln x$ or $\log x$. That it is an inverse for the exponential means that
$$\log(e^x) = x \ \text{for every real } x \quad \text{and} \quad \exp(\log x) = x \ \text{for every } x > 0.$$
Computation with the chain rule yields the derivative:
$$\frac{d}{dx}(\exp(\log x)) = \frac{d}{dx}(x), \quad \text{i.e.,} \quad \exp(\log x)\,\frac{d}{dx}(\log x) = 1, \quad \text{i.e.,} \quad x\,\frac{d}{dx}(\log x) = 1.$$
Thus,
$$\frac{d}{dx}(\log x) = \frac{1}{x}.$$
Thus, the natural logarithm is an antiderivative for the reciprocal function. Since $\log(1) = 0$, the fundamental theorem of calculus gives
$$\log x = \int_1^x \frac{1}{t}\,dt \quad \text{for } x > 0,$$
the formula frequently taken as the starting point for the theory in a calculus course, as mentioned previously.
Other Bases
With the natural logarithm available, we can turn our attention to our original challenge problem. What is $2^\pi$? If we look ahead for a moment and assume that it and the natural logarithm both behave as we hope they will, we can perform a computation that will indicate what the proper definition should be. We would like to have $\log(2^\pi) = \pi \log 2$, so that $2^\pi = \exp(\log(2^\pi)) = \exp(\pi \log 2)$. Since exp and log are known functions, this last expression supplies the basis for a definition. We can try defining $2^\pi$ as $\exp(\pi \log 2)$ or, more generally, $2^x$ as $\exp(x \log 2)$ for any $x$. We will show that this works as desired not only for 2 but for any positive number $b$ as base to give a reasonable definition for $b^x$.
5.4.5 Definition Let $b > 0$. For any number $x$, define $\exp_b(x) = \exp(x \log b)$.
Applying the properties we already have for exp and the chain rule yields the following basic properties of $\exp_b$:
5.4.6 Proposition If $b > 0$, then $\exp_b(x)$ is a differentiable function on $\mathbb{R}$ and
i. $\exp_b(x + y) = \exp_b(x) \cdot \exp_b(y)$ for all numbers $x$ and $y$.
ii. $\exp_b(0) = 1$.
iii. $\frac{d}{dx}(\exp_b(x)) = \log(b) \cdot \exp_b(x)$ for all real $x$.
iv. $\exp_b(x) > 0$ for all real $x$.
v. $\exp_b(-x) = 1/\exp_b(x)$.
If $b = 1$, then $\log b = 0$. For $0 < b < 1$, we have $\log b < 0$, while if $b > 1$, then $\log b > 0$. Since $\exp(t) > 0$ for every $t$ and $\exp_b'(x) = \log(b) \cdot \exp(x \log b)$, this gives us all the information we need about the sign of $\exp_b'(x)$.
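Definition 5.4.5 translates directly into a computation. The sketch below is illustrative only: it leans on the library functions `math.exp` and `math.log` rather than the series of this section, and the step size `h` for the difference quotient is an assumption.

```python
import math

def exp_b(b, x):
    """exp_b(x) = exp(x log b), as in Definition 5.4.5 (requires b > 0)."""
    if b <= 0:
        raise ValueError("base must be positive")
    return math.exp(x * math.log(b))

# The defined value of 2^pi next to the library power operator.
print(exp_b(2.0, math.pi), 2.0 ** math.pi)
# Property i of Proposition 5.4.6.
print(exp_b(2.0, 0.5) * exp_b(2.0, 1.5), exp_b(2.0, 2.0))
# Property iii, checked with a central difference quotient at x = 1.
h = 1e-6
slope = (exp_b(2.0, 1.0 + h) - exp_b(2.0, 1.0 - h)) / (2 * h)
print(slope, math.log(2.0) * exp_b(2.0, 1.0))
```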
5.4.7 Proposition
i. If $b = 1$, then $\exp_b(x) = 1$ for every $x \in \mathbb{R}$.
ii. If $b > 1$, then
a. $\exp_b(x)$ is strictly increasing for every $x \in \mathbb{R}$.
b. $\exp_b(x) \to +\infty$ as $x \to +\infty$.
c. $\exp_b(x) \to 0$ as $x \to -\infty$.
iii. If $0 < b < 1$, then
a. $\exp_b(x)$ is strictly decreasing for every $x \in \mathbb{R}$.
b. $\exp_b(x) \to 0$ as $x \to +\infty$.
c. $\exp_b(x) \to +\infty$ as $x \to -\infty$.
In both cases ii and iii, the function $\exp_b(x)$ maps $\mathbb{R}$ one-to-one onto $\mathbb{R}^+$.
The computation showing that $\exp_b(x)$ agrees with our usual notion of $b^x$ for rational $x$ proceeds just as it did for $\exp(x)$.
5.4.8 Proposition $\exp_b(x)$ is a continuous and differentiable function on $\mathbb{R}$ such that $\exp_b(x) = b^x$ for all $x \in \mathbb{Q}$.
Thus $\exp_b(x)$ serves as a quite reasonable definition for $b^x$ for all numbers $x$. Of course, since $\exp_b(x)$ maps $\mathbb{R}$ one-to-one onto $\mathbb{R}^+$ (at least for $b \ne 1$), with derivative never equal to 0, there is a differentiable inverse function, denoted $\log_b$, taking $\mathbb{R}^+$ one-to-one onto $\mathbb{R}$, and $\exp_b(\log_b(x)) = x$ for every $x > 0$.
We are now in a position to write down the basic properties of exponentials and logarithms. First, we use the base $e$ to compute $\log(b^x) = \log(\exp(x \log b)) = x \log b$. Thus, $(b^x)^y = \exp(y \log(b^x)) = \exp(yx \log b) = \exp(xy \log b) = b^{xy}$. This is the last of the principal algebraic properties of exponentiation.
5.4.9 Proposition If $b > 0$ and $x$ and $y$ are in $\mathbb{R}$, then
i. $b^0 = 1$.
ii. $b^x > 0$ for every $x$ in $\mathbb{R}$.
iv. $(b^x)^y = b^{xy}$.
The corresponding properties of logarithms are also now available:
5.4.10 Proposition If $b > 0$ and $b \ne 1$, then
ii. $\log_b(xy) = \log_b(x) + \log_b(y)$ for all positive $x$ and $y$.
iii. $\log_b(x^t) = t \log_b(x)$ for $x > 0$ and $t \in \mathbb{R}$.
To obtain ii, we let $x = b^s$ and $y = b^t$ and compute
$$\log_b(xy) = \log_b(b^s b^t) = \log_b(b^{s+t}) = s + t = \log_b(b^s) + \log_b(b^t) = \log_b(x) + \log_b(y).$$
To obtain iii, we let $x = b^s$ and compute
$$\log_b(x^t) = \log_b((b^s)^t) = \log_b(b^{st}) = st = t \log_b(x).$$
The calculus of these functions is summarized in the next proposition.
5.4.11 Proposition
i. $\frac{d}{dx}(e^x) = e^x$; $\int e^x\,dx = e^x + C$.
ii. $\frac{d}{dx}(\log x) = 1/x$; $\int_1^x (1/t)\,dt = \log x$, $x > 0$.
iii. $\frac{d}{dx}(b^x) = b^x \log b$; $\int b^x\,dx = b^x/\log b + C$.
iv. $\frac{d}{dx}(\log_b x) = 1/(x \log b)$.
Power Functions
By a "power function," we mean a function of the form $f(x) = x^p$ for a constant $p$. The only power functions we have really used so far in the development of theory are those with $p$ an integer, although we have discussed roots to some extent, which correspond to rational values of $p$. We now have a good definition of $x^p$ for any $p$, at least if $x$ is positive: $x^p = \exp(p \log x)$. We know that this satisfies most of the correct properties for powers of positive numbers:
$$x^p x^r = x^{p+r}, \qquad (x^p)^r = x^{pr}, \qquad x^{-p} = 1/x^p, \qquad x^0 = 1 \ \text{if } x \ne 0.$$
A remaining property, $x^p y^p = (xy)^p$, is easily checked:
$$(xy)^p = \exp(p \log xy) = \exp(p(\log x + \log y)) = \exp(p \log x + p \log y) = \exp(p \log x) \cdot \exp(p \log y) = x^p y^p.$$
The familiar rule for derivatives of power functions is readily obtained from the chain rule:
$$\frac{d}{dx}(x^p) = \frac{d}{dx}(\exp(p \log x)) = \exp'(p \log x) \cdot \frac{d}{dx}(p \log x) = \exp(p \log x) \cdot p/x = (x^p) \cdot p/x = p x^{p-1}.$$
Of course, this works in full generality only for $x > 0$. However, we can always set $0^p = 0$ for every $p > 0$ and $x^{p/q} = \sqrt[q]{x^p}$ for every real $x$ if $p$ and $q$ are integers and $p$ is even.
Trigonometric Functions
The expansions of the trigonometric functions sine and cosine are probably familiar to the reader:
$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots$$
and
$$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots.$$
Each of these series is of the form $\sum_{k=0}^\infty (a_k/k!)x^k$, where each of the coefficients $a_k$ is either $-1$, 0, or 1. They thus fit the pattern of Example 5.3.10. Thus, if we define functions $s(x)$ and $c(x)$ by the series
$$s(x) = x - \frac{1}{3!}x^3 + \frac{1}{5!}x^5 - \frac{1}{7!}x^7 + \cdots = \sum_{k=0}^\infty (-1)^k \frac{x^{2k+1}}{(2k+1)!}$$
and
$$c(x) = 1 - \frac{1}{2!}x^2 + \frac{1}{4!}x^4 - \frac{1}{6!}x^6 + \cdots = \sum_{k=0}^\infty (-1)^k \frac{x^{2k}}{(2k)!},$$
each of $s(x)$ and $c(x)$ converges uniformly and absolutely on any bounded set of numbers and represents a differentiable function on all of $\mathbb{R}$. Term-by-term differentiation is valid for any real $x$. Carrying out this differentiation shows that $s'(x) = c(x)$ and $c'(x) = -s(x)$ for all real $x$. From the series and from these derivative relations, we can show that $s(x)$ and $c(x)$ satisfy the same key identities as do sine and cosine. For example, let $h(x) = s(x)^2 + c(x)^2$. Then $h(x)$ is differentiable and $h'(x) = 2s(x)c(x) - 2c(x)s(x) = 0$ for every $x$. Thus $h(x)$ is constant. Since $c(0) = 1$ and $s(0) = 0$, we conclude that $h(x) = 1$ for every $x$. That is,
$$s(x)^2 + c(x)^2 = 1 \quad \text{for every real } x.$$
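The identity $s(x)^2 + c(x)^2 = 1$ can be observed directly from partial sums of the two series. This is a rough sketch for illustration: the fixed number of terms and the sample points are assumptions, and the comparison with `math.sin` and `math.cos` presumes the library functions as a reference.

```python
import math

def s(x, terms=30):
    """Partial sum of sum_k (-1)^k x^(2k+1) / (2k+1)!."""
    total, term = 0.0, x
    for k in range(terms):
        total += term
        term *= -x * x / ((2 * k + 2) * (2 * k + 3))
    return total

def c(x, terms=30):
    """Partial sum of sum_k (-1)^k x^(2k) / (2k)!."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= -x * x / ((2 * k + 1) * (2 * k + 2))
    return total

for x in (-3.0, 0.5, 2.0):
    print(x, s(x) ** 2 + c(x) ** 2, s(x) - math.sin(x), c(x) - math.cos(x))
```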
From the series, it is also apparent that $s(x)$ is an odd function of $x$ and that $c(x)$ is even: $s(-x) = -s(x)$ and $c(-x) = c(x)$ for all $x$. The formulas for the sine and cosine of a sum can also be recovered by direct manipulation of the series in much the same way that we obtained the fundamental formula for the exponential function, $\exp(x + y) = \exp(x) \cdot \exp(y)$. However, we can also obtain them indirectly by noting an important link between the exponential and the trigonometric functions. This relationship depends on the fact that one has uniform and absolute convergence of the series $\sum_{k=0}^\infty (a_k/k!)x^k$ on any bounded subset of complex numbers provided that the complex coefficients $a_k$ stay bounded. The proof follows just as it does for real numbers, as do the manipulations leading to the formula $\exp(x + y) = \exp(x) \cdot \exp(y)$. If we compute $\exp(ix)$ for real $x$, we find that
$$e^{ix} = \sum_{k=0}^\infty \frac{1}{k!}(ix)^k = 1 + ix - \frac{x^2}{2!} - \frac{ix^3}{3!} + \frac{x^4}{4!} + \frac{ix^5}{5!} - \frac{x^6}{6!} - \cdots$$
$$= \left(1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \cdots\right) + i\left(x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots\right) = c(x) + is(x).$$
The law of exponents assures us that $c(x + y) + is(x + y) = e^{i(x+y)}$ […]
[…] $x > 0$. Show that $f$ is a differentiable function on $\mathbb{R}^+ = \{x \in \mathbb{R} \mid x > 0\}$ with $f(1) = 0$ and $f'(x) = 1/x$ for all $x > 0$.
3. Show that if $a > 0$ and $b > 0$, then $f(ab) = f(a) + f(b)$ ($f$ is defined in Exercise 2).
4. Show that ($f$ is defined in Exercise 2)
a. $f$ is strictly increasing on $\mathbb{R}^+$.
b. $\lim_{x\to\infty} f(x) = +\infty$.
c. $\lim_{x\to 0} f(x) = -\infty$.
5. Show that $f$ has a differentiable inverse function $g = f^{-1}$ taking $\mathbb{R}$ one-to-one onto $\mathbb{R}^+$ and that $g'(x) = g(x)$ for all real $x$ ($f$ is defined in Exercise 2).
6. Show that $g(x + y) = g(x) \cdot g(y)$ for all real numbers $x$ and $y$ ($g$ is defined in Exercise 5).
7. Let $e = g(1)$, and show that $g(p/q) = e^{p/q}$ for every rational number $p/q$, $p$ and $q$ in $\mathbb{Z}$.
8. The "error function" is defined by
$$\operatorname{erf}(x) = \frac{1}{\sqrt{2\pi}} \int_0^x e^{-t^2/2}\,dt.$$
a. Show that $\operatorname{erf}(x)$ can be represented by a power series $\operatorname{erf}(x) = \sum_{k=0}^\infty a_k x^k$ valid for all $x$, and compute $a_0, a_1, a_2, a_3, a_4$, and $a_5$.
b. Use your main result from part a to estimate the value of
$$\frac{1}{\sqrt{2\pi}} \int_{-1}^{1} e^{-t^2/2}\,dt.$$
(This integral gives the probability that a measurement taken at random from a normally distributed population lies within one standard deviation of the mean of 0.)
§5.5 The Space of Continuous Functions
Fix a metric space $M$, a subset $A \subset M$, and a normed vector space $N$. Consider the set $V$ of all functions $f : A \to N$. Then $V$ is easily seen to be a vector space. In $V$, the zero vector is the function that is 0 for all $x \in A$, and addition and scalar multiplication are defined by $(f + g)(x) = f(x) + g(x)$ and $(\lambda f)(x) = \lambda(f(x))$ for each $\lambda \in \mathbb{R}$, $f, g \in V$. Let
$$C = \{f \in V \mid f \text{ is continuous}\}.$$
If there is danger of confusion, we write $C(A, N)$ to indicate that $C$ depends on our choice of $A$ and $N$. Then $C$ is also a vector space, since the sum of two continuous functions is continuous and a scalar multiple of a continuous function is continuous. Let $C_b$ be the vector subspace of $C$ consisting of bounded functions: $C_b = \{f \in C \mid f \text{ is bounded}\}$. Recall that "$f$ is bounded" means that there is a constant $M$ such that $\|f(x)\| \le M$ for all $x \in A$. If $A$ is compact, then $C_b = C$, by Theorem 4.4.1 applied to the real-valued function $x \mapsto \|f(x)\|$. For $f \in C_b$, let $\|f\| = \sup\{\|f(x)\| \mid x \in A\}$, which exists, since $f$ is bounded. The number $\|f\|$ is a measure of the size of $f$ and is called the norm of $f$. See Figure 5.5-1. Note that $\|f\| \le M$ iff $\|f(x)\| \le M$ for all $x \in A$.
FIGURE 5.5-1 The norm of a function picks out the largest absolute value of $f(x)$
What we are trying to do here is to look at the space $C_b$ in the same way as we look at $\mathbb{R}^n$. Namely, each point in $C_b$ (which is a function) has a norm, and so we can hope that many of the concepts developed for vectors in $\mathbb{R}^n$ will carry over to $C_b$. Such a point of view is useful in doing analysis, and some important results can be proved by using some of our techniques on the space $C_b$. For this program to be successful, the first task is to establish that $C_b$ is a normed space.
Note. Although we have a norm, we do not have an inner product associated with it such that $\|f\|^2 = \langle f, f \rangle$. Other spaces of functions that we study in Fourier analysis (Chapter 10) do have such an inner product.
If $N$ is only a metric space, we still get a metric on $C_b$, namely, $d(f, g) = \sup\{\rho(f(x), g(x)) \mid x \in A\}$.
Most of what follows holds in this context. When a vector space structure on $N$ is needed, we will explicitly say so.
5.5.1 Theorem
i. If $(M, d)$ and $(N, \rho)$ are metric spaces, then so is $C_b(A, N)$; that is, $d(f, g)$ satisfies
a. $d(f, g) \ge 0$ and $d(f, g) = 0$ iff $f = g$.
b. $d(f, g) = d(g, f)$.
c. $d(f, g) \le d(f, h) + d(h, g)$.
ii. If $N$ is a normed space, then so is $C_b(A, N)$; that is, $\|\cdot\|$ satisfies
a. $\|f\| \ge 0$ and $\|f\| = 0$ iff $f = 0$.
b. $\|\alpha f\| = |\alpha| \, \|f\|$ for $\alpha \in \mathbb{R}$, $f \in C_b$.
c. $\|f + g\| \le \|f\| + \|g\|$ (triangle inequality).
These are the basic rules we need to talk about open sets, convergence, and so forth. For example, write $f_k \to f$ in $C_b$ iff $\|f_k - f\| \to 0$. The connection with uniform convergence is simple.
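The norm and metric on $C_b$ can be approximated on a computer by taking a maximum over sample points. The sketch below is only an illustration under assumptions: `sup_norm` and `sup_dist` are hypothetical helper names, the grid on $[0, 1]$ is arbitrary, and a maximum over finitely many points is in general only a lower bound for the true supremum.

```python
def sup_norm(f, grid):
    """Grid approximation of ||f|| = sup |f(x)| for a real-valued f."""
    return max(abs(f(x)) for x in grid)

def sup_dist(f, g, grid):
    """Grid approximation of d(f, g) = sup |f(x) - g(x)|."""
    return max(abs(f(x) - g(x)) for x in grid)

grid = [i / 1000 for i in range(1001)]  # assumed sample points in [0, 1]

# g_n(x) = x^(n+1) / (n+1) from Example 5.2.5: its norm is 1/(n+1), attained at x = 1.
for n in (1, 10, 100):
    g_n = lambda x, n=n: x ** (n + 1) / (n + 1)
    print(n, sup_norm(g_n, grid), 1 / (n + 1))

# Distance between x and x^2 on [0, 1]: the largest gap is 1/4, at x = 1/2.
print(sup_dist(lambda x: x, lambda x: x * x, grid))
```

Here $\|g_n\| \to 0$, which is the statement that $g_n \to 0$ uniformly on $[0, 1]$.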
5.5.2 Theorem ($f_k \to f$ uniformly on $A$) $\iff$ ($f_k \to f$ in $C_b$; that is, $\|f_k - f\| \to 0$).
[…] $\varepsilon > 0$ there is an $N$ such that $k, l \ge N$ implies $\|f_k - f_l\| < \varepsilon$. Recall that a space is called complete if every Cauchy sequence converges. Another name for a complete normed space is a Banach space. Completeness is an important technical property for a space, since often we may be able to prove that a sequence satisfies the Cauchy criterion and we want to deduce that it converges to some element of the space.
5.5.3 Theorem If $N$ is a complete metric space, then $C_b(A, N)$ is a complete metric space. If $N$ is a Banach space, so is $C_b(A, N)$.
This result is really a rephrasing of two basic results:
1. A uniformly Cauchy sequence of functions into a complete space converges uniformly to something.
2. A uniform limit of continuous functions is continuous.
The space $C_b$ is only one of a host of spaces of functions of great importance in analysis. While both $C_b$ and $\mathbb{R}^n$ are complete normed spaces, they are quite different in other respects. For instance, as we have mentioned, $C_b$ does not have an inner product that gives the norm $\|\cdot\|$ (Exercises 12 and 30 at the end of Chapter 1). Another is that $C_b$ is not finite-dimensional. In the following sections, we shall see some specific problems to which this theory can be applied.
5.5.4 Example Let $B = \{f \in C([0, 1], \mathbb{R}) \mid f(x) > 0 \text{ for all } x \in [0, 1]\}$. Show that $B$ is an open set in $C([0, 1], \mathbb{R})$.
Solution By definition, for $f \in B$ we must find an $\varepsilon > 0$ such that $D(f, \varepsilon) = \{g \in C \mid \|f - g\| < \varepsilon\} \subset B$. Since $[0, 1]$ is compact, $f$ has a minimum value, say $m$, at some point of $[0, 1]$. Thus, $f(x) \ge m > 0$ for all $x \in [0, 1]$. Let $\varepsilon = m/2$. If $\|f - g\| < \varepsilon$, then for any $x$, $|f(x) - g(x)| < \varepsilon = m/2$. Hence, $g(x) \ge m/2 > 0$, and so $g \in B$. •
5.5.5 Example What is the closure of the set $B$ in Example 5.5.4?
Solution We assert that the closure is $D = \{f \in C \mid f(x) \ge 0 \text{ for all } x \in [0, 1]\}$. This is a closed set because if $f_n(x) \ge 0$ and $f_n \to f$ uniformly, and hence pointwise, then $f(x) \ge 0$ for all $x$. To show that $D$ is the closure, it suffices to show that for $f \in D$ there is an $f_n \in B$ such that $f_n \to f$ (why?). Simply let $f_n = f + 1/n$. •
5.5.6 Example Consider a sequence $f_n \in C_b$ such that $\|f_{n+1} - f_n\| \le r_n$, where $\sum r_n$ is convergent, $r_n \ge 0$. Prove that $f_n$ converges.
Solution By the triangle inequality,
$$\|f_n - f_{n+k}\| \le \|f_n - f_{n+1}\| + \|f_{n+1} - f_{n+2}\| + \cdots + \|f_{n+k-1} - f_{n+k}\| \le r_n + r_{n+1} + \cdots + r_{n+k-1}.$$
Since $\sum r_i$ is convergent, this expression tends to 0 as $n \to \infty$, since it is less than or equal to $s - s_{n-1}$, where $s_n$ is the $n$th partial sum and $s$ is the sum. Hence, $f_n$ is a Cauchy sequence, and so it converges, because $C_b$ is complete (5.5.3). •
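The estimate in Example 5.5.6 can be checked numerically. The following is a minimal sketch, not part of the text: it assumes the particular sequence $f_n(x) = \sum_{j=1}^{n} \sin(jx)/j^2$ on $[0, 2\pi]$ with $r_n = 1/(n+1)^2$, and it approximates the sup norm by sampling on a grid. The names `partial_sum`, `r`, and `check` are hypothetical.

```python
import math

def sup_norm_diff(f, g, grid):
    """Approximate the sup norm of f - g by sampling on a finite grid."""
    return max(abs(f(x) - g(x)) for x in grid)

def partial_sum(n):
    """f_n(x) = sum_{j=1}^{n} sin(j x) / j^2, so ||f_{n+1} - f_n|| <= 1/(n+1)^2."""
    return lambda x: sum(math.sin(j * x) / j**2 for j in range(1, n + 1))

def r(n):
    """Summable bound r_n on the sup norm of f_{n+1} - f_n (an assumed example)."""
    return 1.0 / (n + 1) ** 2

grid = [2 * math.pi * i / 400 for i in range(401)]

def check(n, k):
    """Return (||f_n - f_{n+k}|| on the grid, r_n + ... + r_{n+k-1})."""
    lhs = sup_norm_diff(partial_sum(n), partial_sum(n + k), grid)
    rhs = sum(r(j) for j in range(n, n + k))
    return lhs, rhs

for n in (5, 20, 80):
    lhs, rhs = check(n, 40)
    print(n, lhs, rhs)
```

In each printed row the first number should not exceed the second, and both shrink as $n$ grows, which is the Cauchy property used in the solution.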
Exercises for §5.5
1. Let $B = \{f \in C_b(\mathbb{R},\mathbb{R}) \mid f(x) > 0 \text{ for all } x \in \mathbb{R}\}$. Is $B$ open? If not, what is $\operatorname{int}(B)$?
2. What is the closure of $B$ in Exercise 1?
3. Do you see a connection between Example 5.5.6 and the Weierstrass M test? Discuss.
4. Let
$$f_n(x) = \frac{1}{n}\,\frac{nx}{1 + nx}, \qquad 0 \le x \le 1.$$
Show that $f_n \to 0$ in $C([0,1],\mathbb{R})$.
5. Let $f_k$ be a convergent sequence in $C_b(A,\mathbb{R}^m)$. Prove that $\{f_k \mid k = 1, 2, \dots\}$ is bounded in $C_b(A,\mathbb{R}^m)$. Is it closed?
§5.6 The Arzela-Ascoli Theorem
Analogous to the Heine-Borel theorem concerning compact subsets of $\mathbb{R}^n$, this theorem gives conditions for a set in $C_b$ to be compact. Recall that in $\mathbb{R}^n$ a set is compact if and only if it is closed and bounded, but that this criterion need not hold in other metric spaces. In particular, it does not hold in $C_b$. The Arzela-Ascoli theorem comes as close to this criterion as possible, with the intent of providing conditions for compactness that can be verified in examples. We begin with some terminology.
5.6.1 Definition Let $B \subset C(A, N)$. We say that $B$ is an equicontinuous set of functions if for any $\varepsilon > 0$ there is a $\delta > 0$ such that $x, y \in A$ and $d(x,y) < \delta$ implies $d(f(x), f(y)) < \varepsilon$ for all $f \in B$.
This definition is the same as that of uniform continuity, except that now we also demand that $\delta$ can be chosen independent of $f$ as well as $x_0$. Let $B_x = \{f(x) \mid f \in B\}$ for fixed $x$. This is the set of all values of the functions in $B$ at the point $x \in A$. We say that $B$ is pointwise compact if and only if $B_x$ is compact in $N$ for each $x \in A$.
5.6.2 Arzela-Ascoli Theorem Let $A \subset M$ be compact and $B \subset C(A,N)$. Then $B$ is compact if and only if $B$ is closed, equicontinuous, and pointwise compact.
The proof strategy is based on the Bolzano-Weierstrass property. The main point is to show that if each $f_n$ is in $B$, then the sequence $f_n$ has a uniformly convergent subsequence. Since $B$ is pointwise compact, we can do this at each $x \in A$. The idea now is to use equicontinuity and compactness of $A$ to make this uniform. A clever diagonal selection process makes this feasible. For instance, the theorem is often used as follows:
5.6.3 Corollary Let $A \subset M$ be compact and $N = \mathbb{R}^m$. Assume $B \subset C(A,\mathbb{R}^m)$ is equicontinuous and pointwise bounded. Then every sequence in $B$ has a uniformly convergent subsequence.
Here one uses the fact that any bounded set in $\mathbb{R}^m$ lies in a compact set (a ball, for instance). A related result is this: If $A \subset M$ is compact and $N = \mathbb{R}^m$, then $B \subset C(A,\mathbb{R}^m)$ is compact iff it is bounded, closed, and equicontinuous. We leave the proof of this to the reader.
5.6.4 Example Let $f_n : [0,1] \to \mathbb{R}$ be continuous and be such that $|f_n(x)| \le 100$ for every $n$ and for all $x$ in $[0,1]$, and the derivatives $f_n'$ exist and are uniformly bounded on $]0,1[$. Prove that $f_n$ has a uniformly convergent subsequence.
Solution We verify that the set $\{f_n\}$ is equicontinuous and bounded. The hypothesis is that $|f_n'(x)| \le M$ for a constant $M$. By the mean value theorem,
$$|f_n(x) - f_n(y)| \le M|x - y|,$$
and so, given $\varepsilon$, we can choose $\delta = \varepsilon/M$, independent of $x$, $y$, and $n$. Thus $\{f_n\}$ is equicontinuous. It is bounded because $\|f_n\| = \sup_{0 \le x \le 1} |f_n(x)| \le 100$. •
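As an illustration of the choice $\delta = \varepsilon/M$, here is a small numerical sketch. It assumes the specific family $f_n(x) = \sin(nx)/n$ on $[0,1]$, for which $|f_n'(x)| = |\cos(nx)| \le 1$, so $M = 1$; this family and the helper `worst_gap` are illustrative choices, not taken from the text.

```python
import math

M = 1.0  # bound on |f_n'(x)| = |cos(n x)| for the assumed family below

def f(n, x):
    """Assumed example family f_n(x) = sin(n x) / n."""
    return math.sin(n * x) / n

def worst_gap(n, delta, steps=500):
    """Largest |f_n(x) - f_n(y)| over grid points x < y in [0, 1] with y - x < delta."""
    pts = [i / steps for i in range(steps + 1)]
    worst = 0.0
    for i, x in enumerate(pts):
        j = i + 1
        while j <= steps and pts[j] - x < delta:
            worst = max(worst, abs(f(n, x) - f(n, pts[j])))
            j += 1
    return worst

eps = 0.05
delta = eps / M  # the same delta is used for every n
gaps = {n: worst_gap(n, delta) for n in (1, 10, 100, 1000)}
print(gaps)
```

Every value in `gaps` should stay below `eps`, no matter how large $n$ is, which is what equicontinuity asks for.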
5.6.5 Example Is the result of Example 5.6.4 valid if we omit the condition that $|f_n(x)|$ is bounded?
Solution No, for let $f_n(x) = n$. Then $f_n' = 0$, but clearly there is no convergent subsequence. •
To exploit compactness and combine the Arzela-Ascoli theorem with results about continuous functions from Chapter 4, we need a supply of continuous functions on $C$. The next example provides a start.
5.6.6 Example Let $I : C([0,1],\mathbb{R}) \to \mathbb{R}$ be defined by
$$I(f) = \int_0^1 f(x)\,dx.$$
Prove that $I$ is continuous.
Solution We must show that $f_n \to f$ in $C$ implies $I(f_n) \to I(f)$. But this is a consequence of Theorem 5.3.1. •
Exercises for §5.6
1. Show that in Example 5.6.4, $f_n$ bounded can be replaced by $f_n(0) = 0$ to give the same conclusion.
2. In 5.6.3, need the whole sequence be convergent?
3. a. Show that the following set is open:
b. Show that $C_b$ is closed in the space of all bounded functions on a set $A$.
4. Let $B \subset C([0,1],\mathbb{R})$ be closed, bounded, and equicontinuous. Let $I : B \to \mathbb{R}$ be defined by $I(f) = \int_0^1 f(x)\,dx$. Show that there is an $f_0 \in B$ at which the value of $I$ is maximized.
5. Let the functions $f_n : [a,b] \to \mathbb{R}$ be uniformly bounded continuous functions. Set
$$F_n(x) = \int_a^x f_n(t)\,dt, \qquad a \le x \le b.$$
Prove that $F_n$ has a uniformly convergent subsequence.
§5.7 The Contraction Mapping Principle and Its Applications
This section deals with an important iterative technique in analysis called the contraction mapping principle. It proves existence and uniqueness results and, in addition, gives a specific iterative procedure for locating the solution. We begin with the general method.
5.7.1 Contraction Mapping Principle Let $M$ be a complete metric space and $\Phi : M \to M$ a given mapping. Assume that there is a constant $k$, $0 \le k < 1$, such that $d(\Phi(x), \Phi(y)) \le k\,d(x,y)$ for all $x, y \in M$. Then there is a unique fixed point for $\Phi$, i.e., a point $x_* \in M$ such that $\Phi(x_*) = x_*$. In fact, if $x_0$ is any point in $M$ and we define $x_1 = \Phi(x_0)$, $x_2 = \Phi(x_1)$, $\dots$, $x_{n+1} = \Phi(x_n)$, $\dots$, then $\lim_{n \to \infty} x_n = x_*$.
Intuitively, $\Phi$ is shrinking distances, and so as $\Phi$ iterates, points bunch up. The bunching is centered at $x_*$; see Figure 5.7-1.
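The iteration $x_{n+1} = \Phi(x_n)$ is easy to try on a computer. The sketch below is only an illustration under stated assumptions: it takes $M = [0,1]$ and $\Phi = \cos$, which maps $[0,1]$ into itself and satisfies $|\cos x - \cos y| \le (\sin 1)|x - y|$ there, so $k = \sin 1 < 1$. The helper name `fixed_point` and the stopping tolerance are hypothetical choices.

```python
import math

def fixed_point(phi, x0, tol=1e-12, max_iter=1000):
    """Iterate x_{n+1} = phi(x_n) until successive iterates differ by less than tol.

    Returns the last iterate and the number of steps taken.
    """
    x = x0
    for n in range(1, max_iter + 1):
        x_next = phi(x)
        if abs(x_next - x) < tol:
            return x_next, n
        x = x_next
    raise RuntimeError("no convergence within max_iter iterations")

# Assumed example: phi = cos on M = [0, 1], a contraction with k = sin(1).
x_star, steps = fixed_point(math.cos, 0.3)
y_star, _ = fixed_point(math.cos, 1.0)   # a different starting point
print(x_star, steps, abs(x_star - y_star))
```

Both starting points lead to the same limit, as the uniqueness part of 5.7.1 predicts for this example.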
Note. There are other famous fixed-point theorems that have a more topological flavor. For instance, the Brouwer fixed-point theorem (in a special case) says that any continuous map $f : D \to D$, where $D = \{x \in \mathbb{R}^n \mid \|x\| \le 1\}$ is the disk, has a fixed point. Certainly such $f$'s need not be contractions. If $n = 1$, this is proved by the intermediate value theorem (see 4.5.5 and Exercise 3 of §4.5). In this result there can be many fixed points. A related result is that any vector field on the 2-sphere has a zero somewhere; or "at every instant, there is somewhere on the earth that the wind is not blowing." Our more analytical contraction mapping theorem has some everyday interpretations, too: "Take a map of the city in which you are present and put it on a horizontal table. Then there is exactly one point on the map which corresponds to the point on the table directly beneath it."
FIGURE 5.7-1
A contraction shrinks distances between points
As our first application, we study the existence of solutions of differential equations. We give a specific context, although the technique is fairly general. We shall generalize it in §7.5. Consider a continuous function $f(t,x)$ defined in a neighborhood of $(t_0, x_0) \in \mathbb{R}^2$. Assume that the following Lipschitz condition holds: there is a constant $K$ such that
$$|f(t,x_1) - f(t,x_2)| \le K|x_1 - x_2|$$
for all $(t,x_1)$ and $(t,x_2)$ in a neighborhood of $(t_0,x_0)$. If $f$ is differentiable in $x$ and $(\partial f/\partial x)(t,x)$ is continuous, then the condition is automatic (by the mean value theorem).
5.7.2 Theorem Under the above assumptions, there is a $\delta > 0$ such that the equation
$$\frac{dx}{dt} = f(t,x), \qquad x(t_0) = x_0 \tag{1}$$
has a unique $C^1$ solution $x = \varphi(t)$, with $\varphi(t_0) = x_0$, for $t_0 - \delta < t < t_0 + \delta$, i.e., $\varphi'(t) = f(t, \varphi(t))$.
The technique of proof is based on the following observation. Solving (1) is equivalent to finding a continuous function $\varphi$ on $]t_0 - \delta, t_0 + \delta[$ such that
$$\varphi(t) = x_0 + \int_{t_0}^{t} f(s, \varphi(s))\,ds.$$
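The integral form suggests iterating the map $\varphi \mapsto x_0 + \int_{t_0}^{t} f(s,\varphi(s))\,ds$ (often called Picard iteration). The following is a rough numerical sketch, assuming the test problem $dx/dt = x$, $x(0) = 1$ on $[0, 0.5]$, whose exact solution is $e^t$. The function `picard`, the grid size, and the trapezoid rule are illustrative assumptions, and only the forward half of the interval is treated.

```python
import math

def picard(f, t0, x0, delta, n_grid=2000, n_iter=25):
    """Iterate phi -> x0 + integral_{t0}^{t} f(s, phi(s)) ds on [t0, t0 + delta].

    Functions are represented by their values on a uniform grid, and the
    integral is approximated by the cumulative trapezoid rule.
    """
    h = delta / n_grid
    ts = [t0 + i * h for i in range(n_grid + 1)]
    phi = [x0] * (n_grid + 1)  # start from the constant function x0
    for _ in range(n_iter):
        vals = [f(t, x) for t, x in zip(ts, phi)]
        new_phi = [x0]
        for i in range(n_grid):
            # add the trapezoid area over [ts[i], ts[i + 1]]
            new_phi.append(new_phi[-1] + 0.5 * h * (vals[i] + vals[i + 1]))
        phi = new_phi
    return ts, phi

# Assumed test problem: dx/dt = x, x(0) = 1, with exact solution exp(t).
ts, phi = picard(lambda t, x: x, 0.0, 1.0, 0.5)
err = max(abs(p - math.exp(t)) for t, p in zip(ts, phi))
print(err)
```

For this test problem the iterates reproduce the Taylor polynomials of $e^t$ up to quadrature error, so `err` should be small.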
[...] $\Phi(f)(x) = a + \int_0^x f(y)\,x e^{-xy}\,dy$, and so the constant for $\Phi$ is
$$k = \sup_{x \in [0,r]} \int_0^x |K(x,y)|\,dy = \sup_{x \in [0,r]} \int_0^x x e^{-xy}\,dy = \sup_{x \in [0,r]} \bigl(1 - e^{-x^2}\bigr) = 1 - e^{-r^2} < 1.$$
Thus, we get a unique solution on any interval $[0, r]$. •
5.7.9 Example We consider a "closed-loop feedback system" denoted schematically this way:
[Diagram: the input $r$ enters a summing junction whose output is $e$; the block $F$ maps $e$ to $c = F(e)$, and $c$ is fed back to the junction with a minus sign.]
Here, $r, e, c$ are functions of $t \in [a,b]$. We choose $M = C([a,b],\mathbb{R})$ and assume that $F : M \to M$ is a contraction. The basic equations are $e = r - c$, $c = F(e)$; i.e., $e + F(e) = r$. Prove the existence and uniqueness of a solution $e$ to this system for given $r$.
Note. This system can be thought of as follows. A signal (such as an electrical impulse) $r$ enters the system, joined by a return signal $c$ to produce $e = r - c$. A "black box" $F$ modifies $e$ to $c = F(e)$; the signal $c$ is then fed back to the incoming signal $r$.
Solution Note that for two solutions $e_1$ and $e_2$, $e_1 + F(e_1) = e_2 + F(e_2)$ and therefore
$$\|e_1 - e_2\| = \|F(e_1) - F(e_2)\|.$$
Since $F$ is a contraction, we have a contradiction unless $e_1 = e_2$. Hence, if there is a solution $e$, it is unique. We now prove the existence of a solution for a given "input" $r$. Let $G_r(e) = r - F(e)$. We want $e = G_r(e)$. We claim that $G_r$ is a contraction. Indeed, if $k$ is the contraction constant of $F$,
$$\|G_r(e_1) - G_r(e_2)\| = \|F(e_1) - F(e_2)\| \le k\|e_1 - e_2\|.$$
Since $k < 1$, $G_r$ is a contraction, and so by 5.7.1 it has a unique fixed point $e$, which solves the system.
Proof [...] $\delta > 0$ such that $d(x,x_0) < \delta$, $x \in A$ $\Rightarrow$ $\rho(f_N(x), f_N(x_0)) < \varepsilon/3$. Then, for $d(x,x_0) < \delta$, $\rho(f(x),f(x_0)) \le \rho(f(x),f_N(x)) + \rho(f_N(x),f_N(x_0)) + \rho(f_N(x_0),f(x_0)) < \varepsilon/3 + \varepsilon/3 + \varepsilon/3 = \varepsilon$. Since $x_0$ is arbitrary, $f$ is continuous at each point of $A$; hence it is continuous. •
Theorem Proofs for Chapter 5
5.2.1 Cauchy Criterion Let $N$ be a metric space with metric $\rho$, and let $A$ be a set. Suppose that $N$ is complete and $f_k : A \to N$ is a sequence of functions. Then $f_k$ converges uniformly on $A$ iff for every $\varepsilon > 0$ there is an $N$ such that $l, k \ge N$ implies $\rho(f_k(x), f_l(x)) < \varepsilon$ for all $x \in A$.
Proof If $f_k \to f$ uniformly, then given $\varepsilon > 0$, we can find an integer $N$ such that $k \ge N$ implies $\rho(f_k(x), f(x)) < \varepsilon/2$ for all $x$. Then if $k, l \ge N$, $\rho(f_k(x), f_l(x)) \le \rho(f_k(x), f(x)) + \rho(f(x), f_l(x)) < \varepsilon/2 + \varepsilon/2 = \varepsilon$. Conversely, if, given $\varepsilon > 0$, we can find an $N$ such that $k, l \ge N$ implies $\rho(f_k(x), f_l(x)) < \varepsilon$ for all $x$, then $f_k(x)$ is a Cauchy sequence at each point $x$, and so $f_k(x)$ converges pointwise to something, which we denote by $f(x)$. Moreover, we can find an $N$ such that $k, l \ge N$ implies $\rho(f_k(x), f_l(x)) < \varepsilon/2$ for all $x$. Since $f_k(x) \to f(x)$ at each point $x$, we can find for each $x$ an $N_x$ such that $l \ge N_x$ implies $\rho(f_l(x), f(x)) < \varepsilon/2$. Let $l \ge \max\{N, N_x\}$. Then $k \ge N$ implies $\rho(f_k(x), f(x)) \le \rho(f_k(x), f_l(x)) + \rho(f_l(x), f(x)) < \varepsilon/2 + \varepsilon/2 = \varepsilon$. Since this is true for each point $x$, we have found an $N$ such that $k \ge N$ implies $\rho(f_k(x), f(x)) < \varepsilon$ for all $x$. Hence $f_k \to f$ (uniformly). •
5.2.2 Weierstrass M test Suppose that $V$ is a complete normed vector space and $g_k : A \to V$ are functions such that there exist constants $M_k$ with $\|g_k(x)\| \le M_k$ for all $x \in A$, and $\sum_{k=1}^{\infty} M_k$ converges. Then $\sum_{k=1}^{\infty} g_k$ converges uniformly (and absolutely).
Proof Since $\sum_{k=1}^{\infty} M_k$ converges, for every $\varepsilon > 0$ there is an $N$ such that $k \ge N$ implies $|M_k + \cdots + M_{k+p}| < \varepsilon$ for all $p = 1, 2, \dots$. For $k \ge N$, we have, by the triangle inequality,
$$\|g_k(x) + \cdots + g_{k+p}(x)\| \le \|g_k(x)\| + \cdots + \|g_{k+p}(x)\| \le M_k + \cdots + M_{k+p} < \varepsilon$$
for all $x \in A$. Thus, by the Cauchy criterion for series, $\sum_{k=1}^{\infty} g_k$ converges uniformly. •
5.3.1 Theorem Suppose that $f_1, f_2, f_3, \dots$ are integrable functions on a closed bounded interval $[a,b]$, and that they converge uniformly to a limit function $f$ on $[a,b]$. Then $f$ is integrable on $[a,b]$ and
$$\lim_{n \to \infty} \int_a^b f_n(x)\,dx = \int_a^b f(x)\,dx.$$
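Proof Let us first assume that $f$ is integrable. To prove the limit relation, recall that if $|g(x)| \le M$ then
$$\Bigl|\int_a^b g(x)\,dx\Bigr| \le M(b - a).$$
Given $\varepsilon > 0$, choose $N$ such that $k \ge N$ implies $|f_k(x) - f(x)| < \varepsilon/(b - a)$. Then
$$\Bigl|\int_a^b f_k(x)\,dx - \int_a^b f(x)\,dx\Bigr| = \Bigl|\int_a^b (f_k(x) - f(x))\,dx\Bigr| \le \varepsilon,$$
as required. The harder part is to show that $f$ is integrable. To do this, we use Definition 4.8.1, which is equivalent to
$$\overline{\int_a^b} f(x)\,dx = \underline{\int_a^b} f(x)\,dx.$$
Let $\varepsilon > 0$ be given and choose $N$ so that $|f_n(x) - f(x)| < \varepsilon$ if $n \ge N$ and $a \le x \le b$. In particular, this implies that $f$ is bounded (why?). Note that if $\alpha$ is an upper bound for $f_n$ on $[x_i, x_{i+1}]$, then $\alpha + \varepsilon$ is an upper bound for $f$ on the same interval. Thus, $\sup\{f(x) \mid x \in [x_i, x_{i+1}]\} \le \sup\{f_n(x) \mid x \in [x_i, x_{i+1}]\} + \varepsilon$ and hence
$$U(f, P) \le U(f_n, P) + \varepsilon(b - a).$$
Similarly,
$$L(f_n, P) - \varepsilon(b - a) \le L(f, P).$$
Taking the inf's and sup's over $P$, we get
$$\overline{\int_a^b} f(x)\,dx \le \int_a^b f_n(x)\,dx + \varepsilon(b - a)$$
and
$$\int_a^b f_n(x)\,dx - \varepsilon(b - a) \le \underline{\int_a^b} f(x)\,dx.$$
Thus,
$$\overline{\int_a^b} f(x)\,dx - \underline{\int_a^b} f(x)\,dx \le 2\varepsilon(b - a);$$
this being valid for all $\varepsilon > 0$, the upper and lower integrals are equal, so $f$ is integrable. •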
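5.3.2 Corollary Suppose that the functions $g_k : [a,b] \to \mathbb{R}$ are Riemann integrable and $\sum_{k=1}^{\infty} g_k$ converges uniformly on $[a,b]$. Then we may interchange the order of integration and summation:
$$\int_a^b \Bigl(\sum_{k=1}^{\infty} g_k(x)\Bigr)\,dx = \sum_{k=1}^{\infty} \Bigl(\int_a^b g_k(x)\,dx\Bigr).$$
Proof Let $f_n = \sum_{k=1}^{n} g_k$; then $f_n \to f = \sum_{k=1}^{\infty} g_k$ (uniformly), and so, by 5.3.1,
$$\int_a^b f_n(x)\,dx \to \int_a^b f(x)\,dx.$$
Since $\int_a^b f_n(x)\,dx = \sum_{k=1}^{n} \int_a^b g_k(x)\,dx$, this is the assertion. •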
5.3.3 Theorem Let $f_k : ]a,b[ \to \mathbb{R}$ be a sequence of differentiable functions on the open interval $]a,b[$ converging pointwise to $f : ]a,b[ \to \mathbb{R}$. Suppose that the derivatives $f_k'$ are continuous and converge uniformly to a function $g$. Then $f$ is differentiable and $f' = g$.
Proof Write $f_k(x) = f_k(x_0) + \int_{x_0}^{x} f_k'(t)\,dt$, where $a < x_0 < b$. This is possible by the fundamental theorem of calculus. Letting $k \to \infty$, we get $f(x) = f(x_0) + \int_{x_0}^{x} g(t)\,dt$, using Theorem 5.3.1. Since $g$ is continuous by 5.1.4, the fundamental theorem of calculus shows that the right side is a differentiable function of $x$ with derivative $g(x)$. Hence the left side is also differentiable, and so $f'(x) = g(x)$. •
5.5.1 Theorem
i. If $(M,d)$ and $(N,\rho)$ are metric spaces, then so is $C_b(A,N)$; that is, $d(f,g)$ satisfies
a. $d(f,g) \ge 0$ and $d(f,g) = 0$ iff $f = g$.
b. $d(f,g) = d(g,f)$.
c. $d(f,g) \le d(f,h) + d(h,g)$.
ii. If $N$ is a normed space, then so is $C_b(A,N)$; that is, $\|\cdot\|$ satisfies
a. $\|f\| \ge 0$ and $\|f\| = 0$ iff $f = 0$.
b. $\|\alpha f\| = |\alpha|\,\|f\|$ for $\alpha \in \mathbb{R}$, $f \in C_b$.
c. $\|f + g\| \le \|f\| + \|g\|$ (triangle inequality).
Proof
i. Each of a and b is a routine check. To prove c, write $d(f,g) = \sup\{\rho(f(x),g(x)) \mid x \in A\}$. For each $x \in A$,
$$\rho(f(x),g(x)) \le \rho(f(x),h(x)) + \rho(h(x),g(x)),$$
and so by definition of least upper bound, $d(f,g) \le \sup\{\rho(f(x),h(x)) + \rho(h(x),g(x)) \mid x \in A\}$. Now $d(f,h) + d(h,g)$ is an upper bound for the set on the right-hand side, and so it is greater than or equal to the sup. Thus, $d(f,g) \le d(f,h) + d(h,g)$.
ii. a and b are clear. For c,
$$\|f + g\| = \sup\{\|(f + g)(x)\| \mid x \in A\} \le \sup\{\|f(x)\| + \|g(x)\| \mid x \in A\}$$
by the triangle inequality. Now $\|f\| + \|g\|$ is an upper bound for the set on the right-hand side, since $\|f(x)\| \le \|f\|$ and $\|g(x)\| \le \|g\|$ for all $x \in A$. Thus, it is greater than or equal to the sup of this set, and so $\|f + g\| \le \|f\| + \|g\|$, as required. •
5.5.2 Theorem $(f_k \to f$ (uniformly on $A$)$)$ $\Leftrightarrow$ $(f_k \to f$ in $C_b)$.
Proof This is nothing more than a transcription of the definitions. The student should write it out. •
5.5.3 Theorem If $N$ is a complete metric space, then $C_b(A,N)$ is a complete metric space. If $N$ is a Banach space, so is $C_b(A,N)$.
Proof Let $f_n \in C_b(A,N)$ be a Cauchy sequence. Then $f_n$ satisfies the Cauchy criterion (5.2.1), and so it converges uniformly to a function $f$. By 5.1.4, $f$ is continuous, and, taking, say, $\varepsilon = 1$ in the definition of convergence, we see that $f_n$ bounded implies $f$ is bounded. Thus, $f \in C_b(A,N)$. Hence $C_b(A,N)$ is complete. The second assertion is a special case of the first. •
5.6.2 Arzela-Ascoli Theorem Let $A \subset M$ be compact and $B \subset C(A,N)$. Then $B$ is compact if and only if $B$ is closed, equicontinuous, and pointwise compact.
Proof To prove this, we first prove a lemma.
Lemma Let $A$ be compact. Then for any $\delta > 0$ there is a finite set $C_\delta = \{y_1, \dots, y_k\}$ such that each $x \in A$ is within $\delta$ of some $y_i \in C_\delta$.
Proof The collection of balls $\{D(x,\delta) \mid x \in A\}$ covers $A$, and so, by compactness, a finite number, say $D(y_1,\delta), \dots, D(y_k,\delta)$, do as well. Let $C_\delta = \{y_1, \dots, y_k\}$. ▼
Now we turn to the proof of the theorem. Let $C = \bigcup\{C_{1/n} \mid n = 1, 2, 3, \dots\}$. Since each $C_{1/n}$ is finite, $C$ is countable; say, $C = \{x_1, x_2, \dots\}$. Let $f_n$ be our sequence in $B$. Now $\{f_n\}$ is contained in the pointwise compact set $B$, and so, from the Bolzano-Weierstrass theorem, there is a subsequence of $f_n(x_1)$ that is convergent. Let us denote this subsequence by
$$f_{11}(x_1), f_{12}(x_1), \dots, f_{1n}(x_1), \dots.$$
Similarly, the sequence $f_{1k}(x_2)$, $k = 1, 2, \dots$, has a subsequence
$$f_{21}(x_2), f_{22}(x_2), \dots, f_{2n}(x_2), \dots,$$
which is convergent. Continuing the process, the sequence $f_{2k}(x_3)$, $k = 1, 2, \dots$, has a subsequence
$$f_{31}(x_3), f_{32}(x_3), \dots, f_{3n}(x_3), \dots,$$
which is convergent. We proceed in this way and then set $g_n = f_{nn}$, so that $g_n$ is the $n$th function occurring in the $n$th subsequence.
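Diagrammatically, $g_n$ is obtained by picking out the diagonal:
$$\begin{array}{lllllll}
f_{11} & f_{12} & f_{13} & \cdots & f_{1n} & \cdots & \text{(first subsequence)} \\
f_{21} & f_{22} & f_{23} & \cdots & f_{2n} & \cdots & \text{(second subsequence)} \\
f_{31} & f_{32} & f_{33} & \cdots & f_{3n} & \cdots & \text{(third subsequence)} \\
\vdots & & & & & & \\
f_{n1} & f_{n2} & f_{n3} & \cdots & f_{nn} & \cdots & \text{($n$th subsequence)}
\end{array}$$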
This trick is called the "diagonal process" and is useful in a variety of situations. From its construction, we see that the sequence $g_n$ converges at each point of $C$; indeed, $g_n$ is a subsequence of each sequence $f_{mk}$, $k = 1, 2, \dots$. We shall now prove that the sequence $g_n$ converges at each point of $A$ and that the convergence is uniform, and this will prove the theorem. To do this, let $\varepsilon > 0$ and let $\delta$ be as in the definition of equicontinuity. Let $C_\delta = \{y_1, \dots, y_k\}$ be a finite subset of $C$ such that every point of $A$ is within $\delta$ of some point in $C_\delta$ (see the lemma). Since the sequences
$$g_n(y_1), \dots, g_n(y_k)$$
all converge, there is an integer $N$ such that if $m, n \ge N$, $\rho(g_m(y_i), g_n(y_i))$ [...]
[...] $d(\Phi(x), \Phi(y)) \le k\,d(x,y) < k\delta = \varepsilon$. Consider $x_{n+1} = \Phi(x_n)$; $x_{n+1} \to x_*$, and by the continuity of $\Phi$, $\Phi(x_n) \to \Phi(x_*)$. Thus, $x_* = \Phi(x_*)$, so $x_*$ is a fixed point. Finally, we prove the uniqueness of the fixed point $x_*$. Let $y_*$ be another fixed point, i.e., $\Phi(y_*) = y_*$. Then
$$d(x_*, y_*) = d(\Phi(x_*), \Phi(y_*)) \le k\,d(x_*, y_*);$$
i.e., $(1 - k)\,d(x_*, y_*) \le 0$. But $k < 1$, and so $(1 - k) > 0$, implying $d(x_*, y_*) = 0$, i.e., $x_* = y_*$, and thus the fixed point is unique. •
5.7.2 Theorem Under the assumptions in the text, there is a $\delta > 0$ such that the equation $dx/dt = f(t,x)$, $x(t_0) = x_0$ has a unique $C^1$ solution $x = \varphi(t)$, with
[...]
$$\sum_{k=0}^{n} (k - nx)^2 r_k(x) = n^2x^2 \sum_{k=0}^{n} r_k(x) - 2nx \sum_{k=0}^{n} k\,r_k(x) + \sum_{k=0}^{n} k^2 r_k(x) = n^2x^2 - 2nx \cdot nx + [nx + n(n-1)x^2] = nx(1 - x). \tag{4}$$
Now choose $M$ such that $|f(x)| \le M$ on $[0,1]$. Since $f$ is uniformly continuous, there is, for given $\varepsilon > 0$, a $\delta > 0$ such that $|x - y| < \delta$ implies $|f(x) - f(y)| < \varepsilon$. We want to estimate the expression
$$|f(x) - p_n(x)| = \Bigl|f(x) - \sum_{k=0}^{n} f(k/n)\,r_k(x)\Bigr| = \Bigl|\sum_{k=0}^{n} \bigl(f(x) - f(k/n)\bigr) r_k(x)\Bigr|.$$
To do this, divide this sum into two parts: those terms for which $|k - nx| < \delta n$, and those for which $|k - nx| \ge \delta n$. If $|k - nx| < \delta n$, then $|x - (k/n)| < \delta$, so that $|f(x) - f(k/n)| < \varepsilon$, and therefore, remembering that $r_k(x) \ge 0$, these terms give a sum $\le \varepsilon \sum_{k=0}^{n} r_k(x) = \varepsilon$. The terms of the second type have a sum that is
$$\le 2M \sum_{|k - nx| \ge \delta n} r_k(x) \le \frac{2M}{n^2\delta^2} \sum_{k=0}^{n} (k - nx)^2 r_k(x),$$
since $|(k - nx)/n\delta| \ge 1$ for these terms. By (4), this sum is bounded by
$$\frac{2Mx(1 - x)}{n\delta^2}$$
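The quantities in this estimate can be computed directly. The sketch below is a hedged illustration, assuming $r_k(x) = \binom{n}{k} x^k (1-x)^{n-k}$ and $p_n(x) = \sum_{k=0}^{n} f(k/n)\,r_k(x)$ as in the display above; the test function $f(x) = |x - 1/2|$, the sampling grid, and all function names are illustrative choices.

```python
from math import comb

def r(n, k, x):
    """Bernstein basis term r_k(x) = C(n, k) x^k (1 - x)^(n - k)."""
    return comb(n, k) * x**k * (1 - x) ** (n - k)

def bernstein(f, n, x):
    """p_n(x) = sum_{k=0}^{n} f(k/n) r_k(x)."""
    return sum(f(k / n) * r(n, k, x) for k in range(n + 1))

def second_moment(n, x):
    """Left side of identity (4): sum_{k=0}^{n} (k - n x)^2 r_k(x)."""
    return sum((k - n * x) ** 2 * r(n, k, x) for k in range(n + 1))

f = lambda x: abs(x - 0.5)  # assumed test function: continuous, with a corner at 1/2
grid = [i / 200 for i in range(201)]

def sup_error(n):
    """Largest |f(x) - p_n(x)| over the sampling grid."""
    return max(abs(f(x) - bernstein(f, n, x)) for x in grid)

errors = {n: sup_error(n) for n in (10, 40, 160)}
print(errors)
print(second_moment(50, 0.3), 50 * 0.3 * (1 - 0.3))
```

The sampled error should decrease as $n$ grows, and the two numbers on the last line should agree, in line with identity (4).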
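[...] $n_0$. Then
$$|f(x)| \le (1 - x)\Bigl|\sum_{k=1}^{n_0} s_k x^k\Bigr| + (1 - x)\sum_{k > n_0} \varepsilon x^k \le (1 - x)\Bigl|\sum_{k=1}^{n_0} s_k x^k\Bigr| + (1 - x) \cdot \varepsilon x^{n_0 + 1}(1 - x)^{-1} \le (1 - x)\Bigl|\sum_{k=1}^{n_0} s_k x^k\Bigr| + \varepsilon.$$
Therefore, $\limsup_{x \to 1^-} |f(x)| \le \varepsilon$. Since $\varepsilon > 0$ was arbitrary, $\lim_{x \to 1^-} f(x) = 0$. •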
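5.10.7 Theorem $\sum_{k=0}^{\infty} a_k = A$ (C, 1) implies $\sum_{k=0}^{\infty} a_k = A$ (Abel).
Proof As before, we may suppose that $A = 0$. Write $S_n = \sum_{k=0}^{n} a_k$, $T_n = \sum_{k=0}^{n} S_k$. By assumption, $T_n = o(n)$. Here we use the little "oh" notation, meaning that for any $\varepsilon > 0$, $|o(n)| \le \varepsilon n$ for $n$ large enough. Hence $S_n = T_n - T_{n-1} = o(n)$ and $a_n = S_n - S_{n-1} = o(n)$. By comparison with a multiple of the convergent series $\sum k x^k$, all three series $\sum a_k x^k$, $\sum S_k x^k$, and $\sum T_k x^k$ converge if $|x| < 1$. Also,
$$f(x) = \sum a_k x^k = (1 - x)\sum S_k x^k = (1 - x)^2 \sum T_k x^k.$$
Now, since $T_n = o(n)$, given $\varepsilon > 0$ we may choose $n_0$ so that $n \ge n_0$ implies $|T_n| \le \varepsilon n$. Accordingly,
$$|f(x)| \le (1 - x)^2 \Bigl|\sum_{k \le n_0} T_k x^k\Bigr| + (1 - x)^2 \Bigl|\sum_{k > n_0} \varepsilon k x^k\Bigr| \le (1 - x)^2 \Bigl|\sum_{k \le n_0} T_k x^k\Bigr| + (1 - x)^2 \cdot \varepsilon x (1 - x)^{-2},$$
and we find $\limsup_{x \to 1^-} |f(x)| \le \varepsilon$. Thus, as in the previous theorem, $\lim_{x \to 1^-} f(x) = 0$. •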
5.10.8 Theorem If $\sum a_n = A$ (C, 1) and if $a_n = O(1/n)$ (that is, if $|a_n| \le C/n$ for a constant $C$), then $\sum a_n$ converges (to $A$) in the usual sense.
Proof As in the preceding proofs, we may suppose that $A = 0$. Write $S_n = \sum_{k=1}^{n} a_k$, $T_n = \sum_{k=1}^{n} S_k$. Then the first hypothesis is written as $T_n = o(n)$, as in the preceding proof. We want to show that $S_n \to 0$. If not, then for some $\delta > 0$, $|S_n| \ge \delta$ for infinitely many indices $n$. It can be assumed (by reversing all signs if need be) that $S_n \ge \delta$ for infinitely many values of $n$. But if $S_n \ge \delta$ and $r > n$, we have
$$S_r = S_n + a_{n+1} + a_{n+2} + \cdots + a_r \ge \delta - C\Bigl(\frac{1}{n+1} + \cdots + \frac{1}{r}\Bigr) > \delta - C \log\frac{r}{n}.$$
This will be $\ge \delta/2$ provided that $C \log(r/n) \le \delta/2$, that is, $r/n \le e^{\delta/2C} = \lambda$. (Note that $\lambda > 1$.) Hence, we have
$$\frac{\delta}{2}\bigl([\lambda n] - n\bigr) \le \sum_{r=n+1}^{[\lambda n]} S_r = T_{[\lambda n]} - T_n.$$
(Here $[x]$ means the largest integer less than or equal to $x$.) Now the right side of this inequality is $o(n)$, but the left side is of the order $(\lambda - 1)\delta n/2$, a contradiction. Hence $S_n$ must tend to 0.
•
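To make the two summation methods concrete, here is a short sketch comparing them on the series $1 - 1 + 1 - 1 + \cdots$, which diverges in the usual sense. This example series and the function names are assumptions chosen for illustration; the (C, 1) mean below averages the partial sums $S_0, \dots, S_n$, and the Abel value is the truncated power series $\sum a_k x^k$ at a point $x$ slightly below 1.

```python
def partial_sums(a, n):
    """Return [S_0, ..., S_n] for the series with terms a(0), a(1), ..."""
    sums, s = [], 0.0
    for k in range(n + 1):
        s += a(k)
        sums.append(s)
    return sums

def cesaro_mean(a, n):
    """(C, 1) mean: the average of the partial sums S_0, ..., S_n."""
    sums = partial_sums(a, n)
    return sum(sums) / len(sums)

def abel_sum(a, x, n_terms=100000):
    """Truncated power series sum_{k < n_terms} a(k) x^k, for 0 <= x < 1."""
    total, power = 0.0, 1.0
    for k in range(n_terms):
        total += a(k) * power
        power *= x
    return total

grandi = lambda k: (-1) ** k  # assumed example: 1 - 1 + 1 - 1 + ...

c_mean = cesaro_mean(grandi, 9999)
a_val = abel_sum(grandi, 0.999)
print(c_mean, a_val)
```

For this series $\sum a_k x^k = 1/(1+x)$, so both values should be close to $1/2$, consistent with Theorem 5.10.7. Since the terms here are not $O(1/n)$, Theorem 5.10.8 does not apply, and indeed the series does not converge in the usual sense.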
Worked Examples for Chapter 5
Example 5.1
a. If $f_k \to f$ (pointwise) and $g_k \to g$ (pointwise), show that $f_k + g_k \to f + g$ (pointwise) for functions $f, g : A \subset \mathbb{R}^n \to \mathbb{R}^m$.
b. Answer the same question for uniform convergence.
Solution
a. For $x \in A$, we must show that $(f_k + g_k)(x) \to (f + g)(x)$. Given $\varepsilon > 0$, choose $N_1$ such that $k \ge N_1$ implies $\|f_k(x) - f(x)\| < \varepsilon/2$ and $N_2$ such that $k \ge N_2$ implies $\|g_k(x) - g(x)\| < \varepsilon/2$. Then let $N = \max(N_1, N_2)$, so that $k \ge N$ implies (by the triangle inequality)
$$\|(f_k + g_k)(x) - (f + g)(x)\| \le \|f_k(x) - f(x)\| + \|g_k(x) - g(x)\| < \varepsilon.$$
b. Repeat the argument in a where each statement is to hold for all $x \in A$. •
Example 5.2 Prove that a sequence $f_k : A \to \mathbb{R}^m$ converges pointwise (uniformly) iff its components converge pointwise (uniformly).
Solution The part of the example on pointwise convergence follows from the fact that a sequence in $\mathbb{R}^m$ converges iff its components do (see Chapter 2). However, we write out the argument again so that its validity for uniform convergence can be seen.
Let $x = (x^1, \dots, x^m) \in \mathbb{R}^m$. Then $|x^i| \le \|x\| \le \sum_{i=1}^{m} |x^i|$. Indeed, the first inequality is obvious, and the second follows from the triangle inequality if we write $x = (x^1, 0, \dots, 0) + (0, x^2, 0, \dots, 0) + \cdots + (0, 0, \dots, x^m)$. Applied to $f_k = (f_k^1, \dots, f_k^m)$, we have
$$|f_k^i(x) - f^i(x)| \le \|f_k(x) - f(x)\| \le \sum_{i=1}^{m} |f_k^i(x) - f^i(x)|.$$
Hence $f_k$ converging pointwise implies that $f_k^i$ converges pointwise. Conversely, suppose that $f_k^i(x)$ converges for each $i$ and $x$. Choose $N_i$ such that $k, l \ge N_i$ implies $|f_k^i(x) - f_l^i(x)| < \varepsilon/m$. Then if $N = \max(N_1, \dots, N_m)$, $k, l \ge N$ implies $\|f_k(x) - f_l(x)\| < \varepsilon/m + \cdots + \varepsilon/m = \varepsilon$, and so $f_k(x)$ converges. For uniform convergence, we repeat the argument with each statement holding for all $x \in A$. •
Example 5.3 Find an example of a sequence $f_k$ that converges uniformly to zero on $[0, \infty[$, where each $\int_0^\infty f_k(x)\,dx$ exists, but $\int_0^\infty f_k(x)\,dx \to +\infty$. Does this contradict Theorem 5.3.1?
Solution Let
$$f_k(x) = \begin{cases} 1/k, & \text{if } 0 \le x \le k^2, \\ 0, & \text{if } x > k^2. \end{cases}$$
Then $f_k \to 0$ uniformly, since $|f_k(x)| \le 1/k$ for all $x$. However,
$$\int_0^\infty f_k(x)\,dx = k^2/k = k \to \infty.$$
This does not contradict Theorem 5.3.1 because that theorem dealt with finite intervals. •
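The two competing quantities in Example 5.3 can be tabulated with a few lines of code. This is only a sketch: the function names are hypothetical, and the midpoint Riemann sum is included merely as a numerical cross-check of the exact value $k$.

```python
def f(k, x):
    """f_k(x) = 1/k on [0, k^2] and 0 for x > k^2."""
    return 1.0 / k if 0 <= x <= k * k else 0.0

def sup_norm(k):
    """sup of |f_k| over [0, infinity): the step has constant height 1/k."""
    return 1.0 / k

def integral(k):
    """Exact integral of f_k over [0, infinity): height 1/k times width k^2."""
    return (1.0 / k) * (k * k)

def riemann_sum(k, steps_per_unit=100):
    """Midpoint Riemann sum of f_k over [0, k^2 + 1], as a cross-check."""
    h = 1.0 / steps_per_unit
    n = (k * k + 1) * steps_per_unit
    return sum(f(k, (i + 0.5) * h) for i in range(n)) * h

table = [(k, sup_norm(k), integral(k)) for k in (1, 10, 100)]
for row in table:
    print(row)
print(riemann_sum(5))
```

The middle column (the sup norm) tends to 0 while the last column (the integral) grows without bound, which is exactly the phenomenon the example describes.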
Example 5.4 (Dini's Theorem) Let $A \subset \mathbb{R}^n$ be compact and $f_k$ a sequence of continuous functions $f_k : A \to \mathbb{R}$ such that (a) $f_k(x) \ge 0$ for $x \in A$; (b) $f_k \to 0$ pointwise; and (c) $f_k(x) \le f_l(x)$ whenever $k \ge l$. Prove that $f_k \to 0$ uniformly.
Solution This example requires a little care, because we are trying to deduce uniform convergence from pointwise convergence plus some other hypotheses, and we know that the result is not true without these extra hypotheses (study Figure 5.1-1, where all the hypotheses here are valid, except $f_k(0) \to 0$ as $k \to \infty$).
Given $\varepsilon > 0$, we want to find an $N$ so that $|f_k(x)| < \varepsilon$ for all $k \ge N$ and $x \in A$. For each $x \in A$, find $N_x$ so that $|f_k(x)| < \varepsilon/2$ if $k \ge N_x$. We write $N_x$ to emphasize that this number depends on $x$. Here we have used hypothesis (b). By continuity of $f_k$ there is a neighborhood $U_{x,k}$ of $x$ such that $|f_k(y) - f_k(x)| < \varepsilon/2$ for $y \in U_{x,k}$. The neighborhoods $U_{x,N_x}$ form a covering of $A$, and so by compactness there is a finite subcover, say, centered at $x_1, \dots, x_M$. Let $N = \max(N_{x_1}, \dots, N_{x_M})$ and write $N_l = N_{x_l}$. Let $x \in A$ and $k \ge N$. Then $x \in U_{x_l, N_l}$ for some $l$, and so $|f_{N_l}(x) - f_{N_l}(x_l)| < \varepsilon/2$. Thus, using (a) and (c),
$$0 \le f_k(x) \le f_N(x) \le f_{N_l}(x) = f_{N_l}(x_l) + [f_{N_l}(x) - f_{N_l}(x_l)] < \varepsilon/2 + \varepsilon/2 = \varepsilon.$$
Therefore $|f_k(x)| < \varepsilon$ for all $k \ge N$ and $x \in A$. •
[...] $\varepsilon > 0$, there is an $N$ such that $n \ge N$ implies
$$\|g_n\| + \cdots + \|g_{n+p}\| < \varepsilon.$$
Choose an integer $N_1$ such that $\sigma(n) > N$ whenever $n > N_1$. (We can do this since there are only finitely many integers $n$ for which $\sigma(n) \le N$ because $\sigma$ is a bijection.) Thus, if $n > N_1$, we have $\sigma(n + k) > N$, and so
$$\|g_{\sigma(n)}\| + \cdots + \|g_{\sigma(n+p)}\| < \varepsilon.$$
By the Cauchy criterion, $\sum g_{\sigma(n)}$ converges absolutely. To show that the limits are the same, given $\varepsilon$, select $N_2 > N$, where $N$ is as before, such that if $1 \le l \le N$, then $l = \sigma(n)$ for some $n$, $1 \le n \le N_2$. This is possible because such $l$ are finite in number and $\sigma$ is onto. Let $N_0 = \max(N_1, N_2)$; then, for $m > N_0$,
$$\Bigl\|\sum_{k=1}^{m} g_{\sigma(k)} - \sum_{n=1}^{\infty} g_n\Bigr\| \le \Bigl\|\sum_{k=1}^{m} g_{\sigma(k)} - \sum_{n=1}^{N} g_n\Bigr\| + \Bigl\|\sum_{n=N+1}^{\infty} g_n\Bigr\|.$$
By construction, $\sum_{k=1}^{m} g_{\sigma(k)} - \sum_{n=1}^{N} g_n$ is a sum of some of the terms $g_l$ where $l > N$. Thus, the terms in the preceding equation are no larger than $\varepsilon + \varepsilon = 2\varepsilon$. Thus, the series $\sum_{k=1}^{\infty} g_{\sigma(k)}$ converges to $\sum_{n=1}^{\infty} g_n$, which is the desired conclusion. The result of this example is closely related to important rearrangement theorems for double series (see Exercise 51). •
Exercises for Chapter 5
1. a. Let $f_k$ be a sequence of functions from $A \subset \mathbb{R}^n$ to $\mathbb{R}^m$. Suppose there are constants $m_k$ such that $\|f_k(x) - f(x)\| \le m_k$ for all $x \in A$ and such that $m_k \to 0$. Prove that $f_k \to f$ uniformly.
b. If $m_k \to m \in \mathbb{R}$ and $\|f_k(x) - f_l(x)\| \le |m_k - m_l|$ for all $x \in A$, show that $f_k$ converges uniformly.
2.
Determine which of the following sequences converge (pointwise or uniformly) as $k \to \infty$. Check the continuity of the limit in each case.
a. $(\sin x)/k$ on $\mathbb{R}$
b. $1/(kx + 1)$ on $]0, 1[$
c. $x/(kx + 1)$ on $]0, 1[$
d. $x/(1 + kx^2)$ on $\mathbb{R}$
e. $(1, (\cos x)/k^2)$, a sequence of functions from $\mathbb{R}$ to $\mathbb{R}^2$
3. Determine which of the following real series $\sum_{k=1}^{\infty} g_k$ converge (pointwise or uniformly). Check the continuity of the limit in each case.
a. $g_k(x) = \begin{cases} 0, & x \le k, \\ (-1)^k, & x > k. \end{cases}$
b. $g_k(x) = \begin{cases} 1/k^2, & |x| \le k, \\ 1/x^2, & |x| > k. \end{cases}$
c. $g_k(x) = \dfrac{(-1)^k}{\sqrt{k}} \cos(kx)$ on $\mathbb{R}$.
d. $g_k(x) = x^k$ on $]0, 1[$.
4. Let $f_n : [1,2] \to \mathbb{R}$ be defined by $f_n(x) = x/(1 + x)^n$.
a. Prove that $\sum_n f_n(x)$ is convergent for $x \in [1,2]$.
b. Is it uniformly convergent?
c. Is $\int_1^2 \bigl(\sum_n f_n(x)\bigr)\,dx = \sum_n \int_1^2 f_n(x)\,dx$?
5.
Suppose that $f_k \to f$ uniformly, where $f_k : A \subset \mathbb{R}^n \to \mathbb{R}$; $g_k \to g$ uniformly, where $g_k : A \to \mathbb{R}^m$; there is a constant $M_1$ such that $\|g(x)\| \le M_1$ for all $x$; and there is a constant $M_2$ such that $|f(x)| \le M_2$ for all $x$. Show that $f_k g_k \to fg$ uniformly. Find a counterexample if $M_1$ or $M_2$ does not exist. Are $M_1$ and $M_2$ necessary for pointwise convergence?
6.
Prove that the sequence $f_k : A \to \mathbb{R}^m$ converges pointwise iff for each $x \in A$, $f_k(x)$ is a Cauchy sequence.
7. For functions $f : A \subset \mathbb{R}^n \to \mathbb{R}$, form $C_b$ as in the text. Show that $\|fg\| \le \|f\|\,\|g\|$.
8. Does pointwise convergence of continuous functions on a compact set to a continuous limit imply uniform convergence on that set?
9. Suppose that the functions $g_k$ are continuous and $\sum_{k=1}^{\infty} g_k$ converges uniformly on $A \subset \mathbb{R}^n$. If $x_k \to x_0$ in $A$, prove that $\sum_{n=1}^{\infty} g_n(x_k) \to \sum_{n=1}^{\infty} g_n(x_0)$ as $k \to \infty$.
10.
For the sequences and series of Exercises 2 and 3, when can we integrate or differentiate term by term?
11.
a.
Must a contraction on any metric space have a fixed point? Discuss.
b.
Let $f : X \to X$, where $X$ is a complete metric space (such as $\mathbb{R}$), satisfy $d(f(x), f(y)) < d(x,y)$ for all $x, y \in X$ such that $x \ne y$. Must $f$ have a fixed point? What if $X$ is compact?
12.
A function f : A - t JR, where A C ]Rn, is called lower semicontinuous if whenever xo E A and >. < f (xo), there is a neighborhood U of xo such- that >. 0 infJy-xJ 0, there is a 8 > 0 such that d(x, xo) < 8 implies d(f(x),f(xo)) < e for all f E B. Prove that B is equicontinuous.
23.
Let $f : \mathbb{R} \to \mathbb{R}$ and suppose that $f \circ f$ is continuous. Must $f$ be continuous?
24. A metric space $M$ is called second countable if there is a countable collection $U_1, U_2, \dots$ of open sets in $M$ such that every open set in $M$ is the union of members of this collection. Prove that such an $M$ has a countable subset $C$ such that $\operatorname{cl}(C) = M$. (We then say that $M$ is separable.) Prove conversely that a separable metric space is second countable.
25. Let $f : [0,1] \to \mathbb{R}$ be continuous and one-to-one. Show that $f$ is either increasing or decreasing.
26. Let $k(x,y)$ be a continuous real-valued function on the square $U = \{(x,y) \mid 0 \le x \le 1,\ 0 \le y \le 1\}$ and assume that $|k(x,y)| < 1$ for each $(x,y) \in U$. Let $A : [0,1] \to \mathbb{R}$ be continuous. Prove that there is a unique continuous real-valued function $f(x)$ on $[0,1]$ such that
$$f(x) = A(x) + \int_0^1 k(x,y) f(y)\,dy.$$
27. Let $f : ]a,b[ \to \mathbb{R}$ be uniformly continuous, and suppose that $x_n \in ]a,b[$ with $x_n \to b$. Show that $\lim_{n \to \infty} f(x_n)$ exists.
28. Let $f_n(x) = x/n$. Is $f_n$ uniformly convergent on $[0, 396]$? On $\mathbb{R}$?
29. Discuss the uniform continuity of the following:
a. $f(x) = x^2$, $x \in ]-1, 1[$.
b. $f(x) = x^{1/3}$, $x \in [0, \infty[$.
c. $f(x) = e^{-x}$, $x \in [0, \infty[$.
d. $f(x) = x \sin(1/x)$,
e. $f(x) = \sin[\ln(1 + x^3)]$, [...] $\varepsilon > 0$. Prove that there is a simple function $g$ such that $\|f - g\| < \varepsilon$.
40. a. Define $\delta : C([0,1],\mathbb{R}) \to \mathbb{R}$, $f \mapsto f(0)$. Prove that $\delta$ is linear and continuous.
b. Let $g : \mathbb{R} \to \mathbb{R}$ be continuous. Define $F : C([0,1],\mathbb{R}) \to C([0,1],\mathbb{R})$ by $F(f) = g \circ f$. Prove that $F$ is continuous; prove that if $g$ is uniformly continuous, then $F$ is uniformly continuous.
41. Show that there is a polynomial $p(x)$ such that $|p(x) - |x|^3| < 1/10$ for $-1000 \le x \le 1000$.
Study the possibility of replacing the sequence of Bernstein polynomials in Theorem 5.8.1 by a sequence of Lagrange interpolation polynomials (see Exercise 2, §5.8, for the definition and properties) to effect the proof of the theorems in §5.8.
43.
Let $C_e([-1,1],\mathbb{R})$ denote the set of even functions in $C([-1,1],\mathbb{R})$.
a. Show that $C_e$ is closed and not dense in $C$.
b. Show that the even polynomials are dense in $C_e$, but not in $C$.
44. Projects: Examine the possibility of extending the Stone-Weierstrass theorem to
a. Complex-valued functions (keep the same hypotheses on $B$, except add "$f \in B$ implies $\bar{f} \in B$," where the overbar denotes complex conjugation).
b. Noncompact domains (consult Simmons, Introduction to Topology and Modern Analysis, McGraw-Hill).
c. Use b to study the density of the Hermite functions in a suitable space of continuous functions (the Hermite functions are defined and studied in, for example, Courant-Hilbert, Methods of Mathematical Physics, Vol. I, Wiley).
45. a. Let $f_k : K \subset \mathbb{R}^n \to \mathbb{R}^m$ be an equicontinuous sequence of functions on a compact set $K$ converging pointwise. Prove that the convergence is uniform.
b. Let
$$f_n(x) = \frac{x^2}{x^2 + (1 - nx)^2}, \qquad 0 \le x \le 1.$$
Show that $f_n$ converges pointwise but not uniformly. What can you conclude from a?
46.
Let $f(t,x)$ be defined and continuous for $a \le t \le b$ and $x \in \mathbb{R}^n$. The purpose of this exercise is to show that the problem $dx/dt = f(t,x)$, $x(a) = x_0$, has a solution on an interval $t \in [a,c]$ for some $c > a$ (it is unique only under more stringent conditions). Perform the operations as follows: Divide $[a,b]$ into $n$ equal parts $t_0 = a, \dots, t_n = b$, and define a continuous function $x_n$ inductively by
$$\begin{cases} x_n'(t) = f(t_i, x_n(t_i)), & t_i < t < t_{i+1}, \\ x_n(a) = x_0. \end{cases}$$
Put $\Delta_n(t) = x_n'(t) - f(t, x_n(t))$, so that
$$x_n(t) = x_0 + \int_a^t \bigl[f(s, x_n(s)) + \Delta_n(s)\bigr]\,ds.$$
Use the Arzela-Ascoli theorem to find a convergent subsequence of the $x_n$. Show that the limit satisfies $dx/dt = f(t,x)$ and $x(a) = x_0$. This method is called polygonal approximation.
47.
Let $f_n : A \subset \mathbb{R}^n \to \mathbb{R}^m$ be a sequence of equicontinuous functions. Suppose that $f_n$ converges on a dense subset $K$ of $A$. Prove that the sequence converges on all of $A$. Does this shed any light on the proof of Theorem 5.6.2?
48.
Prove that the sup norm on $C([0,1],\mathbb{R})$ is not derived from an inner product $\langle\,,\rangle$ by $\|f\|^2 = \langle f, f\rangle$. [Hint: Show that the parallelogram identity fails.]
49.
Let S be a set and let B denote the set of all bounded real-valued functions on S; endow B with the sup norm. Prove that B is a Banach space.
50.
Let $f : \mathbb{R} \to \mathbb{R}$ be a uniform limit of polynomials. Prove that $f$ is a polynomial.
51.
Consider a double series
$$\sum_{m,n=0}^{\infty} a_{mn}, \qquad \text{where } a_{mn} \in \mathbb{R}, \quad m, n = 0, 1, 2, \ldots.$$
Say that it converges to $S$ if for any $\varepsilon > 0$ there is an $N$ such that $n, m \ge N$ implies
$$\left| \sum_{k=0}^{n} \sum_{j=0}^{m} a_{kj} - S \right| < \varepsilon.$$
Define absolute convergence and prove that if $\sum_{n,m=0}^{\infty} a_{nm}$ is absolutely convergent, then the sum can be rearranged as follows:
Interpret this result in terms of summing entries in an infinite matrix by rows and columns.
52.
Can we differentiate the series
term by term?
53.
54.
Evaluate the following limits: '
.
I - cosx
a.
hmx-o 3X
b.
lim,..__.o+(l+sin2x) 1/x
c.
1- - ~) limx-o+ (-.smx x
2X
_
Test the following infinite series for convergence or divergence:
a.
f f kit k=J
b.
/klogk k2 +2k+3 .
k=l
oo (k!)2 C.
55.
Ek=1
(2k)!
Prove that $\pi/4 = 1 - 1/3 + 1/5 - 1/7 + \cdots$ starting from
$$(1 + x^2)^{-1} = \sum_{n=0}^{\infty} (-1)^n x^{2n}.$$
57.
Prove that if
i. $f_n(x)$, $g(x)$ are continuous, $0 \le x < \infty$,
ii. $|f_n(x)| \le g(x)$, $n = 1, 2, 3, \ldots$, $0 \le x < \infty$,
iii. $f_n(x) \to f(x)$ uniformly, $0 \le x \le R$, for any $R < \infty$, and
iv. $\int_0^\infty g(x)\,dx < \infty$,
then
$$\lim_{n \to \infty} \int_0^\infty f_n(x)\,dx = \int_0^\infty f(x)\,dx.$$
58.
Prove the following convergence tests;
o.
a.
If u,. >
b.
If Un> 0,
a.
u,.+i
< 1-
u,. -
!n - nlogn' _a_
a
> 1, then I: Un converges.
l , then "L.., Un diverges. . -Un+I 2:: l - -1 - - -
n
Un
nlogn
Letp > I with I/p+ 1/q = 1. For a,b,t
> 0, prove that
d'fP bqt-q ab l, then
n
) 1/p
+ ~bf
1. For which complex x with
61.
Let $\sum a_k x^k$ have radius of convergence $R$. Show that $\sum a_k (x - b)^k$ converges inside the disk of center $b$, radius $R$.
62.
Find the radius of convergence of
a.
$\sum x^k\, k/(k + 1)$
b.
$\sum x^k/\log k$
63.
(Binomial series) Consider
$$\sum_{k=0}^{\infty} \frac{\alpha(\alpha - 1) \cdots (\alpha - k + 1)}{k!}\, x^k.$$
Assume that $\alpha$ is not an integer $\ge 0$. Show that the radius of convergence is $R = 1$. (See Exercise 49, Chapter 2, on the hypergeometric series, for behavior of the series at $x = \pm 1$.)
64.
Does $1 + 1/2 + 1/3 + \cdots$ converge (C, 1) or (Abel)?
65.
Let $f(x)$ be continuous, $0 \le x < \infty$. We normally define
$$\int_0^\infty f(x)\,dx = \lim_{R \to +\infty} \int_0^R f(x)\,dx,$$
if the limit exists. By analogy with (C, 1) summability, define a notion of "(C, 1) integrability from 0 to $\infty$," and prove that your method of integrability is regular, that is, agrees with the usual one if the latter converges.
66.
Define $t_n$ inductively by $t_1 = 1$ and $t_{n+1} = t_n/(1 + t_n^{\beta})$, where $\beta$ is fixed, $0 \le \beta < 1$. Prove that $\sum_{n=1}^{\infty} t_n$ converges. [Hint: Find a constant $C$ such that $t_n \le C n^{-1/\beta}$.]
67.
Let A= {j/2n E [0,1] j n = 1,2,3, ... , j = o,1,2, ... ,2n}, and let / : A - JR satisfy the following condition: There is a sequence En > 0 with 1 En < oo and
E:
ke~n
1) - / (
1n) I
0, j = 1, 2, ... , 2".
Prove that f has a unique extension to a continuous function from [O, 1] to R
68.
Let $A \subset \mathbb{R}^m$ be compact and let $B \subset C(A, \mathbb{R}^m)$ be compact. Prove that $B$ is equicontinuous as follows:
a.
Prove that the map $E : C(A, \mathbb{R}^m) \times A \to \mathbb{R}^m$ defined by $(f, x) \mapsto f(x)$ is continuous.
b.
Use uniform continuity of $E$ restricted to $B \times A$ to deduce the result. (This method of proof is due to J. Allen.)
69.
Let the functions $f_n$ be monotone increasing and continuous on $[0, 1]$. Suppose that $F(x) = \sum_{n=1}^{\infty} f_n(x)$ converges for each $x \in [0, 1]$. Prove that $F$ is continuous.
Chapter 6 Differentiable Mappings
In this chapter we develop the notion of a differentiable map from $\mathbb{R}^n$ to $\mathbb{R}^m$ (or, more generally, from one normed space to another). Starting with this chapter, some linear algebra will be used. At this point, the reader might review the notion of a linear transformation and its matrix representation, since we shall be defining the derivative as a linear mapping. The connection between the general definition and partial derivatives will be found in §6.2. After this we generalize the usual theorems of calculus from the one-variable context treated in §4.7 to the multivariable case (such as the theorem stating that differentiability implies continuity, the chain rule, the mean value theorem, Taylor's theorem, tests for extrema, and so forth).
§6.1 Definition of the Derivative
A function of one variable $f : \,]a, b[\, \to \mathbb{R}$ is called differentiable at $x_0 \in \,]a, b[$ if the limit
$$f'(x_0) = \lim_{h \to 0} \frac{f(x_0 + h) - f(x_0)}{h}$$
exists. Equivalently, we may write this formula as
$$\lim_{h \to 0} \frac{f(x_0 + h) - f(x_0) - f'(x_0)h}{h} = 0,$$
that is,
$$\lim_{x \to x_0} \frac{f(x) - f(x_0) - f'(x_0)(x - x_0)}{x - x_0} = 0$$
or, what is the same,
$$\lim_{x \to x_0} \frac{|f(x) - f(x_0) - f'(x_0)(x - x_0)|}{|x - x_0|} = 0.$$
The number $f'(x_0)$ represents the slope of the line tangent to the graph of $f$ at the point $(x_0, f(x_0))$, as in Figure 6.1-1.
FIGURE 6.1-1 The derivative is the slope of the tangent line
Differentiability of $f : \mathbb{R} \to \mathbb{R}$ at $x_0$ is equivalent to the existence of a number $m$ such that
$$\lim_{x \to x_0} \frac{|f(x) - f(x_0) - m(x - x_0)|}{|x - x_0|} = 0.$$
The function $T(s) = ms$ is linear: $T(\alpha s + \beta t) = \alpha T(s) + \beta T(t)$ for all real $\alpha$, $\beta$, $s$, and $t$. To generalize this notion to maps $f : A \subset \mathbb{R}^n \to \mathbb{R}^m$, we make the following definition.
6.1.1 Definition A map $f : A \subset \mathbb{R}^n \to \mathbb{R}^m$ is said to be differentiable at $x_0 \in A$ if there is a linear function, denoted $Df(x_0) : \mathbb{R}^n \to \mathbb{R}^m$ and called the derivative of $f$ at $x_0$, such that
$$\lim_{x \to x_0} \frac{\|f(x) - f(x_0) - Df(x_0)(x - x_0)\|}{\|x - x_0\|} = 0.$$
In this definition, $Df(x_0)(x - x_0)$ denotes the value of the linear map $Df(x_0)$ applied to the vector $x - x_0 \in \mathbb{R}^n$, and so $Df(x_0)(x - x_0) \in \mathbb{R}^m$. We often write $Df(x_0) \cdot h$ for $Df(x_0)(h)$. We shall be concerned almost exclusively with the Euclidean case, but note right here that if $V$, $W$ are normed spaces and $f : A \subset V \to W$, then $Df(x_0) : V \to W$, and the definition makes sense.
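The limit in Definition 6.1.1 can be explored numerically. The sketch below is an illustration under stated assumptions: the map $f(x, y) = (xy, x + y^2)$, the base point $(1, 2)$, and the candidate linear map `Df` are chosen here and do not come from the text. It evaluates the quotient $\|f(x) - f(x_0) - Df(x_0)(x - x_0)\| / \|x - x_0\|$ along a sequence of points approaching $x_0$.

```python
import math

def f(x, y):
    return (x * y, x + y * y)

def Df(x0, y0, h1, h2):
    """Candidate derivative at (x0, y0) applied to the vector h = (h1, h2)."""
    return (y0 * h1 + x0 * h2, h1 + 2 * y0 * h2)

def quotient(x0, y0, h1, h2):
    """Error of the affine approximation divided by the length of h."""
    fx = f(x0 + h1, y0 + h2)
    f0 = f(x0, y0)
    lin = Df(x0, y0, h1, h2)
    err = math.hypot(fx[0] - f0[0] - lin[0], fx[1] - f0[1] - lin[1])
    return err / math.hypot(h1, h2)

# Approach (1, 2) along the unit direction (0.6, 0.8) with step lengths s.
quotients = [quotient(1.0, 2.0, 0.6 * s, 0.8 * s) for s in (1e-1, 1e-2, 1e-3)]
```

For this example the quotient equals $0.8\,s$ up to rounding, so it shrinks in proportion to the step length, as the definition requires. A numerical check along one direction is only evidence; the definition demands the limit over all approaches to $x_0$.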
The definition may be rewritten by saying that for every $\varepsilon > 0$ there is a $\delta > 0$ such that $x \in A$ and $\|x - x_0\| < \delta$ implies
$$\|f(x) - f(x_0) - Df(x_0)(x - x_0)\| \le \varepsilon \|x - x_0\|.$$
In 6.1.1 it is implicit that $x \ne x_0$, but in this $\varepsilon$-$\delta$ formulation, we can allow $x = x_0$, since then both sides reduce to zero. Intuitively, $x \mapsto f(x_0) + Df(x_0)(x - x_0)$ is supposed to be the best affine approximation to $f$ near the point $x_0$. (An affine mapping is a linear mapping plus a constant.) See Figure 6.1-2. In this figure we have indicated the equations of the tangent planes to the graph of $f$.
FIGURE 6.1-2 The tangent $y = f(x_0) + Df(x_0)(x - x_0)$ for (a) $f : \mathbb{R} \to \mathbb{R}$; (b) $f : \mathbb{R}^2 \to \mathbb{R}$
If $f$ is a function of one variable and if we compare the definitions of $Df(x)$ and $df/dx = f'(x)$, we see that $Df(x)(h) = f'(x) \cdot h$ (the product of the numbers $f'(x)$ and $h \in \mathbb{R}$). Thus the linear map $Df(x)$ is just multiplication by $df/dx$, in this case. If $f$ is differentiable at each point of $A$, we say that $f$ is differentiable on $A$. We expect intuitively (as in Figure 6.1-2) that there can be only one best linear approximation. This is in fact true if we assume that $A$ is an open set.
6.1.2 Theorem Let $A$ be an open set in $\mathbb{R}^n$ and suppose $f : A \to \mathbb{R}^m$ is differentiable at $x_0$. Then $Df(x_0)$ is uniquely determined by $f$.
6.1.3 Example Let $f : \mathbb{R} \to \mathbb{R}$, $f(x) = x^3$. Compute $Df(x)$ and $df/dx$.
Solution From one-variable calculus, $dx^3/dx = 3x^2$. Thus, in this example, $Df(x)$ is the linear mapping
$$h \mapsto Df(x) \cdot h = 3x^2 h.$$
•
6.1.4 Example Show that, in general, $Df$ is not uniquely determined.
Solution For example, if $A = \{x_0\}$ is a single point, any $Df(x_0)$ will do, because $x \in A$, $\|x - x_0\| < \delta$ holds only when $x = x_0$, in which case the expression
$$\|f(x) - f(x_0) - Df(x_0)(x - x_0)\|$$
is zero. The definition is thus fulfilled in a trivial way.
•
Note.
If the proof of Theorem 6.1.2 is examined closely, one sees that $Df(x)$ is unique (assuming it exists) on a wider range of sets than open sets. For example, the theorem is valid for closed intervals in $\mathbb{R}$ or generally for closed disks in $\mathbb{R}^n$.
Exercises for §6.1
1.
Compute $Df(x)$ for $f : \mathbb{R} \to \mathbb{R}$ defined by $f(x) = x \sin x$.
2.
Prove that $D(f + g) = Df + Dg$.
3.
Let $A = \{(x, y) \in \mathbb{R}^2 \mid 0 \le x \le 1,\ y = 0\}$. Prove that the conclusion of Theorem 6.1.2 is false for this $A$.
4.
Let $f : \mathbb{R}^n \to \mathbb{R}^m$ and suppose there is a constant $M$ such that for $x \in \mathbb{R}^n$, $\|f(x)\| \le M\|x\|^2$. Prove that $f$ is differentiable at $x_0 = 0$ and that $Df(x_0) = 0$.
5.
If $f : \mathbb{R} \to \mathbb{R}$ is differentiable and $|f(x)| \le |x|$, must $Df(0) = 0$?
§6.2 Matrix Representation
In addition to the definition of $Df(x)$, there is another way to differentiate a function $f$ of several variables. We write $f$ in components, $f(x_1, \ldots, x_n) = (f_1(x_1, \ldots, x_n), \ldots, f_m(x_1, \ldots, x_n))$, and compute the partial derivatives $\partial f_j/\partial x_i$ for $j = 1, \ldots, m$ and $i = 1, \ldots, n$, where the notation $\partial f_j/\partial x_i$ means that we compute the usual derivative of $f_j$ with respect to $x_i$ while keeping the other variables $x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n$ fixed.
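6.2.1 Definition $\partial f_j/\partial x_i$ is given by the following limit, when the limit exists:
$$\frac{\partial f_j}{\partial x_i}(x_1, \ldots, x_n) = \lim_{h \to 0} \left\{ \frac{f_j(x_1, \ldots, x_i + h, \ldots, x_n) - f_j(x_1, \ldots, x_n)}{h} \right\}.$$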
In §6.1 we saw that $Df(x)$ for $f : \mathbb{R} \to \mathbb{R}$ is just the linear map "multiplication by $df/dx$." This fact can be generalized to the following theorem.
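6.2.2 Theorem Suppose $A \subset \mathbb{R}^n$ is an open set and $f : A \to \mathbb{R}^m$ is differentiable on $A$. Then the partial derivatives $\partial f_j/\partial x_i$ exist, and the matrix of the linear map $Df(x)$ with respect to the standard bases in $\mathbb{R}^n$ and $\mathbb{R}^m$ is given by
$$\begin{pmatrix}
\dfrac{\partial f_1}{\partial x_1} & \dfrac{\partial f_1}{\partial x_2} & \cdots & \dfrac{\partial f_1}{\partial x_n} \\
\dfrac{\partial f_2}{\partial x_1} & \dfrac{\partial f_2}{\partial x_2} & \cdots & \dfrac{\partial f_2}{\partial x_n} \\
\vdots & \vdots & & \vdots \\
\dfrac{\partial f_m}{\partial x_1} & \dfrac{\partial f_m}{\partial x_2} & \cdots & \dfrac{\partial f_m}{\partial x_n}
\end{pmatrix},$$
where each partial derivative is evaluated at $x = (x_1, \ldots, x_n)$. This matrix is called the Jacobian matrix of $f$ or the derivative matrix.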
In doing specific computations, we can usually compute the Jacobian matrix easily, and Theorem 6.2.2 gives us $Df$. In some books, $Df$ is called the differential or the total derivative of $f$.
When $m = 1$, $f$ is a real-valued function of $n$ variables. Then $Df$ has the matrix
$$\left( \frac{\partial f}{\partial x_1} \ \cdots \ \frac{\partial f}{\partial x_n} \right)$$
and the derivative applied to a vector $e = (a_1, \ldots, a_n)$ is
$$Df(x) \cdot e = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}\, a_i.$$
It should be emphasized that $Df$ is a linear mapping at each $x \in A$ and the definition of $Df(x)$ is independent of the basis used. If we change from the standard basis to another basis, the matrix elements will change. If one examines the definition of the matrix of a linear transformation, it can be seen that the columns of the matrix relative to the new basis will be the derivative $Df(x)$ applied to the new basis in $\mathbb{R}^n$ with this image vector expressed in the new basis in $\mathbb{R}^m$. Of course, the linear map $Df(x)$ itself does not change from basis to basis. In the case $m = 1$, $Df(x)$ is, in the standard basis, a $1 \times n$ matrix, as just shown. The vector whose components are the same as those of $Df(x)$ is called the gradient of $f$ and is denoted $\operatorname{grad} f$ or $\nabla f$. Thus, for $f : A \subset \mathbb{R}^n \to \mathbb{R}$,
$$\operatorname{grad} f = \left( \frac{\partial f}{\partial x_1}, \ldots, \frac{\partial f}{\partial x_n} \right).$$
(Sometimes it is said that $\operatorname{grad} f$ is just $Df$ with commas inserted!) To give an intrinsic definition of the gradient requires the use of the inner product. In fact, $\nabla f(x)$ is the vector such that for any $v \in \mathbb{R}^n$, $\langle \nabla f(x), v \rangle = Df(x) \cdot v$.
An important special case occurs when $f = L$ is already linear. From the definition of the derivative, we see that $DL(x_0) = L$, as expected, since the best affine approximation to a linear map has the linear map itself as the linear part (see Example 6.2.4). Thus, the Jacobian matrix of $L$ is the matrix of $L$ itself in this case. Another case of interest is a constant map. Indeed, one sees that a constant map has derivative zero; zero is the linear map $K : \mathbb{R}^n \to \mathbb{R}^m$ such that $K(x) = 0 = (0, \ldots, 0)$ for all $x \in \mathbb{R}^n$.
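6.2.3 Example Let $f : \mathbb{R}^2 \to \mathbb{R}^3$ be defined by $f(x, y) = (x^2, x^3 y, x^4 y^2)$. Compute $Df$.
Solution According to Theorem 6.2.2, $Df(x, y)$ is the linear map whose matrix is
$$\begin{pmatrix}
\dfrac{\partial f_1}{\partial x} & \dfrac{\partial f_1}{\partial y} \\
\dfrac{\partial f_2}{\partial x} & \dfrac{\partial f_2}{\partial y} \\
\dfrac{\partial f_3}{\partial x} & \dfrac{\partial f_3}{\partial y}
\end{pmatrix}
=
\begin{pmatrix}
2x & 0 \\
3x^2 y & x^3 \\
4x^3 y^2 & 2x^4 y
\end{pmatrix},$$
where $f_1(x, y) = x^2$, $f_2(x, y) = x^3 y$, and $f_3(x, y) = x^4 y^2$.
•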
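A Jacobian matrix computed by hand can be sanity-checked against difference quotients built from Definition 6.2.1. The sketch below does this for the map of Example 6.2.3; the helper names `jacobian_exact` and `jacobian_numeric`, the evaluation point $(1.5, -0.5)$, and the step size are illustrative choices, and central differences are used here in place of the one-sided quotient of the definition because they are more accurate numerically.

```python
def f(x, y):
    return (x**2, x**3 * y, x**4 * y**2)

def jacobian_exact(x, y):
    """Matrix of partial derivatives computed by hand (rows f_1, f_2, f_3)."""
    return [[2 * x, 0.0],
            [3 * x**2 * y, x**3],
            [4 * x**3 * y**2, 2 * x**4 * y]]

def jacobian_numeric(x, y, h=1e-6):
    """Central-difference approximation of the same 3 x 2 matrix."""
    fxp, fxm = f(x + h, y), f(x - h, y)
    fyp, fym = f(x, y + h), f(x, y - h)
    return [[(fxp[j] - fxm[j]) / (2 * h), (fyp[j] - fym[j]) / (2 * h)]
            for j in range(3)]

exact = jacobian_exact(1.5, -0.5)
numeric = jacobian_numeric(1.5, -0.5)
max_error = max(abs(exact[j][i] - numeric[j][i])
                for j in range(3) for i in range(2))
```

Agreement of the two matrices to several digits is a useful check on the algebra, although it says nothing by itself about differentiability in the sense of Definition 6.1.1.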
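6.2.4 Example Let $L : \mathbb{R}^n \to \mathbb{R}^m$ be a linear map; that is, $L(x + y) = L(x) + L(y)$ and $L(\alpha x) = \alpha L(x)$. Show that $DL(x) = L$.
Solution Given $x_0$ and $\varepsilon > 0$, we must find $\delta > 0$ such that $\|x - x_0\| < \delta$ implies
$$\|L(x) - L(x_0) - L(x - x_0)\| \le \varepsilon \|x - x_0\|.$$
By linearity, the left side is zero, so the inequality holds for any choice of $\delta > 0$. •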
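6.2.5 Example Let $f(x, y, z) = (x \sin y)/z$. Compute $\operatorname{grad} f$.
Solution $\operatorname{grad} f = (\partial f/\partial x, \partial f/\partial y, \partial f/\partial z)$, and here
$$\frac{\partial f}{\partial x} = \frac{\sin y}{z}, \qquad \frac{\partial f}{\partial y} = \frac{x \cos y}{z}, \qquad \frac{\partial f}{\partial z} = -\frac{x \sin y}{z^2},$$
so that
$$\operatorname{grad} f(x, y, z) = \left( \frac{\sin y}{z},\ \frac{x \cos y}{z},\ -\frac{x \sin y}{z^2} \right).$$
•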
Exercises for §6.2
1.
Let $f : \mathbb{R}^3 \to \mathbb{R}^2$, $f(x, y, z) = (x^4 y, x e^z)$. Compute $Df$.
2.
Let $f : \mathbb{R}^3 \to \mathbb{R}$, $(x, y, z) \mapsto e^{x^2 + y^2 + z^2}$. Compute $Df$ and $\operatorname{grad} f$.
3.
Let $L$ be a linear map of $\mathbb{R}^n \to \mathbb{R}^m$, let $g : \mathbb{R}^n \to \mathbb{R}^m$ be such that $\|g(x)\| \le M\|x\|^2$, and let $f(x) = L(x) + g(x)$. Prove that $Df(0) = L$.
4.
Let $f(x, y) = (xy, y/x)$. Compute $Df$. Compute the matrix of $Df(x, y)$ with respect to the basis $(1, 0)$, $(1, 1)$ in $\mathbb{R}^2$.
5.
Discuss the possibility of defining $Df$ for $f$ a mapping from one normed space to another.
§6.3 Continuity of Differentiable Mappings; Differentiable Paths
It was shown in §4.7 that a differentiable function of one variable is continuous. This is appealing intuitively, since having a tangent line (or plane) to the graph is stronger than having no breaks in the graph. We recall the proof: Let $f : \,]a, b[\, \to \mathbb{R}$ be differentiable at $x_0$. Then
$$\lim_{x \to x_0} (f(x) - f(x_0)) = \lim_{x \to x_0} \left( \frac{f(x) - f(x_0)}{x - x_0} \cdot (x - x_0) \right) = f'(x_0) \cdot \lim_{x \to x_0} (x - x_0) = 0.$$
6.3.1 Theorem Suppose $A \subset \mathbb{R}^n$ is open and $f : A \to \mathbb{R}^m$ is differentiable on $A$. Then $f$ is continuous. In fact, for each $x_0 \in A$ there are a constant $M > 0$ and a $\delta_0 > 0$ such that $\|x - x_0\| < \delta_0$ implies $\|f(x) - f(x_0)\| \le M\|x - x_0\|$. (This is called the local Lipschitz property.)
Earlier, we examined the special case of real-valued functions, $f : \mathbb{R}^n \to \mathbb{R}$. The case of a function $c : \mathbb{R} \to \mathbb{R}^m$ is also important. Here $c$ represents a curve or path in $\mathbb{R}^m$. In this case, $Dc(t) : \mathbb{R} \to \mathbb{R}^m$ is represented by the vector associated with the single-column matrix
$$\begin{pmatrix} \dfrac{dc_1}{dt} \\ \vdots \\ \dfrac{dc_m}{dt} \end{pmatrix},$$
where $c(t) = (c_1(t), \ldots, c_m(t))$. This vector is denoted $c'(t)$ and is called the tangent vector or velocity vector to the curve. If we note that
$$c'(t) = \lim_{h \to 0} \frac{c(t + h) - c(t)}{h}$$
and use the fact that $(c(t + h) - c(t))/h$ is a chord that approximates the tangent line to the curve, we see that $c'(t)$ should represent the exact tangent vector (see Figure 6.3-1). In terms of a moving particle, $(c(t + h) - c(t))/h$ is an approximation to the velocity, since it is displacement/time, and so $c'(t)$ is the instantaneous velocity.
FIGURE 6.3-1 The velocity vector of a curve
Strictly speaking, we should always represent $c'(t)$ as a column vector, since the matrix of $Dc(t)$ is an $m \times 1$ matrix. However, this is typographically awkward, and so we often write $c'(t)$ as a row vector.
6.3.2 Example Let $c(t) = (t^2, t, \sin t)$. Find the tangent vector to $c(t)$ at $c(0) = (0, 0, 0)$.
Solution $c'(t) = (2t, 1, \cos t)$. Setting $t = 0$, $c'(0) = (0, 1, 1)$, which is the vector tangent to $c(t)$ at $(0, 0, 0)$. •
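The description of $c'(t)$ as a limit of chords can be illustrated with the curve of Example 6.3.2. In the sketch below the helper name `chord` and the step $h = 10^{-6}$ are assumptions made for illustration; the difference quotient is computed componentwise and compared with the tangent vector $(0, 1, 1)$ found above.

```python
import math

def c(t):
    return (t * t, t, math.sin(t))

def chord(t, h):
    """Difference quotient (c(t + h) - c(t)) / h, computed componentwise."""
    a, b = c(t + h), c(t)
    return tuple((ai - bi) / h for ai, bi in zip(a, b))

approx = chord(0.0, 1e-6)
deviation = max(abs(p - q) for p, q in zip(approx, (0.0, 1.0, 1.0)))
```

The chord direction differs from $(0, 1, 1)$ by an amount of the order of $h$, in line with the interpretation of $(c(t+h) - c(t))/h$ as an average velocity over a short time interval.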
Next we recall some single-variable ideas from §4.7:
6.3.3 Example Prove that $f : \mathbb{R} \to \mathbb{R}$ defined by $x \mapsto |x|$ is continuous but not differentiable at 0.
Solution $f(x) = x$ for $x \ge 0$ and $f(x) = -x$ for $x < 0$, and so $f$ is continuous on $]0, \infty[$ and $]-\infty, 0[$. Since $\lim_{x \to 0} f(x) = 0 = f(0)$, $f$ is also continuous at 0, and so $f$ is continuous at all points. Finally, $f$ is not differentiable at 0, for if it were,
$$\lim_{x \to 0} \frac{f(x) - f(0)}{x - 0} = \lim_{x \to 0} \frac{f(x)}{x}$$
would exist. But for $x > 0$, $f(x)/x$ is $+1$ and for $x < 0$ it is $-1$, and so the limit cannot exist. •
6.3.4 Example Must the derivative of a function be continuous?
Solution The answer is no, but an example is not obvious. An example is the function
$$f(x) = x^2 \sin \frac{1}{x} \quad \text{if } x \ne 0, \qquad \text{and} \qquad f(x) = 0 \quad \text{if } x = 0.$$
See Figure 6.3-2. To demonstrate differentiability at zero, we shall show that
$$\frac{f(x)}{x} \to 0 \quad \text{as } x \to 0.$$
Indeed, $|f(x)/x| = |x \sin(1/x)| \le |x| \to 0$ as $x \to 0$. Thus, $f'(0)$ exists and is zero. Hence, $f$ is differentiable at 0. By differentiation rules of one-variable calculus,
$$f'(x) = 2x \sin \frac{1}{x} - \cos \frac{1}{x} \quad \text{if } x \ne 0.$$
FIGURE 6.3-2 The graph of $f(x) = x^2 \sin(1/x)$. This function is differentiable, but the derivative is discontinuous at 0
As $x \to 0$, the first term tends to 0 but the second term oscillates between $+1$ and $-1$, and so $\lim_{x \to 0} f'(x)$ does not exist. Thus, $f'$ exists but is not continuous.
•
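Both halves of the argument in Example 6.3.4 can be observed numerically. The following sketch (the sample points are chosen here for illustration) first evaluates the difference quotients $f(h)/h$ at 0, which are bounded by $|h|$, and then evaluates $f'$ at the points $x_k = 1/(k\pi)$, where $\cos(1/x_k) = (-1)^k$.

```python
import math

def f(x):
    return x * x * math.sin(1.0 / x) if x != 0 else 0.0

def fprime(x):
    """Derivative for x != 0, from the product and chain rules."""
    return 2 * x * math.sin(1.0 / x) - math.cos(1.0 / x)

steps = (1e-2, 1e-4, 1e-6)
# Difference quotients at 0: f(h)/h = h*sin(1/h), bounded by |h|.
quotients = [f(h) / h for h in steps]

# Two consecutive points close to 0 at which f' is near -1 and +1.
samples = [fprime(1.0 / (k * math.pi)) for k in (1000, 1001)]
```

The quotients shrink with $h$, while the two sampled derivative values stay near $-1$ and $+1$ even though the sample points are within about $3 \times 10^{-4}$ of the origin, which is the oscillation described in the solution.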
Supplement to §6.3
Karl Weierstrass gave an example of a continuous function $f(x)$ of one variable that has no derivative anywhere (see Exercise 20, Chapter 5). Roughly speaking, this function superimposes infinitely many corners like those in Example 6.3.3. One can also ask if such functions arise in any "practical" context. Surprisingly, the answer is yes. One place in which "nowhere differentiable" paths are important is in the study of Brownian motion, the erratic motion of small particles suspended in a fluid such as water or air. The American mathematician Norbert Wiener developed an effective model for studying this motion involving integration over a "space" of paths. In this method "most" continuous paths are not differentiable anywhere and have no tangent vectors. Another context in which nowhere differentiable paths occur is in the study of fractals and dynamical systems. For instance, fix a complex number $c$ and let $f_c : \mathbb{R}^2 \cong \mathbb{C} \to \mathbb{C}$ be defined by $f_c(z) = z^2 + c$. Let $J_c^{\infty} = \{ z \mid f_c^n(z) \to \infty \text{ as } n \to \infty \}$, where $f_c^n$ means $f_c$ composed with itself $n$ times, and let $J_c = \operatorname{bd}(J_c^{\infty})$. For $c = 0$, it is clear that $J_c$ is a circle, but for $c \ne 0$ (but close enough to zero) it is known that $J_c$ is a continuous but nowhere differentiable curve! One calls $J_c$ the Julia set of $f_c$; it is shown in Figure 6.3-3.
FIGURE 6.3-3 (a) The Julia set of $f(z) = z^2 + \frac{1}{2}i$, which is a simple closed curve but is nowhere differentiable; (b) the Julia set of $f(z) = z^2 - 1$, which contains infinitely many closed curves
The set of $c$ for which $J_c$ is path-connected is called the Mandelbrot set, shown in Figure 6.3-4. (The Mandelbrot set is also the set of $c$ for which zero does not go to infinity under iteration of $f_c$; this is not obvious, however.) A deep theorem of Douady and Hubbard is that the Mandelbrot set is connected. This is an example of a very complicated set whose connectedness is far from obvious; computer magnification of this set shows how truly subtle it is. For more information, references, and the reasons this subject has some practical importance, see R. Devaney, An Introduction to Chaotic Dynamical Systems (Addison-Wesley, Reading, Mass., 1985), and references therein.
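The characterization of the Mandelbrot set by the orbit of zero under $f_c(z) = z^2 + c$ suggests a simple computer experiment. The sketch below is only a heuristic test under stated assumptions: the function name `stays_bounded`, the iteration cap of 200, and the use of the standard escape radius 2 are choices made here, not taken from the text. An answer of `False` shows that the orbit has left the disk of radius 2, while `True` only means that no escape was seen within the iteration cap.

```python
def stays_bounded(c, max_iter=200, radius=2.0):
    """Iterate z -> z*z + c from z = 0; report whether |z| stayed <= radius."""
    z = 0j
    for _ in range(max_iter):
        z = z * z + c
        if abs(z) > radius:
            return False
    return True

# The two parameters of Figure 6.3-3, plus c = 0 and c = 1 for comparison.
results = {c: stays_bounded(c) for c in (0j, 0.5j, -1 + 0j, 1 + 0j)}
```

For $c = 0$, $c = \frac{1}{2}i$, and $c = -1$ the orbit of zero remains bounded in this experiment, while for $c = 1$ the orbit $0, 1, 2, 5, \ldots$ escapes after three steps.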
Exercises for §6.3
1.
Let $f(x) = x^2$ if $x$ is irrational and let $f(x) = 0$ if $x$ is rational. Is $f$ continuous at 0? Is it differentiable at 0?
2.
Is the local Lipschitz condition in Theorem 6.3.1 enough to guarantee differentiability?
3.
Must the derivative of a continuous function exist at its maximum?
4.
Let $f(x) = x \sin(1/x)$, $x \ne 0$, and $f(0) = 0$. Investigate the continuity and differentiability of $f$ at 0.
5.
Find the tangent vector to the curve $c(t) = (3t^2, e^t, t + t^2)$ at the point corresponding to $t = 1$.
FIGURE 6.3-4 The Mandelbrot set (reprinted by permission, from B. B. Mandelbrot, The Fractal Geometry of Nature, W. H. Freeman, New York, 1983)
§6.4 Conditions for Differentiability
Since the Jacobian matrix provides an effective computational tool, it would be useful to know if the existence of the usual partial derivatives implies that the derivative $Df$ exists. This is, unfortunately, not true in general. For example, take $f : \mathbb{R}^2 \to \mathbb{R}$ defined by $f(x, y) = x$ when $y = 0$, $f(x, y) = y$ when $x = 0$, and $f(x, y) = 1$ elsewhere. Then $\partial f/\partial x$ and $\partial f/\partial y$ exist at $(0, 0)$ and are equal to 1. However, $f$ is not continuous at $(0, 0)$ (why?), and so the derivative $Df$ cannot possibly exist at $(0, 0)$. See Figure 6.4-1. (See the examples and exercises for more exotic examples.)
FIGURE 6.4-1 The partial derivatives of this function exist at $(0, 0)$, but the function is not continuous at $(0, 0)$ (here $f(x, y) = x$ on the $x$-axis, $f(x, y) = y$ on the $y$-axis, and $f = 1$ in the four open quadrants)
It is quite simple to understand such behavior. The partial derivatives depend only on what happens in the directions of the $x$ and $y$ axes, whereas the definition of $Df$ involves the combined behavior of $f$ in a whole neighborhood of a given point. We can, however, assert the following.
6.4.1 Theorem Let $A \subset \mathbb{R}^n$ be open and $f : A \subset \mathbb{R}^n \to \mathbb{R}^m$. Suppose $f = (f_1, \ldots, f_m)$. If each of the partials $\partial f_j/\partial x_i$ exists and is continuous on $A$, then $f$ is differentiable on $A$.
The partial derivatives of a function measure its rate of change in the special directions parallel to the axes. The directional derivatives do this in other directions also.
6.4.2 Definition Let $f$ be real-valued and defined in a neighborhood of $x_0 \in \mathbb{R}^n$, and let $e \in \mathbb{R}^n$ be a unit vector. Then
$$\left. \frac{d}{dt} f(x_0 + te) \right|_{t=0} = \lim_{t \to 0} \frac{f(x_0 + te) - f(x_0)}{t}$$
is called the directional derivative of $f$ at $x_0$ in the direction $e$. From this definition, we see that the directional derivative is the rate of change of $f$ in the direction $e$ and, for $n = 2$, gives the slope of the graph of $f$ in that direction; see Figure 6.4-2.
FIGURE 6.4-2 Slope of $l$ = $\tan \theta$ = directional derivative
We claim that the directional derivative in the direction of $e$ equals $Df(x_0) \cdot e$. To see this, consider the definition of $Df(x_0)$ with $x = x_0 + te$: Given $\varepsilon > 0$,
$$\left\| \frac{f(x_0 + te) - f(x_0)}{t} - Df(x_0) \cdot e \right\| \le \varepsilon \|e\|$$
if $|t|$ is sufficiently small. Thus, if $f$ is differentiable at $x_0$, then the directional derivatives also exist and are given by
$$\lim_{t \to 0} \frac{f(x_0 + te) - f(x_0)}{t} = Df(x_0) \cdot e.$$
In particular, observe that $\partial f/\partial x_i$ is the derivative of $f$ in the direction of the $i$th coordinate axis (with $e = e_i = (0, 0, \ldots, 0, 1, 0, \ldots, 0)$). For a function $f : \mathbb{R}^2 \to \mathbb{R}$, the directional derivatives $Df(x_0) \cdot e$ can be used to determine the plane tangent to the graph of $f$ (compare Figure 6.1-2). That is, the line $l$ given by $z = f(x_0) + Df(x_0) \cdot te$ is tangent to the graph of $f$, since, as in Figure 6.4-2, $Df(x_0) \cdot e$ is the rate of change of $f$ in the direction $e$. Thus, the tangent plane to the graph of $f$ at $(x_0, f(x_0))$ may be described by the equation
$$z = f(x_0) + Df(x_0) \cdot (x - x_0)$$
(see Figure 6.4-3). Since we have not defined the notion of the tangent plane to a surface, we shall adopt this equation as a definition of the tangent plane.
FIGURE 6.4-3 The tangent plane $z = f(x_0) + Df(x_0) \cdot (x - x_0)$ to the graph of a function at $x_0 = (x_{1,0}, \ldots, x_{n,0})$
6.4.3 Example Show that the existence of all directional derivatives at a point need not imply differentiability.
FIGURE 6.4-4 The existence of directional derivatives does not force differentiability or even continuity
Solution Define a function $f : \mathbb{R}^2 \to \mathbb{R}$ by $f(x, y) = 1$ if $0 < y < x^2$ and by $f(x, y) = 0$ otherwise. Then $f$ is identically 0 along the horizontal and vertical axes. Any other line through the origin also stays in the region in which $f = 0$ for some short distance on both sides of the origin. Thus $f$ is constantly 0 on some interval on both sides of the origin along any such line. This shows that the directional derivative at the origin in the direction of that line is 0. All directional derivatives exist and are 0 at the origin, but the function is not even continuous, much less differentiable, at the origin. See Figure 6.4-4. A slightly more complicated but similar example is the function $f : \mathbb{R}^2 \to \mathbb{R}$ defined by
$$f(x, y) = \frac{xy}{x^2 + y} \quad \text{if } x^2 \ne -y$$
and $f(x, y) = 0$ if $x^2 = -y$. Letting $e = (e_1, e_2)$, where $e_2 \ne 0$,
$$\frac{1}{t} f(te_1, te_2) = \frac{1}{t} \cdot \frac{t^2 e_1 e_2}{t^2 e_1^2 + t e_2} = \frac{e_1 e_2}{t e_1^2 + e_2} \to e_1$$
as $t \to 0$; the case $e_2 = 0$ gives zero. Thus each directional derivative exists at $(0, 0)$. However, $f$ is not continuous at $(0, 0)$, since for $x^2$ near $-y$ with both $x$, $y$ small, $f$ is very large. (We leave the details to the reader.) •
This example shows that existence of all directional derivatives would not be a convenient definition of differentiability, since it would not even imply continuity. This is the reason one adopts the more restrictive notion in Definition 6.1.1.
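The second function of Example 6.4.3 can be probed numerically. The sketch below assumes the direction $e = (0.6, 0.8)$ and the approach curve $y = -x^2 + x^4$, both chosen here for illustration. On that curve $x^2 + y = x^4$, so $f(x, y) = (x^2 - 1)/x$, which is unbounded as $x \to 0$.

```python
def f(x, y):
    return x * y / (x * x + y) if x * x != -y else 0.0

def directional_quotient(e1, e2, t):
    """(f(t*e) - f(0)) / t at the origin, where f(0, 0) = 0."""
    return f(t * e1, t * e2) / t

# Direction (0.6, 0.8): the quotient should approach e1 = 0.6.
q = directional_quotient(0.6, 0.8, 1e-6)

# Values along y = -x**2 + x**4, which passes through the origin.
values = [f(x, -x * x + x**4) for x in (1e-1, 1e-2, 1e-3)]
```

The quotient `q` is close to $0.6$, while the entries of `values` are roughly $-9.9$, $-100$, and $-1000$. The directional derivative exists, yet $f$ takes arbitrarily large values at points arbitrarily close to the origin.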
6.4.4 Example Let $f(x, y) = x^2 + y$. Compute the equation of the plane tangent to the graph of $f$ at $x = 1$, $y = 2$.
Solution Here $Df(x, y)$ has matrix
$$\left( \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right) = (2x, 1),$$
and so $Df(1, 2) = (2, 1)$. Thus, the equation of the tangent plane becomes
$$z = 3 + (2, 1) \begin{pmatrix} x - 1 \\ y - 2 \end{pmatrix} = 3 + 2(x - 1) + (y - 2);$$
that is,
$$2x + y - z = 1.$$
•
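The tangent plane of Example 6.4.4 and the relation between directional derivatives and $Df$ can both be checked numerically. In this sketch the helper names, the nearby point $(1.01, 2.02)$, and the unit direction $(0.6, 0.8)$ are illustrative assumptions.

```python
def f(x, y):
    return x * x + y

def tangent_plane(x, y, x0=1.0, y0=2.0):
    """z = f(x0, y0) + Df(x0, y0) . (x - x0, y - y0), with Df = (2*x0, 1)."""
    return f(x0, y0) + 2 * x0 * (x - x0) + 1.0 * (y - y0)

def directional_quotient(e1, e2, t, x0=1.0, y0=2.0):
    return (f(x0 + t * e1, y0 + t * e2) - f(x0, y0)) / t

# Gap between the graph and the plane at a nearby point: second-order small.
gap = abs(f(1.01, 2.02) - tangent_plane(1.01, 2.02))

# Compare with Df(1, 2) . (0.6, 0.8) = 2*0.6 + 1*0.8 = 2.0.
slope = directional_quotient(0.6, 0.8, 1e-6)

# The plane should satisfy 2x + y - z = 1.
residual = 2 * 1.01 + 2.02 - tangent_plane(1.01, 2.02) - 1.0
```

Here `gap` is about $10^{-4}$, the square of the displacement in $x$, and `slope` agrees with $Df(1, 2) \cdot e = 2$ to about six digits.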
Exercises for §6.4
1.
Use Theorem 6.4.1 to show that $f(x, y)$ defined by
$$f(x, y) = \frac{(xy)^2}{\sqrt{x^2 + y^2}}, \quad (x, y) \ne (0, 0)$$
and $f(x, y) = 0$, $(x, y) = (0, 0)$ is differentiable at $(0, 0)$.
2.
Investigate the differentiability of
$$f(x, y) = \frac{xy}{\sqrt{x^2 + y^2}}$$
at $(0, 0)$ if $f(0, 0) = 0$.
3.
Find the tangent plane to the graph of $z = x^2 + y^2$ at $(0, 0)$.
4.
Find the equation of the tangent plane to $z = x^3 + y^4$ at $x = 1$, $y = 3$.
5.
Find a function $f : \mathbb{R}^2 \to \mathbb{R}$ that is differentiable at each point but whose partials are not continuous at $(0, 0)$.
§6.5 The Chain Rule
As we recalled in §4.7, one of the most important techniques of differentiation is the chain rule ("function of a function" rule). For example, to differentiate $(x^3 + 3)^6$, let $y = x^3 + 3$ and first differentiate $y^6$, getting $6y^5$, and then multiply by the derivative of $x^3 + 3$ to obtain the final answer $6(x^3 + 3)^5 \cdot 3x^2$. There is a similar process for functions of several variables. For example, if $u$, $v$, and $f$ are real-valued functions of two variables, then
$$\frac{\partial}{\partial x} f(u(x, y), v(x, y)) = \frac{\partial f}{\partial u} \frac{\partial u}{\partial x} + \frac{\partial f}{\partial v} \frac{\partial v}{\partial x}.$$
... $(1 - \lambda)x + \lambda y$ for some $0 \le \lambda \le 1$, as in Figure 6.7-1.
FIGURE 6.7-1 The point $c$ lies on the segment joining $x$ and $y$
6.7.1 Mean Value Theorem
i.
Suppose $f : A \subset \mathbb{R}^n \to \mathbb{R}$ is differentiable on an open set $A$. For any $x, y \in A$ such that the line segment joining $x$ and $y$ lies in $A$ (which need not happen for all $x, y$), there is a point $c$ on that segment such that
$$f(y) - f(x) = Df(c) \cdot (y - x).$$
ii.
Suppose $f : A \subset \mathbb{R}^n \to \mathbb{R}^m$ is differentiable on the open set $A$. Suppose the line segment joining $x$ and $y$ lies in $A$ and $f = (f_1, \ldots, f_m)$. Then there exist points $c_1, \ldots, c_m$ on that segment such that
$$f_i(y) - f_i(x) = Df_i(c_i)(y - x), \qquad i = 1, \ldots, m.$$
An alternative formulation of the mean value theorem is given in Worked Example 6.5 at the end of the chapter.
6.7.2 Example A set $A \subset \mathbb{R}^n$ is said to be convex if for each $x, y \in A$, the segment joining $x$ and $y$ also lies in $A$. See Figure 6.7-2. Let $A \subset \mathbb{R}^n$ be an open convex set and let $f : A \to \mathbb{R}^m$ be differentiable. If $Df = 0$ on $A$, show that $f$ is constant on $A$. (Generalizations of this are given in Exercise 9 at the end of the chapter.)
FIGURE 6.7-2 (a) Not convex; (b) convex
Solution If $x, y \in A$, then for each component $f_i$, there is a vector $c_i$ such that
$$f_i(y) - f_i(x) = Df_i(c_i)(y - x).$$
Since $Df = 0$, $Df_i = 0$ for each $i$ (why?), and so $f_i(y) = f_i(x)$. It follows that $f(y) = f(x)$, which means that $f$ is constant.
•
6.7.3 Example Suppose that $f : [0, \infty[\, \to \mathbb{R}$ is continuous, $f(0) = 0$, $f$ is differentiable on $]0, \infty[$, and $f'$ is nondecreasing. Prove that $g(x) = f(x)/x$ is nondecreasing for $x > 0$.
Solution From §4.7 or directly from the mean value theorem, we see that a function $h : \mathbb{R} \to \mathbb{R}$ is nondecreasing if $h'(x) \ge 0$, because $x \le y$ implies that $h(y) - h(x) = h'(c)(y - x) \ge 0$.
Now
$$g'(x) = \frac{x f'(x) - f(x)}{x^2}$$
and there is a $c$ between 0 and $x$ such that
$$f(x) = f(x) - f(0) = f'(c) \cdot x \le x f'(x),$$
since $0 < c < x$ implies that $f'(x) \ge f'(c)$. Thus $x f'(x) - f(x) \ge 0$, and so $g' \ge 0$, which implies that $g$ is nondecreasing. •
Exercises for §6.7
1.
If $f : \mathbb{R} \to \mathbb{R}$ is differentiable and is such that $f'(x) > 0$, prove that $f$ is (strictly) increasing.
2.
Prove the following (weak version of) l'Hôpital's rule: If $f'$, $g'$ exist at $x_0$, $g'(x_0) \ne 0$, and $f(x_0) = 0 = g(x_0)$, then $\lim_{x \to x_0} [f(x)/g(x)] = f'(x_0)/g'(x_0)$.
3.
Use Exercise 2 to evaluate
a.
$\lim_{x \to 0} [(\sin x)/x]$
b.
$\lim_{x \to 0} [(e^x - 1)/x]$
4.
Which of the following sets are convex?
a.
$\{(x, y) \in \mathbb{R}^2 \mid y \ge 0\}$
b.
{x E R" I O
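$= \alpha B(e_1, f) + \beta B(e_2, f)$, where $e_1, e_2 \in E$, $f \in F$, and $\alpha, \beta \in \mathbb{R}$. The map $B_{x_0}$ we have defined may be checked to be a bilinear map of $\mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^m$. Given a bilinear map $B : E \times F \to \mathbb{R}$, we can associate a matrix with each choice of basis $e_1, \ldots, e_n$ of $E$ and $f_1, \ldots, f_m$ of $F$; that is, we can let
$$a_{ij} = B(e_i, f_j).$$
If
$$x = \sum_{i=1}^{n} x_i e_i \quad \text{and} \quad y = \sum_{j=1}^{m} y_j f_j,$$
then
$$B(x, y) = \sum_{i,j} a_{ij} x_i y_j = (x_1, \ldots, x_n) \begin{pmatrix} a_{11} & \cdots & a_{1m} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nm} \end{pmatrix} \begin{pmatrix} y_1 \\ \vdots \\ y_m \end{pmatrix}.$$
Notice that the matrix associated to $B$ depends on the choice of basis of $E$ and $F$.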
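The correspondence between a bilinear map and its matrix can be made concrete in a few lines. The sketch below assumes $E = \mathbb{R}^2$ and $F = \mathbb{R}^3$ with their standard bases and a hypothetical $2 \times 3$ matrix `a`; it builds $B(x, y) = \sum_{i,j} a_{ij} x_i y_j$ and then recovers the entries through $a_{ij} = B(e_i, f_j)$.

```python
def bilinear_from_matrix(a):
    """Return B(x, y) = sum over i, j of a[i][j] * x[i] * y[j]."""
    def B(x, y):
        return sum(a[i][j] * x[i] * y[j]
                   for i in range(len(x)) for j in range(len(y)))
    return B

# Hypothetical 2 x 3 matrix, so n = 2 and m = 3.
a = [[1.0, 2.0, 0.0],
     [0.0, -1.0, 3.0]]
B = bilinear_from_matrix(a)

e = [[1.0, 0.0], [0.0, 1.0]]                                # standard basis of E
fb = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]    # standard basis of F

# Recover the matrix entries from the values of B on basis vectors.
recovered = [[B(e[i], fb[j]) for j in range(3)] for i in range(2)]
value = B([1.0, 2.0], [3.0, 4.0, 5.0])
```

The list `recovered` reproduces `a`, and `value` equals $3 + 8 - 8 + 30 = 33$. Using a different basis for $E$ or $F$ in place of `e` and `fb` would give a different matrix for the same map $B$.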
Note. For the second derivative, we shall, by abuse of notation, still write $D^2 f(x_0)$ for the bilinear map $B_{x_0}$ obtained by differentiating $Df$ at $x_0$ as described in the preceding paragraphs.
6.8.2 Theorem Let $f : A \subset \mathbb{R}^n \to \mathbb{R}$ be twice differentiable on the open set $A$. Then the matrix of $D^2 f(x) : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ with respect to the standard basis is given by
$$\begin{pmatrix}
\dfrac{\partial^2 f}{\partial x_1 \partial x_1} & \cdots & \dfrac{\partial^2 f}{\partial x_1 \partial x_n} \\
\vdots & & \vdots \\
\dfrac{\partial^2 f}{\partial x_n \partial x_1} & \cdots & \dfrac{\partial^2 f}{\partial x_n \partial x_n}
\end{pmatrix},$$
where each partial derivative is evaluated at the point $x = (x_1, \ldots, x_n)$.
For higher derivatives, we proceed in an analogous manner. For example, $D^3 f$ gives a trilinear map $D^3 f(x) : \mathbb{R}^n \times \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^m$ for each $x$. We do not associate a matrix with this map, but rather the third-order derivatives labeled by four indices: $\partial^3 f_k / \partial x_l \partial x_j \partial x_i$ for each component $f_k$. (Such quantities are closely related to what are called tensors.) Before proceeding with Taylor's theorem, we give an important property of the second derivative. The matrix in Theorem 6.8.2 is, under fairly mild conditions, symmetric.
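6.8.3 Symmetry of Mixed Partials Let $f : A \to \mathbb{R}^m$ be twice differentiable on the open set $A$ with $D^2 f$ continuous (that is, with the functions $\partial^2 f_k / \partial x_i \partial x_j$ continuous). Then $D^2 f$ is symmetric; that is,
$$D^2 f(x)(u, v) = D^2 f(x)(v, u),$$
or, in terms of components,
$$\frac{\partial^2 f_k}{\partial x_i \partial x_j} = \frac{\partial^2 f_k}{\partial x_j \partial x_i}.$$
Using this, it follows that all the higher derivatives are symmetric as well under analogous conditions.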
The symmetry of second derivatives represents a fundamental property not encountered in single-variable calculus. Let us verify the symmetry in a specific example: Suppose $f(x, y, z) = e^{xy} \sin x + x^2 y^4 \cos^2 z$, so that $f : \mathbb{R}^3 \to \mathbb{R}$. Then
$$\frac{\partial f}{\partial x} = e^{xy} \cos x + y e^{xy} \sin x + 2x y^4 \cos^2 z, \qquad \frac{\partial f}{\partial y} = x e^{xy} \sin x + 4x^2 y^3 \cos^2 z,$$
and
$$\frac{\partial^2 f}{\partial y\, \partial x} = x e^{xy} \cos x + e^{xy} \sin x + xy e^{xy} \sin x + 8x y^3 \cos^2 z = \frac{\partial^2 f}{\partial x\, \partial y}.$$
Theorem 6.8.3 is not so obvious intuitively. However, some intuition can be gained from the main idea of the proof. For functions f(x, y ), we consider the quantity S = f(x + h,y + k)- f(x + h,y) - f(x,y + k) + f(x,y), which is the difference of the differences in Figure 6.8-1. The algebraic fact that S can be written as the difference of the differences in two ways (horizontal and then vertical, or vice versa) is the reason one has equality of the mixed partials.
FIGURE 6.8-1 $S$ is a difference of differences over the rectangle with corners $(x, y)$, $(x + h, y)$, $(x, y + k)$, $(x + h, y + k)$
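The quantity $S$ also gives a practical way to approximate a mixed partial derivative, since $S/(hk)$ tends to $\partial^2 f / \partial x\, \partial y$ as $h, k \to 0$ when the second derivatives are continuous. The sketch below applies this to the function $f(x, y, z) = e^{xy} \sin x + x^2 y^4 \cos^2 z$ verified above, holding $z$ fixed; the evaluation point and the step $h = k = 10^{-4}$ are illustrative choices.

```python
import math

def f(x, y, z):
    return math.exp(x * y) * math.sin(x) + x**2 * y**4 * math.cos(z)**2

def mixed_exact(x, y, z):
    """The mixed partial of f in x and y, computed by hand."""
    e = math.exp(x * y)
    return (x * e * math.cos(x) + e * math.sin(x)
            + x * y * e * math.sin(x) + 8 * x * y**3 * math.cos(z)**2)

def S(x, y, z, h, k):
    """Difference of differences in the x and y variables (z held fixed)."""
    return f(x + h, y + k, z) - f(x + h, y, z) - f(x, y + k, z) + f(x, y, z)

x, y, z, h = 0.7, 0.3, 0.2, 1e-4
estimate = S(x, y, z, h, h) / (h * h)
error = abs(estimate - mixed_exact(x, y, z))
```

Because $S$ is unchanged when the roles of the horizontal and vertical differences are swapped, the same number `estimate` approximates both orders of differentiation, which is the idea behind Theorem 6.8.3.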
6.8.4 Definition A function is said to be of class $C^r$ if the first $r$ derivatives exist and are continuous. (Equivalently, this means that all partial derivatives up to order $r$ exist and are continuous.) A function is said to be smooth or of class $C^\infty$ if it is of class $C^r$ for all positive integers $r$.
Iterative use of the chain rule can be used to show that the composite of $C^r$ functions is also $C^r$.
6.8.5 Taylor's Theorem Let $f : A \to \mathbb{R}$ be of class $C^r$ for $A \subset \mathbb{R}^n$ an open set. Let $x, y \in A$, and suppose that the segment joining $x$ and $y$ lies in $A$. Then there is a point $c$ on that segment such that
$$f(y) - f(x) = \sum_{k=1}^{r-1} \frac{1}{k!} D^k f(x)(y - x, \ldots, y - x) + \frac{1}{r!} D^r f(c)(y - x, \ldots, y - x),$$
where $D^k f(x)(y - x, \ldots, y - x)$ denotes $D^k f(x)$ as a $k$-linear map applied to the $k$-tuple $(y - x, \ldots, y - x)$. In coordinates,
$$D^k f(x)(y - x, \ldots, y - x) = \sum_{i_1, \ldots, i_k = 1}^{n} \left( \frac{\partial^k f}{\partial x_{i_1} \cdots \partial x_{i_k}} \right) (y_{i_1} - x_{i_1}) \cdots (y_{i_k} - x_{i_k}).$$
Setting $y = x + h$, we can write the Taylor formula as
$$f(x + h) = f(x) + Df(x) \cdot h + \cdots + \frac{1}{(r-1)!} D^{r-1} f(x) \cdot (h, \ldots, h) + R_{r-1}(x, h),$$
where $R_{r-1}(x, h)$ is the remainder. Furthermore,
$$\frac{R_{r-1}(x, h)}{\|h\|^{r-1}} \to 0 \quad \text{as } h \to 0.$$
Other forms in which the remainder term can be cast are given in the proof of the theorem. This theorem is a generalization of the mean value theorem (in which case $r = 1$) and of Taylor's theorem encountered in one-variable calculus.
Note. The last two statements in Taylor's theorem require $f$ to be only $C^{r-1}$, as is seen by an examination of the proof.
Chapter 6 Differentiable Mappings
360
The basic idea of the proof of Taylor's theorem is to use the fundamental theorem of calculus (§4.8) to write
$$f(x + h) - f(x) = \int_0^1 \frac{d}{dt} f(x + th)\, dt$$
and then to repeatedly integrate by parts, generating a series. From Taylor's theorem, we are led to form the Taylor series about $x_0$,
$$\sum_{k=0}^{\infty} \frac{1}{k!} D^k f(x_0)(x - x_0, \ldots, x - x_0).$$
This series need not converge to $f(x)$ even if $f$ is $C^\infty$. If it does so in a neighborhood of $x_0$, we say that $f$ is real analytic at $x_0$. Thus, a function $f$ is real analytic if the remainder term $(1/r!)\, D^r f(c)(x - x_0, \ldots, x - x_0) \to 0$ as $r \to \infty$. For example, if $e^x$, $\sin x$, and $\cos x$ are defined in more traditional ways rather than the power series approach we took in §5.4, then this method can be used to establish the power series expansions.
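6.8.6 Example Justify and then verify equality of mixed partials for $f(x, y) = y x^2 \cos(y^2)$.
Solution Since $f$ is $C^\infty$, Theorem 6.8.3 guarantees equality of mixed partials. To see this explicitly, we compute as follows:
$$\frac{\partial f}{\partial x} = 2xy \cos y^2, \qquad \frac{\partial^2 f}{\partial y\, \partial x} = 2x \cos y^2 - 4xy^2 \sin y^2;$$
$$\frac{\partial f}{\partial y} = x^2 \cos y^2 - 2y^2 x^2 \sin y^2, \qquad \frac{\partial^2 f}{\partial x\, \partial y} = 2x \cos y^2 - 4y^2 x \sin y^2.$$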
•
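As a numerical sanity check of Example 6.8.6, the following sketch approximates both mixed partials by nested central differences and compares them with the closed form computed above. The step size `h`, the sample point, and all helper names are illustrative choices, not part of the text. Each nested difference is, up to scaling, the "difference of differences" quantity used to motivate Theorem 6.8.3.

```python
import math

def f(x, y):
    # Function from Example 6.8.6: f(x, y) = y * x^2 * cos(y^2)
    return y * x**2 * math.cos(y**2)

def d_dx(g, h=1e-3):
    # Central-difference approximation of the partial derivative in x
    return lambda x, y: (g(x + h, y) - g(x - h, y)) / (2 * h)

def d_dy(g, h=1e-3):
    # Central-difference approximation of the partial derivative in y
    return lambda x, y: (g(x, y + h) - g(x, y - h)) / (2 * h)

def exact_mixed(x, y):
    # Closed form from the example: 2x cos(y^2) - 4 x y^2 sin(y^2)
    return 2 * x * math.cos(y**2) - 4 * x * y**2 * math.sin(y**2)

f_xy = d_dy(d_dx(f))  # differentiate in x first, then in y
f_yx = d_dx(d_dy(f))  # differentiate in y first, then in x

x0, y0 = 0.7, 1.3     # arbitrary sample point (an assumption)
print(f_xy(x0, y0), f_yx(x0, y0), exact_mixed(x0, y0))
```

The three printed numbers should agree to several decimal places; the agreement is only approximate because finite differences carry truncation and rounding error.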
6.8.7 Example If $f$ is $C^\infty$ on $\mathbb{R}$ and for every interval $[a, b]$ there is a constant $M$ such that $|f^{(n)}(x)| \le M^n$ for all $n$ and $x \in [a, b]$, show that $f$ is analytic at each $x_0$ and
$$f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(x_0)}{n!} (x - x_0)^n.$$
Solution Select $b$ with $-b < x_0 < b$. The remainder is
$$\left| \frac{f^{(n)}(c)}{n!} (x - x_0)^n \right| \le \frac{M^n |x - x_0|^n}{n!},$$
where $c$ is between $x$ and $x_0$ and $|f^{(n)}(c)| \le M^n$ on $[-b, b]$. This tends to $0$ as $n \to \infty$, since by the ratio test, the corresponding series converges. Observe that the convergence is uniform on all bounded intervals (why?). •
6.8.8 Example Give an example of a $C^\infty$ function that is not analytic.
Solution Let $f(x) = 0$ if $x = 0$, and $f(x) = e^{-1/x^2}$ if $x \ne 0$.
The only place where smoothness of $f$ is in doubt is at $x = 0$. At that point we can use l'Hôpital's rule to evaluate the derivative as the limit of a difference quotient:
$$f'(0) = \lim_{x \to 0} \frac{f(x) - f(0)}{x - 0} = \lim_{x \to 0} \frac{1}{x} e^{-1/x^2} = 0.$$
(Supply the details for the application of l'Hôpital's rule.) Similarly, for $x \ne 0$, we have $f'(x) = (2/x^3) e^{-1/x^2}$, and a similar application of l'Hôpital's rule shows that $f''(0)$ exists and is $0$. One proceeds inductively to show that $f^{(n)}(0)$ exists and is $0$ for each $n > 0$. Thus $f$ is $C^\infty$. But the Taylor series for $f$ with center at $x_0 = 0$ is identically $0$. This Taylor series does not converge to the value of the function in any nontrivial neighborhood of $0$, so $f$ is not analytic at $0$. •
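The flatness of this function at the origin can be observed numerically. The sketch below evaluates $f(x)/x^n$ at a few small points for several exponents $n$; the sample points and exponents are arbitrary choices for illustration. The ratios shrink rapidly as $x$ approaches $0$, which is consistent with every Taylor coefficient at $0$ vanishing, while $f$ itself is strictly positive away from $0$.

```python
import math

def f(x):
    # f(x) = exp(-1/x^2) for x != 0, and f(0) = 0
    return 0.0 if x == 0 else math.exp(-1.0 / x**2)

def ratios(n, points=(0.2, 0.1, 0.05)):
    # f(x) / x^n at points approaching 0; this tends to 0 for every fixed n
    return [f(x) / x**n for x in points]

for n in (1, 5, 10):
    print(n, ratios(n))

# The Taylor series at 0 is identically 0, yet f is positive away from 0:
print(f(0.5))
```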
6.8.9 Example Compute the second-order Taylor formula for $f(x, y) = \cos(x + 2y)$ around $(0, 0)$.
Solution Here
$$f(0, 0) = 1, \qquad \frac{\partial f}{\partial x}(0, 0) = -\sin(0 + 2 \cdot 0) = 0, \qquad \frac{\partial f}{\partial y}(0, 0) = -2 \sin(0 + 2 \cdot 0) = 0,$$
$$\frac{\partial^2 f}{\partial x^2}(0, 0) = -\cos(0) = -1, \qquad \frac{\partial^2 f}{\partial y^2}(0, 0) = -4 \cos 0 = -4,$$
and
$$\frac{\partial^2 f}{\partial x\, \partial y}(0, 0) = \frac{\partial^2 f}{\partial y\, \partial x}(0, 0) = -2 \cos 0 = -2.$$
Thus
$$f(h, k) = 1 - \frac{1}{2}(h^2 + 4hk + 4k^2) + R_2((0, 0), (h, k)),$$
where
$$\frac{R_2((0, 0), (h, k))}{\|(h, k)\|^2} \to 0 \quad \text{as } (h, k) \to (0, 0).$$
•
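The remainder estimate in Example 6.8.9 can be checked numerically. The sketch below compares $\cos(x + 2y)$ with its second-order Taylor polynomial along the diagonal $(t, t)$ and prints the ratio of the remainder to $\|(h, k)\|^2$; the direction and the values of $t$ are arbitrary choices for illustration.

```python
import math

def f(x, y):
    return math.cos(x + 2 * y)

def taylor2(h, k):
    # Second-order Taylor polynomial at (0, 0) from Example 6.8.9
    return 1.0 - 0.5 * (h * h + 4 * h * k + 4 * k * k)

def remainder_ratio(h, k):
    # |R_2((0,0),(h,k))| divided by the squared norm of (h, k)
    return abs(f(h, k) - taylor2(h, k)) / (h * h + k * k)

# Approach the origin along the diagonal (t, t)
ratios = [remainder_ratio(t, t) for t in (1e-1, 1e-2, 1e-3)]
print(ratios)
```

The printed ratios decrease toward $0$, as the theorem predicts; along this diagonal they behave roughly like a constant times $t^2$.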
Exercises for §6.8 1.
Verify the equality of mixed partials for J(x,y) = (ex2+ix)xy2.
2.
Use Example 6.8.7 to establish the Taylor series and analyticity of $e^x$, $\sin x$, and $\cos x$ on all of $\mathbb{R}$, assuming we know their derivatives.
3.
Let $f : \mathbb{R} \to \mathbb{R}$ be defined by $f(x) = x^2 \sin(1/x)$ if $x \in\, ]-1, 1[$, $x \ne 0$, and $f(x) = 0$ if $x = 0$. Investigate the validity of Taylor's theorem for $f$ about the point $x = 0$.
4.
Find the Taylor series representation about $x = 0$ for $\log(1 - x)$, $-1 < x < 1$, and show that it equals $\log(1 - x)$ on $-1 < x < 1$, and also show that it converges uniformly on closed subintervals of $]-1, 1[$.
5.
Compute the second-order Taylor formula for $f(x, y) = e^x \cos y$ around $(0, 0)$.
6.
Verify that if the conditions in Example 6.8.7 are met, then we can differentiate the Taylor series term by term to obtain $f'(x)$.
§6.9 Maxima and Minima
An important application of Taylor's theorem is the determination of maxima and minima of real-valued functions. As we might expect from our knowledge of functions of one variable, the criteria involve the second derivative. Let us first recall the real-variable case.
If $f : ]a, b[ \to \mathbb{R}$ has a local maximum or minimum at $x_0$ and $f$ is differentiable at $x_0$, then $f'(x_0) = 0$. Furthermore, if $f$ is twice continuously differentiable and $f''(x_0) < 0$, then $x_0$ is a local maximum, and if $f''(x_0) > 0$, then $x_0$ is a local minimum. To generalize these facts to real-valued functions of $n$ variables, we begin by giving the relevant definitions.
6.9.1 Definition Let $f : A \subset \mathbb{R}^n \to \mathbb{R}$ where $A$ is open. If there is a neighborhood of $x_0 \in A$ on which $f(x_0)$ is a maximum, that is, if $f(x_0) \ge f(x)$ for all $x$ in the neighborhood, we say that $x_0$ is a local maximum point and $f(x_0)$ is a local maximum value for $f$. Similarly, we can define a local minimum of $f$. A point is called extreme if it is either a local minimum or a local maximum for $f$. A point $x_0$ is a critical point if $f$ is differentiable at $x_0$ and if $Df(x_0) = 0$. The first basic fact is presented in the next theorem.
6.9.2 Theorem If $f : A \subset \mathbb{R}^n \to \mathbb{R}$ is differentiable, $A$ is open, and $x_0 \in A$ is an extreme point of $f$, then $Df(x_0) = 0$; that is, $x_0$ is a critical point.
The proof is much the same as for one-variable calculus. The result is intuitively clear, since at an extreme point the graph of $f$ must have a horizontal tangent plane. However, just being a critical point is not sufficient to guarantee that the point is extreme. For example, consider $f(x) = x^3$. For this function, $0$ is a critical point, since $Df(0) = 0$. But $x^3 > 0$ for $x > 0$ and $x^3 < 0$ for $x < 0$, and so $0$ is not extreme. Another example is given by $f(x, y) = y^2 - x^2$. Here $0 = (0, 0)$ is a critical point, since $\partial f / \partial x = -2x$, $\partial f / \partial y = 2y$, and so $Df(0, 0) = 0$. However, in any neighborhood of $0$ we can find points where $f$ is greater than $0$ and points where $f$ is less than $0$. A critical point that is not a local extreme point is called a saddle point. Figure 6.9-1 shows how this terminology originated.
For $f : A \subset \mathbb{R} \to \mathbb{R}$, we have already mentioned that $f(x)$ is a local maximum value if $f'(x) = 0$ and $f''(x) < 0$. This is geometrically clear if we remember that $f''(x) < 0$ means $f$ is concave downward (or that the slopes $f'(x)$ are decreasing). To generalize this, the concept of the Hessian of a function $g$ at $x_0$ is introduced.
6.9.3 Definition If $g : B \subset \mathbb{R}^n \to \mathbb{R}$ is of class $C^2$, the Hessian of $g$ at $x_0$ is defined to be the bilinear function $H_{x_0}(g) : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ given by $H_{x_0}(g)(x, y) = D^2 g(x_0)(x, y)$. Thus the Hessian is, as a matrix, just the matrix of second partials.
FIGURE 6.9-1
Maximum, minimum, and saddle points
A bilinear form, that is, a bilinear mapping $B : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$, is called positive definite if $B(x, x) > 0$ for all $x \ne 0$ in $\mathbb{R}^n$ and is called positive semidefinite if $B(x, x) \ge 0$ for all $x \in \mathbb{R}^n$. Negative definite and negative semidefinite bilinear forms are defined similarly.
Now we can make the following generalization to the multivariable case.
6.9.4 Theorem
i.
If $f : A \subset \mathbb{R}^n \to \mathbb{R}$ is a $C^2$ function defined on an open set $A$ and $x_0$ is a critical point of $f$ such that $H_{x_0}(f)$ is negative definite, then $f$ has a local maximum at $x_0$.
ii.
If $f$ has a local maximum at $x_0$, then $H_{x_0}(f)$ is negative semidefinite.
... $0$ for $k$ even, and that if $H_{x_0}(f)$ is negative semidefinite, then $\Delta_k \le 0$ for $k$ odd and $\Delta_k \ge 0$ for $k$ even. Thus $f$ has a maximum at $x_0$ if $\Delta_k < 0$ for $k$ odd and $\Delta_k > 0$ for $k$ even. If $\Delta_k > 0$ for some odd $k$ or $\Delta_k < 0$ for some even $k$, then $f$ cannot have a maximum value at $x_0$. In fact, if $\Delta_k < 0$ for some even $k$, $f$ can have neither a maximum nor a minimum at $x_0$, and $x_0$ must be a saddle point of $f$ (see Exercise 8 at the end of this chapter for an instance of this).
Note. This theorem is also useful in economics for optimizing quantities such as profits. It is also used in mechanics when $f$ is the potential of a system, because then a minimum corresponds to stability and the maxima and saddle points correspond to instability. (See Marsden and Tromba, Vector Calculus, 3rd ed., W. H. Freeman, New York, 1988, Chap. 4, for details.)
6.9.5 Example Show that the matrix
$$\begin{pmatrix} a & b \\ b & d \end{pmatrix}$$
is positive definite iff $a > 0$ and $ad - b^2 > 0$.
Solution Positive definite means that
$$(x, y) \begin{pmatrix} a & b \\ b & d \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} > 0 \quad \text{if } (x, y) \ne (0, 0);$$
that is, $ax^2 + 2bxy + dy^2 > 0$. First, suppose this is true for all $(x, y) \ne (0, 0)$. Setting $y = 0$, $x = 1$, we get $a > 0$. Setting $y = 1$, we have $ax^2 + 2bx + d > 0$ for all $x$. This function is a parabola with a minimum (since $a > 0$) at $2ax + 2b = 0$. That is, $x = -b/a$. Hence,
$$a \left( -\frac{b}{a} \right)^2 + 2b \left( -\frac{b}{a} \right) + d > 0,$$
that is, $ad - b^2 > 0$. •
The converse may be proved in the same way.
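The criterion of Example 6.9.5 can be compared against a brute-force check. The sketch below tests $a > 0$ and $ad - b^2 > 0$ and, separately, samples the quadratic form $ax^2 + 2bxy + dy^2$ on the unit circle. Sampling the circle is enough for a numerical check because the form is homogeneous of degree 2. The function names and the sample matrices are illustrative assumptions.

```python
import math

def is_positive_definite_2x2(a, b, d):
    # Criterion from Example 6.9.5 for the symmetric matrix [[a, b], [b, d]]
    return a > 0 and a * d - b * b > 0

def min_form_on_unit_circle(a, b, d, samples=3600):
    # Smallest sampled value of a x^2 + 2 b x y + d y^2 with (x, y) on the unit circle
    values = []
    for i in range(samples):
        t = 2 * math.pi * i / samples
        x, y = math.cos(t), math.sin(t)
        values.append(a * x * x + 2 * b * x * y + d * y * y)
    return min(values)

for a, b, d in [(2, -1, 2), (1, 2, 1), (-1, 0, -1)]:
    print((a, b, d), is_positive_definite_2x2(a, b, d), min_form_on_unit_circle(a, b, d))
```

For each sample matrix, the criterion holds exactly when the sampled minimum of the form is positive.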
6.9.6 Example Investigate the nature of the critical point $(0, 0)$ of $f(x, y) = x^2 - xy + y^2$.
Solution The partial derivatives are
$$\frac{\partial f}{\partial x} = 2x - y, \qquad \frac{\partial f}{\partial y} = -x + 2y, \qquad \frac{\partial^2 f}{\partial x^2} = 2, \qquad \frac{\partial^2 f}{\partial x\, \partial y} = -1, \qquad \frac{\partial^2 f}{\partial y^2} = 2,$$
and so the Hessian evaluated at $x = 0$, $y = 0$ is
$$\begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix}.$$
Here $\Delta_1 = 2 > 0$ and $\Delta_2 = 4 - 1 = 3 > 0$, and so the Hessian is positive definite. Thus we have a local minimum. •
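The determinant test used in Example 6.9.6 can be sketched for small Hessians as follows. This is a minimal illustration that assumes $\Delta_k$ denotes the determinant of the upper-left $k \times k$ block of the Hessian; the function names and the returned labels are hypothetical. The determinant is computed by cofactor expansion, which is adequate only for small matrices.

```python
def det(m):
    # Determinant by cofactor expansion along the first row (small matrices only)
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

def leading_minors(m):
    # Delta_k = determinant of the upper-left k x k block, k = 1, ..., n
    return [det([row[:k] for row in m[:k]]) for k in range(1, len(m) + 1)]

def classify(hessian):
    minors = leading_minors(hessian)
    if all(d > 0 for d in minors):
        return "local minimum"   # Hessian positive definite
    if all((d < 0 if k % 2 == 1 else d > 0) for k, d in enumerate(minors, 1)):
        return "local maximum"   # Hessian negative definite
    if any(d < 0 for k, d in enumerate(minors, 1) if k % 2 == 0):
        return "saddle point"    # some even-order minor is negative
    return "inconclusive"        # the test does not decide

print(classify([[2, -1], [-1, 2]]))   # Hessian from Example 6.9.6
print(classify([[-2, 0], [0, 2]]))    # Hessian of y^2 - x^2 at the origin
```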
Exercises for §6.9
1.
Prove that
$$\begin{pmatrix} a & b \\ b & d \end{pmatrix}$$
is negative definite iff $a < 0$ and $ad - b^2 > 0$.
2.
Investigate the nature of the critical point $(0, 0)$ of $f(x, y) = x^2 + 2xy + y^2 + 6$.
3.
Investigate the nature of the critical point $(0, 0, 0)$ of $f(x, y, z) = x^2 + y^2 + 2z^2 + xyz$.
4.
(This exercise assumes a knowledge of linear algebra.) Let $A$ be a symmetric matrix. Show that $A$ is positive definite if and only if the eigenvalues of $A$ (which exist and are real, since $A$ is symmetric) are positive. Is this true if $A$ is not symmetric?
5.
Check that the matrix
O) ( 0l O 0 0 0 0 -1
has A;
6.
~
0 yet the matrix is not semidefinite.
Determine the nature of the critical point (0, 0) of the function x3 + 2xy2 y4 + x2 + 3.xy + y2 + 10.
Theorem Proofs for Chapter 6
6.1.2 Theorem Let $A$ be an open set in $\mathbb{R}^n$ and suppose $f : A \to \mathbb{R}^m$ is differentiable at $x_0$. Then $Df(x_0)$ is uniquely determined by $f$.
Proof Let $L_1$ and $L_2$ be two linear mappings satisfying the conditions of the definition of the derivative. We must show that $L_1 = L_2$. Fix $e \in \mathbb{R}^n$, $\|e\| = 1$, and let $x = x_0 + \lambda e$ for $\lambda \in \mathbb{R}$. Then note that $|\lambda| = \|x - x_0\|$ and
$$\|L_1 \cdot e - L_2 \cdot e\| = \frac{\|L_1 \cdot \lambda e - L_2 \cdot \lambda e\|}{|\lambda|}.$$
Since $A$ is open, $x \in A$ for $\lambda$ sufficiently small. By the triangle inequality,
$$\|L_1 \cdot e - L_2 \cdot e\| = \frac{\|L_1(x - x_0) - L_2(x - x_0)\|}{\|x - x_0\|} \le \frac{\|f(x) - f(x_0) - L_1(x - x_0)\|}{\|x - x_0\|} + \frac{\|f(x) - f(x_0) - L_2(x - x_0)\|}{\|x - x_0\|}.$$
As $\lambda \to 0$, these two terms each tend to $0$, and so $L_1 \cdot e = L_2 \cdot e$. Our selection of $e$ was arbitrary, except that $\|e\| = 1$. But for any nonzero $y \in \mathbb{R}^n$, $y / \|y\| = e$ has length 1, and, by linearity, if $L_1(e) = L_2(e)$, then $L_1(y) = L_2(y)$.
•
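6.2.2 Theorem Suppose $A \subset \mathbb{R}^n$ is an open set and $f : A \to \mathbb{R}^m$ is differentiable on $A$. Then the partial derivatives $\partial f_j / \partial x_i$ exist, and the matrix of the linear map $Df(x)$ with respect to the standard bases in $\mathbb{R}^n$ and $\mathbb{R}^m$ is given by
$$\begin{pmatrix} \dfrac{\partial f_1}{\partial x_1} & \dfrac{\partial f_1}{\partial x_2} & \cdots & \dfrac{\partial f_1}{\partial x_n} \\ \dfrac{\partial f_2}{\partial x_1} & \dfrac{\partial f_2}{\partial x_2} & \cdots & \dfrac{\partial f_2}{\partial x_n} \\ \vdots & \vdots & & \vdots \\ \dfrac{\partial f_m}{\partial x_1} & \dfrac{\partial f_m}{\partial x_2} & \cdots & \dfrac{\partial f_m}{\partial x_n} \end{pmatrix},$$
where each partial derivative is evaluated at $x = (x_1, \ldots, x_n)$. This matrix is called the Jacobian matrix of $f$ or the derivative matrix.
Proof By definition of the matrix of a linear mapping, the $ji$th matrix element of $Df(x)$ is the $j$th component of the vector $Df(x) \cdot e_i$, that is, $Df(x)$ applied to the $i$th standard basis vector, $e_i$. Call this component $a_{ji}$. Let $y = x + h e_i$ for $h \in \mathbb{R}$ and note that
$$\frac{\|f(y) - f(x) - Df(x) \cdot (y - x)\|}{\|y - x\|} = \frac{\|f(x_1, \ldots, x_i + h, \ldots, x_n) - f(x_1, \ldots, x_n) - h\, Df(x) \cdot e_i\|}{|h|}.$$
Since this tends to $0$ as $h \to 0$, then so does the $j$th component of the numerator, and so, as $h \to 0$,
$$\frac{|f_j(x_1, \ldots, x_i + h, \ldots, x_n) - f_j(x_1, \ldots, x_n) - h a_{ji}|}{|h|} \to 0.$$
Therefore,
$$a_{ji} = \lim_{h \to 0} \frac{f_j(x_1, \ldots, x_i + h, \ldots, x_n) - f_j(x_1, \ldots, x_n)}{h} = \frac{\partial f_j}{\partial x_i}.$$
•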
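The identification of the entries $a_{ji}$ with difference quotients suggests a direct numerical approximation of the Jacobian matrix. The sketch below uses central differences with a fixed step; the step size, the helper name `jacobian`, and the test map are illustrative assumptions rather than anything prescribed by the text.

```python
import math

def jacobian(f, x, h=1e-6):
    # Approximate the m x n Jacobian of f at x; entry [j][i] approximates
    # the partial derivative of f_j with respect to x_i.
    m, n = len(f(x)), len(x)
    J = [[0.0] * n for _ in range(m)]
    for i in range(n):
        xp, xm = list(x), list(x)
        xp[i] += h   # x + h e_i
        xm[i] -= h   # x - h e_i
        fp, fm = f(xp), f(xm)
        for j in range(m):
            J[j][i] = (fp[j] - fm[j]) / (2 * h)
    return J

def example_map(v):
    # Hypothetical map from R^2 to R^2: (x, y) -> (x^2 y, sin x + y)
    x, y = v
    return [x * x * y, math.sin(x) + y]

J = jacobian(example_map, [1.0, 2.0])
print(J)  # compare with the exact matrix [[2xy, x^2], [cos x, 1]] at (1, 2)
```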
6.3.1 Proposition Suppose $A \subset \mathbb{R}^n$ is open and $f : A \to \mathbb{R}^m$ is differentiable on $A$. Then $f$ is continuous. In fact, for each $x_0 \in A$, there are a constant $M > 0$ and a $\delta_0 > 0$ such that $\|x - x_0\| < \delta_0$ implies $\|f(x) - f(x_0)\| \le M \|x - x_0\|$. (This is called the local Lipschitz property.)
For the proof, we recall that if $L : \mathbb{R}^n \to \mathbb{R}^m$ is a linear transformation, then there is a constant $M_0$ such that $\|Lx\| \le M_0 \|x\|$ for all $x \in \mathbb{R}^n$ (see Worked Example 4.4 at the end of Chapter 4). Here we shall be taking $L = Df(x_0)$.
Proof To prove continuity, it suffices to prove the stated Lipschitz property, since, given $\varepsilon > 0$, we can choose $\delta = \min(\delta_0, \varepsilon / M)$. To prove the Lipschitz property, let $\varepsilon = 1$ in the definition of the derivative. Then there is a $\delta_0$ so that $\|x - x_0\| < \delta_0$ implies $\|f(x) - f(x_0) - Df(x_0)(x - x_0)\| \le \|x - x_0\|$, which implies $\|f(x) - f(x_0)\| \le \|Df(x_0)(x - x_0)\| + \|x - x_0\|$ (here we use the triangle inequality in the form $\|y\| - \|z\| \le \|y - z\|$, which follows by writing $y = (y - z) + z$ and applying the usual form of the triangle inequality). Let $M = M_0 + 1$ and use the fact that $\|Df(x_0)(x - x_0)\| \le M_0 \|x - x_0\|$ to give the result.
•
6.4.1 Theorem Let $A \subset \mathbb{R}^n$ be open and $f : A \subset \mathbb{R}^n \to \mathbb{R}^m$. Suppose $f = (f_1, \ldots, f_m)$. If each of the partials $\partial f_j / \partial x_i$ exists and is continuous on $A$, then $f$ is differentiable on $A$.
Proof If $Df(x)$ is to exist, its matrix representation must be the Jacobian matrix, by Theorem 6.2.2. We need to show that with $x \in A$ fixed, for any $\varepsilon > 0$ there is a $\delta > 0$ such that $\|y - x\| < \delta$, $y \in A$ implies $\|f(y) - f(x) - Df(x)(y - x)\| < \varepsilon \|y - x\|$. To do this, it suffices to prove this for each component of $f$ separately (why?). Therefore, we can suppose $m = 1$. Write
$$\begin{aligned} f(y) - f(x) = {} & f(y_1, \ldots, y_n) - f(x_1, y_2, \ldots, y_n) + f(x_1, y_2, \ldots, y_n) - f(x_1, x_2, y_3, \ldots, y_n) \\ & + f(x_1, x_2, y_3, \ldots, y_n) - f(x_1, x_2, x_3, y_4, \ldots, y_n) + \cdots + f(x_1, \ldots, x_{n-1}, y_n) - f(x_1, \ldots, x_n). \end{aligned}$$
Now use the mean value theorem, which implies
$$f(y_1, \ldots, y_n) - f(x_1, y_2, \ldots, y_n) = \frac{\partial f}{\partial x_1}(u_1, y_2, \ldots, y_n)(y_1 - x_1)$$
for some $u_1$ between $x_1$ and $y_1$ ($y_2, \ldots, y_n$ are fixed). We write similar expressions for the other terms and get
$$f(y) - f(x) = \left( \frac{\partial f}{\partial x_1}(u_1, y_2, \ldots, y_n) \right)(y_1 - x_1) + \left( \frac{\partial f}{\partial x_2}(x_1, u_2, y_3, \ldots, y_n) \right)(y_2 - x_2) + \cdots + \left( \frac{\partial f}{\partial x_n}(x_1, x_2, \ldots, x_{n-1}, u_n) \right)(y_n - x_n).$$
Since $Df(x)(y - x) = \sum_{i=1}^{n} (\partial f / \partial x_i)(x_1, \ldots, x_n)(y_i - x_i)$,
$$\|f(y) - f(x) - Df(x)(y - x)\| \le \left\{ \left| \frac{\partial f}{\partial x_1}(u_1, y_2, \ldots, y_n) - \frac{\partial f}{\partial x_1}(x_1, \ldots, x_n) \right| + \cdots + \left| \frac{\partial f}{\partial x_n}(x_1, \ldots, x_{n-1}, u_n) - \frac{\partial f}{\partial x_n}(x_1, \ldots, x_n) \right| \right\} \|y - x\|,$$
using the triangle inequality and the fact that $|y_i - x_i| \le \|y - x\|$. Since the terms $\partial f / \partial x_i$ are continuous and $u_i$ lies between $y_i$ and $x_i$, there is a $\delta > 0$ such that the term in braces is less than $\varepsilon$ for $\|y - x\| < \delta$. This estimate proves the assertion.
•
6.5.1 Chain Rule Let $A \subset \mathbb{R}^n$ be open and $f : A \to \mathbb{R}^m$ be differentiable at $x_0 \in A$. Let $B \subset \mathbb{R}^m$ be open, $f(A) \subset B$, and $g : B \to \mathbb{R}^p$ be differentiable at $f(x_0)$. Then the composite $g \circ f$ is differentiable at $x_0$ and $D(g \circ f)(x_0) = Dg(f(x_0)) \circ Df(x_0)$.
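Proof To prove that $D(g \circ f)(x_0) \cdot y = Dg(f(x_0)) \cdot (Df(x_0) \cdot y)$, we will show that
$$\lim_{x \to x_0} \frac{\|g \circ f(x) - g \circ f(x_0) - Dg(f(x_0)) \cdot [Df(x_0)(x - x_0)]\|}{\|x - x_0\|} = 0.$$
To do this, estimate the numerator as follows:
$$\begin{aligned} & \|g \circ f(x) - g \circ f(x_0) - Dg(f(x_0)) \cdot (Df(x_0)(x - x_0))\| \\ & \quad = \|g(f(x)) - g(f(x_0)) - Dg(f(x_0)) \cdot (f(x) - f(x_0)) + Dg(f(x_0)) \cdot [f(x) - f(x_0) - Df(x_0)(x - x_0)]\| \\ & \quad \le \|g(f(x)) - g(f(x_0)) - Dg(f(x_0)) \cdot (f(x) - f(x_0))\| + \|Dg(f(x_0)) \cdot [f(x) - f(x_0) - Df(x_0)(x - x_0)]\|. \end{aligned}$$
Since $f$ is differentiable at $x_0$, there are a $\delta_0 > 0$ and an $M > 0$ such that $\|f(x) - f(x_0)\| \le M \|x - x_0\|$ whenever $\|x - x_0\| < \delta_0$, by Proposition 6.3.1. Given $\varepsilon > 0$, there is, by the definition of the derivative of $g$, a $\delta_1 > 0$ such that $\|y - f(x_0)\| < \delta_1$ implies
$$\|g(y) - g(f(x_0)) - Dg(f(x_0))[y - f(x_0)]\| < \left( \frac{\varepsilon}{2M} \right) \|y - f(x_0)\|.$$
Thus $\|x - x_0\| < \delta_2 = \min\{\delta_0, \delta_1 / M\}$ implies
$$\frac{\|g(f(x)) - g(f(x_0)) - Dg(f(x_0))[f(x) - f(x_0)]\|}{\|x - x_0\|} < \frac{\varepsilon}{2}.$$
Since $Dg(f(x_0))$ is a linear map, we know that there is a constant $N$ such that $\|Dg(f(x_0))(y)\| \le N \cdot \|y\|$ for all $y \in \mathbb{R}^m$, where it can be assumed that $N \ne 0$. Now, by definition of the derivative, there is a $\delta_3 > 0$ such that $\|x - x_0\| < \delta_3$ implies
$$\frac{\|f(x) - f(x_0) - Df(x_0)(x - x_0)\|}{\|x - x_0\|} < \frac{\varepsilon}{2N}.$$
Then $\|x - x_0\| < \delta_3$ implies
$$\frac{\|Dg(f(x_0))[f(x) - f(x_0) - Df(x_0)(x - x_0)]\|}{\|x - x_0\|} \le \frac{N \|f(x) - f(x_0) - Df(x_0)(x - x_0)\|}{\|x - x_0\|} < \frac{\varepsilon}{2}.$$
Let $\delta = \min\{\delta_2, \delta_3\}$. Thus, $\|x - x_0\| < \delta$ implies
$$\frac{\|g \circ f(x) - g \circ f(x_0) - Dg(f(x_0)) \cdot Df(x_0)(x - x_0)\|}{\|x - x_0\|} < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$$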
0 and 82/ / 8x1 8x1 > 0 imply f has a l?cal minimum at xo. !!.. > 0 and 82/ / 8x18x1 < 0 imply f has a local maximum at x0 • Ll < 0 implies x0 is a saddle point off. !!..
Consider the following two possible properties for a subset $X$ of $\mathbb{R}^n$:
1
There is a point $x_0 \in X$ such that every other point $x$ in $X$ can be joined to $x_0$ by a straight line in $X$.
2
There is a point $x_0 \in X$ such that every other point $x$ in $X$ can be joined to $x_0$ by a differentiable path in $X$.
a.
Give examples of each kind of set that are not convex.
b.
Show that if $X$ is an open set in $\mathbb{R}^n$ satisfying either of these conditions and $f : X \to \mathbb{R}$ is a differentiable function with zero derivative, then $f$ is constant.
c.
Show that if $X$ is an open subset of $\mathbb{R}^n$, then the following are equivalent: i. Condition 2 above ii. Path connectedness of $X$ iii. Connectedness of $X$
10.
Prove the analogue of Theorem 6.9.4 for minima.
11.
Prove the analogue of Proposition 5.3.3 for f: A c Rn-. Rm.
12.
A function $f : \mathbb{R}^n \to \mathbb{R}$ is called homogeneous of degree $m$ if $f(tx) = t^m f(x)$ for all $x \in \mathbb{R}^n$ and $t \in \mathbb{R}$. If $f$ is differentiable, show that for $x \in \mathbb{R}^n$, $Df(x)x = m f(x)$, that is,
$$\sum_{i=1}^{n} x_i \frac{\partial f}{\partial x_i} = m f(x).$$
Show that maps multilinear in $k$ variables give rise to homogeneous functions of degree $k$. Give other examples.
13.
Use the chain rule to find derivatives of the following, where $f(x, y, z) = x^2 + yz$, $g(x, y) = y^3 + xy$, and $h(x) = \sin x$:
a.
$F(x, y, z) = f(h(x), g(x, y), z)$.
b.
$G(x, y, z) = h(f(x, y, z)\, g(x, y))$.
c.
$H(x, y, z) = g(f(x, y, h(x)), g(z, y))$.
Also find general formulas for the derivatives of $F$, $G$, $H$.
14.
a.
Extend Worked Example 6.2 to multilinear maps.
b.
Apply the result in a. to the case of the determinant map $\det : \mathbb{R}^{n^2} = \mathbb{R}^n \times \cdots \times \mathbb{R}^n \to \mathbb{R}$, to show that $A \in \mathbb{R}^{n^2}$ is a critical point of $\det$ iff $A$ has rank $\le n - 2$.
15.
Let $f : \mathbb{R} \to \mathbb{R}$ be differentiable. Assume there is no $x \in \mathbb{R}$ such that $f$ and $f'$ both vanish at $x$. Show that $S = \{ x \mid 0 \le x \le 1,\ f(x) = 0 \}$ is finite.
16.
If $f : \mathbb{R}^n \to \mathbb{R}^m$ is differentiable and $Df$ is a constant, show that $f$ is a linear term plus a constant and that the linear part of $f$ is the constant value of $Df$.
17.
If $f : A \subset \mathbb{R}^n \to \mathbb{R}$ is of class $C^r$ and $Df(x_0) = 0$, $D^2 f(x_0) = 0$, ..., $D^{r-1} f(x_0) = 0$ but $D^r f(x_0)(x, \ldots, x) < 0$ for all $x \in \mathbb{R}^n$, $x \ne 0$, then prove that $f$ has a local maximum at $x_0$.
x' + bx+ c = 0 where b > 0 has exactly one solu-
18.
Prove that the equation tion x ER
19.
In each of the following problems, determine the second-order Taylor fo1mula for the given function about the given point (xo,Yo):
20.
a.
/(x,y)
=(x + y)2, xo = 0,yo =0.
b.
f(x,y)
=e"+Y, Xo =0, Yo= 0.
c.
/(x,y) = 1/(x2 + y2 + 1), Xo = 0, Yo= 0.
d.
j(x,y)
= e-x2-y2 cos(xy), xo = 0, Yo= 0.
e.
J(x,y)
= sin(xy) + cos(xy), xo = 0, yo= 0.
f.
f(x,y)
= e1 cosy, Xo = 1, Yo= 0.
Let L : Rn - t R"' be a linear map. Define IILi I = inf{ M I l!Lxll ~ Mllxll for all x E Rn}. Show that II· II is a norm on the space of linear maps of Rn to R"'.
~ercises for Chapter 6 21.
387
a.
Let f : A C Rn - JR, and assume that for some integer k, !).k < 0, where ~ is evaluated at x0 • Show that f cannot have a (local)
b.
If ~
minimum at·xo.
0 such that llxll < r implies IIDg;(x)II < 1/2n, where g = (g1, ... ,gn). By the mean value theorem, given x E D(O, r), there are points c1, c2, ... , Cn in D(O, r) such that g;(x) g;(x) - g;(O) Dg;(c;)(x - 0) Dg;(c;)(x). Therefore,
=
llg(x)II $
t i=l
=
lg;(x)I =
t
IDg;(c;)(x)I $
i=l
=
t
IIDg;(c;)II llxll
< ll~II
for all x E ] - 1 - p, -1 + p[ and u(-1) 27.
=1 =v(- 1)(
Obtain an estimate on the length of time the solution of dx/dt = t2x3e'~ x(O) = 1 exists.
443
Exercises for Chapter 7
28.
Let $A \subset \mathbb{R}^n$ be compact and let $B \subset C(A, \mathbb{R})$ be compact. Show that there are an $f_0 \in B$ and an $x_0 \in A$ such that $g(x) \le f_0(x_0)$ for all $g \in B$ and $x \in A$.
29.
Let $a_n \ge a_{n+1} \ge 0$ and $a_n \to 0$. Let $f(x) = \sum_{n=0}^{\infty} a_n x^n$. Show that $f(x)$ is continuous on $[-1, 0]$.
30.
Is it possible to solve
$$xy^2 + xzu + yv^2 = 3$$
$$u^3 yz + 2xv - u^2 v^2 = 2$$
for $u(x, y, z)$, $v(x, y, z)$ near $(x, y, z) = (1, 1, 1)$, $(u, v) = (1, 1)$? Compute $\partial v / \partial y$.
31.
Consider the equation $dx/dt = 1 + tx$, $x(0) = 0$. Examine the iteration scheme given in §7.5 to obtain a power series expression for the solution. Examine the radius of convergence.
32.
Compute the index of the function $x^2 + y^2 - 7x - 8y + xy + 16 + (x - 2)^3$ at its critical point $x = 2$, $y = 3$. Discuss the nature of the function near this point.
33.
Give another proof of the Morse lemma as follows. Assume x 0 = 0 and
/(.xo) = 0. Use Taylor's theorem to write/(x) =½D2/(0) •(x,x)+ ½Rx(x,x) = ½{Axx, x} so that for each x, Ax is a symmetric linear transfonnation of IRn. By assumption, Ao is an isomorphism, and so Ax is an isomorphism if xis near 0. Let Q,, AoA;', so that Q0 I. Using a power series, we can define the square root Tx of Qx for x close to 0, that is, T; = Qx. Show that QxAx = AxQ;, where T means the transpose matrix, and using the power series for Tx, show that the same equation holds for Tx. Let Sx = T; 1, and conclude that Ax = SxAoS;. Let h(x) S;x, and show that Dh(O) now apply the inverse function theorem to conclude that h is locally invertible. Let g = 1i- 1• Show that
=
=
=
f(x) =
=/;
I
2{Aoh(x), h(x)}
and deduce that f o g(x) = ½D2/(0)(x, x). Finally, use a linear change of coordinates to diagonalize the quadratic fonn ½D2/. Find the relative extrema of /IS in Exercises 34 through 37. 34.
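$f : \mathbb{R}^2 \to \mathbb{R}$, $(x, y) \mapsto x^2 + y^2$, $S = \{ (x, 2) \mid x \in \mathbb{R} \}$.
35.
$f : \mathbb{R}^2 \to \mathbb{R}$, $(x, y) \mapsto x^2 + y^2$, $S = \{ (x, y) \mid x^2 - y^2 = 1 \}$.
36.
$f : \mathbb{R}^2 \to \mathbb{R}$, $(x, y) \mapsto x^2 - y^2$, $S = \{ (x, \cos x) \mid x \in \mathbb{R} \}$.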
31.
f: R 3 --+ R, (x,y,z) - x2 + y2 +z2 ,
38.
A rectangular box with no top is to have a surface area of 16 square meters. Find the dimensions that maximize the volume.
39.
Design a I-liter cylindrical water container that uses the minimum amount of metal.
S = {(x,y,z) I z ~ -2 +x2 + y2 }.
Chapter 8 Integration The reader is undoubtedly familiar with the integration process for functions of one variable and its application to practical problems involving area, volume, arc length, and so on. The purpose of this chapter and the next is to review, solidify, and extend this knowledge. In this chapter we formulate the basic definitions for a general theory of integration. Some familiarity with simple situations involving multiple integrals is useful but not essential. The powerful computational theorems for multiple integrals will be given in the next chapter. These are Fubini's theorem, which enables us to reduce a multiple integral to iterated single integrals, and the change of variables formula, which enables us to change from rectangular coordinates to a more convenient system of coordinates such as polar or spherical coordinates and more generally to change from one coordinate system to another. To obtain a satisfactory theory of multiple integrals, even for continuous functions, it is convenient to introduce the notion of a set of "measure zero." We shall see that one of the main theorems states that a bounded function is integrable iff its discontinuities form a set of measure zero. In particular, a bounded function with a finite or countable number of discontinuities is integrable.
§8.1 Integrable Functions
Building on our work in §4.8, let us outline how integration should be developed in $\mathbb{R}^2$. Suppose $f : A \subset \mathbb{R}^2 \to \mathbb{R}$ is a bounded nonnegative function (see Figure 8.1-1), where $A$ is a bounded set. The graph of the function $f$ is a surface in $\mathbb{R}^3$, and the integration process is used to find the volume under this surface. We enclose $A$ in some rectangle $[a_1, b_1] \times [a_2, b_2]$ and extend $f$ to the whole rectangle by defining it to be zero outside of $A$. Then we divide $[a_1, b_1] \times [a_2, b_2]$ into smaller rectangles by partitioning $[a_1, b_1]$ by, for example, $a_1 = x_0 < x_1 < \cdots < x_{n-1} < x_n = b_1$ and partitioning $[a_2, b_2]$ by, say, $a_2 = y_0 < y_1 < \cdots < y_{m-1} < y_m = b_2$, thus forming $mn$ rectangles $[x_i, x_{i+1}] \times [y_j, y_{j+1}]$. Then we calculate the volume of the shaded block of Figure 8.1-1 to be
$$\inf\{ f(z) \mid z \in [x_i, x_{i+1}] \times [y_j, y_{j+1}] \} \cdot (x_{i+1} - x_i)(y_{j+1} - y_j)$$
and the volume of the shaded block plus the crosshatched block of Figure 8.1-1 to be
$$\sup\{ f(z) \mid z \in [x_i, x_{i+1}] \times [y_j, y_{j+1}] \} \cdot (x_{i+1} - x_i)(y_{j+1} - y_j).$$
Summing these volumes over all $i$ and $j$ (that is, all the $mn$ "subrectangles" of the rectangle $[a_1, b_1] \times [a_2, b_2]$) gives two values, denoted $L(f, P)$ and $U(f, P)$, where $P$ stands for the partition. If $\sup\{ L(f, P) \mid P \text{ is a partition} \} = \inf\{ U(f, P) \mid P \text{ is a partition} \}$, we say that $f$ is (Riemann) integrable over $A$ and define the (Riemann) integral over the set $A$, written $\int_A f$ or $\iint_A f(x, y)\, dx\, dy$, by
$$\int_A f = \sup\{ L(f, P) \} = \inf\{ U(f, P) \}.$$
We have already introduced most of the ideas needed for a theory of integration of bounded functions over bounded sets for arbitrary dimensions. Most of what remains is to formalize the statements for the case $\mathbb{R}^n$.
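The two-dimensional construction can be sketched numerically. The code below computes $L(f, P)$ and $U(f, P)$ on a uniform grid under a simplifying assumption that is not part of the text: $f$ is nondecreasing in each variable on the rectangle, so the infimum over each subrectangle is attained at its lower-left corner and the supremum at its upper-right corner. The function name and the test function $f(x, y) = xy$ on $[0, 1] \times [0, 1]$ are illustrative choices.

```python
def lower_upper_sums(f, a1, b1, a2, b2, n, m):
    # Uniform partition of [a1, b1] x [a2, b2] into n * m subrectangles.
    # Assumes f is nondecreasing in each variable, so inf and sup over a
    # subrectangle are attained at its lower-left and upper-right corners.
    xs = [a1 + (b1 - a1) * i / n for i in range(n + 1)]
    ys = [a2 + (b2 - a2) * j / m for j in range(m + 1)]
    lower = upper = 0.0
    for i in range(n):
        for j in range(m):
            area = (xs[i + 1] - xs[i]) * (ys[j + 1] - ys[j])
            lower += f(xs[i], ys[j]) * area
            upper += f(xs[i + 1], ys[j + 1]) * area
    return lower, upper

product = lambda x, y: x * y   # the integral over the unit square is 1/4

for n in (10, 100):
    print(n, lower_upper_sums(product, 0.0, 1.0, 0.0, 1.0, n, n))
```

As the grid is refined, the lower sums increase and the upper sums decrease toward the common value $1/4$, which is the behavior the definition of integrability asks for.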
FIGURE 8.1-1
The graph of a bounded function and a column helping make up the volume under it
Let $f : A \subset \mathbb{R}^n \to \mathbb{R}$ be a bounded function with domain a bounded set $A$. Choose a rectangle $[a_1, b_1] \times \cdots \times [a_n, b_n]$ that encloses $A$. Furthermore, let $f$ be defined over the whole rectangle by setting it equal to $0$ at points not contained in $A$. Let $P$ be a partition of $[a_1, b_1] \times \cdots \times [a_n, b_n]$ obtained by dividing each $[a_i, b_i]$ into $m_i$ subintervals and forming the $m_1 m_2 \cdots m_n$ rectangles that are products of these subintervals.
Define the volume of the rectangle $B = [a_1, b_1] \times \cdots \times [a_n, b_n]$ to be the product of the lengths of its edges: $\operatorname{vol}(B) = v(B) = (b_1 - a_1)(b_2 - a_2) \cdots (b_n - a_n)$. Let $L(f, P)$ denote the lower sum of $f$ for $P$, defined by
$$L(f, P) = \sum_{R \in P} [\inf\{ f(x) \mid x \in R \}]\, v(R),$$
the sum being over all subrectangles $R$ of the partition $P$. Similarly, let $U(f, P)$ denote the upper sum for $P$, defined by
$$U(f, P) = \sum_{R \in P} [\sup\{ f(x) \mid x \in R \}]\, v(R).$$
Now we observe some properties of $L(f, P)$ and $U(f, P)$. From the definition, we see that for any partition $P$, $L(f, P) \le U(f, P)$. Now suppose $P'$ is any partition that is a refinement of or is finer than $P$; this means that each subrectangle belonging to $P'$ is contained in a subrectangle belonging to $P$. We assert that $L(f, P) \le L(f, P')$. Indeed, this follows from the fact that the infimum of $f$ on a rectangle is less than or equal to the infimum on any rectangle contained in it. Similarly, $U(f, P') \le U(f, P)$. This has the following consequence. If $P'$ and $P''$ are any two partitions of $[a_1, b_1] \times \cdots \times [a_n, b_n]$, then $L(f, P') \le U(f, P'')$. To prove this, let $P$ be a partition of the rectangle that refines both $P'$ and $P''$, which we can arrange by using all the subdivision points of $P'$ and $P''$; then $L(f, P') \le L(f, P) \le U(f, P) \le U(f, P'')$. Since the function $f$ and the rectangle $B$ are bounded, the sets of numbers $\{ L(f, P) \mid P \text{ is a partition of } B \}$ and $\{ U(f, P) \mid P \text{ is a partition of } B \}$ are bounded below by $\inf\{ f(x) \mid x \in B \} \operatorname{vol}(B)$ and above by $\sup\{ f(x) \mid x \in B \} \operatorname{vol}(B)$. Each number in the first set is less than or equal to every number in the second set. Thus, if we set
$$s = \sup\{ L(f, P) \mid P \text{ is a partition of } B \}, \qquad S = \inf\{ U(f, P) \mid P \text{ is a partition of } B \},$$
then $s \le S$. With this notation we can make another definition.
8.1.1 Definition If $f$ is a bounded function on a bounded region $A$, then the upper integral of $f$ on $A$ is defined by
$$\overline{\int_A} f = S = \inf\{ U(f, P) \mid P \text{ is a partition of } B \}$$
and the lower integral of $f$ on $A$ by
$$\underline{\int_A} f = s = \sup\{ L(f, P) \mid P \text{ is a partition of } B \},$$
where $B$ is any rectangle containing the region $A$. We say that $f$ is Riemann integrable (from now on we will just use the word "integrable") if $s = S$ and define the integral of $f$ on $A$ as the common value:
$$\int_A f = \underline{\int_A} f = \overline{\int_A} f.$$
Instead of $\int_A f$, the notation $\int_A f(x)\, dx$ or $\int_A f(x_1, \ldots, x_n)\, dx_1 \cdots dx_n$ is frequently employed. If $f : [a, b] \to \mathbb{R}$, the notation $\int_a^b f$ or $\int_a^b f(x)\, dx$ is also used. There is an important equivalent characterization of the Riemann integral as presented in the next theorem.
8.1.2 Darboux's Theorem Let $A \subset \mathbb{R}^n$ be bounded and lie in some rectangle $S$. Let $f : A \to \mathbb{R}$ be bounded and be extended to $S$ by defining $f = 0$ outside $A$. Then $f$ is integrable with integral $I$ iff for any $\varepsilon > 0$ there is a $\delta > 0$ such that if $P$ is any partition of $S$ into rectangles $S_1, \ldots, S_N$ with sides of length $< \delta$ and if $x_1 \in S_1, \ldots, x_N \in S_N$, we have
$$\left| \sum_{i=1}^{N} f(x_i)\, v(S_i) - I \right| < \varepsilon.$$
We call $\sum_{i=1}^{N} f(x_i)\, v(S_i)$ a Riemann sum.
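Darboux's theorem says that Riemann sums approach the integral no matter how the sample points are chosen, provided the partition is fine enough. A one-dimensional sketch with randomly chosen tags illustrates this for $f(x) = x^2$ on $[0, 1]$, whose integral is $1/3$. The uniform partition, the seeded random generator, and the function names are assumptions made for the illustration.

```python
import random

def riemann_sum(f, a, b, n, rng):
    # Riemann sum over n equal subintervals of [a, b], with one tag chosen
    # at random inside each subinterval
    width = (b - a) / n
    total = 0.0
    for i in range(n):
        left = a + i * width
        tag = left + rng.random() * width
        total += f(tag) * width
    return total

rng = random.Random(0)   # fixed seed so that runs are reproducible
exact = 1.0 / 3.0
sizes = (10, 100, 1000)
errors = [abs(riemann_sum(lambda x: x * x, 0.0, 1.0, n, rng) - exact) for n in sizes]
print(errors)
```

For this increasing function the error of any such sum is at most $U(f, P) - L(f, P) = 1/n$, so the errors are forced to $0$ as the mesh shrinks, whatever tags the generator happens to pick.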
A condition closely related to Theorem 8.1.2 follows.
8.1.3 Riemann's Condition $f$ is integrable iff for any $\varepsilon > 0$ there is a partition $P_\varepsilon$ of $S$ such that $0 \le U(f, P_\varepsilon) - L(f, P_\varepsilon) < \varepsilon$.
The proof of Riemann's condition will be given along with the proof of Darboux's theorem at the end of the chapter. Notice that if $f$ is continuous, we can realize the upper and lower sums for a partition as special Riemann sums, since $f$ assumes its maximum and minimum on each subrectangle at some point of that subrectangle. If $f$ is continuous on the whole rectangle $S$ (= interval if $n = 1$), then it follows from uniform continuity of $f$ (see §4.6) and Riemann's condition that $f$ is integrable. We shall, in fact, prove a more general result in 8.3.1. Let us now illustrate a few of these ideas in the special case of one dimension. These were treated in §4.8, but are worth reviewing at this point.
8.1.4 Example Interpret Riemann sums geometrically for f: [a,b]-+ JR. Solution Let P : a
= xo
< x1 < · · ·
./5 > 0 for all n, contradicting the hypothesis gn(x) -+ 0.
n:,
n:
•
8.6.3 Corollary /ff: [a,b]-+ R,f ~ 0, and the improper integral
J: /
2
0, A can be covered by a finite number of rectangles of total volume< €. Let these rectangles be Vi, ... , VM. Let S be a closed rectangle containing A, and let P be a partition of S into
subrectangles $S_1, \ldots, S_N$ such that each $S_i$ is either contained in some $V_j$ or has at most its boundary in common with some $V_j$'s; the partition is defined by using all the edges of the $V_j$'s. Then $U(1_A, P) = \sum_{i=1}^{M} v(V_i) < \varepsilon$. This implies that $\inf\{ U(1_A, P) \mid P \text{ is a partition} \} = 0$, and hence $L(1_A, P') = 0$ for any partition $P'$, since $0 \le L(1_A, P) \le U(1_A, P)$ for all $P$. Therefore $A$ has volume, and this volume is zero. •
Example 8.2 Let $f_k$ be a sequence of bounded (Riemann) integrable functions defined on $R = [a, b] \times [c, d]$. Suppose $f_k \to f$ uniformly. Prove that $f$ is integrable on $R$ and
$$\lim_{k \to \infty} \int_R f_k(x, y)\, dx\, dy = \int_R f(x, y)\, dx\, dy.$$
Solution (Compare Theorem 5.3.1) First observe that $f$ is bounded. Indeed, if $f_N$ is such that $|f_N(x, y) - f(x, y)| < 1$ for all $(x, y) \in R$, then, using the triangle inequality, $|f(x, y)| \le |f_N(x, y)| + 1$.
Therefore, since fN is bounded, so is f. By Theorem 8.1.2, we must find a number/ such that for every is a 6 > 0 such that
£
> 0, there
n,m
Lf 0 such that for lxp -Xp-d < 6, IYq - Yq-d < 6,
Worked Examples for Chapter 8
489
With this choice, the triangle inequality gives n.m
Et 0.
23.
a.
If
0 such that if P is any partition into rectangles S1, ••• , SN with sides < 6, there exist x 1 E S1, ••• ,XN E SN such that
It.
2S.
has volume zero.
Let f : [0, oo[ -. Show that
f(x;)v(S;) -
/I
< e.
a be nonnegative, integrable, and uniformly continuous. lim f(x) = 0. X--+00
26.
c a", where A is bounded and has volume.
Let f : A c to be unbounded. Suppose C; is a sequence of compact sets with volume, C; c A with C; increasing to A, and assume v(C;) -. v(A) (this is automatically true, as shall be seen in Exercise 15 at the end of Chapter 9). Show that f is integrable iff f is integrable on each C;, lim;_oo fc.f exists, and Consider a set A
an-. IR,/~ 0, but allow f
I
[ f = _lim [
}A
,. . . oo)C;
f.
27.
Prove that if f : A C IR." -. a is continuous, A is open with volume, and 0 for each B C A with volume, then f 0. 8/
28.
Let/: [O, I] -. a be integrable and be continuous atxo. Show that the map x 1-• J;J(y) dx is differentiable with derivative /(Xo). Give an example of
f =
=
493
tfixercises for Chapter 8
a discontinuous integrable f for which this map is not differentiable. For bounded integrable f, prove that this map is always continuous and, in fact, is Lipschitz.
29.
Show that the Cantor set $C \subset [0,1]$ has measure zero (see Exercise 38 at the end of Chapter 3).
30.
Prove the following analogues of the Weierstrass and Dirichlet tests for uniform convergence using the Cauchy criterion:
a.
Let $f : [a,\infty[ \times [c,d] \to \mathbb{R}$, and suppose there is a positive function $M(x)$ defined for $x \in [a,\infty[$, such that $|f(x,s)| \le M(x)$ for all $s \in [c,d]$, and such that $\int_a^\infty M(x)\,dx < \infty$. Then $F(s) = \int_a^\infty f(x,s)\,dx$ converges uniformly in $s$. If $f(x,s)$ is continuous in $(x,s)$, prove that $F$ is continuous.
b.
Let $f : [a,\infty[ \times [c,d] \to \mathbb{R}$ be continuous, and suppose that $\left|\int_a^r f(x,s)\,dx\right| \le M$ for a constant $M$ for all $r \ge a$, $s \in [c,d]$. Suppose $\varphi(x,s)$ is decreasing in $x$ and $\varphi(x,s) \to 0$ as $x \to \infty$ uniformly in $s$. Prove that $F(s) = \int_a^\infty \varphi(x,s) f(x,s)\,dx$ converges uniformly.
31.
Let $f : [a,b] \to \mathbb{R}$ be differentiable and assume that $f'$ is integrable. Prove that $\int_a^b f'(x)\,dx = f(b) - f(a)$.
32.
Suppose that $\varphi$ is a differentiable function on $[a,b]$ and that $f$ is continuous on the range of $\varphi$. Show that
$\frac{d}{dx}\left[\int_{\varphi(a)}^{\varphi(x)} f(t)\,dt\right] = f(\varphi(x))\,\varphi'(x).$
33.
Define a function $f$ on the interval $[0,1]$ by putting $f(x) = 1$ if $x$ is rational and $f(x) = -1$ if $x$ is irrational. Show that $|f|$ is integrable on $[0,1]$ but that $f$ is not.
34.
Define a function $f$ on $[0,1]$ by putting $f(x) = 1$ if $x$ is irrational and $f(x) = 1/q$ if $x$ is a rational number equal to $p/q$ for integers $p$ and $q$ with no common factor. Is $f$ integrable on $[0,1]$? (See also Exercise 38.)
35.
Let $A_n = [(n+1) + (n+2) + \cdots + (n+n)]/n$. Prove that $\lim_{n\to\infty} (1/n)A_n = 3/2$, using the Riemann integral.
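As a numerical sanity check for Exercise 35 (not a proof), the following Python sketch evaluates $(1/n)A_n$ for increasing $n$. The function name `scaled_mean` is our own choice for illustration. Since $(1/n)A_n = (1/n)\sum_{k=1}^{n}(1 + k/n)$, the printed values are Riemann sums for $\int_0^1 (1+x)\,dx$.

```python
def scaled_mean(n):
    """Return (1/n) * A_n, where A_n = [(n+1) + (n+2) + ... + (n+n)] / n."""
    a_n = sum(n + k for k in range(1, n + 1)) / n
    return a_n / n


# The values should drift toward 3/2 as n grows.
for n in (10, 100, 1000, 10000):
    print(n, scaled_mean(n))
```

The sketch only suggests the limit; the exercise still asks for the argument via the Riemann integral.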
36.
Prove that $\lim_{n\to\infty} (n!)^{1/n}/n = e^{-1}$ by considering Riemann sums for $\int_0^1 \log x\,dx$ based on the partition $1/n < 2/n < \cdots < 1$.
37.
a.
Under what conditions is $\int_a^b f(\varphi(t))\,\varphi'(t)\,dt = \int_{\varphi(a)}^{\varphi(b)} f(x)\,dx$?
Chapter 8 Integration
b.
Evaluate $\int \frac{dx}{(1-x)\sqrt{1-x^2}}$ using $x = \cos t$.
38.
Let $f : [0,1] \to \mathbb{R}$ be defined by $f(x) = 0$ if $x$ is irrational and $f(x) = 1/q$ if $x = p/q$, where $p, q \ge 0$ with no common factor. Show that $f$ is integrable, and compute $\int_0^1 f$. (See also Exercise 34.)
39.
Prove that $\log 2 = \lim_{n\to\infty}\left[\frac{1}{n+1} + \frac{1}{n+2} + \cdots + \frac{1}{2n}\right]$.
40.
Find an open subset of $\mathbb{R}$ contained in $]0,1[$ that does not have volume, as follows:
a.
Review the construction of the Cantor set (see Exercise 39, Chapter 3).
b.
Modify the Cantor set by letting $C_k$ be obtained from $C_{k-1}$ by removing the middle $1/2^k$th from each interval of $C_{k-1}$ and letting $C_0 = [0,1]$. Set $C = \bigcap_{k=1}^{\infty} C_k$.
c.
Show that $v(C_k) = \prod_{l=1}^{k}(1 - 1/2^l) \ge 1/4$.
d.
Let $U$ be the complement of $C$. Compute the boundary of $U$, and, using c, show that it cannot have measure zero.
This exercise also produces an example of a compact set C with empty interior that does not have volume.
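The bound in part c of Exercise 40 can be explored numerically. The sketch below assumes the reading that step $k$ removes the fraction $1/2^k$ from each interval, so that $v(C_k) = \prod_{l=1}^{k}(1 - 2^{-l})$. The function name `cantor_like_volumes` is hypothetical and chosen only for this illustration.

```python
def cantor_like_volumes(k_max):
    """Return [v(C_1), ..., v(C_k_max)] with v(C_k) = prod_{l=1}^{k} (1 - 2**-l)."""
    volumes = []
    v = 1.0  # v(C_0) = 1, since C_0 = [0, 1]
    for l in range(1, k_max + 1):
        v *= 1.0 - 2.0 ** (-l)  # step l keeps the fraction 1 - 1/2^l
        volumes.append(v)
    return volumes


vols = cantor_like_volumes(60)
print(vols[:4])   # first few volumes
print(vols[-1])   # numerically stable value of the limiting volume
```

The volumes decrease but stay above 1/4, which is the numerical counterpart of the inequality to be proved; it does not replace the proof.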
41.
Let $R([a,b]) = \{ f : [a,b] \to \mathbb{R} \mid f \text{ is Riemann integrable} \}$. Set
$d(f,g) = \int_a^b |f(x) - g(x)|\,dx.$
Is $d$ a metric on the space $R([a,b])$?
42.
Find a subset A of [0, 1] such that A = cl(int A) and yet bd(A) does not have measure zero.
43.
It is a fact (see J. Marsden and M. Hoffman, Basic Complex Analysis, 2d ed., W. H. Freeman, New York, 1987, p. 26) that
$\sin\frac{\pi}{n}\,\sin\frac{2\pi}{n}\cdots\sin\frac{(n-1)\pi}{n} = \frac{n}{2^{n-1}}.$
Use this identity to evaluate the improper integral
$\int_0^{\pi} \log\sin x\,dx.$
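Before using the identity in Exercise 43, it can be reassuring to test it numerically for a few values of $n$. The following Python sketch does only that; the function name `sine_product` is our own and is not taken from the cited reference.

```python
import math


def sine_product(n):
    """Product sin(pi/n) * sin(2*pi/n) * ... * sin((n-1)*pi/n)."""
    p = 1.0
    for k in range(1, n):
        p *= math.sin(k * math.pi / n)
    return p


# Compare both sides of the identity for several n.
for n in (2, 3, 5, 10, 20):
    print(n, sine_product(n), n / 2 ** (n - 1))
```

The two printed columns should agree up to rounding error.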
44.
Discuss generalizations of Theorem 8.6.1 to $\mathbb{R}^n$.
45.
Discuss generalizations of Theorem 8.6.1 from [0, 1) to [0, oo[.
46.
a.
Suppose $U = \,]-1,1[\, \times \,]-1,1[\, \subset \mathbb{R}^2$, $f : U \to \mathbb{R}$. Assume that $\partial f/\partial x$ and $\partial f/\partial y$ exist at each point of $U$ and are bounded on $U$, where $(x,y)$ are the standard coordinates for $\mathbb{R}^2$. Show that $f$ is continuous at $(0,0)$.
b.
Show by example that boundedness of the partial derivatives is necessary in part a; mere existence is not enough.
47.
For every $\alpha > 0$, compare $\int_0^N x^\alpha\,dx$ with $\sum_{n=1}^{N-1} n^\alpha$ and $\sum_{n=1}^{N} n^\alpha$, and hence determine
$\lim_{N\to\infty} \frac{\sum_{n=1}^{N} n^\alpha}{N^{1+\alpha}}.$
48.
For any function $f(x)$ continuous over the reals, define the sequence $f_n(x) = n\int_x^{x+1/n} f(\xi)\,d\xi$ for $n = 1, 2, 3, \ldots$. Show that $df_n(x)/dx$ exists even if $df(x)/dx$ does not, that $f(x) = \lim_{n\to\infty} f_n(x)$, and that convergence to the limit is uniform when $f$ is uniformly continuous.
49.
Suppose $\{I_k\}$ is a collection of open intervals whose union covers a compact interval $C$ on the real axis; show that some positive $\varepsilon$ exists such that every subinterval of $C$ no wider than $\varepsilon$ lies entirely in at least one of the $I_k$'s.
50.
State whatever lemmas, theorems, and so forth are needed to justify each of the following assertions:
a.
$\lim_{n\to\infty} \sum_{k=1}^{\infty} 2^{-k}\sin(k/n) = 0$.
b.
If $f(x)$ is a power series converging in $]-1,1[$, then the same is true for $f'(x)$.
c.
Let $f(x) = \tan(\pi x/2)$ and set $a_n = f^{(n)}(0)/n!$. Then $\sum_{n=0}^{\infty} a_n$ is not a convergent series. (Do not attempt to compute $a_n$.)
d.
If $f_n(x)$ is differentiable on $[a,b]$ with $|f_n'(x)| < 10$ for all $n$ and $x \in [a,b]$ and if $f_n(x) \to 0$ at each $x$, then $f_n(x) \to 0$ uniformly.
f 0 and O < 0 < 21r. We have the "rnle" dxdy = rdrd8. Since x2+y2 = ,2, we get
1
(x2+y2)dxdy=
A
1
r 2 rdrd8 =
A
1112~ 1-=0
IJ=O
rdBdr=
1' r=O
21rrdr=
21T . •
The justification of the "rule" $dx\,dy = r\,dr\,d\theta$ is given by the change of variables formula (Theorem 9.3.1). The extra factor $r$ is just the Jacobian $\partial(x,y)/\partial(r,\theta) = r$. However, it is easy to heuristically "justify" the rule by regarding $dr$ and $d\theta$ as infinitesimals: $dr$ represents a radial infinitesimal while $r\,d\theta$ represents an infinitesimal length of arc generated by a change in angle of $d\theta$ along a circle of radius $r$. Thus $r\,dr\,d\theta$ is the area element in a sector bounded by $r$, $r + dr$ and $\theta$, $\theta + d\theta$ (see Figure 9.1-2).
In one dimension, the change of variables formula is easy. It states that if $f$ is continuous on $[a,b]$ and we have a mapping
A
a
'f'(X)
There is an analogous theorem with the roles of x and y interchanged. The corollary is a consequence of the theorem if we remember that f is extended to be zero outside A. These results are extended to multiple integrals as follows.
Chapter 9 Fubini's Theorem and the Change of Variables Formula
FIGURE 9.2-1 Fubini's theorem
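9.2.3 Theorem
i.
Let $A \subset \mathbb{R}^n$ and $B \subset \mathbb{R}^m$ be rectangles and let $f : A \times B \subset \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}$ be continuous. Define, for each $x \in A$, $f_x : B \subset \mathbb{R}^m \to \mathbb{R}$ by $f_x(y) = f(x,y)$. Then
$\int_{A\times B} f = \int_A\left(\int_B f(x,y)\,dy\right)dx.$
ii.
If $f$ is integrable and $f_x$ is integrable for each fixed $x$, then again
$\int_{A\times B} f = \int_A\left(\int_B f(x,y)\,dy\right)dx.$
Similarly, if $\int_A f(x,y)\,dx$ exists for each $y$, then
$\int_{A\times B} f = \int_B\left(\int_A f(x,y)\,dx\right)dy.$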
In practice, this theorem may be used repeatedly to reduce a problem to iterated one-dimensional integrals.
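To make the reduction to iterated one-dimensional integrals concrete, here is a minimal Python sketch that integrates a function over the tetrahedron $x, y, z \ge 0$, $x + y + z \le 1$ by nesting a one-dimensional rule three times. The helper names `simpson` and `tetrahedron_integral`, the choice of composite Simpson's rule, and the number of panels are our assumptions, not part of the text.

```python
def simpson(g, a, b, n=20):
    """Composite Simpson approximation of the integral of g over [a, b]; n must be even."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3


def tetrahedron_integral(f):
    """Iterated integral: z from 0 to 1-x-y, then y from 0 to 1-x, then x from 0 to 1."""
    def inner_z(x, y):
        return simpson(lambda z: f(x, y, z), 0.0, 1.0 - x - y)

    def inner_y(x):
        return simpson(lambda y: inner_z(x, y), 0.0, 1.0 - x)

    return simpson(inner_y, 0.0, 1.0)


volume = tetrahedron_integral(lambda x, y, z: 1.0)
value = tetrahedron_integral(lambda x, y, z: (x + y + z) ** 2)
print(volume)  # close to 1/6, the volume of the tetrahedron
print(value)   # close to 1/10
```

Each nesting level corresponds to one application of Fubini's theorem; the order z, y, x is one of several admissible orders for this region.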
9.2.4 Example Evaluate
$\int_A (x + y + z)^2\,dx\,dy\,dz,$
where $A$ is the three-dimensional volume sketched in Figure 9.2-2.
§9.2 Fubini's Theorem
FIGURE 9.2-2 A tetrahedral region in $\mathbb{R}^3$
Solution Here $A$ is the set $\{(x,y,z) \in \mathbb{R}^3 \mid x \ge 0, y \ge 0, z \ge 0, \text{ and } x + y + z \le 1\}$, and so it consists of those points $(x,y,z)$ for which $x \ge 0$, $y \ge 0$, $x + y \le 1$, and $0 \le z \le 1 - (x+y)$. Let $B = \{(x,y) \mid x \ge 0, y \ge 0, \text{ and } x + y \le 1\}$. Then, by Theorem 9.2.3, and remembering that $f$ is zero outside $A$,
$\int_A (x+y+z)^2\,dx\,dy\,dz = \int_B\left(\int_0^{1-(x+y)} (x+y+z)^2\,dz\right)dx\,dy.$
Similarly, $B$ consists of those points $(x,y)$ for which $x \in [0,1]$ and $y \in [0, 1-x]$, and so
1
[1
,,_ ududv= 0. O (,.2/4)-1
(Strictly speaking, one should first apply the change of vruiables theorem only to the integral fc1 0, we can choose 6 > 0 such that
8/
18y (xo,Yo)
8/
- oy (x,y)
I
0, there is a B such that
IG(y)-
lb
% -1.
Evaluate
J01 x' (log(x))R dx.
sin(21rt) - 21rtcos(21rt)
t
2
by considering the derivative of J;" cos(tx)dx with respect tot.
:t fo
2
3.
Evaluate
4.
Carry out the proof of Proposition 9.7.5.
5.
Evaluate $\int_0^\infty x^n e^{-ax}\,dx$ by repeatedly differentiating $\int_0^\infty e^{-ax}\,dx$. (Here $a > 0$.) Justify the application of Proposition 9.7.5.
Theorem Proofs for Chapter 9
9.2.1 Theorem
i.
Let $A$ be the rectangle described by $a \le x \le b$, $c \le y \le d$, and let $f : A \to \mathbb{R}$ be continuous. Then
$\int_A f = \int_a^b\left(\int_c^d f(x,y)\,dy\right)dx.$
The expression
$\int_a^b\left(\int_c^d f(x,y)\,dy\right)dx$
means that the function $g(x) = \int_c^d f(x,y)\,dy$ is integrated from $a$ to $b$.
ii.
In i, suppose $f$ is integrable and the function $f_x : [c,d] \to \mathbb{R}$ defined by $f_x(y) = f(x,y)$ is integrable for each fixed $x \in [a,b]$. Then
$\int_A f = \int_a^b\left(\int_c^d f(x,y)\,dy\right)dx.$
One can similarly assume that $\int_a^b f(x,y)\,dx$ exists for each $y$ and obtain
$\int_A f = \int_c^d\left(\int_a^b f(x,y)\,dx\right)dy.$
Proof Since i is a special case of ii, we need only prove ii. Let $g : [a,b] \subset \mathbb{R} \to \mathbb{R}$ be the function defined by
$g(x) = \int_c^d f(x,y)\,dy.$
We must show that $g$ is integrable over $[a,b]$ and that $\int_A f = \int_a^b g(x)\,dx$. Suppose $a = x_0 < x_1 < \cdots < x_n = b$ and $c = y_0 < \cdots < y_m = d$ are partitions of the intervals $[a,b]$ and $[c,d]$. Let $P_{[a,b]}$ be the partition of $[a,b]$ given by the sets $V_i = [x_{i-1}, x_i]$, $P_{[c,d]}$ the partition of $[c,d]$ given by the sets $W_j = [y_{j-1}, y_j]$, and $P_A$ the partition of $A$ given by the sets
$S_{ij} = V_i \times W_j.$
Then
$L(f, P_A) = \sum_{i,j} m_{S_{ij}}(f)\,v(S_{ij}) = \sum_i \sum_j m_{S_{ij}}(f)\,v(W_j)\,v(V_i),$
where $m_S(f)$ is the minimum (inf) of $f$ on the set $S$. For $x \in V_i$, we have $m_{S_{ij}}(f) \le m_{W_j}(f_x)$, where $f_x$ is defined by $f_x(y) = f(x,y)$. Hence,
$\sum_j m_{S_{ij}}(f)\,v(W_j) \le \sum_j m_{W_j}(f_x)\,v(W_j) \le \int_c^d f_x(y)\,dy = g(x).$
Since this inequality holds for any $x \in V_i$, we get
$\sum_j m_{S_{ij}}(f)\,v(W_j) \le m_{V_i}(g).$
Therefore,
$L(f, P_A) \le \sum_i m_{V_i}(g)\,v(V_i) \le L(g, P_{[a,b]}).$
From this and from a similar argument for upper bounds, we obtain the inequalities
$L(f, P_A) \le L(g, P_{[a,b]}) \le U(g, P_{[a,b]}) \le U(f, P_A).$
Since $f$ is integrable over $A$, these inequalities show that $g$ is integrable and
$\int_A f = \int_a^b g(x)\,dx.$
Using the same argument, if we assume that $\int_a^b f(x,y)\,dx$ exists for each $y$, we get
$\int_A f = \int_c^d\left(\int_a^b f(x,y)\,dx\right)dy.$ ■
9.2.2 Corollary Let $\varphi, \psi : [a,b] \to \mathbb{R}$ be continuous maps such that $\varphi(x) \le \psi(x)$ for all $x \in [a,b]$, let $A = \{(x,y) \mid a \le x \le b,\ \varphi(x) \le y \le \psi(x)\}$, and let $f : A \to \mathbb{R}$ be continuous. Then (see Figure 9.2-1)
$\int_A f = \int_a^b\left(\int_{\varphi(x)}^{\psi(x)} f(x,y)\,dy\right)dx.$
Proof Let $S = [a,b] \times [c,d]$ be a closed rectangle enclosing $A$, and extend $f$ to $S$ by setting it equal to 0 on $S \setminus A$. The sets $\mathrm{graph}(\varphi) = \{(x,\varphi(x)) \mid x \in [a,b]\}$ and $\mathrm{graph}(\psi) = \{(x,\psi(x)) \mid x \in [a,b]\}$ are of measure zero (by Exercise 23, Chapter 8). Thus, the set of discontinuities of $f$ defined on $S$ is of measure zero and so $f$ is integrable over $S$. Also, for any $x$, $f_x$ is continuous on $[c,d]$, except possibly at 0 for all x, that/ is convex upward, i.e., !(x;y) $f(x)~f(y)_
13.
=
=
Suppose C C A x B and v(C) 0. Let Cx {y E B assume that I v _ { 1 if (x,y) E C exO X
~Q.
§10.4 Functions of Bounded Variation and Fejer Theory (Optional)
There is a theorem that is similar to the pointwise convergence theorem, but that holds under more general conditions and that also gives a criterion for uniform convergence. We state this theorem without proof (the proof is similar to that of the pointwise convergence theorem; it is just a little more intricate). We shall be content to prove a weaker version in §10.6 and to prove a related theorem of Fejer. To understand the theorem, the notion of a function of bounded variation is needed. For $f : [a,b] \to \mathbb{R}$, we say that $f$ is of bounded variation if there is a number $M$ such that for all partitions $a = x_0 < x_1 < \cdots < x_n = b$ of $[a,b]$,
$\sum_{k=1}^{n} |f(x_k) - f(x_{k-1})| \le M.$
Roughly, saying that $f$ is of bounded variation means that the graph of $f$ has finite arc length. One can show that a function is of bounded variation iff it is the difference of two bounded monotone functions.² It follows that if $f$ is of bounded variation, then its discontinuities are all jump discontinuities and are countable in number, so that $f(x_0+)$ and $f(x_0-)$ are defined.
10.4.1 Dirichlet-Jordan Theorem Let $f : [0, 2\pi] \to \mathbb{R}$ be a bounded function.
i.
If $f$ is of bounded variation on an interval $[x_0 - \varepsilon, x_0 + \varepsilon]$ (for some $\varepsilon > 0$) about $x_0$, then the Fourier series of $f$ evaluated at $x_0$ converges to $[f(x_0+) + f(x_0-)]/2$.
ii.
If $f$ is continuous, $f(0) = f(2\pi)$, and $f$ is of bounded variation, then the Fourier series of $f$ converges uniformly to $f$.
² If $f$ is of bounded variation, set $v(x) = \sup\{\sum_{k=1}^{n} |f(x_k) - f(x_{k-1})| \mid a = x_0 \le x_1 \le \cdots \le x_n = x\}$, the variation of $f$. Write $f = p - q$, where $p = v + f/2$ and $q = v - f/2$. One checks that $p$ and $q$ are increasing. The converse is easy to verify.
Both the pointwise convergence theorem and the Dirichlet-Jordan theorem give sufficient conditions for the Fourier series to converge. Exercise 34 at the end of the chapter gives an example to show that the conditions are not necessary. Useful necessary and sufficient conditions are not known.
As we have remarked, the Fourier series of a continuous function need not converge pointwise. By the Dirichlet-Jordan theorem, such a function cannot be of bounded variation. Fejer's theory covers this case by weakening the requirement of pointwise convergence of the series to Cesaro summability of the series. Recall from §5.10 that a sequence $a_1, a_2, \ldots$ is said to converge in the sense of Cesaro, or (C, 1), if $\sigma_n = (a_1 + \cdots + a_n)/n$ converges. If $a_n \to x$, then

$h''(x) + \lambda h(x) = 0$ and $g''(t) + \lambda c^2 g(t) = 0$ (2)
for a constant $\lambda$ (why?). A solution of (2) with $h(0) = h(l) = 0$ and $g'(0) = 0$ is
$h(x) = \sin\frac{n\pi x}{l}$ and $g(t) = \cos\frac{n\pi c t}{l}$, where $\lambda = \frac{n^2\pi^2}{l^2}$, $n = 1, 2, 3, \ldots$.
Thus, for each $n$, a solution of the equations of motion satisfying conditions 1 and 3 is
$y_n(x,t) = \sin\frac{n\pi x}{l}\cos\frac{n\pi c t}{l}$, $n = 1, 2, \ldots$.
§10. 7 Applications The initial conditions for this solution are
y(x, 0) = sin n;x
and
it
(x, 0) = 0.
Thus we have the solution for a particular initial condition sin(mrx/ /). However, we know that any/ can be expanded in a half-interval sine series, and since all the conditions are linear, we should be able to add up the solutions corresponding to the terms in this expansion. This is done more precisely as follows.
10.7.1 Theorem In the initial-displacement problem, suppose that $f$ is twice differentiable. Then the solution to the initial-displacement problem is
$y(x,t) = \frac{1}{2}[f(x - ct) + f(x + ct)] = \sum_{n=1}^{\infty} b_n \sin\frac{n\pi x}{l}\cos\frac{n\pi c t}{l},$ (3)
where the $b_n$ are the half-interval sine coefficients,
$b_n = \frac{2}{l}\int_0^l f(x)\sin\frac{n\pi x}{l}\,dx,$
and $f$ is extended to be odd periodic. (Twice differentiable means we are assuming that the extended $f$ is twice differentiable.) (See Figure 10.7-2.)
FIGURE 10.7-2 Solving the initial-displacement problem
It happens in this case that the solution (3) could be simplified to a more easily handled and explicit form. Often, however, one must deal directly with the Fourier series itself. Before generalizing, let us note the simple physical interpretation of the result. The graph of $f(x - ct)$ is that of $f$ moved over to the right a distance
Chapter 10 Fourier Analysis
$ct$, and so we can interpret the function $g_t(x) = f(x - ct)$ as $f$ moving to the right with velocity $c$ after time $t$. Similarly, $h_t(x) = f(x + ct)$ is $f$ moving to the left with velocity $c$. See Figure 10.7-3. Thus, the initial shape of the string propagates away to the left and right with velocity $c$, forming two waves, each with one-half the initial amplitude, and the waves "reflect" (with sign change) when they reach the endpoints.
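The agreement between the traveling-wave form and the Fourier series in Theorem 10.7.1 can be checked numerically. The sketch below is only an illustration under our own assumptions: $l = 1$, $c = 2$, and the initial displacement $f(x) = \sin^3(\pi x / l)$, whose formula already defines a smooth odd $2l$-periodic function. The helper names (`simpson`, `sine_coefficient`, and so on) are hypothetical.

```python
import math

L_STR = 1.0   # string length l (illustrative value)
C = 2.0       # wave speed c (illustrative value)
N_TERMS = 5   # sin^3 has only the n = 1 and n = 3 sine modes


def f(x):
    """Initial displacement; sin^3 is odd and 2l-periodic, so no extra extension is needed."""
    return math.sin(math.pi * x / L_STR) ** 3


def simpson(g, a, b, n=1000):
    """Composite Simpson approximation of the integral of g over [a, b]; n even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3


def sine_coefficient(n):
    """Half-interval sine coefficient b_n = (2/l) * integral_0^l f(x) sin(n pi x / l) dx."""
    return (2 / L_STR) * simpson(
        lambda x: f(x) * math.sin(n * math.pi * x / L_STR), 0.0, L_STR)


B = [sine_coefficient(n) for n in range(1, N_TERMS + 1)]


def series_solution(x, t):
    """Partial sum of the series in formula (3)."""
    return sum(b * math.sin(n * math.pi * x / L_STR) * math.cos(n * math.pi * C * t / L_STR)
               for n, b in enumerate(B, start=1))


def traveling_wave_solution(x, t):
    """Average of the right- and left-moving copies of f."""
    return 0.5 * (f(x - C * t) + f(x + C * t))


for x, t in [(0.3, 0.0), (0.3, 0.17), (0.8, 0.6)]:
    print(x, t, series_solution(x, t), traveling_wave_solution(x, t))
```

For a general $f$ on $[0, l]$ one would first build the odd periodic extension explicitly and use more terms of the series; the choice of $\sin^3$ keeps the sketch short.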
FIGURE 10.7-3 Left- and right-traveling waves
To use the half-interval sine series, recall that we made $f$ odd periodic. If we look only on the interval $[0, l]$, we see that when $f$ moves to $l$, it reflects from the wall; see Figure 10.7-4. Since the solution is the sum, there will be complicated canceling (or "interference"). To keep track of this, it is useful to visualize a simpler situation first. Suppose $f$ were concentrated near a point (possibly a $\delta$ function) and we were to watch it move. The motions in this case are called the characteristics of the problem. They should be visualized as if one were watching a movie. See Figure 10.7-5. To use genuine delta functions or functions $f$ that are continuous but not twice differentiable, we must generalize the scope of Theorem 10.7.1 and also generalize what we mean by a solution of $\partial^2 y/\partial t^2 = c^2(\partial^2 y/\partial x^2)$ for $y$ that are not differentiable. This is done using the theory of distributions. Admitting distributions, the result still holds for $f$ a distribution (that is, the formal manipulations can be justified when properly interpreted). We then regard $(f(x - ct) + f(x + ct))/2$ as the solution for any $f$, differentiable or not.
FIGURE 10.7-4 Waves reflecting at the boundary
FIGURE 10.7-5 (a) t = 0; (b) t = 1; (c) t = 2; (d) t = 3; (e) t = 4; (f) t = 5; (g) return to (a)
In two-dimensional problems (such as a vibrating drum), the wave equation reads
$\frac{\partial^2 y}{\partial t^2} = c^2\left(\frac{\partial^2 y}{\partial x_1^2} + \frac{\partial^2 y}{\partial x_2^2}\right).$ (4)
In this case, the general solution can be written in Fourier series but does not have a simple explicit expression as did the one-dimensional case. The solution
(for the similar initial-displacement problem) on the rectangle $[0, l] \times [0, l']$ is given by
$y(x_1, x_2, t) = \sum_{n,m=1}^{\infty} b_{nm} \sin\frac{n\pi x_1}{l}\sin\frac{m\pi x_2}{l'}\cos\left[\pi c t\sqrt{(n/l)^2 + (m/l')^2}\right],$ (5)
where
$b_{nm} = \frac{4}{l\,l'}\int_0^{l'}\int_0^{l} f(x_1, x_2)\sin\frac{n\pi x_1}{l}\sin\frac{m\pi x_2}{l'}\,dx_1\,dx_2.$
The reader is asked to go through the derivation of this in Exercise 68 at chapter's end.
Turning to heat conduction, consider a bar whose temperature is $T(x,t)$ at the point $x$ at time $t$. Interpreting $-(\partial T/\partial x)$ as the rate of heat flow, the condition of "insulation" at $x = 0$ is $(\partial T/\partial x) = 0$ (evaluated at $x = 0$). The law of heat conduction asserts that
$\frac{\partial T}{\partial t}(x,t) = k\frac{\partial^2 T}{\partial x^2}(x,t),$ (6)
where $k$ is a constant determined by the conductivity of the material. This is called the heat equation. (For a derivation see Marsden and Tromba, Vector Calculus, 3d ed., W. H. Freeman, New York, 1988, Chap. 7.) The heat equation (6) differs from the wave equation in that $\partial T/\partial t$ replaces $\partial^2 T/\partial t^2$. This difference is very important, for solutions to the heat conduction problem are different in their behavior from those of the wave equation. For example, in the heat equation one obtains solutions only for $t \ge 0$. Intuitively, for the wave equation the graph of the solution "bounces around" like water waves. For the heat equation, the solution diffuses out and becomes steady as $t \to \infty$ (as temperature tends to become evened out). To study this simple situation, let us create the following model for heat conduction of a bar with insulated ends (for simplicity, take $k = 1$). Hence, we wish to solve for $T(x,t)$ satisfying
1.
aT a2 T -aX (x,t) = k-a (x,t), 0 < x < l, x-
2.
T(x,O) =J(x),
?
0
0 and the limit as 1 -+ oo.
x2. 6.
{1x+cl
=
Let $T(x,t)$ be a solution of the heat equation and set $I(t) = \int_0^l |T(x,t)|^2\,dx$. Show that $I(t)$ is nonincreasing.
§10.8 Fourier Integrals
This section consists of an informal discussion of Fourier integrals. We sketch the main results so that the reader may see their role in Fourier analysis. As we have seen in the previous sections, Fourier series are a useful tool for analyzing functions on a finite interval. Since many functions are given on the whole real line $\mathbb{R}$, it would be helpful to have an analogous theory on $\mathbb{R}$. Fourier integrals provide this theory. Consider $f : [-l, l] \to \mathbb{R}$. Writing $f$ in terms of its exponential Fourier series,
$f(x) = \sum_{n=-\infty}^{\infty} c_n e^{in\pi x/l}$, where $c_n = \frac{1}{2l}\int_{-l}^{l} f(y)e^{-in\pi y/l}\,dy.$
Let $\alpha = n\pi/l$ and write
$f(x) = \sum_{-\infty}^{\infty} c(\alpha)e^{i\alpha x}\,\frac{\pi}{l}$, where $c(\alpha) = \frac{1}{2\pi}\int_{-l}^{l} f(y)e^{-i\alpha y}\,dy.$
For $l$ large, $\alpha$ approximates a continuous variable, and this sum is a Riemann sum with $\Delta\alpha = \pi/l$. This suggests that
$f(x) = \int_{-\infty}^{\infty} c(\alpha)e^{i\alpha x}\,d\alpha$, where $c(\alpha) = \frac{1}{2\pi}\int_{-\infty}^{\infty} f(y)e^{-i\alpha y}\,dy.$ (1)
In short, we expand our finite intervals to infinite intervals, and the Fourier series is then expressed as an integral. The same steps can be used in trigonometric form, except that integrals are taken from 0 to $\infty$, as are the corresponding sums. The relevant theorem states that if $f$ is sectionally continuous with jump discontinuities, $f'(x_0+)$ and $f'(x_0-)$ exist there, and $\int_{-\infty}^{\infty} |f(x)|\,dx < \infty$ ($f$ is integrable), then
$\frac{1}{2}[f(x+) + f(x-)] = \int_{-\infty}^{\infty} c(\alpha)e^{i\alpha x}\,d\alpha,$ (2)
where $c(\alpha) = \frac{1}{2\pi}\int_{-\infty}^{\infty} f(y)e^{-i\alpha y}\,dy.$
One proves this in a way similar to the pointwise convergence theorem. Formula (2) is called the Fourier inversion formula. In trigonometric form the formula is
$\frac{1}{2}[f(x+) + f(x-)] = \int_0^{\infty} [A(\alpha)\cos\alpha x + B(\alpha)\sin\alpha x]\,d\alpha,$
where
$A(\alpha) = \frac{1}{\pi}\int_{-\infty}^{\infty} f(y)\cos\alpha y\,dy$ and $B(\alpha) = \frac{1}{\pi}\int_{-\infty}^{\infty} f(y)\sin\alpha y\,dy.$ (3)
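This form is especially convenient if $f$ is even or odd. In view of the inversion theorem, the Fourier transform of $f$ is defined as
$\hat{f}(\alpha) = \frac{1}{2\pi}\int_{-\infty}^{\infty} f(x)e^{-ix\alpha}\,dx.$ (4)
If $f$ is continuous, differentiable, and integrable on $\mathbb{R}$, then
$f(x) = \int_{-\infty}^{\infty} \hat{f}(\alpha)e^{ix\alpha}\,d\alpha.$ (5)
There is a similar formula on $\mathbb{R}^n$; that is,
$f(x) = \int_{\mathbb{R}^n} \hat{f}(\alpha)e^{i\langle \alpha, x\rangle}\,d\alpha,$ (6)
where
$\hat{f}(\alpha) = \frac{1}{(2\pi)^n}\int_{\mathbb{R}^n} f(x)e^{-i\langle x, \alpha\rangle}\,dx,$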
$x, \alpha \in \mathbb{R}^n$, and $\langle x, \alpha\rangle$ is the usual inner product in $\mathbb{R}^n$. Given $f : [0, \infty[ \to \mathbb{R}$, we can extend $f$ to all of $\mathbb{R}$ by making it even or odd. Just as with the cosine and sine series, we can then introduce the Fourier cosine transform by extending $f$ to be even, and setting
$\hat{f}_c(\alpha) = \frac{2}{\pi}\int_0^{\infty} f(y)\cos(\alpha y)\,dy.$ (7)
The inversion formula becomes
$f(x) = \int_0^{\infty} \hat{f}_c(\alpha)\cos(\alpha x)\,d\alpha.$ (8)
Similarly, extending $f$ to be odd leads to the Fourier sine transform,
$\hat{f}_s(\alpha) = \frac{2}{\pi}\int_0^{\infty} f(y)\sin(\alpha y)\,dy,$ (9)
and the inversion formula becomes
$f(x) = \int_0^{\infty} \hat{f}_s(\alpha)\sin(x\alpha)\,d\alpha.$ (10)
A standard fact one should know (using Worked Example 9.1, Chapter 9) is that the Fourier transform of $e^{-x^2/2}$ (Gaussian function) on $\mathbb{R}$ is $e^{-\alpha^2/2}/\sqrt{2\pi}$, which is consistent with the inversion theorem.
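The Gaussian fact can be checked numerically with the normalization $\hat{f}(\alpha) = \frac{1}{2\pi}\int f(x)e^{-ix\alpha}\,dx$ used in this section. The following Python sketch is a rough check under our own assumptions: the integral is truncated to $[-12, 12]$, the trapezoid rule is used, and only the cosine part is computed because the sine part vanishes for an even real function. The function names are hypothetical.

```python
import math


def fourier_transform_even(f, alpha, half_width=12.0, n=4000):
    """Trapezoid approximation of (1/(2*pi)) * integral of f(x) * cos(alpha*x) dx
    over [-half_width, half_width], valid as the transform of an even real f."""
    h = 2 * half_width / n
    total = 0.0
    for k in range(n + 1):
        x = -half_width + k * h
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * f(x) * math.cos(alpha * x)
    return total * h / (2 * math.pi)


def gaussian(x):
    return math.exp(-x * x / 2)


def claimed_transform(alpha):
    """The closed form stated in the text."""
    return math.exp(-alpha * alpha / 2) / math.sqrt(2 * math.pi)


for alpha in (0.0, 0.5, 1.0, 2.0):
    print(alpha, fourier_transform_even(gaussian, alpha), claimed_transform(alpha))
```

The two columns should agree to many digits, because the trapezoid rule is very accurate for rapidly decaying smooth integrands.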
In general, an integral transform is an association of the function
$g(x) = \int_A k(x,y)f(y)\,dy$ (11)
with the function $f$ for some fixed function $k$ called the kernel, and some fixed range $A$ of integration. Such operations are common in mathematical physics. Thus the Fourier transform is an integral transform with kernel $k(x,y) = e^{-ixy}/2\pi$. Here we come to an important, general problem. The transform maps $f$ to $g$; that is, given $f$, we get $g$. Can we invert this? In other words, given $g$, can we invert the transformation to find $f$? The Fourier inversion formula solves this problem in the case of $k(x,y) = e^{-ixy}/2\pi$. That is, knowing the Fourier transform, we can recover the function using the inversion formula. Another common integral transform is the Laplace transform with kernel $k(x,y) = e^{-xy}$ and range $[0, \infty[$. Thus, the Laplace transform of $f$ is
$\mathcal{L}(f)(x) = \int_0^{\infty} e^{-xy}f(y)\,dy.$ (12)
The inversion problem for Laplace transforms also has a solution. (See, for example, Marsden and Hoffman, Basic Complex Analysis, 2d ed., Chap. 7, for details.) For $f : [-\pi, \pi] \to \mathbb{R}$, Parseval's relation gives the identity (see Table 10.5-3)
1 1
}6$ltl$:r
lf(x) - f(x - t)IFn(I) dt.
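The first integral on the right-hand side is bounded above by
$\frac{1}{2\pi}\int_{|t|\le\delta} \varepsilon F_n(t)\,dt \le \frac{1}{2\pi}\int_{-\pi}^{\pi} \varepsilon F_n(t)\,dt = \varepsilon.$
The second integral is bounded by
$\frac{1}{2\pi}\int_{\delta\le|t|\le\pi} 2M F_n(t)\,dt = \frac{M}{\pi}\int_{\delta\le|t|\le\pi} F_n(t)\,dt,$
where $M = \sup_t |f(t)|$. By property iv of Lemma 11, we may choose $N$ so that if $n \ge N$, this last integral is $\le \varepsilon$. Thus, if $n \ge N$, $|f(x) - \sigma_n(x)| \le \varepsilon + \varepsilon = 2\varepsilon$. ■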
One can prove that the Cesaro sums converge to an integrable function except possibly on a set of measure zero (see Hewitt and Stromberg, Real and Abstract Analysis, Springer-Verlag, p. 294).
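10.5.1 Integration Theorem Suppose $\int_{-\pi}^{\pi} |f(x)|^2\,dx < \infty$ and $f$ has Fourier series
$\frac{a_0}{2} + \sum_{n=1}^{\infty}(a_n\cos nx + b_n\sin nx).$
Then, letting $g(x) = \int_{-\pi}^{x} f(y)\,dy$, we have
$g(x) = \frac{a_0(x+\pi)}{2} + \sum_{n=1}^{\infty}\left(a_n\int_{-\pi}^{x}\cos ny\,dy + b_n\int_{-\pi}^{x}\sin ny\,dy\right) = \frac{a_0(x+\pi)}{2} + \sum_{n=1}^{\infty}\left\{\frac{a_n}{n}\sin nx + \frac{b_n}{n}\left((-1)^n - \cos nx\right)\right\}$
and the convergence is uniform for $-\pi \le x \le \pi$.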
Theorem Proofs for Chapter 10
Proof We prepare the following lemma.
Lemma 12 Suppose $f_n : [a,b] \to \mathbb{R}$ are such that $\int_a^b |f_n(x)|^2\,dx < \infty$ and $f_n \to f$ in mean. Let
$g_n(x) = \int_a^x f_n(y)\,dy$ and $g(x) = \int_a^x f(y)\,dy.$
Then $g_n \to g$ uniformly on $[a,b]$.
Proof By the Cauchy-Schwarz inequality,
$|g_n(x) - g(x)|^2 \le \left(\int_a^x |f_n(y) - f(y)|\,dy\right)^2 \le \left(\int_a^x |f_n(y) - f(y)|^2\,dy\right)(x - a) \le \|f_n - f\|^2 (b - a),$
from which the result is obvious. ■
For the theorem, let $s_n(x)$ be the $n$th partial sum of the Fourier series and take $f_n = s_n$ in the lemma. We know that $f_n \to f$ in mean (Theorem 10.3.1), and so $g_n \to g$ uniformly. Here $g_n$ is the partial sum of the integrated Fourier series, and so we have the result. ■
10.5.2 Gibbs' Phenomenon Consider f(x) = {
a -'Ir$
X
0 is arbitrary but . ,, 22;12 'I;"" fixed. In this case we let Mn = n-e-n 1r e and note that LJ Mn converges. (We cannot allow t = 0.) The rest of the theorem is obvious. •
10.7.3 Theorem In Theorem 10.7.2,
$\lim_{t\to 0,\ t>0} T(x,t) = f(x)$
in the sense of convergence in mean, and converges uniformly (and pointwise) if $f$ is continuous, with $f'$ sectionally continuous. More generally, for any $f$, if the Fourier series of $f$ converges at $x$ to $f(x)$, then $T(x,t) \to f(x)$ as $t \to 0$.
Proof For the first part, it will suffice to show the following.
Lemma 14 For each $t > 0$, suppose $f_t \in V$, an inner product space, and $\varphi_0, \varphi_1, \ldots$ is a complete orthonormal basis. Let
$f_t = \sum_{n=1}^{\infty} c_n(t)\varphi_n, \qquad f = \sum_{n=1}^{\infty} c_n\varphi_n.$
If
$\lim_{t\to 0}\sum_{n=1}^{\infty} |c_n(t) - c_n|^2 = 0,$
then $f_t \to f$ (in mean).
Proof The result follows from Parseval's relation
$\|f_t - f\|^2 = \sum_{n=1}^{\infty} |c_n(t) - c_n|^2.$ ■
In the case of Theorem 10.7.3, we must show that
$\lim_{t\to 0}\sum_{n=1}^{\infty} |a_n|^2\left(1 - e^{-n^2\pi^2 t/l^2}\right)^2 = 0.$
To do this, it is enough to show that the function $g(t) = \sum_{n=1}^{\infty} |a_n|^2\left(1 - e^{-n^2\pi^2 t/l^2}\right)^2$ is continuous in $t$, since $g(0) = 0$. To show that $g(t)$ is continuous, we shall show that the series converges uniformly in $t$. To do this, Abel's test will be used. The form we need is the following:
Lemma 15 Let $\sum_{n=1}^{\infty} c_n$ be a convergent series and $\varphi_n(t)$ a uniformly bounded, decreasing (respectively, increasing) sequence defined for $t \ge 0$. Then $g(t) = \sum_{n=1}^{\infty} c_n$ 0, $T(x,t)$ is differentiable and hence continuous. However, $T(x,t)$ may not be differentiable at $t = 0$, but the proof just given does show that we have continuity at $t = 0$. These methods using Abel's and Dirichlet's tests are important for establishing convergence in other problems (such as Laplace's equation), as we shall see in the next proof.
10.7.4 Theorem
i.
Given $g_1$, let $\varphi(x,y)$ be defined by
$\varphi(x,y) = \sum_{n=1}^{\infty} b_n \sin\left(\frac{n\pi x}{a}\right)\frac{\sinh\left(n\pi(b - y)/a\right)}{\sinh(n\pi b/a)}.$ (12)
Suppose $g_1$ is of class $C^2$ and $g_1(0) = g_1(a) = 0$. Then $\varphi$ converges uniformly, and is the solution to the Dirichlet problem with $f_1 = f_2 = g_2 = 0$, and is continuous on the whole square, and $\nabla^2\varphi = 0$ on the interior.
ii.
If each of $f_1, f_2, g_1, g_2$ is of class $C^2$ and vanishes at the corners of the rectangle, then the solution $\varphi(x,y)$ is the sum of four series like Equation (12), $\nabla^2\varphi = 0$ on the interior, and $\varphi$ is continuous on the whole rectangle and assumes the given boundary values. Furthermore, $\varphi$ is $C^\infty$ on the interior.
iii.
If $f_1, f_2, g_1, g_2$ are only square integrable, then the series for $\varphi$ converges on the interior, $\nabla^2\varphi = 0$, and $\varphi$ is $C^\infty$. Also,
0 we shall establish unifonn
convergence on O < e $ y $ 1r and x arbitrary. With this extra restriction, the delicacy of Abel's test is no longer needed; the Weierstrass M test will do. We have lbnl $ M. Let _ 2Msinh[n(1r - e)] Mn-n . sinh(1rn) Then Mn bounds the terms in,\. But 2 sinh[n(1r-e)] e"n(l - e"n), from the definition of sinh. Thus
< e"(n-c) and 2
sinh(n1r) ~
Since $\varepsilon > 0$, $\sum M_n$ converges, and so we have uniform convergence. Note that we could use $n^k$ instead of $n^2$ for any $k$ and still have convergence; in fact, we can differentiate any number of times; that is, $\varphi$ is $C^\infty$ (a little thought shows that $\varphi$ is analytic; see Example 6.8.7). The proof of part iii is routine. To show that $\nabla^2\varphi = 0$ and $\varphi$ is a $C^\infty$ function on the interior, the proof is similar to the preceding (all that was used was that the $b_n$ are bounded). For convergence in mean, we proceed as in the proof of Theorem 10.7.3, using Lemma 15.
Worked Examples for Chapter 10
Example 10.1 Let $f : [0, \pi] \to \mathbb{C}$ be a continuous function. Prove that the following inequality holds:
$\left|\int_0^{\pi} f(x)\sin x\,dx\right|^2 + \cdots + \left|\int_0^{\pi} f(x)\sin nx\,dx\right|^2 \le \frac{\pi}{2}\int_0^{\pi} |f(x)|^2\,dx.$
Solution This follows from Bessel's inequality applied to the following (incomplete) orthonormal system on $[0, \pi]$:
$\sqrt{2/\pi}\,\sin x, \ldots, \sqrt{2/\pi}\,\sin nx.$
If we used an infinite sum, we would have equality by Parseval's theorem (see Table 10.5-3). ■
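The inequality of Example 10.1 is easy to test numerically. The sketch below uses the real-valued sample function $f(x) = x$ on $[0, \pi]$ (our choice, not the text's) and composite Simpson's rule for the integrals; `simpson` and `bessel_sides` are hypothetical helper names.

```python
import math


def simpson(g, a, b, n=2000):
    """Composite Simpson approximation of the integral of g over [a, b]; n even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3


def bessel_sides(f, n_terms):
    """Return (left side, right side) of the inequality in Example 10.1."""
    lhs = 0.0
    for k in range(1, n_terms + 1):
        coeff = simpson(lambda x: f(x) * math.sin(k * x), 0.0, math.pi)
        lhs += coeff ** 2
    rhs = (math.pi / 2) * simpson(lambda x: abs(f(x)) ** 2, 0.0, math.pi)
    return lhs, rhs


for n_terms in (1, 5, 10):
    print(n_terms, bessel_sides(lambda x: x, n_terms))
```

As the number of terms grows, the left side increases toward the right side, in line with the remark about Parseval's theorem.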
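Example 10.2 Show that if $f_n \to f$ (in mean) and $g_n \to g$ (in mean) in an inner product space, then
$\langle f_n, g_n\rangle \to \langle f, g\rangle.$
Solution First, make an estimate using the Schwarz inequality and the triangle inequality:
$|\langle f_n, g_n\rangle - \langle f, g\rangle| \le |\langle f_n, g_n\rangle - \langle f_n, g\rangle| + |\langle f_n, g\rangle - \langle f, g\rangle| = |\langle f_n, g_n - g\rangle| + |\langle f_n - f, g\rangle| \le \|f_n\|\,\|g_n - g\| + \|f_n - f\|\,\|g\|.$
The result follows from this. Given $\varepsilon > 0$, choose $N$ so that $n \ge N$ implies each of the following estimates:
1. $\|f_n - f\| < \dfrac{\varepsilon}{2\|g\|}$;
2. $\|f_n - f\| \le 1$; and
3. $\|g_n - g\| < \dfrac{\varepsilon}{2(\|f\| + 1)}$.
(Why is this possible?) Then $\|f_n - f\| \le 1$ implies $\|f_n\| \le \|f\| + 1$ (why?), and for $n \ge N$,
$|\langle f_n, g_n\rangle - \langle f, g\rangle| \le (\|f\| + 1)\|g_n - g\| + \|f_n - f\|\,\|g\| \le (\|f\| + 1)\frac{\varepsilon}{2(\|f\| + 1)} + \frac{\varepsilon}{2\|g\|}\|g\| = \varepsilon.$
This proves that $\langle f_n, g_n\rangle \to \langle f, g\rangle$.
- k2'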
.,\ ,- integer.
l-=1
Note that the series on the right converges uniformly for $0 \le \lambda \le \lambda_0 < 1$. Note also that $\pi(\cos\pi\lambda/\sin\pi\lambda) - (1/\lambda) \to 0$ as $\lambda \to 0$, and so is Riemann integrable. By integrating,
$\log\left(\frac{\sin\pi\lambda}{\pi\lambda}\right) = \sum_{k=1}^{\infty}\log\left(1 - \frac{\lambda^2}{k^2}\right), \quad |\lambda| < 1;$
i.e.,
$\sin\pi\lambda = \pi\lambda\prod_{k=1}^{\infty}\left(1 - \frac{\lambda^2}{k^2}\right), \quad |\lambda| < 1.$
This product formula for the sine was discovered (though not rigorously proved) by Euler. (For another method of proof, see J. Marsden and M. Hoffman, Basic Complex Analysis, 2d ed., Chap. 7.) If we take $\lambda = 1/2$, then
$1 = \frac{\pi}{2}\prod_{k=1}^{\infty}\left(1 - \frac{1}{4k^2}\right) = \frac{\pi}{2}\prod_{k=1}^{\infty}\frac{(2k-1)(2k+1)}{(2k)(2k)},$
and so
$\frac{\pi}{2} = \prod_{k=1}^{\infty}\frac{(2k)(2k)}{(2k-1)(2k+1)};$
i.e.,
$\frac{\pi}{2} = \lim_{n\to\infty}\frac{(2\cdot 2)(4\cdot 4)(6\cdot 6)\cdots(2n\cdot 2n)}{(1\cdot 3)(3\cdot 5)(5\cdot 7)\cdots((2n-1)\cdot(2n+1))},$
which is called Wallis' product formula for $\pi/2$. ■
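The convergence in Wallis' product formula is slow, which a short computation makes visible. The following Python sketch evaluates the partial products; the function name `wallis_partial_product` is our own.

```python
import math


def wallis_partial_product(n):
    """Product over k = 1..n of (2k)(2k) / ((2k - 1)(2k + 1))."""
    p = 1.0
    for k in range(1, n + 1):
        p *= (2.0 * k) * (2.0 * k) / ((2.0 * k - 1.0) * (2.0 * k + 1.0))
    return p


# The partial products increase toward pi/2.
for n in (1, 10, 100, 1000, 100000):
    print(n, wallis_partial_product(n), math.pi / 2)
```

The gap to $\pi/2$ shrinks roughly like a constant over $n$, so the formula is of theoretical rather than computational interest.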
Example 10.6 An interesting application of the Parseval relation to the isoperimetric problem is as follows: Show that among all plane curves of a given perimeter, the largest area is enclosed by the circle.
Solution Let $(x(t), y(t))$, $0 \le t \le 2\pi$, be a parametric representation of a simple closed curve. We assume that $x(t)$, $y(t)$ are $C^1$ functions of $t$, and that the parameter $t$ is arc length. Thus the total length is $2\pi$, and $\dot{x}(t)^2 + \dot{y}(t)^2 = 1$, and so $\int_0^{2\pi}(\dot{x}(t)^2 + \dot{y}(t)^2)\,dt = 2\pi$. The enclosed area is
$A = \int_0^{2\pi} x(t)\dot{y}(t)\,dt.$
(This is a standard calculus formula; see, for example, Marsden and Tromba, Vector Calculus, 3d ed., Chap. 7.) We claim that $A \le \pi$, and that $A = \pi$ only if the curve is a circle. To prove this, we express $A$ in terms of the Fourier coefficients of $x$ and $y$. Write
$x(t) = \frac{a_0}{2} + \sum_{k=1}^{\infty}(a_k\cos kt + b_k\sin kt)$
and
$y(t) = \frac{\alpha_0}{2} + \sum_{k=1}^{\infty}(\alpha_k\cos kt + \beta_k\sin kt).$
All coefficients are real, and so, by a change of origin in the plane, we may assume $a_0 = \alpha_0 = 0$. The Fourier series of the derivatives $\dot{x}(t)$, $\dot{y}(t)$ are
$\dot{x}(t) = \sum_{k=1}^{\infty}(k b_k\cos kt - k a_k\sin kt)$
and
$\dot{y}(t) = \sum_{k=1}^{\infty}(k\beta_k\cos kt - k\alpha_k\sin kt).$
By Parseval's relation,
$\sum_{k=1}^{\infty} k^2(a_k^2 + b_k^2 + \alpha_k^2 + \beta_k^2) = \frac{1}{\pi}\int_0^{2\pi}(\dot{x}(t)^2 + \dot{y}(t)^2)\,dt = 2,$
and the area is
$A = \pi\sum_{k=1}^{\infty} k(a_k\beta_k - b_k\alpha_k).$
Hence
$\pi - A = \frac{\pi}{2}\sum_{k=1}^{\infty}(k^2 - k)(a_k^2 + b_k^2 + \alpha_k^2 + \beta_k^2) + \frac{\pi}{2}\sum_{k=1}^{\infty} k\{(a_k - \beta_k)^2 + (\alpha_k + b_k)^2\}.$
Thus $\pi - A \ge 0$, and $\pi - A = 0$ if and only if
i. $a_k = b_k = \alpha_k = \beta_k = 0$ for $k \ge 2$;
ii. $a_1 = \beta_1$ and $\alpha_1 = -b_1$.
In this case,
$x(t) = a_1\cos t + b_1\sin t$ and $y(t) = -b_1\cos t + a_1\sin t = -x(t + \pi/2).$
Equivalently, for some $R$ and $\delta$,
$x(t) = R\cos(t + \delta)$ and $y(t) = R\sin(t + \delta).$
The condition $\dot{x}^2 + \dot{y}^2 = 1$ implies $R = 1$, and therefore we have a circle of radius 1. ■
Exercises for Chapter 10
1.
Let $V$ be an inner product space and $M \subset V$ a vector subspace. Define the orthogonal complement of $M$ by
$M^{\perp} = \{ f \in V \mid \langle f, g \rangle = 0 \text{ for all } g \in M \}$. Show that $M^{\perp}$ is a vector subspace of $V$ and is closed (that is, if $f_n \in M^{\perp}$ and $f_n \to f$ (in mean), then $f \in M^{\perp}$).
2.
Prove that the Legendre polynomials (see §10.2) are complete in $L^2$ of $[-1, 1]$. [Hint: First show that any polynomial can be expanded in Legendre polynomials, and then employ the method of proof of the mean completeness theorem.]
3.
a.
Use the Fourier series for $e^x$ on $[-\pi, \pi]$ to prove the following identity:
$$\frac{1}{2}(\pi \coth \pi - 1) = \sum_{n=1}^{\infty} \frac{1}{n^2 + 1}.$$
b.
Use the half-interval cosine series for $\cos ax$ where $a$ is not an integer to prove the following identity:
$$\pi \cot \pi a = \frac{1}{a} + \sum_{n=1}^{\infty} \frac{2a}{a^2 - n^2}.$$
4.
Prove that if $f_n \to f$ (in mean), then $\|f_n\| \to \|f\|$. Is the converse true?
5.
Prove that uniform convergence implies mean convergence (on finite intervals).
6.
Consider the space $\ell_2$ of all sequences $x = (x_1, x_2, \ldots)$ of real numbers with $\sum_{i=1}^{\infty} x_i^2 < \infty$. Show that $\ell_2$ is an inner product space with $\langle x, y \rangle = \sum_{i=1}^{\infty} x_i y_i$. In addition, show that this space is complete (by showing that all Cauchy sequences converge).
7.
Let
ltxr..).
75.
(W. A. J. Luxemburg) Let $\varphi_n$ be a set of orthonormal functions in $L^2$ of the interval $[a, b]$. Show that
a.
$\varphi_n$ is complete iff $x - a = \sum_{n=1}^{\infty} \left| \int_a^x \varphi_n(t)\,dt \right|^2$ for all $x \in [a, b]$ and
b.
$\varphi_n$ is complete iff
c for $x \in I = \,]x_0 - \delta, x_0 + \delta[$. Let $T_n(x) = [t(x)]^n$, where $t(x) = 1 + \cos(x - x_0) - \cos \delta$, and show that $T_n(x) \ge 0$ on $I$, $T_n(x) \to \infty$ uniformly on every closed subinterval of $I$, and the $T_n$ are uniformly bounded outside $I$. Use this to show that $\langle f, T_n \rangle = 0$ is impossible for $n$ large. For general $f$, attempt to apply the results just obtained to $F(x) = \int f(t)\,dt$.
Appendix A Miscellaneous Exercises
This appendix contains supplementary exercises. Numbers 1-53 are miscellaneous exercises that cover a variety of topics. Most are not routine questions, but rather ask for examples, true-false evaluations of assertions, or developments of variations or extensions of ideas from the text. Numbers 54-66 are examination-type questions that may be useful for review.
Exercises
1.
For two sets $A, B \subset \mathbb{R}^n$, define the distance between them by $d(A, B) = \inf\{ d(x, y) \mid x \in A, y \in B \}$.
a.
Find two closed sets such that $A \cap B = \emptyset$ and yet $d(A, B) = 0$.
b.
For A compact and x 0 for some
e R_n I d(x,A) =
O}. 2.
Let $A \subset \mathbb{R}^n$. Show that $A$ is compact iff every continuous map $f : A \to \mathbb{R}$ is bounded above and assumes its maximum at some point in $A$.
3.
Show that $f : \mathbb{R} \to \mathbb{R}$ has a continuous derivative iff the double limit
$$\lim_{(x,y) \to (x_0, x_0)} \frac{f(x) - f(y)}{x - y}$$
exists for every $x_0 \in \mathbb{R}$.
4.
Show that $\sum_{n=1}^{\infty} (x + 1)^2 / n^2$ and $\sum (x + 1)^n / n!$ each converge uniformly on $[0, 1]$.
5.
What is wrong with this argument? $f(x) = 1/(1 + x)^2 = (d/dx)[-1/(1 + x)]$, so
$$\int_{-\infty}^{\infty} f(x)\,dx = \lim_{a \to \infty} \int_{-a}^{a} f(x)\,dx = \lim_{a \to \infty} \left[ \frac{-1}{1 + a} - \frac{-1}{1 - a} \right] = 0.$$
6.
Give an example of a continuous function $f : \mathbb{R} \to \mathbb{R}$ such that $f(\mathbb{R})$ is not closed. If $A$ is a closed bounded interval in $\mathbb{R}$, must $f(A)$ be closed?
7.
At first, one might think the intervals $]0, 1[$ and $[0, 1]$ are very similar. State at least five significant differences between them in terms of topology and continuous functions.
8.
a.
Find an example of a closed set $A$ in $\mathbb{R}^n$ such that $A = \mathrm{bd}(A)$.
b.
If $A = \mathrm{bd}(A)$ show that $A$ is closed.
c.
Prove: If $\mathrm{int}(A) = \emptyset$, then ($A = \mathrm{bd}(A) \iff A$ is closed).
d.
Prove: If $A$ is closed and $A \subset \mathrm{bd}(A)$, then $A = \mathrm{bd}(A)$ and $\mathrm{int}(A) = \emptyset$.
e.
Find a set $A$ such that $A \subset \mathrm{bd}(A)$ but $A \ne \mathrm{bd}(A)$.
9.
Are the following statements true or false? (All sets have volume and all functions are bounded and integrable.)
a.
If $A \supset B$, and $A \setminus B$ has measure 0, then $\int_A f = \int_B f$.
b.
If $\{ x \mid f(x) \ne g(x) \}$ has measure 0, then $\int_A f = \int_A g$.
c.
If $f \ge 0$, $g \ge 0$, and $\int_A f = \int_A g$, then $f = g$ on $A$ (except possibly on a set of measure 0).
d.
The same question as c with the additional assumption that $f \ge g$.
10.
Let $f : [a, b] \to \mathbb{R}$ be integrable.
a.
Prove that $F(x) = \int_a^x f(t)\,dt$ is uniformly continuous on $[a, b]$.
b.
Show that $F$ has a derivative at $x_0$ if $f$ is continuous at $x_0$.
c.
Show that $F$ is differentiable except on a set of measure 0.
11.
a.
Let $T : \mathbb{R}^n \to \mathbb{R}^n$ be a linear mapping. Prove that $T$ is norm preserving (that is, $\|Tx\| = \|x\|$) iff $T$ preserves the inner product, $\langle Tx, Ty \rangle = \langle x, y \rangle$. (See Exercise 12, Chapter 1.)
b.
If $T$ preserves the norm (or inner product), then $T$ is an isomorphism.
12.
Prove Cavalieri's principle: If $A, B \subset \mathbb{R}^3$ have volume, and if every plane parallel to the $xy$-plane intersects $A$ and $B$ in equal area, then $A$ and $B$ have the same volume. [Hint: Make use of Fubini's theorem.] Remark: In connection with this problem, see Gelbaum and Olmsted, Counterexamples in Analysis, Example 6, Chapter 11. For applications see a calculus text.
13.
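Show that a set $A \subset \mathbb{R}^n$ has volume iff for every $\varepsilon > 0$ there is a set $V_\varepsilon \subset A$ and a set $W_\varepsilon \supset A$ such that $V_\varepsilon$ and $W_\varepsilon$ have volume and $v(W_\varepsilon \setminus V_\varepsilon) = v(W_\varepsilon) - v(V_\varepsilon) < \varepsilon$. Show that if the latter condition holds then the volume of $A$ is $\inf\{ v(W_\varepsilon) \mid \varepsilon > 0 \} = \sup\{ v(V_\varepsilon) \mid \varepsilon > 0 \}$.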
14.
Show that the volume of the figure obtained by revolving the area under the graph of a nonnegative function $f : [a, b] \to \mathbb{R}$ once around the $x$-axis is given by $\int_a^b \pi f(x)^2\,dx$. Use this formula to compute the volume of $\{ (x, y, z) \in \mathbb{R}^3 \mid 1 < x < 2 \text{ and } y^2 + z^2 < x^4 \}$.
15.
Let $f : [a, b] \to \mathbb{R}$ be continuous and suppose $f(a) f(b) < 0$. Show that there is an $x \in \,]a, b[$ such that $f(x) = 0$.
16.
Let $f : \mathbb{R}^n \to \mathbb{R}^n$ be of class $C^1$ and suppose $Jf(x) \ne 0$ for all $x$. Let $x_0 \in \mathbb{R}^n$ and $B = \{ x \in \mathbb{R}^n \mid f(x) = x_0 \}$. Show that $B$ has no accumulation points.
17.
Give an example of a function $f$ of class $C^1$ that has derivative equal to zero at a point $x$ but is one-to-one in a neighborhood of $x$. Show that $f^{-1}$ cannot be differentiable at $f(x)$.
18.
Show that if the sets $A$ and $B$ have volume then so does $A \cup B$. If $A \cap B$ and $A \setminus B$ have volume as well, then prove:
a.
$v(A \cup B) = v(A) + v(B) - v(A \cap B)$.
b.
$v(A \setminus B) = v(A) - v(B)$ if $A \supset B$.
19.
Suppose that $f : A \subset \mathbb{R}^n \to \mathbb{R}$ is integrable and that $f = g$ except on a set of content zero. Show that $g$ is integrable. Show that this is false if we replace "content zero" by "measure zero."
20.
a.
Show that if $\{U_\alpha\}$ is a family of disjoint open sets in $\mathbb{R}^n$, then the family is countable. [Hint: Pick a point with rational coordinates in each set and use the fact that the set of such points is countable.]
b.
If $A$ is an open subset of $\mathbb{R}^n$, show that the connected components of $A$ are open and that there are countably many of them.
c.
Prove that each open set in $\mathbb{R}$ is the union of a countable collection of intervals.
21.
Let $f : [a, b] \to [\alpha, \beta]$ be a strictly increasing map of $[a, b]$ onto $[\alpha, \beta]$. Show that $f$ and $f^{-1}$ are continuous.
22.
Prove the Lebesgue Covering Lemma: Let $A$ be a compact subset of $\mathbb{R}^n$ and let $\{V_\alpha\}$ be an open cover of $A$. Then there is an $\varepsilon > 0$ such that if $S$ is any rectangle contained in $A$ with sides shorter than $\varepsilon$, then $S \subset V_{\alpha_0}$ for some $\alpha_0$.
23.
Let $f : A \subset \mathbb{R}^n \to \mathbb{R}$ be a mapping and let $M \subset A$ be the set of (strict) local maxima of $f$. Show that $f(M)$ is finite or countable. Give examples.
24.
Let $S$ be an open connected set in $\mathbb{R}^n$. Let $A$ be a connected component of $\mathbb{R}^n \setminus S$. Show that $\mathbb{R}^n \setminus A$ is connected.
25.
Find a nonconstant continuous function $f : [a, b] \to \mathbb{R}$ that has its maximum at $x_0 \in \,]0, 1[$ but for which $f'(x_0)$ does not exist.
26.
A set $B \subset A$ is said to be dense in $A$ if $A \subset \mathrm{cl}(B)$. Show that this is equivalent to the condition that for every open set $U$ with $A \cap U \ne \emptyset$ we have $B \cap U \ne \emptyset$. Is $A$ dense in $\mathrm{cl}(A)$? Show that $\mathbb{R}^n$ has a countable dense subset.
27.
a.
Let $A \subset \mathbb{R}^n$ have volume. Show that $\mathrm{int}(A)$ and $\mathrm{cl}(A)$ have volume and $v(A) = v(\mathrm{int}(A)) = v(\mathrm{cl}(A))$.
b.
Show that if $A$ is a set such that $\mathrm{cl}(A)$ has volume, we cannot conclude that $A$ itself has volume.
c.
Show that if $A$ is a set with $\mathrm{int}(A) = \emptyset$, we cannot conclude that $A$ has content 0 or measure 0.
28.
Let $g : A \subset \mathbb{R}^n \to B \subset \mathbb{R}^n$ be of class $C^1$ on an open set $A$ and suppose $B = g(A)$. We say $g$ is volume preserving if for every set $D \subset A$ such that $D$ and $g(D)$ have volume, we have $v(g(D)) = v(D)$. Suppose $g$ is one-to-one and $Jg(x) \ne 0$ at each $x \in A$; then show that $g$ is volume preserving iff $|Jg(x)| = 1$ for all $x \in A$.
29.
a.
Show that if a set $A$ has content 0, then $\mathrm{cl}(A)$ has content 0. Is this true for measure 0?
b.
Let $f : A \subset \mathbb{R}^n \to \mathbb{R}^m$ be of class $C^1$ on an open set $A$. Let $B \subset A$ and $\mathrm{cl}(B) \subset A$ and suppose that $B$ is compact. If $B$ has content or measure 0, show that $f(B)$ does as well. [Hint: Consider the case in which $Jf(x) = 0$ separately and use Sard's theorem (Exercise 27, Chapter 9).]
30.
A subset $A$ of $\mathbb{R}^n$ is said to be homeomorphic to a subset $B$ of $\mathbb{R}^m$ if there is a continuous map $\varphi : A \to B$ with a continuous inverse $\varphi^{-1} : B \to A$. We call $\varphi$ a homeomorphism.
a.
Find an example of a bijection $\varphi : A \to B$ that is continuous but is not a homeomorphism.
b.
Let $f : A \subset \mathbb{R}^n \to \mathbb{R}^m$ be continuous and let $\Gamma = \{ (x, f(x)) \in \mathbb{R}^n \times \mathbb{R}^m \mid x \in A \}$ be the graph of $f$. Show that $A$ and $\Gamma$ are homeomorphic.
31.
Let $f : [a, b] \to \mathbb{R}$ be a monotone function. (Say, for example, that $f$ is nondecreasing: $x \le y \Rightarrow f(x) \le f(y)$.)
a.
Show that the left and right limits, $f(x+) = \lim_{h \to 0+} f(x + h)$ and $f(x-) = \lim_{h \to 0+} f(x - h)$, exist at each $x$ in $[a, b]$. (Only one of them at each endpoint.)
b.
Show that $f$ has at most a countable set of discontinuities. [Hint: Let $P_n$ be the set of points at which the jump of $f$ exceeds $1/n$. Show that $P_n$ is finite and consider the union of all the $P_n$ for $n = 1, 2, 3, \ldots$.]
c.
It is a famous theorem of Lebesgue that for each such $f$ the derivative of $f$ exists, except possibly for points in a set of measure 0. Consider some examples to verify the validity of the results. Look up a proof in, for example, Hewitt and Stromberg, Real and Abstract Analysis, and write a brief essay on the essential features of the proof.
32.
Prove that the transformation
$$x_1 = u_1, \quad x_2 = u_1 + u_2, \quad x_3 = u_1 + u_2 + u_3, \quad \ldots, \quad x_n = u_1 + \cdots + u_n$$
leaves volumes unchanged.
33.
Let $g : [0, 1] \to \mathbb{R}$ be integrable. Show that
$$\int_0^1 \left[ \int_x^1 g(t)\,dt \right] dx = \int_0^1 t\,g(t)\,dt.$$
34.
Let $K$ be a compact set. If $(f_n)_1^\infty$ is a uniformly convergent sequence of continuous real-valued functions on $K$, show that $f_n$ is equicontinuous. The converse is not true. Give a counterexample.
35.
Let $S$ be a subset of $\mathbb{R}^n$ with volume, and let $t > 0$. Let $R$ be the set of points $R = \{ (t x_1, \ldots, t x_n) \mid (x_1, \ldots, x_n) \in S \}$. Show that $v(R) = t^n v(S)$. What if $t < 0$?
36.
Explain how the Gibbs phenomenon is possible and the Fourier series still converge in the mean and pointwise.
37.
a.
Let $f(x)$ have the Fourier series $(a_0/2) + \sum_{n=1}^{\infty} [a_n \cos nx + b_n \sin nx]$ on $[-\pi, \pi]$. Define the reflection of $f$ by $g(x) = f(-x)$. Show that the Fourier series of $g$ is $(a_0/2) + \sum_{n=1}^{\infty} [a_n \cos nx - b_n \sin nx]$.
b.
Let
$$f(x) = \begin{cases} 0, & \text{for } -\pi \le x \le 0, \\ x, & \text{for } 0 < x \le \pi. \end{cases}$$
Recall that the Fourier series for $f$ is
$$\frac{\pi}{4} + \sum_{n=1}^{\infty} \left( \frac{(-1)^n - 1}{\pi n^2} \cos nx - \frac{(-1)^n}{n} \sin nx \right).$$
Use part a to show that the Fourier series for $|x|$ on $[-\pi, \pi]$ is
$$\frac{\pi}{2} + 2 \sum_{n=1}^{\infty} \frac{(-1)^n - 1}{\pi n^2} \cos nx = \frac{\pi}{2} - 4 \sum_{k=1}^{\infty} \frac{\cos[(2k - 1)x]}{\pi (2k - 1)^2}.$$
c.
Use part b to show that $\dfrac{\pi^2}{8} = 1 + \dfrac{1}{3^2} + \dfrac{1}{5^2} + \dfrac{1}{7^2} + \cdots$.
d.
Use part b to obtain the Fourier cosine series of $x$ on $[0, \pi]$ and conversely.
38.
a.
Let $V$ be an inner product space and $\varphi_0, \varphi_1, \ldots$ a complete orthonormal sequence. Suppose that $W$ is a subspace of $V$ and that $f \in W$ iff $\langle f, \varphi_0 \rangle = 0$. Prove that $\varphi_1, \varphi_2, \ldots$ is a complete orthonormal system for $W$. Generalize.
b.
Apply part a to the trigonometric system and
i. $W = \{ f : [-\pi, \pi] \to \mathbb{R} \mid \int_{-\pi}^{\pi} f(x)\,dx = 0 \}$.
ii. $W = \{ f : [-\pi, \pi] \to \mathbb{R} \mid f \text{ is even} \}$.
iii. $W = \{ f : [-\pi, \pi] \to \mathbb{R} \mid f \text{ is odd} \}$.
39.
If $f : [a, b] \to \mathbb{R}$ is square integrable, prove that $f$ is integrable. That is: $\int_a^b |f(x)|^2\,dx < \infty \Rightarrow \int_a^b |f(x)|\,dx < \infty$. [Suggestion: Use the Cauchy-Schwarz inequality.]
40.
a.
Suppose that $f : [-\pi, \pi] \to \mathbb{R}$ is sectionally continuous with only jump discontinuities. Show that the sum of the Fourier series of $f$ at $x$ depends only on the values of $f$ in any neighborhood of $x$. This is called Riemann's localization property. [Suggestion: Apply Theorem 10.3.2.]
b.
The Fourier coefficients of $f$ depend on the values of $f$ throughout $[-\pi, \pi]$. How do you reconcile this with the result of part a? [Suggestion: Study the proof of 10.3.2.]
41.
Suppose the Fourier series of $f : [-\pi, \pi] \times [-\pi, \pi] \to \mathbb{R}$ is
$$\sum_{n,m=-\infty}^{\infty} c_{n,m} e^{inx} e^{imy}.$$
(See Exercise 18, Chapter 10.)
a.
Write out the Fourier series of $f$ in trigonometric form.
b.
For fixed $y$, let $g(x) = f(x, y)$. Show that the exponential Fourier coefficients for $g$ are
$$G(n) = c_n = \sum_{m=-\infty}^{\infty} c_{n,m} e^{imy}.$$
c.
If $f$ is square integrable we know that its Fourier series converges to $f$ in mean. (See Theorem 10.3.1.) The purpose here is to give a pointwise convergence theorem. Show that if $f$ is of class $C^1$ and $f(x, \pi) = f(x, -\pi)$ and $f(\pi, y) = f(-\pi, y)$ for all $x$ and $y$, then the Fourier series for $f$ converges to $f(x, y)$ pointwise. [Suggestion: Use part b and 10.3.2.]
42.
For what values of $p$ is $\sum_{n=1}^{\infty} (\sin nx)/n^p$ the Fourier series of a square integrable function? (See Exercise 32, Chapter 10.)
43.
a.
Show that
$$\left| \sum_{k=1}^{n} \cos kx \right| \le \frac{1}{|\sin(x/2)|}$$
for $x \ne 0$.
b.
Consider the Fourier series for the step function,
$$f(x) = \sum_{n=1}^{\infty} \frac{\cos nx}{n}.$$
Show that for each $\delta > 0$, this converges uniformly on $[\delta, \pi]$. [Suggestion: Use part a and the Dirichlet test.]
c.
Generalize b to any Fourier series $f(x) = \sum_{n=1}^{\infty} a_n \cos nx$ with $a_n$ decreasing. Conclude that $f$ must be continuous on $]0, \pi[$.
d.
Deduce from c that if $f$ has a discontinuity at $x_0$ and $0 < x_0 < \pi$, then the Fourier coefficients of $f$ are not decreasing.
44.
A string on $[0, l]$ is initially displaced at $t = 0$ by $f(x) = (x - l/2)^2 - l^2/4$. Find a formula for the displacement after time $t$.
45.
a.
If a bar with insulated ends has temperature $T$ = constant at $t = 0$, show that $T$ = constant for all $t > 0$.
b.
If a bar on [0, 1r] has temperature at t = 0 given by sinx, find the temperature for t > 0.
c.
Same as part b except that T = cosx at t = 0.
a.
Find a function
0
Suppose that/ : [O, I] -+ JR is integrable, that f(x) dx ~ 1, and that 0 $ f(x) $ 10 for all x in [0, 1]. Define the set E = {x E [0, 1] If(x) ~ 1}, and assume that E has volume. Show that v(E) ~ 1/2.
52.
Suppose that f : [O, 21r] - JR is continuous and /(0) = /(21r ). Let N
L (/,'Pk) 'Pk
SN=
k=-N
be the Nth partial sum of the Fourier series for/ and define 4>(y)
=
1'
and
f(x) dx
State whether each of the following "must be true" (MBT) or "could be false" (CBF).
53.
a.
I;,r SN(x)eix dx - I;,r f(x)dx as N -
b.
It x2sN(X)dx- It x2f(x)dx.
c.
II SN - f II -
d.
SN(2) - /(2).
e.
l:N is the Nth partial sum of the Fourier series for 4>.
r.
II i:N -
oo.
0.
g,
4> II - o. I:N(2) - 4>(2).
h.
I:N - cl> uniformly on [O, 21r].
The Poisson kernel and harmonic functions. Let $f(\theta)$ be continuous and $2\pi$-periodic. (We can think of $f$ as a function defined on the circumference of the unit circle in the plane.) By Fejér's theorem, we know that the Fourier series of $f$ converges to $f$ in the $(C, 1)$ sense. Deduce that
$$\lim_{r \to 1-} \sum_{k=-\infty}^{\infty} c_k r^{|k|} e^{ik\theta} = f(\theta).$$
(Note the exponent $|k|$ for the negative indices.) Define
$$u(r, \theta) = \sum_{k=-\infty}^{\infty} c_k r^{|k|} e^{ik\theta}$$
for $0 \le r < 1$. We regard $u$ as a function in the interior of the unit disk in the plane. In rectangular coordinates $x, y$ we have
$$u(x, y) = c_0 + \sum_{k=1}^{\infty} \left( c_k (x + iy)^k + c_{-k} (x - iy)^k \right).$$
Prove that this series converges uniformly in each disk of radius smaller than 1. Show (by the general theory of power series) that we can differentiate $u$ term-by-term any number of times. In this way prove that
$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0;$$
that is, $u$ is a solution of Laplace's equation, a harmonic function. We have already seen that $u(r, \theta) \to f(\theta)$ as $r \to 1-$, so we have solved the "Dirichlet problem": to find a harmonic function in the unit disk that has a given function for its boundary values. For $0 \le r < 1$ prove that
$$u(r, \theta) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t) P_r(\theta - t)\,dt$$
where $P_r(\theta - t) = \sum_{k=-\infty}^{\infty} r^{|k|} e^{ik(\theta - t)}$. The function $P_r(y) = \sum_{k=-\infty}^{\infty} r^{|k|} e^{iky}$ is called the Poisson kernel. Sum this series explicitly to prove that
$$P_r(y) = \frac{1 - r^2}{1 + r^2 - 2r \cos y}.$$
Show that this kernel has the same crucial properties as the Fejér kernel, namely
a.
$2\pi$-periodicity.
b.
$(1/2\pi) \int_{-\pi}^{\pi} P_r(t)\,dt = 1$.
c.
$P_r(t) \ge 0$.
d.
For each fixed $\delta > 0$, we have
$$\lim_{r \to 1-} \int_{\delta \le |t| \le \pi} P_r(t)\,dt = 0.$$
Deduce that $u(r, \theta)$ discussed above converges to $f(\theta)$ uniformly as $r \to 1-$.
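Before attempting the explicit summation, it may help to compare a truncated version of the series for the Poisson kernel with the closed form numerically. The sketch below is an illustration only; the function names and the truncation level `terms` are assumptions, not anything prescribed by the exercise.

```python
import cmath
import math

def poisson_partial_sum(r, y, terms=200):
    """Symmetric partial sum of sum_k r^|k| e^{iky} for k = -terms, ..., terms."""
    total = 0j
    for k in range(-terms, terms + 1):
        total += r ** abs(k) * cmath.exp(1j * k * y)
    return total.real   # the imaginary parts cancel in pairs k, -k

def poisson_closed_form(r, y):
    """Closed form (1 - r^2) / (1 + r^2 - 2 r cos y), valid for 0 <= r < 1."""
    return (1 - r * r) / (1 + r * r - 2 * r * math.cos(y))

def mean_value(r, samples=64):
    """Equally spaced average of the kernel over one period (property b)."""
    return sum(poisson_closed_form(r, 2 * math.pi * j / samples)
               for j in range(samples)) / samples

if __name__ == "__main__":
    for r, y in [(0.5, 1.0), (0.9, 2.5)]:
        print(r, y, poisson_partial_sum(r, y), poisson_closed_form(r, y))
    print("mean over a period for r = 0.5:", mean_value(0.5))
```

Agreement of the two columns is numerical evidence for the formula, not a proof; the tail of the series is of order $r^{\text{terms}}$, so the truncation matters more as $r$ approaches 1.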
Examination-Type Questions
Exercises 54-58 are examination-type questions based on Chapters 1-6 and 8.
54.
a.
Define the least upper bound of a set S.
Examination-Type Questions
55.
673
b.
Find sup{x ER I x2 +x < 3}.
c.
What is meant by saying that R is complete?
d.
Let (xn)f be a convergent sequence in R. Prove that it is a Cauchy sequence.
e.
Let O O} is connected.
Define the term "path-connected set."
e.
If $A$ and $B$ are connected sets in $\mathbb{R}^n$ and $A \cap B \ne \emptyset$, show that $A \cup B$ is connected.
56.
a.
Define what it means for a sequence of functions $f_k : \mathbb{R}^n \to \mathbb{R}$ to converge uniformly.
b.
Prove that $\sum_{k=1}^{\infty} (\sin kx)/k^{3/2}$ converges uniformly for $x \in \mathbb{R}$.
c.
Is $f(x) = \sum_{k=1}^{\infty} (\sin kx)/k^{3/2}$ a continuous function of $x$? Justify your answer.
d.
Let $f_k(x) = (1/kx) + 1$ for $k = 1, 2, 3, \ldots$ and $x \in \,]0, 1[$. Prove that $f_k \to 1$ pointwise.
e.
Does the sequence $(f_k)_1^\infty$ in part d converge uniformly on $]0, 1[$?
57.
a.
Let $f_n : [a, b] \to \mathbb{R}$ be continuous functions differentiable on $]a, b[$ with $f_n'$ continuous. Suppose $f_n$ converge uniformly to $f$ and $f_n'$ converge uniformly to $g$. State a theorem concerning differentiability of $f$.
b.
Prove your theorem in a. Clearly state any results used.
c.
Let $f_k(x) = (\sin kx)/k^2$. Does your theorem work?
d.
State a result that would guarantee that the following operation would be valid:
e.
Define $e^x = \sum_{n=0}^{\infty} x^n/n!$. Use part d to prove that $\int_0^x e^t\,dt = e^x - 1$.
58.
a.
Let $f : A \subset \mathbb{R}^n \to \mathbb{R}^n$ where $A$ is open. Give a definition of the derivative of $f$.
b.
For $f : A \subset \mathbb{R}^n \to \mathbb{R}$, define the gradient of $f$ and discuss the geometrical meaning of $\langle \operatorname{grad} f(x), e \rangle$.
c.
If $S$ is a surface defined by $S = \{ x \mid f(x) = c \}$ for some constant $c$, argue that $\operatorname{grad} f(x)$ is perpendicular to $S$ if $x \in S$.
d.
Find the equation of the plane tangent to the surface $x^2 + y^3 + z^4 = 3$ at $(1, 1, 1)$.
e.
Argue that the two surfaces $x^2 + y^2 + z^2 = 3$ and $x^3 + y^3 + z^3 = 3$ are tangent at the point $(1, 1, 1)$.
Exercises 59-62 are based on Chapters 6-10.
59.
a.
Let/ : Rn IR".
-+
IRm. Define what it means for f to be differentiable at
XE
60.
b.
Is it true that existence of the partial derivatives implies that / is differentiable? Discuss.
c.
Let/ : IR2
d.
Let h(x,y) =J(g(x,y),k(y),p(x)). Write down a formula for oh/ox. Justify this in terms of the chain rule.
e.
Let / : lR -+ lR be differentiable. Assume that / and /' have no common zeros. Prove that/ has only finitely many zeros in (0, 1).
a.
What does the inverse function theorem state for functions/ : lR IR?
b.
Consider the equations
-+
R.3,/(x,y) = (xy,eY,cosx). Compute D/(1,0).
{
-+
=2 xz+y2 +y = 3.
x1+y40
Show that they are solvable for y(x),z(x) near (x,y,z) = (I, I, 1). Compute dy/dx at x = 1.
61.
[0, 1} be continuous. Prove that cp has a fixed point.
c.
Let cp : (0, 1)
d.
Let F : IR" -+ IR" be of class C1 and have nonzero Jacobian at every point. Prove that F(R") is open.
e.
Let / : IR.2 -+ lR be continuous. Show that / is not one-to-one. [Suggestion: If/ were one-to-one, then the images of the x and y axes would both be intervals in IR.]
a.
Evaluate
-+
JA e-x2-I dxdy where A= {(x,y) E 1R2 I x2 +y2 ~ 1}.
Examination-Type Questions b.
c. d.
e.
62.
a. b.
675
Evaluate J8 x dxdy where B is the region in the plane bounded by x = 0, y = 0, and x + y = 1. State one version of Fubini's theorem. Use part c to write a fonnula for I,1./(x,y)dxdy where A= {(x,y) E R.2 I c :5 y :5 d and t/J(y) :5 x :5 cp(y)}. Here t/1 and cp are to be smooth functions on the interval [c,d] with t/J(y) :5 cp(y) for all y E [c, d]. Sketch this region. . Let cp : IR2 - R.2 be of class C1 and bijective with Jcp '/; 0. Assume IA dxdy I,;,CA) dxdy for all open disks A. Prove that Jcp 1.
=
=
Let V be an inner product space and (g of)(x) E C E/-t(g-l(C)).
¢=>
A,B '/: 0 (3a EA and 3b E B) A xB '/:0.
15. a. b.
17.
¢:=> X
If f is one-to-one, let g(y) there is no such x.
g(f(x)) E C
~
/(x) E
(a,b) EA x B, and so
=x if f(x) =y and let g(y) be anything if
If f is onto and y E T, choose some x such that f(x) = y and let h(y) = x. (Note the use of the axiom of choice.)
Let $T : \mathbb{R}^n \to \mathbb{R}^m$ be the linear transformation associated with $A$. We learn in linear algebra that $T$ is onto, i.e., $Tx = y$ is solvable for $x \in \mathbb{R}^n$ for any $y \in \mathbb{R}^m$, if and only if $T$ has rank $m$. If $B$ exists, $T$ is onto by Exercise 15b. Conversely, if $T$ is onto, choose a complementary subspace $W \subset \mathbb{R}^n$ of dimension $m$ to $\ker(T)$. Then $T|W : W \to \mathbb{R}^m$ is an isomorphism. Let $U$ be its inverse. Then $T \circ U$ = identity. If $B$ is the matrix of $U$, we get $AB$ = identity.
Chapter 1: The Real Line and Euclidean Space
1.1 Ordered Fields and the Number Systems
1.
$(a + b)^2 = (a + b)(a + b) = (a + b)a + (a + b)b = a^2 + ba + ab + b^2$ (by the distributive law twice). Now use commutativity.
3.
Expand (a - b)3
5.
Let $\mathbb{F} = \{0, 1, 2\}$ with arithmetic mod 3. For example, $2 \cdot 2 = 1$ and $1 + 2 = 0$. To show it cannot be ordered, get a contradiction from (for example) $1 > 0$ so $1 + 1 = 2 > 0$ so $1 + 2 = 0 > 0$.
> 0 and rearrange.
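The suggestion for the three-element field in answer 5 can also be confirmed by brute force. The sketch below assumes the formulation of an ordering through a set $P$ of positive elements: for every nonzero $x$ exactly one of $x$, $-x$ lies in $P$, and $P$ is closed under addition and multiplication. The function name `positive_cones` is hypothetical, and the search is an illustration rather than the intended proof.

```python
from itertools import combinations

def positive_cones(p):
    """All subsets P of the nonzero residues mod p that could serve as the
    positive elements of an ordering: exactly one of x, -x lies in P for
    every nonzero x, and P is closed under addition and multiplication."""
    nonzero = list(range(1, p))
    cones = []
    for size in range(len(nonzero) + 1):
        for subset in combinations(nonzero, size):
            P = set(subset)
            trichotomy = all((x in P) != ((-x) % p in P) for x in nonzero)
            closed = all((x + y) % p in P and (x * y) % p in P
                         for x in P for y in P)
            if trichotomy and closed:
                cones.append(P)
    return cones

if __name__ == "__main__":
    # An empty list means no admissible set of positive elements exists.
    print(positive_cones(3))
```

For $p = 3$ the candidate $P = \{1\}$ fails because $1 + 1 = 2 \notin P$, which is the contradiction described in the answer.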
1.2 Completeness and the Real Number System
1.
a.
Take limits as $n \to \infty$ in $x_n^2 = 2 + x_{n-1}$.
b.
2.
3.
0.
5.
From a (nontrivial) monotone sequence (xn)f", extract a subsequence that is strictly monotone.
1.3 Least Upper Bounds
1.
$\sup(S) = 1$; $S$ is not bounded below.
3.
$\sup(Q)$ is an upper bound for $P$, so $\sup(Q) \ge \sup(P)$.
5.
$\sup(S) = 1$.
1.4 Cauchy Sequences
1.
$|x_n - x_{n+k}| \le \sum_{i=n}^{n+k-1} |x_i - x_{i+1}| \le \sum_{i=n}^{n+k-1} (1/3^i) \le 1/3^{n-1}$. So, for all $\varepsilon > 0$, we can choose $N$ such that $1/3^{N-1} < \varepsilon$. We get a Cauchy sequence, so $(x_n)_1^\infty$ converges.
3.
A possibility is 1, 0, -1, 1, 0, -1, 1, 0, -1, ....
5.
False.
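For the estimate in answer 1 of this section, the last inequality is a geometric series bound. A short derivation, assuming (as the answer does) that $|x_i - x_{i+1}| \le 1/3^i$:

```latex
\sum_{i=n}^{n+k-1} \frac{1}{3^i}
  \le \sum_{i=n}^{\infty} \frac{1}{3^i}
  = \frac{3^{-n}}{1 - \tfrac{1}{3}}
  = \frac{1}{2 \cdot 3^{n-1}}
  \le \frac{1}{3^{n-1}} .
```

The bound is independent of $k$, which is what the Cauchy criterion requires.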
1.5 Cluster Points, lim inf, and lim sup
1.
$\liminf(x_n) = 2$; $\limsup(x_n) = 4$.
3.
Use 1.5.5 to show that there are points $x_{N(n)}$ within $1/n$ of $a$ (or $b$). Make sure you end up with a subsequence.
5.
False.
1.6 Euclidean Space 1.
If equality holds in the Cauchy-Schwarz or triangle inequality, then x and y are parallel. Expand II x + y 11 2 = (II x II + II y 11>2 to obtain II x II · II y II = (x,y) = llxll · IIYII · cos(0). Use this directly or let u = y/llYII and z = x - ( x, u ) u. .Check that ( z, u ) = 0 and that I x II 2 = II z II 2 + I(x, u ) I2 = II x 11 2 , Conclude that z = 0 and x = ((x,y) / II y ll 2)y.
3.
$\{ \lambda \cdot (-2, 0, 3) \in \mathbb{R}^3 \mid \lambda \in \mathbb{R} \}$.
5.
$x = (y + 1)/2 = (z + 2)/3$, or $P(t) = (1 + t, 1 + 2t, 1 + 3t)$. This line is not a linear subspace since $(0, 0, 0)$ is not on it.
1.7 Norms, Inner Products, and Metrics 1.
The distance given by the sup norm is 1. That from 1.7.7 is 1//3.
3.
-J7JY)
=1
v1fi:i')
,J7J:T) •v1fi:i') is true. 5.
For f(x) = different.
X,
= 1//3, and (f,g) ·
Ill lloo = 1, but
u:n
112
= 1/2.
So l(f,g)j
= 1//3. So these norms are
1.8 The Complex Numbers 1.
a.
6 + 4i.
b.
(11/5) + 2i.
c.
(3 /2) - (5 /2)i.
3.
No, not always. _,--.. ·
5.
No; true iff z is real.
7.
a.
Use trig identities and the properties of real exponents.
b.
le'J = eRe(z)
c.
cos2 0 + sin2 0 = 1
d.
Use induction.
Jzl
0 and cover A by finitely many c/2-balls. Show that cl (A) is covered by the corresponding c-balls.
5.
A sequence converges iff it is eventually constant. This does not contradict Exercise 4 since the entries i~ such a convergent sequence form a finite set.
3.2 The Heine-Borel Theorem 1.
None of them.
3.
All subsets of $M$ are bounded, so $A$ compact $\iff$ ($A$ closed and bounded) $\iff$ $A$ closed.
5.
No.
3.3 The Nested Set Property
1.
$\bigcap_k F_k = \{\sqrt{2}\} \ne \emptyset$.
3.
If $F_k = \{ x_l \mid l \ge k \}$, then $\bigcap_k F_k = \emptyset$. None of the sets $F_k$ are compact.
3.4 Path-Connected Sets 1.
3.
a.
Not path-connected. Any path between two rationals must contain an irrational.
b.
Path-connected.
c.
Path-connected.
d.
Not path-connected. If the point (1, 0) were added, it would be pathconnected.
No.
3.5 Connected Sets 1.
No. ] - 1/2, 3/2[ and ]2, 7 /2[ are disjoint open sets whose union contains A.
3.
If $A \subset \mathbb{R}^2$, let $A^* = \{ (x, y, 0) \in \mathbb{R}^3 \mid (x, y) \in A \}$. If $\gamma(t) = (x(t), y(t))$ is a path in $A$, then $\gamma^*(t) = (x(t), y(t), 0)$ is a path in $A^*$. If $U^*$ and $V^*$ disconnect $A^*$, then $U = \{ (x, y) \in \mathbb{R}^2 \mid (x, y, 0) \in U^* \}$ and $V = \{ (x, y) \in \mathbb{R}^2 \mid (x, y, 0) \in V^* \}$ would disconnect $A$.
Exercises for Chapter 3 (at end of chapter) 1.
a.
Connected, not compact.
b.
Connected and compact.
c.
Compact, not connected if $n = 1$, connected and compact if $n \ge 2$.
d.
Neither connected nor compact.
e.
Compact, but not connected if it contains more than one point.
f.
$n = 1$, compact and not connected. $n \ge 2$, compact and connected.
g.
Compact and connected.
h.
Compact, not necessarily connected.
i.
Neither compact nor connected.
j.
Compact, not necessarily connected.
3.
cl (A) is bounded since A is (why?), and it is closed. So it is compact. Every infinite subset, such as A, must have an accumulation point. (See Section 3.1, Exercise l.)
5.
a.
$U_k = \{ x \in \mathbb{R}^2 \mid \|x\| < k/(k + 1) \}$, $k = 1, 2, 3, \ldots$.
b.
$U_k = \,]k - (1/3), k + (1/3)[$, $k = \ldots, -3, -2, -1, 0, 1, 2, 3, \ldots$.
7.
Start with $\mathrm{cl}(A_k) = \{x\} \cup \{x_k, x_{k+1}, \ldots\}$. Remember to show that if $y \ne x$, then $y \notin \mathrm{cl}(A_k)$ for large $k$. (No such $x_k$ can be close to $y$ since they are close to $x$.) (Give detail.)
9.
a.
False; $[0, 1]$ is compact, but $\mathbb{R} \setminus [0, 1]$ is not connected. In $\mathbb{R}^n$, $A = \{ x \in \mathbb{R}^n \mid 1 \le \|x\| \le 2 \}$ is compact, but $\mathbb{R}^n \setminus A$ is not connected.
b.
False; same examples as in a.
c.
False; $]a, b]$ is connected but is neither open nor closed.
d.
False for $n = 1$, true for $n \ge 2$. ($\mathbb{R}^n \setminus A$ is path-connected if $n \ge 2$.)
11.
a.
If $U$ and $V$ disconnect $B$ and $x \in B \cap U$, then $A \cap U \ne \emptyset$. (Why?) Similarly, $A \cap V \ne \emptyset$. So $A$ would be disconnected.
c.
Establish lemma: If $B$ is connected, $B \subset C$, and $C$ is disconnected by $U$ and $V$, then either $B \subset U$ or $B \subset V$.
13.
Pick $x_n \in F_n$. Show $(x_n)_1^\infty$ is a Cauchy sequence. Its limit must be in $\mathrm{cl}(F_n)$ for all $n$. (Why?) There cannot be two such points since $\mathrm{diam}(F_n) \to 0$.
15.
a.
If $\gamma(t)$ and $\mu(t)$ are paths from $x$ to $a$ in $K_1$ and from $y$ to $b$ in $K_2$, then $\varphi(t) = (\gamma(t), \mu(t))$ is a path from $(x, y)$ to $(a, b)$ in $K_1 \times K_2$.
17.
As in Worked Example 1.2 of Chapter 1, find $z_k \in K$ with $d(x, z_k) \to d(x, K) = \inf\{ d(x, z) \mid z \in K \}$. For large $k$ they are all in the closed ball of radius $1 + d(x, K)$ around $x$. Use compactness to get a subsequence converging to some $z$. Then $z \in K$ (why?), and $d(z_{n(j)}, x) \to d(z, x)$ (why?). So $d(x, z) = d(x, K)$. (Why?) This does not work for open sets. The proof does not work unless closed balls are compact.
19.
cl 0".by "~ 0."
7.
Suppose $C$ is a closed subset of $B$. Then $C$ is compact (why?) and $(f^{-1})^{-1}(C) = f(C)$. (Why?) So $f(C)$ is closed. (Why?) Thus $f^{-1}$ is continuous. (Why?) For a counterexample with $n = 2$ consider $f : [0, 2\pi[ \,\to \mathbb{R}^2$ given by $f(t) = (\sin t, \cos t)$.
9.
Extend $f$ by putting $f(b) = g(b)$. If $F$ is closed in $\mathbb{R}^m$, show that $h^{-1}(F) = f^{-1}(F) \cup g^{-1}(F)$ and so is closed.
11.
a.
If $(c, f(c))$ and $(d, f(d))$ are on the graph, put $\gamma(t) = (t, f(t))$.
13.
If $\inf(f(V))$ or $\sup(f(V))$ were in $f(V)$, then $f(V)$ could not be open since it could not contain an interval around either of these points.
15.
For an example with inequality, try $f_1(x) = x$ and $f_2(x) = 1 - x$ on $[0, 1]$.
17.
Let $f(x, y) = xy$. Show that $f$ is not uniformly continuous by showing that $|f(a, a) - f(b, b)| = (|a + b|/\sqrt{2}) \, \|(a, a) - (b, b)\|$.
19.
$A = \{ (x, y) \in \mathbb{R}^2 \mid xy = 1 \}$.
21.
a, b, and c. Yes.
d. No.
f(A) = {x E
JR Ix
~
O}.
23.
Hint: To show "onto," suppose $y_1 \in X \setminus f(X)$ and consider the sequence $y_2 = f(y_1), y_3 = f(y_2), \ldots$.
25.
$f(x) = \sin(1/x)$ is a counterexample with $f'$ not bounded.
27.
81/64.
29.
Divide by $x - y$ and let $y$ tend to $x$ to show that $f'(x) = 0$.
31.
$f(1) = e$.
33.
If $A$ is relatively compact, then $\mathrm{cl}(A)$ is compact. Convergent subsequences exist by the Bolzano-Weierstrass theorem. If every sequence in $A$ has a subsequence convergent in $\mathbb{R}^n$, then start with a sequence in $\mathrm{cl}(A)$, get a nearby sequence in $A$, take a subsequence converging in $\mathbb{R}^n$, and show that the corresponding subsequence of the original sequence also converges, necessarily to a point in $\mathrm{cl}(A)$.
35.
A =f({x ER Ix> 0}) is nonempty and bounded below by J(O). Show that lim........o- j(x) = inf(A).
37.
Supposef'(a) < 0 f(y)dy andg is differentiable andf continuous, then F'(x) = f(g(x))g'(x).
45.
Let $m = \inf(g([a, b]))$ and $M = \sup(g([a, b]))$. Then
$$m \int_a^b f(x)\,dx \le \int_a^b f(x) g(x)\,dx \le M \int_a^b f(x)\,dx.$$
(Why?) Since $t \int_a^b f(x)\,dx$ depends continuously on $t$, the intermediate value theorem gives $t_0$ in $[m, M]$ with
$$\int_a^b f(x) g(x)\,dx = t_0 \int_a^b f(x)\,dx.$$
Now apply that theorem to $g$ to get $x_0$ with $g(x_0) = t_0$. (Supply details.)
Chapter 5: Uniform Convergence
5.1 Pointwise and Uniform Convergence
1.
Yes.
3.
Yes.
5.
Show that if $f_k(x)$ is the $k$th partial sum, then $|f(x) - f_k(x)| \le \sum_{n=k+1}^{\infty} (1/n^2)$. Use this to show uniform convergence.
5.2 The Weierstrass M-Test
1.
a.
Convergence is pointwise but not uniform. (Why and to what?)
b.
Convergence is uniform. (Why and to what?)
3.
$f(x)$ is the uniform limit of $f_k(x) = \sum_{n=1}^{k} (x^n/n^2)$ on $[0, 1]$. (Why?) Each $f_k$ is continuous, so $f$ is also. (Why?)
5.
Use the Weierstrass M-test with $M_k = |a_k|$.
5.3 Integration and Differentiation of Series
1.
The limit is $f(x) = 1/x$ for $x > 0$; $f(0) = 0$. This is not continuous. The convergence cannot be uniform. (Why not?)
3.
Show that $|f_n(x)| \le f_n(n/(n + 1)) \le \sqrt{n}/(n + 1)$. From this show that $f_n \to 0$ uniformly on $[0, 1]$. The derivatives converge to 0 pointwise but not uniformly.
5.
Justify term-by-term differentiation of the series for sine by discussing uniform convergence of the appropriate series on $[-R, R]$ for each $R > 0$.
5.4 The Elementary Functions
1.
Justify term-by-term integration by discussing uniform convergence on $[0, r]$ for $r > x$.
3.
Let $h(x) = f(ax) - f(a) - f(x)$. Show $h(1) = 0$ and $h'(x) = 0$ for all $x > 0$. Derive the result from this.
5.
Use ~7.15.
7.
For positive integers $p$ and $q$, we have $(g(1/q))^q = g((1/q) + \cdots + (1/q))$ ($q$ terms) $= g(1) = e$. So $g(1/q) = e^{1/q}$. Then $g(p/q) = g((1/q) + \cdots + (1/q))$ ($p$ terms) $= g(1/q)^p = e^{p/q}$.
5.5 The Space of Continuous Functions 1.
No; int (B) = {f E Cb OR, IR) I 38
3.
Example 5.5.6 can be applied to produce the M-test. fffn = llfn+l - fn I = II 8n II ::; rn = Mn.
5.
It is not closed unless the limit function is in the set.
> 0 with f(x) > 8 for all x }.
E~1 gk, then
5.6 The Arzela-Ascoli Theorem
1.
If $|f_n'(x)| < M$ for all $x$ and $f_n(0) = 0$ for all $n$, then $|f_n(x)| \le Mx \le M$ for all $x \in \,]0, 1[$. (Why?)
3.
a.
Show that the complement is closed by using 5.6.6.
b.
A uniform limit of continuous functions is continuous.
5.
Show that if $|f_n(t)| \le M$ for every $t$ in $[a, b]$, then $|F_n(x)| \le M(b - a)$ for all $n$ and all $x$ in $[a, b]$. So $(F_n)_1^\infty$ is uniformly bounded. Also $|F_n'(x)| \le M$. (Why?) Use the method of Example 5.6.4 to get a uniformly convergent subsequence.
S.7 The Contraction Mapping Principle and Its Applications 1.
lo{
k. Then let! = I( 'Pk,Sn }I $ II 'Pt II · II Sn II = II Sn II - 0 as n - oo. So Ct = 0.
19.
a.
Compute
( 'Pi, 'Pi} j and O if i f:. j. So the 'Pi 1 6it6ki which is I if i fonn an orthononnal family. Also, for x E /2 we have x = r;;:i x;cp;, so the family is complete.
II/ - E ftlfJk 11 2 ~ 0, and let n -+ oo.
(,\ - µ)p(x) f(x)g(x)
= (,\p(x) f(x))g(x)
- (µp(x)g(x)) f(x)
= [-/'(x) - q(x) f(x)]g(x) - [-g" (x) - q(x)g(x)] f(x) = g" (x) f(x) - J" (x)g(x)
= [g"(x) f(x) + g'(x) /'(x)] -
[f'(x)g'(x) + f"(x)g(x)]
=[g'(x) l(x)]' -
=dxd (g'(x) f(x) - /'(x)g(x)) .
J:
1/'(x)g(x)]'
So(,\ - µ,) p(x')f(x)g(x)dx = (,\ - µ)[g'(x')f(x) - J'(x)g(x)]I! = 0. As ,\ - µ f:. 0, the integral must be 0.
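21.
Let $A = f - s_n$, $B = s_{n+p} - s_n$. Then $\langle s_n, B \rangle = 0$ implies $\langle A, B \rangle = \sum_{i=n+1}^{n+p} \langle f, \varphi_i \rangle \overline{\langle f, \varphi_i \rangle} = \|B\|^2$. And $\langle B, A \rangle = \overline{\langle A, B \rangle} = \|B\|^2$. So $\langle f - s_{n+p}, f - s_{n+p} \rangle = \langle A - B, A - B \rangle = \langle A, A \rangle - \langle A, B \rangle - \langle B, A \rangle + \langle B, B \rangle = \|A\|^2 - \|B\|^2 \le \|A\|^2$. Thus $\|f - s_{n+p}\| \le \|f - s_n\|$.
23.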
iii~~ = iii [
=
~ ( t/Jo,