ADVANCED CALCULUS OF SEVERAL VARIABLES
C.H. Edwards, Jr.
Advanced Calculus of Several Variables C. H. EDWARDS, JR. The University of Georgia
DOVER PUBLICATIONS, INC. New York
Copyright
Copyright © 1973 by Academic Press, Inc.
Bibliographical Note
This Dover edition, first published in 1994, is an unabridged, corrected republication of the work first published by Academic Press, New York, 1973.
Library of Congress Cataloging-in-Publication Data
Edwards, C. H. (Charles Henry), 1937-
Advanced calculus of several variables / C. H. Edwards, Jr.
p. cm.
Originally published: New York : Academic Press, 1973.
Includes bibliographical references and index.
ISBN 0-486-68336-2 (pbk.)
1. Calculus. I. Title.
QA303.E22 1994
515'.84-dc20
Manufactured in the United States of America
Dover Publications, Inc., 31 East 2nd Street, Mineola, N.Y. 11501
94-24204 CIP
To My Parents
CONTENTS

Preface

I Euclidean Space and Linear Mappings
1 The Vector Space $\mathcal{R}^n$
2 Subspaces of $\mathcal{R}^n$
3 Inner Products and Orthogonality
4 Linear Mappings and Matrices
5 The Kernel and Image of a Linear Mapping
6 Determinants
7 Limits and Continuity
8 Elementary Topology of $\mathcal{R}^n$

II Multivariable Differential Calculus
1 Curves in $\mathcal{R}^m$
2 Directional Derivatives and the Differential
3 The Chain Rule
4 Lagrange Multipliers and the Classification of Critical Points for Functions of Two Variables
5 Maxima and Minima, Manifolds, and Lagrange Multipliers
6 Taylor's Formula for Single-Variable Functions
7 Taylor's Formula in Several Variables
8 The Classification of Critical Points

III Successive Approximations and Implicit Functions
1 Newton's Method and Contraction Mappings
2 The Multivariable Mean Value Theorem
3 The Inverse and Implicit Mapping Theorems
4 Manifolds in $\mathcal{R}^n$
5 Higher Derivatives

IV Multiple Integrals
1 Area and the 1-Dimensional Integral
2 Volume and the n-Dimensional Integral
3 Step Functions and Riemann Sums
4 Iterated Integrals and Fubini's Theorem
5 Change of Variables
6 Improper Integrals and Absolutely Integrable Functions

V Line and Surface Integrals; Differential Forms and Stokes' Theorem
1 Pathlength and Line Integrals
2 Green's Theorem
3 Multilinear Functions and the Area of a Parallelepiped
4 Surface Area
5 Differential Forms
6 Stokes' Theorem
7 The Classical Theorems of Vector Analysis
8 Closed and Exact Forms

VI The Calculus of Variations
1 Normed Vector Spaces and Uniform Convergence
2 Continuous Linear Mappings and Differentials
3 The Simplest Variational Problem
4 The Isoperimetric Problem
5 Multiple Integral Problems

Appendix: The Completeness of $\mathcal{R}$

Suggested Reading

Subject Index
PREFACE
This book has developed from junior-senior level advanced calculus courses that I have taught during the past several years. It was motivated by a desire to provide a modern conceptual treatment of multivariable calculus, emphasizing the interplay of geometry and analysis via linear algebra and the approximation of nonlinear mappings by linear ones, while at the same time giving equal attention to the classical applications and computational methods that are responsible for much of the interest and importance of this subject.

In addition to a satisfactory treatment of the theory of functions of several variables, the reader will (hopefully) find evidence of a healthy devotion to matters of exposition as such; for example, the extensive inclusion of motivational and illustrative material and applications that is intended to make the subject attractive and accessible to a wide range of "typical" science and mathematics students. The many hundreds of carefully chosen examples, problems, and figures are one result of this expository effort.

This book is intended for students who have completed a standard introductory calculus sequence. A slightly faster pace is possible if the students' first course included some elementary multivariable calculus (partial derivatives and multiple integrals). However, this is not essential, since the treatment here of multivariable calculus is fully self-contained. We do not review single-variable calculus, with the exception of Taylor's formula in Section II.6 (Section 6 of Chapter II) and the fundamental theorem of calculus in Section IV.1.

Chapter I deals mainly with the linear algebra and geometry of Euclidean n-space $\mathcal{R}^n$. With students who have taken a typical first course in elementary linear algebra, the first six sections of Chapter I can be omitted; the last two sections of Chapter I deal with limits and continuity for mappings of Euclidean spaces, and with the elementary topology of $\mathcal{R}^n$ that is needed in calculus. The only linear algebra that is actually needed to start Chapter II is a knowledge of the correspondence between linear mappings and matrices. With students having this minimal knowledge of linear algebra, Chapter I might (depending upon the taste of the instructor) best be used as a source for reference as needed.
Chapters II through V are the heart of the book. Chapters II and III treat multivariable differential calculus, while Chapters IV and V treat multivariable integral calculus.

In Chapter II the basic ingredients of single-variable differential calculus are generalized to higher dimensions. We place a slightly greater emphasis than usual on maximum-minimum problems and Lagrange multipliers; experience has shown that this is pedagogically sound from the standpoint of student motivation. In Chapter III we treat the fundamental existence theorems of multivariable calculus by the method of successive approximations. This approach is equally adaptable to theoretical applications and numerical computations.

Chapter IV centers around Sections 4 and 5, which deal with iterated integrals and change of variables, respectively. Section IV.6 is a discussion of improper multiple integrals. Chapter V builds upon the preceding chapters to give a comprehensive treatment, from the viewpoint of differential forms, of the classical material associated with line and surface integrals, Stokes' theorem, and vector analysis. Here, as throughout the book, we are not concerned solely with the development of the theory, but with the development of conceptual understanding and computational facility as well.

Chapter VI presents a modern treatment of some venerable problems of the calculus of variations. The first part of the chapter generalizes (to normed vector spaces) the differential calculus of Chapter II. The remainder of the chapter treats variational problems by the basic method of "ordinary calculus": equate the first derivative to zero, and then solve for the unknown (now a function). The method of Lagrange multipliers is generalized so as to deal in this context with the classical isoperimetric problems.

There is a sense in which the exercise sections may constitute the most important part of this book. Although the mathematician may, in a rapid reading, concentrate mainly on the sequence of definitions, theorems and proofs, this is not the way that a textbook is read by students (nor is it the way a course should be taught). The student's actual course of study may be more nearly defined by the problems than by the textual material. Consequently, those ideas and concepts that are not dealt with by the problems may well remain unlearned by the students. For this reason, a substantial portion of my effort has gone into the approximately 430 problems in the book. These are mainly concrete computational problems, although not all routine ones, and many deal with physical applications. A proper emphasis on these problems, and on the illustrative examples and applications in the text, will give a course taught from this book the appropriate intuitive and conceptual flavor.

I wish to thank the successive classes of students who have responded so enthusiastically to the class notes that have evolved into this book, and who have contributed to it more than they are aware. In addition, I appreciate the excellent typing of Janis Burke, Frances Chung, and Theodora Schultz.
I Euclidean Space and Linear Mappings
Introductory calculus deals mainly with real-valued functions of a single variable, that is, with functions from the real line $\mathcal{R}$ to itself. Multivariable calculus deals in general, and in a somewhat similar way, with mappings from one Euclidean space to another. However, a number of new and interesting phenomena appear, resulting from the rich geometric structure of n-dimensional Euclidean space $\mathcal{R}^n$.

In this chapter we discuss $\mathcal{R}^n$ in some detail, as preparation for the development in subsequent chapters of the calculus of functions of an arbitrary number of variables. This generality will provide more clear-cut formulations of theoretical results, and is also of practical importance for applications. For example, an economist may wish to study a problem in which the variables are the prices, production costs, and demands for a large number of different commodities; a physicist may study a problem in which the variables are the coordinates of a large number of different particles. Thus a "real-life" problem may lead to a high-dimensional mathematical model. Fortunately, modern techniques of automatic computation render feasible the numerical solution of many high-dimensional problems, whose manual solution would require an inordinate amount of tedious computation.
1 THE VECTOR SPACE $\mathcal{R}^n$
As a set, $\mathcal{R}^n$ is simply the collection of all ordered n-tuples of real numbers. That is,

$$\mathcal{R}^n = \{(x_1, x_2, \ldots, x_n) : \text{each } x_i \in \mathcal{R}\}.$$
Recalling that the Cartesian product $A \times B$ of the sets $A$ and $B$ is by definition the set of all pairs $(a, b)$ such that $a \in A$ and $b \in B$, we see that $\mathcal{R}^n$ can be regarded as the Cartesian product set $\mathcal{R} \times \cdots \times \mathcal{R}$ ($n$ times), and this is of course the reason for the symbol $\mathcal{R}^n$.

The geometric representation of $\mathcal{R}^3$, obtained by identifying the triple $(x_1, x_2, x_3)$ of numbers with that point in space whose coordinates with respect to three fixed, mutually perpendicular "coordinate axes" are $x_1, x_2, x_3$ respectively, is familiar to the reader (although we frequently write $(x, y, z)$ instead of $(x_1, x_2, x_3)$ in three dimensions). By analogy one can imagine a similar geometric representation of $\mathcal{R}^n$ in terms of $n$ mutually perpendicular coordinate axes in higher dimensions (however, there is a valid question as to what "perpendicular" means in this general context; we will deal with this in Section 3).

The elements of $\mathcal{R}^n$ are frequently referred to as vectors. Thus a vector is simply an n-tuple of real numbers, and not a directed line segment, or equivalence class of them (as sometimes defined in introductory texts).

The set $\mathcal{R}^n$ is endowed with two algebraic operations, called vector addition and scalar multiplication (numbers are sometimes called scalars for emphasis). Given two vectors $\mathbf{x} = (x_1, \ldots, x_n)$ and $\mathbf{y} = (y_1, \ldots, y_n)$ in $\mathcal{R}^n$, their sum $\mathbf{x} + \mathbf{y}$ is defined by

$$\mathbf{x} + \mathbf{y} = (x_1 + y_1, \ldots, x_n + y_n),$$

that is, by coordinatewise addition. Given $a \in \mathcal{R}$, the scalar multiple $a\mathbf{x}$ is defined by

$$a\mathbf{x} = (ax_1, \ldots, ax_n).$$

For example, if $\mathbf{x} = (1, 0, -2, 3)$ and $\mathbf{y} = (-2, 1, 4, -5)$, then $\mathbf{x} + \mathbf{y} = (-1, 1, 2, -2)$ and $2\mathbf{x} = (2, 0, -4, 6)$. Finally we write $\mathbf{0} = (0, \ldots, 0)$ and $-\mathbf{x} = (-1)\mathbf{x}$, and use $\mathbf{x} - \mathbf{y}$ as an abbreviation for $\mathbf{x} + (-\mathbf{y})$.

The familiar associative, commutative, and distributive laws for the real numbers imply the following basic properties of vector addition and scalar multiplication:

V1 $\mathbf{x} + (\mathbf{y} + \mathbf{z}) = (\mathbf{x} + \mathbf{y}) + \mathbf{z}$
V2 $\mathbf{x} + \mathbf{y} = \mathbf{y} + \mathbf{x}$
V3 $\mathbf{x} + \mathbf{0} = \mathbf{x}$
V4 $\mathbf{x} + (-\mathbf{x}) = \mathbf{0}$
V5 $(ab)\mathbf{x} = a(b\mathbf{x})$
V6 $(a + b)\mathbf{x} = a\mathbf{x} + b\mathbf{x}$
V7 $a(\mathbf{x} + \mathbf{y}) = a\mathbf{x} + a\mathbf{y}$
V8 $1\mathbf{x} = \mathbf{x}$
(Here $\mathbf{x}, \mathbf{y}, \mathbf{z}$ are arbitrary vectors in $\mathcal{R}^n$, and $a$ and $b$ are real numbers.) V1-V8 are all immediate consequences of our definitions and the properties of $\mathcal{R}$. For example, to prove V6, let $\mathbf{x} = (x_1, \ldots, x_n)$. Then

$$(a + b)\mathbf{x} = ((a + b)x_1, \ldots, (a + b)x_n) = (ax_1 + bx_1, \ldots, ax_n + bx_n) = (ax_1, \ldots, ax_n) + (bx_1, \ldots, bx_n) = a\mathbf{x} + b\mathbf{x}.$$

The remaining verifications are left as exercises for the student.

A vector space is a set $V$ together with two mappings $V \times V \to V$ and $\mathcal{R} \times V \to V$, called vector addition and scalar multiplication respectively, such that V1-V8 above hold for all $\mathbf{x}, \mathbf{y}, \mathbf{z} \in V$ and $a, b \in \mathcal{R}$ (V3 asserts that there exists $\mathbf{0} \in V$ such that $\mathbf{x} + \mathbf{0} = \mathbf{x}$ for all $\mathbf{x} \in V$, and V4 that, given $\mathbf{x} \in V$, there exists $-\mathbf{x} \in V$ such that $\mathbf{x} + (-\mathbf{x}) = \mathbf{0}$). Thus V1-V8 may be summarized by saying that $\mathcal{R}^n$ is a vector space. For the most part, all vector spaces that we consider will be either Euclidean spaces or subspaces of Euclidean spaces.

By a subspace of the vector space $V$ is meant a subset $W$ of $V$ that is itself a vector space (with the same operations). It is clear that the subset $W$ of $V$ is a subspace if and only if it is "closed" under the operations of vector addition and scalar multiplication (that is, the sum of any two vectors in $W$ is again in $W$, as is any scalar multiple of an element of $W$); properties V1-V8 are then inherited by $W$ from $V$. Equivalently, $W$ is a subspace of $V$ if and only if any linear combination of two vectors in $W$ is also in $W$ (why?). Recall that a linear combination of the vectors $\mathbf{v}_1, \ldots, \mathbf{v}_k$ is a vector of the form $a_1\mathbf{v}_1 + \cdots + a_k\mathbf{v}_k$, where the $a_i \in \mathcal{R}$. The span of the vectors $\mathbf{v}_1, \ldots, \mathbf{v}_k \in \mathcal{R}^n$ is the set $S$ of all linear combinations of them, and it is said that $S$ is generated by the vectors $\mathbf{v}_1, \ldots, \mathbf{v}_k$.

Example 1 $\mathcal{R}^n$ is a subspace of itself, and is generated by the standard basis vectors

$$\mathbf{e}_1 = (1, 0, 0, \ldots, 0), \quad \mathbf{e}_2 = (0, 1, 0, \ldots, 0), \quad \ldots, \quad \mathbf{e}_n = (0, 0, 0, \ldots, 0, 1),$$

since $(x_1, x_2, \ldots, x_n) = x_1\mathbf{e}_1 + x_2\mathbf{e}_2 + \cdots + x_n\mathbf{e}_n$. Also the subset of $\mathcal{R}^n$ consisting of the zero vector alone is a subspace, called the trivial subspace of $\mathcal{R}^n$.

Example 2 The set of all points in $\mathcal{R}^n$ with last coordinate zero, that is, the set of all $(x_1, \ldots, x_{n-1}, 0) \in \mathcal{R}^n$, is a subspace of $\mathcal{R}^n$ which may be identified with $\mathcal{R}^{n-1}$.
Example 3 Given $\mathbf{a} = (a_1, a_2, \ldots, a_n) \in \mathcal{R}^n$, the set of all $(x_1, x_2, \ldots, x_n) \in \mathcal{R}^n$ such that $a_1x_1 + \cdots + a_nx_n = 0$ is a subspace of $\mathcal{R}^n$ (see Exercise 1.1).
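The coordinatewise operations defined above are easy to try out numerically. The following is a minimal sketch, assuming vectors are represented as Python tuples; the function names `add`, `scale`, and `standard_basis` are illustrative choices, not notation from the text. It reproduces the numerical example given earlier, spot-checks V6 for one choice of scalars, and confirms the decomposition of Example 1.

```python
def add(x, y):
    """Coordinatewise sum of two n-tuples."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same number of coordinates")
    return tuple(xi + yi for xi, yi in zip(x, y))


def scale(a, x):
    """Scalar multiple a*x of the n-tuple x."""
    return tuple(a * xi for xi in x)


def standard_basis(n):
    """The vectors e_1, ..., e_n of Example 1."""
    return [tuple(1 if i == j else 0 for j in range(n)) for i in range(n)]


x = (1, 0, -2, 3)
y = (-2, 1, 4, -5)
print(add(x, y))    # (-1, 1, 2, -2)
print(scale(2, x))  # (2, 0, -4, 6)

# V6 for one choice of scalars: (a + b)x == ax + bx
a, b = 3, -5
assert scale(a + b, x) == add(scale(a, x), scale(b, x))

# x = x_1 e_1 + ... + x_n e_n
total = (0, 0, 0, 0)
for xi, ei in zip(x, standard_basis(4)):
    total = add(total, scale(xi, ei))
assert total == x
```

A single numerical check of V6 is of course no substitute for the general verification given in the text; it only illustrates what the identity says.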
Example 4 The span $S$ of the vectors $\mathbf{v}_1, \ldots, \mathbf{v}_k \in \mathcal{R}^n$ is a subspace of $\mathcal{R}^n$ because, given elements $\mathbf{a} = \sum_{i=1}^{k} a_i\mathbf{v}_i$ and $\mathbf{b} = \sum_{i=1}^{k} b_i\mathbf{v}_i$ of $S$, and real numbers $r$ and $s$, we have $r\mathbf{a} + s\mathbf{b} = \sum_{i=1}^{k} (ra_i + sb_i)\mathbf{v}_i \in S$.

Lines through the origin in $\mathcal{R}^3$ are (essentially by definition) those subspaces of $\mathcal{R}^3$ that are generated by a single nonzero vector, while planes through the origin in $\mathcal{R}^3$ are those subspaces of $\mathcal{R}^3$ that are generated by a pair of noncollinear vectors. We will see in the next section that every subspace $V$ of $\mathcal{R}^n$ is generated by some finite number, at most $n$, of vectors; the dimension of the subspace $V$ will be defined to be the minimal number of vectors required to generate $V$. Subspaces of $\mathcal{R}^n$ of all dimensions between 0 and $n$ will then generalize lines and planes through the origin in $\mathcal{R}^3$.

Example 5 If $V$ and $W$ are subspaces of $\mathcal{R}^n$, then so is their intersection $V \cap W$ (the set of all vectors that lie in both $V$ and $W$). See Exercise 1.2.

Although most of our attention will be confined to subspaces of Euclidean spaces, it is instructive to consider some vector spaces that are not subspaces of Euclidean spaces.

Example 6 Let $\mathcal{F}$ denote the set of all real-valued functions on $\mathcal{R}$. If $f + g$ and $af$ are defined by $(f + g)(x) = f(x) + g(x)$ and $(af)(x) = af(x)$, then $\mathcal{F}$ is a vector space (why?), with the zero vector being the function which is zero for all $x \in \mathcal{R}$. If $\mathcal{C}$ is the set of all continuous functions and $\mathcal{P}$ is the set of all polynomials, then $\mathcal{P}$ is a subspace of $\mathcal{C}$ [...]

[...]

This will be the case if $\sum_{j=1}^{k} a_{ij}x_j = 0$, $i = 1, \ldots, n$. Thus we need to find a nontrivial solution of the homogeneous linear equations

$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1k}x_k &= 0,\\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2k}x_k &= 0,\\
&\;\;\vdots\\
a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nk}x_k &= 0.
\end{aligned} \tag{1}$$

By a nontrivial solution $(x_1, x_2, \ldots, x_k)$ of the system (1) is meant one for which not all of the $x_i$ are zero. But $k > n$, and (1) is a system of $n$ homogeneous linear equations in the $k$ unknowns $x_1, \ldots, x_k$. (Homogeneous meaning that the right-hand side constants are all zero.)

It is a basic fact of linear algebra that any system of homogeneous linear equations, with more unknowns than equations, has a nontrivial solution. The proof of this fact is an application of the elementary algebraic technique of elimination of variables. Before stating and proving the general theorem, we consider a special case.
Example 4 Consider the following three equations in four unknowns:

$$\begin{aligned}
x_1 + 2x_2 - x_3 + 2x_4 &= 0,\\
x_1 - x_2 + 2x_3 + x_4 &= 0,\\
2x_1 + x_2 - x_3 - x_4 &= 0.
\end{aligned} \tag{2}$$

We can eliminate $x_1$ from the last two equations of (2) by subtracting the first equation from the second one, and twice the first equation from the third one. This gives two equations in three unknowns:

$$\begin{aligned}
-3x_2 + 3x_3 - x_4 &= 0,\\
-3x_2 + x_3 - 5x_4 &= 0.
\end{aligned} \tag{3}$$

Subtraction of the first equation of (3) from the second one gives the single equation

$$-2x_3 - 4x_4 = 0 \tag{4}$$

in two unknowns. We can now choose $x_4$ arbitrarily. For instance, if $x_4 = 1$, then $x_3 = -2$. The first equation of (3) then gives $x_2 = -\tfrac{7}{3}$, and finally the first equation of (2) gives $x_1 = \tfrac{2}{3}$. So we have found the nontrivial solution $(\tfrac{2}{3}, -\tfrac{7}{3}, -2, 1)$ of the system (2).

The procedure illustrated in this example can be applied to the general case of $n$ equations in the unknowns $x_1, \ldots, x_k$, $k > n$. First we order the $n$ equations so that the first equation contains $x_1$, and then eliminate $x_1$ from the remaining equations by subtracting the appropriate multiple of the first equation from each of them. This gives a system of $n - 1$ homogeneous linear equations in the $k - 1$ variables $x_2, \ldots, x_k$. Similarly we eliminate $x_2$ from the last $n - 2$ of these $n - 1$ equations by subtracting multiples of the first one, obtaining $n - 2$ equations in the $k - 2$ variables $x_3, x_4, \ldots, x_k$. After $n - 1$ steps of this sort, we end up with a single homogeneous linear equation in the $k - n + 1$ unknowns $x_n, x_{n+1}, \ldots, x_k$. We can then choose arbitrary nontrivial values for the "extra" variables $x_{n+1}, x_{n+2}, \ldots, x_k$ (such as $x_{n+1} = 1$, $x_{n+2} = \cdots = x_k = 0$), solve the final equation for $x_n$, and finally proceed backward to solve successively for each of the eliminated variables $x_{n-1}, x_{n-2}, \ldots, x_1$. The reader may (if he likes) formalize this procedure to give a proof, by induction on the number $n$ of equations, of the following result.

Theorem 2.2 If $k > n$, then any system of $n$ homogeneous linear equations in $k$ unknowns has a nontrivial solution.

By the discussion preceding Eqs. (1) we now have the desired result that $\dim \mathcal{R}^n = n$.

Corollary 2.3 Any $n + 1$ vectors in $\mathcal{R}^n$ are linearly dependent.
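The elimination procedure behind Theorem 2.2 can be sketched in code. The version below is a variant of the procedure described above rather than a literal transcription: it eliminates each unknown from every other equation (not only the later ones), so no back-substitution pass is needed, and it sets the first "extra" unknown to 1 and any others to 0. The function name `nontrivial_solution` and the use of exact `Fraction` arithmetic are assumptions made for this illustration.

```python
from fractions import Fraction


def nontrivial_solution(rows):
    """Return a nonzero solution of the homogeneous system whose
    coefficient rows are given (n equations, k > n unknowns)."""
    a = [[Fraction(c) for c in row] for row in rows]
    n, k = len(a), len(a[0])
    if k <= n:
        raise ValueError("need more unknowns than equations")
    pivots = []  # pivots[r] = column of the leading 1 in row r
    r = 0
    for col in range(k):
        # find an equation (at or below row r) that contains this unknown
        p = next((i for i in range(r, n) if a[i][col] != 0), None)
        if p is None:
            continue
        a[r], a[p] = a[p], a[r]
        lead = a[r][col]
        a[r] = [c / lead for c in a[r]]
        # eliminate the unknown from every other equation
        for i in range(n):
            if i != r and a[i][col] != 0:
                m = a[i][col]
                a[i] = [ci - m * cr for ci, cr in zip(a[i], a[r])]
        pivots.append(col)
        r += 1
        if r == n:
            break
    # since k > n, at least one unknown was never used as a pivot
    free = next(c for c in range(k) if c not in pivots)
    x = [Fraction(0)] * k
    x[free] = Fraction(1)
    for row, col in zip(a, pivots):
        x[col] = -row[free]
    return x


# The system (2) of Example 4.
system = [[1, 2, -1, 2],
          [1, -1, 2, 1],
          [2, 1, -1, -1]]
solution = nontrivial_solution(system)
print([str(c) for c in solution])  # ['2/3', '-7/3', '-2', '1']
```

Exact rational arithmetic is used so that the test for a zero coefficient is reliable; with floating-point numbers that test would need a tolerance.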
We have seen that the linearly independent vectors $\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n$ generate $\mathcal{R}^n$. A set of linearly independent vectors that generates the vector space $V$ is called a basis for $V$. Since $\mathbf{x} = (x_1, x_2, \ldots, x_n) = x_1\mathbf{e}_1 + x_2\mathbf{e}_2 + \cdots + x_n\mathbf{e}_n$, it is clear that the basis vectors $\mathbf{e}_1, \ldots, \mathbf{e}_n$ generate $\mathcal{R}^n$ uniquely, that is, if $\mathbf{x} = y_1\mathbf{e}_1 + y_2\mathbf{e}_2 + \cdots + y_n\mathbf{e}_n$ also, then $x_i = y_i$ for each $i$. Thus each vector in $\mathcal{R}^n$ can be expressed in one and only one way as a linear combination of $\mathbf{e}_1, \ldots, \mathbf{e}_n$. Any set of $n$ linearly independent vectors in an n-dimensional vector space has this property.

Theorem 2.4 If the vectors $\mathbf{v}_1, \ldots, \mathbf{v}_n$ in the n-dimensional vector space $V$ are linearly independent, then they constitute a basis for $V$, and furthermore generate $V$ uniquely.

PROOF Given $\mathbf{v} \in V$, the vectors $\mathbf{v}, \mathbf{v}_1, \ldots, \mathbf{v}_n$ are linearly dependent, so by Proposition 2.1 there exist numbers $x, x_1, \ldots, x_n$, not all zero, such that

$$x\mathbf{v} + x_1\mathbf{v}_1 + \cdots + x_n\mathbf{v}_n = \mathbf{0}.$$

If $x = 0$, then the fact that $\mathbf{v}_1, \ldots, \mathbf{v}_n$ are linearly independent implies that $x_1 = \cdots = x_n = 0$. Therefore $x \neq 0$, so we solve for $\mathbf{v}$:

$$\mathbf{v} = -\frac{x_1}{x}\mathbf{v}_1 - \cdots - \frac{x_n}{x}\mathbf{v}_n.$$

Thus the vectors $\mathbf{v}_1, \ldots, \mathbf{v}_n$ generate $V$, and therefore constitute a basis for $V$. To show that they generate $V$ uniquely, suppose that

$$a_1\mathbf{v}_1 + \cdots + a_n\mathbf{v}_n = a_1'\mathbf{v}_1 + \cdots + a_n'\mathbf{v}_n.$$

Then

$$(a_1 - a_1')\mathbf{v}_1 + \cdots + (a_n - a_n')\mathbf{v}_n = \mathbf{0}.$$

So, since $\mathbf{v}_1, \ldots, \mathbf{v}_n$ are linearly independent, it follows that $a_i - a_i' = 0$, or $a_i = a_i'$, for each $i$. ∎

There remains the possibility that $\mathcal{R}^n$ has a basis which contains fewer than $n$ elements. But the following theorem shows that this cannot happen.

Theorem 2.5 If $\dim V = n$, then each basis for $V$ consists of exactly $n$ vectors.

PROOF Let $\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_n$ be $n$ linearly independent vectors in $V$. If there were a basis $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_m$ for $V$ with $m < n$, then there would exist numbers $\{a_{ij}\}$ such that

$$\begin{aligned}
\mathbf{w}_1 &= a_{11}\mathbf{v}_1 + \cdots + a_{m1}\mathbf{v}_m,\\
&\;\;\vdots\\
\mathbf{w}_n &= a_{1n}\mathbf{v}_1 + \cdots + a_{mn}\mathbf{v}_m.
\end{aligned}$$
Since $m < n$, Theorem 2.2 supplies numbers $x_1, \ldots, x_n$, not all zero, such that

$$\begin{aligned}
a_{11}x_1 + \cdots + a_{1n}x_n &= 0,\\
&\;\;\vdots\\
a_{m1}x_1 + \cdots + a_{mn}x_n &= 0.
\end{aligned}$$

But this implies that

$$x_1\mathbf{w}_1 + \cdots + x_n\mathbf{w}_n = \sum_{j=1}^{n} x_j(a_{1j}\mathbf{v}_1 + \cdots + a_{mj}\mathbf{v}_m) = \sum_{i=1}^{m} (a_{i1}x_1 + \cdots + a_{in}x_n)\mathbf{v}_i = \mathbf{0},$$

which contradicts the fact that $\mathbf{w}_1, \ldots, \mathbf{w}_n$ are linearly independent. Consequently no basis for $V$ can have $m < n$ elements. ∎

We can now completely describe the general situation as regards subspaces of $\mathcal{R}^n$. If $V$ is a subspace of $\mathcal{R}^n$, then $k = \dim V \le n$ by Corollary 2.3, and if $k = n$, then $V = \mathcal{R}^n$ by Theorem 2.4. If $k > 0$, then any $k$ linearly independent vectors in $V$ generate $V$, and no basis for $V$ contains fewer than $k$ vectors (Theorem 2.5).
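As a computational companion to these results, the dimension of the subspace generated by a given list of vectors can be found by elimination. The sketch below relies on a standard fact that is not proved in the passage above: subtracting multiples of one vector from another does not change the span, and the nonzero vectors left after elimination are linearly independent. The names `span_dimension` and `is_basis`, and the sample vectors, are hypothetical choices for illustration.

```python
from fractions import Fraction


def span_dimension(vectors):
    """Dimension of the subspace generated by the given vectors,
    computed as the number of pivots found by elimination."""
    rows = [[Fraction(c) for c in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    rank = 0
    for col in range(ncols):
        # look for a not-yet-used row with a nonzero entry in this column
        p = next((i for i in range(rank, len(rows)) if rows[i][col] != 0), None)
        if p is None:
            continue
        rows[rank], rows[p] = rows[p], rows[rank]
        for i in range(rank + 1, len(rows)):
            m = rows[i][col] / rows[rank][col]
            rows[i] = [ci - m * cr for ci, cr in zip(rows[i], rows[rank])]
        rank += 1
    return rank


def is_basis(vectors, n):
    """True if the vectors form a basis for n-dimensional Euclidean space."""
    return len(vectors) == n and span_dimension(vectors) == n


print(is_basis([(1, 0, 0), (0, 1, 0), (0, 0, 1)], 3))  # True
print(span_dimension([(1, 2, 3), (4, 5, 6), (7, 8, 9)]))  # 2
```

In the second call the third vector equals twice the second minus the first, so the three vectors are linearly dependent and generate only a plane.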
Exercises

2.1 Why is it true that the vectors $\mathbf{v}_1, \ldots, \mathbf{v}_k$ are linearly dependent if any one of them is zero? If any subset of them is linearly dependent?

2.2 Which of the following sets of vectors are bases for the appropriate space $\mathcal{R}^n$?
(a) (1, 0) and (1, 1).
(b) (1, 0, 0), (1, 1, 0), and (0, 0, 1).
(c) (1, 1, 1), (1, 1, 0), and (1, 0, 0).
(d) (1, 1, 1, 0), (1, 0, 0, 0), (0, 1, 0, 0), and (0, 0, 1, 0).
(e) (1, 1, 1, 1), (1, 1, 1, 0), (1, 1, 0, 0), and (1, 0, 0, 0).

2.3 Find the dimension of the subspace $V$ of $\mathcal{R}^4$ that is generated by the vectors (0, 1, 0, 1), (1, 0, 1, 0), and (1, 1, 1, 1).

2.4 Show that the vectors (1, 0, 0, 1), (0, 1, 0, 1), (0, 0, 1, 1) form a basis for the subspace $V$ of $\mathcal{R}^4$ which is defined by the equation $x_1 + x_2 + x_3 - x_4 = 0$.

2.5 Show that any set $\mathbf{v}_1, \ldots, \mathbf{v}_k$ of linearly independent vectors in a vector space $V$ can be extended to a basis for $V$. That is, if $k < n = \dim V$, then there exist vectors $\mathbf{v}_{k+1}, \ldots, \mathbf{v}_n$ in $V$ such that $\mathbf{v}_1, \ldots, \mathbf{v}_n$ is a basis for $V$.

2.6 Show that Theorem 2.5 is equivalent to the following theorem: Suppose that the equations

$$\begin{aligned}
a_{11}x_1 + \cdots + a_{1n}x_n &= 0,\\
&\;\;\vdots\\
a_{n1}x_1 + \cdots + a_{nn}x_n &= 0
\end{aligned}$$

have only the trivial solution $x_1 = \cdots = x_n = 0$. Then, for each $\mathbf{b} = (b_1, \ldots, b_n)$, the equations

$$\begin{aligned}
a_{11}x_1 + \cdots + a_{1n}x_n &= b_1,\\
&\;\;\vdots\\
a_{n1}x_1 + \cdots + a_{nn}x_n &= b_n
\end{aligned}$$

have a unique solution. Hint: Consider the vectors $\mathbf{a}_j = (a_{1j}, a_{2j}, \ldots, a_{nj})$, $j = 1, \ldots, n$.

2.7 Verify that any two collinear vectors, and any three coplanar vectors, are linearly dependent.
3 INNER PRODUCTS AND ORTHOGONALITY
In order to obtain the full geometric structure of $\mathcal{R}^n$ (including the concepts of distance, angles, and orthogonality), we must supply $\mathcal{R}^n$ with an inner product. An inner (scalar) product on the vector space $V$ is a function $V \times V \to \mathcal{R}$, which associates with each pair $(\mathbf{x}, \mathbf{y})$ of vectors in $V$ a real number $\langle \mathbf{x}, \mathbf{y} \rangle$, and satisfies the following three conditions:

SP1 $\langle \mathbf{x}, \mathbf{x} \rangle > 0$ if $\mathbf{x} \neq \mathbf{0}$ (positivity).
SP2 $\langle \mathbf{x}, \mathbf{y} \rangle = \langle \mathbf{y}, \mathbf{x} \rangle$ (symmetry).
SP3 $\langle a\mathbf{x} + b\mathbf{y}, \mathbf{z} \rangle = a\langle \mathbf{x}, \mathbf{z} \rangle + b\langle \mathbf{y}, \mathbf{z} \rangle$.

The third of these conditions is linearity in the first variable; symmetry then gives linearity in the second variable also. Thus an inner product on $V$ is simply a positive, symmetric, bilinear function on $V \times V$. Note that SP3 implies that $\langle \mathbf{0}, \mathbf{0} \rangle = 0$ (see Exercise 3.1).

The usual inner product on $\mathcal{R}^n$ is denoted by $\mathbf{x} \cdot \mathbf{y}$ and is defined by

$$\mathbf{x} \cdot \mathbf{y} = x_1y_1 + \cdots + x_ny_n, \tag{1}$$

where $\mathbf{x} = (x_1, \ldots, x_n)$, $\mathbf{y} = (y_1, \ldots, y_n)$. It should be clear that this definition satisfies conditions SP1, SP2, SP3 above. There are many inner products on $\mathcal{R}^n$ (see Example 2 below), but we shall use only the usual one.

Example 1 Denote by $\mathcal{C}[a, b]$ the vector space of all continuous functions on the interval $[a, b]$, and define

$$\langle f, g \rangle = \int_a^b f(t)g(t)\,dt$$

for any pair of functions $f, g \in \mathcal{C}[a, b]$. It is obvious that this definition satisfies conditions SP2 and SP3. It also satisfies SP1, because if $f(t_0) \neq 0$, then by continuity $(f(t))^2 > 0$ for all $t$ in some neighborhood of $t_0$, so

$$\langle f, f \rangle = \int_a^b f(t)^2\,dt > 0.$$

Therefore we have an inner product on $\mathcal{C}[a, b]$.
Example 2 Let $a, b, c$ be real numbers with $a > 0$, $ac - b^2 > 0$, so that the quadratic form $q(\mathbf{x}) = ax_1^2 + 2bx_1x_2 + cx_2^2$ is positive-definite (see Section II.4). Then $\langle \mathbf{x}, \mathbf{y} \rangle = ax_1y_1 + bx_1y_2 + bx_2y_1 + cx_2y_2$ defines an inner product on $\mathcal{R}^2$ (why?). With $a = c = 1$, $b = 0$ we obtain the usual inner product on $\mathcal{R}^2$.

An inner product on the vector space $V$ yields a notion of the length or "size" of a vector $\mathbf{x} \in V$, called its norm $|\mathbf{x}|$. In general, a norm on the vector space $V$ is a real-valued function $\mathbf{x} \to |\mathbf{x}|$ on $V$ satisfying the following conditions:

N1 $|\mathbf{x}| > 0$ if $\mathbf{x} \neq \mathbf{0}$ (positivity),
N2 $|a\mathbf{x}| = |a|\,|\mathbf{x}|$ (homogeneity),
N3 $|\mathbf{x} + \mathbf{y}| \le |\mathbf{x}| + |\mathbf{y}|$ (triangle inequality),

for all $\mathbf{x}, \mathbf{y} \in V$ and $a \in \mathcal{R}$. Note that N2 implies that $|\mathbf{0}| = 0$. The norm associated with the inner product $\langle\ ,\ \rangle$ on $V$ is defined by

$$|\mathbf{x}| = \sqrt{\langle \mathbf{x}, \mathbf{x} \rangle}. \tag{2}$$

It is clear that SP1-SP3 and this definition imply conditions N1 and N2, but the triangle inequality is not so obvious; it will be verified below.

The most commonly used norm on $\mathcal{R}^n$ is the Euclidean norm

$$|\mathbf{x}| = (x_1^2 + \cdots + x_n^2)^{1/2},$$

which comes in the above way from the usual inner product on $\mathcal{R}^n$. Other norms on $\mathcal{R}^n$, not necessarily associated with inner products, are occasionally employed, but henceforth $|\mathbf{x}|$ will denote the Euclidean norm unless otherwise specified.

Example 3 $\|\mathbf{x}\| = \max\{|x_1|, \ldots, |x_n|\}$, the maximum of the absolute values of the coordinates of $\mathbf{x}$, defines a norm on $\mathcal{R}^n$ (see Exercise 3.2).

Example 4 $\|\mathbf{x}\|_1 = |x_1| + |x_2| + \cdots + |x_n|$ defines still another norm on $\mathcal{R}^n$ (again see Exercise 3.2).

A norm on $V$ provides a definition of the distance $d(\mathbf{x}, \mathbf{y})$ between any two points $\mathbf{x}$ and $\mathbf{y}$ of $V$:

$$d(\mathbf{x}, \mathbf{y}) = |\mathbf{x} - \mathbf{y}|.$$

Note that a distance function $d$ defined in this way satisfies the following three conditions:

D1 $d(\mathbf{x}, \mathbf{y}) > 0$ unless $\mathbf{x} = \mathbf{y}$ (positivity),
D2 $d(\mathbf{x}, \mathbf{y}) = d(\mathbf{y}, \mathbf{x})$ (symmetry),
D3 $d(\mathbf{x}, \mathbf{z}) \le d(\mathbf{x}, \mathbf{y}) + d(\mathbf{y}, \mathbf{z})$ (triangle inequality),
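for any three points $\mathbf{x}, \mathbf{y}, \mathbf{z}$. Conditions D1 and D2 follow immediately from N1 and N2, respectively, while

$$d(\mathbf{x}, \mathbf{z}) = |\mathbf{x} - \mathbf{z}| = |(\mathbf{x} - \mathbf{y}) + (\mathbf{y} - \mathbf{z})| \le |\mathbf{x} - \mathbf{y}| + |\mathbf{y} - \mathbf{z}| = d(\mathbf{x}, \mathbf{y}) + d(\mathbf{y}, \mathbf{z})$$

by N3.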
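The three norms mentioned above, and the distance functions they induce, can be compared numerically. The following is a small sketch with illustrative function names (`euclidean_norm`, `max_norm`, `sum_norm`, `distance`) that are not taken from the text; it evaluates each norm on one vector and spot-checks the triangle inequality D3 for one triple of points. A single numerical check is only an illustration, not a proof.

```python
import math


def euclidean_norm(x):
    """The Euclidean norm (x_1^2 + ... + x_n^2)^(1/2)."""
    return math.sqrt(sum(xi * xi for xi in x))


def max_norm(x):
    """The norm of Example 3: the largest absolute coordinate."""
    return max(abs(xi) for xi in x)


def sum_norm(x):
    """The norm of Example 4: the sum of the absolute coordinates."""
    return sum(abs(xi) for xi in x)


def distance(x, y, norm=euclidean_norm):
    """Distance d(x, y) = |x - y| induced by the given norm."""
    return norm(tuple(xi - yi for xi, yi in zip(x, y)))


x, y, z = (3, 4), (0, 0), (1, -1)
print(euclidean_norm(x))  # 5.0
print(max_norm(x))        # 4
print(sum_norm(x))        # 7

# D3 for this particular triple of points, under each norm
for norm in (euclidean_norm, max_norm, sum_norm):
    assert distance(x, z, norm) <= distance(x, y, norm) + distance(y, z, norm)
```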
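[...] $= 0$, while $\mathbf{w}_2 \neq \mathbf{0}$ because $\mathbf{v}$ and $\mathbf{w}_1$ are linearly independent.

Theorem 3.3 If $V$ is a finite-dimensional vector space with an inner product, then $V$ has an orthogonal basis.

In particular, every subspace of $\mathcal{R}^n$ has an orthogonal basis.

PROOF We start with an arbitrary basis $\mathbf{v}_1, \ldots, \mathbf{v}_n$ for $V$. Let $\mathbf{w}_1 = \mathbf{v}_1$. Then, by the preceding construction, the nonzero vector

$$\mathbf{w}_2 = \mathbf{v}_2 - \frac{\langle \mathbf{v}_2, \mathbf{w}_1 \rangle}{\langle \mathbf{w}_1, \mathbf{w}_1 \rangle}\mathbf{w}_1$$

is orthogonal to $\mathbf{w}_1$ and lies in the subspace generated by $\mathbf{v}_1$ and $\mathbf{v}_2$.

Suppose inductively that we have found an orthogonal basis $\mathbf{w}_1, \ldots, \mathbf{w}_k$ for the subspace of $V$ that is generated by $\mathbf{v}_1, \ldots, \mathbf{v}_k$. The idea is then to obtain $\mathbf{w}_{k+1}$ by subtracting from $\mathbf{v}_{k+1}$ its components parallel to each of the vectors $\mathbf{w}_1, \ldots, \mathbf{w}_k$. That is, define

$$\mathbf{w}_{k+1} = \mathbf{v}_{k+1} - c_1\mathbf{w}_1 - c_2\mathbf{w}_2 - \cdots - c_k\mathbf{w}_k,$$

where $c_i = \langle \mathbf{v}_{k+1}, \mathbf{w}_i \rangle / \langle \mathbf{w}_i, \mathbf{w}_i \rangle$. Then $\langle \mathbf{w}_{k+1}, \mathbf{w}_i \rangle = \langle \mathbf{v}_{k+1}, \mathbf{w}_i \rangle - c_i\langle \mathbf{w}_i, \mathbf{w}_i \rangle = 0$ for $i \le k$, and $\mathbf{w}_{k+1} \neq \mathbf{0}$, because otherwise $\mathbf{v}_{k+1}$ would be a linear combination of the vectors $\mathbf{w}_1, \ldots, \mathbf{w}_k$, and therefore of the vectors $\mathbf{v}_1, \ldots, \mathbf{v}_k$. It follows that the vectors $\mathbf{w}_1, \ldots, \mathbf{w}_{k+1}$ form an orthogonal basis for the subspace of $V$ that is generated by $\mathbf{v}_1, \ldots, \mathbf{v}_{k+1}$. After a finite number of such steps we obtain the desired orthogonal basis $\mathbf{w}_1, \ldots, \mathbf{w}_n$ for $V$. ∎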
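It is the method of proof of Theorem 3.3 that is known as the Gram-Schmidt orthogonalization process, summarized by the equations

$$\begin{aligned}
\mathbf{w}_1 &= \mathbf{v}_1,\\
\mathbf{w}_2 &= \mathbf{v}_2 - \frac{\langle \mathbf{v}_2, \mathbf{w}_1 \rangle}{\langle \mathbf{w}_1, \mathbf{w}_1 \rangle}\mathbf{w}_1,\\
\mathbf{w}_3 &= \mathbf{v}_3 - \frac{\langle \mathbf{v}_3, \mathbf{w}_1 \rangle}{\langle \mathbf{w}_1, \mathbf{w}_1 \rangle}\mathbf{w}_1 - \frac{\langle \mathbf{v}_3, \mathbf{w}_2 \rangle}{\langle \mathbf{w}_2, \mathbf{w}_2 \rangle}\mathbf{w}_2,\\
&\;\;\vdots\\
\mathbf{w}_n &= \mathbf{v}_n - \frac{\langle \mathbf{v}_n, \mathbf{w}_1 \rangle}{\langle \mathbf{w}_1, \mathbf{w}_1 \rangle}\mathbf{w}_1 - \cdots - \frac{\langle \mathbf{v}_n, \mathbf{w}_{n-1} \rangle}{\langle \mathbf{w}_{n-1}, \mathbf{w}_{n-1} \rangle}\mathbf{w}_{n-1},
\end{aligned}$$

defining the orthogonal basis $\mathbf{w}_1, \ldots, \mathbf{w}_n$ in terms of the original basis $\mathbf{v}_1, \ldots, \mathbf{v}_n$.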
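Example 7 To find an orthogonal basis for the subspace $V$ of $\mathcal{R}^4$ spanned by the vectors $\mathbf{v}_1 = (1, 1, 0, 0)$, $\mathbf{v}_2 = (1, 0, 1, 0)$, $\mathbf{v}_3 = (0, 1, 0, 1)$, we write

$$\begin{aligned}
\mathbf{w}_1 &= \mathbf{v}_1 = (1, 1, 0, 0),\\
\mathbf{w}_2 &= \mathbf{v}_2 - \frac{\mathbf{v}_2 \cdot \mathbf{w}_1}{\mathbf{w}_1 \cdot \mathbf{w}_1}\mathbf{w}_1 = (1, 0, 1, 0) - \tfrac{1}{2}(1, 1, 0, 0) = (\tfrac{1}{2}, -\tfrac{1}{2}, 1, 0),\\
\mathbf{w}_3 &= \mathbf{v}_3 - \frac{\mathbf{v}_3 \cdot \mathbf{w}_1}{\mathbf{w}_1 \cdot \mathbf{w}_1}\mathbf{w}_1 - \frac{\mathbf{v}_3 \cdot \mathbf{w}_2}{\mathbf{w}_2 \cdot \mathbf{w}_2}\mathbf{w}_2\\
&= (0, 1, 0, 1) - \tfrac{1}{2}(1, 1, 0, 0) + \tfrac{1}{3}(\tfrac{1}{2}, -\tfrac{1}{2}, 1, 0) = (-\tfrac{1}{3}, \tfrac{1}{3}, \tfrac{1}{3}, 1).
\end{aligned}$$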
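The Gram-Schmidt equations translate directly into a short program. The sketch below assumes the usual inner product on Euclidean space and uses exact `Fraction` arithmetic so that the output can be compared with Example 7; the function names `dot` and `gram_schmidt` are illustrative, not notation from the text.

```python
from fractions import Fraction


def dot(x, y):
    """The usual inner product x . y."""
    return sum(xi * yi for xi, yi in zip(x, y))


def gram_schmidt(vectors):
    """Orthogonalize a list of linearly independent vectors by
    subtracting from each v its components parallel to the
    previously constructed w's."""
    ws = []
    for v in vectors:
        v = [Fraction(c) for c in v]
        w = list(v)
        for u in ws:
            c = dot(v, u) / dot(u, u)  # coefficient <v, u> / <u, u>
            w = [wi - c * ui for wi, ui in zip(w, u)]
        if all(wi == 0 for wi in w):
            raise ValueError("vectors are linearly dependent")
        ws.append(w)
    return ws


# The vectors of Example 7.
basis = gram_schmidt([(1, 1, 0, 0), (1, 0, 1, 0), (0, 1, 0, 1)])
for w in basis:
    print([str(c) for c in w])
# ['1', '1', '0', '0']
# ['1/2', '-1/2', '1', '0']
# ['-1/3', '1/3', '1/3', '1']
```

With floating-point input the zero test for a dependent vector would need a tolerance, which is one reason exact rationals are used in this sketch.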
Example 8 Let $\mathcal{P}$ denote the vector space of polynomials in $x$, with inner product defined by

$$\langle p, q \rangle = \int_{-1}^{1} p(x)q(x)\,dx.$$

By applying the Gram-Schmidt orthogonalization process to the linearly independent elements $1, x, x^2, \ldots, x^n, \ldots$ one obtains an infinite sequence $\{p_n(x)\}_{n=0}^{\infty}$, the first five elements of which are $p_0(x) = 1$, $p_1(x) = x$, $p_2(x) = x^2 - \tfrac{1}{3}$, $p_3(x) = x^3 - \tfrac{3}{5}x$, $p_4(x) = x^4 - \tfrac{6}{7}x^2 + \tfrac{3}{35}$ (see Exercise 3.12). Upon multiplying the polynomials $\{p_n(x)\}$ by appropriate constants, one obtains the famous Legendre polynomials $P_0(x) = p_0(x)$, $P_1(x) = p_1(x)$, $P_2(x) = \tfrac{3}{2}p_2(x)$, $P_3(x) = \tfrac{5}{2}p_3(x)$, $P_4(x) = \tfrac{35}{8}p_4(x)$, etc.

One reason for the importance of orthogonal bases is the ease with which a
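vector $\mathbf{v} \in V$ can be expressed as a linear combination of orthogonal basis vectors $\mathbf{w}_1, \ldots, \mathbf{w}_n$ for $V$. Writing $\mathbf{v} = a_1\mathbf{w}_1 + \cdots + a_n\mathbf{w}_n$, and taking the inner product with $\mathbf{w}_i$, we immediately obtain
$$a_i = \frac{\mathbf{v} \cdot \mathbf{w}_i}{\mathbf{w}_i \cdot \mathbf{w}_i},$$
so
$$\mathbf{v} = \sum_{i=1}^{n} \frac{\mathbf{v} \cdot \mathbf{w}_i}{\mathbf{w}_i \cdot \mathbf{w}_i}\,\mathbf{w}_i. \tag{4}$$
This is especially simple if $\mathbf{w}_1, \ldots, \mathbf{w}_n$ is an orthonormal basis for $V$:
$$\mathbf{v} = (\mathbf{v} \cdot \mathbf{w}_1)\mathbf{w}_1 + (\mathbf{v} \cdot \mathbf{w}_2)\mathbf{w}_2 + \cdots + (\mathbf{v} \cdot \mathbf{w}_n)\mathbf{w}_n. \tag{5}$$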
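As a small illustration of the coefficient formula $a_i = (\mathbf{v} \cdot \mathbf{w}_i)/(\mathbf{w}_i \cdot \mathbf{w}_i)$, the following Python sketch expands a vector in an orthogonal basis and rebuilds it from the coefficients. The basis `ws`, the vector `v`, and the function names are hypothetical choices made for this example only.

```python
from fractions import Fraction


def dot(u, v):
    """Usual inner product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))


def coefficients(v, orthogonal_basis):
    """Coefficients a_i = (v . w_i) / (w_i . w_i) of v in an orthogonal basis."""
    return [Fraction(dot(v, w), dot(w, w)) for w in orthogonal_basis]


# A hypothetical orthogonal (not orthonormal) basis of R^3 with integer entries.
ws = [(1, 1, 0), (1, -1, 0), (0, 0, 2)]
v = (3, 1, 4)

a = coefficients(v, ws)
# Rebuild v as a_1 w_1 + a_2 w_2 + a_3 w_3, coordinate by coordinate.
rebuilt = [sum(ai * w[k] for ai, w in zip(a, ws)) for k in range(len(v))]
print(a, rebuilt)
```

No linear system has to be solved: each coefficient is obtained from a single pair of inner products, which is the point of working with an orthogonal basis.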
Of course orthonormal basis vectors are easily obtained from orthogonal ones, simply by dividing by their lengths. In this case the coefficient $\mathbf{v} \cdot \mathbf{w}_i$ of $\mathbf{w}_i$ in (5) is sometimes called the Fourier coefficient of $\mathbf{v}$ with respect to $\mathbf{w}_i$. This terminology is motivated by an analogy with Fourier series. The orthonormal functions in [...]
is defined by $L(\mathbf{x}) = \mathbf{a} \cdot \mathbf{x}$, then Ker $L$ is the $(n-1)$-dimensional subspace of $\mathcal{R}^n$ that is orthogonal to the vector $\mathbf{a}$, and Im $L = \mathcal{R}$.
Example 2 If $P : \mathcal{R}^3 \to \mathcal{R}^2$ is the projection $P(x_1, x_2, x_3) = (x_1, x_2)$, then Ker $P$ is the $x_3$-axis and Im $P = \mathcal{R}^2$.
The assumption that the kernel of $L : V \to W$ is the zero vector alone, Ker $L = 0$, has the important consequence that $L$ is one-to-one, meaning that $L(\mathbf{v}_1) = L(\mathbf{v}_2)$ implies that $\mathbf{v}_1 = \mathbf{v}_2$ (that is, $L$ is one-to-one if no two vectors of $V$ have the same image under $L$).
Theorem 5.1 Let $L : V \to W$ be linear, with $V$ being $n$-dimensional. If Ker $L = 0$, then $L$ is one-to-one, and Im $L$ is an $n$-dimensional subspace of $W$.
PROOF To show that $L$ is one-to-one, suppose $L(\mathbf{v}_1) = L(\mathbf{v}_2)$. Then $L(\mathbf{v}_1 - \mathbf{v}_2) = 0$, so $\mathbf{v}_1 - \mathbf{v}_2 = 0$ since Ker $L = 0$. To show that the subspace Im $L$ is $n$-dimensional, start with a basis $\mathbf{v}_1, \ldots, \mathbf{v}_n$ for $V$. Since it is clear (by linearity of $L$) that the vectors $L(\mathbf{v}_1), \ldots, L(\mathbf{v}_n)$ generate Im $L$, it suffices to prove that they are linearly independent. Suppose
$$t_1 L(\mathbf{v}_1) + \cdots + t_n L(\mathbf{v}_n) = 0.$$
Then
$$L(t_1\mathbf{v}_1 + \cdots + t_n\mathbf{v}_n) = 0,$$
so $t_1\mathbf{v}_1 + \cdots + t_n\mathbf{v}_n = 0$ because Ker $L = 0$. But then $t_1 = \cdots = t_n = 0$ because the vectors $\mathbf{v}_1, \ldots, \mathbf{v}_n$ are linearly independent. ∎
An important special case of Theorem 5.1 is that in which $W$ is also $n$-dimensional; it then follows that Im $L = W$ (see Exercise 5.3).
Theorem 5.2 Let $L : \mathcal{R}^n \to \mathcal{R}^m$ be defined by $L(\mathbf{x}) = A\mathbf{x}$, where $A = (a_{ij})$ is an $m \times n$ matrix. Then
(a) Ker $L$ is the orthogonal complement of that subspace of $\mathcal{R}^n$ that is generated by the row vectors $A_1, \ldots, A_m$ of $A$, and
(b) Im $L$ is the subspace of $\mathcal{R}^m$ that is generated by the column vectors $A^1, \ldots, A^n$ of $A$.
PROOF (a) follows immediately from the fact that $L$ is described by the scalar equations
$$L_1(\mathbf{x}) = A_1 \cdot \mathbf{x},\quad L_2(\mathbf{x}) = A_2 \cdot \mathbf{x},\quad \ldots,\quad L_m(\mathbf{x}) = A_m \cdot \mathbf{x},$$
so that the $i$th coordinate $L_i(\mathbf{x})$ is zero if and only if $\mathbf{x}$ is orthogonal to the row vector $A_i$. (b) follows immediately from the fact that Im $L$ is generated by the images $L(\mathbf{e}_1), \ldots, L(\mathbf{e}_n)$ of the standard basis vectors in $\mathcal{R}^n$, whereas $L(\mathbf{e}_i) = A^i$, $i = 1, \ldots, n$, by the definition of matrix multiplication. ∎
Example 3 Suppose that the matrix of $L : \mathcal{R}^3 \to \mathcal{R}^3$ is
$$A = \begin{pmatrix} 2 & -1 & -2 \\ 1 & 2 & 1 \\ 3 & 1 & -1 \end{pmatrix}.$$
Then $A_3 = A_1 + A_2$, but $A_1$ and $A_2$ are not collinear, so it follows from 5.2(a) that Ker $L$ is 1-dimensional, since it is the orthogonal complement of the 2-dimensional subspace of $\mathcal{R}^3$ that is spanned by $A_1$ and $A_2$. Since the column vectors of $A$ are linearly dependent, $3A^1 = 4A^2 - 5A^3$, but not collinear, it follows from 5.2(b) that Im $L$ is 2-dimensional.
Note that, in this example, dim Ker $L$ + dim Im $L$ = 3. This is an illustration of the following theorem.
Theorem 5.3 If $L : V \to W$ is a linear mapping of vector spaces, with dim $V = n$, then
$$\dim \operatorname{Ker} L + \dim \operatorname{Im} L = n.$$
PROOF Let $\mathbf{w}_1, \ldots, \mathbf{w}_p$ be a basis for Im $L$, and choose vectors $\mathbf{v}_1, \ldots, \mathbf{v}_p \in V$ such that $L(\mathbf{v}_i) = \mathbf{w}_i$ for $i = 1, \ldots, p$. Also let $\mathbf{u}_1, \ldots, \mathbf{u}_q$ be a basis for Ker $L$. It will then suffice to prove that the vectors $\mathbf{v}_1, \ldots, \mathbf{v}_p, \mathbf{u}_1, \ldots, \mathbf{u}_q$ constitute a basis for $V$. To show that these vectors generate $V$, consider $\mathbf{v} \in V$. Then there exist numbers $a_1, \ldots, a_p$ such that
$$L(\mathbf{v}) = a_1\mathbf{w}_1 + \cdots + a_p\mathbf{w}_p,$$
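because $\mathbf{w}_1, \ldots, \mathbf{w}_p$ is a basis for Im $L$. Since $\mathbf{w}_i = L(\mathbf{v}_i)$ for each $i$, we have by linearity
$$L(\mathbf{v}) = L(a_1\mathbf{v}_1 + \cdots + a_p\mathbf{v}_p),$$
or
$$L(\mathbf{v} - a_1\mathbf{v}_1 - \cdots - a_p\mathbf{v}_p) = 0,$$
so $\mathbf{v} - a_1\mathbf{v}_1 - \cdots - a_p\mathbf{v}_p \in$ Ker $L$. Hence there exist numbers $b_1, \ldots, b_q$ such that
$$\mathbf{v} - a_1\mathbf{v}_1 - \cdots - a_p\mathbf{v}_p = b_1\mathbf{u}_1 + \cdots + b_q\mathbf{u}_q,$$
or
$$\mathbf{v} = a_1\mathbf{v}_1 + \cdots + a_p\mathbf{v}_p + b_1\mathbf{u}_1 + \cdots + b_q\mathbf{u}_q,$$
as desired. To show that the vectors $\mathbf{v}_1, \ldots, \mathbf{v}_p, \mathbf{u}_1, \ldots, \mathbf{u}_q$ are linearly independent, suppose that
$$s_1\mathbf{v}_1 + \cdots + s_p\mathbf{v}_p + t_1\mathbf{u}_1 + \cdots + t_q\mathbf{u}_q = 0.$$
Then
$$s_1\mathbf{w}_1 + \cdots + s_p\mathbf{w}_p = 0$$
because $L(\mathbf{v}_i) = \mathbf{w}_i$ and $L(\mathbf{u}_j) = 0$. Since $\mathbf{w}_1, \ldots, \mathbf{w}_p$ are linearly independent, it follows that $s_1 = \cdots = s_p = 0$. But then $t_1\mathbf{u}_1 + \cdots + t_q\mathbf{u}_q = 0$ implies that $t_1 = \cdots = t_q = 0$ also, because the vectors $\mathbf{u}_1, \ldots, \mathbf{u}_q$ are linearly independent. By Proposition 2.1 this concludes the proof. ∎
We give an application of Theorem 5.3 to the theory of linear equations. Consider the system
$$a_{11}x_1 + \cdots + a_{1n}x_n = 0,$$
[...]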
is defined by $L(\mathbf{x}) = A\mathbf{x}$ (see Theorem 5.2). Now the row rank of the $m \times n$ matrix $A$ is by definition the dimension of the subspace of $\mathcal{R}^n$ generated by the row vectors of $A$, while the column rank of $A$ is the dimension of the subspace of $\mathcal{R}^m$ generated by the column vectors of $A$.
Theorem 5.4 The row rank of the $m \times n$ matrix $A = (a_{ij})$ and the column rank of $A$ are equal to the same number $r$. Furthermore dim $S = n - r$, where $S$ is the space of solutions of the system (1) above.
PROOF We have observed that $S$ is the orthogonal complement to the subspace of $\mathcal{R}^n$ generated by the row vectors of $A$, so
$$(\text{row rank of } A) + \dim S = n \tag{2}$$
by Theorem 3.4. Since $S$ = Ker $L$, and by Theorem 5.2, Im $L$ is the subspace of $\mathcal{R}^m$ generated by the column vectors of $A$, we have
$$(\text{column rank of } A) + \dim S = n \tag{3}$$
by Theorem 5.3. But Eqs. (2) and (3) immediately give the desired results. ∎
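Theorems 5.3 and 5.4 can be spot-checked numerically. The Python sketch below is an illustration under stated assumptions, not part of the text: it computes ranks by exact Gaussian elimination (the helper `rank` is written for this example), and it uses the $3 \times 3$ matrix with rows (2, -1, -2), (1, 2, 1), (3, 1, -1), for which the third row is the sum of the first two. The kernel vector (3, -4, 5) is an assumption verified by direct multiplication.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix, given as a list of rows, by exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    ncols = len(m[0]) if m else 0
    r = 0  # number of pivots found so far
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


A = [[2, -1, -2],
     [1, 2, 1],
     [3, 1, -1]]
A_transpose = [list(col) for col in zip(*A)]

row_rank = rank(A)             # dimension of the span of the rows
col_rank = rank(A_transpose)   # dimension of the span of the columns
n = len(A[0])

# A nonzero solution of A x = 0 (assumed, then checked by multiplication).
x = (3, -4, 5)
Ax = [sum(a * b for a, b in zip(row, x)) for row in A]
print(row_rank, col_rank, n - row_rank, Ax)
```

Here `n - row_rank` is the dimension of the solution space predicted by Theorem 5.4, and the check on `Ax` exhibits one nonzero vector of that space.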
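Recall that if $U$ and $V$ are subspaces of $\mathcal{R}^n$, then
$$U \cap V = \{\mathbf{x} \in \mathcal{R}^n : \text{both } \mathbf{x} \in U \text{ and } \mathbf{x} \in V\}$$
and
$$U + V = \{\mathbf{x} \in \mathcal{R}^n : \mathbf{x} = \mathbf{u} + \mathbf{v} \text{ with } \mathbf{u} \in U \text{ and } \mathbf{v} \in V\}$$
are both subspaces of $\mathcal{R}^n$ (Exercises 1.2 and 1.3). Let
$$U \times V = \{(\mathbf{x}, \mathbf{y}) \in \mathcal{R}^{2n} : \mathbf{x} \in U \text{ and } \mathbf{y} \in V\}.$$
Then $U \times V$ is a subspace of $\mathcal{R}^{2n}$ with $\dim(U \times V) = \dim U + \dim V$ (Exercise 5.4).
Theorem 5.5 If $U$ and $V$ are subspaces of $\mathcal{R}^n$, then
$$\dim(U + V) + \dim(U \cap V) = \dim U + \dim V. \tag{4}$$
In particular, if $U + V = \mathcal{R}^n$, then
$$\dim U + \dim V - \dim(U \cap V) = n.$$
PROOF Let $L : U \times V \to \mathcal{R}^n$ be the linear mapping defined by
$$L(\mathbf{u}, \mathbf{v}) = \mathbf{u} - \mathbf{v}.$$
Then Im $L = U + V$ and Ker $L = \{(\mathbf{x}, \mathbf{x}) \in \mathcal{R}^{2n} : \mathbf{x} \in U \cap V\}$, so dim Im $L = \dim(U + V)$ and dim Ker $L = \dim(U \cap V)$. Since $\dim U \times V = \dim U + \dim V$ by the preceding remark, Eq. (4) now follows immediately from Theorem 5.3. ∎
Theorem 5.5 is a generalization of the familiar fact that two planes in $\mathcal{R}^3$ "generally" intersect in a line ("generally" meaning that this is the case if the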
two planes together contain enough linearly independent vectors to span $\mathcal{R}^3$). Similarly a 3-dimensional subspace and a 4-dimensional subspace of $\mathcal{R}^7$ generally intersect in a point (the origin); two 7-dimensional subspaces of $\mathcal{R}^{10}$ generally intersect in a 4-dimensional subspace.
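The dimension count in Theorem 5.5 can be sketched in Python by following the proof: the mapping $L(\mathbf{u}, \mathbf{v}) = \mathbf{u} - \mathbf{v}$ is represented by a matrix whose columns are the basis vectors of $U$ and the negatives of the basis vectors of $V$, and the dimension of its kernel is read off as the number of columns minus the rank. Everything here is an illustrative assumption: the helper names, the requirement that each subspace be given by a linearly independent list of vectors, and the particular subspaces chosen.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix, given as a list of rows, by exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    ncols = len(m[0]) if m else 0
    r = 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def sum_and_intersection_dims(U, V):
    """Return (dim(U + V), dim(U ∩ V)) for bases U and V (lists of vectors).

    Assumes each list is linearly independent, so len(U) = dim U, len(V) = dim V.
    """
    dim_sum = rank(U + V)  # the rows together span U + V
    # Matrix of L(u, v) = u - v in coordinates: columns u_1..u_p, -v_1..-v_q.
    columns = [list(u) for u in U] + [[-x for x in v] for v in V]
    M = [list(row) for row in zip(*columns)]
    dim_kernel = len(columns) - rank(M)  # dim Ker L, which equals dim(U ∩ V)
    return dim_sum, dim_kernel


def unit(i, n):
    """Standard basis vector e_i of R^n (0-based index)."""
    return [1 if k == i else 0 for k in range(n)]


# Two planes in R^3 that together span R^3.
planes = sum_and_intersection_dims([[1, 0, 0], [0, 1, 0]], [[1, 1, 1], [0, 1, 1]])

# Two 7-dimensional coordinate subspaces of R^10 that together span R^10.
sevens = sum_and_intersection_dims([unit(i, 10) for i in range(7)],
                                   [unit(i, 10) for i in range(3, 10)])
print(planes, sevens)
```

Because the kernel dimension is obtained from the rank, this sketch mirrors the argument of the proof rather than giving an independent verification of Eq. (4).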
Exercises
5.1 If $L : V \to W$ is linear, show that Ker $L$ is a subspace of $V$.
5.2 If $L : V \to W$ is linear, show that Im $L$ is a subspace of $W$.
5.3 Suppose that $V$ and $W$ are $n$-dimensional vector spaces, and that $F : V \to W$ is linear, with Ker $F = 0$. Then $F$ is one-to-one by Theorem 5.1. Deduce that Im $F = W$, so that the inverse mapping $G = F^{-1} : W \to V$ is defined. Prove that $G$ is also linear.
5.4 If $U$ and $V$ are subspaces of $\mathcal{R}^n$, prove that $U \times V \subset \mathcal{R}^{2n}$ is a subspace of $\mathcal{R}^{2n}$, and that $\dim(U \times V) = \dim U + \dim V$. Hint: Consider bases for $U$ and $V$.
5.5 Let $V$ and $W$ be $n$-dimensional vector spaces. If $L : V \to W$ is a linear mapping with Im $L = W$, show that Ker $L = 0$.
5.6 Two vector spaces $V$ and $W$ are called isomorphic if and only if there exist linear mappings $S : V \to W$ and $T : W \to V$ such that $S \circ T$ and $T \circ S$ are the identity mappings of $W$ and $V$ respectively. Prove that two finite-dimensional vector spaces are isomorphic if and only if they have the same dimension.
5.7 Let $V$ be a finite-dimensional vector space with an inner product $\langle\ ,\ \rangle$. The dual space $V^*$ of $V$ is the vector space of all linear functions $V \to \mathcal{R}$. Prove that $V$ and $V^*$ are isomorphic. Hint: Let $\mathbf{v}_1, \ldots, \mathbf{v}_n$ be an orthonormal basis for $V$, and define $\theta_j \in V^*$ by $\theta_j(\mathbf{v}_i) = 0$ unless $i = j$, $\theta_j(\mathbf{v}_j) = 1$. Then prove that $\theta_1, \ldots, \theta_n$ constitute a basis for $V^*$.
6
DETERMINANTS
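It is clear by now that a method is needed for deciding whether a given $n$-tuple of vectors $\mathbf{a}_1, \ldots, \mathbf{a}_n$ in $\mathcal{R}^n$ are linearly independent (and therefore constitute a basis for $\mathcal{R}^n$). We discuss in this section the method of determinants. The determinant of an $n \times n$ matrix $A$ is a real number denoted by det $A$ or $|A|$.
The student is no doubt familiar with the definition of the determinant of a $2 \times 2$ or $3 \times 3$ matrix. If $A$ is $2 \times 2$, then
$$\det\begin{pmatrix} a & b \\ c & d \end{pmatrix} = ad - bc.$$
For $3 \times 3$ matrices we have expansions by rows and columns. For example, the formula for expansion by the first row is
$$\det\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix} = a_{11}\det\begin{pmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{pmatrix} - a_{12}\det\begin{pmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{pmatrix} + a_{13}\det\begin{pmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{pmatrix}.$$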
Formulas for expansions by rows or columns are greatly simplified by the following notation. If $A$ is an $n \times n$ matrix, let $A_{ij}$ denote the $(n-1) \times (n-1)$ submatrix obtained from $A$ by deletion of the $i$th row and the $j$th column of $A$. Then the above formula can be written
$$\det A = a_{11}\det A_{11} - a_{12}\det A_{12} + a_{13}\det A_{13}.$$
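Expansion along a row or a column, with the alternating signs $(-1)^{i+j}$ and the submatrices $A_{ij}$, can be sketched recursively in Python. This is only an illustration: the function names and the sample matrix are hypothetical, indices are 0-based (which leaves the parity of $i + j$ unchanged), and the recursion is far too slow for large matrices.

```python
def submatrix(a, i, j):
    """The submatrix A_ij: delete row i and column j (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(a) if k != i]


def det_by_row(a, i=0):
    """Determinant of the square matrix a, expanded along row i."""
    n = len(a)
    if n == 1:
        return a[0][0]
    return sum((-1) ** (i + j) * a[i][j] * det_by_row(submatrix(a, i, j))
               for j in range(n))


def det_by_column(a, j=0):
    """Determinant of the square matrix a, expanded along column j."""
    n = len(a)
    if n == 1:
        return a[0][0]
    return sum((-1) ** (i + j) * a[i][j] * det_by_row(submatrix(a, i, j))
               for i in range(n))


# A hypothetical 3 x 3 matrix, given as a list of rows.
A = [[2, 0, 1],
     [1, 3, -1],
     [0, 5, 4]]

by_rows = [det_by_row(A, i) for i in range(3)]
by_columns = [det_by_column(A, j) for j in range(3)]
print(by_rows, by_columns)
```

All six expansions should agree, which is the consistency that any definition of the determinant by expansion has to establish.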
The formula for expansion of the $n \times n$ matrix $A$ by the $i$th row is
$$\det A = \sum_{j=1}^{n} (-1)^{i+j} a_{ij} \det A_{ij}, \tag{1}$$
while the formula for expansion by the $j$th column is
$$\det A = \sum_{i=1}^{n} (-1)^{i+j} a_{ij} \det A_{ij}. \tag{2}$$
For example, with $n = 3$ and $j = 2$, (2) gives
$$\det A = -a_{12}\det A_{12} + a_{22}\det A_{22} - a_{32}\det A_{32}$$
as the expansion of a $3 \times 3$ matrix by its second column.
One approach to the problem of defining determinants of matrices is to define the determinant of an $n \times n$ matrix by means of formulas (1) and (2), assuming inductively that determinants of $(n-1) \times (n-1)$ matrices have been previously defined. Of course it must be verified that expansions along different rows and/or columns give the same result. Instead of carrying through this program, we shall state without proof the basic properties of determinants (I-IV below), and then proceed to derive from them the specific facts that will be needed in subsequent chapters. For a development of the theory of determinants, including proofs of these basic properties, the student may consult the chapter on determinants in any standard linear algebra textbook.
In the statement of Property I, we are thinking of a matrix $A$ as being a function of the column vectors of $A$, det $A = D(A^1, \ldots, A^n)$.
(I) There exists a unique (that is, one and only one) alternating, multilinear function $D$, from $n$-tuples of vectors in $\mathcal{R}^n$ to real numbers, such that $D(\mathbf{e}_1, \ldots, \mathbf{e}_n) = 1$.
The assertion that $D$ is multilinear means that it is linear in each variable separately. That is, for each $i = 1, \ldots, n$,
$$D(\mathbf{a}_1, \ldots, x\mathbf{a}_i + y\mathbf{b}_i, \ldots, \mathbf{a}_n) = xD(\mathbf{a}_1, \ldots, \mathbf{a}_i, \ldots, \mathbf{a}_n) + yD(\mathbf{a}_1, \ldots, \mathbf{b}_i, \ldots, \mathbf{a}_n). \tag{3}$$
The assertion that $D$ is alternating means that $D(\mathbf{a}_1, \ldots, \mathbf{a}_n) = 0$ if $\mathbf{a}_i = \mathbf{a}_j$ for some $i \ne j$. In Exercises 6.1 and 6.2, we ask the student to derive from the alternating multilinearity of $D$ that
$$D(\mathbf{a}_1, \ldots, r\mathbf{a}_i, \ldots, \mathbf{a}_n) = rD(\mathbf{a}_1, \ldots, \mathbf{a}_i, \ldots, \mathbf{a}_n), \tag{4}$$
$$D(\mathbf{a}_1, \ldots, \mathbf{a}_i + r\mathbf{a}_j, \ldots, \mathbf{a}_n) = D(\mathbf{a}_1, \ldots, \mathbf{a}_i, \ldots, \mathbf{a}_n), \tag{5}$$
and
$$D(\mathbf{a}_1, \ldots, \mathbf{a}_j, \ldots, \mathbf{a}_i, \ldots, \mathbf{a}_n) = -D(\mathbf{a}_1, \ldots, \mathbf{a}_i, \ldots, \mathbf{a}_j, \ldots, \mathbf{a}_n) \quad \text{if } i \ne j. \tag{6}$$
Given the alternating multilinear function provided by (I), the determinant of the $n \times n$ matrix $A$ can then be defined by
$$\det A = D(A^1, \ldots, A^n), \tag{7}$$
where $A^1, \ldots, A^n$ are as usual the column vectors of $A$. Then (4) above says that the determinant of $A$ is multiplied by $r$ if some column of $A$ is multiplied by $r$, (5) that the determinant of $A$ is unchanged if a multiple of one column is added to another column, while (6) says that the sign of det $A$ is changed by an interchange of any two columns of $A$. By virtue of the following fact, the word "column" in each of these three statements may be replaced throughout by the word "row."
(II) The determinant of the matrix $A$ is equal to that of its transpose $A^t$.
The transpose $A^t$ of the matrix $A = (a_{ij})$ is obtained from $A$ by interchanging the elements $a_{ij}$ and $a_{ji}$, for each $i$ and $j$. Another way of saying this is that the matrix $A$ is reflected through its principal diagonal. We therefore write $A^t = (a_{ji})$ to state the fact that the element in the $i$th row and $j$th column of $A^t$ is equal to the one in the $j$th row and $i$th column of $A$. For example, if
$$A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix}, \quad \text{then} \quad A^t = \begin{pmatrix} 1 & 4 & 7 \\ 2 & 5 & 8 \\ 3 & 6 & 9 \end{pmatrix}.$$
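The column rules (4), (5), (6) and the transpose rule (II) can be spot-checked on a small example. The Python sketch below is an illustration only, assuming a determinant computed by first-row expansion; the helper names and the matrix `B` are hypothetical, and a numerical check of one matrix is of course no substitute for the proofs.

```python
def det(a):
    """Determinant by expansion along the first row (fine for small matrices)."""
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** j * a[0][j] * det([row[:j] + row[j + 1:] for row in a[1:]])
               for j in range(len(a)))


def transpose(a):
    """Interchange rows and columns."""
    return [list(col) for col in zip(*a)]


def scale_column(a, i, r):
    """Multiply column i by r."""
    return [[r * x if k == i else x for k, x in enumerate(row)] for row in a]


def add_multiple_of_column(a, i, j, r):
    """Add r times column j to column i (i != j)."""
    return [[x + r * row[j] if k == i else x for k, x in enumerate(row)] for row in a]


def swap_columns(a, i, j):
    """Interchange columns i and j."""
    out = [list(row) for row in a]
    for row in out:
        row[i], row[j] = row[j], row[i]
    return out


# A hypothetical matrix with nonzero determinant, and the matrix A of the text.
B = [[2, 0, 1], [1, 3, -1], [0, 5, 4]]
A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

d = det(B)
checks = {
    "(4) scaling a column": det(scale_column(B, 0, 3)) == 3 * d,
    "(5) adding a multiple": det(add_multiple_of_column(B, 0, 1, 2)) == d,
    "(6) interchange": det(swap_columns(B, 0, 2)) == -d,
    "(II) transpose": det(transpose(B)) == d and det(transpose(A)) == det(A),
}
print(d, checks)
```

The matrix $A$ with entries 1 through 9 has determinant 0, so it is used here only for the transpose check, where a zero value is still informative.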
Still another way of saying this is that $A^t$ is obtained from $A$ by changing the rows of $A$ to columns, and the columns to rows.
(III) The determinant of a matrix can be calculated by expansions along rows and columns, that is, by formulas (1) and (2) above.
In a systematic development, it would be proved that formulas (1) and (2) give definitions of det $A$ that satisfy the conditions of Property I and therefore, by the uniqueness of the function $D$, each must agree with the definition in (7) above.
The fourth basic property of determinants is the fact that the determinant of the product of two matrices is equal to the product of their determinants.
(IV) $\det AB = (\det A)(\det B)$.
As an application, recall that the $n \times n$ matrix $B$ is said to be an inverse of the $n \times n$ matrix $A$ if and only if $AB = BA = I$, where $I$ denotes the $n \times n$
identity matrix. In this case we write $B = A^{-1}$ (the matrix $A^{-1}$ is unique if it exists at all; see Exercise 6.3), and say $A$ is invertible. Since the fact that $D(\mathbf{e}_1, \ldots, \mathbf{e}_n) = 1$ means that det $I = 1$, (IV) gives $(\det A)(\det A^{-1}) = 1 \ne 0$. So a necessary condition for the existence of $A^{-1}$ is that det $A \ne 0$. We prove in Theorem 6.3 that this condition is also sufficient. The $n \times n$ matrix $A$ is called nonsingular if det $A \ne 0$, singular if det $A = 0$.
We can now give the determinant criterion for the linear independence of $n$ vectors in $\mathcal{R}^n$.
Theorem 6.1 The $n$ vectors $\mathbf{a}_1, \ldots, \mathbf{a}_n$ in $\mathcal{R}^n$ are linearly independent if and only if
$$D(\mathbf{a}_1, \ldots, \mathbf{a}_n) \ne 0.$$
PROOF Suppose first that they are linearly dependent; we then want to show that $D(\mathbf{a}_1, \ldots, \mathbf{a}_n) = 0$. Some one of them is then a linear combination of the others; suppose, for instance, that
$$\mathbf{a}_1 = t_2\mathbf{a}_2 + \cdots + t_n\mathbf{a}_n.$$
Then
$$D(\mathbf{a}_1, \ldots, \mathbf{a}_n) = D(t_2\mathbf{a}_2 + \cdots + t_n\mathbf{a}_n, \mathbf{a}_2, \ldots, \mathbf{a}_n) = \sum_{i=2}^{n} t_i D(\mathbf{a}_i, \mathbf{a}_2, \ldots, \mathbf{a}_n) \quad \text{(multilinearity)}$$
$$= 0$$
because each $D(\mathbf{a}_i, \mathbf{a}_2, \ldots, \mathbf{a}_n) = 0$, $i = 2, \ldots, n$, since $D$ is alternating.
Conversely, suppose that the vectors $\mathbf{a}_1, \ldots, \mathbf{a}_n$ are linearly independent. Let $A$ be the $n \times n$ matrix whose column vectors are $\mathbf{a}_1, \ldots, \mathbf{a}_n$, and define the linear mapping $L : \mathcal{R}^n \to \mathcal{R}^n$ by $L(\mathbf{x}) = A\mathbf{x}$ for each (column) vector $\mathbf{x} \in \mathcal{R}^n$. Since $L(\mathbf{e}_i) = \mathbf{a}_i$ for each $i = 1, \ldots, n$, Im $L = \mathcal{R}^n$ and $L$ is one-to-one by Theorem 5.1. It therefore has a linear inverse mapping $L^{-1} : \mathcal{R}^n \to \mathcal{R}^n$ (Exercise 5.3); denote by $B$ the matrix of $L^{-1}$. Then $AB = BA = I$ by Theorem 4.2, so it follows from the remarks preceding the statement of the theorem that det $A \ne 0$, as desired. ∎
Determinants also have important applications to the solution of linear systems of equations. Consider the system
$$a_{11}x_1 + \cdots + a_{1n}x_n = \ldots$$
[...]
[...] $\mathcal{R}^m$, and call $D$ the domain (of definition) of $f$. In order to define $\lim_{\mathbf{x} \to \mathbf{a}} f(\mathbf{x})$, the limit of $f$ at $\mathbf{a}$, it will be necessary that $f$ be defined at points arbitrarily close to $\mathbf{a}$, that is, that $D$ contains points arbitrarily close to $\mathbf{a}$. However we do not want to insist that $\mathbf{a} \in D$, that is, that $f$ be defined at $\mathbf{a}$. For example, when we define the derivative $f'(a)$ of a real-valued single-variable function as the limit of its difference quotient at $a$,
$$f'(a) = \lim_{x \to a} \frac{f(x) - f(a)}{x - a},$$
this difference quotient is not defined at $a$.
This consideration motivates the following definition. The point $\mathbf{a}$ is a limit point of the set $D$ if and only if every open ball centered at $\mathbf{a}$ contains points of $D$ other than $\mathbf{a}$ (this is what is meant by the statement that $D$ contains points
arbitrarily close to $\mathbf{a}$). By the open ball of radius $r$ centered at $\mathbf{a}$ is meant the set
$$B_r(\mathbf{a}) = \{\mathbf{x} \in \mathcal{R}^n : |\mathbf{x} - \mathbf{a}| < r\}.$$
Note that $\mathbf{a}$ may, or may not, be itself a point of $D$. Examples: (a) A finite set of points has no limit points; (b) every point of $\mathcal{R}^n$ is a limit point of $\mathcal{R}^n$; (c) the origin $\mathbf{0}$ is a limit point of the set $\mathcal{R}^n - \mathbf{0}$; (d) every point of $\mathcal{R}^n$ is a limit point of the set $Q$ of all those points of $\mathcal{R}^n$ having rational coordinates; (e) the closed ball
$$\bar{B}_r(\mathbf{a}) = \{\mathbf{x} \in \mathcal{R}^n : |\mathbf{x} - \mathbf{a}| \le r\}$$
is the set of all limit points of the open ball $B_r(\mathbf{a})$.
Given a mapping $f : D \to \mathcal{R}^m$, a limit point $\mathbf{a}$ of $D$, and a point $\mathbf{b} \in \mathcal{R}^m$, we say that $\mathbf{b}$ is the limit of $f$ at $\mathbf{a}$, written
$$\lim_{\mathbf{x} \to \mathbf{a}} f(\mathbf{x}) = \mathbf{b},$$
if and only if, given $\varepsilon > 0$, there exists $\delta > 0$ such that $\mathbf{x} \in D$ and $0 < |\mathbf{x} - \mathbf{a}| < \delta$ imply $|f(\mathbf{x}) - \mathbf{b}| < \varepsilon$.
The idea is of course that $f(\mathbf{x})$ can be made arbitrarily close to $\mathbf{b}$ by choosing $\mathbf{x}$ sufficiently close to $\mathbf{a}$, but not equal to $\mathbf{a}$. In geometrical language (Fig. 1.6),
[Figure 1.6: the open balls $B_\delta(\mathbf{a})$ and $B_\varepsilon(\mathbf{b})$]
the condition of the definition is that, given any open ball $B_\varepsilon(\mathbf{b})$ centered at $\mathbf{b}$, there exists an open ball $B_\delta(\mathbf{a})$ centered at $\mathbf{a}$, whose intersection with $D - \mathbf{a}$ is sent by $f$ into $B_\varepsilon(\mathbf{b})$.
Example 1 Consider the function $f : \mathcal{R}^2 \to \mathcal{R}$ defined by
$$f(x, y) = x^2 + xy + y.$$
In order to prove that $\lim_{(x,y) \to (1,1)} f(x, y) = 3$, we first write
$$|f(x, y) - 3| = |x^2 + xy + y - 3| \le |x^2 - 1| + |y - 1| + |xy - 1| = |x + 1|\,|x - 1| + |y - 1| + |xy - y + y - 1|,$$
$$|f(x, y) - 3| \le |x + 1|\,|x - 1| + 2|y - 1| + |y|\,|x - 1|. \tag{1}$$
Given $\varepsilon > 0$, we want to find $\delta > 0$ such that $|(x, y) - (1, 1)| = [(x - 1)^2 + (y - 1)^2]^{1/2} < \delta$ implies that the right-hand side of (1) is $< \varepsilon$. Clearly we need
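bounds for the coefficients $|x + 1|$ and $|y|$ of $|x - 1|$ in (1). So let us first agree to choose $\delta \le 1$, so that $|(x, y) - (1, 1)| < \delta$ gives $|x - 1| < 1$ and $|y - 1| < 1$, hence
$$|x + 1| < 3 \quad \text{and} \quad |y| < 2,$$
which then implies that
$$|f(x, y) - 3| \le 5|x - 1| + 2|y - 1|$$
by (1). It is now clear that, if we take $\delta = \min(1, \varepsilon/7)$, then
$$|(x, y) - (1, 1)| < \delta \implies |f(x, y) - 3| < 7\delta \le \varepsilon,$$
as desired.
Example 2 Consider the function $f : \mathcal{R}^2 \to \mathcal{R}$ defined by
$$f(x, y) = \begin{cases} \dfrac{xy}{x^2 - y^2} & \text{if } x \ne \pm y, \\[4pt] 0 & \text{if } x = \pm y. \end{cases}$$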
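To investigate $\lim_{(x,y) \to \mathbf{0}} f(x, y)$, let us consider the value of $f(x, y)$ as $(x, y)$ approaches $\mathbf{0}$ along the straight line $y = \alpha x$. The lines $y = \pm x$, along which $f(x, y) = 0$, are given by $\alpha = \pm 1$. If $\alpha \ne \pm 1$, then
$$f(x, \alpha x) = \frac{\alpha x^2}{x^2 - \alpha^2 x^2} = \frac{\alpha}{1 - \alpha^2}.$$
For instance,
$$f(x, 2x) = -\tfrac23 \quad \text{and} \quad f(x, -2x) = +\tfrac23$$
for all $x \ne 0$. Thus $f(x, y)$ has different constant values on different straight lines through $\mathbf{0}$, so it is clear that $\lim_{(x,y) \to \mathbf{0}} f(x, y)$ does not exist (because, given any proposed limit $b$, the values $-\tfrac23$ and $+\tfrac23$ of $f$ cannot both be within $\varepsilon$ of $b$ if $\varepsilon < \tfrac23$).
Example 3 Consider the function $f(t) = (\cos t, \sin t)$. To prove that $\lim_{t \to 0} f(t) = (1, 0)$, given $\varepsilon > 0$, we must find $\delta > 0$ such that
$$|t| < \delta \implies [(\cos t - 1)^2 + (\sin t)^2]^{1/2} < \varepsilon.$$
In order to simplify the square root, write $a = \cos t - 1$ and $b = \sin t$. Then
$$[(\cos t - 1)^2 + (\sin t)^2]^{1/2} = (a^2 + b^2)^{1/2} \le |a| + |b|,$$
so it suffices to find $\delta >$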
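0 such that
$$|t| < \delta \implies |\cos t - 1| < \frac{\varepsilon}{2} \quad \text{and} \quad |\sin t| < \frac{\varepsilon}{2}.$$
[...] $\mathcal{R}^m$, and write $f(\mathbf{x}) = (f_1(\mathbf{x}), \ldots, f_m(\mathbf{x})) \in \mathcal{R}^m$ for each $\mathbf{x} \in D$. Then $f_1, \ldots, f_m$ are real-valued functions on $D$, called as usual the coordinate functions of $f$, and we write $f = (f_1, \ldots, f_m)$. For the function $f$ of Example 3 we have $f = (f_1, f_2)$ where $f_1(t) = \cos t$, $f_2(t) = \sin t$, and we found that
$$\lim_{t \to 0} f(t) = \left(\lim_{t \to 0} f_1(t), \lim_{t \to 0} f_2(t)\right) = (1, 0).$$
Theorem 7.1 Suppose $f = (f_1, \ldots, f_m) : D \to \mathcal{R}^m$, that $\mathbf{a}$ is a limit point of $D$, and $\mathbf{b} = (b_1, \ldots, b_m) \in \mathcal{R}^m$. Then
$$\lim_{\mathbf{x} \to \mathbf{a}} f(\mathbf{x}) = \mathbf{b} \tag{2}$$
if and only if
$$\lim_{\mathbf{x} \to \mathbf{a}} f_i(\mathbf{x}) = b_i, \qquad i = 1, \ldots, m. \tag{3}$$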
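The relation between a vector-valued limit and its coordinate limits rests on the elementary inequalities $\max_i |f_i(\mathbf{x}) - b_i| \le |f(\mathbf{x}) - \mathbf{b}| \le \sqrt{m}\,\max_i |f_i(\mathbf{x}) - b_i|$. The Python sketch below checks them numerically for the mapping $f(t) = (\cos t, \sin t)$ near $t = 0$ with $\mathbf{b} = (1, 0)$; it is an illustration in floating-point arithmetic (hence the small tolerance, an assumption of this sketch), not a proof.

```python
import math


def f(t):
    """The mapping t -> (cos t, sin t) with coordinate functions cos and sin."""
    return (math.cos(t), math.sin(t))


b = (1.0, 0.0)  # the proposed limit of f at t = 0


def errors(t):
    """Return (list of coordinate errors |f_i(t) - b_i|, vector error |f(t) - b|)."""
    coordinate = [abs(fi - bi) for fi, bi in zip(f(t), b)]
    total = math.sqrt(sum(e * e for e in coordinate))
    return coordinate, total


TOL = 1e-12  # allowance for floating-point rounding
m = len(b)
results = []
for t in (0.1, 0.01, 0.001):
    coordinate, total = errors(t)
    lower_ok = max(coordinate) <= total + TOL
    upper_ok = total <= math.sqrt(m) * max(coordinate) + TOL
    results.append((t, total, lower_ok, upper_ok))
    print(t, coordinate, total)
```

The lower bound shows that convergence of $f$ forces convergence of each coordinate, and the upper bound shows the converse.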
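PROOF First assume (2). Then, given $\varepsilon > 0$, there exists $\delta > 0$ such that $\mathbf{x} \in D$ and $0 < |\mathbf{x} - \mathbf{a}| < \delta$ imply that $|f(\mathbf{x}) - \mathbf{b}| < \varepsilon$. But then
$$0 < |\mathbf{x} - \mathbf{a}| < \delta \implies |f_i(\mathbf{x}) - b_i| \le |f(\mathbf{x}) - \mathbf{b}| < \varepsilon,$$
so (3) holds for each $i = 1, \ldots, m$.
Conversely, assume (3). Then, given $\varepsilon > 0$, for each $i = 1, \ldots, m$ there exists a $\delta_i > 0$ such that
$$\mathbf{x} \in D \quad \text{and} \quad 0 < |\mathbf{x} - \mathbf{a}| < \delta_i \implies |f_i(\mathbf{x}) - b_i| < \frac{\varepsilon}{\sqrt{m}}. \tag{4}$$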
If we now choose $\delta = \min(\delta_1, \ldots, \delta_m)$, then $\mathbf{x} \in D$ and
$$0 < |\mathbf{x} - \mathbf{a}| < \delta \implies |f(\mathbf{x}) - \mathbf{b}| = \left[\sum_{i=1}^{m} |f_i(\mathbf{x}) - b_i|^2\right]^{1/2} < \left[\sum_{i=1}^{m} \frac{\varepsilon^2}{m}\right]^{1/2} = \varepsilon$$
by (4), so (2) holds. ∎
[...] The mapping $f : D \to \mathcal{R}^m$ is said to be continuous at $\mathbf{a} \in D$ if and only if
$$\lim_{\mathbf{x} \to \mathbf{a}} f(\mathbf{x}) = f(\mathbf{a}). \tag{5}$$
$f$ is said to be continuous on $D$ (or, simply, continuous) if it is continuous at every point of $D$.
Actually we cannot insist upon condition (5) if $\mathbf{a} \in D$ is not a limit point of $D$, for in this case the limit of $f$ at $\mathbf{a}$ cannot be discussed. Such a point, which belongs to $D$ but is not a limit point of $D$, is called an isolated point of $D$, and we remedy this situation by including in the definition the stipulation that $f$ is automatically continuous at every isolated point of $D$.
Example 4 If $D$ is the open ball $B_1(\mathbf{0})$ together with the point $(2, 0)$, then any function $f$ on $D$ is continuous at $(2, 0)$, while $f$ is continuous at $\mathbf{a} \in B_1(\mathbf{0})$ if and only if condition (5) is satisfied.
Example 5 If $D$ is the set of all those points $(x, y)$ of $\mathcal{R}^2$ such that both $x$ and $y$ are integers, then every point of $D$ is an isolated point, so every function on $D$ is continuous (at every point of $D$).
The following result is an immediate corollary to Theorem 7.1.
Theorem 7.2 The mapping $f : D \to \mathcal{R}^m$ is continuous at $\mathbf{a} \in D$ if and only if each coordinate function of $f$ is continuous at $\mathbf{a}$.
Example 6 The identity mapping $\pi : \mathcal{R}^n \to \mathcal{R}^n$, defined by $\pi(\mathbf{x}) = \mathbf{x}$, is obviously continuous. Its $i$th coordinate function, $\pi_i(x_1, \ldots, x_n) = x_i$, is called the $i$th projection function, and is continuous by Theorem 7.2.
Example 7 The real-valued functions $s$ and $p$ on $\mathcal{R}^2$, defined by $s(x, y) = x + y$ and $p(x, y) = xy$, are continuous. The proofs are left as exercises.
The continuity of many mappings can be established without direct recourse to the definition of continuity; instead we apply the known continuity of the elementary single-variable functions, elementary facts such as Theorem 7.2 and Examples 6 and 7, and the fact that a composition of continuous functions is continuous. Given $f : D_1 \to \mathcal{R}^m$ and $g : D_2 \to \mathcal{R}^k$, where $D_1$ [...]
[...] the set of all boundary points of the open ball $B_r(\mathbf{p})$ is the sphere $S_r(\mathbf{p}) = \{\mathbf{x} \in \mathcal{R}^n : |\mathbf{x} - \mathbf{p}| = r\}$. Show that every boundary point of $D$ is either a point of $D$ or a limit point of $D$.
7.7 Let $D^*$ denote the set of all limit points of the set $D$. Then prove that the set $D \cup D^*$ contains all of its limit points.
7.8 Let $f : \mathcal{R}^n \to \mathcal{R}^m$ be continuous at the point $\mathbf{a}$. If $\{\mathbf{a}_n\}_1^\infty$ is a sequence of points of $\mathcal{R}^n$ which converges to $\mathbf{a}$, prove that the sequence $\{f(\mathbf{a}_n)\}_1^\infty$ converges to the point $f(\mathbf{a})$.
8
ELEMENTARY TOPOLOGY OF $\mathcal{R}^n$
In addition to its linear structure as a vector space, and the metric structure provided by the usual inner product, Euclidean $n$-space $\mathcal{R}^n$ possesses a topological structure (defined below). Among other things, this topological structure enables us to define and study a certain class of subsets of $\mathcal{R}^n$, called compact sets, that play an important role in maximum-minimum problems. Once we have defined compact sets, the following two statements will be established.
(A) If $D$ is a compact set in $\mathcal{R}^n$, and $f : D \to \mathcal{R}^m$ is continuous, then its image $f(D)$ is a compact set in $\mathcal{R}^m$ (Theorem 8.7).
(B) If $C$ is a compact set on the real line $\mathcal{R}$, then $C$ contains a maximal element $b$, that is, a number $b \in C$ such that $x \le b$ for all $x \in C$.
It follows immediately from (A) and (B) that, if $f : D \to \mathcal{R}$ is a continuous real-valued function on the compact set $D \subset \mathcal{R}^n$, then $f(\mathbf{x})$ attains an absolute maximum value at some point $\mathbf{a} \in D$. For if $b$ is the maximal element of the compact set $f(D) \subset \mathcal{R}$, and $\mathbf{a}$ is a point of $D$ such that $f(\mathbf{a}) = b$, then it is clear that $f(\mathbf{a}) = b$ is the maximum value attained by $f(\mathbf{x})$ on $D$. The existence of maximum (and, similarly, minimum) values for continuous functions on compact sets, together with the fact that compact sets turn out to be easily recognizable as such (Theorem 8.6), enable compact sets to play the same role in multivariable maximum-minimum problems as do closed intervals in single-variable ones.
By a topology (or topological structure) for the set $S$ is meant a collection $\mathcal{T}$ of subsets, called open subsets of $S$, such that $\mathcal{T}$ satisfies the following three conditions:
(i) The empty set $\emptyset$ and the set $S$ itself are open.
(ii) The union of any collection of open sets is an open set.
(iii) The intersection of a finite number of open sets is an open set.
The subset $A$ of $\mathcal{R}^n$ is called open if and only if, given any point $\mathbf{a} \in A$, there exists an open ball $B_r(\mathbf{a})$ (with $r > 0$) which is centered at $\mathbf{a}$ and is wholly contained in $A$.
Put the other way around, $A$ is open if there does not exist a point $\mathbf{a} \in A$ such that every open ball $B_r(\mathbf{a})$ contains points that are not in $A$. It is
easily verified that, with this definition, the collection of all open subsets of $\mathcal{R}^n$ satisfies conditions (i)-(iii) above (Exercise 8.1).
Examples: (a) An open interval is an open subset of $\mathcal{R}$, but a closed interval is not. (b) More generally, an open ball in $\mathcal{R}^n$ is an open subset of $\mathcal{R}^n$ (Exercise 8.3) but a closed ball is not (points on the boundary violate the definition). (c) If $F$ is a finite set of points in $\mathcal{R}^n$, then $\mathcal{R}^n - F$ is an open set. (d) Although $\mathcal{R}$ is an open subset of itself, it is not an open subset of the plane $\mathcal{R}^2$.
The subset $B$ of $\mathcal{R}^n$ is called closed if and only if its complement $\mathcal{R}^n - B$ is open. It is easily verified (Exercise 8.2) that conditions (i)-(iii) above imply that the collection of all closed subsets of $\mathcal{R}^n$ satisfies the following three analogous conditions:
(i') $\emptyset$ and $\mathcal{R}^n$ are closed.
(ii') The intersection of any collection of closed sets is a closed set.
(iii') The union of a finite number of closed sets is a closed set.
Examples: (a) A closed interval is a closed subset of $\mathcal{R}$. (b) More generally, a closed ball in $\mathcal{R}^n$ is a closed subset of $\mathcal{R}^n$ (Exercise 8.3). (c) A finite set $F$ of points is a closed set. (d) The real line $\mathcal{R}$ is a closed subset of $\mathcal{R}^2$. (e) If $A$ is the set of points of the sequence $\{1/n\}_1^\infty$, together with the limit point 0, then $A$ is a closed set (why?)
The last example illustrates the following useful alternative characterization of closed sets.
Proposition 8.1 The subset $A$ of $\mathcal{R}^n$ is closed if and only if it contains all of its limit points.
PROOF Suppose $A$ is closed, and that $\mathbf{a}$ is a limit point of $A$. Since every open ball centered at $\mathbf{a}$ contains points of $A$, and $\mathcal{R}^n - A$ is open, $\mathbf{a}$ cannot be a point of $\mathcal{R}^n - A$. Thus $\mathbf{a} \in A$.
Conversely, suppose that $A$ contains all of its limit points. If $\mathbf{b} \in \mathcal{R}^n - A$, then $\mathbf{b}$ is not a limit point of $A$, so there exists an open ball $B_r(\mathbf{b})$ which contains no points of $A$. Thus $\mathcal{R}^n - A$ is open, so $A$ is closed. ∎
If, given $A$ [...] be the continuous mapping defined by $f(\mathbf{x}) = |\mathbf{x} - \mathbf{a}|$, where $\mathbf{a} \in \mathcal{R}^n$ is a fixed point. Then $f^{-1}((-r, r))$ is the open ball $B_r(\mathbf{a})$, so it follows that this open ball is indeed an open set. Also $f^{-1}([0, r]) = \bar{B}_r(\mathbf{a})$, so the closed ball is indeed closed. Finally,
$$f^{-1}(r) = S_r(\mathbf{a}) = \{\mathbf{x} \in \mathcal{R}^n : |\mathbf{x} - \mathbf{a}| = r\},$$
so the $(n-1)$-sphere of radius $r$, centered at $\mathbf{a}$, is a closed set.
The subset $A$ of $\mathcal{R}^n$ is said to be compact if and only if every infinite subset of $A$ has a limit point which lies in $A$. This is equivalent to the statement that every sequence of points of $A$ has a subsequence $\{\mathbf{a}_n\}_1^\infty$ which converges to a point $\mathbf{a} \in A$. (This means the same thing in $\mathcal{R}^n$ as on the real line: Given $\varepsilon > 0$, there exists $N$ such that $n \ge N \implies |\mathbf{a}_n - \mathbf{a}| < \varepsilon$.) The equivalence of this statement and the definition is just a matter of language (Exercise 8.7).
Examples: (a) $\mathcal{R}$ is not compact, because the set of all integers is an infinite subset of $\mathcal{R}$ that has no limit point at all. Similarly, $\mathcal{R}^n$ is not compact. (b) The open interval $(0, 1)$ is not compact, because the sequence $\{1/n\}_2^\infty$ is an infinite subset of $(0, 1)$ whose limit point 0 is not in the interval. Similarly, open balls fail to be compact. (c) If the set $F$ is finite, then it is automatically compact because it has no infinite subsets which could cause problems.
Closed intervals do not appear to share the problems (in regard to compactness) of open intervals. Indeed the Bolzano-Weierstrass theorem says precisely that every closed interval is compact (see the Appendix). We will see presently that every closed ball is compact. Note that a closed ball is both closed and bounded, meaning that it lies inside some ball $B_r(\mathbf{0})$ centered at the origin.
Lemma 8.3 Every compact set is both closed and bounded.
PROOF Suppose that $A$ [...] $n$. But then $\{\mathbf{b}_n\}_1^\infty$ would be an infinite subset of $A$ having no limit point (Exercise 8.8), thereby contradicting the compactness of $A$. ∎
Lemma 8.4 A closed subset of a compact set is compact.
PROOF Suppose that $A$ is closed, $B$ is compact, and $A$ [...] $\varepsilon > 0$, there exists $\delta > 0$ such that
$$\mathbf{x} \in D, \quad |\mathbf{x} - \mathbf{a}| < \delta \implies |f(\mathbf{x}) - f(\mathbf{a})| < \varepsilon.$$
In general, [...] $\varepsilon > 0$, there exists $\delta > 0$ such that $\mathbf{x}, \mathbf{y} \in D$, $|\mathbf{x} - \mathbf{y}|$