English Pages 313 [314] Year 2016
Pietro-Luciano Buono Advanced Calculus De Gruyter Graduate
Also of interest Functional Analysis. A Terse Introduction Gerard Chacón, Humberto Rafeiro, Juan Camilo Vallejo, 2016 ISBN 978-3-11-044191-8, e-ISBN (PDF) 978-3-11-044192-5, e-ISBN (EPUB) 978-3-11-043364-7
Introduction to Topology Min Yan, 2016 ISBN 978-3-11-037815-3, e-ISBN (PDF) 978-3-11-041302-1, e-ISBN (EPUB) 978-3-11-037816-0
Tensors and Riemannian Geometry. With Applications to Differential Equations Nail H. Ibragimov, 2015 ISBN 978-3-11-037949-5, e-ISBN (PDF) 978-3-11-037950-1, e-ISBN (EPUB) 978-3-11-037964-8
Multivariable Calculus and Differential Geometry Gerard Walschap, 2015 ISBN 978-3-11-036949-6, e-ISBN (PDF) 978-3-11-036954-0
Elements of Partial Differential Equations Pavel Drábek, Gabriela Holubová, 2014 ISBN 978-3-11-031665-0, e-ISBN (PDF) 978-3-11-031667-4, e-ISBN (EPUB) 978-3-11-037404-9
Pietro-Luciano Buono
Advanced Calculus
Differential Calculus and Stokesʼ Theorem
Mathematics Subject Classification 2010 35-02, 65-02, 65C30, 65C05, 65N35, 65N75, 65N80 Author Prof. Pietro-Luciano Buono University of Ontario Institute of Technology 2000 Simcoe St North Oshawa ON L1H 7K4 Canada [email protected]
ISBN 978-3-11-043821-5 e-ISBN (PDF) 978-3-11-043822-2 e-ISBN (EPUB) 978-3-11-042911-4 Library of Congress Cataloging-in-Publication Data A CIP catalog record for this book has been applied for at the Library of Congress. Bibliografische Information der Deutschen Nationalbibliothek The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available on the Internet at http://dnb.dnb.de. © 2016 Walter de Gruyter GmbH, Berlin/Boston Cover image: Pietro-Luciano Buono via Asymptote: The Vector Graphics Language Printing and binding: CPI books GmbH, Leck ♾ Printed on acid-free paper Printed in Germany www.degruyter.com
À Isabelle pour son amour, son support et sa patience.
Contents
Preface
1 Introduction
1.1 Review of Set Theory
1.2 Review of Linear Algebra
1.3 Coordinate systems
1.4 Functions and Mappings: including partial derivatives
1.5 Parametric representation of curves
1.6 Quadrics
2 Calculus of Vector Functions
2.1 Derivatives and Integrals
2.2 Best Linear Approximation and Tangent Lines
2.3 Reparametrizations and arc-length parameter
3 Tangent Spaces and 1-forms
3.1 Tangent spaces
3.2 Differentials
3.3 1-forms
4 Line Integrals
4.1 Integration of 1-forms
4.2 Arc-length, Metrics and Applications
4.3 Line integrals of vector fields
5 Differential Calculus of Mappings
5.1 Graphs and Level Sets
5.2 Limits and Continuity
5.3 Best Linear Approximation and Derivatives
5.4 Tangent spaces
5.5 The Chain Rule
5.6 Higher Derivatives
5.7 Taylor expansions
6 Applications of Differential Calculus
6.1 Optimization
6.2 Parametrizations
6.3 Differential Operators
6.4 Application of Clairault’s theorem to 1-forms
7 Double and Triple Integrals
7.1 Area and Volume Forms
7.2 Double integrals
7.3 Green’s Theorem
7.4 Three-dimensional domains
8 Wedge Products and Exterior Derivatives
8.1 More on Wedge Products
8.2 Differential Forms
8.3 Exterior Derivative
9 Integration of Forms
9.1 Pullbacks of k-forms: k = 1, 2, 3
9.2 Integrals of Forms: change of variables formula
9.3 Integrals on a surface
9.4 Orientation of Surfaces
9.5 General Pullback Formula
10 Stokes’ Theorem and Applications
10.1 More on orientation of curves and surfaces
10.2 Stokes’ Theorem
10.3 Stokes’s Theorem for Vector Fields
Bibliography
Index
Preface
This book is an outgrowth of the notes I have been using to teach a one-semester Calculus III course at the University of Ontario Institute of Technology since 2012. It is intended for students who have already completed at least one semester of Elementary Linear Algebra and two semester-long courses in Calculus. The approach taken in this book is to take full advantage of Linear Algebra in order to present the Calculus concepts in as much generality as possible. Because of this bias towards using Linear Algebra, I decided also to go one step further (than many other books) and introduce the concept of tangent space early in the text, from which it is possible to define properly the differential of a function and, from there, differential forms and pullbacks in the context of line integrals. In the following chapters, those are generalized just enough to provide a unified treatment of integration and the generalized Stokes’ theorem in $\mathbb{R}^3$ (Green, Classical Stokes and Divergence). Therefore, this book can also serve as a gentle introduction to the theory of differential forms and prepare the reader to delve into more advanced topics from differential geometry and mathematical physics. The book begins with an introductory chapter on basic topics which are recurrent in the remainder of the book. Several of those topics may already be familiar to some readers. In order to provide an incremental progression in the blending of Linear Algebra and Calculus, sprinkled with differential forms theory, the next three chapters discuss almost exclusively vector functions of one variable and culminate with the Fundamental Theorem of Line Integrals. Chapter 3 is the cornerstone for the whole book and it uses tangent vectors to vector functions to introduce tangent spaces to curves, to $\mathbb{R}^n$ and to surfaces. It is then possible to define differentials and 1-forms as acting on tangent vectors.
The second part of the book is made up of Chapters 5 and 6 and focuses on differentiable mappings from $\mathbb{R}^n$ to $\mathbb{R}^m$. Chapters 7 through 9 introduce, in a blended way, additional concepts of differential form theory along with the theory of multiple integrals. Finally, Chapter 10 puts the results from the previous chapters together in the statement and proof of Stokes’ theorem (Green, Classical and Divergence) using differential forms and exterior derivatives. The statement is also rewritten in terms of the classical differential operators. One of the advantages I see in the unconventional ordering of topics adopted here is that it is now possible to start introducing terminology in the context of curves (i.e. one-dimensional geometrical objects), which are typically easier to understand and for which the calculations do not require the notational machinery needed with more variables. After a discussion of mappings and especially the introduction of the Jacobian, it is then possible to extend the differential form concepts to higher dimensions. At the same time, this allows for a repetition of the new concepts and terminology, which is typically beneficial for learning. Another learning goal that this text attempts to achieve is for the reader to start distinguishing between the definition of mathematical concepts, the geometric content of the definition and the computational formulae which are useful in most problem solving. It also provides an algorithmic presentation of some computations which I hope will make their usage more straightforward; one such example is for determining the arc-length parametrization of a curve. The content of this book has seen several versions since 2012 and I would like to thank all my students who have ploughed through those various versions. I am grateful to all of those who provided feedback on the presentation and noticed mistakes, typos, etc. I would like to thank Eryn Frawley (Calculus III, 2013) who assisted me by producing a great number of figures using Tikz, solutions to many problems and proofreading of chapters. I am also indebted to my teaching assistants, and especially Jamil Jabbour, who have read major portions of the material and challenged me on the presentation in several places. All figures were done using Tikz (https://sourceforge.net/projects/pgf/) and Asymptote (http://asymptote.sourceforge.net). I would like to thank all the users of those software packages for posting examples and code snippets. In particular, I am indebted to the gallery maintained at http://asy.marris.fr/asymptote/index.html. I hope to be able to give back soon to the community by making some of the code for the figures of this book available online. For any inquiries about this textbook, errors or typos found, etc., please contact me at: [email protected].
Luciano Buono Oshawa, June 2016.
1 Introduction
This chapter gives an overview of several of the topics necessary for the remainder of the book. The first sections on Set Theory and Linear Algebra are review sections.
1.1 Review of Set Theory
We begin with a quick review of basic concepts from set theory with a focus on the real numbers $\mathbb{R}$. A set of real numbers is an unordered collection of real numbers. One denotes sets inside braces in enumerative style as follows:
$$A = \{1, -3, \pi, 1/\sqrt{2}\}.$$
The symbol $\in$ means “element of” and denotes the belonging of an element to a set. We can also describe a set using a defining condition
$$B = \{x \in \mathbb{R} \mid x > 2\},$$
which is read as: $x$ is the placeholder for elements of $\mathbb{R}$ such that $x$ is greater than 2. In the defining condition notation, to verify whether a number belongs to a given set one has to check if the condition is satisfied. Is $-3 \in B$? Let $x = -3$; then $-3 > 2$ is false. Therefore, $-3$ is not an element of $B$, which is denoted $-3 \notin B$.
Fig. 1.1. In bold, the set $B$.
Sets can have a finite number of elements, as $A$ does, or an infinite number of elements, as $B$ does. An important type of set on the real line is the interval, defined as follows. Let $a, b \in \mathbb{R}$ with $a < b$; then
$$[a, b] = \{x \in \mathbb{R} \mid a \le x \le b\}, \qquad (a, b) = \{x \in \mathbb{R} \mid a < x < b\},$$
$$(a, b] = \{x \in \mathbb{R} \mid a < x \le b\}, \qquad [a, b) = \{x \in \mathbb{R} \mid a \le x < b\}.$$
If $a = -\infty$ or $b = \infty$ then we always use “$(a$” or “$b)$”.
Fig. 1.2. From left to right, the intervals $[a, b]$, $(a, b)$ and $(a, b]$.
A set $E$ is a subset of a set $F$ if every element of $E$ is also an element of $F$. We then write $E \subset F$. Another notation is $E \subseteq F$, which allows for $E$ and $F$ to have exactly the same elements, that is $E = F$.
Example 1.1.1. Let $E = \{x \in \mathbb{R} \mid x = 4n \text{ with } n \in \mathbb{N}\}$ and $F$ be the set of even integers. Is $E \subset F$? To see this, one has to make sure that every element of $E$ is an even integer. But $E$ contains an infinite number of elements and so we use the defining condition instead. At this point, if one believes that $E \subset F$, then we can proceed to verify it properly as follows: let $x = 4n$ for an arbitrary natural number $n \in \mathbb{N}$; then $x = 2(2n)$ is divisible by 2 and so $x$ is an even integer. Because $x$ can be chosen to be any element of $E$, this means $E \subset F$.
Example 1.1.2. Let $E = \{x \in \mathbb{R} \mid x = p/q,\ p, q \in \mathbb{Z},\ q \neq 0,\ p, q \le 10\}$ and $F$ be the interval $(0, 1)$. Is $E \subset F$? In this case, one can check that for $p = 2 \in \mathbb{Z}$ and $q = 1 \in \mathbb{Z}$ we have $p, q \le 10$ and so $x = 2/1 \in E$. But $x = 2 > 1$ and so $x \notin F$. Thus, $E \not\subset F$.
The main operations on sets are the union, the intersection and the complement. Let $A \subseteq \mathbb{R}$ and $B \subseteq \mathbb{R}$ be two sets. Then the union of $A$ and $B$ is
$$A \cup B := \{x \in \mathbb{R} \mid x \in A \text{ or } x \in B\}.$$
The intersection of $A$ and $B$ is
$$A \cap B := \{x \in \mathbb{R} \mid x \in A \text{ and } x \in B\}.$$
The complement of $A$ is
$$A^c := \{x \in \mathbb{R} \mid x \notin A\}.$$
In particular, one can check that for any two sets $A$ and $B$: $A \cap B \subset A \cup B$. One is often familiar with these operations in the context of Venn diagrams. Because sets are often given using defining conditions, the union and intersection operations are given by adding “or” for unions and “and” for intersections to the defining conditions.
Example 1.1.3. Let $A = \{x \in \mathbb{R} \mid x \text{ is an even integer}\}$ and $B = (-3, 5)$. Then,
$$A \cup B = \{x \in \mathbb{R} \mid x \text{ is an even integer or } -3 < x < 5\},$$
$$A \cap B = \{x \in \mathbb{R} \mid x \text{ is an even integer and } -3 < x < 5\}.$$
For the complement, one writes for instance
$$A^c = \{x \in \mathbb{R} \mid x \notin A\} = \{x \in \mathbb{R} \mid x \text{ is not an even integer}\}.$$
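The correspondence between set operations and the logical connectives “or”, “and” and “not” can be tried out on a computer. The following is a minimal sketch, assuming that a set given by a defining condition is modelled as a Python function returning True or False; the function names are illustrative and not part of the text.

```python
# Sets given by defining conditions, modelled as membership predicates.

def in_A(x):
    """A = {x in R | x is an even integer}."""
    return float(x).is_integer() and int(x) % 2 == 0

def in_B(x):
    """B = (-3, 5), an open interval."""
    return -3 < x < 5

def in_union(x):
    # Union: join the defining conditions with "or".
    return in_A(x) or in_B(x)

def in_intersection(x):
    # Intersection: join the defining conditions with "and".
    return in_A(x) and in_B(x)

def in_complement_A(x):
    # Complement: negate the defining condition.
    return not in_A(x)

for x in [-4, -2, 0, 1.5, 4, 5, 6]:
    print(x, in_union(x), in_intersection(x), in_complement_A(x))
```

For instance, 4 belongs to both the union and the intersection, while 5 belongs to neither, since 5 is odd and the interval (−3, 5) is open at 5.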
Fig. 1.3. Venn diagram of three sets $A$, $B$ and $C$.
Exercises
(1) Write the following descriptions of sets using defining conditions.
(a) Let $E$ be the set of elements of $\mathbb{R}$ that are squares of integers.
(b) Let $F \subset \mathbb{R}$ be the set of points with distance less than 1 from $\pi$.
(c) Let $A$ be the set of integers with absolute value greater than $2\pi$.
(d) Let $B$ be the set of rational numbers with denominator less than 30.
(2) Determine the union and intersection of the sets $A$, $B$, $E$, $F$ of Exercise 1.
(3) For each problem, determine whether $E \subset F$ or not. Explain.
(a) Let $E = \{x \in \mathbb{R} \mid |x - 3| < 2\}$ and $F = \{x \in \mathbb{R} \mid 2 < x < 4\}$.
(b) Let $E = \{x \in \mathbb{R} \mid x = \cos(n\pi), \text{ for any } n \in \mathbb{Z}\}$ and $F = \{x \in \mathbb{R} \mid -1 \le x \le 1\}$.
(c) Let $E = \{x \in \mathbb{R} \mid |1 - x| > 0\}$ and $F = \{x \in \mathbb{R} \mid |x - 1| < 0\}$.
1.2 Review of Linear Algebra
This review of linear algebra focuses on the definitions and results revolving around the concept of vector space and on geometrical aspects of linear algebra. For a review of the solution of linear systems using row reduction, matrices, inverses of matrices, and other topics, the reader can consult their favourite textbook on linear algebra. One-, two- and three-dimensional Euclidean spaces are the line, the plane and the ambient space familiar to us. They are represented mathematically as $\mathbb{R}$, $\mathbb{R}^2$ and $\mathbb{R}^3$. The space $\mathbb{R}^n$ is defined as the set of all ordered collections of $n$ real numbers written as $(x_1, x_2, \ldots, x_n)$. For $n = 2$ and $n = 3$ we have $(x_1, x_2)$ and $(x_1, x_2, x_3)$, where $x_1, x_2, x_3$ are any real numbers. Euclidean spaces are vector spaces; that is, all elements of $\mathbb{R}^n$ satisfy the following two conditions.
(1) For any two elements $(x_1, x_2, \ldots, x_n)$ and $(y_1, y_2, \ldots, y_n)$ in $\mathbb{R}^n$, the sum $(x_1, x_2, \ldots, x_n) + (y_1, y_2, \ldots, y_n) = (x_1 + y_1, x_2 + y_2, \ldots, x_n + y_n)$ is also in $\mathbb{R}^n$.
(2) For $a \in \mathbb{R}$ and $(x_1, x_2, \ldots, x_n) \in \mathbb{R}^n$, the scalar multiple $a(x_1, x_2, \ldots, x_n) = (ax_1, \ldots, ax_n)$ is also in $\mathbb{R}^n$.
In general, let $V \subset \mathbb{R}^n$; then $V$ is a vector space (or a vector subspace of $\mathbb{R}^n$) if for all elements $v_1, v_2 \in V$ and every $a \in \mathbb{R}$ the following two properties are satisfied:
(1) $v_1 + v_2 \in V$,
(2) $a v_1 \in V$.
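The two closure properties can be spot-checked numerically before attempting a proof. The sketch below is an illustration only, with hypothetical helper names; it uses the sets $V = \{x_3 = 2x_1 + x_2\} \subset \mathbb{R}^3$ and $W = \{x_2 = -x_1 + 1\} \subset \mathbb{R}^2$ treated in the examples that follow. A passing check on a few samples does not prove that a set is a subspace, whereas a single failing check does show that it is not.

```python
def in_V(p, tol=1e-9):
    """Membership test for V = {(x1, x2, x3) | x3 = 2*x1 + x2}."""
    x1, x2, x3 = p
    return abs(x3 - (2 * x1 + x2)) < tol

def in_W(p, tol=1e-9):
    """Membership test for W = {(x1, x2) | x2 = -x1 + 1}."""
    x1, x2 = p
    return abs(x2 - (-x1 + 1)) < tol

def add(p, q):
    return tuple(a + b for a, b in zip(p, q))

def scale(c, p):
    return tuple(c * a for a in p)

# Two elements of V: the sum and a scalar multiple stay in V.
v1, v2 = (1, 2, 4), (-3, 5, -1)
print(in_V(v1), in_V(v2), in_V(add(v1, v2)), in_V(scale(2.5, v1)))

# Two elements of W: the sum leaves W, so W is not a subspace.
w1, w2 = (0, 1), (3, -2)
print(in_W(w1), in_W(w2), in_W(add(w1, w2)))
```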
If a vector space $W$ is a subset of a vector space $V$, we say that $W$ is a vector subspace of $V$.
Example 1.2.1. The subset $V = \{(x_1, x_2, x_3) \in \mathbb{R}^3 \mid x_3 = 2x_1 + x_2\} \subset \mathbb{R}^3$ is a vector subspace. We need to check the two conditions. Two general elements of $V$ must satisfy the defining condition. We write
$$v_1 = (a_1, a_2, 2a_1 + a_2) \quad\text{and}\quad v_2 = (b_1, b_2, 2b_1 + b_2).$$
Then,
$$v_1 + v_2 = (a_1, a_2, 2a_1 + a_2) + (b_1, b_2, 2b_1 + b_2) = (a_1 + b_1, a_2 + b_2, 2(a_1 + b_1) + (a_2 + b_2)).$$
Therefore, the third component of the sum satisfies the defining condition for $V$. Now check for $c \in \mathbb{R}$:
$$c v_1 = c(a_1, a_2, 2a_1 + a_2) = (ca_1, ca_2, 2(ca_1) + ca_2),$$
and here we also see that the third component satisfies the defining condition for $V$. Therefore, $V$ is a vector space.
Example 1.2.2. The subset $W = \{(x_1, x_2) \in \mathbb{R}^2 \mid x_2 = -x_1 + 1\} \subset \mathbb{R}^2$ is not a vector subspace of $\mathbb{R}^2$. Two general elements of $W$ must satisfy the defining condition. We write
$$w_1 = (a_1, -a_1 + 1) \quad\text{and}\quad w_2 = (b_1, -b_1 + 1).$$
Then,
$$w_1 + w_2 = (a_1, -a_1 + 1) + (b_1, -b_1 + 1) = (a_1 + b_1, -(a_1 + b_1) + 2).$$
Therefore, the second component of the sum does not satisfy the defining condition and so $W$ is not a vector space.
The elements of a vector space $V$ are called vectors. A linear combination of a collection of $k$ vectors $v_1, v_2, \ldots, v_k$ of a vector space $V$ is given by
$$a_1 v_1 + a_2 v_2 + \cdots + a_k v_k$$
for some real numbers $a_1, a_2, \ldots, a_k$ called the coefficients of the linear combination.
Example 1.2.3. One can check that $v_1 = (1, 0, 2)$, $v_2 = (3, -1, 5)$ and $v_3 = (-0.2, 1, 0.6)$ are elements of $V$ from Example 1.2.1. Then, $(1.1)v_1 + v_2 + (-1)v_3$ is a linear combination of $v_1, v_2, v_3$.
We now recall the definitions of linear dependence and linear independence of vectors. Consider the following example first.
Example 1.2.4. Let $v_1 = (1, -2)$ and $v_2 = (-1.3, 2.6)$; then we can see that $v_2 = (-1.3)v_1$. We can rewrite this as a linear combination $(-1.3)v_1 + (-1)v_2 = 0$. That is, the linear combination is equal to zero, but the coefficients of the linear combination are not zero.
A collection of vectors $v_1, \ldots, v_k$ in a vector space $V$ is linearly dependent if there exists a linear combination
$$a_1 v_1 + a_2 v_2 + \cdots + a_k v_k = 0$$
for which not all the coefficients are zero. A collection of vectors $v_1, \ldots, v_k$ is linearly independent if the only way to have a linear combination $a_1 v_1 + a_2 v_2 + \cdots + a_k v_k = 0$ is for $a_1 = a_2 = \cdots = a_k = 0$.
Example 1.2.5. Consider again the vectors $v_1 = (1, 0, 2)$, $v_2 = (3, -1, 5)$ and $v_3 = (-0.2, 1, 0.6)$ of Example 1.2.3 and write
$$a_1 v_1 + a_2 v_2 + a_3 v_3 = (a_1 + 3a_2 - 0.2a_3,\ -a_2 + a_3,\ 2a_1 + 5a_2 + 0.6a_3) = 0.$$
The second component gives $a_3 = a_2$. Substituting in the first component we obtain $a_1 + 3a_2 - 0.2a_2 = 0$, which means $a_1 = -2.8a_2$. Substituting in the third component we have $2(-2.8)a_2 + 5a_2 + 0.6a_2 = 0$, which holds for every value of $a_2$. Choosing $a_2 = -1$ gives the nontrivial relation $2.8v_1 - v_2 - v_3 = 0$, so the three vectors are linearly dependent. This is to be expected, since all three lie in the plane $V$ of Example 1.2.1. By contrast, $v_1$ and $v_2$ alone are linearly independent: the equation $a_1 v_1 + a_2 v_2 = (a_1 + 3a_2, -a_2, 2a_1 + 5a_2) = 0$ forces $a_2 = 0$ from the second component and then $a_1 = 0$ from the first. The only way for a linear combination of $v_1, v_2$ to equal zero is for all the coefficients to be zero.
For a collection of vectors $v_1, v_2, \ldots, v_k \in V$, the span of $v_1, v_2, \ldots, v_k$ is the set of all linear combinations of $v_1, \ldots, v_k$. We write
$$\operatorname{span}(v_1, \ldots, v_k) := \{a_1 v_1 + a_2 v_2 + \cdots + a_k v_k \mid a_1, a_2, \ldots, a_k \in \mathbb{R}\}.$$
Consider the following example.
Example 1.2.6. The span of $v_1 = (1, 0, 2)$ and $v_2 = (-3, 1, 5)$ is
$$\operatorname{span}(v_1, v_2) = \{a_1(1, 0, 2) + a_2(-3, 1, 5) \mid a_1, a_2 \in \mathbb{R}\} = \{(a_1 - 3a_2,\ a_2,\ 2a_1 + 5a_2) \mid a_1, a_2 \in \mathbb{R}\}.$$
This last line gives an explicit expression for elements of the span. One can now verify whether a vector belongs to the span of $v_1, v_2$ by checking explicitly. For instance, is $v_4 = (4, -1, -4)$ an element of $\operatorname{span}(v_1, v_2)$? We need
$$4 = a_1 - 3a_2, \qquad -1 = a_2, \qquad 2a_1 + 5a_2 = -4.$$
The first two equations give $a_2 = -1$ and $a_1 = 1$, but then $2a_1 + 5a_2 = -3 \neq -4$, so the three equations have no common solution. That is, $v_4 \notin \operatorname{span}(v_1, v_2)$. By contrast, the vector $(4, -1, -3) = v_1 - v_2$ does belong to the span.
Recall the following result concerning the span of vectors.
Proposition 1.2.7. Let $v_1, \ldots, v_k$ be a collection of vectors in the vector space $V$. Then, $\operatorname{span}(v_1, \ldots, v_k)$ forms a vector subspace of $V$.
We now have all the ingredients to discuss the concepts of basis and dimension. A set of vectors $B = \{v_1, \ldots, v_k\}$ in a vector space $V$ is a basis for $V$ if
(1) the vectors in $B$ are linearly independent, and
(2) $V = \operatorname{span}(B)$.
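Linear independence and membership in a span both reduce to solving linear systems, so they can be checked mechanically. The following is a minimal sketch, assuming exact rational arithmetic and a small hand-written Gaussian elimination; the helper names rank, linearly_independent and in_span are illustrative and not taken from the text. A collection of vectors is linearly independent exactly when its rank equals the number of vectors, and a vector $w$ lies in the span exactly when adjoining it does not increase the rank.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a list of vectors (as rows), by Gaussian elimination over the rationals."""
    # Fraction(str(x)) reads decimals such as -0.2 exactly as -1/5.
    rows = [[Fraction(str(x)) for x in v] for v in vectors]
    r = 0  # number of pivots found so far
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def linearly_independent(vectors):
    return rank(vectors) == len(vectors)

def in_span(w, vectors):
    return rank(list(vectors) + [w]) == rank(vectors)

print(linearly_independent([(1, -2), (-1.3, 2.6)]))                    # Example 1.2.4
print(rank([(1, 0, 2), (3, -1, 5), (-0.2, 1, 0.6)]))                   # vectors of Example 1.2.3
print(in_span((4, -1, -3), [(1, 0, 2), (-3, 1, 5)]))                   # v1 - v2 of Example 1.2.6
```

Run on the three vectors of Example 1.2.3, this check reports rank 2, consistent with the fact that all three satisfy $x_3 = 2x_1 + x_2$ and therefore lie in the plane $V$ of Example 1.2.1.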
Example 1.2.8. If $V = \mathbb{R}^n$, the set $B = \{e_1, \ldots, e_n\}$, where
$$e_j = (0, \ldots, 1, \ldots, 0) \quad\text{with the 1 in the $j$th position,}$$
is a basis for $\mathbb{R}^n$. It is called the canonical basis of $\mathbb{R}^n$. The linear independence is straightforward to check. For $n = 2$, we have
$$e_1 = (1, 0) \quad\text{and}\quad e_2 = (0, 1).$$
Let $v = (x_1, x_2)$ be an arbitrary element of $\mathbb{R}^2$; then $v$ is in the span of $B$:
$$v = (x_1, x_2) = x_1 e_1 + x_2 e_2.$$
Fig. 1.4. Canonical basis of $\mathbb{R}^2$: $\{e_1, e_2\}$.
The elements $x_1, x_2$ are called the coordinates of $v$. For general $n$, any element $w = (x_1, \ldots, x_n) \in \mathbb{R}^n$ can be written uniquely as
$$w = x_1 e_1 + x_2 e_2 + \cdots + x_n e_n,$$
where $x_1, \ldots, x_n$ are said to be the coordinates of $w$. The dimension of a vector space $V$ is given by the number of elements of a basis of $V$. The choice of basis is not unique. For the Euclidean spaces, the canonical basis is the preferred one, and other bases are typically used in special circumstances when one wants to single out a particular geometrical feature.
Example 1.2.9. We are interested in the subspace of $\mathbb{R}^2$ spanned by the vector $d_1 = (1, 1)$. This subspace is the bisector line of the first and third quadrants. A basis of $\mathbb{R}^2$ which includes $d_1$ is $\{d_1, e_2\}$. For this we check the two properties of a basis:
(1) for $a, b \in \mathbb{R}$, $a d_1 + b e_2 = (a, a + b) = 0$ forces $a = b = 0$, so the vectors are linearly independent, and
(2) for any $(x_1, x_2) \in \mathbb{R}^2$, $(x_1, x_2) = x_1 d_1 + (x_2 - x_1) e_2$. Thus, $\{d_1, e_2\}$ spans $\mathbb{R}^2$.
We now consider the concept of length of vectors in $\mathbb{R}^n$ when written in the canonical basis. This is given by the norm of a vector $v = (x_1, x_2, \ldots, x_n)$, defined by
$$\|v\| := \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2}.$$
In $\mathbb{R}^2$, this corresponds to the length of the hypotenuse of the right-angle triangle with sides $x_1$ and $x_2$. In Example 1.2.9, a choice which may seem more natural is to take the vector $d_2 = (1, -1)$, which spans the bisector of the second and fourth quadrants. The vectors $d_1$ and $d_2$ are at right angles, just as the canonical basis elements. The dot product or scalar product (also called inner product) of two vectors $v$ and $w$ written in the canonical basis is denoted by $v \cdot w$ and defined as follows: let $v = (x_1, x_2, \ldots, x_n)$ and $w = (y_1, y_2, \ldots, y_n)$; then
$$v \cdot w = x_1 y_1 + x_2 y_2 + \cdots + x_n y_n.$$
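The norm, the dot product and the change of coordinates of Example 1.2.9 are short enough to be computed directly. The following is a minimal sketch with illustrative function names (dot, norm and coords_in_d1_e2 are not notation from the text); vectors are represented as tuples in the canonical basis.

```python
import math

def dot(v, w):
    """Dot product x1*y1 + ... + xn*yn of two vectors of the same length."""
    return sum(x * y for x, y in zip(v, w))

def norm(v):
    """Norm of v, i.e. the square root of v . v."""
    return math.sqrt(dot(v, v))

def coords_in_d1_e2(v):
    """Coordinates (a, b) of v = (x1, x2) in the basis {d1, e2}: v = a*d1 + b*e2."""
    x1, x2 = v
    return (x1, x2 - x1)

d1, d2, e2 = (1, 1), (1, -1), (0, 1)
print(norm(d1))                  # the length of d1, equal to sqrt(2)
print(dot(d1, e2), dot(d1, d2))  # 1 and 0
print(coords_in_d1_e2((3, 5)))   # (3, 2), since (3, 5) = 3*d1 + 2*e2
```

The vanishing dot product of $d_1$ and $d_2$ matches the observation that these two vectors are at right angles.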
Fig. 1.5. The vector $v$ has norm $\|v\| = \sqrt{x_1^2 + x_2^2}$.
This formula comes from the following argument. If $v$ and $w$ are at right angles, or perpendicular, then by Pythagoras’ theorem (valid in $\mathbb{R}^n$) we have
$$\|v\|^2 + \|w\|^2 = \|v - w\|^2 = (x_1 - y_1)^2 + \cdots + (x_n - y_n)^2 = \|v\|^2 + \|w\|^2 - 2(x_1 y_1 + \cdots + x_n y_n).$$
Fig. 1.6. The vector $v - w$.
Therefore, the left and right hand sides are equal if and only if $x_1 y_1 + \cdots + x_n y_n = 0 = v \cdot w$. Two vectors $v, w$ are orthogonal (i.e. perpendicular) if and only if $v \cdot w = 0$. A basis $B$ for a vector space $V$ is orthogonal if all basis elements are mutually orthogonal. If, in addition, the vectors in $B$ have norm 1, the basis is called orthonormal.
Example 1.2.10. Consider $\mathbb{R}^2$ with the basis $\{d_1, e_2\}$ of Example 1.2.9. Then, $d_1 \cdot e_2 = 1$ and so this basis is not an orthogonal basis. If instead we choose the basis $\{d_1, d_2\}$, then $d_1 \cdot d_2 = 0$ and so this forms an orthogonal basis, but it is not orthonormal because $\|d_1\| = \|d_2\| = \sqrt{2}$. To make it orthonormal, we need to form a basis using $d_1' = d_1/\|d_1\|$ and $d_2' = d_2/\|d_2\|$.
It is a straightforward exercise to check that the canonical basis of $\mathbb{R}^n$ is an orthonormal basis. An important property of the scalar product of $v$ and $w$ is that it gives the product of the norm of one vector with the signed length of the projection of the other vector along it. If $v$ and $w$ are vectors written in some orthonormal basis and $\theta$ is the angle between them, then
$$v \cdot w = \|v\|\, \|w\| \cos\theta.$$
We continue with a discussion of matrices and linear mappings. Consider an $m \times n$ matrix $M$ of real numbers with entries labelled $a_{ij}$, where $i$ is the row label and $j$ is the column label. We obtain
$$M = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}.$$
Let $v = (v_1, \ldots, v_n)$ be a vector in $\mathbb{R}^n$. Recall that matrix-vector multiplication $Mv$ is defined by
$$\begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix} \begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{pmatrix} = \begin{pmatrix} a_{11} v_1 + a_{12} v_2 + \cdots + a_{1n} v_n \\ a_{21} v_1 + a_{22} v_2 + \cdots + a_{2n} v_n \\ \vdots \\ a_{m1} v_1 + a_{m2} v_2 + \cdots + a_{mn} v_n \end{pmatrix}.$$
In particular, matrix-vector multiplication has a “linearity” property. That is, for vectors $v_1, v_2 \in \mathbb{R}^n$ and $\alpha \in \mathbb{R}$,
$$M(\alpha v_1 + v_2) = \alpha M v_1 + M v_2.$$
The $m \times n$ matrices are examples of linear transformations from $\mathbb{R}^n$ to $\mathbb{R}^m$. Linear transformations are often more easily described without the use of matrices, as the next example shows.
Example 1.2.11. Consider the spaces of polynomials of degree $\le 2$ and $\le 1$. We write
$$P_2 = \{a_0 + a_1 x + a_2 x^2 \mid a_0, a_1, a_2 \in \mathbb{R}\} \quad\text{and}\quad P_1 = \{b_0 + b_1 x \mid b_0, b_1 \in \mathbb{R}\}.$$
Consider the differentiation operation $D$ applied to elements of $P_2$:
$$D(a_0 + a_1 x + a_2 x^2) = a_1 + 2a_2 x \in P_1.$$
Therefore $D : P_2 \to P_1$, and we know from elementary calculus that differentiation is a linear operation; that is, if $p, q \in P_2$ and $\alpha \in \mathbb{R}$ then
$$D(\alpha p + q) = \alpha D(p) + D(q).$$
This example shows that using exclusively matrices to discuss linear transformations is not optimal in all situations. Moreover, writing out a matrix requires one to fix a basis for the vector spaces. Therefore, it is more appropriate to define the set of linear transformations independently of bases as follows.
Definition 1.2.12. We say that $T : \mathbb{R}^n \to \mathbb{R}^m$ is a linear transformation if it satisfies
$$T(\alpha v_1 + v_2) = \alpha T(v_1) + T(v_2)$$
where $\alpha \in \mathbb{R}$ and $v_1, v_2 \in \mathbb{R}^n$. The set of all linear transformations from $\mathbb{R}^n$ to $\mathbb{R}^m$ is denoted by $L(\mathbb{R}^n, \mathbb{R}^m)$.
In this book, we shall often discuss linear transformations in an abstract way, without specifying a basis, and so it is more convenient to adopt the linear transformation formalism, rather than writing all our transformations in matrix form. Finally, recall that two vector spaces $V$ and $W$ are isomorphic if there exists a linear transformation $T : V \to W$ such that $T^{-1}$ exists. In particular, any two vector spaces of the same finite dimension are automatically isomorphic. Returning to matrices, recall the definition of the determinant for $2 \times 2$ and $3 \times 3$ matrices, which we use repeatedly:
$$\det A = \det \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} = a_{11} a_{22} - a_{12} a_{21}$$
and
$$\det A = \det \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix} = a_{11}(a_{22} a_{33} - a_{23} a_{32}) - a_{12}(a_{21} a_{33} - a_{23} a_{31}) + a_{13}(a_{21} a_{32} - a_{22} a_{31}).$$
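The matrix operations recalled above (matrix-vector multiplication, its linearity, and the $2 \times 2$ and $3 \times 3$ determinants) can be sketched in a few lines. This is an illustration under stated assumptions: matrices are stored as lists of rows, and the matrix used for the differentiation operation $D$ of Example 1.2.11 assumes the monomial bases $\{1, x, x^2\}$ of $P_2$ and $\{1, x\}$ of $P_1$, a choice the text itself does not fix.

```python
def matvec(M, v):
    """Matrix-vector product: entry i is a_i1*v_1 + ... + a_in*v_n."""
    return [sum(a * x for a, x in zip(row, v)) for row in M]

def det2(A):
    (a11, a12), (a21, a22) = A
    return a11 * a22 - a12 * a21

def det3(A):
    (a11, a12, a13), (a21, a22, a23), (a31, a32, a33) = A
    return (a11 * (a22 * a33 - a23 * a32)
            - a12 * (a21 * a33 - a23 * a31)
            + a13 * (a21 * a32 - a22 * a31))

# Assumed matrix of D : P2 -> P1 in the monomial bases:
# a0 + a1*x + a2*x^2 is stored as [a0, a1, a2] and maps to [a1, 2*a2].
D = [[0, 1, 0],
     [0, 0, 2]]

p = [5, 3, 4]    # 5 + 3x + 4x^2
q = [1, 0, -2]   # 1 - 2x^2
alpha = 3

# Linearity: D(alpha*p + q) agrees with alpha*D(p) + D(q).
lhs = matvec(D, [alpha * a + b for a, b in zip(p, q)])
rhs = [alpha * a + b for a, b in zip(matvec(D, p), matvec(D, q))]
print(lhs, rhs)

print(det2([[1, 2], [3, 4]]))
print(det3([[2, 0, 0], [0, 3, 0], [0, 0, 4]]))
```

Writing $D$ as a matrix required choosing bases first, which illustrates why the text prefers the basis-free definition of a linear transformation.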
If no confusion with the absolute value is possible, we sometimes write $\det A = |A|$. We now conclude with some geometric properties of determinants.
Fig. 1.7. Vectors $v$ and $w$ and parallelogram $P(v, w)$.
If $v = (a, b)$ and $w = (c, d)$ are two vectors in the plane based at some point $q$, then the area of the parallelogram $P_{vw}$ generated by $v$ and $w$ is given by the absolute value of the determinant of the matrix
$$\begin{pmatrix} a & c \\ b & d \end{pmatrix}.$$
That is, $\operatorname{Area}(P_{vw}) = |ad - bc|$. There are various ways to check this result, but we do not pursue this here. However, this result is important in understanding the area of a parallelogram in three-dimensional space. Another geometric way of obtaining the area of a parallelogram is as follows. Consider the parallelogram $P_{vw}$ formed by two vectors $v$ and $w$ in the plane.
Fig. 1.8. Parallelogram $P$ formed by the vectors $v$ and $w$.
The area of this parallelogram is also given by the formula
$$\text{base of parallelogram} \times \text{height of parallelogram}.$$
The height is obtained by drawing a perpendicular from a vertex to the opposite side as in Figure 1.8. The length of this height is, using basic trigonometry, $h = \|v\| \sin\theta$, where $\theta$ is the angle between $v$ and $w$ (the angle opposite the dashed line). Geometrically, this means
$$\operatorname{Area}(P_{vw}) = \|w\|\, \|v\| \sin\theta.$$
Another way to compute the area of a parallelogram is with the “cross product”, also called “vector product”. The cross product is defined for pairs of vectors in $\mathbb{R}^3$ (or $\mathbb{R}^2$, viewed inside $\mathbb{R}^3$, if the same component of both vectors is zero). Let $v = (x_1, x_2, x_3)$ and $w = (y_1, y_2, y_3)$, written in an orthonormal basis; then the cross product $v \times w$ is a vector given by
$$v \times w := \left( \begin{vmatrix} x_2 & x_3 \\ y_2 & y_3 \end{vmatrix},\ -\begin{vmatrix} x_1 & x_3 \\ y_1 & y_3 \end{vmatrix},\ \begin{vmatrix} x_1 & x_2 \\ y_1 & y_2 \end{vmatrix} \right) \tag{1.1}$$
where $|\ |$ is the $2 \times 2$ determinant. This means that the components of the cross product correspond to the “oriented” areas of the parallelograms obtained by projecting the parallelogram $P(v, w)$ onto the coordinate planes.
The cross product $v \times w$ is perpendicular to both $v$ and $w$ and so is perpendicular to the plane in which $v$ and $w$ lie. This is verified by checking that
$$v \cdot (v \times w) = 0 \quad\text{and}\quad w \cdot (v \times w) = 0.$$
An important feature of the cross product is called its alternating property, namely
$$v \times w = -(w \times v).$$
This reflects the fact that there are two unit vectors perpendicular to the plane containing $v$ and $w$ and they are opposite to each other. Finally, let $P_{vw}$ be the parallelogram generated by $v$ and $w$; then
$$\operatorname{Area}(P_{vw}) = \|v \times w\|.$$
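The properties of the cross product listed above can be verified on sample vectors. The following is a minimal sketch that implements formula (1.1) through $2 \times 2$ determinants; the function names and the sample vectors are chosen for illustration only.

```python
import math

def det2(a, b, c, d):
    """Determinant of the 2 x 2 matrix with rows (a, b) and (c, d)."""
    return a * d - b * c

def cross(v, w):
    """Cross product of v and w in R^3, component by component as in formula (1.1)."""
    x1, x2, x3 = v
    y1, y2, y3 = w
    return (det2(x2, x3, y2, y3), -det2(x1, x3, y1, y3), det2(x1, x2, y1, y2))

def dot(v, w):
    return sum(a * b for a, b in zip(v, w))

def norm(v):
    return math.sqrt(dot(v, v))

v, w = (1, 2, 3), (4, 5, 6)
n = cross(v, w)
print(n, dot(v, n), dot(w, n))   # n is perpendicular to both v and w
print(cross(w, v))               # alternating property: the opposite vector

# Two vectors in the xy-plane: the area |ad - bc| agrees with the norm of the cross product.
a, b, c, d = 1, 2, 3, 1
print(abs(det2(a, c, b, d)), norm(cross((a, b, 0), (c, d, 0))))
```

For the planar pair, both computations give an area of 5, and the cross product has its only nonzero component along the third axis.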
Exercises
(1) Show that the set $V = \{(x, y, z) \in \mathbb{R}^3 \mid z = 3x + 2y\}$ is a vector subspace (i.e. check that the two conditions of a vector space are satisfied).
(2) Show that a line $\ell$ in $\mathbb{R}^2$ is a vector subspace if and only if $\ell$ passes through the origin.
(3) Use the scalar multiplication property of vector spaces to show that a closed ball $B$ of any radius $r > 0$ cannot be a vector space. Explicitly, $B = \{(x, y, z) \in \mathbb{R}^3 \mid x^2 + y^2 + z^2 \le r^2\}$.
(4) Consider the vectors given. Find out if they are linearly dependent or independent. Compute the span of these vectors and find the dimension of the subspace spanned by the vectors.
(a) $v_1 = (1, 1, 0)$, $v_2 = (0, -2, 0)$.
(b) $v_1 = (2, -1, 1)$, $v_2 = (0, 1, 3)$ and $v_3 = (2, 0, 4)$.
(c) $v_1 = (3, 4, 1, 0)$, $v_2 = (0, 1, 0, -1)$ and $v_3 = (2, -2, 0, 1)$.
(5) Determine if the following vectors are orthogonal.
(a) $v_1 = (1, 3, 1)$, $v_2 = (2, -1, 1)$.
(b) $v_1 = (2, 1, 0, 0)$, $v_2 = (0, -1, 1, 0)$.
(c) $v_1 = (1, -1, 2, 2)$, $v_2 = (-1, 1, 1, 1)$.
(6) Find a vector $(a, b, c, d)$ orthogonal to $(3, -1, 0, 1)$.
(7) Find a vector $v$ orthogonal to the subspace $V$ of Exercise 1.
(8) Normalize the vectors of Exercise 5.
(9) Verify that $v_1 = (2/\sqrt{5}, 1/\sqrt{5}, 0)$, $v_2 = (0, 0, 1)$ and $v_3 = (-1/\sqrt{5}, 2/\sqrt{5}, 0)$ form an orthonormal basis of $\mathbb{R}^3$.
(10) Recall the determinant formula to compute the cross product of two vectors. Let $v = (x_1, y_1, z_1)$ and $w = (x_2, y_2, z_2)$; then
$$v \times w = \det \begin{pmatrix} i & j & k \\ x_1 & y_1 & z_1 \\ x_2 & y_2 & z_2 \end{pmatrix}$$
where $i, j, k$ correspond to the basis vectors $e_1, e_2, e_3$ (notation used in physics). Show that the result corresponds to formula (1.1).
(11) Show that for vectors in the $xy$-plane $v = (x_1, y_1, 0)$, $w = (x_2, y_2, 0)$, the only possible nonzero component of the cross product $v \times w$ is in the $e_3$ direction.
(12) Compute the area of the parallelogram given by the vectors $v = (1, 0, -2)$ and $w = (-3, 1, 1)$.
(13) Show that for $\alpha \neq 0$, $(\alpha v) \times w = \alpha(v \times w)$.
(14) If $w = \alpha v$ for some $\alpha \neq 0$, show that $v \times w = 0$.
1.3 Coordinate systems
Coordinate systems are commonly used to identify locations, whether on earth using longitudes and latitudes or for celestial objects in the night sky using the Elevation/Azimuthal coordinate system. The mathematical definition of a coordinate system used throughout this text is the following.
Definition 1.3.1. A coordinate system on $\mathbb{R}^n$ is a collection of $n$ families of curves such that any point in $\mathbb{R}^n$ corresponds uniquely to the intersection of one curve from each family.
Coordinate systems, as opposed to coordinates of a basis of a vector space, do not need to be linear. In fact, among the commonly used coordinate systems, only the Cartesian coordinate system is made up exclusively of straight lines. We define the following families of coordinate curves
$$F_i := \left\{ x^i_a(t_i) = (a_1, \ldots, a_{i-1}, t_i, a_{i+1}, \ldots, a_n) \;\middle|\; t_i \in \mathbb{R} \text{ and } a = (a_1, \ldots, \hat{a}_i, \ldots, a_n) \in \mathbb{R}^{n-1} \right\} \tag{1.2}$$
for $i = 1, \ldots, n$, where the hat in $\hat{a}_i$ denotes the missing coordinate in the vector. The Cartesian coordinate system on $\mathbb{R}^n$ consists of the $n$ families of straight lines given by (1.2). A point $p \in \mathbb{R}^n$ at the intersection of $n$ lines, one from each family $F_i$, with parameter values $t_{p_1}, t_{p_2}, \ldots, t_{p_n}$ has coordinates $(t_{p_1}, t_{p_2}, \ldots, t_{p_n})$.
Example 1.3.2. Let $u = 1.4e_1 + 1.7e_2 \in \mathbb{R}^2$; then $u$ is at the intersection of the lines $x^1_{1.7}(t_1)$ and $x^2_{1.4}(t_2)$ with $t_1 = 1.4$ and $t_2 = 1.7$, and so has coordinates $(1.4, 1.7)$. See Figure 1.9.
One notices from this example that the coordinates of a point $u$ given in the canonical basis correspond to the coordinates of $u$ in the Cartesian coordinate system.
Fig. 1.9. $u = (1.4, 1.7)$ at the intersection of two coordinate lines.
Because of this, the dividing line between the canonical basis and the Cartesian coordinate system is often overlooked. However, this is not correct as they correspond to different mathematical objects: a set of vectors for the canonical basis and a family of straight lines for the Cartesian coordinates.
Fig. 1.10. Polar coordinate system showing four (equally spaced) radii and eight radial vectors.
The polar coordinate system in $\mathbb{R}^2$ is obtained from the Cartesian coordinate system by defining a family of rays from the origin and a family of circles centered at the origin. Let $p \in \mathbb{R}^2$ be at the intersection of the ray with angle $\theta_0$ from the $x$-axis and the circle of radius $r_0$; in coordinates we write $p = (r_0, \theta_0)$. In terms of Cartesian coordinates, the formulae for a ray at angle $\theta_0 \in [0, 2\pi)$ and a circle of radius $r_0$ are respectively
$$r_{\theta_0}(t_1) = (t_1 \cos\theta_0, t_1 \sin\theta_0) \quad\text{and}\quad \theta_{r_0}(t_2) = (r_0 \cos t_2, r_0 \sin t_2).$$
The polar coordinate system is an example of a curvilinear coordinate system, because at least one family of curves is not made up of straight lines.
Example 1.3.3. Consider again the point $u = (1.4, 1.7) \in \mathbb{R}^2$. In polar coordinates, the coordinate curves passing through $u$ are determined by the ray $r_{\theta_0}(t)$ joining the origin to $u$ and the circle $\theta_{r_0}(t)$ passing through the point $u$. The angle $\theta_0$ is obtained using basic trigonometry since
$$\tan(\theta_0) = \frac{1.7}{1.4},$$
and so $\theta_0 = \arctan(1.7/1.4) \approx 0.88$ radians, which is close to $5\pi/18$. The radius of the circle is obtained from Pythagoras’ theorem: $r_0 = \sqrt{1.4^2 + 1.7^2} \approx 2.20$.
Fig. 1.11. $u = (1.4, 1.7)$ at the intersection of two coordinate curves in Cartesian and polar coordinates.
We now look at the main curvilinear coordinate systems in $\mathbb{R}^3$ obtained via the Cartesian coordinate system. The cylindrical coordinate system is described by families of curves extending the polar coordinate system in the third dimension using straight lines:
$$r_{\theta_0, z_0}(t) = (t \cos\theta_0, t \sin\theta_0, z_0), \qquad \theta_{r_0, z_0}(t) = (r_0 \cos t, r_0 \sin t, z_0)$$
and
$$x^3_{(r_0, \theta_0)}(t) = (r_0 \cos\theta_0, r_0 \sin\theta_0, t).$$
Figure 1.12 shows the radial and angular coordinates in the plane and the vertical axis.
Fig. 1.12. Cylindrical coordinates: polar coordinates in the plane and vertical axis.
Note that there are equivalent coordinate systems with polar coordinates in the other coordinate planes:
$$r_{\theta_0, x_0}(t) = (x_0, t \cos\theta_0, t \sin\theta_0), \qquad \theta_{r_0, x_0}(t) = (x_0, r_0 \cos t, r_0 \sin t)$$
and
$$x^1_{(r_0, \theta_0)}(t) = (t, r_0 \cos\theta_0, r_0 \sin\theta_0).$$
Similarly,
$$r_{\theta_0, y_0}(t) = (t \cos\theta_0, y_0, t \sin\theta_0), \qquad \theta_{r_0, y_0}(t) = (r_0 \cos t, y_0, r_0 \sin t)$$
and
$$x^2_{(r_0, \theta_0)}(t) = (r_0 \cos\theta_0, t, r_0 \sin\theta_0).$$
The spherical coordinate system is obtained from a family of rays from the origin together with, for each fixed radius $\rho_0 > 0$ in $\mathbb{R}^3$, two families of curves on the sphere of radius $\rho_0$, as follows:
$$\begin{aligned} \rho_{\theta_0, \phi_0}(t) &= (t \cos\theta_0 \sin\phi_0,\ t \sin\theta_0 \sin\phi_0,\ t \cos\phi_0) \\ \theta_{\rho_0, \phi_0}(t) &= (\rho_0 \cos t \sin\phi_0,\ \rho_0 \sin t \sin\phi_0,\ \rho_0 \cos\phi_0) \\ \phi_{\rho_0, \theta_0}(t) &= (\rho_0 \cos\theta_0 \sin t,\ \rho_0 \sin\theta_0 \sin t,\ \rho_0 \cos t) \end{aligned} \tag{1.3}$$
The definitions of the coordinate systems above give us formulae for the correspondence between points in a curvilinear coordinate system and points in the Cartesian coordinate system. If a point $p \in \mathbb{R}^2$ has Cartesian coordinates $(x, y)$ and polar coordinates $(r, \theta)$, then
$$x = r\cos\theta,\quad y = r\sin\theta \quad\Longleftrightarrow\quad r = \sqrt{x^2 + y^2},\quad \theta = \arctan(y/x). \qquad (1.4)$$
The arctangent returns the correct angle when $x > 0$; for $x < 0$ one adds $\pi$. The relationship between the Cartesian coordinate system in $\mathbb{R}^3$ and the cylindrical coordinate system is a direct extension of the polar coordinate equations. For spherical coordinates, equations (1.3) give us
$$x = \rho\cos\theta\sin\phi, \quad y = \rho\sin\theta\sin\phi \quad\text{and}\quad z = \rho\cos\phi.$$
We obtain $\rho$, $\phi$, $\theta$ as functions of $x$, $y$, $z$ as follows. The radius of the sphere through the point satisfies
$$\begin{aligned}
x^2 + y^2 + z^2 &= (\rho\cos\theta\sin\phi)^2 + (\rho\sin\theta\sin\phi)^2 + (\rho\cos\phi)^2\\
&= \rho^2\sin^2\phi\,(\cos^2\theta + \sin^2\theta) + \rho^2\cos^2\phi\\
&= \rho^2(\sin^2\phi + \cos^2\phi)\\
&= \rho^2,
\end{aligned}$$
so $\rho = \sqrt{x^2 + y^2 + z^2}$. Notice that $y/x = \tan\theta$, which means
$$\theta = \arctan(y/x).$$
Finally, $\cos\phi = z/\rho$, and writing $\rho = \sqrt{x^2 + y^2 + z^2}$ we obtain
$$\phi = \arccos\left(\frac{z}{\sqrt{x^2 + y^2 + z^2}}\right).$$
We summarize as follows; see Figure 1.13 for an illustration, where the right-angled triangle has sides of length $\rho\sin\phi$ and $\rho\cos\phi$ and hypotenuse $\rho$.
$$\begin{aligned}
x &= \rho\cos\theta\sin\phi, & \rho &= \sqrt{x^2 + y^2 + z^2},\\
y &= \rho\sin\theta\sin\phi, & \theta &= \arctan\left(\frac{y}{x}\right),\\
z &= \rho\cos\phi, & \phi &= \arccos\left(\frac{z}{\sqrt{x^2 + y^2 + z^2}}\right).
\end{aligned} \qquad (1.5)$$
Fig. 1.13. Spherical coordinates
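The conversion formulae (1.4) and (1.5) can be tried out numerically. The following is a minimal sketch, not part of the text: the function names are hypothetical, and `math.atan2` is used in place of $\arctan(y/x)$ as an assumption of convenience, so that the angle lands in the correct quadrant even when $x \le 0$.

```python
import math

def cartesian_to_polar(x, y):
    """Return (r, theta) as in (1.4), with theta normalised to [0, 2*pi)."""
    r = math.hypot(x, y)
    # atan2 replaces arctan(y/x) so that every quadrant is handled.
    theta = math.atan2(y, x) % (2 * math.pi)
    return r, theta

def polar_to_cartesian(r, theta):
    return r * math.cos(theta), r * math.sin(theta)

def cartesian_to_spherical(x, y, z):
    """Return (rho, theta, phi) as in (1.5); the origin is excluded."""
    rho = math.sqrt(x * x + y * y + z * z)
    theta = math.atan2(y, x) % (2 * math.pi)
    phi = math.acos(z / rho)
    return rho, theta, phi

def spherical_to_cartesian(rho, theta, phi):
    return (rho * math.cos(theta) * math.sin(phi),
            rho * math.sin(theta) * math.sin(phi),
            rho * math.cos(phi))

# The point u = (1.4, 1.7) of Example 1.3.3.
r0, theta0 = cartesian_to_polar(1.4, 1.7)
print(r0, theta0)

# Round trip through spherical coordinates for a point with x < 0.
print(spherical_to_cartesian(*cartesian_to_spherical(-1.0, 2.0, -0.5)))
```

Converting a point to spherical coordinates and back should return the original point up to rounding, which is a quick way to catch a swapped angle.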
Here are some examples of how to carry out the algebra of passing from one coordinate system to another.
Example 1.3.4. Consider the locus of points $C$ in $\mathbb{R}^2$ which satisfies the equation $x^2 + y^2 = 2y$. We rewrite this locus of points using polar coordinates. We substitute for $x$ and $y$ to obtain
$$r^2\cos^2\theta + r^2\sin^2\theta = 2r\sin\theta \quad\Rightarrow\quad r^2 = 2r\sin\theta \quad\Rightarrow\quad r = 2\sin\theta.$$
$C$ is shown in Figure 1.14; it is the circle of radius 1 centered at $(0, 1)$.
Fig. 1.14. The circle $C$ of radius 1 centered at $(0, 1)$
Consider this time an example in $\mathbb{R}^3$.
Example 1.3.5. We write the locus of points $S$ in $\mathbb{R}^3$ given by $3x^2 + 3y^2 - z^2 = 1$ using cylindrical and spherical coordinates. In cylindrical coordinates, one obtains
$$3r^2\cos^2\theta + 3r^2\sin^2\theta - z^2 = 1 \quad\Rightarrow\quad 3r^2 - z^2 = 1.$$
One can write either
$$z^2 = 3r^2 - 1 \quad\text{or}\quad r^2 = \frac{1}{3}(1 + z^2).$$
In spherical coordinates,
$$\begin{aligned}
3\rho^2\cos^2\theta\sin^2\phi + 3\rho^2\sin^2\theta\sin^2\phi - \rho^2\cos^2\phi &= 1\\
3\rho^2\sin^2\phi - \rho^2\cos^2\phi &= 1\\
\rho^2(4\sin^2\phi - 1) &= 1
\end{aligned}$$
where the last line is obtained by adding and subtracting $\rho^2\sin^2\phi$ and combining with the $\rho^2\cos^2\phi$ term. Therefore, one can write
$$\rho^2 = \frac{1}{4\sin^2\phi - 1}.$$
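A quick numerical check of Example 1.3.5 is sketched below. It builds points from the cylindrical relation $r^2 = \frac{1}{3}(1 + z^2)$ and from the spherical relation $\rho^2 = 1/(4\sin^2\phi - 1)$, and tests that they satisfy $3x^2 + 3y^2 - z^2 = 1$. The helper names are hypothetical and chosen only for this illustration.

```python
import math

def on_surface(x, y, z, tol=1e-9):
    """True if (x, y, z) satisfies 3x^2 + 3y^2 - z^2 = 1 up to tol."""
    return abs(3 * x * x + 3 * y * y - z * z - 1) < tol

def point_from_cylindrical(z, theta):
    """Point of S at height z and angle theta, using r^2 = (1 + z^2)/3."""
    r = math.sqrt((1 + z * z) / 3)
    return r * math.cos(theta), r * math.sin(theta), z

def point_from_spherical(phi, theta):
    """Point of S in direction (phi, theta), using rho^2 = 1/(4 sin^2(phi) - 1)."""
    denom = 4 * math.sin(phi) ** 2 - 1
    if denom <= 0:
        # No positive rho solves the equation in this direction.
        raise ValueError("no point of S in this direction")
    rho = math.sqrt(1 / denom)
    return (rho * math.cos(theta) * math.sin(phi),
            rho * math.sin(theta) * math.sin(phi),
            rho * math.cos(phi))

print(on_surface(*point_from_cylindrical(0.8, 1.1)))
print(on_surface(*point_from_spherical(math.pi / 3, 2.0)))
```

The `ValueError` branch reflects that $4\sin^2\phi - 1$ must be positive for $\rho^2$ to be positive, so directions too close to the $z$-axis do not meet $S$.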
We conclude this section with coordinate systems in $\mathbb{R}^4$. An obvious choice is the Cartesian coordinate system given by $(x_1, x_2, x_3, x_4)$, but one can also take two sets of polar coordinates $(r_1, \theta_1, r_2, \theta_2)$ or a three-dimensional coordinate system, say spherical, plus an additional Cartesian coordinate $(\rho, \phi, \theta, x_4)$. Several other options are possible.
Exercises
(1) Consider the disk $D$ of radius $a > 0$ in the plane, $D = \{(x, y) \in \mathbb{R}^2 \mid x^2 + y^2 \le a^2\}$. Write the definition of $D$ in polar coordinates. What is the shape of $D$ in the $(r, \theta)$ plane?
(2) Consider the region $R = \{(r, \theta) \in \mathbb{R}^2 \mid 2 < r < 3,\ \pi/6 \le \theta \le 2\pi/3\}$. Draw $R$ in the $(r, \theta)$ plane. Describe $R$ in Cartesian coordinates and draw the region in the $(x, y)$ plane.
(3) Write $x^2 + y^2 = 2xy$ in polar coordinates. Simplify if possible.
(4) Write $r = \dfrac{1}{1 - \cos\theta}$ in Cartesian coordinates. Simplify if possible.
(5) Write $z = x^2 + y^2$ in cylindrical coordinates. Simplify if possible.
(6) Write $z^2 = x^2 + y^2$ in spherical coordinates. Simplify if possible.
(7) Consider the region $R \subset \mathbb{R}^3$ given by $R = \{(x, y, z) \mid x^2 + y^2 \le 1 - z^2,\ \tfrac{1}{2} < z < 1\}$. Describe this region in spherical coordinates and draw it.
(8) Look up hyperbolic coordinates online. Describe this coordinate system.
(9) Look up paraboloidal coordinates online. Describe this coordinate system.
(10) Search the internet for other coordinate systems. How do they relate to the Cartesian coordinate system?
1.4 Functions and Mappings: including partial derivatives
One is typically familiar with functions of the form $y = f(x)$ where $x \in I \subset \mathbb{R}$. The set $I$ is called the domain of $f$, and the rule $f$ assigns a unique value $y$ to each $x \in I$; these values form the image or range of $f$. The general definition of a function follows the same pattern.
Definition 1.4.1. Let $A$, $B$ be two sets and let $f$ be a rule which assigns to every $a \in A$ a unique value $b = f(a) \in B$. Then the triplet $(A, B, f)$ is called a function, where $A$ is called the domain of the function and $B$ is the image or range of the function.
In this book, we consider functions of the following form. Let $U \subset \mathbb{R}^n$ and let $f : U \to \mathbb{R}^m$ be the rule. As a shortcut, it is typical to refer to $f$ as the function without mentioning the domain. However, one must always keep track of the domain of the function even if it is not explicitly stated. Several cases of $n$ and $m$ values are given special names:
(1) $n = m = 1$: functions of one variable,
(2) $n > 1$ and $m = 1$: functions of several variables,
(3) $n = 1$ and $m > 1$: vector functions.
We look at a few examples.
Example 1.4.2. Here are some examples of functions of several variables.
– $U = \mathbb{R}^2$ and $f(x, y) = x^2 + y^2$.
– $U \subset \mathbb{R}^3$ and $g(\rho, \phi, \theta) = \rho^2 - \dfrac{1}{4\sin^2\phi - 1}$.
– $U = \mathbb{R}^4$ and $h(x, y, z, w) = xyzw$.
Example 1.4.3. Consider now examples of vector functions.
– $r(t) = (t, t^2)$
– $r(t) = (\cos t, \sin t, e^t)$
– $r(t) = (1 - t, |t|, 2t, \cos t)$.
For general $n$ and $m$ values, it is also customary to refer to $f : U \to \mathbb{R}^m$ as a mapping. Changes of coordinates between coordinate systems are very useful mappings. Here are a few examples.
Example 1.4.4. Consider the mapping $f : \mathbb{R}^2 \setminus \{(0, 0)\} \to \mathbb{R}^2$ given by $f(r, \theta) = (r\cos\theta, r\sin\theta)$.
Example 1.4.5. Let $A : \mathbb{R}^n \to \mathbb{R}^n$ be a linear mapping, i.e. a matrix. Suppose that $|A| \neq 0$; then $A$ is a linear change of coordinates in $\mathbb{R}^n$. For instance, let $(x, y, z)$ be coordinates in $\mathbb{R}^3$ and
$$A = \begin{pmatrix} 0 & -1 & 0\\ 1 & 0 & 0\\ 0 & 0 & -1 \end{pmatrix}.$$
We can use $A$ to make a change of coordinates
$$\begin{pmatrix} u\\ v\\ w \end{pmatrix} = A\begin{pmatrix} x\\ y\\ z \end{pmatrix}.$$
Example 1.4.6. Another change of coordinates is from Cartesian to spherical coordinates, as seen in equation (1.5):
$$f(x, y, z) = \left(\sqrt{x^2 + y^2 + z^2},\ \arccos\left(\frac{z}{\sqrt{x^2 + y^2 + z^2}}\right),\ \arctan\left(\frac{y}{x}\right)\right),$$
with the components listed in the order $(\rho, \phi, \theta)$. The inverse transformation mapping is $g(\rho, \phi, \theta) = (\rho\cos\theta\sin\phi,\ \rho\sin\theta\sin\phi,\ \rho\cos\phi)$.
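The linear change of coordinates of Example 1.4.5 can be checked with a few lines of code. This is a small sketch in plain Python; the helpers `det3` and `apply` are hypothetical names introduced for the illustration and are not notation from the text.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows (cofactor expansion)."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def apply(m, v):
    """Matrix-vector product m v for a matrix given as a list of rows."""
    return tuple(sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m)

# The matrix A of Example 1.4.5.
A = [[0, -1, 0],
     [1, 0, 0],
     [0, 0, -1]]

print(det3(A))               # nonzero, so A defines a change of coordinates
print(apply(A, (1, 2, 3)))   # (u, v, w) for (x, y, z) = (1, 2, 3)
```

Reading off the rows of $A$, the new coordinates are $u = -y$, $v = x$ and $w = -z$, which is what the second line of output should reflect.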
1.4.1 Partial Derivatives
For functions of several variables of the form $f : \mathbb{R}^n \to \mathbb{R}$, there is a partial extension of the concept of derivative of a function: the so-called “partial derivative”, which we define in the case of a function of two variables. The general case is straightforward once this one is understood.
Consider $f(x, y)$ near a point $(x_0, y_0)$ in its domain. If the limits
$$\lim_{x \to x_0} \frac{f(x, y_0) - f(x_0, y_0)}{x - x_0} \quad\text{and}\quad \lim_{y \to y_0} \frac{f(x_0, y) - f(x_0, y_0)}{y - y_0}$$
exist, we call them the partial derivatives of $f$ with respect to $x$ and with respect to $y$, respectively. They are denoted by
$$\frac{\partial f}{\partial x}(x_0, y_0) := \lim_{x \to x_0} \frac{f(x, y_0) - f(x_0, y_0)}{x - x_0} \quad\text{and}\quad \frac{\partial f}{\partial y}(x_0, y_0) := \lim_{y \to y_0} \frac{f(x_0, y) - f(x_0, y_0)}{y - y_0}.$$
These can also be rewritten for an arbitrary point $(x, y)$ as
$$\frac{\partial f}{\partial x}(x, y) := \lim_{h \to 0} \frac{f(x + h, y) - f(x, y)}{h}, \qquad \frac{\partial f}{\partial y}(x, y) := \lim_{h \to 0} \frac{f(x, y + h) - f(x, y)}{h}.$$
The partial derivatives of familiar functions are computed in the same way as derivatives of functions of one variable, because the other variable is held fixed as a constant. Moreover, all the differentiation rules apply in the same way: the sum rule, product rule, quotient rule and chain rule.
Example 1.4.7. Let $f(x, y) = x^2 y\cos(xy) + x/(1 + y)$; then, considering $y$ fixed,
$$\frac{\partial f}{\partial x} = 2xy\cos(xy) - x^2 y\sin(xy)\,y + \frac{1}{1 + y}$$
and, keeping $x$ fixed,
$$\frac{\partial f}{\partial y} = x^2\cos(xy) - x^2 y\sin(xy)\,x - \frac{x}{(1 + y)^2}.$$
For functions of $n$ variables, the recipe is the same. Consider a function $f(x_1, \dots, x_n)$ and a point $(x_{10}, \dots, x_{n0}) \in \mathbb{R}^n$; then for $j = 1, \dots, n$ the partial derivative is
$$\frac{\partial f}{\partial x_j}(x_{10}, \dots, x_{n0}) = \lim_{x_j \to x_{j0}} \frac{f(x_{10}, \dots, x_j, \dots, x_{n0}) - f(x_{10}, \dots, x_{n0})}{x_j - x_{j0}}$$
if the limit on the right-hand side exists. Similarly, at an arbitrary point $(x_1, \dots, x_n)$ we have
$$\frac{\partial f}{\partial x_j}(x_1, \dots, x_n) = \lim_{h \to 0} \frac{f(x_1, \dots, x_j + h, \dots, x_n) - f(x_1, \dots, x_n)}{h}.$$
As explained above, the computation of partial derivatives in the case of $n$ variables follows the same rules as for $n = 2$.
Example 1.4.8. We compute the partial derivatives of $f(x, y, z) = (xy + z^2)e^{yz}$:
$$\frac{\partial f}{\partial x} = y e^{yz}, \qquad \frac{\partial f}{\partial y} = x e^{yz} + (xy + z^2)z e^{yz}, \qquad \frac{\partial f}{\partial z} = 2z e^{yz} + (xy + z^2)y e^{yz}.$$
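The partial derivatives of Example 1.4.8 can be compared against a difference quotient. The sketch below is only an illustration: it uses a central difference (an assumption made here for better accuracy, whereas the definition in the text uses a one-sided quotient and a limit), and the names `partial` and `grad_f` are hypothetical.

```python
import math

def f(x, y, z):
    return (x * y + z ** 2) * math.exp(y * z)

def grad_f(x, y, z):
    """Partial derivatives of f as computed by hand in Example 1.4.8."""
    e = math.exp(y * z)
    return (y * e,
            x * e + (x * y + z ** 2) * z * e,
            2 * z * e + (x * y + z ** 2) * y * e)

def partial(func, point, j, h=1e-6):
    """Central-difference approximation of the j-th partial derivative at point."""
    plus = list(point)
    minus = list(point)
    plus[j] += h     # only the j-th variable moves; the others stay fixed
    minus[j] -= h
    return (func(*plus) - func(*minus)) / (2 * h)

p = (0.5, -0.3, 0.8)
for j, exact in enumerate(grad_f(*p)):
    print(j, exact, partial(f, p, j))
```

The two columns printed for each variable should agree to several decimal places; a mismatch usually points to a slip in the hand computation.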
We now define an object known as the gradient, which is our first example of a differential operator.
Definition 1.4.9. Let $f(x_1, \dots, x_n)$ be a function of several variables in Cartesian coordinates for which partial derivatives with respect to all variables exist. The gradient of $f$ is defined as
$$\nabla f(x_1, \dots, x_n) = \left(\frac{\partial f}{\partial x_1}, \dots, \frac{\partial f}{\partial x_n}\right).$$
If a function is not specified, we write
$$\nabla := \left(\frac{\partial}{\partial x_1}, \dots, \frac{\partial}{\partial x_n}\right).$$
As we show in the examples above, the chain rule applies in the same way as for functions of one variable. However, there is an important formula concerning the chain rule for functions of several variables which generalizes the regular chain rule from elementary calculus. This formula is valuable for computing derivatives in concrete examples, but it is mostly used theoretically, and we refer to it often in the following chapters.
Proposition 1.4.10. Consider a function of several variables $g(x_1, \dots, x_n)$ and suppose that $x_j = \varphi_j(u_1, \dots, u_k)$ for $j = 1, \dots, n$. Define
$$f(u_1, \dots, u_k) := g(\varphi_1(u_1, \dots, u_k), \dots, \varphi_n(u_1, \dots, u_k)).$$
Then, for $i = 1, \dots, k$ we have
$$\frac{\partial f}{\partial u_i} = \frac{\partial g}{\partial x_1}\frac{\partial \varphi_1}{\partial u_i} + \cdots + \frac{\partial g}{\partial x_n}\frac{\partial \varphi_n}{\partial u_i},$$
where the partial derivatives of $g$ are evaluated at $x_j = \varphi_j(u_1, \dots, u_k)$ for $j = 1, \dots, n$.
The proof can be obtained as a special case of a more general chain rule which we present in Chapter 5, and we decide not to burden the presentation with lengthy calculations at this stage. Let us look at this formula for the cases of a few variables.
Example 1.4.11. Let $g(x, y)$ with $x = \varphi_1(u_1, u_2)$, $y = \varphi_2(u_1, u_2)$, and let $f(u_1, u_2) = g(\varphi_1(u_1, u_2), \varphi_2(u_1, u_2))$. Then
$$\frac{\partial f}{\partial u_1} = \frac{\partial g}{\partial x}\frac{\partial \varphi_1}{\partial u_1} + \frac{\partial g}{\partial y}\frac{\partial \varphi_2}{\partial u_1}.$$
It is customary to label the functions of $u$ with the variables $x$, $y$ instead of $\varphi_1$, $\varphi_2$. That is, we write $f(u_1, u_2) = g(x(u_1, u_2), y(u_1, u_2))$ and
$$\frac{\partial f}{\partial u_2} = \frac{\partial g}{\partial x}\frac{\partial x}{\partial u_2} + \frac{\partial g}{\partial y}\frac{\partial y}{\partial u_2}.$$
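We use the formula to compute the following partial derivatives.
Example 1.4.12. Let $f(x, y, z) = x^2 + y^2 + xz^2$ and consider the change of variables to spherical coordinates $x = x(\rho, \phi, \theta) = \rho\cos\theta\sin\phi$, $y = y(\rho, \phi, \theta) = \rho\sin\theta\sin\phi$ and $z = z(\rho, \phi, \theta) = \rho\cos\phi$. We consider $f(x(\rho, \phi, \theta), y(\rho, \phi, \theta), z(\rho, \phi, \theta))$ and compute
$$\frac{\partial f}{\partial \rho} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial \rho} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial \rho} + \frac{\partial f}{\partial z}\frac{\partial z}{\partial \rho},$$
where the partial derivatives of $f$ are evaluated at $x = \rho\cos\theta\sin\phi$, $y = \rho\sin\theta\sin\phi$ and $z = \rho\cos\phi$. We obtain
$$\begin{aligned}
\frac{\partial f}{\partial x} &= (2x + z^2)\big|_{x = x(\rho,\phi,\theta),\, z = z(\rho,\phi,\theta)} = 2\rho\cos\theta\sin\phi + \rho^2\cos^2\phi,\\
\frac{\partial f}{\partial y} &= 2y\,\big|_{y = y(\rho,\phi,\theta)} = 2\rho\sin\theta\sin\phi,\\
\frac{\partial f}{\partial z} &= 2xz\,\big|_{x = x(\rho,\phi,\theta),\, z = z(\rho,\phi,\theta)} = 2\rho^2\cos\theta\sin\phi\cos\phi,
\end{aligned}$$
and
$$\frac{\partial x}{\partial \rho} = \cos\theta\sin\phi, \qquad \frac{\partial y}{\partial \rho} = \sin\theta\sin\phi, \qquad \frac{\partial z}{\partial \rho} = \cos\phi.$$
Putting all those calculations together we obtain
$$\frac{\partial f}{\partial \rho} = 2\rho\cos^2\theta\sin^2\phi + 2\rho\sin^2\theta\sin^2\phi + 3\rho^2\cos\theta\sin\phi\cos^2\phi.$$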
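The result of Example 1.4.12 can be checked by differentiating the composition numerically with respect to $\rho$. This is a minimal sketch under the assumption that a central difference with a small step is accurate enough; all function names are hypothetical.

```python
import math

def f(x, y, z):
    return x ** 2 + y ** 2 + x * z ** 2

def to_cartesian(rho, phi, theta):
    return (rho * math.cos(theta) * math.sin(phi),
            rho * math.sin(theta) * math.sin(phi),
            rho * math.cos(phi))

def df_drho_formula(rho, phi, theta):
    """Closed form obtained with the chain rule in Example 1.4.12."""
    return (2 * rho * math.cos(theta) ** 2 * math.sin(phi) ** 2
            + 2 * rho * math.sin(theta) ** 2 * math.sin(phi) ** 2
            + 3 * rho ** 2 * math.cos(theta) * math.sin(phi) * math.cos(phi) ** 2)

def df_drho_numeric(rho, phi, theta, h=1e-6):
    """Central difference of the composition f(x(rho,..), y(rho,..), z(rho,..))."""
    up = f(*to_cartesian(rho + h, phi, theta))
    down = f(*to_cartesian(rho - h, phi, theta))
    return (up - down) / (2 * h)

args = (1.3, 0.7, 2.1)   # an arbitrary test point (rho, phi, theta)
print(df_drho_formula(*args), df_drho_numeric(*args))
```

The same pattern, with the step applied to $\theta$ or $\phi$ instead of $\rho$, gives a way to check the other two partial derivatives once they have been computed by hand.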
1.4.2 Open and closed sets
Before we begin our discussion on functions, it is important to notice that sometimes only some subset of Euclidean space may be of interest. Here is an important type of subset.
Definition 1.4.13. A subset $U \subset \mathbb{R}^n$ is open if for all points $p \in U$, there exists a ball $B_\delta(p) := \{x \in \mathbb{R}^n \mid \|x - p\| < \delta\}$ of radius $\delta > 0$ containing $p$ such that $B_\delta(p) \subset U$.
For $p \in \mathbb{R}^n$ and $\delta > 0$, a ball $B_\delta(p)$ is open. In $\mathbb{R}$, a ball is just an interval of length $2\delta$ centered at $p$. In $\mathbb{R}^2$ with the Euclidean norm, $B_\delta(p)$ is a disk of radius $\delta$ centered at $p$. In $\mathbb{R}^3$, $B_\delta(p)$ is a ball in the common sense of the word. In general, a ball of radius $\delta$ centered at $p$ is the set of points whose distance to $p$ is less than $\delta$.
Example 1.4.14. On the real line, let $a < b$ and consider an interval $(a, b)$. We verify using the definition that this interval is open. For each point $p$ in the interval $(a, b)$, we must find a ball $B_\delta(p)$ of a certain radius $\delta > 0$ such that $B_\delta(p) \subset (a, b)$.
Fig. 1.15. $B_\delta(p)$ is a disk of radius $\delta$.
The point $p$ is either closer to $a$, closer to $b$, or in the middle of the interval. Let $\epsilon$ be the smaller of the distances from $p$ to $a$ and from $p$ to $b$, that is, $\epsilon = \min(|p - a|, |p - b|)$. See Figure 1.16 for an illustration. Let $\delta = \epsilon/2$ and choose an arbitrary point $x \in B_\delta(p)$. By definition of $B_\delta(p)$, we must have $|x - p| < \delta$. Because $\delta$ is half the distance from $p$ to the nearest boundary point, this forces $a < x < b$. Therefore, any point $x \in B_\delta(p)$ is also in $(a, b)$. This means $B_\delta(p) \subset (a, b)$. Because $p$ is chosen arbitrarily in $(a, b)$, this completes the verification that $(a, b)$ is open.
Fig. 1.16. Three possible cases of location of $p$ in $(a, b)$: midpoint, right and left of midpoint
Consider the following subsets of $\mathbb{R}$: $(a, b]$, $[a, b)$ and $[a, b]$. These three sets are not open, and this can also be seen using the definition. The problem lies in the inclusion of at least one boundary point in the set. Consider the first case $(a, b]$. Suppose $p = b$; then any ball $B_\delta(p)$ with $\delta > 0$ contains points $x$ such that $b < x < b + \delta$. But this means $x \notin (a, b]$ and so $B_\delta(p) \not\subset (a, b]$ no matter how small $\delta > 0$ is chosen. Thus, $(a, b]$ is not open. The same argument applies to the other cases. In particular, if $a = b$ then $[a, b]$ consists of one point, and so singletons are not open sets.
Fig. 1.17. $(a, b]$ with $p = b$.
We now look at examples of sets in $\mathbb{R}^2$ and $\mathbb{R}^3$.
Example 1.4.15. Consider the set $O := \{(x, y) \in \mathbb{R}^2 \mid x > y\}$. This corresponds to the half-plane shown in Figure 1.18. We now show it is an open set. The key here is the strict inequality used in the definition of the set. Let $p = (x_0, y_0) \in O$. Then $x_0 > y_0$. Let $\epsilon$ be the distance between $p$ and the bisector line $x = y$. Because $x_0 > y_0$, we have $\epsilon > 0$. Choose $\delta = \epsilon/4$ and consider the ball
$$B_\delta(p) = \{(x, y) \in \mathbb{R}^2 \mid \|(x, y) - (x_0, y_0)\| < \delta\},$$
which is strictly contained in $B_\epsilon(p)$ and so strictly contained in $O$.
Fig. 1.18. Balls $B_\delta(p)$ from Example 1.4.15
Here is an example of an open set in higher dimension defined in an abstract way.
Example 1.4.16. Consider the motion of three (celestial) bodies in $\mathbb{R}^3$ of mass $m_1$, $m_2$, $m_3$. Let $q_1, q_2, q_3 \in \mathbb{R}^3$ be the position vectors of the three bodies. The set of non-collision positions defined by
$$C = \{(q_1, q_2, q_3) \in (\mathbb{R}^3)^3 \mid q_1 \neq q_2,\ q_2 \neq q_3 \ \text{and}\ q_1 \neq q_3\}$$
is an open subset of $(\mathbb{R}^3)^3$.
An interval $[a, b]$ containing its boundary points is known as a closed interval. The concept of a closed set is trickier to define than that of an open set or a closed interval. Even on the real line, closed sets can have bizarre properties which are beyond the scope of this document. We do not define closed sets formally, but focus our attention on a special type of set which is a generalization of the closed interval.
Example 1.4.17. Consider an open set $U$ in $\mathbb{R}^2$ and add the boundary to it to form a set $F$. The set $F$ is a closed set. For instance, let $U = \{(x, y) \in \mathbb{R}^2 \mid |x| + |y| < 1\}$. This is the diamond-shaped region shown in Figure 1.19. If we add the boundary, that is, the lines given by $|x| + |y| = 1$, the set $F = \{(x, y) \in \mathbb{R}^2 \mid |x| + |y| \le 1\}$ is a closed set.
One way to find out if a set is closed is by using the following result.
Proposition 1.4.18. A set $U \subset \mathbb{R}^n$ is open if and only if its complement $U^c$ is closed.
Fig. 1.19. The open set $U$ is inside the diamond. The closed set $F$ is composed of $U$ and includes the boundary lines.
We omit the proof of this proposition as it is also beyond the scope of this text.
Example 1.4.19. Proposition 1.4.18 shows that the complement of the set $O$ in Example 1.4.15, defined by $O^c = \{(x, y) \in \mathbb{R}^2 \mid x \le y\}$, is closed. In this case, the portion with no boundary is not a problem; infinity acts as a boundary to this set.
Exercises
(1) Find the domain for each of the following mappings.
(a) $f(x, y) = \dfrac{1}{x^2 - y^2}$.
(b) $g(r, \theta, z) = \sqrt{1 - r^2 - z^2}$. Describe the region geometrically.
(c) $h(\rho, \phi, \theta) = \dfrac{\rho^2 - 1}{1 - \cos\theta\sin\theta}$.
(d) $r(t) = (\sqrt{t^2 - 1}, \tan(t))$
(2) Compute all the partial derivatives and express the gradient of each function.
(a) $f(x, y) = \ln(\cos(xy))$
(b) $f(x, y) = \dfrac{xy}{x^2 + y^2}$
(c) $f(x, y, z) = (x^2 + y^2 + z^2)\cos\left(1/\sqrt{x^2 + y^2 + z^2}\right)$
(d) $f(x_1, x_2, x_3, x_4) = x_1 x_4 e^{x_1 x_2 x_3 x_4}$
(3) Compute the partial derivatives $\dfrac{\partial f}{\partial \theta}$ and $\dfrac{\partial f}{\partial \phi}$ in Example 1.4.12 using the formula of Proposition 1.4.10. Write explicitly the composition of $f$ with the functions of $\rho$, $\phi$ and $\theta$ and compute the partial derivatives with respect to $\rho$, $\phi$ and $\theta$ using the regular chain rule for functions of one variable. If you do not obtain the same thing with the two methods, verify your calculations.
(4) Determine whether the sets below are open, closed or neither.
(a) $A = \{(x, y) \in \mathbb{R}^2 \mid x^2 + y^2 \le 1,\ y > 0\}$
(b) $B = \{(x, y) \in \mathbb{R}^2 \mid -1 \le x \le 2,\ 3 \le y \le 4\}$
(c) $C = \{(x, y) \in \mathbb{R}^2 \mid x - y \neq 0\}$
(d) $D = \{(r, \theta) \in \mathbb{R}^2 \mid r > 1,\ 0 \le \theta \le \pi\}$
(e) $E = \{(r, \theta) \in \mathbb{R}^2 \mid 1 \le r < 2\}$
(5) The union of two open sets is an open set. Give an explanation (or a proof if you can) of why this is true.
(6) The intersection of two closed sets is a closed set. Give an explanation (or a proof if you can) of why this is true.
1.5 Parametric representation of curves
In this section, we begin the study of curves defined using so-called parametric representations. A curve $C$ in $\mathbb{R}^n$ is a geometric object whose points can be described by a single number, its parameter. Let $t \in \mathbb{R}$; a curve $C$ in $\mathbb{R}^n$ has a parametric representation given by $x_1(t), \dots, x_n(t)$ where $t \in [a, b]$ with $a < b$, and where $a = -\infty$ and $b = \infty$ are allowed.
Remark 1.5.1. Note that for a given curve $C$, the choice of parametric representation is not unique. Therefore, one has to distinguish between the geometric object $C$ and the possible representations that can describe $C$.
The graphical representation of the real line comes with an “orientation”: the arrow points to the right, and this determines the direction of increasing real numbers. The choice of pointing to the right is arbitrary; it is the convention adopted, possibly unanimously, and is called the positive orientation of the real line. If the arrow points to the left, so that numbers increase in that direction, we talk about negative orientation. An orientation of a curve $C$ is a consistent choice of direction, indicated by an arrow, along the length of $C$. For a curve $C$ with parametric representation $r(t)$ with $t \in [a, b]$, the orientation of $C$ is given by the direction of travel along $C$ as $t$ increases from $a$ to $b$.
Fig. 1.20. Parabola $(t, t^2)$ with $t \in [0, 1]$.
Example 1.5.2. Let $y = f(x)$ where $f : [a, b] \to \mathbb{R}$ is a function. The graph of $f$ is a curve $C$ in $\mathbb{R}^2$. A parametric representation is given by
$$x(t) := x_1(t) = t, \qquad y(t) := x_2(t) = f(t).$$
Consider the parabola $C$ given by $y = x^2$ defined for all $x \in \mathbb{R}$; it has parametric representation $x(t) = t$, $y(t) = t^2$ with $t \in \mathbb{R}$. Now, the portion of the parabola given by $y = x^2$ with $x \in [0, 1]$ is a different geometric object, which we denote $C_1$, and it has parametric representation $x(t) = t$, $y(t) = t^2$ with $t \in [0, 1]$. We can use parametric representations to describe more complicated curves which cannot be obtained from a single function $y = f(x)$.
Example 1.5.3. A circle of radius $r_0$ has equation $x^2 + y^2 = r_0^2$. A parametric representation is given by
$$x(t) = r_0\cos t, \qquad y(t) = r_0\sin t, \qquad t \in [0, 2\pi),$$
because substituting $x(t)$ and $y(t)$ in the equation yields an identity for all $t \in [0, 2\pi)$. If the equation of the circle is given in polar coordinates, $r = r_0$, then parametric equations are $r(t) = r_0$, $\theta(t) = t$, $t \in [0, 2\pi)$.
In some simple cases, it is possible to convert from a parametric representation of a curve to Cartesian coordinates in $\mathbb{R}^2$. The cases where either $x(t) = t$ or $y(t) = t$ are straightforward, as noticed above. For more complicated cases, the approach is to find functions of $x(t)$ and $y(t)$ from which we obtain an equality.
Example 1.5.4. Consider the curve $C$ given by $x(t) = 2t^2$, $y(t) = t^6$. Then
$$\frac{1}{8}x(t)^3 = t^6 = y(t),$$
and so $C$ is given by $x^3 = 8y$.
Although this approach is sometimes useful for graphing, our main emphasis in this textbook is on parametric representations and their properties. Let us look at a commonly presented example in $\mathbb{R}^3$.
Example 1.5.5. Consider the curve $C$ given by $x(t) = \cos t$, $y(t) = \sin t$ and $z(t) = t$ for $t \in [0, 4\pi]$. We see that in the $xy$-plane, this is just a circle, and in the $z$-direction, we have linear growth. The curve $C$ is called a helix and is illustrated in Figure 1.21.
Fig. 1.21. Helix curve of radius 1
We now look at an example with quite an intricate structure in $\mathbb{R}^3$.
Example 1.5.6. Consider the curve $C_1$ given by
$$x(t) = 4\cos t + \cos t\cos(3t), \qquad y(t) = 4\sin t + \sin t\cos(3t), \qquad z(t) = 2\sin(3t).$$
We see from Figure 1.22 (left) that we obtain a closed curve. Compare with the curve $C_2$ obtained by changing $3t$ to $\sqrt{2}t$ (Figure 1.22, right),
$$x(t) = 4\cos t + \cos t\cos(\sqrt{2}t), \qquad y(t) = 4\sin t + \sin t\cos(\sqrt{2}t), \qquad z(t) = 2\sin(\sqrt{2}t).$$
This curve lies on a surface which has the shape of a donut or bagel (depending on your taste!), known as a “torus”. In fact, the curve $C_1$ also lies on the same torus.
Fig. 1.22. Left: closed curve lying on a torus, Right: non-closed curve lying on a torus.
Parametric representations in higher dimensions commonly arise in many problems. However, their graphical representations can only be glimpsed by looking at projections to $\mathbb{R}^3$. Consider the following one, which is reminiscent of the previous example.
Example 1.5.7. Consider the curve $C$ in $\mathbb{R}^4$ given by the parametrization
$$x_1(t) = \cos t, \qquad x_2(t) = \sin t, \qquad x_3(t) = \cos\sqrt{2}t, \qquad x_4(t) = \sin\sqrt{2}t.$$
We see that the projections to the $x_1, x_2$ and $x_3, x_4$ planes are just circles, with periods of $2\pi$ and $2\pi/\sqrt{2}$ respectively. This kind of parametric curve describes the motion of a double pendulum subject to a small displacement. See Figure 1.23, obtained for $t \in [0, 100]$.
Fig. 1.23. Parametric curve of Example 1.5.7
One can think of parametric curves as describing the motion of a point particle in space. This leads to a subtle aspect of parametric curves with respect to their intersections as the following example shows.
Example 1.5.8. Two particles $A$ and $B$ travel along the paths given by $x_A(t) = t$, $y_A(t) = t^3$ and $x_B(t) = \cos t$, $y_B(t) = \sin t$. Do the particles collide? The paths traced by these parametric curves intersect in two points, as seen from Figure 1.24. However, in order to have the particles collide, one would need an intersection to occur for the same value $t_0$ on both paths. This is not the case in this particular example. To check this, one first needs a value $t_0$ such that $t_0 = \cos t_0$. By drawing the graphs of $t$ and $\cos t$, we see that there is exactly one such solution, $t_0 \approx 0.739$, because $t - \cos t$ is increasing. For this (approximate) solution, compute $t_0^3 - \sin t_0$ and verify that it is not zero.
Fig. 1.24. The curves have two intersection points.
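The check suggested in Example 1.5.8 can be carried out numerically. The sketch below uses bisection (a choice made here for simplicity; any root finder would do) to solve $t = \cos t$ on $[0, 1]$ and then evaluates $t_0^3 - \sin t_0$. The function name `bisect` is a hypothetical helper, not a library call.

```python
import math

def bisect(g, lo, hi, tol=1e-12):
    """Root of g in [lo, hi] by bisection; g(lo) and g(hi) must differ in sign."""
    g_lo = g(lo)
    if g_lo * g(hi) > 0:
        raise ValueError("g(lo) and g(hi) must have opposite signs")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if g_lo * g_mid <= 0:
            hi = mid                 # the sign change is in [lo, mid]
        else:
            lo, g_lo = mid, g_mid    # the sign change is in [mid, hi]
    return 0.5 * (lo + hi)

# Equal x-coordinates at the same time: t = cos t.
# t - cos t has derivative 1 + sin t >= 0, so this root is the only one.
t0 = bisect(lambda t: t - math.cos(t), 0.0, 1.0)

# Difference of the y-coordinates at that time: y_A(t0) - y_B(t0).
gap = t0 ** 3 - math.sin(t0)
print(t0, gap)
```

A value of `gap` clearly different from zero means the two particles have the same $x$-coordinate at time $t_0$ but different $y$-coordinates, so they never occupy the same point at the same time.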
1.5.1 Conics
We now look more closely at the planar curves known as conics. These are obtained geometrically by taking various cross-sections of a cone and have the following definitions in their simplest forms, for nonzero $a, b \in \mathbb{R}$:
(1) Ellipse:
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1.$$
(2) Hyperbola: Left-Right and Up-Down cases.
$$\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1 \quad\text{or}\quad \frac{y^2}{a^2} - \frac{x^2}{b^2} = 1.$$
(3) Parabola:
$$4ay = x^2 \quad\text{or}\quad 4ax = y^2.$$
Fig. 1.25. Ellipse with $a = 2$ and $b = 1$.
Fig. 1.26. Left-Right hyperbola with $a = b = 1$. The dashed lines $y = x$ and $y = -x$ are the asymptotes of the hyperbola.
For the ellipse, the larger of $a$ and $b$ represents the major axis and the other one the minor axis. The ellipse in Figure 1.25 has major axis $a = 2$ and minor axis $b = 1$. In the case of the hyperbola, $a$ is the distance between the origin and the vertices, which are the points of the hyperbola nearest to the centre. For $|x|$ large, the Left-Right hyperbola approaches the asymptotes of slope $\pm b/a$. The parabola $4ay = x^2$ opens up or down depending on whether $a > 0$ or $a < 0$. The case $4ax = y^2$ opens either left or right.
Fig. 1.27. Parabola $4ay = x^2$ with $a = 1/4$.
We now describe the conics using parametric representations. The first way one can do this is by solving for $y$ as a function of $x$. In the case of the ellipse, we have two functions
$$y = f_\pm(x) := \pm b\sqrt{1 - \frac{x^2}{a^2}},$$
and $f_\pm(x)$ is well-defined as a function for $x \in [-a, a]$. This leads to
$$x(t) = t, \qquad y(t) = \pm b\sqrt{1 - \frac{t^2}{a^2}}, \qquad t \in [-a, a].$$
Another parametrization, similar to the parametrization of the circle, is obtained as
$$x(t) = a\cos t, \qquad y(t) = b\sin t, \qquad t \in [0, 2\pi).$$
Indeed,
$$\frac{x(t)^2}{a^2} + \frac{y(t)^2}{b^2} = \frac{a^2\cos^2 t}{a^2} + \frac{b^2\sin^2 t}{b^2} = 1.$$
For the hyperbola, the $y = f(x)$ parametrization is done as above. Recall the hyperbolic functions
$$\cosh t = \frac{e^t + e^{-t}}{2} \quad\text{and}\quad \sinh t = \frac{e^t - e^{-t}}{2}.$$
Another parametrization of the hyperbola is given by
$$x(t) = a\cosh t, \qquad y(t) = b\sinh t, \qquad t \in \mathbb{R},$$
which traces the right-hand branch of the Left-Right hyperbola when $a > 0$. This justifies the name “hyperbolic function”.
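The trigonometric and hyperbolic parametrizations above can be tested by substituting sample points back into the equations of the conics. The following sketch does this for an ellipse and a Left-Right hyperbola with $a = 2$, $b = 1$; the function names and the sample values of $t$ are assumptions made only for the illustration.

```python
import math

def ellipse_point(a, b, t):
    return a * math.cos(t), b * math.sin(t)

def hyperbola_point(a, b, t):
    return a * math.cosh(t), b * math.sinh(t)

def ellipse_residual(a, b, x, y):
    """x^2/a^2 + y^2/b^2 - 1, which vanishes on the ellipse."""
    return x ** 2 / a ** 2 + y ** 2 / b ** 2 - 1

def hyperbola_residual(a, b, x, y):
    """x^2/a^2 - y^2/b^2 - 1, which vanishes on the Left-Right hyperbola."""
    return x ** 2 / a ** 2 - y ** 2 / b ** 2 - 1

a, b = 2.0, 1.0
for t in (0.0, 0.5, 2.0, 4.0):
    print(t,
          ellipse_residual(a, b, *ellipse_point(a, b, t)),
          hyperbola_residual(a, b, *hyperbola_point(a, b, t - 2.0)))
```

All residuals should be zero up to rounding. Since $\cosh t \ge 1$, every hyperbola point produced this way has $x \ge a$, so the left-hand branch needs $x(t) = -a\cosh t$ instead.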
Exercises
(1) Use computer software to draw the curves given by the following parametric representations.
(a) $x(t) = t^2$, $y(t) = \cos t$; $t \in [0, 2\pi]$
(b) $x(t) = t\cos t$, $y(t) = t\sin t$; $t \in [0, \pi]$
(c) $x(t) = t^2$, $y(t) = t^3$; $t \in [-2, 2]$
(2) Find the intersections of the curves in Exercise 1. (A difficult exercise!)
(3) Verify the statement of Example 1.5.8 by using the approach suggested.
(4) Transform the following equations of conics into their standard form and draw a rough sketch of the conic.
(a) $2x^2 + 5y^2 = 3$
(b) $4y^2 - x^2 = -3$
(5) Find the intersection points of the two conics given.
(a) $4x^2 + y^2 = 1$ and $x^2 + 4y^2 = 1$.
(b) $x^2 - 3y^2 = 1$ and $3x^2 + 5y^2 = 1$.
(c) $3y^2 - x^2 = 1$ and $y = 4x^2$.
(d) $9x^2 - 3y^2 = 3$ and $y^2 - \dfrac{x^2}{3^2} = 1$.
Fig. 1.28. Ellipsoid and Hyperbolic Paraboloid
Fig. 1.29. Cone and Elliptic Paraboloid
1.6 Quadrics
The quadrics are a family of surfaces in $\mathbb{R}^3$ which contains some well-known examples such as the cone and the ellipsoid. The general equation of a quadric is given by
$$\alpha x^2 + \beta y^2 + \gamma z^2 + 2\delta xy + 2\epsilon xz + 2\lambda yz + \kappa x + \mu y + \nu z + p = 0,$$
which covers the case of quadrics anywhere in space and in any orientation. We focus on the special cases where the quadrics are centered at the origin and have an orientation given by the $z$-axis. We list the quadrics in their simplest form below. They are the ones that are used in the remainder of the book. Let $a, b, c \in \mathbb{R}$ be nonzero:
(i) Ellipsoid: $\dfrac{x^2}{a^2} + \dfrac{y^2}{b^2} + \dfrac{z^2}{c^2} = 1$.
(ii) Hyperboloid of one sheet: $\dfrac{x^2}{a^2} + \dfrac{y^2}{b^2} - \dfrac{z^2}{c^2} = 1$.
(iii) Hyperboloid of two sheets: $-\dfrac{x^2}{a^2} - \dfrac{y^2}{b^2} + \dfrac{z^2}{c^2} = 1$.
(iv) Elliptic paraboloid: $\dfrac{x^2}{a^2} + \dfrac{y^2}{b^2} - z = 0$.
(v) Hyperbolic paraboloid: $\dfrac{x^2}{a^2} - \dfrac{y^2}{b^2} - z = 0$.
(vi) Cone: $\dfrac{z^2}{c^2} = \dfrac{x^2}{a^2} + \dfrac{y^2}{b^2}$.
Fig. 1.30. Hyperboloid of one sheet and Hyperboloid of two sheets
Note that all the quadrics are aligned along the $z$-axis in the notation above. However, these quadrics can also be aligned along the $x$ and $y$ axes, and the equations are the same up to a permutation of $x$, $y$ and $z$. In the case of the elliptic paraboloid, one obtains
$$\frac{y^2}{a^2} + \frac{z^2}{b^2} - x = 0 \quad\text{and}\quad \frac{x^2}{a^2} + \frac{z^2}{b^2} - y = 0.$$
The geometry of the quadrics can be understood by taking intersections with planes. We define horizontal planes by setting $z = K_3$, vertical planes parallel to the $xz$-plane by setting $y = K_2$, and vertical planes parallel to the $yz$-plane by setting $x = K_1$, for some $K_1, K_2, K_3 \in \mathbb{R}$ acting as parameters which we can vary. The intersections of quadrics with these planes are called traces. The following examples illustrate the traces of some quadrics.
Example 1.6.1. Consider the ellipsoid
$$\frac{x^2}{9} + \frac{y^2}{4} + z^2 = 1.$$
For the traces given by $z = K_3$, we obtain
$$\frac{x^2}{9} + \frac{y^2}{4} = 1 - K_3^2,$$
which is the equation of an ellipse as long as $1 - K_3^2 > 0$, and reduces to a single point when $K_3 = \pm 1$. Thus, the ellipses get smaller as $|z|$ increases from zero.
Example 1.6.2. Consider the hyperbolic paraboloid
$$\frac{x^2}{2} - \frac{y^2}{10} - z = 0$$
and take all three types of traces. Setting $z = K_3$ with $K_3 > 0$, we can rewrite the equation as
$$\frac{x^2}{(\sqrt{2K_3})^2} - \frac{y^2}{(\sqrt{10K_3})^2} = 1,$$
which is the equation of a hyperbola. Setting $x = K_1$ we have
$$\frac{K_1^2}{2} - \frac{y^2}{10} - z = 0,$$
which becomes
$$z = -\frac{y^2}{10} + \frac{K_1^2}{2}.$$
This is the equation of a parabola opening downward with vertex at $z = K_1^2/2$. Finally, setting $y = K_2$ yields
$$z = \frac{x^2}{2} - \frac{K_2^2}{10},$$
which is also a parabola, now opening upward and with vertex at $z = -K_2^2/10$. Those are shown in Figure 1.31.
Fig. 1.31. A hyperbolic paraboloid with its traces: a hyperbola for the horizontal trace and two parabolae, one for each vertical trace.
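The traces of the hyperbolic paraboloid $z = x^2/2 - y^2/10$ of Example 1.6.2 can be probed numerically. The sketch below evaluates the surface at the vertex of each vertical trace and at a point of a horizontal trace with $K_3 > 0$; the sample values of $K_1$, $K_2$, $K_3$ and the helper names are assumptions chosen for the illustration.

```python
import math

def z_surface(x, y):
    """Height of the hyperbolic paraboloid x^2/2 - y^2/10 - z = 0."""
    return x * x / 2 - y * y / 10

def horizontal_trace_point(k3, s):
    """Point (x, y) on the branch x > 0 of the trace z = k3, for k3 > 0."""
    return math.sqrt(2 * k3) * math.cosh(s), math.sqrt(10 * k3) * math.sinh(s)

k1, k2, k3 = 1.5, -2.0, 0.7

# Horizontal trace: the height should come back as k3.
x, y = horizontal_trace_point(k3, 0.4)
print(z_surface(x, y), k3)

# Trace x = k1: downward parabola in y, highest point at y = 0.
print(z_surface(k1, 0.0), k1 ** 2 / 2)

# Trace y = k2: upward parabola in x, lowest point at x = 0.
print(z_surface(0.0, k2), -k2 ** 2 / 10)
```

Each printed pair should agree, which gives a direct way to confirm the sign of the vertex height in each vertical trace.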
Often, in applications, it is necessary to determine the curves of intersections of quadric surfaces. This is done by equating the equations for the quadrics and performing some manipulations. The following examples illustrate this procedure.
Example 1.6.3. We find the intersection of the cone $z^2 = x^2 + \dfrac{y^2}{3}$ and the ellipsoid $x^2 + 2y^2 + 3z^2 = 1$. Isolating $z^2$ in both equations, we then have
$$x^2 + \frac{y^2}{3} = \frac{1}{3}(1 - x^2 - 2y^2),$$
which simplifies to
$$4x^2 + 3y^2 = 1 \quad\Rightarrow\quad \frac{x^2}{\left(\frac{1}{2}\right)^2} + \frac{y^2}{\left(\frac{1}{\sqrt{3}}\right)^2} = 1;$$
an ellipse with minor axis $a = 1/2$ and major axis $b = 1/\sqrt{3}$. This ellipse is the projection of the intersection onto the $xy$-plane; the intersection itself consists of the points above and below it with $z = \pm\sqrt{x^2 + y^2/3}$.
Quadrics can be expressed in the form of one or two functions of several variables $z = f(x, y)$. The elliptic and hyperbolic paraboloids are written, respectively,
$$z = \frac{x^2}{a^2} + \frac{y^2}{b^2} \quad\text{and}\quad z = \frac{x^2}{a^2} - \frac{y^2}{b^2}.$$
The remaining quadrics are obtained using square roots and so are expressed using two functions. For instance, the hyperboloid of two sheets is given by
$$z = \pm c\sqrt{1 + \frac{x^2}{a^2} + \frac{y^2}{b^2}}.$$
Note that it is sometimes more convenient to isolate either $x$ or $y$, rather than $z$.
Example 1.6.4. The surface obtained by rotating the line $x = 2y$ about the $x$-axis is a cone. The equation of this cone is obtained as follows. For a fixed value $x \neq 0$, the trace is a circle of radius $|x|/2$ which projects to the $yz$-plane. The equation of this circle is
$$\left(\frac{x}{2}\right)^2 = y^2 + z^2,$$
which we rewrite as
$$x^2 = \frac{y^2}{(1/2)^2} + \frac{z^2}{(1/2)^2} \quad\text{or}\quad x = \pm\sqrt{\frac{y^2}{(1/2)^2} + \frac{z^2}{(1/2)^2}}.$$
Figure 1.32 shows the cone along with the line $x = 2y$.
1.6.1 Cylinders

The concept of cylinder is a familiar one. One can describe it as a surface with constant horizontal trace given by a circle of fixed radius. The equation in this case is
$$x^2 + y^2 = r^2,$$
where $r > 0$ is the radius of the cylinder. This definition can be generalized to any curve in the plane. Let $C$ be a curve in the $xy$-plane; then a cylinder over $C$ is the surface with constant horizontal trace given by $C$.
Fig. 1.32. Cone obtained from revolving the line x = 2y around the x axis.
(1) The curve $y = x^2$ defines a parabolic cylinder.
(2) Note that cylinders do not need to be defined in the $xy$-plane. Consider the parametric curve $x(t) = 0$, $y(t) = t$ and $z(t) = e^{-t^3}$ with $t \in [-1, 1]$. See Figure 1.33.
Fig. 1.33. Cylinder defined by the parametric curve $y(t) = t$, $z(t) = e^{-t^3}$ in the $yz$-plane.

The intersections of cylinders and quadrics also occur frequently in applications and define curves which are often better understood using parametric representations.

Example 1.6.5. We find the intersection curve of the parabolic cylinder $y = x^2$ and the top half of the ellipsoid $x^2 + 3y^2 + 3z^2 = 9$ and describe it using its parametric equations. The top half of the ellipsoid is obtained for $z \geq 0$. To obtain the intersection curve, we substitute $y = x^2$ into the equation of the ellipsoid:
$$y + 3y^2 + 3z^2 = 9.$$
By dividing by 3 and completing the square for $y$, we obtain that this equation has the form
$$\left(y + \frac{1}{6}\right)^2 + z^2 = 3 + \frac{1}{36}.$$
This equation describes a circle of radius $\sqrt{3 + \frac{1}{36}}$ with centre at $(y, z) = (-\frac{1}{6}, 0)$. We can solve for $z$ and keep only the $+$ solution because we are interested in the top half of the ellipsoid:
$$z = \sqrt{\frac{109}{36} - \left(y + \frac{1}{6}\right)^2}. \tag{1.6}$$
The expression under the square root must be nonnegative, so $\frac{109}{36} - \left(y + \frac{1}{6}\right)^2 \geq 0$. We can now describe the intersection curve in parametric form. Let $x(t) = t$, and since $y = x^2$ then $y(t) = t^2$. Equation (1.6) completes the description. We have
$$x(t) = t, \qquad y(t) = t^2, \qquad z(t) = \sqrt{\frac{109}{36} - \left(t^2 + \frac{1}{6}\right)^2}$$
Fig. 1.34. Ellipsoid and cylinder of Example 1.6.5 with its intersection curve.
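with the domain of $t$ obtained by isolating $t$ in $\frac{109}{36} - \left(t^2 + \frac{1}{6}\right)^2 \geq 0$. This yields
$$-\sqrt{\sqrt{\frac{109}{36}} - \frac{1}{6}} \;\leq\; t \;\leq\; \sqrt{\sqrt{\frac{109}{36}} - \frac{1}{6}}.$$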
See Figure 1.34.
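As a sanity check on Example 1.6.5, the following sketch evaluates the parametric curve at several parameter values and measures how far each point is from the parabolic cylinder and from the ellipsoid. The names `T_MAX`, `intersection_point` and `residuals` are assumptions made for this illustration.

```python
import math

# Largest admissible |t|, from t^2 <= sqrt(109/36) - 1/6.
T_MAX = math.sqrt(math.sqrt(109 / 36) - 1 / 6)

def intersection_point(t):
    """Point (x, y, z) of the curve of Example 1.6.5 for |t| <= T_MAX."""
    y = t * t
    radicand = 109 / 36 - (y + 1 / 6) ** 2
    # Clamp tiny negative round-off at the endpoints of the domain.
    z = math.sqrt(max(radicand, 0.0))
    return (t, y, z)

def residuals(point):
    """Signed defects in y = x^2 and in x^2 + 3y^2 + 3z^2 = 9."""
    x, y, z = point
    return (y - x * x, x * x + 3 * y * y + 3 * z * z - 9)

for k in range(-4, 5):
    t = T_MAX * k / 4
    print(round(t, 4), residuals(intersection_point(t)))
```

Both defects should be at round-off level for every admissible $t$, and $z(t)$ should drop to zero at the two endpoints of the domain.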
Exercises
(1) Write the following quadrics in their standard form and identify the quadric (note that the $x, y, z$ might be permuted with respect to the equations given at the beginning of the section).
    (a) $2x^2 - y^2 + z^2 = -3$
    (b) $x - y^2 - 2z^2 = 0$
    (c) $-2z^2 - x^2 + 4y^2 = 0$
    (d) $\frac{x^2}{2^2} - y^2 + 3z^2 = 2$
    (e) $-x^2 - 4y^2 - 5z^2 = -2$
(2) Find the equations of the traces of the quadrics of Exercise 1.
(3) Consider the quadric $Q$ with traces given by two families of hyperbolae and one family of circles defined for all values of the constant $K$. Which quadric is $Q$?
(4) Consider the quadric $Q$ with traces given by two families of parabolae and one family of circles. Which quadric is $Q$?
(5) Find the curve of intersection of the cone $z^2 = 2x^2 + y^2$ with the hyperboloid of two sheets $-4x^2 - \frac{y^2}{3} + z^2 = 1$. What is this curve?
(6) Find the curve of intersection of the paraboloid $3x^2 + y^2 + z^2 = 1$ and the hyperboloid of one sheet $x^2 + 3y^2 - z^2 = 1$. What is this curve?
(7) Find the curve of intersection of the elliptic paraboloid $z = 3x^2 + 2y^2$ and the hyperbolic paraboloid $z = x^2 - y^2$. What is this curve?
(8) Consider the cylinders given by $x = y^2$ and $y^2 + z^2 = 4$. Find the intersection curve of those surfaces and write the result in parametric form.
(9) Consider the cylinder given by the curve $x(t) = t^3$, $y(t) = t^2$ and the cone $z^2 = x^2 + y^2$. Write the intersection curve in parametric form.
2 Calculus of Vector Functions

The previous chapter showed how a curve $C$ can be expressed in terms of a parametric representation $x_1(t), \ldots, x_n(t)$ with $t \in [a, b]$. A convenient way to write parametric representations is using vector functions $\mathbf{r} : \mathbb{R} \to \mathbb{R}^n$ of the form $\mathbf{r}(t) = (x_1(t), \ldots, x_n(t))$. In this chapter, we show how to apply calculus techniques to curves, and this is done more conveniently by using vector functions rather than parametric representations.

2.1 Derivatives and Integrals

Consider a curve $C$ with parametrization $\mathbf{r}(t)$. Examples from the previous chapter show that many of the curves defined are quite “smooth” in the sense that there are no sharp corners. In fact, they are also continuous in the sense that there is no jump in the tracing of the curve as one follows it. The concepts of continuity and smoothness (i.e. derivatives) for functions $f : [a, b] \to \mathbb{R}$ are among the main topics of an introductory course on Calculus. We show in this section how these concepts extend to curves, but with some warnings!
2.1.1 Limits and Continuity

To discuss limits of vector functions, consider first the case of $\mathbf{r}(t) = (t, f(t))$ with $t \in [a, b]$. These correspond to functions $y = f(x)$ with $x \in [a, b]$. Recall that $f$ has a limit $L \in \mathbb{R}$ at some $x_0 \in [a, b]$ if
$$\lim_{x \to x_0} f(x) = L. \tag{2.1}$$
For the convenience of the reader, recall that the exact definition of (2.1) is that
$$\forall \epsilon > 0,\ \exists \delta > 0 \ \text{such that if}\ 0 < |x - x_0| < \delta \ \text{then}\ |f(x) - L| < \epsilon.$$
The vector function $\mathbf{r}(t) = (t, f(t))$ describes the same curve as $y = f(x)$. Consider $x(t) = t$ and $y(t) = f(t)$ separately. Then, the limits as $t \to t_0$ exist in both cases:
$$\lim_{t \to t_0} x(t) = \lim_{t \to t_0} t = t_0 \quad\text{and}\quad \lim_{t \to t_0} y(t) = \lim_{t \to t_0} f(t) = L.$$
Therefore, it makes sense to define
$$\lim_{t \to t_0} \mathbf{r}(t) = \left(\lim_{t \to t_0} x(t), \lim_{t \to t_0} y(t)\right) = (t_0, L).$$
The definition of limit for general vector functions $\mathbf{r}(t)$ follows the same approach.

Definition 2.1.1. Consider the vector function $\mathbf{r}(t) = (x_1(t), \ldots, x_n(t))$ for $t \in [a, b]$. Then, the limit of $\mathbf{r}(t)$ at $t_0 \in [a, b]$ exists if
$$\lim_{t \to t_0} x_j(t) = L_j \in \mathbb{R}$$
for all $j = 1, \ldots, n$. Thus,
$$\lim_{t \to t_0} \mathbf{r}(t) = (L_1, \ldots, L_n).$$
Note that for $t_0 = a$ or $t_0 = b$ we take the one-sided limit: $t \to t_0^+$ or $t \to t_0^-$ respectively.
We now look at several examples.

Example 2.1.2. This example is similar to a function $y = f(x)$ with a jump discontinuity. Let
$$y(t) = \begin{cases} -1, & t \in [-1, 0) \\ \phantom{-}1, & t \in [0, 1], \end{cases}$$
then the vector function
$$\mathbf{r}(t) = (t, y(t), t^2), \qquad t \in [-1, 1],$$
has two separate pieces depending on whether $t \in [-1, 0)$ or $t \in [0, 1]$; see Figure 2.1.
Fig. 2.1. Discontinuous vector function
With the above definition of limit, we can now discuss the concept of continuity.

Definition 2.1.3 (Continuity). A vector function $\mathbf{r}(t)$ with $\mathbf{r} : [a, b] \to \mathbb{R}^n$ is continuous at $t_0 \in [a, b]$ if $\mathbf{r}(t_0) = \mathbf{r}_0 \in \mathbb{R}^n$ and
$$\lim_{t \to t_0} \mathbf{r}(t) = \mathbf{r}_0.$$
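In Example 2.1.2, $\mathbf{r}(0) = (0, 1, 0)$, and
$$\lim_{t \to 0^-} (t, y(t), t^2) = (0, -1, 0) \quad\text{and}\quad \lim_{t \to 0^+} (t, y(t), t^2) = (0, 1, 0),$$
so the limit does not exist and $\mathbf{r}(t)$ is not continuous at $t = 0$. Consider instead the vector function $\mathbf{r}(t) = (t^2, t^3)$. Then, $\mathbf{r}(t_0) = (t_0^2, t_0^3)$ and
$$\lim_{t \to t_0} (t^2, t^3) = (t_0^2, t_0^3),$$
so $\mathbf{r}(t)$ is continuous at every $t_0$.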
In most of our examples, the vector functions are continuous.
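The jump in Example 2.1.2 can be observed numerically by evaluating the vector function slightly to the left and to the right of $t = 0$. This is only an illustration of the one-sided limits, not a proof; the function names `y` and `r` mirror the notation of the example.

```python
def y(t):
    """Jump function of Example 2.1.2: -1 on [-1, 0) and 1 on [0, 1]."""
    return -1.0 if t < 0 else 1.0

def r(t):
    """Vector function r(t) = (t, y(t), t^2)."""
    return (t, y(t), t * t)

# Approach t = 0 from both sides with shrinking offsets h.
for h in (1e-1, 1e-3, 1e-6):
    print(h, "left:", r(-h), "right:", r(h))
```

The first and third components approach 0 from both sides, while the second component stays at $-1$ on the left and at $1$ on the right, so the componentwise limits disagree.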
2.1.2 Derivatives, Smoothness and Integrals

We begin with the definition of derivative for a vector function.

Definition 2.1.4. A vector function $\mathbf{r}(t)$ is differentiable at $t = t_0$ if
$$\lim_{t \to t_0} \frac{\mathbf{r}(t) - \mathbf{r}(t_0)}{t - t_0}$$
exists. The limit is denoted by $\mathbf{r}'(t_0)$ and called the derivative of $\mathbf{r}(t)$ at $t_0$. An equivalent definition is obtained by setting $t = t_0 + h$, then
$$\mathbf{r}'(t_0) = \lim_{h \to 0} \frac{\mathbf{r}(t_0 + h) - \mathbf{r}(t_0)}{h}.$$
This form of the definition is useful when one needs to obtain a general formula for the derivative of $\mathbf{r}(t)$ independent of $t_0$. A vector function $\mathbf{r}(t)$ defined for $t \in [a, b]$ is differentiable on $(a, b)$ if $\mathbf{r}(t)$ is differentiable at each $t \in (a, b)$. As one may expect, if each coordinate function of $\mathbf{r}(t)$ is differentiable on its interval of definition then $\mathbf{r}(t)$ is differentiable. This is expressed in the following result.

Proposition 2.1.5. A vector function $\mathbf{r}(t) = (x_1(t), \ldots, x_n(t))$ is differentiable on $(a, b)$ if and only if $x_j(t)$ is differentiable on $(a, b)$ for $j = 1, \ldots, n$.

Proof. The proof can be done as a single calculation as follows. Let $t_0 \in (a, b)$ and we write
$$\lim_{t \to t_0} \frac{\mathbf{r}(t) - \mathbf{r}(t_0)}{t - t_0} = \lim_{t \to t_0} \frac{(x_1(t) - x_1(t_0), \ldots, x_n(t) - x_n(t_0))}{t - t_0} = \lim_{t \to t_0} \left(\frac{x_1(t) - x_1(t_0)}{t - t_0}, \ldots, \frac{x_n(t) - x_n(t_0)}{t - t_0}\right).$$
So, if the left-hand side limit exists, then the right-hand side limit in the last row must exist for each component. Similarly, if the right-hand side limit exists for each component, then the left-hand side limit exists.
From Proposition 2.1.5, we see that the derivative of vector functions depends completely on each coordinate function and it is known from an introductory calculus course how to compute derivatives of functions $x_j : [a, b] \to \mathbb{R}$. We know that $\mathbf{r}(t) = (t, |t|)$ is not differentiable at $t = 0$ and this is a consequence of the corner of the function $|t|$ at $t = 0$. However, vector functions are different from functions $y = f(x)$ because the curve $C$ defined by a vector function can have all its coordinates differentiable, but still have a corner as the next example shows.

Example 2.1.6. Consider
$$\mathbf{r}(t) = (t^2, t^3), \qquad t \in [-1, 1].$$
Then, $x(t) = t^2$ and $y(t) = t^3$ are differentiable, but Figure 2.2 clearly shows a corner at $t = 0$. We see that $x(t)^3 = t^6 = y(t)^2$, so $C$ corresponds to the curve given by $y^2 = x^3$, called a cusp curve.
Fig. 2.2. The cusp curve has a sharp corner at (0, 0)
Here is an example without a corner.

Example 2.1.7. Consider $\mathbf{r}(t) = (t, t^2, e^t)$ with $t \in [-1, 1]$. Figure 2.3 shows the curve $C$ corresponding to this vector function. The fact that there are no corners at any point shows a greater degree of “smoothness” than the previous example.

Therefore, the concept of derivative and the smoothness of the curve $C$ corresponding to the vector function are not equivalent. We now discuss the question of smoothness of the curve associated with a differentiable vector function $\mathbf{r}(t)$. We define smoothness here and show in the following section that smoothness at a point $p$ of a curve $C$ corresponds to the existence of a line tangent to $C$ at $p$.
Fig. 2.3. Curve with no corner.
Definition 2.1.8. A curve $C$ has a smooth parametrization given by a vector function $\mathbf{r}(t)$ for $t \in [a, b]$ if $\mathbf{r}'(t) \neq 0$ for all $t \in (a, b)$. If $C$ has a smooth parametrization, then we say that $C$ is a smooth curve.

Clearly, the cusp curve is not smooth at $t = 0$. This next example has a non-obvious corner point.

Example 2.1.9. Consider the curve $\mathbf{r}(t) = (t^2, t^2)$ with $t \in [-1, 1]$. This curve starts at $(1, 1)$ for $t = -1$ and evolves down the diagonal to $(0, 0)$ and returns on itself until it reaches $(1, 1)$ at $t = 1$. In order to return on its path, the curve must stop at $t = 0$ and indeed, $\mathbf{r}'(0) = (0, 0)$. Therefore, this is not a smooth curve.
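The smoothness condition $\mathbf{r}'(t) \neq 0$ can be probed numerically. The sketch below approximates $\mathbf{r}'(t)$ with central differences (a numerical stand-in for the exact derivative, chosen here for brevity) and compares the speed at $t = 0$ for the curves of Examples 2.1.6, 2.1.7 and 2.1.9. All function names are illustrative.

```python
import math

def derivative(r, t, h=1e-6):
    """Central-difference approximation of r'(t), component by component."""
    forward, backward = r(t + h), r(t - h)
    return tuple((f - b) / (2 * h) for f, b in zip(forward, backward))

def speed(r, t):
    """Approximate norm of r'(t)."""
    return math.sqrt(sum(c * c for c in derivative(r, t)))

def cusp(t):            # Example 2.1.6
    return (t ** 2, t ** 3)

def smooth_curve(t):    # Example 2.1.7
    return (t, t ** 2, math.exp(t))

def doubled_back(t):    # Example 2.1.9
    return (t ** 2, t ** 2)

for name, curve in (("cusp", cusp), ("smooth", smooth_curve), ("doubled back", doubled_back)):
    print(name, speed(curve, 0.0))
```

The cusp and the doubled-back diagonal both have vanishing speed at $t = 0$, whereas the curve of Example 2.1.7 has speed close to $\sqrt{2}$ there.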
Fig. 2.4. A piecewise smooth curve with five pieces and non-smooth points at t1 , t2 , t3 , t4 .
A curve $C$ is piecewise smooth if there exists a parametrization given by $\mathbf{r}(t)$ such that $\mathbf{r}'(t) = 0$ only at a finite number of points $t_1, \ldots, t_k \in (a, b)$. Most of the examples presented in this book are at least piecewise smooth. Figure 2.4 sketches a piecewise smooth curve. A piecewise smooth curve $C$ must often be given by distinct vector functions, as the next example shows. But this is not always the case, as Example 2.1.9 shows.
Fig. 2.5. Curve C of Example 2.1.10
Example 2.1.10. Consider the curve $C$ which is the triangle with vertices at $(0, 0)$, $(2, 0)$ and $(0, 1)$. We obtain a vector function for $C$ (with a consistent orientation) by writing a parametrization of each side of the triangle:
$$\mathbf{r}(t) = \begin{cases} (2t, 0), & t \in [0, 1] \\ (2 - 2t, t), & t \in [0, 1] \\ (0, 1 - t), & t \in [0, 1]. \end{cases}$$
See Figure 2.5.

Computing the integral of a vector function is a straightforward process. Let $\mathbf{r}(t)$ with $t \in [a, b]$ be a continuous vector function with $\mathbf{r}(t) = (x_1(t), \ldots, x_n(t))$, then
$$\int_a^b \mathbf{r}(t)\, dt = \left(\int_a^b x_1(t)\, dt, \ldots, \int_a^b x_n(t)\, dt\right).$$
Example 2.1.11. Let $\mathbf{r}(t) = (t^2, \cos(2\pi t))$ with $t \in [0, 1]$, then
$$\int_0^1 \mathbf{r}(t)\, dt = \left(\int_0^1 t^2\, dt, \int_0^1 \cos(2\pi t)\, dt\right) = \left(\tfrac{1}{3}, 0\right).$$
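Componentwise integration translates directly into code. The sketch below integrates each coordinate of Example 2.1.11 with the composite Simpson rule; the choice of Simpson's rule and the names `simpson` and `integrate_vector` are assumptions made for this illustration.

```python
import math

def simpson(f, a, b, n=200):
    """Composite Simpson approximation of the integral of f over [a, b]."""
    if n % 2:
        n += 1  # Simpson's rule needs an even number of subintervals
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def integrate_vector(r, a, b, n=200):
    """Integrate a vector function one coordinate at a time."""
    dim = len(r(a))
    return tuple(simpson(lambda t, j=j: r(t)[j], a, b, n) for j in range(dim))

def r(t):
    return (t ** 2, math.cos(2 * math.pi * t))

print(integrate_vector(r, 0.0, 1.0))  # close to (1/3, 0)
```

The same helper works for any number of coordinates, since the integral of a vector function is defined coordinate by coordinate.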
2.1.3 Position and velocity of particles

The use of vector functions is crucial in physics (especially Newtonian mechanics) and in this section we present the main interpretations of the concepts of the previous section in this context. As an object moves in space, its velocity is the instantaneous rate of change of its position. If the position is given by a parametric representation, we have the following definition.

Definition 2.1.12. Let $\mathbf{r}(t)$ be the vector function describing the motion of a particle $p$. The velocity vector of $p$ is $\mathbf{r}'(t)$ and the speed is given by $\|\mathbf{r}'(t)\|$.
The acceleration is the rate of change of velocity, which leads to the following definition.

Definition 2.1.13. Let $\mathbf{r}(t)$ be the parametrization describing the motion of a particle $p$. The acceleration vector of $p$ is $\mathbf{r}''(t)$.

Let us consider some examples.

Example 2.1.14. Suppose that an object’s position is given by $\mathbf{r}(t) = (t, t, t(2 - t))$ with $t \in [0, 2]$. The velocity vector is $\mathbf{r}'(t) = (1, 1, 2 - 2t)$ and the speed is
$$\|\mathbf{r}'(t)\| = \sqrt{1^2 + 1^2 + (2 - 2t)^2} = \sqrt{6 - 8t + 4t^2}.$$
The acceleration vector is $\mathbf{r}''(t) = (0, 0, -2)$.

We can also use integration of vector functions to solve the inverse problem.

Example 2.1.15. Suppose that an object has acceleration vector $\mathbf{a}(t) = (1, t, t^2)$ and has initial position vector $(0, 0, 0)$ and initial velocity vector $(1, 0, 0)$. Integrating the acceleration vector gives the velocity vector:
$$\mathbf{r}'(t) = \int \mathbf{a}(t)\, dt = \left(\int 1\, dt, \int t\, dt, \int t^2\, dt\right) = \left(t + c_1, \tfrac{1}{2}t^2 + c_2, \tfrac{1}{3}t^3 + c_3\right) = \left(t, \tfrac{1}{2}t^2, \tfrac{1}{3}t^3\right) + (c_1, c_2, c_3).$$
We know $\mathbf{r}'(0) = (c_1, c_2, c_3) = (1, 0, 0)$, which implies $c_1 = 1$, $c_2 = c_3 = 0$. So,
$$\mathbf{r}'(t) = \left(t + 1, \frac{t^2}{2}, \frac{t^3}{3}\right).$$
Integrating the velocity vector:
$$\mathbf{r}(t) = \left(\frac{t^2}{2} + t + k_1, \frac{t^3}{6} + k_2, \frac{t^4}{12} + k_3\right).$$
Then $\mathbf{r}(0) = (k_1, k_2, k_3) = (0, 0, 0)$ implies
$$\mathbf{r}(t) = \left(\frac{t^2}{2} + t, \frac{t^3}{6}, \frac{t^4}{12}\right).$$

Newton’s second law relates the force vector $\mathbf{F}$ acting on an object with the acceleration vector $\mathbf{a}$ in this famous formula:
$$\mathbf{F} = m\mathbf{a},$$
where $m$ is the mass of the object.

Example 2.1.16. Consider a ball with mass $m$ thrown from the ground at an angle $\psi$ and with initial speed $v_0$. If the only external force acting on the ball is the gravitational force $F_g$ with acceleration $g = 9.8\,\mathrm{m/s^2}$ in the $(0, 0, -1)$ direction, we find the position $\mathbf{r}(t)$ of the ball and the angle that maximizes the horizontal distance traveled. We assume that at time $t = 0$, the ball is at the origin $\mathbf{r}(0) = (0, 0, 0)$ and we suppose that the motion of the ball happens completely inside the $yz$-plane. Then,
Fig. 2.6. Projections on the $y$ and $z$ axes of the initial velocity $\mathbf{r}'(0)$.
the initial velocity vector is given by $\mathbf{r}'(0) = (0, v_0 \cos\psi, v_0 \sin\psi)$. Because the gravitational force $F_g = 9.8\,m\,(0, 0, -1)$ is the only one acting on the ball, Newton’s second law gives the acceleration
$$\mathbf{a}(t) = \frac{F_g}{m} = 9.8\,(0, 0, -1),$$
which does not depend on the mass. Integrating the acceleration, we obtain the velocity vector $\mathbf{r}'(t) = (c_1, c_2, -9.8\,t + c_3)$ and the values of the constants are computed using the initial velocity: $\mathbf{r}'(0) = (c_1, c_2, c_3) = (0, v_0 \cos\psi, v_0 \sin\psi)$. Therefore,
$$\mathbf{r}'(t) = \left(0, v_0 \cos\psi, -9.8\,t + v_0 \sin\psi\right).$$
Integrating the velocity vector gives the position vector
$$\mathbf{r}(t) = \left(d_1, t v_0 \cos\psi + d_2, -4.9\,t^2 + t v_0 \sin\psi + d_3\right).$$
Because $\mathbf{r}(0) = (0, 0, 0)$, we have $d_1 = d_2 = d_3 = 0$ and so
$$\mathbf{r}(t) = \left(0, t v_0 \cos\psi, -4.9\,t^2 + t v_0 \sin\psi\right).$$
The ball lands at the time $t^* > 0$ for which $z(t^*) = 0$, and solving for $t^*$ we obtain $t^* = v_0 \sin\psi / 4.9$. Thus, the distance traveled by the ball is given by
$$d(\psi) := y(t^*) = \frac{v_0^2 \cos\psi \sin\psi}{4.9} = \frac{v_0^2 \sin(2\psi)}{9.8}.$$
Using elementary calculus, $d(\psi)$ has a maximum value for $\psi = \pi/4$.
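The conclusion of Example 2.1.16 can be checked with a brute-force search over launch angles. In this sketch `accel` denotes the magnitude of the constant downward acceleration acting on the ball (a parameter name introduced here), so that the landing time is $2 v_0 \sin\psi / \texttt{accel}$ and the range is $v_0 \cos\psi$ times that time. The grid search is only a numerical illustration, not the calculus argument of the text.

```python
import math

def landing_time(v0, psi, accel):
    """Positive root of z(t) = -(accel / 2) t^2 + t v0 sin(psi)."""
    return 2.0 * v0 * math.sin(psi) / accel

def horizontal_range(v0, psi, accel):
    """Horizontal distance y(t*) = t* v0 cos(psi) at the landing time."""
    return v0 * math.cos(psi) * landing_time(v0, psi, accel)

def best_angle(v0, accel, steps=9000):
    """Launch angle in [0, pi/2] with the largest range on a uniform grid."""
    angles = [k * (math.pi / 2) / steps for k in range(steps + 1)]
    return max(angles, key=lambda psi: horizontal_range(v0, psi, accel))

v0, accel = 10.0, 9.8  # sample values chosen for illustration
psi_star = best_angle(v0, accel)
print(psi_star, math.pi / 4, horizontal_range(v0, psi_star, accel))
```

The optimal angle found on the grid agrees with $\pi/4$ regardless of the sample values of `v0` and `accel`, since both only rescale the range.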
Exercises
(1) For the vector functions below, determine if they are differentiable and/or smooth on their interval of definition.
    (a) $\mathbf{r}(t) = (t^2 - 2t, \sin(2\pi t))$, $t \in [0, 1]$
    (b) $\mathbf{r}(t) = (3t^2, \cos t, e^{-1/t^2})$, $t \in [-1, 1]$
    (c) $\mathbf{r}(t) = ((t - \pi)^2, \cos t, 2\sin(t/2))$, $t \in [-2\pi, 2\pi]$
    (d) $\mathbf{r}(t) = (4t^3, t^2)$, $t \in [1, 3]$
    (e) $\mathbf{r}(t) = (3t\cos t, t^{2/3}, e^{t^2})$, $t \in [-1, 1]$
(2) For each vector function of exercise (1), compute the integral of the vector function.
(3) Let each vector function of exercise (1) describe the trajectory of an object moving in space. Compute the velocity vector, speed and acceleration vector.
(4) Show that if a vector function $\mathbf{r}(t)$ is differentiable at $t = t_0$, then it is continuous at $t = t_0$.
(5) Show that any curve given by a vector function $(t, f(t))$ is smooth.
(6) Suppose that a force $F(t) = (t^2, \cos t, \sin t)$ is applied on an object of mass 1. Find the general formula for the velocity and position vector.
(7) An object falls from a cliff of height $h_0 > 0$ and is only subject to gravitational acceleration $g = 9.8\,(0, 0, -1)\,\mathrm{m/s^2}$. Find the velocity and position vectors if the initial velocity is $v_0$. Determine a formula for the final velocity as the object hits the ground.
2.2 Best Linear Approximation and Tangent Lines

To discuss the existence of a “tangent line” at a point $p$ of a curve, we introduce the concept of “best linear approximation”. This concept is defined as follows.

Definition 2.2.1. Consider a function $f(x)$ and a linear function $L(x) = a + bx$. We say that $L(x)$ is the best linear approximation of $f(x)$ at $x = x_0$ if
$$\lim_{x \to x_0} \frac{f(x) - L(x)}{x - x_0} = 0.$$
The best linear approximation limit means that the difference $f(x) - L(x)$ approaches zero near $x = x_0$ at a much faster rate than $1/(x - x_0)$ approaches infinity near $x = x_0$, so that the product tends to zero. We look at the case of functions of one variable as a reminder. Let $f(x)$ be a function which is at least twice differentiable. We do a Taylor expansion of $f(x)$ at $x = x_0$ to first order,
$$f(x) = f(x_0) + f'(x_0)(x - x_0) + o(|x - x_0|), \tag{2.2}$$
where the symbol $o(|x - x_0|)$ is a short-hand expression for terms of higher degree. It is called “little o” and its exact definition is
$$\lim_{x \to x_0} \frac{o(|x - x_0|)}{|x - x_0|} = 0.$$
The right-hand side expression of (2.2) provides an approximation of $f(x)$ for $x$ close to $x_0$. Consider the linear function $L(x) = f(x_0) + f'(x_0)(x - x_0)$, then we have
$$\lim_{x \to x_0} \frac{f(x) - L(x)}{x - x_0} = \lim_{x \to x_0} \frac{o(|x - x_0|)}{x - x_0} = 0.$$
In fact, $L(x)$ is the only linear function for which
$$\lim_{x \to x_0} \frac{f(x) - L(x)}{x - x_0} = 0.$$
This is shown in the next result.

Theorem 2.2.2. Suppose $f(x)$ is differentiable at $x = x_0$ and $L(x) = a + bx$. Then,
$$\lim_{x \to x_0} \frac{f(x) - L(x)}{x - x_0} = 0$$
if and only if $f(x_0) = L(x_0)$ and $f'(x_0) = L'(x_0)$.

Proof. This is an if and only if statement that can be proved in one calculation. We write $L(x) = a + bx = (a + bx_0) + b(x - x_0)$. Begin with the limit:
$$\begin{aligned}
\lim_{x \to x_0} \frac{f(x) - L(x)}{x - x_0}
&= \lim_{x \to x_0} \frac{f(x_0) + f'(x_0)(x - x_0) + o(|x - x_0|) - (a + bx_0 + b(x - x_0))}{x - x_0} \\
&= \lim_{x \to x_0} \frac{[f(x_0) - (a + bx_0)] + (f'(x_0) - b)(x - x_0) + o(|x - x_0|)}{x - x_0} \\
&= \lim_{x \to x_0} \left(\frac{f(x_0) - (a + bx_0)}{x - x_0} + (f'(x_0) - b) + \frac{o(|x - x_0|)}{x - x_0}\right).
\end{aligned}$$
Now, the limit on the last line is zero if and only if the limit of each term is zero. We check each case. First,
$$\lim_{x \to x_0} \frac{f(x_0) - (a + bx_0)}{x - x_0}$$
exists if and only if the numerator $f(x_0) - (a + bx_0)$ is zero, in which case the limit is zero. The third limit is automatically zero by definition of $o(|x - x_0|)$. Therefore, the second limit is zero if and only if $f'(x_0) = b$. Therefore $a = f(x_0) - f'(x_0)x_0$ and so
$$L(x) = f(x_0) + f'(x_0)(x - x_0),$$
which means $L(x_0) = f(x_0)$ and $L'(x_0) = f'(x_0)$. This completes the proof.

Definition 2.2.3. The tangent line to the curve given by $y = f(x)$ at $(x_0, f(x_0))$ is given by the best linear approximation $L(x)$ of $f(x)$ at $x_0$.

Note that in this case, the tangent line is always given by a function $y = mx + b$. This is no longer the case for general vector functions, as we show below. A vector function $\mathbf{q} : \mathbb{R} \to \mathbb{R}^n$ defined by
$$\mathbf{q}(t) = \vec{a} + \vec{b}\,t,$$
where $\vec{a} = (a_1, \ldots, a_n)$ and $\vec{b} = (b_1, \ldots, b_n)$, is called an affine vector function. If $\vec{a} = 0$, it is a linear function. Let $\mathbf{r}(t)$ be a parametrization of the smooth curve $C$. The concept of best linear approximation for $f(x)$ extends naturally to vector functions and we have the following definition.
Definition 2.2.4. Let $\mathbf{r}, \mathbf{q} : \mathbb{R} \to \mathbb{R}^n$ where $\mathbf{r}(t)$ is a vector function and $\mathbf{q}(t)$ is an affine vector function. We say that $\mathbf{q}(t)$ is the best linear approximation of $\mathbf{r}(t)$ at $t = t_0$ if
$$\lim_{t \to t_0} \frac{\mathbf{r}(t) - \mathbf{q}(t)}{t - t_0} = 0.$$
The idea of a best linear approximation is used in the sections and chapters that follow in order to provide optimal approximations to lengths of curves, areas of surfaces, etc. As with the function $f(x)$ above, we can characterize the best linear approximation using the derivative.

Theorem 2.2.5. Let $\mathbf{r}(t)$ be a vector function and $\mathbf{q}(t) = \vec{a} + \vec{b}\,t$ where $\vec{a}, \vec{b} \in \mathbb{R}^n$. Then, $\mathbf{q}(t)$ is a best linear approximation at $t = t_0$ if and only if $\mathbf{q}(t_0) = \mathbf{r}(t_0)$ and $\mathbf{q}'(t_0) = \mathbf{r}'(t_0)$.

Proof. The argument is similar to the proof of Theorem 2.2.2, done on each component of $\mathbf{r}(t)$ and $\mathbf{q}(t)$ separately.

Therefore, if the curve $C$ is smooth at $p = \mathbf{r}(t_0)$, the best linear approximation is nonconstant and so we have the following definition.

Definition 2.2.6. The tangent line at a smooth point $p \in C$ is given by the best linear approximation $\mathbf{q}(t)$.

Example 2.2.7. Consider the circle of radius 1 given by $\mathbf{r}(t) = (\cos t, \sin t)$ and let $t = 0$. Then, the best linear approximation is $\mathbf{q}(t) = (1, t)$ for $t \in \mathbb{R}$. It is a vertical line in the plane and so we cannot write it as $y = mx + b$.
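The defining limit of the best linear approximation can be observed numerically for Example 2.2.7. The sketch below evaluates $\|\mathbf{r}(t) - \mathbf{q}(t)\| / |t|$ for the unit circle and $\mathbf{q}(t) = (1, t)$ at shrinking values of $t$; the function name `approximation_ratio` is introduced here for illustration.

```python
import math

def approximation_ratio(t):
    """||r(t) - q(t)|| / |t| for r(t) = (cos t, sin t), q(t) = (1, t), t0 = 0."""
    dx = math.cos(t) - 1.0
    dy = math.sin(t) - t
    return math.hypot(dx, dy) / abs(t)

for t in (1e-1, 1e-2, 1e-3):
    print(t, approximation_ratio(t))
```

Since $\cos t - 1 \approx -t^2/2$ and $\sin t - t \approx -t^3/6$, the ratio behaves like $|t|/2$ and tends to zero, as the definition requires.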
2.2.1 Construction of the Tangent Line

Let $C$ be the curve defined by the vector function $\mathbf{r}(t) = (x(t), y(t))$ for $t \in [a, b]$ and differentiable for $t \in (a, b)$. Let $t_0 \in (a, b)$, let $\ell$ be a tangent line at the point $\mathbf{r}(t_0)$ and let $(p, q) \in \ell$. See Figure 2.7. Then, $v_{p,q} := (p, q) - (x(t_0), y(t_0))$ is the vector joining $\mathbf{r}(t_0)$ to $(p, q)$ and $v_{p,q}$ is a nonzero vector. Because $\mathbf{r}(t)$ is differentiable at $t = t_0$, then $v_{p,q} = s\,\mathbf{r}'(t_0)$ for some $s \neq 0$ and this means $\mathbf{r}'(t_0) \neq 0$. This corresponds to what is seen in Example 2.1.6: corners can appear only if $\mathbf{r}'(t_0) = 0$. The construction seen in Figure 2.7 gives us a method to obtain the formula for the tangent line at points where a curve $C$ is smooth. This is illustrated in the following example.
Fig. 2.7. A curve $C$ given by $\mathbf{r}(t)$ with its tangent vector $\mathbf{r}'(t_0)$ at $\mathbf{r}(t_0)$, tangent line $\ell$ and a vector $v_{p,q}$ joining the point $(p, q) \in \ell$ to $\mathbf{r}(t_0)$.
Example 2.2.8. Consider the curve $C$ given by the smooth parametrization $\mathbf{r}(t) = (t^2, 1 - t)$ for $t \in [0, 1]$. The tangent line at $t = \frac{1}{2}$ is obtained by first computing the tangent vector $\mathbf{r}'(t) = (2t, -1)$ and evaluating it at $t = \frac{1}{2}$: $\mathbf{r}'(\frac{1}{2}) = (1, -1)$. Using the base point $\mathbf{r}(\frac{1}{2}) = (\frac{1}{4}, \frac{1}{2})$ and the tangent vector, the equation of the tangent line at $\mathbf{r}(\frac{1}{2})$ is
$$\ell(s) = \left(\frac{1}{4}, \frac{1}{2}\right) + s\,(1, -1) = \left(\frac{1}{4} + s, \frac{1}{2} - s\right),$$
where $s$ is the variable parametrizing the tangent line.

The general method to compute a tangent line is similar to the one presented in Example 2.2.8. Let $\mathbf{r}(t)$ be the smooth parametrization of a curve $C$ and suppose we want to compute the tangent line at $t = t_0$.
(1) Evaluate the vector function at $t = t_0$: $\mathbf{r}(t_0)$.
(2) Compute the tangent vector at $t = t_0$: $\mathbf{r}'(t_0)$.
(3) Write the formula: $\ell(s) = \mathbf{r}(t_0) + s\,\mathbf{r}'(t_0)$.
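The three steps above can be sketched as a small helper that returns the tangent line as a function of $s$. The derivative is supplied by hand, as in Example 2.2.8; the names `tangent_line`, `r` and `dr` are chosen here for illustration.

```python
def tangent_line(r, dr, t0):
    """Return the map s -> r(t0) + s * r'(t0) for a curve r with derivative dr."""
    base = r(t0)        # step (1): base point
    direction = dr(t0)  # step (2): tangent vector

    def line(s):        # step (3): the formula for l(s)
        return tuple(b + s * d for b, d in zip(base, direction))

    return line

def r(t):
    return (t * t, 1.0 - t)

def dr(t):
    return (2.0 * t, -1.0)

line = tangent_line(r, dr, 0.5)
print(line(0.0), line(1.0))
```

Here `line(0.0)` returns the base point $(1/4, 1/2)$ and `line(1.0)` returns the base point shifted by the tangent vector $(1, -1)$. Passing the derivative explicitly keeps the helper exact; a finite-difference derivative could be substituted when no formula is available.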
Example 2.2.9. Consider the curve $C$ with smooth parametrization $\mathbf{r}(t) = (\cos(2\pi t), e^t \sin t, 2t^2)$ for $t \in [0, 1]$. We find the general formula for the tangent line at any point on the curve using the approach given above.
(1) Evaluate the vector function: $\mathbf{r}(t) = (\cos(2\pi t), e^t \sin t, 2t^2)$.
(2) Compute the tangent vector: $\mathbf{r}'(t) = (-2\pi \sin(2\pi t), e^t \cos t + e^t \sin t, 4t)$.
(3) Write the formula: $\ell(s) = \mathbf{r}(t) + s\,\mathbf{r}'(t)$. We obtain
$$\ell(s) = (\cos(2\pi t), e^t \sin t, 2t^2) + s\,(-2\pi \sin(2\pi t), e^t \cos t + e^t \sin t, 4t).$$
This formula can now be evaluated at any point $t \in [0, 1]$.
Exercises
(1) Determine the general equation of the tangent line for each vector function. Then, evaluate at the given $t_0$ value.
    (a) $\mathbf{r}(t) = (t^2 - 2t, \sin(2\pi t), t)$, $t_0 = \pi$
    (b) $\mathbf{r}(t) = (4t^3, t^2, e^t)$, $t_0 = 0$
    (c) $\mathbf{r}(t) = (3te^t, t^{2/3}, t^2)$, $t_0 = 1$
    (d) $\mathbf{r}(t) = (e^t, te^t, t^2 e^t)$, $t_0 = 0$
2.3 Reparametrizations and arc-length parameter

Consider a straight line segment on the $x$-axis from $x = a$ to $x = b$. A natural parametrization is $x(t) = t$, $y(t) = 0$ with $t \in [a, b]$. Note that $\|(x'(t), y'(t))\| = 1$. Let $\alpha > 0$ and consider instead the parametrization $x(t) = \alpha t$, $y(t) = 0$ with $t \in [a/\alpha, b/\alpha]$. Now $\|(x'(t), y'(t))\| = \alpha$ and the length of the domain is $\alpha^{-1}(b - a)$. Therefore, the speed of the parametrization being $\alpha$ leads to a contraction of the domain by a factor of $\alpha$. Consider the parametrization of a circle of radius $r_0$ given by
$$x(t) = r_0 \cos(t/r_0), \qquad y(t) = r_0 \sin(t/r_0)$$
with $t \in [0, 2\pi r_0)$ and here again we have $\|(x'(t), y'(t))\| = 1$. This parametrization is similar to the one from Example 1.5.3, but the domain has been dilated by a factor $r_0$ with the division of $t$ by $r_0$ in $\cos$ and $\sin$. These are examples of “reparametrization” of a curve.

Remark 2.3.1. Note that for both examples, the curve has a parametric representation which travels along the curve at speed 1 and the length of the curve (known from elementary geometry) is the same as the length of the domain interval.
In this section, we present an argument showing that any smooth curve $C$ has a parametric representation given by a vector function $\mathbf{r}(t)$ such that $\|\mathbf{r}'(t)\| = 1$. We first make precise what is meant by a reparametrization.

Definition 2.3.2. Let $C$ be a curve with parametrization $\mathbf{r}(t)$ with $t \in [a, b]$. A reparametrization of $C$ is an invertible differentiable function $\phi : [c, d] \to [a, b]$, $t = \phi(s)$, from which we define a parametrization $\tilde{\mathbf{r}}(s) = \mathbf{r}(\phi(s))$ with $s \in [c, d]$ where $c = \phi^{-1}(a)$ and $d = \phi^{-1}(b)$.

In the circle example, the reparametrization from Example 1.5.3 to the one above is given by $\phi(s) = s/r_0$.

Example 2.3.3. Let $C$ be a curve with parametrization $\mathbf{r}(t) = (e^t, e^{2t}, e^{3t})$ with $t \in [0, 3]$ and consider the reparametrization $t = \phi(s) = \ln(s)$. Then,
$$\tilde{\mathbf{r}}(s) = \mathbf{r}(\phi(s)) = (e^{\ln(s)}, e^{2\ln(s)}, e^{3\ln(s)}) = (s, s^2, s^3)$$
with $s \in [1, e^3]$.

Definition 2.3.4. A curve $C$ has an arc-length parametrization $\mathbf{r}(s)$ if $\|\mathbf{r}'(s)\| = 1$ for all $s$.

Example 2.3.5. Let $C$ be a curve with parametrization $\mathbf{r}(t) = (t^2/2, t^3/3)$ with $t \in [0, 1]$. Then $\|\mathbf{r}'(t)\| = \sqrt{t^2 + t^4} = t\sqrt{1 + t^2}$. We now find a reparametrization $t = \phi(s)$ such that $\tilde{\mathbf{r}}(s) = \mathbf{r}(\phi(s))$ and $\|\tilde{\mathbf{r}}'(s)\| = 1$. Begin by computing
$$\|\tilde{\mathbf{r}}'(s)\| = \left\|\mathbf{r}'(\phi(s))\,\frac{d\phi}{ds}\right\| = \|\mathbf{r}'(\phi(s))\|\,\left|\frac{d\phi}{ds}\right|.$$
Because we want $\|\tilde{\mathbf{r}}'(s)\| = 1$, we set up the equation
$$1 = \|\mathbf{r}'(\phi(s))\|\,\left|\frac{d\phi}{ds}\right|.$$
It is reasonable to assume that the reparametrization does not change the orientation, so we have $d\phi/ds > 0$. Since $t = \phi(s)$, we can write
$$\frac{dt}{ds} = \frac{1}{\|\mathbf{r}'(t)\|} = \frac{1}{t\sqrt{1 + t^2}},$$
and using the method of separation of variables this becomes
$$\int_{s(0)}^{s(t)} d\sigma = \int_0^t \tau\sqrt{1 + \tau^2}\, d\tau. \tag{2.3}$$
The integral on the right can be computed with the substitution rule ($u = 1 + \tau^2$) and we obtain
$$s(t) - s(0) = \frac{1}{3}\left((1 + t^2)^{3/2} - 1\right).$$
The value of $s(0)$ is not fixed and so we choose $s(0) = \frac{1}{3}$; therefore
$$s(t) = \frac{1}{3}(1 + t^2)^{3/2}. \tag{2.4}$$
We can invert this function and obtain
$$t = \phi(s) = \sqrt{(3s)^{2/3} - 1}. \tag{2.5}$$
Therefore, $C$ has an arc-length parametrization given by
$$\tilde{\mathbf{r}}(s) = \left(\phi(s)^2/2, \phi(s)^3/3\right) = \left(\frac{1}{2}\left((3s)^{2/3} - 1\right), \frac{1}{3}\left((3s)^{2/3} - 1\right)^{3/2}\right),$$
where the range of the parameter $s$ is obtained from (2.4) with $t \in [0, 1]$. That is,
$$s \in \left[\frac{1}{3}, \frac{1}{3}\,2^{3/2}\right].$$
From this example, we can extract an algorithm to find the arc-length parametrization of a vector function $\mathbf{r}(t)$ with domain $[a, b]$.
(i) Compute $\|\mathbf{r}'(t)\|$. If $\|\mathbf{r}'(t)\| = 1$ then $\mathbf{r}(t)$ is already an arc-length parametrization. If not, go to step (ii).
(ii) Set up the equation
$$s(t) - s(a) = \int_a^t \|\mathbf{r}'(\tau)\|\, d\tau. \tag{2.6}$$
Compute the integral on the right if possible; $s(a)$ is arbitrary so you can set it to a convenient value.
(iii) If possible, invert the formula given by (2.6) to obtain $t = \phi(s)$.
(iv) Determine the domain of $s$: the interval $[s(a), s(b)]$.
(v) Write $\tilde{\mathbf{r}}(s) = \mathbf{r}(\phi(s))$.
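The outcome of the algorithm for Example 2.3.5 can be verified numerically. The sketch below implements $\phi(s) = \sqrt{(3s)^{2/3} - 1}$ and $\tilde{\mathbf{r}}(s)$, then estimates the speed with a central difference at a few interior values of $s$. The constant and function names are introduced for this illustration.

```python
import math

S_MIN = 1 / 3             # s(0) with the choice s(0) = 1/3
S_MAX = 2 ** 1.5 / 3      # s(1) = (1/3) * 2^(3/2)

def phi(s):
    """Inverse t = phi(s) of s(t) = (1 + t^2)^(3/2) / 3."""
    # max(...) guards against tiny negative round-off at s = S_MIN
    return math.sqrt(max((3 * s) ** (2 / 3) - 1, 0.0))

def r_tilde(s):
    """Arc-length parametrization r(phi(s)) of r(t) = (t^2/2, t^3/3)."""
    t = phi(s)
    return (t ** 2 / 2, t ** 3 / 3)

def speed(s, h=1e-6):
    """Central-difference estimate of the norm of the derivative of r_tilde."""
    (x0, y0), (x1, y1) = r_tilde(s - h), r_tilde(s + h)
    return math.hypot(x1 - x0, y1 - y0) / (2 * h)

for s in (0.4, 0.6, 0.9):
    print(s, speed(s))    # each value is close to 1
```

The endpoints also match the original curve: `r_tilde(S_MIN)` is the point $\mathbf{r}(0) = (0, 0)$ and `r_tilde(S_MAX)` is $\mathbf{r}(1) = (1/2, 1/3)$.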
It is well known that antiderivatives of continuous functions are, in many cases, very difficult to find, and that they may not even be expressible in terms of elementary functions (polynomial, rational, trigonometric, exponential, logarithmic functions). The next case illustrates the difficulties that can arise.
Example 2.3.6. Consider the curve $C$ with parametrization $x(t) = t$ and $y(t) = e^{-t}$. A calculation as above leads to the differential equation
$$\frac{dt}{ds} = \frac{1}{\sqrt{1 + e^{-2t}}},$$
and so one must compute the integral
$$\int (1 + e^{-2t})^{1/2}\, dt$$
and then invert the resulting relation between $s$ and $t$. Unfortunately, this cannot be carried out in closed form, and so the arc-length parametrization cannot be obtained analytically.

Although we cannot compute the arc-length parametrization analytically in all cases by solving a differential equation such as (2.3), we can show abstractly that the arc-length parametrization must always exist. This is the content of the next result.

Theorem 2.3.7. A smooth curve $C$ always has an arc-length parametrization.

Proof. Let $\mathbf{r}(t)$ be a parametrization of $C$ with $t \in [a, b]$. Let $t = u(s)$ be a reparametrization with $\tilde{\mathbf{r}}(s) = \mathbf{r}(u(s))$. We must find a function $u$ such that
$$\|\tilde{\mathbf{r}}'(s)\| = \left\|\frac{d\mathbf{r}}{du}\right\| \left|\frac{du}{ds}\right| = 1$$
for all $s$. Taking $du/ds > 0$, this means
$$\frac{du}{ds} = \frac{1}{\|d\mathbf{r}/du\|} = \frac{1}{\sqrt{x_1'(u)^2 + x_2'(u)^2 + \cdots + x_n'(u)^2}} := g(u). \tag{2.7}$$
We can solve this differential equation by separation of variables:
$$\int \frac{1}{g(u)}\, du = \int ds = s. \tag{2.8}$$
Because $\mathbf{r}(t)$ is smooth and $g(u) \neq 0$, the function $\frac{1}{g(u)}$ is continuous. By the Fundamental Theorem of Calculus, there exists $F(u)$ such that $F'(u) = 1/g(u)$. So, the solution is
$$F(u) = s,$$
but $F'(u) = 1/g(u) > 0$ so the inverse of $F$ exists. This means $u(s) = F^{-1}(s)$ satisfies the required property. Therefore, there always exists an arc-length parametrization.
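When the inversion cannot be done by hand, the construction in the proof can still be carried out numerically. The sketch below treats the curve of Example 2.3.6: it approximates $s(t)$ with Simpson's rule (taking $s(0) = 0$) and inverts it by bisection, which works because $s(t)$ is increasing. The numerical methods, the search interval `[0, t_hi]` and all names are assumptions made for this illustration.

```python
import math

def speed(t):
    """Speed ||r'(t)|| of r(t) = (t, exp(-t))."""
    return math.sqrt(1.0 + math.exp(-2.0 * t))

def arc_length(t, n=200):
    """Composite Simpson approximation of s(t), the integral of speed over [0, t]."""
    if t == 0.0:
        return 0.0
    h = t / n
    total = speed(0.0) + speed(t)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * speed(i * h)
    return total * h / 3

def t_of_s(s, t_hi=10.0, tol=1e-12):
    """Invert s = arc_length(t) on [0, t_hi] by bisection."""
    lo, hi = 0.0, t_hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if arc_length(mid) < s:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def r_tilde(s):
    """Numerical arc-length parametrization of (t, exp(-t))."""
    t = t_of_s(s)
    return (t, math.exp(-t))

print(r_tilde(1.0), arc_length(t_of_s(1.0)))
```

Differencing `r_tilde` at nearby values of $s$ gives a speed close to 1, which is the defining property of an arc-length parametrization.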
Exercises
(1) For the following vector functions, use the transformation $t = \phi(s)$ given to reparametrize and change the domain accordingly. In each case compute the reparametrization $\tilde{\mathbf{r}}(s)$ with its domain, compute its speed and identify the ones with arc-length parametrization.
    (a) $\mathbf{r}(t) = (t^2, t^2 e^{-t^4})$, $t \in [0, 2]$; $t = \phi(s) = \sqrt{s}$.
    (b) $\mathbf{r}(t) = (\cos(8t), \sin(8t))$, $t \in [0, 4\pi]$; $t = \phi(s) = s/8$.
    (c) $\mathbf{r}(t) = (e^t, e^{2t}/\sqrt{2}, e^{3t}/3)$, $t \in [0, \ln(10)]$; $t = \phi(s) = \ln(s)$.
    (d) $\mathbf{r}(t) = (a + bt, c - dt, g + ht)$, $t \in [0, 1]$; $t = \phi(s) = \dfrac{s}{\sqrt{b^2 + d^2 + h^2}}$.
(2) Compute $s(t)$ given by equation (2.6) for the following vector functions. Invert the relationship to $t = \phi(s)$ if possible.
    (a) $\mathbf{r}(t) = (t\cos t, t\sin t)$
    (b) $\mathbf{r}(t) = (e^t, e^{2t}, e^{4t})$
    (c) $\mathbf{r}(t) = (t^2, \cos(t^2), \sin(t^2))$
    (d) $\mathbf{r}(t) = (2e^{-t/2}\cos t, 2e^{-t/2}\sin t)$
3 Tangent Spaces and 1-forms

We define the concept of tangent space using specific examples: curves, the spaces $\mathbb{R}^n$ and finally two-dimensional surfaces given by $z = f(x, y)$. The tangent space to a geometric object is an important topic in advanced calculus and differential geometry. The second part introduces a formal definition of differential and the construction of the so-called 1-forms.

3.1 Tangent spaces

We begin by discussing how tangent lines lead to tangent spaces and build more general tangent spaces for $\mathbb{R}^n$ and surfaces.
3.1.1 From Tangent Lines to Tangent Spaces

The previous chapter shows how to determine the tangent line to a curve $C$ at some point $p \in C$ using the derivative of a vector function $\mathbf{r}(t)$ defining $C$. The goal of this section is to extract the vectors which lie on the tangent lines so that we can add them together without leaving the tangent line.

Example 3.1.1. Consider the plane curve $C$ given by $x(t) = t^2$, $y(t) = 1 - t$ for $t \in [0, 1]$. We obtain the tangent line equation at $t = 1/2$ by the formula
$$\ell(s) = (x(1/2), y(1/2)) + s\,(x'(1/2), y'(1/2)) = (1/4, 1/2) + s\,(1, -1),$$
where $s \in \mathbb{R}$ is a parameter. The tangent line is not a subspace: if one adds two elements of $\ell(s)$, then the result is not in $\ell(s)$ anymore:
$$\left(\frac{1}{4} + s_1, \frac{1}{2} - s_1\right) + \left(\frac{1}{4} + s_2, \frac{1}{2} - s_2\right) = \left(\frac{1}{2} + (s_1 + s_2), 1 - (s_1 + s_2)\right).$$
However, if we decide to fix the base point $(x(1/2), y(1/2))$, add only the parametrized parts and then add the base point afterwards, this leaves us on the tangent line:
$$s_1(1, -1) + s_2(1, -1) = (s_1 + s_2, -s_1 - s_2) = (s_1 + s_2)(1, -1),$$
and so $(x(1/2), y(1/2)) + (s_1 + s_2)(1, -1)$ is a point of the tangent line. See Figure 3.1 for an illustration.

With this in mind, we can now introduce the concept of tangent space, which is a crucial aspect of advanced calculus and of differential geometry in general.

Definition 3.1.2. Let $C$ be a smooth space curve (in $\mathbb{R}^n$). If $p \in C$ then the tangent space of $C$ at $p$, denoted by $T_p C$, is the set of all vectors tangent to $C$ at the point $p$.
Fig. 3.1. Curve $C$ given by $r(t) = (t^2, 1 - t)$ and the tangent line $\ell(s)$ at $p = (1/4, 1/2)$. The orientation is from $(0, 1)$ to $(1, 0)$.
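The closure argument of Example 3.1.1 can be checked numerically. The following is a minimal sketch, not part of the original text: the helper names (`add`, `scale`, `on_tangent_line`) and the sample parameters are assumptions chosen for illustration.

```python
# Tangent line of Example 3.1.1: l(s) = BASE + s * DIRECTION.
BASE = (0.25, 0.5)
DIRECTION = (1.0, -1.0)


def add(u, v):
    """Componentwise sum of two pairs."""
    return (u[0] + v[0], u[1] + v[1])


def scale(s, v):
    """Scalar multiple s * v."""
    return (s * v[0], s * v[1])


def on_tangent_line(point, tol=1e-9):
    """True when point - BASE is parallel to DIRECTION (zero 2D cross product)."""
    dx, dy = point[0] - BASE[0], point[1] - BASE[1]
    return abs(dx * DIRECTION[1] - dy * DIRECTION[0]) < tol


s1, s2 = 0.3, -1.2  # arbitrary sample parameters

# Adding two points of the line leaves the line ...
sum_of_points = add(add(BASE, scale(s1, DIRECTION)), add(BASE, scale(s2, DIRECTION)))
# ... whereas adding the tangent vectors first and re-attaching the base point does not.
rebased_sum = add(BASE, add(scale(s1, DIRECTION), scale(s2, DIRECTION)))

print(on_tangent_line(sum_of_points))  # False
print(on_tangent_line(rebased_sum))    # True
```

The parallelism test uses the 2D cross product, so it does not depend on how the line is parametrized.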
If the vector function $r(t)$ defines $C$ and $p = r(t_0)$, then $T_pC$ is a 1-dimensional vector space with basis $\{r'(t_0)\}$. In Example 3.1.1, $r'(1/2) = (1, -1)$ and, for $p = (1/4, 1/2)$,
\[ T_pC = \operatorname{span}(r'(1/2)) = \{\alpha(1, -1) \mid \alpha \in \mathbb{R}\}. \]
$T_pC$ lies on top of the tangent line $\ell(s)$, but it is a different geometric object: it is made up of vectors. Apart from the fact that tangent spaces are vector spaces, another advantage is that they do not depend on the vector function used to parametrize $C$.

Example 3.1.3. Let $C$ be the circle of radius $a$ centred at the origin and $p = (0, a)$. The vector functions
\[ r_1(t) = (a\cos t,\ a\sin t),\quad t \in [0, \pi], \qquad r_2(t) = \left(-t,\ \sqrt{a^2 - t^2}\right),\quad t \in [-a, a], \]
both parametrize the upper portion of $C$ with the counterclockwise orientation, and $p = r_1(\pi/2) = r_2(0)$. Now
\[ r_1'(\pi/2) = (-a\sin(\pi/2),\ a\cos(\pi/2)) = (-a, 0) \quad\text{and}\quad r_2'(0) = (-1, 0). \]
So $r_1'(\pi/2) = a\,r_2'(0)$ and they span the same tangent space, as shown in Figure 3.2.
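As a rough numerical check of the example on the circle, the sketch below approximates the derivatives of both parametrizations by central differences and verifies that they are parallel. The radius value `A`, the step `h` and the helper `derivative` are assumptions made for this illustration only.

```python
import math

A = 2.0  # radius a; an arbitrary positive value chosen for the illustration


def r1(t):
    """Trigonometric parametrization of the upper half circle."""
    return (A * math.cos(t), A * math.sin(t))


def r2(t):
    """Graph-type parametrization of the same half circle."""
    return (-t, math.sqrt(A * A - t * t))


def derivative(r, t, h=1e-6):
    """Central-difference approximation of r'(t)."""
    plus, minus = r(t + h), r(t - h)
    return ((plus[0] - minus[0]) / (2 * h), (plus[1] - minus[1]) / (2 * h))


d1 = derivative(r1, math.pi / 2)  # approximately (-a, 0)
d2 = derivative(r2, 0.0)          # approximately (-1, 0)

# Two plane vectors are parallel when their 2D cross product vanishes.
cross = d1[0] * d2[1] - d1[1] * d2[0]
print(d1, d2, cross)
```

Both derivatives point in the negative $x$ direction at the top of the circle, so they span the same line of tangent vectors.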
Fig. 3.2. Tangent vectors of two different parametric representations of $C$ at $(0, a)$.

3.1.2 Tangent Spaces in any Dimension
We extend the concept of tangent space to any dimension. The following example illustrates the situation.

Example 3.1.4. Consider Cartesian coordinate lines in $\mathbb{R}^2$: $C_1$ horizontal, $x_1(t) = t$, $y_1(t) = y_0$, and $C_2$ vertical, $x_2(t) = x_0$, $y_2(t) = t$. Then
\[ (x_1'(t), y_1'(t)) = (1, 0) = e_1 \quad\text{and}\quad (x_2'(t), y_2'(t)) = (0, 1) = e_2 \]
for all $t \in \mathbb{R}$. Let $p_1 = (t_1, y_0)$ and $p_2 = (x_0, t_2)$ for $t_1, t_2 \in \mathbb{R}$. Then
\[ T_{p_1}C_1 = \{\alpha e_1 \mid \alpha \in \mathbb{R}\} \quad\text{and}\quad T_{p_2}C_2 = \{\beta e_2 \mid \beta \in \mathbb{R}\}. \]
Consider $p_1, q_1 \in C_1$ with $p_1 \neq q_1$. Then $T_{p_1}C_1$ and $T_{q_1}C_1$ are spanned by the same basis vector $e_1$, but $T_{p_1}C_1 \neq T_{q_1}C_1$ because they are located at different points. To emphasize this distinction, our convention is to label the vectors in a tangent space by their base points, for instance $e_1(p_1) \in T_{p_1}C_1$ and $e_1(q_1) \in T_{q_1}C_1$.

Fig. 3.3. Tangent spaces of $C_1$ at $p$ and $q$ with representative vectors.
Example 3.1.5. We return to the plane curve $C$ of Example 3.1.1 given by $x(t) = t^2$, $y(t) = 1 - t$ for $t \in [0, 1]$, at $t = 1/2$, and consider the elements of $T_pC$ with $p = (1/4, 1/2)$. These can be written as linear combinations of the tangent vectors of the coordinate lines. Indeed, $v \in T_pC$ is written $v = \alpha r'(1/2)$ and
\[ r'(1/2) = 1\,e_1(p) + (-1)\,e_2(p), \]
where $p$ is also an element of $C_1 \cap C_2$, with $e_1(p) \in T_pC_1$ and $e_2(p) \in T_pC_2$.

Fig. 3.4. Decomposition of the vector $r'(1/2) = (1, -1) \in T_pC$ along the $e_1(p)$ and $e_2(p)$ directions.

In particular, this example shows that $\{e_1(p), e_2(p)\}$ spans the tangent vectors of any curve passing through $p$. We define basis vectors at a point $p$ in $\mathbb{R}^n$ from which we can decompose any vector located at $p$. This leads us to another important definition.

Definition 3.1.6. The tangent space at the point $p \in \mathbb{R}^n$, denoted by $T_p\mathbb{R}^n$, is the set of all vectors based at the point $p$.

Example 3.1.7. We look at the cases $n = 1, 2$.
(1) Let $t_0 \in \mathbb{R}$; then $T_{t_0}\mathbb{R} = \{v = h e_1 \mid h \in \mathbb{R},\ e_1 = 1\}$.
(2) As we saw above, using $e_1(p) = (1, 0)$ and $e_2(p) = (0, 1)$, we have $T_p\mathbb{R}^2 = \{v = \alpha_1 e_1(p) + \alpha_2 e_2(p) \mid \alpha_1, \alpha_2 \in \mathbb{R}\}$.
The tangent space $T_p\mathbb{R}^n$ can be constructed using families of curves passing through $p$. As an example, let $n = 2$ again. We show that any vector $v \in T_p\mathbb{R}^2$ can be obtained as the tangent vector of a curve passing through $p$. Let $p = (x_0, y_0)$ and $v = (v_1, v_2) \in T_p\mathbb{R}^2$. Consider the curve $r(t) = (x_0 + t v_1, y_0 + t v_2)$. Then $r(0) = p$ and $r'(0) = (v_1, v_2)$.

Proposition 3.1.8. $T_p\mathbb{R}^n$ is an $n$-dimensional vector space and the set $\{e_1(p), \ldots, e_n(p)\}$ of unit tangent vectors to the coordinate lines forms a basis. The notation $e_j(p)$ is used to emphasize that the basis vectors are located at $p$.

Proof. See exercises.

The derivative is intimately linked to tangent spaces, as the following result shows.

Proposition 3.1.9. Let $r(t)$ define the smooth curve $C$ and $p = r(t_0) \in C$. Then $r'(t_0) : T_{t_0}\mathbb{R} \to T_pC$. That is, $r'(t_0)$ is a mapping taking vectors from $T_{t_0}\mathbb{R}$ to $T_pC$.
Proof. We know $T_pC = \{\alpha r'(t_0) \mid \alpha \in \mathbb{R}\}$. But each $\alpha \in \mathbb{R}$ corresponds to the vector $v = \alpha e_1 \in T_{t_0}\mathbb{R}$. Thus, any vector $w \in T_pC$ can be written as $w = r'(t_0)v$ where $v \in T_{t_0}\mathbb{R}$.

Fig. 3.5. The derivative $r'(t)$ is a mapping from $T_t\mathbb{R}$ into $T_pC$. For any $v \in T_t\mathbb{R}$, one obtains $w = r'(t)v \in T_pC$.

This result shows that for a vector function $r : [a, b] \to \mathbb{R}^n$ defining a curve $C$, the derivative is a function which takes vectors in $T_t\mathbb{R}$ and gives vectors in $T_pC$. This interpretation of the derivative as a function on tangent spaces is fundamental; it is illustrated in Figure 3.5.
3.1.3 Vector Fields
Using the concept of tangent space, we can introduce a useful interpretation of mappings $f : \mathbb{R}^n \to \mathbb{R}^n$ called a “vector field”. Let $p \in \mathbb{R}^n$ and consider $T_p\mathbb{R}^n$. As a vector space, $T_p\mathbb{R}^n$ is isomorphic to $\mathbb{R}^n$ because they have the same dimension. Therefore, in the discussion that follows, we drop the dependence on $p$ in the tangent space notation and label all tangent spaces as just $\mathbb{R}^n$.

Definition 3.1.10. A mapping $f : \mathbb{R}^n \to \mathbb{R}^n$ is a vector field if, for $p \in \mathbb{R}^n$, $f(p) \in \mathbb{R}^n$ is interpreted as a vector in $T_p\mathbb{R}^n$.

The concept of vector field is very important in physics and is at the heart of the qualitative theory of differential equations pioneered by Henri Poincaré in the late 19th century. We now show some examples to illustrate this concept.

Example 3.1.11. Let $F, G : \mathbb{R}^2 \to \mathbb{R}^2$ be vector fields defined by $F(x, y) = (x, -y)$ and $G(x, y) = (-y, x)$, shown in Figure 3.6. The arrows are obtained by evaluating $F$ and $G$ at a number of sample points. For instance, $F(0, 0) = (0, 0)$, $F(1, 1) = (1, -1)$ and $F(1, 0) = (1, 0)$.
Fig. 3.6. Vector fields F (x, y) (top) and G(x, y) (bottom). Note that the size of the arrows is normalized to a unique size for convenience.
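The sampling procedure behind Figure 3.6 can be sketched in a few lines. This is only an illustration: the list `samples` is an assumed choice of base points, and no plotting is attempted.

```python
def F(x, y):
    """Vector field F(x, y) = (x, -y)."""
    return (x, -y)


def G(x, y):
    """Vector field G(x, y) = (-y, x)."""
    return (-y, x)


# Each sample point p is a base point; F(p) and G(p) are the arrows attached at p.
samples = [(0, 0), (1, 1), (1, 0), (-1, 2)]
for p in samples:
    print(p, F(*p), G(*p))
```

A plotting library would draw each returned pair as an arrow whose tail sits at the corresponding sample point.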
The next example is an important example from physics.

Example 3.1.12. Newton’s law of gravitation says that two bodies attract each other with a force proportional to the masses $m$ and $M$ of the bodies and inversely proportional to the square of the distance $r$ between them. The magnitude of this force is written
\[ F(r) = \frac{G m M}{r^2}, \]
where $G$ is Newton’s gravitational constant. In the case of an isolated body of large mass $M$ (e.g. the Earth) and much lighter objects in its neighborhood, it is customary to place the centre of the mass $M$ at the origin. Therefore, all bodies are attracted radially towards the origin and, for a body located at $x = (x, y, z)$, we write
\[ \vec{F}(x) = F(r)\,\frac{-x}{\|x\|}, \]
where $r = \|x\|$ is the Euclidean norm of $x$ and the second factor is the unit direction vector pointing radially towards the origin. The function $\vec{F} : \mathbb{R}^3 \to \mathbb{R}^3$ defines a vector field (away from the origin).
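As a small illustration of the gravitational vector field, the sketch below evaluates the force vector at a sample position. The function name `gravity_field`, the rounded value used for the gravitational constant and the sample inputs are assumptions for this example.

```python
import math

G_CONST = 6.674e-11  # approximate gravitational constant in SI units (assumed rounding)


def gravity_field(x, m, M):
    """Force on a body of mass m at position x due to a mass M at the origin."""
    r = math.sqrt(sum(c * c for c in x))
    if r == 0.0:
        raise ValueError("the field is not defined at the origin")
    magnitude = G_CONST * m * M / r ** 2
    # Multiply the magnitude by the unit vector -x / ||x|| pointing to the origin.
    return tuple(-magnitude * c / r for c in x)


force = gravity_field((3.0, 0.0, 4.0), m=1.0, M=1.0)
print(force)
```

The returned vector is antiparallel to the position vector, which is the radial attraction described in the example.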
Example 3.1.13. Here is an example of a vector field on R: F (x) = x(1 − x). In this case, all arrows lie directly on the line. For zero vectors, one draws a point.
Fig. 3.7. Vector field on R given by F (x) = x(1 − x). The size of the arrows is relative.
A vector field can be defined not only over whole spaces, but also over subsets of spaces. For instance, one can define a vector field on a curve $C$.

Example 3.1.14. To define a vector field on a curve, one must use a parametric representation of the curve. Consider a circle $C$ of radius 1 with parametrization given by $r(\theta) = (\cos\theta, \sin\theta)$ with $\theta \in [-\pi, \pi)$. An example of a vector field on $C$ is given by $v : [-\pi, \pi) \to T_pC \cong \mathbb{R}$ with
\[ v(\theta) = \left(\frac{\pi}{2}\right)^2 - \theta^2. \]
This vector field is shown in Figure 3.8.

Fig. 3.8. Vector field on a circle. The arrows are in the tangent spaces at all points of the circle. Note the zero arrows at $\theta = \pm\pi/2$.

Vector fields are discussed in more detail in several upcoming sections.
3.1.4 Tangent Plane to a Surface
For curves in space, the existence of a tangent line to the curve is an important aspect to consider, as it gives the best local linear approximation to the curve. In particular, tangent lines exist at points of $C$ where the curve is smooth.

Consider now a two-dimensional surface $S$ given by $z = f(x, y)$ and the question of the existence and computation of a tangent plane to $S$. We do not consider the question of whether a tangent plane exists just yet, but focus on the computation.

Definition 3.1.15. Let $S$ be a surface in $\mathbb{R}^3$ given by $z = f(x, y)$ and $p = (x_0, y_0, z_0)$ where $z_0 = f(x_0, y_0)$. If $p \in S$ is a point at which a unique tangent plane exists, then the tangent space of $S$ at $p$, denoted by $T_pS$, is the set of all vectors tangent to $S$ at $p$.

With the following calculation, we begin our characterization of $T_pS$. Consider a point $p = (x_0, y_0, f(x_0, y_0)) \in S$ and the curves $C_1$ and $C_2$ given by
\[ r_1(t) = (x_0 + t,\ y_0,\ f(x_0 + t, y_0)) \quad\text{and}\quad r_2(t) = (x_0,\ y_0 + t,\ f(x_0, y_0 + t)) \]
passing through $p$; that is, $r_1(0) = p = r_2(0)$. The projections of $C_1$ and $C_2$ in the $xy$-plane are Cartesian coordinate lines. Figure 3.9 shows a surface $S$ with the curves $C_1$ and $C_2$.

Fig. 3.9. Two-dimensional surface $S$ with curves $C_1$ and $C_2$ projecting to coordinate lines in the $xy$-plane and intersecting at $p = (x_0, y_0, f(x_0, y_0))$.

Computing the derivative at $t = 0$ gives the tangent vectors of both curves at $p$. Using the chain rule for partial derivatives we obtain
\[ r_1'(0) = \left(1, 0, \frac{\partial f}{\partial x}(x_0, y_0)\right) \quad\text{and}\quad r_2'(0) = \left(0, 1, \frac{\partial f}{\partial y}(x_0, y_0)\right). \]
These tangent vectors to $C_1$ and $C_2$ are shown in Figure 3.10. This shows that $r_1'(0)$ and $r_2'(0)$ are vectors tangent to $S$ at $p$. We can now state our result.

Theorem 3.1.16. Let $S$ be the surface given by $z = f(x, y)$ and $p = (x_0, y_0, f(x_0, y_0)) \in S$. Then $T_pS$ is a vector space of dimension two with basis $\{\tau_1(p), \tau_2(p)\}$, where
\[ \tau_1(p) = \left(1, 0, \frac{\partial f}{\partial x}(x_0, y_0)\right) \quad\text{and}\quad \tau_2(p) = \left(0, 1, \frac{\partial f}{\partial y}(x_0, y_0)\right). \]

Fig. 3.10. Two-dimensional surface $S$ with tangent vectors $\tau_1(p)$ and $\tau_2(p)$ at $p = (x_0, y_0, f(x_0, y_0))$.
The proof is interesting but technical, and is left to the end of the section. We begin with the simplest example.

Example 3.1.17. Consider the plane $P$ given by the equation $3x - y + 2z = 3$. A plane tangent to $P$ at any point $p$ should correspond to $P$ itself. We verify this by computing the tangent plane to $P$ at the point $p = (1, 2, 1)$. We write $z = f(x, y) = \frac12(3 - 3x + y)$ and from Theorem 3.1.16 we have
\[ \tau_1(p) = \left(1, 0, \frac{\partial f}{\partial x}(1, 2)\right) = (1, 0, -3/2) \quad\text{and}\quad \tau_2(p) = \left(0, 1, \frac{\partial f}{\partial y}(1, 2)\right) = (0, 1, 1/2). \]
Those are indeed linearly independent vectors tangent to $P$ at $p$. The tangent space of $P$ at $p$ is obtained by taking
\[ \begin{aligned} T_pP = \operatorname{span}(\tau_1(p), \tau_2(p)) &= \{\alpha(1, 0, -3/2) + \beta(0, 1, 1/2) \mid \alpha, \beta \in \mathbb{R}\} \\ &= \{(\alpha, \beta, -\tfrac32\alpha + \tfrac12\beta) \mid \alpha, \beta \in \mathbb{R}\}. \end{aligned} \]
As expected, the tangent space of the plane is independent of $p$. We now look at an example where the partial derivatives depend on the base point, so the tangent plane depends on its location.

Example 3.1.18. Consider the elliptic paraboloid $E$ given by $z = f(x, y) = x^2 + y^2$ for $(x, y) \in \mathbb{R}^2$. We determine the tangent plane at $p = (1, 2, 5)$. We obtain the tangent vectors to $E$
\[ \tau_1(p) = \left(1, 0, \frac{\partial f}{\partial x}(1, 2)\right) = (1, 0, 2) \quad\text{and}\quad \tau_2(p) = \left(0, 1, \frac{\partial f}{\partial y}(1, 2)\right) = (0, 1, 4) \]
based at $p$, and so
\[ T_pE = \{(\alpha, \beta, 2\alpha + 4\beta) \mid \alpha, \beta \in \mathbb{R}\}. \]
The paraboloid and its tangent plane at $p$ are shown in Figure 3.11.

Fig. 3.11. Two-dimensional surface $E$ with tangent vectors $\tau_1(p)$ and $\tau_2(p)$ at $p$.
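The basis $\{\tau_1(p), \tau_2(p)\}$ of Theorem 3.1.16 can be approximated numerically when the partial derivatives are not available in closed form. The sketch below uses central differences; the function name `tangent_basis` and the step size `h` are assumptions, and the two test surfaces are the plane and the paraboloid of the examples above.

```python
def tangent_basis(f, x0, y0, h=1e-6):
    """Approximate tau_1 = (1, 0, f_x) and tau_2 = (0, 1, f_y) at (x0, y0)."""
    fx = (f(x0 + h, y0) - f(x0 - h, y0)) / (2 * h)
    fy = (f(x0, y0 + h) - f(x0, y0 - h)) / (2 * h)
    return (1.0, 0.0, fx), (0.0, 1.0, fy)


def plane(x, y):
    """z = (3 - 3x + y) / 2, i.e. the plane 3x - y + 2z = 3."""
    return 0.5 * (3.0 - 3.0 * x + y)


def paraboloid(x, y):
    """z = x^2 + y^2."""
    return x * x + y * y


plane_tau1, plane_tau2 = tangent_basis(plane, 1.0, 2.0)
para_tau1, para_tau2 = tangent_basis(paraboloid, 1.0, 2.0)
print(plane_tau1, plane_tau2)  # approximately (1, 0, -1.5) and (0, 1, 0.5)
print(para_tau1, para_tau2)    # approximately (1, 0, 2) and (0, 1, 4)
```

For the plane the result does not change with the base point, while for the paraboloid it does, which matches the remark between the two examples.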
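Proof of Theorem 3.1.16
We begin by showing that all vectors in the vector subspace $\operatorname{span}\{\tau_1(p), \tau_2(p)\}$ are tangent to $S$ at $p$. Choose an arbitrary element $w \in \operatorname{span}\{\tau_1(p), \tau_2(p)\}$. Let $\alpha, \beta \in \mathbb{R}$ and
\[ w := \alpha\tau_1(p) + \beta\tau_2(p) = \left(\alpha, \beta, \alpha\frac{\partial f}{\partial x}(x_0, y_0) + \beta\frac{\partial f}{\partial y}(x_0, y_0)\right), \]
and consider the curve on $S$ given by the vector function
\[ r(t) = (x_0 + \alpha t,\ y_0 + \beta t,\ f(x_0 + \alpha t, y_0 + \beta t)), \quad t \in \mathbb{R}. \]
Then $r(0) = p$ and
\[ r'(0) = \left(\alpha, \beta, \alpha\frac{\partial f}{\partial x}(x_0, y_0) + \beta\frac{\partial f}{\partial y}(x_0, y_0)\right) \]
is tangent to $S$ at $p$ with $w = r'(0)$. So $w \in T_pS$ and therefore
\[ \operatorname{span}\{\tau_1(p), \tau_2(p)\} \subseteq T_pS. \]
If $T_pS$ contains a vector $v$ not in $\operatorname{span}\{\tau_1(p), \tau_2(p)\}$, then $v$ is linearly independent from $\tau_1(p)$ and $\tau_2(p)$, and $\operatorname{span}\{\tau_1(p), \tau_2(p), v\}$ is a three-dimensional vector space, which would mean $T_pS = \mathbb{R}^3$. This implies $S = \mathbb{R}^3$, which is a contradiction. Therefore, $T_pS = \operatorname{span}\{\tau_1(p), \tau_2(p)\}$.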
Exercises
(1) For each curve $C$ given by $r(t)$, answer the question:
  (a) $r(t) = (t\cos t, t\sin t)$. Find $T_pC$ where $p = r(\pi)$.
  (b) $r(t) = (e^{t}, e^{2t}, e^{4t})$. Find $T_pC$ where $p = r(\ln 2)$.
  (c) $r(t) = (\cos(8t), \sin(8t))$. Find a general formula for $T_pC$ at an arbitrary point $p = r(t)$.
  (d) $r(t) = (e^{t}, e^{2t}/\sqrt{2}, e^{3t}/3)$. Find a general formula for $T_pC$ at an arbitrary point $p = r(t)$.
(2) Draw (by hand) the vector field $F(x, y)$ by sampling a sufficient number of points in each quadrant. Compare your answer with a picture obtained using software.
  (a) $F(x, y) = (1 - xy,\ 2 + x + y)$
  (b) $F(x, y) = (2x, 3y)$
  (c) $F(x, y) = (y,\ x - x^2)$
(3) For each surface $S$ given by $z = f(x, y)$, determine $T_pS$ by finding $\tau_1(p)$ and $\tau_2(p)$.
  (a) $z = 3x^2 - 4y^2$ at $p = (2, 1, 8)$.
  (b) $z = \cos(xy)$ at $p = (1, \pi/2, 0)$.
  (c) $z = x^3 - 3xy^2$ at $p = (0, 0, 0)$.
  (d) $z = \sqrt{1 - (x^2 + y^2)}$ at $p = (1/2, 1/2, 1/2)$.
(4) Check whether the vectors below are in $T_pS$ or not for problems 3(b) and 3(c).
  (a) $v = (2, 3, 2\pi - 3)$
  (b) $w = (1, 2, -1)$
  (c) $u = (0, 4, -16)$
  (d) $q = (1, -2, 0)$
(5) $T_pS$ can also be computed for surfaces given in other coordinate systems. In the case of the cylindrical coordinate system, the basis vectors for $T_pS$ are given by evaluating along the $r$ and $\theta$ coordinate lines in the $xy$-plane. Let $p = (r_0, \theta_0, f(r_0, \theta_0)) \in S$.
  (a) Draw a sample picture for $S$ (as in Figure 3.9) showing the curves $C_1$ and $C_2$ on $S$ given by $r_1(t) = (r_0 + t, \theta_0, f(r_0 + t, \theta_0))$ and $r_2(t) = (r_0, \theta_0 + t, f(r_0, \theta_0 + t))$.
  (b) Compute the derivative of $r_1(t)$ and $r_2(t)$ at $t = 0$.
  (c) Argue geometrically that the vectors $f_1(p) = r_1'(0)$ and $f_2(p) = r_2'(0)$ are linearly independent and conclude that $T_pS = \operatorname{span}\{f_1(p), f_2(p)\}$.
(6) Using the method outlined in the previous problem, compute $T_pS$ for $S$ given by $z = f(r, \theta) = \sqrt{1 - r^2}$ at $(r, \theta, z) = (\sqrt{2}/2, \pi/4, 1/2)$. Compare with problem 3(d): are the tangent spaces the same?
(7) Consider the curve $C$ given by $r(t)$ and the point $p$ specified. In each case, answer the question.
  (a) $r(t) = (1 + t^2, t - 3)$ and $p = r(1) = (2, -2)$: find the vector $v \in T_1\mathbb{R}$ such that $(10, 5) = r'(1)v$.
  (b) $r(t) = (t, e^{t}, e^{2t})$ and $p = r(0) = (0, 1, 1)$: is $(4, 4, 8) \in T_pC$? Explain why.
  (c) $r(t) = (t^2, 2t, t^3)$ and $p = r(1) = (1, 2, 1)$: show that if $v_1 = 2 \in T_1\mathbb{R}$ and $v_2 = -3 \in T_1\mathbb{R}$, then $r'(1)(v_1 + v_2) = r'(1)v_1 + r'(1)v_2$.
(8) Show Proposition 3.1.8.
3.2 Differentials
In several elementary books about calculus, the concept of the differential of a function $y = f(x)$ is introduced in a simple fashion, by saying that the differential is
\[ dy = f'(x)\,dx, \]
where $dx$ is an independent variable taking real values. The differential reappears under the integral sign when discussing definite integrals, as one takes the limit of Riemann sums to define the integral. The reason for the appearance of $dx$ under the integral sign is either not mentioned, or the author states that it has no meaning save to identify the variable of integration, or it is the $\Delta x$ magically transforming into $dx$ as the limit is taken. Finally, when introducing the substitution rule, say $u = g(x)$, then $dx$ suddenly has a meaning again, since now the $dx$ under the integral sign must be changed to $du$ using $du = g'(x)\,dx$. After this, it is understandable that a student would be confused about the concept of differential.

Our goal in this section is to put the concept of differential on solid foundations such that its use in differentiation and integration becomes clear. The differential we define in this section must at least satisfy the following properties:
(1) $dx$ should take values in $\mathbb{R}$.
(2) If $y = f(x)$, then we must have $dy = f'(x)\,dx$.
(3) The $dx$ under the $\int$ sign must have a geometric meaning.
3.2.1 The differential in one dimension
Example 3.2.1. Consider the curve $C$ on $\mathbb{R}$ given by $I(t) = t$.
(1) At a point $t_0 \in \mathbb{R}$, we compute the tangent vector $A = I'(t_0)e_1 = e_1$.
(2) Let $v = h e_1 \in T_{t_0}\mathbb{R}$ and compute the dot product of $v$ with $A$: $A \cdot v = (1e_1) \cdot (h e_1) = h(e_1 \cdot e_1) = h \in \mathbb{R}$.
(3) Let $v, w \in T_{t_0}\mathbb{R}$ and $b \in \mathbb{R}$; then, using the properties of the scalar product,
\[ A \cdot (v + w) = A \cdot v + A \cdot w \quad\text{and}\quad A \cdot (bv) = b(A \cdot v). \]
We use this example to define the differential.

Definition 3.2.2. The differential on $\mathbb{R}$, written $(dt)_{t_0}$, is a linear function $(dt)_{t_0} : T_{t_0}\mathbb{R} \to \mathbb{R}$ defined by
\[ (dt)_{t_0}(v) := A \cdot v = h, \quad\text{where } v = h e_1 \text{ and } A = I'(t_0)e_1. \]
Because $(dt)_{t_0}$ is the same at all base points $t_0$, we write only $dt$.

Example 3.2.3. The differential and the norm return different information. For instance, let $v = (-5)e_1$; then $dt(v) = -5$ while $\|v\| = 5$.

Thus, we use the tangent vector $A$ to the curve $I(t) = t$ to define a function $dt$ which takes tangent vectors and returns a real number corresponding to the length and the direction of those vectors. The direction of the vector is lost when using the norm instead of the differential. This definition may seem cumbersome and clumsy at the level of $\mathbb{R}$, and one should see it as the first brick of more complicated constructions. Its strength is that it generalizes in a natural way to higher dimensional spaces $\mathbb{R}^n$, to functions, curves and surfaces. The differential of a general function $f(t)$ is obtained by thinking of the curve $C$ on $\mathbb{R}$ with parametrization given by $f(t)$.

Definition 3.2.4. The differential of a differentiable function $f : \mathbb{R} \to \mathbb{R}$ at $t_0 \in \mathbb{R}$ is a function $(df)_{t_0} : T_{t_0}\mathbb{R} \to \mathbb{R}$ defined as follows. Let $A$ be the tangent vector of $f$ at $t_0$, $A = f'(t_0)e_1$, and let $v = h e_1$; then
\[ (df)_{t_0}(v) := A \cdot v. \]
Part (3) of Example 3.2.1 shows that $dt$ is a linear function, and this generalizes automatically to differentiable functions.
Proposition 3.2.5. Let $f : \mathbb{R} \to \mathbb{R}$ be a differentiable function, $v, w \in T_{t_0}\mathbb{R}$ and $\alpha, \beta \in \mathbb{R}$; then
\[ (df)_{t_0}(\alpha v + \beta w) = \alpha\,(df)_{t_0}(v) + \beta\,(df)_{t_0}(w). \]
We can now obtain our first important result with our definition of differential.
3.2.2 The classical differential formula
Using the definition above we now obtain the differential as seen in elementary calculus classes. Let $s = f(t)$ and $v = h e_1$, and recall that $dt(v) = h$. Then
\[ \begin{aligned} (ds)_{t_0}(v) &= (df)_{t_0}(v) \\ &= (f'(t_0)e_1) \cdot v \\ &= f'(t_0)\,h\,(e_1 \cdot e_1) \\ &= f'(t_0)\,h \\ &= f'(t_0)\,dt(v). \end{aligned} \tag{3.1} \]
Therefore, we recover the well-known formula from elementary calculus courses, namely $(ds)_{t_0} = f'(t_0)\,dt$.
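The one-dimensional differential can be modelled directly as a function on tangent vectors. In the sketch below a tangent vector $v = h e_1$ is stored as the number $h$; the function names and the sample function $f(t) = t^3$ are assumptions introduced for illustration.

```python
def dt(v):
    """dt applied to v = h * e1 (stored as the number h) returns h."""
    return v


def differential(f_prime, t0):
    """Return (df)_{t0} as a function acting on tangent vectors v = h * e1."""
    slope = f_prime(t0)

    def df(v):
        return slope * dt(v)

    return df


# Hypothetical choice: s = f(t) = t**3, so f'(t) = 3 * t**2, taken at t0 = 2.
ds = differential(lambda t: 3 * t ** 2, 2.0)

v, w = -5.0, 1.5
print(ds(v))   # -60.0, the sign records the direction of v
print(abs(v))  # 5.0, the norm forgets the direction

# Linearity in the sense of Proposition 3.2.5.
print(ds(2 * v + 3 * w) == 2 * ds(v) + 3 * ds(w))  # True
```

Representing `ds` as a closure keeps the base point `t0` fixed while the tangent vector varies, mirroring the notation $(ds)_{t_0}(v)$.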
Fig. 3.12. Geometric representation of the differential as the increment along the $y$-axis given by the projection of a tangent vector to the curve with projection $\Delta t$ along the $x$-axis.

The geometric meaning of the differential is similar to that in elementary calculus. Let $t_0 \in \mathbb{R}$ and let $v = (\Delta t)e_1 \in T_{t_0}\mathbb{R}$ be an increment from $t_0$. Then
\[ ds(v) = f'(t_0)\,dt(v) = f'(t_0)\,\Delta t. \]
However, for $s = f(t)$, the formula $ds = f'(t)\,dt$ now has additional meaning since we know that $dt$ (and so $ds$) are functions defined on tangent spaces. We can add and multiply those functions as with any other functions, and this is useful in the following sections. In particular, this justifies the Leibniz notation for the derivative,
\[ \frac{ds}{dt} = f'(t), \]
where the term on the left is genuinely the quotient of $ds$ by $dt$.
3.2.3 Differentials on higher dimensional spaces
We now extend the differential to tangent spaces of arbitrary dimension. Essentially, a coordinate line $C_j$ can be seen as a function $\mathbb{R} \to \mathbb{R}$, since all the other coordinates $x_i = x_{i0}$ are constant for $i \neq j$. We begin with $\mathbb{R}^2$. Let $p = (x_0, y_0) \in \mathbb{R}^2$ and $u = (\alpha_1, \alpha_2) \in T_p\mathbb{R}^2$. Because the tangent spaces of the coordinate lines, $T_pC_1$ and $T_pC_2$, are one-dimensional subspaces of $T_p\mathbb{R}^2$, we extend our definition of differential to $dx, dy : T_p\mathbb{R}^2 \to \mathbb{R}$ as follows. The coordinate lines $C_1$ and $C_2$ are given by $(x_1(t), y_1(t)) = (t, y_0)$ and $(x_2(t), y_2(t)) = (x_0, t)$; then
\[ (dx)_{(x_0, y_0)}(u) := (x_1'(t), y_1'(t)) \cdot u = (1, 0) \cdot (\alpha_1, \alpha_2) = \alpha_1 \]
and
\[ (dy)_{(x_0, y_0)}(u) := (x_2'(t), y_2'(t)) \cdot u = (0, 1) \cdot (\alpha_1, \alpha_2) = \alpha_2. \]
Therefore, the geometric meaning of the differentials $dx$ and $dy$ is that they return the projections of $u$ along the $x$ and $y$ directions respectively. This construction extends automatically to higher dimensional spaces.

Proposition 3.2.6. Let $x_1, \ldots, x_n$ be Cartesian coordinates on $\mathbb{R}^n$ and consider the vector $v = (\alpha_1, \ldots, \alpha_n) \in T_p\mathbb{R}^n$; then $dx_j(v) = \alpha_j$ for $j = 1, \ldots, n$.

We can now begin our discussion of differentials in the context of functions of several variables.
Differentials for functions of several variables
Consider a surface $S$ given by a function $z = f(x, y)$. We saw in Section 3.1.4 that the tangent space at a point $p = (x_0, y_0, z_0) \in S$ is given by $\operatorname{span}\{\tau_1(p), \tau_2(p)\}$ where
\[ \tau_1(p) = \left(1, 0, \frac{\partial f}{\partial x}(x_0, y_0)\right) \quad\text{and}\quad \tau_2(p) = \left(0, 1, \frac{\partial f}{\partial y}(x_0, y_0)\right). \]
We now define $(df)_{(x_0, y_0)}$ on vectors $v \in T_pS$. We need $(df) : T_pS \to \mathbb{R}$. Let $v \in T_pS$; then
\[ v = \left(\alpha, \beta, \alpha\frac{\partial f}{\partial x}(x_0, y_0) + \beta\frac{\partial f}{\partial y}(x_0, y_0)\right). \]
Now,
\[ (dx)_p(v) = \alpha, \qquad (dy)_p(v) = \beta \]
and
\[ (dz)_p(v) = \alpha\frac{\partial f}{\partial x}(x_0, y_0) + \beta\frac{\partial f}{\partial y}(x_0, y_0). \]
But this means
\[ (dz)_p(v) = \frac{\partial f}{\partial x}(x_0, y_0)\,(dx)_p(v) + \frac{\partial f}{\partial y}(x_0, y_0)\,(dy)_p(v). \]
Equivalently, recalling the gradient in Cartesian coordinates, we have
\[ (dz)_p(v) = \nabla f(x_0, y_0) \cdot \big((dx)_p(v), (dy)_p(v)\big). \tag{3.2} \]
But (3.2) has exactly the form of our previous definitions of differential: the scalar product of “a derivative” with a vector. Because $z = f(x, y)$ and the values of $dx$ and $dy$ do not depend explicitly on the point $p$, we make the following definition.

Definition 3.2.7. The differential of $f$ at $(x_0, y_0)$ is defined by
\[ (df)_{(x_0, y_0)} := \frac{\partial f}{\partial x}(x_0, y_0)\,dx + \frac{\partial f}{\partial y}(x_0, y_0)\,dy. \tag{3.3} \]
In particular, $(df)_{(x_0, y_0)}$ can be evaluated at any vector $v \in T_{(x_0, y_0)}\mathbb{R}^2$. The formula is written more simply
\[ df = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy. \]
Geometrically, we see that $df(v)$ gives the variation of the function $f$ in the direction of the vector $v$. For this reason, it is also called the directional derivative of $f$. In particular, because we can express
\[ df(v) = \nabla f(x, y) \cdot (dx(v), dy(v)), \]
by choosing a unit vector $v$, we see that the directional derivative of $f$ is maximal if $v$ is parallel to the vector $\nabla f(x, y)$ and points in the same direction. If $v$ is chosen perpendicular to $\nabla f(x, y)$, then $df(v) = 0$, and this corresponds to a direction in which $f$ does not vary. This leads us to introduce the concept of level set curves of a function $f(x, y)$, defined as
\[ f^{-1}(c) := \{(x, y) \mid f(x, y) = c\} \]
where $c \in \mathbb{R}$. Note that for a fixed $c \in \mathbb{R}$, typically, $f(x, y) = c$ determines a curve. The function $f$ is constant along $f^{-1}(c)$, and so, from the calculation above, the gradient vector is perpendicular to level set curves. A similar construction of the differential can be done for differentiable functions of more than two variables and leads to the formula
\[ df = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}\,dx_i. \]
However, the proof of this case must wait for the general definition of tangent spaces for $n$-dimensional surfaces in Chapter 5, Section 5.4. We now look at two examples.

Example 3.2.8. Consider the surface $S$ given by $z = f(x, y) = 3 - 3x + 2xy^2$. Let $p = (1, 2)$; we compute $(df)_p$ evaluated on the vector $v = (3, -4) \in T_p\mathbb{R}^2$:
\[ \begin{aligned} (df)_p(v) &= \frac{\partial f}{\partial x}(1, 2)\,dx(v) + \frac{\partial f}{\partial y}(1, 2)\,dy(v) \\ &= 5(3) + (8)(-4) = -17. \end{aligned} \]
We now determine the direction of maximal increase of $f$. We do it by first determining the direction of no increase of $f$, which is obtained from
\[ 0 = \nabla f \cdot (v_1, v_2) = (5, 8) \cdot (v_1, v_2), \]
where $(v_1, v_2)$ is a unit vector. Therefore, $v_2 = -5v_1/8$ and $\|(v_1, -5v_1/8)\| = 1$, so that, up to sign,
\[ (v_1, v_2) = \left(\frac{8}{\sqrt{89}}, \frac{-5}{\sqrt{89}}\right). \]
The unit vectors perpendicular to this direction are
\[ \pm\left(\frac{5}{\sqrt{89}}, \frac{8}{\sqrt{89}}\right). \]
Therefore, the direction of maximal increase is given by
\[ \left(\frac{5}{\sqrt{89}}, \frac{8}{\sqrt{89}}\right). \]

Example 3.2.9. Consider the function $f : \mathbb{R}^4 \to \mathbb{R}$ defined by $f(x, y, z, w) = x^2 + y^2 + z^2 + w^2$. Then
\[ \begin{aligned} df &= \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy + \frac{\partial f}{\partial z}\,dz + \frac{\partial f}{\partial w}\,dw \\ &= 2x\,dx + 2y\,dy + 2z\,dz + 2w\,dw. \end{aligned} \]
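The computation of Example 3.2.8 can be reproduced with a short script. This is a sketch under the assumption that the gradient is coded by hand from $f(x, y) = 3 - 3x + 2xy^2$; the names `grad_f`, `df`, `steepest` and `level` are illustrative.

```python
import math


def grad_f(x, y):
    """Gradient of f(x, y) = 3 - 3x + 2xy^2, computed by hand."""
    return (-3.0 + 2.0 * y * y, 4.0 * x * y)


def df(p, v):
    """(df)_p(v) = f_x(p) dx(v) + f_y(p) dy(v), with dx(v) = v[0] and dy(v) = v[1]."""
    gx, gy = grad_f(*p)
    return gx * v[0] + gy * v[1]


p = (1.0, 2.0)
value = df(p, (3.0, -4.0))
print(value)  # -17.0

gx, gy = grad_f(*p)
norm = math.hypot(gx, gy)
steepest = (gx / norm, gy / norm)  # unit vector of maximal increase
level = (gy / norm, -gx / norm)    # unit vector along which f does not vary
print(steepest, df(p, steepest))
print(level, df(p, level))
```

Evaluating `df` on `steepest` returns the length of the gradient, while evaluating it on `level` returns zero up to rounding.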
Fig. 3.13. Basis vectors $\frac{\partial}{\partial r}$ and $\frac{\partial}{\partial\theta}$ of $T_p\mathbb{R}^2$ and $T_q\mathbb{R}^2$ where $p = (1, 1)$ and $q = (-\sqrt{3}/2, -1/2)$. The dashed circles emphasize the tangency of $\frac{\partial}{\partial\theta}$ with the circles through $p$ and $q$, and the dashed rays the radial directions of $\frac{\partial}{\partial r}$.
3.2.4 Differentials of curvilinear coordinate systems
Differentials can be defined in any coordinate system. Let $(r, \theta)$ be polar coordinates on the plane and let $u_0 = (r_0, \theta_0) \in \mathbb{R}^2$. From Section 3.1, Problem (5), $T_{u_0}\mathbb{R}^2$ is spanned by
\[ (1, 0) = \frac{d}{dt}(r_0 + t, \theta_0)\Big|_{t=0} \quad\text{and}\quad (0, 1) = \frac{d}{dt}(r_0, \theta_0 + t)\Big|_{t=0}. \]
Let $r_1(t) = ((r_0 + t)\cos\theta_0, (r_0 + t)\sin\theta_0)$ and $r_2(t) = (r_0\cos(\theta_0 + t), r_0\sin(\theta_0 + t))$. Consider the vectors
\[ r_1'(0) = (\cos\theta_0, \sin\theta_0) \quad\text{and}\quad r_2'(0) = (-r_0\sin\theta_0, r_0\cos\theta_0) \]
based at $p = r_1(0) = r_2(0)$. See Figure 3.13. We define the following vectors at $p = (x, y) = (r\cos\theta, r\sin\theta)$:
\[ \frac{\partial}{\partial r}\Big|_p := (\cos\theta, \sin\theta) \quad\text{and}\quad \frac{\partial}{\partial\theta}\Big|_p := (-r\sin\theta, r\cos\theta), \tag{3.4} \]
which form an orthogonal basis of the tangent space $T_p\mathbb{R}^2$. Note however that it is not an orthonormal basis since $\|\partial/\partial\theta\| = r$. Thus, we see that the standard orthogonal basis of $T_{(r, \theta)}\mathbb{R}^2$ is sent via the derivative of the vector functions $r_1$ and $r_2$ to the basis $\{\partial/\partial r, \partial/\partial\theta\}$ of $T_p\mathbb{R}^2$. Therefore, a vector $v \in T_p\mathbb{R}^2$ can be written
\[ v = v_r\frac{\partial}{\partial r} + v_\theta\frac{\partial}{\partial\theta}; \]
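and in coordinates we write $v = (v_r, v_\theta)$. We conclude this discussion by defining the scalar product for vectors in $T_p\mathbb{R}^2$ written in the basis $(\partial/\partial r, \partial/\partial\theta)$. We know that the scalar product of $v$ and $w$ should be the product of the length of the projection of $v$ onto $w$ times the length of $w$. This means
\[ \frac{\partial}{\partial r} \cdot \frac{\partial}{\partial r} = \left\|\frac{\partial}{\partial r}\right\|^2 = 1 \quad\text{and}\quad \frac{\partial}{\partial\theta} \cdot \frac{\partial}{\partial\theta} = \left\|\frac{\partial}{\partial\theta}\right\|^2 = r^2. \]
Hence, letting
\[ v = a\frac{\partial}{\partial r} + b\frac{\partial}{\partial\theta} \quad\text{and}\quad w = c\frac{\partial}{\partial r} + d\frac{\partial}{\partial\theta}, \]
then $v \cdot w := ac + bd\,r^2$, or we can also write
\[ v \cdot w = (a, b)\begin{pmatrix} 1 & 0 \\ 0 & r^2 \end{pmatrix}(c, d)^T. \tag{3.5} \]

Fig. 3.14. Basis vectors $\frac{\partial}{\partial r}\big|_p$ and $\frac{\partial}{\partial\theta}\big|_p$ at $p = (1, 1)$ with tangent vector $v = (v_r, v_\theta) = (1, -1)$.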
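Formula (3.5) can be checked against the ordinary Cartesian dot product by expanding each vector in the basis (3.4). The sketch below does this for one sample point; the helper names and the sample values of $r$, $\theta$, $v$ and $w$ are assumptions.

```python
import math


def to_cartesian(r, theta, v):
    """Expand v = a d/dr + b d/dtheta using the basis vectors of (3.4)."""
    a, b = v
    return (a * math.cos(theta) - b * r * math.sin(theta),
            a * math.sin(theta) + b * r * math.cos(theta))


def polar_dot(r, v, w):
    """Scalar product (3.5): v . w = ac + bd r^2 for v = (a, b), w = (c, d)."""
    return v[0] * w[0] + r * r * v[1] * w[1]


r, theta = 2.0, 0.7           # arbitrary base point in polar coordinates
v, w = (1.0, -0.5), (0.25, 3.0)  # components (v_r, v_theta) and (w_r, w_theta)

vc, wc = to_cartesian(r, theta, v), to_cartesian(r, theta, w)
cartesian_dot = vc[0] * wc[0] + vc[1] * wc[1]

print(polar_dot(r, v, w), cartesian_dot)
```

The two printed numbers agree up to rounding because the cross terms vanish by orthogonality of $\partial/\partial r$ and $\partial/\partial\theta$.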
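The relationship between differentials in different coordinate systems is straightforward to obtain. We obtain $dr$ and $d\theta$ as functions of $dx$ and $dy$ using $r = f(x, y) = \sqrt{x^2 + y^2}$ and $\theta = g(x, y) = \arctan(y/x)$. We know
\[ \begin{aligned} dr &= \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy \\ &= \frac{x}{\sqrt{x^2 + y^2}}\,dx + \frac{y}{\sqrt{x^2 + y^2}}\,dy \end{aligned} \tag{3.6} \]
and
\[ \begin{aligned} d\theta &= \frac{\partial g}{\partial x}\,dx + \frac{\partial g}{\partial y}\,dy \\ &= \frac{-y}{x^2 + y^2}\,dx + \frac{x}{x^2 + y^2}\,dy. \end{aligned} \tag{3.7} \]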
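Proposition 3.2.10. Let $v = (v_r, v_\theta) \in T_p\mathbb{R}^2$. The differentials $dr$ and $d\theta$ at $p$ evaluated at $v$ are given by
\[ (dr)_p(v) = v_r \quad\text{and}\quad (d\theta)_p(v) = v_\theta. \]

Proof. We use (3.6), (3.7) and the definitions (3.4) to compute
\[ (dr)_p\left(\frac{\partial}{\partial r}\right) = \frac{x}{r}\,dx\left(\frac{\partial}{\partial r}\right) + \frac{y}{r}\,dy\left(\frac{\partial}{\partial r}\right) = \frac{r\cos\theta}{r}\cos\theta + \frac{r\sin\theta}{r}\sin\theta = 1 \]
and
\[ (d\theta)_p\left(\frac{\partial}{\partial\theta}\right) = \frac{-y}{r^2}\,dx\left(\frac{\partial}{\partial\theta}\right) + \frac{x}{r^2}\,dy\left(\frac{\partial}{\partial\theta}\right) = \frac{-r\sin\theta}{r^2}(-r\sin\theta) + \frac{r\cos\theta}{r^2}\,r\cos\theta = 1. \]
From the above, it is straightforward to check that $(dr)_p(\partial/\partial\theta) = (d\theta)_p(\partial/\partial r) = 0$. Therefore, by linearity of the differentials, the proposition is verified.

Using these formulae, we can evaluate $dr$ and $d\theta$ on vectors $v \in T_p\mathbb{R}^2$ given in Cartesian coordinates. Consider for instance $v = (v_x, v_y) = (1, -1)$ at $p = (1, 1)$, as seen in Figure 3.15; the vector $v$ is tangent to the circle of radius $\sqrt{2}$, and so $(dr)_p$ should be zero and $(d\theta)_p$ nonzero on this vector. We verify with the computation:
\[ (dr)_p(v) = \frac{1}{\sqrt{2}}\,dx(1, -1) + \frac{1}{\sqrt{2}}\,dy(1, -1) = 0, \]
\[ (d\theta)_p(v) = \frac{-1}{2}\,dx(1, -1) + \frac{1}{2}\,dy(1, -1) = -1. \]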
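Formulae (3.6) and (3.7) translate directly into code. The following sketch evaluates $dr$ and $d\theta$ on a Cartesian tangent vector and repeats the check at $p = (1, 1)$; the function names `dr` and `dtheta` are illustrative, and the base point is assumed to differ from the origin.

```python
import math


def dr(p, v):
    """(dr)_p(v) from (3.6), for p = (x, y) != (0, 0) and v = (v_x, v_y)."""
    x, y = p
    return (x * v[0] + y * v[1]) / math.hypot(x, y)


def dtheta(p, v):
    """(dtheta)_p(v) from (3.7), for p = (x, y) != (0, 0) and v = (v_x, v_y)."""
    x, y = p
    return (-y * v[0] + x * v[1]) / (x * x + y * y)


p = (1.0, 1.0)
v = (1.0, -1.0)  # tangent to the circle of radius sqrt(2) through p
print(dr(p, v), dtheta(p, v))  # 0.0 -1.0

# Basis vectors (3.4) at p, where r = sqrt(2) and theta = pi / 4.
r, theta = math.hypot(*p), math.atan2(p[1], p[0])
d_r = (math.cos(theta), math.sin(theta))
d_theta = (-r * math.sin(theta), r * math.cos(theta))
print(dr(p, d_r), dtheta(p, d_theta))  # both approximately 1
```

Using `math.atan2` rather than `arctan(y/x)` avoids a division by zero on the $y$-axis; this is a choice of the sketch, not of the text.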
Fig. 3.15. Basis vectors $\frac{\partial}{\partial r}\big|_p$ and $\frac{\partial}{\partial\theta}\big|_p$ at $p = (1, 1)$ with tangent vector $v = (v_x, v_y) = (1, -1)$.

Exercises
(1) Using the differential formula (3.3), compute the differential of the functions listed below.
  (a) $f(x, y) = x/y$. Find the direction of maximal increase of $f$ at $(1, 1)$.
  (b) $f(x, y) = x^2y - y^2x$. Find the direction of no increase of $f$ at $(2, 1)$.
  (c) $f(x, y, z) = 3\cos(xyz)$
  (d) $f(x_1, \ldots, x_n) = \exp((x_1 - a_1)\cdots(x_n - a_n))$.
(2) Show the following statements.
  (a) $d(f(t)g(t)) = (f'(t)g(t) + f(t)g'(t))\,dt$.
  (b) If $g(t) \neq 0$, $d(f(t)/g(t)) = (f'(t)g(t) - f(t)g'(t))\,g(t)^{-2}\,dt$.
  (c) $d((f \circ g)(t)) = f'(g(t))\,g'(t)\,dt$.
(3) Consider a curve $C$ given by $r(t) = (x(t), y(t))$ where $t \in [\alpha, \beta]$. Let $\alpha = t_0 < t_1 < \ldots < t_{n-1} < t_n = \beta$ and consider the vector $v_j = t_{j+1} - t_j$ based at $t_j \in \mathbb{R}$ for $j = 0, \ldots, n - 1$ (i.e. $v_0 = t_1 - t_0$ is a vector based at $t_0$, $v_1 = t_2 - t_1$ is a vector based at $t_1$, etc.). We denote $p_j = r(t_j)$.
  (a) Compute $dt(v_j)$.
  (b) Explain why the vector $W_j := r'(t_j)\,dt(v_j)$ is in the tangent space $T_{p_j}C$.
  (c) Compute $dx(W_j)$, $dy(W_j)$.
(4) Generalize the problem above to a curve $r(t) = (x_1(t), \ldots, x_n(t))$ with $t \in [a, b]$ by doing part (c) in this context; that is, compute $dx_i(W_j)$ for $i = 1, \ldots, n$.
(5) Evaluate $dr$ and $d\theta$ at the points $p \in \mathbb{R}^2$ and vectors $v \in T_p\mathbb{R}^2$ in polar and Cartesian coordinates using the direct definition of $dr$ and $d\theta$ and the formulae (3.6) and (3.7). Check that the answers agree. Illustrate the vectors at the point $p$.
  (a) $p = (2, 0)$; $v = (v_r, v_\theta) = (1/2, 1/2)$ and $v = (v_x, v_y) = (1/2, 1)$.
  (b) $p = (1/2, -\sqrt{3}/2)$; $v = (v_r, v_\theta) = (1, -1)$ and $v = (v_x, v_y) = \left(\frac{1 - \sqrt{3}}{2}, \frac{-(1 + \sqrt{3})}{2}\right)$.
(6) Compute $dx$ and $dy$ in terms of $dr$ and $d\theta$ using $x = r\cos\theta$ and $y = r\sin\theta$. Check that your result agrees with solving for $dx$ and $dy$ from the equations (3.6) and (3.7).
(7) Consider the spherical coordinate system $(\rho, \phi, \theta)$ and compute $d\rho$, $d\phi$ and $d\theta$ as functions of $dx$, $dy$ and $dz$.
(8) Compute $dx$, $dy$ and $dz$ as functions of $d\rho$, $d\theta$ and $d\phi$. You can do it directly using the formulae relating $(x, y, z)$ with $(\rho, \phi, \theta)$, or solve for $dx$, $dy$ and $dz$ from the previous problem.
(9) Let $p = (r\cos\theta, r\sin\theta, z)$. By defining curves
\[ \begin{aligned} r_1(t) &= ((r + t)\cos\theta, (r + t)\sin\theta, z)^T, \\ r_2(t) &= (r\cos(\theta + t), r\sin(\theta + t), z)^T, \\ r_3(t) &= (r\cos\theta, r\sin\theta, z + t)^T, \end{aligned} \]
show that the vectors
\[ \frac{\partial}{\partial r}\Big|_p := (\cos\theta, \sin\theta, 0), \quad \frac{\partial}{\partial\theta}\Big|_p := (-r\sin\theta, r\cos\theta, 0), \quad \frac{\partial}{\partial z}\Big|_p := (0, 0, 1) \]
form an orthogonal basis of $T_p\mathbb{R}^3$.
(10) Let $p = (\rho\cos\theta\sin\phi, \rho\sin\theta\sin\phi, \rho\cos\phi)$. By defining curves
\[ \begin{aligned} r_1(t) &= ((\rho + t)\cos\theta\sin\phi, (\rho + t)\sin\theta\sin\phi, (\rho + t)\cos\phi)^T, \\ r_2(t) &= (\rho\cos\theta\sin(\phi + t), \rho\sin\theta\sin(\phi + t), \rho\cos(\phi + t))^T, \\ r_3(t) &= (\rho\cos(\theta + t)\sin\phi, \rho\sin(\theta + t)\sin\phi, \rho\cos\phi)^T, \end{aligned} \]
show that the vectors
\[ \begin{aligned} \frac{\partial}{\partial\rho}\Big|_p &:= (\cos\theta\sin\phi, \sin\theta\sin\phi, \cos\phi), \\ \frac{\partial}{\partial\phi}\Big|_p &:= (\rho\cos\theta\cos\phi, \rho\sin\theta\cos\phi, -\rho\sin\phi), \\ \frac{\partial}{\partial\theta}\Big|_p &:= (-\rho\sin\theta\sin\phi, \rho\cos\theta\sin\phi, 0) \end{aligned} \]
form an orthogonal basis of $T_p\mathbb{R}^3$ with
\[ \|\partial/\partial\rho\| = 1, \quad \|\partial/\partial\phi\| = \rho, \quad \|\partial/\partial\theta\| = \rho\sin\phi. \]
(11) Show that the scalar products in the bases of the previous two problems are
  (a) cylindrical coordinates: $(v_1, v_2, v_3) \cdot (w_1, w_2, w_3) = v_1w_1 + r^2v_2w_2 + v_3w_3$.
  (b) spherical coordinates: $(v_1, v_2, v_3) \cdot (w_1, w_2, w_3) = v_1w_1 + \rho^2v_2w_2 + \rho^2\sin^2\phi\,v_3w_3$.
3.3 1-forms
In this section, we introduce a class of objects called 1-forms, which include the differentials. We begin with a reminder from elementary calculus. The Fundamental Theorem of Calculus states that for any continuous function $f : [a, b] \to \mathbb{R}$, one can find a differentiable function $F(x)$ (the antiderivative) such that $F'(x) = f(x)$, or in the language of differentials
\[ dF = F'(x)\,dx = f(x)\,dx. \]
Consider now the problem in two dimensions. Let $f(x, y)$ and $g(x, y)$ be continuous functions. Is it always possible to find $F(x, y)$ such that
\[ dF = f(x, y)\,dx + g(x, y)\,dy, \]
which means
\[ \frac{\partial F}{\partial x} = f(x, y) \quad\text{and}\quad \frac{\partial F}{\partial y} = g(x, y)? \tag{3.8} \]
The answer is negative, and the cases for which it is possible are studied in more detail in a forthcoming section. Here is a case where (3.8) cannot be satisfied. Let
\[ f(x, y) = y \quad\text{and}\quad g(x, y) = x^2. \]
Indeed, by integrating with respect to $x$,
\[ \frac{\partial F}{\partial x} = y \implies F(x, y) = xy + G(y), \]
where $G(y)$ is some arbitrary function of $y$. But then,
\[ \frac{\partial F}{\partial y} = x + G'(y) = x^2 \]
cannot be satisfied, since $G'(y) = x^2 - x$ would have to depend on $x$. Note however that the expression $y\,dx + x^2\,dy$ on its own has a well-defined mathematical meaning even though it is not the differential of any function. In fact, expressions such as
\[ f(x, y)\,dx + g(x, y)\,dy \tag{3.9} \]
have many uses in physics, as we see below. All differentials and expressions such as (3.9) are examples of mathematical objects called 1-forms.

Definition 3.3.1. Let $U$ be an open subset of $\mathbb{R}^n$ and consider functions $a_j(x) := a_j(x_1, \ldots, x_n)$ for all $j = 1, \ldots, n$.
(1) If each $a_j$ is a continuous function for $j = 1, \ldots, n$, then $\omega(x) = a_1(x)\,dx_1 + \ldots + a_n(x)\,dx_n$ is a continuous 1-form on $U$.
(2) If each $a_j$ is a differentiable function for $j = 1, \ldots, n$, then $\omega(x) = a_1(x)\,dx_1 + \ldots + a_n(x)\,dx_n$ is a differentiable 1-form on $U$.
1-forms are defined using the differentials $dx_1, \dots, dx_n$; therefore, they act on vectors in the tangent space of points $p \in U \subset \mathbb{R}^n$. That is, for each $p \in U$,
\[ \omega(p) : T_p\mathbb{R}^n \to \mathbb{R}. \]

Example 3.3.2. Consider the 1-form $\omega(x, y) = y\,dx + x^2\,dy$, $p = (3, -2) \in \mathbb{R}^2$ and $v = (-1, 5) \in T_p\mathbb{R}^2$. Then
\[ \omega(3, -2)\langle -1, 5\rangle = (-2)\,dx(-1, 5) + (3)^2\,dy(-1, 5) = -2(-1) + 9(5) = 47. \]
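The evaluation rule $\omega(p)\langle v\rangle = \sum_i a_i(p)\,v_i$ can be tried out numerically. The following is a minimal sketch, not part of the text: the function name `evaluate_one_form` and the representation of a 1-form as a list of coefficient functions are assumptions made for illustration.

```python
def evaluate_one_form(coeffs, p, v):
    """Evaluate omega(p)<v> = sum_i a_i(p) * v_i, using dx_i(v) = v_i."""
    return sum(a(*p) * vi for a, vi in zip(coeffs, v))

# omega = y dx + x^2 dy, stored as its coefficient functions (a_1, a_2).
omega = [lambda x, y: y, lambda x, y: x**2]

# Point p = (3, -2) and tangent vector v = (-1, 5) from Example 3.3.2.
print(evaluate_one_form(omega, (3, -2), (-1, 5)))
```

Since only the coefficients are evaluated at $p$, the result is linear in $v$: doubling $v$ doubles the value.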
We use the notation $\langle\ \rangle$ for the vectors on which $\omega$ is applied. This should alleviate possible confusion when writing down such expressions. 1-forms can also be defined using curvilinear coordinate systems.

Example 3.3.3. Consider the 1-form $\omega = r^2\,dr + r\theta\,d\theta$. We evaluate it at $p = (r_0, \theta_0) = (2, \pi/3)$ on the vector $v = (v_r, v_\theta) = (0.2, 1.3) \in T_p\mathbb{R}^2$:
\[ \omega(2, \pi/3)\langle 0.2, 1.3\rangle = (2)^2\,dr(0.2, 1.3) + (2)(\pi/3)\,d\theta(0.2, 1.3) = 4(0.2) + (2\pi/3)(1.3). \]
If the tangent vector is in Cartesian coordinates $v = (v_x, v_y)$, then one needs to use the formulae (3.6) and (3.7) to evaluate $dr$ and $d\theta$.

We conclude this section by looking at the set of (continuous or differentiable) 1-forms in $n$ dimensions at a point $p$, which we denote
\[ \Omega^n_p := \Big\{ \omega : T_p\mathbb{R}^n \to \mathbb{R} \ \Big|\ \omega = \sum_{i=1}^{n} a_i(x_1, x_2, \dots, x_n)\,dx_i \Big\}. \]
We have the following result.
Proposition 3.3.4. $\Omega^n_p$ is a vector space (over $\mathbb{R}$).

Proof. One needs to check that the sum of two 1-forms $\omega_1, \omega_2 \in \Omega^n_p$ is also in $\Omega^n_p$, and that for any $a \in \mathbb{R}$ and $\omega \in \Omega^n_p$, we have $a\omega \in \Omega^n_p$. We leave the details to the reader.

Let $U \subset \mathbb{R}^n$ be an open set. We denote by $\Omega^n(U)$ the vector space of 1-forms defined on $U$.
3.3.1 1-Forms in Physics

As noted above, 1-forms are useful in describing quantities in physics, in particular for computing "work". However, many more examples exist. For now, we focus on work. In elementary physics, the work ($W$) done by a force is described as the product of a force $F$ and the distance $d$ travelled by a mass moved in the direction of $F$: $W = F \cdot d$, and has units of Newton-meters (N m). In fact, a more general and practical way of expressing work is using 1-forms. The force may depend on its location and so $F = F(p) = (F_1(p), \dots, F_n(p))$ is a vector field where $p = (x_1, \dots, x_n) \in \mathbb{R}^n$. One defines
\[ dW(p) = \sum_{i=1}^{n} F_i(p)\,dx_i. \tag{3.10} \]
At a point $p$, one can approximate a small distance in the direction of motion given by $r(t)$ by taking a vector $v = s\,r'(t) \in T_p\mathbb{R}^n$ for $s$ small, and so $dx_i(v) = x_i'(t)\,s = x_i'(t)\,dt(s)$. Therefore, for $v \in T_p\mathbb{R}^n$, $F_i(p)\,dx_i(p)\langle v\rangle$ is the product of the force $F$ in the $i$th direction and the projection of the distance vector $v$ in the $i$th direction. Hence, work at $p$ is the linear superposition of the work done in every Cartesian coordinate direction.

Example 3.3.5. Let $F(x, y) = (x, -3)$ be the force of the wind and suppose that a cyclist travels along the path from $P = (3, 2)$ to $Q = (3, -1)$ and then from $Q$ to $R = (-2, -3)$. We compute the work done at each point $p$ by the wind on the cyclist. Suppose that the path $PQ$ is parametrized by $r_1(t) = (3, 2 - t)$ with $t \in [0, 3]$ and that the velocity of the cyclist is given by the tangent vectors $v = r_1'(t) = (0, -1)$ along $PQ$. Then
\[ dW(3, 2 - t)\langle v\rangle = x\,dx(v) + (-3)\,dy(v) = (3)(0) - 3(-1) = 3. \]
On the path $QR$, a parametrization is given by $r_2(t) = (3 - 5t, -1 - 2t)$ with $t \in [0, 1]$, and suppose also that the velocity of the cyclist is given by the tangent vectors $w = r_2'(t) = (-5, -2)$. Then
\[ dW(3 - 5t, -1 - 2t)\langle w\rangle = x\,dx(w) + (-3)\,dy(w) = (3 - 5t)(-5) + (-3)(-2) = -9 + 25t. \]

Fig. 3.16. Trajectory taken by cyclist: from P = (3, 2) to Q = (3, −1) and from Q to R = (−2, −3).
In the following section, we show that the total work over a path can be computed by integrating dW .
Exercises
(1) Evaluate the 1-forms at the point $p$ and on the vector $v$ given.
(a) $\omega = 2x\,dx + (3xy - y^2)\,dy$, $p = (2, -1)$, $v = (-1, 1)$.
(b) $\omega = \cos(x + y)\,dx + 3xyz\,dy + (x^2 + z^2)\,dz$, $p = (0, 0, 1)$, $v = (3, -2, 4)$.
(c) $\omega = x_1 x_2\,dx_1 + x_1 x_3\,dx_2 + x_2 x_3\,dx_3 + x_3 x_4\,dx_4$, $p = (-3, -2, 1, 1)$, $v = (2, -1, 1, -3)$.
(d) $\omega = (\theta + r)\,dr + (r\theta^2)\,d\theta$, $p = (r, \theta) = (1, \pi)$ and $v = (v_r, v_\theta) = (1, 2)$.
(e) $\omega = 2\theta r^2\,dr + (\theta r)\,d\theta$, $p = (x, y) = (-1, -1)$ and $v = (v_x, v_y) = (2, -3)$.
(2) Consider the 1-form $\omega = 2xy\,dx + x^2\,dy$. Find a function $F(x, y)$ such that $dF = \omega$. Is this function $F$ the unique function for which $dF = \omega$?
(3) Prove that $\Omega^n_p$ is a vector space (Proposition 3.3.4).
(4) Compute the work 1-form $dW$ done by the force $F(x, y) = (3x, 2xy)$ at each point of the path $r(t) = (t^2, 1 - t)$ with $t \in [0, 1]$.
(5) Compute the work 1-form $dW$ done by the gravitational force
\[ \vec{F}(x) = \frac{mMG}{r^2}\,\frac{-x}{\|x\|} \]
on an object of mass $m$ falling on the earth with the trajectory $r(t) = ((1 - t)\cos(t), (1 - t)\sin(t), 1 - t)$ with $t \in [0, 1]$.
4 Line Integrals We introduce the concept of line integrals starting with the integration of 1-forms. This leads to the first of the important theorems of Vector Calculus: the Fundamental Theorem of Line Integrals which is a generalization of the Fundamental Theorem of Calculus seen in elementary calculus courses. We then extend these results to the context of vector fields.
4.1 Integration of 1 forms When introducing differentials, we mention that expressions such as dt, dx, dy need to satisfy conditions (1), (2), (3) at the beginning of Section 3.2. The first two are satisfied with the definition given above and we even generalized to higher dimension. We now look at condition (3) which has to do with integration. However as the previous section shows, the concept of 1-forms is more general (at least in dimensions greater than 1) and so we define what it means to integrate 1-forms.
4.1.1 Revisiting integration in one-dimension

We can now properly define integration, not of functions, but of 1-forms over "space curves" in $\mathbb{R}$. This is done using Riemann sums as in elementary calculus.

Fig. 4.1. Curve C is the interval [a, b] with partition a = t0 < t1 < · · · < tn−1 < tn = b and vectors vj , j = 0, . . . , n − 1.
Let $C$ be the positively oriented curve given by the interval $[a, b] \subset \mathbb{R}$, with parametrization $r(t) = t$, $t \in [a, b]$. Let $\omega = f(t)\,dt$ be a continuous 1-form defined over $C$. We begin by defining a partition of $[a, b]$: $a = t_0 < t_1 < \dots < t_n = b$. At each point $t_j$ of the partition, define vectors in the tangent space
\[ v_j := t_{j+1} - t_j \in T_{t_j}\mathbb{R}, \quad j = 0, 1, \dots, n - 1, \]
and write $v_j = \Delta t_j\, e_1$ where $\Delta t_j = t_{j+1} - t_j$. Now, evaluate $\omega$ at each $t_j$ on $v_j$:
\[ \omega(t_j)\langle v_j\rangle = f(t_j)\,dt(v_j) \]
and take the sum over all $j = 0, \dots, n - 1$. Therefore,
\[ \sum_{j=0}^{n-1} \omega(t_j)\langle v_j\rangle = \sum_{j=0}^{n-1} f(t_j)\,dt(v_j). \tag{4.1} \]
Since $dt(v_j) = \Delta t_j$, (4.1) is just a Riemann sum of the kind used to define the integral in elementary calculus. Adding points to the partition so that every subinterval $[t_{j-1}, t_j]$ is always subdivided, we define
\[ \int_C \omega := \lim_{n\to\infty} \sum_{j=0}^{n-1} \omega(t_j)\langle v_j\rangle = \lim_{n\to\infty} \sum_{j=0}^{n-1} f(t_j)\,dt(v_j) = \int_a^b f(t)\,dt. \]
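The sum (4.1) can be computed directly to watch it approach the integral. The sketch below is only an illustration under assumptions not in the text: a uniform partition, the sample 1-form $\omega = t^2\,dt$ on $[0, 1]$ (whose integral is $1/3$), and the helper name `riemann_sum`.

```python
def riemann_sum(f, a, b, n):
    """Sum of omega(t_j)<v_j> = f(t_j) * dt(v_j) over a uniform partition.

    Here v_j = t_{j+1} - t_j, so dt(v_j) is the length of the j-th subinterval.
    """
    ts = [a + (b - a) * j / n for j in range(n + 1)]
    return sum(f(ts[j]) * (ts[j + 1] - ts[j]) for j in range(n))

# Illustrative 1-form omega = t^2 dt on C = [0, 1].
f = lambda t: t**2
for n in (10, 100, 1000, 10000):
    print(n, riemann_sum(f, 0.0, 1.0, n))
```

Refining the partition shrinks every $v_j$, while the printed sums settle towards $\int_0^1 t^2\,dt$.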
With the formulation given by (4.1), the significance of the $dt$ in the integral is justified because this is how the base of the rectangles in the Riemann sum is computed. As the limit is taken, the $v_j$'s "vanish", but the $dt$ remains. Thus, the integration of 1-forms is well-defined and blends nicely with previously known integration of functions from elementary calculus. In particular, the properties of integration of 1-forms are identical.

Proposition 4.1.1. The integral of a 1-form $\omega = f(t)\,dt$ satisfies the properties below.
(a) If $C = [a, b]$, $C_1 = [a, c]$ and $C_2 = [c, b]$ then
\[ \int_C \omega = \int_{C_1} \omega + \int_{C_2} \omega. \]
(b) Let $-C$ be the curve $C$ travelled in the opposite orientation. Then,
\[ \int_{-C} \omega = -\int_C \omega. \]
(c) If $a, b \in \mathbb{R}$ and $\omega_i$, $i = 1, 2$, are 1-forms, then
\[ \int_C (a\omega_1 + b\omega_2) = a\int_C \omega_1 + b\int_C \omega_2. \]
Proof. These follow from the same properties for the Riemann integral.

The next example not only shows how to use the substitution rule in the context of 1-forms, but also introduces a new operation called the "pullback", which is how changes of variables are applied to 1-forms (and 2-forms, 3-forms, etc., which are defined in subsequent chapters).

Example 4.1.2 (Substitution rule). Consider the 1-form
\[ \omega(t) = 2t\cos(t^2)\,dt \]
over the positively oriented curve $C = [0, \pi]$. We know from elementary calculus that the integral of $\omega$ can be done using the substitution rule as follows. Let $s = t^2$, then $ds = 2t\,dt$ and the $s$-variable takes values in $[0, \pi^2]$, so
\[ \int_C \omega = \int_0^{\pi} 2t\cos(t^2)\,dt = \int_0^{\pi^2} \cos s\,ds. \]
Thus, we see that the substitution rule gives rise to a new 1-form $\tilde\omega(s) = \cos(s)\,ds$ defined on the curve $C' = [0, \pi^2]$ and such that
\[ \int_C \omega(t) = \int_{C'} \tilde\omega(s). \]
Note that we can obtain $\tilde\omega(s)$ by using $t = \sqrt{s}$ and $dt = \frac{1}{2}s^{-1/2}\,ds$:
\[ \omega = 2t\cos(t^2)\,dt = 2\sqrt{s}\,\cos s\,\tfrac{1}{2}s^{-1/2}\,ds = \cos s\,ds = \tilde\omega. \]
Changes of variable are more often expressed in the form of the old variable "t" in terms of a new variable "s", so we favour this last formulation. This process of starting with a 1-form $\omega$ and using a change of variables to obtain a new 1-form $\tilde\omega$ is what the pullback is about; the exact definition follows.

Definition 4.1.3. Let $\omega(t) = a(t)\,dt$ be a 1-form and $t = g(s)$ where $g : \mathbb{R} \to \mathbb{R}$ is a differentiable function. The pullback of $\omega$ by $g$ is also a 1-form, denoted by $g^*\omega$, defined by
\[ (g^*\omega)(s) := a(g(s))\,g'(s)\,ds. \]
In Figure 4.2, we see how the reparametrization of the domain between the $s$- and $t$-variables is used to take the 1-form $\omega(t)$ and pull it back to a 1-form in the $s$-domain.

Example 4.1.4. We compute the pullback of
\[ \omega(t) = \frac{t}{\sqrt{1 + t^2}}\,dt \]
by $t = g(s) = \sqrt{s - 1}$. In $\omega(t)$, we have $a(t) = t/\sqrt{1 + t^2}$; applying the formula for the pullback we obtain
\[ g^*\omega(s) = a(g(s))\,g'(s)\,ds = \frac{\sqrt{s - 1}}{\sqrt{1 + (\sqrt{s - 1})^2}}\,\frac{1}{2\sqrt{s - 1}}\,ds = \frac{1}{2\sqrt{s}}\,ds. \]
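The pullback formula of Definition 4.1.3 lends itself to a quick numerical check. The sketch below uses the data of Example 4.1.4; the helper names `simpson` and `pullback`, the choice of Simpson's rule, and the test interval $s \in [2, 5]$ (that is, $t \in [1, 2]$) are assumptions made for illustration.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def pullback(a, g, dg):
    """Coefficient of ds in g*omega for omega = a(t) dt and t = g(s)."""
    return lambda s: a(g(s)) * dg(s)

a = lambda t: t / math.sqrt(1 + t**2)      # omega = a(t) dt
g = lambda s: math.sqrt(s - 1)             # t = g(s)
dg = lambda s: 0.5 / math.sqrt(s - 1)      # g'(s)
pulled = pullback(a, g, dg)

# The pullback coefficient should agree with 1 / (2 sqrt(s)).
print(pulled(4.0), 1 / (2 * math.sqrt(4.0)))
# Integrating omega over t in [1, 2] matches integrating g*omega over s in [2, 5].
print(simpson(a, 1.0, 2.0), simpson(pulled, 2.0, 5.0))
```

Both integrals equal $\sqrt{5} - \sqrt{2}$, which is the substitution rule written in the language of pullbacks.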
Fig. 4.2. Diagram illustrating the relationship between reparametrizations and pullbacks in the case of Example 4.1.2: the reparametrization $t = \sqrt{s}$ maps $s \in [0, \pi^2]$ to $t \in [0, \pi]$, and the pullback, with $dt = \frac{1}{2\sqrt{s}}\,ds$, takes $\omega(t) = 2t\cos(t^2)\,dt$ to $\tilde\omega(s) = \cos(s)\,ds$.
4.1.2 Integration of 1-forms

We now show how to integrate the 1-form basis elements over any curve $C$ in $\mathbb{R}^n$. We use the case $n = 3$ to illustrate the general case. Let $r(t) = (x(t), y(t), z(t))$ with $t \in [a, b]$ define a curve $C$ and consider a point $p = r(t_0)$ on $C$. If $v \in T_pC$, we can write for some $s \in \mathbb{R}$: $v = (s\,x'(t_0), s\,y'(t_0), s\,z'(t_0))$. Then, setting $s = dt(s)$ we have
\[ (dx)_p(v) = x'(t_0)\,dt(s), \quad (dy)_p(v) = y'(t_0)\,dt(s), \quad (dz)_p(v) = z'(t_0)\,dt(s). \tag{4.2} \]
Consider the case $dx$ and note that for some small $s > 0$
\[ x(t_0 + s) - x(t_0) = \frac{x(t_0 + s) - x(t_0)}{s}\,s = \frac{x(t_0 + s) - x(t_0)}{s}\,dt(s) \approx x'(t_0)\,dt(s) = dx(v). \tag{4.3} \]
Therefore, $dx(v)$ approximates the small increment in the $x$-direction projected from the tangent vector $v \in T_pC$. We now define
\[ \int_C dx \]
using the following process.
(1) Partition $[a, b] \subset \mathbb{R}$: $a = t_0 < t_1 < \dots < t_{n-1} < t_n = b$.
(2) Let $v_j = (t_{j+1} - t_j)e_1 \in T_{t_j}\mathbb{R}$, then $dt(v_j) = t_{j+1} - t_j$ and $r'(t_j)\,dt(v_j) \in T_{r(t_j)}C$. Recall that $r'(t_j)\,dt(v_j)$ is the best linear approximation of $C$ near $r(t_j)$.
(3) We take the sum of $dx$ evaluated at the tangent vectors $r'(t_j)\,dt(v_j) = (x'(t_j), y'(t_j), z'(t_j))\,dt(v_j)$ located at the points $r(t_j)$:
\[ R_n = \sum_{j=0}^{n-1} dx(r'(t_j)\,dt(v_j)) \approx \sum_{j=0}^{n-1} \big(x(t_{j+1}) - x(t_j)\big) = x(b) - x(a), \]
where the approximation is given by (4.3). But
\[ R_n = \sum_{j=0}^{n-1} x'(t_j)\,dt(v_j), \]
where the right-hand side is a Riemann sum in $\mathbb{R}$.
(4) Therefore,
\[ \int_C dx := \lim_{n\to\infty} R_n = \int_a^b x'(t)\,dt, \tag{4.4} \]
where the limit is taken by adding points to the partition so that every interval is subdivided.
Another way of writing (4.4) is using the pullback operation on $dx$:
\[ \int_C dx = \int_a^b r^*dx. \]
The formula in terms of pullback is the one which we use for general 1-forms. We have the formulae
\[ \int_C dx := \int_a^b x'(t)\,dt = x(b) - x(a), \quad \int_C dy := \int_a^b y'(t)\,dt = y(b) - y(a), \quad \int_C dz := \int_a^b z'(t)\,dt = z(b) - z(a). \tag{4.5} \]
We see that those are respectively the variation of $C$ along the $x$, $y$ and $z$ directions. Another way of seeing (4.5) is in terms of displacement on $C$ along the $x$, $y$ and $z$ directions as an object travels on $C$ from $r(a)$ to $r(b)$.

Example 4.1.5. Consider a smooth curve $C$ with endpoints at
\[ r(a) = (x(a), y(a)) = (1, 0) \quad\text{and}\quad r(b) = (x(b), y(b)) = (1, 4). \]

Fig. 4.3. Smooth curve C from Example 4.1.5.

The displacement on $C$ along the $x$-direction is 0 and the displacement on $C$ along the $y$-direction is 4. These can be verified using formulae (4.5):
\[ \int_C dx = x(b) - x(a) = 0, \quad \int_C dy = y(b) - y(a) = 4. \]
We now generalize the pullback to a general 1-form in $\mathbb{R}^n$.

Definition 4.1.6. Let $\omega$ be a 1-form on $\mathbb{R}^n$ given by
\[ \omega(x) = \sum_{i=1}^{n} a_i(x)\,dx_i \]
and let $C$ be a curve with parametrization $r(t) = (g_1(t), \dots, g_n(t))$. The pullback of $\omega$ along $C$ is the 1-form on $\mathbb{R}$ given by
\[ (r^*\omega)(t) = \sum_{i=1}^{n} a_i(r(t))\,g_i'(t)\,dt. \]
Example 4.1.7. Consider the context of Example 3.3.5 with $dW = x\,dx + (-3)\,dy$. On the path from $P = (3, 2)$ to $Q = (3, -1)$ given by $r_1(t) = (x(t), y(t)) = (3, 2 - t)$, we compute the pullback of $\omega := dW$ using the three-step method outlined below:
(1) Write the formula: for $n = 2$, the general formula is $r^*\omega = (a_1(r(t))\,x'(t) + a_2(r(t))\,y'(t))\,dt$.
(2) Identify the pieces: from the formula for $\omega$, the coefficients are
\[ a_1(x, y) = x \quad\text{and}\quad a_2(x, y) = -3, \]
so on $r_1(t)$ we have
\[ a_1(r_1(t)) = a_1(3, 2 - t) = 3 \quad\text{and}\quad a_2(r_1(t)) = a_2(3, 2 - t) = -3. \]
The derivatives are $x_1'(t) = 0$ and $y_1'(t) = -1$. For the path $r_2(t) = (3 - 5t, -1 - 2t)$ from $Q$ to $R = (-2, -3)$, we have
\[ a_1(r_2(t)) = a_1(3 - 5t, -1 - 2t) = 3 - 5t \quad\text{and}\quad a_2(r_2(t)) = a_2(3 - 5t, -1 - 2t) = -3, \]
with $x_2'(t) = -5$ and $y_2'(t) = -2$.
(3) Assemble the pieces: we can now substitute the computations of the second step into the formula:
\[ (r_1^*dW)(t) = r_1^*\omega = (3(0) + (-3)(-1))\,dt = 3\,dt, \]
\[ (r_2^*dW)(t) = r_2^*\omega = ((3 - 5t)(-5) + (-3)(-2))\,dt = (25t - 9)\,dt. \]
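The three-step method can be mirrored in a few lines of code. This is a rough sketch rather than a definitive implementation: the function name `pullback_coefficient` is hypothetical, and the derivative $r'(t)$ is approximated by a central difference instead of being computed symbolically.

```python
def pullback_coefficient(coeffs, r, t, h=1e-6):
    """Coefficient of dt in (r*omega)(t) = sum_i a_i(r(t)) x_i'(t) dt.

    coeffs lists the functions a_i; r'(t) is approximated numerically.
    """
    p = r(t)
    dr = [(hi - lo) / (2 * h) for hi, lo in zip(r(t + h), r(t - h))]
    return sum(a(*p) * d for a, d in zip(coeffs, dr))

dW = [lambda x, y: x, lambda x, y: -3.0]           # dW = x dx - 3 dy
r1 = lambda t: (3.0, 2.0 - t)                      # path from P to Q
r2 = lambda t: (3.0 - 5.0 * t, -1.0 - 2.0 * t)     # path from Q to R

print(pullback_coefficient(dW, r1, 1.0))   # compare with 3
print(pullback_coefficient(dW, r2, 0.4))   # compare with 25(0.4) - 9
```

Because both paths are straight lines, the central difference reproduces the derivatives up to rounding error, so the printed values match the hand computation closely.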
We notice that the pullback is an operator r∗ : Ωn → Ω1 . That is, it takes a 1-form in Rn and gives a 1-form in R. From the previous section, we know that 1-forms in R can be integrated in a straightforward way because they correspond to Riemann integrals.
Geometric construction of the integral

We show the geometric construction of the integral of 1-forms in the case of $\mathbb{R}^2$, as illustrated in Figure 4.4. Consider a curve $C$ given by $r(t)$ with $t \in [a, b]$ and $\omega = a_1(x, y)\,dx + a_2(x, y)\,dy$.

Fig. 4.4. Geometric construction of the integral of a 1-form.

(1) Partition $[a, b] \subset \mathbb{R}$: $a = t_0 < t_1 < \dots < t_{n-1} < t_n = b$.
(2) Let $v_j = (t_{j+1} - t_j)e_1 \in T_{t_j}\mathbb{R}$, then $dt(v_j) = t_{j+1} - t_j$ and $r'(t_j)\,dt(v_j) \in T_{r(t_j)}C$. Recall that $r'(t_j)\,dt(v_j)$ is the best linear approximation of $C$ near $r(t_j)$.
(3) We take the following sum at the points $p_j = r(t_j)$, evaluated at $r'(t_j)\,dt(v_j) = (x'(t_j), y'(t_j))\,dt(v_j)$:
\[ R_n = \sum_{j=0}^{n-1} \Big[ a_1(r(t_j))\,dx(r'(t_j)\,dt(v_j)) + a_2(r(t_j))\,dy(r'(t_j)\,dt(v_j)) \Big]. \]
This sum can be seen as the contributions of areas of rectangles corresponding to heights $a_1$ and $a_2$ and bases $dx$ and $dy$ along the $x$ and $y$ coordinate directions, as seen in Figure 4.5:
\[ \underbrace{a_1(r(t_j))}_{\text{height at } r(t_j)}\ \underbrace{dx(r'(t_j)\,dt(v_j))}_{\text{base in } x\text{-direction}}, \qquad \underbrace{a_2(r(t_j))}_{\text{height at } r(t_j)}\ \underbrace{dy(r'(t_j)\,dt(v_j))}_{\text{base in } y\text{-direction}}. \]

Fig. 4.5. Contribution of areas of rectangles in the x and y directions.

Therefore, we can think of $R_n$ as the addition of the Riemann sums along each direction. We define
\[ \int_C \omega := \lim_{n\to\infty} R_n, \]
where the limit is taken by adding points to the partition so that every interval is always subdivided.
(4) Evaluating $dx$ and $dy$ we obtain
\[ R_n = \sum_{j=0}^{n-1} \big( a_1(r(t_j))\,x'(t_j) + a_2(r(t_j))\,y'(t_j) \big)\,dt(v_j) \]
and so
\[ \lim_{n\to\infty} R_n = \lim_{n\to\infty} \sum_{j=0}^{n-1} \big( a_1(r(t_j))\,x'(t_j) + a_2(r(t_j))\,y'(t_j) \big)\,dt(v_j) = \int_a^b \big( a_1(r(t))\,x'(t) + a_2(r(t))\,y'(t) \big)\,dt = \int_a^b r^*\omega. \]
(5) Therefore,
\[ \int_C \omega = \int_a^b r^*\omega. \]
Using the above geometric construction for $n = 2$, we now state the general definition.

Definition 4.1.8. Let $\omega \in \Omega^n(U)$ be a 1-form defined on $U \subset \mathbb{R}^n$ and let $C$ be a curve given by the vector function $r(t)$ with $t \in [a, b]$. Then, the integral of $\omega$ along $C$ is given by
\[ \int_C \omega := \int_a^b r^*\omega. \]
In coordinates, this means that if
\[ \omega(x) = \sum_{i=1}^{n} a_i(x)\,dx_i \]
and $r(t) = (x_1(t), \dots, x_n(t))$, then
\[ \int_C \omega = \int_a^b \sum_{i=1}^{n} a_i(r(t))\,x_i'(t)\,dt. \]
Remark 4.1.9. In the context of vector fields (which come up in an upcoming section), the expression line integral along $C$ is preferred, but it can also be used in the context of 1-forms.

We now look at several examples.

Example 4.1.10. Set up the integral of $\omega = y\,dx + 3x\,dy$ over $C$ given by $r(t) = (t, \sin(2\pi t))$ with $t \in [1, 2]$. We first write the pullback of $\omega$:
(1) Formula: the general formula for the 1-form with $n = 2$ is $r^*\omega = (a_1(r(t))\,x'(t) + a_2(r(t))\,y'(t))\,dt$.
(2) Identify the pieces: $a_1(x, y) = y$ and $a_2(x, y) = 3x$. Then on $r(t)$ we have
\[ a_1(r(t)) = \sin(2\pi t) \quad\text{and}\quad a_2(r(t)) = 3t. \]
Also, $x'(t) = 1$ and $y'(t) = 2\pi\cos(2\pi t)$.
(3) Assemble the pieces in the formula:
\[ (r^*\omega)(t) = \sin(2\pi t)\,dt + (3t)\,2\pi\cos(2\pi t)\,dt = (\sin(2\pi t) + 6\pi t\cos(2\pi t))\,dt. \]
We can now set up the integral
\[ \int_C \omega = \int_1^2 r^*\omega = \int_1^2 (\sin(2\pi t) + 6\pi t\cos(2\pi t))\,dt. \]
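Once the pullback is written down, the integral of Definition 4.1.8 is an ordinary one-variable integral and can be approximated numerically. The following sketch applies this to Example 4.1.10; the function name `line_integral`, the use of Simpson's rule and the number of subintervals are illustrative assumptions.

```python
import math

def line_integral(coeffs, r, dr, a, b, n=2000):
    """Approximate int_C omega = int_a^b sum_i a_i(r(t)) x_i'(t) dt.

    Uses the composite Simpson rule with n (even) subintervals.
    """
    def integrand(t):
        return sum(c(*r(t)) * d for c, d in zip(coeffs, dr(t)))
    h = (b - a) / n
    total = integrand(a) + integrand(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * integrand(a + k * h)
    return total * h / 3

omega = [lambda x, y: y, lambda x, y: 3 * x]                   # y dx + 3x dy
r = lambda t: (t, math.sin(2 * math.pi * t))                   # curve C
dr = lambda t: (1.0, 2 * math.pi * math.cos(2 * math.pi * t))  # r'(t)

value = line_integral(omega, r, dr, 1.0, 2.0)
print(value)
```

For this particular example, integration by parts shows that the exact value is 0, so the printed number should be zero up to numerical error.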
Example 4.1.11. We set up the integral of $\omega = (z - y)\,dx + x^2\,dy - zy\,dz$ over $C$ given by $r(t) = (t, 2t, -t)$ with $t \in [0, 1]$.
(1) Write the formula of the pullback:
\[ r^*\omega(t) = (a_1(r(t))\,x'(t) + a_2(r(t))\,y'(t) + a_3(r(t))\,z'(t))\,dt. \]
(2) Identify the pieces: we have $a_1(x, y, z) = z - y$, $a_2(x, y, z) = x^2$, and $a_3(x, y, z) = -zy$. Now $r(t) = (x(t), y(t), z(t)) = (t, 2t, -t)$, which implies $r'(t) = (x'(t), y'(t), z'(t)) = (1, 2, -1)$. Therefore,
\[ a_1(r(t)) = -3t, \quad a_2(r(t)) = t^2, \quad a_3(r(t)) = 2t^2. \]
(3) Assemble the pieces:
\[ r^*\omega(t) = ((-3t)(1) + t^2(2) + 2t^2(-1))\,dt = -3t\,dt. \]
Therefore,
\[ \int_C \omega = \int_0^1 (-3t)\,dt. \]
This integral is easily computable and we leave the details to the reader.
We can now state the general properties of integrals of 1-forms.

Proposition 4.1.12. The integral of a 1-form $\omega \in \Omega^n$ satisfies the following properties.
(a) If a curve $C$ is the (disjoint) union of $C_1$ and $C_2$ then
\[ \int_C \omega = \int_{C_1} \omega + \int_{C_2} \omega. \]
(b) Let $-C$ be the curve $C$ travelled in the opposite orientation. Then,
\[ \int_{-C} \omega = -\int_C \omega. \]
(c) If $a, b \in \mathbb{R}$ and $\omega_i$, $i = 1, 2$, are 1-forms, then
\[ \int_C (a\omega_1 + b\omega_2) = a\int_C \omega_1 + b\int_C \omega_2. \]
Proof. (a) We suppose that $C$ is given by $r(t)$ with $t \in [a, b]$, with $C_1$ obtained by restricting $t \in [a, c]$ and $C_2$ by restricting $t \in [c, b]$. We can decompose the integral as follows:
\[ \int_C \omega = \int_a^b r^*\omega = \int_a^c r^*\omega + \int_c^b r^*\omega, \]
because the integral $\int_a^b r^*\omega$ is the integral of a 1-form in $\mathbb{R}$. The two integrals on the right-hand side correspond to
\[ \int_{C_1} \omega \quad\text{and}\quad \int_{C_2} \omega. \]
(b) Let $C$ be given by $r(t) = (x_1(t), \dots, x_n(t))$ with $t \in [a, b]$. Then $-C$ is given by $\tilde r(t) = r(a + b - t)$ with $t \in [a, b]$. Then, $\tilde r(a) = r(b)$, $\tilde r(b) = r(a)$ and $\tilde r'(t) = -r'(a + b - t)$. We compute, setting $u = a + b - t$ in the second line,
\[ \int_{-C} \omega = \int_a^b \sum_{i=1}^{n} a_i(\tilde r(t))\,(-x_i'(a + b - t))\,dt = -\int_a^b \sum_{i=1}^{n} a_i(r(a + b - t))\,x_i'(a + b - t)\,dt \]
\[ = -\int_b^a \sum_{i=1}^{n} a_i(r(u))\,x_i'(u)\,(-du) = -\int_a^b \sum_{i=1}^{n} a_i(r(u))\,x_i'(u)\,du = -\int_C \omega. \]
(c) This property is straightforward to check and is left as an exercise.

We illustrate the use of part (a) of Proposition 4.1.12.

Example 4.1.13. Find the total work done in Example 3.3.5 on the given path $C$ from $P$ to $R$. Because the path is piecewise smooth, the integral is the sum of the integrals on the two smooth pieces $C_1$ and $C_2$:
\[ \int_C dW = \int_{C_1} dW + \int_{C_2} dW = \int_0^3 r_1^*dW + \int_0^1 r_2^*dW = \int_0^3 3\,dt + \int_0^1 (25t - 9)\,dt \]
\[ = 9 + \Big[ \frac{25}{2}t^2 - 9t \Big]_0^1 = 9 + \frac{25}{2} - 9 = \frac{25}{2}. \]
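Parts (a) and (b) of Proposition 4.1.12 can be observed numerically on this example. The sketch below is an illustration under assumptions: the helper names `integrate` and `work` are hypothetical, and the midpoint rule is used because it is exact for the linear integrands that occur here.

```python
def integrate(f, a, b, n=1000):
    """Composite midpoint rule for a function of one variable."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def work(r, dr, a, b):
    """int_a^b r*dW for dW = x dx - 3 dy along r(t), t in [a, b]."""
    return integrate(lambda t: r(t)[0] * dr(t)[0] - 3.0 * dr(t)[1], a, b)

r1, dr1 = (lambda t: (3.0, 2.0 - t)), (lambda t: (0.0, -1.0))                    # C1: P to Q
r2, dr2 = (lambda t: (3.0 - 5.0 * t, -1.0 - 2.0 * t)), (lambda t: (-5.0, -2.0))  # C2: Q to R

# Part (a): the integral over C is the sum over the two smooth pieces.
total = work(r1, dr1, 0.0, 3.0) + work(r2, dr2, 0.0, 1.0)

# Part (b): -C2 is parametrized by r2(a + b - t) with a = 0, b = 1.
r2_rev = lambda t: r2(1.0 - t)
dr2_rev = lambda t: tuple(-d for d in dr2(1.0 - t))
reversed_c2 = work(r2_rev, dr2_rev, 0.0, 1.0)

print(total, work(r2, dr2, 0.0, 1.0), reversed_c2)
```

The first number reproduces the total work $25/2$, and the last two differ only by their sign.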
Exercises
(1) Set up the integral of $\omega = 2xy\,dx + x^2\,dy$ over the curve $C$ given by $r(t) = (1 + t, 3t^2)$ with $t \in [0, 1]$.
(2) Set up the integral of $\omega = (z - y)\,dx + x^2\,dy - zy\,dz$ over the curve $C$ given by $r(t) = (t, 2t, -t)$ with $t \in [0, 1]$.
(3) Set up
\[ \int_C \omega \]
where $\omega = e^{xy}\,dx + xe^{y}\,dy$ over the piecewise smooth curve $C$ consisting of $C_1$, the piece of the parabola $y = x^2$ from $(0, 0)$ to $(1, 1)$, followed by the line segment $C_2$ from $(1, 1)$ to $(2, 0)$, and finally the line segment $C_3$ from $(2, 0)$ to $(0, 0)$.
(4) Compute the integral of $\omega = y\sin(z)\,dx + z\sin(x)\,dy + x\sin(y)\,dz$ over the curve $C$ given by $r(t) = (\cos(t), \sin(t), \sin(5t))$ with $t \in [0, \pi]$.
(5) Set up the integral for the work done by the force $F(x, y, z) = (xy^2, xy + yz, -3z^3 + y^2)$ over the piecewise smooth path $C$ given by the piece of helix $C_1$ parametrized by $r(t) = (\cos(t), \sin(t), t)$ with $t \in [0, \pi]$ and $C_2$, the line segment from $(-1, 0, \pi)$ to $(0, -1, 0)$.
(6) Consider a cyclist of mass 1 on a road up a mountain where the path $C$ is given by $r(t) = (\sqrt{2 - t}\cos(2\pi t), \sqrt{2 - t}\sin(2\pi t), t)$ with $t \in [0, 2]$.
(a) Verify that the path satisfies the equation of the elliptic paraboloid $z = 2 - (x^2 + y^2)$.
(b) Determine the work needed against gravity ($a = (0, 0, -9.8)$) to climb up the path $C$.
(c) Suppose there is a wind with force $F(x, y, z) = (-3z - 1, 0, 0)$. Compute the work done by the cyclist against the wind.
(7) Show part (c) of Proposition 4.1.12.
4.2 Arc-length, Metrics and Applications We show how to use the ideas from 1-forms to compute important quantities related to curves such as arc-length and curvature. We also introduce integration over the length of a curve and use it to look at computation of mass and centre of mass for curved shaped objects. Finally, we make the link between 1-forms and vector fields and between the integration of 1-forms and the Line Integrals of Vector Fields.
4.2.1 Arc-length

Suppose one wants to know the length of a piece of rope lying on the floor. It can just be picked up, straightened and measured using a measuring tape. The length of the rope is independent of its shape on the floor. We exploit this idea to give an intrinsic definition of arc-length by finding a mathematical way of "straightening" a curve $C$ on top of a coordinate axis to obtain the arc-length. As a first step, the parametrization should not travel along the curve several times. Consider the following example.

Example 4.2.1. Let $C$ be the circle given by $r(t) = (\cos(t), \sin(t))$ with $t \in [0, 4\pi]$. The section of curve for $t \in [2\pi, 4\pi]$ is a second winding around the circle. Thus, for $t_1 \in [0, 2\pi]$ and $t_2 = t_1 + 2\pi$, two different elements in the domain, $t_1 \neq t_2$, are sent to the same point in $\mathbb{R}^2$: $r(t_1) = r(t_2)$. Computing the length of the circle over the whole domain would give us twice the length. See Figure 4.6.

Fig. 4.6. Parametrization of the circle in Example 4.2.1, with $r(t_1) = r(t_1 + 2\pi)$.
This example leads us to consider parametrizations that are one-to-one, also called injective. We recall the definition in the context of vector functions.

Definition 4.2.2. Let $r(t)$ be a vector function with $t \in [a, b]$. Then, $r(t)$ is one-to-one or injective if for $t_1, t_2 \in [a, b]$,
\[ r(t_1) = r(t_2) \implies t_1 = t_2. \]
Another way to write this implication is: if $t_1 \neq t_2$ then $r(t_1) \neq r(t_2)$. Consider the following example.

Example 4.2.3. Let $C$ be a curve given by the parametric representation $r(t) = (t^2, t^4)$ with $t \in [-1, 1]$. We check whether this vector function is injective by setting $r(t_1) = r(t_2)$. The goal is to find out if this equality forces $t_1$ to be equal to $t_2$. If so, then it is injective. But we can check directly that for $t_1 = -1$ and $t_2 = 1$ we have $(t_1^2, t_1^4) = (t_2^2, t_2^4)$. Thus, it is not injective. In fact, for $t_2 = -t_1$ we have $r(t_1) = r(t_2)$. Restricting the domain of $r(t)$ to $t \in [-1, 0]$ or $t \in [0, 1]$ leads to an injective vector function.

As noted in Remark 2.3.1, the domain of the arc-length parametrization corresponds in the examples at the beginning of Section 6.2 to the length of the curve $C$ as we know from basic geometry. This is true in general as the construction below shows.
Arc-Length: Geometric Construction

We use tangent vectors at points along $C$ to build a Riemann sum for the arc-length. See Figure 4.7 for an illustration of this construction. Let $r(s)$ with $s \in [a, b]$ be the arc-length parametrization of a curve $C$.
(1) Let $a = s_0 < s_1 < \dots < s_n = b$ be a partition of $[a, b]$.
(2) Let $\Delta s_i = s_{i+1} - s_i \in T_{s_i}\mathbb{R}$ for $i = 0, \dots, n - 1$.

We know that the line segment $r'(s_i)\,ds(\Delta s_i)$ is the best linear approximation of $C$ near $r(s_i)$ and
\[ \|r'(s_i)\,ds(\Delta s_i)\| = ds(\Delta s_i) \]
for all $i = 0, \dots, n - 1$ because $\|r'(s_i)\| = 1$ in the arc-length parametrization.

Fig. 4.7. Tangent vectors at points r(tj ) along C.

We see that the arc-length parametrization preserves the length of tangent vectors from $T_{s_j}\mathbb{R}$ to $T_{r(s_j)}C$. Now, summing those line segments, we obtain an approximation of the length of $C$ which is independent of $n$, the number of elements in the partition:
\[ \sum_{i=0}^{n-1} \|r'(s_i)\,ds(\Delta s_i)\| = \sum_{i=0}^{n-1} |ds(\Delta s_i)| = b - a \tag{4.6} \]
and in particular
\[ \lim_{n\to\infty} \sum_{i=0}^{n-1} \|r'(s_i)\,ds(\Delta s_i)\| = b - a. \tag{4.7} \]
But the approximation of $C$ by the line segments $r'(s_i)\,ds(\Delta s_i)$ improves as $ds(\Delta s_i)$ becomes smaller. Thus, in (4.7), we let $n \to \infty$ in such a way that all intervals $\Delta s_i \to 0$ for $i = 0, \dots, n - 1$.
Definition 4.2.4. Let $C$ be a curve with parametrization given by the vector function $r(s)$ with $s \in [a, b]$ corresponding to the arc-length parametrization. Then, the arc-length of $C$ is given by
\[ \ell(C) := \int_a^b ds = b - a. \]
The above calculation shows that using the arc-length parametrization, approximations to the curve by tangent vectors always add-up naturally to the length of the domain of the arc-length parametrization. We are in some sense, straightening out pieces of curve on the tangent vectors. As mentioned in the previous chapter, finding the arc-length parametrization is not possible in many cases and so we must obtain a formula from which the arc-length can be expressed no matter which parametrization is used.
Metrics and arc-length

Consider a vector $v = (v_1, \dots, v_n) \in T_p\mathbb{R}^n$. We know that the differentials $dx_i$ applied on $v$ yield the projections of $v$ on the coordinate axes $x_i$: $dx_i(v) = v_i$. We use the differentials to define an alternative way of computing the length of vectors in $T_p\mathbb{R}^n$ using the metric operator $ds : T_p\mathbb{R}^n \to \mathbb{R}$ defined by
\[ ds := \sqrt{\sum_{i=1}^{n} dx_i^2}. \]
Note that the metric is not a 1-form because it is not linear, but it is built using 1-forms. The notation $ds$ for the metric operator should not be confused with the differential $ds$ as in the previous section. We see indeed that
\[ ds(v) = \sqrt{\sum_{i=1}^{n} dx_i^2(v)} = \sqrt{\sum_{i=1}^{n} v_i^2} = \|v\|. \]
Note that the norm can be defined in curvilinear coordinate systems. Consider the differentials $dr$ and $d\theta$ from polar coordinates. We perform the change of coordinates from $dx, dy$ to $dr, d\theta$:
\[ dx = \cos\theta\,dr - r\sin\theta\,d\theta \quad\text{and}\quad dy = \sin\theta\,dr + r\cos\theta\,d\theta. \]
So,
\[ dx^2 + dy^2 = (\cos\theta\,dr - r\sin\theta\,d\theta)^2 + (\sin\theta\,dr + r\cos\theta\,d\theta)^2 = dr^2 + r^2\,d\theta^2 \]
and therefore,
\[ ds = \sqrt{dr^2 + r^2\,d\theta^2}. \]
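The identity $dx^2 + dy^2 = dr^2 + r^2\,d\theta^2$ can be checked on a concrete tangent vector. The sketch below is an illustration only: the function names are hypothetical, and the sample point $(r, \theta) = (2, \pi/3)$ with polar components $(v_r, v_\theta) = (0.2, 1.3)$ is borrowed from Example 3.3.3.

```python
import math

def cartesian_components(r, theta, v_r, v_theta):
    """Apply dx and dy, written in terms of dr and dtheta, to a vector."""
    dx = math.cos(theta) * v_r - r * math.sin(theta) * v_theta
    dy = math.sin(theta) * v_r + r * math.cos(theta) * v_theta
    return dx, dy

def polar_metric(r, v_r, v_theta):
    """ds = sqrt(dr^2 + r^2 dtheta^2) applied to (v_r, v_theta)."""
    return math.sqrt(v_r**2 + r**2 * v_theta**2)

r, theta, v_r, v_theta = 2.0, math.pi / 3, 0.2, 1.3
dx, dy = cartesian_components(r, theta, v_r, v_theta)
print(math.hypot(dx, dy), polar_metric(r, v_r, v_theta))
```

The two printed lengths agree, since the cross terms in $dr\,d\theta$ cancel when the squares are expanded.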
We see in the next result that the metric is the right concept to produce a general formula for computing arc-length. We can compute the arc-length of a curve $C$ by integrating the metric $ds$. We assume that the curve $C$ is given by an injective parametrization $r(t)$ with $t \in [c, d]$.

Theorem 4.2.5. The arc-length $\ell(C)$ of a curve $C$ in $\mathbb{R}^n$ with injective parametrization $r(t)$, $t \in [c, d]$, is given by
\[ \ell(C) = \int_c^d \|r'(t)\|\,dt. \]

Proof. Let $\tilde r(s)$ with $s \in [a, b]$ be the arc-length parametrization of $C$; then $\ell(C) = b - a$. The parametrizations $\tilde r(s)$ and $r(t)$ are related by $t = u(s)$ where $u : [a, b] \to [c, d]$ is differentiable, and we suppose, without loss of generality, that $u'(s) > 0$. We know that
\[ 1 = \|\tilde r'(s)\| = \|r'(t)\|\,|u'(s)|. \]
We now use $ds$ to obtain an approximation of the length of $C$. Let $c = t_0 < \dots < t_n = d$ be a partition of $[c, d]$ and consider the vectors $r'(t_i)\,dt(\Delta t_i)$ based at $r(t_i)$ where $\Delta t_i = t_{i+1} - t_i \in T_{t_i}\mathbb{R}$. Then,
\[ \sum_{i=0}^{n-1} ds(r'(t_i)\,dt(\Delta t_i)) = \sum_{i=0}^{n-1} \sqrt{x_1'(t_i)^2 + \dots + x_n'(t_i)^2}\;dt(\Delta t_i) = \sum_{i=0}^{n-1} \frac{1}{|u'(s_i)|}\,dt(\Delta t_i). \]
But $dt(\Delta t_i) = u(s_{i+1}) - u(s_i) \approx u'(s_i)\,ds(\Delta s_i)$. Thus,
\[ \sum_{i=0}^{n-1} \frac{1}{|u'(s_i)|}\,dt(\Delta t_i) \approx \sum_{i=0}^{n-1} \frac{1}{|u'(s_i)|}\,u'(s_i)\,ds(\Delta s_i) = b - a. \]
Letting $n \to \infty$ one has
\[ \int_c^d \|r'(t)\|\,dt = b - a = \ell(C) \]
and this completes the proof.

The following result is immediate from the proof of the above theorem.

Corollary 4.2.6. Let $C$ be a curve. The arc-length is independent of the vector function used to describe $C$. That is, if $r_1(t)$ with $t \in [a_1, b_1]$ and $r_2(t)$ with $t \in [a_2, b_2]$ are two different parametrizations of $C$, then
\[ \int_{a_1}^{b_1} \|r_1'(t)\|\,dt = \int_{a_2}^{b_2} \|r_2'(t)\|\,dt. \]
Proof. Any parametrization can be reparametrized to the arc-length parametrization and so the formula is the same, with the bounds of integration depending on the parametrization.

Remark 4.2.7.
(1) A more widespread notation for arc-length is
\[ \int_C ds := \ell(C). \]
(2) Note that the integral for computing arc-length is the same as the one we need to compute the arc-length parametrization.
(3) Reversing the orientation leaves $\int_C ds$ invariant because
\[ ds(r'(t_i)\,dt(\Delta t_i)) = ds(-r'(t_i)\,dt(\Delta t_i)). \]

Example 4.2.8. We compute the arc-length of $C$ given by $r(t) = (t\cos(2\pi t), t\sin(2\pi t))$ with $t \in [0, 2]$. The derivative is
\[ r'(t) = (\cos(2\pi t) - 2\pi t\sin(2\pi t), \sin(2\pi t) + 2\pi t\cos(2\pi t)) \]
and $\|r'(t)\| = \sqrt{1 + 4\pi^2 t^2}$.

Fig. 4.8. Curve C of Example 4.2.8, given by $r(t) = (t\cos(2\pi t), t\sin(2\pi t))$.

Therefore,
\[ \int_C ds = \int_0^2 \sqrt{1 + 4\pi^2 t^2}\,dt = \frac{1}{2\pi}\Big[ \pi t\sqrt{1 + 4\pi^2 t^2} + \frac{1}{2}\ln\big(2\pi t + \sqrt{1 + 4\pi^2 t^2}\big) \Big]_0^2 = \frac{1}{2\pi}\Big( 2\pi\sqrt{1 + 16\pi^2} + \frac{1}{2}\ln\big(4\pi + \sqrt{1 + 16\pi^2}\big) \Big). \]
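The closed form above, and the independence of arc-length from the parametrization (Corollary 4.2.6), can both be checked numerically. This is a sketch under assumptions: the helper name `arc_length`, the midpoint rule, and the alternative parametrization $t = u^2$ with $u \in [0, \sqrt{2}]$ are choices made for illustration.

```python
import math

def arc_length(dr, a, b, n=20000):
    """Midpoint-rule approximation of int_a^b ||r'(t)|| dt."""
    h = (b - a) / n
    return h * sum(math.hypot(*dr(a + (k + 0.5) * h)) for k in range(n))

def dr(t):
    """Derivative of r(t) = (t cos(2 pi t), t sin(2 pi t))."""
    w = 2 * math.pi
    return (math.cos(w * t) - w * t * math.sin(w * t),
            math.sin(w * t) + w * t * math.cos(w * t))

def dr_reparam(u):
    """Derivative of the same curve traced as r(u^2): chain rule gives 2u r'(u^2)."""
    return tuple(2 * u * c for c in dr(u * u))

root = math.sqrt(1 + 16 * math.pi**2)
closed_form = (2 * math.pi * root + 0.5 * math.log(4 * math.pi + root)) / (2 * math.pi)

print(closed_form)
print(arc_length(dr, 0.0, 2.0))
print(arc_length(dr_reparam, 0.0, math.sqrt(2.0)))
```

All three numbers agree to several decimal places, even though the two parametrizations trace the curve at different speeds.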
Example 4.2.9. Set up the integral for the arc-length of $C$ given by $r(t) = (e^t, te^t, t^2e^t)$ with $t \in [0, 2]$. The derivative is $r'(t) = (e^t, e^t + te^t, 2te^t + t^2e^t)$ and so, since $1 + (1 + t)^2 + (2t + t^2)^2 = 2 + 2t + 5t^2 + 4t^3 + t^4$,
\[ \|r'(t)\| = e^t\sqrt{2 + 2t + 5t^2 + 4t^3 + t^4}. \]
Then
\[ \int_C ds = \int_0^2 e^t\sqrt{2 + 2t + 5t^2 + 4t^3 + t^4}\,dt. \]
More Metrics

We begin with some examples of metrics in three dimensions, the simplest one being obtained for cylindrical coordinates, which is just a direct extension of the polar coordinate case. We have
\[ ds = \sqrt{dr^2 + r^2\,d\theta^2 + dz^2}. \]
The metric in spherical coordinates requires more work to derive. We begin by writing $dx$, $dy$, $dz$ as functions of $d\rho$, $d\phi$ and $d\theta$. Recall
\[ x = \rho\cos\theta\sin\phi, \quad y = \rho\sin\theta\sin\phi, \quad z = \rho\cos\phi. \]
Then,
\[ dx = \cos\theta\sin\phi\,d\rho - \rho\sin\theta\sin\phi\,d\theta + \rho\cos\theta\cos\phi\,d\phi, \]
\[ dy = \sin\theta\sin\phi\,d\rho + \rho\cos\theta\sin\phi\,d\theta + \rho\sin\theta\cos\phi\,d\phi, \]
\[ dz = \cos\phi\,d\rho - \rho\sin\phi\,d\phi, \]
and a straightforward but lengthy computation shows
\[ ds^2 = dx^2 + dy^2 + dz^2 = d\rho^2 + \rho^2\sin^2\phi\,d\theta^2 + \rho^2\,d\phi^2. \]
This metric is useful for measuring curves lying on a sphere or following a spherical trajectory, possibly of non-constant radius.

Example 4.2.10. We set up the integral for the length of the curve $C$ given by the vector function $r(t) = (t\cos t\sin t, t\sin^2 t, t\cos t)$ with $t \in [0, \pi/2]$. We write the curve in spherical coordinates:
\[ \rho = \sqrt{t^2\cos^2 t\sin^2 t + t^2\sin^4 t + t^2\cos^2 t} = t, \]
with $\theta = \arctan(y(t)/x(t))$ and $\cos\phi = z(t)/\rho(t)$. Because $t \in [0, \pi/2]$, we obtain
\[ \theta = t, \quad \phi = t. \]
Therefore, $d\rho^2 = dt^2$, $d\theta^2 = dt^2$ and $d\phi^2 = dt^2$. Thus,
\[ \int_C ds = \int_0^{\pi/2} \sqrt{1 + t^2\sin^2 t + t^2}\,dt. \]
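As a sanity check, the integrand obtained from the spherical metric can be compared with $\|r'(t)\|$ computed directly in Cartesian coordinates. The sketch below is only an illustration; the function names and the midpoint rule are assumptions, and the Cartesian derivatives are worked out by hand with the product rule.

```python
import math

def midpoint(f, a, b, n=20000):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def speed_spherical(t):
    """Integrand from ds^2 = drho^2 + rho^2 sin^2(phi) dtheta^2 + rho^2 dphi^2
    with rho = theta = phi = t."""
    return math.sqrt(1 + t**2 * math.sin(t)**2 + t**2)

def speed_cartesian(t):
    """||r'(t)|| for r(t) = (t cos t sin t, t sin^2 t, t cos t)."""
    s, c = math.sin(t), math.cos(t)
    dx = c * s + t * (c * c - s * s)   # d/dt (t cos t sin t)
    dy = s * s + 2 * t * s * c         # d/dt (t sin^2 t)
    dz = c - t * s                     # d/dt (t cos t)
    return math.sqrt(dx * dx + dy * dy + dz * dz)

print(midpoint(speed_spherical, 0.0, math.pi / 2))
print(midpoint(speed_cartesian, 0.0, math.pi / 2))
```

The two integrands coincide pointwise, so both printed lengths are the same; the spherical form is simply shorter to write down.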
Fig. 4.9. The curve given by $r(t) = (t\cos t\sin t, t\sin^2 t, t\cos t)$ in Example 4.2.10.
Exercises
(1) For the following curves $C$, determine if the parametrization is injective. If not, restrict the domain of $t$ to make it injective.
(a) Consider the curve $C$ given by $r(t) = (a\cos(2t), b\sin t)$ with $t \in [-\frac{\pi}{2}, \frac{\pi}{2}]$.
(b) Consider the curve $C$ given by $r(t) = (t^2, \sin(4\pi t^2))$ with $t \in [-1, 1]$.
(2) Find the length of the curve $C$ given by $r(t) = (t^2/2, t^3/3)$ with $t \in [0, 1]$. Compare your answer with the domain obtained in Example 2.3.5.
(3) Find the length of the cycloid $C$ given by $r(t) = (r(t - \sin t), r(1 - \cos t))$ with $t \in [0, 2\pi]$. Use computer software to plot $C$.
(4) Find the length of the cardioid $C$ given by $r(t) = (a(2\cos t - \cos(2t)), a(2\sin t - \sin(2t)))$ with $t \in [0, 2\pi]$. Use computer software to plot $C$.
(5) Find the length of the deltoid $C$ given by
\[ r(t) = (2a\cos t + a\cos(2t), 2a\sin t - a\sin(2t)) \]
with $t \in [0, 2\pi]$. Use computer software to plot $C$.
(6) Show that for a vector function $r(t) = (t, f(t))$, $ds = \sqrt{1 + f'(t)^2}\,dt$.
(7) Find the length of the helix $C$ given by $r(t) = (a\cos(2\pi t), a\sin(2\pi t), t)$ with $t \in [0, 2]$. Use the metric $ds$ in cylindrical coordinates.
(8) Find the length of the curve $C$ given by $r(t) = (t\cos t, t\sin t, t)$ with $t \in [0, 2\pi]$. Is a given metric preferable here?
(9) Find the length of the curve $C$ given by $r(t) = (e^{2t}\cos t, 2, e^{2t}\sin t)$ with $t \in [0, 2\pi]$.
4.2.2 Integral of functions over curves: including applications

Consider the situations shown in Figure 4.10:

Fig. 4.10. Fences of constant height h (left) and variable height h(p) (right) over a curve C.
(1) Suppose that a fence of constant height h > 0 is built on top of a curve C. The intuition, which is correct, is that the area of the fence is given by the length of the curve C times the height of the fence:
\[ \int_C h\,ds. \]
If the height depends on the location along the fence, h = h(p) where p ∈ C, then by analogy with the area under a curve seen in elementary calculus, we expect the area to be given by an expression such as
\[ \int_C h(p)\,ds. \tag{4.8} \]
(2) A very thin wire (of uniform radial cross-section) is made up of a single material and is arranged in the shape given by a curve C. It is possible that the material is unevenly distributed along the wire and so the density of material in some small Δs piece of the curve varies along C. This means that for a small portion of wire near two different points p, q ∈ C, the mass in equal neighborhoods near p and q could be different. The mass near p and q is approximated by ρ(p) ds(v) and ρ(q) ds(w) where ρ is the mass density (kg/m) (evaluated on C) and ds (m) is a small distance near the points p and q where v ∈ T_p C, w ∈ T_q C. Thus, the mass of the wire should be given by
\[ \int_C \rho(p)\,ds. \tag{4.9} \]
We want to make sense of the expressions (4.8) and (4.9). Let r(t) be a parametrization of a curve C and let f(x) be a function defined on C, where x = (x_1, . . . , x_n). Let Δt_j := t_{j+1} − t_j ∈ T_{t_j}R and define
\begin{align*}
R_n &:= \sum_{j=0}^{n-1} f(r(t_j))\, ds\big(r'(t_j)\,dt(\Delta t_j)\big)\\
    &= \sum_{j=0}^{n-1} f(r(t_j)) \sqrt{x_1'(t_j)^2 + \cdots + x_n'(t_j)^2}\; dt(\Delta t_j).
\end{align*}
We define
\begin{align*}
\int_C f(x)\,ds &:= \lim_{n\to\infty} R_n\\
 &= \lim_{n\to\infty} \sum_{j=0}^{n-1} f(r(t_j)) \sqrt{x_1'(t_j)^2 + \cdots + x_n'(t_j)^2}\; dt(\Delta t_j)\\
 &= \int_a^b f(r(t)) \sqrt{x_1'(t)^2 + \cdots + x_n'(t)^2}\; dt\\
 &= \int_a^b f(r(t))\,\|r'(t)\|\,dt.
\end{align*}
This leads us to the following definition.

Definition 4.2.11. Consider a function f : R^n → R and C a curve defined by r(t) with t ∈ [a, b]. The integral of f over C is given by
\[ \int_C f(x_1,\dots,x_n)\,ds = \int_a^b f(r(t))\,\|r'(t)\|\,dt. \]
We look at a few examples.
(1) Write the formula for the integral of f(x, y) = xy over the curve C defined by r(t) = (t, t³) with t ∈ [0, 1]. Begin by obtaining the derivative r'(t) = (1, 3t²) and \( \|r'(t)\| = \sqrt{1 + 9t^4} \). Since f(r(t)) = t · t³ = t⁴,
\[ \int_C f\,ds = \int_0^1 t^4\sqrt{1 + 9t^4}\,dt. \]
(2) Compute the integral of f(x, y, z) = 2x over the curve C defined by r(t) = (t, 3 cos t, 3 sin t) with t ∈ [0, 2π]. We compute directly \( \|r'(t)\| = \sqrt{10} \) and so
\[ \int_C f\,ds = \int_0^{2\pi} 2t\sqrt{10}\,dt = \sqrt{10}\,t^2\Big|_0^{2\pi} = 4\pi^2\sqrt{10}. \]
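The Riemann sums R_n behind Definition 4.2.11 can be evaluated directly. The following sketch, in which the function name `riemann_line_integral` and the choice of a uniform partition are illustrative assumptions, applies a left Riemann sum to example (2) and compares it with the exact value 4π²√10.

```python
import math

def riemann_line_integral(f, r, dr, a, b, n):
    """Left Riemann sum: sum_j f(r(t_j)) * ||r'(t_j)|| * (t_{j+1} - t_j)."""
    dt = (b - a) / n                      # uniform partition of [a, b]
    total = 0.0
    for j in range(n):
        t = a + j * dt
        speed = math.sqrt(sum(c * c for c in dr(t)))   # ||r'(t_j)||
        total += f(*r(t)) * speed * dt
    return total

# Example (2): f(x, y, z) = 2x over r(t) = (t, 3 cos t, 3 sin t), t in [0, 2*pi]
f = lambda x, y, z: 2 * x
r = lambda t: (t, 3 * math.cos(t), 3 * math.sin(t))
dr = lambda t: (1.0, -3 * math.sin(t), 3 * math.cos(t))

exact = 4 * math.pi ** 2 * math.sqrt(10)
approx = riemann_line_integral(f, r, dr, 0.0, 2 * math.pi, 100_000)
print(approx, exact)
```

Because the integrand 2t√10 is linear in t, the left sum underestimates the exact value by a relative error of 1/n, so the approximation improves steadily as the partition is refined.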
Example 4.2.12. The formula for the integral of f over C is also valid in other coordinate systems. Consider the curve given by the vector function
\[ r(t) = (a\cos t\sin t,\ a\sin^2 t,\ a\cos t) \]
with t ∈ [0, π/2] seen in Figure 4.11. This curve has a form similar to the one of Example 4.2.10 and one can verify that C is located on a sphere of radius a. We compute dρ, dθ and dφ explicitly. We start with ρ² = x² + y² + z² and take the differential to obtain 2ρ dρ = 2x dx + 2y dy + 2z dz. Then,
\begin{align*}
\rho\,d\rho &= x\,dx + y\,dy + z\,dz\\
 &= a^2\cos t\sin t(-\sin^2 t + \cos^2 t)\,dt + a^2\sin^2 t\,(2\sin t\cos t\,dt) + a\cos t\,(-a\sin t\,dt)\\
 &= a^2(\cos^3 t\sin t + \sin^3 t\cos t - \sin t\cos t)\,dt\\
 &= a^2(\sin t\cos t(\cos^2 t + \sin^2 t) - \sin t\cos t)\,dt = 0.
\end{align*}
Similarly, since x² + y² = ρ² sin² φ,
\begin{align*}
\rho^2\sin^2\varphi\,d\theta &= -y\,dx + x\,dy\\
 &= \big(-a^2\sin^2 t(-\sin^2 t + \cos^2 t) + a^2\cos t\sin t(2\sin t\cos t)\big)\,dt\\
 &= a^2(\sin^4 t + \sin^2 t\cos^2 t)\,dt\\
 &= a^2\sin^2 t\,dt.
\end{align*}
Because ρ² sin² φ = x² + y² = a² sin² t, this gives dθ = dt. Finally, cos φ = z/ρ = cos t gives φ = t, and dz = cos φ dρ − ρ sin φ dφ with dρ = 0 becomes −a sin t dt = −a sin t dφ, so dφ = dt. Substituting into the spherical metric,
\[ ds^2 = d\rho^2 + \rho^2\sin^2\varphi\,d\theta^2 + \rho^2\,d\varphi^2 = a^2\sin^2 t\,dt^2 + a^2\,dt^2, \qquad ds = a\sqrt{1 + \sin^2 t}\,dt. \]

Fig. 4.11. Curve r(t) = (2 cos t sin t, 2 sin² t, 2 cos t) from Example 4.2.12 (the case a = 2)

The integral of f(x, y, z) = z over C is then
\begin{align*}
\int_C z\,ds &= \int_0^{\pi/2} a^2\cos t\sqrt{1 + \sin^2 t}\,dt, \qquad u = \sin t\\
 &= a^2\int_0^1 \sqrt{1 + u^2}\,du\\
 &= \frac{a^2}{2}\Big[u\sqrt{1 + u^2} + \ln\big(u + \sqrt{1 + u^2}\big)\Big]_0^1\\
 &= \frac{a^2}{2}\big(\sqrt{2} + \ln(1 + \sqrt{2})\big).
\end{align*}
Mass and centre of mass

As described above, for a thin wire with mass density given by a function ρ(x, y, z), the mass m of the wire is given by
\[ m = \int_C \rho(x, y, z)\,ds. \]
Example 4.2.13. We compute the mass of the wire of density ρ(x, y, z) = z of shape (cos t, sin t, t) with t ∈ [0, 4π]. This is done in cylindrical coordinates with \( ds = \sqrt{dr^2 + r^2\,d\theta^2 + dz^2} \). We know
\[ r\,dr = x\,dx + y\,dy = \cos t(-\sin t\,dt) + \sin t(\cos t\,dt) = 0, \qquad dz = dt \]
and
\[ r^2\,d\theta = -y\,dx + x\,dy = -\sin t(-\sin t\,dt) + \cos t(\cos t\,dt) = dt \]
with r² = cos² t + sin² t = 1. Thus, ds = √2 dt and
\[ \int_C \rho(x, y, z)\,ds = \int_0^{4\pi} t\sqrt{2}\,dt = 8\sqrt{2}\,\pi^2. \]
The centre of mass of a wire with density ρ(x, y, z) is located at (x̄, ȳ, z̄) given by the formulae
\[ \bar{x} = \frac{1}{m}\int_C x\,\rho(x, y, z)\,ds, \qquad \bar{y} = \frac{1}{m}\int_C y\,\rho(x, y, z)\,ds, \qquad \bar{z} = \frac{1}{m}\int_C z\,\rho(x, y, z)\,ds, \]
where m is the mass of the wire. For a wire lying in a plane, we need only consider ρ(x, y) and (x̄, ȳ).

Example 4.2.14. Compute the centre of mass of a wire with constant density ρ(x, y) = 3 of shape C given by r(t) = (cos t, sin t) with t ∈ [0, π]. We begin by computing the mass using \( ds = \sqrt{dr^2 + r^2\,d\theta^2} \) (it is also straightforward with \( ds = \sqrt{dx^2 + dy^2} = dt \)). Because C has constant radius, dr = 0 can be easily checked. Now, tan θ = tan t so θ = t, but the domain must be split at t = π/2. Therefore, dθ = dt and
\[ m = \int_C 3\,ds = 3\int_0^{\pi} ds = 3\left(\int_0^{\pi/2} dt + \int_{\pi/2}^{\pi} dt\right) = 3(\pi/2 + \pi/2) = 3\pi. \]
Then,
\begin{align*}
\bar{x} &= \frac{1}{3\pi}\int_C 3x\,ds = \frac{1}{\pi}\int_0^{\pi} \cos t\,dt = \frac{1}{\pi}\sin t\,\Big|_0^{\pi} = 0,\\
\bar{y} &= \frac{1}{3\pi}\int_C 3y\,ds = \frac{1}{\pi}\int_0^{\pi} \sin t\,dt = -\frac{1}{\pi}\cos t\,\Big|_0^{\pi} = \frac{2}{\pi}.
\end{align*}

Fig. 4.12. Wire in a semicircle shape and location of the centre of mass (0, 2/π) in Example 4.2.14.
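The mass and centre of mass formulae translate directly into a small numerical routine. The sketch below is illustrative only: `simpson` and `wire_mass_and_centre` are hypothetical helper names, and the Cartesian form ds = ||r'(t)|| dt is used rather than polar coordinates. It reproduces the values of Example 4.2.14.

```python
import math

def simpson(g, a, b, n=2000):
    """Composite Simpson rule for the integral of g over [a, b] (n must be even)."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3

def wire_mass_and_centre(r, dr, density, a, b):
    """Return (m, centre) for a wire r(t), t in [a, b], using ds = ||r'(t)|| dt."""
    def ds(t):
        return math.sqrt(sum(c * c for c in dr(t)))
    m = simpson(lambda t: density(*r(t)) * ds(t), a, b)
    centre = tuple(
        simpson(lambda t, i=i: r(t)[i] * density(*r(t)) * ds(t), a, b) / m
        for i in range(len(r(a)))          # one coordinate of the centre per component
    )
    return m, centre

# Semicircular wire of constant density 3, as in Example 4.2.14
m, (xbar, ybar) = wire_mass_and_centre(
    lambda t: (math.cos(t), math.sin(t)),
    lambda t: (-math.sin(t), math.cos(t)),
    lambda x, y: 3.0,
    0.0, math.pi,
)
print(m, xbar, ybar)
```

The printed values should be close to 3π, 0 and 2/π respectively. Passing a non-constant `density` gives the general case without changing the routine.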
Exercises
(1) Let C be the curve given by r(t) = (t, t²) for t ∈ [0, 1] and compute
\[ \int_C x\,ds. \]
(2) Let C be the curve of intersection of the cone z² = x² + y² and the plane z = 2 − x − y. Compute
\[ \int_C z\,ds. \]
(3) Find the mass and the centre of mass of the wire C with density ρ(x, y) = 1 + x + y with shape given by r(t) = (1 − t)(−1, 0) + t(2, 3).
(4) Let C be the curve given by r(t) = (e^{−t} cos t, e^{−t} sin t) with t ∈ [0, 2π] and compute
\[ \int_C \sqrt{x^2 + y^2}\,ds. \]
(5) Find the mass and centre of mass of the wire C with density ρ(x, y, z) = xy with shape given by r(t) = (cos(2πt), sin(2πt), 3t) with t ∈ [0, 1].
(6) Let C be the curve given by r(t) = ((1 − t) cos t sin t, (1 − t) sin² t, (1 − t) cos t) with t ∈ [0, 2π]. Set up the integral
\[ \int_C x\,ds. \]
4.2.3 Curvature

We denote the unit tangent vector to a curve C given by a vector function r(t) by
\[ T(t) := \frac{r'(t)}{\|r'(t)\|} \]
and look at its evolution as t changes. In fact, we begin the discussion by assuming that r(s) is the arc-length parametrization of C, then T(s) · T(s) = 1 and
\[ \frac{d}{ds}\big(T(s)\cdot T(s)\big) = 2\,T(s)\cdot\frac{dT(s)}{ds} = 0. \tag{4.10} \]

Fig. 4.13. Curve C with unit tangent vector T(s) and the vector dT/ds.

This means the vector dT(s)/ds is perpendicular to the tangent vector, but need not be a unit vector. See Figure 4.13.

Definition 4.2.15. The curvature of a curve with the arc-length parametrization is
\[ \kappa = \left\|\frac{dT}{ds}\right\| \]
where T is the unit tangent vector.

We check this definition with some intuitive examples.

Example 4.2.16. Consider two points p, q ∈ R^n and C the line joining these points. The arc-length parametrization is given by
\[ r(s) = \frac{(\|q - p\| - s)\,p + s\,q}{\|q - p\|} \]
with s ∈ [0, ||q − p||]. Since r'(s) = (q − p)/||q − p|| is constant,
\[ \frac{dT}{ds} = 0 \]
and the curvature κ = 0. This agrees with our intuition that a straight line has no curvature.
Example 4.2.17. Consider the circle of radius a given by its arc-length parametrization r(s) = (a cos(s/a), a sin(s/a)) with s ∈ [0, 2πa]. Then,
\[ r'(s) = (-\sin(s/a),\ \cos(s/a)) \qquad\text{and}\qquad \frac{dT}{ds} = (-a^{-1}\cos(s/a),\ -a^{-1}\sin(s/a)) \]
which means κ = 1/a. A circle has constant curvature which is the reciprocal of its radius. Therefore, the larger the circle, the smaller its curvature. In plain words, this means that for an arc of same length on the small and large circles, the unit tangent vector on the small circle has a greater variation of its direction as compared to the larger circle; it is more curved!

Fig. 4.14. Comparison of the curvature of circles of radius 1 (κ = 1) and 10 (κ = 1/10).

As we have seen already several times, the arc-length parametrization is not always computable and so we now obtain formulas for κ in terms of general parametrizations r(t).

Theorem 4.2.18. Let C be a curve given by r(t) with t ∈ [a, b]. Then
\[ \kappa = \frac{\|T'(t)\|}{\|r'(t)\|} = \frac{\|r'(t)\times r''(t)\|}{\|r'(t)\|^3}. \]

Proof. Let t = φ(s) be the reparametrization between r(t) and its arc-length parametrization r̃(s). Then, ds/dt = ||r'(t)||. Recall that
\[ T(t) = \frac{r'(t)}{\|r'(t)\|} \]
and so
\[ \frac{dT}{ds} = T'(t)\,\frac{dt}{ds} = \frac{T'(t)}{ds/dt} = \frac{T'(t)}{\|r'(t)\|}. \]
Thus,
\[ \kappa = \frac{\|T'(t)\|}{\|r'(t)\|}. \]
The second formula is obtained by noticing that r'(t) = ||r'(t)|| T(t) = (ds/dt) T(t). The rest of the proof is an exercise in computing various derivatives. First,
\[ r''(t) = \frac{d^2 s}{dt^2}\,T(t) + \frac{ds}{dt}\,T'(t). \]
Using the fact that T(t) × T(t) = 0 and that T'(t) is perpendicular to the unit vector T(t), one can show that
\[ \|r'(t)\times r''(t)\| = \left(\frac{ds}{dt}\right)^2 \|T'(t)\|. \]
Isolating ||T'(t)|| from this equation yields the result.

The formula for κ in terms of r'(t) and r''(t) is typically easier to use.

Example 4.2.19. We compute the curvature of the parabola C given by r(t) = (t, t²). Begin with writing C as r(t) = (t, t², 0), then
\[ r'(t) = (1, 2t, 0) \qquad\text{and}\qquad r''(t) = (0, 2, 0) \]
and r'(t) × r''(t) = (0, 0, 2), so ||r'(t) × r''(t)|| = 2. Now, \( \|r'(t)\| = \sqrt{1 + 4t^2} \) which yields
\[ \kappa = \frac{2}{(1 + 4t^2)^{3/2}}. \]
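The cross-product formula of Theorem 4.2.18 is easy to evaluate mechanically once r'(t) and r''(t) are known. The sketch below uses illustrative helper names (`cross`, `norm`, `curvature`) that are not from the text. It checks the parabola of Example 4.2.19 at t = 1 and a circle of radius 10, with planar curves embedded in R³ by a zero third component.

```python
import math

def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def norm(u):
    """Euclidean norm."""
    return math.sqrt(sum(c * c for c in u))

def curvature(dr, ddr, t):
    """kappa(t) = ||r'(t) x r''(t)|| / ||r'(t)||^3 for r'(t), r''(t) given as 3-vectors."""
    v, acc = dr(t), ddr(t)
    return norm(cross(v, acc)) / norm(v) ** 3

# Parabola r(t) = (t, t^2, 0): expected kappa = 2 / (1 + 4 t^2)^(3/2)
kappa_parabola = curvature(lambda t: (1.0, 2 * t, 0.0),
                           lambda t: (0.0, 2.0, 0.0), 1.0)

# Circle of radius a = 10, r(t) = (a cos t, a sin t, 0): expected kappa = 1/a
a = 10.0
kappa_circle = curvature(lambda t: (-a * math.sin(t), a * math.cos(t), 0.0),
                         lambda t: (-a * math.cos(t), -a * math.sin(t), 0.0), 0.3)

print(kappa_parabola, kappa_circle)
```

The circle is deliberately not parametrized by arc length here, which illustrates that the formula does not depend on the choice of parametrization.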
Exercises
(1) Compute the curvature of the following curves.
    (a) r(t) = (e^t cos t, e^t sin t)
    (b) r(t) = (1 + t, 2 − 5t²)
    (c) r(t) = (t, t², t³)
    (d) r(t) = (cos t, sin t, t)
(2) For the first two curves of the previous problem, plot on the same graph r(t) and κ(t).
(3) Show that for a curve given by r(t) = (t, f(t)), the curvature formula is
\[ \kappa = \frac{|f''(t)|}{(1 + (f'(t))^2)^{3/2}}. \]
4.3 Line integrals of vector fields

There is a tight link between vector fields and 1-forms. Let x = (x_1, . . . , x_n) ∈ R^n, then
\[
\begin{array}{ccc}
\{\text{Vector fields in } \mathbb{R}^n\} & \Longleftrightarrow & \{\text{1-forms on } \mathbb{R}^n\}\\[2pt]
F(x) = (F_1(x), \dots, F_n(x)) & \Longleftrightarrow & \omega = F_1(x)\,dx_1 + \cdots + F_n(x)\,dx_n.
\end{array}
\]
In particular, one can write ω = (F_1(x), . . . , F_n(x)) · (dx_1, . . . , dx_n). Because of the correspondence above, we can define a line integral of vector fields. If F is a vector field and C is a curve given by r(t) with t ∈ [a, b], then we can evaluate the contribution of F along C by the formula F(r(t)) · r'(t) for all t ∈ [a, b].

Definition 4.3.1. The line integral of a vector field F over a curve C given by r(t) with t ∈ [a, b] is
\[ \int_a^b F(r(t))\cdot r'(t)\,dt. \]
Using (3.6), we obtain the notation
\[ \int_a^b F(r(t))\cdot r'(t)\,dt = \int_C F\cdot dr = \int_C F\cdot T\,ds \]
where T = r'(t)/||r'(t)||.

Example 4.3.2. Consider the vector field F(x, y) = (−3x + xy, 5xy) and let C be the curve given by r(t) = (t³, −2t²) with t ∈ [0, 1]. Then, the line integral of F over C is obtained as follows. Compute
\[ F(r(t)) = (-3(t^3) + t^3(-2t^2),\ 5t^3(-2t^2)) = (-3t^3 - 2t^5,\ -10t^5) \]
and so
\[ F(r(t))\cdot r'(t) = (-3t^3 - 2t^5,\ -10t^5)\cdot(3t^2,\ -4t) = -9t^5 - 6t^7 + 40t^6. \]
Therefore,
\[ \int_C F\cdot dr = \int_0^1 (-9t^5 - 6t^7 + 40t^6)\,dt = -\frac{3}{2} - \frac{3}{4} + \frac{40}{7}. \]
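Definition 4.3.1 can be checked numerically by integrating F(r(t)) · r'(t) over [a, b]. The sketch below is an illustration under stated assumptions: the helper names `simpson` and `line_integral` are hypothetical, and the field is taken to be F(x, y) = (−3x + xy, 5xy), which is the field that the evaluation F(r(t)) = (−3t³ − 2t⁵, −10t⁵) in Example 4.3.2 corresponds to.

```python
import math

def simpson(g, a, b, n=2000):
    """Composite Simpson rule for the integral of g over [a, b] (n must be even)."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3

def line_integral(F, r, dr, a, b):
    """Integral over [a, b] of F(r(t)) . r'(t) dt."""
    return simpson(lambda t: sum(Fi * di for Fi, di in zip(F(*r(t)), dr(t))), a, b)

F = lambda x, y: (-3 * x + x * y, 5 * x * y)   # assumed field, see lead-in
r = lambda t: (t ** 3, -2 * t ** 2)
dr = lambda t: (3 * t ** 2, -4 * t)

value = line_integral(F, r, dr, 0.0, 1.0)
exact = -3 / 2 - 3 / 4 + 40 / 7
print(value, exact)
```

The same `line_integral` routine works for curves in any dimension, as long as `F`, `r` and `dr` return tuples of matching length.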
Definition 4.3.3. The following two definitions are in direct correspondence.
(1) A 1-form ω ∈ Ωn(U) is exact if there exists a differentiable function f : U ⊂ R^n → R such that ω = df.
(2) A vector field F(x) defined on U ⊂ R^n is said to be gradient or conservative if there exists a differentiable function f : U ⊂ R^n → R such that F(x) = ∇f(x). In this case, f is called the potential function.

Fig. 4.15. Left: a simple closed curve. Right: a closed curve which is not simple because of the self-intersection.
The name conservative comes from physics where force fields are given by vector fields. In the case of conservative vector fields, we show at the end of the section the Principle of Conservation of Energy, which justifies the use of the term conservative.

Example 4.3.4. Let ω = y e^{xy} dx + x e^{xy} dy. We show that ω is exact with ω = df; that is,
\[ \frac{\partial f}{\partial x} = y e^{xy} \qquad\text{and}\qquad \frac{\partial f}{\partial y} = x e^{xy}. \]
We integrate
\[ f(x, y) = \int \frac{\partial f}{\partial x}\,dx = \int y e^{xy}\,dx = e^{xy} + g(y). \]
Therefore,
\[ \frac{\partial f}{\partial y} = x e^{xy} + g'(y) = x e^{xy} \]
so g'(y) = 0 which implies g(y) = K is a constant. Thus, f(x, y) = e^{xy} + K, where K is a constant of integration.

For exact 1-forms (and conservative vector fields), the function f such that ω = df is called an antiderivative, in analogy with the case of functions of one variable, and so the integration of ω is done directly using the antiderivative. This is the content of the next result, which is one of the cornerstones of advanced calculus. Before we write the statement, we need to introduce the concept of simple closed curve. A curve C is closed if it can be parametrized by r(t) with t ∈ [a, b] such that r(a) = r(b). A closed curve C is simple if it has no self-intersection. See Figure 4.15.
Theorem 4.3.5. [Fundamental Theorem of Calculus for Line Integrals] Consider
\[ \omega = \sum_{i=1}^{n} a_i(x)\,dx_i \quad\Longleftrightarrow\quad F(x) = (a_1(x), \dots, a_n(x)) \]
with ω = df and so F(x) = ∇f(x). Let C be a curve given by r(t) with parameter t ∈ [a, b], then
(a)
\[ \int_C \omega = f(r(b)) - f(r(a)) = \int_C F\cdot dr. \]
(b) If C and C' are two curves joining p ∈ R^n to q ∈ R^n, then
\[ \int_{C'} \omega = \int_C \omega \qquad\text{and}\qquad \int_{C'} F\cdot dr = \int_C F\cdot dr. \]
The integral is independent of the path joining p and q.
(c) If C is a simple closed curve, then
\[ \oint_C \omega = 0 = \oint_C F\cdot dr, \]
where ∮ is the symbol used to emphasize that the line integral is taken over a simple closed curve.

Proof. (a) Because ω is exact, by definition there exists a differentiable function f such that ω = df. Let r(t) = (x_1(t), . . . , x_n(t)), then
\begin{align*}
r^*\omega &= r^*\left(\frac{\partial f}{\partial x_1}(x)\,dx_1 + \cdots + \frac{\partial f}{\partial x_n}(x)\,dx_n\right)\\
 &= \frac{\partial f}{\partial x_1}(r(t))\,x_1'(t)\,dt + \cdots + \frac{\partial f}{\partial x_n}(r(t))\,x_n'(t)\,dt\\
 &= \left(\frac{\partial f}{\partial x_1}(r(t))\,x_1'(t) + \cdots + \frac{\partial f}{\partial x_n}(r(t))\,x_n'(t)\right) dt\\
 &= \nabla f(r(t))\cdot r'(t)\,dt\\
 &= \frac{d}{dt} f(r(t))\,dt
\end{align*}
where the last equality follows from the chain rule. Thus,
\[ \int_C \omega = \int_a^b \frac{d}{dt} f(r(t))\,dt = f(r(b)) - f(r(a)). \]
(b) Because C and C' have the same endpoints, the integrals are equal by part (a).
(c) A simple closed curve is such that r(b) = r(a) and so the integral is zero.

Remark 4.3.6. In particular, the last implication means
\[ \omega \text{ is exact} \implies \oint_C \omega = 0. \]
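Parts (a) and (b) of Theorem 4.3.5 can be illustrated numerically with the potential f(x, y) = e^{xy} of Example 4.3.4. The sketch below is only a sanity check under illustrative assumptions: the helper names and the two paths from (0, 0) to (1, 2) are choices made here, not taken from the text.

```python
import math

def simpson(g, a, b, n=2000):
    """Composite Simpson rule for the integral of g over [a, b] (n must be even)."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3

def line_integral(F, r, dr, a, b):
    """Integral over [a, b] of F(r(t)) . r'(t) dt."""
    return simpson(lambda t: sum(Fi * di for Fi, di in zip(F(*r(t)), dr(t))), a, b)

f = lambda x, y: math.exp(x * y)                                  # potential
F = lambda x, y: (y * math.exp(x * y), x * math.exp(x * y))       # F = grad f

# Two different paths from (0, 0) to (1, 2)
straight = line_integral(F, lambda t: (t, 2 * t), lambda t: (1.0, 2.0), 0.0, 1.0)
parabola = line_integral(F, lambda t: (t, 2 * t * t), lambda t: (1.0, 4 * t), 0.0, 1.0)

potential_difference = f(1.0, 2.0) - f(0.0, 0.0)                  # e^2 - 1
print(straight, parabola, potential_difference)
```

All three printed numbers should agree up to quadrature error, which is the path independence stated in part (b).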
Example 4.3.7. Consider the 1-form
\[ \omega = \frac{-y\,dx + x\,dy}{x^2 + y^2}. \]
In polar coordinates, ω = dθ. Consider now a circle C of radius a centered at (0, 0) and compute
\[ \oint_C \omega = \int_0^{2\pi} d\theta = 2\pi \neq 0. \]
Therefore, ω is not exact.

In Chapters 6 and 8, we investigate in more detail the question of exactness of 1-forms and establish a necessary and sufficient condition for a 1-form to be (locally) exact. Example 4.3.7 is crucial in the understanding of this problem.
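The role of the origin in Example 4.3.7 can be explored numerically. The sketch below is illustrative: the helper names and the two closed curves are choices made here. It integrates ω over an ellipse that surrounds the origin and over a circle of radius 1 centered at (2, 0) that does not.

```python
import math

def simpson(g, a, b, n=2000):
    """Composite Simpson rule for the integral of g over [a, b] (n must be even)."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3

def integrate_omega(r, dr):
    """Integral of omega = (-y dx + x dy) / (x^2 + y^2) over r(t), t in [0, 2*pi]."""
    def integrand(t):
        x, y = r(t)
        dx, dy = dr(t)
        return (-y * dx + x * dy) / (x * x + y * y)
    return simpson(integrand, 0.0, 2 * math.pi)

# Ellipse around the origin: r(t) = (3 cos t, sin t)
around_origin = integrate_omega(lambda t: (3 * math.cos(t), math.sin(t)),
                                lambda t: (-3 * math.sin(t), math.cos(t)))

# Circle of radius 1 centered at (2, 0): the origin lies outside
away_from_origin = integrate_omega(lambda t: (2 + math.cos(t), math.sin(t)),
                                   lambda t: (-math.sin(t), math.cos(t)))

print(around_origin / (2 * math.pi), away_from_origin / (2 * math.pi))
```

The first ratio should be close to 1 and the second close to 0, which suggests that the obstruction to exactness comes from the point where ω is undefined rather than from the particular shape of the curve.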
Example from Physics

We now look at applications of the Fundamental Theorem of Calculus for line integrals in physics.

Example 4.3.8. The radial gravitational force field given by
\[ F(x) = \frac{mMG}{r^2}\,\frac{-x}{\|x\|} \]
where x = (x, y, z) and r = ||x|| is conservative. The potential function is:
\[ f(x) = \frac{mMG}{\|x\|}. \]
Indeed, since \( \partial_x (x^2 + y^2 + z^2)^{-1/2} = -x\,(x^2 + y^2 + z^2)^{-3/2} \), and similarly for y and z,
\[ \nabla f(x) = \frac{mMG}{r^2}\left( \frac{-x}{\sqrt{x^2 + y^2 + z^2}},\ \frac{-y}{\sqrt{x^2 + y^2 + z^2}},\ \frac{-z}{\sqrt{x^2 + y^2 + z^2}} \right) = F(x). \]
We compute the work done by F on the following paths.
(1) C given by r(t) = (cos t sin t, sin² t, cos t) with t ∈ [0, π/4]. Because F is conservative,
\begin{align*}
\int_C F\cdot dr &= f(r(\pi/4)) - f(r(0))\\
 &= f\big(\tfrac{1}{2}, \tfrac{1}{2}, \tfrac{\sqrt{2}}{2}\big) - f(0, 0, 1)\\
 &= \frac{mMG}{\sqrt{\tfrac{1}{4} + \tfrac{1}{4} + \tfrac{1}{2}}} - \frac{mMG}{\sqrt{0^2 + 0^2 + 1^2}}\\
 &= mMG\,(1 - 1) = 0.
\end{align*}
The work vanishes because C lies on the unit sphere, on which f is constant.
(2) C is given by r(t) = (cos t, sin t, cos(3t)) with t ∈ [0, 2π]. Then,
\[ \int_C F\cdot dr = f(r(2\pi)) - f(r(0)) = f(1, 0, 1) - f(1, 0, 1) = 0. \]
Consider the work done by a force F on an object moved between p and q in R^n. If F is conservative, then by Theorem 4.3.5(b), the work done by F is independent of the path chosen. Suppose that an object of mass m is moved on a path C given by r(t) between p = r(a) and q = r(b) in R³. The force applied on the object at r(t) is F(r(t)) = m r''(t) by Newton's second law. The force F is not necessarily conservative, but the work 1-form dW = F · (dx, dy, dz) on r(t) is
\[ r^* dW = F(r(t))\cdot r'(t)\,dt = m\,r''(t)\cdot r'(t)\,dt = \frac{m}{2}\,\frac{d}{dt}\|r'(t)\|^2\,dt \]
and integrating we have
\[ \int_C dW = \frac{m}{2}\int_a^b \frac{d}{dt}\|r'(t)\|^2\,dt = \frac{m}{2}\left(\|r'(b)\|^2 - \|r'(a)\|^2\right). \]
The quantity \( \tfrac{1}{2} m \|r'(t)\|^2 \) is called the kinetic energy of the object and is denoted K(r(t)). This gives us the following result.

Theorem 4.3.9. The work done by a force F in moving an object of mass m on a path C between points p, q ∈ R^n is the difference in the kinetic energy at points p and q.

In the case of a conservative force F with potential function f, we know furthermore that
\[ \int_C F\cdot dr = f(r(b)) - f(r(a)). \tag{4.11} \]
The potential energy of an object located at p in a conservative field F is P(p) = −f(p) and therefore, F(p) = −∇P(p). Therefore, (4.11) becomes
\[ \int_C F\cdot dr = P(r(a)) - P(r(b)). \]
Together with Theorem 4.3.9 we have
\[ \frac{m}{2}\left(\|r'(b)\|^2 - \|r'(a)\|^2\right) = P(r(a)) - P(r(b)). \]
We define E(p) := K(p) + P(p) to be the total energy at p. Rearranging the terms, we obtain
\[ E(r(b)) = K(r(b)) + P(r(b)) = K(r(a)) + P(r(a)) = E(r(a)). \]
This leads us to one of the Fundamental Laws of Physics.

Theorem 4.3.10. [Principle of Conservation of Energy] In a conservative field F, the total energy is conserved along any path C joining two points p and q.
Exercises
(1) Compute the line integrals of vector fields F over the curves C given below.
    (a) F(x, y) = (x sin(y), cos(y)) where C is given by r(t) = (cos(t), t) with t ∈ [0, 2π].
    (b) F(x, y, z) = (y² e^x, z cos(yz), xz) where C is given by r(t) = (ln(t), t, 2t) and t ∈ [1, 3].
(2) Show that the line integrals below are independent of path and compute the integrals.
    (a) \( \int_C (2xy + z^2)\,dx + (x^2 - 2z^2)\,dy + (2xz - 4yz)\,dz \) where C is any path from (−1, 2, 0) to (0, 2, 3).
    (b) \( \int_C (e^x\sin(y) + 3x^2)\,dx + (e^x\cos(y) - e^y)\,dy \) where C is any path from (0, 0) to (1, 0).
(3) Let ω = (−y dx + x dy)/(x² + y²) and show the following integrals.
    (a) \( \frac{1}{2\pi}\oint_C \omega = 1 \) where C is the square with vertices (1, 1), (−1, 1), (−1, −1), (1, −1) in counterclockwise direction.
    (b) \( \frac{1}{2\pi}\oint_C \omega = 0 \) where C is the circle of radius 1 centered at (2, 0) in the counterclockwise direction.
    (c) \( \frac{1}{2\pi}\oint_C \omega = n \) where C is given by r(t) = (cos(nt), sin(nt)) with t ∈ [0, 2π] and n ∈ N is a fixed positive integer.
(4) From the previous problem, we see that if ω = (−y dx + x dy)/(x² + y²), then ω is either non-exact, or could be exact if the integral \( \frac{1}{2\pi}\oint_C \omega \) is zero. Notice that ω is not defined at (0, 0) and in the first and third integral, the curve C surrounds (0, 0), but not the second curve. Consider ω̃ = (−y dx + (x − 2) dy)/((x − 2)² + y²) and show that
    (a) \( \frac{1}{2\pi}\oint_C \tilde\omega = 1 \) where C is the circle of radius 1 centered at (2, 0).
    (b) \( \frac{1}{2\pi}\oint_C \tilde\omega = 0 \) where C is given by r(t) = (cos(nt), sin(nt)) with t ∈ [0, 2π].
    What is your conclusion? (Hint: use the substitution u = x − 2.)
(5) Consider the electric force field
\[ F(x) = \varepsilon\, qQ\, \frac{x}{\|x\|^3} \]
generated by a charged particle Q located at the origin on an electric charge q located at x = (x, y, z). Suppose an electron is located at the origin with charge Q = −1.6 × 10^{−19} Coulomb and a positive charge q > 0 is located at (x, y, z) = (10^{−12}, 0, 0) (in meters). Find the work done by the electric force field on the positive charge as it moves to the location (x, y, z) = (0, 10^{−8}, 0) (use the value ε = 8.985 × 10⁹).
5 Differential Calculus of Mappings

The goal of this chapter is to generalize the basic results of differential calculus to mappings. We begin by distinguishing between functions (mappings) and the graphs of functions. This distinction is often blurred when studying functions f : R → R, but for general mappings it is an essential step. We go on to discuss various ways of visualizing some types of mappings. The following section deals with the concepts of limit and continuity. The definitions are similar to the case of real functions of a single variable; however, examples have much more exotic properties. We continue with the concept of best linear approximation, introduced previously for vector functions, which we now apply to general mappings and use to obtain a definition of derivative. We present the main properties of the derivative, in particular, the Chain Rule. We conclude this chapter with higher derivatives and Taylor expansions.
5.1 Graphs and Level Sets

We begin the study of local properties of general functions or mappings f : R^n → R^m where n, m ≥ 1. We write those mappings as column vectors. For instance
(1) r : R → R³:
\[ r(t) = \begin{pmatrix} x(t)\\ y(t)\\ z(t) \end{pmatrix}. \]
(2) f : R² → R²:
\[ f(x, y) = \begin{pmatrix} f_1(x, y)\\ f_2(x, y) \end{pmatrix} \]
where f_1, f_2 : R² → R.
(3) f : R⁵ → R³:
\[ f(x_1, x_2, x_3, x_4, x_5) = \begin{pmatrix} f_1(x_1, x_2, x_3, x_4, x_5)\\ f_2(x_1, x_2, x_3, x_4, x_5)\\ f_3(x_1, x_2, x_3, x_4, x_5) \end{pmatrix} \]
where f_1, f_2, f_3 : R⁵ → R.
We now define explicitly the “graph of a function”. In the case of functions f : R → R or f : R2 → R, the graph of f can be easily visualized and the function and its graph are often thought of as one and the same. For vector functions of several variables,
this is not feasible and we must then make sure that the concepts of a function and its graph are well-understood in their own rights. Definition 5.1.1. Let U ⊂ Rn be an open set and f : U → Rm be a mapping. The graph of f is the set graph (f ) = {(x, f (x)) | x ∈ U }. We look at a few examples
Fig. 5.1. Graph of a function f : [a, b] → R
Example 5.1.2. Consider a function f : [a, b] → R, then graph (f ) = {(x, f (x)) | x ∈ [a, b]}. We can plot the points of graph (f ) in the plane and this gives us the familiar representation of a curve over an interval in the plane as shown in Figure 5.1.
Fig. 5.2. Graph of a function f : U ⊂ R2 → R
Let U ⊂ R² and f : U → R, then graph(f) = {(x, y, f(x, y)) | (x, y) ∈ U}. As shown in Figure 5.2, we can plot graph(f) in a representation of three-dimensional space.
In Chapter 1, we plot several curves in the plane given by vector functions r : R → R². The graph of r can also often be visualized in three-dimensional space, as seen in Figure 5.3.

Fig. 5.3. Graph of a vector function in R³, with r(t) = (t cos(t) sin(t), t sin²(t), t cos(t)).
Graphical Representations

The previous examples show visualizations of graphs of functions and mappings. Those can be obtained for f : R^n → R^m with n + m ≤ 3. We can complement the graphical representations with other visualizations in the case of vector functions r : [a, b] → R³ and vector fields f : R² → R² and f : R³ → R³ as presented in previous chapters. Another approach to obtain information about mappings (and their graph) for functions of several variables is through level sets. The traces of conics of Chapter 1 are examples of level sets and we briefly mention level set curves in Section 3.2.3. We now formalize the concept.

Definition 5.1.3. Let U ⊂ R^n, f : U → R. The level sets of f are given by
\[ \{(x_1, \dots, x_n) \in U \mid f(x_1, \dots, x_n) = c\}, \tag{5.1} \]
where c ∈ R. We denote the set (5.1) by f⁻¹(c) and it is called the inverse image of c by f. In particular, f⁻¹(c) can be the empty set.

Note that the definition of inverse image is similar to the inverse of a function; in fact, if a function is invertible, its inverse image consists of isolated points. We now look at a few examples.

Example 5.1.4. The Monkey Saddle is a surface given by the graph of f(x, y) = x³ − 3xy², see Figure 5.4. We look at its level sets by computing the inverse image f⁻¹(c) for various values of c. It is typical, as a first step, to consider c = 0 and one value of c < 0 and c > 0. We have for c = 0 that f(x, y) = x(x² − 3y²) = 0 and this holds for x = 0 and x = ±√3 y. See Figure 5.5 for some level set curves. Therefore, f⁻¹(0) is made up of three lines through (0, 0) at angles of 2π/3. For c ≠ 0 the task is more difficult.

Fig. 5.4. Monkey saddle surface

Fig. 5.5. Level sets f⁻¹(1), f⁻¹(−1) and f⁻¹(0) of the Monkey saddle surface.
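The description of f⁻¹(0) in Example 5.1.4 is easy to test by sampling. The sketch below uses illustrative names and arbitrary sample values of y. It evaluates the Monkey Saddle function on the three lines x = 0 and x = ±√3 y and records how far the values are from zero.

```python
import math

def monkey_saddle(x, y):
    """f(x, y) = x^3 - 3 x y^2."""
    return x ** 3 - 3 * x * y ** 2

# Sample points on the three lines x = 0, x = sqrt(3) y and x = -sqrt(3) y
residuals = []
for y in (-2.0, -0.5, 0.0, 1.0, 3.0):
    for x in (0.0, math.sqrt(3) * y, -math.sqrt(3) * y):
        residuals.append(abs(monkey_saddle(x, y)))

max_residual = max(residuals)          # zero up to rounding error
on_level_one = monkey_saddle(1.0, 0.0) # the point (1, 0) lies on the level set c = 1
print(max_residual, on_level_one)
```

For c ≠ 0 no such simple description is available, and in practice one would resort to a contour-plotting routine to draw f⁻¹(c).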
Example 5.1.5. The motion of a pendulum without friction has total energy given by
\[ E(x, y) = \frac{1}{2}y^2 - \cos(x). \]
Figure 5.6 shows the surface given by E and Figure 5.7, level sets for c = −1, c = −1/2 and c = 2. We see a transition from isolated points (at x = 2kπ), to circles, and finally to curves extending to ±∞ in x.

Fig. 5.6. Surface given by the energy E of the pendulum without friction.

Example 5.1.6. Let x = (x_1, x_2) be coordinates on the plane and y = (y_1, y_2) the velocity of the particle at x. The harmonic oscillator in the plane has total energy given by
\[ H(x, y) = \frac{1}{2}(y\cdot y + x\cdot x) \]
where y · y/2 is the kinetic energy and x · x/2 is the potential energy. The level sets of H are given by
\[ H^{-1}(h) = \{(x, y) \in \mathbb{R}^4 \mid \tfrac{1}{2}(y\cdot y + x\cdot x) = h\}. \]
For h > 0 we let 2h = k² for k > 0. The level sets H⁻¹(h) correspond to three-dimensional spheres of radius k since they satisfy y · y + x · x = k², an equation analogous to the equation of a sphere in R³, but with an additional variable. For h = 0, the level set is only the origin (0, 0, 0, 0), while for h < 0 the level set is empty.
Exercises
(1) Write the expression for the graph of the functions.
    (a) f(x, y, z) = (x², xy, yz³)^T.
    (b) g(A) = det(A) where A is a 2 × 2 matrix.
(2) Write the level sets of the functions and give a geometrical description.
    (a) g(x, y) = cos x cos y
    (b) f(x, y, z) = 2x² + y² + z²

5.2 Limits and Continuity

In this section we extend the concepts of limit and continuity to general mappings. The computation of limits for functions of several variables is much trickier than for functions f : R → R.
Fig. 5.7. Level sets E⁻¹(−1), E⁻¹(−1/2) and E⁻¹(2) of the energy E of the pendulum without friction.
Example 5.2.1. Consider the function f : R² → R defined by
\[ f(x, y) = \begin{cases} x & \text{if } y = 0\\ y & \text{if } x = 0\\ 1 & \text{otherwise} \end{cases} \]
and we look at various approaches towards (0, 0). Consider the path (x(t), y(t)) = (t, t), then
\[ \lim_{(t,t)\to(0,0)} f(x, y) = 1 \]
because f(x, y) = 1 on the path (t, t) for t ≠ 0. But
\[ \lim_{(x,0)\to(0,0)} f(x, y) = 0. \]
Therefore, the limit cannot exist at (0, 0).

The definition of limit for mappings, therefore, needs to consider all possible approaches of x towards x_0.

Definition 5.2.2. Let f : R^n → R^m be a mapping, then the limit of f at x_0 ∈ R^n exists if there exists a unique ℓ ∈ R^m such that
\[ \lim_{x\to x_0} f(x) = \ell, \]
which means
\[ \forall \varepsilon > 0,\ \exists \delta > 0 \quad\text{such that}\quad \|x - x_0\|_{\mathbb{R}^n} < \delta \ \Rightarrow\ \|f(x) - \ell\|_{\mathbb{R}^m} < \varepsilon. \]
Example 5.2.3. We show using the definition that
\[ \lim_{(u,v)\to(0,0)} |u| + |v| = 0. \]
We begin by choosing some ε > 0. The function in our case is f : R² → R defined by f(u, v) = |u| + |v| and
\[ \|f(x) - \ell\|_{\mathbb{R}^1} = \big|\,|u| + |v| - 0\,\big| = |u| + |v|. \]
We choose (u, v) such that \( \|(u, v) - (0, 0)\|_{\mathbb{R}^2} = \sqrt{u^2 + v^2} < \delta \) where δ is not yet fixed with respect to ε. Note that the following inequalities hold for all u and v
\[ |u| \le \sqrt{u^2 + v^2} \qquad\text{and}\qquad |v| \le \sqrt{u^2 + v^2}. \]
Therefore,
\[ \|f(x) - \ell\|_{\mathbb{R}^1} = |u| + |v| \le 2\sqrt{u^2 + v^2} < 2\delta \]
and setting δ = ε/2 we obtain \( \|f(x) - \ell\|_{\mathbb{R}^1} < \varepsilon \). This completes the proof.

As seen in Example 5.2.1, in order to show that a limit does not exist, it is often a good strategy to determine a direction of approach to the limiting point for which the one-dimensional limit does not exist or find two approaches which yield different values for the limit. We illustrate this method with the following two examples.

Example 5.2.4. We check whether
\[ \lim_{(x,y)\to(0,0)} \frac{xy}{3x^2 + 2y^2} \]
exists. Substituting (0, 0) into the formula yields 0/0 and this is an indeterminate form. However, it cannot be resolved using l'Hôpital's rule as it is only valid for functions of one variable. Consider the line y = mx and substitute in the expression, we obtain
\[ \frac{xy}{3x^2 + 2y^2} = \frac{mx^2}{3x^2 + 2m^2x^2} = \frac{m}{3 + 2m^2}. \]
Therefore,
\[ \lim_{(x,mx)\to(0,0)} \frac{xy}{3x^2 + 2y^2} = \lim_{(x,mx)\to(0,0)} \frac{m}{3 + 2m^2} = \frac{m}{3 + 2m^2}, \]
which means the value of the limit depends on the line of approach towards the origin. We must then conclude that the limit does not exist. We now look at a mapping example.

Example 5.2.5. Consider f : R² \ {(0, 0)} → R² defined by
\[ f(x, y) = \frac{(x, y)^T}{\|(x, y)\|}. \]
We show that the limit as (x, y) → (0, 0) does not exist. Let y = mx with x > 0 and substitute in the expression
\begin{align*}
\lim_{(x,y)\to(0,0)} \frac{(x, y)^T}{\|(x, y)\|} &= \lim_{(x,mx)\to(0,0)} \frac{(x, mx)^T}{\|(x, mx)\|}\\
 &= \lim_{(x,mx)\to(0,0)} \frac{(x, mx)^T}{x\sqrt{1 + m^2}}\\
 &= \lim_{(x,mx)\to(0,0)} \frac{(1, m)^T}{\sqrt{1 + m^2}}.
\end{align*}
Again, the vector value of the limit depends on the line of approach to the origin, so the limit does not exist. This example is similar to the case of the function
\[ f(x) = \begin{cases} \dfrac{x}{|x|} & x \neq 0\\[4pt] 0 & x = 0, \end{cases} \]
which is illustrated in Figure 5.8. It is clear that the function does not have a limit at x = 0.

Fig. 5.8. Function f(x) with a jump discontinuity at x = 0
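The dependence on the line of approach found in Example 5.2.4 can be observed numerically. The sketch below uses illustrative function names and arbitrary choices of slopes and step sizes. It evaluates xy/(3x² + 2y²) along y = mx for shrinking x and compares the values with m/(3 + 2m²).

```python
def q(x, y):
    """The function xy / (3x^2 + 2y^2) of Example 5.2.4, defined away from (0, 0)."""
    return x * y / (3 * x * x + 2 * y * y)

def along_line(m, x):
    """Value of q on the line y = m x at abscissa x."""
    return q(x, m * x)

for m in (0.0, 1.0, 2.0):
    values = [along_line(m, x) for x in (1e-1, 1e-3, 1e-6)]
    print(m, values, m / (3 + 2 * m * m))
```

Along each line the values are constant in x, but the constant changes with m, so no single number can serve as the limit at the origin.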
For functions of several variables, f : R^n → R, the properties of the limit are the same as the ones already known for functions of one variable. Those are listed in the next result.

Proposition 5.2.6. Let f, g : R^n → R, x_0 ∈ R^n, α ∈ R, L_1, L_2 ∈ R and suppose
\[ \lim_{x\to x_0} f(x) = L_1 \qquad\text{and}\qquad \lim_{x\to x_0} g(x) = L_2. \]
Then,
(1) (Linearity Property)
\[ \lim_{x\to x_0} \alpha f(x) + g(x) = \alpha L_1 + L_2. \]
(2)
\[ \lim_{x\to x_0} f(x)g(x) = L_1 L_2. \]
(3) If L_2 ≠ 0, then
\[ \lim_{x\to x_0} \frac{f(x)}{g(x)} = \frac{L_1}{L_2}. \]
Proof. We only do the second case and leave the other ones as exercises at the end of the section. We must use the definition of limit. Let ε' > 0 and consider |f(x)g(x) − L_1 L_2|. Then,
\begin{align*}
|f(x)g(x) - L_1 L_2| &= |f(x)g(x) - L_1 g(x) + L_1 g(x) - L_1 L_2|\\
 &= |(f(x) - L_1)g(x) + L_1(g(x) - L_2)|\\
 &\le |(f(x) - L_1)g(x)| + |L_1(g(x) - L_2)|\\
 &= |f(x) - L_1||g(x)| + |L_1||g(x) - L_2|.
\end{align*}
Because the limit of g(x) as x → x_0 exists, for any ε > 0 there exists δ_1 > 0 such that if ||x − x_0|| < δ_1 then |g(x) − L_2| < ε. Rearranging this last inequality we obtain −ε + L_2 < g(x) < ε + L_2, thus |g(x)| < ε + |L_2| for all x with ||x − x_0|| < δ_1. For the same ε > 0, there exists δ_2 > 0 such that if ||x − x_0|| < δ_2 then |f(x) − L_1| < ε. Therefore, choosing x such that ||x − x_0|| < min(δ_1, δ_2) we have
\[ |f(x) - L_1||g(x)| + |L_1||g(x) - L_2| < \varepsilon(\varepsilon + |L_2| + |L_1|). \]
Choosing ε > 0 so that ε(ε + |L_2| + |L_1|) ≤ ε', the proof is complete.

As it is done for functions of one variable, we use the limit to introduce the concept of continuity of a function. The exact definition follows.

Definition 5.2.7. Let f : R^n → R^m be a mapping. Then f is continuous at x_0 ∈ R^n if f(x_0) is defined and
\[ \lim_{x\to x_0} f(x) = f(x_0). \]

Example 5.2.8. Consider f(x, y) = (|x| + |y|, |x| − |y|)^T and compute
\[ \lim_{(x,y)\to(0,0)} f(x, y) = (0, 0)^T = f(0, 0). \]
We now use the definition to prove this statement. Let ε > 0. The first step is to estimate f(x, y) − (0, 0)^T in the Euclidean norm on R²:
\begin{align*}
\|f(x, y) - (0, 0)^T\| &= \sqrt{(|x| + |y|)^2 + (|x| - |y|)^2}\\
 &= \sqrt{2(|x|^2 + |y|^2)}\\
 &= \sqrt{2}\sqrt{x^2 + y^2} = \sqrt{2}\,\|(x, y)\|.
\end{align*}
If (x, y) is chosen so that ||(x, y)|| < δ and we choose δ = ε/√2, then
\[ \|f(x, y) - (0, 0)^T\| = \sqrt{2}\,\|(x, y)\| < \sqrt{2}\,\delta = \varepsilon \]
and the proof is complete.
The properties of continuous functions of several variables follow directly from the properties of limits seen in Proposition 5.2.6.

Proposition 5.2.9. Let $f, g : \mathbb{R}^n \to \mathbb{R}$ be continuous at $x_0$. Then, $f(x) + g(x)$ and $f(x)g(x)$ are continuous at $x_0$. Moreover, if $g(x_0) \neq 0$, then $f(x)/g(x)$ is continuous at $x_0$.

Proof. By setting $L_1 = f(x_0)$ and $L_2 = g(x_0)$ the proof of Proposition 5.2.6 yields the result.

We now look at properties of limits and continuity of general mappings $f : \mathbb{R}^n \to \mathbb{R}^m$. Of course, the multiplication and division properties of limits do not have an analogue if $m > 1$. However, the linearity property still holds: if $f, g : \mathbb{R}^n \to \mathbb{R}^m$, $v_1, v_2 \in \mathbb{R}^m$, and
\[
\lim_{x \to x_0} f(x) = v_1 \quad\text{and}\quad \lim_{x \to x_0} g(x) = v_2,
\]
then for $\alpha \in \mathbb{R}$
\[
\lim_{x \to x_0} \bigl(\alpha f(x) + g(x)\bigr) = \alpha v_1 + v_2. \tag{5.2}
\]
Moreover, we can use the results for functions of several variables, as follows, to obtain limit and continuity results for vector functions of several variables.

Proposition 5.2.10. Let $f : \mathbb{R}^n \to \mathbb{R}^m$ be a mapping where $f = (f_1, \dots, f_m)^T$ and $v = (v_1, \dots, v_m) \in \mathbb{R}^m$. Then,
\[
\lim_{x \to x_0} f(x) = v
\]
if and only if $\lim_{x \to x_0} f_j(x) = v_j$ for all $j = 1, \dots, m$.

Proof. This is an "if and only if" statement and so we need to prove two implications. We begin by assuming the limit for $f$ exists. Let $\varepsilon > 0$; then there exists $\delta > 0$ such that for $\|x - x_0\| < \delta$
\[
\|f(x) - v\| = \sqrt{(f_1(x) - v_1)^2 + \cdots + (f_m(x) - v_m)^2} < \varepsilon.
\]
Recall that $|u| = \sqrt{u^2}$, and therefore
\[
|f_j(x) - v_j| \le \sqrt{(f_1(x) - v_1)^2 + \cdots + (f_m(x) - v_m)^2} < \varepsilon
\]
for all $j = 1, \dots, m$. This implies
\[
\lim_{x \to x_0} f_j(x) = v_j
\]
for all $j = 1, \dots, m$ and so one implication is proved. Suppose now that
\[
\lim_{x \to x_0} f_j(x) = v_j
\]
for all $j = 1, \dots, m$. Let $\varepsilon > 0$. For each $j$ there exists $\delta_j > 0$ such that if $\|x - x_0\| < \delta_j$ then $|f_j(x) - v_j| < \varepsilon$; set $\delta = \min(\delta_1, \dots, \delta_m)$. So, if $x$ is such that $\|x - x_0\| < \delta$, then
\[
\|f(x) - v\|^2 = (f_1(x) - v_1)^2 + \cdots + (f_m(x) - v_m)^2 < m\varepsilon^2.
\]
Thus, $\|f(x) - v\| < \sqrt{m}\,\varepsilon$ and the proof is complete.
We can use the above result to determine continuity directly. The proof is straightforward by setting $v_j = f_j(x_0)$ for $j = 1, \dots, m$.

Corollary 5.2.11. Let $f : \mathbb{R}^n \to \mathbb{R}^m$ where $f = (f_1, \dots, f_m)^T$. Then $f$ is continuous at $x_0$ if and only if $f_j$ is continuous at $x_0$ for all $j = 1, \dots, m$.

Proposition 5.2.10 and Corollary 5.2.11, along with Propositions 5.2.6 and 5.2.9, give a recipe for verifying whether a mapping has a limit or is continuous. See the next example for an illustration of the procedure.

Example 5.2.12. Consider the mapping
\[
f(x, y, z) = \begin{pmatrix} x^2 \sin(xyz) \\ (x + yz)e^x \end{pmatrix}.
\]
We use properties of functions of several variables to show continuity of $f$ at $(0, 1, 1)$. Let $f = (f_1, f_2)^T$ and study each function in turn. Because $x^2 \to 0$ as $x \to 0$ and $\sin(xyz) \to 0$ as $(x, y, z) \to (0, 1, 1)$, and $f_1(0, 1, 1) = 0$, we have $\lim_{(x,y,z) \to (0,1,1)} f_1(x, y, z) = f_1(0, 1, 1)$. For the second component, $(x + yz) \to 1$ as $(x, y, z) \to (0, 1, 1)$ and $e^x \to 1$ as $x \to 0$, while $f_2(0, 1, 1) = 1$; therefore, $\lim_{(x,y,z) \to (0,1,1)} f_2(x, y, z) = f_2(0, 1, 1)$. By Corollary 5.2.11, $f$ is continuous at $(0, 1, 1)$.

We can extend the definition of continuity at a point to continuity in a domain.

Definition 5.2.13. Let $U \subset \mathbb{R}^n$ be an open set and $f : U \to \mathbb{R}^m$. Then, $f$ is continuous on $U$ if for any $x_0 \in U$,
\[
\lim_{x \to x_0} f(x) = f(x_0).
\]
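Returning to Example 5.2.12, the componentwise recipe of Corollary 5.2.11 can be illustrated numerically. The sketch below is only a sanity check under assumed sampling choices (six hand-picked directions and step sizes $10^{-1}, \dots, 10^{-6}$, none of which come from the text): it measures the largest componentwise gap $|f_j(q) - f_j(p)|$ at points $q$ near $p = (0, 1, 1)$.

```python
import math

def f(x, y, z):
    """The mapping of Example 5.2.12, returned as (f1, f2)."""
    return (x**2 * math.sin(x * y * z), (x + y * z) * math.exp(x))

p = (0.0, 1.0, 1.0)
target = f(*p)  # value of f at the point of interest

def max_component_gap(h):
    """Largest |f_j(q) - f_j(p)| over points q = p + h*d for a few directions d
    (the directions are an arbitrary, hypothetical choice)."""
    directions = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1), (-1, 1, -1), (1, -1, 0)]
    gap = 0.0
    for d in directions:
        q = tuple(pi + h * di for pi, di in zip(p, d))
        fq = f(*q)
        gap = max(gap, max(abs(a - b) for a, b in zip(fq, target)))
    return gap

# Gaps for shrinking distances to p; both components are tracked at once.
gaps = [max_component_gap(10.0 ** (-k)) for k in range(1, 7)]
```

The list `gaps` shrinks with the step size, which is consistent with (but does not prove) continuity of both components at $(0, 1, 1)$.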
Before we address the question of derivatives of mappings, let us consider the following example which shows the limitations of partial derivatives.

Example 5.2.14. Let
\[
f(x, y) = \begin{cases} x & \text{if } y = 0, \\ y & \text{if } x = 0, \\ 1 & \text{otherwise.} \end{cases}
\]
We begin by showing that the partial derivatives exist at $(0, 0)$:
\begin{align*}
\frac{\partial f}{\partial x}(0, 0) &= \lim_{h \to 0} \frac{f(h, 0) - f(0, 0)}{h} = \lim_{h \to 0} \frac{h}{h} = 1, \\
\frac{\partial f}{\partial y}(0, 0) &= \lim_{h \to 0} \frac{f(0, h) - f(0, 0)}{h} = \lim_{h \to 0} \frac{h}{h} = 1.
\end{align*}
However, the function itself does not have a limit at $(0, 0)$, as shown above in Example 5.2.1. Therefore, the existence of partial derivatives at a point does not provide any information about the existence of a limit at that same point.
In fact, there are examples where all directional derivatives exist at a point, but the limit does not.
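The behaviour in Example 5.2.14 is easy to reproduce. The following minimal sketch evaluates the two difference quotients at the origin and the values of $f$ along the diagonal $y = x$; the step sizes and the names `partial_x` and `partial_y` are illustrative choices.

```python
def f(x, y):
    """Piecewise function of Example 5.2.14."""
    if y == 0:
        return x
    if x == 0:
        return y
    return 1.0

def partial_x(h):
    """Difference quotient for the partial derivative in x at (0, 0)."""
    return (f(h, 0.0) - f(0.0, 0.0)) / h

def partial_y(h):
    """Difference quotient for the partial derivative in y at (0, 0)."""
    return (f(0.0, h) - f(0.0, 0.0)) / h

steps = [10.0 ** (-k) for k in range(1, 8)]
quotients_x = [partial_x(h) for h in steps]   # each quotient is h / h
quotients_y = [partial_y(h) for h in steps]   # each quotient is h / h
diagonal_values = [f(h, h) for h in steps]    # f along the path y = x
```

Both difference quotients stay at 1, while $f(h, h) = 1$ for every $h \neq 0$ even though $f(0, 0) = 0$, so the values along the diagonal do not approach $f(0, 0)$.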
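Exercises

(1) Compute the following limits.
(a) $\displaystyle \lim_{(x,y) \to (1,0)} x \sin(xy)$
(b) $\displaystyle \lim_{(x,y,z) \to (0,0,0)} \frac{e^z}{1 + x^2 + y^2}$
(c) $\displaystyle \lim_{(x,y) \to (2,2)} x^2 - 2xy + y^2 - 2$
(d) $\displaystyle \lim_{(x,y) \to (-1,1)} \begin{pmatrix} 3 & 7 \\ -2 & 4 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} 1 \\ 5 \end{pmatrix}$
(e) $\displaystyle \lim_{(x,y,z) \to (0,0,0)} \|(x - 1, y + 2, z - 3)\|$
(f) $\displaystyle \lim_{(\theta,\varphi) \to (\pi,\pi/2)} (2 \sin\varphi \cos\theta, 2 \sin\varphi \sin\theta, 2 \cos\varphi)$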
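(2) Show using approaches from various directions that the limits below do not exist.
(a) $\displaystyle \lim_{(x,y) \to (1,2)} \frac{3y - 6}{\|(x - 1, y - 2)\|}$
(b) $\displaystyle \lim_{(x,y) \to (0,0)} \frac{3xy^2}{2x^5 - xy^2}$ (Hint: try a quadratic approach)
(c) $\displaystyle \lim_{(x,y,z) \to (0,0,0)} \frac{(y, x, 5z)^T}{\|(x, y, z)\|}$
(d) $\displaystyle \lim_{\mathbf{x} \to 0} \frac{\mathbf{x}}{\sqrt{\|\mathbf{x}\|^3}}$, where $\mathbf{x} = (x, y, z)$.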
(3) Show using the definition that the limit exists.
(a) $\displaystyle \lim_{(x,y) \to (0,0)} 7x - 3y = 0$
(b) $\displaystyle \lim_{(x,y) \to (1,-1)} xy + x - y - 1 = 0$
(c) $\displaystyle \lim_{(x,y) \to (0,0)} \frac{(x, y)\,\|(x, y)\|}{1 + \|(x, y)\|} = 0$
(d) $\displaystyle \lim_{(x,y,z) \to (1,1,1)} (x, y, z) \times (1, 1, 1) = (0, 0, 0)$
(4) Show cases (1) and (3) of Proposition 5.2.6.

(5) Determine whether the following functions are continuous at the point given. If yes, provide a calculation or use the definition of limit. If not continuous, show why using the method of your choice.
(a) $\displaystyle \lim_{(x,y,z) \to (\pi/4,2,\pi/4)} \frac{1 + xy}{\cos(y(x + z))}$
(b) $\displaystyle \lim_{(x,y) \to (0,0)} (A(x, y))^{-1}$ where $A(x, y) = \begin{pmatrix} 3 & x \\ 2 - y & x \end{pmatrix}$
(c) $\displaystyle \lim_{(\theta,\varphi) \to (\pi/4,\pi/4)} (\cos\theta \sin\varphi, \sin\theta \sin\varphi, \cos\varphi)$

(6) Show the continuity of the following mappings using properties of continuity of functions of several variables and Corollary 5.2.11.
(a) $\displaystyle \lim_{(x,y) \to (0,0)} \left( \frac{x^2}{3x + 4xy},\; e^{-1/xy} \right)$
(b) $\displaystyle \lim_{(x,y,z) \to (0,0,0)} \left( \tanh(1/(xyz)),\; x^2 + y^2 + z^2 \right)$
(c) $\displaystyle \lim_{(\rho,\theta,\varphi) \to (5,0,0)} \frac{(\rho \cos\theta \sin\varphi, \rho \sin\theta \sin\varphi, \rho \cos\varphi)}{\|(\rho, 0, 0)\|}$
5.3 Best Linear Approximation and Derivatives

We begin the discussion on derivatives by addressing the concept of tangency at a point.

Example 5.3.1. The following examples show functions which are tangent at a point.
(1) $x^2$ and $x^3$ at $x = 0$.

Fig. 5.9. Tangency of $x^2$ and $x^3$ at $x = 0$.

(2) $\sin x$ and $\tanh(x)$ at $x = 0$.

Fig. 5.10. Tangency of $\sin(x)$ and $\tanh(x)$ at $x = 0$.
Figure 5.9 and Figure 5.10 show the tangency; however, we need a formula from which we can determine whether two curves are tangent at a point. This is given by the next definition.

Definition 5.3.2. Consider mappings $f : \mathbb{R}^n \to \mathbb{R}^m$ and $g : \mathbb{R}^n \to \mathbb{R}^m$. We say $f$ and $g$ are tangent at $x_0 = (x_0^1, \dots, x_0^n) \in \mathbb{R}^n$ if $f(x_0) = g(x_0)$ and
\[
\lim_{x \to x_0} \frac{\|f(x) - g(x)\|_{\mathbb{R}^m}}{\|x - x_0\|_{\mathbb{R}^n}} = 0.
\]
The idea behind this definition is that tangency occurs if the approach of $f(x) - g(x)$ towards zero as $x \to x_0$ happens at a rate much faster than the linear rate of approach given by $x - x_0$. We verify that the examples above agree with this definition.

Example 5.3.3. In the case of $f(x) = x^2$ and $g(x) = x^3$ at $x = 0$ we have
\[
\lim_{x \to 0} \frac{f(x) - g(x)}{x - 0} = \lim_{x \to 0} \frac{x^2 - x^3}{x} = \lim_{x \to 0} (x - x^2) = 0.
\]
Take $f(x) = \sin x$ and $g(x) = \tanh(x)$. Recall from elementary calculus that
\[
\lim_{x \to 0} \frac{\sin x}{x} = 1,
\]
and $\tanh' x = 1 - \tanh^2 x$. Then, using l'Hôpital's rule,
\[
\lim_{x \to 0} \frac{\tanh(x)}{x} = 1.
\]
Therefore,
\[
\lim_{x \to 0} \frac{\sin x - \tanh(x)}{x} = \lim_{x \to 0} \frac{\sin x}{x} - \lim_{x \to 0} \frac{\tanh(x)}{x} = 0.
\]
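The tangency quotient of Definition 5.3.2 can also be tabulated numerically in the one-variable case. The sketch below evaluates it for the two pairs of Example 5.3.3 and, for contrast, for $\sin x$ and $2x$, a pair chosen here (not in the text) that agrees at $0$ but is not tangent there. The helper name `tangency_ratio` is illustrative.

```python
import math

def tangency_ratio(f, g, x0, h):
    """|f(x) - g(x)| / |x - x0| evaluated at x = x0 + h (one-variable case)."""
    x = x0 + h
    return abs(f(x) - g(x)) / abs(x - x0)

steps = [10.0 ** (-k) for k in range(1, 6)]

# Pairs from Example 5.3.3: the quotient should shrink towards 0.
ratios_sin_tanh = [tangency_ratio(math.sin, math.tanh, 0.0, h) for h in steps]
ratios_sq_cube = [tangency_ratio(lambda x: x**2, lambda x: x**3, 0.0, h) for h in steps]

# Contrast pair (an assumption for illustration): sin x and 2x meet at 0
# but the quotient tends to 1, so they are not tangent.
ratios_sin_2x = [tangency_ratio(math.sin, lambda x: 2 * x, 0.0, h) for h in steps]
```

For the tangent pairs the quotient decays with $h$, whereas for $\sin x$ and $2x$ it settles near $1$.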
We now look at a case in several variables.

Example 5.3.4. Consider $f(x, y) = x^2 + y^2$ and $g(x, y) = 2(x + y - 1)$. We show that
\[
\lim_{(x,y) \to (1,1)} \frac{|f(x, y) - g(x, y)|}{\|(x, y) - (1, 1)\|} = 0.
\]
Note that
\begin{align*}
|f(x, y) - g(x, y)| &= |x^2 + y^2 - 2(x + y - 1)| \\
&= |(x^2 - 2x + 1) + (y^2 - 2y + 1)| \\
&= |(x - 1)^2 + (y - 1)^2| = \|(x - 1, y - 1)\|^2.
\end{align*}
Thus,
\begin{align*}
\lim_{(x,y) \to (1,1)} \frac{|f(x, y) - g(x, y)|}{\|(x, y) - (1, 1)\|} &= \lim_{(x,y) \to (1,1)} \frac{\|(x - 1, y - 1)\|^2}{\|(x - 1, y - 1)\|} \\
&= \lim_{(x,y) \to (1,1)} \|(x - 1, y - 1)\| = 0.
\end{align*}
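The identity used in Example 5.3.4, namely that the tangency quotient equals $\|(x - 1, y - 1)\|$, can be checked along several directions of approach. In this sketch the eight directions and the three radii are arbitrary illustrative choices.

```python
import math

def f(x, y):
    return x**2 + y**2

def g(x, y):
    return 2 * (x + y - 1)

def ratio(x, y):
    """|f - g| / ||(x, y) - (1, 1)||, defined for (x, y) != (1, 1)."""
    return abs(f(x, y) - g(x, y)) / math.hypot(x - 1, y - 1)

# Approach (1, 1) along eight directions at distance t from the point.
angles = [k * math.pi / 4 for k in range(8)]
table = {
    t: [ratio(1 + t * math.cos(a), 1 + t * math.sin(a)) for a in angles]
    for t in (1e-1, 1e-2, 1e-3)
}
```

Each entry of `table[t]` agrees with `t` up to rounding, independently of the direction, as the computation in the example predicts.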
We now refine the above definition to consider only functions $g$ which are linear, and this leads us to the concepts of Best Linear Approximation and the Derivative.

Definition 5.3.5. Consider the mapping $f : \mathbb{R}^n \to \mathbb{R}^m$ and $x_0 = (x_0^1, \dots, x_0^n) \in \mathbb{R}^n$:
(1) If $g(x) = a + Lx$, where $a \in \mathbb{R}^m$ and $L : \mathbb{R}^n \to \mathbb{R}^m$ is linear, and $g$ is tangent to $f$ at $x = x_0$, then $g(x)$ is called the best linear approximation to $f$ at $x_0$.
(2) If $f$ has a best linear approximation $g(x)$ at $x_0$, then $f$ is said to be differentiable at $x_0$ and $L$ is called the derivative of $f$ at $x_0$. We denote $L = Df(x_0)$.
Properties of Tangency and Best Linear Approximation

We now look at properties of tangency and best linear approximation which are direct consequences of the definitions above.

(1) Suppose $f(x)$ and $g(x)$ are continuous and tangent at $x_0$; then $f(x_0) = g(x_0)$.

Proof. Because $f(x)$ and $g(x)$ are continuous at $x_0$,
\[
\lim_{x \to x_0} f(x) = f(x_0) \quad\text{and}\quad \lim_{x \to x_0} g(x) = g(x_0).
\]
Tangency of $f$ and $g$ implies $\lim_{x \to x_0} (f(x) - g(x)) = 0$. Therefore, $f(x_0) = g(x_0)$.

(2) If $g$ is a best linear approximation to $f$ at $x_0$ then $g(x) = f(x_0) + L(x - x_0)$.

Proof. We know $g(x) = a + Lx$, so $g(x_0) = a + Lx_0 = f(x_0)$ by the above result. This implies $a = f(x_0) - Lx_0$ and $g(x) = a + Lx = f(x_0) + L(x - x_0)$.

(3) Best linear approximations are unique, which means derivatives are unique.

Proof. Suppose there are two best linear approximations $g_1(x) = f(x_0) + L_1(x - x_0)$ and $g_2(x) = f(x_0) + L_2(x - x_0)$. Then, $g_1$ and $g_2$ are tangent to each other and
\[
0 = \lim_{x \to x_0} \frac{\|g_1(x) - g_2(x)\|}{\|x - x_0\|} = \lim_{x \to x_0} \frac{\|(L_1 - L_2)(x - x_0)\|}{\|x - x_0\|}.
\]
We now proceed by contradiction. Suppose that $L_1 - L_2 \neq 0$. Let $v$ be a vector not in the kernel of $L_1 - L_2$ and consider $w := (L_1 - L_2)v$. Then $\|w\| = \alpha\|v\|$ for some $\alpha > 0$. Approaching $x_0$ in the $v$ direction, that is, along $x = x_0 + tv$, we can write
\begin{align*}
\lim_{x \to x_0} \frac{\|(L_1 - L_2)(x - x_0)\|}{\|x - x_0\|} &= \lim_{t \to 0} \frac{\|(L_1 - L_2)tv\|}{\|tv\|} \\
&= \lim_{t \to 0} \frac{\|tw\|}{\|tv\|} \\
&= \lim_{t \to 0} \alpha = \alpha \neq 0.
\end{align*}
Thus we have a contradiction and so $L_1 = L_2$.
We look at some familiar examples.
(1) For $f : \mathbb{R} \to \mathbb{R}$, the derivative $Df(x)$ is a $1 \times 1$ matrix and corresponds to the regular derivative $f'(x)$.
(2) For vector functions $f : \mathbb{R} \to \mathbb{R}^n$ this is shown in Chapter 2.
(3) For $f : \mathbb{R}^2 \to \mathbb{R}$ the derivative $Df(x)$ is a $1 \times 2$ matrix. The next result shows that $Df(x) = \nabla f(x)$.
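Proposition 5.3.6. Let $f : \mathbb{R}^n \to \mathbb{R}$ be differentiable at a point $x_0$. Then, the derivative is $L = Df(x_0) = \nabla f(x_0)$.

Proof. Let $L = Df(x_0) = (a_1, \dots, a_n)$ be a $1 \times n$ matrix and let $g(x) = f(x_0) + Df(x_0)(x - x_0)$ be the best linear approximation to $f$ at $x_0$. Because $f$ is differentiable at $x = x_0$ we know that
\[
\lim_{x \to x_0} \frac{|f(x) - g(x)|}{\|x - x_0\|} = 0
\]
no matter which path is used to approach $x_0$. Let $x_0 = (x_{01}, \dots, x_{0n})$ and consider the path parallel to the $j$th coordinate axis given by $x(t) = (x_1(t), \dots, x_j(t), \dots, x_n(t))$ where $x_i(t) = x_{0i}$ for $i \neq j$ and $x_j(t) = x_{0j} + t$ where $t \in \mathbb{R}$. Then,
\begin{align*}
0 &= \lim_{x \to x_0} \frac{|f(x) - f(x_0) - Df(x_0)(x - x_0)|}{\|x - x_0\|} \\
&= \lim_{t \to 0} \frac{|f(x_{01}, \dots, x_{0j} + t, \dots, x_{0n}) - f(x_0) - Df(x_0)e_j t|}{|t|} \\
&= \lim_{t \to 0} \frac{|f(x_{01}, \dots, x_{0j} + t, \dots, x_{0n}) - f(x_0) - a_j t|}{|t|} \\
&= \lim_{t \to 0} \left| \frac{f(x_{01}, \dots, x_{0j} + t, \dots, x_{0n}) - f(x_0)}{t} - a_j \right|.
\end{align*}
Hence the difference quotient converges to $a_j$, that is, $a_j = \dfrac{\partial f}{\partial x_j}(x_0)$. For $j = 1, \dots, n$ we obtain the partial derivatives in all coordinate directions and $Df(x_0) = \nabla f(x_0)$.

Example 5.3.7. The derivative of $f(x_1, x_2, x_3, x_4) = x_2^2 x_3 - e^{x_1 x_4}$ at $x_0 = (x_{01}, x_{02}, x_{03}, x_{04}) = (1, 1, 1, 1)$ is obtained by computing the gradient
\begin{align*}
Df(x_0) &= \left( \frac{\partial f}{\partial x_1}(x_0), \frac{\partial f}{\partial x_2}(x_0), \frac{\partial f}{\partial x_3}(x_0), \frac{\partial f}{\partial x_4}(x_0) \right) \\
&= \left. (-x_4 e^{x_1 x_4},\; 2x_2 x_3,\; x_2^2,\; -x_1 e^{x_1 x_4}) \right|_{x_0} = (-e, 2, 1, -e).
\end{align*}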
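As a sanity check on Example 5.3.7, one can test the tangency condition of Definition 5.3.2 directly for the candidate $g(x) = f(x_0) + Df(x_0)(x - x_0)$ with $Df(x_0) = (-e, 2, 1, -e)$. The sketch below approaches $x_0$ along a single direction chosen arbitrarily for illustration, so it is evidence for tangency rather than a proof.

```python
import math

def f(x):
    """f(x1, x2, x3, x4) = x2^2 * x3 - exp(x1 * x4) of Example 5.3.7."""
    x1, x2, x3, x4 = x
    return x2**2 * x3 - math.exp(x1 * x4)

x0 = (1.0, 1.0, 1.0, 1.0)
grad = (-math.e, 2.0, 1.0, -math.e)  # Df(x0) as computed in the example

def g(x):
    """Candidate best linear approximation g(x) = f(x0) + Df(x0)(x - x0)."""
    return f(x0) + sum(a * (xi - x0i) for a, xi, x0i in zip(grad, x, x0))

def tangency_ratio(direction, t):
    """|f(x) - g(x)| / ||x - x0|| at x = x0 + t * direction."""
    x = tuple(x0i + t * d for x0i, d in zip(x0, direction))
    dist = math.sqrt(sum((xi - x0i) ** 2 for xi, x0i in zip(x, x0)))
    return abs(f(x) - g(x)) / dist

direction = (1.0, -2.0, 0.5, 1.0)  # hypothetical direction of approach
ratios = [tangency_ratio(direction, 10.0 ** (-k)) for k in range(1, 6)]
```

The quotients in `ratios` decrease roughly in proportion to $t$, as expected when the remainder $f - g$ is of second order.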
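The Jacobian Matrix

Using the characterization of the derivative for a function of several variables $f : \mathbb{R}^n \to \mathbb{R}$ in terms of the gradient, we can extend our study to general mappings $f : \mathbb{R}^n \to \mathbb{R}^m$ for arbitrary $n, m \in \mathbb{N}$. This is given in the next theorem.

Theorem 5.3.8. Let $f : \mathbb{R}^n \to \mathbb{R}^m$ be a mapping and $x = (x_1, \dots, x_n) \in \mathbb{R}^n$ with
\[
f(x) = \begin{pmatrix} f_1(x_1, \dots, x_n) \\ f_2(x_1, \dots, x_n) \\ \vdots \\ f_m(x_1, \dots, x_n) \end{pmatrix}.
\]
Suppose that $Df(x_0)$ exists. Then,
\[
Df(x_0) = \begin{pmatrix}
\dfrac{\partial f_1}{\partial x_1}(x_0) & \dfrac{\partial f_1}{\partial x_2}(x_0) & \cdots & \dfrac{\partial f_1}{\partial x_n}(x_0) \\
\dfrac{\partial f_2}{\partial x_1}(x_0) & \dfrac{\partial f_2}{\partial x_2}(x_0) & \cdots & \dfrac{\partial f_2}{\partial x_n}(x_0) \\
\vdots & \vdots & \ddots & \vdots \\
\dfrac{\partial f_m}{\partial x_1}(x_0) & \dfrac{\partial f_m}{\partial x_2}(x_0) & \cdots & \dfrac{\partial f_m}{\partial x_n}(x_0)
\end{pmatrix}. \tag{5.3}
\]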
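The matrix (5.3) is called the Jacobian matrix of $f$ and is written $Jf(x)$. The proof is done in a similar way as for the derivative of a function $f : \mathbb{R}^n \to \mathbb{R}$ and we do not present it here. The interested reader can supply the details.

The Jacobian matrix $Jf(x)$ depends only on partial derivatives and has importance of its own, and should be computed even before checking that a mapping $f$ is differentiable at some point. It is a useful construction and is fundamental in what follows. However, it should not be confused with the derivative, since the Jacobian matrix can exist at points where the mapping is not differentiable, as we see in the next section.

Example 5.3.9. We compute the derivative of
\[
f(x, y, z, u, w) = \begin{pmatrix} uz^2 - 3wx^2 \\ -\cos(yz) + xuw \\ xe^{-1/yu} \end{pmatrix}
\]
at $(x, y, z, u, w) = (0, 1, 0, 1, 0)$. We begin by writing $f = (f_1, f_2, f_3)^T$ and the Jacobian matrix formula is
\[
Jf(x, y, z, u, w) = \begin{pmatrix}
\dfrac{\partial f_1}{\partial x} & \dfrac{\partial f_1}{\partial y} & \dfrac{\partial f_1}{\partial z} & \dfrac{\partial f_1}{\partial u} & \dfrac{\partial f_1}{\partial w} \\
\dfrac{\partial f_2}{\partial x} & \dfrac{\partial f_2}{\partial y} & \dfrac{\partial f_2}{\partial z} & \dfrac{\partial f_2}{\partial u} & \dfrac{\partial f_2}{\partial w} \\
\dfrac{\partial f_3}{\partial x} & \dfrac{\partial f_3}{\partial y} & \dfrac{\partial f_3}{\partial z} & \dfrac{\partial f_3}{\partial u} & \dfrac{\partial f_3}{\partial w}
\end{pmatrix}.
\]
We compute each partial derivative:
\begin{align*}
\frac{\partial f_1}{\partial x} &= -6wx, & \frac{\partial f_1}{\partial y} &= 0, & \frac{\partial f_1}{\partial z} &= 2uz, & \frac{\partial f_1}{\partial u} &= z^2, & \frac{\partial f_1}{\partial w} &= -3x^2, \\
\frac{\partial f_2}{\partial x} &= uw, & \frac{\partial f_2}{\partial y} &= z\sin(yz), & \frac{\partial f_2}{\partial z} &= y\sin(yz), & \frac{\partial f_2}{\partial u} &= xw, & \frac{\partial f_2}{\partial w} &= xu, \\
\frac{\partial f_3}{\partial x} &= e^{-1/yu}, & \frac{\partial f_3}{\partial y} &= \frac{x}{uy^2}e^{-1/yu}, & \frac{\partial f_3}{\partial z} &= 0, & \frac{\partial f_3}{\partial u} &= \frac{x}{yu^2}e^{-1/yu}, & \frac{\partial f_3}{\partial w} &= 0,
\end{align*}
and evaluate at $(x, y, z, u, w) = (0, 1, 0, 1, 0)$ to obtain
\[
Jf(0, 1, 0, 1, 0) = \begin{pmatrix} 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ e^{-1} & 0 & 0 & 0 & 0 \end{pmatrix}.
\]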
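The Jacobian matrix of Example 5.3.9 can be cross-checked with central difference quotients. The helper `numerical_jacobian` below is a hypothetical name introduced for this sketch, and the step size `h = 1e-6` is an arbitrary choice; the approximation is only a consistency check against the hand computation.

```python
import math

def f(x, y, z, u, w):
    """The mapping of Example 5.3.9, returned as (f1, f2, f3)."""
    return (
        u * z**2 - 3 * w * x**2,
        -math.cos(y * z) + x * u * w,
        x * math.exp(-1.0 / (y * u)),
    )

def numerical_jacobian(func, point, h=1e-6):
    """Approximate the m x n Jacobian matrix by central differences.
    Row i holds the partial derivatives of the i-th component."""
    columns = []
    for j in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[j] += h
        backward[j] -= h
        fp, fm = func(*forward), func(*backward)
        columns.append([(a - b) / (2 * h) for a, b in zip(fp, fm)])
    # columns[j][i] approximates d f_i / d x_j; transpose into rows.
    return [list(row) for row in zip(*columns)]

J = numerical_jacobian(f, (0.0, 1.0, 0.0, 1.0, 0.0))
```

Here `J` is a $3 \times 5$ list of lists whose only entry of appreciable size is `J[2][0]`, which approximates $e^{-1}$.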
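Example 5.3.10. We determine if the Jacobian of
\[
g(x, y, z) = \begin{pmatrix} y\sin(x\pi) \\ y\ln|1 + x| \\ z^4 y^2 + \cos(x\pi) \end{pmatrix}
\]
computed at $(x, y, z) = (0, 2, 1)$ is triangular. We begin by writing $g = (g_1, g_2, g_3)^T$ and the Jacobian matrix formula is
\[
Jg(x, y, z) = \begin{pmatrix}
\dfrac{\partial g_1}{\partial x} & \dfrac{\partial g_1}{\partial y} & \dfrac{\partial g_1}{\partial z} \\
\dfrac{\partial g_2}{\partial x} & \dfrac{\partial g_2}{\partial y} & \dfrac{\partial g_2}{\partial z} \\
\dfrac{\partial g_3}{\partial x} & \dfrac{\partial g_3}{\partial y} & \dfrac{\partial g_3}{\partial z}
\end{pmatrix}.
\]
We compute each partial derivative:
\begin{align*}
\frac{\partial g_1}{\partial x} &= y\pi\cos(x\pi), & \frac{\partial g_1}{\partial y} &= \sin(x\pi), & \frac{\partial g_1}{\partial z} &= 0, \\
\frac{\partial g_2}{\partial x} &= \frac{y}{1 + x}, & \frac{\partial g_2}{\partial y} &= \ln|1 + x|, & \frac{\partial g_2}{\partial z} &= 0, \\
\frac{\partial g_3}{\partial x} &= -\pi\sin(x\pi), & \frac{\partial g_3}{\partial y} &= 2z^4 y, & \frac{\partial g_3}{\partial z} &= 4z^3 y^2,
\end{align*}
and evaluate at $(x, y, z) = (0, 2, 1)$ to obtain
\[
Jg(0, 2, 1) = \begin{pmatrix} 2\pi & 0 & 0 \\ 2 & \ln 1 & 0 \\ 0 & 4 & 16 \end{pmatrix},
\]
which is a lower triangular matrix.

Alternate terminology and notations are as follows:
(1) The Jacobian is sometimes written
\[
Jf(x) = \frac{\partial(f_1, f_2, \dots, f_m)}{\partial(x_1, \dots, x_n)}(x).
\]
(2) The Jacobian matrix is sometimes referred to as the linear part of $f$ or the linearization of $f$.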
We conclude this section with the linearity property for derivatives of mappings.

Proposition 5.3.11. [Linearity Property] Let $f, g : \mathbb{R}^n \to \mathbb{R}^m$ be differentiable mappings on $U \subset \mathbb{R}^n$, and $\alpha \in \mathbb{R}$. Then,
\[
D(f + g)(x) = Df(x) + Dg(x) \quad\text{and}\quad D(\alpha f)(x) = \alpha Df(x).
\]

Example 5.3.12. In the study of differential equations, mappings such as this one are often encountered:
\[
f(x, y) = A\begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} 2x^2 + 4xy \\ 5x^3 - 3xy^2 \end{pmatrix}
\]
where $A$ is a matrix of constants. The derivative is
\[
Df(x, y) = A + \begin{pmatrix} 4x + 4y & 4x \\ 15x^2 - 3y^2 & -6xy \end{pmatrix}.
\]
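The computation in Example 5.3.12 can be spelled out term by term using the linearity property. The splitting of $f$ into a linear part $\ell$ and a nonlinear part $n$ below is notation introduced here for illustration only; it is not used in the text.

```latex
\begin{align*}
f(x,y) &= \ell(x,y) + n(x,y), \qquad
\ell(x,y) = A\begin{pmatrix} x \\ y \end{pmatrix}, \quad
n(x,y) = \begin{pmatrix} 2x^2 + 4xy \\ 5x^3 - 3xy^2 \end{pmatrix}, \\
D\ell(x,y) &= A
  \quad\text{(the best linear approximation of a linear map is the map itself)}, \\
Dn(x,y) &= \begin{pmatrix}
  \partial_x(2x^2 + 4xy) & \partial_y(2x^2 + 4xy) \\
  \partial_x(5x^3 - 3xy^2) & \partial_y(5x^3 - 3xy^2)
\end{pmatrix}
= \begin{pmatrix} 4x + 4y & 4x \\ 15x^2 - 3y^2 & -6xy \end{pmatrix}, \\
Df(x,y) &= D\ell(x,y) + Dn(x,y)
= A + \begin{pmatrix} 4x + 4y & 4x \\ 15x^2 - 3y^2 & -6xy \end{pmatrix}.
\end{align*}
```

In particular, at the origin the nonlinear part contributes nothing and $Df(0, 0) = A$.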
Exercises

(1) Determine whether the functions are continuous; if so, prove it using the $\varepsilon$-$\delta$ definition. If not, give an explanation why. If possible, find a candidate for the best linear approximation function $g$ and show the tangency with the function at the point. (Hint: the Jacobian matrix may be useful in finding $g$.) (a) f (x) at x = 0 where f (x) = (b) f (x, y) =
(x2
(c) f (x, y, z) =
x2
x≥0
−x2
x