Calculus of Several Variables: Lecture Notes



English Pages [179] Year 2010


Table of contents :
1. Vectors in R2 and R3
2. Dot product
3. Cross product
4. Planes and distances
5. n-dimensional space
6. Cylindrical and spherical coordinates
7. Functions
8. Limits
9. The derivative
10. More about derivatives
11. Higher derivatives
12. Chain rule
13. Implicit functions
14. Parametrised Curves
15. Arclength
16. Moving frames
17. Vector fields
18. Div grad curl and all that
19. Taylor Polynomials
20. Maxima and minima: I
21. Maxima and minima: II
22. Maxima and minima: II
23. Inclusion-Exclusion
24. Triple integrals
25. Change of coordinates: I
26. Change of coordinates: II
27. Line integrals
28. Manifolds with boundary
29. Conservative vector fields revisited
30. Surface integrals
31. Flux
32. Stokes Theorem
33. Gauss Theorem
34. Forms on Rn

1. Vectors in R2 and R3

Definition 1.1. A vector $\vec{v} \in \mathbb{R}^3$ is a 3-tuple of real numbers $(v_1, v_2, v_3)$.

Hopefully the reader can well imagine the definition of a vector in $\mathbb{R}^2$.

Example 1.2. $(1, 1, 0)$ and $(\sqrt{2}, \pi, 1/e)$ are vectors in $\mathbb{R}^3$.

Definition 1.3. The zero vector in $\mathbb{R}^3$, denoted $\vec{0}$, is the vector $(0, 0, 0)$. If $\vec{v} = (v_1, v_2, v_3)$ and $\vec{w} = (w_1, w_2, w_3)$ are two vectors in $\mathbb{R}^3$, the sum of $\vec{v}$ and $\vec{w}$, denoted $\vec{v} + \vec{w}$, is the vector $(v_1 + w_1, v_2 + w_2, v_3 + w_3)$. If $\vec{v} = (v_1, v_2, v_3) \in \mathbb{R}^3$ is a vector and $\lambda \in \mathbb{R}$ is a scalar, the scalar product of $\lambda$ and $\vec{v}$, denoted $\lambda \cdot \vec{v}$, is the vector $(\lambda v_1, \lambda v_2, \lambda v_3)$.

Example 1.4. If $\vec{v} = (2, -3, 1)$ and $\vec{w} = (1, -5, 3)$ then $\vec{v} + \vec{w} = (3, -8, 4)$. If $\lambda = -3$ then $\lambda \cdot \vec{v} = (-6, 9, -3)$.

Lemma 1.5. If $\lambda$ and $\mu$ are scalars and $\vec{u}$, $\vec{v}$ and $\vec{w}$ are vectors in $\mathbb{R}^3$, then
(1) $\vec{0} + \vec{v} = \vec{v}$.
(2) $\vec{u} + (\vec{v} + \vec{w}) = (\vec{u} + \vec{v}) + \vec{w}$.
(3) $\vec{u} + \vec{v} = \vec{v} + \vec{u}$.
(4) $\lambda \cdot (\mu \cdot \vec{v}) = (\lambda\mu) \cdot \vec{v}$.
(5) $(\lambda + \mu) \cdot \vec{v} = \lambda \cdot \vec{v} + \mu \cdot \vec{v}$.
(6) $\lambda \cdot (\vec{u} + \vec{v}) = \lambda \cdot \vec{u} + \lambda \cdot \vec{v}$.

Proof. We check (3). If $\vec{u} = (u_1, u_2, u_3)$ and $\vec{v} = (v_1, v_2, v_3)$, then
$$\vec{u} + \vec{v} = (u_1 + v_1, u_2 + v_2, u_3 + v_3) = (v_1 + u_1, v_2 + u_2, v_3 + u_3) = \vec{v} + \vec{u}.$$
The other properties are checked in the same way, component by component. □

We can interpret vector addition and scalar multiplication geometrically. We can think of a vector as representing a displacement from the origin. Geometrically a vector $\vec{v}$ has a magnitude (or length)
$$|\vec{v}| = (v_1^2 + v_2^2 + v_3^2)^{1/2}$$
and every non-zero vector has a direction
$$\vec{u} = \frac{\vec{v}}{|\vec{v}|}.$$
Multiplying by a positive scalar leaves the direction unchanged and rescales the magnitude; multiplying by a negative scalar also reverses the direction. To add two vectors $\vec{v}$ and $\vec{w}$, think of transporting the tail of $\vec{w}$ to the endpoint of $\vec{v}$. The sum of $\vec{v}$ and $\vec{w}$ is the vector whose tail is the tail of $\vec{v}$ and whose endpoint is the endpoint of the transported vector.
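The componentwise definitions above translate directly into code. The following is a minimal sketch in Python, assuming vectors are stored as plain tuples; the helper names `add`, `scale`, `magnitude` and `direction` are illustrative choices, not notation from the notes.

```python
from math import sqrt


def add(v, w):
    """Componentwise sum of two vectors with the same number of entries."""
    return tuple(a + b for a, b in zip(v, w))


def scale(lam, v):
    """Scalar product lam * v, computed entry by entry."""
    return tuple(lam * a for a in v)


def magnitude(v):
    """Length (v1^2 + v2^2 + v3^2)^(1/2)."""
    return sqrt(sum(a * a for a in v))


def direction(v):
    """Unit vector v / |v|; only defined for a non-zero vector."""
    m = magnitude(v)
    if m == 0:
        raise ValueError("the zero vector has no direction")
    return scale(1 / m, v)


v, w = (2, -3, 1), (1, -5, 3)
print(add(v, w))     # the sum from Example 1.4
print(scale(-3, v))  # the scalar product from Example 1.4
```

Because every operation acts entry by entry, the rules of Lemma 1.5 reduce to the corresponding rules for real numbers.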

One way to think of this is in terms of directed line segments. Note that given a point $P$ and a vector $\vec{v}$ we can add $\vec{v}$ to $P$ to get another point $Q$. If $P = (p_1, p_2, p_3)$ and $\vec{v} = (v_1, v_2, v_3)$ then
$$Q = P + \vec{v} = (p_1 + v_1, p_2 + v_2, p_3 + v_3).$$
If $Q = (q_1, q_2, q_3)$, then there is a unique vector $\overrightarrow{PQ}$ such that $Q = P + \overrightarrow{PQ}$, namely
$$\overrightarrow{PQ} = (q_1 - p_1, q_2 - p_2, q_3 - p_3).$$

Lemma 1.6. Let $P$, $Q$ and $R$ be three points in $\mathbb{R}^3$. Then $\overrightarrow{PQ} + \overrightarrow{QR} = \overrightarrow{PR}$.

Proof. Let us consider the result of adding $\overrightarrow{PQ} + \overrightarrow{QR}$ to $P$,
$$P + (\overrightarrow{PQ} + \overrightarrow{QR}) = (P + \overrightarrow{PQ}) + \overrightarrow{QR} = Q + \overrightarrow{QR} = R.$$
On the other hand, there is at most one vector $\vec{v}$ such that when we add it to $P$ we get $R$, namely the vector $\overrightarrow{PR}$. So $\overrightarrow{PQ} + \overrightarrow{QR} = \overrightarrow{PR}$. □

Note that (1.6) expresses the geometrically obvious statement that if one goes from $P$ to $Q$ and then from $Q$ to $R$, this is the same as going from $P$ to $R$.

Vectors arise quite naturally in nature. We can use vectors to represent forces; every force has both a magnitude and a direction. The combined effect of two forces is represented by the vector sum. Similarly we can use vectors to measure both velocity and acceleration. The equation
$$\vec{F} = m\vec{a}$$
is the vector form of Newton's famous equation.

Note that $\mathbb{R}^3$ comes with three standard unit vectors
$$\hat{\imath} = (1, 0, 0), \qquad \hat{\jmath} = (0, 1, 0) \qquad \text{and} \qquad \hat{k} = (0, 0, 1),$$
which are called the standard basis. Any vector can be written uniquely as a linear combination of these vectors,
$$\vec{v} = (v_1, v_2, v_3) = v_1\hat{\imath} + v_2\hat{\jmath} + v_3\hat{k}.$$

We can use vectors to parametrise lines in $\mathbb{R}^3$. Suppose we are given two different points $P$ and $Q$ of $\mathbb{R}^3$. Then there is a unique line $l$ containing $P$ and $Q$. Suppose that $R = (x, y, z)$ is a general point of the line. Note that the vector $\overrightarrow{PR}$ is parallel to the vector $\overrightarrow{PQ}$, so that $\overrightarrow{PR}$ is a scalar multiple of $\overrightarrow{PQ}$. Algebraically,
$$\overrightarrow{PR} = t\,\overrightarrow{PQ},$$
for some scalar $t \in \mathbb{R}$. If $P = (p_1, p_2, p_3)$ and $Q = (q_1, q_2, q_3)$, then
$$(x - p_1, y - p_2, z - p_3) = t(q_1 - p_1, q_2 - p_2, q_3 - p_3) = t(v_1, v_2, v_3),$$
where $(v_1, v_2, v_3) = (q_1 - p_1, q_2 - p_2, q_3 - p_3)$. We can always rewrite this as
$$(x, y, z) = (p_1, p_2, p_3) + t(v_1, v_2, v_3) = (p_1 + tv_1, p_2 + tv_2, p_3 + tv_3).$$
Writing these equations out in coordinates, we get
$$x = p_1 + tv_1, \qquad y = p_2 + tv_2 \qquad \text{and} \qquad z = p_3 + tv_3.$$

Example 1.7. If $P = (1, -2, 3)$ and $Q = (1, 0, -1)$, then $\vec{v} = (0, 2, -4)$ and a general point of the line containing $P$ and $Q$ is given parametrically by
$$(x, y, z) = (1, -2, 3) + t(0, 2, -4) = (1, -2 + 2t, 3 - 4t).$$

Example 1.8. Where do the two lines $l_1$ and $l_2$,
$$(x, y, z) = (1, -2 + 2t, 3 - 4t) \qquad \text{and} \qquad (x, y, z) = (2t - 1, -3 + t, 3t),$$
intersect?

We are looking for a point $(x, y, z)$ common to both lines. The two lines need not reach this point at the same parameter value, so we use a separate parameter $s$ for the first line. So we have
$$(1, -2 + 2s, 3 - 4s) = (2t - 1, -3 + t, 3t).$$
Looking at the first component, we must have $t = 1$. Looking at the second component, we must have $-2 + 2s = -2$, so that $s = 0$. By inspection, the third component comes out equal to 3 in both cases. So the lines intersect at the point $(1, -2, 3)$.

Example 1.9. Where does the line
$$(x, y, z) = (1 - t, 2 - 3t, 2t + 1)$$
intersect the plane
$$2x - 3y + z = 6?$$
We must have
$$2(1 - t) - 3(2 - 3t) + (2t + 1) = 6.$$
Solving for $t$ we get $9t - 3 = 6$, so that $t = 1$. The line intersects the plane at the point
$$(x, y, z) = (0, -1, 3).$$
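The parametrisation of a line and the substitution used in Example 1.9 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: coordinates are integers (or rationals), the plane is given as `normal . (x, y, z) = d`, and the function names are hypothetical.

```python
from fractions import Fraction


def line_through(p, q):
    """Return (point, direction) for the line through two different points."""
    v = tuple(b - a for a, b in zip(p, q))
    if all(c == 0 for c in v):
        raise ValueError("the two points must be different")
    return p, v


def point_on_line(p, v, t):
    """The point p + t v of the line."""
    return tuple(a + t * b for a, b in zip(p, v))


def intersect_line_plane(p, v, normal, d):
    """Solve normal . (p + t v) = d for t and return the point.

    Returns None when the direction is orthogonal to the normal, in which
    case the line is parallel to the plane (or contained in it).
    """
    n_dot_v = sum(a * b for a, b in zip(normal, v))
    if n_dot_v == 0:
        return None
    n_dot_p = sum(a * b for a, b in zip(normal, p))
    t = Fraction(d - n_dot_p, n_dot_v)  # exact arithmetic, no rounding
    return point_on_line(p, v, t)


# Example 1.7: the line through P = (1, -2, 3) and Q = (1, 0, -1)
print(line_through((1, -2, 3), (1, 0, -1)))

# Example 1.9: the line (1 - t, 2 - 3t, 2t + 1) and the plane 2x - 3y + z = 6
point = intersect_line_plane((1, 2, 1), (-1, -3, 2), (2, -3, 1), 6)
print([float(c) for c in point])
```

Using `Fraction` keeps the parameter exact, so the computed point can be compared with the hand calculation without any tolerance.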

Example 1.10. A cycloid is the path traced in the plane by a point on the circumference of a circle as the circle rolls along the ground. Let's find the parametric form of a cycloid.

Let's suppose that the circle has radius $a$, the circle rolls along the $x$-axis and the point is at the origin at time $t = 0$. We suppose that the circle rotates through an angle of $t$ radians in time $t$. So the circumference travels a distance of $at$. It follows that the centre of the circle at time $t$ is at the point $P = (at, a)$. Call the point on the circumference $Q = (x, y)$ and let $O$ be the centre of coordinates. We have
$$(x, y) = \overrightarrow{OQ} = \overrightarrow{OP} + \overrightarrow{PQ}.$$
Now relative to $P$, the point $Q$ just goes around a circle of radius $a$. Note that the circle rotates backwards (clockwise) and that at time $t = 0$ the angle is $3\pi/2$. So we have
$$\overrightarrow{PQ} = (a\cos(3\pi/2 - t), a\sin(3\pi/2 - t)) = (-a\sin t, -a\cos t).$$
Putting all of this together, we have
$$(x, y) = (at - a\sin t, a - a\cos t).$$
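As a sanity check on the cycloid formula, here is a small Python sketch that evaluates the closed form and compares it with the vector sum $\overrightarrow{OP} + \overrightarrow{PQ}$ it was derived from. The function names are hypothetical and the comparison uses a floating-point tolerance.

```python
from math import cos, pi, sin


def cycloid(a, t):
    """Closed form (at - a sin t, a - a cos t) for a circle of radius a."""
    return (a * t - a * sin(t), a - a * cos(t))


def cycloid_from_vectors(a, t):
    """The same point, built as OP + PQ as in the derivation."""
    op = (a * t, a)  # centre of the circle at time t
    pq = (a * cos(3 * pi / 2 - t), a * sin(3 * pi / 2 - t))
    return (op[0] + pq[0], op[1] + pq[1])


a = 2.0
for t in (0.0, pi / 2, pi, 2 * pi):
    x, y = cycloid(a, t)
    u, v = cycloid_from_vectors(a, t)
    # the two descriptions should agree up to rounding error
    print(f"t = {t:.3f}: ({x:.3f}, {y:.3f})  difference {abs(x - u) + abs(y - v):.1e}")
```

At $t = \pi$ the point is at the top of the circle, at height $2a$, and at $t = 2\pi$ it is back on the ground after one full turn.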


MIT OpenCourseWare http://ocw.mit.edu

18.022 Calculus of Several Variables Fall 2010

For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.

2. Dot product

Definition 2.1. Let $\vec{v} = (v_1, v_2, v_3)$ and $\vec{w} = (w_1, w_2, w_3)$ be two vectors in $\mathbb{R}^3$. The dot product of $\vec{v}$ and $\vec{w}$, denoted $\vec{v} \cdot \vec{w}$, is the scalar
$$v_1 w_1 + v_2 w_2 + v_3 w_3.$$

Example 2.2. The dot product of $\vec{v} = (1, -2, -1)$ and $\vec{w} = (2, 1, -3)$ is
$$1 \cdot 2 + (-2) \cdot 1 + (-1) \cdot (-3) = 2 - 2 + 3 = 3.$$

Lemma 2.3. Let $\vec{u}$, $\vec{v}$ and $\vec{w}$ be three vectors in $\mathbb{R}^3$ and let $\lambda$ be a scalar.
(1) $(\vec{u} + \vec{v}) \cdot \vec{w} = \vec{u} \cdot \vec{w} + \vec{v} \cdot \vec{w}$.
(2) $\vec{v} \cdot \vec{w} = \vec{w} \cdot \vec{v}$.
(3) $(\lambda\vec{v}) \cdot \vec{w} = \lambda(\vec{v} \cdot \vec{w})$.
(4) $\vec{v} \cdot \vec{v} = 0$ if and only if $\vec{v} = \vec{0}$.

Proof. (1)-(3) are straightforward. To see (4), first note that one direction is clear. If $\vec{v} = \vec{0}$, then $\vec{v} \cdot \vec{v} = 0$. For the other direction, suppose that $\vec{v} \cdot \vec{v} = 0$. Then $v_1^2 + v_2^2 + v_3^2 = 0$. Now the square of a real number is non-negative and if a sum of non-negative numbers is zero, then each term must be zero. It follows that $v_1 = v_2 = v_3 = 0$ and so $\vec{v} = \vec{0}$. □

Definition 2.4. If $\vec{v} \in \mathbb{R}^3$, then the norm or length of $\vec{v} = (v_1, v_2, v_3)$ is the scalar
$$\|\vec{v}\| = \sqrt{\vec{v} \cdot \vec{v}} = (v_1^2 + v_2^2 + v_3^2)^{1/2}.$$

It is interesting to note that if you know the norm, then you can calculate the dot product:
$$(\vec{v} + \vec{w}) \cdot (\vec{v} + \vec{w}) = \vec{v} \cdot \vec{v} + 2\vec{v} \cdot \vec{w} + \vec{w} \cdot \vec{w}$$
$$(\vec{v} - \vec{w}) \cdot (\vec{v} - \vec{w}) = \vec{v} \cdot \vec{v} - 2\vec{v} \cdot \vec{w} + \vec{w} \cdot \vec{w}.$$
Subtracting and dividing by 4 we get
$$\vec{v} \cdot \vec{w} = \frac{1}{4}\left((\vec{v} + \vec{w}) \cdot (\vec{v} + \vec{w}) - (\vec{v} - \vec{w}) \cdot (\vec{v} - \vec{w})\right) = \frac{1}{4}\left(\|\vec{v} + \vec{w}\|^2 - \|\vec{v} - \vec{w}\|^2\right).$$

Given two non-zero vectors $\vec{v}$ and $\vec{w}$ in space, note that we can define the angle $\theta$ between $\vec{v}$ and $\vec{w}$. The vectors $\vec{v}$ and $\vec{w}$ lie in at least one plane (which is in fact unique, unless $\vec{v}$ and $\vec{w}$ are parallel). Now just measure the angle $\theta$ between $\vec{v}$ and $\vec{w}$ in this plane. By convention we always take $0 \leq \theta \leq \pi$.
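The identity that recovers the dot product from norms can be checked numerically. The sketch below is a minimal Python illustration; `dot`, `norm_sq` and `dot_from_norms` are names chosen here, not taken from the notes.

```python
def dot(v, w):
    """Dot product v1 w1 + v2 w2 + v3 w3 (any common length works)."""
    return sum(a * b for a, b in zip(v, w))


def norm_sq(v):
    """Squared norm, that is, v . v."""
    return dot(v, v)


def dot_from_norms(v, w):
    """Recover v . w as (|v + w|^2 - |v - w|^2) / 4."""
    plus = tuple(a + b for a, b in zip(v, w))
    minus = tuple(a - b for a, b in zip(v, w))
    return (norm_sq(plus) - norm_sq(minus)) / 4


v, w = (1, -2, -1), (2, 1, -3)
print(dot(v, w))             # direct computation, as in Example 2.2
print(dot_from_norms(v, w))  # the same value, obtained from norms only
```

The second function never multiplies an entry of one vector by an entry of the other; it only measures lengths, which is the point of the identity.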

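Theorem 2.5. If $\vec{v}$ and $\vec{w}$ are any two vectors in $\mathbb{R}^3$, then
$$\vec{v} \cdot \vec{w} = \|\vec{v}\| \|\vec{w}\| \cos\theta.$$

Proof. If $\vec{v}$ is the zero vector, then both sides are equal to zero, so that they are equal to each other and the formula holds (note though, that in this case the angle $\theta$ is not determined). By symmetry, we may assume that $\vec{v}$ and $\vec{w}$ are both non-zero. Let $\vec{u} = \vec{w} - \vec{v}$ and apply the law of cosines to the triangle with sides parallel to $\vec{u}$, $\vec{v}$ and $\vec{w}$:
$$\|\vec{u}\|^2 = \|\vec{v}\|^2 + \|\vec{w}\|^2 - 2\|\vec{v}\|\|\vec{w}\|\cos\theta.$$
We have already seen that the LHS of this equation expands to
$$\vec{v} \cdot \vec{v} - 2\vec{v} \cdot \vec{w} + \vec{w} \cdot \vec{w} = \|\vec{v}\|^2 - 2\vec{v} \cdot \vec{w} + \|\vec{w}\|^2.$$
Cancelling the common terms $\|\vec{v}\|^2$ and $\|\vec{w}\|^2$ from both sides, and dividing by $-2$, we get the desired formula. □

We can use (2.5) to find the angle between two vectors:

Example 2.6. Let $\vec{v} = -\hat{\imath} + \hat{k}$ and $\vec{w} = \hat{\imath} + \hat{\jmath}$. Then
$$-1 = \vec{v} \cdot \vec{w} = \|\vec{v}\|\|\vec{w}\|\cos\theta = 2\cos\theta.$$
Therefore $\cos\theta = -1/2$ and so $\theta = 2\pi/3$.

Definition 2.7. We say that two vectors $\vec{v}$ and $\vec{w}$ in $\mathbb{R}^3$ are orthogonal if $\vec{v} \cdot \vec{w} = 0$.

Remark 2.8. If neither $\vec{v}$ nor $\vec{w}$ is the zero vector, and $\vec{v} \cdot \vec{w} = 0$, then the angle between $\vec{v}$ and $\vec{w}$ is a right angle. Our convention is that the zero vector is orthogonal to every vector.

Example 2.9. $\hat{\imath}$, $\hat{\jmath}$ and $\hat{k}$ are pairwise orthogonal.

Given two vectors $\vec{v}$ and $\vec{w}$, we can project $\vec{v}$ onto $\vec{w}$. The resulting vector is called the projection of $\vec{v}$ onto $\vec{w}$ and is denoted $\operatorname{proj}_{\vec{w}} \vec{v}$. For example, if $\vec{F}$ is a force and $\vec{w}$ is a direction, then the projection of $\vec{F}$ onto $\vec{w}$ is the force in the direction of $\vec{w}$.

As $\operatorname{proj}_{\vec{w}} \vec{v}$ is parallel to $\vec{w}$, we have
$$\operatorname{proj}_{\vec{w}} \vec{v} = \lambda\vec{w},$$
for some scalar $\lambda$. Let's determine $\lambda$. Let's deal with the case that $\lambda \geq 0$ (so that the angle $\theta$ between $\vec{v}$ and $\vec{w}$ is between 0 and $\pi/2$). If we take the norm of both sides, we get
$$\|\operatorname{proj}_{\vec{w}} \vec{v}\| = \|\lambda\vec{w}\| = \lambda\|\vec{w}\|$$
(note that $\lambda \geq 0$), so that
$$\lambda = \frac{\|\operatorname{proj}_{\vec{w}} \vec{v}\|}{\|\vec{w}\|}.$$
But
$$\cos\theta = \frac{\|\operatorname{proj}_{\vec{w}} \vec{v}\|}{\|\vec{v}\|},$$
so that
$$\|\operatorname{proj}_{\vec{w}} \vec{v}\| = \|\vec{v}\|\cos\theta.$$
Putting all of this together we get
$$\lambda = \frac{\|\vec{v}\|\cos\theta}{\|\vec{w}\|} = \frac{\|\vec{v}\|\|\vec{w}\|\cos\theta}{\|\vec{w}\|^2} = \frac{\vec{v} \cdot \vec{w}}{\|\vec{w}\|^2}.$$

There are a number of ways to deal with the case when $\lambda < 0$ (so that $\theta > \pi/2$). One can carry out a similar analysis to the one given above. Here is another way. Note that the angle $\phi$ between $\vec{w}$ and $\vec{u} = -\vec{v}$ is equal to $\pi - \theta < \pi/2$. By what we already proved
$$\operatorname{proj}_{\vec{w}} \vec{u} = \frac{\vec{u} \cdot \vec{w}}{\|\vec{w}\|^2}\vec{w}.$$
But $\operatorname{proj}_{\vec{w}} \vec{u} = -\operatorname{proj}_{\vec{w}} \vec{v}$ and $\vec{u} \cdot \vec{w} = -\vec{v} \cdot \vec{w}$, so we get the same formula in the end. To summarise:

Theorem 2.10. If $\vec{v}$ and $\vec{w}$ are two vectors in $\mathbb{R}^3$, where $\vec{w}$ is not zero, then
$$\operatorname{proj}_{\vec{w}} \vec{v} = \left(\frac{\vec{v} \cdot \vec{w}}{\|\vec{w}\|^2}\right)\vec{w}.$$

Example 2.11. Find the distance $d$ between the line $l$ containing the points $P_1 = (1, -1, 2)$ and $P_2 = (4, 1, 0)$ and the point $Q = (3, 2, 4)$.

Suppose that $R$ is the closest point on the line $l$ to the point $Q$. Note that $\overrightarrow{QR}$ is orthogonal to the direction $\overrightarrow{P_1P_2}$ of the line. So we want the length of the vector $\overrightarrow{P_1Q} - \operatorname{proj}_{\overrightarrow{P_1P_2}} \overrightarrow{P_1Q}$, that is, we want
$$d = \|\overrightarrow{P_1Q} - \operatorname{proj}_{\overrightarrow{P_1P_2}} \overrightarrow{P_1Q}\|.$$
Now
$$\overrightarrow{P_1Q} = (2, 3, 2) \qquad \text{and} \qquad \overrightarrow{P_1P_2} = (3, 2, -2).$$
We have
$$\|\overrightarrow{P_1P_2}\|^2 = 3^2 + 2^2 + 2^2 = 17 \qquad \text{and} \qquad \overrightarrow{P_1P_2} \cdot \overrightarrow{P_1Q} = 6 + 6 - 4 = 8.$$
It follows that
$$\operatorname{proj}_{\overrightarrow{P_1P_2}} \overrightarrow{P_1Q} = \frac{8}{17}(3, 2, -2).$$
Subtracting, we get
$$\overrightarrow{P_1Q} - \operatorname{proj}_{\overrightarrow{P_1P_2}} \overrightarrow{P_1Q} = (2, 3, 2) - \frac{8}{17}(3, 2, -2) = \frac{1}{17}(10, 35, 50) = \frac{5}{17}(2, 7, 10).$$
Taking the length, we get
$$d = \frac{5}{17}(2^2 + 7^2 + 10^2)^{1/2} \approx 3.64.$$

Theorem 2.12. The angle subtended on the circumference of a circle by a diameter of the circle is always a right angle.

Proof. Suppose that $P$ and $Q$ are the two endpoints of a diameter of the circle and that $R$ is a point on the circumference. We want to show that the angle between $\overrightarrow{PR}$ and $\overrightarrow{QR}$ is a right angle. Let $O$ be the centre of the circle. Then
$$\overrightarrow{PR} = \overrightarrow{PO} + \overrightarrow{OR} \qquad \text{and} \qquad \overrightarrow{QR} = \overrightarrow{QO} + \overrightarrow{OR}.$$
Note that $\overrightarrow{QO} = -\overrightarrow{PO}$. Therefore
$$\overrightarrow{PR} \cdot \overrightarrow{QR} = (\overrightarrow{PO} + \overrightarrow{OR}) \cdot (\overrightarrow{QO} + \overrightarrow{OR}) = (\overrightarrow{PO} + \overrightarrow{OR}) \cdot (\overrightarrow{OR} - \overrightarrow{PO}) = \|\overrightarrow{OR}\|^2 - \|\overrightarrow{PO}\|^2 = r^2 - r^2 = 0,$$
where $r$ is the radius of the circle. It follows that $\overrightarrow{PR}$ and $\overrightarrow{QR}$ are indeed orthogonal. □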
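The angle formula of Theorem 2.5, the projection formula of Theorem 2.10 and the distance computation of Example 2.11 fit together in a short Python sketch. The helper names are assumptions made for this illustration, and the results are floating-point approximations.

```python
from math import acos, sqrt


def dot(v, w):
    return sum(a * b for a, b in zip(v, w))


def norm(v):
    return sqrt(dot(v, v))


def sub(v, w):
    return tuple(a - b for a, b in zip(v, w))


def angle(v, w):
    """Angle between two non-zero vectors, in the range [0, pi]."""
    c = dot(v, w) / (norm(v) * norm(w))
    return acos(max(-1.0, min(1.0, c)))  # clamp to guard against rounding


def proj(v, w):
    """Projection of v onto a non-zero vector w: ((v . w) / |w|^2) w."""
    lam = dot(v, w) / dot(w, w)
    return tuple(lam * a for a in w)


def distance_point_line(q, p1, p2):
    """Distance from q to the line through the different points p1 and p2."""
    p1q = sub(q, p1)
    direction = sub(p2, p1)
    return norm(sub(p1q, proj(p1q, direction)))


print(angle((-1, 0, 1), (1, 1, 0)))                         # Example 2.6
print(distance_point_line((3, 2, 4), (1, -1, 2), (4, 1, 0)))  # Example 2.11
```

Note that `proj` never needs the angle itself: the dot product already packages the factor $\|\vec{v}\|\cos\theta$.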


3. Cross product

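Definition 3.1. Let $\vec{v}$ and $\vec{w}$ be two vectors in $\mathbb{R}^3$. The cross product of $\vec{v}$ and $\vec{w}$, denoted $\vec{v} \times \vec{w}$, is the vector defined as follows:
  • the length of $\vec{v} \times \vec{w}$ is the area of the parallelogram with sides $\vec{v}$ and $\vec{w}$, that is, $\|\vec{v}\|\|\vec{w}\|\sin\theta$.
  • $\vec{v} \times \vec{w}$ is orthogonal to both $\vec{v}$ and $\vec{w}$.
  • the three vectors $\vec{v}$, $\vec{w}$ and $\vec{v} \times \vec{w}$ form a right-handed set of vectors.

Remark 3.2. The cross product only makes sense in $\mathbb{R}^3$.

Example 3.3. We have
$$\hat{\imath} \times \hat{\jmath} = \hat{k}, \qquad \hat{\jmath} \times \hat{k} = \hat{\imath} \qquad \text{and} \qquad \hat{k} \times \hat{\imath} = \hat{\jmath}.$$
By contrast
$$\hat{\jmath} \times \hat{\imath} = -\hat{k}, \qquad \hat{k} \times \hat{\jmath} = -\hat{\imath} \qquad \text{and} \qquad \hat{\imath} \times \hat{k} = -\hat{\jmath}.$$

Theorem 3.4. Let $\vec{u}$, $\vec{v}$ and $\vec{w}$ be three vectors in $\mathbb{R}^3$ and let $\lambda$ be a scalar.
(1) $\vec{v} \times \vec{w} = -\vec{w} \times \vec{v}$.
(2) $\vec{u} \times (\vec{v} + \vec{w}) = \vec{u} \times \vec{v} + \vec{u} \times \vec{w}$.
(3) $(\vec{u} + \vec{v}) \times \vec{w} = \vec{u} \times \vec{w} + \vec{v} \times \vec{w}$.
(4) $\lambda(\vec{v} \times \vec{w}) = (\lambda\vec{v}) \times \vec{w} = \vec{v} \times (\lambda\vec{w})$.

Before we prove (3.4), let's draw some conclusions from these properties.

Remark 3.5. Note that (1) of (3.4) is what really distinguishes the cross product (the cross product is skew commutative).

Consider computing the cross product of $\hat{\imath}$, $\hat{\imath}$ and $\hat{\jmath}$. On the one hand,
$$(\hat{\imath} \times \hat{\imath}) \times \hat{\jmath} = \vec{0} \times \hat{\jmath} = \vec{0}.$$
On the other hand,
$$\hat{\imath} \times (\hat{\imath} \times \hat{\jmath}) = \hat{\imath} \times \hat{k} = -\hat{\jmath}.$$
In other words, the order in which we compute the cross product is important (the cross product is not associative).

Note that if $\vec{v}$ and $\vec{w}$ are parallel, then the cross product is the zero vector. One can see this directly from the definition; the area of the parallelogram is zero and the only vector of zero length is the zero vector. On the other hand, we know that $\vec{w} = \lambda\vec{v}$. In this case,
$$\vec{v} \times \vec{w} = \vec{v} \times (\lambda\vec{v}) = \lambda\,\vec{v} \times \vec{v} = -\lambda\,\vec{v} \times \vec{v}.$$
To get the last equality, we just switched the factors and used (1). But the only vector which is equal to its additive inverse is the zero vector.

Let's try to compute the cross product using (3.4). If $\vec{v} = (v_1, v_2, v_3)$ and $\vec{w} = (w_1, w_2, w_3)$, then
$$\vec{v} \times \vec{w} = (v_1\hat{\imath} + v_2\hat{\jmath} + v_3\hat{k}) \times (w_1\hat{\imath} + w_2\hat{\jmath} + w_3\hat{k})$$
$$= v_1w_1(\hat{\imath} \times \hat{\imath}) + v_1w_2(\hat{\imath} \times \hat{\jmath}) + v_1w_3(\hat{\imath} \times \hat{k}) + v_2w_1(\hat{\jmath} \times \hat{\imath}) + v_2w_2(\hat{\jmath} \times \hat{\jmath}) + v_2w_3(\hat{\jmath} \times \hat{k}) + v_3w_1(\hat{k} \times \hat{\imath}) + v_3w_2(\hat{k} \times \hat{\jmath}) + v_3w_3(\hat{k} \times \hat{k})$$
$$= (v_2w_3 - v_3w_2)\hat{\imath} + (v_3w_1 - v_1w_3)\hat{\jmath} + (v_1w_2 - v_2w_1)\hat{k}.$$

Definition 3.6. A matrix $A = (a_{ij})$ is a rectangular array of numbers, where $a_{ij}$ is in the $i$th row and $j$th column. If $A$ has $m$ rows and $n$ columns, then we say that $A$ is an $m \times n$ matrix.

Example 3.7.
$$A = \begin{pmatrix} -2 & 1 & -7 \\ 0 & 2 & -4 \end{pmatrix} = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{pmatrix}$$
is a $2 \times 3$ matrix. $a_{23} = -4$.

Definition 3.8. If
$$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$$
is a $2 \times 2$ matrix, then the determinant of $A$ is the scalar
$$\begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc.$$
If
$$A = \begin{pmatrix} a & b & c \\ d & e & f \\ g & h & i \end{pmatrix}$$
is a $3 \times 3$ matrix, then the determinant of $A$ is the scalar
$$\begin{vmatrix} a & b & c \\ d & e & f \\ g & h & i \end{vmatrix} = a\begin{vmatrix} e & f \\ h & i \end{vmatrix} - b\begin{vmatrix} d & f \\ g & i \end{vmatrix} + c\begin{vmatrix} d & e \\ g & h \end{vmatrix}.$$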

Note that the cross product of $\vec{v}$ and $\vec{w}$ is the (formal) determinant
$$\begin{vmatrix} \hat{\imath} & \hat{\jmath} & \hat{k} \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{vmatrix}.$$

Let's now turn to the proof of (3.4).

Definition 3.9. Let $\vec{u}$, $\vec{v}$ and $\vec{w}$ be three vectors in $\mathbb{R}^3$. The triple scalar product is $(\vec{u} \times \vec{v}) \cdot \vec{w}$.

The triple scalar product is the signed volume of the parallelepiped formed using the three vectors $\vec{u}$, $\vec{v}$ and $\vec{w}$. Indeed, the volume of the parallelepiped is the area of the base times the height. For the base, we take the parallelogram with sides $\vec{u}$ and $\vec{v}$. The magnitude of $\vec{u} \times \vec{v}$ is the area of this parallelogram. The height of the parallelepiped, up to sign, is the length of $\vec{w}$ times the cosine of the angle, let's call this $\phi$, between $\vec{u} \times \vec{v}$ and $\vec{w}$. The sign is positive if $\vec{u}$, $\vec{v}$ and $\vec{w}$ form a right-handed set and negative if they form a left-handed set.

Lemma 3.10. If $\vec{u} = (u_1, u_2, u_3)$, $\vec{v} = (v_1, v_2, v_3)$ and $\vec{w} = (w_1, w_2, w_3)$ are three vectors in $\mathbb{R}^3$, then
$$(\vec{u} \times \vec{v}) \cdot \vec{w} = \begin{vmatrix} u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{vmatrix}.$$

Proof. We have already seen that
$$\vec{u} \times \vec{v} = \begin{vmatrix} \hat{\imath} & \hat{\jmath} & \hat{k} \\ u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \end{vmatrix}.$$
If one expands this determinant and dots with $\vec{w}$, this is the same as replacing the top row by $(w_1, w_2, w_3)$,
$$(\vec{u} \times \vec{v}) \cdot \vec{w} = \begin{vmatrix} w_1 & w_2 & w_3 \\ u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \end{vmatrix}.$$
Finally, if we switch the first row and the second row, and then the second row and the third row, the sign changes twice (which makes no change at all):
$$(\vec{u} \times \vec{v}) \cdot \vec{w} = \begin{vmatrix} u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{vmatrix}.$$
□

Example 3.11. The scalar triple product of $\hat{\imath}$, $\hat{\jmath}$ and $\hat{k}$ is one. One way to see this is geometrically; the parallelepiped determined by these three vectors is the unit cube, which has volume 1, and these vectors form a right-handed set, so that the sign is positive.

Another way to see this is to compute directly
$$(\hat{\imath} \times \hat{\jmath}) \cdot \hat{k} = \hat{k} \cdot \hat{k} = 1.$$
Finally one can use determinants,
$$(\hat{\imath} \times \hat{\jmath}) \cdot \hat{k} = \begin{vmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{vmatrix} = 1.$$

Lemma 3.12. Let $\vec{u}$, $\vec{v}$ and $\vec{w}$ be three vectors in $\mathbb{R}^3$. Then
$$(\vec{u} \times \vec{v}) \cdot \vec{w} = (\vec{v} \times \vec{w}) \cdot \vec{u} = (\vec{w} \times \vec{u}) \cdot \vec{v}.$$

Proof. In fact all three numbers have the same absolute value, namely the volume of the parallelepiped with sides $\vec{u}$, $\vec{v}$ and $\vec{w}$. On the other hand, if $\vec{u}$, $\vec{v}$ and $\vec{w}$ is a right-handed set, then so is $\vec{v}$, $\vec{w}$ and $\vec{u}$ and vice-versa, so all three numbers have the same sign as well. □

Lemma 3.13. Let $\vec{v}$ and $\vec{w}$ be two vectors in $\mathbb{R}^3$. Then $\vec{v} = \vec{w}$ if and only if $\vec{v} \cdot \vec{x} = \vec{w} \cdot \vec{x}$, for every vector $\vec{x}$ in $\mathbb{R}^3$.

Proof. One direction is clear; if $\vec{v} = \vec{w}$, then $\vec{v} \cdot \vec{x} = \vec{w} \cdot \vec{x}$ for any vector $\vec{x}$. So, suppose that we know that $\vec{v} \cdot \vec{x} = \vec{w} \cdot \vec{x}$, for every vector $\vec{x}$. Suppose that $\vec{v} = (v_1, v_2, v_3)$ and $\vec{w} = (w_1, w_2, w_3)$. If we take $\vec{x} = \hat{\imath}$, then we see that
$$v_1 = \vec{v} \cdot \hat{\imath} = \vec{w} \cdot \hat{\imath} = w_1.$$
Similarly, if we take $\vec{x} = \hat{\jmath}$ and $\vec{x} = \hat{k}$, then we also get
$$v_2 = \vec{v} \cdot \hat{\jmath} = \vec{w} \cdot \hat{\jmath} = w_2 \qquad \text{and} \qquad v_3 = \vec{v} \cdot \hat{k} = \vec{w} \cdot \hat{k} = w_3.$$
But then $\vec{v} = \vec{w}$ as they have the same components. □

Proof of (3.4). We first prove (1). Both sides have the same magnitude, namely the area of the parallelogram with sides $\vec{v}$ and $\vec{w}$. Further both sides are orthogonal to $\vec{v}$ and $\vec{w}$, so the only thing to check is the change in sign.

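As $\vec{v}$, $\vec{w}$ and $\vec{v} \times \vec{w}$ form a right-handed triple, it follows that $\vec{w}$, $\vec{v}$ and $\vec{v} \times \vec{w}$ form a left-handed triple. But then $\vec{w}$, $\vec{v}$ and $-\vec{v} \times \vec{w}$ form a right-handed triple. It follows that
$$\vec{w} \times \vec{v} = -\vec{v} \times \vec{w}.$$
This is (1).

To check (2), by (3.13) it suffices to check that
$$(\vec{u} \times (\vec{v} + \vec{w})) \cdot \vec{x} = (\vec{u} \times \vec{v} + \vec{u} \times \vec{w}) \cdot \vec{x},$$
for an arbitrary vector $\vec{x}$. We first attack the LHS. By (3.12), we have
$$(\vec{u} \times (\vec{v} + \vec{w})) \cdot \vec{x} = (\vec{x} \times \vec{u}) \cdot (\vec{v} + \vec{w}) = (\vec{x} \times \vec{u}) \cdot \vec{v} + (\vec{x} \times \vec{u}) \cdot \vec{w} = (\vec{u} \times \vec{v}) \cdot \vec{x} + (\vec{u} \times \vec{w}) \cdot \vec{x}.$$
We now attack the RHS.
$$(\vec{u} \times \vec{v} + \vec{u} \times \vec{w}) \cdot \vec{x} = (\vec{u} \times \vec{v}) \cdot \vec{x} + (\vec{u} \times \vec{w}) \cdot \vec{x}.$$
It follows that both sides are equal. This is (2).

We could check (3) by a similar argument. Here is another way.
$$(\vec{u} + \vec{v}) \times \vec{w} = -\vec{w} \times (\vec{u} + \vec{v}) = -\vec{w} \times \vec{u} - \vec{w} \times \vec{v} = \vec{u} \times \vec{w} + \vec{v} \times \vec{w}.$$
This is (3).

To prove (4), it suffices to prove the first equality, since the fact that the first term is equal to the third term follows by a similar derivation. If $\lambda = 0$, then both sides are the zero vector, and there is nothing to prove. So we may assume that $\lambda \neq 0$. Note first that the magnitude of both sides is the area of the parallelogram with sides $\lambda\vec{v}$ and $\vec{w}$.

If $\lambda > 0$, then $\vec{v}$ and $\lambda\vec{v}$ point in the same direction. Similarly $\vec{v} \times \vec{w}$ and $\lambda(\vec{v} \times \vec{w})$ point in the same direction. As $\vec{v}$, $\vec{w}$ and $\vec{v} \times \vec{w}$ form a right-handed set, so do $\lambda\vec{v}$, $\vec{w}$ and $\lambda(\vec{v} \times \vec{w})$. But then $\lambda(\vec{v} \times \vec{w})$ is the cross product of $\lambda\vec{v}$ and $\vec{w}$, that is,
$$(\lambda\vec{v}) \times \vec{w} = \lambda(\vec{v} \times \vec{w}).$$

If $\lambda < 0$, then $\vec{v}$ and $\lambda\vec{v}$ point in opposite directions. Similarly $\vec{v} \times \vec{w}$ and $\lambda(\vec{v} \times \vec{w})$ point in opposite directions. But then $\lambda\vec{v}$, $\vec{w}$ and $\lambda(\vec{v} \times \vec{w})$ still form a right-handed set. This is (4). □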
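The component formula for the cross product and the triple scalar product are easy to experiment with. Below is a minimal Python sketch that spot-checks, on sample integer vectors chosen for illustration, the sign rule (1) of Theorem 3.4, the failure of associativity from Remark 3.5 and the cyclic symmetry of Lemma 3.12. A check on examples is of course not a proof.

```python
def cross(v, w):
    """Cross product in R^3, using the component formula."""
    return (
        v[1] * w[2] - v[2] * w[1],
        v[2] * w[0] - v[0] * w[2],
        v[0] * w[1] - v[1] * w[0],
    )


def dot(v, w):
    return sum(a * b for a, b in zip(v, w))


def triple(u, v, w):
    """Triple scalar product (u x v) . w."""
    return dot(cross(u, v), w)


i, j, k = (1, 0, 0), (0, 1, 0), (0, 0, 1)

print(cross(i, j), cross(j, i))  # k and -k: the product is skew commutative
print(cross(cross(i, i), j))     # the zero vector
print(cross(i, cross(i, j)))     # -j, so the product is not associative

u, v, w = (1, 2, 3), (-1, 0, 2), (4, 1, -2)  # arbitrary sample vectors
print(triple(u, v, w), triple(v, w, u), triple(w, u, v))  # three equal numbers
```

With integer entries all the arithmetic is exact, so the comparisons can be made with `==` rather than a tolerance.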


4. Planes and distances

How do we represent a plane $\Pi$ in $\mathbb{R}^3$? In fact the best way to specify a plane is to give a normal vector $\vec{n}$ to the plane and a point $P_0$ on the plane. Then if we are given any point $P$ on the plane, the vector $\overrightarrow{P_0P}$ is a vector in the plane, so that it must be orthogonal to the normal vector $\vec{n}$. Algebraically, we have
$$\overrightarrow{P_0P} \cdot \vec{n} = 0.$$
Let's write this out as an explicit equation. Suppose that $P_0 = (x_0, y_0, z_0)$, $P = (x, y, z)$ and $\vec{n} = (A, B, C)$. Then we have
$$(x - x_0, y - y_0, z - z_0) \cdot (A, B, C) = 0.$$
Expanding, we get
$$A(x - x_0) + B(y - y_0) + C(z - z_0) = 0,$$
which is one common way to write down a plane. We can always rewrite this as
$$Ax + By + Cz = D.$$
Here
$$D = Ax_0 + By_0 + Cz_0 = (A, B, C) \cdot (x_0, y_0, z_0) = \vec{n} \cdot \overrightarrow{OP_0}.$$
This is perhaps the most common way to write down the equation of a plane.

Example 4.1.
$$3x - 4y + 2z = 6$$
is the equation of a plane. A vector normal to the plane is $(3, -4, 2)$.

Example 4.2. What is the equation of a plane passing through $(1, -1, 2)$, with normal vector $\vec{n} = (2, 1, -1)$? We have
$$(x - 1, y + 1, z - 2) \cdot (2, 1, -1) = 0.$$
So
$$2(x - 1) + y + 1 - (z - 2) = 0,$$
so that in other words,
$$2x + y - z = -1.$$

A line is determined by two points; a plane is determined by three points, provided those points are not collinear (that is, provided they don't lie on the same line). So given three points $P_0$, $P_1$ and $P_2$, what is the equation of the plane $\Pi$ containing $P_0$, $P_1$ and $P_2$? Well, we would like to find a vector $\vec{n}$ orthogonal to any vector in the plane. Note that $\overrightarrow{P_0P_1}$ and $\overrightarrow{P_0P_2}$ are two vectors in the plane, which by assumption are not parallel. The cross product is a vector which is orthogonal to both vectors,
$$\vec{n} = \overrightarrow{P_0P_1} \times \overrightarrow{P_0P_2}.$$
So the equation we want is
$$\overrightarrow{P_0P} \cdot (\overrightarrow{P_0P_1} \times \overrightarrow{P_0P_2}) = 0.$$
We can rewrite this a little. $\overrightarrow{P_0P} = \overrightarrow{OP} - \overrightarrow{OP_0}$. Expanding and rearranging gives
$$\overrightarrow{OP} \cdot (\overrightarrow{P_0P_1} \times \overrightarrow{P_0P_2}) = \overrightarrow{OP_0} \cdot (\overrightarrow{P_0P_1} \times \overrightarrow{P_0P_2}).$$
Note that both sides involve the triple scalar product.

Example 4.3. What is the equation of the plane $\Pi$ through the three points $P_0 = (1, 1, 1)$, $P_1 = (2, -1, 0)$ and $P_2 = (0, -1, -1)$?
$$\overrightarrow{P_0P_1} = (1, -2, -1) \qquad \text{and} \qquad \overrightarrow{P_0P_2} = (-1, -2, -2).$$
Now a vector orthogonal to both of these vectors is given by the cross product:
$$\vec{n} = \overrightarrow{P_0P_1} \times \overrightarrow{P_0P_2} = \begin{vmatrix} \hat{\imath} & \hat{\jmath} & \hat{k} \\ 1 & -2 & -1 \\ -1 & -2 & -2 \end{vmatrix} = \hat{\imath}\begin{vmatrix} -2 & -1 \\ -2 & -2 \end{vmatrix} - \hat{\jmath}\begin{vmatrix} 1 & -1 \\ -1 & -2 \end{vmatrix} + \hat{k}\begin{vmatrix} 1 & -2 \\ -1 & -2 \end{vmatrix} = 2\hat{\imath} + 3\hat{\jmath} - 4\hat{k}.$$
Note that
$$\vec{n} \cdot \overrightarrow{P_0P_1} = 2 - 6 + 4 = 0,$$
as expected. It follows that the equation of $\Pi$ is
$$2(x - 1) + 3(y - 1) - 4(z - 1) = 0,$$
so that
$$2x + 3y - 4z = 1.$$
For example, if we plug in $P_2 = (0, -1, -1)$, then
$$2 \cdot 0 + 3 \cdot (-1) - 4 \cdot (-1) = 1,$$
as expected.
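The recipe for the plane through three points (take the cross product of two difference vectors as the normal, then evaluate $D = \vec{n} \cdot \overrightarrow{OP_0}$) can be sketched in Python as follows. The function name `plane_through` and the return convention `(normal, D)` are assumptions made for this illustration.

```python
def sub(v, w):
    return tuple(a - b for a, b in zip(v, w))


def dot(v, w):
    return sum(a * b for a, b in zip(v, w))


def cross(v, w):
    return (
        v[1] * w[2] - v[2] * w[1],
        v[2] * w[0] - v[0] * w[2],
        v[0] * w[1] - v[1] * w[0],
    )


def plane_through(p0, p1, p2):
    """Return (normal, D) so that the plane is normal . (x, y, z) = D."""
    n = cross(sub(p1, p0), sub(p2, p0))
    if n == (0, 0, 0):
        raise ValueError("the three points are collinear")
    return n, dot(n, p0)


# Example 4.3
normal, d = plane_through((1, 1, 1), (2, -1, 0), (0, -1, -1))
print(normal, d)

# every one of the three points should satisfy the equation
for p in ((1, 1, 1), (2, -1, 0), (0, -1, -1)):
    print(p, dot(normal, p) == d)
```

The collinearity test uses the fact that the cross product of parallel vectors is the zero vector.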

Example 4.4. What is the parametric equation for the line $l$ given as the intersection of the two planes $2x - y + z = 1$ and $x + y - z = 2$?

Well we need two points on the intersection of these two planes. If we set $z = 0$, then we get the intersection of two lines in the $xy$-plane,
$$2x - y = 1$$
$$x + y = 2.$$
Adding these two equations we get $3x = 3$, so that $x = 1$. It follows that $y = 1$, so that $P_0 = (1, 1, 0)$ is a point on the line. Now suppose that $y = 0$. Then we get
$$2x + z = 1$$
$$x - z = 2.$$
As before this says $x = 1$ and so $z = -1$. So $P_1 = (1, 0, -1)$ is a point on $l$. A general point $P = (x, y, z)$ of $l$ satisfies
$$\overrightarrow{P_0P} = t\,\overrightarrow{P_0P_1},$$
for some parameter $t$. Expanding,
$$(x - 1, y - 1, z) = t(0, -1, -1),$$
so that
$$(x, y, z) = (1, 1 - t, -t).$$

We can also calculate distances between planes and points, lines and points, and lines and lines.

Example 4.5. What is the distance between the plane $x - 2y + 3z = 4$ and the point $P = (1, 2, 3)$?

Call the closest point $R$. Then $\overrightarrow{PR}$ is orthogonal to every vector in the plane, that is, $\overrightarrow{PR}$ is normal to the plane. Note that $\vec{n} = (1, -2, 3)$ is normal to the plane, so that $\overrightarrow{PR}$ is parallel to $\vec{n}$. Pick any point $Q$ belonging to the plane. Then the triangle $PQR$ has a right angle at $R$, so that
$$\overrightarrow{PR} = \operatorname{proj}_{\vec{n}} \overrightarrow{PQ}.$$
When $x = z = 0$, then $y = -2$, so that $Q = (0, -2, 0)$ is a point on the plane.
$$\overrightarrow{PQ} = (-1, -4, -3).$$
Now
$$\|\vec{n}\|^2 = \vec{n} \cdot \vec{n} = 1^2 + 2^2 + 3^2 = 14 \qquad \text{and} \qquad \vec{n} \cdot \overrightarrow{PQ} = -1 + 8 - 9 = -2.$$
So
$$\operatorname{proj}_{\vec{n}} \overrightarrow{PQ} = \frac{-2}{14}(1, -2, 3) = -\frac{1}{7}(1, -2, 3).$$
So the distance is
$$\frac{1}{7}\sqrt{14}.$$

Here is another way to proceed. The line through $P$, pointing in the direction $\vec{n}$, will intersect the plane at the point $R$. Now this line is given parametrically as
$$(x - 1, y - 2, z - 3) = t(1, -2, 3),$$
so that
$$(x, y, z) = (t + 1, 2 - 2t, 3 + 3t).$$
The point $R$ corresponds to
$$(t + 1) - 2(2 - 2t) + 3(3 + 3t) = 4,$$
that is, $14t + 6 = 4$, so that
$$14t = -2, \qquad \text{that is} \qquad t = -\frac{1}{7}.$$
So the point $R$ is
$$\frac{1}{7}(6, 16, 18).$$
It follows that
$$\overrightarrow{PR} = \frac{1}{7}(-1, 2, -3) = -\frac{1}{7}(1, -2, 3),$$
the same answer as before (phew!).

Example 4.6. What is the distance between the two lines
$$(x, y, z) = (t - 2, 3t + 1, 2 - t) \qquad \text{and} \qquad (x, y, z) = (2t - 1, 2 - 3t, t + 1)?$$
If the two closest points are $R$ and $R'$ then $\overrightarrow{RR'}$ is orthogonal to the direction of both lines. Now the direction of the first line is $(1, 3, -1)$ and the direction of the second line is $(2, -3, 1)$. A vector orthogonal to both is given by the cross product:
$$\begin{vmatrix} \hat{\imath} & \hat{\jmath} & \hat{k} \\ 1 & 3 & -1 \\ 2 & -3 & 1 \end{vmatrix} = -3\hat{\jmath} - 9\hat{k}.$$
To simplify some of the algebra, let's take
$$\vec{n} = \hat{\jmath} + 3\hat{k},$$
which is parallel to the vector above, so that it is still orthogonal to both lines.

It follows that $\overrightarrow{RR'}$ is parallel to $\vec{n}$. Pick any two points $P$ and $P'$ on the two lines. Note that the length of the vector
$$\operatorname{proj}_{\vec{n}} \overrightarrow{P'P}$$
is the distance between the two lines. Now if we plug in $t = 0$ to both lines we get
$$P' = (-2, 1, 2) \qquad \text{and} \qquad P = (-1, 2, 1).$$
So
$$\overrightarrow{P'P} = (1, 1, -1).$$
Then
$$\|\vec{n}\|^2 = 1^2 + 3^2 = 10 \qquad \text{and} \qquad \vec{n} \cdot \overrightarrow{P'P} = -2.$$
It follows that
$$\operatorname{proj}_{\vec{n}} \overrightarrow{P'P} = \frac{-2}{10}(0, 1, 3) = \frac{-1}{5}(0, 1, 3),$$
and so the distance between the two lines is
$$\frac{1}{5}\sqrt{10}.$$
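Both distance computations in this section come down to the length of a projection onto a normal vector, and the length of $\operatorname{proj}_{\vec{n}} \vec{a}$ is $|\vec{n} \cdot \vec{a}| / \|\vec{n}\|$. The Python sketch below uses that shortcut; the function names and argument conventions are hypothetical, and the lines are assumed not to be parallel.

```python
from math import sqrt


def dot(v, w):
    return sum(a * b for a, b in zip(v, w))


def sub(v, w):
    return tuple(a - b for a, b in zip(v, w))


def cross(v, w):
    return (
        v[1] * w[2] - v[2] * w[1],
        v[2] * w[0] - v[0] * w[2],
        v[0] * w[1] - v[1] * w[0],
    )


def distance_point_plane(p, normal, d):
    """Distance from p to the plane normal . (x, y, z) = d."""
    return abs(dot(normal, p) - d) / sqrt(dot(normal, normal))


def distance_between_lines(p, v, q, w):
    """Distance between the lines p + t v and q + s w (not parallel)."""
    n = cross(v, w)  # orthogonal to both directions
    if n == (0, 0, 0):
        raise ValueError("parallel lines: use a point-to-line distance instead")
    return abs(dot(n, sub(q, p))) / sqrt(dot(n, n))


# Example 4.5: the point (1, 2, 3) and the plane x - 2y + 3z = 4
print(distance_point_plane((1, 2, 3), (1, -2, 3), 4))

# Example 4.6: points and directions read off from the two parametrisations
print(distance_between_lines((-2, 1, 2), (1, 3, -1), (-1, 2, 1), (2, -3, 1)))
print(sqrt(10) / 5)  # the closed form found for Example 4.6
```

In `distance_point_plane`, the quantity `dot(normal, p) - d` equals $\vec{n} \cdot \overrightarrow{QP}$ for any point $Q$ of the plane, which is why no explicit point on the plane is needed.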


5. n-dimensional space

Definition 5.1. A vector in $\mathbb{R}^n$ is an $n$-tuple $\vec{v} = (v_1, v_2, \dots, v_n)$. The zero vector is $\vec{0} = (0, 0, \dots, 0)$. Given two vectors $\vec{v}$ and $\vec{w}$ in $\mathbb{R}^n$, the sum $\vec{v} + \vec{w}$ is the vector $(v_1 + w_1, v_2 + w_2, \dots, v_n + w_n)$. If $\lambda$ is a scalar, the scalar product $\lambda\vec{v}$ is the vector $(\lambda v_1, \lambda v_2, \dots, \lambda v_n)$.

The sum and scalar product of vectors in $\mathbb{R}^n$ obey the same rules as the sum and scalar product in $\mathbb{R}^2$ and $\mathbb{R}^3$.

Definition 5.2. Let $\vec{v}$ and $\vec{w} \in \mathbb{R}^n$. The dot product is the scalar
$$\vec{v} \cdot \vec{w} = v_1w_1 + v_2w_2 + \dots + v_nw_n.$$
The norm (or length) of $\vec{v}$ is the scalar
$$\|\vec{v}\| = \sqrt{\vec{v} \cdot \vec{v}}.$$
The dot product obeys the usual rules.

Example 5.3. Suppose that $\vec{v} = (1, 2, 3, 4)$ and $\vec{w} = (2, -1, 1, -1)$ and $\lambda = -2$. Then
$$\vec{v} + \vec{w} = (3, 1, 4, 3) \qquad \text{and} \qquad \lambda\vec{w} = (-4, 2, -2, 2).$$
We have
$$\vec{v} \cdot \vec{w} = 2 - 2 + 3 - 4 = -1.$$

The standard basis of $\mathbb{R}^n$ is the set of vectors
$$e_1 = (1, 0, \dots, 0), \quad e_2 = (0, 1, \dots, 0), \quad e_3 = (0, 0, 1, \dots, 0), \quad \dots, \quad e_n = (0, 0, \dots, 1).$$
Note that if $\vec{v} = (v_1, v_2, \dots, v_n)$, then
$$\vec{v} = v_1e_1 + v_2e_2 + \dots + v_ne_n.$$
Let's adopt the (somewhat ad hoc) convention that $\vec{v}$ and $\vec{w}$ are parallel if and only if either $\vec{v}$ is a scalar multiple of $\vec{w}$, or vice-versa. Note that if both $\vec{v}$ and $\vec{w}$ are non-zero vectors, then $\vec{v}$ is a scalar multiple of $\vec{w}$ if and only if $\vec{w}$ is a scalar multiple of $\vec{v}$.

Theorem 5.4 (Cauchy-Schwarz-Bunjakowski). If $\vec{v}$ and $\vec{w}$ are two vectors in $\mathbb{R}^n$, then
$$|\vec{v} \cdot \vec{w}| \leq \|\vec{v}\|\|\vec{w}\|,$$
with equality if and only if $\vec{v}$ is parallel to $\vec{w}$.

Proof. If either $\vec{v}$ or $\vec{w}$ is the zero vector, then there is nothing to prove. So we may assume that neither vector is the zero vector. Let $\vec{u} = x\vec{v} + \vec{w}$, where $x$ is a scalar. Then
$$0 \leq \vec{u} \cdot \vec{u} = (\vec{v} \cdot \vec{v})x^2 + 2(\vec{v} \cdot \vec{w})x + \vec{w} \cdot \vec{w} = ax^2 + bx + c.$$

So the quadratic function $f(x) = ax^2 + bx + c$ has at most one root. It follows that the discriminant is less than or equal to zero, with equality if and only if $f(x)$ has a root. So
$$4(\vec{v} \cdot \vec{w})^2 - 4\|\vec{v}\|^2\|\vec{w}\|^2 = b^2 - 4ac \leq 0.$$
Rearranging gives
$$(\vec{v} \cdot \vec{w})^2 \leq \|\vec{v}\|^2\|\vec{w}\|^2.$$
Taking square roots gives
$$|\vec{v} \cdot \vec{w}| \leq \|\vec{v}\|\|\vec{w}\|.$$
Now if we have equality here, then the discriminant must be equal to zero, in which case we may find a scalar $\lambda$ such that the vector $\lambda\vec{v} + \vec{w}$ has zero length. But the only vector of length zero is the zero vector, so that $\lambda\vec{v} + \vec{w} = \vec{0}$. In other words, $\vec{w} = -\lambda\vec{v}$ and $\vec{v}$ and $\vec{w}$ are parallel. □

Definition 5.5. If $\vec{v}$ and $\vec{w} \in \mathbb{R}^n$ are non-zero vectors, then the angle between them is the unique angle $0 \leq \theta \leq \pi$ such that
$$\cos\theta = \frac{\vec{v} \cdot \vec{w}}{\|\vec{v}\|\|\vec{w}\|}.$$
Note that the fraction is between $-1$ and $1$, by (5.4), so this does make sense, and we showed in (5.4) that the angle is 0 or $\pi$ if and only if $\vec{v}$ and $\vec{w}$ are parallel.

Definition 5.6. If $A = (a_{ij})$ and $B = (b_{ij})$ are two $m \times n$ matrices, then the sum $A + B$ is the $m \times n$ matrix $(a_{ij} + b_{ij})$. If $\lambda$ is a scalar, then the scalar multiple $\lambda A$ is the $m \times n$ matrix $(\lambda a_{ij})$.

Example 5.7. If
$$A = \begin{pmatrix} 1 & -1 \\ 3 & -4 \end{pmatrix} \qquad \text{and} \qquad B = \begin{pmatrix} 1 & 1 \\ 2 & -1 \end{pmatrix},$$
then
$$A + B = \begin{pmatrix} 2 & 0 \\ 5 & -5 \end{pmatrix} \qquad \text{and} \qquad 3A = \begin{pmatrix} 3 & -3 \\ 9 & -12 \end{pmatrix}.$$
Note that if we flattened $A$ and $B$ to $(1, -1, 3, -4)$ and $(1, 1, 2, -1)$ then the sum corresponds to the usual vector sum $(2, 0, 5, -5)$. Ditto for scalar multiplication.

Definition 5.8. Suppose that $A = (a_{ij})$ is an $m \times n$ matrix and $B = (b_{ij})$ is an $n \times p$ matrix. The product $C = AB = (c_{ij})$ is the $m \times p$ matrix where
$$c_{ij} = a_{i1}b_{1j} + a_{i2}b_{2j} + a_{i3}b_{3j} + \dots + a_{in}b_{nj} = \sum_{l=1}^{n} a_{il}b_{lj}.$$
In other words, the entry in the $i$th row and $j$th column of $C$ is the dot product of the $i$th row of $A$ and the $j$th column of $B$. This only makes sense because the $i$th row and the $j$th column are both vectors in $\mathbb{R}^n$.

Example 5.9. Let
$$A = \begin{pmatrix} 1 & -2 & 1 \\ 1 & -1 & 5 \end{pmatrix} \qquad \text{and} \qquad B = \begin{pmatrix} 2 & 1 \\ 1 & -4 \\ -1 & 1 \end{pmatrix}.$$
Then $C = AB$ has shape $2 \times 2$, and in fact
$$C = AB = \begin{pmatrix} -1 & 10 \\ -4 & 10 \end{pmatrix}.$$

Theorem 5.10. Let $A$, $B$ and $C$ be three matrices, and let $\lambda$ and $\mu$ be scalars.
(1) If $A$, $B$ and $C$ have the same shape, then $(A + B) + C = A + (B + C)$.
(2) If $A$ and $B$ have the same shape, then $A + B = B + A$.
(3) If $A$ and $B$ have the same shape, then $\lambda(A + B) = \lambda A + \lambda B$.
(4) If $Z$ is the zero matrix with the same shape as $A$, then $Z + A = A + Z = A$.
(5) $\lambda(\mu A) = (\lambda\mu)A$.
(6) $(\lambda + \mu)A = \lambda A + \mu A$.
(7) If $I_n$ is the $n \times n$ matrix with ones on the diagonal and zeroes everywhere else and $A$ has shape $m \times n$, then $AI_n = A$ and $I_mA = A$.
(8) If $A$ has shape $m \times n$ and $B$ has shape $n \times p$ and $C$ has shape $p \times q$, then $A(BC) = (AB)C$.
(9) If $A$ has shape $m \times n$ and $B$ and $C$ have the same shape $n \times p$, then $A(B + C) = AB + AC$.
(10) If $A$ and $B$ have the same shape $m \times n$ and $C$ has shape $n \times p$, then $(A + B)C = AC + BC$.
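The row-times-column rule of Definition 5.8 can be written out directly. The following Python sketch stores a matrix as a list of rows (an assumption of this illustration) and spot-checks Example 5.9 together with parts (7) and (8) of Theorem 5.10 on one sample.

```python
def matmul(A, B):
    """Product of an m x n matrix A and an n x p matrix B (lists of rows)."""
    n = len(B)
    if any(len(row) != n for row in A):
        raise ValueError("number of columns of A must equal number of rows of B")
    p = len(B[0])
    # entry (i, j) is the dot product of row i of A with column j of B
    return [
        [sum(A[i][l] * B[l][j] for l in range(n)) for j in range(p)]
        for i in range(len(A))
    ]


def identity(n):
    """The n x n matrix with ones on the diagonal and zeroes elsewhere."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


A = [[1, -2, 1], [1, -1, 5]]
B = [[2, 1], [1, -4], [-1, 1]]
C = [[1, 1], [0, 1]]  # an arbitrary 2 x 2 matrix for the associativity check

print(matmul(A, B))                                          # Example 5.9
print(matmul(A, identity(3)) == A, matmul(identity(2), A) == A)
print(matmul(A, matmul(B, C)) == matmul(matmul(A, B), C))
```

The shape check mirrors the remark in Definition 5.8: the rows of $A$ and the columns of $B$ must both be vectors in $\mathbb{R}^n$ for the dot products to make sense.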

Example 5.11. Note however that $AB \neq BA$ in general. For example if $A$ has shape $1 \times 3$ and $B$ has shape $3 \times 2$, then it makes sense to multiply $A$ and $B$ but it does not make sense to multiply $B$ and $A$. In fact even if it makes sense to multiply $A$ and $B$ and $B$ and $A$, the two products might not even have the same shape. For example, if
$$A = \begin{pmatrix} 1 \\ 2 \\ -1 \end{pmatrix} \qquad\text{and}\qquad B = \begin{pmatrix} 2 & -1 & 3 \end{pmatrix},$$
then $AB$ has shape $3 \times 3$,
$$AB = \begin{pmatrix} 2 & -1 & 3 \\ 4 & -2 & 6 \\ -2 & 1 & -3 \end{pmatrix},$$
but $BA$ has shape $1 \times 1$,
$$BA = (2 - 2 - 3) = (-3).$$
But even if both products $AB$ and $BA$ make sense, and they have the same shape, the products still don't have to be equal. Suppose
$$A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \qquad\text{and}\qquad B = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}.$$
Then $AB$ and $BA$ are both $2 \times 2$ matrices. But
$$AB = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix} \qquad\text{and}\qquad BA = \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix}.$$

One can also define determinants for $n \times n$ matrices. It is probably easiest to explain the general rule using an example:
$$\begin{vmatrix} 1 & 0 & 0 & 2 \\ 2 & 0 & 1 & -1 \\ 1 & -2 & 1 & 1 \\ 0 & 1 & 0 & 1 \end{vmatrix} = \begin{vmatrix} 0 & 1 & -1 \\ -2 & 1 & 1 \\ 1 & 0 & 1 \end{vmatrix} - 2 \begin{vmatrix} 2 & 0 & 1 \\ 1 & -2 & 1 \\ 0 & 1 & 0 \end{vmatrix}.$$
Notice that as we expand about the top row, the sign alternates $+ - + -$, so that the last term comes with a minus sign.

Finally, we try to explain the real meaning of a matrix. Let
$$A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}.$$
Given $A$, we can construct a function $f\colon \mathbb{R}^2 \longrightarrow \mathbb{R}^2$,

by the rule $f(\vec{v}) = A\vec{v}$. If $\vec{v} = (x, y)$, then $A\vec{v} = (x + y, y)$. Here I cheat a little, and write row vectors instead of column vectors. Geometrically this is called a shear; it leaves the $y$-coordinate alone but moves each point along the $x$-direction by an amount given by the value of $y$. If
$$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$$
the resulting function sends $(x, y)$ to $(ax + by, cx + dy)$. In fact the functions one gets this way are always linear. If
$$A = \begin{pmatrix} 2 & 0 \\ 0 & -1 \end{pmatrix},$$
then $f(x, y) = (2x, -y)$, and this has the result of scaling by a factor of 2 in the $x$-direction and reflecting in the $y$-direction.

In general if $A$ is an $m \times n$ matrix, we get a function $f\colon \mathbb{R}^n \longrightarrow \mathbb{R}^m$, using the same rule, $f(\vec{v}) = A\vec{v}$. If $B$ is an $n \times p$ matrix, then we get a function $g\colon \mathbb{R}^p \longrightarrow \mathbb{R}^n$, by the rule $g(\vec{w}) = B\vec{w}$. Note that we can compose the functions $f$ and $g$, to get a function $f \circ g\colon \mathbb{R}^p \longrightarrow \mathbb{R}^m$. First we apply $g$ to $\vec{w}$ to get a vector $\vec{v}$ in $\mathbb{R}^n$ and then we apply $f$ to $\vec{v}$ to get a vector in $\mathbb{R}^m$. The composite function $f \circ g$ is given by the rule $(f \circ g)(\vec{w}) = (AB)\vec{w}$. In other words, matrix multiplication is chosen so that it represents composition of functions.

As soon as one realises this, many aspects of matrix multiplication become far less mysterious. For example, composition of functions is not commutative, for example
$$\sin 2x \neq 2 \sin x,$$
and this is why $AB \neq BA$ in general. Note that it is not hard to check that composition of functions is associative,
$$f \circ (g \circ h) = (f \circ g) \circ h.$$
This is the easiest way to check that matrix multiplication is associative, that is, (8) of (5.10).

Functions given by matrices are obviously very special. Note that if $f(\vec{v}) = A\vec{v}$, then
$$f(\vec{v} + \vec{w}) = A(\vec{v} + \vec{w}) = A\vec{v} + A\vec{w} = f(\vec{v}) + f(\vec{w}),$$
and
$$f(\lambda\vec{v}) = A(\lambda\vec{v}) = \lambda(A\vec{v}) = \lambda f(\vec{v}).$$
Any function which respects both addition of vectors and scalar multiplication is called linear and it is precisely the linear functions which are given by matrices. In fact if $e_1, e_2, \dots, e_n$ and $f_1, f_2, \dots, f_m$ are the standard bases for $\mathbb{R}^n$ and $\mathbb{R}^m$, and $f$ is linear, then
$$f(e_j) = \sum_{i} a_{ij} f_i,$$
for some scalars $a_{ij}$, since $f(e_j)$ is a vector in $\mathbb{R}^m$ and any vector in $\mathbb{R}^m$ is a linear combination of the standard basis vectors $f_1, f_2, \dots, f_m$. If we put $A = (a_{ij})$ then one can check that $f$ is the function $f(\vec{v}) = A\vec{v}$.
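The statement that matrix multiplication represents composition of functions can be checked numerically. The sketch below is only an illustration under simple assumptions: matrices are lists of rows, and the helper names `mat_vec` and `mat_mul` are hypothetical. It uses the shear and the scale-and-reflect matrices from the discussion above.

```python
def mat_vec(A, v):
    """Apply the matrix A (list of rows) to the vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def mat_mul(A, B):
    """Matrix product: entry (i, j) is row i of A dotted with column j of B."""
    return [[sum(A[i][l] * B[l][j] for l in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


A = [[1, 1], [0, 1]]    # the shear (x, y) -> (x + y, y)
B = [[2, 0], [0, -1]]   # (x, y) -> (2x, -y)
w = [3, 4]

f_of_g = mat_vec(A, mat_vec(B, w))     # apply g first, then f
via_product = mat_vec(mat_mul(A, B), w)
other_order = mat_vec(mat_mul(B, A), w)

print(f_of_g, via_product, other_order)  # [2, -4] [2, -4] [14, -4]
```

The first two results agree, as $(f \circ g)(\vec{w}) = (AB)\vec{w}$ predicts, while composing in the other order gives a different vector.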


MIT OpenCourseWare http://ocw.mit.edu

18.022 Calculus of Several Variables Fall 2010

For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.

6. Cylindrical and spherical coordinates

Recall that in the plane one can use polar coordinates rather than Cartesian coordinates. In polar coordinates we specify a point using the distance $r$ from the origin and the angle $\theta$ with the $x$-axis. In polar coordinates, if $a$ is a constant, then $r = a$ represents a circle of radius $a$, centred at the origin, and if $\alpha$ is a constant, then $\theta = \alpha$ represents a ray, starting at the origin, making an angle $\alpha$ with the $x$-axis.

Suppose that $r = a\theta$, $a$ a constant. This represents a spiral (in fact, the Archimedes spiral), starting at the origin. The smaller $a$, the 'tighter' the spiral. By convention, if $r$ is negative, we use this to mean that we point in the opposite direction to the direction given by $\theta$. Also by convention, $\theta$ and $\theta + 2\pi$ represent the same point. We may require $r \geq 0$ and $0 \leq \theta < 2\pi$ and if we are not at the origin, this gives us unique polar coordinates.

It is straightforward to convert to and from polar coordinates:
$$x = r\cos\theta, \qquad y = r\sin\theta,$$
and
$$r^2 = x^2 + y^2, \qquad \tan\theta = y/x.$$
For example, what curve does the equation $r = 2a\cos\theta$ represent? Well if we multiply both sides by $r$, then we get $r^2 = 2ar\cos\theta$. So we get $x^2 + y^2 = 2ax$. Completing the square gives
$$(x - a)^2 + y^2 = a^2.$$
So this is a circle of radius $a$, centred at $(a, 0)$.

Polar coordinates can be very useful when we have circles or lines through the origin, or there is a lot of radial symmetry. Instead of using the vectors $\hat{\imath}$ and $\hat{\jmath}$, in polar coordinates it makes sense to use orthogonal vectors of unit length, that move as the point moves (these are called moving frames). At a point $P$ in the plane, with polar coordinates $(r, \theta)$, we use the vector $\hat{e}_r$ to denote the vector of unit length pointing in the radial direction:
$$\hat{e}_r = \cos\theta\,\hat{\imath} + \sin\theta\,\hat{\jmath}.$$
$\hat{e}_r$ points in the direction of increasing $r$. The vector $\hat{e}_\theta$ is a unit vector pointing in the direction of increasing $\theta$. It is orthogonal to $\hat{e}_r$ and so in fact
$$\hat{e}_\theta = -\sin\theta\,\hat{\imath} + \cos\theta\,\hat{\jmath}.$$
We will call a set of unit vectors which are pairwise orthogonal an orthonormal basis if we have two in the plane or three in space.

We want to do something similar in space but now there are two choices beyond Cartesian coordinates. The first just takes polar coordinates in the $xy$-plane and throws in the extra variable $z$. So a point $P$ is specified by three coordinates, $(r, \theta, z)$. $r$ is the distance to the origin of the projection $P'$ of $P$ down to the $xy$-plane, $\theta$ is the angle $\overrightarrow{OP'}$ makes with the $x$-axis, so that $(r, \theta)$ are just polar coordinates for the point $P'$ in the $xy$-plane, and $z$ is just the height of $P$ from the $xy$-plane.
$$x = r\cos\theta, \qquad y = r\sin\theta, \qquad z = z.$$
Note that the locus $r = a$ specifies a cylinder in three space. For this reason we call $(r, \theta, z)$ cylindrical coordinates. The locus $\theta = \alpha$ specifies a half-plane which is vertical (if we allow $r < 0$ then we get the full vertical plane). The locus $z = a$ specifies a horizontal plane, parallel to the $xy$-plane. The locus $z = ar$ specifies a half cone. At height one, the cone has radius $1/a$, so the smaller $a$ the more 'open' the cone.

The locus $z = a\theta$ is rather complicated. If we fix the angle, then we get a line of this height and this angle. The resulting surface is called a helicoid, and looks a little bit like a spiral staircase.

Again it is useful to write down an orthonormal coordinate frame. In this case there are three vectors, pointing in the direction of increasing $r$, increasing $\theta$ and increasing $z$:
$$\hat{e}_r = \cos\theta\,\hat{\imath} + \sin\theta\,\hat{\jmath}, \qquad \hat{e}_\theta = -\sin\theta\,\hat{\imath} + \cos\theta\,\hat{\jmath}, \qquad \hat{e}_z = \hat{k}.$$

The third coordinate system in space uses two angles and the distance to the origin, $(\rho, \theta, \phi)$. $\rho$ is the distance to the origin, $\theta$ is the angle made with the $x$-axis by the projection of $P$ down to the $xy$-plane and $\phi$ is the angle the radius vector makes with the $z$-axis. Typically we use coordinates such that $0 \leq \rho < \infty$, $0 \leq \theta < 2\pi$ and $0 \leq \phi \leq \pi$. To get from spherical coordinates to Cartesian coordinates, we first convert to cylindrical coordinates,
$$r = \rho\sin\phi, \qquad \theta = \theta, \qquad z = \rho\cos\phi.$$
So, in Cartesian coordinates we get
$$x = \rho\sin\phi\cos\theta, \qquad y = \rho\sin\phi\sin\theta, \qquad z = \rho\cos\phi.$$
The locus $\rho = a$ represents a sphere of radius $a$, and for this reason we call $(\rho, \theta, \phi)$ spherical coordinates. The locus $\phi = a$ represents a cone.

Example 6.1. Describe the region $x^2 + y^2 + z^2 \leq a^2$

and $x^2 + y^2 \geq z^2$, in spherical coordinates.

The first region is the region inside the sphere of radius $a$, $\rho \leq a$. The second is the region outside a cone. The surface of the cone is given by $z^2 = x^2 + y^2$. Now one point on this cone is the point $(1, 0, 1)$, so that this is a right-angled cone, and the region is given by $\pi/4 \leq \phi \leq 3\pi/4$. So we can describe this region by the inequalities
$$\rho \leq a \qquad\text{and}\qquad \pi/4 \leq \phi \leq 3\pi/4.$$

Finally, let's write down the moving frame given by spherical coordinates, the one corresponding to increasing $\rho$, increasing $\theta$ and increasing $\phi$.
$$\hat{e}_\rho = \frac{x\hat{\imath} + y\hat{\jmath} + z\hat{k}}{\sqrt{x^2 + y^2 + z^2}} = \sin\phi\cos\theta\,\hat{\imath} + \sin\phi\sin\theta\,\hat{\jmath} + \cos\phi\,\hat{k},$$
$$\hat{e}_\theta = -\sin\theta\,\hat{\imath} + \cos\theta\,\hat{\jmath}.$$
To calculate $\hat{e}_\phi$, we use the fact that it has unit length and it is orthogonal to both $\hat{e}_\rho$ and $\hat{e}_\theta$. We have $\hat{e}_\phi = \pm\,\hat{e}_\theta \times \hat{e}_\rho$, and
$$\hat{e}_\theta \times \hat{e}_\rho = \begin{vmatrix} \hat{\imath} & \hat{\jmath} & \hat{k} \\ -\sin\theta & \cos\theta & 0 \\ \sin\phi\cos\theta & \sin\phi\sin\theta & \cos\phi \end{vmatrix}
= \cos\phi\cos\theta\,\hat{\imath} + \sin\theta\cos\phi\,\hat{\jmath} - (\sin^2\theta\sin\phi + \cos^2\theta\sin\phi)\hat{k}
= \cos\phi\cos\theta\,\hat{\imath} + \cos\phi\sin\theta\,\hat{\jmath} - \sin\phi\,\hat{k}.$$
Now when $\phi$ increases, $z$ decreases. So we want the vector with negative $z$-component, which is exactly the last vector we wrote down.
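The conversion formulas and the spherical moving frame can be sanity-checked numerically. The following sketch assumes the convention used above ($\phi$ measured from the $z$-axis); the function names are illustrative only. It checks that the three frame vectors are orthonormal and that $\hat{e}_\phi = \hat{e}_\theta \times \hat{e}_\rho$.

```python
import math


def spherical_to_cartesian(rho, theta, phi):
    """(rho, theta, phi) -> (x, y, z), with phi the angle from the z-axis."""
    return (rho * math.sin(phi) * math.cos(theta),
            rho * math.sin(phi) * math.sin(theta),
            rho * math.cos(phi))


def spherical_frame(theta, phi):
    """Unit vectors in the directions of increasing rho, theta and phi."""
    e_rho = (math.sin(phi) * math.cos(theta), math.sin(phi) * math.sin(theta), math.cos(phi))
    e_theta = (-math.sin(theta), math.cos(theta), 0.0)
    e_phi = (math.cos(phi) * math.cos(theta), math.cos(phi) * math.sin(theta), -math.sin(phi))
    return e_rho, e_theta, e_phi


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


e_rho, e_theta, e_phi = spherical_frame(theta=0.7, phi=1.1)
print(dot(e_rho, e_theta), dot(e_rho, e_phi), dot(e_theta, e_phi))  # all close to 0
print(cross(e_theta, e_rho), e_phi)  # the two vectors agree up to rounding
```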


7. Functions

Definition 7.1. A function $f\colon A \longrightarrow B$ consists of:
(1) the set $A$, which is called the domain of $f$,
(2) the set $B$, which is called the range of $f$,
(3) a rule which assigns to every element $a \in A$ an element $b \in B$, which we denote by $f(a)$.

It is easy to write down examples of functions:
(1) Let $A$ be the set of all people and let $B = [0, \infty)$. Let $f(x)$ be the height of person $x$, to the nearest inch.
(2) $f\colon \mathbb{R} \longrightarrow \mathbb{R}$ given by $f(x) = x^2$.
(3) $f\colon \mathbb{R} \longrightarrow [0, \infty)$ given by $f(x) = x^2$.
(4) $f\colon [0, \infty) \longrightarrow \mathbb{R}$ given by $f(x) = x^2$.
(5) $f\colon [0, \infty) \longrightarrow [0, \infty)$ given by $f(x) = x^2$.
(6) $f\colon \mathbb{R} \longrightarrow \{0, 1\}$ given by
$$f(x) = \begin{cases} 0 & \text{if $x$ is rational} \\ 1 & \text{if $x$ is irrational.} \end{cases}$$
(7) $f\colon \mathbb{N} \longrightarrow \{0, 1\}$ given by
$$f(x) = \begin{cases} 0 & \text{if $x$ is even} \\ 1 & \text{if $x$ is odd.} \end{cases}$$

Definition 7.2. If $f\colon A \longrightarrow B$ and $g\colon B \longrightarrow C$ are two functions, then we can compose $f$ and $g$ to get a new function $g \circ f\colon A \longrightarrow C$. If $a \in A$, then $(g \circ f)(a) = g(f(a))$.

If $f\colon \mathbb{R} \longrightarrow \mathbb{R}$ is the function $f(x) = \sin x$ and $g\colon \mathbb{R} \longrightarrow \mathbb{R}$ is the function $g(x) = 2x$, then $g \circ f\colon \mathbb{R} \longrightarrow \mathbb{R}$ is the function $(g \circ f)(x) = 2\sin x$, whilst $f \circ g\colon \mathbb{R} \longrightarrow \mathbb{R}$ is the function $(f \circ g)(x) = \sin 2x$.

Definition 7.3. Let $f\colon A \longrightarrow B$ be a function. We say that $f$ is injective if whenever $f(a_1) = f(a_2)$ for some pair $a_1, a_2 \in A$, then in fact $a_1 = a_2$. We say that $f$ is surjective if for every $b \in B$, we may find an $a \in A$ such that $f(a) = b$. We say that $f$ is a bijection if $f$ is both injective and surjective.

It is interesting to go through the examples above. The function in (1) is neither injective nor surjective. Indeed there are lots of people who are five foot nine, so just because there are two people with the same height, does not mean they are the same person. So $f$ is not injective. On the other hand, there is no person who is a million inches high, so $f$ is not surjective either.

The function in (2) is neither injective nor surjective as well. $f(-1) = 1 = f(1)$, but $1 \neq -1$. There is no real number whose square is $-1$, so there is no real number $a$ such that $f(a) = -1$.

The function in (3) is not injective but it is surjective. $f(-1) = f(1)$, and $1 \neq -1$. But if $b \geq 0$ then there is always a real number $a \geq 0$ such that $f(a) = b$ (namely, the square root of $b$).

The function in (4) is injective but not surjective. If $f(a_1) = f(a_2)$, then $a_1^2 = a_2^2$. As both $a_1 \geq 0$ and $a_2 \geq 0$, this implies $a_1 = a_2$. On the other hand, there is still no number whose square is $-1$.

The function in (5) is bijective. It is injective, as in (4) and it is surjective as in (3).

The function in (6) is not injective but it is surjective. $f(0) = 0 = f(1)$, but $0 \neq 1$. On the other hand, if $b \in \{0, 1\}$ then either $b = 0$ or $b = 1$. But $f(0) = 0$ and $f(\sqrt{2}) = 1$.

The function in (7) is not injective but it is surjective. $f(0) = f(2) = 0$, but $0 \neq 2$. On the other hand, if $b \in \{0, 1\}$ then either $b = 0$ or $b = 1$. But $f(0) = 0$ and $f(1) = 1$.

(8) $f\colon \mathbb{R}^3 \longrightarrow \mathbb{R}$ given by $f(\vec{v}) = \|\vec{v}\|$.
(9) $f\colon \mathbb{R}^3 - \{0\} \longrightarrow \mathbb{R}^3$ given by $f(\vec{v}) = \dfrac{\vec{v}}{\|\vec{v}\|}$.
(10) $r\colon \mathbb{R} \longrightarrow \mathbb{R}^3$ given by $r(t) = (\cos t, \sin t, t)$.
(11) $f\colon \mathbb{R} \longrightarrow (0, \infty)$ given by $f(x) = e^x$.

The function in (8) is neither injective nor surjective. There are plenty of unit vectors and there are no vectors of negative length.

The function in (9) is neither injective nor surjective. There are plenty of vectors which point in the same direction and the image consists of vectors of unit length.

The function in (10) is injective but not surjective.

The function in (11) is bijective.

If $f\colon A \longrightarrow B$ is a bijective function, then $f$ has an inverse function $g\colon B \longrightarrow A$. $g \circ f\colon A \longrightarrow A$ is the identity, it sends $a$ to $a$. Similarly $f \circ g\colon B \longrightarrow B$ is the identity, it sends $b$ to $b$. The inverse of the exponential function is the logarithm $g\colon (0, \infty) \longrightarrow \mathbb{R}$ given by $g(x) = \ln x$.

Definition 7.4. If $A$ and $B$ are two sets, the product of $A$ and $B$, denoted $A \times B$, is the set of ordered pairs
$$A \times B = \{\, (a, b) \mid a \in A,\ b \in B \,\}.$$
The graph of a function $f\colon A \longrightarrow B$ is the subset of the product
$$\{\, (a, f(a)) \mid a \in A \,\} \subset A \times B.$$
The product $\mathbb{R} \times \mathbb{R} = \mathbb{R}^2$, $\mathbb{R} \times \mathbb{R}^2 = \mathbb{R}^3$, and so on.

If $f\colon \mathbb{R} \longrightarrow \mathbb{R}$ is the function $f(x) = x^2$, then the graph of $f$ is the parabola, $y = x^2$,
$$\{\, (x, x^2) \mid x \in \mathbb{R} \,\} \subset \mathbb{R}^2.$$
Now suppose that $g\colon \mathbb{R}^2 \longrightarrow \mathbb{R}$ is the function $g(x, y) = x^2 + y^2$. The graph of $g$ is the set $z = x^2 + y^2$,
$$\{\, (x, y, x^2 + y^2) \mid (x, y) \in \mathbb{R}^2 \,\} \subset \mathbb{R}^3.$$
One way to picture $g$ is to slice it using level curves and level sets. If $g\colon \mathbb{R}^2 \longrightarrow \mathbb{R}$ is any function, the level curve of $g$ at height $c$ is the set
$$\{\, (x, y) \in \mathbb{R}^2 \mid g(x, y) = c \,\} \subset \mathbb{R}^2.$$
In the example when $g(x, y) = x^2 + y^2$, the level curves for $c > 0$ are concentric circles of radius $\sqrt{c}$, centred at the origin. If $c = 0$ the level curve is the origin and if $c < 0$ the level curve is empty. Note that the graph has a minimum at the origin.

If $g\colon \mathbb{R}^3 \longrightarrow \mathbb{R}$ is any function, the level set of $g$ at height $c$ is the set
$$\{\, (x, y, z) \in \mathbb{R}^3 \mid g(x, y, z) = c \,\} \subset \mathbb{R}^3.$$
If $g(x, y, z) = z - x^2 - y^2$ then the level sets are parallel paraboloids.

It is interesting to try to visualise various functions. If we consider $f(x, y) = -x^2 - y^2$, then this is an upside down paraboloid, which has a maximum at the origin. Note that the level curves are the same as the level curves for $f(x, y) = x^2 + y^2$ (for different values of $c$, of course).

Now consider the function $f(x, y) = x^2 - y^2$. If we consider fixing $y = b$ and varying $x$, then we get a parabola $z = x^2 - b^2$. If we consider fixing $x = a$ and varying $y$, then we get an upside down parabola, $z = a^2 - y^2$. In fact the origin is a saddle point, a point which is neither a maximum nor a minimum.


8. Limits

Definition 8.1. Let $P \in \mathbb{R}^n$ be a point. The open ball of radius $\epsilon > 0$ about $P$ is the set
$$B_\epsilon(P) = \{\, Q \in \mathbb{R}^n \mid \|\overrightarrow{PQ}\| < \epsilon \,\}.$$
The closed ball of radius $\epsilon > 0$ about $P$ is the set
$$\{\, Q \in \mathbb{R}^n \mid \|\overrightarrow{PQ}\| \leq \epsilon \,\}.$$

Definition 8.2. A subset $A \subset \mathbb{R}^n$ is called open if for every $P \in A$ there is an $\epsilon > 0$ such that the open ball of radius $\epsilon$ about $P$ is entirely contained in $A$,
$$B_\epsilon(P) \subset A.$$
We say that $B$ is closed if the complement of $B$ is open.

Put differently, an open set is a union of open balls. Open balls are open and closed balls are closed. $[0, 1)$ is neither open nor closed.

Definition 8.3. Let $B \subset \mathbb{R}^n$. We say that $P \in \mathbb{R}^n$ is a limit point of $B$ if for every $\epsilon > 0$ the intersection
$$B_\epsilon(P) \cap B \neq \emptyset.$$

Example 8.4. $0$ is a limit point of
$$\left\{\, \frac{1}{n} \;\middle|\; n \in \mathbb{N} \,\right\} \subset \mathbb{R}.$$

Lemma 8.5. A subset $B \subset \mathbb{R}^n$ is closed if and only if $B$ contains all of its limit points.

Example 8.6. $\mathbb{R}^n - \{0\}$ is open. One can see this directly from the definition or from the fact that the complement $\{0\}$ is closed.

Definition 8.7. Let $A \subset \mathbb{R}^n$ and let $P \in \mathbb{R}^n$ be a limit point of $A$. Suppose that $f\colon A \longrightarrow \mathbb{R}^m$ is a function. We say that $f$ approaches $L$ as $Q$ approaches $P$ and write
$$\lim_{Q \to P} f(Q) = L,$$
if for every $\epsilon > 0$ we may find $\delta > 0$ such that whenever $Q \in B_\delta(P) \cap A$, we have $f(Q) \in B_\epsilon(L)$. In this case we call $L$ the limit.

It might help to understand the notion of a limit in terms of a game played between two people. Let's call the first player Larry and the second player Norman. Larry wants to show that $L$ is the limit of $f(Q)$ as $Q$ approaches $P$ and Norman does not. So Norman gets to choose $\epsilon > 0$. Once Norman has chosen $\epsilon > 0$, Larry has to choose $\delta > 0$. The smaller that Norman chooses $\epsilon > 0$, the harder Larry has to work (typically he will have to make a choice of $\delta > 0$ very small).

Proposition 8.8. Let $f\colon A \longrightarrow \mathbb{R}^m$ and $g\colon A \longrightarrow \mathbb{R}^m$ be two functions. Let $\lambda \in \mathbb{R}$ be a scalar. If $P$ is a limit point of $A$ and
$$\lim_{Q \to P} f(Q) = L \qquad\text{and}\qquad \lim_{Q \to P} g(Q) = M,$$
then
(1) $\lim_{Q \to P} (f + g)(Q) = L + M$, and
(2) $\lim_{Q \to P} (\lambda f)(Q) = \lambda L$.
Now suppose that $m = 1$.
(3) $\lim_{Q \to P} (fg)(Q) = LM$, and
(4) if $M \neq 0$, then $\lim_{Q \to P} (f/g)(Q) = L/M$.

Proof. We just prove (1). Suppose that $\epsilon > 0$. As $L$ and $M$ are limits, we may find $\delta_1$ and $\delta_2$ such that, if $\|Q - P\| < \delta_1$ and $Q \in A$, then $\|f(Q) - L\| < \epsilon/2$ and if $\|Q - P\| < \delta_2$ and $Q \in A$, then $\|g(Q) - M\| < \epsilon/2$. Let $\delta = \min(\delta_1, \delta_2)$. If $\|Q - P\| < \delta$ and $Q \in A$, then
$$\begin{aligned}
\|(f + g)(Q) - L - M\| &= \|(f(Q) - L) + (g(Q) - M)\| \\
&\leq \|f(Q) - L\| + \|g(Q) - M\| \\
&< \frac{\epsilon}{2} + \frac{\epsilon}{2} \\
&= \epsilon,
\end{aligned}$$
where we applied the triangle inequality to obtain the first inequality. This is (1). (2-4) have similar proofs.

Definition 8.9. Let $A \subset \mathbb{R}^n$ and let $P \in A$. If $f\colon A \longrightarrow \mathbb{R}^m$ is a function, then we say that $f$ is continuous at $P$, if
$$\lim_{Q \to P} f(Q) = f(P).$$
We say that $f$ is continuous, if it is continuous at every point of $A$.

Theorem 8.10. If $f\colon \mathbb{R}^n \longrightarrow \mathbb{R}$ is a polynomial function, then $f$ is continuous. A similar result holds if $f$ is a rational function (a quotient of two polynomials).

Example 8.11. $f\colon \mathbb{R}^2 \longrightarrow \mathbb{R}$ given by $f(x, y) = x^2 + y^2$ is continuous.

Sometimes Larry is very lucky:

Example 8.12. Does the limit
$$\lim_{(x,y) \to (0,0)} \frac{x^2 - y^2}{x - y}$$
exist? Here the domain of $f$ is $A = \{\, (x, y) \in \mathbb{R}^2 \mid x \neq y \,\}$. Note $(0, 0)$ is a limit point of $A$. Note that if $(x, y) \in A$, then
$$\frac{x^2 - y^2}{x - y} = x + y,$$
so that
$$\lim_{(x,y) \to (0,0)} \frac{x^2 - y^2}{x - y} = \lim_{(x,y) \to (0,0)} (x + y) = 0.$$
So the limit does exist.

Norman likes the following result:

Proposition 8.13. Let $A \subset \mathbb{R}^n$ and let $B \subset \mathbb{R}^m$. Let $f\colon A \longrightarrow B$ and $g\colon B \longrightarrow \mathbb{R}^l$. Suppose that $P$ is a limit point of $A$, $L$ is a limit point of $B$ and
$$\lim_{Q \to P} f(Q) = L \qquad\text{and}\qquad \lim_{M \to L} g(M) = E.$$
Then
$$\lim_{Q \to P} (g \circ f)(Q) = E.$$

Proof. Let $\epsilon > 0$. We may find $\delta > 0$ such that if $\|M - L\| < \delta$, and $M \in B$, then $\|g(M) - E\| < \epsilon$. Given $\delta > 0$ we may find $\eta > 0$ such that if $\|Q - P\| < \eta$ and $Q \in A$, then $\|f(Q) - L\| < \delta$. But then if $\|Q - P\| < \eta$ and $Q \in A$, then $M = f(Q) \in B$ and $\|M - L\| < \delta$, so that
$$\|(g \circ f)(Q) - E\| = \|g(f(Q)) - E\| = \|g(M) - E\| < \epsilon.$$

Example 8.14. Does
$$\lim_{(x,y) \to (0,0)} \frac{xy}{x^2 + y^2}$$
exist? The answer is no. To show that the answer is no, we suppose that the limit exists. Suppose we consider restricting to the $x$-axis. Let $f\colon \mathbb{R} \longrightarrow \mathbb{R}^2$ be given by $t \longrightarrow (t, 0)$. As $f$ is continuous, if we compose we must get a function with a limit,
$$\lim_{t \to 0} \frac{0}{t^2 + 0} = \lim_{t \to 0} 0 = 0.$$
Now suppose that we restrict to the line $y = x$. Now consider the function $f\colon \mathbb{R} \longrightarrow \mathbb{R}^2$ given by $t \longrightarrow (t, t)$. As $f$ is continuous, if we compose we must get a function with a limit,
$$\lim_{t \to 0} \frac{t^2}{t^2 + t^2} = \lim_{t \to 0} \frac{1}{2} = \frac{1}{2}.$$
The problem is that the limit along two different lines is different. So the original limit cannot exist.

Example 8.15. Does the limit
$$\lim_{(x,y) \to (0,0)} \frac{x^3}{x^2 + y^2}$$
exist? Let us use polar coordinates. Note that
$$\frac{x^3}{x^2 + y^2} = \frac{r^3\cos^3\theta}{r^2} = r\cos^3\theta.$$
So we guess the limit is zero.
$$\lim_{(x,y) \to (0,0)} \left|\frac{x^3}{x^2 + y^2}\right| = \lim_{r \to 0} |r\cos^3\theta| \leq \lim_{r \to 0} |r| = 0.$$

Example 8.16. Does the limit
$$\lim_{(x,y,z) \to (0,0,0)} \frac{xyz}{x^2 + y^2 + z^2}$$
exist? Same trick, but now let us use spherical coordinates.
$$\lim_{(x,y,z) \to (0,0,0)} \left|\frac{xyz}{x^2 + y^2 + z^2}\right| = \lim_{\rho \to 0} \left|\frac{\rho^3\sin^2\phi\cos\phi\cos\theta\sin\theta}{\rho^2}\right| = \lim_{\rho \to 0} |\rho\sin^2\phi\cos\phi\cos\theta\sin\theta| \leq \lim_{\rho \to 0} |\rho| = 0.$$

Sometimes Norman needs to restrict to more complicated curves than just lines:

Example 8.17. Does the limit
$$\lim_{(x,y) \to (0,0)} \frac{y}{y + x^2}$$
exist? If we restrict to the line $t \longrightarrow (at, bt)$ with $b \neq 0$, then we get
$$\lim_{t \to 0} \frac{bt}{bt + a^2t^2} = \lim_{t \to 0} \frac{b}{b + a^2t} = 1.$$
But if we restrict to the conic $t \longrightarrow (t, at^2)$, then we get
$$\lim_{t \to 0} \frac{at^2}{at^2 + t^2} = \lim_{t \to 0} \frac{a}{1 + a} = \frac{a}{1 + a},$$
and the limit changes as we vary $a$, so that the limit does not exist.

Note that if we start with
$$\frac{y}{y + x^d},$$
then Norman even needs to use curves of degree $d$, $t \longrightarrow (t, at^d)$.
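Norman's strategy of restricting to curves is easy to try out numerically. The sketch below is only an experiment, not a proof: it samples the functions of Examples 8.14 and 8.17 along a few paths as $t$ shrinks. The helper name `along_path` and the sample values of $t$ and $a$ are illustrative assumptions.

```python
def along_path(f, path, ts):
    """Evaluate f at the points path(t) for each t in ts."""
    return [f(*path(t)) for t in ts]


ts = [10.0 ** -k for k in range(1, 6)]  # t = 0.1, 0.01, ..., 0.00001

# Example 8.14: xy / (x^2 + y^2) along the x-axis and along the line y = x.
f = lambda x, y: x * y / (x ** 2 + y ** 2)
print(along_path(f, lambda t: (t, 0.0), ts))  # stays at 0
print(along_path(f, lambda t: (t, t), ts))    # stays near 1/2

# Example 8.17: y / (y + x^2) along a line and along the conic (t, a t^2).
g = lambda x, y: y / (y + x ** 2)
a = 3.0
print(along_path(g, lambda t: (t, t), ts))          # tends to 1
print(along_path(g, lambda t: (t, a * t * t), ts))  # stays near a / (1 + a)
```

Two paths giving different values show that no limit exists; agreement along finitely many paths would not by itself show that a limit does exist.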


9. The derivative

The derivative of a function represents the best linear approximation of that function. In one variable, we are looking for the equation of a straight line. We know a point on the line so that we only need to determine the slope.

Definition 9.1. Let $f\colon \mathbb{R} \longrightarrow \mathbb{R}$ be a function and let $a \in \mathbb{R}$ be a real number. $f$ is differentiable at $a$, with derivative $\lambda \in \mathbb{R}$, if
$$\lim_{x \to a} \frac{f(x) - f(a)}{x - a} = \lambda.$$

To understand the definition of the derivative of a multi-variable function, it is slightly better to recast (9.1):

Definition 9.2. Let $f\colon \mathbb{R} \longrightarrow \mathbb{R}$ be a function and let $a \in \mathbb{R}$ be a real number. $f$ is differentiable at $a$, with derivative $\lambda \in \mathbb{R}$, if
$$\lim_{x \to a} \frac{f(x) - f(a) - \lambda(x - a)}{x - a} = 0.$$

We are now ready to give the definition of the derivative of a function of more than one variable:

Definition 9.3. Let $f\colon \mathbb{R}^n \longrightarrow \mathbb{R}^m$ be a function and let $P \in \mathbb{R}^n$ be a point. $f$ is differentiable at $P$, with derivative the $m \times n$ matrix $A$, if
$$\lim_{Q \to P} \frac{f(Q) - f(P) - A\overrightarrow{PQ}}{\|\overrightarrow{PQ}\|} = 0.$$
We will write $Df(P) = A$.

So how do we compute the derivative? We want to find the matrix $A$. Suppose that
$$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}.$$
Then
$$A\hat{e}_1 = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} a \\ c \end{pmatrix} \qquad\text{and}\qquad A\hat{e}_2 = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} b \\ d \end{pmatrix}.$$
In general, given an $m \times n$ matrix $A$, we get the $j$th column of $A$, simply by multiplying $A$ by the column vector determined by $\hat{e}_j$.

So we want to know what happens if we approach $P$ along the line determined by $\hat{e}_j$. So we take $\overrightarrow{PQ} = h\hat{e}_j$, where $h$ goes to zero. In other words, we take $Q = P + h\hat{e}_j$. Let's assume that $h > 0$. So we consider the fraction
$$\begin{aligned}
\frac{f(Q) - f(P) - A(h\hat{e}_j)}{\|\overrightarrow{PQ}\|} &= \frac{f(Q) - f(P) - A(h\hat{e}_j)}{h} \\
&= \frac{f(Q) - f(P) - hA\hat{e}_j}{h} \\
&= \frac{f(Q) - f(P)}{h} - A\hat{e}_j.
\end{aligned}$$
Taking the limit we get the $j$th column of $A$,
$$A\hat{e}_j = \lim_{h \to 0} \frac{f(P + h\hat{e}_j) - f(P)}{h}.$$
Now $f(P + h\hat{e}_j) - f(P)$ is a column vector, whose entry in the $i$th row is
$$f_i(P + h\hat{e}_j) - f_i(P) = f_i(a_1, a_2, \dots, a_{j-1}, a_j + h, a_{j+1}, \dots, a_n) - f_i(a_1, a_2, \dots, a_{j-1}, a_j, a_{j+1}, \dots, a_n),$$
and so for the expression on the right, in the $i$th row, we have
$$\lim_{h \to 0} \frac{f_i(P + h\hat{e}_j) - f_i(P)}{h}.$$

Definition 9.4. Let $g\colon \mathbb{R}^n \longrightarrow \mathbb{R}$ be a function and let $P \in \mathbb{R}^n$. The partial derivative of $g$ at $P = (a_1, a_2, \dots, a_n)$, with respect to $x_j$ is the limit
$$\left.\frac{\partial g}{\partial x_j}\right|_P = \lim_{h \to 0} \frac{g(a_1, a_2, \dots, a_j + h, \dots, a_n) - g(a_1, a_2, \dots, a_n)}{h}.$$

Putting all of this together, we get

Proposition 9.5. Let $f\colon \mathbb{R}^n \longrightarrow \mathbb{R}^m$ be a function. If $f$ is differentiable at $P$, then $Df(P)$ is the matrix whose $(i, j)$ entry is the partial derivative
$$\left.\frac{\partial f_i}{\partial x_j}\right|_P.$$

Example 9.6. Let $f\colon A \longrightarrow \mathbb{R}^2$ be the function $f(x, y, z) = (x^3y + x\sin(xz), \log xyz)$. Here $A \subset \mathbb{R}^3$ is the first octant, the locus where $x$, $y$ and $z$ are all positive. Supposing that $f$ is differentiable at $P$, then the derivative is given by the matrix of partial derivatives,
$$Df(P) = \begin{pmatrix} 3x^2y + \sin(xz) + xz\cos(xz) & x^3 & x^2\cos(xz) \\ \dfrac{1}{x} & \dfrac{1}{y} & \dfrac{1}{z} \end{pmatrix}.$$

Definition 9.7. Let $f\colon \mathbb{R}^n \longrightarrow \mathbb{R}$ be a differentiable function. Then the derivative of $f$ at $P$, $Df(P)$, is a row vector, which is called the gradient of $f$, and is denoted $(\nabla f)|_P$,
$$\left( \left.\frac{\partial f}{\partial x_1}\right|_P, \left.\frac{\partial f}{\partial x_2}\right|_P, \dots, \left.\frac{\partial f}{\partial x_n}\right|_P \right).$$

The point $(x_1, x_2, \dots, x_n, x_{n+1})$ lies on the graph of $f\colon \mathbb{R}^n \longrightarrow \mathbb{R}$ if and only if $x_{n+1} = f(x_1, x_2, \dots, x_n)$. The point $(x_1, x_2, \dots, x_n, x_{n+1})$ lies on the tangent hyperplane of $f\colon \mathbb{R}^n \longrightarrow \mathbb{R}$ at $P = (a_1, a_2, \dots, a_n)$ if and only if
$$x_{n+1} = f(a_1, a_2, \dots, a_n) + (\nabla f)|_P \cdot (x_1 - a_1, x_2 - a_2, \dots, x_n - a_n).$$
In other words, the vector
$$\left( \left.\frac{\partial f}{\partial x_1}\right|_P, \left.\frac{\partial f}{\partial x_2}\right|_P, \dots, \left.\frac{\partial f}{\partial x_n}\right|_P, -1 \right)$$
is a normal vector to the tangent hyperplane and of course the point $(a_1, a_2, \dots, a_n, f(a_1, a_2, \dots, a_n))$ is on the tangent hyperplane.

Example 9.8. Let
$$D = \{\, (x, y) \in \mathbb{R}^2 \mid x^2 + y^2 < r^2 \,\},$$
the open ball of radius $r$, centred at the origin. Let $f\colon D \longrightarrow \mathbb{R}$ be the function given by
$$f(x, y) = \sqrt{r^2 - x^2 - y^2}.$$
Then
$$\frac{\partial f}{\partial x} = \frac{-2x/2}{\sqrt{r^2 - x^2 - y^2}} = \frac{-x}{\sqrt{r^2 - x^2 - y^2}},$$
and so by symmetry,
$$\frac{\partial f}{\partial y} = \frac{-y}{\sqrt{r^2 - x^2 - y^2}}.$$
At the point $(a, b)$, the gradient is
$$(\nabla f)|_{(a,b)} = \frac{-1}{\sqrt{r^2 - a^2 - b^2}}\,(a, b).$$
So the equation for the tangent plane is
$$z = f(a, b) - \frac{1}{\sqrt{r^2 - a^2 - b^2}}\,\bigl(a(x - a) + b(y - b)\bigr).$$
For example, if $(a, b) = (0, 0)$, then the tangent plane is $z = r$, as expected.
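Proposition 9.5 says the derivative is the matrix of partial derivatives, and each partial derivative is a limit of difference quotients. The sketch below checks the matrix of Example 9.6 against a numerical approximation. It is an illustration under assumptions: it uses a symmetric difference quotient with a fixed small $h$ (a swapped-in variant of the one-sided quotient in Definition 9.4, chosen for accuracy), and the function names are hypothetical.

```python
import math


def f(x, y, z):
    """The function of Example 9.6, defined on the first octant."""
    return (x ** 3 * y + x * math.sin(x * z), math.log(x * y * z))


def jacobian_exact(x, y, z):
    """The matrix of partial derivatives written down in Example 9.6."""
    return [[3 * x ** 2 * y + math.sin(x * z) + x * z * math.cos(x * z),
             x ** 3,
             x ** 2 * math.cos(x * z)],
            [1 / x, 1 / y, 1 / z]]


def jacobian_numeric(func, P, h=1e-6):
    """Approximate Df(P) column by column with symmetric difference quotients."""
    m = len(func(*P))
    J = [[0.0] * len(P) for _ in range(m)]
    for j in range(len(P)):
        plus, minus = list(P), list(P)
        plus[j] += h    # P + h e_j
        minus[j] -= h   # P - h e_j
        fp, fm = func(*plus), func(*minus)
        for i in range(m):
            J[i][j] = (fp[i] - fm[i]) / (2 * h)
    return J


P = (1.0, 2.0, 0.5)
for exact_row, approx_row in zip(jacobian_exact(*P), jacobian_numeric(f, P)):
    print(exact_row, approx_row)
```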


10. More about derivatives

The main result is:

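Theorem 10.1. Let $A \subset \mathbb{R}^n$ be an open subset and let $f\colon A \longrightarrow \mathbb{R}^m$ be a function. If the partial derivatives
$$\frac{\partial f_i}{\partial x_j}$$
exist and are continuous, then $f$ is differentiable.

We will need:

Theorem 10.2 (Mean value theorem). If $f\colon [a, b] \longrightarrow \mathbb{R}$ is continuous and differentiable at every point of $(a, b)$, then we may find $c \in (a, b)$ such that
$$f(b) - f(a) = f'(c)(b - a).$$

Geometrically, (10.2) is clear. However it is surprisingly hard to give a complete proof.

Proof of (10.1). We may assume that $m = 1$. We only prove this in the case when $n = 2$ (the general case is similar, only notationally more involved). So we have $f\colon \mathbb{R}^2 \longrightarrow \mathbb{R}$. Suppose that $P = (a, b)$ and let $\overrightarrow{PQ} = h_1\hat{\imath} + h_2\hat{\jmath}$. Let
$$P_0 = (a, b), \qquad P_1 = (a + h_1, b) \qquad\text{and}\qquad P_2 = (a + h_1, b + h_2) = Q.$$
Now
$$f(Q) - f(P) = [f(P_2) - f(P_1)] + [f(P_1) - f(P_0)].$$
We apply the Mean value theorem twice. We may find $Q_1$ and $Q_2$ such that
$$f(P_1) - f(P_0) = \frac{\partial f}{\partial x}(Q_1)h_1 \qquad\text{and}\qquad f(P_2) - f(P_1) = \frac{\partial f}{\partial y}(Q_2)h_2.$$
Here $Q_1$ lies somewhere on the line segment $P_0P_1$ and $Q_2$ lies on the line segment $P_1P_2$. Putting this together, we get
$$f(Q) - f(P) = \frac{\partial f}{\partial x}(Q_1)h_1 + \frac{\partial f}{\partial y}(Q_2)h_2.$$
Thus, where $A = \left(\frac{\partial f}{\partial x}(P), \frac{\partial f}{\partial y}(P)\right)$ is the row vector of partial derivatives at $P$,
$$\begin{aligned}
\frac{|f(Q) - f(P) - A \cdot \overrightarrow{PQ}|}{\|\overrightarrow{PQ}\|}
&= \frac{\left|\left(\frac{\partial f}{\partial x}(Q_1) - \frac{\partial f}{\partial x}(P)\right)h_1 + \left(\frac{\partial f}{\partial y}(Q_2) - \frac{\partial f}{\partial y}(P)\right)h_2\right|}{\|\overrightarrow{PQ}\|} \\
&\leq \frac{\left|\left(\frac{\partial f}{\partial x}(Q_1) - \frac{\partial f}{\partial x}(P)\right)h_1\right|}{\|\overrightarrow{PQ}\|} + \frac{\left|\left(\frac{\partial f}{\partial y}(Q_2) - \frac{\partial f}{\partial y}(P)\right)h_2\right|}{\|\overrightarrow{PQ}\|} \\
&\leq \frac{\left|\left(\frac{\partial f}{\partial x}(Q_1) - \frac{\partial f}{\partial x}(P)\right)h_1\right|}{|h_1|} + \frac{\left|\left(\frac{\partial f}{\partial y}(Q_2) - \frac{\partial f}{\partial y}(P)\right)h_2\right|}{|h_2|} \\
&= \left|\frac{\partial f}{\partial x}(Q_1) - \frac{\partial f}{\partial x}(P)\right| + \left|\frac{\partial f}{\partial y}(Q_2) - \frac{\partial f}{\partial y}(P)\right|.
\end{aligned}$$
Note that as $Q$ approaches $P$, $Q_1$ and $Q_2$ both approach $P$ as well. As the partials of $f$ are continuous, we have
$$\lim_{Q \to P} \frac{|f(Q) - f(P) - A \cdot \overrightarrow{PQ}|}{\|\overrightarrow{PQ}\|} \leq \lim_{Q \to P} \left( \left|\frac{\partial f}{\partial x}(Q_1) - \frac{\partial f}{\partial x}(P)\right| + \left|\frac{\partial f}{\partial y}(Q_2) - \frac{\partial f}{\partial y}(P)\right| \right) = 0.$$
Therefore $f$ is differentiable at $P$, with derivative $A$.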



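Example 10.3. Let $f\colon A \longrightarrow \mathbb{R}$ be given by
$$f(x, y) = \frac{x}{\sqrt{x^2 + y^2}},$$
where $A = \mathbb{R}^2 - \{(0, 0)\}$. Then
$$\frac{\partial f}{\partial x} = \frac{(x^2 + y^2)^{1/2} - x(2x)(1/2)(x^2 + y^2)^{-1/2}}{x^2 + y^2} = \frac{y^2}{(x^2 + y^2)^{3/2}}.$$
Similarly
$$\frac{\partial f}{\partial y} = -\frac{xy}{(x^2 + y^2)^{3/2}}.$$
Now both partial derivatives exist and are continuous, and so $f$ is differentiable, with derivative the gradient,
$$\nabla f = \left( \frac{y^2}{(x^2 + y^2)^{3/2}}, -\frac{xy}{(x^2 + y^2)^{3/2}} \right) = \frac{1}{(x^2 + y^2)^{3/2}}\,(y^2, -xy).$$

Lemma 10.4. Let $A = (a_{ij})$ be an $m \times n$ matrix. If $\vec{v} \in \mathbb{R}^n$ then
$$\|A\vec{v}\| \leq K\|\vec{v}\|, \qquad\text{where}\qquad K = \Bigl( \sum_{i,j} a_{ij}^2 \Bigr)^{1/2}.$$

Proof. Let $\vec{a}_1, \vec{a}_2, \dots, \vec{a}_m$ be the rows of $A$. Then the entry in the $i$th row of $A\vec{v}$ is $\vec{a}_i \cdot \vec{v}$. So, by the Cauchy-Schwarz inequality,
$$\begin{aligned}
\|A\vec{v}\|^2 &= (\vec{a}_1 \cdot \vec{v})^2 + (\vec{a}_2 \cdot \vec{v})^2 + \dots + (\vec{a}_m \cdot \vec{v})^2 \\
&\leq \|\vec{a}_1\|^2\|\vec{v}\|^2 + \|\vec{a}_2\|^2\|\vec{v}\|^2 + \dots + \|\vec{a}_m\|^2\|\vec{v}\|^2 \\
&= (\|\vec{a}_1\|^2 + \|\vec{a}_2\|^2 + \dots + \|\vec{a}_m\|^2)\|\vec{v}\|^2 \\
&= K^2\|\vec{v}\|^2.
\end{aligned}$$
Now take square roots of both sides.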



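Theorem 10.5. Let $f\colon A \longrightarrow \mathbb{R}^m$ be a function, where $A \subset \mathbb{R}^n$ is open. If $f$ is differentiable at $P$, then $f$ is continuous at $P$.

Proof. Suppose that $Df(P) = A$. Then
$$\lim_{Q \to P} \frac{f(Q) - f(P) - A \cdot \overrightarrow{PQ}}{\|\overrightarrow{PQ}\|} = 0.$$
This is the same as requiring
$$\lim_{Q \to P} \frac{\|f(Q) - f(P) - A \cdot \overrightarrow{PQ}\|}{\|\overrightarrow{PQ}\|} = 0.$$
But if this happens, then surely
$$\lim_{Q \to P} \|f(Q) - f(P) - A \cdot \overrightarrow{PQ}\| = 0.$$
So
$$\begin{aligned}
\|f(Q) - f(P)\| &= \|f(Q) - f(P) - A \cdot \overrightarrow{PQ} + A \cdot \overrightarrow{PQ}\| \\
&\leq \|f(Q) - f(P) - A \cdot \overrightarrow{PQ}\| + \|A \cdot \overrightarrow{PQ}\| \\
&\leq \|f(Q) - f(P) - A \cdot \overrightarrow{PQ}\| + K\|\overrightarrow{PQ}\|.
\end{aligned}$$
Taking the limit as $Q$ approaches $P$, both terms on the RHS go to zero, so that
$$\lim_{Q \to P} \|f(Q) - f(P)\| = 0,$$
and $f$ is continuous at $P$.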




11. Higher derivatives

We first record a very useful:

Theorem 11.1. Let $A \subset \mathbb{R}^n$ be an open subset. Let $f\colon A \longrightarrow \mathbb{R}^m$ and $g\colon A \longrightarrow \mathbb{R}^m$ be two functions and suppose that $P \in A$. Let $\lambda \in \mathbb{R}$ be a scalar. If $f$ and $g$ are differentiable at $P$, then
(1) $f + g$ is differentiable at $P$ and $D(f + g)(P) = Df(P) + Dg(P)$.
(2) $\lambda \cdot f$ is differentiable at $P$ and $D(\lambda f)(P) = \lambda D(f)(P)$.
Now suppose that $m = 1$.
(3) $fg$ is differentiable at $P$ and $D(fg)(P) = D(f)(P)g(P) + f(P)D(g)(P)$.
(4) If $g(P) \neq 0$, then $f/g$ is differentiable at $P$ and
$$D(f/g)(P) = \frac{D(f)(P)g(P) - f(P)D(g)(P)}{g^2(P)}.$$

If the partial derivatives of $f$ and $g$ exist and are continuous, then (11.1) follows from the well-known single variable case. One can prove the general case of (11.1), by hand (basically lots of $\epsilon$'s and $\delta$'s). However, perhaps the best way to prove (11.1) is to use the chain rule, proved in the next section.

What about higher derivatives?

Definition 11.2. Let $A \subset \mathbb{R}^n$ be an open set and let $f\colon A \longrightarrow \mathbb{R}$ be a function. The $k$th order partial derivative of $f$, with respect to the variables $x_{i_1}, x_{i_2}, \dots, x_{i_k}$ is the iterated derivative
$$\frac{\partial^k f}{\partial x_{i_k}\partial x_{i_{k-1}} \cdots \partial x_{i_2}\partial x_{i_1}}(P) = \frac{\partial}{\partial x_{i_k}}\left( \frac{\partial}{\partial x_{i_{k-1}}}\left( \cdots \frac{\partial}{\partial x_{i_2}}\left( \frac{\partial f}{\partial x_{i_1}} \right) \cdots \right) \right)(P).$$
We will also use the notation $f_{x_{i_k}x_{i_{k-1}} \dots x_{i_2}x_{i_1}}(P)$.

Example 11.3. Let $f\colon \mathbb{R}^2 \longrightarrow \mathbb{R}$ be the function $f(x, t) = e^{-at}\cos x$. Then
$$f_{xx}(x, t) = \frac{\partial}{\partial x}\left( \frac{\partial}{\partial x}(e^{-at}\cos x) \right) = \frac{\partial}{\partial x}(-e^{-at}\sin x) = -e^{-at}\cos x.$$
On the other hand,
$$f_{xt}(x, t) = \frac{\partial}{\partial x}\left( \frac{\partial}{\partial t}(e^{-at}\cos x) \right) = \frac{\partial}{\partial x}(-ae^{-at}\cos x) = ae^{-at}\sin x.$$
Similarly,
$$f_{tx}(x, t) = \frac{\partial}{\partial t}\left( \frac{\partial}{\partial x}(e^{-at}\cos x) \right) = \frac{\partial}{\partial t}(-e^{-at}\sin x) = ae^{-at}\sin x.$$
Note that
$$f_t(x, t) = -ae^{-at}\cos x.$$
It follows that $f(x, t)$ is a solution to the Heat equation:
$$a\,\frac{\partial^2 f}{\partial x^2} = \frac{\partial f}{\partial t}.$$

Definition 11.4. Let $A \subset \mathbb{R}^n$ be an open subset and let $f\colon A \longrightarrow \mathbb{R}^m$ be a function. We say that $f$ is of class $C^k$ if all $k$th partial derivatives exist and are continuous. We say that $f$ is of class $C^\infty$ (aka smooth) if $f$ is of class $C^k$ for all $k$.

In lecture 10 we saw that if $f$ is $C^1$, then it is differentiable.

Theorem 11.5. Let $A \subset \mathbb{R}^n$ be an open subset and let $f\colon A \longrightarrow \mathbb{R}^m$ be a function. If $f$ is $C^2$, then
$$\frac{\partial^2 f}{\partial x_i \partial x_j} = \frac{\partial^2 f}{\partial x_j \partial x_i},$$
for all $1 \leq i, j \leq n$.

The proof uses the Mean Value Theorem.

Suppose we are given $A \subset \mathbb{R}$ an open subset and a function $f\colon A \longrightarrow \mathbb{R}$ of class $C^1$. The objective is to find a solution to the equation $f(x) = 0$. Newton's method proceeds as follows. Start with some $x_0 \in A$. The best linear approximation to $f(x)$ in a neighbourhood of $x_0$ is given by
$$f(x_0) + f'(x_0)(x - x_0).$$
If $f'(x_0) \neq 0$, then the linear equation
$$f(x_0) + f'(x_0)(x - x_0) = 0$$
has the unique solution,
$$x_1 = x_0 - \frac{f(x_0)}{f'(x_0)}.$$
Now just keep going (assuming that $f'(x_i)$ is never zero),
$$\begin{aligned}
x_1 &= x_0 - \frac{f(x_0)}{f'(x_0)} \\
x_2 &= x_1 - \frac{f(x_1)}{f'(x_1)} \\
&\;\;\vdots \\
x_n &= x_{n-1} - \frac{f(x_{n-1})}{f'(x_{n-1})}.
\end{aligned}$$

Claim 11.6. Suppose that $x_\infty = \lim_{n \to \infty} x_n$ exists and $f'(x_\infty) \neq 0$. Then $f(x_\infty) = 0$.

Proof of (11.6). Indeed, we have
$$x_n = x_{n-1} - \frac{f(x_{n-1})}{f'(x_{n-1})}.$$
Take the limit as $n$ goes to $\infty$ of both sides:
$$x_\infty = x_\infty - \frac{f(x_\infty)}{f'(x_\infty)},$$
where we used the fact that $f$ and $f'$ are continuous and $f'(x_\infty) \neq 0$. But then $f(x_\infty) = 0$, as claimed.
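The one-variable iteration is short enough to write out directly. The following is a minimal sketch, assuming the caller supplies both $f$ and $f'$ and that a fixed number of steps is acceptable. The function name `newton`, the step count and the test equation $x^2 - 2 = 0$ are illustrative choices, not taken from the notes.

```python
def newton(f, fprime, x0, steps=20):
    """Iterate x_n = x_{n-1} - f(x_{n-1}) / f'(x_{n-1}) a fixed number of times."""
    x = x0
    for _ in range(steps):
        slope = fprime(x)
        if slope == 0:
            # The linear approximation has no unique zero; the method breaks down.
            raise ZeroDivisionError("f'(x) vanished during the iteration")
        x = x - f(x) / slope
    return x


# Solve x^2 - 2 = 0 starting from x0 = 1; the positive root is sqrt(2).
root = newton(lambda x: x * x - 2, lambda x: 2 * x, x0=1.0)
print(root, root * root)
```

As Claim 11.6 indicates, the method only promises a zero when the iterates converge; a poor starting point can make them wander or hit a point where $f'$ vanishes.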



Suppose that $A \subset \mathbb{R}^n$ is open and $f\colon A \longrightarrow \mathbb{R}^n$ is a function. Suppose that $f$ is $C^1$ (that is, suppose each of the coordinate functions $f_1, f_2, \dots, f_n$ is $C^1$). The objective is to find a solution to the equation
$$f(P) = \vec{0}.$$
Start with any point $P_0 \in A$. The best linear approximation to $f$ at $P_0$ is given by
$$f(P_0) + Df(P_0)\overrightarrow{P_0P}.$$
Assume that $Df(P_0)$ is an invertible matrix, that is, assume that $\det Df(P_0) \neq 0$. Then the inverse matrix $Df(P_0)^{-1}$ exists and the unique solution to the linear equation
$$f(P_0) + Df(P_0)\overrightarrow{P_0P} = \vec{0}$$
is given by
$$P_1 = P_0 - Df(P_0)^{-1}f(P_0).$$
Notice that matrix multiplication is not commutative, so that there is a difference between $Df(P_0)^{-1}f(P_0)$ and $f(P_0)Df(P_0)^{-1}$. If possible, we get a sequence of solutions,
$$\begin{aligned}
P_1 &= P_0 - Df(P_0)^{-1}f(P_0) \\
P_2 &= P_1 - Df(P_1)^{-1}f(P_1) \\
&\;\;\vdots \\
P_n &= P_{n-1} - Df(P_{n-1})^{-1}f(P_{n-1}).
\end{aligned}$$
Suppose that the limit $P_\infty = \lim_{n \to \infty} P_n$ exists and that $Df(P_\infty)$ is invertible. As before, if we take the limit of both sides, this implies that
$$f(P_\infty) = \vec{0}.$$
Let us try a concrete example.

Example 11.7. Solve
$$x^2 + y^2 = 1, \qquad y^2 = x^3.$$
First we write down an appropriate function, $f\colon \mathbb{R}^2 \longrightarrow \mathbb{R}^2$, given by $f(x, y) = (x^2 + y^2 - 1, y^2 - x^3)$. Then we are looking for a point $P$ such that $f(P) = (0, 0)$. Then
$$Df(P) = \begin{pmatrix} 2x & 2y \\ -3x^2 & 2y \end{pmatrix}.$$
The determinant of this matrix is
$$4xy + 6x^2y = 2xy(2 + 3x).$$
Now if we are given a $2 \times 2$ matrix
$$\begin{pmatrix} a & b \\ c & d \end{pmatrix},$$
then we may write down the inverse by hand,
$$\frac{1}{ad - bc}\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}.$$
So
$$Df(P)^{-1} = \frac{1}{2xy(2 + 3x)}\begin{pmatrix} 2y & -2y \\ 3x^2 & 2x \end{pmatrix}.$$
So,
$$Df(P)^{-1}f(P) = \frac{1}{2xy(2 + 3x)}\begin{pmatrix} 2y & -2y \\ 3x^2 & 2x \end{pmatrix}\begin{pmatrix} x^2 + y^2 - 1 \\ y^2 - x^3 \end{pmatrix} = \frac{1}{2xy(2 + 3x)}\begin{pmatrix} 2x^2y - 2y + 2x^3y \\ x^4 + 3x^2y^2 - 3x^2 + 2xy^2 \end{pmatrix}.$$

One nice thing about this method is that it is quite easy to implement on a computer. Here is what happens if we start with $(x_0, y_0) = (5, 2)$:

(x0, y0) = (5.00000000000000, 2.00000000000000)
(x1, y1) = (3.24705882352941, -0.617647058823529)
(x2, y2) = (2.09875150983980, 1.37996311951634)
(x3, y3) = (1.37227480405610, 0.561220968705054)
(x4, y4) = (0.959201654346683, 0.503839504009063)
(x5, y5) = (0.787655203525685, 0.657830227357845)
(x6, y6) = (0.755918792660404, 0.655438554539110),

and if we start with $(x_0, y_0) = (5, 5)$:

(x0, y0) = (5.00000000000000, 5.00000000000000)
(x1, y1) = (3.24705882352941, 1.85294117647059)
(x2, y2) = (2.09875150983980, 0.363541705259258)
(x3, y3) = (1.37227480405610, -0.306989760884339)
(x4, y4) = (0.959201654346683, -0.561589294711320)
(x5, y5) = (0.787655203525685, -0.644964218428458)
(x6, y6) = (0.755918792660404, -0.655519172668858).

One can sketch the two curves and check that these give reasonable solutions. One can also check that $(x_6, y_6)$ lies close to the two given curves, by computing $x_6^2 + y_6^2 - 1$ and $y_6^2 - x_6^3$.
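The iteration of Example 11.7 can be sketched in a few lines of Python. This is only an illustrative sketch, not the program used to produce the tables above; the function names `f`, `newton_step` and `newton` are made up for this example, and the inverse of $Df(P)$ is written out using the $2 \times 2$ formula from the text.

```python
def f(x, y):
    """The map f(x, y) = (x^2 + y^2 - 1, y^2 - x^3) of Example 11.7."""
    return (x * x + y * y - 1.0, y * y - x ** 3)


def newton_step(x, y):
    """One step P -> P - Df(P)^{-1} f(P)."""
    # Entries of Df(P) = [[a, b], [c, d]] = [[2x, 2y], [-3x^2, 2y]].
    a, b = 2.0 * x, 2.0 * y
    c, d = -3.0 * x * x, 2.0 * y
    det = a * d - b * c  # equals 2xy(2 + 3x)
    if det == 0.0:
        raise ZeroDivisionError("Df(P) is not invertible at this point")
    f1, f2 = f(x, y)
    # Df(P)^{-1} f(P) = (1/det) [[d, -b], [-c, a]] (f1, f2)^T.
    dx = (d * f1 - b * f2) / det
    dy = (-c * f1 + a * f2) / det
    return x - dx, y - dy


def newton(x, y, steps):
    """Return the list of iterates P_0, P_1, ..., P_steps."""
    points = [(x, y)]
    for _ in range(steps):
        x, y = newton_step(x, y)
        points.append((x, y))
    return points


if __name__ == "__main__":
    for i, (x, y) in enumerate(newton(5.0, 2.0, 6)):
        print(f"(x{i}, y{i}) = ({x:.15f}, {y:.15f})")
```

Running more steps and evaluating `f` at the last iterate gives a direct check that the point lies close to both curves.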


MIT OpenCourseWare http://ocw.mit.edu

18.022 Calculus of Several Variables Fall 2010

For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.

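12. Chain rule

Theorem 12.1 (Chain Rule). Let $U \subset \mathbb{R}^n$ and let $V \subset \mathbb{R}^m$ be two open subsets. Let $f : U \to V$ and $g : V \to \mathbb{R}^p$ be two functions. If $f$ is differentiable at $P$ and $g$ is differentiable at $Q = f(P)$, then $g \circ f : U \to \mathbb{R}^p$ is differentiable at $P$, with derivative:
\[ D(g \circ f)(P) = (Dg(Q))(Df(P)). \]

It is interesting to untwist this result in specific cases. Suppose we are given $f : \mathbb{R} \to \mathbb{R}^2$ and $g : \mathbb{R}^2 \to \mathbb{R}$. So $f(x) = (f_1(x), f_2(x))$ and $w = g(y, z)$. Then
\[ Df(P) = \begin{pmatrix} \dfrac{df_1}{dx}(P) \\[2mm] \dfrac{df_2}{dx}(P) \end{pmatrix} \qquad \text{and} \qquad Dg(Q) = \left( \frac{\partial g}{\partial y}(Q), \frac{\partial g}{\partial z}(Q) \right). \]
So
\[ \frac{d(g \circ f)}{dx} = D(g \circ f)(P) = Dg(Q)Df(P) = \frac{\partial g}{\partial y}(Q)\frac{df_1}{dx}(P) + \frac{\partial g}{\partial z}(Q)\frac{df_2}{dx}(P). \]

Example 12.2. Suppose that $f(x) = (x^2, x^3)$ and $g(y, z) = yz$. If we apply the chain rule, with $(y, z) = (x^2, x^3)$, we get
\[ D(g \circ f)(x) = z(2x) + y(3x^2) = 5x^4. \]
On the other hand $(g \circ f)(x) = x^5$, and of course
\[ \frac{dx^5}{dx} = 5x^4. \]

Now suppose that $f : \mathbb{R}^2 \to \mathbb{R}^2$ and $g : \mathbb{R}^2 \to \mathbb{R}$. So $f(x, y) = (f_1(x, y), f_2(x, y))$ and $w = g(u, v)$. Then
\[ Df(P) = \begin{pmatrix} \dfrac{\partial f_1}{\partial x}(P) & \dfrac{\partial f_1}{\partial y}(P) \\[2mm] \dfrac{\partial f_2}{\partial x}(P) & \dfrac{\partial f_2}{\partial y}(P) \end{pmatrix} \qquad \text{and} \qquad Dg(Q) = \left( \frac{\partial g}{\partial u}(Q), \frac{\partial g}{\partial v}(Q) \right). \]
In this case
\begin{align*}
D(g \circ f) &= \left( \frac{\partial (g \circ f)}{\partial x}, \frac{\partial (g \circ f)}{\partial y} \right) \\
&= \left( \frac{\partial g}{\partial u}(Q)\frac{\partial f_1}{\partial x}(P) + \frac{\partial g}{\partial v}(Q)\frac{\partial f_2}{\partial x}(P), \ \frac{\partial g}{\partial u}(Q)\frac{\partial f_1}{\partial y}(P) + \frac{\partial g}{\partial v}(Q)\frac{\partial f_2}{\partial y}(P) \right) \\
&= \left( \frac{\partial g}{\partial u}(Q)\frac{\partial u}{\partial x}(P) + \frac{\partial g}{\partial v}(Q)\frac{\partial v}{\partial x}(P), \ \frac{\partial g}{\partial u}(Q)\frac{\partial u}{\partial y}(P) + \frac{\partial g}{\partial v}(Q)\frac{\partial v}{\partial y}(P) \right) \\
&= \left( \frac{\partial g}{\partial u}\frac{\partial u}{\partial x} + \frac{\partial g}{\partial v}\frac{\partial v}{\partial x}, \ \frac{\partial g}{\partial u}\frac{\partial u}{\partial y} + \frac{\partial g}{\partial v}\frac{\partial v}{\partial y} \right),
\end{align*}
since $u = f_1(x, y)$ and $v = f_2(x, y)$. Notice that in the last line we were a bit sloppy and dropped $P$ and $Q$. If we split this vector equation into its components we get
\begin{align*}
\frac{\partial (g \circ f)}{\partial x} &= \frac{\partial g}{\partial u}(Q)\frac{\partial f_1}{\partial x}(P) + \frac{\partial g}{\partial v}(Q)\frac{\partial f_2}{\partial x}(P) \\
\frac{\partial (g \circ f)}{\partial y} &= \frac{\partial g}{\partial u}(Q)\frac{\partial f_1}{\partial y}(P) + \frac{\partial g}{\partial v}(Q)\frac{\partial f_2}{\partial y}(P).
\end{align*}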

Again, we could replace $f_1$ by $u$ and $f_2$ by $v$ in these equations, and maybe even drop $P$ and $Q$.

Example 12.3. Suppose that $f(x, y) = (\cos(xy), e^{x-y})$ and $g(u, v) = u^2 \sin v$. If we apply the chain rule, we get
\begin{align*}
D(g \circ f)(x, y) &= \left( 2u \sin v \, (-y \sin xy) + u^2 \cos v \, (e^{x-y}), \ -2u \sin v \, (x \sin xy) - u^2 \cos v \, (e^{x-y}) \right) \\
&= \left( 2\cos(xy)\sin(e^{x-y})(-y \sin xy) + \cos^2(xy)\cos(e^{x-y})e^{x-y}, \ \dots \right).
\end{align*}

In general, the $(i, k)$ entry of $D(g \circ f)(P)$, that is
\[ \frac{\partial (g \circ f)_i}{\partial x_k}, \]
is given by the dot product of the $i$th row of $Dg(Q)$ and the $k$th column of $Df(P)$,
\[ \frac{\partial (g \circ f)_i}{\partial x_k} = \sum_{j=1}^{m} \frac{\partial g_i}{\partial y_j}(Q)\frac{\partial f_j}{\partial x_k}(P). \]
If $z = (g \circ f)(P)$, then we get
\[ \frac{\partial z_i}{\partial x_k} = \sum_{j=1}^{m} \frac{\partial z_i}{\partial y_j}(Q)\frac{\partial y_j}{\partial x_k}(P). \]

We can use the chain rule to prove some of the simple rules for derivatives. Suppose that we have $f : \mathbb{R}^n \to \mathbb{R}^m$ and $g : \mathbb{R}^n \to \mathbb{R}^m$. Suppose that $f$ and $g$ are differentiable at $P$. What about $f + g$? Well there is a function $a : \mathbb{R}^{2m} \to \mathbb{R}^m$, which sends $(\vec{u}, \vec{v}) \in \mathbb{R}^m \times \mathbb{R}^m$ to the sum $\vec{u} + \vec{v}$. In coordinates $(u_1, u_2, \dots, u_m, v_1, v_2, \dots, v_m)$,
\[ a(u_1, u_2, \dots, u_m, v_1, v_2, \dots, v_m) = (u_1 + v_1, u_2 + v_2, \dots, u_m + v_m). \]

Now $a$ is differentiable (it is a polynomial, linear even). There is a function $h : \mathbb{R}^n \to \mathbb{R}^{2m}$, which sends $Q$ to $(f(Q), g(Q))$. The composition $a \circ h : \mathbb{R}^n \to \mathbb{R}^m$ is the function we want to differentiate; it sends $P$ to $f(P) + g(P)$. The chain rule says that this function is differentiable at $P$ and
\[ D(f + g)(P) = Df(P) + Dg(P). \]

Now suppose that m = 1. Instead of a, consider the function

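$m : \mathbb{R}^2 \to \mathbb{R}$, given by $m(x, y) = xy$. Then $m$ is differentiable, with derivative
\[ Dm(x, y) = (y, x). \]
So the chain rule says the composition of $h$ and $m$, namely the function which sends $P$ to the product $f(P)g(P)$, is differentiable and the derivative satisfies the usual rule
\[ D(fg)(P) = g(P)Df(P) + f(P)Dg(P). \]

Here is another example of the chain rule. Suppose
\[ x = r\cos\theta, \qquad y = r\sin\theta. \]
Then
\[ \frac{\partial f}{\partial r} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial r} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial r} = \cos\theta\frac{\partial f}{\partial x} + \sin\theta\frac{\partial f}{\partial y}. \]
Similarly,
\[ \frac{\partial f}{\partial \theta} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial \theta} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial \theta} = -r\sin\theta\frac{\partial f}{\partial x} + r\cos\theta\frac{\partial f}{\partial y}. \]
We can rewrite this as
\[ \begin{pmatrix} \dfrac{\partial}{\partial r} \\[2mm] \dfrac{\partial}{\partial \theta} \end{pmatrix} = \begin{pmatrix} \cos\theta & \sin\theta \\ -r\sin\theta & r\cos\theta \end{pmatrix} \begin{pmatrix} \dfrac{\partial}{\partial x} \\[2mm] \dfrac{\partial}{\partial y} \end{pmatrix}. \]
Now the determinant of
\[ \begin{pmatrix} \cos\theta & \sin\theta \\ -r\sin\theta & r\cos\theta \end{pmatrix} \]
is $r(\cos^2\theta + \sin^2\theta) = r$. So if $r \neq 0$, then we can invert the matrix above and we get
\[ \begin{pmatrix} \dfrac{\partial}{\partial x} \\[2mm] \dfrac{\partial}{\partial y} \end{pmatrix} = \frac{1}{r}\begin{pmatrix} r\cos\theta & -\sin\theta \\ r\sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} \dfrac{\partial}{\partial r} \\[2mm] \dfrac{\partial}{\partial \theta} \end{pmatrix}. \]

We now turn to a proof of the chain rule. We will need:

Lemma 12.4. Let $A \subset \mathbb{R}^n$ be an open subset and let $f : A \to \mathbb{R}^m$ be a function. If $f$ is differentiable at $P$, then there is a constant $M \ge 0$ and $\delta > 0$ such that if $\|\overrightarrow{PQ}\| < \delta$, then
\[ \|f(Q) - f(P)\| \le M\|\overrightarrow{PQ}\|. \]

Proof. As $f$ is differentiable at $P$, there is a constant $\delta > 0$ such that if $0 < \|\overrightarrow{PQ}\| < \delta$, then
\[ \frac{\|f(Q) - f(P) - Df(P)\overrightarrow{PQ}\|}{\|\overrightarrow{PQ}\|} < 1. \]
Hence
\[ \|f(Q) - f(P) - Df(P)\overrightarrow{PQ}\| < \|\overrightarrow{PQ}\|. \]
Let $K \ge 0$ be a constant, depending only on the matrix $Df(P)$, such that $\|Df(P)\vec{v}\| \le K\|\vec{v}\|$ for every vector $\vec{v}$ (for example, the square root of the sum of the squares of the entries of $Df(P)$ works, by the Cauchy-Schwarz inequality). But then
\begin{align*}
\|f(Q) - f(P)\| &= \|f(Q) - f(P) - Df(P)\overrightarrow{PQ} + Df(P)\overrightarrow{PQ}\| \\
&\le \|f(Q) - f(P) - Df(P)\overrightarrow{PQ}\| + \|Df(P)\overrightarrow{PQ}\| \\
&\le \|\overrightarrow{PQ}\| + K\|\overrightarrow{PQ}\| \\
&= M\|\overrightarrow{PQ}\|,
\end{align*}
where $M = 1 + K$. The inequality is trivial when $Q = P$.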



Proof of (12.1). Let's fix some notation. We want the derivative at $P$. Let $Q = f(P)$. Let $P'$ be a point in $U$ (which we imagine is close to $P$). Finally, let $Q' = f(P')$ (so if $P'$ is close to $P$, then we expect $Q'$ to be close to $Q$). The trick is to carefully define an auxiliary function $G : V \to \mathbb{R}^p$,
\[ G(Q') = \begin{cases} \dfrac{g(Q') - g(Q) - Dg(Q)(\overrightarrow{QQ'})}{\|\overrightarrow{QQ'}\|} & \text{if } Q' \neq Q \\[3mm] \vec{0} & \text{if } Q' = Q. \end{cases} \]
Then $G$ is continuous at $Q = f(P)$, as $g$ is differentiable at $Q$. Now,
\[ \frac{(g \circ f)(P') - (g \circ f)(P) - Dg(Q)Df(P)(\overrightarrow{PP'})}{\|\overrightarrow{PP'}\|} = Dg(Q)\frac{f(P') - f(P) - Df(P)(\overrightarrow{PP'})}{\|\overrightarrow{PP'}\|} + G(f(P'))\frac{\|f(P') - f(P)\|}{\|\overrightarrow{PP'}\|}. \]
As $P'$ approaches $P$, note that
\[ \frac{f(P') - f(P) - Df(P)(\overrightarrow{PP'})}{\|\overrightarrow{PP'}\|} \]
and $G(f(P'))$ both approach zero (the second because $f$ is continuous at $P$ and $G$ is continuous at $Q$), and by (12.4)
\[ \frac{\|f(P') - f(P)\|}{\|\overrightarrow{PP'}\|} \le M. \]
So then
\[ \frac{(g \circ f)(P') - (g \circ f)(P) - Dg(Q)Df(P)(\overrightarrow{PP'})}{\|\overrightarrow{PP'}\|} \]
approaches zero as well, which is what we want.
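As a numerical sanity check of Theorem 12.1, one can compare the chain rule formula for the functions of Example 12.3 with a finite-difference approximation of the partial derivatives of $g \circ f$. The sketch below is illustrative only; the helper names and the step size `h` are assumptions made for this example.

```python
import math


def f(x, y):
    """f(x, y) = (cos(xy), e^(x - y)) from Example 12.3."""
    return (math.cos(x * y), math.exp(x - y))


def g(u, v):
    """g(u, v) = u^2 sin v from Example 12.3."""
    return u * u * math.sin(v)


def chain_rule_gradient(x, y):
    """D(g o f)(x, y) computed as the product Dg(Q) Df(P)."""
    u, v = f(x, y)
    # Dg(Q) = (dg/du, dg/dv)
    g_u, g_v = 2.0 * u * math.sin(v), u * u * math.cos(v)
    # Entries of Df(P): partial derivatives of u = cos(xy) and v = e^(x - y).
    u_x, u_y = -y * math.sin(x * y), -x * math.sin(x * y)
    v_x, v_y = math.exp(x - y), -math.exp(x - y)
    return (g_u * u_x + g_v * v_x, g_u * u_y + g_v * v_y)


def numerical_gradient(x, y, h=1e-6):
    """Central-difference approximation to the partials of g o f."""
    def comp(s, t):
        return g(*f(s, t))
    d_x = (comp(x + h, y) - comp(x - h, y)) / (2.0 * h)
    d_y = (comp(x, y + h) - comp(x, y - h)) / (2.0 * h)
    return (d_x, d_y)


if __name__ == "__main__":
    print(chain_rule_gradient(0.7, -0.3))
    print(numerical_gradient(0.7, -0.3))
```

The two printed pairs should agree to several decimal places; the finite-difference version is only an approximation, so exact equality is not expected.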


13. Implicit functions

Consider the curve $y^2 = x$ in the plane $\mathbb{R}^2$,
\[ C = \{\, (x, y) \in \mathbb{R}^2 \mid y^2 = x \,\}. \]
This is not the graph of a function, and yet it is quite close to the graph of a function. Given a point on the curve, let's say the point $(4, 2)$, we can find open intervals $U$ containing $4$ and $V$ containing $2$ and a smooth function $f : U \to V$ such that $C \cap (U \times V)$ is the graph of $f$. Indeed, take $U = (0, \infty)$, $V = (0, \infty)$ and $f(x) = \sqrt{x}$. In fact, we can do this for any point on the curve, apart from the origin. If the point is above the $x$-axis, the function above works. If the point we are interested in is below the $x$-axis, replace $V$ by $(-\infty, 0)$ and $f(x) = \sqrt{x}$ by $g(x) = -\sqrt{x}$.

How can we tell that the origin is a point where we cannot define an implicit function? Well, away from the origin the tangent line is not vertical, but at the origin the tangent line is vertical. In other words, if we consider $F : \mathbb{R}^2 \to \mathbb{R}$, given by $F(x, y) = y^2 - x$, so that $C$ is the set of points where $F$ is zero, then
\[ DF(x, y) = (-1, 2y). \]
The locus where we run into trouble is where $2y = 0$. Somewhat amazingly this works in general:

Theorem 13.1 (Implicit Function Theorem). Let $A \subset \mathbb{R}^{n+m}$ be an open subset and let $F : A \to \mathbb{R}^m$ be a $C^1$-function. Suppose that
\[ (\vec{a}, \vec{b}) \in S = \{\, (\vec{x}, \vec{y}) \in A \mid F(\vec{x}, \vec{y}) = \vec{0} \,\}. \]
Assume that
\[ \det\left( \frac{\partial F_i}{\partial y_j} \right)(\vec{a}, \vec{b}) \neq 0. \]
Then we may find open subsets $\vec{a} \in U \subset \mathbb{R}^n$ and $\vec{b} \in V \subset \mathbb{R}^m$, where $U \times V \subset A$, and a function $f : U \to V$ such that $S \cap (U \times V)$ is the graph of $f$, that is,
\[ F(\vec{x}, \vec{y}) = \vec{0} \qquad \text{if and only if} \qquad \vec{y} = f(\vec{x}), \]
where $\vec{x} \in U$ and $\vec{y} \in V$.

Let's look at an example. Let $F : \mathbb{R}^3 \to \mathbb{R}$,

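be the function
\[ F(x_1, x_2, y) = x_1^3 x_2 - x_2 y^2 + y^5 + 1. \]
Let
\[ S = \{\, (x_1, x_2, y) \in \mathbb{R}^3 \mid F(x_1, x_2, y) = 0 \,\}. \]
Then $(1, 3, -1) \in S$. Let's compute the partial derivatives of $F$,
\begin{align*}
\frac{\partial F}{\partial x_1}(1, 3, -1) &= 3x_1^2 x_2 \Big|_{(1,3,-1)} = 9 \\
\frac{\partial F}{\partial x_2}(1, 3, -1) &= (x_1^3 - y^2) \Big|_{(1,3,-1)} = 0 \\
\frac{\partial F}{\partial y}(1, 3, -1) &= (-2x_2 y + 5y^4) \Big|_{(1,3,-1)} = 11.
\end{align*}
So
\[ DF(1, 3, -1) = (9, 0, 11). \]
Now what is important is that the last entry is non-zero (so that the $1 \times 1$ matrix $(11)$ is invertible). It follows that we may find open subsets $(1, 3) \in U \subset \mathbb{R}^2$ and $-1 \in V \subset \mathbb{R}$ and a $C^1$ function $f : U \to V$ such that
\[ F(x_1, x_2, f(x_1, x_2)) = 0. \]
It is not possible to write down an explicit formula for $f$, but we can calculate the partial derivatives of $f$. Define a function $G : U \to \mathbb{R}$ by the rule
\[ G(x_1, x_2) = F(x_1, x_2, f(x_1, x_2)) = 0. \]
On the one hand,
\[ \frac{\partial G}{\partial x_1} = 0 \qquad \text{and} \qquad \frac{\partial G}{\partial x_2} = 0. \]
On the other hand, by the chain rule,
\[ \frac{\partial G}{\partial x_1} = \frac{\partial F}{\partial x_1}\frac{\partial x_1}{\partial x_1} + \frac{\partial F}{\partial x_2}\frac{\partial x_2}{\partial x_1} + \frac{\partial F}{\partial y}\frac{\partial f}{\partial x_1}. \]
Now
\[ \frac{\partial x_1}{\partial x_1} = 1 \qquad \text{and} \qquad \frac{\partial x_2}{\partial x_1} = 0. \]
So
\[ \frac{\partial f}{\partial x_1} = -\frac{\partial F / \partial x_1}{\partial F / \partial y}. \]
Similarly
\[ \frac{\partial f}{\partial x_2} = -\frac{\partial F / \partial x_2}{\partial F / \partial y}. \]
So
\[ \frac{\partial f}{\partial x_1}(1, 3) = -\frac{\frac{\partial F}{\partial x_1}(1, 3, -1)}{\frac{\partial F}{\partial y}(1, 3, -1)} = -\frac{9}{11}, \]
and
\[ \frac{\partial f}{\partial x_2}(1, 3) = -\frac{\frac{\partial F}{\partial x_2}(1, 3, -1)}{\frac{\partial F}{\partial y}(1, 3, -1)} = -\frac{0}{11} = 0. \]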
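Although there is no explicit formula for $f$, its values near $(1, 3)$ can be approximated numerically, which gives an independent check of the two partial derivatives just computed. The following Python sketch is an illustration under stated assumptions: it solves $F(x_1, x_2, y) = 0$ for $y$ by the one-variable Newton's method started at $y = -1$, and the names `implicit_f` and `partials_of_f`, the iteration count and the step size `h` are all choices made for this example.

```python
def F(x1, x2, y):
    """F(x1, x2, y) = x1^3 x2 - x2 y^2 + y^5 + 1."""
    return x1 ** 3 * x2 - x2 * y ** 2 + y ** 5 + 1.0


def dF_dy(x1, x2, y):
    """Partial derivative of F with respect to y."""
    return -2.0 * x2 * y + 5.0 * y ** 4


def implicit_f(x1, x2, y_start=-1.0, steps=50):
    """Approximate f(x1, x2) by Newton's method in y, starting near y = -1."""
    y = y_start
    for _ in range(steps):
        y -= F(x1, x2, y) / dF_dy(x1, x2, y)
    return y


def partials_of_f(x1, x2, h=1e-5):
    """Central-difference approximations to df/dx1 and df/dx2."""
    d1 = (implicit_f(x1 + h, x2) - implicit_f(x1 - h, x2)) / (2.0 * h)
    d2 = (implicit_f(x1, x2 + h) - implicit_f(x1, x2 - h)) / (2.0 * h)
    return d1, d2


if __name__ == "__main__":
    print(partials_of_f(1.0, 3.0))  # compare with (-9/11, 0)
```

This only works close to $(1, 3)$, where $\partial F / \partial y$ stays away from zero; further away the Newton iteration may fail or jump to a different branch of $S$.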

Definition 13.2. Let $A \subset \mathbb{R}^n$ be an open subset and let $f : A \to \mathbb{R}$ be a function. The directional derivative of $f$ in the direction of the unit vector $\hat{u}$ is
\[ D_{\hat{u}} f(P) = \lim_{h \to 0} \frac{f(P + h\hat{u}) - f(P)}{h}. \]
If $\hat{u} = \hat{e}_i$ then
\[ D_{\hat{e}_i} f(P) = \frac{\partial f}{\partial x_i}(P), \]
the usual partial derivative.

Proposition 13.3. If $f$ is differentiable at $P$ then
\[ D_{\hat{u}} f(P) = Df(P) \cdot \hat{u}. \]

Proof. Since $A$ is open, we may find $\delta > 0$ such that the parametrised line $r : (-\delta, \delta) \to A$, given by $r(h) = P + h\hat{u}$, is entirely contained in $A$. Consider the composition of $r$ and $f$, $f \circ r : (-\delta, \delta) \to \mathbb{R}$. Then
\begin{align*}
D_{\hat{u}} f(P) &= \frac{d(f \circ r)}{dh}(0) \\
&= Df(r(0)) \cdot Dr(0) \\
&= Df(P) \cdot \hat{u}.
\end{align*}
Note that we can also write
\[ D_{\hat{u}} f(P) = \nabla f(P) \cdot \hat{u}. \]



Note that the directional derivative is largest when
\[ \hat{u} = \frac{\nabla f(P)}{\|\nabla f(P)\|}, \]
so that the gradient always points in the direction of maximal change (and in fact the magnitude of the gradient gives the maximum rate of change). Note also that the directional derivative is zero if $\hat{u}$ is orthogonal to the gradient and that the directional derivative is smallest when
\[ \hat{u} = -\frac{\nabla f(P)}{\|\nabla f(P)\|}. \]

Proposition 13.4. If $\nabla f(P) \neq \vec{0}$ then the tangent hyperplane $\Pi$ to the hypersurface
\[ S = \{\, Q \in \mathbb{R}^n \mid f(Q) - f(P) = 0 \,\} \]
is the set of all points $Q$ which satisfy the equation
\[ \nabla f(P) \cdot \overrightarrow{PQ} = 0. \]

Remark 13.5. If $f$ is $C^1$ and $\nabla f(P) \neq \vec{0}$, then by the implicit function theorem $S$ is the graph of some function, locally about $P$.

Proof. Let $F(Q) = f(Q) - f(P)$, so that $S$ is the zero locus of $F$ and $\nabla F = \nabla f$. By definition, the point $Q$ belongs to the tangent hyperplane if and only if there is a curve $r : (-\delta, \delta) \to S$ such that
\[ r(0) = P \qquad \text{and} \qquad r'(0) = \overrightarrow{PQ}. \]
Now, since $r(h) \in S$ for all $h \in (-\delta, \delta)$, we have $F(r(h)) = 0$. So
\begin{align*}
0 &= \frac{dF(r(h))}{dh}(0) \\
&= \nabla F(r(0)) \cdot r'(0) \\
&= \nabla f(P) \cdot \overrightarrow{PQ}.
\end{align*}


14. Parametrised Curves

Definition 14.1. A parametrised differentiable curve in $\mathbb{R}^n$ is a differentiable function $\vec{r} : I \to \mathbb{R}^n$, where $I$ is an open interval in $\mathbb{R}$.

Remark 14.2. Any open interval $I$ is one of four different forms: $(a, b)$; $(-\infty, b)$; $(a, \infty)$; $(-\infty, \infty) = \mathbb{R}$, where $a$ and $b$ are real numbers.

Definition 14.3. The velocity vector at time $t$ of a parametrised differentiable curve $\vec{r} : I \to \mathbb{R}^n$ is the derivative:
\[ \vec{v}(t) = \vec{r}\,'(t) = D\vec{r}(t). \]
If $\vec{v}$ is differentiable, then the acceleration vector at time $t$ is the derivative of the velocity vector:
\[ \vec{a}(t) = \vec{v}\,'(t) = \vec{r}\,''(t). \]

Example 14.4. Let $\vec{r} : \mathbb{R} \to \mathbb{R}^3$ be given by
\[ \vec{r}(t) = (a\cos t, a\sin t, bt). \]
This traces out a helix. The velocity vector is
\[ \vec{v}(t) = (-a\sin t, a\cos t, b). \]
The acceleration vector is
\[ \vec{a}(t) = (-a\cos t, -a\sin t, 0). \]
The speed, that is the magnitude of the velocity vector,
\[ \|\vec{v}(t)\| = (a^2 + b^2)^{1/2}, \]
is constant. Nevertheless the acceleration vector is not zero, as we are travelling on a curve and not a straight line.

Let's now attack a very famous problem. Kepler formulated three laws of planetary motion, based on extensive observations of the recorded positions of the planets. The first law states that planets move around in ellipses, where the sun is at one focal point of the ellipse; let's see how one can derive this law from Newton's universal law of gravity.

Let's put the sun at the origin $O$ of our coordinates. Let's suppose that the planet is at the point $P = P(t)$ at time $t$. Then $\vec{r} : \mathbb{R} \to \mathbb{R}^3$ is a parametrised differentiable curve, where $\vec{r}(t) = \overrightarrow{OP}$. We will need a simple formula for the vector triple product in $\mathbb{R}^3$:
\[ (\vec{u} \times \vec{v}) \times \vec{w} = (\vec{u} \cdot \vec{w})\vec{v} - (\vec{v} \cdot \vec{w})\vec{u}. \]
One can check this formula using coordinates.

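Theorem 14.5 (Newton). Suppose that
\[ \vec{a} = -\frac{GM}{r^3}\vec{r}, \]
for some constants $G$ and $M$, where $r = \|\vec{r}\|$. Then $\vec{r}$ traces out either an ellipse, a parabola or a hyperbola.

Proof. We have
\begin{align*}
\frac{d(\vec{r} \times \vec{v})}{dt} &= \frac{d\vec{r}}{dt} \times \vec{v} + \vec{r} \times \frac{d\vec{v}}{dt} \\
&= \vec{v} \times \vec{v} + \vec{r} \times \vec{a} \\
&= \vec{0} + \vec{0} = \vec{0},
\end{align*}
since $\vec{a}$ and $\vec{r}$ are parallel by assumption. Hence
\[ \vec{r} \times \vec{v} = \vec{c}, \]
a constant vector. It follows that $\vec{r}$ and $\vec{v}$ lie in the plane $\Pi$ through the origin and orthogonal to $\vec{c}$. We may write $\vec{r} = r\hat{u}$, where $\hat{u}$ is a unit vector. Then
\[ \vec{v} = \frac{d(r\hat{u})}{dt} = \frac{dr}{dt}\hat{u} + r\frac{d\hat{u}}{dt}. \]
It follows that
\begin{align*}
\vec{c} &= \vec{r} \times \vec{v} \\
&= r\hat{u} \times \frac{dr}{dt}\hat{u} + r\hat{u} \times r\frac{d\hat{u}}{dt} \\
&= r^2\, \hat{u} \times \frac{d\hat{u}}{dt}.
\end{align*}
So
\begin{align*}
\vec{a} \times \vec{c} &= \left( -\frac{GM}{r^2}\hat{u} \right) \times \left( r^2\, \hat{u} \times \frac{d\hat{u}}{dt} \right) \\
&= -GM\, \hat{u} \times \left( \hat{u} \times \frac{d\hat{u}}{dt} \right) \\
&= GM \left( \hat{u} \times \frac{d\hat{u}}{dt} \right) \times \hat{u} \\
&= GM \left( (\hat{u} \cdot \hat{u})\frac{d\hat{u}}{dt} - \left( \hat{u} \cdot \frac{d\hat{u}}{dt} \right)\hat{u} \right) \\
&= \frac{d(GM\hat{u})}{dt},
\end{align*}
where the last step uses $\hat{u} \cdot \hat{u} = 1$ and, differentiating this identity, $\hat{u} \cdot \frac{d\hat{u}}{dt} = 0$. On the other hand,
\[ \frac{d(\vec{v} \times \vec{c})}{dt} = \frac{d\vec{v}}{dt} \times \vec{c} = \vec{a} \times \vec{c}. \]
It follows that
\[ \vec{v} \times \vec{c} = GM\hat{u} + \vec{d}, \]
where $\vec{d}$ is a constant vector. If we take the dot product of both sides with $\vec{c}$, then the LHS is zero and $GM\hat{u} \cdot \vec{c}$ is zero, so $\vec{d} \cdot \vec{c} = 0$. It follows that $\vec{d}$ also lies in the plane $\Pi$. Define $\theta$ to be the angle between $\vec{d}$ and $\hat{u}$. Now
\begin{align*}
\|\vec{c}\|^2 &= \vec{c} \cdot \vec{c} \\
&= (\vec{r} \times \vec{v}) \cdot \vec{c} \\
&= (\vec{v} \times \vec{c}) \cdot \vec{r} \\
&= (GM\hat{u} + \vec{d}) \cdot \vec{r} \\
&= GMr + r\|\vec{d}\|\cos\theta.
\end{align*}
Let $c = \|\vec{c}\|$ and $d = \|\vec{d}\|$. Then
\[ r = \frac{c^2}{GM + d\cos\theta} = \frac{p}{1 + e\cos\theta}, \]
where
\[ p = \frac{c^2}{GM} \qquad \text{and} \qquad e = \frac{d}{GM}. \]
Let's express these equations in Cartesian coordinates and not polar coordinates, where the $x$-axis points in the direction of $\vec{d}$. We have
\[ x = r\cos\theta, \qquad y = r\sin\theta. \]
Therefore $p = r + er\cos\theta$, so that $p = r + ex$. Solving for $r$, $r = p - ex$. Squaring both sides we get $r^2 = (p - ex)^2$. That is
\[ x^2 + y^2 = p^2 - 2epx + e^2x^2. \]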

Therefore
\[ (1 - e^2)x^2 + 2pex + y^2 = p^2. \]
There are three cases. The conic
\[ C = \{\, (x, y) \in \mathbb{R}^2 \mid (1 - e^2)x^2 + 2pex + y^2 = p^2 \,\} \]
is an
\[ \begin{cases} \text{ellipse} & \text{if } |e| < 1 \\ \text{parabola} & \text{if } |e| = 1 \\ \text{hyperbola} & \text{if } |e| > 1. \end{cases} \]
Let's suppose that $|e| < 1$. First divide through by $1 - e^2$,
\[ x^2 + \frac{2pe}{1 - e^2}x + \frac{1}{1 - e^2}y^2 = \frac{p^2}{1 - e^2}. \]
If we complete the square, then we get
\[ \left( x + \frac{pe}{1 - e^2} \right)^2 + \frac{1}{1 - e^2}y^2 = \frac{p^2}{(1 - e^2)^2}. \]
Finally divide through by the RHS to get
\[ \left( \frac{x + \frac{pe}{1 - e^2}}{\frac{p}{1 - e^2}} \right)^2 + \left( \frac{y}{\frac{p}{\sqrt{1 - e^2}}} \right)^2 = 1. \]
This is the equation of an ellipse. The centre of the ellipse is at
\[ \left( -\frac{pe}{1 - e^2}, 0 \right). \]
One can check that this means one of the focal points is at the origin.
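The last claim can be checked numerically. For the ellipse above the semi-major axis is $p/(1 - e^2)$ and the semi-minor axis is $p/\sqrt{1 - e^2}$, so the distance from the centre to each focus is $pe/(1 - e^2)$; the foci are then the origin and the point $(-2pe/(1 - e^2), 0)$. The Python sketch below (function names are hypothetical, chosen for this illustration) samples points from the polar equation $r = p/(1 + e\cos\theta)$ and checks both the Cartesian equation and the constant sum of distances to the two foci.

```python
import math


def orbit_point(p, e, theta):
    """Cartesian point on r = p / (1 + e cos(theta))."""
    r = p / (1.0 + e * math.cos(theta))
    return r * math.cos(theta), r * math.sin(theta)


def conic_residual(p, e, x, y):
    """(1 - e^2) x^2 + 2 p e x + y^2 - p^2, which should vanish on the conic."""
    return (1.0 - e * e) * x * x + 2.0 * p * e * x + y * y - p * p


def focal_distance_sum(p, e, x, y):
    """Sum of distances to the foci (0, 0) and (-2pe/(1 - e^2), 0); needs |e| < 1."""
    second_focus_x = -2.0 * p * e / (1.0 - e * e)
    return math.hypot(x, y) + math.hypot(x - second_focus_x, y)


if __name__ == "__main__":
    p, e = 1.5, 0.6
    for k in range(8):
        x, y = orbit_point(p, e, 2.0 * math.pi * k / 8)
        print(conic_residual(p, e, x, y), focal_distance_sum(p, e, x, y))
```

For $|e| < 1$ the second printed value should stay equal to the major axis $2p/(1 - e^2)$, up to rounding, which is the usual focal characterisation of an ellipse.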


15. Arclength

Definition 15.1. Let $I$ be an open interval. A partition $\mathcal{P}$ of $[a, b] \subset I$ is a sequence of points
\[ a = t_0 < t_1 < \dots < t_n = b, \]
for some $n \ge 1$. The mesh of $\mathcal{P}$ is
\[ m(\mathcal{P}) = \max\{\, |t_{i+1} - t_i| \mid 0 \le i \le n - 1 \,\}. \]

Definition 15.2. Let $\vec{r} : I \to \mathbb{R}^n$ be a parametrised differentiable curve, and let $\mathcal{P}$ be a partition. We think of
\[ l(\mathcal{P}) = \sum_{i=0}^{n-1} \|\vec{r}(t_{i+1}) - \vec{r}(t_i)\| \]
as being an approximation to the length of the curve $\vec{r}([a, b])$. The curve $\vec{r}$ has length $L$ if given any $\epsilon > 0$, there is a $\delta > 0$ such that for every partition $\mathcal{P}$ whose mesh size is less than $\delta$, we have $|L - l(\mathcal{P})| < \epsilon$.

There are interesting examples of curves that don't have a length. Start with the interval $[0, 1]$ in the plane. The length is $1$. Now adjust this curve, by adding in a triangular hump in the middle, so that we get four line segments of length $1/3$. This is a curve of length $4/3$. Now add a hump to each of the four line segments of length $1/3$. The length of the resulting curve is $(4/3)^2$. If we keep doing this, then we get more and more complicated curves, whose length at stage $n$ is $(4/3)^n$. This process converges to a very pointed curve whose length is infinite (in fact this curve is a fractal). However:

Proposition 15.3. If $\vec{r} : I \to \mathbb{R}^n$ is a $C^1$-function, then the curve $\vec{r}([a, b])$ has a length
\[ L = \int_a^b \|\vec{r}\,'(t)\| \, dt. \]

Remark 15.4. The fractal curve above is continuous but it is nowhere differentiable (the curve has too many sharp points).

In general the exact formula for the arclength is only of theoretical interest. However there are some contrived examples where we can calculate the arclength precisely.

Example 15.5. Let $\vec{r} : \mathbb{R} \to \mathbb{R}^2$ be the parametrised differentiable curve given by
\[ \vec{r}(t) = a\cos t\,\hat{\imath} + a\sin t\,\hat{\jmath}, \]
where $a > 0$. Then
\[ \vec{r}\,'(t) = -a\sin t\,\hat{\imath} + a\cos t\,\hat{\jmath}, \]
and so $\|\vec{r}\,'(t)\| = a$. Hence the length of the curve $\vec{r}([0, 2\pi])$ is
\[ L = \int_0^{2\pi} a \, dt = 2\pi a, \]
which is indeed the circumference of a circle of radius $a$.

Example 15.6. Let $\vec{r} : \mathbb{R} \to \mathbb{R}^3$ be the parametrised differentiable curve given by
\[ \vec{r}(t) = a\cos t\,\hat{\imath} + a\sin t\,\hat{\jmath} + bt\,\hat{k}. \]
Then
\[ \vec{r}\,'(t) = -a\sin t\,\hat{\imath} + a\cos t\,\hat{\jmath} + b\,\hat{k}, \]
and so
\[ \|\vec{r}\,'(t)\| = \sqrt{a^2 + b^2}. \]
Hence the length of the curve $\vec{r}([0, 2\pi])$ is
\[ L = \int_0^{2\pi} (a^2 + b^2)^{1/2} \, dt = 2\pi(a^2 + b^2)^{1/2}. \]

Example 15.7. Let $\vec{r} : \mathbb{R} \to \mathbb{R}^2$ be the parametrised differentiable curve given by
\[ \vec{r}(t) = a\cos t\,\hat{\imath} + b\sin t\,\hat{\jmath}. \]
Then
\[ \vec{r}\,'(t) = -a\sin t\,\hat{\imath} + b\cos t\,\hat{\jmath}, \]
and so
\[ \|\vec{r}\,'(t)\| = (a^2\sin^2 t + b^2\cos^2 t)^{1/2}. \]
Hence the length of the curve $\vec{r}([0, 2\pi])$ is
\[ L = \int_0^{2\pi} (a^2\sin^2 t + b^2\cos^2 t)^{1/2} \, dt, \]
the length of an ellipse, with semi-axes of length $a$ and $b$.
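The polygonal sums $l(\mathcal{P})$ of Definition 15.2 are easy to compute, and for the circle and the helix they can be compared with the exact lengths found above. Here is a minimal Python sketch, assuming a uniform partition of $[a, b]$ into $n$ pieces; the function name `polygonal_length` and the choice of partition are assumptions of this illustration. For the ellipse of Example 15.7, where the integral has no elementary closed form, the same function gives a numerical approximation.

```python
import math


def polygonal_length(r, a, b, n):
    """Sum of chord lengths |r(t_{i+1}) - r(t_i)| over a uniform partition."""
    total = 0.0
    previous = r(a)
    for i in range(1, n + 1):
        t = a + (b - a) * i / n
        current = r(t)
        total += math.dist(previous, current)  # Euclidean distance
        previous = current
    return total


def circle(t, radius=2.0):
    return (radius * math.cos(t), radius * math.sin(t))


def helix(t, a=2.0, b=1.0):
    return (a * math.cos(t), a * math.sin(t), b * t)


def ellipse(t, a=3.0, b=1.0):
    return (a * math.cos(t), b * math.sin(t))


if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        print(n,
              polygonal_length(circle, 0.0, 2.0 * math.pi, n),
              polygonal_length(helix, 0.0, 2.0 * math.pi, n),
              polygonal_length(ellipse, 0.0, 2.0 * math.pi, n))
```

As the mesh shrinks, the first two columns should approach $2\pi a$ and $2\pi\sqrt{a^2 + b^2}$ for the chosen parameters.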

Definition 15.8. Let $\vec{r} : I \to \mathbb{R}^n$ be a parametrised differentiable curve, which is of class $C^1$. Suppose that $\vec{r}\,'(t)$ is nowhere zero. Given $a \in I$, define the arclength parameter $s(t)$ by the formula
\[ s(t) = \int_a^t \|\vec{r}\,'(\tau)\| \, d\tau = \begin{cases} \text{length of } \vec{r}([a, t]) & \text{if } t \ge a \\ -(\text{length of } \vec{r}([t, a])) & \text{if } t < a. \end{cases} \]
By the fundamental theorem of calculus
\[ s'(t) = \frac{d}{dt}\int_a^t \|\vec{r}\,'(\tau)\| \, d\tau = \|\vec{r}\,'(t)\|, \]
which is the speed at time $t$. Since $s'(t)$ is nowhere zero, we can write $t$ as a function of $s$, that is we can write down the inverse function, $t(s)$, which will be $C^1$.

Example 15.9. For the helix,
\[ \vec{r}(t) = a\cos t\,\hat{\imath} + a\sin t\,\hat{\jmath} + bt\,\hat{k}, \]
we have
\[ s(t) = \int_0^t \|\vec{r}\,'(\tau)\| \, d\tau = (a^2 + b^2)^{1/2}\, t. \]
Therefore
\[ t(s) = \frac{s}{\sqrt{a^2 + b^2}}. \]
In this case
\[ \vec{r}(s) = a\cos\left( \frac{s}{\sqrt{a^2 + b^2}} \right)\hat{\imath} + a\sin\left( \frac{s}{\sqrt{a^2 + b^2}} \right)\hat{\jmath} + \frac{bs}{\sqrt{a^2 + b^2}}\,\hat{k}. \]

In fact, one can always parametrise a curve by its arclength and in this case the derivative is a unit vector:

Definition 15.10. Let $\vec{r} : I \to \mathbb{R}^n$ be a parametrised differentiable curve parametrised by the arclength. Then
\[ \vec{T}(s) = \frac{d\vec{r}}{ds} \]
is the unit tangent vector.

In the case of the helix, we get
\[ \vec{T}(s) = -\frac{a}{\sqrt{a^2 + b^2}}\sin\left( \frac{s}{\sqrt{a^2 + b^2}} \right)\hat{\imath} + \frac{a}{\sqrt{a^2 + b^2}}\cos\left( \frac{s}{\sqrt{a^2 + b^2}} \right)\hat{\jmath} + \frac{b}{\sqrt{a^2 + b^2}}\,\hat{k}. \]

Note that there is an obvious way to get the unit tangent vector. Take the derivative and divide by the speed. In fact
\[ \frac{d\vec{r}(t)}{dt} = \frac{d\vec{r}(s)}{ds}\frac{ds}{dt} = \vec{T}(s)\frac{ds}{dt}, \]
by the chain rule. So
\[ \vec{T}(s) = \frac{\dfrac{d\vec{r}}{dt}(t)}{\dfrac{ds}{dt}}, \]
where the denominator is precisely the speed, so that the unit tangent vector is the unit vector which points in the direction of the velocity.

Example 15.11. Let $\vec{r}(t) = (a\cos t, a\sin t)$ be the standard parametrisation of the circle of radius $a$. Then
\[ s = ta \qquad \text{so that} \qquad t = \frac{s}{a}. \]
So
\[ \vec{r}(s) = \left( a\cos\left( \frac{s}{a} \right), a\sin\left( \frac{s}{a} \right) \right), \]
and
\[ \vec{T}(s) = \left( -\sin\left( \frac{s}{a} \right), \cos\left( \frac{s}{a} \right) \right). \]

Definition 15.12. Let $\vec{r} : I \to \mathbb{R}^n$ be a $C^2$ curve such that $\vec{r}\,'(t) \neq \vec{0}$ for all $t$. The curvature $\kappa(s)$ of $\vec{r}(s)$ is the magnitude of the vector
\[ \frac{d\vec{T}}{ds}(s), \]
and, where this vector is non-zero, the unit normal vector $\vec{N}(s)$ is the unit vector pointing in the direction of
\[ \frac{d\vec{T}}{ds}(s). \]

In the example above,
\[ \frac{d\vec{T}}{ds}(s) = \frac{1}{a}\left( -\cos\left( \frac{s}{a} \right), -\sin\left( \frac{s}{a} \right) \right), \]
which has length $1/a$. So the larger the radius, the smaller the curvature (which is what one might expect). The unit normal vector is
\[ \vec{N}(s) = \left( -\cos\left( \frac{s}{a} \right), -\sin\left( \frac{s}{a} \right) \right) = -\frac{1}{a}\vec{r}(s). \]

One can try to calculate the curvature using the parameter $t$. By the chain rule,
\[ \frac{d\vec{T}}{dt}(t) = \frac{d\vec{T}}{ds}(s)\frac{ds}{dt}. \]
So
\[ \frac{d\vec{T}}{ds}(s) = \frac{\dfrac{d\vec{T}}{dt}(t)}{\dfrac{ds}{dt}}. \]
The denominator is the speed. It follows that
\[ \vec{N}(s) \qquad \text{and} \qquad \frac{d\vec{T}}{dt}(t) \]
point in the same direction, so that the unit normal vector is the unit vector associated to
\[ \frac{d\vec{T}}{dt}(t). \]
On the other hand, the curvature is
\[ \kappa = \frac{\left\| \dfrac{d\vec{T}}{dt}(t) \right\|}{\dfrac{ds}{dt}}. \]

Note that the normal vector and the unit tangent vector are always orthogonal. Indeed, $\|\vec{T}(s)\| = 1$. So
\[ \vec{T}(s) \cdot \vec{T}(s) = 1. \]
Differentiate both sides with respect to $s$, to get
\[ \vec{T}\,'(s) \cdot \vec{T}(s) + \vec{T}(s) \cdot \vec{T}\,'(s) = 0. \]
It follows that $2\vec{T}\,'(s) \cdot \vec{T}(s) = 0$, that is
\[ \vec{T}\,'(s) \cdot \vec{T}(s) = 0. \]
As $\vec{N}(s)$ points in the same direction as $\vec{T}\,'(s)$, it follows that the tangent vector and the normal vector are orthogonal.
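The formula for the curvature in terms of the parameter $t$ lends itself to a quick numerical experiment. The sketch below is an illustration only: it takes the velocity $\vec{r}\,'(t)$ as a Python function, normalises it to get $\vec{T}(t)$, and approximates $d\vec{T}/dt$ by a central difference with an assumed step size `h`. For the circle of radius $a$ it should return approximately $1/a$, as in the example above.

```python
import math


def unit_tangent(velocity, t):
    """T(t) = r'(t) / |r'(t)| for a regular curve with the given velocity."""
    v = velocity(t)
    speed = math.hypot(*v)
    return tuple(component / speed for component in v)


def curvature(velocity, t, h=1e-5):
    """Approximate |dT/dt| / (ds/dt) using a central difference for dT/dt."""
    t_plus = unit_tangent(velocity, t + h)
    t_minus = unit_tangent(velocity, t - h)
    dT_dt = tuple((p - m) / (2.0 * h) for p, m in zip(t_plus, t_minus))
    speed = math.hypot(*velocity(t))
    return math.hypot(*dT_dt) / speed


def circle_velocity(t, a=3.0):
    """Velocity of r(t) = (a cos t, a sin t)."""
    return (-a * math.sin(t), a * math.cos(t))


def ellipse_velocity(t, a=3.0, b=1.0):
    """Velocity of r(t) = (a cos t, b sin t)."""
    return (-a * math.sin(t), b * math.cos(t))


if __name__ == "__main__":
    print(curvature(circle_velocity, 0.4))   # close to 1/3
    print(curvature(ellipse_velocity, 0.0))  # curvature at the end of the long axis
```

Unlike the circle, the ellipse has a curvature that varies with $t$, which is easy to see by evaluating `curvature(ellipse_velocity, t)` at a few values of `t`.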


16. Moving frames

Definition 16.1. We say a parametrised differentiable curve $\vec{r} : I \to \mathbb{R}^n$ is regular if $\vec{r}\,'(t) \neq \vec{0}$ for all $t$ (the speed is never zero). We say that $\vec{r}(t)$ is smooth if $\vec{r}(t)$ is $C^\infty$.

Given a regular smooth parametrised differentiable curve $\vec{r} : I \to \mathbb{R}^3$, we can parametrise by arclength, in which case we can write down the unit tangent vector
\[ \vec{T}(s) = \frac{d\vec{r}}{ds}(s). \]
The curvature $\kappa(s)$ is defined as the magnitude of
\[ \frac{d\vec{T}}{ds}(s). \]
If the curvature is nowhere zero, then we define the normal vector $\vec{N}(s)$ as the unit vector pointing in the direction of the derivative of the tangent vector:
\[ \frac{d\vec{T}}{ds}(s) = \kappa(s)\vec{N}(s). \]
We have already seen that $\vec{T}(s)$ and $\vec{N}(s)$ are orthogonal.

Definition 16.2.
\[ \vec{B}(s) = \vec{T}(s) \times \vec{N}(s) \]
is called the binormal vector.

The three vectors $\vec{T}(s)$, $\vec{N}(s)$, and $\vec{B}(s)$ are unit vectors and pairwise orthogonal, that is, these vectors are an orthonormal basis of $\mathbb{R}^3$. Notice that $\vec{T}(s)$, $\vec{N}(s)$, and $\vec{B}(s)$ are a right handed set. We call these vectors a moving frame or the Frenet-Serret frame. Now
\[ \frac{d\vec{B}}{ds}(s) \cdot \vec{B}(s) = 0, \]
as
\[ \vec{B}(s) \cdot \vec{B}(s) = 1. \]
It follows that
\[ \frac{d\vec{B}}{ds}(s) \]
lies in the plane spanned by $\vec{T}(s)$ and $\vec{N}(s)$. Moreover,
\begin{align*}
\frac{d\vec{B}}{ds}(s) \cdot \vec{T}(s) &= \frac{d(\vec{T} \times \vec{N})}{ds}(s) \cdot \vec{T}(s) \\
&= \left( \frac{d\vec{T}}{ds}(s) \times \vec{N}(s) + \vec{T}(s) \times \frac{d\vec{N}}{ds}(s) \right) \cdot \vec{T}(s) \\
&= \kappa(s)(\vec{N}(s) \times \vec{N}(s)) \cdot \vec{T}(s) + \left( \vec{T}(s) \times \frac{d\vec{N}}{ds}(s) \right) \cdot \vec{T}(s) \\
&= 0 + (\vec{T}(s) \times \vec{T}(s)) \cdot \frac{d\vec{N}}{ds}(s) \\
&= 0.
\end{align*}
It follows that
\[ \frac{d\vec{B}}{ds}(s) \qquad \text{and} \qquad \vec{T}(s) \]
are orthogonal, and so
\[ \frac{d\vec{B}}{ds}(s) \qquad \text{is parallel to} \qquad \vec{N}(s). \]

Definition 16.3. The torsion of the curve $\vec{r}(s)$ is the unique scalar $\tau(s)$ such that
\[ \frac{d\vec{B}}{ds}(s) = -\tau(s)\vec{N}(s). \]

If we have a helix, the sign of the torsion distinguishes between a right handed helix and a left handed helix. The magnitude of the torsion measures how spread out the helix is (the curvature measures how tight the turns are). Now
\[ \frac{d\vec{N}}{ds}(s) \]
is orthogonal to $\vec{N}(s)$, and so it is a linear combination of $\vec{T}(s)$ and $\vec{B}(s)$. In fact,
\begin{align*}
\frac{d\vec{N}}{ds}(s) &= \frac{d(\vec{B} \times \vec{T})}{ds}(s) \\
&= \frac{d\vec{B}}{ds}(s) \times \vec{T}(s) + \vec{B}(s) \times \frac{d\vec{T}}{ds}(s) \\
&= -\tau(s)\vec{N}(s) \times \vec{T}(s) + \kappa(s)\vec{B}(s) \times \vec{N}(s) \\
&= \tau(s)\vec{B}(s) - \kappa(s)\vec{T}(s) \\
&= -\kappa(s)\vec{T}(s) + \tau(s)\vec{B}(s).
\end{align*}

Theorem 16.4 (Frenet Formulae). Let $\vec{r} : I \to \mathbb{R}^3$ be a regular smooth parametrised curve. Then
\[ \begin{pmatrix} \vec{T}\,'(s) \\ \vec{N}\,'(s) \\ \vec{B}\,'(s) \end{pmatrix} = \begin{pmatrix} 0 & \kappa(s) & 0 \\ -\kappa(s) & 0 & \tau(s) \\ 0 & -\tau(s) & 0 \end{pmatrix} \begin{pmatrix} \vec{T}(s) \\ \vec{N}(s) \\ \vec{B}(s) \end{pmatrix}. \]

Of course, $s$ represents the arclength parameter and primes denote derivatives with respect to $s$. Notice that the $3 \times 3$ matrix $A$ appearing in (16.4) is skew-symmetric, that is $A^t = -A$. The way we have written the Frenet formulae, it appears that we have two $3 \times 1$ vectors; strictly speaking these are the rows of two $3 \times 3$ matrices.

Theorem 16.5. Let $I \subset \mathbb{R}$ be an open interval and suppose we are given two smooth functions
\[ \kappa : I \to \mathbb{R} \qquad \text{and} \qquad \tau : I \to \mathbb{R}, \]
where $\kappa(s) > 0$ for all $s \in I$. Then there is a regular smooth curve $\vec{r} : I \to \mathbb{R}^3$ parametrised by arclength with curvature $\kappa(s)$ and torsion $\tau(s)$. Further, any two such curves are congruent, that is, they are the same up to translation and rotation.

Remark 16.6. Uniqueness is one of the homework problems.

Let's consider the example of the helix:

Example 16.7.
\[ \vec{r}(s) = \left( a\cos\frac{s}{c}, a\sin\frac{s}{c}, \frac{bs}{c} \right), \]
where $c^2 = a^2 + b^2$. Let's assume that $a > 0$. By convention $c > 0$. Then
\[ \vec{T}(s) = \frac{1}{c}\left( -a\sin\frac{s}{c}, a\cos\frac{s}{c}, b \right). \]
Hence
\[ \frac{d\vec{T}}{ds}(s) = -\frac{a}{c^2}\left( \cos\frac{s}{c}, \sin\frac{s}{c}, 0 \right) = \frac{a}{c^2}\left( -\cos\frac{s}{c}, -\sin\frac{s}{c}, 0 \right) = \frac{a}{c^2}\vec{N}(s). \]
It follows that
\[ \kappa(s) = \frac{a}{c^2} \qquad \text{and} \qquad \vec{N}(s) = \left( -\cos\frac{s}{c}, -\sin\frac{s}{c}, 0 \right). \]
Finally,
\[ \vec{B}(s) = \begin{vmatrix} \hat{\imath} & \hat{\jmath} & \hat{k} \\ -\frac{a}{c}\sin\frac{s}{c} & \frac{a}{c}\cos\frac{s}{c} & \frac{b}{c} \\ -\cos\frac{s}{c} & -\sin\frac{s}{c} & 0 \end{vmatrix}. \]
It follows that
\[ \vec{B}(s) = \left( \frac{b}{c}\sin\frac{s}{c}, -\frac{b}{c}\cos\frac{s}{c}, \frac{a}{c} \right) = \frac{1}{c}\left( b\sin\frac{s}{c}, -b\cos\frac{s}{c}, a \right). \]
Finally, note that
\[ \frac{d\vec{B}}{ds}(s) = \frac{b}{c^2}\left( \cos\frac{s}{c}, \sin\frac{s}{c}, 0 \right) = -\frac{b}{c^2}\vec{N}(s). \]
Using this we can compute the torsion:
\[ \tau(s) = \frac{b}{c^2}. \]

It is interesting to use the torsion and curvature to characterise various geometric properties of curves. Let's say that a parametrised differentiable curve $\vec{r} : I \to \mathbb{R}^3$ is planar if there is a plane $\Pi$ which contains the image of $\vec{r}$.

Theorem 16.8. A regular smooth curve $\vec{r} : I \to \mathbb{R}^3$ is planar if and only if the torsion is zero.

Proof. We may assume that the curve passes through the origin.

Suppose that $\vec{r}$ is planar. Then the image of $\vec{r}$ is contained in a plane $\Pi$. As the curve passes through the origin, $\Pi$ contains the origin as well. Note that the unit tangent vector $\vec{T}(s)$ and the unit normal vector $\vec{N}(s)$ are contained in $\Pi$. It follows that $\vec{B}(s)$ is a normal vector to the plane; as $\vec{B}(s)$ is a unit vector which varies continuously, it must be constant. But then
\[ \frac{d\vec{B}}{ds}(s) = \vec{0} = 0\vec{N}(s), \]
so that the torsion is zero.

Now suppose that the torsion is zero. Then
\[ \frac{d\vec{B}}{ds}(s) = -0\vec{N}(s) = \vec{0}, \]
so that $\vec{B}(s) = \vec{B}_0$ is a constant vector. Consider the function
\[ f(s) = \vec{r}(s) \cdot \vec{B}(s) = \vec{r}(s) \cdot \vec{B}_0. \]
Then
\[ \frac{df}{ds}(s) = \frac{d(\vec{r} \cdot \vec{B}_0)}{ds}(s) = \vec{T}(s) \cdot \vec{B}_0 = 0. \]
So $f(s)$ is constant. It is zero at the parameter value where $\vec{r}(s) = \vec{0}$ (the curve passes through the origin), so that $f(s) = 0$ for all $s$. But then $\vec{r}(s)$ is always orthogonal to a fixed vector, so that the image of $\vec{r}$ is contained in a plane, that is, $\vec{r}$ is planar.

It is interesting to try to figure out how to characterise curves which are contained in spheres or cylinders.
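The Frenet formulae can be checked numerically on the helix of Example 16.7. The sketch below uses the closed-form expressions for $\vec{T}$, $\vec{N}$ and $\vec{B}$ from that example, together with $\kappa = a/c^2$ and $\tau = b/c^2$, and compares a central-difference approximation of the derivatives of the frame with the right-hand side of (16.4). The function names and the step size `h` are assumptions made for this illustration.

```python
import math


def helix_frame(s, a, b):
    """Closed-form T, N, B of Example 16.7, where c^2 = a^2 + b^2."""
    c = math.hypot(a, b)
    co, si = math.cos(s / c), math.sin(s / c)
    T = (-a * si / c, a * co / c, b / c)
    N = (-co, -si, 0.0)
    B = (b * si / c, -b * co / c, a / c)
    return T, N, B


def cross(u, v):
    """Cross product of two vectors in R^3."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def frame_derivative(s, a, b, h=1e-5):
    """Central-difference approximation to (T', N', B') at s."""
    plus = helix_frame(s + h, a, b)
    minus = helix_frame(s - h, a, b)
    return tuple(
        tuple((p - m) / (2.0 * h) for p, m in zip(vec_plus, vec_minus))
        for vec_plus, vec_minus in zip(plus, minus)
    )


def frenet_rhs(s, a, b):
    """(kappa N, -kappa T + tau B, -tau N) with kappa = a/c^2, tau = b/c^2."""
    c2 = a * a + b * b
    kappa, tau = a / c2, b / c2
    T, N, B = helix_frame(s, a, b)
    return (
        tuple(kappa * n for n in N),
        tuple(-kappa * t + tau * bb for t, bb in zip(T, B)),
        tuple(-tau * n for n in N),
    )


if __name__ == "__main__":
    print(frame_derivative(0.7, 2.0, 1.0))
    print(frenet_rhs(0.7, 2.0, 1.0))
```

The two printed triples should agree up to the finite-difference error, and `cross(T, N)` should reproduce `B`.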


17. Vector fields

Definition 17.1. Let $A \subset \mathbb{R}^n$ be an open subset. A vector field on $A$ is a function $\vec{F} : A \to \mathbb{R}^n$.

One obvious way to get a vector field is to take the gradient of a differentiable function. If $f : A \to \mathbb{R}$, then
\[ \nabla f : A \to \mathbb{R}^n \]
is a vector field.

Definition 17.2. A vector field $\vec{F} : A \to \mathbb{R}^n$ is called a gradient (aka conservative) vector field if $\vec{F} = \nabla f$ for some differentiable function $f : A \to \mathbb{R}$.

Example 17.3. Let
\[ \vec{F} : \mathbb{R}^3 - \{0\} \to \mathbb{R}^3 \]
be the vector field
\[ \vec{F}(x, y, z) = \frac{cx}{(x^2 + y^2 + z^2)^{3/2}}\hat{\imath} + \frac{cy}{(x^2 + y^2 + z^2)^{3/2}}\hat{\jmath} + \frac{cz}{(x^2 + y^2 + z^2)^{3/2}}\hat{k}, \]
for some constant $c$. Then $\vec{F}(x, y, z)$ is the gradient of
\[ f : \mathbb{R}^3 - \{0\} \to \mathbb{R}, \]
given by
\[ f(x, y, z) = -\frac{c}{(x^2 + y^2 + z^2)^{1/2}}. \]
So $\vec{F}$ is a conservative vector field. Notice that if $c < 0$ then $\vec{F}$ models the gravitational force and $f$ is the potential (note that unfortunately mathematicians and physicists have different sign conventions for $f$).

Proposition 17.4. If $\vec{F}$ is a conservative vector field and $\vec{F}$ is a $C^1$ function, then
\[ \frac{\partial F_i}{\partial x_j} = \frac{\partial F_j}{\partial x_i}, \]
for all $i$ and $j$ between $1$ and $n$.

Proof. If $\vec{F}$ is conservative, then we may find a differentiable function $f : A \to \mathbb{R}$ such that
\[ F_i = \frac{\partial f}{\partial x_i}. \]
As $F_i$ is $C^1$ for each $i$, it follows that $f$ is $C^2$. But then, by (11.5),
\begin{align*}
\frac{\partial F_i}{\partial x_j} &= \frac{\partial^2 f}{\partial x_j \partial x_i} \\
&= \frac{\partial^2 f}{\partial x_i \partial x_j} \\
&= \frac{\partial F_j}{\partial x_i}.
\end{align*}

Notice that (17.4) is a negative result; one can use it to show that various vector fields are not conservative.

Example 17.5. Let $\vec{F} : \mathbb{R}^2 \to \mathbb{R}^2$ be given by $\vec{F}(x, y) = (-y, x)$. Then
\[ \frac{\partial F_1}{\partial y} = -1 \qquad \text{and} \qquad \frac{\partial F_2}{\partial x} = 1 \neq -1. \]
So $\vec{F}$ is not conservative.

Example 17.6. Let $\vec{F} : \mathbb{R}^2 \to \mathbb{R}^2$ be given by $\vec{F}(x, y) = (y, x + y)$. Then
\[ \frac{\partial F_1}{\partial y} = 1 \qquad \text{and} \qquad \frac{\partial F_2}{\partial x} = 1, \]
so $\vec{F}$ might be conservative. Let's try to find $f : \mathbb{R}^2 \to \mathbb{R}$ such that $\nabla f(x, y) = (y, x + y)$. If $f$ exists, then we must have
\[ \frac{\partial f}{\partial x} = y \qquad \text{and} \qquad \frac{\partial f}{\partial y} = x + y. \]
If we integrate the first equation with respect to $x$, then we get
\[ f(x, y) = xy + g(y). \]
Note that $g(y)$ is not just a constant but it is a function of $y$. There are two ways to see this. One way is to imagine that for every value of $y$, we have a separate differential equation. If we integrate both sides, we get an arbitrary constant $c$. As we vary $y$, $c$ varies, so that $c = g(y)$ is a function of $y$. On the other hand, if we take the partial derivative of $g(y)$ with respect to $x$, then we get $0$. Now we take $xy + g(y)$ and differentiate with respect to $y$, to get
\[ x + y = \frac{\partial (xy + g(y))}{\partial y} = x + \frac{dg}{dy}(y). \]
So $g'(y) = y$. Integrating both sides with respect to $y$ we get $g(y) = y^2/2 + c$. It follows that
\[ \nabla(xy + y^2/2) = (y, x + y), \]
so that $\vec{F}$ is conservative.

Definition 17.7. If $\vec{F} : A \to \mathbb{R}^n$ is a vector field, we say that a parametrised differentiable curve $\vec{r} : I \to A$ is a flow line for $\vec{F}$, if
\[ \vec{r}\,'(t) = \vec{F}(\vec{r}(t)), \]
for all $t \in I$.

Example 17.8. Let $\vec{F} : \mathbb{R}^2 \to \mathbb{R}^2$ be given by $\vec{F}(x, y) = (-y, x)$. We check that $\vec{r} : \mathbb{R} \to \mathbb{R}^2$ given by $\vec{r}(t) = (a\cos t, a\sin t)$ is a flow line. In fact
\[ \vec{r}\,'(t) = (-a\sin t, a\cos t), \]
and so
\[ \vec{F}(\vec{r}(t)) = \vec{F}(a\cos t, a\sin t) = (-a\sin t, a\cos t) = \vec{r}\,'(t), \]
so that $\vec{r}(t)$ is indeed a flow line.

Example 17.9. Let $\vec{F} : \mathbb{R}^2 \to \mathbb{R}^2$ be given by $\vec{F}(x, y) = (-x, y)$. Let's find a flow line through the point $(a, b)$. We have
\begin{align*}
x'(t) &= -x(t) & x(0) &= a \\
y'(t) &= y(t) & y(0) &= b.
\end{align*}
Therefore,
\[ x(t) = ae^{-t} \qquad \text{and} \qquad y(t) = be^{t} \]
gives the flow line through $(a, b)$.

Example 17.10. Let $\vec{F} : \mathbb{R}^2 \to \mathbb{R}^2$ be given by $\vec{F}(x, y) = (x^2 - y^2, 2xy)$. Try
\[ x(t) = 2a\cos t\sin t, \qquad y(t) = 2a\sin^2 t. \]
Then
\[ x'(t) = 2a(\cos^2 t - \sin^2 t) = \frac{x^2(t) - y^2(t)}{y(t)}. \]
Similarly
\[ y'(t) = 4a\cos t\sin t = \frac{2x(t)y(t)}{y(t)}. \]
So
\[ \vec{r}\,'(t) = \frac{\vec{F}(\vec{r}(t))}{y(t)}. \]
So the curves themselves are flow lines, but this is not the correct parametrisation. The flow lines are circles passing through the origin, with centre along the $y$-axis.

Example 17.11. Let $\vec{F} : \mathbb{R}^2 - \{(0, 0)\} \to \mathbb{R}^2$ be given by
\[ \vec{F}(x, y) = \left( -\frac{y}{x^2 + y^2}, \frac{x}{x^2 + y^2} \right). \]
Then
\[ \frac{\partial F_1}{\partial y}(x, y) = -\frac{x^2 + y^2 - 2y^2}{(x^2 + y^2)^2} = \frac{y^2 - x^2}{(x^2 + y^2)^2}, \]
and
\[ \frac{\partial F_2}{\partial x}(x, y) = \frac{x^2 + y^2 - 2x^2}{(x^2 + y^2)^2} = \frac{y^2 - x^2}{(x^2 + y^2)^2}. \]
So $\vec{F}$ might be conservative. Let's find the flow lines. Try
\[ x(t) = a\cos\left( \frac{t}{a^2} \right), \qquad y(t) = a\sin\left( \frac{t}{a^2} \right). \]
Then
\[ x'(t) = -\frac{1}{a}\sin\left( \frac{t}{a^2} \right) = -\frac{y}{x^2 + y^2}. \]
Similarly
\[ y'(t) = \frac{1}{a}\cos\left( \frac{t}{a^2} \right) = \frac{x}{x^2 + y^2}. \]
So the flow lines are closed curves. In fact this means that $\vec{F}$ is not conservative.
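One way to see the closed flow lines of Example 17.11 without guessing a formula is to integrate the equation $\vec{r}\,'(t) = \vec{F}(\vec{r}(t))$ numerically. The sketch below uses the classical fourth-order Runge-Kutta method, which is a standard numerical scheme and not something taken from these notes; the function names and step counts are assumptions of this illustration. Starting at $(a, 0)$ and integrating for time $2\pi a^2$, the point should return (approximately) to where it started.

```python
import math


def F(x, y):
    """The vector field F(x, y) = (-y/(x^2 + y^2), x/(x^2 + y^2))."""
    r2 = x * x + y * y
    return (-y / r2, x / r2)


def rk4_flow(x, y, total_time, steps):
    """Follow the flow line through (x, y) for the given time using RK4."""
    h = total_time / steps
    for _ in range(steps):
        k1 = F(x, y)
        k2 = F(x + 0.5 * h * k1[0], y + 0.5 * h * k1[1])
        k3 = F(x + 0.5 * h * k2[0], y + 0.5 * h * k2[1])
        k4 = F(x + h * k3[0], y + h * k3[1])
        x += h * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        y += h * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
    return x, y


if __name__ == "__main__":
    a = 2.0
    period = 2.0 * math.pi * a * a
    print(rk4_flow(a, 0.0, period, 2000))        # back near (2, 0)
    print(rk4_flow(a, 0.0, period / 4.0, 2000))  # a quarter turn, near (0, 2)
```

The numerical flow line stays on the circle of radius $a$ and closes up after time $2\pi a^2$, in agreement with the explicit parametrisation above.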


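18. Div grad curl and all that

Theorem 18.1. Let $A \subset \mathbb{R}^n$ be open and let $f : A \to \mathbb{R}$ be a differentiable function. If $\vec{r} : I \to A$ is a flow line for $\nabla f : A \to \mathbb{R}^n$, then the function $f \circ \vec{r} : I \to \mathbb{R}$ is increasing (in the weak sense, that is, non-decreasing).

Proof. By the chain rule,
\begin{align*}
\frac{d(f \circ \vec{r})}{dt}(t) &= \nabla f(\vec{r}(t)) \cdot \vec{r}\,'(t) \\
&= \vec{r}\,'(t) \cdot \vec{r}\,'(t) \ge 0.
\end{align*}

Corollary 18.2. A closed parametrised curve is never the flow line of a conservative vector field.

Once again, note that (18.2) is mainly a negative result:

Example 18.3. $\vec{F} : \mathbb{R}^2 - \{(0, 0)\} \to \mathbb{R}^2$ given by
\[ \vec{F}(x, y) = \left( -\frac{y}{x^2 + y^2}, \frac{x}{x^2 + y^2} \right) \]
is not a conservative vector field as it has flow lines which are circles.

Definition 18.4. The del operator is the formal symbol
\[ \nabla = \frac{\partial}{\partial x}\hat{\imath} + \frac{\partial}{\partial y}\hat{\jmath} + \frac{\partial}{\partial z}\hat{k}. \]

Note that one can formally define the gradient of a function
\[ \operatorname{grad} f : \mathbb{R}^3 \to \mathbb{R}^3, \]
by the formal rule
\[ \operatorname{grad} f = \nabla f = \frac{\partial f}{\partial x}\hat{\imath} + \frac{\partial f}{\partial y}\hat{\jmath} + \frac{\partial f}{\partial z}\hat{k}. \]
Using the operator del we can define two other operations, this time on vector fields:

Definition 18.5. Let $A \subset \mathbb{R}^3$ be an open subset and let $\vec{F} : A \to \mathbb{R}^3$ be a vector field, with components $\vec{F} = (F_1, F_2, F_3)$. The divergence of $\vec{F}$ is the scalar function
\[ \operatorname{div} \vec{F} : A \to \mathbb{R}, \]
which is defined by the rule
\[ \operatorname{div} \vec{F}(x, y, z) = \nabla \cdot \vec{F}(x, y, z) = \frac{\partial F_1}{\partial x} + \frac{\partial F_2}{\partial y} + \frac{\partial F_3}{\partial z}. \]
The curl of $\vec{F}$ is the vector field
\[ \operatorname{curl} \vec{F} : A \to \mathbb{R}^3, \]
which is defined by the rule
\begin{align*}
\operatorname{curl} \vec{F}(x, y, z) &= \nabla \times \vec{F}(x, y, z) \\
&= \begin{vmatrix} \hat{\imath} & \hat{\jmath} & \hat{k} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ F_1 & F_2 & F_3 \end{vmatrix} \\
&= \left( \frac{\partial F_3}{\partial y} - \frac{\partial F_2}{\partial z} \right)\hat{\imath} - \left( \frac{\partial F_3}{\partial x} - \frac{\partial F_1}{\partial z} \right)\hat{\jmath} + \left( \frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y} \right)\hat{k}.
\end{align*}

Note that the del operator makes sense for any $n$, not just $n = 3$. So we can define the gradient and the divergence in all dimensions. However curl only makes sense when $n = 3$.

Definition 18.6. The vector field $\vec{F} : A \to \mathbb{R}^3$ is called rotation free if the curl is zero, $\operatorname{curl} \vec{F} = \vec{0}$, and it is called incompressible if the divergence is zero, $\operatorname{div} \vec{F} = 0$.

Proposition 18.7. Let $f$ be a scalar field and $\vec{F}$ a vector field.
(1) If $f$ is $C^2$, then $\operatorname{curl}(\operatorname{grad} f) = \vec{0}$. Every conservative vector field is rotation free.
(2) If $\vec{F}$ is $C^2$, then $\operatorname{div}(\operatorname{curl} \vec{F}) = 0$. The curl of a vector field is incompressible.

Proof. We compute;
\begin{align*}
\operatorname{curl}(\operatorname{grad} f) &= \nabla \times (\nabla f) \\
&= \begin{vmatrix} \hat{\imath} & \hat{\jmath} & \hat{k} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ \frac{\partial f}{\partial x} & \frac{\partial f}{\partial y} & \frac{\partial f}{\partial z} \end{vmatrix} \\
&= \left( \frac{\partial^2 f}{\partial y \partial z} - \frac{\partial^2 f}{\partial z \partial y} \right)\hat{\imath} - \left( \frac{\partial^2 f}{\partial x \partial z} - \frac{\partial^2 f}{\partial z \partial x} \right)\hat{\jmath} + \left( \frac{\partial^2 f}{\partial x \partial y} - \frac{\partial^2 f}{\partial y \partial x} \right)\hat{k} \\
&= \vec{0}.
\end{align*}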

This gives (1). div(curl F� ) = � · (� × f ) � � ˆ� � ˆı � ∂ ∂jˆ k∂ � = � · �� ∂x ∂y ∂z �� �F F F �

1 2 3 �∂ ∂ ∂� � ∂x ∂y ∂z � �∂ ∂ ∂� � = �� ∂x ∂y ∂z � � F F F �

1 2 3 ∂ 2 F3 ∂ 2 F2 ∂ 2 F3 ∂ 2 F1 ∂ 2 F2 ∂ 2 F1 − − + + − ∂x∂y ∂x∂z ∂y∂x ∂y∂z ∂z∂x ∂z∂y = 0. =

This is (2).



Example 18.8. The gravitational field
$$\vec F(x, y, z) = \frac{cx}{(x^2 + y^2 + z^2)^{3/2}}\hat\imath + \frac{cy}{(x^2 + y^2 + z^2)^{3/2}}\hat\jmath + \frac{cz}{(x^2 + y^2 + z^2)^{3/2}}\hat k$$
is a gradient vector field, so that the gravitational field is rotation free. In fact if
$$f(x, y, z) = -\frac{c}{(x^2 + y^2 + z^2)^{1/2}},$$
then $\vec F = \operatorname{grad} f$, so that $\operatorname{curl} \vec F = \operatorname{curl}(\operatorname{grad} f) = \vec 0$.

Example 18.9. A magnetic field $\vec B$ is always the curl of something,
$$\vec B = \operatorname{curl} \vec A,$$
where $\vec A$ is a vector field. So
$$\operatorname{div}(\vec B) = \operatorname{div}(\operatorname{curl} \vec A) = 0.$$
Therefore a magnetic field is always incompressible.

There is one other way to combine two del operators:

Definition 18.10. The Laplace operator takes a scalar field $f \colon A \to \mathbb{R}$ and outputs another scalar field $\nabla^2 f \colon A \to \mathbb{R}$. It is defined by the rule
$$\nabla^2 f = \operatorname{div}(\operatorname{grad} f) = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2}.$$
A solution of the differential equation $\nabla^2 f = 0$ is called a harmonic function.

Example 18.11. The function
$$f(x, y, z) = -\frac{c}{(x^2 + y^2 + z^2)^{1/2}}$$
is harmonic.
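As a sanity check on Example 18.11, one can approximate the Laplacian by second central differences and confirm that it is close to zero away from the origin. This is only a sketch: the helper `laplacian`, the step `h` and the sample point are illustrative choices, and a finite-difference value near zero is evidence rather than a proof.

```python
import math

def f(x, y, z, c=1.0):
    """Potential from Example 18.11: f = -c / sqrt(x^2 + y^2 + z^2)."""
    return -c / math.sqrt(x * x + y * y + z * z)

def laplacian(g, p, h=1e-3):
    """Second central difference approximation to the Laplacian of g at p."""
    x, y, z = p
    centre = g(x, y, z)
    total = 0.0
    for dx, dy, dz in ((h, 0.0, 0.0), (0.0, h, 0.0), (0.0, 0.0, h)):
        total += g(x + dx, y + dy, z + dz) - 2.0 * centre + g(x - dx, y - dy, z - dz)
    return total / (h * h)

def paraboloid(x, y, z):
    """Comparison function x^2 + y^2 + z^2, whose Laplacian is 6 everywhere."""
    return x * x + y * y + z * z
```

Evaluating `laplacian(f, (1.0, 2.0, -0.5))` gives a value near zero, while `laplacian(paraboloid, (1.0, 2.0, -0.5))` is close to 6, so the check does distinguish harmonic from non-harmonic functions.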


19. Taylor Polynomials

If $f \colon A \to \mathbb{R}^m$ is a differentiable function, and we are given a point $P \in A$, one can use the derivative to write down the best linear approximation to $f$ at $P$. It is natural to wonder if one can do better using quadratic, or even higher degree, polynomials. We start with the one dimensional case.

Definition 19.1. Let $I \subset \mathbb{R}$ be an open interval and let $f \colon I \to \mathbb{R}$ be a $C^k$-function. Given a point $a \in I$, let
$$P_{a,k} f(x) = f(a) + f'(a)(x - a) + \frac{f''(a)}{2}(x - a)^2 + \frac{f'''(a)}{3!}(x - a)^3 + \dots + \frac{f^{(k)}(a)}{k!}(x - a)^k = \sum_{i=0}^{k} \frac{f^{(i)}(a)}{i!}(x - a)^i.$$
Then $P_{a,k} f(x)$ is the $k$th Taylor polynomial of $f$, centred at $a$. The remainder is the difference
$$R_{a,k} f(x) = f(x) - P_{a,k} f(x).$$

Note that we have chosen $P_{a,k} f$ so that the first $k$ derivatives of $P_{a,k} f$ at $a$ are precisely the same as those of $f$. In other words, the first $k$ derivatives at $a$ of the remainder are all zero. The remainder is a measure of how well the Taylor polynomial approximates $f(x)$ and so it is very useful to estimate $R_{a,k} f(x)$.

Theorem 19.2 (Taylor’s Theorem with remainder). Let $I \subset \mathbb{R}$ be an open interval and let $f \colon I \to \mathbb{R}$ be a $C^{k+1}$-function. Let $a$ and $b$ be two points in $I$. Then there is a $\xi$ between $a$ and $b$, such that
$$R_{a,k} f(b) = \frac{f^{(k+1)}(\xi)}{(k + 1)!}(b - a)^{k+1}.$$

Proof. If $a = b$ then take $\xi = a$. The result is clear in this case. Otherwise if we put
$$M = \frac{R_{a,k} f(b)}{(b - a)^{k+1}},$$
then $R_{a,k} f(b) = M(b - a)^{k+1}$. We want to show that there is some $\xi$ between $a$ and $b$ such that
$$M = \frac{f^{(k+1)}(\xi)}{(k + 1)!}.$$
If we let
$$g(x) = R_{a,k} f(x) - M(x - a)^{k+1},$$
then
$$g^{(k+1)}(x) = f^{(k+1)}(x) - (k + 1)!\,M,$$
since $P_{a,k} f$ has degree at most $k$. Then we are looking for $\xi$ such that $g^{(k+1)}(\xi) = 0$. Now the first $k$ derivatives of $g$ at $a$ are all zero,
$$g^{(i)}(a) = 0 \quad \text{for} \quad 0 \leq i \leq k.$$
By choice of $M$, $g(b) = 0$. So by the mean value theorem, applied to $g(x)$, there is a $\xi_1$ between $a$ and $b$ such that $g'(\xi_1) = 0$. Again by the mean value theorem, applied to $g'(x)$, there is a $\xi_2$ between $a$ and $\xi_1$ such that $g''(\xi_2) = 0$. Continuing in this way, by induction we may find $\xi_i$, $1 \leq i \leq k + 1$, between $a$ and $\xi_{i-1}$ such that $g^{(i)}(\xi_i) = 0$. Let $\xi = \xi_{k+1}$.

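Let’s try an easy example. Start with
$$f(x) = x^{1/2}$$
$$f'(x) = \frac{1}{2} x^{-1/2}$$
$$f''(x) = -\frac{1}{2^2} x^{-3/2}$$
$$f'''(x) = \frac{3}{2^3} x^{-5/2}$$
$$f^{(4)}(x) = -\frac{1 \cdot 3 \cdot 5}{2^4} x^{-7/2}$$
$$f^{(5)}(x) = \frac{1 \cdot 3 \cdot 5 \cdot 7}{2^5} x^{-9/2}$$
$$f^{(6)}(x) = -\frac{1 \cdot 3 \cdot 5 \cdot 7 \cdot 9}{2^6} x^{-11/2}$$
$$f^{(k)}(x) = (-1)^{k-1} \frac{(2k - 3)!!}{2^k} x^{-(2k-1)/2}$$
$$f^{(k)}(9/4) = (-1)^{k-1} \frac{(2k - 3)!!}{2^k} \cdot \frac{2^{2k-1}}{3^{2k-1}} = (-1)^{k-1} \frac{(2k - 3)!!\, 2^{k-1}}{3^{2k-1}}.$$
Let’s write down the Taylor polynomial centred at $a = 9/4$.
$$P_{9/4,5} f(x) = f(9/4) + f'(9/4)(x - 9/4) + \frac{f''(9/4)}{2}(x - 9/4)^2 + \frac{f'''(9/4)}{6}(x - 9/4)^3 + \frac{f^{(4)}(9/4)}{24}(x - 9/4)^4 + \frac{f^{(5)}(9/4)}{120}(x - 9/4)^5.$$
So,
$$P_{9/4,5} f(x) = \frac{3}{2} + \frac{1}{3}(x - 9/4) - \frac{1}{3^3}(x - 9/4)^2 + \frac{2}{3^5}(x - 9/4)^3 - \frac{1 \cdot 3 \cdot 5 \cdot 2^3}{24 \cdot 3^7}(x - 9/4)^4 + \frac{1 \cdot 3 \cdot 5 \cdot 7 \cdot 2^4}{120 \cdot 3^9}(x - 9/4)^5.$$
If we plug in $x = 2$, so that $x - 9/4 = -1/4$, we get an approximation to $f(2) = \sqrt{2}$.
$$P_{9/4,3} f(2) = \frac{3}{2} + \frac{1}{3}\left(-\frac{1}{4}\right) - \frac{1}{3^3}\left(\frac{1}{4}\right)^2 - \frac{2}{3^5}\left(\frac{1}{4}\right)^3 = \frac{10997}{7776} \approx 1.41422\dots$$
On the other hand, for some $\xi$ between $2$ and $9/4$,
$$|R_{9/4,3} f(2)| = \frac{1 \cdot 3 \cdot 5}{2^4 \cdot 4!}\, \xi^{-7/2} \left(\frac{1}{4}\right)^4 < \frac{1 \cdot 3 \cdot 5}{2^4 \cdot 4!}\, 2^{-7/2} \left(\frac{1}{4}\right)^4 \approx 1.3 \times 10^{-5}.$$
In fact
$$|R_{9/4,3} f(2)| = \left| \frac{10997}{7776} - \sqrt{2} \right| \approx 9.7 \times 10^{-6}.$$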
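The arithmetic in this example is easy to reproduce exactly with rational numbers. The sketch below is an illustration rather than part of the notes: `taylor_sqrt` is a hypothetical helper that assumes the centre is a perfect square $a = \rho^2$ with $\rho$ rational (here $\rho = 3/2$), so that every derivative of $x^{1/2}$ at $a$ is rational.

```python
from fractions import Fraction
from math import factorial, sqrt

def taylor_sqrt(x, order, root_a=Fraction(3, 2)):
    """Taylor polynomial of sqrt centred at a = root_a**2, evaluated exactly at x."""
    a = root_a ** 2
    coeff = Fraction(1)   # c_k in f^(k)(t) = c_k * t**(1/2 - k)
    total = Fraction(0)
    for k in range(order + 1):
        # a**(1/2 - k) equals root_a**(1 - 2k), which is rational.
        deriv_at_a = coeff * root_a ** (1 - 2 * k)
        total += deriv_at_a / factorial(k) * (x - a) ** k
        coeff *= Fraction(1, 2) - k
    return total

approx = taylor_sqrt(Fraction(2), 3)
error = abs(float(approx) - sqrt(2))
```

Here `approx` is the third Taylor polynomial at $x = 2$ as an exact fraction, and `error` is the size of the actual remainder, which can be compared with the bound from Theorem 19.2.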

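Definition 19.3. Let $A \subset \mathbb{R}^n$ be an open subset which is convex (if $\vec a$ and $\vec b$ belong to $A$, then so does every point on the line segment between them). Suppose that $f \colon A \to \mathbb{R}$ is $C^k$. Given $\vec a \in A$, the $k$th Taylor polynomial of $f$ centred at $\vec a$ is
$$P_{\vec a,k} f(\vec x) = f(\vec a) + \sum_{1 \leq i \leq n} \frac{\partial f}{\partial x_i}(\vec a)(x_i - a_i) + \frac{1}{2} \sum_{1 \leq i,j \leq n} \frac{\partial^2 f}{\partial x_i \partial x_j}(\vec a)(x_i - a_i)(x_j - a_j) + \dots$$
$$+ \frac{1}{k!} \sum_{1 \leq i_1, i_2, \dots, i_k \leq n} \frac{\partial^k f}{\partial x_{i_1} \partial x_{i_2} \cdots \partial x_{i_k}}(\vec a)(x_{i_1} - a_{i_1})(x_{i_2} - a_{i_2}) \cdots (x_{i_k} - a_{i_k}).$$
The remainder is the difference
$$R_{\vec a,k} f(\vec x) = f(\vec x) - P_{\vec a,k} f(\vec x).$$

Theorem 19.4. Let $A \subset \mathbb{R}^n$ be an open subset which is convex. Suppose that $f \colon A \to \mathbb{R}$ is $C^{k+1}$, and let $\vec a$ and $\vec b$ belong to $A$. Then there is a vector $\vec \xi$ on the line segment between $\vec a$ and $\vec b$ such that
$$R_{\vec a,k} f(\vec b) = \frac{1}{(k + 1)!} \sum_{1 \leq i_1, i_2, \dots, i_{k+1} \leq n} \frac{\partial^{k+1} f}{\partial x_{i_1} \partial x_{i_2} \cdots \partial x_{i_{k+1}}}(\vec \xi)(b_{i_1} - a_{i_1})(b_{i_2} - a_{i_2}) \cdots (b_{i_{k+1}} - a_{i_{k+1}}).$$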

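Proof. As $A$ is open and convex, we may find $\epsilon > 0$ so that the parametrised line
$$\vec r \colon (-\epsilon, 1 + \epsilon) \to \mathbb{R}^n \quad \text{given by} \quad \vec r(t) = \vec a + t(\vec b - \vec a)$$
is contained in $A$. Let
$$g \colon (-\epsilon, 1 + \epsilon) \to \mathbb{R}$$
be the composition of $\vec r(t)$ and $f(\vec x)$.

Claim 19.5. $P_{0,k}\, g(t) = P_{\vec a,k} f(\vec r(t))$.

Proof of (19.5). This is just the chain rule:
$$g'(t) = \sum_{1 \leq i \leq n} \frac{\partial f}{\partial x_i}(\vec r(t))(b_i - a_i)$$
$$g''(t) = \sum_{1 \leq i,j \leq n} \frac{\partial^2 f}{\partial x_i \partial x_j}(\vec r(t))(b_i - a_i)(b_j - a_j)$$
and so on.

So the result follows by the one variable result.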

We can write out the first few terms of the Taylor series of $f$ and get something interesting. Let $\vec h = \vec x - \vec a$. Then
$$P_{\vec a,2} f(\vec x) = f(\vec a) + \sum_{1 \leq i \leq n} \frac{\partial f}{\partial x_i}(\vec a) h_i + \frac{1}{2} \sum_{1 \leq i,j \leq n} \frac{\partial^2 f}{\partial x_i \partial x_j}(\vec a) h_i h_j.$$

[…]

(ii) If $f'(a) = 0$ and $f''(a) > 0$, then $a$ is a local minimum of $f$.
(iii) If $f'(a) = 0$ and $f''(a) = 0$, then we don’t know.

The reason why the second derivative test works follows from Taylor’s Theorem with remainder, applied to the second Taylor polynomial. To figure out the multi-variable form of the second derivative test, we need to consider the multi-variable second Taylor polynomial:
$$P_{\vec a,2} f(\vec x) = f(\vec a) + \nabla f(\vec a) \cdot \vec h + \frac{1}{2} \vec h^t\, Hf(\vec a)\, \vec h.$$
Recall that
$$\vec h = (h_1, h_2, \dots, h_n) \quad \text{and} \quad Hf(\vec a) = \left( \frac{\partial^2 f}{\partial x_i \partial x_j}(\vec a) \right).$$
The important term is then
$$Q(\vec h) = \vec h^t\, Hf(\vec a)\, \vec h.$$

Definition 20.9. If $A$ is a symmetric $n \times n$ matrix, then the function
$$Q(\vec x) = \vec x^t A \vec x$$
is called a symmetric quadratic form. We say that $Q$ is positive definite if $\vec x \neq \vec 0$ implies that $Q(\vec x) > 0$. We say that $Q$ is negative definite if $\vec x \neq \vec 0$ implies that $Q(\vec x) < 0$.

Example 20.10. If $A = I_2$ then
$$Q(x, y) = (x, y) \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = x^2 + y^2,$$
which is positive definite. If $A = -I_2$ then $Q(x, y) = -x^2 - y^2$ is negative definite. Finally if
$$A = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$$
then $Q(x, y) = x^2 - y^2$ is neither positive nor negative definite.

Proposition 20.11. If $\vec a \in K \subset \mathbb{R}^n$ is an interior point and $f \colon K \to \mathbb{R}$ is $C^3$ and $\vec a$ is a critical point, then
(1) If $Q(\vec h) = \vec h^t Hf(\vec a) \vec h$ is positive definite, then $\vec a$ is a minimum.
(2) If $Q(\vec h) = \vec h^t Hf(\vec a) \vec h$ is negative definite, then $\vec a$ is a maximum.
(3) If $Q(\vec h) = \vec h^t Hf(\vec a) \vec h$ is not zero and is neither positive nor negative definite, then $\vec a$ is a saddle point.

Proof. Immediate from Taylor’s Theorem.

Proposition 20.12. If $A$ is a symmetric $n \times n$ matrix, then let $d_i$ be the determinant of the upper left $i \times i$ submatrix. Let $Q(\vec h) = \vec h^t A \vec h$.
(1) If $d_i > 0$ for all $i$, then $Q$ is positive definite.
(2) If $d_i > 0$ for $i$ even and $d_i < 0$ for $i$ odd, then $Q$ is negative definite.

Let’s consider the $2 \times 2$ case,
$$A = \begin{pmatrix} a & b \\ b & d \end{pmatrix}.$$
In this case
$$Q(x, y) = ax^2 + 2bxy + dy^2.$$
Assume that $d_1 = a > 0$. Let’s complete the square. Write $a = \alpha^2$, for some $\alpha > 0$. Then
$$Q(x, y) = \left( \alpha x + \frac{b}{\alpha} y \right)^2 + \left( d - \frac{b^2}{\alpha^2} \right) y^2 = \left( \alpha x + \frac{b}{\alpha} y \right)^2 + \frac{ad - b^2}{a} y^2.$$
In this case $d_1 = a > 0$ and $d_2 = ad - b^2$. So the coefficient of $y^2$ is positive if $d_2 > 0$ and negative if $d_2 < 0$.
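The criterion of Proposition 20.12 is mechanical enough to code up. The sketch below is an illustration under stated assumptions: the matrix is given as a list of rows with exact (integer or rational) entries, the names `det`, `leading_minors` and `classify` are hypothetical, and the determinant uses Laplace expansion, which is only sensible for small matrices. When neither sign pattern holds the function reports that the test is inconclusive rather than guessing.

```python
def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def leading_minors(a):
    """The determinants d_1, ..., d_n of the upper left i x i submatrices."""
    return [det([row[:i] for row in a[:i]]) for i in range(1, len(a) + 1)]

def classify(a):
    """Apply the sign test of Proposition 20.12 to a symmetric matrix a."""
    d = leading_minors(a)
    if all(x > 0 for x in d):
        return "positive definite"
    if all((x > 0) if i % 2 == 0 else (x < 0) for i, x in enumerate(d, start=1)):
        return "negative definite"
    return "test inconclusive"
```

For instance, `classify([[1, 0], [0, 1]])` and `classify([[-1, 0], [0, -1]])` recover the first two cases of Example 20.10, while the matrix with diagonal entries $1$ and $-1$ falls through to the inconclusive branch.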


21. Maxima and minima: II

To see how to maximise and minimise a function on the boundary, let’s consider a concrete example. Let
$$K = \{\, (x, y) \mid x^2 + y^2 \leq 2 \,\}.$$
Then $K$ is compact. Let $f \colon K \to \mathbb{R}$ be the function $f(x, y) = xy$. Then $f$ is continuous and so $f$ achieves its maximum and minimum.

I. Let’s first consider the interior points. Then $\nabla f(x, y) = (y, x)$, so that $(0, 0)$ is the only critical point. The Hessian of $f$ is
$$Hf(x, y) = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.$$
$d_1 = 0$ and $d_2 = -1 \neq 0$, so that $(0, 0)$ is a saddle point. It follows that the maxima and minima of $f$ are on the boundary, that is, the set of points
$$C = \{\, (x, y) \mid x^2 + y^2 = 2 \,\}.$$

II. Let $g \colon \mathbb{R}^2 \to \mathbb{R}$ be the function $g(x, y) = x^2 + y^2$. Then the circle $C$ is a level curve of $g$. The original problem asks to maximise and minimise
$$f(x, y) = xy \quad \text{subject to} \quad g(x, y) = x^2 + y^2 = 2.$$
One way to proceed is to use the second equation to eliminate a variable. The method of Lagrange multipliers does exactly the opposite. Instead of eliminating a variable we add one more variable, traditionally called $\lambda$. So now let’s maximise and minimise
$$h(x, y, \lambda) = f(x, y) - \lambda(g(x, y) - 2) = xy - \lambda(x^2 + y^2 - 2).$$
We find the critical points of $h(x, y, \lambda)$:
$$y = 2\lambda x, \qquad x = 2\lambda y, \qquad 2 = x^2 + y^2.$$
First note that if $x = 0$ then $y = 0$ and $x^2 + y^2 = 0 \neq 2$, impossible. So $x \neq 0$. Similarly one can check that $y \neq 0$ and $\lambda \neq 0$. Divide the first equation by the second:
$$\frac{y}{x} = \frac{x}{y},$$
so that $y^2 = x^2$. As $x^2 + y^2 = 2$ it follows that $x^2 = y^2 = 1$. So $x = \pm 1$ and $y = \pm 1$. This gives four potential points $(1, 1)$, $(-1, 1)$, $(1, -1)$, $(-1, -1)$. Then the maximum value of $f$ is $1$, and this occurs at the first and the last point. The minimum value of $f$ is $-1$, and this occurs at the second and the third point.

One can also try to parametrise the boundary:
$$\vec r(t) = \sqrt{2}(\cos t, \sin t).$$
So we maximise the composition
$$h \colon [0, 2\pi] \to \mathbb{R}, \quad \text{where} \quad h(t) = 2 \cos t \sin t.$$
As $I = [0, 2\pi]$ is compact, $h$ has a maximum and minimum on $I$. When $h'(t) = 0$, we get
$$\cos^2 t - \sin^2 t = 0.$$
Note that the LHS is $\cos 2t$, so we want $\cos 2t = 0$. It follows that $2t = \pi/2 + m\pi$, so that
$$t = \pi/4, \quad 3\pi/4, \quad 5\pi/4, \quad \text{and} \quad 7\pi/4.$$
These give the four points we had before.

What is the closest point to the origin on the surface
$$F = \{\, (x, y, z) \in \mathbb{R}^3 \mid x \geq 0,\ y \geq 0,\ z \geq 0,\ xyz = p \,\}?$$
So we want to minimise the distance to the origin on $F$. The first trick is to minimise the square of the distance. In other words, we are trying to minimise $f(x, y, z) = x^2 + y^2 + z^2$ on the surface $F$. In words, given three numbers $x \geq 0$, $y \geq 0$ and $z \geq 0$ whose product is $p > 0$, what is the minimum value of $x^2 + y^2 + z^2$?

Now $F$ is closed but it is not bounded, so it is not even clear that the minimum exists. Let’s use the method of Lagrange multipliers. Let $h \colon \mathbb{R}^4 \to \mathbb{R}$ be the function
$$h(x, y, z, \lambda) = x^2 + y^2 + z^2 - \lambda(xyz - p).$$

We look for the critical points of $h$:
$$2x = \lambda yz, \qquad 2y = \lambda xz, \qquad 2z = \lambda xy, \qquad p = xyz.$$
Once again, it is not possible for any of the variables to be zero. Taking the product of the first three equations, we get
$$8(xyz) = \lambda^3 (x^2 y^2 z^2).$$
So, dividing by $xyz$ and using the last equation, we get $8 = \lambda^3 p$, that is
$$\lambda = \frac{2}{p^{1/3}}.$$
Taking the product of the first two equations, and dividing by $xy$, we get
$$4 = \lambda^2 z^2,$$
so that $z = p^{1/3}$, and by symmetry $x = y = p^{1/3}$ as well. So $h(x, y, z, \lambda)$ has a critical point at
$$(x, y, z, \lambda) = \left( p^{1/3}, p^{1/3}, p^{1/3}, \frac{2}{p^{1/3}} \right).$$
We check that the point
$$(x, y, z) = (p^{1/3}, p^{1/3}, p^{1/3})$$
is a minimum of $x^2 + y^2 + z^2$ subject to the constraint $xyz = p$. At this point the sum of the squares is $3p^{2/3}$.

Suppose that $x \geq \sqrt{3}\, p^{1/3}$. Then the sum of the squares is at least $3p^{2/3}$. Similarly if $y \geq \sqrt{3}\, p^{1/3}$ or $z \geq \sqrt{3}\, p^{1/3}$. On the other hand, the set
$$K = \{\, (x, y, z) \in \mathbb{R}^3 \mid x \in [0, \sqrt{3}\, p^{1/3}],\ y \in [0, \sqrt{3}\, p^{1/3}],\ z \in [0, \sqrt{3}\, p^{1/3}],\ xyz = p \,\}$$
is closed and bounded, so that $f$ achieves its minimum on this set, which we have already decided is at
$$(x, y, z) = (p^{1/3}, p^{1/3}, p^{1/3}),$$
since $f$ is larger on the boundary. Putting all of this together, the point
$$(x, y, z) = (p^{1/3}, p^{1/3}, p^{1/3})$$
is a point where the sum of the squares is a minimum.

Here is another such problem. Find the closest point to the origin which also belongs to the cone
$$x^2 + y^2 = z^2,$$
and to the plane
$$x + y + z = 3.$$
As before, we minimise $f(x, y, z) = x^2 + y^2 + z^2$ subject to $g_1(x, y, z) = x^2 + y^2 - z^2 = 0$ and $g_2(x, y, z) = x + y + z = 3$. Introduce a new function, with two new variables $\lambda_1$ and $\lambda_2$, $h \colon \mathbb{R}^5 \to \mathbb{R}$, given by
$$h(x, y, z, \lambda_1, \lambda_2) = f(x, y, z) - \lambda_1 g_1(x, y, z) - \lambda_2 (g_2(x, y, z) - 3) = x^2 + y^2 + z^2 - \lambda_1 (x^2 + y^2 - z^2) - \lambda_2 (x + y + z - 3).$$
We find the critical points of $h$:
$$2x = 2\lambda_1 x + \lambda_2$$
$$2y = 2\lambda_1 y + \lambda_2$$
$$2z = -2\lambda_1 z + \lambda_2$$
$$z^2 = x^2 + y^2$$
$$3 = x + y + z.$$
Suppose we subtract the first equation from the second:
$$y - x = \lambda_1 (y - x).$$
So either $x = y$ or $\lambda_1 = 1$. Suppose $x \neq y$. Then $\lambda_1 = 1$ and $\lambda_2 = 0$. In this case $z = -z$, so that $z = 0$. But then $x^2 + y^2 = 0$ and so $x = y = 0$, which is not possible, since then $x + y + z = 0 \neq 3$.

It follows that $x = y$, in which case $z = \pm\sqrt{2}\, x$ and
$$(2 \pm \sqrt{2}) x = 3.$$
So
$$x = \frac{3}{2 \pm \sqrt{2}} = \frac{3(2 \mp \sqrt{2})}{2}.$$
This gives us two critical points:
$$P_1 = \left( \frac{3(2 - \sqrt{2})}{2}, \frac{3(2 - \sqrt{2})}{2}, \frac{3\sqrt{2}(2 - \sqrt{2})}{2} \right)$$
$$P_2 = \left( \frac{3(2 + \sqrt{2})}{2}, \frac{3(2 + \sqrt{2})}{2}, -\frac{3\sqrt{2}(2 + \sqrt{2})}{2} \right).$$
Of the two, clearly the first is closest to the origin.

To finish, we had better show that this point is the closest to the origin on the whole locus
$$F = \{\, (x, y, z) \in \mathbb{R}^3 \mid x^2 + y^2 = z^2,\ x + y + z = 3 \,\}.$$
Let
$$K = \{\, (x, y, z) \in F \mid x^2 + y^2 + z^2 \leq 25 \,\}.$$
Then $K$ is closed and bounded, whence compact. So $f$ achieves its minimum somewhere on $K$, and so it must achieve its minimum at $P = P_1$. Clearly $f$ is at least $25$ on $F \setminus K$, and so $f$ is a minimum at $P_1$ on the whole of $F$.
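The two critical points of the cone and plane problem can be checked numerically, and the minimum can be compared against a brute-force scan of the constraint curve. This is a sketch under assumptions that are not in the notes: points of the cone with $z \neq 0$ are written as $w(\cos\theta, \sin\theta, 1)$, the plane equation then forces $w = 3/(\cos\theta + \sin\theta + 1)$, and the names `P1`, `P2`, `f` and `scan_minimum` are illustrative.

```python
import math

S2 = math.sqrt(2.0)
P1 = (3 * (2 - S2) / 2, 3 * (2 - S2) / 2, 3 * S2 * (2 - S2) / 2)
P2 = (3 * (2 + S2) / 2, 3 * (2 + S2) / 2, -3 * S2 * (2 + S2) / 2)

def f(p):
    """Squared distance to the origin."""
    return sum(c * c for c in p)

def on_cone(p, tol=1e-9):
    x, y, z = p
    return abs(x * x + y * y - z * z) < tol

def on_plane(p, tol=1e-9):
    return abs(sum(p) - 3.0) < tol

def scan_minimum(samples=20000):
    """Smallest value of f over sampled points of the cone-plane intersection."""
    best = float("inf")
    for k in range(samples):
        theta = 2 * math.pi * k / samples
        denom = math.cos(theta) + math.sin(theta) + 1.0
        if abs(denom) < 1e-12:
            continue  # the ray in this direction never meets the plane
        w = 3.0 / denom
        best = min(best, f((w * math.cos(theta), w * math.sin(theta), w)))
    return best
```

Both points satisfy the two constraints, `f(P1)` is about 3.09 while `f(P2)` is about 104.9, and the scan never finds a value below `f(P1)`.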


22. Double integrals

Definition 22.1. Let $R = [a, b] \times [c, d] \subset \mathbb{R}^2$ be a rectangle in the plane. A partition $\mathcal{P}$ of $R$ is a pair of sequences:
$$a = x_0 < x_1 < \dots < x_n = b$$
$$c = y_0 < y_1 < \dots < y_n = d.$$
The mesh of $\mathcal{P}$ is
$$m(\mathcal{P}) = \max\{\, x_i - x_{i-1},\ y_i - y_{i-1} \mid 1 \leq i \leq n \,\}.$$

Now suppose we are given a function
$$f \colon R \to \mathbb{R}.$$
Pick
$$\vec c_{ij} \in R_{ij} = [x_{i-1}, x_i] \times [y_{j-1}, y_j].$$

Definition 22.2. The sum
$$S = \sum_{i=1}^{n} \sum_{j=1}^{n} f(\vec c_{ij})(x_i - x_{i-1})(y_j - y_{j-1})$$
is called a Riemann sum.

We will use the shorthand notation
$$\Delta x_i = x_i - x_{i-1} \quad \text{and} \quad \Delta y_j = y_j - y_{j-1}.$$

Definition 22.3. The function $f \colon R \to \mathbb{R}$ is called integrable, with integral $I$, if for every $\epsilon > 0$, we may find a $\delta > 0$ such that for every partition $\mathcal{P}$ whose mesh is less than $\delta$, we have
$$|I - S| < \epsilon,$$
where $S$ is any Riemann sum associated to $\mathcal{P}$. We write
$$\iint_R f(x, y)\, dx\, dy = I$$
to mean that $f$ is integrable with integral $I$.

We use a sneaky trick to integrate over regions other than rectangles. Suppose that $D$ is a bounded subset of the plane. Then we can find a rectangle $R$ which completely contains $D$.

Definition 22.4. The indicator function of $D \subset R$ is the function
$$i_D \colon R \to \mathbb{R},$$
given by
$$i_D(x) = \begin{cases} 1 & \text{if } x \in D \\ 0 & \text{if } x \notin D. \end{cases}$$
If $i_D$ is integrable, then we say that the area of $D$ is the integral
$$\iint_R i_D\, dx\, dy.$$
If $i_D$ is not integrable, then $D$ does not have an area.

Example 22.5. Let
$$D = \{\, (x, y) \in [0, 1] \times [0, 1] \mid x, y \in \mathbb{Q} \,\}.$$
Then $D$ does not have an area.

Definition 22.6. If $f \colon D \to \mathbb{R}$ is a function and $D$ is bounded, then pick a rectangle $R$ with $D \subset R \subset \mathbb{R}^2$. Define
$$\tilde f \colon R \to \mathbb{R}$$
by the rule
$$\tilde f(x) = \begin{cases} f(x) & \text{if } x \in D \\ 0 & \text{otherwise.} \end{cases}$$
We say that $f$ is integrable over $D$ if $\tilde f$ is integrable over $R$. In this case
$$\iint_D f(x, y)\, dx\, dy = \iint_R \tilde f(x, y)\, dx\, dy.$$

Proposition 22.7. Let $D \subset \mathbb{R}^2$ be a bounded subset and let $f \colon D \to \mathbb{R}$ and $g \colon D \to \mathbb{R}$ be two integrable functions. Let $\lambda$ be a scalar. Then
(1) $f + g$ is integrable over $D$ and
$$\iint_D f(x, y) + g(x, y)\, dx\, dy = \iint_D f(x, y)\, dx\, dy + \iint_D g(x, y)\, dx\, dy.$$
(2) $\lambda f$ is integrable over $D$ and
$$\iint_D \lambda f(x, y)\, dx\, dy = \lambda \iint_D f(x, y)\, dx\, dy.$$
(3) If $f(x, y) \leq g(x, y)$ for any $(x, y) \in D$, then
$$\iint_D f(x, y)\, dx\, dy \leq \iint_D g(x, y)\, dx\, dy.$$
(4) $|f|$ is integrable over $D$ and
$$\left| \iint_D f(x, y)\, dx\, dy \right| \leq \iint_D |f(x, y)|\, dx\, dy.$$

It is straightforward to integrate continuous functions over regions of three special types:

Definition 22.8. A bounded subset $D \subset \mathbb{R}^2$ is an elementary region if it is one of three types:
Type 1:
$$D = \{\, (x, y) \in \mathbb{R}^2 \mid a \leq x \leq b,\ \gamma(x) \leq y \leq \delta(x) \,\},$$
where $\gamma \colon [a, b] \to \mathbb{R}$ and $\delta \colon [a, b] \to \mathbb{R}$ are continuous functions.
Type 2:
$$D = \{\, (x, y) \in \mathbb{R}^2 \mid c \leq y \leq d,\ \alpha(y) \leq x \leq \beta(y) \,\},$$
where $\alpha \colon [c, d] \to \mathbb{R}$ and $\beta \colon [c, d] \to \mathbb{R}$ are continuous functions.
Type 3: $D$ is both type 1 and 2.

Theorem 22.9. Let $D \subset \mathbb{R}^2$ be an elementary region and let $f \colon D \to \mathbb{R}$ be a continuous function. Then
(1) If $D$ is of type 1, then
$$\iint_D f(x, y)\, dx\, dy = \int_a^b \left( \int_{\gamma(x)}^{\delta(x)} f(x, y)\, dy \right) dx.$$
(2) If $D$ is of type 2, then
$$\iint_D f(x, y)\, dx\, dy = \int_c^d \left( \int_{\alpha(y)}^{\beta(y)} f(x, y)\, dx \right) dy.$$

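Example 22.10. Let $D$ be the region bounded by the lines $x = 0$, $y = 4$ and the parabola $y = x^2$. Let $f \colon D \to \mathbb{R}$ be the function given by $f(x, y) = x^2 + y^2$.

If we view $D$ as a region of type 1, then we get
$$\iint_D f(x, y)\, dx\, dy = \int_0^2 \left( \int_{x^2}^4 x^2 + y^2\, dy \right) dx$$
$$= \int_0^2 \left[ x^2 y + \frac{y^3}{3} \right]_{x^2}^4 dx$$
$$= \int_0^2 4x^2 + \frac{2^6}{3} - x^4 - \frac{x^6}{3}\, dx$$
$$= \left[ \frac{4x^3}{3} + \frac{2^6 x}{3} - \frac{x^5}{5} - \frac{x^7}{3 \cdot 7} \right]_0^2$$
$$= \frac{2^5}{3} + \frac{2^7}{3} - \frac{2^5}{5} - \frac{2^7}{3 \cdot 7}$$
$$= \frac{2^6}{3 \cdot 5} + \frac{2^8}{7}$$
$$= 2^6 \left( \frac{1}{3 \cdot 5} + \frac{2^2}{7} \right).$$
On the other hand, if we view $D$ as a region of type 2, then we get
$$\iint_D f(x, y)\, dx\, dy = \int_0^4 \left( \int_0^{\sqrt{y}} x^2 + y^2\, dx \right) dy$$
$$= \int_0^4 \left[ \frac{x^3}{3} + x y^2 \right]_0^{\sqrt{y}} dy$$
$$= \int_0^4 \frac{y^{3/2}}{3} + y^{5/2}\, dy$$
$$= \left[ \frac{2y^{5/2}}{3 \cdot 5} + \frac{2y^{7/2}}{7} \right]_0^4$$
$$= \frac{2^6}{3 \cdot 5} + \frac{2^8}{7}$$
$$= 2^6 \left( \frac{1}{3 \cdot 5} + \frac{2^2}{7} \right).$$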
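Both iterated integrals in Example 22.10 can be approximated numerically, which gives an independent check that the two orders of integration agree. The sketch below uses a plain midpoint rule; the helper `midpoint`, the functions `type1` and `type2`, and the number of subintervals are illustrative choices, not anything prescribed by the notes.

```python
import math

def midpoint(g, lo, hi, n=400):
    """Midpoint-rule approximation to the integral of g over [lo, hi]."""
    step = (hi - lo) / n
    return step * sum(g(lo + (k + 0.5) * step) for k in range(n))

def f(x, y):
    return x * x + y * y

def type1():
    # x runs over [0, 2]; for each x, y runs from the parabola x^2 up to 4.
    return midpoint(lambda x: midpoint(lambda y: f(x, y), x * x, 4.0), 0.0, 2.0)

def type2():
    # y runs over [0, 4]; for each y, x runs from 0 to sqrt(y).
    return midpoint(lambda y: midpoint(lambda x: f(x, y), 0.0, math.sqrt(y)), 0.0, 4.0)

EXACT = 2**6 * (1 / 15 + 4 / 7)  # the closed form obtained in the example
```

With these defaults both `type1()` and `type2()` should land close to `EXACT`, which is roughly 40.84.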


23. Inclusion-Exclusion

Proposition 23.1. Let $D = D_1 \cup D_2$ be a bounded region and let $f \colon D \to \mathbb{R}$ be a function. If $f$ is integrable over $D_1$ and over $D_2$, then $f$ is integrable over $D$ and over $D_1 \cap D_2$, and we have
$$\iint_D f(x, y)\, dx\, dy = \iint_{D_1} f(x, y)\, dx\, dy + \iint_{D_2} f(x, y)\, dx\, dy - \iint_{D_1 \cap D_2} f(x, y)\, dx\, dy.$$

Example 23.2. Let
$$D = \{\, (x, y) \in \mathbb{R}^2 \mid 1 \leq x^2 + y^2 \leq 9 \,\}.$$
Then $D$ is not an elementary region. Let
$$D_1 = \{\, (x, y) \in D \mid y \geq 0 \,\} \quad \text{and} \quad D_2 = \{\, (x, y) \in D \mid y \leq 0 \,\}.$$
Then $D_1$ and $D_2$ are both of type 1. If $f$ is continuous, then $f$ is integrable over $D$ and $D_1 \cap D_2$. In fact
$$D_1 \cap D_2 = L \cup R = \{\, (x, y) \in \mathbb{R}^2 \mid -3 \leq x \leq -1,\ 0 \leq y \leq 0 \,\} \cup \{\, (x, y) \in \mathbb{R}^2 \mid 1 \leq x \leq 3,\ 0 \leq y \leq 0 \,\}.$$
Now $L$ and $R$ are elementary regions. We have
$$\iint_R f(x, y)\, dx\, dy = \int_1^3 \left( \int_0^0 f(x, y)\, dy \right) dx = 0.$$
Therefore, by symmetry,
$$\iint_L f(x, y)\, dx\, dy = \iint_R f(x, y)\, dx\, dy = 0$$
and so
$$\iint_D f(x, y)\, dx\, dy = \iint_{D_1} f(x, y)\, dx\, dy + \iint_{D_2} f(x, y)\, dx\, dy.$$
To integrate $f$ over $D_1$, break $D_1$ into three parts.
$$\iint_{D_1} f(x, y)\, dx\, dy = \int_{-3}^{3} \left( \int_{\gamma(x)}^{\delta(x)} f(x, y)\, dy \right) dx$$
$$= \int_{-3}^{-1} \left( \int_0^{\sqrt{9 - x^2}} f(x, y)\, dy \right) dx + \int_{-1}^{1} \left( \int_{\sqrt{1 - x^2}}^{\sqrt{9 - x^2}} f(x, y)\, dy \right) dx + \int_1^3 \left( \int_0^{\sqrt{9 - x^2}} f(x, y)\, dy \right) dx.$$
One can do something similar for $D_2$.

Example 23.3. Suppose we are given that
$$\iint_D f(x, y)\, dx\, dy = \int_0^1 \left( \int_y^{2y} f(x, y)\, dx \right) dy.$$
What is the region $D$? It is the region bounded by the two lines $y = x$ and $x = 2y$ and between the two lines $y = 0$ and $y = 1$. Change the order of integration:
$$\iint_D f(x, y)\, dx\, dy = \int_0^1 \left( \int_{x/2}^{x} f(x, y)\, dy \right) dx + \int_1^2 \left( \int_{x/2}^{1} f(x, y)\, dy \right) dx.$$
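The change of order in Example 23.3 can be tested on a concrete integrand. The sketch below assumes the test function $f(x, y) = xy$, which is not in the notes; for it both sides equal $3/8$ by direct calculation. The helper names are illustrative and the integrals are approximated by the midpoint rule.

```python
def midpoint(g, lo, hi, n=200):
    """Midpoint-rule approximation to the integral of g over [lo, hi]."""
    step = (hi - lo) / n
    return step * sum(g(lo + (k + 0.5) * step) for k in range(n))

def f(x, y):
    return x * y  # sample integrand, chosen only for this check

def original_order():
    # y from 0 to 1, then x from y to 2y.
    return midpoint(lambda y: midpoint(lambda x: f(x, y), y, 2 * y), 0.0, 1.0)

def swapped_order():
    # x from 0 to 1 with y from x/2 to x, plus x from 1 to 2 with y from x/2 to 1.
    first = midpoint(lambda x: midpoint(lambda y: f(x, y), x / 2, x), 0.0, 1.0)
    second = midpoint(lambda x: midpoint(lambda y: f(x, y), x / 2, 1.0), 1.0, 2.0)
    return first + second
```

The two pieces of the swapped integral contribute $3/32$ and $9/32$ respectively, which add up to the $3/8$ given by the original order.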

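Example 23.4. Calculate the volume of a solid ball of radius $a$. Let
$$B = \{\, (x, y, z) \in \mathbb{R}^3 \mid x^2 + y^2 + z^2 \leq a^2 \,\}.$$
We want the volume of $B$. Break into two pieces. Let
$$B^+ = \{\, (x, y, z) \in \mathbb{R}^3 \mid x^2 + y^2 + z^2 \leq a^2,\ z \geq 0 \,\}.$$
Let
$$D = \{\, (x, y) \in \mathbb{R}^2 \mid x^2 + y^2 \leq a^2 \,\}.$$
Then $B^+$ is bounded by the $xy$-plane and the graph of the function $f \colon D \to \mathbb{R}$ given by
$$f(x, y) = \sqrt{a^2 - x^2 - y^2}.$$
It follows that
$$\operatorname{vol}(B^+) = \iint_D \sqrt{a^2 - x^2 - y^2}\, dy\, dx$$
$$= \int_{-a}^{a} \left( \int_{-\sqrt{a^2 - x^2}}^{\sqrt{a^2 - x^2}} \sqrt{a^2 - x^2 - y^2}\, dy \right) dx$$
$$= \int_{-a}^{a} \left( \int_{-\sqrt{a^2 - x^2}}^{\sqrt{a^2 - x^2}} \sqrt{a^2 - x^2} \sqrt{1 - \frac{y^2}{a^2 - x^2}}\, dy \right) dx.$$
Now let’s make the substitution
$$t = \frac{y}{\sqrt{a^2 - x^2}} \quad \text{so that} \quad dt = \frac{dy}{\sqrt{a^2 - x^2}}.$$
$$\operatorname{vol}(B^+) = \int_{-a}^{a} \left( \int_{-1}^{1} (a^2 - x^2) \sqrt{1 - t^2}\, dt \right) dx = \int_{-a}^{a} (a^2 - x^2) \left( \int_{-1}^{1} \sqrt{1 - t^2}\, dt \right) dx.$$
Now let’s make the substitution
$$t = \sin u \quad \text{so that} \quad dt = \cos u\, du.$$
$$\operatorname{vol}(B^+) = \int_{-a}^{a} (a^2 - x^2) \left( \int_{-\pi/2}^{\pi/2} \cos^2 u\, du \right) dx$$
$$= \frac{\pi}{2} \int_{-a}^{a} (a^2 - x^2)\, dx$$
$$= \frac{\pi}{2} \left[ a^2 x - \frac{x^3}{3} \right]_{-a}^{a}$$
$$= \pi \left( a^3 - \frac{a^3}{3} \right)$$
$$= \frac{2\pi a^3}{3}.$$
Therefore, we get the expected answer
$$\operatorname{vol}(B) = 2 \operatorname{vol}(B^+) = \frac{4\pi a^3}{3}.$$

Example 23.5. Now consider the example of a cone whose base radius is $a$ and whose height is $b$. Put the central axis along the $x$-axis and the base in the $yz$-plane. In the $xy$-plane we get an isosceles triangle of height $b$ and base $2a$. If we view this as a region of type 1, we have
$$\gamma(x) = -a\left( 1 - \frac{x}{b} \right) \quad \text{and} \quad \delta(x) = a\left( 1 - \frac{x}{b} \right).$$
We want to integrate the function $f \colon D \to \mathbb{R}$ given by
$$f(x, y) = \sqrt{a^2 \left( 1 - \frac{x}{b} \right)^2 - y^2}.$$
So half of the volume of the cone is
$$\int_0^b \left( \int_{-a(1 - x/b)}^{a(1 - x/b)} \sqrt{a^2 \left( 1 - \frac{x}{b} \right)^2 - y^2}\, dy \right) dx = \frac{\pi}{2} \int_0^b a^2 \left( 1 - \frac{x}{b} \right)^2 dx$$
$$= \frac{\pi a^2}{2} \int_0^b 1 - \frac{2x}{b} + \frac{x^2}{b^2}\, dx$$
$$= \frac{\pi a^2}{2} \left[ x - \frac{x^2}{b} + \frac{x^3}{3b^2} \right]_0^b$$
$$= \frac{1}{6}(\pi a^2 b).$$
Therefore the volume is
$$\frac{1}{3}(\pi a^2 b).$$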


24. Triple integrals

Definition 24.1. Let $B = [a, b] \times [c, d] \times [e, f] \subset \mathbb{R}^3$ be a box in space. A partition $\mathcal{P}$ of $B$ is a triple of sequences:
$$a = x_0 < x_1 < \dots < x_n = b$$
$$c = y_0 < y_1 < \dots < y_n = d$$
$$e = z_0 < z_1 < \dots < z_n = f.$$
The mesh of $\mathcal{P}$ is
$$m(\mathcal{P}) = \max\{\, x_i - x_{i-1},\ y_i - y_{i-1},\ z_i - z_{i-1} \mid 1 \leq i \leq n \,\}.$$

Now suppose we are given a function
$$f \colon B \to \mathbb{R}.$$
Pick
$$\vec c_{ijk} \in B_{ijk} = [x_{i-1}, x_i] \times [y_{j-1}, y_j] \times [z_{k-1}, z_k].$$

Definition 24.2. The sum
$$S = \sum_{i=1}^{n} \sum_{j=1}^{n} \sum_{k=1}^{n} f(\vec c_{ijk})(x_i - x_{i-1})(y_j - y_{j-1})(z_k - z_{k-1})$$
is called a Riemann sum.

Definition 24.3. The function $f \colon B \to \mathbb{R}$ is called integrable, with integral $I$, if for every $\epsilon > 0$, we may find a $\delta > 0$ such that for every partition $\mathcal{P}$ whose mesh is less than $\delta$, we have
$$|I - S| < \epsilon,$$
where $S$ is any Riemann sum associated to $\mathcal{P}$.

If $W \subset \mathbb{R}^3$ is a bounded subset and $f \colon W \to \mathbb{R}$ is a bounded function, then pick a box $B$ containing $W$ and extend $f$ by zero to a function $\tilde f \colon B \to \mathbb{R}$,
$$\tilde f(x) = \begin{cases} f(x) & \text{if } x \in W \\ 0 & \text{otherwise.} \end{cases}$$
If $\tilde f$ is integrable, then we write
$$\iiint_W f(x, y, z)\, dx\, dy\, dz = \iiint_B \tilde f(x, y, z)\, dx\, dy\, dz.$$
In particular
$$\operatorname{vol}(W) = \iiint_W dx\, dy\, dz.$$

There are two results, which are much the same as the results for double integrals:

Proposition 24.4. Let $W \subset \mathbb{R}^3$ be a bounded subset and let $f \colon W \to \mathbb{R}$ and $g \colon W \to \mathbb{R}$ be two integrable functions. Let $\lambda$ be a scalar. Then
(1) $f + g$ is integrable over $W$ and
$$\iiint_W f(x, y, z) + g(x, y, z)\, dx\, dy\, dz = \iiint_W f(x, y, z)\, dx\, dy\, dz + \iiint_W g(x, y, z)\, dx\, dy\, dz.$$
(2) $\lambda f$ is integrable over $W$ and
$$\iiint_W \lambda f(x, y, z)\, dx\, dy\, dz = \lambda \iiint_W f(x, y, z)\, dx\, dy\, dz.$$
(3) If $f(x, y, z) \leq g(x, y, z)$ for any $(x, y, z) \in W$, then
$$\iiint_W f(x, y, z)\, dx\, dy\, dz \leq \iiint_W g(x, y, z)\, dx\, dy\, dz.$$
(4) $|f|$ is integrable over $W$ and
$$\left| \iiint_W f(x, y, z)\, dx\, dy\, dz \right| \leq \iiint_W |f(x, y, z)|\, dx\, dy\, dz.$$

Proposition 24.5. Let $W = W_1 \cup W_2 \subset \mathbb{R}^3$ be a bounded subset and let $f \colon W \to \mathbb{R}$ be a bounded function. If $f$ is integrable over $W_1$ and over $W_2$, then $f$ is integrable over $W$ and over $W_1 \cap W_2$, and we have
$$\iiint_W f(x, y, z)\, dx\, dy\, dz = \iiint_{W_1} f(x, y, z)\, dx\, dy\, dz + \iiint_{W_2} f(x, y, z)\, dx\, dy\, dz - \iiint_{W_1 \cap W_2} f(x, y, z)\, dx\, dy\, dz.$$

Definition 24.6. Define three maps
$$\pi_{ij} \colon \mathbb{R}^3 \to \mathbb{R}^2,$$
by projection onto the $i$th and $j$th coordinate. In coordinates, we have
$$\pi_{12}(x, y, z) = (x, y), \qquad \pi_{23}(x, y, z) = (y, z), \qquad \text{and} \qquad \pi_{13}(x, y, z) = (x, z).$$
For example, if we start with a solid pyramid and project onto the $xy$-plane, the image is a square, but if we project onto the $xz$-plane, the image is a triangle. Similarly onto the $yz$-plane.

Definition 24.7. A bounded subset $W \subset \mathbb{R}^3$ is an elementary subset if it is one of four types:
Type 1: $D = \pi_{12}(W)$ is an elementary region and
$$W = \{\, (x, y, z) \in \mathbb{R}^3 \mid (x, y) \in D,\ \epsilon(x, y) \leq z \leq \phi(x, y) \,\},$$
where $\epsilon \colon D \to \mathbb{R}$ and $\phi \colon D \to \mathbb{R}$ are continuous functions.
Type 2: $D = \pi_{23}(W)$ is an elementary region and
$$W = \{\, (x, y, z) \in \mathbb{R}^3 \mid (y, z) \in D,\ \alpha(y, z) \leq x \leq \beta(y, z) \,\},$$
where $\alpha \colon D \to \mathbb{R}$ and $\beta \colon D \to \mathbb{R}$ are continuous functions.
Type 3: $D = \pi_{13}(W)$ is an elementary region and
$$W = \{\, (x, y, z) \in \mathbb{R}^3 \mid (x, z) \in D,\ \gamma(x, z) \leq y \leq \delta(x, z) \,\},$$
where $\gamma \colon D \to \mathbb{R}$ and $\delta \colon D \to \mathbb{R}$ are continuous functions.
Type 4: $W$ is of type 1, 2 and 3.

The solid pyramid is of type 4.

Theorem 24.8. Let $W \subset \mathbb{R}^3$ be an elementary subset and let $f \colon W \to \mathbb{R}$ be a continuous function. Then
(1) If $W$ is of type 1, then
$$\iiint_W f(x, y, z)\, dx\, dy\, dz = \iint_{\pi_{12}(W)} \left( \int_{\epsilon(x, y)}^{\phi(x, y)} f(x, y, z)\, dz \right) dx\, dy.$$
(2) If $W$ is of type 2, then
$$\iiint_W f(x, y, z)\, dx\, dy\, dz = \iint_{\pi_{23}(W)} \left( \int_{\alpha(y, z)}^{\beta(y, z)} f(x, y, z)\, dx \right) dy\, dz.$$
(3) If $W$ is of type 3, then
$$\iiint_W f(x, y, z)\, dx\, dy\, dz = \iint_{\pi_{13}(W)} \left( \int_{\gamma(x, z)}^{\delta(x, z)} f(x, y, z)\, dy \right) dx\, dz.$$

Let’s figure out the volume of the solid ellipsoid:
$$W = \left\{\, (x, y, z) \in \mathbb{R}^3 \;\middle|\; \left( \frac{x}{a} \right)^2 + \left( \frac{y}{b} \right)^2 + \left( \frac{z}{c} \right)^2 \leq 1 \,\right\}.$$

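This is an elementary subset of type 4.
$$\operatorname{vol}(W) = \iiint_W dx\, dy\, dz$$
$$= \int_{-a}^{a} \left( \int_{-b\sqrt{1 - (x/a)^2}}^{b\sqrt{1 - (x/a)^2}} \left( \int_{-c\sqrt{1 - (x/a)^2 - (y/b)^2}}^{c\sqrt{1 - (x/a)^2 - (y/b)^2}} dz \right) dy \right) dx$$
$$= \int_{-a}^{a} \left( \int_{-b\sqrt{1 - (x/a)^2}}^{b\sqrt{1 - (x/a)^2}} 2c \sqrt{1 - \left( \frac{x}{a} \right)^2 - \left( \frac{y}{b} \right)^2}\, dy \right) dx$$
$$= \int_{-a}^{a} \frac{2c}{b} \left( \int_{-b\sqrt{1 - (x/a)^2}}^{b\sqrt{1 - (x/a)^2}} \sqrt{b^2 \left( 1 - \left( \frac{x}{a} \right)^2 \right) - y^2}\, dy \right) dx$$
$$= \int_{-a}^{a} \frac{\pi c}{b}\, b^2 \left( 1 - \left( \frac{x}{a} \right)^2 \right) dx$$
$$= \pi b c \int_{-a}^{a} 1 - \left( \frac{x}{a} \right)^2 dx$$
$$= \pi b c \left[ x - \frac{x^3}{3a^2} \right]_{-a}^{a}$$
$$= \pi b c \left( 2a - \frac{2a^3}{3a^2} \right)$$
$$= \frac{4\pi}{3} abc.$$
Here the inner integral in the fourth line is the area of a half disc of radius $b\sqrt{1 - (x/a)^2}$.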
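The ellipsoid volume can be cross-checked by evaluating the iterated integral numerically after the innermost $z$-integration has been done by hand. The following is a rough sketch: `ellipsoid_volume` and `midpoint` are hypothetical helper names, the midpoint rule converges slowly near the boundary where the square root vanishes, and so only modest accuracy should be expected.

```python
import math

def midpoint(g, lo, hi, n=300):
    """Midpoint-rule approximation to the integral of g over [lo, hi]."""
    step = (hi - lo) / n
    return step * sum(g(lo + (k + 0.5) * step) for k in range(n))

def ellipsoid_volume(a, b, c, n=300):
    """Integrate the height 2c*sqrt(1 - (x/a)^2 - (y/b)^2) over the shadow ellipse."""
    def slice_area(x):
        half_width = b * math.sqrt(max(0.0, 1.0 - (x / a) ** 2))
        def height(y):
            return 2.0 * c * math.sqrt(max(0.0, 1.0 - (x / a) ** 2 - (y / b) ** 2))
        return midpoint(height, -half_width, half_width, n)
    return midpoint(slice_area, -a, a, n)
```

For semi-axes such as `a, b, c = 1.0, 2.0, 3.0` the result should be within about one percent of $\frac{4\pi}{3}abc$.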


25. Change of coordinates: I

Definition 25.1. A function $f \colon U \to V$ between two open subsets of $\mathbb{R}^n$ is called a diffeomorphism if:
(1) $f$ is a bijection,
(2) $f$ is differentiable, and
(3) $f^{-1}$ is differentiable.

Almost by definition of the inverse function, $f \circ f^{-1} \colon V \to V$ and $f^{-1} \circ f \colon U \to U$ are both the identity function, so that
$$(f \circ f^{-1})(\vec y) = \vec y \quad \text{and} \quad (f^{-1} \circ f)(\vec x) = \vec x.$$
It follows that
$$Df(\vec x)\, Df^{-1}(\vec y) = I_n \quad \text{and} \quad Df^{-1}(\vec y)\, Df(\vec x) = I_n,$$
by the chain rule. Taking determinants, we see that
$$\det(Df) \det(Df^{-1}) = \det I_n = 1.$$
Therefore,
$$\det(Df^{-1}) = (\det(Df))^{-1}.$$
It follows that $\det(Df) \neq 0$.

Theorem 25.2 (Inverse function theorem). Let $U \subset \mathbb{R}^n$ be an open subset and let $f \colon U \to \mathbb{R}^n$ be a function. Suppose that
(1) $f$ is injective,
(2) $f$ is $C^1$, and
(3) $\det Df(\vec x) \neq 0$ for all $\vec x \in U$.
Then $V = f(U) \subset \mathbb{R}^n$ is open and the induced map $f \colon U \to V$ is a diffeomorphism.

Example 25.3. Let
$$f(r, \theta) = (r \cos \theta, r \sin \theta).$$
Then
$$Df(r, \theta) = \begin{pmatrix} \cos \theta & -r \sin \theta \\ \sin \theta & r \cos \theta \end{pmatrix},$$
so that
$$\det Df(r, \theta) = r.$$
It follows that $f$ defines a diffeomorphism $f \colon U \to V$ between
$$U = (0, \infty) \times (0, 2\pi) \quad \text{and} \quad V = \mathbb{R}^2 \setminus \{\, (x, y) \in \mathbb{R}^2 \mid y = 0,\ x \geq 0 \,\}.$$

Theorem 25.4. Let $g \colon U \to V$ be a diffeomorphism between open subsets of $\mathbb{R}^2$,
$$g(u, v) = (x(u, v), y(u, v)).$$
Let $D^* \subset U$ be a region and let $D = g(D^*) \subset V$. Let $f \colon D \to \mathbb{R}$ be a function. Then
$$\iint_D f(x, y)\, dx\, dy = \iint_{D^*} f(x(u, v), y(u, v))\, |\det Dg(u, v)|\, du\, dv.$$

It is convenient to use the following notation:
$$\frac{\partial(x, y)}{\partial(u, v)}(u, v) = \det Dg(u, v).$$
The LHS is called the Jacobian. Note that
$$\frac{\partial(x, y)}{\partial(u, v)}(u, v) = \left( \frac{\partial(u, v)}{\partial(x, y)}(x, y) \right)^{-1}.$$

Example 25.5. There is no simple expression for the integral of $e^{-x^2}$. However it is possible to compute the following integral
$$I = \int_{-\infty}^{\infty} e^{-x^2}\, dx.$$
(In what follows, we will ignore issues relating to the fact that the integrals are improper; in practice all integrals converge). Instead of computing $I$, we compute $I^2$,
$$I^2 = \left( \int_{-\infty}^{\infty} e^{-x^2}\, dx \right) \left( \int_{-\infty}^{\infty} e^{-y^2}\, dy \right)$$
$$= \int_{-\infty}^{\infty} \left( \int_{-\infty}^{\infty} e^{-x^2 - y^2}\, dx \right) dy$$
$$= \iint_{\mathbb{R}^2} e^{-x^2 - y^2}\, dx\, dy$$
$$= \iint_{(0, \infty) \times (0, 2\pi)} r e^{-r^2}\, dr\, d\theta$$
$$= \int_0^{\infty} \left( \int_0^{2\pi} r e^{-r^2}\, d\theta \right) dr$$
$$= 2\pi \int_0^{\infty} r e^{-r^2}\, dr$$
$$= 2\pi \left[ -\frac{e^{-r^2}}{2} \right]_0^{\infty}$$
$$= \pi.$$
So $I = \sqrt{\pi}$.
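The value $I = \sqrt{\pi}$ from Example 25.5 is easy to confirm numerically. The sketch below truncates the integral to $[-8, 8]$, an assumption that is harmless because $e^{-64}$ is far below double precision; `gaussian_integral` is an illustrative name and the midpoint rule is just one convenient quadrature.

```python
import math

def gaussian_integral(limit=8.0, n=4000):
    """Midpoint-rule approximation to the integral of exp(-x^2) over [-limit, limit]."""
    step = 2.0 * limit / n
    return step * sum(math.exp(-(-limit + (k + 0.5) * step) ** 2) for k in range(n))
```

Comparing `gaussian_integral()` with `math.sqrt(math.pi)` shows agreement to many decimal places, and squaring the result gives a value very close to $\pi$, matching the computation of $I^2$.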

Example 25.6. Find the area of the region $D$ bounded by the four curves
$$xy = 1, \qquad xy = 3, \qquad y = x^3, \qquad \text{and} \qquad y = 2x^3.$$
Define two new variables,
$$u = \frac{x^3}{y} \quad \text{and} \quad v = xy.$$
Then $D$ is a rectangle in $uv$-coordinates,
$$D^* = [1/2, 1] \times [1, 3].$$
Now for the Jacobian we have
$$\frac{\partial(u, v)}{\partial(x, y)}(x, y) = \begin{vmatrix} \frac{3x^2}{y} & -\frac{x^3}{y^2} \\ y & x \end{vmatrix} = \frac{4x^3}{y} = 4u.$$
It follows that
$$\frac{\partial(x, y)}{\partial(u, v)}(u, v) = \frac{1}{4u}.$$
This is nowhere zero. In fact note that we can solve for $x$ and $y$ explicitly in terms of $u$ and $v$:
$$uv = x^4 \quad \text{and} \quad y = \frac{v}{x}.$$
So
$$x = (uv)^{1/4} \quad \text{and} \quad y = u^{-1/4} v^{3/4}.$$
Therefore
$$\operatorname{area}(D) = \iint_D dx\, dy$$
$$= \iint_{D^*} \frac{1}{4u}\, du\, dv$$
$$= \frac{1}{4} \int_1^3 \left( \int_{1/2}^1 \frac{1}{u}\, du \right) dv$$
$$= \frac{1}{4} \int_1^3 [\ln u]_{1/2}^1\, dv$$
$$= \frac{1}{4} \int_1^3 \ln 2\, dv$$
$$= \frac{1}{2} \ln 2.$$

Theorem 25.7. Let $g \colon U \to V$ be a diffeomorphism between open subsets of $\mathbb{R}^3$,
$$g(u, v, w) = (x(u, v, w), y(u, v, w), z(u, v, w)).$$
Let $W^* \subset U$ be a region and let $W = g(W^*) \subset V$. Let $f \colon W \to \mathbb{R}$ be a function. Then
$$\iiint_W f(x, y, z)\, dx\, dy\, dz = \iiint_{W^*} f(x(u, v, w), y(u, v, w), z(u, v, w))\, |\det Dg(u, v, w)|\, du\, dv\, dw.$$

As before, it is convenient to introduce more notation:
$$\frac{\partial(x, y, z)}{\partial(u, v, w)}(u, v, w) = \det Dg(u, v, w).$$
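The Jacobian in Example 25.6 can be checked without any symbolic algebra, by differencing the explicit inverse map $x = (uv)^{1/4}$, $y = v/x$. The sketch below is illustrative only: `to_xy`, `jacobian_xy_uv` and `area` are hypothetical names, the partial derivatives are central differences with an assumed step `h`, and the area integral uses a midpoint rule in $u$.

```python
import math

def to_xy(u, v):
    """Inverse change of variables from Example 25.6 (u = x^3 / y, v = x y)."""
    x = (u * v) ** 0.25
    return x, v / x

def jacobian_xy_uv(u, v, h=1e-6):
    """Central-difference estimate of the determinant d(x, y)/d(u, v)."""
    d_du = [(p - m) / (2 * h) for p, m in zip(to_xy(u + h, v), to_xy(u - h, v))]
    d_dv = [(p - m) / (2 * h) for p, m in zip(to_xy(u, v + h), to_xy(u, v - h))]
    # d_du = (x_u, y_u) and d_dv = (x_v, y_v); the determinant is x_u y_v - x_v y_u.
    return d_du[0] * d_dv[1] - d_dv[0] * d_du[1]

def area(n=1000):
    """Integrate 1/(4u) over D* = [1/2, 1] x [1, 3]."""
    step = 0.5 / n
    inner = step * sum(1.0 / (4.0 * (0.5 + (k + 0.5) * step)) for k in range(n))
    return inner * (3.0 - 1.0)  # the integrand does not depend on v
```

At a sample point such as $(u, v) = (0.75, 2)$ the estimated Jacobian should be close to $1/(4u) = 1/3$, and `area()` should be close to $\frac{1}{2}\ln 2$.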


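26. Change of coordinates: II

Example 26.1. Let $D$ be the region bounded by the cardioid, $r = 1 - \cos\theta$. If we multiply both sides by $r$ and take $r\cos\theta$ over to the other side, then we get

$$(x^2 + y^2 + x)^2 = x^2 + y^2.$$

We have

$$
\begin{aligned}
\operatorname{area}(D) &= \iint_D dx\,dy \\
&= \iint_{D^*} r\,dr\,d\theta \\
&= \int_{-\pi}^{\pi}\left(\int_0^{1-\cos\theta} r\,dr\right)d\theta \\
&= \int_{-\pi}^{\pi}\left[\frac{r^2}{2}\right]_0^{1-\cos\theta}d\theta \\
&= \int_{-\pi}^{\pi}\frac{(1-\cos\theta)^2}{2}\,d\theta \\
&= \int_{-\pi}^{\pi}\left(\frac{\cos^2\theta}{2} - \cos\theta + \frac12\right)d\theta \\
&= \left[\frac{\theta}{2} - \sin\theta\right]_{-\pi}^{\pi} + \frac{\pi}{2} \\
&= \frac{3\pi}{2}.
\end{aligned}
$$

Here we used that the integral of $\cos^2\theta/2$ over $[-\pi,\pi]$ is $\pi/2$.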
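The polar-coordinate computation for the cardioid can be mirrored numerically. The following sketch sums the area element $r\,dr\,d\theta$ over the region $0 \le r \le 1-\cos\theta$, $-\pi \le \theta \le \pi$; the function name `cardioid_area` and the grid sizes are assumptions chosen for illustration.

```python
import math

def cardioid_area(n_theta=2000, n_r=200):
    """Midpoint sum of the polar area element r dr dtheta over the cardioid."""
    d_theta = 2.0 * math.pi / n_theta
    total = 0.0
    for i in range(n_theta):
        theta = -math.pi + (i + 0.5) * d_theta
        r_max = 1.0 - math.cos(theta)   # boundary r = 1 - cos(theta)
        d_r = r_max / n_r
        for j in range(n_r):
            r = (j + 0.5) * d_r
            total += r * d_r * d_theta  # the Jacobian factor is r
    return total

print(cardioid_area(), 1.5 * math.pi)  # should agree closely
```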

In $\mathbb{R}^3$, we can use either cylindrical or spherical coordinates instead of Cartesian coordinates. Let's first do the case of cylindrical coordinates. Recall that

$$x = r\cos\theta, \qquad y = r\sin\theta, \qquad z = z.$$

So the Jacobian is given by

$$
\frac{\partial(x,y,z)}{\partial(r,\theta,z)}(r,\theta,z) =
\begin{vmatrix} \cos\theta & -r\sin\theta & 0 \\ \sin\theta & r\cos\theta & 0 \\ 0 & 0 & 1 \end{vmatrix} = r.
$$

So,

$$
\iiint_W f(x,y,z)\,dx\,dy\,dz = \iiint_{W^*} f(r\cos\theta, r\sin\theta, z)\,r\,dr\,d\theta\,dz.
$$

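Example 26.2. Consider a cone $W$ of height $b$ and base radius $a$. Put the vertex of the cone at the point $(0,0,b)$, so that the base of the cone is the circle of radius $a$, centred at the origin, in the $xy$-plane. Note that at height $z$, we have a circle of radius

$$a\left(1 - \frac{z}{b}\right).$$

Therefore

$$
\begin{aligned}
\operatorname{vol}(W) &= \iiint_W dx\,dy\,dz \\
&= \iiint_{W^*} r\,dr\,d\theta\,dz \\
&= \int_0^b\left(\int_0^{2\pi}\left(\int_0^{a(1-z/b)} r\,dr\right)d\theta\right)dz \\
&= \int_0^b\left(\int_0^{2\pi}\left[\frac{r^2}{2}\right]_0^{a(1-z/b)}d\theta\right)dz \\
&= \frac12\int_0^b\left(\int_0^{2\pi} a^2\left(1-\frac zb\right)^2 d\theta\right)dz \\
&= \pi a^2\int_0^b\left(1-\frac zb\right)^2 dz \\
&= -\pi a^2 b\int_1^0 u^2\,du \\
&= \pi a^2 b\int_0^1 u^2\,du \\
&= \frac{\pi a^2 b}{3}.
\end{aligned}
$$

Example 26.3. Consider a ball $W$ of radius $a$. Put the centre of the ball at the point $(0,0,0)$. Note that

$$x^2 + y^2 + z^2 = a^2$$

translates to the equation

$$r^2 + z^2 = a^2,$$

so that

$$r = \sqrt{a^2 - z^2}.$$

Therefore

$$
\begin{aligned}
\operatorname{vol}(W) &= \iiint_W dx\,dy\,dz \\
&= \iiint_{W^*} r\,dr\,d\theta\,dz \\
&= \int_{-a}^{a}\left(\int_0^{2\pi}\left(\int_0^{\sqrt{a^2-z^2}} r\,dr\right)d\theta\right)dz \\
&= \int_{-a}^{a}\left(\int_0^{2\pi}\left[\frac{r^2}{2}\right]_0^{\sqrt{a^2-z^2}}d\theta\right)dz \\
&= \frac12\int_{-a}^{a}\left(\int_0^{2\pi}(a^2 - z^2)\,d\theta\right)dz \\
&= \pi\int_{-a}^{a}(a^2 - z^2)\,dz \\
&= \pi\left[a^2 z - \frac{z^3}{3}\right]_{-a}^{a} \\
&= \frac{4\pi a^3}{3}.
\end{aligned}
$$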

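Now consider using spherical coordinates. Recall that

$$x = \rho\sin\phi\cos\theta, \qquad y = \rho\sin\phi\sin\theta, \qquad z = \rho\cos\phi.$$

So

$$
\begin{aligned}
\frac{\partial(x,y,z)}{\partial(\rho,\phi,\theta)}(\rho,\phi,\theta) &=
\begin{vmatrix}
\sin\phi\cos\theta & \rho\cos\phi\cos\theta & -\rho\sin\phi\sin\theta \\
\sin\phi\sin\theta & \rho\cos\phi\sin\theta & \rho\sin\phi\cos\theta \\
\cos\phi & -\rho\sin\phi & 0
\end{vmatrix} \\
&= \rho^2\cos^2\phi\sin\phi + \rho^2\sin^3\phi \\
&= \rho^2\sin\phi.
\end{aligned}
$$

Notice that this is greater than zero, if $0 < \phi < \pi$. So,

$$
\iiint_W f(x,y,z)\,dx\,dy\,dz = \iiint_{W^*} f(\rho\sin\phi\cos\theta, \rho\sin\phi\sin\theta, \rho\cos\phi)\,\rho^2\sin\phi\,d\rho\,d\phi\,d\theta.
$$

Example 26.4. Consider a ball $W$ of radius $a$. Put the centre of the ball at the point $(0,0,0)$.

$$
\begin{aligned}
\operatorname{vol}(W) &= \iiint_W dx\,dy\,dz \\
&= \iiint_{W^*} \rho^2\sin\phi\,d\rho\,d\phi\,d\theta \\
&= \int_0^{2\pi}\left(\int_0^{\pi}\left(\int_0^a \rho^2\sin\phi\,d\rho\right)d\phi\right)d\theta \\
&= \int_0^{2\pi}\left(\int_0^{\pi}\left[\frac{\rho^3}{3}\right]_0^a\sin\phi\,d\phi\right)d\theta \\
&= \frac{a^3}{3}\int_0^{2\pi}\left(\int_0^{\pi}\sin\phi\,d\phi\right)d\theta \\
&= \frac{a^3}{3}\int_0^{2\pi}\bigl[-\cos\phi\bigr]_0^{\pi}\,d\theta \\
&= \frac{2a^3}{3}\int_0^{2\pi}d\theta \\
&= \frac{4\pi a^3}{3}.
\end{aligned}
$$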
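The spherical Jacobian $\rho^2\sin\phi$ can be checked at a sample point by approximating the derivative matrix with central differences. This is only a sketch: the helper names `spherical_map` and `numerical_jacobian_det`, the step size and the sample point are assumptions introduced for the illustration.

```python
import math

def spherical_map(rho, phi, theta):
    """(rho, phi, theta) -> (x, y, z), with the variable order used above."""
    return (rho * math.sin(phi) * math.cos(theta),
            rho * math.sin(phi) * math.sin(theta),
            rho * math.cos(phi))

def numerical_jacobian_det(g, point, h=1e-6):
    """Determinant of the 3x3 derivative matrix of g at point,
    with each partial derivative estimated by a central difference."""
    rows = []
    for i in range(3):
        plus, minus = list(point), list(point)
        plus[i] += h
        minus[i] -= h
        gp, gm = g(*plus), g(*minus)
        # rows[i][k] approximates d g_k / d p_i (the transpose of Dg);
        # a matrix and its transpose have the same determinant.
        rows.append([(gp[k] - gm[k]) / (2.0 * h) for k in range(3)])
    (a, b, c), (d, e, f), (p, q, r) = rows
    return a * (e * r - f * q) - b * (d * r - f * p) + c * (d * q - e * p)

rho, phi, theta = 2.0, 0.7, 1.3
print(numerical_jacobian_det(spherical_map, (rho, phi, theta)), rho**2 * math.sin(phi))
```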


27. Line integrals

Let $I$ be an open interval and let

$$\vec{x}\colon I \longrightarrow \mathbb{R}^n$$

be a parametrised differentiable curve. If $[a,b] \subset I$ then let $C = \vec{x}([a,b])$ be the image of $[a,b]$ and let $f\colon C \longrightarrow \mathbb{R}$ be a function.

Definition 27.1. The line integral of $f$ along $C$ is

$$\int_C f\,ds = \int_a^b f(\vec{x}(u))\,\|\vec{x}'(u)\|\,du.$$

Let $u\colon J \longrightarrow I$ be a diffeomorphism between two open intervals. Suppose that $u$ is $C^1$.

Definition 27.2. We say that $u$ is orientation-preserving if $u'(t) > 0$ for every $t \in J$. We say that $u$ is orientation-reversing if $u'(t) < 0$ for every $t \in J$.

Notice that $u$ is always either orientation-preserving or orientation-reversing (this is a consequence of the intermediate value theorem, applied to the continuous function $u'(t)$, which is nowhere zero as $u$ is a diffeomorphism).

Define a function

$$\vec{y}\colon J \longrightarrow \mathbb{R}^n$$

by composition, $\vec{y}(t) = \vec{x}(u(t))$, so that $\vec{y} = \vec{x}\circ u$. Now suppose that $u([c,d]) = [a,b]$. Then $C = \vec{y}([c,d])$, so that $\vec{y}$ gives another parametrisation of $C$.

Lemma 27.3.

$$\int_a^b f(\vec{x}(u))\,\|\vec{x}'(u)\|\,du = \int_c^d f(\vec{y}(t))\,\|\vec{y}'(t)\|\,dt.$$

Proof. We deal with the case that $u$ is orientation-reversing. The case that $u$ is orientation-preserving is similar and easier.

As $u$ is orientation-reversing, we have $u(c) = b$ and $u(d) = a$ and $|u'(t)| = -u'(t)$, and so

$$
\begin{aligned}
\int_c^d f(\vec{y}(t))\,\|\vec{y}'(t)\|\,dt &= \int_c^d f(\vec{x}(u(t)))\,\|u'(t)\,\vec{x}'(u(t))\|\,dt \\
&= -\int_c^d f(\vec{x}(u(t)))\,\|\vec{x}'(u(t))\|\,u'(t)\,dt \\
&= -\int_b^a f(\vec{x}(u))\,\|\vec{x}'(u)\|\,du \\
&= \int_a^b f(\vec{x}(u))\,\|\vec{x}'(u)\|\,du. \qquad\square
\end{aligned}
$$

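Now suppose that we have a vector field on $C$,

$$\vec{F}\colon C \longrightarrow \mathbb{R}^n.$$

Definition 27.4. The line integral of $\vec{F}$ along $C$ is

$$\int_C \vec{F}\cdot d\vec{s} = \int_a^b \vec{F}(\vec{x}(u))\cdot\vec{x}'(u)\,du.$$

Note that now the orientation is very important:

Lemma 27.5.

$$
\int_a^b \vec{F}(\vec{x}(u))\cdot\vec{x}'(u)\,du =
\begin{cases}
\displaystyle\int_c^d \vec{F}(\vec{y}(t))\cdot\vec{y}'(t)\,dt & u'(t) > 0 \\[3mm]
\displaystyle-\int_c^d \vec{F}(\vec{y}(t))\cdot\vec{y}'(t)\,dt & u'(t) < 0.
\end{cases}
$$

Proof. We deal with the case that $u$ is orientation-reversing. The case that $u$ is orientation-preserving is similar and easier. As $u$ is orientation-reversing, we have $u(c) = b$ and $u(d) = a$ and so

$$
\begin{aligned}
\int_c^d \vec{F}(\vec{y}(t))\cdot\vec{y}'(t)\,dt &= \int_c^d \vec{F}(\vec{x}(u(t)))\cdot\vec{x}'(u(t))\,u'(t)\,dt \\
&= \int_b^a \vec{F}(\vec{x}(u))\cdot\vec{x}'(u)\,du \\
&= -\int_a^b \vec{F}(\vec{x}(u))\cdot\vec{x}'(u)\,du. \qquad\square
\end{aligned}
$$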
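Lemmas 27.3 and 27.5 can be illustrated numerically. The sketch below integrates over the upper unit semicircle with the parametrisation $\vec{x}(u) = (\cos u, \sin u)$ and with the reversed parametrisation $\vec{y}(t) = \vec{x}(\pi - t)$. The helper names, the choice of $f(x,y) = y$ and $\vec{F}(x,y) = (-y, x)$, and the finite-difference step are all assumptions made for this example.

```python
import math

def velocity(curve, u, h=1e-6):
    """Central-difference estimate of the derivative of a parametrised curve."""
    return [(p - m) / (2.0 * h) for p, m in zip(curve(u + h), curve(u - h))]

def scalar_line_integral(f, curve, a, b, n=20_000):
    """Midpoint approximation of the integral of f(x(u)) * |x'(u)| over [a, b]."""
    du = (b - a) / n
    total = 0.0
    for k in range(n):
        u = a + (k + 0.5) * du
        speed = math.sqrt(sum(v * v for v in velocity(curve, u)))
        total += f(*curve(u)) * speed * du
    return total

def vector_line_integral(F, curve, a, b, n=20_000):
    """Midpoint approximation of the integral of F(x(u)) . x'(u) over [a, b]."""
    du = (b - a) / n
    total = 0.0
    for k in range(n):
        u = a + (k + 0.5) * du
        total += sum(Fi * vi for Fi, vi in zip(F(*curve(u)), velocity(curve, u))) * du
    return total

forward = lambda u: (math.cos(u), math.sin(u))    # x(u), u in [0, pi]
backward = lambda t: forward(math.pi - t)         # y(t) = x(pi - t), orientation-reversing

f = lambda x, y: y
F = lambda x, y: (-y, x)

print(scalar_line_integral(f, forward, 0.0, math.pi),
      scalar_line_integral(f, backward, 0.0, math.pi))   # equal
print(vector_line_integral(F, forward, 0.0, math.pi),
      vector_line_integral(F, backward, 0.0, math.pi))   # opposite signs
```

The scalar integrals agree, while the vector integrals differ by a sign, as the two lemmas predict.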

Example 27.6. If $C$ is a piece of wire and $f(\vec{x})$ is the mass density at $\vec{x} \in C$, then the line integral

$$\int_C f\,ds$$

is the total mass of the curve. Clearly this is always positive, whichever way you parametrise the curve.

Example 27.7. If $C$ is an oriented path and $\vec{F}(\vec{x})$ is a force field, then the line integral

$$\int_C \vec{F}\cdot d\vec{s}$$

is the work done when moving along $C$. If we reverse the orientation, then the sign flips. For example, imagine $C$ is a spiral staircase and $\vec{F}$ is the force due to gravity. Going up the staircase costs energy and going down we gain energy.

Definition 27.8. Let $U \subset \mathbb{R}^k$ and $V \subset \mathbb{R}^l$ be two open subsets. We say that

$$f\colon U \longrightarrow V$$

is smooth if all higher order partials

$$\frac{\partial^n f}{\partial x_{i_1}\cdots\partial x_{i_n}}(x_1, x_2, \dots, x_k)$$

exist and are continuous.

Definition 27.9. Now suppose that $X \subset \mathbb{R}^k$ and $Y \subset \mathbb{R}^l$ are any two subsets. We say that a function

$$\vec{f}\colon X \longrightarrow Y$$

is smooth, if given any point $\vec{a} \in X$ we may find an open subset $U \subset \mathbb{R}^k$ containing $\vec{a}$, and a smooth function

$$\vec{F}\colon U \longrightarrow \mathbb{R}^l$$

such that $\vec{f}(\vec{x}) = \vec{F}(\vec{x})$ for every $\vec{x} \in X \cap U$ (equivalently $\vec{f}|_{X\cap U} = \vec{F}|_{X\cap U}$), and we put

$$D\vec{f}(\vec{x}) = D\vec{F}(\vec{x}).$$

We say that $\vec{f}$ is a (smooth) diffeomorphism if $\vec{f}$ is bijective and both $\vec{f}$ and $\vec{f}^{-1}$ are smooth.

Notice that in the definition of a diffeomorphism we are now requiring more than we did (before we just required that $\vec{f}$ and $\vec{f}^{-1}$ were differentiable).

Remark 27.10. Note that if $X$ is not very "big" then $D\vec{f}(\vec{x})$ might not be unique. For example, if $X = \{\vec{x}\}$ is a single point, then there are very many different ways to extend $\vec{f}$ to a function $\vec{F}$ in an open neighbourhood of $\vec{x}$. In the examples we consider in this class (namely, manifolds with boundary), this will not be an issue.

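Example 27.11. The map

$$\vec{x}\colon [a,b] \longrightarrow \mathbb{R}^n$$

is smooth if and only if there is a constant $\epsilon > 0$ and a smooth function

$$\vec{y}\colon (a - \epsilon, b + \epsilon) \longrightarrow \mathbb{R}^n$$

whose restriction to $[a,b]$ is the function $\vec{x}$,

$$\vec{y}(t) = \vec{x}(t) \qquad\text{for all}\qquad t \in [a,b].$$

Lemma 27.12. If

$$\vec{x}\colon [a,b] \longrightarrow \mathbb{R}^n$$

is injective for all $t \in [a,b]$, then

$$\vec{x}\colon [a,b] \longrightarrow C = \vec{x}([a,b])$$

is a diffeomorphism.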


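28. Manifolds with boundary

Definition 28.1. Upper half space is the set

$$H^m = \{\, (x_1, x_2, \dots, x_m) \mid x_m \ge 0 \,\} \subset \mathbb{R}^m.$$

The boundary of $H^m$ is

$$\partial H^m = \{\, (x_1, x_2, \dots, x_m) \mid x_m = 0 \,\} \subset \mathbb{R}^m.$$

Definition 28.2. A subset $M \subset \mathbb{R}^k$ is a smooth $m$-manifold with boundary if for every $\vec{a} \in M$ there is an open subset $W \subset \mathbb{R}^k$ containing $\vec{a}$, an open subset $U \subset \mathbb{R}^m$, and a diffeomorphism

$$\vec{g}\colon H^m \cap U \longrightarrow M \cap W.$$

The boundary of $M$ is the set of points $\vec{a}$ which correspond to a point of the boundary of $H^m$.

Example 28.3. The solid ellipse,

$$M = \left\{\, (x,y) \in \mathbb{R}^2 \;\middle|\; \left(\frac{x}{a}\right)^2 + \left(\frac{y}{b}\right)^2 \le 1 \,\right\},$$

is a 2-manifold with boundary. Let

$$U_1 = \{\, (u,v) \mid 0 < u < 2\pi,\ -1 < v < 1 \,\} \qquad\text{and}\qquad W_1 = \mathbb{R}^2 - \{\, (x,0) \in \mathbb{R}^2 \mid x \ge 0 \,\}.$$

Define a function $\vec{g}_1\colon U_1 \longrightarrow W_1$ by the rule

$$\vec{g}_1(u,v) = (a(1-v)\cos u,\ b(1-v)\sin u).$$

Similarly, let

$$U_2 = \{\, (u,v) \mid -\pi < u < \pi,\ -1 < v < 1 \,\} \qquad\text{and}\qquad W_2 = \mathbb{R}^2 - \{\, (x,0) \in \mathbb{R}^2 \mid x \le 0 \,\}.$$

Define a function $\vec{g}_2\colon U_2 \longrightarrow W_2$ by the rule

$$\vec{g}_2(u,v) = (a(1-v)\cos u,\ b(1-v)\sin u).$$

Finally, let

$$U_3 = \{\, (u,v) \mid u^2 + (v-b)^2 < b \,\} \qquad\text{and}\qquad W_3 = \{\, (x,y) \mid x^2 + y^2 < b \,\}.$$

Define a function $\vec{g}_3\colon U_3 \longrightarrow W_3$ by the rule

$$\vec{g}_3(u,v) = (u,\ v - b).$$

Let $\vec{F}\colon \mathbb{R}^2 \longrightarrow \mathbb{R}^2$ be the function $\vec{F}(x,y) = (-y, x)$. Parametrising $\partial M$ by $\vec{x}(t) = (a\cos t, b\sin t)$, $0 \le t \le 2\pi$, so that $\vec{x}'(t) = (-a\sin t, b\cos t)$, we get

$$
\begin{aligned}
\int_{\partial M}\vec{F}\cdot d\vec{s} &= \int_0^{2\pi}(-b\sin t,\ a\cos t)\cdot(-a\sin t,\ b\cos t)\,dt \\
&= \int_0^{2\pi} ab\,dt \\
&= 2\pi ab.
\end{aligned}
$$

On the other hand,

$$\operatorname{curl}\vec{F} = \frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y} = 1 - (-1) = 2,$$

and so, as the area of $M$ is $\pi ab$,

$$\iint_M \operatorname{curl}\vec{F}\,dx\,dy = 2\pi ab.$$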
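Both sides of the identity in Example 28.3 can be approximated numerically. The sketch below uses $\vec{F}(x,y) = (-y, x)$ on the solid ellipse; the function names `boundary_circulation` and `curl_integral`, and the "elliptical polar" substitution $x = as\cos t$, $y = bs\sin t$ (with Jacobian $abs$) used for the double integral, are assumptions introduced for this illustration.

```python
import math

def boundary_circulation(a, b, n=10_000):
    """Line integral of F = (-y, x) around the ellipse x = a cos t, y = b sin t."""
    dt = 2.0 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        x, y = a * math.cos(t), b * math.sin(t)
        dx, dy = -a * math.sin(t), b * math.cos(t)   # components of x'(t)
        total += (-y * dx + x * dy) * dt
    return total

def curl_integral(a, b, n_s=400, n_t=400):
    """Double integral of curl F = 2 over the solid ellipse, using
    x = a s cos t, y = b s sin t with area element (a b s) ds dt."""
    ds, dt = 1.0 / n_s, 2.0 * math.pi / n_t
    total = 0.0
    for i in range(n_s):
        s = (i + 0.5) * ds
        for _ in range(n_t):
            total += 2.0 * (a * b * s) * ds * dt
    return total

a, b = 3.0, 2.0
print(boundary_circulation(a, b), curl_integral(a, b), 2.0 * math.pi * a * b)
```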

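In fact this is not a coincidence:

Theorem 28.4 (Green's Theorem). Let $M \subset \mathbb{R}^2$ be a smooth 2-manifold with boundary, and let $\vec{F}\colon M \longrightarrow \mathbb{R}^2$ be a smooth vector field such that

$$\{\, (x,y) \in M \mid \vec{F}(x,y) \ne \vec{0} \,\} \subset \mathbb{R}^2$$

is a bounded subset. Then

$$\iint_M \left(\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y}\right)dx\,dy = \int_{\partial M}\vec{F}\cdot d\vec{s}.$$

Here $\partial M$ is oriented, so that $M$ is on the left as we go around $\partial M$ (in the positive direction).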

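Proof. In the first step we assume that $M = H^2 = \{\, (u,v) \mid v \ge 0 \,\} \subset \mathbb{R}^2$. By assumption we may find $a$ and $b$ such that $\vec{F} = \vec{0}$ outside the box

$$[-a/2, a/2] \times [0, b/2] \subset H^2.$$

So $\vec{F}(u,v) = \vec{0}$ if $u = \pm a$ or $v = b$. Let's calculate the LHS,

$$
\begin{aligned}
\iint_{H^2}\left(\frac{\partial F_2}{\partial u} - \frac{\partial F_1}{\partial v}\right)du\,dv
&= \iint_{H^2}\frac{\partial F_2}{\partial u}\,du\,dv - \iint_{H^2}\frac{\partial F_1}{\partial v}\,du\,dv \\
&= \int_0^b\!\!\int_{-a}^{a}\frac{\partial F_2}{\partial u}\,du\,dv - \int_{-a}^{a}\!\!\int_0^b\frac{\partial F_1}{\partial v}\,dv\,du \\
&= \int_0^b \bigl(F_2(a,v) - F_2(-a,v)\bigr)\,dv - \int_{-a}^{a}\bigl(F_1(u,b) - F_1(u,0)\bigr)\,du \\
&= \int_{-a}^{a} F_1(u,0)\,du.
\end{aligned}
$$

Okay, now let's parametrise the boundary of the upper half plane,

$$\vec{x}\colon \mathbb{R} \longrightarrow \partial H^2,$$

by the rule $\vec{x}(u) = (u, 0)$. Then $\vec{x}'(u) = \hat{\imath}$. Let's calculate the RHS,

$$
\begin{aligned}
\int_{\partial H^2}\vec{F}\cdot d\vec{s} &= \int_{-a}^{a}\vec{F}(\vec{x}(u))\cdot\vec{x}'(u)\,du \\
&= \int_{-a}^{a}\vec{F}(u,0)\cdot\hat{\imath}\,du \\
&= \int_{-a}^{a} F_1(u,0)\,du.
\end{aligned}
$$

So the result holds if $M = H^2$. This completes the first step.

In the second step, we suppose that there is a diffeomorphism

$$\vec{g}\colon H^2 \cap U \longrightarrow M \cap W,$$

such that for some positive real numbers $a$ and $b$, we have
(1) $[-a, a] \times [0, b] \subset H^2 \cap U$,
(2) $\vec{F} = \vec{0}$ outside $\vec{g}([-a/2, a/2] \times [0, b])$, and
(3) $\det D\vec{g}(u,v) > 0$ for every $(u,v) \in H^2 \cap U$.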

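In this case, parametrise $\partial M \cap W$ as follows; define

$$\vec{s}\colon (-a, a) \longrightarrow \partial M \cap W$$

by the rule $\vec{s}(u) = \vec{g}(\vec{x}(u)) = \vec{g}(u, 0)$. Note that this is compatible with the orientation, as we are assuming that the Jacobian of $\vec{g}$ is positive.

$$
\begin{aligned}
\int_{\partial M}\vec{F}\cdot d\vec{s} &= \int_{-a}^{a}\vec{F}(\vec{s}(u))\cdot\vec{s}'(u)\,du \\
&= \int_{-a}^{a}\vec{F}(\vec{g}(\vec{x}(u)))\,D\vec{g}(\vec{x}(u))\cdot\vec{x}'(u)\,du \\
&= \int_{\partial H^2}\vec{G}\cdot d\vec{s},
\end{aligned}
$$

where $\vec{G}\colon H^2 \longrightarrow \mathbb{R}^2$ is defined by the rule

$$
\vec{G}(u,v) = \begin{cases} \vec{F}(\vec{g}(u,v))\,D\vec{g}(u,v) & \text{if } (u,v) \in U \\ \vec{0} & \text{otherwise.} \end{cases}
$$

Now we compute,

$$
\begin{aligned}
\frac{\partial G_2}{\partial u} - \frac{\partial G_1}{\partial v}
&= \frac{\partial}{\partial u}\left(F_1\frac{\partial x}{\partial v} + F_2\frac{\partial y}{\partial v}\right) - \frac{\partial}{\partial v}\left(F_1\frac{\partial x}{\partial u} + F_2\frac{\partial y}{\partial u}\right) \\
&= \left(\frac{\partial F_1}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial F_1}{\partial y}\frac{\partial y}{\partial u}\right)\frac{\partial x}{\partial v} + F_1\frac{\partial^2 x}{\partial u\,\partial v} \\
&\quad + \left(\frac{\partial F_2}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial F_2}{\partial y}\frac{\partial y}{\partial u}\right)\frac{\partial y}{\partial v} + F_2\frac{\partial^2 y}{\partial u\,\partial v} \\
&\quad - \left(\frac{\partial F_1}{\partial x}\frac{\partial x}{\partial v} + \frac{\partial F_1}{\partial y}\frac{\partial y}{\partial v}\right)\frac{\partial x}{\partial u} - F_1\frac{\partial^2 x}{\partial v\,\partial u} \\
&\quad - \left(\frac{\partial F_2}{\partial x}\frac{\partial x}{\partial v} + \frac{\partial F_2}{\partial y}\frac{\partial y}{\partial v}\right)\frac{\partial y}{\partial u} - F_2\frac{\partial^2 y}{\partial v\,\partial u} \\
&= \frac{\partial F_2}{\partial x}\left(\frac{\partial x}{\partial u}\frac{\partial y}{\partial v} - \frac{\partial x}{\partial v}\frac{\partial y}{\partial u}\right) - \frac{\partial F_1}{\partial y}\left(\frac{\partial x}{\partial u}\frac{\partial y}{\partial v} - \frac{\partial x}{\partial v}\frac{\partial y}{\partial u}\right) \\
&= \left(\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y}\right)\frac{\partial(x,y)}{\partial(u,v)}.
\end{aligned}
$$

Using this, we get

$$
\begin{aligned}
\iint_{H^2}\left(\frac{\partial G_2}{\partial u} - \frac{\partial G_1}{\partial v}\right)du\,dv
&= \iint_{H^2\cap U}\left(\frac{\partial G_2}{\partial u} - \frac{\partial G_1}{\partial v}\right)du\,dv \\
&= \iint_{H^2\cap U}\left(\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y}\right)\frac{\partial(x,y)}{\partial(u,v)}\,du\,dv \\
&= \iint_{M\cap W}\left(\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y}\right)dx\,dy \\
&= \iint_{M}\left(\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y}\right)dx\,dy.
\end{aligned}
$$

Putting all of this together, we have

$$
\begin{aligned}
\iint_M\left(\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y}\right)dx\,dy
&= \iint_{H^2}\left(\frac{\partial G_2}{\partial u} - \frac{\partial G_1}{\partial v}\right)du\,dv \\
&= \int_{\partial H^2}\vec{G}\cdot d\vec{s} \\
&= \int_{\partial M}\vec{F}\cdot d\vec{s}.
\end{aligned}
$$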

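This completes step 2. We now turn to the third and final step. To complete the proof, we need to invoke the existence of partitions of unity. Starting with $\vec{F}$, I claim that there are finitely many vector fields $\vec{F}_1, \vec{F}_2, \dots, \vec{F}_k$, each of which satisfies the hypotheses of step 2, such that

$$\vec{F} = \vec{F}_1 + \vec{F}_2 + \cdots + \vec{F}_k = \sum_{i=1}^{k}\vec{F}_i.$$

Indeed, start with a partition of unity,

$$1 = \sum_{i=1}^{m}\rho_i,$$

and multiply both sides by $\vec{F}$, to get

$$\vec{F} = \sum_{i=1}^{m}\rho_i\vec{F} = \sum_{i=1}^{m}\vec{F}_i.$$

Granted this, Green's Theorem follows very easily,

$$
\begin{aligned}
\iint_M\left(\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y}\right)dx\,dy
&= \sum_{i=1}^{k}\iint_M\left(\frac{\partial F_{i,2}}{\partial x} - \frac{\partial F_{i,1}}{\partial y}\right)dx\,dy \\
&= \sum_{i=1}^{k}\int_{\partial M}\vec{F}_i\cdot d\vec{s} \\
&= \int_{\partial M}\vec{F}\cdot d\vec{s}. \qquad\square
\end{aligned}
$$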

Lemma 28.5. Let $K \subset \mathbb{R}^n$. Suppose that $K$ is contained in the union of closed balls $B_1, B_2, \dots, B_m$, such that any point of $K$ belongs to the interior of at least one of $B_1, B_2, \dots, B_m$. Then we may find smooth functions $\rho_1, \rho_2, \dots, \rho_m$ such that $\rho_i$ is zero outside $B_i$ and

$$1 = \sum_{i=1}^{m}\rho_i$$

on $K$.

Proof. We prove the case $n = 2$. The general case is similar, only notationally more involved. First observe that it is enough to find smooth functions $\sigma_1, \sigma_2, \dots, \sigma_m$, such that $\sigma_i$ is zero outside $B_i$ and such that

$$\sigma = \sum_{i=1}^{m}\sigma_i$$

does not vanish at any point of $K$. Indeed, if we let

$$\rho_i = \frac{\sigma_i}{\sigma},$$

then $\rho_i$ is smooth, it vanishes outside $B_i$ and, dividing both sides of the equation above by $\sigma$, we have

$$1 = \sum_{i=1}^{m}\rho_i.$$

In fact it suffices to find functions $\sigma_1, \sigma_2, \dots, \sigma_m$, such that $\sigma_i$ vanishes outside $B_i$ and is non-zero on the interior of $B_i$ (replacing $\sigma_i$ by $\sigma_i^2$, so that $\sigma_i^2$ is positive on the interior of $B_i$, we get rid of the annoying possibility that the sum is zero because of cancelling). It is enough to do this for one solid circle $B_i$ and we might as well assume that $B = B_i$ is the solid unit circle. Using polar coordinates, we want a function of one variable $r$ which is zero outside $[0,1]$ and which is non-zero on $(0,1)$, so we are now down to a one variable question.

At this point we realise we want a smooth function

$$f\colon \mathbb{R} \longrightarrow \mathbb{R},$$

all of whose derivatives are zero at $0$ and yet the function $f$ is not the zero function. Such a function is given by

$$f(x) = e^{-1/x^2}$$

(with $f(0) = 0$).
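To make the construction in the proof concrete, here is a one-dimensional sketch of a partition of unity built from $e^{-1/x^2}$. It is an illustration under stated assumptions rather than the construction of the notes: the one-sided variant `flat` (equal to $e^{-1/x^2}$ for $x > 0$ and $0$ for $x \le 0$), the helper names, and the two overlapping intervals covering $K = [0, 2]$ are all hypothetical choices.

```python
import math

def flat(x):
    """exp(-1/x^2) for x > 0, extended by 0 for x <= 0; smooth on the real line."""
    if x <= 0.0:
        return 0.0
    sq = x * x
    return math.exp(-1.0 / sq) if sq > 0.0 else 0.0

def bump(x, centre, radius):
    """sigma_i: positive on (centre - radius, centre + radius), zero elsewhere."""
    return flat(x - (centre - radius)) * flat((centre + radius) - x)

def partition_of_unity(x, intervals):
    """rho_i = sigma_i / sigma, where sigma is the sum of the sigma_i.
    Requires x to lie in the interior of at least one interval."""
    sigmas = [bump(x, c, r) for c, r in intervals]
    total = sum(sigmas)
    return [s / total for s in sigmas]

# Two "balls" (intervals) given as (centre, radius), covering K = [0, 2].
intervals = [(0.5, 1.0), (1.5, 1.0)]
for x in (0.0, 0.3, 1.0, 1.7, 2.0):
    rhos = partition_of_unity(x, intervals)
    print(x, rhos, sum(rhos))
```

Each $\rho_i$ vanishes outside its own interval, and the values sum to $1$ at every point of $K$.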




29. Conservative vector fields revisited

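Let $U \subset \mathbb{R}^2$ be an open subset. Given a smooth function $f\colon U \longrightarrow \mathbb{R}$ we get a smooth vector field by taking $\vec{F} = \operatorname{grad} f$. Given a smooth vector field $\vec{F}\colon U \longrightarrow \mathbb{R}^2$ we get a function by taking $f = \operatorname{curl}\vec{F}$.

Suppose that $M \subset U$ is a smooth 2-manifold with boundary. If we start with $\vec{F}$, then we have

$$\iint_M \operatorname{curl}\vec{F}\,dA = \int_{\partial M}\vec{F}\cdot d\vec{s}.$$

Suppose we start with $f$, and let $C$ be a smooth oriented curve. Pick a parametrisation,

$$\vec{x}\colon [a,b] \longrightarrow U,$$

such that $\vec{x}(a) = P$ and $\vec{x}(b) = Q$. Then we have

$$
\begin{aligned}
\int_C \operatorname{grad} f\cdot d\vec{s} &= \int_a^b \operatorname{grad} f(\vec{x}(t))\cdot\vec{x}'(t)\,dt \\
&= \int_a^b \frac{d}{dt}f(\vec{x}(t))\,dt \\
&= f(\vec{x}(b)) - f(\vec{x}(a)) \\
&= f(Q) - f(P).
\end{aligned}
$$

Definition 29.1. We say that $X \subset \mathbb{R}^n$ is star-shaped with respect to $P \in X$, if given any point $Q \in X$ then the point

$$P + t\,\overrightarrow{PQ} \in X,$$

for every $t \in [0,1]$. In other words, the line segment connecting $P$ to $Q$ belongs to $X$.

Theorem 29.2. Let $U \subset \mathbb{R}^2$ be open and star-shaped and let $\vec{F}\colon U \longrightarrow \mathbb{R}^2$ be a smooth vector field. The following are equivalent:
(1) $\operatorname{curl}\vec{F} = 0$.
(2) $\vec{F} = \operatorname{grad} f$ for some smooth function $f\colon U \longrightarrow \mathbb{R}$.

Proof. (2) implies (1) is easy. We check (1) implies (2). Suppose that $U$ is star-shaped with respect to $P = (x_0, y_0)$. Parametrise the line $L$ from $P$ to $Q = (x, y)$ as follows

$$P + t\,\overrightarrow{PQ} = (x_0 + t(x - x_0),\ y_0 + t(y - y_0)) = P_t, \qquad\text{for } 0 \le t \le 1.$$

Define

$$
\begin{aligned}
f(x,y) &= \int_L \vec{F}\cdot d\vec{s} \\
&= \int_0^1 \bigl((x - x_0)F_1(P_t) + (y - y_0)F_2(P_t)\bigr)\,dt.
\end{aligned}
$$

Then

$$
\begin{aligned}
\frac{\partial f}{\partial x} &= \int_0^1 \frac{\partial}{\partial x}\bigl((x - x_0)F_1(P_t) + (y - y_0)F_2(P_t)\bigr)\,dt \\
&= \int_0^1 \left(F_1(P_t) + t(x - x_0)\frac{\partial F_1}{\partial x}(P_t) + t(y - y_0)\frac{\partial F_2}{\partial x}(P_t)\right)dt.
\end{aligned}
$$

On the other hand,

$$\frac{\partial}{\partial t}\bigl(tF_1(P_t)\bigr) = F_1(P_t) + t(x - x_0)\frac{\partial F_1}{\partial x}(P_t) + t(y - y_0)\frac{\partial F_1}{\partial y}(P_t).$$

Since $\operatorname{curl}\vec{F} = 0$, we have

$$\frac{\partial F_1}{\partial y} = \frac{\partial F_2}{\partial x},$$

and so

$$\frac{\partial f}{\partial x} = \int_0^1 \frac{\partial}{\partial t}\bigl(tF_1(P_t)\bigr)\,dt = F_1(x,y).$$

Similarly

$$\frac{\partial f}{\partial y} = F_2(x,y).$$

It follows that $\vec{F} = \operatorname{grad} f$. $\square$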
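The construction of $f$ in the proof of Theorem 29.2 can be tried out numerically. This sketch integrates a curl-free field along the segment from a base point to $(x, y)$; the function name `potential`, the sample field $\vec{F} = (2xy, x^2)$ (the gradient of $x^2y$) and the number of subintervals are assumptions chosen for illustration.

```python
def potential(F, x, y, x0=0.0, y0=0.0, n=2000):
    """Midpoint approximation of the line integral of F along the straight
    segment P_t = (x0 + t(x - x0), y0 + t(y - y0)), 0 <= t <= 1."""
    dt = 1.0 / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        px, py = x0 + t * (x - x0), y0 + t * (y - y0)
        F1, F2 = F(px, py)
        total += ((x - x0) * F1 + (y - y0) * F2) * dt
    return total

def F(x, y):
    # Curl-free sample field: dF2/dx - dF1/dy = 2x - 2x = 0.
    return (2.0 * x * y, x * x)

# With base point (0, 0) the potential should equal x^2 y.
print(potential(F, 1.5, -0.7), 1.5**2 * (-0.7))
# With base point (1, 2) it differs by the constant 1^2 * 2 = 2.
print(potential(F, 1.5, -0.7, x0=1.0, y0=2.0), 1.5**2 * (-0.7) - 2.0)
```

Changing the base point only shifts $f$ by a constant, so the gradient is unaffected.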



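Definition 29.3. Let $\vec{F}\colon U \longrightarrow \mathbb{R}^2$ be a vector field. Define another vector field by the rule

$$*\vec{F}\colon U \longrightarrow \mathbb{R}^2 \qquad\text{where}\qquad *\vec{F} = (-F_2, F_1).$$

Theorem 29.4 (Divergence theorem in the plane). Suppose that $M \subset \mathbb{R}^2$ is a smooth 2-manifold with boundary $\partial M$. If $\vec{F}\colon U \longrightarrow \mathbb{R}^2$ is a smooth vector field, then

$$\iint_M \operatorname{div}\vec{F}\,dA = \int_{\partial M}\vec{F}\cdot\hat{n}\,ds,$$

where $\hat{n}$ is the unit normal vector of the smooth oriented curve $C = \partial M$ which points out of $M$.

Proof. Note that

$$\operatorname{curl}(*\vec{F}) = \operatorname{div}\vec{F},$$

and

$$*\vec{F}\cdot d\vec{s} = (\vec{F}\cdot\hat{n})\,ds,$$

and so the result follows from Green's theorem applied to $*\vec{F}$. $\square$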




30. Surface integrals

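Suppose we are given a smooth 2-manifold $M \subset \mathbb{R}^3$. Let

$$\vec{g}\colon U \longrightarrow M \cap W$$

be a diffeomorphism, where $U \subset \mathbb{R}^2$, with coordinates $s$ and $t$. We can define two tangent vectors, which span the tangent plane to $M$ at $P = \vec{g}(s_0, t_0)$:

$$\vec{T}_s(s_0,t_0) = \frac{\partial\vec{g}}{\partial s}(s_0,t_0), \qquad \vec{T}_t(s_0,t_0) = \frac{\partial\vec{g}}{\partial t}(s_0,t_0).$$

We get an element of area on $M$,

$$dS = \|\vec{T}_s \times \vec{T}_t\|\,ds\,dt.$$

Using this we can define the area of $M \cap W$ to be

$$\operatorname{area}(M \cap W) = \iint_{M\cap W} dS = \iint_U \|\vec{T}_s \times \vec{T}_t\|\,ds\,dt.$$

Example 30.1. We can parametrise the torus,

$$M = \{\, (x,y,z) \mid (a - \sqrt{x^2 + y^2})^2 + z^2 = b^2 \,\},$$

as follows. Let $U = (0, 2\pi) \times (0, 2\pi)$, and

$$W = \mathbb{R}^3 \setminus \{\, (x,y,z) \mid x \ge 0 \text{ and } y = 0, \text{ or } x^2 + y^2 \ge a^2 \text{ and } z = 0 \,\}.$$

Let $\vec{g}\colon U \longrightarrow M \cap W$ be the function

$$\vec{g}(s,t) = ((a + b\cos t)\cos s,\ (a + b\cos t)\sin s,\ b\sin t).$$

Let's calculate the tangent vectors,

$$
\begin{aligned}
\vec{T}_s &= \frac{\partial\vec{g}}{\partial s} = (-(a + b\cos t)\sin s,\ (a + b\cos t)\cos s,\ 0), \\
\vec{T}_t &= \frac{\partial\vec{g}}{\partial t} = (-b\sin t\cos s,\ -b\sin t\sin s,\ b\cos t).
\end{aligned}
$$

So

$$
\begin{aligned}
\vec{T}_s \times \vec{T}_t &=
\begin{vmatrix}
\hat{\imath} & \hat{\jmath} & \hat{k} \\
-(a + b\cos t)\sin s & (a + b\cos t)\cos s & 0 \\
-b\sin t\cos s & -b\sin t\sin s & b\cos t
\end{vmatrix} \\
&= (a + b\cos t)b\cos s\cos t\,\hat{\imath} + (a + b\cos t)b\sin s\cos t\,\hat{\jmath} + (a + b\cos t)b\sin t\,\hat{k}.
\end{aligned}
$$

Therefore,

$$\|\vec{T}_s \times \vec{T}_t\| = (a + b\cos t)b\,(\cos^2 s\cos^2 t + \sin^2 s\cos^2 t + \sin^2 t)^{1/2} = (a + b\cos t)b.$$

As $a > b$, note that $(a + b\cos t)b > 0$. Hence

$$
\begin{aligned}
\operatorname{area}(M) &= \operatorname{area}(M \cap W) \\
&= \iint_{M\cap W} dS \\
&= \iint_U \|\vec{T}_s \times \vec{T}_t\|\,ds\,dt \\
&= \int_0^{2\pi}\!\!\int_0^{2\pi}(a + b\cos t)b\,ds\,dt \\
&= 2\pi b\int_0^{2\pi}(a + b\cos t)\,dt \\
&= 4\pi^2 ab.
\end{aligned}
$$

Notice that this is the surface area of a cylinder of radius $b$ and height $2\pi a$, as expected.

Example 30.2. We can parametrise the sphere,

$$M = \{\, (x,y,z) \mid x^2 + y^2 + z^2 = a^2 \,\},$$

as follows. Let $U = (0, \pi) \times (0, 2\pi)$, and

$$W = \mathbb{R}^3 \setminus \{\, (x,y,z) \mid x \ge 0 \text{ and } y = 0 \,\}.$$

Let $\vec{g}\colon U \longrightarrow M \cap W$ be the function

$$\vec{g}(\phi,\theta) = (a\sin\phi\cos\theta,\ a\sin\phi\sin\theta,\ a\cos\phi).$$

Let's calculate the tangent vectors,

$$
\begin{aligned}
\vec{T}_\phi &= \frac{\partial\vec{g}}{\partial\phi} = (a\cos\phi\cos\theta,\ a\cos\phi\sin\theta,\ -a\sin\phi), \\
\vec{T}_\theta &= \frac{\partial\vec{g}}{\partial\theta} = (-a\sin\phi\sin\theta,\ a\sin\phi\cos\theta,\ 0).
\end{aligned}
$$

So

$$
\begin{aligned}
\vec{T}_\phi \times \vec{T}_\theta &=
\begin{vmatrix}
\hat{\imath} & \hat{\jmath} & \hat{k} \\
a\cos\phi\cos\theta & a\cos\phi\sin\theta & -a\sin\phi \\
-a\sin\phi\sin\theta & a\sin\phi\cos\theta & 0
\end{vmatrix} \\
&= a^2\sin^2\phi\cos\theta\,\hat{\imath} + a^2\sin^2\phi\sin\theta\,\hat{\jmath} + a^2\cos\phi\sin\phi\,\hat{k}.
\end{aligned}
$$

Therefore,

$$\|\vec{T}_\phi \times \vec{T}_\theta\| = a^2\sin\phi\,(\sin^2\phi\cos^2\theta + \sin^2\phi\sin^2\theta + \cos^2\phi)^{1/2} = a^2\sin\phi.$$

As $0 < \phi < \pi$, note that $a^2\sin\phi > 0$. Hence

$$
\begin{aligned}
\operatorname{area}(M) &= \operatorname{area}(M \cap W) \\
&= \iint_{M\cap W} dS \\
&= \iint_U \|\vec{T}_\phi \times \vec{T}_\theta\|\,d\phi\,d\theta \\
&= \int_0^{2\pi}\!\!\int_0^{\pi} a^2\sin\phi\,d\phi\,d\theta \\
&= \int_0^{2\pi} 2a^2\,d\theta \\
&= 4\pi a^2.
\end{aligned}
$$

Notice that this is the surface area of a sphere of radius $a$.

Let's now suppose that there are two different ways to parametrise the same piece $M \cap W$ of the manifold $M$:

$$\vec{g}\colon U \longrightarrow M \cap W \qquad\text{and}\qquad \vec{h}\colon V \longrightarrow M \cap W.$$

Let us use $(u,v)$ coordinates for $U \subset \mathbb{R}^2$ and $(s,t)$ coordinates for $V \subset \mathbb{R}^2$. Then

$$\vec{f} = (\vec{h})^{-1}\circ\vec{g}\colon U \longrightarrow V$$

is a diffeomorphism. Note that $\vec{g} = \vec{h}\circ\vec{f}$. We then have

$$
\begin{aligned}
\frac{\partial\vec{g}}{\partial u}(u,v) &= \frac{\partial(\vec{h}\circ\vec{f})}{\partial u}(u,v) \\
&= \frac{\partial\vec{h}}{\partial s}(s,t)\frac{\partial s}{\partial u}(u,v) + \frac{\partial\vec{h}}{\partial t}(s,t)\frac{\partial t}{\partial u}(u,v).
\end{aligned}
$$

Similarly

$$
\begin{aligned}
\frac{\partial\vec{g}}{\partial v}(u,v) &= \frac{\partial(\vec{h}\circ\vec{f})}{\partial v}(u,v) \\
&= \frac{\partial\vec{h}}{\partial s}(s,t)\frac{\partial s}{\partial v}(u,v) + \frac{\partial\vec{h}}{\partial t}(s,t)\frac{\partial t}{\partial v}(u,v).
\end{aligned}
$$

Hence

$$
\begin{aligned}
\frac{\partial\vec{g}}{\partial u}\times\frac{\partial\vec{g}}{\partial v}
&= \left(\frac{\partial\vec{h}}{\partial s}\frac{\partial s}{\partial u} + \frac{\partial\vec{h}}{\partial t}\frac{\partial t}{\partial u}\right)\times\left(\frac{\partial\vec{h}}{\partial s}\frac{\partial s}{\partial v} + \frac{\partial\vec{h}}{\partial t}\frac{\partial t}{\partial v}\right) \\
&= \frac{\partial\vec{h}}{\partial s}\times\frac{\partial\vec{h}}{\partial t}\left(\frac{\partial s}{\partial u}\frac{\partial t}{\partial v} - \frac{\partial s}{\partial v}\frac{\partial t}{\partial u}\right) \\
&= \frac{\partial\vec{h}}{\partial s}\times\frac{\partial\vec{h}}{\partial t}\,\frac{\partial(s,t)}{\partial(u,v)}.
\end{aligned}
$$

It follows that

$$\left\|\frac{\partial\vec{g}}{\partial u}\times\frac{\partial\vec{g}}{\partial v}\right\| = \left\|\frac{\partial\vec{h}}{\partial s}\times\frac{\partial\vec{h}}{\partial t}\right\|\left|\frac{\partial(s,t)}{\partial(u,v)}\right|.$$

Hence

$$
\begin{aligned}
\iint_U \left\|\frac{\partial\vec{g}}{\partial u}\times\frac{\partial\vec{g}}{\partial v}\right\|du\,dv
&= \iint_U \left\|\frac{\partial\vec{h}}{\partial s}\times\frac{\partial\vec{h}}{\partial t}\right\|\left|\frac{\partial(s,t)}{\partial(u,v)}\right|du\,dv \\
&= \iint_V \left\|\frac{\partial\vec{h}}{\partial s}\times\frac{\partial\vec{h}}{\partial t}\right\|ds\,dt.
\end{aligned}
$$

Notice that the first term is precisely the integral we use to define the area of $M \cap W$. This formula then says that the area is independent of the choice of parametrisation.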
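The area formula $\iint_U \|\vec{T}_s \times \vec{T}_t\|\,ds\,dt$ lends itself to a direct numerical check. The sketch below applies it to the torus parametrisation of Example 30.1, estimating the tangent vectors by central differences. The names `torus`, `cross` and `surface_area`, the sample radii $a = 2$, $b = 1$, and the grid and step sizes are assumptions made for illustration.

```python
import math

def torus(s, t, a=2.0, b=1.0):
    """Parametrisation g(s, t) of the torus from Example 30.1."""
    return ((a + b * math.cos(t)) * math.cos(s),
            (a + b * math.cos(t)) * math.sin(s),
            b * math.sin(t))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def surface_area(g, s_max, t_max, n=200, h=1e-5):
    """Midpoint sum of |T_s x T_t| over (0, s_max) x (0, t_max)."""
    ds, dt = s_max / n, t_max / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * ds
        for j in range(n):
            t = (j + 0.5) * dt
            T_s = [(p - m) / (2 * h) for p, m in zip(g(s + h, t), g(s - h, t))]
            T_t = [(p - m) / (2 * h) for p, m in zip(g(s, t + h), g(s, t - h))]
            normal = cross(T_s, T_t)
            total += math.sqrt(sum(c * c for c in normal)) * ds * dt
    return total

area = surface_area(torus, 2 * math.pi, 2 * math.pi)
print(area, 4 * math.pi**2 * 2.0 * 1.0)  # compare with 4 pi^2 a b
```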


31. Flux

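Let $S \subset \mathbb{R}^3$ be a smooth 2-manifold and let

$$\vec{g}\colon U \longrightarrow S \cap W$$

be a diffeomorphism.

Definition 31.1. Let $f\colon S \cap W \longrightarrow \mathbb{R}$ and $\vec{F}\colon S \cap W \longrightarrow \mathbb{R}^3$ be two functions, the first a scalar function and the second a vector field. We define

$$
\begin{aligned}
\iint_{S\cap W} f\,dS &= \iint_U f(\vec{g}(s,t))\left\|\frac{\partial\vec{g}}{\partial s}\times\frac{\partial\vec{g}}{\partial t}\right\|ds\,dt, \\
\iint_{S\cap W}\vec{F}\cdot d\vec{S} &= \iint_U \vec{F}(\vec{g}(s,t))\cdot\left(\frac{\partial\vec{g}}{\partial s}\times\frac{\partial\vec{g}}{\partial t}\right)ds\,dt.
\end{aligned}
$$

The second integral is called the flux of $\vec{F}$ across $S$ in the direction of

$$\hat{n} = \frac{\dfrac{\partial\vec{g}}{\partial s}\times\dfrac{\partial\vec{g}}{\partial t}}{\left\|\dfrac{\partial\vec{g}}{\partial s}\times\dfrac{\partial\vec{g}}{\partial t}\right\|}.$$

Note that

$$\iint_{S\cap W}\vec{F}\cdot d\vec{S} = \iint_{S\cap W}(\vec{F}\cdot\hat{n})\,dS.$$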

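Note also that one can define the surface integrals of $f$ and $\vec{F}$ over the whole of $S$ using partitions of unity.

Example 31.2. Find the flux of the vector field given by

$$\vec{F}(x,y,z) = y\,\hat{\imath} + z\,\hat{\jmath} + x\,\hat{k},$$

through the triangle $S$ with vertices

$$P_0 = (1, 2, -1), \qquad P_1 = (2, 1, 1) \qquad\text{and}\qquad P_2 = (3, -1, 2),$$

in the direction of

$$\hat{n} = \frac{\overrightarrow{P_0P_1}\times\overrightarrow{P_0P_2}}{\|\overrightarrow{P_0P_1}\times\overrightarrow{P_0P_2}\|}.$$

First we parametrise $S$,

$$\vec{g}\colon U \longrightarrow S \cap W,$$

where

$$\vec{g}(s,t) = \overrightarrow{OP_0} + s\,\overrightarrow{P_0P_1} + t\,\overrightarrow{P_0P_2} = (1 + s + 2t,\ 2 - s - 3t,\ -1 + 2s + 3t),$$

and

$$U = \{\, (s,t) \in \mathbb{R}^2 \mid 0 < s < 1,\ 0 < t < 1 - s \,\},$$

and $W$ is the whole of $\mathbb{R}^3$ minus the three lines $P_0P_1$, $P_1P_2$ and $P_2P_0$. Now

$$\frac{\partial\vec{g}}{\partial s} = \overrightarrow{P_0P_1} = (1, -1, 2) \qquad\text{and}\qquad \frac{\partial\vec{g}}{\partial t} = \overrightarrow{P_0P_2} = (2, -3, 3),$$

and so

$$
\frac{\partial\vec{g}}{\partial s}\times\frac{\partial\vec{g}}{\partial t} =
\begin{vmatrix} \hat{\imath} & \hat{\jmath} & \hat{k} \\ 1 & -1 & 2 \\ 2 & -3 & 3 \end{vmatrix}
= 3\hat{\imath} + \hat{\jmath} - \hat{k}.
$$

Clearly, $\hat{n}$ and $\dfrac{\partial\vec{g}}{\partial s}\times\dfrac{\partial\vec{g}}{\partial t}$ have the same direction.

$$
\begin{aligned}
\iint_S \vec{F}\cdot d\vec{S} &= \iint_{S\cap W}\vec{F}\cdot d\vec{S} \\
&= \iint_U \vec{F}(\vec{g}(s,t))\cdot\left(\frac{\partial\vec{g}}{\partial s}\times\frac{\partial\vec{g}}{\partial t}\right)ds\,dt \\
&= \int_0^1\!\!\int_0^{1-s}(2 - s - 3t,\ -1 + 2s + 3t,\ 1 + s + 2t)\cdot(3, 1, -1)\,dt\,ds \\
&= \int_0^1\!\!\int_0^{1-s}(6 - 3s - 9t - 1 + 2s + 3t - 1 - s - 2t)\,dt\,ds \\
&= \int_0^1\!\!\int_0^{1-s}(4 - 2s - 8t)\,dt\,ds \\
&= \int_0^1\bigl[4t - 2st - 4t^2\bigr]_0^{1-s}\,ds \\
&= \int_0^1\bigl(4(1 - s) - 2s(1 - s) - 4(1 - s)^2\bigr)\,ds \\
&= \int_0^1(2s - 2s^2)\,ds \\
&= \left[s^2 - \frac{2s^3}{3}\right]_0^1 \\
&= \frac13.
\end{aligned}
$$

Example 31.3. Let $S$ be the disk of radius 2 centred around the point $P = (1, 1, -2)$ and orthogonal to the vector

$$\hat{n} = \left(\frac13, -\frac23, \frac23\right).$$

Find the flux of the vector field given by

$$\vec{F}(x,y,z) = y\,\hat{\imath} + z\,\hat{\jmath} + x\,\hat{k},$$

through $S$ in the direction of $\hat{n}$.

First we need to parametrise $S$. We want a right handed triple of unit vectors $(\hat{a}, \hat{b}, \hat{n})$ which are pairwise orthogonal, so that they are an orthonormal basis. Let's take

$$\hat{a} = \left(\frac23, \frac23, \frac13\right).$$

With this choice, it is clear that $\hat{a}\cdot\hat{n} = 0$, so that $\hat{a}$ is orthogonal to $\hat{n}$. Then

$$
\hat{b} = \hat{n}\times\hat{a} =
\begin{vmatrix} \hat{\imath} & \hat{\jmath} & \hat{k} \\ 1/3 & -2/3 & 2/3 \\ 2/3 & 2/3 & 1/3 \end{vmatrix}
= -\frac23\hat{\imath} + \frac13\hat{\jmath} + \frac23\hat{k}.
$$

This gives us a parametrisation,

$$\vec{g}\colon U \longrightarrow S \cap W,$$

given by

$$
\begin{aligned}
\vec{g}(r,\theta) &= \overrightarrow{OP} + r\cos\theta\,\hat{a} + r\sin\theta\,\hat{b} \\
&= (1, 1, -2) + \left(\tfrac{2r}{3}\cos\theta,\ \tfrac{2r}{3}\cos\theta,\ \tfrac{r}{3}\cos\theta\right) + \left(-\tfrac{2r}{3}\sin\theta,\ \tfrac{r}{3}\sin\theta,\ \tfrac{2r}{3}\sin\theta\right) \\
&= \left(1 + \tfrac{2r}{3}\cos\theta - \tfrac{2r}{3}\sin\theta,\ 1 + \tfrac{2r}{3}\cos\theta + \tfrac{r}{3}\sin\theta,\ -2 + \tfrac{r}{3}\cos\theta + \tfrac{2r}{3}\sin\theta\right),
\end{aligned}
$$

where $U = (0, 2) \times (0, 2\pi)$, and $W$ is the whole of $\mathbb{R}^3$ minus the boundary of the disk. Now

$$\frac{\partial\vec{g}}{\partial r} = \cos\theta\,\hat{a} + \sin\theta\,\hat{b} \qquad\text{and}\qquad \frac{\partial\vec{g}}{\partial\theta} = -r\sin\theta\,\hat{a} + r\cos\theta\,\hat{b},$$

and so

$$
\begin{aligned}
\frac{\partial\vec{g}}{\partial r}\times\frac{\partial\vec{g}}{\partial\theta} &= (\cos\theta\,\hat{a} + \sin\theta\,\hat{b})\times(-r\sin\theta\,\hat{a} + r\cos\theta\,\hat{b}) \\
&= r(\cos^2\theta + \sin^2\theta)\,\hat{n} \\
&= r\,\hat{n}.
\end{aligned}
$$

Clearly this points in the direction of $\hat{n}$.

$$
\begin{aligned}
\iint_S \vec{F}\cdot d\vec{S} &= \iint_{S\cap W}\vec{F}\cdot d\vec{S} \\
&= \iint_U \vec{F}(\vec{g}(r,\theta))\cdot\left(\frac{\partial\vec{g}}{\partial r}\times\frac{\partial\vec{g}}{\partial\theta}\right)dr\,d\theta \\
&= \int_0^{2\pi}\!\!\int_0^2 \left(1 + \tfrac{2r}{3}\cos\theta + \tfrac{r}{3}\sin\theta,\ -2 + \tfrac{r}{3}\cos\theta + \tfrac{2r}{3}\sin\theta,\ 1 + \tfrac{2r}{3}\cos\theta - \tfrac{2r}{3}\sin\theta\right)\cdot\left(\tfrac{r}{3}, -\tfrac{2r}{3}, \tfrac{2r}{3}\right)dr\,d\theta.
\end{aligned}
$$

Now when we expand the integrand, we will clearly get

$$\alpha + \beta\cos\theta + \gamma\sin\theta,$$

where $\alpha$, $\beta$ and $\gamma$ are polynomial functions of $r$ alone. The integral of $\cos\theta$ and $\sin\theta$ over the range $[0, 2\pi]$ is zero. Computing, we get

$$\alpha = r(1/3 + 4/3 + 2/3) = 7r/3,$$

so that $\alpha$ is a linear function of $r$. Therefore the integral reduces to

$$
\begin{aligned}
\int_0^{2\pi}\!\!\int_0^2 \frac{7r}{3}\,dr\,d\theta &= \int_0^{2\pi}\left[\frac{7r^2}{6}\right]_0^2 d\theta \\
&= \int_0^{2\pi}\frac{14}{3}\,d\theta \\
&= \frac{28\pi}{3}.
\end{aligned}
$$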
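The flux computed in Example 31.2 can be reproduced numerically from Definition 31.1. The sketch below evaluates $\vec{F}(\vec{g}(s,t))\cdot(\partial\vec{g}/\partial s \times \partial\vec{g}/\partial t)$ on a midpoint grid over the triangle $0 < s < 1$, $0 < t < 1 - s$. The names `F`, `g`, `NORMAL` and `triangle_flux` and the grid sizes are assumptions made for this illustration.

```python
def F(x, y, z):
    """The vector field F = y i + z j + x k of Example 31.2."""
    return (y, z, x)

def g(s, t):
    """Parametrisation of the triangle with vertices P0, P1, P2."""
    return (1 + s + 2 * t, 2 - s - 3 * t, -1 + 2 * s + 3 * t)

# dg/ds x dg/dt = (1, -1, 2) x (2, -3, 3); constant because the triangle is flat.
NORMAL = (3.0, 1.0, -1.0)

def triangle_flux(n_s=2000, n_t=20):
    """Midpoint approximation of the flux integral over U."""
    ds = 1.0 / n_s
    total = 0.0
    for i in range(n_s):
        s = (i + 0.5) * ds
        dt = (1.0 - s) / n_t          # t runs over (0, 1 - s)
        for j in range(n_t):
            t = (j + 0.5) * dt
            Fx, Fy, Fz = F(*g(s, t))
            total += (Fx * NORMAL[0] + Fy * NORMAL[1] + Fz * NORMAL[2]) * dt * ds
    return total

print(triangle_flux(), 1.0 / 3.0)  # should agree closely
```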


32. Stokes Theorem

Definition 32.1. We say that a vector field

$$\vec{F}\colon A \longrightarrow \mathbb{R}^m$$

has compact support if there is a compact subset $K \subset A$ such that $\vec{F}(\vec{x}) = \vec{0}$ for every $\vec{x} \in A - K$.

If $S \subset \mathbb{R}^3$ is a smooth 2-manifold (possibly with boundary) then we will call $S$ a surface. An orientation is a "continuous" choice of unit normal vector. Not every surface can be oriented. Consider for example the Möbius band, which is obtained by taking a strip of paper and attaching one end to the other, except that we add a twist.

Theorem 32.2 (Stokes' Theorem). Let $S \subset \mathbb{R}^3$ be a smooth oriented surface with boundary and let $\vec{F}\colon S \longrightarrow \mathbb{R}^3$ be a smooth vector field with compact support. Then

$$\iint_S \operatorname{curl}\vec{F}\cdot d\vec{S} = \int_{\partial S}\vec{F}\cdot d\vec{s},$$

where $\partial S$ is oriented compatibly with the orientation on $S$.

Example 32.3. Let $S$ be a smooth 2-manifold that looks like a pair of pants. Choose the orientation of $S$ such that the normal vector is pointing outwards. There are three oriented curves $C_1$, $C_2$ and $C_3$ (the two legs and the waist). Suppose that we are given a vector field $\vec{B}$ with zero curl. Then (32.2) says that

$$\int_{C_3}\vec{B}\cdot d\vec{s} + \int_{C_1'}\vec{B}\cdot d\vec{s} + \int_{C_2'}\vec{B}\cdot d\vec{s} = \iint_S \operatorname{curl}\vec{B}\cdot d\vec{S} = 0.$$

Here $C_1'$ and $C_2'$ denote the curves $C_1$ and $C_2$ with the opposite orientation. In other words,

$$\int_{C_3}\vec{B}\cdot d\vec{s} = \int_{C_1}\vec{B}\cdot d\vec{s} + \int_{C_2}\vec{B}\cdot d\vec{s}.$$

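Proof of (32.2). We prove this in three steps, in very much the same way as we proved Green's Theorem.

Step 1: We suppose that $S = H^2 \subset \mathbb{R}^2 \subset \mathbb{R}^3$, where the plane is the $xy$-plane. In this case, we can take $\hat{n} = \hat{k}$, and this induces the standard orientation of the boundary. Note that

$$\operatorname{curl}\vec{F}\cdot\hat{n} = \frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y},$$

and so the result reduces to Green's Theorem. This completes step 1.

Step 2: We suppose that there is a compact subset $K \subset S$ and a parametrisation

$$\vec{g}\colon H^2 \cap U \longrightarrow S \cap W,$$

which is compatible with the orientation, such that
(1) $\vec{F}(\vec{x}) = \vec{0}$ if $\vec{x} \in S - K$, and
(2) $K \subset S \cap W$.

Define a vector field $\vec{G}\colon H^2 \longrightarrow \mathbb{R}^2$ by the rule

$$
\vec{G}(u,v) = \begin{cases} \vec{F}(\vec{g}(u,v))\cdot D\vec{g}(u,v) & (u,v) \in U \\ \vec{0} & (u,v) \notin U. \end{cases}
$$

Note that

$$
\begin{aligned}
G_1(u,v) &= F_1\frac{\partial x}{\partial u} + F_2\frac{\partial y}{\partial u} + F_3\frac{\partial z}{\partial u}, \\
G_2(u,v) &= F_1\frac{\partial x}{\partial v} + F_2\frac{\partial y}{\partial v} + F_3\frac{\partial z}{\partial v}.
\end{aligned}
$$

Using step 1, it is enough to prove:

Claim 32.4.
(1)
$$\iint_S \operatorname{curl}\vec{F}\cdot d\vec{S} = \iint_{H^2}\left(\frac{\partial G_2}{\partial u} - \frac{\partial G_1}{\partial v}\right)du\,dv.$$
(2)
$$\int_{\partial S}\vec{F}\cdot d\vec{s} = \int_{\partial H^2}\vec{G}\cdot d\vec{s}.$$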

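Proof of (32.4). Note that

$$
\begin{aligned}
\operatorname{curl}\vec{F} &=
\begin{vmatrix}
\hat{\imath} & \hat{\jmath} & \hat{k} \\
\dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\
F_1 & F_2 & F_3
\end{vmatrix} \\
&= \left(\frac{\partial F_3}{\partial y} - \frac{\partial F_2}{\partial z}\right)\hat{\imath} - \left(\frac{\partial F_3}{\partial x} - \frac{\partial F_1}{\partial z}\right)\hat{\jmath} + \left(\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y}\right)\hat{k}.
\end{aligned}
$$

On the other hand,

$$
\begin{aligned}
\frac{\partial\vec{g}}{\partial u} &= \frac{\partial x}{\partial u}\hat{\imath} + \frac{\partial y}{\partial u}\hat{\jmath} + \frac{\partial z}{\partial u}\hat{k}, \\
\frac{\partial\vec{g}}{\partial v} &= \frac{\partial x}{\partial v}\hat{\imath} + \frac{\partial y}{\partial v}\hat{\jmath} + \frac{\partial z}{\partial v}\hat{k}.
\end{aligned}
$$

It follows that

$$
\begin{aligned}
\frac{\partial\vec{g}}{\partial u}\times\frac{\partial\vec{g}}{\partial v} &=
\begin{vmatrix}
\hat{\imath} & \hat{\jmath} & \hat{k} \\
\dfrac{\partial x}{\partial u} & \dfrac{\partial y}{\partial u} & \dfrac{\partial z}{\partial u} \\
\dfrac{\partial x}{\partial v} & \dfrac{\partial y}{\partial v} & \dfrac{\partial z}{\partial v}
\end{vmatrix} \\
&= \frac{\partial(y,z)}{\partial(u,v)}\hat{\imath} - \frac{\partial(x,z)}{\partial(u,v)}\hat{\jmath} + \frac{\partial(x,y)}{\partial(u,v)}\hat{k}.
\end{aligned}
$$

So,

$$
\operatorname{curl}\vec{F}\cdot\left(\frac{\partial\vec{g}}{\partial u}\times\frac{\partial\vec{g}}{\partial v}\right) =
\left(\frac{\partial F_3}{\partial y} - \frac{\partial F_2}{\partial z}\right)\frac{\partial(y,z)}{\partial(u,v)}
+ \left(\frac{\partial F_3}{\partial x} - \frac{\partial F_1}{\partial z}\right)\frac{\partial(x,z)}{\partial(u,v)}
+ \left(\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y}\right)\frac{\partial(x,y)}{\partial(u,v)}.
$$

On the other hand, if one looks at the proof of the second step of Green's theorem, we see that

$$\frac{\partial G_2}{\partial u} - \frac{\partial G_1}{\partial v}$$

is also equal to the RHS (in fact, what we calculated in the proof of Green's theorem was the third term of the RHS; by symmetry the other two terms have the same form). This is (1).

For (2), let's parametrise $\partial H^2 \cap U$ by $\vec{x}(u) = (u, 0)$ and $\partial S \cap W$ by $\vec{s}(u) = \vec{g}(\vec{x}(u))$. Then

$$
\begin{aligned}
\int_{\partial S}\vec{F}\cdot d\vec{s} &= \int_{\partial S\cap W}\vec{F}\cdot d\vec{s} \\
&= \int_a^b \vec{F}(\vec{s}(u))\cdot\vec{s}'(u)\,du \\
&= \int_a^b \vec{F}(\vec{g}(\vec{x}(u)))\cdot D\vec{g}(\vec{x}(u))\,\vec{x}'(u)\,du \\
&= \int_a^b \vec{G}(\vec{x}(u))\cdot\vec{x}'(u)\,du \\
&= \int_{\partial H^2\cap U}\vec{G}\cdot d\vec{s} \\
&= \int_{\partial H^2}\vec{G}\cdot d\vec{s},
\end{aligned}
$$

and this is (2). $\square$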



This completes step 2.

Step 3: We again use partitions of unity. It is straightforward to cover the bounded set $K$ by finitely many compact subsets $K_1, K_2, \dots, K_k$, such that given any smooth vector field which is zero outside $K_i$, then the conditions of step 2 hold. By using a partition of unity, we can find smooth functions $\rho_1, \rho_2, \dots, \rho_k$ such that $\rho_i$ is zero outside $K_i$ and

$$1 = \sum_{i=1}^{k}\rho_i.$$

Multiplying both sides of this equation by $\vec{F}$, we have

$$\vec{F} = \sum_{i=1}^{k}\vec{F}_i,$$

where $\vec{F}_i = \rho_i\vec{F}$ is a smooth vector field, which is zero outside $K_i$. In this case

$$
\begin{aligned}
\iint_S \operatorname{curl}\vec{F}\cdot d\vec{S} &= \sum_{i=1}^{k}\iint_S \operatorname{curl}\vec{F}_i\cdot d\vec{S} \\
&= \sum_{i=1}^{k}\int_{\partial S}\vec{F}_i\cdot d\vec{s} \\
&= \int_{\partial S}\vec{F}\cdot d\vec{s}. \qquad\square
\end{aligned}
$$




33. Gauss Theorem

Theorem 33.1 (Gauss' Theorem). Let $M \subset \mathbb{R}^3$ be a smooth 3-manifold with boundary, and let $\vec{F}\colon M \longrightarrow \mathbb{R}^3$ be a smooth vector field with compact support. Then

$$\iiint_M \operatorname{div}\vec{F}\,dx\,dy\,dz = \iint_{\partial M}\vec{F}\cdot d\vec{S},$$

where $\partial M$ is given the outward orientation.

Example 33.2. Three point charges are located at the points $P_1$, $P_2$ and $P_3$. There is an electric field

$$\vec{E}\colon \mathbb{R}^3 \setminus \{P_1, P_2, P_3\} \longrightarrow \mathbb{R}^3,$$

which satisfies $\operatorname{div}\vec{E} = 0$. Suppose there are four closed surfaces $S_1$, $S_2$, $S_3$ and $S_4$. Each $S_i$ divides $\mathbb{R}^3$ into two pieces, which we will informally call the inside and the outside. $S_1$, $S_2$ and $S_3$ are completely contained in the inside of $S_4$. The inside of $S_1$ contains the point $P_1$ but neither $P_2$ nor $P_3$, the inside of $S_2$ contains the point $P_2$ but neither $P_1$ nor $P_3$, and the inside of $S_3$ contains the point $P_3$ but neither $P_1$ nor $P_2$. The inside of $S_4$, together with $S_4$, minus the inside of $S_1$, $S_2$ and $S_3$ is a smooth 3-manifold with boundary, which we denote $M$. We have

$$\partial M = S_1' \sqcup S_2' \sqcup S_3' \sqcup S_4.$$

Recall that primes denote the reverse orientation. (33.1) implies that

$$
\begin{aligned}
\iint_{S_4}\vec{E}\cdot d\vec{S} - \iint_{S_1}\vec{E}\cdot d\vec{S} - \iint_{S_2}\vec{E}\cdot d\vec{S} - \iint_{S_3}\vec{E}\cdot d\vec{S}
&= \iint_{S_4}\vec{E}\cdot d\vec{S} + \iint_{S_1'}\vec{E}\cdot d\vec{S} + \iint_{S_2'}\vec{E}\cdot d\vec{S} + \iint_{S_3'}\vec{E}\cdot d\vec{S} \\
&= \iint_{\partial M}\vec{E}\cdot d\vec{S} \\
&= \iiint_M \operatorname{div}\vec{E}\,dx\,dy\,dz \\
&= 0.
\end{aligned}
$$

In other words, we have

$$\iint_{S_4}\vec{E}\cdot d\vec{S} = \iint_{S_1}\vec{E}\cdot d\vec{S} + \iint_{S_2}\vec{E}\cdot d\vec{S} + \iint_{S_3}\vec{E}\cdot d\vec{S}.$$

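Proof of (33.1). The proof (as usual) is divided into three steps.

Step 1: We first suppose that $M = H^3$, upper half space. Suppose that we are given a vector field $\vec{G}\colon H^3 \longrightarrow \mathbb{R}^3$, which is zero outside some box $K = [-a/2, a/2] \times [-b/2, b/2] \times [0, c/2]$. We calculate:

$$
\begin{aligned}
\iiint_{H^3}\operatorname{div}\vec{G}\,du\,dv\,dw
&= \int_0^c\!\!\int_{-b}^{b}\!\!\int_{-a}^{a}\left(\frac{\partial G_1}{\partial u} + \frac{\partial G_2}{\partial v} + \frac{\partial G_3}{\partial w}\right)du\,dv\,dw \\
&= \int_0^c\!\!\int_{-b}^{b}\bigl(G_1(a,v,w) - G_1(-a,v,w)\bigr)\,dv\,dw \\
&\quad + \int_0^c\!\!\int_{-a}^{a}\bigl(G_2(u,b,w) - G_2(u,-b,w)\bigr)\,du\,dw \\
&\quad + \int_{-b}^{b}\!\!\int_{-a}^{a}\bigl(G_3(u,v,c) - G_3(u,v,0)\bigr)\,du\,dv \\
&= -\int_{-b}^{b}\!\!\int_{-a}^{a} G_3(u,v,0)\,du\,dv.
\end{aligned}
$$

On the other hand, let's parametrise the boundary $\partial H^3$, by

$$\vec{g}\colon \mathbb{R}^2 \longrightarrow \partial H^3, \qquad\text{where}\qquad \vec{g}(u,v) = (u, v, 0).$$

In this case

$$\frac{\partial\vec{g}}{\partial u}\times\frac{\partial\vec{g}}{\partial v} = \hat{\imath}\times\hat{\jmath} = \hat{k}.$$

This is the inward pointing normal, so this parametrisation gives the reverse of the outward orientation. It follows that

$$
\begin{aligned}
\iint_{(\partial H^3)'}\vec{G}\cdot d\vec{S} &= \iint_{\mathbb{R}^2}\vec{G}\cdot\hat{k}\,du\,dv \\
&= \int_{-b}^{b}\!\!\int_{-a}^{a} G_3(u,v,0)\,du\,dv.
\end{aligned}
$$

Therefore

$$
\begin{aligned}
\iint_{\partial H^3}\vec{G}\cdot d\vec{S} &= -\iint_{(\partial H^3)'}\vec{G}\cdot d\vec{S} \\
&= -\int_{-b}^{b}\!\!\int_{-a}^{a} G_3(u,v,0)\,du\,dv.
\end{aligned}
$$

Putting all of this together, we have

$$\iiint_{H^3}\operatorname{div}\vec{G}\,du\,dv\,dw = \iint_{\partial H^3}\vec{G}\cdot d\vec{S}.$$

This completes step 1.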

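Step 2: We suppose that there is a compact subset $K \subset M$ and a parametrisation

$$\vec{g}\colon H^3 \cap U \longrightarrow M \cap W,$$

such that
(1) $\vec{F}(\vec{x}) = \vec{0}$ for any $\vec{x} \in M \setminus K$.
(2) $K \subset M \cap W$.

We may write

$$\vec{g}(u,v,w) = (x(u,v,w),\ y(u,v,w),\ z(u,v,w)).$$

Define

$$\vec{G}\colon H^3 \longrightarrow \mathbb{R}^3,$$

by

$$
\begin{aligned}
G_1 &= F_1\frac{\partial(y,z)}{\partial(v,w)} - F_2\frac{\partial(x,z)}{\partial(v,w)} + F_3\frac{\partial(x,y)}{\partial(v,w)}, \\
G_2 &= -F_1\frac{\partial(y,z)}{\partial(u,w)} + F_2\frac{\partial(x,z)}{\partial(u,w)} - F_3\frac{\partial(x,y)}{\partial(u,w)}, \\
G_3 &= F_1\frac{\partial(y,z)}{\partial(u,v)} - F_2\frac{\partial(x,z)}{\partial(u,v)} + F_3\frac{\partial(x,y)}{\partial(u,v)},
\end{aligned}
$$

for any $(u,v,w) \in U$ and otherwise zero. Put differently,

$$
\vec{G}(u,v,w) = \begin{cases} \vec{F}\cdot A & \text{if } (u,v,w) \in U \\ \vec{0} & \text{otherwise,} \end{cases}
$$

where $A$ is the matrix of cofactors of the derivative $D\vec{g}$.

One can check (that is, there is a somewhat long and involved calculation, similar to, but much worse than, the ones that appear in the proof of Green's Theorem or Stokes' Theorem) that

$$
\begin{aligned}
\operatorname{div}\vec{G} &= \operatorname{div}\vec{F}\,\det D\vec{g} \\
&= \operatorname{div}\vec{F}\,\frac{\partial(x,y,z)}{\partial(u,v,w)}.
\end{aligned}
$$

We have

$$
\begin{aligned}
\iiint_M \operatorname{div}\vec{F}\,dx\,dy\,dz &= \iiint_{H^3}\operatorname{div}\vec{G}\,du\,dv\,dw \\
&= \iint_{\partial H^3}\vec{G}\cdot d\vec{S} \\
&= \iint_{\partial M}\vec{F}\cdot d\vec{S},
\end{aligned}
$$

where the last equality needs to be checked (this is relatively straightforward). This completes step 2.

Step 3: We finish off in the standard way. We may find a partition of unity

$$1 = \sum_{i=1}^{k}\rho_i,$$

where $\rho_i$ is a smooth function which is zero outside a compact subset $K_i$ such that $\vec{F}_i = \rho_i\vec{F}$ is a smooth vector field, which satisfies the hypothesis of step 2, for each $1 \le i \le k$. We have

$$\vec{F} = \sum_{i=1}^{k}\vec{F}_i,$$

and so

$$
\begin{aligned}
\iiint_M \operatorname{div}\vec{F}\,dx\,dy\,dz &= \sum_{i=1}^{k}\iiint_M \operatorname{div}\vec{F}_i\,dx\,dy\,dz \\
&= \sum_{i=1}^{k}\iint_{\partial M}\vec{F}_i\cdot d\vec{S} \\
&= \iint_{\partial M}\vec{F}\cdot d\vec{S}. \qquad\square
\end{aligned}
$$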
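Gauss' Theorem can be sanity-checked on a concrete example. The sketch below takes $M$ to be the closed unit ball and $\vec{F} = (x^3, y^3, z^3)$, so that $\operatorname{div}\vec{F} = 3(x^2 + y^2 + z^2) = 3\rho^2$ and the outward unit normal on the unit sphere is $(x, y, z)$. The field, the function names `volume_side` and `flux_side`, and the grid sizes are assumptions chosen for illustration; both sides should be close to $12\pi/5$.

```python
import math

def volume_side(n=400):
    """Triple integral of div F = 3 rho^2 over the unit ball in spherical
    coordinates. The integrand 3 rho^2 * rho^2 sin(phi) separates, and the
    theta integral contributes a factor 2 pi."""
    d_rho, d_phi = 1.0 / n, math.pi / n
    radial = sum(3.0 * ((i + 0.5) * d_rho) ** 4 * d_rho for i in range(n))
    polar = sum(math.sin((j + 0.5) * d_phi) * d_phi for j in range(n))
    return 2.0 * math.pi * radial * polar

def flux_side(n=400):
    """Flux of F = (x^3, y^3, z^3) through the unit sphere:
    F . n = x^4 + y^4 + z^4 and dS = sin(phi) dphi dtheta."""
    d_phi, d_theta = math.pi / n, 2.0 * math.pi / n
    total = 0.0
    for j in range(n):
        phi = (j + 0.5) * d_phi
        for k in range(n):
            theta = (k + 0.5) * d_theta
            x = math.sin(phi) * math.cos(theta)
            y = math.sin(phi) * math.sin(theta)
            z = math.cos(phi)
            total += (x**4 + y**4 + z**4) * math.sin(phi) * d_phi * d_theta
    return total

print(volume_side(), flux_side(), 12.0 * math.pi / 5.0)
```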




34. Forms on Rn

Definition 34.1. A basic 1-form on $\mathbb{R}^n$ is a formal symbol $dx_1, dx_2, \dots, dx_n$. A general 1-form on $\mathbb{R}^n$ is any expression of the form

$$\omega = f_1\,dx_1 + f_2\,dx_2 + \cdots + f_n\,dx_n,$$

where $f_1, f_2, \dots, f_n$ are smooth functions.

Note that there are $n$ basic 1-forms. If $f$ is a smooth function, we get a 1-form using the formal rule,

$$df = \sum_{i=1}^{n}\frac{\partial f}{\partial x_i}\,dx_i.$$

Definition 34.2. A basic 2-form on Rn is any formal symbol dxi ∧ dxj , where 1 ≤ i < j ≤ n. A general 2-form on Rn is any expression of the form � ω= fij dxi ∧ dxj i