256 103 2MB
English Pages [462] Year 2020
![Introduction to Analysis in Several Variables: Advanced Calculus [1 (draft) ed.]
9781470456696, 9781470460167](https://ebin.pub/img/200x200/introduction-to-analysis-in-several-variables-advanced-calculus-1-draftnbsped-9781470456696-9781470460167.jpg)
Introduction to Analysis in Several Variables (Advanced Calculus) Michael Taylor Math. Dept., UNC E-mail address: [email protected]
2010 Mathematics Subject Classification. 26B05, 26B10, 26B12, 26B15, 26B20 Key words and phrases. real numbers, complex numbers, Euclidean space, metric spaces, compact spaces, Cauchy sequences, continuous function, power series, derivative, mean value theorem, Riemann integral, fundamental theorem of calculus, arclength, exponential function, logarithm, trigonometric functions, Euler’s formula, multiple integrals, surfaces, surface area, differential forms, Stokes theorem, degree, Riemannian manifold, metric tensor, geodesics, curvature, Gauss-Bonnet theorem, Fourier analysis
Contents
Preface
xi
Some basic notation
xv
Chapter 1.
Background
1.1. One variable calculus
1 2
Exercises
13
1.2. Euclidean spaces
17
Exercises
22
1.3. Vector spaces and linear transformations
22
Exercises
30
1.4. Determinants
31
Exercises
36
Chapter 2.
Multivariable differential calculus
41
2.1. The derivative
41
Exercises
54
2.2. Inverse function and implicit function theorem
58
Exercises
67
2.3. Systems of differential equations and vector fields
70
Exercises
82
Chapter 3.
Multivariable integral calculus and calculus on surfaces
3.1. The Riemann integral in n variables
89 90
Exercises
115
3.2. Surfaces and surface integrals
119
Exercises
138 vii
viii
Contents
3.3. Partitions of unity
147
3.4. Sard’s theorem
148
3.5. Morse functions 3.6. The tangent space to a manifold
149 150
Chapter 4.
Differential forms and the Gauss-Green-Stokes formula
155
4.1. Differential forms
156
Exercises
160
4.2. Products and exterior derivatives of forms
162
Exercises
165
4.3. The general Stokes formula
166
Exercises 4.4. The classical Gauss, Green, and Stokes formulas
170 172
Exercises
178
4.5. Differential forms and the change of variable formula
182
Chapter 5.
Applications of the Gauss-Green-Stokes formula
187
5.1. Holomorphic functions and harmonic functions
188
Exercises
196
5.2. Differential forms, homotopy, and the Lie derivative
202
Exercises 5.3. Differential forms and degree theory
206 208
Exercises
216
Chapter 6.
Differential geometry of surfaces
225
6.1. Geometry of surfaces I: geodesics
229
Exercises
240
6.2. Geometry of surfaces II: curvature
242
Exercises
254
6.3. Geometry of surfaces III: the Gauss-Bonnet theorem Exercises
256 265
6.4. Smooth matrix groups
269
Exercises
284
6.5. The derivative of the exponential map
287
6.6. A spectral mapping theorem
292
Chapter 7.
Fourier analysis
295
7.1. Fourier series
298
Exercises 7.2. The Fourier transform
311 314
Exercises
332
Contents
ix
7.3. Poisson summation formulas
334
7.4. Spherical harmonics
336
Exercises 7.5. Fourier series on compact matrix groups
371 376
Exercises
381
7.6. Isoperimetric inequality
382
Appendix A.
Complementary material
385
A.1. Metric spaces, convergence, and compactness
386
Exercises
396
A.2. Inner product spaces
397
A.3. Eigenvalues and eigenvectors A.4. Complements on power series
402 407
A.5. The Weierstrass theorem and the Stone-Weierstrass theorem
412
A.6. Further results on harmonic functions
415
A.7. Beyond degree theory – introduction to de Rham theory
420
Exercises
438
Bibliography
441
Index
445
Preface
This text was produced for the second part of a two-part sequence on advanced calculus, whose aim is to provide a firm logical foundation for analysis, for students who have had 3 semesters of calculus and a course in linear algebra. The first part treats analysis in one variable, and the text [49] was written to cover that material. The text at hand treats analysis in several variables. These two texts can be used as companions, but they are written so that they can be used independently, if desired. Chapter 1 treats background needed for multivariable analysis. The first section gives a brief treatment of one-variable calculus, including the Riemann integral and the fundamental theorem of calculus. This section distills material developed in more detail in the companion text [49]. We have included it here to facilitate the independent use of this text. Subsequent sections in Chapter 1 present basic linear algebra background of use for the rest of the text. They include material on ndimensional Euclidean spaces and other vector spaces, on linear transformations on such spaces, and on determinants of such linear transformations. Chapter 2 develops multi-dimensional differential calculus on domains in ndimensional Euclidean space Rn . The first section defines the derivative of a differentiable map F : O → Rm , at a point x ∈ O, for O open in Rn , as a linear map from Rn to Rm , and establishes basic properties, such as the chain rule. The next section deals with the Inverse Function Theorem, giving a condition for such a map to have a differentiable inverse, when n = m. The third section treats n × n systems of differential equations, bringing in the concepts of vector fields and flows on an open set O ∈ Rn . While the emphasis here is on differential calculus, we do make use of integral calculus in one variable, as exposed in Chapter 1. Chapter 3 treats multi-dimensional integral calculus. We define the Riemann integral for a class of functions on Rn , and establish basic properties, including a change of variable formula. We then study smooth m-dimensional surfaces in Rn , and extend the Riemann integral to a class of functions on such surfaces. Going further, we abstract the notion of surface to that of a manifold, and study a class
xi
xii
Preface
of manifolds known as Riemannian manifolds. These possess an object known as a “metric tensor.” We also define the Riemann integral for a class of functions on such manifolds. The change of variable formula is instrumental in this extension of the integral. In Chapter 4 we introduce a further class of objects that can be defined on surfaces, differential forms. A k-form can be integrated over a k-dimensional surface, endowed with an extra structure, an “orientation.” Again the change of variable formula plays a role in establishing this. Important operations on differential forms include products and the exterior derivative. A key result of Chapter 4 is a general Stokes formula, an important integral identity that can be seen as a multidimensional version of the fundamental theorem of calculus. In §4.4 we specialize this general Stokes formula to classical cases, known as theorems of Gauss, Green, and Stokes. A concluding section of Chapter 4 makes use of material on differential forms to give another proof of the change of variable formula for the integral, much different from the proof given in Chapter 3. Chapter 5 is devoted to several applications of the material on the GaussGreen-Stokes theorems from Chapter 4. In §5.1 we use Green’s Theorem to derive fundamental properties of holomorphic functions of a complex variable. Sprinkled throughout earlier sections are some allusions to functions of complex variables, particularly in some of the exercises in §§2.1–2.2. Readers with no previous exposure to complex variables might wish to return to these exercises after getting through §5.1. In this section, we also discuss some results on the closely related study of harmonic functions. One result is Liouville’s Theorem, stating that a bounded harmonic function on all of Rn must be constant. When specialized to holomorphic functions on C = R2 , this yields a proof of the Fundamental Theorem of Algebra. In §5.2 we define the notion of smoothly homotopic maps and consider the behavior of closed differential forms under pull back by smoothly homotopic maps. This material is then applied in §5.3, which introduces degree theory and derives some interesting consequences. Key results include the Brouwer fixed point theorem, the Jordan-Brouwer separation theorem (in the smooth case), and the study of critical points of a vector field tangent to a compact surface, and connections with the Euler characteristic. We also show how degree theory yields another proof of the fundamental theorem of algebra. Chapter 6 applies results of Chapters 2–5 to the study of the geometry of surfaces (and more generally of Riemannian manifolds). Section 6.1 studies geodesics, which are locally length-minimizing curves. Section 6.2 studies curvature. Several varieties of curvature arise, including Gauss curvature and Riemann curvature, and it is of great interest to understand the relations between them. Section 6.3 ties the curvature study of §6.2 to material on degree theory from §5.3, in a result known as the Gauss-Bonnet theorem. Section 6.4 studies smooth matrix groups, which are smooth surfaces in M (n, F) that are also groups. These carry left and right invariant metric tensors, with important consequences for the application of such groups to other aspects of analysis, including results presented in §7.4.
Preface
xiii
Chapter 7 is devoted to an introduction to multi-dimensional Fourier analysis. Section 7.1 treats Fourier series on the n-dimensional torus Tn , and §7.2 treats the Fourier transform for functions on Rn . Section 7.3 introduces a topic that ties the first two together, known as Poisson’s summation formula. We apply this formula to establish a classical result of Riemann, his functional equation for the Riemann zeta function. The material in §§7.1–7.3 bears on topics rather different from the geometrical material emphasized in the latter part of Chapter 3 and in Chapters 4–6. In fact, this part of Chapter 7 could be tackled right after one gets through §3.1. On the other hand, the last three sections of Chapter 7 make strong contact with this geometrical material. Section 7.4 treats Fourier analysis on the sphere S n−1 , which involves expanding a function on S n−1 in terms of eigenfunctions of the Laplace operator ∆S , arising from the Riemannian metric on S n−1 . This study of course includes integrating functions over S n−1 . It also brings in the matrix group SO(n), introduced in Chapter 3, which acts on each eigenspace Vk of ∆S , and its subgroup SO(n − 1), and makes use of integrals over SO(n − 1). Section 7.4 also makes use of the Gauss-Green-Stokes formula, and applications to harmonic functions, from §§4.4 and 5.1. We believe the reader will gain a good appreciation of the utility of unifying geometrical concepts with those aspects of Fourier analysis developed in the first part of Chapter 7. We complement §7.4 with a brief discussion of Fourier series on compact matrix groups, in §7.5. Section 7.6 deals with the purely geometric problem of showing that, among smoothly bounded planar domains Ω ⊂ R2 , with fixed area, the disks have the smallest perimeter. This is the 2D isoperimetric inequality. Its placement here is due to the fact that its proof is an application of Fourier series. The text ends with a collection of appendices, some giving further background material, others providing complements to results of the main text. Appendix A.1 covers some basic notions of metric spaces and compactness, used from time to time throughout the text, such as in the study of the Riemann integral and in the proof of the fundamental existence theorem for ODE. As is the case with §1.1, this appendix distills material developed at a more leisurely pace in [49], again serving to make this text independent of the first one. Appendices A.2 and A.3 complement results on linear algebra presented in Chapter 1 with some further results. Appendix A.2 treats a general class of inner product spaces, both finite and infinite dimensional. Treatments in the latter case are relevant to results on Fourier analysis in Chapter 7. Appendix A.3 treats eigenvalues and eigenvectors of linear transformations on finite dimensional vector spaces, providing results useful in various places, from §2.1 to §6.6. Appendix A.4 discusses the remainder term in the power series of a function. Appendix A.5 deals with the Weierstrass theorem, on approximating a continuous function by polynomials, and an extension, known as the Stone-Weierstrass theorem, a useful tool in analysis, with applications in Sections 5.3, 7.1, and 7.4. Appendix A.6 builds on material on harmonic functions presented in Chapters 5
xiv
Preface
and 7. Results range from a removable singularity theorem to extensions of Liouville’s theorem. Appendix A.7 introduces de Rham cohomology, as an extension of degree theory, developed in Chapter 5. We point out some distinctive features of this treatment of Advanced Calculus. 1) Applications of the Gauss-Green-Stokes formulas. These formulas form a high point in any advanced calculus course, but we do not want them to be seen as the culmination of the course. Their significance arises from their many applications. The first application we treat is to the theory of functions of a complex variable, including the Cauchy integral theorem and basic consequences. This basically constitutes a mini-course in complex analysis. (A much more extensive treatment can be found in [51].) We also derive applications to the study of harmonic functions, in n variables, a study that is closely related to complex analysis when n = 2. We also apply differential forms and the Stokes formula to results of a topological flavor, involving a set of tools known as degree theory. We start with a result known as the Brouwer fixed point theorem. We give a short proof, as a direct application of the Stokes formula, thus making this theorem a precursor to degree theory, rather than an application. 2) The unity of analysis and geometry. This starts with calculus on surfaces, computing surface areas and surface integrals, given in terms of the metric tensors these surfaces inherit, but it proceeds much further. There is the question of finding geodesics, shortest paths, described by certain differential equations, whose coefficients arise from the metric tensor. Another issue is what makes a curved surface curved. One particular measure is called the Gauss curvature. There are formulas for the integrated Gauss curvature, which in turn make contact with degree theory. Such matters are examples of connections uniting analysis and geometry, pursued in the text. Other connections arise in the treatment of Fourier analysis. In addition to Fourier analysis on Euclidean space, the text treats Fourier analysis on spheres. Matrix groups, such as rotation groups SO(n), make an appearance, both as tools to study Fourier analysis on spheres, and as further sources of problems in Fourier analysis, thereby expanding the theater on which to bring to bear techniques of advanced calculus developed here. We follow this preface with a list of some basic notation, of use throughout the text. Acknowledgment During the preparation of this book, my research has been supported by a number of NSF grants, most recently DMS-1500817.
Some basic notation
R is the set of real numbers. C is the set of complex numbers. Z is the set of integers. Z+ is the set of integers ≥ 0. N is the set of integers ≥ 1 (the “natural numbers”). Q is the set of rational numbers. x ∈ R means x is an element of R, i.e., x is a real number. (a, b) denotes the set of x ∈ R such that a < x < b. [a, b] denotes the set of x ∈ R such that a ≤ x ≤ b. {x ∈ R : a ≤ x ≤ b} denotes the set of x in R such that a ≤ x ≤ b. [a, b) = {x ∈ R : a ≤ x < b} and (a, b] = {x ∈ R : a < x ≤ b}. z = x − iy if z = x + iy ∈ C, x, y ∈ R. xv
xvi
Some basic notation
Ω denotes the closure of the set Ω. f : A → B denotes that the function f takes points in the set A to points in B. One also says f maps A to B. x → x0 means the variable x tends to the limit x0 . f (x) = O(x) means f (x)/x is bounded. Similarly g(ε) = O(εk ) means g(ε)/εk is bounded. f (x) = o(x) as x → 0 (resp., x → ∞) means f (x)/x → 0 as x tends to the specified limit. S = sup |an | means S is the smallest real number that satisfies S ≥ |an | for all n. n
If there is no such real number then we take S = +∞. ( ) lim sup |ak | = lim sup |ak | . k→∞
n→∞ k≥n
Chapter 1
Background
This first chapter provides background material on one-variable calculus, the geometry of n-dimensional Euclidean space, and linear algebra. We begin in §1.1 with a presentation of the elements of calculus in one variable. We first define the Riemann integral for a class of functions on an interval. We then introduce the derivative, and establish the Fundamental Theorem of Calculus, relating differentiation and integration as essentially inverse operations. Further results are dealt with in the exercises, such as the change of variable formula for integrals, and the Taylor formula with remainder, for power series. Next we introduce the n-dimensional Euclidean spaces Rn , in §1.2. The dot product on Rn gives rise to a norm, hence to a distance function, making Rn a metric space. (More general metric spaces are studied in Appendix A.1.) We define the notion of open and closed subsets of Rn , of convergent sequences, and of compactness, and establish that nonempty closed, bounded subsets of Rn are compact. This material makes use of results on the real line R, dealt with at length in [49], and reviewed in Appendix A.1. The spaces Rn are special cases of vector spaces, explored in greater generality in §1.3. We also study linear transformations T : V → W between two vector spaces. We define the class of finite-dimensional vector spaces, and show that the dimension of such a vector space is well defined. If V is a real vector space and dim V = n, then V is isomorphic to Rn . Linear transformations from Rn to Rm are given by m × n matrices. In §1.4 we define the determinant, det A, of an n × n matrix A, and show that A is invertible if and only if det A ̸= 0. In Chapter 2, such linear transformations arise as derivatives of nonlinear maps, and understanding the behavior of these derivatives is basic to many key results in multivariable calculus, both in Chapter 2 and in subsequent chapters.
1
2
1. Background
1.1. One variable calculus In this brief discussion of one variable calculus, we introduce the Riemann integral, and relate it to the derivative. We will define the Riemann integral of a bounded function over an interval I = [a, b] on the real line. For now, we assume f is real valued. To start, we partition I into smaller intervals. A partition P of I is a finite collection of subintervals {Jk : 0 ≤ k ≤ N }, disjoint except for their endpoints, whose union is I. We can order the Jk so that Jk = [xk , xk+1 ], where (1.1.1)
x0 < x1 < · · · < xN < xN +1 ,
x0 = a, xN +1 = b.
We call the points xk the endpoints of P. We set (1.1.2)
ℓ(Jk ) = xk+1 − xk ,
maxsize (P) = max ℓ(Jk ) 0≤k≤N
We then set I P (f ) =
∑ k
(1.1.3) I P (f ) =
∑ k
sup f (x) ℓ(Jk ), Jk
inf f (x) ℓ(Jk ). Jk
Definitions of sup and inf are given in (A.1.17)–(A.1.18). We call I P (f ) and I P (f ) respectively the upper sum and lower sum of f , associated to the partition P. See Figure 1.1.1 for an illustration. Note that I P (f ) ≤ I P (f ). These quantities should approximate the Riemann integral of f, if the partition P is sufficiently “fine.” To be more precise, if P and Q are two partitions of I, we say P refines Q, and write P ≻ Q, if P is formed by partitioning each interval in Q. Equivalently, P ≻ Q if and only if all the endpoints of Q are also endpoints of P. It is easy to see that any two partitions have a common refinement; just take the union of their endpoints, to form a new partition. Note also that refining a partition lowers the upper sum of f and raises its lower sum: (1.1.4)
P ≻ Q =⇒ I P (f ) ≤ I Q (f ), and I P (f ) ≥ I Q (f ).
Consequently, if Pj are any two partitions and Q is a common refinement, we have (1.1.5)
I P1 (f ) ≤ I Q (f ) ≤ I Q (f ) ≤ I P2 (f ).
Now, whenever f : I → R is bounded, the following quantities are well defined: (1.1.6)
I(f ) =
inf
P∈Π(I)
I P (f ),
I(f ) = sup P∈Π(I)
I P (f ),
where Π(I) is the set of all partitions of I. We call I(f ) the lower integral of f and I(f ) its upper integral. Clearly, by (1.1.5), I(f ) ≤ I(f ). We then say that f is Riemann integrable provided I(f ) = I(f ), and in such a case, we set ∫ ∫ b f (x) dx = f (x) dx = I(f ) = I(f ). (1.1.7) a
I
We will denote the set of Riemann integrable functions on I by R(I). We derive some basic properties of the Riemann integral.
1.1. One variable calculus
3
Figure 1.1.1. Upper and lower sums
Proposition 1.1.1. If f, g ∈ R(I), then f + g ∈ R(I), and ∫ ∫ ∫ (1.1.8) (f + g) dx = f dx + g dx. I
I
I
Proof. If Jk is any subinterval of I, then sup (f + g) ≤ sup f + sup g, Jk
Jk
Jk
and inf (f + g) ≥ inf f + inf g, Jk
Jk
Jk
so, for any partition P, we have I P (f + g) ≤ I P (f ) + I P (g). Also, using common refinements, we can simultaneously approximate I(f ) and I(g) by I P (f ) and I P (g), and ditto for I(f + g). Thus the characterization (1.1.6) implies I(f + g) ≤ I(f ) + I(g). A parallel argument implies I(f + g) ≥ I(f ) + I(g), and the proposition follows. Next, there is a fair supply of Riemann integrable functions. Proposition 1.1.2. If f is continuous on I, then f is Riemann integrable. Proof. Any continuous function on a compact interval is bounded and uniformly continuous (see Propositions A.1.15 and A.1.16). Let ω(δ) be a modulus of continuity for f, so (1.1.9)
|x − y| ≤ δ =⇒ |f (x) − f (y)| ≤ ω(δ),
ω(δ) → 0 as δ → 0.
4
1. Background
Then (1.1.10)
maxsize (P) ≤ δ =⇒ I P (f ) − I P (f ) ≤ ω(δ) · ℓ(I),
which yields the proposition.
We denote the set of continuous functions on I by C(I). Thus Proposition 1.1.2 says C(I) ⊂ R(I). The proof of Proposition 1.1.2 provides a criterion on a partition guaranteeing ∫ that I P (f ) and I P (f ) are close to I f dx when f is continuous. We produce an extension, giving a condition under which I P (f ) and I(f ) are close, and I P (f ) and I(f ) are close, given f bounded on I. Given a partition P0 of I, set (1.1.11)
minsize(P0 ) = min{ℓ(Jk ) : Jk ∈ P0 }.
Lemma 1.1.3. Let P and Q be two partitions of I. Assume maxsize(P) ≤
(1.1.12)
1 minsize(Q). k
Let |f | ≤ M on I. Then 2M ℓ(I), k 2M I P (f ) ≥ I Q (f ) − ℓ(I). k
I P (f ) ≤ I Q (f ) + (1.1.13)
Proof. Let P1 denote the minimal common refinement of P and Q. Consider on the one hand those intervals in P that are contained in intervals in Q and on the other hand those intervals in P that are not contained in intervals in Q. Each interval of the first type is also an interval in P1 . Each interval of the second type gets partitioned, to yield two intervals in P1 . Denote by P1b the collection of such divided intervals. By (1.1.12), the lengths of the intervals in P1b sum to ≤ ℓ(I)/k. It follows that ∑ ℓ(I) , |I P (f ) − I P1 (f )| ≤ 2M ℓ(J) ≤ 2M k b J∈P1
and similarly |I P (f ) − I P1 (f )| ≤ 2M ℓ(I)/k. Therefore I P (f ) ≤ I P1 (f ) +
2M ℓ(I), k
I P (f ) ≥ I P1 (f ) −
2M ℓ(I). k
Since also I P1 (f ) ≤ I Q (f ) and I P1 (f ) ≥ I Q (f ), we obtain (1.1.13).
The following consequence is sometimes called Darboux’s Theorem. Theorem 1.1.4. Let Pν be a sequence of partitions of I into ν intervals Jνk , 1 ≤ k ≤ ν, such that maxsize(Pν ) −→ 0. If f : I → R is bounded, then (1.1.14)
I Pν (f ) → I(f ) and I Pν (f ) → I(f ).
1.1. One variable calculus
5
Consequently, (1.1.15)
f ∈ R(I) ⇐⇒ I(f ) = lim
ν→∞
ν ∑ k=1
for arbitrary ξνk ∈ Jνk , in which case the limit is
f (ξνk )ℓ(Jνk ), ∫ I
f dx.
Proof. As before, assume |f | ≤ M . Pick ε = 1/k > 0. Let Q be a partition such that I(f ) ≤ I Q (f ) ≤ I(f ) + ε, I(f ) ≥ I Q (f ) ≥ I(f ) − ε. Now pick N such that ν ≥ N =⇒ maxsize Pν ≤ ε minsize Q. Lemma 1.1.3 yields, for ν ≥ N , I Pν (f ) ≤ I Q (f ) + 2M ℓ(I)ε, I Pν (f ) ≥ I Q (f ) − 2M ℓ(I)ε. Hence, for ν ≥ N , I(f ) ≤ I Pν (f ) ≤ I(f ) + [2M ℓ(I) + 1]ε, I(f ) ≥ I Pν (f ) ≥ I(f ) − [2M ℓ(I) + 1]ε.
This proves (1.1.14).
Remark. ∫ The sums on the right side of (1.1.15) are called Riemann sums, approximating I f dx (when f is Riemann integrable). Remark. A second proof of Proposition 1.1.1 can readily be deduced from Theorem 1.1.4. One should be warned that, once such a specific choice of Pν and ξνk has been made, the limit on the right side of (1.1.15) might exist for a bounded function f that is not Riemann integrable. This and other phenomena are illustrated by the following example of a function which is not Riemann integrable. For x ∈ I, set (1.1.16)
ϑ(x) = 1 if x ∈ Q,
ϑ(x) = 0 if x ∈ / Q,
where Q is the set of rational numbers. Now every interval J ⊂ I of positive length contains points in Q and points not in Q, so for any partition P of I we have I P (ϑ) = ℓ(I) and I P (ϑ) = 0, hence (1.1.17)
I(ϑ) = ℓ(I),
I(ϑ) = 0.
Note that, if Pν is a partition of I into ν equal subintervals, then we could pick each ξνk to be rational, in which case the limit on the right side of (1.1.15) would be ℓ(I), or we could pick each ξνk to be irrational, in which case this limit would be zero. Alternatively, we could pick half of them to be rational and half to be irrational, and the limit would be ℓ(I)/2.
6
1. Background
Associated to the Riemann integral is a notion of size of a set S, called content. If S is a subset of I, define the “characteristic function” χS (x) = 1 if x ∈ S, 0 if x ∈ / S.
(1.1.18)
We define “upper content” cont+ and “lower content” cont− by cont+ (S) = I(χS ),
(1.1.19)
cont− (S) = I(χS ).
We say S “has content,” or “is contented” if these quantities are equal, which happens if and only if χS ∈ R(I), in which case the common value of cont+ (S) and cont− (S) is ∫ (1.1.20) m(S) = χS (x) dx. I
It is easy to see that (1.1.21)
cont+ (S) = inf
N {∑
} ℓ(Jk ) : S ⊂ J1 ∪ · · · ∪ JN ,
k=1
where Jk are intervals. Here, we require S to be in the union of a finite collection of intervals. See the appendix at the end of this section for a generalization of Proposition 1.1.2, giving a sufficient condition for a bounded function to be Riemann integrable on I, in terms of the upper content of its set of discontinuities. There is a more sophisticated notion of the size of a subset of I, called Lebesgue measure. The key to the construction of Lebesgue measure is to cover a set S by a countable (either finite or infinite) set of intervals. The outer measure of S ⊂ I is defined by {∑ ∪ } ℓ(Jk ) : S ⊂ Jk . (1.1.22) m∗ (S) = inf k≥1
k≥1
Here {Jk } is a finite or countably infinite collection of intervals. Clearly m∗ (S) ≤ cont+ (S).
(1.1.23)
Note that, if S = I ∩ Q, then χS = ϑ, defined by (1.1.16). In this case it is easy to see that cont+ (S) = ℓ(I), but m∗ (S) = 0. Zero is the “right” measure of this set. More material on the development of measure theory can be found in a number of books, including [17] and [47]. ∫ It is useful to note that I f dx is additive in I, in the following sense. Proposition 1.1.5. If a < b < c, f : [a, c] → R, f1 = f [a,b] , f2 = f [b,c] , then ( ) ( ) ( ) (1.1.24) f ∈ R [a, c] ⇐⇒ f1 ∈ R [a, b] and f2 ∈ R [b, c] , and, if this holds, ∫ (1.1.25)
∫
c
f dx = a
∫
b
f1 dx + a
c
f2 dx. b
1.1. One variable calculus
7
Proof. Since any partition of [a, c] has a refinement for which b is an endpoint, we may as well consider a partition P = P1 ∪ P2 , where P1 is a partition of [a, b] and P2 is a partition of [b, c]. Then I P (f ) = I P1 (f1 ) + I P2 (f2 ),
(1.1.26) so (1.1.27)
I P (f ) = I P1 (f1 ) + I P2 (f2 ),
{ } { } I P (f ) − I P (f ) = I P1 (f1 ) − I P1 (f1 ) + I P2 (f2 ) − I P2 (f2 ) .
Since both terms in braces in (1.1.27) are ≥ 0, we have equivalence in (1.1.24). Then (1.1.25) follows from (1.1.26) upon taking sufficiently fine partitions. Let I = [a, b]. If f ∈ R(I), then f ∈ R([a, x]) for all x ∈ [a, b], and we can consider the function ∫ x (1.1.28) g(x) = f (t) dt. a
If a ≤ x0 ≤ x1 ≤ b, then (1.1.29)
∫ g(x1 ) − g(x0 ) =
x1
f (t) dt, x0
so, if |f | ≤ M, (1.1.30)
|g(x1 ) − g(x0 )| ≤ M |x1 − x0 |.
In other words, if f ∈ R(I), then g is Lipschitz continuous on I. A function g : (a, b) → R is said to be differentiable at x ∈ (a, b) provided there exists the limit ] 1[ (1.1.31) lim g(x + h) − g(x) = g ′ (x). h→0 h When such a limit exists, g ′ (x), also denoted dg/dx, is called the derivative of g at x. Clearly g is continuous wherever it is differentiable. The next result is part of the Fundamental Theorem of Calculus. Theorem 1.1.6. If f ∈ C([a, b]), then the function g, defined by (1.1.28), is differentiable at each point x ∈ (a, b), and g ′ (x) = f (x).
(1.1.32)
Proof. Parallel to (1.1.29), we have, for h > 0, ∫ ] 1 x+h 1[ (1.1.33) g(x + h) − g(x) = f (t) dt. h h x If f is continuous at x, then, for any ε > 0, there exists δ > 0 such that |f (t) − f (x)| ≤ ε whenever |t − x| ≤ δ. Thus the right side of (1.1.33) is within ε of f (x) whenever h ∈ (0, δ]. Thus the desired limit exists as h ↘ 0. A similar argument treats h ↗ 0. The next result is the rest of the Fundamental Theorem of Calculus. Theorem 1.1.7. If G is differentiable and G′ (x) is continuous on [a, b], then ∫ b (1.1.34) G′ (t) dt = G(b) − G(a). a
8
1. Background
Figure 1.1.2. Illustration of the Mean Value Theorem
Proof. Consider the function (1.1.35)
∫ g(x) =
x
G′ (t) dt.
a
We have g ∈ C([a, b]), g(a) = 0, and, by Theorem 1.1.6, g ′ (x) = G′ (x),
∀ x ∈ (a, b).
Thus f (x) = g(x) − G(x) is continuous on [a, b], and (1.1.36)
f ′ (x) = 0,
∀ x ∈ (a, b).
We claim that (1.1.36) implies f is constant on [a, b]. Granted this, since f (a) = g(a) − G(a) = −G(a), we have f (x) = −G(a) for all x ∈ [a, b], so the integral (1.1.35) is equal to G(x) − G(a) for all x ∈ [a, b]. Taking x = b yields (1.1.34). The fact that (1.1.36) implies f is constant on [a, b] is a consequence of the following result, known as the Mean Value Theorem. This is illustrated in Figure 1.1.2. Theorem 1.1.8. Let f : [a, β] → R be continuous, and assume f is differentiable on (a, β). Then ∃ ξ ∈ (a, β) such that (1.1.37)
f ′ (ξ) =
f (β) − f (a) . β−a
1.1. One variable calculus
9
Proof. Set g(x) = f (x) − κ(x − a), where κ is the right side of (1.1.37). Note that g ′ (ξ) = f ′ (ξ) − κ, so it suffices to show that g ′ (ξ) = 0 for some ξ ∈ (a, β). Note also that g(a) = g(β). Since [a, β] is compact, g must assume a maximum and a minimum on [a, β]. Since g(a) = g(β), one of these must be assumed at an interior point, at which g ′ vanishes. Now, to see that (1.1.36) implies f is constant on [a, b], if not, ∃ β ∈ (a, b] such that f (β) ̸= f (a). Then just apply Theorem 1.1.8 to f on [a, β]. This completes the proof of Theorem 1.1.7. We now extend Theorems 1.1.6–1.1.7 to the setting of Riemann integrable functions. Proposition 1.1.9. Let f ∈ R([a, b]), and define g by (1.1.28). If x ∈ [a, b] and f is continuous at x, then g is differentiable at x, and g ′ (x) = f (x). The proof is identical to that of Theorem 1.1.6. Proposition 1.1.10. Assume G is differentiable on [a, b] and G′ ∈ R([a, b]). Then (1.1.34) holds. Proof. We have G(b) − G(a) =
( ( k )] k + 1) − G a + (b − a) G a + (b − a) n n
n−1 ∑[ k=0
n−1 b−a ∑ ′ = G (ξkn ), n k=0
for some ξkn satisfying a + (b − a)
k+1 k < ξkn < a + (b − a) , n n
as a consequence of the Mean Value Theorem. Given G′ ∈ R([a, b]), Darboux’s ∫b theorem (Theorem 1.1.4) implies that as n → ∞ one gets G(b)−G(a) = a G′ (t) dt. Note that the beautiful symmetry in Theorems 1.1.6–1.1.7 is not preserved in Propositions 1.1.9–1.1.10. The hypothesis of Proposition 1.1.10 requires G to be differentiable at each x ∈ [a, b], but the conclusion of Proposition 1.1.9 does not yield differentiability at all points. For this reason, we regard Propositions 1.1.9–1.1.10 as less “fundamental” than Theorems 1.1.6–1.1.7. There are more satisfactory extensions of the fundamental theorem of calculus, involving the Lebesgue integral, and a more subtle notion of the “derivative” of a non-smooth function. For this, we can point the reader to Chapters 10–11 of the text [47], Measure Theory and Integration. So far, we have dealt with integration of real valued functions. If f : I → C, we set f = f1 + if2 with fj : I → R and say f ∈ R(I) if and only if f1 and f2 are
10
in R(I). Then
1. Background
∫
∫ f dx =
I
∫ f1 dx + i
I
f2 dx. I
There are straightforward extensions of Propositions 1.1.5–1.1.10 to complex valued functions. Similar comments apply to functions f : I → Rn . ′ If a function G is differentiable on (a, ( b), )and G is continuous on (a, b),k (we say) 1 1 G ∈ C (a, b) . Inductively, we say G ∈ C (a, b) G is a C function, (and write ) provided G′ ∈ C k−1 (a, b) . An easy consequence of the definition (1.1.31) of the derivative is that, for any real constants a, b, and c, df f (x) = ax2 + bx + c =⇒ = 2ax + b. dx Now, it is a simple enough step to replace a, b, c by y, z, w, in these formulas. Having done that, we can regard y, z, and w as variables, along with x : F (x, y, z, w) = yx2 + zx + w. We can then hold y, z and w fixed (e.g., set y = a, z = b, w = c), and then differentiate with respect to x; we get ∂F = 2yx + z, ∂x the partial derivative of F with respect to x. Generally, if F is a function of n variables, x1 , . . . , xn , we set ∂F (x1 , . . . ,xn ) ∂xj 1[ (1.1.38) = lim F (x1 , . . . , xj−1 , xj + h, xj+1 , . . . , xn ) h→0 h ] − F (x1 , . . . , xj−1 , xj , xj+1 , . . . , xn ) , where the limit exists. Section 2.1 carries on with a further investigation of the derivative of a function of several variables. Complementary results on Riemann integrability Here we provide a condition, more general then Proposition 1.1.2, which guarantees Riemann integrability. Proposition 1.1.11. Let f : I → R be a bounded function, with I = [a, b]. Suppose that the set S of points of discontinuity of f has the property (1.1.39)
cont+ (S) = 0.
Then f ∈ R(I). Proof. Say |f (x)| ≤ M . Take ε > 0. As in (1.1.21), take intervals J1 , . . . , JN such ∑N that S ⊂ J1 ∪ · · · ∪ JN and k=1 ℓ(Jk ) < ε. In fact, fatten each Jk such that S is contained in the interior of this collection of intervals. Consider a partition P0 of
1.1. One variable calculus
11
I, whose intervals include J1 , . . . , JN , amongst others, which we label I1 , . . . , IK . Now f is continuous on each interval Iν , so, subdividing each Iν as necessary, hence refining P0 to a partition P1 , we arrange that sup f − inf f < ε on each such subdivided interval. Denote these subdivided intervals I1′ , . . . , IL′ . It readily follows that N L ∑ ∑ 0 ≤ I P1 (f ) − I P1 (f ) < 2M ℓ(Jk ) + εℓ(Ik′ ) k=1
k=1
< 2εM + εℓ(I). Since ε can be taken arbitrarily small, this establishes that f ∈ R(I).
Remark. An even better result is that such f is Riemann integrable if and only if m∗ (S) = 0,
(1.1.40)
where m∗ (S) is defined by (1.1.22). The implication m∗ (S) = 0 ⇒ f ∈ R(I) (in the n-dimensional setting) is established in Proposition 3.1.31 of this text. For the 1-dimensional case, see also Proposition 4.2.12 of [49]. For the reverse implication f ∈ R(I) ⇒ m∗ (S) = 0, one can see standard books on measure theory, such as [17] and [47]. We give an example of a function to which Proposition 1.1.11 applies, and then an example for which Proposition 1.1.11 fails to apply, though the function is Riemann integrable. Example 1. Let I = [0, 1]. Define f : I → R by f (0) = 0, f (x) = (−1)j for x ∈ (2−(j+1) , 2−j ], j ≥ 0. Then |f | ≤ 1 and the set of points of discontinuity of f is S = {0} ∪ {2−j : j ≥ 1}. It is easy to see that cont+ S = 0. Hence f ∈ R(I). See Exercises 19–20 below for a more elaborate example to which Proposition 1.1.11 applies. Example 2. Again I = [0, 1]. Define f : I → R by f (x) = 0 if x ∈ / Q, 1 m if x = , in lowest terms. n n Then |f | ≤ 1 and the set of points of discontinuity of f is S = I ∩ Q.
12
1. Background
As we have seen below (1.1.23), cont+ S = 1, so Proposition 1.1.11 does not apply. Nevertheless, it is fairly easy to see directly that I(f ) = I(f ) = 0,
so f ∈ R(I).
In fact, given ε > 0, f ≥ ε only on a finite set, hence I(f ) ≤ ε,
∀ ε > 0.
As indicated below (1.1.23), (1.1.40) does apply to this function. By contrast, the function ϑ in (1.1.16) is discontinuous at each point of I. We mention an alternative characterization of I(f ) and I(f ), which can be useful. Given I = [a, b], we say g : I → R is piecewise constant on I (and write g ∈ PK(I)) provided there exists a partition P = {Jk } of I such that g is constant on the interior of each interval Jk . Clearly PK(I) ⊂ R(I). It is easy to see that, if f : I → R is bounded, {∫ } I(f ) = inf f1 dx : f1 ∈ PK(I), f1 ≥ f , I {∫
(1.1.41) I(f ) = sup
} f0 dx : f0 ∈ PK(I), f0 ≤ f .
I
Hence, given f : I → R bounded, (1.1.42)
f ∈ R(I) ⇔ for each ε > 0, ∃f0 , f1 ∈ PK(I) such that ∫ f0 ≤ f ≤ f1 and (f1 − f0 ) dx < ε. I
This can be used to prove (1.1.43)
f, g ∈ R(I) =⇒ f g ∈ R(I),
via the fact that (1.1.44)
fj , gj ∈ PK(I) =⇒ fj gj ∈ PK(I).
In fact, we have the following, which can be used to prove (1.1.43). Proposition 1.1.12. Let f ∈ R(I), and assume |f | ≤ M . Let φ : [−M, M ] → R be continuous. Then φ ◦ f ∈ R(I). Proof. We proceed in steps. Step 1. We can obtain φ as a uniform limit on [−M, M ] of a sequence φν of continuous, piecewise linear functions. Then φν ◦ f → φ ◦ f uniformly on I. A uniform limit g of functions gν ∈ R(I) is in R(I) (see Exercise 12). So it suffices to prove Proposition 1.1.12 when φ is continuous and piecewise linear. Step 2. Given φ : [−M, M ] → R continuous and piecewise linear, it is an exercise
Exercises
13
to write φ = φ1 − φ2 , with φj : [−M, M ] → R monotone, continuous, and piecewise linear. Now φ1 ◦ f, φ2 ◦ f ∈ R(I) ⇒ φ ◦ f ∈ R(I). Step 3. We now demonstrate Proposition 1.1.12 when φ : [−M, M ] → R is monotone and Lipschitz. By Step 2, this will suffice. So we assume −M ≤ x1 < x2 ≤ M =⇒ φ(x1 ) ≤ φ(x2 ) and φ(x2 ) − φ(x1 ) ≤ L(x2 − x1 ), for some L < ∞. Given ε > 0, pick f0 , f1 ∈ PK(I), as in (1.1.42). Then φ ◦ f0 , φ ◦ f1 ∈ PK(I), and
φ ◦ f0 ≤ φ ◦ f ≤ φ ◦ f1 ,
∫
∫ (φ ◦ f1 − φ ◦ f0 ) dx ≤ L
I
(f1 − f0 ) dx ≤ Lε. I
This proves φ ◦ f ∈ R(I).
Exercises 1. Let c > 0 and let f : [ac, bc] → R be Riemann integrable. Working directly with the definition of integral, show that ∫ b ∫ 1 bc (1.1.45) f (cx) dx = f (x) dx. c ac a More generally, show that ∫ b−d/c ∫ 1 bc (1.1.46) f (cx + d) dx = f (x) dx. c ac a−d/c n 2. ∫ Let f : I × S → R be continuous, where I = [a, b] and S ⊂ R . Take φ(y) = f (x, y) dx. Show that φ is continuous on S. I Hint. If fj : I → R are continuous and |f1 (x) − f2 (x)| ≤ δ on I, then ∫ ∫ (1.1.47) f1 dx − f2 dx ≤ ℓ(I)δ. I
I
Hint. Suppose yj ∈ S, yj → y ∈ S. Let Se = {yj } ∪ {y}. This is compact. Thus f : I × Se → R is uniformly continuous. Hence |f (x, yj ) − f (x, y)| ≤ ω(|yj − y|),
∀ x ∈ I,
where ω(δ) → 0 as δ → 0. 3. With f as in Exercise 2, suppose gj : S → R are continuous and a ≤ g0 (y) < ∫ g (y) g1 (y) ≤ b. Take φ(y) = g01(y) f (x, y) dx. Show that φ is continuous on S. Hint. Make a change of variables, linear in x, to reduce this to Exercise 2.
14
1. Background
4. Suppose f : (a, b) → (c, d) and g : (c, d) → R are differentiable. Show that h(x) = g(f (x)) is differentiable and h′ (x) = g ′ (f (x))f ′ (x). This is the chain rule. Hint. Peek at the proof of the chain rule in §2.1. 5. If f1 and f2 are differentiable on (a, b), show that f1 (x)f2 (x) is differentiable and ) d ( f1 (x)f2 (x) = f1′ (x)f2 (x) + f1 (x)f2′ (x). dx If f2 (x) ̸= 0, ∀ x ∈ (a, b), show that f1 (x)/f2 (x) is differentiable, and d ( f1 (x) ) f1′ (x)f2 (x) − f1 (x)f2′ (x) = . dx f2 (x) f2 (x)2 6. Let φ : [a, b] → [A, B] be C 1 on a neighborhood J of [a, b], with φ′ (x) > 0 for all x ∈ [a, b]. Assume φ(a) = A, φ(b) = B. Show that the identity ∫ B ∫ b ( ) (1.1.48) f (y) dy = f φ(t) φ′ (t) dt, A
a
for any f ∈ C(I), I = [A, B], follows from the chain rule and the Fundamental Theorem of Calculus. This is the change of variable formula for the one-dimensional integral. Hint. Replace b by x, B by φ(x), and differentiate.
7. Show that (1.1.48) holds for each f ∈ PK(I). Using (1.1.41)–(1.1.42), show that f ∈ R(I) ⇒ f ◦ φ ∈ R([a, b]) and (1.1.48) holds. (This result contains that of Exercise 1.) 8. Show that, if f and g are C 1 on a neighborhood of [a, b], then ∫ b ∫ b [ ] ′ (1.1.49) f (s)g (s) ds = − f ′ (s)g(s) ds + f (b)g(b) − f (a)g(a) . a
a
This transformation of integrals is called “integration by parts.” 9. Let f : (−a, a) → R be a C j+1 function. Show that, for x ∈ (−a, a), (1.1.50)
f (x) = f (0) + f ′ (0)x +
where (1.1.51)
∫ Rj (x) = 0
x
f ′′ (0) 2 f (j) (0) j x + ··· + x + Rj (x), 2 j!
(x − s)j (j+1) f (s) ds j!
This is Taylor’s formula with remainder. See §2.1 for the multidimensional extension.
Exercises
15
Hint. Use induction. If (1.1.50)–(1.1.51) holds for 0 ≤ j ≤ k, show that it holds for j = k + 1, by showing that ∫ x ∫ x (x − s)k (k+1) f (k+1) (0) k+1 (x − s)k+1 (k+2) (1.1.52) f (s) ds = x + f (s) ds. k! (k + 1)! (k + 1)! 0 0 To establish this, use the integration by parts formula (1.1.49), with f (s) replaced by f (k+1) (s), and with appropriate g(s). See Appendix §A.4 for further material on the remainder formula. Note that another presentation of (1.1.51) is ∫ 1 (( ) ) xj+1 (1.1.53) Rj (x) = f (j+1) 1 − t1/(j+1) x dt. (j + 1)! 0 10. Assume f : (−a, a) → R is a C j function. Show that, for x ∈ (−a, a), (1.1.50) holds, with ∫ x [ ] 1 (x − s)j−1 f (j) (s) − f (j) (0) ds. (1.1.54) Rj (x) = (j − 1)! 0 Hint. Apply (1.1.51) with j replaced by j − 1. Add and subtract f (j) (0) to the factor f (j) (s) in the resulting integrand. 11. Given I = [a, b], show that (1.1.55)
f, g ∈ R(I) =⇒ f g ∈ R(I),
as advertised in (1.1.43). 12. Assume fk ∈ R(I) and fk → f uniformly on I. Prove that f ∈ R(I) and ∫ ∫ (1.1.56) fk dx −→ f dx. I
I
13. Given I = [a, b], Iε = [a + ε, b − ε], assume fk ∈ R(I), |fk | ≤ M on I for all k, and (1.1.57)
fk −→ f uniformly on Iε ,
for all ε ∈ (0, (b − a)/2). Prove that f ∈ R(I) and (1.1.56) holds. 14. Use the fundamental theorem of calculus to compute ∫ b (1.1.58) xr dx, r ∈ Q \ {−1}, a
where 0 ≤ a < b < ∞ if r ≥ 0 and 0 < a < b < ∞ if r < 0. 15. Use the change of variable result of Exercise 6 to compute ∫ 1 √ x 1 + x2 dx. 0
16
1. Background
16. We say f ∈ R(R) provided f |[k,k+1] ∈ R([k, k + 1]) for each k ∈ Z, and ∞ ∫ k+1 ∑ (1.1.59) |f (x)| dx < ∞. k=−∞
If f ∈ R(R), we set
∫
k
∫
∞
(1.1.60)
k
f (x) dx = lim
k→∞
−∞
f (x) dx. −k
Formulate and demonstrate basic properties of the integral over R of elements of R(R). 17. This exercise discusses the integral test for absolute convergence of an infinite series, which goes as follows. Let f be a positive, monotonically decreasing, continuous function on [0, ∞), and suppose |ak | = f (k). Then ∫ ∞ ∞ ∑ |ak | < ∞ ⇐⇒ f (x) dx < ∞. 0
k=0
Prove this. Hint. Use N ∑
∫ |ak | ≤
N
f (x) dx ≤ 0
k=1
N −1 ∑
|ak |.
k=0
18. Use the integral test to show that, if p > 0, (1.1.61)
∞ ∑ 1 < ∞ ⇐⇒ p > 1. kp
k=1
∫N Hint. Use Exercise 14 to evaluate IN (p) = 1 x−p dx, for p ̸= −1, and let N → ∞. ∫ ∞ −1 See if you can show 1 x dx = ∞ without knowing about log N . Subhint. Show ∫2 ∫ 2N that 1 x−1 dx = N x−1 dx. In Exercises 19–20, C ⊂ R is the Cantor set, defined as follows. Take a closed, bounded interval [a, b] = C0 . Let C1 be obtained from C0 by deleting the open middle third interval, of length (b−a)/3. At the jth stage, Cj is a disjoint union of 2j closed intervals, each of length 3−j (b − a). Then Cj+1 is obtained from Cj by deleting the open middle third of each of these 2j intervals. We have C0 ⊃ C1 ⊃ · · · ⊃ Cj ⊃ · · · , each a closed subset of [a, b]. The Cantor set is C = ∩j≥0 Cj . 19. Show that cont+ Cj = (2/3)j (b − a), and conclude that cont+ C = 0. 20. Define f : [a, b] → R as follows. We call an interval of length 3−j (b−a), omitted
1.2. Euclidean spaces
17
in passing from Cj−1 to Cj , a “j-interval.” Set f (x) = 0,
if x ∈ C,
(−1)j ,
if x belongs to a j-interval.
Show that the set of discontinuities of f is C. Hence Proposition 1.1.11 implies f ∈ R([a, b]). 21. Generalize Exercise 8 as follows. Assume f and g are differentiable on a neighborhood of [a, b] and f ′ , g ′ ∈ R([a, b]). Then show that (1.1.49) holds. Hint. Use the results of Exercise 11 to show that (f g)′ ∈ R([a, b]). 22. Let f : I → R be bounded, I = [a, b]. Show that {∫ } I(f ) = inf f1 dx : f1 ∈ C(I), f1 ≥ f , I {∫
I(f ) = sup
} f0 dx : f0 ∈ C(I), f0 ≤ f .
I
Compare (1.1.41). Then show that
(1.1.62)
f ∈ R(I) ⇐⇒ for each ε > 0, ∃ f0 , f1 ∈ C(I) such that ∫ f0 ≤ f ≤ f1 and (f1 − f0 ) dx < ε. I
Compare (1.1.42).
1.2. Euclidean spaces The space Rn , n-dimensional Euclidean space, consists of n-tuples of real numbers: (1.2.1)
x = (x1 , . . . , xn ) ∈ Rn ,
xj ∈ R, 1 ≤ j ≤ n.
The number xj is called the jth component of x. Here we discuss some important algebraic and metric structures on Rn . First, there is addition. If x is as in (1.2.1) and also y = (y1 , . . . , yn ) ∈ Rn , we have (1.2.2)
x + y = (x1 + y1 , . . . , xn + yn ) ∈ Rn .
Addition is done componentwise. Also, given a ∈ R, we have (1.2.3)
ax = (ax1 , . . . , axn ) ∈ Rn .
This is scalar multiplication. In (1.2.1), we represent x as a row vector. Sometimes we want to represent x by a column vector, x1 .. (1.2.4) x = . . xn
18
1. Background
Then (1.2.2)–(1.2.3) are converted to x1 + y1 .. (1.2.5) x+y = , .
ax1 ax = ... .
xn + yn
axn
We also have the dot product, (1.2.6)
x·y =
n ∑
xj yj = x1 y1 + · · · + xn yn ∈ R,
j=1
given x, y ∈ Rn . The dot product has the properties x · y = y · x, (1.2.7)
x · (ay + bz) = a(x · y) + b(x · z), x · x > 0 unless x = 0.
Note that (1.2.8)
x · x = x21 + · · · + x2n .
We set (1.2.9)
|x| =
√
x · x,
which we call the norm of x. Note that (1.2.7) implies (1.2.10)
(ax) · (ax) = a2 (x · x),
hence (1.2.11)
|ax| = |a| · |x|,
for a ∈ R, x ∈ Rn .
Taking a cue from the Pythagorean theorem, we say that the distance from x to y in Rn is (1.2.12)
d(x, y) = |x − y|.
For us, (1.2.9) and (1.2.12) are simply definitions. We do not need to depend on a derivation of the Pythagorean theorem via classical Euclidean geometry. Significant properties will be derived below, without recourse to a prior theory of Euclidean geometry. A set X equipped with a distance function is called a metric space. We consider metric spaces in general in Appendix A.1. Here, we want to show that the Euclidean distance, defined by (1.2.12), satisfies the “triangle inequality,” (1.2.13)
d(x, y) ≤ d(x, z) + d(z, y),
∀ x, y, z ∈ Rn .
This in turn is a consequence of the following, also called the triangle inequality. Proposition 1.2.1. The norm (1.2.9) on Rn has the property (1.2.14)
|x + y| ≤ |x| + |y|,
∀ x, y ∈ Rn .
1.2. Euclidean spaces
19
Proof. We compare the squares of the two sides of (1.2.14). First, |x + y|2 = (x + y) · (x + y) (1.2.15)
=x·x+y·x+y·x+y·y = |x|2 + 2x · y + |y|2 .
Next, (1.2.16)
(|x| + |y|)2 = |x|2 + 2|x| · |y| + |y|2 .
We see that (1.2.14) holds if and only if x·y ≤ |x|·|y|. Thus the proof of Proposition 1.2.1 is finished off by the following result, known as Cauchy’s inequality. Proposition 1.2.2. For all x, y ∈ Rn , |x · y| ≤ |x| · |y|.
(1.2.17) Proof. We start with the chain (1.2.18)
0 ≤ |x − y|2 = (x − y) · (x − y) = |x|2 + |y|2 − 2x · y,
which implies (1.2.19)
2x · y ≤ |x|2 + |y|2 ,
∀ x, y ∈ Rn .
If we replace x by tx and y by t−1 y, with t > 0, the left side of (1.2.19) is unchanged, so we have (1.2.20)
2x · y ≤ t2 |x|2 + t−2 |y|2 ,
∀ t > 0.
Now we pick t so that the two terms on the right side of (1.2.20) are equal, namely (1.2.21)
t2 =
|y| , |x|
t−2 =
|x| . |y|
(At this point, note that (1.2.17) is obvious if x = 0 or y = 0, so we will assume that x ̸= 0 and y ̸= 0.) Plugging (1.2.21) into (1.2.20) gives (1.2.22)
x · y ≤ |x| · |y|,
∀ x, y ∈ Rn .
This is almost (1.2.17). To finish, we can replace x in (1.2.22) by −x = (−1)x, getting (1.2.23)
−(x · y) ≤ |x| · |y|,
and together (1.2.22) and (1.2.23) give (1.2.17).
We now discuss a number of notions and results related to convergence in Rn . First, a sequence of points (pj ) in Rn converges to a limit p ∈ Rn (we write pj → p) if and only if (1.2.24)
|pj − p| −→ 0,
where | · | is the Euclidean norm on Rn , defined by (1.2.9), and the meaning of (1.2.24) is that for every ε > 0 there exists N such that (1.2.25)
j ≥ N =⇒ |pj − p| < ε.
If we write pj = (p1j , . . . , pnj ) and p = (p1 , . . . , pn ), then (1.2.24) is equivalent to (p1j − p1 )2 + · · · + (pnj − pn )2 −→ 0,
as j → ∞,
20
1. Background
which holds if and only if |pℓj − pℓ | −→ 0 as j → ∞, for each ℓ ∈ {1, . . . , n}. That is to say, convergence pj → p in Rn is eqivalent to convergence of each component. A set S ⊂ Rn is said to be closed if and only if pj ∈ S, pj → p =⇒ p ∈ S.
(1.2.26)
The complement R \ S of a closed set S is open. Alternatively, Ω ⊂ Rn is open if and only if, given q ∈ Ω, there exists ε > 0 such that Bε (q) ⊂ Ω, where n
(1.2.27)
Bε (q) = {p ∈ Rn : |p − q| < ε},
so q cannot be a limit of a sequence of points in Rn \ Ω. An important property of Rn is completeness, a property defined as follows. A sequence (pj ) of points in Rn is called a Cauchy sequence if and only if (1.2.28)
|pj − pk | −→ 0,
as j, k → ∞.
Again we see that (pj ) is Cauchy in R if and only if each component is Cauchy in R. It is easy to see that if pj → p for some p ∈ Rn , then (1.2.28) holds. The completeness property is the converse. n
Theorem 1.2.3. If (pj ) is a Cauchy sequence in Rn , then it has a limit, i.e., (1.2.24) holds for some p ∈ Rn . Proof. Since convergence pj → p in Rn is equivalent to convergence in R of each component, the result is a consequence of the completeness of R. This is proved in [49]. Completeness provides a path to the following key notion of compactness. A nonempty set K ⊂ Rn is said to be compact if and only if the following property holds. Each infinite sequence (pj ) in K has a subsequence (1.2.29) that converges to a point in K. It is clear that if K is compact, then it must be closed. It must also be bounded, i.e., there exists R < ∞ such that K ⊂ BR (0). Indeed, if K is not bounded, there exist pj ∈ K such that |pj+1 | ≥ |pj | + 1. In such a case, |pj − pk | ≥ 1 whenever j ̸= k, so (pj ) cannot have a convergent subsequence. The following converse result is the n-dimensional Bolzano-Weierstrass theorem. Theorem 1.2.4. If a nonempty K ⊂ Rn is closed and bounded, then it is compact. Proof. If K ⊂ Rn is closed and bounded, it is a closed subset of some box (1.2.30)
B = {(x1 , . . . , xn ) ∈ Rn : a ≤ xk ≤ b, ∀ k}.
Clearly every closed subset of a compact set is compact, so it suffices to show that B is compact. Now, each closed bounded interval [a, b] in R is compact, as shown in Appendix A.1, and (by reasoning similar to the proof of Theorem 1.2.3) the compactness of B follows readily from this.
1.2. Euclidean spaces
21
We establish some further properties of compact sets K ⊂ Rn , leading to the important result, Proposition 1.2.8 below. A generalization of this result is given in Appendix A.1. Proposition 1.2.5. Let K ⊂ Rn be compact. Assume X1 ⊃ X2 ⊃ X3 ⊃ · · · form a decreasing sequence of closed subsets of K. If each Xm ̸= ∅, then ∩m Xm ̸= ∅. Proof. Pick xm ∈ Xm . If K is compact, (xm ) has a convergent subsequence, xmk → y. Since {xmk : k ≥ ℓ} ⊂ Xmℓ , which is closed, we have y ∈ ∩m Xm . Corollary 1.2.6. Let K ⊂ Rn be compact. Assume U1 ⊂ U2 ⊂ U3 ⊂ · · · form an increasing sequence of open sets in Rn . If ∪m Um ⊃ K, then UM ⊃ K for some M . Proof. Consider Xm = K \ Um .
Before getting to Proposition 1.2.8, we bring in the following. Let Q denote the set of rational numbers, and let Qn denote the set of points in Rn all of whose components are rational. The set Qn ⊂ Rn has the following “denseness” property: given p ∈ Rn and ε > 0, there exists q ∈ Qn such that |p − q| < ε. Let (1.2.31)
R = {Br (q) : q ∈ Qn , r ∈ Q ∩ (0, ∞)}.
Note that Q and Qn are countable, i.e., they can be put in one-to-one correspondence with N. Hence R is a countable collection of balls. The following lemma is left as an exercise for the reader. Lemma 1.2.7. Let Ω ⊂ Rn be a nonempty open set. Then ∪ (1.2.32) Ω = {B : B ∈ R, B ⊂ Ω}. To state the next result, we say that a collection {Uα : α ∈ A} covers K if K ⊂ ∪α∈A Uα . If each Uα ⊂ Rn is open, it is called an open cover of K. If B ⊂ A and K ⊂ ∪β∈B Uβ , we say {Uβ : β ∈ B} is a subcover. The following is part of the n-dimensional Heine-Borel theorem. (See Theorem A.1.10.) Proposition 1.2.8. If K ⊂ Rn is compact, then it has the following property. (1.2.33)
Every open cover {Uα : α ∈ A} of K has a finite subcover.
Proof. By Lemma 1.2.7, it suffices to prove the following. (1.2.34)
Every countable cover {Bj : j ∈ N} of K by open balls has a finite subcover.
To see this, write R = {Bj : j ∈ N}. Given the cover {Uα }, pass to {Bj : j ∈ J}, where j ∈ J if and only of Bj is contained in some Uα . By (1.2.32), {Bj : j ∈ J} covers K. If (1.2.34) holds, we have a subcover {Bℓ : ℓ ∈ L} for some finite L ⊂ J. Pick αℓ ∈ A such that Bℓ ⊂ Uαℓ . The {Uαℓ : ℓ ∈ L} is the desired finite subcover advertised in (1.2.33). Finally, to prove (1.2.34), we set (1.2.35) and apply Corollary 1.2.6.
Um = B1 ∪ · · · ∪ Bm
22
1. Background
Exercises 1. Identifying z = x + iy ∈ C with (x, y) ∈ R2 and w = u + iv ∈ C with (u, v) ∈ R2 , show that the dot product satisfies z · w = Re zw. 2. Show that the inequality (1.2.14) implies (1.2.13). 3. Prove Lemma 1.2.7. 4. Use Proposition 1.2.8 to prove the following extension of Proposition 1.2.5. Proposition 1.2.9. Let K ⊂ Rn be compact. Assume {Xα : α ∈ A} is a collection of closed subsets of K. Assume that for each finite set B ⊂ A, ∩α∈B Xα ̸= ∅. Then ∩ Xα ̸= ∅. α∈A
Hint. Consider Uα = Rn \ Xα . 5. Let K ⊂ Rn be compact. Show that there exist x0 , x1 ∈ K such that |x0 | ≤ |x|,
∀ x ∈ K,
|x1 | ≥ |x|,
∀ x ∈ K.
We say |x0 | = min |x|, x∈K
|x1 | = max |x|. x∈K
1.3. Vector spaces and linear transformations We have seen in §1.2 how Rn is a vector space, with vector operations given by (1.2.2)–(1.2.3), for row vectors, and by (1.2.4)–(1.2.5) for column vectors. We could also use complex numbers, replacing Rn by Cn , and allowing a ∈ C in (1.2.3) and (1.2.5). We will use F to denote R or C. Many other vector spaces arise naturally. We define this general notion now. A vector space over F is a set V , endowed with two operations, that of vector addition and multiplication by scalars. That is, given v, w ∈ V and a ∈ F, then v + w and av are defined in V . Furthermore, the following properties are to hold, for all u, v, w ∈ V, a, b ∈ F. First there are laws for vector addition: Commutative law : (1.3.1)
Associative law : Zero vector : Negative :
u + v = v + u, (u + v) + w = u + (v + w), ∃ 0 ∈ V, v + 0 = v, ∃ − v, v + (−v) = 0.
1.3. Vector spaces and linear transformations
23
Next there are laws for multiplication by scalars: (1.3.2)
Associative law : Unit :
a(bv) = (ab)v, 1 · v = v.
Finally there are two distributive laws: (1.3.3)
a(u + v) = au + av, (a + b)u = au + bu.
It is easy to see that Rn and Cn satisfy all these rules. We will present a number of other examples below. Let us also note that a number of other simple identities are automatic consequences of the rules given above. Here are some, which the reader is invited to verify: v + w = v ⇒ w = 0, v + 0 · v = (1 + 0)v = v, (1.3.4)
0 · v = 0, v + w = 0 ⇒ w = −v, v + (−1)v = 0 · v = 0, (−1)v = −v.
Here are some other examples of vector spaces. Let I = [a, b] denote an interval in R, and take a non-negative integer k. Then C k (I) denotes the set of functions f : I → F whose derivatives up to order k are continuous. We denote by P the set of polynomials in x, with coefficients in F. We denote by Pk the set of polynomials in x of degree ≤ k. In these various cases, (1.3.5)
(f + g)(x) = f (x) + g(x),
(af )(x) = af (x).
Such vector spaces and certain of their linear subspaces play a major role in the material developed in these notes. Regarding the notion just mentioned, we say a subset W of a vector space V is a linear subspace provided (1.3.6)
wj ∈ W, aj ∈ F =⇒ a1 w1 + a2 w2 ∈ W.
Then W inherits the structure of a vector space. Linear transformations and matrices
If V and W are vector spaces over F (R or C), a map (1.3.7)
T : V −→ W
is said to be a linear transformation provided (1.3.8)
T (a1 v1 + a2 v2 ) = a1 T v1 + a2 T v2 ,
∀ aj ∈ F, vj ∈ V.
We also write T ∈ L(V, W ). In case V = W , we also use the notation L(V ) = L(V, V ).
24
1. Background
Linear transformations arise in a number of ways. For example, an m × n matrix A with entries in F defines a linear transformation A : Fn −→ Fm
(1.3.9) by
(1.3.10)
a11 .. .
···
a1n b1 Σa1ℓ bℓ .. .. = .. . . . .
am1
···
amn
bn
Σamℓ bℓ
We also have linear transformations on function spaces, such as multiplication operators Mf : C k (I) −→ C k (I),
(1.3.11)
Mf g(x) = f (x)g(x),
given f ∈ C (I), I = [a, b], and the operation of differentiation: k
(1.3.12)
D : C k+1 (I) −→ C k (I),
Df (x) = f ′ (x).
We also have integration: (1.3.13)
∫
I : C k (I) −→ C k+1 (I),
If (x) =
x
f (y) dy. a
Note also that D : Pk+1 −→ Pk ,
(1.3.14)
I : Pk −→ Pk+1 ,
where Pk denotes the space of polynomials in x of degree ≤ k. Two linear transformations Tj ∈ L(V, W ) can be added: (1.3.15)
T1 + T2 : V −→ W,
(T1 + T2 )v = T1 v + T2 v.
Also T ∈ L(V, W ) can be multiplied by a scalar: aT : V −→ W,
(1.3.16)
(aT )v = a(T v).
This makes L(V, W ) a vector space. We can also compose linear transformations S ∈ L(W, X), T ∈ L(V, W ): ST : V −→ X,
(1.3.17)
(ST )v = S(T v).
For example, we have (1.3.18)
Mf D : C k+1 (I) −→ C k (I),
Mf Dg(x) = f (x)g ′ (x),
given f ∈ C k (I). When two transformations (1.3.19)
A : Fn −→ Fm ,
B : Fk −→ Fn
are represented by matrices, e.g., A as in (1.3.10) and b11 · · · b1k .. , (1.3.20) B = ... . bn1
···
bnk
then (1.3.21)
AB : Fk −→ Fm
1.3. Vector spaces and linear transformations
25
is given by matrix multiplication: Σa1ℓ bℓ1 .. (1.3.22) AB = .
···
Σa1ℓ bℓk .. . .
Σamℓ bℓ1
···
Σamℓ bℓk
For example, ( a11 a21
a12 a22
)(
b11 b21
b12 b22
)
( =
a11 b11 + a12 b21 a21 b11 + a22 b21
) a11 b12 + a12 b22 . a21 b12 + a22 b22
Another way of writing (1.3.22) is to represent A and B as (1.3.23)
A = (aij ),
B = (bij ),
and then we have (1.3.24)
AB = (dij ),
dij =
n ∑
aiℓ bℓj .
ℓ=1
To establish the identity (1.3.22), we note that it suffices to show the two sides have the same effect on each ej ∈ Fk , 1 ≤ j ≤ k, where ej is the column vector in Fk whose jth entry is 1 and whose other entries are 0. First note that b1j .. (1.3.25) Bej = . , bnj the jth column in B, as one can see via (1.3.10). Similarly, if D denotes the right side of (1.3.22), Dej is the jth column of this matrix, i.e., Σa1ℓ bℓj .. (1.3.26) Dej = . . Σamℓ bℓj On the other hand, applying A to (1.3.25), via (1.3.10), gives the same result, so (1.3.25) holds. Associated with a linear transformation as in (1.3.7) there are two special linear spaces, the null space of T and the range of T . The null space of T is (1.3.27)
N (T ) = {v ∈ V : T v = 0},
and the range of T is (1.3.28)
R(T ) = {T v : v ∈ V }.
Note that N (T ) is a linear subspace of V and R(T ) is a linear subspace of W . If N (T ) = 0 we say T is injective; if R(T ) = W we say T is surjective. Note that T is injective if and only if T is one-to-one, i.e., (1.3.29)
T v1 = T v2 =⇒ v1 = v2 .
If T is surjective, we also say T is onto. If T is one-to-one and onto, we say it is an isomorphism. In such a case the inverse (1.3.30)
T −1 : W −→ V
26
1. Background
is well defined, and it is a linear transformation. We also say T is invertible, in such a case. Basis and dimension
Given a finite set S = {v1 , . . . , vk } in a vector space V , the span of S is the set of vectors in V of the form (1.3.31)
c1 v1 + · · · + ck vk ,
with cj arbitrary scalars, ranging over F = R or C. This set, denoted Span(S) is a linear subspace of V . The set S is said to be linearly dependent if and only if there exist scalars c1 , . . . , ck , not all zero, such that (1.3.31) vanishes. Otherwise we say S is linearly independent. If {v1 , . . . , vk } is linearly independent, we say S is a basis of Span(S), and that k is the dimension of Span(S). In particular, if this holds and Span(S) = V , we say k = dim V . We also say V has a finite basis, and that V is finite dimensional. By convention, if V has only one element, the zero element, we say V = 0 and dim V = 0. It is easy to see that any finite set S = {v1 , . . . , vk } ⊂ V has a maximal subset that is linearly independent, and such a subset has the same span as S, so Span(S) has a basis. To take a complementary perspective, S will have a minimal subset S0 with the same span, and any such minimal subset will be a basis of Span(S). Soon we will show that any two bases of a finite-dimensional vector space V have the same number of elements (so dim V is well defined). First, let us relate V to Fk . So say V has a basis S = {v1 , . . . , vk }. We define a linear transformation (1.3.32)
A : Fk −→ V
by (1.3.33) where
(1.3.34)
A(c1 e1 + · · · + ck ek ) = c1 v1 + · · · + ck vk , 1 0 0 .. e1 = . , . . . . . . , e k = . . .. 0 0 1
We say {e1 , . . . , ek } is the standard basis of Fk . The linear independence of S is equivalent to the injectivity of A and the statement that S spans V is equivalent to the surjectivity of A. Hence the statement that S is a basis of V is equivalent to the statement that A is an isomorphism, with inverse uniquely specified by (1.3.35)
A−1 (c1 v1 + · · · + ck vk ) = c1 e1 + · · · + ck ek .
We begin our demonstration that dim V is well defined, with the following concrete result.
1.3. Vector spaces and linear transformations
27
Lemma 1.3.1. If v1 , . . . , vk+1 are vectors in Fk , then they are linearly dependent. Proof. We use induction on k. The result is obvious if k = 1. We can suppose the last component of some vj is nonzero, since otherwise we can regard these vectors as elements of Fk−1 and use the inductive hypothesis. Reordering these vectors, we can assume the last component of vk+1 is nonzero, and it can be assumed to be 1. Form wj = vj − vkj vk+1 , 1 ≤ j ≤ k, where vj = (v1j , . . . , vkj )t . Then the last component of each of the vectors w1 , . . . , wk is 0, so we can regard these as k vectors in Fk−1 . By induction, there exist scalars a1 , . . . , ak , not all zero, such that a1 w1 + · · · + ak wk = 0, so we have a1 v1 + · · · + ak vk = (a1 vk1 + · · · + ak vkk )vk+1 , the desired linear dependence relation on {v1 , . . . , vk+1 }.
With this result in hand, we proceed. Proposition 1.3.2. If V has a basis {v1 , . . . , vk } with k elements and {w1 , . . . , wℓ } ⊂ V is linearly independent, then ℓ ≤ k. Proof. Take the isomorphism A : Fk → V described in (1.3.32)–(1.3.33). The hypotheses imply that {A−1 w1 , . . . , A−1 wℓ } is linearly independent in Fk , so Lemma 1.3.1 implies ℓ ≤ k. Corollary 1.3.3. If V is finite-dimensional, any two bases of V have the same number of elements. If V is isomorphic to W , these spaces have the same dimension. Proof. If S (with #S elements) and T are bases of V , we have #S ≤ #T and #T ≤ #S, hence #S = #T . For the latter part, an isomorphism of V onto W takes a basis of V to a basis of W . The following is an easy but useful consequence. Proposition 1.3.4. If V is finite dimensional and W ⊂ V a linear subspace, then W has a finite basis, and dim W ≤ dim V . Proof. Suppose {w1 , . . . , wℓ } is a linearly independent subset of W . Proposition 1.3.2 implies ℓ ≤ dim V . If this set spans W , we are done. If not, there is an element wℓ+1 ∈ W not in this span, and {w1 , . . . , wℓ+1 } is a linearly independent subset of W . Again ℓ + 1 ≤ dim V . Continuing this process a finite number of times must produce a basis of W . A similar argument establishes: Proposition 1.3.5. Suppose V is finite dimensional, W ⊂ V a linear subspace, and {w1 , . . . , wℓ } a basis of W . Then V has a basis of the form {w1 , . . . , wℓ , u1 , . . . , um }, and ℓ + m = dim V .
28
1. Background
Having this, we can establish the following result, sometimes called the fundamental theorem of linear algebra. Proposition 1.3.6. Assume V and W are vector spaces, V finite dimensional, and A : V −→ W
(1.3.36) a linear map. Then (1.3.37)
dim N (A) + dim R(A) = dim V.
Proof. Let {w1 , . . . , wℓ } be a basis of N (A) ⊂ V , and complete it to a basis {w1 , . . . , wℓ , u1 , . . . , um } of V . Set L = Span{u1 , . . . , um }, and consider (1.3.38)
A0 : L −→ W,
A 0 = A L .
Clearly w ∈ R(A) ⇒ w = A(a1 w1 + · · · + aℓ wℓ + b1 u1 + · · · + bm um ) = A0 (b1 u1 + · · · + bm um ), so (1.3.39)
R(A0 ) = R(A).
Furthermore, (1.3.40)
N (A0 ) = N (A) ∩ L = 0.
Hence A0 : L → R(A0 ) is an isomorphism. Thus dim R(A) = dim R(A0 ) = dim L = m, and we have (1.3.37). The following is a significant special case. Corollary 1.3.7. Let V be finite dimensional, and let A : V → V be linear. Then A injective ⇐⇒ A surjective ⇐⇒ A isomorphism. We mention that these equivalences can fail for infinite dimensional spaces. For example, if P denotes the space of polynomials in x, then Mx : P → P (Mx f (x) = xf (x)) is injective but not surjective, while D : P → P (Df (x) = f ′ (x)) is surjective but not injective. Next we have the following important characterization of injectivity and surjectivity. Proposition 1.3.8. Assume V and W are finite dimensional and A : V → W is linear. Then (1.3.41)
A surjective ⇐⇒ AB = IW , for some B ∈ L(W, V ),
and (1.3.42)
A injective ⇐⇒ CA = IV , for some C ∈ L(W, V ).
Proof. Clearly AB = I ⇒ A surjective and CA = I ⇒ A injective. We establish the converses. First assume A : V → W is surjective. Let {w1 , . . . , wℓ } be a basis of W . Pick vj ∈ V such that Avj = wj . Set (1.3.43)
B(a1 w1 + · · · + aℓ wℓ ) = a1 v1 + · · · + aℓ vℓ .
1.3. Vector spaces and linear transformations
29
This works in (1.3.41). Next assume A : V → W is injective. Let {v1 , . . . , vk } be a basis of V . Set wj = Avj . Then {w1 , . . . , wk } is linearly independent, hence a basis of R(A), and we then can produce a basis {w1 , . . . , wk , u1 , . . . , um } of W . Set (1.3.44)
C(a1 w1 + · · · + ak wk + b1 u1 + · · · + bm um ) = a1 v1 + · · · + ak vk .
This works in (1.3.42).
An m × n matrix A defines a linear transformation A : Fn → Fm , as in (1.3.9)– (1.3.10). The columns of A are a1j (1.3.45) aj = ... . amj As seen in (1.3.25), (1.3.46)
Aej = aj ,
where e1 , . . . , en is the standard basis of Fn . Hence (1.3.47)
R(A) = linear span of the columns of A,
so (1.3.48)
R(A) = Fm ⇐⇒ a1 , . . . , an span Fm .
Furthermore, (1.3.49)
A
n (∑
n ) ∑ cj ej = 0 ⇐⇒ cj aj = 0,
j=1
j=1
so (1.3.50)
N (A) = 0 ⇐⇒ {a1 , . . . , an } is linearly independent.
We have the following conclusion, in case m = n. Proposition 1.3.9. Let A be an n × n matrix, defining A : Fn → Fn . Then the following are equivalent: A is invertible, (1.3.51)
The columns of A are linearly independent, The columns of A span Fn .
30
1. Background
Exercises 1. Show that the results in (1.3.4) follow from the basic rules (1.3.1)–(1.3.3). Hint. To start, add −v to both sides of the identity v + w = v, and take account first of the associative law in (1.3.1), and then of the rest of (1.3.1). For the second line of (1.3.4), use the rules (1.3.2) and (1.3.3). Then use the first two lines of (1.3.4) to justify the third line... 2. Demonstrate the following results for any vector space. Take a ∈ F, v ∈ V . a · 0 = 0 ∈ V, a(−v) = −av. Hint. Feel free to use the results of (1.3.4). Let V be a vector space (over F) and W, X ⊂ V linear subspaces. We say (1.3.52)
V =W +X
provided each v ∈ V can be written (1.3.53)
v = w + x,
w ∈ W, x ∈ X.
We say V =W ⊕X
(1.3.54)
provided each v ∈ V has a unique representation (1.3.53). 3. Show that V = W ⊕ X ⇐⇒ V = W + X and W ∩ X = 0. 4. Let A : Fn → Fm be defined by an m × n matrix, as in (1.3.9)–(1.3.10). (a) Show that R(A) is the span of the columns of A. Hint. See (1.3.25). (b) Show that N (A) = 0 if and only if the columns of A are linearly independent. 5. Define the transpose of an m × n matrix A = (ajk ) to be the n × m matrix At = (akj ). Thus, if A is as in (1.3.9)–(1.3.10), a11 · · · am1 .. . (1.3.55) At = ... . a1n For example,
1 A = 3 5
···
amn
( 2 1 3 t 4 =⇒ A = 2 4 6
) 5 . 6
1.4. Determinants
31
Suppose also B is an n × k matrix, as in (1.3.20), so AB is defined, as in (1.3.21). Show that (AB)t = B t At .
(1.3.56) 6. Let (
A= 1 2
)
3 ,
2 B = 0 . 2
Compute AB and BA. Then compute At B t and B t At . 7. Let P5 be the space of real polynomials in x of degree ≤ 5 and set ( ) T : P5 −→ R3 , T p = p(−1), p(0), p(1) . Specify R(T ) and N (T ), and verify (1.3.37) for V = P5 , W = R3 , A = T . 8. Denote the space of m × n matrices with entries in F (as in (1.3.10)) by M (m × n, F).
(1.3.57) If m = n, denote it by
M (n, F).
(1.3.58) Show that
dim M (m × n, F) = mn, especially dim M (n, F) = n2 .
1.4. Determinants Determinants arise in the study of inverting a matrix. To take the 2×2 case, solving for x and y the system ax + by = u, (1.4.1) cx + dy = v can be done by multiplying these equations by d and b, respectively, and subtracting, and by multiplying them by c and a, respectively, and subtracting, yielding (1.4.2)
(ad − bc)x = du − bv, (ad − bc)y = av − cu.
The factor on the left is
(
a b c d
) = ad − bc,
(1.4.3)
det
and solving (1.4.2) for x and ( a (1.4.4) A= c
y leads to ) ( ) 1 b d −b −1 =⇒ A = , d det A −c a
provided det A ̸= 0.
32
1. Background
We now consider determinants of n × n matrices. Let M (n, F) denote the set of n × n matrices with entries in F = R or C. We write a11 · · · a1n .. = (a , . . . , a ), (1.4.5) A = ... 1 n . ···
an1 where
ann
a1j aj = ...
(1.4.6)
anj is the jth column of A. The determinant is defined as follows. Proposition 1.4.1. There is a unique function ϑ : M (n, F) −→ F,
(1.4.7)
satisfying the following three properties: (a) ϑ is linear in each column aj of A, e = −ϑ(A) if A e is obtained from A by interchanging two columns, (b) ϑ(A) (c) ϑ(I) = 1. This defines the determinant: (1.4.8)
ϑ(A) = det A.
If (c) is replaced by (c′ ) ϑ(I) = r, then (1.4.9)
ϑ(A) = r det A.
The proof will involve constructing an explicit formula for det A by following the rules (a)–(c). We start with the case n = 3. We have (1.4.10)
det A =
3 ∑
aj1 det(ej , a2 , a3 ),
j=1
∑ by applying (a) to the first column of A, a1 = j aj1 ej . Here and below, {ej : 1 ≤ j ≤ n} denotes the standard basis of Fn , so ej has a 1 in the jth slot and 0s elsewhere. Applying (a) to the second and third columns gives det A =
3 ∑
aj1 ak2 det(ej , ek , a3 )
j,k=1
(1.4.11) =
3 ∑ j,k,ℓ=1
aj1 ak2 aℓ3 det(ej , ek , eℓ ).
1.4. Determinants
33
This is a sum of 27 terms, but most of them are 0. Note that rule (b) implies (1.4.12)
det B = 0 whenever B has two identical columns.
Hence det(ej , ek , eℓ ) = 0 unless j, k, and ℓ are distinct, that is, unless (j, k, ℓ) is a permutation of (1, 2, 3). Now rule (c) says (1.4.13)
det(e1 , e2 , e3 ) = 1,
and we see from rule (b) that det(ej , ek , eℓ ) = 1 if one can convert (ej , ek , eℓ ) to (e1 , e2 , e3 ) by an even number of column interchanges, and det(ej , ek , eℓ ) = −1 if it takes an odd number of interchanges. Explicitly, (1.4.14)
det(e1 , e2 , e3 ) = 1,
det(e1 , e3 , e2 ) = −1,
det(e2 , e3 , e1 ) = 1,
det(e2 , e1 , e3 ) = −1,
det(e3 , e1 , e2 ) = 1,
det(e3 , e2 , e1 ) = −1.
Consequently (1.4.11) yields det A = a11 a22 a33 − a11 a32 a23 + a21 a32 a13 − a21 a12 a33
(1.4.15)
+ a31 a12 a23 − a31 a22 a13 . Note that the second indices occur in (1, 2, 3) order in each product. We can rearrange these products so that the first indices occur in (1, 2, 3) order: det A = a11 a22 a33 − a11 a23 a32 + a13 a21 a32 − a12 a21 a33
(1.4.16)
+ a12 a23 a31 − a13 a22 a31 . Now we tackle the case of general n. Parallel to (1.4.10)–(1.4.11), we have ∑ det A = aj1 det(ej , a2 , . . . , an ) = · · · j
(1.4.17) =
∑
aj1 1 · · · ajn n det(ej1 , . . . ejn ),
j1 ,...,jn
by applying rule (a) to each of the n columns of A. As before, (1.4.12) implies det(ej1 , . . . , ejn ) = 0 unless (j1 , . . . , jn ) are all distinct, that is, unless (j1 , . . . , jn ) is a permutation of the set (1, 2, . . . , n). We set (1.4.18)
Sn = set of permutations of (1, 2, . . . , n).
That is, Sn consists of elements σ, mapping the set {1, . . . , n} to itself, (1.4.19)
σ : {1, 2, . . . , n} −→ {1, 2, . . . , n},
that are one-to-one and onto. We can compose two such permutations, obtaining the product στ ∈ Sn , given σ and τ in Sn . A permutation that interchanges just two elements of {1, . . . , n}, say j and k (j ̸= k), is called a transposition, and labeled (jk). It is easy to see that each permutation of {1, . . . , n} can be achieved by successively transposing pairs of elements of this set. That is, each element σ ∈ Sn is a product of transpositions. We claim that (1.4.20)
det(eσ(1)1 , . . . , eσ(n)n ) = (sgn σ) det(e1 , . . . , en ) = sgn σ,
34
1. Background
where (1.4.21)
sgn σ = 1
if σ is a product of an even number of transpositions,
− 1 if σ is a product of an odd number of transpositions.
In fact, the first identity in (1.4.20) follows from rule (b) and the second identity from rule (c). There is one point to be checked here. Namely, we claim that a given σ ∈ Sn cannot simultaneously be written as a product of an even number of transpositions and an odd number of transpositions. If σ could be so written, sgn σ would not be well defined, and it would be impossible to satisfy condition (b), so Proposition 1.4.1 would fail. One neat way to see that sgn σ is well defined is the following. Let σ ∈ Sn act on functions of n variables by (1.4.22)
(σf )(x1 , . . . , xn ) = f (xσ(1) , . . . , xσ(n) ).
It is readily verified that if also τ ∈ Sn , (1.4.23)
g = σf =⇒ τ g = (τ σ)f.
Now, let P be the polynomial (1.4.24)
∏
P (x1 , . . . , xn ) =
(xj − xk ).
1≤j σ(k). Note that (1.4.29)
aσ(1)1 · · · aσ(n)n = a1τ (1) · · · anτ (n) ,
with τ = σ −1 ,
and sgn σ = sgn σ −1 , so, parallel to (1.4.16), we also have ∑ (1.4.30) det A = (sgn σ)a1σ(1) · · · anσ(n) . σ∈Sn
Comparison with (1.4.27) gives (1.4.31)
det A = det At ,
where A = (ajk ) ⇒ At = (akj ). Note that the jth column of At has the same entries as the jth row of A. In light of this, we have: Corollary 1.4.2. In Proposition 1.4.1, one can replace “columns” by “rows.” The following is a key property of the determinant, called multiplicativity: Proposition 1.4.3. Given A and B in M (n, F), (1.4.32)
det(AB) = (det A)(det B).
Proof. For fixed A, apply Proposition 1.4.1 to (1.4.33)
ϑ1 (B) = det(AB).
If B = (b1 , . . . , bn ), with jth column bj , then (1.4.34)
AB = (Ab1 , . . . , Abn ).
e = (bσ(1) , . . . , bσ(n) ) is obtained from B Clearly rule (a) holds for ϑ1 . Also, if B e by permuting its columns, then AB has columns (Abσ(1) , . . . , Abσ(n) ), obtained by permuting the columns of AB in the same fashion. Hence rule (b) holds for ϑ1 . Finally, rule (c′ ) holds for ϑ1 , with r = det A, and (1.4.32) follows, Corollary 1.4.4. If A ∈ M (n, F) is invertible, then det A ̸= 0.
36
1. Background
Proof. If A is invertible, there exists B ∈ M (n, F) such that AB = I. Then, by (1.4.32), (det A)(det B) = 1, so det A ̸= 0. The converse of Corollary 1.4.4 also holds. Before proving it, it is convenient to show that the determinant is invariant under a certain class of column operations, given as follows. e is obtained from A = (a1 , . . . , an ) ∈ M (n, F) by adding Proposition 1.4.5. If A caℓ to ak for some c ∈ F, ℓ ̸= k, then e = det A. det A
(1.4.35)
e = det A + c det Ab , where Ab is obtained from A by Proof. By rule (a), det A replacing the column ak by aℓ . Hence Ab has two identical columns, so det Ab = 0, and (1.4.35) holds. We now extend Corollary 1.4.4. Proposition 1.4.6. If A ∈ M (n, F), then A is invertible if and only if det A ̸= 0. Proof. We have half of this from Corollary 1.4.4. To finish, assume A is not invertible. As seen in §1.3, this implies the columns a1 , . . . , an of A are linearly dependent. Hence, for some k, ∑ (1.4.36) ak + cℓ aℓ = 0, ℓ̸=k
e where with cℓ ∈ F. Now we can apply Proposition 1.4.5 to obtain det A = det A, ∑ e e A is obtained by adding cℓ aℓ to ak . But then the kth column of A is 0, so e = 0. This finishes the proof of Proposition 1.4.6. det A = det A Further useful facts about determinants arise in the following exercises. Exercises
Exercises 1. Show that
(1.4.37)
1 0 det . ..
a12 a22 .. .
··· ···
a1n a2n .. = det .
0
an2
···
ann
1 0 .. .
0 a22 .. .
··· ···
0 a2n .. = det A11 .
0
an2
···
ann
where A11 = (ajk )2≤j,k≤n . Hint. Do the first identity using Proposition 1.4.5. Then exploit uniqueness for det on M (n − 1, F).
Exercises
37
2. Deduce that det(ej , a2 , . . . , an ) = (−1)j−1 det A1j where Akj is formed by deleting the kth column and the jth row from A. 3. Deduce from the first sum in (1.4.17) that n ∑ (1.4.38) det A = (−1)j−1 aj1 det A1j . j=1
More generally, for any k ∈ {1, . . . , n}, n ∑ (1.4.39) det A = (−1)j−k ajk det Akj . j=1
This is called an expansion of det A by minors, down the kth column. 4. By definition, the cofactor matrix of A is given by Cof(A)jk = ckj = (−1)j−k det Akj . Show that n ∑
(1.4.40)
ajℓ ckj = 0,
if ℓ ̸= k.
j=1
Deduce from this and (5.39) that Cof(A)t A = (det A)I.
(1.4.41)
Hint. Reason as in Exercises 1–3 that the left side of (1.4.40) is equal to det (a1 , . . . , aℓ , . . . , aℓ , . . . , an ), with aℓ in the kth column as well as in the ℓth column. The identity (1.4.41) is known as Cramer’s formula. Note how this generalizes (1.4.4). 5. Show that
(1.4.42)
a11 det
a12 a22
··· ··· .. .
a1n a2n .. = a11 a22 · · · ann . . ann
Hint. Use (1.4.37) and induction. Alternative: Use (1.4.27). Show that σ ∈ Sn , σ(k) ≤ k ∀ k ⇒ σ(k) ≡ k.
Exercises on the cross product 1. If u, v ∈ R3 , show that the formula (1.4.43)
w1 w · (u × v) = det w2 w3
u1 u2 u3
v1 v2 v3
38
1. Background
for u × v = Π(u, v) defines uniquely a bilinear map Π : R3 × R3 → R3 . Show that it satisfies i × j = k, j × k = i, k × i = j, where {i, j, k} is the standard basis of R3 . 2. We say T ∈ SO(3) provided that T is a real 3 × 3 matrix satisfying T t T = I and det T > 0, (hence det T = 1). Show that (1.4.44)
T ∈ SO(3) =⇒ T u × T v = T (u × v).
Hint. Multiply the 3 × 3 matrix in Exercise 1 on the left by T. 3. Show that, if θ is the angle between u and v in R3 , then |u × v| = |u| |v| | sin θ|.
(1.4.45)
Hint. Check this for u = i, v = ai + bj, and use Exercise 2 to show this suffices. 4. Show that, for all u, v, w, x ∈ R3 , (1.4.46)
(
u·w (u × v) · (w × x) = det u·x
) v·w . v·x
Taking w = u, x = v, show that this implies (1.4.45). Hint. Using Exercise 2, show that it suffices to check this for w = i,
x = ai + bj,
in which case w × x = bk. Then the left side 0 (u × v) · bk = det 0 b
of (1.4.46) is u·i v·i u · j v · j . u·k v·k
Show that this equals the right side of (1.4.46). 5. Show that κ : R3 → Skew(3), the set of antisymmetric real 3 × 3 matrices, given by 0 −y3 y2 0 −y1 (1.4.47) κ(y1 , y2 , y3 ) = y3 −y2 y1 0 satisfies (1.4.48)
Kx = y × x,
Show that, with [A, B] = AB − BA, (1.4.49)
The trace of a matrix
K = κ(y).
[ ] κ(x × y) = κ(x), κ(y) , ( ) Tr κ(x)κ(y)t = 2x · y.
Exercises
39
Let A ∈ M (n, F) be as in (1.4.5). We define the trace of A as n ∑ Tr A = ajj . j=1
1. If also B ∈ M (n, F), show that Tr AB = Tr BA. 2. Deduce that if B is invertible, Tr B −1 AB = Tr A. 3. Show that, as t → 0, det(I + tA) = 1 + t Tr A + O(t2 ). 4. Deduce that, if B is invertible, det(B + tA) = (det B)(1 + t Tr B −1 A) + O(t2 ).
Chapter 2
Multivariable differential calculus
This chapter develops differential calculus on domains in n-dimensional Euclidean space Rn . In §2.1 we define the derivative of a function F : O → Rm , where O is an open subset of Rn , as a linear map from Rn to Rm . We establish some basic properties, such as the chain rule. We use the one-dimensional integral as a tool to show that, if the matrix of first order partial derivatives of F is continuous on O, then F is differentiable on O. We also discuss two convenient multi-index notations for higher derivatives, and derive the Taylor formula with remainder for a smooth function F on O ⊂ Rn . In §2.2 we establish the Inverse Function Theorem, stating that a smooth map F : O → Rn with an invertible derivative DF (p) has a smooth inverse defined near q = F (p). We derive the Implicit Function Theorem as a consequence of this. As a tool in proving the Inverse Function Theorem, we use a fixed point theorem known as the Contraction Mapping Principle. In §2.3 we treat systems of differential equations. We establish a basic existence and uniqueness theorem and also study the smooth dependence of a solution on initial data. We interpret the solution operator as a flow generated by a vector field and introduce the concept of the Lie bracket of vector fields. We also consider the linearization of a system of ODEs about a solution. Within the setting of linear systems, we introduce the matrix exponential as a tool and derive a number of its basic properties.
2.1. The derivative Let O be an open subset of Rn , and F : O → Rm a continuous function. We say F is differentiable at a point x ∈ O, with derivative L, if L : Rn → Rm is a linear 41
42
2. Multivariable differential calculus
transformation such that, for y ∈ Rn , small, (2.1.1)
F (x + y) = F (x) + Ly + R(x, y)
with ∥R(x, y)∥ → 0 as y → 0. ∥y∥
(2.1.2)
We denote the derivative at x by DF (x) = L. With respect to the standard bases of Rn and Rm , DF (x) is simply the matrix of partial derivatives, ( (2.1.3)
DF (x) =
∂Fj ∂xk
)
∂F1 /∂x1 .. = .
···
∂F1 /∂xn .. , .
∂Fm /∂x1
···
∂Fm /∂xn
so that, if v = (v1 , . . . , vn )t , (regarded as a column vector) then
(2.1.4)
∑ (∂F1 /∂xk )vk k .. DF (x)v = . . ∑ (∂Fm /∂xk )vk k
Recall the definition of the partial derivative ∂fj /∂xk from §1.1. It will be shown below that F is differentiable whenever all the partial derivatives exist and are continuous on O. In such a case we say F is a C 1 function on O. More generally, F is said to be C k if all its partial derivatives of order ≤ k exist and are continuous. If F is C k for all k, we say F is C ∞ . In (2.1.2), we can use the Euclidean norm on Rn and Rm . As seen in §1.2, this norm is defined by ( )1/2 ∥x∥ = x21 + · · · + x2n
(2.1.5) for x = (x1 , . . . , xn ) ∈ Rn .
An application of the Fundamental Theorem of Calculus, to functions of each variable xj separately, yields the following. If we assume F : O → Rm is differentiable in each variable separately, and that each ∂F/∂xj is continuous on O, then F (x + y) = F (x) +
n ∑ [ ] F (x + zj ) − F (x + zj−1 ) j=1
(2.1.6)
= F (x) + ∫ Aj (x, y) = 0
n ∑
Aj (x, y)yj ,
j=1 1
) ∂F ( x + zj−1 + tyj ej dt, ∂xj
2.1. The derivative
43
where z0 = 0, zj = (y1 , . . . , yj , 0, . . . , 0), and {ej } is the standard basis of Rn . Consequently, F (x + y) = F (x) + (2.1.7)
n ∑ ∂F (x) yj + R(x, y), ∂xj j=1
∫ 1{ } ∂F ∂F R(x, y) = yj (x + zj−1 + tyj ej ) − (x) dt. ∂xj ∂xj 0 j=1 n ∑
Now (2.1.7) implies F is differentiable on O, as we stated below (2.1.4). Thus we have established the following. Proposition 2.1.1. If O is an open subset of Rn and F : O → Rm is of class C 1 , then F is differentiable at each point x ∈ O. One can use the Mean Value Theorem in place of the fundamental theorem of calculus and obtain a slightly more general result. See Exercise 0 below for prompts on how to accomplish this. Let us give some examples of derivatives. First, take n = 2, m = 1, and set (2.1.8)
F (x) = (sin x1 )(sin x2 ).
Then (2.1.9)
DF (x) = ((cos x1 )(sin x2 ), (sin x1 )(cos x2 )).
Next, take n = m = 2 and
(
(2.1.10)
F (x) =
Then (2.1.11)
( DF (x) =
) x1 x2 . x21 − x22
x2 2x1
) x1 . −2x2
We can replace Rn and Rm by more general finite-dimensional real vector spaces, isomorphic to Euclidean space. For example, the space M (n, R) of real 2 n × n matrices is isomorphic to Rn . Consider the function (2.1.12)
S : M (n, R) −→ M (n, R),
S(X) = X 2 .
We have (2.1.13)
(X + Y )2 = X 2 + XY + Y X + Y 2 = X 2 + DS(X)Y + R(X, Y ),
with R(X, Y ) = Y 2 , and hence (2.1.14)
DS(X)Y = XY + Y X.
For our next example, we take (2.1.15)
O = Gℓ(n, R) = {X ∈ M (n, R) : det X ̸= 0},
which, as shown below, is open in M (n, R). We consider (2.1.16)
Φ : Gℓ(n, R) −→ M (n, R),
Φ(X) = X −1 ,
44
2. Multivariable differential calculus
and compute DΦ(I). We use the following. If, for A ∈ M (n, R), (2.1.17)
∥A∥ = sup{∥Av∥ : v ∈ Rn , ∥v∥ ≤ 1},
then A, B ∈ M (n, R) ⇒ ∥A + B∥ ≤ ∥A∥ + ∥B∥ and ∥AB∥ ≤ ∥A∥ · ∥B∥,
(2.1.18)
so Y ∈ M (n, R) ⇒ ∥Y k ∥ ≤ ∥Y ∥k . Also Sk = I − Y + Y 2 − · · · + (−1)k Y k (2.1.19)
⇒ Y Sk = Sk Y = Y − Y 2 + Y 3 − · · · + (−1)k Y k+1 ⇒ (I + Y )Sk = Sk (I + Y ) = I + (−1)k Y k+1 ,
hence (2.1.20)
∥Y ∥ < 1 =⇒ (I + Y )
−1
=
∞ ∑
(−1)k Y k = I − Y + Y 2 − · · · ,
k=0
so (2.1.21)
DΦ(I)Y = −Y.
Related calculations show that Gℓ(n, R) is open in M (n, R). In fact, given X ∈ Gℓ(n, R), Y ∈ M (n, R), (2.1.22)
X + Y = X(I + X −1 Y ),
which by (2.1.20) is invertible as long as (2.1.23)
∥X −1 Y ∥ < 1.
One can proceed from here to compute DΦ(X). See the exercises. We return to general considerations, and derive the chain rule for the derivative. Let F : O → Rm be differentiable at x ∈ O, as above, let U be a neighborhood of z = F (x) in Rm , and let G : U → Rk be differentiable at z. Consider H = G ◦ F. We have H(x + y) = G(F (x + y)) ( ) = G F (x) + DF (x)y + R(x, y) ( ) (2.1.24) = G(z) + DG(z) DF (x)y + R(x, y) + R1 (x, y) = G(z) + DG(z)DF (x)y + R2 (x, y) with
∥R2 (x, y)∥ → 0 as y → 0. ∥y∥ Thus G ◦ F is differentiable at x, and (2.1.25)
D(G ◦ F )(x) = DG(F (x)) · DF (x).
Another useful remark is that, by the Fundamental Theorem of Calculus, applied to φ(t) = F (x + ty), ∫ 1 (2.1.26) F (x + y) = F (x) + DF (x + ty)y dt, 0
2.1. The derivative
45
provided F is C 1 . For a typical application, see (2.3.46). For the study of higher order derivatives of a function, the following result is fundamental. Proposition 2.1.2. Assume F : O → Rm is of class C 2 , with O open in Rn . Then, for each x ∈ O, 1 ≤ j, k ≤ n, ∂ ∂F ∂ ∂F (2.1.27) (x) = (x). ∂xj ∂xk ∂xk ∂xj Proof. It suffices to take m = 1. We label our function f : O → R. For 1 ≤ j ≤ n, we set ) 1( (2.1.28) ∆j,h f (x) = f (x + hej ) − f (x) , h where {e1 , . . . , en } is the standard basis of Rn . The mean value theorem (for functions of xj alone) implies that if ∂j f = ∂f /∂xj exists on O, then, for x ∈ O, h > 0 sufficiently small, (2.1.29)
∆j,h f (x) = ∂j f (x + αj hej ),
for some αj ∈ (0, 1), depending on x and h. Iterating this, if ∂j (∂k f ) exists on O, then, for x ∈ O, h > 0 sufficiently small, ∆k,h ∆j,h f (x) = ∂k (∆j,h f )(x + αk hek ) = ∆j,h (∂k f )(x + αk hek )
(2.1.30)
= ∂j ∂k f (x + αk hek + αj hej ), with αj , αk ∈ (0, 1). Here we have used the elementary result (2.1.31)
∂k ∆j,h f = ∆j,h (∂k f ).
We deduce the following. Proposition 2.1.3. If ∂k f and ∂j ∂k f exist on O and ∂j ∂k f is continuous at x0 ∈ O, then (2.1.32)
∂j ∂k f (x0 ) = lim ∆k,h ∆j,h f (x0 ). h→0
Clearly (2.1.33)
∆k,h ∆j,h f = ∆j,h ∆k,h f,
so we have the following, which easily implies Proposition 2.1.2.
Corollary 2.1.4. In the setting of Proposition 2.1.3, if also ∂j f and ∂k ∂j f exist on O and ∂k ∂j f is continuous at x0 , then (2.1.34)
∂j ∂k f (x0 ) = ∂k ∂j f (x0 ).
We now describe two convenient notations to express higher order derivatives of a C k function f : Ω → R, where Ω ⊂ Rn is open. In one, let J be a k-tuple of integers between 1 and n; J = (j1 , . . . , jk ). We set (2.1.35)
f (J) (x) = ∂jk · · · ∂j1 f (x),
∂j =
∂ . ∂xj
46
2. Multivariable differential calculus
We set |J| = k, the total order of differentiation. As we have seen in Proposition 2.1.2, ∂i ∂j f = ∂j ∂i f provided f ∈ C 2 (Ω). It follows that, if f ∈ C k (Ω), then ∂jk · · · ∂j1 f = ∂ℓk · · · ∂ℓ1 f whenever {ℓ1 , . . . , ℓk } is a permutation of {j1 , . . . , jk }. Thus, another convenient notation to use is the following. Let α be an n-tuple of non-negative integers, α = (α1 , . . . , αn ). Then we set (2.1.36)
f (α) (x) = ∂1α1 · · · ∂nαn f (x),
|α| = α1 + · · · + αn .
Note that, if |J| = |α| = k and f ∈ C k (Ω), (2.1.37)
f (J) (x) = f (α) (x), with αi = #{ℓ : jℓ = i}.
Correspondingly, there are two expressions for monomials in x = (x1 , . . . , xn ): (2.1.38)
xJ = xj1 · · · xjk ,
αn 1 x α = xα 1 · · · xn ,
and xJ = xα provided J and α are related as in (2.1.37). Both these notations are called “multi-index” notations. We now derive Taylor’s formula with remainder for a smooth function F : Ω → R, making use of these multi-index notations. We will apply the one variable formula (1.1.50)–(1.1.51), i.e., 1 1 (2.1.39) φ(t) = φ(0) + φ′ (0)t + φ′′ (0)t2 + · · · + φ(k) (0)tk + rk (t), 2 k! with ∫ 1 t (2.1.40) rk (t) = (t − s)k φ(k+1) (s) ds, k! 0 given φ ∈ C k+1 (I), I = (−a, a). (See Exercise 9 of §1.1, Exercise 7 of this section, and also Appendix A.4 for further discussion.) Let us assume 0 ∈ Ω, and that the line segment from 0 to x is contained in Ω. We set φ(t) = F (tx), and apply (2.1.39)–(2.1.40) with t = 1. Applying the chain rule, we have (2.1.41)
φ′ (t) =
n ∑
∂j F (tx)xj =
∑
F (J) (tx)xJ .
|J|=1
j=1
Differentiating again, we have ∑ (2.1.42) φ′′ (t) =
F (J+K) (tx)xJ+K =
|J|=1,|K|=1
∑
F (J) (tx)xJ ,
|J|=2
where, if |J| = k, |K| = ℓ, we take J + K = (j1 , . . . , jk , k1 , . . . , kℓ ). Inductively, we have ∑ (2.1.43) φ(k) (t) = F (J) (tx)xJ . |J|=k
Hence, from (2.1.39) with t = 1, ∑ 1 ∑ (J) (2.1.44) F (x) = F (0) + F (J) (0)xJ + · · · + F (0)xJ + Rk (x), k! |J|=1
or, more briefly, (2.1.45)
F (x) =
∑ |J|≤k
|J|=k
1 (J) F (0)xJ + Rk (x), |J|!
2.1. The derivative
47
where (2.1.46)
Rk (x) =
1 k!
∑ (∫ |J|=k+1
1
) (1 − s)k F (J) (sx) ds xJ .
0
This gives Taylor’s formula with remainder for F ∈ C k+1 (Ω), in the J-multi-index notation. We also want to write the formula in the α-multi-index notation. We have ∑ ∑ (2.1.47) F (J) (tx)xJ = ν(α)F (α) (tx)xα , |J|=k
|α|=k
where (2.1.48)
ν(α) = #{J : α = α(J)},
and we define the relation α = α(J) to hold provided the condition (2.1.37) holds, or equivalently provided xJ = xα . Thus ν(α) is uniquely defined by ∑ ∑ (2.1.49) ν(α)xα = xJ = (x1 + · · · + xn )k . |α|=k
|J|=k
To evaluate ν(α), we can expand (x1 + · · · + xn )k in terms of xα by a repeated application of the binomial formula: ( )k (x1 + · · · + xn )k = x1 + (x2 + · · · + xn ) ∑ (k) = xα1 (x2 + · · · + xn )k−α1 α1 1 α1 ≤k ∑ ( k )(k − α1 ) k−α1 −α2 1 α2 = xα 1 x2 (x3 + · · · + xn ) α1 α2 α1 +α2 ≤k (2.1.50) = ··· ( ) ∑ ( k )(k − α1 ) k − α1 − · · · − αn−1 α1 n = ··· x1 · · · xα n α1 α2 αn |α|=k ∑ = ν(α)xα . |α|=k
We have ν(α) equal to the product of binomial coefficients given above, i.e., to k! (k − α1 )! (k − α1 − · · · − αn−1 )! · ··· α1 !(k − α1 )! α2 !(k − α1 − α2 )! αn !(k − α1 − · · · − αn )! k! = . α1 ! · · · αn ! In other words, for |α| = k, k! , where α! = α1 ! · · · αn ! α! Thus the Taylor formula (2.1.45) can be rewritten ∑ 1 (2.1.52) F (x) = F (α) (0)xα + Rk (x), α! (2.1.51)
ν(α) =
|α|≤k
48
2. Multivariable differential calculus
where (2.1.53)
∑
Rk (x) =
|α|=k+1
k + 1( α!
∫
1
) (1 − s)k F (α) (sx) ds xα .
0
The formula (2.1.52)–(2.1.53) holds for F ∈ C k+1 . It is significant that (2.1.52), with a variant of (2.1.53), holds for F ∈ C k . In fact, for such F , we can apply (2.1.53) with k replaced by k − 1, to get ∑ 1 (2.1.54) F (x) = F (α) (0)xα + Rk−1 (x), α! |α|≤k−1
with (2.1.55)
Rk−1 (x) =
) ∑ k (∫ 1 (1 − s)k−1 F (α) (sx) ds xα . α! 0
|α|=k
We can add and subtract F (α) (0) to F (α) (sx) in the integrand above, and obtain the following. Proposition 2.1.5. If F ∈ C k on a ball Br (0), the formula (2.1.52) holds for x ∈ Br (0), with ∑ k (∫ 1 [ ] ) (2.1.56) Rk (x) = (1 − s)k−1 F (α) (sx) − F (α) (0) ds xα . α! 0 |α|=k
Remark. Note that (2.1.56) yields the estimate ∑ |xα | (2.1.57) |Rk (x)| ≤ sup |F (α) (sx) − F (α) (0)|. α! 0≤s≤1 |α|=k
The term corresponding to |J| = 2 in (2.1.45), or |α| = 2 in (2.1.52), is of particular interest. It is n 1 ∑ (J) 1 ∑ ∂2F (2.1.58) F (0) xJ = (0) xj xk . 2 2 ∂xk ∂xj |J|=2
j,k=1
We define the Hessian of a C function F : O → R as an n × n matrix: ( 2 ) ∂ F (2.1.59) D2 F (y) = (y) . ∂xk ∂xj 2
Then the power series expansion of second order about 0 for F takes the form 1 (2.1.60) F (x) = F (0) + DF (0)x + x · D2 F (0)x + R2 (x), 2 where, by (2.1.57), (2.1.61)
|R2 (x)| ≤ Cn |x|2
sup
|F (α) (sx) − F (α) (0)|.
0≤s≤1,|α|=2
In all these formulas we can translate coordinates and expand about y ∈ O. For example, (2.1.60) extends to 1 (2.1.62) F (x) = F (y) + DF (y)(x − y) + (x − y) · D2 F (y)(x − y) + R2 (x, y), 2
2.1. The derivative
49
with (2.1.63)
|R2 (x, y)| ≤ Cn |x − y|2
sup
|F (α) (y + s(x − y)) − F (α) (y)|.
0≤s≤1,|α|=2
Example. If we take F (x) as in (2.1.8), so DF (x) is as in (2.1.9), then ( ) − sin x1 sin x2 cos x1 cos x2 D2 F (x) = . cos x1 cos x2 − sin x1 sin x2 The results (2.1.62)–(2.1.63) are useful for extremal problems, i.e., determining where a sufficiently smooth function F : O → R has local maxima and local minima. Clearly if F ∈ C 1 (O) and F has a local maximum or minimum at x0 ∈ O, then DF (x0 ) = 0. In such a case, we say x0 is a critical point of F . To check what kind of critical point x0 is, we look at the n × n matrix A = D2 F (x0 ), assuming F ∈ C 2 (O). By Proposition 2.1.2, A is a symmetric matrix. A basic result in linear algebra is that if A is a real, symmetric n × n matrix, then Rn has an orthonormal basis of eigenvectors, {v1 , . . . , vn }, satisfying Avj = λj vj ; the real numbers λj are the eigenvalues of A. We say A is positive definite if all λj > 0, and we say A is negative definite if all λj < 0. We say A is strongly indefinite if some λj > 0 and another λk < 0. Equivalently, given a real, symmetric matrix A, (2.1.64)
A positive definite ⇐⇒ v · Av ≥ C|v|2 , A negative definite ⇐⇒ v · Av ≤ −C|v|2 ,
for some C > 0, all v ∈ Rn , and A strongly indefinite ⇐⇒ ∃ v, w ∈ Rn , nonzero, such that (2.1.65) v · Av ≥ C|v|2 , w · Aw ≤ −C|w|2 , for some C > 0. In light of (2.1.45)–(2.1.46), we have the following result. Proposition 2.1.6. Assume F ∈ C 2 (O) is real valued, O open in Rn . Let x0 ∈ O be a critical point for F . Then (i) D2 F (x0 ) positive definite ⇒ F has a local minimum at x0 , (ii) D2 F (x0 ) negative definite ⇒ F has a local maximum at x0 , (iii) D2 F (x0 ) strongly indefinite ⇒ F has neither a local maximum nor a local minimum at x0 . In case (iii), we say x0 is a saddle point for F . The following is a test for positive definiteness. Proposition 2.1.7. Let A = (aij ) be a real, symmetric, n × n matrix. For 1 ≤ ℓ ≤ n, form the ℓ × ℓ matrix Aℓ = (aij )1≤i,j≤ℓ . Then (2.1.66)
A positive definite ⇐⇒ det Aℓ > 0,
∀ ℓ ∈ {1, . . . , n}.
Regarding the implication ⇒, note that if A is positive definite, then det A = det An is the product of its eigenvalues, all > 0, hence is > 0. Also in this case, it follows from the hypothesis on the left side of (2.1.66) that each Aℓ must be positive definite, hence have positive determinant, so we have ⇒. The implication ⇐ is easy enough for 2 × 2 matrices. If A is symmetric and det A > 0, then either both its eigenvalues are positive (so A is positive definite) or
50
2. Multivariable differential calculus
both are negative (so A is negative definite). In the latter case, A1 = (a11 ) must be negative, so we have ⇐ in this case. We prove ⇐ for n ≥ 3, using induction. The inductive hypothesis implies that if det Aℓ > 0 for each ℓ ≤ n, then An−1 is positive definite. The next lemma then guarantees that A = An has at least n − 1 positive eigenvalues. The hypothesis that det A > 0 does not allow that the remaining eigenvalue be ≤ 0, so all the eigenvalues of A must be positive. Thus Proposition 2.1.7 is proven, once we have the following. Lemma 2.1.8. In the setting of Proposition 2.1.7, if An−1 is positive definite, then A = An has at least n − 1 positive eigenvalues. Proof. Since A is symmetric, Rn has an orthonormal basis v1 , . . . , vn of eigenvectors of A; Avj = λj vj . If the conclusion of the lemma is false, at least two of the eigenvalues, say λ1 , λ2 , are ≤ 0. Let W = Span(v1 , v2 ), so w ∈ W =⇒ w · Aw ≤ 0. Since W has dimension 2, Rn−1 ⊂ Rn satisfies Rn−1 ∩ W ̸= 0, so there exists a nonzero w ∈ Rn−1 ∩ W , and then w · An−1 w = w · Aw ≤ 0,
contradicting the hypothesis that An−1 is positive definite.
Remark. Given (2.1.66), we see by taking A 7→ −A that if A is a real, symmetric n × n matrix, (2.1.67)
A negative definite ⇐⇒ (−1)ℓ det Aℓ > 0,
∀ ℓ ∈ {1, . . . , n}.
We return to higher order power series formulas with remainder and complement Proposition 2.1.5. Let us go back to (2.1.39)–(2.1.40) and note that the integral in (2.1.40) is 1/(k + 1) times a weighted average of φ(k+1) (s) over s ∈ [0, t]. Hence we can write 1 rk (t) = φ(k+1) (θt), for some θ ∈ [0, 1], (k + 1)! if φ is of class C k+1 . This is the Lagrange form of the remainder; see Appendix A.4 for more on this, and for a comparison with the Cauchy form of the remainder. If φ is of class C k , we can replace k + 1 by k in (2.1.39) and write (2.1.68)
φ(t) = φ(0) + φ′ (0)t + · · · +
1 1 φ(k−1) (0)tk−1 + φ(k) (θt)tk , (k − 1)! k!
for some θ ∈ [0, 1]. Pluging (2.1.68) into (2.1.43) for φ(t) = F (tx) gives ∑ 1 (J) 1 ∑ (J) (2.1.69) F (x) = F (0)xJ + F (θx)xJ , |J|! k! |J|≤k−1
|J|=k
for some θ ∈ [0, 1] (depending on x and on k, but not on J), when F is of class C k on a neighborhood Br (0) of 0 ∈ Rn . Similarly, using the α-multi-index notation,
2.1. The derivative
51
we have, as an alternative to (2.1.54)–(2.1.55), ∑ 1 ∑ 1 F (α) (0)xα + F (α) (θx)xα , (2.1.70) F (x) = α! α! |α|≤k−1
|α|=k
for some θ ∈ [0, 1] (depending on x and on |α|, but not on α), if F ∈ C k (Br (0)). Note also that n 1 ∑ ∂2F 1 ∑ (J) F (θx)xJ = (θx)xj xk 2 2 ∂xk ∂xj |J|=2
(2.1.71)
j,k=1
=
1 x · D2 F (θx)x, 2
with D2 F (y) as in (2.1.59), so if F ∈ C 2 (Br (0)), we have, as an alternative to (2.1.60), 1 F (x) = F (0) + DF (0)x + x · D2 F (θx)x, 2
(2.1.72) for some θ ∈ [0, 1].
We next complement the multi-index notations for higher derivatives of a function F by a multi-linear notation, defined as follows. If k ∈ N, F ∈ C k (U ), and y ∈ U ⊂ Rn , set (2.1.73) Dk F (y)(u1 , . . . , uk ) = ∂t1 · · · ∂tk F (y + t1 u1 + · · · + tk uk ) , t1 =···=tk =0
for u1 , . . . , uk ∈ R . For k = 1, this formula is equivalent to the definition of DF given at the beginning of this section. For k = 2, we have n
D2 F (y)(u, v) = u · D2 F (y)v,
(2.1.74)
with D2 F (y) on the right as in (2.1.59). Generally, (2.1.73) defines Dk F (y) as a symmetric, k-linear form in u1 , . . . , uk ∈ Rn . We can relate (2.1.73) to the J-multi-index notation as follows. We start with ∑ (2.1.75) ∂t1 F (y + t1 u1 + · · · + tk uk ) = F (J) (y + Σtj uj )uJ1 , |J|=1
and inductively obtain ∑
(2.1.76) ∂t1 · · · ∂tk F (y + Σtj uj ) =
F (J1 +···+Jk ) (y + Σtj uj )uJ1 1 · · · uJkk ,
|J1 |=···=|Jk |=1
hence (2.1.77)
Dk F (y)(u1 , . . . , uk ) =
∑
F (J1 +···+Jk ) (y)uJ1 1 · · · uJkk .
|J1 |=···=|Jk |=1
In particular, if u1 = · · · = uk = u, (2.1.78)
Dk F (y)(u, . . . , u) =
∑ |J|=k
F (J) (y)uJ .
52
2. Multivariable differential calculus
Hence (2.1.69) yields the multi-linear Taylor formula with remainder F (x) = F (0) + DF (0)x + · · · + (2.1.79)
1 Dk−1 F (0)(x, . . . , x) (k − 1)! 1 + Dk F (θx)(x, . . . , x), k!
for some θ ∈ [0, 1], if F ∈ C k (Br (0)). In fact, rather than appealing to (2.1.69), we can note that φ(t) = F (tx) =⇒ φ(k) (t) = ∂t1 · · · ∂tk φ(t + t1 + · · · + tk ) t1 =···=tk =0
k
= D F (tx)(x, . . . , x), and obtain (2.1.79) directly from (2.1.68). We can also use the notation Dj F (y)x⊗j = Dj F (y)(x, . . . , x),
(2.1.80)
with j copies of x within the last set of parentheses, and rewrite (2.1.79) as 1 Dk−1 F (0)x⊗(k−1) (k − 1)! (2.1.81) 1 + Dk F (θx)x⊗k . k! Note how (2.1.79) and (2.1.81) generalize (2.1.72). F (x) = F (0) + DF (0)x + · · · +
Convergent power series and their derivatives Here we consider functions given by convergent power series, of the form ∑ (2.1.82) F (x) = bα x α . α≥0
We allow bα ∈ C, and take x = (x1 , . . . , xn ) ∈ Rn , with xα given by (2.1.38). Here is our first result. Proposition 2.1.9. Assume there exist y ∈ Rn and C0 < ∞ such that (2.1.83)
|yk | = ak > 0, ∀ k,
|bα y α | ≤ C0 , ∀ α.
Then, for each δ ∈ (0, 1), the series (2.1.82) converges absolutely and uniformly on each set (2.1.84)
Rδ = {x ∈ Rn : |xk | ≤ (1 − δ)ak , ∀ k}.
e = {x ∈ Rn : |xk | < ak , ∀ k}. The sum F (x) is continuous on R Proof. We have (2.1.85) hence (2.1.86)
x ∈ Rδ =⇒ |bα xα | ≤ C0 (1 − δ)|α| , ∑ α≥0
|bα xα | ≤ C0
∑ α≥0
∀ α,
(1 − δ)|α| < ∞.
2.1. The derivative
53
Thus the power series (2.1.82) is absolutely convergent whenever x ∈ Rδ . We also have, for each N ∈ N, ∑ (2.1.87) F (x) = bα xα + RN (x), |α|≤N
and, for x ∈ Rδ ,
∑
|RN (x)| ≤
|bα xα |
|α|>N
∑
≤ C0
(2.1.88)
(1 − δ)|α|
|α|>N
= εN → 0 as N → ∞. This shows that RN (x) → 0 uniformly for x ∈ Rδ , and completes the proof of Proposition 2.1.9. We next discuss differentiability of power series. e Proposition 2.1.10. In the setting of Proposition 2.1.9, F is differentiable on R and, for each j ∈ {1, . . . , n}, ∑ ∂F e (2.1.89) (x) = αj bα xα−εj , ∀ x ∈ R. ∂xj α≥εj
Here, we set εj = (0, . . . , 1, . . . , 0), with the 1 in the jth slot. It is convenient to begin the proof of Proposition 2.1.10 with the following. Lemma 2.1.11. In the setting of Proposition 2.1.9, for each j ∈ {1, . . . , n}, ∑ (2.1.90) Gj (x) = αj bα xα−εj α≥εj
e uniformly on Rδ for each δ ∈ (0, 1), therefore is absoletely convergent for x ∈ R, e defining Gj as a continuous function on R. Proof. Take a = (a1 , . . . , an ), with aj as in (2.1.83). Given x ∈ Rδ , we have ∑ ∑ αj |bα xα−εj | ≤ αj (1 − δ)|α|−1 |bα aα−εj | (2.1.91)
α≥εj
α≥εj
≤
∑ C0 αj (1 − δ)|α| , aj (1 − δ) α≥0
and this is (2.1.92)
≤ Mδ < ∞,
∀ δ ∈ (0, 1).
This gives the asserted convergence on Rδ and hence defines the function Gj , cone tinuous on R. To prove Proposition 2.1.10, we need to show that ∂F e (2.1.93) = Gj on R, ∂xj
54
2. Multivariable differential calculus
for each j. Let us use the notation (2.1.94)
x bj = (x1 , . . . , xj−1 , 0, xj+1 , . . . , xn ) = x − xj ej ,
where ej is the jth standard basis vector of Rn . Now, given x ∈ Rδ , δ ∈ (0, 1), the uniform convergence of (2.1.90) on Rδ implies ∫ xj ∫ xj ∑ Gj (b xj + tej ) dt = αj bα (b xj + tej )α−εj dt 0
0
α≥εj
= (2.1.95)
∑
αj bα αj−1 xα
α≥εj
=
∑
bα xα
α≥εj
= F (x) − F (b xj ). Applying ∂/∂xj to the left side of (2.1.95) and using the fundamental theorem of calculus then yields (2.1.93) as desired. This gives the identity (2.1.89). Since each e this implies F is differentiable on R. e Gj is continuous on R, We can iterate Proposition 2.1.10, obtaining ∂k ∂j F (x) = ∂k Gj (x) as a convere etc. In particular, we have the following. gent power series on R, e Corollary 2.1.12. In the setting of Proposition 2.1.9, we have F ∈ C ∞ (R).
Exercises 0. Here we provide a path to a strengthening of Proposition 2.1.1. Let O ⊂ Rn be open, f : O → R. Assume ∂f /∂xj exists on O for each j. Fix x ∈ O and assume that ∂f (2.1.96) is continuous at x, for each j. ∂xj Task: prove that f is differentiable at x. Hint. Start as in (2.1.6), with n { } ∑ f (x + y) = f (x) + f (x + zj ) − f (x + zj−1 ) , j=1
where z0 = 0, zj = (y1 , . . . , yj , 0, . . . , 0) = zj−1 + yj ej , zn = y. Deduce from the mean value theorem that, for each j, ∂f f (x + zj ) − f (x + zj−1 ) = (x + zj−1 + θj yj ej )yj , ∂xj for some θj ∈ (0, 1). Deduce that f (x + y) = f (x) +
n ∑ ∂f (x)yj + R(x, y), ∂xj j=1
Exercises
where
55
n { } ∑ ∂f ∂f R(x, y) = (x + zj−1 + θj yj ej ) − (x) yj . ∂xj ∂xj j=1
Show that the hypothesis (2.1.96) implies that R(x, y) = o(∥y∥). 1. Consider the following function f : R2 → R: f (x, y) = (cos x)(cos y). Find all its critical points, and determine which of these are local maxima, local minima, and saddle points. 2. Let M (n, R) denote the space of real n × n matrices, and let Ω ⊂ M (n, R) be open. Assume F, G : Ω → M (n, R) are of class C 1 . Show that H(X) = F (X)G(X) defines a C 1 map H : Ω → M (n, R), and DH(X)Y = DF (X)Y G(X) + F (X)DG(X)Y. 3. Let Gl(n, R) ⊂ M (n, R) denote the set of invertible matrices. Show that Φ : Gl(n, R) −→ M (n, R),
Φ(X) = X −1
is of class C 1 and that DΦ(X)Y = −X −1 Y X −1 . Hint. The series expansion (2.1.20) should be useful. 3A. Define S, Φ, F : Gℓ(n, R) → M (n, R) by S(X) = X 2 ,
Φ(X) = X −1 ,
F (X) = X −2 .
Compute DF (X)Y using each of the following approaches: (a) Take F (X) = Φ(X)Φ(X) and use the product rule (Exercise 2). (b) Take F (X) = Φ(S(X)) and use the chain rule. (c) Take F (X) = S(Φ(X)) and use the chain rule. 4. Identify R2 and C via z = x + iy. Then multiplication by i on C corresponds to applying ( ) 0 −1 J= 1 0 Let O ⊂ R2 be open, f : O → R2 be C 1 . Say f = (u, v). Regard Df (x, y) as a 2 × 2 real matrix. One says f is holomorphic, or complex-analytic, provided the Cauchy-Riemann equations hold: ∂u ∂v = , ∂x ∂y
∂u ∂v =− . ∂y ∂x
Show that this is equivalent to the condition Df (x, y) J = J Df (x, y). Generalize to O open in C , f : O → Cn . m
56
2. Multivariable differential calculus
5. Let f be C 1 on a region in R2 containing [a, b] × {y}. Show that, as h → 0, ] 1[ ∂f f (x, y + h) − f (x, y) −→ (x, y), uniformly on [a, b] × {y}. h ∂y Hint. Show that the left side is equal to ∫ 1 h ∂f (x, y + s) ds, h 0 ∂y and use the uniform continuity of ∂f /∂y on [a, b] × [y − δ, y + δ]; cf. Proposition A.1.16. 6. In the setting of Exercise 5, show that ∫ b ∫ b d ∂f f (x, y) dx = (x, y) dx. dy a a ∂y 7. Considering the power series f (x) = f (y) + f ′ (y)(x − y) + · · · +
f (j) (y) (x − y)j + Rj (x, y), j!
show that
1 ∂Rj = − f (j+1) (y)(x − y)j , Rj (x, x) = 0. ∂y j! Use this to re-derive (1.1.51), and hence (2.1.39)–(2.1.40). We define “big oh” and “little oh” notation: f (x) f (x) = O(x) (as x → 0) ⇔ ≤ C as x → 0, x f (x) → 0 as x → 0. f (x) = o(x) (as x → 0) ⇔ x 8. Let O ⊂ Rn be open and y ∈ O. Show that ∑ 1 f ∈ C k+1 (O) ⇒ f (x) = f (α) (y)(x − y)α + O(|x − y|k+1 ), α! |α|≤k
f ∈ C k (O) ⇒ f (x) =
∑ 1 f (α) (y)(x − y)α + o(|x − y|k ). α!
|α|≤k
9. Assume G : U → O, F : O → Ω. Show that (2.1.97)
F, G ∈ C 1 =⇒ F ◦ G ∈ C 1 .
More generally, show that, for k ∈ N, (2.1.98)
F, G ∈ C k =⇒ F ◦ G ∈ C k .
Hint. Write H = F ◦ G, with hℓ (x) = fℓ (g1 (x), . . . , gn (x)), and use (2.1.25) to get (2.1.99)
∂j hℓ (x) =
n ∑ k=1
∂k fℓ (g1 , . . . , gn )∂j gk .
Exercises
57
Show that this yields (2.1.97). To proceed, deduce from (2.1.99) that n ∑ ∂j1 ∂j2 hℓ (x) = ∂k1 ∂k2 fℓ (g1 , . . . , gn )(∂j1 gk1 )(∂j2 gk2 ) (2.1.100)
k1 ,k2 =1 n ∑
+
∂k fℓ (g1 , . . . , gn )∂j1 ∂j2 gk .
k=1
Use this to get (2.1.98) for k = 2. Proceeding inductively, show that there exist constants C(µ, J # , k # ) = C(µ, J1 , . . . , Jµ , k1 , . . . , kµ ) such that if F, G ∈ C k and |J| ≤ k, ∑ (J ) (k ,...,kµ ) (J) (J ) (2.1.101) hℓ (x) = C(µ, J # , k # )gk11 · · · gkµµ fℓ 1 (g1 , . . . , gn ), where the sum is over µ ≤ |J|, J1 + · · · + Jµ ∼ J, |Jν | ≥ 1, and J1 + · · · + Jµ ∼ J means J is a rearrangement of J1 + · · · + Jµ . Show that (2.1.98) follows from this. 10. Show that the map Φ : Gl(n, R) → Gl(n, R) given by Φ(X) = X −1 is C k for each k, i.e., Φ ∈ C ∞ . Hint. Start with the material of Exercise 3. Write DΦ(X)Y = −X −1 Y X −1 as ∂ ∂ℓm Φ(X) = Φ(X) = DΦ(X)Eℓm = −Φ(X)Eℓm Φ(X), ∂xℓm where X = (xℓm ) and Eℓm has just one nonzero entry, at position (ℓ, m). Iterate this to get ∂ℓ2 m2 ∂ℓ1 m1 Φ(X) = −(∂ℓ2 m2 Φ(X))Eℓ1 m1 Φ(X) − Φ(X)Eℓ1 m1 (∂ℓ2 m2 Φ(X)), and continue. Exercises 11–13 deal with properties of the determinant, as a differentiable function on spaces of matrices. 11. Let M (n, R) be the space of n × n matrices with real coefficients, det : M (n, R) → R the determinant. Show that, if I is the identity matrix, D det(I)B = Tr B, i.e., d det(I + tB)|t=0 = Tr B. dt 12. If A(t) = (ajk (t)) is a smooth curve in M (n, R), use the expansion of (d/dt) det A(t) as a sum of n determinants, in which the rows of A(t) are successively differentiated, to show that ( ) d det A(t) = Tr Cof(A(t))t · A′ (t) , dt and deduce that, for A, B ∈ M (n, R), ( ) D det(A)B = Tr Cof(A)t · B .
58
2. Multivariable differential calculus
Here Cof(A), the cofactor matrix, is defined in Exercise 4 of §1.4. 13. Suppose A ∈ M (n, R) is invertible. Using det(A + tB) = (det A) det(I + tA−1 B), show that
D det(A)B = (det A) Tr(A−1 B). Comparing this result with that of Exercise 12, establish Cramer’s formula: (det A)A−1 = Cof(A)t . Compare the derivation in Exercise 4 of §1.4. 14. Define f (x, y) on R2 by xy f (x, y) = √ , (x, y) ̸= (0, 0), x2 + y 2 0, (x, y) = (0, 0). Show that f is continuous on R2 and smooth on R2 \ (0, 0). Show that ∂f /∂x and ∂f /∂y exist at each point of R2 , and are continuous on R2 \ (0, 0), but not on R2 . Show that ∂f ∂f (0, 0) = (0, 0) = 0. ∂x ∂y Show that f is not differentiable at (0, 0). Hint. Show that f (x, y) is not o(∥(x, y)∥) as (x, y) → (0, 0), by considering f (x, x). 15. Define g(x, y) on R2 by g(x, y) =
xy 3 , (x, y) ̸= (0, 0), x2 + y 2 0, (x, y) = (0, 0).
Show that g is smooth on R2 \ (0, 0) and class C 1 on R2 . Show that ∂x ∂y g and ∂y ∂x g exist at each point of R2 , and are continuous on R2 \ (0, 0), but not on R2 . Show that ∂ ∂g ∂ ∂g (0, 0) = 1, (0, 0) = 0. ∂y ∂x ∂x ∂y
2.2. Inverse function and implicit function theorem The Inverse Function Theorem gives a condition under which a function can be locally inverted. This theorem and its corollary the Implicit Function Theorem are fundamental results in multivariable calculus. First we state the Inverse Function Theorem. Here, we assume k ≥ 1. Theorem 2.2.1. Let F be a C k map from an open neighborhood Ω of p0 ∈ Rn to Rn , with q0 = F (p0 ). Suppose the derivative DF (p0 ) is invertible. Then there is a neighborhood U of p0 and a neighborhood V of q0 such that F : U → V is one-to-one and onto, and F −1 : V → U is a C k map. (One says F : U → V is a diffeomorphism.)
2.2. Inverse function and implicit function theorem
59
First we show that F is one-to-one on a neighborhood of p0 , under these hypotheses. In fact, we establish the following result, of interest in its own right. Proposition 2.2.2. Assume Ω ⊂ Rn is open and convex, and let f : Ω → Rn be C 1 . Assume that the symmetric part of Df (u) is positive-definite, for each u ∈ Ω. Then f is one-to-one on Ω. Proof. Take distinct points u1 , u2 ∈ Ω, and set u2 − u1 = w. Consider φ : [0, 1] → R, given by φ(t) = w · f (u1 + tw). ′ Then φ (t) = w·Df (u1 +tw)w > 0 for t ∈ [0, 1], so φ(0) ̸= φ(1). But φ(0) = w·f (u1 ) and φ(1) = w · f (u2 ), so f (u1 ) ̸= f (u2 ). To continue the proof of Theorem 2.2.1, let us set ( ) (2.2.1) f (u) = A F (p0 + u) − q0 , A = DF (p0 )−1 . Then f (0) = 0 and Df (0) = I, the identity matrix. We will show that f maps a neighborhood of 0 one-to-one and onto some neighborhood of 0. We can write (2.2.2)
f (u) = u + R(u),
R(0) = 0, DR(0) = 0,
1
and R is C . Pick b > 0 such that 1 . 2 Then Df = I + DR has positive definite symmetric part on ∥u∥ ≤ 2b =⇒ ∥DR(u)∥ ≤
(2.2.3)
B2b (0) = {u ∈ Rn : ∥u∥ < 2b}, so by Proposition 2.2.2, f : B2b (0) −→ Rn is one-to-one. We will show that the range f (B2b (0)) contains Bb (0), that is to say, we can solve (2.2.4)
f (u) = v,
given v ∈ Bb (0), for some (unique) u ∈ B2b (0). This is equivalent to u + R(u) = v. To get the solution, we set Tv (u) = v − R(u).
(2.2.5)
Then solving (2.2.4) is equivalent to solving (2.2.6)
Tv (u) = u.
We look for a fixed point u = K(v) = f −1 (v).
(2.2.7)
Also, we want to show that DK(0) = I, i.e., that (2.2.8)
K(v) = v + r(v),
r(v) = o(∥v∥).
The “little oh” notation is defined in Exercise 8 of §2.1. If we succeed in doing this, it follows that, for y close to q0 , G(y) = F −1 (y) is defined. Also, taking x = p0 + u,
y = F (x),
v = f (u) = A(y − q0 ),
60
2. Multivariable differential calculus
as in (2.2.1), we have, via (2.2.8), G(y) = p0 + u = p0 + K(v) = p0 + K(A(y − q0 )) = p0 + A(y − q0 ) + o(∥y − q0 ∥). Hence G is differentiable at q0 and (2.2.9)
DG(q0 ) = A = DF (p0 )−1 .
A parallel argument, with p0 replaced by a nearby x and y = F (x), gives DG(y) = DF (G(y))−1 .
(2.2.10)
Thus our task is to solve (2.2.6). To do this, we use the following general result, known as the Contraction Mapping Theorem. Theorem 2.2.3. Let X be a complete metric space, and let T : X → X satisfy (2.2.11)
dist(T x, T y) ≤ r dist(x, y),
for some r < 1. (We say T is a contraction.) Then T has a unique fixed point x. For any y0 ∈ X, T k y0 → x as k → ∞. Proof. Pick y0 ∈ X and let yk = T k y0 . Then dist(yk , yk+1 ) ≤ rk dist(y0 , y1 ), so (2.2.12)
dist(yk , yk+m ) ≤ dist(yk , yk+1 ) + · · · + dist(yk+m−1 , yk+m ) ( ) ≤ rk + · · · + rk+m−1 dist(y0 , y1 ) ( )−1 ≤ rk 1 − r dist(y0 , y1 ).
It follows that (yk ) is a Cauchy sequence, so it converges; yk → x. Since T yk = yk+1 and T is continuous, it follows that T x = x, i.e., x is a fixed point. Uniqueness of the fixed point is clear from the estimate dist(T x, T x′ ) ≤ r dist(x, x′ ), which implies dist(x, x′ ) = 0 if x and x′ are fixed points. This proves Theorem 2.2.3. Returning to the task of solving (2.2.6), having b as in (2.2.3), we claim that (2.2.13)
∥v∥ ≤ b =⇒ Tv : Xv → Xv ,
where Xv = {u ∈ B2b (0) : ∥u − v∥ ≤ Av }, (2.2.14)
Av =
sup ∥w∥≤2∥v∥
∥R(w)∥.
See Figure 2.2.1. Note from (2.2.2)–(2.2.3) that 1 ∥w∥ ≤ 2b =⇒ ∥R(w)∥ ≤ ∥w∥, and ∥R(w)∥ = o(∥w∥). 2 Hence (2.2.15)
∥v∥ ≤ b =⇒ Av ≤ ∥v∥,
and Av = o(∥v∥).
Thus ∥u − v∥ ≤ Av ⇒ u ∈ Xv . Also u ∈ Xv =⇒ ∥u∥ ≤ 2∥v∥ (2.2.16)
=⇒ ∥R(u)∥ ≤ Av =⇒ ∥Tv (u) − v∥ ≤ Av ,
2.2. Inverse function and implicit function theorem
61
Figure 2.2.1. Tv : Xv → Xv
so (2.2.13) holds. As for the contraction property, given uj ∈ Xb , ∥v∥ ≤ b, (2.2.17)
∥Tv (u1 ) − Tv (u2 )∥ = ∥R(u2 ) − R(u1 )∥ 1 ≤ ∥u1 − u2 ∥, 2
the last inequality by (2.2.3), so the map (2.2.13) is a contraction. Hence, by Theorem 2.2.3, there is a unique fixed point, u = K(v) ∈ Xv . Also, since u ∈ Xv , (2.2.18)
∥K(v) − v∥ ≤ Av = o(∥v∥).
Thus we have (2.2.8). This establishes the existence of the inverse function G = F −1 : V → U , and we have the formula (2.2.10) for the derivative DG. Since G is differentiable on V , it is certainly continuous, so (2.2.10) implies DG is continuous, given F ∈ C 1 (U ). To finish the proof of the Inverse Function Theorem and show that G is C k if F is C k , for k ≥ 2, one uses an inductive argument. See Exercise 6 at the end of this section for an approach to this last argument. Thus if DF is invertible on the domain of F, F is a local diffeomorphism. Stronger hypotheses are needed to guarantee that F is a global diffeomorphism
62
2. Multivariable differential calculus
onto its range. Proposition 2.2.2 provides one tool for doing this. Here is a slight strengthening. Corollary 2.2.4. Assume Ω ⊂ Rn is open and convex, and that F : Ω → Rn is C 1 . Assume there exist n × n matrices A and B such that the symmetric part of A DF (u) B is positive definite for each u ∈ Ω. Then F maps Ω diffeomorphically onto its image, an open set in Rn .
Proof. Exercise.
We make a comment about solving the equation F (x) = y, under the hypotheses of Theorem 2.2.1, when y is close to q0 . The fact that finding the fixed point for Tv in (2.2.13) is accomplished by taking the limit of Tvk (v) implies that, when y is sufficiently close to q0 , the sequence (xk ), defined by ( ) (2.2.19) x0 = p0 , xk+1 = xk + DF (p0 )−1 y − F (xk ) , converges to the solution x. An analysis of the rate at which xk → x, and F (xk ) → y, can be made by applying F to (2.2.19), yielding F (xk+1 ) = F (xk + DF (p0 )−1 (y − F (xk )) = F (xk ) + DF (xk )DF (p0 )−1 (y − F (xk )) + R(xk , DF (p0 )−1 (y − F (xk ))), and hence (2.2.20)
( ) y − F (xk+1 ) = I − DF (xk )DF (p0 )−1 (y − F (xk )) e k , y − F (xk )), + R(x
e k , y − F (xk ))∥ = o(∥y − F (xk )∥). with ∥R(x It turns out that replacing DF (p0 )−1 by DF (xk )−1 in (2.2.19) yields a faster approximation. This method, known as Newton’s method, is described in the exercises. We consider some examples of maps to which Theorem 2.2.1 applies. First, we look at ( ) ( ) r cos θ x(r, θ) (2.2.21) F : (0, ∞) × R −→ R2 , F (r, θ) = = . r sin θ y(r, θ) Then (2.2.22)
( DF (r, θ) =
∂r x ∂θ x ∂r y ∂θ y
)
( =
cos θ sin θ
) −r sin θ , r cos θ
so (2.2.23)
det DF (r, θ) = r cos2 θ + r sin2 θ = r.
Hence DF (r, θ) is invertible for all (r, θ) ∈ (0, ∞) × R. Theorem 2.2.1 implies that each (r0 , θ0 ) ∈ (0, ∞) × R has a neighborhood U and (x0 , y0 ) = (r0 cos θ0 , r0 sin θ0 ) has a neighborhood V such that F is a smooth diffeomorphism of U onto V . In this simple situation, it can be verified directly that (2.2.24)
F : (0, ∞) × (−π, π) −→ R2 \ {(x, 0) : x ≤ 0}
is a smooth diffeomorphism.
2.2. Inverse function and implicit function theorem
63
Note that DF (1, 0) = I in (2.2.22). Let us check the domain of applicability of Proposition 2.2.2. The symmetric part of DF (r, θ) in (2.2.22) is ( ) 1 cos θ (1 − r) sin θ 2 (2.2.25) S(r, θ) = 1 . r cos θ 2 (1 − r) sin θ By Proposition 2.1.7, this is positive definite if and only if (2.2.26)
cos θ > 0,
and 1 det S(r, θ) = r cos2 θ − (1 − r)2 sin2 θ > 0. 4 Now (2.2.26) holds for θ ∈ (−π/2, π/2), but not on all of (−π, π). Furthermore, (2.2.27) holds for (r, θ) in a neighborhood of (r0 , θ0 ) = (1, 0), but it does not hold on all of (0, ∞) × (−π/2, π/2). We see that Proposition 2.2.2 does not capture the full force of the diffeomorphism property of (2.2.24). (2.2.27)
We move on to another example. As in §2.1, we can extend Theorem 2.2.1, replacing Rn by a finite dimensional real vector space, isometric to a Euclidean 2 space, such as M (n, R) ≈ Rn . As an example, consider (2.2.28)
Exp : M (n, R) −→ M (n, R),
Exp(X) = eX =
∞ ∑ 1 k X . k!
k=0
Smoothness of Exp follows from Corollary 2.1.12. Since 1 (2.2.29) Exp(Y ) = I + Y + Y 2 + · · · , 2 we have (2.2.30)
D Exp(0)Y = Y,
∀ Y ∈ M (n, R),
so D Exp(0) is invertible. Then Theorem 2.2.1 implies that there exist a neighborhod U of 0 ∈ M (n, R) and a neighborhood V of I ∈ M (n, R) such that Exp : U → V is a smooth diffeomorphism. To motivate the next result, we consider the following example. Take a > 0 and consider the equation (2.2.31)
x2 + y 2 = a2 ,
F (x, y) = x2 + y 2 .
Note that (2.2.32)
DF (x, y) = (2x 2y),
Dx F (x, y) = 2x,
Dy F (x, y) = 2y.
The equation (2.2.31) defines y “implicitly” as a smooth function of x if |x| < a. Explicitly, √ (2.2.33) |x| < a =⇒ y = a2 − x2 , Similarly, (2.2.31) defines x implicitly as a smooth function of y if |y| < a; explicitly √ (2.2.34) |y| < a =⇒ x = a2 − y 2 . Now, given x0 ∈ R, a > 0, there exists y0 ∈ R such that F (x0 , y0 ) = a2 if and only if |x0 | ≤ a. Furthermore, (2.2.35)
given F (x0 , y0 ) = a2 ,
Dy F (x0 , y0 ) ̸= 0 ⇔ |x0 | < a.
64
2. Multivariable differential calculus
Similarly, given y0 ∈ R, there exists x0 such that F (x0 , y0 ) = a2 if and only if |y0 | ≤ a, and (2.2.36)
given F (x0 , y0 ) = a2 ,
Dx F (x0 , y0 ) ̸= 0 ⇔ |x0 | < a.
Note also that, whenever (x, y) ∈ R2 and F (x, y) = a2 > 0, (2.2.37)
DF (x, y) ̸= 0,
so either Dx F (x, y) ̸= 0 or Dy F (x, y) ̸= 0, and, as seen above whenever (x0 , y0 ) ∈ R2 and F (x0 , y0 ) = a2 > 0, we can solve F (x, y) = a2 for either y as a smooth function of x for x near x0 or for x as a smooth function of y for y near y0 . We move from these observations to the next result, the Implicit Function Theorem. Theorem 2.2.5. Suppose U is a neighborhood of x0 ∈ Rm , V a neighborhood of y0 ∈ Rℓ , and we have a C k map (2.2.38)
F : U × V −→ Rℓ ,
F (x0 , y0 ) = u0 .
Assume Dy F (x0 , y0 ) is invertible. Then the equation F (x, y) = u0 defines y = g(x, u0 ) for x near x0 (satisfying g(x0 , u0 ) = y0 ) with g a C k map. Proof. Consider H : U × V → Rm × Rℓ defined by ( ) (2.2.39) H(x, y) = x, F (x, y) . (Actually, regard (x, y) and (x, F (x, y)) as column vectors.) We have ( ) I 0 (2.2.40) DH = . Dx F Dy F Thus DH(x0 , y0 ) is invertible, so G = H −1 exists, on a neighborhood of (x0 , u0 ), and is C k , by the Inverse Function Theorem. Let us set (2.2.41)
G(x, u) = (ξ(x, u), g(x, u)).
Then (2.2.42)
H ◦ G(x, u) = H(ξ(x, u), g(x, u)) = (ξ(x, u), F (ξ(x, u), g(x, u)).
Since H ◦ G(x, u) = (x, u), we have ξ(x, u) = x, so (2.2.43)
G(x, u) = (x, g(x, u))
and hence (2.2.44)
H ◦ G(x, u) = (x, F (x, g(x, u)),
hence (2.2.45)
F (x, g(x, u)) = u.
Note that G(x0 , u0 ) = (x0 , y0 ), so g(x0 , u0 ) = y0 , and g is the desired map.
2.2. Inverse function and implicit function theorem
65
Here is an example where Theorem 2.2.5 applies. Set ( ) x(u2 + v 2 ) 4 2 (2.2.46) F : R −→ R , F (u, v, x, y) = . xu + yv We have
( ) 4 F (2, 0, 1, 1) = . 2
(2.2.47) Note that
(
(2.2.48)
Du,v F (u, v, x, y) =
2xu x
hence
(
(2.2.49)
Du,v F (2, 0, 1, 1) =
4 1
) 2xv , y ) 0 1
is invertible, so Theorem 2.2.5 (with (u, v) in place of y and (x, y) in place of x) implies that the equation ( ) 4 (2.2.50) F (u, v, x, y) = 2 defines smooth functions (2.2.51)
u = u(x, y),
v = v(x, y),
for (x, y) near (x0 , y0 ) = (1, 1), satisfying (2.2.50), with (u(1, 1), v(1, 1)) = (2, 0). Let us next focus on the case ℓ = 1 of Theorem 2.2.5, so (2.2.52)
z = (x, y) ∈ Rn , x ∈ Rn−1 , y ∈ R,
F (z) ∈ R.
Then Dy F = ∂y F . If F (x0 , y0 ) = u0 , Theorem 2.2.5 says that if ∂y F (x0 , y0 ) ̸= 0,
(2.2.53) then one can solve (2.2.54)
F (x, y) = u0 for y = g(x, u0 ),
for x near x0 (satisfying g(x0 , u0 ) = y0 ), with g a C k function. This phenomenon was illustrated in (2.2.31)–(2.2.35). To generalize the observations involving (2.2.36)–(2.2.37), we note the following. Set (x, y) = z = (z1 , . . . , zn ), z0 = (x0 , y0 ). The condition (2.2.53) is that ∂zn F (z0 ) ̸= 0. Now a simple permutation of variables allows us to assume ∂zj F (z0 ) ̸= 0,
(2.2.55)
F (z0 ) = u0 ,
and deduce that one can solve (2.2.56)
F (z) = u0 ,
for zj = g(z1 , . . . , zj−1 , zj+1 , . . . , zn ).
Let us record this result, changing notation and replacing z by x.
66
2. Multivariable differential calculus
Proposition 2.2.6. Let Ω be a neighborhood of x0 ∈ Rn . Asume we have a C k function F : Ω −→ R,
(2.2.57)
F (x0 ) = u0 ,
and assume (2.2.58)
DF (x0 ) ̸= 0,
i.e., (∂1 F (x0 ), . . . , ∂n F (x0 )) ̸= 0.
Then there exists j ∈ {1, . . . , n} such that one can solve F (x) = u0 for (2.2.59)
xj = g(x1 , . . . , xj−1 , xj+1 , . . . , xn ),
with (x10 , . . . , xj0 , . . . , xn0 ) = x0 , for a C k function g. Remark. For F : Ω → R, it is common to denote DF (x) by ∇F (x), (2.2.60)
∇F (x) = (∂1 F (x), . . . , ∂n F (x)).
Here is an example to which Proposition 2.2.6 applies. Using the notation (x, y) = (x1 , x2 ), set (2.2.61)
F : R2 −→ R,
F (x, y) = x2 + y 2 − x.
Then (2.2.62)
∇F (x, y) = (2x − 1, 2y),
which vanishes if and only if x = 1/2, y = 0. Hence Proposition 2.2.6 applies if and only if (x0 , y0 ) ̸= (1/2, 0). Let us give an example involving a real valued function on M (n, R), namely (2.2.63)
det : M (n, R) −→ R.
As indicated in Exercise 13 of §2.1, if det X ̸= 0, (2.2.64)
D det(X)Y = (det X) Tr(X −1 Y ),
so (2.2.65)
det X ̸= 0 =⇒ D det(X) ̸= 0.
We deduce that, if (2.2.66)
X0 ∈ M (n, R),
det X0 = a ̸= 0,
then, writing (2.2.67)
X = (xjk )1≤j,k≤n ,
there exist µ, ν ∈ {1, . . . , n} such that the equation (2.2.68)
det X = a
has a smooth solution of the form ( ) (2.2.69) xµν = g xαβ : (α, β) ̸= (µ, ν) , such that, if the argument of g consists of the matrix entries of X0 other than the µ, ν entry, then the left side of (2.2.69) is the µ, ν entry of X0 .
Exercises
67
Let us return to the setting of Theorem 2.2.5, with ℓ not necessarily equal to 1. In notation parallel to that of (2.2.55), we assume F is a C k map, (2.2.70)
F : Ω −→ Rℓ ,
F (z0 ) = u0 ,
where Ω is a neighborhood of z0 in Rn . We assume (2.2.71)
DF (z0 ) : Rn −→ Rℓ is surjective.
Then, upon reordering the variables z = (z1 , . . . , zn ), we can write z = (x, y), x = (x1 , . . . , xn−ℓ ), y = (y1 , . . . , yℓ ), such that Dy F (z0 ) is invertible, and Theorem 2.2.5 applies. Thus (for this reordering of variables), we have a C k solution to (2.2.72)
F (x, y) = u0 ,
y = g(x, u0 ),
satisfying y0 = g(x0 , u0 ), z0 = (x0 , y0 ). To give one example to which this result applies, we take another look at F : R4 → R2 in (2.2.46). We have ( ) 2xu 2xv u2 + v 2 0 (2.2.73) DF (u, v, x, y) = . x y u v The reader is invited to determine for which (u, v, x, y) ∈ R4 the matrix on the right side of (2.2.73) has rank 2. Here is another example, involving a map defined on M (n, R). Set ( ) det X 2 (2.2.74) F : M (n, R) −→ R , F (X) = . Tr X Parallel to (2.2.64), if det X ̸= 0, Y ∈ M (n, R), ( ) (det X) Tr(X −1 Y ) (2.2.75) DF (X)Y = . Tr Y Hence, given det X ̸= 0, DF (X) : M (n, R) → R2 is surjective if and only if ( ) Tr(X −1 Y ) (2.2.76) L : M (n, R) → R2 , LY = Tr Y is surjective. This is seen to be the case if and only if X is not a scalar multiple of the identity I ∈ M (n, R).
Exercises 1. Suppose F : U → Rn is a C 2 map, p ∈ U, open in Rn , and DF (p) is invertible. With q = F (p), define a map N on a neighborhood of p by ( ) (2.2.77) N (x) = x + DF (x)−1 q − F (x) . Show that there exists ε > 0 and C < ∞ such that, for 0 ≤ r < ε, ∥x − p∥ ≤ r =⇒ ∥N (x) − p∥ ≤ C r2 . Conclude that, if ∥x1 − p∥ ≤ r with r < min(ε, 1/2C), then xj+1 = N (xj ) defines a sequence converging very rapidly to p. This is the basis of Newton’s method, for
68
2. Multivariable differential calculus
solving F (p) = q for p. Hint. Apply F to both sides of (2.73). 2. Applying Newton’s method to f (x) = 1/x, show that you get a fast approximation to division using only addition and multiplication. Hint. Carry out the calculation of N (x) in this case and notice a “miracle.” 3. Identify R2n with Cn via z = x + iy, as in Exercise 4 of §2.1. Let U ⊂ R2n be open, F : U → R2n be C 1 . Assume p ∈ U, DF (p) invertible. If F −1 : V → U is given as in Theorem 2.2.1, show that F −1 is holomorphic provided F is. 4. Let O ⊂ Rn be open. We say a function f ∈ C ∞ (O) is real analytic provided that, for each x0 ∈ O, we have a convergent power series expansion ∑ 1 (2.2.78) f (x) = f (α) (x0 )(x − x0 )α , α! α≥0
valid in a neighborhood of x0 . Show that we can let x be complex in (2.2.16), and obtain an extension of f to a neighborhood of O in Cn . Show that the extended function is holomorphic, i.e., satisfies the Cauchy-Riemann equations. Hint. Use Proposition 2.1.10. Remark. It can be shown that, conversely, any holomorphic function has a power series expansion. See §5.1. For the next exercise, assume this as known. 5. Let O ⊂ Rn be open, p ∈ O, f : O → Rn be real analytic, with Df (p) invertible. Take f −1 : V → U as in Theorem 2.2.1. Show f −1 is real analytic. Hint. Consider a holomorphic extension F : Ω → Cn of f and apply Exercise 3. 6. Use (2.2.10) to show that if a C 1 diffeomorphism has a C 1 inverse G, and if actually F is C k , then also G is C k . Hint. Use induction on k. Write (2.2.10) as G(x) = Φ ◦ F ◦ G(x), −1
with Φ(X) = X , as in Exercises 3 and 10 of §2.1, G(x) = DG(x), F(x) = DF (x). Apply Exercise 9 of §2.1 to show that, in general G, F, Φ ∈ C ℓ =⇒ G ∈ C ℓ . Deduce that if one is given F ∈ C k and one knows that G ∈ C k−1 , then this result applies to give G = DG ∈ C k−1 , hence G ∈ C k . 7. Show that there is a neighborhood O of (1, 0) ∈ R2 and there are functions u, v, w ∈ C 1 (O) (u = u(x, y), etc.) satisfying the equations u3 + v 3 − xw3 = 0, u2 + yw2 + v = 1, xu + yvw = 1,
Exercises
69
for (x, y) ∈ O, and satisfying u(1, 0) = 1,
v(1, 0) = 0,
w(1, 0) = 1.
Hint. Define F : R → R by 5
3
3 u + v 3 − xw3 F (u, v, w, x, y) = u2 + yw2 + v , xu + yvw
Then F (1, 0, 1, 1, 0) = (0, 1, 1)t . Evaluate the 3 × 3 matrix Du,v,w F (1, 0, 1, 1, 0). Compare (2.2.46)–(2.2.51). 8. Consider F : M (n, R) → M (n, R), given by F (X) = X 2 . Show that F is a diffeomorphism of a neighborhood of the identity matrix I onto a neighborhood of I. Show that F is not a diffeomorphism of a neighborhood of ( ) 1 0 0 −1 onto a neighborhood of I (in case n = 2). 9. Prove Corollary 2.2.4. 10. Let f : R2 → R3 be a C 1 map. Assume f (0) = (0, 0, 0) and ∂f ∂f (0) × (0) = (0, 0, 1). ∂x ∂y Show that there exist neighborhoods O and Ω of 0 ∈ R2 and a C 1 map u : Ω → R such that the image of O under f in R3 is the graph of u over Ω. Hint. Let Π : R3 → R2 be Π(x, y, z) = (x, y), and consider φ(x, y) = Π(f (x, y)),
φ : R2 → R2 .
Show that Dφ(0) : R2 → R2 is invertible, and apply the inverse function theorem. Then let u be the z-component of f ◦ φ−1 . 11. Generalize Exercise 10 to the setting where f : Rm → Rn (m < n) is C 1 and Df (0) : Rm −→ Rn is injective. Remark. For related results, see the opening paragraphs of §3.2. 12. Let Ω ⊂ Rn be open and contain p0 . Assume F : Ω → Rn is continuous and F (p0 ) = q0 . Assume F is C 1 on Ω and DF (x) is invertible for all x ∈ Ω. Finally, assume there exists R > 0 such that (2.2.79)
x ∈ ∂Ω =⇒ ∥F (x) − q0 ∥ ≥ R.
Show that (2.2.80)
F (Ω) ⊃ BR/2 (q0 ).
70
2. Multivariable differential calculus
Hint. Given y0 ∈ BR/2 (q0 ), use compactness to show that there exists x0 ∈ Ω such that ∥F (x0 ) − y0 ∥ = inf ∥F (x) − y0 ∥. x∈Ω
Use the hypothesis (2.2.79) to show that x0 ∈ Ω. If F (x0 ) ̸= y0 , use F (x0 + tz) = F (x0 ) + tDF (x0 )z + o(∥tz∥), to produce z ∈ Rn (say DF (x0 )z = y0 − F (x0 )) such that F (x0 + tz) is closer to y0 than F (x0 ) is, for small t > 0. Contradiction. 13. Do Exercise 12 with the conclusion (2.2.80) strengthened to (2.2.81)
F (Ω) ⊃ BR (q0 ).
Hint. It suffices to show that F (Ω) ⊃ BS (q0 ) for each S < R. Given such S, produce a diffeomorphism φ : Rn → Rn such that Exercise 12 applies to φ ◦ F , and yields the desired conclusion.
2.3. Systems of differential equations and vector fields In this section we study n × n systems of ODE, dy (2.3.1) = F (t, y), y(t0 ) = y0 . dt To begin, we prove the following fundamental existence and uniqueness result. Theorem 2.3.1. Let y0 ∈ Ω, an open subset of Rn , I ⊂ R an interval containing t0 . Suppose F is continuous on I × Ω and satisfies the following Lipschitz estimate in y : (2.3.2)
∥F (t, y1 ) − F (t, y2 )∥ ≤ L∥y1 − y2 ∥
for t ∈ I, yj ∈ Ω. Then the equation (2.3.1) has a unique solution on some t-interval containing t0 . To begin the proof, we note that the equation (2.3.1) is equivalent to the integral equation ∫ t ( ) (2.3.3) y(t) = y0 + F s, y(s) ds. t0
Existence will be established via the Picard iteration method, which is the following. Guess y0 (t), e.g., y0 (t) = y0 . Then set ∫ t ( ) F s, yk−1 (s) ds. (2.3.4) yk (t) = y0 + t0
We aim to show that, as k → ∞, yk (t) converges to a (unique) solution of (2.3.3), at least for t close enough to t0 . To do this, we use the Contraction Mapping Theorem, established in §2.2. We look for a fixed point of Φ, defined by ∫ t ( ) (2.3.5) (Φy)(t) = y0 + F s, y(s) ds. t0
2.3. Systems of differential equations and vector fields
Let (2.3.6)
71
{ } X = u ∈ C(J, Rn ) : u(t0 ) = y0 , sup ∥u(t) − y0 ∥ ≤ R . t∈J
Here J = [t0 − T, t0 + T ], where T will be chosen, sufficiently small, below. The quantity R is picked so that BR (y0 ) = {y : ∥y − y0 ∥ ≤ R} is contained in Ω, and we also suppose J ⊂ I. Then there exists M such that (2.3.7)
sup s∈J,∥y−y0 ∥≤R
∥F (s, y)∥ ≤ M.
Then, provided (2.3.8)
T ≤
R , M
we have (2.3.9)
Φ : X → X.
Now, using the Lipschitz hypothesis (2.3.2), we have, for t ∈ J, ∫ t ∥(Φy)(t) − (Φz)(t)∥ ≤ L∥y(s) − z(s)∥ ds t0 (2.3.10) ≤ T L sup ∥y(s) − z(s)∥ s∈J
assuming y and z belong to X. It follows that Φ is a contraction on X provided one has 1 (2.3.11) T < L in addition to the hypotheses above. This proves Theorem 2.3.1. Note that the bound M and the Lipschitz hypothesis on F were needed only on BR (y0 ). Thus we can extend Theorem 2.3.1 to the following setting: (2.3.12)
For each compact K ⊂ Ω, there exists MK < ∞ such that ∥F (t, x)∥ ≤ MK , ∀ x ∈ K, t ∈ I,
and (2.3.13)
For each K as above, there exists LK < ∞ such that ∥F (t, x) − F (t, y)∥ ≤ LK ∥x − y∥, ∀ x, y ∈ K, t ∈ I.
Note that, if K ⊂ Ω is compact, there exists RK > 0 such that e = {x ∈ Rn : dist(x, K) ≤ RK } ⊂ Ω, K e is compact. It follows that for each y0 ∈ K, the solution to (2.3.1) exists on and K the interval (2.3.14)
{t ∈ I : |t − t0 | ≤ min(RK /MKe , 1/2LKe )}.
Now that we have local solutions to (2.3.1), it is of interest to investigate when global solutions exist. Here is an example where breakdown occurs: dy (2.3.15) = y 2 , y(0) = 1. dt
72
2. Multivariable differential calculus
The solution blows up in finite time. See Exercise 1. It is useful to know that “blowing up” is the only way a solution can fail to exist globally. We have the following result. Proposition 2.3.2. Let F be as in Theorem 2.3.1, but with the boundedness and Lipschitz hypotheses replaced by (2.3.12)–(2.3.13). Assume [a, b] is contained in the open interval I, and assume y(t) solves (2.3.1) for t ∈ (a, b). Assume there exists a compact K ⊂ Ω such that y(t) ∈ K for all t ∈ (a, b). Then there exist a1 < a and b1 > b such that y(t) solves (2.3.1) for t ∈ (a1 , b1 ). Proof. We deduce from (2.3.14) that there exists δ > 0 such that for each y1 ∈ K, t1 ∈ [a, b], the solution to dy = F (t, y), y(t1 ) = y1 dt exists on the interval [t1 − δ, t1 + δ]. Now, under the current hypotheses, take t1 ∈ (b − δ/2, b) and y1 = y(t1 ), with y(t) solving (2.3.1). Then solving (2.3.16) continues y(t) past t = b. Similarly one can continue y(t) past t = a. (2.3.16)
Here is an example of a global existence result that can be deduced from Proposition 2.3.2. Consider the 2 × 2 system for y = (x, v): dx = v, dt (2.3.17) dv = −x3 . dt Here we take Ω = R2 , F (t, y) = F (t, x, v) = (u, −x3 ). If (2.3.17) holds for t ∈ (a, b), we have d ( v2 x4 ) dv dx (2.3.18) + + x3 = 0, =v dt 2 4 dt dt so each y(t) = (x(t), v(t)) solving (2.3.17) lies on a level curve x4 /4 + v 2 /2 = C, hence is confined to a compact subset of R2 , yielding global existence of solutions to (2.3.17). For more examples of global existence, see Exercises 2–4 below, and also further material below, treating linear systems. The discussion above dealt with first order systems. Often one wants to deal with a higher-order ODE. There is a standard method of reducing an nth-order ODE (2.3.19)
y (n) (t) = f (t, y, y ′ , . . . , y (n−1) )
to a first-order system. One sets u = (u0 , . . . , un−1 ) with (2.3.20)
u0 = y,
uj = y (j) ,
and then (2.3.21)
) du ( = u1 , . . . , un−1 , f (t, u0 , . . . , un−1 ) = g(t, u). dt
If y takes values in Rk , then u takes values in Rkn .
2.3. Systems of differential equations and vector fields
73
If the system (2.3.1) is non-autonomous, i.e., if F explicitly depends on t, it can be converted to an autonomous system (one with no explicit t-dependence) as follows. Set z = (t, y). We then have ) dz ( dy ) ( (2.3.22) = 1, = 1, F (z) = G(z). dt dt Sometimes this process destroys important features of the original system (2.3.1). For example, if (2.3.1) is linear, (2.3.22) might be nonlinear. Nevertheless, the trick of converting (2.3.1) to (2.3.22) has some uses. Linear systems Here we consider linear systems, of the form dx = A(t)x, x(0) = x0 , (2.3.23) dt given A(t) continuous in t ∈ I (an interval about 0), with values in M (n, R). We will apply Proposition 2.3.2 to establish global existence of solutions. It suffices to establish the following. Proposition 2.3.3. If ∥A(t)∥ ≤ K for t ∈ I, then the solution to (2.3.23) satisfies ∥x(t)∥ ≤ eK|t| ∥x0 ∥.
(2.3.24)
Proof. It suffices to prove (2.3.24) for t ≥ 0. Then y(t) = e−Kt x(t) satisfies dy = C(t)y, dt We claim that, for t ≥ 0,
C(t) = A(t) − KI.
(2.3.25)
y(0) = x0 ,
(2.3.26)
∥y(t)∥ ≤ ∥y(0)∥,
which then implies (2.3.24), for t ≥ 0. In fact, (2.3.27)
d ∥y(t)∥2 = y ′ (t) · y(t) + y(t) · y ′ (t) dt = 2y(t) · (A(t) − K)y(t).
Now y(t) · A(t)y(t) ≤ ∥y(t)∥ · ∥A(t)y(t)∥ ≤ ∥A(t)∥ · ∥y(t)∥2 , so the hypothesis ∥A(t)∥ ≤ K implies (2.3.28)
d ∥y(t)∥2 ≤ 0. dt
yielding (3.26).
Thanks to Proposition 2.3.3, we have, for s, t ∈ I, the solution operator for (2.3.23), (2.3.29)
S(t, s) ∈ M (n, R),
S(t, s)x(s) = x(t).
We have (2.3.30)
∂ S(t, s) = A(t)S(t, s), ∂t
S(s, s) = I.
74
2. Multivariable differential calculus
Note that S(t, s)S(s, r) = S(t, r). In particular, S(t, s) = S(s, t)−1 . We can use the solution operator S(t, s) to solve the inhomogeneous system dx = A(t)x + f (t), dt
(2.3.31) Namely, we can take (2.3.32)
∫
x(t0 ) = x0 . t
x(t) = S(t, t0 )x0 +
S(t, s)f (s) ds. t0
This is known as Duhamel’s formula. Verifying that this solves (2.3.30) is an exercise. We will make good use of this in the next subsection. Dependence of solutions on initial data and other parameters We study how the solution to a system of differential equations dx = F (x), x(0) = y (2.3.33) dt depends on the initial condition y. As shown in (2.3.22), there is no loss of generality in considering the autonomous system (2.3.33). We will assume F : Ω → Rn is smooth, Ω ⊂ Rn open and convex, and denote the solution to (2.3.33) by x = x(t, y). We want to examine smoothness in y. Let DF (x) denote the n × n matrix valued function of partial derivatives of F . To start, we assume F is of class C 1 , i.e., DF is continuous on Ω, and we want to show x(t, y) is differentiable in y. Let us recall what this means. Take y ∈ Ω and pick R > 0 such that BR (y) is contained in Ω. We seek an n × n matrix W (t, y) such that, for w0 ∈ Rn , ∥w0 ∥ ≤ R, (2.3.34)
x(t, y + w0 ) = x(t, y) + W (t, y)w0 + r(t, y, w0 ),
where (2.3.35)
r(t, y, w0 ) = o(∥w0 ∥),
which means (2.3.36)
lim
w0 →0
r(t, y, w0 ) = 0. ∥w0 ∥
When this holds, x(t, y) is differentiable in y, and (2.3.37)
Dy x(t, y) = W (t, y).
In other words, (2.3.38)
x(t, y + w0 ) = x(t, y) + Dy x(t, y)w0 + o(∥w0 ∥).
In the course of proving this differentiability, we also want to produce an equation for W (t, y) = Dy x(t, y). This can be done as follows. Suppose x(t, y) were differentiable in y. (We do not yet know that it is, but that is okay.) Then F (x(t, y)) is differentiable in y, so we can apply Dy to (2.3.32). Using the chain rule, we get the following equation, dW = DF (x)W, W (0, y) = I, (2.3.39) dt
2.3. Systems of differential equations and vector fields
75
called the linearization of (2.3.33). Here, I is the n×n identity matrix. Equivalently, given w0 ∈ Rn , (2.3.40)
w(t, y) = W (t, y)w0
is expected to solve (2.3.41)
dw = DF (x)w, dt
w(0) = w0 .
Now, we do not yet know that x(t, y) is differentiable, but we do know from results above on linear systems that (2.3.39) and (2.3.41) are uniquely solvable. It remains to show that, with such a choice of W (t, y), (2.3.34)–(2.3.35) hold. To rephrase the task, set (2.3.42)
x(t) = x(t, y),
x1 (t) = x(t, y + w0 ),
z(t) = x1 (t) − x(t),
and let W (t) solve (2.3.39), and w(t) satisfy (2.3.40)–(2.3.41). We then have x(t, y + w0 ) = x(t, y) + W (t, y)w0 + {z(t) − w(t)}, so the task of verifying (2.3.34)–(2.3.35) is equivalent to the task of verifying (2.3.43)
∥z(t) − w(t)∥ = o(∥w0 ∥).
To establish (2.3.43), we will obtain for z(t) an equation similar to (2.3.41). To begin, (2.3.42) implies dz (2.3.44) = F (x1 ) − F (x), z(0) = w0 . dt Now the fundamental theorem of calculus gives (2.3.45)
F (x1 ) − F (x) = G(x1 , x)(x1 − x),
with (2.3.46)
∫ G(x1 , x) =
1
( ) DF τ x1 + (1 − τ )x dτ.
0
If F is C 1 , then G is continuous. Then (2.3.44)–(2.3.45) yield dz (2.3.47) = G(x1 , x)z, z(0) = w0 . dt Given that (2.3.48)
∥DF (u)∥ ≤ L,
∀ u ∈ Ω,
which we have by continuity of DF , after possibly shrinking Ω slightly, we deduce from Proposition 2.3.3 that (2.3.49)
∥z(t)∥ ≤ e|t|L ∥w0 ∥,
that is, (2.3.50)
∥x(t, y) − x(t, y + w0 )∥ ≤ e|t|L ∥w0 ∥.
This establishes that x(t, y) is Lipschitz in y. To proceed, since G is continuous and G(x, x) = DF (x), we can rewrite (2.3.47) as (2.3.51)
dz = G(x + z, x)z = DF (x)z + R(x, z), dt
z(0) = w0 ,
76
2. Multivariable differential calculus
where (2.3.52)
F ∈ C 1 (Ω) =⇒ ∥R(x, z)∥ = o(∥z∥) = o(∥w0 ∥).
Now comparing (2.3.51) with (2.3.41), we have d (z − w) = DF (x)(z − w) + R(x, z), (z − w)(0) = 0. dt Then Duhamel’s formula gives ∫ t (2.3.54) z(t) − w(t) = S(t, s)R(x(s), z(s)) ds, (2.3.53)
0
where S(t, s) is the solution operator for d/dt − B(t), with B(t) = G(x1 (t), x(t)), which as in (2.3.49), satisfies (2.3.55)
∥S(t, s)∥ ≤ e|t−s|L .
We hence have (2.3.43), i.e., (2.3.56)
∥z(t) − w(t)∥ = o(∥w0 ∥).
This is precisely what is required to show that x(t, y) is differentiable with respect to y, with derivative W = Dy x(t, y) satisfying (2.3.39). Hence we have: Proposition 2.3.4. If F ∈ C 1 (Ω) and if solutions to (2.3.33) exist for t ∈ (−T0 , T1 ), then, for each such t, x(t, y) is C 1 in y, with derivative Dy x(t, y) satisfying (2.3.39). We have shown that x(t, y) is both Lipschitz and differentiable in y. The continuity of W (t, y) in y follows easily by comparing the differential equations of the form (2.3.39) for W (t, y) and W (t, y + w0 ), in the spirit of the analysis of z(t) done above. If F possesses further smoothness, we can establish higher differentiability of x(t, y) in y by the following trick. Couple (2.3.33) and (2.3.39), to get a system of differential equations for (x, W ):
(2.3.57)
dx = F (x), dt dW = DF (x)W, dt
with initial conditions (2.3.58)
x(0) = y,
W (0) = I.
We can reiterate the preceding argument, getting results on Dy (x, W ), hence on Dy2 x(t, y), and continue, proving: Proposition 2.3.5. If F ∈ C k (Ω), then x(t, y) is C k in y. Similarly, we can consider dependence of the solution to (2.3.59)
dx = F (τ, x), dt
x(0) = y
2.3. Systems of differential equations and vector fields
77
on a parameter τ , assuming F smooth jointly in (τ, x). This result can be deduced from the previous one by the following trick. Consider the system dz dx = F (z, y), = 0, x(0) = y, z(0) = τ. (2.3.60) dt dt Then we get smoothness of x(t, τ, y) jointly in (τ, y). As a special case, let F (τ, x) = τ F (x). In this case x(t0 , τ, y) = x(τ t0 , y), so we can improve the conclusion in Proposition 2.3.5 to the following: (2.3.61)
F ∈ C k (Ω) =⇒ x ∈ C k jointly in (t, y).
Vector fields and flows Let U ⊂ Rn be open. A vector field on U is a smooth map (2.3.62)
X : U −→ Rn .
Consider the corresponding ODE dy = X(y), y(0) = x, (2.3.63) dt with x ∈ U. A curve y(t) solving (2.3.63) is called an integral curve of the vector field X. It is also called an orbit. For fixed t, write (2.3.64)
t y = y(t, x) = FX (x).
t The locally defined FX , mapping (a subdomain of) U to U, is called the flow generated by the vector field X. As a consequence of the results on smooth dependence of solutions to ODE, in (2.3.64), y is a smooth function of (t, x).
The vector field X defines a differential operator on scalar functions, as follows: [ ] d h t (2.3.65) LX f (x) = lim h−1 f (FX x) − f (x) = f (FX x) t=0 . h→0 dt We also use the common notation (2.3.66)
LX f (x) = Xf,
that is, we apply X to f as a first order differential operator. Note that, if we apply the chain rule to (2.3.65) and use (2.3.63), we have ∑ ∂f (2.3.67) LX f (x) = X(x) · ∇f (x) = aj (x) , ∂xj ∑ if X = aj (x)ej , with {ej } the standard basis of Rn . In particular, using the notation (2.3.66), we have (2.3.68)
aj (x) = Xxj .
In the notation (2.3.66), (2.3.69)
X=
∑
aj (x)
∂ . ∂xj
We note that X is a derivation, that is, a map on C ∞ (U ), linear over R, satisfying (2.3.70)
X(f g) = (Xf )g + f (Xg).
78
2. Multivariable differential calculus
Conversely, any derivation on C ∞ (U ) defines a vector field, i.e., has the form (2.3.69), as we now show. Proposition 2.3.6. If X is a derivation on C ∞ (U ), then X has the form (2.3.69). ∑ Proof. Set aj (x) = Xxj , X # = aj (x)∂/∂xj , and Y = X − X # . Then Y is a derivation satisfying Y xj = 0 for each j. We aim to show that Y f = 0 for all f . Note that whenever Y is a derivation 1 · 1 = 1 ⇒ Y · 1 = 2Y · 1 ⇒ Y · 1 = 0. Thus Y annihilates constants. Thus in this case Y annihilates all polynomials of degree ≤ 1. Now we show that Y f (p) = 0 for all p ∈ U . Without loss of generality, we can ∫1 suppose p = 0. Then, with bj (x) = 0 (∂j f )(tx) dt, we can write ∑ f (x) = f (0) + bj (x)xj . It immediately follows that Y f vanishes at 0, so the proposition is proved.
A fundamental fact about vector fields is that they can be “straightened out” near points where they do not vanish. To see this, let X be a smooth vector field on U , and suppose X(p) ̸= 0. Then near p there is a hyperplane H that is not tangent to X near p, say on a portion we denote M . We can choose coordinates near p so that p is the origin and M is given by {xn = 0}. Thus we can identify a point x′ ∈ Rn−1 near the origin with x′ ∈ M . We can define a map (2.3.71)
F : M × (−t0 , t0 ) −→ U
by (2.3.72)
t F(x′ , t) = FX (x′ ).
This is C ∞ and has surjective derivative at (0, 0), and so by the inverse function theorem is a local diffeomorphism. This defines a new coordinate system near p, in which the flow generated by X has the form (2.3.73)
s FX (x′ , t) = (x′ , t + s).
If we denote the new coordinates by (u1 , . . . , un ), we see that the following result is established. Theorem 2.3.7. If X is a smooth vector field on U and X(p) ̸= 0, then there exists a coordinate system (u1 , . . . , un ), centered at p (so uj (p) = 0) with respect to which ∂ (2.3.74) X= . ∂un By contrast with the situation in Theorem 2.3.7, if X is a vector field on U, p ∈ U , and X(p) = 0, we say p is a critical point of X. It is of interest to understand the behavior of X and its flow near such a critical point. One special feature that arises here is that if X(p) = 0, then the linearization of (2.3.33) at p, given in general by (2.3.41), takes the special form dw (2.3.75) = Aw, A = DX(p), dt
2.3. Systems of differential equations and vector fields
79
since the solution to (2.3.33) satisfying x(0) = p is x(t) ≡ p. (Here F has been relabeled X.) The solution to (2.3.75) is given explicitly as a matrix exponential: (2.3.76)
w(t) = etA w0 ,
explored in the exercise set entitled “Exercises on the matrix exponential,” at the end of this section. We say p is a non-degenerate critical point of X if A = DX(p) is invertible. In such a case, the behavior of etA is governed by that of the eigenvalues {λj } of A. In particular, Re λj < 0 ∀ j =⇒ etA w0 → 0 as t → +∞, Re λj > 0 ∀ j =⇒ etA w0 → 0 as t → −∞. We say X has a sink at p in the first case, and a source at p in the second case. If Re λj is positive for some j and negative for some j, but never 0, we say X has a saddle at p. If Re λj ≡ 0, we say X(p) has a center at p. This exhausts the possibilities for non-degenerate critical points in dimension 2. In higher dimension there are other possibilities, which the reader can catalogue. In dimension 2, we illustrate these cases in Figure 2.3.1, showing two sinks, a saddle, and a center, taking, respectively, ( ) ( ) ( ) ( ) −1 −a −1 1 −1 (2.3.77) A= , , , , −1 1 −a −1 1 with a > 0 in the second case. Reverse the signs on the first two matrices to exhibit sources. t It follows from material above that the action of FX on p + w0 is close to that of e on w0 , for small w0 , and for t in a bounded interval, say [−T0 , T0 ]. Of course, t (p + w0 ) and etA w0 move very little when w0 is small. If for t ∈ [−T0 , T0 ], both FX one is to show that the structure of the orbits of X near p is close to that of the linearization, a finer analysis, involving large t, is needed. For sources and sinks, this analysis is fairly straightforward, and can be found in §3, Chapter 4, of [50]. The case of saddles is more subtle, and is treated in Appendix C to Chapter 4 of t can be rather different. [50]. When one has a center for etA , the behavior of FX The following example is considered in §3, Chapter 4 of [50]: ( ) −1 (2.3.78) X(x) = Jx − |x|2 x, x ∈ R2 , J = . 1 tA
This has a critical point at x = 0, and DX(0) = J. It is shown that (2.3.79)
t FX (x) −→ 0, as t ↗ +∞,
in this case. In summary, when one passes from the behavior of the linearization of X to X near a non-degenerate critical point, sources are sources, sinks are sinks, and saddles are saddles, but the center does not hold.
80
2. Multivariable differential calculus
Figure 2.3.1. Two sinks, a saddle, and a center
We turn to further mapping properties of vector fields. If F : V → W is a diffeomorphism between two open domains in Rn , and Y is a vector field on W, we define a vector field F# Y on V so that (2.3.80)
FFt # Y = F −1 ◦ FYt ◦ F,
or equivalently, by the chain rule, (2.3.81)
( )( ) ( ) F# Y (x) = DF −1 F (x) Y F (x) .
In particular, if U ⊂ Rn is open and X is a vector field on U, defining a flow F t , t then for a vector field Y, F# Y is defined on most of U, for |t| small, and we can define the Lie derivative: ( h ) d t (2.3.82) LX Y = lim h−1 F# Y − Y = F# Y t=0 , h→0 dt as a vector field on U. Another natural construction is the operator-theoretic bracket, also called the Lie bracket: (2.3.83)
[X, Y ] = XY − Y X,
where the vector fields X and Y are regarded as first order differential operators on C ∞ (U ). One verifies that (2.3.83) defines a vector field on U. In fact, if X =
2.3. Systems of differential equations and vector fields ∑
aj (x)∂/∂xj , Y =
(2.3.84)
∑
81
bj (x)∂/∂xj , then ∑( ∂bj ∂aj ) ∂ ak − bk . [X, Y ] = ∂xk ∂xk ∂xj j,k
The basic fact about the Lie bracket is the following. Theorem 2.3.8. If X and Y are smooth vector fields, then LX Y = [X, Y ]. s Proof. We examine LX Y = (d/ds)FX# Y s=0 , using (2.3.81), which implies that ( s ) ( s ) −s s (2.3.86) Ys (x) = FX# Y (x) = DFX FX (x) Y FX (x) . (2.3.85)
−s . Note that G s : U → L(Rn ). Hence, for x ∈ U, DG s (x) is an Let us set G s(= DFX ) n n element of L R , L(R ) .
To take the s-derivative of (2.3.86), we can apply the product rule and the chain rule. In particular, d s s s (2.3.87) Y (FX (x)) = DY (FX (x))X(FX (x)), ds in view of the master identity d s s F (x) = X(FX (x)). (2.3.88) ds X To differentiate the first factor on the right side of (2.3.86), we start with (d ) d s s d s s s (2.3.89) G (FX (x)) = G s (FX (x)) + DG s (FX (x)) FX (x), ds ds ds and then write (d ) (d ) −s s s G s (FX (x)) = DFX (FX (x)) ds ds (2.3.90) ( d ) −s s = D FX (FX (x)), ds and, as in (2.3.88), d −s −s F (y) = −DX(FX (y)), ds X so the right side of (2.3.90) is equal to (2.3.91)
(2.3.92)
D
−DX(x).
Putting together (2.3.87)–(2.3.92), we have ( s ) d Ys (x) = − DX(x)Y FX (x) ds ( s ) ( s ) ( s ) (2.3.93) + DG s FX (x) X FX (x) Y FX (x) ( s ) ( s ) ( s ) −s + DFX FX (x) DY FX (x) X FX (x) . Note that G 0 (x) = I ∈ L(Rn ) for all x ∈ U, so DG 0 = 0. Thus d (2.3.94) LX Y = Ys (x) s=0 = −DX(x)Y (x) + DY (x)X(x), ds which agrees with the formula (2.3.84) for [X, Y ].
82
2. Multivariable differential calculus
Corollary 2.3.9. If X and Y are smooth vector fields on U, then d t t F Y = FX# [X, Y ] dt X#
(2.3.95) for all t.
t+s t+s s t Proof. Since locally FX = FX FX , we have the same identity for FX# . Hence
d t d t s t FX# Y = FX# FX# Y s=0 = FX# LX Y, dt ds which yields (2.3.95). (2.3.96)
Here is one useful application of Corollary 2.3.9. Note that (2.3.81) implies −s t s FFt X# s Y = FX ◦ FY ◦ FX ,
(2.3.97)
while (2.3.95) yields the implication s [X, Y ] = 0 =⇒ FX# Y ≡ Y.
(2.3.98)
Putting together (2.3.97) and (2.3.98), we have the following. Proposition 2.3.10. Let X and Y be smooth vector fields on U . Let O ⊂ U be open, and assume that, for |s| < a and |t| < b, (2.3.99)
−s s s s FX , FYt , FYt ◦ FYs , FX ◦ FYt , and FX ◦ FYt ◦ FX ,
all map O → U . Then (2.3.100)
s s [X, Y ] = 0 =⇒ FYt ◦ FX (x) = FX ◦ FYt (x),
for x ∈ O, |s| < a, and |t| < b.
Exercises 1. Solve the initial value problem dy = y 2 , y(0) = a, dt given a ∈ R. On what t-interval is the solution defined? 2. Assume in (2.3.1) that F : R × Rn → Rn is C 1 and satisfies ∥F (t, y)∥ ≤ M for all (t, y) ∈ R × Rn . Use Proposition 2.3.2 to show that (2.3.1) has a unique solution for all t ∈ R. 3. Let M be a compact smooth surface in Rn . Suppose F : Rn → Rn is a smooth map (vector field), such that, for each x ∈ M, F (x) is tangent to M, i.e., the line γx (t) = x + tF (x) is tangent to M at x, at t = 0. Show that, if x ∈ M, then the initial value problem dy = F (y), y(0) = x dt
Exercises
83
has a solution for all t ∈ R, and y(t) ∈ M for all t. Hint. Locally, straighten out M to be a linear subspace of Rn , to which F is tangent. Use uniqueness. Material in §2.2 helps do this local straightening. 4. Show that the initial value problem dy dx = −x(x2 + y 2 ), = −y(x2 + y 2 ), x(0) = x0 , y(0) = y0 dt dt has a solution for all t ≥ 0, but not for all t < 0, unless (x0 , y0 ) = (0, 0). 5. Verify Duhamel’s formula (2.3.32), for the solution to (2.3.31). That is, show that the solution to dx = A(t)x + f (t), x(t0 ) = x0 dt is given by ∫ t S(t, s)f (s) ds. x(t) = S(t, t0 )x0 + t0
Hint. Set x(t) = S(t, t0 )y(t) and seek a simpler differential equation for y(t). Exercises on exponential functions 1. Let a ∈ R. Show that the unique solution to u′ (t) = au(t), u(0) = 1 is given by (2.3.101)
u(t) =
∞ ∑ aj j=0
j!
tj .
We denote this function by u(t) = eat , the exponential function. We also write exp(t) = et . Hint. Integrate the series term by term and use the Fundamental Theorem of Calculus. Alternative. Setting u0 (t) = 1, and using the Picard iteration method (2.3.4) to ∑k define the sequence uk (t), show that uk (t) = j=0 aj tj /j! 2. Show that, for all s, t ∈ R, (2.3.102)
ea(s+t) = eas eat .
Hint. Show that u1 (t) = ea(s+t) and u2 (t) = eas eat solve the same initial value problem. Alternative. Apply d/dt to ea(s+t) e−at . 3. Show that exp : R → (0, ∞) is a diffeomorphism. We denote the inverse by log : (0, ∞) −→ R. Show that v(x) = log x solves the ODE dv/dx = 1/x, v(1) = 0, and deduce that ∫ x 1 (2.3.103) dy = log x. 1 y
84
2. Multivariable differential calculus
4. Here we significantly expand the scope of Problems 1–2. Let a ∈ C. Show that the unique solution to f ′ (t) = af, f (0) = 1 is given by (2.3.104)
f (t) =
∞ ∑ aj j=0
j!
tj .
We denote this function by f (t) = eat . Show that, for all t ∈ R, a, b, ∈ C, e(a+b)t = eat ebt .
(2.3.105)
Hint. See the alternative hint for Problem 2. 5. Write (2.3.106)
eit =
∞ ∑ (−1)j j=0
(2j)!
t2j + i
∞ ∑ (−1)j 2j+1 t = u(t) + iv(t). (2j + 1)! j=0
Show that (2.3.107)
u′ (t) = −v(t),
v ′ (t) = u(t).
We denote these functions by u(t) = cos t, v(t) = sin t. The identity (2.3.108)
eit = cos t + i sin t
is called Euler’s formula. For a presentation of the standard geometric definition of sin t and cos t, and its equivalence with (2.3.108), see exercises in §3.2, building on an argument introduced in Exercise 3 below. Auxiliary exercises on trigonometric functions We use the definitions of the trigonometric functions sin t and cos t given in Exercise 5 of the last exercise set. 1. Use (2.3.105) to derive the identities (2.3.109)
sin(x + y) = sin x cos y + cos x sin y cos(x + y) = cos x cos y − sin x sin y.
2. Use (2.3.105)–(2.3.109) to show that (2.3.110)
sin2 t + cos2 t = 1,
cos2 t =
) 1( 1 + cos 2t . 2
Hint. Show that eit = e−it and hence |eit |2 = 1, for t ∈ R. 3. Show that (2.3.111)
γ(t) = (cos t, sin t)
is a map of R onto the unit circle S 1 ⊂ R2 with non-vanishing derivative, and, as t increases, γ(t) moves monotonically, counterclockwise. Use Exercise 5 above to
Exercises
85
show that γ ′ (t) = (− sin t, cos t),
(2.3.112)
and deduce that γ(t) has unit speed. We define π to be the smallest number t1 ∈ (0, ∞) such that γ(t1 ) = (−1, 0), so cos π = −1,
sin π = 0.
Show that 2π is the smallest number t2 ∈ (0, ∞) such that γ(t2 ) = (1, 0), so cos 2π = 1,
sin 2π = 0.
Show that cos(t + 2π) = cos t,
cos(t + π) = − cos t
sin(t + 2π) = sin t,
sin(t + π) = − sin t.
Show that γ(π/2) = (0, 1), and that ( 1 ) cos t + π = − sin t, 2
( 1 ) sin t + π = cos t. 2
4. Show that sin : (−π/2, π/2) → (−1, 1) is a diffeomorphism. We denote its inverse by ( 1 1 ) arcsin : (−1, 1) −→ − π, π . 2 2 Show that u(t) = arcsin t solves the ODE 1 du =√ , u(0) = 0. dt 1 − t2 ( ) Hint. Apply the chain rule to sin u(t) = t. Deduce that, for t ∈ (−1, 1), ∫ t dx √ (2.3.113) arcsin t = . 1 − x2 0 5. Show that e
πi/3
√ 1 3 = + i, 2 2
Hint. First compute
(1
√ e
πi/6
=
3 1 + i. 2 2
√
3 )3 i 2 2 and use Exercise 3. Then compute eπi/2 e−πi/3 . For intuition behind these formulas, look at Figure 2.3.2 +
6. Show that sin(π/6) = 1/2, and hence that ∫ 1/2 ∞ ∑ π dx an ( 1 )2n+1 √ = , = 6 2n + 1 2 1 − x2 0 n=0 where a0 = 1,
an+1 =
2n + 1 an . 2n + 2
86
2. Multivariable differential calculus
Figure 2.3.2. Regular hexagon, a = (1 +
Show that
√
3i)/2
4−k π ∑ an ( 1 )2n+1 − < . 6 n=0 2n + 1 2 3(2k + 3) k
Using a calculator, sum the series over 0 ≤ n ≤ 20, and verify that π ≈ 3.141592653589 · · · 7. For x ̸= (k + 1/2)π, k ∈ Z, set sin x . cos x Show that 1 + tan2 x = 1/ cos2 x. Show that w(x) = tan x satisfies the ODE dw = 1 + w2 , w(0) = 0. dx tan x =
8. Show that tan : (−π/2, π/2) → R is a diffeomorphism. Denote the inverse by ( 1 1 ) arctan : R −→ − π, π . 2 2 Show that ∫ y dx . (2.3.114) arctan y = 2 0 1+x
Exercises
87
9. Use the identity π 1 =√ 6 3 together with (2.3.114) to produce an infinite series for π that is different from that of Exercise 6. tan
Exercises on the matrix exponential
1. Let A be an n × n matrix. Parallel to Exercises 1 and 3 of the exercise set on exponential functions, show that (2.3.115)
etA =
∞ k ∑ t k=0
k!
Ak
solves (2.3.116)
d tA e = AetA = etA A, dt
etA t=0 = I.
2. Show that, for t ∈ R, A ∈ M (n, C), etA e−tA = I.
(2.3.117)
Hint. Compute (d/dt)etA e−tA . Show it is 0. 3. In the setting of Exercise 2, show that for s, t ∈ R, (2.3.118)
e(s+t)A = esA etA .
Hint. Compute (d/dt) e(s+t)A e−tA . Show it is 0. 4. Let A, B ∈ M (n, C), and assume (2.3.119)
AB = BA.
Show that (2.3.120)
et(A+B) = etA etB .
Hint. Compute d t(A+B) −tB −tA e e e . dt Show it is 0. To get this, you will want to show that, if A and B commute, then (2.3.121)
e−tB A = Ae−tB . 5. We desire to compute etJ when (2.3.122)
J=
(
0 1
) −1 . 0
88
2. Multivariable differential calculus
Noting that J 2 = −I, J 3 = −J, J 4 = I, show that the power series for etJ resembles (2.3.106) and deduce that ( ) cos t − sin t (2.3.123) etJ = (cos t)I + (sin t)J = . sin t cos t Note the resemblance of (2.3.123) and (2.3.108). 6. Note that X(t) = Exp(tA) = etA is given as the solution to a system of the form (2.3.59): dX = F (A, X), X(0) = I, dt where F (A, X) = AX. Deduce from the discussion of (2.3.59)–(2.3.60) that Exp : M (n, C) −→ M (n, C) is smooth,
(2.3.124)
of class C , for each k ∈ N. Note that such smoothness also follows from Corollary 2.1.12. k
7. Given A ∈ M (n, C) and continuous f : R → Cn , show that the solution to dx = Ax + f, x(0) = x0 ∈ Cn (2.3.125) dt is given by the following, known as Duhamel’s formula: ∫ t (2.3.126) x(t) = etA x0 + e(t−s)A f (s) ds. 0 −tA
Hint. Derive a simpler ODE for y(t) = e x(t). Extend the argument, and provide a proof of (2.3.32). 8. Given A ∈ M (n, C), show that det eA = eTr A .
Chapter 3
Multivariable integral calculus and calculus on surfaces
This central chapter develops integral calculus on domains in Rn , and then pursues notions of calculus to a higher level, for surfaces in Rn , and more generally for a class of objects known as “manifolds.” In §3.1 we take up the multidimensional Riemann integral. The basic definition is quite parallel to the one-dimensional case, but a number of fundamental results, while parallel in statement to the one-dimensional case, require more elaborate demonstrations in higher dimensions. This section is one of the most demanding in this text, and it is in a sense the heart of the course. One central result is the change of variable formula for multidimensional integrals. Another is the reduction of multiple integrals to iterated integrals. In §3.2 we define the notion of a smooth m-dimensional surface in Rn and study properties of these objects. We associate to such a surface a “metric tensor,” and make use of this to define the integral of functions on a surface. This includes the study of surface area. Examples include the computation of areas of higher dimensional spheres. We also explore integration on the group of rotations on Rn , leading to the notion of “averaging over rotations.” In addition, we introduce a class of objects more general than surfaces, called manifolds. Manifolds can also be endowed with metric tensors. These are called Riemannian manifolds, and one can again define the integral of functions. These two main sections are followed by several short sections on supplementary material, including a section on partitions of unity, useful to localize analysis on an n-dimensional surface to analysis on an open subset of Rn . A section on Sard’s theorem leads to one on a special family of functions known as Morse functions, whose existence allows one to construct lots of conveniently behaved vector fields on a surface. We end with a notion of tangent space to a manifold that conveniently abstracts the notion of the tangent space to a surface in Euclidean space.
89
90
3. Multivariable integral calculus and calculus on surfaces
3.1. The Riemann integral in n variables We define the Riemann integral of a bounded function f : R → R, where R ⊂ Rn is a cell, i.e., a product of intervals R = I1 × · · · × In , where Iν = [aν , bν ] are intervals in R. Recall that a partition of an interval I = [a, b] is a finite collection of subintervals {Jk : 0 ≤ k ≤ N }, disjoint except for their endpoints, whose union is I. We can take Jk = [xk , xk+1 ], where a = x0 < x1 < · · · < xN < xN +1 = b.
(3.1.1)
Now, if one has a partition of each Iν into Jν1 ∪ · · · ∪ Jν,N (ν) , then a partition P of R consists of the cells Rα = J1α1 × J2α2 × · · · × Jnαn ,
(3.1.2)
where 0 ≤ αν ≤ N (ν). For such a partition, define (3.1.3)
maxsize (P) = max diam Rα , α
where (diam Rα ) = ℓ(J1α1 ) + · · · + ℓ(Jnαn )2 . Here, ℓ(J) denotes the length of an interval J. Each cell has n-dimensional volume 2
2
V (Rα ) = ℓ(J1α1 ) · · · ℓ(Jnαn ).
(3.1.4)
Sometimes we use Vn (Rα ) for emphasis on the dimension. We also use A(R) for V2 (R), and, of course, ℓ(R) for V1 (R). We set I P (f ) =
∑ α
(3.1.5) I P (f ) =
∑ α
sup f (x) V (Rα ), Rα
inf f (x) V (Rα ). Rα
Note that I P (f ) ≤ I P (f ). These quantities should approximate the Riemann integral of f, if the partition P is sufficiently “fine.” To be more precise, if P and Q are two partitions of R, we say P refines Q, and write P ≻ Q, if each partition of each interval factor Iν of R involved in the definition of Q is further refined in order to produce the partitions of the factors Iν , used to define P, via (3.1.2). It is an exercise to show that any two partitions of R have a common refinement. Note also that (3.1.6)
P ≻ Q =⇒ I P (f ) ≤ I Q (f ), and I P (f ) ≥ I Q (f ).
Consequently, if Pj are any two partitions of R and Q is a common refinement, we have (3.1.7)
I P1 (f ) ≤ I Q (f ) ≤ I Q (f ) ≤ I P2 (f ).
Now, whenever f : R → R is bounded, the following quantities are well defined: (3.1.8)
I(f ) =
inf
P∈Π(R)
I P (f ),
I(f ) =
sup P∈Π(R)
I P (f ),
where Π(R) is the set of all partitions of R, as defined above. Clearly, by (3.1.7), I(f ) ≤ I(f ). We then say that f is Riemann integrable (on R) provided I(f ) = I(f ),
3.1. The Riemann integral in n variables
and in such a case, we set
91
∫
(3.1.9)
f (x) dV (x) = I(f ) = I(f ). R
We will denote the set of Riemann integrable functions on R by R(R). If dim R = 2, we will often use dA(x) instead of dV (x) in (3.1.9). For general n, we might also use simply dx. We derive some basic properties of the Riemann integral. First, the proofs of Lemma 1.1.3 and Theorem 1.1.4 readily extend, to give: Proposition 3.1.1. Let Pν be any sequence of partitions of R such that maxsize (Pν ) = δν → 0, and let ξνα be any choice of one point in each cell Rνα in the partition Pν . Then, whenever f ∈ R(R), ∫ ∑ (3.1.10) f (x) dV (x) = lim f (ξνα ) V (Rνα ). ν→∞
α
R
This is the multidimensional Darboux theorem. The sums that arise in (3.1.10) are Riemann sums. Also, we can extend the proof of Proposition 1.1.1, to obtain: Proposition 3.1.2. If fj ∈ R(R) and cj ∈ R, then c1 f1 + c2 f2 ∈ R(R), and ∫ ∫ ∫ (3.1.11) (c1 f1 + c2 f2 ) dV = c1 f1 dV + c2 f2 dV. R
R
R
Next, we establish an integrability result analogous to Proposition 1.1.2. Proposition 3.1.3. If f is continuous on R, then f ∈ R(R). Proof. As in the proof of Proposition 1.1.2, we have that, (3.1.12)
maxsize (P) ≤ δ =⇒ I P (f ) − I P (f ) ≤ ω(δ) · V (R),
where ω(δ) is a modulus of continuity for f on R. This proves the proposition. When the number of variables exceeds one, it becomes more important to identify some nice classes of discontinuous functions on R that are Riemann integrable. A useful tool for this is the following notion of size of a set S ⊂ R, called content. Extending (1.1.18)–(1.1.19), we define “upper content” cont+ and “lower content” cont− by (3.1.13)
cont+ (S) = I(χS ),
cont− (S) = I(χS ),
where χS is the characteristic function of S. We say S has content, or “is contented,” if these quantities are equal, which happens if and only if χS ∈ R(R), in which case the common value of cont+ (S) and cont− (S) is ∫ (3.1.14) V (S) = χS (x) dV (s). R
92
3. Multivariable integral calculus and calculus on surfaces
We mention that, if S = I1 ×· · ·×In is a cell, it is readily verified that the definitions in (3.1.5), (3.1.8), and (3.1.13) yield cont+ (S) = cont− (S) = ℓ(I1 ) · · · ℓ(In ), so the definition of V (S) given by (3.1.14) is consistent with that given in (3.1.4). It is easy to see that (3.1.15)
cont+ (S) = inf
N {∑
} V (Rk ) : S ⊂ R1 ∪ · · · ∪ RN ,
k=1
where Rk are cells contained in R. In the formal definition, the Rα in (3.1.15) should be part of a partition P of R, as defined above, but if {R1 , . . . , RN } are any cells in R, they can be chopped up into smaller cells, some perhaps thrown away, to yield a finite cover of S by cells in a partition of R, so one gets the same result. It is an exercise to see that, for any set S ⊂ R, cont+ (S) = cont+ (S),
(3.1.16) where S is the closure of S.
We note that, generally, for a bounded function f on R, I(f ) + I(1 − f ) = V (R).
(3.1.17)
This follows directly from (3.1.5). In particular, given S ⊂ R, cont− (S) + cont+ (R \ S) = V (R).
(3.1.18)
Using this together with (3.1.16), with S and R \ S switched, we have ◦
cont− (S) = cont− (S),
(3.1.19) ◦
◦
where S denotes the interior of S. The difference S \ S is called the boundary of S, and denoted bS. Note that (3.1.20)
cont− (S) = sup
N {∑
} V (Rk ) : R1 ∪ · · · ∪ RN ⊂ S ,
k=1
where here we take {R1 , . . . , RN } to be cells within a partition P of R, and let P vary over all partitions of R. Now, given a partition P of R, classify each cell in P ◦
as either being contained in R \ S, or intersecting bS, or contained in S. Letting P vary over all partitions of R, we see that (3.1.21)
◦
cont+ (S) = cont+ (bS) + cont− (S).
In particular, we have: Proposition 3.1.4. If S ⊂ R, then S is contented if and only if cont+ (bS) = 0. If a set Σ ⊂ R has the property that cont+ (Σ) = 0, we say that Σ has content zero, or is a nil set. Clearly Σ is nil if and only if Σ is nil. It follows easily from ∪K Proposition 3.1.2 that, if Σj are nil, 1 ≤ j ≤ K, then j=1 Σj is nil.
3.1. The Riemann integral in n variables
93
◦
◦
◦
If S1 , S2 ⊂ R and S = S1 ∪ S2 , then S = S 1 ∪ S 2 and S ⊃ S 1 ∪ S 2 . Hence bS ⊂ b(S1 ) ∪ b(S2 ). It follows then from Proposition 3.1.4 that, if S1 and S2 are contented, so is S1 ∪ S2 . Clearly, if Sj are contented, so are Sjc = R \ Sj . It follows ( )c that, if S1 and S2 are contented, so is S1 ∩ S2 = S1c ∪ S2c . A family F of subsets of R is called an algebra of subsets of R provided the following conditions hold: R ∈ F, Sj ∈ F ⇒ S1 ∪ S2 ∈ F , and S ∈ F ⇒ R \ S ∈ F. Algebras of sets are automatically closed under finite intersections also. We see that: Proposition 3.1.5. The family of contented subsets of R is an algebra of sets. The following result specifies a useful class of Riemann integrable functions. For a sharper result, see Proposition 3.1.31. Proposition 3.1.6. If f : R → R is bounded and the set S of points of discontinuity of f is a nil set, then f ∈ R(R). Proof. Suppose |f | ≤ M on R, and take ε > 0. Take a partition P of R, and write P = P ′ ∪ P ′′ , where cells in P ′ do not meet S, and cells in P ′′ do intersect S. Since cont+ (S) = 0, we can pick P so that the cells in P ′′ have total volume ≤ ε. Now f is continuous on each cell in P ′ . Further refining the partition if necessary, we can assume that f varies by ≤ ε on each cell in P ′ . Thus [ ] (3.1.22) I P (f ) − I P (f ) ≤ V (R) + 2M ε.
This proves the proposition. To give an example, suppose K ⊂ R is f : K → R be continuous. Define fe : R → R fe(x) = f (x) for (3.1.23) 0 for
a closed set such that bK is nil. Let by x ∈ K, x ∈ R \ K.
Then the set of points of discontinuity of fe is contained in bK. Hence fe ∈ R(R). We set ∫ ∫ (3.1.24) f dV = fe dV. K
R
In connection with this, we note the following fact, whose proof is an exercise. e are cells, with R ⊂ R. e Suppose that g ∈ R(R) and that g˜ is Suppose R and R e to be equal to g on R and to be 0 on R e \ R. Then defined on R, ∫ ∫ e (3.1.25) g˜ ∈ R(R), and g dV = g˜ dV. R
e R
This can be shown by an argument involving refining any given pair of partitions e respectively, to a pair of partitions PR and P e with the property that of R and R, R each cell in PR is a cell in PRe .
94
3. Multivariable integral calculus and calculus on surfaces
The following describes an important class of sets S ⊂ Rn that have content zero. Proposition 3.1.7. Let Σ ⊂ Rn−1 be a closed bounded set and let g : Σ → R be continuous. Then the graph of g, {( ) } G = x, g(x) : x ∈ Σ is a nil subset of Rn . Proof. Put Σ in a cell R0 ⊂ Rn−1 . Suppose |f | ≤ M on Σ. Take N ∈ Z+ and set ε = M/N. Pick a partition P0 of R0 , sufficiently fine that g varies by at most ε on each set Σ ∩ Rα , for any cell Rα ∈ P0 . Partition the interval I = [−M, M ] into 2N equal intervals Jν , of length ε. Then {Rα × Jν } = {Qαν } forms a partition of R0 × I. Now, over each cell Rα ∈ P0 , there lie at most 2 cells Qαν which intersect G, so cont+ (G) ≤ 2ε · V (R0 ). Letting N → ∞, we have the proposition. Similarly, for any j ∈ {1, . . . , n}, the graph of xj as a continuous function of the complementary variables is a nil set in Rn . So are finite unions of such graphs. Such sets arise as boundaries of many ordinary-looking regions in Rn . Here is a further class of nil sets. Proposition 3.1.8. Let O ⊂ Rn be open and let S ⊂ O be a compact nil subset. Assume f : O → Rn is a Lipschitz map. Then f (S) is a nil subset of Rn . Proof. The Lipschitz hypothesis on f is that there exists L < ∞ such that, for p, q ∈ O, |f (p) − f (q)| ≤ L|p − q|. If we cover S with k cells (in a partition), of total volume ≤√α, each cubical with edgesize δ, then f (S) is covered by k √sets of diameter ≤ L nδ, hence √ it can be covered by k cubical cells of edgesize L nδ, having total volume ≤ (L n)n α. From this we have the (not very sharp) general bound ( ) √ (3.1.26) cont+ f (S) ≤ (L n)n cont+ (S),
which proves the proposition.
In evaluating n-dimensional integrals, it is usually convenient to reduce them to iterated integrals. The following is a special case of a result known as Fubini’s Theorem. Theorem 3.1.9. Let Σ ⊂ Rn−1 be a closed, bounded contented set and let gj : Σ → R be continuous, with g0 (x) < g1 (x) on Σ. Take { } (3.1.27) Ω = (x, y) ∈ Rn : x ∈ Σ, g0 (x) ≤ y ≤ g1 (x) . Then Ω is a contented set in Rn . If f : Ω → R is continuous, then ∫ g1 (x) (3.1.28) φ(x) = f (x, y) dy g0 (x)
is continuous on Σ, and
∫
(3.1.29)
∫ f dVn =
Ω
φ dVn−1 , Σ
3.1. The Riemann integral in n variables
i.e.,
∫ (∫
∫ (3.1.30)
)
g1 (x)
f (x, y) dy
f dVn = Ω
95
Σ
dVn−1 (x).
g0 (x)
Proof. The continuity of (3.1.28) is an exercise in one-variable integration; see Exercises 2–3 of §1.1. Let ω(δ) be a modulus of continuity for g0 , g1 , and f, and also φ. We also can assume that ω(δ) ≥ δ. Put Σ in a cell R0 and let P0 be a partition of R0 . If A ≤ g0 ≤ g1 ≤ B, partition the interval [A, B], and from this and P0 construct a partition P of R = R0 ×[A, B]. We denote a cell in P0 by Rα and a cell in P by Rαℓ = Rα ×Jℓ . Pick points ξαℓ ∈ Rαℓ . ◦
Write P0 = P0′ ∪ P0′′ ∪ P0′′′ , consisting respectively of cells inside Σ, meeting bΣ, and inside R0 \ Σ. Similarly write P = P ′ ∪ P ′′ ∪ P ′′′ , consisting respectively of cells ◦
inside Ω, meeting bΩ, and inside R \ Ω, as illustrated in Figure 3.1.1. For fixed α, let z ′ (α) = {ℓ : Rαℓ ∈ P ′ }, and let z ′′ (α) and z ′′′ (α) be similarly defined. Note that z ′ (α) ̸= ∅ ⇐⇒ Rα ∈ P0′ , provided we assume maxsize (P) ≤ δ and 2δ < min[g1 (x) − g0 (x)], as we will from here on. It follows from (1.1.9)–(1.1.10) that ∫ B(α) ∑ f (ξαℓ )ℓ(Jℓ ) − (3.1.31) f (x, y) dy ≤ (B − A)ω(δ),
∀ x ∈ Rα ,
A(α)
ℓ∈z ′ (α)
∪ where ℓ∈z′ (α) Jℓ = [A(α), B(α)]. Note that A(α) and B(α) are within 2ω(δ) of g0 (x) and g1 (x), respectively, for all x ∈ Rα , if Rα ∈ P0′ . Hence, if |f | ≤ M, ∫ B(α) (3.1.32) f (x, y) dy − φ(x) ≤ 4M ω(δ), ∀ x ∈ Rα . A(α)
Thus, with C = B − A + 4M, ∑ f (ξαℓ )ℓ(Jℓ ) − φ(x) ≤ Cω(δ), (3.1.33)
∀ x ∈ Rα ∈ P0′ .
ℓ∈z ′ (α)
Multiplying by Vn−1 (Rα ) and summing over Rα ∈ P0′ , we have ∑ ∑ f (ξαℓ )Vn (Rαℓ ) − φ(xα )Vn−1 (Rα ) ≤ CV (R0 )ω(δ), (3.1.34) Rαℓ ∈P ′
Rα ∈P0′
where xα is an arbitrary point in Rα . Now, if P0 is a sufficiently fine partition of R0 , it follows from the ∫ proof of Proposition 3.1.6 that the second sum in (3.1.34) is arbitrarily close to Σ φ dVn−1 , since bΣ has content zero. Furthermore, an argument such as used to prove Proposition 3.1.7 shows that bΩ has content zero, and one verifies ∫that, for a sufficiently fine partition, the first sum in (3.1.34) is arbitrarily close to Ω f dVn . This proves the desired identity (3.1.29).
96
3. Multivariable integral calculus and calculus on surfaces
Figure 3.1.1. Fubini partition
We next take up the change of variables formula for multiple integrals, extending the one-variable formula, (1.1.44). We begin with a result on linear changes of variables. The set of invertible real n × n matrices is denoted Gl(n, R). In (3.1.35) ∫ ∫ and subsequent formulas, f dV denotes R f dV for some cell R on which f is supported. The integral is independent of the choice of such a cell; cf. (3.1.25). Proposition 3.1.10. Let f be a continuous function with compact support in Rn . If A ∈ Gl(n, R), then ∫ ∫ (3.1.35) f (x) dV = | det A| f (Ax) dV. Proof. Let G be the set of elements A ∈ Gl(n, R) for which (3.1.35) is true. Clearly I ∈ G. Using det A−1 = (det A)−1 , and det AB = (det A)(det B), we can conclude that G is a subgroup of Gl(n, R). In more detail, for A ∈ Gl(n, R), f as above, let ∫ IA (f ) = fA dV = I(fA ), fA (x) = f (Ax). Then
A ∈ G ⇐⇒ IA (f ) = | det A|−1 I(f ), for all such f . We see that IAB (f ) = I(fAB ) = IB (fA ),
3.1. The Riemann integral in n variables
97
so A, B ∈ G =⇒ IAB (f ) = | det B|−1 I(fA ) = | det B|−1 | det A|−1 I(f ) = | det AB|−1 I(f ) =⇒ AB ∈ G. Applying a similar argument to IAA−1 (f ) = I(f ), also yields the implication A ∈ G ⇒ A−1 ∈ G. To prove the proposition, it will therefore suffice to show that G contains all elements of the following 3 forms, since (as shown in the exercises on row reduction at the end of this section) the method of applying elementary row operations to reduce a matrix shows that any element of Gl(n, R) is a product of a finite number of these elements. Here, {ej : 1 ≤ j ≤ n} denotes the standard basis of Rn , and σ a permutation of {1, . . . , n}. A1 ej = eσ(j) , A2 ej = cj ej ,
(3.1.36)
A3 e2 = e2 + ce1 ,
cj ̸= 0
A3 ej = ej for j ̸= 2.
The proofs of (3.1.35) in the first two cases are elementary consequences of the definition of the Riemann integral, and can be left as exercises. We show that (3.1.35) holds for transformations of the form A3 by using Theorem 3.1.9 (in a special case), to reduce it to the case n = 1. Given f ∈ C(R), compactly supported, and b ∈ R, we clearly have ∫ ∫ (3.1.37) f (x) dx = f (x + b) dx. Now, for the case A = A3 , with x = (x1 , x′ ), we have ∫ ∫ (∫ ) ′ f (x1 + cx2 , x ) dVn (x) = f (x1 + cx2 , x′ ) dx1 dVn−1 (x′ ) ∫ (∫ (3.1.38) ) = f (x1 , x′ ) dx1 dVn−1 (x′ ), the second identity by (3.1.37). Thus we get (3.1.35) in case A = A3 , so the proposition is proved. It is desirable to extend Proposition 3.1.10 to some discontinuous functions. Given a cell R and f : R → R, bounded, we say (3.1.39)
f ∈ C(R) ⇐⇒ the set of discontinuities of f is nil.
Proposition 3.1.6 implies (3.1.40)
C(R) ⊂ R(R).
From the closure of the class of nil sets under finite unions it is clear that C(R) is closed under sums and products, i.e., that C(R) is an algebra of functions on R. We will denote by Cc (Rn ) the set of bounded functions f : Rn → R such that f has compact support and its set of discontinuities is nil. Any f ∈ Cc (Rn ) is supported
98
3. Multivariable integral calculus and calculus on surfaces
in some cell R, and f R ∈ C(R). Here is another useful class of functions. Given a cell R ⊂ Rn and f : R → R bounded, we say (3.1.41)
f ∈ PK(R) ⇐⇒ ∃ a partition P of R such that f is constant on the interior of each cell Rα ∈ P.
The following will be a useful tool for extending Proposition 3.1.10. It is also of interest in its own right, and it will have other uses. Proposition 3.1.11. Given a cell R ⊂ Rn and f : R → R bounded, {∫ } I(f ) = inf g dV : g ∈ PK(R), g ≥ f R {∫
(3.1.42)
= inf R {∫
= inf
g dV : g ∈ C(R), g ≥ f
}
} g dV : g ∈ C(R), g ≥ f .
R
Similarly, {∫ I(f ) = sup R {∫
(3.1.43)
= sup R
{∫ = sup
g dV : g ∈ PK(R), g ≤ f g dV : g ∈ C(R), g ≤ f
}
}
} g dV : g ∈ C(R), g ≤ f .
R
Proof. Denote the three quantities on the right side of (3.1.42) by I 1 (f ), I 2 (f ), and I 3 (f ), respectively. The definition of I 1 (f ) is sufficiently close to that of I(f ) in (3.1.8) that the identity I(f ) = I 1 (f ) is apparent. Now I 2 (f ) is an inf over a larger class of functions g than that defining I 1 (f ), so I 2 (f ) ≤ I 1 (f ). On the other hand, I(g) ≥ I(f ) for all g involved in defining I 2 (f ), so I 2 (f ) ≥ I(f ), hence I 2 (f ) = I(f ). Next, I 3 (f ) is an inf over a smaller class of functions g than that defining I 2 (f ), so I 3 (f ) ≥ I(f ). On the other hand, ∫ given ε > 0 and ψ ∈ PK(R), one can readily find g ∈ C(R) such that g ≥ ψ and R (g − ψ) dV < ε. This implies I 3 (f ) ≤ I(f ) + ε for all ε > 0 and finishes the proof of (3.1.42). The proof of (3.1.43) is similar. We can now extend Proposition 3.1.10. Say f ∈ Rc (Rn ) if f has compact support, say in some cell R, and f ∈ R(R). Also say f ∈ Cc (Rn ) if f is continuous on Rn , with compact support. Proposition 3.1.12. Given A ∈ Gl(n, R), the identity (3.1.35) holds for all f ∈ Rc (Rn ).
3.1. The Riemann integral in n variables
99
Proof. We have from Proposition 3.1.11 that,∫ for each ν ∈ N, there exist gν , hν ∈ Cc (Rn ) such that hν ≤ f ≤ gν and, with B = f dV , ∫ ∫ 1 1 B − ≤ hν dV ≤ B ≤ gν dV ≤ B + . ν ν Now Proposition 3.1.10 applies to gν and hν , so ∫ ∫ 1 1 (3.1.44) B − ≤ | det A| hν (Ax) dV ≤ B ≤ | det A| gν (Ax) dV ≤ B + . ν ν Furthermore, with fA (x) = f (Ax), we have hν (Ax) ≤ fA (x) ≤ gν (Ax), so (4.44) gives 1 1 (3.1.45) B − ≤ | det A| I(fA ) ≤ | det A| I(fA ) ≤ B + , ν ν for all ν, and leting ν → ∞ we obtain (3.1.35). Corollary 3.1.13. If Σ ⊂ Rn is a compact, contented set and A ∈ Gl(n, R), then A(Σ) = {Ax : x ∈ Σ} is contented, and ( ) (3.1.46) V A(Σ) = | det A| V (Σ). We now extend Proposition 3.1.10 to nonlinear changes of variables. Proposition 3.1.14. Let O and Ω be open in Rn , G : O → Ω a C 1 diffeomorphism, and f a continuous function with compact support in Ω. Then ∫ ∫ ( ) (3.1.47) f (y) dV (y) = f G(x) | det DG(x)| dV (x). Ω
O
Proof. It suffices to prove the result under the additional assumption that f ≥ 0, which we make from here on. Also, using a partition of unity (see §3.3), we can write f as a finite sum of continuous functions with small supports, so it suffices to e ⊂ Ω and f ◦ G is supported in a cell treat the case where f is supported in a cell R R ⊂ O. See Figure 3.1.2. Let P = {Rα } be a partition of R. Note that for each Rα ∈ P, bG(Rα ) = G(bRα ), so G(Rα ) is contented, in view of Propositions 3.1.4 and 3.1.8. eα = Rα − ξα , a cell with center at the Let ξα be the center of Rα , and let R origin. Then ( ) eα = ηα + Hα (3.1.48) G(ξα ) + DG(ξα ) R is an n-dimensional parallelepiped, each point of which is very close to a point in eα , G(Rα ), if Rα is small enough. To be precise, for y ∈ R G(ξα + y) = ηα + DG(ξα )y + Φ(ξα , y)y, ∫ 1 [ ] Φ(ξα , y) = DG(ξα + ty) − DG(ξα ) dt. 0
See Figure 3.1.3. Consequently, given ε > 0, if δ > 0 is small enough and maxsize(P) ≤ δ, then we have (3.1.49)
ηα + (1 + ε)Hα ⊃ G(Rα ),
100
3. Multivariable integral calculus and calculus on surfaces
Figure 3.1.2. Image of a cell
for all Rα ∈ P. Now, by (3.1.46), V (Hα ) = |det DG(ξα )| V (Rα ).
(3.1.50) Hence
( ) V G(Rα ) ≤ (1 + ε)n |det DG(ξα )| V (Rα ).
(3.1.51) Now we have
∫ f dV =
∑ α
(3.1.52)
≤
∑ α
∫ f dV G(Rα )
( ) sup f ◦ G(x) V G(Rα ) Rα
≤ (1 + ε)n
∑ α
sup f ◦ G(x) |det DG(ξα )| V (Rα ). Rα
To see that the first line of (3.1.52) holds, ∑ note that f χG(Rα ) is Riemann integrable, by Proposition 3.1.6; note also that α f χG(Rα ) = f except on a set of content zero. Then the additivity result in Proposition 3.1.2 applies. The first inequality in (3.1.52) is elementary; the second inequality uses (3.1.51) and f ≥ 0. If we set (3.1.53)
h(x) = f ◦ G(x) |det DG(x)|,
3.1. The Riemann integral in n variables
101
Figure 3.1.3. Cell image closeup
then we have (3.1.54)
sup f ◦ G(x) |det DG(ξα )| ≤ sup h(x) + M ω(δ), Rα
Rα
provided |f | ≤ M and ω(δ) is a modulus of continuity for DG. Taking arbitrarily fine partitions, we get ∫ ∫ (3.1.55) f dV ≤ h dV. Ω
O
If we apply this result, with G replaced by G−1 , O and Ω switched, and f replaced by h, given by (3.1.53), we have ∫ ∫ ∫ (3.1.56) h dV ≤ h ◦ G−1 (y) | det DG−1 (y)| dV (y) = f dV. O
Ω
Ω
The inequalities (3.1.55) and (3.1.56) together yield the identity (3.1.47).
We now extend Proposition 3.1.14 to more general Riemann integrable functions. Recall that f ∈ Rc (Rn ) if f has compact support, say in some cell R, and f ∈ R(R). If Ω ⊂ Rn is open and f ∈ Rc (Rn ) has support in Ω, we say f ∈ Rc (Ω). We also say f ∈ Cc (Ω) if f ∈ Cc (Rn ) has support in Ω, and we say f ∈ Cc (Ω) if f is continuous with compact support in Ω.
102
3. Multivariable integral calculus and calculus on surfaces
Theorem 3.1.15. Let O and Ω be open in Rn , G : O → Ω a C 1 diffeomorphism. If f ∈ Rc (Ω), then f ◦ G ∈ Rc (O), and (3.1.47) holds. Proof. The proof is similar to that of Proposition 3.1.12. Given ν ∈ N, we have from Proposition 3.1.11 that there exist gν , hν ∈ Cc (Ω) such that hν ≤ f ≤ gν and, ∫ with B = Ω f dV , ∫ ∫ 1 1 B − ≤ hν dV ≤ B ≤ gν dV ≤ B + . ν ν Then Proposition 3.1.14 applies to hν and gν , so ∫ 1 B − ≤ hν (G(x))| det DG(x)| dV (x) ν O ∫ 1 ≤ B ≤ gν (G(x))| det DG(x)| dV (x) ≤ B + . ν O
Now, with fG (x) = f (G(x)), we have hν (G(x)) ≤ fG (x) ≤ gν (G(x)), so 1 1 (3.1.57) B − ≤ I(fG | det DG|) ≤ I(fG | det DG|) ≤ B + , ν ν for all ν, and letting ν → ∞, we obtain (3.1.47).
We have seen how Proposition 3.1.11 has been useful. The following result, to some degree a variant of Proposition 3.1.11, is also useful. Lemma 3.1.16. Let F : R → R be bounded, B ∈ R. Suppose that, for each ν ∈ Z+ , there exist Ψν , Φν ∈ R(R) such that Ψν ≤ F ≤ Φ ν
(3.1.58) and (3.1.59)
∫ B − δν ≤
∫ Ψν (x) dV (x) ≤
R
Φν (x) dV (x) ≤ B + δν ,
δν → 0.
R
Then F ∈ R(R) and
∫
(3.1.60)
F (x) dV (x) = B. R
Furthermore, if there exist Ψν , Φν ∈ R(R) such that (3.1.58) holds and ∫ (3.1.61) (Φν (x) − Ψν (x)) dV ≤ δν → 0, R
then there exists B such that (3.1.59) holds. Hence F ∈ R(R) and (3.1.60) holds. The most frequently invoked case of the change of variable formula, in the case n = 2, involves the following change from Cartesian to polar coordinates: (3.1.62)
x = r cos θ,
y = r sin θ.
See (2.3.111). Thus, take G(r, θ) = (r cos θ, r sin θ). We have ( ) cos θ −r sin θ (3.1.63) DG(r, θ) = , det DG(r, θ) = r. sin θ r cos θ
3.1. The Riemann integral in n variables
103
For example, if ρ ∈ (0, ∞) and Dρ = {(x, y) ∈ R2 : x2 + y 2 ≤ ρ2 },
(3.1.64)
then, for f ∈ C(Dρ ), ∫ ∫ (3.1.65) f (x, y) dA =
ρ
2π
f (r cos θ, r sin θ)r dθ dr.
0
Dρ
∫ 0
To get this, we first apply Proposition 3.1.14, with O = [ε, ρ] × [0, 2π − ε], then apply Theorem 3.1.9, then let ε ↘ 0. We next use Lemma 3.1.16 to establish the following useful result on products of Riemann integrable functions. Proposition 3.1.17. Given f1 , f2 ∈ R(R), we have f1 f2 ∈ R(R). Proof. It suffices to prove this when fj ≥ 0. Take partitions Pν and functions ψjν , φjν ≥ 0, constant in the interior of each cell in Pν , such that 0 ≤ ψjν ≤ fj ≤ φjν ≤ B, ∫
and
∫
∫ φjν dV −→
ψjν dV,
fj dV.
We apply Lemma 3.1.16 with F = f1 f2 , Note that
Ψν = ψ1ν ψ2ν ,
Φν = φ1ν φ2ν .
Φν − Ψν = φ1ν (φ2ν − ψ2ν ) + ψ2ν (φ1ν − ψ1ν ) ≤ B(φ2ν − ψ2ν ) + B(φ1ν − ψ1ν ).
Hence (3.1.61) holds, giving F ∈ R(R).
As a consequence of Proposition 3.1.17, we can make the following construction. Assume R is a cell and S ⊂ R is a contented set. If f ∈ R(R), we have χS f ∈ R(R), by Proposition 3.1.17. We define ∫ ∫ (3.1.66) f (x) dV (x) = χS (x)f (x) dV (x). S
R
Note how his extends the scope of (3.1.24). Integrals over Rn It is often useful to integrate a function whose support is not bounded. Generally, given a bounded function f : Rn → R, we say f ∈ R(Rn ) provided f R ∈ R(R) for each cell R ⊂ Rn , and ∫ |f | dV ≤ C, R
104
3. Multivariable integral calculus and calculus on surfaces
for some C < ∞, independent of R. If f ∈ R(Rn ), we set ∫ ∫ f dV = lim f dV, Rs = {x ∈ Rn : |xj | ≤ s, ∀ j}. (3.1.67) s→∞
Rn
Rs
The existence of the limit in (3.1.67) can be established as follows. If M < N , then ∫ ∫ ∫ f dV − f dV = f dV, RN
RM
∫
RN \RM
which is dominated in absolute value by RN \RM |f | dV . If f ∈ R(Rn ), then aN = ∫ |f | dV is a bounded monotone sequence, which hence converges, so RN ∫ ∫ ∫ |f | dV = |f | dV − |f | dV −→ 0, as M, N → ∞. RN \RM
RN
RM
The following simple but useful result is an exercise. Proposition 3.1.18. If Kν is any sequence of compact contented subsets of Rn such that each Rs , for s < ∞, is contained in all Kν for ν sufficiently large, i.e., ν ≥ N (s), then, whenever f ∈ R(Rn ), ∫ ∫ f dV = lim f dV. (3.1.68) ν→∞
Rn
Kν
Change of variables formulas and Fubini’s Theorem extend to this case. For example, the limiting case of (3.1.65) as ρ → ∞ is ∫ ∫ ∞ ∫ 2π 2 2 (3.1.69) f ∈ C(R )∩R(R ) =⇒ f (x, y) dA = f (r cos θ, r sin θ)r dθ dr. 0
R2
0
To see this, use Proposition 3.1.17 with Kν = Dν , defined as in (3.1.64), to write ∫ ∫ (3.1.70) f (x, y) dA = lim f (x, y) dA, ν→∞
R2
Dν
and apply (3.1.65) to write the integral on the right as ∫ ν ∫ 2π f (r cos θ, r sin θ)r dθ dr. 0
0
You get the right side of (3.1.69) in the limit ν → ∞. The following is a good example. Take f (x, y) = e−x −y . We have ∫ ∫ ∞ ∫ 2π ∫ ∞ 2 2 2 2 (3.1.71) e−x −y dA = e−r r dθ dr = 2π e−r r dr. 2
R2
0
0
Now, methods of §1.1 allow the substitution s = r2 , so ∫ ∫ ∞ 2 1 ∞ −s 1 e ds = . e−r r dr = 2 0 2 0
2
0
3.1. The Riemann integral in n variables
Hence
∫
e−x
(3.1.72)
2
−y 2
105
dA = π.
R2
On the other hand, Theorem 3.1.9 extends to give ∫ ∫ ∞∫ ∞ 2 2 2 2 e−x −y dA = e−x −y dy dx (3.1.73)
−∞
R2
(∫
−∞
∞
e−x dx 2
=
) (∫
−∞
∞
) 2 e−y dy .
−∞
Note that the two factors in the last product are equal. We deduce that ∫ ∞ √ 2 (3.1.74) e−x dx = π. −∞
We can generalize (3.1.73), to obtain (via (3.1.74)) (∫ ∞ )n ∫ −|x|2 −x2 (3.1.75) e dVn = e dx = π n/2 . −∞
Rn
The integrals (3.1.71)–(3.1.75) are called Gaussian integrals, and their evaluation has many uses. We shall see some in §3.2. We record the following additivity result for the integral over Rn , whose proof is also an exercise. Proposition 3.1.19. If f, g ∈ R(Rn ), then f + g ∈ R(Rn ), and ∫ ∫ ∫ (3.1.76) (f + g) dV = f dV + g dV. Rn
Rn
Rn
Unbounded integrable functions
There are lots of unbounded functions we would like to be able to integrate. For example, consider f (x) = x−1/2 on (0, 1] (defined any way you like at x = 0). Since, for ε ∈ (0, 1), ∫ 1 √ (3.1.77) x−1/2 dx = 2 − 2 ε, ε
this has a limit as ε ↘ 0, and it is natural to set ∫ 1 (3.1.78) x−1/2 dx = 2. 0
Sometimes (3.1.78) is called an “improper integral,” but we do not consider that to be a proper designation. We aim for a treatment of the integral for a natural class of unbounded functions. To this end, we define a class R# (I) of not necessarily bounded “integrable” functions on I. The set I will stand for either Rn or a cell in Rn .
106
3. Multivariable integral calculus and calculus on surfaces
To start, assume f ≥ 0 on I, and for A ∈ (0, ∞), set fA (x) = f (x)
(3.1.79)
if f (x) ≤ A,
A, if f (x) > A.
(We hereby abandon the use of fA as in the proof of Proposition 3.1.10.) We say f ∈ R# (I) provided fA ∈ R(I),
∀ A < ∞, and ∫ ∃ uniform bound fA dV ≤ M.
(3.1.80)
I
∫
If f ≥ 0 satisfies (3.1.80), then I∫ fA dV increases monotonically to a finite limit as A ↗ +∞, and we call the limit I f dV : ∫ ∫ (3.1.81) fA dV ↗ f dV, for f ∈ R# (I), f ≥ 0. I
I
If I is understood, we might just write
∫
f dV .
Remark. If f ∈ R(I) is ≥ 0, then fA ∈ R(I) for all A < ∞. See the easy part of Exercise 15. It is valuable to have the following. Proposition 3.1.20. If f, g : I → R+ are in R# (I), then f + g ∈ R# (I), and ∫ ∫ ∫ (3.1.82) (f + g) dV = f dV + g dV. I
I
I
Proof. To start, note that (f + g)A ≤ fA + gA . In fact, (3.1.83)
(f + g)A = (fA + gA )A . ∫ ∫ ∫ ∫ Hence (f + g)A ∈ R(I) and (f + g)A dV ≤ fA dV + gA dV ≤ f dV + g dV , so we have f + g ∈ R# (I) and ∫ ∫ ∫ (3.1.84) (f + g) dV ≤ f dV + g dV. ∫
On the other hand, if B > 2A, then (f + g)B ≥ fA + gA , so ∫ ∫ ∫ (3.1.85) (f + g) dV ≥ fA dV + gA dV, for all A < ∞, and hence ∫ ∫ ∫ (3.1.86) (f + g) dV ≥ f dV + g dV. Together, (3.1.84) and (3.1.86) yield (3.1.82).
3.1. The Riemann integral in n variables
107
Next, we take f : I → R and set (3.1.87)
f = f + − f −,
f + (x) = f (x)
if f (x) ≥ 0,
0
if f (x) < 0.
Then we say f ∈ R# (I) ⇐⇒ f + , f − ∈ R# (I),
(3.1.88) and set
∫
(3.1.89)
∫ f dV −
f dV = I
∫ +
I
f − dV,
I
where the two terms on the right are defined as in (3.1.81). To extend the additivity, we begin as follows Lemma 3.1.21. Assume that g ∈ R# (I) and that gj ≥ 0, gj ∈ R# (I), and g = g0 − g1 .
(3.1.90) Then
∫
(3.1.91)
∫ g dV =
∫ g0 dV −
g1 dV.
Proof. Take g = g + − g − as in (3.1.87). Then (3.1.90) implies g + + g1 = g0 + g − ,
(3.1.92)
which by Proposition 3.1.19 yields ∫ ∫ ∫ ∫ (3.1.93) g + dV + g1 dV = g0 dV + g − dV. This implies
∫
∫ g + dV −
(3.1.94)
g − dV =
∫
∫ g0 dV −
g1 dV,
which yields (3.1.91). We now extend additivity. Proposition 3.1.22. Assume f1 , f2 ∈ R# (I). Then f1 + f2 ∈ R# (I) and ∫ ∫ ∫ (3.1.95) (f1 + f2 ) dV = f1 dV + f2 dV. I
Proof. If g = f1 + f2 = (3.1.96)
I
(f1+
−
g = g0 − g1 ,
f1− )
+ (f2+ − f2− ), g0 = f1+ + f2+ ,
I
then g1 = f1− + f2− .
We have gj ∈ R# (I), and then ∫ ∫ ∫ (f1 + f2 ) dV = g0 dV − g1 dV ∫ ∫ + + (3.1.97) = (f1 + f2 ) dV − (f1− + f2− ) dV ∫ ∫ ∫ ∫ = f1+ dV + f2+ dV − f1− dV − f2− dV,
108
3. Multivariable integral calculus and calculus on surfaces
the first equality by Lemma 3.1.21, the second tautologically, and the third by Proposition 3.1.20. Since ∫ ∫ ∫ + (3.1.98) fj dV = fj dV − fj− dV,
this gives (3.1.95).
If f : I → C, we set f = f1 + if2 , fj : I → R, and say f ∈ R# (I) if and only if f1 and f2 belong to R# (I). Then we set ∫ ∫ ∫ (3.1.99) f dV = f1 dV + i f2 dV. Similar comments apply to f : I → Rn . We next establish a useful result on products. Proposition 3.1.23. Assume f ∈ R# (Rn ), g ∈ R(Rn ), and f, g ≥ 0. Then f g ∈ R# (Rn ) and ∫ ∫ (3.1.100) fA g dV ↗ f g dV as A ↗ +∞. Proof. Given the additivity properties just established, it would be equivalent to prove this with g replaced by g + 1, so we will assume from here that g ≥ 1. Then (3.1.101)
(f g)A = (fA g)A .
By Proposition 3.1.17, fA g|R ∈ R(R) for each cell R. Hence (e.g., by the easy part of Exercise 15), (fA g)A |R ∈ R(R) for each cell R. Thus (3.1.102) (f g)A R ∈ R(R). Now there exists K < ∞ such that 1 ≤ g ≤ K, so (3.1.103)
fA g ≤ KfA ,
hence (f g)A ≤ KfA .
The hypothesis f ∈ R (R ) implies there exists M < ∞ such that ∫ (3.1.104) fA dV ≤ M, #
n
R
for all A < ∞ and each cell R. Hence, by (3.1.103), ∫ (3.1.105) sup (f g)A dV ≤ M K, A R
independent of R. This implies f g ∈ R# (Rn ). By definition, ∫ ∫ (3.1.106) (f g)A dV ↗ f g dV, as A ↗ +∞. Meanwhile, clearly fA g ↗ as A ↗, so the estimate (3.1.103) implies ∫ (3.1.107) fA g dV ↗ L, as A ↗ +∞,
3.1. The Riemann integral in n variables
109
for some L ∈ R+ . It remains to identify the limits in (3.1.106) and (3.1.107). Now (3.1.101) implies ∫ (3.1.108) (f g)A ≤ fA g, hence f g dV ≤ L. Finally, since fA g ≤ f g and fA g ≤ KA, we have fA g ≤ (f g)B for B ≥ KA.
(3.1.109) This implies
∫ L ≤ sup
(3.1.110)
∫ (f g)B dV =
f g dV,
B
and hence we have (3.1.100).
We now extend the change of variable formula in Theorem 3.1.15 to unbounded functions. It is convenient to introduce the following notation. Given an open set # n Ω ⊂ Rn , we say f ∈ R# c (Ω) provided f ∈ R (R ) and f is supported on a compact subset of Ω. Proposition 3.1.24. Let O and Ω be open in Rn , G : O → Ω a C 1 diffeomorphism. # If f ∈ R# c (Ω), then f ◦ G ∈ Rc (O) and ∫ ∫ (3.1.111) f (y) dV (y) = f (G(x))| det DG(x)| dV (x). O
Ω
Proof. It suffices to establish this in case f ≥ 0, which we assume from here. Then ∫ ∫ (3.1.112) fA dV ↗ f dV. Ω
Ω
We set φ = f ◦ G and note that fA ◦ G = φA . Hence, by Theorem 3.1.15, for each A ∈ (0, ∞), ∫ ∫ (3.1.113) fA (y) dV (y) = φA (x)| det DG(x)| dV (x). Ω
O
If f is supported on a compact set K ⊂ Ω, then φA is supported on G−1 (K) ⊂ O, also compact, hence on which | det DG| has a positive lower bound. Hence an ∫ upper bound on the right side of (3.1.113) implies an upper bound on φA dV , independent of A, so φ ∈ R# (Rn ). Then Proposition 3.1.23 implies φ| det DG| ∈ R# (Rn ) and ∫ ∫ (3.1.114) φA (x)| det DG(x)| dV (x) ↗ φ(x)| det DG(x)| dV (x). Together (3.1.112)–(3.1.114) yield (3.1.111).
One also has versions of Proposition 3.1.24 where f need not have compact support. See Exercise 13 below for an example. Our next result on a class of elements of R# (I) ties in closely with the example in (3.1.77). As before, I is either Rn or a cell in Rn .
110
3. Multivariable integral calculus and calculus on surfaces
Proposition 3.1.25. Let f : I → [0, ∞) and assume fA ∈ R(I) for each A < ∞. Assume there are nested contented subsets of I: (3.1.115)
U1 ⊃ U2 ⊃ U3 ⊃ · · · ,
V (Uν ) → 0.
Assume f (1 − χUν ) ∈ R(I) for each ν and that there exists C < ∞ such that ∫ (3.1.116) f dV = Jν ≤ C, ∀ ν. I\Uν
Then f ∈ R# (I) and
∫ Jν ↗
(3.1.117)
f dV. I
Proof. The hypothesis (3.1.116) implies Jν ↗ J for some J ∈ [0, ∞). Also, since 0 ≤ fA ≤ f , we have ∫ (3.1.118) fA dV ≤ J, ∀ v, A. I\Uν
Furthermore,
∫ fA dV ≤ AV (Uν ),
(3.1.119) Uν
so
∫ fA dV ≤ J + AV (Uν ),
(3.1.120)
∀ ν, A,
I
hence
∫ fA dV ≤ J,
(3.1.121)
∀A.
I
It follows that f ∈ R# (I) and
∫ f dV ≤ J.
(3.1.122) I
On the other hand,
∫
∫ f dV ≥
(3.1.123) I
for each ν, so we have (3.1.117).
f dV = Jν , I\Uν
Monotone convergence theorem We aim to establish a circle of results known as monotone convergence theorems. Here is the first result.
3.1. The Riemann integral in n variables
111
Proposition 3.1.26. Let R ⊂ Rn be a cell. Assume fk ∈ R(R). Then ∫ (3.1.124) fk (x) ↘ 0 ∀ x ∈ R =⇒ fk dV ↘ 0. R
Proof. It suffices to assume V (R) = 1. Say 0 ≤ f1 ≤ K on R, so also 0 ≤ fk ≤ K. We have ∫ (3.1.125) fk dV ↘ α, R
for some α ≥ 0, and we want to show that α = 0. Suppose α > 0. Pick a partition Pk of R such that I Pk (fk ) ≥ α/2. Thus fk ≥ φk ≥ 0 for some φk ∈ PK(R), constant on the interior of each cell in Pk , with integral ≥ α/2. The contribution ∫ to R φk dV from the cells on which φk ≤ α/4 is ≤ α/4, so the contribution from the cells on which φk ≥ α/4 must be ≥ α/4. Since φk ≤ K on R, it follows that the latter class of cells must have total volume ≥ α/4K. Consequently, for each k, there exists Sk ⊂ R, a finite union of cells in Pk , such that α α (3.1.126) V (Sk ) ≥ , and fk ≥ on Sk . 4K 4 Then fℓ ≥ α/4 on Sk for all ℓ ≤ k. Hence, with ∪ (3.1.127) Oℓ = Sk , k≥ℓ
we have (3.1.128)
cont− (Oℓ ) ≥
α , 4K
fℓ ≥
α on Oℓ . 4
The hypothesis fℓ ↘ 0 on R implies (3.1.129)
Oℓ ↘ ∅ as ℓ ↗ ∞.
Without loss of generality, we can take Sk open in (3.1.126), hence each Oℓ is open. The conclusion of Proposition 3.1.26 is hence a consequence of the following, which implies that (3.1.128) and (3.1.129) are contradictory. Proposition 3.1.27. If Oℓ ⊂ R are open sets, for ℓ ∈ N, then (3.1.130)
Oℓ ↘ ∅ =⇒ cont− (Oℓ ) ↘ 0.
Proof. Assume Oℓ ↘ ∅. If the conclusion of (3.1.130) fails, then (3.1.131)
cont− (Oℓ ) ↘ b
for some b > 0. Passing to a subsequence if necessary, we can assume (3.1.132)
cont− (Oℓ ) ≤ b + δℓ ,
δℓ < 2−ℓ · 10−9 · b.
Then we can pick Kℓ ⊂ Oℓ , a compact union of finitely many cells in a partition of R, such that (3.1.133)
V (Kℓ ) ≥ b − δℓ .
We claim that ∩ℓ Kℓ ̸= ∅, which will provide a contradiction.
112
3. Multivariable integral calculus and calculus on surfaces
Place K1 ∪ K2 in a finite union C1 of cells, contained in O1 . We then have V (K1 ∩ K2 ) ≥ V (K1 ) − V (C1 \ K2 )
(3.1.134)
≥ b − (2δ1 + δ2 ),
since V (C1 \ K2 ) = V (C1 ) − V (K2 ) ≤ cont− (O1 ) − V (K2 ) ≤ δ1 + δ2 . Next, place (K1 ∩ K2 ) ∪ K3 in a finite union C2 of cells, contained in O2 . Then V (K1 ∩ K2 ∩ K3 ) ≥ V (K1 ∩ K2 ) − V (C2 \ K3 )
(3.1.135)
≥ b − (2δ1 + δ2 ) − (2δ2 + δ3 ),
since V (C2 \ K3 ) = V (C2 ) − V (K3 ) ≤ cont− (O2 ) − V (K3 ) ≤ δ2 + δ3 . Proceeding in this fashion, we get (3.1.136)
V
k (∩
k ) ∑ Kℓ ≥ b − (2δℓ + δℓ+1 ) > 0,
ℓ=1
∀ k.
ℓ=1
e k = ∩k Kℓ is a decreasing sequence of nonempty compact sets. Hence Thus, K ℓ=1 ∩ ∩ (3.1.137) Oℓ ⊃ Kℓ ̸= ∅, ℓ≥1
ℓ≥1
contradicting the hypothesis of (3.1.130).
Having Proposition 3.1.26, we proceed to the following significant improvement. Proposition 3.1.28. Assume fk ∈ R# (R). Then ∫ (3.1.138) fk (x) ↘ 0 ∀ x ∈ R =⇒ fk dV ↘ 0. R
Proof. Again we have (3.1.125) for some α ≥ 0 and again we want to show that α = 0. For each A ∈ (0, ∞) and each k ∈ N, form (fk )A , as in (3.1.79). Thus (fk )A ∈ R(R), and the hypothesis of (3.1.138) implies (fk )A ↘ 0 as k ↗ ∞. Thus, by Proposition 3.1.26, ∫ (3.1.139) (fk )A dV ↘ 0 as k → ∞, for each A < ∞. R
We note that (3.1.140)
fk+1 (x) − (fk+1 )A (x) ≤ fk (x) − (fk )A (x)
for each x ∈ R, k ∈ N. In fact, if fk (x) ≤ A (so fk+1 (x) ≤ A), both sides of (3.1.140) are 0, if fk+1 (x) ≥ A (so fk (x) ≥ A), we get fk+1 (x) − A ≤ fk (x) − A, and if fk+1 (x) < A < fk (x), we get 0 ≤ fk (x) − A. It follows that, for each A < ∞, ∫ (3.1.141) [fk − (fk )A ] dV ↘ α, as k → ∞. R
However, for each δ > 0, there exists A = A(δ) < ∞ such that This forces α = 0, and proves Proposition 3.1.28.
∫ R
[f1 −(f1 )A ] dV ≤ δ.
Applying Proposition 3.1.28 to fk = g − gk , we have the following.
3.1. The Riemann integral in n variables
113
Corollary 3.1.29. Assume g, gk ∈ R# (R). Then ∫ ∫ (3.1.142) gk (x) ↗ g(x) ∀ x ∈ R =⇒ gk dV ↗ g dV. R
R
Finally, we remove the support constraint. Proposition 3.1.30. Assume g, gk ∈ R# (Rn ). Then ∫ ∫ (3.1.143) gk (x) ↗ g(x) ∀ x ∈ Rn =⇒ gk dV ↗ g dV. Rn
Proof. Clearly
∫
∫ gk dV ↗ c,
(3.1.144)
Rn
and c ≤
Rn
g dV. Rn
Now, given ε > 0, there is a cell R ⊂ Rn such that ∫ (3.1.145) (|g| + |g1 |) dV < ε, Rn \R
and Corollary 3.1.29 gives
∫ gk dV ↗
(3.1.146) We deduce that c ≥
∫
∫ Rn
R
g dV. R
g dV − ε for all ε > 0, so (4.137) holds.
In the Lebesgue theory of integration, there is a stronger result. Namely, if n gk are integrable if there is a uniform ∫ on R and gk (x) ↗ g(x) for each x, and upper bound Rn gk dx ≤ B < ∞, then g is integrable on Rn and the conclusion of (3.1.143) holds. Such a result can be found in [47]. Upper content and outer measure Given a bounded set S ⊂ Rn , its upper content is defined in (3.1.13) and an equivalent characterization given in (3.1.15). A related quantity is the outer measure of S, defined by {∑ } ∪ (3.1.147) m∗ (S) = inf V (Rk ) : Rk ⊂ Rn cells, S ⊂ Rk . k≥1
k≥1
The difference between (3.1.15) and (3.1.147) is that in (3.1.15) we require the cover of S by cells to be finite and in (3.1.147) we allow any countable cover of S by cells. Clearly (3.1.147) is an inf over a larger collection of objects than (3.1.15), so (3.1.148)
m∗ (S) ≤ cont+ (S).
We get the same result in (3.1.147) if we require ∪ ◦ Rk (3.1.149) S⊂ k≥1
114
3. Multivariable integral calculus and calculus on surfaces
(just expand each Rk by a factor of (1 + 2−k ε)). Since any open cover of a compact set has a finite subcover (see Appendix A.1), it follows that S compact =⇒ m∗ (S) = cont+ (S).
(3.1.150)
On the other hand, it is readily verified from (3.1.147) that S countable =⇒ m∗ (S) = 0.
(3.1.151)
For example, if R = {x ∈ Rn : 0 ≤ xj ≤ 1, ∀ j}, then m∗ (R ∩ Qn ) = 0,
(3.1.152)
but cont+ (R ∩ Qn ) = 1,
the latter result by (3.1.16). We now establish the following integrability criterion, which sharpens Proposition 3.1.6. Proposition 3.1.31. Let f : R → R be bounded, and let S ⊂ R be the set of points of discontinuity of f . Then m∗ (S) = 0 =⇒ f ∈ R(R).
(3.1.153)
Proof. Assume |f | ≤ M and pick ε > 0. Take a countable collection {Rk } of ∑ cells that are open (in R) such that S ⊂ ∪k≥1 Rk and k≥1 V (Rk ) < ε. Now f is continuous at each p ∈ R \ S, so there exists a cell Rp# , open (in R), containing p, such that supRp# f − inf Rp# f < ε. Then {Rk : k ∈ N} ∪ {Rp# : p ∈ R \ S} is an open cover of R. Since R is compact, there is a finite subcover, which we denote # {R1 , . . . , RN , R1# , . . . , RM }. We have (3.1.154)
N ∑
V (Rk ) < ε,
and sup f − inf f < ε, ∀ j ∈ {1, . . . , M }. Rj#
Rj#
k=1
Recall that R = I1 × · · · × In is a product of n closed, bounded intervals. Also each cell Rk and Rj# is a product of intervals. For each ν ∈ {1, . . . , n}, take the collection of all endpoints in the νth factor of each of these cells, and use these to form a partition of Iν . Taking products yields a partition P of R. We can write P = {Lk : 1 ≤ k ≤ µ} (∪ ) (∪ ) = Lk ∪ Lk ,
(3.1.155)
k∈A
k∈B
where we say k ∈ A provided Lk is contained in a cell of the form Rj# for some j ∈ {1, . . . , M }, as in (3.1.154). Consequently, if k ∈ B, then Lk ⊂ Rℓ for some ℓ ∈ {1, . . . , N }, so ∪
(3.1.156)
k∈B
We therefore have (3.1.157)
∑ k∈B
V (Lk ) < ε,
Lk ⊂
N ∪
Rℓ .
ℓ=1
and sup f − inf f < ε, ∀ j ∈ A. Lj
Lj
Exercises
115
It follows that 0 ≤ I P (f ) − I P (f ) < (3.1.158)
∑
2M V (Lk ) +
∑
εV (Lj )
j∈A
k∈B
< 2εM + εV (R). Since ε can be taken arbitrarily small, this establishes that f ∈ R(R).
Remark. The condition (3.1.153) is sharp. That is, given f : R → R bounded, f ∈ R(R) ⇔ m∗ (S) = 0. Proofs of this can be found in standard measure theory texts, such as [47].
Exercises 1. Show that any two partitions of a cell R have a common refinement. Hint. Consider the argument given for the one-dimensional case in §1.1. 2. Write down a careful proof of the identity (3.1.16), i.e., cont+ (S) = cont+ (S). 3. Write down the details of the argument giving (3.1.25), on the independence of the integral from the choice of cell. 4. Write down a direct proof that the transformation formula (3.1.35) holds for those linear transformations of the form A1 and A2 in (3.1.36). Compare Exercise 1 of §1.1. 5. Consider spherical polar coordinates on R3 , given by x = ρ sin φ cos θ, y = ρ sin φ sin θ, z = ρ cos φ, ( ) i.e., take G(ρ, φ, θ) = ρ sin φ cos θ, ρ sin φ sin θ, ρ cos θ . Show that det DG(ρ, φ, θ) = ρ2 sin φ. Use this to compute the volume of the unit ball in R3 . 6. If B is the unit ball in R3 , show that Theorem 3.1.9 implies ∫ √ V (B) = 2 1 − |x|2 dA(x), D
where D = {x ∈ R2 : |x| ≤ 1} is the unit disk. Use polar coordinates, as in (3.1.62)– (3.1.65), to compute this integral. Compare the result with that of Exercise 5. 7. Apply Corollary 3.1.13 and the answer to Exercises 5 and 6 to compute the
116
3. Multivariable integral calculus and calculus on surfaces
volume of the ellipsoidal region in R3 defined by x2 y2 z2 + + ≤ 1, a2 b2 c2 given a, b, c ∈ (0, ∞). 8. Prove Lemma 3.1.16. 9. If R is a cell and S ⊂ R is a contented set, and f ∈ R(R), we have, via Proposition 3.1.17, ∫ ∫ f (x) dV (x) = χS (x)f (x) dV (x). S
R
Show that, if Sj ⊂ R are contented and they are disjoint (or more generally cont+ (S1 ∩ S2 ) = 0), then, for f ∈ R(R), ∫ ∫ ∫ f (x) dV (x) = f (x) dV (x) + f (x) dV (x). S1 ∪S2
S1
S2
10. Establish the convergence result (3.1.68), for all f ∈ R(Rn ). 11. Take B = {x ∈ Rn : |x| ≤ 1/2}, and let f : B → R+ . Assume f is continuous on B \ 0. Show that ∫ f ∈ R# (B) ⇔ f dV is bounded as ε ↘ 0. |x|>ε
12. With B ⊂ Rn as in Exercise 11, define qb : B → R by 1 qb (x) = b , n |x| log |x| for x ̸= 0. Say qb (0) = 0. Show that qb ∈ R# (B) ⇔ b > 1. 13. Show that f (x) = |x|−a e−|x| ∈ R# (Rn ) ⇐⇒ a < n. 2
14. Theorem 3.1.9, relating multiple integrals and iterated integrals, played the following role in the proof of the change of variable formula (3.1.47). Namely, it was used to establish the identity (3.1.50) for the volume of the parallelepiped Hα , via an appeal to Corollary 3.1.13, hence to Proposition 3.1.10, whose proof relied on Theorem 3.1.9. Try to establish Corollary 3.1.13 directly, without using Theorem 3.1.9, in the case when Σ is either a cell or the image of a cell under an element of Gl(n, R). In preparation for the next three exercises, review the proof of Proposition 1.1.12.
Exercises
117
15. Assume f ∈ R(R), |f | ≤ M , and let φ : [−M, M ] → R be Lipschitz and monotone. Show directly from the definition that φ ◦ f ∈ R(R). 16. If φ : [−M, M ] → R is continuous and piecewise linear, show that you can write φ = φ1 − φ2 with φj Lipschitz and monotone. Deduce that f ∈ R(R) ⇒ φ ◦ f ∈ R(R) when φ is piecewise linear. 17. Assume uν ∈ R(R) and that uν → u uniformly on R. Show that u ∈ R(R). Deduce that if f ∈ R(R), |f | ≤ M , and ψ : [−M, M ] → R is continuous, then ψ ◦ f ∈ R(R). 18. Let R ⊂ Rn be a cell and let f, g : R → R be bounded. Show that I(f + g) ≤ I(f ) + I(g),
I(f + g) ≥ I(f ) + I(g).
Hint. Look at the proof of Proposition 1.1.1. 19. Let R ⊂ Rn be a cell and let f : R → R be bounded. Assume that for each ε > 0, there exist bounded fε , gε such that fε ∈ R(R),
f = fε + gε , Show that f ∈ R(R) and
∫
I(|gε |) ≤ ε.
∫ fε dV −→
R
f dV. R
Hint. Use Exercise 18. 20. Use the result of Exercise 19 to produce another proof of Proposition 3.1.6. 21. Behind (3.1.45) is the assertion that if R is a cell, g is supported on K ⊂ R, and |g| ≤ M , then I(|g|) ≤ M cont+ (K). Prove this. More generally, if g, h : R → R are bounded and |g| ≤ M , show that I(|gh|) ≤ M I(|h|). 22. Establish the following Fubini-type theorem, and compare it with Theorem 3.1.9. Proposition 3.1.32. Let A ⊂ Rm and B ⊂ Rn be cells, and take f ∈ R(A × B). For x ∈ A, define fx : B → R by fx (y) = f (x, y). Define Lf , Uf : A → R by Lf (x) = I(fx ),
Uf (x) = I(fx ).
Then Lf and Uf belong to R(A), and ∫ ∫ ∫ f dV = Lf (x) dx = Uf (x) dx. A×B
A
A
Hint. Given ε > 0, use Proposition 3.1.11 to take φ, ψ ∈ PK(A × B) such that ∫ ∫ φ ≤ f ≤ ψ, ψ dV − φ dV < ε.
118
3. Multivariable integral calculus and calculus on surfaces
With definitions of φx and ψx analogous to that of fx , show that ∫ ∫ φ dV = φx dx ≤ I(Lf ) A×B
A
∫
∫
≤ I(Uf ) ≤
ψx dx = A
ψ dV. A×B
Deduce that I(Lf ) = I(Uf ), and proceed. Exercises on row reduction and matrix products We consider the following three types of row operations on an n × n matrix A = (ajk ). If σ is a permutation of {1, . . . , n}, let ρσ (A) = (aσ(j)k ). If c = (c1 , . . . , cj ), and all cj are nonzero, set µc (A) = (c−1 j ajk ). Finally, if c ∈ R and µ ̸= ν, define εµνc (A) = (bjk ),
bνk = aνk − caµk , bjk = ajk for j ̸= ν.
We relate these operations to left multiplication by matrices Pσ , Mc , and Eµνc , defined by the following actions on the standard basis {e1 , . . . , en } of Rn : Pσ ej = eσ(j) ,
M c e j = cj e j ,
and Eµνc eµ = eµ + ceν ,
Eµνc ej = ej for j ̸= µ.
1. Show that A = Pσ ρσ (A),
A = Mc µc (A),
A = Eµνc εµνc (A).
2. Show that Pσ−1 = Pσ−1 . 3. Show that, if µ ̸= ν, then Eµνc = Pσ−1 E21c Pσ , for some permutation σ. 4. If B = ρσ (A) and C = µc (B), show that A = Pσ Mc C. Generalize this to other cases where a matrix C is obtained from a matrix A via a sequence of row operations. 5. If A is an invertible, real n × n matrix (i.e., A ∈ Gl(n, R)), then the rows of A form a basis of Rn . Use this to show that A can be transformed to the identity matrix via a sequence of row operations. Deduce that any A ∈ Gl(n, R) can be
3.2. Surfaces and surface integrals
119
written as a finite product of matrices of the form Pσ , Mc and Eµνc , hence as a finite product of matrices of the form listed in (3.1.36).
3.2. Surfaces and surface integrals A smooth m-dimensional surface M ⊂ Rn is characterized by the following property. Given p ∈ M, there is a neighborhood U of p in M and a smooth map φ : O → U, from an open set O ⊂ Rm bijectively to U, with injective derivative at each point, and continuous inverse φ−1 : U → O. Such a map φ is called a coordinate chart on M. We call U ⊂ M a coordinate patch. If all such maps φ are smooth of class C k , we say M is a surface of class C k . In §4.3 we will define analogous notions of a C k surface with boundary, and of a C k surface with corners. There is an abstraction of the notion of a surface, namely the notion of a manifold, which we will discuss at the end of this section. Examples include projective spaces and other spaces obtained as quotients of surfaces. If φ : O → U is a C k coordinate chart, such as described above, or more generally φ : O → Rn is a C k map with injective derivative, and φ(x0 ) = p, we set (3.2.1)
Tp M = Range Dφ(x0 ),
a linear subspace of Rn of dimension m, and we denote by Np M its orthogonal complement. It is useful to consider the following map. Pick a linear isomorphism A : Rn−m → Np M , and define (3.2.2)
Φ : O × Rn−m −→ Rn ,
Φ(x, z) = φ(x) + Az.
Thus Φ is a C k map defined on an open subset of Rn . Note that ( ) v (3.2.3) DΦ(x0 , 0) = Dφ(x0 )v + Aw, w so DΦ(x0 , 0) : Rn → Rn is surjective, hence bijective, so the Inverse Function Theorem applies; Φ maps some neighborhood of (x0 , 0) diffeomorphically onto a neighborhood of p ∈ Rn . Suppose there is another C k coordinate chart, ψ : Ω → U . Since φ and ψ are by hypothesis one-to-one and onto, it follows that F = ψ −1 ◦ φ : O → Ω is a well defined map, which is one-to-one and onto. See Figure 3.2.1. Also F and F −1 are continuous. In fact, we can say more. Lemma 3.2.1. Under the hypotheses above, F is a C k diffeomorphism. Proof. It suffices to show that F and F −1 are C k on a neighborhood of x0 and y0 , respectively, where φ(x0 ) = ψ(y0 ) = p. Let us define a map Ψ in a fashion ep M be similar to (3.2.2). To be precise, we set Tep M = Range Dψ(y0 ), and let N its orthogonal complement. (Shortly we will show that Tep M = Tp M , but we are ep M and not quite ready for that.) Then pick a linear isomorphism B : Rn−m → N consider Ψ : Ω × Rn−m −→ Rn , Ψ(y, z) = ψ(y) + Bz.
120
3. Multivariable integral calculus and calculus on surfaces
Figure 3.2.1. Coordinate charts
Again, Ψ is a C k diffeomorphism from a neighborhood of (y0 , 0) onto a neighbore of (x0 , 0) in O × Rn−m , Ω e hood of p. To be precise, there exist neighborhoods O n−m n e of (y0 , 0) in Ω × R , and U of p in R such that e −→ U e, Φ:O
e −→ U e and Ψ : Ω
are C k diffeomorphisms. e→Ω e is a C k diffeomorphism. Now note that, for It follows that Ψ−1 ◦ Φ : O e and (y, 0) ∈ Ω, e (x, 0) ∈ O ( ) ( ) (3.2.4) Ψ−1 ◦ Φ(x, 0) = F (x), 0 , Φ−1 ◦ Ψ(y, 0) = F −1 (y), 0 . In fact, to verify the first identity in (3.2.4), we check that Ψ(F (x), 0) = ψ(F (x)) + B0 = ψ(ψ −1 ◦ φ(x)) = φ(x) = Φ(x, 0). The identities in (3.2.4) imply that F and F −1 have the desired regularity.
3.2. Surfaces and surface integrals
121
Thus, when there are two such coordinate charts, φ : O → U, ψ : Ω → U , we have a C k diffeomorphism F : O → Ω such that φ = ψ ◦ F.
(3.2.5) By the chain rule, (3.2.6)
Dφ(x) = Dψ(y) DF (x),
y = F (x).
In particular this implies that Range Dφ(x0 ) = Range Dψ(y0 ), so Tp M in (3.2.1) is independent of the choice of coordinate chart. It is called the tangent space to M at p. Remark. An application of the inverse function theorem related to the proof of Lemma 3.2.1 can be used to show that if O ⊂ Rm is open, m < n, and φ : O → Rn is a C k map such that Dφ(p) : Rm → Rn is injective, (p ∈ O), then there is a e of p in O such that the image of O e under φ is a C k surface in Rn . neighborhood O Compare Exercise 11 in §2.2. We next define an object called the metric tensor on M .( Given )a coordinate chart φ : O → U , there is associated an m × m matrix G(x) = gjk (x) of functions on O, defined in terms of the inner product of vectors tangent to M : n ∑ ∂φ ∂φ ∂φℓ ∂φℓ (3.2.7) gjk (x) = Dφ(x)ej · Dφ(x)ek = · = , ∂xj ∂xk ∂xj ∂xk ℓ=1
where {ej : 1 ≤ j ≤ m} is the standard orthonormal basis of Rm . Equivalently, G(x) = Dφ(x)t Dφ(x).
(3.2.8)
We call (gjk ) the metric tensor of M, on U , with respect to the coordinate chart φ : O → U . Note that this matrix is positive-definite. From a coordinate-independent point of view, the metric tensor on M specifies inner products of vectors tangent to M , using the inner product of Rn . If we take another coordinate chart ψ : Ω → U, we want to compare (gjk ) with H = (hjk ), given by (3.2.9)
hjk (y) = Dψ(y)ej · Dψ(y)ek ,
i.e., H(y) = Dψ(y)t Dψ(y).
As seen above we have a diffeomorphism F : O → Ω such that (3.2.5)–(3.2.6) hold. Consequently, (3.2.10)
G(x) = DF (x)t H(y) DF (x),
for y = F (x),
or equivalently, (3.2.11)
gjk (x) =
∑ ∂Fi ∂Fℓ hiℓ (y). ∂xj ∂xk i,ℓ
We now define the notion of surface integral on M . If f : M → R is a continuous function supported on U, we set ∫ ∫ √ (3.2.12) f dS = f ◦ φ(x) g(x) dx, M
O
122
3. Multivariable integral calculus and calculus on surfaces
where (3.2.13)
g(x) = det G(x).
We need to know that this is independent of the choice of coordinate chart φ : O → U. we use ψ : Ω → U instead, we want to show that (3.2.12) is equal to ∫ Thus, if √ f ◦ ψ(y) h(y) dy, where h(y) = det H(y). Indeed, since f ◦ ψ ◦ F = f ◦ φ, we Ω can apply the change of variable formula, Theorem 3.1.15, to get ∫ ∫ √ √ (3.2.14) f ◦ ψ(y) h(y) dy = f ◦ φ(x) h(F (x)) |det DF (x)| dx. O
Ω
Now, (3.2.10) implies that (3.2.15)
√ √ g(x) = |det DF (x)| h(y),
so the right side of (3.2.14) is seen to be equal to (3.2.12), and our surface integral is well defined, at least for f supported in a coordinate patch. More generally, if f : M → R has compact support, write it as a finite sum of terms, each supported on a coordinate patch, and use (3.2.12) on each patch. Using (3.1.11), one readily verifies that ∫ ∫ ∫ (3.2.16) (f1 + f2 ) dV = f1 dV + f2 dV, M
M
M
if fj : M → R are continuous functions with compact support. Let us pause to consider the special cases m = 1 and m = 2. For m = 1, we are considering a curve in Rn , say φ : [a, b] → Rn . Then G(x) is a 1 × 1 matrix, namely G(x) = |φ′ (x)|2 . If we denote the curve in Rn by γ, rather than M, the formula (3.2.12) becomes the arc length integral ∫ ∫ b (3.2.17) f ds = f ◦ φ(x) |φ′ (x)| dx. a
γ
In case m = 2, let us consider a surface M ⊂ R3 , with a coordinate chart φ : O → U ⊂ M. For f supported in U, an alternative way to write the surface integral is ∫ ∫ (3.2.18) f dS = f ◦ φ(x) |∂1 φ × ∂2 φ| dx1 dx2 , M
O
where u × v is the cross product of vectors u and v in R3 . To see this, we compare this integrand with the one in (3.2.12). In this case, ( ) ∂1 φ · ∂1 φ ∂1 φ · ∂2 φ (3.2.19) = |∂1 φ|2 |∂2 φ|2 − (∂1 φ · ∂2 φ)2 . g = det ∂2 φ · ∂1 φ ∂2 φ · ∂2 φ Recall from (1.4.45) that |u × v| = |u| |v| | sin θ|, where θ is the angle between u and v. Equivalently, since u · v = |u| |v| cos θ, ( ) (3.2.20) |u × v|2 = |u|2 |v|2 1 − cos2 θ = |u|2 |v|2 − (u · v)2 . √ Thus we see that |∂1 φ × ∂2 φ| = g, in this case, and (3.2.18) is equivalent to (3.2.12). An important class of surfaces is the class of graphs of smooth functions. Let u ∈ C 1 (Ω), for an open Ω ⊂ Rn−1 , and let M be the graph of z = u(x). The map
3.2. Surfaces and surface integrals
123
( ) φ(x) = x, u(x) provides a natural coordinate system, in which the metric tensor formula (3.2.7) becomes (3.2.21)
gjk (x) = δjk +
∂u ∂u . ∂xj ∂xk
If u is C 1 , we see that gjk is continuous. To calculate g = det(gjk ), at a given point p ∈ Ω, if ∇u(p) ̸= 0, rotate coordinates so that ∇u(p) is parallel to the x1 axis. We obtain ( )1/2 √ g = 1 + |∇u|2 . (3.2.22) (See Exercise 31 for another take on this formula.) In particular, the (n − 1)dimensional volume of the surface M is given by ∫ ∫ ( )1/2 (3.2.23) Vn−1 (M ) = dS = 1 + |∇u(x)|2 dx. M
Ω
Particularly important examples of surfaces are the unit spheres S n−1 in Rn , S n−1 = {x ∈ Rn : |x| = 1}.
(3.2.24)
Spherical polar coordinates on Rn are defined in terms of a smooth diffeomorphism (3.2.25)
R : (0, ∞) × S n−1 −→ Rn \ 0,
R(r, ω) = rω.
Let (hℓm ) denote the metric tensor on S n−1 (induced from its inclusion in Rn ) with respect to some coordinate chart φ : O → U ⊂ S n−1 . Then we have a coordinate chart Φ : (0, ∞) × O → U ⊂ Rn given by Φ(r, y) = rφ(y). Take y0 = r, y = (y1 , . . . , yn−1 ). In the coordinate system Φ the Euclidean metric tensor (ejk ) is given by e00 = ∂0 Φ · ∂0 Φ = φ(y) · φ(y) = 1, e0j = ∂0 Φ · ∂j Φ = φ(y) · ∂j φ(y) = 0, 1 ≤ j ≤ n − 1, ejk = r2 ∂j φ · ∂k φ = r2 hjk , 1 ≤ j, k ≤ n − 1. The fact that φ(y)·∂j φ(y) = 0 follows by applying ∂/∂yj to the identity φ(y)·φ(y) ≡ 0. To summarize, ( ) ( ) 1 (3.2.26) ejk = . r2 hℓm Now (3.2.26) yields √
(3.2.27)
√ e = rn−1 h.
We therefore have the following result for integrating a function in spherical polar coordinates. ∫ ∫ [∫ ∞ ] (3.2.28) f (x) dx = f (rω)rn−1 dr dS(ω). Rn
S n−1
0
124
3. Multivariable integral calculus and calculus on surfaces
We next compute the (n − 1)-dimensional area An−1 of the unit sphere S n−1 ⊂ R , using (3.2.28) together with the computation ∫ 2 (3.2.29) e−|x| dx = π n/2 , n
Rn
from (3.1.75). First note that, whenever f (x) = φ(|x|), (3.2.28) yields ∫ ∫ ∞ (3.2.30) φ(|x|) dx = An−1 φ(r)rn−1 dr. 0
Rn
In particular, taking φ(r) = e−r and using (3.2.29), we have ∫ ∞ ∫ ∞ 2 e−s sn/2−1 ds, (3.2.31) π n/2 = An−1 e−r rn−1 dr = 21 An−1 2
0
0
2
where we used the substitution s = r to get the last identity. We hence have (3.2.32)
An−1 =
2π n/2 , Γ( n2 )
where Γ(z) is Euler’s Gamma function, defined for z > 0 by ∫ ∞ (3.2.33) Γ(z) = e−s sz−1 ds. 0
We need to complement (3.2.32) with some results on Γ(z) allowing a computation of Γ(n/2) in terms of more familiar quantities. Of course, setting z = 1 in (3.2.33), we immediately get (3.2.34)
Γ(1) = 1.
Also, setting n = 1 in (3.2.31), we have ∫ ∞ ∫ 2 π 1/2 = 2 e−r dr = 0
∞
e−s s−1/2 ds,
0
or
(1) Γ = π 1/2 . 2
(3.2.35)
We can proceed inductively from (3.2.34)-(3.2.35) to a formula for Γ(n/2) for any n ∈ Z+ , using the following. Lemma 3.2.2. For all z > 0, (3.2.36)
Γ(z + 1) = zΓ(z).
Proof. We can write
∫
Γ(z + 1) = − 0
∞(
d −s ) z s ds = e ds
∫ 0
∞
e−s
d ( z) s ds, ds
the last identity by integration by parts. The last expression here is seen to equal the right side of (3.2.36).
3.2. Surfaces and surface integrals
Consequently, for k ∈ Z+ , (3.2.37)
Γ(k) = (k − 1)!,
125
( (1) 1) ( 1) = k− ··· π 1/2 . Γ k+ 2 2 2
Thus (3.2.32) can be rewritten (3.2.38)
A2k−1 =
2π k , (k − 1)!
A2k = (
2π k ) ( ). k − 12 · · · 12
We discuss another important example of a smooth surface, in the space M (n, R) 2 ≈ Rn of real n × n matrices, namely SO(n), the set of matrices T ∈ M (n, R) satisfying T t T = I and det T > 0 (hence det T = 1). The exponential map Exp: M (n, R) → M (n, R) defined by Exp(A) = eA has the property Exp : Skew(n) −→ SO(n),
(3.2.39)
where Skew(n) is the set of skew-symmetric matrices in M (n, R). As seen in (2.2.28)– (2.2.30), (3.2.40)
D Exp(0)Y = Y,
∀ Y ∈ M (n, R),
and hence the Inverse Function Theorem implies that there is a ball Ω centered at e of I 0 in M (n, R) that is mapped diffeomorphically by Exp onto a neighborhood Ω in M (n, R). From the identities Exp X t = (Exp X)t ,
Exp(−X) = (Exp X)−1 ,
e we see that, given X ∈ Ω, A = Exp X ∈ Ω, A ∈ SO(n) ⇐⇒ X ∈ Skew(n). Thus there is a neighborhood O of 0 in Skew(n) that is mapped by Exp diffeomorphically onto a smooth surface U ⊂ M (n, R), of dimension m = n(n − 1)/2. Furthermore, U is a neighborhood of I in SO(n). For general T ∈ SO(n), we can define maps (3.2.41)
φT : O −→ SO(n),
φT (A) = T Exp(A),
and obtain coordinate charts in SO(n), which is consequently a smooth surface of dimension n(n − 1)/2 in M (n, R). Note that SO(n) is a closed bounded subset of M (n, R); hence it is compact. We use the inner product on M (n, R) computed componentwise; equivalently, ⟨A, B⟩ = Tr (B t A) = Tr (BAt ).
(3.2.42)
This produces a metric tensor on SO(n). The surface integral over SO(n) has the following important invariance property. ( ) Proposition 3.2.3. Given f ∈ C SO(n) , if we set (3.2.43)
ρT f (X) = f (XT ),
for T, X ∈ SO(n), we have ∫ ∫ (3.2.44) ρT f dS = SO(n)
SO(n)
λT f (X) = f (T X), ∫ λT f dS =
f dS. SO(n)
126
3. Multivariable integral calculus and calculus on surfaces
Proof. Given T ∈ SO(n), the maps RT , LT : M (n, R) → M (n, R) defined by RT (X) = XT, LT (X) = T X are easily seen from (3.2.42) to be isometries. Thus they yield maps of SO(n) to itself which preserve the metric tensor, proving (3.2.44). ( ) ∫ Since SO(n) is compact, its total volume V SO(n) = SO(n) 1 dS is finite. We define the integral with respect to “Haar measure” ∫ ∫ 1 ) f (g) dg = ( (3.2.45) f dS. V SO(n) SO(n)
SO(n)
This is used in many arguments involving “averaging over rotations.” Examples of such averaging arise in §5.1, §5.3, and §7.4. See also §6.4 for generalizations to other matrix groups. Extended notion of coordinates Basic calculus as developed in this text so far has involved maps from one Euclidean space to another, of the type F : Rn → Rm . It is convenient and useful to extend our setting to F : V → W , where V and W are general finite-dimensional real vector spaces. There is the following notion of the derivative. Let V and W be as above, and let Ω ⊂ V be open. We say F : Ω → W is differentiable at x ∈ Ω provided there exists a linear map L : V → W such that, for y ∈ V small, (3.2.46)
F (x + y) = F (x) + Ly + r(x, y),
with r(x, y) → 0 faster than y → 0, i.e., (3.2.47)
∥r(x, y)∥ −→ 0 as y → 0. ∥y∥
For this to be meaningful, we need norms on V and W . Often these norms come from inner products. See Appendix A.2 for a discussion of inner product spaces. If (3.2.46)–(3.2.47) hold, we set DF (x) = L, and call the linear map DF (x) : V −→ W the derivative of F at x. We say F is C 1 if DF (x) is continuous in x. Notions of F in C k are produced in analogy with the situation in §2.1. Of course, we can reduce all this to the setting of §2.1 by picking bases of V and W . Often such V and W arise as linear subspaces of Rn , such as Tp M in (3.2.1), or V = Np M , mentioned right below that. As noted there, we can take a linear isomorphism of such V with Rk for some k, and keep working in the context of maps between such standard Euclidean spaces, as in (3.2.2). However, it can be convenient to avoid this distraction, and, for example, replace (3.2.2) by (3.2.48)
Φ : O × Np M −→ Rn ,
and (3.2.3) by (3.2.49)
DΦ(x0 , 0)
Φ(x, z) = φ(x) + z,
( ) v = Dφ(x0 )v + w. w
3.2. Surfaces and surface integrals
127
In order to carry out Lemma 3.2.1 in this setting, we want the following version of the Inverse Function Theorem. Proposition 3.2.4. Let V and W be real vector spaces, each of dimension n. Let F be a C k map from an open neighborhood Ω of p0 ∈ V to W , with q0 = F (p0 ), k ≥ 1. Assume the derivative DF (p0 ) : V → W is an isomorphism. e of q0 such that Then there exist a neighborhood U of p0 and a neighborhood U −1 e k e F : U → U is one-to-one and onto, and F : U → U is a C map. While Proposition 3.2.4 is apparently an extension of Theorem 2.2.1, there is no extra work required to prove it. One can simply take linear isomorphisms A : Rn → V and B : Rn → W and apply Theorem 2.2.1 to the map G(x) = B −1 F (Ax). Thus Proposition 3.2.4 is not a technical improvement of Theorem 2.2.1, but it is a useful conceptual extension. With this in mind, we can define the notion of an m-dimensional surface M ⊂ V (an n-dimensional vector space) as follows. Take a vector space W , of dimension m. Given p ∈ M , we require there to be a neighborhood U of p in M and a smooth map φ : O → U , from an open set O ⊂ W bijectively to U , with an injective derivative at each point. We call such a map a coordinate chart. If all such maps are smooth of class C k , we say M is a surface of class C k . As a further wrinkle, we could take different vector spaces Wp for different p ∈ M , as long as they all have dimension m. The reader is invited to formulate the appropriate modification of Lemma 3.2.1 in this setting. Submersions Let V and W be finite dimensional real vector spaces, Ω ⊂ V open, and F : Ω → W a C k map, k ≥ 1. We say F is a submersion provided that, for each x ∈ Ω, DF (x) : V → W is surjective. (This requires dim V ≥ dim W .) We establish the following Submersion Mapping Theorem, which the reader might recognize as a variant of the Implicit Function Theorem. In the statement, ker T denotes the null space ker T = {v ∈ V : T v = 0}, if T : V → W is a linear transformation. Proposition 3.2.5. With V, W , and Ω ⊂ V as above, assume F : Ω → W is a C k map, k ≥ 1. Fix p ∈ W , and consider (3.2.50)
S = {x ∈ V : F (x) = p}.
Assume that, for each x ∈ S, DF (x) : V → W is surjective. Then S is a C k surface in Ω. Furthermore, for each x ∈ S, (3.2.51)
Tx S = ker DF (x).
Proof. Given q ∈ S, set Kq = ker DF (q) and define (3.2.52)
Gq : V −→ W ⊕ Kq ,
Gq (x) = (F (x), Pq (x − q)),
128
3. Multivariable integral calculus and calculus on surfaces
where Pq is a projection of V onto Kq . Note that (3.2.53)
Gq (q) = (F (q), 0) = (p, 0).
Also (3.2.54)
DGq (x) = (DF (x), Pq ),
x ∈ V.
We claim that (3.2.55)
DGq (q) = (DF (q), Pq ) : V → W ⊕ Kq is an isomorphism.
This is a special case of the following general observation.
Lemma 3.2.6. If A : V → W is a surjective linear map and P is a projection of V onto ker A, then (3.2.56)
(A, P ) : V −→ W ⊕ ker A is an isomorphism.
We postpone the proof of this lemma and proceed with the proof of Proposition 3.2.5. Having (3.2.55), we can apply the Inverse Function Theorem (Proposition 3.2.4) to obtain a neighborhood U of q in V and a neighborhood O of (p, 0) in W ⊕ Kq such that Gq : U → O is bijective, with C k inverse G−1 q : O −→ U,
(3.2.57)
G−1 q (p, 0) = q.
By (3.2.52), given x ∈ U , (3.2.58)
x ∈ S ⇐⇒ Gq (x) = (p, v), for some v ∈ Kq .
Hence S ∩U is the image under the C k diffeomorphism G−1 q of O ∩{(p, v) : v ∈ Kq }. Hence S is smooth of class C k and dim Tq S = dim Kq . It follows from the chain rule that Tq S ⊂ Kq , so the dimension count yields Tq S = Kq . This proves Proposition 3.2.5. Note that we have the following coordinate chart on a neighborhood of q ∈ S: (3.2.59)
ψq (v) = G−1 q (p, v),
ψq : Ωq → S,
where Ωq is a neighborhood of 0 in Tq S = Kq = ker DF (q). It remains to prove Lemma 3.2.6. Indeed, given that A : V → W is surjective, the fundamental theorem of linear algebra implies dim V = dim(W ⊕ ker A), and it is clear that (A, P ) in (3.2.56) is injective, so the isomorphism property follows. Remark. In case V = Rn and W = R, DF (x) is typically denoted ∇F (x), the hypothesis on DF (x) becomes ∇F (x) ̸= 0, and (3.2.51) is equivalent to the assertion that dim S = n − 1 and, for x ∈ S, ∇F (x) ⊥ Tx S.
(3.2.60)
Compare the discussion following Proposition 2.2.6. We illustrate Proposition 3.2.5 with another proof that SO(n) ⊂ M (n, R)
(3.2.61)
is a smooth surface, different from the argument involving (3.2.39)–(3.2.41). To get this, we take (3.2.62)
V = M (n, R),
W = {A ∈ M (n, R) : A = At },
3.2. Surfaces and surface integrals
129
and F : V −→ W,
(3.2.63)
F (X) = X t X.
Now, given X, Y ∈ V , Y small, F (X + Y ) = X t X + X t Y + Y t X + O(∥Y ∥2 ),
(3.2.64) so
DF (X)Y = X t Y + Y t X.
(3.2.65) We claim that (3.2.66)
X ∈ SO(n) =⇒ DF (X) : M (n, R) → W is surjective.
Indeed, given A ∈ W , i.e., A ∈ M (n, R) and At = A, and X ∈ SO(n), we have 1 (3.2.67) Y = XA =⇒ DF (X)Y = A. 2 This establishes (3.2.66), so Proposition 3.2.5 applies. Again we conclude that SO(n) is a smooth surface in M (n, R). Riemann integrable functions on a surface Let M ⊂ Rn be an m-dimensional surface, smooth of class C 1 . We define the class Rc (M ) of compactly supported Riemann integrable functions as follows, guided by Proposition 3.1.11. If f : M → R is bounded and has compact support, we set {∫ } I(f ) = inf g dS : g ∈ Cc (M ), g ≥ f , M
{∫
(3.2.68) I(f ) = sup
} h dS : h ∈ Cc (M ), h ≤ f ,
M
where Cc (M ) denotes the set of continuous functions on M with compact support. Then (3.2.69)
f ∈ Rc (M ) ⇐⇒ I(f ) = I(f ),
∫ and if such is the case, we denote the common value by M f dS. It follows readily from the definition and arguments produced in §3.1 that (3.2.70) ∫ ∫ ∫ f1 , f2 ∈ Rc (M ) ⇒ f1 + f2 ∈ Rc (M ) and (f1 + f2 ) dS = f1 dS + f2 dS. M
M
M
In fact, using (3.2.16) for functions that are continuous on M with compact support, one obtains from the definition (3.2.68) that, if fj : M → R are bounded and have compact support, I(f1 + f2 ) ≤ I(f1 ) + I(f2 ),
I(f1 + f2 ) ≥ I(f1 ) + I(f2 ),
which yields (3.2.70). Also one can modify the proof of Proposition 3.1.17 to show that (3.2.71)
f ∈ Rc (M ), u ∈ C(M ) =⇒ uf ∈ Rc (M ).
130
3. Multivariable integral calculus and calculus on surfaces
Furthermore, if φ : O → U ⊂ M is a coordinate chart and f ∈ Rc (U ), then an application of Proposition 3.1.11 gives ∫ ∫ √ (3.2.72) f ◦ φ ∈ Rc (O), and f dS = f (φ(x)) g(x) dx, M
O
with g(x) as in (3.2.12)–(3.2.13). Given ∑ any f∑∈ Rc (M ), we can take a continuous partition of unity {uj }, write f = j fj = j uj f , and use (3.2.70)–(3.2.72) to ∫ express M f dS as a sum of integrals over coordinate charts. If Σ ⊂ M has compact closure, then cont+ Σ = I(χΣ ),
(3.2.73)
and Σ is contented if and only if χΣ ∈ Rc (M ). In such a case, (3.2.73) is the area of Σ. Given f : M → R, bounded and compactly supported, in parallel with (3.1.39) we say (3.2.74)
f ∈ Cc (M ) ⇔ the set Σ of points of discontinuity of f satisfies cont+ Σ = 0.
We have Cc (M ) ⊂ Rc (M ),
(3.2.75)
and (again parallel to Proposition 3.1.11) if f : M → R is bounded and compactly supported, } {∫ g dS : g ∈ Cc (M ), g ≥ f , I(f ) = inf M {∫
(3.2.76) I(f ) = sup
} h dS : h ∈ Cc (M ), h ≤ f .
M
One can proceed from here to define the spaces (3.2.77)
R(M ),
R# (M ),
and establish properties of functions in these spaces, in analogy with work in §3.1 on R(Rn ) and R# (Rn ). We leave such an investigation to the reader. Vector fields and flows on surfaces Let M ⊂ Rn be a smooth, m-dimensional surface. A smooth vector field X on M (sometimes called a tangent vector field) is a smooth map (3.2.78)
X : M −→ Rn such that X(p) ∈ Tp M, ∀ p ∈ M.
If φ : O → U ⊂ M is a coordinate chart, then there is a unique smooth vector field Xφ : O → Rm such that (3.2.79)
X(φ(x)) = Dφ(x)Xφ (x).
t The vector field X generates a flow FX on M , satisfing
(3.2.80)
t t FX (φ(x)) = φ(FX (x)), φ
3.2. Surfaces and surface integrals
131
at least for small |t|. As before, we have the defining property d t t F (x) = X(FX (x)). dt X
(3.2.81) Note also that
s+t t s FX (x) = FX ◦ FX (x).
(3.2.82)
With slight abuse of notation we will use the same symbol X for the vector field X on M and for the associated vector field Xφ on a coordinate patch. t Valuable information on the behavior of the flow FX can be obtained by investigating the t-derivative of t vt (x) = v(FX (x)),
(3.2.83)
given v ∈ C01 (M ), i.e., v is of class C 1 and vanishes outside some compact subset of M . In fact, we take v ∈ C01 (U ), and identify this with v ∈ C01 (O), with O ⊂ Rm as above. The chain rule plus (3.2.81) yields d t t vt (x) = X(FX (x)) · ∇v(FX (x)), dt
(3.2.84) In particular,
d s v(FX (x)) s=0 = X(x) · ∇v(x). ds Here ∇v is the gradient of v, given by ∇v = (∂v/∂x1 , . . . , ∂v/∂xm ). A useful alternative formula to (3.2.84) is d d s vt (x) = vt (FX (x)) s=0 dt ds (3.2.86) = X(x) · ∇vt (x), (3.2.85)
the first equality following from (3.2.82) and the second from (3.2.85), with v replaced by vt . One significant consequence of (3.2.86), which will lead to the formula (3.2.91) below, is that, for v ∈ C01 (O), ∫ ∫ d √ √ t v(FX (x)) g dx = X(x) · ∇vt (x) g dx dt O O ∫ (3.2.87) √ t (x)) g dx. = − div X(x)v(FX O
Here, div X(x) is the divergence of the vector field X(x) = (X1 (x), . . . , Xm (x)), defined (in local coordinates) by ∑ ∂ (g 1/2 Xj (x)). (3.2.88) div X(x) = g −1/2 ∂x j j The last equality in (3.2.87) follows by integration by parts, ∫ ∫ ∂vt ∂Gk √ dx = − vt (x) dx, Gk (x) = gXk (x), Gk (x) ∂xk ∂xk O
O
132
3. Multivariable integral calculus and calculus on surfaces
followed by summation over k. We restate (3.2.87) in global terms: ∫ ∫ d t t v(FX (x)) dV = − div X(x)v(FX (x)) dV. (3.2.89) dt M
M
So far, we have (3.2.89) for v ∈ C01 (M ). We extend this to less regular functions. First, note that (3.2.89) implies ∫ ∫ t v(FX (x)) dV − v(x) dV M
(3.2.90)
M
∫ t∫ =−
s div X(x)v(FX (x)) dV ds. 0
M
Basic results on the integral allow one to pass from v ∈ C01 (M ) in (3.2.58) to more general v, including v = χΩ (the characteristic function of Ω, defined to be equal to 1 on Ω and 0 on M \ Ω), for smoothly bounded compact Ω ⊂ M . In more detail, if Ω ⊂ M is a compact, smoothly bounded subset, let Bδ = {x ∈ M : dist(x, Ω) ≤ δ}. There exists δ0 > 0 such that Bδ ⊂ M for δ ∈ (0, δ0 ]. For such δ, one can produce vδ ∈ C01 (M ) such that vδ = 1 on Bδ , Then
0 ≤ vδ ≤ 1,
∫ ∫ χΩ (x) dV − vδ (x) dV ≤ Vol(Bδ \ Ω) → 0, as δ → 0,
so, as δ → 0,
∫
∫ vδ (x) dV −→
M
∫ t vδ (FX (x)) dV
−→
M
t χΩ (FX (x)) dV, M
∫ t∫
∫ t∫ s div X(x)vδ (FX (x)) dV
0
χΩ (x) dV. M
Similar arguments give ∫
and
vδ = 0 on M \ Bδ .
ds →
s div X(x)χΩ (FX (x)) dV ds. 0
M
M
These results allow one to take v = χΩ in (3.2.90). Now one can pass from (3.2.90) back to (3.2.89), via the fundamental theorem of calculus. Note that ∫ −t t Vol FX (Ω) = χΩ (FX (x)) dV. M
We can apply (3.2.89), with t replaced by −t, and v by χΩ , and deduce the following.
3.2. Surfaces and surface integrals
133
t Proposition 3.2.7. If X is a C 1 vector field, generating the flow FX , well defined on M for t ∈ I, and Ω ⊂ M is compact and smoothly bounded, then, for t ∈ I, ∫ d t (3.2.91) Vol FX (Ω) = div X(x) dV. dt t (Ω) FX
This result is behind the notation div X, i.e., the divergence of X. Vector fields t with positive divergence generate flows FX that magnify volumes as t increases, while vector fields with negative divergence generate flows that shrink volumes as t increases. We will see more of the divergence operation on vector fields in Sections 4.4 and 5.2. Projective spaces, quotient surfaces, and manifolds Real projective space Pn−1 is obtained from the sphere S n−1 by identifying each pair of antipodal points: Pn−1 = S n−1 / ∼,
(3.2.92) where
x ∼ y ⇐⇒ x = ±y,
(3.2.93)
for x, y ∈ S ⊂ R . More generally, if M ⊂ Rn is an m-dimensional surface, smooth of class C k , satisfying n−1
n
0∈ / M,
(3.2.94)
x ∈ M ⇒ −x ∈ M,
we define P(M ) = M/ ∼,
(3.2.95)
using the equivalence relation (3.2.93). Note that M has the metric space structure d(x, y) = ∥x − y∥, and then P(M ) becomes a metric space with metric (3.2.96)
d([x], [y]) = min{d(x′ , y ′ ) : x′ ∈ [x], y ′ ∈ [y]},
or, in view of (3.2.93), (3.2.97)
d([x], [y]) = min{d(x, y), d(x, −y)}.
Here, x ∈ M and [x] ∈ P(M ) is its associated equivalence class. The map x 7→ [x] is a continuous map (3.2.98)
ρ : M −→ P(M ).
It has the following readily established property. Lemma 3.2.8. Each p ∈ P(M ) has an open neighborhood U ⊂ P(M ) such that ρ−1 (U ) = U0 ∪ U1 is the disjoint union of two open subsets of M , and, for j = 0, 1, ρ : Uj → U is a homeomorphism, i.e., it is continuous, one-to-one, and onto, with continuous inverse. Given p ∈ P(M ), {p0 , p1 } = ρ−1 (p), let U0 be a neighborhood of p0 in M for which there is a C k coordinate chart φ0 : O → U0 (O ⊂ Rm open). Then φ1 (x) = −φ0 (x) gives a coordinate chart φ1 : O → U1 onto a neighborhood U1 of p1 ∈ M . If U0 is picked small enough, U0 and U1 are disjoint. The projection ρ
134
3. Multivariable integral calculus and calculus on surfaces
maps U0 and U1 homeomorphically onto a neighborhood U of p in P(M ), and we have “coordinate charts” ρ ◦ φj : O −→ U.
(3.2.99)
In fact, ρ ◦ φ1 = ρ ◦ φ0 . If ψ0 : Ω → U0 is another C k coordinate chart, then, as in Lemma 3.2.1, we have a C k diffeomorphism F : O → Ω such that ψ0 ◦ F = φ0 . Similarly ψ1 ◦ F = φ1 , with ψ1 (x) = −ψ0 (x), and we have ρ ◦ ψj ◦ F = ρ ◦ φj . The structure just placed on the “quotient surface” P(M ) makes it a manifold, an object we now define. Given a metric space X, we say X has the structure of a C k manifold of dimension m provided the following conditions hold. First, for each p ∈ X, we have an open neighborhood Up of p in X, an open set Op ⊂ Rm , and a homeomorphism φp : Op −→ Up .
(3.2.100)
Next, if also q ∈ X and Upq = Up ∩ Uq ̸= ∅, then the homeomorphism from −1 Opq = φ−1 p (Upq ) to Oqp = φq (Upq ), (3.2.101) Fpq = φ−1 q ◦ φp O , pq
is a C diffeomorphism. As before, we call the maps φp : Op → Up ⊂ X coordinate charts on X. k
A metric tensor on a C k manifold X is defined by positive-definite, symmetric m × m matrices Gp ∈ C k−1 (Op ), satisfying the compatibility condition (3.2.102)
Gp (x) = DFpq (x)t Gq (y)DFpq (x),
for (3.2.103)
x ∈ Opq ⊂ Op ,
y = Fpq (x) ∈ Oqp ⊂ Oq .
We then set gp = det Gp ∈ C k−1 (Op ),
(3.2.104) satisfying
√
(3.2.105)
√ gp (x) = | det DFpq (x)| gq (y),
for x and y as in (3.2.103). If f : X → R is a continuous function supported in Up , we set ∫ ∫ √ (3.2.106) f dS = f (φp (x)) gp (x) dx. X
Op
∫ As in (3.2.14)–(3.2.15), this leads to a well defined integral X f dS for f ∈ Cc (X), obtained by writing f as a finite sum of continuous functions supported on various coordinate patches Up . From here we can develop the class of functions Rc (X) and their integrals over X, in a fashion parallel to that done above when X is a surface in Rn . The quotient surfaces P(M ) are examples of C k manifolds as defined above. They get natural metric tensors with the property that ρ in (3.2.98) is a local
3.2. Surfaces and surface integrals
isometry. In such a case,
∫
(3.2.107)
135
1 f dS = 2
P(M )
∫ f ◦ ρ dS. M
Another important quotient manifold is the “flat torus” Tn = Rn /Zn .
(3.2.108)
Here the equivalence relation on Rn is x ∼ y ⇔ x − y ∈ Zn . Natural local coordinates on Tn are given by the projection ρ : Rn → Tn , restricted to sufficiently small open sets in Rn . The quotient Tn gets a natural metric tensor for which ρ is a local isometry. Given two C k manifolds X and Y , a continuous map ψ : X → Y is said to be smooth of class C k provided that for each p ∈ X, there are neighborhoods U e of q = ψ(p), and coordinate charts φ1 : O → U, φ2 : O e → U e , such of p and U −1 k k e is a C map. We say ψ is a C diffeomorphism if it is that φ2 ◦ ψ ◦ φ1 : O → O one-to-one and onto and ψ −1 : Y → X is a C k map. If X is a C k manifold and M ⊂ Rn a C k surface, a C k diffeomorphism ψ : X → M is called a C k embedding of X into Rn . Here is an embedding of Tn into R2n : (3.2.109)
ψ(x) =
n ∑
(cos 2πxj )ej +
j=1
n ∑
(sin 2πxj )en+j .
j=1
A priori, ψ : Rn → R2n , but ψ(x) = ψ(y) whenever x − y ∈ Zn , so this naturally induces a smooth map Tn → R2n , which can be seen to be an embedding. If M ⊂ Rn is an m-dimensional surface satisfying (3.2.94), an embedding of P(M ) into M (n, R) can be constructed via the map ψ : Rn −→ M (n, R),
(3.2.110) Note that (3.2.111)
x1 x = ... =⇒ xxt = xn
ψ(x) = xxt .
x21 .. .
···
x1 xj .. .
···
xn x1
···
xn xj
···
x1 xn .. . . x2n
We need a couple of lemmas. Lemma 3.2.9. For ψ as in (3.2.110), x, y ∈ Rn , (3.2.112)
ψ(x) = ψ(y) ⇐⇒ x = ±y.
Proof. The map ψ is characterized by ψ(x)ej = xj x, where x is as in (3.2.111) and {ej } is the standard basis of Rn . It follows that if x ̸= 0, ψ(x) has exactly one nonzero eigenvalue, namely |x|2 , and ψ(x)x = |x|2 x. Thus ψ(x) = ψ(y) implies that |x|2 = |y|2 and that x and y are parallel. Thus x = ay and a = ±1. Lemma 3.2.10. In the setting of Lemma 3.2.9, if x ̸= 0, (3.2.113)
Dψ(x) : Rn −→ M (n, R) is injective.
136
3. Multivariable integral calculus and calculus on surfaces
Proof. A calculation gives Dψ(x)v = xv t + vxt .
(3.2.114) Thus, if v ∈ ker Dψ(x),
xv t = −vxt .
(3.2.115)
Both sides are rank 1 elements of M (n, R). The range of the left side is spanned by x and that of the right side is spanned by v, so v = ax for some a ∈ R. Then (3.2.115) becomes axxt = −axxt ,
(3.2.116) which implies a = 0 if x ̸= 0.
Remark. Here is a refinement of Lemma 3.2.10. Using the inner product on M (n, R) given by (3.2.42), we can calculate ( ) (3.2.117) ⟨Dψ(x)v, Dψ(x)v⟩ = 2 |x|2 |v|2 + (x · v)2 . Lemmas 3.2.9 and 3.2.10 imply that if M ⊂ Rn is an m-dimensional surface satisfying (3.2.94), then ψ|M yields an embedding of P(M ) into M (n, R). Denote the image surface by M # . As we see from (3.2.117), this embedding is not typically an isometry. However, if M = S n−1 and v is tangent to S n−1 at x, then v · x = 0, and (3.2.117) implies that in this case the embedding of Pn−1 into M (n, R) is an isometry, up to a factor of 2. It is the case that if X is any C k manifold that is a countable union of compact sets, then X can be embedded into Rn for some n. In case X is compact, this is not very hard to prove, using local coordinate charts and smooth cutoffs, and the interested reader might take a crack at it. If X is provided with a metric tensor, this embedding might not preserve this metric tensor. If it does, one calls is an isometric embedding. It is the case that any such manifold has an isometric embedding into Rn for some n (if k is sufficiently large). This result is the famous Nash embedding theorem, and its proof is quite difficult. For X compact and C ∞ , a proof is given in Chapter 14 of [46]. Polar decomposition of matrices We define the spaces Sym(n) and P(n) by (3.2.118)
Sym(n) = {A ∈ M (n, R) : A = At }, P(n) = {A ∈ Sym(n) : x · Ax > 0, ∀ x ∈ Rn \ 0}.
It is easy to show that P(n) is an open, convex subset of the linear space Sym(n). We aim to prove the following result. Proposition 3.2.11. Given A ∈ Gl+ (n, R), there exist unique U ∈ SO(n) and Q ∈ P(n) such that (3.2.119)
A = U Q.
3.2. Surfaces and surface integrals
137
The representation (3.2.119) is called the polar decomposition of A. Note that (U Q)t U Q = QU t U Q = Q2 ,
(3.2.120)
so if the identity (3.2.119) were to hold, we would have At A = Q2 .
(3.2.121) Note also that
A ∈ Gl(n, R) =⇒ At A ∈ P(n),
(3.2.122)
since x · At Ax = (Ax) · (Ax) = |Ax|2 . To prove Proposition 3.2.11, we bring in the following basic result of linear algebra. See Appendix A.3. Proposition 3.2.12. Given B ∈ Sym(n), there is an orthonormal basis of Rn consisting of eigenvectors of B, with eigenvalues λj ∈ R. Equivalently, there exists V ∈ SO(n) such that (3.2.123)
B = V DV −1 ,
with
D=
(3.2.124)
λ1 ..
,
. λn
λj ∈ R. If B ∈ P(n), then each λj > 0. We can then set 1/2 λ1 −1 .. V , (3.2.125) Q=V . 1/2 λn and obtain the following. Corollary 3.2.13. Given B ∈ P(n), there is a unique Q ∈ P(n) satisfying (3.2.126)
Q2 = B.
We say Q = B 1/2 . To obtain the decomposition (3.2.119), we set (3.2.127)
Q = (At A)1/2 ,
U = AQ−1 .
Note that (3.2.128)
U t U = Q−1 At AQ−1 = Q−1 Q2 Q−1 = I,
and (det U )(det Q) = det A > 0, so det U > 0, and hence U ∈ SO(n), as desired. By (3.2.121) and Corollary 3.2.13, the factor Q ∈ P(n) in (3.2.119) is unique, and hence so is the factor U . We can use Proposition 3.2.11 to prove the following. Proposition 3.2.14. The set Gl+ (n, R) is connected. In fact, given A ∈ Gl+ (n, R), there is a smooth path γ : [0, 1] → Gl+ (n, R) such that γ(0) = I and γ(1) = A.
138
3. Multivariable integral calculus and calculus on surfaces
Proof. To start, we have that (3.2.129)
Exp : Skew(n) −→ SO(n) is onto.
See Exercise 14 below for this (or Corollary A.3.9). Hence, with A = U Q as in (3.2.119), we have a smooth path α(t) = Exp(tS), α : [0, 1] → SO(n), such that α(0) = I and α(1) = U . Since P(n) is a convex subset of Sym(n), we can take β(t) = (1 − t)I + tQ, obtaining a smooth path β : [0, 1] → P(n), such that β(0) = I and β(1) = Q. Then (3.2.130)
γ(t) = α(t)β(t)
does the trick.
Exercises 1. The following exercise deals with parametrizing a curve by arc length. (a) Let γ : [a, b] → Rn be a C 1 curve, and assume γ ′ (t) ̸= 0 for t ∈ [a, b]. By (3.2.17), the length of the segment γ([a, t]) is ∫ t ℓ(t) = |γ ′ (x)| dx. a
We have a C 1 map ℓ : [a, b] → [0, L] (where L = ℓ(b)), satisfying ℓ′ (t) = |γ ′ (t)| > 0. Hence (by the 1D inverse function theorem) there is a C 1 inverse, ℓ−1 : [0, L] → [a, b]. We set σ(s) = γ(ℓ−1 (s)),
s ∈ [0, L].
Show that σ : [0, L] → Rn is a C 1 curve and |σ ′ (s)| ≡ 1. We say that σ is obtained from γ by parametrization by arc length. (b) Consider the circle C = {(x, y) ∈ R2 : x2 + y 2 = 1}. Show that C has a parametrization by arc length, say γ, such that γ(0) = (1, 0) and γ ′ (0) = (0, 1). Let us say γ(t) = (c(t), s(t)). Remark. The curve segment γ([0, t]) has length t. In trigonometry, the line segments from (0, 0) to (1, 0) = γ(0) and from (0, 0) to (c(t), s(t)) = γ(t) are said to meet at an angle, measured in radians, equal to the length of this curve, i.e., to t radians. Then the geometric definition of the trigonometric functions cos t and sin t yields cos t = c(t),
sin t = s(t).
(c) Consult the auxiliary exercises on trigonometric functions in §2.3, and deduce from (2.3.111) that cos t and sin t, defined above, are identical with cos t and sin t as they appear in (2.3.108). For a related approach, see Exercises 11–13 below. (d) Taking π as characterized below (2.3.112), show that the arc length of the unit circle C is equal to 2π (reiterating the case n = 2 of (3.2.32)).
Exercises
139
2. Compute the volume of the unit ball B n = {x ∈ Rn : |x| ≤ 1}. Hint. Apply (3.2.30) with φ = χ[0,1] . 3. Taking the upper half of the sphere S n to be the graph of xn+1 = (1 − |x|2 )1/2 , for x ∈ B n , the unit ball in Rn , deduce from (3.2.23) and (3.2.30) that ∫ π/2 ∫ 1 rn−1 √ dr = 2An−1 (sin θ)n−1 dθ. An = 2An−1 1 − r2 0 0 Use this to get an alternative derivation of the formula (3.2.38) for An . Hint. Rewrite this formula as ∫ π An = An−1 bn−1 , bk = sink θ dθ. 0
To analyze bk , you can write, on the one hand, ∫ π bk+2 = bk − sink θ cos2 θ dθ, 0
and on the other, using integration by parts, ∫ π d bk+2 = cos θ sink+1 θ dθ. dθ 0 Deduce that k+1 bk+2 = bk . k+2 4. Suppose M is a surface in Rn of dimension 2, and ( ) φ : O → U ⊂ M is a coordinate chart, with O ⊂ R2 . Set φjk (x) = φj (x), φk (x) , so φjk : O → R2 . Show that the formula (3.2.12) for the surface integral is equivalent to √ ( ∫ ∫ )2 ∑ f dS = f ◦ φ(x) det Dφjk (x) dx. O
M
Hint. Show that the quantity under
j 0 by x2 + y 2 + z 2 = 1. a2
144
3. Multivariable integral calculus and calculus on surfaces
Use the method of Exercise 17 to show that ∫ a√ Area Ea = 4π 1 − βs2 ds,
β=
0
1 1 − 4. a2 a
19. Given a, b, c > 0, consider the ellipsoid E(a, b, c), given by x2 y2 z2 + 2 + 2 = 1. 2 a b c Using (3.2.23), write down a formula for the area of E(a, b, c) as an integral over the region } { y2 x2 Ea,b = (x, y) ∈ R2 : 2 + 2 ≤ 1 . a b 20. Consider the parabolic curve ( t2 ) . γ(t) = t, 2 Show that the length of γ([0, x]) is
∫
x
ℓ(x) =
√
1 + t2 dt.
0
Evaluate this integral using the substitution t = sinh u. 21. Let M be the surface of revolution obtained by taking the graph of the function y = ex , a ≤ x ≤ b, and rotating it about the x-axis in R3 . Show that Exercise 17 yields ∫ b √ Area M = 2π es 1 + e2s ds. a
Taking t = es , show that this is equal to ∫ β√ 2π 1 + t2 dt,
α = ea , β = eb .
α
Relate this to Exercise 20. In the next exercise, M is an n-dimensional “surface of revolution” given by a smooth map ψ : [a, b] × S n−1 −→ M ⊂ R × Rn of the form ψ(s, ω) = (s, f (s)ω). A coordinate chart φ : Ω → S , with Ω open in Rn−1 , gives rise to a coordinate chart ˜ x) = (s, f (s)φ(x)). ψ˜ : [a, b] × Ω −→ M, ψ(s, n−1
We set x0 = s, x = (x1 , . . . , xn−1 ).
Exercises
145
˜ 22. Show that, in ψ-coordinates, the metric tensor of M takes the form (gjk ), for 0 ≤ j, k ≤ n − 1, with the following components: ∂ ψ˜ ∂ ψ˜ g00 = · = 1 + f ′ (s)2 , ∂x0 ∂x0 ∂ ψ˜ ∂ ψ˜ · = 0, for 1 ≤ j ≤ n − 1, g0j = ∂x0 ∂xj ∂ ψ˜ ∂ ψ˜ gjk = · = f (s)2 hjk , for 1 ≤ j, k ≤ n − 1, ∂xj ∂xk where (hℓm ) is the metric tensor of S n−1 in the φ-coordinates. Otherwise said, ( ) 1 + f ′ (s)2 (gjk ) = . f (s)2 hℓm Compare (3.2.26). 23. In the setting of Exercise 22, deduce that if u : M → R is continuous, ∫ ∫ b ∫ √ u dS = u(s, f (s)ω)f (s)n−1 1 + f ′ (s)2 dS(ω) ds. a
M
S n−1
In particular, with An−1 as in (3.2.30)–(3.2.32), ∫ b √ Area M = An−1 f (s)n−1 1 + f ′ (s)2 ds. a
Note how this generalizes the conclusion of Exercise 17. √ 24. In the setting of Exercises 22–23, let M = S n , with f (s) = 1 − s2 . Show that ∫ ∫ 1 u(x0 ) dS(x) = An−1 u(s)(1 − s2 )(n−2)/2 ds. −1
Sn
25. Let ψ : SO(n) → M (k, R) be continuous and satisfy the following properties: ψ(gh) = ψ(g)ψ(h),
ψ(g −1 ) = ψ(g)−1 ,
for all g, h ∈ SO(n). We say ψ is a representation of SO(n) on Rk . Form ∫ P = ψ(g) dg, P ∈ M (k, R), SO(n)
using the integral (3.2.45) (but here with a matrix valued integrand). Show that P : Rk −→ V, and v ∈ V ⇒ P v = v, where V = {v ∈ Rk : ψ(g)v = v, ∀ g ∈ SO(n)}. Thus P is a projection of Rk onto V . ∫ Hint. With h ∈ SO(n) arbitrary, express ψ(h)P as the integral ψ(hg) dg, and apply (3.2.44).
146
3. Multivariable integral calculus and calculus on surfaces
26. In the setting of Exercise 25, show that ∫ dim V = χ(g) dg,
χ(g) = Tr ψ(g).
SO(n)
27. Given u ∈ C(Rn ), define Au ∈ C(Rn ) by ∫ Au(x) = u(gx) dg. SO(n)
Show that Au is a radial function, in fact Au(x) = Su(|x|),
with Su(r) =
1
∫ u(rω) dS(ω).
An−1 S n−1
28. In the setting of Exercise 27, show that if u(x) is a polynomial in x, then Su(r) is a polynomial in r. Hint. Show that Au(x) is a polynomial in x. Look at Au(re1 ). 29. Let M be a C 1 surface, K ⊂ M compact. Let φj : Oj → Uj be coordinate ∑k charts on M and assume K ⊂ ∪kj=1 Uj . Take vj ∈ Cc (Uj ) such that 1 vj > 0 on K. Let f : M → R be bounded and supported on K. Show that f ∈ Rc (M ) ⇐⇒ (vj f ) ◦ φj ∈ Rc (Oj ), for each j. Here, use the definition (3.2.68)–(3.2.69) for Rc (M ) and define Rc (Oj ) as in §3.1. 30. Let M ⊂ Rn be a compact, m-dimensional, C 1 surface. We define a contented partition of M to be a finite collection P = {Σk } of contented subsets of M such that ∪ M= Σk , cont+ (Σj ∩ Σk ) = 0, ∀ j ̸= k. k
We say maxsize P = max diam(Σk ), k
where diam Σk = supx,y∈Σk ∥x−y∥. Establish the following variant of the Darboux theorem (Proposition 3.1.1). Proposition. Let Pν = {Σkν : 1 ≤ k ≤ N (ν)} be a sequence of contented partitions of M such that maxsize Pν → 0. Pick points ξkν ∈ Σkν . Then, given f ∈ R(M ), we have ∫ N (ν) ∑ f dS = lim f (ξjν ) V (Σkν ), ν→∞
where V (Σkν ) =
∫ M
M
k=1
χΣkν dS is the content of Σkν .
3.3. Partitions of unity
147
Hint. First treat the case f ∈ C(M ). Use the material in (3.2.68)–(3.2.76) to extend this to f ∈ R(M ). 31. We desire to compute det G when G = (gjk ) is an m × m matrix given by gjk = δjk + vj vk . Compare (3.2.21). In other words, G = I + T,
T = (tjk ),
tjk = vj vk .
(a) Let v ∈ R have components vj . Show that, for w ∈ Rm , T w = (v · w)v. (b) Deduce that T has one nonzero eigenvalue, |v|2 . (c) Deduce that one eigenvalue of G is 1 + |v|2 , and the other m − 1 eigenvalues are 1. √ √ (d) Deduce that g = det G = 1 + |v|2 , so g = 1 + |v|2 . Compare (3.2.22), with v = ∇u. m
3.3. Partitions of unity In the text we have made occasional use of partitions of unity, and we include some material on this topic here. We begin by defining and constructing a continuous partition of unity on a compact metric space, subordinate to a open cover {Uj : 1 ≤ j ≤ N } of X. By definition, this is a family of continuous functions φj : X → R such that ∑ (3.3.1) φj ≥ 0, supp φj ⊂ Uj , φj = 1. j
To construct such a partition of unity, we do the following. First, it can be shown that there is an open cover {Vj : 1 ≤ j ≤ N } of X and open sets Wj such that (3.3.2)
V j ⊂ Wj ⊂ W j ⊂ Uj .
Given this, let ψj (x) = dist (x, X \Wj ). Then ψj is continuous, supp ψj ⊂ W j ⊂ Uj , ∑ and ψj is strictly positive on V j . Hence Ψ = j ψj is continuous and strictly positive on X, and we see that (3.3.3)
φj (x) = Ψ(x)−1 ψj (x)
yields such a partition of unity. We indicate how to construct the sets Vj and Wj used above, starting with V1 and W1 . Note that the set K1 = X \ (U2 ∪ · · · ∪ UN ) is a compact subset of U1 . Assume it is nonempty; otherwise just throw U1 out and relabel the sets Uj . Now set V1 = {x ∈ U1 : dist (x, K1 ) < 31 dist (x, X \ U1 )}, and W1 = {x ∈ U1 : dist (x, K1 ) < 23 dist (x, X \ U1 )}. To construct V2 and W2 , proceed as above, but use the cover {U2 , . . . , UN , V1 }. Continue until done. Given a smooth compact surface M (perhaps with boundary), covered by coordinate patches Uj (1 ≤ j ≤ N ), one can construct a smooth partition of unity on
148
3. Multivariable integral calculus and calculus on surfaces
M , subordinate to this cover. The main additional tool for this is the construction of a function ψ ∈ C0∞ (Rn ) such that 1 , ψ(x) = 0 for |x| ≥ 1. 2 One way to get this is to start with the function on R given by ψ(x) = 1 for |x| ≤
(3.3.4)
f0 (x) = e−1/x
(3.3.5)
for x > 0, for x ≤ 0.
0
It is an exercise to show that f0 ∈ C ∞ (R). Now the function f1 (x) = f0 (x)f0 ( 12 − x) belongs to C ∞ (R) and is zero outside the interval [0, 1/2]. Hence the function ∫ x f2 (x) = f1 (s) ds −∞
∞
belongs to C (R), is zero for x ≤ 0, and equals some positive constant (say C2 ) for x ≥ 1/2. Then 1 ψ(x) = f2 (1 − |x|) C2 is a function on Rn with the desired properties. With this function in hand, to construct the smooth partition of unity mentioned above is an exercise we recommend to the reader.
3.4. Sard’s theorem Let F : O → Rn be a C 1 map, with O open in Rn . If p ∈ O and DF (p) : Rn → Rn is not surjective, then p is said to be a critical point, and F (p) a critical value. The set C of critical points can be a large subset of O, even all of it, but the set of critical values F (C) must be small in Rn , as the following result implies. Proposition 3.4.1. If F : O → Rn is a C 1 map, C ⊂ O its set of critical points, and K ⊂ O compact, then F (C ∩ K) is a nil subset of Rn . Proof. Without loss of generality, we can assume K is a cubical cell. Let P be a partition of K into cubical cells Rα , all of diameter δ. Write P = P ′ ∪ P ′′ , where cells in P ′ are disjoint from C, and cells in P ′′ intersect C. Pick xα ∈ Rα ∩ C, for Rα ∈ P ′′ . Fix ε > 0. Now we have (3.4.1)
F (xα + y) = F (xα ) + DF (xα ) + rα (y),
and, if δ > 0 is small enough, then |rα (y)| ≤ ε|y| ≤ εδ, for xα +y ∈ Rα . Thus F (Rα ) is contained in an εδ-neighborhood of the set Hα = F (xα ) + DF (xα )(Rα − xα ), which is a parallelipiped of dimension ≤ n − 1, and diameter ≤ M δ, if |DF | ≤ M. Hence (3.4.2)
cont+ F (Rα ) ≤ Cεδ n ≤ C ′ εV (Rα ),
for Rα ∈ P ′′ .
3.5. Morse functions
149
Thus cont+ F (C ∩ K) ≤
(3.4.3)
∑
cont+ F (Rα ) ≤ C ′′ ε.
Rα ∈P ′′
Taking ε → 0, we have the proof.
This is the easy case of a result known as Sard’s Theorem, which also treats the case F : O → Rn when O is an open set in Rm , m > n. Then a more elaborate argument is needed, and one requires more differentiability, namely that F is class C k , with k = m − n + 1. A proof can be found in [36] or [44].
3.5. Morse functions If Ω ⊂ Rn is open, a C 2 function f : Ω → R is said to be a Morse function if each critical point of f is nondegenerate, i.e., ∀ p ∈ Ω,
(3.5.1)
∇f (p) = 0 ⇒ D2 f (p) is invertible,
where D2 f (p) is the symmetric n × n matrix of second order partial derivatives defined in §2.1. More generally, if M is an n-dimensional surface, a C 2 function f : M → R is said to be a Morse function if f ◦ φ is a Morse function on Ω for each coordinate patch φ : Ω → U ⊂ M . Our goal here is to establish the existence of lots of Morse functions on an n-dimensional surface M . For simplicity, we restrict attention to the case where M is compact. Here is our main result. Theorem 3.5.1. Let M ⊂ RN be a compact, smooth, n-dimensional surface. For a ∈ RN , set φa : M −→ R,
(3.5.2)
φa (x) = a · x, x ∈ M.
Take f ∈ C (M ). Then the set Of of a ∈ RN such that 2
(3.5.3)
f + φa : M −→ R is a Morse function
is a dense open subset of RN . It is easy to verify that Of is open, since when (3.5.1) holds, a small C 2 perturbation g of f has the property that D2 g(x) is invertible for x near p. What is not so easy is to show that Of is dense (or even nonempty!). Our proof of such denseness will make use of Sard’s theorem, from §3.4. We begin with an easy special case. Proposition 3.5.2. In the setting of Theorem 3.5.1, assume N = n + 1 and M = ∂Ω, with Ω ⊂ Rn+1 open. Then (3.5.4)
{a ∈ S n : a ∈ / O0 } is a nil set,
hence has empty interior in the unit sphere S n . Proof. Here we are examining when φa is a Morse function on M . Let N : M → S n be the exterior unit normal. Then x0 ∈ M is a critical point of φa if and only if N (x0 ) = ±a. Such a point x0 is a nondegenerate critical point of φa if and only if it is not a critical point of N . Hence, if ±a ∈ S n are regular values of N , then φa
150
3. Multivariable integral calculus and calculus on surfaces
is a Morse function, i.e., a ∈ O0 . By Sard’s theorem, the set of points in S n that are critical values of N is a nil set, so the proof of Proposition 3.5.2 is done. We begin to tackle Theorem 3.5.1 with the following result. Lemma 3.5.3. Let Ω ⊂ Rn be open, and take g ∈ C 2 (Ω). Let U ⊂ Ω be the closure of a smoothly bounded open U . Set ga (x) = g(x) + a · x. Let Og denote the set of a ∈ Rn such that ga |U has only nondegenerate critical points. Then Rn \ Og is a nil set. Proof. Consider (3.5.5)
F (x) = −∇g(x),
F : Ω −→ Rn .
A point x ∈ Ω is a critical point of ga if and only if F (x) = a, and this critical point is degenerate only if, in addition, a is a critical value of F . Hence the desired conclusion holds for all a ∈ Rn that are not critical values of F |U . Again Sard’s theorem applies.
Proof of Theorem 3.5.1. Each p ∈ M has a neighborhood Up in M such that U p ⊂ Ωp ⊂ M and some n of the coordinates xj on RN produce coordinates on Ωp . Say x1 , . . . , xn do it. Let (an+1 , . . . , aN ) be fixed, but arbitrary. Then Lemma ∑N 3.5.3 can be applied to g = f + n+1 aj xj , treated as a function of (x1 , . . . , xn ). It follows that, for all (a1 , . . . , an ) but a nil set, f + φa has only nondegenerate critical points in U p . Thus (3.5.6)
{a ∈ RN : f + φa has only nondegenerate critical points in U p }
is dense in RN . We also know this set is open. Now M can be covered by a finite collection of such sets Up , so Of , defined in Theorem 3.5.1, is a finite intersection of open dense subsets of RN , hence it is open and dense, as asserted.
3.6. The tangent space to a manifold Let X be a C k manifold of dimension m, as defined in §3.2. Thus, for each p ∈ X, there are an open set Up ⊂ X an open Op ⊂ Rn , and a homeomorphism (3.6.1)
φp : Op −→ Up
of Op onto Up . One requires the following compatibility. If also q ∈ X and Upq = −1 Up ∩ Uq ̸= ∅, then the homeomorphism from Opq = φ−1 p (Upq ) to Oqp = φq (Upq ), , (3.6.2) Fpq = φ−1 q ◦ φp Opq
is a C k diffeomorphism. The maps φp : Op → Up ⊂ X are coordinate charts on X. In §3.2 we defined the tangent space Tp X as an m-dimensional linear subspace of Rn in case X is a C k surface in Rn . Here we will give an intrinsic definition of Tp X, for a general C k manifold, and show that it is naturally isomorphic to the
3.6. The tangent space to a manifold
151
tangent space defined in §3.2 when X is a C k surface in Rn . We proceed as follows. Given p ∈ X, let Cp (X) denote the set of C 1 curves (3.6.3)
γ : (−1, 1) −→ X,
γ(0) = p.
1
The notion that such γ is a C map is as defined below (3.2.108). Definition. The tangent space Tp X is the set of equivalence classes of curves in Cp (X), where, given γ0 , γ1 ∈ Cp (X), (3.6.4)
γ0 ∼ γ1 ⇐⇒ σ0′ (0) = σ1′ (0),
where (3.6.5)
σj (t) = φ−1 p ◦ γj (t),
for |t| < ε,
and ε > 0 is sufficiently small that |t| < ε ⇒ γj (t) ∈ Up . As we will see shortly, the set Tp X is unchanged if one takes some other coordinate chart about p. Given the characterization above, we have a bijective map (3.6.6)
m Dφ−1 p : Tp X −→ R ,
Dφ−1 p ([γ])
=
(φ−1 p
defined by ′
◦ γ) (0),
where, for γ ∈ Cp (X), [γ] denotes its equivalence class in Tp X. Given some other coordinate chart ψ : O → Up , we likewise set (3.6.7)
Dψ −1 (p) : Tp X −→ Rm ,
Dψ −1 (p)([γ]) = (ψ −1 ◦ γ)′ (0),
with γ ∈ Cp (X). To compare (3.6.6) and (3.6.7), we apply the chain rule to (3.6.8)
ψ −1 ◦ γ = ψ −1 ◦ φp ◦ φ−1 p ◦ γ,
obtaining (3.6.9)
′ Dψ −1 (p)([γ]) = D(ψ −1 ◦ φp )(o) (φ−1 p ◦ γ) (0)
= D(ψ −1 ◦ φp )(o) Dφ−1 p ([γ]),
−1 where o = φ−1 ◦ φp )(o) : Rm → Rm is a linear isomorphism, so p (p). Now D(ψ the bijective map (3.6.6) gives Tp X the structure of an m-dimensional vector space, independent of the choice of coordinate chart about p.
Note that if X = Rm and p ∈ Rm , then we can use the identity map as the coordinate chart φp , i.e., φp (x) = x. This leads to the natural isomorphism (3.6.10)
Ip : Tp Rm −→ Rm ,
Ip ([γ]) = γ ′ (0),
γ ∈ Cp (Rm ).
Now suppose X and Y are two C k manifolds, of dimension m and n, respectively, and (3.6.11)
f : X −→ Y
is a C map, as defined below (3.2.108). Take p ∈ X, q = f (p). Note that 1
(3.6.12)
γ ∈ Cp (X) =⇒ f ◦ γ ∈ Cq (Y ).
We can define (3.6.13)
Df (p) : Tp X −→ Tq Y
152
3. Multivariable integral calculus and calculus on surfaces
by Df (p)([γ]) = [f ◦ γ],
(3.6.14)
γ ∈ Cp (X).
In view of (3.6.10), we see that if X ⊂ Rm and Y ⊂ Rn are open subsets, then Iq ◦ Df (p) ◦ Ip−1 : Rm −→ Rn
(3.6.15)
gives the same map as we specified for Df (p) as it was defined in §2.1. Here are some related results, whose proofs are left to the reader. Proposition 3.6.1. Given a C 1 coordinate chart ψ : O → Up ⊂ X, leading to (3.6.7), we have ψ −1 : Up → Rm , and Dψ −1 (p) : Tp X −→ Tp Rm ≈ Rm
(3.6.16)
coincides with Dψ −1 (p) in (3.6.7). Proposition 3.6.2. In the setting of (3.6.11)–(3.6.14), the map Df (p) in (3.6.13) is linear. Going further, suppose Z is a C 1 manifold and we have a C 1 map g : Y −→ Z,
(3.6.17)
g(q) = r,
so, parallel to (3.6.13), (3.6.18)
Dg(q) : Tq Y −→ Tr Z.
We then have the chain rule D(g ◦ f )(p)([γ]) = [g ◦ f ◦ γ] (3.6.19) = Dg(q) Df (p)([γ]), for γ ∈ Cp (X). If X is a C k manifold of dimension m, we can form the disjoint union of the tangent spaces Tp X, ∪ (3.6.20) TX = Tp X, p∈X
called the tangent bundle of X. It is useful to know that T X has a natural structure of a C k−1 manifold, produced as follows. Suppose φ : O → U ⊂ X is a coordinate chart, and set T U = ∪p∈U Tp X ⊂ T X. We have (3.6.21)
Dφ : O × Rm → T U, Dφ(x) : R
m
Dφ(x, v) = Dφ(x)v,
≈ Tx Rm → Tφ(x) U.
If ψ : Ω → U is a smooth coordinate chart, with (3.6.22)
Dψ : Ω × Rm → T U,
Dψ(y, w) = Dψ(y)w,
then we have (3.6.23)
Fφ,ψ = (Dψ)−1 ◦ (Dφ) : O × Rm −→ Ω × Rm , ( ) Fφ,ψ (x, v) = ψ −1 ◦ φ(x), D(ψ −1 ◦ φ)(x)v ,
where (3.6.24)
ψ −1 ◦ φ : O −→ Ω
3.6. The tangent space to a manifold
153
is a C k diffeomorphism and D(ψ −1 ◦ φ)(x) : Rm → Rm is defined as in §2.1. It follows that Fφ,ψ in (3.6.23) is a C k−1 diffeomorphism, with inverse Fψ,φ : Ω×Rm → O × Rm . Thus covering X by coordinate charts that are C k compatible leads to a covering of T X by coordinate charts that are C k−1 compatible. Note that there is a natural projection π : T X −→ X,
(3.6.25)
π : Tp X → {p}.
There is also a natural inclusion ι : X −→ T X,
(3.6.26)
namely, given p ∈ X, ι(p) is the zero element of Tp X. Given that X is a C k manifold and ℓ ≤ k − 1, we can define a C ℓ vector field V on X as a C ℓ map V : X −→ T X,
(3.6.27)
such that π(V (x)) = x, ∀ x ∈ X.
Let us now recall how we defined a metric tensor on a C k manifold in §3.2. Namely, to each coordinate chart φp as in (3.6.1), there is a positive definite m × m matrix valued function Gp : Op −→ M (m, R),
(3.6.28)
smooth of class C (ℓ ≤ k − 1), satisfying the compatibility condition ℓ
(3.6.29)
Gp (x) = DFpq (x)t Gq (y) DFpq (x),
for (3.6.30)
x ∈ Opq ⊂ Op ,
y = Fpq (x) ∈ Oqp ⊂ Oq .
We can now see that such a family {Gp } gives an inner product on each tangent space Tp X, as follows. Take p ∈ X, and suppose p ∈ Uq , with φq : Oq → Uq . Take V, W ∈ Tp X. Then we set (3.6.31)
−1 −1 ⟨V, W ⟩(p) = Dφ−1 q (p)V · Gq (φq (p))Dφq (p)W.
Here we take the standard dot product on Rm . The compatibility condition (3.6.29) gives (3.6.32)
−1 −1 ⟨V, W ⟩(p) = Dφ−1 p (p)V · Gp (φp (p))Dφp (p)W,
so the inner product is independent of the choice of q in (3.6.31), as long as p ∈ Up ∩ Uq .
Chapter 4
Differential forms and the Gauss-Green-Stokes formula
The calculus of differential forms, one of E. Cartan’s fundamental contributions to analysis, provides a superb set of tools for calculus on surfaces and other manifolds. A 1-form α on an open set Ω ⊂ Rn can be written α = a1 (x) dx1 + · · · + an (x) dxn . One can integrate such a 1-form over a smooth curve γ : I → Ω, via ∫ ∫ (4.0.1) α = γ ∗ α, γ
I
∗
where γ α is the pull-back of α, given by k-form is a finite sum of terms (4.0.2)
aj (x) dxj1 ∧ · · · ∧ dxjk ,
∑ j
aj (γ(t))γj′ (t) dt. More generally, a
j = (j1 , . . . , jk ),
where the “wedge product” satisfies the anticommutativity relation (4.0.3)
dxℓ ∧ dxm = −dxm ∧ dxℓ .
If φ : O → Ω is a smooth map, and α is a k-form on Ω, one has the pull-back φ∗ α to a k-form on O, satisfying (4.0.4)
φ∗ (α ∧ β) = φ∗ α ∧ φ∗ β,
(φ ◦ ψ)∗ α = ψ ∗ (φ∗ α),
if also ψ : U → O is a smooth map. Another fundamental ingredient is the exterior derivative, (4.0.5)
d : Λk (Ω) −→ Λk+1 (Ω),
where Λk (Ω) denotes the space of smooth k-forms on Ω. One has the crucial identities (4.0.6)
ddα = 0,
d(φ∗ α) = φ∗ (dα).
The action of φ∗ on n-forms (for Ω, O open in Rn ) is given by (4.0.7)
φ∗ (F (x) dx1 ∧ · · · ∧ dxn ) = F (φ(x))(det Dφ(x)) dx1 ∧ · · · ∧ dxn . 155
156
4. Differential forms and the Gauss-Green-Stokes formula
Hence, the change of variable formula established in §3.1 yields ∫ ∫ (4.0.8) α = φ∗ α, O
Ω
provided φ : O → Ω is a diffeomorphism such that det Dφ(x) > 0 on O. (One says φ preserves orientation.) Given this, one can define ∫ (4.0.9) β M
whenever β is a k form and M ⊂ R possesses an “orientation.”
n
is a k-dimensional surface, assuming M
Complementing the important identities (4.0.6)–(4.0.8), one has the following, which could be called the “fundamental theorem of the calculus of differential forms,” ∫ ∫ (4.0.10) dα = α, M
∂M
when M is a k-dimensional oriented surface (or manifold) with smooth boundary ∂M , and α is a smooth (k − 1)-form on M . This identity generalizes classical identities of Gauss, Green, and Stokes, and is called the general Stokes formula. The proof of this, in §4.3, is followed by a discussion of these classical cases in §4.4. In §4.5 we will use the theory of differential forms to obtain another proof of the change of variable formula for the integral, a proof very much different from the one given in §3.1. Chapter 5 will present a number of substantial applications of the calculus of differential forms, particularly of the Gauss-Green-Stokes formula.
4.1. Differential forms It is very desirable to be able to make constructions that depend as little as possible on a particular choice of coordinate system. The calculus of differential forms, whose study we now take up, is one convenient set of tools for this purpose. We start with the notion of a 1-form. It is an object that gets integrated over a curve; formally, a 1-form on Ω ⊂ Rn is written ∑ (4.1.1) α= aj (x) dxj . j
If γ : [a, b] → Ω is a smooth curve, we set ∫ ∫ b∑ ( ) α= aj γ(t) γj′ (t) dt. (4.1.2) γ
In other words,
a
∫
∫ α=
(4.1.3) γ
I
γ∗α
4.1. Differential forms
157
where I = [a, b] and γ∗α =
∑
aj (γ(t))γj′ (t) dt
j
is the pull-back of α under the map γ. More generally, if F : O → Ω is a smooth map (O ⊂ Rm open), the pull-back F ∗ α is a 1-form on O defined by ∑ ∂Fj (4.1.4) F ∗α = aj (F (y)) dyk . ∂yk j,k
The usual change of variable formula for integrals gives ∫ ∫ (4.1.5) α = F ∗α γ
σ
if γ is the curve F ◦ σ. If F : O → Ω is a diffeomorphism, and ∑ ∂ (4.1.6) X= bj (x) ∂xj is a vector field on Ω, recall from (2.3.40) that we have the vector field on O : ( ) (4.1.7) F# X(y) = DF −1 (p) X(p), p = F (y). If we define a pairing between 1-forms and vector fields on Ω by ∑ (4.1.8) ⟨X, α⟩ = bj (x)aj (x) = b · a, j
a simple calculation gives (4.1.9)
⟨F# X, F ∗ α⟩ = ⟨X, α⟩ ◦ F.
Thus, a 1-form on Ω is characterized at each point p ∈ Ω as a linear transformation of the space of vectors at p to R. More generally, we can regard a k-form α on Ω as a k-multilinear map on vector fields: (4.1.10)
α(X1 , . . . , Xk ) ∈ C ∞ (Ω);
we impose the further condition of anti-symmetry when k ≥ 2: (4.1.11)
α(X1 , . . . , Xj , . . . , Xℓ , . . . , Xk ) = −α(X1 , . . . , Xℓ , . . . , Xj , . . . , Xk ).
Let us note that a 0-form is simply a function. There is a special notation we use for k-forms. If 1 ≤ j1 < · · · < jk ≤ n, j = (j1 , . . . , jk ), we set 1 ∑ (4.1.12) α= aj (x) dxj1 ∧ · · · ∧ dxjk k! j where (4.1.13)
aj (x) = α(Dj1 , . . . , Djk ),
Dj = ∂/∂xj .
More generally, we assign meaning to (4.1.12) summed over all k-indices (j1 , . . . , jk ), where we identify (4.1.14)
dxj1 ∧ · · · ∧ dxjk = (sgn σ) dxjσ(1) ∧ · · · ∧ dxjσ(k) ,
158
4. Differential forms and the Gauss-Green-Stokes formula
σ being a permutation of {1, . . . , k}. If any jm = jℓ (m ̸= ℓ), then (4.1.14) vanishes. A common notation for the statement that α is a k-form on Ω is α ∈ Λk (Ω).
(4.1.15)
In particular, we can write a 2-form β as 1∑ (4.1.16) β= bjk (x) dxj ∧ dxk 2 and pick coefficients satisfying bjk (x)∑= −bkj (x). According to (4.1.12)–(4.1.13), if ∑ we set U = uj (x)∂/∂xj and V = vj (x)∂/∂xj , then ∑ (4.1.17) β(U, V ) = bjk (x)uj (x)v k (x). ∑ If bjk is not required to be antisymmetric, one gets β(U, V ) = (1/2) (bjk − bkj )uj v k . If F : O → Ω is a smooth map as above, we define the pull-back F ∗ α of a k-form α, given by (4.1.12), to be ∑ ( ) (4.1.18) F ∗α = aj F (y) (F ∗ dxj1 ) ∧ · · · ∧ (F ∗ dxjk ) j
where F ∗ dxj =
(4.1.19)
∑ ∂Fj ℓ
∂yℓ
dyℓ ,
the algebraic computation in (4.1.18) being performed using the rule (4.1.14). Extending (4.1.9), if F is a diffeomorphism, we have (4.1.20)
(F ∗ α)(F# X1 , . . . , F# Xk ) = α(X1 , . . . , Xk ) ◦ F.
If B = (bjk ) is an n × n matrix, then, by (4.1.14), (∑ ) (∑ ) (∑ ) b1k dxk ∧ b2k dxk ∧ · · · ∧ bnk dxk k
= (4.1.21)
k
∑
k
b1k1 · · · bnkn dxk1 ∧ · · · ∧ dxkn
k1 ,...,kn
=
(∑ (
) (sgn σ)b1σ(1) b2σ(2) · · · bnσ(n) dx1 ∧ · · · ∧ dxn
σ∈Sn
) = det B dx1 ∧ · · · ∧ dxn . Here Sn denotes the set of permutations of {1, . . . , n}, and the last identity is the formula for the determinant presented in (1.4.30). It follows that if F : O → Ω is a C 1 map between two domains of dimension n, and α = A(x) dx1 ∧ · · · ∧ dxn
(4.1.22) is an n-form on Ω, then (4.1.23)
F ∗ α = det DF (y) A(F (y)) dy1 ∧ · · · ∧ dyn .
Comparison with the change of variable formula for multiple integrals suggests ∫ that one has an intrinsic definition of Ω α when α is an n-form on Ω, n = dim
4.1. Differential forms
159
Ω. To implement this, we need to take into account that det DF (y) rather than | det DF (y)| appears in (4.1.21). We say a smooth map F : O → Ω between two open subsets of Rn preserves orientation if det DF (y) is everywhere positive. The object called an “orientation” on Ω can be identified as an equivalence class of nowhere vanishing n-forms on Ω, two such forms being equivalent if one is a multiple of another by a positive function in C ∞ (Ω); the standard orientation on Rn is determined by dx1 ∧ · · · ∧ dxn . If S is an n-dimensional surface in Rn+k , an orientation on S can also be specified by a nowhere vanishing form ω ∈ Λn (S). If such a form exists, S is said to be orientable. The equivalence class of positive multiples a(x)ω is said to consist of “positive” forms. A smooth map ψ : S → M between oriented n-dimensional surfaces preserves orientation provided ψ ∗ σ is positive on S whenever σ ∈ Λn (M ) is positive. If S is oriented, one can choose coordinate charts which are all orientation preserving. We mention that there exist surfaces that cannot be oriented, such as the famous “M¨obius strip,” and also the projective space P2 , discussed in §3.2. See Exercise 13 below. We define the integral of an n-form over an oriented n-dimensional surface as follows. First, if α is an n-form supported on an open set Ω ⊂ Rn , given by (4.1.22), then we set ∫ ∫ (4.1.24) α = A(x) dV (x), Ω
Ω
the right side defined as in §3.1. If O is also open in Rn and F : O → Ω is an orientation preserving diffeomorphism, we have ∫ ∫ ∗ (4.1.25) F α = α, O
Ω
as a consequence of (4.1.23) and the change of variable formula (3.1.47). More generally, if S is an n-dimensional surface with an orientation, say the image of an open set O ⊂ Rn by φ : O → S, carrying the natural orientation of O, we can set ∫ ∫ (4.1.26) α = φ∗ α S
O
∫ for an n-form α on S. If it takes several coordinate patches to cover S, define S α by writing α as a sum of forms, each supported on one patch. ∫ We need to show that this definition of S α is independent of the choice of coordinate system on S (as long as the orientation of S is respected). Thus, suppose φ : O → U ⊂ S and ψ : Ω → U ⊂ S are both coordinate patches, so that F = ψ −1 ◦ φ : O → Ω is an orientation-preserving diffeomorphism, as in Figure 3.2.1 of §3.2. We need to check that, if α is an n-form on S, supported on U, then ∫ ∫ ∗ (4.1.27) φ α = ψ ∗ α. O
Ω
To establish this, we first show that, for any form α of any degree, (4.1.28)
ψ ◦ F = φ =⇒ φ∗ α = F ∗ ψ ∗ α.
160
4. Differential forms and the Gauss-Green-Stokes formula
∑ It suffices to check (4.1.28) for α = dxj . Then (4.1.19) gives ψ ∗ dxj = (∂ψj /∂xℓ ) dxℓ , so ∑ ∂φj ∑ ∂Fℓ ∂ψj (4.1.29) F ∗ ψ ∗ dxj = dxm , φ∗ dxj = dxm ; ∂xm ∂xℓ ∂xm m ℓ,m
but the identity of these forms follows from the chain rule: ∑ ∂ψj ∂Fℓ ∂φj = . (4.1.30) Dφ = (Dψ)(DF ) =⇒ ∂xm ∂xℓ ∂xm ℓ
Now that we have (4.1.28), we see that the left side of (4.1.27) is equal to ∫ (4.1.31) F ∗ (ψ ∗ α), O
which is equal to the right side of (4.1.27), by (4.1.25). Thus the integral of an n-form over an oriented n-dimensional surface is well defined.
Exercises 1. If F : U0 → U1 and G : U1 → U2 are smooth maps and α ∈ Λk (U2 ), then (4.1.26) implies (4.1.32)
(G ◦ F )∗ α = F ∗ (G∗ α) in Λk (U0 ).
In the special case that Uj = Rn and F and G are linear maps, and k = n, show that this identity implies (4.1.33)
det(GF ) = (det F )(det G).
Compare this with the derivation of (1.4.32). 2. Let Λk Rn denote the space of k-forms (4.1.12) with constant coefficients. Show that ( ) n k n (4.1.34) dimR Λ R = . k If T : Rm → Rn is linear, then T ∗ preserves this class of spaces; we denote the map (4.1.35)
Λk T ∗ : Λk Rn −→ Λk Rm .
Similarly, replacing T by T ∗ yields (4.1.36)
Λk T : Λk Rm −→ Λk Rn .
3. Show that Λk T is uniquely characterized as a linear map from Λk Rm to Λk Rn which satisfies (Λk T )(v1 ∧ · · · ∧ vk ) = (T v1 ) ∧ · · · ∧ (T vk ),
v j ∈ Rm .
Exercises
161
4. Show that if S, T : Rn → Rn are linear maps, then Λk (ST ) = (Λk S) ◦ (Λk T ).
(4.1.37) Relate this to (4.1.28).
If {e1 , . . . , en } is the standard orthonormal basis of Rn , define an inner product on Λk Rn by declaring an orthonormal basis to be {ej1 ∧ · · · ∧ ejk : 1 ≤ j1 < · · · < jk ≤ n}.
(4.1.38)
If A : Λ R → Λ R is a linear map, define At : Λk Rn → Λk Rn by k
(4.1.39)
n
k
n
⟨Aα, β⟩ = ⟨α, At β⟩,
α, β ∈ Λk Rn ,
where ⟨ , ⟩ is the inner product on Λk Rn defined above. 5. Show that, if T : Rn → Rn is linear, with transpose T t , then (Λk T )t = Λk (T t ).
(4.1.40)
Hint. Check the identity ⟨(Λk T )α, β⟩ = ⟨α, (Λk T t )β⟩ when α and β run over the orthonormal basis (4.1.38). That is, show that if α = ej1 ∧· · ·∧ejk , β = ei1 ∧· · ·∧eik , then (4.1.41) ⟨T ej1 ∧ · · · ∧ T ejk , ei1 ∧ · · · ∧ eik ⟩ = ⟨ej1 ∧ · · · ∧ ejk , T t ei1 ∧ · · · ∧ T t eik ⟩. Hint. Say T = (tij ). In the spirit of (4.1.21), expand T ej1 ∧ · · · ∧ T ejk , and show that the left side of (4.1.41) is equal to ∑ (4.1.42) (sgn σ)tiσ(1) j1 · · · tiσ(k) jk , σ∈Sk
where Sk denotes the set of permutations of {1, . . . , k}. Similarly, show that the right side of (4.1.41) is equal to ∑ (4.1.43) (sgn τ )ti1 jτ (1) · · · tik jτ (k) . τ ∈Sk
To compare these two formulas, see the treatment of (1.4.31). 6. Show that if {u1 , . . . , un } is any orthonormal basis of Rn , then the set {uj1 ∧ · · · ∧ ujk : 1 ≤ j1 < · · · < jk ≤ n} is an orthonormal basis of Λk Rn . Hint. Use Exercises 4 and 5 to show that if T : Rn → Rn is an orthogonal transformation on Rn (i.e., preserves the inner product) then Λk T is an orthogonal transformation on Λk Rn . 7. Let vj , wj ∈ Rn , 1 ≤ j ≤ k (k < n). Form the matrices V , whose k columns are the column vectors v1 , . . . , vk , and W , whose k columns are the column vectors w1 , . . . , wk . Show that (4.1.44)
⟨v1 ∧ · · · ∧ vk , w1 ∧ · · · ∧ wk ⟩ = det W t V = det V t W.
162
4. Differential forms and the Gauss-Green-Stokes formula
Hint. Show that both sides are linear in each vj and in each wj . Use this to reduce the problem to verifying (4.1.44) when each vj and each wj is chosen from among the set of basis vectors {e1 , . . . , en }. Use anti-symmetries to reduce the problem further. 8. Deduce from Exercise 7 that if vj , wj ∈ Rn , then ∑ (4.1.45) ⟨v1 ∧ · · · ∧ vk , w1 ∧ · · · ∧ wk ⟩ = (sgn π)⟨v1 , wπ(1) ⟩ · · · ⟨vk , wπ(k) ⟩, π
where π ranges over the set of permutations of {1, . . . , k}. 9. Show that the conclusion of Exercise 6 also follows from (4.1.45). 10. Let A, B : Rk → Rn be linear maps and set ω = e1 ∧ · · · ∧ ek ∈ Λk Rk . We have Λk Aω, Λk Bω ∈ Λk Rn . Deduce from (4.1.44) that ⟨Λk Aω, Λk Bω⟩ = det B t A.
(4.1.46)
11. Let φ : O → Rn be smooth, with O ⊂ Rm open. Deduce from Exercise 10 that, for each x ∈ O, (4.1.47)
∥Λm Dφ(x)ω∥2 = det Dφ(x)t Dφ(x),
where ω = e1 ∧ · · · ∧ em . Deduce that if φ : O → U ⊂ M is a coordinate patch on a smooth m-dimensional surface M ⊂ Rn and f ∈ C(M ) is supported on U , then ∫ ∫ (4.1.48) f dS = f (φ(x))∥Λm Dφ(x)ω∥ dx. O
M
12. Show that the result of Exercise 5 in §3.2 follows from (4.1.48), via (4.1.41)– (4.1.42). 13. Recall the projective spaces Pn , constructed in §3.2. Show that Pn is orientable if and only if n is odd. Hint. Let p : S n → Pn denote the natural projection, and A : S n → S n the antipodal map, so p ◦ A = p. If α ∈ Λn (Pn ) is nowhere vanishing, consider β = p∗ α ∈ Λn (S n ). Show that β is nowhere vanishing. Show that p∗ = A∗ p∗ , hence A∗ β = β, but A is orientation preserving if and only if n is odd.
4.2. Products and exterior derivatives of forms Having discussed the notion of a differential form as something to be integrated, we now consider some operations on forms. There is a wedge product, or exterior product, characterized as follows. If α ∈ Λk (Ω) has the form (4.1.12) and if ∑ (4.2.1) β= bi (x) dxi1 ∧ · · · ∧ dxiℓ ∈ Λℓ (Ω), i
4.2. Products and exterior derivatives of forms
define (4.2.2)
α∧β =
∑
163
aj (x)bi (x) dxj1 ∧ · · · ∧ dxjk ∧ dxi1 ∧ · · · ∧ dxiℓ
j,i
in Λk+ℓ (Ω). A special case of this arose in (4.1.18)–(4.1.21). We retain the equivalence (4.1.14). It follows easily that (4.2.3)
α ∧ β = (−1)kℓ β ∧ α.
In addition, there is an interior product if α ∈ Λk (Ω) with a vector field X on Ω, producing ιX α = α⌋X ∈ Λk−1 (Ω), defined by (4.2.4)
(α⌋X)(X1 , . . . , Xk−1 ) = α(X, X1 , . . . , Xk−1 ).
Consequently, if α = dxj1 ∧ · · · ∧ dxjk , Di = ∂/∂xi , then (4.2.5)
c j ∧ · · · ∧ dxj α⌋Djℓ = (−1)ℓ−1 dxj1 ∧ · · · ∧ dx ℓ k
c j denotes removing the factor dxj . Furthermore, where dx ℓ ℓ i∈ / {j1 , . . . , jk } =⇒ α⌋Di = 0. If F : O → Ω is a diffeomorphism and α, β are forms and X a vector field on Ω, it is readily verified that (4.2.6)
F ∗ (α ∧ β) = (F ∗ α) ∧ (F ∗ β),
F ∗ (α⌋X) = (F ∗ α)⌋(F# X).
We make use of the operators ∧k and ιk on forms: (4.2.7)
∧k α = dxk ∧ α,
ιk α = α⌋Dk .
There is the following useful anticommutation relation: ∧k ιℓ + ιℓ ∧k = δkℓ ,
(4.2.8)
where δkℓ is 1 if k = ℓ, 0 otherwise. This is a fairly straightforward consequence of (4.2.5). We also have (4.2.9)
∧j ∧k + ∧k ∧j = 0,
ιj ιk + ιk ιj = 0.
From (4.2.8)–(4.2.9) one says that the operators {ιj , ∧j : 1 ≤ j ≤ n} generate a “Clifford algebra.” Another important operator on forms is the exterior derivative: (4.2.10)
d : Λk (Ω) −→ Λk+1 (Ω),
defined as follows. If α ∈ Λk (Ω) is given by (4.1.12), then ∑ ∂aj (4.2.11) dα = dxℓ ∧ dxj1 ∧ · · · ∧ dxjk . ∂xℓ j,ℓ
Equivalently, (4.2.12)
dα =
n ∑
∂ℓ ∧ℓ α
ℓ=1
where ∂ℓ = ∂/∂xℓ and ∧ℓ is given by (4.2.7). The antisymmetry dxm ∧ dxℓ = −dxℓ ∧ dxm , together with the identity ∂ 2 aj /∂xℓ ∂xm = ∂ 2 aj /∂xm ∂xℓ , implies (4.2.13)
d(dα) = 0,
164
4. Differential forms and the Gauss-Green-Stokes formula
for any differential form α. We also have a product rule: (4.2.14)
d(α ∧ β) = (dα) ∧ β + (−1)k α ∧ (dβ),
α ∈ Λk (Ω), β ∈ Λj (Ω).
The exterior derivative has the following important property under pull-backs: F ∗ (dα) = dF ∗ α,
(4.2.15)
if α ∈ Λk (Ω) and F : O → Ω is a smooth map. To see this, extending (4.2.14) to a formula for d(α ∧ β1 ∧ · · · ∧ βℓ ) and using this to apply d to F ∗ α, we have (4.2.16) ∑ ∂ ( ) ( ) ( ) dF ∗ α = aj ◦ F (x) dxℓ ∧ F ∗ dxj1 ∧ · · · ∧ F ∗ dxjk ∂xℓ j,ℓ ∑ ) ( )( ) ( ) ( + (±)aj F (x) F ∗ dxj1 ∧ · · · ∧ d F ∗ dxjν ∧ · · · ∧ F ∗ dxjk . j,ν
Now the definition (4.1.18)–(4.1.19) of pull-back gives directly that F ∗ dxi =
(4.2.17)
∑ ∂Fi ℓ
∂xℓ
dxℓ = dFi ,
and hence d(F ∗ dxi ) = ddfi = 0, so only the first sum in (4.2.16) contributes to dF ∗ α. Meanwhile, (4.2.18)
F ∗ dα =
∑ ∂aj ( ) ( ) ( ) F (x) (F ∗ dxm ) ∧ F ∗ dxj1 ∧ · · · ∧ F ∗ dxjk , ∂xm j,m
so (4.2.15) follows from the identity (4.2.19)
∑ ∂aj ( ∑ ∂ ( ) ) aj ◦ F (x) dxℓ = F (x) F ∗ dxm , ∂xℓ ∂x m m ℓ
which in turn follows from the chain rule. If dα = 0, we say α is closed; if α = dβ for some β ∈ Λk−1 (Ω), we say α is exact. Formula (4.2.13) implies that every exact form is closed. The converse is not always true globally. Consider the multi-valued angular coordinate θ on R2 \ (0, 0); dθ is a single valued closed form on R2 \ (0, 0) which is not globally exact. An important result, called the Poincar´e lemma, is that every closed form is locally exact. A proof is given in §5.2. (A special case is established in §4.3.)
Exercises
165
Exercises 1. If α is a k-form, verify the formula (4.2.14), i.e., d(α∧β) = (dα)∧β +(−1)k α∧dβ. If α is closed and β is exact, show that α ∧ β is exact. ∑3 2. Let F be a vector field on U, open in R3 , F = 1 fj (x)∂/∂xj . The vector field G = curl F is classically defined as a formal determinant e1 e2 e3 (4.2.20) curl F = det ∂1 ∂2 ∂3 , f1 f2 f3 ∑3 where {ej } is the standard basis of R3 . Consider the 1-form φ = 1 fj (x)dxj . Show that dφ and curl F are related in the following way: curl F =
(4.2.21)
3 ∑
gj (x)ej =
1
3 ∑
gj (x) ∂/∂xj ,
1
dφ = g1 (x) dx2 ∧ dx3 + g2 (x) dx3 ∧ dx1 + g3 (x) dx1 ∧ dx2 . See (4.4.30)–(4.4.37) for more on this connection. 3. If F and φ are related as in Exercise 2, show that curl F is uniquely specified by the relation dφ ∧ α = ⟨curl F, α⟩ ω
(4.2.22)
for all 1-forms α on U ⊂ R3 , where ω = dx1 ∧ dx2 ∧ dx3 is the volume form. 4. Let B be a ball in R3 , F a smooth vector field on B. Show that ∃ u ∈ C ∞ (B) s.t. F = grad u =⇒ curl F = 0.
(4.2.23)
Hint. Compare F = grad u with φ = du. 5. Let B be a ball in R3 and G a smooth vector field on B. Show that (4.2.24) Hint. If G = (4.2.25)
∑3
∃ vector field F s.t. G = curl F =⇒ div G = 0.
1 gj (x)∂/∂xj ,
consider
ψ = g1 (x) dx2 ∧ dx3 + g2 (x) dx3 ∧ dx1 + g3 (x) dx1 ∧ dx2 .
Show that (4.2.26)
dψ = (div G) dx1 ∧ dx2 ∧ dx3 .
6. Show that the 1-form dθ mentioned below (4.2.19) is given by dθ =
x dy − y dx . x2 + y 2
166
4. Differential forms and the Gauss-Green-Stokes formula
For the next set of exercises, let Ω be a planar domain, X = f (x, y) ∂/∂x + g(x, y) ∂/∂y a nonvanishing vector field on Ω. Consider the 1-form α = g(x, y) dx − f (x, y) dy. 7. Let γ : I → Ω be a smooth curve, I = (a, b). Show that the image C = γ(I) is the image of an integral curve of X if and only if γ ∗ α = 0. Consequently, with slight abuse of notation, one describes the integral curves by g dx − f dy = 0. If α is exact, i.e., α = du, conclude the level curves of u are the integral curves of X. 8. A function φ is called an integrating factor if α ˜ = φα is exact, i.e., if d(φα) = 0, provided Ω is simply connected. Show that an integrating factor always exists, at least locally. Show that φ = ev is an integrating factor if and only if Xv = − div X. Find an integrating factor for α = (x2 + y 2 − 1) dx − 2xy dy. 9. Define the radial vector field R = x1 ∂/∂x1 + · · · + xn ∂/∂xn , on Rn . Show that ω = dx1 ∧ · · · ∧ dxn =⇒ ω⌋R =
n ∑
c j ∧ · · · ∧ dxn . (−1)j−1 xj dx1 ∧ · · · ∧ dx
j=1
Show that d(ω⌋R) = n ω. 10. Show that if F : Rn → Rn is a linear rotation (i.e., F ∈ SO(n)) than β = ω⌋R in Exercise 9 has the property that F ∗ β = β.
4.3. The general Stokes formula The Stokes formula involves integrating a k-form over a k-dimensional surface with boundary. We first define that concept. Let S be a smooth k-dimensional surface (say in RN ), and let M be an open subset of S, such that its closure M (in RN ) is contained in S. Its boundary is ∂M = M \ M . We say M is a smooth surface with boundary if also ∂M is a smooth (k − 1)-dimensional surface. In such a case, any p ∈ ∂M has a neighborhood U ⊂ S with a coordinate chart φ : O → U , where O is an open neighborhood of 0 in Rk , such that φ(0) = p and φ maps {x ∈ O : x1 = 0} onto U ∩ ∂M . If S is oriented, then M is oriented, and ∂M inherits an orientation, uniquely determined by the following requirement: if (4.3.1)
M = Rk− = {x ∈ Rk : x1 ≤ 0},
then ∂M = {(x2 , . . . , xk )} has the orientation determined by dx2 ∧ · · · ∧ dxk . We can now state the Stokes formula.
4.3. The general Stokes formula
167
Proposition 4.3.1. Given a compactly supported (k − 1)-form β of class C 1 on an oriented k-dimensional surface M (of class C 2 ) with boundary ∂M, with its natural orientation, ∫ ∫ (4.3.2) dβ = β. M
∂M
Proof. Using a partition of unity and invariance of the integral and the exterior derivative under coordinate transformations, it suffices to prove this when M has the form (4.3.1). In that case, we will be able to deduce (4.3.2) from the Fundamental Theorem of Calculus. Indeed, if c j ∧ · · · ∧ dxk , (4.3.3) β = bj (x) dx1 ∧ · · · ∧ dx with bj (x) of bounded support, we have dβ = (−1)j−1
(4.3.4) If j > 1, we have
∫ dβ = (−1)j−1
(4.3.5)
∂bj dx1 ∧ · · · ∧ dxk . ∂xj
∫ {∫
∞
−∞
M
} ∂bj dxj dx′ = 0, ∂xj
and also κ∗ β = 0, where κ : ∂M → M is the inclusion. On the other hand, for j = 1, we have ∫ ∫ {∫ 0 } ∂b1 dx1 dx2 · · · dxk dβ = −∞ ∂x1 M ∫ (4.3.6) = b1 (0, x′ ) dx′ ∫ = β. ∂M
This proves Stokes’ formula (4.3.2).
It is useful to allow singularities in ∂M. We say a point p ∈ M is a corner of dimension ν if there is a neighborhood U of p in M and a C 2 diffeomorphism of U onto a neighborhood of 0 in (4.3.7)
K = {x ∈ Rk : xj ≤ 0, for 1 ≤ j ≤ k − ν},
where k is the dimension of M. If M is a C 2 surface and every point p ∈ ∂M is a corner (of some dimension), we say M is a C 2 surface with corners. In such a case, ∂M is a locally finite union of C 2 surfaces with corners. The following result extends Proposition 4.3.1. Proposition 4.3.2. If M is a C 2 surface of dimension k, with corners, and β is a compactly supported (k − 1)-form of class C 1 on M , then (4.3.2) holds. Proof. It suffices to establish this when β is supported on a small neighborhood of a corner p ∈ ∂M, of the form U described above. Hence it suffices to show that (4.3.2) holds whenever β is a (k − 1)-form of class C 1 , with compact support on K
168
4. Differential forms and the Gauss-Green-Stokes formula
in (4.3.7); and we can take β to have the form (4.3.3). Then, for j > k − ν, (4.3.5) still holds, while, for j ≤ k − ν, we have, as in (4.3.6), ∫ ∫ {∫ 0 } ∂bj j−1 c j · · · dxk dxj dx1 · · · dx dβ = (−1) −∞ ∂xj K ∫ c j · · · dxk (4.3.8) = (−1)j−1 bj (x1 , . . . , xj−1 , 0, xj+1 , . . . , xk ) dx1 · · · dx ∫ = β. ∂K
This completes the proof.
The reason we required M to be a surface of class C 2 (with corners) in Propositions 4.3.1 and 4.3.2 is the following. Due to the formulas (4.1.18)–(4.1.19) for a pull-back, if β is of class C j and F is of class C ℓ , then F ∗ β is generally of class C µ , with µ = min(j, ℓ − 1). Thus, if j = ℓ = 1, F ∗ β might be only of class C 0 , so there is not a well-defined notion of a differential form of class C 1 on a C 1 surface, though such a notion is well defined on a C 2 surface. This problem can be overcome, and one can extend Propositions 4.3.1–4.3.2 to the case where M is a C 1 surface (with corners), and β is a (k −1)-form with the property that both β and dβ are continuous. We will not go into the details. Substantially more sophisticated generalizations are given in [14]. We will mention one useful extension of the scope of Proposition 4.3.2, to surfaces with piecewise smooth boundary that do not satisfy the corner condition. An example is illustrated in Figure 4.3.1. There the point p is a singular point of ∂M that is not a corner, according to the definition using (4.3.7). However, in many cases M can be divided into pieces (M1 amd M2 for the example presented in Figure 4.3.1) and each piece Mj is a surface with corners. Then Proposition 4.3.2 applies to each piece separately: ∫ ∫ (4.3.9) dβ = β, Mj
∂Mj
and one can sum over j to get (4.3.2) in this more general setting. We next apply Proposition 4.3.2 to prove the following special case of the Poincar´e lemma, which will be used in §5.1. Proposition 4.3.3. If α is a 1-form on B = {x ∈ R2 : |x| < 1} and dα = 0, then there exists a real valued u ∈ C ∞ (B) such that α = du. In fact, let us set (4.3.10)
∫ uj (x) =
α, γj (x)
where γj (x) is a path from 0 to x = (x1 , x2 ) which consists of two line segments. The path first goes from 0 to (0, x2 ) and then from (0, x2 ) to x, if j = 1, while if
4.3. The general Stokes formula
169
Figure 4.3.1. Division into surfaces with corners
j = 2 it first goes from 0 to (x1 , 0) and then from (x1 , 0) to x. See Figure 4.3.2. It is easy to verify that ∂uj /∂xj = αj (x). We claim that u1 = u2 , or equivalently that ∫ (4.3.11) α = 0, σ(x)
where σ(x) is a closed path consisting of γ2 (x) followed by γ1 (x), in reverse. In fact, Stokes’ formula, Proposition 4.3.2, implies that ∫ ∫ (4.3.12) α= dα, σ(x)
R(x)
where R(x) is the rectangle whose boundary is σ(x). If dα = 0, then (4.3.12) vanishes, and we have the desired function u : u = u1 = u2 .
170
4. Differential forms and the Gauss-Green-Stokes formula
Figure 4.3.2. Antiderivative of closed 1-form
Exercises 1. In the setting of Proposition 4.3.1, show that ∫ ∂M = ∅ =⇒ dβ = 0. M
2. Consider the region Ω ⊂ R2 defined by Ω = {(x, y) : 0 ≤ y ≤ x2 , 0 ≤ x ≤ 1}. Show that the boundary points (1, 0) and (1, 1) are corners, but (0, 0) is not a corner. The boundary of Ω is too sharp at (0, 0) to be a corner; it is called a “cusp.” Extend Proposition 4.3.2. to treat this region. 3. Suppose U ⊂ Rn is an open set with smooth boundary M = ∂U, and U has the standard orientation, determined by dx1 ∧ · · · ∧ dxn . (See the paragraph above (4.1.23).) Let φ ∈ C 1 (Rn ) satisfy φ(x) < 0 for x ∈ U, φ(x) > 0 for x ∈ Rn \ U , and grad φ(x) ̸= 0 for x ∈ ∂U, so grad φ points out of U. Show that the natural orientation on ∂U, as defined just before Proposition 4.3.1, is the same as the
Exercises
171
following. The equivalence class of forms β ∈ Λn−1 (∂U ) defining the orientation on ∂U satisfies the property that dφ ∧ β is a positive multiple of dx1 ∧ · · · ∧ dxn , on ∂U. 4. Suppose U = {x ∈ Rn : xn < 0}. Show that the orientation on ∂U described above is that of (−1)n−1 dx1 ∧ · · · ∧ dxn−1 . If V = {x ∈ Rn : xn > 0}, what orientation does ∂V inherit? 5. Extend the special case of Poincar´e’s Lemma given in Proposition 4.3.3 to the case where α is a closed 1-form on B = {x ∈ Rn : |x| < 1}, i.e., from the case dim B = 2 to higher dimensions. 6. Define β ∈ Λn−1 (Rn ) by β=
n ∑
c j ∧ · · · ∧ dxn . (−1)j−1 xj dx1 ∧ · · · ∧ dx
j=1
Let Ω ⊂ Rn be a smoothly bounded compact subset. Show that ∫ 1 β = Vol(Ω). n ∂Ω
7. In the setting of Exercise 6, show that if f ∈ C 1 (Ω), then ∫ ∫ f β = (Rf + nf ) dx, Ω
∂Ω
where Rf =
n ∑
xj
j=1
∂f . ∂xj
8. In the setting of Exercises 6–7, and with S n−1 ⊂ Rn the unit sphere, show that ∫ ∫ fβ = f dS. S n−1
S n−1
Hint. be the unit ball, and define φ : B → S n−1 by φ(x′ ) = √ Let B ⊂ R ′ ′ 2 (x , 1 − |x | ). Compute φ∗ β. Compare surface area formulas derived in §3.2. n−1
j
Another approach. The unit sphere S n−1 ,→ Rn has a volume form (cf. (4.4.13)), it must be a scalar multiple g(j ∗ β), and, by Exercise 10 of §4.2, g must be constant. Then Exercise 6 identifies this constant, in light of results from §3.2. See the exercises in §4.4 for more on this. 9. Given β as in Exercise 6. show that the (n − 1)-form ω = |x|−n β on Rn \ 0 is closed. Use Exercise 6 to show that exact.
∫ S n−1
ω ̸= 0, and hence ω is not
172
4. Differential forms and the Gauss-Green-Stokes formula
10. Let Ω ⊂ Rn be a compact, smoothly bounded subset. Take ω as in Exercise 9. Show that ∫ ω = An−1 if 0 ∈ Ω, ∂Ω
if 0 ∈ / Ω.
0
4.4. The classical Gauss, Green, and Stokes formulas The case of (4.3.1) where S = Ω is a region in R2 with smooth boundary yields the classical Green Theorem. In this case, we have ( ∂g ∂f ) (4.4.1) β = f dx + g dy, dβ = − dx ∧ dy, ∂x ∂y and hence (4.3.1) becomes the following Proposition 4.4.1. If Ω is a region in R2 with smooth boundary, and f and g are smooth functions on Ω, which vanish outside some compact set in Ω, then ∫∫ ( ∫ ∂f ) ∂g − (4.4.2) dx dy = (f dx + g dy). ∂x ∂y Ω
∂Ω
Note that, if we have a vector field X = X1 ∂/∂x + X2 ∂/∂y on Ω, then the integrand on the left side of (4.4.2) is ∂X1 ∂X2 + = div X, ∂x ∂y provided g = X1 , f = −X2 . We obtain ∫∫ ∫ ( (4.4.4) (div X) dx dy = −X2 dx + X1 dy).
(4.4.3)
Ω
∂Ω
( ) If ∂Ω is parametrized by arc-length, as γ(s) = x(s), y(s) , with orientation as defined for Proposition 4.3.1, )then the unit normal ν, to ∂Ω, pointing out of Ω, is ( given by ν(s) = y ′ (s), −x′ (s) , and (4.4.4) is equivalent to ∫∫ ∫ (4.4.5) (div X) dx dy = ⟨X, ν⟩ ds. Ω
∂Ω
This is a special case of Gauss’ Divergence Theorem. We now derive a more general form of the Divergence Theorem. We begin with a definition of the divergence of a vector field on a surface M. Let M be a region in Rn , or an n-dimensional surface in Rn+k , provided with a volume form (4.4.6)
ωM ∈ Λn M.
Let X be a vector field on M. Then the divergence of X, denoted div X, is a function on M given by (4.4.7)
(div X) ωM = d(ωM ⌋X).
4.4. The classical Gauss, Green, and Stokes formulas
173
If M = Rn , with the standard volume element ω = dx1 ∧ · · · ∧ dxn ,
(4.4.8) and if (4.4.9)
X=
∑
X j (x)
∂ , ∂xj
then (4.4.10)
ω⌋X =
n ∑
dj ∧ · · · ∧ dxn . (−1)j−1 X j (x)dx1 ∧ · · · ∧ dx
j=1
Hence, in this case, (4.4.7) yields the familiar formula (4.4.11)
div X =
n ∑
∂j X j ,
j=1
where we use the notation (4.4.12)
∂j f =
∂f . ∂xj
Suppose now that M is endowed with both an orientation and a metric tensor gjk (x). Then M carries a natural volume element ωM , determined by the condition that, if one has an orientation-preserving coordinate system in which gjk (p0 ) = δjk , then ωM (p0 ) = dx1 ∧ · · · ∧ dxn . This condition produces the following formula, in any orientation-preserving coordinate system: √ (4.4.13) ωM = g dx1 ∧ · · · ∧ dxn , g = det(gjk ), by the same sort of calculations as done in (3.2.10)–(3.2.15). We now compute div X when the volume element on M is given by (4.4.13). We have ∑ √ dj ∧ · · · ∧ dxn (4.4.14) ωM ⌋X = (−1)j−1 X j g dx1 ∧ · · · ∧ dx j
and hence (4.4.15)
√ d(ωM ⌋X) = ∂j ( gX j ) dx1 ∧ · · · ∧ dxn .
Here, as below, we use the summation convention. Hence the formula (4.4.7) gives (4.4.16)
div X = g −1/2 ∂j (g 1/2 X j ).
Compare (3.2.56). We now derive the Divergence Theorem, as a consequence of Stokes’ formula, which we recall is ∫ ∫ (4.4.17) dα = α, M
∂M
for an (n − 1)-form on M , assumed to be a smooth compact oriented surface with boundary. If α = ωM ⌋X, formula (4.4.7) gives ∫ ∫ (4.4.18) (div X) ωM = ωM ⌋X. M
∂M
174
4. Differential forms and the Gauss-Green-Stokes formula
This is one form of the Divergence Theorem. We will produce an alternative expression for the integrand on the right before stating the result formally. Given that ωM is the volume form for M determined by a Riemannian metric, we can write the interior product ωM ⌋X in terms of the volume element ω∂M on ∂M, with its induced orientation and Riemannian metric, as follows. Pick coordinates on M, centered at p0 ∈ ∂M, such that ∂M is tangent to the hyperplane {x1 = 0} at p0 = 0 (with M to the left of ∂M ), and such that gjk (p0 ) = δjk , so ωM (p0 ) = dx1 ∧ · · · ∧ dxn . Consequently, ω∂M (p0 ) = dx2 ∧ · · · ∧ dx2 . It follows that, at p0 , j ∗ (ωM ⌋X) = ⟨X, ν⟩ω∂M ,
(4.4.19)
where ν is the unit vector normal to ∂M, pointing out of M and j : ∂M ,→ M the natural inclusion. The two sides of (4.4.19), which are both defined in a coordinate independent fashion, are hence equal on ∂M, and the identity (4.4.18) becomes ∫ ∫ (4.4.20) (div X) ωM = ⟨X, ν⟩ ω∂M . M
∂M
Finally, we adopt the notation of the sort used in §§3.1–3.2. We denote the volume element on M by dV and that on ∂M by dS, obtaining the Divergence Theorem: Theorem 4.4.2. If M is a compact surface with boundary, X a smooth vector field on M , then ∫ ∫ ⟨X, ν⟩ dS, (4.4.21) (div X) dV = M
∂M
where ν is the unit outward-pointing normal to ∂M. The only point left to mention here is that M need not be orientable. Indeed, we can treat the integrals in (4.4.21) as surface integrals, as in §3.2, and note that all objects in (4.4.21) are independent of a choice of orientation. To prove the general case, just use a partition of unity supported on orientable pieces. We obtain some further integral identities. First, we apply (4.4.21) with X replaced by uX. We have the following “derivation” identity: div uX = u div X + ⟨du, X⟩ = u div X + Xu,
(4.4.22)
which follows easily from the formula (4.4.16). The Divergence Theorem immediately gives ∫ ∫ ∫ (4.4.23) (div X)u dV + Xu dV = ⟨X, ν⟩u dS. M
M
∂M
Replacing u by uv and using the derivation identity X(uv) = (Xu)v + u(Xv), we have ∫ ∫ ∫ [ ] (4.4.24) (Xu)v + u(Xv) dV = − (div X)uv dV + ⟨X, ν⟩uv dS. M
M
∂M
4.4. The classical Gauss, Green, and Stokes formulas
175
It is very useful to apply (4.4.23) to a gradient vector field X. If v is a smooth function on M, grad v is a vector field satisfying ⟨grad v, Y ⟩ = ⟨Y, dv⟩,
(4.4.25)
for any vector field Y, where the brackets on the left are given by the metric tensor on M and those on the right by the natural pairing of vector fields and 1-forms. Hence grad v = X has components X j = g jk ∂k v, where (g jk ) is the matrix inverse of (gjk ). Applying div to grad v defines the Laplace operator: ( ) (4.4.26) ∆v = div grad v = g −1/2 ∂j g jk g 1/2 ∂k v . When M is a region in Rn and we use the standard Euclidean metric, so div X is given by (4.4.11), we have the Laplace operator on Euclidean space: (4.4.27)
∆v =
∂2v ∂2v + ··· + . 2 ∂x1 ∂x2n
Now, setting X = grad v in (4.4.23), we have Xu = ⟨grad u, grad v⟩, and ⟨X, ν⟩ = ⟨ν, grad v⟩, which we call the normal derivative of v, and denote ∂v/∂ν. Hence ∫ ∫ ∫ ∂v dS. (4.4.28) u(∆v) dV = − ⟨grad u, grad v⟩ dV + u ∂ν M
M
∂M
If we interchange the roles of u and v and subtract, we have ∫ ∫ ∫ [ ∂u ] ∂v − v dS. (4.4.29) u(∆v) dV = (∆u)v dV + u ∂ν ∂ν M
M
∂M
Formulas (4.4.28)–(4.4.29) are also called Green formulas. We will make further use of them in §5.1. We return to the Green formula (4.4.2), and give it another formulation. Consider a vector field Z = (f, g, h) on a region in R3 containing the planar surface U = {(x, y, 0) : (x, y) ∈ Ω}. If we form i j k (4.4.30) curl Z = det ∂x ∂y ∂z f g h we see that the integrand on the left side of (4.4.2) is the k-component of curl Z, so (4.4.2) can be written ∫∫ ∫ (4.4.31) (curl Z) · k dA = (Z · T ) ds, U
∂U
where T is the unit tangent vector to ∂U. To see how to extend this result, note that k is a unit normal field to the planar surface U. To formulate and prove the extension of (4.4.31) to any compact oriented surface with boundary in R3 , we use the relation between curl and exterior derivative discussed in Exercises 2–3 of §4.2. In particular, if we set (4.4.32)
F =
3 ∑ j=1
fj (x)
∂ , ∂xj
φ=
3 ∑ j=1
fj (x) dxj ,
176
then curl F = (4.4.33)
4. Differential forms and the Gauss-Green-Stokes formula ∑3
1 gj (x)
∂/∂xj where
dφ = g1 (x) dx2 ∧ dx3 + g2 (x) dx3 ∧ dx1 + g3 (x) dx1 ∧ dx2 .
Now Suppose M is a smooth oriented (n−1)-dimensional surface with boundary in Rn . Using the orientation of M, we pick a unit normal field N to M as follows. Take a smooth function v which vanishes on M but such that ∇v(x) ̸= 0 on M. Thus ∇v is normal to M. Let σ ∈ Λn−1 (M ) define the orientation of M. Then dv ∧ σ = a(x) dx1 ∧ · · · ∧ dxn , where a(x) is nonvanishing on M. For x ∈ M, we take N (x) = ∇v(x)/|∇v(x)| if a(x) > 0, and N (x) = −∇v(x)/|∇v(x)| if a(x) < 0. We call N the “positive” unit normal field to the oriented surface M, in this case. Part of the motivation for this characterization of N is that, if Ω ⊂ Rn is an open set with smooth boundary M = ∂Ω, and we give M the induced orientation, as described in §4.3, then the positive normal field N just defined coincides with the unit normal field pointing out of Ω. Compare Exercises 2–3 of §4.3. Now, if G = (g1 , . . . , gn ) is a vector field defined near M, then ∫ ∫ (∑ n ) c j · · · ∧ dxn . (4.4.34) (N · G) dS = (−1)j−1 gj (x) dx1 ∧ · · · dx M
M
j=1
This result follows from (4.4.19). When n = 3 and G = curl F, we deduce from (4.4.32)–(4.4.33) that ∫∫ ∫∫ (4.4.35) dφ = (N · curl F ) dS. M
M
Furthermore, in this case we have ∫ ∫ φ= (F · T ) ds, (4.4.36) ∂M
∂M
where T is the unit tangent vector to ∂M, specied as follows by the orientation of ∂M ; if τ ∈ Λ1 (∂M ) defines the orientation of ∂M, then ⟨T, τ ⟩ > 0 on ∂M. We call T the “forward” unit tangent vector field to the oriented curve ∂M. By the calculations above, we have the classical Stokes formula: Proposition 4.4.3. If M is a compact oriented surface with boundary in R3 , and F is a C 1 vector field on a neighborhood of M , then ∫∫ ∫ (4.4.37) (N · curlF ) dS = (F · T ) ds, M
∂M
where N is the positive unit normal field on M and T the forward unit tangent field to ∂M. Remark. The right side of (4.4.37) is called the circulation of F about ∂M . Proposition 4.4.3 shows how curl F arises to measure this circulation. Direct proof of the Divergence Theorem
4.4. The classical Gauss, Green, and Stokes formulas
177
Let Ω be a bounded open subset of Rn , with a C 1 smooth boundary ∂Ω. Hence, for each p ∈ ∂Ω, there is a neighborhood U of p in Rn , a rotation of coordinate axes, and a C 1 function u : O → R, defined on an open set O ⊂ Rn−1 , such that Ω ∩ U = {x ∈ Rn : xn ≤ u(x′ ), x′ ∈ O} ∩ U, where x = (x′ , xn ), x′ = (x1 , . . . , xn−1 ). We aim to prove that, given f ∈ C 1 (Ω), and any constant vector e ∈ Rn , ∫ ∫ (4.4.38) e · ∇f (x) dx = (e · N )f dS, Ω
∂Ω
where dS is surface measure on ∂Ω and N (x) is the unit normal to ∂Ω, pointing out of Ω. At x = (x′ , u(x′ )) ∈ ∂Ω, we have N = (1 + |∇u|2 )−1/2 (−∇u, 1).
(4.4.39)
To prove (4.4.38), we may as well suppose f is supported in such a neighborhood U. Then we have ∫ ( ∫ ∫ ) ∂f ∂n f (x′ , xn ) dxn dx′ dx = ∂xn O
Ω
∫
=
(4.4.40)
xn ≤u(x′ )
( ) f x′ , u(x′ ) dx′
O
∫
(en · N )f dS.
= ∂Ω
The first identity in (4.4.40) follows from Theorem 3.1.9, the second identity from the Fundamental Theorem of Calculus, and the third identity from the identification ( )1/2 ′ dS = 1 + |∇u|2 dx , established in (3.2.22). We use the standard basis {e1 , . . . , en } of Rn . Such an argument works when en is replaced by any constant vector e with the property that we can represent ∂Ω ∩ U as the graph of a function yn = u ˜(y ′ ), with the yn -axis parallel to e. In particular, it works for e = en + aej , for 1 ≤ j ≤ n − 1 and for |a| sufficiently small. Thus, we have ∫ ∫ (4.4.41) (en + aej ) · ∇f (x) dx = (en + aej ) · N f dS. Ω
∂Ω
If we subtract (4.4.40) from this and divide the result by a, we obtain (9.38) for e = ej , for all j, and hence (4.4.38) holds in general. Note that replacing e by ej and f by fj in (4.4.38), and summing over 1 ≤ j ≤ n, yields ∫ ∫ (4.4.42) (div F ) dx = N · F dS, Ω
∂Ω
for the vector field F = (f1 , . . . , fn ). This is the usual statement of Gauss’ Divergence Theorem, as given in Theorem 4.4.2 (specialized to domains in Rn ).
178
4. Differential forms and the Gauss-Green-Stokes formula
Reversing the argument leading from (4.4.2) to (4.4.5), we also have another proof of Green’s Theorem, in the form (4.4.2).
Exercises 1. Newton’s equation md2 x/dt2 = −∇V (x) for the motion in Rn of a body of mass m, in a potential force field F = −∇V , can be converted to a first-order system for (x, ξ), with ξ = mx. One gets d (x, ξ) = Hf (x, ξ), dt where Hf is a “Hamiltonian vector field” on R2n , given by n [ ∑ ∂f ∂ ] ∂f ∂ − . Hf = ∂ξj ∂xj ∂xj ∂ξj j=1
In the case described above, f (x, ξ) =
1 |ξ|2 + V (x). 2m
Calculate div Hf from (4.4.11). t . 2. Let X be a smooth vector field on a smooth surface M , generating a flow FX t Let O ⊂ M be a compact, smoothly bounded subset, and set Ot = FX (O). As seen in Proposition 3.2.7, ∫ d (4.4.43) Vol(Ot ) = (div X) dV. dt Ot
Use the Divergence Theorem to deduce from this that ∫ d (4.4.44) Vol(Ot ) = ⟨X, ν⟩ dS. dt ∂Ot
Remark. Conversely, a direct proof of (4.4.44), together with the Divergence Theorem, would lead to another proof of (4.4.43). 3. Show that, if F : R3 → R3 is a linear rotation, then, for a C 1 vector field Z on R3 , (4.4.45)
F# (curl Z) = curl(F# Z).
4. Let M be the graph in R3 of a smooth function, z = u(x, y), (x, y) ∈ O ⊂ R2 , a
Exercises
179
bounded region with smooth boundary (maybe with corners). Show that ∫ ∫ ∫ [( ∂F2 )( ∂u ) ( ∂F1 ∂F3 )( ∂u ) ∂F3 − − + − − (curl F · N ) dS = ∂y ∂z ∂x ∂z ∂x ∂y O (4.4.46) M ( ∂F ∂F1 )] 2 + − dx dy, ∂x ∂y ( ) where ∂Fj /∂x and ∂Fj /∂y are evaluated at x, y, u(x, y) . Show that ∫ ∫ ( ( ∂u ) ∂u ) (4.4.47) (F · T ) ds = Fe1 + Fe3 dx + Fe2 + Fe3 dy, ∂x ∂y ∂M
∂O
( ) where Fej (x, y) = Fj x, y, u(x, y) . Apply Green’s Theorem, with f = Fe1 +Fe3 (∂u/∂x), g = Fe2 + Fe3 (∂u/∂y), to show that the right sides of (4.4.46) and (4.4.47) are equal, hence proving Stokes’ Theorem in this case. 5. Let M ⊂ Rn be the graph of a function xn = u(x′ ), x′ = (x1 , . . . , xn−1 ). If β=
n ∑
c j ∧ · · · ∧ dxn , (−1)j−1 gj (x) dx1 ∧ · · · ∧ dx
j=1
( ) as in (4.4.34), and φ(x′ ) = x′ , u(x′ ) , show that n−1 ∑ ( ( ) ) ∂u − gn x′ , u(x′ ) dx1 ∧ · · · ∧ dxn−1 φ∗ β = (−1)n gj x′ , u(x′ ) ∂x j j=1 = (−1)n−1 G · (−∇u, 1) dx1 ∧ · · · ∧ dxn−1 , where G = (g1 , . . . , gn ), and verify the identity (4.4.34) in this case. Hint. For the last part, recall Exercises 2–3 of §4.3, regarding the orientation of M. 6. Let S be a smooth oriented 2-dimensional surface in R3 , and M an open subset of S, with smooth boundary; see Figure 4.4.1. Let N be the positive unit normal field to S, defined by its orientation. For x ∈ ∂M, let ν(x) be the unit vector, tangent to M, normal to ∂M, and pointing out of M, and let T be the forward unit tangent vector field to ∂M. Show that, on ∂M, N × ν = T,
ν × T = N.
7. If M is an oriented (n − 1)-dimensional surface in Rn , with positive unit normal field N , show that the volume element ωM on M is given by ωM = ω⌋N, where ω = dx1 ∧ · · · ∧ dxn is the standard volume form on Rn . Deduce that the volume element on the unit sphere S n−1 ⊂ Rn is given by n ∑ dj ∧ · · · ∧ dxn , ωS n−1 = (−1)j−1 xj dx1 ∧ · · · ∧ dx j=1
if S
n−1
inherits the orientation as the boundary of the unit ball.
180
4. Differential forms and the Gauss-Green-Stokes formula
Figure 4.4.1. Setup for classical Stokes formula
8. Let M be a C k surface, k ≥ 2. Suppose φ : M → M is a C k isometry, i.e., it preserves the metric tensor. Taking φ∗ u(x) = u(φ(x)) for u ∈ C 2 (M ), show that ∆φ∗ u = φ∗ ∆u.
(4.4.48)
Hint. The Laplace operator is uniquely specified by the metric tensor on M , via (4.4.26). 9. Let X and Y be smooth vector fields on an open set Ω ⊂ R3 . Show that Y · curl X − X · curl Y = div(X × Y ). 10. In the setting of Exercise 9, assume Ω is compact and smoothly bounded, and that X and Y are C 1 on Ω. Show that ∫ ∫ X · curl Y dx = Y · curl X dx, Ω
provided either (a) X is normal to ∂Ω, or (b) X is parallel to Y on ∂Ω.
Ω
Exercises
181
11. Recall the formula (3.2.26) for the metric tensor of Rn in spherical polar coordinates R : (0, ∞) × S n−1 → Rn , R(r, ω) = rω. Using (4.4.26), show that if u ∈ C 2 (Rn ), then ∂2 n−1 ∂ 1 u(rω) + u(rω) + 2 ∆S u(rω), 2 ∂r r ∂r r where ∆S is the Laplace operator on S n−1 . Deduce that n−1 ′ (4.4.50) u(x) = f (|x|) =⇒ ∆u(rω) = f ′′ (r) + f (r). r (4.4.49)
∆u(rω) =
12. Show that (4.4.51)
|x|−(n−2) is harmonic on Rn \ 0.
In case n = 2, show that (4.4.52)
log |x| is harmonic on R2 \ 0.
In Exercise 13, we take n ≥ 3 and consider ∫ 1 f (y) Gf (x) = dy Cn |x − y|n−2 Rn ∫ 1 f (x − y) = dy, Cn |y|n−2 Rn
with Cn = −(n − 2)An−1 . 13. Assume f ∈ C02 (Rn ). Let Ωε = Rn \ Bε , where Bε = {x ∈ Rn : |x| < ε}. Verify that ∫ Cn ∆Gf (0) = lim ∆f (x) · |x|2−n dx ε→0
Ωε
∫
= lim
ε→0 Ωε
= − lim
[∆f (x) · |x|2−n − f (x)∆|x|2−n ] dx ∫ [
ε→0
ε2−n
] ∂f − (2 − n)ε1−n f dS ∂r
∂Bε
= −(n − 2)An−1 f (0), using (4.4.29) for the third identity. Use this to show that ∆Gf (x) = f (x). 14. Work out the analogue of Exercise 13 in case n = 2 and ∫ 1 f (y) log |x − y| dy. Gf (x) = 2π R2
182
4. Differential forms and the Gauss-Green-Stokes formula
4.5. Differential forms and the change of variable formula The change of variable formula for one-variable integrals, ∫ t ∫ φ(t) ( ) (4.5.1) f φ(x) φ′ (x) dx = f (x) dx, a
φ(a)
given f continuous and φ of class C 1 , is easily established, via the fundamental theorem of calculus and the chain rule. We recall how this was done in §1.1. If we denote the left side of (4.5.1) by A(t) and the right by B(t), we apply these results to get ( ) (4.5.2) A′ (t) = f φ(t) φ′ (t) = B ′ (t), and since A(a) = B(a) = 0, another application of the fundamental theorem of calculus (or simply the mean value theorem) gives A(t) = B(t). For multiple integrals, the change of variable formula takes the following form, given in Proposition 3.1.14: Theorem 4.5.1. Let O, Ω be connected, open sets in Rn and let φ : O → Ω be a C 1 diffeomorphism. Given f continuous on Ω, with compact support, we have ∫ ∫ ( ) (4.5.3) f φ(x) det Dφ(x) dx = f (x) dx. O
Ω
There are many variants of Theorem 4.5.1. In particular one wants to extend the class of functions f for which (4.5.3) holds, but once one has Theorem 4.5.1 as stated, such extensions are relatively painless. See the derivation of Theorem 3.1.15. Let’s face it; the proof of Theorem 4.5.1 given in §3.1 was a grim affair, involving careful estimates of volumes of images of small cubes under the map φ and numerous pesky details. In more recent times, P. Lax [32] found a fresh approach to the proof of the multidimensional change of variable formula. More precisely, [32] established the following result, from which Theorem 4.5.1 is a consequence. Theorem 4.5.2. Let φ : Rn → Rn be a C 1 map. Assume φ(x) = x for |x| ≥ R. Let f be a continuous function on Rn with compact support. Then ∫ ∫ ( ) (4.5.4) f φ(x) det Dφ(x) dx = f (x) dx. We will give a variant of the proof of [32]. One difference between this proof and that of [32] is that we use the language of differential forms. Proof of Theorem 4.5.2. Via standard approximation arguments, it suffices to prove this when φ is C 2 and f ∈ C01 (Rn ), which we will assume from here on. To begin, pick A > 0 such that f (x − Ae1 ) is supported in {x : |x| > R}, where e1 = (1, 0, . . . , 0). Also take A large enough that the image of {x : |x| ≤ R} under φ does not intersect the support of f (x − Ae1 ). We can set (4.5.5)
F (x) = f (x) − f (x − Ae1 ) =
∂ψ (x), ∂x1
4.5. Differential forms and the change of variable formula
where
∫
(4.5.6)
183
A
f (x − se1 ) ds,
ψ(x) =
ψ ∈ C01 (Rn ).
0
Then we have the following identities involving n-forms: ∂ψ dx1 ∧ · · · ∧ dxn α = F (x) dx1 ∧ · · · ∧ dxn = ∂x1 (4.5.7) = dψ ∧ dx2 ∧ · · · ∧ dxn = d(ψ dx2 ∧ · · · ∧ dxn ), i.e., α = dβ, with β = ψ dx2 ∧ · · · ∧ dxn a compactly supported (n − 1)-form of class C 1 . Now the pull-back of α under φ is given by ( ) (4.5.8) φ∗ α = F φ(x) det Dφ(x) dx1 ∧ · · · ∧ dxn . Furthermore, the right side of (4.5.8) is equal to ( ) (4.5.9) f φ(x) det Dφ(x) dx1 ∧ · · · ∧ dxn − f (x − Ae1 ) dx1 ∧ · · · ∧ dxn . Hence we have
∫
(4.5.10)
∫ ( ) f φ(x) det Dφ(x) dx1 · · · dxn − f (x) dx1 · · · dxn ∫ ∫ ∫ ∗ ∗ = φ α = φ dβ = d(φ∗ β),
where we use the general identity φ∗ dβ = d(φ∗ β),
(4.5.11)
a consequence of the chain rule. On the other hand, a very special case of Stokes’ theorem applies to ∑ dj ∧ · · · ∧ dxn , (4.5.12) φ∗ β = γ = γj (x) dx1 ∧ · · · ∧ dx j
with γj ∈ (4.5.13)
C01 (Rn ).
Namely dγ =
∑ j
(−1)j−1
∂γj dx1 ∧ · · · ∧ dxn , ∂xj
and hence, by the fundamental theorem of calculus, ∫ (4.5.14) dγ = 0. This gives the desired identity (4.5.4), from (4.5.10).
We make some remarks on Theorem 4.5.2. Note that φ is not assumed to be one-to-one or onto. In fact, as noted in [32], the identity (4.5.4) implies that such φ must be onto, and this has important topological implications. We mention that, if one puts absolute values around det Dφ(x) in (4.5.4), the appropriate formula is ∫ ∫ ( ) (4.5.15) f φ(x) det Dφ(x) dx = f (x) n(x) dx, where n(x) = #{y : φ(y) = x}. A proof of (4.5.15) can be found in texts on geometrical measure theory.
184
4. Differential forms and the Gauss-Green-Stokes formula
As noted in [32], Theorem 4.5.2 was proven in [4]. The proof there makes use of differential forms and Stokes’ theorem, but it is quite different from the proof given here. A crucial difference is that the proof in [4] requires that one knows the change of variable formula as formulated in Theorem 4.5.1. We now show how Theorem 4.5.1 can be deduced from Theorem 4.5.2. We will use the following lemma. Lemma 4.5.3. In the setting of Theorem 4.5.1, and with det Dφ > 0, given p ∈ Ω, there exists a neighborhood U of p and a C 1 map Φ : Rn → Rn such that Φ = φ on φ−1 (U ),
(4.5.16)
Φ(x) = x for |x| large,
and Φ(x) ∈ U =⇒ x ∈ φ−1 (U ).
(4.5.17)
Granted the lemma, we proceed as follows. Assume det Dφ > 0 on O. Given f ∈ C(Ω), supp f ⊂ K, compact in Ω, cover K with a finite number of subsets Uj as∑in Lemma 4.5.3, and, using a continuous partition of unity (cf. §3.3), write f = j fj , supp fj ⊂ Uj . Also, let Φj have the obvious significance. By Theorem 4.5.2, we have ∫ ∫ (4.5.18) fj (Φj (x)) det DΦj (x) dx = fj dx. But we also have (4.5.19)
∫
∫ fj (Φj (x)) det DΦj (x) dx =
fj (φ(x)) det Dφ(x) dx. O
Now summing over j gives (4.5.3). If we do not have det Dφ > 0 on O, then det Dφ < 0 on O. In this case, one can compose with the map κ : Rn −→ Rn ,
(4.5.20)
κ(x1 , x′ ) = (−x1 , x′ ),
(for which Theorem 4.5.1 is elementary) and readily recover the desired result. We turn to the proof of Lemma 4.5.3. Say q = φ−1 (p), Dφ(q) = A ∈ Gℓ+ (n, R), i.e., A ∈ Gℓ(n, R) and det A > 0. Translating coordinates, we can assume p = q = 0. We set Ψ(x) = β(x)φ(x) + (1 − β(x))Ax,
(4.5.21) C0∞ (Rn )
where β ∈ has support in a small neighborhood of q and β ≡ 1 on a smaller neighborhood V = φ−1 (U ), chosen so that we can apply Corollary 2.2.4, to deduce that (4.5.22)
Ψ maps Rn diffeomorphically onto its image, an open set in Rn .
In fact, estimates behind the proof of Proposition 2.2.2 imply that, for appropriately chosen β, there exists b > 0 such that |Ψ(x) − Ψ(y)| ≥ b|x − y| for all x, y ∈ Rn . Hence the image Ψ(Rn ) is closed in Rn , as well as open, so actually Ψ maps Rn diffeomorphically onto Rn .
4.5. Differential forms and the change of variable formula
185
Note that Ψ = φ on V = φ−1 (U ). We want to alter Ψ(x) for large |x| to obtain Φ, satisfying (4.5.16)–(4.5.17). To do this, we use the fact that Gℓ+ (n, R) is connected (see Proposition 3.2.14). Pick a smooth path Γ : [0, 1] → Gℓ+ (n, R) such that Γ(t) = A for t ∈ [0, 1/4] and Γ(t) = I for t ∈ [3/4, 1]. Let (4.5.23)
M = sup ∥Γ(t)−1 ∥,
so |Γ(t)x| ≥ M −1 |x|, ∀ x ∈ Rn .
0≤t≤1
Now assume U ⊂ BR1 = {x ∈ Rn : |x| < R1 }, so Ψ(V ) ⊂ BR1 . Next, take R2 so large that V = φ−1 (U ) ⊂ BR2 and (4.5.24)
|x| ≥ R2 =⇒ |Ψ(x)| > M R1 and Ψ(x) = Ax.
Now set Φ(x) = Ψ(x) (4.5.25)
for |x| ≤ R2 ,
Γ(t)x for |x| = R2 + t, 0 ≤ t ≤ 1, x
for |x| ≥ R2 + 1.
This map has the properties (4.5.16)–(4.5.17).
Chapter 5
Applications of the Gauss-Green-Stokes formula
In this chapter we present two major types of applications of the theory of differential forms developed in Chapter 4. The first set of applications, given in §5.1, deals with complex function theory. If Ω ⊂ C is an open set, a C 1 function f : Ω → C is said to be holomorphic if it is complex differentiable, of equivalently if it satisfies a set of equations called the Cauchy-Riemann equations. We deduce from Green’s theorem that if Ω is a smoothly bounded domain and f ∈ C 1 (Ω) is holomorphic on Ω, then we have the Cauchy integral theorem, ∫ f (z) dz = 0, (5.0.1) ∂Ω
and the Cauchy integral formula, (5.0.2)
f (z0 ) =
1 2πi
∫ ∂Ω
f (z) dz, z − z0
z0 ∈ Ω.
These key results lead to further results on holomorphic functions, such as power series developments. In §5.1 we also consider functions on domains Ω ⊂ Rn that are harmonic, and use Gauss-Green formulas to establish results about such functions, such as mean value properties, and Liouville’s theorem, which states that a bounded harmonic on all of Rn must be constant. These results specialize to holomorphic functions on C. One consequence is the fundamental theorem of algebra, which states that if p(z) is a nonconstant polynomial on C, it must have a complex root. The second set of applications, given in §§5.2–5.3, yields important results on the topological behavior of smooth maps on regions in Rn , and on surfaces and more generally on manifolds. A central notion here is that of degree theory. If X 187
188
5. Applications of the Gauss-Green-Stokes formula
is a smooth, compact, oriented, n-dimensional surface, and F : X → Y is a smooth map to a compact, connected, oriented, n-dimensional surface Y , then the degree of F is given by ∫ (5.0.3) Deg(F ) = F ∗ ω, ∫
X
where ω is an n-form on Y such that Y ω = 1. That this is well-defined, independent of the choice of such ω, is a consequence of the fundamental exactness criterion, given in Proposition 5.3.6, which ∫ says a smooth n-form α on Y is exact, i.e., has the form α = dβ, if and only if Y α = 0. With this, we are able to develop degree theory as a powerful tool. Applications range from the Brouwer fixed-point theorem (actually arising here as a precursor to degree theory) and the JordanBrouwer separation theorem (in the smooth case) to a degree-theory proof of the fundamental theorem of algebra. We also consider on a compact surface M a vector field X with nondegenerate critical points, define the index of such a vector field, and show that (5.0.4)
Index X = χ(M )
is independent of the choice of such a vector field. This defines an invariant χ(M ), called the Euler characteristic. Investigations of χ(M ) will play an important role in Chapter 6.
5.1. Holomorphic functions and harmonic functions Let f be a complex-valued C 1 function on a region Ω ⊂ R2 . We identify R2 with C, via z = x + iy, and write f (z) = f (x, y). We say f is holomorphic on Ω provided it is complex differentiable, in the sense that (5.1.1)
lim
h→0
1 (f (z + h) − f (z)) exists, h
for each z ∈ Ω. When this limit exists, we denote it f ′ (z), or df /dz. An equivalent condition (given f ∈ C 1 ) is that f satisfies the Cauchy-Riemann equation: ∂f 1 ∂f = . ∂x i ∂y
(5.1.2) In such a case, (5.1.3)
f ′ (z) =
∂f 1 ∂f (z) = (z). ∂x i ∂y
Note that f (z) = z has this property, but f (z) = z does not. The following is a convenient tool for producing more holomorphic functions. Lemma 5.1.1. If f and g are holomorphic on Ω, so is f g. Proof. We have (5.1.4)
∂ ∂f ∂g (f g) = g+f , ∂x ∂x ∂x
∂ ∂f ∂g (f g) = g+f , ∂y ∂y ∂y
5.1. Holomorphic functions and harmonic functions
189
so if f and g satisfy the Cauchy-Riemann equation, so does f g. Note that (5.1.5)
d (f g)(z) = f ′ (z)g(z) + f (z)g ′ (z). dz
Using Lemma 5.1.1, one can show inductively that if k ∈ N, z k is holomorphic on C, and d k z = kz k−1 . dz Also, a direct analysis of (5.1.1) gives this for k = −1, on C \ 0, and then an inductive argument gives it for each negative integer k, on C \ 0. The exercises explore various other important examples of holomorphic functions. (5.1.6)
Our goal in this section is to show how Green’s theorem can be used to establish basic results about holomorphic functions on domains in C (and also develop a study of harmonic functions on domains in Rn ). In Theorems 5.1.2–5.1.4, Ω will be a bounded domain with piecewise smooth boundary, and we assume Ω can be partitioned into a finite number of C 2 domains with corners, as defined in §4.3. To begin, we apply Green’s theorem to the line integral ∫ ∫ f dz = f (dx + i dy). ∂Ω
∂Ω
Clearly (4.4.2) applies to complex-valued functions, and if we set g = if, we get ∫ ∫∫ ( ∂f ) ∂f − (5.1.7) f dz = dx dy. i ∂x ∂y ∂Ω
Ω
Whenever f is holomorphic, the integrand on the right side of (5.1.7) vanishes, so we have the following result, known as Cauchy’s Integral Theorem: Theorem 5.1.2. If f ∈ C 1 (Ω) is holomorphic, then ∫ (5.1.8) f (z) dz = 0. ∂Ω
Using (5.1.8), we can establish Cauchy’s Integral Formula: Theorem 5.1.3. If f ∈ C 1 (Ω) is holomorphic and z0 ∈ Ω, then ∫ 1 f (z) (5.1.9) f (z0 ) = dz. 2πi z − z0 ∂Ω
Proof. Note that g(z) = f (z)/(z − z0 ) is holomorphic on Ω \ {z0 }. Let Dr be the disk of radius r centered at z0 . Pick r so small that Dr ⊂ Ω. See Figure 5.1.1. Then (5.1.8) implies ∫ ∫ f (z) f (z) (5.1.10) dz = dz. z − z0 z − z0 ∂Ω
∂Dr
190
5. Applications of the Gauss-Green-Stokes formula
Figure 5.1.1. Proving Cauchy’s integral formula
To evaluate the integral on the right, parametrize the curve ∂Dr by γ(θ) = z0 +reiθ . Hence dz = ireiθ dθ, so the integral on the right is equal to ∫ (5.1.11) 0
2π
f (z0 + reiθ ) ireiθ dθ = i reiθ
∫
2π
f (z0 + reiθ ) dθ. 0
As r → 0, this tends in the limit to 2πif (z0 ), so (5.1.9) is established.
Suppose f ∈ C 1 (Ω) is holomorphic, z0 ∈ Dr ⊂ Ω, where Dr is the disk of radius r centered at z0 , and suppose z ∈ Dr . See Figure 5.1.2. Then Theorem 5.1.3 implies (5.1.12)
f (z) =
1 2πi
∫ ∂Ω
f (ζ) dζ. (ζ − z0 ) − (z − z0 )
We have the infinite series expansion (5.1.13)
∞ 1 1 ∑ ( z − z0 )n = , (ζ − z0 ) − (z − z0 ) ζ − z0 n=0 ζ − z0
5.1. Holomorphic functions and harmonic functions
191
Figure 5.1.2. Convergent power series on Dr (z0 )
valid as long as |z − z0 | < |ζ − z0 |. Hence, given |z − z0 | < r, this series is uniformly convergent for ζ ∈ ∂Ω, and we have ∞ ∫ 1 ∑ f (ζ) ( z − z0 )n (5.1.14) f (z) = dζ. 2πi n=0 ζ − z 0 ζ − z0 ∂Ω
We summarize what has been established. Theorem 5.1.4. Given f ∈ C 1 (Ω), holomorphic on Ω and a disk Dr ⊂ Ω as above, for z ∈ Dr , f (z) has the convergent power series expansion ∫ ∞ ∑ 1 f (ζ) (5.1.15) f (z) = an (z − z0 )n , an = dζ. 2πi (ζ − z0 )n+1 n=0 ∂Ω
Remark. Differentiating (5.1.9) gives formulas for f (k) (z0 ), implying that f ′ (z), f ′′ (z), ..., all exist and are holomorphic, and that an =
f (n) (z0 ) n!
in (5.1.15). We hence have f ∈ C ∞ (Ω). See Exercises 16 and 26 below.
192
5. Applications of the Gauss-Green-Stokes formula
Note that, when (5.1.9) is applied to Ω = Dr , the disk of radius r centered at z0 , the computation (5.1.11) yields ∫ 2π ∫ 1 1 f (z) ds(z), (5.1.16) f (z0 ) = f (z0 + reiθ ) dθ = 2π 0 ℓ(∂Dr ) ∂Dr
when f is holomorphic and C 1 on Dr , and ℓ(∂Dr ) = 2πr is the length of the circle ∂Dr . This is a mean value property, which extends to harmonic functions on domains in Rn , as we will see below. Note that we can write (5.1.1) as (∂x +i∂y )f = 0; applying the operator ∂x −i∂y to this gives ∂2f ∂2f + =0 ∂x2 ∂y 2
(5.1.17)
for any holomorphic function. A general C 2 solution to (5.1.17) on a region Ω ⊂ R2 is called a harmonic function. More generally, if O is an open set in Rn , a function f ∈ C 2 (Ω) is called harmonic if ∆f = 0 on O, where, as in (4.4.27), ∆ is the Laplace operator, (5.1.18)
∆f =
∂2f ∂2f + · · · + . ∂x21 ∂x2n
Generalizing (5.1.16), we have the following, known as the mean value property of harmonic functions: Proposition 5.1.5. Let Ω ⊂ Rn be open, u ∈ C 2 (Ω) be harmonic, p ∈ Ω, and BR (p) = {x ∈ Ω : |x − p| ≤ R} ⊂ Ω. Then ∫ 1 ) u(x) dS(x). (5.1.19) u(p) = ( A ∂BR (p) ∂BR (p)
For the proof, set (5.1.20)
ψ(r) =
1 A(S n−1 )
∫ u(p + rω) dS(ω), S n−1
for 0 < r ≤ R. We have ψ(R) equal to the right side of (5.1.19), while clearly ψ(r) → u(p) as r → 0. Now (5.1.21) ∫ ∫ 1 ∂u 1 ( ) ψ ′ (r) = ω · ∇u(p + rω) dS(ω) = dS(x). A(S n−1 ) ∂ν A ∂Br (p) S n−1
∂Br (p)
At this point, we establish: Lemma 5.1.6. If O ⊂ Rn is a bounded domain with smooth boundary and u ∈ C 2 (O) is harmonic in O, then ∫ ∂u (5.1.22) (x) dS(x) = 0. ∂ν ∂O
5.1. Holomorphic functions and harmonic functions
193
Proof. Apply the Green formula (4.4.29), with M = O and v = 1. If ∆u = 0, every integrand in (4.4.29) vanishes, except the one appearing in (5.1.22), so this integrates to zero. It follows from this lemma that (5.1.21) vanishes, so ψ(r) is constant. This completes the proof of (5.1.19). We can integrate the identity (5.1.19), to obtain ∫ 1 ) (5.1.23) u(p) = ( u(x) dV (x), V BR (p) ( 2
BR (p)
)
where u ∈ C BR (p) is harmonic. This is another expression of the mean value property. The mean value property of harmonic functions has a number of important consequences, a couple of which we mention here. The first is known as the maximum principle. Proposition 5.1.7. Let Ω ⊂ Rn be bounded, open, and connected. Assume u : Ω → R is harmonic. If x0 ∈ Ω and u(x0 ) ≥ u(x),
(5.1.24)
∀ x ∈ Ω,
then u is constant. Consequently, if u ∈ C(Ω) ∩ C 2 (Ω) is harmonic, then sup |u(x)| = sup |u(y)|.
(5.1.25)
x∈Ω
y∈∂Ω
Proof. Assume (5.1.24) holds, and set a = u(x0 ). We see from (5.1.23) that if BR (x0 ) ⊂ Ω, then u(x) = a for all x ∈ BR (x0 ). It follows that {x ∈ Ω : u(x) = a} is open in Ω. It is also clearly closed in Ω, so if Ω is connected, this set must be all of Ω, and we have the first part of Proposition 5.1.7. As indicated, the second part is a consequence. We leave this argument to the reader. The next result is known as Liouville’s Theorem. Proposition 5.1.8. If u ∈ C 2 (Rn ) is harmonic on all of Rn and bounded, then u is constant. Proof. Pick any two points p, q ∈ Rn . We have, for any r > 0, ∫ ∫ 1 ) (5.1.26) u(p) − u(q) = ( u(x) dx − u(x) dx . V Br (0) (
Br (p)
Br (q)
) Note that V Br (0) = Cn rn , where Cn is evaluated in exercise 2 of §3.2. Thus ∫ Cn (5.1.27) |u(p) − u(q)| ≤ n |u(x)| dx, r ∆(p,q,r)
194
5. Applications of the Gauss-Green-Stokes formula
Figure 5.1.3. Two large balls, with centers at p and q
where (5.1.28)
( ) ( ) ∆(p, q, r) = Br (p)△Br (q) = Br (p) \ Br (q) ∪ Br (q) \ Br (p) .
See Figure 5.1.3. Note that, if a = |p − q|, then ∆(p, q, r) ⊂ Br+a (p) \ Br−a (p); hence ( ) (5.1.29) V ∆(p, q, r) ≤ C(p, q) rn−1 , r ≥ 1. It follows that, if |u(x)| ≤ M for all x ∈ Rn , then (5.1.30)
|u(p) − u(q)| ≤ M Cn C(p, q) r−1 ,
∀ r ≥ 1.
Taking r → ∞, we obtain u(p) − u(q) = 0, so u is constant.
We will now use Liouville’s Theorem to prove the Fundamental Theorem of Algebra: Theorem 5.1.9. If p(z) = an z n + an−1 z n−1 + · · · + a1 z + a0 is a polynomial of degree n ≥ 1 (an ̸= 0), then p(z) must vanish somewhere in C. Proof. Consider (5.1.31)
f (z) =
1 . p(z)
5.1. Holomorphic functions and harmonic functions
195
If p(z) does not vanish anywhere in C, then f (z) is holomorphic on all of C. (See Exercise 9 below.) On the other hand, (5.1.32)
f (z) =
1 1 , z n an + an−1 z −1 + · · · + a0 z −n
so (5.1.33)
|f (z)| −→ 0, as |z| → ∞.
Thus f is bounded on C, if p(z) has no roots. By Proposition 5.1.8, f (z) must be constant, which is impossible, so p(z) must have a complex root. From the fact that every holomorphic function f on O ⊂ R2 is harmonic, it follows that its real and imaginary parts are harmonic. This result has a converse. Let u ∈ C 2 (O) be harmonic. Consider the 1-form α=−
(5.1.34)
∂u ∂u dx + dy. ∂y ∂x
We have dα = −(∆u)dx ∧ dy, so α is closed if and only if u is harmonic. Now, if O is diffeomorphic to a disk, it follows from Proposition 4.3.3 that α is exact on O, whenever it is closed, so, in such a case, (5.1.35)
∆u = 0 on O =⇒ ∃ v ∈ C 1 (O) s.t. α = dv.
In other words, (5.1.36)
∂u ∂v = , ∂x ∂y
∂u ∂v =− . ∂y ∂x
This is precisely the Cauchy-Riemann equation (5.1.1) for f = u + iv, so we have: Proposition 5.1.10. If O ⊂ R2 is diffeomorphic to a disk and u ∈ C 2 (O) is harmonic, then u is the real part of a holomorphic function on O. The function v (which is unique up to an additive constant) is called the harmonic conjugate of u. We close this section with a brief mention of holomorphic functions on a domain O ⊂ Cn . We say f ∈ C 1 (O) is holomorphic provided it satisfies (5.1.37)
∂f 1 ∂f = , ∂xj i ∂yj
1 ≤ j ≤ n.
Suppose z ∈ O, z = (z1 , . . . , zn ). Suppose ζ ∈ O whenever |z − ζ| < r. Then, by successively applying Cauchy’s integral formula (5.1.9) to each complex variable zj , we have that ∫ ∫ −n · · · f (ζ)(ζ1 − z1 )−1 · · · (ζn − zn )−1 dζ1 · · · dζn , (5.1.38) f (z) = (2πi) γn
γ1
where γj is any √ simple counterclockwise curve about zj in C with the property that |ζj − zj | < r/ n for all ζj ∈ γj . Consequently, if p ∈ Cn and O contains the “polydisc” D = {z ∈ Cn : |zj − pj | ≤ δ, ∀ j},
196
5. Applications of the Gauss-Green-Stokes formula
then, for z ∈ D, the interior of D, we have ∫ ∫ [ ]−1 ··· f (z) = (2πi)−n · · · f (ζ) (ζ1 − p1 ) − (z1 − p1 ) (5.1.39) C1 Cn [ ]−1 (ζn − pn ) − (zn − pn ) dζ1 · · · dζn , where Cj = {ζ ∈ C : |ζ − pj | = δ}. Then, parallel to (5.1.12)–(5.1.15), we have ∑ (5.1.40) f (z) = cα (z − p)α , α≥0
for z ∈ D, where α = (α1 , . . . , αn ) is a multi-index, (z − p)α = (z1 − p1 )α1 · · · (zn − pn )αn , as in (2.1.13), and (5.1.41) cα = (2πi)−n
∫
Cn
∫ ···
f (ζ)(ζ1 − p1 )−α1 −1 · · · (ζn − pn )−αn −1 dζ1 · · · dζn .
C1
Thus holomorphic functions on open domains in Cn have convergent power series expansions. We refer to [2], [26], and [51] for more material on holomorphic functions of one complex variable, and to [31] for material on holomorphic functions of several complex variables. We will return to harmonic functions in §7.4 and Appendix A.6. For more information on harmonic functions, one can see [30] and [46].
Exercises 1. Let fk : Ω → C be holomorphic on an open set Ω ⊂ C. Assume fk → f and ∇fk → ∇f locally uniformly in Ω. Show that f : Ω → C is holomorphic. See Exercise 26 for a stronger result. 2. Assume (5.1.42)
f (z) =
∞ ∑
ak z k
k=0
is absolutely convergent for |z| < R. Deduce from Proposition 2.1.10 and Exercise 1 above that f is holomorphic on |z| < R, and that ∞ ∑ (5.1.43) f ′ (z) = kak z k−1 , for |z| < R. k=1
3. As in (2.3.104), the exponential function ez is defined by ∞ ∑ 1 k (5.1.44) ez = z . k! k=0
Exercises
197
Deduce from Exercise 2 that ez is holomorphic in z. 4. By (2.3.105), we have ∀ z, h ∈ C.
ez+h = ez eh , z
Use this to show directly from (5.1.1) that e is complex differentiable and (d/dz)ez = ez on C, giving another proof that ez is holomorphic on C. Hint. Use the power series for eh to show that lim
h→0
eh − 1 = 1. h
5. For another approach to the fact that ez is holomorphic, use ez = ex eiy and (2.3.104) to verify that ez satisfies the Cauchy-Riemann equation. 6. For z ∈ C, set
) ) 1 ( iz 1 ( iz e + e−iz , sin z = e − e−iz . 2 2i Show that these functions agree with the definitions of cos t and sin t given in (2.3.106)–(2.3.108), for z = t ∈ R. Show that cos z and sin z are holomorphic in z ∈ C. Show that d d (5.1.46) cos z = − sin z, sin z = cos z, dz dz and (5.1.45)
(5.1.47)
cos z =
cos2 z + sin2 z = 1,
for all z ∈ C. 7. Let O, Ω be open in C. If f is holomorphic on O, with range in Ω, and g is holomorphic on Ω, show that h = g ◦ f is holomorphic on O, and h′ (z) = g ′ (f (z))f ′ (z). Hint. See the proof of the chain rule in §2.1. 8. Let Ω ⊂ C be a connected open set and let f be holomorphic on Ω. (a) Show that if f (zj ) = 0 for distinct zj ∈ Ω and zj → z0 ∈ Ω, then f (z) = 0 for z in a neighborhood of z0 . Hint. Use the power series expansion (5.1.1). (b) Show that if f = 0 on a nonempty open set O ⊂ Ω, then f ≡ 0 on Ω. Hint. Let U ⊂ Ω denote the interior of the set of points where f vanishes. Use part (a) to show that U ∩ Ω is open. 9. Let Ω = C \ (−∞, 0] and define log : Ω → C by ∫ 1 (5.1.48) log z = dζ, ζ γz
198
5. Applications of the Gauss-Green-Stokes formula
where γz is a path from 1 to z in Ω. Use Theorem 5.1.2 to show that this is independent of the choice of such path. Show that it yields a holomorphic function on C \ (−∞, 0], satisfying d 1 log z = , dz z
z ∈ C \ (−∞, 0].
10. Taking log z as in Exercise 9, show that (5.1.49)
∀ z ∈ C \ (−∞, 0].
elog z = z,
Hint. If φ(z) denotes the left side, show that φ(1) = 1 and φ′ (z) = φ(z)/z. Use uniqueness results from §2.3 to deduce that φ(x) = x for x ∈ (0, ∞), and from there deduce that φ(z) ≡ z, using Exercise 8. Alternative. Apply d/dz to show that log ez = z, for z in some neighborhood of 0. Deduce from this (and Exercise 3 of §2.2) that (5.1.49) holds for z in some neighborhood of 1. Then get it for all z ∈ C \ (−∞, 0] using Exercise 8. 11. With Ω = C \ (−∞, 0] as in Exercise 9, and a ∈ C, define z a for z ∈ Ω by (5.1.50)
z a = ea log z .
Show that this is holomorphic on Ω and (5.1.51)
d a z = az a−1 , dz
z a z b = z a+b ,
∀ z ∈ Ω.
12. Let O = C \ {[1, ∞) ∪ (−∞, −1]}, and define As : O → C by ∫ As(z) = (1 − ζ 2 )−1/2 dζ, σz
where σz is a path from 0 to z in O. Show that this is independent of the choice of such a path, and that it yields a holomorphic function on O. 13. With As as in Exercise 12, show that As(sin z) = z, for z in some neighborhood of 0. (Hint. Apply d/dz.) From here, show that sin(As(z)) = z, Thus we write (5.1.52)
∫ arcsin z = 0
Compare (2.3.113). 14. Look again at Exercise 4 in §2.1.
z
∀ z ∈ O.
(1 − ζ 2 )−1/2 dζ.
Exercises
199
15. Look again at Exercises 3–5 in §2.2. Write the result as an inverse function theorem for holomorphic maps. 16. Differentiate (5.1.9) to show that, in the setting of Theorem 5.1.3, for k ∈ N, we have the derivative Cauchy integral formula, ∫ f (z) k! (5.1.53) f (k) (z0 ) = dz. 2πi (z − z0 )k+1 ∂Ω
Show that this also follows from (5.1.15). 17. Assume f is holomorphic on C, and set M (z0 , R) =
sup |z−z0 |≤R
|f (z)|.
Use the k = 1 case of Exercise 16 to show that M (z0 , R) , ∀ R ∈ (0, ∞). |f ′ (z0 )| ≤ R 18. In the setting of Exercise 17, assume f is bounded, say |f (z)| ≤ M for all z ∈ C. Deduce that f ′ (z0 ) = 0 for all z0 ∈ C, and in that way obtain another proof of Liouville’s theorem, in the setting of holomorphic functions on C. (Note that Proposition 5.1.8 is more general.) The next four exercises deal with the function ∫ ∞ 2 (5.1.54) G(z) = e−t +tz dt, −∞
z ∈ C.
19. Show that G is continuous on C. 20. Show that G is holomorphic on C, with ∫ ∞ 2 G′ (z) = te−t +tz dt. −∞
Hint. Write 1 [G(z + h) − G(z)] = h
∫
∞
−∞
e−t
2
+tz
1 th (e − 1) dt, h
and 1 th 1 (e − 1) = t + R(th), h h where ew = 1 + w + R(w), so
|R(w)| ≤ C|w|2 e|w| ,
1 R(th) ≤ Ct2 |h|e|th| . h
200
5. Applications of the Gauss-Green-Stokes formula
21. Show that, for x ∈ R,
√
G(x) = ∫
Hint. Write 2
G(x) = ex
/4
πex
∞
2
/4
.
e−(t−x/2) dt, 2
−∞
and make a change of variable in the integral. 22. Deduce from Exercises 21 and 8 that √ 2 (5.1.55) G(z) = π ez /4 ,
∀ z ∈ C.
The next exercises deal with the Gamma function, ∫ ∞ (5.1.56) Γ(z) = e−s sz−1 ds, 0
defined for z > 0 in (3.2.33). 23. Show that the integral is absolutely convergent for Re z > 0 and defines Γ(z) as a holomorphic function on {z ∈ C : Re z > 0}. 24. Extend the identity (3.2.36), i.e., (5.1.57)
Γ(z + 1) = zΓ(z),
to Re z > 0. 25. Use (5.1.57) to extend Γ to be holomorphic on C \ {0, −1, −2, −3, . . . }. 26. Use the result of Exercise 16 to show that if fν are holomorphic on an open set Ω ⊂ C and fν → f uniformly on compact subsets of Ω, then f is holomorphic on Ω and fν′ → f ′ uniformly on compact subsets. 27. The Riemann zeta function ζ(z) is defined for Re z > 1 by (5.1.58)
ζ(z) =
∞ ∑ 1 . kz
k=1
Show that ζ(z) is holomorphic on {z ∈ C : Re z > 1}. The following exercises deal with harmonic functions on domains in Rn . 28. Using the formula (4.4.26) for the Laplace operator together with the formula (3.2.26) for the metric tensor on Rn in spherical polar coordinates x = rω, x ∈ Rn , r = |x|, ω ∈ S n−1 , show that of u ∈ C 2 (Ω), Ω ⊂ Rn , (5.1.59)
∆u(rω) =
n−1 ∂ 1 ∂2 u(rω) + u(rω) + 2 ∆S u(rω), ∂r2 r ∂r r
Exercises
201
where ∆S is the Laplace operator on S n−1 . 29. If f (x) = φ(|x|) on Rn , show that ∆f (x) = φ′′ (|x|) +
(5.1.60)
n−1 ′ φ (|x|). |x|
In particular, show that |x|−(n−2) is harmonic on Rn \ 0,
(5.1.61) if n ≥ 3, and
log |x| is harmonic on R2 \ 0.
(5.1.62)
If O, Ω are open in Rn , a smooth map φ : O → Ω is said to be conformal provided the matrix function G(x) = Dφ(x)t Dφ(x) is a multiple of the identity, G(x) = γ(x)I. Recall formula (3.2.2). 30. Suppose n = 2 and φ preserves orientation. Show that φ (pictured as a function φ : O → C) is conformal if and only if it is holomorphic. If φ reverses orientation, φ is conformal ⇔ φ is holomorphic (we say φ is anti-holomorphic). 31. If O and Ω are open in R2 and u is harmonic on Ω, show that u ◦ φ is harmonic on O, whenever φ : O → Ω is a smooth conformal map. Hint. Use Exercise 7 and Proposition 5.1.10. The following exercises will present an alternative approach to the proof of Proposition 5.1.5 (the mean value property of harmonic functions). For this, let BR = {x ∈ Rn : |x| ≤ R}. Assume u is continuous on BR and C 2 and harmonic on the ◦
interior B R . We assume n ≥ 2. ◦
32. Given g ∈ SO(n), show that ug (x) = u(gx) is harmonic on B R . Hint. See Exercise 7 of §4.4. 33. As in Exercise 27 of §3.2, define Au ∈ C(BR ) by ∫ Au(x) = u(gx) dg. SO(n)
Thus Au(x) is a radial function: Au(x) = Su(|x|),
Su(r) =
1
∫ u(rω) dS(ω).
An−1 S n−1
◦
Deduce from Exercise 32 above that Au is harmonic on B R .
202
5. Applications of the Gauss-Green-Stokes formula
34. Use Exercise 29 to show that φ(r) = Su(r) satisfies n−1 ′ φ (r) = 0, r for r ∈ (0, R). Deduce from this differential equation that there exist constants C0 and C1 such that φ(r) = C0 + C1 r−(n−2) , if n ≥ 3, φ′′ (r) +
C0 + C1 log r,
if n = 2.
Then show that, since Au(x) does not blow up at x = 0, C1 = 0. Hence Au(x) = C0 ,
∀ x ∈ BR .
35. Note that Au(0) = u(0). Deduce that for each r ∈ (0, R], ∫ 1 (5.1.63) u(0) = Su(r) = u(rω) dS(ω). An−1 S n−1
5.2. Differential forms, homotopy, and the Lie derivative Let X and Y be smooth surfaces. Two smooth maps f0 , f1 : X → Y are said to be smoothly homotopic provided there is a smooth F : [0, 1] × X → Y such that F (0, x) = f0 (x) and F (1, x) = f1 (x). The following result illustrates the significance of maps being homotopic. Proposition 5.2.1. Assume X is a compact, oriented, k-dimensional surface and α ∈ Λk (Y ) is closed, i.e., dα = 0. If f0 , f1 : X → Y are smoothly homotopic, then ∫ ∫ ∗ (5.2.1) f0 α = f1∗ α. X
X
In fact, with [0, 1] × X = Ω, this is a special case of the following. Proposition 5.2.2. Assume Ω is a smoothly bounded, compact, oriented (k + 1)dimensional surface, and α ∈ Λk (Y ) is closed. If F : Ω → Y is a smooth map, then ∫ (5.2.2) F ∗ α = 0. ∂Ω
Proof. Stokes’ theorem gives ∫ ∫ ∗ (5.2.3) F α = dF ∗ α = 0, ∂Ω
Ω
since dF ∗ α = F ∗ dα and, by hypothesis, dα = 0. Proposition 5.2.2 is one generalization of Proposition 5.2.1. Here is another.
5.2. Differential forms, homotopy, and the Lie derivative
203
Proposition 5.2.3. Assume X is a k-dimensional surface and α ∈ Λℓ (Y ) is closed. If f0 , f1 : X → Y are smoothly homotopic, then f0∗ α − f1∗ α is exact, i.e., f0∗ α − f1∗ α = dβ,
(5.2.4) for some β ∈ Λℓ−1 (X).
Proof. Take a smooth F : R × X → Y such that F (j, x) = fj (x). Consider α ˜ = F ∗ α ∈ Λℓ (R × X).
(5.2.5)
α = F ∗ dα = 0. Now consider Note that d˜ Φs : R × X −→ R × X,
(5.2.6)
Φs (t, x) = (s + t, x).
We claim that ˜ α ˜ − Φ∗1 α ˜ = dβ,
(5.2.7)
for some β˜ ∈ Λℓ−1 (R × X). Now take ˜ j : X → R × X, (5.2.8) β = j ∗ β,
j(x) = (0, x).
We have F ◦ j = f0 , F ◦ Φ1 ◦ j = f1 , so it follows that f0∗ α − f1∗ α = j ∗ α ˜ − j ∗ Φ∗1 α ˜ (5.2.9) ∗ ˜ = j dβ, given (5.2.7), which yields (5.2.4) with β as in (5.2.8).
It remains to prove (5.2.7), under the hypothesis that d˜ α = 0. The following result gives this. The formula (5.2.10) uses the interior product, defined by (4.2.4)– (4.2.5). Lemma 5.2.4. If α ˜ ∈ Λℓ (R × X) and Φs is as in (5.2.6), then ( ) d ∗ (5.2.10) Φs α ˜ = Φ∗s d(˜ α⌋∂t ) + (d˜ α)⌋∂t . ds Hence, if d˜ α = 0, (5.2.7) holds with ∫ 1 (5.2.11) β˜ = − (Φ∗s α ˜ )⌋∂t ds. 0
Proof. Since Φ∗s+σ = Φ∗s Φ∗σ = Φ∗σ Φ∗s , it suffices to show that (5.2.10) holds at s = 0. It also suffices to work in local coordinates on X. Say ∑ # α ˜= αi (t, x) dxi1 ∧ · · · ∧ dxiℓ i
(5.2.12)
+
∑
αjb (t, x) dt ∧ dxj1 ∧ · · · ∧ dxjℓ−1 .
j
Φ∗s α ˜
We have given by a similar formula, with coefficients replaced by αi# (t + s, x) b and αj (t + s, x), hence ∑ d ∗ Φs α ˜ s=0 = ∂t αi# (t, x) dxi1 ∧ · · · ∧ dxiℓ ds i (5.2.13) ∑ + ∂t αjb (t, x) dt ∧ dxj1 ∧ · · · ∧ dxjℓ−1 . j
204
5. Applications of the Gauss-Green-Stokes formula
Meanwhile (5.2.14)
α ˜ ⌋∂t =
∑
αjb (t, x) dxj1 ∧ · · · ∧ dxjℓ−1 ,
j
so d(˜ α⌋∂t ) =
∑ j
(5.2.15)
+
∂t αjb (t, x) dt ∧ dxj1 ∧ · · · ∧ dxjℓ−1
∑
∂xν αjb (t, x) dxν ∧ dxj1 ∧ · · · ∧ dxjℓ−1 .
j,ν
A similar calculation yields (d˜ α)⌋∂t = (5.2.16)
∑ i
−
∂t αi# (t, x) dxi1 ∧ · · · ∧ dxiℓ
∑
∂xν αjb (t, x) dxν ∧ dxj1 ∧ · · · ∧ dxjℓ−1 .
j,ν
Comparison of (5.2.15)–(5.2.16) with (5.2.13) yields (5.2.10) at s = 0, proving Lemma 5.2.4. The following consequence of Proposition 5.2.3 is the Poincar´e lemma. Proposition 5.2.5. Let X be a smooth k-dimensional surface. Assume the identity map I : X → X is smoothly homotopic to a constant map K : X → X, satisfying K(x) ≡ p. Then, for all ℓ ∈ {1, . . . , k}, (5.2.17)
α ∈ Λℓ (X), dα = 0 =⇒ α is exact.
Proof. By Proposition 5.2.3, α − K ∗ α is exact. However, K ∗ α = 0.
Proposition 5.2.5 applies to any open X ⊂ Rk that is star-shaped, so (5.2.18)
Ds : X −→ X for s ∈ [0, 1],
Ds (x) = sx.
Thus, for any open star-shaped X ⊂ R , each closed α ∈ Λℓ (X) is exact. k
We next present an important generalization of Lemma 5.2.4. Let Ω be a smooth n-dimensional surface. If α ∈ Λk (Ω) and X is a vector field on Ω, generating t a flow FX , the Lie derivative LX α is defined to be d t ∗ (5.2.19) LX α = (FX ) α . dt t=0 Note the similarity to the definition (2.3.82) of LX Y for a vector field Y , for which there was the alternative formula (2.3.85). The following useful result is known as Cartan’s formula for the Lie derivative. Proposition 5.2.6. We have (5.2.20)
LX α = d(α⌋X) + (dα)⌋X.
Proof. We can assume Ω is an open subset of Rn . First we compare both sides in the special case X = ∂/∂xℓ = ∂ℓ . Note that ∑ (5.2.21) (F∂tℓ )∗ α = aj (x + teℓ ) dxj1 ∧ · · · ∧ dxjk , j
5.2. Differential forms, homotopy, and the Lie derivative
so (5.2.22)
L∂ℓ α =
∑
205
∂xℓ aj (x) dxj1 ∧ · · · ∧ dxjk = ∂ℓ α.
j
To evaluate the right side of (5.2.21), with X = ∂ℓ , we could parallel the calculation (5.2.14)–(5.2.16). Alternatively, we can use (4.2.12) to write this as (5.2.23)
d(ιℓ α) + ιℓ dα =
n ∑
(∂j ∧j ιℓ + ιℓ ∂j ∧j )α.
j=1
Using the commutativity of ∂j with ∧j and with ιℓ , and the anticommutativity relations (4.2.8), we see that the right side of (5.2.23) is ∂ℓ α, which coincides with (5.2.22). Thus the proposition holds for X = ∂/∂xℓ . Now we prove the proposition in general, for a smooth vector field X on Ω. It is to be verified at each point x0 ∈ Ω. If X(x0 ) ̸= 0, we can apply Theorem 2.3.7 to choose a coordinate system about x0 so X = ∂/∂x1 and use the calculation above. This shows that the desired identity holds on the set {x0 ∈ Ω : X(x0 ) ̸= 0}, and by continuity it holds on the closure of this set. However, if x0 has a neighborhood on which X vanishes, it is clear that LX α = 0 near x0 and also α⌋X and dα⌋X vanish near x0 . This completes the proof. s+t t s , it follows that FX From (5.2.19) and the identity FX = FX
d t ∗ t ∗ (F t )∗ α = LX (FX ) α = (FX ) LX α. dt X It is useful to generalize this. Let Ft be a smooth family of diffeomorphisms of M into M . Define vector fields Xt on Ft (M ) by (5.2.24)
d Ft (x) = Xt (Ft (x)). dt
(5.2.25) Then, given α ∈ Λk (M ), (5.2.26)
d ∗ F α = Ft∗ LXt α dt t [
] = Ft∗ d(α⌋Xt ) + (dα)⌋Xt .
In particular, if α is closed, then, if Ft are diffeomorphisms for 0 ≤ t ≤ 1, ∫ 1 ∗ ∗ (5.2.27) F1 α − F0 α = dβ, β = Ft∗ (α⌋Xt ) dt. 0
The fact that the left side of (5.2.27) is exact is a special case of Proposition 5.2.3, but the explicit formula given in (5.2.27) can be useful. More on the divergence of a vector field Let M ⊂ Rn be an m-dimensional, oriented surface, with volume form ω. Then dω = 0 on M , so, if X is a vector field on M , (5.2.28)
LX ω = d(ω⌋X).
206
5. Applications of the Gauss-Green-Stokes formula
Comparison with (4.4.7) gives (div X)ω = LX ω.
(5.2.29)
This is sometimes taken as the definition of div X. It readily leads to a formula for t how the flow FX affects volumes. To get this, we start with d t ∗ (F t )∗ ω = (FX ) LX ω dt X t ∗ = (FX ) ((div X)ω).
(5.2.30)
t Hence, if Ω ⊂ M is a smoothly bounded domain on which the flow FX is defined for t ∈ I, then, for such t, ∫ d d t t ∗ Vol FX (Ω) = (FX ) ω dt dt Ω ∫ t ∗ = (FX ) ((div X) ω) (5.2.31) Ω ∫ = (div X) ω. t (Ω) FX
In other words,
∫
d t Vol FX (Ω) = dt
(5.2.32)
(div X) dV. t (Ω) FX
This result is equivalent to Proposition 3.2.7, but the derivation here is substantially different. Compare also the discussion in Exercise 2 of §4.4.
Exercises 1. Show that if α is a k-form and X, Xj are vector fields, (LX α)(x1 , . . . , Xk ) (5.2.33)
= X · α(X1 , . . . , Xk ) −
∑
α(X1 , . . . , LX Xj , . . . , Xk ).
j
Recall from (2.3.85) that LX Xj = [X, Xj ], and rewrite (5.2.33) accordingly. 2. Writing (5.2.20) as ιX dα = LX α − dιX α, deduce that (5.2.34)
(dα)(X0 , X1 , . . . , Xk ) = (LX0 α)(X1 , . . . , Xk ) − (dιX0 α)(X1 , . . . , Xk ).
Exercises
207
3. In case α is a one-form, deduce from (5.2.33)–(5.2.34) that (5.2.35)
(dα)(X0 , X1 ) = X0 · α(X1 ) − X1 · α(X0 ) − α([X0 , X1 ]).
4. Using (5.2.33)–(5.2.34) and induction on k, show that, if α is a k-form, (dα)(X0 , . . . , Xk ) (5.2.36)
=
k ∑ ℓ=0
+
b ℓ , . . . , Xk ) (−1)ℓ Xℓ · α(X0 , . . . , X ∑
bℓ , . . . , X bj , . . . , Xk ). (−1)j+ℓ α([Xℓ , Xj ], X0 , . . . , X
0≤ℓ 0. Now apply Stokes’ theorem to β = φ∗ ω. If φ is a retraction, then φ ◦ j(x) = x, where j : ∂M ,→ M is the natural inclusion. Hence j ∗ φ∗ ω = ω, so we have ∫ ∫ ω = dφ∗ ω. (5.3.1) ∂M
M
But dφ∗ ω = φ∗ dω = 0, so the integral (5.3.1) is zero. This is a contradiction, so there can be no retraction. A simple consequence of this is the famous Brouwer Fixed-Point Theorem. We first present the smooth case. Theorem 5.3.3. If F : B → B is a smooth map on the closed unit ball in Rn , then F has a fixed point. Proof. We are claiming that F (x) = x for some x ∈ B. If not, define φ(x) to be the endpoint of the ray from F (x) to x, continued until it hits ∂B = S n−1 . See Figure 5.3.1. An explicit formula is √ b2 + 4ac − b , φ(x) = x + t(x − F (x)), t = 2a 2 a = ∥x − F (x)∥ , b = 2x · (x − F (x)), c = 1 − ∥x∥2 .
5.3. Differential forms and degree theory
209
Figure 5.3.1. Purported retraction of B onto ∂B
Here t is picked to solve the equation ∥x + t(x − F (x))∥2 = 1. Note that ac ≥ 0, so t ≥ 0. It is clear that φ would be a smooth retraction, contradicting Proposition 5.3.1. Now we give the general case, using the Stone-Weierstrass theorem (established in Appendix A.5) to reduce it to Theorem 5.3.3. Theorem 5.3.4. If G : B → B is a continuous map on the closed unit ball in Rn , then G has a fixed point. Proof. If not, then inf |G(x) − x| = δ > 0.
x∈B
The Stone-Weierstrass theorem implies there exists a polynomial P such that |P (x) − G(x)| < δ/8 for all x ∈ B. Set ( δ) F (x) = 1 − P (x). 8 Then F : B → B and |F (x) − G(x)| < δ/2 for all x ∈ B, so inf |F (x) − x| >
x∈B
This contradicts Theorem 5.3.3.
δ . 2
210
5. Applications of the Gauss-Green-Stokes formula
As a second precursor to degree theory, we next show that an even dimensional sphere cannot have a smooth nonvanishing vector field. Proposition 5.3.5. There is no smooth nonvanishing vector field on S n if n = 2k is even. Proof. If X were such a vector field, we could arrange it to have unit length, so we would have X : S n → S n with X(v) ⊥ v for v ∈ S n ⊂ Rn+1 . Thus there would be a unique unit speed curve γv along the great circle from v to X(v), of length π/2. Define a smooth family of maps Ft : S n → S n by Ft (v) = γv (t). Thus F0 (v) = v, Fπ/2 (v) = X(v), and Fπ = A would be the antipodal map, A(v) = −v. By Proposition 5.2.3, we deduce that A∗ ω − ω = dβ is exact, where ω is the volume form on S n . Hence, by Stokes’ theorem, ∫ ∫ (5.3.2) A∗ ω = ω. Sn
Sn
Alternatively, (5.3.2) follows directly from Proposition 5.2.1. On the other hand, it is straightforward that A∗ ω = (−1)n+1 ω, so (5.3.2) is possible only when n is odd. Note that an important ingredient in the proof of both Proposition 5.3.2 and Proposition 5.3.5 is the existence of n-forms on a compact oriented n-dimensional surface M that are not exact (though of course they are closed). We next establish the following exactness criterion, counterpoint to the Poincar´e lemma. Proposition 5.3.6. If M is a compact, connected, oriented surface of dimension n and α ∈ Λn M, then α = dβ for some β ∈ Λn−1 (M ) if and only if ∫ (5.3.3) α = 0. M
We have already discussed the necessity of (5.3.3). To prove the sufficiency, we first look at the case M = S n . In that case, any n-form α is of the form a(x)ω, a ∈ C ∞ (S n ), ω the volume form on S n , with its standard metric. The group G = SO(n + 1) of rotations of Rn+1 acts as a transitive group of isometries on S n . In §3.2 we constructed the integral of functions over SO(n + 1), with respect to Haar measure. As seen in §3.2, we have the smooth map Exp : Skew(n + 1) −→ SO(n + 1), giving a diffeomorphism from a ball O about 0 in Skew(n + 1) onto an open set U ⊂ SO(n + 1) = G, a neighborhood of the identity. Since G is compact, we can pick a finite number of elements ξj ∈ G such that the open sets Uj = {ξj g : g ∈ U } cover G. Using Corollary A.3.9, we can pick ηj ∈ Skew(n+1) such that Exp ηj = ξj . Define Φjt : Uj → G for 0 ≤ t ≤ 1 by ( ) (5.3.4) Φjt ξj Exp(A) = (Exp tηj )(Exp tA), A ∈ O.
5.3. Differential forms and degree theory
211
Now partition G into subsets Ωj , each of whose boundaries has content zero, such that Ωj ⊂ Uj . If g ∈ Ωj , set g(t) = Φjt (g). This family of elements of SO(n + 1) defines a family of maps Fgt : S n → S n . Now by (5.2.27) we have ∫ 1 ∗ (5.3.5) α = g ∗ α − dκg (α), κg (α) = Fgt (α⌋Xgt ) dt, 0
for each g ∈ SO(n + 1), where Xgt is the family of vector fields on S n associated to Fgt , as in (5.2.25). Therefore, ∫ ∫ (5.3.6) α = g ∗ α dg − d κg (α) dg. G
G
Now the first term on the right is equal to αω, where α = in fact, the constant is ∫ 1 (5.3.7) α= α. Vol S n
∫
a(g · x)dg is a constant;
Sn
Thus in this case (5.3.3) is precisely what serves to make (5.3.6) a representation of α as an exact form. This takes care of the case M = S n . For a general compact, oriented, connected M, proceed as follows. Cover M with open sets O1 , . . . , OK such that each Oj is diffeomorphic to the closed unit ball in Rn . Set U1 = O1 , and inductively enlarge each Oj to Uj , so that U j is also diffeomorphic to the closed ball, and such that Uj+1 ∩ Uj ̸= ∅, 1 ≤ j < K. You can do this by drawing a simple curve from Oj+1 to a point in Uj and thickening it. Pick a smooth partition of unity φj , subordinate to this cover. (See Section 3.3.) ∫ Given α ∈ Λn M, satisfying (5.3.3), take α ˜ j = φj α. Most likely ∫ α ˜ 1 = c1 ̸= 0, so take σ1 ∈ Λn M, with compact support in U1 ∩ U2 , such that σ1 = c1 . Set α1 = α ˜∫1 − σ1 , and redefine α ˜ 2 to be the old α ˜ 2 plus σ1 . Make a similar construction using α ˜ 2 = c2 , and continue. When you are done, you have α = α1 + · · · + αK ,
(5.3.8)
with αj compactly supported in Uj . By construction, ∫ (5.3.9) αj = 0 ∫ for 1 ≤ j < K. But then (5.3.3) implies αK = 0 too. Now pick p ∈ S n and define smooth maps ψj : M −→ S n
(5.3.10)
which map Uj diffeomorphically onto S n \p, and map M \Uj to p. There is a unique vj ∈ Λn S n , with compact support in S n \ p, such that ψ ∗ vj = αj . Clearly ∫ vj = 0, Sn n
so by the case M = S of Proposition 5.3.6 already established, we know that vj = dwj for some wj ∈ Λn−1 S n , and then (5.3.11)
αj = dβj ,
βj = ψj∗ wj .
212
5. Applications of the Gauss-Green-Stokes formula
This concludes the proof of Proposition 5.3.6. We are now ready to introduce the notion of the degree of a map between compact oriented surfaces. Let X and Y be compact oriented n-dimensional surfaces, and assume Y is connected. We want to define the degree of a smooth map F : X → Y. We pick ω ∈ Λn Y such that ∫ (5.3.12) ω = 1. Y
We propose to define
∫
(5.3.13)
Deg(F ) =
F ∗ ω.
X
The following result shows that Deg(F ) is indeed well defined by this formula. The key argument is an application of Proposition 5.3.6. Lemma 5.3.7. The quantity (5.3.13) is independent of the choice of ω, as long as (5.3.12) holds. ∫ ∫ Proof. Pick ω1 ∈ Λn Y satisfying Y ω1 = 1, so Y (ω − ω1 ) = 0. By Proposition 5.3.6, this implies (5.3.14)
ω − ω1 = dα, for some α ∈ Λn−1 Y.
Thus
∫
∫
∗
F ω−
(5.3.15) X
∫
∗
F ω1 = X
dF ∗ α = 0,
X
and the lemma is proved. The following homotopy invariance of degree is a most basic property.
Proposition 5.3.8. If F0 and F1 are smoothly homotopic, then Deg(F0 ) = Deg(F1 ). ∫ Proof. By Proposition 5.2.1, if F0 and F1 are smoothly homotopic, then X F0∗ ω = ∫ F ∗ ω. X 1 The following result is a simple but powerful extension of Proposition 5.3.8. Compare the relation between Propositions 5.2.1 and 5.2.2. Proposition 5.3.9. Let M be a compact oriented surface with boundary, dim M = n + 1. Take Y as above, n = dim Y . Given a smooth map F : M → Y, let f = F ∂M : ∂M → Y. Then Deg(f ) = 0. Proof. Applying Stokes’ Theorem to α = F ∗ ω, we have ∫ ∫ f ∗ ω = dF ∗ ω. ∂M
M
But dF ∗ ω = F ∗ dω, and dω = 0 if dim Y = n, so we are done.
5.3. Differential forms and degree theory
213
Brouwer’s no-retraction theorem is an easy corollary of Proposition 5.3.9. Compare the proof of Proposition 5.3.2. Corollary 5.3.10. If M is a compact oriented surface with nonempty boundary ∂M, then there is no smooth retraction φ : M → ∂M. Proof. Without loss of generality, we can assume M is connected. If there were a retraction, then ∂M = φ(M ) must also be connected, so Proposition 5.3.9 applies. But then we would have, for the map id. = φ ∂M , the contradiction that its degree is both zero and 1. We next give an alternative formula for the degree of a map, which is very useful in many applications. In particular, it implies that the degree is always an integer. A point y0 ∈ Y is called a regular value of F, provided that, for each x ∈ X satisfying F (x) = y0 , DF (x) : Tx X → Ty0 Y is an isomorphism. The easy case of Sard’s Theorem, discussed in Section 3.4, implies that most points in Y are regular. Endow X with a volume element ωX , and similarly endow Y with ωY . If DF (x) is invertible, define JF (x) ∈ R \ 0 by F ∗ (ωY ) = JF (x)ωX . Clearly the sign of JF (x), i.e., sgn JF (x) = ±1, is independent of choices of ωX and ωY , as long as they determine the given orientations of X and Y. Proposition 5.3.11. If y0 is a regular value of F, then ∑ (5.3.16) Deg(F ) = {sgnJF (xj ) : F (xj ) = y0 }. Proof. Pick ω ∈ Λn Y, satisfying ∑ (5.3.12), with support in a small neighborhood of y0 . Then∫ F ∗ ω will be a sum ωj , with ωj supported in a small neighborhood of xj , and ωj = ±1 as sgn JF (xj ) = ±1. For an application of Proposition 5.3.11, let X be a compact smooth oriented hypersurface in Rn+1 , and set Ω = Rn+1 \ X. Given p ∈ Ω, define x−p (5.3.17) Fp : X −→ S n , Fp (x) = . |x − p| It is clear that Deg(Fp ) is constant on each connected component of Ω. It is also easy to see that, when p crosses X, Deg(Fp ) jumps by ±1. Thus Ω has at least two connected components. This is most of the smooth case of the Jordan-Brouwer separation theorem: Theorem 5.3.12. If X is a smooth compact oriented hypersurface of Rn+1 , which is connected, then Ω = Rn+1 \ X has exactly 2 connected components. Proof. X being oriented, it has a smooth global normal vector field. Use this to separate a small collar neighborhood C of X into 2 pieces; C \ X = C0 ∪ C1 . The collar C is diffeomorphic to [−1, 1] × X, and each Cj is clearly connected. It suffices to show that any connected component O of Ω intersects either C0 or C1 . Take p ∈ ∂O. If p ∈ / X, then p ∈ Ω, which is open, so p cannot be a boundary point of any component of Ω. Thus ∂O ⊂ X, so O must intersect a Cj . This completes the proof.
214
5. Applications of the Gauss-Green-Stokes formula
Let us note that, of the two components of Ω, exactly one is unbounded, say Ω0 , and the other is bounded; call it Ω1 . Then we claim (5.3.18)
p ∈ Ωj =⇒ Deg(Fp ) = j.
Indeed, for p very far from X, Fp : X → S n is not onto, so its degree is 0. And when p crosses X, from Ω0 to Ω1 , the degree jumps by +1. For a simple closed curve in R2 , Theorem 5.3.12 is the smooth case of the Jordan curve theorem. That special case of the argument given above can be found in [45]. The existence of a smooth normal field simplifies the use of basic degree theory to prove such a result. For a general continuous, simple closed curve in R2 , such a normal field is not available, and the proof of the Jordan curve theorem in this more general context requires a different argument, which can be found in [21].
We apply results just established on degree theory to properties of vector fields, particularly of their critical points. A critical point of a vector field V is a point where V vanishes. Let V be a vector field defined on a neighborhood O of p ∈ Rn , with a single critical point, at p. Then, for any small ball Br about p, Br ⊂ O, we have a map (5.3.19)
Vr : ∂Br → S n−1 ,
Vr (x) =
V (x) . |V (x)|
The degree of this map is called the index of V at p, denoted indp (V ); it is clearly independent of r. If V has a finite number of critical points, then the index of V is defined to be ∑ (5.3.20) Index(V ) = indpj (V ). If ψ : O → O′ is an orientation preserving diffeomorphism, taking p to p and V to W, then we claim (5.3.21)
indp (V ) = indp (W ).
In fact, Dψ(p) is an element of GL(n, R) with positive determinant, so it is homotopic to the identity, and from this it readily follows that Vr and Wr are homotopic maps of ∂Br → S n−1 . Thus one has a well defined notion of the index of a vector field with a finite number of critical points on any oriented surface M. There is one more wrinkle. Suppose X is a smooth vector field on M and p an isolated critical point. If you change the orientation of a small coordinate neighborhood O of p, then the orientations of both ∂Br and S n−1 in (5.3.19) get changed, so the associated degree is not changed. Hence one has a well defined notion of the index of a vector field with a finite number of critical points on any smooth surface M , oriented or not. A vector field V on O ⊂ Rn is said to have a non-degenerate critical point at p provided DV (p) is a nonsingular n × n matrix. The following formula is convenient. Proposition 5.3.13. If V has a nondegenerate critical point at p, then (5.3.22)
indp (V ) = sgn detDV (p).
5.3. Differential forms and degree theory
215
Proof. If p is a nondegenerate critical point, and we set ψ(x) = DV (p)x, ψr (x) = ψ(x)/|ψ(x)|, for x ∈ ∂Br , it is readily verified that ψr and Vr are homotopic, for r small. The fact that Deg(ψr ) is given by the right side of (5.3.22) is an easy consequence of Proposition 5.3.11 The following is an important global relation between index and degree. Proposition 5.3.14. Let Ω be a smooth bounded region in Rn+1 . Let V be a vector field on Ω, with a finite number of critical points pj , all in the interior Ω. Define F : ∂Ω → S n by F (x) = V (x)/|V (x)|. Then (5.3.23)
Index(V ) = Deg(F ).
∪ Proof. If we apply Proposition 5.3.9 to M = Ω \ j Bε (pj ), we see that Deg(F ) is equal to the sum of degrees of the maps of ∂Bε (pj ) to S n , which gives (5.3.23). Next we look at a process of producing vector fields in higher dimensional spaces from vector fields in lower dimensional spaces. n Proposition 5.3.15. Let W be a vector ( field on ) R , vanishing only at 0. Define n+k a vector field V on R by V (x, y) = W (x), y . Then V vanishes only at (0, 0). Then we have
(5.3.24)
ind0 W = ind(0,0) V.
Proof. If we use Proposition 5.3.11 to compute degrees of maps, and choose y0 ∈ S n−1 ⊂ S n+k−1 , a regular value of Wr , and hence also for Vr , this identity follows. We turn to a more sophisticated variation. Let X be a compact n dimensional surface in Rn+k , W a (tangent) vector field on X with a finite number of critical points pj . Let Ω be a small tubular neighborhood of X, π : Ω → X mapping z ∈ Ω to the nearest point in X. Let φ(z) = dist(z, X)2 . Now define a vector field V on Ω by (5.3.25)
V (z) = W (π(z)) + ∇φ(z).
Proposition 5.3.16. If F : ∂Ω → S n+k−1 is given by F (z) = V (z)/|V (z)|, then (5.3.26)
Deg(F ) = Index(W ).
Proof. We see that all the critical points of V are points in X that are critical for W, and, as in Proposition 5.3.15, Index(W ) = Index(V ). Then Proposition 5.3.14 implies Index(V ) = Deg(F ). Since φ(z) is increasing as one goes away from X, it is clear that, for z ∈ ∂Ω, V (z) points out of Ω, provided it is a sufficiently small tubular neighborhood of X. Thus F : ∂Ω → S n+k−1 is homotopic to the Gauss map (5.3.27)
N : ∂Ω −→ S n+k−1 ,
given by the outward pointing normal. This immediately gives:
216
5. Applications of the Gauss-Green-Stokes formula
Corollary 5.3.17. Let X be a compact n-dimensional surface in Rn+k , Ω a small tubular neighborhood of X, and N : ∂Ω → S n+k−1 the Gauss map. If W is a vector field on X with a finite number of critical points, then (5.3.28)
Index(W ) = Deg(N ).
Clearly the right side of (5.3.28) is independent of the choice of W. Thus any two vector fields on X with a finite number of critical points have the same index, i.e., Index(W ) is an invariant of X. This invariant is denoted (5.3.29)
Index(W ) = χ(X),
and is called the Euler characteristic of X. Remark. The existence of smooth vector fields with only nondegenerate critical points (hence only finitely many critical points) on a given compact surface X follows from results presented in Section 3.5.
Exercises 1. Let X be a compact, oriented, connected surface. Show that the identity map I : X → X has degree 1. 2. Suppose Y is also a compact, oriented, connected surface. Show that if F : X → Y is not onto, then Deg(F ) = 0. 3. If A : S n → S n is the antipodal map, show that Deg(A) = (−1)n−1 . 4. Show that the homotopy invariance property given in Proposition 5.3.8 can be deduced as a corollary of Proposition 5.3.9. Hint. Take M = X × [0, 1]. 5. Let p(z) = z n + an−1 z n−1 + · · · + a1 z + a0 be a polynomial of degree n ≥ 1. The fundamental theorem of algebra, proved in §5.1, states that p(z0 ) = 0 for some z0 ∈ C. We aim for another proof, using degree theory. To get this, by contradiction, assume p : C → C \ 0. For r ≥ 0, define Fr : S 1 −→ S 1 ,
Fr (eiθ ) =
p(reiθ ) . |p(reiθ )|
Show that each Fr is smoothly homotopic to F0 , and note that Deg(F0 ) = 0. Then show that there exists r0 such that r ≥ r0 ⇒ Fr is homotopic to Φ, where Φ(eiθ ) = einθ . Show that Deg(Φ) = n, and obtain a contradiction. Note. Regarding the use of degree theory here, show how the proof of Proposition 5.3.6 vastly simplifies when M = S 1 .
Exercises
217
6. Show that each odd-dimensional sphere S 2k−1 has a smooth, nowhere vanishing tangent vector field. Hint. Regard S 2k−1 ⊂ Ck , and multiply the unit normal by i. 7. Let V be a planar vector field. Assume it has a nondegenerate critical point at p. Show that p saddle =⇒ indp (V ) = −1 p source =⇒ indp (V ) = 1 p sink =⇒ indp (V ) = 1 p center =⇒ indp (V ) = 1 Refer to Figure 2.3.1 for illustrations of these cases. 8. Let M be a compact oriented 2-dimensional surface. Given a triangulation of M, within each triangle construct a vector field, vanishing at 7 points as illustrated in Figure 5.3.2, with the vertices as sinks, the center as a source, and the midpoints of each edge as saddle points. Fit these together to produce a smooth vector field X on M. Show directly that Index(X) = V − E + F, where V = # vertices, E = # edges, F = # faces, in the triangulation. The resulting identity χ(M ) = V − E + F is known as Euler’s formula for χ(M ). 9. With X = S n ⊂ Rn+1 , note that the manifold ∂Ω in (5.3.27) consists of two copies of S n , with opposite orientations. Compute the degree of the map N in (5.3.27)–(5.3.28), and use this to show that (5.3.30)
χ(S n ) = 2 if n even,
0 if n odd,
granted (5.3.28)–(5.3.29). 10. Consider the vector field R on S 2 generating rotation about an axis. Show that R has two critical points, at the “poles.” Classify the critical points, compute Index(R), and compare the n = 2 case of (5.3.30). 11. Generalizing Exercise 9, Let X ⊂ Rn+1 be a smooth, compact, oriented, ndimensional surface, so again the neighborhood Ω of X as in (5.3.27) has boundary e : ∂Ω consisting essentially of two copies of X, with opposite orientations. Let N n X → S be the outward pointing unit normal. Show that (5.3.31)
e = 1 χ(X), Deg N 2
if n is even.
218
5. Applications of the Gauss-Green-Stokes formula
Figure 5.3.2. Vector field on a triangle, with source, sinks, and saddles
e ∗ ωS = K ωX , Remark. If ωS is the volume form on S n and ωX that on X, then N and K : X → R is called the Gauss curvature of X. Then (5.3.31) implies ∫ 1 K(x) dS(x) = An χ(X), 2 X
if n is even. This is a basic case of the Gauss-Bonnet formula. See §6.3 for more on this. 12. In the setting of Exercise 11, assume n is odd. Show that (5.3.32)
χ(X) = 0.
Give examples where the identity in (5.3.31) fails. 13. Retain the setting of Exercise 12, especially that n is odd. Let X = ∂O, with O ⊂ Rn+1 bounded and open. Take a smooth function φ : O −→ [0, ∞),
φ(x) = dist(x, X) near X, φ(x) > 0 on O.
Let Σ ⊂ Rn+2 be the surface Σ = {(x, y) : x ∈ O, y 2 = φ(x)},
Exercises
219
and let ν : Σ → S n+1 be the outward pointing unit normal. Show that e, Deg ν = Deg N
(5.3.33) and deduce that (5.3.34)
e = 1 χ(Σ). Deg N 2
e is also a e : X → S n ⊂ S n+1 , show that each regular value of N Hint. Taking N regular value of ν, with the same preimage in X ⊂ Σ. Then show that Proposition 5.3.11 applies. 14. Actually, (5.3.33) holds whether n is odd or even. Can you get anything else from this? 15. In the setting of Exercise 12 (n is odd), generalize the construction of Exercise 6 to show directly that there is a smooth, nowhere vanishing vector field tangent to X. 16. Let O ⊂ Rn be open and f : O → R smooth of class C 2 . Let V = ∇f . Assume p ∈ O is a nondegenerate critical point of f , so its Hessian D2 f (p) is a nondegenerate n × n symmetric matrix. Say (5.3.35)
D2 f (p) has ℓ positive eigenvalues and n − ℓ negative eigenvalues.
Show that Proposition 5.3.13 implies (5.3.36)
indp (V ) = (−1)n−ℓ .
17. Let X ⊂ Rn+k be a smooth, compact, n-dimensional surface. Assume there exists f ∈ C 2 (X), with just two critical points, a max and a min, both nondegenerate. Use Exercise 16 to show that (5.3.37)
χ(X) = 2 if n is even, 0 if n is odd.
Considering S n ⊂ Rn+1 , use this to give another demonstration of (5.3.30). 18. Let T ⊂ R3 be the “inner tube” surface described in Exercise 15 of §3.2. (a) Show that rotation about the z-axis is generated by a vector field that is tangent to T and nowhere vanishing of T . (b) Define f : T → R by f (x, y, z) = x, (x, y, z) ∈ T . Show that f has four critical points, a max, a min, and two saddles. Deduce from Exercise 16 that ∇f is a vector field on T of index 0. (c) Show that both part (a) and part (b) imply χ(T ) = 0.
220
5. Applications of the Gauss-Green-Stokes formula
Figure 5.3.3. Three-holed torus, Euler characteristic −4
19. Figure 5.3.3 shows a 3-holed torus M in R3 , lined up along the x-axis. Define φ : M → R, φ = x , M
and consider the vector field X = ∇φ on M . (a) Show that X has 8 critical points: one source, 6 saddles, and one sink. (b) Deduce from part (a) that χ(M ) = 2 − 6 = −4. 20. Extending the scope of Exercise 19, consider a g-holed torus, Mg , again lined up along the x-axis, and define a similar vector field X on Mg . Show that X has 2g + 2 critical points: one source, 2g saddles, and one sink. Deduce that χ(Mg ) = 2 − 2g. Compare part (b) of Exerise 18. 21. Let M ⊂ Rn be a smooth, compact, m-dimensional surface. Assume (5.3.38)
x ∈ M ⇐⇒ −x ∈ M,
0∈ / M,
and form the projective manifold P(M ), as in §3.2. Let X be a vector field on P(M ) with only nondegenerate critical points. Show that there is naturally associated a
Exercises
221
vector field Y on M such that Index Y = 2 Index X. Deduce that χ(M ) = 2χ(P(M )). 22. In the setting of Exercise 20, with Mg arranged to satisfy (5.3.38), show that χ(P(Mg )) = 1 − g. Let X ⊂ Rn be an m-dimensional surface, and let Y ⊂ Rν be a µ-dimensional surface, both smooth of class C k . Then the Cartesian product X × Y = {(x, y) ∈ Rn × Rν : x ∈ X, y ∈ Y } ⊂ Rn × Rν has a natural structure of an (m + µ)-dimensional C k surface. 23. Let X and Y be as above, and assume they are both compact. Let V1 be a smooth vector field tangent to X and V2 a smooth vector field tangent to Y , both with only nondegenerate critical points. Say {pi } are the critical points of V1 and {qj } those of V2 . (a) Show that W (x, y) = V1 (x) + V2 (y) is a smooth vector field tangent to X × Y . Show that its critical points are precisely the points {(pi , qj )}, each nondegenerate. Show that Proposition 5.3.13 gives (5.3.39)
ind(pi ,qj ) W = (indpi V1 )(indqj V2 ).
(b) Show that (5.3.40)
Index W = (Index V1 )(Index V2 ).
(c) Deduce that (5.3.41)
χ(X × Y ) = χ(X)χ(Y ).
Let X and Y be smooth, compact, oriented surfaces in Rn . Assume k = dim X, ℓ = dim Y , and k + ℓ = n − 1. Assume X ∩ Y = ∅. Set (5.3.42)
φ : X × Y −→ S n−1 ,
φ(x, y) =
x−y . |x − y|
We define the linking number (5.3.43)
λ(X, Y, Rn ) = Deg φ.
24. Let γ and σ ⊂ R3 be the following simple closed curves, parametrized by s, t ∈ R/(2πZ): γ(s) = (cos s, sin s, 0),
σ(t) = (0, 1 + cos t, sin t).
222
5. Applications of the Gauss-Green-Stokes formula
Figure 5.3.4. Two curves in R3 with linking number 1
Thus γ is a circle in the (x, y)-plane centered at (0, 0, 0) and σ is a circle in the (y, z)-plane, centered at (0, 1, 0), both of unit radius. See Figure 5.3.4. Show that λ(γ, σ, R3 ) = 1. Hint. With φ as above, show that e2 = (0, 1, 0) ∈ S 2 has exactly one preimage point, under φ : γ × σ → S 2 . 25. Let M be a smooth, compact, oriented, (n−1)-dimensional surface, and assume φ : M → Rn \ 0 is a smooth map. Set F (x) =
φ(x) , |φ(x)|
F : M −→ S n−1 .
Take ω ∈ Λn−1 (Rn \ 0) to be the form considered in Exercises 9–10 of §4.3, i.e., ω = |x|−n
n ∑
c j ∧ · · · ∧ dxn . xj dx1 ∧ · · · ∧ dx
j=1
Show that Deg(F ) =
1
∫
An−1 M
φ∗ ω.
Exercises
223
Hint. Use Proposition 5.2.1 (with Y = Rn \ 0), plus Exercise 9 of §8, to show that ∫ ∫ φ∗ ω = F ∗ ω, M
and show that, under S
M
n−1 j
,→ Rn , j ∗ ω is the area form on S n−1 .
26. In Exercise 25, take n = 3 and M = T2 = R2 /(2πZ2 ), parametrized by (s, t) ∈ R2 . Show that φ1 φ2 φ3 φ∗ ω = |φ|−3 det ∂s φ1 ∂s φ2 ∂s φ3 (ds ∧ dt) ∂t φ1 ∂t φ2 ∂t φ3 ( ∂φ ∂φ ) = |φ|−3 φ · × (ds ∧ dt). ∂s ∂t In case φ(s, t) = γ(s) − σ(t), deduce the Gauss linking number formula: ∫ 1 γ(s) − σ(t) λ(γ, σ, R3 ) = − · (γ ′ (s) × σ ′ (t)) ds dt. 4π |γ(s) − σ(t)|3 T2
27. Take X, Y ⊂ Rn as in (5.3.42)–(5.3.43). We say an unlinking of X and Y is a pair of smooth families ξt : X → Rn , ηt : Y → Rn of smooth maps such that ξ0 (x) = x, η0 (y) = y, ∀ x ∈ X, y ∈ Y, ξt (X) ∩ ηt (Y ) = ∅,
∀ t ∈ [0, 1],
ξ1 (X) and η1 (Y ) are separated by a hyperplane in Rn . Show that if there is an unlinking of X and Y , then λ(X, Y, Rn ) = 0. Deduce that there is no unlinking of the curves σ and γ in Exercise 24. Hint. Consider φt : X × Y → S n−1 , given by ξt (x) − ηt (y) φt (x, y) = . |ξt (x) − ηt (y)| 28. Let f : S n → S n be a smooth map with the property that ∀ x ∈ S n , f (x) ̸= −x. Show that f is smoothly homotopic to the identity map, and hence Deg f = 1. 29. Let g : S n → S n be a smooth map, and assume n = 2k is even. Show that Deg g ̸= −1 =⇒ g has a fixed point in S n . Hint. Let A : S n → S n be the antipodal map, and consider f = A ◦ g. 30. What sort of fixed-point result can you establish for g : S n → S n when n is odd?
Chapter 6
Differential geometry of surfaces
Here we study the geometry of an n-dimensional surface M in Rk , or more generally of an n-dimensional manifold M , equipped with a metric tensor (a Riemannian manifold). The first object of our study is the class of geodesics, curves γ : I → M that are critical points for the length functional ∫ b (6.0.1) L(γ) = ∥γ ′ (t)∥ dt, a
with fixed endpoints, say γ(a) = p, γ(b) = q. If we parametrize γ to have constant speed, these curves are equivalently critical points of the energy functional ∫ 1 b ′ (6.0.2) E(γ) = ∥γ (t)∥2 dt, 2 a and, moreover, critical points of (6.0.2) are seen to automatically have constant speed. The geodesic condition can be expressed as a differential equation. We provide three approaches to this geodesic equation. In the first approach, M ⊂ Rk is an n-dimensional surface. We see that a smooth curve γ : I → M is a critical point of (6.0.2) if and only if, for each t ∈ (a, b), γ ′′ (t) is normal to M at γ(t). To derive a differential equation from this characterization, we bring in the matrix-valued function P : M → M (k, R), given by (6.0.3)
P (x) = orthogonal projection of Rk onto Tx M,
for x ∈ M . We derive the geodesic equation [ ] (6.0.4) γ ′′ (t) + DP ⊥ (γ(t))γ ′ (t) γ ′ (t) = 0, where P ⊥ (x) = I − P (x). (DP (x) will appear again, in (6.0.11).) 225
226
6. Differential geometry of surfaces
The second approach uses local coordinates and works on an arbitrary Riemannian manifold. We take γ(t) = (x1 (t), . . . , xn (t)) and write (6.0.2) as ∫ 1 b gjk (x(t))x˙ j (t)x˙ k (t) dt, (6.0.5) E(γ) = 2 a using the summation convention (sum over repeated indices). We obtain the geodesic equation in the form (6.0.6)
x ¨ℓ + x˙ j x˙ k Γℓ jk = 0,
where Γℓ jk are the Christoffel symbols, given by (6.1.47). We also rewrite this as a first-order system, for (x, ξ), where ξℓ = gℓk x˙ k ; see (6.1.49). This is a Hamiltonian system. The fact that solutions are constant speed curves is manifested as a conservation law. Our third approach to the geodesic equation brings in the notion of a covariant derivative, which associates to each tangent vector field X on M a first-order differential operator ∇X , itself acting on vector fields on M . To each Riemannian manifold M there is a naturally associated Levi-Civita covariant derivative, given by the formula (6.1.62). We show that when M ⊂ Rk is an n-dimensional surface, then the Levi-Civita covariant derivative is given by (6.0.7)
∇X Y = P (x)DX Y,
at x ∈ M , where DX acts componentwise on Y and P (x) is the projection (6.0.3). For a general Riemannian manifold M , the geodesic equation takes the form (6.0.8)
∇T T = 0,
at each point γ(t), with T (t) = γ ′ (t). Using the basic existence, uniqueness, and smooth dependence on parameters for systems of ODE, we obtain, for each p ∈ M , an exponential map (6.0.9)
Expp : U −→ M,
defined on a neighborhood of 0 in Tp M by (6.0.10)
Expp (v) = γv (1),
where γv (t) is the unique constant-speed geodesic satisfying γv (0) = p, γv′ (0) = v. The derivative D Expp (0) is the identity map on Tp M , so Expp gives a diffeomorphism from some open neighborhood of 0 ∈ Tp M onto a neighborhood of p in M , called an exponential coordinate system. We next take up the study of curvature, in §6.2. If M ⊂ Rk is a connected n-dimensional surface, it is part of a flat plane in Rk if and only if P (x) is constant on M . Hence a measure of how M curves at x is given by (6.0.11)
DP (x) : Tx M −→ M (k, R).
Given X(x) ∈ Tx M , we set DX P (x) = DP (x)X(x), yielding (6.0.12)
DX P : M −→ M (k, R),
when X is a vector field on M . One sees that, for x ∈ M , (6.0.13)
DX P (x) maps Tx M to νx M , and νx M to Tx M ,
6. Differential geometry of surfaces
227
where νx M = (Tx M )⊥ ⊂ Rk . If X and Y are tangent vector fields to M , there is the second fundamental form, given by (6.0.14)
II(X, Y ) = (DX P )Y,
which is normal to M . If ξ is a normal field to M , we define the Weingarten map, (6.0.15)
Aξ (x) : Tx M −→ Tx M,
by
⟨Aξ X, Y ⟩ = ⟨ξ, II(X, Y )⟩.
One has the Weingarten formula, P DX ξ = −Aξ X.
(6.0.16)
In case k = n + 1, so M has codimension 1, we take a smooth unit normal field N to M , so N : M −→ S n
(6.0.17)
is the Gauss map, introduced in §5.3, giving rise to the Gauss curvature, ) ( (6.0.18) K(x) = det DN (x) , Tx M
where DN (x) : Tx M → TN (x) S n = Tx M . In this case, the Weingarten formula becomes DX N = −AN X,
(6.0.19) so we have (6.0.20)
K(x) = (−1)n det AN (x).
Section 6.2 explores a number of explicit computations of the Gauss curvature of special classes of surfaces. The measures of curvature mentioned above are extrinsically defined, i.e., they are defined by how the surface M sits in Rk . There is also an intrinsically defined curvature, the Riemann curvature, defined as follows. If M is a Riemannian manifold and X, Y, Z are vector fields on M , we set (6.0.21)
R(X, Y )Z = ∇X ∇Y Z − ∇Y ∇X Z − ∇[X,Y ] Z,
where ∇ is the Levi-Civita covariant derivative on M . There is a formula for R in terms of the Christoffel symbols, presented in (6.2.65)–(6.2.69). In case M ⊂ Rk is an n-dimensional surface, we have a formula relating the Riemann curvature to the Weingarten map and the second fundamental form: (6.0.22)
R(X, Y )Z = AII(Y,Z) X − AII(X,Z) Y.
In case k = n + 1, we can set (6.0.23) and deduce that (6.0.24)
e II(X, Y ) = II(X, Y )N, ( ) e e II(X, W ) II(X, Z) ⟨R(X, Y )Z, W ⟩ = det e . e II(Y, W ) II(Y, Z)
In case n = 2, k = 3, this yields the formula (6.0.25)
K(x) = ⟨R(U, V )V, U ⟩(x)
228
6. Differential geometry of surfaces
for the Gauss curvature in terms of the Riemann curvature (where U and V form an orthonormal basis of Tx M ). This is the Gauss theorema egregium. It implies that the Gauss curvature of M , defined initially extrinsically, is actually an intrinsic measure of the curvature of M , when M ⊂ R3 is a 2D surface. One can use the right side of (6.0.25) to define the Gauss curvature of any 2D Riemannian manifold, whether or not it is a surface in R3 . In (6.2.102)–(6.2.107), we show that the Gauss curvature of a surface M ⊂ Rn+1 of dimension n is intrinsically defined whenever n = 2m is even, and we extend the definition of Gauss curvature to all Riemannian manifolds of dimension 2m. In §6.3 we tie results on curvature to results on degree theory from Chapter 5. Results of §5.3 show that if M ⊂ Rn+1 is a compact, n-dimensional surface, with Gauss map N : M → S n , and if n = 2m is even, then ∫ 1 K(x) dS(x), and Deg(N ) = An M (6.0.26) 1 Deg(N ) = χ(M ), 2 where An is the n-dimensional area of the unit sphere S n , K(x) is the Gauss curvature of M , given by (6.0.18), and χ(M ) is the Euler characteristic of M . Putting these identities together and specializing to n = 2 yields ∫ (6.0.27) K dS = 2πχ(M ), M
for M ⊂ R , which is part of the classical Gauss-Bonnet theorem. We have two primary goals in this section. One is to establish (6.0.27) for an arbitrary 2D compact Riemannian manifold, regardless of whether it is a surface in R3 . Going further, we consider a domain Ω ∫in such a 2D Riemannian manifold M and seek to extend (6.0.27) to a formula for Ω K dS. For example, we show that if Ω ⊂ M is smoothly bounded and M \ Ω has k connected components, each diffeomorphic to a closed disk, then ∫ ∫ ( ) (6.0.28) K dS + κ ds = 2π χ(M ) − k , 3
Ω
∂Ω
where κ is the geodesic curvature of ∂Ω. We make some comments on higher dimensional versions of the Gauss-Bonnet theorem, beyond the case of codimension1 surfaces treated in (6.0.26), noting that pursuing this is a task for a more advanced course. In §6.4 we study smooth matrix groups, which are subsets G ⊂ M (n, F) that are smooth surfaces and have the property (6.0.29)
g, h ∈ G =⇒ gh, g −1 ∈ G.
Such surfaces get both left and right invariant metric tensors, and associated volume elements, and we investigate properties of the resulting invariant integrals. We show that the matrix exponential has the property (6.0.30)
A ∈ g = TI G =⇒ etA ∈ G, ∀ t ∈ R,
6.1. Geometry of surfaces I: geodesics
229
and explore a Lie algebra structure on g. In case G has a bi-invariant metric tensor, we show that these curves etA are geodesics on M , and obtain formulas for the covariant derivative and the Riemann curvature in terms of the Lie algebra structure on g. We conclude this chapter with some results on the derivative of the exponential map, actually two exponential maps, first the matrix exponential, and then the map Expp in (6.0.9)–(6.0.10). We particularly consider existence of critical points, associated to conjugate points on M , and examine the influence of curvature on the existence of such critical points. This also serves as an introduction to a topic that can be pursued much further in a more advanced course.
6.1. Geometry of surfaces I: geodesics Let M be a smooth, n-dimensional surface in Rk (k > n), or more generally a smooth, n-dimensional manifold equipped with a metric tensor (a Riemannian manifold). Let γ : [a, b] → M be a smooth curve. As seen in §3.2, its length is ∫ b (6.1.1) L(γ) = ∥γ ′ (t)∥ dt, a
where ∥γ ′ (t)∥2 = ⟨γ ′ (t), γ ′ (t)⟩ = γ ′ (t) · G(γ(t))γ ′ (t) ∑ = gjk (γ(t))γj′ (t)γk′ (t),
(6.1.2)
j,k
the last expression being given in a local coordinate system, with G = (gjk ) denoting the metric tensor in this coordinate system. We use ⟨ , ⟩ to denote the inner product of two vectors defined by this metric tensor: (6.1.3)
⟨V (x), W (x)⟩ = V (x) · G(x)W (x) ∑ = gjk (x)Vj (x)Wk (x). j,k
In case M is a surface in R , with the induced metric tensor, ⟨V (x), W (x)⟩ is given by the standard dot product on Rk , applied to V (x), W (x) ∈ Tx M ⊂ Rk . We aim to study smooth curves γ that are length minimizing, among curves with the same endpoints. Such curves are called geodesics. They have the following property. Let γs be a smooth family of curves satisfying k
(6.1.4)
γs : [a, b] −→ M,
γs (a) ≡ p,
γs (b) ≡ q,
with γ0 = γ. Then L(γs ) ≥ L(γ0 ) for all s, so d L(γs ) = 0. ds In other words, γ0 is a critical point of the length functional. We define “geodesic” to include all such critical paths. Later we will investigate the length minimizing properties of general geodesics. (6.1.5)
230
6. Differential geometry of surfaces
Note that L(γ) is unchanged under reparametrization. We will parametrize γ0 so that ∥γ0′ (t)∥ ≡ c0 is constant. Then ∫ b d d = ⟨γ ′ (t), γs′ (t)⟩1/2 dt L(γs ) ds ds a s s=0 (6.1.6) ∫ b 1 ∂ ′ = ⟨γs (t), γs′ (t)⟩ dt . 2c0 a ∂s s=0 Equivalently, (6.1.7)
d 1 d L(γs ) E(γs ) , = ds c0 ds s=0 s=0
where
(6.1.8)
1 E(γs ) = 2 =
1 2
∫
b
∥γs′ (t)∥2 dt
a
∫
b
⟨γs′ (t), γs′ (t)⟩ dt
a
is the energy of the curve γs : [a, b] → M . Thus we shift from seeking a critical point of the length functional to finding a critical point of the energy functional. As we will see, such a critical path automatically has the property that ∥γ ′ (t)∥ is constant. Our next goal is to derive a differential equation characterizing critical points of the energy functional. We will discuss three approaches to such a geodesic equation. In all cases, we start with ∫ 1 b ∂ ′ d E(γs ) = ⟨γ (t), γs′ (t)⟩ dt. (6.1.9) ds 2 a ∂s s In our first approach, we assume M is a smooth surface in Rk . Then we have from (6.1.7) that ∫ b⟨ ⟩ d ∂ ′ (6.1.10) E(γs ) = γs (t), γs′ (t) dt. ds ∂s a Using the identity
⟩ ⟨∂ ⟩ ⟨∂ ⟩ d⟨∂ γs (t), γs′ (t) = γs′ (t), γs′ (t) + γs (t), γs′′ (t) , dt ∂s ∂s ∂s together with the fundamental theorem of calculus and the fact that (given (6.1.4)) (6.1.11)
(6.1.12) we have (6.1.13)
∂ γs (t) = 0, at t = a and b, ∂s ∫ b d =− ⟨V (t), γ0′′ (t)⟩ dt, E(γs ) ds s=0 a
where (6.1.14)
V (t) =
∂ γs (t) s=0 ∈ Tγ0 (t) M. ∂s
6.1. Geometry of surfaces I: geodesics
231
Figure 6.1.1. The projection P (x) of Rk onto Tx M
Now, given any smooth V : [a, b] → Rk satisfying V (t) ∈ Tγ0 (t) M and V (a) = V (b) = 0, one can find a smooth family of curves γs : [a, b] → M satisfying (6.1.4), such that γ0 = γ and (6.1.14) holds. We have the following. Proposition 6.1.1. If M ⊂ Rk is a smooth surface, a smooth curve γ = γ0 : [a, b] → M is a critical point of the energy functional if and only if, for each t ∈ (a, b), (6.1.15)
γ ′′ (t) ⊥ V (t) for each V (t) ∈ Tγ(t) M.
A convenient restatement of (6.1.15) can be formulated as follows. We take (6.1.16)
P : M −→ M (k, R), P (x) = orthogonal projection of Rk onto Tx M.
See Figure 6.1.1. Then (6.1.15) is equivalent to the statement that (6.1.17)
P (γ(t))γ ′′ (t) = 0,
for all t ∈ (a, b). In order to get a differential equation in standard form, we complement (6.1.17) with (6.1.18)
P ⊥ (γ(t))γ ′ (t) = 0,
232
6. Differential geometry of surfaces
(where P ⊥ (x) = I − P (x)), which follows from the fact that γ ′ (t) ∈ Tγ(t) M . Applying d/dt to (6.1.18), and using the product rule and the chain rule, we have (6.1.19)
0=
[ ] d ⊥ P (γ(t))γ ′ (t) = DP ⊥ (γ(t))γ ′ (t) γ ′ (t) + P ⊥ (γ(t))γ ′′ (t). dt
Here, for x ∈ M , DP ⊥ (x) : Tx M −→ M (k, R),
(6.1.20)
so DP ⊥ (γ(t))γ ′ (t) ∈ M (k, R). Adding (6.1.19) to (6.1.17) gives [ ] (6.1.21) γ ′′ (t) + DP ⊥ (γ(t))γ ′ (t) γ ′ (t) = 0. In order to apply ODE theory to (6.1.21), it is convenient to have the following set-up. Assume M = M0 and that there is a diffeomorphism Ba × M → O, where Ba is an open ball about 0 in Rk−n and O is an open neighborhood of M0 in Rk . Then O is a union of surfaces My , y ∈ Ba . We extend P in (6.1.16) to a smooth map (6.1.22)
P : O −→ M (k, R), P (x) = orthogonal projection of Rk onto Tx My , if x ∈ My .
Thus we can regard (6.1.21) as an ODE on O. By results of §2.3, it has a unique short time solution, given initial data γ(a) = p ∈ O, γ ′ (a) = v ∈ Rk . The following result shows when solutions to (6.1.21) give geodesics on M . Proposition 6.1.2. Assume γ(t) solves (6.1.21), on an interval I, containing a, with initial data γ(a) = p ∈ M,
(6.1.23)
γ ′ (a) = v ∈ Tp M.
Then γ(t) satisfies (6.1.17)–(6.1.18), and γ(t) ∈ M , for t ∈ I. Proof. To start, we derive an identity based on the fact that each P (x) is a projection. Applying d/dt to P (γ(t))P (γ(t)) = P (γ(t))
(6.1.24) gives (6.1.25)
[ ] [ ] DP (γ(t))γ ′ (t) P (γ(t)) + P (γ(t)) DP (γ(t))γ ′ (t) = DP (γ(t))γ ′ (t),
hence
[ ] [ ] P (γ(t)) DP (γ(t))γ ′ (t) = DP (γ(t))γ ′ (t) P ⊥ (γ(t)).
(6.1.26)
Meanwhile, applying P (γ(t)) to (6.1.21) yields [ ] (6.1.27) P (γ(t))γ ′′ (t) = P (γ(t)) DP (γ(t))γ ′ (t) γ ′ (t). Hence, by (6.1.26), (6.1.28)
[ ] P (γ(t))γ ′′ (t) = DP (γ(t))γ ′ (t) P ⊥ (γ(t))γ ′ (t).
Now set (6.1.29)
σ(t) = P ⊥ (γ(t))γ ′ (t).
6.1. Geometry of surfaces I: geodesics
233
Applying d/dt, we have
[ ] σ ′ (t) = DP ⊥ (γ(t))γ ′ (t) + P ⊥ (γ(t))γ ′′ (t)
(6.1.30)
= −P (γ(t))γ ′′ (t) [ ] = − DP (γ(t))γ ′ (t) σ(t),
the second identity by (6.1.21) and the third identity by (6.1.28). Hence σ satisfies a first-order, homogeneous, linear ODE, so (6.1.31)
γ ′ (a) ∈ Tγ(a) M ⇒ σ(a) = 0 ⇒ σ(t) ≡ 0.
From (6.1.28)–(6.1.29), it is clear that (6.1.32)
σ(t) ≡ 0 =⇒ P (γ(t))γ ′ (t) ≡ 0,
so we have (6.1.17)–(6.1.18), and (6.1.18) yields γ(t) ∈ M for all t ∈ I.
Corollary 6.1.3. In the setting of Proposition 6.1.2, if (6.1.23) holds, then ∥γ ′ (t)∥ is constant. Proof. We have d ′ ⟨γ (t), γ ′ (t)⟩ = 2⟨γ ′′ (t), γ ′ (t)⟩, dt which vanishes when (6.1.17)–(6.1.18) hold. (6.1.33)
We give an alternative presentation of the geodesic equation in case k = n + 1, i.e., M has codimension 1 in Rk . Suppose M is defined by u(x) = c, ∇u(x) ̸= 0. Then (6.1.17) is equivalent to γ ′′ (t) = K(t)∇u(γ(t)),
(6.1.34)
for a real valued function K, which remains to be determined. Meanwhile, the condition that u(γ(t)) ≡ c implies ⟨γ ′ (t), ∇u(γ(t))⟩ = 0
(6.1.35)
(cf. (6.1.18)), and differentiating this gives (6.1.36)
⟨γ ′′ (t), ∇u(γ(t))⟩ = −⟨γ ′ (t), D2 u(γ(t))γ ′ (t)⟩,
where D2 u is the k × k matrix of second-order partial derivatives of u. Comparing (6.1.34) and (6.1.36) gives K(t), and we have the ODE ⟨ ′ ⟩ γ (t), D2 u(γ(t))γ ′ (t) ′′ (6.1.37) γ (t) = − ∇u(γ(t)), ∥∇u(γ(t))∥2 for a geodesic γ lying in M . For our second approach to the geodesic equation, we let M be a general Riemannian manifold. We assume γs : [a, b] → M , satisfying (6.1.4), are contained in a single coordinate patch, on which the metric tensor is given by the n×n symmetric, positive-definite matrix G = (gjk ). We use the following notation: γs (t) = xs (t) = (x1s (t), . . . , xns (t)), (6.1.38)
d γs (t) = x˙ s (t) = (x˙ 1s (t), . . . , x˙ ns (t)). dt
234
6. Differential geometry of surfaces
We use the following summation convention, converting (6.1.2) to ∥x˙ s (t)∥2 = gjk (xs (t))x˙ js (t)x˙ ks (t).
(6.1.39)
(We sum over repeated indices.) Thus the energy functional is ∫ 1 b (6.1.40) E(xs ) = gjk (x(t))x˙ js (t)x˙ ks (t) dt. 2 a Hence, with V j (t) =
(6.1.41)
∂ j x (t) , ∂s s s=0
and x(t) = x0 (t), we have
∫ b[ d ∂ E(xs ) = gjk (x(t)) x˙ js (t) x˙ k (t) ds ∂s s=0 s=0 a (6.1.42) ] 1 j ∂gkℓ k + V (t) j x˙ (t)x˙ ℓ (t) dt, 2 ∂x where we have made use of the symmetry gjk = gkj . Now, in analogy with (6.1.11), we can write ) d( gjk (x(t))V j (t)x˙ k (t) dt ∂ (6.1.43) = gjk (x(t)) x˙ js (t) x˙ k (t) ∂s s=0 ∂gjk j + gjk (x(t))V (t)¨ xk (t) + x˙ ℓ (t) ℓ V j (t)x˙ k (t). ∂x Thus, by the fundamental theorem of calculus, ∫ b[ d ∂gjk j k E(xs ) =− V x˙ gjk V j x ¨k + x˙ ℓ ds ∂xℓ s=0 a (6.1.44) 1 ∂gkℓ k ℓ ] − Vj x˙ x˙ dt, 2 ∂xj and the stationary condition (d/ds)E(xs )|s=0 = 0 for all variations of the form (6.1.4) becomes ( ∂g 1 ∂gkℓ ) k ℓ jk (6.1.45) gjk x ¨k (t) = − − x˙ x˙ . ∂xℓ 2 ∂xj Symmetrizing the quantity in parentheses with respect to k and ℓ yields the geodesic equation x ¨ℓ + x˙ j x˙ k Γℓ jk = 0,
(6.1.46) where Γℓ jk is defined by (6.1.47)
gkℓ Γℓ ij =
1 ( ∂gjk ∂gik ∂gij ) + − . 2 ∂xi ∂xj ∂xk
The functions Γℓ ij are called the Christoffel symbols. We will see more of them. We next convert (6.1.46) to a first-order system. It is convenient to set (6.1.48)
ξℓ = gℓk x˙ k .
6.1. Geometry of surfaces I: geodesics
235
Then consider the system x˙ ℓ = g ℓk (x)ξk , (6.1.49)
1 ∂g jk ξj ξk , ξ˙ℓ = − 2 ∂xℓ
where (g jk ) is the matrix inverse to (gjk ). If we apply d/dt to the first equation and plug in the second one for ξ˙k , we get ( 1 iℓ ) ∂g ik kj ∂g (6.1.50) x ¨ℓ = − g ℓj ξi ξk , + g 2 ∂xj ∂xj and using (6.1.48), together with ∂ ∂G G(x)−1 = −G(x)−1 j G(x)−1 , ∂xj ∂x straightforward manipulations yield the geodsic equation (6.1.46). A special structure of the system (6.1.49) is revealed by writing this system as (6.1.51)
∂ f (x, ξ), ∂ξℓ ∂ ξ˙ℓ = − ℓ f (x, ξ), ∂x
x˙ ℓ = (6.1.52)
where 1 jk g (x)ξk ξk . 2 A system of ODEs of the form (6.1.52) is called a Hamiltonian system. Note that if (6.1.52) holds, then (6.1.53)
(6.1.54)
f (x, ξ) =
∂f ∂f d f (x, ξ) = x˙ ℓ ℓ + ξ˙ℓ dt ∂x ∂ξℓ = −x˙ ℓ ξ˙ℓ + ξ˙ℓ x˙ ℓ = 0.
Hence f (x(t), ξ(t)) is constant for a solution to (6.1.52). Since g jk (x)ξj ξk = ξ · G(x)−1 ξ (6.1.55)
= G(x)x˙ · G(x)−1 G(x)x˙ = gjk (x)x˙ j x˙ k ,
this implies that the solution curve to the geodesic equation (6.1.46) has constant speed, parallel to the result of Corollary 6.1.3. The passage from the geodesic equation (6.1.46) to the system (6.1.52) is a special case of a more general passage from a “Lagrangian equation” to an associated Hamiltonian system. More on this can be found in Chapter 4, §7 of [50], and in Chapter 1, §12 of [46]. We now discuss a third approach to the geodesic equation. Again, M is a smooth n-dimensional Riemannian manifold, and γs : [a, b] → M is a smooth
236
6. Differential geometry of surfaces
family of curves satisfying (6.1.4), and V (t) is defined as in (6.1.14). Let T = γs′ (t).
(6.1.56) Then, parallel to (6.1.9),
d 1 E(γs ) = ds 2
(6.1.57)
∫
b
V ⟨T, T ⟩ dt. a
Now we need a generalization of (∂/∂s)γs′ (t) and of the formulas (6.1.10)–(6.1.11). To achieve this, we introduce the notion of a covariant derivative. If X and Y are vector fields on M , the covariant derivative ∇X Y is a vector field on M . The following properties are required. We assume that ∇X Y is additive in both X and Y , that ∇f X Y = f ∇X Y,
(6.1.58) ∞
for f ∈ C (M ), and that (6.1.59)
∇X (f Y ) = f ∇X Y + (Xf )Y.
Thus ∇X acts as a derivation. The operator ∇X is also required to have the following relation to the Riemannian metric: (6.1.60)
X⟨Y, Z⟩ = ⟨∇X Y, Z⟩ + ⟨Y, ∇X Z⟩.
One further property will uniquely specify ∇: (6.1.61)
∇X Y − ∇Y X = [X, Y ].
If all these properties hold, we say ∇ is the Levi-Civita covariant derivative on the Riemannian manifold M . We have the following basic existence and uniqueness result. Proposition 6.1.4. Associated with a Riemannian metric is a unique Levi-Civita covariant derivative, given by (6.1.62)
2⟨∇X Y, Z⟩ = X⟨Y, Z⟩ + Y ⟨X, Z⟩ − Z⟨X, Y ⟩ + ⟨[X, Y ], Z⟩ − ⟨[X, Z], Y ⟩ − ⟨[Y, Z], X⟩.
Proof. To obtain the formula (6.1.62), cyclically permute X, Y , and Z in (6.1.60) and take the appropriate alternating sum, using (6.1.61) to cancel out all terms involving ∇ but two copies of ⟨∇X Y, Z⟩. This derives the formula and establishes uniqueness. On the other hand, if (6.1.62) is taken as the definition of ∇X Y , then verification of the properties (6.1.58)–(6.1.61) is a straightforward exercise. To see that the passage from (6.1.57) to the next step ((6.1.64) below) generalizes passage from (6.1.9) to (6.1.10), we note the following. Proposition 6.1.5. If M is a smooth surface in Rk , with the induced Riemannian metric, and if ∇M is its Levi-Civita covariant derivative, then, for X and Y tangent to M , (6.1.63)
∇M X Y = P (x)DX Y,
at x ∈ M,
where DX acts componentwise on Y , and P (x) is the projection (6.1.16).
6.1. Geometry of surfaces I: geodesics
237
Proof. It is routine to verify that the right side of (6.1.63) satisfies all the conditions described in (6.1.58)–(6.1.61) that define the Levi-Civita covariant derivative. The identity (6.1.63) will also play an important role in the study of curvature, in the next section. We resume our analysis of (6.1.57), which becomes ∫ b d (6.1.64) E(γs ) s=0 = ⟨∇V T, T ⟩ dt. ds a Since ∂/∂s and ∂/∂t commute, we have [V, T ] = 0 on γ0 , and (6.1.61) implies ∫ b d (6.1.65) E(γs ) = ⟨∇T V, T ⟩ dt. ds s=0 a The replacement for (6.1.11) is (6.1.66)
T ⟨V, T ⟩ = ⟨∇T V, T ⟩ + ⟨V, ∇T T ⟩,
so, by the fundamental theorem of calculus, ∫ b d E(γs ) =− ⟨V, ∇T T ⟩ dt. (6.1.67) ds s=0 a If this is to vanish for all smooth vector fields over γ0 , vanishing at p and q, we must have ∇T T = 0.
(6.1.68)
We show how this leads again to (6.1.46). If M has a coordinate chart Ω ⊂ Rn that carries a Riemannian metric (gjk ) and a corresponding Levi-Civita covariant derivative, the Christoffel symbols can be defined by ∑ (6.1.69) ∇Di Dj = Γk ji Dk , k
where Dk = ∂/∂xk . The formula (6.1.62) implies 1 ( ∂gjk ∂gik ∂gij ) (6.1.70) gkℓ Γℓ ij = + − , 2 ∂xi ∂xj ∂xk in agreement with (6.1.47). We can rewrite the geodesic equation (6.1.68) for γ(t) = x(t) as follows. With x = (x1 , . . . , xn ) and T = (x˙ 1 , . . . , x˙ n ), we have ∑ ∑ (6.1.71) 0= ∇T (x˙ ℓ Dℓ ) = (¨ xℓ Dℓ + x˙ ℓ ∇T Dℓ ). ℓ
ℓ
In view of (6.1.69), this becomes (6.1.72)
x ¨ℓ + x˙ j x˙ k Γℓ jk = 0,
where we bring back the summation convention. We have recovered the geodesic equation (6.1.46). Note that if T = γ ′ (t), then T ⟨T, T ⟩ = 2⟨∇T T, T ⟩ = 0, so if (6.1.68) holds, γ(t) automatically has constant speed. Shortly we will verify that a curve satisfying the geodesic equation is indeed locally length minimizing.
238
6. Differential geometry of surfaces
Figure 6.1.2. The exponential map: Expp (tv) = γv (t)
For a given p ∈ M , the exponential map (6.1.73)
Expp : U −→ M
is defined on a neighborhood of 0 in Tp M ≈ Rn by (6.1.74)
Expp (v) = γv (1),
where γv (t) is the unique constant-speed geodesic satisfying (6.1.75)
γv (0) = p,
γv′ (0) = v.
See Figure 6.1.2. Note that Expp (tv) = γv (t). It is clear that Expp is well defined and smooth on a sufficiently small neighborhood U of 0 in Tp M , and its derivative at 0 is the identity. Thus, perhaps shrinking U , we have that Expp is a diffeomorphism of U onto a neighborhood O of p in M . This provides what is called an exponential coordinate system on a neighborhood of p ∈ M . Clearly the geodesics through p are the straight lines through the origin in this coordinate system. We now establish a result, known as the Gauss lemma, which will imply that each geodesic is locally length minimizing. For a > 0 small, let Σa = {v ∈ Tp M : ∥v∥ = a}, and let Sa = Expp (Σa ). Proposition 6.1.6. Any unit-speed geodesic through p hitting Sa at t = a is orthogonal to Sa .
6.1. Geometry of surfaces I: geodesics
239
Proof. If γ0 (t) is a unit-speed geodesic, γ0 (0) = p, γ0 (a) = q ∈ Sa , and V ∈ Tp M is tangent to Sa , there is a smooth family of unit-speed geodesics γs (t), such that γs (0) = p and (∂/∂s)γs (a)|s=0 = V . Since L(γs ) = E(γs ) is constant, we can use (6.1.75)–(6.1.76) to conclude that ∫ a T ⟨V, T ⟩ dt = ⟨V, γ0′ (a)⟩, (6.1.76) 0= 0
which proves the proposition.
Corollary 6.1.7. Suppose Expp : Ba → M is a diffeomorhism of Ba = {v ∈ Tp M : ∥v∥ ≤ a} onto its image B. Then, for each q ∈ B, q = Expp (w), the curve γ(t) = Expp (tw), 0 ≤ t ≤ 1, is the unique shortest path from p to q. Proof. We can assume ∥w∥ = a. Let σ : [0, 1] → M be another constant speed path from p to q, say ∥σ ′ (t)∥ = b. We can assume σ(t) ∈ B for all t ∈ [0, 1]; otherwise restrict σ to [0, β], where β = inf{t ≥ 0 : σ(t) ∈ ∂B} and the argument below will show this segment has length ≥ a. For all t such that σ(t) ∈ B \p, we can write σ(t) = Expp (r(t)ω(t)), for uniquely determined ω(t) in the unit sphere of Tp M , and r(t) ∈ (0, a]. If we pull the metric tensor of M back to Ba , we have (6.1.77)
∥σ ′ (t)∥2 = r′ (t)2 + r(t)2 ∥ω ′ (t)∥2 ,
by the Gauss lemma. Hence
∫
1
∥σ ′ (t)∥ dt
b = L(σ) = 0
1 = b
(6.1.78)
1 ≥ b Cauchy’s inequality yields ∫ (6.1.79)
1
|r′ (t)| dt ≤
0
∫
1
∥σ ′ (t)∥2 dt
0
∫
1
r′ (t)2 dt.
0
(∫
1
)1/2 r′ (t)2 dt ,
0
so the last quantity in (6.1.78) is ≥ a /b. This implies b ≥ a, with equality only if ∥ω ′ (t)∥ = 0 for all t. The corollary is proved. 2
The following is a useful converse. Proposition 6.1.8. Let γ : [0, 1] → M be a constant speed Lipschitz curve from p to q that is absolutely length minimizing. Then γ is a smooth curve satisfying the geodesic equation. Proof. We make use of the following fact, which will be established below. Namely, there exists a > 0 such that, for each point x ∈ γ, Expx : Ba → M is a diffeomorphism of Ba onto its image (and a is independent of x ∈ γ). So choose t0 ∈ [0, 1] and consider x0 = γ(t0 ). The hypothesis implies that γ must be a length-minimizing curve from x0 to γ(t), for all t ∈ [0, 1]. By Corollary 6.1.7, γ(t) coincides with a geodesic for t ∈ [t0 , t0 + α] and for t ∈ [t0 − β, t0 ], where
240
6. Differential geometry of surfaces
t0 + α = min(t0 + a, 1) and t0 − β = max(t0 − a, 0). We need only show that, if t0 ∈ (0, 1), these two geodesic segments fit together smoothly, i.e., that γ is smooth in a neighborhood of t0 . To see this, pick ε > 0 such that ε < min(t0 , a), and consider t1 = t0 − ε. The same argument as above applied to this case shows that γ coincides with a smooth geodesic on a segment including t0 in its interior, so we are done.
The asserted lower bound on a follows from compactness plus the following observation. Given p ∈ M , there is a neighborhood O of (p, 0) in T M on which (6.1.80)
E : O −→ M,
E(x, v) = Expx (v),
(v ∈ Tx M )
is defined. Let us set (6.1.81)
F(x, v) = (x, Expx (v)),
We readily compute that
F : O −→ M × M. (
(6.1.82)
DF(p, 0) =
I I
) 0 , I
as a map on Tp M ⊕ Tp M , where we use Expp to identify T(p,0) Tp M ≈ Tp M ⊕ Tp M ≈ T(p,p) (M × M ). Hence the inverse function theorem implies that F is a diffeomorphism from a neighborhood of (p, 0) in T M onto a neighborhood of (p, p) in M × M . Let us remark that, though a geodesic is locally length minimizing, it need not be globally length minimizing. Easy examples arise when M is the standard unit sphere in Euclidean space, for which the great circles are the geodesics.
Exercises 1. Let u : Rk → R be smooth, and assume γ : (a, b) → Rk is a smooth solution to (6.1.37). Note that, in general, d u(γ(t)) = ⟨γ ′ (t), ∇u(γ(t))⟩. dt Show that (6.1.37) implies d ′ ⟨γ (t), ∇u(γ(t))⟩ ≡ 0. dt
(6.1.83) Deduce that, if t0 ∈ (a, b),
γ ′ (t0 ) ⊥ ∇u(γ(t0 )) =⇒
d u(γ(t)) ≡ 0. dt
2. In the setting of Exercise 1, note that d ′ ⟨γ (t), γ ′ (t)⟩ = 2⟨γ ′′ (t), γ ′ (t)⟩. dt
Exercises
241
Deduce that, if (6.1.37) holds, then γ ′ (t0 ) ⊥ ∇u(γ(t0 )) =⇒ ∥γ ′ (t)∥ ≡ ∥γ ′ (t0 )∥. Hint. Use (6.1.83) again. 3. Suppose M is a smooth surface of dimension k − 1 in Rk . Let N : M → Rk be a smooth unit normal field to M . Show that an alternative form of the geodesic equation (6.1.37) is ⟩ ⟨ d γ ′′ (t) = − γ ′ (t), N (γ(t)) N (γ(t)). dt 4. We say a 2D Riemannian manifold has a Clairaut parametrization if there are coordinates (u, v) in which the metric tensor takes the form ( ) G1 (u) G(u, v) = , G2 (u) with no v dependence. Compute Γℓ jk in this case, and show that the geodesic equations (6.1.46) become 1 G′1 2 1 G′2 2 u˙ − v˙ = 0, 2 G1 2 G1 G′ v¨ + 2 u˙ v˙ = 0. G2 Note that the second equation is equivalent to u ¨+
d (G2 (u)v) ˙ = 0, hence G2 (u)v˙ = a. dt Use this and the constant speed condition G1 (u)u˙ 2 + G2 (u)v˙ 2 = c2 to get a first-order ODE for u, to which you can apply separation of variables. 5. Show that
∂gjk = gak Γa jℓ + gaj Γa kℓ . ∂xℓ
Hint. Start with Dℓ ⟨Dj , Dk ⟩ = ⟨∇Dℓ Dj , Dk ⟩ + ⟨Dj , ∇Dℓ Dk ⟩, and plug in (6.1.69). 6. Consider the exponential map Expp : U → M defined in (6.1.74)–(6.1.75). Show that, in this coordinate system, Γa bk (p) = 0. Hint. Since the line through the origin in any direction aDj + bDk is a geodesic, we have ∇aDj +bDk (aDj + bDk ) = 0, at p, ∀ a, b ∈ R, j, k.
242
6. Differential geometry of surfaces
7. In the setting of Exercise 6, deduce that, at the center of an exponential coordinate system, ∂gjk (p) = 0, ∀ j, k, ℓ. ∂xℓ 8. Let M be a compact, connected Riemannian manifold, and take p, q ∈ M, p ̸= q. Let F denote the set of smooth curves σ : [0, 1] → M satisfying σ(0) = p,
∥σ ′ (t)∥ independent of t.
σ(1) = q,
Let d(p, q) = inf{L(σ) : σ ∈ F }, and take σν ∈ F such that L(σν ) → d(p, q). Use the Arzela-Ascoli theorem to show that there is a uniformly convergent subsequence σνk −→ γ : [0, 1] → M,
γ(0) = p, γ(1) = q,
such that γ is length minimizing, among such curves. Use Proposition 6.1.8 to establish that γ is a geodesic from p to q.
6.2. Geometry of surfaces II: curvature The curvature of a surface (or, in the 1D case, a curve) is a measure of its not being flat. To take the simplest case, let γ : (a, b) → Rk be smooth, with non-vanishing velocity. We can parametrize γ by arclength, so (6.2.1)
T (t) = γ ′ (t),
∥T (t)∥ ≡ 1.
Then γ is a straight line if and only if T (t) is constant. Thus a measure of how γ curves is given by (6.2.2)
T ′ (t).
We call this the curvature vector of γ. Note that (6.2.3)
T · T ≡ 1 =⇒ T ′ (t) · T (t) = 0,
so T ′ (t) is orthogonal to T (t). In connection with this, we can interpret Proposition 6.1.1 as follows. Proposition 6.2.1. Let M ⊂ Rk be a smooth surface, γ : (a, b) → M a smooth curve, parametrized by arc length. Then γ is a geodesic on M if and only if the curvature vector of γ is normal to M at each point of γ(t). Let us now specialize to planar curves, so γ : (a, b) → R2 . In such a case, we apply counterclockwise rotation by 90◦ to T (t) to get a unit normal to γ: ( ) 0 −1 (6.2.4) N (t) = JT (t), J = . 1 0 In such a case, (6.2.3) implies that T ′ (t) is parallel to N (t), say (6.2.5)
T ′ (t) = κ(t)N (t),
6.2. Geometry of surfaces II: curvature
243
Figure 6.2.1. The unit normal to M at x
and we call κ(t) the curvature of γ. Note that, by (6.2.4), N ′ (t) = κ(t)JN (t) (6.2.6)
= κ(t)J 2 T (t) = −κ(t)T (t).
We move on to the case that M ⊂ Rn+1 is a smooth, codimension-one surface, with smooth unit normal N . See Figure 6.2.1. Then (6.2.7)
N : M −→ S n ⊂ Rn+1
is called the Gauss map. In this case M is flat if and only if N is constant, so we measure its curvature by (6.2.8)
DN (x) : Tx M −→ TN (x) S n = Tx M.
In particular, we have a well defined real-valued function ( ) (6.2.9) K(x) = det DN (x) T M , x
called the Gauss curvature of M at x. In case n = 1, (6.2.6) yields K(γ(t)) = −κ(t). Of course, the map (6.2.8) contains further curvature information when n > 1. We return to this below.
244
6. Differential geometry of surfaces
For a general n-dimensional surface M ⊂ Rk , a natural object with which to measure how M curves in Rk is the map P , introduced in (6.1.16), i.e., (6.2.10)
P : M −→ M (k, R), P (x) = orthogonal projection of Rk onto Tx M .
We see that M is flat if and only if P is constant, so a natural measure of curvature is DP (x) : Tx M −→ M (k, R).
(6.2.11) We use the notation
DX P : M −→ M (k, R),
(6.2.12)
for a vector field X on M ; DX P (x) = DP (x)X(x). Since we are differentiating a smooth family of symmetric k × k matrices, it is clear that DX P (x)t = DX P (x) in M (k, R),
(6.2.13)
for each x ∈ M . The following is another useful fact. Proposition 6.2.2. For x ∈ M , let νx M = Tx M ⊥ ,
(6.2.14)
the orthogonal complement of Tx M in Rk . Then, if X is a vector field on M , DX P (x) : Tx M −→ νx M, and
(6.2.15)
DX P (x) : νx M −→ Tx M.
Proof. We start with the projection identity, P = P P , and apply DX , using the product rule, to obtain (6.2.16)
DX P = (DX P )P + P (DX P ),
which in turn yields (DX P )P = P ⊥ (DX P ),
(6.2.17)
(DX P )P ⊥ = P (DX P ),
where P ⊥ = I − P . This gives (6.2.15).
To proceed, we bring in an object called the second fundamental form of M ⊂ Rk , defined for vector fields X and Y on M by II(X, Y ) = DX Y − ∇M X Y,
(6.2.18)
where ∇M is the intrinsic covariant derivative on M , given by Proposition 6.1.4. Clearly, for f ∈ C ∞ (M ), (6.2.19)
II(f X, Y ) = f II(X, Y ). ∞
Furthermore, if g ∈ C (M ), II(X, gY ) = DX (gY ) − ∇M X (gY ) (6.2.20)
= gDX Y + (Xg)Y − g∇M X Y − (Xg)Y = g II(X, Y ).
6.2. Geometry of surfaces II: curvature
245
We also note that II(X, Y ) = P ⊥ DX Y,
(6.2.21)
as a consequence of Proposition 6.1.5. In particular, II(X, Y )(x) ∈ νx M,
(6.2.22)
∀ x ∈ M.
A smooth function ξ : M → R with the property that ξ(x) ∈ νx M for all x is called a normal field. k
The following important identity connects DX P to the second fundamental form. Proposition 6.2.3. If X and Y are smooth vector fields on M , (6.2.23)
(DX P )Y = II(X, Y ).
Proof. We have (6.2.24)
DX Y = DX (P Y ) = (DX P )Y + P (DX Y ) = (DX P )Y + ∇M X Y,
hence (6.2.25)
(DX P )Y = DX Y − ∇M X Y = II(X, Y ).
From this, (6.2.13), and (6.2.15), we have: Corollary 6.2.4. If ξ is a smooth normal field on M , then (DX P )ξ is a vector field on M , uniquely specified by the identity ⟨(DX P )ξ, Y ⟩ = ⟨ξ, II(X, Y )⟩,
(6.2.26)
for all vector fields Y on M . This last identity motivates us to bring in an object called the Weingarten map, defined as follows. If ξ is a normal field on M , we define Aξ (x) : Tx M −→ Tx M
(6.2.27) by
⟨Aξ X, Y ⟩ = ⟨ξ, II(X, Y )⟩.
(6.2.28) Let us note that
M II(X, Y ) − II(Y, X) = (DX Y − DY X) − (∇M X Y − ∇Y X)
(6.2.29)
= [X, Y ] − [X, Y ] = 0,
so (6.2.30)
II(X, Y ) = II(Y, X),
and hence (6.2.31)
Atξ = Aξ on Tx M.
From (6.2.26) we have (6.2.32)
(DX P )ξ = Aξ X.
246
6. Differential geometry of surfaces
Using (6.2.33)
0 = DX (P ξ) = (DX P )ξ + P DX ξ,
we are led from (6.2.32) to the following result, known as the Weingarten formula. Proposition 6.2.5. If ξ is a normal field and X is a vector field on M , then (6.2.34)
P DX ξ = −Aξ X.
Second proof. We know P DX ξ is tangent to M . Given a vector field Y , ⟨P DX ξ, Y ⟩ = ⟨DX ξ, Y ⟩ (6.2.35)
= DX ⟨ξ, Y ⟩ − ⟨ξ, DX Y ⟩ = −⟨ξ, DX Y − ∇M XY⟩ = −⟨ξ, II(X, Y )⟩,
where we have used ⟨ξ, Y ⟩ ≡ 0 and ⟨ξ, ∇M X Y ⟩ ≡ 0. The last identity in (6.2.35) yields (6.2.34). In order to state a more definitive result, we can define a covariant derivative ∇ν on normal fields on M by (6.2.36)
∇νX ξ = P ⊥ DX ξ,
when X is a vector field on M and ξ is a normal field. One readily verifies analogues of (6.1.58)–(6.1.59): (6.2.37)
∇νf X ξ = f ∇νX ξ, ∇νX (f ξ) = f ∇νX ξ + (Xf )ξ.
Furthermore, parallel to (6.1.60), if η is also a normal field, (6.2.38)
X⟨ξ, η⟩ = ⟨∇νX ξ, η⟩ + ⟨ξ, ∇νX η⟩,
thanks to the identity X⟨ξ, η⟩ = ⟨DX ξ, η⟩ + ⟨ξ, DX η⟩. Using (6.2.36), we deduce the following extension of Proposition 6.2.5. Proposition 6.2.6. In the setting of Proposition 6.2.5, (6.2.39)
DX ξ = ∇νX ξ − Aξ X,
The right side splits DX ξ into the sum of a normal field on M and a vector field tangent to M . In case k = n + 1, so M has codimension 1, with smooth unit normal N , we can combine (6.2.34) with (6.2.8), to deduce the following classical Weingarten formula: Corollary 6.2.7. If M has codimension 1 in Rn+1 , with smooth unit normal field N , and X is a vector field on M , then (6.2.40)
DX N = −AN X.
6.2. Geometry of surfaces II: curvature
247
Consequently, (6.2.9) yields the formula K(x) = (−1)n det AN (x),
(6.2.41)
for the Gauss curvature of M at x. In case M has codimension 1 in Rk and we have in hand the smooth unit e by normal N , it is natural to define a real-valued form II, e (6.2.42) II(X, Y ) = II(X, Y )N. This is the classical case of the second fundamental form. e when M has codimension It is useful to note the following characterization of II, k 1 in R . Translating and rotating coordinates, we can move a specific point p ∈ M to the origin in Rk , and suppose M is given locally by xk = f (x′ ),
(6.2.43)
∇f (0) = 0,
′
where x = (x1 , . . . , xk−1 ). We can then identify the tangent space of M at p with Rk−1 . Proposition 6.2.8. In the setting described in the previous paragraph, the second fundamental form of M at p is given by (6.2.44)
e II(X, Y)=
k−1 ∑ j,ℓ=1
∂2f (0) Xj Yℓ . ∂xj ∂xℓ
Proof. From (6.2.34) we have, for any ξ normal to M , ⟨II(X, Y ), ξ⟩ = −⟨DX ξ, Y ⟩.
(6.2.45) Taking
( ∂f ) ∂f ξ= − ,...,− ,1 ∂x1 ∂xk−1
(6.2.46)
gives the desired formula.
If M is a surface in R3 , given locally by x3 = f (x1 , x2 ) with f (0) = 0 and ∇f (0) = 0, then the Gauss curvature of M at the origin is seen by (6.2.44) to equal ( ∂ 2 f (0) ) (6.2.47) det . ∂xj ∂xℓ Besides providing a good conception of the second fundamental form of a codimension 1 surface in Rk , Proposition 6.2.8 leads to useful formulas for computation, one of which we will give in (6.2.53). First, we give a more invariant reformulation of Proposition 6.2.8. Suppose the (k − 1)-dimensional surface M in Rk is given by (6.2.48)
u(x) = c,
with ∇u ̸= 0 on M . Then we can use the computation (6.2.45) with ξ = ∇u to obtain (6.2.49)
⟨II(X, Y ), ∇u⟩ = −Y · (D2 u)X,
where D2 u is the k × k matrix of second-order partial derivatives of u. In other words, e (6.2.50) II(X, Y ) = −∥∇u∥−1 Y · (D2 u)X,
248
6. Differential geometry of surfaces
for X and Y tangent to M . In particular, if M is a two-dimensional surface in R3 , given by (6.2.48), then the Gauss curvature at p ∈ M is given by (6.2.51) K(p) = ∥∇u(p)∥−2 det(D2 u) Tp M , where D2 u|Tp M denotes the restriction of the quadratic form D2 u to the tangent space Tp M . With this calculation we can derive the following formula, extending (6.2.47). Proposition 6.2.9. If M ⊂ R3 is given by (6.2.52)
x3 = f (x1 , x2 ),
then, at p = (x′ , f (x′ )) ∈ M , the Gauss curvature is given by ( ∂2f ) (6.2.53) K(p) = (1 + ∥∇f (x′ )∥2 )−2 det . ∂xj ∂xℓ Proof. We can apply (6.2.51) with u(x) = f (x1 , x2 ) − x3 . Note that ∥∇u∥2 = 1 + ∥∇f (x′ )∥2 and ( 2 ) D f 0 2 (6.2.54) D u= . 0 0 Noting that a basis of Tp M is given by (1, 0, ∂1 f ) = v1 , (0, 1, ∂2 f ) = v2 , we obtain ( ) det vj · (D2 u)vk 2 = (1 + ∥∇f (x′ )∥2 )−1 det D2 f, (6.2.55) det D u Tp M = det(vj · vk )
which yields (6.2.53).
Intrinsic curvature So far, we have considered how M is curved in Rk . That is, we have examined extrinsic measures of curvature of M . We now look at intrinsic measures of curvature of M . In this setting, M can be any n-dimensional Riemannian manifold, not necessarily a surface in Rk . One distinguishing property of Rn with the standard flat metric (δjk ) and associated covariant derivative D is that, for any two vector fields X and Y on Rn , (6.2.56)
DX DY − DY DX = D[X,Y ] .
With this in mind, if M is an n-dimensional Riemannian manifold, with Levi-Civita covariant derivative ∇, we set (6.2.57)
R(X, Y )Z = ∇X ∇Y Z − ∇Y ∇X Z − ∇[X,Y ] Z.
We call R the Riemann curvature of M . It is clearly additive in each term X, Y , and Z. The following property implies it is linear in each variable, over C ∞ (M ). Proposition 6.2.10. Given f, g, h ∈ C ∞ (M ), (6.2.58)
R(f X, gY )hZ = f ghR(X, Y )Z.
6.2. Geometry of surfaces II: curvature
249
Proof. It suffices to treat f, g and h separately. To start, (6.2.59)
R(f X, Y )Z = ∇f X ∇Y Z − ∇Y ∇f X Z − ∇[f X,Y ] Z.
Use the identity [f X, Y ] = f [X, Y ] − (Y f )X to write this as (6.2.60)
f ∇X ∇Y Z − ∇Y (f ∇X Z) − f ∇[X,Y ] Z + (Y f )∇X Z,
and then write the second term in (6.2.60) as −f ∇Y ∇X Z − (Y f )∇X Z,
(6.2.61)
to get (6.2.58) when g = h = 1. The other cases are similar.
It follows that R(X, Y )Z is determined by its behavior on the coordinate vector fields Dj = ∂/∂xj . We define the components Ra bjk of the Riemann curvature (in a coordinate system) by R(Dj , Dk )Db = Ra bjk Da
(6.2.62)
(using the summation convention). Since [Dj , Dk ] = 0, (6.2.63)
R(Dj , Dk )Db = ∇Dj ∇Dk Db − ∇Dk ∇Dj Db .
Recall from (6.1.69) that ∇Dk Db = Γa bk Da ,
(6.2.64)
where Γa bk are the Christoffel symbols, given by (6.1.70). Plugging this into (6.2.63) and using the derivation property (6.1.59) readily gives the formula ∂ a ∂ a (6.2.65) Ra bjk = Γ bk − Γ bj + Γa cj Γc bk − Γa ck Γc bj . ∂xj ∂xk These formulas can be written in shorter form, as follows. For each j and k in {1, . . . , n}, we define n × n matrices (6.2.66)
Γj = (Γa bj ),
Rjk = (Ra bjk ).
Then (6.2.63) is equivalent to (6.2.67)
Rjk =
∂ ∂ Γk − Γj + [Γj , Γk ], ∂xj ∂xk
where [Γj , Γk ] = Γj Γk − Γk Γj is the matrix commutator. Note that Rjk is antisymmetric in j and k. Now we can define a “connection 1-form” Γ and a “curvature 2-form” Ω by ∑ 1∑ (6.2.68) Γ= Γj dxj , Ω = Rjk dxj ∧ dxk , 2 j j,k
and the formula (6.2.67) is equivalent to (6.2.69)
Ω = dΓ + Γ ∧ Γ.
We next mention a couple of basic symmetries of the Riemann curvature. Proposition 6.2.11. The Riemann curvature satisfies (6.2.70)
R(X, Y )Z = −R(Y, X)Z,
and (6.2.71)
⟨R(X, Y )Z, W ⟩ = −⟨Z, R(X, Y )W ⟩.
250
6. Differential geometry of surfaces
Proof. The identity (6.2.70) is immediate from the definition (6.2.57). Next, the metric property (6.1.60) of ∇ yields 0 = (XY − Y X − [X, Y ])⟨Z, W ⟩
(6.2.72)
= ⟨R(X, Y )Z, W ⟩ + ⟨Z, R(X, Y )W ⟩,
which gives (6.2.71). We set (6.2.73)
Rabjk = gac Rc bjk = ⟨Da , R(Dj , Dk )Db ⟩.
Then the identities (6.2.70)–(6.2.71) become Rabjk = −Rabkj
(6.2.74)
= −Rbajk .
Having defined the Riemann curvature of a general n-dimensional Riemannian manifold and developed some of its basic properties, we turn back to the case of a smooth n-dimensional surface M ⊂ Rk , with its induced metric, and relate the Riemann curvature to the second fundamental form. We again make use of Proposition 6.1.5, which says the Levi-Civita covariant derivative ∇ on M is given by ∇X Y = P DX Y.
(6.2.75) Consequently, (6.2.57) implies
R(X, Y )Z = P DX (P DY Z) − P DY (P DX Z) − P D[X,Y ] Z (6.2.76)
= P DX DY Z − P DY DX Z − P D[X,Y ] Z + P (DX P )(DY Z) − P (DY P )(DX Z).
Now DX DY Z − DY DX Z − D[X,Y ] Z = 0, so we are left with the last two terms. Furthermore, P (DX P )(DY Z) = (DX P )P ⊥ (DY Z) (6.2.77)
= (DX P ) II(Y, Z) = AII(Y,Z) X,
the first identity by (6.2.17), the second by (6.2.21), and the third by (6.2.32). This yields the following result. Proposition 6.2.12. The Riemann curvature of a surface M ⊂ Rk satisfies (6.2.78)
R(X, Y )Z = AII(Y,Z) X − AII(X,Z) Y,
if X, Y , and Z are vector fields on M . Equivalently, if also W is a vector field on M, (6.2.79)
⟨R(X, Y )Z, W ⟩ = ⟨II(Y, Z), II(X, W )⟩ − ⟨II(X, Z), II(Y, W )⟩.
Corollary 6.2.13. Assume k = n + 1, so M is a codimension 1 surface. Then ( ) e e II(X, W ) II(X, Z) (6.2.80) ⟨R(X, Y )Z, W ⟩ = det e . e II(Y, W ) II(Y, Z)
6.2. Geometry of surfaces II: curvature
251
In the setting of Corollary 6.2.13, we have ( ) e e II(X, X) II(X, Y) (6.2.81) ⟨R(X, Y )Y, X⟩ = det e . e II(Y, X) II(Y, Y) Also, in this setting, (6.2.82)
e II(X, Y ) = ⟨AN X, Y ⟩.
We therefore have the following, known as Gauss’ Theorema Egregium. Proposition 6.2.14. Assume M ⊂ R3 is a 2D surface. Take p ∈ M , and let U and V be vector fields on M such that {U (p), V (p)} forms an orthonormal basis of Tp M . Then the Gauss curvature of M at p is given by (6.2.83)
K(p) = ⟨R(U, V )V, U ⟩(p).
Proof. By (6.2.41), K(p) = det AN (p), but ( ) ⟨AN U, U ⟩ ⟨AN U, V ⟩ (6.2.84) det AN (p) = det , ⟨AN V, U ⟩ ⟨AN V, V ⟩ and the result then follows from (6.2.81)–(6.2.82).
It follows that the Gauss curvature is an intrinsic quantity on a 2D surface. In fact, if M is a 2D Riemannian manifold, not necessarily a surface in R3 , one can define its Gauss curvature at p ∈ M by (6.2.83). One can readily show that the right side of (6.2.83) is independent of the choice of orthonormal basis of Tp M . Going further, if we take two vectors X, Y ∈ Tp M and expand them in terms of an orthonormal basis {U, V }, the following result is a straightforward consequence of Propositions 6.2.10–6.2.11 and (6.2.83). Proposition 6.2.15. If X and Y are vector fields on a 2D Riemannian manifold M , with Gauss curvature K(x), then ( ) ⟨X, X⟩ ⟨X, Y ⟩ (6.2.85) ⟨R(X, Y )Y, X⟩ = K(x) det . ⟨X, Y ⟩ ⟨Y, Y ⟩ If we take a coordinate chart on M and set X = D1 , Y = D2 , the coordinate vector fields, then the left side of (6.2.85) becomes (6.2.86)
⟨R(D1 , D2 )D2 , D1 ⟩ = R1212 ,
as defined in (6.2.73), and the matrix on the right is G(x), the matrix defining the metric tensor. Then (6.2.85) yields the identity (6.2.87)
K(x) =
R1212 (x) , g(x)
g(x) = det G(x),
when dim M = 2. Here is an explicit calculation of the Gauss curvature for an important class of 2D manifolds.
252
6. Differential geometry of surfaces
Proposition 6.2.16. Let M be a 2D Riemannian manifold. Suppose that one has a coordinate chart in which the metric tensor takes the form ( ) G1 (x) (6.2.88) G(x) = , g(x) = G1 (x)G2 (x). G2 (x) Then the Gauss curvature is given by ( ∂ G )] 1 [ ( ∂1 G2 ) 2 1 (6.2.89) K(x) = − √ ∂1 √ + ∂2 √ . 2 g g g Proof. One can first compute that
( ) 1 ∂1 G1 /G1 ∂2 G1 /G1 Γ1 = (Γ b1 ) = , 2 −∂2 G1 /G2 ∂1 G2 /G2 ( ) 1 ∂2 G1 /G1 −∂1 G2 /G1 . Γ2 = (Γa b2 ) = 2 ∂1 G2 /G2 ∂2 G2 /G2 a
(6.2.90)
Then, computing R12 = (Ra b12 ) = ∂1 Γ2 − ∂2 Γ1 + Γ1 Γ2 − Γ2 Γ1 , we have 1 ( ∂1 G2 ) 1 ( ∂2 G1 ) − ∂2 R1 212 = − ∂1 2 G1 2 G1 1 ( ∂1 G1 ∂1 G2 ∂2 G1 ∂2 G2 ) (6.2.91) + − + 4 G1 G1 G1 G2 ( 1 ∂2 G1 ∂2 G1 ∂1 G2 ∂1 G2 ) − − . 4 G1 G1 G1 G 2 Now R1212 = G1 R1 212 in this case, and (6.2.87) yields (6.2.92)
K(x) =
1 1 1 R1212 = R 212 . G1 G2 G2
If we divide (6.2.91) by G2 , then in the resulting formula for K(x) interchange G1 and G2 , and ∂1 and ∂2 , and sum the two formulas for K(x), we get 1 [ 1 ( ∂1 G2 ) 1 ( ∂1 G2 )] K(x) = − ∂1 + ∂1 4 G2 G1 G1 G2 (6.2.93) 1 [ 1 ( ∂2 G1 ) 1 ( ∂2 G1 )] − ∂2 + ∂2 , 4 G1 G2 G2 G1 which is readily transformed into (6.2.89).
Coordinates in which the metric tensor takes the form (6.2.88) are called orthogonal coordinates. If in addition we have G1 = G2 , they are called isothermal coordinates. In such a case, (6.2.89) specializes to the following neat formula. Corollary 6.2.17. Suppose dim M = 2, and one has an isothermal coordinate system, in which (6.2.94)
gjk (x) = e2v(x) δjk ,
for a smooth v. Then the Gauss curvature is given by ( ∂2v ∂2v ) (6.2.95) K(x) = −e−2v + . ∂x21 ∂x22
6.2. Geometry of surfaces II: curvature
253
A significant example of Corollary 6.2.17 is provided by the Poincar´e disk, 4 (6.2.96) D = {x ∈ R2 : ∥x∥ < 1}, gjk (x) = δjk . (1 − ∥x∥2 )2 Application of (6.2.95) yields K ≡ −1,
(6.2.97)
for this metric. A related example is the Poincar´e upper half plane, (6.2.98)
U = {x ∈ R2 : x2 > 0},
gjk (x) = x−2 2 δjk ,
again yielding (6.2.97). It is clear that one cannot obtain the Gauss curvature of an n-dimensional M ⊂ Rn+1 from its Riemann curvature R when n = 1. In fact, if n = 1, (6.2.70) implies R ≡ 0, while the Gauss curvature is given by (6.2.6). More generally, one cannot obtain K(x) from R when n is odd. On the other hand, we can obtain K(x) from R when the dimension of M is even. We discuss how this works. Assume M is a smooth surface of dimension n = 2m in Rn+1 . Take p ∈ M , and let {e1 , . . . , en } be an orthonormal basis of Tp M . For short, set A = AN . By (6.2.70), ( ) ⟨Aej , ea ⟩ ⟨Aej , eb ⟩ (6.2.99) ⟨R(ej , ek )eb , ea ⟩ = det . ⟨Aek , ea ⟩ ⟨Aek , eb ⟩ It follows that (6.2.100)
Aej ∧ Aek =
1 ∑ ⟨R(ej , ek )eb2 , eb1 ⟩eb1 ∧ eb2 . 2 b1 ,b2
Now
(6.2.101)
(det A) e1 ∧ · · · ∧ en = (Ae1 ∧ Ae2 ) ∧ · · · ∧ (Aen−1 ∧ Aen ) ∑ = 2−m ⟨R(e1 , e2 )eb2 , eb1 ⟩ · · · ⟨R(en−1 , en )ebn , ebn−1 ⟩ b
× eb1 ∧ eb2 ∧ · · · ∧ ebn−1 ∧ ebn , where the sum runs over b ∈ Sn , the set of permutations of {1, . . . , n}. Since eb1 ∧ · · · ∧ ebn = (sgn b) e1 ∧ · · · ∧ en , this leads to a formula for det A. Recalling (6.2.41), we obtain the following higher-dimensional extension of Proposition 6.2.14. Proposition 6.2.18. Assume M ⊂ Rn+1 is a surface of dimension n = 2m. Pick p ∈ M , and let {e1 , . . . , en } be an orthonormal basis of Tp M . Then the Gauss curvature of M at p is given by ∑ (sgn b)⟨R(e1 , e2 )eb2 , eb1 ⟩ · · · ⟨R(en−1 , en )ebn , ebn−1 ⟩. (6.2.102) K(p) = 2−n/2 b∈Sn
A variant of the formula (6.2.102) is given by taking arbitrary permutations of {e1 , . . . , en } and summing. This gives 2−n/2 ∑ (sgn a)(sgn b)⟨R(ea1 , ea2 )eb2 , eb1 ⟩ K(p) = n! (6.2.103) a,b∈Sn
· · · ⟨R(ean−1 , ean )ebn , ebn−1 ⟩.
254
6. Differential geometry of surfaces
We can use this formula to define the Gauss curvature of any Riemannian manifold M of dimension n = 2m, regardless of whether it is a surface in Rn+1 . One needs to check that the right side of (6.2.103) is independent of the choice of an orthonormal basis of Tp M . One way to get this involves the following objects. In addition to the curvature 2-form Ω in (6.2.68), one has the “curvature (2,2)-form” (6.2.104)
e=1 Ω 4
n ∑
Rabjk (dxa ∧ dxb ) ⊗ (dxj ∧ dxk ).
a,b,j,k=1
One can define a product on (k, ℓ)-forms by (6.2.105)
(α1 ⊗ β1 ) ∧ (α2 ⊗ β2 ) = (α1 ∧ α2 ) ⊗ (β1 ∧ β2 ),
given kj -forms αj and ℓj -forms βj . The product is a (k1 + k2 , ℓ1 + ℓ2 )-form. Now, if dim M = n = 2m, we form the (n, n)-form e =Ω e ∧ ··· ∧ Ω e (m factors). P(Ω)
(6.2.106)
If we take p ∈ M and choose a coordinate system about p in which the coordinate vector fields D1 , . . . , Dn are orthonormal, then comparison with the calculations (6.2.103)–(6.2.104) above give, at p, n! K(p) ωM ⊗ ωM , 2n/2 where, at p, ωM = dx1 ∧ · · · ∧ dxn (in this coordinate system). Consequently, if we take ωM to be the volume form on M , then (6.2.106)–(6.2.107) can be taken as a definition of the Gauss curvature on a general Riemannian manifold of dimension n = 2m. e = P(Ω)
(6.2.107)
Note that
(6.2.108)
e = R1212 (dx1 ∧ dx2 ) ⊗ (dx1 ∧ dx2 ), n = 2 =⇒ Ω √ e = Ω, e and ωM = g dx1 ∧ dx2 , and P(Ω) e = 1 R1212 ωM ⊗ ωM , =⇒ P(Ω) g
in which case (6.2.107) implies (6.2.109)
K=
1 R1212 , g
for n = 2,
and we recover (6.2.87).
Exercises 1. Give another proof of Proposition 6.2.3, starting with 0 = DX (P ⊥ Y ) = −(DX P )Y + P ⊥ DX Y, and using (6.2.21). 2. Discuss how (6.2.6) is a special case of the Weingarten identity (6.2.40).
Exercises
255
3. We say a 2D Riemannian manifold has a Clairaut parametrization if there are coordinates (u, v) in which the metric tensor takes the form ( ) G1 (u) G(u, v) = G2 (u) (with no v dependence). Show that the Gauss curvature is given by 1 d G′2 (u) √ K(u) = − √ , 2 g(u) du g(u)
g = G1 G2 .
4. Let M ⊂ R3 be a surface of revolution, with coordinate chart ( ) X(u, v) = g(u), h(u) cos v, h(u) sin v , obtained by taking the curve γ(u) = (g(u), h(u), 0) in the xy-plane and rotating it about the x-axis in R3 . Show that in these coordinates the metric tensor is given by ( ′ ) |γ (u)|2 G(u, v) = . h(u)2 Apply Exercise 3 to compute the Gauss curvature. 5. Let γs : [a, b] → M be a smooth family of curves satisfying γs (a) ≡ p, γs (b) ≡ q. Assume γ0 has unit speed. Define V (t) = ∂s γs (t) s=0 ∈ Tγ0 (t) M. Recall the energy function E(γs ). Show that ∫ b [ ] d2 E(γ ) = 2 ⟨R(V, T )V, T ⟩ + ⟨∇T V, ∇T V ⟩ + ⟨∇V V, ∇T T ⟩ dt. s 2 ds s=0 a Note that the last term in the integrand vanishes if γ0 is a geodesic. Show that, since V (a) = V (b) = 0, the middle term in the integrand can be replaced by −⟨V, ∇T ∇T V ⟩, so, if γ0 is a geodesic, ∫ b [ ] d2 (6.2.110) E(γs ) = −2 ⟨R(V, T )T, V ⟩ + ⟨V, ∇T ∇T V ⟩ dt. ds2 s=0 a 6. Let M ⊂ Rk be a smooth n-dimensional surface, and let X, Y , and Z be smooth vactor fields on M . Use (6.2.18) and (6.2.39) to verify that M M ν DX DY Z = ∇M X ∇Y Z + II(X, ∇Y Z) − AII(Y,Z) X + ∇X II(Y, Z).
Get a similar expression for DY DX Z, and apply (6.2.18) to D[X,Y ] Z. Use the fact that DX DY Z − DY DX Z − D[X,Y ] Z = 0 to deduce that the Riemann curvature of M satisfies { M R(X, Y )Z = − II(X, ∇M Y Z) + II(Y, ∇X Z) + II([X, Y ], Z) } (6.2.111) − ∇νX II(Y, Z) + ∇νY II(X, Z) { } + AII(Y,Z) X − AII(X,Z) Y .
256
6. Differential geometry of surfaces
The quantity in the first set of braces is normal to M , so it vanishes. This vanishing is called Codazzi’s equation. The identity of R(X, Y )Z with the quantity in the second set of braces is equivalent to the Gauss equation (6.2.78). 7. In the setting of Exercise 6, assume k = n + 1. Take the inner product of both sides of (6.2.111) with N and deduce that Codazzi’s equation is equivalent to (6.2.112)
M ∇M X (AN Y ) − ∇Y (AN X) = AN ([X, Y ]).
Hint. Start with M ⟨N, II(X, ∇M Y Z)⟩ = ⟨AN X, ∇Y Z⟩ = Y ⟨AN X, Z⟩ − ⟨∇M Y (AN X), Z⟩, and then show that ⟨N, ∇νY II(X, Z)⟩ = Y ⟨AN X, Z⟩. 8. Take M ⊂ Rk as in Exercise 7 (k = dim M + 1). We say a point p ∈ M is an umbilic if AN (p) = λ(p)I, for some λ(p) ∈ R. Assume that each p ∈ M is an umbilic and that M is connected. Show that λ must be constant. Hint. Apply the Codazzi equation (6.2.112). Deduce that (Xλ)Y = (Y λ)X for arbitrary smooth vector fields X and Y on M , hence Xλ ≡ Y λ ≡ 0.
6.3. Geometry of surfaces III: the Gauss-Bonnet theorem Here we establish results that make contact between material on curvature in §6.2 and material of §5.3, including material on the degree of the Gauss map, and material on the Euler characteristic, defined in (5.3.29). We will also bring in the notion of parallel transport. To begin, assume M is a smooth, compact surface of dimension n in Rn+1 , with smooth unit normal N → S n . As seen in §5.3, the degree of this map satisfies ∫ (6.3.1) Deg(N ) = N ∗ ω, M n
for any n-form ω on S that integrates to 1. In particular, we can take ω = A−1 n ωS ,
(6.3.2)
where ωS is the volume form of S n , and (cf. (3.2.32)) ∫ 2π (n+1)/2 (6.3.3) An = ω S = . Γ((n + 1)/2) Sn
Recall that for each x ∈ M , DN (x) : Tx M → TN (x) S n = Tx M . We leave the following result as an exercise. Proposition 6.3.1. In the setting described above, (6.3.4)
N ∗ ωS = (det DN ) ωM ,
where ωM is the volume form of M .
6.3. Geometry of surfaces III: the Gauss-Bonnet theorem
257
Recalling the characterization of Gauss curvature in (6.2.9), we deduce the following. Proposition 6.3.2. If M ⊂ Rn+1 is a compact, n-dimensional surface, with smooth unit normal N : M → S n , and associated Gauss curvature K, then ∫ 1 Deg(N ) = (6.3.5) K(x) dS(x). An M
Putting this together with Corollary 5.3.17, we have the following. Proposition 6.3.3. Take M as in Proposition 6.3.2, and assume dim M = n = 2m is even. Then 1 (6.3.6) Deg(N ) = χ(M ), 2 where χ(M ) is the Euler characteristic of M , defined by (5.3.29). Consequently ∫ An (6.3.7) K(x) dS(x) = χ(M ). 2 M
Proof. (Compare Exercise 11 of §5.3.) Corollary 5.3.17 applies directly to a neighborhood Ω of M whose boundary is essentially two copies of M , with normals N and −N . The conclusion (5.3.28)–(5.3.29) then becomes χ(M ) = Deg(N ) + Deg(−N )
(6.3.8)
= (1 + (−1)n ) Deg(N ),
which yields (6.3.6) when n is even. Specializing (6.3.7) to n = 2, we get ∫ (6.3.9) K(x) dS(x) = 2πχ(M ), M
when M ⊂ R is a compact, 2D surface. This is the classical Gauss-Bonnet formula, for surfaces without boundary. See Figure 6.3.1 for an example, involving a threeholed torus. As seen in §5.3, Exercise 19, χ(M ) = −4 in this case, so ∫ K dS = −8π. 3
M
The classical formula also treats 2D surfaces with boundary, and we will want to derive such a result below. We also want to extend (6.3.9) from compact 2D surfaces in R3 to arbitrary compact 2D Riemannian manifolds. One approach to such extensions involves the notion of parallel transport, which we now introduce. Let M be an n-dimensional Riemannian manifold and γ : [0, τ ] → M be a smooth (or piecewise smooth) curve. Say γ(0) = p, and take V0 ∈ Tp M . We say a vector field V along γ is defined by parallel transport (or “parallel translation”) if (6.3.10)
∇T V = 0,
T = γ′,
258
6. Differential geometry of surfaces
Figure 6.3.1. Three-holed torus, χ(M ) = −4,
∫ M
K dS = −8π.
with V (t) ∈ Tγ(t) M, V (0) = V0 . If γ is contained in a coordinate patch, in which (6.3.11)
γ(t) = (x1 (t), . . . , xn (t)),
we can set (6.3.12)
V (t) = v j (t)Dj ,
where Dj = ∂/∂xj . Here and below we use the summation convention. We obtain from (6.3.10) and (6.1.69) the following linear system of ODE for the components of V : dv a dxk (6.3.13) = −Γa bk v b . dt dt The following is a key result, relating curvature and parallel translation. Proposition 6.3.4. Let γ : [0, τ ] → M be a piecewise smooth closed loop on M , parametrized by arc length, with γ(τ ) = γ(0). If V (t) is a vector field over γ defined by parallel translation, with components as in (6.3.12), then ∫ ) 1∑ a ( a a (6.3.14) v (τ ) − v (0) = − R bjk dxj ∧ dxk v b (0) + O(τ 3 ), 2 j,k,b
A
where A is an oriented 2-surface in M with ∂A = γ.
6.3. Geometry of surfaces III: the Gauss-Bonnet theorem
259
Proof. If we put a coordinate system on a neighborhood of p = γ(0) ∈ M , then parallel transport is defined by (6.3.13), so ∫ t dxk a a Γa bk (γ(s))v b (s) ds. (6.3.15) v (t) = v (0) − ds 0 We hence have (6.3.16)
v a (t) = v a (0) − Γa bk (p)v b (0)(xk − pk ) + O(t2 ).
We can solve (6.3.13) up to O(t3 ) if we use (6.3.17)
Γa bj (x) = Γa bj (p) + (xk − pk )∂k Γa bj + O(|x − p|2 ).
Hence
∫
t[
] Γa bk (p) + (xj − pj )∂j Γa bk (p)
v a (t) = v a (0) − 0
(6.3.18)
[ ] dxk ds + O(t3 ). · v b (0) − Γb cℓ (p)v c (0)(xℓ − pℓ ) ds
If γ(τ ) = γ(0), we get
∫
τ
v a (τ ) = v a (0) −
xj dxk (∂j Γa bk )v b (0) ∫0 τ
(6.3.19)
xj dxk Γa bk Γb cj v c (0) + O(τ 3 ),
+ 0
the components of Γ and their first derivatives being evaluated at p. Now Stokes’ theorem gives ∫ ∫ xj dxk = dxj ∧ dxk , γ
so (6.3.20)
A
[ ](∫ ) v a (τ ) − v a (0) = − ∂j Γa bk − Γa ck Γc bj dxj ∧ dxk v b (0) + O(τ 3 ). A
Recall that the Riemann curvature is given by (6.2.65), i.e., (6.3.21)
Ra bjk = ∂j Γa bk − ∂k Γa bj + Γa cj Γc bk − Γa ck Γc bj .
Now the right side of (6.3.21) is the antisymmetrization, with respect to j and k, ∫ of the quantity in brackets in (6.3.20). Since A dxj ∧ dxk is antisymmetric in j and k, we get the desired formula (6.3.14). Let us specialize to dim M = 2, and use exponential coordinates centered at p, so gjk (x) = δjk + O(|x − p|2 ). Then ∫ ) (∫ √ g dx1 dx2 (1 + O(τ 2 )) dx1 dx2 = (6.3.22)
A
A
= (Area A)(1 + O(τ 2 )) = Area A + O(τ 3 ),
and (6.3.14) becomes (6.3.23)
v a (τ ) − v a (0) = −Ra b12 (p)(Area A)v b (0) + O(τ 3 ).
260
6. Differential geometry of surfaces
Furthermore, (6.3.24)
(
R
a
) b12
(
0 1 = K(p) −1 0
) = −K(p)J,
where K(p) is the Gauss curvature of M at p and J : Tp M → Tp M is counterclockwise rotation by 90◦ . We have the following. Corollary 6.3.5. In the setting of Proposition 6.3.4, if dim M = 2 and γ = ∂A, then (6.3.25)
V (τ ) − V (0) = K(p)(Area A)JV (0) + O(τ 3 ).
In order to make use of (6.3.25), we examine further the parallel transport of a vector V along a smooth segment of γ. Say γ : [a, b] → M is smooth, and, as before, γ ′ = T, ∥T ∥ ≡ 1. Note that, for a ≤ t ≤ b, (6.3.26)
d ∥V (t)∥2 = T ⟨V, V ⟩ = 2⟨∇T V, V ⟩ = 0, dt
so V (t) has constant length. Next, (6.3.27)
T ⟨V, T ⟩ = ⟨∇T V, T ⟩ + ⟨V, ∇T T ⟩ = ⟨V, ∇T T ⟩.
In particular, if γ is a geodesic for t ∈ [a, b], then ⟨V (t), T (t)⟩ is constant on that interval. Consequently, the angle between V (t) and T (t) is constant on that interval. With these observations in hand, we can examine what happens when one takes a vector V0 ∈ Tp M and parallel transports it along a curve γ that bounds a geodesic triangle A ⊂ M , having one vertex at p. Here and below, a “triangle” in a 2D manifold M is assumed to be homeomorphic to a standard triangle in R2 . As one sees from Figure 6.3.2, the resulting vector V3 ∈ Tp M is obtained from V0 by rotation in Tp M through an angle that depends on the angle defect δ = α+β +γ −π of the triangle. Indeed, if ξ is the angle that V0 (hence V1 ) makes with the geodesic from p to q, then the angle from V0 to V3 is (6.3.28)
(π + α) − (2π − β − γ − ξ) − ξ = α + β + γ − π.
That is to say, (6.3.29)
V3 = e(α+β+γ−π)J V0 .
We thus have from (6.3.25) that (6.3.30)
∫
α+β+γ−π =
K dS + O(τ 3 ), A
if the length of γ = ∂A is τ . Finally, we can get rid of the remainder term and obtain the following celebrated formula of Gauss. Proposition 6.3.6. If A is a geodesic triangle in a 2D Riemannian manifold M , with angles α, β, and γ, then ∫ (6.3.31) α + β + γ − π = K dS. A
6.3. Geometry of surfaces III: the Gauss-Bonnet theorem
261
Figure 6.3.2. Parallel transport along a geodesic triangle
Proof. Break up the geodesic triangle A into N 2 little geodesic triangles, each of diameter O(N −1 ) and area O(N −2 ). Since the angle defects are additive, the estimate (6.3.30) implies ∫ α + β + γ − π = K dS + N 2 O(N −3 ) A
∫
(6.3.32) =
K dS + O(N −1 ),
A
and taking the limit as N → ∞ yields (6.3.31).
Our next goal is to extend the formula (6.3.9) to general compact, 2D Riemannian manifolds. To begin, we assert that we can partition M into geodesic triangles. Suppose such a triangulation of M has (6.3.33)
F faces (triangles),
E edges,
V vertices.
If the angles of the jth triangle are αj , βj , and γj , then summing all the angles clearly produces 2πV . On the other hand, (6.3.31) applied to the jth triangle, and
262
6. Differential geometry of surfaces
summed over j, yields (6.3.34) ∫
∑
∫ (αj + βj + γj ) = πF +
j
K dS. M
Hence M K dS = (2V − F )π. Since in this case all the faces are triangles, counting each triangle three times will count each edge twice, so 3F = 2E. Thus we obtain ∫ (6.3.35) K dS = 2π(V − E + F ). M
Now we have Euler’s formula, (6.3.36)
χ(M ) = V − E + F,
whose derivation is discussed in Exercise 8 of §5.3. Putting together (6.3.35) and (6.3.36), we have the desired extension of (6.3.9). Remark. This argument depends on the existence of a triangulation of M as described above. See Exercise 8 below for a proof of (6.3.9), for a general compact 2D manifold M , that does not depend on a triangulation. We next want to extend Proposition 6.3.6 to cases where A is a triangle whose sides are not geodesics. To start, we pursue the study of parallel transport of V (t) along a smooth curve γ : [a, b] → M , begun in (6.3.26)–(6.3.27). Now we no longer assume γ is a geodesic, so (6.3.27) does not imply T ⟨V, T ⟩ = 0. Note that (6.3.37)
⟨T, T ⟩ ≡ 1 =⇒ T ⟨∇T T, T ⟩ = 0,
so ∇T T is orthogonal to T . Let us set (6.3.38)
N (t) = JT (t),
where J : Tγ(t) M → Tγ(t) M is counterclockwise rotation by 90◦ . (Note: N is not the Gauss map of M ; in fact, we do not assume M ⊂ R3 , so we do not have such a Gauss map.) Then ∇T T is parallel to N (t), so we set ∇T T = κ(t)N (t),
(6.3.39)
and this defines κ(t), which we call the geodesic curvature of γ at t. Compare (6.2.5), which deals with the case M = R2 . Next, note that (6.3.40)
⟨N, T ⟩ ≡ 1 ⇒ 0 = T ⟨N, T ⟩ = ⟨∇T N, T ⟩ + ⟨N, ∇T T ⟩ ⇒ ⟨∇T N, T ⟩ = −κ(t),
and (6.3.41)
⟨N, N ⟩ ≡ 1 =⇒ 0 = 2⟨∇T N, N ⟩,
so ∇T N is parallel to T . Hence (6.3.42)
∇T N = −κ(t)T (t).
Compare (6.2.6). Returning to V (t), solving ∇T V = 0, we have (6.3.43)
T ⟨V, T ⟩ = ⟨V, ∇T T ⟩ = κ⟨V, N ⟩, T ⟨V, N ⟩ = ⟨V, ∇T N ⟩ = −κ⟨V, T ⟩.
6.3. Geometry of surfaces III: the Gauss-Bonnet theorem
263
In other words, if we expand (6.3.44)
V (t) = v1 (t)T (t) + v2 (t)N (t),
we obtain the 2 × 2 system dv1 = κ(t)v2 (t), dt dv2 = −κ(t)v1 (t). dt
(6.3.45) Hence (6.3.46)
dz = −iκ(t)z(t) dt ∫t ⇒ z(t) = e−i a κ(s) ds z(a).
z(t) = v1 (t) + iv2 (t) ⇒
Hence V (t) = (v1 (t)I + v2 (t)J)T (t) (6.3.47)
= e−J
∫t a
κ(s) ds
(v1 (a)I + v2 (a)J)T (t).
This leads to the following result. Proposition 6.3.7. Let M be an oriented 2D Riemannian manifold, γ : [a, b] → M a smooth, unit speed curve. Let V (t) be obtained from V (a) ∈ Tγ(a) M by parallel transport. Assume V (a) makes the angle ξ0 with T (a), i.e., up to a positive real factor, (6.3.48)
V (a) = eξ0 J T (a).
Then, for t ∈ [a, b], V (t) makes with T (t) the angle ∫ t (6.3.49) ξ(t) = ξ0 − κ(s) ds, a
where κ(s) is the geodesic curvature of γ at γ(s). From here, a straightforward modification of the proof of Proposition 6.3.6 yields the following generalization. Proposition 6.3.8. Let A be a triangle in a 2D Riemannian manifold, whose boundary consists of three smooth arcs, meeting at angles α, β, and γ. Then ∫ ∫ (6.3.50) α + β + γ − π = K dS + κ ds, A
∂A
where κ is the geodesic curvature of (the smooth part of ) ∂A. Remark. ∫When one divides A into N 2 little triangles Aj , there arises a sum of N 2 terms ∂Aj κ ds. The integrals along curve segments that are interfaces of two ∫ triangles cancel out, so the sum of these terms is ∂A κ ds. Note that if O ⊂ M is a smoothly bounded domain whose closure is diffeomorphic to a closed disk, then O is a limiting case of “triangles” treated in Proposition 6.3.8, in which α = β = γ = π, so we have the following.
264
6. Differential geometry of surfaces
Proposition 6.3.9. If O is a smoothly bounded open subset of a compact 2D Riemannian manifold M , and O is diffeomorhic to a closed disk, then ∫ ∫ (6.3.51) K dS + κ ds = 2π. O
∂O
The following result deals with a broad class of 2D manifolds with boundary. Proposition 6.3.10. Let M be a compact, oriented, 2D manifold, and let Ω ⊂ M be a smoothly bounded open subset. Assume that M \Ω=
(6.3.52)
k ∪
Oj ,
j=1
where each Oj is connected, with closure Oj diffeomorphic to a closed disk. Then ∫ ∫ ( ) (6.3.53) K dS + κ ds = 2π χ(M ) − k . Ω
∂Ω
Proof. First observe the cancellation property ∫ k ∫ ∑ (6.3.54) κ ds + κ ds = 0. ∂Ω
j=1∂O j
Hence, by (6.3.9), extended from surfaces in R3 to M in the current setting, ∫ 2πχ(M ) = K dS M
∫
(6.3.55)
=
K dS + ∫
∫
κ ds +
K dS + Ω
K dS
j=1O j
Ω
=
k ∫ ∑
∂Ω
k [∫ ∑ j=1 O j
∫ K dS +
] κ ds .
∂Oj
By Proposition 6.3.9, each term in the last sum over j is equal to 2π, so (6.3.53) follows. Moving back to higher dimensions, we state the following extension of Proposition 6.3.3, known as the generalized Gauss-Bonnet theorem. Proposition 6.3.11. Let M be a compact Riemannian manifold of dimension n = 2m, and define its Gauss curvature K by (6.2.104)–(6.2.107). Then ∫ An (6.3.56) χ(M ). K(x) dS(x) = 2 M
Such a result is subject matter for a full-blown course in differential geometry. Treatments can be found in [43] and in Appendix C (Vol. 2) of [46].
Exercises
265
Exercises 1. Let M ⊂ Rk be an n-dimensional surface, with the induced metric tensor, and let γ : [a, b] → M be a smooth curve. For t ∈ [a, b], let P (t) denote the orthogonal projection of Rk onto Tγ(t) M . Take V0 ∈ Tγ(a) M . Show that the solution to dV = P ′ (t)V (t), V (a) = V0 dt is the vector field along γ obtained from V0 by parallel transport. Hint. Show that W (t) = P ⊥ (t)V (t) satisfies W ′ (t) = −P (t)P ′ (t)V (t) = −P ′ (t)P ⊥ (t)V (t) = −P ′ (t)W (t), to deduce that W (t) ≡ 0. Then check Proposition 6.1.5. 2. Let M1 and M2 be n-dimensional surfaces in Rk . Supppose M1 ∩ M2 contains a curve γ and p ∈ γ =⇒ Tp M1 = Tp M2 . Show that parallel transport along γ for M1 coincides with parallel transport along γ for M2 Hint. Use Exercise 1. 3. Let M be an oriented 2D Riemannian manifold. For each p ∈ M , define J : Tp M → Tp M to be counterclockwise rotation by 90◦ . Let ∇ be the Lavi-Civita covariant derivative on M . Show that, if X and Y are vector fields on M , ∇X JY = J∇X Y. Hint. See if you can get this from (6.3.39) and (6.3.42). 4. In the setting of Exercise 3, show that if X, Y , and Z are vector fields on M , R(X, Y )JZ = JR(X, Y )Z. 5. Let M be an oriented 2D Riemannian manifold, O ⊂ M an open set. Assume you have a smooth vector field E1 on O such that ⟨E1 , E1 ⟩ ≡ 1. Set E2 = JE1 . Define a 1-form ω on O by (6.3.57)
ω(X) = ⟨∇X E1 , E2 ⟩.
Show (using (5.2.35)) that, for all vector fields X, Y, Z on O, R(X, Y )Z = dω(X, Y ) JZ. In particular, R(E1 , E2 )E2 = −dω(E1 , E2 )E1 . Since the Gauss curvature K(x) = ⟨R(E1 , E2 )E2 , E1 ⟩, deduce that (6.3.58)
dω = −K σM ,
where σM denotes the area 2-form on M .
266
6. Differential geometry of surfaces
6. In the setting of Exercise 5, let γ : [a, b] → O be a smooth, unit-speed curve, T (t) = γ ′ (t). Define a smooth function φ : [a, b] → R such that T (t) = cos φ(t) E1 + sin φ(t) E2 . Show that the geodesic curvature of γ is given by κ(t) = φ′ (t) + ω(T ). 7. In the setting of Exercise 6, let Ω ⊂ O be a smoothly bounded domain, γ = ∂Ω. Apply Stokes’ theorem to get ∫ ∫ ∫ ∫ K dS = − ω = − κ(s) ds + φ′ (s) ds. Ω
∂Ω
∂Ω
∂Ω
Obtain variants of this when Ω has piecewise smooth boundary. Use this approach to produce alternative proofs of Propositions 6.3.6 and 6.3.8–6.3.10. 8. Let M be a compact, oriented, 2D Riemannian manifold, V a smooth vector field on M , {pj } its set of critical points, each assumed to be nondegenerate. Set E1 = ∥V ∥−1 V,
E2 = JE1 ,
on O = M \ {pj }.
Define the 1-form ω on O as in (6.3.57). For small ε > 0, let Dj (ε) denote the disk of radius ε centered at pj , and set ∪ Ωε = M \ Dj (ε). j
By (6.3.58),
∫
∫ K dS = −
Ωε
ω=
∂Ωε
∑ ∫ j
ω.
∂Dj (ε)
∫
Show that lim
ω = 2π indpj (V ).
ε→0 ∂Dj (ε)
(Recall (5.3.19)–(5.3.20) and Exercise 7 of §5.3.) Deduce that ∫ K dS = 2π Index(V ), M
thus obtaining another proof of the Gauss-Bonnet formula, (6.3.9), for general compact, oriented, 2D Riemannian manifolds. 9. In this exercise, we consider the compact 2D surface M depicted in Figure 6.3.3. This is shown sitting in a cube C ⊂ R3 , say C = [−π, π]3 , but we identify opposite faces of C and take (6.3.59)
M ⊂ T3 = R3 /2πZ3 ,
a compact surface without boundary.
Exercises
267
Figure 6.3.3. A surface M ⊂ T3 satisfying χ(M ) = −4
(a) Show that the natural isomorphism Tx T3 = R3 leads to a well defined Gauss map (6.3.60)
M ⊂ T3 =⇒ N : M → S 2 ,
and that, parallel to the case M ⊂ R3 , 1 χ(M ). 2 Furthermore, show that Proposition 6.3.2 continues to hold. (These observations extend more generally to n-dimensional M ⊂ Tn+1 , n even.) (6.3.61)
Deg(N ) =
(b) Show that the surface M depicted in Figure 6.3.3 is diffeomorphic to the 3-holed torus depicted in Figure 6.3.1, hence χ(M ) = −4.
(6.3.62)
(c) Combining (6.3.62) with either Part (a) or the Gauss-Bonnet theorem for general compact 2D manifolds, deduce that ∫ (6.3.63) K dS = −8π. M
(d) Now regard M as a compact surface with boundary (consisting of 6 circles) in R3 , and use Proposition 6.3.10, with M relabeled as Ω, and putting Ω in a surface diffeomorphic to S 2 . In this case, κ ≡ 0 on ∂Ω. Show that Proposition 6.3.10 (with
268
6. Differential geometry of surfaces
Figure 6.3.4. Regular tetrahedron in R3
these changes in labeling) also implies (6.3.63). This approach to (6.3.63) does not use χ(M ) = −4, just χ(S 2 ) = 2. (e) See if you can produce a surface M ⊂ T3 , looking like that in Figure 6.3.3, whose Gauss curvature satisfies (6.3.64)
K(x) < 0,
∀ x ∈ M.
10. Let T ⊂ R3 be a (solid) regular tetrahedron; cf. Figure 6.3.4. The angle θ between two faces is specified by √ (π ) 1 1 2 2 cos θ = , sin θ = , sin −θ = . 3 3 2 3 Assume one vertex of T is the origin in R3 . Dilate T by a factor of 2, and consider Ω = S 2 ∩ 2T . (a) Show that Ω ⊂ S 2 is a geodesic triangle, each of whose angles is equal to θ, specified above. (b) Deduce from the Gauss formula, Proposition 6.3.6, that Area(Ω) = 3θ − π,
6.4. Smooth matrix groups
269
hence π 1 − 3 sin−1 . 2 3 Remark. It is shown in §7.4 of [52] that sin−1 (1/3) is not a rational multiple of π. Area(Ω) =
6.4. Smooth matrix groups A smooth matrix group is a subset G ⊂ M (n, F),
(6.4.1)
F = R or C,
with the following two properties: (i) G is a smooth m-dimensional surface in the vector space M (n, F), (ii) G is a group, i.e., (6.4.2)
g1 , g2 ∈ G =⇒ g1 g2 ∈ G,
and
g ∈ G =⇒ g is invertible and g −1 ∈ G.
We assume G is nonempty, and note that the two conditions in (6.4.2) imply I ∈ G, where I ∈ M (n, F) is the n × n identity matrix. Here are some examples of smooth matrix groups. Gl(n, F) = {g ∈ M (n, F) : det g ̸= 0}, Sl(n, F) = {g ∈ M (n, F) : det g = 1}, (6.4.3)
O(n) = {g ∈ M (n, R) : g ∗ g = I}, SO(n) = {g ∈ O(n) : det g = 1}, U (n) = {g ∈ M (n, C) : g ∗ g = I}, SU (n) = {g ∈ U (n) : det g = 1}.
The fact that all these sets of matrices satisfy (6.4.2) follows from the identities (6.4.4)
det(g1 g2 ) = (det g1 )(det g2 ),
(g1 g2 )∗ = g2∗ g1∗ ,
and the fact that g ∈ M (n, F) is invertible if and only if det g ̸= 0. These facts also imply that if G ⊂ M (n, F) is a matrix group, then in fact (6.4.5)
G ⊂ Gl(n, F).
Note that the defining property det g ̸= 0 implies (6.4.6)
Gl(n, F) is open in M (n, F).
The fact that the other groups listed in (6.4.3) are smooth surfaces can be established using the submersion mapping theorem, Proposition 3.2.5, which we recall here. Proposition 6.4.1. Let V and W be finite dimensional vector spaces, Ω ⊂ V open, F : Ω ⊂ W a smooth map. Fix p ∈ W , and consider (6.4.7)
S = {x ∈ Ω : F (x) = p}.
270
6. Differential geometry of surfaces
Assume that, for each x ∈ S, DF (x) : V → W is surjective. Then S is a smooth surface in Ω. Furthermore, for each x ∈ S, Tx S = N DF (x).
(6.4.8)
To apply this to the groups listed in (6.4.3), we start with Sl(n, F). Here we take (6.4.9)
V = M (n, F),
W = F,
F : V → W,
F (A) = det A.
Now, given A invertible, (6.4.10)
F (A + B) = det(A + B) = (detA) det(I + A−1 B),
and we have, for X ∈ M (n, F), det(I + X) = 1 + Tr X + O(∥X∥2 ),
(6.4.11) so
DF (A)B = (det A) Tr(A−1 B).
(6.4.12)
Now, given A ∈ Sl(n, F), or even A ∈ Gl(n, F), it is readily verified that τA : M (n, F) → F,
(6.4.13)
τA (B) = Tr(A−1 B),
is nonzero, hence surjective, and Proposition 6.4.1 applies. We turn to O(n). In this case, (6.4.14)
V = M (n, R),
W = {X ∈ M (n, R) : X = X ∗ },
F : V −→ W,
F (A) = A∗ A.
Now, given A ∈ V , (6.4.15)
F (A + B) = A∗ A + A∗ B + B ∗ A + O(∥B∥2 ),
so (6.4.16)
DF (A)B = A∗ B + B ∗ A = A∗ B + (A∗ B)∗ .
We claim that (6.4.17)
A ∈ O(n) =⇒ DF (A) : M (n, R) → W is surjective.
Indeed, given X ∈ W , i.e., X = X ∗ ∈ M (n, R), we have 1 AX =⇒ DF (A)B = X. 2 Again Proposition 6.4.1 applies so O(n) is a smooth surface. (6.4.18)
B=
Similar arguments apply to U (n). For SU (n), we take (6.4.19)
V = M (n, C),
W = {X ∈ M (n, C) : X = X ∗ } ⊕ R,
F : V −→ W,
F (A) = (A∗ A, Im det A).
Note that A ∈ U (n) implies | det A| = 1, so Im det A = 0 ⇔ det A = ±1. As a further comment on O(n), we note that, given A ∈ M (n, R), defining A : Rn → Rn , (6.4.20)
A ∈ O(n) ⇐⇒ ⟨Au, Av⟩ = ⟨u, v⟩,
∀ u, v ∈ Rn ,
6.4. Smooth matrix groups
271
where ⟨u, v⟩ is the Euclidean inner product on Rn , ∑ (6.4.21) ⟨u, v⟩ = uj vj , j
where u = (u1 , . . . , un ) , etc. Similarly, given A ∈ M (n, C), defining A : Cn → Cn , t
(6.4.22)
A ∈ U (n) ⇐⇒ (Au, Av) = (u, v),
∀ u, v ∈ Cn ,
where (u, v) denotes the Hermitian inner product on Cn , ∑ (6.4.23) (u, v) = uj v j . j
Note that ⟨u, v⟩ = Re(u, v)
(6.4.24)
defines the Euclidean inner product on Cn ≈ R2n , and we have (6.4.25)
U (n) ,→ O(2n).
If G ⊂ M (n, F) is a smooth matrix group, it is of particular interest to consider the tangent space to G at the identity element, g = TI G ⊂ M (n, F),
(6.4.26)
an R-linear subspace. For the groups listed in (6.4.3), we have the following specific identifications: TI Sl(n, F) = {A ∈ M (n, F) : Tr A = 0}, TI O(n) = {A ∈ M (n, R) : A∗ = −A} = TI SO(n),
(6.4.27)
TI U (n) = {A ∈ M (n, C) : A∗ = −A}, TI SU (n) = {A ∈ M (n, C) : A∗ = −A, Tr A = 0}.
For the first two identities, take A = I in (6.4.12) and (6.4.16), respectively, yielding DF (I)A = Tr A and DF (I)A = A + A∗ , respectively. Having (6.4.27) and making use of the identities (6.4.28)
etB
−1
AB
= B −1 etA B,
∗
etA = (etA )∗ ,
det etA = et Tr A ,
one readily verifies the following, for (6.4.29)
Exp(A) = eA =
∞ ∑ 1 k A . k!
k=0
Proposition 6.4.2. For each matrix group listed in (6.4.3), (6.4.30)
Exp : TI G −→ G.
Here is a significant extension of Proposition 6.4.2. Proposition 6.4.3. Let G ⊂ M (n, F) be a smooth matrix group, g = TI G. Then, for A ∈ M (n, F), we have (6.4.31)
A ∈ g ⇐⇒ etA ∈ G, ∀ t ∈ R.
272
6. Differential geometry of surfaces
The “⇐” part is clear from the identity (d/dt)etA |t=0 = A. As for the “⇒” part, we have this for G in (6.4.3), by Proposition 6.4.2. We will postpone the proof of the rest of Proposition 6.4.3 until later in this section, but will proceed to develop some consequences of this result, starting with the following. For A, B ∈ M (n, F), set [A, B] = AB − BA.
(6.4.32)
This is called the commutator, or the Lie bracket of A and B. Proposition 6.4.4. Let G ⊂ M (n, F) be a smooth matrix group, g = TI G. Then A, B ∈ g =⇒ [A, B] ∈ g.
(6.4.33)
Remark. For the tangent spaces TI G listed in (6.4.27), the implication (6.4.33) is straightforward, via such identities as [A, B]∗ = −[A∗ , B ∗ ],
(6.4.34)
Tr[A, B] = 0.
We turn to the general situation. Proof. Given g ∈ G, A ∈ g, g −1 etA g = etg
(6.4.35)
−1
Ag
,
∀ t,
and the left side of (6.4.35) belongs to G, so by (6.4.31) we have g −1 Ag ∈ g,
(6.4.36)
∀ g ∈ G, A ∈ g.
Setting g = e , B ∈ g, we have tB
(6.4.37)
e−tB AetB ∈ g,
∀ A, B ∈ g.
Applying d/dt at t = 0 gives (6.4.33)
The commutator [A, B] = AB − BA gives g the structure of a Lie algebra. We aim to establish further relations between the Lie algebra structure of g and the group structure of G. To begin, for A, B ∈ g, we have, for small t, ( )( ) t2 t2 etA etB = I + tA + A2 + O(t3 ) I + tB + B 2 + O(t3 ) 2 2 (6.4.38) t2 2 = I + t(A + B) + (A + 2AB + B 2 ) + O(t3 ), 2 and similarly (6.4.39)
etB etA = I + t(A + B) +
t2 2 (A + 2BA + B 2 ) + O(t3 ), 2
hence (6.4.40)
etA etB = etB etA + t2 [A, B] + O(t3 ).
Consequently, etA etB e−tA e−tB = I + t2 [A, B] + O(t3 ) (6.4.41)
2
= et
[A,B]
+ O(t3 ).
6.4. Smooth matrix groups
273
We apply these calculations to show how the Lie algebra structure is preserved under smooth homomorphisms of G. Thus, assume we have a smooth homomorphism π : G −→ Gl(m, F),
(6.4.42) i.e., a smooth map satisfying (6.4.43)
∀ g1 , g2 ∈ G.
π(g1 g2 ) = π(g1 )π(g2 ),
Note that (6.4.43) implies (6.4.44)
π(I) = π(I · I) = π(I)π(I),
hence π(I) = I,
Let us set σ = Dπ(I) : g −→ M (m, F), so d σ(A) = π(esA ) s=0 , ds for A ∈ g. Note that, for such A, d d π(etA ) = π(e(s+t)A ) s=0 dt ds d (6.4.46) = π(esA )π(etA ) s=0 ds = σ(A)π(etA ), (6.4.45)
and since γ(t) = π(etA ) satisfies γ(0) = I, this gives (6.4.47)
π(etA ) = etσ(A) .
We are ready to prove the following Proposition 6.4.5. If π is a smooth homomorphism in (6.4.42) and σ : g → M (m, F) is given by (6.4.45), then, for A, B ∈ g, we have (6.4.48)
σ([A, B]) = [σ(A), σ(B)].
Proof. Applying π to (6.4.41), we have (6.4.49)
π(etA etB e−tA e−tB ) = π(et
2
[A,B]
) + O(t3 ).
By (6.4.47), and a second application of (6.4.41), the left side of (6.4.49) is equal to (6.4.50)
etσ(A) etσ(B) e−tσ(A) e−tσ(B) = et
2
[σ(A),σ(B)]
+ O(t3 ).
Comparing the right sides of (6.4.49) and (6.4.50), we have d (6.4.51) π(es[A,B] ) s=0 = [σ(A), σ(B)], ds which gives (6.4.48).
We next associate to each A ∈ TI G = g a certain vector field on G. To start, take A ∈ M (n, F), the Lie algebra of Gl(n, F). We define a vector field XA on Gl(n, F) by (6.4.52)
XA (g) = gA,
274
6. Differential geometry of surfaces
for g ∈ Gl(n, F). This vector field is left invariant. That is to say, if for each h ∈ Gl(n, F) we define left translation Lh : Gl(n, F) → Gl(n, F) by (6.4.53)
Lh (g) = hg,
then we have (6.4.54)
XA (hg) = DLh (g)XA (g).
We now have the following simple result: Proposition 6.4.6. If A ∈ g = TI G, then XA is tangent to G. Proof. Given g ∈ G, we have Lg : G → G, and hence (6.4.55)
DLg (I) : TI G −→ Tg G,
hence A ∈ g ⇒ XA (g) ∈ Tg G.
t Given A ∈ M (n, F), the flow FA on Gl(n, F) generated by XA is given by
(6.4.56)
t FA (g) = getA ,
as is readily checked: (6.4.57)
d t FA (g) t=0 = XA (g) (by definition) dt = gA,
which coincides with (d/dt)getA |t=0 . With these observations, we can give a short Proof of Proposition 6.4.3 (the “⇒” part). t As seen in §3.2, the flow FA generated by XA leaves invariant each smooth surface in Gl(n, F) to which XA is tangent. If A ∈ g, then, by Proposition 6.4.6, G has this t (I). property, and etA = FA We record the following useful complement to Proposition 6.4.3. Proposition 6.4.7. If G ⊂ M (n, F) is a smooth matrix group, g = TI G, then Exp : g −→ G is a smooth map and there exist neighborhoods O of 0 ∈ g and Ω of I ∈ G such that Exp : O → Ω is a diffeomorphism. Proof. The mapping property has just been proved. The smoothness follows from the smoothness of Exp : M (n, F) → Gl(n, F). We also have for D Exp(I) : g → TI G = g that D Exp(I)A = A, ∀ A ∈ g, so the inverse function theorem applies.
As seen in §2.3, a smooth vector field X defines a differential operator (also denoted X) on smooth functions by d Xu(x) = u(F t (x)) t=0 , dt
6.4. Smooth matrix groups
275
where F t is the flow generated by X. In particular, for A ∈ M (n, F), d XA u(g) = u(getA ) t=0 dt (6.4.58) = Du(g) · gA, where the “dot product” gives the action of Du(g) ∈ L(M (n, F), R) on gA ∈ M (n, F). Recall from §2.3 that the Lie bracket of vector fields is given by [XA , XB ] = XA XB − XB XA .
(6.4.59)
The following result provides an equivalence between the Lie algebra structure on g and the Lie bracket in (6.4.59). Proposition 6.4.8. Given A, B ∈ M (n, F), we have (6.4.60)
[XA , XB ] = X[A,B] .
Proof. To begin, we have (6.4.61)
XA XB u(g) =
∂2 u(getA esB ) s,t=0 , ∂s∂t
and hence (6.4.62)
(XA XB − XB XA )u(g) =
] ∂2 [ u(getA esB ) − u(gesB etA ) . ∂s∂t s,t=0
We can extend (6.4.40) to etA esB = esB etA + st[A, B] + O((|s| + |t|)3 ).
(6.4.63) Consequently, (6.4.64)
u(getA esB ) = u(gesB etA + stg[A, B] + O((|s| + |t|)3 )) = u(gesB etA ) + stDu(gesB etA ) · g[A, B] + O((|s|| t|)3 ).
Applying ∂ 2 /∂s∂t)|s,t=0 , we obtain (6.4.65)
(XA XB − XB XA )u(g) = Du(g) · g[A, B] = X[A,B] u(g),
the last identity holding by (6.4.58). This proves (6.4.60).
We turn to the production of metric tensors on a smooth matrix group G ⊂ M (n, R). (For simplicity, we restrict attention to F = R here, which is no real loss of generality, since Gl(n, C) ⊂ Gl(2n, R.) To start, we use the inner product on M (n, R) computed componentwise; equivalently (6.4.66)
⟨A, B⟩ = Tr(B ∗ A) = Tr(BA∗ ).
This produces a metric tensor on G ⊂ M (n, R), by restriction. This Riemannian metric interfaces well with the group structure of G in the following situation. Proposition 6.4.9. Assume the smooth matrix group G ⊂ M (n, R) satisfies (6.4.67)
G ⊂ O(n).
Then the metric tensor on G induced from the inner product (6.4.66) on M (n, R) is invariant under both left and right translations, i.e., under (6.4.68)
Lg , Rg : G −→ G,
g ∈ G,
276
6. Differential geometry of surfaces
defined by (6.4.69)
Lg (x) = gx,
Rg (x) = xg,
g, x ∈ G.
Proof. Indeed, we have (6.4.70)
Lg , Rg : M (n, R) −→ M (n, R), isometries, ∀ g ∈ O(n),
where (6.4.71)
Lg A = gA,
A ∈ M (n, R), g ∈ O(n).
Rg A = Ag,
In the setting of Proposition 6.4.9, we say the Riemannian metric on G defined above is bi-invariant. When G does not satisfy (6.4.68), the metric tensor on G induced from M (n, R) is generally not what we want to deal with. In such cases, take some positive-definite inner product ⟨ , ⟩ on g = TI G (perhaps that induced from (6.4.66)), and define inner products ⟨ , ⟩ℓ,g and ⟨ , ⟩r,g on Tg G by ⟨DLg (I)A, DLg (I)B⟩ℓ,g = ⟨A, B⟩,
(6.4.72)
⟨DRg (I)A, DRg (I)B⟩r,g = ⟨A, B⟩,
for A, B ∈ g. We have the following. Proposition 6.4.10. Given an inner product ⟨ , ⟩ on g = TI G, there are unique extensions to Riemannian metric tensors ⟨ , ⟩ℓ and ⟨ , ⟩r on G such that, for all g ∈ G, (6.4.73)
Lg : G −→ G is an isometry for ⟨ , ⟩ℓ , and Rg : G −→ G is an isometry for ⟨ , ⟩r .
These metric tensors give rise to volume elements and hence to integrals that are, respectively, left and right invariant, so ∫ ∫ f (x) dVℓ (x) = f (gx) dVℓ (x), G
G
∫
(6.4.74)
∫
f (x) dVr (x) = G
f (xg) dVr (x), G
for all g ∈ G and all integrable f (e.g., f continuous and compactly supported on G). Generally, if dV1 and dV2 are two smooth volume elements on a smooth surface M , they differ by a smooth positive factor, so ∫ ∫ f (x) dV1 (x) = f (x)φ12 (x) dV2 (x). (6.4.75) M
M
If M = G and dVℓ , dVeℓ are both smooth left-invariant volume elements, then ∫ ∫ (6.4.76) f (x) dVℓ (x) = f (x)φ(x) dVeℓ (x) G
G
6.4. Smooth matrix groups
equals both
277
∫
(6.4.77)
∫
f (gx)φ(x) dVeℓ (x)
f (gx) dVℓ (x) = G
G
and
∫
(6.4.78)
f (gx)φ(gx) dVeℓ (x),
G
for all f ∈ C0 (G), g ∈ G. This forces φ(x) = φ(gx) for all x, g ∈ G, hence φ = constant. We can apply this observation to compare a left-invariant volume element dVℓ and a right translation of this, say dVℓh , defined by ∫ ∫ (6.4.79) f (x) dVℓh (x) = f (xh) dVℓ (x). G
G
We have (6.4.80)
dVℓh (x) = ψ(h) dVℓ (x),
h ∈ G.
Note that, if h1 , h2 ∈ G, then ∫ ∫ f (xh1 h2 ) dVℓ (x) = ψ(h1 ) f (xh2 ) dVℓ (x) (6.4.81)
G
G
∫
= ψ(h2 )ψ(h1 )
f (x) dVℓ (x), G
so ψ : G −→ (0, ∞)
(6.4.82)
is a smooth multiplicative homomorphism, (6.4.83)
ψ(h1 h2 ) = ψ(h1 )ψ(h2 ).
This leads to the following. Proposition 6.4.11. If the only smooth homomorphism ψ : G → (0, ∞) is the trivial homomorphism ψ(g) ≡ 1, then the left-invariant volume element dVℓ is also right invariant. Note that the image of G under ψ must be a multiplicative subgroup of (0, ∞), and the only proper subgroup is {1}. Since the image is compact if G is compact, we have the following. Proposition 6.4.12. If G is a compact, smooth matrix group, then each leftinvariant volume element on G is also right invariant. Given G compact, we normalize the bi-invariant volume element, and define ∫ ∫ 1 (6.4.84) f (g) dg = f (x) dVℓ (x). Vℓ (G) G
G
278
6. Differential geometry of surfaces
Recall that we already have a bi-invariant Riemannian metric tensor, hence a bi-invariant integral, in case G ⊂ O(n). We come full circle with the following result. Proposition 6.4.13. Let G ⊂ M (n, R) be a smooth, compact matrix group. Then there is an inner product on Rn preserved by the action of G. Proof. Let ⟨ , ⟩ denote the standard inner product on Rn , and define a new inner product ( , ) by ∫ (6.4.85) (v, w) = ⟨gv, gw⟩ dg. G
Then, for h ∈ G,
∫ ⟨ghv, ghw⟩ dg
(hv, hw) = G
∫
(6.4.86)
⟨gv, gw⟩ dg,
= G
by right invariance of the integral, and the last integral is equal to (v, w).
A matrix group G for which the left-invariant volume form is also right invariant is said to be unimodular. One can deduce from Proposition 6.4.11 that Sl(n, R) is unimodular.
(6.4.87) On the other hand, also
Gl(n, R) is unimodular,
(6.4.88)
though Proposition 6.4.11 does not apply to this case. An example of a group that is not unimodular is the two-dimensional group ) {( } a b (6.4.89) G= : a > 0, b ∈ R . 0 1 We refer the reader to such sources as [53] or [55] for details on which matrix groups are unimodular. One application of integration over G arises in the proof of the following significant result. Proposition 6.4.14. If G is a smooth matrix group, every continuous homomorphism π : G −→ Gl(m, F)
(6.4.90) ∞
is smooth (of class C ). To set up the proof, we define (6.4.91)
π(f ) ∈ M (m, F), for f ∈ C0 (G),
by (6.4.92)
∫ π(f )v =
f (x)π(x)v dVℓ (x), G
v ∈ Fm .
6.4. Smooth matrix groups
279
We set (6.4.93)
C ∞ (π) = {v ∈ Fm : π(g)v is C ∞ in g}.
Clearly C ∞ (π) is a linear subspace of Fm . Our goal is to show that C ∞ (π) = Fm . To this end, we have: Lemma 6.4.15. Let fν ∈ C0 (G) satisfy fν ≥ 0, (6.4.94)
fν (x) = 0 for ∥x − I∥ ≥ 2−ν , ∫ fν (x) dVℓ (x) = 1. G
Then (6.4.95)
π(fν )v −→ v, as ν → ∞, ∀ v ∈ Fm .
We complement this with: Lemma 6.4.16. In the setting described above, (6.4.96)
f ∈ C0∞ (G), v ∈ Fm =⇒ π(f )v ∈ C ∞ (π).
Proof. For g ∈ G, we have
∫
π(g)π(f )v =
f (x) π(gx)v dVℓ (x) G
∫
(6.4.97) =
f (g −1 y)π(y)v dVℓ (y).
G
We leave (6.4.95) and the smoothness of the last integral as a function of g as exercises. Proof of Proposition 6.4.14. Take fν ∈ C0∞ (G) satisfying (6.4.94). We have π(fν )v ∈ C ∞ (π) and π(fν )v → v as ν → ∞, for each v ∈ Fm . Hence C ∞ (π) is dense in Fm . However, the only dense linear subspace of Fm is Fm itself. It is useful to broaden the class of group homomorphisms (6.4.90), as follows. Let V be a finite-dimensional vector space, G a smooth matrix group. A representation of G on V is a continuous map π : G −→ L(V ),
(6.4.98) satisfying (6.4.99)
π(I) = I,
π(g1 g2 ) = π(g1 )π(g2 ), ∀ gj ∈ G.
Picking a basis of V yields an isomorphism J : V → Fm , hence a homomorphism (6.4.100)
πJ : G → Gl(m, F),
πJ (g) = J π(g)J −1 .
We can readily translate results on πJ to results on π. For example, each continuous representation π in (6.4.98) is smooth, and we have analogues of (6.4.47)–(6.4.48). The language of representation theory is a convenient one in which to formulate results. Here is an example.
280
6. Differential geometry of surfaces
Proposition 6.4.17. Let G be a compact, smooth matrix group, π a continuous representation of G on a finite-dimensional vector space V . Then ∫ (6.4.101) P = π(g) dg G
is a projection of V onto the subspace V0 = {v ∈ V : π(g)v = v, ∀ g ∈ G},
(6.4.102)
on which π acts trivially. Proof. For each h ∈ G, v ∈ V , we have ∫ π(h)P v = π(hg)v dg G
∫
(6.4.103) =
π(g)v dg, G
the second identity by left invariance of the integral. The last integral is equal to P v, so we have P v ∈ V0 for each v ∈ V , i.e., P : V −→ V0 .
(6.4.104) Furthermore, (6.4.105)
∫ v ∈ V0 ⇒ P v =
∫ π(g)v dg =
G
v dg = v, G
so indeed P is a projection onto V0 .
Examples of representations of a matrix group G include representations on spaces of functions. In §7.4 we will see representations of SO(n) on spaces of functions on the sphere S n−1 known as spherical harmonics. Analysis of these representations of SO(n) will be seen to provide a useful tool in the study of spherical harmonics. We now focus on matrix groups with the following property. (6.4.106)
G is a smooth matrix group, endowed with a bi-invariant Riemannian metric,
and examine differential geometric properties of these groups. Recall that compact, smooth matrix groups possess bi-invariant metric tensors. In fact, these are the main examples. To start, consider (6.4.107)
ψ : G −→ G,
ψ(x) = x−1 .
This map takes a left-invariant metric tensor to a right-invariant metric tensor, and vice-versa. We have (6.4.108)
Dψ(I) = −I on g = TI G.
6.4. Smooth matrix groups
281
We can deduce from this that, if (6.4.106) holds, then ψ is an isometry. This generalizes as follows. Proposition 6.4.18. If (6.4.106) holds, then, for each g ∈ G, (6.4.109)
ψg : G −→ G,
ψg (x) = gx−1 g,
is an isometry of G, fixing g and satisfying (6.4.110)
Dψg (g) = −I on Tg G.
Using this, we can prove the following. Proposition 6.4.19. If γ is a unit-speed geodesic on G satisfying γ(0) = I, then (6.4.111)
γ(s + t) = γ(s)γ(t).
Proof. Fix t ∈ R and consider (6.4.112)
σ(s) = γ(t + s).
This is a unit-speed geodesic satisfying σ(0) = γ(t), σ ′ (0) = γ ′ (t). It follows that (6.4.113)
σ ˜ (s) = ψγ(t) (σ(s))
is the unit-speed geodesic satisfying σ ˜ (0) = γ(t), σ ˜ ′ (0) = −γ ′ (t). This forces σ ˜ (s) = γ(t − s), i.e., (6.4.114)
γ(t − s) = ψγ(t) (γ(t + s)) = γ(t)γ(t + s)−1 γ(t).
Taking t = 0 gives γ(−s) = γ(s)−1 ,
(6.4.115)
and taking t = s in (6.4.114) gives I = γ(t)γ(2t)−1 γ(t), hence γ(2t)−1 = γ(t)−1 γ(t)−1 , so (6.4.116)
γ(2t) = γ(t)γ(t).
Inductively, we obtain γ(kt) = γ(t)k , for k = 2ν , which leads to (6.4.111) when s and t have the same sign. In such a case, (6.4.114) implies (6.4.117)
γ(t − s) = γ(t)γ(s)−1 γ(t)−1 γ(t) = γ(t)γ(−s),
so we have (6.4.111) in general. Proposition 6.4.19 implies that if A ∈ g, then (6.4.118)
γA (t) = etA
is a constant-speed geodesic through I. In other words, when (6.4.106) holds, the exponential map (6.4.119)
ExpI : TI G −→ G
defined in §6.1 coincides with the matrix exponential (6.4.120)
Exp : g −→ G
used in this section. We also get that ExpI in (6.4.119) is defined on all of TI G.
282
6. Differential geometry of surfaces
Given A ∈ g, the associated left-invariant vector field X = XA generates the flow t FX (g) = getA = Lg (etA ),
(6.4.121)
which for each g ∈ G is a geodesic on G, since Lg : G → G is an isometry. In other words, each integral curve of X is a constant speed geodesic. Recalling the geodesic equation from §6.1, we have the following. Proposition 6.4.20. If G satisfies (6.4.106), then each left-invariant vector field X on G satisfies ∇X X = 0 on G.
(6.4.122)
If also Y is a left-invariant vector field on G, so is X + Y , and (6.4.122) applies in these cases. Expanding ∇X+Y (X + Y ) = 0
(6.4.123) then yields
∇X Y + ∇Y X = 0.
(6.4.124) Since
∇X Y − ∇Y X = [X, Y ],
(6.4.125) we obtain the following.
Proposition 6.4.21. If G satisfies (6.4.106) and X and Y are left-invariant vector fields on G, then 1 (6.4.126) ∇X Y = [X, Y ]. 2 As seen in §6.2, the Riemann curvature R is given by (6.4.127)
R(X, Y )Z = ∇X ∇Y Z − ∇Y ∇X Z − ∇[X,Y ] Z.
We hence obtain the following formula. Proposition 6.4.22. If G satisfies (6.4.106) and X, Y, Z are left-invariant vector fields on G, then 1 (6.4.128) R(X, Y )Z = − [[X, Y ], Z]. 4 Lie groups and Lie algebras The matrix groups we have studied in this section fall into a broader category of objects called Lie groups. Generally, a Lie group is a smooth manifold G that also has the structure of a group. That is, there is a map µ : G × G → G, called multiplication, µ(g1 , g2 ) = g1 g2 , that has the following properties. First, the associative law, (6.4.129)
(g1 g2 )g3 = g1 (g2 g3 ),
∀ gj ∈ G,
next, the existence of an identity element e ∈ G such that (6.4.130)
ge = eg = g,
∀ g ∈ G,
6.4. Smooth matrix groups
283
and third that each g ∈ G has an inverse g −1 = Inv(g) ∈ G, satisfying gg −1 = g −1 g = e,
(6.4.131)
∀ g ∈ G.
Furthermore, we require that (6.4.132)
µ : G × G → G and Inv : G → G are smooth.
Such groups arise as sets of symmetries of various structures. An example is E(n), the group of isometric mappings of Rn onto itself. A typical element of E(n) has the form (6.4.133)
T1 (v) = A1 v + w1 ,
A1 ∈ O(n), w1 ∈ Rn .
If also T2 (v) = A2 v + w2 , we have T1 T2 (v) = A1 (T2 v) + w1 = A1 (A2 v + w2 ) + w1
(6.4.134)
= A1 A2 v + A1 w2 + w1 . Hence, as a smooth manifold, E(n) = O(n) × Rn ,
(6.4.135)
and the group operation is given by (6.4.136)
(A1 , w1 ) · (A2 , w2 ) = (A1 A2 , w1 + A1 w2 ).
While E(n) is not defined as a matrix group, it is nevertheless isomorphic to a matrix group, namely {(A w) } (6.4.137) : A ∈ O(n), w ∈ Rn , 0 1 as one sees from the computation ( )( A1 w1 A2 (6.4.138) 0 1 0
w2 1
)
( =
A1 A2 0
) A1 w2 + w1 . 1
There exist Lie groups that are not isomorphic to matrix groups, though it turns out that each Lie group is “locally isomorphic” to a matrix group. For an abstract Lie group G, we define g to consist of the set of left-invariant vector fields on G. This is a linear space, and if X1 , X2 ∈ g, then the Lie bracket [X1 , X2 ] also belongs to g, and this gives g the structure of a Lie algebra. Restriction to the identity element e ∈ G gives a linear isomorphism (6.4.139)
g ≈ Te G.
The exponential map (6.4.140)
Exp : g −→ G
is given by (6.4.141)
t Exp(tX) = FX (e),
t where FX is the flow generated by X.
There are a number of books that develop the theory of Lie groups, such as [12] and [55]. The reader who has gotten this far should be in a good position to pursue this topic in such texts.
284
6. Differential geometry of surfaces
Exercises In Exercises 1–7, let G be a compact, smooth matrix group, endowed with Haar measure, as in (6.4.84). 1. Show that, for sll f ∈ C(G), ∫
f (g −1 ) dg =
G
∫ f (g) dg. G
2. Let V be a finite-dimensional inner product space and assume π : G → L(V ) is a continuous representation that is unitary, i.e. (π(g)u, v) = (u, π(g −1 )v),
∀ u, v ∈ V, g ∈ G.
(We say π is a unitary representation of G on V .) Show that ∫ P = π(g) dg G
is the orthogonal projection of V onto V0 = {v ∈ V : π(g)v = v, ∀ g ∈ G}. Hint. By Proposition 6.4.17, one need only show P = P ∗ . Note that ∫ ∫ P ∗ = π(g)∗ dg = π(g −1 ) dg, G
G
and use Exercise 1. 3. Let π and λ be unitary representations of G on finite-dimensional inner product spaces V and W , respectively. Give L(W, V ) the inner product (A, B) = Tr B ∗ A, and define a unitary representation ν of G on L(W, V ) by ν(g)A = π(g)Aλ(g −1 ). Define Qπλ : L(W, V ) → L(W, V ) by ∫ ∫ Qπλ A = ν(g)A dg = π(g)Aλ(g −1 ) dg. G
G
Show that Qπλ is the orthogonal projection of L(W, V ) onto I(π, λ) = {A ∈ L(W, V ) : π(g)A = Aλ(g), ∀ g ∈ G}. 4. Assume π and λ are irreducible unitary representations, i.e., V and W have no proper linear subspaces invariant under the action of G. Show that if A ∈ I(π, λ) is not zero, then A : W → V must be an isomorphism, i.e., N (A) = 0 and R(A) = V . Hint. N (A) is invariant under λ and R(A) is invariant under π.
Exercises
285
5. In the setting of Exercise 4, suppose A ∈ I(π, λ) is an isomorphism of W onto V , and consider T = A∗ A : W −→ W. Show that T = T ∗ and that T λ(g) = λ(g)T,
∀ g ∈ G.
Show that this forces T = αI on W,
for some α > 0.
Hint. Show that each eigenspace of T is invariant under λ(g), for all g ∈ G. When this holds, show that A0 = α−1/2 A : W −→ V is unitary (i.e., preserves inner products) and π(g) = A0 λ(g)A−1 0 ,
∀ g ∈ G.
We say π and λ are unitarily equivalent. 6. Deduce from the exercises above that, if π and λ are irreducible unitary representations of G, π and λ are not unitarily equivalent ⇔ I(π, λ) = 0 ⇔ Qπλ = 0, and I(π, π) = Span(I), hence
Qππ A = (dim V )−1 (Tr A)I. 7. Take π and λ, irreducible unitary representations of G, as in Exercise 4. Take orthonormal bases of V and W , so π(g) and λ(g) have matrix representations π(g)jk and λ(g)mℓ . Deduce from Exercise 6 that ∫ π(g)jk π(g)mℓ dg = (dim V )−1 δjm δkℓ , G
while, if π is not unitarily equivalent to λ, then ∫ π(g)jk λ(g)mℓ dg ≡ 0. G
These identities are known as the Weyl orthogonality relations. Here is a restatement. Proposition 6.4.23. Let G be a compact, smooth matrix group, and {πα : α ∈ A} a set of mutually inequivalent, irreducible unitary representations of G on spaces Vα , of dimension dα . Pick orthonormal bases of Vα , so that πα (g) has the matrix representation (πα (g)jk ). Then (6.4.142)
F = {d1/2 α πα (g)jk : α ∈ A, 1 ≤ j, k ≤ dα }
is an orthonormal set of functions on G.
286
6. Differential geometry of surfaces
In Exercises 8–10, we retain the hypothesis that G is a compact, smooth matrix group. Pick A such that each irreducible, unitary representation of G is equivalent to πα for some α ∈ A. 8. Let π be a unitary representation of G on a finite-dimensional inner product space V . Show that if V1 ⊂ V is invariant under π(g) for all g, so is its orthogonal complement V1⊥ . Deduce that, if π is not irreducible, there exists an orthogonal decomposition V = V1 ⊕ · · · ⊕ VK such that π acts irreducibly on each factor Vj . Hint. Induction on dimension. 9. Let π and λ be irreducible unitary representations of G, with matrix entries π(g)jk and λ(g)mℓ . Apply Exercise 8 to the representation of G introduced in Exercise 3 to show that π(g)jk λ(g)mℓ ∈ Span F. 10. Note that if (π(g)jk ) is a unitary representation of G, so is (π(g)jk ), so Span F is closed under complex conjugation. Then we can deduce from Exercise 9 that Span F is an algebra. Use the Stone-Weierstrass theorem to establish the following complement to Proposition 6.4.23. Proposition 6.4.24. The orthonormal set F in (6.4.142) has the property that Span F is dense in C(G). To set up the next few exercises, for g ∈ G, define Cg : G −→ G,
Cg (x) = gxg −1 = Lg Rg−1 (x).
Note that Cg (I) = I and, with g = TI G, DCg (I) : g −→ g is given by DCg (I)A = gAg −1 ,
A ∈ g.
11. Let Ad(g)A = gAg −1 . Show that Ad is a representation of G on g. Show that the derived representation of g on g, arising as in (6.4.45), is given by ad(A)B = [A, B],
A, B ∈ g.
12. Give g a positive-definite inner product ⟨ , ⟩, and use {Lg } to extend this to a left-invariant metric tensor on G. Show that this metric tensor is also right
6.5. The derivative of the exponential map
287
invariant (hence bi-invariant) if and only if, for each g ∈ G, Ad(g) : g → g preserves this inner product, i.e., ⟨Ad(g)A, Ad(g)B⟩ = ⟨A, B⟩,
∀ A, B ∈ g, g ∈ G.
Show that, if this holds, ⟨ad(C)A, B⟩ = −⟨A, ad(C)B⟩,
∀ A, B, C ∈ g.
13. Assume G has a bi-invariant metric tensor. Using Proposition 6.4.22 and Exercise 12, show that, if X, Y, Z, and W are left-invariant vector fields on G, then 1 ⟨R(X, Y )Z, W ⟩ = − ⟨[X, Y ], [Z, W ]⟩. 4 In particular, 1 ⟨R(X, Y )Y, X⟩ = ∥[X, Y ]∥2 . 4
6.5. The derivative of the exponential map Here we look at formulas for the derivative of two types of exponential maps, first the matrix exponential (6.5.1)
Exp : M (n, C) −→ M (n, C),
introduced in §2.3, and studied further in §6.4, and then the exponential map (6.5.2)
Expp : U −→ M,
introduced in §6.1, where M is a smooth surface in Rk , or more generally a smooth Riemannian manifold, p ∈ M , and U is a neighborhood of 0 in the tangent space Tp M . We also investigate where D Exp is invertible, and draw conclusions from this. We start with the matrix exponential ∞ k ∑ t k (6.5.3) Exp(tA) = etA = A , k!
A ∈ M (n, C),
k=0
introduced in (2.3.115), satisfying d tA e = AetA , Exp(0) = I. dt As noted there, Exp is a C ∞ map. We will derive a formula for d A+sB (6.5.5) D Exp(A)B = e , B ∈ M (n, C). ds s=0 (6.5.4)
To get this, it is convenient to bring in the matrix-valued function (6.5.6)
U (s, t) = et(A+sB) .
Note that (6.5.7)
D Exp(A)B = V (1),
where V (t) = ∂s U (s, t) s=0 .
We will derive a differential equation for V (t). To start, (6.5.6) yields (6.5.8)
∂t U (s, t) = (A + sB)U (s, t),
288
6. Differential geometry of surfaces
and applying ∂s to this gives (6.5.9)
∂s ∂t U = (A + sB)∂s U + BU.
Since ∂s ∂t U = ∂t ∂s U , we obtain V ′ (t) = AV (t) + BU (0, t) (6.5.10)
= AV (t) + BetA .
Note that V (0) = 0. Thus Duhamel’s formula (2.3.126) yields ∫ t (6.5.11) V (t) = e(t−s)A BesA ds. 0
Consequently
∫
(6.5.12)
D Exp(A)B = e
1
A
e−sA BesA ds.
0
Here is a convenient alternative formula. Define (6.5.13)
A : M (n, C) −→ M (n, C),
A(B) = [A, B] = AB − BA.
Then e−sA BesA = e−sA B.
(6.5.14)
In fact, the two sides, Y1 (s) and Y2 (s), both satisfy the same differential equation, (6.5.15) Thus
Yj′ (s) = −AYj (s) = Yj (s)A − AYj (s), ∫
1
(6.5.16)
e−sA BesA ds =
0
∫
1
Yj (0) = B.
e−sA B ds = Φ(A)B,
0
where ∞
Φ(z) = (6.5.17)
∑ 1 1 − e−z = (−z)k−1 , z k! k=1
∞ ∑ 1 (−A)k−1 . Φ(A) = k! k=1
In this setting, we can rewrite (6.5.12) as D Exp(A)B = eA Φ(A)B.
(6.5.18) Thus, given A ∈ M (n, C), (6.5.19)
D Exp(A) is invertible on M (n, C) ⇐⇒ Φ(A) is invertible on M (n, C).
We will now consider the spectrum of Φ(A). Given a linear transformation T on a finite dimensional vector space X, Spec T is the set of eigenvalues of T . Thus T is invertible on X if and only if 0 ∈ / Spec T . The following is a version of the spectral mapping theorem.
6.5. The derivative of the exponential map
289
∑ Proposition 6.5.1. Let Φ(z) = k≥0 ak z k converge for all z ∈ C. Let A : X → X be a linear ∑ transformation on a finite-dimensional complex vector space X. Set Φ(A) = k≥0 ak Ak . Then Spec Φ(A) = {Φ(λj ) : λj ∈ Spec A}.
(6.5.20)
A proof of this result is given in §6.6. In case Φ is given by (6.5.17), we deduce that (6.5.21)
Φ(A) is invertible on M (n, C) ⇐⇒ Spec A is disjoint from {2πik : k ∈ Z \ 0}.
Furthermore, if A is related to A as in (6.5.13), Spec A = {λ − µ : λ, µ ∈ Spec A}.
(6.5.22)
Given these results, we have the following conclusion. Proposition 6.5.2. Given Exp as in (6.5.3) and A ∈ M (n, C), (6.5.23)
D Exp(A) is invertible on M (n, C) ⇐⇒ no two eigenvalues of A/2πi differ by a nonzero integer.
Remark. It is also of interest to consider (6.5.24)
Exp : M (n, R) −→ M (n, R).
Of course, the formulas (6.5.12) and (6.5.18) for D Exp(A) : M (n, R) → M (n, R), given A ∈ M (n, R), work without change. Furthermore, given a linear transformation T : Rm → Rm (m = n2 ), extended to a linear transformation T : Cm → Cm , it is clear that T is invertible on Rm if and only if it is invertible on Cm . Hence we have the following immediate consequence of Proposition 6.5.2. Corollary 6.5.3. Given Exp as in (6.5.24) and A ∈ M (n, R), D Exp(A) is invertible on M (n, R) if and only if no two eigenvalues of A differ by 2πi times a nonzero integer.
We next look at the second type of exponential map, (6.5.2). Let M be a smooth Riemannian manifold, p ∈ M . As seen in §6.1, there is a neighborhood U of 0 in Tp M on which we have a smooth map (6.5.25)
Expp : U −→ M,
satisfying the condition that, for v ∈ U , (6.5.26)
Expp (tv) = γ(t),
0 ≤ t ≤ 1,
is a geodesic of constant speed ∥v∥, satisfying (6.5.27)
γ(0) = p,
We desire to analyze (6.5.28)
D Expp (v)w =
γ ′ (0) = v.
d Expp (v + sw) , ds s=0
290
6. Differential geometry of surfaces
given v ∈ U, w ∈ Tp M . To do this, we assume v ̸= 0 and bring in the family of curves (6.5.29)
γs (t) = Expp (t(v + sw)),
defined for t ∈ [0, 1] and |s| ≤ δ, a sufficiently small positive quantity. This is a family of geodesics on M , of speed ∥v + sw∥, with tangent (6.5.30)
Ts (t) = γs′ (t),
satisfying ∇Ts Ts = 0.
We also bring in the vector field (6.5.31)
Ws (γs (t)) = ∂s γs (t) ∈ Tγs (t) M.
In rough analogy to (6.5.8)–(6.5.9), we apply ∇Ws to the geodesic equation in (6.5.30), obtaining (6.5.32)
0 = ∇Ws (∇Ts Ts ) = ∇Ts ∇Ws Ts + ∇[Ws ,Ts ] + R(Ws , Ts )Ts ,
the latter identity using the characterization (6.2.57) of the Riemann curvature R of M . It is useful to note that (6.5.33)
[Ws , Ts ] = 0.
Furthermore, the identity (6.1.61) for the Levi-Civita covariant derivative implies (6.5.34)
∇Ws Ts = ∇Ts Ws + [Ws , Ts ] = ∇Ts Ws .
Hence (6.5.32) yields the equation (6.5.35)
∇Ts ∇Ts Ws + R(Ws , Ts )Ts = 0.
We specialize to s = 0, obtaining (6.5.36)
∇T ∇T W + R(W, T )T = 0,
where (6.5.37)
T (t) = γ ′ (t),
γ(t) = Expp (tv),
W = W0 = ∂s γs (t) s=0 .
Note that ∥T (t)∥ = ∥v∥. The equation (6.5.36) is called the Jacobi variational equation for the geodesic flow. It has the form of a second-order ODE for (6.5.38)
W : [0, 1] −→ T M,
W (t) ∈ Tγ(t) M.
It is appropriate to impose the following initial condition at t = 0; (6.5.39)
W (0) = 0,
∇T W (0) = w.
In fact, close enough to p one can use Expp as an exponential coordinate system, in which W (t) = tw, and this leads to (6.5.39). We denote the solution to (6.5.36)– (6.5.39) by (6.5.40)
W (t) = Jp,v,w (t),
p ∈ M, v, w ∈ Tp M, v ̸= 0.
Then (6.5.41)
D Expp (v) : Tp (M ) −→ TExpp (v) M
is given by (6.5.42)
D Expp (v)w = Jp,v,w (1).
6.5. The derivative of the exponential map
291
A solution to (6.5.36) is called a Jacobi field along the geodesic γ. The computation (6.5.42) has the following consequence. Proposition 6.5.4. Given Expp as in (6.5.25)–(6.5.27), and v ∈ U ⊂ Tp M, v ̸= 0, D Expp (v) in (6.5.41) is not invertible if and only if there exists a Jacobi field J(t) along γ(t) = Expp (tv) such that J(0) = 0 and J(1) = 0. The null space of D Expp (v) is given by {w ∈ Tp M : Jp,v,w (1) = 0}.
(6.5.43)
Remark. In (6.5.29)–(6.5.42) we have assumed v ̸= 0. From these formulas one can show that (6.5.44)
lim Jp,v,w (1) = w,
v→0
consistent with the formula (6.5.45)
D Expp (0)w = w,
which also follows immediately form (6.5.26)–(6.5.27), as observed below (6.1.75). We say two points p and q on a geodesic γ on M are conjugate if there exists a (not identically zero) Jacobi field J along γ that vanishes at p and q. By Proposition 6.5.4, an equivalent condition is that D Expp is singular at v ∈ Tp M if γ(t) = Expp (tv), p = γ(0), and q = γ(1). A certain negative curvature condition can guarantee that there are no conjugate points. Proposition 6.5.5. Suppose that for all vector fields X and Y on M , (6.5.46)
⟨R(X, Y )Y, X⟩ ≤ 0.
Then no two points of M are conjugate along any geodesic. Proof. Let γ be a constant speed geodesic, p = γ(0), q = γ(b). Suppose J is a Jacobi field along γ satisfying J(p) = 0. Let T = γ ′ . A computation gives
(6.5.47)
1 2 T ⟨J, J⟩ = T ⟨∇T J, J⟩ 2 = ⟨∇T ∇T J, J⟩ + ⟨∇T J, ∇T J⟩ = ⟨∇T J, ∇T J⟩ − ⟨R(J, T )T, J⟩,
the last identity by (6.5.36). The hypothesis (6.5.46) then gives 1 2 T ⟨J, J⟩ ≥ ∥∇T J∥2 ≥ 0. 2 If J is not identically zero along γ([0, b]), then ∇T J must be nonzero at some point γ(ξ), ξ ∈ (0, b). Since J(γ(0)) = 0, (6.5.48) forces ∥J(γ(b))∥2 > 0. (6.5.48)
For example, if M is a 2D Riemannian manifold with Gauss curvature K(x) ≤ 0 for all x ∈ M , then there are no conjugate points. In contrast, for positive curvature, there is the following result.
292
6. Differential geometry of surfaces
Proposition 6.5.6. Let M be a 2D Riemannian manifold. Assume its Gauss curvature K satisfies K(x) ≥ κ > 0, ∀ x ∈ M. √ Then any geodesic on M of length > π/ κ has conjugate points. (6.5.49)
We refer to [10] for a proof of this, and to [11] for a higher dimensional generalization. We state a few other results about Jacobi fields and conjugate points, referring to [11] for proofs. We mention that the proof of Proposition 6.5.6 makes use of the second variational formula (6.2.110), as does the proof of the next result. Proposition 6.5.7. If γ is a unit speed geodesic on a Riemannian manifold M , and p = γ(a) and q = γ(b) are conjugate along γ (b > a), then (6.5.50)
d(p, γ(t)) < t − a,
for t > b.
To formulate the next result, we say a Riemannian manifold M is complete if Expp is defined on all of Tp M , for each p ∈ M . Proposition 6.5.8. Assume M is a complete Riemannian manifold of dimension 2. If (6.5.49) holds, then M is compact and (6.5.51)
π d(p, q) ≤ √ , κ
∀ p, q ∈ M.
√ Note that a sphere of radius R = 1/ κ in R3 , whose Gauss curvature is ≡ κ, illustrates the sharpness of Propositions 6.5.6 and 6.5.8. The proof of Proposition 6.5.8 uses Propositions 6.5.6 and 6.5.7. Higher dimensional extensions can also be found in [11].
6.6. A spectral mapping theorem Let A ∈ M (n, C), and let Φ(z) be given by (6.6.1)
Φ(z) =
∞ ∑
ck z k ,
k=0
a power series that is convergent for all z ∈ C. We define Φ(A) ∈ M (n, C) by (6.6.2)
Φ(A) =
∞ ∑
c k Ak .
k=0
Convergence of this series follows by estimates of the form (2.1.18), which hold for complex as well as real matrices. We want to describe Spec Φ(A) in terms of Spec A, where Spec A denotes the set of eigenvalues of A. We prove the following, which plays a role in the proof of Proposition 6.5.2. Proposition 6.6.1. For A ∈ M (n, C) and Φ(A) defined above, (6.6.3)
Spec Φ(A) = {Φ(λj ) : λj ∈ Spec A}.
6.6. A spectral mapping theorem
293
The proof will make use of some basic results of linear algebra, discussed in Appendix A.3, and treated in more detail in Chapter 2 of [50] and in §§6–7 of [52]. To start, let us note that (6.6.3) is quite easy to prove when A is diagonalizable, i.e., Cn has a basis {uj : 1 ≤ j ≤ n} of eigenvectors of A, satisfying (6.6.4)
Auj = λj uj , k
In this case, A uj = (6.6.5)
λkj uj ,
1 ≤ j ≤ n.
so (6.6.2) gives
Φ(A)uj =
∞ ∑
ck λkj uj = Φ(λj )uj .
k=0
This immediately gives (6.6.3). The proof of (6.6.3) requires more work when A is not diagonalizable. In that case, for each λj ∈ Spec A, define the “generalized eigenspace” (6.6.6)
GE(A, λj ) = {v ∈ Cn : (A − λj I)ν v = 0 for some ν}.
It is a general result (proved in [50] and [52]) that each v ∈ Cn has a unique decomposition ∑ (6.6.7) v= vj , vj ∈ GE(A, λj ), λj ∈ Spec A. j
We also write (6.6.8)
⊕
Cn =
GE(A, λ).
λ∈Spec A
Similarly, (6.6.9)
Cn =
⊕
GE(Φ(A), µ).
µ∈Spec Φ(A)
In light of this, (6.6.3) is a consequence of the following. Lemma 6.6.2. In the setting of Proposition 6.6.1, (6.6.10)
GE(A, λj ) ⊂ GE(Φ(A), Φ(λj )).
Proof. If we set Ψ(z) = Φ(z + λj ) − Φ(λj ), so Ψ(0) = 0, and set B = A − λj I, it suffices to show that (6.6.11)
GE(B, 0) ⊂ GE(Ψ(B), 0).
Now B is nilpotent on GE(B, 0), i.e., for some ν ∈ N, B ν = 0 on GE(B, 0). Noting that (6.6.12)
Ψ(z) = zF (z),
so Ψ(B) = BF (B),
where F (z) is a convergent power series, we see that (6.6.13)
B ν = 0 on GE(B, 0) =⇒ Ψ(B)ν = 0 on GE(B, 0),
which gives (6.6.11) and completes the proof of Lemma 6.6.2.
Chapter 7
Fourier analysis
This chapter is devoted to Fourier series and the Fourier transform, in several variables. We begin in §7.1 with Fourier series. Given f ∈ R(Tn ), (where Tn is the n-dimensional torus Rn /2πZn ) or more generally f ∈ R# (Tn ), we define its Fourier coefficients ∫ −n ˆ (7.0.1) f (k) = (2π) f (θ)e−ik·θ dθ, Tn
where k = (k1 , . . . , kn ) ∈ Zn . A key result is the Fourier inversion formula, ∑ (7.0.2) f (θ) = fˆ(k)eik·θ . k∈Zn
We first establish this for f ∈ A(Tn ), where we say ∑ (7.0.3) f ∈ A(Tn ) ⇐⇒ |fˆ(k)| < ∞. k∈Zn
In this case, (7.0.2) holds, with absolute and uniform convergence. We show that (7.0.4)
f ∈ C m (Tm ), m >
n =⇒ f ∈ A(Tn ). 2
It is also important to deal with Fourier series of discontinuous functions. For this, we set ∑ (7.0.5) SN f (θ) = fˆ(k)eik·θ , |k|≤N
and show that (7.0.6)
SN f −→ f,
as N → ∞, 295
296
7. Fourier analysis
in L2 -norm, when f ∈ R(Tn ), or more generally when f, |f |2 ∈ R# (Tn ). In such a case, L2 -norm convergence of (7.0.6) yields the Plancherel identity ∫ ∑ |fˆ(k)|2 . (7.0.7) (2π)−n |f (θ)|2 dθ = k∈Zn
Tn
We give a brief discussion of how the Lebesgue theory of integration produces L2 (Tn ), which is the Hilbert space completion of R(Tn ) with respect to the L2 norm, and which is naturally the largest space on which (7.0.6) holds, in L2 -norm. We then consider Fourier series of more singular objects, namely distributions, which arise as continuous linear functionals (7.0.8)
w : C ∞ (Tn ) −→ C.
We say w ∈ D′ (Tn ), and write the action on f ∈ C ∞ (Tn ) as f 7→ ⟨f, w⟩. As an example, if M ⊂ Tn is a compact, m-dimensional, C 1 surface, we can define δM ∈ D′ (Tn ) by ∫ (7.0.9) ⟨f, δM ⟩ = f (x) dS(x). M
In this distributional setting, the Fourier coefficients are given by (7.0.10)
w(k) ˆ = (2π)−n ⟨ek , w⟩,
ek (θ) = e−ik·θ .
We extend (7.0.6) to (7.0.11)
SN w −→ w,
as N → ∞,
in the sense that ⟨f, SN w⟩ → ⟨f, w⟩, for all f ∈ C ∞ (Tn ). In §7.2 we turn to the Fourier transform, given by ∫ (7.0.12) fˆ(ξ) = (2π)−n/2 f (x)e−ix·ξ dx, Rn
for ξ ∈ Rn , for f ∈ R(Rn ), or more generally f ∈ R# (Rn ). In this case, the Fourier inversion formula takes the form ∫ (7.0.13) f (x) = (2π)−n/2 fˆ(ξ)eix·ξ dξ. Rn
We first establish this for f ∈ A(R ), where, given f ∈ R(Rn ) ∩ C(Rn ), we say n
(7.0.14)
f ∈ A(Rn ) ⇐⇒ fˆ ∈ R(Rn ).
We derive a sufficient condition for f to belong to A(Rn ), somewhat similar to (7.0.4). For such f , (7.0.13) holds, the right side being an absolutely convergent integral. Pursuing a Fourier inversion formula for discontinuous functions, we define ∫ (7.0.15) SR f (x) = (2π)−n/2 fˆ(ξ)eix·ξ dξ, |ξ|≤R
and show that (7.0.16)
SR f −→ f,
as R → ∞,
7. Fourier analysis
297
in L2 -norm, for f ∈ R(Rn ), or more generally for f, |f |2 ∈ R# (Rn ). In such a case, we also have the Plancherel identity ∫ ∫ |f (x)|2 dx = |fˆ(ξ)|2 dξ. (7.0.17) Rn
Rn
Parallel to the toral case, the Lebesgue theory of integration gives rise to L2 (Rn ), as the maximal space for which (7.0.16) holds, in L2 -norm. In the Euclidean setting, we also have an extension of the Fourier transform to a class of distributions, defined by L. Schwartz, known as tempered distributions, and results on this are derived in §7.2. In §7.3 we bring Fourier series and the Fourier transform together to produce an important family of identities known as Poisson summation formulas. They are obtained by taking a sufficiently nice function f on Rn , periodizing it to obtain a function φ on Tn , and evaluating φ in two ways, both in terms of fˆ(ξ) and in terms 2 of φ(k). ˆ A striking example of this, obtained for f (x) = e−x on R, is known as the Jacobi inversion formula. We show how it leads to an identity for the Riemann zeta function, known as the Riemann functional equation. In §7.4 we study spherical harmonics, that is, Fourier analysis on spheres. In this case, in place of using exponentials eik·θ , we seek to write a function f on the unit sphere S n−1 in Rn as an infinite series in functions Ykℓ ∈ C ∞ (S n−1 ) that are eigenfunctions of the Laplace operator ∆S on the surface S n−1 , with its naturally arising Riemannian metric. We see that ∆S has eigenvalues of the form −λ2k , where (7.0.18)
λ2k = k 2 + (n − 2)k,
with k ∈ Z+ . The associated eigenspaces (7.0.19)
Vk = {g ∈ C ∞ (S n−1 ) : ∆S g = −λ2k g}
are seen to consist of the restrictions to S n−1 of the class of harmonic polynomials on Rn that are homogeneous of degree k. We define the orthogonal projection Ek f of f on Vk , and the version of Fourier series that arises is (7.0.20)
f=
∞ ∑
Ek f.
k=0
Summing over k ∈ {0, . . . , N } yields SN f . Parallel to (7.0.6), we have L2 -norm convergence SN f → f . Also, parallel to (7.0.4), we have absolute and uniform convergence of (7.0.20), when f is a sufficiently smooth function on S n−1 . One significant tool for this study invloves a formula for the projection Ek , of the form ∫ (n−2)/2 (7.0.21) Ek f (ω) = γnk Ck (ω · y)f (y) dS(y), S n−1
where γnk are certain explicit constants and Ckα (t) are polynomials in t, of degree k, known as Gegenbauer polynomials, which specialize to Legendre polynomials Pk (t) in the classical case n = 3, leading to analysis on S 2 . Our approach to (7.0.21) goes
298
7. Fourier analysis
through the Dirichlet problem, to solve for u on the unit ball B ⊂ Rn the equation (7.0.22)
u = f on ∂B = S n−1 .
∆u = 0 on B,
On the one hand, there is the Poisson integral formula for the solution, ∫ 1 − |x|2 f (y) (7.0.23) u(x) = dS(y), x ∈ B, An−1 |x − y|n S n−1
and on the other hand the identity (7.0.24)
∞ ∑
u(rω) =
rk Ek f (ω),
ω ∈ S n−1 , r ∈ [0, 1).
k=0
The simultaneous use of these two formulas leads to many interesting results about spherical harmonics. For another tool, we study the action of SO(n) on the spaces Vk and develop some results on the representation theory of these rotation groups, with particular emphasis on SO(3) and its applications to spherical harmonics on S 2 . We complement this material with a brief discussion of Fourier series on compact matrix groups, in §7.5. In §7.6 we give a geometric application of Fourier series, to the isoperimetric inequality, which states that, among smoothly bounded planar domains whose boundaries have a fixed length, disks have the largest area. The proof brings in both Fourier series and Green’s theorem.
7.1. Fourier series We consider Fourier series of functions on the n-dimensional torus Tn = Rn /(2πZn ). Given f ∈ R(Tn ) (or more generally f ∈ R# (Tn )) we set, for k ∈ Zn , ∫ (7.1.1) fˆ(k) = (2π)−n f (θ)e−ik·θ dθ, Tn
i.e., (7.1.2)
fˆ(k) = (2π)−n
∫
∫
2π
··· 0
2π
f (θ1 , . . . , θn )e−i(k1 θ1 +···+kn θn ) dθ1 · · · dθn .
0
We call fˆ(k) the Fourier coefficients of f . The first major problem is to recover f from its Fourier coefficients. We first accomplish this when f belongs to the space } { ∑ |fˆ(k)| < ∞ . (7.1.3) A(Tn ) = f ∈ C(Tn ) : k∈Zn
Proposition 7.1.1. If f ∈ A(Tn ), then the following Fourier inversion formula holds: ∑ (7.1.4) f (θ) = fˆ(k)eik·θ . k∈Zn
7.1. Fourier series
299
∑ ˆ Proof. Given |f (k)| < ∞, the right side of (7.1.4) is absolutely and uniformly convergent, defining ∑ (7.1.5) g(θ) = fˆ(k)eik·θ , g ∈ C(Tn ), k∈Zn
and our task is to show that f ≡ g. Making use of the identities ∫ (2π)−n eiℓ·θ dθ = 0, if ℓ ̸= 0, (7.1.6) Tn 1, if ℓ = 0, for ℓ ∈ Zn , we get gˆ(k) = fˆ(k) for all k ∈ Zn . Let us set u = f − g. We have u ∈ C(Tn ),
(7.1.7)
u ˆ(k) = 0, ∀ k ∈ Zn .
It remains to show that this implies u ≡ 0. To prove this, we use Corollary A.5.5 (a consequence of the Stone-Weierstrass theorem, treated in Appendix A.5), which implies that for each v ∈ C(Tn ), there exist trigonometric polynomials, i.e., finite linear combinations vN of {eik·θ : k ∈ Zn }, such that vN −→ v uniformly on Tn .
(7.1.8) Now (7.1.7) implies
∫
(7.1.9)
u(θ)vN (θ) dθ = 0,
∀ N,
Tn
and passing to the limit, using (7.1.8), gives ∫ (7.1.10) u(θ)v(θ) dθ = 0, ∀ v ∈ C(Tn ). Tn
Taking v = u gives
∫ |u(θ)|2 dθ = 0,
(7.1.11) Tn
forcing u ≡ 0 and completing the proof.
We seek conditions on f implying that f ∈ A(Tn ). Assume f ∈ C ℓ (Tn ). Then integration by parts gives ∫ (7.1.12) (2π)−n f (α) (θ)e−ik·θ dθ = (ik)α fˆ(k), for |α| ≤ ℓ. Tn
Hence (7.1.13)
f ∈ C ℓ (Tn ) ⇒ |fˆ(k)| ≤ C(1 + |k|)−ℓ
∑ |α|≤ℓ
where we set (7.1.14)
∥u∥L1 = (2π)−n
∫ |u(θ)| dθ. Tn
∥f (α) ∥L1 ,
300
Since
7. Fourier analysis ∑
k∈Zn (1
+ |k|)−(n+1) < ∞, we deduce that C n+1 (Tn ) ⊂ A(Tn ).
(7.1.15)
We will sharpen this implication below. We next make use of (7.1.6) to produce results on the following.
∫ Tn
|f (θ)|2 dθ, starting with
Proposition 7.1.2. Given f ∈ A(Tn ), ∫ ∑ (7.1.16) |fˆ(k)|2 = (2π)−n |f (θ)|2 dθ. k∈Zn
Tn
More generally, if also g ∈ A(T ), ∫ ∑ (7.1.17) fˆ(k)ˆ g (k) = (2π)−n f (θ)g(θ) dθ. n
k∈Zn
Tn
Proof. Switching order of summation and integration and using (7.1.6), we have ∫ ∫ ∑ fˆ(j)ˆ (2π)−n f (θ)g(θ) dθ = (2π)−n g (k)e−i(j−k)·θ dθ Tn
(7.1.18)
=
∑
n Tn j,k∈Z
g (k), fˆ(k)ˆ
k
giving (7.1.17). Taking g = f gives (7.1.16).
We will extend the scope of Proposition 7.1.2 below. A related issue is the convergence of SN f to f as N → ∞, where ∑ (7.1.19) SN f (θ) = fˆ(k)eik·θ . |k|≤N
Clearly f ∈ A(Tn ) ⇒ Sn f → f uniformly on Here we take |k| = n T as N → ∞. Here, we are interested in convergence in L2 -norm, where ∫ (7.1.20) ∥f ∥2L2 = (2π)−n |f (θ)|2 dθ. (k12
+ · · · + kn2 )1/2 .
Tn
Given f ∈ R(T ), this defines a “norm,” satisfying the following result, called the triangle inequality: n
(7.1.21)
∥f + g∥L2 ≤ ∥f ∥L2 + ∥g∥L2 .
See Appendix A.2 for details on this. Behind this result is the fact that (7.1.22)
∥f ∥2L2 = (f, f )L2 ,
where, when f, g ∈ L2 (Tn ), we set (7.1.23)
(f, g)L2 = (2π)−n
∫ f (θ)g(θ) dθ. Tn
Thus the content of (7.1.16) is that ∑ (7.1.24) |fˆ(k)|2 = ∥f ∥2L2 , k∈Zn
7.1. Fourier series
301
and that of (7.1.17) is that
∑
(7.1.25)
g (k) = (f, g)L2 . fˆ(k)ˆ
k∈Zn
The left side of (7.1.24) can be regarded as the square norm of an element of ℓ2 (Zn ). Generally, an element (ak )k∈Zn belongs to ℓ2 (Zn ) if and only if ∑ (7.1.26) ∥(ak )∥2ℓ2 = |ak |2 < ∞. k∈Zn
There is an associated inner product (7.1.27)
((an ), (bn ))ℓ2 =
∑
ak bk .
k∈Zn
As in (7.1.21), one has ∥(ak ) + (bk )∥ℓ2 ≤ ∥(ak )∥ℓ2 + ∥(bk )∥ℓ2 .
(7.1.28)
As for the notion of L2 -convergence, we say fν → f in L2 ⇐⇒ ∥f − fν ∥L2 → 0.
(7.1.29)
There is a similar notion of convergence in ℓ2 . Clearly ∥f − fν ∥L2 ≤ sup |f (θ) − fν (θ)|.
(7.1.30)
θ
In view of the uniform convergence SN f → f for f ∈ A(Tn ), noted above, we have f ∈ A(Tn ) =⇒ SN f → f in L2 , as N → ∞.
(7.1.31)
The triangle inequality implies (7.1.32) ∥f ∥L2 − ∥SN f ∥L2 ≤ ∥f − SN f ∥L2 , and clearly (by Proposition 7.1.2) ∥SN f ∥2L2 =
(7.1.33)
∑
|fˆ(k)|2 ,
|k|≤N
so ∥f − SN f ∥2L2 → 0 as N → ∞ =⇒ ∥f ∥2L2 =
(7.1.34)
∑
|fˆ(k)|2 .
k∈Zn
We now consider more general functions f ∈ R(Tn ). With fˆ(k) and SN f defined by (7.1.1) and (7.1.19), we define RN f by (7.1.35) Note that
∫ Tn
(7.1.36)
f (θ)e
−ik·θ
dθ =
∫
f = SN f + RN f. Tn
SN f (θ)e−ik·θ dθ for |k| ≤ N . Hence
(f, SN f )L2 = (SN f, SN f )L2 ,
and hence (7.1.37)
(SN f, RN f )L2 = 0.
Consequently, (7.1.38)
∥f ∥2L2 = (SN f + RN f, SN f + RN f )L2 = ∥SN f ∥2L2 + ∥RN f ∥2L2 .
302
7. Fourier analysis
In particular, ∥SN f ∥L2 ≤ ∥f ∥L2 .
(7.1.39)
We are now in a position to prove the following. Lemma 7.1.3. Let f, fν ∈ R(Tn ). Assume lim ∥f − fν ∥L2 = 0,
(7.1.40)
ν→∞
and, for each ν, lim ∥fν − SN fν ∥L2 = 0.
(7.1.41)
N →∞
Then lim ∥f − SN f ∥L2 = 0.
(7.1.42)
N →∞
Proof. Writing f − SN f = (f − fν ) + (fν − SN fν ) + SN (fν − f ) and using the triangle inequality, we have, for each ν, (7.1.43)
∥f − SN f ∥L2 ≤ ∥f − fν ∥L2 + ∥fν − SN fν ∥L2 + ∥SN (fν − f )∥L2 .
Taking N → ∞ and using (7.1.39), we have lim sup ∥f − SN f ∥L2 ≤ 2∥f − fν ∥L2 ,
(7.1.44)
N →∞
for each ν. Then (7.1.40) yields the desired conclusion (7.1.42).
Given f ∈ C(Tn ), we have trigonometric polynomials fν → f uniformly on T (by Corollary A.5.5), and clearly (7.1.41) holds for each such fν . Thus Lemma 7.1.3 yields the following. n
f ∈ C(Tn ) =⇒ SN f → f in L2 , and ∑ |fˆ(k)|2 = ∥f ∥2L2 .
(7.1.45)
k∈Zn
To go further, we bring in the following lemma. Lemma 7.1.4. Given f ∈ R(Tn ), there exist fν ∈ C(Tn ) such that fν → f in L2 . Proof. One easily reduces this to treating the case f ≥ 0. Then∫ Proposition 3.1.11 implies that there exist fν ∈ C(Tn ) such that 0 ≤ fν ≤ f and Tn (f − fν ) dθ → 0. Hence ∫ ∫ |f − fν |2 dθ ≤ (sup f ) Tn
(f − fν ) dθ → 0.
Tn
The last two lemmas and (7.1.45) yield the following result. Proposition 7.1.5. We have (7.1.46)
f ∈ R(Tn ) =⇒ SN f → f in L2 , and ∑ |fˆ(k)|2 = ∥f ∥2L2 . k∈Zn
7.1. Fourier series
303
The last identity is called the Plancherel identity. Having some general results guaranteeing the conclusion in (7.1.16), we now improve (7.1.15). Proposition 7.1.6. We have (7.1.47)
ℓ>
n =⇒ C ℓ (Tn ) ⊂ A(Tn ). 2
Proof. As before, we have (7.1.12). We deduce from this that ∑ (7.1.48) |k α fˆ(k)|2 = ∥f (α) ∥2L2 , for |α| ≤ ℓ. k∈Zn
Summing over |α| ≤ ℓ yields ∑ ∑ (7.1.49) (1 + |k|)2ℓ |fˆ(k)|2 ≤ C ∥f (α) ∥2L2 . |α|≤ℓ
k∈Zn
Then Cauchy’s inequality gives ∑ ∑ |fˆ(k)| = (1 + |k|)−ℓ (1 + |k|)ℓ |fˆ(k)| (7.1.50)
k∈Zn
k∈Zn 1/2
≤ An,ℓ
{∑
(1 + |k|)2ℓ |fˆ(k)|2
}1/2 ,
k∈Zn
where (7.1.51)
An,ℓ =
∑
(1 + |k|)−2ℓ < ∞, if 2ℓ > n.
k∈Zn
This establishes the desired finiteness of
∑ ˆ |f (k)|.
It is illuminating to restate some of the results established above, bringing in the spaces ℓp (Zn ) of functions a : Zn → C, with norm ∥ · ∥ℓp , defined by ∑ (7.1.52) ∥a∥pℓp = |a(k)|p < ∞, k∈Zn
for 1 ≤ p < ∞, and (7.1.53)
∥a∥ℓ∞ = sup |a(k)| < ∞. k
The case p = 2 arose in (7.1.26)–(7.1.27). Also, we denote by F the transformation that assigns to an integrable function f on Tn its Fourier coefficients (fˆ(k)). The definition (7.1.1) readily gives (7.1.54)
F : R# (Tn ) −→ ℓ∞ (Zn ),
∥Ff ∥ℓ∞ ≤ ∥f ∥L1 ,
with ∥f ∥L1 defined as in (7.1.14). Proposition 7.1.5 gives (7.1.55)
F : R(Tn ) −→ ℓ2 (Zn ),
∥Ff ∥ℓ2 = ∥f ∥L2 .
Meanwhile, the definition (7.1.3) gives (7.1.56)
F : A(Tn ) −→ ℓ1 (Zn ).
We can restate Proposition 7.1.1 by bringing in the transformation F ∗ , defined on ℓ1 (Zn ) by ∑ (7.1.57) F ∗ (a)(θ) = a(k)eik·θ . k∈Zn
304
7. Fourier analysis
The content of Proposition 7.1.1 is that (7.1.58)
F ∗ : ℓ1 (Zn ) → A(Tn ) is the 2-sided inverse of F in (7.1.56).
The step (7.1.7) in the proof of that result is equivalent to FF ∗ = I on ℓ1 (Zn ), and the reverse result, F ∗ F = I on A(Tn ), made use of the Stone-Weierstrass theorem. The analytical apparatus for an extension of (7.1.55) to a result involving F and F ∗ as inverses of each other was produced by H. Lebesgue in what is now known as the theory of Lebesgue measure and integration. We give a brief description of this, refering the reader to other sources, such as [17] or [47], for a detailed presentation. To start, we say a set S ⊂ Tn is measurable provided that (7.1.59)
m∗ (S) + m∗ (Tn \ S) = V (Tn ),
where m∗ is the outer measure defined by (3.1.124). We say a function f : Tn → C is measurable provided that, for each open O ⊂ C, f −1 (O) ⊂ Tn is measurable. The Lebesgue integral associates a value in [0, +∞] to ∫ (7.1.60) f (θ) dθ Tn
for each measurable f satisfying f (θ) ≥ 0 for all θ. The space L1 (Tn ) consists of all measurable functions f such that ∫ (7.1.61) ∥f ∥L1 = (2π)−n |f (θ)| dθ < ∞. Tn
In such a case, one can write f = f0+ − f0− + i(f1+ − f1− ), with all fj± measurable and ≥ 0, and with finite integral, and the process alluded to above applies to ∫ evaluate these integrals, and hence to evaluate Tn f (θ) dθ. There is one further wrinkle. The space L1 (Tn ) actually consists of equivalence classes of measurable functions satisfying (7.1.61), where one says f1 ∼ f2 provided {x ∈ Tn : f1 (x) ̸= f2 (x)} has outer measure zero. This makes L1 (Tn ) a normed space. More generally, for p ∈ [1, ∞), Lp (Tn ) consists of equivalence classes of measurable functions for which ∫ (7.1.62) ∥f ∥pLp = (2π)−n |f (θ)|p dθ < ∞. Tn
The only case of p > 1 we work with here is p = 2. One has (7.1.63)
f, g ∈ L2 (Tn ) =⇒ f g ∈ L1 (Tn ),
and L2 (Tn ) is an inner product space, via (7.1.23), extended to this setting. These Lp norms satisfy the triangle inequality (7.1.64)
∥f + g∥Lp ≤ ∥f ∥Lp + ∥g∥Lp .
For p = 1, this inequality is a simple consequence of the pointwise inequality |f (θ)+ g(θ)| ≤ |f (θ)|+|g(θ)|. For p = 2, the proof, as indicated in (7.1.21)–(7.1.23), follows from material in Appendix A.2. For other p, which we do not need here, the reader can consult the references mentioned above. Thanks to (7.1.63), Lp (Tn ) has the structure of a metric space, with d(f, g) = ∥f − g∥Lp . We say fν → f in Lp if
7.1. Fourier series
305
∥f − fν ∥Lp → 0. For all p ∈ [1, ∞), these spaces have the following important metric properties. The first is a denseness property. Proposition A. Given f ∈ Lp (Tn ) and ℓ ∈ N, there exist fν ∈ C ℓ (Tn ) such that fν → f in Lp . The second is a completeness property. Proposition B. If (fν ) is a Cauchy sequence in Lp (Tn ), then there exists f ∈ Lp (Tn ) such that fν → f in Lp . We refer to [17] or [47] for proofs of these results. The following two neat L2 results illustrate the usefulness of the Lebesgue theory of integration in Fourier analysis. Proposition 7.1.7. The maps F and F ∗ have unique continuous linear extensions from F : A(Tn ) → ℓ1 (Zn ),
(7.1.65)
F ∗ : ℓ1 (Zn ) → A(Tn )
to (7.1.66)
F : L2 (Tn ) → ℓ2 (Zn ),
F ∗ : ℓ2 (Zn ) → L2 (Tn ),
and the identities (7.1.67)
F ∗ Ff = f,
FF ∗ a = a
and (7.1.68)
∥Ff ∥ℓ2 = ∥f ∥L2 ,
∥F ∗ a∥L2 = ∥a∥ℓ2
hold for all f ∈ L2 (Tn ) and a ∈ ℓ2 (Zn ). Proposition 7.1.8. Define SN as in (7.1.19). Then SN : L2 (Tn ) → L2 (Tn ) and (7.1.69)
f ∈ L2 (Tn ) =⇒ SN f → f in L2 (Tn ).
Rather than presenting arguments here, we refer the reader to §7.2, and the analogous results, Propositions 7.2.10 and 7.2.11. As a complement to Proposition 7.1.7, we mention that, using the inner products (7.1.23) and (7.1.27), we have (7.1.70)
(a, Ff )ℓ2 = (F ∗ a, f )L2 ,
for all f ∈ L2 (Tn ), a ∈ ℓ2 (Zn ). In case f ∈ A(Tn ) and a ∈ ℓ1 (Zn ), such an identity is elementary. We now discuss another way of taking Fourier analysis beyond the setting of A(Tn ), which is a normed space, with norm (7.1.71)
∥f ∥A = ∥Ff ∥ℓ1 ,
for which we have (7.1.72)
∥F ∗ a∥A = ∥a∥ℓ1 ,
306
7. Fourier analysis
for all a ∈ ℓ1 (Zn ). We define the dual space A′ (Tn ) to consist of continuous linear functionals w : A(Tn ) −→ C,
(7.1.73) i.e., those linear maps satisfying (7.1.74)
|w(f )| ≤ C∥f ∥A ,
∀ f ∈ A(Tn ).
The optimal constant in (7.1.74) is denoted ∥w∥A′ . Other useful notations are (7.1.75)
⟨f, w⟩ = w(f ),
and (f, w) = w(f ),
for f ∈ A(Tn ), w ∈ A′ (Tn ). We want to pass from F and F ∗ in (7.1.69) to (7.1.76)
F : A′ (Tn ) → ℓ∞ (Zn ),
F ∗ : ℓ∞ (Zn ) → A′ (Tn ).
To explain the use of ℓ∞ (Zn ), we establish the following. Lemma 7.1.9. The dual space to ℓ1 (Zn ) is ℓ∞ (Zn ). Proof. What is to be established is a natural identification of the set of continuous linear functionals on ℓ1 (Zn ), β : ℓ1 (Zn ) −→ C,
(7.1.77)
with the set of bounded functions b : Zn → C, b ∈ ℓ∞ (Zn ).
(7.1.78)
The map ℓ∞ (Zn ) → ℓ1 (Zn )′ is given by ∑ (7.1.79) ⟨a, b⟩ = ak bk , a ∈ ℓ1 (Zn ), b ∈ ℓ∞ (Zn ). k∈Zn
Note that (7.1.80)
( )∑ |⟨a, b⟩| ≤ sup |bk | |ak | = ∥a∥ℓ1 ∥b∥ℓ∞ . k
k
Conversely, given β as in (7.1.77), satisfying |β(a)| ≤ B∥a∥ℓ1 ,
(7.1.81) let us define (7.1.82)
bk = β(εk ),
where εk (ℓ) = 1 if ℓ = k, 0 if ℓ ̸= k.
We have |bk | ≤ B for all k, so b = (bk ) ∈ ℓ∞ (Zn ), and ∥b∥ℓ∞ ≤ B. Furthermore, ⟨a, b⟩ = β(a),
(7.1.83)
for a = εk , for each k ∈ Z , and hence (7.1.83) holds for all a ∈ ℓ1 (Zn ). This proves the lemma. n
Remark. An inspection of the proof shows that the correspondence β ↔ b of ℓ1 (Zn )′ with ℓ∞ (Zn ) is norm preserving, i.e., the optimal constant ∥β∥ in (7.1.81) is equal to ∥b∥ℓ∞ .
7.1. Fourier series
307
We are now ready to define F and F ∗ in (7.1.76). Taking off from (7.1.70), we define Fw ∈ ℓ∞ (Zn ) for w ∈ A′ (Tn ) by (a, Fw) = (F ∗ a, w),
(7.1.84)
a ∈ ℓ1 (Zn ),
and we define F ∗ b ∈ A′ (Tn ) for b ∈ ℓ∞ (Zn ) by (f, F ∗ b) = (Ff, b),
(7.1.85)
f ∈ A(Tn ).
Note that (7.1.84) yields (7.1.86)
|(a, Fw)| ≤ |w(F ∗ a)| ≤ ∥w∥A′ ∥F ∗ a∥A = ∥w∥A′ ∥a∥ℓ1 ,
the last identity by (7.1.72). Hence, by Lemma 7.1.9 and the remark following it, ∥Fw∥ℓ∞ ≤ ∥w∥A′ .
(7.1.87) Next, (7.1.85) yields
|(f, F ∗ b)| ≤ ∥Ff ∥ℓ1 ∥b∥ℓ∞ = ∥f ∥A ∥b∥ℓ∞ ,
(7.1.88)
the last identity by (7.1.71), hence ∥F ∗ b∥A′ ≤ ∥b∥ℓ∞ .
(7.1.89)
Thus F and F ∗ in (7.1.76) are well defined. The following Fourier inverson formula complements those in Propositions 7.1.1 and 7.1.7. Proposition 7.1.10. The maps F and F ∗ in (7.1.76) are two-sided inverses of each other, i.e., (7.1.90)
F ∗ Fw = w and FF ∗ b = b,
∀w ∈ A′ (Tn ), b ∈ ℓ∞ (Zn ).
Proof. Given f ∈ A(Tn ), a ∈ ℓ1 (Zn ), we have (7.1.91)
(f, F ∗ Fw) = (Ff, Fw) = (F ∗ Ff, w) = (f, w),
the first identity by (7.1.85), the second by (7.1.84), and the third by (7.1.58). This implies F ∗ Fw = w. Similarly, (7.1.92)
(a, FF ∗ b) = (F ∗ a, F ∗ b) = (FF ∗ a, b) = (a, b),
yielding FF ∗ b = b.
We can now sharpen (7.1.87) and (7.1.89). Corollary 7.1.11. For w ∈ A′ (Tn ), b ∈ ℓ∞ (Zn ), (7.1.93)
∥Fw∥ℓ∞ = ∥w∥A′ , and ∥F ∗ b∥A′ = ∥b∥ℓ∞ .
Proof. First, w = F ∗ Fw ⇒ ∥w∥A′ ≤ ∥F w∥ℓ∞ , by (7.1.89) with b = Fw. This together with (7.1.87) yields the first identity in (7.1.93). The second identity has a similar proof. Let us note that (7.1.84), applied to a = εk , defined as in (7.1.82), gives, for w ∈ A′ (Tn ), (7.1.94)
Fw(k) = (2π)−n ⟨ek , w⟩,
ek ∈ A(Tn ),
ek (θ) = e−ik·θ .
308
7. Fourier analysis
The space A′ (Tn ) contains an interesting variety of functions and other objects. For example, there is a natural map ι : R# (Tn ) −→ A′ (Tn ),
(7.1.95) given by
∫
⟨f, ι(u)⟩ = (2π)−n
(7.1.96)
f (θ)u(θ) dθ. Tn
We have
( ) |⟨f, ι(u)⟩| ≤ (2π)−n sup |f | ∥u∥L1
(7.1.97)
≤ (2π)−n ∥f ∥A ∥u∥L1 .
Given the results on the Lebesgue integral mentioned above, this extends to ι : L1 (Tn ) −→ A′ (Tn ).
(7.1.98) We have from (7.1.94) that
Fι(u)(k) = (2π)−n
(7.1.99)
∫
u(θ)e−ik·θ dθ,
k ∈ Zn .
Tn ′
The space A (T ) also contains some objects more singular than functions. For example, given p ∈ Tn , we define δp ∈ A′ (Tn ) by n
⟨f, δp ⟩ = f (p).
(7.1.100)
More generally, let M ⊂ Tn be a compact, m-dimensional, C 1 surface, and u ∈ C(M ). Define uδM ∈ A′ (Tn ) by ∫ (7.1.101) ⟨f, uδM ⟩ = f (x)u(x) dS(x). M
We have (7.1.102)
|⟨f, uδM ⟩| ≤ CM (sup |u|)(sup |f |) ≤ CM (sup |u|)∥f ∥A .
Again (7.1.94) applies, to give (7.1.103)
F(uδM )(k) = (2π)−n
∫
u(θ)e−ik·θ dS(θ),
k ∈ Zn .
M
Objects such as δp and δM are examples of “distributions.” A beautiful theory of Fourier analysis on the space of distributions on Tn , which includes some more singular objects, was constructed by L. Schwartz. We present some of his results here. We start with the space C ∞ (Tn ). By (7.1.12)–(7.1.13), f ∈ C ∞ (Tn ) =⇒ F f ∈ s(Zn ),
(7.1.104) where (7.1.105)
s(Zn ) = {a ∈ ℓ∞ (Zn ) : |a(k)| ≤ CN (1 + |k|)−N , ∀ N }.
It is also easy to see that (7.1.106)
F ∗ : s(Zn ) −→ C ∞ (Tn ),
7.1. Fourier series
309
and (7.1.58) specializes, to yield F ∗ Ff = f,
(7.1.107)
FF ∗ a = a,
∀ f ∈ C ∞ (Tn ), a ∈ s(Zn ).
The space C ∞ (Tn ) carries the following sequence of norms: pk (f ) = max sup |f (α) (θ)|,
(7.1.108)
|α|≤k θ∈Tn
and the space s(Zn ) carries the norms qν (a) = sup (1 + |k|)ν |a(k)|.
(7.1.109)
k∈Zn
We see from (7.1.13) that (7.1.110) Also, since
qν (Ff ) ≤ Cν pν (f ).
∑
k∈Zn (1
−(n+1)
+ |k|)
< ∞, we have
p0 (F ∗ a) ≤ Cqn+1 (a),
(7.1.111) and from this we get
pk (F ∗ a) ≤ Cqk+n+1 (a).
(7.1.112)
Now a distribution w ∈ D′ (Tn ) is a continuous linear functional w : C ∞ (Tn ) −→ C,
(7.1.113)
that is to say, w is a linear map from C ∞ (Tn ) to C with the property that there exist k ∈ Z+ and C < ∞ such that |w(f )| ≤ Cpk (f ),
(7.1.114)
∀ f ∈ C ∞ (Tn ).
As in (7.1.75), we use the notation ⟨f, w⟩ = w(f ),
(7.1.115)
(f, w) = w(f ).
It follows from (7.1.15) that ∥f ∥A ≤ Cpn+1 (f ),
(7.1.116) ′
so each w ∈ A (T ) also defines an element of D′ (Tn ). Thus δp in (7.1.100) and δM in (7.1.101) are examples of distributions on Tn . To produce more singular distributions, we can define n
∂ α : D′ (Tn ) −→ D′ (Tn ),
(7.1.117) by
⟨f, ∂ α w⟩ = (−1)|α| ⟨f (α) , w⟩.
(7.1.118) We seek to define (7.1.119)
F : D′ (Tn ) −→ s′ (Zn ),
F ∗ : s′ (Zn ) −→ D′ (Tn ),
where (7.1.120)
s′ (Zn ) = {b : (1 + |k|)−N b ∈ ℓ∞ (Zn ), for some N ∈ Z+ }.
The significance of s′ (Zn ) is explained by the following. Lemma 7.1.12. The dual space to s(Zn ) is s′ (Zn ).
310
7. Fourier analysis
Proof. What we claim is that there is a natural identification of the set s(Zn )′ of continuous linear functionals on s(Zn ), β : s(Zn ) −→ C,
(7.1.121) with the set of functions s′ (Zn ),
b ∈ s′ (Zn ).
(7.1.122)
The condition of continuity on β in (7.1.121) is that there exist ν ∈ N and C < ∞ such that |β(a)| ≤ Cqν (a).
(7.1.123) ′
n ′
The map s (Z ) → s(Z ) is given by ∑ (7.1.124) ⟨a, b⟩ = a k bk , n
a ∈ s(Zn ), b ∈ s′ (Zn ).
k∈Zn
Note that (7.1.125)
( )( ) |⟨a, b⟩| ≤ C sup (1 + |k|)N +n+1 |a(k)| sup (1 + |k|)−N |b(k)| . k
k
Conversely, given β as in (7.1.121), satisfying (7.1.123), we define (7.1.126)
bk = β(εk ),
with εk as in (7.1.82), and verify that b ∈ s′ (Zn ) and that ⟨a, b⟩ = β(a),
(7.1.127)
first for a = εk , for all k ∈ Z , and then for all a ∈ s(Zn ). n
We are now ready to define F and F ∗ in (7.1.119). Parallel to (7.1.84), we define Fw ∈ s′ (Zn ) for w ∈ D′ (Tn ) by (7.1.128)
(a, Fw) = (F ∗ a, w),
a ∈ s(Zn ),
and we define F ∗ b ∈ D′ (Tn ) for b ∈ s′ (Zn ) by (7.1.129)
(f, F ∗ b) = (Ff, b),
f ∈ C ∞ (Tn ).
As we have seen, a ∈ s(Zn ) ⇒ F ∗ a ∈ s(Zn ), with estimates (7.1.112), and f ∈ C ∞ (Tn ) ⇒ Ff ∈ s(Zn ), with estimates (7.1.110). These results enable one to deduce that (7.1.128) defines Fw ∈ s′ (Zn ) and (7.1.129) defines F ∗ b ∈ D′ (Tn ). Here is a further extension of the Fourier inversion formula. Proposition 7.1.13. The maps F and F ∗ in (7.1.119) are two-sided inverses of each other, i.e., (7.1.130)
F ∗ Fw = w, and FF ∗ b = b,
∀ w ∈ D′ (Tn ), b ∈ s′ (Zn ).
The proof is parallel to that of Proposition 7.1.10. The result (7.1.12) implies (7.1.131)
Ff (α) (k) = (ik)α Ff (k),
∀ f ∈ C ∞ (Tn ).
We can extend this to D′ (Tn ), using (7.1.117)–(7.1.118) and (7.1.128)–(7.1.129), to obtain:
Exercises
311
Proposition 7.1.14. Given w ∈ D′ (Tn ), F(∂ α w)(k) = (ik)α Fw(k).
(7.1.132)
Exercises 1. Consider f (θ) = |θ| for −π ≤ θ ≤ π, extended periodically to define an element of C(T1 ). (a) Compute fˆ(k). (b) Show that f ∈ A(T1 ). (c) Use the Fourier inversion formula (7.1.4) at θ = 0 to show that ∞ ∑ 1 π2 (7.1.133) = . 2 (2ℓ + 1) 8 ℓ=0
(d) Deduce from (c) that ∞ ∑ 1 π2 = . k2 6
(7.1.134)
k=1
Hint. Decompose this sum into the sum over k odd and the sum over k even. ∑∞ Remark. k=1 k −2 is ζ(2). 2. Consider g(θ) = 1 for 0 < θ < π, 0 for −π < θ < 0, defining a bounded integrable function on T1 . (a) Compute gˆ(k). (b) Use the Plancherel identity (7.1.46) to obtain another derivation of (7.1.133). 3. Apply the Plancherel identity to f and fˆ in Exercise 1. Use this to show that ∞ ∑ 1 π4 . (7.1.135) = 4 k 90 k=1
Note. This sum is ζ(4). 4. Given f ∈ R(T1 ), set (7.1.136)
Pr f (θ) =
∑
r|k| fˆ(k)eikθ ,
0 ≤ r < 1.
k∈Z
Show that ∥Pr f − f ∥L2 −→ 0, as r ↗ 1. 5. In the setting of Exercise 4, show that ∞ ∞ ∑ ∑ (7.1.137) Pr f (θ) = fˆ(k)z k + fˆ(−k)z k , k=0
z = reiθ .
k=1
Deduce that u(reiθ ) = Pr f (θ) is harmonic on the unit disk D = {z ∈ C : |z| < 1}.
312
7. Fourier analysis
6. Interpret the results of Exercises 4–5 as providing a solution to the Dirichlet boundary probem (7.1.138) ∆u = 0 on D, u = f. ∂D
7. In the setting of Exercises 4–6, establish the following Poisson integral formula: ∫ (7.1.139)
2π
f (φ)p(r, θ − φ) dφ,
Pr f (θ) = 0
where (7.1.140)
p(r, θ) =
∞ 1 ∑ |k| ikθ r e . 2π k=−∞
Decompose this sum into a sum of two geometric series and show that (7.1.141)
p(r, θ) =
8. Show that
∫
1 − r2 1 . 2π 1 − 2r cos θ + r2
p(r, θ) dθ = 1,
∀ r ∈ [0, 1),
T1
and, for all δ > 0, p(r, θ) → 0 uniformly for θ ∈ [−π, π] \ [−δ, δ], as r ↗ 1. Hint. For the first part, integrate the series for p(r, θ) term by term. 9. Show that if f ∈ C(T1 ), then Pr f → f uniformly on T1 , as r ↗ 1. Hint. Look forward, to Lemma 7.2.3. 10. Adapt the proof of Lemma 7.2.3 to establish the following. Proposition X. If f ∈ R# (T1 ), I ⊂ T1 is an open set on which f is continuous, and K ⊂ I is compact, then Pr f → f uniformly on K, as r ↗ 1. For comparison, we mention the following result, given in Proposition 4.12 in Chapter 5 of [49]. Proposition Y. If f ∈ R# (T1 ) and I ⊂ T1 is an open set on which f is H¨older continuous, with some positive exponent, then SN f (θ) → f (θ) as N → ∞, for each θ ∈ I. Here SN f is as in (7.1.19), which for n = 1 is SN f (θ) =
N ∑ k=−N
fˆ(k)eikθ .
Exercises
313
11. From the geometric series
∑∞ k=0
∞ ∑ zk k=1
z k = 1/(1 − z), deduce that
= log
k
12. Show that f (θ) = log
1 , 1−z
for |z| < 1.
1 =⇒ f ∈ R# (T1 ). 1 − eiθ
Set
1 , for 0 ≤ r < 1. 1 − reiθ Show that fr ∈ C(T1 ) for each r ∈ [0, 1) and fr (θ) = log
∥f − fr ∥L1 −→ 0,
as r ↗ 1.
Deduce that fˆr (k) −→ fˆ(k) as r ↗ 1,
for each k ∈ Z.
13. In the setting of Exercise 12, note that, for 0 ≤ r < 1, ∫ 2π ( ) 1 1 ˆ log fr (k) = e−ikθ dθ. 2π 0 1 − reiθ Input the power series from Exercise 11 and deduce (via Exercise 12) that 1 fˆ(k) = , if k ≥ 1, k 0, if k ≤ 0. 14. Deduce from Exercise 13 and Proposition Y that there is pointwise convergence (though not absolute convergence) ∞ ∑ zk k=1
k
= log
1 , 1−z
for |z| = 1, z ̸= 1.
Note that taking z = −1 yields 1 1 1 + − + · · · = log 2. 2 3 4 For a more elementary approach to this special case, see Exercise 30 for Chapter 4, §5, of [49]. 1−
15. Let f ∈ C(Tn ) and assume that there exist fν ∈ A(Tn ) such that fν → f uniformly, and, for some K < ∞, independent of ν, ∥fˆν ∥ℓ1 ≤ K. Show that f ∈ A(Tn ). Show that this holds if there exists m > n/2 and K1 < ∞ such that each fν ∈ C m (Tn ) and ∑ ∥fν(α) ∥L2 ≤ K1 . |α|≤m
314
7. Fourier analysis
16. Deduce from Exercise 15 that Lip(T1 ) ⊂ A(T1 ). Note the improvement over (7.1.47) (in case n = 1). Note that this result applies to the function in Exercise 1.
7.2. The Fourier transform Given f ∈ R(Rn ) (or more generally f ∈ R# (Rn )), we define its Fourier transform to be ∫ −n/2 ˆ (7.2.1) f (ξ) = Ff (ξ) = (2π) f (x)e−ix·ξ dx, x ∈ Rn . Rn
Similarly, we set (7.2.2)
F ∗ f (ξ) = (2π)−n/2
∫ f (x)eix·ξ dx,
ξ ∈ Rn ,
Rn ∗
and ultimately plan to identify F as the inverse Fourier transform. Here, of course we take R(Rn ) to consist of complex-valued functions. Recall from §3.1 that this means their real and imaginary parts are separately Riemann integrable. Clearly |fˆ(ξ)| ≤ (2π)−n/2 ∥f ∥L1 ,
(7.2.3) where we use the notation
∫ |f (x)| dx.
∥f ∥L1 =
(7.2.4)
Rn
We also have continuity. Proposition 7.2.1. If f ∈ R(Rn ), then fˆ is continuous on Rn . ∫ Proof. Given ε > 0, pick N < ∞ such that |x|>N |f (x)| dx < ε. Write f = fN +gN where fN (x) = f (x) for |x| ≤ N , 0 for |x| > N . Then fˆ(ξ) = fˆN (ξ) + gˆN (ξ),
(7.2.5) and
|ˆ gN (ξ)| < ε,
(7.2.6) Meanwhile, for ξ, ζ ∈ R ,
∀ ξ.
n
(7.2.7)
fˆN (ξ) − fˆN (ζ) = (2π)−n/2
∫
( ) f (x) e−ix·ξ − e−ix·ζ dx,
|x|≤N
and |e−ix·ξ − e−ix·ζ | ≤ |ξ − ζ| max |∇η e−ix·η | η
(7.2.8)
≤ |x| · |ξ − ζ| ≤ N |ξ − ζ|,
7.2. The Fourier transform
315
for |x| ≤ N , so |fˆN (ξ) − fˆN (ζ)| ≤
(7.2.9)
N ∥f ∥L1 |ξ − ζ|. (2π)n/2
Hence each fˆN is continuous, and, by (7.2.6), fˆ is a uniform limit of continuous functions, so it is continuous. Remark. Proposition 7.2.1 also holds for f ∈ R# (Rn ). We compute some Fourier transforms, making use of the result (5.1.55) that ∫ ∞ √ 2 2 (7.2.10) e−x +xz dx = πez /4 , ∀ z ∈ C. −∞
Taking z = iξ, ξ ∈ R, gives ∫ ∞ √ 2 2 (7.2.11) ex +ixξ dx = πe−ξ /4 ,
∀ ξ ∈ R.
−∞
Writing e−|x|
2
(7.2.12) we get
∫
(7.2.13)
+ix·ξ
= e−x1 +ix1 ξ1 · · · e−xn +ixx ξn , 2
2
e−|x| eix·ξ dx = π n/2 e−|ξ| 2
2
/4
,
∀ ξ ∈ Rn .
Rn
Thus (7.2.14)
g(x) = e−|x| on Rn =⇒ F g(ξ) = F ∗ g(ξ) = 2−n/2 e−|ξ| 2
2
/4
.
A change of variable gives generally, for f ∈ R(Rn ), a > 0, (7.2.15)
fa (x) = f (ax) =⇒ F fa (ξ) = a−n fˆ(a−1 ξ),
and consequently, for b > 0 (7.2.16)
gb (x) = e−b|x| on Rn =⇒ F gb (ξ) = F ∗ gb (ξ) = (2b)−n/2 e−|ξ| 2
2
/4b
.
From (7.2.16) we see that Fg1/2 = g1/2 and also that F ∗ Fgb = FF ∗ gb = gb . The Fourier inversion formula asserts that ∫ (7.2.17) f (x) = (2π)−1/2 fˆ(ξ)eix·ξ dξ, Rn
in appropriate senses, depending on the nature of f . We will approach this by examining ∫ 2 (7.2.18) Jε f (x) = (2π)−n/2 fˆ(ξ)e−ε|ξ| eix·ξ dξ, Rn
316
7. Fourier analysis
2 with ε > 0. By (6.2.3), fˆ(ξ)e−ε|ξ| is Riemann integrable over Rn whenever f ∈ R(Rn ). We can plug in (7.2.1) for fˆ(ξ) and switch order of integration, getting ∫∫ 2 f (y)ei(x−y)·ξ e−ε|ξ| dy dξ Jε f (x) = (2π)−n ∫ (7.2.19) = f (y)Hε (x − y) dy,
where −n
(7.2.20)
∫
e−ε|ξ|
2
Hε (x) = (2π)
+ix·ξ
dξ.
Rn
Using (7.2.16), we have Hε (x) = (4πε)−n/2 e−|x| /4ε . ∫ 2 A change of variable and use of Rn e−|x| dx = π n/2 gives ∫ (7.2.22) Hε (x) dx = 1, ∀ ε > 0. 2
(7.2.21)
Rn
Using this information, we will be able to prove the following. Proposition 7.2.2. Assume f is bounded and continuous on Rn , and take Jε f (x) = ∫ f (y)Hε (x − y) dy, with Hε as in (7.2.21)–(7.2.22). Then, as ε ↘ 0, Jε f (x) −→ f (x),
(7.2.23)
∀ x ∈ Rn .
It is convenient to put this result in a more general context. If f is bounded and continuous on Rn and h ∈ R(Rn ), we define the convolution h ∗ f by ∫ (7.2.24) h ∗ f (x) = h(y)f (x − y) dy. Rn
Clearly (7.2.25)
∫ |h| dx = A, |f | ≤ M on Rn =⇒ |h ∗ f | ≤ AM on Rn .
Also, a change of variables gives (7.2.26)
∫
h ∗ f (x) =
h(x − y)f (y) dy. Rn
We want to analyze the convolution action of a family of integrable functions hν on Rn that satisfy the following conditions: ∫ ∫ (7.2.27) hν ≥ 0, hν dx = 1, hν dx ≤ εν → 0, Rn \Sν
where (7.2.28)
Sν = {x ∈ Rn : |x| ≤ δν },
δν → 0.
Assume (7.2.29)
f ∈ C(Rn ),
We aim to prove the following.
|f | ≤ M on Rn .
7.2. The Fourier transform
317
Lemma 7.2.3. If hν ∈ R(Rn ) satisfy (7.2.27)–(7.2.28) and if f ∈ C(Rn ) satisfies (7.2.29), then (7.2.30)
fν (x) = hν ∗ f (x) −→ f (x),
∀ x ∈ Rn , locally uniformly in x.
Proof. Given that f is continuous, it is uniformly continuous on compact sets, so we can supplement (7.2.29) with (7.2.31)
|x − x′ | ≤ δν , |x| ≤ R =⇒ |f (x) − f (x′ )| ≤ ε˜ν (R) → 0,
for each R < ∞. To proceed, write ∫ ∫ fν (x) = hν (y)f (x − y) dy + hν (y)f (x − y) dy (7.2.32) Sν Rn \Sν = gν (x) + rν (x). Clearly |rν (x)| ≤ M ε,
(7.2.33) Next, (7.2.34)
∀ x ∈ Rn .
∫ gν (x) − f (x) =
hν (y)[f (x − y) − f (x)] dy − εν f (x), Sν
so (7.2.35)
|gν (x) − f (x)| ≤ ε˜ν (R) + M εν ,
for |x| ≤ R,
|fν (x) − f (x)| ≤ ε˜ν (R) + 2M εν ,
for |x| ≤ R,
hence (7.2.36)
yielding (7.2.30).
In view of (7.2.21)–(7.2.22), Proposition 7.2.2 follows from Lemma 7.2.3. From here, we obtain the following. Proposition 7.2.4. Assume f is bounded and continuous on Rn , and f, fˆ ∈ R(Rn ). Then the Fourier inversion formula (7.2.17) holds for all x ∈ Rn . Proof. If f ∈ R(Rn ), then fˆ is bounded and continuous. If also fˆ ∈ R(Rn ), then the right side of (7.2.18) converges to the right side of (7.2.17), i.e., to F ∗ fˆ(x), for each x ∈ Rn , as ε ↘ 0. That is to say, (7.2.37)
lim Jε f (x) = F ∗ fˆ(x),
ε↘0
∀ x ∈ Rn .
In concert with (7.2.23), this proves the proposition.
Remark. With some more work, one can omit the hypothesis in Proposition 7.2.4 that f be bounded and continuous, and use (7.2.37) to deduce these properties as a conclusion. This sort of reasoning is best carried out in a course on measure theory and integration.
318
7. Fourier analysis
In light of the arguments given above, we see that the following class of functions arises as one that is significant for Fourier analysis. (7.2.38) A(Rn ) = {f ∈ R(Rn ) : f bounded and continuous, fˆ ∈ R(Rn )}. By Proposition 7.2.4, the Fourier inversion formula (7.2.17) holds for all f ∈ A(Rn ). It also follows that f ∈ A(Rn ) ⇒ fˆ ∈ A(Rn ). It is of interest to know when f ∈ A(Rn ). We begin with the following simple result. Suppose f ∈ C k (Rn ) has compact support (we write f ∈ Cck (Rn )). Then integration by parts yields ∫ (7.2.39) (2π)−n/2 f (α) (x)e−ix·ξ dx = (iξ)α fˆ(ξ), |α| ≤ k. Rn
Hence (7.2.40)
f ∈ Cck (Rn ) =⇒ |fˆ(ξ)| ≤ C(1 + |ξ|)−k =⇒ fˆ ∈ R(Rn ), if k > n,
so Ccn+1 (Rn ) ⊂ A(Rn ).
(7.2.41)
We will obtain some substantially sharper results below. To proceed, it is useful to bring in the quantities ∥f ∥L2 and (f, g)L2 , defined by
∫ ∥f ∥2L2 =
(7.2.42)
|f (x)|2 dx, Rn
and (7.2.43)
∫ (f, g)L2 =
f (x)g(x) dx. Rn
Note that ∥f ∥2L2 = (f, f )L2 . Since elements of R(Rn ) are both bounded and integrable, we have ∫ ( ) n (7.2.44) f ∈ R(R ) =⇒ |f (x)|2 dx ≤ sup |f | ∥f ∥L1 , Rn
where ∥f ∥L1 is defined by (7.2.4). Use of (7.2.43) makes R(Rn ) an inner product space and use of (7.2.42) makes it a normed space (if we identify f and g whenever ∫ |f − g| dx = 0). In particular, the triangle inequality holds: (7.2.45)
∥f + g∥L2 ≤ ∥f ∥L2 + ∥g∥L2 .
A parallel result holds for L1 : (7.2.46)
∥f + g∥L1 ≤ ∥f ∥L1 + ∥g∥L1 .
The inequality (7.2.46) is immediate from the definition (7.2.4) and the pointwise estimate |f (x) + g(x)| ≤ |f (x)| + |g(x)|. The proof of (7.2.45) takes a longer argument, involving along the way Cauchy’s inequality, (7.2.47)
|(f, g)L2 | ≤ ∥f ∥L2 ∥g∥L2 .
See Appendix A.2, on inner product spaces, for a derivation of (7.2.47) and (7.2.45).
7.2. The Fourier transform
319
An important property of the Fourier transform F and of F ∗ is that they preserve the L2 -norm and inner product. We first derive this for elements of A(Rn ). Since F and F ∗ differ only in replacing e−ix·ξ by its complex conjugate eix·ξ , we have, for f, g ∈ A(Rn ), (Ff, g)L2 = (f, F ∗ g)L2 .
(7.2.48)
Combining this with Proposition 7.2.4, we have (7.2.49)
f, g ∈ A(Rn ) =⇒ (Ff, Fg)L2 = (f, F ∗ Fg)L2 = (f, g)L2 .
One readily obtains a similar result with F replaced by F ∗ . Hence (7.2.50)
∥Ff ∥L2 = ∥F ∗ f ∥L2 = ∥f ∥L2 ,
for all f ∈ A(Rn ). The result (7.2.50) is called the Plancherel identity. It extends beyond f ∈ A(Rn ). We aim to prove the following. Proposition 7.2.5. If f ∈ R(Rn ), then |Ff |2 and |F ∗ f |2 belong to R(Rn ), and (7.2.50) holds. Note that we do not assert that f ∈ R(Rn ) implies fˆ ∈ R(Rn ). Indeed, that can fail, as the following simple example shows, in case n = 1. Namely, if (7.2.51)
χR (x) = 1 for |x| ≤ R,
then 1 χ bR (ξ) = √ 2π (7.2.52)
∫
R
0 for |x| > R,
e−ixξ dx
−R
∫ R 1 =√ cos xξ dx 2π −R √ 2 sin Rξ , = π ξ
a function that is square integrable but not integrable. We use an approximation argument to prove Proposition 7.2.5, making use of the following lemma. Pick k > n. Lemma 7.2.6. Given f ∈ R(Rn ), there exist fν ∈ Cck (Rn ) such that sup |fν | ≤ sup |f |, (7.2.53)
∥f − fν ∥L1 −→ 0, and ∥f − fν ∥L2 −→ 0.
Take fR (x) = f (x) for |x| < R, 0 for |x| ≥ R. Then f ∈ R(Rn ) → Proof. ∫ |f − fR | dx → 0 as R → ∞, so we specialize to the case f ∈ Rc (Rn ). It suffices to treat the case f ≥ 0. Then the existence of fν ∈ Cc (Rn ) such that 0 ≤ fν ≤ f and ∥f − fν ∥L1 → 0 follows from Proposition 3.1.11. Then also ∥f − fν ∥L2 → 0. Further approximation by elements of Cck (Rn ) is left to the reader. One might use the Stone-Weierstrass theorem, or perhaps a convolution argument.
320
7. Fourier analysis
We now move to the proof of Proposition 7.2.5. By (7.2.41), each fν belongs to A(Rn ), so, by (7.2.50), ∥fˆν ∥L2 = ∥fν ∥L2 ,
(7.2.54)
∀ ν.
By (7.2.3) we have (7.2.55)
sup |fˆ(ξ) − fˆν (ξ)| ≤ (2π)−n/2 ∥f − fν ∥L1 , ξ
so fˆν → fˆ uniformly on Rn . Consequently, for each R < ∞, ∫ ∫ (7.2.56) |fˆ(ξ)|2 dξ = lim |fˆν (ξ)|2 dξ. |ξ|≤R
ν→∞ |ξ|≤R
Meanwhile, the right side of (7.2.56) is dominated by ∥fˆν ∥2L2 , so ∫ |fˆ(ξ)|2 dξ ≤ lim inf ∥fˆν ∥2L2 ν→∞
(7.2.57)
|ξ|≤R
= lim inf ∥fν ∥2L2 =
ν→∞ ∥f ∥2L2 ,
the second line by (7.2.54) and the third by (7.2.53). Since (7.2.57) holds for all R < ∞, we have at this point that (7.2.58) f ∈ R(Rn ) =⇒ ∥fˆ∥L2 ≤ ∥f ∥L2 . To proceed, we apply this implication to gν = f − fν , obtaining ∥ˆ gν ∥L2 ≤ ∥gν ∥L2 , i.e., (7.2.59) ∥fˆ − fˆν ∥L2 ≤ ∥f − fν ∥L2 . By (7.2.53), ∥f − fν ∥L2 → 0 as ν → ∞, so (7.2.60)
∥fˆ − fˆν ∥L2 −→ 0.
Hence (7.2.61)
∥fˆ∥L2 = lim ∥fˆν ∥L2 . ν→∞
Applying (7.2.54) and (7.2.53) again, we get (7.2.62)
f ∈ R(Rn ) =⇒ ∥fˆ∥L2 = ∥f ∥L2 ,
as desired. The same sort of argument applies to F ∗ f . Proposition 7.2.5 is proved. This L2 material sets us up for another type of Fourier inversion formula. To state it, we bring in the following family of linear transformations. For R ∈ (0, ∞) and f ∈ R(Rn ), set ∫ fˆ(ξ)eix·ξ dξ. (7.2.63) SR f (x) = (2π)−n/2 |ξ|≤R
Equivalently, (7.2.64)
SR f = F ∗ (χR fˆ),
7.2. The Fourier transform
321
with χR as in (7.2.51), but this time x ∈ Rn . Note that (7.2.65) f ∈ R(Rn ) =⇒ fˆ bounded and continuous (and square integrable) =⇒ χR f ∈ R(Rn ) =⇒ F ∗ (χR fˆ) bounded and continuous, and square integrable. We also have (7.2.66)
∥χR fˆ − fˆ∥L2 −→ 0 as R → ∞.
This suggests the following result, our second Fourier inversion formula. Proposition 7.2.7. Given f ∈ R(Rn ), (7.2.67)
∥SR f − f ∥L2 −→ 0 as R → ∞.
It is convenient to approach this via a lemma. Lemma 7.2.8. If f ∈ A(Rn ), then (7.2.67) holds. Proof. We already know that χR fˆ ∈ R(Rn ). If f ∈ A(Rn ), then also fˆ ∈ R(Rn ), so Proposition 7.2.5 applies to χR fˆ − fˆ, yielding (7.2.68)
∥SR f − f ∥L2 = ∥F ∗ (χR fˆ − fˆ)∥L2 = ∥χR fˆ − fˆ∥L2 → 0.
To prove Proposition 7.2.7, we use Lemma 7.2.6 to obtain fν ∈ A(Rn ) such that ∥f − fν ∥L2 → 0. Note also that, by (7.2.65), (7.2.69)
f ∈ R(Rn ) ⇒ ∥SR f ∥L2 = ∥χR fˆ∥L2 ≤ ∥fˆ∥L2 = ∥f ∥L2 ,
for all R ∈ (0, ∞). Now we can write (7.2.70)
SR f − f = (SR f − SR fν ) + (SR fν − fν ) + (fν − f ),
and hence (7.2.71)
∥SR f − f ∥L2 ≤ ∥SR (f − fν )∥L2 + ∥SR fν − fν ∥L2 + ∥f − fν ∥L2 ≤ ∥SR fν − fν ∥L2 + 2∥f − fν ∥L2 ,
the last inequality by (7.2.69), with f replaced by f − fν . Consequently, by Lemma 7.2.8, for each f ∈ R(Rn ), (7.2.72)
lim sup ∥SR f − f ∥L2 ≤ 2∥f − fν ∥L2 ,
∀ ν,
R→∞
and taking ν → ∞ gives (7.2.67). Proposition 7.2.7 is proved. We next use the Plancherel theorem to establish the following sharpening of (7.2.41). See Propositions 7.2.15–7.2.16 for a further improvement. Proposition 7.2.9. We have (7.2.73)
k>
n =⇒ Cck (Rn ) ⊂ A(Rn ). 2
322
7. Fourier analysis
Proof. Given f ∈ Cck (Rn ), we again use the identity (7.2.39). The Plancherel identity then gives ∫ (7.2.74) |ξ α fˆ(ξ)|2 dξ = ∥f (α) ∥2L2 , ∀ |α| ≤ k. Rn
Summing over |α| ≤ k gives ∫ ∑ (7.2.75) (1 + |ξ|)2k |fˆ(ξ)|2 dξ ≤ C ∥f (α) ∥2L2 . |α|≤k
Rn
Now Cauchy’s inequality gives ∫ ∫ ˆ |f (ξ)| dξ = (1 + |ξ|)−k (1 + |ξ|)k |fˆ(ξ)| dξ (7.2.76)
Rn
Rn 1/2
≤ An,k
{∫
(1 + |ξ|)2k |fˆ(ξ)|2 dξ
}1/2 ,
Rn
where (7.2.77)
∫ An,k =
(1 + |ξ|)−2k dξ < ∞, if 2k > n.
Rn
This establishes the desired finiteness of
∫ Rn
|fˆ(ξ)| dξ.
Having seen important roles played by L1 and L2 norms and L2 inner products, we are motivated to advertise how Fourier analysis is a natural setting in which to work with spaces of functions larger than R(Rn ) or R# (Rn ), spaces that are labeled L1 (Rn ) and L2 (Rn ). These are defined using the Lebesgue theory of integration. We give a brief description of this, refering the reader to other sources, such as [17] or [47], for a detailed presentation. To start, we say a set S ⊂ Rn is measurable provided that, for each cell R ⊂ Rn , (7.2.78)
m∗ (S ∩ R) + m∗ (R \ S) = V (R),
where m∗ is the outer measure, defined in (3.1.124). We say a function f : Rn → C is measurable provided that, for each open O ⊂ C, f −1 (O) ⊂ Rn is measurable. The Lebesgue integral associates a value in [0, +∞] to ∫ (7.2.79) f (x) dx Rn
for each measurable f satisfying f (x) ≥ 0 for all x. The space L1 (Rn ) consists of all measurable functions f such that ∫ (7.2.80) ∥f ∥L1 = |f (x)| dx < ∞. Rn
In such a case, one can write f = f0+ − f0− + i(f1+ − f1− ) with all fj± measurable and ≥ 0, and with finite integral, and the process alluded to above applies to ∫ evaluate these integrals, and hence to evaluate Rn f (x) dx. There is one further wrinkle. The space L1 (Rn ) actually consists of equivalence classes of measurable
7.2. The Fourier transform
323
functions satisfying (7.2.80), where the equivalence is f1 ∼ f2 if and only if {x ∈ Rn : f1 (x) ̸= f2 (x)} has outer measure 0. This makes L1 (Rn ) a normed space. More generally, for p ∈ [1, ∞), Lp (Rn ) consists of equivalence classes of measurable functions for which ∫ p (7.2.81) ∥f ∥Lp = |f (x)|p dx < ∞. Rn
The only case of p > 1 that we work with here is p = 2. One has f, g ∈ L2 (Rn ) =⇒ f g ∈ L1 (Rn ),
(7.2.82)
and L2 (Rn ) is an inner product space, via (7.2.43), extended to this setting. These Lp norms satisfy the triangle inequality: ∥f + g∥Lp ≤ ∥f ∥Lp + ∥g∥Lp .
(7.2.83)
For p = 1 or 2, the proofs of (7.2.83) are as described before. For other p ∈ (1, ∞), which we do not deal with here, the reader can consult the references mentioned above. Thanks to (7.2.83), Lp (Rn ) has the structure of a metric space, with d(f, g) = ∥f − g∥Lp . We say fν → f in Lp if ∥fν − f ∥Lp → 0. For all p ∈ [1, ∞), these spaces have the following important metric properties. The first is a denseness property. Proposition A. Given f ∈ Lp (Rn ) and k ∈ N, there exist fν ∈ Cck (Rn ) such that fν → f in Lp . The next is a completeness property. Proposition B. If (fν ) is a Cauchy sequence in Lp (Rn ), then there exists f ∈ Lp (Rn ) such that fν → f in Lp . We refer to [17] or [47] for proofs of these results. We mention that if f is bounded and continuous on Rn , then f ∈ L1 (Rn ) if and only if f ∈ R(Rn ). Hence (7.2.38) is equivalent to (7.2.84) A(Rn ) = {f ∈ L1 (Rn ) : f is bounded and continuous, and fˆ ∈ L1 (Rn )}. We also have A(Rn ) ⊂ L2 (Rn ),
(7.2.85) either by (7.2.44) or by (7.2.86)
( ) ∥f ∥2L2 ≤ sup |f | ∥f ∥L1 ≤ (2π)−n/2 ∥fˆ∥L1 ∥f ∥L1 .
The following neat extensions of Propositions 7.2.5 and 7.2.7 illustrate the usefulness of the Lebesgue theory of integration in Fourier analysis. Proposition 7.2.10. The maps F and F ∗ have unique continuous linear extensions from (7.2.87)
F, F ∗ : A(Rn ) −→ A(Rn )
324
7. Fourier analysis
to F, F ∗ : L2 (Rn ) −→ L2 (Rn ),
(7.2.88) and the identities
F ∗ Ff = f,
(7.2.89)
FF ∗ f = f,
and ∥Ff ∥L2 = ∥F ∗ f ∥L2 = ∥f ∥L2
(7.2.90) hold for all f ∈ L2 (Rn ).
Proposition 7.2.11. Define SR by (7.2.91)
SR f (x) = (2π)
−n/2
∫ fˆ(ξ)eix·ξ dξ.
|ξ|≤R
Then SR : L2 (Rn ) → L2 (Rn ), and (7.2.92)
f ∈ L2 (Rn ) =⇒ SR f → f in L2 (Rn ),
as R → ∞. Proposition 7.2.10 can be proven using Propositions A and B (with p = 2) and the inclusion (7.2.41), which, together with Proposition A, implies that (7.2.93)
given f ∈ L2 (Rn ), there exist fν ∈ A(Rn ) such that fν → f in L2 .
The argument goes like this. Given f ∈ L2 (Rn ), take fν as in (7.2.93). Then ∥fµ − fν ∥L2 → 0 as µ, ν → ∞. Now (6.2.50), applied to fµ − fν ∈ A(Rn ), gives (7.2.94)
∥Ffµ − Ffν ∥L2 = ∥fµ − fν ∥L2 → 0,
as µ, ν → ∞. Hence (Ffµ ) is a Cauchy sequence in L2 (Rn ). By Proposition B, there exits a limit h ∈ L2 (Rn ), that is, Ffν → h in L2 . One gets the same element h regardless of the choice of (fν ) such that (7.2.93) holds, and so we set Ff = h. The same argument applies to F ∗ fν , which hence converges to F ∗ f . We have (7.2.95)
∥Ffν − Ff ∥L2 , ∥F ∗ fν − F ∗ f ∥L2 → 0.
From here, the results (7.2.89)–(7.2.90) follow. As for Proposition 7.2.11, we have (7.2.96)
SR f = F ∗ (χR Ff ),
and, thanks to Proposition 7.2.10, f ∈ L2 (Rn ) =⇒ F f ∈ L2 (Rn ) (7.2.97)
=⇒ χR Ff → F f in L2 (Rn ) =⇒ F ∗ (χR Ff ) → F ∗ Ff in L2 (Rn ),
and finally, again by Proposition 7.2.10, F ∗ Ff = f . We now discuss another way of taking Fourier analysis beyond the setting of A(Rn ), which is a normed linear space, with norm (7.2.98)
∥f ∥A = ∥f ∥L1 + ∥fˆ∥L1 ,
7.2. The Fourier transform
325
for which we have ∥Ff ∥A = ∥F ∗ f ∥A = ∥f ∥A ,
(7.2.99)
for all f ∈ A(Rn ), thanks to Proposition 7.2.4 and the fact that F ∗ f (ξ) = Ff (−ξ). We define the dual space A′ (Rn ) to consist of continuous linear functionals w : A(Rn ) −→ C,
(7.2.100)
i.e., those linear maps w satisfying |w(f )| ≤ C∥f ∥A ,
(7.2.101)
∀ f ∈ A(Rn ).
The optimal constant in (7.2.101) is denoted ∥w∥A′ . We also use the notation (7.2.102)
⟨f, w⟩ = w(f ),
f ∈ A(Rn ), w ∈ A′ (Rn ).
Then, we define F, F ∗ : A′ (Rn ) −→ A′ (Rn )
(7.2.103) by (7.2.104)
⟨f, F ∗ w⟩ = ⟨F ∗ f, w⟩.
⟨f, Fw⟩ = ⟨Ff, w⟩,
Note that (7.2.105)
|⟨f, Fw⟩| = |⟨Ff, w⟩| ≤ ∥w∥A′ ∥Ff ∥A = ∥w∥A′ ∥f ∥A ,
so ∥Fw∥A′ ≤ ∥w∥A′ .
(7.2.106)
We also have the Fourier inversion formula on A′ (Rn ): F ∗ Fw = FF ∗ w = w,
(7.2.107)
∀ w ∈ A′ (Rn ).
Indeed, given f ∈ A(Rn ), w ∈ A′ (Rn ), (7.2.108)
⟨f, F ∗ Fw⟩ = ⟨F ∗ f, Fw⟩ = ⟨F ∗ Ff, w⟩ = ⟨f, w⟩,
the last identity by Proposition 7.2.4. The proof that FF ∗ w = w is similar. Incidentally, this Fourier inversion formula combines with (7.2.106) to yield (7.2.109)
∥Fw∥A′ = ∥w∥A′ .
The space A′ (Rn ) contains an interesting variety of functions and other objects. For example, there is a natural map (7.2.110)
ι : BC(Rn ) −→ A′ (Rn ),
where BC(Rn ) consists of the set of bounded continuous functions on Rn , with norm (7.2.111)
∥w∥BC = sup |w|.
The map is given by (7.2.112)
∫ ⟨f, ι(w)⟩ =
f (x)w(x) dx. Rn
Clearly (7.2.113)
|⟨f, ι(w)⟩| ≤ (sup |w|)∥f ∥L1 .
326
7. Fourier analysis
We also claim that the map ι in (7.2.110) is injective, i.e, w ∈ BC(Rn ), ⟨f, ι(w)⟩ = 0 for all f ∈ A(Rn ) implies w = 0. This is a consequence of the following more general result, whose proof we leave to the reader. ∫ Lemma 7.2.12. If w ∈ BC(Rn ) and f (x)w(x) dx = 0 for all f ∈ Cc∞ (Rn ), then w ≡ 0. The space A′ (Rn ) also contains some objects more singular than functions. For example, given p ∈ Rn , we define δp ∈ A′ (Rn ) by ⟨f, δp ⟩ = f (p).
(7.2.114) For p = 0, we simply set δ = δ0 .
Let us compute some Fourier transforms, and observe the Fourier inversion formula in action. We have ∫ (7.2.115) ⟨f, Fδ⟩ = ⟨Ff, δ⟩ = fˆ(0) = (2π)−n/2 f (x) dx, so (7.2.116)
Fδ(ξ) = (2π)−n/2 ,
a constant function. By comparison, ⟨f, F ∗ 1⟩ = ⟨F ∗ f, 1⟩ = (7.2.117)
∫
F ∗ f (ξ) dξ
= (2π)n/2 FF ∗ f (0) = (2π)n/2 f (0),
the last identity by Proposition 7.2.4. Hence (7.2.118)
F ∗ 1 = (2π)n/2 δ.
Thus (7.2.116) and (7.2.118) illustrate (7.2.107). Generalizing the construction of δp , let M ⊂ Rn be a compact, m-dimensional, C surface, and u ∈ C(M ). Define uδM ∈ A′ (Rn ) by ∫ (7.2.119) ⟨f, uδM ⟩ = f (x)u(x) dS(x). 1
M
Then ⟨f, F(uδM )⟩ = ⟨Ff, uδM ⟩ ∫ = fˆ(ξ)u(ξ) dS(ξ) (7.2.120)
M −n/2
∫ (∫
= (2π)
M
) f (x)e−ix·ξ dx u(ξ) dS(ξ).
Rn
Now covering M with coordinate charts and chopping u into pieces supported on coordinate patches, we can repeatedly apply the Fubini theorem to write ∫ (∫ ) (7.2.121) ⟨f, F(uδM )⟩ = (2π)−n/2 u(ξ)e−ix·ξ dS(ξ) f (x) dx, Rn
M
7.2. The Fourier transform
327
so F(uδM ) is (the image under ι in (7.2.110) of) an element of BC(Rn ), given by ∫ (7.2.122) F(uδM )(x) = (2π)−n/2 u(ξ)e−x·ξ dS(ξ). M
In particular, −n/2
∫
e−ix·ξ dS(ξ).
FδM (x) = (2π)
(7.2.123)
M
Tempered distributions Objects such as δp and δM are examples of “distributions.” A beautiful theory of Fourier analysis on the space of “tempered distributions,” which includes some more singular objects, was constructed by L. Schwartz. We present some of his results here. We start with the space S(Rn ), defined as (7.2.124)
S(Rn ) = {f ∈ C ∞ (Rn ) : xβ f (α) bounded, ∀ α, β}.
By this definition, we clearly have f ∈ S(Rn ) =⇒ xβ f (α) ∈ S(Rn ),
∀ α, β.
Using the integrability of (1 + |x|)−(n+1) on Rn , one can show that (7.2.125)
f ∈ S(Rn ) =⇒ xβ f (α) ∈ R(Rn ), ∀ α, β.
Also, the integration by parts argument in (7.2.39) extends: ∫ (7.2.126) f ∈ S(Rn ) ⇒ (2π)−n/2 f (α) (x)e−ix·ξ dx = (iξ)α fˆ(ξ), ∀ α. Rn
In particular, f ∈ S(R ) ⇒ fˆ ∈ R(Rn ), so n
S(Rn ) ⊂ A(Rn ).
(7.2.127) Thus, by Proposition 7.2.4, (7.2.128)
f ∈ S(Rn ) =⇒ f = F ∗ Ff = FF ∗ f.
We can also complement (7.2.126) by (7.2.129)
f ∈ S(Rn ) ⇒ (2π)−n/2
∫
xβ f (x)e−ix·ξ dx = i|β| ∂ξβ fˆ(ξ),
Rn
and deduce that (7.2.130)
F, F ∗ : S(Rn ) −→ S(Rn ).
By (7.2.128), the operators F and F ∗ are inverses of each other on S(Rn ). The space S(Rn ) carries the following sequence of norms: (7.2.131)
pk (f ) =
max
sup |xβ f (α) (x)|.
|α|+β|≤k x∈Rn
One says a linear map T : S(Rn ) → S(Rn ) is continuous provided that, for each k ∈ Z+ , there exists ℓ ∈ Z+ and Ck < ∞ such that (7.2.132)
pk (T f ) ≤ Ck pℓ (f ),
∀ f ∈ S(Rn ).
328
7. Fourier analysis
The observations leading to (7.2.125) and (7.2.129) show that pk (Ff ) ≤ Ck pk+n+1 (f ),
(7.2.133)
∀ f ∈ S(Rn ),
so F in (7.2.130) is continuous. The same goes for F ∗ . Now a tempered distribution w ∈ S ′ (Rn ) is a continuous linear functional w : S(Rn ) −→ C,
(7.2.134)
that is to say, w is a linear map from S(Rn ) to C with the property that there exists k ∈ Z+ and C < ∞ such that |w(f )| ≤ Cpk (f ),
(7.2.135)
∀ f ∈ S(Rn ).
As in (7.2.102), we use the notation ⟨f, w⟩ = w(f ),
(7.2.136)
f ∈ S(Rn ), w ∈ S ′ (Rn ).
Analysis behind (7.2.127) gives ∥f ∥A ≤ Cp2n+2 (f ),
(7.2.137) ′
so each w ∈ A (R ) also defines an element of S ′ (Rn ). Thus δp in (7.2.114) and δM in (7.2.119) are examples of tempered distributions. To produce more singular tempered distributions, we can define n
∂ α : S ′ (Rn ) −→ S ′ (Rn )
(7.2.138) by
⟨f, ∂ α w⟩ = (−1)|α| ⟨f (α) , w⟩.
(7.2.139)
In this way, we get, for example, δ ′ ∈ S ′ (R). We can also define xβ w for w ∈ S ′ (Rn ) by ⟨f, xβ w⟩ = ⟨xβ f, w⟩.
(7.2.140) We now define (7.2.141)
F, F ∗ : S ′ (Rn ) −→ S ′ (Rn )
by (7.2.142)
⟨f, Fw⟩ = ⟨Ff, w⟩,
⟨f, F ∗ w⟩ = ⟨F ∗ f, w⟩,
for f ∈ S(Rn ), w ∈ S ′ (Rn ). Note that, given w ∈ S ′ (Rn ), (7.2.143)
|⟨f, w⟩| ≤ Cpk (f ),
∀ f ∈ S(Rn )
⇒ |⟨f, Fw⟩| ≤ Cpk (Ff ) ≤ CCk pk+n+1 (f ),
by (7.2.133), so indeed if w ∈ S ′ (Rn ), (17.140) defines Fw as an element of S ′ (Rn ). The same goes for F ∗ w. Here is a further extension of the Fourier inversion formula. Proposition 7.2.13. Given w ∈ S ′ (Rn ), (7.2.144)
F ∗ Fw = FF ∗ w = w.
Proof. Parallel to (7.2.108), we have, for f ∈ S(Rn ), w ∈ S ′ (Rn ), (7.2.145)
⟨f, F ∗ Fw⟩ = ⟨F ∗ f, Fw⟩ = ⟨FF ∗ f, w⟩ = ⟨f, w⟩,
the last identity by (7.2.128). The same goes for ⟨f, FF ∗ w⟩.
7.2. The Fourier transform
329
We can extend (7.2.126) and (7.2.129) from S(Rn ) to S ′ (Rn ), using (7.2.139)– (7.2.140) and an argument parallel to (17.143), to obtain: Proposition 7.2.14. Given w ∈ S ′ (Rn ), F(∂ α w) = (iξ)α Fw,
(7.2.146) and
F(xβ w) = i|β| ∂ξβ Fw,
(7.2.147)
with similar formulas involving F ∗ . For example, with δ as in (7.2.114) (p = 0), and n = 1, we have from (7.2.116) and (7.2.146) Fδ ′ (ξ) = (2π)−1/2 iξ.
(7.2.148)
More general sufficient condition for f ∈ A(Rn ) We return to the task of identifying elements of A(Rn ), and establish a result substantially sharper than Proposition 7.2.9. We mention that an analogous result holds for Fourier series. The interested reader can investigate this. To set things up, given f ∈ R(Rn ), let (7.2.149)
fh (x) = f (x + h).
Here is our result in the case n = 1. Proposition 7.2.15. If f ∈ R(R) and there exists C < ∞ such that (7.2.150)
∥f − fh ∥L2 ≤ C|h|α ,
for |h| ≤ 1,
with (7.2.151)
α>
1 , 2
then f ∈ A(R). Proof. A calculation gives fˆh (ξ) = eihξ fˆ(ξ),
(7.2.152) so, by the Plancherel identity, (7.2.153)
∥f − fh ∥2L2 =
∫
∞
−∞
|1 − eihξ |2 |fˆ(ξ)|2 dξ.
Now, (7.2.154)
π 3π ≤ |hξ| ≤ =⇒ |1 − eihξ |2 ≥ 2, 2 2
so (7.2.155)
∫ ∥f −
fh ∥2L2
|fˆ(ξ)|2 dξ.
≥2 π 3π 2 ≤|hξ|≤ 2
330
7. Fourier analysis
If (7.2.150) holds, we deduce that, for 0 < |h| ≤ 1, ∫ (7.2.156) |fˆ(ξ)|2 dξ ≤ C|h|2α , 2 4 ≤|ξ|≤ |h| |h|
hence (setting |h| = 2−ℓ+1 ), for ℓ ≥ 1, ∫ (7.2.157) |fˆ(ξ)|2 dξ ≤ C2−2αℓ . 2ℓ ≤|ξ|≤2ℓ+1
Cauchy’s inequality gives ∫ |fˆ(ξ)| dξ 2ℓ ≤|ξ|≤2ℓ+1
(7.2.158)
≤
∫
{
|fˆ(ξ)|2 dξ
}1/2
×
≤ C2
·2
}1/2 1 dξ
2ℓ ≤|ξ|≤2ℓ+1
2ℓ ≤|ξ|≤2ℓ+1 −αℓ
∫
{
ℓ/2
= C2−(α−1/2)ℓ . Summing over ℓ ∈ N and using (again by Cauchy’s inequality) ∫ (7.2.159) |fˆ| dξ ≤ C∥fˆ∥L2 = C∥f ∥L2 , |ξ|≤2
then gives the proof. To see how close to sharp Proposition 7.2.15 is, consider (7.2.160)
f (x) = χI (x) = 1 0
if 0 ≤ x ≤ 1, otherwise.
We have, for |h| ≤ 1, ∥f − fh ∥2L2 = 2|h|,
(7.2.161)
so (7.2.150) holds, with α = 1/2. Since A(R) ⊂ C(R), this function does not belong to A(R), so the condition (7.2.151) is about as sharp as it could be. To produce the appropriate generalization to n variables, let us focus on (7.2.158), and note that when R is replaced by Rn , the integral of 1 dξ becomes ∼ 2nℓ , so to obtain the result ∫ ∑ (7.2.162) |fˆ(ξ)| dξ < ∞, ℓ≥1
we want
2ℓ ≤|ξ|≤2ℓ+1
∫
(7.2.163) 2ℓ ≤|ξ|≤2ℓ+1
|fˆ(ξ|2 dξ ≤ C2−2γℓ ,
γ>
n . 2
7.2. The Fourier transform
331
It is convenient to rewrite this as ∫ (7.2.164) |ξ|2k |fˆ(ξ)|2 dξ ≤ C2−2αℓ , 2ℓ ≤|ξ|≤2ℓ+1
where α > 0, if n = 2k, 1 α > , if n = 2k + 1. 2 Now we bring in Proposition 7.2.14. Assume (7.2.165)
∂ β f = fβ ∈ R(Rn ),
(7.2.166)
for |β| ≤ k,
where a priori ∂ f is defined as an element of S ′ (Rn ), by (7.2.139). Then calculations parallel to (7.2.152)–(7.2.157), applied to fβ in place of f , show that, if β
(7.2.167)
∥fβ − (fβ )h ∥L2 ≤ C|h|α ,
then
∫
(7.2.168)
for |β| ≤ k,
|ξ β fˆ(ξ)|2 dξ ≤ C2−αℓ ,
2ℓ ≤|ξ|≤2ℓ+1
for ℓ ≥ 1. Summing over |β| ≤ k then yields (7.2.164). We hence have the following higher dimensional extension of Proposition 7.2.15. Proposition 7.2.16. Assume n = 2k or n = 2k + 1. Take f ∈ R(Rn ) and assume that (7.2.169)
∂ β f = fβ ∈ R(Rn ),
for |β| ≤ k,
and that, for each such β, (7.2.170)
∥fβ − (fβ )h ∥L2 ≤ C|h|α ,
for |h| ≤ 1,
where α satisfies (7.2.165). Then f ∈ A(R ). n
We mention a result that refines Proposition 7.2.16 when n > 1. To state it, we bring in the difference operators (7.2.171)
∆j,ε f (x) = ε−1 (f (x + εej ) − f (x)),
1 n ∆βε = ∆β1,ε · · · ∆βn,ε ,
where {e1 , . . . , en } is the standard basis of Rn . Here is the result. Proposition 7.2.17. Assume n = 2k or n = 2k + 1. Take f ∈ R(Rn ) and assume that there exists C < ∞ such that, for |β| ≤ k, (7.2.172)
∥∆βε f − (∆βε f )h ∥L2 ≤ C|h|α ,
for |h| ≤ 1, 0 < ε ≤ 1,
where α satisfies (7.2.165). Then f ∈ A(R ). n
We will not present a proof of Proposition 7.2.17, but leave this as a challenge to the ambitious reader.
332
7. Fourier analysis
Exercises 1. Let f : Rn → C satisfy |f (α) (x)| ≤ C(1 + |x|)−(n+1) for |α| ≤ n + 1. Show that |fˆ(ξ)| ≤ C(1 + |ξ|)−(n+1) . Deduce that f ∈ A(Rn ). 2. Sharpen the result of Exercise 1 as follows. Assume f satisfies [n] |f (α) (x)| ≤ C(1 + |x|)−(n+1) for |α| ≤ + 1. 2 Then show that f ∈ A(Rn ). 3. Take n = 1. For each of the following functions f : R → C, compute fˆ(ξ). e−|x| , 1 , 1 + x2 χ[−1,1] (x),
(7.2.173)
f (x)
=
(7.2.174)
f (x)
=
(7.2.175)
f (x)
=
(7.2.176)
f (x)
= (1 − |x|)χ[−1,1] (x).
4. In each case of Exercise 3, record the identity that follows from the Plancherel identity (7.2.50), established in Proposition 7.2.5. 5. Define fr ∈ C(R) by fr (x) = (1 − x2 )r 0
for |x| ≤ 1, for |x| > 1.
Show that fr ∈ A(R) for each r > 0, as a consequence of Proposition 7.2.15. What is the best conclusion one could draw from Proposition 7.2.9? The next exercises bear on the function ∫ Φn (x) = eix·ξ dS(ξ), S n−1
6. Show that (a) Φn ∈ C ∞ (Rn ). (b) (∆ + 1)Φn = 0 on Rn . (c) Φn is radial, i.e., Φn (x) = φn (|x|).
x ∈ Rn .
Exercises
333
7. Deduce from Exercise 6 and the formula for ∆ in spherical polar coordinates that n−1 ′ φ′′n (s) + φn (s) + φn (s) = 0. s Note that φn (s) = Φn (se1 ) is a smooth, even function of s. 8. Convert the calculation
∫ eisξ1 dS(ξ)
φn (s) =
=
S n−1 ∞ ∑ k=0
=
ξ1k dS(ξ) S n−1
∞ ∑ (−1)ℓ ℓ=0
into a formula for
∫
(is)k k!
(2ℓ)!
∫
2ℓ
ξ12ℓ dS(ξ)
s
S n−1
∫ ℓ ∈ N.
ξ12ℓ dS(ξ), S n−1
9. More generally, produce a formula for ∫ ξ α dS(ξ), α = (α1 , . . . , αn ), αj ∈ Z+ , S n−1 (α)
in terms of Φn (0). 10. In the spirit of Exercises 8–9, use e
−|x|2 /4
=π
−n/2
∫
e−|ξ| eix·ξ dξ 2
Rn
to produce a formula for
∫
ξ α e−|ξ| dξ 2
Rn
in terms of derivatives of e−|x|
2
/4
at x = 0.
11. Show that |x|2−n defines an element of S ′ (Rn ), and ∆(|x|2−n ) = Cn δ on Rn ,
Cn = −(n − 2)An−1 ,
for n ≥ 3. Similarly, show that ∆(log |x|) = 2πδ on R2 . Hint. Check Exercises 13–14 of §4.4.
334
7. Fourier analysis
7.3. Poisson summation formulas Comparing Fourier transforms of functions on Rn with Fourier series of related functions on Tn leads to highly nontrivial identities, known as Poisson summation formulas. We derive some of them here. To start, we take f ∈ S(Rn ),
(7.3.1)
φ(x) =
∑
f (x + 2πℓ).
ℓ∈Zn
We have (7.3.2)
φ ∈ C ∞ (Rn ),
φ(x) = φ(x + 2πk), ∀ k ∈ Zn ,
hence (with slight abuse of notation) φ ∈ C ∞ (Tn ),
(7.3.3)
Tn = Rn /(2πZn ).
We next observe that ∫ ∫ (7.3.4) φ(x)e−ik·x dx = f (x)e−ik·x dx, Tn
∀ k ∈ Zn .
Rn
Consequently, with φ(k) ˆ defined as in (7.1.1) and fˆ(ξ) defined as in (7.2.1), φ(k) ˆ = (2π)−n/2 fˆ(k),
(7.3.5)
∀ k ∈ Zn .
Now the Fourier inversion formula, in the form of Proposition 7.1.1, applies to φ: ∑ ik·x (7.3.6) φ(x) = ˆ , ∀ x ∈ Tn . φ(k)e k∈Zn
Putting this together with (7.3.1) and (7.3.5) gives the following general Poisson summation formula. Proposition 7.3.1. Given f ∈ S(Rn ), we have, for each x ∈ Rn , ∑ ∑ (7.3.7) f (x + 2πℓ) = (2π)−n/2 fˆ(k)eik·x . ℓ∈Zn
In particular,
k∈Zn
∑
(7.3.8)
∑
f (2πℓ) = (2π)−n/2
ℓ∈Zn
fˆ(k).
k∈Zn
We can apply this to f (x) = e−t|x| , 2
(7.3.9)
and use (7.2.16) to evaluate fˆ(k). This leads to ∑ ∑ 2 2 2 e−|k| /4t , e−4π t|ℓ| = (4πt)−n/2 (7.3.10) ℓ∈Zn
Taking τ = 4πt, we can rewrite this as ∑ ∑ 2 2 e−π|k| /τ , e−πτ |ℓ| = τ −n/2 (7.3.11) ℓ∈Zn
t > 0.
k∈Zn
k∈Zn
This result is known as the Jacobi inversion formula.
τ > 0.
7.3. Poisson summation formulas
335
The Riemann functional equation The Riemann zeta function ζ(s) is defined by ∞ ∑ (7.3.12) ζ(s) = k −s , Re s > 1. k=1
This defines ζ(s) as a function holomorphic in {s ∈ C : Re s > 1}. See (5.1.58). Here we establish a formula of Riemann that extends ζ(s) beyond the half plane Re s > 1. To start the analysis, we relate ζ(s) to the function ∞ ∑ 2 g(t) = e−πk t . (7.3.13) k=1
We have
∫
∞
g(t)t (7.3.14)
s−1
dt =
0
∞ ∑
k
−2s −s
∫
π
∞
e−t ts−1 dt
0
k=1
= ζ(2s)π −s Γ(s), for Re s > 1/2. The Gamma function Γ(s) is as in (5.1.56). This gives rise to further identities, via the n = 1 case of the Jacobi inversion formula (7.3.11), i.e., √ ∞ ∞ ∑ 1 ∑ −πk2 /t −πℓ2 t e = (7.3.15) e , t ℓ=−∞
k=−∞
which implies
(1) 1 1 g(t) = − + t−1/2 + t−1/2 g . 2 2 t To use this, we first note from (7.3.14) that, for Re s > 1, ∫ ∞ (s) Γ π −s/2 ζ(s) = g(t)ts/2−1 dt 2 0 (7.3.17) ∫ 1 ∫ ∞ s/2−1 = g(t)t dt + g(t)ts/2−1 dt. (7.3.16)
0
1
Into the integral over [0, 1], we substitute the right side of (7.3.16) for g(t), to obtain ∫ 1( (s) ) 1 1 −s/2 Γ π ζ(s) = − + t−1/2 ts/2−1 dt 2 2 2 0 (7.3.18) ∫ ∞ ∫ 1 g(t−1 )ts/2−3/2 dt +
+ 0
g(t)ts/2−1 dt.
1
We evaluate the first integral on the right and replace t by 1/t in the second integral, to obtain, for Re s > 1, ∫ ∞ (s) [ s/2 ] 1 1 −s/2 (7.3.19) Γ π ζ(s) = − + t + t(1−s)/2 g(t)t−1 dt. 2 s−1 s 1 Note that g(t) ≤ Ce−πt for t ∈ [1, ∞), so the integral on the right side of (7.3.19) defines a function holomorphic for all s ∈ C. As seen in §5.1, Γ(z) is holomorphic
336
7. Fourier analysis
on C \ {0, −1, −2, −3, . . . }. Further results on the Gamma function include the following. Lemma 7.3.2. The function 1/Γ(z) extends to be holomorphic on all of C, with zeros at {0, −1, −2, −3, . . . }. We refer to [51], Chapter 4, for a proof. Given this, we have from (7.3.19) that ζ(s) extends to be holomorphic on C \ {1}. The formula (7.3.19) does more than establish such a holomorphic extension of the zeta function. Note that the right side of (7.3.19) is invariant under replacing s by 1 − s. Thus we have the following identity, known as Riemann’s functional equation: (s) (1 − s) π −s/2 ζ(s) = Γ π −(1−s)/2 ζ(1 − s). (7.3.20) Γ 2 2 The Riemann zeta function plays a central role in the study of prime numbers. Basic material on this can be found in Chapter 4 of [51], and a great deal more in [13]. Exercise 1. Show that the Poisson summation formula (7.3.8) applies when f ∈ A(Rn ) and φ ∈ A(Tn ), ∑ where, as in (7.3.1), φ(x) = ℓ f (x + 2πℓ).
7.4. Spherical harmonics One type of generalization of Fourier series on the circle S 1 ≈ T1 is Fourier series on the n-dimensional torus, treated in §7.1. Another, which we treat in this section, involves the unit sphere S n−1 ⊂ Rn , leading to what are called spherical harmonics. Our approach to this theory will emphasize contact with the following Dirichlet problem, for harmonic functions on the unit ball (7.4.1)
B n = {x ∈ Rn : |x| < 1},
∂B n = S n−1 . n
Namely, given f ∈ C(S n−1 ), we seek a function u ∈ C(B ) ∩ C 2 (B n ) satisfying (7.4.2) ∆u = 0 on B n , u n−1 = f. S
In case n = 2, connections between this problem and Fourier series are explored in Exercises 4–8 of §7.1. In particular, we have from (7.1.139)–(7.1.141) that, when n = 2, ∫ f (φ) 1 − r2 π dφ. (7.4.3) u(reiθ ) = 2π 1 − 2r cos(θ − φ) + r2 −π A change of variable gives, for x ∈ R2 , |x| < 1, ∫ f (y) 1 − |x|2 ds(y), (7.4.4) u(x) = 2π |x − y|2 S1
7.4. Spherical harmonics
337
where ds(y) denotes arc-length. Moving from dimension 2 to dimension n ≥ 3, we are motivated to try a formula for the solution to (7.4.2) of the form ∫ f (y) 2 (7.4.5) u(x) = Cn (1 − |x| ) dS(y). |x − y|n S n−1
We will show that this works, and along the way calculate the constant Cn . First we will show that, for each f ∈ C(S n−1 ), the function u is harmonic on B n . This is a consequence of the following. Lemma 7.4.1. For a given y ∈ S n−1 (i.e., |y| = 1), set v(x) = (1 − |x|2 )|x − y|−n .
(7.4.6)
Then v is harmonic on Rn \ {y}. Proof. It suffices to show that w(x) = v(x + y) is harmonic on Rn \ 0. Since 1 − |x + y|2 = −2(x · y + |x|2 ) provided |y| = 1, we have (7.4.7)
−w(x) = 2(y · x)|x|−n + |x|2−n .
That |x|2−n is harmonic on Rn \ 0, we have already seen, as a consequence of the formula for ∆ in polar coordinates (cf. (5.1.59)), which yields n−1 ′ φ (r). r Now applying ∂/∂xj to a smooth harmonic function on an open set in Rn gives another, so the following are harmonic on Rn \ 0: (7.4.8)
(7.4.9)
g(x) = φ(r) =⇒ ∆g = φ′′ (r) +
wj (x) =
∂ |x|2−n = (2 − n)xj |x|−n . ∂xj
For n = 2 we take instead (7.4.10)
∂ log |x| = xj |x|−2 . ∂xj
Thus the first term on the right side of (7.4.7) is a linear combination of these functions, so the lemma is proved. To justify (7.4.5), it remains to show that if u is given by this formula, and Cn is chosen correctly, then u = f on S n−1 . Note that if we write x = rω, ω ∈ S n−1 , then (7.4.5) yields ∫ (7.4.11) u(rω) = p(r, ω, y)f (y) dS(y), S n−1
where (7.4.12)
p(r, ω, y) = Cn (1 − r2 )|rω − y|−n .
It is clear that (7.4.13)
p(r, ω, y) → 0 as r ↗ 1, if ω ̸= y,
338
7. Fourier analysis
with uniform convergence on each compact subset of {(ω, y) ∈ S n−1 × S n−1 : ω ̸= y}. We claim that ∫ (7.4.14) p(r, ω, y) dS(y) = Cn′ , S n−1
a constant independent of r and ω. The independence of ω follows by rotational symmetry. Thus we can integrate with respect to ω. But Lemma 7.4.1 implies that (7.4.15)
p(r, x, y) = Cn (1 − r2 |x|2 )|rx − y|−n
is harmonic in x, for |x| < 1/r, so the mean value property for harmonic functions gives ∫ 1 (7.4.16) p(r, ω, y) dS(ω) = Cn An−1 S n−1
for all r < 1, y ∈ S n−1 . This implies (7.4.14), with Cn′ = Cn An−1 . Thus, in view of (7.4.13), p(r, ω, y) is highly peaked near ω = y ∈ S n−1 as r ↗ 1, and an argument parallel to that used in the proof of Proposition 7.2.2 (see also Exercise 9 of §7.1) shows that the limit of (7.4.11) as r ↗ 1 is equal to Cn An−1 f (ω), for each f ∈ C(S n−1 ). This justifies the formula (7.4.5) and fixes the constant: Cn = 1/An−1 . We have proved most of the following. n
Proposition 7.4.2. Given f ∈ C(S n−1 ), the solution in C(B )∩C 2 (B n ) to (7.4.2) is given by the Poisson integral formula ∫ f (y) 1 − |x|2 dS(y). (7.4.17) u(x) = An−1 |x − y|n S n−1
Furthermore this solution is unique. Proof. It remains to establish uniqueness. In fact, the difference v of two such n solutions would be harmonic on B n , belong to C(B ), and vanish on ∂B n . Hence the maximum principle, from Proposition 5.1.7, yields v ≡ 0 on B n . Another way to write the conclusion (7.4.17) of Proposition 7.4.2 is ∫ 1 − r2 f (y) (7.4.18) u(rω) = dS(y), An−1 (1 − 2rω · y + r2 )n/2 S n−1
for ω ∈ S n−1 , 0 ≤ r < 1. The next step in the extension of results connecting Fourier series and harmonic functions in §7.1 brings in spherical harmonics. We look for harmonic functions in the form (7.4.19)
u(rω) = φ(r)g(ω).
To get this, we use the formula for ∆ in spherical polar coordinates, derived in (5.1.59): (7.4.20)
∆u(rω) =
n−1 ∂ 1 ∂2 u(rω) + u(rω) + 2 ∆S u(rω), ∂r2 r ∂r r
7.4. Spherical harmonics
339
where ∆S is the Laplace operator on the unit sphere S n−1 . In case u is given by (7.4.19), we have [ n−1 ′ ] 1 (7.4.21) ∆u(rω) = φ′′ (r) + φ (r) g(ω) + 2 φ(r)∆S g(ω). r r Thus (7.4.19) defines a harmonic function on B n \ 0 if and only if there exists a constant µ such that ∆S g = µg on S n−1 , and
(7.4.22)
′′
φ (r) +
(7.4.23)
n−1 ′ r φ (r)
+
µ r 2 φ(r)
= 0,
for 0 < r < 1. Note that (7.4.17) implies the harmonic function u is C ∞ on B n . Hence, in (7.4.19), we must have g ∈ C ∞ (S n−1 ). If g satisfies (7.4.22), we say g is an eigenfunction of ∆S , with eigenvalue µ. Note that ∫ ∫ 2 µ |g| dS = (∆S g)g dS S n−1
(7.4.24)
S n−1
∫
=−
| grad g|2 dS,
S n−1
so µ ≤ 0. Say µ = −λ . Now, the equation (7.4.23) is an Euler equation, whose solutions are linear combinations of 2
φ± (r) = rk± ,
(7.4.25)
where k = k± satisfies the equation (7.4.26)
k(k − 1) + (n − 1)k − λ2 = 0,
with roots
n − 2 1√ (n − 2)2 + 4λ2 . ± 2 2 The smoothness of u on B n requires that the exponent in (7.4.25) be positive, so we need n − 1 1√ (7.4.28) φ(r) = rk , k = − + (n − 2)2 + 4λ2 . 2 2 Furthermore, such smoothness requires (7.4.27)
(7.4.29)
k± = −
k ∈ Z+ = {0, 1, 2, 3, . . . }.
Then (7.4.27) yields the eigenvalue µ = −λ2k , with (7.4.30)
λ2k = k 2 + (n − 2)k ( n − 2 )2 ( n − 2 )2 = k+ − . 2 2
Let us set (7.4.31)
Vk = {g ∈ C ∞ (S n−1 ) : ∆S g = −λ2k g}.
We see that if k ∈ Z+ and g ∈ Vk , then u(rω) = rk g(ω) is harmonic on B n \ 0 and bounded. It follows that {0} is a removable singularity, and u extends to be harmonic on all of B n . (See §A.6 for this removable singularity theorem.) As seen above, this harmonic function u belongs to C ∞ (B n ). It is homogeneous of degree
340
7. Fourier analysis
k, so ∂xα u(x) is homogeneous of degree k − |α|. In particular, it is homogeneous of degree 0 for |α| = k. Being smooth, it must be constant, so |α| = k =⇒ ∂xα u = cα , const.
(7.4.32)
Hence u(x) is a polynomial in x, homogeneous of degree k. Let us set (7.4.33)
Hk = space of harmonic polynomials on Rn , homogeneous of degree k.
We have the following. Proposition 7.4.3. The map τ : Hk −→ Vk ,
(7.4.34)
τ u = u S n−1 ,
is an isomorphism. Next, we have the following important orthogonality result. Proposition 7.4.4. Assume gj ∈ Vj , gk ∈ Vk , j ̸= k. Then ∫ (7.4.35) (gj , gk )L2 (S n−1 ) = gj gk dS = 0. S n−1
Proof. By (4.4.26) (or (4.4.29), with M = S n−1 , so ∂M = ∅), for g, h ∈ C 2 (S n−1 ), (7.4.36)
(∆S g, h)L2 = (g, ∆S h)L2 .
Hence, for gj , gk as above −λ2j (gj , gk )L2 = (∆S gj , gk )L2 = (gj , ∆S gk )L2
(7.4.37)
= −λ2k (gj , gk )L2 . Since j ̸= k ⇒ λj ̸= λk , we have (7.4.35).
Second proof. Denote by gj , gk the corresponding elements of Hj , Hk , and apply (4.4.29), with M = B n , ∂M = S n−1 . Note that, due to the homogeneity, ∂gj = jgj , ∂ν
(7.4.38)
∂gk = kgk , ∂ν
so you get
on S n−1 ,
∫ (j − k)
(7.4.39)
gj g k dS = 0,
S n−1
which yields (7.4.35).
The following result provides very valuable information on the spaces Hk . Let (7.4.40)
Pk = space of polynomials on Rn , homogeneous of degree k.
7.4. Spherical harmonics
341
Proposition 7.4.5. For all k ∈ Z+ , we have the direct sum decomposition Pk = Hk ⊕ |x|2 Hk−2 ⊕ · · · ⊕ |x|2j Hk−2j ,
(7.4.41)
with k ∈ {2j, 2j + 1}. Proof. By Proposition 7.4.4, the various summands on the right side of (7.4.41) are mutually orthogonal with respect to the L2 -inner product on S n−1 , so the sum on the right is direct. It remains to do a dimension count. We do this by induction on k. Note that P0 = H0 and P1 = H1 , so (7.4.41) is clear for k = 0, 1. Now assume the analogue of (7.4.41) holds for Pℓ , for all ℓ < k. Given this, the right side of (7.4.41) is equal to Hk ⊕ |x|2 Pk−2 = Qk ,
(7.4.42)
and we need to show this has the same dimension as Pk . The direct sum condition in (7.4.42) implies dim Hk + dim Pk−2 = dim Qk ≤ dim Pk .
(7.4.43) Now consider
∆ : Pk −→ Pk−2 .
(7.4.44)
The null space is N (∆) = Hk , so the fundamental theorem of linear algebra implies dim Pk = dim N (∆) + dim R(∆)
(7.4.45)
≤ dim Hk + dim Pk−2 .
Comparison with (7.4.43) gives dim Pk = dim Qk , and finishes the proof. (It also shows that ∆ in (7.4.44) is surjective.) We can use Proposition 7.4.5, and its corollary Pk = Hk ⊕ |x|2 Pk−2 ,
(7.4.46)
to compute dim Hk (hence dim Vk ). To approach this, let us use the notation Pk (Rn ) for the space of polynomials on Rn homogeneous of degree k, and P k (Rn ) for the space of polynomials on Rn of degree ≤ k, and set dk (n) = dim Pk (Rn ).
(7.4.47) We see that (7.4.48)
dk (n) = dim P k (Rn−1 ) = dk (n − 1) + dk−1 (n − 1) + · · · + d0 (n − 1),
and similarly dk−1 (n) = dk−1 (n − 1) + · · · + d0 (n − 1), hence dk (n) − dk−1 (n) = dk (n − 1). Similarly dk−1 (n) − dk−2 (n) = dk−1 (n − 1), so dk (n) − dk−2 (n) = dk (n − 1) + dk−1 (n − 1).
342
7. Fourier analysis
Hence (7.4.46) gives (7.4.49)
dim Hk = dk (n − 1) + dk−1 (n − 1) = dim P k (Rn−2 ) + dim P k−1 (Rn−2 ),
and this is also equal to dim Vk . As one example, we see that On S 2 , dim Vk = 2k + 1,
(7.4.50)
since dim P k (R1 ) = k + 1. More generally, one can deduce inductively from (7.4.48) that ( ) k+n−1 n (7.4.51) dim Pk (R ) = , k and hence, (7.4.52)
On S
n−1
( ) ( ) k+n−2 k+n−3 , dim Vk = + . k k−1
Returning from dimension counts to other consequences of Proposition 7.4.5, we have the following important algebraic result. Proposition 7.4.6. Given gj ∈ Vj and gk ∈ Vk , the product satisfies gj gk ∈
(7.4.53)
j+k ⊕
Vℓ .
ℓ=0
Proof. The functions gj and gk extend to elements of Pj and Pk , so gj gk extends to an element of Pj+k . Then (7.4.53) follows from (7.4.41), applied to Pj+k . We hence get the following important density result. Proposition 7.4.7. The space (7.4.54)
V=
ℓ ∪⊕
Vk
ℓ>0 k=0
of finite linear combinations of eigenfunctions of ∆S is dense in C(S n−1 ). Proof. We see that (7.4.54) is an algebra of continuous functions on S n−1 . Clearly it separates points (since V1 does that) and it is closed under complex conjugates, so the Stone-Weierstrass theorem applies, to yield denseness in C(S n−1 ). Corollary 7.4.8. For each k ∈ Z+ , let (7.4.55)
{Ykℓ : ℓ ∈ Σk }
be an orthonormal basis of Vk , where Σk is an index set of cardinality dim Vk . Then (7.4.56)
{Ykℓ : k ≥ 0, ℓ ∈ Σk }
is an orthonormal set of functions on S n−1 whose linear span is dense in C(S n−1 ).
7.4. Spherical harmonics
343
These results give rise to constructions parallel to some done in §7.1. Given f ∈ R(S n−1 ), set ∫ (7.4.57) fˆ(k, ℓ) = f (y)Ykℓ (y) dS(y). S n−1
Then set (7.4.58)
Ek f (y) =
∑
fˆ(k, ℓ)Ykℓ (y),
ℓ∈Σk
yielding Ek : R(S n−1 ) −→ Vk .
(7.4.59)
The map Ek is an orthogonal projection of R(S n−1 ) onto Vk , satisfying (7.4.60)
Ek f = f,
∀ f ∈ Vk ,
f − Ek f ⊥ Vk ,
∀ f ∈ R(S n−1 ),
in the sense that (7.4.61)
(f − Ek f, g)L2 = 0,
∀ g ∈ Vk .
The properties (7.4.59)–(7.4.60) uniquely characterize Ek . As such, Ek is independent of the choice of orthonormal basis on Vk . We set SN f =
N ∑
Ek f
k=0
(7.4.62) =
N ∑ ∑
fˆ(k, ℓ)Ykℓ .
k=0 ℓ∈Σk
Given Corollary 7.4.8, arguments parallel to those used in Proposition 7.1.5 establish the following. Proposition 7.4.9. We have f ∈ R(S n−1 ) =⇒ SN f → f in L2 -norm, and ∞ ∑ ∑
(7.4.63)
|fˆ(k, ℓ)|2 = ∥f ∥2L2 .
k=0 ℓ∈Σk
We are also interested in conditions guaranteeing that SN f → f uniformly on S n−1 . This study is complicated by the fact that, for n ≥ 3, the eigenfunctions Ykℓ are not uniformly bounded. (See Corollary 7.4.22.) Somewhat compensating for this is the following interesting result. Proposition 7.4.10. For each k ∈ Z+ , if {Ykℓ : ℓ ∈ Σk } is an orthonormal basis of Vk , then ∑ dim Vk (7.4.64) |Ykℓ (ω)|2 = , ∀ ω ∈ S n−1 . An−1 ℓ∈Σk
344
7. Fourier analysis
Proof. We define the map ψk : S n−1 −→ Vk
(7.4.65) by the property (7.4.66)
∀ g ∈ Vk .
(g, ψk (ω))L2 (S n−1 ) = g(ω),
Then, for R ∈ SO(n), g ∈ Vk , ω ∈ S n−1 , (7.4.67)
(g, ψk (R−1 ω))L2 = g(R−1 ω) = πk (R)g(ω),
where we define (7.4.68)
πk : SO(n) −→ L(Vk ) by πk (R)g(ω) = g(R−1 ω).
In connection with this, it follows from (4.4.48) that the action of SO(n) on C ∞ (S n−1 ) commutes with ∆S , so this action preserves Vk . To continue, we have (7.4.67) equal to (7.4.69)
(πk (R)g, ψk (ω))L2 = (g, πk (R−1 )ψk (ω))L2 ,
so (7.4.70)
ψk (R−1 ω) = πk (R−1 )ψk (ω),
∀ ω ∈ S n−1 , R ∈ SO(n).
Now the action of πk (R−1 ) on Vk given by (7.4.68) preserves the L2 -norm, so we have (7.4.71)
∥ψk (R−1 ω)∥Vk = ∥ψk (ω)∥Vk ,
∀ R ∈ SO(n), ω ∈ S n−1 .
On the other hand, we have, for each ω ∈ S n−1 , ∑ ∥ψk (ω)∥2Vk = |(Ykℓ , ψk (ω))L2 |2 ℓ∈Σk
(7.4.72) =
∑
|Ykℓ (ω)|2 ,
ℓ∈Σk
by (7.4.66). This implies that (7.4.73)
∑
|Ykℓ (ω)|2 = ck
ℓ∈Σk
is independent of ω ∈ S n−1 . Integrating over ω ∈ S n−1 then gives (7.4.64).
For a first application of Proposition 7.4.10, we have the following estimate on Ek f , given by (7.4.58). For all ω ∈ S n−1 , ∑ |Ek f (ω)| ≤ |fˆ(k, ℓ)| · |Ykℓ (ω)| ℓ∈Σk
(7.4.74)
≤
(∑
|fˆ(k, ℓ)|2
ℓ −1/2
)1/2 (∑
|Ykℓ (ω)|2
ℓ
= An−1 (dim Vk )1/2 ∥Ek f ∥Vk ,
)1/2
7.4. Spherical harmonics
345
the second inequality by Cauchy’s inequality, and the last identity by (7.4.64). Let us use the notation (7.4.75)
Dk = dim Vk .
To proceed, suppose now that f ∈ C 2m (S n−1 ), and set Lf = (1 − ∆S )f,
(7.4.76) so
Ek Lm f = Lm Ek f = (1 + λ2k )m Ek f.
(7.4.77)
Thus we deduce from (7.4.74) that (7.4.78)
An−1 |Ek f (ω)| ≤ (1 + λ2k )−m Dk ∥Ek Lm f ∥Vk , 1/2
1/2
∀ ω ∈ S n−1 .
Another application of Cauchy’s inequality gives N ∑
1/2 An−1
|Ek f (ω)|
k=M +1
≤
N ∑
(1 + λ2k )−m Dk ∥Ek Lm f ∥Vk 1/2
k=M +1
(7.4.79) ≤
N ( ∑
(1 +
λ2k )−2m Dk
N )1/2 ( ∑
k=M +1
=
N ( ∑
∥Ek Lm f ∥2Vk
)1/2
k=M +1
(1 + λ2k )−2m Dk
)1/2
∥(SN − SM )Lm f ∥L2 .
k=M +1
Recall that λ2k is given by (7.4.30) and Dk is given by (7.4.49)–(7.4.52). It follows that (1 + λ2k )−2m Dk ≤ cn (1 + k)−4m+n−2 .
(7.4.80) Hence (7.4.81)
∞ ∑
(1 + λ2k )−2m Dk < ∞,
provided 4m > n − 1.
k=0
This leads to the following result, which one can compare with Proposition 7.1.6. Proposition 7.4.11. Assume n−1 . 2 Then, for 0 ≤ M < N < ∞, we have, for all ω ∈ S n−1 , f ∈ C 2m (S n−1 ), and 2m >
(7.4.82)
(7.4.83)
N ∑
|Ek f (ω)| ≤ εmn (M )∥(SN − SM )Lm f ∥L2 ,
k=M +1
with (7.4.84)
εmn (M ) −→ 0, as M → ∞.
Consequently (7.4.85)
SN f (ω) −→ f (ω) as N → ∞,
uniformly in ω ∈ S n−1 .
346
7. Fourier analysis
Proof. The result (7.4.83) follows from (7.4.79)–(7.4.81). Hence SN f is uniformly convergent to some limit g ∈ C(S n−1 ). It follows that SN f → g in L2 -norm. But Proposition 7.4.9 yields SN f → f in L2 -norm, so f = g, and we have (7.4.85). We return to the connection between the Dirichlet problem and spherical harmonic expansions, and examine ∞ ∑ u(rω) = rk Ek f (ω) (7.4.86) =
k=0 ∞ ∑
∑
fˆ(k, ℓ)rk Ykℓ (ω).
k=0 ℓ∈Σk k
Ykℓ (ω)
Recall that each term r = ηkℓ (rω) is a harmonic polynomial (homogeneous n of degree k) on R . We have from (7.4.52) and the estimate (7.4.74) that, for all ω ∈ S n−1 , ∞ ∑ rk |Ek f (ω)| k=0
(7.4.87)
≤ cn
∞ ∑
(1 + k)(n−2)/2 rk ∥Ek f ∥L2
k=0
≤ cn
∞ (∑
) (1 + k)(n−2)/2 rk ∥f ∥L2 .
k=0
It follows that, whenever f ∈ R(S n−1 ), or more generally f, |f |2 ∈ R# (S n−1 ), the series (7.4.86) converges uniformly on all balls B R = {x ∈ Rn : |x| ≤ R}, for R < 1, to a function u ∈ C(B n ). Of course, any partial sum SN f (rω) =
(7.4.88)
N ∑
rk Ek f (ω)
k=0
is harmonic, being a finite sum of harmonic polynomials. To pass from harmonicity of SN f to u in (7.4.86), we have the following useful result. Proposition 7.4.12. Let O ⊂ Rn be open. Assume uν ∈ C ∞ (O) are harmonic and uν → u, uniformly on compact subsets of O. Then (7.4.89)
u ∈ C ∞ (O),
∂ α uν → ∂ α u uniformly on compact subsets of O,
and u is harmonic on O. Proof. It suffices to show that if x0 ∈ BR (x0 ) ⊂ O and 0 < ρ < R, then (7.4.90)
uν → u uniformly on ∂BR (x0 ) ⇒ ∂ α uν uniformly Cauchy on Bρ (x0 ).
Translating and dilating, we can assume x0 = 0 and R = 1, and apply (7.4.17), so ∫ 1 − |x|2 uµ (y) − uν (y) dS(y). (7.4.91) uµ (x) − uν (x) = An−1 |x − y|n S n−1
7.4. Spherical harmonics
347
We can apply ∂xα to the right side to get (7.4.90).
Applying this result to (7.4.86)–(7.4.88), we have the following. Proposition 7.4.13. Given f ∈ R(S n−1 ), or more generally f, |f |2 ∈ R# (S n−1 ), the function u defined by (7.4.87) is harmonic on B n , and the sequence SN f (rω) of harmonic polynomials given by (7.4.88) satisfies ∂ α SN f (x) −→ ∂ α u(x),
(7.4.92)
uniformly each ball Bρ (0), for ρ < 1. We now want to show that the solution to the Dirichlet problem (7.4.2), with f ∈ C(S n−1 ), has the representation (7.4.86). Let us fix some notation, and denote by n
PI : C(S n−1 ) −→ {u ∈ C(B ) ∩ C ∞ (B n ) : ∆u = 0 on B n }
(7.4.93)
the solution to (7.4.2), given by Proposition 7.4.2, i.e., the Poisson integral ∫ 1 − |x|2 f (y) (7.4.94) PI f (x) = dS(y). An−1 |x − y|n S n−1
Here is the result. Proposition 7.4.14. For all f ∈ C(S n−1 ), (7.4.95)
PI f (rω) =
∞ ∑
rk Ek f (ω),
k=0
for rω ∈ B . n
f (rω). As seen in (7.4.86)–(7.4.88) Proof. Denote the right side of (7.4.95) by PIf and Proposition 7.4.12, we have (7.4.96)
f : R(S n−1 ) −→ {u ∈ C ∞ (B n ) : ∆u = 0 on B n }, PI
and (7.4.97)
f (rω)| ≤ cn |PIf
∞ (∑
) (1 + k)(n−2)/2 rk ∥f ∥L2 .
k=0
We want to show that (7.4.98)
f (rω), PI f (rω) = PIf
∀ rω ∈ B n ,
for all f ∈ C(S n−1 ). It is clear that if f is a finite sum of eigenfunctions of ∆S , f is a smooth solution to (7.4.2), so by i.e., if f ∈ V, given in (7.4.54), then PIf the uniqueness part of Propisition 7.4.2, (7.4.98) holds for all f ∈ V. As seen in Proposition 7.4.7, V is dense in C(S n−1 ). Thus, given f ∈ C(S n−1 ), there exist fν ∈ V such that (7.4.99)
fν → f uniformly on C(S n−1 ), hence in L2 -norm.
We have (7.4.100)
f ν (rω), PI fν (rω) = PIf
348
7. Fourier analysis
for all ω ∈ S n−1 , r ∈ [0, 1). Furthermore, as ν → ∞, PI fν (rω) −→ PI f (rω),
(7.4.101)
f ν (rω) −→ PIf f (rω), PIf
the latter result by (7.4.97), applied to f − fν . Together, (7.4.100)–(7.4.101) give (7.4.98), hence (7.4.95). Using (7.4.18), we deduce from Proposition 7.4.14 that, for f ∈ C(S n−1 ), r ∈ [0, 1), ∫ ∞ ∑ 1 − r2 f (y) k (7.4.102) r Ek f (ω) = dS(y). An−1 (1 − 2rω · y + r2 )n/2 k=0 S n−1
We are consequently motivated to expand the integral on the right side of (7.4.102) in powers of r. The following “generating function identity,” (1 − 2tr + r2 )−α =
(7.4.103)
∞ ∑
Ckα (t)rk
k=0
defines a class of special functions known as Gegenbauer polynomials. To compute these, we can use the identity ) ∞ ( ∑ j+α−1 j −α (7.4.104) (1 − z) = z , j j=0 with z = r(2t − r), to write the left side of (7.4.103) as ) ∞ ( ∑ j+α−1 j r (2t − r)j j j=0 (7.4.105)
=
)( ) j ( ∞ ∑ ∑ j+α−1 j (−1)ℓ rj+ℓ (2t)j−ℓ j ℓ j=0 ℓ=0
=
∞ [k/2] ∑ ∑
( (−1)
ℓ
k=0 ℓ=0
)( ) k−ℓ+α−1 k−ℓ (2t)k−2ℓ rk . k−ℓ ℓ
Hence ∑
[k/2]
(7.4.106)
Ckα (t) =
(−1)ℓ
ℓ=0
We have (7.4.107)
(
k−ℓ+α−1 k−ℓ
∫ ∞ 1 − r2 ∑ k PI f (rω) = r An−1 k=0
)(
) k−ℓ (2t)k−2ℓ . ℓ
n/2
Ck (ω · y)f (y) dS(y),
S n−1
for 0 ≤ r < 1. Comparison with the left side of (7.4.102) gives ∫ [ ] 1 n/2 n/2 (7.4.108) Ek f (ω) = Ck (ω · y) − Ck−2 (ω · y) f (y) dS(y), An−1 S n−1
7.4. Spherical harmonics
349
provided we make the convention that Ckα (t) = 0 for k < 0. Summing (7.4.108) over 0 ≤ k ≤ N yields ∫ [ ] 1 n/2 n/2 (7.4.109) SN f (ω) = CN (ω · y) + CN −1 (ω · y) f (y) dS(y). An−1 S n−1
We seek an alternative formula for Ek . For its derivation, it is convenient to temporarily replace the exponent k in rk by νk , given by ( n − 2 )2 n−2 (7.4.110) νk2 = λ2k + , νk = k + . 2 2 Multiplying (7.4.102) by r(n−2)/2 and making the change of variable r = e−s yields ∫ ∞ ∑ 2 sinh s (7.4.111) e−νk s Ek f (ω) = f (y) dS(y). An−1 (2 cosh s − 2ω · y)n/2 k=0 S n−1
Now integrating over s ∈ [s1 , ∞) and taking r = e−s1 (and dividing by r(n−2)/2 ) gives ∫ ∞ ∑ f (y) 2 (7.4.112) νk−1 rk Ek f (ω) = dS(y). (n − 2)An−1 (1 − 2rω · y + r2 )(n−2)/2 k=0 S n−1
We again apply the generating function identity (7.4.103), this time with α = (n − 2)/2, and compare coefficients of rk , to get ∫ 2νk (n−2)/2 Ck (ω · y)f (y) dS(y). (7.4.113) Ek f (ω) = (n − 2)An−1 S n−1
In the classical case S 2 ⊂ R3 , these Gegenbauer polynomials specialize to Legendre polynomials 1/2
(7.4.114)
Pk (t) = Ck (t).
Since A2 = 4π and νk = k + 1/2 in this case, we get ∫ 2k + 1 (7.4.115) Ek f (ω) = Pk (ω · y)f (y) dS(y). 4π S2
We denote the integral kernel of the projection Ek by Ek (ω, y), so ∫ (7.4.116) Ek f (ω) = Ek (ω, y)f (y) dS(y). S n−1
Then the content of (7.4.113) is that (7.4.117)
Ek (ω, y) =
2νk (n−2)/2 C (ω · y). (n − 2)An−1 k
The following is a useful general identity. Proposition 7.4.15. If {Ykℓ : ℓ ∈ Σk } is an orthonormal basis of Vk , then ∑ (7.4.118) Ek (ω, y) = Ykℓ (ω)Ykℓ (y). ℓ∈Σk
350
7. Fourier analysis
Proof. Denote the right side of (7.4.118) by Fk (ω, y), and set ∫ (7.4.119) Fk f (ω) = Fk (ω, y)f (y) dS(y). S n−1
We see that Fk = E k .
Fk Ykℓ
=
for all ℓ ∈ Σk and that Fk f = 0 if f ⊥ Vk , so indeed
Ykℓ
If we set ω = y in (7.4.118) and integrate over S n−1 , we get ∫ (7.4.120) Ek (y, y) dS(y) = dim Vk . S n−1
Since ω = y ∈ S
⇒ ω · y = 1, we deduce from (7.4.117) that 2νk (n−2)/2 C (1). (7.4.121) dim Vk = n−2 k On the other hand, setting t = 1 in (7.4.103), obtaining (1 − r)−2α , we have ( ) k + 2α − 1 α (7.4.122) Ck (1) = , k n−1
so (7.4.121) implies (7.4.123)
dim Vk =
( ) 2k + n − 2 k + n − 3 , n−2 k
which, as one can check (writing the numerator in the fraction above as k + (k + n − 2)), agrees with (7.4.52). Specializing to n = 3, we have 1/2
(7.4.124)
Pk (1) = Ck (1) = 1,
and hence, on S 2 , dim Vk = 2νk = 2k + 1,
(7.4.125) as in (7.4.50).
The following identity is a useful complement to (7.4.120). Proposition 7.4.16. For each k ∈ Z+ , ∫ ∫ (7.4.126) |Ek (ω, y)|2 dS(ω) dS(y) = dim Vk . S n−1 S n−1
Proof. From (7.4.118), we have (7.4.127)
|Ek (ω, y)|2 =
∑
Ykℓ (ω)Ykℓ (y)Ykm (ω)Ykm (y).
ℓ,m∈Σk
Integrating over S
n−1
∫
×S ∫
and using orthonormality of {Ykℓ : ℓ ∈ Σk } gives ∑ |Ek (ω, y)|2 dS(ω) dS(y) = δℓm
n−1
ℓ,m∈Σk
S n−1 S n−1
(7.4.128)
=
∑
1
ℓ∈Σk
= dim Vk .
7.4. Spherical harmonics
351
To proceed, we would like to apply the formula (7.4.117) to (7.4.126). The following general comments are useful. Each T ∈ SO(n) acts on S n−1 , and we have, for each y ∈ S n−1 , ∫ ∫ F (ω · y) dS(ω) = F (T ω · T y) dS(ω) (7.4.129)
S n−1
S n−1
∫
F (ω · T y) dS(ω),
= S n−1
the last identity because T : S → S n−1 is an isometry, and hence preserves volumes. It follows that the integral on the left side of (7.4.129) is independent of y ∈ S n−1 . We can fix e ∈ S n−1 , and obtain ∫ ∫ ∫ (7.4.130) F (ω · y) dS(ω) dS(y) = An−1 F (ω · e) dS(ω). n−1
S n−1 S n−1
S n−1
It follows that, with (7.4.131)
2k + n − 2 , (n − 2)An−1
γnk =
we have
∫
∫ |Ek (ω, y)|2 dS(ω) dS(y)
S n−1 S n−1
∫
(7.4.132)
=
∫ (n−2)/2
2 γnk
Ck
(ω · y)2 dS(ω) dS(y)
S n−1 S n−1
∫
2 = γnk An−1
(n−2)/2
Ck
(ω · e)2 dS(ω),
S n−1
and this is equal to dim Vk . For the sake of simplicity, let us specialize this to n = 3, i.e., to analysis on S 2 ⊂ R3 . Then we have the Legendre polynomials Pk , given by (7.4.124), and (7.4.132) yields ∫ 1 1 Pk (ω · e)2 dS(ω) = (7.4.133) . 4π 2k + 1 S2
The identities obtained above from (7.4.126) can be generalized. In fact, from (7.4.118) (plus the fact that each Ej (ω, y) is real valued) we have ∫ (7.4.134) Ej (ω, z)Ek (z, y) dS(z) = δjk Ek (ω, y), ∀ ω, y ∈ S n−1 . S n−1
Using (7.4.117), we can rewrite this as ∫ (n−2)/2 (n−2)/2 (n−2)/2 (7.4.135) γnk Cj (ω · z)Ck (z · y) dS(z) = δjk Ck (ω · y). S n−1
352
7. Fourier analysis
In particular, for n = 3, 2k + 1 4π
(7.4.136)
∫ Pj (ω · z)Pk (z · y) dS(z) = δjk Pk (ω · y). S2
Note that taking j = k and ω = y = e ∈ S 2 gives (7.4.133), since Pk (1) = 1. Zonal functions A zonal function on S n−1 is a continuous function of the form f (ω) = φ(ω · e),
(7.4.137)
where e = (0, . . . , 0, 1) (which we might call the “north pole”). We write f ∈ Z(S n−1 ). Provided n ≥ 3, an equivalent condition is the following. Consider SO(n − 1) as a subgroup of SO(n) consisting of rotations about the xn -axis: (7.4.138)
SO(n − 1) = {R ∈ SO(n) : Re = e}.
Then, given f ∈ C(S n−1 ), (7.4.139)
f ∈ Z(S n−1 ) ⇐⇒ f (Rω) = f (ω), ∀ R ∈ SO(n − 1), ω ∈ S n−1 .
For n = 2, SO(1) consists only of the identity transformation, and this equivalence fails. In the rest of this subsection, we will assume n ≥ 3. We define the space of zonal harmonics (7.4.140)
Zk = Vk ∩ Z(S n−1 ).
We have seen examples of elements of Zk , namely (7.4.141)
(n−2)/2
Zk (ω) = Ck
(ω · e).
The following complement to Proposition 7.4.7 is of key importance. Proposition 7.4.17. The linear span of {Zk : k ∈ Z+ } is dense in Z(S n−1 ). Proof. From (7.4.106) we see that Ckα (t) is a polynomial in t of degree k, whose leading term is ( ) k+α−1 (7.4.142) (2t)k . k Thus the linear span of {Ckα : k ∈ Z+ } is the space of all polynomials in t, which, by the Weierstrass approximation theorem, is dense in C([−1, 1]), so we have the proposition. Here is a basic corollary of Proposition 7.4.17. Proposition 7.4.18. For each k ∈ Z+ , (7.4.143) In particular dim Zk = 1.
Zk = Span(Zk ).
7.4. Spherical harmonics
353
Proof. Suppose f ∈ Zk and f ⊥ Zk , i.e., (f, Zk )L2 = 0. Clearly (f, Zj )L2 = 0 for all j ̸= k, so (f, g)L2 = 0 for all g that are finite linear combinations of {Zj }, hence, ∫ by Proposition 7.4.17, for all g ∈ Zk . Taking g = f yields |f |2 dS = 0, hence f ≡ 0. To proceed, let us set Yk0 (ω) = ∥Zk ∥−1 L2 Zk (ω),
(7.4.144)
so {Yk0 : k ∈ Z+ } is an orthonormal set of functions in Z(S n−1 ). As observed below (7.4.68), the action of SO(n) on C(S n−1 ) given by π(R)f (ω) = f (R−1 ω)
(7.4.145)
preserves each space Vk . Hence it commutes with Ek . Thus the characterization (7.4.139) implies Ek : Z(S n−1 ) −→ Zk .
(7.4.146)
By (7.4.143), the identity (7.4.58) specializes to (7.4.147)
f ∈ Z(S n−1 ) ⇒ Ek f (ω) = (f, Yk0 )L2 Yk0 (ω) = fˆ(k, 0)Yk0 (ω).
Hence Proposition 7.4.9 specializes to the following. Proposition 7.4.19. If f ∈ Z(S n−1 ), then (7.4.148)
N ∑
SN f (ω) =
fˆ(k, 0)Yk0 (ω)
k=0
has the property that (7.4.149)
SN f −→ f in L2 -norm,
and ∞ ∑
(7.4.150)
|fˆ(k, 0)|2 = ∥f ∥2L2 .
k=0
If we specialize to n = 3, then (7.4.133) yields ( 2k + 1 )1/2 (7.4.151) Yk0 (ω) = Pk (ω · e), 4π and we have, for f (ω) = φ(ω · e), (7.4.152)
φ(ω · e) =
∞ ∑
φk Pk (ω · e),
k=0
with 2k + 1 φk = 4π (7.4.153)
(
∫ φ(ω · e)Pk (ω · e) dS(ω) S2
1) = k+ 2
∫
1
−1
φ(t)Pk (t) dt,
a result known as the Funk-Hecke theorem.
354
7. Fourier analysis
We turn to a consideration of the transformation (7.4.154)
Π : C(S n−1 ) −→ Z(S n−1 ),
given by (7.4.155)
∫ Πf (ω) =
f (R−1 ω) dR,
SO(n−1)
where dR denotes the Haar measure on SO(n − 1), introduced in (3.2.45). Since π(R) commutes with Ek , so does Π, and we have (7.4.156) Πk : Vk −→ Zk , Πk = Π . Vk
Clearly f ∈ Zk =⇒ Πk f = f.
(7.4.157) Also, with (7.4.158)
Zk⊥ = {f ∈ Vk : (f, Zk )L2 = 0},
we have (7.4.159)
f ∈ Zk⊥ =⇒ fR ∈ Zk⊥ , ∀ R ∈ SO(n − 1),
where fR (ω) = f (R−1 ω), so (7.4.160)
Πk : Zk⊥ −→ Zk⊥ ∩ Zk = 0.
To summarize: Proposition 7.4.20. The map Πk , given by (7.4.155)–(7.4.156), is the orthogonal projection of Vk onto Zk , hence, for f ∈ Vk , (7.4.161)
Πk f (ω) = (f, Yk0 )L2 Yk0 (ω).
Incidentally, note from (7.4.155) that, for all f ∈ Vk , (7.4.162)
Πk f (e) = f (e).
Thus we have: Corollary 7.4.21. For f ∈ Vk , (7.4.163)
f ⊥ Yk0 =⇒ f (e) = 0.
In particular, if {Y ℓ : ℓ ∈ Σk } is an orthonormal basis of Vk , (7.4.164)
ℓ ̸= 0 =⇒ Ykℓ (e) = 0.
Taking into account the identity (7.4.64), we have: Corollary 7.4.22. For Yk0 given by (7.4.141) and (7.4.144), we have (7.4.165)
|Yk0 (e)|2 =
dim Vk . An−1
7.4. Spherical harmonics
355
This gives a sharp illustration of the fact, mentioned above Proposition 7.4.10, that the eigenfunctions Ykℓ are not uniformly bounded. For n = 3, (7.4.165) specializes to ( 2k + 1 )1/2 , on S 2 , (7.4.166) Yk0 (e) = 4π which we have already seen in (7.4.151). Action of SO(n) on the eigenspaces Vk As we have seen above, the rotation group SO(n) acts on functions on S n−1 by π(R)f (ω) = f (R−1 ω).
(7.4.167)
This action works on various function spaces, including C ∞ (S n−1 ), C(S n−1 ), and R(S n−1 ). Since the transformation R is an isometry on S n−1 , this action on C ∞ (S n−1 ) commutes with the Laplace operator ∆S , so one has each eigenspace Vk invariant, yielding πk : SO(n) −→ L(Vk ).
(7.4.168)
One can check from the definition (7.4.167) that the map πk is continuous, in fact smooth. Also, for Rj ∈ SO(n), (7.4.169)
πk (R1 R2 ) = πk (R1 )πk (R2 ),
πk (I) = I.
We say πk is a representation of SO(n) on Vk . We also see that π(R) and hence each πk (R), preserves the L2 -norm. A smooth representation ρ : SO(n) −→ L(V )
(7.4.170)
on a finite-dimensional inner product space V that preserves the norm is called a unitary representation. A linear subspace W ⊂ V is said to be invariant under the representation ρ provided ρ(R) : W −→ W,
(7.4.171)
∀ R ∈ SO(n).
When this holds, and ρ is unitary, we also have ρ(R) : W ⊥ −→ W ⊥ ,
(7.4.172)
∀ R ∈ SO(n),
⊥
where W = {v ∈ V : (v, w) = 0, ∀ w ∈ W }. Indeed, if w ∈ W, v ∈ W ⊥ , R ∈ SO(n), then (w, π(R)v) = (π(R)∗ w, v) (7.4.173)
= (π(R−1 )w, v),
which vanishes if (7.4.171) holds. The unitary representation ρ is said to be irreducible if V has no proper invariant linear subspaces. As just seen, if W is a proper invariant subspace, then V = W ⊕ W ⊥ and ρ acts on each factor. If dim V < ∞, an inductive procedure decomposes (7.4.174)
V = W0 ⊕ · · · ⊕ WM ,
into factors Wj , on each of which SO(n) acts irreducibly.
356
7. Fourier analysis
With these notions in hand, we can make the following important statement about the representations πk defined by (7.4.167)–(7.4.168). Proposition 7.4.23. For each k ∈ Z+ , the representation πk of SO(n) on the eigenspace Vk is irreducible. Proof. Suppose W ⊂ Vk is invariant under πk and f ∈ W, f ̸= 0. Then f (ω0 ) ̸= 0 for some ω0 ∈ S n−1 . Pick R0 ∈ SO(n) such that R0 ω0 = e. Then g = π(R0 )f ∈ W and g(e) ̸= 0. Applying Πk , defined by (7.4.156), we have h = Πk g ∈ W ∩ Zk , and h(e) = g(e) ̸= 0. Thanks to Proposition 7.4.18, this implies Zk ∈ W . If W is not all of Vk , then W ⊥ ⊂ Vk is a nonzero invariant subspace, and the same argument implies Zk ∈ W ⊥ . Contradiction. In an effort to understand the nature of unitary representations ρ in (7.4.170), we will bring in the derived map (7.4.175)
σ = Dρ(I) : TI SO(n) −→ L(V ).
Recall that (7.4.176)
TI SO(n) = Skew(n),
where (7.4.177)
Skew(n) = {A ∈ M (n, R) : At = −A}.
We also bring in the notation (7.4.178)
G = SO(n),
g = TI G,
so g = Skew(n). A key connection between g and G is provided by the following result, involving the matrix exponential (defined in (2.3.115)). Proposition 7.4.24. Given A ∈ M (n, R), (7.4.179)
A ∈ g ⇐⇒ etA ∈ G, ∀ t ∈ R.
Proof. Indeed, (7.4.180)
t
(etA )t = etA ,
(etA )−1 = e−tA ,
so (etA )t = (etA )−1 for all t ∈ R if and only if (7.4.181)
etA = e−tA , t
∀ t ∈ R,
which in turn holds if and only if A = −A. t
There is an algebraic structure on g called the Lie bracket (or the commutator): (7.4.182)
A, B ∈ g =⇒ [A, B] = AB − BA ∈ g.
To establish this for g = Skew(n), note that (7.4.183) if A, B ∈ Skew(n).
(AB − BA)t = B t At − At B t = BA − AB,
7.4. Spherical harmonics
357
The following calculation captures the interaction between the Lie bracket on g and the product on G. First, for A, B ∈ g, we have, for small t, ( )( ) t2 t2 etA etB = I + tA + A2 + O(t3 ) I + tB + B 2 + O(t3 ) 2 2 (7.4.184) t2 2 = I + t(A + B) + (A + 2AB + B 2 ) + O(t3 ), 2 and similarly (7.4.185)
etB etA = I + t(A + B) +
t2 2 (A + 2BA + B 2 ) + O(t3 ), 2
hence (7.4.186)
etA etB = etB etA + t2 [A, B] + O(t3 ).
Consequently, etA etB e−tA e−tB = I + t2 [A, B] + O(t3 ) (7.4.187)
2
= et
[A,B]
+ O(t3 ).
We apply these calculations to show how the Lie bracket is preserved under maps on g arising from representations of G. Thus, assume we have a smooth representation ρ as in (7.4.170), and define σ : g → L(V ) as in (7.4.175), so d (7.4.188) σ(A) = ρ(esA ) s=0 , ds for A ∈ g. Then, for such A, d d ρ(etA ) = ρ(e(s+t)A ) s=0 dt ds d (7.4.189) = ρ(esA )ρ(etA ) s=0 ds = σ(A)ρ(etA ), and since γ(t) = ρ(etA ) satisfies γ(0) = I, this gives (7.4.190)
ρ(etA ) = etσ(A) .
We are ready to prove the following. Proposition 7.4.25. If ρ is a smooth representation of G on V and σ : g → L(V ) is given by (7.4.175), then, for A, B ∈ g, we have (7.4.191)
σ([A, B]) = [σ(A), σ(B)].
Proof. Applying ρ to (7.4.187), we have (7.4.192)
ρ(etA etB e−tA e−tB ) = ρ(et
2
[A,B]
) + O(t3 ).
By (7.4.190) (and a second application of (7.4.187)), the left side of (7.4.192) is equal to (7.4.193)
etσ(A) etσ(B) e−tσ(A) e−tσ(B) = et
2
[σ(A),σ(B)]
Comparing the right sides of (7.4.192) and (7.4.193), we have d (7.4.194) ρ(es[A,B] ) s=0 = [σ(A), σ(B)], ds
+ O(t3 ).
358
7. Fourier analysis
which gives (7.4.191).
A linear map σ : g → L(V ) satisfying (7.4.191) is called a Lie algebra representation of g. The content of Proposition 7.4.25 is that each smooth representation ρ of G on V gives rise, via (7.4.175), to a Lie algebra representation σ of g on V . We say σ is the derived representation associated with ρ. It follows from (7.4.188) that if ρ is a unitary representation, then (7.4.195)
A ∈ g =⇒ σ(A)∗ = −σ(A),
i.e., σ is a representation of g by skew-adjoint linear transformations on V . Parallel to (7.4.171), we say a linear subspace W ⊂ V is invariant under the Lie algebra representation σ provided (7.4.196)
σ(A) : W −→ W,
∀ A ∈ g.
When this holds and σ is a skew-adjoint representation, then, parallel to (7.4.172), we have (7.4.197)
σ(A) : W ⊥ −→ W ⊥ ,
∀ A ∈ g.
The following result connects the two notions of invariance. Proposition 7.4.26. A linear subspace W ⊂ V is invariant under the representation ρ of G if and only if it is invariant under the Lie algebra representation σ of g. Proof. First we observe that, for each A ∈ g, (7.4.198)
ρ(etA ) : W → W, ∀ t ∈ R ⇐⇒ etσ(A) : W → W, ∀ t ∈ R ⇐⇒ σ(A) : W → W.
This readily gives (7.4.199)
W invariant under ρ =⇒ W invariant under σ.
To finish the argument, we use the fact that, for each R ∈ SO(n), there exist A ∈ Skew(n) such that (7.4.200)
R = eA ,
to get the converse implication to (7.4.199).
Remark. The content of (7.4.200) is that Exp : Skew(n) → SO(n) is onto. See Appendix A.3, Proposition A.3.8 and Corollary A.3.9. We say a Lie algebra representation σ of g on V is irreducible if V has no proper linear subspaces invariant under σ. We then have the following immediate consequence of Proposition 7.4.26. Proposition 7.4.27. Let V be a finite-dimensional vector space. A smooth representation ρ of G on V is irreducible if and only if its derived representation σ of g on V is irreducible.
7.4. Spherical harmonics
359
It is a natural task to classify the family of irreducible skew-adjoint representations of g, up to equivalence, where we say representations σ and τ of g on V and W are equivalent if there is an isomorphism (7.4.201)
≈
U : V −→ W, such that σ(A) = U −1 τ (A)U, ∀ A ∈ g.
In light of Proposition 7.4.23, we expect that a useful approach to understanding the representations πk of SO(n) on Vk would be to classify the irreducible representations of Skew(n) and identify which are equivalent to the representation πk . In the case n = 3, carrying out this program is relatively straightforward, and this is what we proceed to do. To get started, we select the following basis of Skew(3): 0 −1 0 −1 0 , A2 = , A3 = 0 (7.4.202) A1 = 1 0 0 1 0
0 1
−1 . 0
Note that the families of transformations etAj are groups of rotations about the x3−j -axis, each periodic in t of period 2π. A straightforward calculation yields the following “commutator identities,” (7.4.203)
[A1 , A2 ] = A3 ,
[A2 , A3 ] = A1 ,
[A3 , A1 ] = A2 .
Now suppose σ : Skew(3) → L(V ) is a skew-adjoint representation. Set L∗j = −Lj .
Lj = σ(Aj ) ∈ L(V ),
(7.4.204)
The identities (7.4.203) yield (7.4.205)
[L1 , L2 ] = L3 ,
[L2 , L3 ] = L1 ,
[L3 , L1 ] = L2 .
One key to understanding the structure of this representation is provided by M = L21 + L22 + L23 ∈ L(V ).
(7.4.206) Note that
M = M ∗,
(7.4.207)
(M v, v) = −
3 ∑
∥Lj v∥2 ≤ 0, ∀ v ∈ V,
j=1
so V has an orthonormal basis of eigenvectors of M , and all its eigenvalues are ≤ 0. Now, using the general identity (7.4.208)
[X, Y 2 ] = [X, Y ]Y + Y [X, Y ],
X, Y ∈ L(V ),
we can readily verify that (7.4.209)
[Lj , M ] = 0,
∀ j.
It follows that each eigenspace of M is invariant under L1 , L2 , and L3 . This establishes the following. Lemma 7.4.28. Assume σ is an irreducible skew-adjoint representation of Skew(3) on V . Then (7.4.210) for some λ ∈ [0, ∞).
M = −λ2 I,
360
7. Fourier analysis
To proceed, we diagonalize L1 on V . Set Wµ = {v ∈ V : L1 v = iµv},
(7.4.211)
⊕
V =
Wµ .
iµ∈Spec L1
The structure of σ is determined by how L2 and L3 behave on Wµ . It is convenient to set L± = L2 ∓ iL3 .
(7.4.212)
The following key identity follows directly from (7.4.205): [L1 , L± ] = ±iL± .
(7.4.213) This leads to the following. Lemma 7.4.29. For each µ,
L± : Wµ −→ Wµ±1 .
(7.4.214)
In particular, if iµ ∈ Spec L1 , then either L+ = 0 on Wµ or i(µ + 1) ∈ Spec L1 , and also either L− = 0 on Wµ or i(µ − 1) ∈ Spec L1 . Proof. If v ∈ Wµ , then (7.4.213) implies L1 L± v = L± L1 v ± iL± v = i(µ ± 1)L± v,
(7.4.215)
and this identity yields the lemma.
Remark. Because of (7.4.214) and (7.4.215), one calls L± ladder operators. The following gives more precise information. Proposition 7.4.30. If σ is an irreducible skew-adjoint representation of Skew(3) on V , then Spec L1 must consist of a sequence, Spec L1 = i{µ0 , µ0 + 1, . . . , µ0 + ℓ = µ1 },
(7.4.216) with (7.4.217)
≈
for 0 ≤ j ≤ ℓ − 1,
≈
for 0 ≤ j ≤ ℓ − 1.
L+ : Wµ0 +j −→ Wµ0 +j+1 ,
and (7.4.218)
L− : Wµ1 −j −→ Wµ1 −j−1 ,
Proof. A computation gives (7.4.219)
L− L+ = L22 + L23 + i[L3 , L2 ] = −λ2 − L21 − iL1 ,
on V , and, similarly, (7.4.220)
L+ L− = −λ2 − L21 + iL1
on V . Hence (7.4.221)
L− L+ = µ(µ + 1) − λ2 ,
on Wµ ,
L+ L− = µ(µ − 1) − λ ,
on Wµ .
2
7.4. Spherical harmonics
361
Also, since L2 and L3 are skew-adjoint, L+ = −L∗− ,
(7.4.222) so (7.4.223)
L+ L− = −L∗− L− ,
L− L+ = −L∗+ L+ .
Hence, we have the identity of null spaces, (7.4.224)
N (L+ ) = N (L− L+ ),
N (L− ) = N (L+ L− ).
These observations establish (7.4.216)–(7.4.218).
In the setting of Proposition 7.4.30, we see that, if v ∈ Wµ0 is nonzero, then µ1 −µ0 Span{v, L+ v, . . . , L+ v}
(7.4.225)
is invariant under L1 , L+ , and L− , hence under the representation σ, so it must be all of V , if σ is irreducible. This implies (7.4.226)
dim Wµ = 1,
for iµ ∈ Spec L1 , µ0 ≤ µ ≤ µ1 .
From (7.4.221) we see that (7.4.227)
µ1 (µ1 + 1) = λ2 = µ0 (µ0 − 1),
hence
(7.4.228)
ℓ ℓ µ1 − µ0 = ℓ =⇒ µ0 = − , µ1 = , 2 2 dim V = ℓ + 1, and λ2 =
ℓ(ℓ + 2) . 4
The vectors in the basis (7.4.225) of V are mutually orthogonal, since the various eigenspaces of L1 are, but they do not form an orthonormal basis. To nail down the structure of the action of Skew(3) on V , we have the following. Proposition 7.4.31. Assume σ is an irreducible skew-adjoint representation of Skew(3) on V , iµ ∈ Spec L1 , and wµ ∈ Wµ . Then the norms of the vectors L± wµ ∈ Wµ±1 are given by (7.4.229)
∥L+ wµ ∥2 = [λ2 − µ(µ + 1)] ∥wµ ∥2 , ∥L− wµ ∥2 = [λ2 − µ(µ − 1)] ∥wµ ∥2 .
Proof. Using (7.4.221) and (7.4.222), we have ∥L+ wµ ∥2 = (L∗+ L+ wµ , wµ ) (7.4.230)
= −(L− L+ wµ , wµ ) = [λ2 − µ(µ + 1)] ∥wµ ∥2 .
The computation of ∥L− wµ ∥2 is similar.
This leads to the following explicit description of the action of Skew(3) on V .
362
7. Fourier analysis
Proposition 7.4.32. Let V be a complex inner product space of dimension ℓ + 1, ℓ ∈ Z+ = {0, 1, 2, . . . }. If σ is an irreducible skew-adjoint representation of Skew(3) on V , then V has an orthonormal basis (7.4.231)
vµ ,
ℓ µ = − + j, j ∈ {0, . . . , ℓ}, 2
with respect to which
(7.4.232)
L1 vµ = iµvµ , √ L+ vµ = λ2 − µ(µ + 1) vµ+1 , √ L− vµ = λ2 − µ(µ − 1) vµ−1 ,
where ℓ(ℓ + 2) . 4 Consequently, for each ℓ ∈ Z+ , there is, up to equivalence, just one such representation of Skew(3). λ2 =
(7.4.233)
Among the representations of Skew(3) arising in Proposition 7.4.32 are the derived representations (which we denote σk ) associated to the representations πk of SO(n) on Vk ⊂ C(S n−1 ), in case n = 3. Recall that (7.4.234)
n = 3 =⇒ dim Vk = 2k + 1.
Thus we get “half” of the representations described in Proposition 7.4.32, those for which ℓ is even. If we denote by τℓ the representation of Skew(3) described in Proposition 7.4.32, when dim V = ℓ + 1, then (7.4.235)
σk is equivalent to τ2k .
Actually, for ℓ odd, the representation τℓ is not derived from a representation of SO(3). We can see this as follows. From (7.4.202), we have e2πA1 = I.
(7.4.236)
Hence, if ρ is an irreducible representation of SO(3), with derived representation σ, then (7.4.237)
e2πL1 = e2πσ(A1 ) = ρ(e2πA1 ) = I,
so Spec L1 ⊂ iZ. But if ℓ is odd, we have from (7.4.231)–(7.4.232) that i/2 ∈ Spec L1 . This has the following consequence. Proposition 7.4.33. Each irreducible unitary representation of SO(3) is equivalent to one of the representations πk of SO(3) on a space Vk ⊂ C(S 2 ) of spherical harmonics. On the other hand there is a group, closely related to SO(3), whose irreducible unitary representations representations have derived representations equivalent to each of those given in Proposition 7.4.32, namely SU (2). Generally, for n ≥ 2, we set (7.4.238)
SU (n) = {U ∈ M (n, C) : U ∗ U = I and det U = 1}.
7.4. Spherical harmonics
363
As with SO(n), this is a group of matrices which is also a smooth surface in the vector space M (n, C). Its tangent space at the identity element is (7.4.239)
TI SU (n) = su(n) = {X ∈ Skew(Cn ) : Tr X = 0},
where (7.4.240)
Skew(Cn ) = {X ∈ M (n, C) : X ∗ = −X}.
The space su(2) is a 3-dimensional real vector space, closed under the Lie bracket. It has the basis ( ) ( ) ( ) 1 i 0 1 0 1 1 0 i (7.4.241) X1 = , X2 = , X3 = , 2 0 −i 2 −1 0 2 i 0 with commutator identities (7.4.242)
[X1 , X2 ] = X3 ,
[X2 , X3 ] = X1 ,
[X3 , X1 ] = X2 ,
parallel to those in (7.4.203). Consequently, the map Aj 7→ Xj extends uniquely to a linear isomorphism (7.4.243)
≈
τ : Skew(3) −→ su(2),
which preserves the Lie bracket (we call τ a Lie algebra isomorphism). In fact, composing τ with the inclusion su(2) ⊂ M (2, C) gives a representation of Skew(3) equivalent to τ1 , described above. The group SU (2) has representations on complex vector spaces of dimension ℓ + 1, described as follows. We take (7.4.244)
Pℓ (C2 ) = space of polynomials in z = (z1 , z2 ), homogeneous of degree ℓ,
and define the representation γℓ of SU (2) on Pℓ (C2 ) by (7.4.245)
γℓ (U )p(z) = p(U −1 z),
U ∈ SU (2), z ∈ C2 , p ∈ Pℓ (C2 ).
One can use on Pℓ (C2 ) the inner product ∫ (7.4.246) (f, g) = f (z)g(z) dS(z), S3
where S 3 ⊂ C2 ≈ R4 is the unit sphere, with its induced Riemannian metric. Then γℓ is a unitary representation. It is irreducible for each ℓ, a fact we leave as a challenge to the reader. Given this, it follows from Proposition 7.4.32 that the derived representation of γℓ is a representation of su(2) ≈ Skew(3) that is equivalent to τℓ , for each ℓ ∈ Z+ . We return to the study of the representations πk of SO(3), and derived representations σk , on the eigenspaces of ∆S , Vk ⊂ C ∞ (S 2 ), and see how Proposition 7.4.32 yields specific formulas. First, note that, since dim Vk = 2k + 1, (7.4.233) gives (7.4.247)
λ = k(k + 1) = λk ,
with λk as in (7.4.30), with n = 3. Equivalently, (7.4.248)
L21 + L22 + L23 = ∆S .
364
7. Fourier analysis
Next, we have an orthonormal basis of Vk of the form Ykℓ ,
(7.4.249)
−k ≤ ℓ ≤ k,
satisfying L1 Ykℓ = iℓYkℓ .
(7.4.250)
For ℓ = 0, this identifies Yk0 as a zonal function. Following (7.4.151), we take ( 2k + 1 )1/2 Pk (ω · e), (7.4.251) Yk0 (ω) = 4π where e = (0, 0, 1) and Pk (t) are the Legendre polynomials. For each ℓ ∈ {−k, . . . , 0, . . . , k}, (7.4.250) says Ykℓ (etA1 ω) = eiℓt Ykℓ (ω),
(7.4.252)
with A1 given by (7.4.202). Writing a general element f ∈ Vk as f = deduce the following result.
∑
j j cj Yk ,
we
Proposition 7.4.34. Given f ∈ Vk , ℓ ∈ {−k, . . . , 0, . . . , k}, we have ∫ 2π 1 (7.4.253) e−iℓt f (eitA ω) dt = (f, Ykℓ )L2 Ykℓ (ω). 2π 0 Elements of Vk that one could plug into (7.4.253) include fky (ω) = Pk (ω · y),
(7.4.254) and
gkc (ω) =
(7.4.255) ∑
(∑
)k cj ωj
,
y ∈ S2,
cj ∈ C,
j
∑
c2j = 0,
j
k
since then ( cj xj ) is a harmonic polynomial, homogeneous of degree k. A particular case of this is (7.4.256)
gk (ω) = (ω1 + iω2 )k ,
which satisfies L1 gk = ikgk , implying (7.4.257)
Ykk (ω) = αk (ω1 + iω2 )k ,
for some constant αk . A more direct path to explicit formulas for Ykℓ is found by applying (7.4.232), to get √ (7.4.258) L+ Ykℓ (ω) = k(k + 1) − ℓ(ℓ + 1) Ykℓ+1 (ω). We can start at ℓ = 0 with (7.4.251) and apply this iteratively to obtain formulas for Ykℓ (ω), for 1 ≤ ℓ ≤ k. For −k ≤ ℓ ≤ −1, we could work similarly with iterates of L− , or we could just take (7.4.259)
Yk−ℓ (ω) = Ykℓ (ω),
noting that L1 f = L1 f . To apply (7.4.258) explicitly, we use spherical coordinates (θ, ψ), defined by (7.4.260)
x(θ, ψ) = (sin θ cos ψ, sin θ sin ψ, cos θ), 0 ≤ θ ≤ π,
0 ≤ ψ ≤ 2π,
7.4. Spherical harmonics
365
Figure 7.4.1. Spherical coordinates: x(θ, ψ) = (sin θ cos ψ, sin θ sin ψ, cos θ)
so θ = 0 defines the “north pole,” (0, 0, 1) = e, and θ = π defines the south pole, −e. See Figure 7.4.1. In these coordinates, we have [ ∂ ∂ ∂ ] (7.4.261) L1 = , L± = ie±iψ ± + i cot θ . ∂ψ ∂θ ∂ψ Also, in these coordinates, (7.4.251) takes the form ( 2k + 1 )1/2 Pk (cos θ). (7.4.262) Yk0 (θ, ψ) = 4π To start the iteration (7.4.258) at ℓ = 0, we have the general formula L+ g(θ) = ieiψ g ′ (θ),
(7.4.263) hence (7.4.264)
L+ G(cos θ) = −ieiψ (sin θ) G′ (cos θ).
More generally, we calculate (7.4.265)
( ) L+ eiℓψ sinℓ θ Gℓ (cos θ) = −iei(ℓ+1)ψ sinℓ+1 θ Gℓ′ (cos θ).
Hence, inductively, we obtain the formula (7.4.266)
(ℓ)
Ykℓ (θ, ψ) = αkℓ eiℓψ sinℓ θ Pk (cos θ),
0 ≤ ℓ ≤ k,
366
7. Fourier analysis
with constants αkℓ obtainable via (7.4.258). Recall that Pk (t) is a polynomial in t (k) of degree k, so Pk (t) is constant, so (7.4.266) gives Ykk (θ, ψ) = αk eikψ sink θ,
(7.4.267)
which recovers (7.4.257), since, by (7.4.260), eiψ sin θ = ω1 + iω2 .
(7.4.268)
In light of this identity, we see that another way to write (7.4.266) is (7.4.269)
(ℓ)
Ykℓ (ω) = αkℓ (ω1 + iω2 )ℓ Pk (ω3 ),
0 ≤ ℓ ≤ k.
We record the conclusion. Proposition 7.4.35. Each eigenspace Vk ⊂ C ∞ (S 2 ) of the Laplace operator ∆S on S 2 has an orthonormal basis {Ykℓ : −k ≤ ℓ ≤ k} of the form ( 2k + 1 )1/2 (7.4.270) Yk0 (ω) = Pk (ω3 ), 4π and, for 1 ≤ ℓ ≤ k (if k ≥ 1), (ℓ)
(7.4.271)
Ykℓ (ω) = αkℓ (ω1 + iω2 )ℓ Pk (ω3 ), Yk−ℓ (ω) = αkℓ (ω1 − iω2 )ℓ Pk (ω3 ). (ℓ)
with coefficients αkℓ ∈ (0, ∞) obtainable from (7.4.258). See Exercises 9–10 below for a convenient recursive formula for the Legendre polynomials Pk (t). See Figure 7.4.2 for graphs of the normalized Legendre polynomials √ (7.4.272) yk (t) = 2k + 1Pk (t), for 0 ≤ k ≤ 5, and Figure 7.4.3 for k = 10, 20, 30. Note the spiking at t = ±1 as k increases. Since the argument of Pk (t) often appears as t = cos θ, it is also of interest to see graphs of yk (cos θ). See Figure 7.4.4. It is of interest to contrast the behavior of the zonal eigenfunctions Yk0 (ω) with that of the other extreme case, Ykk (ω). We have (7.4.273)
|Ykk (ω)|2 = |αk |2 (ω12 + ω22 )k = |αk |2 (1 − t2 )k ,
with αk as in (7.4.257) and t = ω3 ∈ [−1, 1]. The normalization ∥Ykk ∥L2 (S 2 ) = 1 gives ∫ 1 = |Ykk (ω)|2 dS(ω) (7.4.274)
S2
= 2π|αk |2
∫
1
−1
(1 − t2 )k dt.
To evaluate |αk |2 , it is convenient to have the following.
7.4. Spherical harmonics
367
Figure 7.4.2. Normalized Legendre polynomials, yk (t) =
Lemma 7.4.36. With
∫
(7.4.275)
γk =
√
2k + 1Pk (t)
1
−1
(1 − t2 )k dt,
we have γ0 = 2 and, for k ≥ 1, (7.4.276)
γk =
2k γk−1 . 2k + 1
Proof. Integration by parts gives ∫ γk = −
1
−1 1
∫
(7.4.277) = 2k
−1
d (1 − t2 )k t dt dt (1 − t2 )k−1 t2 dt.
Meanwhile, ∫ (7.4.278)
γk =
1
−1
(1 − t2 )k−1 (1 − t2 ) dt.
Comparing (7.4.277)–(7.4.278), we have γk = γk−1 − γk /2k, hence (7.4.276).
368
7. Fourier analysis
Figure 7.4.3. Graphs of yk (t) for k = 10, 20, 30
We see from (7.4.274) that 1 . 2πγk The following result describes the behavior of γk as k → ∞. |αk |2 =
(7.4.279)
Lemma 7.4.37. We have
√ γk ∼
(7.4.280)
π , k
as k → ∞.
Proof. Since 1 − t2 and e−t both have nondegenerate maxima at t = 0 and agree up to O(t4 ) there, one can show that √ ∫ 1 ∫ ∞ π 2 k −kt2 (7.4.281) (1 − t ) dt ∼ e dt = . k −1 −∞ 2
The industrious reader is invited to tackle the demonstration of (7.4.281) directly. This is a special case of results established in Appendix A.3 of [51]. Remark. Using results on the gamma function given in Chapter 4 of [51] (cf. (4.3.40)– (4.3.41)), one can show that (7.4.282)
γk =
Γ( 21 )Γ(k + 1) . Γ(k + 32 )
7.4. Spherical harmonics
369
Figure 7.4.4. Graphs of yk (cos θ) =
√
4πYk0 (ω), −π/2 ≤ θ ≤ π/2
Results on the gamma function established in §3.2 of this text also allow one to deduce (7.4.276) from (7.4.282). In addition, Stirling’s formula, on the behavior of Γ(x) as x → ∞, treated in Appendix A.3 of [51], allows one to deduce (7.4.280) from (7.4.282) (though the derivation via (7.4.281) is more direct and elementary). Now, parallel to (7.4.272), we set √ 2 (7.4.283) wk (t) = (1 − t2 )k/2 , γk so (7.4.284)
|Ykk (ω)|2 =
1 wk (t)2 , 4π
with t = ω3 , and (7.4.274) is equivalent to ∫ 1 1 (7.4.285) wk (t)2 dt ≡ 1. 2 −1 See Figure 7.4.5 for graphs of the functions wk (cos θ), for various values of k between 1 and 30. The graphs of the functions wk (t) also spike as k → ∞, though the nature of the spiking differs from that of yk (t). For example, the spikes for wk (t) have a
370
7. Fourier analysis
Figure 7.4.5. Graphs of wk (cos θ) =
√
4π|Ykk (ω)|, ω3 = cos θ, 0 ≤ θ ≤ π
smaller amplitude. The maximum value of yk (t) is √ ( 4k )1/4 2 ∼ . γk π
√
2k + 1, while that for wk (t) is
Regarding the associated behavior of Ykk and of Yk0 , we see that Ykk spikes on a neighborhood of the equator in S 2 , while Yk0 spikes on a smaller area, around the poles. Another difference one can perceive from Figures 7.4.4–7.4.5 is that the amplitude of Ykk tends to 0 as k → ∞, outside any neighborhood of the equator, so Ykk really concentrates around the equator. On the other hand, Yk0 remains sizable outside its spike zone. In fact, we can state the following result, though we do not prove it here. (See [38], §4.8.) Proposition 7.4.38. In the limit as k → ∞, the sequence of functions yk (t) has the upper envelope √ 4 y= (1 − t2 )−1/4 . π The reader can examine Figure 7.4.6 and formulate a definition of “upper envelope” in this setting.
Exercises
371
Figure 7.4.6. y30 (t) and the upper envelope y =
√
4/π (1 − t2 )−1/4
Exercises 1. Given that the solution to (7.4.2) is specified by (7.4.5), for some constant Cn , show that plugging in f = 1 and evaluating at x = 0 implies Cn must be equal to 1/An−1 . 2. Refining Proposition 7.4.6, show that if f ∈ Vj and g ∈ Vk , then (7.4.286)
fg ∈
k+j ⊕
Vℓ .
ℓ=|k−j|
Hint. Suppose 0 ≤ j < k and m < k − j, so j + m < k. Deduce that h ∈ Vm =⇒ h ⊥ f g, from the implication h ∈ Vm , f ∈ Vj ⇒ hf ⊥ Vk . 3. Recall the map ψk : S n−1 → Vk , given in (7.4.65)–(7.4.67). (a) Implicit in the construction of (7.4.65)–(7.4.66) is the following assertion. Let V be a finite-dimensional (complex) inner product space, and λ : V −→ C a
372
7. Fourier analysis
linear map. Then there is a unique ψ ∈ V such that λ(g) = (g, ψ)V ,
∀ g ∈ V.
Prove this. (Hint. Consider the action of λ on an orthonormal basis of V .) (b) Show that, in the setting of (7.4.66), (n−1)/2
ψk (ω)(y) = Ek (ω, y) = βnk Ck
(ω · y),
for an appropriate constant βnk . (c) Use (7.4.117)–(7.4.118) to give another proof of Proposition 7.4.10. (d) Assume k ≥ 1. Show that for each p ∈ S n−1 , Dψk (p) : Tp S n−1 −→ Vk is injective. In fact, show that there exists bkn ∈ (0, ∞) such that ∥Dψk (p)X∥Vk = bkn ∥X∥Rn ,
∀ X ∈ Tp S n−1 .
(e) Show that if x, y ∈ S n−1 , then ψ2 (x) = ψ2 (y) if and only if y = ±x. Hence ψ2 induces a smooth map ψ˜2 : Pn−1 −→ V2 , of real projective space Pn−1 , defined by (3.2.92). Show that this map is an embedding. How does this embedding compare to the one arising from (3.2.110)– (3.2.111)? 4. Verify the identity (7.4.104) for (1−z)−α , for |z| < 1, with the binomial coefficient (7.4.287)
( ) j+α−1 α(α + 1) . . . (α + j − 1) . = j j!
5. Assume x, y ∈ Rn and |x| < |y|. Deduce from (7.4.103) that ∞ ( |x| )k 1 1 ∑ , (7.4.288) = Pk (cos θ) |x − y| |y| |y| k=0
where θ is the angle between x and y, i.e., x · y = |x| · |y| cos θ. This expansion is Legendre’s multipole expansion. It started out the study of the Legendre polynomials. 6. Looking back at Exercise 2, show that if g ∈ Vk , then (7.4.289)
ωℓ g(ω) ∈ Vk+1 ⊕ Vk−1 .
Hint. Use a parity argument to show that the component in Vk is 0. 7. Deduce from Exercise 6 that, on S n−1 , ωn Yk0 (ω) is a linear combination of 0 0 Yk+1 (ω) and Yk−1 (ω), Equivalently, when α = (n − 2)/2, (7.4.290)
α α tCkα (t) = Akα Ck+1 (t) + Bkα Ck−1 (t).
Exercises
373
In particular, (7.4.291)
tPk (t) = ak Pk+1 (t) + bk Pk−1 (t).
8. Recall that the leading coefficient of the polynomial Ckα (t) is given by (7.4.142). α Write down the leading coefficient of the polynomial tCkα (t) and of Ck+1 (t) and find the coefficient Akα in (7.4.290). 9. Since Pk (1) ≡ 1, note that, in (7.4.291), ak + bk = 1. Use this together with Exercise 8 to show that k+1 k (7.4.292) tPk (t) = Pk+1 (t) + Pk−1 (t). 2k + 1 2k + 1 10. Rewrite (7.4.292) as a formula for Pk+1 (t). Use it to write down the polynomials Pk (t) for 0 ≤ k ≤ 5. If you can program a computer, explore further, and draw graphs. 11. Use the computation of Ckα (1) in (7.4.122), together with Exercise 8, to evaluate the coefficient Bkα in (7.4.290), when α = (n − 2)/2. 12. Fix c = (c1 , . . . , cn ) ∈ Cn , satisfying c21 + · · · + c2n = 0. We say c ∈ K n . Set Gck (x) = (c1 x1 + · · · + cn xn )k . Show that Gck is a harmonic polynomial on Rn , in Hk , hence gkc = Gck S n−1 ∈ Vk . Show that Span{gkc : c ∈ K n } = Vk . Hint. Denote the span by Vk . Show that πk : Vk → Vk and use irreducibility of πk . 13. For y ∈ S n−1 , set (n−2)/2
fky (ω) = Ck
(ω · y),
which belongs to Vk . Show that Span{fky : y ∈ S n−1 } = Vk . 14. Take n = 3, and, for y ∈ S 2 , set ( 2k + 1 )1/2 Pk (ω · y), (7.4.293) φyk (ω) = 4π so each φyk ∈ Vk has unit L2 -norm. Use (7.4.136) to show that, for y, η ∈ S 2 , (7.4.294)
(φyk , φηk )L2 = Pk (y · η).
374
7. Fourier analysis
15. Take {yj : 1 ≤ j ≤ 2k + 1} ⊂ S 2 , and recall that dim Vk = 2k + 1. Then y
{φkj : 1 ≤ j ≤ 2k + 1} ⊂ Vk is a set of elements of unit L2 -norm. Use Exercise 14 to formulate a condition that this set be a basis of Vk . y Hint. Define T : C2k+1 → Vk by T ej = φkj , where {ej } is the standard basis of 2k+1 C . Show that the matrix entries of A = T ∗ T ∈ M (2k + 1, C) are given by y
aij = (φyki , φkj )L2 . 16. Recall the formula (7.4.103) for the Gegenbauer polynomials: ∞ ∑ (7.4.295) (1 − 2tr + r2 )−α = Ckα (t)rk . k=0
(a) Show that for t ∈ [−1, 1] and z ∈ C, the roots of z 2 − 2tz + 1 are √ z± = t ± i 1 − t2 , so |z± |2 = 1. (b) Use this to show that, for t ∈ [−1, 1], α ∈ R, Fα,t (z) = (1 − 2tz + z 2 )−α is holomorphic for |z| < 1, and that (7.4.296)
(1 − 2tz + z 2 )−α =
∞ ∑
Ckα (t)z k ,
k=0
with absolute convergence for |z| < 1. (c) Show that (7.4.296) converges absolutely and uniformly for 1 1 t, z ∈ C, |t| ≤ , |z| ≤ , 2 2 and more generally for (1 1 ) t, z ∈ C, |t| ≤ T, |z| ≤ min , . 2 4T 1/2
17. Take α = 1/2 in Exercise 16, and recall that Ck (t) = Pk (t) are the Legendre polynomials. Use Cauchy’s integral formula to show that, for t ∈ [−1, 1], ∫ 1 (1 − 2tz + z 2 )−1/2 z −k−1 dz, Pk (t) = 2πi γ
where γ is an appropriate closed path about the origin. 18. Show that the formula (7.4.106) specialized to α = 1/2 yields ∑
[k/2]
(7.4.297)
Pk (t) =
ℓ=0
(−1)ℓ
(2k − 2ℓ)! tk−2ℓ . − ℓ)!(k − 2ℓ)!
2k ℓ!(k
Exercises
375
Applying the binomial formula to (t2 − 1)k , show that k dk 2 dk ∑ k! k (t − 1) = t2k−2ℓ (−1)ℓ k k dt dt ℓ!(k − ℓ)! ℓ=0
(7.4.298)
∑
[k/2]
=
(−1)ℓ
ℓ=0
k! (2k − 2ℓ)! k−2ℓ t . ℓ!(k − ℓ)! (k − 2ℓ)!
Deduce that, for k ∈ Z+ , (7.4.299)
Pk (t) =
1 dk 2 (t − 1)k . 2k k! dtk
This is the Rodrigues formula for the Legendre polynomials. 19. Recall the map Πk : Vk → Zk discussed in (7.4.155)–(7.4.161). Here, we specialize to n = 3. (a) Show that, for f ∈ Vk , ∫ π 1 f ((cos φ)ω1 + (sin φ)ω2 , −(sin φ)ω1 + (cos φ)ω2 , ω3 ) dφ. Πk f (ω) = 2π −π (b) Show that f ∈ Vk , f (e) = 1 =⇒ Πk f (ω) = Pk (ω3 ). (c) Apply part (a) to fk (ω) = (ω3 + iω1 )k . Show that ∫ π 1 (ω3 + i[(cos φ)ω1 + (sin φ)ω2 ])k dφ. Πk fk (ω) = 2π −π Deduce from part (b) that (7.4.300)
Pk (t) =
1 2π
∫
π
−π
(
)k √ t + i(sin φ) 1 − t2 dφ.
(d) Using the binomial expansion, show that Pk (t) =
( ) k ∑ k ij aj tk−j (1 − t2 )j/2 , j j=0
with 1 aj = 2π
∫
π
sinj φ dφ. −π
Note that aj = 0 for j odd. Hence ∑
[k/2]
Pk (t) =
ℓ=0
(−1)ℓ
( ) k a2ℓ tk−2ℓ (1 − t2 )ℓ . 2ℓ
376
7. Fourier analysis
7.5. Fourier series on compact matrix groups Let G be a smooth, compact matrix group. As seen in §6.4, G has a bi-invariant volume form of total mass 1, and we can also equip G with a bi-invariant Riemannian metric. We form a family (7.5.1)
α ∈ A,
πα ,
of mutually inequivalent, irreducible unitary representations of G on inner product spaces Vα , of dimension dα . We arrange that each irreducible unitary representation of G is equivalent to one of these. Choose an orthonormal basis of Vα , so that πα (g) has the matrix representation (πα (g)jk ), 1 ≤ j, k ≤ dα . The following result summarizes Propositions 6.4.23–6.4.24. Proposition 7.5.1. The collection of functions (7.5.2)
F = {d1/2 α πα (g)jk : α ∈ A, 1 ≤ j, k ≤ dα }
is an orthonormal set of functions on G. Furthermore, Span F is dense in C(G).
(7.5.3)
Given this result, we can parallel arguments used to prove Proposition 7.1.5 (see also Proposition 7.4.9), obtaining the following result, known as the Peter-Weyl theorem. To set it up, identify A with N, and for N ∈ N and f ∈ C(G), or more generally f ∈ R(G), set ∑ ∑ (7.5.4) SN f (x) = dα (f, παjk )L2 πα (x)jk . α≤N j,k≤dα
Proposition 7.5.2. For each f ∈ R(G), (7.5.5)
SN f −→ f in L2 -norm, as N → ∞,
and ∥f ∥2L2 =
(7.5.6)
∑ ∑
2 dα (f, παjk )L2 .
α j,k≤dα
Another way to write SN is as (7.5.7)
∑
SN f =
Pα f,
α≤N
where, for each α ∈ A, (7.5.8)
Pα f (x) =
∑
dα (f, παjk )L2 πα (x)jk ,
j,k≤dα
so Pα is a projection of R(G) onto the space (7.5.9)
Vα = Span{πα (x)jk : 1 ≤ j, k ≤ dα } ⊂ C(G).
This choice of basis of Vα produces a linear isomorphism (7.5.10)
≈
Jα : Vα −→ M (dα , C).
7.5. Fourier series on compact matrix groups
377
Each space Vα is invariant under left and right translations, so we have a representation γα of G × G on Vα , γα (g, h)u(x) = u(g −1 xh).
(7.5.11)
Setting u(x) = πα (x)jk , we have γα (g, h)πα (x)jk = πα (g −1 xh)jk ∑ = πα (g −1 )jℓ πα (xh)ℓk (7.5.12)
ℓ
=
∑
πα (g −1 )jℓ πα (x)ℓm πα (h)mk .
ℓ,m
Hence, (7.5.13)
A = Jα (u) =⇒ Jα (γα (g, h)u) = πα (h)Aπα (g −1 ).
The following result is to some degree parallel to Proposition 7.4.23. Proposition 7.5.3. For each α ∈ A, the representation γα of G × G on Vα is irreducible. To start the proof, supose W ⊂ Vα is a linear subspace, invariant under the action of G × G via γα . Suppose w ∈ W is nonzero, say w(g0 ) ̸= 0. Then w0 = γ(I, g0 )w ∈ W and w0 (I) ̸= 0. Now consider ∫ ∫ (7.5.14) w1 (x) = γα (g, g)w0 (x) dg = w0 (g −1 xg) dg. G
G
We have w1 (I) = w0 (I) ̸= 0, so w1 ∈ W and w1 ̸= 0. Also γα (g, g)w1 = w1 for all g ∈ G, i.e., (7.5.15)
w1 (g −1 xg) = w1 (x),
∀ x, g ∈ G.
Now we can produce an element of Vα that has this conjugation invariance, namely (7.5.16)
χα (x) = Tr πα (x) =
dα ∑
πα (x)jj .
j=1
It is useful to have the following. Lemma 7.5.4. If w1 ∈ Vα is conjugation invariant, i.e., if (7.5.15) holds, then w1 is a constant multiple of χα . Before proving Lemma 7.5.4, let us see how it finishes off the proof of Proposition 7.5.3. Indeed, if W ⊂ Vα is a linear subspace that is invariant under γα , then the argument above shows that W must contain χα , given the lemma above. If W were a proper linear subspace, then its orthogonal complement W ⊥ ⊂ Vα would also be invariant, so the same argument gives χα ∈ W ⊥ . This is a contradiction, so Proposition 7.5.3 is proved, modulo a proof of Lemma 7.5.4. To prove Lemma 7.5.4, we use the isomorphism Jα in (7.5.10)–(7.5.12). We have A1 = J (w1 ) ∈ M (dα , C), satisfying (7.5.17)
πα (g)A1 πα (g)−1 = A1 ,
∀ g ∈ G.
378
7. Fourier analysis
The following result, called Schur’s lemma, applies, and yields Lemma 7.5.4. Lemma 7.5.5. If πα is an irreducible unitary representation of G on Cdα and (7.5.17) holds for some A1 ∈ M (dα , C), then A1 is a scalar multiple of the identity matrix. Proof. If (7.5.17) holds, then πα (g) also commutes with A∗1 for all g, hence with A1 + A∗1 and with (A1 − A∗1 )/i, so it suffices to consider A1 self adjoint. Then (7.5.17) implies each eigenspace of A1 is invariant under the action of G via πα . The irreducibility of πα then implies A1 must be a scalar multiple of I. At this point, we endow G with a bi-invariant Riemannian metric and consider the associated Laplace operator ∆ on G. We aim to show that, for each α ∈ A, ∆ : Vα −→ Vα ,
(7.5.18)
and ∆ acts like a scalar multiple of the identity on each of these spaces. The following result enables us to establish (7.5.18), and is interesting in its own right. Proposition 7.5.6. Let {Ej } be an orthonormal basis of g = TI G, and consider the left-invariant vector fields Xj = XEj . Then ∑ (7.5.19) ∆= Xj2 . j
Proof. Denote the right side of (7.5.19) by L. Both ∆ and L are second order differential operators on G; L is manifestly left invariant, and ∆ is both left and right invariant. If we use the exponential coordinate system, associated to the map Exp : g → G, we see that both ∆ and L have expressions of the form ∑ ∑ (7.5.20) ajk (x)∂j ∂k + bj (x)∂j + c(x), j
j,k jk
jk
with a (0) = δ , in both cases. Since both ∆ and L are left invariant, this implies ∆ − L is a first-order differential operator, so (7.5.21)
∆ − L = X + c,
where X is a left-invariant vector field and c is a constant. Since both ∆1 = 0 and L1 = 0, we have c = 0. Also, X is skew-adjoint: (7.5.22)
(Xf, g) = −(f, Xg),
f, g ∈ C ∞ (G),
while ∆ and L are self adjoint, i.e., (7.5.23)
(∆f, g) = (f, ∆g),
and similarly for L. This implies X = 0, hence ∆ = L. The identity (7.5.19) applies to (7.5.18) as follows. First, (7.5.24) via
A ∈ g =⇒ XA : Vα → Vα , d πα (xetA )jk dt t=0 ∑ = πα (x)jℓ σα (A)ℓk ,
XA πα (x)jk = (7.5.25)
ℓ
7.5. Fourier series on compact matrix groups
379
where σα : g → L(Vα ) is the derived representation associated to πα . From (7.5.24) we have ∑ (7.5.26) L= Xj2 : Vα −→ Vα , J
which, via (7.5.19), implies (7.5.18). Combining this with the fact that ∆ commutes with left and right translations gives (7.5.27)
∆γα (g, h) = γα (g, h)∆ on Vα ,
∀ g, h ∈ G.
In view of the irreducibility result, Proposition 7.5.3, we have the following. Proposition 7.5.7. For each α ∈ A, we have λα ∈ [0, ∞) such that (7.5.28)
∆u = −λ2α u,
∀ u ∈ Vα .
The function χα = Tr πα defined in (7.5.16) is called the character of the representation πα . One can see that these characters play a role in the proof of irreducibility in Proposition 7.5.3 similar to that played by zonal harmonics in the proof of the irreducibility in Proposition 7.4.23. Going further, the role of zonal functions on spheres S n−1 , introduced in (7.4.139), has an analogue in the space of central functions, (7.5.29)
Z(G) = {f ∈ C(G) : f (g −1 xg) = f (x), ∀ x, g ∈ G}.
The content of Lemma 7.5.4 is that (7.5.30)
Vα ∩ Z(G) = Span(χα ),
which is parallel to Proposition 7.4.18. The following consequence of Propositions 7.5.1–7.5.2 and the observations above is parallel to Proposition 7.4.19. Proposition 7.5.8. The family of functions X = {χα : α ∈ A}
(7.5.31)
is an orthonormal set of functions on G. Furthermore, (7.5.32)
Span X is dense in Z(G).
Consequently, for each f ∈ Z(G), (7.5.33)
f=
∑
(f, χα )L2 χα ,
α
with convergence in L2 -norm. The expansion (7.5.33) is analogous to the Funk-Hecke identity, (7.4.152)– (7.4.153). For another example of the parallel between characters on compact matrix groups G and zonal harmonics on spheres S n−1 , we have the following result, which can be compared with (7.4.113)–(7.4.117). Proposition 7.5.9. The projection Pα of R(G) onto Vα , introduced in (7.5.8), is given by ∫ (7.5.34) Pα f (x) = dα χα (g −1 x)f (g) dg. G
380
7. Fourier analysis
Proof. The characterization (7.5.8) says ∫ (7.5.35) Pα f (x) = Kα (x, g)f (g) dg, G
with (7.5.36)
Kα (x, g) = dα
∑
πα (g)jk πα (x)jk .
j,k≤dα
By comparison, χα (g −1 x) =
∑
πα (g −1 x)kk
k
=
(7.5.37)
∑
πα (g −1 )kj πα (x)jk
j,k
=
∑
πα (g)jk πα (x)jk ,
j,k
the last identity due to unitarity: πα (g −1 ) = πα (g)∗ . This gives (7.5.34).
The operation in (7.5.34) is an example of a convolution. Generally, the convolution of two functions u and v on G is defined as ∫ (7.5.38) u ∗ v(x) = u(g)v(g −1 x) dg. G
Thus (7.5.34) says (7.5.39)
Pα f = dα f ∗ χα .
The convolution product is associative, as one can check: (7.5.40)
u ∗ (v ∗ w) = (u ∗ v) ∗ w.
This product is not generally commutative (unless G is commutative), but we have the following. Proposition 7.5.10. If G is a compact matrix group and f ∈ R(G), (7.5.41)
χ ∈ Z(G) =⇒ f ∗ χ = χ ∗ f.
We close this section with a brief discussion of a case where the methods of this section and of §7.4 are equally applicable, namely the matrix group ) {( z } w (7.5.42) SU (2) = : z, w ∈ C, |z|2 + |w|2 = 1 , −w z consisting of (7.5.43)
U ∈ M (2, C) such that U ∗ U = I, det U = 1.
There is a natural bijective map of SU (2) onto (7.5.44)
{(z, w) ∈ C2 : |z|2 + |w|2 = 1} = S 3 .
The group SU (2) has a unique bi-invariant Riemannian metric, up to a constant factor, and this correspondence maps SU (2) isometrically onto S 3 , with its standard metric induced from inclusion in C2 ≈ R4 (up to a constant factor). The
Exercises
381 (
) 1 0 of SU (2) corresponds to (1, 0) = (1 + 0i, 0 + 0i) ∈ C2 , 0 1 i.e., to (1, 0, 0, 0) ∈ R4 , which differs just by a rotation from our choice of “north pole” (0, 0, 0, 1) in §7.4. If we make this adjustment, we can verify that, under the diffeomorphism SU (2) → S 3 described above, the space Z(SU (2)) of central functions on SU (2) is taken isomorphically to the space Z(S 3 ) of zonal functions on S 3 . identity element I =
Thanks to the isometry of SU (2) → S 3 , the bi-invariant Laplace operator on SU (2) considered here corresponds to the Laplace operator ∆S on S 3 considered in §7.4, so we have a natural isomorphism of each eigenspace of ∆ (a subspace of C ∞ (SU (2))) with an eigenspace of ∆S (a subspace of C ∞ (S 3 )), and vice-versa. Recall that, with G = SU (2), G × G acts on G as a group of isometries, via x 7→ gxh−1 . We get the identity map here precisely when (7.5.45)
(g, h) ∈ {(I, I), (−I, −I)},
and this sets up a natural homomorphism (7.5.46)
SU (2) × SU (2) −→ SO(4),
onto the group of rotations of R4 (yielding isometries of S 3 ), taking the set (7.5.45) to I ∈ SO(4). Recall that ∆ acts as a scalar on each space Vα , on which SU (2) × SU (2) acts irreducibly. Meanwhile, each eigenspace Vk of ∆S is a space of functions on S 3 on which SO(4) acts irreducibly. Comparing these facts allows us to show that the isometry SU (2) → S 3 induces an isomporphism (7.5.47)
≈
Vα −→ Vk ,
for appropriate k = k(α). These ingredients then set up an equivalence between Fourier series on SU (2), as considered in this section, and analysis of spherical harmonic expansions on S 3 , as considered in §7.4.
Exercises 1. Verify the assertions (7.5.40) and (7.5.41), regarding the convolution product. 2. Let π be a finite-dimensional continuous representation of the compact matrix group G. For f ∈ R(G), set ∫ π(f ) = f (x)π(x) dx. G
Show that π(u ∗ v) = π(u)π(v).
382
7. Fourier analysis
3. Show that ∥u ∗ v∥sup ≤ ∥u∥L2 ∥v∥L2 . 4. Use (7.5.38) in concert with (7.5.39)–(7.5.40) to show that Pα (u ∗ v) = (Pα u) ∗ (Pα v). Hint. Start with the observation that χα = Pα χα = dα χα ∗ χα , hence Pα (u ∗ v) = d2α (u ∗ v) ∗ (χα ∗ χα ). 5. Show that
2∥Pα (u ∗ v)∥sup ≤ 2∥Pα u∥L2 ∥Pα v∥L2 ≤ ∥Pα u∥2L2 + ∥Pα v∥2L2 .
6. Noting that ∥u∥2L2 =
∑
∥Pα u∥2L2 ,
α
show that 2
∑
∥Pα (u ∗ v)∥sup ≤ ∥u∥2L2 + ∥v∥2L2 .
α
Take u 7→ tu and v 7→ t−1 v and optimize the resulting estimates over t to deduce that ∑ ∥Pα (u ∗ v)∥sup ≤ ∥u∥L2 ∥v∥L2 . α
7.6. Isoperimetric inequality Let Ω ⊂ R2 be a smoothly bounded open set, with boundary ∂Ω. The following result, known as the isoperimetric inequality, implies that if the length ℓ(∂Ω) is fixed, then the area Area(Ω) is maximal when Ω is a disk. Proposition 7.6.1. In the setting described above, (7.6.1)
ℓ(∂Ω)2 ≥ 4π Area(Ω).
In case of equality in (7.6.1), Ω must be a disk. Proof. We can assume ∂Ω has only one connected component. Otherwise, one component of ∂Ω, say γ, touches the unbounded component O of R2 \ Ω. Then R2 \ γ has two connected components, O and U , each having γ as its boundary. We have (7.6.2)
ℓ(∂Ω) ≥ ℓ(∂U ),
Area(U ) ≥ Area(Ω),
so the inequality (7.6.3)
ℓ(∂U )2 ≥ 4π Area(U )
would imply (7.6.1). So we can assume ∂Ω = γ. To proceed, parametrize γ to have constant speed, say (7.6.4)
γ : T = R/(2πZ) −→ ∂Ω
7.6. Isoperimetric inequality
383
is a diffeomorphism, with |γ ′ (t)| ≡ ℓ(∂Ω)/2π. We will obtain formulas for ℓ(∂Ω) and Area(Ω) in terms of Fourier coefficients, as follows. Identifying R2 with C, take ∫ π ∑ 1 ikt (7.6.5) γ(t) = ak e , ak = γˆ (k) = γ(s)e−iks ds. 2π −π k
Since γ has constant speed, we have 4π 2 ℓ(∂Ω) = 2π
∫
π
2
(7.6.6)
= 4π 2
−π
∑
|γ ′ (t)|2 dt
k 2 |ak |2 ,
k
by the Plancherel identity for Fourier series, plus the identity ∑ (7.6.7) γ ′ (t) = i kak eikt . k
Meanwhile, by Green’s theorem, ∫ ∫ ∫ 1 Area(Ω) = x dy = − y dx = z dz 2i ∂Ω ∂Ω ∂Ω ∫ 1 π ′ = γ (t)γ(t) dt (7.6.8) 2i −π = −πi(γ ′ , γ)L2 (T,dt/2π) ∑ =π k|ak |2 . k
Hence (7.6.9)
ℓ(∂Ω)2 − 4π Area(Ω) = 4π 2
∑
(k 2 − k)|ak |2 .
k
Note that k 2 −k ≥ 0 on Z, so the right side of (7.6.9) is ≥ 0, and we have (7.6.1).
To prove the last assertion in Proposition 7.6.1, we will establish the following quantitative refinement. Proposition 7.6.2. Let Ω ⊂ R2 be a smoothly bounded domain. Assume ∂Ω has just one connected component. There exists C1 ∈ (0, ∞), independent of Ω, with the following property. If δ ∈ (0, 2π] and (7.6.10)
ℓ(∂Ω)2 − 4π Area(Ω) ≤ δ Area(Ω),
then there is a disk D such that (7.6.11)
Area(Ω△D) ≤ C1 δ 1/2 Area(Ω).
Here Ω△D denotes the symmetric difference, (Ω \ D) ∪ (D \ Ω). Note that (7.6.12)
Area(Ω△D) = ∥χΩ − χD ∥2L2 .
Proof. Taking γ as in the proof of Proposition 7.6.1, let D be the disk with boundary curve (7.6.13)
γ0 (t) = a0 + a1 eit .
384
7. Fourier analysis
By (7.6.9), (7.6.14)
ℓ(∂Ω)2 − 4π Area(Ω) = 4π 2
∑
(k 2 − k)|ˆ γ (k) − γˆ0 (k)|2 ,
k
since |ˆ γ (k) − γˆ0 (k)| = |ak | except for k = 0, 1. Also 2
2
k2 − k ≥
(7.6.15)
k2 on Z \ {0, 1}, 2
so (7.6.16)
ℓ(∂Ω)2 − 4π Area(Ω) ≥ 2π 2
∑
k 2 |ˆ γ (k) − γˆ0 (k)|2
= 2π 2 ∥γ ′ − γ0′ ∥2L2 (T) .
In turn, since γ − γ0 has mean value zero, we have (7.6.17)
sup |γ(t) − γ0 (t)| ≤ C2 ∥γ ′ − γ0′ ∥L2 (T) . t∈T
Then the hypothesis (7.6.10) gives (7.6.18)
sup |γ(t) − γ0 (t)| ≤ C2 δ 1/2 ∥χΩ ∥L2 . t∈T
Also, if ρ = |a1 | is the radius of D, and if we denote by Dζ a concentric disk of radius ζ, we have sup |γ(t) − γ0 (t)| ≤ σ (7.6.19)
=⇒ Dρ−σ ⊂ Ω ⊂ Dρ+σ =⇒ Area(Ω△D) ≤ 4πρσ = 4π 1/2 σ∥χD ∥L2 .
Hence (7.6.20)
Area(Ω△D) ≤ 4π 1/2 C2 δ 1/2 ∥χD ∥L2 ∥χΩ ∥L2 .
Meanwhile, (7.6.21)
Area(D) = π|a1 |2 ≤ Area(Ω),
by (7.6.8), so ∥χD ∥L2 ≤ ∥χΩ ∥L2 , and we have the desired conclusion (7.6.11).
Appendix A
Complementary material
Here we treat a number of topics that illuminate material in the main body of the text. In Appendix A.1 we discuss metric spaces. A metric space is a set X, equipped with a distance function d(x, y), satisfying some natural conditions (cf. (A.1.1)), notably the triangle inequality, d(x, y) ≤ d(x, z) + d(z, y), for all x, y, z ∈ X. The notion of a metric space is a natural abstraction of that of Euclidean space, treated in §1.2. More general types of spaces that fall under the category of metric spaces include various spaces of functions, and the unification of these notions is quite useful. Appendix A.2, treating inner product spaces, also complements §1.2 and extends its scope. It treats both finite and infinite dimensional inner product spaces, the latter class making contact with Fourier series in §7.1. Material in Appendix A.3, on eigenvalues and eigenvectors, is useful for the study of various types of critical points in §2.1. Material on power series in Appendix A.4 both complements results in §2.1 and provides a path to a proof of the Weierstrass approximation theorem in Appendix A.5. This result in turn is extended to the Stone-Weierstrass approximation theorem in that appendix, and this is of use in the treatment of Fourier series, in §7.1. In §A.6 we present some results on harmonic functions, complementing results established in Chapters 5 and 7. One such result is a removable singularity theorem, used in the proof of Proposition 7.4.3. Appendix A.7 introduces de Rham cohomology, as an extension of degree theory, developed in Chapter 5. Whereas degree theory applies to maps f : M → N between manifolds of the same dimension, and deals with the behavior of the pullback f ∗ on top-degree forms, de Rham cohomology applies to maps between manifolds of possibly different dimension, and deals with the behavior of f ∗ on forms 385
386
A. Complementary material
of various degrees. One basic example is the Hopf invariant, which applies to maps f : M → N , where N has dimension n and M has dimension 2n − 1.
A.1. Metric spaces, convergence, and compactness A metric space is a set X, together with a distance function d : X × X → [0, ∞), having the properties that d(x, y) = 0 ⇐⇒ x = y, (A.1.1)
d(x, y) = d(y, x), d(x, y) ≤ d(x, z) + d(y, z).
The third of these properties is called the triangle inequality. An example of a metric space is the set of rational numbers Q, with d(x, y) = |x − y|. Another example is X = Rn , with √ d(x, y) = (x1 − y1 )2 + · · · + (xn − yn )2 . If (xν ) is a sequence in X, indexed by ν = 1, 2, 3, . . . , i.e., by ν ∈ Z+ , one says xν → y if d(xν , y) → 0, as ν → ∞. One says (xν ) is a Cauchy sequence if d(xν , xµ ) → 0 as µ, ν → ∞. One says X is a complete metric space if every Cauchy sequence converges to a limit in X. Some metric spaces are not complete; for example, Q is √not complete. You can take a sequence (xν ) of rational numbers such that xν → 2, which is not rational. Then (xν ) is Cauchy in Q, but it has no limit in Q. b as If a metric space X is not complete, one can construct its completion X b consist of an equivalence class of Cauchy sefollows. Let an element ξ of X quences in X, where we say (xν ) ∼ (yν ) provided d(xν , yν ) → 0. We write the equivalence class containing (xν ) as [xν ]. If ξ = [xν ] and η = [yν ], we can set b a d(ξ, η) = limν→∞ d(xν , yν ), and verify that this is well defined, and makes X complete metric space. The following is the major result of Chapter 1 of [49]. Theorem A.1.1. If the completion process described above is applied to the set Q of rational numbers, one obtains the set R of real numbers, as a complete space, with metric d(x, y) = |x − y|. There are a number of useful concepts related to the notion of closeness. We define some of them here. First, if p is a point in a metric space X and r ∈ (0, ∞), the set (A.1.2)
Br (p) = {x ∈ X : d(x, p) < r}
is called the open ball about p of radius r. Generally, a neighborhood of p ∈ X is a set containing such a ball, for some r > 0. A set U ⊂ X is called open if it contains a neighborhood of each of its points. The complement of an open set is said to be closed. The following result characterizes closed sets.
A.1. Metric spaces, convergence, and compactness
387
Proposition A.1.2. A subset K ⊂ X of a metric space X is closed if and only if (A.1.3)
xj ∈ K, xj → p ∈ X =⇒ p ∈ K.
Proof. Assume K is closed, xj ∈ K, xj → p. If p ∈ / K, then p ∈ X \ K, which is open, so some Bε (p) ⊂ X \ K, and d(xj , p) ≥ ε for all j. This contradiction implies p ∈ K. Conversely, assume (A.1.3) holds, and let q ∈ U = X \ K. If B1/n (q) is not contained in U for any n, then there exists xn ∈ K ∩ B1/n (q), hence xn → q, contradicting (A.1.3). This completes the proof. The following is straightforward. Proposition A.1.3. If Uα is a family of open sets in X, then ∪α Uα is open. If Kα is a family of closed subsets of X, then ∩α Kα is closed. Given S ⊂ X, we denote by S (the closure of S) the smallest closed subset of X containing S, i.e., the intersection of all the closed sets Kα ⊂ X containing S. The following result is straightforward. Proposition A.1.4. Given S ⊂ X, p ∈ S if and only if there exist xj ∈ S such that xj → p. Given S ⊂ X, p ∈ X, we say p is an accumulation point of S if and only if, for each ε > 0, there exists q ∈ S ∩ Bε (p), q ̸= p. It follows that p is an accumulation point of S if and only if each Bε (p), ε > 0, contains infinitely many points of S. One straightforward observation is that all points of S \ S are accumulation points of S. The interior of a set S ⊂ X is the largest open set contained in S, i.e., the union of all the open sets contained in S. Note that the complement of the interior of S is equal to the closure of X \ S. We now turn to the notion of compactness. We say a metric space X is compact provided the following property holds: (A.1.4)
Each sequence (xk ) in X has a convergent subsequence.
We will establish various properties of compact metric spaces, and provide various equivalent characterizations. For example, it is easily seen that (A.1.4) is equivalent to: (A.1.5)
Each infinite subset S ⊂ X has an accumulation point.
The following property is known as total boundedness: Proposition A.1.5. If X is a compact metric space, then (A.1.6)
Given ε > 0, ∃ finite set {x1 , . . . , xN } such that Bε (x1 ), . . . , Bε (xN ) covers X.
Proof. Take ε > 0 and pick x1 ∈ X. If Bε (x1 ) = X, we are done. If not, pick x2 ∈ X \ Bε (x1 ). If Bε (x1 ) ∪ Bε (x2 ) = X, we are done. If not, pick x3 ∈
388
A. Complementary material
X \ [Bε (x1 ) ∪ Bε (x2 )]. Continue, taking xk+1 ∈ X \ [Bε (x1 ) ∪ · · · ∪ Bε (xk )], if Bε (x1 ) ∪ · · · ∪ Bε (xk ) ̸= X. Note that, for 1 ≤ i, j ≤ k, i ̸= j =⇒ d(xi , xj ) ≥ ε. If one never covers X this way, consider S = {xj : j ∈ N}. This is an infinite set with no accumulation point, so property (A.1.5) is contradicted. Corollary A.1.6. If X is a compact metric space, it has a countable dense subset. Proof. Given ε = 2−n , let Sn be a finite set of points xj such that {Bε (xj )} covers X. Then C = ∪n Sn is a countable dense subset of X. Here is another useful property of compact metric spaces, which will eventually be generalized even further, in (A.1.10) below. Proposition A.1.7. Let X be a compact metric space. Assume K1 ⊃ K2 ⊃ K3 ⊃ · · · form a decreasing sequence of closed subsets of X. If each Kn = ̸ ∅, then ∩n Kn ̸= ∅. Proof. Pick xn ∈ Kn . If (A.1.4) holds, (xn ) has a convergent subsequence, xnk → y. Since {xnk : k ≥ ℓ} ⊂ Knℓ , which is closed, we have y ∈ ∩n Kn . Corollary A.1.8. Let X be a compact metric space. Assume U1 ⊂ U2 ⊂ U3 ⊂ · · · form an increasing sequence of open subsets of X. If ∪n Un = X, then UN = X for some N . Proof. Consider Kn = X \ Un .
The following is an important extension of Corollary A.1.8. Proposition A.1.9. If X is a compact metric space, then it has the property: (A.1.7)
Every open cover {Uα : α ∈ A} of X has a finite subcover.
Proof. Each Uα is a union of open balls, so it suffices to show that (A.1.4) implies the following: (A.1.8)
Every cover {Bα : α ∈ A} of X by open balls has a finite subcover.
Let C = {zj : j ∈ N} ⊂ X be a countable dense subset of X, as in Corollary A.1.6. Each Bα is a union of balls Brj (zj ), with zj ∈ C ∩ Bα , rj rational. Thus it suffices to show that Every countable cover {Bj : j ∈ N} of X (A.1.9) by open balls has a finite subcover. For this, we set Un = B1 ∪ · · · ∪ Bn and apply Corollary A.1.8.
A.1. Metric spaces, convergence, and compactness
389
The following is a convenient alternative to property (A.1.7): ∩ If Kα ⊂ X are closed and Kα = ∅, (A.1.10) α then some finite intersection is empty. Considering Uα = X \ Kα , we see that (A.1.7) ⇐⇒ (A.1.10). The following result, known as the Heine-Borel theorem, completes Proposition A.1.9. Theorem A.1.10. For a metric space X, (A.1.4) ⇐⇒ (A.1.7). Proof. By Proposition A.1.9, (A.1.4) ⇒ (A.1.7). To prove the converse, it will suffice to show that (A.1.10) ⇒ (A.1.5). So let S ⊂ X and assume S has no accumulation point. We claim: Such S must be closed. Indeed, if z ∈ S and z ∈ / S, then z would have to be an accumulation point. Say S = {xα : α ∈ A}. Set Kα = S \ {xα }. Then each Kα has no accumulation point, hence Kα ⊂ X is closed. Also ∩α Kα = ∅. Hence there exists a finite set F ⊂ A such that ∩α∈F Kα = ∅, if (A.1.10) holds. Hence S = ∪α∈F {xα } is finite, so indeed (A.1.10) ⇒ (A.1.5). Remark. So far we have that for every metric space X, (A.1.4) ⇐⇒ (A.1.5) ⇐⇒ (A.1.7) ⇐⇒ (A.1.10) =⇒ (A.1.6). We claim that (A.1.6) implies the other conditions if X is complete. Of course, compactness implies completeness, but (A.1.6) may hold for incomplete X, e.g., X = (0, 1) ⊂ R. Proposition A.1.11. If X is a complete metric space with property (A.1.6), then X is compact. Proof. It suffices to show that (A.1.6) ⇒ (A.1.5) if X is a complete metric space. So let S ⊂ X be an infinite set. Cover X by balls B1/2 (x1 ), . . . , B1/2 (xN ). One of these balls contains infinitely many points of S, and so does its closure, say X1 = B1/2 (y1 ). Now cover X by finitely many balls of radius 1/4; their intersection with X1 provides a cover of X1 . One such set contains infinitely many points of S, and so does its closure X2 = B1/4 (y2 ) ∩ X1 . Continue in this fashion, obtaining X1 ⊃ X2 ⊃ X3 ⊃ · · · ⊃ Xk ⊃ Xk+1 ⊃ · · · ,
Xj ⊂ B2−j (yj ),
each containing infinitely many points of S. One sees that (yj ) forms a Cauchy sequence. If X is complete, it has a limit, yj → z, and z is seen to be an accumulation point of S.
390
A. Complementary material
If Xj , 1 ≤ j ≤ m, is a finite collection of metric spaces, with metrics dj , we can define a Cartesian product metric space (A.1.11)
X=
m ∏
Xj ,
d(x, y) = d1 (x1 , y1 ) + · · · + dm (xm , ym ).
j=1
√ Another choice of metric is δ(x, y) = d1 (x1 , y1 )2 + · · · + dm (xm , ym )2 . The metrics d and δ are equivalent, i.e., there exist constants C0 , C1 ∈ (0, ∞) such that (A.1.12)
C0 δ(x, y) ≤ d(x, y) ≤ C1 δ(x, y),
∀ x, y ∈ X.
A key example is R , the Cartesian product of m copies of the real line R. m
We describe some important classes of compact spaces. Proposition A.1.12. If Xj are compact metric spaces, 1 ≤ j ≤ m, so is X = ∏m j=1 Xj . Proof. If (xν ) is an infinite sequence of points in X, say xν = (x1ν , . . . , xmν ), pick a convergent subsequence of (x1ν ) in X1 , and consider the corresponding subsequence of (xν ), which we relabel (xν ). Using this, pick a convergent subsequence of (x2ν ) in X2 . Continue. Having a subsequence such that xjν → yj in Xj for each j = 1, . . . , m, we then have a convergent subsequence in X. The following result (already stated in Theorem 1.2.4) is useful for calculus on Rn . Proposition A.1.13. If K is a closed bounded subset of Rn , then K is compact. Proof. The discussion above reduces the problem to showing that any closed interval I = [a, b] in R is compact. This compactness is a corollary of Proposition A.1.11. For pedagogical purposes, we redo the argument here, since in this concrete case it can be streamlined. Suppose S is a subset of I with infinitely many elements. Divide I into 2 equal subintervals, I1 = [a, b1 ], I2 = [b1 , b], b1 = (a + b)/2. Then either I1 or I2 must contain infinitely many elements of S. Say Ij does. Let x1 be any element of S lying in Ij . Now divide Ij in two equal pieces, Ij = Ij1 ∪ Ij2 . One of these intervals (say Ijk ) contains infinitely many points of S. Pick x2 ∈ Ijk to be one such point (different from x1 ). Then subdivide Ijk into two equal subintervals, and continue. We get an infinite sequence of distinct points xν ∈ S, and |xν − xν+k | ≤ 2−ν (b − a), for k ≥ 1. Since R is complete, (xν ) converges, say to y ∈ I. Any neighborhood of y contains infinitely many points in S, so we are done. If X and Y are metric spaces, a function f : X → Y is said to be continuous provided xν → x in X implies f (xν ) → f (x) in Y. An equivalent condition, which the reader is invited to verify, is (A.1.13)
U open in Y =⇒ f −1 (U ) open in X.
Proposition A.1.14. If X and Y are metric spaces, f : X → Y continuous, and K ⊂ X compact, then f (K) is a compact subset of Y.
A.1. Metric spaces, convergence, and compactness
391
Proof. If (yν ) is an infinite sequence of points in f (K), pick xν ∈ K such that f (xν ) = yν . If K is compact, we have a subsequence xνj → p in X, and then yνj → f (p) in Y. If F : X → R is continuous, we say f ∈ C(X). A useful corollary of Proposition A.1.14 is: Proposition A.1.15. If X is a compact metric space and f ∈ C(X), then f assumes a maximum and a minimum value on X. Proof. We know from Proposition A.1.14 that f (X) is a compact subset of R. Hence f (X) is bounded, say f (X) ⊂ I = [a, b]. Repeatedly subdividing I into equal halves, as in the proof of Proposition A.1.13, at each stage throwing out intervals that do not intersect f (X), and keeping only the leftmost and rightmost interval amongst those remaining, we obtain points α ∈ f (X) and β ∈ f (X) such that f (X) ⊂ [α, β]. Then α = f (x0 ) for some x0 ∈ X is the minimum and β = f (x1 ) for some x1 ∈ X is the maximum. At this point, the reader might take a look at the proof of the Mean Value Theorem, given in §1.1, which applies this result. If S ⊂ R is a nonempty, bounded set, Proposition A.1.13 implies S is compact. The function η : S → R, η(x) = x is continuous, so by Proposition A.1.15 it assumes a maximum and a minimum on S. We set (A.1.14)
sup S = max x,
inf S = min x,
s∈S
x∈S
when S is bounded. More generally, if S ⊂ R is nonempty and bounded from above, say S ⊂ (−∞, B], we can pick A < B such that S ∩ [A, B] is nonempty, and set (A.1.15)
sup S = sup S ∩ [A, B].
Similarly, if S ⊂ R is nonempty and bounded from below, say S ⊂ [A, ∞), we can pick B > A such that S ∩ [A, B] is nonempty, and set inf S = inf S ∩ [A, B].
(A.1.16)
If X is a nonempty set and f : X → R is bounded from above, we set (A.1.17)
sup f (x) = sup f (X), x∈X
and if f : X → R is bounded from below, we set (A.1.18)
inf f (x) = inf f (X).
x∈X
If f is not bounded from above, we set sup f = +∞, and if f is not bounded from below, we set inf f = −∞. Given a set X, f : X → R, and xn → x, we set ( ) (A.1.19) lim sup f (xn ) = lim sup f (xk ) , n→∞
n→∞ k≥n
and (A.1.20)
( lim inf f (xn ) = lim n→∞
) inf f (xk ) .
n→∞ k≥n
392
A. Complementary material
We return to the notion of continuity. A function f ∈ C(X) is said to be uniformly continuous provided that, for any ε > 0, there exists δ > 0 such that (A.1.21)
x, y ∈ X, d(x, y) ≤ δ =⇒ |f (x) − f (y)| ≤ ε.
An equivalent condition is that f have a modulus of continuity, i.e., a monotonic function ω : [0, 1) → [0, ∞) such that δ ↘ 0 ⇒ ω(δ) ↘ 0, and such that (A.1.22)
x, y ∈ X, d(x, y) ≤ δ ≤ 1 =⇒ |f (x) − f (y)| ≤ ω(δ).
Not all continuous functions are uniformly continuous. For example, if X = (0, 1) ⊂ R, then f (x) = sin 1/x is continuous, but not uniformly continuous, on X. The following result is useful, for example, in the development of the Riemann integral in §2.1. Proposition A.1.16. If X is a compact metric space and f ∈ C(X), then f is uniformly continuous. Proof. If not, there exist xν , yν ∈ X and ε > 0 such that d(xν , yν ) ≤ 2−ν but |f (xν ) − f (yν )| ≥ ε.
(A.1.23)
Taking a convergent subsequence xνj → p, we also have yνj → p. Now continuity of f at p implies f (xνj ) → f (p) and f (yνj ) → f (p), contradicting (A.14). If X and Y are metric spaces, the space C(X, Y ) of continuous maps f : X → Y has a natural metric structure, under some additional hypotheses. We use ( ) (A.1.24) D(f, g) = sup d f (x), g(x) . x∈X
This sup exists provided f (X) and g(X) are bounded subsets of Y, where to say B ⊂ Y is bounded is to say d : B × B → [0, ∞) has bounded image. In particular, this supremum exists if X is compact. The following result is useful in the proof of the fundamental local existence result for ODE, in §2.3. Proposition A.1.17. If X is a compact metric space and Y is a complete metric space, then C(X, Y ), with the metric (A.1.16), is complete. Proof. That D(f, g) satisfies the conditions to define a metric on C(X, Y ) is straightforward. We check completeness. Suppose (fν ) is a Cauchy sequence in C(X, Y ), so, as ν → ∞, ( ) (A.1.25) sup sup d fν+k (x), fν (x) ≤ εν → 0. k≥0 x∈X
Then in particular (fν (x)) is a Cauchy sequence in Y for each x ∈ X, so it converges, say to g(x) ∈ Y . It remains to show that g ∈ C(X, Y ) and that fν → g in the metric (A.1.16). In fact, taking k → ∞ in the estimate above, we have ( ) (A.1.26) sup d g(x), fν (x) ≤ εν → 0, x∈X
A.1. Metric spaces, convergence, and compactness
393
i.e., fν → g uniformly. It remains only to show that g is continuous. For this, let xj → x in X and fix ε > 0. Pick N so that εN < ε. Since fN is continuous, there exists J such that j ≥ J ⇒ d(fN (xj ), fN (x)) < ε. Hence ( ) ( ) ( ) j ≥ J ⇒ d g(xj ), g(x) ≤ d g(xj ), fN (xj ) + d fN (xj ), fN (x) ( ) + d fN (x), g(x) < 3ε.
This completes the proof.
In case Y = R, C(X, R) = C(X), introduced earlier in this appendix. The distance function (A.1.24) can be written D(f, g) = ∥f − g∥sup ,
∥f ∥sup = sup |f (x)|. x∈X
∥f ∥sup is a norm on C(X). Generally, a norm on a vector space V is an assignment f 7→ ∥f ∥ ∈ [0, ∞), satisfying ∥f ∥ = 0 ⇔ f = 0,
∥af ∥ = |a| ∥f ∥,
∥f + g∥ ≤ ∥f ∥ + ∥g∥,
given f, g ∈ V and a a scalar (in R or C). A vector space equipped with a norm is called a normed vector space. It is then a metric space, with distance function D(f, g) = ∥f − g∥. If the space is complete, one calls V a Banach space. In particular, by Proposition A.1.17, C(X) is a Banach space, when X is a compact metric space. We next give a couple of slightly more sophisticated results on compactness. The following extension of Proposition A.1.12 is a special case of Tychonov’s Theorem. Proposition A.1.18. If {Xj : j ∈ Z+ } are compact metric spaces, so is X = ∏∞ j=1 Xj . Here, we can make X a metric space by setting ∞ ∑ dj (pj (x), pj (y)) (A.1.27) d(x, y) = 2−j , 1 + dj (pj (x), pj (y)) j=1 where pj : X → Xj is the projection onto the jth factor. It is easy to verify that, if xν ∈ X, then xν → y in X, as ν → ∞, if and only if, for each j, pj (xν ) → pj (y) in Xj . Proof. Following the argument in Proposition A.1.12, if (xν ) is an infinite sequence of points in X, we obtain a nested family of subsequences (A.1.28)
(xν ) ⊃ (x1 ν ) ⊃ (x2 ν ) ⊃ · · · ⊃ (xj ν ) ⊃ · · ·
such that pℓ (xj ν ) converges in Xℓ , for 1 ≤ ℓ ≤ j. The next step is a diagonal construction. We set (A.1.29)
ξν = xν ν ∈ X.
Then, for each j, after throwing away a finite number N (j) of elements, one obtains from (ξν ) a subsequence of the sequence (xj ν ) in (A.1.28), so pℓ (ξν ) converges in Xℓ for all ℓ. Hence (ξν ) is a convergent subsequence of (xν ).
394
A. Complementary material
The next result the Arzela-Ascoli Theorem. Proposition A.1.19. Let X and Y be compact metric spaces, and fix a modulus of continuity ω(δ). Then { ( ) ( ) } (A.1.30) Cω = f ∈ C(X, Y ) : d f (x), f (x′ ) ≤ ω d(x, x′ ) ∀ x, x′ ∈ X is a compact subset of C(X, Y ). Proof. Let (fν ) be a sequence in Cω . Let Σ be a countable dense subset of X, as in Corollary A.1.6. For each x ∈ Σ, (fν (x)) is a sequence in Y, which hence has a convergent subsequence. Using a diagonal construction similar to that in the proof of Proposition A.1.18, we obtain a subsequence (φν ) of (fν ) with the property that φν (x) converges in Y, for each x ∈ Σ, say φν (x) → ψ(x),
(A.1.31) for all x ∈ Σ, where ψ : Σ → Y.
So far, we have not used (A.1.30). This hypothesis will now be used to show that φν converges uniformly on X. Pick ε > 0. Then pick δ > 0 such that ω(δ) < ε/3. Since X is compact, we can cover X by finitely many balls Bδ (xj ), 1 ≤ j ≤ N, xj ∈ Σ. Pick M so large that φν (xj ) is within ε/3 of its limit for all ν ≥ M (when 1 ≤ j ≤ N ). Now, for any x ∈ X, picking ℓ ∈ {1, . . . , N } such that d(x, xℓ ) ≤ δ, we have, for k ≥ 0, ν ≥ M, ( ) ( ) ( ) d φν+k (x), φν (x) ≤ d φν+k (x), φν+k (xℓ ) + d φν+k (xℓ ), φν (xℓ ) ( ) (A.1.32) + d φν (xℓ ), φν (x) ≤ ε/3 + ε/3 + ε/3. Thus (φν (x)) is Cauchy in Y for all x ∈ X, hence convergent. Call the limit ψ(x), so we now have (A.1.31) for all x ∈ X. Letting k → ∞ in (A.1.32) we have uniform convergence of φν to ψ. Finally, passing to the limit ν → ∞ in (A.1.33)
d(φν (x), φν (x′ )) ≤ ω(d(x, x′ ))
gives ψ ∈ Cω .
We want to re-state Proposition A.1.19, bringing in the notion of equicontinuity. Given metric spaces X and Y , and a set of maps F ⊂ C(X, Y ), we say F is equicontinuous at a point x0 ∈ X provided (A.1.34)
∀ ε > 0, ∃ δ > 0 such that ∀ x ∈ X, f ∈ F , dX (x, x0 ) < δ =⇒ dY (f (x), f (x0 )) < ε.
We say F is equicontinuous on X if it is equicontinuous at each point of X. We say F is uniformly equicontinuous on X provided (A.1.35)
∀ ε > 0, ∃ δ > 0 such that ∀ x, x′ ∈ X, f ∈ F , dX (x, x′ ) < δ =⇒ dY (f (x), f (x′ )) < ε.
Note that (A.1.35) is equivalent to the existence of a modulus of continuity ω such that F ⊂ Cω , given by (A.1.30). It is useful to record the following result.
A.1. Metric spaces, convergence, and compactness
395
Proposition A.1.20. Let X and Y be metric spaces, F ⊂ C(X, Y ). Assume X is compact. then (A.1.36)
F equicontinuous =⇒ F is uniformly equicontinuous.
Proof. The argument is a variant of the proof of Proposition A.1.16. In more detail, suppose there exist xν , x′ν ∈ X, ε > 0, and fν ∈ F such that d(xν , x′ν ) ≤ 2−ν but (A.1.37)
d(fν (xν ), fν (x′ν )) ≥ ε.
Taking a convergent subsequence xνj → p ∈ X, we also have x′νj → p. Now equicontinuity of F at p implies that there esists N < ∞ such that ε (A.1.38) d(g(xνj ), g(p)) < , ∀ j ≥ N, g ∈ F , 2 contradicting (A.1.37). Putting together Propositions A.1.19 and A.1.20 then gives the following. Proposition A.1.21. Let X and Y be compact metric spaces. If F ⊂ C(X, Y ) is equicontinuous on X, then it has compact closure in C(X, Y ). We next define the notion of a connected space. A metric space X is said to be connected provided that it cannot be written as the union of two disjoint nonempty open subsets. The following is a basic class of examples. Proposition A.1.22. Each interval I in R is connected. Proof. Suppose A ⊂ I is nonempty, with nonempty complement B ⊂ I, and both sets are open. Take a ∈ A, b ∈ B; we can assume a < b. Let ξ = sup{x ∈ [a, b] : x ∈ A} This exists, as a consequence of the basic fact that R is complete. Now we obtain a contradiction, as follows. Since A is closed ξ ∈ A. But then, since A is open, there must be a neighborhood (ξ − ε, ξ + ε) contained in A; this is not possible. We say X is path-connected if, given any p, q ∈ X, there is a continuous map γ : [0, 1] → X such that γ(0) = p and γ(1) = q. It is an easy consequence of Proposition A.1.22 that X is connected whenever it is path-connected. The next result, known as the Intermediate Value Theorem, is frequently useful. Proposition A.1.23. Let X be a connected metric space and f : X → R continuous. Assume p, q ∈ X, and f (p) = a < f (q) = b. Then, given any c ∈ (a, b), there exists z ∈ X such that f (z) = c. Proof. Under the hypotheses, A = {x ∈ X : f (x) < c} is open and contains p, while B = {x ∈ X : f (x) > c} is open and contains q. If X is connected, then A∪B cannot be all of X; so any point in its complement has the desired property.
396
A. Complementary material
Exercises 1. If X is a metric space, with distance function d, show that |d(x, y) − d(x′ , y ′ )| ≤ d(x, x′ ) + d(y, y ′ ), and hence d : X × X −→ [0, ∞) is continuous. 2. Let φ : [0, ∞) → [0, ∞) be a C 2 function. Assume φ(0) = 0,
φ′ > 0,
φ′′ < 0.
Prove that if d(x, y) is symmetric and satisfies the triangle inequality, so does δ(x, y) = φ(d(x, y)). Hint. Show that such φ satisfies φ(s + t) ≤ φ(s) + φ(t), for s, t ∈ R+ . 3. Show that the function d(x, y) defined by (A.1.27) satisfies (A.1.1). Hint. Consider φ(r) = r/(1 + r). 4. Let X be a compact metric space. Assume fj , f ∈ C(X) and fj (x) ↗ f (x),
∀ x ∈ X.
Prove that fj → f uniformly on X. (This result is called Dini’s theorem.) Hint. For ε > 0, let Kj (ε) = {x ∈ X : f (x) − fj (x) ≥ ε}. Note that Kj (ε) ⊃ Kj+1 (ε) ⊃ · · · . Given a metric space X and f : X → [−∞, ∞], we say f is lower semicontinuous at x ∈ X provided f −1 ((c, ∞]) ⊂ X is open, ∀ c ∈ R. We say f is upper semicontinuous provided f −1 ([−∞, c)) is open, ∀ c ∈ R. 5. Show that f is lower semicontinuous ⇐⇒ f −1 ([−∞, c]) is closed, ∀ c ∈ R, and
f is upper semicontinuous ⇐⇒ f −1 ([c, ∞]) is closed, ∀ c ∈ R.
6. Show that f is lower semicontinuous ⇐⇒ xn → x implies lim inf f (xn ) ≥ f (x). Show that f is upper semicontinuous ⇐⇒ xn → x implies lim sup f (xn ) ≤ f (x).
A.2. Inner product spaces
397
7. Given S ⊂ X, show that χS is lower semicontinuous ⇐⇒ S is open. χS is upper semicontinuous ⇐⇒ S is closed. 8. If X is a compact metric space, show that f : X → R is lower semicontinuous =⇒ min f is achieved. 9. In the setting of (A.1.11), let { }1/2 δ(x, y) = d1 (x1 , y1 )2 + · · · + dm (xm , ym )2 . Show that δ(x, y) ≤ d(x, y) ≤
√
m δ(x, y).
10. Let X and Y be compact metric spaces. Show that if F ⊂ C(X, Y ) is compact, then F is equicontinuous. (This is a converse to Proposition A.1.21.) 11. Recall that a Banach space is a complete normed linear space. Consider C 1 (I), where I = [0, 1], with norm ∥f ∥C 1 = sup |f | + sup |f ′ |. I
I
1
Show that C (I) is a Banach space. 12. Let F = {f ∈ C 1 (I) : ∥f ∥C 1 ≤ 1}. Show that F has compact closure in C(I). Find a function in the closure of F that is not in C 1 (I).
A.2. Inner product spaces In §6.1 we have looked at norms and inner products on finite-dimensional vector spaces other than Rn , and in §§7.1–7.2 we have looked at norms and inner products on spaces of functions, such as C(Tn ) and R(Rn ), which are infinite-dimensional vector spaces. We discuss general results on such objects here. Generally, as discussed in §1.3, a complex vector space V is a set on which there are operations of vector addition: (A.2.1)
f, g ∈ V =⇒ f + g ∈ V,
and multiplication by an element of C (called scalar multiplication): (A.2.2)
a ∈ C, f ∈ V =⇒ af ∈ V,
satisfying the following properties. For vector addition, we have (A.2.3)
f + g = g + f, (f + g) + h = f + (g + h), f + 0 = f, f + (−f ) = 0.
For multiplication by scalars, we have (A.2.4)
a(bf ) = (ab)f,
1 · f = f.
398
A. Complementary material
Furthermore, we have two distributive laws: (A.2.5)
a(f + g) = af + ag,
(a + b)f = af + bf.
These properties are readily verified for the function spaces mentioned above. An inner product on a complex vector space V assigns to elements f, g ∈ V the quantity (f, g) ∈ C, in a fashion that obeys the following three rules: (a1 f1 + a2 f2 , g) = a1 (f1 , g) + a2 (f2 , g), (A.2.6)
(f, g) = (g, f ), (f, f ) > 0
unless f = 0.
A vector space equipped with an inner product is called an inner product space. For example, ∫ 1 (A.2.7) (f, g) = f (θ)g(θ) dθ 2π S1
defines an inner product on C(S ), and also on R(S 1 ), where we identify two functions that differ only on a set of upper content zero. Similarly, ∫ ∞ (A.2.8) (f, g) = f (x)g(x) dx 1
−∞
defines an inner product on R(R) (where, again, we identify two functions that differ only on a set of upper content zero). As another example, in we define ℓ2 to consist of sequences (ak )k∈Z such that ∞ ∑
(A.2.9)
|ak |2 < ∞.
k=−∞
An inner product on ℓ2 is given by (A.2.10)
(
∞ ∑ ) a k bk . (ak ), (bk ) = k=−∞
Given an inner product on V , one says the object ∥f ∥ defined by √ (A.2.11) ∥f ∥ = (f, f ) is the norm on V associated with the inner product. Generally, a norm on V is a function f 7→ ∥f ∥ satisfying (A.2.12)
∥af ∥
=
|a| · ∥f ∥,
(A.2.13)
∥f ∥
>
0 unless f = 0,
a ∈ C, f ∈ V,
(A.2.14)
∥f + g∥
≤
∥f ∥ + ∥g∥.
The property (A.2.14) is called the triangle inequality. A vector space equipped with a norm is called a normed vector space. We can define a distance function on such a space by (A.2.15)
d(f, g) = ∥f − g∥.
Properties (A.2.12)–(A.2.14) imply that d : V × V → [0, ∞) satisfies the properties in (A.1.1), making V a metric space.
A.2. Inner product spaces
399
If ∥f ∥ is given by (A.2.11), from an inner product satisfying (A.2.6), it is clear that (A.2.12)–(A.2.13) hold, but (A.2.14) requires a demonstration. Note that ∥f + g∥2 = (f + g, f + g) = ∥f ∥2 + (f, g) + (g, f ) + ∥g∥2
(A.2.16)
= ∥f ∥2 + 2 Re(f, g) + ∥g∥2 , while (∥f ∥ + ∥g∥)2 = ∥f ∥2 + 2∥f ∥ · ∥g∥ + ∥g∥2 .
(A.2.17)
Thus to establish (A.2.17) it suffices to prove the following, known as Cauchy’s inequality. Proposition A.2.1. For any inner product on a vector space V , with ∥f ∥ defined by (A.2.11), |(f, g)| ≤ ∥f ∥ · ∥g∥,
(A.2.18)
∀ f, g ∈ V.
Proof. We start with 0 ≤ ∥f − g∥2 = ∥f ∥2 − 2 Re(f, g) + ∥g∥2 ,
(A.2.19) which implies
2 Re(f, g) ≤ ∥f ∥2 + ∥g∥2 ,
(A.2.20)
∀ f, g ∈ V.
Replacing f by af for arbitrary a ∈ C of absolute velue 1 yields 2 Re a(f, g) ≤ ∥f ∥2 + ∥g∥2 , for all such a, hence 2|(f, g)| ≤ ∥f ∥2 + ∥g∥2 , Replacing f by tf and g by t
−1
g for arbitrary t ∈ (0, ∞), we have
2|(f, g)| ≤ t ∥f ∥2 + t−2 ∥g∥2 , 2
(A.2.21)
∀ f, g ∈ V. ∀ f, g ∈ V, t ∈ (0, ∞).
If we take t = ∥g∥/∥f ∥, we obtain the desired inequality (A.2.18). This assumes f and g are both nonzero, but (A.2.18) is trivial if f or g is 0. 2
An inner product space V is called a Hilbert space if it is a complete metric space, i.e., if every Cauchy sequence (fν ) in V has a limit in V . The space ℓ2 has this completeness property, but C(S 1 ), with inner product (A.2.7), does not, nor does R(S 1 ). Appendix A.1 describes a process of constructing the completion of a metric space. When applied to an incomplete inner product space, it produces a Hilbert space. When this process is applied to C(S 1 ), the completion is the space L2 (S 1 ). An alternative construction of L2 (S 1 ) uses the Lebesgue integral. For this approach, one can consult Chapter 4 of [47]. For the rest of this appendix, we confine attention to finite-dimensional inner product spaces. If V is a finite-dimensional inner product space, a basis {u1 , . . . , un } of V is called an orthonormal basis of V provided (A.2.22)
(uj , uk ) = δjk ,
1 ≤ j, k ≤ n,
i.e., (A.2.23)
∥uj ∥ = 1,
j ̸= k ⇒ (uj , uk ) = 0.
400
A. Complementary material
In such a case we see that v = a1 u1 + · · · + an un , w = b1 u1 + · · · + bn un (A.2.24) =⇒ (v, w) = a1 b1 + · · · + an bn . It is often useful to construct orthonormal bases. The construction we now describe is called the Gramm-Schmidt construction. Proposition A.2.2. Let {v1 , . . . , vn } be a basis of V , an inner product space. Then there is an orthonormal basis {u1 , . . . , un } of V such that (A.2.25)
Span{uj : j ≤ ℓ} = Span{vj : j ≤ ℓ},
1 ≤ ℓ ≤ n.
Proof. To begin, take (A.2.26)
u1 =
1 v1 . ∥v1 ∥
Now define the linear transformation P1 : V → V by P1 v = (v, u1 )u1 and set v˜2 = v2 − P1 v2 = v2 − (v2 , u1 )u1 . We see that (˜ v2 , u1 ) = (v2 , u1 ) − (v2 , u1 ) = 0. Also v˜2 ̸= 0 since u1 and v2 are linearly independent. Hence we set 1 v˜2 . (A.2.27) u2 = ∥˜ v2 ∥ Inductively, suppose we have an orthonormal set {u1 , . . . , um } with m < n and (A.2.25) holding for 1 ≤ ℓ ≤ m. Then define Pm : V → V by (A.2.28)
Pm v = (v, u1 )u1 + · · · + (v, um )um ,
and set (A.2.29)
v˜m+1 = vm+1 − Pm vm+1 = vm+1 − (vm+1 , u1 )u1 − · · · − (vm+1 , um )um .
We see that (A.2.30)
j ≤ m ⇒ (˜ vm+1 , uj ) = (vm+1 , uj ) − (vm+1 , uj ) = 0.
Also, since vm+1 ∈ / Span{v1 , . . . , vm } = Span{u1 , . . . , um }, it follows that v˜m+1 ̸= 0. Hence we set 1 (A.2.31) um+1 = v˜m+1 . ∥˜ vm+1 ∥ This completes the construction. Example. Take V = P2 , with basis {1, x, x2 }, and inner product given by ∫ 1 p(x)q(x) dx. (A.2.32) (p, q) = −1
The Gramm-Schmidt construction gives first 1 (A.2.33) u1 (x) = √ . 2 Then v˜2 (x) = x,
A.2. Inner product spaces
401 ∫1
x2 dx = 2/3, so we take √ 3 x. u2 (x) = 2
since by symmetry (x, u1 ) = 0. Now (A.2.34)
−1
Next
1 v˜3 (x) = x2 − (x2 , u1 )u1 = x2 − , 3 ∫1 since by symmetry (x2 , u2 ) = 0. Now −1 (x2 − 1/3)2 dx = 8/45, so we take √ ( 45 2 1 ) (A.2.35) u3 (x) = x − . 8 3 Let V be an n-dimensional inner product space, W ⊂ V an m-dimensional linear subspace. By Proposition A.2.2, W has an orthonormal basis {w1 , . . . , wm }. We know from §1.3 that V has a basis of the form (A.2.36)
{w1 , . . . , wm , v1 , . . . , vℓ },
ℓ + m = n.
Applying Proposition A.2.2 again gives the following. Proposition A.2.3. If V is an n-dimensional inner product space and W ⊂ V an m-dimensional linear subspace, with orthonormal basis {w1 , . . . , wm }, then V has an orthonormal basis of the form (A.2.37)
{w1 , . . . , wm , u1 , . . . , uℓ },
ℓ + m = n.
We see that, if we define the orthogonal complement of W in V as (A.2.38)
W ⊥ = {v ∈ V : (v, w) = 0, ∀ w ∈ W },
then (A.2.39)
W ⊥ = Span{u1 , . . . , uℓ }.
In particular, (A.2.40)
dim W + dim W ⊥ = dim V.
In the setting of Proposition A.2.3, we can define PW ∈ L(V ) by (A.2.41)
PW v =
m ∑
(v, wj )wj ,
for v ∈ V,
j=1
and see that PW is uniquely defined by the properties (A.2.42)
PW w = w, ∀ w ∈ W,
PW u = 0, ∀ u ∈ W ⊥ .
We call PW the orthogonal projection of V onto W . Note the appearance of such orthogonal projections in the proof of Proposition A.2.2, namely in (A.2.28). Another object that arises in the setting of inner product spaces is the adjoint, defined as follows. If V and W are finite-dimensional inner product spaces and T ∈ L(V, W ), we define the adjoint (A.2.43)
T ∗ ∈ L(W, V ),
(v, T ∗ w) = (T v, w).
402
A. Complementary material
If V and W are real vector spaces, we also use the notation T t for the adjoint, and call it the transpose. In case V = W and T ∈ L(V ), we say T is self-adjoint ⇐⇒ T ∗ = T,
(A.2.44) and (A.2.45)
T is unitary (if F = C), or orthogonal (if F = R) . ⇐⇒ T ∗ = T −1 .
The following gives a significant connection between adjoints and orthogonal complements. Proposition A.2.4. Let V be an n-dimensional inner product space, W ⊂ V a linear subspace. Take T ∈ L(V ). Then (A.2.46)
T : W → W =⇒ T ∗ : W ⊥ → W ⊥ .
Proof. Note that (A.2.47)
(w, T ∗ u) = (T w, u) = 0,
∀ w ∈ W, u ∈ W ⊥ ,
if T : W → W . This shows that T ∗ u ⊥ W for all u ∈ W ⊥ , and we have (A.2.46).
In particular, (A.2.48)
T = T ∗ , T : W → W =⇒ T : W ⊥ → W ⊥ .
A.3. Eigenvalues and eigenvectors Let T : V → V be linear. If there is a nonzero v ∈ V such that (A.3.1)
T v = λj v,
for some λj ∈ F, we say λj is an eigenvalue of T , and v is an eigenvector. Let E(T, λj ) denote the set of vectors v ∈ V such that (A.3.1) holds. It is clear that E(T, λj ) (the λj -eigenspace of T ) is a linear subspace of V and (A.3.2)
T : E(T, λj ) −→ E(T, λj ).
The set of λj ∈ F such that E(T, λj ) ̸= 0 is denoted Spec(T ). Clearly λj ∈ Spec(T ) if and only if T − λj I is not injective, so, if V is finite dimensional, (A.3.3)
λj ∈ Spec(T ) ⇐⇒ det(λj I − T ) = 0.
We call KT (λ) = det(λI − T ) the characteristic polynomial of T . If F = C, we can use the fundamental theorem of algebra, which says every non-constant polynomial with complex coefficients has at least one complex root. (See §5.1 for a proof of this result.) This proves the following. Proposition A.3.1. If V is a finite-dimensional complex vector space and T ∈ L(V ), then T has at least one eigenvector in V . Remark. If V is real and KT (λ) does have a real root λj , then there is a real λj -eigenvector.
A.3. Eigenvalues and eigenvectors
403
Sometimes a linear transformation has only one eigenvector, up to a scalar multiple. Consider the transformation A : C3 → C3 given by 2 1 0 (A.3.4) A = 0 2 1 . 0 0 2 We see that det(λI − A) = (λ − 2)3 , so λ = 2 is a triple root. It is clear that E(A, 2) = Span{e1 },
(A.3.5)
where e1 = (1, 0, 0) is the first standard basis vector of C3 . t
If one is given T ∈ L(V ), it is of interest to know whether V has a basis of eigenvectors of T . The following result is useful. Proposition A.3.2. Assume that the characteristic polynomial of T ∈ L(V ) has k distinct roots, λ1 , . . . , λk , with eigenvectors vj ∈ E(T, λj ), 1 ≤ j ≤ k. Then {v1 , . . . , vk } is linearly independent. In particular, if k = dim V , these vectors form a basis of V . Proof. We argue by contradiction. If {v1 , . . . , vk } is linearly dependent, take a minimal subset that is linearly dependent and (reordering if necessary) say this set is {v1 , . . . , vm }, with T vj = λj vj , and (A.3.6)
c1 v1 + · · · + cm vm = 0,
with cj ̸= 0 for each j ∈ {1, . . . , m}. Applying T − λm I to (A.3.6) gives (A.3.7)
c1 (λ1 − λm )v1 + · · · + cm−1 (λm−1 − λm )vm−1 = 0,
a linear dependence relation on the smaller set {v1 , . . . , vm−1 }. This contradiction proves the proposition. Here is another important class of transformations that have a full complement of eigenvectors. Proposition A.3.3. Let V be an n-dimensional inner product space, T ∈ L(V ). Assume T is self-adjoint, i.e., T = T ∗ . The V has an orthonormal basis of eigenvectors of T . Proof. First, assume V is a complex vector space (F = C). Proposition A.3.1 implies that there exists an eigenvector v1 of T . Let W = Span{v1 }. Then Proposition A.2.4 gives (A.3.8)
T : W ⊥ −→ W ⊥ ,
and dim W ⊥ = n − 1. The proposition then follows by induction on n.
If V is a real vector space (F = R), then the characteristic polynomial det(λI − T ) has a complex root, say λ1 ∈ C. Denote by Ve the complexification of V . The transformation T extends to T ∈ L(Ve ), as a self-adjoint transformation on this complex inner product space. Hence there exists nonzero v1 ∈ Ve such that T v1 = λ1 v1 . We now take note of the following. Proposition A.3.4. If T = T ∗ , every eigenvalue of T is real.
404
A. Complementary material
Proof. Say T v1 = λ1 v1 , v1 ̸= 0. Then (A.3.9)
λ1 ∥v1 ∥2 = (λ1 v1 , v1 ) = (T v1 , v1 ) = (v1 , T v1 ) = (v1 , λv1 ) = λ1 ∥v1 ∥2 .
Hence λ1 = λ1 , so λ1 is real.
Returning to the proof of Proposition A.3.3 when V is a real inner product space, we see that the (complex) root λ1 of det(λI − T ) must in fact be real. Hence λ1 I − T : V → V is not injective, so there exists a λ1 -eigenvector v1 ∈ V . Induction on n, as in the argument above, finishes the proof. Here is a useful general result on orthogonality of eigenvectors. Proposition A.3.5. Let V be an inner product space, T ∈ L(V ). If (A.3.10)
T u = λu,
T ∗ v = µv,
λ ̸= µ,
then u ⊥ v.
(A.3.11) Proof. We have (A.3.12)
λ(u, v) = (T u, v) = (u, T ∗ v) = µ(u, v).
As a corollary, it T = T ∗ , then T u = λu, T v = µv, λ ̸= µ ⇒ u ⊥ v. Our next goal is to extend Proposition A.3.3 to a broader class of transformations. Given T ∈ L(V ), where V is an n-dimensional complex inner product space, we say T is normal if T and T ∗ commute, i.e., T T ∗ = T ∗ T . Equivalently, taking (A.3.13)
T = A + iB,
A = A∗ , B = B ∗ ,
we have (A.3.14)
T normal ⇐⇒ AB = BA.
Generally, for A, B ∈ L(V ), we see that (A.3.15)
BA = AB =⇒ B : E(A, λj ) → E(A, λj ).
Thus, in the setting of (A.3.13), we can find an orthonormal basis of each space E(A, λ), λ ∈ Spec A, consisting of eigenvectors of B, to get an orthonormal basis of V consisting of vectors that are simultaneously eigenvectors of A and B, hence eigenvectors of T . This establishes the following. Proposition A.3.6. Let V be an n-dimensional complex inner product space, T ∈ L(V ) a normal transformation. Then V has an orthonormal basis of eigenvectore of T .
A.3. Eigenvalues and eigenvectors
405
Note that if T has the form (A.3.13)–(A.3.14) and λ = a + ib, a, b ∈ R, then (A.3.16)
E(T, λ) = E(A, a) ∩ E(B, b) = E(T ∗ , λ).
We deduce from Proposition A.3.5 the following. Proposition A.3.7. In the setting of Proposition A.3.6, with T normal, (A.3.17)
λ ̸= µ =⇒ E(T, λ) ⊥ E(T, µ).
An important class of normal operators is the class of unitary operators, defined in §A.2. We recall that if V is an inner product space and T ∈ L(V ), then (A.3.18)
T is unitary ⇐⇒ T ∗ = T −1 .
We write T ∈ U (V ), if V is a complex inner product space. We see from (A.3.16) (or directly) that (A.3.19)
T ∈ U (V ), λ ∈ Spec T =⇒ λ = λ−1 =⇒ |λ| = 1.
We deduce that if T ∈ U (V ), then V has an orthonormal basis of eigenvectors of T , each eigenvalue being a complex number of absolute value 1. If V is a real n-dimensional inner product space and (A.3.18) holds, we say T is an orthogonal transformation, and write T ∈ O(V ). In such a case, V typically does not have an orthonormal basis of eigenvectors of T . However, V does have an orthonormal basis with respect to which such an orthogonal transformation has a special structure, as we proceed to show. To get it, we construct the complexification of V , (A.3.20)
VC = {u + iv : u, v ∈ V },
which has a natural structure of a complex n-dimensional vector space, with a Hermitian inner product. A transformation T ∈ O(V ) has a unique C-linear extension to a transformation on VC , which we continue to denote by T , and this extended transformation is unitary on VC . Hence VC has an orthonormal basis of eigenvectors of T . Say u + iv ∈ VC is such an eigenvector, (A.3.21)
T (u + iv) = e−iθ (u + iv),
eiθ ∈ / {1, −1}.
Writing eiθ = c + is, c, s ∈ R, we have (A.3.22)
T u + iT v = (c − is)(u + iv) = cu + sv + i(−su + cv),
hence (A.3.23)
T u = cu + sv, T v = −su + cv.
In such a case, applying complex conjugation to (A.3.21) yields T (u − iv) = eiθ (u − iv), and eiθ ̸= e−iθ if eiθ ∈ / {1, −1}, so Proposition A.3.7 yields (A.3.24)
u + iv ⊥ u − iv,
406
A. Complementary material
hence 0 = (u + iv, u − iv) (A.3.25)
= (u, u) − (v, v) + i(v, u) + i(u, v) = |u|2 − |v|2 + 2i(u, v),
or equivalently (A.3.26)
|u| = |v| and u ⊥ v.
Now Span{u, v} ⊂ V has an (n − 2)-dimensional orthogonal complement, on which T acts, and an inductive argument gives the following. Proposition A.3.8. Let V be a n-dimensional real inner product space, T : V → V an orthogonal transformation. Then V has an orthonormal basis in which the matrix representation of T consists of blocks ( ) cj −sj (A.3.27) , c2j + s2j = 1, sj cj plus perhaps an identity matrix block if 1 ∈ Spec T , and a block that is −I if −1 ∈ Spec T . This result has the following consequence, advertised in Exercise 14 of §3.2. Corollary A.3.9. For each integer n ≥ 2, (A.3.28)
Exp : Skew(n) −→ SO(n) is onto.
As in §3.2 we leave the proof as an exercise for the reader. The key is to use the Euler-type identity ( ) ( ) 0 −1 cos θ − sin θ θJ (A.3.29) J= =⇒ e = . 1 0 sin θ cos θ In cases when T is a linear transform on an n-dimensional complex vector space V , and V does not have a basis of eigenvectors of T , it is useful to have the concept of a generalized eigenspace, defined as (A.3.30)
GE(T, λj ) = {v ∈ V : (t − λj I)k v = 0 for some k}.
If λj is an eigenvalue of T , nonzero elements of GE(T, λj ) are called generalized eigenvectors. Clearly E(T, λj ) ⊂ GE(T, λj ). Also T : GE(T, λj ) → GE(T, λj ). Furthermore, one has the following. Proposition A.3.10. If µ ̸= λj , then (A.3.31)
≈
T − µI : GE(T, λj ) −→ GE(T, λj ).
It is useful to know the following. Proposition A.3.11. If W is an n-dimensional complex vector space, and T ∈ L(V ), then W has a basis of generalized eigenvectors of T . We will not give a proof of this result here. A proof can be found in Chapter 2, §7 of [50], and also in [52].
A.4. Complements on power series
407
A.4. Complements on power series If a function f is sufficiently differentiable on an interval in R containing x and y, the Taylor expansion about y reads (A.4.1)
f (x) = f (y) + f ′ (y)(x − y) + · · · +
1 (n) f (y)(x − y)n + Rn (x, y). n!
Here, Tn (x, y) = f (y) + · · · + f (n) (y)(x − y)n /n! is that polynomial of degree n in x all of whose x-derivatives of order ≤ n, evaluated at y, coincide with those of f . This prescription makes the formula for Tn (x, y) easy to derive. The analysis of the remainder term Rn (x, y) is more subtle. One useful result about this remainder is the following. Say x > y, and for simplicity assume f (n+1) is continuous on [y, x]; we say f ∈ C n+1 ([y, x]). Then m ≤ f (n+1) (ξ) ≤ M, ∀ ξ ∈ [y, x] (A.4.2) =⇒ m
(x − y)n+1 (x − y)n+1 ≤ Rn (x, y) ≤ M . (n + 1)! (n + 1)!
Under our hypotheses, this result is equivalent to the Lagrange form of the remainder: 1 (A.4.3) Rn (x, y) = (x − y)n+1 f (n+1) (ζn ), (n + 1)! for some ζn between x and y. A proof of (A.4.3). will be given below. One of our purposes here is to comment on how effective estimates on Rn (x, y) are in determining the convergence of the infinite series (A.4.4)
∞ ∑ f (k) (y) k=0
k!
(x − y)k
to f (x). That is to say, we want to perceive that Rn (x, y) → 0 as n → ∞, in appropriate circumstances. Before we look at how effective the estimate (A.4.2) is at this job, we want to introduce another player, and along the way discuss the derivation of various formulas for the remainder in (A.4.1). A simple formula for Rn (x, y) follows upon taking the y-derivative of both sides of (A.4.1); we are assuming that f is at least (n+1)-fold differentiable. When we do this (applying the Leibniz formula to those terms that are products) an enormous amount of cancellation arises, and the formula collapses to (A.4.5)
∂Rn 1 = − f (n+1) (y)(x − y)n , ∂y n!
Rn (x, x) = 0.
If we concentrate on Rn (x, y) as a function of y and look at the difference quotient [Rn (x, y) − Rn (x, x)]/(y − x), an immediate consequence of the mean value theorem is that 1 (A.4.6) Rn (x, y) = (x − y) (x − ξn )n f (n+1) (ξn ), n! for some ξn between x and y. This result, known as Cauchy’s formula for the remainder, has a slightly more complicated appearance than (A.4.3), but as we will see it has advantages over Lagrange’s formula. The application of the mean value
408
A. Complementary material
theorem to obtain (A.4.6) does not require the continuity of f (n+1) , but we do not want to dwell on that point. If f (n+1) is continuous, we can apply the Fundamental Theorem of Calculus to (A.4.5), in the y-variable, and obtain the basic integral formula ∫ 1 x (A.4.7) Rn (x, y) = (x − s)n f (n+1) (s) ds. n! y Another proof of (A.4.7) is indicated in Exercise 9 of §1.1. If we think of the integral in (A.4.7) as (x − y) times the mean value of the integrand, we see (A.4.6) as a consequence. On the other hand, if we want to bring a factor of (x − y)n+1 outside the integral in (A.4.7), the change of variable x − s = t(x − y) gives the integral formula ∫ 1 ( ) 1 tn f (n+1) ty + (1 − t)x dt. (A.4.8) Rn (x, y) = (x − y)n+1 n! 0 If we think of this integral as 1/(n + 1) times a weighted mean value of f (n+1) , we recover the Lagrange formula (A.4.3). From the Lagrange form (A.4.3) of the remainder in the Taylor series (A.4.1) we have the estimate |x − y|n+1 (A.4.9) |Rn (x, y)| ≤ sup |f (n+1) (ζ)|, (n + 1)! ζ∈I(x,y) where I(x, y) is the open interval from x to y (either (x, y) or (y, x), disregarding the trivial case x = y). Meanwhile, from the Cauchy form (A.4.6) of the remainder we have the estimate |x − y| (A.4.10) |Rn (x, y)| ≤ sup |(x − ξ)n f (n+1) (ξ)|. n! ξ∈I(x,y) We now study how effective these estimates are in determining that various power series converge. We begin with a look at these remainder estimates for the power series expansion about the origin of the simple function 1 (A.4.11) f (x) = . 1−x We have, for x ̸= 1, (A.4.12)
f (k) (x) = k! (1 − x)−k−1 ,
and formula (A.4.1) becomes 1 (A.4.13) = 1 + x + · · · + xn + Rn (x, 0). 1−x Of course, everyone knows that the infinite series (A.4.14)
1 + x + · · · + xn + · · ·
converges to f (x) in (A.4.11), precisely for x ∈ (−1, 1). What we are interested in is what can be deduced from the estimate (A.4.9), which, for the function (A.4.11), takes the form (A.4.15)
|Rn (x, 0)| ≤ |x|n+1 ·
sup ζ∈I(x,0)
|1 − ζ|−n−2 .
A.4. Complements on power series
409
We consider two cases. First, if x ≤ 0, then |1 − ζ| ≥ 1 for ζ ∈ I(x, 0), so (A.4.16)
x ≤ 0 =⇒ |Rn (x, 0)| ≤ |x|n+1 .
Thus the estimate (A.4.9) implies that Rn (x, 0) → 0 in (A.4.13), for all x ∈ (−1, 0]. Suppose however that x ≥ 0. What we have from (A.4.15) is x ≥ 0 =⇒ |Rn (x, 0)| ≤ |x|n+1 sup |1 − ζ|−n−2 0≤ζ≤x
1 ( x )n+1 = . 1−x 1−x This tends to 0 as n → ∞ if and only if x < 1 − x, i.e., if and only if x < 1/2. What we have is the following: (A.4.17)
Conclusion. The estimate (A.4.9) implies the convergence of the Taylor series (about the origin) for the function f (x) = 1/(1 − x), only for −1 < x < 1/2. This example points to a weakness in the estimate (A.4.9), coming from the Lagrange form of the remainder. Now let us see how well we can do with the estimate (A.4.10), coming from the Cauchy form of the remainder. For the function (A.4.11), this takes the form (A.4.18)
|Rn (x, 0)| ≤ (n + 1) |x|
sup ξ∈I(x,0)
|x − ξ|n . |1 − ξ|n+2
For −1 < x ≤ 0 one has an estimate like (A.4.16), with a factor of (n + 1) thrown in. On the other hand, one readily verifies that x−ξ ≤ x, 0 ≤ ξ ≤ x < 1 =⇒ 1−ξ so we deduce from (A.4.18) that (A.4.19)
0 ≤ x < 1 =⇒ |Rn (x, 0)| ≤ (n + 1)
xn+1 , 1−x
which does tend to 0 for all x ∈ [0, 1). One might be wondering if one could come up with some more complicated example, for which Cauchy’s form is effective only on an interval shorter than the interval of convergence. In fact, you can’t. Cauchy’s form of the remainder is always effective in the interior of the interval of convergence. This can be demonstrated, using methods of complex analysis. We look at some more power series, and see when convergence can be established at an endpoint of an interval of convergence, using the estimate (A.4.10), i.e., |Rn (x, y)| ≤ Cn (x, y), (A.4.20)
Cn (x, y) =
|x − y| n!
sup
|(x − ξ)n f (n+1) (ξ)|.
ξ∈I(x,y)
We consider the following family of examples: (A.4.21)
f (x) = (1 − x)a ,
a > 0.
410
A. Complementary material
The power series expansion has radius of convergence 1 (if a is not an integer) and, as we will see, one has convergence at both endpoints, +1 and −1, whenever a > 0. Let us see when Cn (±1, 0) → 0. We have (A.4.22)
f (n+1) (x) = (−1)n+1 a(a − 1) · · · (a − n) (1 − x)a−n−1 .
Hence
(A.4.23)
a(a − 1) · · · (a − n) | − 1 − ξ|n Cn (−1, 0) = sup n+1−a n! −1 0 in (A.4.21). On the other hand, a(a − 1) · · · (a − n) (A.4.24) Cn (1, 0) = sup (1 − ξ)a−1 . n! 0 0, the Taylor series about the origin for the function f (x) = (1 − x)a converges absolutely and uniformly to f (x) on the closed interval [−1, 1]. Proof. As noted, the series is (A.4.25)
∞ ∑
cn x n ,
cn = (−1)n
n=0
a(a − 1) · · · (a − n + 1) , n!
and, by an analysis parallel to (A.4.23), |cn | ≤ C n−1−a .
(A.4.26) In more detail, if n − 1 > a, (A.4.27)
cn = −
a ∏ ( a) 1− n k 1≤k≤a
∏ a 0. This was established in §A.4, and we will use it, with a = 1/2. From the identity x1/2 = (1 − (1 − x))1/2 , we have x1/2 ∈ P([0, 2]). More to the point, from the identity ( )1/2 (A.5.1) |x| = 1 − (1 − x2 ) , √ √ we have |x| ∈ P([− 2, 2]). Using |x| = b−1 |bx|, for any b > 0, we see that |x| ∈ P(I) for any interval I = [−c, c], and also for any closed subinterval, hence for any compact interval I. By translation, we have |x − a| ∈ P(I)
(A.5.2)
for any compact interval I. Using the identities 1 (x + y) + 2 we see that for any a ∈ R and any (A.5.3)
(A.5.4)
max(x, y) =
1 1 1 |x − y|, min(x, y) = (x + y) − |x − y|, 2 2 2 compact I,
max(x, a), min(x, a) ∈ P(I).
We next note that P(I) is an algebra of functions, i.e., (A.5.5)
f, g ∈ P(I), c ∈ R =⇒ f + g, f g, cf ∈ P(I).
Using this, one sees that, given f ∈ P(I), with range in a compact interval J, one has h ◦ f ∈ P(I) for all h ∈ P(J). Hence f ∈ P(I) ⇒ |f | ∈ P(I), and, via (A.5.3), we deduce that (A.5.6)
f, g ∈ P(I) =⇒ max(f, g), min(f, g) ∈ P(I).
A.5. The Weierstrass theorem and the Stone-Weierstrass theorem
413
Suppose now that I ′ = [a′ , b′ ] is a subinterval of I = [a, b]. With the notation x+ = max(x, 0), we have ( ) (A.5.7) fII ′ (x) = min (x − a′ )+ , (b′ − x)+ ∈ P(I). This is a piecewise linear function, equal to zero off I \ I ′ , with slope 1 from a′ to the midpoint m′ of I ′ , and slope −1 from m′ to b′ . Now if I is divided into N equal subintervals, any continuous function on I that is linear on each such subinterval can be written as a linear combination of such “tent functions,” so it belongs to P(I). Finally, any f ∈ C(I) can be uniformly approximated by such piecewise linear functions, so we have f ∈ P(I), proving the theorem.
A far reaching extension of Proposition A.5.1, due to M. Stone, is the following result, known as the Stone-Weierstrass theorem. Proposition A.5.2. Let X be a compact metric space, A a subalgebra of CR (X), the algebra of real valued continuous functions on X. Suppose 1 ∈ A and that A separates points of X, i.e., for distinct p, q ∈ X, there exists hpq ∈ A with hpq (p) ̸= hpq (q). Then the closure A is equal to CR (X). We will derive this from the following lemma. Lemma A.5.3. Let A ⊂ CR (X) satisfy the hypotheses of Proposition A.5.2, and let K, L ⊂ X be disjoint, compact subsets of X. Then there exists gKL ∈ A such that (A.5.8)
gKL = 1 on K, 0 on L,
and 0 ≤ gKL ≤ 1 on X.
Proof of Proposition A.5.2. To start, take f ∈ CR (X) such that 0 ≤ f ≤ 1 on X. Set (A.5.9)
K = {x ∈ X : f (x) ≥ 23 },
U = {x ∈ X : f (x) > 31 },
L = X \ U.
Lemma A.5.3 implies that there exist g ∈ A such that 1 1 g= on K, g = 0 on L, and 0 ≤ g ≤ . 3 3 Then 0 ≤ g ≤ f ≤ 1 on X, and, more precisely, 2 (A.5.10) 0 ≤ f − g ≤ , on X. 3 We can apply such reasoning with f replaved by f − g, obtaining g2 ∈ A such that ( 2 )2 (A.5.11) 0 ≤ f − g − g2 ≤ , on X, 3 and iterate, obtaining gj ∈ A such that, for each k, ( 2 )k (A.5.12) 0 ≤ f − g − g2 − · · · − gk ≤ , on X. 3 This yields f ∈ A, whenever f ∈ CR (X) satisfies 0 ≤ f ≤ 1. It is an easy step to see that f ∈ CR (X) ⇒ f ∈ A.
414
A. Complementary material
Proof of Lemma A.5.3. We present the proof in six steps. Step 1. By Proposition A.5.1, if f ∈ A and φ : R → R is continuous, then φ ◦ f ∈ A. Step 2. Consequently, if fj ∈ A, then (A.5.13)
max(f1 , f2 ) =
1 1 |f1 − f2 | + (f1 + f2 ) ∈ A, 2 2
and similarly min(f1 , f2 ) ∈ A. Step 3. It follows from the hypotheses that if p, q ∈ X and p ̸= q, then there exists fpq ∈ A, equal to 1 at p and to 0 at q. Step 4. Apply an appropriate continuous φ : R → R to get gpq = φ ◦ fpq ∈ A, equal to 1 on a neighborhood of p and to 0 on a neighborhood of q, and satisfying 0 ≤ gpq ≤ 1 on X. Step 5. Let L ⊂ X be compact and fix p ∈ X \ L. By Step 4, given q ∈ L, there exists gpq ∈ A such that gpq = 1 on a neighborhood Oq of p, equal to 0 on a neighborhood Ωq of q, satisfying 0 ≤ gpq ≤ 1 on X. Now {Ωq } is an open cover of L, so there exists a finite subcover Ωq1 , . . . , ΩqN . Let gpL = min gpqj ∈ A.
(A.5.14) Taking O =
1≤j≤N
∩N 1 Oqj ,
an open neighborhood of p, we have
gpL = 1 on O,
0 on L,
and 0 ≤ gpU ≤ 1 on X.
Step 6. Take K, L ⊂ X disjoint, compact subsets. By Step 5, for each p ∈ K, there exists gpL ∈ A, equal to 1 on a neighborhood Op of p, and equal to 0 on L. Now {Op } covers K, so there exists a finite subcover Op1 , . . . , OpM . Let gKL = max gpj L ∈ A.
(A.5.15)
1≤j≤M
We have (A.5.16)
gKL = 1 on K,
as asserted in the lemma.
0 on L,
and 0 ≤ gKL ≤ 1 on X,
Proposition A.5.2 has a complex analogue. In that case, we add the assumption that f ∈ A ⇒ f ∈ A, and conclude that A = C(X). This is easily reduced to the real case. Here are a couple of applications of Proposition A.5.2, in its real and complex forms:
A.6. Further results on harmonic functions
415
Corollary A.5.4. If X is a compact subset of Rn , then every f ∈ C(X) is a uniform limit of polynomials on Rn . Corollary A.5.5. The space of trigonometric polynomials on the n-torus Tn , given by ∑ (A.5.17) ak eik·θ , |k|≤N n
is dense in C(T ).
A.6. Further results on harmonic functions Recall that if Ω ⊂ Rn is open, a function u ∈ C 2 (Ω) is harmonic on Ω provided ∆u = 0, where (A.6.1)
∆=
∂2 ∂2 + ··· + 2 ∂x1 ∂x2n
is the Laplace operator. Some useful results on harmonic functions have been presented in Chapters 5 and 7. In particular, we have established the mean value property (cf. (5.1.23)): if u is harmonic on Ω and the closed ball BR (x0 ) ⊂ Ω, then (A.6.2)
u(x0 ) =
1 V (BR )
∫ u(x) dx. BR (x0 )
This leads to the maximum principle, Proposition 5.1.7. In particular, if Ω is bounded and u ∈ C(Ω) is harmonic on Ω, then sup |u(x)| = sup |u(y)|.
(A.6.3)
x∈Ω
y∈∂Ω
One consequence of this is that, for such bounded Ω ⊂ Rn , a solution u ∈ C 2 (Ω) ∩ C(Ω) to the Dirichlet problem (A.6.4) ∆u = 0 on Ω, u = f, ∂Ω
with f ∈ C(∂Ω) given, is unique (provided it exists). In Chapter 7 we showed that solutions do exist if Ω = BR (x0 ) is a ball. In fact, we derived a formula, the Poisson integral formula, for the solution in case the ball is B = B1 (0). In that case, the solution is given by u = PI f , where ∫ 1 − |x|2 f (y) (A.6.5) PI f (x) = dS(y). An−1 |x − y|n S n−1
One can then treat arbitrary balls BR (x0 ), via translation and dilation. One consequence of this formula is that, whenever Ω ⊂ Rn is open and u ∈ C 2 (Ω) is harmonic, then in fact u ∈ C ∞ (Ω). It then follows that all derivatives ∂ α u are harmonic in Ω. In case Ω is a bounded set and u ∈ C(Ω), and K ⊂ Ω is compact, we also have an estimate (A.6.6)
sup |∂ α u(x)| ≤ CKα sup |u(y)|. x∈K
y∈∂Ω
416
A. Complementary material
Our goal in this appendix is to establish some further useful results about harmonic functions. We start with the following removable singularity theorem, which is of use in the proof of Proposition 7.4.3. Take B = B1 (0). Proposition A.6.1. Assume u ∈ C 2 (B \ 0) ∩ C(B \ 0) is harmonic on B \ 0 and bounded, i.e., there exists M < ∞ such that (A.6.7)
|u(x)| ≤ M,
∀ x ∈ B \ 0.
Then u can be extended (in a unique fashion) to be harmonic on all of B. Proof. Let f = u|∂B ∈ C(∂B) and set (A.6.8)
v ∈ C(B) ∩ C 2 (B).
v = PI f,
We claim v = u on B \ 0. To this end, consider w = u − v on B \ 0. We have w ∈ C(B \ 0) ∩ C 2 (B \ 0), ∆w = 0 on B \ 0, and w = 0 on ∂B. Also |w| ≤ 2M on B \ 0. We claim w ≡ 0. To show this, we can assume that w is real valued. Now bring in the function H ∈ C(B \ 0) ∩ C 2 (B \ 0), given by (A.6.9)
H(x) = |x|2−n − 1, if n ≥ 3, 1 log , if n = 2. |x|
We see that H is harmonic on B \0, H ≥ 0 on B \0, H = 0 on ∂B, and H(x) → +∞ as x → 0. Hence, for each ε > 0, there exists δ0 > 0 such that (A.6.10)
εH − w ≥ 0 on ∂Bδ (0),
∀ δ ∈ (0, δ0 ].
The maximum principle implies that (A.6.11)
εH − w ≥ 0
on B \ Bδ (0). Taking δ ↘ 0 yields (A.6.11) on B \ 0. Then taking ε ↘ 0 yields (A.6.12)
w ≤ 0 on B \ 0.
A similar argument gives w ≥ 0 on B\0, hence w ≡ 0, and the proof is complete.
We turn now to a converse to the mean value property of harmonic functions. To state it, we say a function u ∈ C(Ω) has the mean value property provided that (A.6.2) holds whenever the ball BR (x0 ) ⊂ Ω. One point of abstracting this property is that such functions satisfy the maximum principle: Lemma A.6.2. If Ω is bounded and u ∈ C(Ω) has the mean value property on Ω, then (A.6.3) holds. In fact, the proof of Proposition 5.1.7 works without change here. Here is our converse result. Proposition A.6.3. If u ∈ C(Ω) has the mean value property, then u is harmonic on Ω. (In particular, u ∈ C ∞ (Ω).) Proof. It suffices to show that if B ⊂ Ω is a ball, then u is harmonic on B. Translating and dilating, we can assume B = B1 (0). Now take (A.6.13) f = u , v = PI f. ∂B
A.6. Further results on harmonic functions
417
It suffices to show that v ≡ u on B, or equivalently that w = v − u ≡ 0 on B. Indeed, w has the mean value property on B, w ∈ C(B), and w = 0 on ∂B, so Lemma A.6.2 implies w ≡ 0. The next result is known as the Schwarz reflection principle. To state is, let B = B1 (0) and set B+ = {x ∈ B : xn > 0}.
(A.6.14)
Proposition A.6.4. Assume u ∈ C(B + ) is harmonic on B+ and (A.6.15)
u = 0 on S = B ∩ {x ∈ Rn : xn = 0}.
Define v on B by (A.6.16)
v(x) = u(x),
if x ∈ B + ,
− u(ρ(x)),
if ρ(x) ∈ B + ,
where (A.6.17)
ρ(x1 , . . . , xn ) = (x1 , . . . , −xn ).
Then v is harmonic on B. Proof. The hypothesis (A.6.15) implies v ∈ C(B). In particular f = v|∂B belongs to C(∂B). Consider (A.6.18)
w = PI f.
We claim w ≡ v. Since x 7→ ρ(x) is an isometry, we have PI(f ◦ ρ) = w ◦ ρ, so w(ρ(x)) = −w(x).
(A.6.19)
Hence w = 0 on S. Also w = f = v on ∂B, so (A.6.20)
w = u on ∂B+ .
Since u and w are harmonic on B+ , we have (A.6.21)
w = u on B+ ,
which, together with (A.6.19), yields w ≡ v, completing the proof.
We turn to a circle of results on harmonic functions that satisfy one-sided bounds. To start, let u : B1 (0) → R be harmonic and assume u ≥ 0. Also assume u ∈ C(B1 (0)), for now, so u|S n−1 = f ≥ 0 and u is given by (A.6.5). Hence, for x ∈ B1 (0), u(x) ≥ (1 − |x|2 ) · min |x − y|−n AvgS n−1 f |y|=1
(A.6.22)
1 − |x| u(0), (1 + |x|)n 2
= so (A.6.23)
u(x) ≥
1 − |x| u(0), (1 + |x|)n−1
∀ x ∈ B1 (0).
If we omit the hypothesis that u ∈ C(B1 (0)) and apply this reasoning to ub (x) = u(bx) and let b ↗ 1, we obtain (A.6.23) for this more general class. Going further,
418
A. Complementary material
we can apply translations and dilations, and obtain the following result, known as Harnack’s inequality. Proposition A.6.5. Assume u is harmonic on BR (x0 ) and ≥ 0 there. Then, for all x1 ∈ BR (x0 ), (A.6.24)
u(x1 ) ≥
1 − R−1 |x1 − x0 | u(x0 ). (1 + R−1 |x1 − x0 |)n−1
Using Harnack’s inequality, we can establish the following stronger version of Liouville’s theorem. Proposition A.6.6. Assume v : Rn → R is harmonic and bounded from below: v(x) ≥ −M,
∀ x ∈ Rn .
Then v is constant. Proof. The function u(x) = v(x) + M is harmonic and ≥ 0 on Rn . Given x0 , x1 ∈ Rn , we can take R > |x1 − x0 | and apply (A.6.24). Taking R → ∞ then gives u(x1 ) ≥ u(x0 ),
∀ x0 , x 1 ∈ R n .
Reversing roles also gives u(x0 ) ≥ u(x1 ), so u is constant, and so is v.
It is useful to complement Harnack’s lower bound with an upper bound. To start, assume u ∈ C(B1 (0)) is harmonic and ≥ 0, and complement (A.6.22) with u(x) ≤ (1 − |x|2 ) · max |x − y|−n AvgS n−1 f |y|=1
(A.6.25)
1 − |x| u(0), (1 − |x|)n 2
= so (A.6.26)
u(x) ≤
1 + |x| u(0), (1 − |x|)n−1
∀ x ∈ B1 (0).
We can remove the hypothesis oc continuity on B1 (0) by the dilation argument used above. Further translation and dilation gives the following complement to Proposition A.6.5. Proposition A.6.7. Assume u is harmonic on BR (x0 ) and ≥ 0 there. Then, for all x1 ∈ BR (x0 ), (A.6.27)
u(x1 ) ≤
1 + R−1 |x1 − x0 | u(x0 ). (1 − R−1 |x1 − x0 |)n−1
The following result will lead to further extensions of Liouville’s theorem. Proposition A.6.8. For each n ≥ 2, there exist constants Kn ∈ (0, ∞) with the following property. Let u be harmonic on BR (0) ⊂ Rn . Assume (A.6.28)
u(0) = 0,
u(x) ≤ M on BR (0).
Then (A.6.29)
u(x) ≥ −Kn M on BR/2 (0).
A.6. Further results on harmonic functions
419
Proof. Apply Proposition A.6.7, with u replaced by M − u, which is ≥ 0 on BR (0) and equal to M at 0. We see that R 3 =⇒ M − u(x1 ) ≤ 2n−1 M, (A.6.30) |x1 | = 2 3 so (A.6.29) holds with Kn = 3 · 2n−1 − 1. We can use Proposition A.6.8 t give a second proof of Proposition A.6.6. Indeed, if v ≥ 0 is harmonic on Rn , then u(x) = v(0) − v(x) satisfies (A.6.28), with M = v(0), for all R, so (A.6.29) implies u(x) ≥ −Kn v(0) for all x ∈ Rn . Hence u is harmonic and bounded on Rn , so the fact that u is constant follows from the version of Liouville’s theorem given in Proposition 5.1.8. An extension of this argument gives the following. Proposition A.6.9. Assume that u is harmonic on Rn and that there exist C0 , C1 ∈ (0, ∞)) and k ∈ Z+ such that (A.6.31)
u(x) ≤ C0 + C1 |x|k ,
∀ x ∈ Rn .
Then there exist C2 , C3 ∈ (0, ∞) such that (A.6.32)
u(x) ≥ −C2 − C3 |x|k ,
∀ x ∈ Rn .
Proof. Apply Proposition A.6.8 to u(x) − u(0), M = C0 + |u(0)| + C1 Rk .
The next result is an important extension of Liouville’s theorem. Proposition A.6.10. Let u : Rn → R be harmonic, and assume there exist C0 , C1 , and k such that (A.6.33)
|u(x)| ≤ C0 + C1 |x|k ,
∀ x ∈ Rn .
Then u(x) is a polynomial in x, of degree ≤ k. Proof. Set vR (x) = R−k u(Rx). Then {vR : R ∈ [1, ∞)} is uniformly bounded on B1 (0). By (A.6.6), we have 1 (A.6.34) |∂ α vR (x)| ≤ Cα , for |x| ≤ , R ≥ 1, 2 or equivalently 1 (A.6.35) R−k+|α| |∂ α u(Rx)| ≤ Cα , for |x| ≤ , R ≥ 1. 2 Taking R → ∞ gives (A.6.36)
|∂ α u(y)| = 0,
∀ y ∈ Rn , |α| > k.
This implies u is a polynomial of degree ≤ k.
Here is a basic application of Propositions A.6.9–A.6.10. Proposition A.6.11. Let f : C → C be holomorphic. Assume that there is a bound (A.6.37)
|ef (z) | ≤ CeA|z| ,
∀ z ∈ C.
Then f (z) has the form (A.6.38)
f (z) = az + b.
420
A. Complementary material
Proof. The bound (A.6.37) implies Re f (z) ≤ A|z| + A′ .
(A.6.39) Hence, by Proposition A.6.9,
| Re f (z)| ≤ B|z| + B ′ .
(A.6.40)
By Proposition A.6.10, we have (A.6.41)
Re f (z) = αx + βy + γ.
Consequently the harmonic conjugate v(x, y) of u(x, y) = Re f (z) is also a firstorder polynomial in x, y. This yields (A.6.38).
A.7. Beyond degree theory – introduction to de Rham theory Our treatment of degree theory in Chapter 5 began by exploiting the existence of an n-form ω with non-zero integral, on any smooth, compact, connected, oriented n-dimensional manifold M . This led, via the Stokes theorem, to a proof of the no-retraction theorem, in case M = ∂Ω, with Ω a compact, oriented, smoothly bounded manifold, and thence to Brouwer’s fixed-point theorem. As noted there, such ω ∈ Λn M is not exact (i.e., not of the form dβ for some β ∈ Λn−1 M ), though of course it is closed (i.e., dω = 0). In order to go further, we established a central result, Proposition 5.3.6, which we re-state here. Proposition A.7.1. Given M as above and α ∈ Λn M , then α = dβ for some β ∈ Λn−1 M if and only if ∫ (A.7.1) α = 0. M
We then defined the degree of a smooth map F : S → M (where S is smooth, compact, oriented, and n-dimensional) to be ∫ (A.7.2) Deg(F ) = F ∗ ω, S
where we choose
∫ ω ∈ Λn M,
(A.7.3)
ω = 1. M
It follows from Proposition A.7.1 and Stokes’ theorem that the integral in (A.7.2) is independent of the choice of ω satisfying (A.7.3). Having this, we were on our merry way. Here, we deal with the fact that the issue of closed forms versus exact forms has a broader scope. Generally, if M is a smooth manifold and k ∈ Z+ , we set (A.7.4)
C k (M ) = {α ∈ Λk M : dα = 0}, E (M ) = {dβ ∈ Λ M : β ∈ Λ k
k
k−1
and M },
called, respectively, the space of closed k-forms and the space of exact k-forms on M . Here and below, Λk M will denote the space of smooth (class C ∞ ) k-forms on
A.7. Beyond degree theory – introduction to de Rham theory
421
M . The spaces C k (M ) and E k (M ) are both linear subspaces of Λk M , and since d2 = 0, we have E k (M ) ⊂ C k (M ).
(A.7.5) We form the quotient, (A.7.6)
Hk (M ) = C k (M )/E k (M ).
This is called the kth de Rham cohomology group of M . Given α ∈ C k (M ), we denote its residue class mod E k (M ) (i.e., its cohomology class) as [α] ∈ Hk (M ) (or we might get sloppy and simply use α to denote its cohomology class). Note that, if dim M = n, then C n (M ) = Λn M , so (A.7.7)
dim M = n =⇒ Hn (M ) = Λn M/E n (M ).
Consequently, in the language just introduced, Proposition A.7.1 takes the following form. Proposition A.7.2. Let M be a smooth, compact, connected, oriented n-dimensional manifold. Then the map (A.7.8)
IM : Hn (M ) −→ R
defined by (A.7.9)
∫ IM ([ω]) =
ω M
is an isomorphism. Extensions of the notion of degree arise as follows. Let M and S be smooth manifolds (possibly of different dimension), and let F : S → M be a smooth map. For each k ∈ Z+ , F induces a map on cohomology, (A.7.10)
F ∗ : Hk (M ) −→ Hk (S),
by (A.7.11)
F ∗ ([α]) = [F ∗ α],
[α] ∈ Hk (M ).
Here F ∗ α ∈ Λk S is the pull back of α. If dα = 0, then dF ∗ α = F ∗ dα = 0, since F ∗ d = dF ∗ . Hence F ∗ α ∈ C k (S). Furthermore, if α, α ˜ ∈ C k (M ) define the same k−1 cohomology class, then α − α ˜ = dβ, β ∈ Λ M , so (A.7.12)
F ∗α = F ∗α ˜ + dF ∗ β,
and hence F ∗ α and F ∗ α ˜ define the same cohomology class on S. Hence the map (A.7.10) is well defined. Here is how the action of F ∗ on cohomology leads to the definition of degree. Take M as in Proposition A.7.2, and also assume that S is a smooth, compact, oriented n-dimensional manifold, and that F : S → M is smooth. If in addition S is connected, then (A.7.13)
−1 Deg(F ) = IS ◦ F ∗ ◦ IM (1),
422
A. Complementary material
with F ∗ : Hn (M ) → Hn (S) defined by (A.7.10)–(A.7.11), and IM , IS as in Proposition A.7.2. More generally, if S has ℓ connected components, S1 , . . . , Sℓ , and Fj = F |Sj , then (A.7.14)
Deg(F ) =
ℓ ∑
−1 ISj ◦ Fj∗ ◦ IM (1).
j=1
It is of interest to compute other cohomology groups. We start with the following consequence of the Poincar´e lemma, Proposition 5.2.5. Proposition A.7.3. Let M be a smooth n-dimensional manifold. Assume the identity map I : M → M is smoothly homotopic to a constant map K : M → M , satisfying K(x) ≡ p. Then Hk (M ) = 0 for 1 ≤ k ≤ n.
(A.7.15)
Regarding k = 0, for a general smooth manifold M , E 0 (M ) = 0, and hence H0 (M ) = C 0 (M ) = set of functions f : M → R that are
(A.7.16)
constant on each connected component of M . Hence H0 (M ) ≈ Rℓ , if M has ℓ connected components.
(A.7.17) On the other hand, (A.7.18)
dim M = n, k > n ⇒ Λk M = 0 ⇒ Hk (M ) = 0.
For the next “vanishing” result, we bring in the following class of manifolds. Definition. Let M be a smooth, connected manifold. We say M is simply connected provided each smooth loop γ : S 1 → M is smoothly homotopic to a constant map. Proposition A.7.4. If M is a smooth manifold that is simply connected, then H1 (M ) = 0.
(A.7.19)
Proof. Fix p ∈ M . Given α ∈ C 1 (M ), we propose to define f : M → R by ∫ (A.7.20) f (x) = α, γx
where γx : [0, 1] → M is a smooth path satisfying γx (0) = p, γx (1) = x. We claim that if M is simply connected, then the right side of (A.7.20) is independent of the choice of such a path. That is, if also σx : [0, 1] → M is a smooth path satisfying σx (0) = p, σx (1) = x, then ∫ ∫ α = α, ∀ α ∈ C 1 (M ). (A.7.21) γx
σx
A.7. Beyond degree theory – introduction to de Rham theory
423
To show this, we begin by reparametrizing the paths γx and σx , obtaining smooth paths γ˜x , σ ˜x : [0, 1] → M , all of whose derivatives vanish at 0 and 1. It is elementary that ∫ ∫ (A.7.22) α = α, ∀ α ∈ Λ1 M, γx
γ ˜x
and similarly for σx versus σ ˜x , so (A.7.21) follows if we obtain ∫ (A.7.23) α = 0, ∀ α ∈ C 1 (M ), τ
where τ : [0, 2] → M is the closed loop given by τ (t) = γ˜x (t),
(A.7.24)
0 ≤ t ≤ 1,
σ ˜x (2 − t),
1 ≤ t ≤ 2.
We think of τ as defined on R/2Z. Since M is simply connected, τ is smoothly homotopic to a constant map, so (A.7.23) is a special case of Proposition 5.2.1 (with Y = M and X = R/2Z, f0 = τ, f1 = const.). This gives (A.7.23), hence (A.7.21). Having (A.7.21), we now have f : M → R well defined by (A.7.20), for each α ∈ C 1 (M ). Taking a coordinate neighborhood of x ∈ M , centered at x, and picking γx to approach x along the xj -axis, we get ∂f /∂xj = αj (x), for each j, where ∑ α= αj (x) dxj (A.7.25) j
in this coordinate system. This gives (A.7.26)
df = α,
hence C (M ) = E (M ), and we have (A.7.19). 1
1
As an example, it is fairly easy to show that the n-dimensional sphere S n is simply connected when n ≥ 2, so (A.7.27)
H1 (S n ) = 0 for n ≥ 2.
A complete description of the cohomology of S n is (A.7.28)
Hk (S n ) ≈ R, if k = 0 or n, 0, otherwise.
We will return to this point below; see Proposition A.7.17. We next extend the homotopy invariance of degree, as follows. Proposition A.7.5. If M and S are smooth manifolds and f0 , f1 : S → M are smoothly homotopic, then, for each k, (A.7.29)
f1∗ = f0∗ : Hk (M ) −→ Hk (S).
Proof. Take α ∈ C k (M ), defining [α] ∈ Hk (M ). By Proposition 5.2.3, (A.7.30) so we have (A.7.29).
f0∗ α − f1∗ α ∈ E k (S),
424
A. Complementary material
We now introduce a powerful tool in the study of de Rham cohomology, the theory of harmonic forms and the Hodge decomposition. To set this up, we take a smooth, compact, n-dimensional manifold M , endowed with a Riemannian metric. We define δ = d∗ ,
δ : Λk M −→ Λk−1 (M ),
(A.7.31)
so that if α ∈ Λk M, β ∈ Λk−1 M , (A.7.32)
(δα, β)L2 = (α, dβ)L2 ,
where the L2 -inner product of α, γ ∈ Λk M is given by ∫ (A.7.33) (α, γ)L2 = ⟨α(x), γ(x)⟩ dV (x), M
⟨ , ⟩ denoting the inner product on Λk Tx∗ M , defined as in Exercises 5–9 of §4.1, upon taking an isometric isomorphism Tx∗ M ≈ Rn , arising via the metric tensor. The operator δ is, like d, a first-order differential operator. See (A.7.72) below for a formula. We now form the Hodge Laplacian ∆ : Λk M → Λk M , by ∆ = −(dδ + δd).
(A.7.34)
Note that δ = 0 on Λ M , so ∆ = −δd = −d∗ d on Λ0 M , and ∆ specializes to the Laplace operator given by (4.4.26) on 0-forms (i.e., on real-valued functions). The identity (4.4.28) says 0
(A.7.35)
−(∆u, v)L2 = (du, dv)L2 ,
u, v ∈ Λ0 M
(when M is compact and ∂M = ∅). More generally, (A.7.34) leads to (A.7.36)
−(∆u, v)L2 = (du, dv)L2 + (δu, δv)L2 ,
u, v ∈ Λk M.
In analogy with C k (M ) and E k (M ), we bring in the spaces (A.7.37)
Cδk (M ) = {u ∈ Λk M : δu = 0}, Eδk (M ) = {δv : v ∈ Λk+1 M },
called respectively the spaces of co-closed k-forms and co-exact k-forms. Note that d2 = 0 ⇒ δ 2 = 0, so Eδk (M ) ⊂ Cδk (M ). We also define (A.7.38)
Hk (M ) = {u ∈ Λk M : ∆u = 0},
the space of harmonic k-forms on M . Clearly (A.7.34) implies that C k (M ) ∩ Cδk (M ) ⊂ Hk (M ). The converse follows from (A.7.36), with u = v: (A.7.39)
∥∆u∥2L2 = ∥du∥2L2 + ∥δu∥2L2 ,
u ∈ Λk M.
Hence we have (A.7.40)
Hk (M ) = C k (M ) ∩ Cδk (M ).
Our next goal is to derive the fundamental isomorphism (A.7.41)
Hk (M ) ≈ Hk (M ),
when M is a compact Riemannian manifold. A proof of this makes use of results in partial differential equations involving the analysis of ∆ as an “elliptic differential operator.” We will state two results here (Propositions A.7.6 and A.7.8) for whose
A.7. Beyond degree theory – introduction to de Rham theory
425
proofs we need to refer to the PDE literature. Proofs are given in Chapter 5, §8 of [46]. Here is the first such result. Proposition A.7.6. If M is a compact Riemannian manifold, then for each k, dim Hk (M ) < ∞.
(A.7.42)
Say dim Hk (M ) = βk . Pick an orthonormal basis {ψ1 , . . . , ψβk } of Hk (M ), and define Ph : Λk M −→ Hk (M )
(A.7.43) by (A.7.44)
Ph u =
βk ∑
(u, ψj )L2 ψj ,
u ∈ Λk M.
j=1
Note that, for each u ∈ Λ M , if we set v0 = Ph u and v1 = (I − Ph )u, then k
(A.7.45)
u = v0 + v1 ,
v0 ∈ Hk (M ),
v1 ⊥ Hk (M ),
i.e., (v1 , ψ)L2 = 0 for all ψ ∈ Hk (M ). If also u = w0 + w1 with w0 ∈ Hk (M ) and w1 ⊥ Hk (M ), then 0 = (v0 − w0 ) + (v1 − w1 ), so v0 − w0 ∈ Hk (M ) and v0 − w0 ⊥ Hk (M ), hence v0 − w0 = v1 − w1 = 0. Thus the decomposition (A.7.45) is unique. In particular, Ph is independent of the choice of orthonormal basis of Hk (M ). We have (A.7.46)
Ph is the orthogonal projection of Λk M onto Hk (M ).
The issue of orthogonality will be central to what we do next, so we record the following useful observation. Proposition A.7.7. If M is a compact Riemannian manifold, then, for each k, E k (M ) ⊥ Cδk (M ) and C k (M ) ⊥ Eδk (M ).
(A.7.47)
Proof. Straightforward from the defining identity (A.7.32).
We now state the second result that makes major use of elliptic PDE theory (for which see Chapter 5 of [46]): Proposition A.7.8. If M is a compact Riemannian manifold, of dimension n, then for each k ∈ {0, . . . , n}, there exists a linear map G : Λk M → Λk M such that −∆G = −G∆ = I − Ph
(A.7.48) and (A.7.49)
Ph G = 0.
We note that such G is unique. Indeed, if Γ : Λk M → Λk M also satisfies (A.7.48)–(A.7.49), then ∆(G − Γ) = (G − Γ)∆ = 0,
(A.7.50)
and Ph (G − Γ) = 0,
so, for each u ∈ Λ M, v = (G − Γ)u satisfies k
(A.7.51)
∆v = 0,
Ph v = 0,
426
A. Complementary material
hence, by (A.7.48), v = −G∆v + Ph v = 0,
(A.7.52) so G = Γ.
Using (A.7.48), let us write u = dδGu + δdGu + Ph u
(A.7.53)
= Pd u + Pδ u + Ph u,
for u ∈ Λk M . Note that (A.7.54)
Pd u ∈ E k (M ),
Pδ u ∈ Eδk (M ),
Ph u ∈ Hk (M ).
By Proposition A.7.7, (A.7.55)
E k (M ), Eδk (M ), and Hk (M ) are mutually orthogonal.
The following will hasten us to our goal of (A.7.41). Proposition A.7.9. If M is a compact Riemannian manifold, then for each k, (A.7.56)
Pδ : C k (M ) −→ 0.
Proof. Take u ∈ C k (M ) and write (A.7.57)
u = Pδ u + (I − Pδ )u = u0 + u1 , u0 ∈ Eδk (M ),
u1 = (Pd + Ph )u ∈ C k (M ).
Hence (A.7.58)
u − u1 = u0 ∈ Eδk (M ) ∩ C k (M ),
which, by (A.7.47) implies u − u1 = u0 = 0, as claimed.
This leads to the following Hodge decomposition of closed forms. Proposition A.7.10. If M is a compact Riemannian manifold, then for each k, (A.7.59)
u ∈ C k (M ) =⇒ u = Pd u + Ph u,
hence (A.7.60)
u = ud + uh ,
ud ∈ E k (M ), uh ∈ Hk (M ).
Furthermore, this decomposition is unique. Proof. It remains only to establish uniqueness in (A.7.60). Indeed, if also u = vd + vh ,
vd ∈ E k (M ), vh ∈ Hk (M ),
then ud − vd = vh − uh ∈ E k (M ) ∩ Hk (M ), hence again (A.7.47) implies ud − vd = vh − uh = 0. We have the following restatement of Proposition A.7.10.
A.7. Beyond degree theory – introduction to de Rham theory
427
Theorem A.7.11. If M is a compact Riemannian manifold, then for each k, (A.7.61)
C k (M ) = E k (M ) ⊕ Hk (M )
is an orthogonal direct sum. Hence each u ∈ C k (M ) is cohomologous to a unique harmonic k-form, namely Ph u, and we have (A.7.62)
Hk (M ) ≈ Hk (M ).
We now introduce the Hodge star operator, (A.7.63)
∗ : Λk M −→ Λn−k M,
defined when M is an oriented, n-dimensional Riemannian manifold. This arises from ∗ : Λk Tx∗ M −→ Λn−k Tx∗ M. To define it, we bring in the volume form ω ∈ Λn M,
(A.7.64)
attached to M when it is oriented and has a Riemannian metric. Then the star operator (A.7.63) is uniquely specified by the relation (A.7.65)
u ∧ ∗v = ⟨u, v⟩ω,
where ⟨u, v⟩ is the inner product on Λk Tx∗ M that arose in (A.7.33). In particular, we have ∗1 = ω. Furthermore, if {e1 , . . . , en } is an oriented, orthonormal basis of Tx∗ M , we have (A.7.66)
∗(ej1 ∧ · · · ∧ ejk ) = (sgn π)eℓ1 ∧ · · · ∧ eℓn−k ,
where {j1 , . . . , jk , ℓ1 , . . . , ℓn−k } = {1, . . . , n}, and π is the permutation mapping the one ordered set to the other. It follows that (A.7.67)
∗∗ = (−1)k(n−k) on Λk M.
It will be convenient to denote (A.7.67) by w, and also set (A.7.68)
w = (−1)k on Λk M,
so (A.7.69)
d(u ∧ v) = du ∧ v + w(u) ∧ dv.
It follows that if u ∈ Λk−1 M and v ∈ Λk M , then w(u) ∧ d ∗ v = −u ∧ d ∗ w(v), so (A.7.70)
d(u ∧ ∗v) = du ∧ ∗v − u ∧ d ∗ w(v) = du ∧ ∗v − u ∧ ∗w ∗ d ∗ w(v),
since ∗w∗ is the identity map, by (A.7.67). We can use this to obtain a formula for the differential operator δ. We need not assume M is compact here. Take u and v as above and assume at least one of these has compact support. Then Stokes formula yields ∫ d(u ∧ ∗v) = 0, M
428
A. Complementary material
and hence
∫ du ∧ ∗v
(du, v)L2 = M
∫
(A.7.71)
u ∧ ∗w ∗ d ∗ w(v)
= M
= (u, w ∗ d ∗ w(v))L2 . Consequently (A.7.72)
δ = w ∗ d ∗ w = (−1)k(n−k)−n+k−1 ∗ d∗,
on Λk M.
This effectively provides a formula for δ on Λk M whether or not M is orientable. In fact, each point x ∈ M has a neighborhood U that is orientable, on which one can choose one of two possible orientations. Then the formula (A.7.72) holds on U , for each orientation. Changing from one orientation to the other exactly has the effect of changing ∗ to −∗, but since there are two factors of ∗ in (A.7.72), each choice of orientation on U leads to the same formula for δ there. Returning to the case where M is oriented, we see from the characterization (A.7.40) of harmonic k-forms that (A.7.73)
∗ : Hk (M ) −→ Hn−k (M ),
and, by (A.7.67), this map is an isomorphism. In conjunction with Theorem A.7.11, this yields the following version of Poincar´e duality. Proposition A.7.12. If M is a compact, oriented Riemannian manifold of dimension n, there is an isomorphism of de Rham cohomology groups (A.7.74)
Hk (M ) ≈ Hn−k (M ),
induced via the isomorphism (A.7.73). Taking k = 0 and using (A.7.16), we have the following: Corollary A.7.13. If M is a conpact, oriented, n-dimensional manifold, then (A.7.75)
Hn (M ) ≈ H0 (M ) ≈ Rℓ ,
if M has ℓ connected components. In particular, if M is connected, we have another proof of Proposition A.7.2. The orientability of M is crucial for this corollary. For example, we have Proposition A.7.14. Let n be even, so P(S n ) is not orientable. Then (A.7.76)
Hn (P(S n )) = 0.
Proof. We have the natural covering map π : S n → P(S n ), which is a local isometry. If ω0 ∈ Hn (P(S n )) then ω = π ∗ ω0 ∈ Hn (S n ). (Compare arguments around (A.7.78) below.) If ω0 ̸= 0, then ω ̸= 0. But we have Hn (S n ) ≈ Hn (S n ) ≈ R, so ω must be a (nonzero) constant multiple of the volume form on S n , which is nowhere vanishing. This implies ω0 must be nowhere vanishing, but that would provide an orientation on P(S n ). This forces ω0 = 0, yielding (A.7.76).
A.7. Beyond degree theory – introduction to de Rham theory
429
Remark. One can tweak the proof of Proposition A.7.14 to obtain: Proposition A.7.15. Let M be a compact, connected, n-dimensional Riemannian manifold. If M is not orientable, then Hn (M ) = 0.
(A.7.77)
We do not take up the details here. One can consult Exercise 11 in §5.8 of [46]. We next discuss how one can exploit symmetries together with Hodge theory, to gain information about cohomology. To start, let M be a Riemannian manifold and suppose f : M → M is an isometry. We know that f ∗ du = df ∗ u for each k-form u. We claim that also f ∗ δu = δf ∗ u,
(A.7.78)
∀ u ∈ Λk M.
In fact, given v ∈ Λk−1 M , (f ∗ v, δf ∗ u)L2 = (df ∗ v, f ∗ u)L2 = (f ∗ dv, f ∗ u)L2 = (dv, u)L2
(A.7.79)
= (v, δu)L2 = (f ∗ v, f ∗ δu)L2 , the third and fifth identities here holding when f is an isometry. This establishes (A.7.78). It also yields (A.7.80)
f ∗ ∆u = ∆f ∗ u,
∀ u ∈ Λk M,
a result established in the case k = 0 in (4.4.48). With this in hand, we have the following variant of Proposition A.7.5. Proposition A.7.16. If M is a compact Riemannian manifold and ft : M → M is a smooth family of maps, for 0 ≤ t ≤ 1, and if f0 and f1 are isometries, then, for each k, (A.7.81)
f1∗ = f0∗ : Hk (M ) −→ Hk (M ).
Proof. The fact that fj∗ : Hk (M ) → Hk (M ), for j = 0, 1, follows from (A.7.78) (or (A.7.79)). The fact that, for each u ∈ Hk (M ), (A.7.82)
f1∗ u − f0∗ u ∈ E k (M ),
follows from (A.7.30). The conclusion (A.7.81) then follows, since Hk (M )∩E k (M ) = 0. We now exploit symmetries to compute the cohomology of the spheres S n . Proposition A.7.17. If S n is the unit sphere in Rn+1 , then (A.7.83)
Hk (S n ) ≈ R, 0,
if k = 0 or n, otherwise.
430
A. Complementary material
Proof. (We have already seen this, by other means, for k = 0, 1, and n.) We use the fact that SO(n + 1) acts as a group of isometries of S n . Also, SO(n + 1) has been seen to be path connected. Hence, for each k, (A.7.84)
g ∗ : Hk (M ) −→ Hk (M ) is the identity map, ∀ g ∈ SO(n + 1).
Let p = (0, . . . , 0, 1) be the “north pole.” Pick u ∈ Hk (S n ), and consider u(p) ∈ Λk Tp∗ (S n ). One has Tp∗ S n naturally isometrically isomorphic to Rn , and we can regard u(p) ∈ Λk Rn .
(A.7.85)
Now the group SO(n) acts as a group of isometries of S n leaving p fixed, and, by (A.7.84), we have g ∗ u(p) = u(p),
(A.7.86)
∀ g ∈ SO(n).
We claim that α ∈ Λk Rn , g ∗ α = α ∀ g ∈ SO(n)
(A.7.87)
=⇒ α = 0,
if 1 ≤ k ≤ n − 1.
Given this, we have (A.7.88)
u(p) = 0, ∀ u ∈ Hk (S n ), if 1 ≤ k ≤ n − 1.
Exploiting the identity g ∗ u = u for all g ∈ SO(n + 1) and the fact that SO(n + 1) acts transitively on S n , we obtain (A.7.89)
u(x) = 0, ∀ x ∈ S n , u ∈ Hk (S n ), if 1 ≤ k ≤ n − 1.
Hence Hk (S n ) = 0, for 1 ≤ k ≤ n − 1,
(A.7.90)
which takes care of all the cases of (A.7.83) not already established. To complete the proof of Proposition A.7.17, it remains to establish the purely algebraic result (A.7.87). Proof of (A.7.87). (Actually, this proof uses analysis.) For ℓ ̸= m ∈ {1, . . . , n}, θ ∈ R, we define Rℓm (θ) ∈ SO(n) by (A.7.91)
Rℓm (θ)eℓ = (cos θ)eℓ + (sin θ)em , Rℓm (θ)em = (− sin θ)eℓ + (cos θ)em ,
and Rℓm (θ)ei = ei for i ∈ / {ℓ, m}, where {ei : 1 ≤ i ≤ n} is the standard orthonormal basis of Rn . Write α ∈ Λk Rn as ∑ (A.7.92) α= aI eI , I = (i1 , . . . , ik ), 1 ≤ i1 < · · · < ik ≤ n, I
where (A.7.93)
e I = e i1 ∧ · · · ∧ e ik .
Note that (A.7.94)
g ∗ eI = g ∗ ei1 ∧ · · · ∧ g ∗ eik .
Suppose that (A.7.95)
g ∗ α = α,
∀ g ∈ SO(n).
A.7. Beyond degree theory – introduction to de Rham theory
It follows that (A.7.96)
∫
1 2π
π
−π
431
Rℓm (θ)∗ α dθ = α,
for all ℓ, m ∈ {1, . . . , n}, ℓ ̸= m. On the other hand, we have ∫ π 1 Rℓm (θ)∗ eI dθ 2π −π (A.7.97) = eI if both ℓ, m ∈ I or both ℓ, m ∈ / I. 0 if ℓ or m ∈ I but not both. It follows that, for each k-multiindex I, the integral (A.7.97) vanishes for some ℓ ̸= m, as long as 1 ≤ k ≤ n − 1. Consequently, if (A.7.95) holds, then aI = 0 for each k-multiindex I, hence α = 0. We have (A.7.87), and the proof of Proposition A.7.17 is complete.
A somewhat parallel argument treats the cohomology of tori. Proposition A.7.18. For the n-dimensional torus Tn = Rn /Zn , we have (A.7.98) hence (A.7.99)
Hk (Tn ) ≈ Λk Rn ,
0 ≤ k ≤ n,
( ) n n! dim H (T ) = = . k!(n − k)! k k
n
Proof. We have Rn acting on Tn as a group of isometries by translation, (A.7.100)
τy (x) = x + y (mod Zn ).
Hence, for 0 ≤ k ≤ n, (A.7.101)
u ∈ Hk (Tn ), y ∈ Rn =⇒ τy∗ u = u.
Writing (A.7.102)
u=
∑
uI (x) dxi1 ∧ · · · ∧ dxik ,
I
we have from (A.7.101) that each coefficient uI (x) is constant, i.e., ∑ (A.7.103) u ∈ Hk (Tn ) =⇒ u = uI dxi1 ∧ · · · ∧ dxik , uI ∈ R. I
The converse implication (⇐) is clear, so we have (A.7.104)
Hk (Tn ) ≈ Λk Rn ,
and hence (A.7.98).
Another result for which Hodge theory is effective is the computation of the cohomology of products, known as the Kunneth formula: Proposition A.7.19. If M and N are smooth, compact manifolds, of dimension m and n, respectively, then, for 0 ≤ k ≤ m + n, ⊕ [ ] Hi (M ) ⊗ Hj (N ) . (A.7.105) Hk (M × N ) ≈ i+j=k
432
A. Complementary material
Here, if V and W are two finite-dimensional vector spaces, the tensor product V ⊗W can be defined as V ⊗ W = L(V ′ , W ). If V and W have bases {vν } and wµ }, then V ⊗ W has basis {vν ⊗ wµ }. Further results on tensor products can be found in Chapter 5 of [52]. To establish Proposition A.7.19, one endows M and N with Riemannian metrics and puts the product metric on M × N . Then one shows that ⊕ (A.7.106) Hk (M × N ) = Hi (M ) ⊗ Hj (N ), i+j=k
which yields (A.7.105). The proof of (A.7.106) involves further analysis of elliptic PDE. Details can be found in the treatment of Proposition 8.5 in §5.8 of [46]. One can use variants of the arguments in Propositions A.7.17 and A.7.18 to compute the cohomology of other manifolds with lots of symmetry, such as compact matrix groups, and a number of manifolds on which they act transitively. Some results on this are treated in §5.8 of [46]. On the other hand, manifolds without much symmetry call for further techniques. Here is one example. Let Mg be a g-holed torus, a compact 2D surface discussed in §5.3, Exercise 20. Proposition A.7.20. For the g-holed torus Mg , one has (A.7.107)
H1 (Mg ) ≈ R2g .
One tool that is effective on this and other cohomology calculations is the “MayerVietoris sequence,” treated in Chapter 5, Appendix B of [46], where a proof of (A.7.107) can be found. Recall that we already know that Hk (Mg ) ≈ R for k = 0, 2. The content of Exercise 20 in §5.3 is that the Euler characteristic of Mg is given by (A.7.108)
χ(Mg ) = 2 − 2g.
In light of (A.7.107), this is equivalent to the identity (A.7.109)
χ(Mg ) = dim H0 (Mg ) − dim H1 (Mg ) + dim H2 (Mg ).
This is a special case of the following result, whose proof can be found in §11 of [5]. Proposition A.7.21. If M is a smooth, compact, n-dimensional manifold, then n ∑ (A.7.110) χ(M ) = (−1)k dim Hk (M ). k=0
Actually, (A.7.110) is taken as the definition of χ(M ) in [5], and their Theorem 11.25 is phrased as equating this with the index of a vector field on M . This result is known as the Hopf index theorem. In addition to de Rham cohomology groups, arising via differential forms, there are other varieties of cohomology groups, such as singular cohomology groups, (A.7.111)
k Hsing (M, G),
A.7. Beyond degree theory – introduction to de Rham theory
433
and also singular homology groups, Hk,sing (M, G),
(A.7.112)
defined when G is a commutative additive group. We will not discuss these here, but refer to the texts [21] and [24] for treatments, with a wealth of topological applications. A key connection with de Rham cohomology is provided by the following result, known as de Rham’s theorem. Theorem A.7.22. If M is a smooth, compact manifold, there is a natural isomorphism k Hk (M ) ≈ Hsing (M, R).
(A.7.113)
A proof is given in [8]. A variant, using Cech cohomology in place of singular cohomology, is given in [5], Theorem 8.9. We conclude our introduction to de Rham theory with a discussion of the Hopf invariant. This was introduced by E. Hopf in his analysis of homotopy classes of maps f : S 3 −→ S 2 ,
(A.7.114)
and seen to extend to broader contexts, such as f : S 2n−1 −→ S n .
(A.7.115)
More generally, we will attach a number, denoted h(f ), called the Hopf invariant, to a smooth map f : M −→ N,
(A.7.116)
when M and N are smooth, compact, oriented manifolds satisfying dim N = n ≥ 2, even, dim M = 2n − 1, (A.7.117) N connected, Hn (M ) = 0. Note that Proposition A.7.17 implies Hn (S 2n−1 ) = 0, for n ≥ 2. We define h(f ) as follows. Pick ∫ n (A.7.118) ω ∈ Λ N, ω = 1, B = f ∗ ω, N
and pick A ∈ Λn−1 M,
(A.7.119)
dA = B,
which can be done when H (M ) = 0. Then set ∫ (A.7.120) h(f ) = A ∧ dA. n
M
If n is odd, then A ∧ dA = (1/2)d(A ∧ A), so (A.7.120) vanishes. Hence we restrict attention to the case n even. To guarantee that the Hopf invariant is well defined, we need to check independence of choices. First, suppose (A.7.121)
ω ′ ∈ Λn N,
ω ′ = ω + dβ.
434
A. Complementary material
Then f ∗ ω ′ = d(A + f ∗ β), and we need to check that ∫ ∫ (A.7.122) (A + f ∗ β) ∧ (dA + df ∗ β) = A ∧ dA. M
M
Indeed, the integrand on the left side of (A.7.122) expands to A ∧ dA + (f ∗ β) ∧ dA + A ∧ df ∗ β + f ∗ β ∧ df ∗ β,
(A.7.123)
a sum of four terms. The second term is equal to f ∗ (β ∧ ω) = 0. The third term is equal to (−1)n−1 d(A ∧ f ∗ β) + (−1)n dA ∧ f ∗ β, and as just seen the last term here is 0, so this integrates to 0. Finally, the last term in (A.7.123) is equal to f ∗ (β ∧ dβ) = 0. Thus we have (A.7.122). Next, suppose dA = dA′ , so A′ − A = α ∈ C n−1 (M ).
(A.7.124) Then
A′ ∧ dA′ = A ∧ dA + α ∧ dA
(A.7.125)
= A ∧ dA + (−1)n−1 d(α ∧ A),
so (A.7.120) is unchanged upon replacing A by A′ . The following result asserts the homotopy invariance of the Hopf invariant. Proposition A.7.23. Assume M and N are smooth, compact, oriented manifolds satisfying (A.7.117). If f0 , f1 : M → N are smoothly homotopic, then (A.7.126)
h(f0 ) = h(f1 ).
Proof. Let ft : M → N be a smooth family of maps, t ∈ [0, 1]. We can arrange that ft = f0 for t ∈ [0, δ],
(A.7.127)
ft = f1 for t ∈ [1 − δ, 1],
for some δ > 0. Define F : [0, 1] × M −→ N,
(A.7.128)
F (t, x) = ft (x).
Take ω as in (A.7.118), and this time set B = F ∗ ω ∈ Λn (Ω),
(A.7.129)
Ω = [0, 1] × M.
We claim that, since H (M ) = 0, n
∃ A ∈ Λn−1 (Ω) such that dA = B.
(A.7.130)
Granted this, we apply Stokes’ formula to A ∧ dA ∈ Λ2n−1 (Ω), to get ∫ ∫ (A.7.131) dA ∧ dA = A ∧ dA. Ω
∂Ω
∗
Since dA ∧ dA = F (ω ∧ ω) = 0, we have ∫ (A.7.132) A ∧ dA = 0, ∂Ω
which implies (A.7.126), modulo the proof of (A.7.130).
A.7. Beyond degree theory – introduction to de Rham theory
435
Proof of (A.7.130). We construct the “double” S of Ω, as [−1, 1] × M , where we identify the endpoints −1 and 1. In other words, (A.7.133)
S = T × M,
T = R/2Z.
We have the inclusion Ω ⊂ S, via [0, 1] ⊂ T. We extend F to (A.7.134)
F : T × M −→ N,
F (t, x) = f|t| (x).
The arrangement (A.7.127) implies that such F is smooth. In addition, there is a smooth involution (A.7.135)
κ : S −→ S,
κ(t, x) = (−t, x),
and we have F ◦ κ = F . Extending (A.7.129), we take (A.7.136)
B = F ∗ ω ∈ Λn S,
so dB = 0, κ∗ B = B,
and we claim that, since Hn (M ) = 0, (A.7.137)
∃ A ∈ Λn−1 S such that dA = B.
If we establish (A.7.137), we will have (A.7.130). We will get this from the Kunneth formula, though not immediately, since what this formula gives is [ ] [ ] Hn (S) ≈ H0 (T) ⊗ Hn (M ) ⊕ H1 (T) ⊗ Hn−1 (M ) (A.7.138) ≈ Hn−1 (M ). Actually, (A.7.139)
M = S 2n−1 , n ≥ 2 =⇒ Hn−1 (M ) = 0,
so one who is interested only in the case M = S 2n−1 could stop here. Otherwise, we can carry on, and exploit the fact that κ∗ B = B. The closed form B is cohomologous to its harmonic projection (A.7.140)
B # = Ph B ∈ Hn (S),
and behind (A.7.138) is the identity (A.7.141)
Hn (S) = H1 (T) ⊗ Hn−1 (M ) = {dt ∧ u : u ∈ Hn−1 (M )}.
We see that (A.7.142)
v ∈ Hn (S) =⇒ κ∗ v = −v, but κ∗ B # = B # .
This implies B # = 0, and yields (A.7.137), and hence completes the proof of Proposition A.7.23. There is the following relationship between the Hopf invariant and degree.
436
A. Complementary material
Proposition A.7.24. Take M and N as in Proposition A.7.23. Suppose also that M ′ is a smooth, compact, oriented manifold of dimension 2n−1, with Hn (M ′ ) = 0, and we have a smooth map φ : M ′ −→ M.
(A.7.143) Then (A.7.144)
h(f ◦ φ) = (Deg φ)h(f ).
Proof. If A′ = φ∗ A, we have dA′ = (f ◦ φ)∗ ω and A′ ∧ dA′ = φ∗ (A ∧ dA), so ∫ h(f ◦ φ) = A′ ∧ dA′ M′
∫
φ∗ (A ∧ dA)
=
(A.7.145)
M′
∫ A ∧ dA,
= (Deg φ) M
yielding (A.7.144).
We turn to the computation of a basic example of the Hopf invariant. Let N = S 2 and M = SO(3). The tangent space TI SO(3) is the Lie algebra Skew(3), spanned by 0 0 −1 , 0 1 , J2 = 0 J1 = −1 0 1 0 (A.7.146) 0 1 , J3 = −1 0 0 satisfying commutation relations (A.7.147)
[J1 , J2 ] = J3 ,
[J2 , J3 ] = J1 ,
[J3 , J1 ] = J2 .
The group SO(3) acts transitively on S ⊂ R , and the subgroup fixing the “north pole” p = (0, 0, 1) is the group generated by J3 . This defines a map 2
(A.7.148)
φ : SO(3) −→ S 2 ,
3
φ(g) = g(p).
Let us define a 1-form α on SO(3) as follows. If Jν are extended as left-invariant vector fields on SO(3), set (A.7.149)
α(J1 ) = α(J2 ) = 0,
α(J3 ) = 1.
Then the formula dα(X, Y ) = Xα(Y ) − Y α(X) − α([X, Y ]), together with the commutation relations (A.7.147) yields (A.7.150)
dα(J1 , J2 ) = −1,
dα(J2 , J3 ) = 0,
and hence (A.7.151)
α ∧ dα(J1 , J2 , J3 ) = −1.
dα(J3 , J1 ) = 0,
A.7. Beyond degree theory – introduction to de Rham theory
437
It follows from (A.7.150) that, up to sign, dα is the pull-back via φ of the volume form ωS 2 on S 2 given by its standard metric, so ∫ ωS 2 = 4π. S2
Before performing the computation of h(φ), let us bring in M ′ = SU (2). We have a two-fold covering homomorphism (A.7.152)
τ : SU (2) −→ SO(3),
obtained by mapping Xν to Jν , where Xν form the following basis of su(u) = TI SU (2): ( ( ( ) ) ) 1 0 1 1 0 i 1 i 0 , X2 = , X3 = , (A.7.153) X1 = 2 0 −i 2 −1 0 2 i 0 with the same commutation relations as in (A.7.147), i.e., (A.7.154)
[X1 , X2 ] = X3 ,
[X2 , X3 ] = X1 ,
[X3 , X1 ] = X2 .
Thus α′ = τ ∗ α satisfies analogues of (A.7.149)–(A.7.151), with Jν replaced by Xν , extended to be left-invariant vector fields on SU (2). Now SU (2) acts simply transitively on S 3 ⊂ C2 . Note that e4πXν = I,
(A.7.155)
and 4π is the smallest positive factor for which this holds. Hence, if SU (2) is given the Riemannian metric induced from S 3 , we see that ∥Xν ∥ = 1/2. Also, the tangent vectors Xν are mutually orthogonal at each point of SU (2). Hence α′ ∧ dα′ is 8 times the volume element on SU (2) induced by this metric. We have seen that Vol(S 3 ) = 2π 2 , so ∫ (A.7.156) α′ ∧ dα′ = 16π 2 , SU (2)
provided we give SU (2) the orientation such that (−X1 , −X2 , −X3 ) is an oriented basis of the tangent space. Now, let us set 1 1 1 ′ (A.7.157) ω= ωS 2 , A = α, A′ = α. 4π 4π 4π Then (A.7.158)
dA = φ∗ ω,
dA′ = ψ ∗ ω,
where (A.7.159)
ψ = φ ◦ τ : SU (2) −→ S 2 ,
and we have (A.7.160)
A′ ∧ dA′ =
By (A.7.156), we obtain the following.
1 α′ ∧ dα′ . 16π 2
438
A. Complementary material
Proposition A.7.25. For the maps φ and ψ given by (A.7.148) and (A.7.159), we have 1 (A.7.161) h(φ) = , h(ψ) = 1. 2 The map (A.7.159) is equivalent to the classical Hopf map f : S 3 → S 2 in (A.7.114), and we conclude that h(f ) = 1 in this classical case. Note that, upon composing such f with a map gν : S 3 → S 3 of degree ν and using Proposition A.7.24, we obtain a map (A.7.162)
fν : S 3 −→ S 2 ,
such that h(fν ) = ν.
It is known that, whenever M = S 2n−1 and f : S 2n−1 → N , then h(f ) is an integer. See [5] for a detailed discussion of the case n = 2. In fact, the Hopf invariant of f : M → N is usually only studied for M = S 2n−1 . We see in Proposition A.7.25 a map φ : SO(3) → S 2 whose Hopf invariant is not an integer. It is shown in [33] that, for M and N as in Proposition A.7.23, and f : M → N , the Hopf invariant h(f ) is rational.
Exercises 1. The treatment of the Hopf invariant of φ : SO(3) → S 2 in Proposition A.7.25 requires H2 (SO(3)) = 0. Show that H1 (SO(3)) ≈ H2 (SO(3)) = 0. Hint. Show that u ∈ Hk (SO(3)) =⇒ τ ∗ u ∈ Hk (SU (2)) ≈ Hk (S 3 ). But τ : SU (2) → SO(3) is a local diffeomorphism, so τ ∗ is injective on Hk (SO(3)). 2. Show that, for n ≥ 2, Hk (P(S n )) = 0 for 1 ≤ k ≤ n − 1. 3. Let Mg be a g-holed torus, such as studied in Exercises 20–22 of §5.3, and form P(Mg ), as indicated there. (a) Show that H2 (P(Mg )) = 0. Hint. Extend Proposition A.7.14, as indicated in Proposition A.7.15. (b) Using Exercise 22 of §5.3 together with Proposition A.7.21, deduce that H1 (P(Mg )) ≈ Rg . 4. As seen in Exercise 23 of §5.3, if X and Y are smooth, compact manifolds, χ(X × Y ) = χ(X)χ(Y ). Show that this identity also follows from the Kunneth formula, (A.7.105), together with the Hopf formula, (A.7.110).
Exercises
439
5. Produce an extension of the Poincar´e lemma, Proposition A.7.3, as follows. Let M be a smooth, connected, n-dimensional manifold, X ⊂ M a smooth ℓdimensional surface, ℓ < n. Denote the by ι the inclusion, ι : X ,→ M . Assume that the identity map I : M → M is smoothly homotopic to a map K : M → M with image in X, so K = ι ◦ F , with F : M → X. Show that, for all k ≥ 0, F ∗ ι∗ : Hk (M ) −→ Hk (M ) is the identity, hence ι∗ : Hk (M ) −→ Hk (X) is injective. In particular, Hk (M ) = 0, ∀ k > ℓ. 6. Take M, X, ι as in Exercise 5. This time, assume there is a retraction R : M → X, so R ◦ ι : X → X is the identity map. Show that, for all k ≥ 0, ι∗ R∗ : Hk (X) −→ Hk (X) is the identity, hence
ι∗ : Hk (M ) −→ Hk (X) is surjective.
In particular, Hk (X) ̸= 0 =⇒ Hk (M ) ̸= 0. 7. Now combine Exercises 5 and 6. Take M, X, ι as above. Assume the identity map I : M → M is smoothly homotopic to a map K = ι ◦ R, where R : M → X is a retraction. (We say X is a smooth deformation retract of M .) Deduce that ι induces isomorphisms ≈
ι∗ : Hk (M ) −→ Hk (X),
∀ k ≥ 0.
Bibliography
[1] R. Abraham and J. Marsden, Foundations of Mechanics, Benjamin, Reading, Mass., 1978. [2] L. Ahlfors, Complex Analysis, McGraw-Hill, New York, 1979. [3] V. Arnold, Mathematical Methods of Classical Mechanics, Springer, New York, 1978. [4] L. Baez-Duarte, Brouwer’s fixed-point theorem and a generalization of the formula for change of variables in multiple integrals, J. Math. Anal. Appl. 177 (1993), 412– 414. [5] R. Bott and L. Tu, Differential Forms in Algebraic Topology, Springer, New York, 1982. [6] R. Buck, Advanced Calculus, McGraw-Hill, New York, 1978. [7] F. Dai and Y. Xu, Approximation Theory and Harmonic Analysis on Spheres and Balls, Springer, New York, 2013. [8] G. de Rham, Differentiable Manifolds, Springer, New York, 1984. [9] A. Devinatz, Advanced Calculus, Holt, Reinhart, and Winston, New York, 1968. [10] M. do Carmo, Differential Geometry of Curves and Surfaces, Prentice-Hall, Englewood Cliffs, NJ, 1976. [11] M. do Carmo, Riemannian Geometry, Birkhauser, Boston, 1992. [12] J. J. Duistermaat and J. Kolk, Lie Groups, Springer-Verlag, New York, 2000. [13] H. Edwards, Riemann’s Zeta Function, Dover, New York, 2001. [14] H. Federer, Geometric Measure Theory, Springer, New York, 1969. [15] H. Flanders, Differential Forms with Applications to the Physical Sciences, Academic Press, New York, 1963. [16] W. Fleming, Functions of Several Variables, Addison-Wesley, Reading, Mass., 1965. [17] G. Folland, Real Analysis: Modern Techniques and Applications, Wiley-Interscience, New York, 1984. [18] P. Franklin, A Treatise on Advanced Calculus, John Wiley, New York, 1955. [19] H. Goldstein, Classical Mechanics, Addison-Wesley, New York, 1950.
441
442
Bibliography
[20] E. Goursat, A Course in Modern Analysis, Vols. 1–3, Dover, New York, 1959. Translated from the French in 1904. [21] M. Greenberg and J. Harper, Algebraic Topology, a First Course, Addison-Wesley, New York, 1981. [22] V. Guillemin and P. Haine, Differential Forms, World Scientific, 2019. [23] J. Hale, Ordinary Differential Equations, Wiley, New York, 1969. [24] A. Hatcher, Algebraic Topology, Cambridge Univ. Press, 2002. [25] N. Hicks, Notes on Differential Geometry, Van Nostrand, New York, 1965. [26] E. Hille, Analytic Function Theory, Chelsea Publ., New York, 1977. [27] M. Hirsch and S. Smale, Differential Equations, Dynamical Systems, and Linear Algebra, Academic Press, New York, 1974. [28] F. Hobson, The Theory of Spherical and Ellipsoidal Harmonics, Chelsea, New York, 1965. (Reprint of 1931 Cambridge U. Press monograph) [29] Y. Kannai, An elementary proof of the no-retraction theorem, Amer. Math. Monthly 88 (1981), 264–268. [30] O. Kellogg, Foundations of Potential Theory, Dover, New York, 1953. [31] S. Krantz, Function Theory of Several Complex Variables, Wiley, New York, 1982. [32] P. Lax, Change of variables in multiple integrals, Amer. Math. Monthly 106 (1999), 497–501. [33] C. Lebrun and M. Taylor, “The Hopf bracket,” manuscript, 2013, available at http://mtaylor.web.unc.edu/notes, item #16. [34] S. Lefschetz, Differential Equations, Geometric Theory, J. Wiley, New York, 1957. [35] L. Loomis and S. Sternberg, Advanced Calculus, Addison-Wesley, New York, 1968. [36] J. Milnor, Topology from the Differentiable Viewpoint, Univ. Press of Virginia, Charlottesville VA, 1965. [37] H. Nickerson, D. Spencer, and N. Steenrod, Advanced Calculus, Van Nostrand, Princeton, New Jersey, 1959. [38] F. Olver, Asymptotics and Special Functions, Academic Press, New York, 1974. [39] M. Protter and C. Morrey, A First Course in Real Analysis, (2nd ed.) SpringerVerlag, New York, 1991. [40] B. Simon, Harmonic Analysis, American Mathematical Society, Providence RI, 2015. [41] K. Smith, Primer of Modern Analysis, Springer-Verlag, 1983. [42] M. Spivak, Calculus on Manifolds, Benjamin, New York, 1965. [43] M. Spivak, A Comprehensive Introduction to Differential Geometry, Vols. 1–5, Publish or Perish Press, Berkeley, CA, 1979. [44] S. Sternberg, Lectures on Differential Geometry, Prentice Hall, New Jersey, 1964. [45] J. J. Stoker, Differential Geometry, Wiley-Interscience, New York, 1969. [46] M. Taylor, Partial Differential Equations, Vols. 1–3, Springer-Verlag, New York, 1996 (2nd ed., 2011). [47] M. Taylor, Measure Theory and Integration, GSM #76, American Mathematical Society, Providence RI, 2006. [48] M. Taylor, Differential Geometry, Lecture notes, Preprint, 2019. [49] M. Taylor, Introduction to Analysis in One Variable, American Math. Society, Providence RI, to appear.
Bibliography
443
[50] M. Taylor, Introduction to Differential Equations, American Math. Soc., Providence, RI, 2011. [51] M. Taylor, Introduction to Complex Analysis, GSM # 202, American Math. Society, Providence RI, 2019. [52] M. Taylor, Linear Algebra, American Math. Society, to appear. [53] M. Taylor, Noncommutative Harmonic Analysis, Math. Surv. Monogr. # 22, American Mathematical Society, Providence RI, 1986. [54] J. Thorpe, Elementary Topics in Differential Geometry, Springer, New York, 1979. [55] V. S. Varadarajan, Lie Groups, Lie Algebras, and Their Representations, Prentice Hall, Englewood Cliffs NJ, 1974. [56] E. Whittaker and G. Watson, Modern Analysis, Cambridge Univ. Press, 1927. [57] E. B. Wilson, Advanced Calculus, Dover, New York, 1958. (Reprint of 1912 ed.)
Index
accumulation point, 387 Ad, 286 ad, 286 adjoint, 401 algebra of sets, 93 analysis, xi angle defect, 260 anti-symmetry, 157 anticommutation relation, 163 antipodal map, 210, 216 antipodal points, 133 arc length, 122 arcsin, 85, 198 arctan, 86 area, 124 Arzela-Ascoli theorem, 242, 394 averaging over rotations, 126 ball, 386 Banach space, 393 basis, 26, 374 bi-invariant metric, 276 binomial coefficient, 372 Bolzano-Weierstrass theorem, 20 bracket, 80 Brouwer fixed-point theorem, 208, 420 Cantor set, 16 Cartan’s formula for Lie derivative, 204 Cartesian product, 221 Cauchy integral formula, 189, 199, 374 Cauchy integral theorem, 189 Cauchy remainder formula, 50, 407, 411 Cauchy sequence, 20, 386
Cauchy’s inequality, 19, 303, 318, 345, 399 Cauchy-Riemann equations, 55, 68, 188, 195 cell, 90 center, 79, 217 central function, 379 chain rule, 14, 44, 74, 77, 121, 151, 160, 197 change of variable formula, 14, 96, 99, 104, 109, 122, 157, 182 character of a representation, 379 characteristic function, 5, 91 characteristic polynomial, 402 Christoffel symbols, 234, 237, 249 Clairaut parametrization, 241, 255 classical Stokes formula, 176 Clifford algebra, 163 closed, 386 closed form, 164, 421 closed set, 20 closure, 92, 387 Codazzi equation, 256 cofactor matrix, 37, 58 column vector, 17, 25 commutator, 272, 356 commutator identities, 359 compact, 20, 387 complete metric space, 386 completeness property, 20, 305, 323 completion, 386 complex analytic, 55 complexification, 405
445
446
conformal, 201 conjugate points, 291 connected, 137, 185, 395 connection 1-form, 249 content, 91 continuous, 3, 42, 91, 390 contraction mapping theorem, 60, 70 convergence, 19 convergent power series, 52 convolution, 316, 380 coordinate chart, 119, 150 coordinate syatem, 156 cos, 84, 138, 141, 197 countable, 388 covariant derivative, 236 covariant derivative on normal fields, 246 Cramer’s formula, 37, 58 critical point, 49, 55 critical point of a vector field, 78, 214 cross product, 37, 122 curl, 165, 176 curvature, 226, 243 curvature (2,2)-form, 254 curvature 2-form, 249 curvature vector, 242 curve, 122, 138, 156 Darboux theorem, 4, 91, 146 de Rham cohomology, 421 definition vs. formula, 35 deformation retract, 439 Deg, 208, 212 degree, 208, 212, 216, 420 degree of Gauss map, 256 dense, 342, 388 derivation, 77 derivative, 7, 42, 126 derived representation, 358, 362 det, 32, 88 determinant, 31, 32, 57, 158 diffeomorphism, 59, 119, 135, 157 differentiability of power series, 53 differentiable, 7, 42, 126 differential equation, 70 differential form, 156 differential operator, 77 dimension, 26 Dini’s theorem, 396 Dirichlet problem, 298, 312, 336, 346, 347, 415 disk, 382
Index
distance, 18, 386 distribution, 309 div, 131, 165, 172, 206 divergence, 131, 206 divergence theorem, 172, 174, 177 dot product, 18 dual space, 306, 325 Duhamel’s formula, 74, 88, 288 eigenfunction, 339 eigenspace, 402 eigenvalue, 137, 339, 402 eigenvector, 137, 402 ellipsoid, 116 embedding, 135 energy functional, 225, 230 Euclidean metric tensor, 123 Euclidean space, 17, 43 Euler characteristic, 216, 257, 432 Euler’s formula, 84, 142, 406 Euler’s other formula, 217, 262 Expp , 238 exact form, 164, 203, 421 exactness criterion, 210 Exp, 63, 88, 125, 142, 406 expansion by minors, 37 exponential function, 83 exponential map, 125, 226, 238, 287 exterior derivative, 163 exterior product, 162 extremal problem, 49 fixed point, 59 flat torus, 135 flow, 77 Fourier coefficients, 298 Fourier inversion formula, 295, 296, 298, 307, 310, 317, 321, 325, 328 Fourier series, 295, 298 Fourier transform, 296, 314 Fubini’s theorem, 94, 104, 117 fundamental theorem of algebra, 194, 216, 402 fundamental theorem of calculus, 7, 42, 167, 230, 234, 237, 408 fundamental theorem of linear algebra, 28, 128, 341 Funk-Hecke theorem, 353, 379 Gl+ (n, R), 137, 184 Gamma function, 124, 200, 335 Gauss angle defect formula, 260
Index
Gauss curvature, 218, 227, 243, 247, 248, 253, 255, 257, 265 Gauss linking number formula, 223 Gauss map, 215, 243 Gauss theorema egregium, 228, 251 Gauss-Bonnet formula, 218 Gauss-Bonnet theorem, 228, 257 Gaussian integral, 105 Gegenbauer polynomials, 348 generalized eigenspace, 293, 406 generalized Gauss-Bonnet theorem, 264 generating function, 348 geodesic curvature, 262 geodesic equation, 225, 230, 233, 234, 241, 282 geodesic triangle, 260 geodesics, 225, 229 geometric series, 312 Gl(n, R), 43 global diffeomorphism, 62 grad, 165 Gramm-Schmidt construction, 400 Green formulas, 175 Green’s theorem, 172, 179, 189, 383 group, 282 Haar measure, 126, 210, 354 Hamiltonian system, 235 Hamiltonian vector field, 178 harmonic, 181, 189, 192, 201, 311, 415 harmonic conjugate, 195 harmonic form, 424 harmonic function, 336 harmonic polynomial, 340, 346 Harnack’s inequality, 418 Heine-Borel theorem, 21, 389 Hessian, 48 Hilbert space, 399 Hodge decomposition, 424, 426 Hodge Laplacian, 424 holomorphic, 55, 68, 188 homotopic, 202, 215 homotopy invariance, 212, 216 Hopf index theorem, 432 Hopf invariant, 433 implicit function theorem, 64, 127 index of a vector field, 214 inf, 2 injective, 25 inner product, 121, 153, 305, 318, 398 inner tube, 142, 219
447
integral equation, 70 integral of an n-form, 159 integral test, 16 integrating factor, 166 integration by parts, 14, 125 interior, 92, 387 interior product, 163, 203 intermediate value theorem, 395 interval, 2 inverse, 25 inverse Fourier transform, 314 inverse function theorem, 58, 119, 125, 127, 199 invertible, 29, 36 irreducible representation, 284, 355 isometric embedding, 136 isomorphism, 25 isoperimetric inequality, 382 isothermal coordinates, 252 iterated integral, 94 Jacobi field, 291 Jacobi inversion formula, 334, 335 Jacobi variational equation, 290 Jordan curve theorem, 214 Jordan-Brouwer separation theorem, 213 Kunneth formula, 431 ladder operators, 360 Lagrange remainder formula, 50, 407, 412 Laplace operator, 175, 180, 181, 192, 201, 339, 415 Lebesgue integral, 304, 322 Lebesgue measure, 6 Legendre polynomials, 349, 364 length, 229 length functional, 225, 230 Levi-Civita covariant derivative, 226, 236 Lie algebra, 272 Lie algebra isomorphism, 363 Lie algebra representation, 358 Lie bracket, 80, 272, 356 Lie derivative, 80, 204 Lie group, 282 linear system of ODE, 73 linear transformation, 23, 42 linearization of an ODE, 75 linearly dependent, 26
448
linearly independent, 26 linking number, 221 Liouville’s theorem, 193, 199, 418 Lipschitz, 94 local diffeomorphism, 62 local maximum, 49, 55 local minimum, 49, 55 log, 83, 197 lower content, 5, 91 M(n, F), 31, 32, 88 M(n, R), 125 manifold, 134, 150 matrix, 23 matrix exponential, 79, 87, 287, 356 matrix group, 269 matrix multiplication, 25 maximum, 391 maximum principle, 193, 338, 415 maxsize, 2, 90 mean value property, 192, 338, 415 mean value theorem, 8, 45 measurable, 304 metric space, 18, 386 metric tensor, 121, 134, 153, 229 minimum, 391 modulus of continuity, 392 monotone convergence theorem, 110 Morse function, 149 multi-index notation, 46 multi-linear notation, 51 multi-linear Taylor formula, 52 multiplicativity, 35 multipole expansion, 372 negative definite, 49 neighborhood, 386 Newton method miracle, 68 Newton’s method, 62, 68 nil set, 92 no-retraction theorem, 208, 213 non-degenerate critical point, 79, 214 norm, 18, 42, 300, 318, 393, 398 normal field, 245 normal transformation, 404 null space, 25 open, 386 open set, 20, 42 orbit, 77 orientable, 159, 162 orientation, 158
Index
orthogonal complement, 401 orthogonal coordinates, 252 orthogonal projection, 231, 244, 343, 401 orthogonal transformation, 405 orthogonality, 340 orthonormal basis, 121, 137, 342, 399 outer measure, 6, 113 parallel translation, 257 parallel transport, 257 parametrization by arc length, 138, 242 partial derivative, 10, 42 partition, 2, 90 partition of unity, 147, 167 permutation, 33 Peter-Weyl theorem, 376 PI, 347 pi, 85, 86, 138 Picard iteration method, 70 piecewise constant, 12 PK, 98 Plancherel identity, 296, 303, 311, 319, 321, 329, 383 Poincar´e disk, 253 Poincar´e duality, 428 Poincar´e lemma, 164, 168, 204, 210, 422, 439 Poisson integral, 347 Poisson integral formula, 298, 312, 338 Poisson summation formula, 334, 336 polar coordinates, 102, 337 polar decomposition, 137 polynomial, 24 positive definite, 49 power series, 191, 196, 407 prime numbers, 336 product rule, 164 projection, 145, 280 projective space, 133, 162, 372 pull-back, 157, 158, 164 Pythagorean theorem, 18 quotient surface, 134 radial vector field, 166 range, 25 rational numbers, 386 real analytic, 68 real numbers, 386 regular value, 213 removable singularity theorem, 340, 416
Index
representation, 145, 279, 355 retraction, 208 Riemann curvature, 227, 248, 282 Riemann integrability criterion, 114 Riemann integrable, 2, 91, 129 Riemann integral, 2, 90 Riemann sum, 5, 91 Riemann zeta function, 335 Riemann’s functional equation, 336 Riemannian manifold, 229 Rodrigues formula, 375 rotation group, 355 row operation, 97, 118 row vector, 17 saddle, 79, 217 saddle point, 49, 55 Sard’s theorem, 149, 213 Schur’s lemma, 378 Schwarz reflection principle, 417 second fundamental form, 227, 244, 247 self-adjoint, 402 sequence, 387 simply connected, 422 sin, 84, 138, 141, 197 sinh, 144 sink, 79, 217 Skew(n), 125, 142, 356, 406 SO(n), 38, 125, 142, 201, 355, 406 source, 79, 217 span, 26 Spec, 402 spectral mapping theorem, 288, 292 sphere, 123, 124 spherical coordinates, 364 spherical harmonic expansion, 346 spherical harmonics, 297, 336, 338 spherical polar coordinates, 115, 123, 338 Stokes formula, 166, 179, 202, 213, 259, 266 Stone-Weierstrass theorem, 209, 286, 299, 342, 413 SU(n), 362 submersion, 127 submersion mapping theorem, 127 subsequence, 387 summation convention, 173, 234 sup, 2 surface, 119 surface integral, 121 surface of revolution, 144, 255
449
surface with boundary, 166 surface with corners, 167 surjective, 25 Sym(n), 136 tan, 86 tangent bundle, 152 tangent space, 121, 151 tangent vector field, 130 Taylor formula with remainder, 14, 46 tempered distribution, 328 torus, 298 total boundedness, 387 Tr, 39, 88 trace, 39 transpose, 30 triangle inequality, 18, 300, 304, 318, 386, 398 triangulation, 261 trigonometric function, 84, 138 trigonometric polynomial, 299, 415 Tychonov theorem, 393 umbilic, 256 unbounded integrable function, 105 uniformly continuous, 392 unimodular group, 278 unit normal, 243 unitarily equivalent, 285 unitary, 402 unitary representation, 284, 355 unlinking, 223 upper content, 5, 91 vector, 157 vector field, 77, 130, 153, 244 vector space, 22, 43, 397 volume, 90, 123 volume of a ball, 115, 139 wedge product, 162 Weierstrass approximation theorem, 412 Weingarten formula, 227, 246, 254 Weingarten map, 227, 245 Weyl orthogonality relations, 285 zonal function, 352 zonal harmonics, 352