Topics in Applied Mathematics and Modeling. Concise Theory with Case Studies 2022034482, 9781470469917, 9781470472177

English Pages [228] Year 2023

Table of contents :
Cover
Title page
Copyright
Contents
Preface
Note to instructors
Case studies and mini-projects
Chapter 1. Dimensional analysis
1.1. Units and dimensions
1.2. Axioms of dimensions
1.3. Dimensionless quantities
1.4. Change of units
1.5. Unit-free equations
1.6. Buckingham 𝜋-theorem
1.7. Case study
Reference notes
Exercises
Mini-project
Chapter 2. Scaling
2.1. Domains and scales
2.2. Scale transformations
2.3. Derivative relations
2.4. Natural scales
2.5. Scaling theorem
2.6. Case study
Reference notes
Exercises
Mini-project
Chapter 3. One-dimensional dynamics
3.1. Preliminaries
3.2. Solvability theorem
3.3. Equilibria
3.4. Monotonicity theorem
3.5. Stability of equilibria
3.6. Derivative test for stability
3.7. Bifurcation of equilibria
3.8. Case study
Reference notes
Exercises
Mini-project
Chapter 4. Two-dimensional dynamics
4.1. Preliminaries
4.2. Solvability theorem
4.3. Direction field, nullclines
4.4. Path equation, first integrals
4.5. Equilibria
4.6. Periodic orbits
4.7. Linear systems
4.8. Equilibria in nonlinear systems
4.9. Periodic orbits in nonlinear systems
4.10. Bifurcation
4.11. Case study
4.12. Case study
Reference notes
Exercises
Mini-project 1
Mini-project 2
Mini-project 3
Chapter 5. Perturbation methods
5.1. Perturbed equations
5.2. Regular versus singular behavior
5.3. Assumptions, analytic functions
5.4. Notation, order symbols
5.5. Regular algebraic case
5.6. Regular differential case
5.7. Case study
5.8. Poincaré–Lindstedt method
5.9. Singular algebraic case
5.10. Singular differential case
5.11. Case study
Reference notes
Exercises
Mini-project 1
Mini-project 2
Mini-project 3
Chapter 6. Calculus of variations
6.1. Preliminaries
6.2. Absolute extrema
6.3. Local extrema
6.4. Necessary conditions
6.5. First-order problems
6.6. Simplifications, essential results
6.7. Case study
6.8. Natural boundary conditions
6.9. Case study
6.10. Second-order problems
6.11. Case study
6.12. Constraints
6.13. Case study
6.14. A sufficient condition
Reference notes
Exercises
Mini-project 1
Mini-project 2
Mini-project 3
Bibliography
Index
Back Cover


Topics in Applied Mathematics and Modeling Concise Theory with Case Studies

Oscar Gonzalez

EDITORIAL COMMITTEE: Giuliana Davidoff, Steven J. Miller, Tara S. Holm, Maria Cristina Pereyra, Gerald B. Folland (Chair)

2020 Mathematics Subject Classification. Primary 00A69, 34A26, 34E10, 37N99, 41A58, 49K15.

For additional information and updates on this book, visit www.ams.org/bookpages/amstext-59

Library of Congress Cataloging-in-Publication Data
Names: Gonzalez, Oscar, 1968– author.
Title: Topics in applied mathematics and modeling : concise theory with case studies / Oscar Gonzalez.
Description: Providence, Rhode Island : American Mathematical Society, [2023] | Series: Pure and applied undergraduate texts, ISSN 1943–9334; Volume 59 | Includes bibliographical references and index.
Identifiers: LCCN 2022034482 | ISBN 9781470469917 (paperback) | 9781470472177 (ebook)
Subjects: LCSH: Differential equations. | AMS: General – General and miscellaneous specific topics – General applied mathematics. | Ordinary differential equations – General theory – Geometric methods in differential equations. | Ordinary differential equations – Asymptotic theory – Perturbations, asymptotics. | Dynamical systems and ergodic theory – Applications – None of the above, but in this section. | Approximations and expansions – Approximations and expansions – Series expansions (e.g. Taylor, Lidstone series, but not Fourier series). | Calculus of variations and optimal control; optimization – Optimality conditions – Problems involving ordinary differential equations.
Classification: LCC QA372 .G617 2023 | DDC 515/.352–dc23/eng20221014
LC record available at https://lccn.loc.gov/2022034482

Copying and reprinting. Individual readers of this publication, and nonprofit libraries acting for them, are permitted to make fair use of the material, such as to copy select pages for use in teaching or research. Permission is granted to quote brief passages from this publication in reviews, provided the customary acknowledgment of the source is given. Republication, systematic copying, or multiple reproduction of any material in this publication is permitted only under license from the American Mathematical Society. Requests for permission to reuse portions of AMS publication content are handled by the Copyright Clearance Center. For more information, please visit www.ams.org/publications/pubpermissions. Send requests for translation rights and licensed reprints to [email protected].

© 2023 by the author. All rights reserved. Printed in the United States of America.

∞ The paper used in this book is acid-free and falls within the guidelines established to ensure permanence and durability.

Visit the AMS home page at https://www.ams.org/

10 9 8 7 6 5 4 3 2 1    28 27 26 25 24 23

Preface

This book provides a concise tour of some fundamental methods and results of applied mathematics. It is designed for a one-semester course aimed at junior and senior level undergraduate students in the mathematical, physical, and engineering sciences. The prerequisites are an introductory knowledge of calculus, linear algebra, and ordinary differential equations.

The purpose of the book is to provide a context for students to gain a deeper appreciation of mathematics and its connections with other disciplines. It provides a setting in which mathematics can be observed in action, as a tool for exploring meaningful problems in the world around us. Moreover, it illustrates how mathematics is often inspired by real problems, and how mathematical abstraction can lead to physical understanding.

The subjects explored in the book are dimensional analysis and scaling, dynamical systems, perturbation methods, and calculus of variations. These are immense subjects of wide applicability, and a fertile ground for critical thinking and quantitative reasoning, in which every student of applied mathematics should have some experience.

The book originated from a set of lecture notes for the course M 374M at The University of Texas at Austin. It is intended for a course of study focused on concepts and examples. For completeness, proofs of less-standard results are summarized throughout, at the level of the prerequisites, whereas proofs of standard results can be found in the references as noted. All sections of the book were developed and improved over several years, and have been classroom tested. Over 300 exercises and 180 illustrations are provided to support teaching and learning. The highlights of the book are the case studies and mini-projects, which should be considered as essential for any plan of study. Various exercises provide opportunities for computer simulation and further exploration.

It is expected that students will benefit from this book in a number of ways. They will enhance their understanding of mathematics and gain experience in quantitative reasoning. They will also gain an appreciation for the intrinsic beauty of mathematical abstraction, and its utility as a guide for critical thinking. And they will acquire tools to explore meaningful problems, and increase their preparedness for research and advanced studies. Students can benefit from this book with minimal prerequisites, before any experience with partial differential equations or real analysis, which increases accessibility for both majors and nonmajors.

I gratefully acknowledge the many authors, mentors, and teachers whose work provided the foundation for the material presented here. My dependence on their work is profound, and too extensive for complete citation.

Note to instructors

All six chapters can be covered in a standard one-semester course, which consists of about 42 class meetings of about 50 minutes each. Each chapter is organized into a number of short sections which are easily digestible. The first three chapters can be covered quickly, whereas the final three chapters can be covered more slowly. A suggested schedule, which allows for the possibility of two midterm exams, either in-class or take-home, is: Chapter 1 (4 classes), Chapter 2 (3 classes), Chapter 3 (4 classes), Chapter 4 (7–8 classes), Chapter 5 (10 classes), and Chapter 6 (12–13 classes).

Proofs are not part of the main narrative in Chapters 1–5 and can be assigned as reading only. In contrast, some proofs in Chapter 6 are an essential part of the narrative and should be covered as appropriate.

The focal points of the book are the case studies, which should be covered at the appropriate points in each chapter, and the mini-projects, which should form a regular part of the weekly assignments. Any other extended exercise that explores an application could be substituted for a case study or mini-project.

Although most exercises can be completed by hand, the use of technology such as Desmos, Mathematica, and Matlab can be very helpful and should be encouraged. All exercises have been checked using such technology.

A number of additional topics could be pursued to supplement the material presented here, or be considered for independent reading for students. The concepts of Lyapunov functions, Hopf bifurcations, Poincaré maps, and chaos would be natural supplements for Chapter 4. Other perturbation methods of the WKB, averaging, and homogenization types, and other classes of singularly perturbed differential equations, such as those with interior layers and turning points, would be natural supplements for Chapter 5. A treatment of sufficient conditions, such as those based on convexity or conjugate points, along with issues of regularity, and elementary optimal control theory, would be natural supplements for Chapter 6.

Case studies and mini-projects

Case studies
Period of a pendulum
Evolution of a chemical reaction
Bifurcation events in insect populations
Outbreak condition for spread of an illness
Global phase diagram for a rigid body
Air resistance effects in ballistic targeting
Shape of a meniscus in a liquid-gas interface
Optimal design of a slide
Optimal steering control of a boat
Optimal acceleration control of a car
Shape of a hanging chain

Mini-projects
Period vs initial angle curve for a pendulum
Asymmetry of ascent and descent of a projectile
Bifurcation events in plant populations
Love-hate dynamics in relationships
Limit cycles in a biochemical process of glycolysis
Unstable spinning motions of a rigid body
Aiming angles and trajectories in ballistic targeting
Relativistic effects in the orbit of Mercury
Shape of meniscus curve in a liquid-gas interface
Soap films and minimal surfaces of revolution
Optimal paths in a boat steering problem
Min/maximizing shapes in a hanging chain problem

Chapter 1

Dimensional analysis

Mathematical models are equations that express relationships between given quantities of interest. The equations may be of any type, and the quantities may be of any type, either variable or constant. In this chapter, we outline various results about the units and dimensions of quantities, which can lead to insights, and point the way towards simpler, more concise forms of any mathematical model.

1.1. Units and dimensions

Throughout our developments we consider equations involving real-valued quantities expressed in given units of some given dimension. By a unit for a quantity we mean a scale for its measurement, such as a foot, hour, or gram. By the dimension of a quantity we mean its intrinsic type, such as length, time, or mass. Whereas a unit for a quantity can be chosen arbitrarily, the dimension of a quantity is a characteristic property that is fixed.

Not all quantities have a dimension of their own. Indeed, by virtue of their definition, the dimensions of some quantities can be expressed as combinations of others. Thus only a basic set or basis of dimensions is required to describe a collection of quantities. For example, a standard dimensional basis for quantities arising in simple physical systems is

\[ \{\text{length } (L),\ \text{time } (T),\ \text{mass } (M),\ \text{temperature } (\Theta)\}. \tag{1.1} \]

Different bases could be considered depending on the context. In systems for which forces are important but not masses, the basis could include the dimension of force instead of mass. A similar change could be made if energies were important but not masses. In systems that include electrical quantities, the basis would be enlarged to include the dimension of electric current. As a different example, to describe quantities arising in a simple ecological system, a dimensional basis might consist of

\[ \{\text{carnivore } (C),\ \text{herbivore } (H),\ \text{plant } (P),\ \text{insect } (I),\ \text{time } (T)\}. \tag{1.2} \]

To any dimensional basis we associate a corresponding choice of units. These units may have some standard size, or any other arbitrary, nontrivial size, and they may have some standard name, or any other arbitrary name for convenience. For example, for the dimensional basis in (1.1), one choice of units is {meter, second, kilogram, kelvin}. For the dimensional basis in (1.2), one choice of units could be {herd, flock, field, swarm, month}, where, for example, 1 herd may be defined as 20 carnivores, 1 flock may be defined as 12 herbivores, and so on. Thus we will consider real-valued quantities, with values specified in a given choice of units, in a given dimensional basis. The following notation will be used throughout.

Definition 1.1.1. Let $q \in \mathbb{R}$ be a quantity specified in units $\{U_1, \ldots, U_m\}$ in a dimensional basis $\{D_1, \ldots, D_m\}$ for some $m \ge 1$. By $[q]$ we mean the dimension of $q$ expressed as a product of powers of the basis elements, namely

\[ [q] = D_1^{a_1} D_2^{a_2} \cdots D_m^{a_m}. \tag{1.3} \]

The numbers $a_1, \ldots, a_m$ are called the dimensional exponents of $q$ in the given basis. The array of exponents is denoted by $\Delta_q = (a_1, \ldots, a_m) \in \mathbb{R}^m$.

Example 1.1.1. Let $p = 3\ \frac{\text{kg}\cdot\text{m}^2}{\text{s}^2}$, $g = 9.8\ \frac{\text{m}}{\text{s}^2}$, and $q = 100\ \frac{\text{kelvin}}{\text{s}}$. These quantities are expressed in units {meter, second, kilogram, kelvin} in the dimensional basis $\{L, T, M, \Theta\}$. The dimensions and corresponding dimensional exponents for $p$, $g$ and $q$ in this basis are

\[
\begin{aligned}
[p] &= ML^2/T^2 = L^2 T^{-2} M \Theta^0, & \Delta_p &= (2, -2, 1, 0), \\
[g] &= L/T^2 = L T^{-2} M^0 \Theta^0, & \Delta_g &= (1, -2, 0, 0), \\
[q] &= \Theta/T = L^0 T^{-1} M^0 \Theta, & \Delta_q &= (0, -1, 0, 1).
\end{aligned}
\tag{1.4}
\]

Recall that, in the notation $g = 9.8\ \frac{\text{m}}{\text{s}^2}$, the number 9.8 is the numerical value of the quantity, and the tag $\frac{\text{m}}{\text{s}^2}$ is an explicit reminder of the units for the quantity. When we say that one quantity is a function of another, we mean that a relation exists between their numerical values, with respect to a given choice of units, in a given dimensional basis. Thus when we write $p = f(q)$, we mean that the numerical value of $p$ is completely determined by the numerical value of $q$. The function $f$ is simply a map from one real value to another, and may be defined by a formula or graph in the usual way.
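As an informal illustration that is not part of the original text, the dimensional exponents of Example 1.1.1 can be stored as plain tuples ordered as in the basis {L, T, M, Θ}. The helper name `format_dimension` is hypothetical; the sketch only rebuilds the product of powers in (1.3) from a given exponent array.

```python
# Basis symbols, ordered as in the dimensional basis {L, T, M, Θ} of (1.1).
BASIS = ("L", "T", "M", "Θ")


def format_dimension(exponents, basis=BASIS):
    """Return [q] as a product of powers of the basis elements, as in (1.3)."""
    factors = []
    for symbol, a in zip(basis, exponents):
        if a == 0:
            continue  # a zero exponent contributes a factor of 1
        factors.append(symbol if a == 1 else f"{symbol}^{a}")
    return " ".join(factors) if factors else "1"


# Dimensional exponents taken from Example 1.1.1.
delta_p = (2, -2, 1, 0)   # p = 3 kg·m²/s²
delta_g = (1, -2, 0, 0)   # g = 9.8 m/s²
delta_q = (0, -1, 0, 1)   # q = 100 kelvin/s

for name, delta in (("p", delta_p), ("g", delta_g), ("q", delta_q)):
    print(f"[{name}] = {format_dimension(delta)}")
```

Storing only the exponent array keeps the dimension separate from the numerical value, which mirrors the distinction drawn above between the number 9.8 and its unit tag.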

1.2. Axioms of dimensions

We adopt the basic axioms that addition and subtraction are dimensionally meaningful only for quantities of the same dimension, whereas multiplication and division are meaningful for quantities of arbitrary dimension. To state these axioms in a more precise way, let $p, q, r, s \in \mathbb{R}$ be quantities with given units in a given dimensional basis.

The basic axiom on addition and subtraction reflects the idea that only quantities of the same dimension can be added and subtracted in a dimensionally meaningful way. Thus the statement $r = p \pm q$ has a dimensional meaning only when $p$ and $q$, and hence $r$, have the same dimension. For instance, “1 meter + 2 meter” is a meaningful statement, whereas “1 meter + 2 second” is not.

The basic axiom on multiplication and division reflects the idea that quantities of any dimension can be multiplied and divided; indeed, this is how more complicated dimensions are derived from elementary ones. Thus the statements $r = pq$ and $s = p/q$ ($q \ne 0$) have a dimensional meaning for all $p$ and $q$, and the dimensions of the results $r$ and $s$ are well defined in each case. Moreover, this axiom can be extended to arbitrary powers, integration, and differentiation.

Axiom 1.2.1. Let $p, q \in \mathbb{R}$ be quantities specified in units $\{U_1, \ldots, U_m\}$, in a dimensional basis $\{D_1, \ldots, D_m\}$, with dimensions $[p]$, $[q]$. Then
(1) $[p \pm q]$ is defined if and only if $[p] = [q]$,
(2) $[pq] = [p][q]$ for all $p, q$,
(3) $[p/q] = [p]/[q]$ for all $p, q$ with $q \ne 0$,
(4) $[q^\alpha] = [q]^\alpha$ for all $q > 0$ and real $\alpha$,
(5) $[\int p \, dq] = [p][q]$ for any integrable function $p = f(q)$,
(6) $[dp/dq] = [p]/[q]$ for any differentiable function $p = f(q)$.

In property (4) the condition $q > 0$ ensures that $q^\alpha$ is defined for any power $\alpha$. While it would suffice to only consider rational powers, we assume that the property holds for all real powers. The content of properties (2)–(4) can be translated to the dimensional exponents $\Delta_p$ and $\Delta_q$ in a straightforward way, namely

\[ \Delta_{pq} = \Delta_p + \Delta_q, \qquad \Delta_{p/q} = \Delta_p - \Delta_q, \qquad \Delta_{q^\alpha} = \alpha \Delta_q. \tag{1.5} \]
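The exponent rules in (1.5) amount to componentwise arithmetic on exponent arrays. The following is a minimal sketch under that reading, not code from the text; the function names `dim_mul`, `dim_div`, `dim_pow` and `dim_add` are hypothetical, and the exponent arrays are those of $p$ and $g$ from Example 1.1.1.

```python
from fractions import Fraction


def dim_mul(dp, dq):
    """Exponents of a product pq: the sum of the exponent arrays."""
    return tuple(a + b for a, b in zip(dp, dq))


def dim_div(dp, dq):
    """Exponents of a quotient p/q: the difference of the exponent arrays."""
    return tuple(a - b for a, b in zip(dp, dq))


def dim_pow(dq, alpha):
    """Exponents of a power q**alpha: alpha times the exponent array."""
    return tuple(alpha * a for a in dq)


def dim_add(dp, dq):
    """Exponents of p ± q, defined only when both dimensions agree (property (1))."""
    if dp != dq:
        raise ValueError("addition requires quantities of the same dimension")
    return dp


# Exponent arrays in the basis {L, T, M, Θ}, taken from Example 1.1.1.
delta_p = (2, -2, 1, 0)
delta_g = (1, -2, 0, 0)

print(dim_mul(delta_p, delta_g))          # exponents of the product p g
print(dim_div(delta_p, delta_g))          # exponents of the quotient p/g
print(dim_pow(delta_g, Fraction(1, 2)))   # exponents of g**(1/2)
```

Raising an error in `dim_add` is one way to encode the requirement that sums of unlike dimensions are not defined; using `Fraction` for the power keeps rational exponents exact.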

1.3. Dimensionless quantities

The concept of a quantity with no dimension as defined next will play an important role throughout our developments. We note that such quantities can arise when considering combinations of other quantities, and can also arise naturally in other ways.

Definition 1.3.1. A quantity $q \in \mathbb{R}$ is called dimensionless if its dimensional expression is $[q] = 1$, or equivalently its array of dimensional exponents is $\Delta_q = 0$, in any units in any dimensional basis.

Example 1.3.1.

(1) Let $q = ab/c$, where $a = 4\ \frac{\text{ft}}{\text{hour}}$, $b = 3\ \frac{1}{\text{hour}}$, $c = 2\ \frac{\text{ft}}{\text{hour}^2}$. Considering dimensions we have $[a] = LT^{-1}$, $[b] = T^{-1}$, and $[c] = LT^{-2}$, and we find that $[q] = [a][b]/[c] = 1$. Thus $q$ is a dimensionless quantity; its value is $q = 6$.

(2) Let $a$ be an arbitrary quantity with dimension $[a]$, and let $b = a + a$ and $c = a \cdot a \cdot a$. Then it is natural to rewrite these quantities as $b = 2a$ and $c = a^3$. In these latter expressions, we note that the coefficient 2 and exponent 3 are dimensionless; they are purely mathematical entities called pure numbers. The dimensions of $b$ and $c$ are $[b] = [2a] = [a]$ and $[c] = [a^3] = [a]^3$.

(3) Let $\theta$ be an arbitrary angle, which when inscribed in a circle of radius $r$ subtends an arc of length $\ell$. Then, in the radian unit of measurement, we have $\theta = \ell/r$ and we find $[\theta] = 1$. Hence angles and the radian unit of measurement are dimensionless. Similarly, since it only differs in size, the degree unit of measurement is dimensionless.

(4) Any ratio of two quantities of the same dimension is dimensionless. The value of such a ratio can be expressed as a pure number, or in terms of any arbitrary dimensionless unit such as a percentage or parts-per-hundred.

1.4. Change of units

Here we outline the effect of a change of units on an arbitrary quantity. For our purposes it will be sufficient to only consider changes in the dimensional units associated with a given dimensional basis, with any dimensionless units held fixed. We assume that any two units of the same dimensional type are related by a multiplicative conversion factor as introduced below. To state the result, we consider an arbitrary quantity $q \in \mathbb{R}$, expressed in units $\{U_1, \ldots, U_m\}$, in a dimensional basis $\{D_1, \ldots, D_m\}$, with dimensional exponents $\Delta_q = (a_1, \ldots, a_m) \in \mathbb{R}^m$.

Result 1.4.1. If units $\{U_1, \ldots, U_m\}$ are changed to $\{\tilde{U}_1, \ldots, \tilde{U}_m\}$, then the quantity $q$ is changed to $\tilde{q}$, where

\[ \tilde{q} = q \lambda_1^{a_1} \lambda_2^{a_2} \cdots \lambda_m^{a_m}. \tag{1.6} \]

Here $\lambda_i > 0$ ($i = 1, \ldots, m$) are unit-conversion factors; each factor $\lambda_i$ quantifies the number of units of $\tilde{U}_i$ per unit of $U_i$.

The above result follows from straightforward algebra and the axioms on dimensions regarding multiplication and division. Note that if $q$ is dimensionless, then $\Delta_q = (0, \ldots, 0)$, and we obtain $\tilde{q} = q$. Thus dimensionless quantities are not affected by a change of dimensional units.

Example 1.4.1. Let $g = 9.8\ \frac{\text{m}}{\text{s}^2}$. This quantity is expressed in units {m, s} in the dimensional basis $\{L, T\}$. Since $[g] = LT^{-2}$, its dimensional exponents are $\Delta_g = (a_1, a_2) = (1, -2)$. If the units are changed to {km, min}, then the unit-conversion factors are

\[ \lambda_1 = \frac{1\ \text{km}}{1000\ \text{m}}, \qquad \lambda_2 = \frac{1\ \text{min}}{60\ \text{s}}. \tag{1.7} \]

In the new units we have

\[ \tilde{g} = g \lambda_1^{a_1} \lambda_2^{a_2} = \left(9.8\ \frac{\text{m}}{\text{s}^2}\right) \left(\frac{1\ \text{km}}{1000\ \text{m}}\right) \left(\frac{1\ \text{min}}{60\ \text{s}}\right)^{-2} = 35.28\ \frac{\text{km}}{\text{min}^2}. \tag{1.8} \]
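The conversion in Example 1.4.1 can be checked mechanically. The sketch below is only an illustration of formula (1.6); the function name `change_units` is hypothetical, and exact rational arithmetic is an implementation choice made here to avoid rounding.

```python
from fractions import Fraction


def change_units(value, exponents, factors):
    """Apply (1.6): multiply the value by each conversion factor raised to its exponent."""
    result = value
    for lam, a in zip(factors, exponents):
        result *= lam ** a
    return result


g = Fraction(98, 10)                          # 9.8 in units {m, s}
delta_g = (1, -2)                             # exponents of g in the basis {L, T}
lam = (Fraction(1, 1000), Fraction(1, 60))    # km per m, min per s

g_new = change_units(g, delta_g, lam)
print(float(g_new))                           # value of g in units {km, min}

# A dimensionless quantity has zero exponents and is unchanged.
print(change_units(6, (0, 0), lam))
```

Note that the factor for time enters with exponent −2, so converting seconds to minutes multiplies the value by 60² rather than dividing by it.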

1.5. Unit-free equations

In the modeling of various types of systems, we will usually consider a set of real-valued quantities $q_1, \ldots, q_n$, specified in units $\{U_1, \ldots, U_m\}$, in a dimensional basis $\{D_1, \ldots, D_m\}$, for some $n \ge 2$ and $m \ge 1$. We will often seek to construct and study equations of the form

\[ q_1 = f(q_2, \ldots, q_n), \tag{1.9} \]

where $f : \mathbb{R}^{n-1} \to \mathbb{R}$ is some function. The function notation above indicates that the numerical value of $q_1$ is completely determined by the numerical values of $q_2, \ldots, q_n$ in the given units. In our pursuits, we will only consider equations that are unit-free as defined next.

Definition 1.5.1. An equation $q_1 = f(q_2, \ldots, q_n)$ is called unit-free if it transforms into

\[ \tilde{q}_1 = f(\tilde{q}_2, \ldots, \tilde{q}_n) \tag{1.10} \]

under an arbitrary change of units on arbitrary values of $q_1, \ldots, q_n$.

The key point of a unit-free equation is that the function $f$ is unaffected by the choice of units. All the equations that we consider will be unit-free in this sense. Note that, without this property, the function $f$ may change whenever the units are changed, and the equation would have limited value as a model. Indeed, it would be tedious to document each different version of the equation for each different choice of units. Thus a unit-free equation can be viewed as a well designed equation.

Model equations derived from fundamental physical laws are naturally unit-free; they inherit this property from the laws on which they are based. In contrast, empirical equations derived from curve fitting procedures are not naturally unit-free, but can always be re-designed to have this property by introducing appropriate dimensional constants. In the most basic sense, a unit-free equation can be viewed as a dimensionally meaningful equation, consistent with the axioms of dimensions, and this property can always be achieved.

Example 1.5.1. Let $x$, $t$ and $g$ be specified in units {m, s}, in the dimensional basis $\{L, T\}$, with dimensions $[x] = L$, $[t] = T$ and $[g] = LT^{-2}$, and exponents $\Delta_x = (1, 0)$, $\Delta_t = (0, 1)$ and $\Delta_g = (1, -2)$. Suppose that the value of $x$ is determined by the values of $t$ and $g$ through the equation

\[ x = f(t, g) = \tfrac{1}{2} g t^2. \tag{1.11} \]

(Unless mentioned otherwise, unnamed quantities such as the factor $\tfrac{1}{2}$ and exponent 2 can be interpreted as pure numbers.) To determine if the above equation is unit-free, we consider a change of units from {m, s} to arbitrary units $\{\tilde{U}_1, \tilde{U}_2\}$, defined by arbitrary conversion factors $\lambda_1$, $\lambda_2$. In the new units, the values of $x$, $t$ and $g$ become

\[ \tilde{x} = x \lambda_1, \qquad \tilde{t} = t \lambda_2, \qquad \tilde{g} = g \lambda_1 \lambda_2^{-2}. \tag{1.12} \]

Substitution of these expressions into $x = \tfrac{1}{2} g t^2$ gives

\[ (\tilde{x} \lambda_1^{-1}) = \tfrac{1}{2} (\tilde{g} \lambda_1^{-1} \lambda_2^{2}) (\tilde{t} \lambda_2^{-1})^2. \tag{1.13} \]

In the above, all factors with $\lambda_1$ and $\lambda_2$ cancel, and we get

\[ \tilde{x} = \tfrac{1}{2} \tilde{g} \tilde{t}^2. \tag{1.14} \]

Thus (1.11) is unit-free since it has exactly the same form in any choice of units. The original equation $x = f(t, g)$ is transformed into $\tilde{x} = f(\tilde{t}, \tilde{g})$, with the same function $f$.
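The unit-free property of (1.11) can also be checked numerically. The sketch below is an illustration rather than part of the text: the helper `convert` is a hypothetical implementation of the change-of-units formula (1.6), and the sample values and conversion factors are arbitrary choices.

```python
import math


def f(t, g):
    """Right-hand side of (1.11): x = (1/2) g t^2."""
    return 0.5 * g * t**2


def convert(value, exponents, factors):
    """Change of units as in (1.6): value times factor**exponent for each basis element."""
    for lam, a in zip(factors, exponents):
        value *= lam ** a
    return value


# Exponent arrays in the basis {L, T}, from Example 1.5.1.
DELTA_X, DELTA_T, DELTA_G = (1, 0), (0, 1), (1, -2)

t, g = 3.0, 9.8          # sample values in units {m, s}
x = f(t, g)

# Two illustrative sets of conversion factors: {m, s} -> {km, min}, and an arbitrary pair.
for lam in ((1 / 1000, 1 / 60), (3.7, 0.25)):
    x_new = convert(x, DELTA_X, lam)
    t_new = convert(t, DELTA_T, lam)
    g_new = convert(g, DELTA_G, lam)
    # Unit-free: the same function f relates the converted values.
    print(math.isclose(x_new, f(t_new, g_new)))
```

A numerical check of this kind is not a proof, since it samples only a few values, but it is a quick way to catch an equation whose conversion factors fail to cancel.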

Example 1.5.2. Let $x$, $t$ and $g$ be as before, and let $c$ be an additional quantity, say a constant, with $[c] = T$ and $\Delta_c = (0, 1)$. For purposes of comparison, consider the two different equations

\[ x = \tfrac{1}{2} g t^2 e^{-t}, \qquad x = \tfrac{1}{2} g t^2 e^{-t/c}. \tag{1.15} \]

(Here $e^q = \exp(q)$ is the natural exponential function; the base $e$ can be interpreted as a pure number.) Considering an arbitrary change of units as above, we get

\[ \tilde{x} = \tfrac{1}{2} \tilde{g} \tilde{t}^2 e^{-\tilde{t} \lambda_2^{-1}}, \qquad \tilde{x} = \tfrac{1}{2} \tilde{g} \tilde{t}^2 e^{-\tilde{t}/\tilde{c}}. \tag{1.16} \]

The first equation is not unit-free since it changes form: a unit-conversion factor remains in the equation and does not cancel out. In contrast, the second equation is unit-free since all the unit-conversion factors cancel. Note how the first equation becomes unit-free by introduction of the constant $c$. The equations in (1.15), written in units {m, s}, would be numerically the same when $c = 1\ \text{s}$. However, the second equation is advantageous since it would have exactly the same form in any units.

Example 1.5.3. Let $v$ and $t$ be quantities specified in units {lb, hr}, in the dimensional basis $\{M, T\}$, with dimensions $[v] = M/T$ and $[t] = T$. Suppose that the value of $v$ is determined by the value of $t$ through an empirical equation

\[ v = 3.7 t^2 - \sin(5.4 t). \tag{1.17} \]

Here we rewrite this equation in a unit-free form. To begin, we introduce constants $a$, $b$, $c$ with values 3.7, 1, 5.4 in units {lb, hr} and consider

\[ v = a t^2 - b \sin(c t). \tag{1.18} \]

We next determine the dimensions of these constants to make the equation unit-free. Accordingly, let $\Delta_a = (\alpha_1, \alpha_2)$, $\Delta_b = (\beta_1, \beta_2)$ and $\Delta_c = (\gamma_1, \gamma_2)$ be the unknown dimensional exponents. Under an arbitrary change of units with conversion factors $\lambda_1$, $\lambda_2$, using the fact that $\Delta_v = (1, -1)$ and $\Delta_t = (0, 1)$, we get, after dividing out the conversion factors from the left side of the equation,

\[ \tilde{v} = \lambda_1^{1-\alpha_1} \lambda_2^{-3-\alpha_2} \, \tilde{a} \tilde{t}^2 - \lambda_1^{1-\beta_1} \lambda_2^{-1-\beta_2} \, \tilde{b} \sin\!\left( \lambda_1^{-\gamma_1} \lambda_2^{-1-\gamma_2} \, \tilde{c} \tilde{t} \right). \tag{1.19} \]

Note that the unit-free condition will be satisfied when the exponents of all the conversion factors in the above expression are zero, which requires $\Delta_a = (1, -3)$, $\Delta_b = (1, -1)$ and $\Delta_c = (0, -1)$. Thus the dimensions of the constants are completely determined, and in units {lb, hr} we have

\[ a = 3.7\ \frac{\text{lb}}{\text{hr}^3}, \qquad b = 1\ \frac{\text{lb}}{\text{hr}}, \qquad c = 5.4\ \frac{1}{\text{hr}}. \tag{1.20} \]

In any other units, the equation would be $\tilde{v} = \tilde{a} \tilde{t}^2 - \tilde{b} \sin(\tilde{c} \tilde{t})$, where $\tilde{a}$, $\tilde{b}$, $\tilde{c}$ are the values of the constants in the new units. In our function notation, the equation in (1.18) would be written as $v = f(t, a, b, c)$.
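The exponents found in Example 1.5.3 can be verified in two ways: by checking that every conversion-factor exponent appearing in (1.19) vanishes, and by a numerical change of units. The sketch below is illustrative only; the helper names and the conversion factors (2.0, 60.0) are arbitrary assumptions, not values from the text.

```python
import math

# Exponent arrays in the basis {M, T} with units {lb, hr}, from Example 1.5.3.
DELTA_V = (1, -1)
DELTA_T = (0, 1)


def leftover_exponents(delta_a, delta_b, delta_c):
    """Exponents of (lambda_1, lambda_2) multiplying each part of (1.19)."""
    (a1, a2), (b1, b2), (c1, c2) = delta_a, delta_b, delta_c
    return {
        "a t^2 term": (1 - a1, -3 - a2),
        "b sin term": (1 - b1, -1 - b2),
        "sin argument": (-c1, -1 - c2),
    }


def convert(value, exponents, factors):
    """Change of units as in (1.6)."""
    for lam, a in zip(factors, exponents):
        value *= lam ** a
    return value


def model(t, a, b, c):
    """Right-hand side of (1.18)."""
    return a * t**2 - b * math.sin(c * t)


delta_a, delta_b, delta_c = (1, -3), (1, -1), (0, -1)
print(leftover_exponents(delta_a, delta_b, delta_c))   # all pairs should be (0, 0)

lam = (2.0, 60.0)                 # arbitrary illustrative conversion factors
a, b, c, t = 3.7, 1.0, 5.4, 0.5   # values in units {lb, hr}; t is a sample time
v_new = convert(model(t, a, b, c), DELTA_V, lam)
rhs_new = model(convert(t, DELTA_T, lam), convert(a, delta_a, lam),
                convert(b, delta_b, lam), convert(c, delta_c, lam))
print(math.isclose(v_new, rhs_new))
```

If the constants were instead treated as pure numbers, with all exponents zero, the leftover exponents would be nonzero and the numerical check would generally fail, which is the sense in which the raw empirical equation (1.17) is tied to the units {lb, hr}.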

1.6. Buckingham $\pi$-theorem

Here we outline a classic result known as the Buckingham $\pi$-theorem. It states that, for any unit-free equation $q_1 = f(q_2, \ldots, q_n)$, the function $f$ cannot depend on $q_2, \ldots, q_n$ in a completely arbitrary way; it can only depend on certain dimensionless combinations. For simplicity we state the result only for positive values of $q_1, \ldots, q_n$. Similar results hold for nonpositive values, but at the expense of more complicated statements.

Definition 1.6.1. By a power product of $q_1, \ldots, q_n > 0$ we mean a quantity $\pi > 0$ of the form

\[ \pi = q_1^{b_1} \cdots q_n^{b_n}, \tag{1.21} \]

for some powers $b_1, \ldots, b_n \in \mathbb{R}$. We say that $\pi$ includes $q_i$ if $b_i \ne 0$.

The condition that each $q_i$ be positive ensures that $\pi$ is well defined for arbitrary powers. In any dimensional basis $\{D_1, \ldots, D_m\}$, we note that each quantity $q_i$ will have dimensional exponents $\Delta_{q_i} \in \mathbb{R}^m$, and the power product $\pi$ will have dimensional exponents $\Delta_\pi \in \mathbb{R}^m$. From the definition in (1.21), together with the properties that $\Delta_{pq} = \Delta_p + \Delta_q$ and $\Delta_{q^\alpha} = \alpha \Delta_q$ given in (1.5), we deduce that

\[ \Delta_\pi = b_1 \Delta_{q_1} + \cdots + b_n \Delta_{q_n} = A v, \tag{1.22} \]

where $A = (\Delta_{q_1}, \Delta_{q_2}, \ldots, \Delta_{q_n}) \in \mathbb{R}^{m \times n}$ and $v = (b_1, \ldots, b_n) \in \mathbb{R}^n$. Here all one-dimensional arrays are considered as columns, and we assume $n \ge 2$ and $m \ge 1$ with $n \ge m$. Given $q_1, \ldots, q_n$ we will be interested in forming power products $\pi$ that are dimensionless. In this respect, we note that

\[ \pi \text{ dimensionless} \quad \Leftrightarrow \quad \Delta_\pi = 0 \quad \Leftrightarrow \quad A v = 0. \tag{1.23} \]

Furthermore, we will only be interested in nontrivial power products 𝜋 ≢ 1, which correspond to 𝑣 ≠ 0. The following result, which essentially is a definition, characterizes the dimensionless power products that we seek.

Result 1.6.1. If 𝐴𝑣 = 0 has a total of 𝑘 independent solutions $v^1, \dots, v^k$, then a total of 𝑘 independent dimensionless power products $\pi_1, \dots, \pi_k$ can be formed. Any such $\pi_1, \dots, \pi_k$ is called a full set. This set is further called normalized if $\pi_1$ includes $q_1$ (with power $b_1 = 1$), and $\pi_2, \dots, \pi_k$ do not include $q_1$.

Recall that, through the usual process of row reduction, any nontrivial solution of 𝐴𝑣 = 0 will be expressed in terms of certain free variables. If there are 𝑘 free variables, then there are 𝑘 independent solutions 𝑣, and hence 𝑘 independent dimensionless power products 𝜋. While any independent choices of the free variables can be made to form a full set of solutions, a deliberate choice of these variables is required to form a normalized set. Specifically, the normalization condition requires that the first solution have $b_1 = 1$, and any other solutions have $b_1 = 0$.

Example 1.6.1. Let 𝑥, 𝑡, 𝑔, ℎ, 𝑚 > 0 be quantities with dimensions [𝑥] = 𝐿, [𝑡] = 𝑇, [𝑔] = $LT^{-2}$, [ℎ] = $LT^{-3}$ and [𝑚] = 𝑀. A dimensional basis is {𝐿, 𝑇, 𝑀}, and the dimensional exponent matrix in this basis is

(1.24)  \[ A = (\Delta_x, \Delta_t, \Delta_g, \Delta_h, \Delta_m) = \begin{pmatrix} 1 & 0 & 1 & 1 & 0 \\ 0 & 1 & -2 & -3 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix}. \]

An arbitrary power product has the form $\pi = x^{b_1} t^{b_2} g^{b_3} h^{b_4} m^{b_5}$. The equation 𝐴𝑣 = 0, where $v = (b_1, \dots, b_5)$, has two free variables, and the general solution is

(1.25)  \[ b_1 = -b_3 - b_4, \qquad b_2 = 2b_3 + 3b_4, \qquad b_5 = 0, \qquad b_3, b_4 \text{ free}. \]

Since there are two free variables, there are two independent solutions. For one solution we choose $b_3 = -1$, $b_4 = 0$, which gives $v^1 = (1, -2, -1, 0, 0)$, and hence $\pi_1 = x/(g t^2)$. For a second solution we choose $b_3 = -1$, $b_4 = 1$, which gives $v^2 = (0, 1, -1, 1, 0)$, and hence $\pi_2 = t h/g$. By choice, we arranged for $\pi_1$ to include 𝑥 with an exponent of unity, and for $\pi_2$ to exclude 𝑥. Hence $\pi_1, \pi_2$ is a full set of dimensionless power products for 𝑥, 𝑡, 𝑔, ℎ, 𝑚, and this set is normalized with respect to 𝑥. Note that 𝑚 will not be included in any dimensionless power product.

The next result shows that any unit-free equation can only depend on dimensionless power products. No assumptions on the form or continuity properties of the function 𝑓 are required. For simplicity, the results are stated only for positive quantities; similar results can be derived to account for negative and zero quantities.

Result 1.6.2. [𝜋-theorem] Let $\pi_1, \dots, \pi_k > 0$ be a full set of dimensionless power products for $q_1, \dots, q_n > 0$ where 𝑘 ≥ 1 and 𝑛 ≥ 2. If the set $\pi_1, \dots, \pi_k$ is normalized, then any unit-free equation $q_1 = f(q_2, \dots, q_n)$ for some function 𝑓, is equivalent to an equation

(1.26)  \[ \pi_1 = \phi(\pi_2, \dots, \pi_k) \]

for some function 𝜙. In the case that 𝑘 = 1, the function 𝜙 reduces to some constant 𝐶.

The normalization condition ensures that the reduced equation (1.26) is explicit, just as the original equation, in terms of $q_1$. We remark that a more general form of the theorem states that any unit-free equation in the general implicit form $F(q_1, \dots, q_n) = 0$ is equivalent to $\Phi(\pi_1, \dots, \pi_k) = 0$, without any normalization condition on the power products. Also, if the only dimensionless power product for quantities $q_1, \dots, q_n$ is trivial, then the only unit-free relation among these quantities is trivial; in this case, the set $q_1, \dots, q_n$ would need to be enlarged in order for a nontrivial unit-free relation to exist. The proof of the theorem is based on a change of variable argument that exploits the unit-free condition and the definition of the power products, which will be outlined after some examples.

Example 1.6.2. Let 𝑥, 𝑡, 𝑔, ℎ, 𝑚 > 0 be quantities as in the previous example. A normalized set of power products for these quantities is $\pi_1 = x/(g t^2)$ and $\pi_2 = t h/g$. Thus any unit-free equation of the form 𝑥 = 𝑓(𝑡, 𝑔, ℎ, 𝑚) must be equivalent to an equation of the form

(1.27)  \[ \pi_1 = \phi(\pi_2) \quad\text{or}\quad \frac{x}{g t^2} = \phi\!\left(\frac{t h}{g}\right), \]

which can be rearranged to yield

(1.28)  \[ x = g t^2\, \phi\!\left(\frac{t h}{g}\right). \]

Thus 𝑥 cannot depend on 𝑚, and must depend on 𝑡, 𝑔 and ℎ in a specific way. If 𝜙(0) is defined, then the special case when ℎ has a fixed value of zero can be considered, and the relation becomes $x = \beta g t^2$, where 𝛽 = 𝜙(0) is a dimensionless constant.

Example 1.6.3. Here we explicitly find the reduced form of the unit-free equation

(1.29)  \[ u = f(x, t, \alpha, \beta) = \frac{\alpha x}{t}\, e^{-x^2/(\beta t^2)}, \]

where [𝑢] = 𝛩, [𝑥] = 𝐿, [𝑡] = 𝑇, [𝛼] = 𝛩𝑇/𝐿, and [𝛽] = $L^2/T^2$. A dimensional basis is {𝛩, 𝐿, 𝑇}, and the dimensional exponent matrix is

(1.30)  \[ A = (\Delta_u, \Delta_x, \Delta_t, \Delta_\alpha, \Delta_\beta) = \begin{pmatrix} 1 & 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & -1 & 2 \\ 0 & 0 & 1 & 1 & -2 \end{pmatrix}. \]

An arbitrary power product is $\pi = u^{b_1} x^{b_2} t^{b_3} \alpha^{b_4} \beta^{b_5}$. The equation 𝐴𝑣 = 0, where $v = (b_1, \dots, b_5)$, has two free variables. By choosing these variables in a similar way as before, we obtain the full, normalized set $\pi_1 = u/(\alpha\sqrt{\beta})$ and $\pi_2 = x/(t\sqrt{\beta})$. By the 𝜋-theorem, the original equation 𝑢 = 𝑓(𝑥, 𝑡, 𝛼, 𝛽) must be equivalent to $\pi_1 = \phi(\pi_2)$ for some function 𝜙. Here this result can be verified directly due to the explicit form of the original equation. Specifically, dividing the equation by $\alpha\sqrt{\beta}$, and then substituting, we obtain

(1.31)  \[ u = \frac{\alpha x}{t}\, e^{-x^2/(\beta t^2)} \quad\Longleftrightarrow\quad \frac{u}{\alpha\sqrt{\beta}} = \frac{x}{t\sqrt{\beta}}\, e^{-x^2/(\beta t^2)} \quad\Longleftrightarrow\quad \pi_1 = \pi_2\, e^{-\pi_2^2}. \]

Example 1.6.4. A simple theory of sound waves in a gas proposes that the speed of sound 𝑣 > 0 should depend on only the mass density 𝜌 > 0, pressure 𝑝 > 0, and viscosity 𝜇 > 0 so that

(1.32)  \[ v = f(\rho, p, \mu), \]

for some function 𝑓. Here we use the 𝜋-theorem to find an equivalent and possibly simpler form of (1.32) assuming that it is unit-free. This can be viewed as an important first step in exploring any new or proposed relation of interest.

The quantities 𝑣, 𝜌, 𝑝, 𝜇 have dimensions [𝑣] = 𝐿/𝑇, [𝜌] = $M/L^3$, [𝑝] = $M/(LT^2)$ and [𝜇] = 𝑀/(𝐿𝑇). A dimensional basis is {𝐿, 𝑇, 𝑀}, and the dimensional exponent matrix in this basis is

(1.33)  \[ A = (\Delta_v, \Delta_\rho, \Delta_p, \Delta_\mu) = \begin{pmatrix} 1 & -3 & -1 & -1 \\ -1 & 0 & -2 & -1 \\ 0 & 1 & 1 & 1 \end{pmatrix}. \]

An arbitrary power product has the form $\pi = v^{b_1} \rho^{b_2} p^{b_3} \mu^{b_4}$. The equation 𝐴𝑣 = 0, where $v = (b_1, \dots, b_4)$, has one free variable, and the general solution is

(1.34)  \[ b_1 = -2b_3, \qquad b_2 = -b_3, \qquad b_4 = 0, \qquad b_3 \text{ free}. \]

Since there is only one free variable, there is only one independent solution. For this solution we choose $b_3 = -1/2$, which gives $v^1 = (1, 1/2, -1/2, 0)$, and hence $\pi_1 = v\sqrt{\rho/p}$. Hence $\pi_1$ is a full set of independent dimensionless power products for 𝑣, 𝜌, 𝑝, 𝜇, and this set is normalized with respect to 𝑣. By the 𝜋-theorem, the equation in (1.32) must be equivalent to

(1.35)  \[ \pi_1 = C \quad\text{or}\quad v = C \sqrt{\frac{p}{\rho}}, \]

where 𝐶 > 0 is some dimensionless constant. Thus any experimental investigation of sound waves under the given hypothesis should be aimed at (1.35), and the determination of the unknown constant 𝐶. Note that, even though an unknown constant is involved, there is valuable, direct information implied by (1.35). For instance, it implies that the speed of sound must be independent of the viscosity, and would increase with pressure at fixed density, and decrease with density at fixed pressure. Moreover, the speed of sound would remain unchanged when pressure and density are both increased or decreased by the same factor.

Sketch of proof: Result 1.6.2. Let $q_1, \dots, q_n > 0$ and $\pi_1, \dots, \pi_k > 0$ be given, and let $A \in \mathbb{R}^{m \times n}$ be the dimensional exponent matrix whose columns are $\Delta_{q_\sigma} = (a_{1,\sigma}, \dots, a_{m,\sigma}) \in \mathbb{R}^m$, where $\sigma = 1, \dots, n$. Note that, to each dimensionless power product $\pi_\rho$, there is an independent solution $v^\rho = (b_{1,\rho}, \dots, b_{n,\rho}) \in \mathbb{R}^n$ of $Av = 0$, where $\rho = 1, \dots, k$. Thus the row-reduced form of $Av = 0$ has $k$ columns without pivots, which correspond to the free variables, and $n - k$ columns with pivots.

Due to the normalization condition, we have $\pi_1 = q_1\, q_2^{b_{2,1}} \cdots q_n^{b_{n,1}}$, and any remaining power products $\pi_2, \dots, \pi_k$ involve only $q_2, \dots, q_n$. In view of this, we consider $A' \in \mathbb{R}^{m \times (n-1)}$, defined to be the submatrix of $A$ obtained by omitting the first column, and $v' \in \mathbb{R}^{n-1}$, defined to be the subvector of $v$ obtained by omitting the first entry. The assumption that $Av = 0$ has a full set of $k$ independent solutions that satisfy the normalization condition implies that $A'v' = 0$ has precisely $k - 1$ independent solutions, and hence precisely as many free variables. Consequently, the row-reduced form of $A'v' = 0$ has $n - k$ columns with pivots, so that $A'$ has rank $n - k$.

Let $q_1 = f(q_2, \dots, q_n)$ be given and consider an arbitrary change of units that changes $q_1, \dots, q_n$ into $\tilde{q}_1, \dots, \tilde{q}_n$, and note that $\tilde{q}_1 = f(\tilde{q}_2, \dots, \tilde{q}_n)$ by the unit-free assumption. In view of the above expression for $\pi_1$, we introduce the function $F(q_2, \dots, q_n) = f(q_2, \dots, q_n)\, q_2^{b_{2,1}} \cdots q_n^{b_{n,1}}$ so that $\pi_1 = F(q_2, \dots, q_n)$. Similarly, beginning from the analogous expression for $\tilde{\pi}_1$, we find $\tilde{\pi}_1 = F(\tilde{q}_2, \dots, \tilde{q}_n)$. Because it is dimensionless, we have $\pi_1 = \tilde{\pi}_1$, which implies $F(q_2, \dots, q_n) = F(\tilde{q}_2, \dots, \tilde{q}_n)$. Thus the function $F$ is invariant under an arbitrary change of units.

To establish the result of the theorem, we consider different cases depending on the number $k$ of power products. In the case when $k = 1$, we consider the change of unit relations $\tilde{q}_\sigma = q_\sigma\, \lambda_1^{a_{1,\sigma}} \cdots \lambda_m^{a_{m,\sigma}}$ for $\sigma = 2, \dots, n$, where $\lambda_1, \dots, \lambda_m$ are the conversion factors. From this we obtain the log-linear system $\ln(\tilde{q}_\sigma/q_\sigma) = a_{1,\sigma} \ln\lambda_1 + \cdots + a_{m,\sigma} \ln\lambda_m$.


In matrix form, we have $A'^{T} u = g'$, where $u = (\ln\lambda_1, \dots, \ln\lambda_m) \in \mathbb{R}^m$ and $g' = (\ln(\tilde{q}_2/q_2), \dots, \ln(\tilde{q}_n/q_n)) \in \mathbb{R}^{n-1}$. When $k = 1$, the rank of $A'$ and $A'^{T}$ is $n - 1$, and the columns of $A'^{T}$ span $\mathbb{R}^{n-1}$. Thus, for arbitrary old values $q_2, \dots, q_n$, we can always find a change of units to obtain any specified new values $\tilde{q}_2, \dots, \tilde{q}_n$, say a value of one for each. The required change of units can be found by setting $\tilde{q}_2 = 1, \dots, \tilde{q}_n = 1$ in this log-linear system and solving for the conversion factors $\lambda_1, \dots, \lambda_m$. Due to the invariance property of $F$, for arbitrary $q_2, \dots, q_n$ we get $\pi_1 = F(1, \dots, 1) = C$, where $C$ is some fixed constant, which establishes the result for this case.

In the case when $k = n$, the system $A'v' = 0$ has $n - 1$ independent solutions $v'^{\rho} = (b_{2,\rho}, \dots, b_{n,\rho}) \in \mathbb{R}^{n-1}$ for $\rho = 2, \dots, n$. Let $B' \in \mathbb{R}^{(n-1) \times (n-1)}$ be the matrix whose columns are these solutions, and note that it is square and has full rank, and hence is invertible. For this case, we consider the power products $\pi_\rho = q_2^{b_{2,\rho}} \cdots q_n^{b_{n,\rho}}$ for $\rho = 2, \dots, n$, and obtain the log-linear system $\ln\pi_\rho = b_{2,\rho} \ln q_2 + \cdots + b_{n,\rho} \ln q_n$. In matrix form, we have $B'^{T} w' = h'$, where $w' = (\ln q_2, \dots, \ln q_n) \in \mathbb{R}^{n-1}$ and $h' = (\ln\pi_2, \dots, \ln\pi_n) \in \mathbb{R}^{n-1}$. Since $B'^{T}$ is invertible, we find that $q_2, \dots, q_n$ can be uniquely expressed in terms of $\pi_2, \dots, \pi_n$. This implies that $\pi_1 = F(q_2, \dots, q_n) = \phi(\pi_2, \dots, \pi_n)$, for some function $\phi$, which establishes the result for this case.

In the case when $1 < k < n$, the system $A'v' = 0$ has $k - 1$ independent solutions $v'^{\rho} = (b_{2,\rho}, \dots, b_{n,\rho}) \in \mathbb{R}^{n-1}$ for $\rho = 2, \dots, k$, and has rank $n - k$ as noted earlier. Without loss of generality, up to a reordering of $q_2, \dots, q_n$, we may suppose that the pivots in the system $A'v' = 0$ all occur in the leading $n - k$ columns, whereas the free variables all occur in the latter $k - 1$ columns. We now consider the $n - k$ change of unit relations $\tilde{q}_\sigma = q_\sigma\, \lambda_1^{a_{1,\sigma}} \cdots \lambda_m^{a_{m,\sigma}}$ for $\sigma = 2, \dots, n - k + 1$, and again consider the system $\ln(\tilde{q}_\sigma/q_\sigma) = a_{1,\sigma} \ln\lambda_1 + \cdots + a_{m,\sigma} \ln\lambda_m$. Since the dimensional exponent vectors $(a_{1,\sigma}, \dots, a_{m,\sigma})$ are the leading $n - k$ columns of $A'$, they are independent. Thus for arbitrary $q_2, \dots, q_{n-k+1}$ we can find $\lambda_1, \dots, \lambda_m$ to achieve $\tilde{q}_2 = 1, \dots, \tilde{q}_{n-k+1} = 1$.

We next consider the $k - 1$ power products $\pi_\rho = q_2^{b_{2,\rho}} \cdots q_n^{b_{n,\rho}}$ for $\rho = 2, \dots, k$. Since $\pi_\rho = \tilde{\pi}_\rho$ and $\tilde{q}_2 = 1, \dots, \tilde{q}_{n-k+1} = 1$, we get the reduced expressions $\pi_\rho = \tilde{q}_{n-k+2}^{\,b_{n-k+2,\rho}} \cdots \tilde{q}_n^{\,b_{n,\rho}}$, which leads to the system $\ln\pi_\rho = b_{n-k+2,\rho} \ln\tilde{q}_{n-k+2} + \cdots + b_{n,\rho} \ln\tilde{q}_n$. For each $\rho$, we note that $(b_{n-k+2,\rho}, \dots, b_{n,\rho})$ are the $k - 1$ free variables from the system $A'v' = 0$, which were independently chosen to generate the solution set. Thus this log-linear system is square and has full rank, and we find that $\tilde{q}_{n-k+2}, \dots, \tilde{q}_n$ can be uniquely expressed in terms of $\pi_2, \dots, \pi_k$. This implies $\pi_1 = F(\tilde{q}_2, \dots, \tilde{q}_n) = F(1, \dots, 1, \tilde{q}_{n-k+2}, \dots, \tilde{q}_n) = \phi(\pi_2, \dots, \pi_k)$, for some function $\phi$, which establishes the result.

1.7. Case study

Setup. To illustrate the preceding results on dimensional methods, and the process of modeling a simple mechanical system, we study the motion of a pendulum released from rest. Figure 1.1 illustrates the system, which consists of a string of length ℓ, with one end attached to a fixed support point, and the other end attached to a ball of mass 𝑚. We assume the string is always in tension and hence straight, and we let 𝜃 denote the angle between the string and a vertical line through the support point, and arbitrarily take the positive direction to be counter-clockwise. We assume that gravitational acceleration 𝑔 is directed in the downward, vertical direction. When the ball is raised and released from the rest conditions $\theta = \theta_0$ and $d\theta/dt = 0$ at time 𝑡 = 0, the ball will swing back-and-forth in a periodic motion. We seek to understand various aspects of this motion; for example, how the period depends on the parameters 𝑚, 𝑔, ℓ, and $\theta_0$.

Figure 1.1. [Diagram labels: origin 𝑜; axes 𝑥, 𝑦 with unit vectors $\vec{i}$, $\vec{j}$; angle 𝜃; position vector $\vec{r}$; mass 𝑚; gravitational acceleration 𝑔; unit vectors $\vec{e}_r$, $\vec{e}_\theta$; forces $\vec{F}_{\text{string}}$, $\vec{F}_{\text{gravity}}$.]

Outline of model. We assume that the motion occurs in a plane and introduce an origin and 𝑥, 𝑦 coordinates as shown. The standard unit vectors in the positive 𝑥 and 𝑦 directions are denoted by $\vec{i}$ and $\vec{j}$, and the position vector for the ball is denoted by $\vec{r}$. It will be convenient to introduce unit vectors $\vec{e}_r$ and $\vec{e}_\theta$ that are parallel and perpendicular to $\vec{r}$. For any angle 𝜃, the components of these vectors are $\vec{r} = \ell \sin\theta\, \vec{i} + \ell \cos\theta\, \vec{j}$, $\vec{e}_r = \sin\theta\, \vec{i} + \cos\theta\, \vec{j}$, and $\vec{e}_\theta = \cos\theta\, \vec{i} - \sin\theta\, \vec{j}$. By differentiating the position with respect to time, we obtain the velocity and acceleration vectors

(1.36)  \[ \begin{aligned} \frac{d\vec{r}}{dt} &= \ell \cos\theta\, \frac{d\theta}{dt}\, \vec{i} - \ell \sin\theta\, \frac{d\theta}{dt}\, \vec{j}, \\ \frac{d^2\vec{r}}{dt^2} &= \Bigl[ \ell \cos\theta\, \frac{d^2\theta}{dt^2} - \ell \sin\theta \Bigl( \frac{d\theta}{dt} \Bigr)^{2} \Bigr] \vec{i} - \Bigl[ \ell \sin\theta\, \frac{d^2\theta}{dt^2} + \ell \cos\theta \Bigl( \frac{d\theta}{dt} \Bigr)^{2} \Bigr] \vec{j}. \end{aligned} \]

We assume that only two forces act on the ball: one due to gravity, and another due to the pull of the string. Thus we neglect any other forces, such as that due to air resistance. The force of gravity has the form $\vec{F}_{\text{gravity}} = m g\, \vec{j}$, and the force in the string has the form $\vec{F}_{\text{string}} = -\lambda\, \vec{e}_r$, where 𝜆 is an unknown tension, which is nonconstant in general. Note that, although the magnitude of this force is unknown, its direction is known: it is always parallel to $\vec{e}_r$. Newton’s law of motion for the ball requires that the product of its mass and acceleration be equal to the sum of the applied forces, or equivalently,

(1.37)  \[ m \frac{d^2\vec{r}}{dt^2} = \vec{F}_{\text{gravity}} + \vec{F}_{\text{string}}. \]

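To put the above equation in a concise form, and eliminate the unknown magnitude of $\vec{F}_{\text{string}}$, we consider the vector dot-product of the above with the unit vector $\vec{e}_\theta$, namely

(1.38)  \[ m \frac{d^2\vec{r}}{dt^2} \cdot \vec{e}_\theta = \vec{F}_{\text{gravity}} \cdot \vec{e}_\theta + \vec{F}_{\text{string}} \cdot \vec{e}_\theta. \]

By direct calculation, using the component expressions for all vectors involved, and the facts that $\vec{i} \cdot \vec{i} = 1$, $\vec{j} \cdot \vec{j} = 1$ and $\vec{i} \cdot \vec{j} = 0$, and noting that $\vec{e}_r \cdot \vec{e}_\theta = 0$ because they are perpendicular, we obtain

(1.39)  \[ \frac{d^2\vec{r}}{dt^2} \cdot \vec{e}_\theta = \ell \frac{d^2\theta}{dt^2}, \qquad \vec{F}_{\text{gravity}} \cdot \vec{e}_\theta = -m g \sin\theta, \qquad \vec{F}_{\text{string}} \cdot \vec{e}_\theta = 0. \]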

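The three projections in (1.39) can be spot-checked numerically from the component expressions in (1.36). The following is a minimal sketch; the function name and the sample values are hypothetical and chosen only for illustration.

```python
import math


def tangential_projections(theta, dtheta, ddtheta, ell, m, g, tension):
    """Project acceleration, gravity and string force onto e_theta.

    theta, dtheta, ddtheta are the angle and its first two time derivatives;
    tension plays the role of the unknown lambda in the string force.
    """
    s, c = math.sin(theta), math.cos(theta)
    # Acceleration components (i, j) from (1.36).
    accel = (
        ell * c * ddtheta - ell * s * dtheta**2,
        -(ell * s * ddtheta + ell * c * dtheta**2),
    )
    e_r = (s, c)
    e_theta = (c, -s)
    f_gravity = (0.0, m * g)                      # m g j
    f_string = (-tension * e_r[0], -tension * e_r[1])

    def dot(u, w):
        return u[0] * w[0] + u[1] * w[1]

    return dot(accel, e_theta), dot(f_gravity, e_theta), dot(f_string, e_theta)


if __name__ == "__main__":
    # Arbitrary sample state, for illustration only.
    print(tangential_projections(0.7, 1.3, -2.1, 0.5, 2.0, 9.8, 4.2))
```

For any sample state, the first value should agree with ℓ times the angular acceleration, the second with −𝑚𝑔 sin 𝜃, and the third should vanish up to rounding.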

By substituting (1.39) into (1.38), and dividing out the mass and rearranging, we arrive at a differential equation for the pendulum motion. When the release conditions at time 𝑡 = 0 are included, we obtain

(1.40)  \[ \ell \frac{d^2\theta}{dt^2} + g \sin\theta = 0, \qquad \frac{d\theta}{dt}\Big|_{t=0} = 0, \qquad \theta\big|_{t=0} = \theta_0, \qquad t \ge 0. \]

The equations in (1.40) form a second-order, nonlinear, initial-value problem for the pendulum angle 𝜃 as a function of time 𝑡. This function also naturally depends on the parameters 𝑔, ℓ, and $\theta_0$ that appear in the equations, and we note that the mass 𝑚 was eliminated along the way. The theory of ordinary differential equations guarantees that there exists a unique solution $\theta = f(t, g, \ell, \theta_0)$, for some function 𝑓. Moreover, provided that the initial velocity is zero and the initial angle satisfies $\theta_0 \in (0, \pi)$, this solution will be periodic in time with a period $P = F(g, \ell, \theta_0)$, for some function 𝐹. Although they can be written in terms of certain special (elliptic) functions, there are no elementary expressions for 𝑓 or 𝐹. Here we use dimensional methods to find a reduced form of the period relation and examine some implications.

Reduced equation for period. The quantities 𝑃, 𝑔, ℓ, $\theta_0$ have dimensions [𝑃] = 𝑇, [𝑔] = $L/T^2$, [ℓ] = 𝐿 and [$\theta_0$] = 1. A dimensional basis is {𝑇, 𝐿}, and the dimensional exponent matrix in this basis is

(1.41)  \[ A = (\Delta_P, \Delta_g, \Delta_\ell, \Delta_{\theta_0}) = \begin{pmatrix} 1 & -2 & 0 & 0 \\ 0 & 1 & 1 & 0 \end{pmatrix}. \]

An arbitrary power product has the form $\pi = P^{b_1} g^{b_2} \ell^{b_3} \theta_0^{b_4}$. The equation 𝐴𝑣 = 0, where $v = (b_1, \dots, b_4)$, has two free variables, and the general solution is

(1.42)  \[ b_1 = -2b_3, \qquad b_2 = -b_3, \qquad b_3 \text{ and } b_4 \text{ free}. \]

Since there are two free variables, there are two independent solutions. For the first solution, we choose $b_3 = -1/2$ and $b_4 = 0$, which gives $\pi_1 = P\sqrt{g/\ell}$. For the second solution, we choose $b_3 = 0$ and $b_4 = 1$, which gives $\pi_2 = \theta_0$. This is a full set of independent dimensionless power products, and is normalized with respect to 𝑃. By the 𝜋-theorem, the period equation $P = F(g, \ell, \theta_0)$ must be equivalent to

(1.43)  \[ \pi_1 = \phi(\pi_2) \quad\text{or}\quad P = \sqrt{\frac{\ell}{g}}\; \phi(\theta_0), \]

for some function 𝜙. Thus the relation between the quantities 𝑃, 𝑔, ℓ, $\theta_0$ is not characterized by an unknown function of three quantities $F(g, \ell, \theta_0)$, but is instead characterized by an unknown function of one quantity $\phi(\theta_0)$. Equivalently, the dependence of $F(g, \ell, \theta_0)$ on the quantities 𝑔 and ℓ is completely dictated by dimensional considerations.

Some implications. The reduced form of the period relation given in (1.43) has some interesting implications, as summarized next.

(1) A single curve of $\pi_1$ versus $\pi_2$ completely determines the function 𝜙, and hence the relation between the quantities 𝑃, 𝑔, ℓ, and $\theta_0$. Thus any experiment or further analysis should be aimed at determining this curve.

(2) Consider any two pendula released from the same initial angle $\theta_0$. Let $\{g_1, \ell_1, \theta_0\}$ be the parameters of the first pendulum, and $\{g_2, \ell_2, \theta_0\}$ be the parameters of the second. In view of (1.43), the periods of the two pendula are given by $P_1 = \sqrt{\ell_1/g_1}\, \phi(\theta_0)$ and $P_2 = \sqrt{\ell_2/g_2}\, \phi(\theta_0)$. By dividing these two expressions, we obtain a fundamental period law for pendula, namely

(1.44)  \[ \frac{P_1}{P_2} = \sqrt{\frac{\ell_1 g_2}{\ell_2 g_1}}. \]

(3) The dependence of the period 𝑃 on each of the quantities 𝑔, ℓ, and $\theta_0$ can be characterized using (1.43). Specifically, for fixed 𝑔 and $\theta_0$, the period 𝑃 is an increasing function of ℓ; for fixed ℓ and $\theta_0$, the period 𝑃 is a decreasing function of 𝑔; and for fixed ℓ and 𝑔, the period 𝑃 increases or decreases with $\theta_0$ depending on the function 𝜙. A detailed analysis shows that the function $\phi(\theta_0)$, $\theta_0 \in (0, \pi)$ is positive, monotone, increasing, and has the limits

(1.45)  \[ \lim_{\theta_0 \to 0^+} \phi(\theta_0) = 2\pi, \qquad \lim_{\theta_0 \to \pi^-} \phi(\theta_0) = \infty. \]

Thus the period satisfies $P \approx 2\pi\sqrt{\ell/g}$ for a pendulum released from rest with initial angle $\theta_0 \approx 0$, which corresponds to a nearly vertical, downward position. Interestingly, the period 𝑃 is arbitrarily large for initial angles $\theta_0 \approx \pi$, which corresponds to a nearly vertical, upward position. The special cases of $\theta_0 = 0$ and $\theta_0 = \pi$ correspond to rest or equilibrium positions of the system in which no motion occurs. Such states and further properties of dynamical systems will be considered in later chapters.
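The implications above can be explored numerically by integrating (1.40) directly. The following is a minimal sketch that uses a classical fourth-order Runge-Kutta scheme, chosen here for illustration and not prescribed by the text, and estimates the period as four times the time of the first zero crossing of 𝜃. The function names `pendulum_period` and `phi` and the step-size parameter are hypothetical.

```python
import math


def pendulum_period(g, ell, theta0, steps_per_unit=2000):
    """Estimate the period of ell*theta'' + g*sin(theta) = 0, released from rest.

    Integrates with RK4 until theta first reaches zero (a quarter period),
    locating the crossing by linear interpolation within the last step.
    """
    if not 0.0 < theta0 < math.pi:
        raise ValueError("theta0 must lie in (0, pi)")
    omega2 = g / ell
    dt = math.sqrt(ell / g) / steps_per_unit   # step scaled by sqrt(ell/g)

    def rhs(theta, w):
        return w, -omega2 * math.sin(theta)

    t, theta, w = 0.0, theta0, 0.0
    while True:
        k1 = rhs(theta, w)
        k2 = rhs(theta + 0.5 * dt * k1[0], w + 0.5 * dt * k1[1])
        k3 = rhs(theta + 0.5 * dt * k2[0], w + 0.5 * dt * k2[1])
        k4 = rhs(theta + dt * k3[0], w + dt * k3[1])
        theta_new = theta + dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        w_new = w + dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
        if theta_new <= 0.0:
            fraction = theta / (theta - theta_new)
            return 4.0 * (t + fraction * dt)
        t, theta, w = t + dt, theta_new, w_new


def phi(theta0):
    """Dimensionless period pi_1 = P*sqrt(g/ell), computed with g = ell = 1."""
    return pendulum_period(1.0, 1.0, theta0)


if __name__ == "__main__":
    for angle in (0.01, math.pi / 4, math.pi / 2, 3.0):
        print(angle, phi(angle))
```

Under these assumptions, evaluating `pendulum_period(g, ell, theta0) * math.sqrt(g / ell)` for different 𝑔 and ℓ at the same initial angle should return essentially the same number, which is the content of (1.43), and `phi` should approach 2𝜋 for small angles and grow as the angle approaches 𝜋.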

Reference notes

Classic references for the material presented here are the books by Birkhoff (2015) and Bridgman (1963). A recent treatment with a wealth of details and examples is given in Szirtes (2007), and a concise guide that illustrates various diverse applications is given in Lemons (2017).

Exercises

1. Let 𝑥, 𝑡, 𝑎, 𝑏 be quantities in units {m, s}, and let $\tilde{x}, \tilde{t}, \tilde{a}, \tilde{b}$ be the corresponding quantities in units {cm, hr}, with dimensions [𝑥] = 𝐿, [𝑡] = 𝑇, [𝑎] = 𝐿/𝑇 and [𝑏] = 𝑇. Change the equation from 𝑥, 𝑡, 𝑎, 𝑏 to $\tilde{x}, \tilde{t}, \tilde{a}, \tilde{b}$. Is the equation unit-free?

(a) $x = \dfrac{a t^3 \arctan(t)}{(b + 4t)^2}$.

(b) $x = \dfrac{a t^2 \arctan(t/b)}{b + 4t}$.

2. Let 𝑥, 𝑦 and 𝑝, 𝑞, 𝑟, 𝑠 be quantities in units {ft, lb} in the basis {𝐿, 𝑀}. Also, let $v = dy/dx$. Assuming [𝑥] = 𝐿 and [𝑦] = 𝑀, find the dimensions [𝑝], [𝑞], [𝑟], [𝑠] as needed to make the given equation unit-free.

(a) $y = p x^2 + q \sin(r x)$.

(b) $y = p \ln(s x) + \dfrac{q}{r + x^2}$.

(c) $y = (p x + q x^2)\, e^{x/r}$.

(d) $y = \dfrac{p + q e^{r x^2}}{s + x}$.

(e) $v = p x - q x^3 - r$.

(f) $v = p y - q x^2 - r x y^2$.

3. Let 𝑃, 𝑄, 𝑅, 𝑆 be quantities in given units in a basis $\{D_1, D_2\}$. Show that 𝑃 = 𝑄 + 𝑅 + 𝑆 is unit-free if and only if [𝑃] = [𝑄] = [𝑅] = [𝑆].

4. Let 𝑥, 𝑦, 𝑧 > 0 and 𝑝, 𝑞, 𝑟 > 0 be quantities with the following dimensions in the basis (1.1):

[𝑥] = 𝐿, [𝑦] = 𝑀𝐿/𝑇, [𝑧] = 𝛩𝑀/(𝐿𝑇), [𝑝] = $L^2$, [𝑞] = 𝑀/𝑇, [𝑟] = 𝑀/𝛩.

Find a reduced form of the given equation assuming it is unit-free. If not possible, explain why.

(a) 𝑥 = 𝑓(𝑦, 𝑞, 𝑟).

(b) 𝑦 = 𝑓(𝑥, 𝑧, 𝑝).

(c) 𝑧 = 𝑓(𝑥, 𝑦, 𝑟).

(d) 𝑦 = 𝑓(𝑥, 𝑝, 𝑞).

5. Let 𝑢, 𝑣, 𝑤 > 0 and 𝛼, 𝛽, 𝛾, 𝛿 > 0 be quantities with the following dimensions in the basis (1.2):

[𝑢] = 𝐶/𝑇, [𝑣] = 𝐻/𝑇, [𝑤] = 𝑃/𝑇, [𝛼] = 𝐶/𝐻, [𝛽] = 𝑃/𝐻, [𝛾] = 𝐻/(𝐶𝑇), [𝛿] = 1/𝑇.

Find a reduced form of the given equation assuming it is unit-free. If not possible, explain why.

(a) 𝑢 = 𝑓(𝑣, 𝑤, 𝛼, 𝛽).

(b) 𝑣 = 𝑓(𝑤, 𝛽, 𝛾, 𝛿).

(c) 𝑤 = 𝑓(𝛼, 𝛽, 𝛾).

(d) 𝑤 = 𝑓(𝑢, 𝑣, 𝛼, 𝛽, 𝛿).

6. An experiment to measure the temperature 𝑢 in a furnace at time 𝑡 is performed. A curve fitting procedure applied to the 𝑢, 𝑡 data yields the empirical relation $u = 3.7\,t^{1.5} + 4.2\,t + 293.2$ in units {kelvin, minutes}. Here we explore different ways to make this relation unit-free.

(a) Consider $u = a t^{1.5} + b t + c$, where 𝑎, 𝑏, 𝑐 are dimensional constants with values 3.7, 4.2, 293.2 in units {kelvin, minutes}. What must be the dimensions of 𝑎, 𝑏, 𝑐 so that the equation is unit-free? What would be the values of 𝑎, 𝑏, 𝑐 in units {kelvin, hours}?

(b) Alternatively, consider $u = \beta\,[\,3.7\,(t/\alpha)^{1.5} + 4.2\,(t/\alpha) + 293.2\,]$, where 𝛼 and 𝛽 are dimensional constants, with values 𝛼 = 1 minute and 𝛽 = 1 kelvin, and 3.7, 4.2 and 293.2 are dimensionless (pure) numbers. Show that this form of the equation is also unit-free.

7. Data is collected on the height 𝑦 of certain trees at time 𝑡 during their lifetimes. A curve fitting procedure gives the empirical relation $y = 52.4 - 52.4\,e^{-0.1 t} - 3.3\,t\,e^{-0.2 t}$ in units {foot, year}. Rewriting as $y = a - b e^{-c t} - p t e^{-q t}$, find the dimensions of 𝑎, 𝑏, 𝑐, 𝑝, 𝑞 so that the equation is unit-free. Convert the equation to units {yard, decade}.

8. According to a simple theory of growth, the ultimate radius 𝑟 > 0 of a single-celled organism is determined by the nutrient absorption rate 𝑎 > 0 through its surface, and nutrient consumption rate 𝑐 > 0 throughout its volume, via a unit-free equation 𝑟 = 𝑓(𝑎, 𝑐). Find a reduced form of the relation using [𝑎] = $M/(L^2 T)$ and [𝑐] = $M/(L^3 T)$.

9. A metal forming process involves a pressure 𝑃 > 0, length 𝑥 > 0, time 𝑡 > 0, mass 𝑚 > 0 and density 𝜌 > 0, and is described by a unit-free equation 𝑃 = 𝑓(𝑥, 𝑡, 𝑚, 𝜌). Find a reduced form of this relation and express the result explicitly in terms of 𝑃. Recall that [𝑃] = $M/(LT^2)$ and [𝜌] = $M/L^3$.

10. A sphere of radius 𝑟 > 0 is immersed in a fluid of density 𝜌 > 0 and viscosity 𝜇 > 0. When subject to a force 𝑞 > 0, the sphere attains a terminal velocity 𝑣 > 0. We suppose there is a unit-free equation 𝑣 = 𝑓(𝑟, 𝑞, 𝜌, 𝜇), where [𝑞] = $MLT^{-2}$, [𝜌] = $ML^{-3}$, [𝜇] = $ML^{-1}T^{-1}$.

(a) Find a reduced form of 𝑣 = 𝑓(𝑟, 𝑞, 𝜌, 𝜇).

(b) For fixed 𝑞, 𝜌, 𝜇, show that 𝑣 is proportional to a power of 𝑟.

11. A model for the digestion process in animals states that the absorption rate 𝑢 > 0 of a given nutrient is determined by the concentration 𝑐 > 0, residence time 𝜏 > 0, and breakdown rate 𝑟 > 0 of the nutrient in the gut, along with the volume 𝑣 > 0 of the gut. We suppose there is a unit-free equation 𝑢 = 𝑓(𝑐, 𝜏, 𝑟, 𝑣), where [𝑢] = 𝑀/𝑇, [𝑐] = $M/L^3$ and [𝑟] = $M/(L^3 T)$.

(a) Find a reduced form of 𝑢 = 𝑓(𝑐, 𝜏, 𝑟, 𝑣).

(b) For fixed 𝑐, 𝜏, 𝑟, show that 𝑢 is proportional to 𝑣.

12. In an explosion, a circular blast wave of intense pressure expands from the point of explosion into the surrounding air. A simple theory asserts that the radius 𝑟 > 0 of the wave is determined by the elapsed time 𝑡 > 0, the energy 𝐸 > 0 released in the explosion, and the density 𝜌 > 0 of the surrounding air, via a unit-free equation 𝑟 = 𝑓(𝑡, 𝐸, 𝜌). Find a reduced form of this relation and show that the radius must increase with time in a nonlinear way; specifically, 𝑟 increases as $t^{2/5}$. Here [𝐸] = $ML^2/T^2$ and [𝜌] = $M/L^3$.

13. In a domino toppling show, a long line of dominoes topple over, one by one, in a chain reaction. It is hypothesized that the speed 𝑣 > 0 of the toppling wave depends on the spacing 𝑑 > 0 and height ℎ > 0 of each domino, and gravitational acceleration 𝑔 > 0, via a unit-free equation 𝑣 = 𝑓(𝑑, ℎ, 𝑔). (The speed is assumed to be insensitive to the thickness and width of each domino.)

(a) Show that a reduced form of the speed equation is $v = \sqrt{g h}\; \phi(d/h)$ for some function 𝜙.

(b) The figure below shows a plot of $\pi_1 = \phi(\pi_2)$, where $\pi_1 = v/\sqrt{g h}$ and $\pi_2 = d/h$, made with data from different domino experiments. Note that the graph is approximately linear on the interval $0.1 \le \pi_2 \le 0.7$. Find a linear expression for $\pi_1 = \phi(\pi_2)$ valid on this interval. Use this expression to write 𝑣 in terms of 𝑑, ℎ, 𝑔.

[Figure: plot of $\pi_1$ versus $\pi_2$, with axis marks $\pi_1$ = 1.2, 1.5 and $\pi_2$ = 0.1, 0.7.]

14. Drone airplanes of surface area 𝑠 use fuel of energy content 𝑒 to fly at velocity 𝑣 through air of viscosity 𝜇. A theory proposes that the fuel consumption rate 𝑘 is determined by a unit-free equation 𝑘 = 𝑓(𝑠, 𝑒, 𝑣, 𝜇), where [𝑘] = $L^3 T^{-1}$, [𝑒] = $ML^{-1}T^{-2}$ and [𝜇] = $ML^{-1}T^{-1}$.

(a) Find a reduced form of 𝑘 = 𝑓(𝑠, 𝑒, 𝑣, 𝜇).

(b) Using data from two experiments, and linear interpolation in a plot of the reduced form, find 𝑘 when (𝑠, 𝑒, 𝑣, 𝜇) = (6, 3, 20, 1).

Data in some appropriate units:

                 𝑠     𝑒     𝑣     𝜇     𝑘
experiment 1:    5     2     10    1     1
experiment 2:    7     3     15    2     5

Mini-project. As developed in Section 1.7, a model for an ideal pendulum released from rest is

\[ \ell \frac{d^2\theta}{dt^2} + g \sin\theta = 0, \qquad \frac{d\theta}{dt}\Big|_{t=0} = 0, \qquad \theta\big|_{t=0} = \theta_0, \qquad t \ge 0. \]

[Figure: pendulum diagram with labels 𝑔 and 𝜃.]

Here 𝜃 is the pendulum angle, ℓ is the length, 𝑔 is gravitational acceleration, and 𝑡 is time. The above system has a unique solution $\theta = f(t, g, \ell, \theta_0)$, for some function 𝑓, and provided that the initial velocity is zero and the initial angle satisfies $\theta_0 \in (0, \pi)$, this solution will be periodic in time with a period $P = F(g, \ell, \theta_0)$, for some function 𝐹. Here we study the period relation and construct an approximate formula for it using some data. All quantities are in units of meters and seconds.

(a) As outlined in the text, show that the reduced form of the period relation is $P = \sqrt{\ell/g}\; \phi(\theta_0)$, for some function 𝜙. Given that 𝜙 and 𝐹 are both unknown, what is the conceptual advantage of the form $P = \sqrt{\ell/g}\; \phi(\theta_0)$ compared to the form $P = F(g, \ell, \theta_0)$?

(b) The table below shows experimental measurements of 𝑃 for five pendula with different values of 𝑔, ℓ, and $\theta_0$. Compute the value of 𝜙 for each case; make a table or plot of 𝜙 versus $\theta_0$. Over the given interval of $\theta_0$, what are the qualitative features of 𝜙? Does the function appear to be increasing or decreasing? Concave up or concave down?

           𝑔, m/s²    ℓ, m     𝜃0, rad    𝑃, s
case 1:    9.80       0.20     𝜋/12       0.9015
case 2:    9.80       0.10     𝜋/6        0.6457
case 3:    9.80       0.50     𝜋/4        1.4760
case 4:    9.80       0.30     𝜋/3        1.1798
case 5:    9.80       0.25     𝜋/2        1.1845

(c) Using your 𝜙 versus $\theta_0$ table, and linear interpolation between entries, predict the period 𝑃 for given parameter values $\{g, \ell, \theta_0\}$ = {9.8, 0.5, 5𝜋/12}, {9.8, 0.7, 5𝜋/12}, {4.9, 0.6, 𝜋/3}. More generally, what would be an approximate formula for 𝑃, valid for any ℓ > 0, 𝑔 > 0 and $\theta_0 \in [\pi/3, \pi/2]$?

(d) Use Matlab or other similar software to numerically solve the pendulum differential equation, along with the initial conditions, to produce a plot of 𝜃 versus 𝑡. Run a simulation with each set of parameters $\{g, \ell, \theta_0\}$ from part (c) and directly estimate the period from the plot for each case. Do the estimates agree with the predictions from (c)?

Chapter 2

Scaling

Mathematical models may involve a number of quantities expressed in a variety of units. While the absolute magnitudes of the quantities are determined by the units, the relative significance of the quantities can be obscured by differences in these units. In this chapter, we outline some results on scale transformations, which can facilitate the study of any mathematical model, and expose the relative significance of the quantities involved.

2.1. Domains and scales

In the modeling of various types of systems, we will often seek to construct and study a unit-free relation of the form

(2.1)  \[ y = f(t, c_1, \dots, c_N), \]

where 𝑦, 𝑡, $c_1, \dots, c_N$ are real-valued quantities with given units. Here 𝑦, 𝑡 denote variables, $c_1, \dots, c_N$ denote parameters, and 𝑓 is a given function. By a parameter we mean a constant whose value depends on the specific case or situation of interest. The function 𝑓 may be defined explicitly, or implicitly as the solution of some related equation, for example a differential equation. For brevity, when the parameters are not essential to a discussion, we will abbreviate the above relation as 𝑦 = 𝑓(𝑡).

A basic goal is to understand how the parameters $c_1, \dots, c_N$ influence the graph 𝑦 = 𝑓(𝑡) in a given domain 𝐷. For our purposes, it will be convenient to consider domains consisting of rectangular cells as illustrated in Figure 2.1, defined by

(2.2)  \[ D = \{\, (t, y) \;|\; m_1 a \le t \le m_2 a, \quad n_1 b \le y \le n_2 b \,\}. \]

Here $m_1 < m_2$ and $n_1 < n_2$ are integers, and 𝑎 > 0 and 𝑏 > 0 are constants called scale factors for the domain. Smaller values of the scale factors correspond to smaller observation windows or domains, which give a more zoomed-in view of a graph. Similarly, larger scale factors correspond to larger domains, which give a more zoomed-out view. Note that the visible features of a graph depend on the scales at which it is observed. At smaller scales, a graph may appear nearly flat or linear, and at larger scales, it may appear rather nonlinear with significant curvature. The influence of scales on the visible features of a graph is illustrated in the following example.

Figure 2.1. [Domain 𝐷 of rectangular cells in the 𝑡, 𝑦 plane, with the scale factors 𝑎 and 𝑏 marked on the axes.]

Example 2.1.1. Consider the function $y = c_1 t^2 + c_2 \sin(2\pi t/c_3)$, with parameters $c_1 = 2\ \mathrm{m/s^2}$, $c_2 = 0.01\ \mathrm{m}$ and $c_3 = 0.01\ \mathrm{s}$. Consider also the domain $(t, y) \in [-a, a] \times [-b, b]$ with four different sets of scales $\{a, b\}$ = {0.001 s, 0.005 m}, {0.02 s, 0.02 m}, {0.15 s, 0.10 m} and {10 s, 200 m}. The graph of this function, with each of the four sets of scales, is shown in Figure 2.2. The graph appears nearly linear at the smallest scales (top-left), then nearly sinusoidal (top-right), then nearly sinusoidal with a quadratic drift (bottom-left), and finally nearly quadratic at the largest scales (bottom-right). Note that each panel in the figure corresponds to the same function, but the scales are different, which significantly affects the appearance of the graph. The scales are reflected in the range of values on the $t, y$ axes.

Figure 2.2. [Four panels showing the graph in the windows $[-0.001, 0.001] \times [-0.005, 0.005]$ (top-left), $[-0.02, 0.02] \times [-0.02, 0.02]$ (top-right), $[-0.15, 0.15] \times [-0.1, 0.1]$ (bottom-left) and $[-10, 10] \times [-200, 200]$ (bottom-right).]
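The four appearances described in Example 2.1.1 can be anticipated by comparing the size of each term with the window height. The following is a minimal sketch, assuming the parameter value $c_1 = 2\ \mathrm{m/s^2}$; the helper name `window_summary` is hypothetical and introduced only for illustration.

```python
import math

# Parameters of Example 2.1.1 in SI units (c1 = 2 m/s^2 is an assumed reading).
C1 = 2.0    # m/s^2, coefficient of the quadratic term
C2 = 0.01   # m, amplitude of the sinusoidal term
C3 = 0.01   # s, period of the sinusoidal term

def f(t):
    """y = c1 t^2 + c2 sin(2 pi t / c3)."""
    return C1 * t**2 + C2 * math.sin(2.0 * math.pi * t / C3)

def window_summary(a, b):
    """Hypothetical helper: term sizes relative to the window [-a, a] x [-b, b]."""
    return {
        "quadratic_over_b": C1 * a**2 / b,   # size of c1 t^2 at t = a, relative to b
        "sine_over_b": C2 / b,               # sine amplitude relative to b
        "periods_in_window": 2.0 * a / C3,   # number of sine periods across [-a, a]
    }

SCALES = [(0.001, 0.005), (0.02, 0.02), (0.15, 0.10), (10.0, 200.0)]

if __name__ == "__main__":
    for a, b in SCALES:
        s = window_summary(a, b)
        print(f"a={a:g} s, b={b:g} m: "
              + ", ".join(f"{k}={v:.3g}" for k, v in s.items()))
```

At the smallest scales the window holds only a fraction of one period and the quadratic term is negligible, which is consistent with a nearly linear graph; at the largest scales the sine amplitude is a tiny fraction of the window height, which is consistent with a nearly quadratic graph.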

2.2. Scale transformations

When preparing to study a function $y = f(t)$ in a domain $D$, it will be helpful to introduce a change of variable as described next. Such a transformation will normalize the variables and domain, and ultimately make the function easier to study.


Definition 2.2.1. By a scale transformation for variables $t, y$ we mean a change of variable

(2.3)    $\bar t = t/a, \qquad \bar y = y/b,$

where $a, b > 0$ are constants with the same dimensions as $t, y$. The transformation converts any function and domain

(2.4)    $y = f(t), \qquad m_1 a \le t \le m_2 a, \qquad n_1 b \le y \le n_2 b,$

into an equivalent scaled function and scaled domain (see Figure 2.3)

(2.5)    $\bar y = b^{-1} f(a \bar t) = \bar f(\bar t), \qquad m_1 \le \bar t \le m_2, \qquad n_1 \le \bar y \le n_2.$

Note that the scaled function $\bar y = \bar f(\bar t)$ is a horizontally and vertically stretched version of the original function $y = f(t)$. Depending on the scale factors, each cell of size $a \times b$ in the $t, y$ plane is enlarged or reduced to a cell of size $1 \times 1$ in the $\bar t, \bar y$ plane. Specifically, the original cell is enlarged when $0 < a, b < 1$, and reduced when $a, b > 1$. Moreover, because they are dimensionless, the scaled variables $\bar t, \bar y$ will usually be more convenient and efficient to use than $t, y$.

Figure 2.3. [Left: the graph $y = f(t)$ in a domain $D$ of the $t, y$ plane with cells of size $a \times b$. Right: the scaled graph $\bar y = \bar f(\bar t)$ in the scaled domain with cells of size $1 \times 1$.]

For any values of $a, b$ the scaled function $\bar y = \bar f(\bar t)$ is a faithful representation of the original function $y = f(t)$, that is, they have the same qualitative features. Specifically, if the original function is increasing or decreasing in some interval, then the scaled function must be increasing or decreasing in the corresponding scaled interval. Similarly, the two functions must have the same concavity, and the same numbers of minima and maxima. The following example shows how the scaled function, in the scaled or normalized domain, faithfully replicates the features of the original function.

Example 2.2.1. Consider again the previous example, and for each set of scales $a, b$, consider now the scaled function $\bar y = \bar f(\bar t)$ and the scaled domain $(\bar t, \bar y) \in [-1, 1] \times [-1, 1]$, where

(2.6)    $\bar y = \frac{c_1 a^2}{b}\, \bar t^{\,2} + \frac{c_2}{b} \sin\!\left( \frac{2\pi a \bar t}{c_3} \right).$

The graph of this function, for each of the four sets of scales, is shown in Figure 2.4. In contrast to before, the domain for the variables $(\bar t, \bar y)$ does not change, but now the function $\bar y = \bar f(\bar t)$ explicitly changes with the scales $a, b$ as shown in equation (2.6). Thus the scale transformation exposes the significance of the scales by making them parameters of the function instead of the domain. Note that each panel of Figure 2.2 is faithfully represented in Figure 2.4.
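The scale transformation of Definition 2.2.1 is easy to express as a higher-order function. The sketch below is illustrative only: the names `scale_transform` and `fbar_explicit` are hypothetical, and the value $c_1 = 2\ \mathrm{m/s^2}$ is an assumed reading of Example 2.1.1. It builds the scaled function from the rule $\bar f(\bar t) = f(a\bar t)/b$ and compares it with the explicit form (2.6).

```python
import math

def scale_transform(f, a, b):
    """Return the scaled function fbar(tbar) = f(a * tbar) / b, as in (2.5)."""
    if a <= 0 or b <= 0:
        raise ValueError("scale factors must be positive")
    return lambda tbar: f(a * tbar) / b

# Parameters of Example 2.1.1 (c1 = 2 m/s^2 is an assumption).
C1, C2, C3 = 2.0, 0.01, 0.01

def f(t):
    """Original function y = c1 t^2 + c2 sin(2 pi t / c3)."""
    return C1 * t**2 + C2 * math.sin(2.0 * math.pi * t / C3)

def fbar_explicit(tbar, a, b):
    """Right-hand side of (2.6), with the scales a, b appearing as parameters."""
    return (C1 * a**2 / b) * tbar**2 + (C2 / b) * math.sin(2.0 * math.pi * a * tbar / C3)

if __name__ == "__main__":
    a, b = 0.15, 0.10
    fbar = scale_transform(f, a, b)
    for tbar in (-1.0, 0.0, 0.5, 1.0):
        print(tbar, fbar(tbar), fbar_explicit(tbar, a, b))
```

The two evaluations agree up to rounding for any positive scales, which is one way to see that the scales have moved from the domain into the function.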

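Figure 2.4. [Graphs of the scaled function $\bar y = \bar f(\bar t)$ in the scaled domain $[-1, 1] \times [-1, 1]$, for each of the four sets of scales.]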

2.3. Derivative relations

The scale transformation in (2.3) can be extended to derivatives. If a function $y = f(t)$ is differentiable at some point $(t, y)$, then so will be the scaled function $\bar y = \bar f(\bar t)$ at the scaled point $(\bar t, \bar y)$, and similarly for higher-order derivatives. The next result, which follows from a straightforward application of the chain rule, summarizes the relation between the original and scaled derivatives.

Result 2.3.1. Let $\bar t = t/a$ and $\bar y = y/b$ where $a, b > 0$ are constant scales. If $y = f(t)$ is $n$-times differentiable at $(t, y)$, then so is $\bar y = \bar f(\bar t)$ at $(\bar t, \bar y)$, and conversely. The relation between derivatives is

(2.7)    $\frac{d\bar y}{d\bar t} = \frac{a}{b} \frac{dy}{dt}, \qquad \frac{d^2\bar y}{d\bar t^{\,2}} = \frac{a^2}{b} \frac{d^2 y}{dt^2}, \qquad \ldots, \qquad \frac{d^n\bar y}{d\bar t^{\,n}} = \frac{a^n}{b} \frac{d^n y}{dt^n}.$

These relations also hold for a more general change of variable that includes a shift, such as $\bar t = (t - t_0)/a$ and $\bar y = (y - y_0)/b$, for any constants $t_0, y_0$.

The above relations imply that any equation for $y = f(t)$ and its derivatives can be converted to an equation for the scaled function $\bar y = \bar f(\bar t)$ and its derivatives. Equivalently, any differential equation in the variables $t, y$ can be converted to an equation in the scaled variables $\bar t, \bar y$. Note that solutions of the original and scaled equations must be equivalent under the change of variable. As we will see, the scale factors $a, b$ can be chosen so that the scaled equation is simpler than the original, with fewer parameters. Also, the fact that the scaled equation is dimensionless is advantageous: the relative size and importance of quantities in the scaled equation can be directly compared with each other. In contrast, quantities in the original equation may be difficult to compare due to differences in their dimensions.

Example 2.3.1. Consider the initial-value problem

(2.8)    $\frac{dy}{dt} = c_1 y^2 + c_2 y, \qquad y|_{t=0} = c_3, \qquad t \ge 0,$

where $t, y$ are variables and $c_1, c_2, c_3 > 0$ are parameters. Here we find the scaled form of this problem using arbitrary scale factors $a, b > 0$.

To obtain the scaled form, we substitute the change of variable relations $t = a\bar t$ and $y = b\bar y$, along with the derivative relation $\frac{dy}{dt} = \frac{b}{a} \frac{d\bar y}{d\bar t}$, and get

(2.9)    $\frac{b}{a} \frac{d\bar y}{d\bar t} = c_1 b^2 \bar y^{\,2} + c_2 b \bar y, \qquad b\bar y|_{a\bar t=0} = c_3, \qquad a\bar t \ge 0,$

which after simplification becomes

(2.10)    $\frac{d\bar y}{d\bar t} = c_1 a b\, \bar y^{\,2} + c_2 a\, \bar y, \qquad \bar y|_{\bar t=0} = \frac{c_3}{b}, \qquad \bar t \ge 0.$

Note that the scaled problem in (2.10) is similar to the original problem in (2.8), but the coefficients have changed. The new coefficients now depend on the scale constants $a, b$ along with the original constants $c_1, c_2, c_3$. As before, the scale transformation exposes the significance of the scales by making them parameters of the equation instead of the domain. Note that certain choices of scale could be made to simplify the scaled equation, for example, some of the new coefficients could be made to have unit values.

Example 2.3.2. Let $y = f(t)$ be the solution of the initial-value problem

(2.11)    $\frac{dy}{dt} + c_1 y = c_2 t, \qquad y|_{t=0} = c_3, \qquad t \ge 0.$

Here $t, y$ are variables with dimensions $[t] = T$ and $[y] = L$, and $c_1, c_2, c_3 > 0$ are parameters with dimensions $[c_1] = T^{-1}$, $[c_2] = LT^{-2}$ and $[c_3] = L$. The visible features of the solution will depend not only on these parameters, but also the scales $a, b > 0$ at which the solution is observed. To explore this, we consider the scaled form of the problem, obtained by the same process as before,

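(2.12)    $\frac{d\bar y}{d\bar t} + \sigma_1 \bar y = \sigma_2 \bar t, \qquad \bar y|_{\bar t=0} = \sigma_3, \qquad \bar t \ge 0,$

where

(2.13)    $\sigma_1 = c_1 a, \qquad \sigma_2 = \frac{c_2 a^2}{b}, \qquad \sigma_3 = \frac{c_3}{b}.$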

The scaled problem in (2.12) informs us that the behavior of $\bar y = \bar f(\bar t)$ is determined by the three dimensionless parameters $\sigma_1$, $\sigma_2$, and $\sigma_3$. Different interesting cases can be identified. For instance, in cases for which $\sigma_1 \gg \sigma_2$ and $\sigma_3 \ge 1$, the scaled equations become $\frac{d\bar y}{d\bar t} \approx -\sigma_1 \bar y$ and $\bar y|_{\bar t=0} \ge 1$, which suggests that the scaled and hence original graph would have the appearance of a decaying exponential. Alternatively, in cases for which $\sigma_2 \gg \sigma_1$ and $\sigma_3 \ll 1$, the scaled equations become $\frac{d\bar y}{d\bar t} \approx \sigma_2 \bar t$ and $\bar y|_{\bar t=0} \ll 1$, which suggests that the scaled and hence original graph would have the appearance of a growing quadratic. Note that $\sigma_1, \sigma_2, \sigma_3$ are dimensionless and can be directly compared against each other; in contrast, $c_1, c_2, c_3$ have different dimensions and cannot be similarly compared. Thus the scale transformation facilitates the study of the problem, and exposes the relative significance of all constants involved.
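The statement that solutions of the original and scaled problems are equivalent under the change of variable can be checked numerically. The sketch below is an illustration under stated assumptions: it uses a classical fixed-step Runge-Kutta scheme (a choice made here, not a method from the text), the function names are hypothetical, and the parameter values in the usage lines are arbitrary.

```python
def rk4(rhs, y0, t_end, n):
    """Classical fixed-step Runge-Kutta for dy/dt = rhs(t, y) on [0, t_end]."""
    h = t_end / n
    t, y = 0.0, y0
    for _ in range(n):
        k1 = rhs(t, y)
        k2 = rhs(t + h / 2, y + h * k1 / 2)
        k3 = rhs(t + h / 2, y + h * k2 / 2)
        k4 = rhs(t + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return y

def solve_original(c1, c2, c3, t_end, n=2000):
    """Value y(t_end) for dy/dt + c1 y = c2 t, y(0) = c3."""
    return rk4(lambda t, y: c2 * t - c1 * y, c3, t_end, n)

def solve_scaled(c1, c2, c3, a, b, tbar_end, n=2000):
    """Value ybar(tbar_end) for the scaled problem with the sigma parameters."""
    s1, s2, s3 = c1 * a, c2 * a**2 / b, c3 / b   # sigma_1, sigma_2, sigma_3
    return rk4(lambda t, y: s2 * t - s1 * y, s3, tbar_end, n)

if __name__ == "__main__":
    c1, c2, c3 = 0.5, 3.0, 2.0      # arbitrary illustrative parameters
    a, b, t_end = 4.0, 7.0, 6.0     # arbitrary positive scales and end time
    y = solve_original(c1, c2, c3, t_end)
    ybar = solve_scaled(c1, c2, c3, a, b, t_end / a)
    print(y, b * ybar)              # the two values should agree closely
```

Any positive choice of $a, b$ should give the same agreement, since only the coordinates change and not the underlying solution.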

2.4. Natural scales

Various choices of scale can be made when studying a function $y = f(t)$ with parameters $c_1, \ldots, c_N$. In general, it is desirable to use scales that are natural or intrinsic to the function, in the sense that they depend only on the function itself and its parameters, namely

(2.14)    $a = \rho(c_1, \ldots, c_N), \qquad b = \sigma(c_1, \ldots, c_N),$


for some functions $\rho, \sigma$. Here we outline two different types of natural scales in this sense.

Definition 2.4.1. Assume $y = f(t)$ is nonconstant and continuously differentiable on some closed, bounded interval $I$. Then the constants $a, b > 0$ defined by

(2.15)    $b = \max_{t \in I} |f(t)|, \qquad \frac{b}{a} = \max_{t \in I} \left| \frac{df}{dt}(t) \right|,$

are natural scales. These constants are called characteristic scales for $y = f(t)$ on $I$. In simpler notation, $b = |y|_{\max}$ and $a = b / \left| \frac{dy}{dt} \right|_{\max}$.

The characteristic scales $a, b$ provide a default level of zoom for a graph $y = f(t)$ in any domain of the $t, y$ plane with $t \in I$. At this default level of magnification, the slope of the graph in any cell of the domain is bounded by that of the two diagonal lines of the cell, and the entire vertical range of the graph is bounded by twice the height of a cell. A domain with a moderate number of cells at these characteristic scales can be expected to provide a natural view of a function, with no large or abrupt changes in the graph or its slope. For a complicated function with a number of different components, the above definition could be applied to the overall function, or instead be applied to each different component, leading to a collection of characteristic scales. A function for which significantly different scales are needed to clearly represent its features is called a multiscale function.

Example 2.4.1. Consider the function $y = c_1 t e^{-t/c_2}$, where $c_1 = 2\ \mathrm{m/s}$ and $c_2 = 6\ \mathrm{s}$, and consider the interval $I = [0, 40\ \mathrm{s}]$. Using the methods of calculus to determine the indicated maximum values, we find that characteristic scales for this function and interval are

(2.16)    $b = \max_{t \in I} |f(t)| = \frac{c_1 c_2}{e}, \qquad a = b \,\Big/ \max_{t \in I} \left| \frac{df}{dt}(t) \right| = \frac{c_2}{e}.$

A plot corresponding to the domain $[0, 10a] \times [0, b]$ is shown in Figure 2.5. Note that a moderate number of cells with scales $a, b$ provides a clear, well-proportioned view of all the essential features of the function, which are a rise up to a single maximum followed by a decay to zero.

Figure 2.5. [Graph of $y = c_1 t e^{-t/c_2}$ on the domain $[0, 10a] \times [0, b]$.]
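When the maxima in Definition 2.4.1 are awkward to find by hand, they can be approximated by sampling. The sketch below is a rough illustration only: the function name `characteristic_scales` and the grid size are hypothetical choices, the derivative is supplied analytically, and the parameter values $c_1 = 2\ \mathrm{m/s}$, $c_2 = 6\ \mathrm{s}$ are an assumed reading of Example 2.4.1.

```python
import math

def characteristic_scales(f, df, t_min, t_max, n=100_000):
    """Approximate b = max|f| and a = b / max|df/dt| on [t_min, t_max] by grid sampling."""
    ts = [t_min + (t_max - t_min) * i / n for i in range(n + 1)]
    b = max(abs(f(t)) for t in ts)
    max_slope = max(abs(df(t)) for t in ts)
    return b / max_slope, b

# Function of Example 2.4.1 and its derivative (assumed parameter values).
C1, C2 = 2.0, 6.0   # m/s and s

def f(t):
    return C1 * t * math.exp(-t / C2)

def df(t):
    return C1 * (1.0 - t / C2) * math.exp(-t / C2)

a, b = characteristic_scales(f, df, 0.0, 40.0)

if __name__ == "__main__":
    print("a =", a, " expected c2/e    =", C2 / math.e)
    print("b =", b, " expected c1*c2/e =", C1 * C2 / math.e)
```

Grid sampling can miss narrow peaks, so for rapidly varying functions the grid would need to be refined or replaced by a proper optimizer.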

Example 2.4.2. Consider $y = c_1 e^{-t/c_2} \cos(t/c_3)$, where $c_1 = 1\ \mathrm{m}$, $c_2 = 6\ \mathrm{s}$, $c_3 = \frac{1}{100}\ \mathrm{s}$, and as before $I = [0, 40\ \mathrm{s}]$. Applying the above definitions to this function and interval yields the characteristic scales $b = c_1$ and $a \approx c_3$. A plot corresponding to the domain $[0, 10a] \times [-b, b]$ is shown in the left half of Figure 2.6. These scales provide a clear view of the oscillating component of the function, but this view is incomplete; there is little indication of the decaying component associated with the factor $e^{-t/c_2}$, which varies on the much longer (slower) scale $\hat a = c_2$. A plot corresponding to the longer domain $[0, 5\hat a] \times [-b, b]$ is shown in the right half of the figure; the decaying component is now visible, but the oscillations are so tightly spaced that the graph appears to fill an area. Note that neither plot by itself provides a clear representation of all the features of the function, but instead two plots are necessary to illustrate the fast and slow behaviors in time. This is a simple example of a multiscale function.

Figure 2.6. [Left: graph on the domain $[0, 10a] \times [-b, b]$, showing the oscillations. Right: graph on the domain $[0, 5\hat a] \times [-b, b]$, showing the decay.]

In many cases it is impractical to determine characteristic scales for a function in the sense of the definition above. In these cases, other natural scales can be considered, which are more explicit and easier to find. To state the definition, we consider a given relation 𝑦 = 𝑓(𝑡, 𝑐 1 , . . . , 𝑐 𝑁 ) and consider the power products (2.17)

$a = c_1^{\alpha_1} \cdots c_N^{\alpha_N}, \qquad b = c_1^{\beta_1} \cdots c_N^{\beta_N},$

where 𝛼1 , . . . , 𝛼𝑁 and 𝛽1 , . . . , 𝛽𝑁 are powers to be determined. In any dimensional basis {𝐷1 , . . . , 𝐷𝑚 }, each parameter 𝑐 𝑖 will have dimensional exponents Δ𝑐𝑖 ∈ ℝ𝑚 , and the power products 𝑎, 𝑏 will have dimensional exponents Δ𝑎 , Δ𝑏 ∈ ℝ𝑚 . As noted in the previous chapter, the relation between these exponents is (2.18)

$\Delta_a = A\alpha, \qquad \Delta_b = A\beta,$

where 𝐴 = (Δ𝑐1 , . . . , Δ𝑐𝑁 ) ∈ ℝ𝑚×𝑁 , 𝛼 = (𝛼1 , . . . , 𝛼𝑁 ) ∈ ℝ𝑁 and 𝛽 = (𝛽1 , . . . , 𝛽𝑁 ) ∈ ℝ𝑁 . Here, as before, all one-dimensional arrays are considered as columns, and we assume 𝑁 ≥ 1 and 𝑚 ≥ 1 with 𝑁 ≥ 𝑚. The condition that scales 𝑎, 𝑏 have the same dimensions as 𝑡, 𝑦 is equivalent to Δ𝑎 = Δ𝑡 and Δ𝑏 = Δ𝑦 . When these conditions are combined with (2.18), we obtain the following class of natural scales, which are more practical. Definition 2.4.2. Let 𝑦 = 𝑓(𝑡, 𝑐 1 , . . . , 𝑐 𝑁 ) be given, where 𝑐 1 , . . . , 𝑐 𝑁 > 0 are parameters, and let 𝛼 = (𝛼1 , . . . , 𝛼𝑁 ) and 𝛽 = (𝛽1 , . . . , 𝛽𝑁 ) be any powers satisfying (2.19)

$A\alpha = \Delta_t, \qquad A\beta = \Delta_y.$

Then the constants 𝑎, 𝑏 > 0 in (2.17) are natural scales. These constants are called associated scales for the function.


Unlike the characteristic case, any associated scales 𝑎, 𝑏 depend only on the parameters 𝑐 1 , . . . , 𝑐 𝑁 and not the specific form of a function 𝑓. The existence of these types of scales is not automatic. Depending on the matrix of exponents 𝐴, there may be no set of scales which can be expressed as power products, or there may be one or more sets of scales. Any set of associated scales provides a natural reference for measuring the variables 𝑡, 𝑦. In various cases, the characteristic scales defined above will be among the associated scales. Example 2.4.3. Here we find a set of associated scales for a function 𝑦 = 𝑓(𝑡, 𝑐 1 , 𝑐 2 , 𝑐 3 ), where 𝑡, 𝑦 are variables with dimensions [𝑡] = 𝑇 and [𝑦] = 𝑀, and 𝑐 1 ≥ 0 and 𝑐 2 , 𝑐 3 > 0 are parameters with dimensions [𝑐 1 ] = 1/𝑇, [𝑐 2 ] = 1/(𝑀𝑇) and [𝑐 3 ] = 𝑀/𝑇 2 . Note that, since 𝑐 1 = 0 is possible, we exclude this parameter from consideration since it may lead to a zero or undefined scale. Hence we seek scales 𝑎, 𝑏 > 0 depending on 𝑐 2 , 𝑐 3 > 0. In the dimensional basis {𝑇, 𝑀}, we have (2.20)

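$\Delta_t = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad \Delta_y = \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \qquad \Delta_{c_2} = \begin{pmatrix} -1 \\ -1 \end{pmatrix}, \qquad \Delta_{c_3} = \begin{pmatrix} -2 \\ 1 \end{pmatrix}.$

Introducing the power products $a = c_2^{\alpha_2} c_3^{\alpha_3}$ and $b = c_2^{\beta_2} c_3^{\beta_3}$, we consider the matrix $A = (\Delta_{c_2}, \Delta_{c_3})$ and the vectors $\alpha = (\alpha_2, \alpha_3)$ and $\beta = (\beta_2, \beta_3)$. The conditions $A\alpha = \Delta_t$ and $A\beta = \Delta_y$ become

(2.21)    $\begin{pmatrix} -1 & -2 \\ -1 & 1 \end{pmatrix} \begin{pmatrix} \alpha_2 \\ \alpha_3 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad \begin{pmatrix} -1 & -2 \\ -1 & 1 \end{pmatrix} \begin{pmatrix} \beta_2 \\ \beta_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix}.$

The first system requires $-\alpha_2 - 2\alpha_3 = 1$ and $-\alpha_2 + \alpha_3 = 0$, which has the solution $(\alpha_2, \alpha_3) = (-1/3, -1/3)$, which gives the scale $a = c_2^{-1/3} c_3^{-1/3}$. The second system requires $-\beta_2 - 2\beta_3 = 0$ and $-\beta_2 + \beta_3 = 1$, which has the solution $(\beta_2, \beta_3) = (-2/3, 1/3)$, which gives the scale $b = c_2^{-2/3} c_3^{1/3}$.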
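The two exponent systems in Example 2.4.3 are small enough to solve exactly by machine. The following is a minimal sketch using exact rational arithmetic; the helper `solve_2x2` is hypothetical and uses Cramer's rule, which is a convenient choice here rather than a method prescribed by the text.

```python
from fractions import Fraction

def solve_2x2(A, rhs):
    """Solve a 2x2 linear system A x = rhs exactly by Cramer's rule."""
    (p, q), (r, s) = A
    det = p * s - q * r
    if det == 0:
        raise ValueError("singular exponent matrix: no unique power product")
    x = Fraction(rhs[0] * s - q * rhs[1], det)
    y = Fraction(p * rhs[1] - rhs[0] * r, det)
    return x, y

# Columns of A are the dimensional exponents of c2 and c3 in the basis {T, M}.
A = [[-1, -2],
     [-1,  1]]
DELTA_T = (1, 0)   # exponents of t
DELTA_Y = (0, 1)   # exponents of y

alpha = solve_2x2(A, DELTA_T)   # powers (alpha_2, alpha_3) for the scale a
beta = solve_2x2(A, DELTA_Y)    # powers (beta_2, beta_3) for the scale b

if __name__ == "__main__":
    print("alpha =", alpha)
    print("beta  =", beta)
```

A zero determinant would signal that the chosen parameters cannot produce unique scales, which is one of the situations mentioned before the example.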

2.5. Scaling theorem

Under a scale transformation, a given function $y = f(t, c_1, \ldots, c_N)$ is converted into an equivalent scaled function

(2.22)    $\bar y = b^{-1} f(a\bar t, c_1, \ldots, c_N) = \bar f(\bar t, a, b, c_1, \ldots, c_N).$

The next result shows that, if $a, b$ are natural scales, then the scaled function is not only dimensionless, but also involves fewer parameters, thus making it easier to study. The result follows from an application of the $\pi$-theorem stated in Result 1.6.2.

Result 2.5.1. If $y = f(t, c_1, \ldots, c_N)$, $a = \rho(c_1, \ldots, c_N)$ and $b = \sigma(c_1, \ldots, c_N)$ are unit-free, then the scaled relation is equivalent to

(2.23)    $\bar y = \phi(\bar t, \mu_1, \ldots, \mu_M),$

for some function $\phi$ and parameters $\mu_1, \ldots, \mu_M$. The variables $\bar t, \bar y$ and parameters $\mu_1, \ldots, \mu_M$ are all dimensionless and $M \le N$. The case $\bar y = \phi(\bar t)$ with $M = 0$ parameters is possible.


Thus, up to a scale transformation with natural scales, the given relation $y = f(t, c_1, \ldots, c_N)$ is equivalent to $\bar y = \phi(\bar t, \mu_1, \ldots, \mu_M)$, which normally depends on fewer parameters, or possibly $\bar y = \phi(\bar t)$, which depends on no parameters. Note that the result is valid whether or not the relations are known explicitly. That is, if $f$ is defined implicitly as a solution of a differential equation with parameters $c_1, \ldots, c_N$, then $\phi$ is defined as a solution of the corresponding scaled equation with parameters $\mu_1, \ldots, \mu_M$. Hence these latter parameters can be identified by the simple process of applying the scale transformation to the differential equation, and writing it in dimensionless form.

The parameters $\mu_1, \ldots, \mu_M$ provide an intrinsic way to study and discuss the relation $y = f(t, c_1, \ldots, c_N)$. Whereas $c_1, \ldots, c_N$ may depend on a variety of units and be difficult to compare, $\mu_1, \ldots, \mu_M$ are independent of units and can always be compared. Thus any limiting, simplified, or otherwise special case of the relation can be naturally expressed in terms of conditions on $\mu_1, \ldots, \mu_M$. That is, these dimensionless parameters provide a way to classify properties of the original relation in a way that is independent of units.

Example 2.5.1. Let $y = f(t, c_1, c_2, c_3)$ be the solution of the initial-value problem

(2.24)    $\frac{dy}{dt} + c_1 y = c_2 t, \qquad y|_{t=0} = c_3, \qquad t \ge 0.$

We suppose $t, y$ are variables with dimensions $[t] = T$ and $[y] = L$, and $c_1, c_2, c_3 > 0$ are parameters with dimensions $[c_1] = T^{-1}$, $[c_2] = LT^{-2}$ and $[c_3] = L$. We seek to classify the family of solutions $y = f(t, c_1, c_2, c_3)$ for all possible values of $c_1, c_2, c_3 > 0$. For this purpose, we first determine a set of natural scales $a, b > 0$, and then examine the scaled form of the equation.

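For the scales, we consider the power products $a = c_1^{\alpha_1} c_2^{\alpha_2} c_3^{\alpha_3}$ and $b = c_1^{\beta_1} c_2^{\beta_2} c_3^{\beta_3}$. In the dimensional basis $\{T, L\}$, the equations $A\alpha = \Delta_t$ and $A\beta = \Delta_y$ become

(2.25)    $\begin{pmatrix} -1 & -2 & 0 \\ 0 & 1 & 1 \end{pmatrix} \begin{pmatrix} \alpha_1 \\ \alpha_2 \\ \alpha_3 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad \begin{pmatrix} -1 & -2 & 0 \\ 0 & 1 & 1 \end{pmatrix} \begin{pmatrix} \beta_1 \\ \beta_2 \\ \beta_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix}.$

The first system requires $-\alpha_1 - 2\alpha_2 = 1$ and $\alpha_2 + \alpha_3 = 0$, which has a simple solution $(\alpha_1, \alpha_2, \alpha_3) = (-1, 0, 0)$, which gives the natural scale $a = c_1^{-1}$. The second system requires $-\beta_1 - 2\beta_2 = 0$ and $\beta_2 + \beta_3 = 1$, which has a simple solution $(\beta_1, \beta_2, \beta_3) = (0, 0, 1)$, which gives the natural scale $b = c_3$. Under the scale transformation $t = a\bar t$ and $y = b\bar y$, we obtain the scaled equation

(2.26)    $\frac{d\bar y}{d\bar t} + \bar y = \frac{c_2}{c_1^2 c_3}\, \bar t, \qquad \bar y|_{\bar t=0} = 1, \qquad \bar t \ge 0.$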

The scaled problem in (2.26) involves only a single dimensionless parameter $\mu = c_2/(c_1^2 c_3)$, and will have a solution $\bar y = \phi(\bar t, \mu)$. Thus, up to a scale transformation, the original family of solutions $y = f(t, c_1, c_2, c_3)$ for all possible values of $c_1, c_2, c_3 > 0$ can be classified in terms of $\mu > 0$. In this case, the scaled equation can be solved using standard methods to get $\bar y = \mu\bar t - \mu + (1 + \mu)e^{-\bar t}$, and the influence of the parameter $\mu$ can be directly assessed. The scaled solution can be seen to be a convex function, with a minimum at $(\bar t, \bar y) = \left( \ln(1 + \tfrac{1}{\mu}),\ \mu \ln(1 + \tfrac{1}{\mu}) \right)$, and with a slant asymptote along the line $\mu\bar t - \mu$. Figure 2.7 illustrates the scaled solution for the three cases $\mu = 0.02$, $0.1$ and $1$.

Figure 2.7. [Graphs of the scaled solution $\bar y = \phi(\bar t, \mu)$ for $\mu = 0.02$, $0.1$ and $1$.]
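The closed-form scaled solution of Example 2.5.1 and the stated location of its minimum can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, and the residual is estimated with a central finite difference, which is a choice made here for simplicity.

```python
import math

def phi(tbar, mu):
    """Scaled solution: ybar = mu*tbar - mu + (1 + mu) * exp(-tbar)."""
    return mu * tbar - mu + (1.0 + mu) * math.exp(-tbar)

def minimum(mu):
    """Location and value of the minimum for mu > 0, as stated in the example."""
    t_star = math.log(1.0 + 1.0 / mu)
    return t_star, mu * t_star

def residual(tbar, mu, h=1e-5):
    """Finite-difference estimate of d(ybar)/d(tbar) + ybar - mu*tbar (ideally zero)."""
    dphi = (phi(tbar + h, mu) - phi(tbar - h, mu)) / (2.0 * h)
    return dphi + phi(tbar, mu) - mu * tbar

if __name__ == "__main__":
    for mu in (0.02, 0.1, 1.0):
        t_star, y_star = minimum(mu)
        print(f"mu={mu}: minimum near tbar={t_star:.4f}, ybar={y_star:.4f}, "
              f"residual at tbar=2: {residual(2.0, mu):.2e}")
```

Smaller values of $\mu$ push the minimum to later scaled times and lower values, consistent with the three curves described for Figure 2.7.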

Sketch of proof: Result 2.5.1. For notational convenience, we denote the list of parameters by $\gamma = (c_1, \ldots, c_N)$, and consider the unit-free equations $y = f(t, \gamma)$, $a = \rho(\gamma)$ and $b = \sigma(\gamma)$. We also consider the scale transformation $\bar t = t/a$ and $\bar y = y/b$, and the resulting scaled equation $\bar y = f(a\bar t, \gamma)/b$. Substituting for $a$ and $b$ in this equation we get $\bar y = F(\bar t, \gamma)$, where the function on the right is defined by $F(\bar t, \gamma) = f(\rho(\gamma)\bar t, \gamma)/\sigma(\gamma)$.

We next consider an arbitrary change of units as outlined in Result 1.4.1. By the change of units formula, the variables $t, y$ will be changed to $\tilde t, \tilde y$. Similarly, the scale factors $a, b$ will be changed to $\tilde a, \tilde b$, and the parameters $\gamma$ will be changed to $\tilde\gamma$. The scaled variables $\bar t, \bar y$ will also be changed to $\tilde{\bar t}, \tilde{\bar y}$. However, since they are dimensionless and their dimensional exponents are zero, we get $\tilde{\bar t} = \bar t$ and $\tilde{\bar y} = \bar y$. Note that, by the unit-free property, we have $\tilde y = f(\tilde t, \tilde\gamma)$, $\tilde a = \rho(\tilde\gamma)$ and $\tilde b = \sigma(\tilde\gamma)$.

The scale transformation can also be applied to the variables in the new units. The scaled versions of these variables are $\bar{\tilde t} = \tilde t/\tilde a$ and $\bar{\tilde y} = \tilde y/\tilde b$. Using the dimensional exponent relations $\Delta_a = \Delta_t$ and $\Delta_b = \Delta_y$, which hold since $a$ and $b$ are scale factors, together with the change of units formula, we find that $\tilde t/\tilde a = t/a$ and $\tilde y/\tilde b = y/b$. From this we deduce the useful relations $\bar{\tilde t} = \bar t = \tilde{\bar t}$ and $\bar{\tilde y} = \bar y = \tilde{\bar y}$, and also $\tilde t = \tilde a\,\bar{\tilde t} = \tilde a\,\tilde{\bar t}$ and $\tilde y = \tilde b\,\bar{\tilde y} = \tilde b\,\tilde{\bar y}$.

We can now examine the effect of a change of units on the relation $\bar y = F(\bar t, \gamma)$. From the definition of the function, using the fact that $\tilde t = \tilde a\,\tilde{\bar t}$, we observe that $F(\tilde{\bar t}, \tilde\gamma) = f(\rho(\tilde\gamma)\tilde{\bar t}, \tilde\gamma)/\sigma(\tilde\gamma) = f(\tilde t, \tilde\gamma)/\tilde b$, which leads to the result that $F(\tilde{\bar t}, \tilde\gamma) = \tilde{\bar y}$. Thus the relation $\bar y = F(\bar t, \gamma)$ is unit-free, where $\gamma$ is brief notation for $c_1, \ldots, c_N$.

The $\pi$-theorem in Result 1.6.2 can now be applied. Since $\bar y$ and $\bar t$ are dimensionless, the set of quantities $\bar y, \bar t, c_1, \ldots, c_N$ has a full, normalized set of dimensionless power products $\pi_1, \ldots, \pi_k$ for some $2 \le k \le N + 2$. The power products can be chosen so that $\pi_1 = \bar y$ and $\pi_2 = \bar t$, with $\pi_3, \ldots, \pi_k$ dependent only on $c_1, \ldots, c_N$; specifically, since $\bar y$ and $\bar t$ are dimensionless, they would not contribute to any power product with $c_1, \ldots, c_N$. The $\pi$-theorem then implies $\bar y = \phi(\bar t, \pi_3, \ldots, \pi_k)$, which establishes the result.


2.6. Case study

Setup. To illustrate the preceding results on scaling, and the process of modelling a chemical system, we study the evolution of an elementary reaction. Figure 2.8 illustrates the system, which consists of a closed reaction chamber or tank, filled with a fluid solution containing three chemical substances $X$, $Y$, and $Z$. The concentrations of the substances, in units of molecules per volume, are denoted by $x$, $y$, and $z$. The three dimensions $[x]$, $[y]$, and $[z]$ are the same and equal to $M_{\mathrm{cl}}/V$, where $M_{\mathrm{cl}}$ means Molecules, and $V$ means Volume. We consider a reaction described by the chemical equation $X + Y \xrightarrow{k} 2Z$, in which one molecule of $X$ and one molecule of $Y$ combine to form two molecules of $Z$. Here $X$ and $Y$ are the reactants, $Z$ is the product, and $k > 0$ is a reaction rate constant that will be described below. Beginning from conditions $x = x_0 > 0$, $y = y_0 > 0$ and $z = 0$ at time $t = 0$, we seek to understand how the reaction evolves in time; for example, how the concentration versus time curves depend on the parameters $k$, $x_0$, and $y_0$.

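Figure 2.8. [A closed reaction tank in which the reaction $X + Y \xrightarrow{k} 2Z$ takes place.]

Background. In an elementary reaction of the form $mX + nY \xrightarrow{k} pZ$, the production of $p$ molecules of $Z$ requires a pairing of $m$ molecules of $X$ and $n$ molecules of $Y$. (Here pairing means an appropriate chemical event involving the reactants.) The rate constant $k$ is defined such that

(2.27)    $\frac{\#\ \text{Pairings}}{\text{Time} \cdot \text{Volume}} = k x^m y^n.$

Thus, in a reaction $mX + nY \xrightarrow{k} pZ$, we deduce that

(2.28)    $\frac{\#\ \text{of}\ X\ \text{consumed}}{\text{Time} \cdot \text{Volume}} = \left( \frac{\#\ \text{of}\ X}{\text{Pairing}} \right) \left( \frac{\#\ \text{Pairings}}{\text{Time} \cdot \text{Volume}} \right) = m\, k x^m y^n.$

Similarly, for $Y$ and $Z$ we deduce

(2.29)    $\frac{\#\ \text{of}\ Y\ \text{consumed}}{\text{Time} \cdot \text{Volume}} = n\, k x^m y^n, \qquad \frac{\#\ \text{of}\ Z\ \text{produced}}{\text{Time} \cdot \text{Volume}} = p\, k x^m y^n.$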

Model equations. We can now outline a set of equations to describe the evolution of the reaction in our tank. The basic principle that we employ is balance of mass: the rate of change of the number of any chemical species in the tank must equal the rate of production minus consumption in any reactions, plus the rate of supply minus removal by external sources. Considering only the single reaction $X + Y \xrightarrow{k} 2Z$, in a closed tank with no external sources, we have

(2.30)    $\frac{d}{dt} \left( \frac{\#\ \text{of}\ X}{\text{Volume}} \right) = -\frac{\#\ \text{of}\ X\ \text{consumed}}{\text{Time} \cdot \text{Volume}},$

or equivalently,

(2.31)    $\frac{dx}{dt} = -1\, kxy.$

Similarly, for the consumption of $Y$ and production of $Z$, we have

(2.32)    $\frac{dy}{dt} = -1\, kxy, \qquad \frac{dz}{dt} = +2\, kxy.$
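The model equations for $x$, $y$, and $z$ can be integrated directly to see how the concentrations evolve. The sketch below is an illustration under stated assumptions: it uses a classical fixed-step Runge-Kutta scheme (a choice made here), the function name `simulate` is hypothetical, and the parameter values in the usage lines are arbitrary. Since the three right-hand sides are multiples of the same rate $kxy$, the combinations $y - x$ and $z + 2x$ should remain constant along the computed solution.

```python
def simulate(k, x0, y0, t_end, n=4000):
    """Integrate dx/dt = -kxy, dy/dt = -kxy, dz/dt = 2kxy from (x0, y0, 0) with RK4."""
    def rhs(s):
        x, y, _ = s
        rate = k * x * y                 # pairings per unit time and volume
        return (-rate, -rate, 2.0 * rate)

    h = t_end / n
    s = (x0, y0, 0.0)
    for _ in range(n):
        k1 = rhs(s)
        k2 = rhs(tuple(si + h / 2 * ki for si, ki in zip(s, k1)))
        k3 = rhs(tuple(si + h / 2 * ki for si, ki in zip(s, k2)))
        k4 = rhs(tuple(si + h * ki for si, ki in zip(s, k3)))
        s = tuple(si + h / 6 * (p + 2 * q + 2 * r + w)
                  for si, p, q, r, w in zip(s, k1, k2, k3, k4))
    return s

if __name__ == "__main__":
    x, y, z = simulate(k=0.5, x0=2.0, y0=3.0, t_end=4.0)   # arbitrary values
    print("x, y, z      =", x, y, z)
    print("y - x        =", y - x, "(initially 1.0)")
    print("z + 2x       =", z + 2.0 * x, "(initially 4.0)")
```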

Simplification. The differential equations for the concentrations $x$, $y$, and $z$ can be simplified. From (2.31) and (2.32) we observe that $\frac{dy}{dt} = \frac{dx}{dt}$, which implies $y(t) = x(t) + c$ for all $t \ge 0$, where $c$ is a constant determined by initial conditions; specifically, $y_0 = x_0 + c$ or equivalently $c = y_0 - x_0$. Similarly, from (2.31) and (2.32) we also observe that $\frac{dz}{dt} = -2\frac{dx}{dt}$, which implies $z(t) = -2x(t) + d$ for all $t \ge 0$, where $d$ is a constant again determined by initial conditions; specifically, $0 = -2x_0 + d$ which gives $d = 2x_0$. By substituting the relation $y = x + c$ into (2.31), we obtain a self-contained equation for $x$, namely $\frac{dx}{dt} = -kx(x + c)$. Thus we focus attention on this equation and note that, once $x$ is known as a function of time $t$, then so will be $y$ and $z$.

Analysis. We consider the initial-value problem in variables $t, x$ given by

(2.33)    $\frac{dx}{dt} = -kx(x + c), \qquad x|_{t=0} = x_0, \qquad t \ge 0,$

where $k$, $x_0$, and $c$ are parameters. Whereas $k$ and $x_0$ are positive, we note that $c = y_0 - x_0$ could be positive, zero, or negative. To characterize the solution $x = f(t, k, x_0, c)$ for all possible values of the parameters, we first find a set of natural scales, and then study the resulting scaled problem.

The variables $t$ and $x$ have dimensions of time and concentration, that is, $[t] = T$ and $[x] = Q$, where $Q = M_{\mathrm{cl}}/V$. Since there is no need to consider $M_{\mathrm{cl}}$ or $V$ individually, we use $\{T, Q\}$ as the dimensional basis. By definition, the parameters $x_0$ and $c$ have dimensions of concentration, so $[x_0] = Q$ and $[c] = Q$. The dimensions of the parameter $k$ can be deduced from the differential equation. Specifically, dimensional consistency requires $[\frac{dx}{dt}] = [kx(x + c)]$, which by properties of dimensions implies $\frac{Q}{T} = [k]Q^2$, which gives $[k] = \frac{1}{TQ}$.

We consider natural scales $a$ and $b$ depending on parameters $k$, $x_0$, and $c$. Initially, we consider the power products $a = k^{\alpha_1} x_0^{\alpha_2} |c|^{\alpha_3}$ and $b = k^{\beta_1} x_0^{\beta_2} |c|^{\beta_3}$. However, since $c = 0$ is possible, we exclude this parameter from the power products to avoid any case with $a = 0$ or $b = 0$, for which the scale transformation is undefined. Hence we consider the reduced expressions $a = k^{\alpha_1} x_0^{\alpha_2}$ and $b = k^{\beta_1} x_0^{\beta_2}$, and by the usual method we find $a = \frac{1}{k x_0}$ and $b = x_0$.
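The exponent calculation behind these scales can be written out in the manner of Example 2.4.3. The following is a worked sketch; the ordering of the basis $\{T, Q\}$ and the layout of the matrix are choices made here for illustration. With $[k] = T^{-1}Q^{-1}$ and $[x_0] = Q$, the dimensional exponents and the resulting systems $A\alpha = \Delta_t$, $A\beta = \Delta_x$ are

$$
\Delta_t = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad
\Delta_x = \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \quad
\Delta_k = \begin{pmatrix} -1 \\ -1 \end{pmatrix}, \quad
\Delta_{x_0} = \begin{pmatrix} 0 \\ 1 \end{pmatrix},
$$

$$
\begin{pmatrix} -1 & 0 \\ -1 & 1 \end{pmatrix}
\begin{pmatrix} \alpha_1 \\ \alpha_2 \end{pmatrix}
= \begin{pmatrix} 1 \\ 0 \end{pmatrix}
\ \Longrightarrow\ (\alpha_1, \alpha_2) = (-1, -1),
\qquad
\begin{pmatrix} -1 & 0 \\ -1 & 1 \end{pmatrix}
\begin{pmatrix} \beta_1 \\ \beta_2 \end{pmatrix}
= \begin{pmatrix} 0 \\ 1 \end{pmatrix}
\ \Longrightarrow\ (\beta_1, \beta_2) = (0, 1).
$$

These powers give $a = k^{-1} x_0^{-1}$ and $b = x_0$, in agreement with the scales stated above.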

Under the scale transformation $t = a\bar t$ and $x = b\bar x$, we obtain the scaled problem

(2.34)    $\frac{d\bar x}{d\bar t} = -\bar x(\bar x + \mu), \qquad \bar x|_{\bar t=0} = 1, \qquad \bar t \ge 0.$

In the above, there is only a single dimensionless parameter $\mu = \frac{c}{b} = \frac{y_0}{x_0} - 1$, and from the fact that $x_0$ and $y_0$ are both positive, we find that $\mu \in (-1, \infty)$. Note that the solution of the scaled problem is a function $\bar x = \phi(\bar t, \mu)$, and the solution of the original problem will just be a stretched version of this function. Thus solutions can be classified in terms of the single parameter $\mu$.

The form and character of the solution $\bar x = \phi(\bar t, \mu)$ depends on whether $\mu \in (-1, 0)$, $\mu = 0$, or $\mu \in (0, \infty)$. In all cases, the solution is positive, monotone, decreasing, and concave up. However, there are some interesting qualitative differences between the three cases. When $\mu = 0$, the differential equation can be solved using separation of variables, and the solution is

(2.35)    $\bar x = \frac{1}{\bar t + 1}, \qquad \bar t \ge 0.$

Note that $\bar x \to 0$ as $\bar t \to \infty$. Thus the chemical $X$ becomes totally consumed or depleted as the reaction evolves, and the rate at which depletion occurs is algebraic in time, specifically, $\bar x \to 0$ as quickly as $\frac{1}{\bar t} \to 0$.

When $\mu \ne 0$, the differential equation can also be solved using separation of variables, but now a partial fraction decomposition is required to complete the integration, and the solution is

(2.36)    $\bar x = \frac{\mu}{(1 + \mu)e^{\mu\bar t} - 1}, \qquad \bar t \ge 0.$

When $\mu > 0$, we note that $e^{\mu\bar t} \to \infty$, and again we find that $\bar x \to 0$ as $\bar t \to \infty$. Thus the chemical $X$ again becomes totally depleted as the reaction evolves, but now the rate of depletion is exponential in time, specifically, $\bar x \to 0$ as quickly as $\frac{1}{e^{\mu\bar t}} \to 0$.

Interestingly, when $\mu < 0$, we get the result that $\bar x \to -\mu > 0$ as $\bar t \to \infty$. In contrast to before, we now have $e^{\mu\bar t} \to 0$ since $\mu$ is negative and $\mu\bar t \to -\infty$. Thus in this case the chemical $X$ is only partially consumed as the reaction evolves. (The chemical $Y$, which is necessary for the reaction, is depleted before $X$.) In this case, the rate at which $\bar x \to -\mu$ is also exponential in time.
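The three cases of the case study can be collected in one function and mapped back to the original variables. The sketch below is illustrative only: the function names are hypothetical, and the mapping uses the scales $a = 1/(k x_0)$, $b = x_0$ and the parameter $\mu = y_0/x_0 - 1$ from the analysis above.

```python
import math

def phi(tbar, mu):
    """Scaled concentration: 1/(tbar + 1) if mu == 0, else mu/((1 + mu) e^{mu tbar} - 1)."""
    if mu <= -1.0:
        raise ValueError("mu must lie in (-1, infinity)")
    if mu == 0.0:
        return 1.0 / (tbar + 1.0)
    return mu / ((1.0 + mu) * math.exp(mu * tbar) - 1.0)

def concentration_x(t, k, x0, y0):
    """Original-variable solution x(t) = b * phi(t / a, mu) with a = 1/(k x0), b = x0."""
    a, b = 1.0 / (k * x0), x0
    mu = y0 / x0 - 1.0
    return b * phi(t / a, mu)

if __name__ == "__main__":
    for mu in (-0.5, 0.0, 2.0):
        values = [phi(tbar, mu) for tbar in (0.0, 1.0, 10.0, 100.0)]
        print(f"mu={mu}:", ["%.3e" % v for v in values])
```

For $\mu < 0$ the printed values level off near $-\mu$, while for $\mu = 0$ and $\mu > 0$ they decay to zero at algebraic and exponential rates respectively, matching the discussion above. Values of $\mu$ extremely close to zero may lose some accuracy in the second branch because of cancellation in the denominator.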

Reference notes

The elementary mathematical technique of scaling is an important tool of analysis. When applied to a function or an equation, it leads to a simpler, normalized form that is easier to study. In later chapters, this technique will be used to emphasize the properties of functions and equations at different scales, and transform them into more convenient and revealing forms. For discussions of scaling within different contexts, see Holmes (2019), Lin and Segel (1988), and Logan (2013).

Exercises

1. Prove the derivative relations in Result 2.3.1. Specifically, assuming the derivatives exist, show

$\frac{d^k\bar y}{d\bar t^{\,k}} = \frac{a^k}{b} \frac{d^k y}{dt^k}, \qquad k = 1, \ldots, n.$

2. Consider a function $y = f(t)$, where $[t] = T$ and $[y] = Y$, and let $v = \frac{dy}{dt}$ and $w = \frac{dv}{dt}$.

(a) Use properties of dimensions to find $[v]$, $[w]$ in terms of $T$, $Y$.

(b) Use induction to find $[\frac{d^n y}{dt^n}]$ for any $n \ge 1$.


3. Let $t, y$ and $p, q, r$ be quantities with given units in the basis $\{L, T\}$. Also, let $v = \frac{dy}{dt}$, $w = \frac{d^2y}{dt^2}$ and $z = \frac{d^3y}{dt^3}$. Assuming $[t] = T$ and $[y] = L$, find the dimensions $[p]$, $[q]$, $[r]$ as needed to make the given equation unit-free.

(a) $v = py - qt - r$.

(b) $v = py^2 + qt^2 + rty$.

(c) $w = py - qty - rv$.

(d) $z = py^2 + qt + rw$.

4. Find characteristic scales for the function on the interval $t \in [0, t_0]$, where $t_0 > 0$ is fixed but arbitrary. Here $t, y$ are variables and $c_k > 0$ are parameters. [For parts (c) and (d) you may introduce simplifying assumptions on $c_k$ and $t_0$ if needed.]

(a) $y = c_1 t + c_2 t^2 + c_3 t^3$.

(b) $y = c_1 + c_2 e^{-t/c_3}$.

(c) $y = c_3 e^{-c_1 t - c_2 t^2}$.

(d) $y = c_1 t + c_2 \cos(\pi t/c_3)$.

5. Let $t, y$ be variables with dimensions $[t] = T$, $[y] = Y$, and let $c_k > 0$ be parameters. For each function below, determine the dimensions of $c_k$, and find a set of associated scales.

(a) $y = c_1 t + c_2 t^2 + c_3 t^3$.

(b) $y = c_1 + c_2 e^{-t/c_3}$.

(c) $y = c_3 e^{-c_1 t - c_2 t^2}$.

(d) $y = c_1 t + c_2 \cos(\pi t/c_3)$.

6. Let $y = f(t, c_1, c_2, c_3)$ be the solution of the following problem, where $c_1, c_2, c_3 > 0$ are parameters, and $[t] = T$ and $[y] = LT^{-1}$.

$\frac{dy}{dt} = c_1 - c_2 y, \qquad y|_{t=0} = c_3, \qquad t \ge 0.$

(a) Find a set of associated scales $a, b > 0$ for the problem.

(b) Find the scaled problem and identify the single independent dimensionless parameter $\mu$ that appears.

(c) Find the solution $\bar y = \phi(\bar t, \mu)$ of the scaled problem. Describe the qualitative behavior of the solution when $\mu < 1$, $\mu = 1$ and $\mu > 1$.

7. Let $y = f(t, c_1, c_2)$ be the solution of the following problem, where $c_1, c_2 > 0$ are parameters, and $[t] = T$ and $[y] = M$.

$\frac{dy}{dt} = -c_1 y^2, \qquad y|_{t=0} = c_2, \qquad t \ge 0.$

(a) Find a set of associated scales $a, b > 0$ for the problem.

(b) Find the scaled problem and show that its solution is a fixed function $\bar y = \phi(\bar t)$ involving no parameters.

(c) What effect do $c_1, c_2 > 0$ have on $y = f(t, c_1, c_2)$? Can this solution qualitatively change as $c_1, c_2$ are varied?


8. Let $y = f(t, r, p, v_0, y_0)$ be the solution of the following problem, where $p, y_0 > 0$ and $r, v_0 \ge 0$ are parameters, and $[t] = T$ and $[y] = Y$.

$\frac{d^2y}{dt^2} = r - py, \qquad \frac{dy}{dt}\Big|_{t=0} = v_0, \qquad y|_{t=0} = y_0, \qquad t \ge 0.$

(a) Find a set of associated scales for $t, y$ involving only $p, y_0 > 0$.

(b) Find the scaled problem and show that it contains two independent dimensionless parameters $\mu_1, \mu_2$.

(c) Find the solution $\bar y = \phi(\bar t, \mu_1, \mu_2)$ of the scaled problem.

9. Let $y = f(t, m, c, k, y_0)$ be the solution of the following problem, where $m, k, y_0 > 0$ and $c \ge 0$ are parameters, and $[m] = M$, $[t] = T$ and $[y] = L$.

$m\frac{d^2y}{dt^2} = -c\frac{dy}{dt} - ky, \qquad \frac{dy}{dt}\Big|_{t=0} = 0, \qquad y|_{t=0} = y_0, \qquad t \ge 0.$

(a) Find a set of associated scales for $t, y$ involving only $m, k, y_0 > 0$.

(b) Find the scaled problem and show that it contains only one independent dimensionless parameter $\mu$.

(c) Show that the scaled solution type will be purely trigonometric, purely exponential, or some combination type depending on $\mu$; give explicit conditions for each type.

10. The temperature $u$ at time $t$ of a chemically reacting body in a furnace is modeled by the following problem, where $k, u_*, u_0 > 0$ and $c, h \ge 0$ are parameters, and $[t] = T$ and $[u] = \Theta$.

$\frac{du}{dt} = c\, e^{-h/u} - k(u - u_*), \qquad u|_{t=0} = u_0, \qquad t \ge 0.$

(a) Find a set of associated scales for $t, u$ involving only $k, u_* > 0$.
(b) Find the scaled problem and show that it contains three independent dimensionless parameters.
(c) Which dimensionless parameter must be very small to obtain the approximate equation $du/dt \approx 1 - u$?

11. The displacement $x$ at time $t$ of a nonlinear spring-mass system is modeled by the following problem, where $m, \sigma, h > 0$ and $c, x_0 \ge 0$ are parameters, and $[m] = M$, $[t] = T$ and $[x] = L$.

$$m \frac{d^2 x}{dt^2} = -c \frac{dx}{dt} - \sigma x^3, \qquad m \left.\frac{dx}{dt}\right|_{t=0} = h, \qquad x|_{t=0} = x_0, \qquad t \ge 0.$$


(a) Find a set of associated scales for $t, x$ involving only $m, \sigma, h > 0$.
(b) Find the scaled problem. How many independent dimensionless parameters does it contain?
(c) Which dimensionless parameter must be very small to obtain the approximate equation $d^2 x/dt^2 \approx -x^3$?

12. In a simple ecosystem, the number of prey $x$ and predators $y$ at time $t$ are modeled by the following problem, where $\alpha, \beta, \gamma, \delta > 0$ are parameters, and $[t]$ = Time and $[x]$ = Prey and $[y]$ = Predator.

$$\frac{dx}{dt} = \alpha x - \beta x y, \qquad \frac{dy}{dt} = -\gamma y + \delta x y, \qquad x|_{t=0} = x_0, \qquad y|_{t=0} = y_0.$$

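(a) Find the dimensions of $\alpha, \beta, \gamma, \delta$.
(b) Find a set of scales $a, b, c > 0$ for $t, x, y$ so that the scaled differential equations become $dx/dt = x - xy$ and $dy/dt = -\mu y + xy$, where $\mu$ is a dimensionless parameter.

Mini-project. A model for the vertical motion of a projectile or ball tossed up into the air is

$$m \frac{d^2 y}{dt^2} = -c \frac{dy}{dt} - m g, \qquad \left.\frac{dy}{dt}\right|_{t=0} = v_0, \qquad y|_{t=0} = 0, \qquad t \ge 0.$$

[Figure: a projectile of mass $m$ at height $y$ above the ground, with initial velocity $v_0$, subject to gravity $g$ and air resistance $c$.]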

Here $y$ is the projectile height above the ground, $m$ is the projectile mass, $c$ is an air resistance coefficient, $g$ is gravitational acceleration, and $t$ is time. In the ideal case when $c = 0$, so there is no air resistance, the height versus time curve is perfectly symmetric: the duration and shape of the ascent portion of the curve is the same as the descent portion. However, when $c > 0$, the curve is no longer symmetric: the height versus time curve becomes skewed, and the time it takes to ascend and descend are no longer equal to each other. Here we use a scaled form of the equations to explore how the asymmetry depends on the air resistance parameter $c \ge 0$ and other parameters $m, g, v_0 > 0$.

(a) Find scales for $t, y$ involving only $m, g, v_0$. Show that the scaled equations take the following form, for an appropriate dimensionless parameter $\mu$,

$$\frac{d^2 y}{dt^2} + \mu \frac{dy}{dt} + 1 = 0, \qquad \left.\frac{dy}{dt}\right|_{t=0} = 1, \qquad y|_{t=0} = 0, \qquad t \ge 0.$$

(b) Solve the system in (a) for the case 𝜇 = 0. Verify that the solution is an inverted parabola as shown. By hand, find 𝑡1 and 𝑡2 . Show the ascent interval [0, 𝑡1 ] is the same size as the descent interval [𝑡1 , 𝑡2 ], that is, 𝑡2 = 2𝑡1 .

[Figure: height $y$ versus time $t$, with the times $t_1$ and $t_2$ marked on the $t$-axis.]

(c) Solve the system in (a) for the case $\mu = 0.5$. Verify by plotting that the solution is concave down, with intercepts at $0$ and $t_2$, and a maximum at $t_1$, for some $0 < t_1 < t_2$. By hand, find $t_1$, and verify that $y|_{t=2t_1} > 0$. Explain how this implies that the descent interval $[t_1, t_2]$ is larger than the ascent interval $[0, t_1]$.

(d) Show the results in (c) hold for arbitrary $\mu > 0$. Does the asymmetry in the ascent and descent intervals become more or less pronounced as $\mu$ increases from zero? How do the maximum height and total flight interval $[0, t_2]$ qualitatively change as $\mu$ increases from zero?

Chapter 3

One-dimensional dynamics

Mathematical models often take the form of differential equations that express how quantities change over time. Such equations may be first- or higher-order, linear or nonlinear, autonomous or nonautonomous, and may involve a number of parameters. In this chapter, we consider models in the form of a single, autonomous, first-order differential equation, and study various qualitative and geometric properties of their solutions, and explore how such properties depend on parameters.

3.1. Preliminaries

In the modeling of simple systems, we will often consider a first-order initial-value problem of the form

$$\frac{du}{dt} = f(u, c_1, \ldots, c_N), \qquad u|_{t=0} = u_0, \qquad t \ge 0, \tag{3.1}$$

where $u, t$ are real variables, $u_0, c_1, \ldots, c_N$ are real parameters, and $f$ is a given function. The system in (3.1) is called a dynamical system for $u$. A solution is a function $u = u(t, u_0, c_1, \ldots, c_N)$, which is differentiable in $t$, and satisfies (3.1). For brevity, we will often omit the parameters and only show the variables in a functional relation. Hence we will abbreviate the equation as $du/dt = f(u)$, and a solution as $u = u(t)$. Parameters will be indicated when they are essential to a discussion.

A basic goal in the study of (3.1) is to characterize how a solution $u(t)$ depends on the parameters. For the moment, we fix $c_1, \ldots, c_N$, and only consider the dependence on $u_0$; the dependence on other parameters will be explored later.

A solution $u(t)$ can be viewed in two different ways as illustrated in Figure 3.1. In a time view, a solution is viewed as a graph in the $t, u$-plane, and $du/dt = f(u)$ is the slope of this graph. Alternatively, in a phase view, a solution is viewed as a moving point on the $u$-axis. This point can only move along the axis, in the positive and negative directions, and $du/dt = f(u)$ is the velocity of the point, which depends on its position.

Figure 3.1. [Two views of a solution: a time view in the $t, u$-plane and a phase view on the $u$-axis, with the values $u_0$ and $u_1$ marked.]

3.2. Solvability theorem

The question of existence and uniqueness of solutions to (3.1) is addressed in the following result from the theory of ordinary differential equations. The set of all points where $f(u)$ is continuously differentiable is denoted by $D$, assumed to be an open set in the real line $\mathbb{R}$.

Result 3.2.1. The system in (3.1) has a unique solution $u(t) \in D$ for any $u_0 \in D$. Moreover:

(i) $u(t)$ exists and is in $D$ for some maximal interval $t \in (T_-, T_+)$, where $T_- < 0$ and $T_+ > 0$ depend on $u_0$,

(ii) if $T_-$ or $T_+$ is finite, then $u(t)$ leaves $D$ at $t = T_-$ or $t = T_+$,

(iii) if $u_0 \ne \hat{u}_0$, then $u(t) \ne \hat{u}(t)$ while both solutions exist in $D$.

Thus, within the set $D$ where $f(u)$ is continuously differentiable, the system in (3.1) has a unique solution $u(t)$ for any given $u_0$. This solution exists on some maximal, open interval of time that contains $t = 0$. While we focus on times $t \ge 0$, solutions are also defined for $t \le 0$, whether such times are relevant or not. Moreover, solutions with different initial conditions cannot intersect in the time view. At an intersection, two solution curves would extend from one point, which would violate uniqueness.

Regardless of whether $D = \mathbb{R}$ or $D \subset \mathbb{R}$, the maximal existence interval for $u(t)$ depends on $u_0$. This interval is the largest for which $u(t)$ is in $D$. Thus, if $T_-$ or $T_+$ is finite, then $u(t)$ must leave $D$ at that time. Outside of the set $D$, solutions $u(t)$ may not be unique or may not exist, or the function $f(u)$ and the differential equation may not be defined.

Example 3.2.1. Consider $du/dt = 3u$, $u|_{t=0} = u_0$. Since $f(u) = 3u$ is continuously differentiable for all $u \in \mathbb{R}$, Result 3.2.1 guarantees a unique solution for any $u_0 \in \mathbb{R}$. By the methods of calculus we find $u(t) = u_0 e^{3t}$. By properties of the exponential, this solution exists and is in $\mathbb{R}$ for $t \in (-\infty, \infty)$ for any $u_0$. Moreover, for any $u_0 \ne \hat{u}_0$, we get $u_0 e^{3t} \ne \hat{u}_0 e^{3t}$, or equivalently $u(t) \ne \hat{u}(t)$, and solutions do not intersect for all time as illustrated in Figure 3.2(a).
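The closed form in Example 3.2.1 can be cross-checked numerically. The following is a minimal sketch, assuming a classical fourth-order Runge-Kutta scheme; the helper name `rk4`, the end time 0.5 and the step count are illustrative choices that do not come from the text.

```python
import math

def rk4(f, u0, t_end, steps):
    """Integrate du/dt = f(u) from t = 0 to t_end with the classical RK4 scheme."""
    h = t_end / steps
    u = u0
    for _ in range(steps):
        k1 = f(u)
        k2 = f(u + 0.5 * h * k1)
        k3 = f(u + 0.5 * h * k2)
        k4 = f(u + h * k3)
        u += (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return u

def f(u):
    # Right-hand side of Example 3.2.1.
    return 3.0 * u

# Compare the numerical value at t = 0.5 with u0 * exp(3 t) for three initial conditions.
for u0 in (2.0, 0.0, -2.0):
    numeric = rk4(f, u0, 0.5, 200)
    exact = u0 * math.exp(3.0 * 0.5)
    print(u0, numeric, exact)
```

The three printed pairs should agree closely, and the three curves keep their initial ordering, consistent with part (iii) of Result 3.2.1.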

Example 3.2.2. Consider $du/dt = u^2$, $u|_{t=0} = u_0$. Since $f(u) = u^2$ is continuously differentiable for all $u \in \mathbb{R}$, Result 3.2.1 guarantees a unique solution for any $u_0 \in \mathbb{R}$. By the methods of calculus we find $u(t) = u_0/(1 - u_0 t)$. Due to the denominator, the existence interval depends on $u_0$. The solution exists and is in $\mathbb{R}$ for $t \in (-\infty, \infty)$ if $u_0 = 0$, for $t \in (-\infty, 1/u_0)$ if $u_0 > 0$, and for $t \in (1/u_0, \infty)$ if $u_0 < 0$. In the latter two cases, the solution has a vertical asymptote, and $u(t)$ leaves the set $\mathbb{R}$ at the finite end point $t = 1/u_0$ of the existence interval. As before, for any $u_0 \ne \hat{u}_0$, we have $u(t) \ne \hat{u}(t)$ while both solutions exist as illustrated in Figure 3.2(b).

Figure 3.2. [Solution curves in the $t, u$-plane for $u_0 = 2$, $u_0 = 0$ and $u_0 = -2$: (a) Example 3.2.1, (b) Example 3.2.2.]
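The dependence of the existence interval on $u_0$ in Example 3.2.2 can be made concrete with a short sketch. The function names `existence_interval` and `u_exact` are hypothetical helpers introduced only for illustration; they encode the closed form and the three cases stated in the example.

```python
import math

def existence_interval(u0):
    """Maximal interval (T_minus, T_plus) for du/dt = u^2 with u(0) = u0."""
    if u0 > 0:
        return (-math.inf, 1.0 / u0)
    if u0 < 0:
        return (1.0 / u0, math.inf)
    return (-math.inf, math.inf)

def u_exact(t, u0):
    """Closed-form solution u0 / (1 - u0 t), valid inside the existence interval."""
    return u0 / (1.0 - u0 * t)

u0 = 2.0
print(existence_interval(u0))
# The solution grows without bound as t approaches the end point 1/u0 = 0.5.
for t in (0.4, 0.49, 0.499):
    print(t, u_exact(t, u0))

# Centered difference check that the closed form satisfies du/dt = u^2 at t = 0.1.
eps = 1e-6
slope = (u_exact(0.1 + eps, u0) - u_exact(0.1 - eps, u0)) / (2.0 * eps)
print(slope, u_exact(0.1, u0) ** 2)
```

The growing values near $t = 0.5$ illustrate the vertical asymptote, and the last line compares a finite-difference slope with $u^2$ at one sample time.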

Example 3.2.3. Consider $du/dt = -1/u$, $u|_{t=0} = u_0$. Since $f(u) = -1/u$ is continuously differentiable in the open set $D = \{u \mid u \ne 0\}$, Result 3.2.1 guarantees a unique solution for any $u_0 \in D$, say $u_0 > 0$. By the methods of calculus we find $u(t) = \sqrt{u_0^2 - 2t}$. Due to the square root, the existence interval depends on $u_0$. Specifically, $u(t)$ exists and is in $D$ for $t \in (-\infty, \tfrac{1}{2} u_0^2)$. Note that $u(t)$ leaves $D$ at the finite end point $t = \tfrac{1}{2} u_0^2$ of the interval. At this end point we get $u = 0$, but $du/dt$ and $f(u)$ and the differential equation are undefined (infinite) at $u = 0$. Two sample solution curves are illustrated in Figure 3.3.

Figure 3.3. [Solution curves in the $t, u$-plane for $u_0 = 1$ and $u_0 = 2$, ending on the boundary of $D$.]

3.3. Equilibria

Here we consider a special class of solutions to (3.1) that are constant in time. As we will see, knowledge of such constant solutions will be helpful in developing a qualitative or geometric understanding of arbitrary solutions.

Definition 3.3.1. A solution of (3.1) is called an equilibrium or steady state if it is constant in time, that is

$$u(t) \equiv u_* \quad \text{for all} \quad t \ge 0, \tag{3.2}$$

for some point $u_* \in D$. Since $du/dt = f(u)$, it follows that $u_*$ is an equilibrium if and only if $f(u_*) = 0$.

Thus equilibrium solutions must be roots of the function $f(u)$. Depending on this function, a system may have no equilibria, or one or more equilibria. Note that knowledge of such solutions is helpful since, by Result 3.2.1(iii), they provide barriers which other solutions cannot touch or cross.

Example 3.3.1. Consider $du/dt = u^2 - 3u$, $u|_{t=0} = u_0$. For this system, we have $f(u) = u^2 - 3u$, and the equation $f(u_*) = 0$ has two roots $u_* = 0, 3$. Thus for $u_0 = 0, 3$ we get constant solutions $u(t) \equiv 0, 3$. And for $u_0 \ne 0, 3$ we get nonconstant solutions that must satisfy $u(t) \ne 0, 3$ while defined. Note that if $u_0$ is in one of the intervals $(-\infty, 0)$, $(0, 3)$ or $(3, \infty)$, then $u(t)$ must remain in the same interval while defined as illustrated in Figure 3.4.

Figure 3.4. [Solution curves in the $t, u$-plane, including the equilibria $u_0 = 3$ and $u_0 = 0$.]
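The barrier property in Example 3.3.1 can be observed numerically. The sketch below assumes a classical Runge-Kutta integrator; the helper `trajectory`, the time horizon and the sample initial conditions are illustrative assumptions, not part of the text.

```python
def f(u):
    # Right-hand side of Example 3.3.1, with roots at u = 0 and u = 3.
    return u * u - 3.0 * u

def trajectory(f, u0, t_end, steps):
    """RK4 samples of the solution of du/dt = f(u), u(0) = u0, on [0, t_end]."""
    h = t_end / steps
    u = u0
    samples = [u0]
    for _ in range(steps):
        k1 = f(u)
        k2 = f(u + 0.5 * h * k1)
        k3 = f(u + 0.5 * h * k2)
        k4 = f(u + h * k3)
        u += (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        samples.append(u)
    return samples

# Solutions starting between the equilibria stay strictly between them.
for u0 in (0.5, 1.5, 2.9):
    us = trajectory(f, u0, 5.0, 5000)
    print(u0, min(us), max(us))

# Solutions starting at an equilibrium remain there.
print(set(trajectory(f, 3.0, 1.0, 100)))
```

Each sampled curve with $0 < u_0 < 3$ stays inside $(0, 3)$, in line with the statement that equilibria act as barriers.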

3.4. Monotonicity theorem

The concept of equilibrium solutions can be combined with Result 3.2.1 to get a qualitative characterization of arbitrary solutions. To state the result, we consider the following three disjoint sets, some of which could be empty, but whose union is the entire set $D$ where $f(u)$ is continuously differentiable. Specifically, let

$$E = \{u \in D \mid f(u) = 0\}, \qquad I^+ = \{u \in D \mid f(u) > 0\}, \qquad I^- = \{u \in D \mid f(u) < 0\}. \tag{3.3}$$

Result 3.4.1. Consider (3.1) and let $u(t)$ be the solution with initial condition $u_0$.

(i) If $u_0 \in E$, then $u(t) \equiv u_0$ is an equilibrium solution.

(ii) If $u_0 \in I^+$, then $u(t)$ must remain in $I^+$ and increase while defined.

(iii) If $u_0 \in I^-$, then $u(t)$ must remain in $I^-$ and decrease while defined.

Thus solutions $u(t)$ must be monotonic in time for any initial condition $u_0$. The result follows from the fact that solutions cannot intersect or touch while they are defined. Specifically, if $u_0 \notin E$, then we must have $f(u(t)) \ne 0$, and by continuity, the signs of $f(u(t))$ and $f(u_0)$ must be the same. The monotonicity result then follows from this observation about signs, and the fact that $du/dt = f(u(t))$ is the slope of the graph of $u(t)$.

Example 3.4.1. Consider $du/dt = (u - 1)(4 - u)$, $u|_{t=0} = u_0$. For this system, we have $f(u) = (u - 1)(4 - u)$, and the equation $f(u_*) = 0$ has two roots $u_* = 1, 4$. The sign of $f(u)$ versus $u$ is shown in the table in Figure 3.5, and by inspection we get $E = \{1, 4\}$, $I^+ = (1, 4)$ and $I^- = (-\infty, 1) \cup (4, \infty)$. Solution curves for $u_0 \in E$ and various $u_0 \notin E$ are also shown. Note that all solutions are monotonic in accordance with the result.

Figure 3.5. [Sign table for $f(u)$ with zeros at $u = 1$ and $u = 4$, and solution curves in the $t, u$-plane, including the equilibria $u_0 = 4$ and $u_0 = 1$.]
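The sets in (3.3) and the monotonicity claim of Result 3.4.1 can be illustrated for Example 3.4.1 with a small sketch. The helpers `classify`, `euler_path` and `is_monotonic` are hypothetical names, and forward Euler with a small step over a short time window is an assumed, deliberately crude integrator.

```python
def f(u):
    # Right-hand side of Example 3.4.1.
    return (u - 1.0) * (4.0 - u)

def classify(u0):
    """Return which of the sets E, I+ or I- from (3.3) contains u0."""
    value = f(u0)
    if value == 0.0:
        return "E"
    return "I+" if value > 0.0 else "I-"

def euler_path(u0, t_end=0.2, steps=2000):
    """Forward-Euler samples of du/dt = f(u) on [0, t_end]."""
    dt = t_end / steps
    path = [u0]
    for _ in range(steps):
        path.append(path[-1] + dt * f(path[-1]))
    return path

def is_monotonic(path):
    """True if the samples never increase or never decrease."""
    diffs = [b - a for a, b in zip(path, path[1:])]
    return all(d >= 0 for d in diffs) or all(d <= 0 for d in diffs)

for u0 in (0.0, 1.0, 2.0, 4.0, 5.0):
    print(u0, classify(u0), is_monotonic(euler_path(u0)))
```

The short window $0 \le t \le 0.2$ is used because solutions starting below $u = 1$ decrease without bound and leave any fixed range quickly.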

While monotonicity is a general feature of one-dimensional dynamical systems, we note that it is not a general feature of higher-dimensional systems. Indeed, systems with two or more time-dependent quantities may possess a rich variety of monotonic and non-monotonic solutions, including various types of spiraling and periodic solutions.

3.5. Stability of equilibria

Here we introduce a classification of equilibria. The classification will be based on the behavior of nearby solutions, and will lead to a better qualitative understanding of a system. In contrast to the monotonicity result, this classification will not be limited to one-dimensional systems, and will be generalized to higher-dimensional systems later.

Definition 3.5.1. Let $u_*$ be an equilibrium of (3.1), and for any $\rho > 0$ let $I_{*,\rho}$ denote the open interval $(u_* - \rho, u_* + \rho)$.

(1) $u_*$ is called asymptotically stable if for every $\varepsilon > 0$ there is a $\delta > 0$ such that, if $u_0 \in I_{*,\delta}$ then $u(t) \in I_{*,\varepsilon}$ for all $t \ge 0$, and $u(t) \to u_*$ as $t \to \infty$ for every $u_0 \in I_{*,\delta}$; see Figure 3.6(a).

(2) $u_*$ is called neutrally stable if for every $\varepsilon > 0$ there is a $\delta > 0$ such that, if $u_0 \in I_{*,\delta}$ then $u(t) \in I_{*,\varepsilon}$ for all $t \ge 0$, and $u(t) \not\to u_*$ as $t \to \infty$ for some $u_0 \in I_{*,\delta}$; see Figure 3.6(b).

(3) $u_*$ is called unstable if it is not asymptotically or neutrally stable; see Figure 3.6(c) and (d).

Thus every equilibrium solution $u_*$ can be classified as one of three types: asymptotically stable, neutrally stable, or unstable. Denoting the solution of (3.1) by $u(t, u_0)$, we note that stability is a form of continuity of this function with respect to $u_0$. Specifically, asymptotic and neutral stability imply that $|u(t, u_0) - u(t, u_*)|$ will be arbitrarily small for all time $t \ge 0$ provided that $|u_0 - u_*|$ is sufficiently small, where $u(t, u_*) \equiv u_*$.

Figure 3.6. [Time views of solutions near an equilibrium $u_*$: (a) attractor or sink, (b) neutral, (c) repeller or source, (d) semi-stable or hyperbolic.]

Analogously, instability implies that $|u(t, u_0) - u(t, u_*)|$ will not be arbitrarily small for all time $t \ge 0$ for some $u_0$, no matter how small $|u_0 - u_*|$ may be. In view of the monotonicity result, the classification of an equilibrium can be determined by examining the sign of $f(u)$ around $u_*$.

Intuitively, an asymptotically stable equilibrium $u_*$ can be interpreted as a preferred state of the system. All solutions that start sufficiently close to such an equilibrium will remain close, and will be pulled into the equilibrium, as time goes on. An unstable equilibrium $u_*$ can be interpreted as an unpreferred state. Some solutions that start arbitrarily close, but not at such an equilibrium, will be pushed away over the course of time. A neutrally stable equilibrium $u_*$ can be interpreted as a borderline case. All solutions that start sufficiently close to such an equilibrium will remain close, but some will not be pulled into the equilibrium.

As indicated in Figure 3.6, an asymptotically stable equilibrium is also called an attractor or sink, and an unstable equilibrium can be a repeller or source, or it can be semi-stable or hyperbolic. Equilibria that are neutrally stable are rather special and not common in one-dimensional systems; however, they will be more common in the higher-dimensional case.

Example 3.5.1. Consider $du/dt = 4u^2 - u^3$, $u|_{t=0} = u_0$. For this system, we have $f(u) = 4u^2 - u^3$, and the equation $f(u_*) = 0$ has two distinct roots $u_* = 0, 4$. The sign of $f(u)$ versus $u$ is shown in the table in Figure 3.7. The pattern of signs around $u_* = 4$ indicates that solutions with $u_0 \in (0, 4)$ will increase toward $u_*$, and solutions with $u_0 \in (4, \infty)$ will decrease toward $u_*$, which implies that $u_*$ is asymptotically stable; it is an attractor. The pattern of signs around $u_* = 0$ indicates that any solution with $u_0 \in (0, 4)$ will increase and move away from $u_*$, and solutions with $u_0 \in (-\infty, 0)$ will increase toward $u_*$, which implies that $u_*$ is unstable; it is hyperbolic. The equilibria and various solution curves around them are illustrated in Figure 3.7.

Figure 3.7. [Sign table for $f(u)$ with zeros at $u = 0$ and $u = 4$, and solution curves in the $t, u$-plane, including the equilibria $u_0 = 4$ and $u_0 = 0$.]
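The sign-pattern reasoning used in Example 3.5.1 can be sketched as a small routine. The function `classify_equilibrium` and the probe distance `eps` are hypothetical, illustrative choices; the labels follow the terminology of Figure 3.6, and the routine only samples $f$ at one point on each side of the equilibrium.

```python
def classify_equilibrium(f, u_star, eps=1e-3):
    """Classify an equilibrium from the sign of f just to its left and right."""
    left = f(u_star - eps)
    right = f(u_star + eps)
    if left > 0 and right < 0:
        return "attractor"      # solutions on both sides move toward u_star
    if left < 0 and right > 0:
        return "repeller"       # solutions on both sides move away from u_star
    if left * right > 0:
        return "semi-stable"    # attracting from one side, repelling from the other
    return "undetermined"       # f vanishes at a probe point

def f(u):
    # Right-hand side of Example 3.5.1.
    return 4.0 * u * u - u ** 3

for u_star in (0.0, 4.0):
    print(u_star, classify_equilibrium(f, u_star))
```

For this example the routine reports an attractor at $u_* = 4$ and a semi-stable (hyperbolic, in the terminology of the text) equilibrium at $u_* = 0$.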

Example 3.5.2. Consider the trivial system $du/dt \equiv 0$, $u|_{t=0} = u_0$. Here we have $f(u) \equiv 0$, and every $u_*$ on the real line is a root and hence an equilibrium. Moreover, since all solutions are constant in time, every $u_*$ is neutrally stable, as can be inferred from the definition with $\delta = \varepsilon$.

3.6. Derivative test for stability

In some cases, examining the sign of $f(u)$ around an equilibrium $u_*$ can be tedious, especially when the function involves a number of parameters. The next result provides a simple test that will be helpful in various situations.

Result 3.6.1. Let $u_*$ be an equilibrium of (3.1) and let $\lambda_* = f'(u_*)$, where $f'$ denotes $df/du$.

(i) If $\lambda_* < 0$, then $u_*$ is asymptotically stable; it is an attractor.

(ii) If $\lambda_* > 0$, then $u_*$ is unstable; it is a repeller.

(iii) If $\lambda_* = 0$, then $u_*$ may be stable or unstable.

The above test is based on the fact that, if $f(u_*) = 0$ and $f'(u_*) \ne 0$, then the function $f(u)$ is either increasing or decreasing at $u_*$, which determines the sign pattern of $f(u)$ around $u_*$. For instance, if $f'(u_*) > 0$, then $f(u)$ is increasing, and it must be negative in an interval to the left of $u_*$, zero at $u_*$, and positive in an interval to the right of $u_*$. Similar conclusions can be made if $f'(u_*) < 0$. Note that, if $f'(u_*) = 0$, then there are no implications for the sign pattern of $f(u)$.

Example 3.6.1. Consider $\frac{du}{dt} = \frac{k}{1+u} - mu$, $u|_{t=0} = u_0$, where $k, m > 0$ are constants. Suppose we only care about solutions with $u \ge 0$. For this system, we have $f(u) = \frac{k}{1+u} - mu$, and the equation $f(u_*) = 0$ has two real, distinct roots $u_*^{\pm} = \frac{1}{2m}\left(-m \pm \sqrt{m^2 + 4mk}\right)$. In view of the restriction $u \ge 0$, we discard the negative root, and only consider the single equilibrium $u_*^{+} = \frac{1}{2m}\left(-m + \sqrt{m^2 + 4mk}\right) > 0$. The construction of a sign table for $f(u)$ around this equilibrium would be tedious. However, a relatively straightforward calculation gives

$$\lambda_* = f'(u_*^{+}) = \frac{-k}{(1 + u_*^{+})^2} - m. \tag{3.4}$$

Since $\lambda_* < 0$ for any $k, m > 0$, it follows that $u_*^{+}$ is an attractor for any $k, m > 0$.
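The formulas of Example 3.6.1 can be evaluated directly for sample parameter values. In the sketch below the function names and the values $k = 2$, $m = 1$ are illustrative assumptions; the expressions themselves are the equilibrium $u_*^{+}$ and the derivative in (3.4).

```python
import math

def f(u, k, m):
    """Right-hand side of Example 3.6.1."""
    return k / (1.0 + u) - m * u

def positive_equilibrium(k, m):
    """The nonnegative root u_*^+ = (-m + sqrt(m^2 + 4 m k)) / (2 m)."""
    return (-m + math.sqrt(m * m + 4.0 * m * k)) / (2.0 * m)

def lam(u, k, m):
    """Derivative f'(u) = -k / (1 + u)^2 - m, as in (3.4)."""
    return -k / (1.0 + u) ** 2 - m

k, m = 2.0, 1.0                       # sample values, chosen only for illustration
u_plus = positive_equilibrium(k, m)
print(u_plus, f(u_plus, k, m), lam(u_plus, k, m))

# The derivative test predicts an attractor for every k, m > 0.
for k, m in ((0.1, 5.0), (1.0, 1.0), (50.0, 0.2)):
    print(k, m, lam(positive_equilibrium(k, m), k, m) < 0)
```

With $k = 2$ and $m = 1$ the equilibrium is $u_*^{+} = 1$ and $\lambda_* = -1.5$, so the derivative test gives an attractor without a sign table.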

3.7. Bifurcation of equilibria

Equilibrium solutions and their stability are key features that provide important qualitative information about a system. The dependence of these features on any parameter can be illustrated in a graphical diagram as defined next. For the following, consider a system

$$\frac{du}{dt} = f(u, h), \qquad u|_{t=0} = u_0, \qquad t \ge 0, \tag{3.5}$$

where $h$ is an arbitrary parameter of interest. Note that equilibrium solutions $u_*$ must satisfy $f(u_*, h) = 0$.

Definition 3.7.1. The set of all points $(u_*, h)$ satisfying $f(u_*, h) = 0$, with stability of $u_*$ indicated at each point, is called a bifurcation diagram.

A bifurcation diagram thus provides a direct graphical illustration of how the number, location, and stability of equilibrium solutions $u_*$ depend on a parameter $h$ in the system. Depending on the context, $u_*$ and $h$ may be allowed to vary over all real values, or they may be subject to restrictions, and allowed to vary only in given intervals. For the case considered here, with one variable $u_*$ and one parameter $h$, the set of points satisfying $f(u_*, h) = 0$ will generally be a set of curves in the $u_*, h$-plane.

Analogous diagrams could be contemplated when a dependence on more than one parameter is to be explored. For instance, if two parameters $h, k$ are considered, then the set of points satisfying $f(u_*, h, k) = 0$ would generally be a set of surfaces in $u_*, h, k$-space. Here we consider the dependence of equilibria on only a single parameter at a time.

Example 3.7.1. Consider $du/dt = u^3 - uh$, $u|_{t=0} = u_0$, where $h$ is a parameter. To construct a bifurcation diagram, we note that the equation $f(u_*, h) = u_*^3 - u_* h = 0$ has three real, distinct solutions $u_*$ in terms of $h$. One solution is $u_* = 0$ for all $h$, and the other two solutions are $u_* = \pm\sqrt{h}$ for all $h > 0$ (when $h = 0$, the pair reduces to $u_* = 0$, and when $h < 0$, the pair is not real). The three solutions are illustrated in Figure 3.8(a).

Figure 3.8. [Panels: (a) equilibrium curves in the $h, u_*$-plane, (b) sign table for $f(u, 0)$, (c) bifurcation diagram in the $h, u_*$-plane with stable and unstable branches.]

We next proceed to assess the stability of each equilibrium $u_*$ for each value of $h$. To employ the derivative test, we consider $\lambda_* = \frac{df}{du}(u_*, h) = 3u_*^2 - h$. For $u_* = 0$ we get $\lambda_* = -h$. When $h > 0$, we get $\lambda_* < 0$, which implies $u_*$ is a stable attractor. When $h < 0$, we get $\lambda_* > 0$, which implies $u_*$ is an unstable repeller. When $h = 0$, the derivative test is inconclusive, and so we consider a sign table for $f(u, 0) = u^3$ around $u_*$ as shown in Figure 3.8(b), which implies $u_*$ is an unstable repeller in this special case. For the pair $u_* = \pm\sqrt{h}$, which exist only for $h > 0$, we get $\lambda_* = 2h > 0$, which implies the pair are always unstable repellers. The bifurcation diagram is obtained by superimposing the stability information onto the curves in Figure 3.8(a); the result is shown in Figure 3.8(c).

Figure 3.9. [Time views of solution curves in the $t, u$-plane: (a) $h \le 0$, (b) $h > 0$.]

The information contained in the bifurcation diagram can be translated into a qualitative time view. When $h \le 0$, the diagram shows that the system has only a repeller at $u_* = 0$, which implies that solution curves for various $u_0$ must have a time view as shown in Figure 3.9(a). In contrast, when $h > 0$, the diagram shows that there is a repeller at $u_* = -\sqrt{h}$, an attractor at $u_* = 0$, and another repeller at $u_* = \sqrt{h}$, which implies that solution curves for various $u_0$ must have a time view as shown in Figure 3.9(b). Thus the bifurcation diagram provides a concise summary of the system.
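The bifurcation diagram of Example 3.7.1 can be tabulated point by point. The sketch below is one possible encoding, with the hypothetical helper `equilibria_with_stability`; it applies the derivative test $\lambda_* = 3u_*^2 - h$ and falls back on the sign table of $f(u, 0) = u^3$ in the single inconclusive case $h = 0$, $u_* = 0$.

```python
import math

def equilibria_with_stability(h):
    """Equilibria of du/dt = u^3 - u h, each labeled by its stability type."""
    points = [0.0]
    if h > 0:
        root = math.sqrt(h)
        points = [-root, 0.0, root]
    labeled = []
    for u in points:
        lam = 3.0 * u * u - h          # derivative test, lambda = df/du at (u, h)
        if lam < 0:
            label = "attractor"
        elif lam > 0:
            label = "repeller"
        else:
            # Only reached for h = 0, u = 0: the sign table of u^3 gives a repeller.
            label = "repeller"
        labeled.append((u, label))
    return labeled

# Sample the diagram at a few parameter values.
for h in (-1.0, 0.0, 1.0, 4.0):
    print(h, equilibria_with_stability(h))
```

Plotting these labeled points over a fine grid of $h$ values would trace out the stable and unstable branches of the diagram.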

Example 3.7.2. Consider $du/dt = hu - g(u)$, $u|_{t=0} = u_0$, where $h$ is a parameter. Let $g(u)$ be a given function, with intercepts at $a$ and $b$ as sketched in Figure 3.10(a), but whose explicit form is unknown. We consider this system subject to the restrictions $u \ge 0$ and $h \ge 0$. To construct a bifurcation diagram, we note that the function $f(u, h)$ can be written in the form $f(u, h) = y_1 - y_2$, where $y_1 = hu$ and $y_2 = g(u)$ are two graphs in the $u, y$-plane. For given $h$, the value of $f(u, h)$ at any $u$ is simply the signed distance between the graphs at that $u$.

Figure 3.10. [Graphs of $y_1 = hu$ and $y_2 = g(u)$ in the $u, y$-plane, with the sign of $f(u, h)$ indicated: (a) no intersection, with the intercepts $a$ and $b$ of $g$ marked, (b) tangency at $u_\#$, (c) intersections at $u_L$ and $u_R$.]

For large values of $h$, we note that $f(u, h) > 0$ for all $u$ as shown in Figure 3.10(a), and there are no equilibria. For a certain value $h = h_\#$, the graphs are tangent at a point $u = u_\#$ as shown in Figure 3.10(b). Hence this point is an equilibrium, and the sign pattern for $f(u, h_\#)$, which is indicated in the figure, shows that $u_\#$ is hyperbolic. For any given $h < h_\#$, the graphs intersect at two points $u = u_L, u_R$ as shown in Figure 3.10(c). These points are equilibria, and the sign patterns for $f(u, h)$ show that $u_L$ is an attractor and that $u_R$ is a repeller. As the slope $h$ of the line varies between $0$ and $h_\#$, we make the qualitative observation that $u_L \to a$ and $u_R \to b$ as $h \to 0$; moreover, $u_L \to u_\#$ and $u_R \to u_\#$ as $h \to h_\#$. Since there are no equilibria for $h > h_\#$, we obtain a bifurcation diagram as qualitatively sketched in Figure 3.11.

Figure 3.11. [Bifurcation diagram in the $h, u_*$-plane: a stable branch $u_L$ starting at $a$ and an unstable branch $u_R$ starting at $b$, meeting at $u_\#$ when $h = h_\#$.]

As before, the information in the diagram can be translated into a time view, and some interesting features can be exposed. For example, consider the solution curve 𝑢(𝑡) in the time view with initial condition 𝑢0 = 𝑢# . For any ℎ < ℎ# , this solution would remain bounded for all time: it would be repelled from the equilibrium at 𝑢𝑅 , and attracted to the equilibrium at 𝑢𝐿 . For ℎ = ℎ# , this solution would itself be an equilibrium. However, under the slightest increase of the parameter to ℎ > ℎ# , this solution would no longer remain bounded: it would grow uncontrollably in time since 𝑓(𝑢, ℎ) > 0 for all 𝑢 when ℎ > ℎ# .

3.8. Case study

Setup. To illustrate the preceding results on stability and bifurcation, and some models arising in ecology, we study the dynamics of a population in a simple ecosystem. Figure 3.12 shows the system, which consists of a fixed region of land, and three coexisting populations of insects, plants and birds. The populations interact in the sense that the insects consume the plants, and the birds consume the insects. Under the simplifying assumption that the number of plants and birds is steady, we study a model for the number of insects. We seek to understand how the insect population changes in time, and how the population versus time curve is influenced by various parameters in the model.

Figure 3.12. [A region of land with three populations: plants, insects and birds.]

Outline of model. Let $p$ denote the insect population size at time $t$, with dimensions of $[p]$ = Insect and $[t]$ = Time. We assume that the only factors influencing the insect population are natural births and deaths, and consumption by the birds. Specifically, we neglect any immigration and emigration of insects between neighboring regions, and any supply or removal of insects by external mechanisms, such as the application of insecticides. Thus a simple balance equation for the population size takes the form

$$\frac{dp}{dt} = F(p) - G(p), \tag{3.6}$$

where $F$ denotes the rate of change due to natural births and deaths, and $G$ denotes the rate of change due to consumption by the birds. Note that the dimensions $[F]$ and $[G]$ are both Insect/Time.

Births, deaths of insects. To describe the demographic characteristics of the insects in their environment, we consider a logistic model for the rate $F$, of the form

$$F(p) = r p \left(1 - \frac{p}{k}\right), \tag{3.7}$$

where $r, k > 0$ are constants with dimensions of $[r]$ = 1/Time and $[k]$ = Insect. The $F$ versus $p$ curve for this model is an inverted parabola, with intercepts at $p = 0$ and $p = k$, and a maximum value of $F_{\max} = rk/4$ at $p = k/2$ as shown in Figure 3.13(a). For populations of size $0 < p < k$, we have $F > 0$, which means that more births than deaths occur per unit time. Similarly, for populations of size $p > k$, we have $F < 0$, which means that more deaths than births occur per unit time. Thus the parameter $k$ can be seen as the maximum sustainable population, or carrying capacity, that can be supported by the environment, due to limited food, space, and other resources. The parameter $r$ is the initial slope of the curve, that is, $\frac{dF}{dp}\big|_{p=0} = r$. Note that the parameters $r$ and $k$ together determine the maximum value $F_{\max}$ in the model.

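Figure 3.13. [Panels: (a) $F$ versus $p$, with maximum value $rk/4$ and intercepts at $0$ and $k$; (b) $G$ versus $p$, with saturation value $m$ and inflection point at $n/\sqrt{3}$.]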

Consumption of insects. To describe the feeding characteristics of the birds in their environment, we consider a Holling-type model for the rate $G$, of the form

$$G(p) = \frac{m p^2}{n^2 + p^2}, \tag{3.8}$$

where $m, n > 0$ are constants with dimensions of $[m]$ = Insect/Time and $[n]$ = Insect. The $G$ versus $p$ curve for this model is a sigmoidal or “s”-shaped curve, with an intercept at $p = 0$, an inflection point at $p = n/\sqrt{3}$, and a saturation or horizontal asymptote value of $G_{\text{sat}} = m$ as shown in Figure 3.13(b). For insect populations of size $0 < p$ […] $n/\sqrt{3}$, we have $G \approx m$, which means that the birds now notice the insects, and catch and consume them as quickly as they can, which is some rate near the upper bound $m$. For population sizes near $p = n/\sqrt{3}$, the consumption rate experiences its maximum variation, and we have $0 < G < m$.

Model equations. Combining (3.6)–(3.8), we obtain the dynamical system

$$\frac{dp}{dt} = r p \left(1 - \frac{p}{k}\right) - \frac{m p^2}{n^2 + p^2}, \qquad p|_{t=0} = p_0, \qquad t \ge 0. \tag{3.9}$$

We seek to understand the behavior of solutions in terms of the parameters $r, k, m, n > 0$ and initial population $p_0 \ge 0$. Although the system is mathematically well defined for all $p$, we focus only on physically meaningful solutions with $p \ge 0$.

Analysis of model. To study the above system it will be convenient to rewrite it in dimensionless form. For this purpose, we introduce the scale transformation $\tau = t/a$ and $u = p/b$, where $a = n/m$ and $b = n$, and by the usual method we obtain

$$\frac{du}{d\tau} = h u \left(1 - \frac{u}{c}\right) - \frac{u^2}{1 + u^2}, \qquad u|_{\tau=0} = u_0, \qquad \tau \ge 0, \tag{3.10}$$

where $h = rn/m$, $c = k/n$ and $u_0 = p_0/n$ are dimensionless parameters. For fixed $c > 0$, we seek a bifurcation diagram in terms of $h > 0$. We denote the right-hand side of the differential equation by $f(u, h)$.

Equilibria, stability. To characterize the equilibria and their stability, we consider the partial factorization

$$f(u, h) = u\, g(u, h) \quad \text{where} \quad g(u, h) = h \left(1 - \frac{u}{c}\right) - \frac{u}{1 + u^2}. \tag{3.11}$$

Note that sign $f(u, h)$ = sign $g(u, h)$ when $u > 0$, and that $f(u, h) = 0$ when $u = 0$ or $g(u, h) = 0$. The equilibrium $u = 0$ will be called trivial, and any equilibria satisfying $g(u, h) = 0$ will be called nontrivial.

The trivial equilibrium $u = 0$ is a solution for all $h > 0$. Using the derivative test, we find $\lambda = \frac{df}{du}(0, h) = h$, which implies that this equilibrium is a repeller for all values of the parameter.

Nontrivial equilibria must satisfy $g(u, h) = 0$. Solving this equation for $u$ in terms of $h$ is difficult; however, solving for $h$ in terms of $u$ is straightforward, namely

$$g(u, h) = 0 \quad \Leftrightarrow \quad h \left(1 - \frac{u}{c}\right) = \frac{u}{1 + u^2} \quad \Leftrightarrow \quad h = \frac{u}{\left(1 - \frac{u}{c}\right)(1 + u^2)}. \tag{3.12}$$

Considering only points with $h > 0$ and $u > 0$, a sketch of the equation $g(u, h) = 0$ is illustrated in Figure 3.14(a). Every point on the curve gives a nontrivial equilibrium $u$ for a corresponding value of $h$. Note that the equation has no solutions with $u = c$, and solutions with $u > c$ have $h < 0$ and are discarded. Thus the only nontrivial equilibria are in the interval $0 < u < c$. At points above and below the curve, it will be convenient to label the regions with the sign of $g(u, h)$ as shown.

Figure 3.14. [Panels: (a) the curve $g = 0$ in the $h, u$-plane for $0 < u < c$, with regions labeled by the sign of $g$; (b) a point $(A, B)$ on the curve and the line $h = A$; (c) sign table for $f(u, h)$.]

[…] $g > 0$ for $u < B$. Since $f(u, h)$ and $g(u, h)$ have the same sign for any $u > 0$, the sign table for $f(u, h)$ must be as shown in Figure 3.14(c). Thus the nontrivial equilibrium at $(A, B)$ is an attractor. This procedure can be applied to each point on the curve. Note that stability changes at the two turning points, and that these two points are unstable, hyperbolic equilibria.

Bifurcation diagram. The above results for the trivial and nontrivial equilibria can be combined into a bifurcation diagram as illustrated in Figure 3.15. Note that the precise shape of the curve $g(u, h) = 0$ will depend on the fixed parameter $c > 0$, and only a qualitative sketch is shown. The parameter values $h_1$ and $h_2$ corresponding to the two turning points play an interesting role; they can be associated with infestation and extermination events. For any given value of $h$, the insect population will settle into some stable size, but seasonal variations can cause abrupt changes. For instance, if the stable population size is initially small, and $h$ increases across $h_2$, then a sudden rise in the stable population size will result, and an infestation of insects will seem to appear from nowhere (think of crickets at certain times of the year!). After such an infestation, if $h$ decreases across $h_1$, then a sudden drop in the stable population size will result, and the insects will seem to have disappeared or been naturally exterminated.

Figure 3.15. [Bifurcation diagram in the $h, u$-plane for $0 \le u < c$, with stable and unstable branches, a sudden drop at $h = h_1$ and a sudden rise at $h = h_2$.]
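The turning-point values $h_1$ and $h_2$ can be estimated numerically from (3.12). The sketch below is a rough grid search under stated assumptions: the value $c = 10$, the grid size, and the helper names `h_curve`, `turning_points` and `count_nontrivial` are illustrative choices, not taken from the text.

```python
def h_curve(u, c):
    """Parameter value h at which u is a nontrivial equilibrium, from (3.12)."""
    return u / ((1.0 - u / c) * (1.0 + u * u))

def turning_points(c, n=20000):
    """Grid estimate of the local extrema (u, h) of the curve for 0 < u < c."""
    us = [c * i / n for i in range(1, n)]
    hs = [h_curve(u, c) for u in us]
    points = []
    for i in range(1, len(hs) - 1):
        # A sign change in consecutive differences marks a local extremum.
        if (hs[i] - hs[i - 1]) * (hs[i + 1] - hs[i]) < 0:
            points.append((us[i], hs[i]))
    return points

def count_nontrivial(h, c, n=20000):
    """Count nontrivial equilibria in 0 < u < c by sign changes of h_curve - h."""
    count = 0
    prev = h_curve(c / n, c) - h
    for i in range(2, n):
        cur = h_curve(c * i / n, c) - h
        if cur == 0.0:
            continue               # skip exact zeros; the crossing is caught next step
        if prev * cur < 0:
            count += 1
        prev = cur
    return count

c = 10.0                           # assumed sample value of the fixed parameter
pts = turning_points(c)
print(pts)
for h in (0.3, 0.45, 0.7):
    print(h, count_nontrivial(h, c))
```

For this sample value of $c$ the search finds two turning points, and the number of nontrivial equilibria changes from one to three and back to one as $h$ crosses them, which matches the qualitative shape of the bifurcation diagram described above.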

Reference notes

The purpose of this chapter was to introduce the qualitative theory of ordinary differential equations in the simple context of a single, first-order, autonomous equation. The concepts of stability and bifurcation outlined here arise in a wide range of applications, and permeate a large part of applied mathematics.

A proof of the theorem on existence and uniqueness of solutions, within the context of general systems, can be found in the classic texts by Coddington and Levinson (1955) and Hirsch and Smale (1974). For elementary texts which highlight the one-dimensional case, see Arnold (1992), Kelley and Peterson (2010), and Strogatz (2015).

Exercises

1. Find the solution $u(t)$ and its interval of existence $t \in (T_-, T_+)$ for an arbitrary initial condition $u|_{t=0} = u_0$.
(a) $\frac{du}{dt} = 1 - 2u$.
(b) $\frac{du}{dt} = u^3$.
(c) $\frac{du}{dt} = (3 - u)^2$.
(d) $\frac{du}{dt} = 1 + u^2$.
(e) $\frac{du}{dt} = e^{-2u}$.
(f) $\frac{du}{dt} = 1 + e^{u}$.

2. Find the solution $u(t)$ and its interval of existence $t \in (T_-, T_+)$ for an arbitrary initial condition $u|_{t=0} = u_0$, subject to the given restriction.
(a) $\frac{du}{dt} = \sqrt{u}$, $u > 0$.
(b) $\frac{du}{dt} = -u \ln(3u)$, $u > 0$.
(c) $\frac{du}{dt} = \frac{1}{1 - u}$, $u > 1$.
(d) $\frac{du}{dt} = \sqrt{1 - u^2}$, […]
(e) $\frac{du}{dt} = \frac{1}{u^3}$, $u > 0$.
(f) $\frac{du}{dt} = \frac{1}{\cos(u)}$, $-\frac{\pi}{2} <$ […]

[…] $0\}$.
(a) Verify that $u(t) \equiv 0$ is a solution with $u_0 = 0$, and $\hat{u}(t) = t^2$ is a solution with $\hat{u}_0 = 0$ for $t \ge 0$.
(b) Observe that $u(t)$ and $\hat{u}(t)$ are two solutions of the same problem: $\frac{du}{dt} = 2\sqrt{u}$, $u|_{t=0} = 0$. Does this violate the uniqueness part of Result 3.2.1? Why not?


5. Find all initial conditions $u_0$ for which solutions $u(t)$ would increase in time while defined, or would decrease, subject to any given restrictions. Identify any equilibria.

(a) $\frac{du}{dt} = 4 - 6u$.
(b) $\frac{du}{dt} = 2u - u^3$.
(c) $\frac{du}{dt} = \frac{6 - 5u + u^2}{1 + u^2}$.
(d) $\frac{du}{dt} = 9u^2 - u^4$.
(e) $\frac{du}{dt} = -u \ln(u)$, $u > 0$.
(f) $\frac{du}{dt} = \frac{1}{\cos(u)}$, $-\frac{\pi}{2} < u < \frac{\pi}{2}$.
𝑚 > 0 are constants.

(a) $\frac{du}{dt} = 3u^2 - u^3$.
(b) $\frac{du}{dt} = (k - u)(m - u)^3$.
(c) $\frac{du}{dt} = e^{(u^2)} - e^{-mu}$.
(d) $\frac{du}{dt} = k \sin u - m \cos u$.
(e) $\frac{du}{dt} = -ku \ln(mu)$, $u > 0$.
(f) $\frac{du}{dt} = \frac{ku}{m+u} + u - k$, $u > -m$.

7. Consider $\frac{du}{dt} = f(u)$, $u|_{t=0} = u_0$. Suppose that $f(u) = -V'(u)$ for some function $V(u)$. Such a function is called a potential for the system.

(a) Show that any critical point of $V(u)$ is an equilibrium of the system.
(b) Show that any strict local minimum of $V(u)$ is an attractor of the system, and any strict local maximum is a repeller.
(c) Show that, for any solution $u(t)$, the function $V(u(t))$ is either decreasing or constant in time.

8. Construct a bifurcation diagram of equilibria $u_*$ with respect to the parameter $h$. Here $r, \sigma > 0$ are fixed constants.

(a) $\frac{du}{dt} = hu - u^2$.
(b) $\frac{du}{dt} = (h - \sigma)u - ru^3$.
(c) $\frac{du}{dt} = (r - u)(u^2 - h)$.
(d) $\frac{du}{dt} = u^2 - uh^2 + uh$.
(e) $\frac{du}{dt} = ue^{u} - uh$.
(f) $\frac{du}{dt} = \ln(h^2 + u^2) - 1$, $h \ne 0$.
(g) $\frac{du}{dt} = hu(u + 1) - u$.
(h) $\frac{du}{dt} = u^2 - h(u - 1)$.

9. Use a graphical or similar qualitative argument to construct a bifurcation diagram of equilibria $u_*$ with respect to $h$, subject to any given restrictions. Here $\eta > 0$ is a fixed constant.

(a) $\frac{du}{dt} = u^2 - u + h$.
(b) $\frac{du}{dt} = u^3 - u + h$.
(c) $\frac{du}{dt} = \eta \sin u - h$.
(d) $\frac{du}{dt} = \frac{u}{1+u^2} - uh$.
(e) $\frac{du}{dt} = hu + u^2 - 1$.
(f) $\frac{du}{dt} = hu^2 - \eta e^{u}$.
(g) $\frac{du}{dt} = h + \eta u^2 - u^3$.
(h) $\frac{du}{dt} = u(h - u) - \eta e^{-u}$, $h > 0$.


10. Consider $\frac{du}{dt} = u(u - 4) + h$, where $h$ is a parameter.

(a) Construct a bifurcation diagram showing all equilibria and their stability for all $h$.
(b) For the case $u|_{t=0} = 1$, use the diagram to determine $\lim_{t\to\infty} u(t)$ for each $h$.

11. Consider $\frac{du}{dt} = hu^2 + uh^2 - 4u$, where $h > 0$ is a parameter.

(a) Construct a bifurcation diagram showing all equilibria and their stability for all $h$.
(b) For the case $u|_{t=0} = 3$, use the diagram to determine $\lim_{t\to\infty} u(t)$ for each $h$.

12. A model for the population in a fishery is shown below, where $p \ge 0$ is the population size, $t \ge 0$ is time, $k, r > 0$ are capacity and birth rate parameters, and $\sigma \ge 0$ is a harvesting rate parameter.

$$\frac{dp}{dt} = rp\left(1 - \frac{p}{k}\right) - \sigma, \quad p|_{t=0} = p_0, \quad t \ge 0.$$

The population will become extinct if $p(t) \to 0$ as $t \to \infty$, or as $t \to T_e$ for some finite $T_e$; otherwise, the population will survive.

(a) Let $\tau = t/a$ and $u = p/b$, where $a = 1/r$ and $b = k$. Show that the scaled equation becomes $\frac{du}{d\tau} = f(u, h)$. Identify the function $f(u, h)$ and the parameter $h$.

(b) Construct a bifurcation diagram for the rescaled model; only consider $u \ge 0$ and $h \ge 0$.
(c) Show that there is a critical value $h_\#$ such that, if $h > h_\#$, then the population will become extinct for any $u_0$, and if $h < h_\#$, it will survive if $u_0$ is large enough.

13. A model for a solid-state laser device takes the form below, where $z \ge 0$ is the number of laser photons emitted, $t \ge 0$ is time, $E \ge 0$ is an input energy parameter, and $\eta, \mu > 0$ are material parameters.

$$\frac{dz}{dt} = (E - \eta)z - \mu z^2, \quad z|_{t=0} = z_0, \quad t \ge 0.$$

The device will produce only lamp-light if $z(t) \to 0$ as $t \to \infty$, and will produce sustained laser-light otherwise.

(a) Let $\tau = t/a$ and $u = z/b$. Find the scales $a$ and $b$ so that the scaled equation becomes $\frac{du}{d\tau} = (h - 1)u - u^2$. Identify the parameter $h$.
(b) Construct a bifurcation diagram for the rescaled model; only consider $u \ge 0$ and $h \ge 0$.
(c) For given $\eta, \mu$, show that there is a critical value of the input energy $E$, below which the device will produce only lamp-light, and above which it will produce sustained laser-light, for any $z_0 > 0$.


14. A model for a biochemical switch is shown below, where $w > 0$ is the concentration of the switch chemical, $t \ge 0$ is time, and $k_0, \ldots, k_3 > 0$ are reaction constants.

$$\frac{dw}{dt} = k_0 - k_1 w + \frac{k_2 w^2}{k_3^2 + w^2}, \quad w|_{t=0} = w_0, \quad t \ge 0.$$

An equilibrium $w_*$ is called a low state if $w_* < 0.3k_3$, and a high state if $w_* > 0.9k_3$.

(a) Let $\tau = t/a$ and $u = w/b$. Find the scales $a$ and $b$ so that the scaled equation becomes $\frac{du}{d\tau} = h - ru + \frac{u^2}{1+u^2}$. Identify the parameters $h$ and $r$.

(b) Let $r = 0.54$ be fixed. Construct a bifurcation diagram for the rescaled model. Considering only $u > 0$ and $h > 0$, show that the only stable equilibria are low and high states.
(c) Show that there are two critical values $h_{\text{off}}$ and $h_{\text{on}}$ such that, if $h < h_{\text{off}}$, then the switch will tend to a low state, and if $h > h_{\text{on}}$, it will tend to a high state, for any $u_0$.

15. A bead of mass $m > 0$, subject to gravitational acceleration $g > 0$, slides along a wire hoop of radius $r > 0$, which is rotating at angular velocity $\omega \ge 0$. If the sliding occurs with a high friction coefficient $\eta > 0$, then a model for the bead angle $\phi \in (-\pi, \pi]$ at time $t \ge 0$ is as follows.

$$\eta \frac{d\phi}{dt} = -mg \sin\phi + mr\omega^2 \sin\phi \cos\phi, \quad \phi|_{t=0} = \phi_0, \quad t \ge 0.$$

[Figure: bead of mass $m$ at angle $\phi$ on a hoop of radius $r$ rotating at angular velocity $\omega$, under gravity $g$.]

The bead will tend to the trivial state if $\phi(t) \to 0$ as $t \to \infty$, and will tend to a suspended state otherwise.

(a) Let $\tau = t/a$. Find the scale $a$ so that the scaled equation becomes $\frac{d\phi}{d\tau} = -\sin\phi + h \sin\phi \cos\phi$. Identify the parameter $h$.
(b) Construct a bifurcation diagram for the rescaled model. Considering only $-\pi < \phi \le \pi$ and $h \ge 0$, show that there are either two or four equilibria for any $h$.
(c) For given $m, r, g, \eta$, show that there is a critical value of the rotation rate $\omega$, below which the bead will tend to the trivial state, and above which it will tend to a suspended state, for any $0 < |\phi_0| < \pi$.

Mini-project. Similar to the case studied in Section 3.8, we consider a model for the population of plants in a simple ecosystem, of the form

$$\frac{dp}{dt} = \underbrace{rp\left(1 - \frac{p}{k}\right)}_{\text{plants}} - \underbrace{\frac{\sigma p}{1 + \eta p}}_{\text{herbivores}}, \quad p|_{t=0} = p_0, \quad t \ge 0.$$


Here $p$ is the number of plants, $t$ is time, $r$ and $k$ are constants that describe the growth rate of the plants, and $\sigma$ and $\eta$ are constants that describe the consumption rate of the plants by other members of the ecosystem, such as herbivores. The dimensions of the variables are $[p] = \text{Plant}$ and $[t] = \text{Time}$. We seek to understand the behavior of solutions in terms of the parameters $r, k, \sigma, \eta > 0$ and initial population $p_0 \ge 0$. The system is mathematically well defined for all $p \ne -1/\eta$, and we focus only on physically meaningful solutions with $p \ge 0$.

(a) Introduce the scale transformation $\tau = t/a$ and $u = p/b$, where $a = 1/r$ and $b = k$, and show that the dimensionless version of the system takes the following form,

$$\frac{du}{d\tau} = u(1 - u) - \frac{hu}{1 + cu}, \quad u|_{\tau=0} = u_0, \quad \tau \ge 0.$$

Identify the dimensionless parameters $h$, $c$ and $u_0$.

(b) Assuming $c$ is fixed, specifically $c > 1$, find all equilibrium solutions of the system in (a) and determine their stability in terms of the parameter $h > 0$. Illustrate the results on a bifurcation diagram.

(c) Consider a scenario in which $c = 4$ and $u_0 = 1/8$ are fixed. Using your diagram from (b), determine all values of $h$ for which the plant population would remain positive (avoid extinction) as $\tau \to \infty$. Similarly, for the same $c$ and $u_0$, determine all values of $h$ for which the plant population would approach zero (become extinct) as $\tau \to \infty$.

(d) Use Matlab or other similar software to numerically simulate the system in (a). Using $c = 4$ and a few values of $h$, say $h = 0.6, 1.4, 1.7$, produce portraits of solutions for various $u_0$. Indicate the locations of all stable and unstable equilibria in the plots and confirm your results from (b). Similarly, confirm your predictions from (c).
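As a possible starting point for part (d), the following is a minimal sketch in Python rather than Matlab. It integrates the dimensionless equation from part (a) with a classical fourth-order Runge-Kutta scheme. The function names, the step size, the final time and the sample values of $u_0$ are illustrative assumptions, not part of the project statement; plotting is left to the reader.

```python
def f(u, h, c):
    """Right-hand side of du/dtau = u(1 - u) - h*u/(1 + c*u)."""
    return u * (1.0 - u) - h * u / (1.0 + c * u)


def simulate(u0, h, c, tau_end=50.0, dtau=0.01):
    """Integrate from u(0) = u0 with classical RK4; return lists of tau and u."""
    steps = int(round(tau_end / dtau))
    taus, us = [0.0], [u0]
    u = u0
    for n in range(1, steps + 1):
        k1 = f(u, h, c)
        k2 = f(u + 0.5 * dtau * k1, h, c)
        k3 = f(u + 0.5 * dtau * k2, h, c)
        k4 = f(u + dtau * k3, h, c)
        u += dtau * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        taus.append(n * dtau)
        us.append(u)
    return taus, us


if __name__ == "__main__":
    c = 4.0
    for h in (0.6, 1.4, 1.7):            # values of h suggested in part (d)
        for u0 in (0.05, 0.125, 0.5, 1.0):   # assumed sample initial conditions
            _, us = simulate(u0, h, c)
            print(f"h = {h}, u0 = {u0}: u(50) = {us[-1]:.4f}")
```

The lists returned by `simulate` can be passed to any plotting tool to produce the solution portraits; the printed final values give a rough indication of which equilibrium each solution approaches.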

Chapter 4

Two-dimensional dynamics

Here we continue our study of dynamical systems and consider models in the form of two coupled, autonomous, first-order differential equations. In contrast to the case of one dimension, systems in two dimensions may exhibit a rich variety of monotonic, spiraling and periodic solutions. We study various qualitative properties of these solutions, and explore how such properties depend on parameters.

4.1. Preliminaries

We consider an initial-value problem for a system of first-order equations of the form

$$\begin{aligned} \frac{dx}{dt} &= f(x, y, c_1, \ldots, c_N), & x|_{t=0} &= x_0, \\ \frac{dy}{dt} &= g(x, y, c_1, \ldots, c_N), & y|_{t=0} &= y_0, \end{aligned} \qquad t \ge 0, \tag{4.1}$$

where $x, y, t$ are real variables, $x_0, y_0, c_1, \ldots, c_N$ are real parameters, and $f, g$ are given functions. The system in (4.1) is called a dynamical system for $x, y$. A solution is a pair of functions $x = x(t, x_0, y_0, c_1, \ldots, c_N)$ and $y = y(t, x_0, y_0, c_1, \ldots, c_N)$, which are differentiable in $t$, and satisfy (4.1). For brevity, we will abbreviate the system as $\frac{dx}{dt} = f(x, y)$ and $\frac{dy}{dt} = g(x, y)$, and a solution as $x = x(t)$ and $y = y(t)$, or more simply $(x, y)(t)$. Parameters will be indicated when they are essential to a discussion.

Similar to before, we seek to characterize how a solution $(x, y)(t)$ depends on the parameters. For the moment, we fix $c_1, \ldots, c_N$, and only consider the dependence on $x_0, y_0$.

A solution $(x, y)(t)$ can be viewed in two ways as illustrated in Figure 4.1. In a time view, a solution is viewed as a pair of graphs in the $t,x$- and $t,y$-planes. The slopes of these two graphs satisfy $\frac{dx}{dt} = f(x, y)$ and $\frac{dy}{dt} = g(x, y)$. Alternatively, in a phase view, a solution is viewed as a moving point in the $x,y$-plane, referred to as the phase plane. The moving point traces out a curve in this plane, called an orbit or path, and the vector $\vec{\nu} = (\frac{dx}{dt}, \frac{dy}{dt}) = (f(x, y), g(x, y))$ is the velocity of the point, which depends on its position, and is tangent to the path.

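[Figure 4.1. Time view ($x$ and $y$ plotted against $t$, from $(x_0, y_0)$ at $t = 0$ to $(x_1, y_1)$ at $t = 1$) and phase view (the path in the $x,y$-plane with velocity $\vec{\nu}$).]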
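The two views of a solution described above can be illustrated with a short numerical sketch. The example system $f(x, y) = y$, $g(x, y) = -x$ is a hypothetical choice made only for illustration, as are the function names and the step size; the integrator is a classical fourth-order Runge-Kutta scheme.

```python
import math


def f(x, y):
    return y        # hypothetical example system, not taken from the text


def g(x, y):
    return -x       # hypothetical example system, not taken from the text


def rk4_step(x, y, dt):
    """One classical Runge-Kutta step for dx/dt = f(x, y), dy/dt = g(x, y)."""
    k1x, k1y = f(x, y), g(x, y)
    k2x, k2y = f(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y), g(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y)
    k3x, k3y = f(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y), g(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y)
    k4x, k4y = f(x + dt * k3x, y + dt * k3y), g(x + dt * k3x, y + dt * k3y)
    return (x + dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0,
            y + dt * (k1y + 2 * k2y + 2 * k3y + k4y) / 6.0)


def solve(x0, y0, t_end, dt):
    """Return sample times and the values of x and y along the solution."""
    steps = int(round(t_end / dt))
    ts, xs, ys = [0.0], [x0], [y0]
    for n in range(1, steps + 1):
        x, y = rk4_step(xs[-1], ys[-1], dt)
        ts.append(n * dt)
        xs.append(x)
        ys.append(y)
    return ts, xs, ys


def views(ts, xs, ys):
    """Time view: graphs (t, x) and (t, y). Phase view: points (x, y) with velocity (f, g)."""
    time_view = {"t,x": list(zip(ts, xs)), "t,y": list(zip(ts, ys))}
    phase_view = [((x, y), (f(x, y), g(x, y))) for x, y in zip(xs, ys)]
    return time_view, phase_view


if __name__ == "__main__":
    ts, xs, ys = solve(1.0, 0.0, 1.0, 0.01)
    time_view, phase_view = views(ts, xs, ys)
    point, velocity = phase_view[-1]
    print("point at t = 1:", point, "speed:", math.hypot(*velocity))
```

The same samples feed both views: the time view pairs each value with its time, while the phase view discards the time and keeps only the position and the velocity vector at that position.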

4.2. Solvability theorem

The question of existence and uniqueness of solutions to (4.1) is addressed in the following result, which is similar to Result 3.2.1 from the one-dimensional case. The set of all points where $f(x, y)$ and $g(x, y)$ are continuously differentiable is denoted by $D$, assumed to be an open set in the real plane $\mathbb{R}^2$.

Result 4.2.1. The system in (4.1) has a unique solution $(x, y)(t) \in D$ for any $(x_0, y_0) \in D$. Moreover:

(i) $(x, y)(t)$ exists and is in $D$ for some maximal interval $t \in (T_-, T_+)$, where $T_- < 0$ and $T_+ > 0$ depend on $(x_0, y_0)$,

(ii) if $T_-$ or $T_+$ is finite, then $(x, y)(t)$ leaves $D$ at $t = T_-$ or $t = T_+$,

(iii) if $(x_0, y_0) \ne (\hat{x}_0, \hat{y}_0)$, then $(x, y)(t) \ne (\hat{x}, \hat{y})(t)$ while both solutions exist in $D$; also, paths are either disjoint or the same, and cannot cross.

Thus, within the set $D$ where $f(x, y)$ and $g(x, y)$ are continuously differentiable, the system in (4.1) has a unique solution $(x, y)(t)$ for any given $(x_0, y_0)$. As before, this solution exists on some maximal, open interval of time that contains $t = 0$. Although we focus on $t \ge 0$, solutions are also defined for $t \le 0$, whether of interest or not. Regardless of whether $D = \mathbb{R}^2$ or $D \subset \mathbb{R}^2$, the maximal existence interval for $(x, y)(t)$ depends on $(x_0, y_0)$, and is the largest for which the solution is in $D$. Thus, if $T_-$ or $T_+$ is finite, then $(x, y)(t)$ must leave $D$ at that time. Outside of the set $D$, solutions may not be unique or may not exist, or the dynamical system itself may not be defined. By the maximal orbit or path of a solution we mean that traced for all time in the maximal existence interval.

Part (iii) of Result 4.2.1 provides important qualitative information. For any two different initial conditions $(x_0, y_0)$ and $(\hat{x}_0, \hat{y}_0)$, the corresponding solutions $(x, y)(t)$ and $(\hat{x}, \hat{y})(t)$ will never occupy the same point at the same time in the phase view. Also, the maximal paths traced out by these solutions will either be disjoint or the same, and cannot cross as illustrated in Figure 4.2. A crossing cannot occur under any circumstances, not even if the crossing point were visited at different times. Note that, if two paths have a common point $(x_c, y_c)$, then they must also have a common velocity vector $\vec{\nu}(x_c, y_c)$ at that point, but a crossing would imply two different velocity vectors. Moreover, two paths with a common point must be identical, by uniqueness of solutions, with the common point serving as initial condition. Thus two maximal paths may either have no points in common, in which case they are disjoint, or all points in common, in which case they are the same.

[Figure 4.2. Three panels in the $x,y$-plane: disjoint paths (possible), same path (possible), crossed paths (impossible).]

4.3. Direction field, nullclines

The velocity vector informs us of the direction and orientation of the solution curve through each point of the phase plane. This fact can be used to get a qualitative portrait of solution curves.

Definition 4.3.1. By the direction field for (4.1) we mean the collection of vectors $\vec{\nu}(x, y) = (f(x, y), g(x, y))$ over all points $(x, y)$.

For practical reasons, the vectors $\vec{\nu}(x, y)$ can only be visualized at a finite number of points as illustrated in Figure 4.3. A sketch of the vectors at several points can provide a qualitative portrait of solution curves, but many points may be required before an accurate portrait emerges. Alternative, more qualitative information about solutions can be obtained by simply considering how the signs of $f(x, y)$ and $g(x, y)$ vary over the phase plane. This motivates the following definition.

[Figure 4.3. Direction field sketched at a finite number of points in the $x,y$-plane.]

Definition 4.3.2. The sets of points where $f(x, y) = 0$ and $g(x, y) = 0$ are called the $x$- and $y$-nullclines, respectively. The sets of points where $f(x, y) > 0$ ($< 0$) and $g(x, y) > 0$ ($< 0$) are called the $x$- and $y$-increasing (decreasing) regions, respectively.

Thus at each point on an $x$-nullcline we have $f(x, y) = 0$ and the direction field has no horizontal component. Moreover, at each point in an $x$-increasing region we have $f(x, y) > 0$ and the direction field points rightward; in a decreasing region we have $f(x, y) < 0$ and the direction is leftward. Similarly, at each point on a $y$-nullcline we have $g(x, y) = 0$ and the direction field has no vertical component. And at each point in a $y$-increasing region we have $g(x, y) > 0$ and the direction field points upward; and in a decreasing region we have $g(x, y) < 0$ and the direction is downward. Note that the nullclines, increasing regions, and decreasing regions are disjoint sets that partition the entire phase plane for each of the $x$- and $y$-variables; however, there is no restriction on the overlaps of these sets between the two variables.

Example 4.3.1. Consider $\frac{dx}{dt} = f(x, y) = x - y^2$ and $\frac{dy}{dt} = g(x, y) = x^2 - 1$. The $x$-nullclines are defined by $f = 0$, which yields the single curve $x = y^2$. The $x$-increasing region is $f > 0$, which corresponds to $x > y^2$, and the decreasing region is $f < 0$, which corresponds to $x < y^2$. The $y$-nullclines are defined by $g = 0$, which yields the two curves $x = \pm 1$. The $y$-increasing region is $g > 0$, which corresponds to $x^2 > 1$, and the decreasing region is $g < 0$, which corresponds to $x^2 < 1$. Figure 4.4 illustrates the nullclines and regions for each separate variable, and the solution curve directions obtained by superimposing the information for each variable.

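[Figure 4.4. Nullclines $f = 0$ and $g = 0$, sign regions of $f$ and $g$, and the resulting solution curve directions in the $x,y$-plane.]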
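The sign analysis in Example 4.3.1 can be mirrored in a few lines of code. The sketch below evaluates the signs of $f$ and $g$ at a point and reports the horizontal and vertical direction of the field there; the helper names `sign_label` and `classify` and the sample grid are illustrative assumptions.

```python
def f(x, y):
    return x - y * y      # dx/dt in Example 4.3.1


def g(x, y):
    return x * x - 1.0    # dy/dt in Example 4.3.1


def sign_label(value, positive, negative):
    """Translate the sign of f or g into a direction, or flag a nullcline."""
    if value > 0:
        return positive
    if value < 0:
        return negative
    return "nullcline"


def classify(x, y):
    """Horizontal and vertical direction of the field at (x, y)."""
    return (sign_label(f(x, y), "right", "left"),
            sign_label(g(x, y), "up", "down"))


if __name__ == "__main__":
    # Coarse grid of sample points; a finer grid gives a sharper picture.
    for y in (1.5, 0.5, -0.5, -1.5):
        for x in (-2.0, 0.0, 2.0):
            print((x, y), classify(x, y))
```

A point where both labels read "nullcline" lies on both nullclines at once, which is the case at $(1, 1)$ and $(1, -1)$ for this system.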

4.4. Path equation, first integrals

Due to the fact that (4.1) is an autonomous system of first-order equations, we can eliminate the time variable to obtain a purely geometric description of solution curves in the phase plane.

Definition 4.4.1. By the path equation associated with (4.1) we mean the differential equation

$$\frac{dy}{dx} = \frac{g(x, y)}{f(x, y)}, \qquad y = y_0 \text{ when } x = x_0. \tag{4.2}$$

Provided that $f(x, y)$ and $g(x, y)$ are continuously differentiable as in Result 4.2.1, and do not both vanish at $(x_0, y_0)$, the path equation or its reciprocal has a unique solution curve $y = y(x)$ or $x = x(y)$, which coincides with the solution curve of the original equation (4.1) through $(x_0, y_0)$. Note that a graph description $y = y(x)$ or $x = x(y)$ of a solution curve is more restrictive than a parametric description $(x, y)(t)$. Although the path equation only provides information on the shape of a solution curve, this information can be combined with knowledge of the direction field to orient the curve with respect to time.


Definition 4.4.2. A function $E(x, y)$ is called a first integral of (4.1) if the general solution of the path equation can be written as $E(x, y) = C$, for some arbitrary constant $C$.

When the path equation is written in the form (4.2), a first integral will exist when the quotient $g(x, y)/f(x, y)$ is separable in the variables $x$ and $y$. In this case, the equation can be solved by separating variables and integrating, which gives $H(y) = J(x) + C$, where $H(y)$ and $J(x)$ are functions obtained from the integration. This general solution can then be written as $E(x, y) = C$, where $E(x, y) = H(y) - J(x)$. Alternatively, when the path equation is written as $f(x, y)\frac{dy}{dx} - g(x, y) = 0$, a first integral will exist when this equation is exact in the sense that $f_x(x, y) = -g_y(x, y)$. In this case, a first integral will satisfy $E_x(x, y) = -g(x, y)$ and $E_y(x, y) = f(x, y)$. More generally, a first integral will exist when $(\phi f)_x(x, y) = -(\phi g)_y(x, y)$ for some integrating factor $\phi(x, y)$, in which case a first integral will satisfy $E_x(x, y) = -(\phi g)(x, y)$ and $E_y(x, y) = (\phi f)(x, y)$.

Significant qualitative and quantitative information about the solution curves of (4.1) can be obtained from a first integral. If a first integral $E(x, y)$ exists in some region of the phase plane, then every solution curve of (4.1) in this region will be contained within a level set of $E(x, y)$. Thus a contour map of $E(x, y)$ provides a geometric portrait of solution curves, but with no explicit time information. For any given $C_0$, the level set $E(x, y) = C_0$ contains the path of all solution curves with initial points satisfying $E(x_0, y_0) = C_0$. In some cases, the path of a solution curve will not cover the entire set $E(x, y) = C_0$, but only a subset or component.

Example 4.4.1. Consider $\frac{dx}{dt} = 2y^3$ and $\frac{dy}{dt} = x - 2$. For this system the path equation is $\frac{dy}{dx} = \frac{x - 2}{2y^3}$. This can be solved using separation of variables to get $\frac{1}{2} y^4 = \frac{1}{2}(x - 2)^2 + C$, where $C$ is an arbitrary constant. After rearranging and introducing $B = -2C$, we obtain $(x - 2)^2 - y^4 = B$. Thus a first integral is $E(x, y) = (x - 2)^2 - y^4$, and every solution $(x, y)(t)$ of the system is contained within a level set $E(x, y) = B$. Figure 4.5(a) shows some level sets, corresponding to $B = -1, 0, 1, 4$. These level sets can be viewed as roads upon which solutions can travel. Figure 4.5(b) shows the nullclines and solution (velocity) directions. These are the directions in which solutions move along the roads as time increases. For comparison, Figure 4.5(c) shows some actual solution curves. Note that the level set with $B = 0$ (purple) is relatively complicated and consists of five components: four branches corresponding to $x < 2$ or $x > 2$ and $y < 0$ or $y > 0$, and one point corresponding to $(x, y) = (2, 0)$. Because of the constant solution $(x, y)(t) \equiv (2, 0)$, a solution curve starting in any one of these components remains there for all time.

[Figure 4.5. (a) Level sets of $E(x, y)$ for $B = -1, 0, 1, 4$; (b) nullclines $f = 0$, $g = 0$ and solution directions; (c) solution curves.]
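The claim that solutions of Example 4.4.1 stay on level sets of $E(x, y) = (x - 2)^2 - y^4$ can be checked numerically. The sketch below integrates the system with a classical Runge-Kutta scheme and records how far $E$ drifts from its initial value; the function name `max_drift`, the initial point, the step size and the short time horizon are illustrative assumptions (the horizon is kept short because some solutions of this system grow rapidly).

```python
def f(x, y):
    return 2.0 * y ** 3       # dx/dt in Example 4.4.1


def g(x, y):
    return x - 2.0            # dy/dt in Example 4.4.1


def first_integral(x, y):
    return (x - 2.0) ** 2 - y ** 4


def max_drift(x0, y0, t_end=0.5, dt=0.001):
    """Integrate with RK4 and return the largest |E - E(x0, y0)| observed."""
    e0 = first_integral(x0, y0)
    x, y, drift = x0, y0, 0.0
    for _ in range(int(round(t_end / dt))):
        k1x, k1y = f(x, y), g(x, y)
        k2x, k2y = f(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y), g(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y)
        k3x, k3y = f(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y), g(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y)
        k4x, k4y = f(x + dt * k3x, y + dt * k3y), g(x + dt * k3x, y + dt * k3y)
        x += dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0
        y += dt * (k1y + 2 * k2y + 2 * k3y + k4y) / 6.0
        drift = max(drift, abs(first_integral(x, y) - e0))
    return drift


if __name__ == "__main__":
    print("largest change in E along the solution:", max_drift(3.0, 0.5))
```

Any drift reported here comes from the numerical scheme, since $\frac{d}{dt}E = 2(x - 2)\cdot 2y^3 - 4y^3 (x - 2) = 0$ along exact solutions.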

4.5. Equilibria

As in the one-dimensional case, we consider the special class of solutions to (4.1) that are constant in time.

Definition 4.5.1. A solution of (4.1) is called an equilibrium or steady state if it is constant in time, that is

$$(x, y)(t) \equiv (x_*, y_*) \quad \text{for all} \quad t \ge 0, \tag{4.3}$$

for some point $(x_*, y_*) \in D$.

Since $\frac{dx}{dt} = f(x, y)$ and $\frac{dy}{dt} = g(x, y)$, it follows that $(x_*, y_*)$ is an equilibrium if and only if $f(x_*, y_*) = 0$ and $g(x_*, y_*) = 0$. Thus equilibrium solutions must be simultaneous roots of the functions $f(x, y)$ and $g(x, y)$; equivalently, they correspond to intersection points of the $x$- and $y$-nullclines. Note that the curve $(x, y)(t) \equiv (x_*, y_*)$ is a single, fixed point in the phase plane, which no other solution curve can touch or cross.

In the one-dimensional case, equilibria provided barriers which trapped other solutions in regions of increase or decrease, which led to a monotonicity result. In the two-dimensional case, equilibria no longer provide barriers, but instead provide organizing points on the boundary between regions of increase and decrease of each variable, about which solutions may exhibit a variety of growing, decaying, spiraling and periodic behaviors. Similar to the one-dimensional case, we next introduce a classification of equilibria for two-dimensional systems.

Definition 4.5.2. Let $(x_*, y_*)$ be an equilibrium of (4.1), and for any $\rho > 0$ let $R_{*,\rho}$ denote the open rectangle $(x_* - \rho, x_* + \rho) \times (y_* - \rho, y_* + \rho)$.

(1) $(x_*, y_*)$ is called asymptotically stable if for every $\varepsilon > 0$ there is a $\delta > 0$ such that, if $(x_0, y_0) \in R_{*,\delta}$ then $(x, y)(t) \in R_{*,\varepsilon}$ for all $t \ge 0$, and $(x, y)(t) \to (x_*, y_*)$ as $t \to \infty$ for every $(x_0, y_0) \in R_{*,\delta}$; see Figure 4.6.

(2) $(x_*, y_*)$ is called neutrally stable if for every $\varepsilon > 0$ there is a $\delta > 0$ such that, if $(x_0, y_0) \in R_{*,\delta}$ then $(x, y)(t) \in R_{*,\varepsilon}$ for all $t \ge 0$, and $(x, y)(t) \not\to (x_*, y_*)$ as $t \to \infty$ for some $(x_0, y_0) \in R_{*,\delta}$; see Figure 4.7.

(3) $(x_*, y_*)$ is called unstable if it is not asymptotically or neutrally stable; see Figure 4.8.

[Figure 4.6. Time views of $x$ and $y$ with bands of width $2\varepsilon$ and $2\delta$ about $x_*$ and $y_*$, and the corresponding phase view, for an asymptotically stable equilibrium.]

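[Figure 4.7. Time views of $x$ and $y$ with bands of width $2\varepsilon$ and $2\delta$ about $x_*$ and $y_*$, and the corresponding phase view, for a neutrally stable equilibrium.]

[Figure 4.8. Three phase-plane sketches of solution curves near an unstable equilibrium.]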

Thus every equilibrium solution $(x_*, y_*)$ can be classified as one of three types: asymptotically stable, neutrally stable, or unstable. Similar to the one-dimensional case, if we denote the solution of (4.1) by $(x, y)(t, x_0, y_0)$, then stability is a form of continuity of this function with respect to $(x_0, y_0)$. Specifically, asymptotic and neutral stability imply that $(x, y)(t, x_0, y_0)$ and $(x, y)(t, x_*, y_*)$ will be arbitrarily close for all time $t \ge 0$ provided that $(x_0, y_0)$ and $(x_*, y_*)$ are sufficiently close, where $(x, y)(t, x_*, y_*) \equiv (x_*, y_*)$. And instability implies that $(x, y)(t, x_0, y_0)$ and $(x, y)(t, x_*, y_*)$ will not be arbitrarily close for all time $t \ge 0$ for some $(x_0, y_0)$, no matter how close it may be to $(x_*, y_*)$. Figures 4.6–4.8 provide a small sampling of the possible behavior of solution curves and are for illustration only; there are many other possibilities depending on whether an equilibrium is isolated or not, and whether it is nondegenerate in an appropriate sense.

Note that the classification of equilibria in the two-dimensional case is not as straightforward as in the one-dimensional case. Inspection of the nullclines and direction field in the regions surrounding an equilibrium can give stability information in some cases. A more systematic method for classification will require a number of preliminary results about the linear case, which will be outlined later.

Similar to the one-dimensional case, an asymptotically stable equilibrium $(x_*, y_*)$ can be interpreted as a preferred state of the system. All solutions that start sufficiently close to such an equilibrium will remain close, and will be pulled into the equilibrium, as time goes on. However, in contrast to the one-dimensional case, the approach to the equilibrium may be non-monotonic in one or both of the variables $x(t)$ and $y(t)$. As before, an unstable equilibrium $(x_*, y_*)$ can be interpreted as an unpreferred state. Some solutions that start arbitrarily close, but not at such an equilibrium, will be pushed away over the course of time, but now in a possibly non-monotonic way. And a neutrally stable equilibrium $(x_*, y_*)$ can again be interpreted as a borderline case. All solutions that start sufficiently close to such an equilibrium will remain close, but some will not be pulled into the equilibrium.


Example 4.5.1. Consider the system in Example 4.3.1, namely $\frac{dx}{dt} = x - y^2$ and $\frac{dy}{dt} = x^2 - 1$. The equilibrium points of this system satisfy $x - y^2 = 0$ and $x^2 - 1 = 0$. The first equation implies $x = y^2$ and the second implies $x = \pm 1$. In the case $x = 1$, we obtain $y = \pm 1$, and in the case $x = -1$, there is no real solution for $y$. Hence this system has two equilibrium points $(x_*, y_*) = (1, 1)$ and $(1, -1)$. Notice that these are precisely the intersection points of the $x$- and $y$-nullclines as considered earlier, which are shown in the left part of Figure 4.9. A sketch of the direction field in the regions surrounding $(x_*, y_*) = (1, -1)$ indicates that this equilibrium is unstable, since solutions in the immediate vicinity of the equilibrium are pushed away in some areas. In contrast, a sketch of the direction field in the regions surrounding $(x_*, y_*) = (1, 1)$ is not as revealing; it indicates a spiraling or periodic type of behavior, but the information is not precise enough to determine stability. An accurate portrait of solution curves, as illustrated in the right part of Figure 4.9, shows that $(1, 1)$ is actually unstable.

[Figure 4.9. Left: nullclines and direction field near the equilibria $(x_*, y_*) = (1, 1)$ and $(1, -1)$; right: solution curves.]
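The instability of $(1, 1)$ in Example 4.5.1 can be probed with a simple numerical experiment, which is suggestive rather than a proof. The sketch below starts a solution close to the equilibrium and tracks its distance from $(1, 1)$ over a moderate time interval; the starting offset, step size, time horizon and function names are illustrative assumptions.

```python
import math


def f(x, y):
    return x - y * y      # dx/dt in Example 4.5.1


def g(x, y):
    return x * x - 1.0    # dy/dt in Example 4.5.1


def rk4_step(x, y, dt):
    """One classical Runge-Kutta step for the system."""
    k1x, k1y = f(x, y), g(x, y)
    k2x, k2y = f(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y), g(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y)
    k3x, k3y = f(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y), g(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y)
    k4x, k4y = f(x + dt * k3x, y + dt * k3y), g(x + dt * k3x, y + dt * k3y)
    return (x + dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0,
            y + dt * (k1y + 2 * k2y + 2 * k3y + k4y) / 6.0)


def distances_from(eq, start, t_end=5.0, dt=0.001):
    """Distance from the equilibrium eq along the solution through start."""
    x, y = start
    out = [math.hypot(x - eq[0], y - eq[1])]
    for _ in range(int(round(t_end / dt))):
        x, y = rk4_step(x, y, dt)
        out.append(math.hypot(x - eq[0], y - eq[1]))
    return out


if __name__ == "__main__":
    d = distances_from((1.0, 1.0), (1.01, 1.0))
    print(f"initial distance {d[0]:.4f}, distance at t = 5: {d[-1]:.4f}")
```

A distance that grows over time, for starting points arbitrarily close to the equilibrium, is the numerical signature of the instability shown in Figure 4.9.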

4.6. Periodic orbits

Other special types of solutions besides equilibria are also of interest. For systems in two and higher dimensions, one such type of solution is a periodic orbit as defined next. Note that such solutions are not possible for dynamical systems in one dimension due to monotonicity.

Definition 4.6.1. A solution of (4.1) is called a periodic or closed orbit if it is nonconstant and there is a number $P > 0$ such that

$$(x, y)(t + P) = (x, y)(t) \quad \text{for all} \quad t \ge 0. \tag{4.4}$$

The smallest such $P$ is called the period. We assume that $(x, y)(t) \in D$ for all $t$.

Thus a periodic orbit is a solution that repeats. Its path in the phase plane is a closed curve that is repeatedly traced over in time, in one of two possible orientations, as illustrated in Figure 4.10. If $(x_0, y_0) \ne (\hat{x}_0, \hat{y}_0)$ are any two points on this closed curve, then the corresponding solutions $(x, y)(t) \ne (\hat{x}, \hat{y})(t)$ will remain on the curve, be periodic with the same period, and chase each other around the curve for all time. A periodic orbit cannot intersect with any other solution curve in the phase plane, and cannot intersect with itself to form a figure eight or anything similar.

[Figure 4.10. A closed path through $(x_0, y_0)$ in the $x,y$-plane.]

A stability classification can be defined for periodic orbits. Rather than state a precise, technical form of the definitions, we simply note that a periodic orbit can be asymptotically stable, neutrally stable, or unstable similar to equilibria as illustrated in Figure 4.11. A periodic orbit is called asymptotically stable if all solution curves that start sufficiently close will remain close, and will spiral alongside and be pulled closer as time goes on, as illustrated in panel (a). In this case, the periodic orbit is called a limit cycle. A periodic orbit is called neutrally stable if all solution curves that start sufficiently close will remain close, and will spiral alongside as time goes on, but some will not get pulled closer, as illustrated in panel (b). And a periodic orbit is called unstable if it is neither asymptotically nor neutrally stable, as illustrated in panel (c).

[Figure 4.11. Phase-plane sketches of a periodic orbit that is (a) asymptotically stable, (b) neutrally stable, (c) unstable.]

Example 4.6.1. Consider $\frac{dx}{dt} = y$ and $\frac{dy}{dt} = -k^2 x$, with initial conditions $x|_{t=0} = x_0$ and $y|_{t=0} = y_0$, where $k > 0$ is a given constant. Due to its simple form, this system can be solved by combining the pair of first-order equations into a single second-order equation, namely $\frac{d^2 x}{dt^2} = \frac{dy}{dt} = -k^2 x$, or equivalently $\frac{d^2 x}{dt^2} + k^2 x = 0$. Using standard methods, the general solution of this equation is $x(t) = C_1 \cos(kt) + C_2 \sin(kt)$, where $C_1$ and $C_2$ are arbitrary constants. The equation $y = \frac{dx}{dt}$ then implies $y(t) = -kC_1 \sin(kt) + kC_2 \cos(kt)$. Using the given conditions at $t = 0$, we find that $C_1 = x_0$ and $C_2 = y_0/k$. Hence the solution curves of the system are given by

$$x(t) = x_0 \cos(kt) + \frac{y_0}{k} \sin(kt), \qquad y(t) = y_0 \cos(kt) - kx_0 \sin(kt). \tag{4.5}$$

For $(x_0, y_0) = (0, 0)$ we obtain the equilibrium solution $(x, y)(t) \equiv (0, 0)$, and for any $(x_0, y_0) \ne (0, 0)$, we obtain a periodic solution $(x, y)(t)$ of period $P = \frac{2\pi}{k}$. Thus every nonconstant solution of this system is periodic, of the neutrally stable type, and the equilibrium at the origin is also neutrally stable. An examination of the nullclines and direction field indicates that the periodic solution curves are oriented clockwise in time. Solutions for various $(x_0, y_0) \ne (0, 0)$ are illustrated in Figure 4.12 for the case $k = \frac{1}{2}$. Equivalently, a first integral for this system is $E(x, y) = k^2 x^2 + y^2$, and every solution curve is contained within a level set $E(x, y) = C \ge 0$. All level sets with $C > 0$ are ellipses about the origin, and the level set with $C = 0$ is the single point at the origin.

[Figure 4.12. Nullclines $f = 0$ and $g = 0$ with direction field, and periodic solution curves about the origin, in the $x,y$-plane.]
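The closed-form solution (4.5) lends itself to a quick numerical check of the period $P = 2\pi/k$ and of the first integral $E(x, y) = k^2 x^2 + y^2$. The sketch below is illustrative only; the function names and the sample initial condition are assumptions.

```python
import math


def solution(t, x0, y0, k):
    """Evaluate formula (4.5) at time t."""
    x = x0 * math.cos(k * t) + (y0 / k) * math.sin(k * t)
    y = y0 * math.cos(k * t) - k * x0 * math.sin(k * t)
    return x, y


def first_integral(x, y, k):
    return k * k * x * x + y * y


if __name__ == "__main__":
    k, x0, y0 = 0.5, 1.0, 2.0          # k = 1/2 as in Figure 4.12; (x0, y0) assumed
    period = 2.0 * math.pi / k
    for t in (0.0, 1.0, 1.0 + period):
        x, y = solution(t, x0, y0, k)
        print(f"t = {t:8.4f}: (x, y) = ({x:.6f}, {y:.6f}), E = {first_integral(x, y, k):.6f}")
```

The values at $t = 1$ and $t = 1 + P$ agree up to rounding, and $E$ keeps the value $k^2 x_0^2 + y_0^2$ at every sampled time.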

4.7. Linear systems

Here we study linear dynamical systems in two dimensions. Such systems arise as models in simple contexts, and more importantly, they will provide the foundation for understanding the behavior of more general nonlinear systems considered later.

Definition 4.7.1. A dynamical system for variables $x, y$ is called linear if it has the form

$$\begin{aligned} \frac{dx}{dt} &= ax + by, & x|_{t=0} &= x_0, \\ \frac{dy}{dt} &= cx + dy, & y|_{t=0} &= y_0, \end{aligned} \qquad t \ge 0, \tag{4.6}$$

where $a, b, c, d$ are constants. Equivalently, in matrix notation, the system is

$$\frac{dv}{dt} = Av, \quad v|_{t=0} = v_0, \quad t \ge 0, \tag{4.7}$$

where $v = \begin{pmatrix} x \\ y \end{pmatrix}$ and $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$. The system is called nondegenerate if $\det A \ne 0$, and degenerate otherwise.

Equilibria. The equilibrium solutions of (4.7) have the form $v(t) \equiv v_*$, where the constant vector $v_*$ must satisfy $Av_* = 0$. In the nondegenerate case, this equation will have no free variables, and the only solution is $v_* = 0$, which in components is $(x_*, y_*) = (0, 0)$. Alternatively, in the degenerate case, the equation $Av_* = 0$ will have some free variables, and there will be infinitely many solutions. For simplicity, we focus on the nondegenerate case and seek to understand the behavior of solutions of (4.7) around the isolated equilibrium at the origin. Degenerate cases require a different treatment and will be considered as the need arises.

Sign regions. Since $f(x, y) = ax + by$, the nullcline curve $f = 0$ will be a line through the origin, provided that $a$ and $b$ are not both zero. Similarly, since $g(x, y) = cx + dy$, the nullcline curve $g = 0$ will also be a line through the origin, provided that $c$ and $d$ are not both zero. In the nondegenerate case, these two lines are nonparallel, and they partition the plane into four distinct sign regions around the isolated equilibrium $(0, 0)$. These sign regions may be enough to understand the behavior of solutions around the equilibrium, or they may be inconclusive. Note that a geometrical characterization of a nondegenerate equilibrium is that it be surrounded by exactly four sign regions, all of which are distinct. Any equilibrium which is surrounded by a different number of regions, or which has nondistinct regions, must be degenerate. Due to the limited information provided by sign regions, we next pursue a detailed characterization of behavior provided by the general solution of the system, which is possible due to linearity.

General solution. Following the standard theory for linear, constant-coefficient, ordinary differential equations, the general solution of (4.7) can be motivated by considering a solution of the form $v(t) = e^{\lambda t} \hat{u}$, where $\lambda$ is an arbitrary number and $\hat{u}$ is an arbitrary vector. Substituting this function into the equation, using the fact that $e^{\lambda t}$ never vanishes, we get

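$$\frac{d(e^{\lambda t} \hat{u})}{dt} = A(e^{\lambda t} \hat{u}) \quad \Leftrightarrow \quad e^{\lambda t} \lambda \hat{u} = e^{\lambda t} A\hat{u} \quad \Leftrightarrow \quad A\hat{u} = \lambda \hat{u}. \tag{4.8}$$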

Hence $v(t) = e^{\lambda t}\hat{u}$ is a solution of the differential equation if and only if 𝜆 and 𝑢̂ satisfy the algebraic equation $A\hat{u} = \lambda\hat{u}$. From this it follows that the general solution of (4.7) is determined by the eigenvalues and eigenvectors of 𝐴, which may be real or complex. Note that the nondegeneracy condition det 𝐴 ≠ 0 implies 𝜆 ≠ 0, and in all cases we assume 𝑢̂ ≠ 0. The specific form of the general solution depends on the details of 𝜆 and 𝑢̂.

Case 1. If 𝐴 has real, distinct eigenvalues 𝜆1 ≠ 𝜆2, then it necessarily has two independent eigenvectors 𝑢̂1 and 𝑢̂2, and the general solution of (4.7) takes the form

\[
v(t) = C_1 e^{\lambda_1 t}\hat{u}_1 + C_2 e^{\lambda_2 t}\hat{u}_2, \tag{4.9}
\]

where 𝐶1 and 𝐶2 are arbitrary constants. From this expression we deduce the following phase diagrams, which illustrate the behavior of solution curves around the equilibrium 𝑣∗ = 0. 1.1. If 𝜆1 < 0 and 𝜆2 < 0, then the equilibrium 𝑣∗ = 0 is asymptotically stable, and is called a stable node or attractor. The behavior of solution curves in the phase plane is as follows; see Figure 4.13(a). Let 𝐿1 and 𝐿2 be lines through the origin parallel to 𝑢̂1 and 𝑢̂2 . Note that all solutions with 𝐶2 = 0 are on the line 𝐿1 . Any solution curve that starts on 𝐿1 , on either side of the origin, will remain on 𝐿1 and be pulled toward the origin since 𝑒𝜆1 𝑡 → 0 as 𝑡 → ∞. Similarly, note that all solutions with 𝐶1 = 0 are on the line 𝐿2 . Any solution curve that starts on 𝐿2 , on either side of the origin, will remain on 𝐿2 and be pulled toward the origin since 𝑒𝜆2 𝑡 → 0 as 𝑡 → ∞. Since solution curves cannot cross, the lines 𝐿1 and 𝐿2 divide the phase plane into four wedge regions, in which all solutions with 𝐶1 ≠ 0 and 𝐶2 ≠ 0 are trapped. All such solutions also get pulled to the origin as time goes on, but along paths that are generally curved and not straight. Note that the lines and wedge regions described here are associated with the general solution, and are generally different from the nullclines and sign regions for the system.

[Figure 4.13. Phase diagrams in the (𝑥, 𝑦) plane with lines 𝐿1 and 𝐿2: (a) stable node, (b) saddle, (c) unstable node.]

1.2. If 𝜆1 > 0 and 𝜆2 < 0, then the equilibrium 𝑣∗ = 0 is unstable, and is called a saddle or hyperbolic point. The behavior of solution curves in the phase plane is as follows; see Figure 4.13(b). Let 𝐿1 and 𝐿2 again be lines through the origin parallel to 𝑢̂1 and 𝑢̂2 . As before, any solution that starts on 𝐿2 , on either side of the origin, will remain on 𝐿2 and be pulled toward the origin since 𝑒𝜆2 𝑡 → 0 as 𝑡 → ∞. In contrast, any solution that starts on 𝐿1 , on either side of the origin, will remain on 𝐿1 but be pushed away from the origin since 𝑒𝜆1 𝑡 → ∞ as 𝑡 → ∞. The lines 𝐿1 and 𝐿2 again divide the phase plane into four wedge regions in which all solutions with 𝐶1 ≠ 0 and 𝐶2 ≠ 0 are trapped. Since 𝑒𝜆1 𝑡 → ∞ and 𝑒𝜆2 𝑡 → 0, all such solutions must have a slant asymptote along the line 𝐿1 , and approach this line in one of four ways depending on the signs of 𝐶1 and 𝐶2 , which determine the wedge region in which the solution lies. 1.3. If 𝜆1 > 0 and 𝜆2 > 0, then the equilibrium 𝑣∗ = 0 is unstable, and is called an unstable node or repeller. The behavior of solution curves in the phase plane is analogous to the stable case, with lines 𝐿1 and 𝐿2 defined as before, but now all directions are reversed and all solutions are pushed away from the origin and grow unbounded in time; see Figure 4.13(c). Case 2. If 𝐴 has a real, repeated (double) eigenvalue 𝜆, then it may have two independent eigenvectors 𝑢̂1 and 𝑢̂2 , or only one independent eigenvector 𝑢.̂ The form of the general solution of (4.7) depends on these two possibilities. 2.1. If there are two independent eigenvectors, then the general solution is

\[
v(t) = C_1 e^{\lambda t}\hat{u}_1 + C_2 e^{\lambda t}\hat{u}_2, \tag{4.10}
\]

where 𝐶1 and 𝐶2 are arbitrary constants. The equilibrium 𝑣∗ = 0 is stable if 𝜆 < 0, and unstable if 𝜆 > 0, and as before is referred to as a node. The behavior of solution curves in the phase plane is similar to before, but now all solution curves are straight lines due to the fact that the two exponential factors are identical; see Figure 4.14(a) and (b).

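[Figure 4.14. Phase diagrams in the (𝑥, 𝑦) plane for a repeated real eigenvalue, shown for 𝜆 > 0 and 𝜆 < 0: (a), (b) two independent eigenvectors, with lines 𝐿1 and 𝐿2; (c), (d) one independent eigenvector.]

2.2. If there is only one independent eigenvector 𝑢̂, then the general solution is

\[
v(t) = C_1 e^{\lambda t}\hat{u} + C_2 e^{\lambda t}\left[t\hat{u} + \hat{w}\right], \tag{4.11}
\]

where 𝐶1 and 𝐶2 are arbitrary constants, and $\hat{w}$ is any vector satisfying $(A - \lambda I)\hat{w} = \hat{u}$. The equilibrium 𝑣∗ = 0 is again stable if 𝜆 < 0, and unstable if 𝜆 > 0, but is now referred to as an improper node. The behavior of solution curves in the phase plane is as follows; see Figure 4.14(c) and (d). Let 𝐿 be the line through the origin parallel to 𝑢̂. Note that all solutions with 𝐶2 = 0 are on this line. Any solution that starts on 𝐿, on either side of the origin, remains on 𝐿 and is pulled toward or pushed away from the origin depending on the sign of 𝜆. Since solution curves cannot cross, the line 𝐿 divides the plane into two regions, in which all solutions with 𝐶2 ≠ 0 are trapped. When 𝜆 < 0, all of these solutions have a slant asymptote with the line 𝐿 far away from the origin as 𝑡 → −∞, and become tangent to 𝐿 at the origin as 𝑡 → ∞, since the component $te^{\lambda t}\hat{u}$ dominates. When 𝜆 > 0, the diagram is similar, but all directions are reversed.

Case 3. If 𝐴 has complex eigenvalues $\lambda_+ = \alpha + i\beta$ and $\lambda_- = \alpha - i\beta$, where 𝑖 is the imaginary unit and 𝛽 ≠ 0, then it necessarily has two independent eigenvectors $\hat{u}_+ = \hat{\gamma} + i\hat{\eta}$ and $\hat{u}_- = \hat{\gamma} - i\hat{\eta}$. Note that $\lambda_+$, $\hat{u}_+$ are written in terms of +𝑖, and $\lambda_-$, $\hat{u}_-$ in terms of −𝑖, which ensures that 𝛽 and $\hat{\eta}$ are consistently defined. Using Euler's formula $e^{i\theta} = \cos\theta + i\sin\theta$, the general solution to (4.7) can be put into the real form

\[
v(t) = C_1 e^{\alpha t}\left[\hat{\gamma}\cos(\beta t) - \hat{\eta}\sin(\beta t)\right] + C_2 e^{\alpha t}\left[\hat{\gamma}\sin(\beta t) + \hat{\eta}\cos(\beta t)\right], \tag{4.12}
\]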

where 𝐶1 and 𝐶2 are arbitrary constants. From this expression we deduce the following phase diagrams.


3.1. If 𝛼 ≠ 0, then the equilibrium 𝑣∗ = 0 is called a spiral. It is asymptotically stable if 𝛼 < 0, and unstable if 𝛼 > 0. In the stable case, all solution curves spiral towards and approach the origin as time goes on, and all have the same orientation, either clockwise (CW) or counter-clockwise (CCW). The orientation can be determined by examining the direction vector at any test point away from the origin. The unstable case is similar, with the only difference being that all solution curves spiral away from the origin and grow unbounded in time; see Figure 4.15(a) and (b).

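[Figure 4.15. Phase diagrams in the (𝑥, 𝑦) plane for complex eigenvalues, with CCW and CW orientations: (a), (b) spirals for 𝛼 < 0 and 𝛼 > 0; (c), (d) centers for 𝛼 = 0.]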

3.2. If 𝛼 = 0, then the equilibrium 𝑣∗ = 0 is called a center and is neutrally stable. In this case, all solution curves starting away from the origin are periodic orbits around the origin, and all have an elliptical shape with the same orientation; see Figure 4.15(c) and (d). Similar to before, the orientation of the solution curves can be determined by examining the direction vector at any test point away from the origin. When 𝛼 = 0, note that the general solution in (4.12) can be written as

\[
v(t) = v_0\cos(\beta t) + \beta^{-1}Av_0\sin(\beta t), \tag{4.13}
\]

where $v_0 = v(0)$ is any given initial condition, and $Av_0 = \frac{dv}{dt}(0)$ is the corresponding initial rate of change determined by (4.7). Moreover, note that the vectors $\{v_0,\ \beta^{-1}Av_0,\ -v_0,\ -\beta^{-1}Av_0\}$ correspond to reference points on the elliptical orbit which are visited at times $\{0,\ \frac{\pi}{2\beta},\ \frac{\pi}{\beta},\ \frac{3\pi}{2\beta}\}$, and so on modulo $\frac{2\pi}{\beta}$.
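One way to see why (4.13) solves (4.7) is sketched below. It relies on the Cayley–Hamilton theorem, which is an assumption of this sketch and is not invoked in the text: when 𝛼 = 0 the eigenvalues ±𝑖𝛽 give tr 𝐴 = 0 and det 𝐴 = 𝛽², so that 𝐴² = −𝛽²𝐼.

```latex
\begin{align*}
A^2 - (\operatorname{tr} A)\,A + (\det A)\,I = 0
  \quad &\Longrightarrow \quad A^2 = -\beta^2 I, \\
\frac{dv}{dt}
  &= -\beta\, v_0 \sin(\beta t) + A v_0 \cos(\beta t), \\
A v
  &= A v_0 \cos(\beta t) + \beta^{-1} A^2 v_0 \sin(\beta t)
   = A v_0 \cos(\beta t) - \beta\, v_0 \sin(\beta t).
\end{align*}
```

The last two lines agree, so 𝑑𝑣/𝑑𝑡 = 𝐴𝑣, and setting 𝑡 = 0 in (4.13) recovers 𝑣(0) = 𝑣0.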

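Example 4.7.1. Consider $\frac{dx}{dt} = x + y$ and $\frac{dy}{dt} = 4x - 2y$, with initial conditions $x|_{t=0} = \frac{5}{2}$ and $y|_{t=0} = -\frac{5}{2}$. The matrix, eigenvalues and eigenvectors for this linear system are

\[
A = \begin{pmatrix} 1 & 1 \\ 4 & -2 \end{pmatrix}, \qquad
\lambda_{1,2} = 2,\ -3, \qquad
\hat{u}_{1,2} = \begin{pmatrix} 1 \\ 1 \end{pmatrix},\ \begin{pmatrix} -1 \\ 4 \end{pmatrix}. \tag{4.14}
\]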


The system is nondegenerate, and the single equilibrium at the origin is an unstable saddle. The general solution has the form (4.9), which gives

\[
x(t) = C_1 e^{2t} - C_2 e^{-3t}, \qquad
y(t) = C_1 e^{2t} + 4C_2 e^{-3t}. \tag{4.15}
\]

Using the initial conditions $x(0) = \frac{5}{2}$ and $y(0) = -\frac{5}{2}$, we obtain the constants $C_1 = \frac{3}{2}$ and $C_2 = -1$. The phase diagram is illustrated in Figure 4.16. For comparison, the direction field for this system is also illustrated, along with the particular solution curve satisfying the initial conditions.
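The computations in Example 4.7.1 can be checked with a short script. This is only a sketch: the helper names `solution` and `residual`, the step size `h`, and the test time `0.7` are illustrative choices, not part of the text. It recomputes the eigenvalues from the characteristic equation, solves for the constants in (4.15), and compares a finite-difference derivative of the solution with 𝐴𝑣.

```python
import math

# Coefficients of dx/dt = a*x + b*y, dy/dt = c*x + d*y from Example 4.7.1.
a, b, c, d = 1.0, 1.0, 4.0, -2.0

# Eigenvalues from lambda^2 - (a + d)*lambda + (a*d - b*c) = 0
# (real roots are assumed here, which holds for this matrix).
trace, det = a + d, a * d - b * c
root = math.sqrt(trace * trace - 4.0 * det)
lam1, lam2 = (trace + root) / 2.0, (trace - root) / 2.0

# Constants from x(0) = C1 - C2 and y(0) = C1 + 4*C2, see (4.15).
x0, y0 = 2.5, -2.5
C2 = (y0 - x0) / 5.0
C1 = x0 + C2

def solution(t):
    """Particular solution (4.15) with the constants found above."""
    x = C1 * math.exp(lam1 * t) - C2 * math.exp(lam2 * t)
    y = C1 * math.exp(lam1 * t) + 4.0 * C2 * math.exp(lam2 * t)
    return x, y

def residual(t, h=1e-6):
    """Largest mismatch between a centered-difference derivative and A v."""
    (xp, yp), (xm, ym) = solution(t + h), solution(t - h)
    x, y = solution(t)
    rx = (xp - xm) / (2.0 * h) - (a * x + b * y)
    ry = (yp - ym) / (2.0 * h) - (c * x + d * y)
    return max(abs(rx), abs(ry))

print(lam1, lam2, C1, C2)
print(residual(0.7))
```

A small residual indicates that the formula satisfies the differential equations up to finite-difference error.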

[Figure 4.16. Phase diagram and direction field in the (𝑥, 𝑦) plane, with 𝐿1 parallel to (1, 1) and 𝐿2 parallel to (−1, 4).]

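[Figure 4.17. Phase diagram and direction field in the (𝑥, 𝑦) plane.]

Example 4.7.2. Consider $\frac{dx}{dt} = 4x - 5y$ and $\frac{dy}{dt} = 2x - 2y$, with arbitrary initial conditions $x|_{t=0} = x_0$ and $y|_{t=0} = y_0$. The matrix, eigenvalues and eigenvectors for this linear system are

\[
A = \begin{pmatrix} 4 & -5 \\ 2 & -2 \end{pmatrix}, \qquad
\lambda_{+,-} = 1 + i,\ 1 - i, \qquad
\hat{u}_{+,-} = \begin{pmatrix} 3 + i \\ 2 \end{pmatrix},\ \begin{pmatrix} 3 - i \\ 2 \end{pmatrix}. \tag{4.16}
\]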

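In our convention in terms of real and imaginary parts, we have $\lambda_+ = \alpha + i\beta$ and $\hat{u}_+ = \hat{\gamma} + i\hat{\eta}$, where

\[
\alpha = 1, \qquad \beta = 1, \qquad
\hat{\gamma} = \begin{pmatrix} 3 \\ 2 \end{pmatrix}, \qquad
\hat{\eta} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}. \tag{4.17}
\]

The system is nondegenerate, and the single equilibrium at the origin is an unstable spiral. Since $\frac{dy}{dt} > 0$ for any point with 𝑥 > 0 and 𝑦 = 0, the orientation is CCW. The general solution has the form (4.12), which gives

\[
\begin{pmatrix} x(t) \\ y(t) \end{pmatrix}
= C_1 e^{t}\left[\begin{pmatrix} 3 \\ 2 \end{pmatrix}\cos t - \begin{pmatrix} 1 \\ 0 \end{pmatrix}\sin t\right]
+ C_2 e^{t}\left[\begin{pmatrix} 3 \\ 2 \end{pmatrix}\sin t + \begin{pmatrix} 1 \\ 0 \end{pmatrix}\cos t\right]. \tag{4.18}
\]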


The phase diagram is illustrated in Figure 4.17. For comparison, the direction field for this system is also illustrated, along with the particular solution curve with initial condition $(x_0, y_0) = (-\frac{1}{6}, -\frac{1}{6})$.
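The case analysis of this section can be condensed into a small classifier based on the trace and determinant of 𝐴. This is a hedged sketch: the function name `classify` and the returned labels are illustrative, the comparisons assume exactly representable coefficients, and the orientation test uses the point (1, 0), where 𝑑𝑦/𝑑𝑡 = 𝑐.

```python
def classify(a, b, c, d):
    """Classify the origin of dx/dt = a*x + b*y, dy/dt = c*x + d*y."""
    trace = a + d
    det = a * d - b * c
    disc = trace * trace - 4.0 * det  # discriminant of the characteristic equation
    if det == 0:
        return "degenerate"
    if disc > 0:  # Case 1: real, distinct eigenvalues
        if det < 0:
            return "saddle"
        return "stable node" if trace < 0 else "unstable node"
    if disc == 0:  # Case 2: real, repeated eigenvalue
        # Two independent eigenvectors occur only when A is a multiple of I.
        kind = "node" if (b == 0 and c == 0) else "improper node"
        return ("stable " if trace < 0 else "unstable ") + kind
    # Case 3: complex eigenvalues alpha +/- i*beta, with alpha = trace / 2.
    if trace == 0:
        kind = "center"
    else:
        kind = "stable spiral" if trace < 0 else "unstable spiral"
    orientation = "CCW" if c > 0 else "CW"  # dy/dt at the test point (1, 0)
    return kind + ", " + orientation

print(classify(1, 1, 4, -2))   # matrix of Example 4.7.1
print(classify(4, -5, 2, -2))  # matrix of Example 4.7.2
```

In the complex case the product 𝑏𝑐 is necessarily negative, so 𝑐 ≠ 0 and the orientation test is well defined.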

4.8. Equilibria in nonlinear systems

We now return to the general system

\[
\frac{dx}{dt} = f(x, y), \quad x|_{t=0} = x_0, \qquad
\frac{dy}{dt} = g(x, y), \quad y|_{t=0} = y_0, \qquad t \ge 0. \tag{4.19}
\]

We seek to understand the phase diagram for this system around any given equilibrium (𝑥∗, 𝑦∗). Since the linear case has already been addressed, we assume that the system is not linear. To motivate the result, let (𝑥∗, 𝑦∗) be a given equilibrium, and consider Taylor expansions of 𝑓(𝑥, 𝑦) and 𝑔(𝑥, 𝑦) around this point. Since 𝑓(𝑥∗, 𝑦∗) = 0 and 𝑔(𝑥∗, 𝑦∗) = 0, these expansions take the form

\[
\begin{aligned}
f(x, y) &= f_x(x_*, y_*)(x - x_*) + f_y(x_*, y_*)(y - y_*) + R_1(x, y, x_*, y_*), \\
g(x, y) &= g_x(x_*, y_*)(x - x_*) + g_y(x_*, y_*)(y - y_*) + R_2(x, y, x_*, y_*).
\end{aligned} \tag{4.20}
\]

Here 𝑓𝑥, 𝑓𝑦, 𝑔𝑥, 𝑔𝑦 denote partial derivatives, and 𝑅1, 𝑅2 denote remainder terms. Introducing the matrix notation

\[
v = \begin{pmatrix} x - x_* \\ y - y_* \end{pmatrix}, \qquad
A^* = \begin{pmatrix} f_x & f_y \\ g_x & g_y \end{pmatrix}_{(x_*, y_*)}, \qquad
R = \begin{pmatrix} R_1 \\ R_2 \end{pmatrix}_{(x, y, x_*, y_*)}, \tag{4.21}
\]

we note that the system in (4.19) can be written in an equivalent form

\[
\begin{aligned}
dx/dt &= f(x, y) \\
dy/dt &= g(x, y)
\end{aligned}
\quad\Leftrightarrow\quad
\frac{dv}{dt} = A^* v + R. \tag{4.22}
\]

The matrix 𝐴∗ is called the Jacobian of the system at (𝑥∗, 𝑦∗). Thus, at any point in the phase plane near an equilibrium, a nonlinear system is equivalent to a linear system with a remainder. The continuous differentiability of 𝑓(𝑥, 𝑦) and 𝑔(𝑥, 𝑦) ensures that the remainder decreases in magnitude and vanishes as (𝑥, 𝑦) approaches (𝑥∗, 𝑦∗). Hence we can expect the phase diagram for a nonlinear system to be equivalent to the diagram of a corresponding linear system, but with some distortion caused by a remainder term. The following result shows that such an equivalence holds in most cases under a nondegeneracy assumption, but there is an important exception. The validity of the result requires that 𝑓(𝑥, 𝑦) and 𝑔(𝑥, 𝑦) be twice continuously differentiable in an open neighborhood of (𝑥∗, 𝑦∗), so that the remainder term vanishes at a maximal rate.

Result 4.8.1. [Hartman–Grobman] Let (𝑥∗, 𝑦∗) be an equilibrium of (4.19), and let 𝜆∗1,2 be the eigenvalues of 𝐴∗. If det 𝐴∗ ≠ 0, then the phase diagram in a sufficiently small


neighborhood of (𝑥∗, 𝑦∗) is, up to some distortion, of the following type:

    asymp. stable node      if 𝜆∗1,2 real and negative,
    unstable saddle         if 𝜆∗1,2 real and opposite,
    unstable node           if 𝜆∗1,2 real and positive,
    asymp. stable spiral    if 𝜆∗1,2 = 𝛼 ± 𝑖𝛽, 𝛽 ≠ 0, 𝛼 < 0,
    unstable spiral         if 𝜆∗1,2 = 𝛼 ± 𝑖𝛽, 𝛽 ≠ 0, 𝛼 > 0,
    no conclusion           if 𝜆∗1,2 = 𝛼 ± 𝑖𝛽, 𝛽 ≠ 0, 𝛼 = 0.

Hence the phase diagram around an equilibrium in a nonlinear system is of the same type as a corresponding linear system, provided that the latter is nondegenerate, and has eigenvalues that are not purely imaginary. Moreover, a node can be improper in the same way as before when eigenvalues are repeated. Note that the given equilibrium serves as the origin for the corresponding linear system and its phase diagram. For instance, if $(x_*, y_*)$ and $(\tilde{x}_*, \tilde{y}_*)$ are two equilibria, where the first is an unstable saddle and the second is a stable spiral, then the phase diagram in a small neighborhood around each would appear as illustrated in Figure 4.18(a). The result provides no information on the phase diagram in regions between or far from the equilibria.

[Figure 4.18. (a) Local phase diagrams in the (𝑥, 𝑦) plane around a saddle $(x_*, y_*)$ and a stable spiral $(\tilde{x}_*, \tilde{y}_*)$; (b) lines 𝐿1, 𝐿2 and solution curves L1, L2 at a saddle $(x_*, y_*)$.]

When an equilibrium is nondegenerate, and has eigenvalues that are not purely imaginary, the distortion caused by the remainder produces only a slight deformation of the linear diagram, with no qualitative changes. In the cases of real eigenvalues, the solution lines 𝐿1,2 defined by the eigenvectors of the linear system are deformed into solution curves L1,2 in the nonlinear system. The lines 𝐿1,2 are tangent to the curves L1,2 at the equilibrium point. Whereas the lines 𝐿1,2 provide barriers that partition the diagram of the linear system, the curves L1,2 provide analogous barriers that partition the diagram of the nonlinear system. The case of a saddle is illustrated in Figure 4.18 (b). In the cases of complex eigenvalues, the spiraling solution curves of the linear system are deformed into qualitatively similar spiraling solution curves of the nonlinear system. The amount of deformation can be expected to decrease as the size of the neighborhood is reduced. When an equilibrium is degenerate, or has eigenvalues that are purely imaginary, the distortion caused by the remainder can produce significant, qualitative changes to the linear diagram. For instance, when the eigenvalues are purely imaginary, the


linear diagram is a neutrally stable center, but the nonlinear diagram could be either a stable or unstable spiral, among other things. Similar qualitative differences between the linear and nonlinear diagrams can occur when an equilibrium is degenerate. In these exceptional cases, the nonlinear diagram may be relatively complicated and nonstandard, that is, it may be qualitatively different from a node, saddle, spiral, or center. Other methods would be required in order to gain insight into the phase diagram in such cases; for instance, an analysis of nullclines and sign regions may be helpful, and explicit knowledge of a first integral, if one exists, would be ideal.

Example 4.8.1. Consider $\frac{dx}{dt} = y$ and $\frac{dy}{dt} = x^2 - x - 3y$. The equilibrium points of this system satisfy $y = 0$ and $x^2 - x - 3y = 0$. When combined with the first equation, the second implies $x(x - 1) = 0$, which gives 𝑥 = 0, 1. Hence this system has two equilibrium points (𝑥∗, 𝑦∗) = (0, 0) and (1, 0). Since $f(x, y) = y$ and $g(x, y) = x^2 - x - 3y$, the Jacobian matrix at an arbitrary point is

\[
A(x, y) = \begin{pmatrix} f_x(x, y) & f_y(x, y) \\ g_x(x, y) & g_y(x, y) \end{pmatrix}
= \begin{pmatrix} 0 & 1 \\ 2x - 1 & -3 \end{pmatrix}. \tag{4.23}
\]

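For each of the equilibrium points, we consider the matrix 𝐴∗ = 𝐴(𝑥∗, 𝑦∗), and examine its eigenvalues and eigenvectors. The results are summarized below.

Equilibrium 1: (𝑥∗, 𝑦∗) = (0, 0),

\[
A^* = \begin{pmatrix} 0 & 1 \\ -1 & -3 \end{pmatrix}, \qquad
\lambda^*_{1,2} = -2.62,\ -0.38, \qquad
\hat{u}^*_{1,2} = \begin{pmatrix} -0.38 \\ 1 \end{pmatrix},\ \begin{pmatrix} -2.62 \\ 1 \end{pmatrix}. \tag{4.24}
\]

Equilibrium 2: (𝑥∗, 𝑦∗) = (1, 0),

\[
A^* = \begin{pmatrix} 0 & 1 \\ 1 & -3 \end{pmatrix}, \qquad
\lambda^*_{1,2} = -3.30,\ 0.30, \qquad
\hat{u}^*_{1,2} = \begin{pmatrix} -0.30 \\ 1 \end{pmatrix},\ \begin{pmatrix} 3.30 \\ 1 \end{pmatrix}. \tag{4.25}
\]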

Thus each of the equilibrium points is nondegenerate: (0, 0) is an asymptotically stable node, and (1, 0) is an unstable saddle. Using the eigenvectors $\hat{u}^*_{1,2}$, we can draw corresponding lines 𝐿1,2 through each equilibrium, and these lines then provide a guide for solution curves L1,2 that partition each of the diagrams. The curves L1,2 are tangent to 𝐿1,2 at each equilibrium, but no further information about L1,2, such as their concavity, is available from this analysis. The diagrams are illustrated in Figure 4.19.

[Figure 4.19. Local phase diagrams in the (𝑥, 𝑦) plane around the equilibria (0, 0) and (1, 0), with lines 𝐿1, 𝐿2 and solution curves L1, L2 at each point.]
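The eigenvalues quoted in Example 4.8.1 can be reproduced with a few lines of code. This is a minimal sketch: the function names `jacobian` and `eigenvalues` are illustrative, and the eigenvalues are obtained from the characteristic equation of a 2×2 matrix rather than from a linear algebra library.

```python
import cmath

def jacobian(x, y):
    """Jacobian (4.23) of f = y, g = x**2 - x - 3*y at the point (x, y)."""
    return ((0.0, 1.0), (2.0 * x - 1.0, -3.0))

def eigenvalues(A):
    """Eigenvalues of a 2x2 matrix from its characteristic equation."""
    (a, b), (c, d) = A
    trace, det = a + d, a * d - b * c
    root = cmath.sqrt(trace * trace - 4.0 * det)  # complex sqrt handles all cases
    return (trace - root) / 2.0, (trace + root) / 2.0

for point in [(0.0, 0.0), (1.0, 0.0)]:
    lam1, lam2 = eigenvalues(jacobian(*point))
    print(point, lam1, lam2)
```

Both eigenvalues at (0, 0) are real and negative, while those at (1, 0) are real with opposite signs, in agreement with the node and saddle classification.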


4.9. Periodic orbits in nonlinear systems

We consider again the general system

\[
\frac{dx}{dt} = f(x, y), \quad x|_{t=0} = x_0, \qquad
\frac{dy}{dt} = g(x, y), \quad y|_{t=0} = y_0, \qquad t \ge 0. \tag{4.26}
\]

Aside from equilibria, we seek to understand properties of periodic orbits. In the linear case, a system can have either no periodic orbits, or infinitely many, all of which are nonisolated and neutrally stable. In the nonlinear case, a system can have any number of periodic orbits, and these can be isolated or not, and of different stability types. Here we outline some results about periodic orbits which apply only to systems in two dimensions. The first result generalizes an observation from the linear case. It is based on the fact that equilibria correspond to intersection points of the 𝑥- and 𝑦-nullclines, and are on the boundary between regions of increase and decrease of each variable. Since a periodic orbit must necessarily pass through an increasing and decreasing region for each of the two variables, the corresponding four regions must meet at an equilibrium point within the area enclosed by the orbit. For the following result we assume that 𝑓(𝑥, 𝑦) and 𝑔(𝑥, 𝑦) are continuously differentiable in an open disc containing the orbit and the area that it encloses. Result 4.9.1. Let (𝑥, 𝑦)(𝑡) be a periodic orbit of (4.26) and let 𝑈 denote its enclosed area; see Figure 4.20. Then 𝑈 must contain an equilibrium point.

[Figure 4.20. A periodic orbit (𝑥, 𝑦)(𝑡) in the (𝑥, 𝑦) plane and its enclosed area 𝑈.]

Thus equilibrium points are necessary for periodic orbits. In the linear case, we saw that every periodic orbit enclosed a single equilibrium at the origin. In the nonlinear case, a periodic orbit can enclose more than one equilibrium point, and these can be located away from the origin. Any system which does not possess an equilibrium cannot have a periodic orbit. Since they are in a plane, and cannot intersect, solution curves for systems in two dimensions can exhibit only a limited range of behaviors. Only four types of behaviors are possible in forward time 𝑡 ≥ 0: either a solution leaves the system domain (open set 𝐷 in Result 4.2.1), or tends to (or is) a periodic orbit, or tends to (or is) an equilibrium, or tends to a set of points consisting of multiple equilibria and curves joining them. The same types of behaviors are possible in backward time 𝑡 ≤ 0. Thus solution curves in two dimensions cannot wander around aimlessly or chaotically for all time, as is possible in higher dimensions. In the next result, the existence of a periodic orbit is


implied by ruling out three of the four possible behaviors in forward time. We assume that 𝑓(𝑥, 𝑦) and 𝑔(𝑥, 𝑦) are continuously differentiable in an open disc containing the region 𝑅 as described.

Result 4.9.2. [Poincaré–Bendixson] Consider the system in (4.26) and suppose that a closed, bounded region 𝑅 can be found such that (see Figure 4.21):

(i) 𝑅 surrounds one or more equilibria, but does not contain them, and

(ii) at each point on the boundary of 𝑅 the direction field points either into or tangent to 𝑅.

Then there exists a periodic orbit in 𝑅.

[Figure 4.21. A region 𝑅 in the (𝑥, 𝑦) plane surrounding equilibria $(x_*, y_*)$ and $(\tilde{x}_*, \tilde{y}_*)$.]

The region 𝑅 in the above result is called a trapping region. The condition on the direction field ensures that any solution which begins in 𝑅 cannot escape and hence is trapped there for all forward time. Since 𝑅 is bounded and in the system domain, and contains no equilibrium points, every solution that starts in 𝑅 either is or tends to a periodic orbit. Thus there is at least one periodic orbit as stated, and it must be contained in 𝑅. Note that it is essential for the trapping region to surround equilibria in view of Result 4.9.1.

Example 4.9.1. Consider $\frac{dx}{dt} = -y - x(x^2 + y^2 - \mu)$ and $\frac{dy}{dt} = x - y(x^2 + y^2 - \mu)$, where 𝜇 > 0 is a given constant. To explore whether this system has periodic solutions, we first locate its equilibria, if any. The equilibrium points satisfy $-y - x(x^2 + y^2 - \mu) = 0$ and $x - y(x^2 + y^2 - \mu) = 0$. Multiplying the first equation by 𝑦, and the second equation by 𝑥, and subtracting the results, we get $x^2 + y^2 = 0$. The only solution of this equation is 𝑥 = 0 and 𝑦 = 0, and hence this system has a single equilibrium point at (𝑥∗, 𝑦∗) = (0, 0). The Jacobian matrix 𝐴∗ at this point has eigenvalues $\lambda^*_{1,2} = \mu \pm i$, which shows that the equilibrium is an unstable spiral for any 𝜇 > 0. Since $\frac{dy}{dt} > 0$ for any point with 𝑥 > 0 and 𝑦 = 0, the spiraling solution curves around the equilibrium are CCW. Consider now a candidate trapping region 𝑅 which surrounds the equilibrium point (0, 0) but does not contain it. Specifically, let 𝑅 be the closed region between some inner circle of radius $r_{\text{in}}$ and some outer circle of radius $r_{\text{out}}$ as illustrated in Figure 4.22(a). We next consider the direction field along the inner and outer boundary of 𝑅.


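[Figure 4.22. (a) The region 𝑅 between circles of radius $r_{\text{in}}$ and $r_{\text{out}}$ in the (𝑥, 𝑦) plane; (b) the direction field along the inner circle of radius $r_{\text{in}}$.]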

Inner. Consider any $r_{\text{in}} > 0$ sufficiently small so that the inner circle is within the neighborhood of (0, 0) where the phase diagram is known. Then, since (0, 0) is an unstable spiral, we expect that the direction field will point into 𝑅 all along the inner circle as illustrated in Figure 4.22(b). Note that the shape of the inner circle could be changed if needed, to an ellipse for example, in order to ensure that this condition holds in a neighborhood of the equilibrium.

Outer. Consider any $r_{\text{out}}^2 > \mu$, say $r_{\text{out}}^2 = 2\mu$. Then for all points along the outer circle $x^2 + y^2 = r_{\text{out}}^2$ we can rewrite 𝑓 and 𝑔 as

\[
\begin{aligned}
f(x, y) &= -y - x(x^2 + y^2 - \mu) = -y - \mu x, \\
g(x, y) &= x - y(x^2 + y^2 - \mu) = x - \mu y.
\end{aligned} \tag{4.27}
\]

Note that the line 𝑦 = −𝜇𝑥 intersects the outer circle in two points where 𝑓 = 0, and divides it into two semicircles with 𝑓 > 0 and 𝑓 < 0, as shown in Figure 4.23(a). Similarly, the line 𝑥 = 𝜇𝑦 intersects the outer circle in two points where 𝑔 = 0, and divides it into two semicircles with 𝑔 > 0 and 𝑔 < 0, as shown in Figure 4.23(b). By combining this information about the signs of 𝑓 and 𝑔, we find that the direction field will point into 𝑅 all along the outer circle, as shown in Figure 4.23(c).

[Figure 4.23. The outer circle of radius $r_{\text{out}}$: (a) signs of 𝑓, separated by the line 𝑦 = −𝜇𝑥; (b) signs of 𝑔, separated by the line 𝑥 = 𝜇𝑦; (c) the resulting direction field.]
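As a complementary check, not taken from the text, the inward direction on the outer circle can be confirmed by computing the radial component of the direction field. Writing $r^2 = x^2 + y^2$, a short calculation gives the following.

```latex
\begin{align*}
\frac{1}{2}\frac{d}{dt}\left(x^2 + y^2\right)
  &= x f(x, y) + y g(x, y) \\
  &= x\left[-y - x(r^2 - \mu)\right] + y\left[x - y(r^2 - \mu)\right] \\
  &= -r^2\,(r^2 - \mu).
\end{align*}
```

On the outer circle $r^2 = 2\mu$ this equals $-2\mu^2 < 0$, so the radius is decreasing there and the field points into 𝑅. The same expression is positive on any circle with $0 < r^2 < \mu$, which is consistent with the outward push near the unstable spiral.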

Thus 𝑅 will be a trapping region with the above choices of $r_{\text{in}}$ and $r_{\text{out}}$, and by the Poincaré–Bendixson theorem, there exists a periodic orbit in 𝑅. Note that this result holds for any 𝜇 > 0, and that all such periodic solutions are bounded by the radius $r_{\text{out}} = \sqrt{2\mu}$. Moreover, in the limit 𝜇 → 0, note that all such periodic solutions must collapse onto the origin. For verification, Figure 4.24 shows the direction field in the case when 𝜇 = 2: a single periodic solution is visible, and it is a stable limit cycle.

[Figure 4.24. Direction field in the (𝑥, 𝑦) plane for 𝜇 = 2.]
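The limit cycle in Example 4.9.1 can also be observed numerically. The sketch below integrates the system with a classical fourth-order Runge–Kutta scheme. The integrator, step size, number of steps, and starting points are all illustrative assumptions rather than anything prescribed in the text.

```python
import math

def field(x, y, mu):
    """Right-hand side of the system in Example 4.9.1."""
    s = x * x + y * y - mu
    return -y - x * s, x - y * s

def rk4_step(x, y, mu, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1x, k1y = field(x, y, mu)
    k2x, k2y = field(x + 0.5 * h * k1x, y + 0.5 * h * k1y, mu)
    k3x, k3y = field(x + 0.5 * h * k2x, y + 0.5 * h * k2y, mu)
    k4x, k4y = field(x + h * k3x, y + h * k3y, mu)
    x += h * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
    y += h * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0
    return x, y

def final_radius(x, y, mu, h=0.01, steps=2000):
    """Distance from the origin after integrating for steps * h time units."""
    for _ in range(steps):
        x, y = rk4_step(x, y, mu, h)
    return math.hypot(x, y)

mu = 2.0
print(final_radius(0.1, 0.0, mu))  # start close to the unstable spiral
print(final_radius(1.9, 0.0, mu))  # start close to the outer circle
print(math.sqrt(mu))               # radius at which x f + y g vanishes
```

Solutions started on either side approach the same closed curve, which is what one expects from a stable limit cycle inside the trapping region.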

Our final result follows from a simple fact about level sets for a continuously differentiable function of two variables. Specifically, in the neighborhood of a strict local extremum point (minimum or maximum) of such a function, the level sets must be closed curves that encircle the point. Result 4.9.3. [nonlinear center] Let (𝑥∗ , 𝑦∗ ) be an isolated equilibrium and let 𝐸(𝑥, 𝑦) be a first integral of (4.26). If 𝐸(𝑥, 𝑦) has a strict local extremum at (𝑥∗ , 𝑦∗ ), then the phase diagram in a sufficiently small neighborhood of (𝑥∗ , 𝑦∗ ) is a center. Every solution curve in this neighborhood is a periodic orbit, except the equilibrium itself. Thus an isolated equilibrium that is a strict local extremum of a first integral is encircled by infinitely many periodic orbits. These orbits fill an entire neighborhood around the equilibrium, and the orbits and equilibrium are all neutrally stable. This type of equilibrium is the nonlinear version of a center as considered in the linear case. Note that, in addition to the Poincaré–Bendixson theorem, the above result provides another tool for establishing the existence of periodic orbits. Results 4.8.1 and 4.9.3 together provide conditions under which an equilibrium in a nonlinear system has a standard phase diagram. While the eigenvalues of a Jacobian are sufficient to classify a nonlinear node, saddle, or spiral, they are not sufficient for a nonlinear center. However, a condition involving a first integral is sufficient in this latter case. Note that the existence of a first integral for (4.26) depends on the form of the path equation in (4.2). Specifically, a first integral will exist when this first-order differential equation is exact, or can be converted to an exact form with a suitable integrating factor. The condition in Result 4.9.3 that the equilibrium be isolated is important. Simple considerations show that the result does not generally hold for equilibria that are nonisolated. 
For instance, any point within a line of equilibria cannot be encircled by a periodic orbit, because the orbit would necessarily intersect and cross through other equilibria from the line, which cannot occur. A first integral, if one exists, would also be helpful in understanding solution curves in such a nonisolated case. For instance,


if a point on the line is a strict local extremum of the integral, then the closed level sets of the integral would contain equilibria from the line, and a solution could not move around the entire contour of a level set, but instead be trapped within an arc.

4.10. Bifurcation

Equilibrium and periodic solutions and their stability are key features of a system that provide important qualitative information. The dependence of these features on any parameter can be studied as before. For instance, we may consider

\[
\frac{dx}{dt} = f(x, y, \mu), \qquad \frac{dy}{dt} = g(x, y, \mu), \qquad t \ge 0, \tag{4.28}
\]

where 𝜇 is an arbitrary parameter of interest. Note that nullclines, regions of increase and decrease, equilibria, Jacobians, eigenvalues, and first integrals will all depend on the parameter. A bifurcation is said to occur at a value 𝜇 = 𝜇# if there is a qualitative change in any part of the phase diagram of (4.28) as the parameter changes from 𝜇 < 𝜇# to 𝜇 > 𝜇# . Just as in the one-dimensional case, equilibrium solutions can be created or destroyed, and stabilized or destabilized, and can otherwise change from one type to another. However, in the two-dimensional case, there is the additional possibility that periodic orbits can be created or destroyed, and stabilized or destabilized, and other kinds of bifurcations can also occur. For instance, the phase plane may be partitioned by special solution curves, such as those which emanate from one saddle point and arrive at another, and this partitioning can qualitatively change. Various examples are considered in the Exercises.

4.11. Case study

Setup. To illustrate some of the preceding results, and a simple model in epidemiology, we study the spread of an infectious disease in a population. As shown in Figure 4.25, we consider a population $\tilde{P}$ in a local, isolated community. We partition the population into three subsets: the collection $\tilde{S}$ of persons who are susceptible to the disease, the collection $\tilde{I}$ of persons who are infected with the disease, and the collection $\tilde{R}$ of persons who have recovered (or died) and are now immune to the disease.

[Figure 4.25. A local, isolated community with population $\tilde{P} = \tilde{S} \cup \tilde{I} \cup \tilde{R}$.]

For simplicity, we consider a time window of only a few weeks, and thus ignore changes in the total population size due to births and migration of persons into and from the community. We also suppose that susceptible and infected persons roam freely, and that a person of one type has an equal chance to encounter any person of the other type. Under these ideal conditions, we seek to understand how an infectious disease


may spread throughout the population in time, beginning from one infected individual, and how the dynamics of the spread are influenced by parameters associated with the disease and population.

Outline of model. Let 𝑆, 𝐼, 𝑅 denote the sizes of the subsets $\tilde{S}$, $\tilde{I}$, $\tilde{R}$ at time 𝑡, with dimensions of [𝑆], [𝐼], [𝑅] = Person and [𝑡] = Time (weeks). We assume that persons move from one subset to another as illustrated in Figure 4.26, where 𝜙1 and 𝜙2 are transfer rates, with dimensions of [𝜙1], [𝜙2] = Person/Time. This diagram shows the simplest version of an SIR model.

[Figure 4.26. Transfer diagram $\tilde{S} \to \tilde{I} \to \tilde{R}$, with infection rate 𝜙1 and recovery rate 𝜙2.]

Infection rate. The rate at which individuals become infected is represented by

\[
\phi_1 = \frac{\#\ \text{persons infected}}{\text{week}}. \tag{4.29}
\]

We assume that the disease spreads by a pairing between one susceptible person and one infected person, in which a physical interaction occurs, such as a handshake, cough, or sneeze. Under this assumption, the infection rate can be decomposed at any instant of time as

\[
\phi_1 = (\#\ \text{possible pairings})\left(\frac{\#\ \text{infections}}{\#\ \text{possible pairings} \cdot \text{week}}\right). \tag{4.30}
\]

The first factor above, which is the number of different, possible pairings at any instant, is the product 𝑆 ⋅ 𝐼. The second factor, which is the number of infections per pairing per week, is assumed to be a constant 𝑎 > 0 and is called the transmission coefficient. This constant represents not only the infectiousness of the disease, but also the behavior of the population – it represents the fraction of all possible pairings that are expected to actually occur, which depends on social characteristics, and the fraction of these that result in transmission. Thus a model for the infection rate is (4.31)

𝜙1 = 𝑎𝑆𝐼.

Recovery rate. The rate at which individuals recover (including deaths) and become immune is represented by

\[
\phi_2 = \frac{\#\ \text{persons recovered}}{\text{week}}. \tag{4.32}
\]

We assume that a person can recover and become immune only after becoming infected, and thus exclude any kind of vaccination process whereby the infection could


be bypassed. Under this assumption, the recovery rate can be decomposed at any instant of time as

\[
\phi_2 = (\#\ \text{infected})\left(\frac{\#\ \text{recoveries}}{\#\ \text{infected} \cdot \text{week}}\right). \tag{4.33}
\]

The first factor above is the number 𝐼. The second factor, which is the number of recoveries per number infected per week, is assumed to be a constant 𝑟 > 0 and is called the recovery coefficient. This constant represents not only a property of the disease, but also a property of the population: age, fitness, and other health factors. Thus a model for the recovery rate is

\[
\phi_2 = rI. \tag{4.34}
\]

Model equations. In view of Figure 4.26, and the model expressions in (4.31) and (4.34), we consider the equations

\[
\begin{aligned}
\frac{dS}{dt} &= -\phi_1 = -aSI, & S|_{t=0} &= S_0, \\
\frac{dI}{dt} &= \phi_1 - \phi_2 = aSI - rI, & I|_{t=0} &= I_0, \\
\frac{dR}{dt} &= \phi_2 = rI, & R|_{t=0} &= R_0.
\end{aligned} \tag{4.35}
\]

Note that $N = S_0 + I_0 + R_0 > 0$ is the total population size. We seek to understand how an infection spreads, beginning from one infected person in a population, where no one is initially immune. Thus we focus on initial conditions of the form

\[
S_0 = N - 1, \qquad I_0 = 1, \qquad R_0 = 0. \tag{4.36}
\]

Specifically, we seek to understand how the size of the infected group evolves in time, and how qualitative properties of this evolution depend on the parameters 𝑎, 𝑟 and 𝑁.

Reduced equations. The variable 𝑅 can be explicitly eliminated from the system. This follows from the above equations and the observation that

\[
\frac{dS}{dt} + \frac{dI}{dt} + \frac{dR}{dt} = 0, \qquad t \ge 0, \tag{4.37}
\]

which can be integrated in time to obtain $S + I + R = S_0 + I_0 + R_0$, which then implies

\[
R = N - S - I, \qquad t \ge 0. \tag{4.38}
\]

The above result is the statement that the total size of the population is constant. This follows from the fact that the model includes no mechanisms, such as births or migration, that would cause the size to change. Thus any expression or function of the variables 𝑆, 𝐼 and 𝑅 can be reduced to a function of 𝑆 and 𝐼 only. Considering only the model equations for 𝑆 and 𝐼, and introducing the scale transformation 𝑥 = 𝑆/𝑁, 𝑦 = 𝐼/𝑁 and 𝜏 = 𝑟𝑡, we obtain the two-dimensional dynamical system

\[
\frac{dx}{d\tau} = -\rho xy, \qquad
\frac{dy}{d\tau} = \rho xy - y, \qquad
\rho = \frac{aN}{r} > 0, \qquad
x_0 + y_0 = 1, \qquad
x_0 = \frac{N - 1}{N} \lesssim 1. \tag{4.39}
\]
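The reduced system (4.39) is simple enough to explore numerically. The following sketch integrates it with a fourth-order Runge–Kutta scheme, starting from one infected person. The population size `n_people = 1000`, the step size, the number of steps, and the two sample values of 𝜌 are illustrative assumptions only.

```python
def sir_rhs(x, y, rho):
    """Right-hand side of the reduced system (4.39)."""
    return -rho * x * y, rho * x * y - y

def simulate(rho, n_people=1000, h=0.01, steps=5000):
    """Integrate (4.39) with RK4 from one infected person; return all y values."""
    x, y = (n_people - 1) / n_people, 1.0 / n_people
    ys = [y]
    for _ in range(steps):
        k1 = sir_rhs(x, y, rho)
        k2 = sir_rhs(x + 0.5 * h * k1[0], y + 0.5 * h * k1[1], rho)
        k3 = sir_rhs(x + 0.5 * h * k2[0], y + 0.5 * h * k2[1], rho)
        k4 = sir_rhs(x + h * k3[0], y + h * k3[1], rho)
        x += h * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        y += h * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
        ys.append(y)
    return ys

for rho in (0.5, 3.0):
    ys = simulate(rho)
    # initial, largest, and final infected fraction for this value of rho
    print(rho, ys[0], max(ys), ys[-1])
```

Comparing the largest infected fraction with the initial one for each value of 𝜌 gives a quick indication of whether the infected group grows before it declines.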

Note that the differential equations involve only a single dimensionless parameter 𝜌, which is called the basic reproduction number. In the above, the notation 𝑥₀ ≲ 1 means strictly less than, but nearly equal. Although the above system is mathematically well defined for all real values of 𝑥 and 𝑦, we only consider the physically meaningful case with 𝑥 ≥ 0 and 𝑦 ≥ 0.

Analysis of model. The qualitative properties of the solution of (4.39), beginning from the initial conditions of interest, are completely determined by the magnitude of the parameter 𝜌 > 0.

Equilibria. An equilibrium must satisfy −𝜌𝑥𝑦 = 0 and 𝜌𝑥𝑦 − 𝑦 = 0. By adding these two equations we obtain 𝑦 = 0, and we deduce that any point on the 𝑥-axis is an equilibrium. Thus the system has an entire line of equilibria, of the form (𝑥∗, 𝑦∗) = (𝑥∗, 0) for all 𝑥∗ ≥ 0. Considering the Jacobian matrix at any such point we find det 𝐴∗ = 0, which implies that each equilibrium is degenerate. Thus the phase diagram around each point (𝑥∗, 0) cannot be determined based on the eigenvalues and eigenvectors of 𝐴∗.

Nullclines, directions. Significant information for the system can be obtained from the nullclines and sign regions for 𝑓(𝑥, 𝑦) = −𝜌𝑥𝑦 and 𝑔(𝑥, 𝑦) = (𝜌𝑥 − 1)𝑦. All points satisfying 𝑓 = 0 have 𝑥 = 0 or 𝑦 = 0, and we note that 𝑓 < 0 in the first quadrant with 𝑥 > 0 and 𝑦 > 0. Similarly, all points satisfying 𝑔 = 0 have 𝑥 = 1/𝜌 or 𝑦 = 0, and we note that 𝑔 < 0 if 0 ≤ 𝑥 < 1/𝜌 and 𝑦 > 0, and 𝑔 > 0 if 𝑥 > 1/𝜌 and 𝑦 > 0. The direction field for the system is illustrated in Figure 4.27. Note that the line of equilibria at 𝑦 = 0 is partitioned into two qualitatively different segments depending on 𝜌: equilibria with 𝑥 > 1/𝜌 have a repelling character, whereas equilibria with 0 ≤ 𝑥 < 1/𝜌 have an attracting character.

Figure 4.27. [Nullclines 𝑓 = 0 and 𝑔 = 0 in the 𝑥, 𝑦-plane, with the vertical nullcline at 𝑥 = 1/𝜌.]

Observations. Given initial conditions 𝑥₀ + 𝑦₀ = 1, with 𝑥₀ ≲ 1, there are two cases that arise depending on the value of 𝜌.

Case 0 < 𝜌 < 1. The initial condition and behavior of the resulting solution are illustrated in Figure 4.28 (a). The dashed line shows all points with 𝑥₀ + 𝑦₀ = 1, and the condition 𝑥₀ ≲ 1 implies that (𝑥₀, 𝑦₀) is near the intercept on the right end. Note that (𝑥₀, 𝑦₀) is near the attracting equilibria, in the region to the left of the vertical line 𝑥 = 1/𝜌, where the direction field is leftwards and downwards at all points. In this case, for the solution beginning at (𝑥₀, 𝑦₀), the size 𝑦 of the infected group will monotonically shrink in time. Thus the disease does not spread among the population.

Case 𝜌 > 1. The initial condition and behavior of the resulting solution are illustrated in Figure 4.28 (b). Note that (𝑥₀, 𝑦₀) is near the repelling equilibria, in the region to the right of the vertical line 𝑥 = 1/𝜌, where the direction field is leftwards and upwards in contrast to before. The shape of the solution curve for the solution beginning at (𝑥₀, 𝑦₀) is qualitatively as shown. In this case, the size 𝑦 of the infected group will grow before shrinking. Thus the disease spreads among the population and an outbreak is said to occur. An interesting problem is to determine the maximum size 𝑦max of the infected group. A geometric description of the solution curve can be obtained by integrating the path equation for this system, which yields 𝑦 = (1/𝜌) ln 𝑥 − 𝑥 + 𝐶, where 𝐶 is a constant determined by (𝑥₀, 𝑦₀). Note that 𝑦 = 𝑦max occurs when 𝑥 = 1/𝜌, and that the value of 𝑦max will be decreased as 1/𝜌 is increased. Thus any social interventions in the population that reduce the parameter 𝜌 = 𝑎𝑁/𝑟 will flatten the curve. For example, the practice of social distancing would reduce the transmission coefficient 𝑎, and thereby reduce 𝜌.

Figure 4.28. [Panels (a) and (b): the initial point (𝑥₀, 𝑦₀) and the vertical line 𝑥 = 1/𝜌 in the 𝑥, 𝑦-plane; panel (b) also marks 𝑦max.]
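The path equation gives a closed-form value for the peak. Using 𝑥₀ + 𝑦₀ = 1 to fix the constant, 𝐶 = 1 − (1/𝜌) ln 𝑥₀, and evaluating at 𝑥 = 1/𝜌 gives 𝑦max = 1 − (1/𝜌)(1 + ln(𝜌𝑥₀)) whenever 𝜌𝑥₀ > 1. The sketch below, which is an illustration rather than part of the original text, compares this value with a direct numerical integration of (4.39). The function names and the value 𝑥₀ = 0.999 are assumptions made for the example.

```python
import math


def y_max_formula(rho, x0):
    """Peak of y along the path y = (1/rho) ln x - x + C through (x0, 1 - x0)."""
    y0 = 1.0 - x0
    if rho * x0 <= 1.0:
        return y0                      # y decreases from the start: no outbreak
    C = y0 + x0 - math.log(x0) / rho   # constant fixed by the initial point
    x_peak = 1.0 / rho                 # dy/dtau = 0 when rho * x = 1
    return math.log(x_peak) / rho - x_peak + C


def y_max_numeric(rho, x0, tau_end=30.0, dtau=0.001):
    """Largest y seen while integrating (4.39) with classical RK4 steps."""
    def rhs(x, y):
        return (-rho * x * y, rho * x * y - y)

    x, y = x0, 1.0 - x0
    peak = y
    for _ in range(int(round(tau_end / dtau))):
        k1 = rhs(x, y)
        k2 = rhs(x + 0.5 * dtau * k1[0], y + 0.5 * dtau * k1[1])
        k3 = rhs(x + 0.5 * dtau * k2[0], y + 0.5 * dtau * k2[1])
        k4 = rhs(x + dtau * k3[0], y + dtau * k3[1])
        x += dtau * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        y += dtau * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
        peak = max(peak, y)
    return peak


x0 = 0.999  # assumed: one infected person in a population of N = 1000
for rho in (0.8, 1.5, 2.0, 3.0):
    print(rho, y_max_formula(rho, x0), y_max_numeric(rho, x0))
```

Running the loop for several values of 𝜌 shows the peak growing with 𝜌, which is the quantitative content of the statement that reducing 𝜌 flattens the curve.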

4.12. Case study

Setup. To illustrate another application of the concepts of this chapter, and a simple model from physics, we study the free-spinning motion of a body. As shown in Figure 4.29, we consider a uniform body of rectangular shape, with edge lengths 𝑎, 𝑏, 𝑐 > 0, and total mass 𝑚 > 0. We suppose that the body is rigid, so that its shape is fixed but otherwise free to move. The body can experience two physically distinct types of motion, which usually occur simultaneously: its center of mass can translate in any way and trace out some path in space, and the body can independently rotate about this center as it translates. Here we focus on the rotational component of the motion. We seek to understand the qualitative behaviors that are possible when the body is given an arbitrary initial spin, and is allowed to spin freely thereafter, with zero net torque, and how these behaviors depend on the parameters of the body. For instance, we can consider an experiment in which the body is tossed into the air, and we observe its rotational motion during flight, before it falls to the ground. The assumption of a rectangular shape is made for simplicity; an arbitrary shape could be considered, but at the expense of more complicated expressions and statements.

Outline of model. Let 𝑒₁, 𝑒₂, 𝑒₃ be orthonormal basis vectors for three-dimensional space, which are attached to and move with the body. The rotational motion of the body is described by the angular velocity vector 𝜔. The direction of this vector corresponds to the axis of rotation, while the magnitude corresponds to the rate of rotation. For rotational motion, the mass properties of the body are described by an inertia matrix 𝛤, which is always symmetric and positive-definite, and its product with velocity is called the angular momentum vector 𝑢 = 𝛤𝜔. The relation between momentum and velocity can be inverted to get 𝜔 = 𝐾𝑢, where 𝐾 = 𝛤⁻¹. In the basis 𝑒₁, 𝑒₂, 𝑒₃ as shown, the vectors have components 𝜔 = (𝜔₁, 𝜔₂, 𝜔₃), 𝑢 = (𝑢₁, 𝑢₂, 𝑢₃), and the matrices are diagonal, with components

(4.40) 𝛤 = diag(1/𝛼, 1/𝛽, 1/𝛾), 𝐾 = diag(𝛼, 𝛽, 𝛾).

Figure 4.29. [Rectangular body with edge lengths 𝑎, 𝑏, 𝑐 and body-fixed basis vectors 𝑒₁, 𝑒₂, 𝑒₃; the direction of 𝜔 is the axis of rotation and its magnitude is the rate of rotation.]

In the above, the inertia parameters are 1/𝛼 = (𝑚/12)(𝑎² + 𝑐²), 1/𝛽 = (𝑚/12)(𝑎² + 𝑏²), 1/𝛾 = (𝑚/12)(𝑏² + 𝑐²), and we will be interested in the case when 𝑐 > 𝑏 > 𝑎, which implies 𝛽 > 𝛼 > 𝛾. We consider the case when the body is given an arbitrary initial spin, and is allowed to spin freely thereafter, with zero net torque. Thus we suppose that the initial angular momentum is an arbitrary vector 𝑢₀ ≠ 0, of some arbitrary magnitude 𝜂 > 0. In the trivial case 𝑢₀ = 0, which corresponds to no initial spin, the body would simply remain still. The law of angular momentum from physics leads to the following system, called the Euler equations for rotational motion, where × denotes the vector cross product, and | ⋅ | denotes magnitude,

(4.41) 𝑑𝑢/𝑑𝑡 = 𝑢 × (𝐾𝑢), 𝑢|𝑡=0 = 𝑢₀, |𝑢₀| = 𝜂 > 0, 𝑡 ≥ 0.

Analysis of model. Here we outline some results on the equilibria and corresponding phase diagrams for (4.41), which reveal some interesting features about rotating bodies. The details are left to the Exercises.

Phase space. Using the component expression 𝑢 = (𝑢₁, 𝑢₂, 𝑢₃), and carrying out the vector cross product, we find that (4.41) gives the dynamical system

(4.42) 𝑑𝑢₁/𝑑𝑡 = (𝛾 − 𝛽)𝑢₂𝑢₃, 𝑑𝑢₂/𝑑𝑡 = (𝛼 − 𝛾)𝑢₃𝑢₁, 𝑑𝑢₃/𝑑𝑡 = (𝛽 − 𝛼)𝑢₁𝑢₂.

We consider initial conditions (𝑢₁,₀, 𝑢₂,₀, 𝑢₃,₀) ≠ (0, 0, 0), satisfying the magnitude condition 𝑢₁,₀² + 𝑢₂,₀² + 𝑢₃,₀² = 𝜂². Solutions of this system trace out paths or orbits in a three-dimensional phase space, with 𝑢₁, 𝑢₂, 𝑢₃-coordinate axes. By observation, we note that every solution has the property that

(4.43) 𝑑/𝑑𝑡 (𝑢₁² + 𝑢₂² + 𝑢₃²) = 2𝑢₁ 𝑑𝑢₁/𝑑𝑡 + 2𝑢₂ 𝑑𝑢₂/𝑑𝑡 + 2𝑢₃ 𝑑𝑢₃/𝑑𝑡 = 0, 𝑡 ≥ 0.

Integrating the above expression in time we obtain 𝑢₁² + 𝑢₂² + 𝑢₃² = 𝜂², where we have used the fact that 𝑢₁,₀² + 𝑢₂,₀² + 𝑢₃,₀² = 𝜂². This shows that every solution remains on the sphere of radius 𝜂 for all time. Thus the system is effectively two-dimensional: instead of paths in a flat plane, solutions trace out paths on the curved surface of a sphere.

Equilibria. In an equilibrium solution, the momentum vector 𝑢 and the corresponding velocity vector 𝜔 are constant, which implies that the rate and axis of rotation are constant, which corresponds to a steady spinning motion. Equilibria must satisfy (𝛾 − 𝛽)𝑢₂𝑢₃ = 0, (𝛼 − 𝛾)𝑢₃𝑢₁ = 0, (𝛽 − 𝛼)𝑢₁𝑢₂ = 0, along with the magnitude condition 𝑢₁² + 𝑢₂² + 𝑢₃² = 𝜂². Since the parameters 𝛼, 𝛽, 𝛾 are all distinct, we find that there are precisely six equilibria, given by 𝑢∗ = (±𝜂, 0, 0), (0, ±𝜂, 0), (0, 0, ±𝜂). Since the matrix 𝐾 is diagonal, the corresponding angular velocity vectors are 𝜔∗ = 𝐾𝑢∗ = (±𝜂𝛼, 0, 0), (0, ±𝜂𝛽, 0), (0, 0, ±𝜂𝛾), which are parallel to the basis vectors 𝑒₁, 𝑒₂, 𝑒₃. Thus steady spinning motions can only occur about an axis that is perpendicular to a face of the body. An initial spin about any other axis will result in a non-steady motion, which can generally be described as tumbling or wobbling.

Local phase diagrams. The phase diagram for each equilibrium can be found explicitly and exactly. (Although an approach based on a three-dimensional Jacobian matrix could be contemplated, the Jacobian would have zero determinant in all cases, and also purely imaginary eigenvalues in some cases, which would be inconclusive.) An exact phase diagram is possible for each equilibrium by projecting solution curves onto an appropriate plane. For example, consider any solution curve (𝑢₁, 𝑢₂, 𝑢₃)(𝑡) starting near the equilibrium (0, 0, 𝜂); say, in the region on the sphere with 𝜂 − 𝛿 < 𝑢₃ < 𝜂 for some small 𝛿 > 0. The projection of the curve (𝑢₁, 𝑢₂, 𝑢₃)(𝑡) onto the 𝑢₁, 𝑢₂-plane is the curve (𝑢₁, 𝑢₂)(𝑡). The path traced out by this projected curve is described by the equation

(4.44) 𝑑𝑢₁/𝑑𝑢₂ = (𝑑𝑢₁/𝑑𝑡)/(𝑑𝑢₂/𝑑𝑡) = (𝛾 − 𝛽)𝑢₂𝑢₃ / ((𝛼 − 𝛾)𝑢₃𝑢₁) = (𝛾 − 𝛽)𝑢₂ / ((𝛼 − 𝛾)𝑢₁).

The above equation can be integrated to obtain 𝑝𝑢₁² + 𝑞𝑢₂² = 𝐶, where 𝐶 is an arbitrary constant, and 𝑝, 𝑞 are parameters depending on 𝛼, 𝛽, 𝛾 (for example, 𝑝 = 𝛼 − 𝛾 and 𝑞 = 𝛽 − 𝛾). In the present case, using the fact that 𝛽 > 𝛼 > 𝛾, we have 𝑝 > 0 and 𝑞 > 0. Since the projected curve (𝑢₁, 𝑢₂)(𝑡) is an ellipse in the plane around (0, 0), the original curve (𝑢₁, 𝑢₂, 𝑢₃)(𝑡) must be an ellipse on the sphere around (0, 0, 𝜂). Hence the phase diagram for the equilibrium (0, 0, 𝜂) has the form of a neutrally stable center as in the flat case, but now on the surface of the sphere. Note that the same result will be obtained for the equilibrium (0, 0, −𝜂). Specifically, the phase diagram for each equilibrium depends on its axis, but not its sign.

A phase diagram can be found for each of the six equilibrium points. For the case of a rectangular body, with edges 𝑐 > 𝑏 > 𝑎 and basis vectors 𝑒₁, 𝑒₂, 𝑒₃ as shown earlier, we find that the overall diagram consists of four centers and two saddles on the spherical phase space. A subset of these is illustrated in Figure 4.30, corresponding to the positive side of each axis. The centers occur at the points (0, ±𝜂, 0) and (0, 0, ±𝜂), and each is neutrally stable and encircled by a family of periodic orbits. The saddles occur at the points (±𝜂, 0, 0), and a more detailed analysis shows that the out-going solution curves from (𝜂, 0, 0) are the in-going curves for (−𝜂, 0, 0), and vice-versa. Note that the centers correspond to spinning motions about the 𝑒₂ and 𝑒₃ basis vectors, which are parallel to the long and short edges 𝑐 and 𝑎 of the body. The saddles correspond to spinning motions about the 𝑒₁ basis vector, which is parallel to the intermediate edge 𝑏 of the body.

Figure 4.30. [Phase diagram on the sphere with 𝑢₁, 𝑢₂, 𝑢₃-coordinate axes.]
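The remark that a three-dimensional Jacobian would be inconclusive can be checked directly. The sketch below is an illustration under assumed parameter values (𝛼 = 2, 𝛽 = 3, 𝛾 = 1, 𝜂 = 1, chosen only to satisfy 𝛽 > 𝛼 > 𝛾); the function names are hypothetical. It uses the fact that a 3×3 matrix with zero diagonal and zero determinant has characteristic polynomial 𝜆³ − 𝑠𝜆 = 0, where 𝑠 is the sum of the products of mirrored off-diagonal entries.

```python
import cmath


def euler_jacobian(u, alpha, beta, gamma):
    """Jacobian of ((g-b) u2 u3, (a-g) u3 u1, (b-a) u1 u2) at the point u."""
    u1, u2, u3 = u
    return [
        [0.0, (gamma - beta) * u3, (gamma - beta) * u2],
        [(alpha - gamma) * u3, 0.0, (alpha - gamma) * u1],
        [(beta - alpha) * u2, (beta - alpha) * u1, 0.0],
    ]


def det3(J):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (J[0][0] * (J[1][1] * J[2][2] - J[1][2] * J[2][1])
            - J[0][1] * (J[1][0] * J[2][2] - J[1][2] * J[2][0])
            + J[0][2] * (J[1][0] * J[2][1] - J[1][1] * J[2][0]))


def eigenvalues_at_equilibrium(J):
    """Eigenvalues of J, assuming zero diagonal and det J = 0.

    The characteristic polynomial is then lam**3 - s*lam = 0, where s is the
    sum of the products of mirrored off-diagonal entries.
    """
    s = J[0][1] * J[1][0] + J[0][2] * J[2][0] + J[1][2] * J[2][1]
    root = cmath.sqrt(s)
    return (0j, root, -root)


alpha, beta, gamma, eta = 2.0, 3.0, 1.0, 1.0   # assumed, with beta > alpha > gamma
equilibria = {"e1": (eta, 0.0, 0.0), "e2": (0.0, eta, 0.0), "e3": (0.0, 0.0, eta)}
spectra = {}
for name, u_star in equilibria.items():
    J = euler_jacobian(u_star, alpha, beta, gamma)
    spectra[name] = (det3(J), eigenvalues_at_equilibrium(J))
    print(name, "det =", spectra[name][0], "eigenvalues =", spectra[name][1])
```

For these values the equilibrium on the 𝑒₁ axis has a real pair of eigenvalues of opposite sign together with a zero eigenvalue, while those on the 𝑒₂ and 𝑒₃ axes have a purely imaginary pair together with a zero eigenvalue, which is why the projection argument is used instead.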

The overall structure of the above phase diagram has some interesting physical consequences. For instance, if the body is given an initial spin that is nearly parallel to the 𝑒 2 or 𝑒 3 axes, then the axis of rotation will remain nearly constant, and the body will spin in a nearly steady fashion, with a slight wobble. This motion corresponds to the momentum 𝑢 moving around a small closed curve about the equilibrium in the phase diagram of a center. In contrast, if the body is given an initial spin that is nearly parallel to the 𝑒 1 axis, then the axis of rotation will change sharply, and the body will not spin in a nearly steady fashion, but will instead begin to “flip” back and forth! This motion corresponds to moving back and forth from one saddle to the other: the momentum 𝑢 starts near one saddle, then is pushed along the out-going direction to the other saddle on the opposite side of the spherical phase space, then pushed back again, and so on. The above result on the instability of steady spin motions is called the intermediate axis theorem. (The result is also known as the tennis racket theorem or the Dzhanibekov effect.) It holds for a large class of bodies of arbitrary shape, not just the rectangular shape considered here. The name of the result reflects the fact that the unstable equilibrium corresponds to a spin parallel to the intermediate edge 𝑏, while the neutrally stable equilibria correspond to spins parallel to the long and short edges 𝑐 and 𝑎. For bodies of arbitrary shape, the unstable equilibrium would correspond to a spin about the eigenvector of the intermediate eigenvalue of the inertia matrix 𝛤. The result is applicable provided that this matrix has distinct eigenvalues. Global phase diagram. The procedure outlined above for the local phase diagrams can be exploited to produce a global diagram of solution curves. Specifically, the local diagram around each of the four centers can be expanded in a maximal way. 
The resulting four maximal diagrams cover the sphere, and these diagrams imply those for the remaining two saddles. To begin, consider any one of the four center equilibrium points, say (0, 0, 𝜂). Let (𝑢1 , 𝑢2 , 𝑢3 )(𝑡) be a solution curve that starts in a neighborhood of this equilibrium, and consider the projected curve (𝑢1 , 𝑢2 )(𝑡). As shown earlier, the projected curve must trace out the ellipse 𝑝𝑢21 + 𝑞𝑢22 = 𝐶, where 𝑝, 𝑞 > 0 are parameters depending on 𝛼, 𝛽, 𝛾, and 𝐶 ≥ 0 is a constant depending on the initial point.

For initial points in a small neighborhood of (0, 0, 𝜂), we obtain a local diagram of a center surrounded by elliptical solution curves, as considered earlier. We now ask: what is the largest neighborhood that is filled with such curves? Due to the fact that 𝑢₁² + 𝑢₂² + 𝑢₃² = 𝜂², as long as the projected curve remains in the disc 𝑢₁² + 𝑢₂² < 𝜂², the original curve will remain in the hemisphere 𝑢₃ > 0, and the path equation (4.44) for the projected curve will remain valid. Thus the local diagram can be expanded to a maximal diagram, defined by the set of all ellipses 𝑝𝑢₁² + 𝑞𝑢₂² = 𝐶 that are contained in the disc 𝑢₁² + 𝑢₂² < 𝜂².

Figure 4.31. [Maximal diagram in the 𝑢₁, 𝑢₂-plane.]

Figure 4.31 shows the resulting maximal diagram associated with the center (0, 0, 𝜂). The set of all ellipses that fit within the disc fill an elliptical region of the plane, and when projected, we obtain a corresponding wedge region on the surface of the sphere around (0, 0, 𝜂). The elliptical region contains all ellipses with the given 𝑝, 𝑞 > 0, determined by the equilibrium, but different 𝐶 ≥ 0. The boundary of the elliptical region is tangent to the edge of the disc at two points. When the elliptical region is projected up to the sphere, the two points of tangency stay fixed, since they are at the equator of the sphere. As a result, the two points of tangency become the corners of the wedge region. The above construction can be applied to each of the four center equilibria (0, 0, ±𝜂) and (0, ±𝜂, 0), and a maximal diagram and wedge region is obtained for each. As shown in Figure 4.32, the wedge regions have disjoint interiors, cover the entire sphere, and share a common set of corner points and boundary curves. The common corner points are (±𝜂, 0, 0), which are the saddle equilibria, and the common boundary curves are precisely the in- and out-going solution curves of the two saddles. Note that all solution curves on the sphere are closed orbits, except for the equilibria and the four curves that connect the saddles.

Figure 4.32.


All solution curves in a wedge region have the same orientation about the corresponding equilibrium, and the orientation extends to the boundary of the wedge. Since the orientation of adjacent wedges must agree on the common boundary, the orientation within one wedge determines the orientation within all four. By inspection of the direction field, we find that the orientation in the top wedge containing (0, 0, 𝜂) is CCW. Finally, from the orientation of the common boundary curves between the wedges, we find that an out-going curve from one saddle is an in-going curve to the other, and vice-versa.
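The contrast between spins near a center and spins near a saddle can be seen in a short simulation of the component form of (4.41). This is a sketch under assumed values (𝛼 = 2, 𝛽 = 3, 𝛾 = 1, 𝜂 = 1, and two arbitrarily chosen initial points), using a hand-written Runge–Kutta step; the function names are hypothetical and not from the text.

```python
import math


def euler_rhs(u, alpha, beta, gamma):
    """Component form of du/dt = u x (K u) with K = diag(alpha, beta, gamma)."""
    u1, u2, u3 = u
    return ((gamma - beta) * u2 * u3,
            (alpha - gamma) * u3 * u1,
            (beta - alpha) * u1 * u2)


def rk4_step(u, dt, alpha, beta, gamma):
    """One classical Runge-Kutta step of size dt."""
    def shift(k, h):
        return tuple(ui + h * ki for ui, ki in zip(u, k))
    k1 = euler_rhs(u, alpha, beta, gamma)
    k2 = euler_rhs(shift(k1, 0.5 * dt), alpha, beta, gamma)
    k3 = euler_rhs(shift(k2, 0.5 * dt), alpha, beta, gamma)
    k4 = euler_rhs(shift(k3, dt), alpha, beta, gamma)
    return tuple(ui + dt * (a + 2 * b + 2 * c + d) / 6.0
                 for ui, a, b, c, d in zip(u, k1, k2, k3, k4))


def spin(u0, alpha=2.0, beta=3.0, gamma=1.0, t_end=50.0, dt=0.002):
    """Integrate from u0 rescaled to unit length (so that eta = 1)."""
    norm = math.sqrt(sum(c * c for c in u0))
    u = tuple(c / norm for c in u0)
    path = [u]
    for _ in range(int(round(t_end / dt))):
        u = rk4_step(u, dt, alpha, beta, gamma)
        path.append(u)
    return path


near_e1 = spin((1.0, 0.02, 0.05))   # initial spin close to the saddle axis e1
near_e3 = spin((0.02, 0.05, 1.0))   # initial spin close to the center axis e3

print("near e1: u1 ranges over",
      min(u[0] for u in near_e1), max(u[0] for u in near_e1))
print("near e3: u3 ranges over",
      min(u[2] for u in near_e3), max(u[2] for u in near_e3))
```

In this sketch the first run should show 𝑢₁ swinging between values close to +𝜂 and −𝜂, the flipping behavior described above, while the second should show 𝑢₃ staying close to 𝜂, a steady spin with a slight wobble.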

Reference notes

Ordinary differential equations are an essential tool for modeling and analyzing problems in numerous scientific disciplines. The study of such equations from a dynamical point of view, with a focus on qualitative features such as the stability and bifurcation of equilibrium and periodic solutions, is an important part of applied mathematics. Here we considered only elementary parts of the theory for two-dimensional autonomous systems, which provides a starting point for understanding more general systems.

Proofs of the main results outlined here can be found in a number of texts. For proofs of the general theorems on existence and uniqueness of solutions, stability of equilibria, existence of periodic orbits, and the Poincaré–Bendixson theorem, see the classic texts by Coddington and Levinson (1955) and Hirsch and Smale (1974). A proof of the Hartman–Grobman theorem can be found in the texts by Perko (2001) and Teschl (2012). For texts with a focus on bifurcation theory, for dynamical and more general systems, see Chow and Hale (1982) and Guckenheimer and Holmes (1990). For introductory texts in dynamical systems see Arnold (1992), Kelley and Peterson (2010), and Strogatz (2015).

Exercises

1. Sketch the nullclines, and the direction of solution curves in all regions separated by the nullclines, and find the equilibria.
(a) 𝑑𝑥/𝑑𝑡 = 𝑥 + 𝑒^(−𝑦), 𝑑𝑦/𝑑𝑡 = −𝑦.
(b) 𝑑𝑥/𝑑𝑡 = 𝑥 − 𝑥³, 𝑑𝑦/𝑑𝑡 = 𝑥 − 𝑦.
(c) 𝑑𝑥/𝑑𝑡 = 𝑦 − 𝑥² + 1, 𝑑𝑦/𝑑𝑡 = 𝑥 + 𝑦.
(d) 𝑑𝑥/𝑑𝑡 = 𝑦 − ln 𝑥, 𝑑𝑦/𝑑𝑡 = 𝑥 − 4 + 𝑒^(−𝑦).
(e) 𝑑𝑥/𝑑𝑡 = 𝑥(𝑦 − 𝑥), 𝑑𝑦/𝑑𝑡 = 𝑦(𝑥 + 3𝑦).
(f) 𝑑𝑥/𝑑𝑡 = 𝑦, 𝑑𝑦/𝑑𝑡 = 𝑥(2 + 𝑦) − 1.

2. Sketch the nullclines, and the direction of solution curves in all regions separated by the nullclines, and find the equilibria.
(a) 𝑑𝑥/𝑑𝑡 = 𝑥² + 𝑦² − 4, 𝑑𝑦/𝑑𝑡 = 𝑥 − 2𝑦.
(b) 𝑑𝑥/𝑑𝑡 = 𝑦 − 𝑥, 𝑑𝑦/𝑑𝑡 = 5𝑥²/(4 + 𝑥²) − 𝑦.
(c) 𝑑𝑥/𝑑𝑡 = 𝑥 + 𝑦 − 2𝑥², 𝑑𝑦/𝑑𝑡 = 𝑦 + 3𝑦² − 2𝑥.
(d) 𝑑𝑥/𝑑𝑡 = −𝑥 − 𝑦 − 𝑥²𝑦, 𝑑𝑦/𝑑𝑡 = 𝑥 + 4𝑦.

3. Use the path equation to find and sketch the paths through the given points (𝑥₀, 𝑦₀). Indicate the direction of increasing time.
(a) 𝑑𝑥/𝑑𝑡 = 𝑦, 𝑑𝑦/𝑑𝑡 = 𝑥² − 𝑥, (𝑥₀, 𝑦₀) = (1/2, 0), (2, 0).
(b) 𝑑𝑥/𝑑𝑡 = −2𝑥𝑦, 𝑑𝑦/𝑑𝑡 = 2𝑥𝑦 − 𝑦, (𝑥₀, 𝑦₀) = (3/4, 1/4).
(c) 𝑑𝑥/𝑑𝑡 = 𝑒^(𝑥+𝑦), 𝑑𝑦/𝑑𝑡 = 𝑥 + 1, (𝑥₀, 𝑦₀) = (0, 0).
(d) 𝑑𝑥/𝑑𝑡 = 𝑥 + 𝑦², 𝑑𝑦/𝑑𝑡 = 𝑥 − 𝑦, (𝑥₀, 𝑦₀) = (−1/2, −1/2).

4. Find a first integral 𝐸(𝑥, 𝑦) of the dynamical system.
(a) 𝑑𝑥/𝑑𝑡 = 𝑦, 𝑑𝑦/𝑑𝑡 = 3𝑥² − 1.
(b) 𝑑𝑥/𝑑𝑡 = 2 − 𝑦, 𝑑𝑦/𝑑𝑡 = 2𝑥³.
(c) 𝑑𝑥/𝑑𝑡 = −𝑦, 𝑑𝑦/𝑑𝑡 = 𝑥³ − 𝑥.
(d) 𝑑𝑥/𝑑𝑡 = 𝑥𝑦, 𝑑𝑦/𝑑𝑡 = 𝑥².
(e) 𝑑𝑥/𝑑𝑡 = 𝑦, 𝑑𝑦/𝑑𝑡 = − sin 𝑥.
(f) 𝑑𝑥/𝑑𝑡 = 𝑥 − 𝑥𝑦, 𝑑𝑦/𝑑𝑡 = −𝑦 + 𝑥𝑦.

5. Find the general solution of the linear dynamical system. Sketch the phase diagram and state the type and stability of the equilibrium at the origin.
(a) 𝑑𝑥/𝑑𝑡 = 2𝑥 − 𝑦, 𝑑𝑦/𝑑𝑡 = −2𝑥 + 2𝑦.
(b) 𝑑𝑥/𝑑𝑡 = 𝑥 − 3𝑦, 𝑑𝑦/𝑑𝑡 = −3𝑥 + 𝑦.
(c) 𝑑𝑥/𝑑𝑡 = −3𝑥 + 4𝑦, 𝑑𝑦/𝑑𝑡 = −𝑥 + 𝑦.
(d) 𝑑𝑥/𝑑𝑡 = −2𝑥 − 3𝑦, 𝑑𝑦/𝑑𝑡 = 3𝑥 − 2𝑦.
(e) 𝑑𝑥/𝑑𝑡 = 3𝑦, 𝑑𝑦/𝑑𝑡 = −2𝑥.
(f) 𝑑𝑥/𝑑𝑡 = 2𝑥, 𝑑𝑦/𝑑𝑡 = 2𝑦.
(g) 𝑑𝑥/𝑑𝑡 = −𝑥 + 𝑦, 𝑑𝑦/𝑑𝑡 = 3𝑥 − 𝑦.
(h) 𝑑𝑥/𝑑𝑡 = 𝑥 − 4𝑦, 𝑑𝑦/𝑑𝑡 = 𝑥 + 𝑦.

6. The equation for a damped spring-mass system is given below, where 𝑥 is position or displacement, 𝑡 is time, and 𝑚 > 0, 𝑎 ≥ 0 and 𝑘 > 0 are parameters that quantify the mass, damping and stiffness of the system.

𝑚 𝑑²𝑥/𝑑𝑡² + 𝑎 𝑑𝑥/𝑑𝑡 + 𝑘𝑥 = 0.

[Diagram: mass 𝑚 with spring 𝑘 and damper 𝑎; displacement 𝑥 measured from 0.]

(a) Let 𝑦 = 𝑑𝑥/𝑑𝑡 and rewrite the above equation as a linear first-order dynamical system for 𝑥 and 𝑦. Is this system ever degenerate?
(b) Find the type and stability of the equilibrium at (𝑥∗, 𝑦∗) = (0, 0) in the cases 𝑎 = 0, 𝑎² − 4𝑘𝑚 = 0, 𝑎² − 4𝑘𝑚 > 0, and 𝑎² − 4𝑘𝑚 < 0.

7. Consider the system 𝑑𝑥/𝑑𝑡 = 2𝑥 + 𝜇𝑦, 𝑑𝑦/𝑑𝑡 = 𝑥 + 𝑦, where 𝜇 is an arbitrary parameter.
(a) Find all values of 𝜇 for which the system is nondegenerate.
(b) Find the type and stability of the equilibrium at the origin for each value of 𝜇 in (a).

8. Consider the system 𝑑𝑥/𝑑𝑡 = 𝑥 + 2𝑦, 𝑑𝑦/𝑑𝑡 = −2𝑥 − 4𝑦.
(a) Show that this system is degenerate and has an entire line of equilibrium points.
(b) Sketch the nullclines and direction field and determine if the line of equilibria has an attracting or repelling character.

9. Consider a nondegenerate system 𝑑𝑣/𝑑𝑡 = 𝐴𝑣 in two dimensions, with equilibrium 𝑣∗ = 0, and let 𝛿 = det 𝐴 (determinant) and 𝜏 = tr 𝐴 (trace). Show that:
(a) if 𝛿 < 0, then 𝑣∗ is unstable (saddle).
(b) if 𝛿 > 0 and 𝜏 > 0, then 𝑣∗ is unstable (node or spiral).
(c) if 𝛿 > 0 and 𝜏 < 0, then 𝑣∗ is asymptotically stable (node or spiral).
(d) if 𝛿 > 0 and 𝜏 = 0, then 𝑣∗ is neutrally stable (center).

10. A model for insecticide transfer between an agricultural crop and soil is given below, where 𝑥 and 𝑦 are the amounts of insecticide in the crop and soil, 𝑡 is time, and 𝛼 > 0, 𝛽 > 0 and 𝛾 > 0 are parameters that quantify transfer and degradation rates.

𝑑𝑥/𝑑𝑡 = −𝛼𝑥 + 𝛽𝑦, 𝑑𝑦/𝑑𝑡 = 𝛼𝑥 − 𝛽𝑦 − 𝛾𝑦.

[Diagram: insecticide in crop 𝑥 and insecticide in soil 𝑦, with transfer rates 𝛼𝑥 and 𝛽𝑦 between them and degradation rate 𝛾𝑦.]

Show that this system is nondegenerate and find the type and stability of the equilibrium at (𝑥∗, 𝑦∗) = (0, 0) for all 𝛼, 𝛽 and 𝛾.

11. Consider the system 𝑑𝑥/𝑑𝑡 = 𝑝𝑥 − 𝑞𝑦, 𝑑𝑦/𝑑𝑡 = 𝑞𝑥 + 2𝑦. Find all values of the parameters 𝑝, 𝑞 for which the origin will be the following:
(a) saddle.
(b) center.
(c) stable node.

12. Find the type and stability of all equilibrium points, and sketch a local phase diagram for each.
(a) 𝑑𝑥/𝑑𝑡 = 𝑥 + 𝑒^(−𝑦), 𝑑𝑦/𝑑𝑡 = −𝑦.
(b) 𝑑𝑥/𝑑𝑡 = 𝑥 − 𝑥³, 𝑑𝑦/𝑑𝑡 = 𝑥 − 𝑦.
(c) 𝑑𝑥/𝑑𝑡 = 𝑥² + 𝑦² − 4, 𝑑𝑦/𝑑𝑡 = 𝑥 − 2𝑦.
(d) 𝑑𝑥/𝑑𝑡 = 𝑦 − 𝑥, 𝑑𝑦/𝑑𝑡 = 5𝑥²/(4 + 𝑥²) − 𝑦.
(e) 𝑑𝑥/𝑑𝑡 = 𝑦 − 𝑥², 𝑑𝑦/𝑑𝑡 = 𝑦 + 𝑥 − 1.
(f) 𝑑𝑥/𝑑𝑡 = 𝑦 − 3 + 2𝑒^𝑥, 𝑑𝑦/𝑑𝑡 = 𝑥 + ln 𝑦.

13. Find all equilibrium points and their stability, using a first integral for help as needed. Determine which, if any, are nonlinear centers.
(a) 𝑑𝑥/𝑑𝑡 = 𝑦, 𝑑𝑦/𝑑𝑡 = 3𝑥² − 1.
(b) 𝑑𝑥/𝑑𝑡 = 2 − 𝑦, 𝑑𝑦/𝑑𝑡 = 2𝑥³.
(c) 𝑑𝑥/𝑑𝑡 = 4𝑦 + 𝑦², 𝑑𝑦/𝑑𝑡 = −𝑥.
(d) 𝑑𝑥/𝑑𝑡 = 𝑦³, 𝑑𝑦/𝑑𝑡 = −𝑥³.
(e) 𝑑𝑥/𝑑𝑡 = 𝑦³, 𝑑𝑦/𝑑𝑡 = 𝑥³.
(f) 𝑑𝑥/𝑑𝑡 = 𝑦², 𝑑𝑦/𝑑𝑡 = −𝑥².
(g) 𝑑𝑥/𝑑𝑡 = −𝑦, 𝑑𝑦/𝑑𝑡 = 𝑥³ − 𝑥.
(h) 𝑑𝑥/𝑑𝑡 = 𝑦, 𝑑𝑦/𝑑𝑡 = − sin 𝑥.
(i) 𝑑𝑥/𝑑𝑡 = 3𝑦² − 𝑥, 𝑑𝑦/𝑑𝑡 = 𝑦 − 𝑥.
(j) 𝑑𝑥/𝑑𝑡 = 𝑥 − 𝑥𝑦, 𝑑𝑦/𝑑𝑡 = −𝑦 + 𝑥𝑦.

14. Consider the system 𝑑𝑥/𝑑𝑡 = 𝑥𝑦, 𝑑𝑦/𝑑𝑡 = −𝑥².

(a) Show that 𝐸(𝑥, 𝑦) = 𝑥² + 𝑦² is a first integral.
(b) Explain why the origin is not a nonlinear center, even though it is an equilibrium and a strict local minimum of 𝐸(𝑥, 𝑦).

15. Let 𝑉(𝑥, 𝑦) be continuously differentiable in a neighborhood 𝐷 of an equilibrium (𝑥∗, 𝑦∗). If 𝑉(𝑥, 𝑦) has a strict local minimum at (𝑥∗, 𝑦∗), and satisfies 𝑑/𝑑𝑡 𝑉(𝑥(𝑡), 𝑦(𝑡)) ≤ 0 for every solution curve in 𝐷, then it is called a Lyapunov function for (𝑥∗, 𝑦∗).
(a) Show that, if a Lyapunov function exists, then (𝑥∗, 𝑦∗) is either neutrally or asymptotically stable.
(b) Show that, if 𝑑/𝑑𝑡 𝑉(𝑥(𝑡), 𝑦(𝑡)) < 0 holds for every solution curve (𝑥, 𝑦)(𝑡) ≠ (𝑥∗, 𝑦∗) in 𝐷, then (𝑥∗, 𝑦∗) is asymptotically stable.

16. Consider the system 𝑑𝑥/𝑑𝑡 = −𝑦 − 𝛾𝑥³, 𝑑𝑦/𝑑𝑡 = 𝑥 − 𝛾𝑦³, where 𝛾 > 0 is a parameter, and consider the equilibrium (𝑥∗, 𝑦∗) = (0, 0).
(a) Show the Hartman–Grobman theorem is inconclusive for (𝑥∗, 𝑦∗).
(b) Show that 𝑉(𝑥, 𝑦) = 𝑥² + 𝑦² is a Lyapunov function and that (𝑥∗, 𝑦∗) is asymptotically stable for any 𝛾 > 0.

17. Consider the system 𝑑𝑥/𝑑𝑡 = 𝑥𝑦, 𝑑𝑦/𝑑𝑡 = 𝑥² − 𝑦.

(a) Show that (𝑥∗, 𝑦∗) = (0, 0) is the only equilibrium, and that this equilibrium is degenerate in the sense that det 𝐴∗ = 0.
(b) Determine the stability of (𝑥∗, 𝑦∗) using nullclines and a sketch of the direction field. Explain how the phase diagram around (𝑥∗, 𝑦∗) differs from that of a node, saddle, spiral or center.

18. Find the type and stability of all equilibrium points in terms of the parameters 𝜇, 𝜂 and 𝛾. Here 𝜇 is arbitrary, and 𝜂 and 𝛾 are positive.
(a) 𝑑𝑥/𝑑𝑡 = 𝑦, 𝑑𝑦/𝑑𝑡 = 𝑥² − 𝑥 + 𝜇𝑦.
(b) 𝑑𝑥/𝑑𝑡 = 𝑥 + 𝑦, 𝑑𝑦/𝑑𝑡 = 𝜇𝑥 + 𝑥².
(c) 𝑑𝑥/𝑑𝑡 = 𝑦 + 𝜇𝑥, 𝑑𝑦/𝑑𝑡 = −𝑥 − 𝑥𝑦.
(d) 𝑑𝑥/𝑑𝑡 = 𝑦 + 𝜇𝑥, 𝑑𝑦/𝑑𝑡 = 𝑥² − 𝑦.
(e) 𝑑𝑥/𝑑𝑡 = 𝑦 − 𝜇𝑒^𝑥, 𝑑𝑦/𝑑𝑡 = 𝑥 + ln 𝑦.
(f) 𝑑𝑥/𝑑𝑡 = 𝑦 − 𝑥², 𝑑𝑦/𝑑𝑡 = 𝜇 + 𝑥/𝑦.
(g) 𝑑𝑥/𝑑𝑡 = 𝛾𝑥 − 𝑥𝑦, 𝑑𝑦/𝑑𝑡 = 𝑥𝑦 − 𝜂𝑦 + 𝑦.
(h) 𝑑𝑥/𝑑𝑡 = 𝛾 − 𝑦² − 𝜂𝑥, 𝑑𝑦/𝑑𝑡 = 𝑥𝑦 − 𝜂𝑦.

19. Consider the system 𝑑𝑥/𝑑𝑡 = 𝑥 − 𝑦 − 𝑥³, 𝑑𝑦/𝑑𝑡 = 𝑥 + 𝑦 − 𝑦³.

(a) Sketch the nullclines and direction field, and show (graphically) that (𝑥∗, 𝑦∗) = (0, 0) is the only equilibrium.
(b) Find the type and stability of (𝑥∗, 𝑦∗).
(c) Use the nullclines to find a rectangular trapping region that surrounds (𝑥∗, 𝑦∗); state the coordinates of all four corners. Show the region contains a periodic orbit.

20. In various ecological settings, prey and predators coexist and interact, and cycles are observed in the populations of both species. A model for one such pair of interacting species is given below in dimensionless form, where 𝑥 and 𝑦 are the population sizes of prey and predator, 𝑡 is time, and 𝜇 is a parameter. Assume 𝑥 ≥ 0, 𝑦 ≥ 0 and 𝜇 > 0. This model is called the Lotka–Volterra model.

𝑑𝑥/𝑑𝑡 = 𝑥 − 𝑥𝑦, 𝑑𝑦/𝑑𝑡 = −𝜇𝑦 + 𝑥𝑦.

(a) Sketch the nullclines and direction field. Find the equilibrium (𝑥∗, 𝑦∗) in which the two species coexist, with 𝑥∗ > 0 and 𝑦∗ > 0.
(b) Show that the Hartman–Grobman theorem is inconclusive for (𝑥∗, 𝑦∗).
(c) Find a first integral for the system and show that (𝑥∗, 𝑦∗) is a nonlinear center for any 𝜇 > 0.

21. Certain chemical reactions exhibit periodic behavior in time, instead of tending to an equilibrium state. A model for one such reaction is given below in dimensionless form, where 𝑥 and 𝑦 are the concentrations of the two chemical constituents, 𝑡 is time, and 𝛾 is a reaction parameter. Assume 𝑥 ≥ 0, 𝑦 ≥ 0 and 𝛾 > 0. Such reactions are called chemical oscillators.

𝑑𝑥/𝑑𝑡 = 10 − 𝑥 − 4𝑥𝑦/(1 + 𝑥²), 𝑑𝑦/𝑑𝑡 = 𝛾𝑥 − 𝛾𝑥𝑦/(1 + 𝑥²).

(a) Sketch the nullclines and direction field.
(b) Show that (𝑥∗, 𝑦∗) = (2, 5) is the only equilibrium, and that it is an unstable node or unstable spiral if 𝛾 < 𝛾#, for some 𝛾# which you should find.
(c) Use the nullclines to find a rectangular trapping region that surrounds (𝑥∗, 𝑦∗); state the coordinates of all four corners. Show the region contains a periodic orbit when 𝛾 < 𝛾#.

22. A model for the spread of a contagious illness in a population is given below, where 𝑆, 𝐼 and 𝑅 are the number of susceptible, infected and recovered individuals, 𝑡 is time, 𝑎 and 𝑟 are infection and recovery rate parameters, and 𝜇 and 𝜈 are re-susceptibility and vaccination rate parameters. Assume 𝑆, 𝐼, 𝑅 ≥ 0 and 𝑎, 𝑟 > 0 and 𝜇, 𝜈 ≥ 0. Such a model is called an SIR model.

𝑑𝑆/𝑑𝑡 = −𝑎𝑆𝐼 + 𝜇𝑅 − 𝜈𝑆,
𝑑𝐼/𝑑𝑡 = 𝑎𝑆𝐼 − 𝑟𝐼,
𝑑𝑅/𝑑𝑡 = 𝑟𝐼 − 𝜇𝑅 + 𝜈𝑆.

[Diagram: compartments 𝑆, 𝐼, 𝑅 with rates 𝜙₁ = 𝑎𝑆𝐼, 𝜙₂ = 𝑟𝐼, 𝜙₃ = 𝜇𝑅, 𝜙₄ = 𝜈𝑆.]

(a) Show that 𝑆(𝑡) + 𝐼(𝑡) + 𝑅(𝑡) = 𝑁 for all time while a solution exists, where 𝑁 is a constant. (𝑁 is the total population size.) Use this relation to eliminate 𝑅 from the 𝑆, 𝐼 equations.
(b) Consider the 𝑆, 𝐼 equations from (a) with 𝜇 = 0. Rewrite the system using 𝑥 = 𝑆/𝑁, 𝑦 = 𝐼/𝑁 and 𝜏 = 𝑟𝑡. Find all equilibria, and their type and stability, in terms of dimensionless parameters.
(c) Consider the 𝑆, 𝐼 equations from (a) with 𝜈 = 0. Rewrite the system using 𝑥 = 𝑆/𝑁, 𝑦 = 𝐼/𝑁 and 𝜏 = 𝑟𝑡. Find all equilibria, and their type and stability, in terms of dimensionless parameters.

23. The equation for a damped pendulum is given below, where 𝜃 is the angular position, 𝑡 is time, and 𝑚 > 0, ℓ > 0, 𝑔 > 0, and 𝑎 ≥ 0 are parameters that quantify mass, length, gravitational acceleration, and damping.

𝑚ℓ 𝑑²𝜃/𝑑𝑡² + 𝑎 𝑑𝜃/𝑑𝑡 + 𝑚𝑔 sin 𝜃 = 0.

[Diagram: pendulum of mass 𝑚 at angle 𝜃, with damping 𝑎 and gravity 𝑔.]

(a) Let 𝜔 = 𝑑𝜃/𝑑𝑡 and rewrite the above equation as a first-order system for 𝜃 and 𝜔. Find all equilibrium points (𝜃∗, 𝜔∗).
(b) Find the type and stability of all equilibria in the case of no damping when 𝑎 = 0. Note: the system has a first integral in this case.
(c) Find the type and stability of all equilibria in the case with damping when 𝑎 > 0.

24. In certain electrical circuits with nonlinear components, the flow of current can exhibit self-sustained oscillations. A model for one such circuit is given below in dimensionless form, where 𝑥 is the current, 𝑡 is time, and 𝜇 is a parameter. This model is called the Van der Pol equation.

𝑑²𝑥/𝑑𝑡² + 𝜇(𝑥² − 1) 𝑑𝑥/𝑑𝑡 + 𝑥 = 0.

(a) Let 𝑦 = 𝑑𝑥/𝑑𝑡 and rewrite the above equation as a first-order system for 𝑥 and 𝑦. Show that (𝑥∗, 𝑦∗) = (0, 0) is the only equilibrium.
(b) Find the type and stability of (𝑥∗, 𝑦∗) for all 𝜇 > 0.
(c) Make a qualitative sketch of the nullclines and the direction field for arbitrary 𝜇 > 0. Does a periodic orbit seem possible?
(d) Use Matlab or similar software to illustrate that an asymptotically stable periodic orbit exists; for concreteness use 𝜇 = 1.

Mini-project 1. A simple model for the relationship dynamics between two people X and Y is

𝑑𝑥/𝑑𝑡 = 𝑎𝑥 + 𝑏𝑦, 𝑥|𝑡=0 = 𝑥₀,
𝑑𝑦/𝑑𝑡 = 𝑐𝑥 + 𝑑𝑦, 𝑦|𝑡=0 = 𝑦₀, 𝑡 ≥ 0.

Here 𝑡 is time, 𝑥 is the intensity of X’s feelings (for Y), and 𝑦 is the intensity of Y’s feelings (for X), where positive values mean love, and negative values mean hate. Thus intensity of feeling is a kind of emotional temperature. The constants 𝑎 and 𝑏 characterize X’s personality: X is eager or cautious if 𝑎 > 0 or < 0, and responsive or manipulative if 𝑏 > 0 or < 0. Similarly, 𝑐 and 𝑑 characterize Y’s personality: Y is eager or cautious if 𝑑 > 0 or < 0, and responsive or manipulative if 𝑐 > 0 or < 0. Note that 𝑎, 𝑑 model intra- or within-person traits, while 𝑏, 𝑐 model inter- or between-person traits. Here we perform a qualitative analysis to understand the ultimate fate of a relationship depending on the personality types. All quantities are dimensionless. (a) Consider the case of 𝑎 = 0, 𝑏 > 0, 𝑐 < 0, 𝑑 = 0: so X is responsive and 𝑌 is manipulative, and neither is eager or cautious. In other words, X warms up when Y is warm, cools down when Y is cool, and has no self-amplifying or self-suppressing tendencies. Y behaves analogously, but cools down when X is warm, and warms up when X is cool. Show that, if (𝑥0 , 𝑦0 ) ≠ (0, 0), then the relationship evolves as a neverending cycle of love and hate. Sketch a phase diagram to illustrate how the system evolves in time. (b) Consider the case of 𝑎 = 𝑑 < 0 and 𝑏 = 𝑐 > 0, so that X and Y are both cautious and responsive with identical characteristics. Show that if |𝑎| > 𝑏 (more cautious than responsive), then the relationship always fizzles out to mutual apathy. On the other hand, if |𝑎| < 𝑏 (more responsive than cautious), then the relationship is explosive: it will generally end up in extreme mutual love or mutual hatred depending on the initial feelings. What set of initial feelings lead to mutual love? What about mutual hatred? Sketch a phase diagram for each case to illustrate how the system evolves in time. Mini-project 2. 
In the biochemical process of glycolysis, living cells obtain energy (ATP) by breaking down sugar (glucose). Many intermediate reactions and compounds are involved. A model for the dynamics of two of these compounds is

𝑑𝑥/𝑑𝑡 = −𝑥 + 𝑎𝑦 + 𝑥²𝑦, 𝑥|𝑡=0 = 𝑥₀,
𝑑𝑦/𝑑𝑡 = 𝑏 − 𝑎𝑦 − 𝑥²𝑦, 𝑦|𝑡=0 = 𝑦₀, 𝑡 ≥ 0.

[Diagram: glucose → reactions involving X, Y, Z, ... → ATP + products.]

Here 𝑡 is time, 𝑥 is the concentration of compound X (adenosine diphosphate), 𝑦 is the concentration of compound Y (fructose-6-phosphate), and 𝑎 and 𝑏 are positive constants that describe the reaction kinetics. Here we show that the system for compounds X and Y always has an equilibrium, and remarkably also has a periodic orbit under certain conditions on 𝑎 and 𝑏. The periodic orbit is observed to be asymptotically stable, and attracts solutions from a large neighborhood into a never-ending cycle. All quantities are dimensionless. (a) Sketch the nullclines and show that the system has only one equilibrium solution (𝑥∗ , 𝑦∗ ) for any 𝑎 > 0 and 𝑏 > 0. Explicitly find the equilibrium.

(b) For simplicity, assume 𝑏 = 1/2 is fixed. Determine the stability of the equilibrium in terms of 𝑎 > 0. Show that the equilibrium is unstable for 0 < 𝑎 < 𝑎# , and asymptotically stable for 𝑎 > 𝑎# , for an appropriate number 𝑎# .

(c) Consider a shaded region $R$ as shown, where the hole is centered at $(x_*, y_*)$. Show that the straight edges of $R$ can be chosen such that the direction field along these edges either points inward or along the edge. Using the result in (b), and the Poincaré–Bendixson theorem, deduce that the system must contain a closed orbit in $R$ for any $0 < a < a_\#$. Give explicit locations for the vertices of $R$. [Hint: consider the nullclines; also, note that $\dot{y} + \dot{x} \le 0$ implies $\frac{dy}{dx} \le -1$ provided $\dot{x} > 0$.] Would the same conclusion hold for $a > a_\#$?

[Figure: the shaded region $R$, with a hole, in the $x, y$-plane.]

(d) Use Matlab or similar software to simulate the system for various $(x_0, y_0)$ in $(0, 2) \times (0, 2)$. Given $b = 0.5$, consider the cases $0 < a < a_\#$ and $a > a_\#$, say $a = 0.01, 0.05, 0.10, 0.15, 0.20, 0.25$, and confirm your results in (b) and (c). What happens to solution curves and the closed orbit as $a$ is changed?

Mini-project 3. Consider a uniform, rectangular rigid body of mass $m$ and edge lengths $a$, $b$, and $c$ as outlined in Section 4.12. In the absence of an applied torque, the free-spinning motion of the body is described by

$$\frac{du}{dt} = u \times (Ku), \qquad u|_{t=0} = u_0, \qquad |u_0| = \eta > 0, \qquad t \ge 0.$$

[Figure: rectangular body with edge lengths $a$, $b$, $c$, body axes 1, 2, 3, and angular velocity $\omega$.]

In the above, $u = (u_1, u_2, u_3)$ is the angular momentum vector in the body reference frame, $K = \mathrm{diag}(\alpha, \beta, \gamma)$ is a diagonal matrix of inertia parameters, $\times$ is the vector cross product, $|\cdot|$ is the vector magnitude, and $\eta$ is a given constant. The angular velocity vector is $\omega = Ku$, and the inertia parameters are $\alpha = \frac{12}{m(a^2+c^2)}$, $\beta = \frac{12}{m(a^2+b^2)}$ and $\gamma = \frac{12}{m(b^2+c^2)}$. Notice $c > b > a$ implies $\beta > \alpha > \gamma$. Here we fill in the details omitted in the text. We find all equilibrium solutions and characterize the phase diagram around each. Remarkably, we will see that some spinning motions are unstable, and their repelling character leads to interesting motions as described earlier. The above are called the Euler equations for rigid body rotation.

(a) In components, show that the dynamical system is $\frac{du_1}{dt} = (\gamma - \beta)u_2 u_3$, $\frac{du_2}{dt} = (\alpha - \gamma)u_3 u_1$ and $\frac{du_3}{dt} = (\beta - \alpha)u_1 u_2$. Also, show that every solution $u(t)$ of this system has the property that $u_1^2 + u_2^2 + u_3^2 = \text{constant}$ for all $t \ge 0$. Since $|u_0| = \eta$, conclude that $|u| = \eta$ for all time, so every solution is on the sphere of radius $\eta$ in $u_1, u_2, u_3$-space as illustrated in Figure 4.30.

(b) Show that the system has precisely six equilibrium solutions given by $u_* = (\pm\eta, 0, 0)$, $(0, \pm\eta, 0)$ and $(0, 0, \pm\eta)$; specifically, show that there are no other equilibria.

(c) Let $u(t)$ be an arbitrary solution curve starting near $(0, 0, \eta)$ on the sphere. Show that the image of this curve in the $u_1, u_2$ plane must be an ellipse around $(0, 0)$, and hence $u(t)$ must be an elliptical curve around $(0, 0, \eta)$ on the sphere. Thus $(0, 0, \eta)$ is a neutrally stable center. Explain why a similar result holds for $(0, 0, -\eta)$ and $(0, \pm\eta, 0)$.

(d) Let $u(t)$ be an arbitrary solution curve starting near $(\eta, 0, 0)$ on the sphere. Show that the image of this curve in the $u_2, u_3$ plane must be a hyperbola around $(0, 0)$, and hence $u(t)$ must be a hyperbolic curve around $(\eta, 0, 0)$ on the sphere. Thus $(\eta, 0, 0)$ is an unstable saddle. Explain why a similar result holds for $(-\eta, 0, 0)$.
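As a numerical sanity check on part (a), and not a substitute for the requested proof, the component equations can be integrated and the magnitude $|u|$ monitored. The following is a minimal sketch under stated assumptions: the values $m = 1$, $(a, b, c) = (1, 2, 3)$, the initial point, the step size, and the helper names `euler_rhs`, `rk4_step`, `simulate`, and `norm` are all illustrative choices, and a classical Runge-Kutta scheme is used only for convenience.

```python
import math

def euler_rhs(u, K):
    """Right-hand side u x (K u) for a diagonal K = (alpha, beta, gamma)."""
    al, be, ga = K
    u1, u2, u3 = u
    return ((ga - be) * u2 * u3,
            (al - ga) * u3 * u1,
            (be - al) * u1 * u2)

def rk4_step(u, K, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    def shifted(v, k, s):
        return tuple(vi + s * ki for vi, ki in zip(v, k))
    k1 = euler_rhs(u, K)
    k2 = euler_rhs(shifted(u, k1, h / 2), K)
    k3 = euler_rhs(shifted(u, k2, h / 2), K)
    k4 = euler_rhs(shifted(u, k3, h), K)
    return tuple(ui + h / 6 * (p + 2 * q + 2 * r + s)
                 for ui, p, q, r, s in zip(u, k1, k2, k3, k4))

def simulate(u0, K, h=1e-3, steps=5000):
    """Integrate from u0 over the time interval [0, h*steps]."""
    u = u0
    for _ in range(steps):
        u = rk4_step(u, K, h)
    return u

def norm(u):
    return math.sqrt(sum(v * v for v in u))

# Illustrative values (assumptions): m = 1, edge lengths (a, b, c) = (1, 2, 3).
m, a, b, c = 1.0, 1.0, 2.0, 3.0
K = (12 / (m * (a**2 + c**2)),   # alpha
     12 / (m * (a**2 + b**2)),   # beta
     12 / (m * (b**2 + c**2)))   # gamma

# Start near the equilibrium (eta, 0, 0) with eta = 1, on the unit sphere.
raw = (1.0, 0.03, 0.02)
u0 = tuple(v / norm(raw) for v in raw)
u_end = simulate(u0, K)
print("initial |u| =", norm(u0), " final |u| =", norm(u_end))
```

The printed magnitudes should agree up to the discretization error of the scheme, which is consistent with every solution remaining on the sphere of radius $\eta$.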

Chapter 5

Perturbation methods

An important problem in the study of a model is to understand how a solution behaves when a parameter deviates from a reference value. For instance, we may know the solution when the parameter is zero, and seek to understand how the solution is altered when the parameter is nonzero. Alternatively, we may know the solution when the parameter is nonzero, and seek to explore the limit in which the parameter tends to zero. The study of such problems is the subject of perturbation theory. Here we explore cases when the model equation and its solution depend on a parameter in either a regular or singular way. We consider both algebraic and differential equations, and outline various series representation results which can be used to approximate solutions and study their behavior.

5.1. Perturbed equations

We will consider different types of model equations, usually in dimensionless form, that contain a small parameter of interest $\varepsilon \ge 0$. Given a solution when $\varepsilon = 0$, we may seek to understand how it changes when $\varepsilon > 0$. Or given a solution when $\varepsilon > 0$, we may seek to understand what happens when $\varepsilon \to 0^+$. The restriction on the parameter is purely for convenience. We could also consider an arbitrary reference value $\varepsilon = \varepsilon_0$, and study the case when $|\varepsilon - \varepsilon_0|$ is small, or equivalently, when the deviations $\varepsilon - \varepsilon_0 \ge 0$ and $\varepsilon - \varepsilon_0 \le 0$ are small. Unless mentioned otherwise, we assume the reference value is zero, and consider the interval $\varepsilon \ge 0$.

A system of equations involving a small parameter can be classified as one of two types as defined next. Here we focus on algebraic and differential equations, and note that more refined definitions can be given in more specific contexts.

Definition 5.1.1. A system of equations is called perturbed if it contains a small parameter $\varepsilon \ge 0$. It is called regularly perturbed if every solution for $\varepsilon > 0$ extends continuously to $\varepsilon = 0$; otherwise, it is called singularly perturbed.

Thus a perturbed system can be classified as one of two types depending on properties of its solutions. A system is expected to be regularly perturbed when the number of solutions for $\varepsilon > 0$ is the same as for $\varepsilon = 0$, so that no solutions are lost or become undefined as the parameter vanishes. In contrast, a system will be singularly perturbed when the number of solutions for $\varepsilon > 0$ is different than for $\varepsilon = 0$. For systems involving algebraic equations, we consider all possible solutions, real or complex, whereas for systems involving differential equations, we focus only on real solutions. In the definition, note that continuity of a solution in the parameter $\varepsilon \ge 0$ would naturally be defined in terms of a norm on the relevant solution space. Various technical conditions guarantee when a system of equations is regularly or singularly perturbed. For simplicity, we will not focus on such conditions, but will instead make assumptions based on observation.

Example 5.1.1. Consider the algebraic equation $x^2 + \varepsilon x - 1 = 0$, where $0 \le \varepsilon \ll 1$ is a parameter. The equation with $\varepsilon > 0$ is quadratic and has two solutions, and the equation with $\varepsilon = 0$ is also quadratic and has two solutions. This equation is expected to be regularly perturbed: each solution for $\varepsilon > 0$ is expected to extend continuously to $\varepsilon = 0$.

Consider now the equation $\varepsilon x^2 + x - 1 = 0$, where $0 \le \varepsilon \ll 1$. The equation with $\varepsilon > 0$ is quadratic and has two solutions, but the equation with $\varepsilon = 0$ is linear and has only one solution. This equation is singularly perturbed: some solution for $\varepsilon > 0$ does not continue to $\varepsilon = 0$.

Example 5.1.2. Consider the initial-value problem $4\frac{d^2x}{dt^2} + \varepsilon\frac{dx}{dt} + x = 1$, $\frac{dx}{dt}|_{t=0} = 2$, $x|_{t=0} = 0$, $t \ge 0$, where $0 \le \varepsilon \ll 1$ is a parameter. When $\varepsilon > 0$, the system consists of a second-order differential equation with two initial conditions and has a unique solution. When $\varepsilon = 0$, the system has the same form and again has a unique solution. This system is expected to be regularly perturbed: the solution for $\varepsilon > 0$ is expected to extend continuously to $\varepsilon = 0$.

Consider now the problem $\varepsilon\frac{d^2x}{dt^2} + 4\frac{dx}{dt} + x = 1$, $\frac{dx}{dt}|_{t=0} = 2$, $x|_{t=0} = 0$, $t \ge 0$, where $0 \le \varepsilon \ll 1$. When $\varepsilon > 0$, the system has the same form as above and has a unique solution. However, when $\varepsilon = 0$, the system consists of a first-order differential equation with two initial conditions, and has no solution; specifically, the two initial conditions cannot be satisfied with the single arbitrary constant of integration from the differential equation. This system is singularly perturbed: the solution for $\varepsilon > 0$ does not continue to $\varepsilon = 0$.
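To make the singular problem of Example 5.1.2 concrete, the following sketch evaluates its exact solution for a few small values of $\varepsilon$. It relies on a standard fact not derived in the text: for this constant-coefficient equation the solution is $x(t) = 1 + A e^{\lambda_1 t} + B e^{\lambda_2 t}$, where $\lambda_{1,2}$ are the roots of $\varepsilon\lambda^2 + 4\lambda + 1 = 0$. The function name `singular_ivp` and the sample values are illustrative assumptions.

```python
import math

def singular_ivp(eps):
    """Exact solution of eps*x'' + 4*x' + x = 1, x(0) = 0, x'(0) = 2, for 0 < eps < 4."""
    disc = math.sqrt(16.0 - 4.0 * eps)
    slow = -2.0 / (4.0 + disc)          # equals (-4 + disc) / (2*eps), written stably
    fast = (-4.0 - disc) / (2.0 * eps)  # grows like -4/eps as eps -> 0+
    # Fit A, B to the initial data: A + B = -1 and A*slow + B*fast = 2.
    B = (2.0 + slow) / (fast - slow)
    A = -1.0 - B
    x = lambda t: 1.0 + A * math.exp(slow * t) + B * math.exp(fast * t)
    dx = lambda t: A * slow * math.exp(slow * t) + B * fast * math.exp(fast * t)
    return slow, fast, x, dx

for eps in (0.1, 0.01, 0.001):
    slow, fast, x, dx = singular_ivp(eps)
    print(f"eps={eps}: slow rate={slow:.5f}, eps*fast rate={eps * fast:.5f}, "
          f"x'(0)={dx(0.0):.5f}, x'(0.1)={dx(0.1):.5f}")
```

One characteristic root stays near $-1/4$, the rate of the first-order equation obtained at $\varepsilon = 0$, while the other grows like $-4/\varepsilon$. The second initial condition is met only through the rapidly decaying term, which has no counterpart when $\varepsilon = 0$.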

5.2. Regular versus singular behavior

Given a perturbed equation with a small parameter $\varepsilon \ge 0$, we seek to understand the behavior of its solutions near $\varepsilon = 0$. For a large class of equations this behavior can be studied using series expansions. The following simple examples illustrate how both regular and singular behavior can be exposed with a series. The construction of expansions for more general problems will be the subject of this chapter.

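Example 5.2.1. (regular case). Consider $x^2 + \varepsilon x - 1 = 0$, where $0 \le \varepsilon \ll 1$. The exact solutions are

$$x = \frac{-\varepsilon \pm [\varepsilon^2 + 4]^{1/2}}{2}, \qquad \varepsilon \ge 0. \tag{5.1}$$

To illustrate the behavior of these solutions for small values of the parameter, we simplify the term involving the square root. Specifically, the function $f(\varepsilon) = [\varepsilon^2 + 4]^{1/2}$ has a Taylor series at $\varepsilon = 0$ with some positive radius of convergence, and the first few terms, omitting those that are zero, are

$$f(\varepsilon) = \sum_{n=0}^{\infty} \frac{\varepsilon^n f^{(n)}(0)}{n!} = 2 + \frac{\varepsilon^2}{4} - \frac{\varepsilon^4}{64} + \cdots. \tag{5.2}$$

Substituting (5.2) into (5.1), we get the two roots

$$\begin{aligned} x_+(\varepsilon) &= 1 - \frac{\varepsilon}{2} + \frac{\varepsilon^2}{8} - \frac{\varepsilon^4}{128} + \cdots, & \varepsilon \ge 0, \\ x_-(\varepsilon) &= -1 - \frac{\varepsilon}{2} - \frac{\varepsilon^2}{8} + \frac{\varepsilon^4}{128} + \cdots, & \varepsilon \ge 0. \end{aligned} \tag{5.3}$$

Thus there are two solutions $x_\pm(\varepsilon)$ for each value of $\varepsilon \ge 0$, and each solution corresponds to a curve in the $\varepsilon, x$-plane as illustrated in Figure 5.1. Since each solution for $\varepsilon > 0$ extends continuously to $\varepsilon = 0$, the system is regularly perturbed as expected. The series expansions inform us that $x_\pm(\varepsilon) \to \pm 1$ as $\varepsilon \to 0^+$, and moreover that $x_+(\varepsilon) < 1$ and $x_-(\varepsilon) < -1$ for small $\varepsilon > 0$; specifically, the slope and concavity of each solution curve at $\varepsilon = 0$ are $x_+'(0) = -\frac{1}{2}$, $x_+''(0) = \frac{2}{8}$ and $x_-'(0) = -\frac{1}{2}$, $x_-''(0) = -\frac{2}{8}$.

[Figure 5.1. The solution curves $x_+(\varepsilon)$ and $x_-(\varepsilon)$ in the $\varepsilon, x$-plane, starting from $x = 1$ and $x = -1$ at $\varepsilon = 0$.]

Example 5.2.2. (singular case). Consider $\varepsilon x^2 + x - 1 = 0$, where $0 \le \varepsilon \ll 1$. The exact solutions are

$$\begin{aligned} x &= \frac{-1 \pm [1 + 4\varepsilon]^{1/2}}{2\varepsilon}, & \varepsilon > 0, \\ x &= 1, & \varepsilon = 0. \end{aligned} \tag{5.4}$$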

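To illustrate the behavior of these solutions for small values of the parameter, we again simplify the term involving the square root. Specifically, the function $f(\varepsilon) = [1 + 4\varepsilon]^{1/2}$ has a Taylor series at $\varepsilon = 0$ with some positive radius of convergence, and the first few terms are

$$f(\varepsilon) = \sum_{n=0}^{\infty} \frac{\varepsilon^n f^{(n)}(0)}{n!} = 1 + 2\varepsilon - 2\varepsilon^2 + 4\varepsilon^3 + \cdots. \tag{5.5}$$

Substituting (5.5) into (5.4), we get the solutions

$$\begin{aligned} x_+(\varepsilon) &= 1 - \varepsilon + 2\varepsilon^2 + \cdots, & \varepsilon > 0, \\ x_-(\varepsilon) &= -\frac{1}{\varepsilon} - 1 + \varepsilon - 2\varepsilon^2 + \cdots, & \varepsilon > 0, \\ x &= 1, & \varepsilon = 0. \end{aligned} \tag{5.6}$$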

Thus there are two solutions $x_\pm(\varepsilon)$ for each value of $\varepsilon > 0$, and only one solution $x = 1$ when $\varepsilon = 0$. As before, each of the solutions can be plotted in the $\varepsilon, x$-plane as illustrated in Figure 5.2. Since two solutions are defined for $\varepsilon > 0$, but only one extends continuously to $\varepsilon = 0$, the system is singularly perturbed. The series expansion for $x_+(\varepsilon)$ informs us that this solution converges to the single solution $x = 1$ at $\varepsilon = 0$, that is, $x_+(\varepsilon) \to 1$ as $\varepsilon \to 0^+$. In contrast, the expansion for $x_-(\varepsilon)$ informs us that this solution becomes undefined as the parameter vanishes, that is, $x_-(\varepsilon) \to -\infty$ as $\varepsilon \to 0^+$. Such unbounded behavior is typical of a singularly perturbed algebraic equation; this is how the number of solutions can change between $\varepsilon > 0$ and $\varepsilon = 0$.

[Figure 5.2. The solution curves $x_+(\varepsilon)$ and $x_-(\varepsilon)$ in the $\varepsilon, x$-plane; only $x_+(\varepsilon)$ reaches $x = 1$ at $\varepsilon = 0$.]
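The expansions (5.3) and (5.6) can be compared against the exact formulas (5.1) and (5.4) numerically. The sketch below does this for a few small values of $\varepsilon$; the function names and sample values are illustrative assumptions, and only the terms displayed in (5.3) and (5.6) are retained.

```python
import math

def regular_exact(eps):
    """Exact roots (5.1) of x**2 + eps*x - 1 = 0."""
    s = math.sqrt(eps**2 + 4)
    return (-eps + s) / 2, (-eps - s) / 2

def regular_series(eps):
    """Truncated expansions (5.3)."""
    return (1 - eps / 2 + eps**2 / 8 - eps**4 / 128,
            -1 - eps / 2 - eps**2 / 8 + eps**4 / 128)

def singular_exact(eps):
    """Exact roots (5.4) of eps*x**2 + x - 1 = 0 for eps > 0."""
    s = math.sqrt(1 + 4 * eps)
    return (-1 + s) / (2 * eps), (-1 - s) / (2 * eps)

def singular_series(eps):
    """Truncated expansions (5.6) for eps > 0."""
    return (1 - eps + 2 * eps**2,
            -1 / eps - 1 + eps - 2 * eps**2)

for eps in (0.1, 0.01, 0.001):
    reg_err = [abs(e - s) for e, s in zip(regular_exact(eps), regular_series(eps))]
    sing_err = [abs(e - s) for e, s in zip(singular_exact(eps), singular_series(eps))]
    print(f"eps={eps}: regular errors={reg_err}, singular errors={sing_err}")
    print(f"          x_-(eps) in the singular case = {singular_exact(eps)[1]:.4f}")
```

In the singular case the second root is reproduced well by its expansion even though it diverges, because the leading term $-1/\varepsilon$ captures the unbounded growth.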

5.3. Assumptions, analytic functions

The class of problems that we consider will be algebraic or differential equations such as

$$F(x, \varepsilon) = 0, \quad \text{or} \quad \frac{dx}{dt} = F(t, x, \varepsilon), \quad x|_{t=0} = x_\#, \quad t \ge 0. \tag{5.7}$$

We will also consider related problems of a similar form. A basic assumption that we will make is that the function $F(x, \varepsilon)$ or $F(t, x, \varepsilon)$ is analytic at some given point as defined next.

Definition 5.3.1. A function $F(t, x, \varepsilon)$ is called analytic at $(t_\#, x_\#, \varepsilon_\#)$ if it has a convergent power series expansion around that point, that is

$$F(t, x, \varepsilon) = \sum_{i,j,k=0}^{\infty} c_{ijk}\,(t - t_\#)^i (x - x_\#)^j (\varepsilon - \varepsilon_\#)^k, \qquad |t - t_\#| < \sigma, \quad |x - x_\#| < \eta, \quad |\varepsilon - \varepsilon_\#| < \rho, \tag{5.8}$$

for some coefficients $c_{ijk}$ and radii $\sigma, \eta, \rho > 0$.


Note that the same definition applies to real- or complex-valued functions, of real or complex variables, and a similar definition holds in the case of an arbitrary number of variables. Also, any analytic real-valued function of real variables can locally be extended to an analytic complex-valued function of complex variables; the two functions would share the same series coefficients. Thus, when a function is stated to be analytic at some given point, we will always have in mind a complex neighborhood of that point, whether the given point is real or complex. Any conditions or operations on the function will be understood within the context of this neighborhood.

A large class of functions that arise in applications are analytic at most points. For instance, elementary functions of a single variable of the polynomial, rational, exponential, log, root, and trigonometric types are analytic at every point in the interior of their domains. Moreover, sums, products, quotients, and compositions of analytic functions are also analytic, as well as derivatives and antiderivatives. Furthermore, a function of multiple variables is analytic at every point in the interior of its domain if it is continuous in all variables jointly, and is analytic in each variable separately, for arbitrary fixed values of the other variables. The definition of derivative for a function of a complex variable has the same form as that for a real variable. Thus the usual differentiation formulas for the elementary functions are still valid when the variables are complex, as well as those for sums, products, quotients, and compositions.

Example 5.3.1. Let $F(x, \varepsilon) = \frac{\varepsilon x^2 + 5x + \varepsilon^3}{\sin(1 + \varepsilon x)}$, $G(t, x, \varepsilon) = (4x^2 + t^2 x)e^{\varepsilon x + t\varepsilon^2}$, $J(x) = \sqrt{x}$, and $K(x) = \ln x$. The function $F$ is analytic at every real or complex pair $(x, \varepsilon)$ with $\sin(1 + \varepsilon x) \ne 0$, whereas $G$ is analytic at every real or complex triple $(t, x, \varepsilon)$. The functions $J$ and $K$ are analytic at every real $x > 0$, and more generally in a neighborhood of every complex $x \ne 0$. Whether $x$ is real or complex we have $J'(x) = \frac{1}{2\sqrt{x}}$ and $K'(x) = \frac{1}{x}$.

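5.4. Notation, order symbols

The following notation and definition of the order symbols $O$ (big-o) and $o$ (little-o) will be useful in discussing the behavior of a function $f(\varepsilon)$ in the limit $\varepsilon \to 0^+$.

Definition 5.4.1. Let an exponent $r \ge 0$ and a function $f(\varepsilon)$ be given. Assume $\tilde{f}(\varepsilon) = f(\varepsilon)/\varepsilon^r$ is continuous for $\varepsilon \in (0, 1]$ and note $f(\varepsilon) = \tilde{f}(\varepsilon)\varepsilon^r$.

(1) If $\tilde{f}(\varepsilon)$ is bounded for $\varepsilon \in (0, 1]$, then we say $f(\varepsilon)$ is order $O(\varepsilon^r)$.

(2) If $\tilde{f}(\varepsilon) \to 0$ as $\varepsilon \to 0^+$, then we say $f(\varepsilon)$ is order $o(\varepsilon^r)$.

Thus a function $f(\varepsilon)$ may be order $O(\varepsilon^r)$ or $o(\varepsilon^r)$ (or both) depending on the behavior of $\tilde{f}(\varepsilon)$. To indicate the order of $f(\varepsilon)$ we use the notation $f(\varepsilon) = O(\varepsilon^r)$ or $f(\varepsilon) = o(\varepsilon^r)$. From the definition, $f(\varepsilon) = O(\varepsilon^r)$ when there is a constant $C \ge 0$ such that $|f(\varepsilon)| \le C\varepsilon^r$ for $\varepsilon \in (0, 1]$, in which case $f(\varepsilon) \to 0$ at least as fast as $\varepsilon^r \to 0^+$. Alternatively, $f(\varepsilon) = o(\varepsilon^r)$ when $f(\varepsilon) \to 0$ at a strictly faster rate than $\varepsilon^r \to 0^+$. The notation $f(\varepsilon) = g(\varepsilon) + O(\varepsilon^r)$ signifies that $f(\varepsilon) - g(\varepsilon) = O(\varepsilon^r)$, and similarly for $o(\varepsilon^r)$. The order symbols are often used to describe the individual terms of a series, or quantify the size of the remainder when a series is truncated. In the definition, the interval $(0, 1]$ could be replaced with $(0, b]$ for any fixed $b > 0$.

Example 5.4.1. Consider $f(\varepsilon) = 4\varepsilon^2$, $g(\varepsilon) = 5\varepsilon^5 - 2\varepsilon^3$ and $h(\varepsilon) = \varepsilon^2 \ln(\varepsilon)$. By definition, we find $f(\varepsilon) = O(\varepsilon^2)$ and $g(\varepsilon) = O(\varepsilon^3)$, and no higher exponents can be used in each of these statements. Note that $h(\varepsilon) \ne O(\varepsilon^2)$, since $\ln(\varepsilon)$ is unbounded for $\varepsilon \in (0, 1]$. However, we find $h(\varepsilon) = O(\varepsilon)$ and more accurately $h(\varepsilon) = o(\varepsilon)$, since $\varepsilon \ln(\varepsilon)$ is bounded for $\varepsilon \in (0, 1]$ and vanishes as $\varepsilon \to 0^+$. In fact, $h(\varepsilon) = o(\varepsilon^r)$ for any $0 \le r < 2$.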
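The claims in Example 5.4.1 can be explored numerically by tabulating the scaled quantity $f(\varepsilon)/\varepsilon^r$ from Definition 5.4.1 along a sequence $\varepsilon \to 0^+$. This is only a sketch: sampling finitely many points can suggest boundedness or decay but cannot prove it, and the helper name `scaled` and the sample values are illustrative assumptions.

```python
import math

def scaled(f, r, eps):
    """Return f(eps) / eps**r, the scaled function of Definition 5.4.1."""
    return f(eps) / eps**r

g = lambda e: 5 * e**5 - 2 * e**3
h = lambda e: e**2 * math.log(e)

print("eps      g/eps^3     g/eps^4       h/eps^2     h/eps")
for eps in (1e-1, 1e-2, 1e-4, 1e-8):
    print(f"{eps:.0e}  {scaled(g, 3, eps):10.4f}  {scaled(g, 4, eps):12.4e}"
          f"  {scaled(h, 2, eps):10.4f}  {scaled(h, 1, eps):12.4e}")
```

The column for $g/\varepsilon^3$ stays bounded while $g/\varepsilon^4$ grows without bound, and the column for $h/\varepsilon^2$ grows in magnitude while $h/\varepsilon$ tends to zero, in line with the statements of the example.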

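Example 5.4.2. Consider $f(\varepsilon) = e^\varepsilon$ and $g(\varepsilon) = 1 + \varepsilon + \frac{1}{2}\varepsilon^2$. Note that $g(\varepsilon)$ is a truncation of the power series for $f(\varepsilon)$, that is

$$f(\varepsilon) = \sum_{n=0}^{\infty} \frac{\varepsilon^n}{n!}, \qquad g(\varepsilon) = \sum_{n=0}^{2} \frac{\varepsilon^n}{n!}. \tag{5.9}$$

The difference or remainder $R(\varepsilon) = f(\varepsilon) - g(\varepsilon)$ has the property that

$$R(\varepsilon) = \sum_{n=3}^{\infty} \frac{\varepsilon^n}{n!} = \varepsilon^3 \sum_{k=0}^{\infty} \frac{\varepsilon^k}{(k+3)!} = \varepsilon^3 \tilde{R}(\varepsilon). \tag{5.10}$$

Since $\tilde{R}(\varepsilon)$ converges for $\varepsilon \in (-\infty, \infty)$, it is continuous and bounded on any finite interval, and it follows that $R(\varepsilon) = O(\varepsilon^3)$. This result can be written as $f(\varepsilon) = g(\varepsilon) + O(\varepsilon^3)$, or equivalently $e^\varepsilon = 1 + \varepsilon + \frac{1}{2}\varepsilon^2 + O(\varepsilon^3)$.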
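The bound $R(\varepsilon) = O(\varepsilon^3)$ can be illustrated by evaluating the scaled remainder $R(\varepsilon)/\varepsilon^3$ directly. From (5.10), its limit as $\varepsilon \to 0^+$ is the $k = 0$ term, namely $1/3! = 1/6$. The sketch below uses illustrative sample values; much smaller values of $\varepsilon$ are avoided because the subtraction loses accuracy in floating-point arithmetic.

```python
import math

def remainder(eps):
    """R(eps) = exp(eps) - (1 + eps + eps**2 / 2), as in Example 5.4.2."""
    return math.exp(eps) - (1 + eps + 0.5 * eps**2)

for eps in (0.5, 0.1, 0.01, 0.001):
    print(f"eps={eps}: R/eps^3 = {remainder(eps) / eps**3:.6f}")
print(f"limit 1/6 = {1 / 6:.6f}")
```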

5.5. Regular algebraic case

Consider a regularly perturbed algebraic equation of the form

$$F(x, \varepsilon) = 0, \qquad 0 \le \varepsilon \ll 1. \tag{5.11}$$

Let $x(\varepsilon)$ be a solution or root, which may be real or complex, and let $x_0 = x(0)$ as illustrated conceptually in Figure 5.3. We seek an expression for $x(\varepsilon)$. The following result says that, provided the function $F(x, \varepsilon)$ can be expressed as a series, then so can the solution $x(\varepsilon)$. We only consider a single equation with a single unknown; results for systems of equations with multiple unknowns are more involved. Below we use the notation $\not\equiv 0$ to mean not identically zero.

[Figure 5.3. A solution curve $x(\varepsilon)$ in the $\varepsilon, x$-plane, starting from $x_0$ at $\varepsilon = 0$.]

Result 5.5.1. Assume $F(x, 0) \not\equiv 0$. Let $x_0$ be given such that $F(x_0, 0) = 0$. If $F(x, \varepsilon)$ is analytic at $(x_0, 0)$, and $\frac{\partial F}{\partial x}(x_0, 0) \ne 0$ or $\frac{\partial F}{\partial \varepsilon}(x_0, 0) \ne 0$, then a solution curve $x(\varepsilon)$ of (5.11) exists. The curve $x(\varepsilon)$ can be written as a series, where the form of the series depends on $x_0$.


(1) If $x_0$ is a simple root, then

$$x(\varepsilon) = x_0 + \varepsilon x_1 + \varepsilon^2 x_2 + \cdots. \tag{5.12}$$

(2) If $x_0$ is a repeated root of order $m$, then

$$x(\varepsilon) = x_0 + \varepsilon^{\alpha} x_1 + \varepsilon^{2\alpha} x_2 + \cdots, \tag{5.13}$$

where $\alpha = \frac{1}{m}$. Using the substitution $\delta = \varepsilon^{\alpha}$, we may instead consider

$$x(\delta) = x_0 + \delta x_1 + \delta^2 x_2 + \cdots. \tag{5.14}$$

The series in (5.12) and (5.14) converge for $\varepsilon \in [0, \rho)$ for some $\rho > 0$. The coefficients $x_n$, $n \ge 0$ can be found from (5.11).

Thus a solution curve $x(\varepsilon)$ will exist, and have a series expansion around a given root $x = x_0$ at $\varepsilon = 0$, provided that the function $F(x, \varepsilon)$ is analytic at the starting point $(x_0, 0)$, and satisfies a nondegeneracy condition $\frac{\partial F}{\partial x}(x_0, 0) \ne 0$ or $\frac{\partial F}{\partial \varepsilon}(x_0, 0) \ne 0$, along with $F(x, 0) \not\equiv 0$. In both the simple and repeated cases a solution curve $x(\varepsilon)$ has the property that $F(x(\varepsilon), \varepsilon) = 0$ for all $\varepsilon \in [0, \rho)$. The series in (5.12) is a standard power series involving only nonnegative, integer powers of $\varepsilon$. The series in (5.14) is a generalized power series that involves nonnegative, but fractional powers of $\varepsilon$; such a series is called a Puiseux series. An expansion in the form of a Puiseux series still exists when the nondegeneracy conditions on the derivatives do not hold, but the identification of $\alpha > 0$ is more involved as discussed later. By a perturbation approximation of a solution up to order $O(\varepsilon^r)$ we mean a truncated series with all terms up to and including $\varepsilon^r$.

Note that the above result is entirely local. The existence and form of a solution curve $x(\varepsilon)$ depends only on properties of the function $F(x, \varepsilon)$ at the starting root $x = x_0$ at $\varepsilon = 0$, and does not involve information about any other points. Equation (5.11) will indeed be regularly perturbed when each of its solution curves has a starting point which satisfies the conditions of Result 5.5.1. The proof of the result shows that, when $x_0$ is a simple root, there is a unique solution curve that extends from the point $(x_0, 0)$, and when $x_0$ is a repeated root of order $m$, there are $m$ solution curves that extend from $(x_0, 0)$. In the simple case, the curve $x(\varepsilon)$ is analytic and hence differentiable to all orders in $\varepsilon$ at any point within its interval of convergence. In contrast, in the repeated case, the curves $x(\varepsilon)$ are not analytic in $\varepsilon$ due to the fractional powers, but instead the reparameterized curves $x(\delta)$ are analytic and hence differentiable to all orders in $\delta$. For convenience, we use the same symbol $x$ for both the original and reparameterized functions.

Example 5.5.1. Consider

$$x^4 - x^2 + 2\varepsilon^2 x + 6\varepsilon = 0, \qquad 0 \le \varepsilon \ll 1. \tag{5.15}$$

This equation has four roots when $\varepsilon > 0$ and also when $\varepsilon = 0$. Here we seek an expansion for each of the roots $x(\varepsilon)$. Note that $F(x, \varepsilon) = x^4 - x^2 + 2\varepsilon^2 x + 6\varepsilon$ is analytic at all points $(x, \varepsilon)$, and that $F(x, 0) = x^4 - x^2 \not\equiv 0$.

Root types at $\varepsilon = 0$. To determine the form of the expansions, we first examine the multiplicity of the roots $x_0 = x(0)$. From (5.15) with $\varepsilon = 0$ we get $x_0^4 - x_0^2 = 0$, which has the four roots $x_0 = 1, -1, 0, 0$. Thus there are two simple roots, and a repeated root of multiplicity $m = 2$. For each simple root we find $\frac{\partial F}{\partial x}(x_0, 0) = 4x_0^3 - 2x_0 \ne 0$, and for the repeated root we find $\frac{\partial F}{\partial \varepsilon}(x_0, 0) = 6 \ne 0$. Thus, for each given $x_0$, a nondegeneracy condition holds and Result 5.5.1 can be applied.

Expansion of simple roots. To each simple root $x_0$ at $\varepsilon = 0$ there is a corresponding solution curve $x(\varepsilon)$ for $\varepsilon \ge 0$, which has the expansion

$$x(\varepsilon) = x_0 + \varepsilon x_1 + \varepsilon^2 x_2 + \varepsilon^3 x_3 + \cdots, \qquad x_0 = 1, -1. \tag{5.16}$$

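For future reference, note that

$$\begin{aligned} x'(\varepsilon) &= 0 + x_1 + 2\varepsilon x_2 + 3\varepsilon^2 x_3 + \cdots, \\ x''(\varepsilon) &= 0 + 0 + 2x_2 + 6\varepsilon x_3 + \cdots. \end{aligned} \tag{5.17}$$

Substituting $x(\varepsilon)$ into (5.15) we get

$$f(\varepsilon) - g(\varepsilon) + q(\varepsilon) + p(\varepsilon) = 0, \tag{5.18}$$

where $f(\varepsilon) = x^4(\varepsilon)$, $g(\varepsilon) = x^2(\varepsilon)$, $q(\varepsilon) = 2\varepsilon^2 x(\varepsilon)$ and $p(\varepsilon) = 6\varepsilon$. We next expand each of these terms in powers of $\varepsilon$. For $p(\varepsilon)$ and $q(\varepsilon)$, the expansions are straightforward, and there is no need to use Taylor's formula; we get

$$\begin{aligned} p(\varepsilon) &= 6\varepsilon = 0 + 6\varepsilon + 0\varepsilon^2 + 0\varepsilon^3 + \cdots, \\ q(\varepsilon) &= 2\varepsilon^2 x(\varepsilon) = 2\varepsilon^2 x_0 + 2\varepsilon^3 x_1 + 2\varepsilon^4 x_2 + \cdots. \end{aligned} \tag{5.19}$$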

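For $g(\varepsilon) = x^2(\varepsilon)$ we may use Taylor's formula to find the first few terms in the expansion. Using the chain rule as needed for derivatives, together with (5.17), we get

$$\begin{aligned} g(\varepsilon) &= g(0) + \varepsilon g'(0) + \frac{\varepsilon^2}{2} g''(0) + \cdots \\ &= [x^2(\varepsilon)]_{\varepsilon=0} + \varepsilon[2x(\varepsilon)x'(\varepsilon)]_{\varepsilon=0} + \frac{\varepsilon^2}{2}[2x'(\varepsilon)x'(\varepsilon) + 2x(\varepsilon)x''(\varepsilon)]_{\varepsilon=0} + \cdots \\ &= [x_0^2] + \varepsilon[2x_0 x_1] + \frac{\varepsilon^2}{2}[2x_1^2 + 4x_0 x_2] + \cdots. \end{aligned} \tag{5.20}$$

Similarly, for $f(\varepsilon) = x^4(\varepsilon)$, we get

$$\begin{aligned} f(\varepsilon) &= f(0) + \varepsilon f'(0) + \frac{\varepsilon^2}{2} f''(0) + \cdots \\ &= [x^4(\varepsilon)]_{\varepsilon=0} + \varepsilon[4x^3(\varepsilon)x'(\varepsilon)]_{\varepsilon=0} + \frac{\varepsilon^2}{2}[12x^2(\varepsilon)(x'(\varepsilon))^2 + 4x^3(\varepsilon)x''(\varepsilon)]_{\varepsilon=0} + \cdots \\ &= [x_0^4] + \varepsilon[4x_0^3 x_1] + \frac{\varepsilon^2}{2}[12x_0^2 x_1^2 + 8x_0^3 x_2] + \cdots. \end{aligned} \tag{5.21}$$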

Substituting the expansions (5.19)–(5.21) into (5.18) and collecting terms by powers of $\varepsilon$ we get

$$[x_0^4 - x_0^2] + \varepsilon[4x_0^3 x_1 - 2x_0 x_1 + 6] + \varepsilon^2[6x_0^2 x_1^2 + 4x_0^3 x_2 - x_1^2 - 2x_0 x_2 + 2x_0] + \cdots = 0. \tag{5.22}$$

In order for the above equation to hold for arbitrary values of $0 \le \varepsilon \ll 1$, the coefficient of each power of $\varepsilon$ must vanish. Setting each coefficient to zero we get the following.

$\varepsilon^0$. $x_0^4 - x_0^2 = 0$. This is satisfied by the simple roots under consideration, which are $x_0 = 1, -1$.

$\varepsilon^1$. $4x_0^3 x_1 - 2x_0 x_1 + 6 = 0$. For each value of $x_0$, we get a corresponding value of $x_1$, specifically $x_1 = -\frac{3}{2x_0^3 - x_0} = -3, 3$.

$\varepsilon^2$. $6x_0^2 x_1^2 + 4x_0^3 x_2 - x_1^2 - 2x_0 x_2 + 2x_0 = 0$. For each matching pair of $x_0$ and $x_1$, we get a corresponding value of $x_2$, specifically $x_2 = -\frac{47}{2}, \frac{43}{2}$.

The above process can be continued up to any desired power of $\varepsilon$. Based on our calculations, the first three terms in the expansion of the two simple roots are

$$\begin{aligned} x^{(1)}(\varepsilon) &= 1 - 3\varepsilon - \frac{47}{2}\varepsilon^2 + \cdots, \\ x^{(2)}(\varepsilon) &= -1 + 3\varepsilon + \frac{43}{2}\varepsilon^2 + \cdots. \end{aligned} \tag{5.23}$$

Each series is guaranteed to converge for $\varepsilon \in [0, \rho)$ for some $\rho > 0$. If the series are truncated, then the three terms shown for each root would be called a perturbation approximation up to order $O(\varepsilon^2)$. Such approximations can be used to estimate values of the roots for any given $\varepsilon$, provided it is sufficiently small.

Expansion of repeated roots. To each repeated root $x_0$ at $\varepsilon = 0$ there is a corresponding solution curve $x(\varepsilon)$ for $\varepsilon \ge 0$; that is, if $x_0$ has multiplicity $m$, then there will be $m$ solution curves $x(\varepsilon)$. The series expansions of these curves can be found by using the substitution $\delta = \varepsilon^{\alpha}$ where $\alpha = \frac{1}{m}$. In the present case, we have $m = 2$ corresponding to the double root $x_0 = 0, 0$. Substituting $\delta = \varepsilon^{1/2}$ or $\varepsilon = \delta^2$ into the original equation (5.15) we get

$$x^4 - x^2 + 2\delta^4 x + 6\delta^2 = 0, \qquad 0 \le \delta \ll 1, \tag{5.24}$$

and we consider expansions of the form

$$x(\delta) = x_0 + \delta x_1 + \delta^2 x_2 + \delta^3 x_3 + \cdots, \qquad x_0 = 0, 0. \tag{5.25}$$

Proceeding just as before, we substitute (5.25) into (5.24), and expand all terms in powers of $\delta$. Collecting terms, we obtain

$$[x_0^4 - x_0^2] + \delta[4x_0^3 x_1 - 2x_0 x_1] + \delta^2[6x_0^2 x_1^2 + 4x_0^3 x_2 - x_1^2 - 2x_0 x_2 + 6] + \cdots = 0. \tag{5.26}$$

In order for the above equation to hold for arbitrary values of $0 \le \delta \ll 1$, the coefficient of each power of $\delta$ must vanish. Setting each coefficient to zero we get the following.

$\delta^0$. $x_0^4 - x_0^2 = 0$. This is satisfied by the double root under consideration, which is $x_0 = 0, 0$.

$\delta^1$. $4x_0^3 x_1 - 2x_0 x_1 = 0$. For each value $x_0 = 0, 0$, this equation becomes empty ($0 = 0$), and gives no information on $x_1$.


$\delta^2$. $6x_0^2 x_1^2 + 4x_0^3 x_2 - x_1^2 - 2x_0 x_2 + 6 = 0$. For each value $x_0 = 0, 0$, this equation becomes $-x_1^2 + 6 = 0$, which yields $x_1 = \sqrt{6}, -\sqrt{6}$.

Similar to before, the above process can be continued up to any desired power of $\delta$. Our calculations thus far show that the first two terms in the expansion of the two repeated roots are

$$\begin{aligned} x^{(3)}(\delta) &= 0 + \sqrt{6}\,\delta + \cdots, \\ x^{(4)}(\delta) &= 0 - \sqrt{6}\,\delta + \cdots. \end{aligned} \tag{5.27}$$

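The above expansions can be expressed in terms of the original parameter using the relation $\delta = \varepsilon^{1/2}$, and each series is guaranteed to converge for $\varepsilon \in [0, \rho)$ for some $\rho > 0$. If the series are truncated, then the terms shown form a perturbation approximation up to order $O(\delta)$, or equivalently $O(\varepsilon^{1/2})$. For convenience, we use the same symbols $x^{(3)}$ and $x^{(4)}$, whether these curves are expressed as functions of $\delta$ or $\varepsilon$.

[Figure 5.4. The solution curves $x^{(1)}(\varepsilon)$, $x^{(2)}(\varepsilon)$, $x^{(3)}(\varepsilon)$, $x^{(4)}(\varepsilon)$ in the $\varepsilon, x$-plane, starting from $x = 1$, $-1$, $0$, $0$ at $\varepsilon = 0$.]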
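The truncated expansions (5.23) and (5.27) can be checked against numerically computed roots of (5.15). In the sketch below each approximation is refined with Newton's method, which is our choice of root finder and not part of the text, and the size of the correction is reported. The function names, the iteration count, and the sample values of $\varepsilon$ are illustrative assumptions.

```python
import math

def F(x, eps):
    """Left-hand side of (5.15)."""
    return x**4 - x**2 + 2 * eps**2 * x + 6 * eps

def dF(x, eps):
    """Derivative of F with respect to x."""
    return 4 * x**3 - 2 * x + 2 * eps**2

def newton(x, eps, iters=50):
    """Refine a root of F(., eps) by Newton's method, starting from x."""
    for _ in range(iters):
        x -= F(x, eps) / dF(x, eps)
    return x

def approximations(eps):
    """Truncated expansions (5.23) and (5.27), the latter with delta = sqrt(eps)."""
    d = math.sqrt(eps)
    return [1 - 3 * eps - 47 / 2 * eps**2,
            -1 + 3 * eps + 43 / 2 * eps**2,
            math.sqrt(6) * d,
            -math.sqrt(6) * d]

for eps in (1e-2, 1e-3, 1e-4):
    for label, approx in zip(("x(1)", "x(2)", "x(3)", "x(4)"), approximations(eps)):
        root = newton(approx, eps)
        print(f"eps={eps:.0e} {label}: approx={approx:+.6f} "
              f"root={root:+.6f} error={abs(root - approx):.2e}")
```

The errors for the simple roots should shrink roughly like $\varepsilon^3$, the first neglected power, while those for the repeated roots shrink more slowly, since only terms up to $O(\varepsilon^{1/2})$ are retained.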

Picture of solutions. We now have a picture of the four roots of the equation $x^4 - x^2 + 2\varepsilon^2 x + 6\varepsilon = 0$ in terms of the parameter $0 \le \varepsilon \ll 1$. The roots $x^{(1)}(\varepsilon)$ and $x^{(2)}(\varepsilon)$ are simple at $\varepsilon = 0$, and have a finite slope and concavity at $\varepsilon = 0$, which guide the direction of each solution curve for small values of $\varepsilon$. In contrast, the roots $x^{(3)}(\varepsilon)$ and $x^{(4)}(\varepsilon)$ come together and form a double root at $\varepsilon = 0$, have an infinite slope at $\varepsilon = 0$, and exhibit a square-root type behavior for small values of $\varepsilon$. The qualitative behavior of the roots is illustrated graphically in Figure 5.4. Note that such an illustration would be difficult in cases where the roots are complex.

Sketch of proof: Result 5.5.1. For convenience, we assume that the given root is $x_0 = 0$. This value of $x_0$ can always be arranged by shifting $x$, and redefining $F(x, \varepsilon)$. Since $F(0, 0) = 0$, but $F(x, 0) \not\equiv 0$, a series in powers of $x$ gives $F(x, 0) = x^m V(x)$, where $m \ge 1$ is the multiplicity of the root, and $V(x)$ is analytic with $V(0) \ne 0$. Thus a simple root ($m = 1$) must have $\frac{\partial F}{\partial x}(0, 0) \ne 0$, and a repeated root ($m \ge 2$) must have $\frac{\partial F}{\partial x}(0, 0) = 0$.

To begin, we suppose that $x_0$ is a simple root. In this case, we must have $\frac{\partial F}{\partial x}(0, 0) \ne 0$, and an analytic version of the implicit function theorem guarantees the existence of a unique power series $x(\varepsilon) = \sum_{k \ge 1} x_k \varepsilon^k$, with positive radius of convergence $\rho$, such that $F(x(\varepsilon), \varepsilon) = 0$ for all $|\varepsilon| < \rho$. Thus (5.11) has a solution curve that extends from $(0, 0)$, and there is only one such curve. Moreover, by removing any shift as described above, this curve can be written as in (5.12).

We next suppose that $x_0$ is a repeated root. In this case, we must have $\frac{\partial F}{\partial x}(0, 0) = 0$, and the implicit function theorem is no longer applicable. Thus we turn to a different result called the Weierstrass preparation theorem. This theorem guarantees that, in a neighborhood of $(0, 0)$, we have the factorization

$$F(x, \varepsilon) = U(x, \varepsilon)\,[x^m + a_{m-1}(\varepsilon)x^{m-1} + \cdots + a_1(\varepsilon)x + a_0(\varepsilon)], \tag{5.28}$$

where $U(x, \varepsilon)$ and $a_j(\varepsilon)$ ($j = 0, \ldots, m-1$) are analytic functions, such that $U(x, \varepsilon) \ne 0$ at all points in the neighborhood, and $a_j(0) = 0$. In this neighborhood, the equation $F(x, \varepsilon) = 0$ will be satisfied if and only if

$$x^m + a_{m-1}(\varepsilon)x^{m-1} + \cdots + a_1(\varepsilon)x + a_0(\varepsilon) = 0. \tag{5.29}$$

Note that, since they are analytic, and vanish at $\varepsilon = 0$, we can write $a_j(\varepsilon) = \sum_{k \ge 1} c_{jk}\varepsilon^k$ for some coefficients $c_{jk}$.

We now suppose that the repeated root is nondegenerate in the sense that $\frac{\partial F}{\partial \varepsilon}(0, 0) \ne 0$, which implies $c_{01} \ne 0$. Upon introducing the change of variable $x = \delta z$ and $\varepsilon = \delta^m$, and the new coefficients $\hat{a}_j(\varepsilon) = \sum_{k \ge 1} c_{jk}\varepsilon^{k-1}$, and then factoring out $\delta^m$, we find that (5.29) will be satisfied if

$$z^m + \hat{a}_{m-1}(\delta^m)\delta^{m-1} z^{m-1} + \cdots + \hat{a}_1(\delta^m)\delta z + \hat{a}_0(\delta^m) = 0. \tag{5.30}$$

When $\delta = 0$, the above equation reduces to $z_0^m + c_{01} = 0$, which has $m$ simple roots $z_0^{(\sigma)} \ne 0$ ($\sigma = 1, \ldots, m$). Since each function $\hat{a}_j(\delta^m)\delta^j$ is analytic in $\delta$, and each root is simple, the implicit function theorem can be applied. Thus $m$ different solution curves exist for (5.30), and hence for (5.29) and (5.11). Each of these curves has a series expansion $z^{(\sigma)}(\delta) = \sum_{k \ge 0} z_k^{(\sigma)}\delta^k$, which converges for all $|\delta| < \rho^{(\sigma)}$, for some positive radius of convergence $\rho^{(\sigma)}$. Using the relation $x = \delta z$ we obtain $x^{(\sigma)}(\delta) = \sum_{k \ge 0} z_k^{(\sigma)}\delta^{k+1}$. By relabeling coefficients, and removing any shift as described above, each of the curves can be written as in (5.14), where $\delta = \varepsilon^{1/m}$, and each extends from $(x_0, 0)$.

We again consider the problem in (5.11). For completeness, we suppose that $x_0$ has multiplicity $m \ge 2$, but that $\frac{\partial F}{\partial x}(x_0, 0) = 0$ and $\frac{\partial F}{\partial \varepsilon}(x_0, 0) = 0$, in which case the root is called degenerate. (The case of a simple root or $m = 1$ is always nondegenerate.) Similar to before, there are $m$ solution curves that extend from such a root, and the series for each curve can be written in powers of $\varepsilon^{\alpha}$ for some rational $\alpha > 0$. However, in contrast to before, the identification of $\alpha$ for each curve is more involved; we may no longer have $\alpha = \frac{1}{m}$ for each. Although the above proof could be generalized, we instead take a more direct approach.

One way to find the exponents is to use the Newton polygon method. The basic idea is that the exponents can be found by a direct search using a change of variable similar to above, namely $x = x_0 + \delta z$ and $\varepsilon = \delta^{\gamma}$, where $\gamma > 0$ is rational. The strategy is to identify values of $\gamma$ for which the resulting equation in $z$ will have either simple, or repeated but nondegenerate roots $z = z_0$ when $\delta = 0$. Moreover, values of $\gamma$ are sought for which $z_0 \ne 0$. For such roots, Result 5.5.1 applies, and a series can be developed as before.


The method proceeds until a total of $m$ expansions are found, corresponding to the $m$ solution curves associated with the original root. When the equation is written in the form of a polynomial, different candidate values for $\gamma$ can be obtained by equating the exponents of $\delta$ on different pairs of terms, as described below. A polygonal diagram can be introduced to represent these candidate values, but this diagram is not employed here.

Example 5.5.2. Consider

$$x^5 + \varepsilon x^3 + \varepsilon x^2 + \varepsilon^2 + \varepsilon^6 = 0, \qquad 0 \le \varepsilon \ll 1. \tag{5.31}$$

When $\varepsilon = 0$, this equation has the repeated root $x_0 = 0$ of multiplicity $m = 5$, which is degenerate since $\frac{\partial F}{\partial x}(x_0, 0) = 0$ and $\frac{\partial F}{\partial \varepsilon}(x_0, 0) = 0$. Here we find expansions for the five solution curves that extend from this root. To begin, we introduce $x = x_0 + \delta z$ and $\varepsilon = \delta^{\gamma}$, and consider

$$\delta^5 z^5 + \delta^{\gamma+3} z^3 + \delta^{\gamma+2} z^2 + \delta^{2\gamma} + \delta^{6\gamma} = 0. \tag{5.32}$$

We next proceed to find candidate values of $\gamma$ by matching exponents of $\delta$ on pairs of terms. Note that, assuming $z \to z_0 \ne 0$ as $\delta \to 0^+$, at least two exponents must match; otherwise, the exponents would all be distinct, and dividing out the lowest would lead to an inconsistent equation as $\delta \to 0^+$. We begin by considering the first two terms in (5.32), and suppose the match $5 = \gamma + 3$, which gives $\gamma = 2$. For this value of $\gamma$, the coefficients of the equation are $\{\delta^5, \delta^5, \delta^4, \delta^4 + \delta^{12}\}$, which after dividing out the lowest common factor gives $\{\delta, \delta, 1, 1 + \delta^8\}$, and the equation becomes

$$\delta z^5 + \delta z^3 + z^2 + 1 + \delta^8 = 0. \tag{5.33}$$

When $\delta = 0$ we get $z_0^2 + 1 = 0$, which has two simple nonzero roots $z_0 = \pm i$. These generate two standard power series expansions for $z$ in terms of $\delta$, which can be found by the usual process applied to (5.33). These expansions can then be converted back to $x$ and $\varepsilon$ using the change of variable $x = x_0 + \delta z$ and $\delta = \varepsilon^{1/2}$. Next, we consider the first and third terms in (5.32), and suppose the match $5 = \gamma + 2$, which gives $\gamma = 3$. For this value, the coefficients are $\{\delta^5, \delta^6, \delta^5, \delta^6 + \delta^{18}\}$, which gives $\{1, \delta, 1, \delta + \delta^{13}\}$, and the equation is

$$z^5 + \delta z^3 + z^2 + \delta + \delta^{13} = 0. \tag{5.34}$$

When $\delta = 0$ we get $z_0^5 + z_0^2 = 0$ or $z_0^2 (z_0^3 + 1) = 0$, which has three simple nonzero roots $z_0 = -1, \tfrac{1}{2}(1 \pm i\sqrt{3})$. These generate three standard power series expansions for $z$ in terms of $\delta$, which can be found by the usual process applied to (5.34). Similar to before, these expansions can be converted back to $x$ and $\varepsilon$ using the change of variable $x = x_0 + \delta z$ and $\delta = \varepsilon^{1/3}$. Thus we have five expansions corresponding to all five repeated roots and the procedure terminates.

Note that, in the process of matching exponents in (5.32), some matches may be impossible, or may provide no new or useful information. For example, the match $\gamma + 3 = \gamma + 2$ is not possible, whereas the match $\gamma + 3 = 2\gamma$ gives $\gamma = 3$, and $\gamma + 2 = 2\gamma$ gives $\gamma = 2$, and both of these values are not new. The match $5 = 2\gamma$ gives $\gamma = \tfrac{5}{2}$, which yields the coefficients $\{\delta^5, \delta^{11/2}, \delta^{9/2}, \delta^5 + \delta^{15}\}$, or equivalently $\{\tilde\delta^{10}, \tilde\delta^{11}, \tilde\delta^{9}, \tilde\delta^{10} + \tilde\delta^{30}\}$, where $\tilde\delta = \delta^{1/2}$. After dividing out common factors and setting $\tilde\delta = 0$, we get $z_0^2 = 0$, which is undesirable since we seek simple nonzero roots for $z_0$. Finally, the match $\gamma + 2 = 6\gamma$ gives $\gamma = \tfrac{2}{5}$, which leads to the equation $\tilde\delta^{21} z^5 + \tilde\delta^{13} z^3 + \tilde\delta^{8} z^2 + 1 + \tilde\delta^{8} = 0$, where $\tilde\delta = \delta^{1/5}$. Here the equation with $\tilde\delta = 0$ is inconsistent, which indicates that the value of $\gamma$ is not suitable. We introduce $\tilde\delta$ as needed to obtain integer exponents (thus analytic coefficients), so that a standard power series in $\tilde\delta$ is guaranteed for simple roots.
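The two successful matches can be checked numerically. The following is a minimal sketch, not part of the original example: it runs a plain Newton iteration on the rescaled equations (5.33) and (5.34), started from the leading-order roots $z_0$, and maps the results back through $x = \delta z$. The helper names `newton`, `F`, and `five_roots`, and the sample value $\varepsilon = 10^{-4}$, are illustrative assumptions.

```python
import math

def newton(f, df, z, steps=30):
    """Plain Newton iteration for a (possibly complex) root, started at z."""
    for _ in range(steps):
        z = z - f(z) / df(z)
    return z

def F(x, eps):
    """Left-hand side of (5.31)."""
    return x**5 + eps * x**3 + eps * x**2 + eps**2 + eps**6

def five_roots(eps):
    """Roots of (5.31) obtained from the rescaled equations (5.33) and (5.34)."""
    roots = []
    # gamma = 2: delta = eps^(1/2), leading roots z0 = +i, -i of z^2 + 1 = 0.
    d = eps ** 0.5
    f = lambda z: d * z**5 + d * z**3 + z**2 + 1 + d**8
    df = lambda z: 5 * d * z**4 + 3 * d * z**2 + 2 * z
    for z0 in (1j, -1j):
        roots.append(d * newton(f, df, z0))
    # gamma = 3: delta = eps^(1/3), leading roots of z^3 + 1 = 0.
    d = eps ** (1.0 / 3.0)
    f = lambda z: z**5 + d * z**3 + z**2 + d + d**13
    df = lambda z: 5 * z**4 + 3 * d * z**2 + 2 * z
    for z0 in (-1.0, 0.5 + 0.5j * math.sqrt(3), 0.5 - 0.5j * math.sqrt(3)):
        roots.append(d * newton(f, df, z0))
    return roots

# Each root is printed with the size of the residual of (5.31).
for x in five_roots(1e-4):
    print(x, abs(F(x, 1e-4)))
```

Two of the computed roots have magnitude close to $\varepsilon^{1/2}$ and three have magnitude close to $\varepsilon^{1/3}$, consistent with the values $\gamma = 2$ and $\gamma = 3$ found above.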

5.6. Regular differential case

Consider a regularly perturbed ordinary differential equation, together with a boundary or initial condition, of the form

$$\frac{du}{dt} = F(t, u, \varepsilon), \qquad u|_{t=0} = u_\#, \qquad t \ge 0, \qquad 0 \le \varepsilon \ll 1. \tag{5.35}$$

For generality, we suppose that $u = (u_1, \ldots, u_d)$ and $F = (F_1, \ldots, F_d)$ for some $d \ge 1$. Thus one-, two-, and higher-dimensional systems can be considered; the one-dimensional case is illustrated conceptually in Figure 5.5. We seek an expression for the solution $u(t, \varepsilon)$. The following result says that, provided the function $F(t, u, \varepsilon)$ can be expressed as a series, then so can the solution $u(t, \varepsilon)$.

Figure 5.5. [Solution curves $u(t, \varepsilon)$ versus $t$, starting from $u_\#$, for $\varepsilon = 0$ and $\varepsilon > 0$.]

Result 5.6.1. Let $u_\#$ be given. If $F(t, u, \varepsilon)$ is analytic at $(0, u_\#, 0)$, then a unique solution $u(t, \varepsilon)$ of (5.35) exists. This solution is analytic at $(0, 0)$, and can be written as a single series in $\varepsilon$, namely

$$u(t, \varepsilon) = u_0(t) + \varepsilon u_1(t) + \varepsilon^2 u_2(t) + \cdots. \tag{5.36}$$

The series converges for $t \in [0, \sigma)$ and $\varepsilon \in [0, \rho)$ for some $\sigma, \rho > 0$. The coefficients $u_n(t)$, $n \ge 0$ can be found from (5.35).

Thus a unique solution $u(t, \varepsilon)$ will exist and be analytic at $(0, 0)$ provided that the function $F(t, u, \varepsilon)$ is analytic at the initial point $(0, u_\#, 0)$. In view of Definition 5.3.1, the solution $u(t, \varepsilon)$ can be written as a double power series in $(t, \varepsilon)$, which converges for some positive radius in each variable. By summing over the series in $t$, we arrive at the form in (5.36), which is a standard power series in $\varepsilon$ with time-dependent coefficients $u_n(t)$. This form of the solution will be the most convenient for our purposes. Note that $u(t, \varepsilon)$ has derivatives of all orders in $t$ and $\varepsilon$ within its domain of convergence. For simplicity, we will denote derivatives by $\frac{d}{dt}$ and $\frac{d}{d\varepsilon}$, rather than $\frac{\partial}{\partial t}$ and $\frac{\partial}{\partial \varepsilon}$.


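As in the algebraic case, note that the above result is entirely local. The existence, uniqueness, and form of the solution $u(t, \varepsilon)$ depends only on properties of the function $F(t, u, \varepsilon)$ at the initial point $(0, u_\#, 0)$. The system in (5.35) is indeed regularly perturbed when the conditions of Result 5.6.1 are met, since the unique solution for $\varepsilon > 0$ extends continuously in a suitable norm to $\varepsilon = 0$. In the $d$-dimensional case with $u = (u_1, \ldots, u_d)$ and $F = (F_1, \ldots, F_d)$, the function $F$ is analytic if each of the component functions $F_i$ is analytic. Moreover, the series in (5.36) can be written in vector form $u(t, \varepsilon) = \sum_{n \ge 0} u_n(t)\varepsilon^n$, or in component form $u_i(t, \varepsilon) = \sum_{n \ge 0} u_{i,n}(t)\varepsilon^n$. As before, by a perturbation approximation of the solution up to order $O(\varepsilon^r)$ we mean a truncated series with all terms up to and including $\varepsilon^r$.

Example 5.6.1. Consider

$$\frac{du}{dt} = -u + \sqrt{1 + \varepsilon u}, \qquad u|_{t=0} = 2, \qquad t \ge 0, \qquad 0 \le \varepsilon \ll 1. \tag{5.37}$$

This system has a unique solution when $\varepsilon > 0$ and also when $\varepsilon = 0$. Here we seek an expansion of the solution $u(t, \varepsilon)$. Note that $F(t, u, \varepsilon) = -u + \sqrt{1 + \varepsilon u}$ is analytic at $(t, u, \varepsilon) = (0, 2, 0)$. According to the above result, the solution can be expanded in a series

$$u(t, \varepsilon) = u_0(t) + \varepsilon u_1(t) + \varepsilon^2 u_2(t) + \varepsilon^3 u_3(t) + \cdots. \tag{5.38}$$

For future reference, note that

$$\begin{aligned} \frac{du}{d\varepsilon}(t, \varepsilon) &= 0 + u_1(t) + 2\varepsilon u_2(t) + 3\varepsilon^2 u_3(t) + \cdots, \\ \frac{d^2 u}{d\varepsilon^2}(t, \varepsilon) &= 0 + 0 + 2u_2(t) + 6\varepsilon u_3(t) + \cdots. \end{aligned} \tag{5.39}$$

Substituting $u(t, \varepsilon)$ into the differential equation in (5.37), and moving all terms to the left-hand side for convenience, we get

$$p(t, \varepsilon) + q(t, \varepsilon) - f(t, \varepsilon) = 0, \qquad t \ge 0, \tag{5.40}$$

where $p(t, \varepsilon) = \frac{du}{dt}(t, \varepsilon)$, $q(t, \varepsilon) = u(t, \varepsilon)$, and $f(t, \varepsilon) = \sqrt{1 + \varepsilon u(t, \varepsilon)}$. We next expand each of these terms in powers of $\varepsilon$. For $p(t, \varepsilon)$ and $q(t, \varepsilon)$, the expansions are straightforward, and there is no need to use Taylor's formula; we get

$$\begin{aligned} p(t, \varepsilon) &= \frac{du}{dt}(t, \varepsilon) = \frac{du_0}{dt}(t) + \varepsilon \frac{du_1}{dt}(t) + \cdots, \\ q(t, \varepsilon) &= u(t, \varepsilon) = u_0(t) + \varepsilon u_1(t) + \cdots. \end{aligned} \tag{5.41}$$

For $f(t, \varepsilon) = \sqrt{1 + \varepsilon u(t, \varepsilon)}$ we may use Taylor's formula, applied to $\varepsilon$ with $t$ fixed, to find the first few terms in the expansion. Using the chain rule as needed for derivatives, together with (5.39), we get

$$\begin{aligned} f(t, \varepsilon) &= f(t, 0) + \varepsilon \frac{df}{d\varepsilon}(t, 0) + \frac{\varepsilon^2}{2} \frac{d^2 f}{d\varepsilon^2}(t, 0) + \cdots \\ &= \left[(1 + \varepsilon u)^{1/2}\right]_{\varepsilon=0} + \varepsilon \left[\tfrac{1}{2}(1 + \varepsilon u)^{-1/2}\left(u + \varepsilon \frac{du}{d\varepsilon}\right)\right]_{\varepsilon=0} + \cdots \\ &= [1] + \varepsilon \left[\tfrac{1}{2} u_0(t)\right] + \cdots. \end{aligned} \tag{5.42}$$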


Substituting the expansions (5.41)–(5.42) into (5.40) and collecting terms by powers of $\varepsilon$ we get

$$\left[\frac{du_0}{dt}(t) + u_0(t) - 1\right] + \varepsilon \left[\frac{du_1}{dt}(t) + u_1(t) - \tfrac{1}{2} u_0(t)\right] + \cdots = 0, \qquad t \ge 0. \tag{5.43}$$

A similar procedure can be applied to the initial condition in (5.37). Substituting $u(t, \varepsilon)$ into that condition we get $u(0, \varepsilon) - 2 = 0$, and using the series expansion (5.38) we get

$$[u_0(0) - 2] + \varepsilon u_1(0) + \varepsilon^2 u_2(0) + \cdots = 0. \tag{5.44}$$

In order for the equations in (5.43) and (5.44) to hold for arbitrary values of $0 \le \varepsilon \ll 1$, the coefficient of each power of $\varepsilon$ must vanish. Setting each coefficient to zero we get the following sequence of initial-value problems for the functions $u_n(t)$, $n \ge 0$.

$\varepsilon^0$. $u_0' + u_0 - 1 = 0$, $u_0(0) - 2 = 0$, $t \ge 0$. This is a first-order equation for $u_0(t)$, which can be solved using an integrating factor or separation of variables. Solving gives $u_0(t) = 1 + e^{-t}$.

$\varepsilon^1$. $u_1' + u_1 - \tfrac{1}{2} u_0 = 0$, $u_1(0) = 0$, $t \ge 0$. Given the solution for $u_0(t)$, this is a first-order equation for $u_1(t)$, which can be solved using an integrating factor similar to before. Solving gives $u_1(t) = \tfrac{1}{2}(1 + te^{-t} - e^{-t})$.

The above process can be continued up to any desired power of $\varepsilon$. From our calculations thus far, the first two terms in the expansion of the solution of (5.37) are

$$u(t, \varepsilon) = \left[1 + e^{-t}\right] + \varepsilon \left[\tfrac{1}{2}\left(1 + te^{-t} - e^{-t}\right)\right] + \cdots. \tag{5.45}$$
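For readers who want the intermediate step, the integrating-factor calculation behind the coefficient of $\varepsilon$ can be written out as follows. This is a worked sketch of the computation described above, using only the order-$\varepsilon$ problem and the leading-order solution $1 + e^{-t}$.

```latex
\begin{aligned}
u_1' + u_1 &= \tfrac{1}{2}\,u_0 = \tfrac{1}{2}\left(1 + e^{-t}\right), \qquad u_1(0) = 0, \\
\frac{d}{dt}\left(e^{t} u_1\right) &= \tfrac{1}{2}\left(e^{t} + 1\right), \\
e^{t} u_1(t) &= \tfrac{1}{2}\left(e^{t} - 1 + t\right) \quad \text{(integrating from $0$ to $t$)}, \\
u_1(t) &= \tfrac{1}{2}\left(1 + t e^{-t} - e^{-t}\right).
\end{aligned}
```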

The series is guaranteed to converge for $t \in [0, \sigma)$ and $\varepsilon \in [0, \rho)$ for some $\sigma, \rho > 0$, and we note that it may be possible to increase $\sigma$ or $\rho$ at the expense of decreasing the other. If the above series is truncated, then the terms shown form a perturbation approximation up to order $O(\varepsilon)$, and such an approximation can be used to estimate the solution for any sufficiently small $\varepsilon$.

Using the above approximation we can get a picture of how the solution $u(t, \varepsilon)$ is influenced by the parameter $0 \le \varepsilon \ll 1$. As illustrated in Figure 5.6, when $\varepsilon = 0$ the solution curve monotonically decays from the value $u = 2$ at $t = 0$, and approaches the equilibrium value $u = 1$ as $t \to \infty$. For small $\varepsilon > 0$, the solution curve is qualitatively similar, but lies strictly above the previous curve (with the exception of the initial point), and approaches a strictly higher equilibrium value, which according to the approximation is at $u = 1 + \tfrac{\varepsilon}{2}$.

Figure 5.6. [Solution curves $u$ versus $t$ on $0 \le t \le 4$ for $\varepsilon = 0$ and $\varepsilon = 0.1$.]
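The quality of the two-term approximation can be checked against a direct numerical solution. The sketch below is an illustration rather than part of the original example: it integrates (5.37) with a hand-written classical Runge-Kutta scheme and reports the largest gap to (5.45) on $0 \le t \le 4$. The function names, the step count, and the sample values of $\varepsilon$ are assumptions chosen for illustration.

```python
import math

def rk4(f, u, t_end, n):
    """Classical fourth-order Runge-Kutta for du/dt = f(u); returns [(t, u), ...]."""
    h = t_end / n
    out = [(0.0, u)]
    for i in range(n):
        k1 = f(u)
        k2 = f(u + 0.5 * h * k1)
        k3 = f(u + 0.5 * h * k2)
        k4 = f(u + h * k3)
        u = u + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        out.append(((i + 1) * h, u))
    return out

def u_two_term(t, eps):
    """Two-term approximation (5.45)."""
    u0 = 1 + math.exp(-t)
    u1 = 0.5 * (1 + t * math.exp(-t) - math.exp(-t))
    return u0 + eps * u1

def max_error(eps, t_end=4.0, n=400):
    """Largest gap on [0, t_end] between the RK4 solution of (5.37) and (5.45)."""
    path = rk4(lambda u: -u + math.sqrt(1 + eps * u), 2.0, t_end, n)
    return max(abs(u - u_two_term(t, eps)) for t, u in path)

# Halving eps should shrink the error by roughly a factor of four,
# as expected for an approximation that omits terms of order eps^2.
print(max_error(0.1), max_error(0.05))
```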


Sketch of proof: Result 5.6.1. We consider the $d$-dimensional system in (5.35). We focus on the case $d = 1$, and note that the case $d > 1$ is similar. To begin, the assumption that $F(t, u, \varepsilon)$ is analytic at $(0, u_\#, 0)$ implies that it is analytic in an open neighborhood, and thus in a closed subset $D = \{(t, u, \varepsilon) \mid |t| \le a, \ |u - u_\#| \le b, \ |\varepsilon| \le c\}$ for some $a, b, c > 0$. Since all of its partial derivatives are continuous and bounded in $D$, there exist constants $M, K > 0$ such that $|F(t, u, \varepsilon)| \le M$ and $|F(t, u, \varepsilon) - F(t, v, \varepsilon)| \le K|u - v|$ for all $(t, u, \varepsilon)$ and $(t, v, \varepsilon)$ in $D$. The second inequality is a Lipschitz condition, implied by the boundedness of $\frac{\partial F}{\partial u}$.

Let $E = \{(t, \varepsilon) \mid |t| < \sigma, \ |\varepsilon| < \rho\}$, where $\sigma < \min(a, \tfrac{b}{M}, \tfrac{1}{K})$ and $\rho < c$. Moreover, let $U$ be the set of all functions $u(t, \varepsilon)$ that are analytic in $E$ and that satisfy $u(0, \varepsilon) = u_\#$ and $\|u - u_\#\| \le b$, where $\| \cdot \|$ denotes the supremum norm, defined by $\|u\| = \sup_{(t, \varepsilon) \in E} |u(t, \varepsilon)|$. With this norm, the set $U$ is a complete metric space, which follows from the fact that the analytic property and the conditions $u(0, \varepsilon) = u_\#$ and $\|u - u_\#\| \le b$ are all preserved under uniform convergence.

To establish the existence and uniqueness of a solution to (5.35), we consider an integrated version of the equation given by

$$u(t, \varepsilon) = u_\# + \int_0^t F(s, u(s, \varepsilon), \varepsilon) \, ds. \tag{5.46}$$

If this equation has a unique solution $u \in U$, then the function $u(t, \varepsilon)$ will also be a unique solution of (5.35), including the initial condition. Motivated by (5.46), for any $w \in U$, we consider the transformed function $\Lambda w$ defined by $(\Lambda w)(t, \varepsilon) = u_\# + \int_0^t F(s, w(s, \varepsilon), \varepsilon) \, ds$. Since $\|w - u_\#\| \le b$, the composite function $F(s, w(s, \varepsilon), \varepsilon)$ is analytic for all $(s, \varepsilon)$ in $E$, and the function $(\Lambda w)(t, \varepsilon)$ is well defined for all $(t, \varepsilon)$ in $E$. Specifically, the (contour) integral in the definition of $\Lambda w$ is independent of the path from $s = 0$ to $s = t$, which we may assume is a line segment. Using the fact that the integral of an analytic function is analytic, together with the bound on $F$, we find that $\Lambda w$ is analytic in $E$, and satisfies $(\Lambda w)(0, \varepsilon) = u_\#$ and $\|\Lambda w - u_\#\| \le \sigma M$, where $\sigma M < b$, which implies that $\Lambda w \in U$ for all $w \in U$. Moreover, from the Lipschitz property of $F$, we get $\|\Lambda w - \Lambda v\| \le \sigma K \|w - v\|$, where $\sigma K < 1$.

The contraction mapping theorem can now be applied to conclude that there is a unique $u \in U$ that satisfies $u = \Lambda u$, which implies that (5.46) and thus (5.35) has a unique solution. Since it is analytic in $E$, we have the representation $u(t, \varepsilon) = \sum_{j,k \ge 0} \gamma_{jk} t^j \varepsilon^k$ for some coefficients $\gamma_{jk}$. By introducing $u_k(t) = \sum_{j \ge 0} \gamma_{jk} t^j$, we get $u(t, \varepsilon) = \sum_{k \ge 0} u_k(t)\varepsilon^k$, which is a series of the form in (5.36).
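The contraction argument can be illustrated numerically by iterating the integral map for the equation of Example 5.6.1. The sketch below is only an illustration under simplifying assumptions: $\varepsilon$ is held fixed, $t$ is real, the integral is approximated by the trapezoid rule on a uniform grid, and the interval length $\sigma = 0.5$ is chosen by hand. The names `picard_step` and `picard_differences` are hypothetical.

```python
import math

def picard_step(w, ts, u_init, F):
    """One application of the map w -> u# + int_0^t F(s, w(s)) ds (trapezoid rule)."""
    out = [u_init]
    for i in range(1, len(ts)):
        h = ts[i] - ts[i - 1]
        out.append(out[-1] + 0.5 * h * (F(ts[i - 1], w[i - 1]) + F(ts[i], w[i])))
    return out

def picard_differences(eps=0.1, sigma=0.5, n=200, iters=8):
    """Sup-norm distances between successive iterates for (5.37) on [0, sigma]."""
    ts = [sigma * i / n for i in range(n + 1)]
    F = lambda t, u: -u + math.sqrt(1 + eps * u)
    w = [2.0] * (n + 1)          # start from the constant function u# = 2
    diffs = []
    for _ in range(iters):
        w_next = picard_step(w, ts, 2.0, F)
        diffs.append(max(abs(a - b) for a, b in zip(w_next, w)))
        w = w_next
    return diffs

# Successive distances shrink at least geometrically, as a contraction requires.
print(picard_differences())
```

Here the partial derivative of the right-hand side with respect to $u$ has magnitude less than one for $u \ge 0$, so with $\sigma = 0.5$ each iteration at least halves the distance between successive iterates.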

5.7. Case study

Setup. To illustrate some of the preceding results we study a simplified, planar model for ballistic targeting. As shown in Figure 5.7, we consider the problem of aiming a weapon such as a rifle for a long-range shot. We suppose that the weapon is at the origin $(0, 0)$, and that the target is at a given point $(a, b)$. For a given firing speed $c = \sqrt{u_0^2 + v_0^2}$, which is the initial speed imparted to the bullet by the weapon, the problem is to find the line of sight that is required for the bullet to strike the target.

Figure 5.7. [Sketch in the $(x, y)$ plane showing the line of sight, the trajectory, the initial velocity $(u_0, v_0)$, gravity $g$, the aiming angle $\theta$, the aiming height $h$, and the target $(a, b)$.]

Note that, if gravity and air resistance were absent, then the bullet velocity would be constant, and the trajectory would be a line, coincident with the line of sight. In this case the aiming problem would be trivial: the line of sight would be directed at the target. In contrast, when gravity and air resistance are present, the bullet velocity is nonconstant, and the trajectory is a curved path, which is only initially tangent with the line of sight. In this case the aiming problem becomes nontrivial: the line of sight must be directed at a point sufficiently above the target so that the curved path will intersect the target. We seek to determine how the line of sight, or equivalently the aiming angle $\theta$ or aiming height $h$, should be chosen depending on the target location $(a, b)$, firing speed $c$, gravitational acceleration $g$, and a small parameter $\varepsilon$ that quantifies air resistance effects. For convenience, we consider the aiming angle measured from the horizontal, and note that the angle measured from the line to the target may be more natural.

Outline of model. With a coordinate system as shown above, let $(x, y)$ denote the location of the bullet at any time $t \ge 0$. Equivalently, let $\vec{r} = x\vec{i} + y\vec{j}$ denote the position vector, where $\vec{i}$ and $\vec{j}$ are the standard unit vectors in the positive coordinate directions. The velocity and acceleration vectors at any time are then $\dot{\vec{r}} = \dot{x}\vec{i} + \dot{y}\vec{j}$ and $\ddot{\vec{r}} = \ddot{x}\vec{i} + \ddot{y}\vec{j}$, where a superposed dot denotes a derivative with respect to time.

We suppose that only two forces act on the bullet after it exits the weapon: one due to gravity, and another due to air resistance, as illustrated in Figure 5.8. The force of gravity has the form $\vec{F}_{\text{gravity}} = -mg\vec{j}$, where $m$ is the bullet mass. Using a simple model for high-speed motion through air, we assume that the force due to air resistance has a direction opposite to $\dot{\vec{r}}$, and a magnitude proportional to $|\dot{\vec{r}}|^2$. Specifically, we assume $\vec{F}_{\text{air}} = -\alpha|\dot{\vec{r}}|\dot{\vec{r}} = -\alpha(\dot{x}^2 + \dot{y}^2)^{1/2}(\dot{x}\vec{i} + \dot{y}\vec{j})$, where $\alpha \ge 0$ is an air resistance coefficient. We suppose that $\dot{x} > 0$, so that the bullet is always traveling rightward, and moreover suppose that $\dot{x} \gg |\dot{y}|$, which will be the case when the bullet path is nearly horizontal. Under these conditions, the speed factor $(\dot{x}^2 + \dot{y}^2)^{1/2} > 0$ can be approximated by $\dot{x} > 0$, and thus we consider a simplified model of air resistance given by $\vec{F}_{\text{air}} = -\alpha\dot{x}(\dot{x}\vec{i} + \dot{y}\vec{j})$. This simplification will make some of the following calculations easier.

Figure 5.8. [Forces acting on the bullet at position $\vec{r}$: $\vec{F}_{\text{air}}$ and $\vec{F}_{\text{gravity}}$.]

Newton's law of motion for the bullet requires that the product of its mass and acceleration be equal to the sum of the applied forces, or equivalently, $m\ddot{\vec{r}} = \vec{F}_{\text{gravity}} + \vec{F}_{\text{air}}$. Writing this equation in components, and considering initial conditions, we obtain the system

$$\begin{aligned} \ddot{x} &= -\varepsilon(\dot{x})^2, & \dot{x}|_{t=0} &= u_0, & x|_{t=0} &= 0, \\ \ddot{y} &= -\varepsilon\dot{x}\dot{y} - g, & \dot{y}|_{t=0} &= v_0, & y|_{t=0} &= 0, \end{aligned} \qquad \varepsilon = \alpha/m, \qquad c = \sqrt{u_0^2 + v_0^2}. \tag{5.47}$$

Any solution $(x, y)(t)$ of the above system is a bullet trajectory with prescribed initial conditions. Note that the air resistance parameter is expected to be small in the sense that $0 \le \varepsilon \ll 1$. Also, note that the magnitude $c > 0$ is known, but not the individual velocity components $(u_0, v_0)$. The targeting problem can now be stated: for given values of $a, b, c, g, \varepsilon$, we seek initial velocity components $(u_0, v_0)$ such that the path $(x, y)(t)$ will intersect the target point $(a, b)$. Once the velocity components are found, the aiming angle is given by $\tan\theta = \frac{v_0}{u_0}$, which completely determines the line of sight.

Note that the aiming height can be found from the relation $\frac{h + b}{a} = \frac{v_0}{u_0}$. Consistent with the assumption that $\dot{x} > 0$ at all times, we assume $u_0 > 0$ and $a > 0$.

Analysis of model. Here we study the targeting problem associated with (5.47) in two cases. In the first case, we consider $g > 0$ and $\varepsilon = 0$, which corresponds to including the effects of gravity, but not air resistance. In the second case, we consider $g > 0$ and $\varepsilon > 0$, which corresponds to including the effects of both gravity and air resistance.

Case of $g > 0$ and $\varepsilon = 0$. The differential equations take the simple form $\ddot{x} = 0$ and $\ddot{y} = -g$, and the initial conditions are $\dot{x}|_{t=0} = u_0$ and $\dot{y}|_{t=0} = v_0$, together with $x|_{t=0} = 0$ and $y|_{t=0} = 0$. The differential equations can be solved by explicit integration, and applying the initial conditions, we find that the bullet trajectory is $x(t) = u_0 t$ and $y(t) = v_0 t - \tfrac{1}{2} g t^2$ for all $t \ge 0$. The bullet will strike the target at some time $t = t_s > 0$ if and only if $x(t_s) = a$ and $y(t_s) = b$. From the first condition we can solve for the time to obtain $t_s = \frac{a}{u_0}$, and substituting this result into the second condition we obtain the relation $v_0 = \left(\frac{b}{a}\right) u_0 + \left(\frac{ag}{2}\right) \frac{1}{u_0}$. The velocity components must also satisfy the initial speed condition $u_0^2 + v_0^2 = c^2$. Thus we arrive at two equations for the two unknowns $u_0$ and $v_0$. Note that the two equations define two curves in the $u_0, v_0$ plane, and that any simultaneous solution of the equations must correspond to an intersection point of these curves.

The strike condition $v_0 = \left(\frac{b}{a}\right) u_0 + \left(\frac{ag}{2}\right) \frac{1}{u_0}$ corresponds to a hyperbolic-type curve, and the initial speed condition $u_0^2 + v_0^2 = c^2$ corresponds to a circle. Depending on the specified values of $a, b, c, g$, these curves can have two, one, or no intersections. The case when there are two intersections is illustrated in Figure 5.9. For each of the intersection points $(u_0^*, v_0^*)$ there is a corresponding aiming angle $\tan\theta = \frac{v_0^*}{u_0^*}$ and aiming height $\frac{h + b}{a} = \frac{v_0^*}{u_0^*}$. Note that solution 1 would have a low aiming angle, whereas solution 2 would have a high aiming angle. Only solution 1 would be consistent with our assumption of a nearly horizontal bullet path. In this special case of $\varepsilon = 0$, the intersection points of the curves can be found exactly, although the algebra required is somewhat tedious. These points can also be found with the aid of numerical rootfinding software.
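One way to organize the exact calculation for the case without air resistance is sketched below. Substituting the strike condition into the speed condition and multiplying through by the square of the horizontal velocity gives a quadratic in $U = u_0^2$, namely $(1 + b^2/a^2)U^2 + (bg - c^2)U + a^2 g^2/4 = 0$. This reduction and the function name `aiming_solutions` are not taken from the text, and the sample values in the usage line are hypothetical.

```python
import math

def aiming_solutions(a, b, c, g):
    """Velocity pairs (u0, v0), u0 > 0, meeting the strike and speed conditions (eps = 0)."""
    # With U = u0^2 the two conditions combine into A*U^2 + B*U + C = 0.
    A = 1 + (b / a) ** 2
    B = b * g - c ** 2
    C = (a * g) ** 2 / 4
    disc = B * B - 4 * A * C
    if disc < 0:
        return []                          # no intersection: target out of range
    signs = (+1,) if disc == 0 else (+1, -1)
    sols = []
    for sign in signs:                     # larger U first: the low-angle shot
        U = (-B + sign * math.sqrt(disc)) / (2 * A)
        if U > 0:
            u0 = math.sqrt(U)
            v0 = (b / a) * u0 + a * g / (2 * u0)   # strike condition
            sols.append((u0, v0))
    return sols

# Hypothetical values: target 100 units away at launch height, firing speed 50.
for u0, v0 in aiming_solutions(100.0, 0.0, 50.0, 9.81):
    print(u0, v0, math.degrees(math.atan2(v0, u0)))
```

A negative discriminant corresponds to the case of no intersections, and a zero discriminant to the tangent case with a single intersection.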

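Figure 5.9. [Left panel: the strike condition and speed condition curves in the $(u_0, v_0)$ plane, with intersection points soln1 and soln2. Right panel: the two corresponding trajectories in the $(x, y)$ plane through the target $(a, b)$.]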

Depending on the specified values of $a, b, c, g$, it may also happen that the strike and speed condition curves have only one intersection, at which point they are tangent, or the curves may have no intersections. In the latter case, the target is "out of range," and there is no aiming angle for which the bullet will strike the target. Specifically, gravity will pull the bullet below the target before the bullet can reach the target.

Case of $g > 0$ and $\varepsilon > 0$. The differential equations now take the form $\ddot{x} = -\varepsilon(\dot{x})^2$ and $\ddot{y} = -\varepsilon\dot{x}\dot{y} - g$, and the initial conditions are $\dot{x}|_{t=0} = u_0$ and $\dot{y}|_{t=0} = v_0$, together with $x|_{t=0} = 0$ and $y|_{t=0} = 0$. Obtaining an explicit expression for the bullet path is difficult, and thus we now seek a series expansion for each of the components $x(t, \varepsilon)$ and $y(t, \varepsilon)$. Considering only the first two terms in each series, the differential equations and initial conditions defined by the coefficients of $\varepsilon^0$ and $\varepsilon^1$ can be found and solved in the usual way, and we obtain an approximate path $x(t, \varepsilon) = x_0(t) + \varepsilon x_1(t)$ and $y(t, \varepsilon) = y_0(t) + \varepsilon y_1(t)$.

As before, the bullet will strike the target at some time $t = t_s > 0$ if and only if $x(t_s, \varepsilon) = a$ and $y(t_s, \varepsilon) = b$. From the first condition we can solve for the time $t_s$; although there may be multiple solutions, only one is physically meaningful in the sense that it tends to the previous value $\frac{a}{u_0}$ in the limit $\varepsilon \to 0^+$. Substituting the relevant solution for $t_s$ into the condition $y(t_s, \varepsilon) = b$ we obtain a relation of the form $v_0 = f(u_0, a, b, g, \varepsilon)$, which corresponds to a perturbed or distorted version of the hyperbolic-type curve considered above. And just as before, the velocity components must also satisfy the initial speed condition $u_0^2 + v_0^2 = c^2$. Thus we again arrive at two equations for the two unknowns $u_0$ and $v_0$. Depending on the specified values of $a, b, c, g, \varepsilon$, these equations can again have two, one or no solutions. The details for this case are considered in the Exercises.
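As a rough check on the two-term path, the sketch below compares it with a direct numerical integration of (5.47). The coefficient functions used here, $x_1 = -\tfrac{1}{2}u_0^2 t^2$ and $y_1 = -\tfrac{1}{2}u_0 v_0 t^2 + \tfrac{1}{6}u_0 g t^3$, are not stated in the text; they were obtained for this illustration by solving the order-$\varepsilon$ problems $\ddot{x}_1 = -\dot{x}_0^2$ and $\ddot{y}_1 = -\dot{x}_0\dot{y}_0$ with zero initial data, and should be read as an assumption to be verified. Function names and sample values are likewise hypothetical.

```python
def rk4_system(f, state, t_end, n):
    """Classical RK4 for a first-order system; returns the state at t_end."""
    h = t_end / n
    for _ in range(n):
        k1 = f(state)
        k2 = f([s + 0.5 * h * k for s, k in zip(state, k1)])
        k3 = f([s + 0.5 * h * k for s, k in zip(state, k2)])
        k4 = f([s + h * k for s, k in zip(state, k3)])
        state = [s + h * (p + 2 * q + 2 * r + w) / 6
                 for s, p, q, r, w in zip(state, k1, k2, k3, k4)]
    return state

def numeric_position(t, u0, v0, g, eps, n=2000):
    """(x, y) at time t from RK4 applied to system (5.47)."""
    # State is [x, y, xdot, ydot].
    rhs = lambda s: [s[2], s[3], -eps * s[2] ** 2, -eps * s[2] * s[3] - g]
    x, y, _, _ = rk4_system(rhs, [0.0, 0.0, u0, v0], t, n)
    return x, y

def two_term_position(t, u0, v0, g, eps):
    """x0 + eps*x1 and y0 + eps*y1, with x1 and y1 as derived in the lead-in."""
    x = u0 * t - eps * u0 ** 2 * t ** 2 / 2
    y = v0 * t - g * t ** 2 / 2 + eps * (-u0 * v0 * t ** 2 / 2 + u0 * g * t ** 3 / 6)
    return x, y

# Hypothetical values: u0 = 50, v0 = 10, g = 9.81, eps = 0.001, t = 2.
print(numeric_position(2.0, 50.0, 10.0, 9.81, 1e-3))
print(two_term_position(2.0, 50.0, 10.0, 9.81, 1e-3))
```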


5.8. Poincaré–Lindstedt method

The standard series expansion of a solution can have shortcomings for some types of differential equation problems. A notable type is the class of problems with periodic solutions. In this case, the standard expansion may contain terms that are nonperiodic and grow in time. Although the full sum of the series will converge to a periodic function, any truncation of the series will yield a nonperiodic approximation with a growing error. Here we show that this shortcoming can be corrected by using an enhanced expansion based on a scale transformation.

Setup. For illustration, we consider

$$\begin{aligned} \frac{dx}{dt} &= y, & x|_{t=0} &= 1, \\ \frac{dy}{dt} &= -kx - \varepsilon x^3, & y|_{t=0} &= 0, \end{aligned} \qquad t \ge 0, \tag{5.48}$$

where $k > 0$ and $0 \le \varepsilon \ll 1$ are parameters. To develop the expansion it will be convenient to rewrite this system in the equivalent form

$$\frac{d^2 x}{dt^2} = -kx - \varepsilon x^3, \qquad \frac{dx}{dt}\Big|_{t=0} = 0, \qquad x|_{t=0} = 1, \qquad t \ge 0. \tag{5.49}$$

The above system has a periodic solution when $\varepsilon = 0$ and also when $\varepsilon > 0$. This fact can be deduced from the path equation for (5.48), which gives the first integral $E(x, y) = kx^2 + \tfrac{1}{2}\varepsilon x^4 + y^2$. The solution path through the point $(x, y)|_{t=0} = (1, 0)$ is the curve $E(x, y) = C$, where $C = E(1, 0)$. This curve is a closed loop for any $\varepsilon \ge 0$, which implies that the solution is periodic. Here we seek series expansions for $x(t, \varepsilon)$ and $y(t, \varepsilon)$. For convenience, we focus on the formulation in (5.49), and only consider $x(t, \varepsilon)$. Once this function is known, then so is the other, since $y(t, \varepsilon) = \frac{dx}{dt}(t, \varepsilon)$.
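That the quartic energy-type function above is constant along solutions can be confirmed directly from the first-order system; the short computation below is included as a worked check.

```latex
\begin{aligned}
\frac{d}{dt}\left(kx^2 + \tfrac{1}{2}\varepsilon x^4 + y^2\right)
  &= \left(2kx + 2\varepsilon x^3\right)\frac{dx}{dt} + 2y\,\frac{dy}{dt} \\
  &= \left(2kx + 2\varepsilon x^3\right) y + 2y\left(-kx - \varepsilon x^3\right) = 0 .
\end{aligned}
```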

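Standard method. For motivation, and to understand the issues involved, we follow the standard procedure and consider the expansion

$$x(t, \varepsilon) = x_0(t) + \varepsilon x_1(t) + \varepsilon^2 x_2(t) + \cdots. \tag{5.50}$$

Substituting $x(t, \varepsilon)$ into the differential equation and initial conditions in (5.49) we get

$$\frac{d^2 x}{dt^2}(t, \varepsilon) + kx(t, \varepsilon) + \varepsilon f(t, \varepsilon) = 0, \qquad \frac{dx}{dt}(0, \varepsilon) = 0, \qquad x(0, \varepsilon) - 1 = 0, \tag{5.51}$$

where $f(t, \varepsilon) = x^3(t, \varepsilon)$. As before, we expand each term in powers of $\varepsilon$. For the function $f(t, \varepsilon)$ we may use Taylor's formula, applied to $\varepsilon$ with $t$ fixed, which gives

$$\begin{aligned} f(t, \varepsilon) &= f(t, 0) + \varepsilon \frac{df}{d\varepsilon}(t, 0) + \cdots \\ &= \left[x^3(t, \varepsilon)\right]_{\varepsilon=0} + \varepsilon \left[3x^2(t, \varepsilon) \frac{dx}{d\varepsilon}(t, \varepsilon)\right]_{\varepsilon=0} + \cdots \\ &= \left[x_0^3(t)\right] + \varepsilon \left[3x_0^2(t) x_1(t)\right] + \cdots. \end{aligned} \tag{5.52}$$

Substituting (5.52) and (5.50) into the differential equation in (5.51), and collecting terms by powers of $\varepsilon$, we get

$$\left[\frac{d^2 x_0}{dt^2}(t) + kx_0(t)\right] + \varepsilon \left[\frac{d^2 x_1}{dt^2}(t) + kx_1(t) + x_0^3(t)\right] + \cdots = 0, \qquad t \ge 0. \tag{5.53}$$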


We also expand each of the two initial conditions in (5.51). Substituting (5.50) into these conditions and collecting terms we get

$$\frac{dx_0}{dt}(0) + \varepsilon \frac{dx_1}{dt}(0) + \cdots = 0, \qquad [x_0(0) - 1] + \varepsilon x_1(0) + \cdots = 0. \tag{5.54}$$

In order for the equations in (5.53) and (5.54) to hold for arbitrary values of $0 \le \varepsilon \ll 1$, the coefficient of each power of $\varepsilon$ must vanish. Setting each coefficient to zero we get the following sequence of initial-value problems for the functions $x_n(t)$, $n \ge 0$.

$\varepsilon^0$. $x_0'' + kx_0 = 0$, $x_0'(0) = 0$, $x_0(0) = 1$, $t \ge 0$. This is a linear, homogeneous second-order equation for $x_0(t)$. The general solution can be found using standard methods based on the roots of a characteristic polynomial. Solving gives $x_0(t) = \cos(\beta t)$, where $\beta = \sqrt{k} > 0$.

$\varepsilon^1$. $x_1'' + kx_1 = -x_0^3$, $x_1'(0) = 0$, $x_1(0) = 0$, $t \ge 0$. Given $x_0(t)$ from above, this is a linear, inhomogeneous second-order equation for $x_1(t)$. The general solution takes the form $x_1(t) = x_1^h(t) + x_1^p(t)$, where $x_1^h(t) = C_1 \cos(\beta t) + C_2 \sin(\beta t)$ is the solution of the homogeneous equation, with arbitrary constants $C_1$ and $C_2$, and $x_1^p(t)$ is any particular solution of the inhomogeneous equation $x_1'' + kx_1 = -\cos^3(\beta t)$. To solve this equation it is convenient to simplify the right-hand side using the triple-angle identity $\cos^3(\theta) = \tfrac{1}{4}\cos(3\theta) + \tfrac{3}{4}\cos(\theta)$. In this case, the differential equation becomes

$$x_1'' + kx_1 = -\tfrac{1}{4}\cos(3\beta t) - \tfrac{3}{4}\cos(\beta t). \tag{5.55}$$

Using the method of undetermined coefficients, and noting that the inhomogeneous term $\tfrac{3}{4}\cos(\beta t)$ is contained in the homogeneous solution $x_1^h(t)$, we propose that the particular solution will have the form

$$x_1^p(t) = A\cos(3\beta t) + B\sin(3\beta t) + t\left[D\cos(\beta t) + E\sin(\beta t)\right], \tag{5.56}$$

where $A$, $B$, $D$, and $E$ are constants to be determined. Substituting (5.56) into (5.55) and matching terms on both sides of the equation, and noting that $\beta^2 = k$, we find a particular solution with $A = \frac{1}{32\beta^2}$, $B = 0$, $D = 0$, and $E = -\frac{3}{8\beta}$. Combining the homogeneous and particular solutions then gives the general solution

$$x_1(t) = C_1\cos(\beta t) + C_2\sin(\beta t) + \frac{1}{32\beta^2}\cos(3\beta t) - \frac{3t}{8\beta}\sin(\beta t). \tag{5.57}$$

The initial conditions $x_1(0) = 0$ and $x_1'(0) = 0$ can now be applied, and we find that $C_1 = -\frac{1}{32\beta^2}$ and $C_2 = 0$, and the solution for $x_1(t)$ is completely determined.

The above process can be continued up to any desired power of $\varepsilon$. Based on our calculations thus far, the first two terms in the expansion of the solution of (5.49) are

$$x(t, \varepsilon) = \cos(\beta t) + \varepsilon\left[\frac{\cos(3\beta t)}{32\beta^2} - \frac{\cos(\beta t)}{32\beta^2} - \frac{3t\sin(\beta t)}{8\beta}\right] + \cdots. \tag{5.58}$$

As before, the series is guaranteed to converge for $t \in [0, \sigma)$ and $\varepsilon \in [0, \rho)$ for some $\sigma, \rho > 0$, and the first few terms of the series provide an approximation for sufficiently small $\varepsilon$. The term with $t\sin(\beta t)$ is called a secular term in the expansion. Unlike the other terms shown, which are all periodic, this term is nonperiodic and grows in time. Thus any truncation of the series will yield a nonperiodic approximation with a growing error.

Poincaré–Lindstedt method. We again consider a series expansion for the solution $x(t, \varepsilon)$ of the system (5.49). In contrast to before, we consider a scaled time variable $s = \omega t$, where $\omega > 0$ is a constant, and we further let $\omega$ depend on the constant parameter $\varepsilon$. We assume that $\omega = \omega(\varepsilon)$ is an analytic function, with expansion

$$\omega(\varepsilon) = \omega_0 + \varepsilon\omega_1 + \varepsilon^2\omega_2 + \cdots. \tag{5.59}$$

Note that the condition $\omega > 0$ for $0 \le \varepsilon \ll 1$ requires $\omega_0 > 0$. Aside from this, the scale $\omega$ can be arbitrary; the system in (5.49) and its solution $x(t, \varepsilon)$ can always be written in terms of a scaled variable $s = \omega t$.

The essential idea of the Poincaré–Lindstedt method is to apply the scale transformation to the original system and change variables from $x, t$ to $x, s$. For this purpose, the usual derivative relations from Result 2.3.1 are applicable, which yield

$$\frac{dx}{dt} = \omega\frac{dx}{ds}, \qquad \frac{d^2 x}{dt^2} = \omega^2\frac{d^2 x}{ds^2}. \tag{5.60}$$

The coefficients $\omega_n$, $n \ge 0$ in the expansion of $\omega$ can then be chosen so that the expansion of $x(s, \varepsilon)$ is clean and free of secular terms. The resulting solution in the variables $x, s$ can then be converted back to $x, t$ as desired.

To illustrate the method, we consider the system in (5.49), and change variables from $x, t$ to $x, s$. Using the above derivative relations, and noting that the interval $t \ge 0$ becomes $s \ge 0$, we obtain

$$\omega^2\frac{d^2 x}{ds^2} = -kx - \varepsilon x^3, \qquad \omega\frac{dx}{ds}\Big|_{s=0} = 0, \qquad x|_{s=0} = 1, \qquad s \ge 0. \tag{5.61}$$

Let $\omega(\varepsilon)$ be as in (5.59), and consider a usual expansion for $x(s, \varepsilon)$ as outlined in Result 5.6.1, namely

$$x(s, \varepsilon) = x_0(s) + \varepsilon x_1(s) + \varepsilon^2 x_2(s) + \cdots. \tag{5.62}$$

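We next substitute (5.59) and (5.62) into (5.61) and expand all terms in powers of $\varepsilon$. To this end, it is convenient to introduce the functions

$$f(s, \varepsilon) = \omega^2(\varepsilon)\frac{d^2 x}{ds^2}(s, \varepsilon), \qquad g(s, \varepsilon) = x^3(s, \varepsilon), \qquad h(s, \varepsilon) = \omega(\varepsilon)\frac{dx}{ds}(s, \varepsilon). \tag{5.63}$$

Each of these functions can be expanded using Taylor's formula, applied to $\varepsilon$ with $s$ fixed. For $f(s, \varepsilon)$ we obtain

$$\begin{aligned} f(s, \varepsilon) &= f(s, 0) + \varepsilon\frac{df}{d\varepsilon}(s, 0) + \cdots \\ &= \left[\omega^2\frac{d^2 x}{ds^2}\right]_{\varepsilon=0} + \varepsilon\left[2\omega\frac{d\omega}{d\varepsilon}\frac{d^2 x}{ds^2} + \omega^2\frac{d}{d\varepsilon}\left(\frac{d^2 x}{ds^2}\right)\right]_{\varepsilon=0} + \cdots \\ &= \left[\omega_0^2\frac{d^2 x_0}{ds^2}\right] + \varepsilon\left[2\omega_0\omega_1\frac{d^2 x_0}{ds^2} + \omega_0^2\frac{d^2 x_1}{ds^2}\right] + \cdots. \end{aligned} \tag{5.64}$$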


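Similarly, for $g(s, \varepsilon)$ we get

$$\begin{aligned} g(s, \varepsilon) &= g(s, 0) + \varepsilon\frac{dg}{d\varepsilon}(s, 0) + \cdots \\ &= \left[x^3\right]_{\varepsilon=0} + \varepsilon\left[3x^2\frac{dx}{d\varepsilon}\right]_{\varepsilon=0} + \cdots \\ &= \left[x_0^3\right] + \varepsilon\left[3x_0^2 x_1\right] + \cdots, \end{aligned} \tag{5.65}$$

and for $h(s, \varepsilon)$ we get

$$\begin{aligned} h(s, \varepsilon) &= h(s, 0) + \varepsilon\frac{dh}{d\varepsilon}(s, 0) + \cdots \\ &= \left[\omega\frac{dx}{ds}\right]_{\varepsilon=0} + \varepsilon\left[\frac{d\omega}{d\varepsilon}\frac{dx}{ds} + \omega\frac{d}{d\varepsilon}\left(\frac{dx}{ds}\right)\right]_{\varepsilon=0} + \cdots \\ &= \left[\omega_0\frac{dx_0}{ds}\right] + \varepsilon\left[\omega_1\frac{dx_0}{ds} + \omega_0\frac{dx_1}{ds}\right] + \cdots. \end{aligned} \tag{5.66}$$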

We can now substitute (5.62)–(5.66) into (5.61) and move all terms to the left-hand side and collect them by powers of $\varepsilon$. We do this in the differential equation, and each of the two initial conditions. Setting each coefficient to zero we then obtain the following sequence of initial-value problems for the functions $x_n(s)$, which now involve the coefficients $\omega_n$, $n \ge 0$.

$\varepsilon^0$. $\omega_0^2 x_0'' + kx_0 = 0$, $\omega_0 x_0'(0) = 0$, $x_0(0) = 1$, $s \ge 0$. This is a linear, homogeneous second-order equation for $x_0(s)$. Aside from being positive, the coefficient $\omega_0 > 0$ can be chosen arbitrarily. The specific choice $\omega_0 = \sqrt{k}$ is convenient; this will simplify various calculations that follow. With this choice the equation takes the clean form $x_0'' + x_0 = 0$, and the resulting solution, taking into account the initial conditions, becomes $x_0(s) = \cos(s)$.

$\varepsilon^1$. $\omega_0^2 x_1'' + kx_1 = -x_0^3 - 2\omega_0\omega_1 x_0''$, $\omega_0 x_1'(0) = -\omega_1 x_0'(0)$, $x_1(0) = 0$, $s \ge 0$. Given $\omega_0$ and $x_0(s)$ from above, this is a linear, inhomogeneous second-order equation for $x_1(s)$. Using the fact that $\omega_0 = \sqrt{k}$ and $x_0(s) = \cos(s)$ we can rewrite the differential equation as $x_1'' + x_1 = \frac{2\omega_1}{\omega_0}\cos(s) - \frac{1}{\omega_0^2}\cos^3(s)$. As before, it is convenient to simplify the right-hand side using the triple-angle identity $\cos^3(\theta) = \tfrac{1}{4}\cos(3\theta) + \tfrac{3}{4}\cos(\theta)$. In this case, the differential equation becomes

$$x_1'' + x_1 = \left(\frac{2\omega_1}{\omega_0} - \frac{3}{4\omega_0^2}\right)\cos(s) - \frac{1}{4\omega_0^2}\cos(3s). \tag{5.67}$$

Because $\cos(s)$ is a solution of the homogeneous equation, it will lead to a secular term in the particular solution for $x_1(s)$. Hence we choose $\omega_1$ to eliminate the term with $\cos(s)$. Setting $\left(\frac{2\omega_1}{\omega_0} - \frac{3}{4\omega_0^2}\right) = 0$ we get

$$\omega_1 = \frac{3}{8\omega_0}. \tag{5.68}$$


With this choice of $\omega_1$ the differential equation becomes

$$x_1'' + x_1 = -\frac{1}{4\omega_0^2}\cos(3s). \tag{5.69}$$

The general solution, formed by combining the homogeneous and a particular solution, takes the form

$$x_1(s) = C_1\cos(s) + C_2\sin(s) + \frac{1}{32\omega_0^2}\cos(3s), \tag{5.70}$$

where $C_1$ and $C_2$ are arbitrary constants. The initial conditions $x_1(0) = 0$ and $x_1'(0) = -\frac{\omega_1}{\omega_0}x_0'(0) = 0$ imply that $C_1 = -\frac{1}{32\omega_0^2}$ and $C_2 = 0$, and the solution for $x_1(s)$ is completely determined.

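The above process can be continued up to any desired power of $\varepsilon$. The success of the method relies on the assumption that the arbitrary coefficients $\omega_n$ can be chosen to eliminate subsequent secular terms as they arise, and that the resulting series for $\omega(\varepsilon)$ will converge. Our calculations thus far provide a two-term perturbation approximation for both $x(s, \varepsilon)$ and $\omega(\varepsilon)$, namely

$$x(s, \varepsilon) \doteq x_0(s) + \varepsilon x_1(s), \qquad \omega(\varepsilon) \doteq \omega_0 + \varepsilon\omega_1, \qquad s = \omega(\varepsilon)t. \tag{5.71}$$

More specifically, we have

$$x(s, \varepsilon) \doteq \cos(s) + \varepsilon\left[\frac{\cos(3s)}{32\omega_0^2} - \frac{\cos(s)}{32\omega_0^2}\right], \qquad s = \left(\omega_0 + \frac{3\varepsilon}{8\omega_0}\right)t, \qquad \omega_0 = \sqrt{k}. \tag{5.72}$$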

In contrast to before, note that all terms shown are periodic, and that any truncation of such an expansion will yield an approximation that is periodic. Thus, by using an enhanced expansion technique such as the Poincaré–Lindstedt method, we can avoid secular terms that arise when using the standard method. When the Poincaré– Lindstedt method is successful, any truncation of the resulting expansion provides an approximation with an error that does not grow as before, but instead is uniformly bounded in time. Note that the above expression in variables 𝑥, 𝑠 can be converted back to variables 𝑥, 𝑡 by direct substitution.
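The difference between the two truncations can be seen numerically. The sketch below is an illustration, not part of the original text: it integrates (5.49) with a hand-written Runge-Kutta scheme and measures the largest deviation of the standard two-term truncation (5.58) and of the Poincaré–Lindstedt approximation (5.72) over a long time interval. The function names and the sample values $k = 1$, $\varepsilon = 0.1$, $0 \le t \le 40$ are assumptions chosen for illustration.

```python
import math

def duffing_path(k, eps, t_end, n):
    """RK4 solution of x'' = -k*x - eps*x^3, x(0) = 1, x'(0) = 0; returns [(t, x), ...]."""
    h = t_end / n
    acc = lambda x: -k * x - eps * x ** 3
    x, y = 1.0, 0.0
    path = [(0.0, x)]
    for i in range(n):
        k1x, k1y = y, acc(x)
        k2x, k2y = y + 0.5 * h * k1y, acc(x + 0.5 * h * k1x)
        k3x, k3y = y + 0.5 * h * k2y, acc(x + 0.5 * h * k2x)
        k4x, k4y = y + h * k3y, acc(x + h * k3x)
        x += h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        y += h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6
        path.append(((i + 1) * h, x))
    return path

def standard_two_term(t, k, eps):
    """Truncation of (5.58), including the secular term."""
    b = math.sqrt(k)
    return math.cos(b * t) + eps * (
        math.cos(3 * b * t) / (32 * b * b)
        - math.cos(b * t) / (32 * b * b)
        - 3 * t * math.sin(b * t) / (8 * b))

def lindstedt_two_term(t, k, eps):
    """Approximation (5.72) with s = (w0 + 3*eps/(8*w0)) * t."""
    w0 = math.sqrt(k)
    s = (w0 + 3 * eps / (8 * w0)) * t
    return math.cos(s) + eps * (math.cos(3 * s) - math.cos(s)) / (32 * w0 * w0)

def max_errors(k=1.0, eps=0.1, t_end=40.0, n=8000):
    """Largest deviation from the RK4 solution for each approximation."""
    path = duffing_path(k, eps, t_end, n)
    err_std = max(abs(x - standard_two_term(t, k, eps)) for t, x in path)
    err_pl = max(abs(x - lindstedt_two_term(t, k, eps)) for t, x in path)
    return err_std, err_pl

print(max_errors())
```

For these sample values the standard truncation drifts far from the numerical solution by the end of the interval, while the Poincaré–Lindstedt approximation stays close to it.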

5.9. Singular algebraic case

We next extend the results for regularly perturbed algebraic equations to the singular case. Rather than state a general result, we illustrate the essential ideas using an example. We consider

$$\varepsilon x^4 - x - 1 = 0, \qquad 0 < \varepsilon \ll 1. \tag{5.73}$$

This equation is singularly perturbed since not all solutions for $\varepsilon > 0$ can be continued to $\varepsilon = 0$. Note that, although $\varepsilon = 0$ is not of direct interest, we still use it to guide our developments. Similar to the regular case, we seek an expansion for each of the four roots $x(\varepsilon)$.


Remarks. For motivation, we note that (5.73) has 4 roots when 𝜀 > 0 and only 1 root when 𝜀 = 0. This informs us that 3 roots are singular, which in the algebraic case must become unbounded (infinite) as 𝜀 → 0⁺, and 1 root is regular, which remains defined as 𝜀 → 0⁺. We also note that a regular series as outlined in Result 5.5.1 can only represent a regular root, since a series in powers of 𝜀, or more generally 𝜀^(1/𝑚) with 𝑚 ≥ 1, will remain defined as 𝜀 → 0⁺. Thus a special approach is needed to obtain an expansion for each of the four roots.

Procedure. To construct a series expansion for each root, we first identify the total number of roots, and the number of regular and singular type. As outlined above, we have 4 total roots: 1 regular and 3 singular.

Regular roots. A series expansion for each regular root can be constructed using the standard approach based on simple and repeated types at 𝜀 = 0, as outlined in Result 5.5.1. Setting 𝜀 = 0 in equation (5.73) we get −𝑥₀ − 1 = 0, which gives 𝑥₀ = −1. Thus the one regular root is simple at 𝜀 = 0. (A single root must be simple.) Based on this, we consider the regular expansion

$$x(\varepsilon) = x_0 + \varepsilon x_1 + \varepsilon^2 x_2 + \varepsilon^3 x_3 + \cdots, \qquad x_0 = -1. \tag{5.74}$$

Note that an expansion in powers of 𝛿 = 𝜀^(1/𝑚) or similar would be needed for a repeated root of multiplicity 𝑚. In the usual way, we next substitute (5.74) into (5.73), expand all terms in powers of 𝜀, and collect coefficients. The result is as follows.

𝜀⁰. −𝑥₀ − 1 = 0. This is satisfied by the root under consideration, which is 𝑥₀ = −1.

𝜀¹. 𝑥₀⁴ − 𝑥₁ = 0. For the given value of 𝑥₀, we get a corresponding value of 𝑥₁, specifically 𝑥₁ = 𝑥₀⁴ = 1.

The above process can be continued up to any desired power of 𝜀. From our calculations, the first two terms in the expansion of the one regular root are

$$x^{(1)}(\varepsilon) = -1 + \varepsilon + \cdots. \tag{5.75}$$
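The remark that the process "can be continued up to any desired power of 𝜀" can be illustrated with a short computation. The sketch below is not from the text: it assumes the fixed-point rearrangement 𝑥 = −1 + 𝜀𝑥⁴ of (5.73) and uses truncated power-series arithmetic, with the helper names `mul_trunc` and `regular_root_coeffs` chosen here for illustration.

```python
def mul_trunc(a, b, n):
    """Multiply two power series in eps (coefficient lists), keeping orders 0..n."""
    out = [0] * (n + 1)
    for i, ai in enumerate(a[: n + 1]):
        for j, bj in enumerate(b[: n + 1 - i]):
            out[i + j] += ai * bj
    return out


def regular_root_coeffs(n):
    """Coefficients x0..xn of the regular root of eps*x**4 - x - 1 = 0.

    Iterates x <- -1 + eps*x**4 on truncated series; each pass fixes
    one more coefficient, so n passes give x0..xn.
    """
    x = [-1] + [0] * n
    for _ in range(n):
        x2 = mul_trunc(x, x, n)
        x4 = mul_trunc(x2, x2, n)
        # Multiplying by eps shifts every coefficient up by one order.
        x = [-1] + x4[:n]
    return x


coeffs = regular_root_coeffs(3)
print(coeffs)  # [-1, 1, -4, 22]
```

The first two coefficients reproduce (5.75); the next two suggest the continuation −4𝜀² + 22𝜀³, which can be checked by substituting back into (5.73).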

Singular roots. A series expansion for each singular root can be constructed by introducing a scale transformation. The purpose of this change of variable is to make the problem regular, so that previous results can be applied. We consider a change of variable of the form 𝑧 = 𝜀^𝛼 𝑥 or 𝑥 = 𝑧/𝜀^𝛼, where 𝛼 > 0 is an exponent to be determined. When this exponent is properly chosen, the singular and regular roots in 𝑥 will be converted to nonzero and zero roots in 𝑧 at 𝜀 = 0. Normally, singular roots in 𝑥 will have 𝑧 ≠ 0 at 𝜀 = 0, while regular roots in 𝑥 will have 𝑧 = 0 at 𝜀 = 0. Note that 𝑧 ≠ 0 implies that 𝑥 becomes unbounded as 𝜀 → 0⁺.

We next substitute 𝑥 = 𝑧/𝜀^𝛼 into (5.73), and then simplify the result to obtain a leading coefficient of unity, which gives a normalized equation in 𝑧. Specifically, we get

$$\begin{aligned}
\varepsilon(\varepsilon^{-\alpha} z)^4 - \varepsilon^{-\alpha} z - 1 &= 0, \\
\varepsilon^{1-4\alpha} z^4 - \varepsilon^{-\alpha} z - 1 &= 0, \\
z^4 - \varepsilon^{3\alpha-1} z - \varepsilon^{4\alpha-1} &= 0.
\end{aligned} \tag{5.76}$$

The exponent 𝛼 can now be determined: we seek the smallest value of 𝛼 > 0 that will make the normalized equation regularly perturbed. This will be the case when each power of 𝜀 is nonnegative, which requires 3𝛼 − 1 ≥ 0 and 4𝛼 − 1 ≥ 0, or equivalently 𝛼 ≥ 1/3 and 𝛼 ≥ 1/4. The smallest value of 𝛼 that satisfies both conditions is 𝛼 = 1/3. Substituting this exponent into (5.76) gives the regularly perturbed equation 𝑧⁴ − 𝑧 − 𝜀^(1/3) = 0. This result can be written with integer exponents (thus analytic coefficients) by relabeling the parameter, namely

$$z^4 - z - \tilde\varepsilon = 0, \qquad \tilde\varepsilon = \varepsilon^{1/3}. \tag{5.77}$$

Since the above equation is regularly perturbed, a series expansion for each root can be constructed using the standard approach based on simple and repeated types at 𝜀̃ = 0. As outlined above, we seek roots with 𝑧₀ ≠ 0, as these will correspond to the original singular roots. Setting 𝜀̃ = 0 in equation (5.77) we get 𝑧₀⁴ − 𝑧₀ = 0, or equivalently 𝑧₀(𝑧₀³ − 1) = 0, which requires 𝑧₀ = 0 or 𝑧₀³ = 1. We discard 𝑧₀ = 0, and only consider 𝑧₀³ = 1, which yields 𝑧₀ = 1, (−1 + 𝑖√3)/2, (−1 − 𝑖√3)/2. Thus we have three nonzero roots 𝑧₀ at 𝜀̃ = 0, which are all simple. Based on this, we consider the regular expansions

$$z(\tilde\varepsilon) = z_0 + \tilde\varepsilon z_1 + \tilde\varepsilon^2 z_2 + \cdots, \qquad z_0 = 1,\ \frac{-1+i\sqrt{3}}{2},\ \frac{-1-i\sqrt{3}}{2}. \tag{5.78}$$

Note that an expansion in powers of 𝛿 = 𝜀̃^(1/𝑚) or similar would be needed for a repeated root of multiplicity 𝑚. As before, we next substitute (5.78) into (5.77), expand all terms in powers of 𝜀̃, and collect coefficients. The result is as follows.

𝜀̃⁰. 𝑧₀⁴ − 𝑧₀ = 0. This is satisfied by the three roots under consideration, which are 𝑧₀ = 1, (−1 + 𝑖√3)/2, (−1 − 𝑖√3)/2.

𝜀̃¹. 4𝑧₀³𝑧₁ − 𝑧₁ − 1 = 0. For each value of 𝑧₀, we get a corresponding value of 𝑧₁, specifically 𝑧₁ = 1/(4𝑧₀³ − 1) = 1/3, 1/3, 1/3.

The above process can be continued up to any desired power of 𝜀̃. Based on our calculations, the first two terms in the expansion of the three relevant roots for 𝑧 are

$$\begin{aligned}
z^{(a)}(\tilde\varepsilon) &= 1 + \tfrac{1}{3}\tilde\varepsilon + \cdots, \\
z^{(b)}(\tilde\varepsilon) &= \frac{-1+i\sqrt{3}}{2} + \tfrac{1}{3}\tilde\varepsilon + \cdots, \\
z^{(c)}(\tilde\varepsilon) &= \frac{-1-i\sqrt{3}}{2} + \tfrac{1}{3}\tilde\varepsilon + \cdots.
\end{aligned} \tag{5.79}$$

Each series for 𝑧 in terms of 𝜀̃ can be converted into a series for 𝑥 in terms of 𝜀 using the relations 𝜀̃ = 𝜀^(1/3) and 𝑥 = 𝑧/𝜀^𝛼, where 𝛼 = 1/3. Note that 𝜀̃ and 𝜀^𝛼 will in general be different, although they are the same in this example. Thus the three original singular roots have the expansions

$$\begin{aligned}
x^{(2)}(\varepsilon) &= \frac{1}{\varepsilon^{1/3}} + \frac{1}{3} + \cdots, \\
x^{(3)}(\varepsilon) &= \frac{-1+i\sqrt{3}}{2\varepsilon^{1/3}} + \frac{1}{3} + \cdots, \\
x^{(4)}(\varepsilon) &= \frac{-1-i\sqrt{3}}{2\varepsilon^{1/3}} + \frac{1}{3} + \cdots.
\end{aligned} \tag{5.80}$$

The results in (5.80) and (5.75) are the expansions of all four roots of the original equation in (5.73). The overall procedure can be applied in the same way to other singularly perturbed algebraic equations.

Note. If further considered, the root of (5.77) with 𝑧₀ = 0 would generate an expansion for the regular root of the original equation. However, an expansion obtained in terms of 𝑧 and 𝜀̃, while equivalent, would be more cumbersome and less efficient than that obtained by working directly with the original equation in terms of 𝑥 and 𝜀. Thus it is advantageous to deal with the regular and singular roots separately as described above.
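As a numerical sanity check on (5.75) and (5.80), the two-term expansions can be compared with roots computed to machine precision. The following is a minimal sketch, not part of the text: it assumes that Newton's method started from each expansion converges to the corresponding root (which holds here because the starting guesses are close), and the function names are illustrative.

```python
import cmath


def f(x, eps):
    """Left-hand side of (5.73)."""
    return eps * x**4 - x - 1


def newton_root(x, eps, steps=60):
    """Polish a root estimate of f with Newton's method (complex arithmetic)."""
    for _ in range(steps):
        x = x - f(x, eps) / (4 * eps * x**3 - 1)
    return x


def two_term_roots(eps):
    """Two-term expansions: the regular root (5.75), then the singular roots (5.80)."""
    cube_roots_of_unity = [cmath.exp(2j * cmath.pi * k / 3) for k in range(3)]
    regular = [-1 + eps]
    singular = [z0 / eps ** (1 / 3) + 1 / 3 for z0 in cube_roots_of_unity]
    return regular + singular


def expansion_errors(eps):
    """Distance from each two-term expansion to the polished root."""
    return [abs(x - newton_root(x, eps)) for x in two_term_roots(eps)]


for eps in (1e-2, 1e-3, 1e-4):
    print(eps, ["%.1e" % err for err in expansion_errors(eps)])
```

The printed errors should shrink as 𝜀 decreases: quickly for the regular root, whose series is in powers of 𝜀, and more slowly for the singular roots, whose series is in powers of 𝜀^(1/3).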

5.10. Singular differential case

We next extend the results for regularly perturbed differential equations to the singular case. As before, rather than state a general result, we illustrate the essential ideas using an example. We consider

$$\varepsilon \frac{d^2 y}{dx^2} + (1+\varepsilon)\frac{dy}{dx} + y = 0, \quad 0 \le x \le 1, \qquad y|_{x=0} = 0, \quad y|_{x=1} = 1, \qquad 0 < \varepsilon \ll 1. \tag{5.81}$$

This system is singularly perturbed since the unique solution for 𝜀 > 0 cannot be continued to 𝜀 = 0; specifically, there is no solution in the latter case. Note that, although 𝜀 = 0 is not of direct interest, we still use it to guide our developments. As in the regular case, we seek an expansion for the solution 𝑦(𝑥, 𝜀).

Remarks. For the case 𝜀 > 0, the system consists of a linear, second-order differential equation with two boundary conditions, and has a unique solution. In contrast, for the case 𝜀 = 0, the system consists of a linear, first-order differential equation with two boundary conditions, and has no solution. This informs us that the unique solution for 𝜀 > 0 must become undefined or break down as 𝜀 → 0⁺. For systems involving a differential equation, a typical mode of breakdown is

$$\frac{d^n y}{dx^n}(x, \varepsilon) \to \pm\infty \quad \text{as} \quad \varepsilon \to 0^+, \quad \text{for some } n \text{ and } x. \tag{5.82}$$

Numerical experimentation with (5.81) shows that the breakdown in the solution occurs in the slope at the left side of the domain, specifically 𝑑𝑦/𝑑𝑥 (0, 𝜀) → ∞ as 𝜀 → 0⁺, as illustrated in Figure 5.10. For small values of 𝜀 > 0, the slope is large in a thin region adjacent to 𝑥 = 0; and as 𝜀 → 0⁺, the region becomes thinner and the slope becomes infinite. Outside of this thin region the solution behaves regularly in the sense that there is no singular behavior as 𝜀 → 0⁺. The thin region adjacent to 𝑥 = 0 is called the boundary layer or inner region, and the remaining portion of the domain that includes 𝑥 = 1 is called the outer region. This is a qualitative partitioning of the domain; the point of transition between the two regions is left unspecified.

[Figure 5.10. Sketch of the solution 𝑦(𝑥, 𝜀) for 0 ≤ 𝑥 ≤ 1, with the inner region adjacent to 𝑥 = 0 and the outer region extending to 𝑥 = 1.]

In our developments, we restrict attention to problems with singular behavior at only one point, which will be the left or right end of an interval, and we outline a procedure for finding only the first term in an expansion of the solution. Problems involving more than one point of singular behavior, which may include points in the interior of the domain, and procedures for finding high-order terms in an expansion, are not pursued here.

Procedure. To construct an expansion for the solution 𝑦(𝑥, 𝜀) we first identify the location of the inner region or boundary layer. We assume the location is given; in the present example, it is on the left end of the solution interval 0 ≤ 𝑥 ≤ 1. For generality, we denote the boundary layer endpoint by 𝑞_in, and the other endpoint by 𝑞_out, so the interval becomes 𝑞_in ≤ 𝑥 ≤ 𝑞_out, where 𝑞_in = 0 and 𝑞_out = 1. We next split the problem into two parts corresponding to the inner and outer regions. In each region, the solution must satisfy the differential equation, and any boundary condition in that region. In the current example, one boundary condition is specified in each region or side; but more generally, a region or side may contain some, all, or none of the boundary conditions. The two parts for the current example are

$$\text{(Outer)} \qquad \begin{cases} \varepsilon \dfrac{d^2 y}{dx^2} + (1+\varepsilon)\dfrac{dy}{dx} + y = 0, & x \le q_{\mathrm{out}}, \\[1ex] y|_{x=q_{\mathrm{out}}} = 1. \end{cases} \tag{5.83}$$

$$\text{(Inner)} \qquad \begin{cases} \varepsilon \dfrac{d^2 y}{dx^2} + (1+\varepsilon)\dfrac{dy}{dx} + y = 0, & x \ge q_{\mathrm{in}}, \\[1ex] y|_{x=q_{\mathrm{in}}} = 0. \end{cases} \tag{5.84}$$

Note that, for problems in which the boundary layer is on the right side, the above parts would be interchanged: the outer region would be 𝑥 ≥ 𝑞_out = 0, and the inner region would be 𝑥 ≤ 𝑞_in = 1.

Outer problem. In the outer region, the solution behaves regularly, and no special treatment is required. Thus we consider a regular expansion of the form

$$y(x, \varepsilon) = y_0(x) + \varepsilon y_1(x) + \varepsilon^2 y_2(x) + \cdots. \tag{5.85}$$

In the usual way, we can substitute (5.85) into (5.83), expand all terms in powers of 𝜀, and collect coefficients. Restricting attention to only the leading-order term 𝑦₀(𝑥), we get

$$\frac{dy_0}{dx} + y_0 = 0, \qquad y_0(q_{\mathrm{out}}) = 1, \qquad x \le q_{\mathrm{out}}. \tag{5.86}$$

Solving this equation using separation of variables we obtain 𝑦₀(𝑥) = 𝑒^(1−𝑥), and truncating the series in (5.85) at the first term, we get the following leading-order approximation of the outer solution

$$y^{\mathrm{out}}(x, \varepsilon) = y_0(x) = e^{1-x}. \tag{5.87}$$

For later reference, let 𝐼^out denote the value of the outer solution at the inner endpoint 𝑞_in in the limit 𝜀 → 0⁺, that is

$$I^{\mathrm{out}} = \lim_{\varepsilon \to 0^+} y^{\mathrm{out}}(q_{\mathrm{in}}, \varepsilon). \tag{5.88}$$

Since 𝑞_in = 0 and there is no dependence on 𝜀, we get 𝐼^out = 𝑒 as illustrated in Figure 5.11. For problems in which the boundary layer is on the right, the definition of 𝐼^out in (5.88) would be the same, but the figure would be flipped: the outer region and 𝑞_out would be on the left.

[Figure 5.11. Graph of the outer approximation 𝑦^out(𝑥, 𝜀) for 𝑞_in ≤ 𝑥 ≤ 𝑞_out, with the intercept 𝐼^out marked on the 𝑦-axis.]

Inner problem. In the inner region, the solution behaves singularly, and special treatment is required. Motivated by the algebraic case, we introduce a scale transformation or change of variable to make the problem regular in this region. We consider the change of variable

$$\tau = (x - q_{\mathrm{in}})/\varepsilon^{\alpha}, \qquad x = q_{\mathrm{in}} + \varepsilon^{\alpha}\tau, \tag{5.89}$$

where 𝛼 > 0 is an exponent to be determined. The essential idea is to apply this transformation to the inner system and change variables from 𝑥, 𝑦 to 𝜏, 𝑦. For this purpose, the usual derivative relations from Result 2.3.1 are applicable, which yield

$$\frac{dy}{dx} = \varepsilon^{-\alpha}\frac{dy}{d\tau}, \qquad \frac{d^2 y}{dx^2} = \varepsilon^{-2\alpha}\frac{d^2 y}{d\tau^2}. \tag{5.90}$$

Note that the solution curve in the 𝜏, 𝑦-plane will be a horizontally stretched version of the curve in the 𝑥, 𝑦-plane, where the stretching factor is 𝜀^(−𝛼). When the exponent is properly chosen, the stretching will cancel the steepening slope at 𝑥 = 𝑞_in when 𝜀 → 0⁺.

We next apply the change of variable (5.89) to the inner system (5.84), noting that the interval 𝑥 ≥ 𝑞_in becomes 𝜏 ≥ 0, and then simplify the differential equation so that the highest-order term has a coefficient of unity. This yields a normalized system in the variables 𝜏, 𝑦, which is

$$\begin{cases} \dfrac{d^2 y}{d\tau^2} + (\varepsilon^{\alpha-1} + \varepsilon^{\alpha})\dfrac{dy}{d\tau} + \varepsilon^{2\alpha-1} y = 0, & \tau \ge 0, \\[1ex] y|_{\tau=0} = 0. \end{cases} \tag{5.91}$$

The exponent 𝛼 can now be determined: we seek the smallest value of 𝛼 > 0 that will make the normalized system regularly perturbed. This will be the case when each power of 𝜀 is nonnegative, which requires 𝛼 − 1 ≥ 0, 𝛼 ≥ 0 and 2𝛼 − 1 ≥ 0, or equivalently 𝛼 ≥ 1, 𝛼 ≥ 0 and 𝛼 ≥ 1/2. The smallest value of 𝛼 that satisfies all three conditions is 𝛼 = 1. Substituting this exponent into (5.91) we get

$$\begin{cases} \dfrac{d^2 y}{d\tau^2} + (1 + \varepsilon)\dfrac{dy}{d\tau} + \varepsilon y = 0, & \tau \ge 0, \\[1ex] y|_{\tau=0} = 0. \end{cases} \tag{5.92}$$
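The exponent rule used in both the algebraic and differential examples (choose the smallest 𝛼 for which every power of 𝜀 is nonnegative) can be written as a small helper. This is only a sketch under a stated assumption: every power has the form 𝜀^(𝑎𝛼 + 𝑏) with 𝑎 > 0, as in (5.76) and (5.91). The function name is hypothetical.

```python
from fractions import Fraction


def smallest_regular_exponent(powers):
    """Smallest alpha for which every power eps**(a*alpha + b) is nonnegative.

    `powers` is a list of integer pairs (a, b) with a > 0 (an assumption of
    this sketch). Each pair gives the bound alpha >= -b/a, and the smallest
    admissible alpha is the largest of these bounds.
    """
    if any(a <= 0 for a, _ in powers):
        raise ValueError("this sketch assumes a > 0 in every power")
    alpha = max(Fraction(-b, a) for a, b in powers)
    if alpha <= 0:
        raise ValueError("no positive lower bound on alpha")
    return alpha


# Powers in (5.76): eps**(3*alpha - 1) and eps**(4*alpha - 1).
alpha_algebraic = smallest_regular_exponent([(3, -1), (4, -1)])
# Powers in (5.91): eps**(alpha - 1), eps**alpha and eps**(2*alpha - 1).
alpha_differential = smallest_regular_exponent([(1, -1), (1, 0), (2, -1)])
print(alpha_algebraic, alpha_differential)  # 1/3 1
```

Exact rational arithmetic is used so that values such as 1/3 are not blurred by floating-point rounding.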

Since the above equation is regularly perturbed, a series expansion for the inner solution in the variables 𝜏, 𝑦 can now be obtained in the usual way. Thus we consider a regular expansion of the form

$$y(\tau, \varepsilon) = y_0(\tau) + \varepsilon y_1(\tau) + \varepsilon^2 y_2(\tau) + \cdots. \tag{5.93}$$

Similar to before, we can substitute (5.93) into (5.92), expand all terms in powers of 𝜀, and collect coefficients. Restricting attention to only the leading-order term 𝑦₀(𝜏), we get

$$\frac{d^2 y_0}{d\tau^2} + \frac{dy_0}{d\tau} = 0, \qquad y_0(0) = 0, \qquad \tau \ge 0. \tag{5.94}$$

The general solution of the above differential equation, obtained by considering the associated characteristic polynomial, is 𝑦₀(𝜏) = 𝐶₁ + 𝐶₂𝑒^(−𝜏), where 𝐶₁ and 𝐶₂ are arbitrary constants. The boundary condition 𝑦₀(0) = 0 then implies 𝐶₁ + 𝐶₂ = 0, or equivalently 𝐶₂ = −𝐶₁. Thus the leading-order approximation of the inner solution is

$$y^{\mathrm{in}}(\tau, \varepsilon) = y_0(\tau) = C_1 (1 - e^{-\tau}). \tag{5.95}$$

The remaining constant 𝐶₁ will be determined in the matching step outlined below. We can express the inner solution in terms of the original variables 𝑥, 𝑦 using the change of variable in (5.89), where 𝑞_in = 0 and 𝛼 = 1, which gives

$$\left[ y^{\mathrm{in}}(\tau, \varepsilon) \right]_{\tau = (x - q_{\mathrm{in}})/\varepsilon^{\alpha}} = C_1 (1 - e^{-x/\varepsilon}). \tag{5.96}$$

For later reference, let 𝐻^in denote the value of the inner solution at the outer endpoint 𝑥 = 𝑞_out in the limit 𝜀 → 0⁺, which in view of the relation 𝜏 = (𝑥 − 𝑞_in)/𝜀^𝛼, corresponds to

$$H^{\mathrm{in}} = \lim_{\varepsilon \to 0^+} \left[ y^{\mathrm{in}}(\tau, \varepsilon) \right]_{\tau = (q_{\mathrm{out}} - q_{\mathrm{in}})/\varepsilon^{\alpha}}. \tag{5.97}$$

Since 𝜏 → ∞ when 𝑞_out > 𝑞_in, we note that 𝐻^in corresponds to a horizontal asymptote, as illustrated in Figure 5.12. In the current example, we obtain 𝐻^in = lim_{𝜏→∞} 𝑦^in(𝜏, 𝜀) = 𝐶₁. For problems in which the boundary layer is on the right, the definition of 𝐻^in in (5.97) would be the same, but the figure would be flipped: the inner region and 𝑞_in would be on the right. In this case, 𝐻^in would be a horizontal asymptote in the direction 𝜏 → −∞ since 𝑞_out < 𝑞_in.

[Figure 5.12. Graph of the inner approximation 𝑦^in(𝑥, 𝜀) for 𝑞_in ≤ 𝑥 ≤ 𝑞_out, starting at 0 and leveling off at the asymptote 𝐻^in.]

Matching. To match the inner and outer approximations we set

$$H^{\mathrm{in}} = I^{\mathrm{out}}. \tag{5.98}$$

The overall approximation procedure will be successful provided that this condition can be satisfied, with both sides of the equation finite, and provided that all arbitrary constants in the approximations are determined. In the current example, the matching condition becomes 𝐶₁ = 𝑒, and hence the procedure will be successful.

Composite approximation. The approximation procedure is completed by combining the inner and outer approximations as illustrated in Figure 5.13. The outer approximation satisfies the outer boundary conditions, and takes the value 𝐼^out at the opposite end of the domain. Similarly, the inner approximation satisfies the inner boundary conditions, and tends to the value 𝐻^in at the opposite end of the domain. By adding the two graphs, we obtain an approximation over the entire domain, but with a common offset in the boundary conditions. Thus a combined or composite approximation is obtained by removing the offset.

[Figure 5.13. Three panels on 𝑞_in ≤ 𝑥 ≤ 𝑞_out: the graphs of 𝑦^in and 𝑦^out, their sum 𝑦^in + 𝑦^out, and the composite 𝑦^in + 𝑦^out − 𝐼^out.]

The result of the overall procedure is called the leading-order composite approximation and is given by

$$y^{\mathrm{comp}}(x, \varepsilon) = y^{\mathrm{out}}(x, \varepsilon) + \left[ y^{\mathrm{in}}(\tau, \varepsilon) \right]_{\tau = (x - q_{\mathrm{in}})/\varepsilon^{\alpha}} - I^{\mathrm{out}}. \tag{5.99}$$

In the current example, we obtain

$$y^{\mathrm{comp}}(x, \varepsilon) = e^{1-x} - e^{1-(x/\varepsilon)}, \qquad 0 \le x \le 1. \tag{5.100}$$

The above expression is the first or leading-order term in an expansion of the exact solution 𝑦(𝑥, 𝜀) of the original problem in (5.81). Procedures exist for determining higher-order terms in the expansion, but they are more involved and not addressed here. Although it is only leading-order, the approximation in (5.100) captures the essential character of the exact solution. Specifically, for small 𝜀 > 0, the approximation has large slope in a thin boundary layer region adjacent to 𝑥 = 0, and in the limit 𝜀 → 0⁺, the slope becomes infinite leading to a jump discontinuity at 𝑥 = 0.

Note. The scale or stretching transformation in (5.89) is only considered in the boundary layer region to counteract the infinite derivative that develops in the solution as 𝜀 → 0⁺. If considered over the entire interval, this transformation would excessively stretch the solution in the outer region; it would stretch this part of the solution into a flat curve as 𝜀 → 0⁺. For this reason, the problem is split into parts, which are treated differently, and then the parts are combined as described above.

Discussion. Here we explain how the function 𝑦^comp(𝑥, 𝜀) defined by (5.99) approximates the solution 𝑦(𝑥, 𝜀) of the original system in (5.81). To begin, note that the solution of the original problem must also be a solution to the outer problem in (5.83) and the inner problem in (5.84). For each of these two problems, a series expansion of the solution was found, using a change of variable in the inner case, and the leading-order term in each solution was denoted as 𝑦^out(𝑥, 𝜀) and 𝑦^in(𝜏, 𝜀). Each of these functions is thus a leading-order approximation of 𝑦(𝑥, 𝜀), but each is only relevant and satisfies a given boundary condition in its respective region.

To understand the composite approximation we need a more explicit description of the outer and inner regions. For this purpose, we introduce a matching point 𝑥_m, which can be interpreted as the point where the two regions meet. (More appropriately, we could suppose that the two regions overlap, and 𝑥_m would be some point in the overlap.)
To define this point, we recall the change of variable used in the inner region, 𝑥 = 𝑞_in + 𝜀^𝛼 𝜏, where 𝛼 > 0 is the exponent defined in the procedure. Based on this expression we define a matching point by 𝑥_m = 𝑞_in + 𝐶𝜀^𝛽, where 𝐶 > 0 and 0 < 𝛽 < 𝛼 are given constants. By design, the matching point depends on 𝜀, and will not remain at a fixed location on the 𝑥-axis as 𝜀 → 0⁺, and will also not remain at a fixed location on the 𝜏-axis when the variable is changed. Using the matching point, we denote the outer region by 𝑥 ∈ (𝑥_m, 𝑞_out], and the inner region by 𝑥 ∈ [𝑞_in, 𝑥_m]. Equivalently, by the change of variable, the inner region corresponds to 𝜏 ∈ [0, 𝜏_m], where 𝜏_m = 𝐶𝜀^(𝛽−𝛼). The form of the point ensures that 𝑥_m → 𝑞_in and 𝜏_m → ∞ as 𝜀 → 0⁺. Thus the matching point tends to the left-most possible edge of the outer region in the sense that 𝑥_m → 𝑞_in, and simultaneously, it tends to the right-most possible edge of the stretched inner region in the sense that 𝜏_m → ∞.

The motivation for the matching rule used in the procedure is now evident. Since 𝑥_m → 𝑞_in, the solution of the outer problem at the matching point tends to the value 𝐼^out as 𝜀 → 0⁺. Similarly, since 𝜏_m → ∞, the solution of the inner problem at the matching point tends to the value 𝐻^in as 𝜀 → 0⁺. Since the outer and inner solutions must agree at the matching point as 𝜀 → 0⁺, we obtain the rule 𝐼^out = 𝐻^in.

The manner in which 𝑦^comp(𝑥, 𝜀) provides a leading-order approximation can now be described. For any point in the outer region 𝑥 ∈ (𝑥_m, 𝑞_out] we have

$$y^{\mathrm{comp}}(x, \varepsilon) = y^{\mathrm{out}}(x, \varepsilon) + R(x, \varepsilon), \tag{5.101}$$

where

$$R(x, \varepsilon) = \left[ y^{\mathrm{in}}(\tau, \varepsilon) \right]_{\tau = (x - q_{\mathrm{in}})/\varepsilon^{\alpha}} - I^{\mathrm{out}}. \tag{5.102}$$

Since 𝑥 > 𝑥_m we have 𝜏 > 𝜏_m, and we deduce that lim_{𝜀→0⁺} 𝑅(𝑥, 𝜀) = 0, which follows from the definition of 𝐻^in and the matching rule. Thus, in the outer region, the composite approximation is equal to the leading-order solution 𝑦^out(𝑥, 𝜀) of the outer problem, plus a remainder term 𝑅(𝑥, 𝜀) which becomes vanishingly small as 𝜀 becomes vanishingly small. Moreover, because the outer solution satisfies the relevant boundary conditions at 𝑞_out, so will the composite approximation, up to the remainder term.

Similarly, for any point in the inner region 𝑥 ∈ [𝑞_in, 𝑥_m], we have

$$y^{\mathrm{comp}}(x, \varepsilon) = \left[ y^{\mathrm{in}}(\tau, \varepsilon) \right]_{\tau = (x - q_{\mathrm{in}})/\varepsilon^{\alpha}} + \hat{R}(x, \varepsilon), \tag{5.103}$$

where

$$\hat{R}(x, \varepsilon) = y^{\mathrm{out}}(x, \varepsilon) - I^{\mathrm{out}}. \tag{5.104}$$

Since 𝑞_in ≤ 𝑥 ≤ 𝑥_m and 𝑥_m → 𝑞_in, we deduce that lim_{𝜀→0⁺} 𝑅̂(𝑥, 𝜀) = 0, which follows from the definition of 𝐼^out. Thus, in the inner region, the composite approximation is equal to the leading-order solution 𝑦^in(𝜏, 𝜀) of the inner problem, plus a remainder term 𝑅̂(𝑥, 𝜀) which becomes vanishingly small as 𝜀 becomes vanishingly small. Moreover, because the inner solution satisfies the relevant boundary conditions at 𝑞_in, so will the composite approximation, up to the remainder term.

By design, the function 𝑦^comp(𝑥, 𝜀) is defined over the entire interval 𝑥 ∈ [𝑞_in, 𝑞_out], and in each of the subintervals [𝑞_in, 𝑥_m] and (𝑥_m, 𝑞_out], it is equal to the relevant leading-order solution of the differential equation, and satisfies the relevant boundary condition, up to a remainder term that vanishes as 𝜀 → 0⁺.
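For the example (5.81), the quality of the composite approximation can be checked directly, because the equation is linear with constant coefficients. The sketch below is not taken from the text: it assumes the closed-form solution obtained from the characteristic polynomial 𝜀𝑟² + (1 + 𝜀)𝑟 + 1 = (𝜀𝑟 + 1)(𝑟 + 1), and compares it on a grid with the outer approximation (5.87) and the composite approximation (5.100).

```python
import math


def y_exact(x, eps):
    """Solution of (5.81) built from the characteristic roots -1 and -1/eps."""
    return (math.exp(-x) - math.exp(-x / eps)) / (math.exp(-1) - math.exp(-1 / eps))


def y_out(x):
    """Leading-order outer approximation (5.87)."""
    return math.exp(1 - x)


def y_comp(x, eps):
    """Leading-order composite approximation (5.100)."""
    return math.exp(1 - x) - math.exp(1 - x / eps)


def max_error(approx, eps, n=2000):
    """Largest deviation of approx(x) from the exact solution on a grid of [0, 1]."""
    grid = [i / n for i in range(n + 1)]
    return max(abs(approx(x) - y_exact(x, eps)) for x in grid)


for eps in (0.1, 0.01):
    err_out = max_error(y_out, eps)
    err_comp = max_error(lambda x: y_comp(x, eps), eps)
    print(eps, err_out, err_comp)
```

The outer approximation alone misses the boundary condition at 𝑥 = 0 by the amount 𝐼^out = 𝑒, whereas the composite error is small over the whole interval. In this particular example the composite happens to be far more accurate than a leading-order estimate would suggest, which need not hold in general.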

5.11. Case study

Setup. Singularly perturbed differential equations arise in a number of applications. Here we consider an example that arises in the modeling of an interface. As illustrated in Figure 5.14, we study a planar model for the interface between a liquid and gas in equilibrium. We suppose that a liquid of mass density 𝜌 occupies a portion of a rectangular container with an open top, which is exposed to a gas at a fixed, constant pressure 𝑝₀, with gravitational acceleration 𝑔 oriented vertically downward. We consider a coordinate system as shown, where the lateral position of the origin is midway between the left and right walls, and the vertical position is as described below. We suppose that the container is of size 2𝐿 > 0 along the 𝑥-direction, and of size 𝑤 > 0 along the 𝑧-direction into the page. In a planar slice of the container as illustrated, the interface appears as a curve 𝑦(𝑥). Physically, the interface can be identified with the top layer of the liquid, which acts like an elastic skin, and which is held in a stretched state by molecular forces. The net effect of these forces is quantified by the surface tension 𝜎. Common experience tells us that the interface will be almost flat, except near the walls of the container, where the liquid surface will rise up and form a wetting angle 𝛾 ≥ 0, or dip down and form an angle 𝛾 ≤ 0; this portion of the interface is called the meniscus. We seek to describe how the interface curve 𝑦(𝑥), and specifically the shape and height of the meniscus, depend on the given parameters 𝜌, 𝑝₀, 𝑔, 𝐿, 𝜎, 𝛾. Note that 𝜌 is a mass per unit volume, 𝑝₀ is a force per unit area, and 𝜎 is a force per unit length.

[Figure 5.14. Planar slice of the container between the walls 𝑥 = −𝐿 and 𝑥 = 𝐿, showing the gas, the liquid, the interface curve 𝑦(𝑥), the wetting angle 𝛾 and gravity 𝑔, together with a zoom of the interface (edge and top views) in which the surface tension 𝜎 acts along the length 𝑤 in the 𝑧-direction.]

Outline of model. We assume that, in any given equilibrium state, the pressures in the gas and liquid are known functions of the vertical coordinate 𝑦. For the gas, we assume that the pressure is independent of 𝑦, so 𝑝_gas(𝑦) ≡ 𝑝₀. For the liquid, due to its weight, we assume that the pressure depends linearly on 𝑦, namely 𝑝_liq(𝑦) = 𝐶 − 𝜌𝑔𝑦, where 𝐶 is an arbitrary constant, as required by the laws of hydrostatics. The vertical position of the origin is implicitly chosen so that 𝐶 = 𝑝₀, which will be convenient for the derivation of the model. With this choice of origin the interface height 𝑦|_{𝑥=0} will not be known explicitly, but will instead be determined as part of the solution. Once 𝑦|_{𝑥=0} is known, then so too will be the origin; for instance, if 𝑦|_{𝑥=0} is positive, then the origin is below the interface by this amount. Note that when 𝜎/(𝜌𝑔𝐿²) ≪ 1, which is the case of interest here, the value 𝑦|_{𝑥=0} will be extremely small, and the origin will be extremely close to, and for all practical purposes on, the interface.

We next consider a small piece 𝛤 of the interface curve at an arbitrary point (𝑥, 𝑦(𝑥)) as illustrated in Figure 5.15. Provided that the piece is sufficiently small, and that the curve is twice continuously differentiable, the piece can be described to arbitrary accuracy as an arc of a circle of radius 𝑟(𝑥) and central angle 𝜙, which is aligned with the unit tangent and normal vectors 𝑇⃗(𝑥) and 𝑁⃗(𝑥) as shown. From calculus, the curvature of this circular arc is given by

$$\kappa(x) = \frac{1}{r(x)} = \frac{y''(x)}{[1 + (y'(x))^2]^{3/2}}. \tag{5.105}$$

Note that the size or length of 𝛤 is determined by 𝜙, and that 𝛤 will shrink to the central point (𝑥, 𝑦(𝑥)) in the limit 𝜙 → 0⁺. In equilibrium, the forces acting on 𝛤 must be balanced. Forces arise due to the surface tension within the interface, and the pressures of the liquid and gas on the opposite sides of the interface. Since our goal is to consider the limit 𝜙 → 0⁺, it will suffice to consider the pressures evaluated at the point (𝑥, 𝑦(𝑥)); variations in the pressures over 𝛤 will not contribute in the limit. To express the forces, we recall that the interface curve represents a slice of a surface, and thus a piece 𝛤 of the curve represents a patch of the surface. The patch has length 𝑤 in the 𝑧-direction, along which the surface tension is distributed, and has an area 𝑤𝑟(𝑥)𝜙, over which the pressures are distributed.

[Figure 5.15. Small piece 𝛤 of the interface at (𝑥, 𝑦(𝑥)), drawn as a circular arc of radius 𝑟(𝑥) and central angle 𝜙, with unit tangent 𝑇⃗(𝑥) and unit normal 𝑁⃗(𝑥), the surface tension 𝜎 acting at each end at angle 𝜙/2, and the pressures 𝑝_gas and 𝑝_liq on opposite sides.]

The forces due to the surface tension on the right and left sides of 𝛤 are

$$\begin{aligned}
\vec{F}_{\sigma,\mathrm{right}} &= \sigma w \cos(\tfrac{\phi}{2})\,\vec{T}(x) + \sigma w \sin(\tfrac{\phi}{2})\,\vec{N}(x), \\
\vec{F}_{\sigma,\mathrm{left}} &= -\sigma w \cos(\tfrac{\phi}{2})\,\vec{T}(x) + \sigma w \sin(\tfrac{\phi}{2})\,\vec{N}(x).
\end{aligned} \tag{5.106}$$

Similarly, the forces due to the liquid and gas pressures on the opposite sides of 𝛤 are

$$\begin{aligned}
\vec{F}_{\mathrm{liq}} &= p_{\mathrm{liq}}(y(x))\, w\, r(x)\, \phi\, \vec{N}(x), \\
\vec{F}_{\mathrm{gas}} &= -p_{\mathrm{gas}}(y(x))\, w\, r(x)\, \phi\, \vec{N}(x).
\end{aligned} \tag{5.107}$$

Balance of forces requires 𝐹⃗_liq + 𝐹⃗_gas + 𝐹⃗_{𝜎,right} + 𝐹⃗_{𝜎,left} = 0. Note that the components in the tangential direction 𝑇⃗(𝑥) are automatically balanced, which is due to symmetry and the fact that the surface tension is constant. For the components in the normal direction 𝑁⃗(𝑥), after using the explicit expressions for the pressures and dividing by 𝑤𝑟(𝑥)𝜙, and noting that 𝜅(𝑥) = 1/𝑟(𝑥), we obtain the relation

$$-\rho g y(x) + \frac{2 \sigma \kappa(x) \sin(\frac{\phi}{2})}{\phi} = 0. \tag{5.108}$$

In the limit 𝜙 → 0⁺, in which 𝛤 shrinks to a point, we note that sin(𝜙/2)/𝜙 → 1/2, and thus we obtain 𝜎𝜅(𝑥) = 𝜌𝑔𝑦(𝑥). Substituting for the curvature from (5.105), we obtain the system

$$\begin{gathered}
\frac{\sigma}{\rho g} \frac{d^2 y}{dx^2} = y \left[ 1 + \left( \frac{dy}{dx} \right)^2 \right]^{3/2}, \qquad 0 \le x \le L, \\
\frac{dy}{dx}\Big|_{x=0} = 0, \qquad \frac{dy}{dx}\Big|_{x=L} = \tan\gamma.
\end{gathered} \tag{5.109}$$

The nonlinear differential equation above is called the Young–Laplace equation. It represents a local balance of forces that must hold at each point along the interface curve 𝑦(𝑥). Here we consider only the above planar version of the equation; there are various generalizations. Note that, by symmetry, we need only consider half of the interface corresponding to the interval 0 ≤ 𝑥 ≤ 𝐿. The boundary conditions correspond to a zero slope at 𝑥 = 0, which is required by symmetry and the fact that the point of the interface midway between the walls is a local extremum, and a prescribed slope at 𝑥 = 𝐿, which is dictated by the wetting angle 𝛾. We seek an expression for the curve 𝑦(𝑥) and the meniscus endpoint height 𝑦_# = 𝑦|_{𝑥=𝐿}, and assume 𝜎 > 0, 𝜌 > 0, 𝑔 > 0, 𝐿 > 0 and 𝛾 ∈ (0, 𝜋/2) are given constants.

Analysis of model. It is convenient to introduce dimensionless variables defined by the scale transformation ℎ = 𝑦/𝐿 and 𝑠 = 𝑥/𝐿. In terms of these variables, the system takes the form

$$\begin{gathered}
\varepsilon \frac{d^2 h}{ds^2} = h \left[ 1 + \left( \frac{dh}{ds} \right)^2 \right]^{3/2}, \qquad 0 \le s \le 1, \\
\frac{dh}{ds}\Big|_{s=0} = 0, \qquad \frac{dh}{ds}\Big|_{s=1} = \tan\gamma, \qquad \varepsilon = \frac{\sigma}{\rho g L^2}.
\end{gathered} \tag{5.110}$$

Singularly perturbed features. Since 𝜎 is a microscopic quantity, whereas 𝜌𝑔𝐿² is macroscopic, we naturally have 0 < 𝜀 ≪ 1. We note that the parameter 𝜀 is the coefficient of the highest-order term, and in the extreme case when 𝜀 = 0, the only solution of the differential equation is ℎ(𝑠) ≡ 0, which satisfies the boundary condition at 𝑠 = 0, but not at 𝑠 = 1. Thus any solution for 𝜀 > 0 cannot be continued to 𝜀 = 0 and the system is singularly perturbed. Intuitively, we expect that the system has a boundary layer in the meniscus region at 𝑠 = 1. However, in contrast to the example outlined in the previous section, the singular behavior in the boundary layer does not appear in the first derivative or slope 𝑑ℎ/𝑑𝑠, but rather the second derivative 𝑑²ℎ/𝑑𝑠². Specifically, 𝑑ℎ/𝑑𝑠|_{𝑠=1} remains bounded and equal to tan 𝛾 in the limit 𝜀 → 0⁺, whereas 𝑑²ℎ/𝑑𝑠²|_{𝑠=1} grows unbounded. Thus the boundary layer or inner region is on the right, with endpoint 𝑞_in = 1, and the outer region is on the left, with endpoint 𝑞_out = 0. In dimensionless variables, the meniscus endpoint (𝑥, 𝑦) = (𝐿, 𝑦_#) becomes (𝑠, ℎ) = (1, ℎ_#), where ℎ_# = 𝑦_#/𝐿.

Outer equations. The equations for the interface curve in the outer region 𝑠 ≥ 𝑞_out are the differential equation 𝜀 𝑑²ℎ/𝑑𝑠² − ℎ[1 + (𝑑ℎ/𝑑𝑠)²]^(3/2) = 0, together with the boundary condition 𝑑ℎ/𝑑𝑠|_{𝑠=𝑞_out} = 0. In this region, no special treatment is required, and we consider a regular series expansion for ℎ(𝑠, 𝜀). Upon substituting the expansion into the differential equation and boundary condition, and collecting the coefficients of 𝜀⁰, we get

$$-h_0 \left[ 1 + \left( \frac{dh_0}{ds} \right)^2 \right]^{3/2} = 0, \qquad \frac{dh_0}{ds}\Big|_{s=q_{\mathrm{out}}} = 0, \qquad s \ge q_{\mathrm{out}}. \tag{5.111}$$

By inspection, since the sum within the brackets is positive, we note that the only solution of the differential equation is ℎ₀(𝑠) ≡ 0. Moreover, although there are no arbitrary constants in this solution, we note that it satisfies the boundary condition. Thus the leading-order outer solution, and the associated intercept used for matching, are

$$h^{\mathrm{out}}(s, \varepsilon) = h_0(s) \equiv 0, \qquad I^{\mathrm{out}} = \lim_{\varepsilon \to 0^+} h^{\mathrm{out}}(q_{\mathrm{in}}, \varepsilon) = 0. \tag{5.112}$$

For future reference, the matching condition will be 𝐻^in = 𝐼^out = 0, and the leading-order composite solution will be ℎ^comp = ℎ^out + ℎ^in − 𝐼^out = ℎ^in. Thus the leading-order inner solution will itself be the composite solution, provided that the matching condition can be satisfied.


Inner equations. The equations for the interface curve in the inner region 𝑠 ≤ 𝑞_in are the differential equation 𝜀 𝑑²ℎ/𝑑𝑠² = ℎ[1 + (𝑑ℎ/𝑑𝑠)²]^(3/2), together with the boundary condition 𝑑ℎ/𝑑𝑠|_{𝑠=𝑞_in} = tan 𝛾. In this region we introduce a scale or stretching transformation that will counteract the infinite (second) derivative that develops in the solution; equivalently, the transformation will make the inner equations regularly perturbed. We consider the change of variable 𝜏 = 𝜀^(−𝛼)(𝑠 − 𝑞_in) and 𝑢 = 𝜀^(−𝛽)ℎ, where 𝛼 > 0 and 𝛽 > 0 are exponents to be determined. Substituting this change of variable into the inner equations, and using the relevant derivative relations, we find that 𝛼 = 1/2 and 𝛽 = 1/2 are the smallest exponents that will make the system regular. Remarkably, for this choice of exponents, all factors of 𝜀 cancel out, and the normalized inner equations become

$$\frac{d^2 u}{d\tau^2} = u \left[ 1 + \left( \frac{du}{d\tau} \right)^2 \right]^{3/2}, \qquad \frac{du}{d\tau}\Big|_{\tau=0} = \tan\gamma, \qquad \tau \le 0. \tag{5.113}$$

The above equations imply that, in the variables 𝜏, 𝑢, the inner solution is independent of 𝜀. Thus the expansion of the inner solution consists of only one term, and we have 𝑢^in(𝜏, 𝜀) = 𝑢^in(𝜏) = 𝑢₀(𝜏). For convenience, we use the notation 𝑢(𝜏) in place of 𝑢^in(𝜏). Since the outer solution is identically zero (in ℎ and also 𝑢), the matching condition for the inner solution is

$$0 = H^{\mathrm{in}} = \lim_{\varepsilon \to 0^+} \left[ u^{\mathrm{in}}(\tau, \varepsilon) \right]_{\tau = (q_{\mathrm{out}} - q_{\mathrm{in}})/\varepsilon^{\alpha}} = \lim_{\tau \to -\infty} u(\tau). \tag{5.114}$$
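The choice 𝛼 = 𝛽 = 1/2 is stated without intermediate steps. One way to verify it, sketched here as a worked calculation rather than as the author's own derivation, is to carry the general exponents through the change of variable and read off the powers of 𝜀:

```latex
% Change of variable: s = q_in + eps^alpha * tau,  h = eps^beta * u.
\frac{dh}{ds} = \varepsilon^{\beta-\alpha}\frac{du}{d\tau},
\qquad
\frac{d^2h}{ds^2} = \varepsilon^{\beta-2\alpha}\frac{d^2u}{d\tau^2}.

% Substitute into  eps h'' = h [1 + (h')^2]^{3/2}  and divide by eps^{1+beta-2alpha}:
\frac{d^2u}{d\tau^2}
  = \varepsilon^{2\alpha-1}\, u
    \left[ 1 + \varepsilon^{2(\beta-\alpha)}
      \left(\frac{du}{d\tau}\right)^{2} \right]^{3/2},
\qquad
\frac{du}{d\tau}\Big|_{\tau=0} = \varepsilon^{\alpha-\beta}\tan\gamma.

% All powers of eps are nonnegative when
2\alpha - 1 \ge 0, \qquad \beta - \alpha \ge 0, \qquad \alpha - \beta \ge 0,
% which forces alpha = beta >= 1/2; the smallest choice is alpha = beta = 1/2.
```

With 𝛼 = 𝛽 = 1/2 every exponent above is zero, which is consistent with the observation that all factors of 𝜀 cancel in (5.113).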

The general solution of the differential equation in (5.113) subject to the matching condition (5.114) can be found by direct integration. To carry out this integration we make some further assumptions consistent with the case when tan 𝛾 > 0: we assume that $u > 0$ and $\frac{du}{d\tau} > 0$ in the inner region, and also $\lim_{\tau\to-\infty} \frac{du}{d\tau}(\tau) = 0$. In this case, we find that the solution is given by (see Exercise 23)

(5.115) $$\tau = F(u) - B \quad\text{where}\quad F(u) = \sqrt{4-u^2} + \ln\left(\frac{u}{2+\sqrt{4-u^2}}\right).$$

Here 𝐵 is an arbitrary constant. Note that the solution has the implicit form 𝜏 = 𝜏(𝑢), rather than 𝑢 = 𝑢(𝜏). The functions 𝜏 = 𝜏(𝑢) and 𝑢 = 𝑢(𝜏) are both well defined, and have graphs that are monotonic with positive slope, provided that $u \in (0,\sqrt{2})$. In the inner variables, the meniscus endpoint $(s,h) = (1,h_\#)$ corresponds to $(\tau,u) = (0,u_\#)$. At this point, the solution must satisfy $\frac{du}{d\tau}\big|_{\tau=0} = \tan\gamma$, or equivalently $\frac{d\tau}{du}\big|_{u=u_\#} = \frac{1}{\tan\gamma}$. In view of (5.115), we obtain the condition

(5.116) $$F'(u_\#) = \frac{1}{\tan\gamma}.$$

The above algebraic equation has a unique root $u_\# \in (0,\sqrt{2})$ for any given $\gamma \in (0,\frac{\pi}{2})$, which can be found explicitly. Moreover, since $u = u_\#$ when 𝜏 = 0, the arbitrary constant in (5.115) must have the value $B = F(u_\#)$. Thus a unique solution to the inner equations is obtained, which satisfies the matching condition required by the outer solution, along with the given boundary condition at the meniscus endpoint.

Composite solution. As noted earlier, the leading-order composite solution for the interface curve will be equal to the leading-order inner solution, which has the implicit form $\tau = F(u) - F(u_\#)$. By reversing the change of variables from 𝜏, 𝑢 to 𝑠, ℎ, and then from 𝑠, ℎ to 𝑥, 𝑦, we obtain a solution of the original problem. The meniscus endpoint height $y_\#$ can be obtained directly from the change of variable relations $y_\# = L h_\#$ and $h_\# = \varepsilon^{1/2} u_\#$. Thus the meniscus height itself is determined by the root of the algebraic equation in (5.116). This problem is further explored in the Exercises.

Note. Solving the inner system in (5.113) appears to be just as difficult as solving the original system in (5.110). However, there are important conceptual differences that make the inner system more tractable than the original. For the original system, the limit 𝜀 → 0+ fundamentally changes the form of the differential equation, while the interval 𝑠 ∈ [0, 1] and boundary conditions remain fixed. For the inner system, the differential equation has fixed coefficients and its form does not change; instead, the limit 𝜀 → 0+ gives rise to matching conditions as 𝜏 → −∞. The fact that conditions are imposed at infinity, rather than at a finite point, makes the inner system more readily solvable.
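As a rough numerical companion to condition (5.116), the following Python sketch locates the root $u_\#$ by bisection and converts it to a meniscus height through $y_\# = L\,\varepsilon^{1/2} u_\#$. It assumes the expression $F'(u) = (2-u^2)/(u\sqrt{4-u^2})$, obtained here by differentiating (5.115), and uses the fact that this expression decreases from $+\infty$ to $0$ on $(0,\sqrt{2})$. The function names are illustrative and do not come from the text.

```python
import math

def F_prime(u):
    """Assumed derivative of F(u) = sqrt(4 - u^2) + ln(u / (2 + sqrt(4 - u^2)))."""
    return (2.0 - u * u) / (u * math.sqrt(4.0 - u * u))

def meniscus_root(gamma, tol=1e-12):
    """Bisection for the root u# in (0, sqrt(2)) of F'(u#) = 1 / tan(gamma)."""
    target = 1.0 / math.tan(gamma)
    lo, hi = 1e-12, math.sqrt(2.0)  # F' is very large at lo and zero at hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if F_prime(mid) > target:
            lo = mid  # F' is decreasing, so the root lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)

def meniscus_height(gamma, L, eps):
    """Leading-order height y# = L * eps^(1/2) * u#."""
    return L * math.sqrt(eps) * meniscus_root(gamma)

if __name__ == "__main__":
    for gamma in (math.pi / 6, math.pi / 4, math.pi / 3):
        print(gamma, meniscus_root(gamma))
```

Bisection is used only because it needs nothing beyond the monotonicity of $F'$; any bracketing root finder would serve equally well.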

Reference notes

Perturbation methods have been used to study various types of equations in a wide range of applications, from celestial mechanics to quantum mechanics, across all branches of math and science. Here we touched only the elementary theory for algebraic and ordinary differential equations. The main results outlined here on the existence and convergence of perturbation series were based on fundamental results about analytic functions. For background on such functions and other results from complex analysis see Churchill and Brown (2014) and Gunning and Rossi (2009).

Perturbed algebraic equations as considered here fall within the general theory of implicit equations and algebraic curves; see Casas-Alvero (2000), Krantz and Parks (2002), and Wall (2004). An important application in this setting is the study of eigenvalues and eigenvectors of matrices and more general operators; see Kato (1982) and Rellich (1969).

Perturbed ordinary differential equations as considered here fall within the general theory of differential equations with parameters. Such equations are discussed in the classic text by Coddington and Levinson (1955), and also Hille (1976). Perturbation methods for a wealth of examples are considered in Bender and Orszag (1999).

The Poincaré–Lindstedt method is one example from a collection that is designed for equations with periodic structure at one or more scales. Other methods in this collection include those based on averaging and homogenization; see for example the texts by Holmes (2013) and Sanders, Verhulst, and Murdock (2007).

Boundary layers and related phenomena such as interior layers and turning points arise in various applications involving ordinary and partial differential equations; see Gie et al. (2018), Holmes (2013), and Neu (2015). Expansion methods for boundary layer problems, employing higher-order matching rules than considered herein, are discussed in Kevorkian and Cole (1981) and Bender and Orszag (1999).


Specialized methods for linear equations, which exploit representations of solutions in terms of integrals or special functions, are also important; examples of such methods, including the WKB method among others, can be found in Bender and Orszag (1999) and Holmes (2013).

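Exercises

1. Verify the order statement for each function $f(\varepsilon)$ in the limit $\varepsilon \to 0^+$. Each $f(\varepsilon)$ is as described or given, and continuous for $\varepsilon \in (0,1]$.
(a) If $f(\varepsilon)$ is bounded, then $f(\varepsilon) = O(1)$.
(b) If $f(\varepsilon) \to 0$ as $\varepsilon \to 0^+$, then $f(\varepsilon) = o(1)$.
(c) If $f(\varepsilon) = \sqrt{\varepsilon(1-\varepsilon)}$, then $f(\varepsilon) = O(\varepsilon^{1/2})$.
(d) If $f(\varepsilon) = \varepsilon\sin(\varepsilon^2)$, then $f(\varepsilon) = O(\varepsilon^3)$.
(e) If $f(\varepsilon) = \int_0^{\varepsilon} e^{-s^2}\,ds$, then $f(\varepsilon) = O(\varepsilon)$.
(f) If $f(\varepsilon) = e^{-1/\varepsilon}$, then $f(\varepsilon) = o(\varepsilon^r)$ for any $r \ge 0$.

2. Recall that, if $h(\varepsilon)$ has a series expansion in whole powers of $\varepsilon$, with a positive radius of convergence about $\varepsilon = 0$, then the expansion can be found using Taylor's formula $h(\varepsilon) = h(0) + \varepsilon h'(0) + \frac{1}{2!}\varepsilon^2 h''(0) + \cdots$.
(a) Let $f(\varepsilon) = \ln(1+\varepsilon)$. Use Taylor's formula to find the first three nonzero terms in the series expansion in powers of $\varepsilon$.
(b) Let $g(\varepsilon) = \ln(1+\sqrt{\varepsilon})$. Explain why this function cannot have a series expansion in powers of $\varepsilon$. [Are derivatives defined at $\varepsilon = 0$?]
(c) Using (a), find the first three nonzero terms in a series expansion for $g(\varepsilon)$ in powers of $\varepsilon^{1/2}$.

3. If $x(\varepsilon)$ is analytic at $\varepsilon = 0$, with expansion $x(\varepsilon) = x_0 + \varepsilon x_1 + \varepsilon^2 x_2 + \cdots$, then each function $f(\varepsilon)$ below is also analytic at $\varepsilon = 0$, with expansion $f(\varepsilon) = f_0 + \varepsilon f_1 + \varepsilon^2 f_2 + \cdots$. Find $f_0$, $f_1$, $f_2$ in terms of $x_n$, $n \ge 0$.
(a) $f(\varepsilon) = \varepsilon^2 x^3(\varepsilon)$.
(b) $f(\varepsilon) = \dfrac{1}{1+\varepsilon x(\varepsilon)}$.
(c) $f(\varepsilon) = (1+\varepsilon x(\varepsilon))^{3/2}$.
(d) $f(\varepsilon) = \sin(\varepsilon x(\varepsilon))$.
(e) $f(\varepsilon) = e^{-x(\varepsilon)}$.
(f) $f(\varepsilon) = x^2(\varepsilon)\,x'(\varepsilon)$.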

4. For each regularly perturbed equation, find a two-term perturbation approximation of each real or complex root, where $0 \le \varepsilon \ll 1$.
(a) $x^3 + 2x + \varepsilon = 0$.
(b) $x^4 - x^2 + \varepsilon = 0$.
(c) $x^4 + 2\varepsilon x^2 - \varepsilon^2 x - 4 = 0$.
(d) $x^3 - 4\varepsilon x^2 - x = 0$.
(e) $x^4 - 2x^3 - 2\varepsilon = 0$.
(f) $(x+1)^3 = \varepsilon x + 2\varepsilon^2$.
(g) $x^4 = 2x^2 + \varepsilon x$.
(h) $x^3 = 5x^2 - 6x - \varepsilon$.

5. Find a two-term perturbation approximation of each real or complex root, where $0 \le \varepsilon \ll 1$. [Some or all roots may be degenerate.]
(a) $x^3 + \varepsilon x + \varepsilon^2 = 0$.
(b) $x^3 + \varepsilon^2 x + \varepsilon^3 = 0$.
(c) $x^3 + \varepsilon^2 x + \varepsilon = 0$.
(d) $x^4 + \varepsilon x + \varepsilon = 0$.
(e) $x^4 + \varepsilon x + \varepsilon^2 = 0$.
(f) $x^4 + \varepsilon^2 x + \varepsilon^3 = 0$.

6. To estimate the roots of $x^3 - 4.01x + 0.02 = 0$, we may consider $x^3 - (4+\varepsilon)x + 2\varepsilon = 0$. Find a perturbation approximation of each real or complex root up to order $O(\varepsilon^2)$, and numerically estimate the roots of the original equation.

7. To estimate the real roots of $4 - x^2 = e^{0.1x}$, we may consider $4 - x^2 = e^{\varepsilon x}$. Find a perturbation approximation of each real root up to order $O(\varepsilon^2)$, and numerically estimate the roots of the original equation.

8. To estimate the real root of $\sin(x + \frac{\pi}{30}) = 4x$, we may consider $\sin(x+\varepsilon) = 4x$. Find a perturbation approximation of the real root up to order $O(\varepsilon^2)$, and numerically estimate the root of the original equation. [Note that the only real solution of $\sin(x_0) = 4x_0$ is $x_0 = 0$.]

9. For each regularly perturbed equation, find a perturbation approximation of the solution up to order $O(\varepsilon)$, where $0 \le \varepsilon \ll 1$.
(a) $\dfrac{du}{dt} + u = \dfrac{1}{1+\varepsilon u}$, $\quad u|_{t=0} = 0$, $\quad t \ge 0$.
(b) $\dfrac{du}{dt} = \dfrac{6u}{2+\varepsilon u}$, $\quad u|_{t=0} = 4$, $\quad t \ge 0$.
(c) $\dfrac{du}{dt} + 2u + 6\varepsilon = (2+\varepsilon u)^3$, $\quad u|_{t=0} = 1$, $\quad t \ge 0$.
(d) $\dfrac{du}{dt} = u^2 - \varepsilon t$, $\quad u|_{t=0} = 1$, $\quad t \ge 0$.
(e) $\dfrac{du}{dt} = t e^{-u} + \varepsilon$, $\quad u|_{t=0} = 0$, $\quad t \ge 0$.

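(f) $\dfrac{du}{dt} = 1 + \varepsilon u^2 \sin t$, $\quad u|_{t=0} = 0$, $\quad t \ge 0$.
(g) $\dfrac{d^2u}{dt^2} = \varepsilon e^{-t}\dfrac{du}{dt}$, $\quad \dfrac{du}{dt}\Big|_{t=0} = 1$, $\quad u|_{t=0} = 0$, $\quad t \ge 0$.
(h) $\dfrac{d^2u}{dt^2} + \dfrac{du}{dt} = -\varepsilon u^2$, $\quad \dfrac{du}{dt}\Big|_{t=0} = 1$, $\quad u|_{t=0} = 0$, $\quad t \ge 0$.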

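10. For each regularly perturbed equation, find a two-term perturbation approximation of the solution, where $0 \le \varepsilon \ll 1$.
(a) $\dfrac{du}{dt} + 4u = \left(1 + \varepsilon\dfrac{du}{dt}\right)^{-1/2}$, $\quad u|_{t=0} = 1$, $\quad t \ge 0$.
(b) $\dfrac{du}{dt} + 4u = \varepsilon\left(\dfrac{du}{dt}\right)^2 u + 2\varepsilon$, $\quad u|_{t=0} = 1$, $\quad t \ge 0$.
(c) $\dfrac{du}{dt} = e^{-\varepsilon u} - u$, $\quad u|_{t=0} = 4$, $\quad t \ge 0$.
(d) $\dfrac{du}{dt} = 1 + \sin\left(\dfrac{\varepsilon}{u}\right)$, $\quad u|_{t=0} = 1$, $\quad t \ge 0$.
(e) $\dfrac{d^2u}{dt^2} + \varepsilon\left(\dfrac{du}{dt}\right)^2 = \sin t$, $\quad \dfrac{du}{dt}\Big|_{t=0} = 1$, $\quad u|_{t=0} = 0$, $\quad t \ge 0$.
(f) $\dfrac{d^2u}{dt^2} = \sqrt{1+\varepsilon u}$, $\quad \dfrac{du}{dt}\Big|_{t=0} = 1$, $\quad u|_{t=0} = 0$, $\quad t \ge 0$.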

11. A model for the vertical motion of a projectile that includes the effect of nonconstant gravity is shown below. Here $y$ is height, $t$ is time, $g$ is gravitational acceleration at the surface of the earth, $R$ is the radius of the earth, and $v_0$ is the launch velocity.

$$\frac{d^2y}{dt^2} = -\frac{g}{((y/R)+1)^2}, \qquad \frac{dy}{dt}\Big|_{t=0} = v_0, \qquad y|_{t=0} = 0, \qquad t \ge 0.$$

(a) Let $\tau = t/a$ and $u = y/b$. Find the scales $a$ and $b$ so that the scaled equations become $\frac{d^2u}{d\tau^2} = -\frac{1}{(\varepsilon u+1)^2}$, $\frac{du}{d\tau}\big|_{\tau=0} = 1$, $u|_{\tau=0} = 0$. Identify the parameter $\varepsilon$.
(b) Find a perturbation approximation of the solution $u(\tau,\varepsilon)$ up to order $O(\varepsilon)$ assuming $0 \le \varepsilon \ll 1$.
(c) Express the results in terms of $y$, $t$, $g$, $R$ and $v_0$. Under what condition on $g$, $R$ and $v_0$ would we have $0 \le \varepsilon \ll 1$?

12. A model for a projectile that includes both air resistance and nonconstant gravity is shown below, in dimensionless form, where $y$ is height, $t$ is time, and $0 \le \varepsilon \ll 1$ is a parameter.

$$\frac{d^2y}{dt^2} + \frac{dy}{dt} = -\frac{1}{(\varepsilon y+1)^2}, \qquad \frac{dy}{dt}\Big|_{t=0} = 1, \qquad y|_{t=0} = 0.$$

Find a perturbation approximation of the solution $y(t,\varepsilon)$ up to order $O(\varepsilon)$.


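13. In dimensionless form, a model for a thermo-chemical reaction is shown below, where $u$ and $q$ are the concentration and temperature of the reactant, $t$ is time, and $0 \le \varepsilon \ll 1$ is a parameter.

$$\frac{du}{dt} = 1 - u e^{\varepsilon(q-1)}, \qquad u|_{t=0} = 1,$$
$$\frac{dq}{dt} = -q + u e^{\varepsilon(q-1)}, \qquad q|_{t=0} = 2, \qquad t \ge 0.$$

Find a perturbation approximation of the solution $(u,q)(t,\varepsilon)$ up to order $O(\varepsilon)$.

14. Use the Poincaré–Lindstedt method to obtain a two-term perturbation approximation for each of the following, where $0 \le \varepsilon \ll 1$.
(a) $\dfrac{d^2x}{dt^2} + 9x = \varepsilon x^3 - \varepsilon x$, $\quad \dfrac{dx}{dt}\Big|_{t=0} = 0$, $\quad x|_{t=0} = 1$, $\quad t \ge 0$.
(b) $\dfrac{d^2x}{dt^2} + 4x = \varepsilon x\left(\dfrac{dx}{dt}\right)^2$, $\quad \dfrac{dx}{dt}\Big|_{t=0} = 0$, $\quad x|_{t=0} = 1$, $\quad t \ge 0$.
(c) $\dfrac{d^2x}{dt^2} + 4x = 8\varepsilon x^3$, $\quad \dfrac{dx}{dt}\Big|_{t=0} = 1$, $\quad x|_{t=0} = 0$, $\quad t \ge 0$.
(d) $\dfrac{d^2x}{dt^2} + 9x = \varepsilon x^2\left(\dfrac{d^2x}{dt^2}\right)$, $\quad \dfrac{dx}{dt}\Big|_{t=0} = 1$, $\quad x|_{t=0} = 0$, $\quad t \ge 0$.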

15. In dimensionless form, a model of a visco-elastic spring-mass system is shown below, where $k(\varepsilon,\sigma,\frac{dx}{dt}) = 1 + \varepsilon\sigma\left(\frac{dx}{dt}\right)^2$ is a velocity-dependent spring stiffness coefficient, $0 \le \varepsilon \ll 1$ is a parameter, and $\sigma = \pm 1$ is a sign factor that determines whether the spring gets stiffer or softer with velocity.

$$\frac{d^2x}{dt^2} + k\left(\varepsilon,\sigma,\frac{dx}{dt}\right)x = 0, \qquad \frac{dx}{dt}\Big|_{t=0} = 0, \qquad x|_{t=0} = 1, \qquad t \ge 0.$$

Use the Poincaré–Lindstedt method to obtain a two-term perturbation approximation of the solution $x(t,\varepsilon,\sigma)$.

16. For each singularly perturbed equation, find a two-term perturbation approximation of each regular and singular root, whether real or complex, where $0 < \varepsilon \ll 1$.
(a) $\varepsilon x^3 + x^2 - x + \varepsilon = 0$.
(b) $\varepsilon x^3 + x + 1 = 0$.
(c) $\varepsilon^2 x^3 + 4\varepsilon x^2 + 2x + 1 = 0$.
(d) $\varepsilon x^4 + 4x^2 + \varepsilon x + 9 = 0$.
(e) $\varepsilon x^5 - \varepsilon x^2 - x + 2 = 0$.
(f) $\varepsilon x^4 + \varepsilon x^2 - x + 3 = 0$.
(g) $\varepsilon^2 x^4 + \varepsilon x^2 + x + 1 = 0$.
(h) $\varepsilon^2 x^6 - \varepsilon x^4 - x^3 + 8 = 0$.

17. Experiments performed on a nonideal gas at fixed volume show that the pressure $p > 0$ is related to the absolute temperature $T > 0$ as

$$\frac{p}{p_0} = \lambda\left(\frac{T}{T_0}\right)^3 + \mu\left(\frac{T}{T_0}\right)^2 + \gamma\left(\frac{T}{T_0}\right),$$

where $p_0$ is a reference pressure, $T_0$ is a reference temperature, and $\lambda$, $\mu$ and $\gamma$ are dimensionless constants with values $0 < \lambda \ll \mu \le \gamma \le 1$. Here we develop an approximate inverse of this relation.
(a) Let $y = \frac{p}{p_0}$, $x = \frac{T}{T_0}$ and $\varepsilon = \lambda$, and consider the cubic equation $\varepsilon x^3 + \mu x^2 + \gamma x - y = 0$. Find a two-term perturbation approximation to each root $x$ for any given $y$ and $0 < \varepsilon \ll \mu \le \gamma \le 1$.
(b) The given relation $y = \varepsilon x^3 + \mu x^2 + \gamma x$ has the property that $y \in [0, \varepsilon+\mu+\gamma]$ for $x \in [0,1]$. Which root $x = x(y,\varepsilon,\mu,\gamma)$ in (a) has the property that $x \in [0,1]$ (approximately) for $y \in [0, \varepsilon+\mu+\gamma]$?

18. For each singularly perturbed equation, find a leading-order composite approximation of the solution assuming a boundary layer on the indicated side of the interval, where $0 < \varepsilon \ll 1$.
(a) $\varepsilon\dfrac{d^2y}{dx^2} + 4\dfrac{dy}{dx} + y = 3$, $\quad y|_{x=0} = 2$, $\quad y|_{x=1} = 1$, $\quad 0 \le x \le 1$, left.
(b) $\varepsilon\dfrac{d^2y}{dx^2} + 2\dfrac{dy}{dx} + e^y = 0$, $\quad y|_{x=0} = 0$, $\quad y|_{x=1} = 0$, $\quad 0 \le x \le 1$, left.
(c) $\varepsilon\dfrac{d^2y}{dx^2} - 2\dfrac{dy}{dx} - 4y = 0$, $\quad y|_{x=0} = 3$, $\quad y|_{x=2} = 6$, $\quad 0 \le x \le 2$, right.
(d) $\varepsilon\dfrac{d^2y}{dx^2} - x\dfrac{dy}{dx} = -4$, $\quad y|_{x=1} = 1$, $\quad y|_{x=3} = 3$, $\quad 1 \le x \le 3$, right.
(e) $\varepsilon\dfrac{d^2y}{dx^2} + \dfrac{dy}{dx} = -y^2$, $\quad y|_{x=0} = \frac{1}{4}$, $\quad y|_{x=1} = \frac{1}{2}$, $\quad 0 \le x \le 1$, left.

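19. For each singularly perturbed equation, find a leading-order composite approximation of the solution assuming a boundary layer at $t = 0$, where $0 < \varepsilon \ll 1$. For problems involving time, such a boundary layer is called an initial layer.
(a) $\varepsilon\dfrac{du}{dt} + u = e^{-t}$, $\quad u|_{t=0} = 2$, $\quad 0 \le t < \infty$.
(b) $\varepsilon\dfrac{d^2u}{dt^2} + \dfrac{du}{dt} + u^2 = 0$, $\quad \dfrac{du}{dt}\Big|_{t=0} = 0$, $\quad u|_{t=0} = 3$, $\quad 0 \le t < \infty$.
(c) $\varepsilon\dfrac{d^2u}{dt^2} + (t+1)^2\dfrac{du}{dt} = 1$, $\quad \varepsilon\dfrac{du}{dt}\Big|_{t=0} = 1$, $\quad u|_{t=0} = 1$, $\quad 0 \le t < \infty$.
(d) $\varepsilon\dfrac{d^2u}{dt^2} + 2\dfrac{du}{dt} = t u^2$, $\quad \varepsilon\dfrac{du}{dt}\Big|_{t=0} = 2$, $\quad u|_{t=0} = 5$, $\quad 0 \le t < \infty$.
(e) $\varepsilon\dfrac{d^2u}{dt^2} + 4\dfrac{du}{dt} - \varepsilon u = t$, $\quad \varepsilon\dfrac{du}{dt}\Big|_{t=0} = 6$, $\quad u|_{t=0} = 3$, $\quad 0 \le t < \infty$.

20. For each singularly perturbed system, find a leading-order composite approximation of the solution assuming an initial layer at $t = 0$, where $t \ge 0$ and $0 < \varepsilon \ll 1$.
(a) $\dfrac{dx}{dt} = y - 2x + \varepsilon xy$, $\quad \varepsilon\dfrac{dy}{dt} = x - y - \varepsilon xy$, $\quad x|_{t=0} = 2$, $\quad y|_{t=0} = 1$.
(b) $\dfrac{dx}{dt} = \varepsilon\sin x - y$, $\quad \varepsilon\dfrac{dy}{dt} = x - y - 2 + \varepsilon y^3$, $\quad x|_{t=0} = 1$, $\quad y|_{t=0} = 0$.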


21. In dimensionless form, a model for a two-step chemical reaction involving constituents $X$, $Y$, $U$, $V$ is

[Reaction scheme: X + Y ⇌ U → V]

$$\frac{dx}{dt} = -\frac{1}{\varepsilon}xy + u, \qquad \frac{dy}{dt} = -\frac{1}{\varepsilon}xy + u, \qquad \frac{du}{dt} = \frac{1}{\varepsilon}xy - u - \gamma u, \qquad \frac{dv}{dt} = \gamma u,$$
$$x|_{t=0} = x_\#, \qquad y|_{t=0} = y_\#, \qquad u|_{t=0} = 0, \qquad v|_{t=0} = 0.$$

Here $x$, $y$, $u$, $v$ are concentrations, $t$ is time, $\varepsilon$, $\gamma$ are positive constants, and we assume $x_\# \ge y_\# > 0$. The first two equations imply $\frac{dx}{dt} = \frac{dy}{dt}$, which gives $y(t) = x(t) - \lambda$, where $\lambda = x_\# - y_\#$. The last equation implies $v(t) = \int_0^t \gamma u(\hat t)\,d\hat t$. By substituting for $y$, and combining the first and third equations, the system reduces to

$$\varepsilon\frac{dx}{dt} = -x(x-\lambda) + \varepsilon u, \qquad x|_{t=0} = x_\#,$$
$$\frac{du}{dt} = -\frac{dx}{dt} - \gamma u, \qquad u|_{t=0} = 0.$$

When the reaction $X + Y \to U$ is much faster than the other two, we have $0 < \varepsilon \ll 1$, and the system is singularly perturbed with an initial layer at $t = 0$.
(a) Find leading-order composite approximations for $x(t,\varepsilon)$ and $u(t,\varepsilon)$ on the interval $0 \le t < \infty$.
(b) Using the results from (a), find explicit leading-order expressions for $y(t,\varepsilon)$ and $v(t,\varepsilon)$.
(c) Using $\{x_\#, y_\#, \gamma, \varepsilon\} = \{1, 0.7, 0.6, 0.01\}$, make plots of $u$, $v$ versus $t \in [0,10]$. What appears to be the terminal value of $v$ as $t \to \infty$? At what time is the reaction 90% complete, that is, when does $v$ reach 90% of its terminal value?

22. The Michaelis–Menten model of a biochemical reaction catalyzed by an enzyme leads to the following reduced, dimensionless system

$$\frac{dx}{dt} = -x + xy + \mu y, \qquad x|_{t=0} = 1,$$
$$\varepsilon\frac{dy}{dt} = x - xy - \gamma y, \qquad y|_{t=0} = 0.$$

Here $x$, $y$ are the concentrations of the substrate and enzyme-substrate complex, $t$ is time, and $\mu$, $\gamma$, $\varepsilon$ are positive constants. Under typical conditions, we have $0 < \varepsilon \ll 1$, and the system is singularly perturbed with an initial layer at $t = 0$. Find leading-order composite approximations for $x(t,\varepsilon)$ and $y(t,\varepsilon)$ on the interval $0 \le t < \infty$. [The outer solution will have an implicit form.] Note: In singularly perturbed chemical kinetics problems, the leading-order outer solution is called a quasi-steady-state approximation.


23. As described in Section 5.11, the liquid-gas interface model leads to the following inner equation, where $u$ is the stretched height, and $\tau$ is the stretched horizontal coordinate.

$$\frac{d^2u}{d\tau^2} = u\left[1 + \left(\frac{du}{d\tau}\right)^2\right]^{3/2}, \qquad \tau \le 0.$$

Here we solve the above equation subject to the matching condition $u \to 0$ as $\tau \to -\infty$, along with the boundary condition $\frac{du}{d\tau}\big|_{\tau=0} = \tan\gamma$, where $0 < \gamma < \frac{\pi}{2}$ is a given angle. Consistent with $\tan\gamma > 0$, we suppose $u > 0$ and $\frac{du}{d\tau} > 0$ in the inner region, and also $\frac{du}{d\tau} \to 0$ as $\tau \to -\infty$.
(a) Introduce the first-order system $\frac{du}{d\tau} = v$, $\frac{dv}{d\tau} = u[1+v^2]^{3/2}$ and consider the equation for $\frac{dv}{du}$. Using the condition $v \to 0$ when $u \to 0$, together with $v > 0$ when $u > 0$, show that the solution of this equation is $v = \frac{u\sqrt{4-u^2}}{2-u^2}$. Note that the solution is well defined for $u \in (0,\sqrt{2})$.
(b) From (a) we get $\frac{du}{d\tau} = \frac{u\sqrt{4-u^2}}{2-u^2}$, or equivalently $\frac{d\tau}{du} = \frac{2-u^2}{u\sqrt{4-u^2}}$. Show that the general solution is $\tau = F(u) - B$, where $F(u) = \sqrt{4-u^2} + \ln(u/[2+\sqrt{4-u^2}])$, and $B$ is an arbitrary constant. Note that, since it is monotonic for $u \in (0,\sqrt{2})$, the function $\tau = \tau(u)$ has an inverse $u = u(\tau)$. Note also that the matching condition $u \to 0^+$ as $\tau \to -\infty$, or equivalently $\tau \to -\infty$ as $u \to 0^+$, is satisfied.
(c) In the inner variables, the meniscus endpoint is $(\tau,u) = (0,u_\#)$. At this point, the solution must satisfy $\frac{du}{d\tau}\big|_{\tau=0} = \tan\gamma$, or equivalently $\frac{d\tau}{du}\big|_{u=u_\#} = \frac{1}{\tan\gamma}$. Using (b), show that this equation has a unique solution $u_\# \in (0,\sqrt{2})$ for any given $\gamma \in (0,\frac{\pi}{2})$.
(d) Since $u = u_\#$ when $\tau = 0$, show that the arbitrary constant from part (b) must have the value $B = F(u_\#)$. Thus the complete inner solution for the interface curve has the form $\tau = F(u) - F(u_\#)$.

Mini-project 1. A basic problem in ballistics is to determine how to aim a given weapon in order for a bullet to strike a given target as described in Section 5.7. Many factors are involved, and all are important in long-range shots. Considering only gravity and air resistance, a simple model for the near-horizontal motion of a bullet is

[Figure: bullet trajectory in the $xy$-plane, showing gravity $g$, the firing velocity $(u_0,v_0)$, the line of sight, the aiming angle $\theta$, the aiming height $h$, and the target at $(a,b)$.]

$$\ddot x = -\varepsilon(\dot x)^2, \qquad \dot x|_{t=0} = u_0, \qquad x|_{t=0} = 0,$$
$$\ddot y = -\varepsilon\dot x\dot y - g, \qquad \dot y|_{t=0} = v_0, \qquad y|_{t=0} = 0.$$

Here $(x,y)$ is the bullet position, $t$ is time, $g$ is gravitational acceleration, $\varepsilon$ is an air resistance coefficient, and $(u_0,v_0)$ is the bullet firing velocity; we assume $u_0^2 + v_0^2 = c^2$, where $c > 0$ is a constant that depends on the weapon. The problem is to determine the pair $(u_0,v_0)$, or equivalently the aiming angle $\theta$ or height $h$, required for the bullet trajectory to intersect a fixed target at $(a,b)$. Here we study the influence of gravity and air resistance on this problem. All quantities are in units of meters and seconds.

(a) For the case of $\varepsilon = 0$, solve for the path $(x,y)(t)$ and briefly describe how the targeting problem may have zero, one or two solutions for $(u_0,v_0)$ depending on $a$, $b$, $c$ and $g$. If $c = 500$ and $g = 10$, what value of $(u_0,v_0)$ gives the lowest aiming angle to strike a target at $(a,b) = (500,2)$? What is the aiming height $h$? How much time is required for the impact? [Give all results to 4 decimal places.]

(b) For the case of small $\varepsilon > 0$, find an approximation to the path $(x,y)(t,\varepsilon)$ up to and including $O(\varepsilon)$ terms. If $c = 500$ and $g = 10$ as before, and $\varepsilon = 0.0002$, what value of $(u_0,v_0)$ now gives the lowest aiming angle to strike the target at $(a,b) = (500,2)$? What is the aiming height $h$? How much time is now required for the impact? [When considering the strike conditions, note that the physically meaningful solutions should tend to those in part (a) as $\varepsilon \to 0^+$. Give all results to 4 decimal places.]

(c) Use Matlab or other similar software to numerically solve the differential equations for the bullet path and confirm your results in (a) and (b). Specifically, using your computed values of $(u_0,v_0)$ and impact times, make and superimpose plots of the bullet trajectories and verify that both hit the given target.

Mini-project 2. Due to the high velocities involved, the motion of Mercury (M) around the Sun (S) is described by a corrected form of Newton's laws, where the correction is based on Einstein's theory of relativity. When time is eliminated, the equations for the orbital path $r = r(\theta)$ are

[Figure: orbit of M around S in the $xy$-plane, showing the polar coordinates $(r,\theta)$ of M, the perihelion, and the aphelion.]

$$\frac{d^2u}{d\theta^2} + u = \frac{1}{\rho} + \frac{\gamma u^2}{\rho}, \qquad \theta \ge \theta_0,$$
$$\frac{du}{d\theta}\Big|_{\theta=\theta_0} = 0, \qquad u|_{\theta=\theta_0} = \frac{1+c}{\rho}.$$

In the above, $u = 1/r$ is the inverse of the radius, $\rho$ is a constant related to gravity, $\gamma$ is a small constant that quantifies relativistic effects, and $c$ is a constant that defines the initial radius. Here we study the above system with ($\gamma > 0$), and without ($\gamma = 0$), the relativistic correction term, whose effect we seek to understand. Relevant physical dimensions are $[\rho] = L$, $[\gamma] = L^2$ and $[c] = 1$. We assume $\rho > 0$, $0 < c < 1$, $\gamma \ge 0$, and $\theta_0$ are given. As M moves around S the angle $\theta$ will continually increase: it will reach the value $\theta_0 + 2n\pi$ after $n$ complete revolutions. The differential equation above describes how $r$, and hence $(x,y)$, vary with $\theta$.

(a) Consider the scale transformation $v = \rho u$ and $\phi = \theta - \theta_0$. [Although the transformation includes a shift, the derivative relations will not be changed.] Show that the system can be written as follows for an appropriate parameter $\varepsilon$:

$$\frac{d^2v}{d\phi^2} + v = 1 + \varepsilon v^2, \qquad \frac{dv}{d\phi}\Big|_{\phi=0} = 0, \qquad v|_{\phi=0} = 1 + c, \qquad \phi \ge 0.$$

(b) Solve the system in (a) with $\varepsilon = 0$, which corresponds to neglecting the relativistic correction. Given that $r(\theta) = \rho/v(\phi)$, where $\phi = \theta - \theta_0$, show by inspection that the smallest value of $r$ (perihelion of orbit) occurs at the angle $\theta = \theta_0 + 2n\pi$, and the largest value (aphelion) occurs at $\theta = \theta_0 + (2n+1)\pi$. Does the location of the perihelion/aphelion in the $xy$-plane change with each revolution $n \ge 0$?

(c) Solve the system in (a) with $0 < \varepsilon \ll 1$, which corresponds to including the relativistic correction. Use the Poincaré–Lindstedt method with

$$v(s,\varepsilon) = v_0(s) + \varepsilon v_1(s) + \cdots, \qquad s = \omega\phi, \qquad \omega = \omega_0 + \varepsilon\omega_1 + \cdots.$$

You need only determine $v_0$, $\omega_0$ and $\omega_1$. Using $r(\theta) = \rho/v_0(s)$, where $s = (\omega_0 + \varepsilon\omega_1)\phi$ and $\phi = \theta - \theta_0$, show the smallest value of $r$ now occurs at $\theta = \theta_0 + 2n\pi\left(1 + \frac{\gamma}{\rho^2-\gamma}\right)$, and the largest value occurs at $\theta = \theta_0 + (2n+1)\pi\left(1 + \frac{\gamma}{\rho^2-\gamma}\right)$. Does the location of the perihelion/aphelion change now? What is the change in the angle of the aphelion between revolution $n$ and $n+1$?

(d) Use Matlab or other software to simulate the original system using $\theta_0 = \frac{\pi}{4}$, $\rho = 1$ and $c = 0.7$. For the cases $\gamma = 0$ and $\gamma = 0.01$, make a plot of the orbital curve $(x,y) = (r(\theta)\cos\theta, r(\theta)\sin\theta)$ for $\theta \in [\theta_0, \theta_0 + 2n\pi]$ for $n = 10$. Do the simulations agree with your analysis in (b) and (c)? Does the location of the aphelion change by the expected amount? Note: Astronomical observations show that the perihelion and aphelion of Mercury advance with each revolution of the orbit in agreement with the relativistic theory.

Mini-project 3. Due to surface tension, a liquid-gas interface will rise up or dip down at a solid boundary to form a meniscus as outlined in Section 5.11. In the planar case, the shape of the interface or meniscus curve $y(x)$ is described by

[Figure: liquid-gas interface $y(x)$ for $-L \le x \le L$, showing the gas and liquid regions, gravity $g$, the meniscus height, and the wetting angle $\gamma$.]

$$\frac{\sigma}{\rho g}\frac{d^2y}{dx^2} = y\left[1 + \left(\frac{dy}{dx}\right)^2\right]^{3/2}, \qquad 0 \le x \le L,$$
$$\frac{dy}{dx}\Big|_{x=0} = 0, \qquad \frac{dy}{dx}\Big|_{x=L} = \tan\gamma.$$

In the above, $\rho$ is the mass density of the liquid, $g$ is gravitational acceleration, and $\sigma$ and $\gamma$ are the surface tension and wetting angle of the interface. Under typical conditions, the above problem is singularly perturbed and has a boundary layer at $x = L$. Here we develop an approximate solution and predict the height of the meniscus. We assume $\sigma > 0$, $\rho > 0$, $g > 0$, $L > 0$ and $\gamma \in (0,\frac{\pi}{2})$ are given constants. All quantities are in units of kilograms, meters and seconds.


For convenience, we introduce $s = x/L$ and $h = y/L$, and consider the following dimensionless form of the above system, where $0 < \varepsilon = \frac{\sigma}{\rho g L^2} \ll 1$. Note that the outer and inner endpoints in $s$ will be $q_{\mathrm{out}} = 0$ and $q_{\mathrm{in}} = 1$.

$$\varepsilon\frac{d^2h}{ds^2} = h\left[1 + \left(\frac{dh}{ds}\right)^2\right]^{3/2}, \qquad \frac{dh}{ds}\Big|_{s=0} = 0, \qquad \frac{dh}{ds}\Big|_{s=1} = \tan\gamma, \qquad 0 \le s \le 1.$$

(a) For the outer problem, show that the leading-order term of the outer solution is $h_0(s) \equiv 0$. As a result, show that the matching condition will be $H^{\mathrm{in}} = 0 = I^{\mathrm{out}}$, and that the leading-order composite approximation will be the inner approximation.

(b) For the inner problem, use the change of variable $\tau = \varepsilon^{-\alpha}(s - q_{\mathrm{in}})$ and $u = \varepsilon^{-\beta}h$ and show that the choice $\alpha = \frac{1}{2}$ and $\beta = \frac{1}{2}$ yields a regular system with no explicit $\varepsilon$ parameter, namely

$$\frac{d^2u}{d\tau^2} = u\left[1 + \left(\frac{du}{d\tau}\right)^2\right]^{3/2}, \qquad \frac{du}{d\tau}\Big|_{\tau=0} = \tan\gamma, \qquad \tau \le 0.$$

Thus, in the variables $\tau$, $u$, the inner solution is independent of $\varepsilon$ so that $u(\tau,\varepsilon) = u(\tau) = u_0(\tau)$.

(c) Solution curves of (b) which satisfy the matching condition in (a) have the implicit form $\tau = F(u) - B$, where $F(u) = \sqrt{4-u^2} + \ln(u/[2+\sqrt{4-u^2}])$, and $B$ is an arbitrary constant. Verify that these curves satisfy the differential equation (see hint below).

(d) In the inner variables, the meniscus endpoint is $(\tau,u) = (0,u_\#)$. At this point, the solution must satisfy $\frac{du}{d\tau}\big|_{\tau=0} = \tan\gamma$, or equivalently $\frac{d\tau}{du}\big|_{u=u_\#} = \frac{1}{\tan\gamma}$. Using (c), show that this equation has a unique solution $u_\# \in (0,\sqrt{2})$ for any given $\gamma \in (0,\frac{\pi}{2})$. Moreover, since $u = u_\#$ when $\tau = 0$, show that $B = F(u_\#)$.

(e) The inner and also composite approximation of the meniscus curve is $\tau = F(u) - F(u_\#)$. Note that $(\tau,u) = (0,u_\#)$ corresponds to $(x,y) = (L,y_\#)$, where $y_\#$ is the meniscus height. Using your expression for $u_\#$ from (d), find an explicit expression for $y_\#$. What are the values of $u_\#$ and $y_\#$ in the case when $\sigma = 0.05$, $\rho = 1000$, $g = 10$, $L = 0.07$ and $\gamma = \frac{2\pi}{5}$?

(f) Use Matlab or similar software for boundary-value problems to simulate the original system for the parameter values in (e). Is the prediction of the meniscus height $y_\#$ in agreement with the simulation? Based on the prediction, assuming $0 < \varepsilon \ll 1$, how would an increase or decrease in the gravitational acceleration $g$ affect the meniscus height $y_\#$? Does the height $y_\#$ depend on the container size $L$?

Hint for (c): $\frac{d\tau}{du} = F'(u)$, so $\frac{du}{d\tau} = 1/F'(u)$. Introduce $\phi(u) = 1/F'(u)$ so that $\frac{du}{d\tau} = \phi(u)$ and $\frac{d^2u}{d\tau^2} = \phi'(u)\phi(u)$. Is the differential equation satisfied when these expressions are substituted? [Note that $\frac{d\tau}{du}$ and $\frac{du}{d\tau}$ are defined and positive for $u \in (0,\sqrt{2})$.]
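The substitution suggested in the hint can also be spot-checked numerically before attempting the algebra. The Python sketch below assumes $\phi(u) = 1/F'(u) = u\sqrt{4-u^2}/(2-u^2)$, approximates $\phi'(u)$ by a central difference, and compares $\phi'(u)\phi(u)$ with $u[1+\phi(u)^2]^{3/2}$ at a few sample points in $(0,\sqrt{2})$. The step size, sample points and function names are arbitrary choices made for illustration; a small residual is only a sanity check and does not replace the verification asked for in part (c).

```python
import math

def phi(u):
    """Assumed form of du/dtau = 1 / F'(u) on (0, sqrt(2))."""
    return u * math.sqrt(4.0 - u * u) / (2.0 - u * u)

def relative_residual(u, h=1e-6):
    """Relative mismatch between phi'(u) * phi(u) and u * (1 + phi(u)^2)^(3/2)."""
    dphi = (phi(u + h) - phi(u - h)) / (2.0 * h)  # central difference for phi'
    lhs = dphi * phi(u)                           # d^2u/dtau^2 via the chain rule
    rhs = u * (1.0 + phi(u) ** 2) ** 1.5          # right-hand side of the inner equation
    return abs(lhs - rhs) / abs(rhs)

# Sample points 0.1, 0.2, ..., 1.3, all inside (0, sqrt(2)).
max_rel_residual = max(relative_residual(0.1 * k) for k in range(1, 14))

if __name__ == "__main__":
    print(max_rel_residual)
```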

Chapter 6

Calculus of variations

A wide variety of problems in modeling are concerned with finding the minimum or maximum value of a function, and the corresponding inputs that produce such a value. When the domain of the function is a subset of the real line, plane, or some higherdimensional space, the techniques of differential calculus can be used to characterize those points that are minimizing or maximizing. Alternatively, when the domain is a collection of graphs or more general curves, then other techniques are required to characterize those curves that are minimizers or maximizers for the function. Such problems cannot be solved by the tools of elementary calculus alone, but require a more elaborate theory known as the calculus of variations. Here we outline a basic version of this theory. We consider both first- and second-order problems involving graphs and more general curves in the plane, with various types of essential and natural boundary conditions, and constraints. We focus on necessary conditions for optimizers and establish sufficient conditions in different cases. The theory is illustrated with various applications, including some problems in optimal control.

6.1. Preliminaries

Throughout our developments we consider various sets whose elements are real-valued functions $y(x)$ of a real variable $x \in [a,b]$. The most basic of these sets are defined as follows, where $n \ge 0$ denotes an integer.

Definition 6.1.1. By $C^n[a,b]$ we mean the set of functions on $[a,b]$ with $n$ continuous derivatives, specifically

(6.1) $$C^n[a,b] = \{\, y : [a,b] \to \mathbb{R} \mid y(x), y'(x), \ldots, y^{(n)}(x) \text{ continuous} \,\}.$$

Thus $C^0[a,b]$ is the set of all functions that are continuous, $C^1[a,b]$ is the set of all functions that are continuous and have a first derivative that is continuous, and so on; see Figure 6.1. Here continuity is understood to hold over the entire interval $[a,b]$, including the endpoints. For any given $n$ and $[a,b]$ the set $C^n[a,b]$ is a linear space, which means that if $u$ and $v$ are functions in the set, then so is $\alpha u + \beta v$ for any constants $\alpha$ and $\beta$, so

(6.2) $$u, v \in C^n[a,b] \quad\longrightarrow\quad \alpha u + \beta v \in C^n[a,b], \quad \forall \alpha, \beta \in \mathbb{R}.$$

[Figure 6.1. Graphs on an interval $[a,b]$ of a function $y$ in $C^0$ but not $C^1$, and of a function $y$ in $C^1$.]

More generally, a set of functions $V$ is called a linear space if

(6.3) $$u, v \in V \quad\longrightarrow\quad \alpha u + \beta v \in V, \quad \forall \alpha, \beta \in \mathbb{R}.$$

Note that a linear space is simply a real vector space as defined in linear algebra, where the elements are functions. The sets defined above are called the $C^n$-spaces.

Example 6.1.1. (1) Consider $V = \{y \in C^0[1,2] \mid y(1) = 0\}$. This set is a linear space since each condition of membership is preserved under arbitrary linear combinations. Specifically, if $u \in C^0[1,2]$ and $v \in C^0[1,2]$, then $(\alpha u + \beta v) \in C^0[1,2]$ for all $\alpha$, $\beta$. Also, if $u(1) = 0$ and $v(1) = 0$, then $(\alpha u + \beta v)(1) = 0$ for all $\alpha$, $\beta$.

(2) Consider $V = \{y \in C^1[0,3] \mid y(3) = 4\}$. This set is not a linear space since a condition of membership is not preserved under arbitrary linear combinations. Specifically, if $u(3) = 4$ and $v(3) = 4$, then $(\alpha u + \beta v)(3) = 4\alpha + 4\beta \ne 4$ for some $\alpha$, $\beta$.

We will be interested in optimization problems that involve a given set of functions and a real-valued quantity defined on this set. Such a quantity, whose input is a function and output is a number, is defined next.

Definition 6.1.2. By a functional $F$ on a set of functions $V$ we mean a mapping $F : V \to \mathbb{R}$. A functional $F$ is called linear if $V$ is a linear space and

(6.4) $$F(\alpha u + \beta v) = \alpha F(u) + \beta F(v), \quad \forall u, v \in V, \quad \forall \alpha, \beta \in \mathbb{R}.$$

Thus a functional is a mapping that associates a number $F(y) \in \mathbb{R}$ to any given function $y \in V$. Such mappings arise naturally in many contexts. As illustrated in the example below, functionals can be either linear or nonlinear, and can be defined in a number of different ways. We will be primarily interested in functionals defined through an integral expression.

Example 6.1.2. (1) Consider $V = \{y \in C^0[0,1] \mid y(1) = 0\}$, $F(y) = \int_0^1 x^2 y(x)\,dx$. The expression for $F$ defines a functional, since it produces a number for a given function $y$. This functional may be described as the integral type, due to its form. Moreover, this functional is linear, which follows from the fact that $V$ is a linear space, and from properties of integrals, which imply

(6.5) $$F(\alpha u + \beta v) = \int_0^1 x^2(\alpha u(x) + \beta v(x))\,dx = \alpha\int_0^1 x^2 u(x)\,dx + \beta\int_0^1 x^2 v(x)\,dx = \alpha F(u) + \beta F(v).$$

(2) Consider $V = C^1[0,1]$, $F(y) = \int_0^1 \left[x y(x) + 4(y'(x))^2\right] dx$. The expression for $F$ defines a functional; it is of the integral type similar to before, but is nonlinear due to the squared term. Note that $F$ is defined on the entire space $C^1[0,1]$, and could not be similarly defined on $C^0[0,1]$.

(3) Consider $V = C^2[0,1]$, $F(y) = y'(\frac{1}{2}) + e^{-y(0)} + \max_{[0,1]}|y''(x)|$. The expression for $F$ defines a functional; it is of the nonintegral type, and is nonlinear. Note that a functional may involve local or point information about the input, such as $y(0)$ and $y'(\frac{1}{2})$, as well as global information, such as $\max_{[0,1]}|y''(x)|$, which denotes the maximum of the absolute value of $y''(x)$ in the interval $x \in [0,1]$.

Just as calculus can be viewed as the study of functions, calculus of variations can be viewed as the study of functionals. In the remainder of our developments, we outline an elementary theory of minima and maxima for functionals. Ideally, a systematic study of their continuity and differentiability properties should also be considered, but this will not be pursued here.
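To make the idea of a functional concrete, the following Python sketch evaluates the integral-type functionals from parts (1) and (2) of Example 6.1.2 with a composite Simpson rule, and then tests the linearity relation (6.4) on sample inputs. The quadrature routine, the sample functions and all names are assumptions chosen for illustration; in particular the derivative $y'$ is supplied by hand as a second argument rather than computed.

```python
import math

def simpson(f, a=0.0, b=1.0, n=1000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * f(a + k * h)
    return total * h / 3.0

def F1(y):
    """Linear functional F(y) = integral over [0, 1] of x^2 y(x)."""
    return simpson(lambda x: x * x * y(x))

def F2(y, dy):
    """Nonlinear functional F(y) = integral over [0, 1] of x y(x) + 4 y'(x)^2."""
    return simpson(lambda x: x * y(x) + 4.0 * dy(x) ** 2)

# Sample inputs: u and v both vanish at x = 1, as membership in V requires in part (1).
u = lambda x: 1.0 - x
v = lambda x: math.sin(math.pi * x)
alpha, beta = 2.0, -3.0

combo = lambda x: alpha * u(x) + beta * v(x)
linear_gap = F1(combo) - (alpha * F1(u) + beta * F1(v))   # near zero if (6.4) holds

# For part (2), doubling the input y(x) = x does not double the output.
y, dy = (lambda x: x), (lambda x: 1.0)
two_y, two_dy = (lambda x: 2.0 * x), (lambda x: 2.0)
nonlinear_gap = F2(two_y, two_dy) - 2.0 * F2(y, dy)       # clearly nonzero

if __name__ == "__main__":
    print(linear_gap, nonlinear_gap)
```

A check on a handful of inputs can expose nonlinearity, as it does for the second functional, but it can never establish linearity; that still rests on the argument in (6.5).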

6.2. Absolute extrema

A minimum or maximum value of a functional is called an extremum. The input associated with such a value is also called an extremum, or more appropriately an extremizer, to distinguish it from the output. An extremum can be of the absolute (global) type, or relative (local) type, or both. The definition of an absolute extremum is especially simple. It is based on a comparison with all elements of a set, and does not involve any notion of distance or closeness within the set.

Definition 6.2.1. Let $F : V \to \mathbb{R}$ be given. A function $y_* \in V$ is called an absolute minimizer of $F$ if

(6.6) $$F(y) \ge F(y_*), \quad \forall y \in V.$$

Similarly, $y_*$ is called an absolute maximizer if

(6.7) $$F(y) \le F(y_*), \quad \forall y \in V.$$

Thus a function 𝑦∗ is an absolute minimizer if it gives the smallest or minimum value of 𝐹 over the set V, and similarly, 𝑦∗ is an absolute maximizer if it gives the largest or maximum value. In some cases, the existence or not of an absolute extremum can be established by observation and a straightforward analysis. In other cases, the existence of such an extremum is more delicate. Methods to produce candidates for absolute extrema will be presented later; for the moment, we proceed by observation.


Example 6.2.1. Consider finding absolute extrema of $F : V \to \mathbb{R}$, where $V = \{ y \in C^0[0,1] \mid y(0) = 0 \}$ and $F(y) = \int_0^1 y^2(x)\,dx$.

Absolute minimizer. The positive quadratic form of the integrand suggests that a minimizer may exist. Specifically, we note that the functional satisfies the lower bound $F(y) \ge 0$ for all $y \in V$. Moreover, we note that $F(y) = 0$ when and only when $y(x) \equiv 0$, and the zero function is in $V$. Thus $y_*(x) \equiv 0$ is an absolute minimizer; it satisfies $F(y) \ge F(y_*)$ for all $y \in V$. Note that this minimizer is unique, since there are no other functions with this property.

Absolute maximizer. Based on the positive quadratic form of the integrand, we expect that the functional has no upper bound, and hence no absolute maximizer. To show this, consider the function $\hat{y}(x) = cx$, where $c$ is a constant. Note that $\hat{y} \in V$ for any $c$, and by direct computation $F(\hat{y}) = \int_0^1 c^2 x^2\,dx = \frac{1}{3}c^2$. Since $F(\hat{y}) \to \infty$ as $c \to \infty$ we deduce that there is no absolute maximizer. Specifically, there is no $y_* \in V$ which satisfies $F(y) \le F(y_*)$ for all $y \in V$, because $F(y_*)$ would be a fixed number, and $F(\hat{y})$ would be greater than this number for sufficiently large $c$. Thus no function in $V$ gives a largest value of $F$.

Example 6.2.2. Consider finding absolute extrema of $F : V \to \mathbb{R}$, where $F(y) = \int_0^1 y^2(x)\,dx$, but now $V = \{ y \in C^0[0,1] \mid y(1) = 1 \}$.

Absolute minimizer. As before, we note that $F(y) \ge 0$, and $F(y) = 0$ when and only when $y(x) \equiv 0$. But now the zero function is not in $V$. Thus the lower bound is not reached by any function in the set, and we have $F(y) > 0$ for all $y \in V$. However, there are functions that come arbitrarily close to reaching the lower bound. One such example is the piecewise linear function $\hat{y}$ shown in Figure 6.2. This function is in $V$ for any $0 < \varepsilon < 1$, and a direct computation gives $F(\hat{y}) = \frac{1}{3}\varepsilon$. Since $F(\hat{y}) \to 0^+$ as $\varepsilon \to 0^+$ we deduce that there is no absolute minimizer. Specifically, there is no $y_* \in V$ which satisfies $F(y) \ge F(y_*)$ for all $y \in V$, because $F(y_*)$ would be a fixed positive number, and $F(\hat{y})$ would be less than this number for sufficiently small $\varepsilon$. Thus no function in $V$ gives a smallest value of $F$. Note also that the function $\hat{y}$ considered here does not approach a function in $V$ as $\varepsilon \to 0^+$. Specifically, $\hat{y}$ approaches a function with a jump discontinuity at $x = 1$, and such a function is not contained in $V$.

[Figure 6.2: graph of $\hat{y}(x)$ against $x$, with axis marks at $x = 1 - \varepsilon$ and $x = 1$, and at $y = 1$.]

Absolute maximizer. As before, we expect that the functional has no upper bound, and hence no absolute maximizer. To show this, consider the function $\hat{y}(x) = x + c(1 - x)$, where $c$ is a constant. Note that $\hat{y} \in V$ for any $c$, and by direct computation $F(\hat{y}) = \frac{1}{3}(1 + c + c^2)$. Since $F(\hat{y}) \to \infty$ as $c \to \infty$ we deduce that there is no absolute maximizer as before.
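The closed-form values quoted in Examples 6.2.1 and 6.2.2 can be checked numerically. The following is a minimal sketch using a composite midpoint rule; the helper names (`integrate`, `ramp`, `scaled`, `line`) are hypothetical, and the ramp is an assumed form of the piecewise linear function in Figure 6.2 (zero on $[0, 1-\varepsilon]$, then linear up to the value 1 at $x = 1$), chosen because it reproduces the stated value $\frac{1}{3}\varepsilon$.

```python
def integrate(func, a, b, n=20000):
    """Composite midpoint rule for the integral of func over [a, b]."""
    step = (b - a) / n
    return step * sum(func(a + (i + 0.5) * step) for i in range(n))

def F(y):
    """The functional F(y) = integral of y(x)^2 over [0, 1]."""
    return integrate(lambda x: y(x) ** 2, 0.0, 1.0)

def scaled(c):
    """y(x) = c x from Example 6.2.1 (satisfies y(0) = 0)."""
    return lambda x: c * x

def ramp(eps):
    """Assumed shape of the function in Figure 6.2 (satisfies y(1) = 1)."""
    return lambda x: max(0.0, (x - (1.0 - eps)) / eps)

def line(c):
    """y(x) = x + c (1 - x) from Example 6.2.2 (satisfies y(1) = 1)."""
    return lambda x: x + c * (1.0 - x)

for c in (1.0, 10.0):
    print("c =", c, F(scaled(c)), c * c / 3, F(line(c)), (1 + c + c * c) / 3)
for eps in (0.5, 0.1, 0.01):
    print("eps =", eps, F(ramp(eps)), eps / 3)
```

Each printed line pairs a numerical value with the corresponding formula from the text, so the two columns can be compared directly as $c$ grows or $\varepsilon$ shrinks.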

6.3. Local extrema

Aside from the absolute type, we also consider extrema of a local type. Such extrema may be more likely to exist and easier to find. The definition of a local extremum requires the concept of a neighborhood within a set of functions. Here we consider neighborhoods defined using the standard family of norms in the linear space $C^n[a,b]$. The notation $\max_{a \le x \le b} |v(x)|$ means the maximum of the absolute value of $v(x)$ in the interval $x \in [a,b]$.

Definition 6.3.1. Let $V \subset C^n[a,b]$ and $m \le n$ be given. By the $C^m$-norm of $v \in V$ we mean the number

$$ \|v\|_{C^m} = \max_{a \le x \le b} |v(x)| + \max_{a \le x \le b} |v'(x)| + \cdots + \max_{a \le x \le b} |v^{(m)}(x)|. \tag{6.8} $$

The distance between $u \in V$ and $y \in V$ is the number $\|u - y\|_{C^m}$. By the $C^m$-neighborhood of $y \in V$ of radius $\delta > 0$ we mean the set

$$ N_{C^m}(y, \delta) = \{ u \in V \mid \|u - y\|_{C^m} \le \delta \}. \tag{6.9} $$

Thus the 𝐶 𝑚 -norm of 𝑣 is a measure of the magnitude or size of 𝑣, which is based on the maximum absolute value of the function and its derivatives up through order 𝑚. Two functions 𝑢, 𝑦 are close in this norm when the distance between them ‖𝑢 − 𝑦‖𝐶 𝑚 is small. The 𝐶 𝑚 -neighborhood of 𝑦 consists of all functions 𝑢 that are within a given distance. By design, we define a neighborhood as a subset of V, and do not consider any functions outside of this set. Note that the 𝐶 𝑚 -norm has all the properties of a vector norm as defined in linear algebra.

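[Figure 6.3: left, graph of $y_*$ on $[0,1]$; right, a function $u$ inside a strip of half-width $\delta$ about the graph of $y_*$.]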

Example 6.3.1. Let $V = \{ y \in C^0[0,1] \mid y(0) = 0 \}$ and consider $y_* \in V$ as shown in the left part of Figure 6.3. The $C^0$-norm of this function is $\|y_*\|_{C^0} = \max_{[0,1]} |y_*(x)| = 2$. For given $\delta > 0$, the $C^0$-neighborhood is $N_{C^0}(y_*, \delta) = \{ u \in V \mid \|u - y_*\|_{C^0} \le \delta \}$. Using the fact that $\|v\|_{C^0} \le \delta$ if and only if $|v(x)| \le \delta$ for all $x \in [0,1]$, the neighborhood can be written in the more explicit form

$$ N_{C^0}(y_*, \delta) = \{ u \in V \mid |u(x) - y_*(x)| \le \delta \}. \tag{6.10} $$

Thus 𝑁 𝐶 0 (𝑦∗ , 𝛿) is the set of all functions 𝑢 whose graphs are contained within a strip of half-width 𝛿 about the graph of 𝑦∗ , as shown in the right part of Figure 6.3.


Example 6.3.2. Let $V = \{ y \in C^1[0,1] \mid y(0) = 0 \}$ and consider $y_* \in V$ as shown in the top left part of Figure 6.4. In view of (6.8), note that the $C^1$-norm of a function can be written as the sum of two separate $C^0$-norms, namely $\|y_*\|_{C^1} = \|y_*\|_{C^0} + \|y_*'\|_{C^0}$. For given $\delta > 0$, the $C^1$-neighborhood is $N_{C^1}(y_*, \delta) = \{ u \in V \mid \|u - y_*\|_{C^1} \le \delta \}$. Using the noted relation between norms, this neighborhood can be written in a more explicit form similar to before, namely

$$ N_{C^1}(y_*, \delta) = \{ u \in V \mid |u(x) - y_*(x)| \le \tilde{\delta}, \;\; |u'(x) - y_*'(x)| \le \hat{\delta}, \;\; \tilde{\delta} + \hat{\delta} \le \delta \}. \tag{6.11} $$

Thus $N_{C^1}(y_*, \delta)$ consists of all functions $u$ that satisfy two conditions: $u, y_*$ are within some $C^0$-distance $\tilde{\delta}$, and $u', y_*'$ are within some $C^0$-distance $\hat{\delta}$, where $\tilde{\delta} + \hat{\delta} \le \delta$, as shown in the bottom left part of Figure 6.4. Note that a $C^1$-neighborhood of $y_*$ is much more restrictive than a $C^0$-neighborhood. For example, as illustrated in the right part of Figure 6.4, the function $u$ is in a $C^1$-neighborhood of $y_*$ for some small $\delta$. In contrast, the function $v$ is in a $C^0$-neighborhood, but not a $C^1$-neighborhood. Specifically, $v$ is close to $y_*$ for all $x \in [0,1]$, but the slope of $v$ is not close to the slope of $y_*$ at some points.

[Figure 6.4: top left, graph of $y_*$ on $[0,1]$; bottom left, a function $u$ in a $C^1$-neighborhood of $y_*$; right, functions $u$ and $v$ near $y_*$.]
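The contrast between the two neighborhoods can be made concrete with a small computation. The sketch below approximates the $C^0$- and $C^1$-norms from (6.8) by sampling on a grid. The base function $y_* \equiv 0$ and the two perturbations `u` and `v` are hypothetical choices made only for illustration; they are not the functions drawn in Figure 6.4.

```python
import math

def sup_norm(func, a=0.0, b=1.0, n=20001):
    """Approximate max |func(x)| on [a, b] by sampling n grid points."""
    return max(abs(func(a + (b - a) * i / (n - 1))) for i in range(n))

def c1_norm(func, dfunc, a=0.0, b=1.0):
    """C^1-norm from (6.8) with m = 1: sup|func| + sup|func'|."""
    return sup_norm(func, a, b) + sup_norm(dfunc, a, b)

eps = 0.05
# Hypothetical perturbations of y* = 0; both vanish at x = 0, so both lie in V.
u, du = (lambda x: eps * x), (lambda x: eps)
v = lambda x: eps * math.sin(x / eps**2)
dv = lambda x: math.cos(x / eps**2) / eps

print("u: C0 distance =", sup_norm(u), " C1 distance =", c1_norm(u, du))
print("v: C0 distance =", sup_norm(v), " C1 distance =", c1_norm(v, dv))
```

Both functions stay within $C^0$-distance `eps` of zero, but only `u` has a comparably small $C^1$-distance; the slope of `v` is of size `1/eps`.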

Using the above notion of a neighborhood, we can now define local extrema for a functional on a subset of a $C^n$-space. This definition should be compared to that for absolute extrema given earlier.

Definition 6.3.2. Let $V \subset C^n[a,b]$, $F : V \to \mathbb{R}$ and $m \le n$ be given. A function $y_* \in V$ is called a local minimizer of $F$ in the $C^m$-norm if

$$ F(u) \ge F(y_*), \quad \forall u \in N_{C^m}(y_*, \delta) \text{ for some } \delta > 0. \tag{6.12} $$

Similarly, $y_*$ is called a local maximizer if

$$ F(u) \le F(y_*), \quad \forall u \in N_{C^m}(y_*, \delta) \text{ for some } \delta > 0. \tag{6.13} $$

Thus a function $y_*$ is a local minimizer in the $C^m$-norm if it gives the smallest value of $F$ over some $C^m$-neighborhood of $y_*$ in $V$; and similarly, it is a local maximizer if it gives the largest value. Methods outlined later will produce candidates $y_*$ for local extrema for various different types of functionals and sets. To show that a candidate $y_*$ is a local extremum we will need to verify that it satisfies the above definition for some radius $\delta > 0$. If the definition is satisfied for an arbitrary radius, no matter how large, then the candidate is an absolute extremum; in this case, the neighborhood is the entire set. Thus the difference between a local and absolute extremum can be understood in terms of the radius of the neighborhood. Note that an absolute extremum must be a local extremum for every norm and neighborhood. In contrast, a local extremum that is not absolute may only be an extremum in some norm and neighborhood. As the next example shows, a local extremum in one norm may not be an extremum in another.

Example 6.3.3. Consider the set of functions $V = \{ y \in C^1[0,1] \mid y(0) = \alpha \}$ and functional $F(y) = \int_0^1 (y'(x))^2 (\beta - (y'(x))^2)\,dx$, where $\alpha > 0$ and $\beta > 0$ are given constants. Methods outlined later will produce a candidate for a local minimizer, namely $y_*(x) \equiv \alpha$. Here we determine if this candidate is an actual local minimizer in the $C^m$-norm for some $m \le 1$. For reference, note that $y_*'(x) \equiv 0$, and hence $F(y_*) = 0$.

Claim: $y_*$ is a local minimizer in the $C^1$-norm. To establish this, let $\delta > 0$ be given and consider the neighborhood $N_{C^1}(y_*, \delta)$ as shown in Figure 6.5. For any $u \in N_{C^1}(y_*, \delta)$ as illustrated, we have $|u(x) - \alpha| \le \tilde{\delta} \le \delta$ and also $|u'(x) - 0| \le \hat{\delta} \le \delta$, for all $x \in [0,1]$.

[Figure 6.5: a function $u$ in the neighborhood $N_{C^1}(y_*, \delta)$ of $y_* \equiv \alpha$ on $[0,1]$, with $u$ within $\tilde{\delta}$ of $y_*$ and $u'$ within $\hat{\delta}$ of $y_*'$.]

Provided that $\delta \le \sqrt{\beta}$, we will have $\beta - (u'(x))^2 \ge 0$, and consequently

$$ F(u) = \int_0^1 (u'(x))^2 (\beta - (u'(x))^2)\,dx \ge 0. \tag{6.14} $$

Since $F(y_*) = 0$, we find that

$$ F(u) \ge F(y_*), \quad \forall u \in N_{C^1}(y_*, \delta) \text{ for any } \delta \in (0, \sqrt{\beta}\,]. \tag{6.15} $$

Thus $y_*$ is a local minimizer in the $C^1$-norm.

Claim: $y_*$ is not a local minimizer in the $C^0$-norm. To establish this, let $\delta > 0$ be given and consider the neighborhood $N_{C^0}(y_*, \delta)$ as shown in Figure 6.6. For any $\varepsilon \in (0, \delta]$ consider the function $v \in N_{C^0}(y_*, \delta)$ defined by

$$ v(x) = \alpha + \varepsilon \sin(x/\varepsilon^2). \tag{6.16} $$

[Figure 6.6: an oscillatory function $v$ inside a strip of half-width $\delta$ about $y_* \equiv \alpha$ on $[0,1]$.]

For small values of the parameter $\varepsilon$, the function $v$ will be as close as desired to $y_*$, but the slope of $v$ will be very oscillatory and large compared to the zero slope of $y_*$. If we consider parameter values of the form $\varepsilon = \frac{1}{\sqrt{2k\pi}}$, where $k$ is an integer, then provided that $k \ge \frac{1}{2\pi\delta^2}$ we have $0 < \varepsilon \le \delta$, and provided that $k > \frac{2\beta}{3\pi}$, we find by direct evaluation that

$$ F(v) = \int_0^1 (v'(x))^2 (\beta - (v'(x))^2)\,dx = \frac{k\pi}{2}(2\beta - 3k\pi) < 0. \tag{6.17} $$

However, as noted earlier, we have 𝐹(𝑦∗ ) = 0. Thus every 𝐶 0 -neighborhood of 𝑦∗ contains functions such as 𝑣 with lower values of 𝐹, and it follows that 𝑦∗ is not a local minimizer in the 𝐶 0 -norm.
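The closed form in (6.17) can be compared against a direct numerical evaluation. This is a minimal sketch: the value `beta = 3.0` and the helper names are hypothetical, and the constant $\alpha$ does not appear because $F$ depends only on $v'$.

```python
import math

def integrate(func, a, b, n=20000):
    """Composite midpoint rule for the integral of func over [a, b]."""
    step = (b - a) / n
    return step * sum(func(a + (i + 0.5) * step) for i in range(n))

def F_of_v(k, beta):
    """F(v) for v(x) = alpha + eps*sin(x/eps^2) with eps = 1/sqrt(2*k*pi)."""
    eps = 1.0 / math.sqrt(2 * k * math.pi)
    dv = lambda x: math.cos(x / eps**2) / eps      # v'(x)
    return integrate(lambda x: dv(x)**2 * (beta - dv(x)**2), 0.0, 1.0)

def closed_form(k, beta):
    """Right-hand side of (6.17)."""
    return 0.5 * k * math.pi * (2 * beta - 3 * k * math.pi)

beta = 3.0  # hypothetical value; any beta > 0 could be used
for k in (1, 2, 5):
    eps = 1.0 / math.sqrt(2 * k * math.pi)
    print(k, eps, F_of_v(k, beta), closed_form(k, beta))
```

As $k$ increases, `eps` shrinks (so $v$ enters smaller $C^0$-neighborhoods) while $F(v)$ becomes more negative, which is the mechanism behind the claim.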

6.4. Necessary conditions

Here we outline a set of general necessary conditions for the local extrema of a functional. Although only necessary, they will provide a practical means of identifying candidates. The conditions are based on the concept of admissible variations. We first introduce a space of such variations, and then proceed to the important idea of the variation of a function.

Definition 6.4.1. Consider a set of functions of the form

$$ V = \{ y \in C^n[a,b] \mid G_1(y) = c_1, \ldots, G_J(y) = c_J \}, \tag{6.18} $$

where $G_j : C^n[a,b] \to \mathbb{R}$ are linear functionals and $c_j \in \mathbb{R}$ are constants for $j = 1, \ldots, J$. By the space of admissible variations associated with $V$ we mean the linear space

$$ V_0 = \{ h \in C^n[a,b] \mid G_1(h) = 0, \ldots, G_J(h) = 0 \}. \tag{6.19} $$

Thus $V_0$ is a linear space associated with $V$, defined by homogeneous versions of the membership conditions. By design, the space $V_0$ has the property that if $y_1 \in V$ and $y_2 \in V$, then $y_2 - y_1 \in V_0$. Also, if $y_1 \in V$ and $h \in V_0$, then $y_1 + h \in V$. These properties follow from the linearity of the membership conditions $G_j(y) = c_j$. For any given $V$ as considered above, the identification of $V_0$ is straightforward.

Example 6.4.1. (1) For $V = \{ y \in C^2[0,1] \mid y(0) = 2,\; y'(1) = 0,\; \int_0^1 y(x)\,dx = 3 \}$, the space of variations is $V_0 = \{ h \in C^2[0,1] \mid h(0) = 0,\; h'(1) = 0,\; \int_0^1 h(x)\,dx = 0 \}$. (2) For $V = C^2[0,1]$, with no additional conditions, the space of variations is $V_0 = C^2[0,1]$, with no additional conditions.

The space $V_0$ is useful for describing neighborhoods of a given function $y_* \in V$. Specifically, to each $u$ in a neighborhood of $y_*$ there is a unique $h \in V_0$ such that $u = y_* + h$, namely $h = u - y_*$, and $u \in N_{C^m}(y_*, \delta)$ if and only if $\|h\|_{C^m} \le \delta$. More importantly, the space $V_0$ is useful for describing distortions or variations of a given function $y_* \in V$. This latter idea will provide the foundation for a theory of local extrema.

Definition 6.4.2. Let $y_* \in V$ and $h \in V_0$ be given. By variations of $y_*$ in the direction $h$ we mean the family of functions

$$ y_* + \varepsilon h \in V, \quad \varepsilon \in \mathbb{R}. \tag{6.20} $$

For each $\varepsilon$, the function $y_* + \varepsilon h$ can be understood as a distorted version of $y_*$ within the set $V$. The direction of the distortion is determined by $h$, and the level or scale of the distortion is determined by $\varepsilon$. The functions $y_* + \varepsilon h$ and $y_*$ coincide when $\varepsilon = 0$, and remain close to each other in any $C^m$-norm for small values of $\varepsilon > 0$ and $\varepsilon < 0$. From a geometrical standpoint, the family $y_* + \varepsilon h$ can also be interpreted as an abstract line in the set $V$, which passes through the element $y_*$, where $\varepsilon$ is the coordinate along the line, and $h$ is the direction.

Example 6.4.2. Consider $V = \{ y \in C^2[0,1] \mid y(0) = 0,\; y(1) = 1 \}$, with space of variations $V_0 = \{ h \in C^2[0,1] \mid h(0) = 0,\; h(1) = 0 \}$. Moreover, consider the function $y_*(x) = x$ in $V$. Figure 6.7 illustrates some sample elements $h$ in $V_0$, and the resulting variations $y_* + \varepsilon h$ in $V$, corresponding to positive and negative values of the parameter $\varepsilon$.

[Figure 6.7: sample elements $h$ in $V_0$ and the resulting variations $y_* + \varepsilon h$ of $y_*$ on $[0,1]$, for $\varepsilon > 0$ and $\varepsilon < 0$.]

We can now state a set of general necessary conditions that the local extrema of a functional must satisfy. These conditions are based on the concept of a variation as introduced above, and results from single-variable calculus. In the following statement, the indicated derivatives with respect to the parameter are assumed to exist.

Result 6.4.1. [necessary conditions] Let $V \subset C^n[a,b]$, $F : V \to \mathbb{R}$ and $m \le n$ be given. If $y_* \in V$ is a local minimizer of $F$ in the $C^m$-norm, then it must satisfy

$$ \Big[ \frac{d}{d\varepsilon} F(y_* + \varepsilon h) \Big]_{\varepsilon = 0} = 0, \quad \forall h \in V_0, \tag{6.21} $$

and

$$ \Big[ \frac{d^2}{d\varepsilon^2} F(y_* + \varepsilon h) \Big]_{\varepsilon = 0} \ge 0, \quad \forall h \in V_0. \tag{6.22} $$

For a local maximizer, change $\ge$ to $\le$ in condition (6.22).

The above result follows from simple considerations. Specifically, if the functional $F(y)$ has a local minimum at $y = y_*$, then the single-variable function $f(\varepsilon) = F(y_* + \varepsilon h)$ has a local minimum at $\varepsilon = 0$. Results from single-variable calculus then require that $\frac{df}{d\varepsilon}(0) = 0$ and $\frac{d^2 f}{d\varepsilon^2}(0) \ge 0$, provided that these derivatives exist, and this must hold for any given $h$. In the case of a local maximum, the condition $\frac{d^2 f}{d\varepsilon^2}(0) \ge 0$ is replaced by $\frac{d^2 f}{d\varepsilon^2}(0) \le 0$. Note that the conditions are independent of the specific $C^m$-norm associated with a given extremum $y_*$. For brevity, the derivatives in (6.21) and (6.22) are denoted as $\delta F(y_*, h)$ and $\delta^2 F(y_*, h)$, and are called the first and second variation of $F$ at $y_*$ in the direction $h$.

The conditions outlined in the above result are necessary, but not sufficient. And they are still not sufficient even if the inequality in (6.22) is made strict. Whereas a strict inequality is sufficient in the case of functions defined on finite-dimensional domains, it is no longer sufficient in the case of functionals defined on infinite-dimensional domains. This lack of sufficiency can be attributed to the difference between finite and infinite dimension. Although only necessary, the above conditions provide a practical means of identifying candidates for extrema. In the remainder of our developments we will specialize these conditions to problems involving different types of functionals and sets. Before proceeding, we illustrate the basic idea with an example.

Example 6.4.3. Consider the set $V = \{ y \in C^2[0,1] \mid y(0) = 0,\; y(1) = 1 \}$, with space of variations $V_0 = \{ h \in C^2[0,1] \mid h(0) = 0,\; h(1) = 0 \}$, and consider the functional $F(y) = \int_0^1 \big[ 2y(x) + (y'(x))^2 \big]\,dx$. Here we find candidates $y_*$ for local extrema.

Variations. For any fixed $y \in V$ and $h \in V_0$, we seek expressions for the derivatives $\big[ \frac{d}{d\varepsilon} F(y + \varepsilon h) \big]_{\varepsilon=0}$ and $\big[ \frac{d^2}{d\varepsilon^2} F(y + \varepsilon h) \big]_{\varepsilon=0}$, which for brevity are denoted as $\delta F(y, h)$ and $\delta^2 F(y, h)$. From the definition of $F$, we have

$$ F(y + \varepsilon h) = \int_0^1 \big[ 2(y(x) + \varepsilon h(x)) + (y'(x) + \varepsilon h'(x))^2 \big]\,dx. \tag{6.23} $$

Differentiating with respect to $\varepsilon$, and noting that the derivative can be taken inside the integral, and using the chain rule where needed, we get

$$ \frac{d}{d\varepsilon} F(y + \varepsilon h) = \int_0^1 \big[ 2h(x) + 2(y'(x) + \varepsilon h'(x)) h'(x) \big]\,dx, \tag{6.24} $$

and

$$ \frac{d^2}{d\varepsilon^2} F(y + \varepsilon h) = \int_0^1 2h'(x) h'(x)\,dx. \tag{6.25} $$


Setting $\varepsilon = 0$ we obtain expressions for the first and second variations, namely

$$ \delta F(y, h) = \int_0^1 \big[ 2h(x) + 2y'(x) h'(x) \big]\,dx, \tag{6.26} $$

and

$$ \delta^2 F(y, h) = \int_0^1 2h'(x) h'(x)\,dx. \tag{6.27} $$

We next rewrite the first variation $\delta F(y, h)$ in a more useful form using the integration-by-parts formula $\int u\,dv = uv - \int v\,du$. Specifically, applying this formula to the term $\int 2y' h'\,dx$, with $u = 2y'$ and $dv = h'\,dx$, we get $du = 2y''\,dx$ and $v = h$, and we obtain

$$ \delta F(y, h) = \int_0^1 \big[ 2h(x) - 2y''(x) h(x) \big]\,dx + \big[ 2y'(x) h(x) \big]_{x=0}^{x=1}. \tag{6.28} $$

Since $h \in V_0$, we have $h(0) = 0$ and $h(1) = 0$, and it follows that the boundary term $[2y'(x) h(x)]_{x=0}^{x=1}$ is zero. Thus we get the convenient expression

$$ \delta F(y, h) = \int_0^1 (2 - 2y''(x)) h(x)\,dx. \tag{6.29} $$

First-order condition. We now make the observation that, if $y_* \in V$ is a local extremum of $F$, then it must satisfy

$$ \delta F(y_*, h) = \int_0^1 (2 - 2y_*''(x)) h(x)\,dx = 0, \quad \forall h \in V_0. \tag{6.30} $$

Note that all factors in the integrand are continuous, and that the integral must vanish for every choice of $h$ in $V_0$. As we will see, this condition will hold when and only when the factor multiplying $h(x)$ in the integrand vanishes throughout the integration interval, that is, $2 - 2y_*''(x) = 0$ for all $x \in [0,1]$. This equation, combined with the boundary conditions specified in $V$, gives a boundary-value problem for the function $y_*(x)$, namely

$$ 2 - 2y_*''(x) = 0, \quad y_*(0) = 0, \quad y_*(1) = 1, \quad 0 \le x \le 1. \tag{6.31} $$

These equations can be solved in the usual way, and we obtain the function $y_*(x) = \frac{1}{2}(x + x^2)$. This function is in $V$ and is the only candidate for a local extremum.

Second-order condition. If the sign of $\delta^2 F(y_*, h)$ can be determined, for any given $h$ in $V_0$, then more information on the candidate $y_*$ can be obtained. From (6.27), and the fact that $(h'(x))^2 \ge 0$ for all $x \in [0,1]$, we get the straightforward result that

$$ \delta^2 F(y_*, h) = \int_0^1 2(h'(x))^2\,dx \ge 0, \quad \forall h \in V_0. \tag{6.32} $$

This informs us that 𝑦∗ could be a local minimizer, but not a local maximizer. A further analysis is required to determine whether this candidate is an actual local minimizer in the 𝐶 𝑚 -norm for some 𝑚.
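The computations in Example 6.4.3 can be spot-checked numerically for one particular direction. The sketch below assumes the sample variation $h(x) = \sin(\pi x)$, which is a hypothetical choice satisfying $h(0) = h(1) = 0$; it evaluates the first variation (6.26) and second variation (6.27) at the candidate $y_*(x) = \frac{1}{2}(x + x^2)$ and compares $F(y_* + \varepsilon h) - F(y_*)$ with $\frac{1}{2}\varepsilon^2 \delta^2 F$.

```python
import math

def integrate(func, a, b, n=20000):
    """Composite midpoint rule for the integral of func over [a, b]."""
    step = (b - a) / n
    return step * sum(func(a + (i + 0.5) * step) for i in range(n))

def F(y, dy):
    """F(y) = integral over [0, 1] of 2 y + (y')^2."""
    return integrate(lambda x: 2 * y(x) + dy(x)**2, 0.0, 1.0)

y_star = lambda x: 0.5 * (x + x * x)      # candidate from (6.31)
dy_star = lambda x: 0.5 + x
h = lambda x: math.sin(math.pi * x)       # sample direction in V0 (assumed)
dh = lambda x: math.pi * math.cos(math.pi * x)

def F_varied(eps):
    """F(y* + eps*h)."""
    return F(lambda x: y_star(x) + eps * h(x),
             lambda x: dy_star(x) + eps * dh(x))

first_variation = integrate(lambda x: 2 * h(x) + 2 * dy_star(x) * dh(x), 0.0, 1.0)
second_variation = integrate(lambda x: 2 * dh(x)**2, 0.0, 1.0)

print("first variation :", first_variation)
print("second variation:", second_variation)
for eps in (-0.1, 0.1):
    print(eps, F_varied(eps) - F_varied(0.0), 0.5 * eps**2 * second_variation)
```

Because this $F$ is quadratic in $y$, the expansion $F(y_* + \varepsilon h) = F(y_*) + \varepsilon\,\delta F + \frac{1}{2}\varepsilon^2\,\delta^2 F$ is exact, so the last two columns should agree up to quadrature error. A check along a single direction does not by itself establish a local minimum.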


6.5. First-order problems

We consider the problem of finding local extrema for a functional $F : V \to \mathbb{R}$, where the set of functions is

$$ V = \{ y \in C^2[a,b] \mid y(a) = \alpha,\; y(b) = \beta \}, \tag{6.33} $$

the space of variations is

$$ V_0 = \{ h \in C^2[a,b] \mid h(a) = 0,\; h(b) = 0 \}, \tag{6.34} $$

and the functional is

$$ F(y) = \int_a^b L(x, y, y')\,dx. \tag{6.35} $$

Here $[a,b]$ is a given interval, $\alpha, \beta$ are given constants, and $L(x, y, y')$ is a given integrand. Unless indicated otherwise, we assume that $L$ is twice continuously differentiable for all $x \in [a,b]$, $y \in \mathbb{R}$ and $y' \in \mathbb{R}$. The integrand $L$ is called the Lagrangian for the functional.

The above problem is said to be of first-order type, since the functional $F$ involves derivatives of at most first order. Moreover, the problem is said to be of fixed-fixed type, since the functions in $V$ are fixed at both ends. The continuity requirements for the functions in $V$, and for the integrand $L$, ensure that the functional $F$ is finite for each input. They also ensure that the general necessary conditions in Result 6.4.1 can be rewritten in a local, pointwise form involving only continuous quantities. These continuity requirements can be relaxed, but at the expense of more complicated statements. The following result outlines some implications of the general necessary conditions, when specialized to the problem considered here.

Result 6.5.1. Let $F : V \to \mathbb{R}$ be defined as in (6.33)–(6.35). If $y_* \in V$ is a local minimizer of $F$ in the $C^m$-norm for some $m$, then

$$ \Big[ \frac{d}{d\varepsilon} F(y_* + \varepsilon h) \Big]_{\varepsilon=0} = 0, \quad \forall h \in V_0, \tag{6.36} $$

and

$$ \Big[ \frac{d^2}{d\varepsilon^2} F(y_* + \varepsilon h) \Big]_{\varepsilon=0} \ge 0, \quad \forall h \in V_0. \tag{6.37} $$

Condition (6.36) implies that $y_*$ must satisfy

$$ \begin{cases} \dfrac{\partial L}{\partial y}(x, y, y') - \dfrac{d}{dx}\Big[ \dfrac{\partial L}{\partial y'}(x, y, y') \Big] = 0, & a \le x \le b, \\[1ex] y(a) = \alpha, \quad y(b) = \beta. \end{cases} \tag{6.38} $$

Condition (6.37) implies that $y_*$ must also satisfy

$$ \frac{\partial^2 L}{\partial y' \partial y'}(x, y, y') \ge 0, \quad a \le x \le b. \tag{6.39} $$

For a local maximizer, change $\ge$ to $\le$ in conditions (6.37) and (6.39).


The conditions in (6.38) and (6.39) are pointwise in the sense that they must be satisfied at every point $x \in [a,b]$. The equations in (6.38) provide a boundary-value problem that every local extremum must satisfy; they are called the Euler–Lagrange equations. The differential equation in this boundary-value problem is at most second-order, and may be linear or nonlinear. The inequality in (6.39) is a further condition that must be satisfied; it is called the Legendre condition, and can be used to partially classify an extremum.

The conditions outlined above are necessary, but not sufficient. Thus these conditions can only be used to find candidates for extrema, and a separate analysis would be required to determine which candidates, if any, are actual extrema. Such an analysis can be straightforward in some problems, but can be significantly involved in others, and may require a number of additional technical results. As noted earlier in the discussion of Result 6.4.1, the above conditions would still not be sufficient even if the inequalities in (6.37) and (6.39) were made strict.

Following standard terminology, any solution of (6.38), and hence a candidate, is called an extremal. Note that the boundary-value problem in (6.38) may have one, none, or multiple solutions; hence a functional may have correspondingly many candidates for local extrema. The possibility of no or multiple solutions is an intrinsic feature of boundary-value problems. The issue with solutions normally does not arise when solving the differential equation itself. Specifically, the theory of initial-value problems guarantees, under mild conditions, that the differential equation will have a general solution involving arbitrary constants. (These constants reflect arbitrary initial conditions.) Instead, the issue normally arises when attempting to fit this general solution to boundary conditions specified at two distinct points.
There may be a unique set of constants that will fit such boundary conditions, or none, or many.

Sketch of proof: Result 6.5.1. We first discuss how (6.36) implies (6.38). This basic argument will be repeated in various forms when other types of problems are considered. To begin, consider any fixed $y \in V$ and $h \in V_0$. From the definition of $F$ in (6.35) we have

$$ F(y + \varepsilon h) = \int_a^b L(x, y + \varepsilon h, y' + \varepsilon h')\,dx. \tag{6.40} $$

Differentiating with respect to $\varepsilon$, and noting that the derivative can be taken inside the integral, and using the chain rule, we get

$$ \frac{d}{d\varepsilon} F(y + \varepsilon h) = \int_a^b \Big[ \frac{\partial L}{\partial y}(x, y + \varepsilon h, y' + \varepsilon h')\,h + \frac{\partial L}{\partial y'}(x, y + \varepsilon h, y' + \varepsilon h')\,h' \Big]\,dx. \tag{6.41} $$

Setting $\varepsilon = 0$ we obtain an expression for the first variation, namely

$$ \delta F(y, h) = \int_a^b \big[ g h + f h' \big]\,dx, \tag{6.42} $$

where for brevity we use the notation $g = \frac{\partial L}{\partial y}(x, y, y')$ and $f = \frac{\partial L}{\partial y'}(x, y, y')$. As before, we next write the above expression in a more useful form using the integration-by-parts formula $\int u\,dv = uv - \int v\,du$. Specifically, applying this formula to the term $\int f h'\,dx$, with $u = f$ and $dv = h'\,dx$, we get $du = f'\,dx$ and $v = h$, and we obtain

$$ \delta F(y, h) = \int_a^b \big[ g h - f' h \big]\,dx + \big[ f h \big]_{x=a}^{x=b}. \tag{6.43} $$

In the above, note that $f'$ means $\frac{df}{dx}$, or equivalently, $\frac{d}{dx}\big[ \frac{\partial L}{\partial y'}(x, y, y') \big]$. Since $h \in V_0$, we have $h(a) = 0$ and $h(b) = 0$, and it follows as before that the boundary term $[f h]_{x=a}^{x=b}$ is zero. Thus we get the expression

$$ \delta F(y, h) = \int_a^b (g - f') h\,dx. \tag{6.44} $$

We now observe that, if $y_* \in V$ is a local extremum, then the condition in (6.36) requires

$$ \delta F(y_*, h) = \int_a^b (g - f') h\,dx = 0, \quad \forall h \in V_0. \tag{6.45} $$

Note that all factors in the integrand are continuous, and that the integral must vanish for every choice of $h$ in $V_0$. By a result known as the fundamental lemma to be outlined later, this condition will hold when and only when the factor multiplying $h$ in the integrand vanishes throughout the integration interval, that is, $g - f' = 0$ for all $x \in [a,b]$. This equation, combined with the boundary conditions specified in $V$, gives the boundary-value problem in (6.38).

We next describe how (6.37) implies (6.39). Returning to (6.41), we differentiate a second time with respect to $\varepsilon$, and then set $\varepsilon = 0$ to obtain an expression for the second variation, namely

$$ \delta^2 F(y, h) = \int_a^b \big[ p (h')^2 + 2 q h h' + r h^2 \big]\,dx, \tag{6.46} $$

where for brevity we use the notation $p = \frac{\partial^2 L}{\partial y' \partial y'}(x, y, y')$, $q = \frac{\partial^2 L}{\partial y \partial y'}(x, y, y')$ and $r = \frac{\partial^2 L}{\partial y \partial y}(x, y, y')$. We now observe that, if $y_* \in V$ is a local extremum, for example a minimizer, then the condition in (6.37) requires

$$ \delta^2 F(y_*, h) = \int_a^b \big[ p (h')^2 + 2 q h h' + r h^2 \big]\,dx \ge 0, \quad \forall h \in V_0. \tag{6.47} $$

Similar to before, note that all factors in the integrand are continuous, and that the integral must be nonnegative for every choice of ℎ in V0 . By a result which we call the sign lemma to be outlined later, the above condition implies that the factor multiplying the highest derivatives of ℎ must be nonnegative throughout the integration interval, that is, 𝑝 ≥ 0 for all 𝑥 ∈ [𝑎, 𝑏]. This condition is the inequality stated in (6.39). The intuitive explanation is that functions ℎ can be chosen to localize the integrand around any point, and for such functions the higher derivatives will be much larger than the lower ones, and the term 𝑝(ℎ′ )2 will dominate.
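The equivalence of the three expressions for the first variation used in the proof sketch, namely the derivative of $F(y + \varepsilon h)$ at $\varepsilon = 0$, the form (6.42), and the integrated-by-parts form (6.44), can be illustrated numerically. The sketch below uses hypothetical data that do not appear in the text: the Lagrangian $L = y^2 + (y')^3$, the base function $y(x) = x^2$ on $[0,1]$, and the direction $h(x) = \sin(\pi x)$. The partial derivatives $g$, $f$ and $f'$ are worked out by hand for this choice.

```python
import math

def integrate(func, a, b, n=20000):
    """Composite midpoint rule for the integral of func over [a, b]."""
    step = (b - a) / n
    return step * sum(func(a + (i + 0.5) * step) for i in range(n))

# Hypothetical illustration data (not from the text).
L = lambda x, y, dy: y**2 + dy**3
y, dy = (lambda x: x**2), (lambda x: 2 * x)
h = lambda x: math.sin(math.pi * x)               # h(0) = h(1) = 0
dh = lambda x: math.pi * math.cos(math.pi * x)

g = lambda x: 2 * y(x)          # dL/dy along y:   2 y = 2 x^2
f = lambda x: 3 * dy(x)**2      # dL/dy' along y:  3 (y')^2 = 12 x^2
df = lambda x: 24 * x           # f' = d/dx (12 x^2)

def F(eps):
    """F(y + eps*h) on [0, 1]."""
    return integrate(lambda x: L(x, y(x) + eps * h(x), dy(x) + eps * dh(x)), 0.0, 1.0)

eps = 1e-4
finite_diff = (F(eps) - F(-eps)) / (2 * eps)                         # d/d(eps) at eps = 0
form_642 = integrate(lambda x: g(x) * h(x) + f(x) * dh(x), 0.0, 1.0)  # as in (6.42)
form_644 = integrate(lambda x: (g(x) - df(x)) * h(x), 0.0, 1.0)       # as in (6.44)

print(finite_diff, form_642, form_644)
```

The three numbers should agree to several digits. They are not zero here because the chosen $y$ is not an extremal of this hypothetical Lagrangian.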


Example 6.5.1. Consider the set $V = \{ y \in C^2[0,1] \mid y(0) = 0,\; y(1) = 1 \}$, and functional $F(y) = \int_0^1 \big[ 4x^2 y - (y')^2 \big]\,dx$. Here we find all extremals, partially classify them with the sign condition, and then determine if they are actual local extrema using the definition.

Extremals. The Lagrangian is $L(x, y, y') = 4x^2 y - (y')^2$, and its partial derivatives are $\partial L/\partial y = 4x^2$ and $\partial L/\partial y' = -2y'$. The differential equation to consider is $\frac{\partial L}{\partial y} - \frac{d}{dx}\big[ \frac{\partial L}{\partial y'} \big] = 0$, which becomes $4x^2 - \frac{d}{dx}[-2y'] = 0$, or equivalently $4x^2 + 2y'' = 0$, and the interval $a \le x \le b$ becomes $0 \le x \le 1$. In view of the boundary conditions in $V$, the boundary-value problem for an extremal is

$$ y'' + 2x^2 = 0, \quad y(0) = 0, \quad y(1) = 1, \quad 0 \le x \le 1. \tag{6.48} $$

The differential equation can be explicitly integrated, and the general solution is $y = -\frac{1}{6}x^4 + Cx + D$, where $C$ and $D$ are arbitrary constants. Applying the boundary conditions, we get $D = 0$ and $C = \frac{7}{6}$, and we obtain the unique extremal $y_* = -\frac{1}{6}x^4 + \frac{7}{6}x$. This is the only candidate for a local extremum.

Sign condition. Since $\frac{\partial L}{\partial y'}(x, y, y') = -2y'$, we get $\frac{\partial^2 L}{\partial y' \partial y'}(x, y, y') = -2$. Here the required second partial is a constant, but more generally it may depend on $x$, $y$, and $y'$. Substituting the extremal $y_*$ and its derivative $y_*'$ into this expression, we get the constant function $\frac{\partial^2 L}{\partial y' \partial y'}(x, y_*, y_*') \equiv -2$ for $x \in [0,1]$. The fact that this expression is $\le 0$ for all $x \in [0,1]$ informs us that $y_*$ could be a local maximizer, but not a local minimizer.

Analysis of candidate. To determine if $y_*$ is a local maximizer in the $C^m$-norm for some $m$, we attempt to verify the definition of a maximizer. To begin, let $m \le 2$ and $\delta > 0$ be given; we will adjust these as needed as we proceed. Consider any $u$ in the neighborhood $N_{C^m}(y_*, \delta)$ and let $h = u - y_*$. Note that $h$ is in $V_0$, since it is the difference of two functions in $V$. From the definition of $F$, and the fact that $u = y_* + h$, we have

$$ F(u) = \int_0^1 \big[ 4x^2 (y_* + h) - (y_*' + h')^2 \big]\,dx. \tag{6.49} $$

Expanding and grouping terms on the right-hand side, and again using the definition of $F$, we get

$$ F(u) = F(y_*) + \int_0^1 \big[ 4x^2 h - 2y_*' h' \big]\,dx - \int_0^1 (h')^2\,dx. \tag{6.50} $$

Using integration-by-parts on the term $\int -2y_*' h'\,dx$, and noting that $h(0) = 0$ and $h(1) = 0$ since $h \in V_0$, we get

$$ F(u) = F(y_*) + \int_0^1 (4x^2 + 2y_*'') h\,dx - \int_0^1 (h')^2\,dx. \tag{6.51} $$

From the differential equation in (6.48), we note that $4x^2 + 2y_*'' = 0$ for all $x \in [0,1]$. Using this observation, together with the fact that $-(h')^2 \le 0$ for all $x \in [0,1]$, we obtain the result that

$$ F(u) - F(y_*) = -\int_0^1 (h')^2\,dx \le 0, \quad \forall u \in N_{C^m}(y_*, \delta). \tag{6.52} $$

The above result shows that $y_*$ is a local maximizer for any $m \le 2$ and any $\delta > 0$; in fact, it is an absolute maximizer.
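The identity (6.52) can be spot-checked numerically. The sketch below assumes the sample variations $h(x) = A \sin(\pi x)$ for a few amplitudes $A$, which are hypothetical choices in $V_0$; for these, $\int_0^1 (h')^2\,dx = \frac{1}{2} A^2 \pi^2$, so the gap $F(y_* + h) - F(y_*)$ should equal $-\frac{1}{2} A^2 \pi^2$ whether the amplitude is small or large.

```python
import math

def integrate(func, a, b, n=20000):
    """Composite midpoint rule for the integral of func over [a, b]."""
    step = (b - a) / n
    return step * sum(func(a + (i + 0.5) * step) for i in range(n))

def F(y, dy):
    """F(y) = integral over [0, 1] of 4 x^2 y - (y')^2."""
    return integrate(lambda x: 4 * x**2 * y(x) - dy(x)**2, 0.0, 1.0)

y_star = lambda x: -x**4 / 6 + 7 * x / 6      # extremal from (6.48)
dy_star = lambda x: -2 * x**3 / 3 + 7 / 6

def gap(amp):
    """F(y* + h) - F(y*) for the sample variation h(x) = amp * sin(pi x)."""
    u = lambda x: y_star(x) + amp * math.sin(math.pi * x)
    du = lambda x: dy_star(x) + amp * math.pi * math.cos(math.pi * x)
    return F(u, du) - F(y_star, dy_star)

for amp in (0.01, 1.0, 5.0):
    print(amp, gap(amp), -0.5 * (amp * math.pi)**2)
```

The gap stays negative even for the large amplitude, consistent with $y_*$ being an absolute maximizer rather than only a local one.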

6.6. Simplifications, essential results

The Euler–Lagrange (E–L) differential equation in (6.38) may be difficult to solve depending on the Lagrangian. Normally, we expect that the equation will have a second-order form, and that its general solution will have two arbitrary constants. In the special cases outlined below, it is possible to express the equation in a simplified or reduced first-order form, which may be easier to solve when the original form is nonlinear. The reduced form is called a first integral of the Euler–Lagrange equation. Note that the reduced form will contain one arbitrary constant, and the process of solving it will introduce the second arbitrary constant.

Result 6.6.1. [reduced forms of E–L equation] Let $L = L(x, y, y')$ be the Lagrangian function for the Euler–Lagrange differential equation in (6.38).

(1) If $L$ is independent of $y$, so $L = L(x, y')$, then every solution $y \in C^2[a,b]$ of the Euler–Lagrange equation must satisfy

$$ \frac{\partial L}{\partial y'}(x, y') = A, \tag{6.53} $$

where $A$ is an arbitrary constant.

(2) If $L$ is independent of $x$, so $L = L(y, y')$, then every solution $y \in C^2[a,b]$ of the Euler–Lagrange equation must satisfy

$$ L(y, y') - y' \frac{\partial L}{\partial y'}(y, y') = A, \tag{6.54} $$

where $A$ is an arbitrary constant.

Sketch of proof: Result 6.6.1. The results follow from simple manipulation of the differential equation in (6.38). In the first case, when $L = L(x, y')$, we find that $\frac{\partial L}{\partial y} = 0$, and the equation in (6.38) becomes $-\frac{d}{dx}\big[ \frac{\partial L}{\partial y'} \big] = 0$. This equation can now be integrated with respect to $x$, and we obtain $\frac{\partial L}{\partial y'} = A$, where $A$ is an arbitrary constant. In the second case, when $L = L(y, y')$, we can multiply the equation in (6.38) by $y'$ to obtain

$$ 0 = y' \Big( \frac{\partial L}{\partial y} - \frac{d}{dx}\Big[ \frac{\partial L}{\partial y'} \Big] \Big) = y' \frac{\partial L}{\partial y} - y' \frac{d}{dx}\Big[ \frac{\partial L}{\partial y'} \Big]. \tag{6.55} $$

Using the chain rule to expand $\frac{d}{dx}\big[ \frac{\partial L}{\partial y'} \big]$, and noting $\frac{d}{dx} y = y'$ and $\frac{d}{dx} y' = y''$, we obtain

$$ 0 = y' \frac{\partial L}{\partial y} - y' y' \frac{\partial^2 L}{\partial y \partial y'} - y' y'' \frac{\partial^2 L}{\partial y' \partial y'}. \tag{6.56} $$


Adding and subtracting the term $y'' \frac{\partial L}{\partial y'}$, we find that the right-hand side may be written in a simplified way, namely

$$ 0 = y' \frac{\partial L}{\partial y} + y'' \frac{\partial L}{\partial y'} - y'' \frac{\partial L}{\partial y'} - y' y' \frac{\partial^2 L}{\partial y \partial y'} - y' y'' \frac{\partial^2 L}{\partial y' \partial y'} = \frac{d}{dx}\Big[ L - y' \frac{\partial L}{\partial y'} \Big]. \tag{6.57} $$

Similar to before, the equation $\frac{d}{dx}\big[ L - y' \frac{\partial L}{\partial y'} \big] = 0$ can be integrated with respect to $x$, and we obtain $L - y' \frac{\partial L}{\partial y'} = A$, where $A$ is an arbitrary constant.

The next two examples illustrate the different forms that the E–L differential equation may take depending on the Lagrangian. The case $L = L(x, y')$ is considered in the first example, and $L = L(y, y')$ in the second. Note that the two cases are not mutually exclusive, and the simpler of the two reduced forms can be chosen when applicable.

Example 6.6.1. Consider $L(x, y, y') = \sqrt{1 + (y')^2}$. Working with the original form of the equation $\frac{\partial L}{\partial y} - \frac{d}{dx}\big[ \frac{\partial L}{\partial y'} \big] = 0$, and expanding the derivatives, we get

$$ \frac{y'' (y')^2}{(1 + (y')^2)^{3/2}} - \frac{y''}{(1 + (y')^2)^{1/2}} = 0. \tag{6.58} $$

We could attempt to simplify the above second-order differential equation and eventually find a general solution. Alternatively, since $L$ is independent of $y$, we may consider the reduced form of the equation, which is $\frac{\partial L}{\partial y'} = A$, where $A$ is a constant. Using the expression for $L$, we get the reduced equation

$$ \frac{y'}{\sqrt{1 + (y')^2}} = A. \tag{6.59} $$

The above equation can be rearranged to get $y' = \pm \frac{A}{\sqrt{1 - A^2}}$, or more simply $y' = B$, where $B$ is a constant. This equation can now be explicitly integrated and we get the general solution $y = Bx + C$, where $B$ and $C$ are arbitrary constants.

Example 6.6.2. Consider $L(x, y, y') = \frac{(y')^2}{1 + y^2}$. Working with the original form of the equation $\frac{\partial L}{\partial y} - \frac{d}{dx}\big[ \frac{\partial L}{\partial y'} \big] = 0$, and expanding the derivatives, we get

$$ \frac{2y (y')^2}{(1 + y^2)^2} - \frac{2y''}{1 + y^2} = 0. \tag{6.60} $$

As before, we could attempt to simplify the above second-order differential equation and find a general solution. However, since $L$ is independent of $x$, we may consider the reduced form of the equation, which is $L - y' \frac{\partial L}{\partial y'} = A$, where $A$ is a constant. Using the expression for $L$, we get the reduced equation

$$ -\frac{(y')^2}{1 + y^2} = A. \tag{6.61} $$

The above equation informs us that 𝐴 ≤ 0. For convenience, let 𝐴 = −𝐵 2 , where 𝐵 is a constant. Then the above equation can be rearranged to get 𝑦′ = ±𝐵√1 + 𝑦2 , or more simply 𝑦′ = 𝐶√1 + 𝑦2 , where 𝐶 = ±𝐵 is a constant. This is a first-order equation that

160

6. Calculus of variations

can be solved using separation of variables. Specifically, using the notation of 𝑦′ , and with the help of a table of integrals, we get (6.62)

∫

𝑑𝑦 √1 +

𝑦2

= ∫ 𝐶 𝑑𝑥

which gives

𝑑𝑦 𝑑𝑥

in place

−1

sinh (𝑦) = 𝐶𝑥 + 𝐷,

−1

where sinh (𝑦) is the inverse hyperbolic sine function and 𝐷 is a constant. Thus the general solution is 𝑦 = sinh(𝐶𝑥 + 𝐷), where 𝐶 and 𝐷 are arbitrary constants. In the sections that follow, an associated Euler–Lagrange equation will be derived for different types of problems. In all cases, the derivation will rely on a result known as the fundamental lemma of the calculus of variations, as mentioned earlier. Here we outline a version of this lemma that will be useful for a number of problems. Result 6.6.2. [fundamental lemma] Let integers 𝑛 ≥ 𝜇 ≥ 0 and an interval [𝑎, 𝑏] be 𝑏 given. If a function 𝑤 ∈ 𝐶 0 [𝑎, 𝑏] satisfies ∫𝑎 𝑤(𝑥)ℎ(𝑥) 𝑑𝑥 = 0 for all ℎ ∈ 𝐶 𝑛 [𝑎, 𝑏], where ℎ(𝑘) (𝑎) = 0 and ℎ(𝑘) (𝑏) = 0 for 𝑘 = 0, . . . , 𝜇, then 𝑤(𝑥) ≡ 0 for all 𝑥 ∈ [𝑎, 𝑏]. Sketch of proof: Result 6.6.2. The result follows from a straightforward argument based on continuity. For contradiction, suppose that 𝑤(𝑥# ) ≠ 0 for some 𝑥# ∈ (𝑎, 𝑏), say 𝑤(𝑥# ) > 0. Then, by continuity, there exists a number 𝛿 > 0 such that 𝑤(𝑥) > 0 for all 𝑥 ∈ (𝑥# − 𝛿, 𝑥# + 𝛿) ⊂ (𝑎, 𝑏). Also, we can explicitly construct a so-called bump or test function ℎ ∈ 𝐶 𝑛 [𝑎, 𝑏], with ℎ(𝑥) > 0 for all 𝑥 ∈ (𝑥# − 𝛿, 𝑥# + 𝛿), and with ℎ(𝑥) = 0 for all 𝑥 ∉ (𝑥# − 𝛿, 𝑥# + 𝛿), as illustrated in Figure 6.8. Note that such a function will satisfy ℎ(𝑘) (𝑎) = 0 and ℎ(𝑘) (𝑏) = 0 for all 𝑘 ≥ 0, and will visibly have the form of a bump, with zero segments on both ends. For such a function the product 𝑤(𝑥)ℎ(𝑥) h h(x)

Figure 6.8. [Graph of a bump function $h(x)$ over $[a, b]$: positive on $(x_\# - \delta, x_\# + \delta)$ and zero elsewhere.]

will be positive when $x \in (x_\# - \delta, x_\# + \delta)$, and zero otherwise, and we get

\[ \int_a^b w(x) h(x) \, dx = \int_{x_\# - \delta}^{x_\# + \delta} w(x) h(x) \, dx > 0. \tag{6.63} \]

But this contradicts the assumption that $\int_a^b w(x) h(x) \, dx = 0$, and a similar argument can be made if $w(x_\#) < 0$. Hence we must have $w(x_\#) = 0$ for all $x_\# \in (a, b)$. Moreover, since $w$ is continuous on $[a, b]$, it must also be zero at the end points $a$ and $b$.

In addition to an Euler–Lagrange equation, an associated Legendre condition will also be stated for different types of problems. In all cases, this condition will follow from a sign lemma as mentioned earlier. Here we outline a version of this lemma that will be applicable to various problems.


Result 6.6.3. [sign lemma] Let integers $n \ge \nu \ge \mu \ge 0$, an interval $[a, b]$ and functions $\phi_{jk} \in C^0[a, b]$ for $j, k = 0, \ldots, \nu$ be given, and consider

\[ I(h) = \int_a^b \sum_{j=0}^{\nu} \sum_{k=0}^{\nu} \phi_{jk} \, h^{(j)} h^{(k)} \, dx. \tag{6.64} \]

If $I(h) \ge 0$ for all $h \in C^n[a, b]$, where $h^{(k)}(a) = 0$ and $h^{(k)}(b) = 0$ for $k = 0, \ldots, \mu$, then $\phi_{\nu\nu} \ge 0$ for all $x \in [a, b]$. Similarly, if $I(h) \le 0$, then $\phi_{\nu\nu} \le 0$.

Sketch of proof: Result 6.6.3. The result follows from continuity similar to before. Since the case $\nu = 0$ is straightforward, we assume $\nu \ge 1$. To begin, assume $I(h) \ge 0$ for all $h$ as described, and for contradiction suppose $\phi_{\nu\nu}(x_\#) < 0$ for some $x_\# \in (a, b)$, and let $\lambda = -\phi_{\nu\nu}(x_\#)$. Then, by continuity, there exists a number $\delta > 0$ such that $\phi_{\nu\nu}(x) < -\frac{1}{2}\lambda$ for all $x \in (x_\# - \delta, x_\# + \delta) \subset (a, b)$. Next, let $m \ge n + 1$ and $\beta \ge 1$ be odd integers, where $m$ is fixed and $\beta$ is arbitrary, and consider the piecewise-defined function

\[ h(x) = \begin{cases} \dfrac{1}{\beta^{\nu}} \cos^m\!\left[ \dfrac{\beta\pi(x - x_\#)}{2\delta} \right], & x \in (x_\# - \delta, x_\# + \delta), \\[1ex] 0, & x \notin (x_\# - \delta, x_\# + \delta). \end{cases} \tag{6.65} \]

By design, we have $h \in C^n[a, b]$, and also $h^{(k)}(a) = 0$ and $h^{(k)}(b) = 0$ for all $k$. Moreover, for $k \le \nu - 1$, the derivatives have the property that $\lim_{\beta \to \infty} h^{(k)}(x) = 0$ uniformly for $x \in [a, b]$. Crucially, the derivative $h^{(\nu)}(x)$ is bounded and does not vanish as $\beta \to \infty$, but instead has the property that $\gamma = \int_a^b (h^{(\nu)})^2 \, dx$ is a constant independent of $\beta$. These observations follow from a trigonometric reduction formula, which for $m$ odd, states $\cos^m\theta = \sum_{\sigma} C_\sigma \cos(\sigma\theta)$, where the sum extends over odd integers $\sigma = 1, 3, \ldots, m$, and $C_\sigma$ are positive constants. From this it follows that each derivative $h^{(k)}$ is a sum of the form $\pm\beta^{k-\nu} (\frac{\pi}{2\delta})^k \sum_{\sigma} \sigma^k C_\sigma \cos(\sigma\theta)$ or $\pm\beta^{k-\nu} (\frac{\pi}{2\delta})^k \sum_{\sigma} \sigma^k C_\sigma \sin(\sigma\theta)$, where $\theta = \frac{\beta\pi(x - x_\#)}{2\delta}$, and the integral of $(h^{(\nu)})^2$ can be characterized explicitly to get $\gamma = \delta (\frac{\pi}{2\delta})^{2\nu} \sum_{\sigma} \sigma^{2\nu} C_\sigma^2$.

Finally, let $R(h) = I(h) - \int_a^b \phi_{\nu\nu} (h^{(\nu)})^2 \, dx$. Since the functions $\phi_{jk}$ are all continuous, and $h^{(k)} \to 0$ for $k \le \nu - 1$, and $h^{(\nu)}$ is bounded, it follows that $R(h) \to 0$ as $\beta \to \infty$. Moreover, since $\phi_{\nu\nu}(x) < -\frac{1}{2}\lambda$ in $(x_\# - \delta, x_\# + \delta)$ and $h^{(\nu)}$ vanishes outside this interval, we have

\[ I(h) = R(h) + \int_a^b \phi_{\nu\nu} (h^{(\nu)})^2 \, dx < R(h) - \frac{1}{2}\lambda\gamma. \tag{6.66} \]

For sufficiently large $\beta$, the above implies that $I(h) < 0$, which contradicts the assumption that $I(h) \ge 0$. Hence we must have $\phi_{\nu\nu}(x_\#) \ge 0$ for all $x_\# \in (a, b)$. Moreover, since $\phi_{\nu\nu}$ is continuous on $[a, b]$, it must also be nonnegative at the end points $a$ and $b$. Note that a similar conclusion can be reached under the assumption that $I(h) \le 0$.
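The claim that $\gamma$ does not depend on $\beta$ can be checked numerically in the simplest nontrivial case. The sketch below assumes $\nu = 1$ and $m = 3$, for which $\cos^3\theta = \frac{3}{4}\cos\theta + \frac{1}{4}\cos 3\theta$, so that $C_1 = \frac{3}{4}$ and $C_3 = \frac{1}{4}$. The function names, the midpoint-rule quadrature, and the sample values of $x_\#$ and $\delta$ are illustrative choices and are not taken from the text.

```python
import math

def h_prime(x, x0, delta, beta):
    """Derivative of h(x) = (1/beta) * cos^3(theta), theta = beta*pi*(x - x0)/(2*delta).

    Assumes nu = 1 and m = 3; h and h' vanish outside (x0 - delta, x0 + delta).
    """
    if abs(x - x0) >= delta:
        return 0.0
    theta = beta * math.pi * (x - x0) / (2.0 * delta)
    # Chain rule: the factor beta from d(theta)/dx cancels the prefactor 1/beta.
    return -3.0 * (math.pi / (2.0 * delta)) * math.cos(theta) ** 2 * math.sin(theta)

def gamma_numeric(x0, delta, beta, n=20000):
    """Midpoint-rule approximation of the integral of (h')^2 over the bump interval."""
    dx = 2.0 * delta / n
    total = 0.0
    for i in range(n):
        x = x0 - delta + (i + 0.5) * dx
        total += h_prime(x, x0, delta, beta) ** 2
    return total * dx

def gamma_formula(delta):
    """gamma = delta * (pi/(2*delta))^2 * (1^2 * C_1^2 + 3^2 * C_3^2) for nu = 1, m = 3."""
    return delta * (math.pi / (2.0 * delta)) ** 2 * (1 * 0.75 ** 2 + 9 * 0.25 ** 2)

if __name__ == "__main__":
    for beta in (1, 3, 7, 21):  # odd integers, as required in the proof sketch
        print(beta, gamma_numeric(0.5, 0.1, beta), gamma_formula(0.1))
```

For each odd $\beta$ the numerical value should agree with the closed-form expression, which contains no $\beta$.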

6.7. Case study

Setup. To illustrate the preceding results we study a problem in the design of a playground slide. We consider a slide in two dimensions as illustrated in Figure 6.9, where the initial point is on the left at a given height ℎ above ground level, and the terminal point is on the right at a given distance ℓ along ground level, with gravitational acceleration 𝑔 oriented vertically downwards. Beginning from the initial point with an initial

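Figure 6.9. [Slide profile curve $y(x)$ in the $(x, y)$ plane, starting at height $h$ with initial speed $c$ and ending at ground level, with gravity $g$ acting downward.]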

speed $c$, a user of the slide will get accelerated and transported to the terminal point under the influence of gravity. Normally, the quicker the descent, the greater the thrill of the ride. Here we seek the shape of the slide that will optimize this thrill. Specifically, we seek the profile curve $y(x)$ that will provide the quickest descent, or equivalently, minimize the travel time $T$ down the slide, where $x$ and $y$ are coordinates as shown, and $h$, $\ell$, $g$ and $c$ are given positive constants.

In the idealized case, when the user is modeled as a mass particle which never breaks contact with the slide, and the sliding motion is planar and occurs without friction or air resistance, the above problem is equivalent to the classic brachistochrone problem, which was one of the earliest problems studied in the calculus of variations. Although various generalizations could be considered, which for example include friction and other external forces, as well as rotational motion of the mass in addition to sliding, along profile curves in three-dimensional space instead of two, we focus on the idealized version described here.

Travel time. We represent a user of the slide as a particle of arbitrary mass $m > 0$. We suppose that the particle is at the point $(x, y) = (0, h)$ at time $t = 0$, and arrives at the point $(x, y) = (\ell, 0)$ at time $t = T$, so that, by definition, $T$ is the travel time. At any instant during the motion, the particle has a position $(x, y)$ and a velocity $(\frac{dx}{dt}, \frac{dy}{dt})$. Since the particle can only move along the slide, we have $y = y(x)$, and the chain rule implies $\frac{dy}{dt} = \frac{dy}{dx} \frac{dx}{dt}$. Using the convenient notation $u = \frac{dx}{dt}$ and $v = \frac{dy}{dt}$, this chain rule relation becomes $v = \frac{dy}{dx} u$.

Assuming no friction or air resistance, conservation of energy requires that the total energy of the particle at any time $t \ge 0$ must be equal to that at $t = 0$. Considering kinetic and potential energy, and using ground level as the reference for potential, we have

\[ \frac{1}{2} m (u^2 + v^2) + m g y = \frac{1}{2} m (u_0^2 + v_0^2) + m g y_0, \tag{6.67} \]

where $u_0^2 + v_0^2 = c^2$ and $y_0 = h$ are the squared speed and height at time $t = 0$. After eliminating the arbitrary mass $m$, and substituting the relation $v = \frac{dy}{dx} u$, we obtain, after slight simplification,

\[ u^2 = \frac{2g(k - y)}{1 + (y')^2}, \tag{6.68} \]

where $k = \frac{c^2}{2g} + h > 0$ is a constant. Here we use the notation $y' = \frac{dy}{dx}$. For future reference, we note that $k$ represents an upper bound for the particle height $y$, that would be attained if all kinetic energy were converted to potential.

Assuming $\frac{dx}{dt} = u > 0$ throughout the travel, so that the particle is always moving to the right, we get the useful result that

\[ \frac{dx}{dt} = \sqrt{2g} \left[ \frac{k - y}{1 + (y')^2} \right]^{1/2}. \tag{6.69} \]

An integral expression for the travel time $T$ in terms of the profile curve $y(x)$ can now be written. Specifically, beginning from a simple integral identity, and performing a change of variable using the above expression for $\frac{dx}{dt}$, we get

\[ T = \int_{t=0}^{t=T} dt = \int_{x=0}^{x=\ell} \frac{dt}{dx} \, dx = \frac{1}{\sqrt{2g}} \int_0^{\ell} \left[ \frac{1 + (y')^2}{k - y} \right]^{1/2} dx. \tag{6.70} \]
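To make the travel-time integral concrete, the following Python sketch evaluates (6.70) numerically for a given profile and compares it with an independent check for a straight slide. On a straight ramp of length $S = \sqrt{\ell^2 + h^2}$ the particle has constant acceleration $gh/S$, which gives $T = S(\sqrt{c^2 + 2gh} - c)/(gh)$. The function names, the midpoint rule, and the sample parameter values are illustrative assumptions, not part of the text.

```python
import math

def travel_time(y, dy, ell, h, g, c, n=20000):
    """Midpoint-rule approximation of the travel time integral (6.70).

    y and dy are callables returning the profile y(x) and its slope y'(x).
    Assumes c > 0 so that k - y(x) stays positive on [0, ell].
    """
    k = c * c / (2.0 * g) + h
    dx = ell / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * dx
        total += math.sqrt((1.0 + dy(x) ** 2) / (k - y(x)))
    return total * dx / math.sqrt(2.0 * g)

def straight_line_time(ell, h, g, c):
    """Travel time on a straight ramp from constant-acceleration kinematics."""
    S = math.hypot(ell, h)  # ramp length
    return S * (math.sqrt(c * c + 2.0 * g * h) - c) / (g * h)

if __name__ == "__main__":
    ell, h, g, c = 4.0, 2.0, 9.81, 1.0  # sample values only
    T_num = travel_time(lambda x: h * (1.0 - x / ell), lambda x: -h / ell, ell, h, g, c)
    print(T_num, straight_line_time(ell, h, g, c))
```

The same `travel_time` function can be applied to any other trial profile in order to compare candidate slide shapes.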

Restated problem. To state the problem in the notation of this chapter, we consider finding minimizers for a functional $F : V \to \mathbb{R}$ given by

\[ F(y) = \frac{1}{\sqrt{2g}} \int_0^{\ell} L(x, y, y') \, dx, \tag{6.71} \]

where $L(x, y, y')$ is the integrand defined in (6.70). Note that, since the constant factor outside of the integral can be eliminated from the Euler–Lagrange equation, there is no need to include it with the integrand. Moreover, based on the first-order form of the functional, and the fact that the initial and terminal points of the slide are prescribed, we seek minimizers among the set of functions

\[ V = \{ y \in C^2[0, \ell] \mid y(0) = h, \ y(\ell) = 0, \ y(x) < k \}. \tag{6.72} \]

The additional, upper bound condition $y(x) < k$ for all $x \in [0, \ell]$ is a consequence of energy conservation. Specifically, in view of the initial and terminal points, the particle height $y$ could never reach or exceed $k$ in any motion from one end of the slide to the other, and consequently, the integrand $L$ is real and finite only when the bound is satisfied. (The case of zero initial speed is special and not considered here; in this case, we would have $c = 0$ and $k = h$, so that the upper bound would be attained at the initial point, and the integrand must then be allowed to become infinite.) In our case, the upper bound will play no active role in our developments; we can simply verify that it is satisfied at the end of our analysis. Observe that any candidate minimizer would favor a larger separation between $y$ and $k$, since a smaller separation would tend to increase the integrand and thus the travel time.

System of equations. Every extremal of $F$ in $V$ must satisfy the system of equations in (6.38). In view of the integrand $L$ in (6.70), we note that the general form of the Euler–Lagrange equation $\frac{\partial L}{\partial y} - \frac{d}{dx} [\frac{\partial L}{\partial y'}] = 0$ will be tedious. However, since $L$ is independent of $x$, we may instead consider the reduced form of the equation given in Result 6.6.1, which is $L - y' \frac{\partial L}{\partial y'} = A$, where $A$ is a constant. Using the expression for $L$, we get, after some simplification,

\[ [1 + (y')^2]^{-1/2} = A [k - y]^{1/2}. \tag{6.73} \]

The above equation informs us that $A > 0$ since the two factors in brackets are positive. By rearranging the equation we get $1 + (y')^2 = \frac{1}{A^2 (k - y)}$, or more simply $(y')^2 = \frac{B}{(k - y)} - 1$, where $B = \frac{1}{A^2}$. Noting that $y' = \frac{dy}{dx}$, and including boundary conditions, we find that every extremal must satisfy the system

\[ \left( \frac{dy}{dx} \right)^2 = \frac{B - (k - y)}{(k - y)}, \quad y(0) = h, \quad y(\ell) = 0, \quad 0 \le x \le \ell, \tag{6.74} \]

where $k > 0$ is a given constant, and $B > 0$ is an arbitrary constant.

The differential equation. For convenience in constructing the general solution of the differential equation, we consider an alternative description of a solution curve. Specifically, instead of the cartesian description $y = y(x)$, $0 \le x \le \ell$, we consider a general parametric description $x = f(s)$, $y = g(s)$, $a \le s \le b$. Here $s$ is an arbitrary parameter along the curve, $f(s)$ and $g(s)$ are arbitrary functions, and $[a, b]$ is an arbitrary interval. We may suppose that $\frac{dx}{ds} > 0$ so that the curve is traced left to right. The fact that a parametric description of a solution curve involves two arbitrary functions is advantageous: we can choose one of the functions $f(s)$ or $g(s)$ to simplify the differential equation, and then solve for the other.

To illustrate the parametric approach, we substitute the calculus relation $\frac{dy}{dx} = \frac{dy}{ds} / \frac{dx}{ds}$ into the differential equation in (6.74), and then rearrange terms to separate $\frac{dy}{ds}$ and $\frac{dx}{ds}$, which gives

\[ \frac{(k - y)}{B - (k - y)} \left( \frac{dy}{ds} \right)^2 = \left( \frac{dx}{ds} \right)^2. \tag{6.75} \]

We may now choose $y = g(s)$ to simplify the left-hand side of the equation. Any choice can be made, provided that it leads to a curve for which $\frac{dy}{dx}$ and $\frac{d^2 y}{dx^2}$ are defined and continuous, which can be verified in the end. Motivated by the form of the quotient above, we let $k - y = B \sin^2(s)$. This corresponds to $y = k - B \sin^2(s)$, which gives $\frac{dy}{ds} = -2B \sin(s) \cos(s)$, and using the identity $1 - \sin^2(s) = \cos^2(s)$, we get

\[ \left( \frac{dx}{ds} \right)^2 = 4B^2 \sin^4(s). \tag{6.76} \]

The above equation implies $\frac{dx}{ds} = \pm 2B \sin^2(s)$. By choice, we orient the curve so that $\frac{dx}{ds} > 0$, which corresponds to the positive case $\frac{dx}{ds} = 2B \sin^2(s)$. This equation can now be explicitly integrated to obtain $x = E + B[s - \frac{1}{2} \sin(2s)]$, where $E$ is a constant. Thus, in parametric form, the general solution curve of the differential equation is

\[ x = E + B\left[s - \tfrac{1}{2} \sin(2s)\right], \quad y = k - B \sin^2(s), \quad a \le s \le b. \tag{6.77} \]

Although not possible here, we could now attempt to eliminate the parameter 𝑠, and thereby obtain a cartesian description of the curve involving only 𝑥 and 𝑦.


The curve in (6.77) can be written in a simpler, more symmetric form. Specifically, using the substitutions $\sigma = 2s$, $\alpha = 2a$, $\beta = 2b$ and $D = \frac{B}{2}$, we obtain

\[ x = E + D[\sigma - \sin\sigma], \quad y = k - D[1 - \cos\sigma], \quad \alpha \le \sigma \le \beta. \tag{6.78} \]

Here $k > 0$ is a given constant, while $D > 0$, $E$, $\alpha$, $\beta$ are arbitrary constants. Because of periodicity, and to obtain a curve for which $\frac{dy}{dx}$ and $\frac{d^2 y}{dx^2}$ are defined and continuous, we restrict attention to the interval $0 < \alpha < \beta < 2\pi$. Specifically, an inspection of (6.78) reveals that a vertical tangent or cusp with $\frac{dy}{dx} = \pm\infty$ occurs when $\frac{dx}{d\sigma} = 0$, or equivalently when $\sigma$ is an integer multiple of $2\pi$, and thus we must avoid such points.

The general curve defined above is called a cycloid curve. A description of this curve can be found in many texts on elementary calculus, and it has an interesting geometrical characterization that is independent of the slide problem considered here: as a circular wheel is rolled along a flat surface, a point on the perimeter of the wheel will trace out such a curve, up to a congruence. Although the cycloid given in (6.78) is the general solution of the Euler–Lagrange equation, we have not yet found an extremal. We must now consider if values of the arbitrary constants can be found to satisfy the boundary conditions.

The boundary conditions. In view of the general curve in (6.78), the boundary conditions in (6.74) require that $(x, y) = (0, h)$ when $\sigma = \alpha$, and also $(x, y) = (\ell, 0)$ when $\sigma = \beta$. By writing each condition for $x$ and $y$ separately, we obtain

\[ \begin{aligned} E + D[\alpha - \sin\alpha] &= 0, & k - D[1 - \cos\alpha] &= h, \\ E + D[\beta - \sin\beta] &= \ell, & k - D[1 - \cos\beta] &= 0. \end{aligned} \tag{6.79} \]

Thus the boundary conditions yield four equations for four unknown constants $D$, $E$, $\alpha$, $\beta$. Note that the determination of solutions is nontrivial due to the nonlinear form of the equations. The existence or not of an extremal, and the possibility of multiple extremals, ultimately relies on our ability to characterize solutions of these equations for any given values of the slide design parameters $h$, $\ell$, $g$, $c$, where $k = \frac{c^2}{2g} + h$.

Summary of results. Here we summarize two results about extremals. We give a hint of the proof of the first result, and note that complete proofs of both results are tedious and outside the scope considered here.

Slide result 1. For any positive design parameters $h$, $\ell$, $g$, $c$, the equations in (6.79) have a unique solution $D$, $E$, $\alpha$, $\beta$, under the restrictions $D > 0$ and $0 < \alpha < \beta < 2\pi$. Thus there is a unique extremal; it is a cycloid curve.

Slide result 2. The unique cycloid extremal is the absolute minimizer of the functional $F$ in the set $V$; no other curve provides a quicker travel time.

Note that the minimizing curve is a specific arc of a cycloid, where the arc is defined by the interval $[\alpha, \beta]$. Depending on the design parameters, this interval may contain the point $\sigma = \pi$ in its interior, which affects the qualitative properties of the curve as shown in Figure 6.10. Specifically, when $\pi \notin (\alpha, \beta)$, the minimizing curve that provides the quickest travel time is strictly downhill from the initial point to the terminal point, as intuitively expected. However, when $\pi \in (\alpha, \beta)$, the minimizing curve will extend below ground level and contain an uphill portion as illustrated; although the

Figure 6.10. [Two panels showing the minimizing cycloid arc in the $(x, y)$ plane: one for $\pi$ not in $(\alpha, \beta)$, and one for $\pi$ in $(\alpha, \beta)$.]

mass particle must travel uphill for some distance, this curve will nonetheless provide the quickest travel time. Interestingly, the physical construction of such a slide would require some digging!

Sketch of proof for result 1. We suppose that positive constants $h$, $\ell$, $g$, $c$ are given, and seek constants $D$, $E$, $\alpha$, $\beta$ that satisfy (6.79), where $D > 0$ and $0 < \alpha < \beta < 2\pi$. To get a glimpse of the basic ideas, we consider the special limiting case when $c = 0$, which requires the relaxed restrictions $0 \le \alpha < \beta < 2\pi$. Although this limiting case is excluded in the results outlined above, it leads to a simpler analysis, which serves as a guide for the general case.

The assumption that $c = 0$ implies the simple and useful result that $k = h$. From the second equation in (6.79) we get $D[1 - \cos\alpha] = 0$, and using the restrictions that $D > 0$ and $0 \le \alpha < 2\pi$, we deduce that $\alpha = 0$. Substituting this result into the first equation in (6.79) then implies $E = 0$. Since $\beta - \sin\beta > 0$ for all $0 < \beta < 2\pi$, the third equation in (6.79) gives $D = \ell / [\beta - \sin\beta]$, which can then be substituted into the fourth equation, and we obtain

\[ \alpha = 0, \quad E = 0, \quad D = \frac{\ell}{\beta - \sin\beta}, \quad \frac{h}{\ell} = \frac{1 - \cos\beta}{\beta - \sin\beta}. \tag{6.80} \]

We now observe that the single-variable function $\phi(\beta) = \frac{1 - \cos\beta}{\beta - \sin\beta}$ has a graph that is monotone decreasing on the interval $0 < \beta < 2\pi$, and has the property that $\lim_{\beta \to 0^+} \phi(\beta) = \infty$ and $\lim_{\beta \to 2\pi^-} \phi(\beta) = 0$. Thus for any given $h > 0$ and $\ell > 0$ there is a unique $\beta_\# \in (0, 2\pi)$ that satisfies the root equation $\phi(\beta_\#) = \frac{h}{\ell}$. Note that there is no simple expression for this root, and hence it must be found numerically. Once this root is known, we have $\beta = \beta_\#$ and $D = \ell / [\beta_\# - \sin\beta_\#]$, and a unique solution of (6.79) is obtained.
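Since the root $\beta_\#$ must be found numerically, a minimal Python sketch of one way to do so is given here for the limiting case $c = 0$ treated in the proof sketch. It uses bisection, which is a natural choice because $\phi$ is monotone decreasing on $(0, 2\pi)$. The function names, the bracket end points, and the tolerance are illustrative assumptions; the lower bracket limits the sketch to moderate ratios $h/\ell$ (roughly below $3000$).

```python
import math

def phi(beta):
    """phi(beta) = (1 - cos(beta)) / (beta - sin(beta)) for 0 < beta < 2*pi."""
    return (1.0 - math.cos(beta)) / (beta - math.sin(beta))

def solve_beta(h, ell, tol=1e-12):
    """Bisection for the unique root of phi(beta) = h/ell in (0, 2*pi).

    Assumes h > 0 and ell > 0 with h/ell small enough that the root
    lies to the right of the lower bracket 1e-3.
    """
    target = h / ell
    lo, hi = 1e-3, 2.0 * math.pi - 1e-9
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if phi(mid) > target:
            lo = mid  # phi is decreasing, so the root lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)

def cycloid_scale(beta, ell):
    """D = ell / (beta - sin(beta)), from the third equation in (6.79) with E = 0."""
    return ell / (beta - math.sin(beta))

if __name__ == "__main__":
    h, ell = 2.0, 4.0  # sample values only
    beta = solve_beta(h, ell)
    D = cycloid_scale(beta, ell)
    # Residual of the fourth equation in (6.79) with k = h; should be near zero.
    print(beta, D, h - D * (1.0 - math.cos(beta)))
```

With $\alpha = 0$ and $E = 0$, the computed pair $(\beta_\#, D)$ then determines the cycloid arc in (6.78).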

6.8. Natural boundary conditions

We consider the problem of finding local extrema for a functional $F : V \to \mathbb{R}$, where the set of functions is

\[ V = \{ y \in C^2[a, b] \mid y(a) = \alpha \}, \quad y(b) \text{ free}, \tag{6.81} \]

the space of variations is

\[ V_0 = \{ h \in C^2[a, b] \mid h(a) = 0 \}, \quad h(b) \text{ free}, \tag{6.82} \]

and the functional is

\[ F(y) = \int_a^b L(x, y, y') \, dx + [G(y)]_{x=b}. \tag{6.83} \]

Here $[a, b]$ is a given interval, $\alpha$ is a given constant, $L(x, y, y')$ is a given integrand, and $G(y)$ is a given function. Unless indicated otherwise, we assume that $L$ is twice continuously differentiable for all $x \in [a, b]$, $y \in \mathbb{R}$ and $y' \in \mathbb{R}$, and that $G$ is continuously differentiable for all $y \in \mathbb{R}$. The trivial case $G \equiv 0$ is typical.

Similar to before, the above problem is first-order type, since the functional $F$ involves derivatives of at most first order. However, in contrast to before, the problem is now of fixed-free type, since the functions in $V$ have a free end at the point $x = b$; that is, the value of $y(b)$ is arbitrary. The term $[G(y)]_{x=b}$ in the functional is called a free-end term; it is evaluated at the single point $x = b$. For instance, if $G(y) = y^2 + e^{-y}$, then $[G(y)]_{x=b} = y^2(b) + e^{-y(b)}$. The following result outlines some implications of the general necessary conditions, when specialized to the above problem.

Result 6.8.1. Let $F : V \to \mathbb{R}$ be defined as in (6.81)–(6.83). If $y_* \in V$ is a local minimizer of $F$ in the $C^m$-norm for some $m$, then

\[ \left[ \frac{d}{d\varepsilon} F(y_* + \varepsilon h) \right]_{\varepsilon=0} = 0, \quad \forall h \in V_0, \tag{6.84} \]

and

\[ \left[ \frac{d^2}{d\varepsilon^2} F(y_* + \varepsilon h) \right]_{\varepsilon=0} \ge 0, \quad \forall h \in V_0. \tag{6.85} \]

Condition (6.84) implies that $y_*$ must satisfy

\[ \begin{cases} \dfrac{\partial L}{\partial y}(x, y, y') - \dfrac{d}{dx} \left[ \dfrac{\partial L}{\partial y'}(x, y, y') \right] = 0, \quad a \le x \le b, \\[1ex] y(a) = \alpha, \quad \left[ \dfrac{\partial L}{\partial y'}(x, y, y') + \dfrac{\partial G}{\partial y}(y) \right]_{x=b} = 0. \end{cases} \tag{6.86} \]

Condition (6.85) implies that $y_*$ must also satisfy

\[ \frac{\partial^2 L}{\partial y' \partial y'}(x, y, y') \ge 0, \quad a \le x \le b. \tag{6.87} \]

For a local maximizer, change $\ge$ to $\le$ in conditions (6.85) and (6.87).

Thus the conditions for local extrema of the fixed-free problem are similar to those for the fixed-fixed problem, but with an important difference. Specifically, the Euler–Lagrange differential equation is the same as before, but now there is a new type of boundary condition associated with the free end; it is called a natural boundary condition. In contrast, the boundary condition associated with the fixed end, which is explicitly specified in the set V, is called an essential boundary condition. As in the fixed-fixed case, the boundary-value problem in (6.86) may have one, none, or multiple solutions. Note that the natural condition is a property of an extremizing function. While every function in V is required to satisfy the essential condition, only the local extrema of 𝐹 are required to satisfy the additional, natural condition in order to be extremizing.


Note also that problems of the free-fixed and free-free types would all involve different combinations of natural and essential boundary conditions. It is important to note that the natural boundary condition for a free end at $x = a$ is slightly different than for a free end at $x = b$; there is a difference of sign.

Just as before, the above conditions are necessary, but not sufficient. These conditions are still not sufficient even if the inequalities in (6.85) and (6.87) are made strict. The conditions can only be used to find candidates for extrema, and a separate analysis would be required to determine which candidates, if any, are actual extrema.

Sketch of proof: Result 6.8.1. Here we show how (6.84) implies (6.86), including the natural boundary condition. To begin, consider any fixed $y \in V$ and $h \in V_0$. From the definition of $F$ in (6.83) we have

\[ F(y + \varepsilon h) = \int_a^b L(x, y + \varepsilon h, y' + \varepsilon h') \, dx + [G(y + \varepsilon h)]_{x=b}. \tag{6.88} \]

Just as in the fixed-fixed case, we differentiate with respect to $\varepsilon$, and take the derivative inside the integral and use the chain rule, and then set $\varepsilon = 0$. The resulting expression for the first variation is

\[ \delta F(y, h) = \int_a^b (g h + f h') \, dx + [\theta h]_{x=b}, \tag{6.89} \]

where for brevity we use the notation $g = \frac{\partial L}{\partial y}(x, y, y')$, $f = \frac{\partial L}{\partial y'}(x, y, y')$ and $\theta = \frac{\partial G}{\partial y}(y)$. As before, we next write the above expression in a more useful form using the integration-by-parts formula $\int u \, dv = uv - \int v \, du$. Specifically, applying this formula to the term $\int f h' \, dx$, with $u = f$ and $dv = h' \, dx$, we get $du = f' \, dx$ and $v = h$, and we obtain

\[ \delta F(y, h) = \int_a^b (g h - f' h) \, dx + [f h]_{x=a}^{x=b} + [\theta h]_{x=b}. \tag{6.90} \]

In the above, note that $f'$ means $\frac{df}{dx}$, or equivalently, $\frac{d}{dx} [\frac{\partial L}{\partial y'}(x, y, y')]$. Also, $[f h]_{x=a}^{x=b} = [f h]_{x=b} - [f h]_{x=a}$. Since $h \in V_0$, we have $h(a) = 0$, and it follows that the boundary term at $x = a$ is zero. Thus we get the expression

\[ \delta F(y, h) = \int_a^b (g - f') h \, dx + [f + \theta]_{x=b} \, h(b). \tag{6.91} \]

We now observe that, if $y_* \in V$ is a local extremum, then the condition in (6.84) requires

\[ \delta F(y_*, h) = \int_a^b (g - f') h \, dx + [f + \theta]_{x=b} \, h(b) = 0, \quad \forall h \in V_0. \tag{6.92} \]

As a special case, the above equation must also hold for the subset of $V_0$ with $h(b) = 0$. So a local extremum must also satisfy

\[ \int_a^b (g - f') h \, dx = 0, \quad \forall h \in C^2[a, b], \ h(a) = 0, \ h(b) = 0. \tag{6.93} \]

The above condition is the same as considered in the fixed-fixed case. By the fundamental lemma, due to the continuity of $g$ and $f'$, this condition will hold when and only


when $g - f' = 0$ for all $x \in [a, b]$. Substituting this information into (6.92), we find that $[f + \theta]_{x=b} \, h(b) = 0$ for all $h \in V_0$, and by choosing any $h$ with $h(b) \ne 0$ we find that $[f + \theta]_{x=b} = 0$. The equation $g - f' = 0$ for $x \in [a, b]$ is the Euler–Lagrange equation, and $[f + \theta]_{x=b} = 0$ is the natural boundary condition. When these are combined with the essential boundary condition specified in $V$, we obtain the boundary-value problem in (6.86).

Example 6.8.1. Consider the set $V = \{ y \in C^2[0, 1] \mid y(0) = 3 \}$, and functional $F(y) = \int_0^1 [(y')^2 - x y' + y^2] \, dx + y^2(1)$. Here we find all extremals, partially classify them with the sign condition, and then determine if they are actual local extrema using the definition.

Extremals. There is a free end at $x = 1$. The Lagrangian is $L(x, y, y') = (y')^2 - x y' + y^2$, and its partial derivatives are $\partial L / \partial y = 2y$ and $\partial L / \partial y' = 2y' - x$. The free-end term has $G(y) = y^2$, and its derivative is $\partial G / \partial y = 2y$. The differential equation to consider is $\frac{\partial L}{\partial y} - \frac{d}{dx} [\frac{\partial L}{\partial y'}] = 0$, which becomes $2y - \frac{d}{dx} [2y' - x] = 0$, or equivalently $2y - 2y'' + 1 = 0$. The natural boundary condition at the free end is $[\frac{\partial L}{\partial y'} + \frac{\partial G}{\partial y}]_{x=1} = 0$, which becomes $[2y' - x + 2y]_{x=1} = 0$, or equivalently $2y'(1) - 1 + 2y(1) = 0$. In view of the essential boundary condition in $V$, the boundary-value problem for an extremal is

\[ y'' - y = \frac{1}{2}, \quad y(0) = 3, \quad y'(1) + y(1) = \frac{1}{2}, \quad 0 \le x \le 1. \tag{6.94} \]

The differential equation can be solved using standard methods for linear, inhomogeneous equations, and the general solution is $y = C_1 e^x + C_2 e^{-x} - \frac{1}{2}$, where $C_1$ and $C_2$ are arbitrary constants. Applying the boundary conditions, we get $C_1 = \frac{1}{2e}$ and $C_2 = \frac{7e - 1}{2e}$, and we obtain a unique extremal $y_*$. This is the only candidate for a local extremum.
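As a sanity check on the constants, the following short Python sketch evaluates the extremal $y_* = C_1 e^x + C_2 e^{-x} - \frac{1}{2}$ and confirms numerically that it satisfies the differential equation and both boundary conditions in (6.94). The function names `y`, `dy` and `d2y` are illustrative; the derivatives are written out by hand from the formula for $y_*$.

```python
import math

# Constants obtained from the boundary conditions in (6.94).
C1 = 1.0 / (2.0 * math.e)
C2 = (7.0 * math.e - 1.0) / (2.0 * math.e)

def y(x):
    """Extremal y_*(x) = C1*e^x + C2*e^(-x) - 1/2."""
    return C1 * math.exp(x) + C2 * math.exp(-x) - 0.5

def dy(x):
    """First derivative of the extremal."""
    return C1 * math.exp(x) - C2 * math.exp(-x)

def d2y(x):
    """Second derivative of the extremal."""
    return C1 * math.exp(x) + C2 * math.exp(-x)

if __name__ == "__main__":
    print("essential condition y(0) - 3:", y(0.0) - 3.0)
    print("natural condition y'(1) + y(1) - 1/2:", dy(1.0) + y(1.0) - 0.5)
    print("max ODE residual:", max(abs(d2y(i / 10.0) - y(i / 10.0) - 0.5) for i in range(11)))
```

All three printed residuals should be zero up to rounding error.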

Sign condition. Since $\frac{\partial L}{\partial y'}(x, y, y') = 2y' - x$, we get $\frac{\partial^2 L}{\partial y' \partial y'}(x, y, y') = 2$. Here the required second partial is a constant, but more generally it may depend on $x$, $y$ and $y'$. Substituting the extremal $y_*$ and its derivative $y_*'$ into this expression, we get the constant function $\frac{\partial^2 L}{\partial y' \partial y'}(x, y_*, y_*') \equiv 2$ for $x \in [0, 1]$. The fact that this expression is $\ge 0$ for all $x \in [0, 1]$ informs us that $y_*$ could be a local minimizer, but not a local maximizer.

Analysis of candidate. To determine if $y_*$ is a local minimizer in the $C^m$-norm for some $m$, we attempt to verify the definition of a minimizer. To begin, let $m \le 2$ and $\delta > 0$ be given; we will adjust these as needed. Consider any $u$ in the neighborhood $N_{C^m}(y_*, \delta)$ and let $h = u - y_*$. Note that $h$ is in $V_0$, since it is the difference of two functions in $V$. From the definition of $F$, and the fact that $u = y_* + h$, we have

\[ F(u) = \int_0^1 \left[ (y_*' + h')^2 - x (y_*' + h') + (y_* + h)^2 \right] dx + (y_*(1) + h(1))^2. \tag{6.95} \]

Expanding and grouping terms on the right-hand side, and again using the definition of $F$, we get

\[ F(u) = F(y_*) + \int_0^1 \left[ 2 y_*' h' - x h' + 2 y_* h + (h')^2 + h^2 \right] dx + 2 y_*(1) h(1) + h^2(1). \tag{6.96} \]

Using integration-by-parts on the term $\int (2 y_*' - x) h' \, dx$, and noting that $h(0) = 0$ since $h \in V_0$, we get

\[ F(u) = F(y_*) + \int_0^1 \left[ (2 y_* + 1 - 2 y_*'') h + (h')^2 + h^2 \right] dx + (2 y_*'(1) - 1 + 2 y_*(1)) h(1) + h^2(1). \tag{6.97} \]

From the differential equation and natural boundary condition in (6.94), we note that $2 y_* + 1 - 2 y_*'' = 0$ for all $x \in [0, 1]$ and $2 y_*'(1) - 1 + 2 y_*(1) = 0$. Using this observation, together with the fact that $(h')^2 \ge 0$ and $h^2 \ge 0$ for all $x \in [0, 1]$, we obtain the result that

\[ F(u) - F(y_*) = \int_0^1 \left[ (h')^2 + h^2 \right] dx + h^2(1) \ge 0, \quad \forall u \in N_{C^m}(y_*, \delta). \tag{6.98} \]

The above result shows that $y_*$ is a local minimizer for any $m \le 2$ and any $\delta > 0$; in fact, it is an absolute minimizer.

6.9. Case study

Setup. To illustrate the role of boundary conditions we study a problem in the optimal steering control of a boat. We consider driving a boat across a channel of moving water as illustrated in Figure 6.11, where the departure point 𝑃 is on the left bank, and the arrival point 𝑄 is on the right bank. We suppose that the water motion is everywhere parallel to the banks, and that the water speed 𝑤(𝑥) is a given function of position across the channel, and could possibly change sign within the channel. Moreover, we

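Figure 6.11. [Channel in the $(x, y)$ plane with water speed $w(x)$ parallel to the banks, boat path $y(x)$ from $P$ on the left bank to $Q$ on the right bank, boat speed $\sigma$, and steering angle $\theta(x)$.]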

suppose that the boat moves at constant speed 𝜎 relative to the water, and that the steering angle 𝜃(𝑥) with respect to the horizontal axis can be adjusted or controlled as desired by a driver. Throughout our developments, 𝑥 and 𝑦 will denote coordinates as shown, and ℓ will denote the width of the channel. We assume that ℓ > 0 and 𝜎 > 0 are given constants, and that 𝑤(𝑥) and 𝜃(𝑥) are continuously differentiable.


As the boat moves across the channel, the path $y(x)$ and steering angle $\theta(x)$ are directly related, and the travel time $T$ can be expressed in terms of either. Here we seek the path and steering angle that will minimize the travel time under different boundary conditions. Note that, since boundary conditions will be specified on the path, we treat $y(x)$ as the primary variable rather than $\theta(x)$. Once an optimal path $y(x)$ is known, the corresponding optimal steering angle $\theta(x)$ can be found. It will be convenient to introduce the speed ratio $e(x) = w(x)/\sigma$, and we will assume that the magnitude of the water speed is everywhere less than the boat speed so that $-1 < e(x) < 1$. Moreover, we will assume that the boat path is a graph (one $y$ for each $x$) with two continuous derivatives, which requires $-\frac{\pi}{2} < \theta(x) < \frac{\pi}{2}$. We note that a different approach to the problem, involving more general curves rather than graphs, would be needed if these restrictions were removed.

Steering-path relation. The relation between the steering angle and path follows from simple considerations about velocity. Specifically, let $(x, y)$ be the position of the boat at any time $t \ge 0$, so that its velocity is given by

\[ \vec{v} = \frac{dx}{dt} \vec{i} + \frac{dy}{dt} \vec{j}, \tag{6.99} \]

where $\vec{i}$ and $\vec{j}$ are the standard unit vectors in the positive coordinate directions. Equivalently, noting that $\vec{v}$ is the resultant of two contributions, namely the velocity of the water, plus the velocity of the boat relative to the water, we also have

\[ \vec{v} = [w(x) \vec{j}] + [\sigma \cos\theta(x) \, \vec{i} + \sigma \sin\theta(x) \, \vec{j}]. \tag{6.100} \]

From (6.99) and (6.100) we get $\frac{dx}{dt} = \sigma \cos\theta(x)$ and $\frac{dy}{dt} = w(x) + \sigma \sin\theta(x)$, which is a dynamical system for the boat position $(x, y)(t)$ as a function of time. By dividing these component equations we obtain the associated path equation for this system, namely

\[ \frac{dy}{dx} = \frac{w(x) + \sigma \sin\theta(x)}{\sigma \cos\theta(x)} = \frac{e(x) + \sin\theta(x)}{\cos\theta(x)}. \tag{6.101} \]

The above relation implies that, if the steering angle $\theta(x)$ is known, then the resulting path $y(x)$ can be found. Moreover, the importance of the restriction $-\frac{\pi}{2} < \theta(x) < \frac{\pi}{2}$ can now be seen; it guarantees that $\frac{dy}{dx}$ will be defined and continuous (finite) when $e(x)$ and $\theta(x)$ are given. Thus the restriction on the angle implies that the curve $(x, y)(t)$ can be described as a graph $y(x)$.

Inverted relation. For our purposes, it will be convenient to invert the relation in (6.101) and express $\theta(x)$ in terms of $y'(x)$. After rearranging the equation, and using $y'$ in place of $\frac{dy}{dx}$, and omitting the argument $x$ in all functions for brevity, we get

\[ \sin\theta = y' \cos\theta - e. \tag{6.102} \]

Squaring both sides and using the identity $\sin^2\theta = 1 - \cos^2\theta$, we arrive at a quadratic equation which can be solved for $\cos\theta$. Consistent with the restriction $-\frac{\pi}{2} < \theta < \frac{\pi}{2}$, we choose the positive root and obtain

\[ \cos\theta = \frac{e y' + \sqrt{(y')^2 + 1 - e^2}}{(y')^2 + 1}. \tag{6.103} \]


The above expression can be put into an alternative form. Specifically, we can rationalize the numerator to get

\[ \cos\theta = \frac{1 - e^2}{\sqrt{(y')^2 + 1 - e^2} - e y'}. \tag{6.104} \]

The above relations imply that, if the path $y(x)$ or its derivative $y'(x)$ are known, then the corresponding steering angle $\theta(x)$ can be found. Moreover, the importance of the restriction $-1 < e(x) < 1$ can now be seen; it guarantees that $\cos\theta(x)$ will be real and positive when $e(x)$ and $y'(x)$ are given. Note that, to obtain an angle in the interval $-\frac{\pi}{2} < \theta(x) < \frac{\pi}{2}$, we would substitute (6.104) into (6.102) and use inverse sine. Thus the restriction on the speed ratio implies that a steering angle function $\theta(x)$ will exist for any given path function $y(x)$. Note that, if this restriction were removed, then there may be no angle function that is consistent with a given path, that is, some paths might not be achievable. For instance, in the case when $e(x) > 1$, so that the water speed is positive and exceeds the boat speed at all locations in the channel, the boat would not be able to move along a path that is upstream, or even horizontal.
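The recipe just described, namely substituting (6.104) into (6.102) and applying inverse sine, can be sketched in a few lines of Python. The function name `steering_angle` and its argument names are illustrative assumptions; the clamp before `asin` is only a guard against rounding error and is not part of the derivation.

```python
import math

def steering_angle(yp, e):
    """Steering angle theta in (-pi/2, pi/2) from slope yp = y'(x) and speed ratio e.

    Uses cos(theta) from (6.104), then sin(theta) from (6.102), then inverse sine.
    Requires -1 < e < 1, as assumed in the text.
    """
    if not -1.0 < e < 1.0:
        raise ValueError("speed ratio e must satisfy -1 < e < 1")
    cos_t = (1.0 - e * e) / (math.sqrt(yp * yp + 1.0 - e * e) - e * yp)
    sin_t = yp * cos_t - e
    # Guard against values slightly outside [-1, 1] caused by rounding.
    sin_t = max(-1.0, min(1.0, sin_t))
    return math.asin(sin_t)

if __name__ == "__main__":
    for yp in (-2.0, 0.0, 0.5, 3.0):  # sample slopes only
        for e in (-0.9, 0.0, 0.6):    # sample speed ratios only
            theta = steering_angle(yp, e)
            # Recover the slope from (6.101) as a consistency check.
            print(yp, e, theta, (e + math.sin(theta)) / math.cos(theta))
```

Feeding the computed angle back into the path equation (6.101) should reproduce the original slope $y'$, which gives a simple consistency check on (6.102)–(6.104).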

Travel time. In view of the condition $-\frac{\pi}{2} < \theta(x) < \frac{\pi}{2}$, we find that $\frac{dx}{dt} = \sigma \cos\theta(x) > 0$, which implies that $x$ increases monotonically with time. We suppose that the boat is at $x = 0$ at time $t = 0$, and arrives at $x = \ell$ at time $t = T$, so that, by definition, $T$ is the travel time across the channel. An integral expression for the travel time in terms of the path $y(x)$ can now be written. Specifically, beginning from a simple integral identity, and performing a change of variable using the above expression for $\frac{dx}{dt}$, and then using the expression in (6.104), we get

\[ T = \int_{t=0}^{t=T} dt = \int_{x=0}^{x=\ell} \frac{dt}{dx} \, dx = \frac{1}{\sigma} \int_0^{\ell} \frac{\sqrt{(y')^2 + 1 - e^2} - e y'}{1 - e^2} \, dx. \tag{6.105} \]

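Two problems. To illustrate the role and significance of boundary conditions we consider two different optimal steering problems.

Figure 6.12. [Two panels in the $(x, y)$ plane: (a) departure point $P$ at $(0, 0)$ and arrival point $Q$ at $(\ell, h)$; (b) departure point $P$ at $(0, 0)$ only.]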

Fixed-fixed problem. We seek a path 𝑦(𝑥) and steering angle 𝜃(𝑥) that will minimize the travel time from a given point 𝑃 = (0, 0) on the left bank to a given point 𝑄 = (ℓ, ℎ) on the right bank, as illustrated in Figure 6.12a. Equivalently, we seek minimizers for


a functional $T : V \to \mathbb{R}$, where

\[ T(y) = \frac{1}{\sigma} \int_0^{\ell} L(x, y, y') \, dx, \qquad V = \{ y \in C^2[0, \ell] \mid y(0) = 0, \ y(\ell) = h \}. \tag{6.106} \]

Here $L(x, y, y')$ is the integrand defined in (6.105). Every candidate for a minimizer must satisfy the Euler–Lagrange differential equation as usual, together with the two essential boundary conditions at $x = 0$ and $x = \ell$. Due to the fact that $L(x, y, y')$ has no explicit dependence on $y$, the differential equation can be written in a reduced form, and its solution can be expressed in the form of an integral with two arbitrary constants. Consideration of the boundary conditions then leads to a simple equation for one of these constants, and a nonlinear equation for the other. Note that once an optimal path $y(x)$ is known, the corresponding steering angle $\theta(x)$ can be found as described earlier.

Fixed-free problem. We seek a path $y(x)$ and steering angle $\theta(x)$ that will minimize the travel time from a given point $P = (0, 0)$ on the left bank to any point on the right bank, as illustrated in Figure 6.12b. In other words, we seek the quickest route from $P$ to the other side of the channel, with no bias or preference on the landing point. In this case the landing point on the right bank is now free, and we simply want to reach the right bank as quickly as possible. Here we seek minimizers for a functional $T : W \to \mathbb{R}$, where

\[ T(y) = \frac{1}{\sigma} \int_0^{\ell} L(x, y, y') \, dx, \qquad W = \{ y \in C^2[0, \ell] \mid y(0) = 0 \}. \tag{6.107} \]

As before, 𝐿(𝑥, 𝑦, 𝑦′) is the integrand defined in (6.105). Every candidate for a minimizer must again satisfy the Euler–Lagrange differential equation, together with an essential boundary condition at the fixed end 𝑥 = 0, and a natural boundary condition at the free end 𝑥 = ℓ. Similar to before, the general solution of the differential equation can be expressed in the form of an integral with two arbitrary constants. However, in contrast to before, the boundary conditions now lead to two simple, explicit equations for these constants. Once an optimal path 𝑦(𝑥) is known, the corresponding steering angle 𝜃(𝑥) can again be found.

Summary of results. Let a channel width ℓ, a boat speed 𝜎, and a continuously differentiable water speed 𝑤(𝑥) be given, and assume the speed ratio satisfies the restriction −1 < 𝑒(𝑥) < 1.

(1) The fixed-free problem for 𝑇 ∶ W → ℝ has a unique extremal, and it is an absolute minimizer.

(2) The fixed-fixed problem for 𝑇 ∶ V → ℝ either has a unique extremal, and it is an absolute minimizer, or no extremal for a given landing point (ℓ, ℎ).

In the fixed-fixed problem, the possibility of no extremal may arise depending on the given data, which would indicate that no graph in V would provide a quickest route from 𝑃 to 𝑄. In this case we note that a quickest route may exist among a more general

set of curves. For instance, if the water speed 𝑤(𝑥) is as shown, and 𝑄 is far downstream from 𝑃, then a quickest route may conceivably take the form of a piecewise-defined curve, where one piece is a vertical line segment oriented with the flow along the middle of the channel where the water speed is the fastest. In contrast, when 𝑄 is upstream from 𝑃 by a sufficient amount, then a quickest route could conceivably favor vertical line segments oriented opposite to the flow near one or both banks where the water speed is the slowest, and the route becomes less intuitive. Various details for the above two problems are explored in the Exercises.
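The reduced form of the Euler–Lagrange equation mentioned in the two steering problems can be sketched numerically. Since 𝐿 has no explicit dependence on 𝑦, extremals satisfy 𝜕𝐿/𝜕𝑦′ = 𝑐 for a constant 𝑐. Solving this relation for 𝑦′ with the integrand in (6.105) is our own short calculation, not reproduced from the text: with 𝑘 = 𝑒 + 𝑐(1 − 𝑒²) and |𝑘| < 1, it gives 𝑦′ = 𝑘√(1 − 𝑒²)/√(1 − 𝑘²). The parabolic speed ratio, the unit data, the landing height ℎ = 0.5, and the bisection bracket below are all hypothetical choices.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

ELL, SIGMA = 1.0, 1.0  # hypothetical channel width and boat speed

def e(x):
    """Hypothetical speed ratio: parabolic profile, fastest in mid-channel."""
    return 0.5 * 4.0 * x * (ELL - x) / ELL**2

def slope(x, c):
    """Slope y'(x) obtained from dL/dy' = c (valid while |k| < 1)."""
    k = e(x) + c * (1.0 - e(x)**2)
    return k * math.sqrt(1.0 - e(x)**2) / math.sqrt(1.0 - k**2)

def landing(c):
    """Landing height y(ell), using y(0) = 0."""
    return simpson(lambda x: slope(x, c), 0.0, ELL)

def travel_time(c):
    """Travel time (6.105) along the extremal with constant c."""
    def L(x):
        yp, ex = slope(x, c), e(x)
        return (math.sqrt(yp**2 + 1 - ex**2) - ex * yp) / (1 - ex**2)
    return simpson(L, 0.0, ELL) / SIGMA

def solve_c(h, lo=-0.9, hi=0.6, iters=60):
    """Bisection for landing(c) = h; landing is increasing in c."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if landing(mid) < h:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

c_free = 0.0            # fixed-free: natural condition dL/dy' = 0 at x = ell
c_fixed = solve_c(0.5)  # fixed-fixed: nonlinear equation y(ell) = h with h = 0.5
print(landing(c_free), travel_time(c_free))
print(c_fixed, landing(c_fixed), travel_time(c_fixed))
```

In this sketch the fixed-free case reduces to 𝑦′ = 𝑒(𝑥), which corresponds to pointing the boat straight across and drifting with the current, with travel time ℓ/𝜎. The fixed-fixed case needs a root search for 𝑐, and its travel time is longer, consistent with the extra constraint on the landing point.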

6.10. Second-order problems

We consider the problem of finding local extrema for a functional 𝐹 ∶ V → ℝ, where the set of functions is

$$V = \{\, y \in C^4[a,b] \mid y(a) = \alpha,\ y'(a) = \gamma,\ y(b) = \beta,\ y'(b) = \eta \,\} \quad \text{(or some or all of these conditions free)}, \tag{6.108}$$

the space of variations is

$$V_0 = \{\, h \in C^4[a,b] \mid h(a) = 0,\ h'(a) = 0,\ h(b) = 0,\ h'(b) = 0 \,\} \quad \text{(or some or all of these conditions free)}, \tag{6.109}$$

and the functional is

$$F(y) = \int_a^b L(x, y, y', y'')\,dx. \tag{6.110}$$

Here [𝑎, 𝑏] is a given interval, 𝛼, 𝛽, 𝛾, 𝜂 are given constants, and 𝐿(𝑥, 𝑦, 𝑦′, 𝑦″) is a given integrand. Unless indicated otherwise, we assume that 𝐿 is three times continuously differentiable for all 𝑥 ∈ [𝑎, 𝑏], 𝑦 ∈ ℝ, 𝑦′ ∈ ℝ and 𝑦″ ∈ ℝ. In the case when some of the conditions in V are free, the functional could be defined to include associated free-end terms, but that level of generality will not be pursued. The integrand 𝐿 is called the Lagrangian as before.

The above problem is said to be of second-order type, since the functional 𝐹 involves derivatives of at most second order. The continuity requirements for the functions in V, and for the integrand 𝐿, ensure that the functional 𝐹 is finite for each input. They also ensure that the general necessary conditions in Result 6.4.1 can be rewritten in a local, pointwise form involving only continuous quantities. These continuity requirements can be relaxed, but at the expense of more complicated statements. The following result outlines some implications of the general necessary conditions, when specialized to the above problem.

Result 6.10.1. Let 𝐹 ∶ V → ℝ be defined as in (6.108)–(6.110). If 𝑦∗ ∈ V is a local minimizer of 𝐹 in the 𝐶^𝑚-norm for some 𝑚, then

$$\left[\frac{d}{d\varepsilon} F(y_* + \varepsilon h)\right]_{\varepsilon=0} = 0, \quad \forall h \in V_0, \tag{6.111}$$

and

$$\left[\frac{d^2}{d\varepsilon^2} F(y_* + \varepsilon h)\right]_{\varepsilon=0} \ge 0, \quad \forall h \in V_0. \tag{6.112}$$

Condition (6.111) implies that 𝑦∗ must satisfy

$$\begin{cases} \dfrac{\partial L}{\partial y} - \dfrac{d}{dx}\left[\dfrac{\partial L}{\partial y'}\right] + \dfrac{d^2}{dx^2}\left[\dfrac{\partial L}{\partial y''}\right] = 0, \quad a \le x \le b, \\[1ex] y(a) = \alpha,\ y'(a) = \gamma,\ y(b) = \beta,\ y'(b) = \eta, \\[1ex] \text{(or natural boundary conditions if any are free).} \end{cases} \tag{6.113}$$

Condition (6.112) implies that 𝑦∗ must also satisfy

$$\frac{\partial^2 L}{\partial y''\,\partial y''}(x, y, y', y'') \ge 0, \quad a \le x \le b. \tag{6.114}$$

For a local maximizer, change ≥ to ≤ in conditions (6.112) and (6.114).

The equations in (6.113) provide a boundary-value problem that every local extremum must satisfy; they are called the Euler–Lagrange equations as before. The differential equation in this boundary-value problem is at most fourth-order, and may be linear or nonlinear. The boundary conditions appearing in (6.113) are essential in the sense that they are explicitly specified in the set V. If any of these conditions is removed from V, then an associated natural boundary condition would appear in (6.113). The inequality in (6.114) is a further condition that must be satisfied; it is called the Legendre condition as before, and can be used to partially classify an extremum.

The conditions outlined above are necessary, but not sufficient. Thus these conditions can only be used to find candidates for extrema, and a separate analysis would be required to determine which candidates, if any, are actual extrema. As noted before, these conditions are still not sufficient even if the inequalities in (6.112) and (6.114) are made strict. The boundary-value problem in (6.113) may have one, none, or multiple solutions; any solution, and hence a candidate, is called an extremal. The natural boundary conditions that may arise in (6.113) are summarized below.

Result 6.10.2. Let 𝐹 ∶ V → ℝ be defined as in (6.108)–(6.110). If any essential boundary condition is removed, then local extrema must satisfy a corresponding natural boundary condition.

(1) If 𝑦(𝑎) is free, the natural condition is $\left[\frac{\partial L}{\partial y'} - \frac{d}{dx}\left[\frac{\partial L}{\partial y''}\right]\right]_{x=a} = 0$.

(2) If 𝑦′(𝑎) is free, the natural condition is $\left[\frac{\partial L}{\partial y''}\right]_{x=a} = 0$.

(3) If 𝑦(𝑏) is free, the natural condition is $\left[\frac{\partial L}{\partial y'} - \frac{d}{dx}\left[\frac{\partial L}{\partial y''}\right]\right]_{x=b} = 0$.

(4) If 𝑦′(𝑏) is free, the natural condition is $\left[\frac{\partial L}{\partial y''}\right]_{x=b} = 0$.

Sketch of proof: Results 6.10.1 and 6.10.2. Here we show how (6.111) implies (6.113), including the various natural boundary conditions that can arise. To begin, consider any fixed 𝑦 ∈ V and ℎ ∈ V₀. From the definition of 𝐹 in (6.110) we have

$$F(y + \varepsilon h) = \int_a^b L(x,\ y + \varepsilon h,\ y' + \varepsilon h',\ y'' + \varepsilon h'')\,dx. \tag{6.115}$$

Just as in the previous cases, we differentiate with respect to 𝜀, and take the derivative inside the integral and use the chain rule, and then set 𝜀 = 0. The resulting expression for the first variation is

$$\delta F(y, h) = \int_a^b \left( g h + f h' + q h'' \right) dx, \tag{6.116}$$

where for brevity we use the notation 𝑔 = 𝜕𝐿/𝜕𝑦 (𝑥, 𝑦, 𝑦′, 𝑦″), 𝑓 = 𝜕𝐿/𝜕𝑦′ (𝑥, 𝑦, 𝑦′, 𝑦″) and 𝑞 = 𝜕𝐿/𝜕𝑦″ (𝑥, 𝑦, 𝑦′, 𝑦″). As before, we next write the above expression in a more useful form using the integration-by-parts formula ∫ 𝑢 𝑑𝑣 = 𝑢𝑣 − ∫ 𝑣 𝑑𝑢. Specifically, applying this formula to the term ∫ 𝑓ℎ′ 𝑑𝑥, with 𝑢 = 𝑓 and 𝑑𝑣 = ℎ′ 𝑑𝑥, and also to the term ∫ 𝑞ℎ″ 𝑑𝑥, with 𝑢 = 𝑞 and 𝑑𝑣 = ℎ″ 𝑑𝑥, we get

$$\delta F(y, h) = \int_a^b \left( g h - f' h - q' h' \right) dx + \big[ f h \big]_{x=a}^{x=b} + \big[ q h' \big]_{x=a}^{x=b}. \tag{6.117}$$

We again apply the integration-by-parts formula to the term ∫ −𝑞′ℎ′ 𝑑𝑥, with 𝑢 = −𝑞′ and 𝑑𝑣 = ℎ′ 𝑑𝑥, and we get, after collecting the boundary terms,

$$\delta F(y, h) = \int_a^b \left( g - f' + q'' \right) h\,dx + \big[ (f - q') h \big]_{x=a}^{x=b} + \big[ q h' \big]_{x=a}^{x=b}. \tag{6.118}$$

In the above, note that 𝑓′ means 𝑑𝑓/𝑑𝑥, or equivalently $\frac{d}{dx}\left[\frac{\partial L}{\partial y'}\right]$, and so on. Also, $[qh']_{x=a}^{x=b} = [qh']_{x=b} - [qh']_{x=a}$, and so on.
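The two integrations by parts leading from (6.116) to (6.118) can be checked numerically. The sketch below is only an illustration under our own assumptions: it uses the Lagrangian 𝐿 = (𝑦″)² + 𝑦𝑦′ + (𝑦′)² − 𝑦 on [0, 1], a hypothetical function 𝑦 = 𝑥⁴, and a hypothetical variation ℎ = 𝑥²(1 − 𝑥)², for which ℎ and ℎ′ vanish at both ends so that all boundary terms drop out.

```python
def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# Hypothetical y and its derivatives (our choice, not from the text).
y = lambda x: x**4
yp = lambda x: 4 * x**3
ypp = lambda x: 12 * x**2
y4 = lambda x: 24.0

# Hypothetical variation with h = h' = 0 at x = 0 and x = 1.
h = lambda x: x**2 * (1 - x)**2
hp = lambda x: 2 * x - 6 * x**2 + 4 * x**3
hpp = lambda x: 2 - 12 * x + 12 * x**2

# Partial derivatives of L = (y'')^2 + y y' + (y')^2 - y.
g = lambda x: yp(x) - 1          # dL/dy
f = lambda x: y(x) + 2 * yp(x)   # dL/dy'
q = lambda x: 2 * ypp(x)         # dL/dy''

# First variation in the form (6.116).
before = simpson(lambda x: g(x) * h(x) + f(x) * hp(x) + q(x) * hpp(x), 0.0, 1.0)

# First variation in the form (6.118); here g - f' + q'' = -1 - 2y'' + 2y''''.
after = simpson(lambda x: (-1 - 2 * ypp(x) + 2 * y4(x)) * h(x), 0.0, 1.0)

print(before, after)
```

Both integrals agree to quadrature accuracy, and a hand computation for these particular polynomials gives the common value 47/30 − 24/105.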

We now observe that, if 𝑦∗ ∈ V is a local extremum, then the condition in (6.111) requires

$$\delta F(y_*, h) = 0, \quad \forall h \in V_0. \tag{6.119}$$

Regardless of whether any essential boundary conditions are specified, the space V₀ will always contain functions that vanish at the ends. Thus a special case of the above condition is

$$\int_a^b \left( g - f' + q'' \right) h\,dx = 0, \quad \forall h \in C^4[a,b], \quad h(a) = 0,\ h'(a) = 0,\ h(b) = 0,\ h'(b) = 0. \tag{6.120}$$

By the fundamental lemma, due to the continuity of 𝑔, 𝑓′, and 𝑞″, this condition will hold when and only when 𝑔 − 𝑓′ + 𝑞″ = 0 for all 𝑥 ∈ [𝑎, 𝑏]. This is the Euler–Lagrange differential equation in (6.113). Substituting this information into (6.119), we then find that

$$(f - q')(b)\,h(b) - (f - q')(a)\,h(a) + q(b)\,h'(b) - q(a)\,h'(a) = 0, \quad \forall h \in V_0. \tag{6.121}$$

The above expression is trivial when all the essential boundary conditions are specified. However, when any of ℎ(𝑎), ℎ′(𝑎), ℎ(𝑏), or ℎ′(𝑏) are free, then the corresponding coefficient must vanish in order for the above condition to hold, and this gives the corresponding natural boundary condition.

Example 6.10.1. Consider the set V = {𝑦 ∈ 𝐶⁴[0, 1] | 𝑦(0) = 1, 𝑦(1) = 0, 𝑦′(1) = 0}, and functional $F(y) = \int_0^1 \left[ (y'')^2 + y y' + (y')^2 - y \right] dx$. Here we find all extremals, partially classify them with the sign condition, and then determine if they are actual local extrema using the definition.

Extremals. Note that [𝑎, 𝑏] = [0, 1] and that 𝑦′(𝑎) is free, since it is not specified in V. The Lagrangian is 𝐿(𝑥, 𝑦, 𝑦′, 𝑦″) = (𝑦″)² + 𝑦𝑦′ + (𝑦′)² − 𝑦, and its partial derivatives are 𝜕𝐿/𝜕𝑦 = 𝑦′ − 1, 𝜕𝐿/𝜕𝑦′ = 𝑦 + 2𝑦′ and 𝜕𝐿/𝜕𝑦″ = 2𝑦″. The differential equation to consider is $\frac{\partial L}{\partial y} - \frac{d}{dx}\left[\frac{\partial L}{\partial y'}\right] + \frac{d^2}{dx^2}\left[\frac{\partial L}{\partial y''}\right] = 0$, which becomes $y' - 1 - \frac{d}{dx}[y + 2y'] + \frac{d^2}{dx^2}[2y''] = 0$, or equivalently −1 − 2𝑦″ + 2𝑦⁗ = 0. The natural boundary condition associated with 𝑦′(𝑎) being free is $\left[\frac{\partial L}{\partial y''}\right]_{x=a} = 0$, which becomes [2𝑦″] at 𝑥 = 0 equal to zero, or equivalently 𝑦″(0) = 0. In view of the essential boundary conditions in V, the boundary-value problem for an extremal is

$$y'''' - y'' = \frac{1}{2}, \quad y(0) = 1,\ y(1) = 0,\ y'(1) = 0,\ y''(0) = 0, \quad 0 \le x \le 1. \tag{6.122}$$

The differential equation can be solved using standard methods for linear, inhomogeneous equations, and the general solution is $y = C_1 + C_2 x + C_3 e^{x} + C_4 e^{-x} - \frac{1}{4}x^2$, where 𝐶₁, . . . , 𝐶₄ are arbitrary constants. Applying the boundary conditions, we find unique values for these constants, and we obtain a unique extremal 𝑦∗. This is the only candidate for a local extremum.

Sign condition. Since 𝜕𝐿/𝜕𝑦″ (𝑥, 𝑦, 𝑦′, 𝑦″) = 2𝑦″, we get 𝜕²𝐿/𝜕𝑦″𝜕𝑦″ (𝑥, 𝑦, 𝑦′, 𝑦″) = 2. Here the required second partial is a constant, but more generally it may depend on 𝑥, 𝑦, 𝑦′, and 𝑦″. Substituting the extremal 𝑦∗ and its derivatives into this expression, we get the constant function 𝜕²𝐿/𝜕𝑦″𝜕𝑦″ (𝑥, 𝑦∗, 𝑦∗′, 𝑦∗″) ≡ 2 for 𝑥 ∈ [0, 1]. The fact that this expression is ≥ 0 for all 𝑥 ∈ [0, 1] informs us that 𝑦∗ could be a local minimizer, but not a local maximizer.

Analysis of candidate. To determine if 𝑦∗ is a local minimizer in the 𝐶^𝑚-norm for some 𝑚, we attempt to verify the definition of a minimizer. Similar to previous cases, let 𝑚 ≤ 4 and 𝛿 > 0 be given, and consider any 𝑢 in the neighborhood $N_{C^m}(y_*, \delta)$, and let ℎ = 𝑢 − 𝑦∗. Note that ℎ is in V₀, since it is the difference of two functions in V. Through a tedious, but straightforward analysis we obtain the result that

$$F(u) - F(y_*) \ge 0, \quad \forall u \in N_{C^m}(y_*, \delta). \tag{6.123}$$

Thus we conclude that 𝑦∗ is a local minimizer for any 𝑚 ≤ 4 and 𝛿 > 0; in fact, it is an absolute minimizer.
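For readers who want the constants in Example 6.10.1 explicitly, the four boundary conditions in (6.122) give a linear system for 𝐶₁, . . . , 𝐶₄. The sketch below solves it with a small hand-written Gaussian elimination so that no external library is assumed; the helper name `solve_linear` and the layout of the system are our own.

```python
import math

def solve_linear(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x

E = math.e
# Unknowns (C1, C2, C3, C4) in y = C1 + C2 x + C3 e^x + C4 e^(-x) - x^2/4.
A = [
    [1.0, 0.0, 1.0, 1.0],      # y(0)   = 1
    [1.0, 1.0, E, 1.0 / E],    # y(1)   = 0, with -1/4 moved to the right side
    [0.0, 1.0, E, -1.0 / E],   # y'(1)  = 0, with -1/2 moved to the right side
    [0.0, 0.0, 1.0, 1.0],      # y''(0) = 0, with -1/2 moved to the right side
]
b = [1.0, 0.25, 0.5, 0.5]
C1, C2, C3, C4 = solve_linear(A, b)

def y_star(x):
    """Extremal of Example 6.10.1 built from the computed constants."""
    return C1 + C2 * x + C3 * math.exp(x) + C4 * math.exp(-x) - 0.25 * x**2

print(C1, C2, C3, C4)
```

Eliminating by hand gives 𝐶₁ = 1/2 and 𝐶₄ = −3𝑒/8, which the computed values reproduce.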

6.11. Case study

Setup. To illustrate an application involving a second-order problem, and the role of different boundary conditions, we study a problem in the optimal acceleration control of a car. We consider driving between two given points, 𝑃 and 𝑄, along a road in a fixed time interval [0, 𝑏] as shown in Figure 6.13. We suppose that the motion occurs in a plane and that the car is always in contact with the road. We represent the car as a particle of mass 𝑚 and the road as a curve. The arclength (distance) coordinate along the road is denoted by 𝑠, which is chosen so that 𝑠 = 0 corresponds to point 𝑃, and 𝑠 = ℓ corresponds to point 𝑄. Since we consider motion in a fixed time interval, we have 𝑠 = 0 when 𝑡 = 0, and 𝑠 = ℓ when 𝑡 = 𝑏. The road may contain topographical features such as hills and valleys, as described by an inclination angle 𝜃(𝑠) with respect to a horizontal axis, which is a given function of arclength.

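[Figure 6.13. Road profile in the (𝑥, 𝑦) plane from 𝑃 to 𝑄, showing the car at arclength 𝑠, the inclination angle 𝜃(𝑠), and the direction of gravity 𝑔.]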

The motion of the car is described by a function 𝑠(𝑡) which gives its position versus time. Throughout the motion we suppose that the acceleration of the car can be influenced by a driver as desired. The driver can add positive acceleration by pressing on a gas pedal, or negative acceleration by pressing on a brake pedal, where the magnitude of the acceleration can vary according to the amount of pressure on the pedal. The driver could also choose to add zero acceleration and simply allow the car to coast. We denote the acceleration input from the driver by a function 𝑢(𝑡). In addition to this input, we assume that the car is subject to gravity, which is vertically downward with gravitational acceleration 𝑔. We also consider an air resistance force, which is proportional and directly opposed to the velocity of the car, with constant of proportionality 𝜇, and for convenience we introduce the parameter 𝜂 = 𝜇/𝑚.

In general, a driver will need to use the gas and brake pedals in order to travel from 𝑃 to 𝑄. Here we seek a driving input or control 𝑢(𝑡) that will minimize the total pedal usage throughout the travel under different boundary conditions. In other words, we consider the question of how to achieve the trip while using the gas and brake pedals as little as possible. We assume that 𝑏, 𝑚, ℓ, and 𝑔 are positive constants, 𝜇 and 𝜂 are nonnegative constants, and that 𝜃(𝑠) and 𝑢(𝑡) are twice continuously differentiable functions with −𝜋/2 < 𝜃(𝑠) < 𝜋/2.

Description of road. We suppose that the road is described by a planar curve (𝑥, 𝑦)(𝑠), where 𝑥 and 𝑦 are cartesian coordinates as shown with some fixed origin, and 𝑠 is an arclength parameter as described above. The unit tangent vector along the road is 𝑇⃗(𝑠) = 𝑥′(𝑠)𝑖⃗ + 𝑦′(𝑠)𝑗⃗, where a prime denotes a derivative with respect to 𝑠, and 𝑖⃗ and 𝑗⃗ are the standard unit vectors along the positive 𝑥 and 𝑦 directions. By definition of the inclination angle, we have 𝑇⃗(𝑠) = cos 𝜃(𝑠)𝑖⃗ + sin 𝜃(𝑠)𝑗⃗. Note that 𝜃(𝑠) can be determined from (𝑥, 𝑦)(𝑠) using the relation tan 𝜃(𝑠) = 𝑦′(𝑠)/𝑥′(𝑠), and consistent with the assumption −𝜋/2 < 𝜃(𝑠) < 𝜋/2, we assume 𝑥′(𝑠) > 0. We also consider a unit normal vector defined by 𝑁⃗(𝑠) = − sin 𝜃(𝑠)𝑖⃗ + cos 𝜃(𝑠)𝑗⃗, in the upward direction from soil to air. In addition to being orthonormal at each point along the road, these vectors satisfy the relation 𝑇⃗′(𝑠) = 𝜅(𝑠)𝑁⃗(𝑠), where 𝜅(𝑠) = 𝜃′(𝑠) is the (signed) curvature of the road at 𝑠. Note that the road is concave up in regions where 𝜅(𝑠) > 0, concave down where 𝜅(𝑠) < 0, and straight where 𝜅(𝑠) = 0.

Motion of car. At any time 𝑡 ∈ [0, 𝑏], the car has an arclength position 𝑠(𝑡) along the road, and a position vector 𝑟⃗(𝑠(𝑡)) = 𝑥(𝑠(𝑡))𝑖⃗ + 𝑦(𝑠(𝑡))𝑗⃗ in the plane. Using the chain rule, we find that the velocity and acceleration of the car are given by $\dot{\vec r} = \dot s\,\vec T$ and $\ddot{\vec r} = \ddot s\,\vec T + \kappa \dot s^2\,\vec N$, where a dot denotes a derivative with respect to time 𝑡. We assume that the car is subject to a force $\vec F_d = m u \vec T$ due to driver input, a force $\vec F_a = -\mu \dot{\vec r}$ due to air resistance, a force $\vec F_g = -m g \vec j$ due to gravity, and a force $\vec F_r = \lambda \vec N$ due to contact with the road. In any motion, the road can only push up (𝜆 > 0) to support the car, and cannot pull down (𝜆 < 0) or effectively be absent (𝜆 = 0). According to Newton’s law, the motion of the car must satisfy the equation $m\ddot{\vec r} = \vec F_d + \vec F_a + \vec F_g + \vec F_r$. By taking the dot product with 𝑇⃗, and dividing by 𝑚 and using the relation 𝜂 = 𝜇/𝑚, we find that the tangential components of acceleration and force must satisfy

$$\ddot s = u - \eta \dot s - g \sin\theta(s). \tag{6.124}$$

The above equation implies that, if the inclination angle 𝜃(𝑠) and control input 𝑢(𝑡) are given, then the car position 𝑠(𝑡) can be found. By taking the dot product of the original equation with 𝑁⃗, we find that the normal components of acceleration and force must satisfy 𝜆 = 𝑚𝜅𝑠̇² + 𝑚𝑔 cos 𝜃. Note that this equation leads to a condition for contact. Specifically, contact will exist or not exist depending on whether 𝜆 > 0 or 𝜆 ≤ 0. For instance, along regions where the road is concave down, so that 𝜅 < 0, contact will be lost if the velocity 𝑠̇ is too large; the car (and its driver!) will feel weightless as they separate from the road. In contrast, along regions where the road is concave up or straight, so that 𝜅 ≥ 0, contact will exist for any velocity. The contact force between car and road (and driver and seat) will noticeably increase through concave up regions at higher velocities.

For our purposes, it will be convenient to invert the relation in (6.124) and express 𝑢(𝑡) in terms of 𝑠(𝑡). Specifically, after rearranging, we obtain

$$u = \ddot s + \eta \dot s + g \sin\theta(s). \tag{6.125}$$

The above relation shows that, if the inclination angle 𝜃(𝑠) and motion 𝑠(𝑡) are known, then the corresponding control function 𝑢(𝑡) required to produce the motion can be found.

Gas and brake usage. As a measure of the total gas and brake pedal usage in the time interval [0, 𝑏] we consider the functional

$$F = \int_0^b u^2(t)\,dt. \tag{6.126}$$

Note that larger values of the driver input 𝑢(𝑡), over longer periods of time, will give larger positive values of 𝐹. Also, a zero value of 𝑢(𝑡), over the entire period of time, will give a zero value of 𝐹. And in all cases we have 𝐹 ≥ 0. The above is called a cost functional for the problem: it assigns a quantitative “cost” to any given driver input 𝑢(𝑡). In a game where the objective is to use the pedals as little as possible, a high score in the game would correspond to a low value of the cost, and a low score would correspond to a high value of the cost. Thus cost can be understood as the inverse of a score. Note that the definition of the cost functional is subjective just as the scoring rules of any game are subjective. The simple example above is for purposes of illustration only. The integrand need not depend on the input in a uniform, sign-independent way. More generally, the integrand could depend on

the sign of 𝑢(𝑡), for instance to reflect different contributions to cost associated with fuel consumption or rate of wear on the vehicle. Also, the integrand could include contributions that explicitly depend on time 𝑡 and the motion 𝑠(𝑡).

As the car moves along the road, the input 𝑢(𝑡) and motion 𝑠(𝑡) are directly related, and the cost functional 𝐹 can be expressed in terms of either. Since boundary conditions will be specified on the motion rather than the control, we treat 𝑠(𝑡) as the primary variable rather than 𝑢(𝑡). Using (6.125) and (6.126), we obtain an expression for the cost assigned to any driver input in terms of the motion, namely

$$F = \int_0^b \left( \ddot s + \eta \dot s + g \sin\theta(s) \right)^2 dt. \tag{6.127}$$

Note that the above functional is of second-order type in the motion 𝑠(𝑡) since it involves the first and second derivatives 𝑠̇(𝑡) and 𝑠̈(𝑡).

Two problems. We consider two different types of optimal driving problems corresponding to different boundary conditions.

Problem 1. Consider a trip in which the car starts from rest at point 𝑃 at time 𝑡 = 0, and must come to a stop at point 𝑄 at time 𝑡 = 𝑏. We seek a driving control function 𝑢(𝑡) that will minimize the total gas and brake pedal usage while accomplishing this trip. Equivalently, working with the motion 𝑠(𝑡), we seek a minimizer for the cost functional 𝐹 ∶ V → ℝ, where

$$F(s) = \int_0^b L(t, s, \dot s, \ddot s)\,dt, \qquad V = \{\, s \in C^4[0,b] \mid s(0) = 0,\ \dot s(0) = 0,\ s(b) = \ell,\ \dot s(b) = 0 \,\}. \tag{6.128}$$

Here 𝐿(𝑡, 𝑠, 𝑠̇, 𝑠̈) is the integrand defined in (6.127). Every candidate for a minimizer must satisfy the Euler–Lagrange differential equation, together with four essential boundary conditions at 𝑡 = 0 and 𝑡 = 𝑏. When the road has a nontrivial topography defined by an arbitrary inclination angle 𝜃(𝑠), the Euler–Lagrange equation will be nonlinear and numerical procedures will generally be required to find extremals. In contrast, when the road has a trivial topography, for instance when it is straight and flat so that 𝜃(𝑠) ≡ 0, the equation will be linear and extremals can be found by the usual solution techniques. Once an optimal motion 𝑠(𝑡) is known, the corresponding control 𝑢(𝑡) can be found using (6.125), and the contact condition could be checked to assess the feasibility of the motion. This problem is explored in the Exercises.

Problem 2. Consider a trip in which the car again starts from rest at point 𝑃 at time 𝑡 = 0, and must reach or cross point 𝑄 at time 𝑡 = 𝑏, but where the velocity at point 𝑄 is allowed to be free. We seek a driving control function 𝑢(𝑡) that will minimize the total gas and brake pedal usage while accomplishing this different version of the trip. Equivalently, working with the motion 𝑠(𝑡), we seek a minimizer for the cost functional 𝐹 ∶ W → ℝ, where

$$F(s) = \int_0^b L(t, s, \dot s, \ddot s)\,dt, \qquad W = \{\, s \in C^4[0,b] \mid s(0) = 0,\ \dot s(0) = 0,\ s(b) = \ell \,\}. \tag{6.129}$$

Here, as before, 𝐿(𝑡, 𝑠, 𝑠̇, 𝑠̈) is the integrand defined in (6.127). Every candidate for a minimizer must again satisfy the Euler–Lagrange differential equation, together with three essential boundary conditions at 𝑡 = 0 and 𝑡 = 𝑏, along with an appropriate natural boundary condition at 𝑡 = 𝑏. Remarks similar to those above can also be made here, and we note that any optimal motion 𝑠(𝑡) and corresponding control 𝑢(𝑡) will generally be different than before. This problem is also explored in the Exercises.
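To make the cost in (6.127) concrete, the following sketch evaluates it for two admissible motions in Problem 1 in the simplest setting: a straight, flat road with no air resistance (𝜃 ≡ 0, 𝜂 = 0). In that setting the Lagrangian reduces to 𝑠̈², and by our own short calculation from (6.113) the Euler–Lagrange equation becomes 𝑠⁗ = 0, whose solution satisfying the four boundary conditions is a cubic in 𝑡. The trip data, the function names, and the cosine-shaped comparison motion are hypothetical choices made only for illustration.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def cost(s, s_dot, s_ddot, b, eta=0.0, g=9.81, theta=lambda s: 0.0):
    """Cost (6.127): integral of u^2, with the control u given by (6.125)."""
    def u(t):
        return s_ddot(t) + eta * s_dot(t) + g * math.sin(theta(s(t)))
    return simpson(lambda t: u(t)**2, 0.0, b)

ell, b = 100.0, 10.0  # hypothetical trip: distance ell in time b

# Cubic motion: solves s'''' = 0 with s(0) = 0, s'(0) = 0, s(b) = ell, s'(b) = 0.
cubic = dict(
    s=lambda t: ell * (3 * (t / b)**2 - 2 * (t / b)**3),
    s_dot=lambda t: ell * (6 * (t / b) - 6 * (t / b)**2) / b,
    s_ddot=lambda t: ell * (6 - 12 * (t / b)) / b**2,
)

# Cosine-shaped motion: also admissible for Problem 1, but not an extremal.
cosine = dict(
    s=lambda t: ell * (1 - math.cos(math.pi * t / b)) / 2,
    s_dot=lambda t: ell * math.pi * math.sin(math.pi * t / b) / (2 * b),
    s_ddot=lambda t: ell * math.pi**2 * math.cos(math.pi * t / b) / (2 * b**2),
)

cost_cubic = cost(b=b, **cubic)
cost_cosine = cost(b=b, **cosine)
print(cost_cubic, cost_cosine)
```

For these data the cubic motion has cost 12ℓ²/𝑏³, while the cosine-shaped motion has the slightly larger cost 𝜋⁴ℓ²/(8𝑏³). Passing nonzero `eta` or a nontrivial `theta` to `cost` evaluates the same functional on a hilly road with air resistance, although the cubic is then no longer expected to be an extremal.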

6.12. Constraints

Given two functionals 𝐹, 𝐺 ∶ V → ℝ, and a constant 𝑘 ∈ ℝ, we consider the problem of finding local extrema of 𝐹(𝑦) subject to the constraint 𝐺(𝑦) = 𝑘. We suppose that the set of functions is

$$V = \{\, y \in C^2[a,b] \mid y(a) = \alpha,\ y(b) = \beta \,\} \quad \text{(or one or both free)}, \tag{6.130}$$

the space of variations is

$$V_0 = \{\, h \in C^2[a,b] \mid h(a) = 0,\ h(b) = 0 \,\} \quad \text{(or one or both free)}, \tag{6.131}$$

and the functionals are

$$F(y) = \int_a^b L(x, y, y')\,dx, \qquad G(y) = \int_a^b M(x, y, y')\,dx. \tag{6.132}$$

Here [𝑎, 𝑏] is a given interval, 𝛼, 𝛽 are given constants, and 𝐿(𝑥, 𝑦, 𝑦′ ) and 𝑀(𝑥, 𝑦, 𝑦′ ) are given integrands. Unless indicated otherwise, we assume that 𝐿 and 𝑀 are twice continuously differentiable for all 𝑥 ∈ [𝑎, 𝑏], 𝑦 ∈ ℝ and 𝑦′ ∈ ℝ. In the case when some of the conditions in V are free, the functionals could be defined to include associated free-end terms, but that level of generality will not be considered. The above problem is called an isoperimetric problem; it arises in various classic applications in geometry, where the integral constraint 𝐺(𝑦) = 𝑘 represents a fixed arclength or perimeter. There are many variants that could be considered, which may involve multiple constraints of the integral type, higher-order functionals, and also constraints of a local or pointwise type. Here we consider only the version outlined above. As before, the continuity requirements for the functions in V, and for the integrands 𝐿 and 𝑀, ensure that the functionals 𝐹 and 𝐺 are finite for each input. They also ensure that the general necessary condition outlined below can be rewritten in a local, pointwise form involving only continuous quantities. These continuity requirements can be relaxed, but at the expense of more complicated statements. Whereas all previous results have exploited facts from single-variable calculus, the following result exploits a fact from two-variable calculus, regarding the local extrema of functions subject to constraints. Thus, instead of a one-parameter family of variations, we now consider a two-parameter family as described next. Result 6.12.1. Let 𝐹, 𝐺 ∶ V → ℝ be defined as in (6.130)–(6.132). Let 𝑦∗ ∈ V be given, and assume that it is not an extremal of 𝐺(𝑦), and consider any fixed ℎ1 ∈ V0 such that

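𝛿𝐺(𝑦∗, ℎ₁) ≠ 0. For arbitrary ℎ₂ ∈ V₀ consider the functions 𝐹̃, 𝐺̃ ∶ ℝ² → ℝ defined by

$$\tilde F(\varepsilon_1, \varepsilon_2) = F(y_* + \varepsilon_1 h_1 + \varepsilon_2 h_2), \qquad \tilde G(\varepsilon_1, \varepsilon_2) = G(y_* + \varepsilon_1 h_1 + \varepsilon_2 h_2). \tag{6.133}$$

If 𝑦∗ ∈ V is a local extremum of 𝐹(𝑦) subject to 𝐺(𝑦) = 𝑘 in the 𝐶^𝑚-norm for some 𝑚, then (0, 0) ∈ ℝ² is a local extremum of 𝐹̃(𝜀₁, 𝜀₂) subject to 𝐺̃(𝜀₁, 𝜀₂) = 𝑘. Thus a number 𝜆 ∈ ℝ must exist such that

$$\begin{pmatrix} \dfrac{\partial \tilde F}{\partial \varepsilon_1} \\[1.5ex] \dfrac{\partial \tilde F}{\partial \varepsilon_2} \end{pmatrix}_{(0,0)} + \lambda \begin{pmatrix} \dfrac{\partial \tilde G}{\partial \varepsilon_1} \\[1.5ex] \dfrac{\partial \tilde G}{\partial \varepsilon_2} \end{pmatrix}_{(0,0)} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}. \tag{6.134}$$

The number 𝜆 is independent of ℎ₁ and ℎ₂, and the above condition implies that 𝑦∗ must satisfy

$$\begin{cases} \left( \dfrac{\partial L}{\partial y} - \dfrac{d}{dx}\left[\dfrac{\partial L}{\partial y'}\right] \right) + \lambda \left( \dfrac{\partial M}{\partial y} - \dfrac{d}{dx}\left[\dfrac{\partial M}{\partial y'}\right] \right) = 0, \quad a \le x \le b, \\[1ex] G(y) = k, \quad \lambda \text{ constant}, \\[1ex] y(a) = \alpha,\ y(b) = \beta, \\[1ex] \text{(or natural boundary conditions if any are free).} \end{cases} \tag{6.135}$$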

The condition in (6.134) is a result from two-variable differential calculus. This condition arises in the study of constrained optimization and is known as the Lagrange multiplier rule; the constant 𝜆 is called the multiplier. The two-parameter family of variations in (6.133) is introduced for the sole purpose of exploiting this elegant result. The equations in (6.135) provide a boundary-value problem that every local extremum must satisfy; they are called the Euler–Lagrange equations as before. The differential equation in this boundary-value problem is at most second-order, may be linear or nonlinear, and now contains the constant 𝜆. This additional unknown is balanced by an additional equation, which is the constraint condition 𝐺(𝑦) = 𝑘. The boundary conditions appearing in (6.135) are essential in the sense that they are explicitly specified in the set V. If any of these conditions is removed from V, then an associated natural boundary condition would appear. The conditions outlined above are necessary, but not sufficient. Thus these conditions can only be used to find candidates for extrema, and a separate analysis would be required to determine which candidates, if any, are actual extrema. A corresponding Legendre-type condition to further classify candidates can also be derived. However, this condition is significantly involved and difficult to apply for constrained problems and is omitted for brevity. The boundary-value problem in (6.135) may have one, none, or multiple solutions; any solution, and hence a candidate, is called an extremal as before. The natural boundary conditions that may arise in (6.135) are summarized below.

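Result 6.12.2. Let 𝐹, 𝐺 ∶ V → ℝ be defined as in (6.130)–(6.132), and let 𝜆 ∈ ℝ be the multiplier from (6.135). If any essential boundary condition is removed, then local extrema must satisfy a corresponding natural boundary condition.

(1) If 𝑦(𝑎) is free, the natural condition is $\left[\frac{\partial L}{\partial y'} + \lambda \frac{\partial M}{\partial y'}\right]_{x=a} = 0$.

(2) If 𝑦(𝑏) is free, the natural condition is $\left[\frac{\partial L}{\partial y'} + \lambda \frac{\partial M}{\partial y'}\right]_{x=b} = 0$.

Sketch of proof: Results 6.12.1 and 6.12.2. Here we show how (6.134) implies (6.135), including the two natural boundary conditions that can arise. To begin, let 𝑦 ∈ V and ℎ₁, ℎ₂ ∈ V₀ be arbitrary. From the definitions of 𝐹̃ and 𝐺̃ in (6.133) we have

$$\begin{aligned} \tilde F(\varepsilon_1, \varepsilon_2) &= \int_a^b L(x,\ y + \varepsilon_1 h_1 + \varepsilon_2 h_2,\ y' + \varepsilon_1 h_1' + \varepsilon_2 h_2')\,dx, \\ \tilde G(\varepsilon_1, \varepsilon_2) &= \int_a^b M(x,\ y + \varepsilon_1 h_1 + \varepsilon_2 h_2,\ y' + \varepsilon_1 h_1' + \varepsilon_2 h_2')\,dx. \end{aligned} \tag{6.136}$$

For each of 𝑖 = 1 and 𝑖 = 2, we differentiate with respect to 𝜀ᵢ, and take the derivative inside the integral and use the chain rule, and then set (𝜀₁, 𝜀₂) = (0, 0). The resulting expressions are

$$\begin{aligned} \left.\frac{\partial \tilde F}{\partial \varepsilon_i}\right|_{(0,0)} &= \int_a^b \left( g_L h_i + f_L h_i' \right) dx, \\ \left.\frac{\partial \tilde G}{\partial \varepsilon_i}\right|_{(0,0)} &= \int_a^b \left( g_M h_i + f_M h_i' \right) dx, \end{aligned} \tag{6.137}$$

where for brevity we use the notation $g_L = \frac{\partial L}{\partial y}$, $f_L = \frac{\partial L}{\partial y'}$, $g_M = \frac{\partial M}{\partial y}$, and $f_M = \frac{\partial M}{\partial y'}$. As before, we next rewrite each expression in a more useful form using the integration-by-parts formula ∫ 𝑢 𝑑𝑣 = 𝑢𝑣 − ∫ 𝑣 𝑑𝑢. Specifically, applying this formula to the terms $\int f_L h_i'\,dx$ and $\int f_M h_i'\,dx$, we get

$$\begin{aligned} \left.\frac{\partial \tilde F}{\partial \varepsilon_i}\right|_{(0,0)} &= \int_a^b Q_L h_i\,dx + \big[ f_L h_i \big]_{x=a}^{x=b}, \\ \left.\frac{\partial \tilde G}{\partial \varepsilon_i}\right|_{(0,0)} &= \int_a^b Q_M h_i\,dx + \big[ f_M h_i \big]_{x=a}^{x=b}, \end{aligned} \tag{6.138}$$

where we use the notation $Q_L = g_L - \frac{d}{dx} f_L$ and $Q_M = g_M - \frac{d}{dx} f_M$. Note that, by definition of the first variations of 𝐹 and 𝐺, the above expressions can be succinctly written as $\left.\frac{\partial \tilde F}{\partial \varepsilon_i}\right|_{(0,0)} = \delta F(y, h_i)$ and $\left.\frac{\partial \tilde G}{\partial \varepsilon_i}\right|_{(0,0)} = \delta G(y, h_i)$.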

We now suppose that 𝑦∗ is an extremum of 𝐹 subject to 𝐺 = 𝑘, so that (0, 0) is an extremum of 𝐹̃ subject to 𝐺̃ = 𝑘. Under the assumption that 𝑦∗ is not an extremal of 𝐺, we can fix ℎ₁ such that 𝛿𝐺(𝑦∗, ℎ₁) ≠ 0, and let ℎ₂ be arbitrary. This implies that (0, 0) will not be a critical point of 𝐺̃, and by the multiplier rule from two-variable calculus, a number 𝜆 must exist such that

$$\begin{aligned} \frac{\partial \tilde F}{\partial \varepsilon_1} + \lambda \frac{\partial \tilde G}{\partial \varepsilon_1} &= \int_a^b (Q_L + \lambda Q_M) h_1\,dx + \big[ (f_L + \lambda f_M) h_1 \big]_a^b = 0, \\ \frac{\partial \tilde F}{\partial \varepsilon_2} + \lambda \frac{\partial \tilde G}{\partial \varepsilon_2} &= \int_a^b (Q_L + \lambda Q_M) h_2\,dx + \big[ (f_L + \lambda f_M) h_2 \big]_a^b = 0. \end{aligned} \tag{6.139}$$

The above two equations can be written as 𝛿𝐹(𝑦∗, ℎ₁) + 𝜆𝛿𝐺(𝑦∗, ℎ₁) = 0 and 𝛿𝐹(𝑦∗, ℎ₂) + 𝜆𝛿𝐺(𝑦∗, ℎ₂) = 0. For fixed ℎ₁, but arbitrary ℎ₂ such that 𝛿𝐺(𝑦∗, ℎ₂) ≠ 0, we find that $-\lambda = \frac{\delta F(y_*, h_1)}{\delta G(y_*, h_1)} = \frac{\delta F(y_*, h_2)}{\delta G(y_*, h_2)}$. Thus the number 𝜆 is a constant whose value depends on 𝑦∗, but not on ℎ₁ or ℎ₂. Since the two equations in (6.139) have an identical form, and ℎ₂ is arbitrary, they are equivalent to the single equation

$$\int_a^b w h\,dx + \theta(b) h(b) - \theta(a) h(a) = 0, \quad \forall h \in V_0, \tag{6.140}$$

where $w = Q_L + \lambda Q_M$ and $\theta = f_L + \lambda f_M$. Regardless of whether any essential boundary conditions are specified, the space V₀ will always contain functions that vanish at the ends. Thus a special case of the above condition is

$$\int_a^b w h\,dx = 0, \quad \forall h \in C^2[a,b], \quad h(a) = 0,\ h(b) = 0. \tag{6.141}$$

By the fundamental lemma, due to the continuity of 𝑤, this condition will hold when and only when 𝑤 = 0 for all 𝑥 ∈ [𝑎, 𝑏]. This is the Euler–Lagrange differential equation in (6.135). Substituting this information into (6.140), we then find that

$$\theta(b) h(b) - \theta(a) h(a) = 0, \quad \forall h \in V_0. \tag{6.142}$$

The above expression is trivial when all the essential boundary conditions are specified. However, when either of ℎ(𝑎) or ℎ(𝑏) is free, or both, then the corresponding coefficient must vanish in order for the above condition to hold, and this gives the corresponding natural boundary condition.

Example 6.12.1. Consider the set V = {𝑦 ∈ 𝐶²[0, 1] | 𝑦(0) = 0, 𝑦(1) = 0}, and functionals $F(y) = \int_0^1 (y')^2\,dx$ and $G(y) = \int_0^1 x y\,dx$. Here we find all extremals of 𝐹 subject to 𝐺 = 1, and then determine if they are actual local extrema using the definition.

Extremals. The integrands are 𝐿(𝑥, 𝑦, 𝑦′) = (𝑦′)² and 𝑀(𝑥, 𝑦, 𝑦′) = 𝑥𝑦, and their partial derivatives are 𝜕𝐿/𝜕𝑦 = 0, 𝜕𝐿/𝜕𝑦′ = 2𝑦′, 𝜕𝑀/𝜕𝑦 = 𝑥, and 𝜕𝑀/𝜕𝑦′ = 0. The differential equation to consider is $\left(\frac{\partial L}{\partial y} - \frac{d}{dx}\left[\frac{\partial L}{\partial y'}\right]\right) + \lambda\left(\frac{\partial M}{\partial y} - \frac{d}{dx}\left[\frac{\partial M}{\partial y'}\right]\right) = 0$, which becomes −2𝑦″ + 𝜆𝑥 = 0, where 𝜆 is an unknown constant. In view of the essential boundary conditions in V, we get the equations

$$y'' = \frac{\lambda}{2} x, \quad y(0) = 0,\ y(1) = 0, \quad 0 \le x \le 1. \tag{6.143}$$

The differential equation can be explicitly integrated, and its general solution is $y = \frac{\lambda}{12} x^3 + Cx + D$, where 𝐶 and 𝐷 are unknown constants. Applying the boundary conditions, we find 𝐷 = 0 and 𝐶 = −𝜆/12, and the function becomes $y = \frac{\lambda}{12}(x^3 - x)$. The constraint condition 𝐺 = 1 must also be satisfied, and this requires

$$\int_0^1 x y\,dx = 1 \quad \text{which becomes} \quad \int_0^1 \frac{\lambda}{12}(x^4 - x^2)\,dx = 1. \tag{6.144}$$

The above equation implies 𝜆 = −90, and we obtain a unique extremal 𝑦∗ = − 2 (𝑥3 − 𝑥). This is the only candidate for a local extremum. Analysis of candidate. To determine if 𝑦∗ is a local extremum in the 𝐶 𝑚 -norm for some 𝑚, we attempt to verify the definition of a minimizer or maximizer. Similar to previous cases, let 𝑚 ≤ 2 and 𝛿 > 0 be given, and consider any 𝑢 in the neighborhood 𝑁 𝐶 𝑚 (𝑦∗ , 𝛿) such that 𝐺(𝑢) = 1. Moreover, let ℎ = 𝑢 − 𝑦∗ and note that ℎ is in V0 , since it is the difference of two functions in V. From the definition of 𝐹, and the fact that 𝑢 = 𝑦∗ + ℎ, we have 1

𝐹(𝑢) = ∫ (𝑦∗′ + ℎ′ )2 𝑑𝑥.

(6.145)

0

Expanding and grouping terms on the right-hand side, and again using the definition of $F$, we get

$$ F(u) = F(y_*) + \int_0^1 2y_*'h'\,dx + \int_0^1 (h')^2\,dx. \tag{6.146} $$

Using integration-by-parts on the term $\int_0^1 2y_*'h'\,dx$, and noting that $h(0) = 0$ and $h(1) = 0$ since $h \in \mathcal{V}_0$, we get

$$ F(u) = F(y_*) - \int_0^1 2y_*''h\,dx + \int_0^1 (h')^2\,dx. \tag{6.147} $$

From the differential equation in (6.143), we note that $2y_*'' = \lambda x$ for all $x \in [0,1]$, and so $2y_*''h = \lambda xh = \lambda xu - \lambda xy_*$ for all $x \in [0,1]$, where we have used the fact that $h = u - y_*$. Substituting this result into (6.147), and using the definition of $G$, we get

$$ \begin{aligned} F(u) - F(y_*) &= -\int_0^1 \lambda xu\,dx + \int_0^1 \lambda xy_*\,dx + \int_0^1 (h')^2\,dx \\ &= -\lambda G(u) + \lambda G(y_*) + \int_0^1 (h')^2\,dx. \end{aligned} \tag{6.148} $$

Since both $y_*$ and $u$ satisfy the constraint, that is, $G(y_*) = 1$ and $G(u) = 1$, the above terms with $\lambda$ cancel, and we obtain the result that

$$ F(u) - F(y_*) = \int_0^1 (h')^2\,dx \ge 0, \qquad \forall u \in N_{C^m}(y_*, \delta) \text{ with } G(u) = 1. \tag{6.149} $$
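The computation in this example can be cross-checked symbolically. The sketch below assumes SymPy is available; the perturbation `h` is a hypothetical choice, picked only because it vanishes at both endpoints and satisfies $G(h) = 0$, so that $u = y_* + h$ remains admissible. It is an illustration, not part of the original argument.

```python
import sympy as sp

x, lam, C, D = sp.symbols("x lambda C D", real=True)

# General solution of y'' = (lambda/2) x, as in (6.143).
y = lam / 12 * x**3 + C * x + D

# Two boundary conditions plus the constraint G(y) = 1: a linear system.
equations = [
    y.subs(x, 0),
    y.subs(x, 1),
    sp.integrate(x * y, (x, 0, 1)) - 1,
]
solution = sp.solve(equations, [lam, C, D], dict=True)[0]
y_star = sp.expand(y.subs(solution))


def F(f):
    """F(f) = integral of (f')^2 over [0, 1]."""
    return sp.integrate(sp.diff(f, x) ** 2, (x, 0, 1))


def G(f):
    """G(f) = integral of x*f over [0, 1]."""
    return sp.integrate(x * f, (x, 0, 1))


# Hypothetical admissible perturbation: h(0) = h(1) = 0 and G(h) = 0.
h = x * (1 - x) * (5 * x - 3)
u = y_star + h

gap = sp.simplify(F(u) - F(y_star))  # left-hand side of (6.149)
h_energy = F(h)                      # right-hand side of (6.149)
print(solution[lam], y_star, gap, h_energy)
```

For this particular `h`, the two sides of (6.149) agree exactly, as the identity predicts.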

Thus we conclude that $y_*$ is a local minimizer subject to the constraint $G = 1$ for any $m \le 2$ and $\delta > 0$; in fact, it is an absolute minimizer subject to the constraint $G = 1$.

Example 6.12.2. In some cases, the unknown constant $\lambda$ can affect the form of the solution of the differential equation. For instance, for a differential equation such as

$$ y'' - \lambda y = 0, \tag{6.150} $$

the form of the solution depends on $\lambda$, specifically

$$ y = \begin{cases} C_1 e^{\sqrt{\lambda}\,x} + C_2 e^{-\sqrt{\lambda}\,x}, & \text{if } \lambda > 0, \\ C_1 + C_2 x, & \text{if } \lambda = 0, \\ C_1\cos(\beta x) + C_2\sin(\beta x), & \text{if } \lambda < 0, \end{cases} \tag{6.151} $$

where $\beta = \sqrt{|\lambda|}$. All possible solutions of the differential equation must be considered, and those that satisfy the boundary conditions and constraint would be the extremals.
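As a quick sanity check of the three solution forms in (6.151), the following sketch (assuming SymPy) substitutes each branch into $y'' - \lambda y = 0$. The parametrization $\lambda = \pm\beta^2$ with $\beta > 0$ is an assumption made here for convenience.

```python
import sympy as sp

x, C1, C2 = sp.symbols("x C1 C2", real=True)
beta = sp.symbols("beta", positive=True)  # beta = sqrt(|lambda|)

# Each entry pairs a value of lambda with the claimed general solution.
candidates = {
    "lambda > 0": (beta**2, C1 * sp.exp(beta * x) + C2 * sp.exp(-beta * x)),
    "lambda = 0": (sp.Integer(0), C1 + C2 * x),
    "lambda < 0": (-beta**2, C1 * sp.cos(beta * x) + C2 * sp.sin(beta * x)),
}

# Residual of y'' - lambda*y for each branch; all should vanish identically.
residuals = {
    case: sp.simplify(sp.diff(y, x, 2) - lam * y)
    for case, (lam, y) in candidates.items()
}
print(residuals)
```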

6.13. Case study

Setup. To illustrate the preceding results we study the problem of determining the equilibrium shape of a hanging chain. We consider a chain in two dimensions as shown in Figure 6.14, which is suspended above the ground by its endpoints, which are fixed atop two support columns. We assume that the chain is inextensible, but perfectly flexible, and has a positive total mass that is distributed uniformly along its length, and hence subject to gravity.

[Figure 6.14: hanging chain in the $x,y$-plane with endpoints at $x = -L$ and $x = L$, vertical offset $2\alpha$ between the endpoints, and gravity $g$.]

We consider a coordinate system as shown, where the horizontal position of the origin is halfway between the fixed endpoints, and the vertical position is also halfway between these points. We suppose that the chain has length $\ell$, and mass per unit length $\rho$, and that gravitational acceleration $g$ is oriented vertically downwards. The horizontal distance between the endpoints is $2L$, and the coordinates of these points are $(-L, \alpha)$ and $(L, -\alpha)$, where $\alpha$ is an offset parameter that may be zero, positive, or negative. Thus the vertical distance between the endpoints is $2|\alpha|$. The shape or profile of the chain is described by a curve $y(x)$. According to a principle from physics, among all possible shapes of length $\ell$, the observed shape minimizes potential energy. We seek to characterize this shape for any given positive constants $\ell$, $\rho$, $g$, $L$, and offset parameter $\alpha$.

Length, energy functionals. Let $y(x)$, $x \in [-L, L]$, be an arbitrary profile curve which is twice continuously differentiable. To develop expressions for its length and potential energy, we consider a uniform partition of the curve into $n$ segments as illustrated in Figure 6.15. For $i = 1, \ldots, n$, we let segment $i$ denote the piece of curve between nodes $(x_{i-1}, y_{i-1})$ and $(x_i, y_i)$, and let $\Delta x = x_i - x_{i-1} = \frac{2L}{n}$ denote the horizontal spacing between nodes, and let $\Delta y = y_i - y_{i-1}$ denote the vertical spacing. Since we consider the limit $n \to \infty$, or equivalently $\Delta x \to 0$, we can approximate each segment as linear, and we note that its center of mass will coincide with its midpoint, which we denote by $(\bar{x}_i, \bar{y}_i)$.

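[Figure 6.15: uniform partition of the profile curve into segments, with nodes at $x_0, x_1, x_2, \ldots, x_n$ from $-L$ to $L$ and heights $y_0, y_1, y_2, \ldots, y_n$; segment $i$ lies between node $i-1$ and node $i$, with spacings $\Delta x$ and $\Delta y$.]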

Each segment $i$ will have a length $G_i$, mass $m_i$, and gravitational potential energy $E_i$. The length of a segment is $G_i = (\Delta x^2 + \Delta y^2)^{1/2} = \left(1 + \left(\frac{\Delta y}{\Delta x}\right)^2\right)^{1/2}\Delta x$. Similarly, the mass of a segment is $m_i = \rho G_i$, and the potential energy is $E_i = m_i g \bar{y}_i = \rho g \bar{y}_i \left(1 + \left(\frac{\Delta y}{\Delta x}\right)^2\right)^{1/2}\Delta x$. By summing up the contributions from all the segments, and employing the limit definition of an integral, we obtain the length $G$ and energy $E$ for any given profile curve $y(x)$, $x \in [-L, L]$, namely

$$ G = \lim_{n\to\infty} \sum_{i=1}^{n} G_i = \int_{-L}^{L} \sqrt{1 + (y')^2}\,dx, \qquad E = \lim_{n\to\infty} \sum_{i=1}^{n} E_i = \int_{-L}^{L} \rho g y \sqrt{1 + (y')^2}\,dx. \tag{6.152} $$
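The segment sums that lead to (6.152) can be tried out numerically. The sketch below uses only the Python standard library; the test profile $y = \cosh x$ on $[-1, 1]$ with $\rho g = 1$ is a hypothetical choice made here because both integrals have closed forms, namely $2\sinh 1$ for the length and $1 + \tfrac12\sinh 2$ for the energy.

```python
import math


def chain_sums(y, L, n, rho=1.0, g=1.0):
    """Approximate the length G and energy E by summing over n linear segments."""
    dx = 2.0 * L / n
    length = 0.0
    energy = 0.0
    for i in range(1, n + 1):
        x_left = -L + (i - 1) * dx
        x_right = x_left + dx
        dy = y(x_right) - y(x_left)
        seg_len = math.hypot(dx, dy)              # G_i = (dx^2 + dy^2)^(1/2)
        y_mid = 0.5 * (y(x_left) + y(x_right))    # height of segment midpoint
        length += seg_len
        energy += rho * g * y_mid * seg_len       # E_i = m_i * g * y_mid
    return length, energy


# Hypothetical test profile with known integrals.
length, energy = chain_sums(math.cosh, L=1.0, n=2000)
exact_length = 2.0 * math.sinh(1.0)
exact_energy = 1.0 + 0.5 * math.sinh(2.0)
print(length - exact_length, energy - exact_energy)
```

Increasing `n` should shrink both differences, which mirrors the limit $n \to \infty$ in the text.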

Restated problem. To characterize the observed shape of a hanging chain, we consider a set of functions $\mathcal{V}$, and seek minimizers of $E$ subject to $G = \ell$, where

$$ E(y) = \int_{-L}^{L} \mathcal{L}(x,y,y')\,dx, \qquad G(y) = \int_{-L}^{L} \mathcal{M}(x,y,y')\,dx, \qquad \mathcal{V} = \{y \in C^2[-L,L] \mid y(-L) = \alpha,\ y(L) = -\alpha\}. \tag{6.153} $$

Here $\mathcal{L}(x,y,y')$ and $\mathcal{M}(x,y,y')$ are the integrands defined in (6.152). Every extremal must satisfy the Euler–Lagrange differential equation, together with the constraint and boundary conditions. Note that the general form of the differential equation can be written in the compact form $\frac{\partial \mathcal{N}}{\partial y} - \frac{d}{dx}\left[\frac{\partial \mathcal{N}}{\partial y'}\right] = 0$, where $\mathcal{N} = \mathcal{L} + \lambda\mathcal{M}$ is a combined integrand, and $\lambda$ is the multiplier for the constraint.

Reduced equation. Based on the expressions for $\mathcal{L}$ and $\mathcal{M}$, we note that the original form of the Euler–Lagrange equation will be tedious. However, since $\mathcal{N}$ is independent of $x$, we may instead consider the reduced form of the equation given in Result 6.6.1, which is $\mathcal{N} - y'\frac{\partial \mathcal{N}}{\partial y'} = A$, where $A$ is a constant. Using the expression for $\mathcal{N}$, we get, after some simplification,

$$ \rho g y + \lambda = A\left[1 + (y')^2\right]^{1/2}. \tag{6.154} $$

Inspection of the above equation reveals that there are two basic types of solutions corresponding to $A = 0$ and $A \ne 0$. The case $A = 0$ gives the trivial solution $y = -\frac{\lambda}{\rho g} \equiv$ constant, and this solution will not satisfy the constraint $G = \ell$, except in the special situation when $\ell = 2L$. To find the general, nontrivial solution of the equation, we assume that $A \ne 0$, and rearrange the equation to get

$$ (y')^2 = \frac{(\rho g y + \lambda)^2 - A^2}{A^2}. \tag{6.155} $$
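The "some simplification" behind the reduced equation can be checked symbolically. The following sketch assumes SymPy and writes $v$ for $y$ and $w$ for $y'$ (a naming convention adopted here for the check); it confirms that $\mathcal{N} - y'\,\partial\mathcal{N}/\partial y'$ equals $(\rho g y + \lambda)/\sqrt{1 + (y')^2}$, and that (6.155) is consistent with (6.154).

```python
import sympy as sp

v, w, lam, A = sp.symbols("v w lambda A", real=True)
rho, g = sp.symbols("rho g", positive=True)

# Combined integrand N = L + lambda*M, with y -> v and y' -> w.
N = (rho * g * v + lam) * sp.sqrt(1 + w**2)

# Left-hand side of the reduced equation N - y' dN/dy' = A.
reduced_lhs = sp.simplify(N - w * sp.diff(N, w))

# (6.154): reduced_lhs * sqrt(1 + w^2) should equal rho*g*v + lambda.
check_154 = sp.simplify(reduced_lhs * sp.sqrt(1 + w**2) - (rho * g * v + lam))

# (6.155): substituting the stated (y')^2 back into the square of (6.154).
w_squared = ((rho * g * v + lam) ** 2 - A**2) / A**2
check_155 = sp.simplify(A**2 * (1 + w_squared) - (rho * g * v + lam) ** 2)
print(reduced_lhs, check_154, check_155)
```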

Solution of reduced equation. As was done in Section 6.7, instead of a cartesian description $y(x)$ of a solution curve, we consider a parametric description $x = \phi(s)$ and $y = \psi(s)$. Here $s$ is an arbitrary parameter along the curve, and $\phi(s)$ and $\psi(s)$ are arbitrary functions. Similar to before, we can choose one of these functions to simplify the differential equation, and then solve for the other. To proceed, we substitute the calculus relation $\frac{dy}{dx} = \frac{dy}{ds} \Big/ \frac{dx}{ds}$ into the differential equation in (6.155), and then rearrange terms to separate $\frac{dx}{ds}$ and $\frac{dy}{ds}$, which gives

$$ \left(\frac{dx}{ds}\right)^2 = \frac{A^2}{(\rho g y + \lambda)^2 - A^2}\left(\frac{dy}{ds}\right)^2. \tag{6.156} $$

We may now choose $y = \psi(s)$ to simplify the right-hand side of the equation. Any choice can be made, provided that it leads to a curve for which $\frac{dy}{dx}$ and $\frac{d^2y}{dx^2}$ are defined and continuous, which can be verified in the end. Motivated by the form of the quotient above, we let $\rho g y + \lambda = A\cosh(s)$. This corresponds to $y = -\frac{\lambda}{\rho g} + \frac{A}{\rho g}\cosh(s)$, which gives $\frac{dy}{ds} = \frac{A}{\rho g}\sinh(s)$, and using the identity $\cosh^2(s) - 1 = \sinh^2(s)$, we get

$$ \left(\frac{dx}{ds}\right)^2 = \left(\frac{A}{\rho g}\right)^2. \tag{6.157} $$

The above equation implies $\frac{dx}{ds} = \pm\frac{A}{\rho g}$. Note that, depending on $A$, we can always choose the sign to get $\frac{dx}{ds} > 0$, so that the curve will be oriented left to right. This equation can be explicitly integrated to obtain $x = \pm\frac{A}{\rho g}s + B$, where $B$ is a constant, which can be rearranged to get $s = \pm\frac{(x - B)\rho g}{A}$. Since $\cosh(s) = \cosh(-s)$, we note that the choice of sign does not affect $y$, and we get

$$ y = -\frac{\lambda}{\rho g} + \frac{A}{\rho g}\cosh\left(\frac{\rho g x}{A} - \frac{\rho g B}{A}\right), \tag{6.158} $$

where $\lambda$, $A \ne 0$ and $B$ are unknown constants. By straightforward substitutions, this function can be put into the cleaner form

$$ y = -b + \frac{L}{c}\cosh\left(\frac{cx}{L} + d\right), \tag{6.159} $$

where $b$, $c \ne 0$ and $d$ are unknown constants, into which $\rho$ and $g$ have been absorbed. The general curve defined above is called a catenary curve. A description of this curve can be found in many texts on elementary calculus.

Constraint, boundary conditions. Although the catenary given in (6.159) is the general, nontrivial solution of the Euler–Lagrange equation, we have not yet obtained an extremal. We must now consider if values of the unknown constants can be found to satisfy the constraint and boundary conditions. An analysis reveals that different interesting cases depending on $\ell$, $L$ and $\alpha$ arise, and that extremals exist in pairs. The symmetric case when $\alpha = 0$ is studied in the Exercises.
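To give a feel for how the constants in (6.159) might be determined, here is a minimal numerical sketch for the symmetric case $\alpha = 0$. It rests on assumptions that are not worked out in the text: that symmetry gives $d = 0$, that the length of the catenary over $[-L, L]$ is then $\frac{2L}{c}\sinh(c)$, and that $b = \frac{L}{c}\cosh(c)$ enforces $y(\pm L) = 0$. The function names `solve_c` and `profile` and the sample values $L = 1$, $\ell = 3$ are hypothetical.

```python
import math


def solve_c(ell, L, tol=1e-12):
    """Find c > 0 with sinh(c)/c = ell/(2L) by bisection (needs ell > 2L)."""
    target = ell / (2.0 * L)
    if target <= 1.0:
        raise ValueError("need ell > 2L for a nontrivial catenary")

    def f(c):
        return math.sinh(c) / c - target

    lo, hi = 1e-9, 1.0
    while f(hi) < 0.0:          # expand the bracket until the sign changes
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def profile(x, c, L):
    """Catenary (6.159) with d = 0 and b = (L/c) cosh(c), so y(-L) = y(L) = 0."""
    return (L / c) * (math.cosh(c * x / L) - math.cosh(c))


L_half, ell = 1.0, 3.0          # hypothetical sample values
c = solve_c(ell, L_half)
print(c, profile(0.0, c, L_half), profile(0.0, -c, L_half))
```

Because $\sinh(c)/c$ is even in $c$, the value $-c$ satisfies the same length condition; under the assumptions above this yields a second curve that arches upward, which is one way to see how extremals could occur in pairs.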

6.14. A sufficient condition

Here we outline a simple sufficient condition for local extrema, to supplement the necessary conditions considered up to now. The condition is based on the idea of concavity, and is straightforward and explicit, but stringent. Although less stringent conditions are available, they are more difficult to use and not pursued here. We state a condition for only a certain type of first-order problem and note that similar statements hold for more general problems.

To state the result, we suppose that $y_* \in \mathcal{V}$ is a given extremal of a functional $F : \mathcal{V} \to \mathbb{R}$, where $\mathcal{V} \subset C^2[a,b]$ and $F(y) = \int_a^b L(x,y,y')\,dx$. The functions in $\mathcal{V}$ may satisfy fixed or free conditions at each end, but for simplicity we assume that $F$ has no free-end terms outside of the integral. Also, we suppose that $R = (c,d) \times (m,n)$ is a given rectangle that contains the range of the extremal and its derivative, in the sense that $c < y_*(x) < d$ and $m < y_*'(x) < n$ for all $x \in [a,b]$.

It will be convenient to use the notation $L(x,v,w)$ and consider the integrand as a function of three independent variables $x, v, w$. So if $L(x,y,y') = e^x y + (y')^2$, then $L(x,v,w) = e^x v + w^2$. For each fixed $x_0 \in [a,b]$ we consider the graph of $L(x_0,v,w)$ over the $v,w$-plane and call it the $L$-graph associated with $x_0$. Note that the $L$-graph is a surface, and the rectangle $R$ is an open set in its domain.

Result 6.14.1. [concavity theorem] Let $L(x,v,w)$ be given, and consider a functional $F : \mathcal{V} \to \mathbb{R}$, where $F(y) = \int_a^b L(x,y,y')\,dx$, and $\mathcal{V} \subset C^2[a,b]$ is a set of functions with fixed or free conditions at each end. Moreover, let $y_* \in \mathcal{V}$ be an extremal and let $R$ be an open rectangle that contains the range of $y_*, y_*'$.

(1) If for every $x_0 \in [a,b]$ the $L$-graph is concave up over $R$, then $y_*$ is a local minimizer of $F$ in the $C^1$-norm.

(2) If for every $x_0 \in [a,b]$ the $L$-graph is concave up over the entire $v,w$-plane, then $y_*$ is an absolute minimizer of $F$.

For a maximizer, change concave up to concave down in (1) and (2).

Thus a given extremal $y_*$ is guaranteed to be an actual extremum when the integrand $L(x,v,w)$ satisfies a concavity property in $v, w$ for every $x$. The extremum is local or absolute depending on whether the concavity of the integrand is local or absolute (global). Note that these conditions are sufficient, but not necessary. That is, an extremal can be an extremum even when the above conditions do not hold. Note also that the extremal is assumed to exist.

The terms concave up and concave down refer to the usual concepts from elementary calculus. A function of one variable is called concave up (down) in an interval when its graph remains on or above (below) each tangent line in the interval. Similarly, a function of two variables is called concave up (down) in a region when its graph remains on or above (below) each tangent plane in the region. For twice continuously differentiable functions as considered here, concavity is determined by the second derivatives. The term convexity is sometimes used in place of concavity.

[Figure 6.16: graph of a concave-up function $L(w)$ together with its tangent line $L_{\text{tang}}(w)$ at $w_1$, with a second point $w_2$ marked on the $w$-axis.]

Sketch of proof: Result 6.14.1. We initially consider the case when $F(y) = \int_a^b L(y')\,dx$, where $L(w)$ is a given function of one variable, which we assume to be concave up for all $w \in \mathbb{R}$. Let $y_* \in \mathcal{V}$ be an extremal and let $u \in \mathcal{V}$ be arbitrary, and introduce $h = u - y_* \in \mathcal{V}_0$. For any given $x \in [a,b]$, let $w_1 = y_*'(x)$ and $w_2 = u'(x)$. Since the $L$-graph is concave up as illustrated in Figure 6.16, we have

$$ L(w_2) \ge L_{\text{tang}}(w_2). \tag{6.160} $$

Here $L_{\text{tang}}(w)$ is the function for the tangent line at $w_1$, namely, $L_{\text{tang}}(w) = L(w_1) + \frac{\partial L}{\partial w}(w_1)(w - w_1)$. Substituting for $w_2$ and $w_1$ in (6.160), and using the notation $\frac{\partial L}{\partial y'}$ in place of $\frac{\partial L}{\partial w}$, we get

$$ L(u') \ge L(y_*') + \frac{\partial L}{\partial y'}(y_*')(u' - y_*'). \tag{6.161} $$

The above inequality holds for every $x \in [a,b]$. Integrating, and using the fact that $h = u - y_*$, we obtain

$$ \int_a^b L(u')\,dx \ge \int_a^b L(y_*')\,dx + \int_a^b \frac{\partial L}{\partial y'}(y_*')\,h'\,dx. \tag{6.162} $$

The above expression says $F(u) \ge F(y_*) + \delta F(y_*, h)$. Since $y_*$ is an extremal, we have $\delta F(y_*, h) = 0$, and we get $F(u) \ge F(y_*)$ for arbitrary $u$, which implies that $y_*$ is an absolute minimizer. Note that the opposite conclusion would be obtained when $L(w)$ is concave down.

Consider now the case when $F(y) = \int_a^b L(x,y,y')\,dx$, where $L(x,v,w)$ is a given function. For every $x \in [a,b]$, we assume that this function is concave up for all $(v,w) \in R$, where $R$ is a fixed, open rectangle. Consider also a given extremal $y_* \in \mathcal{V}$, and suppose the range of $(y_*, y_*')$ is contained in $R$. Note that, since $R$ is open, there is a $\delta > 0$ such that if $u \in N_{C^1}(y_*, \delta)$, then the range of $(u, u')$ is also contained in $R$, and similar to before we introduce $h = u - y_* \in \mathcal{V}_0$. For any fixed $x \in [a,b]$, let $(v_1, w_1) = (y_*(x), y_*'(x))$ and $(v_2, w_2) = (u(x), u'(x))$. Since the $L$-graph is concave up in the region $R$, we have

$$ L(x, v_2, w_2) \ge L_{\text{tang}}(x, v_2, w_2), \tag{6.163} $$

where $L_{\text{tang}}(x,v,w)$ is the function for the tangent plane at $(v_1, w_1)$. Substituting for $(v_2, w_2)$ and $(v_1, w_1)$ in (6.163), using the usual formula for the tangent plane, and the notation $\frac{\partial L}{\partial y}$ and $\frac{\partial L}{\partial y'}$ in place of $\frac{\partial L}{\partial v}$ and $\frac{\partial L}{\partial w}$, we get

$$ L(x,u,u') \ge L(x,y_*,y_*') + \frac{\partial L}{\partial y}(x,y_*,y_*')(u - y_*) + \frac{\partial L}{\partial y'}(x,y_*,y_*')(u' - y_*'). \tag{6.164} $$

The above inequality holds for every $x \in [a,b]$. Integrating, and using the fact that $h = u - y_*$, we obtain an expression of the same form as before, namely $F(u) \ge F(y_*) + \delta F(y_*, h)$. Since $y_*$ is an extremal, we have $\delta F(y_*, h) = 0$, and we get $F(u) \ge F(y_*)$ for arbitrary $u \in N_{C^1}(y_*, \delta)$, which implies that $y_*$ is a local minimizer. Note that the opposite conclusion would be obtained when $L(x,v,w)$ is concave down, and the local result would become absolute when $R$ is the entire $v,w$-plane.

Example 6.14.1. Consider $F : \mathcal{V} \to \mathbb{R}$, where $\mathcal{V} \subset C^2[0,1]$ is a set of functions with fixed or free conditions at each end.

(1) Let $F(y) = \int_0^1 e^{-x}y^2 + (1 + x^2)(y')^2\,dx$. For any fixed $x_0 \in [0,1]$, we have $L(x_0,v,w) = A_0v^2 + B_0w^2$, where $A_0 = e^{-x_0} > 0$ and $B_0 = 1 + x_0^2 > 0$. Here the $L$-graph is an elliptic paraboloid, which is concave up over the entire $v,w$-plane. By Result 6.14.1, if an extremal $y_* \in \mathcal{V}$ exists, then it must be an absolute minimizer.

(2) Let $F(y) = \int_0^1 (1 - q^2(x) + (y')^2)^{1/2} - q(x)y'\,dx$, where $-1 < q(x) < 1$ is a given coefficient function. For any fixed $x_0 \in [0,1]$, we have $L(x_0,v,w) = (1 - q_0^2 + w^2)^{1/2} - q_0w$, where $q_0 = q(x_0)$. Here the $L$-graph is independent of $v$, and we note that $\frac{\partial^2 L}{\partial w^2} > 0$ for all $w$, so the graph is similar to a parabolic cylinder, which is concave up over the entire $v,w$-plane. By Result 6.14.1, if an extremal $y_* \in \mathcal{V}$ exists, then it must be an absolute minimizer.

Example 6.14.2. Consider $F : \mathcal{V} \to \mathbb{R}$, where $F(y) = \int_0^1 6(y')^2 - (y')^4\,dx$, $\mathcal{V} = \{y \in C^2[0,1] \mid y(0) = 0,\ y(1) = k\}$, and $k$ is a constant. This functional has a unique extremal given by $y_*(x) = kx$. For any fixed $x_0 \in [0,1]$, we have $L(x_0,v,w) = 6w^2 - w^4$, which is independent of both $x_0$ and $v$. The $L$-graph is concave up in the rectangle $R_{\mathrm{u}} = \{(v,w) \mid -1 < w < 1\}$, and concave down in the rectangles $R_{\mathrm{d1}} = \{(v,w) \mid w < -1\}$ and $R_{\mathrm{d2}} = \{(v,w) \mid w > 1\}$. Thus, if $-1 < k < 1$, then the range of $(y_*, y_*')$ is contained in $R_{\mathrm{u}}$, and $y_*$ is a local minimizer in the $C^1$-norm by Result 6.14.1. Moreover, if $k < -1$ or $k > 1$, then the range of $(y_*, y_*')$ is contained in $R_{\mathrm{d1}}$ or $R_{\mathrm{d2}}$, and $y_*$ is a local maximizer.
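The concavity claims in Examples 6.14.1(1) and 6.14.2 come down to second derivatives in $(v, w)$, which can be inspected symbolically. The sketch below assumes SymPy; it computes the Hessian of the first integrand for fixed $x_0$, and the set of $w$ on which the second integrand is concave up.

```python
import sympy as sp

x0, v, w = sp.symbols("x0 v w", real=True)

# Example 6.14.1(1): L = exp(-x0) v^2 + (1 + x0^2) w^2 for fixed x0.
L1 = sp.exp(-x0) * v**2 + (1 + x0**2) * w**2
H1 = sp.hessian(L1, (v, w))   # 2x2 matrix of second derivatives in (v, w)

# Example 6.14.2: L = 6 w^2 - w^4 depends on w only.
L2 = 6 * w**2 - w**4
L2_ww = sp.factor(sp.diff(L2, w, 2))
concave_up_set = sp.solveset(sp.diff(L2, w, 2) > 0, w, domain=sp.S.Reals)

print(H1)              # diagonal, with positive entries for every x0
print(L2_ww)           # second derivative in w, factored
print(concave_up_set)  # the w-interval on which the graph is concave up
```

A diagonal Hessian with positive diagonal entries corresponds to the elliptic paraboloid described in Example 6.14.1(1), and the interval returned for the second integrand matches the rectangle $R_{\mathrm{u}}$.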

Example 6.14.3. Consider $F : \mathcal{V} \to \mathbb{R}$, where $F(y) = \int_0^1 y(y')^2\,dx$ and $\mathcal{V} = \{y \in C^2[0,1] \mid y(0) = 1,\ y(1) = 4\}$. This functional has a unique extremal given by $y_*(x) = (1 + 7x)^{2/3}$. For any fixed $x_0 \in [0,1]$, we have $L(x_0,v,w) = vw^2$, which is independent of $x_0$. An appropriate second-derivative test shows that the $L$-graph is neither concave up nor concave down in any open region of the $v,w$-plane. Thus the sufficient conditions in Result 6.14.1 do not hold. Nevertheless, by direct verification, we find that $y_*$ is a local minimizer in the $C^0$-norm in $\mathcal{V}$. To show this, we first note that $y_*$ satisfies the Euler–Lagrange equation $2y_*y_*'' + (y_*')^2 = 0$, and has the properties that $y_* \ge 1$ and $y_*'' < 0$. Next, let $u \in \mathcal{V}$ be arbitrary and let $h = u - y_* \in \mathcal{V}_0$. Then

$$ \begin{aligned} F(u) - F(y_*) &= \int_0^1 (y_* + h)(y_*' + h')^2 - y_*(y_*')^2\,dx \\ &= \int_0^1 h(y_*')^2 + 2y_*y_*'h' + 2hy_*'h' + (y_* + h)(h')^2\,dx. \end{aligned} \tag{6.165} $$

Using integration-by-parts on the term $\int 2y_*y_*'h'\,dx$, noting that $h(0) = 0$ and $h(1) = 0$, and then using the differential equation, we find that the above expression reduces to

$$ F(u) - F(y_*) = \int_0^1 2hy_*'h' + (y_* + h)(h')^2\,dx. \tag{6.166} $$

Next, using the fact that $2hh' = (h^2)'$, and performing an integration-by-parts on the term $\int y_*'(h^2)'\,dx$, we get

$$ F(u) - F(y_*) = \int_0^1 -y_*''h^2 + (y_* + h)(h')^2\,dx. \tag{6.167} $$

Since $y_*'' < 0$, we have $-y_*''h^2 \ge 0$, and since $y_* \ge 1$, we have $(y_* + h)(h')^2 \ge 0$ for any $h \ge -1$. Restricting to $-1 \le h \le 1$, and noting $h = u - y_*$, we obtain

$$ F(u) - F(y_*) \ge 0, \qquad \forall u \in N_{C^0}(y_*, \delta) \text{ for any } \delta \in (0,1]. \tag{6.168} $$

Thus $y_*$ is a local minimizer in the $C^0$-norm, even though the sufficient conditions in Result 6.14.1 do not hold.
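The two facts used in Example 6.14.3, that $y_*$ satisfies the Euler–Lagrange equation and that (6.167) holds, can be spot-checked. The sketch below assumes SymPy for the symbolic part and uses a small hand-written Simpson rule for the integrals; the perturbation $h = \tfrac12\sin(\pi x)$ is a hypothetical sample satisfying $h(0) = h(1) = 0$ and $|h| \le 1$.

```python
import sympy as sp

x = sp.symbols("x", nonnegative=True)
y_star = (1 + 7 * x) ** sp.Rational(2, 3)

# Residual of the Euler-Lagrange equation 2 y y'' + (y')^2 = 0.
el_residual = sp.simplify(
    2 * y_star * sp.diff(y_star, x, 2) + sp.diff(y_star, x) ** 2
)


def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    step = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * step)
    return total * step / 3


def F(expr):
    """F(y) = integral over [0, 1] of y (y')^2, evaluated numerically."""
    integrand = sp.lambdify(x, expr * sp.diff(expr, x) ** 2, "math")
    return simpson(integrand, 0.0, 1.0)


# Hypothetical perturbation with h(0) = h(1) = 0 and |h| <= 1.
h = sp.Rational(1, 2) * sp.sin(sp.pi * x)

lhs = F(y_star + h) - F(y_star)
rhs_integrand = sp.lambdify(
    x,
    -sp.diff(y_star, x, 2) * h**2 + (y_star + h) * sp.diff(h, x) ** 2,
    "math",
)
rhs = simpson(rhs_integrand, 0.0, 1.0)   # right-hand side of (6.167)
print(el_residual, lhs, rhs)
```

For this sample the two sides agree to within quadrature error and are positive, consistent with (6.168); a single sample is of course only an illustration, not a proof.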

Example 6.14.4. Consider $F : \mathcal{V} \to \mathbb{R}$, where $F(y) = \int_0^\pi (y')^2 - k^2y^2\,dx$, $\mathcal{V} = \{y \in C^2[0,\pi] \mid y(0) = 0,\ y(\pi) = 1\}$, and $k > 0$ is a noninteger constant. This functional has a unique extremal given by $y_*(x) = \sin(kx)/\sin(k\pi)$. (There is no extremal when $k$ is an integer.) For any fixed $x_0 \in [0,\pi]$, we have $L(x_0,v,w) = w^2 - k^2v^2$, which is independent of $x_0$. Here the $L$-graph is a hyperbolic paraboloid (saddle) over the entire $v,w$-plane, and the sufficient conditions in Result 6.14.1 do not hold. As before, we may attempt a direct verification, so we let $u \in \mathcal{V}$ be arbitrary, and let $h = u - y_* \in \mathcal{V}_0$. Then by straightforward calculation we obtain

$$ F(u) - F(y_*) = \int_0^\pi (h')^2 - k^2h^2\,dx. \tag{6.169} $$

Since it contains a difference of squares, the sign of the integral is not obvious. If the integral is positive for some $h$, and negative for other $h$, then the extremal will not be an extremum. To help with the analysis, we bring in a technical result about continuously differentiable functions that vanish at each end, called Wirtinger's (or Poincaré's) inequality, which states

$$ \int_a^b (h')^2\,dx \ge \left(\frac{\pi}{b-a}\right)^2 \int_a^b h^2\,dx, \qquad \forall h \in C^1[a,b],\ h(a) = 0,\ h(b) = 0. \tag{6.170} $$

Using (6.170) in (6.169) we get, after minor simplification, and with $a = 0$ and $b = \pi$,

$$ F(u) - F(y_*) \ge \int_0^\pi (1 - k^2)h^2\,dx. \tag{6.171} $$

In the case when $0 < k < 1$, the above inequality yields $F(u) - F(y_*) \ge 0$ for arbitrary $u$, which implies that $y_*$ is an absolute minimizer. On the other hand, when $k > 1$, the above inequality is not helpful: it says that $F(u) - F(y_*)$ is greater than or equal to a nonpositive quantity. In this case, we return to (6.169) and show by direct example that the integral can be positive or negative depending on $h$. For instance, let $h(x) = \varepsilon\sin(nx)$, where $\varepsilon > 0$ is an arbitrary coefficient, and $n > 0$ is an integer. Then $h \in \mathcal{V}_0$ and $F(u) - F(y_*) = \frac{1}{2}\pi\varepsilon^2(n^2 - k^2)$, which is positive if $n > k$, and negative if $n < k$. Moreover, $\varepsilon$ can be chosen so that $u$ is within any given neighborhood of $y_*$. This shows that $y_*$ is not an extremum in the case when $k > 1$.
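The closed-form value of (6.169) for $h(x) = \varepsilon\sin(nx)$ can be reproduced symbolically. The sketch below assumes SymPy and evaluates the integral for a few concrete integers $n$ (the choice $n = 1, 2, 3$ and the sample value $k = 5/2$ are hypothetical, for illustration).

```python
import sympy as sp

x = sp.symbols("x", real=True)
eps, k = sp.symbols("epsilon k", positive=True)


def gap(n):
    """Evaluate (6.169) for h = eps*sin(n*x) with a concrete integer n."""
    h = eps * sp.sin(n * x)
    integrand = sp.diff(h, x) ** 2 - k**2 * h**2
    return sp.simplify(sp.integrate(integrand, (x, 0, sp.pi)))


gaps = {n: gap(n) for n in (1, 2, 3)}
expected = {n: sp.pi * eps**2 * (n**2 - k**2) / 2 for n in gaps}

# Sign change around a sample noninteger k > 1.
sample = {n: gaps[n].subs({k: sp.Rational(5, 2), eps: 1}) for n in gaps}
print(gaps, sample)
```

With $k = 5/2$ the value is negative for $n = 1, 2$ and positive for $n = 3$, matching the statement that the sign depends on whether $n < k$ or $n > k$.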

Reference notes The calculus of variations is a vast subject whose theory spans across a multitude of levels, from elementary to advanced, with applications in all branches of science. Only the most elementary parts of the subject were discussed here. A wealth of additional information can be found in more specialized texts. More complete treatments of the elementary theory are given in the recent work by Kot (2014), and the classic works by Gelfand and Fomin (1963) and Bliss (1925). A treatment of the elementary theory with a focus on constraints, of both equality and inequality type, along with some elements of optimal control can be found in Troutman (1996), Smith (1974) and Bliss (1930). An introduction to the more advanced theory is given in Dacorogna (2015) and Buttazzo, Giaquinta and Hildebrandt (1998). Although we focused primarily on necessary conditions, the theory of sufficient conditions is an important part of the subject which can be approached in different ways. Various treatments of such conditions can be found in the texts above. The results for the key problems considered in the case studies and exercises can be found in the literature, or are direct consequences of results therein. For the brachistochrone or slide problem, the solvability of the boundary conditions and the minimizing property of the cycloid are established in the classic book by Bliss (1925); for more recent treatments, see Lawlor (1996), Troutman (1996) and Coleman (2012). For the minimal surface of revolution problem, the solvability of the boundary conditions and the characterization of the two catenoid extremals are also established in Bliss (1925); see also Kot (2014). For the boat steering problem, the uniqueness and minimizing properties of the extremals follow from straightforward concavity arguments. 
For the hanging chain problem, the existence and characterization of the two catenary extremals are established in Troutman (1996), where the problem is reformulated in an alternative way, and concavity arguments are employed. For a proof of Wirtinger’s inequality, see Dacorogna (2015) or Dym and McKean (1972).

Exercises

1. Let $F(y) = \int_0^1 (1 + x)(y')^2\,dx$, $\mathcal{V} = \{y \in C^2[0,1] \mid y(0) = 0,\ y(1) = 1\}$. Consider $\hat{y}(x) = \frac{1}{\ln 2}\ln(1 + x)$ and $\tilde{y}(x) = x^2$ in $\mathcal{V}$.
(a) Find $\delta F(y,h)$ and $\delta^2F(y,h)$ for arbitrary $y \in \mathcal{V}$ and $h \in \mathcal{V}_0$.
(b) Show $\delta F(\hat{y},h) = 0$ for every $h \in \mathcal{V}_0$, while $\delta F(\tilde{y},h) \ne 0$ for some $h \in \mathcal{V}_0$. Thus $\hat{y}$ is a candidate for a local extremum, but not $\tilde{y}$.
(c) Use $\delta^2F(\hat{y},h)$ to partially classify $\hat{y}$.

2. Let $F(y) = \int_0^1 (y')^2(4 - (y')^2)\,dx$, $\mathcal{V} = \{y \in C^2[0,1] \mid y(0) = 1\}$. Consider $\hat{y}(x) \equiv 1$ (constant) and $\tilde{y}(x) = 1 + x^2$ in $\mathcal{V}$.
(a) Find $\delta F(y,h)$ and $\delta^2F(y,h)$ for arbitrary $y \in \mathcal{V}$ and $h \in \mathcal{V}_0$.
(b) Show $\delta F(\hat{y},h) = 0$ for every $h \in \mathcal{V}_0$, while $\delta F(\tilde{y},h) \ne 0$ for some $h \in \mathcal{V}_0$. Thus $\hat{y}$ is a candidate for a local extremum, but not $\tilde{y}$.
(c) Use $\delta^2F(\hat{y},h)$ to partially classify $\hat{y}$.

3. Find the extremals, if any, for the following functionals and boundary conditions.
(a) $F(y) = \int_0^1 (y')^2 + 3y + 2x\,dx$, $\quad y(0) = 0,\ y(1) = 4$.
(b) $F(y) = \int_1^2 \frac{(y')^2}{x^3}\,dx$, $\quad y(1) = 1,\ y(2) = 2$.
(c) $F(y) = \int_0^1 (y')^2 + ye^x + y^2\,dx$, $\quad y(0) = 0,\ y(1) = 0$.
(d) $F(y) = \int_1^2 x^2(y')^2 + y^2\,dx$, $\quad y(1) = 0,\ y(2) = 1$.
(e) $F(y) = \int_1^2 x^3(y')^2 - 4y\,dx$, $\quad y(1) = 3,\ y(2) = 4$.
(f) $F(y) = \int_0^1 xyy'\,dx$, $\quad y(0) = 0,\ y(1) = 3$.
(g) $F(y) = \int_{-1}^1 2yy' - 3x\,dx$, $\quad y(-1) = -2,\ y(1) = 2$.
(h) $F(y) = \int_0^1 y^3 + e^xy\,dx$, $\quad y(0) = 3,\ y(1) = 6$.

4. For each functional and set of boundary conditions, find the unique extremal, and show it is an absolute minimizer or maximizer.
(a) $F(y) = \int_0^1 x^2 + 4y - (y')^2\,dx$, $\quad y(0) = 0,\ y(1) = 1$.
(b) $F(y) = \int_{-1}^1 y^2 + 4(y')^2\,dx$, $\quad y(-1) = -1,\ y(1) = 0$.
(c) $F(y) = \int_0^1 x - (y - y')^2\,dx$, $\quad y(0) = 0,\ y(1) = 1$.
(d) $F(y) = \int_0^2 y^2 + yy' + (y' - 2)^2\,dx$, $\quad y(0) = 1,\ y(2) = 2$.

5. Let $y_* \in \mathcal{V}$ be an extremal of $F : \mathcal{V} \to \mathbb{R}$, where $p, q \in C^2[a,b]$ are given functions and
$\mathcal{V} = \{y \in C^2[a,b] \mid y(a) = \alpha,\ y(b) = \beta\}$, $\quad F(y) = \int_a^b p(x)(y')^2 + q(x)y^2\,dx$.
Show that, if $p, q$ are positive in $[a,b]$, then $y_*$ is an absolute minimizer.

6. Prove the following alternative forms of the fundamental lemma, where $C_0^1[a,b] = \{h \in C^1[a,b] \mid h(a) = 0,\ h(b) = 0\}$.
(a) If $f \in C^0[a,b]$ and $\int_a^b fh'\,dx = 0$ for all $h \in C_0^1[a,b]$, then $f$ is a constant function.
(b) If $f, g \in C^0[a,b]$ and $\int_a^b gh + fh'\,dx = 0$ for all $h \in C_0^1[a,b]$, then $f \in C^1[a,b]$ and $f' = g$.
Note: The result in (a) is called the duBois-Reymond lemma. It can be used to derive an alternative form of the Euler–Lagrange equation, with weaker continuity requirements.

7. Find the extremals, if any, where $m > 1$ is a constant.
(a) $F(y) = \int_1^2 \frac{\sqrt{1 + (y')^2}}{x}\,dx$, $\quad y(1) = 3,\ y(2) = 4$.
(b) $F(y) = \int_0^1 \sqrt{1 + m(y')^2}\,dx$, $\quad y(0) = 5,\ y(1) = 7$.
(c) $F(y) = \int_0^1 y\sqrt{1 + (y')^2}\,dx$, $\quad y(0) = 1,\ y(1) = 4$.
(d) $F(y) = \int_0^1 y(y')^2\,dx$, $\quad y(0) = m,\ y(1) = 4m$.
(e) $F(y) = \int_0^1 y^2(y')^2\,dx$, $\quad y(0) = 1,\ y(1) = m$.

8. Light travels from a point source $S = (1, h)$ to a point receiver $R = (0, 1)$ through an atmosphere with index of refraction $n(x,y)$. According to Fermat's principle, light emitted at $S$ and arriving at $R$ travels along a ray or path $y(x)$ that minimizes the time-of-travel functional

$F(y) = \int_0^1 n(x,y)\sqrt{1 + (y')^2}\,dx$, $\quad y \in C^2[0,1]$, $\quad y(0) = 1$, $\quad y(1) = h$.

[Figure: light path $y(x)$ through the atmosphere from source $S$ at $(1,h)$ to receiver $R$ at $(0,1)$, with an obstacle on the ground between them; axis ticks at $0$, $1/2$, and $1$.]

(a) Find the extremal of $F$ in the case $n(x,y) = 1/y$, where $h > 0$ is a constant. Assume $y > 0$.
(b) Find all values of $h$ for which the light path $y(x)$ will be blocked by the obstacle ($S$ not visible to $R$).

9. Consider a scenario as in Exercise 8, but with an arbitrary index of refraction $n = n(y) > 0$. Show that if $n(y)$ decreases with height, that is $\frac{dn}{dy} < 0$, then any extremal $y(x)$ must be concave down, that is $y'' < 0$.
Note: The index of refraction in the atmosphere is believed to decrease with altitude, which implies that light paths will be concave. As a result, when we see the setting sun vanish on the horizon, it is actually already below the horizon, and has been below for some time!

10. Consider the daily food intake of an individual. Let $y$ be the mass of a given food type in the stomach at time $t \in [0,b]$, where $t = 0$ is wake time, and $t = b$ is bed time, and suppose the rate of change satisfies

$y' = -ky + u$, $\quad y = y(t)$, $\quad u = u(t)$, $\quad t \in [0,b]$.

[Figure: food mass $y$ versus time $t$, from wake time $t = 0$ to bed time $t = b$, with the target level $c$ marked.]

Here $ky$ is the food breakdown rate, and $u$ is the external control, where $u > 0$ represents "eating" and $u < 0$ "purging". (We assume $y \to 0$ during sleep.) A measure of the daily intake imbalance is

$F = \int_0^b \eta(y - c)^2 + u^2\,dt$.

Here $c$ is a target amount for the mass, and $\eta$ is a weighting factor. The parameters $b, k, c, \eta$ are positive constants. Note that larger values of $F$ correspond to larger deviations from the target, or larger eating or purging events, over longer periods. For given conditions, we seek a food intake schedule to minimize the imbalance.

(a) Write the functional as $F(y) = \int_0^b L(t,y,y')\,dt$, by eliminating $u$.
(b) Find the unique extremal $y(t)$ given $y \in C^2[0,b]$, $y(0) = 0$, $y(b) = c$.
(c) Find the control curve $u(t)$ associated with the extremal in (b).
(d) Plot $y(t)$ and $u(t)$ for the case $\{b, k, c, \eta\} = \{12, 0.5, 1, 1\}$ in dimensionless units. Briefly describe the optimal food intake schedule suggested by the curves. Are there time periods where $y > c$ or $u < 0$? Aside from continuous snacking, when should the larger meals occur? Is the target food mass reached before bed time?

11. Consider $F : \mathcal{V} \to \mathbb{R}$, where
$\mathcal{V} = \{y \in C^2[a,b] \mid y(b) = \beta\}$, $\quad y(a)$ free, $\quad F(y) = \int_a^b L(x,y,y')\,dx + [G(y)]_{x=a}$.
Derive the natural boundary condition at $x = a$ for an extremal $y_* \in \mathcal{V}$.

12. Find the extremals, if any, for the following functionals and boundary conditions.
(a) $F(y) = \int_0^3 e^{2x}((y')^2 - y^2)\,dx$, $\quad y(0) = 1,\ y(3)$ free.
(b) $F(y) = \int_0^1 \frac{1}{2}(y')^2 + y'y + y' + y\,dx$, $\quad y(0) = \frac{1}{2},\ y(1)$ free.
(c) $F(y) = \int_0^1 (y')^2 + 3y\,dx + y^2(1)$, $\quad y(0) = 4,\ y(1)$ free.
(d) $F(y) = \int_0^4 (y')^2 - 2y\,dx + y^2(0)$, $\quad y(0)$ free, $y(4) = 2$.
(e) $F(y) = \int_0^2 (y')^2 - 4y + y^2\,dx + 3y(2)$, $\quad y(0)$ free, $y(2)$ free.

13. Find the unique extremal for $F(y) = \int_0^1 (y')^2 + y + y^2\,dx$, $y(0) = 0$, $y(1)$ free. Show the extremal is an absolute minimizer or maximizer.

14. Find the unique extremal for $F(y) = \int_0^\pi (y')^2 - y^2\,dx$, $y(0) = 1$, $y(\pi)$ free. Show by example that the extremal is neither an absolute minimizer nor an absolute maximizer.

15. Find the extremals, if any, where $0 < k < 1$ is a constant. Assume $y(x) > 0$ in parts (b),(c).
(a) $F(y) = \int_0^1 \sqrt{1 - k^2 + (y')^2} - ky'\,dx$, $\quad y(0) = 4,\ y(1)$ free.
(b) $F(y) = \int_0^1 \frac{\sqrt{1 + (y')^2}}{y}\,dx$, $\quad y(0) = k,\ y(1)$ free.
(c) $F(y) = \int_0^1 \sqrt{y(1 + (y')^2)}\,dx$, $\quad y(0)$ free, $y(1) = 2$.
(d) $F(y) = \int_0^1 e^{-y}(y')^2\,dx$, $\quad y(0)$ free, $y(1) = k$.

16. Consider $F(y) = \int_0^1 [(y')^2 + ry']e^{-2y}\,dx$ where $y(0) = 1$, $y(1)$ is free, and $r$ is a constant. Find the extremals. Show there are no extremals if $r \ge r_\#$, and only one extremal if $r < r_\#$, where $r_\#$ is a number that you should find.

17. Find the extremals assuming $y > 0$ for

$F(y) = \int_0^1 \frac{ky' - (y')^2}{y}\,dx$, $\quad y(0) = 1$, $\quad y(1)$ free, $\quad k$ constant.

Show there are no extremals when $k < k_*$, two extremals when $k_* < k < k_{**}$, and only one allowable extremal when $k > k_{**}$, where $k_*$ and $k_{**}$ are numbers that you should find. What happens to the extremals when $k = k_*$ and $k = k_{**}$?

18. Consider $F : \mathcal{V} \to \mathbb{R}$, where
$\mathcal{V} = \{y \in C^4[a,b] \mid y(a) = \alpha,\ y'(a) = \gamma\}$, $\quad y(b)$ and $y'(b)$ free,
$F(y) = \int_a^b L(x,y,y',y'')\,dx + [G(y)]_{x=b} + [H(y')]_{x=b}$.
Derive the natural boundary conditions at $x = b$ for an extremal $y_* \in \mathcal{V}$.

19. Find the extremals, if any, for the following functionals and boundary conditions.
(a) $F = \int_0^1 yy' + (y'')^2\,dx$, $\quad y(0) = 0,\ y'(0) = 1,\ y(1) = 2,\ y'(1) = 4$.
(b) $F = \int_0^1 y + yy'y''\,dx$, $\quad y(0) = 2,\ y'(0) = 2,\ y(1) = 3,\ y'(1) = 3$.
(c) $F = \int_0^1 (y')^2 + (y'')^2\,dx$, $\quad y(0) = 0,\ y'(0) = 1,\ y(1) = 2,\ y'(1)$ free.
(d) $F = \int_0^1 y + 5yy'' + (y' + y'')^2\,dx$, $\quad y(0) = 1,\ y'(0) = 0,\ y(1)$ free, $y'(1) = 0$.

20. Repeat the optimal food intake problem in Exercise 10, assuming $y(b)$ is free.

21. Find the extremals of $F$, if any, subject to the constraint $G = 1$, with the given boundary conditions.

(a) $F = \int_0^\pi (y')^2\,dx$, $\quad G = \int_0^\pi y^2\,dx$, $\quad y(0) = 0,\ y(\pi) = 0$.
(b) $F = \int_0^1 y + (y')^2\,dx$, $\quad G = \int_0^1 xy'\,dx$, $\quad y(0) = 0,\ y(1) = 1$.
(c) $F = \int_0^{\pi/2} (y')^2 + 2yy' - y^2\,dx$, $\quad G = \int_0^{\pi/2} 6y\,dx$, $\quad y(0) = 0,\ y(\frac{\pi}{2}) = 1$.
(d) $F = \int_0^\pi (y')^2 + 4y\,dx$, $\quad G = \int_0^\pi y^2\,dx$, $\quad y(0) = 0,\ y(\pi)$ free.

22. Find the extremals of $F = \int_0^1 \sqrt{1 + (y')^2}\,dx$, if any, subject to the constraint $G = \int_0^1 y\,dx = A$, where $y(0) = 0$, $y(1) = 0$ and $A > 0$ is a constant. Explain why the case $0 < A \le \frac{\pi}{8}$ is different from $A > \frac{\pi}{8}$.
Note: The above is a variant of Queen Dido's problem, which is to find a plane curve that encloses the greatest area among all curves of a given length. In the above, only graphs are considered, and we seek a graph of shortest length among all graphs that enclose a given area.

23. Consider $F, G : \mathcal{V} \to \mathbb{R}$, where
$\mathcal{V} = \{y \in C^4[a,b] \mid y(a) = \alpha,\ y'(a) = \gamma,\ y(b) = \beta,\ y'(b) = \eta\}$,
$F(y) = \int_a^b L(x,y,y',y'')\,dx$, $\quad G(y) = \int_a^b M(x,y,y',y'')\,dx$.
Derive the Euler–Lagrange boundary-value problem for an extremal of $F$ subject to the constraint $G = k$.

24. Let $\mathcal{V} = \{y \in C^2[0,\pi] \mid y(0) = 0,\ y(\pi) = 0\}$, $F(y) = \int_0^\pi (y')^2\,dx$, and $G(y) = \int_0^\pi y^2\,dx$. The Euler–Lagrange equation implies that, for every integer $n \ne 0$, the function $y_{*,n}(x) = (\frac{2}{\pi})^{1/2}\sin(nx)$ is an extremal of $F$ subject to $G = 1$.
(a) For $n = \pm 1$, show that $y_{*,n}$ is an absolute minimizer of $F$ subject to $G = 1$. [Hint: Use Wirtinger's inequality (6.170).]
(b) For $n \ne \pm 1$, show that $y_{*,n}$ is not an absolute minimizer of $F$ subject to $G = 1$.

25. Consider the car acceleration problem outlined in Section 6.11. In dimensionless units, let $\{b, \ell, g\} = \{1, 1, 10\}$ and consider the case of a straight, flat road so that $\theta(s) \equiv 0$.

[Figure: car on a road from point $P$ (at $t = 0$) to point $Q$ (at $t = b$), with arclength $s$, gravity $g$, and air resistance $\eta$.]

$P$: $s = 0$ when $t = 0$, $\quad Q$: $s = \ell$ when $t = b$.

(a) For $\eta = 0$, find the unique extremal $s(t)$ for the problem in (6.128). Determine the corresponding optimal control $u(t)$. Qualitatively describe the $u$ versus $t$ curve in terms of "foot pressure" on the gas and brake pedals.
(b) For $\eta = 0$, find the unique extremal $s(t)$ for the problem in (6.129). Determine the corresponding optimal control $u(t)$. Qualitatively describe the $u$ versus $t$ curve as before.
(c) Repeat (a) for the case $\eta = 4$.
(d) Repeat (b) for the case $\eta = 4$.

Mini-project 1. A soap film will form a curved surface of revolution when stretched between two rings. If the profile of the film is denoted by $y(x)$, $x \in [-L, L]$, then the surface area is given by

$$F(y) = 2\pi \int_{-L}^{L} y \sqrt{1 + (y')^2}\,dx.$$

[Figure: soap film with profile $y(x)$ stretched between a ring of radius $\alpha$ at $x = -L$ and a ring of radius $\beta$ at $x = L$.]

The natural, observed shape of the film can be described as that which minimizes the area functional $F$ in the set $V = \{y \in C^2[-L, L] \mid y(-L) = \alpha,\ y(L) = \beta,\ y(x) > 0\}$. Here we explore this minimization problem in the case when the two rings are the same size so that $\alpha = \beta$, and for different values of the aspect ratio $\alpha/L$. We assume that $\alpha = \beta > 0$ and $L > 0$ are given constants. All quantities are dimensionless.
(a) Write and solve the Euler–Lagrange differential equation for a local minimizer of $F$ in $V$. [One way is to consider the reduced equation, and parametric description $x = f(s)$ and $y = g(s)$, with $g(s) = A \cosh(s)$, and then eliminate $s$.] By renaming constants as necessary, show that the general solution can be written as $y(x) = \frac{L}{c} \cosh\left(\frac{cx}{L} + d\right)$, where $c > 0$ and $d$ are arbitrary constants.
(b) Write the boundary conditions for $y$ in the symmetric case when $\alpha = \beta$. Show that these conditions imply the following equations for $c$ and $d$, where $\gamma = \alpha/L$,

$$d = 0 \quad\text{and}\quad \cosh(c) - \gamma c = 0.$$

(c) In the second equation above, $\gamma$ is given, and $c$ is an unknown. Show that this equation has two solutions if $\gamma > \gamma_\#$, one solution if $\gamma = \gamma_\#$, and no solution if $0 < \gamma < \gamma_\#$, where $\gamma_\#$ is an appropriate number which you should find. Each solution for $c$, together with $d = 0$, gives a candidate curve $y_* \in V$. Hence we have two, one or no candidates for a local minimizer depending on the value of the ratio $\gamma$.
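As a numerical aid for parts (c) and (d), the following Python sketch locates the roots of $\cosh(c) - \gamma c = 0$ by bisection. It is an illustration under stated assumptions rather than a prescribed method: the helper names `bisect`, `critical_ratio`, `film_roots`, and `profile` are hypothetical, and the threshold is computed from the tangency condition $c \tanh(c) = 1$, which is what results when the line $\gamma c$ is required to touch the curve $\cosh(c)$.

```python
import math


def bisect(f, lo, hi, tol=1e-12, max_iter=200):
    """Return a root of f in [lo, hi], assuming f(lo) and f(hi) differ in sign."""
    f_lo, f_hi = f(lo), f(hi)
    if f_lo * f_hi > 0:
        raise ValueError("root is not bracketed")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid                  # sign change lies in the left half
        else:
            lo, f_lo = mid, f_mid     # sign change lies in the right half
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def critical_ratio():
    """Return (c_crit, gamma_crit), where the line gamma*c is tangent to cosh(c)."""
    # Tangency: cosh(c) = gamma*c and sinh(c) = gamma together give c*tanh(c) = 1.
    c_crit = bisect(lambda c: c * math.tanh(c) - 1.0, 0.5, 2.0)
    return c_crit, math.cosh(c_crit) / c_crit


def film_roots(gamma):
    """Return the two roots c > 0 of cosh(c) - gamma*c = 0, or [] if gamma is too small.

    The tangent case gamma == gamma_crit (single root c_crit) is not resolved here.
    """
    c_crit, gamma_crit = critical_ratio()
    if gamma <= gamma_crit:
        return []
    h = lambda c: math.cosh(c) - gamma * c
    hi = 2.0 * c_crit
    while h(hi) < 0.0:                # enlarge the bracket until h changes sign
        hi *= 2.0
    return [bisect(h, 1e-9, c_crit), bisect(h, c_crit, hi)]


def profile(x, c, L=1.0):
    """Candidate film profile y(x) = (L/c) cosh(c x / L), with d = 0."""
    return (L / c) * math.cosh(c * x / L)


c_crit, gamma_crit = critical_ratio()
roots = film_roots(2.0)
print("critical ratio:", gamma_crit)
print("roots for gamma = 2:", roots)
```

Evaluating `profile` on a grid of $x$ values for each root gives the two candidate curves requested in part (d); at $x = L$ both should return $\gamma L$.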


(d) When $\gamma > \gamma_\#$ and there are two candidate curves $y_*$, it can be shown that the candidate with the smaller $c$ is a local minimizer whereas the other is not. Find and plot these curves for the case $\gamma = 2$ and indicate which is the local minimizer; for the plot interval $[-L, L]$ use $L = 1$. When $\gamma \le \gamma_\#$, it can be shown that there are no local minimizers of $F$ in $V$; in this case, a surface of minimum area no longer has a profile in $V$. What do you think might happen to the film in this case?

Note: The above results for $\gamma > \gamma_\#$ and $\gamma \le \gamma_\#$ can be illustrated by direct experiment with some bubble solution and wire rings. Note that decreasing $\gamma$ is equivalent to increasing $L$ with fixed $\alpha$. The general curve in (a) is called a catenary, and the surface of revolution is called a catenoid.

Mini-project 2. A drone boat is to be driven across a channel of moving water as described in Section 6.9. We suppose that the steering angle $\theta$ with respect to the horizontal axis can be controlled remotely, and that the boat moves at constant speed $\sigma$ relative to the water. If we let $y(x)$, $x \in [0, \ell]$ denote the path of the boat, then the travel time $T(y)$ along the path, and steering angle $\theta(x)$ required for the path, are

$$T(y) = \frac{1}{\sigma} \int_0^{\ell} \frac{\sqrt{1 - e^2 + (y')^2} - e y'}{1 - e^2}\,dx, \qquad \cos\theta = \frac{1 - e^2}{\sqrt{1 - e^2 + (y')^2} - e y'}, \qquad \sin\theta = y' \cos\theta - e.$$

[Figure: a boat crosses a channel of width $\ell$ from $P$ to $Q$ along the path $y(x)$, moving at speed $\sigma$ relative to the water with steering angle $\theta(x)$; the water moves with speed $w(x)$.]

Here $e(x) = w(x)/\sigma$, where $w(x)$ is the speed of the water. We seek the path $y(x)$ that will minimize the travel time $T(y)$ under different boundary conditions. We assume that the water speed is everywhere less than the boat speed, so that $-1 < e(x) < 1$, and that the boat path is a graph (one $y$ for each $x$) with two continuous derivatives, which requires $-\pi/2 < \theta(x) < \pi/2$. For concreteness, we use $\sigma = 1$ and $w(x) = \eta x(\ell - x)$, where $\eta = 3.5$ and $\ell = 1$ are constants. All quantities are dimensionless.
(a) Independent of boundary conditions, consider the Euler–Lagrange differential equation for local minimizers of $T(y)$. Show that the equation becomes $\partial L/\partial y' = A$, which gives $y'(x) = f(x, A)$, and thus the general solution is

$$y(x) = B + \int_0^x f(\hat{x}, A)\,d\hat{x},$$

where $A$ and $B$ are arbitrary constants. Here $f(x, A)$ is a function which you should find. Show that $f(x, A)$ will be defined for all $x \in [0, 1]$ only if $A_\# < A < A^\#$, where $A_\#$ and $A^\#$ are bounds which you should find.
(b) Consider the fixed-fixed problem where the boat must depart from $P = (0, 0)$ and arrive at $Q = (1, 3)$. In this case, an optimal path is a minimizer of $T : V \to \mathbb{R}$, where $V = \{y \in C^2[0, 1] \mid y(0) = 0,\ y(1) = 3\}$. Find a unique extremal using the solution from (a). [The boundary conditions will give a nonlinear equation for $A$; it can be solved numerically using the interval $(A_\#, A^\#)$ as a guide.]


(c) Consider the fixed-free problem where the boat must depart from $P$ but is free to arrive at any point on the other side. In this case, an optimal path is a minimizer of $T : W \to \mathbb{R}$, where $W = \{y \in C^2[0, 1] \mid y(0) = 0\}$. Find a unique extremal using the solution from (a).
(d) It can be shown that the unique extremals found in (b) and (c) are absolute minimizers. Using the expressions for $\sin\theta(x)$ and $\cos\theta(x)$, and the substitution $y'(x) = f(x, A)$, make a plot of the steering angle $\theta(x)$ for each problem. Also, make a plot of each path $y(x)$. Given that $V \subset W$, briefly explain why the travel time for the optimal path in (c) must be less than or equal to that in (b).

Mini-project 3. A chain of given length will hang in a curved shape when its ends are held fixed at two given points as outlined in Section 6.13. Using coordinates as shown, the natural shape of the chain can be described as that curve $y(x)$, $x \in [-L, L]$ which minimizes the chain potential energy $E(y)$, subject to the length constraint $G(y) = \ell$, where

$$E(y) = \int_{-L}^{L} \rho g y \sqrt{1 + (y')^2}\,dx, \qquad G(y) = \int_{-L}^{L} \sqrt{1 + (y')^2}\,dx.$$

[Figure: a chain hangs under gravity $g$ between its ends at $x = -L$ and $x = L$, which are offset vertically by $2\alpha$.]

In the above, $\ell$ is the chain length, $\rho$ is the chain mass per unit length, $g$ is gravitational acceleration, and $\alpha$ is an offset parameter such that $y(-L) = \alpha$ and $y(L) = -\alpha$. Here we explore this minimization problem in the case of zero offset so that $\alpha = 0$, and for different values of the length ratio $\gamma = \ell/(2L)$. We assume $\rho$, $g$, $\ell$, and $L$ are positive constants. All quantities are dimensionless.
(a) Write and solve the Euler–Lagrange differential equation for extremals. By filling in the details omitted in the text, and excluding the trivial solution with $y(x) \equiv$ constant, show that the general solution can be written as

$$y(x) = -b + \frac{L}{c} \cosh\left(\frac{cx}{L} + d\right),$$

where $b$, $c \ne 0$, and $d$ are unknown constants.
(b) Show that the boundary and constraint conditions $y(-L) = 0$, $y(L) = 0$ and $G(y) = \ell$ imply the following equations for $b$, $c$, and $d$, where $\gamma = \ell/(2L)$,

$$d = 0 \quad\text{and}\quad b = \frac{L}{c} \cosh(c) \quad\text{and}\quad \frac{\sinh(c)}{c} = \gamma.$$

(c) In the last equation above, 𝛾 > 0 is given, and 𝑐 ≠ 0 is an unknown. Show that this equation has two solutions if 𝛾 > 1, and no solution if 𝛾 < 1. Hence we have two or no extremals depending on the ratio 𝛾. What is the physical reason there can be no extremal if 𝛾 < 1? What is the only possible extremal when 𝛾 = 1? [This is the trivial solution excluded in part (a).]


(d) In the case when $\gamma > 1$ and there are two extremals, it can be shown that one is an absolute minimizer and the other an absolute maximizer. Find and make plots of these curves for the case when $\ell = 0.5$ and $L = 0.1$. Indicate which is the minimizer and maximizer; this should be clear. What is the middle sag-depth $q = |y(0)|$ for the energy-minimizing shape?

Note 1: The above results can be illustrated by direct experiment with an open necklace. You should be able to compute and verify the sag-depth at the middle (and other locations) from knowledge of the necklace length $\ell$ and the separation distance $2L$. Note that the results do not depend on $\rho$ and $g$.

Note 2: The case when $\ell = \sqrt{(2L)^2 + (2\alpha)^2}$, which corresponds to $\gamma = 1$ when $\alpha = 0$, is special. This value of $\ell$ is a minimum of the constraint functional $G$, and the only possible shape of the chain is a line, but this line need not satisfy the Euler–Lagrange equation since Result 6.12.1 does not hold for an extremum of $G$. This line happens to satisfy the Euler–Lagrange equation when $\alpha = 0$, but not when $\alpha \ne 0$.
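For the computation requested in part (d) of Mini-project 3, the equation $\sinh(c)/c = \gamma$ has no closed-form solution and must be solved numerically. The Python sketch below does this by bisection for $\ell = 0.5$ and $L = 0.1$. It is a minimal illustration, not the text's method: the names `bisect`, `chain_parameter`, and `chain_shape` are hypothetical. Since $\sinh(c)/c$ is an even function of $c$, the second root is taken to be the negative of the first.

```python
import math


def bisect(f, lo, hi, tol=1e-12, max_iter=200):
    """Return a root of f in [lo, hi], assuming f(lo) and f(hi) differ in sign."""
    f_lo, f_hi = f(lo), f(hi)
    if f_lo * f_hi > 0:
        raise ValueError("root is not bracketed")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid                  # sign change lies in the left half
        else:
            lo, f_lo = mid, f_mid     # sign change lies in the right half
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def chain_parameter(gamma):
    """Return the positive root c of sinh(c)/c = gamma, which requires gamma > 1."""
    if gamma <= 1.0:
        raise ValueError("no nonzero root unless gamma > 1")
    g = lambda c: math.sinh(c) / c - gamma
    hi = 1.0
    while g(hi) < 0.0:                # enlarge the bracket until g changes sign
        hi *= 2.0
    return bisect(g, 1e-9, hi)


def chain_shape(x, c, L):
    """Evaluate y(x) = -b + (L/c) cosh(c x / L) with d = 0 and b = (L/c) cosh(c)."""
    return (L / c) * (math.cosh(c * x / L) - math.cosh(c))


ell, L = 0.5, 0.1                     # values from part (d)
gamma = ell / (2.0 * L)
c = chain_parameter(gamma)
sag = abs(chain_shape(0.0, c, L))     # middle sag-depth q = |y(0)| for the root c > 0
print("c =", c, " sag depth q =", sag)
```

Calling `chain_shape(x, c, L)` gives the curve that hangs below the supports, while `chain_shape(x, -c, L)` gives its mirror image arching above them; plotting both over $[-L, L]$ makes clear which is the minimizer.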




The analysis and interpretation of mathematical models is an essential part of the modern scientific process. Topics in Applied Mathematics and Modeling is designed for a one-semester course in this area aimed at a wide undergraduate audience in the mathematical sciences. The prerequisite for access is exposure to the central ideas of linear algebra and ordinary differential equations. The subjects explored in the book are dimensional analysis and scaling, dynamical systems, perturbation methods, and calculus of variations. These are immense subjects of wide applicability and a fertile ground for critical thinking and quantitative reasoning, in which every student of mathematics should have some experience. Students who use this book will enhance their understanding of mathematics, acquire tools to explore meaningful scientific problems, and increase their preparedness for future research and advanced studies. The highlights of the book are case studies and mini-projects, which illustrate the mathematics in action. The book also contains a wealth of examples, figures, and regular exercises to support teaching and learning. The book includes opportunities for computer-aided explorations, and each chapter contains a bibliography with references covering further details of the material.

For additional information and updates on this book, visit www.ams.org/bookpages/amstext-59

AMSTEXT/59

This series was founded by the highly respected mathematician and educator, Paul J. Sally, Jr.