Topics in Applied Mathematics and Modeling Concise Theory with Case Studies
Oscar Gonzalez
EDITORIAL COMMITTEE
Giuliana Davidoff, Steven J. Miller, Tara S. Holm, Maria Cristina Pereyra, Gerald B. Folland (Chair)

2020 Mathematics Subject Classification. Primary 00A69, 34A26, 34E10, 37N99, 41A58, 49K15.
For additional information and updates on this book, visit www.ams.org/bookpages/amstext-59
Library of Congress Cataloging-in-Publication Data
Names: Gonzalez, Oscar, 1968– author.
Title: Topics in applied mathematics and modeling : concise theory with case studies / Oscar Gonzalez.
Description: Providence, Rhode Island : American Mathematical Society, [2023] | Series: Pure and applied undergraduate texts, ISSN 1943-9334; Volume 59 | Includes bibliographical references and index.
Identifiers: LCCN 2022034482 | ISBN 9781470469917 (paperback) | 9781470472177 (ebook)
Subjects: LCSH: Differential equations. | AMS: General – General and miscellaneous specific topics – General applied mathematics. | Ordinary differential equations – General theory – Geometric methods in differential equations. | Ordinary differential equations – Asymptotic theory – Perturbations, asymptotics. | Dynamical systems and ergodic theory – Applications – None of the above, but in this section. | Approximations and expansions – Approximations and expansions – Series expansions (e.g. Taylor, Lidstone series, but not Fourier series). | Calculus of variations and optimal control; optimization – Optimality conditions – Problems involving ordinary differential equations.
Classification: LCC QA372 .G617 2023 | DDC 515/.352–dc23/eng20221014
LC record available at https://lccn.loc.gov/2022034482
Copying and reprinting. Individual readers of this publication, and nonprofit libraries acting for them, are permitted to make fair use of the material, such as to copy select pages for use in teaching or research. Permission is granted to quote brief passages from this publication in reviews, provided the customary acknowledgment of the source is given.

Republication, systematic copying, or multiple reproduction of any material in this publication is permitted only under license from the American Mathematical Society. Requests for permission to reuse portions of AMS publication content are handled by the Copyright Clearance Center. For more information, please visit www.ams.org/publications/pubpermissions. Send requests for translation rights and licensed reprints to [email protected].

© 2023 by the author. All rights reserved. Printed in the United States of America.

The paper used in this book is acid-free and falls within the guidelines established to ensure permanence and durability.

Visit the AMS home page at https://www.ams.org/
Contents

Preface
Note to instructors
Case studies and mini-projects

Chapter 1. Dimensional analysis
1.1. Units and dimensions
1.2. Axioms of dimensions
1.3. Dimensionless quantities
1.4. Change of units
1.5. Unit-free equations
1.6. Buckingham π-theorem
1.7. Case study
Reference notes
Exercises
Mini-project

Chapter 2. Scaling
2.1. Domains and scales
2.2. Scale transformations
2.3. Derivative relations
2.4. Natural scales
2.5. Scaling theorem
2.6. Case study
Reference notes
Exercises
Mini-project

Chapter 3. One-dimensional dynamics
3.1. Preliminaries
3.2. Solvability theorem
3.3. Equilibria
3.4. Monotonicity theorem
3.5. Stability of equilibria
3.6. Derivative test for stability
3.7. Bifurcation of equilibria
3.8. Case study
Reference notes
Exercises
Mini-project

Chapter 4. Two-dimensional dynamics
4.1. Preliminaries
4.2. Solvability theorem
4.3. Direction field, nullclines
4.4. Path equation, first integrals
4.5. Equilibria
4.6. Periodic orbits
4.7. Linear systems
4.8. Equilibria in nonlinear systems
4.9. Periodic orbits in nonlinear systems
4.10. Bifurcation
4.11. Case study
4.12. Case study
Reference notes
Exercises
Mini-project 1
Mini-project 2
Mini-project 3

Chapter 5. Perturbation methods
5.1. Perturbed equations
5.2. Regular versus singular behavior
5.3. Assumptions, analytic functions
5.4. Notation, order symbols
5.5. Regular algebraic case
5.6. Regular differential case
5.7. Case study
5.8. Poincaré–Lindstedt method
5.9. Singular algebraic case
5.10. Singular differential case
5.11. Case study
Reference notes
Exercises
Mini-project 1
Mini-project 2
Mini-project 3

Chapter 6. Calculus of variations
6.1. Preliminaries
6.2. Absolute extrema
6.3. Local extrema
6.4. Necessary conditions
6.5. First-order problems
6.6. Simplifications, essential results
6.7. Case study
6.8. Natural boundary conditions
6.9. Case study
6.10. Second-order problems
6.11. Case study
6.12. Constraints
6.13. Case study
6.14. A sufficient condition
Reference notes
Exercises
Mini-project 1
Mini-project 2
Mini-project 3

Bibliography
Index
Preface
This book provides a concise tour of some fundamental methods and results of applied mathematics. It is designed for a one-semester course aimed at junior and senior level undergraduate students in the mathematical, physical, and engineering sciences. The prerequisites are an introductory knowledge of calculus, linear algebra, and ordinary differential equations.

The purpose of the book is to provide a context for students to gain a deeper appreciation of mathematics and its connections with other disciplines. It provides a setting in which mathematics can be observed in action, as a tool for exploring meaningful problems in the world around us. Moreover, it illustrates how mathematics is often inspired by real problems, and how mathematical abstraction can lead to physical understanding.

The subjects explored in the book are dimensional analysis and scaling, dynamical systems, perturbation methods, and calculus of variations. These are immense subjects of wide applicability, and a fertile ground for critical thinking and quantitative reasoning, in which every student of applied mathematics should have some experience.

The book originated from a set of lecture notes for the course M 374M at The University of Texas at Austin. It is intended for a course of study focused on concepts and examples. For completeness, proofs of less-standard results are summarized throughout, at the level of the prerequisites, whereas proofs of standard results can be found in the references as noted. All sections of the book were developed and improved over several years, and have been classroom tested. Over 300 exercises and 180 illustrations are provided to support teaching and learning. The highlights of the book are the case studies and mini-projects, which should be considered as essential for any plan of study. Various exercises provide opportunities for computer simulation and further exploration.

It is expected that students will benefit from this book in a number of ways. They will enhance their understanding of mathematics and gain experience in quantitative reasoning. They will also gain an appreciation for the intrinsic beauty of mathematical abstraction, and its utility as a guide for critical thinking. And they will acquire tools to explore meaningful problems, and increase their preparedness for research and advanced studies. Students can benefit from this book with minimal prerequisites, before any experience with partial differential equations or real analysis, which increases accessibility for both majors and nonmajors.

I gratefully acknowledge the many authors, mentors, and teachers whose work provided the foundation for the material presented here. My dependence on their work is profound, and too extensive for complete citation.
Note to instructors
All six chapters can be covered in a standard one-semester course, which consists of about 42 class meetings of about 50 minutes each. Each chapter is organized into a number of short sections which are easily digestible. The first three chapters can be covered quickly, whereas the final three chapters can be covered more slowly. A suggested schedule, which allows for the possibility of two midterm exams, either in-class or take-home, is: Chapter 1 (4 classes), Chapter 2 (3 classes), Chapter 3 (4 classes), Chapter 4 (7–8 classes), Chapter 5 (10 classes), and Chapter 6 (12–13 classes).

Proofs are not part of the main narrative in Chapters 1–5 and can be assigned as reading only. In contrast, some proofs in Chapter 6 are an essential part of the narrative and should be covered as appropriate. The focal points of the book are the case studies, which should be covered at the appropriate points in each chapter, and the mini-projects, which should form a regular part of the weekly assignments. Any other extended exercise that explores an application could be substituted for a case study or mini-project. Although most exercises can be completed by hand, the use of technology such as Desmos, Mathematica, and Matlab can be very helpful and should be encouraged. All exercises have been checked using such technology.

A number of additional topics could be pursued to supplement the material presented here, or be considered for independent reading for students. The concepts of Lyapunov functions, Hopf bifurcations, Poincaré maps, and chaos would be natural supplements for Chapter 4. Other perturbation methods of the WKB, averaging, and homogenization types, and other classes of singularly perturbed differential equations, such as those with interior layers and turning points, would be natural supplements for Chapter 5. A treatment of sufficient conditions, such as those based on convexity or conjugate points, along with issues of regularity, and elementary optimal control theory, would be natural supplements for Chapter 6.
Case studies and mini-projects
Case studies
Period of a pendulum (Section 1.7)
Evolution of a chemical reaction (Section 2.6)
Bifurcation events in insect populations (Section 3.8)
Outbreak condition for spread of an illness (Section 4.11)
Global phase diagram for a rigid body (Section 4.12)
Air resistance effects in ballistic targeting (Section 5.7)
Shape of a meniscus in a liquid-gas interface (Section 5.11)
Optimal design of a slide (Section 6.7)
Optimal steering control of a boat (Section 6.9)
Optimal acceleration control of a car (Section 6.11)
Shape of a hanging chain (Section 6.13)

Mini-projects
Period vs initial angle curve for a pendulum (Chapter 1)
Asymmetry of ascent and descent of a projectile (Chapter 2)
Bifurcation events in plant populations (Chapter 3)
Love-hate dynamics in relationships (Chapter 4, Mini-project 1)
Limit cycles in a biochemical process of glycolysis (Chapter 4, Mini-project 2)
Unstable spinning motions of a rigid body (Chapter 4, Mini-project 3)
Aiming angles and trajectories in ballistic targeting (Chapter 5, Mini-project 1)
Relativistic effects in the orbit of Mercury (Chapter 5, Mini-project 2)
Shape of meniscus curve in a liquid-gas interface (Chapter 5, Mini-project 3)
Soap films and minimal surfaces of revolution (Chapter 6, Mini-project 1)
Optimal paths in a boat steering problem (Chapter 6, Mini-project 2)
Min/maximizing shapes in a hanging chain problem (Chapter 6, Mini-project 3)
Chapter 1
Dimensional analysis
Mathematical models are equations that express relationships between given quantities of interest. The equations may be of any type, and the quantities may be of any type, either variable or constant. In this chapter, we outline various results about the units and dimensions of quantities, which can lead to insights, and point the way towards simpler, more concise forms of any mathematical model.
1.1. Units and dimensions

Throughout our developments we consider equations involving real-valued quantities expressed in given units of some given dimension. By a unit for a quantity we mean a scale for its measurement, such as a foot, hour, or gram. By the dimension of a quantity we mean its intrinsic type, such as length, time, or mass. Whereas a unit for a quantity can be chosen arbitrarily, the dimension of a quantity is a characteristic property that is fixed.

Not all quantities have a dimension of their own. Indeed, by virtue of their definition, the dimensions of some quantities can be expressed as combinations of others. Thus only a basic set or basis of dimensions is required to describe a collection of quantities. For example, a standard dimensional basis for quantities arising in simple physical systems is

$$\{\text{length } (L),\ \text{time } (T),\ \text{mass } (M),\ \text{temperature } (\Theta)\}. \tag{1.1}$$

Different bases could be considered depending on the context. In systems for which forces are important but not masses, the basis could include the dimension of force instead of mass. A similar change could be made if energies were important but not masses. In systems that include electrical quantities, the basis would be enlarged to include the dimension of electric current. As a different example, to describe quantities arising in a simple ecological system, a dimensional basis might consist of

$$\{\text{carnivore } (C),\ \text{herbivore } (H),\ \text{plant } (P),\ \text{insect } (I),\ \text{time } (T)\}. \tag{1.2}$$
To any dimensional basis we associate a corresponding choice of units. These units may have some standard size, or any other arbitrary, nontrivial size, and they may have some standard name, or any other arbitrary name for convenience. For example, for the dimensional basis in (1.1), one choice of units is {meter, second, kilogram, kelvin}. For the dimensional basis in (1.2), one choice of units could be {herd, flock, field, swarm, month}, where, for example, 1 herd may be defined as 20 carnivores, 1 flock may be defined as 12 herbivores, and so on. Thus we will consider real-valued quantities, with values specified in a given choice of units, in a given dimensional basis. The following notation will be used throughout.

Definition 1.1.1. Let $q \in \mathbb{R}$ be a quantity specified in units $\{U_1, \dots, U_m\}$ in a dimensional basis $\{D_1, \dots, D_m\}$ for some $m \ge 1$. By $[q]$ we mean the dimension of $q$ expressed as a product of powers of the basis elements, namely

$$[q] = D_1^{a_1} D_2^{a_2} \cdots D_m^{a_m}. \tag{1.3}$$

The numbers $a_1, \dots, a_m$ are called the dimensional exponents of $q$ in the given basis. The array of exponents is denoted by $\Delta_q = (a_1, \dots, a_m) \in \mathbb{R}^m$.

Example 1.1.1. Let $p = 3\ \mathrm{kg \cdot m^2/s^2}$, $q = 9.8\ \mathrm{m/s^2}$, and $r = 100\ \mathrm{kelvin/s}$. These quantities are expressed in units {meter, second, kilogram, kelvin} in the dimensional basis $\{L, T, M, \Theta\}$. The dimensions and corresponding dimensional exponents for $p$, $q$ and $r$ in this basis are

$$\begin{aligned}
[p] &= ML^2/T^2 = L^2 T^{-2} M \Theta^0, & \Delta_p &= (2, -2, 1, 0), \\
[q] &= L/T^2 = L T^{-2} M^0 \Theta^0, & \Delta_q &= (1, -2, 0, 0), \\
[r] &= \Theta/T = L^0 T^{-1} M^0 \Theta, & \Delta_r &= (0, -1, 0, 1).
\end{aligned} \tag{1.4}$$

Recall that, in the notation $q = 9.8\ \mathrm{m/s^2}$, the number 9.8 is the numerical value of the quantity, and the tag $\mathrm{m/s^2}$ is an explicit reminder of the units for the quantity.

When we say that one quantity is a function of another, we mean that a relation exists between their numerical values, with respect to a given choice of units, in a given dimensional basis. Thus when we write $p = f(q)$, we mean that the numerical value of $p$ is completely determined by the numerical value of $q$. The function $f$ is simply a map from one real value to another, and may be defined by a formula or graph in the usual way.
1.2. Axioms of dimensions

We adopt the basic axioms that addition and subtraction are dimensionally meaningful only for quantities of the same dimension, whereas multiplication and division are meaningful for quantities of arbitrary dimension. To state these axioms in a more precise way, let $p, q, r, s \in \mathbb{R}$ be quantities with given units in a given dimensional basis.

The basic axiom on addition and subtraction reflects the idea that only quantities of the same dimension can be added and subtracted in a dimensionally meaningful way. Thus the statement $r = p \pm q$ has a dimensional meaning only when $p$ and $q$, and hence $r$, have the same dimension. For instance, "1 meter + 2 meter" is a meaningful statement, whereas "1 meter + 2 second" is not.

The basic axiom on multiplication and division reflects the idea that quantities of any dimension can be multiplied and divided; indeed, this is how more complicated dimensions are derived from elementary ones. Thus the statements $r = pq$ and $s = p/q$ ($q \neq 0$) have a dimensional meaning for all $p$ and $q$, and the dimensions of the results $r$ and $s$ are well defined in each case. Moreover, this axiom can be extended to arbitrary powers, integration, and differentiation.

Axiom 1.2.1. Let $p, q \in \mathbb{R}$ be quantities specified in units $\{U_1, \dots, U_m\}$, in a dimensional basis $\{D_1, \dots, D_m\}$, with dimensions $[p]$, $[q]$. Then
(1) $[p \pm q]$ is defined if and only if $[p] = [q]$,
(2) $[pq] = [p][q]$ for all $p, q$,
(3) $[p/q] = [p]/[q]$ for all $p, q$ with $q \neq 0$,
(4) $[p^\alpha] = [p]^\alpha$ for all $p > 0$ and real $\alpha$,
(5) $[\int p \, dq] = [p][q]$ for any integrable function $p = f(q)$,
(6) $[dp/dq] = [p]/[q]$ for any differentiable function $p = f(q)$.

In property (4) the condition $p > 0$ ensures that $p^\alpha$ is defined for any power $\alpha$. While it would suffice to only consider rational powers, we assume that the property holds for all real powers. The content of properties (2)–(4) can be translated to the dimensional exponents $\Delta_p$ and $\Delta_q$ in a straightforward way, namely

$$\Delta_{pq} = \Delta_p + \Delta_q, \qquad \Delta_{p/q} = \Delta_p - \Delta_q, \qquad \Delta_{p^\alpha} = \alpha \Delta_p. \tag{1.5}$$
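The exponent rules in (1.5) are easy to experiment with numerically. The following is a minimal Python sketch, not part of the original text; the helper names `dim_mul`, `dim_div`, and `dim_pow` are illustrative assumptions. It stores exponent arrays as tuples in the basis {L, T, M, Θ} and rebuilds the arrays for quantities with units kg·m²/s², m/s², and kelvin/s.

```python
# Dimensional exponents in the basis {L, T, M, Theta}, stored as plain tuples.
# The helper names are illustrative; they are not notation from the text.

def dim_mul(dp, dq):
    """Exponents of a product: add the exponent arrays."""
    return tuple(x + y for x, y in zip(dp, dq))

def dim_div(dp, dq):
    """Exponents of a quotient: subtract the exponent arrays."""
    return tuple(x - y for x, y in zip(dp, dq))

def dim_pow(dp, alpha):
    """Exponents of a power: scale the exponent array by alpha."""
    return tuple(alpha * x for x in dp)

LENGTH = (1, 0, 0, 0)
TIME = (0, 1, 0, 0)
MASS = (0, 0, 1, 0)
TEMPERATURE = (0, 0, 0, 1)

# Quantities with units kg m^2/s^2, m/s^2, and kelvin/s.
delta_1 = dim_div(dim_mul(MASS, dim_pow(LENGTH, 2)), dim_pow(TIME, 2))
delta_2 = dim_div(LENGTH, dim_pow(TIME, 2))
delta_3 = dim_div(TEMPERATURE, TIME)

print(delta_1)  # (2, -2, 1, 0)
print(delta_2)  # (1, -2, 0, 0)
print(delta_3)  # (0, -1, 0, 1)
```

Tuples keep the sketch dependency-free; an array library would serve equally well for larger bases.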
1.3. Dimensionless quantities

The concept of a quantity with no dimension as defined next will play an important role throughout our developments. We note that such quantities can arise when considering combinations of other quantities, and can also arise naturally in other ways.

Definition 1.3.1. A quantity $q \in \mathbb{R}$ is called dimensionless if its dimensional expression is $[q] = 1$, or equivalently its array of dimensional exponents is $\Delta_q = 0$, in any units in any dimensional basis.

Example 1.3.1.
(1) Let $q = ab/c$, where $a = 4\ \mathrm{ft/hour}$, $b = 3\ \mathrm{1/hour}$, $c = 2\ \mathrm{ft/hour^2}$. Considering dimensions we have $[a] = LT^{-1}$, $[b] = T^{-1}$, and $[c] = LT^{-2}$, and we find that $[q] = [a][b]/[c] = 1$. Thus $q$ is a dimensionless quantity; its value is $q = 6$.
(2) Let $q$ be an arbitrary quantity with dimension $[q]$, and let $r = q + q$ and $s = q \cdot q \cdot q$. Then it is natural to rewrite these quantities as $r = 2q$ and $s = q^3$. In these latter expressions, we note that the coefficient 2 and exponent 3 are dimensionless; they are purely mathematical entities called pure numbers. The dimensions of $r$ and $s$ are $[r] = [2q] = [q]$ and $[s] = [q^3] = [q]^3$.
(3) Let $\theta$ be an arbitrary angle, which when inscribed in a circle of radius $r$ subtends an arc of length $\ell$. Then, in the radian unit of measurement, we have $\theta = \ell/r$ and we find $[\theta] = 1$. Hence angles and the radian unit of measurement are dimensionless. Similarly, since it only differs in size, the degree unit of measurement is dimensionless.
(4) Any ratio of two quantities of the same dimension is dimensionless. The value of such a ratio can be expressed as a pure number, or in terms of any arbitrary dimensionless unit such as a percentage or parts-per-hundred.
1.4. Change of units

Here we outline the effect of a change of units on an arbitrary quantity. For our purposes it will be sufficient to only consider changes in the dimensional units associated with a given dimensional basis, with any dimensionless units held fixed. We assume that any two units of the same dimensional type are related by a multiplicative conversion factor as introduced below. To state the result, we consider an arbitrary quantity $q \in \mathbb{R}$, expressed in units $\{U_1, \dots, U_m\}$, in a dimensional basis $\{D_1, \dots, D_m\}$, with dimensional exponents $\Delta_q = (a_1, \dots, a_m) \in \mathbb{R}^m$.

Result 1.4.1. If units $\{U_1, \dots, U_m\}$ are changed to $\{\tilde U_1, \dots, \tilde U_m\}$, then the quantity $q$ is changed to $\tilde q$, where

$$\tilde q = q \lambda_1^{a_1} \lambda_2^{a_2} \cdots \lambda_m^{a_m}. \tag{1.6}$$

Here $\lambda_i > 0$ ($i = 1, \dots, m$) are unit-conversion factors; each factor $\lambda_i$ quantifies the number of units of $\tilde U_i$ per unit of $U_i$.

The above result follows from straightforward algebra and the axioms on dimensions regarding multiplication and division. Note that if $q$ is dimensionless, then $\Delta_q = (0, \dots, 0)$, and we obtain $\tilde q = q$. Thus dimensionless quantities are not affected by a change of dimensional units.

Example 1.4.1. Let $q = 9.8\ \mathrm{m/s^2}$. This quantity is expressed in units {m, s} in the dimensional basis $\{L, T\}$. Since $[q] = LT^{-2}$, its dimensional exponents are $\Delta_q = (a_1, a_2) = (1, -2)$. If the units are changed to {km, min}, then the unit-conversion factors are

$$\lambda_1 = \frac{1\ \mathrm{km}}{1000\ \mathrm{m}}, \qquad \lambda_2 = \frac{1\ \mathrm{min}}{60\ \mathrm{s}}. \tag{1.7}$$

In the new units we have

$$\tilde q = q \lambda_1^{a_1} \lambda_2^{a_2} = \Big(9.8\ \frac{\mathrm{m}}{\mathrm{s^2}}\Big)\Big(\frac{1\ \mathrm{km}}{1000\ \mathrm{m}}\Big)\Big(\frac{1\ \mathrm{min}}{60\ \mathrm{s}}\Big)^{-2} = 35.28\ \frac{\mathrm{km}}{\mathrm{min^2}}. \tag{1.8}$$
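The conversion rule (1.6) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not part of the original text: the function name `change_units` is hypothetical, and the second half assumes a change from {ft, hour} to {inch, minute} to illustrate that a dimensionless combination such as (4 ft/hour)(3 1/hour)/(2 ft/hour²) keeps its value.

```python
def change_units(value, exponents, factors):
    """Apply (1.6): multiply the value by each conversion factor raised to its exponent."""
    result = value
    for lam, a in zip(factors, exponents):
        result *= lam ** a
    return result

# 9.8 m/s^2, exponents (1, -2) in the basis {L, T}, converted to {km, min}.
km_min = (1 / 1000, 1 / 60)          # km per m, min per s
q_new = change_units(9.8, (1, -2), km_min)
print(round(q_new, 6))               # 35.28

# A dimensionless combination: (4 ft/hour)(3 1/hour)/(2 ft/hour^2) = 6.
in_min = (12.0, 60.0)                # inches per ft, minutes per hour
a_new = change_units(4.0, (1, -1), in_min)
b_new = change_units(3.0, (0, -1), in_min)
c_new = change_units(2.0, (1, -2), in_min)
ratio_new = a_new * b_new / c_new
print(round(ratio_new, 6))           # 6.0
```

Each individual value changes with the units, but the conversion factors cancel in the dimensionless ratio.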
1.5. Unit-free equations

In the modeling of various types of systems, we will usually consider a set of real-valued quantities $q_1, \dots, q_n$, specified in units $\{U_1, \dots, U_m\}$, in a dimensional basis $\{D_1, \dots, D_m\}$, for some $n \ge 2$ and $m \ge 1$. We will often seek to construct and study equations of the form

$$q_1 = f(q_2, \dots, q_n), \tag{1.9}$$

where $f : \mathbb{R}^{n-1} \to \mathbb{R}$ is some function. The function notation above indicates that the numerical value of $q_1$ is completely determined by the numerical values of $q_2, \dots, q_n$ in the given units. In our pursuits, we will only consider equations that are unit-free as defined next.

Definition 1.5.1. An equation $q_1 = f(q_2, \dots, q_n)$ is called unit-free if it transforms into

$$\tilde q_1 = f(\tilde q_2, \dots, \tilde q_n) \tag{1.10}$$

under an arbitrary change of units on arbitrary values of $q_1, \dots, q_n$.

The key point of a unit-free equation is that the function $f$ is unaffected by the choice of units. All the equations that we consider will be unit-free in this sense. Note that, without this property, the function $f$ may change whenever the units are changed, and the equation would have limited value as a model. Indeed, it would be tedious to document each different version of the equation for each different choice of units. Thus a unit-free equation can be viewed as a well designed equation.

Model equations derived from fundamental physical laws are naturally unit-free; they inherit this property from the laws on which they are based. In contrast, empirical equations derived from curve fitting procedures are not naturally unit-free, but can always be re-designed to have this property by introducing appropriate dimensional constants. In the most basic sense, a unit-free equation can be viewed as a dimensionally meaningful equation, consistent with the axioms of dimensions, and this property can always be achieved.

Example 1.5.1. Let $x$, $t$ and $g$ be specified in units {m, s}, in the dimensional basis $\{L, T\}$, with dimensions $[x] = L$, $[t] = T$ and $[g] = LT^{-2}$, and exponents $\Delta_x = (1, 0)$, $\Delta_t = (0, 1)$ and $\Delta_g = (1, -2)$. Suppose that the value of $x$ is determined by the values of $t$ and $g$ through the equation

$$x = f(t, g) = \tfrac{1}{2} g t^2. \tag{1.11}$$

(Unless mentioned otherwise, unnamed quantities such as the factor $\tfrac{1}{2}$ and exponent 2 can be interpreted as pure numbers.) To determine if the above equation is unit-free, we consider a change of units from {m, s} to arbitrary units $\{\tilde U_1, \tilde U_2\}$, defined by arbitrary conversion factors $\lambda_1, \lambda_2$. In the new units, the values of $x$, $t$ and $g$ become

$$\tilde x = x \lambda_1, \qquad \tilde t = t \lambda_2, \qquad \tilde g = g \lambda_1 \lambda_2^{-2}. \tag{1.12}$$

Substitution of these expressions into $x = \tfrac{1}{2} g t^2$ gives

$$(\tilde x \lambda_1^{-1}) = \tfrac{1}{2} (\tilde g \lambda_1^{-1} \lambda_2^{2}) (\tilde t \lambda_2^{-1})^2. \tag{1.13}$$

In the above, all factors with $\lambda_1$ and $\lambda_2$ cancel, and we get

$$\tilde x = \tfrac{1}{2} \tilde g \tilde t^2. \tag{1.14}$$

Thus (1.11) is unit-free since it has exactly the same form in any choice of units. The original equation $x = f(t, g)$ is transformed into $\tilde x = f(\tilde t, \tilde g)$, with the same function $f$.
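The cancellation in this example can also be observed numerically. The sketch below is an illustration only, not from the original text: the function names and the sample conversion factors are hypothetical. It converts x, t, and g to several arbitrary unit systems and checks that the same function f still relates the converted values.

```python
def f(t, g):
    """Model (1.11): x = f(t, g) = g t^2 / 2."""
    return 0.5 * g * t ** 2

def to_new_units(x, t, g, lam1, lam2):
    """Transform values using the exponents (1, 0), (0, 1), (1, -2), as in (1.12)."""
    return x * lam1, t * lam2, g * lam1 * lam2 ** -2

def max_relative_mismatch(t, g, factor_pairs):
    """Largest relative gap between x_new and f(t_new, g_new) over the unit changes."""
    x = f(t, g)
    worst = 0.0
    for lam1, lam2 in factor_pairs:
        x_new, t_new, g_new = to_new_units(x, t, g, lam1, lam2)
        worst = max(worst, abs(x_new - f(t_new, g_new)) / abs(x_new))
    return worst

# Arbitrary positive conversion factors (hypothetical sample values).
pairs = [(1 / 1000, 1 / 60), (2.5, 0.4), (100.0, 1000.0)]
print(max_relative_mismatch(3.0, 9.8, pairs) < 1e-12)  # True
```

A numerical check of this kind samples only a few unit changes; the algebraic cancellation in the example is what establishes the property for all of them.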
Example 1.5.2. Let $x$, $t$ and $g$ be as before, and let $\tau$ be an additional quantity, say a constant, with $[\tau] = T$ and $\Delta_\tau = (0, 1)$. For purposes of comparison, consider the two different equations

$$x = \tfrac{1}{2} g t^2 e^{-t}, \qquad x = \tfrac{1}{2} g t^2 e^{-t/\tau}. \tag{1.15}$$

(Here $e^z = \exp(z)$ is the natural exponential function; the base $e$ can be interpreted as a pure number.) Considering an arbitrary change of units as above, we get

$$\tilde x = \tfrac{1}{2} \tilde g \tilde t^2 e^{-\tilde t \lambda_2^{-1}}, \qquad \tilde x = \tfrac{1}{2} \tilde g \tilde t^2 e^{-\tilde t/\tilde\tau}. \tag{1.16}$$

The first equation is not unit-free since it changes form: a unit-conversion factor remains in the equation and does not cancel out. In contrast, the second equation is unit-free since all the unit-conversion factors cancel. Note how the first equation becomes unit-free by introduction of the constant $\tau$. The equations in (1.15), written in units {m, s}, would be numerically the same when $\tau = 1\ \mathrm{s}$. However, the second equation is advantageous since it would have exactly the same form in any units.

Example 1.5.3. Let $v$ and $t$ be quantities specified in units {lb, hr}, in the dimensional basis $\{M, T\}$, with dimensions $[v] = M/T$ and $[t] = T$. Suppose that the value of $v$ is determined by the value of $t$ through an empirical equation

$$v = 3.7 t^2 - \sin(5.4 t). \tag{1.17}$$

Here we rewrite this equation in a unit-free form. To begin, we introduce constants $a, b, c$ with values 3.7, 1, 5.4 in units {lb, hr} and consider

$$v = a t^2 - b \sin(c t). \tag{1.18}$$

We next determine the dimensions of these constants to make the equation unit-free. Accordingly, let $\Delta_a = (\alpha_1, \alpha_2)$, $\Delta_b = (\beta_1, \beta_2)$ and $\Delta_c = (\gamma_1, \gamma_2)$ be the unknown dimensional exponents. Under an arbitrary change of units with conversion factors $\lambda_1, \lambda_2$, using the fact that $\Delta_v = (1, -1)$ and $\Delta_t = (0, 1)$, we get, after dividing out the conversion factors from the left side of the equation,

$$\tilde v = \lambda_1^{1-\alpha_1} \lambda_2^{-3-\alpha_2}\, \tilde a \tilde t^2 - \lambda_1^{1-\beta_1} \lambda_2^{-1-\beta_2}\, \tilde b \sin\!\big(\lambda_1^{-\gamma_1} \lambda_2^{-1-\gamma_2}\, \tilde c \tilde t\big). \tag{1.19}$$

Note that the unit-free condition will be satisfied when the exponents of all the conversion factors in the above expression are zero, which requires $\Delta_a = (1, -3)$, $\Delta_b = (1, -1)$ and $\Delta_c = (0, -1)$. Thus the dimensions of the constants are completely determined, and in units {lb, hr} we have

$$a = 3.7\ \frac{\mathrm{lb}}{\mathrm{hr^3}}, \qquad b = 1\ \frac{\mathrm{lb}}{\mathrm{hr}}, \qquad c = 5.4\ \frac{1}{\mathrm{hr}}. \tag{1.20}$$

In any other units, the equation would be $\tilde v = \tilde a \tilde t^2 - \tilde b \sin(\tilde c \tilde t)$, where $\tilde a, \tilde b, \tilde c$ are the values of the constants in the new units. In our function notation, the equation in (1.18) would be written as $v = f(t, a, b, c)$.
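The effect of giving the constants dimensions can be checked numerically. The following Python sketch is an illustration rather than part of the original text: the helper names are hypothetical, and the mass conversion factor 0.5 is an arbitrary sample value, while 60 is the number of minutes per hour. It converts all quantities with the exponents found in the example and confirms that the equation keeps its form, whereas leaving the constants at 3.7, 1, 5.4 does not.

```python
import math

def v_model(t, a, b, c):
    """Unit-free form (1.18): v = a t^2 - b sin(c t)."""
    return a * t ** 2 - b * math.sin(c * t)

def convert(value, exponents, lam):
    """Change of units (1.6) in the basis {M, T}."""
    return value * lam[0] ** exponents[0] * lam[1] ** exponents[1]

# Dimensional exponents from the example.
D_V, D_T = (1, -1), (0, 1)
D_A, D_B, D_C = (1, -3), (1, -1), (0, -1)

a, b, c = 3.7, 1.0, 5.4      # values in units {lb, hr}
t = 0.8                      # hours
v = v_model(t, a, b, c)      # lb/hr

lam = (0.5, 60.0)            # arbitrary mass factor; 60 minutes per hour
t_new = convert(t, D_T, lam)
v_new = convert(v, D_V, lam)
a_new = convert(a, D_A, lam)
b_new = convert(b, D_B, lam)
c_new = convert(c, D_C, lam)

# Same formula, converted constants: the equation holds in the new units.
holds = math.isclose(v_new, v_model(t_new, a_new, b_new, c_new), rel_tol=1e-12)
# Same formula, unconverted constants: the equation fails in the new units.
fails = not math.isclose(v_new, v_model(t_new, a, b, c), rel_tol=1e-6)
print(holds, fails)          # True True
```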
1.6. Buckingham π-theorem

Here we outline a classic result known as the Buckingham π-theorem. It states that, for any unit-free equation $q_1 = f(q_2, \dots, q_n)$, the function $f$ cannot depend on $q_2, \dots, q_n$ in a completely arbitrary way; it can only depend on certain dimensionless combinations. For simplicity we state the result only for positive values of $q_1, \dots, q_n$. Similar results hold for nonpositive values, but at the expense of more complicated statements.

Definition 1.6.1. By a power product of $q_1, \dots, q_n > 0$ we mean a quantity $\pi > 0$ of the form

$$\pi = q_1^{b_1} \cdots q_n^{b_n}, \tag{1.21}$$

for some powers $b_1, \dots, b_n \in \mathbb{R}$. We say that $\pi$ includes $q_j$ if $b_j \neq 0$.

The condition that each $q_j$ be positive ensures that $\pi$ is well defined for arbitrary powers. In any dimensional basis $\{D_1, \dots, D_m\}$, we note that each quantity $q_j$ will have dimensional exponents $\Delta_{q_j} \in \mathbb{R}^m$, and the power product $\pi$ will have dimensional exponents $\Delta_\pi \in \mathbb{R}^m$. From the definition in (1.21), together with the properties that $\Delta_{pq} = \Delta_p + \Delta_q$ and $\Delta_{p^\alpha} = \alpha \Delta_p$ given in (1.5), we deduce that

$$\Delta_\pi = b_1 \Delta_{q_1} + \cdots + b_n \Delta_{q_n} = A v, \tag{1.22}$$

where $A = (\Delta_{q_1}, \Delta_{q_2}, \dots, \Delta_{q_n}) \in \mathbb{R}^{m \times n}$ and $v = (b_1, \dots, b_n) \in \mathbb{R}^n$. Here all one-dimensional arrays are considered as columns, and we assume $n \ge 2$ and $m \ge 1$ with $n \ge m$.

Given $q_1, \dots, q_n$ we will be interested in forming power products $\pi$ that are dimensionless. In this respect, we note that

$$\pi \text{ dimensionless} \quad \Leftrightarrow \quad \Delta_\pi = 0 \quad \Leftrightarrow \quad A v = 0. \tag{1.23}$$

Furthermore, we will only be interested in nontrivial power products $\pi \not\equiv 1$, which correspond to $v \neq 0$. The following result, which essentially is a definition, characterizes the dimensionless power products that we seek.

Result 1.6.1. If $A v = 0$ has a total of $k$ independent solutions $v^1, \dots, v^k$, then a total of $k$ independent dimensionless power products $\pi_1, \dots, \pi_k$ can be formed. Any such $\pi_1, \dots, \pi_k$ is called a full set. This set is further called normalized if $\pi_1$ includes $q_1$ (with power $b_1 = 1$), and $\pi_2, \dots, \pi_k$ do not include $q_1$.

Recall that, through the usual process of row reduction, any nontrivial solution of $A v = 0$ will be expressed in terms of certain free variables. If there are $k$ free variables, then there are $k$ independent solutions $v$, and hence $k$ independent dimensionless power products $\pi$. While any independent choices of the free variables can be made to form a full set of solutions, a deliberate choice of these variables is required to form a normalized set. Specifically, the normalization condition requires that the first solution have $b_1 = 1$, and any other solutions have $b_1 = 0$.

Example 1.6.1. Let $x, t, g, h, m > 0$ be quantities with dimensions $[x] = L$, $[t] = T$, $[g] = LT^{-2}$, $[h] = LT^{-3}$ and $[m] = M$. A dimensional basis is $\{L, T, M\}$, and the dimensional exponent matrix in this basis is

$$A = (\Delta_x, \Delta_t, \Delta_g, \Delta_h, \Delta_m) = \begin{pmatrix} 1 & 0 & 1 & 1 & 0 \\ 0 & 1 & -2 & -3 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix}. \tag{1.24}$$

An arbitrary power product has the form $\pi = x^{b_1} t^{b_2} g^{b_3} h^{b_4} m^{b_5}$. The equation $A v = 0$, where $v = (b_1, \dots, b_5)$, has two free variables, and the general solution is

$$b_1 = -b_3 - b_4, \qquad b_2 = 2 b_3 + 3 b_4, \qquad b_5 = 0, \qquad b_3, b_4 \text{ free}. \tag{1.25}$$
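The row reduction behind (1.25) can be reproduced with exact rational arithmetic. The sketch below is a minimal illustration, not part of the original text: the function `null_space` is a hypothetical helper written for this purpose, using only Python's standard `fractions` module. It returns one basis solution per free variable, and the two combinations formed at the end correspond to the free-variable choices (third and fourth exponents equal to −1, 0 and −1, 1) made in this example.

```python
from fractions import Fraction

def null_space(rows):
    """Return a basis of solutions of A v = 0 using exact row reduction."""
    A = [[Fraction(x) for x in row] for row in rows]
    m, n = len(A), len(A[0])
    pivots = []
    r = 0
    for c in range(n):
        # Look for a nonzero pivot in column c, at or below row r.
        p = next((i for i in range(r, m) if A[i][c] != 0), None)
        if p is None:
            continue
        A[r], A[p] = A[p], A[r]
        pivot = A[r][c]
        A[r] = [x / pivot for x in A[r]]
        for i in range(m):
            if i != r and A[i][c] != 0:
                factor = A[i][c]
                A[i] = [x - factor * y for x, y in zip(A[i], A[r])]
        pivots.append(c)
        r += 1
        if r == m:
            break
    free = [c for c in range(n) if c not in pivots]
    basis = []
    for fc in free:
        # Set one free variable to 1, the others to 0, and solve for the pivots.
        v = [Fraction(0)] * n
        v[fc] = Fraction(1)
        for row, pc in zip(A, pivots):
            v[pc] = -row[fc]
        basis.append(v)
    return basis

A = [[1, 0, 1, 1, 0],
     [0, 1, -2, -3, 0],
     [0, 0, 0, 0, 1]]
basis = null_space(A)   # one vector per free variable (third, then fourth exponent)

def combine(c3, c4):
    """Solution with the third and fourth exponents set to c3 and c4."""
    return [c3 * x + c4 * y for x, y in zip(basis[0], basis[1])]

print([int(x) for x in combine(-1, 0)])  # [1, -2, -1, 0, 0]
print([int(x) for x in combine(-1, 1)])  # [0, 1, -1, 1, 0]
```

Exact fractions avoid the rounding questions that a floating-point null-space routine would raise when exponents such as 1/2 appear.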
Since there are two free variables, there are two independent solutions. For one solution we choose π3 = β1, π4 = 0, which gives π£ 1 = (1, β2, β1, 0, 0), and hence π1 = π₯/(ππ‘2 ). For a second solution we choose π3 = β1, π4 = 1, which gives π£ 2 = (0, 1, β1, 1, 0), and hence π2 = π‘β/π. By choice, we arranged for π1 to include π₯ with an exponent of unity, and for π2 to exclude π₯. Hence π1 , π2 is a full set of dimensionless power products for π₯, π‘, π, β, π, and this set is normalized with respect to π₯. Note that π will not be included in any dimensionless power product. The next result shows that any unit-free equation can only depend on dimensionless power products. No assumption on the form or continuity properties of the function π are required. For simplicity, the results are stated only for positive quantities; similar results can be derived to account for negative and zero quantities. Result 1.6.2. [π-theorem] Let π1 , . . . , ππ > 0 be a full set of dimensionless power products for π1 , . . . , ππ > 0 where π β₯ 1 and π β₯ 2. If the set π1 , . . . , ππ is normalized, then any unit-free equation π1 = π(π2 , . . . , ππ ) for some function π, is equivalent to an equation (1.26)
π1 = π(π2 , . . . , ππ )
for some function π. In the case that π = 1, the function π reduces to some constant πΆ. The normalization condition ensures that the reduced equation (1.26) is explicit, just as the original equation, in terms of π1 . We remark that a more general form of the theorem states that any unit-free equation in the general implicit form πΉ(π1 , . . . , ππ ) = 0 is equivalent to π·(π1 , . . . , ππ ) = 0, without any normalization condition on the power products. Also, if the only dimensionless power product for quantities π1 , . . . , ππ is trivial, then the only unit-free relation among these quantities is trivial; in this case, the set π1 , . . . , ππ would need to be enlarged in order for a nontrivial unit-free relation to exist. The proof of the theorem is based on a change of variable argument that exploits the unit-free condition and the definition of the power products, which will be outlined after some examples. Example 1.6.2. Let π₯, π‘, π, β, π > 0 be quantities as in the previous example. A normalized set of power products for these quantities is π1 = π₯/(ππ‘2 ) and π2 = π‘β/π. Thus any unit-free equation of the form π₯ = π(π‘, π, β, π) must be equivalent to an equation of the form (1.27)
\[ \pi_1 = \phi(\pi_2) \quad\text{or}\quad \frac{x}{g t^2} = \phi\!\left(\frac{t h}{g}\right), \]
which can be rearranged to yield
\[ x = g t^2 \cdot \phi\!\left(\frac{t h}{g}\right). \tag{1.28} \]
Thus $x$ cannot depend on $m$, and must depend on $t$, $g$ and $h$ in a specific way. If $\phi(0)$ is defined, then the special case when $h$ has a fixed value of zero can be considered, and the relation becomes $x = \beta g t^2$, where $\beta = \phi(0)$ is a dimensionless constant.

Example 1.6.3. Here we explicitly find the reduced form of the unit-free equation
\[ u = f(x, t, \alpha, \beta) = \frac{\alpha x}{t}\, e^{-x^2/(\beta t^2)}, \tag{1.29} \]
where $[u] = \Theta$, $[x] = L$, $[t] = T$, $[\alpha] = \Theta T/L$, and $[\beta] = L^2/T^2$. A dimensional basis is $\{\Theta, L, T\}$, and the dimensional exponent matrix is
\[ A = (\Delta_u, \Delta_x, \Delta_t, \Delta_\alpha, \Delta_\beta) = \begin{pmatrix} 1 & 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & -1 & 2 \\ 0 & 0 & 1 & 1 & -2 \end{pmatrix}. \tag{1.30} \]
An arbitrary power product is $\pi = u^{a_1} x^{a_2} t^{a_3} \alpha^{a_4} \beta^{a_5}$. The equation $Av = 0$, where $v = (a_1, \ldots, a_5)$, has two free variables. By choosing these variables in a similar way as before, we obtain the full, normalized set $\pi_1 = u/(\alpha\sqrt{\beta})$ and $\pi_2 = x/(t\sqrt{\beta})$. By the $\pi$-theorem, the original equation $u = f(x, t, \alpha, \beta)$ must be equivalent to $\pi_1 = \phi(\pi_2)$ for some function $\phi$. Here this result can be verified directly due to the explicit form of the original equation. Specifically, dividing the equation by $\alpha\sqrt{\beta}$, and then substituting, we obtain
\[ u = \frac{\alpha x}{t}\, e^{-x^2/(\beta t^2)} \;\Rightarrow\; \frac{u}{\alpha\sqrt{\beta}} = \frac{x}{t\sqrt{\beta}}\, e^{-x^2/(\beta t^2)} \;\Rightarrow\; \pi_1 = \pi_2\, e^{-\pi_2^2}. \tag{1.31} \]

Example 1.6.4. A simple theory of sound waves in a gas proposes that the speed of sound $v > 0$ should depend on only the mass density $\rho > 0$, pressure $p > 0$, and viscosity $\mu > 0$ so that
\[ v = f(\rho, p, \mu), \tag{1.32} \]
for some function $f$. Here we use the $\pi$-theorem to find an equivalent and possibly simpler form of (1.32) assuming that it is unit-free. This can be viewed as an important first step in exploring any new or proposed relation of interest. The quantities $v, \rho, p, \mu$ have dimensions $[v] = L/T$, $[\rho] = M/L^3$, $[p] = M/(LT^2)$ and $[\mu] = M/(LT)$. A dimensional basis is $\{L, T, M\}$, and the dimensional exponent matrix in this basis is
\[ A = (\Delta_v, \Delta_\rho, \Delta_p, \Delta_\mu) = \begin{pmatrix} 1 & -3 & -1 & -1 \\ -1 & 0 & -2 & -1 \\ 0 & 1 & 1 & 1 \end{pmatrix}. \tag{1.33} \]
An arbitrary power product has the form $\pi = v^{a_1} \rho^{a_2} p^{a_3} \mu^{a_4}$. The equation $Av = 0$, where $v = (a_1, \ldots, a_4)$, has one free variable, and the general solution is
\[ a_1 = -2a_3, \qquad a_2 = -a_3, \qquad a_4 = 0, \qquad a_3 \text{ free}. \tag{1.34} \]
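The general solution in (1.34) can be cross-checked by computing the null space of the exponent matrix in (1.33) with exact rational arithmetic. The following is a minimal sketch in Python, not a method taken from the text; the helper name `null_space` is an assumption made for illustration, and the columns are ordered as speed, density, pressure, viscosity in the basis $\{L, T, M\}$.

```python
from fractions import Fraction

def null_space(rows):
    """Return a basis for {v : A v = 0} using exact row reduction.

    rows: the matrix A as a list of rows with integer entries.
    Each basis vector sets one free variable to 1 and the others to 0.
    """
    a = [[Fraction(x) for x in row] for row in rows]
    n_rows, n_cols = len(a), len(a[0])
    pivots = []              # pivot column of each reduced row
    r = 0
    for c in range(n_cols):
        # look for a row at or below r with a nonzero entry in column c
        p = next((i for i in range(r, n_rows) if a[i][c] != 0), None)
        if p is None:
            continue         # no pivot here: column c is a free variable
        a[r], a[p] = a[p], a[r]
        a[r] = [x / a[r][c] for x in a[r]]
        for i in range(n_rows):
            if i != r and a[i][c] != 0:
                factor = a[i][c]
                a[i] = [x - factor * y for x, y in zip(a[i], a[r])]
        pivots.append(c)
        r += 1
        if r == n_rows:
            break
    free = [c for c in range(n_cols) if c not in pivots]
    basis = []
    for f in free:
        v = [Fraction(0)] * n_cols
        v[f] = Fraction(1)
        for row_idx, pc in enumerate(pivots):
            v[pc] = -a[row_idx][f]
        basis.append(v)
    return basis

# Exponent matrix (1.33): rows are L, T, M; columns are v, rho, p, mu.
A = [[1, -3, -1, -1],
     [-1, 0, -2, -1],
     [0, 1, 1, 1]]

basis = null_space(A)
# Rescale so that the exponent of the speed equals 1 (normalization).
# This assumes the first entry is nonzero, which holds for this matrix.
v1 = [x / basis[0][0] for x in basis[0]]
print(v1)
```

The single basis vector is proportional to $(-2, -1, 1, 0)$, and the rescaled vector has exponents $(1, 1/2, -1/2, 0)$. The same helper applies to any exponent matrix, for instance one with two free variables.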
Since there is only one free variable, there is only one independent solution. For this solution we choose $a_3 = -1/2$, which gives $v_1 = (1, 1/2, -1/2, 0)$, and hence $\pi_1 = v\sqrt{\rho/p}$. Hence $\pi_1$ is a full set of independent dimensionless power products for $v, \rho, p, \mu$, and this set is normalized with respect to $v$. By the $\pi$-theorem, the equation in (1.32) must be equivalent to
\[ \pi_1 = C \quad\text{or}\quad v = C\sqrt{\frac{p}{\rho}}, \tag{1.35} \]
where πΆ > 0 is some dimensionless constant. Thus any experimental investigation of sound waves under the given hypothesis should be aimed at (1.35), and the determination of the unknown constant πΆ. Note that, even though an unknown constant is involved, there is valuable, direct information implied by (1.35). For instance, it implies that the speed of sound must be independent of the viscosity, and would increase with pressure at fixed density, and decrease with density at fixed pressure. Moreover, the speed of sound would remain unchanged when pressure and density are both increased or decreased in a simultaneous way. Sketch of proof: Result 1.6.2. Let π1 , . . . , ππ > 0 and π1 , . . . , ππ > 0 be given, and let π΄ β βπΓπ be the dimensional exponent matrix whose columns are Ξππ = (π1,π , . . . , ππ,π ) β βπ , where π = 1, . . . , π. Note that, to each dimensionless power product ππ , there is an independent solution π£ π = (π1,π , . . . , ππ,π ) β βπ of π΄π£ = 0, where π = 1, . . . , π. Thus the row-reduced form of π΄π£ = 0 has π columns without pivots, which correspond to the free variables, and π β π columns with pivots. π
π
Due to the normalization condition, we have π1 = π1 π2 2,1 β― πππ,1 , and any remaining power products π2 , . . . , ππ involve only π2 , . . . , ππ . In view of this, we consider π΄β² β βπΓ(πβ1) , defined to be the submatrix of π΄ obtained by omitting the first column, and π£β² β βπβ1 , defined to be the subvector of π£ obtained by omitting the first entry. The assumption that π΄π£ = 0 has a full set of π independent solutions that satisfy the normalization condition implies that π΄β² π£β² = 0 has precisely π β 1 independent solutions, and hence precisely as many free variables. Consequently, the row-reduced form of π΄β² π£β² = 0 has π β π columns with pivots, so that π΄β² has rank π β π. Let π1 = π(π2 , . . . , ππ ) be given and consider an arbitrary change of units that changes π1 , . . . , ππ into π1Μ , . . . , ππΜ , and note that π1Μ = π(π2Μ , . . . , ππΜ ) by the unit-free assumption. In view of the above expression for π1 , we introduce the function π
π
πΉ(π2 , . . . , ππ ) = π(π2 , . . . , ππ )π2 2,1 β― πππ,1 so that π1 = πΉ(π2 , . . . , ππ ). Similarly, beginning from the analogous expression for π Λ1 , we find π Λ1 = πΉ(π2Μ , . . . , ππΜ ). Because it is dimensionless, we have π1 = π Λ1 , which implies πΉ(π2 , . . . , ππ ) = πΉ(π2Μ , . . . , ππΜ ). Thus the function πΉ is invariant under an arbitrary change of units. To establish the result of the theorem, we consider different cases depending on the number π of power products. In the case when π = 1, we consider the change of unit π π relations ππΜ = ππ π1 1,π β― πππ,π for π = 2, . . . , π, where π1 , . . . , ππ are the conversion factors. From this we obtain the log-linear system ln(ππΜ /ππ ) = π1,π ln π1 +β―+ππ,π ln ππ .
In matrix form, we have π΄β²π π’ = πβ² , where π’ = (ln π1 , . . . , ln ππ ) β βπ and πβ² = (ln(π2Μ /π2 ), . . . , ln(ππΜ /ππ )) β βπβ1 . When π = 1, the rank of π΄β² and π΄β²π is π β 1, and the columns of π΄β²π span βπβ1 . Thus, for arbitrary old values π2 , . . . , ππ , we can always find a change of units to obtain any specified new values π2Μ , . . . , ππΜ , say a value of one for each. The required change of units can be found by setting π2Μ = 1, . . . , ππΜ = 1 in this log-linear system and solving for the conversion factors π1 , . . . , ππ . Due to the invariance property of πΉ, for arbitrary π2 , . . . , ππ we get π1 = πΉ(1, . . . , 1) = πΆ, where πΆ is some fixed constant, which establishes the result for this case. In the case when π = π, the system π΄β² π£β² = 0 has π β 1 independent solutions = (π2,π , . . . , ππ,π ) β βπβ1 for π = 2, . . . , π. Let π΅β² β β(πβ1)Γ(πβ1) be the matrix whose columns are these solutions, and note that it is square and has full rank, and
π£β²π
π2,π
ππ,π
hence is invertible. For this case, we consider the power products ππ = π2 β― ππ for π = 2, . . . , π, and obtain the log-linear system ln ππ = π2,π ln π2 + β― + ππ,π ln ππ . In matrix form, we have π΅ β²π π€β² = ββ² , where π€β² = (ln π2 , . . . , ln ππ ) β βπβ1 and ββ² = (ln π2 , . . . , ln ππ ) β βπβ1 . Since π΅ β²π is invertible, we find that π2 , . . . , ππ can be uniquely expressed in terms of π2 , . . . , ππ . This implies that π1 = πΉ(π2 , . . . , ππ ) = π(π2 , . . . , ππ ), for some function π, which establishes the result for this case. In the case when 1 < π < π, the system π΄β² π£β² = 0 has π β 1 independent solutions = (π2,π , . . . , ππ,π ) β βπβ1 for π = 2, . . . , π, and has rank π β π as noted earlier. Without loss of generality, up to a reordering of π2 , . . . , ππ , we may suppose that the pivots in the system π΄β² π£β² = 0 all occur in the leading π β π columns, whereas the free variables all occur in the latter π β 1 columns. We now consider the π β π change π π of unit relations ππΜ = ππ π1 1,π β― πππ,π for π = 2, . . . , π β π + 1, and again consider the system ln(ππΜ /ππ ) = π1,π ln π1 +β―+ππ,π ln ππ . Since the dimensional exponent vectors (π1,π , . . . , ππ,π ) are the leading π β π columns of π΄β² , they are independent. Thus for arbitrary π2 , . . . , ππβπ+1 we can find π1 , . . . , ππ to achieve π2Μ = 1, . . . , ππβπ+1 Μ = 1. We π2,π ππ,π next consider the π β 1 power products ππ = π2 β― ππ for π = 2, . . . , π. Since ππ = ππβπ+2,π ππ,π π Λπ and π2Μ = 1, . . . , ππβπ+1 Μ = 1, we get the reduced expressions ππ = ππβπ+2 Μ β― ππΜ , which leads to the system ln ππ = ππβπ+2,π ln ππβπ+2 Μ + β― + ππ,π ln ππΜ . For each π, we note that (ππβπ+2,π , . . . , ππ,π ) are the πβ1 free variables from the system π΄β² π£β² = 0, which were independently chosen to generate the solution set. Thus this log-linear system is square and has full rank, and we find that ππβπ+2 Μ , . . . , ππΜ can be uniquely expressed in terms of π2 , . . . , ππ . 
This implies π1 = πΉ(π2Μ , . . . , ππΜ ) = πΉ(1, . . . , 1, ππβπ+2 Μ , . . . , ππΜ ) = π(π2 , . . . , ππ ), for some function π, which establishes the result. π£β²π
1.7. Case study

Setup. To illustrate the preceding results on dimensional methods, and the process of modeling a simple mechanical system, we study the motion of a pendulum released from rest. Figure 1.1 illustrates the system, which consists of a string of length $\ell$, with one end attached to a fixed support point, and the other end attached to a ball of mass $m$. We assume the string is always in tension and hence straight, and we let $\theta$ denote the angle between the string and a vertical line through the support point, and arbitrarily take the positive direction to be counter-clockwise. We assume that gravitational acceleration $g$ is directed in the downward, vertical direction. When the ball is raised and released from the rest conditions $\theta = \theta_0$ and $d\theta/dt = 0$ at time $t = 0$, the ball will swing back-and-forth in a periodic motion. We seek to understand various aspects of this motion; for example, how the period depends on the parameters $m$, $g$, $\ell$, and $\theta_0$.

Figure 1.1. (Pendulum sketch: origin $o$ at the support point, $x,y$ axes with unit vectors $\vec{i}, \vec{j}$, position vector $\vec{r}$, angle $\theta$, ball of mass $m$, unit vectors $\vec{e}_r, \vec{e}_\theta$, forces $\vec{F}_{\text{string}}$ and $\vec{F}_{\text{gravity}}$, and gravitational acceleration $g$.)

Outline of model. We assume that the motion occurs in a plane and introduce an origin and $x, y$ coordinates as shown. The standard unit vectors in the positive $x$ and $y$ directions are denoted by $\vec{i}$ and $\vec{j}$, and the position vector for the ball is denoted by $\vec{r}$. It will be convenient to introduce unit vectors $\vec{e}_r$ and $\vec{e}_\theta$ that are parallel and perpendicular to $\vec{r}$. For any angle $\theta$, the components of these vectors are $\vec{r} = \ell\sin\theta\,\vec{i} + \ell\cos\theta\,\vec{j}$, $\vec{e}_r = \sin\theta\,\vec{i} + \cos\theta\,\vec{j}$, and $\vec{e}_\theta = \cos\theta\,\vec{i} - \sin\theta\,\vec{j}$. By differentiating the position with respect to time, we obtain the velocity and acceleration vectors
\[ \begin{aligned} \frac{d\vec{r}}{dt} &= \ell\cos\theta\,\frac{d\theta}{dt}\,\vec{i} - \ell\sin\theta\,\frac{d\theta}{dt}\,\vec{j}, \\ \frac{d^2\vec{r}}{dt^2} &= \left[\ell\cos\theta\,\frac{d^2\theta}{dt^2} - \ell\sin\theta\left(\frac{d\theta}{dt}\right)^{2}\right]\vec{i} - \left[\ell\sin\theta\,\frac{d^2\theta}{dt^2} + \ell\cos\theta\left(\frac{d\theta}{dt}\right)^{2}\right]\vec{j}. \end{aligned} \tag{1.36} \]
We assume that only two forces act on the ball: one due to gravity, and another due to the pull of the string. Thus we neglect any other forces, such as that due to air β resistance. The force of gravity has the form πΉgravity = ππ π,β and the force in the string β has the form πΉstring = βπ π πβ , where π is an unknown tension, which is nonconstant in general. Note that, although the magnitude of this force is unknown, its direction is known: it is always parallel to π πβ . Newtonβs law of motion for the ball requires that the product of its mass and acceleration be equal to the sum of the applied forces, or equivalently, (1.37)
\[ m\,\frac{d^2\vec{r}}{dt^2} = \vec{F}_{\text{gravity}} + \vec{F}_{\text{string}}. \]
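To put the above equation in a concise form, and eliminate the unknown magnitude of $\vec{F}_{\text{string}}$, we consider the vector dot-product of the above with the unit vector $\vec{e}_\theta$, namely
\[ m\,\frac{d^2\vec{r}}{dt^2}\cdot\vec{e}_\theta = \vec{F}_{\text{gravity}}\cdot\vec{e}_\theta + \vec{F}_{\text{string}}\cdot\vec{e}_\theta. \tag{1.38} \]
By direct calculation, using the component expressions for all vectors involved, and the facts that $\vec{i}\cdot\vec{i} = 1$, $\vec{j}\cdot\vec{j} = 1$ and $\vec{i}\cdot\vec{j} = 0$, and noting that $\vec{e}_r\cdot\vec{e}_\theta = 0$ because they are perpendicular, we obtain
\[ m\,\frac{d^2\vec{r}}{dt^2}\cdot\vec{e}_\theta = m\ell\,\frac{d^2\theta}{dt^2}, \qquad \vec{F}_{\text{gravity}}\cdot\vec{e}_\theta = -mg\sin\theta, \qquad \vec{F}_{\text{string}}\cdot\vec{e}_\theta = 0. \tag{1.39} \]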
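The first relation in (1.39) can be checked term by term. The short calculation below is a worked step added for convenience; it uses the acceleration from (1.36) together with $\vec{e}_\theta = \cos\theta\,\vec{i} - \sin\theta\,\vec{j}$, and the dot notation for time derivatives is a shorthand introduced here, not notation from the text.

```latex
\begin{aligned}
\frac{d^2\vec{r}}{dt^2}\cdot\vec{e}_\theta
 &= \bigl[\ell\cos\theta\,\ddot{\theta} - \ell\sin\theta\,\dot{\theta}^{2}\bigr]\cos\theta
  + \bigl[\ell\sin\theta\,\ddot{\theta} + \ell\cos\theta\,\dot{\theta}^{2}\bigr]\sin\theta \\
 &= \ell\,(\cos^{2}\theta + \sin^{2}\theta)\,\ddot{\theta}
  = \ell\,\ddot{\theta},
\qquad \dot{\theta} = \frac{d\theta}{dt},\quad \ddot{\theta} = \frac{d^{2}\theta}{dt^{2}}.
\end{aligned}
```

The terms in $\dot{\theta}^2$ cancel, so only the angular acceleration survives. Likewise $mg\,\vec{j}\cdot\vec{e}_\theta = -mg\sin\theta$, and the string force has no component along $\vec{e}_\theta$ because it is parallel to $\vec{e}_r$.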
By substituting (1.39) into (1.38), and dividing out the mass and rearranging, we arrive at a differential equation for the pendulum motion. When the release conditions at time $t = 0$ are included, we obtain
\[ \ell\,\frac{d^2\theta}{dt^2} + g\sin\theta = 0, \qquad \left.\frac{d\theta}{dt}\right|_{t=0} = 0, \qquad \theta|_{t=0} = \theta_0, \qquad t \ge 0. \tag{1.40} \]
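The initial-value problem (1.40) has no elementary solution, but it is straightforward to integrate numerically. The sketch below is one possible approach, assuming a classical fourth-order Runge-Kutta scheme with a fixed step; the function name `pendulum_period` and the step size are choices made here for illustration. It estimates the period as four times the first time at which the angle crosses zero, which relies on the symmetry of the motion for release from rest with $0 < \theta_0 < \pi$.

```python
import math

def pendulum_period(g, ell, theta0, dt=1e-4):
    """Estimate the period of ell*theta'' + g*sin(theta) = 0, released from rest.

    Requires 0 < theta0 < pi. Integrates with classical RK4 until the angle
    first reaches zero (one quarter of the period), then refines the crossing
    time by linear interpolation inside the final step.
    """
    def rhs(theta, omega):
        # first-order system: theta' = omega, omega' = -(g/ell) sin(theta)
        return omega, -(g / ell) * math.sin(theta)

    t, theta, omega = 0.0, theta0, 0.0
    while True:
        k1t, k1w = rhs(theta, omega)
        k2t, k2w = rhs(theta + 0.5 * dt * k1t, omega + 0.5 * dt * k1w)
        k3t, k3w = rhs(theta + 0.5 * dt * k2t, omega + 0.5 * dt * k2w)
        k4t, k4w = rhs(theta + dt * k3t, omega + dt * k3w)
        theta_new = theta + dt * (k1t + 2 * k2t + 2 * k3t + k4t) / 6
        omega_new = omega + dt * (k1w + 2 * k2w + 2 * k3w + k4w) / 6
        if theta_new <= 0.0:
            frac = theta / (theta - theta_new)   # fraction of the last step
            return 4.0 * (t + frac * dt)
        t, theta, omega = t + dt, theta_new, omega_new

# Small release angle: the estimate should be close to 2*pi*sqrt(ell/g).
print(pendulum_period(9.8, 1.0, 0.01), 2 * math.pi * math.sqrt(1.0 / 9.8))
```

Note that the mass never enters the computation, in line with its elimination from (1.40). Release angles close to $\pi$ lead to very long integration times, since the period grows without bound there.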
The equations in (1.40) form a second-order, nonlinear, initial-value problem for the pendulum angle $\theta$ as a function of time $t$. This function also naturally depends on the parameters $g$, $\ell$, and $\theta_0$ that appear in the equations, and we note that the mass $m$ was eliminated along the way. The theory of ordinary differential equations guarantees that there exists a unique solution $\theta = f(t, g, \ell, \theta_0)$, for some function $f$. Moreover, provided that the initial velocity is zero and the initial angle satisfies $\theta_0 \in (0, \pi)$, this solution will be periodic in time with a period $P = F(g, \ell, \theta_0)$, for some function $F$. Although they can be written in terms of certain special (elliptic) functions, there are no elementary expressions for $f$ or $F$. Here we use dimensional methods to find a reduced form of the period relation and examine some implications.

Reduced equation for period. The quantities $P, g, \ell, \theta_0$ have dimensions $[P] = T$, $[g] = L/T^2$, $[\ell] = L$ and $[\theta_0] = 1$. A dimensional basis is $\{T, L\}$, and the dimensional exponent matrix in this basis is
\[ A = (\Delta_P, \Delta_g, \Delta_\ell, \Delta_{\theta_0}) = \begin{pmatrix} 1 & -2 & 0 & 0 \\ 0 & 1 & 1 & 0 \end{pmatrix}. \tag{1.41} \]
An arbitrary power product has the form $\pi = P^{a_1} g^{a_2} \ell^{a_3} \theta_0^{a_4}$. The equation $Av = 0$, where $v = (a_1, \ldots, a_4)$, has two free variables, and the general solution is
\[ a_1 = -2a_3, \qquad a_2 = -a_3, \qquad a_3 \text{ and } a_4 \text{ free}. \tag{1.42} \]
Since there are two free variables, there are two independent solutions. For the first solution, we choose $a_3 = -1/2$ and $a_4 = 0$, which gives $\pi_1 = P\sqrt{g/\ell}$. For the second solution, we choose $a_3 = 0$ and $a_4 = 1$, which gives $\pi_2 = \theta_0$. This is a full set of independent dimensionless power products, and is normalized with respect to $P$. By the $\pi$-theorem, the period equation $P = F(g, \ell, \theta_0)$ must be equivalent to
\[ \pi_1 = \phi(\pi_2) \quad\text{or}\quad P = \sqrt{\frac{\ell}{g}}\;\phi(\theta_0), \tag{1.43} \]
for some function $\phi$. Thus the relation between the quantities $P, g, \ell, \theta_0$ is not characterized by an unknown function of three quantities $F(g, \ell, \theta_0)$, but is instead characterized by an unknown function of one quantity $\phi(\theta_0)$. Equivalently, the dependence of $F(g, \ell, \theta_0)$ on the quantities $g$ and $\ell$ is completely dictated by dimensional considerations.

Some implications. The reduced form of the period relation given in (1.43) has some interesting implications, as summarized next.

(1) A single curve of $\pi_1$ versus $\pi_2$ completely determines the function $\phi$, and hence the relation between the quantities $P$, $g$, $\ell$, and $\theta_0$. Thus any experiment or further analysis should be aimed at determining this curve.

(2) Consider any two pendula released from the same initial angle $\theta_0$. Let $\{g_1, \ell_1, \theta_0\}$ be the parameters of the first pendulum, and $\{g_2, \ell_2, \theta_0\}$ be the parameters of the second. In view of (1.43), the periods of the two pendula are given by $P_1 = \sqrt{\ell_1/g_1}\,\phi(\theta_0)$ and $P_2 = \sqrt{\ell_2/g_2}\,\phi(\theta_0)$. By dividing these two expressions, we obtain a fundamental period law for pendula, namely
\[ \frac{P_1}{P_2} = \sqrt{\frac{\ell_1\, g_2}{\ell_2\, g_1}}. \tag{1.44} \]
(3) The dependence of the period $P$ on each of the quantities $g$, $\ell$, and $\theta_0$ can be characterized using (1.43). Specifically, for fixed $g$ and $\theta_0$, the period $P$ is an increasing function of $\ell$; for fixed $\ell$ and $\theta_0$, the period $P$ is a decreasing function of $g$; and for fixed $\ell$ and $g$, the period $P$ increases or decreases with $\theta_0$ depending on the function $\phi$. A detailed analysis shows that the function $\phi(\theta_0)$, $\theta_0 \in (0, \pi)$, is positive and monotonically increasing, and has the limits
\[ \lim_{\theta_0 \to 0^+} \phi(\theta_0) = 2\pi, \qquad \lim_{\theta_0 \to \pi^-} \phi(\theta_0) = \infty. \tag{1.45} \]

Thus the period satisfies $P \approx 2\pi\sqrt{\ell/g}$ for a pendulum released from rest with initial angle $\theta_0 \approx 0$, which corresponds to a nearly vertical, downward position. Interestingly, the period $P$ is arbitrarily large for initial angles $\theta_0 \approx \pi$, which corresponds to a nearly vertical, upward position. The special cases of $\theta_0 = 0$ and $\theta_0 = \pi$ correspond to rest or equilibrium positions of the system in which no motion occurs. Such states and further properties of dynamical systems will be considered in later chapters.
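The limits in (1.45) and the period law (1.44) can be explored numerically. The sketch below is not derived in the text: it assumes the standard closed form of the pendulum period in terms of the complete elliptic integral of the first kind, $P = 4\sqrt{\ell/g}\,K(\sin(\theta_0/2))$, and evaluates $K$ through the arithmetic-geometric mean, so that $\phi(\theta_0) = 2\pi/\mathrm{AGM}(1, \cos(\theta_0/2))$. The function names `phi` and `period` are illustrative.

```python
import math

def phi(theta0, tol=1e-14):
    """Dimensionless period phi(theta0) = P*sqrt(g/ell) for 0 < theta0 < pi.

    Uses K(k) = pi / (2*AGM(1, sqrt(1 - k**2))) with k = sin(theta0/2),
    so that phi = 4*K(k) = 2*pi / AGM(1, cos(theta0/2)).
    """
    a, b = 1.0, math.cos(theta0 / 2.0)
    while abs(a - b) > tol * a:
        # one arithmetic-geometric mean step
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return 2.0 * math.pi / a

def period(g, ell, theta0):
    """Period from the reduced form (1.43): P = sqrt(ell/g) * phi(theta0)."""
    return math.sqrt(ell / g) * phi(theta0)

angles = [0.001, math.pi / 6, math.pi / 3, math.pi / 2, 3.0, 3.14]
values = [phi(th) for th in angles]
for th, val in zip(angles, values):
    print(f"theta0 = {th:.4f}   phi = {val:.4f}")

# Period law (1.44): two pendula with the same release angle.
ratio = period(9.8, 0.4, 1.0) / period(1.6, 0.1, 1.0)
print(ratio, math.sqrt((0.4 * 1.6) / (0.1 * 9.8)))
```

The printed values start near $2\pi$ for small angles and grow as $\theta_0$ approaches $\pi$, and the two numbers on the last line agree because $\phi(\theta_0)$ cancels in the ratio.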
Reference notes Classic references for the material presented here are the books by Birkhoff (2015) and Bridgman (1963). A recent treatment with a wealth of details and examples is given in Szirtes (2007), and a concise guide that illustrates various diverse applications is given in Lemons (2017).
Exercises 1. Let π₯, π‘, π, π be quantities in units {m, s}, and let π₯,Μ π‘,Μ π,Μ π Μ be the corresponding quantities in units {cm, hr}, with dimensions [π₯] = πΏ, [π‘] = π, [π] = πΏ/π and [π] = π. Change the equation from π₯, π‘, π, π to π₯,Μ π‘,Μ π,Μ π.Μ Is the equation unit-free? (a) π₯ =
ππ‘3 arctan(π‘) . (π + 4π‘)2
(b) π₯ =
ππ‘2 arctan(π‘/π) . π + 4π‘
2. Let π₯, π¦ and π, π, π, π be quantities in units {ft, lb} in the basis {πΏ, π}. Also, let ππ¦ π£ = ππ₯ . Assuming [π₯] = πΏ and [π¦] = π, find the dimensions [π], [π], [π], [π ] as needed to make the given equation unit-free.
π . π + π₯2
(a) π¦ = ππ₯2 + π sin(ππ₯).
(b) π¦ = π ln(π π₯) +
(c) π¦ = (ππ₯ + ππ₯2 )ππ₯/π .
(d) π¦ =
(e) π£ = ππ₯ β ππ₯3 β π.
(f) π£ = ππ¦ β ππ₯2 β ππ₯π¦2 .
2
π + ππππ₯ . π +π₯
3. Let π, π, π
, π be quantities in given units in a basis {π·1 , π·2 }. Show that π = π + π
+ π is unit-free if and only if [π] = [π] = [π
] = [π]. 4. Let π₯, π¦, π§ > 0 and π, π, π > 0 be quantities with the following dimensions in the basis (1.1): [π₯] = πΏ, [π¦] = ππΏ/π, [π§] = π©π/(πΏπ), [π] = πΏ2 ,
[π] = π/π,
[π] = π/π©.
Find a reduced form of the given equation assuming it is unit-free. If not possible, explain why. (a) π₯ = π(π¦, π, π).
(b) π¦ = π(π₯, π§, π).
(c) π§ = π(π₯, π¦, π).
(d) π¦ = π(π₯, π, π).
5. Let π’, π£, π€ > 0 and πΌ, π½, πΎ, πΏ > 0 be quantities with the following dimensions in the basis (1.2): [π’] = πΆ/π, [πΌ] = πΆ/π»,
[π£] = π»/π,
[π½] = π/π»,
[π€] = π/π,
[πΎ] = π»/(πΆπ),
[πΏ] = 1/π.
Find a reduced form of the given equation assuming it is unit-free. If not possible, explain why. (a) π’ = π(π£, π€, πΌ, π½).
(b) π£ = π(π€, π½, πΎ, πΏ).
(c) π€ = π(πΌ, π½, πΎ).
(d) π€ = π(π’, π£, πΌ, π½, πΏ).
6. An experiment to measure the temperature π’ in a furnace at time π‘ is performed. A curve fitting procedure applied to the π’, π‘ data yields the empirical relation π’ = 3.7π‘1.5 + 4.2π‘ + 293.2 in units {kelvin, minutes}. Here we explore different ways to make this relation unit-free. (a) Consider π’ = ππ‘1.5 + ππ‘ + π, where π, π, π are dimensional constants with values 3.7, 4.2, 293.2 in units {kelvin, minutes}. What must be the dimensions of π, π, π so that the equation is unit-free? What would be the values of π, π, π in units {kelvin, hours}? π‘
π‘
(b) Alternatively, consider π’ = π½[3.7( πΌ )1.5 + 4.2 πΌ + 293.2], where πΌ and π½ are dimensional constants, with values πΌ = 1 minute and π½ = 1 kelvin, and 3.7, 4.2 and 293.2 are dimensionless (pure) numbers. Show that this form of the equation is also unit-free.
7. Data is collected on the height π¦ of certain trees at time π‘ during their lifetimes. A curve fitting procedure gives the empirical relation π¦ = 52.4 β 52.4πβ0.1π‘ β 3.3π‘πβ0.2π‘ in units {foot, year}. Rewriting as π¦ = π β ππβππ‘ β ππ‘πβππ‘ , find the dimensions of π, π, π, π, π so that the equation is unit-free. Convert the equation to units {yard, decade}. 8. According to a simple theory of growth, the ultimate radius π > 0 of a singlecelled organism is determined by the nutrient absorption rate π > 0 through its surface, and nutrient consumption rate π > 0 throughout its volume, via a unitfree equation π = π(π, π). Find a reduced form of the relation using [π] = π/(πΏ2 π) and [π] = π/(πΏ3 π). 9. A metal forming process involves a pressure π > 0, length π₯ > 0, time π‘ > 0, mass π > 0 and density π > 0, and is described by a unit-free equation π = π(π₯, π‘, π, π). Find a reduced form of this relation and express the result explicitly in terms of π. Recall that [π] = π/(πΏπ 2 ) and [π] = π/πΏ3 . 10. A sphere of radius π > 0 is immersed in a fluid of density π > 0 and viscosity π > 0. When subject to a force π > 0, the sphere attains a terminal velocity π£ > 0. We suppose there is a unit-free equation π£ = π(π, π, π, π), where [π] = ππΏπ β2 , [π] = ππΏβ3 , [π] = ππΏβ1 π β1 . (a) Find a reduced form of π£ = π(π, π, π, π). (b) For fixed π, π, π, show that π£ is proportional to a power of π. 11. A model for the digestion process in animals states that the absorption rate π’ > 0 of a given nutrient is determined by the concentration π > 0, residence time π > 0, and breakdown rate π > 0 of the nutrient in the gut, along with the volume π£ > 0 of the gut. We suppose there is a unit-free equation π’ = π(π, π, π, π£), where [π’] = π/π, [π] = π/πΏ3 and [π] = π/(πΏ3 π). (a) Find a reduced form of π’ = π(π, π, π, π£). (b) For fixed π, π, π, show that π’ is proportional to π£. 12. 
In an explosion, a circular blast wave of intense pressure expands from the point of explosion into the surrounding air. A simple theory asserts that the radius π > 0 of the wave is determined by the elapsed time π‘ > 0, the energy πΈ > 0 released in the explosion, and the density π > 0 of the surrounding air, via a unit-free equation π = π(π‘, πΈ, π). Find a reduced form of this relation and show that the radius must increase with time in a nonlinear way; specifically, π increases as π‘2/5 . Here [πΈ] = ππΏ2 /π 2 and [π] = π/πΏ3 . 13. In a domino toppling show, a long line of dominoes topple over, one by one, in a chain reaction. It is hypothesized that the speed π£ > 0 of the toppling wave depends on the spacing π > 0 and height β > 0 of each domino, and gravitational acceleration π > 0, via a unit-free equation π£ = π(π, β, π). (The speed is assumed to be insensitive to the thickness and width of each domino.)
(a) Show that a reduced form of the speed equation is $v = \sqrt{gh}\;\phi(d/h)$ for some function $\phi$. (b) The figure below shows a plot of $\pi_1 = \phi(\pi_2)$, where $\pi_1 = v/\sqrt{gh}$ and $\pi_2 = d/h$, made with data from different domino experiments. Note that the graph is approximately linear on the interval $0.1 \le \pi_2 \le 0.7$. Find a linear expression for $\pi_1 = \phi(\pi_2)$ valid on this interval. Use this expression to write $v$ in terms of $d, h, g$.

[Figure: plot of $\pi_1$ versus $\pi_2$, with tick marks 1.2 and 1.5 on the $\pi_1$ axis and 0.1 and 0.7 on the $\pi_2$ axis.]
14. Drone airplanes of surface area π use fuel of energy content π to fly at velocity π£ through air of viscosity π. A theory proposes that the fuel consumption rate π is determined by a unit-free equation π = π(π , π, π£, π), where [π] = πΏ3 π β1 , [π] = ππΏβ1 π β2 and [π] = ππΏβ1 π β1 . (a) Find a reduced form of π = π(π , π, π£, π). (b) Using data from two experiments, and linear interpolation in a plot of the reduced form, find π when (π , π, π£, π) = (6, 3, 20, 1). Data in some appropriate units π π π£ π π experiment 1: 5 2 10 1 1 experiment 2: 7 3 15 2 5
Mini-project. As developed in Section 1.7, a model for an ideal pendulum released from rest is
\[ \ell\,\frac{d^2\theta}{dt^2} + g\sin\theta = 0, \qquad \left.\frac{d\theta}{dt}\right|_{t=0} = 0, \qquad \theta|_{t=0} = \theta_0, \qquad t \ge 0. \]
Here $\theta$ is the pendulum angle, $\ell$ is the length, $g$ is gravitational acceleration, and $t$ is time. The above system has a unique solution $\theta = f(t, g, \ell, \theta_0)$, for some function $f$, and provided that the initial velocity is zero and the initial angle satisfies $\theta_0 \in (0, \pi)$, this solution will be periodic in time with a period $P = F(g, \ell, \theta_0)$, for some function $F$. Here we study the period relation and construct an approximate formula for it using some data. All quantities are in units of meters and seconds.

(a) As outlined in the text, show that the reduced form of the period relation is $P = \sqrt{\ell/g}\;\phi(\theta_0)$, for some function $\phi$. Given that $\phi$ and $F$ are both unknown, what is the conceptual advantage of the form $P = \sqrt{\ell/g}\;\phi(\theta_0)$ compared to the form $P = F(g, \ell, \theta_0)$?

(b) The table below shows experimental measurements of $P$ for five pendula with different values of $g$, $\ell$, and $\theta_0$. Compute the value of $\phi$ for each case; make a table or plot of $\phi$ versus $\theta_0$. Over the given interval of $\theta_0$, what are the qualitative features of $\phi$? Does the function appear to be increasing or decreasing? Concave up or concave down?
           $g$, m/s$^2$    $\ell$, m    $\theta_0$, rad    $P$, s
case 1:    9.80            0.20         $\pi/12$           0.9015
case 2:    9.80            0.10         $\pi/6$            0.6457
case 3:    9.80            0.50         $\pi/4$            1.4760
case 4:    9.80            0.30         $\pi/3$            1.1798
case 5:    9.80            0.25         $\pi/2$            1.1845
(c) Using your $\phi$ versus $\theta_0$ table, and linear interpolation between entries, predict the period $P$ for given parameter values $\{g, \ell, \theta_0\} = \{9.8, 0.5, \tfrac{5\pi}{12}\}$, $\{9.8, 0.7, \tfrac{5\pi}{12}\}$, $\{4.9, 0.6, \tfrac{\pi}{3}\}$. More generally, what would be an approximate formula for $P$, valid for any $\ell > 0$, $g > 0$ and $\theta_0 \in [\tfrac{\pi}{3}, \tfrac{\pi}{2}]$?

(d) Use Matlab or other similar software to numerically solve the pendulum differential equation, along with the initial conditions, to produce a plot of $\theta$ versus $t$. Run a simulation with each set of parameters $\{g, \ell, \theta_0\}$ from part (c) and directly estimate the period from the plot for each case. Do the estimates agree with the predictions from (c)?
Chapter 2
Scaling
Mathematical models may involve a number of quantities expressed in a variety of units. While the absolute magnitudes of the quantities are determined by the units, the relative significance of the quantities can be obscured by differences in these units. In this chapter, we outline some results on scale transformations, which can facilitate the study of any mathematical model, and expose the relative significance of the quantities involved.
2.1. Domains and scales In the modeling of various types of systems, we will often seek to construct and study a unit-free relation of the form (2.1)
π¦ = π(π‘, π 1 , . . . , π π ),
where π¦, π‘, π 1 , . . . , π π are real-valued quantities with given units. Here π¦, π‘ denote variables, π 1 , . . . , π π denote parameters, and π is a given function. By a parameter we mean a constant whose value depends on the specific case or situation of interest. The function π may be defined explicitly, or implicitly as the solution of some related equation, for example a differential equation. For brevity, when the parameters are not essential to a discussion, we will abbreviate the above relation as π¦ = π(π‘). A basic goal is to understand how the parameters π 1 , . . . , π π influence the graph π¦ = π(π‘) in a given domain π·. For our purposes, it will be convenient to consider domains consisting of rectangular cells as illustrated in Figure 2.1, defined by (2.2)
π· = { (π‘, π¦) |
π1 π β€ π‘ β€ π2 π,
π1 π β€ π¦ β€ π2 π }.
Here π1 < π2 and π1 < π2 are integers, and π > 0 and π > 0 are constants called scale factors for the domain. Smaller values of the scale factors correspond to smaller observation windows or domains, which give a more zoomed-in view of a graph. Similarly, larger scale factors correspond to larger domains, which give a more zoomed-out 19
view. Note that the visible features of a graph depend on the scales at which it is observed. At smaller scales, a graph may appear nearly flat or linear, and at larger scales, it may appear rather nonlinear with significant curvature. The influence of scales on the visible features of a graph is illustrated in the following example. y D
Figure 2.1. (Domain $D$ in the $t,y$ plane, made up of rectangular cells of width $a$ and height $b$.)
Example 2.1.1. Consider the function π¦ = π 1 π‘2 + π 2 sin(2ππ‘/π 3 ), with parameters π 1 = m 2 s2 , π 2 = 0.01 m and π 3 = 0.01 s. Consider also the domain (π‘, π¦) β [βπ, π] Γ [βπ, π] with four different sets of scales {π, π} = {0.001s, 0.005m}, {0.02s, 0.02m}, {0.15 s, 0.10 m} and {10 s, 200 m}. The graph of this function, with each of the four sets of scales, is shown in Figure 2.2. The graph appears nearly linear at the smallest scales (top-left), y 0.005
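Figure 2.2. (Four panels showing the graph in the $t,y$ plane on the domains $[-0.001, 0.001] \times [-0.005, 0.005]$ (top-left), $[-0.02, 0.02] \times [-0.02, 0.02]$ (top-right), $[-0.15, 0.15] \times [-0.1, 0.1]$ (bottom-left), and $[-10, 10] \times [-200, 200]$ (bottom-right).)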
then nearly sinusoidal (top-right), and then nearly sinusoidal with a quadratic drift (bottom-left), and finally nearly quadratic at the largest scales (bottom-right). Note that each panel in the figure corresponds to the same function, but the scales are different, which significantly affects the appearance of the graph. The scales are reflected in the range of values on the π‘, π¦ axes.
2.2. Scale transformations When preparing to study a function π¦ = π(π‘) in a domain π·, it will be helpful to introduce a change of variable as described next. Such a transformation will normalize the variables and domain, and ultimately make the function easier to study.
Definition 2.2.1. By a scale transformation for variables π‘, π¦ we mean a change of variable (2.3)
π‘ = π‘/π,
π¦ = π¦/π,
where π, π > 0 are constants with the same dimensions as π‘, π¦. The transformation converts any function and domain (2.4)
π¦ = π(π‘),
π1 π β€ π‘ β€ π2 π,
π1 π β€ π¦ β€ π2 π,
into an equivalent scaled function and scaled domain (see Figure 2.3) (2.5)
π¦ = πβ1 π(ππ‘) = π(π‘),
π1 β€ π‘ β€ π2 ,
π1 β€ π¦ β€ π2 .
Note that the scaled function π¦ = π(π‘) is a horizontally and vertically stretched version of the original function π¦ = π(π‘). Depending on the scale factors, each cell of size π Γ π in the π‘, π¦ plane is enlarged or reduced to a cell of size 1 Γ 1 in the π‘, π¦ plane. Specifically, the original cell is enlarged when 0 < π, π < 1, and reduced when π, π > 1. Moreover, because they are dimensionless, the scaled variables π‘, π¦ will usually be more convenient and efficient to use than π‘, π¦. y
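Figure 2.3. (Left: the original domain $D$ with cells of size $a \times b$ and the graph of the original function. Right: the scaled domain with cells of size $1 \times 1$ and the graph of the scaled function.)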
For any values of π, π the scaled function π¦ = π(π‘) is a faithful representation of the original function π¦ = π(π‘), that is, they have the same qualitative features. Specifically, if the original function is increasing or decreasing in some interval, then the scaled function must be increasing or decreasing in the corresponding scaled interval. Similarly, the two functions must have the same concavity, and the same numbers of minima and maxima. The following example shows how the scaled function, in the scaled or normalized domain, faithfully replicates the features of the original function. Example 2.2.1. Consider again the previous example, and for each set of scales π, π, consider now the scaled function π¦ = π(π‘) and the scaled domain (π‘, π¦) β [β1, 1] Γ [β1, 1], where (2.6)
π¦=
π 1 π2 2 π 2 2πππ‘ π‘ + sin( ). π π π3
The graph of this function, for each of the four sets of scales, is shown in Figure 2.4. In contrast to before, the domain for the variables (π‘, π¦) does not change, but now the function π¦ = π(π‘) explicitly changes with the scales π, π as shown in equation (2.6). Thus the scale transformation exposes the significance of the scales by making them parameters of the function instead of the domain. Note that each panel of Figure 2.2 is faithfully represented in Figure 2.4.
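The way the scales enter the scaled function can be made concrete by tabulating its coefficients for each set of scales. The sketch below assumes the parameter values of Example 2.1.1 to be 2 m/s$^2$, 0.01 m and 0.01 s, and labels them `c1`, `c2`, `c3`; these names, and the helper `scaled_function`, are chosen here for illustration only.

```python
import math

# Parameter values read from Example 2.1.1 (names are illustrative labels).
c1, c2, c3 = 2.0, 0.01, 0.01      # m/s^2, m, s

def scaled_function(a, b):
    """Return g with g(t_bar) = f(a*t_bar)/b for
    f(t) = c1*t**2 + c2*sin(2*pi*t/c3)."""
    def g(t_bar):
        t = a * t_bar
        return (c1 * t**2 + c2 * math.sin(2 * math.pi * t / c3)) / b
    return g

scales = [(0.001, 0.005), (0.02, 0.02), (0.15, 0.10), (10.0, 200.0)]
summary = []
for a, b in scales:
    quad = c1 * a**2 / b      # coefficient of t_bar**2 in the scaled function
    amp = c2 / b              # amplitude of the scaled sine term
    cycles = 2 * a / c3       # sine oscillations across -1 <= t_bar <= 1
    summary.append((quad, amp, cycles))
    print(f"a = {a:g} s, b = {b:g} m: quad = {quad:.3g}, amp = {amp:.3g}, cycles = {cycles:.3g}")
```

At the smallest scales the quadratic coefficient is negligible and the sine completes only a fraction of a cycle, so the graph looks linear; at the largest scales the sine amplitude is negligible and the quadratic term dominates. The intermediate scales fall between these extremes, consistent with the description of the four panels.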
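Figure 2.4. (Graph of the scaled function on the scaled domain $[-1, 1] \times [-1, 1]$, for each of the four sets of scales.)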
2.3. Derivative relations

The scale transformation in (2.3) can be extended to derivatives. If a function $y = f(t)$ is differentiable at some point $(t, y)$, then so is the scaled function $\bar{y} = \bar{f}(\bar{t})$ at the scaled point $(\bar{t}, \bar{y})$, and similarly for higher-order derivatives. The next result, which follows from a straightforward application of the chain rule, summarizes the relation between the original and scaled derivatives.

Result 2.3.1. Let $\bar{t} = t/a$ and $\bar{y} = y/b$, where $a, b > 0$ are constant scales. If $y = f(t)$ is $n$-times differentiable at $(t, y)$, then so is $\bar{y} = \bar{f}(\bar{t})$ at $(\bar{t}, \bar{y})$, and conversely. The relation between derivatives is

\[ \frac{dy}{dt} = \frac{b}{a}\frac{d\bar{y}}{d\bar{t}}, \quad \frac{d^2 y}{dt^2} = \frac{b}{a^2}\frac{d^2\bar{y}}{d\bar{t}^2}, \quad \ldots, \quad \frac{d^n y}{dt^n} = \frac{b}{a^n}\frac{d^n\bar{y}}{d\bar{t}^n}. \tag{2.7} \]

These relations also hold for a more general change of variable that includes a shift, such as $\bar{t} = (t - t_0)/a$ and $\bar{y} = (y - y_0)/b$, for any constants $t_0, y_0$.

The above relations imply that any equation for $y = f(t)$ and its derivatives can be converted to an equation for the scaled function $\bar{y} = \bar{f}(\bar{t})$ and its derivatives. Equivalently, any differential equation in the variables $t, y$ can be converted to an equation in the scaled variables $\bar{t}, \bar{y}$. Note that solutions of the original and scaled equations must be equivalent under the change of variable. As we will see, the scale factors $a, b$ can be chosen so that the scaled equation is simpler than the original, with fewer parameters. Also, the fact that the scaled equation is dimensionless is advantageous: the relative size and importance of quantities in the scaled equation can be directly compared with each other. In contrast, quantities in the original equation may be difficult to compare due to differences in their dimensions.

Example 2.3.1. Consider the initial-value problem

\[ \frac{dy}{dt} = c_1 y^2 + c_2 y, \quad y|_{t=0} = c_3, \quad t \ge 0, \tag{2.8} \]

where $t, y$ are variables and $c_1, c_2, c_3 > 0$ are parameters. Here we find the scaled form of this problem using arbitrary scale factors $a, b > 0$. To obtain the scaled form, we substitute the change of variable relations $t = a\bar{t}$ and $y = b\bar{y}$, along with the derivative relation $\frac{dy}{dt} = \frac{b}{a}\frac{d\bar{y}}{d\bar{t}}$, and get

\[ \frac{b}{a}\frac{d\bar{y}}{d\bar{t}} = c_1 b^2 \bar{y}^2 + c_2 b \bar{y}, \quad b\bar{y}|_{a\bar{t}=0} = c_3, \quad a\bar{t} \ge 0, \tag{2.9} \]
which after simplification becomes

\[ \frac{d\bar{y}}{d\bar{t}} = c_1 a b\, \bar{y}^2 + c_2 a\, \bar{y}, \quad \bar{y}|_{\bar{t}=0} = \frac{c_3}{b}, \quad \bar{t} \ge 0. \tag{2.10} \]

Note that the scaled problem in (2.10) is similar to the original problem in (2.8), but the coefficients have changed. The new coefficients now depend on the scale constants $a, b$ along with the original constants $c_1, c_2, c_3$. As before, the scale transformation exposes the significance of the scales by making them parameters of the equation instead of the domain. Note that certain choices of scale could be made to simplify the scaled equation; for example, some of the new coefficients could be made to have unit values.

Example 2.3.2. Let $y = f(t)$ be the solution of the initial-value problem

\[ \frac{dy}{dt} + c_1 y = c_2 t, \quad y|_{t=0} = c_3, \quad t \ge 0. \tag{2.11} \]

Here $t, y$ are variables with dimensions $[t] = T$ and $[y] = L$, and $c_1, c_2, c_3 > 0$ are parameters with dimensions $[c_1] = T^{-1}$, $[c_2] = LT^{-2}$ and $[c_3] = L$. The visible features of the solution will depend not only on these parameters, but also on the scales $a, b > 0$ at which the solution is observed. To explore this, we consider the scaled form of the problem, obtained by the same process as before,

\[ \frac{d\bar{y}}{d\bar{t}} + \mu_1 \bar{y} = \mu_2 \bar{t}, \quad \bar{y}|_{\bar{t}=0} = \mu_3, \quad \bar{t} \ge 0, \tag{2.12} \]

where

\[ \mu_1 = c_1 a, \quad \mu_2 = \frac{c_2 a^2}{b}, \quad \mu_3 = \frac{c_3}{b}. \tag{2.13} \]
The scaled problem in (2.12) informs us that the behavior of $\bar{y} = \bar{f}(\bar{t})$ is determined by the three dimensionless parameters $\mu_1$, $\mu_2$, and $\mu_3$. Different interesting cases can be identified. For instance, in cases for which $\mu_1 \gg \mu_2$ and $\mu_3 \ge 1$, the scaled equations become $\frac{d\bar{y}}{d\bar{t}} \approx -\mu_1 \bar{y}$ and $\bar{y}|_{\bar{t}=0} \ge 1$, which suggests that the scaled and hence original graph would have the appearance of a decaying exponential. Alternatively, in cases for which $\mu_2 \gg \mu_1$ and $\mu_3 \ll 1$, the scaled equations become $\frac{d\bar{y}}{d\bar{t}} \approx \mu_2 \bar{t}$ and $\bar{y}|_{\bar{t}=0} \ll 1$, which suggests that the scaled and hence original graph would have the appearance of a growing quadratic. Note that $\mu_1, \mu_2, \mu_3$ are dimensionless and can be directly compared against each other; in contrast, $c_1, c_2, c_3$ have different dimensions and cannot be similarly compared. Thus the scale transformation facilitates the study of the problem, and exposes the relative significance of all constants involved.
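As a hedged numerical illustration of Example 2.3.2, the following Python sketch integrates the original problem (2.11) and the scaled problem (2.12), and checks that the two solutions agree under the change of variable. The parameter values, the scales, the names `mu1`, `mu2`, `mu3` for the three dimensionless parameters in (2.13), and the helper `rk4` are assumptions made only for this sketch; none of them come from the text.

```python
import math

def rk4(f, t0, y0, t1, n):
    """Integrate dy/dt = f(t, y) from t0 to t1 with n classical RK4 steps."""
    h = (t1 - t0) / n
    t, y = t0, y0
    for _ in range(n):
        k1 = f(t, y)
        k2 = f(t + h / 2, y + h * k1 / 2)
        k3 = f(t + h / 2, y + h * k2 / 2)
        k4 = f(t + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return y

# Illustrative (assumed) parameter values and scales; any positive values work.
c1, c2, c3 = 0.5, 2.0, 3.0
a, b = 4.0, 10.0

# Dimensionless parameters of the scaled problem, as in (2.13).
mu1, mu2, mu3 = c1 * a, c2 * a**2 / b, c3 / b

t_end = 6.0
# Original problem: dy/dt + c1*y = c2*t, y(0) = c3.
y_original = rk4(lambda t, y: -c1 * y + c2 * t, 0.0, c3, t_end, 2000)
# Scaled problem: dw/ds + mu1*w = mu2*s, w(0) = mu3, integrated up to s = t_end/a.
y_scaled = rk4(lambda s, w: -mu1 * w + mu2 * s, 0.0, mu3, t_end / a, 2000)

# The original solution should equal b times the scaled solution.
print(y_original, b * y_scaled)
```

Both problems are advanced with the same number of steps, so the comparison mainly tests the algebra behind (2.13) rather than the accuracy of the integrator.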
2.4. Natural scales

Various choices of scale can be made when studying a function $y = f(t)$ with parameters $c_1, \ldots, c_m$. In general, it is desirable to use scales that are natural or intrinsic to the function, in the sense that they depend only on the function itself and its parameters, namely

\[ a = g(c_1, \ldots, c_m), \quad b = h(c_1, \ldots, c_m), \tag{2.14} \]

for some functions $g, h$. Here we outline two different types of natural scales in this sense.

Definition 2.4.1. Assume $y = f(t)$ is nonconstant and continuously differentiable on some closed, bounded interval $I$. Then the constants $a, b > 0$ defined by

\[ b = \max_{t \in I} |f(t)|, \quad \frac{b}{a} = \max_{t \in I} \left| \frac{df}{dt}(t) \right|, \tag{2.15} \]

are natural scales. These constants are called characteristic scales for $y = f(t)$ on $I$. In simpler notation, $b = |y|_{\max}$ and $a = b / |\frac{dy}{dt}|_{\max}$.

The characteristic scales $a, b$ provide a default level of zoom for a graph $y = f(t)$ in any domain of the $t, y$ plane with $t \in I$. At this default level of magnification, the slope of the graph in any cell of the domain is bounded by that of the two diagonal lines of the cell, and the entire vertical range of the graph is bounded by twice the height of a cell. A domain with a moderate number of cells at these characteristic scales can be expected to provide a natural view of a function, with no large or abrupt changes in the graph or its slope. For a complicated function with a number of different components, the above definition could be applied to the overall function, or instead be applied to each different component, leading to a collection of characteristic scales. A function for which significantly different scales are needed to clearly represent its features is called a multiscale function.

Example 2.4.1. Consider the function $y = c_1 t e^{-t/c_2}$, where $c_1 = 2$ m/s and $c_2 = 6$ s, and consider the interval $I = [0, 40\ \mathrm{s}]$. Using the methods of calculus to determine the indicated maximum values, we find that characteristic scales for this function and interval are

\[ b = \max_{t \in I} |f(t)| = \frac{c_1 c_2}{e}, \quad a = b \Big/ \max_{t \in I} \left| \frac{df}{dt}(t) \right| = \frac{c_2}{e}. \tag{2.16} \]

A plot corresponding to the domain $[0, 10a] \times [0, b]$ is shown in Figure 2.5. Note that a moderate number of cells with scales $a, b$ provides a clear, well-proportioned view of all the essential features of the function, which are a rise up to a single maximum followed by a decay to zero.

[Figure 2.5. Axes: t (horizontal, ticks from 5 to 20) and y (vertical, ticks from 1 to 4).]
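The characteristic scales of Example 2.4.1 can be cross-checked numerically. The sketch below, which is only an illustration and not the method used in the text, samples the function and its derivative on a fine grid over the interval and compares the resulting maxima with the closed forms in (2.16). The grid size `n` and the variable names are assumptions of this sketch; the derivative formula is obtained by the product rule.

```python
import math

c1, c2 = 2.0, 6.0          # c1 in m/s, c2 in s (values from Example 2.4.1)
t_lo, t_hi = 0.0, 40.0     # the interval I, in seconds

def f(t):
    """The function y = c1 * t * exp(-t / c2)."""
    return c1 * t * math.exp(-t / c2)

def dfdt(t):
    """Its derivative, by the product rule."""
    return c1 * (1.0 - t / c2) * math.exp(-t / c2)

# Sample the interval on a uniform grid (assumed resolution).
n = 40_000
grid = [t_lo + (t_hi - t_lo) * i / n for i in range(n + 1)]

# Characteristic scales as in Definition 2.4.1.
b = max(abs(f(t)) for t in grid)
a = b / max(abs(dfdt(t)) for t in grid)

print(b, c1 * c2 / math.e)   # vertical scale versus c1*c2/e
print(a, c2 / math.e)        # horizontal scale versus c2/e
```

A grid search of this kind is a blunt tool, but it applies equally well to functions whose maxima cannot be found by hand.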
Example 2.4.2. Consider $y = c_1 e^{-t/c_2} \cos(t/c_3)$, where $c_1 = 1$ m, $c_2 = 6$ s, $c_3 = \frac{1}{100}$ s, and as before $I = [0, 40\ \mathrm{s}]$. Applying the above definitions to this function and interval yields the characteristic scales $b = c_1$ and $a \approx c_3$. A plot corresponding to the domain $[0, 10a] \times [-b, b]$ is shown in the left half of Figure 2.6. These scales provide a clear view of the oscillating component of the function, but this view is incomplete; there is little indication of the decaying component associated with the factor $e^{-t/c_2}$, which varies on the much longer (slower) scale $\tilde{a} = c_2$. A plot corresponding to the longer domain $[0, 5\tilde{a}] \times [-b, b]$ is shown in the right half of the figure; the decaying component is now visible, but the oscillations are so tightly spaced that the graph appears to fill an area. Note that neither plot by itself provides a clear representation of all the features of the function, but instead two plots are necessary to illustrate the fast and slow behaviors in time. This is a simple example of a multiscale function.

[Figure 2.6. Left panel: y versus t on the short time domain (ticks at 0.05 and 0.1). Right panel: y versus t on the long time domain (ticks at 10, 20, 30). The vertical axis runs from -1 to 1 in both panels.]
In many cases it is impractical to determine characteristic scales for a function in the sense of the definition above. In these cases, other natural scales can be considered, which are more explicit and easier to find. To state the definition, we consider a given relation $y = f(t, c_1, \ldots, c_m)$ and consider the power products

\[ a = c_1^{\alpha_1} \cdots c_m^{\alpha_m}, \quad b = c_1^{\beta_1} \cdots c_m^{\beta_m}, \tag{2.17} \]

where $\alpha_1, \ldots, \alpha_m$ and $\beta_1, \ldots, \beta_m$ are powers to be determined. In any dimensional basis $\{D_1, \ldots, D_n\}$, each parameter $c_i$ will have dimensional exponents $\Delta_{c_i} \in \mathbb{R}^n$, and the power products $a, b$ will have dimensional exponents $\Delta_a, \Delta_b \in \mathbb{R}^n$. As noted in the previous chapter, the relation between these exponents is

\[ \Delta_a = A\alpha, \quad \Delta_b = A\beta, \tag{2.18} \]

where $A = (\Delta_{c_1}, \ldots, \Delta_{c_m}) \in \mathbb{R}^{n \times m}$, $\alpha = (\alpha_1, \ldots, \alpha_m) \in \mathbb{R}^m$ and $\beta = (\beta_1, \ldots, \beta_m) \in \mathbb{R}^m$. Here, as before, all one-dimensional arrays are considered as columns, and we assume $m \ge 1$ and $n \ge 1$ with $m \ge n$. The condition that scales $a, b$ have the same dimensions as $t, y$ is equivalent to $\Delta_a = \Delta_t$ and $\Delta_b = \Delta_y$. When these conditions are combined with (2.18), we obtain the following class of natural scales, which are more practical.

Definition 2.4.2. Let $y = f(t, c_1, \ldots, c_m)$ be given, where $c_1, \ldots, c_m > 0$ are parameters, and let $\alpha = (\alpha_1, \ldots, \alpha_m)$ and $\beta = (\beta_1, \ldots, \beta_m)$ be any powers satisfying

\[ A\alpha = \Delta_t, \quad A\beta = \Delta_y. \tag{2.19} \]

Then the constants $a, b > 0$ in (2.17) are natural scales. These constants are called associated scales for the function.
Unlike the characteristic case, any associated scales $a, b$ depend only on the parameters $c_1, \ldots, c_m$ and not the specific form of a function $f$. The existence of these types of scales is not automatic. Depending on the matrix of exponents $A$, there may be no set of scales which can be expressed as power products, or there may be one or more sets of scales. Any set of associated scales provides a natural reference for measuring the variables $t, y$. In various cases, the characteristic scales defined above will be among the associated scales.

Example 2.4.3. Here we find a set of associated scales for a function $y = f(t, c_1, c_2, c_3)$, where $t, y$ are variables with dimensions $[t] = T$ and $[y] = M$, and $c_1 \ge 0$ and $c_2, c_3 > 0$ are parameters with dimensions $[c_1] = 1/T$, $[c_2] = 1/(TM)$ and $[c_3] = M/T^2$. Since $c_1 = 0$ is possible, we exclude this parameter from consideration, because it may lead to a zero or undefined scale. Hence we seek scales $a, b > 0$ depending on $c_2, c_3 > 0$. In the dimensional basis $\{T, M\}$, we have

\[ \Delta_t = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad \Delta_y = \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \quad \Delta_{c_2} = \begin{pmatrix} -1 \\ -1 \end{pmatrix}, \quad \Delta_{c_3} = \begin{pmatrix} -2 \\ 1 \end{pmatrix}. \tag{2.20} \]

Introducing the power products $a = c_2^{\alpha_2} c_3^{\alpha_3}$ and $b = c_2^{\beta_2} c_3^{\beta_3}$, we consider the matrix $A = (\Delta_{c_2}, \Delta_{c_3})$ and the vectors $\alpha = (\alpha_2, \alpha_3)$ and $\beta = (\beta_2, \beta_3)$. The conditions $A\alpha = \Delta_t$ and $A\beta = \Delta_y$ become

\[ \begin{pmatrix} -1 & -2 \\ -1 & 1 \end{pmatrix} \begin{pmatrix} \alpha_2 \\ \alpha_3 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad \begin{pmatrix} -1 & -2 \\ -1 & 1 \end{pmatrix} \begin{pmatrix} \beta_2 \\ \beta_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix}. \tag{2.21} \]

The first system requires $-\alpha_2 - 2\alpha_3 = 1$ and $-\alpha_2 + \alpha_3 = 0$, which has the solution $(\alpha_2, \alpha_3) = (-1/3, -1/3)$, which gives the scale $a = c_2^{-1/3} c_3^{-1/3}$. The second system requires $-\beta_2 - 2\beta_3 = 0$ and $-\beta_2 + \beta_3 = 1$, which has the solution $(\beta_2, \beta_3) = (-2/3, 1/3)$, which gives the scale $b = c_2^{-2/3} c_3^{1/3}$.
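The two small linear systems in Example 2.4.3 can be solved exactly in a few lines. The sketch below uses Cramer's rule with rational arithmetic; the helper `solve_2x2` and the variable names are hypothetical, introduced only for this illustration, and the matrix rows are the two exponent equations stated in the example.

```python
from fractions import Fraction

def solve_2x2(A, rhs):
    """Solve a 2x2 linear system exactly with Cramer's rule."""
    (p, q), (r, s) = A
    det = p * s - q * r
    if det == 0:
        raise ValueError("singular exponent matrix: no unique power product")
    x = Fraction(rhs[0] * s - q * rhs[1], det)
    y = Fraction(p * rhs[1] - rhs[0] * r, det)
    return x, y

# Rows encode the two exponent equations of the example:
#   -e2 - 2*e3 = first entry of the target, -e2 + e3 = second entry.
A = [[-1, -2],
     [-1, 1]]

alpha = solve_2x2(A, (1, 0))   # exponents for the time scale
beta = solve_2x2(A, (0, 1))    # exponents for the scale of y

print("alpha =", alpha)
print("beta  =", beta)
```

For larger exponent matrices, or when the system is underdetermined as in the case of more parameters than basis dimensions, a general linear solver would be needed and the choice of solution is no longer unique.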
2.5. Scaling theorem

Under a scale transformation, a given function $y = f(t, c_1, \ldots, c_m)$ is converted into an equivalent scaled function

\[ \bar{y} = b^{-1} f(a\bar{t}, c_1, \ldots, c_m) = \bar{f}(\bar{t}, a, b, c_1, \ldots, c_m). \tag{2.22} \]

The next result shows that, if $a, b$ are natural scales, then the scaled function is not only dimensionless, but also involves fewer parameters, thus making it easier to study. The result follows from an application of the $\pi$-theorem stated in Result 1.6.2.

Result 2.5.1. If $y = f(t, c_1, \ldots, c_m)$, $a = g(c_1, \ldots, c_m)$ and $b = h(c_1, \ldots, c_m)$ are unit-free, then the scaled relation is equivalent to

\[ \bar{y} = \varphi(\bar{t}, \mu_1, \ldots, \mu_k), \tag{2.23} \]

for some function $\varphi$ and parameters $\mu_1, \ldots, \mu_k$. The variables $\bar{t}, \bar{y}$ and parameters $\mu_1, \ldots, \mu_k$ are all dimensionless and $k \le m$. The case $\bar{y} = \varphi(\bar{t})$ with $k = 0$ parameters is possible.
Thus, up to a scale transformation with natural scales, the given relation $y = f(t, c_1, \ldots, c_m)$ is equivalent to $\bar{y} = \varphi(\bar{t}, \mu_1, \ldots, \mu_k)$, which normally depends on fewer parameters, or possibly $\bar{y} = \varphi(\bar{t})$, which depends on no parameters. Note that the result is valid whether or not the relations are known explicitly. That is, if $f$ is defined implicitly as a solution of a differential equation with parameters $c_1, \ldots, c_m$, then $\varphi$ is defined as a solution of the corresponding scaled equation with parameters $\mu_1, \ldots, \mu_k$. Hence these latter parameters can be identified by the simple process of applying the scale transformation to the differential equation, and writing it in dimensionless form.

The parameters $\mu_1, \ldots, \mu_k$ provide an intrinsic way to study and discuss the relation $y = f(t, c_1, \ldots, c_m)$. Whereas $c_1, \ldots, c_m$ may depend on a variety of units and be difficult to compare, $\mu_1, \ldots, \mu_k$ are independent of units and can always be compared. Thus any limiting, simplified, or otherwise special case of the relation can be naturally expressed in terms of conditions on $\mu_1, \ldots, \mu_k$. That is, these dimensionless parameters provide a way to classify properties of the original relation in a way that is independent of units.

Example 2.5.1. Let $y = f(t, c_1, c_2, c_3)$ be the solution of the initial-value problem

\[ \frac{dy}{dt} + c_1 y = c_2 t, \quad y|_{t=0} = c_3, \quad t \ge 0. \tag{2.24} \]

We suppose $t, y$ are variables with dimensions $[t] = T$ and $[y] = L$, and $c_1, c_2, c_3 > 0$ are parameters with dimensions $[c_1] = T^{-1}$, $[c_2] = LT^{-2}$ and $[c_3] = L$. We seek to classify the family of solutions $y = f(t, c_1, c_2, c_3)$ for all possible values of $c_1, c_2, c_3 > 0$. For this purpose, we first determine a set of natural scales $a, b > 0$, and then examine the scaled form of the equation.

For the scales, we consider the power products $a = c_1^{\alpha_1} c_2^{\alpha_2} c_3^{\alpha_3}$ and $b = c_1^{\beta_1} c_2^{\beta_2} c_3^{\beta_3}$. In the dimensional basis $\{T, L\}$, the equations $A\alpha = \Delta_t$ and $A\beta = \Delta_y$ become

\[ \begin{pmatrix} -1 & -2 & 0 \\ 0 & 1 & 1 \end{pmatrix} \begin{pmatrix} \alpha_1 \\ \alpha_2 \\ \alpha_3 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad \begin{pmatrix} -1 & -2 & 0 \\ 0 & 1 & 1 \end{pmatrix} \begin{pmatrix} \beta_1 \\ \beta_2 \\ \beta_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix}. \tag{2.25} \]

The first system requires $-\alpha_1 - 2\alpha_2 = 1$ and $\alpha_2 + \alpha_3 = 0$, which has a simple solution $(\alpha_1, \alpha_2, \alpha_3) = (-1, 0, 0)$, which gives the natural scale $a = c_1^{-1}$. The second system requires $-\beta_1 - 2\beta_2 = 0$ and $\beta_2 + \beta_3 = 1$, which has a simple solution $(\beta_1, \beta_2, \beta_3) = (0, 0, 1)$, which gives the natural scale $b = c_3$. Under the scale transformation $t = a\bar{t}$ and $y = b\bar{y}$, we obtain the scaled equation

\[ \frac{d\bar{y}}{d\bar{t}} + \bar{y} = \frac{c_2}{c_1^2 c_3}\, \bar{t}, \quad \bar{y}|_{\bar{t}=0} = 1, \quad \bar{t} \ge 0. \tag{2.26} \]
The scaled problem in (2.26) involves only a single dimensionless parameter $\mu = c_2/(c_1^2 c_3)$, and will have a solution $\bar{y} = \varphi(\bar{t}, \mu)$. Thus, up to a scale transformation, the original family of solutions $y = f(t, c_1, c_2, c_3)$ for all possible values of $c_1, c_2, c_3 > 0$ can be classified in terms of $\mu > 0$. In this case, the scaled equation can be solved using standard methods to get $\bar{y} = \mu\bar{t} - \mu + (1 + \mu)e^{-\bar{t}}$, and the influence of the parameter $\mu$ can be directly assessed. The scaled solution can be seen to be a convex function, with a minimum at $(\bar{t}, \bar{y}) = (\ln(1 + \frac{1}{\mu}), \mu \ln(1 + \frac{1}{\mu}))$, and with a slant asymptote along the line $\bar{y} = \mu\bar{t} - \mu$. Figure 2.7 illustrates the scaled solution for the three cases $\mu = 0.02$, $0.1$ and $1$.

[Figure 2.7. Scaled solution $\bar{y}$ versus $\bar{t}$ (horizontal ticks from 2 to 10, vertical ticks from 0.5 to 2), with curves labeled $\mu = 1$, $\mu = 0.1$ and $\mu = 0.02$.]
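The claims about the scaled solution in Example 2.5.1 can be checked directly. The following sketch evaluates the stated solution and its derivative, then confirms that it satisfies the scaled equation and initial condition, and that the minimum is located where stated. The function names and the sample time used in the residual check are assumptions of this sketch; the derivative is computed by hand from the stated solution.

```python
import math

def y_bar(t, mu):
    """Stated solution of the scaled problem: mu*t - mu + (1 + mu)*exp(-t)."""
    return mu * t - mu + (1.0 + mu) * math.exp(-t)

def dy_bar(t, mu):
    """Derivative of y_bar with respect to the scaled time."""
    return mu - (1.0 + mu) * math.exp(-t)

def residual(t, mu):
    """Left side minus right side of the scaled equation dy/dt + y = mu*t."""
    return dy_bar(t, mu) + y_bar(t, mu) - mu * t

for mu in (0.02, 0.1, 1.0):            # the three cases shown in Figure 2.7
    t_min = math.log(1.0 + 1.0 / mu)   # stated location of the minimum
    print(mu, t_min, y_bar(t_min, mu), mu * t_min)
```

The last two printed columns should agree, since the stated minimum value is $\mu \ln(1 + 1/\mu)$. Note how the minimum moves to later scaled times as $\mu$ decreases.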
Sketch of proof: Result 2.5.1. For notational convenience, we denote the list of parameters by $\gamma = (c_1, \ldots, c_m)$, and consider the unit-free equations $y = f(t, \gamma)$, $a = g(\gamma)$ and $b = h(\gamma)$. We also consider the scale transformation $\bar{t} = t/a$ and $\bar{y} = y/b$, and the resulting scaled equation $\bar{y} = f(a\bar{t}, \gamma)/b$. Substituting for $a$ and $b$ in this equation we get $\bar{y} = F(\bar{t}, \gamma)$, where the function on the right is defined by $F(\bar{t}, \gamma) = f(g(\gamma)\bar{t}, \gamma)/h(\gamma)$.

We next consider an arbitrary change of units as outlined in Result 1.4.1. By the change of units formula, the variables $t, y$ will be changed to $\tilde{t}, \tilde{y}$. Similarly, the scale factors $a, b$ will be changed to $\tilde{a}, \tilde{b}$, and the parameters $\gamma$ will be changed to $\tilde{\gamma}$. The scaled variables $\bar{t}, \bar{y}$ will also be changed to $\tilde{\bar{t}}, \tilde{\bar{y}}$. However, since they are dimensionless and their dimensional exponents are zero, we get $\tilde{\bar{t}} = \bar{t}$ and $\tilde{\bar{y}} = \bar{y}$. Note that, by the unit-free property, we have $\tilde{y} = f(\tilde{t}, \tilde{\gamma})$, $\tilde{a} = g(\tilde{\gamma})$ and $\tilde{b} = h(\tilde{\gamma})$.

The scale transformation can also be applied to the variables in the new units. The scaled versions of these variables are $\bar{\tilde{t}} = \tilde{t}/\tilde{a}$ and $\bar{\tilde{y}} = \tilde{y}/\tilde{b}$. Using the dimensional exponent relations $\Delta_a = \Delta_t$ and $\Delta_b = \Delta_y$, which hold since $a$ and $b$ are scale factors, together with the change of units formula, we find that $\tilde{t}/\tilde{a} = t/a$ and $\tilde{y}/\tilde{b} = y/b$. From this we deduce the useful relations $\bar{\tilde{t}} = \bar{t} = \tilde{\bar{t}}$ and $\bar{\tilde{y}} = \bar{y} = \tilde{\bar{y}}$, and also $\tilde{t} = \tilde{a}\bar{\tilde{t}} = \tilde{a}\tilde{\bar{t}}$ and $\tilde{y} = \tilde{b}\bar{\tilde{y}} = \tilde{b}\tilde{\bar{y}}$.

We can now examine the effect of a change of units on the relation $\bar{y} = F(\bar{t}, \gamma)$. From the definition of the function, using the fact that $\tilde{t} = \tilde{a}\tilde{\bar{t}}$, we observe that $F(\tilde{\bar{t}}, \tilde{\gamma}) = f(g(\tilde{\gamma})\tilde{\bar{t}}, \tilde{\gamma})/h(\tilde{\gamma}) = f(\tilde{t}, \tilde{\gamma})/\tilde{b}$, which leads to the result that $F(\tilde{\bar{t}}, \tilde{\gamma}) = \tilde{\bar{y}}$. Thus the relation $\bar{y} = F(\bar{t}, \gamma)$ is unit-free, where $\gamma$ is brief notation for $c_1, \ldots, c_m$.

The $\pi$-theorem in Result 1.6.2 can now be applied. Since $\bar{y}$ and $\bar{t}$ are dimensionless, the set of quantities $\bar{y}, \bar{t}, c_1, \ldots, c_m$ has a full, normalized set of dimensionless power products $\pi_1, \ldots, \pi_r$ for some $2 \le r \le m + 2$. The power products can be chosen so that $\pi_1 = \bar{y}$ and $\pi_2 = \bar{t}$, with $\pi_3, \ldots, \pi_r$ dependent only on $c_1, \ldots, c_m$; specifically, since $\bar{y}$ and $\bar{t}$ are dimensionless, they would not contribute to any power product with $c_1, \ldots, c_m$. The $\pi$-theorem then implies $\bar{y} = \varphi(\bar{t}, \pi_3, \ldots, \pi_r)$, which establishes the result.
2.6. Case study

Setup. To illustrate the preceding results on scaling, and the process of modelling a chemical system, we study the evolution of an elementary reaction. Figure 2.8 illustrates the system, which consists of a closed reaction chamber or tank, filled with a fluid solution containing three chemical substances $X$, $Y$, and $Z$. The concentrations of the substances, in units of molecules per volume, are denoted by $x$, $y$, and $z$. The three dimensions $[x]$, $[y]$, and $[z]$ are the same and equal to $M_{cl}/V$, where $M_{cl}$ means Molecules, and $V$ means Volume. We consider a reaction described by the chemical equation $X + Y \xrightarrow{k} 2Z$, in which one molecule of $X$ and one molecule of $Y$ combine to form two molecules of $Z$. Here $X$ and $Y$ are the reactants, $Z$ is the product, and $k > 0$ is a reaction rate constant that will be described below. Beginning from conditions $x = x_0 > 0$, $y = y_0 > 0$ and $z = 0$ at time $t = 0$, we seek to understand how the reaction evolves in time; for example, how the concentration versus time curves depend on the parameters $k$, $x_0$, and $y_0$.

[Figure 2.8. Reaction tank in which $X + Y$ react, with rate constant $k$, to form $2Z$.]

Background. In an elementary reaction of the form $mX + nY \xrightarrow{k} pZ$, the production of $p$ molecules of $Z$ requires a pairing of $m$ molecules of $X$ and $n$ molecules of $Y$. (Here pairing means an appropriate chemical event involving the reactants.) The rate constant $k$ is defined such that

\[ \frac{\#\ \text{Pairings}}{\text{Time} \cdot \text{Volume}} = k x^m y^n. \tag{2.27} \]

Thus, in a reaction $mX + nY \xrightarrow{k} pZ$, we deduce that

\[ \frac{\#\ \text{of } X \text{ consumed}}{\text{Time} \cdot \text{Volume}} = \left( \frac{\#\ \text{of } X \text{ consumed}}{\text{Pairing}} \right) \left( \frac{\#\ \text{Pairings}}{\text{Time} \cdot \text{Volume}} \right) = m\, k x^m y^n. \tag{2.28} \]

Similarly, for $Y$ and $Z$ we deduce

\[ \frac{\#\ \text{of } Y \text{ consumed}}{\text{Time} \cdot \text{Volume}} = n\, k x^m y^n, \quad \frac{\#\ \text{of } Z \text{ produced}}{\text{Time} \cdot \text{Volume}} = p\, k x^m y^n. \tag{2.29} \]
Model equations. We can now outline a set of equations to describe the evolution of the reaction in our tank. The basic principle that we employ is balance of mass: the rate of change of the number of any chemical species in the tank must equal the rate of production minus consumption in any reactions, plus the rate of supply minus removal by external sources. Considering only the single reaction $X + Y \xrightarrow{k} 2Z$, in a closed tank with no external sources, we have

\[ \frac{d}{dt} \left( \frac{\#\ \text{of } X}{\text{Volume}} \right) = - \frac{\#\ \text{of } X \text{ consumed}}{\text{Time} \cdot \text{Volume}}, \tag{2.30} \]

or equivalently,

\[ \frac{dx}{dt} = -1\, kxy. \tag{2.31} \]

Similarly, for the consumption of $Y$ and production of $Z$, we have

\[ \frac{dy}{dt} = -1\, kxy, \quad \frac{dz}{dt} = +2\, kxy. \tag{2.32} \]
Simplification. The differential equations for the concentrations $x$, $y$, and $z$ can be simplified. From (2.31) and (2.32) we observe that $\frac{dy}{dt} = \frac{dx}{dt}$, which implies $y(t) = x(t) + c$ for all $t \ge 0$, where $c$ is a constant determined by initial conditions; specifically, $y_0 = x_0 + c$ or equivalently $c = y_0 - x_0$. Similarly, from (2.31) and (2.32) we also observe that $\frac{dz}{dt} = -2\frac{dx}{dt}$, which implies $z(t) = -2x(t) + d$ for all $t \ge 0$, where $d$ is a constant again determined by initial conditions; specifically, $0 = -2x_0 + d$ which gives $d = 2x_0$. By substituting the relation $y = x + c$ into (2.31), we obtain a self-contained equation for $x$, namely $\frac{dx}{dt} = -kx(x + c)$. Thus we focus attention on this equation and note that, once $x$ is known as a function of time $t$, then so will be $y$ and $z$.

Analysis. We consider the initial-value problem in variables $t, x$ given by

\[ \frac{dx}{dt} = -kx(x + c), \quad x|_{t=0} = x_0, \quad t \ge 0, \tag{2.33} \]

where $k$, $x_0$, and $c$ are parameters. Whereas $k$ and $x_0$ are positive, we note that $c = y_0 - x_0$ could be positive, zero, or negative. To characterize the solution $x = f(t, k, x_0, c)$ for all possible values of the parameters, we first find a set of natural scales, and then study the resulting scaled problem.
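The two conservation relations used in the simplification above can be observed numerically. The sketch below integrates the three rate equations (2.31) and (2.32) together with a classical Runge-Kutta step and then checks that the difference of the two reactant concentrations, and the product concentration plus twice the first reactant concentration, stay at their initial values. The rate constant, initial concentrations, step size, and helper names are all assumptions chosen only for illustration.

```python
def rhs(state, k):
    """Right-hand sides of (2.31)-(2.32) for the reaction X + Y -> 2Z."""
    x, y, z = state
    r = k * x * y                  # pairings per unit time and volume
    return (-r, -r, 2.0 * r)

def rk4_step(state, k, h):
    """Advance the three concentrations by one classical RK4 step of size h."""
    def shifted(s, d, w):
        return tuple(si + w * di for si, di in zip(s, d))
    k1 = rhs(state, k)
    k2 = rhs(shifted(state, k1, h / 2), k)
    k3 = rhs(shifted(state, k2, h / 2), k)
    k4 = rhs(shifted(state, k3, h), k)
    return tuple(
        s + h * (d1 + 2 * d2 + 2 * d3 + d4) / 6
        for s, d1, d2, d3, d4 in zip(state, k1, k2, k3, k4)
    )

# Illustrative (assumed) values: rate constant and initial concentrations.
k, x0, y0 = 0.8, 2.0, 1.5
state = (x0, y0, 0.0)
for _ in range(5000):              # integrate up to t = 5 with h = 0.001
    state = rk4_step(state, k, 0.001)
x, y, z = state

print(y - x, y0 - x0)              # y - x should stay equal to y0 - x0
print(z + 2.0 * x, 2.0 * x0)       # z + 2x should stay equal to 2*x0
```

Both combinations are linear in the concentrations, so a Runge-Kutta method preserves them up to rounding; this makes them a convenient sanity check on any implementation of the model.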
The variables $t$ and $x$ have dimensions of time and concentration, that is, $[t] = T$ and $[x] = C$, where $C = M_{cl}/V$. Since there is no need to consider $M_{cl}$ or $V$ individually, we use $\{T, C\}$ as the dimensional basis. By definition, the parameters $x_0$ and $c$ have dimensions of concentration, so $[x_0] = C$ and $[c] = C$. The dimensions of the parameter $k$ can be deduced from the differential equation. Specifically, dimensional consistency requires $[\frac{dx}{dt}] = [kx(x + c)]$, which by properties of dimensions implies $\frac{C}{T} = [k]C^2$, which gives $[k] = \frac{1}{CT}$.

We consider natural scales $a$ and $b$ depending on parameters $k$, $x_0$, and $c$. Initially, we consider the power products $a = k^{\alpha_1} x_0^{\alpha_2} |c|^{\alpha_3}$ and $b = k^{\beta_1} x_0^{\beta_2} |c|^{\beta_3}$. However, since $c = 0$ is possible, we exclude this parameter from the power products to avoid any case with $a = 0$ or $b = 0$, for which the scale transformation is undefined. Hence we consider the reduced expressions $a = k^{\alpha_1} x_0^{\alpha_2}$ and $b = k^{\beta_1} x_0^{\beta_2}$, and by the usual method we find $a = \frac{1}{k x_0}$ and $b = x_0$.
Under the scale transformation $t = a\bar{t}$ and $x = b\bar{x}$, we obtain the scaled problem

\[ \frac{d\bar{x}}{d\bar{t}} = -\bar{x}(\bar{x} + \mu), \quad \bar{x}|_{\bar{t}=0} = 1, \quad \bar{t} \ge 0. \tag{2.34} \]

In the above, there is only a single dimensionless parameter $\mu = \frac{c}{x_0} = \frac{y_0}{x_0} - 1$, and from the fact that $x_0$ and $y_0$ are both positive, we find that $\mu \in (-1, \infty)$. Note that the solution of the scaled problem is a function $\bar{x} = \varphi(\bar{t}, \mu)$, and the solution of the original problem will just be a stretched version of this function. Thus solutions can be classified in terms of the single parameter $\mu$.
The form and character of the solution $\bar{x} = \varphi(\bar{t}, \mu)$ depends on whether $\mu \in (-1, 0)$, $\mu = 0$, or $\mu \in (0, \infty)$. In all cases, the solution is positive, monotone, decreasing, and concave up. However, there are some interesting qualitative differences between the three cases. When $\mu = 0$, the differential equation can be solved using separation of variables, and the solution is

\[ \bar{x} = \frac{1}{\bar{t} + 1}, \quad \bar{t} \ge 0. \tag{2.35} \]

Note that $\bar{x} \to 0$ as $\bar{t} \to \infty$. Thus the chemical $X$ becomes totally consumed or depleted as the reaction evolves, and the rate at which depletion occurs is algebraic in time, specifically, $\bar{x} \to 0$ as quickly as $\frac{1}{\bar{t}} \to 0$.

When $\mu \ne 0$, the differential equation can also be solved using separation of variables, but now a partial fraction decomposition is required to complete the integration, and the solution is

\[ \bar{x} = \frac{\mu}{(1 + \mu)e^{\mu\bar{t}} - 1}, \quad \bar{t} \ge 0. \tag{2.36} \]

When $\mu > 0$, we note that $e^{\mu\bar{t}} \to \infty$, and again we find that $\bar{x} \to 0$ as $\bar{t} \to \infty$. Thus the chemical $X$ again becomes totally depleted as the reaction evolves, but now the rate of depletion is exponential in time, specifically, $\bar{x} \to 0$ as quickly as $\frac{1}{e^{\mu\bar{t}}} \to 0$.

Interestingly, when $\mu < 0$, we get the result that $\bar{x} \to -\mu > 0$ as $\bar{t} \to \infty$. In contrast to before, we now have $e^{\mu\bar{t}} \to 0$ since $\mu$ is negative and $\mu\bar{t} \to -\infty$. Thus in this case the chemical $X$ is only partially consumed as the reaction evolves. (The chemical $Y$, which is necessary for the reaction, is depleted before $X$.) In this case, the rate at which $\bar{x} \to -\mu$ is also exponential in time.
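The three cases described above can be explored with a short script. The sketch below evaluates the closed-form scaled solutions (2.35) and (2.36), checks them against the scaled differential equation with a central difference, and prints the long-time values. The function names, the finite-difference step, and the sample values of the dimensionless parameter (written `mu` here) are assumptions made for illustration only.

```python
import math

def x_bar(t, mu):
    """Scaled concentration: (2.35) when mu == 0, and (2.36) otherwise."""
    if mu == 0.0:
        return 1.0 / (t + 1.0)
    return mu / ((1.0 + mu) * math.exp(mu * t) - 1.0)

def residual(t, mu, h=1e-5):
    """Central-difference check of d(x_bar)/dt = -x_bar * (x_bar + mu)."""
    slope = (x_bar(t + h, mu) - x_bar(t - h, mu)) / (2.0 * h)
    x = x_bar(t, mu)
    return slope + x * (x + mu)

for mu in (-0.5, 0.0, 2.0):
    # Columns: mu, initial value, equation residual at t = 1, value at late time.
    print(mu, x_bar(0.0, mu), residual(1.0, mu), x_bar(60.0, mu))
```

For the negative sample value the late-time output should approach the negative of the parameter, while for the zero and positive values it should approach zero, slowly in the first case and rapidly in the second.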
Reference notes The elementary mathematical technique of scaling is an important tool of analysis. When applied to a function or an equation, it leads to a simpler, normalized form that is easier to study. In later chapters, this technique will be used to emphasize the properties of functions and equations at different scales, and transform them into more convenient and revealing forms. For discussions of scaling within different contexts, see Holmes (2019), Lin and Segel (1988), and Logan (2013).
Exercises 1. Prove the derivative relations in Result 2.3.1. Specifically, assuming the derivatives exist, show πππ¦ ππ π π π¦ = , π = 1, . . . , π. π π ππ‘π ππ‘ 2. Consider a function π¦ = π(π‘), where [π‘] = π and [π¦] = π , and let π£ = π€=
ππ£ . ππ‘
(a) Use properties of dimensions to find [π£], [π€] in terms of π, π . ππ π¦
(b) Use induction to find [ ππ‘π ] for any π β₯ 1.
ππ¦ ππ‘
and
3. Let π‘, π¦ and π, π, π be quantities with given units in the basis {πΏ, π}. Also, let ππ¦ π2 π¦ π3 π¦ π£ = ππ‘ , π€ = ππ‘2 and π§ = ππ‘3 . Assuming [π‘] = π and [π¦] = πΏ, find the dimensions [π], [π], [π] as needed to make the given equation unit-free. (a) π£ = ππ¦ β ππ‘ β π.
(b) π£ = ππ¦2 + ππ‘2 + ππ‘π¦.
(c) π€ = ππ¦ β ππ‘π¦ β ππ£.
(d) π§ = ππ¦2 + ππ‘ + ππ€.
4. Find characteristic scales for the function on the interval π‘ β [0, π‘0 ], where π‘0 > 0 is fixed but arbitrary. Here π‘, π¦ are variables and π π > 0 are parameters. [For parts (c) and (d) you may introduce simplifying assumptions on π π and π‘0 if needed.] (a) π¦ = π 1 π‘ + π 2 π‘2 + π 3 π‘3 . 2
(c) π¦ = π 3 πβπ1 π‘βπ2 π‘ .
(b) π¦ = π 1 + π 2 πβπ‘/π3 . (d) π¦ = π 1 π‘ + π 2 cos(ππ‘/π 3 ).
5. Let π‘, π¦ be variables with dimensions [π‘] = π, [π¦] = π , and let π π > 0 be parameters. For each function below, determine the dimensions of π π , and find a set of associated scales. (a) π¦ = π 1 π‘ + π 2 π‘2 + π 3 π‘3 . 2
(c) π¦ = π 3 πβπ1 π‘βπ2 π‘ .
(b) π¦ = π 1 + π 2 πβπ‘/π3 . (d) π¦ = π 1 π‘ + π 2 cos(ππ‘/π 3 ).
6. Let π¦ = π(π‘, π 1 , π 2 , π 3 ) be the solution of the following problem, where π 1 , π 2 , π 3 > 0 are parameters, and [π‘] = π and [π¦] = πΏπ β1 . ππ¦ = π 1 β π 2 π¦, ππ‘
π¦|π‘=0 = π 3 ,
π‘ β₯ 0.
(a) Find a set of associated scales π, π > 0 for the problem. (b) Find the scaled problem and identify the single independent dimensionless parameter π that appears. (c) Find the solution π¦ = π(π‘, π) of the scaled problem. Describe qualitative behavior of solution when π < 1, = 1 and > 1. 7. Let π¦ = π(π‘, π 1 , π 2 ) be the solution of the following problem, where π 1 , π 2 > 0 are parameters, and [π‘] = π and [π¦] = π. ππ¦ = βπ 1 π¦2 , ππ‘
π¦|π‘=0 = π 2 ,
π‘ β₯ 0.
(a) Find a set of associated scales π, π > 0 for the problem. (b) Find the scaled problem and show that its solution is a fixed function π¦ = π(π‘) involving no parameters. (c) What effect do π 1 , π 2 > 0 have on π¦ = π(π‘, π 1 , π 2 )? Can this solution qualitatively change as π 1 , π 2 are varied?
8. Let π¦ = π(π‘, π, π, π£ 0 , π¦0 ) be the solution of the following problem, where π, π¦0 > 0 and π, π£ 0 β₯ 0 are parameters, and [π‘] = π and [π¦] = π . π2π¦ = π β ππ¦, ππ‘2
ππ¦ | = π£0 , ππ‘ π‘=0
π¦|π‘=0 = π¦0 ,
π‘ β₯ 0.
(a) Find a set of associated scales for π‘, π¦ involving only π, π¦0 > 0. (b) Find the scaled problem and show that it contains two independent dimensionless parameters π1 , π2 . (c) Find the solution π¦ = π(π‘, π1 , π2 ) of the scaled problem. 9. Let π¦ = π(π‘, π, π, π, π¦0 ) be the solution of the following problem, where π, π, π¦0 > 0 and π β₯ 0 are parameters, and [π] = π, [π‘] = π and [π¦] = πΏ. π
π2π¦ ππ¦ β ππ¦, = βπ ππ‘ ππ‘2
ππ¦ | = 0, ππ‘ π‘=0
π¦|π‘=0 = π¦0 ,
π‘ β₯ 0.
(a) Find a set of associated scales for π‘, π¦ involving only π, π, π¦0 > 0. (b) Find the scaled problem and show that it contains only one independent dimensionless parameter π. (c) Show that the scaled solution type will be purely trigonometric, purely exponential, or some combination type depending on π; give explicit conditions for each type. 10. The temperature π’ at time π‘ of a chemically reacting body in a furnace is modeled by the following problem, where π, π’β , π’0 > 0 and π, β β₯ 0 are parameters, and [π‘] = π and [π’] = π©. ππ’ = π πββ/α΅ β π(π’ β π’β ), ππ‘
π’|π‘=0 = π’0 ,
π‘ β₯ 0.
(a) Find a set of associated scales for π‘, π’ involving only π, π’β > 0. (b) Find the scaled problem and show that it contains three independent dimensionless parameters. (c) Which dimensionless parameter must be very small to obtain the approximate equation ππ’/ππ‘ β 1 β π’? 11. The displacement π₯ at time π‘ of a nonlinear spring-mass system is modeled by the following problem, where π, π, β > 0 and π, π₯0 β₯ 0 are parameters, and [π] = π, [π‘] = π and [π₯] = πΏ. π
π2π₯ ππ₯ = βπ β ππ₯3 , ππ‘ ππ‘2
π
ππ₯ | = β, ππ‘ π‘=0
π₯|π‘=0 = π₯0 ,
π‘ β₯ 0.
(a) Find a set of associated scales for π‘, π₯ involving only π, π, β > 0. (b) Find the scaled problem. How many independent dimensionless parameters does it contain? (c) Which dimensionless parameter must be very small to obtain the approxi2 3 mate equation π 2 π₯/ππ‘ β βπ₯ ? 12. In a simple ecosystem, the number of prey π₯ and predators π¦ at time π‘ are modeled by the following problem, where πΌ, π½, πΎ, πΏ > 0 are parameters, and [π‘] = Time and [π₯] = Prey and [π¦] = Predator. ππ¦ = βπΎπ¦ + πΏπ₯π¦, ππ‘
ππ₯ = πΌπ₯ β π½π₯π¦, ππ‘
π₯|π‘=0 = π₯0 ,
π¦|π‘=0 = π¦0 .
(a) Find the dimensions of $\alpha, \beta, \gamma, \delta$.

(b) Find a set of scales $a, b, c > 0$ for $t, x, y$ so that the scaled differential equations become $d\bar{x}/d\bar{t} = \bar{x} - \bar{x}\,\bar{y}$ and $d\bar{y}/d\bar{t} = -\mu\bar{y} + \bar{x}\,\bar{y}$, where $\mu$ is a dimensionless parameter.

Mini-project. A model for the vertical motion of a projectile or ball tossed up into the air is

\[ m \frac{d^2 y}{dt^2} = -c \frac{dy}{dt} - mg, \quad \frac{dy}{dt}\Big|_{t=0} = v_0, \quad y|_{t=0} = 0, \quad t \ge 0. \]

[Figure: a projectile of mass $m$ at height $y$ above the ground, launched with velocity $v_0$, subject to gravity $g$ and air resistance $c$.]

Here $y$ is the projectile height above the ground, $m$ is the projectile mass, $c$ is an air resistance coefficient, $g$ is gravitational acceleration, and $t$ is time. In the ideal case when $c = 0$, so there is no air resistance, the height versus time curve is perfectly symmetric: the duration and shape of the ascent portion of the curve is the same as the descent portion. However, when $c > 0$, the curve is no longer symmetric: the height versus time curve becomes skewed, and the time it takes to ascend and descend are no longer equal to each other. Here we use a scaled form of the equations to explore how the asymmetry depends on the air resistance parameter $c \ge 0$ and other parameters $m, g, v_0 > 0$.

(a) Find scales for $t, y$ involving only $m, g, v_0$. Show that the scaled equations take the following form, for an appropriate dimensionless parameter $\mu$,

\[ \frac{d^2\bar{y}}{d\bar{t}^2} + \mu \frac{d\bar{y}}{d\bar{t}} + 1 = 0, \quad \frac{d\bar{y}}{d\bar{t}}\Big|_{\bar{t}=0} = 1, \quad \bar{y}|_{\bar{t}=0} = 0, \quad \bar{t} \ge 0. \]

(b) Solve the system in (a) for the case $\mu = 0$. Verify that the solution is an inverted parabola as shown. By hand, find $\bar{t}_1$ and $\bar{t}_2$. Show the ascent interval $[0, \bar{t}_1]$ is the same size as the descent interval $[\bar{t}_1, \bar{t}_2]$, that is, $\bar{t}_2 = 2\bar{t}_1$.
[Figure: axes t (horizontal) and y (vertical), with marked points 0, t1 and t2 on the horizontal axis.]
(c) Solve the system in (a) for the case $\mu = 0.5$. Verify by plotting that the solution is concave down, with intercepts at $0$ and $\bar{t}_2$, and a maximum at $\bar{t}_1$, for some $0 < \bar{t}_1 < \bar{t}_2$. By hand, find $\bar{t}_1$, and verify that $\bar{y}|_{\bar{t}=2\bar{t}_1} > 0$. Explain how this implies that the descent interval $[\bar{t}_1, \bar{t}_2]$ is larger than the ascent interval $[0, \bar{t}_1]$.

(d) Show the results in (c) hold for arbitrary $\mu > 0$. Does the asymmetry in the ascent and descent intervals become more or less pronounced as $\mu$ increases from zero? How do the maximum height and total flight interval $[0, \bar{t}_2]$ qualitatively change as $\mu$ increases from zero?
Chapter 3
One-dimensional dynamics
Mathematical models often take the form of differential equations that express how quantities change over time. Such equations may be first- or higher-order, linear or nonlinear, autonomous or nonautonomous, and may involve a number of parameters. In this chapter, we consider models in the form of a single, autonomous, first-order differential equation, and study various qualitative and geometric properties of their solutions, and explore how such properties depend on parameters.
3.1. Preliminaries

In the modeling of simple systems, we will often consider a first-order initial-value problem of the form

\[ \frac{du}{dt} = f(u, c_1, \ldots, c_m), \quad u|_{t=0} = u_0, \quad t \ge 0, \tag{3.1} \]

where $u, t$ are real variables, $u_0, c_1, \ldots, c_m$ are real parameters, and $f$ is a given function. The system in (3.1) is called a dynamical system for $u$. A solution is a function $u = u(t, u_0, c_1, \ldots, c_m)$, which is differentiable in $t$, and satisfies (3.1). For brevity, we will often omit the parameters and only show the variables in a functional relation. Hence we will abbreviate the equation as $\frac{du}{dt} = f(u)$, and a solution as $u = u(t)$. Parameters will be indicated when they are essential to a discussion.

A basic goal in the study of (3.1) is to characterize how a solution $u(t)$ depends on the parameters. For the moment, we fix $c_1, \ldots, c_m$, and only consider the dependence on $u_0$; the dependence on other parameters will be explored later. A solution $u(t)$ can be viewed in two different ways as illustrated in Figure 3.1. In a time view, a solution is viewed as a graph in the $t, u$-plane, and $\frac{du}{dt} = f(u)$ is the slope of this graph. Alternatively, in a phase view, a solution is viewed as a moving point on the $u$-axis. This point can only move along the axis, in the positive and negative directions, and $\frac{du}{dt} = f(u)$ is the velocity of the point, which depends on its position.
Figure 3.1. [Time view and phase view of a solution $u(t)$.]
3.2. Solvability theorem
The question of existence and uniqueness of solutions to (3.1) is addressed in the following result from the theory of ordinary differential equations. The set of all points where $f(u)$ is continuously differentiable is denoted by $D$, assumed to be an open set in the real line $\mathbb{R}$.

Result 3.2.1. The system in (3.1) has a unique solution $u(t) \in D$ for any $u_0 \in D$. Moreover:
(i) $u(t)$ exists and is in $D$ for some maximal interval $t \in (T_-, T_+)$, where $T_- < 0$ and $T_+ > 0$ depend on $u_0$,
(ii) if $T_-$ or $T_+$ is finite, then $u(t)$ leaves $D$ at $t = T_-$ or $t = T_+$,
(iii) if $u_0 \ne \tilde{u}_0$, then $u(t) \ne \tilde{u}(t)$ while both solutions exist in $D$.

Thus, within the set $D$ where $f(u)$ is continuously differentiable, the system in (3.1) has a unique solution $u(t)$ for any given $u_0$. This solution exists on some maximal, open interval of time that contains $t = 0$. While we focus on times $t \ge 0$, solutions are also defined for $t \le 0$, whether such times are relevant or not. Moreover, solutions with different initial conditions cannot intersect in the time view. At an intersection, two solution curves would extend from one point, which would violate uniqueness.

Regardless of whether $D = \mathbb{R}$ or $D \ne \mathbb{R}$, the maximal existence interval for $u(t)$ depends on $u_0$. This interval is the largest for which $u(t)$ is in $D$. Thus, if $T_-$ or $T_+$ is finite, then $u(t)$ must leave $D$ at that time. Outside of the set $D$, solutions $u(t)$ may not be unique or may not exist, or the function $f(u)$ and the differential equation may not be defined.

Example 3.2.1. Consider $\frac{du}{dt} = 3u$, $u|_{t=0} = u_0$. Since $f(u) = 3u$ is continuously differentiable for all $u \in \mathbb{R}$, Result 3.2.1 guarantees a unique solution for any $u_0 \in \mathbb{R}$. By the methods of calculus we find $u(t) = u_0 e^{3t}$. By properties of the exponential, this solution exists and is in $\mathbb{R}$ for $t \in (-\infty, \infty)$ for any $u_0$. Moreover, for any $u_0 \ne \tilde{u}_0$, we get $u_0 e^{3t} \ne \tilde{u}_0 e^{3t}$, or equivalently $u(t) \ne \tilde{u}(t)$, and solutions do not intersect for all time as illustrated in Figure 3.2(a).

Example 3.2.2. Consider $\frac{du}{dt} = u^2$, $u|_{t=0} = u_0$. Since $f(u) = u^2$ is continuously differentiable for all $u \in \mathbb{R}$, Result 3.2.1 guarantees a unique solution for any $u_0 \in \mathbb{R}$. By the methods of calculus we find $u(t) = u_0/(1 - u_0 t)$. Due to the denominator, the existence interval depends on $u_0$. The solution exists and is in $\mathbb{R}$ for $t \in (-\infty, \infty)$ if $u_0 = 0$, for $t \in (-\infty, \frac{1}{u_0})$ if $u_0 > 0$, and for $t \in (\frac{1}{u_0}, \infty)$ if $u_0 < 0$. In the latter two cases, the solution has a vertical asymptote, and $u(t)$ leaves the set $\mathbb{R}$ at the finite end point $t = \frac{1}{u_0}$ of the existence interval. As before, for any $u_0 \ne \tilde{u}_0$, we have $u(t) \ne \tilde{u}(t)$ while both solutions exist as illustrated in Figure 3.2(b).

Figure 3.2. [Solution curves in the $t,u$-plane for (a) Example 3.2.1 and (b) Example 3.2.2, with sample initial conditions $u_0 = 2, 0, -2$.]

Example 3.2.3. Consider $\frac{du}{dt} = -1/u$, $u|_{t=0} = u_0$. Since $f(u) = -1/u$ is continuously differentiable in the open set $D = \{u \mid u \ne 0\}$, Result 3.2.1 guarantees a unique solution for any $u_0 \in D$, say $u_0 > 0$. By the methods of calculus we find $u(t) = \sqrt{u_0^2 - 2t}$. Due to the square root, the existence interval depends on $u_0$. Specifically, $u(t)$ exists and is in $D$ for $t \in (-\infty, \frac{1}{2}u_0^2)$. Note that $u(t)$ leaves $D$ at the finite end point $t = \frac{1}{2}u_0^2$ of the interval. At this end point we get $u = 0$, but $\frac{du}{dt}$ and $f(u)$ and the differential equation are undefined (infinite) at $u = 0$. Two sample solution curves are illustrated in Figure 3.3.

Figure 3.3. [Solution curves in the $t,u$-plane for $u_0 = 1$ and $u_0 = 2$, ending at the boundary of $D$.]
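The finite existence interval in Example 3.2.2 can be checked numerically. The following is a minimal sketch, not part of the original text: it compares the closed-form solution $u_0/(1 - u_0 t)$ with a classical fourth-order Runge-Kutta integration for the illustrative choice $u_0 = 2$, for which the end point is $t = 1/u_0 = 0.5$. The function names `exact` and `rk4`, the step count, and the sample times are assumptions made for illustration.

```python
def exact(t, u0):
    """Closed-form solution u0 / (1 - u0*t) of du/dt = u^2 (Example 3.2.2)."""
    return u0 / (1.0 - u0 * t)


def rk4(f, u0, t_end, steps):
    """Integrate du/dt = f(u) from t = 0 to t_end with classical RK4."""
    dt = t_end / steps
    u = u0
    for _ in range(steps):
        k1 = f(u)
        k2 = f(u + 0.5 * dt * k1)
        k3 = f(u + 0.5 * dt * k2)
        k4 = f(u + dt * k3)
        u += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return u


u0 = 2.0                      # illustrative initial condition (assumption)
blow_up_time = 1.0 / u0       # end point T_+ = 1/u0 of the existence interval

# The numerical and exact values grow without bound as t approaches T_+.
for t_end in (0.2, 0.4, 0.49):
    numeric = rk4(lambda u: u * u, u0, t_end, 20000)
    print(t_end, numeric, exact(t_end, u0))
```

Any attempt to integrate past $t = 0.5$ would be meaningless, since the solution has left $\mathbb{R}$ at that time.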
3.3. Equilibria
Here we consider a special class of solutions to (3.1) that are constant in time. As we will see, knowledge of such constant solutions will be helpful in developing a qualitative or geometric understanding of arbitrary solutions.

Definition 3.3.1. A solution of (3.1) is called an equilibrium or steady state if it is constant in time, that is

$$u(t) \equiv u_* \quad \text{for all} \quad t \ge 0, \tag{3.2}$$

for some point $u_* \in D$. Since $\frac{du}{dt} = f(u)$, it follows that $u_*$ is an equilibrium if and only if $f(u_*) = 0$.

Thus equilibrium solutions must be roots of the function $f(u)$. Depending on this function, a system may have no equilibria, or one or more equilibria. Note that knowledge of such solutions is helpful since, by Result 3.2.1(iii), they provide barriers which other solutions cannot touch or cross.

Example 3.3.1. Consider $\frac{du}{dt} = u^2 - 3u$, $u|_{t=0} = u_0$. For this system, we have $f(u) = u^2 - 3u$, and the equation $f(u_*) = 0$ has two roots $u_* = 0, 3$. Thus for $u_0 = 0, 3$ we get constant solutions $u(t) \equiv 0, 3$. And for $u_0 \ne 0, 3$ we get nonconstant solutions that must satisfy $u(t) \ne 0, 3$ while defined. Note that if $u_0$ is in one of the intervals $(-\infty, 0)$, $(0, 3)$ or $(3, \infty)$, then $u(t)$ must remain in the same interval while defined as illustrated in Figure 3.4.

Figure 3.4. [Solution curves in the $t,u$-plane, including the equilibrium solutions with $u_0 = 0$ and $u_0 = 3$.]
3.4. Monotonicity theorem
The concept of equilibrium solutions can be combined with Result 3.2.1 to get a qualitative characterization of arbitrary solutions. To state the result, we consider the following three disjoint sets, some of which could be empty, but whose union is the entire set $D$ where $f(u)$ is continuously differentiable. Specifically, let

$$E = \{u \in D \mid f(u) = 0\}, \qquad I^+ = \{u \in D \mid f(u) > 0\}, \qquad I^- = \{u \in D \mid f(u) < 0\}. \tag{3.3}$$

Result 3.4.1. Consider (3.1) and let $u(t)$ be the solution with initial condition $u_0$.
(i) If $u_0 \in E$, then $u(t) \equiv u_0$ is an equilibrium solution.
(ii) If $u_0 \in I^+$, then $u(t)$ must remain in $I^+$ and increase while defined.
(iii) If $u_0 \in I^-$, then $u(t)$ must remain in $I^-$ and decrease while defined.

Thus solutions $u(t)$ must be monotonic in time for any initial condition $u_0$. The result follows from the fact that solutions cannot intersect or touch while they are defined. Specifically, if $u_0 \notin E$, then we must have $f(u(t)) \ne 0$, and by continuity, the signs of $f(u(t))$ and $f(u_0)$ must be the same. The monotonicity result then follows from this observation about signs, and the fact that $\frac{du}{dt} = f(u(t))$ is the slope of the graph of $u(t)$.
Example 3.4.1. Consider $\frac{du}{dt} = (u - 1)(4 - u)$, $u|_{t=0} = u_0$. For this system, we have $f(u) = (u - 1)(4 - u)$, and the equation $f(u_*) = 0$ has two roots $u_* = 1, 4$. The sign of $f(u)$ versus $u$ is shown in the table in Figure 3.5, and by inspection we get $E = \{1, 4\}$, $I^+ = (1, 4)$ and $I^- = (-\infty, 1) \cup (4, \infty)$. Solution curves for $u_0 \in E$ and various $u_0 \notin E$ are also shown. Note that all solutions are monotonic in accordance with the result.

Figure 3.5. [Sign table for $f(u)$ with zeros at $u = 1$ and $u = 4$, and solution curves in the $t,u$-plane, including the equilibrium solutions with $u_0 = 1$ and $u_0 = 4$.]
While monotonicity is a general feature of one-dimensional dynamical systems, we note that it is not a general feature of higher-dimensional systems. Indeed, systems with two or more time-dependent quantities may possess a rich variety of monotonic and non-monotonic solutions, including various types of spiraling and periodic solutions.
3.5. Stability of equilibria
Here we introduce a classification of equilibria. The classification will be based on the behavior of nearby solutions, and will lead to a better qualitative understanding of a system. In contrast to the monotonicity result, this classification will not be limited to one-dimensional systems, and will be generalized to higher-dimensional systems later.

Definition 3.5.1. Let $u_*$ be an equilibrium of (3.1), and for any $\epsilon > 0$ let $I_{*,\epsilon}$ denote the open interval $(u_* - \epsilon, u_* + \epsilon)$.
(1) $u_*$ is called asymptotically stable if for every $\epsilon > 0$ there is a $\delta > 0$ such that, if $u_0 \in I_{*,\delta}$ then $u(t) \in I_{*,\epsilon}$ for all $t \ge 0$, and $u(t) \to u_*$ as $t \to \infty$ for every $u_0 \in I_{*,\delta}$; see Figure 3.6(a).
(2) $u_*$ is called neutrally stable if for every $\epsilon > 0$ there is a $\delta > 0$ such that, if $u_0 \in I_{*,\delta}$ then $u(t) \in I_{*,\epsilon}$ for all $t \ge 0$, and $u(t) \not\to u_*$ as $t \to \infty$ for some $u_0 \in I_{*,\delta}$; see Figure 3.6(b).
(3) $u_*$ is called unstable if it is not asymptotically or neutrally stable; see Figure 3.6(c) and (d).

Thus every equilibrium solution $u_*$ can be classified as one of three types: asymptotically stable, neutrally stable, or unstable. Denoting the solution of (3.1) by $u(t, u_0)$, we note that stability is a form of continuity of this function with respect to $u_0$. Specifically, asymptotic and neutral stability imply that $|u(t, u_0) - u(t, u_*)|$ will be arbitrarily small for all time $t \ge 0$ provided that $|u_0 - u_*|$ is sufficiently small, where $u(t, u_*) \equiv u_*$.
Figure 3.6. [Time views of solutions near an equilibrium $u_*$: (a) attractor or sink, with bands of width $2\epsilon$ and $2\delta$; (b) neutral, with bands of width $2\epsilon$ and $2\delta$; (c) repeller or source; (d) semi-stable or hyperbolic.]
Analogously, instability implies that $|u(t, u_0) - u(t, u_*)|$ will not be arbitrarily small for all time $t \ge 0$ for some $u_0$, no matter how small $|u_0 - u_*|$ may be. In view of the monotonicity result, the classification of an equilibrium can be determined by examining the sign of $f(u)$ around $u_*$.

Intuitively, an asymptotically stable equilibrium $u_*$ can be interpreted as a preferred state of the system. All solutions that start sufficiently close to such an equilibrium will remain close, and will be pulled into the equilibrium, as time goes on. An unstable equilibrium $u_*$ can be interpreted as an unpreferred state. Some solutions that start arbitrarily close, but not at such an equilibrium, will be pushed away over the course of time. A neutrally stable equilibrium $u_*$ can be interpreted as a borderline case. All solutions that start sufficiently close to such an equilibrium will remain close, but some will not be pulled into the equilibrium.

As indicated in Figure 3.6, an asymptotically stable equilibrium is also called an attractor or sink, and an unstable equilibrium can be a repeller or source, or it can be semi-stable or hyperbolic. Equilibria that are neutrally stable are rather special and not common in one-dimensional systems; however, they will be more common in the higher-dimensional case.

Example 3.5.1. Consider $\frac{du}{dt} = 4u^2 - u^3$, $u|_{t=0} = u_0$. For this system, we have $f(u) = 4u^2 - u^3$, and the equation $f(u_*) = 0$ has two distinct roots $u_* = 0, 4$. The sign of $f(u)$ versus $u$ is shown in the table in Figure 3.7. The pattern of signs around $u_* = 4$ indicates that solutions with $u_0 \in (0, 4)$ will increase toward $u_*$, and solutions with $u_0 \in (4, \infty)$ will decrease toward $u_*$, which implies that $u_*$ is asymptotically stable; it is an attractor. The pattern of signs around $u_* = 0$ indicates that any solution with $u_0 \in (0, 4)$ will increase and move away from $u_*$, and solutions with $u_0 \in (-\infty, 0)$ will increase toward $u_*$, which implies that $u_*$ is unstable; it is hyperbolic. The equilibria and various solution curves around them are illustrated in Figure 3.7.
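The sign-pattern argument of Example 3.5.1 can be sketched in a few lines of code. The helper below is an illustration only, not part of the original text: it samples $f$ just to the left and right of an isolated equilibrium and reports the type implied by the monotonicity result. The name `classify` and the offset `eps` are assumptions, and `eps` must be small enough that no other root of $f$ lies within that distance.

```python
def classify(f, u_star, eps=1e-3):
    """Classify an isolated equilibrium of du/dt = f(u) from the signs of f nearby."""
    left, right = f(u_star - eps), f(u_star + eps)
    if left > 0 and right < 0:
        return "attractor"        # solutions on both sides move toward u_star
    if left < 0 and right > 0:
        return "repeller"         # solutions on both sides move away
    if left * right > 0:
        return "semi-stable"      # attracted from one side, repelled from the other
    return "undetermined"         # f vanishes at a sample point


def f(u):
    """Right-hand side of Example 3.5.1."""
    return 4 * u**2 - u**3


for u_star in (0.0, 4.0):
    print(u_star, classify(f, u_star))
```

For this example the sketch reports a semi-stable equilibrium at $u_* = 0$ and an attractor at $u_* = 4$, in agreement with the sign table.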
Figure 3.7. [Sign table for $f(u)$ with zeros at $u = 0$ and $u = 4$, and solution curves in the $t,u$-plane, including the equilibrium solutions with $u_0 = 0$ and $u_0 = 4$.]

Example 3.5.2. Consider the trivial system $\frac{du}{dt} \equiv 0$, $u|_{t=0} = u_0$. Here we have $f(u) \equiv 0$, and every $u_*$ on the real line is a root and hence an equilibrium. Moreover, since all solutions are constant in time, every $u_*$ is neutrally stable, as can be inferred from the definition with $\delta = \epsilon$.
3.6. Derivative test for stability
In some cases, examining the sign of $f(u)$ around an equilibrium $u_*$ can be tedious, especially when the function involves a number of parameters. The next result provides a simple test that will be helpful in various situations.

Result 3.6.1. Let $u_*$ be an equilibrium of (3.1), and let $f'$ denote $\frac{df}{du}$.
(i) If $f'(u_*) < 0$, then $u_*$ is asymptotically stable; it is an attractor.
(ii) If $f'(u_*) > 0$, then $u_*$ is unstable; it is a repeller.
(iii) If $f'(u_*) = 0$, then $u_*$ may be stable or unstable.

The above test is based on the fact that, if $f(u_*) = 0$ and $f'(u_*) \ne 0$, then the function $f(u)$ is either increasing or decreasing at $u_*$, which determines the sign pattern of $f(u)$ around $u_*$. For instance, if $f'(u_*) > 0$, then $f(u)$ is increasing, and it must be negative in an interval to the left of $u_*$, zero at $u_*$, and positive in an interval to the right of $u_*$. Similar conclusions can be made if $f'(u_*) < 0$. Note that, if $f'(u_*) = 0$, then there are no implications for the sign pattern of $f(u)$.

Example 3.6.1. Consider $\frac{du}{dt} = \frac{a}{1+u} - bu$, $u|_{t=0} = u_0$, where $a, b > 0$ are constants. Suppose we only care about solutions with $u \ge 0$. For this system, we have $f(u) = \frac{a}{1+u} - bu$, and the equation $f(u_*) = 0$ has two real, distinct roots $u_*^{\pm} = \frac{1}{2b}\left(-b \pm \sqrt{b^2 + 4ab}\right)$. In view of the restriction $u \ge 0$, we discard the negative root, and only consider the single equilibrium $u_*^{+} = \frac{1}{2b}\left(-b + \sqrt{b^2 + 4ab}\right) > 0$. The construction of a sign table for $f(u)$ around this equilibrium would be tedious. However, a relatively straightforward calculation gives

$$f'(u_*^{+}) = \frac{-a}{(1 + u_*^{+})^2} - b. \tag{3.4}$$

Since $f'(u_*^{+}) < 0$ for any $a, b > 0$, it follows that $u_*^{+}$ is an attractor for any $a, b > 0$.
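As a quick numerical check of the derivative test in Example 3.6.1, the sketch below writes the two positive constants as `a` and `b`, picks the illustrative values $a = 2$, $b = 1$ (an assumption, chosen so that the positive equilibrium is exactly $1$), and compares a central-difference estimate of $f'$ with the closed form in (3.4). The helper names and the tolerance are hypothetical.

```python
import math


def derivative(f, u, step=1e-6):
    """Central-difference estimate of f'(u)."""
    return (f(u + step) - f(u - step)) / (2.0 * step)


def derivative_test(f, u_star, tol=1e-8):
    """Apply Result 3.6.1 using a numerical slope; tol guards the borderline case."""
    slope = derivative(f, u_star)
    if slope < -tol:
        return "attractor"
    if slope > tol:
        return "repeller"
    return "inconclusive"


a, b = 2.0, 1.0                                   # illustrative constants (assumption)


def f(u):
    """Right-hand side a/(1+u) - b*u of Example 3.6.1."""
    return a / (1.0 + u) - b * u


u_plus = (-b + math.sqrt(b * b + 4.0 * a * b)) / (2.0 * b)   # positive root
closed_form = -a / (1.0 + u_plus) ** 2 - b                    # equation (3.4)

print(u_plus, derivative(f, u_plus), closed_form, derivative_test(f, u_plus))
```

With these values the equilibrium is $u_*^{+} = 1$ and the slope is $-1.5$, so the test reports an attractor.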
3.7. Bifurcation of equilibria
Equilibrium solutions and their stability are key features that provide important qualitative information about a system. The dependence of these features on any parameter can be illustrated in a graphical diagram as defined next. For the following, consider a system

$$\frac{du}{dt} = f(u, h), \qquad u|_{t=0} = u_0, \qquad t \ge 0, \tag{3.5}$$

where $h$ is an arbitrary parameter of interest. Note that equilibrium solutions $u_*$ must satisfy $f(u_*, h) = 0$.

Definition 3.7.1. The set of all points $(u_*, h)$ satisfying $f(u_*, h) = 0$, with stability of $u_*$ indicated at each point, is called a bifurcation diagram.

A bifurcation diagram thus provides a direct graphical illustration of how the number, location, and stability of equilibrium solutions $u_*$ depend on a parameter $h$ in the system. Depending on the context, $u_*$ and $h$ may be allowed to vary over all real values, or they may be subject to restrictions, and allowed to vary only in given intervals.

For the case considered here, with one variable $u_*$ and one parameter $h$, the set of points satisfying $f(u_*, h) = 0$ will generally be a set of curves in the $u_*, h$-plane. Analogous diagrams could be contemplated when a dependence on more than one parameter is to be explored. For instance, if two parameters $h, c$ are considered, then the set of points satisfying $f(u_*, h, c) = 0$ would generally be a set of surfaces in $u_*, h, c$-space. Here we consider the dependence of equilibria on only a single parameter at a time.

Example 3.7.1. Consider $\frac{du}{dt} = u^3 - uh$, $u|_{t=0} = u_0$, where $h$ is a parameter. To construct a bifurcation diagram, we note that the equation $f(u_*, h) = u_*^3 - u_* h = 0$ has three real, distinct solutions $u_*$ in terms of $h$. One solution is $u_* = 0$ for all $h$, and the other two solutions are $u_* = \pm\sqrt{h}$ for all $h > 0$ (when $h = 0$, the pair reduces to $u_* = 0$, and when $h < 0$, the pair is not real). The three solutions are illustrated in Figure 3.8(a).
Figure 3.8. [(a) Equilibrium curves $u_*$ versus $h$; (b) sign table for $f(u, 0)$ around $u = 0$; (c) bifurcation diagram with stable and unstable branches marked.]
We next proceed to assess the stability of each equilibrium $u_*$ for each value of $h$. To employ the derivative test, we consider $\frac{\partial f}{\partial u}(u_*, h) = 3u_*^2 - h$. For $u_* = 0$ we get $\frac{\partial f}{\partial u} = -h$. When $h > 0$, the derivative is negative, which implies $u_*$ is a stable attractor. When $h < 0$, the derivative is positive, which implies $u_*$ is an unstable repeller. When $h = 0$, the derivative test is inconclusive, and so we consider a sign table for $f(u, 0) = u^3$ around $u_*$ as shown in Figure 3.8(b), which implies $u_*$ is an unstable repeller in this special case. For the pair $u_* = \pm\sqrt{h}$, which exist only for $h > 0$, we get $\frac{\partial f}{\partial u} = 2h > 0$, which implies the pair are always unstable repellers. The bifurcation diagram is obtained by superimposing the stability information onto the curves in Figure 3.8(a); the result is shown in Figure 3.8(c).
Figure 3.9. [Qualitative time views of solution curves in the $t,u$-plane: (a) and (b).]
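The construction in Example 3.7.1 can be tabulated programmatically. The sketch below is only an illustration, with hypothetical function names: for a given $h$ it lists the real roots of $u^3 - uh$ and classifies each one with the derivative test, falling back to a sign check of $f$ when the derivative vanishes (which for this system happens only at $u_* = 0$, $h = 0$).

```python
import math


def f(u, h):
    """Right-hand side of Example 3.7.1."""
    return u**3 - u * h


def equilibria(h):
    """Real roots of u^3 - u*h = 0 for a given h."""
    if h > 0:
        return [-math.sqrt(h), 0.0, math.sqrt(h)]
    return [0.0]


def stability(u_star, h, eps=1e-3):
    """Derivative test, with a sign-table fallback when the derivative is zero."""
    slope = 3.0 * u_star**2 - h           # partial derivative of f in u
    if slope < 0:
        return "attractor"
    if slope > 0:
        return "repeller"
    left, right = f(u_star - eps, h), f(u_star + eps, h)
    if left > 0 > right:
        return "attractor"
    if left < 0 < right:
        return "repeller"
    return "semi-stable"


# One row of the bifurcation diagram per sample value of h.
for h in (-1.0, 0.0, 1.0):
    print(h, [(u, stability(u, h)) for u in equilibria(h)])
```

Sweeping $h$ over a fine grid and plotting the resulting points, with a marker style per stability type, would reproduce the qualitative content of a bifurcation diagram.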
The information contained in the bifurcation diagram can be translated into a qualitative time view. When $h \le 0$, the diagram shows that the system has only a repeller at $u_* = 0$, which implies that solution curves for various $u_0$ must have a time view as shown in Figure 3.9(a). In contrast, when $h > 0$, the diagram shows that there is a repeller at $u_* = -\sqrt{h}$, an attractor at $u_* = 0$, and another repeller at $u_* = \sqrt{h}$, which implies that solution curves for various $u_0$ must have a time view as shown in Figure 3.9(b). Thus the bifurcation diagram provides a concise summary of the system.

Example 3.7.2. Consider $\frac{du}{dt} = hu - g(u)$, $u|_{t=0} = u_0$, where $h$ is a parameter. Let $g(u)$ be a given function, with intercepts at $a$ and $b$ as sketched in Figure 3.10(a), but whose explicit form is unknown. We consider this system subject to the restrictions $u \ge 0$ and $h \ge 0$. To construct a bifurcation diagram, we note that the function $f(u, h)$ can be written in the form $f(u, h) = y_1 - y_2$, where $y_1 = hu$ and $y_2 = g(u)$ are two graphs in the $u,y$-plane. For given $h$, the value of $f(u, h)$ at any $u$ is simply the signed distance between the graphs at that $u$.
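Figure 3.10. [Graphs of $y_1 = hu$ and $y_2 = g(u)$ in the $u,y$-plane, with the sign of $f(u, h)$ indicated: (a) large $h$, with the intercepts $a$ and $b$ of $g$; (b) $h = h_\#$, with tangency at $u_\#$; (c) $h < h_\#$, with intersections at $u_L$ and $u_R$.]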
For large values of $h$, we note that $f(u, h) > 0$ for all $u$ as shown in Figure 3.10(a), and there are no equilibria. For a certain value $h = h_\#$, the graphs are tangent at a point $u = u_\#$ as shown in Figure 3.10(b). Hence this point is an equilibrium, and the sign pattern for $f(u, h_\#)$, which is indicated in the figure, shows that $u_\#$ is hyperbolic. For any given $h < h_\#$, the graphs intersect at two points $u = u_L, u_R$ as shown in Figure 3.10(c). These points are equilibria, and the sign patterns for $f(u, h)$ show that $u_L$ is an attractor and that $u_R$ is a repeller. As the slope $h$ of the line varies between $0$ and $h_\#$, we make the qualitative observation that $u_L \to a$ and $u_R \to b$ as $h \to 0$; moreover, $u_L \to u_\#$ and $u_R \to u_\#$ as $h \to h_\#$. Since there are no equilibria for $h > h_\#$, we obtain a bifurcation diagram as qualitatively sketched in Figure 3.11.

Figure 3.11. [Bifurcation diagram of $u_*$ versus $h$: a stable branch $u_L$ starting at $a$ and an unstable branch $u_R$ starting at $b$, meeting at $u_\#$ when $h = h_\#$.]
As before, the information in the diagram can be translated into a time view, and some interesting features can be exposed. For example, consider the solution curve $u(t)$ in the time view with initial condition $u_0 = u_\#$. For any $h < h_\#$, this solution would remain bounded for all time: it would be repelled from the equilibrium at $u_R$, and attracted to the equilibrium at $u_L$. For $h = h_\#$, this solution would itself be an equilibrium. However, under the slightest increase of the parameter to $h > h_\#$, this solution would no longer remain bounded: it would grow uncontrollably in time since $f(u, h) > 0$ for all $u$ when $h > h_\#$.
3.8. Case study
Setup. To illustrate the preceding results on stability and bifurcation, and some models arising in ecology, we study the dynamics of a population in a simple ecosystem. Figure 3.12 shows the system, which consists of a fixed region of land, and three coexisting populations of insects, plants and birds. The populations interact in the sense that the insects consume the plants, and the birds consume the insects. Under the simplifying assumption that the number of plants and birds is steady, we study a model for the number of insects. We seek to understand how the insect population changes in time, and how the population versus time curve is influenced by various parameters in the model.

Figure 3.12. [Ecosystem with plants, insects and birds.]
Outline of model. Let $p$ denote the insect population size at time $t$, with dimensions of $[p]$ = Insect and $[t]$ = Time. We assume that the only factors influencing the insect population are natural births and deaths, and consumption by the birds. Specifically, we neglect any immigration and emigration of insects between neighboring regions, and any supply or removal of insects by external mechanisms, such as the application of insecticides. Thus a simple balance equation for the population size takes the form

$$\frac{dp}{dt} = F(p) - G(p), \tag{3.6}$$

where $F$ denotes the rate of change due to natural births and deaths, and $G$ denotes the rate of change due to consumption by the birds. Note that the dimensions $[F]$ and $[G]$ are both Insect/Time.

Births, deaths of insects. To describe the demographic characteristics of the insects in their environment, we consider a logistic model for the rate $F$, of the form

$$F(p) = rp\left(1 - \frac{p}{k}\right), \tag{3.7}$$

where $r, k > 0$ are constants with dimensions of $[r]$ = 1/Time and $[k]$ = Insect. The $F$ versus $p$ curve for this model is an inverted parabola, with intercepts at $p = 0$ and $p = k$, and a maximum value of $F_{\max} = \frac{rk}{4}$ at $p = \frac{k}{2}$ as shown in Figure 3.13(a). For populations of size $0 < p < k$, we have $F > 0$, which means that more births than deaths occur per unit time. Similarly, for populations of size $p > k$, we have $F < 0$, which means that more deaths than births occur per unit time. Thus the parameter $k$ can be seen as the maximum sustainable population, or carrying capacity, that can be supported by the environment, due to limited food, space, and other resources. The parameter $r$ is the initial slope of the curve, that is, $\frac{dF}{dp}\big|_{p=0} = r$. Note that the parameters $r$ and $k$ together determine the maximum value $F_{\max}$ in the model.
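Figure 3.13. [(a) $F$ versus $p$: inverted parabola with intercepts at $0$ and $k$ and maximum value $\frac{rk}{4}$; (b) $G$ versus $p$: sigmoidal curve with inflection point at $p = n/\sqrt{3}$ and saturation value $m$.]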
Consumption of insects. To describe the feeding characteristics of the birds in their environment, we consider a Holling-type model for the rate $G$, of the form

$$G(p) = \frac{mp^2}{n^2 + p^2}, \tag{3.8}$$

where $m, n > 0$ are constants with dimensions of $[m]$ = Insect/Time and $[n]$ = Insect. The $G$ versus $p$ curve for this model is a sigmoidal or "s"-shaped curve, with an intercept at $p = 0$, an inflection point at $p = \frac{n}{\sqrt{3}}$, and a saturation or horizontal asymptote value of $G_{\text{sat}} = m$ as shown in Figure 3.13(b). For insect populations of size $0 < p \ll \frac{n}{\sqrt{3}}$, we have $G \approx 0$, which means that the birds barely notice the insects. For populations of size $p \gg \frac{n}{\sqrt{3}}$, we have $G \approx m$, which means that the birds now notice the insects, and catch and consume them as quickly as they can, which is some rate near the upper bound $m$. For population sizes near $p = \frac{n}{\sqrt{3}}$, the consumption rate experiences its maximum variation, and we have $0 < G < m$.

Model equations. Combining (3.6)-(3.8), we obtain the dynamical system

$$\frac{dp}{dt} = rp\left(1 - \frac{p}{k}\right) - \frac{mp^2}{n^2 + p^2}, \qquad p|_{t=0} = p_0, \qquad t \ge 0. \tag{3.9}$$
We seek to understand the behavior of solutions in terms of the parameters $r, k, m, n > 0$ and initial population $p_0 \ge 0$. Although the system is mathematically well defined for all $p$, we focus only on physically meaningful solutions with $p \ge 0$.

Analysis of model. To study the above system it will be convenient to rewrite it in dimensionless form. For this purpose, we introduce the scale transformation $\tau = mt/n$ and $u = p/n$, and by the usual method we obtain

$$\frac{du}{d\tau} = hu\left(1 - \frac{u}{c}\right) - \frac{u^2}{1 + u^2}, \qquad u|_{\tau=0} = u_0, \qquad \tau \ge 0, \tag{3.10}$$

where $h = \frac{rn}{m}$, $c = \frac{k}{n}$ and $u_0 = \frac{p_0}{n}$ are dimensionless parameters. For fixed $c > 0$, we seek a bifurcation diagram in terms of $h > 0$. We denote the right-hand side of the differential equation by $f(u, h)$.

Equilibria, stability. To characterize the equilibria and their stability, we consider the partial factorization

$$f(u, h) = u\,g(u, h) \quad \text{where} \quad g(u, h) = h\left(1 - \frac{u}{c}\right) - \frac{u}{1 + u^2}. \tag{3.11}$$

Note that $\operatorname{sign} f(u, h) = \operatorname{sign} g(u, h)$ when $u > 0$, and that $f(u, h) = 0$ when $u = 0$ or $g(u, h) = 0$. The equilibrium $u = 0$ will be called trivial, and any equilibria satisfying $g(u, h) = 0$ will be called nontrivial.

The trivial equilibrium $u = 0$ is a solution for all $h > 0$. Using the derivative test, we find $\frac{\partial f}{\partial u}(0, h) = h$, which implies that this equilibrium is a repeller for all values of the parameter. Nontrivial equilibria must satisfy $g(u, h) = 0$. Solving this equation for $u$ in terms of $h$ is difficult; however, solving for $h$ in terms of $u$ is straightforward, namely

$$g(u, h) = 0 \iff h\left(1 - \frac{u}{c}\right) = \frac{u}{1 + u^2} \iff h = \frac{u}{\left(1 - \frac{u}{c}\right)(1 + u^2)}. \tag{3.12}$$

Considering only points with $h > 0$ and $u > 0$, a sketch of the equation $g(u, h) = 0$ is illustrated in Figure 3.14(a). Every point on the curve gives a nontrivial equilibrium $u$ for a corresponding value of $h$. Note that the equation has no solutions with $u = c$, and solutions with $u > c$ have $h < 0$ and are discarded. Thus the only nontrivial equilibria are in the interval $0 < u < c$. At points above and below the curve, it will be convenient to label the regions with the sign of $g(u, h)$ as shown.
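Although $g(u, h) = 0$ is awkward to solve by hand for $u$, the nontrivial equilibria are easy to locate numerically for given parameter values. The sketch below is an illustration under stated assumptions: the values $h = 0.5$ and $c = 10$ are arbitrary choices, and the function names, grid size and offset are hypothetical. It brackets sign changes of $g$ on $0 < u < c$, refines each by bisection, and classifies each root from the sign of $g$ on either side, using the fact that $f$ and $g$ have the same sign for $u > 0$.

```python
def g(u, h, c):
    """Factor g(u, h) from (3.11); nontrivial equilibria solve g = 0."""
    return h * (1.0 - u / c) - u / (1.0 + u * u)


def bisect(func, lo, hi, iters=80):
    """Bisection on [lo, hi], assuming func changes sign across the interval."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if func(lo) * func(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def nontrivial_equilibria(h, c, cells=997):
    """Scan 0 <= u <= c on a uniform grid and refine every sign change of g."""
    roots = []
    for i in range(cells):
        lo, hi = c * i / cells, c * (i + 1) / cells
        if g(lo, h, c) * g(hi, h, c) < 0.0:
            roots.append(bisect(lambda u: g(u, h, c), lo, hi))
    return roots


def classify(u_star, h, c, eps=1e-3):
    """Sign of g (hence of f, for u > 0) just left and right of the equilibrium."""
    left, right = g(u_star - eps, h, c), g(u_star + eps, h, c)
    if left > 0.0 > right:
        return "attractor"
    if left < 0.0 < right:
        return "repeller"
    return "semi-stable"


h, c = 0.5, 10.0                      # illustrative parameter values (assumption)
roots = nontrivial_equilibria(h, c)
kinds = [classify(u, h, c) for u in roots]
print(list(zip(roots, kinds)))
```

For these particular values, $g = 0$ reduces to the cubic $u^3 - 10u^2 + 21u - 10 = 0$, whose roots are $u = 2$ and $u = 4 \pm \sqrt{11}$, so the sketch finds three nontrivial equilibria: an attractor, a repeller, and another attractor.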
Figure 3.14. [Sketch of the curve $g(u, h) = 0$ for $0 < u < c$, with the regions $g > 0$ and $g < 0$ labeled, a sample point $(A, B)$ on the curve together with the line $h = A$, and the corresponding sign table for $f(u, h)$.]

For a point $(A, B)$ on the curve, along the line $h = A$ we have $g(u, h) < 0$ for $u > B$ and $g(u, h) > 0$ for $u < B$. Since $f(u, h)$ and $g(u, h)$ have the same sign for any $u > 0$, the sign table for $f(u, h)$ must be as shown in Figure 3.14(c). Thus the nontrivial equilibrium at $(A, B)$ is an attractor. This procedure can be applied to each point on the curve. Note that stability changes at the two turning points, and that these two points are unstable, hyperbolic equilibria.

Bifurcation diagram. The above results for the trivial and nontrivial equilibria can be combined into a bifurcation diagram as illustrated in Figure 3.15. Note that the precise shape of the curve $g(u, h) = 0$ will depend on the fixed parameter $c > 0$, and only a qualitative sketch is shown. The parameter values $h_1$ and $h_2$ corresponding to the two turning points play an interesting role; they can be associated with infestation and extermination events. For any given value of $h$, the insect population will settle into some stable size, but seasonal variations can cause abrupt changes. For instance, if the stable population size is initially small, and $h$ increases across $h_2$, then a sudden rise in the stable population size will result, and an infestation of insects will seem to appear from nowhere (think of crickets at certain times of the year!). After such an infestation, if $h$ decreases across $h_1$, then a sudden drop in the stable population size will result, and the insects will seem to have disappeared or been naturally exterminated.

Figure 3.15. [Bifurcation diagram of $u$ versus $h$ for $0 \le u \le c$, with stable and unstable branches, turning points at $h = h_1$ and $h = h_2$, and the associated sudden drop and sudden rise.]
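The turning-point values $h_1$ and $h_2$ can be estimated from (3.12) by locating the interior extrema of $h$ as a function of $u$. The following is a sketch under stated assumptions, not the author's procedure: it takes the illustrative value $c = 10$, uses the logarithmic derivative $\frac{d}{du}\ln h = \frac{1}{u} + \frac{1}{c-u} - \frac{2u}{1+u^2}$ (which has the same sign as $\frac{dh}{du}$ on $0 < u < c$), and finds its sign changes by a grid scan followed by bisection. All function names are hypothetical.

```python
def h_on_curve(u, c):
    """Equation (3.12): the value of h at which u is a nontrivial equilibrium."""
    return u / ((1.0 - u / c) * (1.0 + u * u))


def log_slope(u, c):
    """d/du of ln h(u); same sign as dh/du for 0 < u < c."""
    return 1.0 / u + 1.0 / (c - u) - 2.0 * u / (1.0 + u * u)


def bisect(func, lo, hi, iters=80):
    """Bisection on [lo, hi], assuming func changes sign across the interval."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if func(lo) * func(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def turning_points(c, cells=997):
    """Return (u, h) pairs where h(u) has an interior local extremum."""
    points = []
    for i in range(1, cells - 1):          # stay strictly inside 0 < u < c
        lo, hi = c * i / cells, c * (i + 1) / cells
        if log_slope(lo, c) * log_slope(hi, c) < 0.0:
            u = bisect(lambda v: log_slope(v, c), lo, hi)
            points.append((u, h_on_curve(u, c)))
    return points


c = 10.0                                   # illustrative value (assumption)
points = turning_points(c)
h1 = min(h for _, h in points)             # sudden drop as h decreases across h1
h2 = max(h for _, h in points)             # sudden rise as h increases across h2
print(points, h1, h2)
```

For $c = 10$ this gives roughly $h_1 \approx 0.384$ and $h_2 \approx 0.560$; for values of $h$ between them the system has two competing stable population sizes.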
Reference notes
The purpose of this chapter was to introduce the qualitative theory of ordinary differential equations in the simple context of a single, first-order, autonomous equation. The concepts of stability and bifurcation outlined here arise in a wide range of applications, and permeate a large part of applied mathematics.

A proof of the theorem on existence and uniqueness of solutions, within the context of general systems, can be found in the classic texts by Coddington and Levinson (1955) and Hirsch and Smale (1974). For elementary texts which highlight the one-dimensional case, see Arnold (1992), Kelley and Peterson (2010), and Strogatz (2015).
Exercises
1. Find the solution $u(t)$ and its interval of existence $t \in (T_-, T_+)$ for an arbitrary initial condition $u|_{t=0} = u_0$.
(a) $\frac{du}{dt} = 1 - 2u$.
(b) $\frac{du}{dt} = u^3$.
(c) $\frac{du}{dt} = (3 - u)^2$.
(d) $\frac{du}{dt} = 1 + u^2$.
(e) $\frac{du}{dt} = e^{-2u}$.
(f) $\frac{du}{dt} = 1 + e^{u}$.
2. Find the solution π’(π‘) and its interval of existence π‘ β (πβ , π+ ) for an arbitrary initial condition π’|π‘=0 = π’0 , subject to the given restriction. = βπ’,
(c)
πα΅ ππ‘ πα΅ ππ‘
=
1 , 1βα΅
(e)
πα΅ ππ‘
=
1 , α΅3
(a)
π’ > 0. π’ > 1. π’ > 0.
= βπ’ ln(3π’),
(d)
πα΅ ππ‘ πα΅ ππ‘
(f)
πα΅ ππ‘
= β1 β π’ 2 ,
(b)
=
1 , cos(α΅)
π’ > 0.
π β2
0}. (a) Verify that π’(π‘) β‘ 0 is a solution with π’0 = 0, and π’(π‘) Μ = π‘2 is a solution with π’Μ0 = 0 for π‘ β₯ 0. πα΅
(b) Observe that $u(t)$ and $\tilde{u}(t)$ are two solutions of the same problem: $\frac{du}{dt} = 2\sqrt{u}$, $u|_{t=0} = 0$. Does this violate the uniqueness part of Result 3.2.1? Why not?
5. Find all initial conditions π’0 for which solutions π’(π‘) would increase in time while defined, or would decrease, subject to any given restrictions. Identify any equilibria. (a)
πα΅ ππ‘
= 4 β 6π’.
(c)
πα΅ ππ‘
=
(e)
πα΅ ππ‘
= βπ’ ln(π’),
6β5α΅+α΅2 . 1+α΅2
π’ > 0.
(b)
πα΅ ππ‘
= 2π’ β π’3 .
(d)
πα΅ ππ‘
= 9π’2 β π’4 .
(f)
πα΅ ππ‘
=
1 , cos(α΅)
π
β2 < π’
π > 0 are constants. (a)
πα΅ ππ‘
= 3π’2 β π’3 .
(b)
πα΅ ππ‘
= (π β π’)(π β π’)3 .
(c)
πα΅ ππ‘
= π(α΅ ) β πβπα΅ .
2
(d)
πα΅ ππ‘
= π sin π’ β π cos π’.
(e)
πα΅ ππ‘
= βππ’ ln(ππ’), π’ > 0.
(f)
πα΅ ππ‘
=
πα΅ π+α΅
+ π’ β π, π’ > βπ.
πα΅
7. Consider ππ‘ = π(π’), π’|π‘=0 = π’0 . Suppose that π(π’) = βπ β² (π’) for some function π(π’). Such a function is called a potential for the system. (a) Show that any critical point of π(π’) is an equilibrium of the system. (b) Show that any strict local minimum of π(π’) is an attractor of the system, and any strict local maximum is a repeller. (c) Show that, for any solution π’(π‘), the function π(π’(π‘)) is either decreasing or constant in time. 8. Construct a bifurcation diagram of equilibria π’β with respect to the parameter β. Here π, π > 0 are fixed constants. (a)
πα΅ ππ‘
= βπ’ β π’2 .
(b)
πα΅ ππ‘
= (β β π)π’ β ππ’3 .
(c)
πα΅ ππ‘
= (π β π’)(π’2 β β).
(d)
πα΅ ππ‘
= π’2 β π’β2 + π’β.
(e)
πα΅ ππ‘
= π’πα΅ β π’β.
(f)
πα΅ ππ‘
= ln(β2 + π’2 ) β 1, β β 0.
(g)
πα΅ ππ‘
= βπ’(π’ + 1) β π’.
(h)
πα΅ ππ‘
= π’2 β β(π’ β 1).
9. Use a graphical or similar qualitative argument to construct a bifurcation diagram of equilibria π’β with respect to β, subject to any given restrictions. Here π > 0 is a fixed constant. (a)
πα΅ ππ‘
= π’2 β π’ + β.
(b)
πα΅ ππ‘
= π’3 β π’ + β.
(c)
πα΅ ππ‘
= π sin π’ β β.
(d)
πα΅ ππ‘
=
(e)
πα΅ ππ‘
= βπ’ + π’2 β 1.
(f)
πα΅ ππ‘
= βπ’2 β ππα΅ .
(g)
πα΅ ππ‘
= β + ππ’2 β π’3 .
(h)
πα΅ ππ‘
= π’(β β π’) β ππβα΅ , β > 0.
α΅ 1+α΅2
β π’β.
10. Consider $\frac{du}{dt} = u(u - 4) + h$, where $h$ is a parameter.
(a) Construct a bifurcation diagram showing all equilibria and their stability for all $h$.
(b) For the case $u|_{t=0} = 1$, use the diagram to determine $\lim_{t \to \infty} u(t)$ for each $h$.
11. Consider
πα΅ ππ‘
= βπ’2 + π’β2 β 4π’, where β > 0 is a parameter.
(a) Construct a bifurcation diagram showing all equilibria and their stability for all β. (b) For the case π’|π‘=0 = 3, use the diagram to determine limπ‘ββ π’(π‘) for each β. 12. A model for the population in a fishery is shown below, where π β₯ 0 is the population size, π‘ β₯ 0 is time, π, π > 0 are capacity and birth rate parameters, and π β₯ 0 is a harvesting rate parameter. π ππ = ππ(1 β ) β π, π|π‘=0 = π0 , π‘ β₯ 0. ππ‘ π The population will become extinct if π(π‘) β 0 as π‘ β β, or as π‘ β ππ for some finite ππ ; otherwise, the population will survive. (a) Let π = π‘/π and π’ = π/π, where π = 1/π and π = π. Show that the scaled equation becomes β.
πα΅ ππ
= π(π’, β). Identify the function π(π’, β) and the parameter
(b) Construct a bifurcation diagram for the rescaled model; only consider π’ β₯ 0 and β β₯ 0. (c) Show that there is a critical value β# such that, if β > β# , then the population will become extinct for any π’0 , and if β < β# , it will survive if π’0 is large enough. 13. A model for a solid-state laser device takes the form below, where π§ β₯ 0 is the number of laser photons emitted, π‘ β₯ 0 is time, πΈ β₯ 0 is an input energy parameter, and π, π > 0 are material parameters. ππ§ = (πΈ β π)π§ β ππ§2 , π§|π‘=0 = π§0 , π‘ β₯ 0. ππ‘ The device will produce only lamp-light if π§(π‘) β 0 as π‘ β β, and will produce sustained laser-light otherwise. (a) Let π = π‘/π and π’ = π§/π. Find the scales π and π so that the scaled equation becomes
πα΅ ππ
= (β β 1)π’ β π’2 . Identify the parameter β.
(b) Construct a bifurcation diagram for the rescaled model; only consider π’ β₯ 0 and β β₯ 0. (c) For given π, π, show that there is a critical value of the input energy πΈ, below which the device will produce only lamp-light, and above which it will produce sustained laser-light, for any π§0 > 0.
14. A model for a biochemical switch is shown below, where π€ > 0 is the concentration of the switch chemical, π‘ β₯ 0 is time, and π0 , . . . , π3 > 0 are reaction constants. π π€2 ππ€ = π 0 β π1 π€ + 2 2 , π€|π‘=0 = π€ 0 , π‘ β₯ 0. ππ‘ π3 + π€2 An equilibrium π€β is called a low state if π€β < 0.3π3 , and a high state if π€β > 0.9π3 . (a) Let π = π‘/π and π’ = π€/π. Find the scales π and π so that the scaled equation becomes
πα΅ ππ
= β β ππ’ +
α΅2 . 1+α΅2
Identify the parameters β and π.
(b) Let π = 0.54 be fixed. Construct a bifurcation diagram for the rescaled model. Considering only π’ > 0 and β > 0, show that the only stable equilibria are low and high states. (c) Show that there are two critical values βoff and βon such that, if β < βoff , then the switch will tend to a low state, and if β > βon , it will tend to a high state, for any π’0 . 15. A bead of mass π > 0, subject to gravitational acceleration π > 0, slides along a wire hoop of radius π > 0, which is rotating at angular velocity π β₯ 0. If the sliding occurs with a high friction coefficient π > 0, then a model for the bead angle π β (βπ, π] at time π‘ β₯ 0 is as follows. Ο
ππ = βππ sin π + πππ2 sin π cos π, ππ‘ π|π‘=0 = π0 , π‘ β₯ 0. π
g
Ο
r m
The bead will tend to the trivial state if π(π‘) β 0 as π‘ β β, and will tend to a suspended state otherwise. (a) Let π = π‘/π. Find the scale π so that the scaled equation becomes β sin π + β sin π cos π. Identify the parameter β.
ππ ππ
=
(b) Construct a bifurcation diagram for the rescaled model. Considering only βπ < π β€ π and β β₯ 0, show that there are either two or four equilibria for any β. (c) For given π, π, π, π, show that there is a critical value of the rotation rate π, below which the bead will tend to the trivial state, and above which it will tend to a suspended state, for any 0 < |π0 | < π. Mini-project. Similar to the case studied in Section 3.8, we consider a model for the population of plants in a simple ecosystem, of the form plants herbivores
ππ π ππ = ππ (1 β ) β , ππ‘ π 1 + ππ
π|π‘=0 = π0 ,
π‘ β₯ 0.
Here π is the number of plants, π‘ is time, π and π are constants that describe the growth rate of the plants, and π and π are constants that describe the consumption rate of the plants by other members of the ecosystem, such as herbivores. The dimensions of the variables are [π] = Plant and [π‘] = Time. We seek to understand the behavior of solutions in terms of the parameters π, π, π, π > 0 and initial population π0 β₯ 0. The system is mathematically well defined for all π β β1/π, and we focus only on physically meaningful solutions with π β₯ 0. (a) Introduce the scale transformation π = π‘/π and π’ = π/π, where π = 1/π and π = π, and show that the dimensionless version of the system takes the following form, ππ’ βπ’ = π’(1 β π’) β , π’|π=0 = π’0 , π β₯ 0. ππ 1 + ππ’ Identify the dimensionless parameters β, π and π’0 . (b) Assuming π is fixed, specifically π > 1, find all equilibrium solutions of the system in (a) and determine their stability in terms of the parameter β > 0. Illustrate the results on a bifurcation diagram. (c) Consider a scenario in which π = 4 and π’0 = 1/8 are fixed. Using your diagram from (b), determine all values of β for which the plant population would remain positive (avoid extinction) as π β β. Similarly, for the same π and π’0 , determine all values of β for which the plant population would approach zero (become extinct) as π β β. (d) Use Matlab or other similar software to numerically simulate the system in (a). Using π = 4 and a few values of β, say β = 0.6, 1.4, 1.7, produce portraits of solutions for various π’0 . Indicate the locations of all stable and unstable equilibria in the plots and confirm your results from (b). Similarly, confirm your predictions from (c).
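For part (d), one possible starting point is sketched below in Python rather than Matlab. It is only an illustration under stated assumptions: the second dimensionless parameter is called `a` in the code (a naming choice made for this sketch), the integrator is a hand-written classical Runge-Kutta scheme with a fixed step, and the step size and final time are arbitrary choices that may need adjusting.

```python
import math

def rhs(u, h, a):
    """Right-hand side of the dimensionless model du/dtau = u(1 - u) - h*u/(1 + a*u)."""
    return u * (1.0 - u) - h * u / (1.0 + a * u)

def simulate(u0, h, a, tau_end=60.0, dtau=0.01):
    """Integrate from tau = 0 to tau_end with classical RK4; returns lists of tau and u."""
    n = int(round(tau_end / dtau))
    taus, us = [0.0], [u0]
    u = u0
    for i in range(n):
        k1 = rhs(u, h, a)
        k2 = rhs(u + 0.5 * dtau * k1, h, a)
        k3 = rhs(u + 0.5 * dtau * k2, h, a)
        k4 = rhs(u + dtau * k3, h, a)
        u += dtau * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
        taus.append((i + 1) * dtau)
        us.append(u)
    return taus, us

def positive_equilibria(h, a):
    """Positive roots of (1 - u)(1 + a*u) = h, i.e. a*u^2 - (a - 1)*u + (h - 1) = 0."""
    disc = (a - 1.0) ** 2 - 4.0 * a * (h - 1.0)
    if disc < 0:
        return []
    r = math.sqrt(disc)
    roots = [((a - 1.0) - r) / (2.0 * a), ((a - 1.0) + r) / (2.0 * a)]
    return [u for u in roots if u > 0]

if __name__ == "__main__":
    a = 4.0
    for h in (0.6, 1.4, 1.7):
        eq = positive_equilibria(h, a)
        finals = [simulate(u0, h, a)[1][-1] for u0 in (0.125, 0.3, 0.6, 1.0)]
        print(f"h = {h}: positive equilibria {eq}, final values {finals}")
```

The lists returned by `simulate` can be passed to any plotting tool to produce the requested portraits; the equilibrium values from `positive_equilibria` (together with the trivial equilibrium at zero) can be drawn as horizontal lines for comparison.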
Chapter 4
Two-dimensional dynamics
Here we continue our study of dynamical systems and consider models in the form of two coupled, autonomous, first-order differential equations. In contrast to the case of one dimension, systems in two dimensions may exhibit a rich variety of monotonic, spiraling and periodic solutions. We study various qualitative properties of these solutions, and explore how such properties depend on parameters.
4.1. Preliminaries

We consider an initial-value problem for a system of first-order equations of the form

$$\frac{dx}{dt} = f(x, y, s_1, \dots, s_m), \quad x|_{t=0} = x_0, \qquad \frac{dy}{dt} = g(x, y, s_1, \dots, s_m), \quad y|_{t=0} = y_0, \qquad t \ge 0, \tag{4.1}$$
where $x, y, t$ are real variables, $x_0, y_0, s_1, \dots, s_m$ are real parameters, and $f, g$ are given functions. The system in (4.1) is called a dynamical system for $x, y$. A solution is a pair of functions $x = x(t, x_0, y_0, s_1, \dots, s_m)$ and $y = y(t, x_0, y_0, s_1, \dots, s_m)$, which are differentiable in $t$, and satisfy (4.1). For brevity, we will abbreviate the system as $\frac{dx}{dt} = f(x, y)$ and $\frac{dy}{dt} = g(x, y)$, and a solution as $x = x(t)$ and $y = y(t)$, or more simply $(x, y)(t)$. Parameters will be indicated when they are essential to a discussion. Similar to before, we seek to characterize how a solution $(x, y)(t)$ depends on the parameters. For the moment, we fix $s_1, \dots, s_m$, and only consider the dependence on $x_0, y_0$.

A solution $(x, y)(t)$ can be viewed in two ways as illustrated in Figure 4.1. In a time view, a solution is viewed as a pair of graphs in the $t, x$- and $t, y$-planes. The slopes of these two graphs satisfy $\frac{dx}{dt} = f(x, y)$ and $\frac{dy}{dt} = g(x, y)$. Alternatively, in a phase view, a solution is viewed as a moving point in the $x, y$-plane, referred to as the phase plane. The moving point traces out a curve in this plane, called an orbit or path, and the vector $\vec{\nu} = (\frac{dx}{dt}, \frac{dy}{dt}) = (f(x, y), g(x, y))$ is the velocity of the point, which depends on its position, and is tangent to the path.
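Figure 4.1. Time view (graphs of $x$ and $y$ against $t$, from $x_0, y_0$ at $t = 0$ to $x_1, y_1$ at $t = 1$) and phase view (path in the $x, y$-plane from $t = 0$ to $t = 1$, with velocity vector $\vec{\nu}$).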
4.2. Solvability theorem The question of existence and uniqueness of solutions to (4.1) is addressed in the following result, which is similar to Result 3.2.1 from the one-dimensional case. The set of all points where π(π₯, π¦) and π(π₯, π¦) are continuously differentiable is denoted by π·, assumed to be an open set in the real plane β2 . Result 4.2.1. The system in (4.1) has a unique solution (π₯, π¦)(π‘) β π· for any (π₯0 , π¦0 ) β π·. Moreover: (i)
(π₯, π¦)(π‘) exists and is in π· for some maximal interval π‘ β (πβ , π+ ), where πβ < 0 and π+ > 0 depend on (π₯0 , π¦0 ),
(ii)
if πβ or π+ is finite, then (π₯, π¦)(π‘) leaves π· at π‘ = πβ or π‘ = π+ ,
Μ (iii) if (π₯0 , π¦0 ) β (π₯0Μ , π¦0Μ ), then (π₯, π¦)(π‘) β (π₯,Μ π¦)(π‘) while both solutions exist in π·; also, paths are either disjoint or the same, and cannot cross. Thus, within the set π· where π(π₯, π¦) and π(π₯, π¦) are continuously differentiable, the system in (4.1) has a unique solution (π₯, π¦)(π‘) for any given (π₯0 , π¦0 ). As before, this solution exists on some maximal, open interval of time that contains π‘ = 0. Although we focus on π‘ β₯ 0, solutions are also defined for π‘ β€ 0, whether of interest or not. Regardless of whether π· = β2 or π· β β2 , the maximal existence interval for (π₯, π¦)(π‘) depends on (π₯0 , π¦0 ), and is the largest for which the solution is in π·. Thus, if πβ or π+ is finite, then (π₯, π¦)(π‘) must leave π· at that time. Outside of the set π·, solutions may not be unique or may not exist, or the dynamical system itself may not be defined. By the maximal orbit or path of a solution we mean that traced for all time in the maximal existence interval. Part (iii) of Result 4.2.1 provides important qualitative information. For any two different initial conditions (π₯0 , π¦0 ) and (π₯0Μ , π¦0Μ ), the corresponding solutions (π₯, π¦)(π‘) Μ and (π₯,Μ π¦)(π‘) will never occupy the same point at the same time in the phase view. Also, the maximal paths traced out by these solutions will either be disjoint or the same, and cannot cross as illustrated in Figure 4.2. A crossing cannot occur under any circumstances, not even if the crossing point were visited at different times. Note that, if two paths have a common point (π₯π , π¦π ), then they must also have a common velocity vector π(π₯ β π , π¦π ) at that point, but a crossing would imply two different velocity vectors. Moreover, two paths with a common point must be identical, by uniqueness of solutions, with the common point serving as initial condition. 
Thus two maximal paths may either have no points in common, in which case they are disjoint, or all points in common, in which case they are the same.
Figure 4.2. Disjoint paths (possible); same path (possible); crossed paths (impossible).
4.3. Direction field, nullclines

The velocity vector informs us of the direction and orientation of the solution curve through each point of the phase plane. This fact can be used to get a qualitative portrait of solution curves.

Definition 4.3.1. By the direction field for (4.1) we mean the collection of vectors $\vec{\nu}(x, y) = (f(x, y), g(x, y))$ over all points $(x, y)$.

For practical reasons, the vectors $\vec{\nu}(x, y)$ can only be visualized at a finite number of points as illustrated in Figure 4.3. A sketch of the vectors at several points can provide a qualitative portrait of solution curves, but many points may be required before an accurate portrait emerges. Alternative, more qualitative information about solutions can be obtained by simply considering how the signs of $f(x, y)$ and $g(x, y)$ vary over the phase plane. This motivates the following definition.

Figure 4.3. Direction field vectors at a finite number of points in the $x, y$-plane.

Definition 4.3.2. The sets of points where $f(x, y) = 0$ and $g(x, y) = 0$ are called the $x$- and $y$-nullclines, respectively. The sets of points where $f(x, y) > 0$ $(< 0)$ and $g(x, y) > 0$ $(< 0)$ are called the $x$- and $y$-increasing (decreasing) regions, respectively.

Thus at each point on an $x$-nullcline we have $f(x, y) = 0$ and the direction field has no horizontal component. Moreover, at each point in an $x$-increasing region we have $f(x, y) > 0$ and the direction field points rightward; in a decreasing region we have $f(x, y) < 0$ and the direction is leftward. Similarly, at each point on a $y$-nullcline we have $g(x, y) = 0$ and the direction field has no vertical component. And at each point in a $y$-increasing region we have $g(x, y) > 0$ and the direction field points upward; and in a decreasing region we have $g(x, y) < 0$ and the direction is downward. Note that the nullclines, increasing regions, and decreasing regions are disjoint sets that partition the entire phase plane for each of the $x$- and $y$-variables; however, there is no restriction on the overlaps of these sets between the two variables.

Example 4.3.1. Consider $\frac{dx}{dt} = f(x, y) = x - y^2$ and $\frac{dy}{dt} = g(x, y) = x^2 - 1$. The $x$-nullclines are defined by $f = 0$, which yields the single curve $x = y^2$. The $x$-increasing region is $f > 0$, which corresponds to $x > y^2$, and the decreasing region is $f < 0$, which corresponds to $x < y^2$. The $y$-nullclines are defined by $g = 0$, which yields the two curves $x = \pm 1$. The $y$-increasing region is $g > 0$, which corresponds to $x^2 > 1$, and the decreasing region is $g < 0$, which corresponds to $x^2 < 1$. Figure 4.4 illustrates the nullclines and regions for each separate variable, and the solution curve directions obtained by superimposing the information for each variable.

Figure 4.4. Nullclines $f = 0$ and $g = 0$ with the sign regions of $f$ and $g$, and the resulting solution curve directions.
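The sign analysis of Example 4.3.1 can also be checked point by point. The following is a minimal sketch in Python; the helper names `sign_label` and `direction` and the sample points are choices made here for illustration and are not part of the text.

```python
def f(x, y):
    """dx/dt for Example 4.3.1."""
    return x - y**2

def g(x, y):
    """dy/dt for Example 4.3.1."""
    return x**2 - 1

def sign_label(value, positive, negative, zero):
    """Map the sign of a number to one of three words."""
    if value > 0:
        return positive
    if value < 0:
        return negative
    return zero

def direction(x, y):
    """Horizontal and vertical sense of the direction field at (x, y)."""
    horizontal = sign_label(f(x, y), "right", "left", "none (x-nullcline)")
    vertical = sign_label(g(x, y), "up", "down", "none (y-nullcline)")
    return horizontal, vertical

if __name__ == "__main__":
    samples = [(2.0, 0.0), (0.0, 0.0), (0.0, 1.0), (-2.0, 0.0), (4.0, 2.0), (1.0, 0.5)]
    for point in samples:
        print(point, direction(*point))
```

Evaluating `direction` on a grid of points gives a coarse, text-only version of the superimposed sign information described in the example.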
4.4. Path equation, first integrals

Due to the fact that (4.1) is an autonomous system of first-order equations, we can eliminate the time variable to obtain a purely geometric description of solution curves in the phase plane.

Definition 4.4.1. By the path equation associated with (4.1) we mean the differential equation

$$\frac{dy}{dx} = \frac{g(x, y)}{f(x, y)}, \qquad y = y_0 \ \text{when} \ x = x_0. \tag{4.2}$$

Provided that $f(x, y)$ and $g(x, y)$ are continuously differentiable as in Result 4.2.1, and do not both vanish at $(x_0, y_0)$, the path equation or its reciprocal has a unique solution curve $y = y(x)$ or $x = x(y)$, which coincides with the solution curve of the original equation (4.1) through $(x_0, y_0)$. Note that a graph description $y = y(x)$ or $x = x(y)$ of a solution curve is more restrictive than a parametric description $(x, y)(t)$. Although the path equation only provides information on the shape of a solution curve, this information can be combined with knowledge of the direction field to orient the curve with respect to time.
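As a brief sketch of where the path equation comes from, consider a stretch of a solution curve on which $f(x, y) \neq 0$, so that $x(t)$ is strictly monotonic there and $t$ can be regarded as a function of $x$. Under that assumption the chain rule gives the first relation below; the reciprocal form follows in the same way on a stretch where $g(x, y) \neq 0$.

```latex
\frac{dy}{dx} = \frac{dy/dt}{dx/dt} = \frac{g(x, y)}{f(x, y)}
\quad (f \neq 0),
\qquad
\frac{dx}{dy} = \frac{dx/dt}{dy/dt} = \frac{f(x, y)}{g(x, y)}
\quad (g \neq 0).
```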
Definition 4.4.2. A function πΈ(π₯, π¦) is called a first integral of (4.1) if the general solution of the path equation can be written as πΈ(π₯, π¦) = πΆ, for some arbitrary constant πΆ. When the path equation is written in the form (4.2), a first integral will exist when the quotient π(π₯, π¦)/π(π₯, π¦) is separable in the variables π₯ and π¦. In this case, the equation can be solved by separating variables and integrating, which gives π»(π¦) = π½(π₯)+πΆ, where π»(π¦) and π½(π₯) are functions obtained from the integration. This general solution can then be written as πΈ(π₯, π¦) = πΆ, where πΈ(π₯, π¦) = π»(π¦) β π½(π₯). Alternatively, when ππ¦ the path equation is written as π(π₯, π¦) ππ₯ β π(π₯, π¦) = 0, a first integral will exist when this equation is exact in the sense that ππ₯ (π₯, π¦) = βππ¦ (π₯, π¦). In this case, a first integral will satisfy πΈπ₯ (π₯, π¦) = βπ(π₯, π¦) and πΈπ¦ (π₯, π¦) = π(π₯, π¦). More generally, a first integral will exist when (ππ)π₯ (π₯, π¦) = β(ππ)π¦ (π₯, π¦) for some integrating factor π(π₯, π¦), in which case a first integral will satisfy πΈπ₯ (π₯, π¦) = β(ππ)(π₯, π¦) and πΈπ¦ (π₯, π¦) = (ππ)(π₯, π¦). Significant qualitative and quantitative information about the solution curves of (4.1) can be obtained from a first integral. If a first integral πΈ(π₯, π¦) exists in some region of the phase plane, then every solution curve of (4.1) in this region will be contained within a level set of πΈ(π₯, π¦). Thus a contour map of πΈ(π₯, π¦) provides a geometric portrait of solution curves, but with no explicit time information. For any given πΆ0 , the level set πΈ(π₯, π¦) = πΆ0 contains the path of all solution curves with initial points satisfying πΈ(π₯0 , π¦0 ) = πΆ0 . In some cases, the path of a solution curve will not cover the entire set πΈ(π₯, π¦) = πΆ0 , but only a subset or component. ππ¦
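One way to see why solution curves stay within level sets is sketched here for the exact case, where a first integral satisfies $E_x = -g$ and $E_y = f$. Differentiating $E$ along a solution $(x, y)(t)$ of (4.1) with the chain rule gives

```latex
\frac{d}{dt} E\bigl(x(t), y(t)\bigr)
  = E_x \frac{dx}{dt} + E_y \frac{dy}{dt}
  = (-g)\, f + f\, g
  = 0 ,
```

so $E$ is constant in time along the solution. When an integrating factor is needed, the same cancellation occurs because both terms carry that common factor.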
Example 4.4.1. Consider $\frac{dx}{dt} = 2y^3$ and $\frac{dy}{dt} = x - 2$. For this system the path equation is $\frac{dy}{dx} = \frac{x - 2}{2y^3}$. This can be solved using separation of variables to get $\frac{1}{2} y^4 = \frac{1}{2}(x - 2)^2 + C$, where $C$ is an arbitrary constant. After rearranging and introducing $B = -2C$, we obtain $(x - 2)^2 - y^4 = B$. Thus a first integral is $E(x, y) = (x - 2)^2 - y^4$, and every solution $(x, y)(t)$ of the system is contained within a level set $E(x, y) = B$. Figure 4.5(a) shows some level sets, corresponding to $B = -1, 0, 1, 4$. These level sets can be viewed as roads upon which solutions can travel. Figure 4.5(b) shows the nullclines and solution (velocity) directions. These are the directions in which solutions move along the roads as time increases. For comparison, Figure 4.5(c) shows some actual solution curves. Note that the level set with $B = 0$ (purple) is relatively complicated and consists of five components: four branches corresponding to $x < 2$ or $x > 2$ and $y < 0$ or $y > 0$, and one point corresponding to $(x, y) = (2, 0)$. Because of the constant solution $(x, y)(t) \equiv (2, 0)$, a solution curve starting in any one of these components remains there for all time.

Figure 4.5. (a) Level sets of $E(x, y)$; (b) nullclines $f = 0$, $g = 0$ and solution directions; (c) solution curves.
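The claim that solutions of Example 4.4.1 remain on level sets of $E(x, y) = (x - 2)^2 - y^4$ can be checked numerically. The sketch below is an illustration only: it assumes a hand-written classical Runge-Kutta integrator with a fixed step, and the starting points, step size, and short time horizon are choices made here (the horizon is kept short because solutions of this system can grow rapidly).

```python
def f(x, y):
    """dx/dt in Example 4.4.1."""
    return 2.0 * y**3

def g(x, y):
    """dy/dt in Example 4.4.1."""
    return x - 2.0

def E(x, y):
    """First integral found in the example."""
    return (x - 2.0) ** 2 - y**4

def rk4_step(x, y, dt):
    """One classical Runge-Kutta step for the system (f, g)."""
    k1x, k1y = f(x, y), g(x, y)
    x2, y2 = x + 0.5 * dt * k1x, y + 0.5 * dt * k1y
    k2x, k2y = f(x2, y2), g(x2, y2)
    x3, y3 = x + 0.5 * dt * k2x, y + 0.5 * dt * k2y
    k3x, k3y = f(x3, y3), g(x3, y3)
    x4, y4 = x + dt * k3x, y + dt * k3y
    k4x, k4y = f(x4, y4), g(x4, y4)
    return (x + dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0,
            y + dt * (k1y + 2 * k2y + 2 * k3y + k4y) / 6.0)

def max_drift(x0, y0, t_end=0.5, dt=1e-3):
    """Largest change in E along the numerical solution on 0 <= t <= t_end."""
    x, y = x0, y0
    e0 = E(x0, y0)
    drift = 0.0
    for _ in range(int(round(t_end / dt))):
        x, y = rk4_step(x, y, dt)
        drift = max(drift, abs(E(x, y) - e0))
    return drift

if __name__ == "__main__":
    for start in [(3.0, 0.0), (2.0, 1.0), (0.0, 0.5), (2.0, 0.0)]:
        print(start, "B =", E(*start), "max drift in E:", max_drift(*start))
```

A drift that stays near rounding level is consistent with each computed solution travelling along a single level set $E(x, y) = B$; it is a numerical check, not a proof.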
4.5. Equilibria

As in the one-dimensional case, we consider the special class of solutions to (4.1) that are constant in time.

Definition 4.5.1. A solution of (4.1) is called an equilibrium or steady state if it is constant in time, that is

$$(x, y)(t) \equiv (x_*, y_*) \quad \text{for all} \quad t \ge 0, \tag{4.3}$$

for some point $(x_*, y_*) \in D$.

Since $\frac{dx}{dt} = f(x, y)$ and $\frac{dy}{dt} = g(x, y)$, it follows that $(x_*, y_*)$ is an equilibrium if and only if $f(x_*, y_*) = 0$ and $g(x_*, y_*) = 0$. Thus equilibrium solutions must be simultaneous roots of the functions $f(x, y)$ and $g(x, y)$; equivalently, they correspond to intersection points of the $x$- and $y$-nullclines. Note that the curve $(x, y)(t) \equiv (x_*, y_*)$ is a single, fixed point in the phase plane, which no other solution curve can touch or cross. In the one-dimensional case, equilibria provided barriers which trapped other solutions in regions of increase or decrease, which led to a monotonicity result. In the two-dimensional case, equilibria no longer provide barriers, but instead provide organizing points on the boundary between regions of increase and decrease of each variable, about which solutions may exhibit a variety of growing, decaying, spiraling and periodic behaviors. Similar to the one-dimensional case, we next introduce a classification of equilibria for two-dimensional systems.

Definition 4.5.2. Let $(x_*, y_*)$ be an equilibrium of (4.1), and for any $\epsilon > 0$ let $R_{*,\epsilon}$ denote the open rectangle $(x_* - \epsilon, x_* + \epsilon) \times (y_* - \epsilon, y_* + \epsilon)$.

(1) $(x_*, y_*)$ is called asymptotically stable if for every $\epsilon > 0$ there is a $\delta > 0$ such that, if $(x_0, y_0) \in R_{*,\delta}$ then $(x, y)(t) \in R_{*,\epsilon}$ for all $t \ge 0$, and $(x, y)(t) \to (x_*, y_*)$ as $t \to \infty$ for every $(x_0, y_0) \in R_{*,\delta}$; see Figure 4.6.

Figure 4.6. Time view and phase view of an asymptotically stable equilibrium, with bands of width $2\epsilon$ and $2\delta$ about $x_*$ and $y_*$.
(2) $(x_*, y_*)$ is called neutrally stable if for every $\epsilon > 0$ there is a $\delta > 0$ such that, if $(x_0, y_0) \in R_{*,\delta}$ then $(x, y)(t) \in R_{*,\epsilon}$ for all $t \ge 0$, and $(x, y)(t) \not\to (x_*, y_*)$ as $t \to \infty$ for some $(x_0, y_0) \in R_{*,\delta}$; see Figure 4.7.

(3) $(x_*, y_*)$ is called unstable if it is not asymptotically or neutrally stable; see Figure 4.8.

Figure 4.7. Time view and phase view of a neutrally stable equilibrium, with bands of width $2\epsilon$ and $2\delta$ about $x_*$ and $y_*$.

Figure 4.8. Phase views of unstable equilibria in the $x, y$-plane.
Thus every equilibrium solution $(x_*, y_*)$ can be classified as one of three types: asymptotically stable, neutrally stable, or unstable. Similar to the one-dimensional case, if we denote the solution of (4.1) by $(x, y)(t, x_0, y_0)$, then stability is a form of continuity of this function with respect to $(x_0, y_0)$. Specifically, asymptotic and neutral stability imply that $(x, y)(t, x_0, y_0)$ and $(x, y)(t, x_*, y_*)$ will be arbitrarily close for all time $t \ge 0$ provided that $(x_0, y_0)$ and $(x_*, y_*)$ are sufficiently close, where $(x, y)(t, x_*, y_*) \equiv (x_*, y_*)$. And instability implies that $(x, y)(t, x_0, y_0)$ and $(x, y)(t, x_*, y_*)$ will not be arbitrarily close for all time $t \ge 0$ for some $(x_0, y_0)$, no matter how close it may be to $(x_*, y_*)$. Figures 4.6–4.8 provide a small sampling of the possible behavior of solution curves and are for illustration only; there are many other possibilities depending on whether an equilibrium is isolated or not, and whether it is nondegenerate in an appropriate sense.

Note that the classification of equilibria in the two-dimensional case is not as straightforward as the one-dimensional case. Inspection of the nullclines and direction field in the regions surrounding an equilibrium can give stability information in some cases. A more systematic method for classification will require a number of preliminary results about the linear case, which will be outlined later.

Similar to the one-dimensional case, an asymptotically stable equilibrium $(x_*, y_*)$ can be interpreted as a preferred state of the system. All solutions that start sufficiently close to such an equilibrium will remain close, and will be pulled into the equilibrium, as time goes on. However, in contrast to the one-dimensional case, the approach to the equilibrium may be non-monotonic in one or both of the variables $x(t)$ and $y(t)$. As before, an unstable equilibrium $(x_*, y_*)$ can be interpreted as an unpreferred state. Some solutions that start arbitrarily close, but not at such an equilibrium, will be pushed away over the course of time, but now in a possibly non-monotonic way. And a neutrally stable equilibrium $(x_*, y_*)$ can again be interpreted as a borderline case. All solutions that start sufficiently close to such an equilibrium will remain close, but some will not be pulled into the equilibrium.
Example 4.5.1. Consider the system in Example 4.3.1, namely $\frac{dx}{dt} = x - y^2$ and $\frac{dy}{dt} = x^2 - 1$. The equilibrium points of this system satisfy $x - y^2 = 0$ and $x^2 - 1 = 0$. The first equation implies $x = y^2$ and the second implies $x = \pm 1$. In the case $x = 1$, we obtain $y = \pm 1$, and in the case $x = -1$, there is no real solution for $y$. Hence this system has two equilibrium points $(x_*, y_*) = (1, 1)$ and $(1, -1)$. Notice that these are precisely the intersection points of the $x$- and $y$-nullclines as considered earlier, which are shown in the left part of Figure 4.9. A sketch of the direction field in the regions surrounding $(x_*, y_*) = (1, -1)$ indicates that this equilibrium is unstable, since solutions in the immediate vicinity of the equilibrium are pushed away in some areas. In contrast, a sketch of the direction field in the regions surrounding $(x_*, y_*) = (1, 1)$ is not as revealing; it indicates a spiraling or periodic type of behavior, but the information is not precise enough to determine stability. An accurate portrait of solution curves, as illustrated in the right part of Figure 4.9, shows that $(1, 1)$ is actually unstable.

Figure 4.9. Left: nullclines and direction field with the equilibria $(x_*, y_*) = (1, 1)$ and $(1, -1)$. Right: solution curves.
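The conclusions of Example 4.5.1 can be probed numerically. The sketch below is a hedged illustration, not a proof of instability: it assumes a hand-written classical Runge-Kutta integrator, and the function name `escape_time`, the radius 0.1, the offsets, and the step size are all choices made here. It reports the first time a solution started slightly off an equilibrium is farther than the chosen radius from it.

```python
import math

def f(x, y):
    """dx/dt for Example 4.5.1."""
    return x - y**2

def g(x, y):
    """dy/dt for Example 4.5.1."""
    return x**2 - 1

def rk4_step(x, y, dt):
    """One classical Runge-Kutta step for the system (f, g)."""
    k1x, k1y = f(x, y), g(x, y)
    x2, y2 = x + 0.5 * dt * k1x, y + 0.5 * dt * k1y
    k2x, k2y = f(x2, y2), g(x2, y2)
    x3, y3 = x + 0.5 * dt * k2x, y + 0.5 * dt * k2y
    k3x, k3y = f(x3, y3), g(x3, y3)
    x4, y4 = x + dt * k3x, y + dt * k3y
    k4x, k4y = f(x4, y4), g(x4, y4)
    return (x + dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0,
            y + dt * (k1y + 2 * k2y + 2 * k3y + k4y) / 6.0)

def escape_time(x0, y0, x_eq, y_eq, radius=0.1, t_end=40.0, dt=1e-3):
    """First time the solution is farther than `radius` from (x_eq, y_eq), or None."""
    x, y, t = x0, y0, 0.0
    while t < t_end:
        if math.hypot(x - x_eq, y - y_eq) > radius:
            return t
        x, y = rk4_step(x, y, dt)
        t += dt
    return None

if __name__ == "__main__":
    for x_eq, y_eq in [(1.0, 1.0), (1.0, -1.0)]:
        print("f, g at equilibrium:", f(x_eq, y_eq), g(x_eq, y_eq))
        for offset in (1e-2, 1e-3, 1e-4):
            t_out = escape_time(x_eq + offset, y_eq, x_eq, y_eq)
            print(f"  offset {offset}: leaves radius 0.1 at t = {t_out}")
```

If ever smaller offsets still lead to an escape from the fixed radius, that behavior is consistent with the equilibrium being unstable in the sense of Definition 4.5.2.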
4.6. Periodic orbits

Other special types of solutions besides equilibria are also of interest. For systems in two and higher dimensions, one such type of solution is a periodic orbit as defined next. Note that such solutions are not possible for dynamical systems in one dimension due to monotonicity.

Definition 4.6.1. A solution of (4.1) is called a periodic or closed orbit if it is nonconstant and there is a number $T > 0$ such that

$$(x, y)(t + T) = (x, y)(t) \quad \text{for all} \quad t \ge 0. \tag{4.4}$$

The smallest such $T$ is called the period. We assume that $(x, y)(t) \in D$ for all $t$.

Thus a periodic orbit is a solution that repeats. Its path in the phase plane is a closed curve that is repeatedly traced over in time, in one of two possible orientations, as illustrated in Figure 4.10. If $(x_0, y_0) \ne (\tilde{x}_0, \tilde{y}_0)$ are any two points on this closed curve, then the corresponding solutions $(x, y)(t) \ne (\tilde{x}, \tilde{y})(t)$ will remain on the curve, be periodic with the same period, and chase each other around the curve for all time. A periodic orbit cannot intersect with any other solution curve in the phase plane, and cannot intersect with itself to form a figure eight or anything similar.

A stability classification can be defined for periodic orbits. Rather than state a precise, technical form of the definitions, we simply note that a periodic orbit can be
asymptotically stable, neutrally stable, or unstable similar to equilibria as illustrated in Figure 4.11. A periodic orbit is called asymptotically stable if all solution curves that start sufficiently close will remain close, and will spiral alongside and be pulled closer as time goes on, as illustrated in panel (a). In this case, the periodic orbit is called a limit cycle. A periodic orbit is called neutrally stable if all solution curves that start sufficiently close will remain close, and will spiral alongside as time goes on, but some will not get pulled closer, as illustrated in panel (b). And a periodic orbit is called unstable if it is neither asymptotically nor neutrally stable, as illustrated in panel (c).

Figure 4.10. A periodic orbit through $(x_0, y_0)$ in the $x, y$-plane.
Figure 4.11. Periodic orbits: (a) asymptotically stable, (b) neutrally stable, (c) unstable.

Example 4.6.1. Consider $\frac{dx}{dt} = y$ and $\frac{dy}{dt} = -\omega^2 x$, with initial conditions $x|_{t=0} = x_0$ and $y|_{t=0} = y_0$, where $\omega > 0$ is a given constant. Due to its simple form, this system can be solved by combining the pair of first-order equations into a single second-order equation, namely $\frac{d^2 x}{dt^2} = \frac{dy}{dt} = -\omega^2 x$, or equivalently $\frac{d^2 x}{dt^2} + \omega^2 x = 0$. Using standard methods, the general solution of this equation is $x(t) = C_1 \cos(\omega t) + C_2 \sin(\omega t)$, where $C_1$ and $C_2$ are arbitrary constants. The equation $y = \frac{dx}{dt}$ then implies $y(t) = -\omega C_1 \sin(\omega t) + \omega C_2 \cos(\omega t)$. Using the given conditions at $t = 0$, we find that $C_1 = x_0$ and $C_2 = y_0/\omega$. Hence the solution curves of the system are given by

$$x(t) = x_0 \cos(\omega t) + \frac{y_0}{\omega} \sin(\omega t), \qquad y(t) = y_0 \cos(\omega t) - \omega x_0 \sin(\omega t). \tag{4.5}$$

For $(x_0, y_0) = (0, 0)$ we obtain the equilibrium solution $(x, y)(t) \equiv (0, 0)$, and for any $(x_0, y_0) \ne (0, 0)$, we obtain a periodic solution $(x, y)(t)$ of period $T = \frac{2\pi}{\omega}$. Thus every nonconstant solution of this system is periodic, of the neutrally stable type, and the equilibrium at the origin is also neutrally stable. An examination of the nullclines and direction field indicates that the periodic solution curves are oriented clockwise in time. Solutions for various $(x_0, y_0) \ne (0, 0)$ are illustrated in Figure 4.12 for the case $\omega = \frac{1}{2}$. Equivalently, a first integral for this system is $E(x, y) = \omega^2 x^2 + y^2$, and every solution curve is contained within a level set $E(x, y) = C \ge 0$. All level sets with $C > 0$ are ellipses about the origin, and the level set with $C = 0$ is the single point at the origin.

Figure 4.12. Nullclines $f = 0$ and $g = 0$, and solution curves in the $x, y$-plane.
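The closed-form solution of Example 4.6.1 lends itself to a quick numerical check. In the sketch below the positive constant of the example is called `omega`, and the helper names, the finite-difference step, and the sample values are assumptions made for illustration. The code checks that the formulas satisfy both differential equations, repeat after one period, keep the first integral constant, and turn clockwise.

```python
import math

def solution(t, x0, y0, omega):
    """Closed-form solution (4.5) of dx/dt = y, dy/dt = -omega**2 * x."""
    x = x0 * math.cos(omega * t) + (y0 / omega) * math.sin(omega * t)
    y = y0 * math.cos(omega * t) - omega * x0 * math.sin(omega * t)
    return x, y

def residual(t, x0, y0, omega, h=1e-5):
    """Central-difference check of both differential equations at time t."""
    xp, yp = solution(t + h, x0, y0, omega)
    xm, ym = solution(t - h, x0, y0, omega)
    x, y = solution(t, x0, y0, omega)
    dxdt = (xp - xm) / (2 * h)
    dydt = (yp - ym) / (2 * h)
    return abs(dxdt - y), abs(dydt + omega**2 * x)

def first_integral(x, y, omega):
    """E(x, y) = omega^2 x^2 + y^2, constant along solutions."""
    return omega**2 * x**2 + y**2

def rotation_sense(x, y, omega):
    """x*dy/dt - y*dx/dt; a negative value means clockwise motion about the origin."""
    dxdt, dydt = y, -omega**2 * x
    return x * dydt - y * dxdt

if __name__ == "__main__":
    omega, x0, y0 = 0.5, 1.0, 0.2
    period = 2 * math.pi / omega
    print("residuals at t = 1.3:", residual(1.3, x0, y0, omega))
    print("state at t = 1.3:", solution(1.3, x0, y0, omega))
    print("state one period later:", solution(1.3 + period, x0, y0, omega))
    print("rotation sense at (1, 0):", rotation_sense(1.0, 0.0, omega))
```

The quantity in `rotation_sense` equals $-(\omega^2 x^2 + y^2)$, which is negative away from the origin, in agreement with the clockwise orientation stated in the example.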
4.7. Linear systems

Here we study linear dynamical systems in two dimensions. Such systems arise as models in simple contexts, and more importantly, they will provide the foundation for understanding the behavior of more general nonlinear systems considered later.

Definition 4.7.1. A dynamical system for variables $x, y$ is called linear if it has the form

$$\frac{dx}{dt} = ax + by, \quad x|_{t=0} = x_0, \qquad \frac{dy}{dt} = cx + dy, \quad y|_{t=0} = y_0, \qquad t \ge 0, \tag{4.6}$$

where $a, b, c, d$ are constants. Equivalently, in matrix notation, the system is

$$\frac{dv}{dt} = Av, \quad v|_{t=0} = v_0, \quad t \ge 0, \tag{4.7}$$

where $v = \begin{pmatrix} x \\ y \end{pmatrix}$ and $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$. The system is called nondegenerate if $\det A \ne 0$, and degenerate otherwise.

Equilibria. The equilibrium solutions of (4.7) have the form $v(t) \equiv v_*$, where the constant vector $v_*$ must satisfy $A v_* = 0$. In the nondegenerate case, this equation will have no free variables, and the only solution is $v_* = 0$, which in components is $(x_*, y_*) = (0, 0)$. Alternatively, in the degenerate case, the equation $A v_* = 0$ will have some free variables, and there will be infinitely many solutions. For simplicity, we focus on the nondegenerate case and seek to understand the behavior of solutions of (4.7) around the isolated equilibrium at the origin. Degenerate cases require a different treatment and will be considered as the need arises.

Sign regions. Since $f(x, y) = ax + by$, the nullcline curve $f = 0$ will be a line through the origin, provided that $a$ and $b$ are not both zero. Similarly, since $g(x, y) = cx + dy$, the nullcline curve $g = 0$ will also be a line through the origin, provided that $c$ and $d$ are not both zero. In the nondegenerate case, these two lines are nonparallel, and they partition the plane into four distinct sign regions around the isolated equilibrium $(0, 0)$. These sign regions may be enough to understand the behavior of solutions around the equilibrium, or they may be inconclusive. Note that a geometrical characterization of a nondegenerate equilibrium is that it be surrounded by exactly
four sign regions, all of which are distinct. Any equilibrium which is surrounded by a different number of regions, or which has nondistinct regions, must be degenerate. Due to the limited information provided by sign regions, we next pursue a detailed characterization of behavior provided by the general solution of the system, which is possible due to linearity.

General solution. Following the standard theory for linear, constant-coefficient, ordinary differential equations, the general solution of (4.7) can be motivated by considering a solution of the form $v(t) = e^{\lambda t} \hat{u}$, where $\lambda$ is an arbitrary number and $\hat{u}$ is an arbitrary vector. Substituting this function into the equation, using the fact that $e^{\lambda t}$ never vanishes, we get

$$\frac{d}{dt}\bigl(e^{\lambda t} \hat{u}\bigr) = A\bigl(e^{\lambda t} \hat{u}\bigr) \quad \Rightarrow \quad e^{\lambda t} \lambda \hat{u} = e^{\lambda t} A \hat{u} \quad \Rightarrow \quad A \hat{u} = \lambda \hat{u}. \tag{4.8}$$

Hence $v(t) = e^{\lambda t} \hat{u}$ is a solution of the differential equation if and only if $\lambda$ and $\hat{u}$ satisfy the algebraic equation $A \hat{u} = \lambda \hat{u}$. From this it follows that the general solution of (4.7) is determined by the eigenvalues and eigenvectors of $A$, which may be real or complex. Note that the nondegeneracy condition $\det A \ne 0$ implies $\lambda \ne 0$, and in all cases we assume $\hat{u} \ne 0$. The specific form of the general solution depends on the details of $\lambda$ and $\hat{u}$.

Case 1. If $A$ has real, distinct eigenvalues $\lambda_1 \ne \lambda_2$, then it necessarily has two independent eigenvectors $\hat{u}_1$ and $\hat{u}_2$, and the general solution of (4.7) takes the form

$$v(t) = C_1 e^{\lambda_1 t} \hat{u}_1 + C_2 e^{\lambda_2 t} \hat{u}_2, \tag{4.9}$$

where $C_1$ and $C_2$ are arbitrary constants. From this expression we deduce the following phase diagrams, which illustrate the behavior of solution curves around the equilibrium $v_* = 0$.

1.1. If $\lambda_1 < 0$ and $\lambda_2 < 0$, then the equilibrium $v_* = 0$ is asymptotically stable, and is called a stable node or attractor. The behavior of solution curves in the phase plane is as follows; see Figure 4.13(a). Let $L_1$ and $L_2$ be lines through the origin parallel to $\hat{u}_1$ and $\hat{u}_2$. Note that all solutions with $C_2 = 0$ are on the line $L_1$. Any solution curve that starts on $L_1$, on either side of the origin, will remain on $L_1$ and be pulled toward the origin since $e^{\lambda_1 t} \to 0$ as $t \to \infty$. Similarly, note that all solutions with $C_1 = 0$ are on the line $L_2$. Any solution curve that starts on $L_2$, on either side of the origin, will remain on $L_2$ and be pulled toward the origin since $e^{\lambda_2 t} \to 0$ as $t \to \infty$. Since solution curves cannot cross, the lines $L_1$ and $L_2$ divide the phase plane into four wedge regions, in which all solutions with $C_1 \ne 0$ and $C_2 \ne 0$ are trapped. All such solutions also get pulled to the origin as time goes on, but along paths that are generally curved and not straight. Note that the lines and wedge regions described here are associated with the general solution, and are generally different from the nullclines and sign regions for the system.
Figure 4.13. Phase diagrams with the lines $L_1$ and $L_2$: (a) stable node, (b) saddle, (c) unstable node.
1.2. If $\lambda_1 > 0$ and $\lambda_2 < 0$, then the equilibrium $v_* = 0$ is unstable, and is called a saddle or hyperbolic point. The behavior of solution curves in the phase plane is as follows; see Figure 4.13(b). Let $L_1$ and $L_2$ again be lines through the origin parallel to $\hat{u}_1$ and $\hat{u}_2$. As before, any solution that starts on $L_2$, on either side of the origin, will remain on $L_2$ and be pulled toward the origin since $e^{\lambda_2 t} \to 0$ as $t \to \infty$. In contrast, any solution that starts on $L_1$, on either side of the origin, will remain on $L_1$ but be pushed away from the origin since $e^{\lambda_1 t} \to \infty$ as $t \to \infty$. The lines $L_1$ and $L_2$ again divide the phase plane into four wedge regions in which all solutions with $C_1 \ne 0$ and $C_2 \ne 0$ are trapped. Since $e^{\lambda_1 t} \to \infty$ and $e^{\lambda_2 t} \to 0$, all such solutions must have a slant asymptote along the line $L_1$, and approach this line in one of four ways depending on the signs of $C_1$ and $C_2$, which determine the wedge region in which the solution lies.

1.3. If $\lambda_1 > 0$ and $\lambda_2 > 0$, then the equilibrium $v_* = 0$ is unstable, and is called an unstable node or repeller. The behavior of solution curves in the phase plane is analogous to the stable case, with lines $L_1$ and $L_2$ defined as before, but now all directions are reversed and all solutions are pushed away from the origin and grow unbounded in time; see Figure 4.13(c).

Case 2. If $A$ has a real, repeated (double) eigenvalue $\lambda$, then it may have two independent eigenvectors $\hat{u}_1$ and $\hat{u}_2$, or only one independent eigenvector $\hat{u}$. The form of the general solution of (4.7) depends on these two possibilities.

2.1. If there are two independent eigenvectors, then the general solution is

$$v(t) = C_1 e^{\lambda t} \hat{u}_1 + C_2 e^{\lambda t} \hat{u}_2, \tag{4.10}$$

where $C_1$ and $C_2$ are arbitrary constants. The equilibrium $v_* = 0$ is stable if $\lambda < 0$, and unstable if $\lambda > 0$, and as before is referred to as a node. The behavior of solution curves in the phase plane is similar to before, but now all solution curves are straight lines due to the fact that the two exponential factors are identical; see Figure 4.14(a) and (b).
Figure 4.14. Phase diagrams for a repeated eigenvalue, with panels for $\lambda > 0$ and $\lambda < 0$, lines $L_1$, $L_2$, and axes $x$, $y$.
2.2. If there is only one independent eigenvector $\hat{u}$, then the general solution (4.11) contains an additional component proportional to $t e^{\lambda t} \hat{u}$, with arbitrary constants $C_1$ and $C_2$. The equilibrium $v_* = 0$ is again stable if $\lambda < 0$, and unstable if $\lambda > 0$, but is now referred to as an improper node. The behavior of solution curves in the phase plane is as follows; see Figure 4.14(c) and (d). Let $L$ be the line through the origin parallel to $\hat{u}$. Note that all solutions with $C_2 = 0$ are on this line. Any solution that starts on $L$, on either side of the origin, remains on $L$ and is pulled toward or pushed away from the origin depending on the sign of $\lambda$. Since solution curves cannot cross, the line $L$ divides the plane into two regions, in which all solutions with $C_2 \neq 0$ are trapped. When $\lambda < 0$, all of these solutions have a slant asymptote with the line $L$ far away from the origin as $t \to -\infty$, and become tangent to $L$ at the origin as $t \to \infty$, since the component $t e^{\lambda t} \hat{u}$ dominates. When $\lambda > 0$, the diagram is similar, but all directions are reversed.
Case 3. If $A$ has complex eigenvalues $\lambda_+ = \alpha + i\beta$ and $\lambda_- = \alpha - i\beta$, where $i$ is the imaginary unit and $\beta \neq 0$, then it necessarily has two independent eigenvectors $\hat{u}_+ = \hat{\gamma} + i\hat{\eta}$ and $\hat{u}_- = \hat{\gamma} - i\hat{\eta}$. Note that $\lambda_+$, $\hat{u}_+$ are written in terms of $+i$, and $\lambda_-$, $\hat{u}_-$ in terms of $-i$, which ensures that $\beta$ and $\hat{\eta}$ are consistently defined. Using Euler's formula $e^{i\theta} = \cos\theta + i \sin\theta$, the general solution to (4.7) can be put into the real form
$$ v(t) = C_1 e^{\alpha t} [\hat{\gamma} \cos(\beta t) - \hat{\eta} \sin(\beta t)] + C_2 e^{\alpha t} [\hat{\gamma} \sin(\beta t) + \hat{\eta} \cos(\beta t)], \tag{4.12} $$
where $C_1$ and $C_2$ are arbitrary constants. From this expression we deduce the following phase diagrams.
3.1. If $\alpha \neq 0$, then the equilibrium $v_* = 0$ is called a spiral. It is asymptotically stable if $\alpha < 0$, and unstable if $\alpha > 0$. In the stable case, all solution curves spiral towards and approach the origin as time goes on, and all have the same orientation, either clockwise (CW) or counter-clockwise (CCW). The orientation can be determined by examining the direction vector at any test point away from the origin. The unstable case is similar, with the only difference being that all solution curves spiral away from the origin and grow unbounded in time; see Figure 4.15(a) and (b).
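Figure 4.15. Spirals for $\alpha \neq 0$ in panels (a) and (b), and centers for $\alpha = 0$ in panels (c) and (d), shown in the $x$-$y$ plane with CW and CCW orientations.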
3.2. If $\alpha = 0$, then the equilibrium $v_* = 0$ is called a center and is neutrally stable. In this case, all solution curves starting away from the origin are periodic orbits around the origin, and all have an elliptical shape with the same orientation; see Figure 4.15(c) and (d). As before, the orientation of the solution curves can be determined by examining the direction vector at any test point away from the origin. When $\alpha = 0$, note that the general solution in (4.12) can be written as
$$ v(t) = v_0 \cos(\beta t) + \beta^{-1} A v_0 \sin(\beta t), \tag{4.13} $$
where $v_0 = v(0)$ is any given initial condition, and $A v_0 = \frac{dv}{dt}(0)$ is the corresponding initial rate of change determined by (4.7). Moreover, note that the vectors $\{v_0, \beta^{-1} A v_0, -v_0, -\beta^{-1} A v_0\}$ correspond to reference points on the elliptical orbit which are visited at times $\{0, \frac{\pi}{2\beta}, \frac{\pi}{\beta}, \frac{3\pi}{2\beta}\}$, and so on modulo $\frac{2\pi}{\beta}$.
Example 4.7.1. Consider $\frac{dx}{dt} = x + y$ and $\frac{dy}{dt} = 4x - 2y$, with initial conditions $x|_{t=0} = \frac{5}{2}$ and $y|_{t=0} = -\frac{5}{2}$. The matrix, eigenvalues and eigenvectors for this linear system are
$$ A = \begin{pmatrix} 1 & 1 \\ 4 & -2 \end{pmatrix}, \qquad \lambda_{1,2} = 2, -3, \qquad \hat{u}_{1,2} = \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \begin{pmatrix} -1 \\ 4 \end{pmatrix}. \tag{4.14} $$
The system is nondegenerate, and the single equilibrium at the origin is an unstable saddle. The general solution has the form (4.9), which gives
$$ x(t) = C_1 e^{2t} - C_2 e^{-3t}, \qquad y(t) = C_1 e^{2t} + 4 C_2 e^{-3t}. \tag{4.15} $$
Using the initial conditions $x(0) = \frac{5}{2}$ and $y(0) = -\frac{5}{2}$, we obtain the constants $C_1 = \frac{3}{2}$ and $C_2 = -1$. The phase diagram is illustrated in Figure 4.16. For comparison, the direction field for this system is also illustrated, along with the particular solution curve satisfying the initial conditions.
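The numbers in Example 4.7.1 can be checked with a few lines of arithmetic. The following is a minimal sketch using only the Python standard library; the names `lam1`, `lam2`, `solution` and `residual` are illustrative and not from the text. It computes the eigenvalues from the trace and determinant, solves for the constants from the initial conditions, and compares a centered-difference derivative of the solution with $Av$.

```python
import math

# Matrix of Example 4.7.1: dx/dt = x + y, dy/dt = 4x - 2y.
a11, a12, a21, a22 = 1.0, 1.0, 4.0, -2.0

# Eigenvalues of a 2x2 matrix from its trace and determinant.
trace = a11 + a22
det = a11 * a22 - a12 * a21
disc = trace**2 - 4.0 * det              # positive here, so both eigenvalues are real
lam1 = (trace + math.sqrt(disc)) / 2.0
lam2 = (trace - math.sqrt(disc)) / 2.0

# Constants from x(0) = 5/2, y(0) = -5/2 with eigenvectors (1, 1) and (-1, 4):
#   C1 - C2 = x0,   C1 + 4*C2 = y0.
x0, y0 = 2.5, -2.5
C2 = (y0 - x0) / 5.0
C1 = x0 + C2

def solution(t):
    """Particular solution (x(t), y(t)) built from the general form (4.15)."""
    x = C1 * math.exp(lam1 * t) - C2 * math.exp(lam2 * t)
    y = C1 * math.exp(lam1 * t) + 4.0 * C2 * math.exp(lam2 * t)
    return x, y

def residual(t, h=1e-6):
    """Largest mismatch between a centered-difference derivative and A v."""
    xp, yp = solution(t + h)
    xm, ym = solution(t - h)
    x, y = solution(t)
    rx = (xp - xm) / (2.0 * h) - (a11 * x + a12 * y)
    ry = (yp - ym) / (2.0 * h) - (a21 * x + a22 * y)
    return max(abs(rx), abs(ry))

print(lam1, lam2, C1, C2, residual(0.3))
```

Since $C_1 \neq 0$, the component along $(1, 1)$ grows like $e^{2t}$, so this particular solution curve approaches the line $L_1$ as $t$ increases.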
Figure 4.16. Phase diagram with $L_1$ parallel to $(1, 1)$ and $L_2$ parallel to $(-1, 4)$, shown alongside the direction field in the $x$-$y$ plane.
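Figure 4.17.
Example 4.7.2. Consider $\frac{dx}{dt} = 4x - 5y$ and $\frac{dy}{dt} = 2x - 2y$, with arbitrary initial conditions $x|_{t=0} = x_0$ and $y|_{t=0} = y_0$. The matrix, eigenvalues and eigenvectors for this linear system are
$$ A = \begin{pmatrix} 4 & -5 \\ 2 & -2 \end{pmatrix}, \qquad \lambda_{+,-} = 1 + i, \; 1 - i, \qquad \hat{u}_{+,-} = \begin{pmatrix} 3 + i \\ 2 \end{pmatrix}, \begin{pmatrix} 3 - i \\ 2 \end{pmatrix}. \tag{4.16} $$
In our convention in terms of real and imaginary parts, we have $\lambda_+ = \alpha + i\beta$ and $\hat{u}_+ = \hat{\gamma} + i\hat{\eta}$, where
$$ \alpha = 1, \qquad \beta = 1, \qquad \hat{\gamma} = \begin{pmatrix} 3 \\ 2 \end{pmatrix}, \qquad \hat{\eta} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}. \tag{4.17} $$
The system is nondegenerate, and the single equilibrium at the origin is an unstable spiral. Since $\frac{dy}{dt} > 0$ for any point with $x > 0$ and $y = 0$, the orientation is CCW. The general solution has the form (4.12), which gives
$$ \begin{pmatrix} x(t) \\ y(t) \end{pmatrix} = C_1 e^{t} \left[ \begin{pmatrix} 3 \\ 2 \end{pmatrix} \cos t - \begin{pmatrix} 1 \\ 0 \end{pmatrix} \sin t \right] + C_2 e^{t} \left[ \begin{pmatrix} 3 \\ 2 \end{pmatrix} \sin t + \begin{pmatrix} 1 \\ 0 \end{pmatrix} \cos t \right]. \tag{4.18} $$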
The phase diagram is illustrated in Figure 4.17. For comparison, the direction field for this system is also illustrated, along with the particular solution curve with initial condition $(x_0, y_0) = (-\frac{1}{6}, -\frac{1}{6})$.
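The case analysis of this section can be condensed into a short routine. The sketch below is one possible arrangement, not a procedure from the text: the function name `classify_linear_equilibrium` and the tolerance `tol` are assumptions. It classifies the origin of $dv/dt = Av$ from the trace and determinant of a $2 \times 2$ matrix. It does not distinguish proper from improper nodes, since that requires counting independent eigenvectors.

```python
import cmath

def classify_linear_equilibrium(a11, a12, a21, a22, tol=1e-12):
    """Classify the origin for dv/dt = A v with A = [[a11, a12], [a21, a22]]."""
    trace = a11 + a22
    det = a11 * a22 - a12 * a21
    if abs(det) < tol:
        return "degenerate"                     # zero eigenvalue: outside Cases 1-3
    disc = trace**2 - 4.0 * det
    root = cmath.sqrt(disc)                     # complex square root handles disc < 0
    lam1, lam2 = (trace + root) / 2.0, (trace - root) / 2.0
    if disc < -tol:                             # Case 3: eigenvalues alpha +/- i beta
        alpha = lam1.real
        if abs(alpha) < tol:
            return "center (neutrally stable)"
        return "stable spiral" if alpha < 0 else "unstable spiral"
    r1, r2 = lam1.real, lam2.real               # Cases 1 and 2: real eigenvalues
    if r1 * r2 < 0:
        return "saddle (unstable)"
    return "stable node" if r1 < 0 else "unstable node"

print(classify_linear_equilibrium(1, 1, 4, -2))    # matrix of Example 4.7.1
print(classify_linear_equilibrium(4, -5, 2, -2))   # matrix of Example 4.7.2
```

The orientation of a spiral or center is not returned; as noted above, it can be read off from the direction vector at any test point away from the origin.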
4.8. Equilibria in nonlinear systems
We now return to the general system
$$ \frac{dx}{dt} = f(x, y), \quad x|_{t=0} = x_0, \qquad \frac{dy}{dt} = g(x, y), \quad y|_{t=0} = y_0, \qquad t \geq 0. \tag{4.19} $$
We seek to understand the phase diagram for this system around any given equilibrium $(x_*, y_*)$. Since the linear case has already been addressed, we assume that the system is not linear. To motivate the result, let $(x_*, y_*)$ be a given equilibrium, and consider Taylor expansions of $f(x, y)$ and $g(x, y)$ around this point. Since $f(x_*, y_*) = 0$ and $g(x_*, y_*) = 0$, these expansions take the form
$$ \begin{aligned} f(x, y) &= f_x(x_*, y_*)(x - x_*) + f_y(x_*, y_*)(y - y_*) + R_1(x, y, x_*, y_*), \\ g(x, y) &= g_x(x_*, y_*)(x - x_*) + g_y(x_*, y_*)(y - y_*) + R_2(x, y, x_*, y_*). \end{aligned} \tag{4.20} $$
Here $f_x, f_y, g_x, g_y$ denote partial derivatives, and $R_1, R_2$ denote remainder terms. Introducing the matrix notation
$$ v = \begin{pmatrix} x - x_* \\ y - y_* \end{pmatrix}, \qquad A_* = \begin{pmatrix} f_x & f_y \\ g_x & g_y \end{pmatrix}_{(x_*, y_*)}, \qquad R = \begin{pmatrix} R_1 \\ R_2 \end{pmatrix}_{(x, y, x_*, y_*)}, \tag{4.21} $$
we note that the system in (4.19) can be written in an equivalent form
$$ \left. \begin{aligned} dx/dt &= f(x, y) \\ dy/dt &= g(x, y) \end{aligned} \right\} \quad \Longleftrightarrow \quad \frac{dv}{dt} = A_* v + R. \tag{4.22} $$
The matrix $A_*$ is called the Jacobian of the system at $(x_*, y_*)$. Thus, at any point in the phase plane near an equilibrium, a nonlinear system is equivalent to a linear system with a remainder. The continuous differentiability of $f(x, y)$ and $g(x, y)$ ensures that the remainder decreases in magnitude and vanishes as $(x, y)$ approaches $(x_*, y_*)$. Hence we can expect the phase diagram for a nonlinear system to be equivalent to the diagram of a corresponding linear system, but with some distortion caused by the remainder term. The following result shows that such an equivalence holds in most cases under a nondegeneracy assumption, but there is an important exception. The validity of the result requires that $f(x, y)$ and $g(x, y)$ be twice continuously differentiable in an open neighborhood of $(x_*, y_*)$, so that the remainder term vanishes at a maximal rate.
Result 4.8.1. [Hartman-Grobman] Let $(x_*, y_*)$ be an equilibrium of (4.19), and let $\lambda^*_{1,2}$ be the eigenvalues of $A_*$. If $\det A_* \neq 0$, then the phase diagram in a sufficiently small neighborhood of $(x_*, y_*)$ is, up to some distortion,
an asymp. stable node if $\lambda^*_{1,2}$ real and negative,
an unstable saddle if $\lambda^*_{1,2}$ real and opposite,
an unstable node if $\lambda^*_{1,2}$ real and positive,
an asymp. stable spiral if $\lambda^*_{1,2} = \alpha \pm i\beta$, $\beta \neq 0$, $\alpha < 0$,
an unstable spiral if $\lambda^*_{1,2} = \alpha \pm i\beta$, $\beta \neq 0$, $\alpha > 0$,
no conclusion if $\lambda^*_{1,2} = \alpha \pm i\beta$, $\beta \neq 0$, $\alpha = 0$.
Hence the phase diagram around an equilibrium in a nonlinear system is of the same type as a corresponding linear system, provided that the latter is nondegenerate, and has eigenvalues that are not purely imaginary. Moreover, a node can be improper in the same way as before when eigenvalues are repeated. Note that the given equilibrium serves as the origin for the corresponding linear system and its phase diagram. For instance, if $(x_*, y_*)$ and $(\tilde{x}_*, \tilde{y}_*)$ are two equilibria, where the first is an unstable saddle and the second is a stable spiral, then the phase diagram in a small neighborhood around each would appear as illustrated in Figure 4.18(a). The result provides no information on the phase diagram in regions between or far from the equilibria.
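Figure 4.18. Panel (a): local phase diagrams near two equilibria $(x_*, y_*)$ and $(\tilde{x}_*, \tilde{y}_*)$. Panel (b): a saddle at $(x_*, y_*)$, with lines $L_1$, $L_2$ tangent to the solution curves $\mathcal{L}_1$, $\mathcal{L}_2$.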
When an equilibrium is nondegenerate, and has eigenvalues that are not purely imaginary, the distortion caused by the remainder produces only a slight deformation of the linear diagram, with no qualitative changes. In the cases of real eigenvalues, the solution lines $L_{1,2}$ defined by the eigenvectors of the linear system are deformed into solution curves $\mathcal{L}_{1,2}$ in the nonlinear system. The lines $L_{1,2}$ are tangent to the curves $\mathcal{L}_{1,2}$ at the equilibrium point. Whereas the lines $L_{1,2}$ provide barriers that partition the diagram of the linear system, the curves $\mathcal{L}_{1,2}$ provide analogous barriers that partition the diagram of the nonlinear system. The case of a saddle is illustrated in Figure 4.18(b). In the cases of complex eigenvalues, the spiraling solution curves of the linear system are deformed into qualitatively similar spiraling solution curves of the nonlinear system. The amount of deformation can be expected to decrease as the size of the neighborhood is reduced.
When an equilibrium is degenerate, or has eigenvalues that are purely imaginary, the distortion caused by the remainder can produce significant, qualitative changes to the linear diagram. For instance, when the eigenvalues are purely imaginary, the linear diagram is a neutrally stable center, but the nonlinear diagram could be either a stable or unstable spiral, among other things. Similar qualitative differences between the linear and nonlinear diagrams can occur when an equilibrium is degenerate. In these exceptional cases, the nonlinear diagram may be relatively complicated and nonstandard, that is, it may be qualitatively different from a node, saddle, spiral, or center. Other methods would be required in order to gain insight into the phase diagram in such cases; for instance, an analysis of nullclines and sign regions may be helpful, and explicit knowledge of a first integral, if one exists, would be ideal.
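The exceptional case of purely imaginary eigenvalues can be illustrated with a standard example that is not taken from the text: $dx/dt = -y + a x (x^2 + y^2)$ and $dy/dt = x + a y (x^2 + y^2)$, where the constant $a$ is a hypothetical parameter. The Jacobian at the origin has eigenvalues $\pm i$ for every $a$, so the linear diagram is always a center. However, $x f + y g = a (x^2 + y^2)^2$, which is half the rate of change of $x^2 + y^2$ along solutions, so the sign of $a$ decides whether nonlinear solutions spiral outward or inward. A minimal sketch of this computation:

```python
def field(x, y, a):
    """Hypothetical system: the linear part is a center, the cubic part scales with a."""
    r2 = x * x + y * y
    return -y + a * x * r2, x + a * y * r2

def radial_rate(x, y, a):
    """Return x*f + y*g, which equals (1/2) d(x^2 + y^2)/dt along solutions."""
    f, g = field(x, y, a)
    return x * f + y * g

# The same linear part gives three different nonlinear behaviors.
for a in (-1.0, 0.0, 1.0):
    print(a, radial_rate(0.1, 0.2, a))
```

For $a < 0$ the origin is a stable spiral, for $a > 0$ an unstable spiral, and only for $a = 0$ a center, which is consistent with the "no conclusion" entry of Result 4.8.1.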
Example 4.8.1. Consider $\frac{dx}{dt} = y$ and $\frac{dy}{dt} = x^2 - x - 3y$. The equilibrium points of this system satisfy $y = 0$ and $x^2 - x - 3y = 0$. When combined with the first equation, the second implies $x(x - 1) = 0$, which gives $x = 0, 1$. Hence this system has two equilibrium points $(x_*, y_*) = (0, 0)$ and $(1, 0)$. Since $f(x, y) = y$ and $g(x, y) = x^2 - x - 3y$, the Jacobian matrix at an arbitrary point is
$$ A(x, y) = \begin{pmatrix} f_x(x, y) & f_y(x, y) \\ g_x(x, y) & g_y(x, y) \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 2x - 1 & -3 \end{pmatrix}. \tag{4.23} $$
For each of the equilibrium points, we consider the matrix $A_* = A(x_*, y_*)$, and examine its eigenvalues and eigenvectors. The results are summarized below.
Equilibrium 1: $(x_*, y_*) = (0, 0)$,
$$ A_* = \begin{pmatrix} 0 & 1 \\ -1 & -3 \end{pmatrix}, \qquad \lambda^*_{1,2} = -2.62, -0.38, \qquad \hat{u}^*_{1,2} = \begin{pmatrix} -0.38 \\ 1 \end{pmatrix}, \begin{pmatrix} -2.62 \\ 1 \end{pmatrix}. \tag{4.24} $$
Equilibrium 2: $(x_*, y_*) = (1, 0)$,
$$ A_* = \begin{pmatrix} 0 & 1 \\ 1 & -3 \end{pmatrix}, \qquad \lambda^*_{1,2} = -3.30, 0.30, \qquad \hat{u}^*_{1,2} = \begin{pmatrix} -0.30 \\ 1 \end{pmatrix}, \begin{pmatrix} 3.30 \\ 1 \end{pmatrix}. \tag{4.25} $$
Thus each of the equilibrium points is nondegenerate; $(0, 0)$ is an asymptotically stable node, and $(1, 0)$ is an unstable saddle. Using the eigenvectors $\hat{u}^*_{1,2}$, we can draw corresponding lines $L_{1,2}$ through each equilibrium, and these lines then provide a guide for solution curves $\mathcal{L}_{1,2}$ that partition each of the diagrams. The curves $\mathcal{L}_{1,2}$ are tangent to $L_{1,2}$ at each equilibrium, but no further information about $\mathcal{L}_{1,2}$, such as their concavity, is available from this analysis. The diagrams are illustrated in Figure 4.19.
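The rounded eigenvalues in (4.24) and (4.25) can be reproduced directly from the Jacobian (4.23). The sketch below uses only the standard library; the helper names `jacobian` and `real_eigenvalues` are illustrative. It evaluates the Jacobian at each equilibrium and labels the equilibrium according to the real-eigenvalue cases of Result 4.8.1.

```python
import math

def jacobian(x, y):
    """Entries (a11, a12, a21, a22) of the Jacobian of f = y, g = x**2 - x - 3*y."""
    return 0.0, 1.0, 2.0 * x - 1.0, -3.0

def real_eigenvalues(a11, a12, a21, a22):
    """Real eigenvalues (smaller, larger) of a 2x2 matrix; raises if they are complex."""
    trace = a11 + a22
    det = a11 * a22 - a12 * a21
    disc = trace**2 - 4.0 * det
    if disc < 0.0:
        raise ValueError("eigenvalues are complex")
    s = math.sqrt(disc)
    return (trace - s) / 2.0, (trace + s) / 2.0

results = {}
for eq in [(0.0, 0.0), (1.0, 0.0)]:
    lam_lo, lam_hi = real_eigenvalues(*jacobian(*eq))
    if lam_lo * lam_hi < 0.0:
        kind = "unstable saddle"
    elif lam_hi < 0.0:
        kind = "asymptotically stable node"
    else:
        kind = "unstable node"
    results[eq] = (round(lam_lo, 2), round(lam_hi, 2), kind)
    print(eq, results[eq])
```

An eigenvector for an eigenvalue $\lambda$ of this Jacobian can be taken as $(1/\lambda, 1)$, which gives the vectors listed in (4.24) and (4.25).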
Figure 4.19. Local phase diagrams near the equilibria $(0, 0)$ and $(1, 0)$ in the $x$-$y$ plane, with the lines $L_1$, $L_2$ and curves $\mathcal{L}_1$, $\mathcal{L}_2$ at each equilibrium.
4.9. Periodic orbits in nonlinear systems
We consider again the general system
$$ \frac{dx}{dt} = f(x, y), \quad x|_{t=0} = x_0, \qquad \frac{dy}{dt} = g(x, y), \quad y|_{t=0} = y_0, \qquad t \geq 0. \tag{4.26} $$
Aside from equilibria, we seek to understand properties of periodic orbits. In the linear case, a system can have either no periodic orbits, or infinitely many, all of which are nonisolated and neutrally stable. In the nonlinear case, a system can have any number of periodic orbits, and these can be isolated or not, and of different stability types. Here we outline some results about periodic orbits which apply only to systems in two dimensions.
The first result generalizes an observation from the linear case. It is based on the fact that equilibria correspond to intersection points of the $x$- and $y$-nullclines, and are on the boundary between regions of increase and decrease of each variable. Since a periodic orbit must necessarily pass through an increasing and decreasing region for each of the two variables, the corresponding four regions must meet at an equilibrium point within the area enclosed by the orbit. For the following result we assume that $f(x, y)$ and $g(x, y)$ are continuously differentiable in an open disc containing the orbit and the area that it encloses.
Result 4.9.1. Let $(x, y)(t)$ be a periodic orbit of (4.26) and let $U$ denote its enclosed area; see Figure 4.20. Then $U$ must contain an equilibrium point.
Figure 4.20. A periodic orbit $(x, y)(t)$ enclosing the area $U$ in the $x$-$y$ plane.
Thus equilibrium points are necessary for periodic orbits. In the linear case, we saw that every periodic orbit enclosed a single equilibrium at the origin. In the nonlinear case, a periodic orbit can enclose more than one equilibrium point, and these can be located away from the origin. Any system which does not possess an equilibrium cannot have a periodic orbit.
Since they are in a plane, and cannot intersect, solution curves for systems in two dimensions can exhibit only a limited range of behaviors. Only four types of behaviors are possible in forward time $t \geq 0$: either a solution leaves the system domain (open set $D$ in Result 4.2.1), or tends to (or is) a periodic orbit, or tends to (or is) an equilibrium, or tends to a set of points consisting of multiple equilibria and curves joining them. The same types of behaviors are possible in backward time $t \leq 0$. Thus solution curves in two dimensions cannot wander around aimlessly or chaotically for all time, as is possible in higher dimensions. In the next result, the existence of a periodic orbit is implied by ruling out three of the four possible behaviors in forward time. We assume that $f(x, y)$ and $g(x, y)$ are continuously differentiable in an open disc containing the region $R$ as described.
Result 4.9.2. [Poincaré-Bendixson] Consider the system in (4.26) and suppose that a closed, bounded region $R$ can be found such that (see Figure 4.21):
(i) $R$ surrounds one or more equilibria, but does not contain them, and
(ii) at each point on the boundary of $R$ the direction field points either into or tangent to $R$.
Then there exists a periodic orbit in $R$.
Figure 4.21. A region $R$ surrounding the equilibria $(x_*, y_*)$ and $(\tilde{x}_*, \tilde{y}_*)$ in the $x$-$y$ plane.
The region $R$ in the above result is called a trapping region. The condition on the direction field ensures that any solution which begins in $R$ cannot escape and hence is trapped there for all forward time. Since $R$ is bounded and in the system domain, and contains no equilibrium points, every solution that starts in $R$ either is or tends to a periodic orbit. Thus there is at least one periodic orbit as stated, and it must be contained in $R$. Note that it is essential for the trapping region to surround equilibria in view of Result 4.9.1.
Example 4.9.1. Consider $\frac{dx}{dt} = -y - x(x^2 + y^2 - \mu)$ and $\frac{dy}{dt} = x - y(x^2 + y^2 - \mu)$, where $\mu > 0$ is a given constant. To explore if this system has periodic solutions, we first locate its equilibria, if any. The equilibrium points satisfy $-y - x(x^2 + y^2 - \mu) = 0$ and $x - y(x^2 + y^2 - \mu) = 0$. Multiplying the first equation by $y$, and the second equation by $x$, and subtracting the results, we get $x^2 + y^2 = 0$. The only solution of this equation is $x = 0$ and $y = 0$, and hence this system has a single equilibrium point at $(x_*, y_*) = (0, 0)$. The Jacobian matrix $A_*$ at this point has eigenvalues $\lambda^*_{1,2} = \mu \pm i$, which shows that the equilibrium is an unstable spiral for any $\mu > 0$. Since $\frac{dy}{dt} > 0$ for any point with $x > 0$ and $y = 0$, the spiraling solution curves around the equilibrium are CCW.
Consider now a candidate trapping region $R$ which surrounds the equilibrium point $(0, 0)$ but does not contain it. Specifically, let $R$ be the closed region between some inner circle of radius $r_{in}$ and some outer circle of radius $r_{out}$ as illustrated in Figure 4.22(a). We next consider the direction field along the inner and outer boundary of $R$.
Figure 4.22. Panel (a): the region $R$ between circles of radius $r_{in}$ and $r_{out}$ in the $x$-$y$ plane. Panel (b): the direction field along the inner circle of radius $r_{in}$.
Inner. Consider any $r_{in} > 0$ sufficiently small so that the inner circle is within the neighborhood of $(0, 0)$ where the phase diagram is known. Then, since $(0, 0)$ is an unstable spiral, we expect that the direction field will point into $R$ all along the inner circle as illustrated in Figure 4.22(b). Note that the shape of the inner circle could be changed if needed, to an ellipse for example, in order to ensure that this condition holds in a neighborhood of the equilibrium.
Outer. Consider any $r_{out}^2 > \mu$, say $r_{out}^2 = 2\mu$. Then for all points along the outer circle $x^2 + y^2 = r_{out}^2$ we can rewrite $f$ and $g$ as
$$ f(x, y) = -y - x(x^2 + y^2 - \mu) = -y - \mu x, \qquad g(x, y) = x - y(x^2 + y^2 - \mu) = x - \mu y. \tag{4.27} $$
Note that the line $y = -\mu x$ intersects the outer circle in two points where $f = 0$, and divides it into two semicircles with $f > 0$ and $f < 0$, as shown in Figure 4.23(a). Similarly, the line $x = \mu y$ intersects the outer circle in two points where $g = 0$, and divides it into two semicircles with $g > 0$ and $g < 0$, as shown in Figure 4.23(b). By combining this information about the signs of $f$ and $g$, we find that the direction field will point into $R$ all along the outer circle, as shown in Figure 4.23(c).
Figure 4.23. Panels (a) and (b): signs of $f$ and $g$ on the outer circle of radius $r_{out}$, separated by the lines $y = -\mu x$ and $x = \mu y$. Panel (c): the resulting direction field along the outer circle.
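The boundary conditions of the trapping region can also be checked numerically. The sketch below is a supplementary check rather than the argument used in the text. It samples the quantity $x f + y g$ on a circle, where $(x, y)$ is an outward normal. For this system $x f + y g = -r^2 (r^2 - \mu)$ with $r^2 = x^2 + y^2$, so the value is negative on the outer circle and positive on any inner circle with $r_{in}^2 < \mu$. The values `mu = 2.0` and the inner radius `0.1` are illustrative choices.

```python
import math

def field(x, y, mu):
    """Right-hand sides f and g of Example 4.9.1."""
    s = x * x + y * y - mu
    return -y - x * s, x - y * s

def radial_range(radius, mu, n=360):
    """Min and max of x*f + y*g over n sample points on a circle.

    A negative value means the field points toward the origin there,
    a positive value means it points away from the origin.
    """
    values = []
    for k in range(n):
        theta = 2.0 * math.pi * k / n
        x, y = radius * math.cos(theta), radius * math.sin(theta)
        f, g = field(x, y, mu)
        values.append(x * f + y * g)
    return min(values), max(values)

mu = 2.0
outer = radial_range(math.sqrt(2.0 * mu), mu)   # outer circle with r_out^2 = 2*mu
inner = radial_range(0.1, mu)                   # a small inner circle
print(outer, inner)
```

Pointing toward the origin on the outer circle and away from it on the inner circle means that the field points into the annular region $R$ on both parts of its boundary.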
Thus $R$ will be a trapping region with the above choices of $r_{in}$ and $r_{out}$, and by the Poincaré-Bendixson theorem, there exists a periodic orbit in $R$. Note that this result holds for any $\mu > 0$, and that all such periodic solutions are bounded by the radius $r_{out} = \sqrt{2\mu}$. Moreover, in the limit $\mu \to 0$, note that all such periodic solutions must collapse onto the origin. For verification, Figure 4.24 shows the direction field in the case when $\mu = 2$: a single periodic solution is visible, and it is a stable limit cycle.
Figure 4.24.
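The limit cycle of Example 4.9.1 can be located by direct simulation. The sketch below uses a hand-written fourth-order Runge-Kutta step (a generic numerical scheme, not a method from the text) with an assumed step size `dt = 0.01`. It starts one solution inside and one outside the expected orbit and records the final distance from the origin. Since $x f + y g = -r^2 (r^2 - \mu)$, the radius should settle at $r = \sqrt{\mu}$.

```python
import math

def field(x, y, mu):
    """Right-hand sides f and g of Example 4.9.1."""
    s = x * x + y * y - mu
    return -y - x * s, x - y * s

def rk4(x, y, mu, dt, steps):
    """Advance (x, y) by `steps` classical Runge-Kutta steps of size dt."""
    for _ in range(steps):
        k1 = field(x, y, mu)
        k2 = field(x + 0.5 * dt * k1[0], y + 0.5 * dt * k1[1], mu)
        k3 = field(x + 0.5 * dt * k2[0], y + 0.5 * dt * k2[1], mu)
        k4 = field(x + dt * k3[0], y + dt * k3[1], mu)
        x += dt * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        y += dt * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
    return x, y

mu = 2.0
final_radii = []
for start in [(0.1, 0.0), (1.9, 0.0)]:          # inside and outside r = sqrt(mu)
    x, y = rk4(start[0], start[1], mu, dt=0.01, steps=2000)
    final_radii.append(math.hypot(x, y))
print(final_radii, math.sqrt(mu))
```

Both solutions approach the same circle, which is consistent with the single stable limit cycle visible in Figure 4.24.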
Our final result follows from a simple fact about level sets for a continuously differentiable function of two variables. Specifically, in the neighborhood of a strict local extremum point (minimum or maximum) of such a function, the level sets must be closed curves that encircle the point.
Result 4.9.3. [nonlinear center] Let $(x_*, y_*)$ be an isolated equilibrium and let $E(x, y)$ be a first integral of (4.26). If $E(x, y)$ has a strict local extremum at $(x_*, y_*)$, then the phase diagram in a sufficiently small neighborhood of $(x_*, y_*)$ is a center. Every solution curve in this neighborhood is a periodic orbit, except the equilibrium itself.
Thus an isolated equilibrium that is a strict local extremum of a first integral is encircled by infinitely many periodic orbits. These orbits fill an entire neighborhood around the equilibrium, and the orbits and equilibrium are all neutrally stable. This type of equilibrium is the nonlinear version of a center as considered in the linear case. Note that, in addition to the Poincaré-Bendixson theorem, the above result provides another tool for establishing the existence of periodic orbits.
Results 4.8.1 and 4.9.3 together provide conditions under which an equilibrium in a nonlinear system has a standard phase diagram. While the eigenvalues of a Jacobian are sufficient to classify a nonlinear node, saddle, or spiral, they are not sufficient for a nonlinear center. However, a condition involving a first integral is sufficient in this latter case. Note that the existence of a first integral for (4.26) depends on the form of the path equation in (4.2). Specifically, a first integral will exist when this first-order differential equation is exact, or can be converted to an exact form with a suitable integrating factor.
The condition in Result 4.9.3 that the equilibrium be isolated is important. Simple considerations show that the result does not generally hold for equilibria that are nonisolated. For instance, any point within a line of equilibria cannot be encircled by a periodic orbit, because the orbit would necessarily intersect and cross through other equilibria from the line, which cannot occur. A first integral, if one exists, would also be helpful in understanding solution curves in such a nonisolated case. For instance, if a point on the line is a strict local extremum of the integral, then the closed level sets of the integral would contain equilibria from the line, and a solution could not move around the entire contour of a level set, but instead be trapped within an arc.
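Result 4.9.3 can be illustrated with a simple system chosen for this purpose and not taken from the text: $dx/dt = y$, $dy/dt = -x - x^3$. Its only equilibrium is the origin, and $E(x, y) = \frac{1}{2} y^2 + \frac{1}{2} x^2 + \frac{1}{4} x^4$ is a first integral with a strict minimum there, since $\frac{dE}{dt} = y(-x - x^3) + (x + x^3) y = 0$. The sketch below integrates the system with a generic Runge-Kutta step and reports how far $E$ drifts along the numerical solution; the step size and number of steps are assumed values.

```python
def field(x, y):
    """Hypothetical system dx/dt = y, dy/dt = -x - x**3."""
    return y, -x - x**3

def first_integral(x, y):
    """E(x, y) = y^2/2 + x^2/2 + x^4/4, with a strict minimum at (0, 0)."""
    return 0.5 * y * y + 0.5 * x * x + 0.25 * x**4

def rk4_step(x, y, dt):
    """One classical Runge-Kutta step of size dt."""
    k1 = field(x, y)
    k2 = field(x + 0.5 * dt * k1[0], y + 0.5 * dt * k1[1])
    k3 = field(x + 0.5 * dt * k2[0], y + 0.5 * dt * k2[1])
    k4 = field(x + dt * k3[0], y + dt * k3[1])
    x_new = x + dt * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
    y_new = y + dt * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
    return x_new, y_new

def energy_drift(x0, y0, dt=1e-3, steps=10000):
    """Largest change in E along the numerical solution starting at (x0, y0)."""
    e0 = first_integral(x0, y0)
    x, y, worst = x0, y0, 0.0
    for _ in range(steps):
        x, y = rk4_step(x, y, dt)
        worst = max(worst, abs(first_integral(x, y) - e0))
    return worst

print(energy_drift(0.5, 0.0))
```

A drift near the level of rounding and discretization error indicates that the numerical solution stays on a single closed level set of $E$, as expected for a periodic orbit around a nonlinear center.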
4.10. Bifurcation
Equilibrium and periodic solutions and their stability are key features of a system that provide important qualitative information. The dependence of these features on any parameter can be studied as before. For instance, we may consider
$$ \frac{dx}{dt} = f(x, y, \mu), \qquad \frac{dy}{dt} = g(x, y, \mu), \qquad t \geq 0, \tag{4.28} $$
where $\mu$ is an arbitrary parameter of interest. Note that nullclines, regions of increase and decrease, equilibria, Jacobians, eigenvalues, and first integrals will all depend on the parameter. A bifurcation is said to occur at a value $\mu = \mu_\#$ if there is a qualitative change in any part of the phase diagram of (4.28) as the parameter changes from $\mu < \mu_\#$ to $\mu > \mu_\#$. Just as in the one-dimensional case, equilibrium solutions can be created or destroyed, and stabilized or destabilized, and can otherwise change from one type to another. However, in the two-dimensional case, there is the additional possibility that periodic orbits can be created or destroyed, and stabilized or destabilized, and other kinds of bifurcations can also occur. For instance, the phase plane may be partitioned by special solution curves, such as those which emanate from one saddle point and arrive at another, and this partitioning can qualitatively change. Various examples are considered in the Exercises.
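As an illustration of a periodic orbit being created, the system of Example 4.9.1 can be viewed as a family in its constant $\mu$, now allowing $\mu \leq 0$, which the example itself does not consider. The Jacobian at the origin has eigenvalues $\mu \pm i$, and since $x f + y g = -r^2 (r^2 - \mu)$ with $r^2 = x^2 + y^2$, a periodic orbit of radius $\sqrt{\mu}$ exists only for $\mu > 0$. The sketch below tabulates this; the function names are illustrative.

```python
import math

def origin_eigenvalues(mu):
    """Eigenvalues mu +/- i of the Jacobian [[mu, -1], [1, mu]] at the origin."""
    return complex(mu, 1.0), complex(mu, -1.0)

def describe(mu):
    """Qualitative description of the phase diagram for a given mu."""
    lam_plus, _ = origin_eigenvalues(mu)
    alpha = lam_plus.real
    if alpha < 0.0:
        return "origin: stable spiral; no periodic orbit"
    if alpha == 0.0:
        return "origin: linearization is a center; Result 4.8.1 gives no conclusion"
    return "origin: unstable spiral; periodic orbit of radius %.3f" % math.sqrt(mu)

for mu in (-0.5, 0.0, 0.5):
    print(mu, describe(mu))
```

In this family the qualitative change occurs at the value $\mu = 0$: the origin changes from a stable to an unstable spiral, and a periodic orbit emerges from the origin as $\mu$ increases through zero.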
4.11. Case study
Setup. To illustrate some of the preceding results, and a simple model in epidemiology, we study the spread of an infectious disease in a population. As shown in Figure 4.25, we consider a population $\tilde{P}$ in a local, isolated community. We partition the population into three subsets: the collection $\tilde{S}$ of persons who are susceptible to the disease, the collection $\tilde{I}$ of persons who are infected with the disease, and the collection $\tilde{R}$ of persons who have recovered (or died) and are now immune to the disease.
Figure 4.25. A local, isolated community with population $\tilde{P} = \tilde{S} \cup \tilde{I} \cup \tilde{R}$.
For simplicity, we consider a time window of only a few weeks, and thus ignore changes in the total population size due to births and migration of persons into and from the community. We also suppose that susceptible and infected persons roam freely, and that a person of one type has an equal chance to encounter any person of the other type. Under these ideal conditions, we seek to understand how an infectious disease
may spread throughout the population in time, beginning from one infected individual, and on how the dynamics of the spread are influenced by parameters associated with the disease and population. Λ at time π‘, with Outline of model. Let π, πΌ, π
denote the sizes of the subsets π,Μ πΌ,Μ π
dimensions of [π], [πΌ], [π
] = Person and [π‘] = Time (weeks). We assume that persons move from one subset to another as illustrated in Figure 4.26, where π1 and π2 are transfer rates, with dimensions of [π1 ], [π2 ] = Person/Time. This diagram shows the simplest version of an SIR model. Ο1 ~ S
Figure 4.26. Transfers from $\tilde{S}$ to $\tilde{I}$ (infection rate) and from $\tilde{I}$ to $\tilde{R}$ (recovery rate).
Infection rate. The rate at which individuals become infected is represented by (4.29)
π1 =
# persons infected . week
We assume that the disease spreads by a pairing between one susceptible person and one infected person, in which a physical interaction occurs, such as a handshake, cough, or sneeze. Under this assumption, the infection rate can be decomposed at any instant of time as (4.30)
π1 = (# possible pairings)(
# infections ). # possible pairings β
week
The first factor above, which is the number of different, possible pairings at any instant, is the product π β
πΌ. The second factor, which is the number of infections per pairing per week, is assumed to be a constant π > 0 and is called the transmission coefficient. This constant represents not only the infectiousness of the disease, but also the behavior of the population β it represents the fraction of all possible pairings that are expected to actually occur, which depends on social characteristics, and the fraction of these that result in transmission. Thus a model for the infection rate is (4.31)
π1 = πππΌ.
Recovery rate. The rate at which individuals recover (including deaths) and become immune is represented by (4.32)
π2 =
# persons recovered . week
We assume that a person can recover and become immune only after becoming infected, and thus exclude any kind of vaccination process whereby the infection could
be bypassed. Under this assumption, the recovery rate can be decomposed at any instant of time as # recoveries (4.33) π2 = (# infected)( ). # infected β
week The first factor above is the number πΌ. The second factor, which is the number of recoveries per number infected per week, is assumed to be a constant π > 0 and is called the recovery coefficient. This constant represents not only a property of the disease, but also a property of the population β age, fitness, and other health factors. Thus a model for the recovery rate is (4.34)
π2 = ππΌ.
Model equations. In view of Figure 4.26, and the model expressions in (4.31) and (4.34), we consider the equations (4.35)
ππ = βπ1 = βπππΌ, ππ‘
ππΌ = π1 β π2 = πππΌ β ππΌ, ππ‘
ππ
= π2 = ππΌ, ππ‘
π|π‘=0 = π 0 ,
πΌ|π‘=0 = πΌ0 ,
π
|π‘=0 = π
0 .
Note that π = π 0 + πΌ0 + π
0 > 0 is the total population size. We seek to understand how an infection spreads, beginning from one infected person in a population, where no one is initially immune. Thus we focus on initial conditions of the form (4.36)
π 0 = π β 1,
πΌ0 = 1,
π
0 = 0.
Specifically, we seek to understand how the size of the infected group evolves in time, and how qualitative properties of this evolution depend on the parameters π, π and π. Reduced equations. The variable π
can be explicitly eliminated from the system. This follows from the above equations and the observation that ππ ππΌ ππ
+ + = 0, π‘ β₯ 0, ππ‘ ππ‘ ππ‘ which can be integrated in time to obtain π + πΌ + π
= π 0 + πΌ0 + π
0 , which then implies (4.37)
(4.38)
π
= π β π β πΌ,
π‘ β₯ 0.
The above result is the statement that the total size of the population is constant. This follows from the fact that the model includes no mechanisms, such as births or migration, that would cause the size to change. Thus any expression or function of the variables π, πΌ and π
can be reduced to a function of π and πΌ only. Considering only the model equations for π and πΌ, and introducing the scale transformation π₯ = π/π, π¦ = πΌ/π and π = ππ‘, we obtain the two-dimensional dynamical system ππ₯ = βππ₯π¦, ππ
(4.39) π=
ππ π
> 0,
ππ¦ = ππ₯π¦ β π¦, ππ
π₯0 + π¦0 = 1,
π₯0 =
πβ1 π
β² 1.
Note that the differential equations involve only a single dimensionless parameter π, which is called the basic reproduction number. In the above, the notation π₯0 β² 1
means strictly less than, but nearly equal. Although the above system is mathematically well defined for all real values of π₯ and π¦, we only consider the physically meaningful case with π₯ β₯ 0 and π¦ β₯ 0. Analysis of model. The qualitative properties of the solution of (4.39), beginning from the initial conditions of interest, are completely determined by the magnitude of the parameter π > 0. Equilibria. An equilibrium must satisfy βππ₯π¦ = 0 and ππ₯π¦ β π¦ = 0. By adding these two equations we obtain π¦ = 0, and we deduce that any point on the π₯-axis is an equilibrium. Thus the system has an entire line of equilibria, of the form (π₯β , π¦β ) = (π₯β , 0) for all π₯β β₯ 0. Considering the Jacobian matrix at any such point we find det π΄β = 0, which implies that each equilibrium is degenerate. Thus the phase diagram around each point (π₯β , 0) cannot be determined based on the eigenvalues and eigenvectors of π΄β . Nullclines, directions. Significant information for the system can be obtained from the nullclines and sign regions for π(π₯, π¦) = βππ₯π¦ and π(π₯, π¦) = (ππ₯ β 1)π¦. All points f=0 y
Figure 4.27. Nullclines $f = 0$ and $g = 0$ and the direction field in the $x$-$y$ plane.
satisfying π = 0 have π₯ = 0 or π¦ = 0, and we note that π < 0 in the first quadrant with 1 π₯ > 0 and π¦ > 0. Similarly, all points satisfying π = 0 have π₯ = π or π¦ = 0, and we note 1
1
that π < 0 if 0 β€ π₯ < π and π¦ > 0, and π > 0 if π₯ > π and π¦ > 0. The direction field for the system is illustrated in Figure 4.27. Note that the line of equilibria at π¦ = 0 is partitioned into two qualitatively different segments depending on π: equilibria with 1 1 π₯ > π have a repelling character, whereas equilibria with 0 β€ π₯ < π have an attracting character. Observations. Given initial conditions π₯0 +π¦0 = 1, with π₯0 β² 1, there are two cases that arise depending on the value of π. Case 0 < π < 1. The initial condition and behavior of the resulting solution are illustrated in Figure 4.28 (a). The dashed line shows all points with π₯0 + π¦0 = 1, and the condition π₯0 β² 1 implies that (π₯0 , π¦0 ) is near the intercept on the right end. Note that (π₯0 , π¦0 ) is near the attracting equilibria, in the region to the left of the vertical line 1 π₯ = π , where the direction field is leftwards and downwards at all points. In this case, for the solution beginning at (π₯0 , π¦0 ), the size π¦ of the infected group will monotonically shrink in time. Thus the disease does not spread among the population. Case π > 1. The initial condition and behavior of the resulting solution are illustrated in Figure 4.28 (b). Note that (π₯0 , π¦0 ) is near the repelling equilibria, in the region to
Figure 4.28. Initial point $(x_0, y_0)$ and the resulting solution curve in the $x$-$y$ plane for the two cases, in panels (a) and (b), with the peak value $y_{\max}$ marked.
the right of the vertical line π₯ = π , where the direction field is leftwards and upwards in contrast to before. The shape of the solution curve for the solution beginning at (π₯0 , π¦0 ) is qualitatively as shown. In this case, the size π¦ of the infected group will grow before shrinking. Thus the disease spreads among the population and an outbreak is said to occur. An interesting problem is to determine the maximum size π¦max of the infected group. A geometric description of the solution curve can be obtained by 1 integrating the path equation for this system, which yields π¦ = π ln π₯βπ₯+πΆ, where πΆ is 1 a constant determined by (π₯0 , π¦0 ). Note that π¦ = π¦max occurs when π₯ = π , and that the 1 value of π¦max will be decreased as π is increased. Thus any social interventions in the ππ population that reduce the parameter π = π will flatten the curve. For example, the practice of social distancing would reduce the transmission coefficient π, and thereby reduce π.
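The maximum size of the infected group can be computed from the path equation quoted above. In the sketch below, `sigma` is an assumed name for the dimensionless parameter of (4.39) (the basic reproduction number), and `x0` is the initial susceptible fraction with $x_0 + y_0 = 1$. The constant in $y = \frac{1}{\sigma} \ln x - x + C$ is fixed by the initial point, and the peak is evaluated at $x = \frac{1}{\sigma}$. If $\sigma x_0 \leq 1$, the infected fraction decreases from the start, so the peak is the initial value $y_0$.

```python
import math

def y_max(sigma, x0):
    """Peak infected fraction on the path y = (1/sigma) ln x - x + C through (x0, 1 - x0)."""
    if sigma * x0 <= 1.0:
        return 1.0 - x0                       # no outbreak: y only decreases from y0
    c = 1.0 - math.log(x0) / sigma            # constant C from x0 + y0 = 1
    x_peak = 1.0 / sigma                      # dy/dx = 0 where sigma * x = 1
    return math.log(x_peak) / sigma - x_peak + c

# Illustrative values: one initial infection in a population of 1000.
x0 = 0.999
for sigma in (0.5, 1.5, 2.0, 3.0):
    print(sigma, y_max(sigma, x0))
```

Simplifying the expression gives $y_{\max} = 1 - \frac{1}{\sigma} (1 + \ln(\sigma x_0))$ when $\sigma x_0 > 1$, which decreases as $\sigma$ is reduced toward $1/x_0$, in line with the remark about flattening the curve.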
4.12. Case study

Setup. To illustrate another application of the concepts of this chapter, and a simple model from physics, we study the free-spinning motion of a body. As shown in Figure 4.29, we consider a uniform body of rectangular shape, with edge lengths $a, b, c > 0$ and total mass $m > 0$. We suppose that the body is rigid, so that its shape is fixed, but it is otherwise free to move. The body can experience two physically distinct types of motion, which usually occur simultaneously: its center of mass can translate in any way and trace out some path in space, and the body can independently rotate about this center as it translates. Here we focus on the rotational component of the motion. We seek to understand the qualitative behaviors that are possible when the body is given an arbitrary initial spin and is then allowed to spin freely, with zero net torque, and how these behaviors depend on the parameters of the body. For instance, we can consider an experiment in which the body is tossed into the air, and we observe its rotational motion during flight, before it falls to the ground. The assumption of a rectangular shape is made for simplicity; an arbitrary shape could be considered, but at the expense of more complicated expressions and statements.

Outline of model. Let $e_1, e_2, e_3$ be orthonormal basis vectors for three-dimensional space, which are attached to and move with the body. The rotational motion of the body is described by the angular velocity vector $\omega$. The direction of this vector corresponds to the axis of rotation, while the magnitude corresponds to the rate of rotation.

[Figure 4.29. Rectangular body with edges $a$, $b$, $c$ and body-fixed basis vectors $e_1$, $e_2$, $e_3$; the angular velocity $\omega$ indicates the axis of rotation and the rate of rotation.]

For rotational motion, the mass properties of the body are described by an inertia matrix $\Gamma$, which is always symmetric and positive-definite, and its product with the angular velocity is called the angular momentum vector $u = \Gamma\omega$. The relation between momentum and velocity can be inverted to get $\omega = Ku$, where $K = \Gamma^{-1}$. In the basis $e_1, e_2, e_3$ as shown, the vectors have components $\omega = (\omega_1, \omega_2, \omega_3)$, $u = (u_1, u_2, u_3)$, and the matrices have components
\[
\Gamma = \begin{pmatrix} \frac{1}{\alpha} & 0 & 0 \\ 0 & \frac{1}{\beta} & 0 \\ 0 & 0 & \frac{1}{\gamma} \end{pmatrix},
\qquad
K = \begin{pmatrix} \alpha & 0 & 0 \\ 0 & \beta & 0 \\ 0 & 0 & \gamma \end{pmatrix}.
\tag{4.40}
\]

In the above, the inertia parameters are defined by $\frac{1}{\alpha} = \frac{m}{12}(a^2 + c^2)$, $\frac{1}{\beta} = \frac{m}{12}(b^2 + c^2)$, $\frac{1}{\gamma} = \frac{m}{12}(a^2 + b^2)$, and we will be interested in the case when $a > b > c$, which implies $\beta > \alpha > \gamma$.

We consider the case when the body is given an arbitrary initial spin, and is allowed to spin freely thereafter, with zero net torque. Thus we suppose that the initial angular momentum is an arbitrary vector $u_0 \neq 0$, of some arbitrary magnitude $r > 0$. In the trivial case $u_0 = 0$, which corresponds to no initial spin, the body would simply remain still. The law of angular momentum from physics leads to the following system, called the Euler equations for rotational motion, where $\times$ denotes the vector cross product, and $|\cdot|$ denotes magnitude,

\[
\frac{du}{dt} = u \times (Ku), \qquad u|_{t=0} = u_0, \qquad |u_0| = r > 0, \qquad t \ge 0.
\tag{4.41}
\]
Analysis of model. Here we outline some results on the equilibria and corresponding phase diagrams for (4.41), which reveal some interesting features about rotating bodies. The details are left to the Exercises.

Phase space. Using the component expression $u = (u_1, u_2, u_3)$, and carrying out the vector cross product, we find that (4.41) gives the dynamical system

\[
\frac{du_1}{dt} = (\gamma - \beta)u_2 u_3, \qquad
\frac{du_2}{dt} = (\alpha - \gamma)u_3 u_1, \qquad
\frac{du_3}{dt} = (\beta - \alpha)u_1 u_2.
\tag{4.42}
\]

We consider initial conditions $(u_{1,0}, u_{2,0}, u_{3,0}) \neq (0, 0, 0)$, satisfying the magnitude condition $u_{1,0}^2 + u_{2,0}^2 + u_{3,0}^2 = r^2$. Solutions of this system trace out paths or orbits in a three-dimensional phase space, with $u_1, u_2, u_3$-coordinate axes. By observation, we note that every solution has the property that

\[
\frac{d}{dt}\left(u_1^2 + u_2^2 + u_3^2\right) = 2u_1\frac{du_1}{dt} + 2u_2\frac{du_2}{dt} + 2u_3\frac{du_3}{dt} = 0, \qquad t \ge 0.
\tag{4.43}
\]
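The two displayed results can be checked by a short computation. The following is only a sketch of the omitted algebra, assuming $K = \operatorname{diag}(\alpha, \beta, \gamma)$ as in (4.40); the full details are left to the Exercises.

```latex
% Components of u x (Ku), with Ku = (alpha u_1, beta u_2, gamma u_3)
\[
u \times (Ku) =
\begin{pmatrix}
u_2(\gamma u_3) - u_3(\beta u_2) \\
u_3(\alpha u_1) - u_1(\gamma u_3) \\
u_1(\beta u_2) - u_2(\alpha u_1)
\end{pmatrix}
=
\begin{pmatrix}
(\gamma - \beta)u_2 u_3 \\
(\alpha - \gamma)u_3 u_1 \\
(\beta - \alpha)u_1 u_2
\end{pmatrix}.
\]
% Substituting these components: the three coefficients sum to zero
\[
\frac{d}{dt}\left(u_1^2 + u_2^2 + u_3^2\right)
= 2\big[(\gamma - \beta) + (\alpha - \gamma) + (\beta - \alpha)\big]u_1 u_2 u_3 = 0.
\]
```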
Integrating the above expression in time we obtain $u_1^2 + u_2^2 + u_3^2 = r^2$, where we have used the fact that $u_{1,0}^2 + u_{2,0}^2 + u_{3,0}^2 = r^2$. This shows that every solution remains on the sphere of radius $r$ for all time. Thus the system is effectively two-dimensional: instead of paths in a flat plane, solutions trace out paths on the curved surface of a sphere.

Equilibria. In an equilibrium solution, the momentum vector $u$ and the corresponding velocity vector $\omega$ are constant, which implies that the rate and axis of rotation are constant, which corresponds to a steady spinning motion. Equilibria must satisfy $(\gamma - \beta)u_2 u_3 = 0$, $(\alpha - \gamma)u_3 u_1 = 0$, $(\beta - \alpha)u_1 u_2 = 0$, along with the magnitude condition $u_1^2 + u_2^2 + u_3^2 = r^2$. Since the parameters $\alpha, \beta, \gamma$ are all distinct, we find that there are precisely six equilibria, given by $u_* = (\pm r, 0, 0)$, $(0, \pm r, 0)$, $(0, 0, \pm r)$. Since the matrix $K$ is diagonal, the corresponding angular velocity vectors are $\omega_* = Ku_* = (\pm r\alpha, 0, 0)$, $(0, \pm r\beta, 0)$, $(0, 0, \pm r\gamma)$, which are parallel to the basis vectors $e_1, e_2, e_3$. Thus steady spinning motions can only occur about an axis that is perpendicular to a face of the body. An initial spin about any other axis will result in a non-steady motion, which can generally be described as tumbling or wobbling.

Local phase diagrams. The phase diagram for each equilibrium can be found explicitly and exactly. (Although an approach based on a three-dimensional Jacobian matrix could be contemplated, the Jacobian would have zero determinant in all cases, and also purely imaginary eigenvalues in some cases, which would be inconclusive.) An exact phase diagram is possible for each equilibrium by projecting solution curves onto an appropriate plane. For example, consider any solution curve $(u_1, u_2, u_3)(t)$ starting near the equilibrium $(0, 0, r)$; say, in the region on the sphere with $r - \delta < u_3 < r$ for some small $\delta > 0$. The projection of the curve $(u_1, u_2, u_3)(t)$ onto the $u_1, u_2$-plane is the curve $(u_1, u_2)(t)$. The path traced out by this projected curve is described by the equation

\[
\frac{du_1}{du_2} = \frac{du_1/dt}{du_2/dt} = \frac{(\gamma - \beta)u_2 u_3}{(\alpha - \gamma)u_3 u_1} = \frac{(\gamma - \beta)u_2}{(\alpha - \gamma)u_1}.
\tag{4.44}
\]

The above equation can be integrated to obtain $p u_1^2 + q u_2^2 = C$, where $C$ is an arbitrary constant, and $p, q$ are parameters depending on $\alpha, \beta, \gamma$. Indeed, separating variables gives $(\alpha - \gamma)u_1\,du_1 = (\gamma - \beta)u_2\,du_2$, so that one may take $p = \alpha - \gamma$ and $q = \beta - \gamma$. In the present case, using the fact that $\beta > \alpha > \gamma$, we have $p > 0$ and $q > 0$. Since the projected curve $(u_1, u_2)(t)$ is an ellipse in the plane around $(0, 0)$, the original curve $(u_1, u_2, u_3)(t)$ must be an elliptical curve on the sphere around $(0, 0, r)$. Hence the phase diagram for the equilibrium $(0, 0, r)$ has the form of a neutrally stable center as in the flat case, but now on the surface of the sphere. Note that the same result will be obtained for the equilibrium $(0, 0, -r)$. Specifically, the phase diagram for each equilibrium depends on its axis, but not its sign.

A phase diagram can be found for each of the six equilibrium points. For the case of a rectangular body, with edges $a > b > c$ and basis vectors $e_1, e_2, e_3$ as shown earlier, we find that the overall diagram consists of four centers and two saddles on the spherical phase space. A subset of these is illustrated in Figure 4.30, corresponding to the positive side of each axis. The centers occur at the points $(0, \pm r, 0)$ and $(0, 0, \pm r)$, and each is neutrally stable and encircled by a family of periodic orbits. The saddles occur at the points $(\pm r, 0, 0)$, and a more detailed analysis shows that the out-going solution curves from $(r, 0, 0)$ are the in-going curves for $(-r, 0, 0)$, and vice versa. Note that the centers correspond to spinning motions about the $e_2$ and $e_3$ basis vectors, which are parallel to the long and short edges $a$ and $c$ of the body. The saddles correspond to spinning motions about the $e_1$ basis vector, which is parallel to the intermediate edge $b$ of the body.

[Figure 4.30. Phase diagram on the sphere near the positive side of each axis, with axes $u_1$, $u_2$, $u_3$.]
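The contrast between centers and saddles can also be explored numerically. The following is a minimal sketch, not part of the original text: it integrates the component form (4.42) with a classical fourth-order Runge-Kutta step, using only the standard library. The function names, the step size, and the sample dimensions $a = 3$, $b = 2$, $c = 1$, $m = 1$ are illustrative assumptions, and the inertia parameters follow the convention that $e_1, e_2, e_3$ are parallel to the edges $b, a, c$.

```python
import math

def inertia_params(a, b, c, m):
    """Inertia parameters (alpha, beta, gamma) of a uniform box.

    Assumed convention: e1, e2, e3 are parallel to the edges b, a, c,
    so that a > b > c gives beta > alpha > gamma.
    """
    alpha = 12.0 / (m * (a * a + c * c))
    beta = 12.0 / (m * (b * b + c * c))
    gamma = 12.0 / (m * (a * a + b * b))
    return alpha, beta, gamma

def rhs(u, params):
    """Right-hand side of the component system (4.42)."""
    alpha, beta, gamma = params
    u1, u2, u3 = u
    return ((gamma - beta) * u2 * u3,
            (alpha - gamma) * u3 * u1,
            (beta - alpha) * u1 * u2)

def rk4_step(u, h, params):
    """One classical Runge-Kutta step of size h."""
    def shifted(k, s):
        return tuple(ui + s * ki for ui, ki in zip(u, k))
    k1 = rhs(u, params)
    k2 = rhs(shifted(k1, 0.5 * h), params)
    k3 = rhs(shifted(k2, 0.5 * h), params)
    k4 = rhs(shifted(k3, h), params)
    return tuple(ui + h * (w1 + 2 * w2 + 2 * w3 + w4) / 6.0
                 for ui, w1, w2, w3, w4 in zip(u, k1, k2, k3, k4))

def simulate(u0, params, h=0.005, steps=8000):
    """Return the list of states u(0), u(h), ..., u(steps * h)."""
    path = [tuple(u0)]
    for _ in range(steps):
        path.append(rk4_step(path[-1], h, params))
    return path

def unit(u):
    """Scale u to the sphere of radius r = 1."""
    n = math.sqrt(sum(ui * ui for ui in u))
    return tuple(ui / n for ui in u)

params = inertia_params(a=3.0, b=2.0, c=1.0, m=1.0)
near_e3 = simulate(unit((0.05, 0.05, 1.0)), params)  # start near a center
near_e1 = simulate(unit((1.0, 0.05, 0.05)), params)  # start near a saddle
print("smallest u3 when starting near e3:", min(u[2] for u in near_e3))
print("smallest u1 when starting near e1:", min(u[0] for u in near_e1))
```

In this sketch, a spin started near $e_3$ keeps $u_3$ close to its initial value, while a spin started near $e_1$ eventually carries $u_1$ to the opposite side of the sphere, which is the flipping behavior associated with the saddles. The quantity $u_1^2 + u_2^2 + u_3^2$ can be monitored along either path as a check on the integration.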
The overall structure of the above phase diagram has some interesting physical consequences. For instance, if the body is given an initial spin that is nearly parallel to the $e_2$ or $e_3$ axis, then the axis of rotation will remain nearly constant, and the body will spin in a nearly steady fashion, with a slight wobble. This motion corresponds to the momentum $u$ moving around a small closed curve about the equilibrium in the phase diagram of a center. In contrast, if the body is given an initial spin that is nearly parallel to the $e_1$ axis, then the axis of rotation will change sharply, and the body will not spin in a nearly steady fashion, but will instead begin to "flip" back and forth! This motion corresponds to moving back and forth from one saddle to the other: the momentum $u$ starts near one saddle, then is pushed along the out-going direction to the other saddle on the opposite side of the spherical phase space, then pushed back again, and so on.

The above result on the instability of steady spin motions is called the intermediate axis theorem. (The result is also known as the tennis racket theorem or the Dzhanibekov effect.) It holds for a large class of bodies of arbitrary shape, not just the rectangular shape considered here. The name of the result reflects the fact that the unstable equilibrium corresponds to a spin parallel to the intermediate edge $b$, while the neutrally stable equilibria correspond to spins parallel to the long and short edges $a$ and $c$. For bodies of arbitrary shape, the unstable equilibrium would correspond to a spin about the eigenvector of the intermediate eigenvalue of the inertia matrix $\Gamma$. The result is applicable provided that this matrix has distinct eigenvalues.

Global phase diagram. The procedure outlined above for the local phase diagrams can be exploited to produce a global diagram of solution curves. Specifically, the local diagram around each of the four centers can be expanded in a maximal way. The resulting four maximal diagrams cover the sphere, and these diagrams imply those for the remaining two saddles.

To begin, consider any one of the four center equilibrium points, say $(0, 0, r)$. Let $(u_1, u_2, u_3)(t)$ be a solution curve that starts in a neighborhood of this equilibrium, and consider the projected curve $(u_1, u_2)(t)$. As shown earlier, the projected curve must trace out the ellipse $p u_1^2 + q u_2^2 = C$, where $p, q > 0$ are parameters depending on $\alpha, \beta, \gamma$, and $C \ge 0$ is a constant depending on the initial point. For initial points in a small neighborhood of $(0, 0, r)$, we obtain a local diagram of a center surrounded by elliptical solution curves, as considered earlier. We now ask: what is the largest neighborhood that is filled with such curves? Due to the fact that $u_1^2 + u_2^2 + u_3^2 = r^2$, as long as the projected curve remains in the disc $u_1^2 + u_2^2 < r^2$, the original curve will remain in the hemisphere $u_3 > 0$, and the path equation (4.44) for the projected curve will remain valid. Thus the local diagram can be expanded to a maximal diagram, defined by the set of all ellipses $p u_1^2 + q u_2^2 = C$ that are contained in the disc $u_1^2 + u_2^2 < r^2$.

[Figure 4.31. Maximal diagram associated with the center $(0, 0, r)$, with axes $u_1$, $u_2$.]

Figure 4.31 shows the resulting maximal diagram associated with the center $(0, 0, r)$. The set of all ellipses that fit within the disc fills an elliptical region of the plane, and when this region is projected up to the sphere, we obtain a corresponding wedge region on the surface of the sphere around $(0, 0, r)$. The elliptical region contains all ellipses with the given $p, q > 0$, determined by the equilibrium, but different $C \ge 0$. The boundary of the elliptical region is tangent to the edge of the disc at two points. When the elliptical region is projected up to the sphere, the two points of tangency stay fixed, since they are at the equator of the sphere. As a result, the two points of tangency become the corners of the wedge region.

The above construction can be applied to each of the four center equilibria $(0, 0, \pm r)$ and $(0, \pm r, 0)$, and a maximal diagram and wedge region is obtained for each. As shown in Figure 4.32, the wedge regions have disjoint interiors, cover the entire sphere, and share a common set of corner points and boundary curves. The common corner points are $(\pm r, 0, 0)$, which are the saddle equilibria, and the common boundary curves are precisely the in- and out-going solution curves of the two saddles. Note that all solution curves on the sphere are closed orbits, except for the equilibria and the four curves that connect the saddles.
Figure 4.32.
All solution curves in a wedge region have the same orientation about the corresponding equilibrium, and the orientation extends to the boundary of the wedge. Since the orientation of adjacent wedges must agree on the common boundary, the orientation within one wedge determines the orientation within all four. By inspection of the direction field, we find that the orientation in the top wedge containing $(0, 0, r)$ is counterclockwise (CCW). Finally, from the orientation of the common boundary curves between the wedges, we find that an out-going curve from one saddle is an in-going curve to the other, and vice versa.
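The wedge construction can be turned into a simple test for which center a given point on the sphere orbits. The sketch below is an illustration under stated assumptions rather than the author's method: it takes $p = \alpha - \gamma$ and $q = \beta - \gamma$ in the ellipse equation, which is one admissible normalization, and compares $C = p u_1^2 + q u_2^2$ with its value $p r^2$ on the curves through the saddles $(\pm r, 0, 0)$. The function names `wedge` and `planar_rotation` are hypothetical.

```python
def rhs(u, alpha, beta, gamma):
    """Right-hand side of the component system (4.42)."""
    u1, u2, u3 = u
    return ((gamma - beta) * u2 * u3,
            (alpha - gamma) * u3 * u1,
            (beta - alpha) * u1 * u2)

def wedge(u, alpha, beta, gamma, tol=1e-12):
    """Return the direction of the center whose wedge contains u.

    Assumes beta > alpha > gamma. The result is one of (0, 0, 1),
    (0, 0, -1), (0, 1, 0), (0, -1, 0), or None when u lies on a
    boundary curve through the saddles (within the tolerance tol).
    """
    u1, u2, u3 = u
    p = alpha - gamma
    q = beta - gamma
    r2 = u1 * u1 + u2 * u2 + u3 * u3
    # Excess of C over its value on the curves through (+-r, 0, 0).
    excess = p * u1 * u1 + q * u2 * u2 - p * r2
    if excess < -tol:
        return (0, 0, 1) if u3 > 0 else (0, 0, -1)
    if excess > tol:
        return (0, 1, 0) if u2 > 0 else (0, -1, 0)
    return None

def planar_rotation(u, alpha, beta, gamma):
    """Sign indicator for the rotation of (u1, u2)(t) about the origin.

    A positive value means counterclockwise motion in the u1,u2-plane
    when viewed from the positive u3 side.
    """
    du1, du2, _ = rhs(u, alpha, beta, gamma)
    return u[0] * du2 - u[1] * du1

alpha, beta, gamma = 1.2, 2.4, 0.9  # sample values with beta > alpha > gamma
print(wedge((0.1, 0.1, 0.99), alpha, beta, gamma))
print(planar_rotation((0.1, 0.1, 0.99), alpha, beta, gamma))
```

The second function gives a quick check of the orientation statement: at any point with $u_3 > 0$ other than the equilibrium, the indicator equals $u_3\,(p u_1^2 + q u_2^2)$, which is positive when $p, q > 0$.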
Reference notes

Ordinary differential equations are an essential tool for modeling and analyzing problems in numerous scientific disciplines. The study of such equations from a dynamical point of view, with a focus on qualitative features such as the stability and bifurcation of equilibrium and periodic solutions, is an important part of applied mathematics. Here we considered only elementary parts of the theory for two-dimensional autonomous systems, which provides a starting point for understanding more general systems.

Proofs of the main results outlined here can be found in a number of texts. For proofs of the general theorems on existence and uniqueness of solutions, stability of equilibria, existence of periodic orbits, and the Poincaré–Bendixson theorem, see the classic texts by Coddington and Levinson (1955) and Hirsch and Smale (1974). A proof of the Hartman–Grobman theorem can be found in the texts by Perko (2001) and Teschl (2012). For texts with a focus on bifurcation theory, for dynamical and more general systems, see Chow and Hale (1982) and Guckenheimer and Holmes (1990). For introductory texts in dynamical systems see Arnold (1992), Kelley and Peterson (2010), and Strogatz (2015).
Exercises

1. Sketch the nullclines, and the direction of solution curves in all regions separated by the nullclines, and find the equilibria.
(a) $\frac{dx}{dt} = x + e^{-y}$, $\frac{dy}{dt} = -y$.
(b) $\frac{dx}{dt} = x - x^3$, $\frac{dy}{dt} = x - y$.
(c) $\frac{dx}{dt} = y - x^2 + 1$, $\frac{dy}{dt} = x + y$.
(d) $\frac{dx}{dt} = y - \ln x$, $\frac{dy}{dt} = x - 4 + e^{-y}$.
(e) $\frac{dx}{dt} = x(y - x)$, $\frac{dy}{dt} = y(x + 3y)$.
(f) $\frac{dx}{dt} = y$, $\frac{dy}{dt} = x(2 + y) - 1$.

2. Sketch the nullclines, and the direction of solution curves in all regions separated by the nullclines, and find the equilibria.
(a) $\frac{dx}{dt} = x^2 + y^2 - 4$, $\frac{dy}{dt} = x - 2y$.
(b) $\frac{dx}{dt} = y - x$, $\frac{dy}{dt} = \frac{5x^2}{4 + x^2} - y$.
(c) $\frac{dx}{dt} = x + y - 2x^2$, $\frac{dy}{dt} = y + 3y^2 - 2x$.
(d) $\frac{dx}{dt} = -x - y - x^2 y$, $\frac{dy}{dt} = x + 4y$.
3. Use the path equation to find and sketch the paths through the given points $(x_0, y_0)$. Indicate the direction of increasing time.
(a) $\frac{dx}{dt} = y$, $\frac{dy}{dt} = x^2 - x$, $(x_0, y_0) = (\frac{1}{2}, 0)$, $(2, 0)$.
(b) $\frac{dx}{dt} = -2xy$, $\frac{dy}{dt} = 2xy - y$, $(x_0, y_0) = (\frac{3}{4}, \frac{1}{4})$.
(c) $\frac{dx}{dt} = e^{x+y}$, $\frac{dy}{dt} = x + 1$, $(x_0, y_0) = (0, 0)$.
(d) $\frac{dx}{dt} = x + y^2$, $\frac{dy}{dt} = x - y$, $(x_0, y_0) = (-\frac{1}{2}, -\frac{1}{2})$.

4. Find a first integral $E(x, y)$ of the dynamical system.
(a) $\frac{dx}{dt} = y$, $\frac{dy}{dt} = 3x^2 - 1$.
(b) $\frac{dx}{dt} = 2 - y$, $\frac{dy}{dt} = 2x^3$.
(c) $\frac{dx}{dt} = -y$, $\frac{dy}{dt} = x^3 - x$.
(d) $\frac{dx}{dt} = xy$, $\frac{dy}{dt} = x^2$.
(e) $\frac{dx}{dt} = y$, $\frac{dy}{dt} = -\sin x$.
(f) $\frac{dx}{dt} = x - xy$, $\frac{dy}{dt} = -y + xy$.
5. Find the general solution of the linear dynamical system. Sketch the phase diagram and state the type and stability of the equilibrium at the origin.
(a) $\frac{dx}{dt} = 2x - y$, $\frac{dy}{dt} = -2x + 2y$.
(b) $\frac{dx}{dt} = x - 3y$, $\frac{dy}{dt} = -3x + y$.
(c) $\frac{dx}{dt} = -3x + 4y$, $\frac{dy}{dt} = -x + y$.
(d) $\frac{dx}{dt} = -2x - 3y$, $\frac{dy}{dt} = 3x - 2y$.
(e) $\frac{dx}{dt} = 3y$, $\frac{dy}{dt} = -2x$.
(f) $\frac{dx}{dt} = 2x$, $\frac{dy}{dt} = 2y$.
(g) $\frac{dx}{dt} = -x + y$, $\frac{dy}{dt} = 3x - y$.
(h) $\frac{dx}{dt} = x - 4y$, $\frac{dy}{dt} = x + y$.
6. The equation for a damped spring-mass system is given below, where $x$ is position or displacement, $t$ is time, and $m > 0$, $a \ge 0$ and $k > 0$ are parameters that quantify the mass, damping and stiffness of the system.

[Figure: spring-mass system with stiffness $k$, mass $m$, damping $a$, and displacement $x$ measured from $0$.]

\[
m\frac{d^2x}{dt^2} + a\frac{dx}{dt} + kx = 0.
\]

(a) Let $y = \frac{dx}{dt}$ and rewrite the above equation as a linear first-order dynamical system for $x$ and $y$. Is this system ever degenerate?
(b) Find the type and stability of the equilibrium at $(x_*, y_*) = (0, 0)$ in the cases $a = 0$, $a^2 - 4mk = 0$, $a^2 - 4mk > 0$, and $a^2 - 4mk < 0$.

7. Consider the system $\frac{dx}{dt} = 2x + \mu y$, $\frac{dy}{dt} = x + y$, where $\mu$ is an arbitrary parameter.
(a) Find all values of $\mu$ for which the system is nondegenerate.
(b) Find the type and stability of the equilibrium at the origin for each value of $\mu$ in (a).

8. Consider the system $\frac{dx}{dt} = x + 2y$, $\frac{dy}{dt} = -2x - 4y$.
(a) Show that this system is degenerate and has an entire line of equilibrium points.
(b) Sketch the nullclines and direction field and determine if the line of equilibria has an attracting or repelling character.

9. Consider a nondegenerate system $\frac{dv}{dt} = Av$ in two dimensions, with equilibrium $v_* = 0$, and let $\delta = \det A$ (determinant) and $\tau = \operatorname{tr} A$ (trace). Show that:
(a) if $\delta < 0$, then $v_*$ is unstable (saddle).
(b) if $\delta > 0$ and $\tau > 0$, then $v_*$ is unstable (node or spiral).
(c) if $\delta > 0$ and $\tau < 0$, then $v_*$ is asymptotically stable (node or spiral).
(d) if $\delta > 0$ and $\tau = 0$, then $v_*$ is neutrally stable (center).

10. A model for insecticide transfer between an agricultural crop and soil is given below, where $x$ and $y$ are the amounts of insecticide in the crop and soil, $t$ is time, and $\alpha > 0$, $\beta > 0$ and $\gamma > 0$ are parameters that quantify transfer and degradation rates.

[Figure: insecticide in crop $x$ and insecticide in soil $y$, with transfer rate $\alpha x$ from crop to soil, transfer rate $\beta y$ from soil to crop, and degradation rate $\gamma y$.]

\[
\frac{dx}{dt} = -\alpha x + \beta y, \qquad \frac{dy}{dt} = \alpha x - \beta y - \gamma y.
\]

Show that this system is nondegenerate and find the type and stability of the equilibrium at $(x_*, y_*) = (0, 0)$ for all $\alpha$, $\beta$ and $\gamma$.
11. Consider the system $\frac{dx}{dt} = ax - by$, $\frac{dy}{dt} = bx + 2y$. Find all values of the parameters $a$, $b$ for which the origin will be the following:
(a) saddle.
(b) center.
(c) stable node.
12. Find the type and stability of all equilibrium points, and sketch a local phase diagram for each.
(a) $\frac{dx}{dt} = x + e^{-y}$, $\frac{dy}{dt} = -y$.
(b) $\frac{dx}{dt} = x - x^3$, $\frac{dy}{dt} = x - y$.
(c) $\frac{dx}{dt} = x^2 + y^2 - 4$, $\frac{dy}{dt} = x - 2y$.
(d) $\frac{dx}{dt} = y - x$, $\frac{dy}{dt} = \frac{5x^2}{4 + x^2} - y$.
(e) $\frac{dx}{dt} = y - x^2$, $\frac{dy}{dt} = y + x - 1$.
(f) $\frac{dx}{dt} = y - 3 + 2e^x$, $\frac{dy}{dt} = x + \ln y$.
13. Find all equilibrium points and their stability, using a first integral for help as needed. Determine which, if any, are nonlinear centers.
(a) $\frac{dx}{dt} = y$, $\frac{dy}{dt} = 3x^2 - 1$.
(b) $\frac{dx}{dt} = 2 - y$, $\frac{dy}{dt} = 2x^3$.
(c) $\frac{dx}{dt} = 4y + y^2$, $\frac{dy}{dt} = -x$.
(d) $\frac{dx}{dt} = y^3$, $\frac{dy}{dt} = -x^3$.
(e) $\frac{dx}{dt} = y^3$, $\frac{dy}{dt} = x^3$.
(f) $\frac{dx}{dt} = y^2$, $\frac{dy}{dt} = -x^2$.
(g) $\frac{dx}{dt} = -y$, $\frac{dy}{dt} = x^3 - x$.
(h) $\frac{dx}{dt} = y$, $\frac{dy}{dt} = -\sin x$.
(i) $\frac{dx}{dt} = 3y^2 - x$, $\frac{dy}{dt} = y - x$.
(j) $\frac{dx}{dt} = x - xy$, $\frac{dy}{dt} = -y + xy$.

14. Consider the system $\frac{dx}{dt} = xy$, $\frac{dy}{dt} = -x^2$.
(a) Show that $E(x, y) = x^2 + y^2$ is a first integral.
(b) Explain why the origin is not a nonlinear center, even though it is an equilibrium and a strict local minimum of $E(x, y)$.

15. Let $V(x, y)$ be continuously differentiable in a neighborhood $D$ of an equilibrium $(x_*, y_*)$. If $V(x, y)$ has a strict local minimum at $(x_*, y_*)$, and satisfies $\frac{d}{dt}V(x(t), y(t)) \le 0$ for every solution curve in $D$, then it is called a Lyapunov function for $(x_*, y_*)$.
(a) Show that, if a Lyapunov function exists, then $(x_*, y_*)$ is either neutrally or asymptotically stable.
(b) Show that, if $\frac{d}{dt}V(x(t), y(t)) < 0$ holds for every solution curve $(x, y)(t) \neq (x_*, y_*)$ in $D$, then $(x_*, y_*)$ is asymptotically stable.

16. Consider the system $\frac{dx}{dt} = -y - \gamma x^3$, $\frac{dy}{dt} = x - \gamma y^3$, where $\gamma > 0$ is a parameter, and consider the equilibrium $(x_*, y_*) = (0, 0)$.
(a) Show the Hartman–Grobman theorem is inconclusive for $(x_*, y_*)$.
(b) Show that $V(x, y) = x^2 + y^2$ is a Lyapunov function and that $(x_*, y_*)$ is asymptotically stable for any $\gamma > 0$.

17. Consider the system $\frac{dx}{dt} = xy$, $\frac{dy}{dt} = x^2 - y$.
(a) Show that $(x_*, y_*) = (0, 0)$ is the only equilibrium, and that this equilibrium is degenerate in the sense that $\det A_* = 0$.
(b) Determine the stability of $(x_*, y_*)$ using nullclines and a sketch of the direction field. Explain how the phase diagram around $(x_*, y_*)$ differs from that of a node, saddle, spiral or center.

18. Find the type and stability of all equilibrium points in terms of the parameters $\mu$, $a$ and $\gamma$. Here $\mu$ is arbitrary, and $a$ and $\gamma$ are positive.
(a) $\frac{dx}{dt} = y$, $\frac{dy}{dt} = x^2 - x + \mu y$.
(b) $\frac{dx}{dt} = x + y$, $\frac{dy}{dt} = \mu x + x^2$.
(c) $\frac{dx}{dt} = y + \mu x$, $\frac{dy}{dt} = -x - xy$.
(d) $\frac{dx}{dt} = y + \mu x$, $\frac{dy}{dt} = x^2 - y$.
(e) $\frac{dx}{dt} = y - a e^x$, $\frac{dy}{dt} = x + \ln y$.
(f) $\frac{dx}{dt} = y - x^2$, $\frac{dy}{dt} = \frac{x}{a} + y$.
(g) $\frac{dx}{dt} = \gamma x - xy$, $\frac{dy}{dt} = xy - ay + y$.
(h) $\frac{dx}{dt} = \gamma - y^2 - ax$, $\frac{dy}{dt} = xy - ay$.

19. Consider the system $\frac{dx}{dt} = x - y - x^3$, $\frac{dy}{dt} = x + y - y^3$.
(a) Sketch the nullclines and direction field, and show (graphically) that $(x_*, y_*) = (0, 0)$ is the only equilibrium.
(b) Find the type and stability of $(x_*, y_*)$.
(c) Use the nullclines to find a rectangular trapping region that surrounds $(x_*, y_*)$; state the coordinates of all four corners. Show the region contains a periodic orbit.

20. In various ecological settings, prey and predators coexist and interact, and cycles are observed in the populations of both species. A model for one such pair of interacting species is given below in dimensionless form, where $x$ and $y$ are the population sizes of prey and predator, $t$ is time, and $\mu$ is a parameter. Assume $x \ge 0$, $y \ge 0$ and $\mu > 0$. This model is called the Lotka–Volterra model.

\[
\frac{dx}{dt} = x - xy, \qquad \frac{dy}{dt} = -\mu y + xy.
\]

(a) Sketch the nullclines and direction field. Find the equilibrium $(x_*, y_*)$ in which the two species coexist, with $x_* > 0$ and $y_* > 0$.
(b) Show that the Hartman–Grobman theorem is inconclusive for $(x_*, y_*)$.
(c) Find a first integral for the system and show that $(x_*, y_*)$ is a nonlinear center for any $\mu > 0$.

21. Certain chemical reactions exhibit periodic behavior in time, instead of tending to an equilibrium state. A model for one such reaction is given below in dimensionless form, where $x$ and $y$ are the concentrations of the two chemical constituents, $t$ is time, and $\gamma$ is a reaction parameter. Assume $x \ge 0$, $y \ge 0$ and $\gamma > 0$. Such reactions are called chemical oscillators.

\[
\frac{dx}{dt} = 10 - x - \frac{4xy}{1 + x^2}, \qquad \frac{dy}{dt} = \gamma x - \frac{\gamma xy}{1 + x^2}.
\]

(a) Sketch the nullclines and direction field.
(b) Show that $(x_*, y_*) = (2, 5)$ is the only equilibrium, and that it is an unstable node or unstable spiral if $\gamma < \gamma_\#$, for some $\gamma_\#$ which you should find.
(c) Use the nullclines to find a rectangular trapping region that surrounds $(x_*, y_*)$; state the coordinates of all four corners. Show the region contains a periodic orbit when $\gamma < \gamma_\#$.
22. A model for the spread of a contagious illness in a population is given below, where $S$, $I$ and $R$ are the number of susceptible, infected and recovered individuals, $t$ is time, $a$ and $r$ are infection and recovery rate parameters, and $\mu$ and $\nu$ are re-susceptibility and vaccination rate parameters. Assume $S, I, R \ge 0$ and $a, r > 0$ and $\mu, \nu \ge 0$. Such a model is called an SIR model.

[Figure: compartments $S$, $I$, $R$ with transfer rates $\phi_1 = aSI$, $\phi_2 = rI$, $\phi_3 = \mu R$, $\phi_4 = \nu S$.]

\[
\frac{dS}{dt} = -aSI + \mu R - \nu S, \qquad
\frac{dI}{dt} = aSI - rI, \qquad
\frac{dR}{dt} = rI - \mu R + \nu S.
\]

(a) Show that $S(t) + I(t) + R(t) = N$ for all time while a solution exists, where $N$ is a constant. ($N$ is the total population size.) Use this relation to eliminate $R$ from the $S, I$ equations.
(b) Consider the $S, I$ equations from (a) with $\nu = 0$. Rewrite the system using $x = S/N$, $y = I/N$ and $\tau = rt$. Find all equilibria, and their type and stability, in terms of dimensionless parameters.
(c) Consider the $S, I$ equations from (a) with $\mu = 0$. Rewrite the system using $x = S/N$, $y = I/N$ and $\tau = rt$. Find all equilibria, and their type and stability, in terms of dimensionless parameters.

23. The equation for a damped pendulum is given below, where $\theta$ is the angular position, $t$ is time, and $m > 0$, $\ell > 0$, $g > 0$, and $a \ge 0$ are parameters that quantify mass, length, gravitational acceleration, and damping.

[Figure: pendulum with angle $\theta$, mass $m$, gravity $g$, and damping $a$.]

\[
m\ell\frac{d^2\theta}{dt^2} + a\frac{d\theta}{dt} + mg\sin\theta = 0.
\]

(a) Let $\omega = \frac{d\theta}{dt}$ and rewrite the above equation as a first-order system for $\theta$ and $\omega$. Find all equilibrium points $(\theta_*, \omega_*)$.
(b) Find the type and stability of all equilibria in the case of no damping when $a = 0$. Note: the system has a first integral in this case.
(c) Find the type and stability of all equilibria in the case with damping when $a > 0$.

24. In certain electrical circuits with nonlinear components, the flow of current can exhibit self-sustained oscillations. A model for one such circuit is given below in dimensionless form, where $x$ is the current, $t$ is time, and $\mu$ is a parameter. This model is called the Van der Pol equation.

\[
\frac{d^2x}{dt^2} + \mu(x^2 - 1)\frac{dx}{dt} + x = 0.
\]

(a) Let $y = \frac{dx}{dt}$ and rewrite the above equation as a first-order system for $x$ and $y$. Show that $(x_*, y_*) = (0, 0)$ is the only equilibrium.
(b) Find the type and stability of $(x_*, y_*)$ for all $\mu > 0$.
(c) Make a qualitative sketch of the nullclines and the direction field for arbitrary $\mu > 0$. Does a periodic orbit seem possible?
(d) Use Matlab or similar software to illustrate that an asymptotically stable periodic orbit exists; for concreteness use $\mu = 1$.

Mini-project 1. A simple model for the relationship dynamics between two people X and Y is

\[
\frac{dx}{dt} = ax + by, \qquad \frac{dy}{dt} = cx + dy, \qquad x|_{t=0} = x_0, \quad y|_{t=0} = y_0, \qquad t \ge 0.
\]

Here $t$ is time, $x$ is the intensity of X's feelings (for Y), and $y$ is the intensity of Y's feelings (for X), where positive values mean love, and negative values mean hate. Thus intensity of feeling is a kind of emotional temperature. The constants $a$ and $b$ characterize X's personality: X is eager or cautious if $a > 0$ or $a < 0$, and responsive or manipulative if $b > 0$ or $b < 0$. Similarly, $c$ and $d$ characterize Y's personality: Y is eager or cautious if $d > 0$ or $d < 0$, and responsive or manipulative if $c > 0$ or $c < 0$. Note that $a, d$ model intra- or within-person traits, while $b, c$ model inter- or between-person traits. Here we perform a qualitative analysis to understand the ultimate fate of a relationship depending on the personality types. All quantities are dimensionless.

(a) Consider the case of $a = 0$, $b > 0$, $c < 0$, $d = 0$, so that X is responsive and Y is manipulative, and neither is eager or cautious. In other words, X warms up when Y is warm, cools down when Y is cool, and has no self-amplifying or self-suppressing tendencies. Y behaves analogously, but cools down when X is warm, and warms up when X is cool. Show that, if $(x_0, y_0) \neq (0, 0)$, then the relationship evolves as a never-ending cycle of love and hate. Sketch a phase diagram to illustrate how the system evolves in time.
(b) Consider the case of $a = d < 0$ and $b = c > 0$, so that X and Y are both cautious and responsive with identical characteristics. Show that if $|a| > b$ (more cautious than responsive), then the relationship always fizzles out to mutual apathy. On the other hand, if $|a| < b$ (more responsive than cautious), then the relationship is explosive: it will generally end up in extreme mutual love or mutual hatred depending on the initial feelings. What set of initial feelings leads to mutual love? What about mutual hatred? Sketch a phase diagram for each case to illustrate how the system evolves in time.

Mini-project 2.
In the biochemical process of glycolysis, living cells obtain energy (ATP) by breaking down sugar (glucose). Many intermediate reactions and compounds are involved. A model for the dynamics of two of these compounds is

[Figure: glucose, followed by reactions involving X, Y, Z, ..., yielding ATP + products.]

\[
\frac{dx}{dt} = -x + ay + x^2 y, \qquad \frac{dy}{dt} = b - ay - x^2 y, \qquad x|_{t=0} = x_0, \quad y|_{t=0} = y_0, \qquad t \ge 0.
\]

Here $t$ is time, $x$ is the concentration of compound X (adenosine diphosphate), $y$ is the concentration of compound Y (fructose-6-phosphate), and $a$ and $b$ are positive constants that describe the reaction kinetics. Here we show that the system for compounds X and Y always has an equilibrium, and remarkably also has a periodic orbit under certain conditions on $a$ and $b$. The periodic orbit is observed to be asymptotically stable, and attracts solutions from a large neighborhood into a never-ending cycle. All quantities are dimensionless.

(a) Sketch the nullclines and show that the system has only one equilibrium solution $(x_*, y_*)$ for any $a > 0$ and $b > 0$. Explicitly find the equilibrium.
(b) For simplicity, assume $b = 1/2$ is fixed. Determine the stability of the equilibrium in terms of $a > 0$. Show that the equilibrium is unstable for $0 < a < a_\#$, and asymptotically stable for $a > a_\#$, for an appropriate number $a_\#$.
(c) Consider a shaded region $R$ as shown, where the hole is centered at $(x_*, y_*)$. Show that the straight edges of $R$ can be chosen such that the direction field along these edges either points inward or along the edge. Using the result in (b), and the Poincaré–Bendixson theorem, deduce that the system must contain a closed orbit in $R$ for any $0 < a < a_\#$. Give explicit locations for the vertices of $R$. [Hint: consider the nullclines; also, note that $\dot{y} + \dot{x} \le 0$ implies $\frac{dy}{dx} \le -1$ provided $\dot{x} > 0$.] Would the same conclusion hold for $a > a_\#$?

[Figure: shaded region $R$ in the $x, y$-plane.]

(d) Use Matlab or similar software to simulate the system for various $(x_0, y_0)$ in $(0, 2) \times (0, 2)$. Given $b = 0.5$, consider the cases $0 < a < a_\#$ and $a > a_\#$, say $a = 0.01, 0.05, 0.10, 0.15, 0.20, 0.25$, and confirm your results in (b) and (c). What happens to solution curves and the closed orbit as $a$ is changed?

Mini-project 3. Consider a uniform, rectangular rigid body of mass $m$ and edge lengths $a$, $b$, and $c$ as outlined in Section 4.12. In the absence of an applied torque,
the free-spinning motion of the body is described by

[Figure: rectangular body with edges $a$, $b$, $c$, body axes 1, 2, 3, and angular velocity $\omega$.]

\[
\frac{du}{dt} = u \times (Ku), \qquad u|_{t=0} = u_0, \qquad |u_0| = r > 0, \qquad t \ge 0.
\]

In the above, $u = (u_1, u_2, u_3)$ is the angular momentum vector in the body reference frame, $K = \operatorname{diag}(\alpha, \beta, \gamma)$ is a diagonal matrix of inertia parameters, $\times$ is the vector cross product, $|\cdot|$ is the vector magnitude, and $r$ is a given constant. The angular velocity vector is $\omega = Ku$, and the inertia parameters are $\alpha = \frac{12}{m(a^2 + c^2)}$, $\beta = \frac{12}{m(b^2 + c^2)}$ and $\gamma = \frac{12}{m(a^2 + b^2)}$. Notice $a > b > c$ implies $\beta > \alpha > \gamma$. Here we fill in the details omitted in the text. We find all equilibrium solutions and characterize the phase diagram around each. Remarkably, we will see that some spinning motions are unstable, and their repelling character leads to interesting motions as described earlier. The above are called the Euler equations for rigid body rotation.

(a) In components, show that the dynamical system is $\frac{du_1}{dt} = (\gamma - \beta)u_2 u_3$, $\frac{du_2}{dt} = (\alpha - \gamma)u_3 u_1$ and $\frac{du_3}{dt} = (\beta - \alpha)u_1 u_2$. Also, show that every solution $u(t)$ of this system has the property that $u_1^2 + u_2^2 + u_3^2 = \text{constant}$ for all $t \ge 0$. Since $|u_0| = r$, conclude that $|u| = r$ for all time, so every solution is on the sphere of radius $r$ in $u_1, u_2, u_3$-space as illustrated in Figure 4.30.
(b) Show that the system has precisely six equilibrium solutions given by $u_* = (\pm r, 0, 0)$, $(0, \pm r, 0)$ and $(0, 0, \pm r)$; specifically, show that there are no other equilibria.
(c) Let $u(t)$ be an arbitrary solution curve starting near $(0, 0, r)$ on the sphere. Show that the image of this curve in the $u_1, u_2$-plane must be an ellipse around $(0, 0)$, and hence $u(t)$ must be an elliptical curve around $(0, 0, r)$ on the sphere. Thus $(0, 0, r)$ is a neutrally stable center. Explain why a similar result holds for $(0, 0, -r)$ and $(0, \pm r, 0)$.
(d) Let $u(t)$ be an arbitrary solution curve starting near $(r, 0, 0)$ on the sphere. Show that the image of this curve in the $u_2, u_3$-plane must be a hyperbola around $(0, 0)$, and hence $u(t)$ must be a hyperbolic curve around $(r, 0, 0)$ on the sphere. Thus $(r, 0, 0)$ is an unstable saddle. Explain why a similar result holds for $(-r, 0, 0)$.
Chapter 5
Perturbation methods
An important problem in the study of a model is to understand how a solution behaves when a parameter deviates from a reference value. For instance, we may know the solution when the parameter is zero, and seek to understand how the solution is altered when the parameter is nonzero. Alternatively, we may know the solution when the parameter is nonzero, and seek to explore the limit in which the parameter tends to zero. The study of such problems is the subject of perturbation theory. Here we explore cases when the model equation and its solution depend on a parameter in either a regular or singular way. We consider both algebraic and differential equations, and outline various series representation results which can be used to approximate solutions and study their behavior.
5.1. Perturbed equations

We will consider different types of model equations, usually in dimensionless form, that contain a small parameter of interest $\epsilon \ge 0$. Given a solution when $\epsilon = 0$, we may seek to understand how it changes when $\epsilon > 0$. Or given a solution when $\epsilon > 0$, we may seek to understand what happens when $\epsilon \to 0^+$. The restriction on the parameter is purely for convenience. We could also consider an arbitrary reference value $\epsilon = \epsilon_0$, and study the case when $|\epsilon - \epsilon_0|$ is small, or equivalently, when the deviations $\epsilon - \epsilon_0 \ge 0$ and $\epsilon - \epsilon_0 \le 0$ are small. Unless mentioned otherwise, we assume the reference value is zero, and consider the interval $\epsilon \ge 0$.

A system of equations involving a small parameter can be classified as one of two types as defined next. Here we focus on algebraic and differential equations, and note that more refined definitions can be given in more specific contexts.

Definition 5.1.1. A system of equations is called perturbed if it contains a small parameter $\epsilon \ge 0$. It is called regularly perturbed if every solution for $\epsilon > 0$ extends continuously to $\epsilon = 0$; otherwise, it is called singularly perturbed.
Thus a perturbed system can be classified as one of two types depending on properties of its solutions. A system is expected to be regularly perturbed when the number of solutions for $\epsilon > 0$ is the same as for $\epsilon = 0$, so that no solutions are lost or become undefined as the parameter vanishes. In contrast, a system will be singularly perturbed when the number of solutions for $\epsilon > 0$ is different than for $\epsilon = 0$. For systems involving algebraic equations, we consider all possible solutions, real or complex, whereas for systems involving differential equations, we focus only on real solutions. In the definition, note that continuity of a solution in the parameter $\epsilon \ge 0$ would naturally be defined in terms of a norm on the relevant solution space. Various technical conditions guarantee when a system of equations is regularly or singularly perturbed. For simplicity, we will not focus on such conditions, but will instead make assumptions based on observation.

Example 5.1.1. Consider the algebraic equation $x^2 + \epsilon x - 1 = 0$, where $0 \le \epsilon \ll 1$ is a parameter. The equation with $\epsilon > 0$ is quadratic and has two solutions, and the equation with $\epsilon = 0$ is also quadratic and has two solutions. This equation is expected to be regularly perturbed: each solution for $\epsilon > 0$ is expected to extend continuously to $\epsilon = 0$.

Consider now the equation $\epsilon x^2 + x - 1 = 0$, where $0 \le \epsilon \ll 1$. The equation with $\epsilon > 0$ is quadratic and has two solutions, but the equation with $\epsilon = 0$ is linear and has only one solution. This equation is singularly perturbed: some solution for $\epsilon > 0$ does not continue to $\epsilon = 0$.

Example 5.1.2. Consider the initial-value problem $4\frac{d^2x}{dt^2} + \epsilon\frac{dx}{dt} + x = 1$, $\frac{dx}{dt}|_{t=0} = 2$, $x|_{t=0} = 0$, $t \ge 0$, where $0 \le \epsilon \ll 1$ is a parameter. When $\epsilon > 0$, the system consists of a second-order differential equation with two initial conditions and has a unique solution. When $\epsilon = 0$, the system has the same form and again has a unique solution. This system is expected to be regularly perturbed: the solution for $\epsilon > 0$ is expected to extend continuously to $\epsilon = 0$.

Consider now the problem $\epsilon\frac{d^2x}{dt^2} + 4\frac{dx}{dt} + x = 1$, $\frac{dx}{dt}|_{t=0} = 2$, $x|_{t=0} = 0$, $t \ge 0$, where $0 \le \epsilon \ll 1$. When $\epsilon > 0$, the system has the same form as above and has a unique solution. However, when $\epsilon = 0$, the system consists of a first-order differential equation with two initial conditions, and has no solution; specifically, the two initial conditions cannot be satisfied with the single arbitrary constant of integration from the differential equation. This system is singularly perturbed: the solution for $\epsilon > 0$ does not continue to $\epsilon = 0$.
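One way to see the loss of an integration constant in the second problem of Example 5.1.2 is through the characteristic equation $\epsilon r^2 + 4r + 1 = 0$ of the homogeneous part, whose roots $r$ give the modes $e^{rt}$. The sketch below is only an illustration and is not part of the text's development; the function name `characteristic_roots` is hypothetical. It suggests that one root stays near $-1/4$ while the other behaves like $-4/\epsilon$, so the corresponding mode has no counterpart when $\epsilon = 0$.

```python
import math

def characteristic_roots(eps):
    """Roots r of eps*r**2 + 4*r + 1 = 0 for eps > 0, returned as (slow, fast)."""
    disc = math.sqrt(16.0 - 4.0 * eps)
    fast = (-4.0 - disc) / (2.0 * eps)   # this root grows like -4/eps
    slow = 1.0 / (eps * fast)            # product of the two roots equals 1/eps
    return slow, fast

for eps in (1e-1, 1e-2, 1e-3):
    slow, fast = characteristic_roots(eps)
    print(f"eps={eps:g}  slow={slow:.6f}  fast={fast:.3f}  eps*fast={eps * fast:.6f}")
```

The slow root is computed from the product of the roots rather than from the quadratic formula, which avoids subtracting two nearly equal numbers when $\epsilon$ is small.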
5.2. Regular versus singular behavior

Given a perturbed equation with a small parameter $\epsilon \ge 0$, we seek to understand the behavior of its solutions near $\epsilon = 0$. For a large class of equations this behavior can be studied using series expansions. The following simple examples illustrate how both regular and singular behavior can be exposed with a series. The construction of expansions for more general problems will be the subject of this chapter.
Example 5.2.1. (regular case). Consider $x^2 + \epsilon x - 1 = 0$, where $0 \le \epsilon \ll 1$. The exact solutions are

(5.1)  $x = \dfrac{-\epsilon \pm [\epsilon^2 + 4]^{1/2}}{2}, \qquad \epsilon \ge 0.$

To illustrate the behavior of these solutions for small values of the parameter, we simplify the term involving the square root. Specifically, the function $f(\epsilon) = [\epsilon^2 + 4]^{1/2}$ has a Taylor series at $\epsilon = 0$ with some positive radius of convergence, and the first few terms, omitting those that are zero, are

(5.2)  $f(\epsilon) = \sum_{n=0}^{\infty} \dfrac{\epsilon^n f^{(n)}(0)}{n!} = 2 + \dfrac{\epsilon^2}{4} - \dfrac{\epsilon^4}{64} + \cdots.$

Substituting (5.2) into (5.1), we get the two roots

(5.3)  $x_+(\epsilon) = 1 - \dfrac{\epsilon}{2} + \dfrac{\epsilon^2}{8} - \dfrac{\epsilon^4}{128} + \cdots, \qquad \epsilon \ge 0,$
       $x_-(\epsilon) = -1 - \dfrac{\epsilon}{2} - \dfrac{\epsilon^2}{8} + \dfrac{\epsilon^4}{128} + \cdots, \qquad \epsilon \ge 0.$
Thus there are two solutions $x_\pm(\epsilon)$ for each value of $\epsilon \ge 0$, and each solution corresponds to a curve in the $\epsilon,x$-plane as illustrated in Figure 5.1.

[Figure 5.1. Axes $\epsilon$ and $x$; curves $x_+(\epsilon)$ and $x_-(\epsilon)$, with the values $x = 1$ and $x = -1$ marked on the $x$-axis.]

Since each solution for $\epsilon > 0$ extends continuously to $\epsilon = 0$, the system is regularly perturbed as expected. The series expansions inform us that $x_\pm(\epsilon) \to \pm 1$ as $\epsilon \to 0^+$, and moreover that $x_+(\epsilon) < 1$ and $x_-(\epsilon) < -1$ for small $\epsilon > 0$; specifically, the slope and concavity of each solution curve at $\epsilon = 0$ are $x_+'(0) = -\frac{1}{2}$, $x_+''(0) = \frac{2}{8}$ and $x_-'(0) = -\frac{1}{2}$, $x_-''(0) = -\frac{2}{8}$.

Example 5.2.2. (singular case). Consider $\epsilon x^2 + x - 1 = 0$, where $0 \le \epsilon \ll 1$. The exact solutions are

(5.4)  $x = \dfrac{-1 \pm [1 + 4\epsilon]^{1/2}}{2\epsilon}, \qquad \epsilon > 0,$
       $x = 1, \qquad \epsilon = 0.$
To illustrate the behavior of these solutions for small values of the parameter, we again simplify the term involving the square root. Specifically, the function $g(\epsilon) = [1 + 4\epsilon]^{1/2}$ has a Taylor series at $\epsilon = 0$ with some positive radius of convergence, and the first few terms are

(5.5)  $g(\epsilon) = \sum_{n=0}^{\infty} \dfrac{\epsilon^n g^{(n)}(0)}{n!} = 1 + 2\epsilon - 2\epsilon^2 + 4\epsilon^3 + \cdots.$

Substituting (5.5) into (5.4), we get the solutions

(5.6)  $x_+(\epsilon) = 1 - \epsilon + 2\epsilon^2 + \cdots, \qquad \epsilon > 0,$
       $x_-(\epsilon) = -\dfrac{1}{\epsilon} - 1 + \epsilon - 2\epsilon^2 + \cdots, \qquad \epsilon > 0,$
       $x = 1, \qquad \epsilon = 0.$
Thus there are two solutions $x_\pm(\epsilon)$ for each value of $\epsilon > 0$, and only one solution $x = 1$ when $\epsilon = 0$. As before, each of the solutions can be plotted in the $\epsilon,x$-plane as illustrated in Figure 5.2.

[Figure 5.2. Axes $\epsilon$ and $x$; curves $x_+(\epsilon)$ and $x_-(\epsilon)$, with the value $x = 1$ marked on the $x$-axis.]

Since two solutions are defined for $\epsilon > 0$, but only one extends continuously to $\epsilon = 0$, the system is singularly perturbed. The series expansion for $x_+(\epsilon)$ informs us that this solution converges to the single solution $x = 1$ at $\epsilon = 0$, that is, $x_+(\epsilon) \to 1$ as $\epsilon \to 0^+$. In contrast, the expansion for $x_-(\epsilon)$ informs us that this solution becomes undefined as the parameter vanishes, that is, $x_-(\epsilon) \to -\infty$ as $\epsilon \to 0^+$. Such unbounded behavior is typical of a singularly perturbed algebraic equation; this is how the number of solutions can change between $\epsilon > 0$ and $\epsilon = 0$.
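The expansions in Examples 5.2.1 and 5.2.2 can be checked numerically against the exact roots. The following is a minimal sketch; the function names are hypothetical, and each series is truncated after the $\epsilon^2$ term as an illustrative choice.

```python
import math

def exact_regular(eps):
    """Exact roots of x**2 + eps*x - 1 = 0, from (5.1)."""
    s = math.sqrt(eps**2 + 4.0)
    return (-eps + s) / 2.0, (-eps - s) / 2.0

def series_regular(eps):
    """Series (5.3) truncated after the eps**2 term."""
    return 1.0 - eps / 2.0 + eps**2 / 8.0, -1.0 - eps / 2.0 - eps**2 / 8.0

def exact_singular(eps):
    """Exact roots of eps*x**2 + x - 1 = 0 for eps > 0, from (5.4)."""
    s = math.sqrt(1.0 + 4.0 * eps)
    x_plus = 2.0 / (1.0 + s)              # same value as (-1 + s)/(2*eps), less cancellation
    x_minus = (-1.0 - s) / (2.0 * eps)
    return x_plus, x_minus

def series_singular(eps):
    """Series (5.6) truncated after the eps**2 term."""
    return 1.0 - eps + 2.0 * eps**2, -1.0 / eps - 1.0 + eps - 2.0 * eps**2

for eps in (1e-1, 1e-2, 1e-3):
    reg_err = max(abs(e - s) for e, s in zip(exact_regular(eps), series_regular(eps)))
    sing_err = max(abs(e - s) for e, s in zip(exact_singular(eps), series_singular(eps)))
    x_minus = exact_singular(eps)[1]
    print(f"eps={eps:g}  regular error={reg_err:.2e}  singular error={sing_err:.2e}  x_minus={x_minus:.3f}")
```

In both cases the truncation error shrinks as $\epsilon$ decreases, while the printed values of $x_-$ for the singular equation grow in magnitude, consistent with $x_-(\epsilon) \to -\infty$.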
5.3. Assumptions, analytic functions

The class of problems that we consider will be algebraic or differential equations such as

(5.7)  $F(x, \epsilon) = 0, \quad \text{or} \quad \dfrac{dx}{dt} = F(t, x, \epsilon), \quad x|_{t=0} = x_\#, \quad t \ge 0.$

We will also consider related problems of a similar form. A basic assumption that we will make is that the function $F(x, \epsilon)$ or $F(t, x, \epsilon)$ is analytic at some given point as defined next.

Definition 5.3.1. A function $F(t, x, \epsilon)$ is called analytic at $(t_\#, x_\#, \epsilon_\#)$ if it has a convergent power series expansion around that point, that is

(5.8)  $F(t, x, \epsilon) = \sum_{i,j,k=0}^{\infty} c_{ijk}\,(t - t_\#)^i (x - x_\#)^j (\epsilon - \epsilon_\#)^k, \qquad |t - t_\#| < r_1, \quad |x - x_\#| < r_2, \quad |\epsilon - \epsilon_\#| < r_3,$

for some coefficients $c_{ijk}$ and radii $r_1, r_2, r_3 > 0$.
Note that the same definition applies to real- or complex-valued functions, of real or complex variables, and a similar definition holds in the case of an arbitrary number of variables. Also, any analytic real-valued function of real variables can locally be extended to an analytic complex-valued function of complex variables; the two functions would share the same series coefficients. Thus, when a function is stated to be analytic at some given point, we will always have in mind a complex neighborhood of that point, whether the given point is real or complex. Any conditions or operations on the function will be understood within the context of this neighborhood.

A large class of functions that arise in applications are analytic at most points. For instance, elementary functions of a single variable of the polynomial, rational, exponential, log, root, and trigonometric types are analytic at every point in the interior of their domains. Moreover, sums, products, quotients, and compositions of analytic functions are also analytic, as well as derivatives and antiderivatives. Furthermore, a function of multiple variables is analytic at every point in the interior of its domain if it is continuous in all variables jointly, and is analytic in each variable separately, for arbitrary fixed values of the other variables.

The definition of derivative for a function of a complex variable has the same form as that for a real variable. Thus the usual differentiation formulas for the elementary functions are still valid when the variables are complex, as well as those for sums, products, quotients, and compositions.

Example 5.3.1. Let $F(x, \epsilon) = \dfrac{\epsilon x^2 + 5x + \epsilon^3}{\sin(1 + \epsilon x)}$, $G(t, x, \epsilon) = (4x^2 + t^2 x)e^{\epsilon x + t\epsilon^2}$, $J(x) = \sqrt{x}$, and $K(x) = \ln x$. The function $F$ is analytic at every real or complex pair $(x, \epsilon)$ with $\sin(1 + \epsilon x) \ne 0$, whereas $G$ is analytic at every real or complex triple $(t, x, \epsilon)$. The functions $J$ and $K$ are analytic at every real $x > 0$, and more generally in a neighborhood of every complex $x \ne 0$. Whether $x$ is real or complex we have $J'(x) = \dfrac{1}{2\sqrt{x}}$ and $K'(x) = \dfrac{1}{x}$.
5.4. Notation, order symbols

The following notation and definition of the order symbols $O$ (big-o) and $o$ (little-o) will be useful in discussing the behavior of a function $f(\epsilon)$ in the limit $\epsilon \to 0^+$.

Definition 5.4.1. Let an exponent $p \ge 0$ and a function $f(\epsilon)$ be given. Assume $\hat{f}(\epsilon) = f(\epsilon)/\epsilon^p$ is continuous for $\epsilon \in (0, 1]$ and note $f(\epsilon) = \hat{f}(\epsilon)\epsilon^p$.

(1) If $\hat{f}(\epsilon)$ is bounded for $\epsilon \in (0, 1]$, then we say $f(\epsilon)$ is order $O(\epsilon^p)$.

(2) If $\hat{f}(\epsilon) \to 0$ as $\epsilon \to 0^+$, then we say $f(\epsilon)$ is order $o(\epsilon^p)$.

Thus a function $f(\epsilon)$ may be order $O(\epsilon^p)$ or $o(\epsilon^p)$ (or both) depending on the behavior of $\hat{f}(\epsilon)$. To indicate the order of $f(\epsilon)$ we use the notation $f(\epsilon) = O(\epsilon^p)$ or $f(\epsilon) = o(\epsilon^p)$. From the definition, $f(\epsilon) = O(\epsilon^p)$ when there is a constant $C \ge 0$ such that $|f(\epsilon)| \le C\epsilon^p$ for $\epsilon \in (0, 1]$, in which case $f(\epsilon) \to 0$ at least as fast as $\epsilon^p \to 0^+$. Alternatively, $f(\epsilon) = o(\epsilon^p)$ when $f(\epsilon) \to 0$ at a strictly faster rate than $\epsilon^p \to 0^+$. The notation $f(\epsilon) = g(\epsilon) + O(\epsilon^p)$ signifies that $f(\epsilon) - g(\epsilon) = O(\epsilon^p)$, and similarly for $o(\epsilon^p)$. The order symbols are often used to describe the individual terms of a series, or quantify the size of the remainder when a series is truncated. In the definition, the interval $(0, 1]$ could be replaced with $(0, a]$ for any fixed $a > 0$.

Example 5.4.1. Consider $f(\epsilon) = 4\epsilon^2$, $g(\epsilon) = 5\epsilon^5 - 2\epsilon^3$ and $h(\epsilon) = \epsilon^2\ln(\epsilon)$. By definition, we find $f(\epsilon) = O(\epsilon^2)$ and $g(\epsilon) = O(\epsilon^3)$, and no higher exponents can be used in each of these statements. Note that $h(\epsilon) \ne O(\epsilon^2)$, since $\ln(\epsilon)$ is unbounded for $\epsilon \in (0, 1]$. However, we find $h(\epsilon) = O(\epsilon)$ and more accurately $h(\epsilon) = o(\epsilon)$, since $\epsilon\ln(\epsilon)$ is bounded for $\epsilon \in (0, 1]$ and vanishes as $\epsilon \to 0^+$. In fact, $h(\epsilon) = o(\epsilon^p)$ for any $0 \le p < 2$.
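Example 5.4.2. Consider $f(\epsilon) = e^{\epsilon}$ and $g(\epsilon) = 1 + \epsilon + \frac{1}{2}\epsilon^2$. Note that $g(\epsilon)$ is a truncation of the power series for $f(\epsilon)$, that is

(5.9)  $f(\epsilon) = \sum_{n=0}^{\infty} \dfrac{\epsilon^n}{n!}, \qquad g(\epsilon) = \sum_{n=0}^{2} \dfrac{\epsilon^n}{n!}.$

The difference or remainder $R(\epsilon) = f(\epsilon) - g(\epsilon)$ has the property that

(5.10)  $R(\epsilon) = \sum_{n=3}^{\infty} \dfrac{\epsilon^n}{n!} = \epsilon^3 \sum_{n=0}^{\infty} \dfrac{\epsilon^n}{(n+3)!} = \epsilon^3 \hat{R}(\epsilon).$

Since $\hat{R}(\epsilon)$ converges for $\epsilon \in (-\infty, \infty)$, it is continuous and bounded on any finite interval, and it follows that $R(\epsilon) = O(\epsilon^3)$. This result can be written as $f(\epsilon) = g(\epsilon) + O(\epsilon^3)$, or equivalently $e^{\epsilon} = 1 + \epsilon + \frac{1}{2}\epsilon^2 + O(\epsilon^3)$.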
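The order statements in Examples 5.4.1 and 5.4.2 can be probed numerically by evaluating the scaled functions of Definition 5.4.1. The sketch below is illustrative only: the names `R_hat` and `h` are assumptions, and the series for the scaled remainder is cut off after 30 terms, which is more than enough for $|\epsilon| \le 1$.

```python
import math

def R_hat(eps, terms=30):
    """Scaled remainder from (5.10): sum of eps**n / (n+3)! over n >= 0, truncated."""
    return sum(eps**n / math.factorial(n + 3) for n in range(terms))

def h(eps):
    """The function h(eps) = eps**2 * ln(eps) from Example 5.4.1."""
    return eps**2 * math.log(eps)

for eps in (1.0, 1e-1, 1e-2, 1e-3):
    print(f"eps={eps:g}  R_hat={R_hat(eps):.6f}  h/eps={h(eps) / eps:.3e}  h/eps^2={h(eps) / eps**2:.3f}")
```

The values of `R_hat` stay bounded and approach $1/6$, consistent with $R(\epsilon) = O(\epsilon^3)$. The ratio $h(\epsilon)/\epsilon$ tends to zero while $h(\epsilon)/\epsilon^2 = \ln(\epsilon)$ grows without bound in magnitude, consistent with $h(\epsilon) = o(\epsilon)$ but $h(\epsilon) \ne O(\epsilon^2)$.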
5.5. Regular algebraic case

Consider a regularly perturbed algebraic equation of the form

(5.11)  $F(x, \epsilon) = 0, \qquad 0 \le \epsilon \ll 1.$

Let $x(\epsilon)$ be a solution or root, which may be real or complex, and let $x_0 = x(0)$ as illustrated conceptually in Figure 5.3. We seek an expression for $x(\epsilon)$.

[Figure 5.3. Axes $\epsilon$ and $x$; a solution curve $x(\epsilon)$ starting from the value $x_0$ at $\epsilon = 0$.]

The following result says that, provided the function $F(x, \epsilon)$ can be expressed as a series, then so can the solution $x(\epsilon)$. We only consider a single equation with a single unknown; results for systems of equations with multiple unknowns are more involved. Below we use the notation $\not\equiv 0$ to mean not identically zero.

Result 5.5.1. Assume $F(x, 0) \not\equiv 0$. Let $x_0$ be given such that $F(x_0, 0) = 0$. If $F(x, \epsilon)$ is analytic at $(x_0, 0)$, and $\frac{\partial F}{\partial x}(x_0, 0) \ne 0$ or $\frac{\partial F}{\partial \epsilon}(x_0, 0) \ne 0$, then a solution curve $x(\epsilon)$ of (5.11) exists. The curve $x(\epsilon)$ can be written as a series, where the form of the series depends on $x_0$.
(1) If $x_0$ is a simple root, then

(5.12)  $x(\epsilon) = x_0 + \epsilon x_1 + \epsilon^2 x_2 + \cdots.$

(2) If $x_0$ is a repeated root of order $m$, then

(5.13)  $x(\epsilon) = x_0 + \epsilon^{\alpha} x_1 + \epsilon^{2\alpha} x_2 + \cdots,$

where $\alpha = \frac{1}{m}$. Using the substitution $\delta = \epsilon^{\alpha}$, we may instead consider

(5.14)  $x(\delta) = x_0 + \delta x_1 + \delta^2 x_2 + \cdots.$

The series in (5.12) and (5.14) converge for $\epsilon \in [0, \rho)$ for some $\rho > 0$. The coefficients $x_n$, $n \ge 0$ can be found from (5.11).

Thus a solution curve $x(\epsilon)$ will exist, and have a series expansion around a given root $x = x_0$ at $\epsilon = 0$, provided that the function $F(x, \epsilon)$ is analytic at the starting point $(x_0, 0)$, and satisfies a nondegeneracy condition $\frac{\partial F}{\partial x}(x_0, 0) \ne 0$ or $\frac{\partial F}{\partial \epsilon}(x_0, 0) \ne 0$, along with $F(x, 0) \not\equiv 0$. In both the simple and repeated cases a solution curve $x(\epsilon)$ has the property that $F(x(\epsilon), \epsilon) = 0$ for all $\epsilon \in [0, \rho)$. The series in (5.12) is a standard power series involving only nonnegative, integer powers of $\epsilon$. The series in (5.14) is a generalized power series that involves nonnegative, but fractional powers of $\epsilon$; such a series is called a Puiseux series. An expansion in the form of a Puiseux series still exists when the nondegeneracy conditions on the derivatives do not hold, but the identification of $\alpha > 0$ is more involved as discussed later. By a perturbation approximation of a solution up to order $O(\epsilon^n)$ we mean a truncated series with all terms up to and including $\epsilon^n$.

Note that the above result is entirely local. The existence and form of a solution curve $x(\epsilon)$ depends only on properties of the function $F(x, \epsilon)$ at the starting root $x = x_0$ at $\epsilon = 0$, and does not involve information about any other points. Equation (5.11) will indeed be regularly perturbed when each of its solution curves has a starting point which satisfies the conditions of Result 5.5.1. The proof of the result shows that, when $x_0$ is a simple root, there is a unique solution curve that extends from the point $(x_0, 0)$, and when $x_0$ is a repeated root of order $m$, there are $m$ solution curves that extend from $(x_0, 0)$. In the simple case, the curve $x(\epsilon)$ is analytic and hence differentiable to all orders in $\epsilon$ at any point within its interval of convergence. In contrast, in the repeated case, the curves $x(\epsilon)$ are not analytic in $\epsilon$ due to the fractional powers, but instead the reparameterized curves $x(\delta)$ are analytic and hence differentiable to all orders in $\delta$. For convenience, we use the same symbol $x$ for both the original and reparameterized functions.

Example 5.5.1. Consider

(5.15)  $x^4 - x^2 + 2\epsilon^2 x + 6\epsilon = 0, \qquad 0 \le \epsilon \ll 1.$
This equation has four roots when $\epsilon > 0$ and also when $\epsilon = 0$. Here we seek an expansion for each of the roots $x(\epsilon)$. Note that $F(x, \epsilon) = x^4 - x^2 + 2\epsilon^2 x + 6\epsilon$ is analytic at all points $(x, \epsilon)$, and that $F(x, 0) = x^4 - x^2 \not\equiv 0$.

Root types at $\epsilon = 0$. To determine the form of the expansions, we first examine the multiplicity of the roots $x_0 = x(0)$. From (5.15) with $\epsilon = 0$ we get $x_0^4 - x_0^2 = 0$, which has the four roots $x_0 = 1, -1, 0, 0$. Thus there are two simple roots, and a repeated root of multiplicity $m = 2$. For each simple root we find $\frac{\partial F}{\partial x}(x_0, 0) = 4x_0^3 - 2x_0 \ne 0$, and for the repeated root we find $\frac{\partial F}{\partial \epsilon}(x_0, 0) = 6 \ne 0$. Thus, for each given $x_0$, a nondegeneracy condition holds and Result 5.5.1 can be applied.

Expansion of simple roots. To each simple root $x_0$ at $\epsilon = 0$ there is a corresponding solution curve $x(\epsilon)$ for $\epsilon \ge 0$, which has the expansion

(5.16)  $x(\epsilon) = x_0 + \epsilon x_1 + \epsilon^2 x_2 + \epsilon^3 x_3 + \cdots, \qquad x_0 = 1, -1.$

For future reference, note that

(5.17)  $x'(\epsilon) = 0 + x_1 + 2\epsilon x_2 + 3\epsilon^2 x_3 + \cdots,$
        $x''(\epsilon) = 0 + 0 + 2x_2 + 6\epsilon x_3 + \cdots.$

Substituting $x(\epsilon)$ into (5.15) we get

(5.18)  $a(\epsilon) - b(\epsilon) + c(\epsilon) + d(\epsilon) = 0,$

where $a(\epsilon) = x^4(\epsilon)$, $b(\epsilon) = x^2(\epsilon)$, $c(\epsilon) = 2\epsilon^2 x(\epsilon)$ and $d(\epsilon) = 6\epsilon$. We next expand each of these terms in powers of $\epsilon$. For $c(\epsilon)$ and $d(\epsilon)$, the expansions are straightforward, and there is no need to use Taylor's formula; we get

(5.19)  $d(\epsilon) = 6\epsilon = 0 + 6\epsilon + 0\epsilon^2 + 0\epsilon^3 + \cdots,$
        $c(\epsilon) = 2\epsilon^2 x(\epsilon) = 2\epsilon^2 x_0 + 2\epsilon^3 x_1 + 2\epsilon^4 x_2 + \cdots.$
For $b(\epsilon) = x^2(\epsilon)$ we may use Taylor's formula to find the first few terms in the expansion. Using the chain rule as needed for derivatives, together with (5.17), we get

(5.20)  $b(\epsilon) = b(0) + \epsilon b'(0) + \dfrac{\epsilon^2}{2} b''(0) + \cdots$
        $= [x^2(\epsilon)]_{\epsilon=0} + \epsilon[2x(\epsilon)x'(\epsilon)]_{\epsilon=0} + \dfrac{\epsilon^2}{2}[2x'(\epsilon)x'(\epsilon) + 2x(\epsilon)x''(\epsilon)]_{\epsilon=0} + \cdots$
        $= [x_0^2] + \epsilon[2x_0 x_1] + \dfrac{\epsilon^2}{2}[2x_1^2 + 4x_0 x_2] + \cdots.$

Similarly, for $a(\epsilon) = x^4(\epsilon)$, we get

(5.21)  $a(\epsilon) = a(0) + \epsilon a'(0) + \dfrac{\epsilon^2}{2} a''(0) + \cdots$
        $= [x^4(\epsilon)]_{\epsilon=0} + \epsilon[4x^3(\epsilon)x'(\epsilon)]_{\epsilon=0} + \dfrac{\epsilon^2}{2}[12x^2(\epsilon)(x'(\epsilon))^2 + 4x^3(\epsilon)x''(\epsilon)]_{\epsilon=0} + \cdots$
        $= [x_0^4] + \epsilon[4x_0^3 x_1] + \dfrac{\epsilon^2}{2}[12x_0^2 x_1^2 + 8x_0^3 x_2] + \cdots.$

Substituting the expansions (5.19)-(5.21) into (5.18) and collecting terms by powers of $\epsilon$ we get

(5.22)  $[x_0^4 - x_0^2] + \epsilon[4x_0^3 x_1 - 2x_0 x_1 + 6] + \epsilon^2[6x_0^2 x_1^2 + 4x_0^3 x_2 - x_1^2 - 2x_0 x_2 + 2x_0] + \cdots = 0.$
In order for the above equation to hold for arbitrary values of $0 \le \epsilon \ll 1$, the coefficient of each power of $\epsilon$ must vanish. Setting each coefficient to zero we get the following.

$\epsilon^0$. $x_0^4 - x_0^2 = 0$. This is satisfied by the simple roots under consideration, which are $x_0 = 1, -1$.

$\epsilon^1$. $4x_0^3 x_1 - 2x_0 x_1 + 6 = 0$. For each value of $x_0$, we get a corresponding value of $x_1$, specifically $x_1 = -\dfrac{3}{2x_0^3 - x_0} = -3, 3$.

$\epsilon^2$. $6x_0^2 x_1^2 + 4x_0^3 x_2 - x_1^2 - 2x_0 x_2 + 2x_0 = 0$. For each matching pair of $x_0$ and $x_1$, we get a corresponding value of $x_2$, specifically $x_2 = -\frac{47}{2}, \frac{43}{2}$.

The above process can be continued up to any desired power of $\epsilon$. Based on our calculations, the first three terms in the expansion of the two simple roots are

(5.23)  $x^{(1)}(\epsilon) = 1 - 3\epsilon - \frac{47}{2}\epsilon^2 + \cdots,$
        $x^{(2)}(\epsilon) = -1 + 3\epsilon + \frac{43}{2}\epsilon^2 + \cdots.$
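The order-by-order calculation for the simple roots can be mechanized with exact rational arithmetic. The sketch below is one possible arrangement under stated assumptions, not the procedure used in the text: series are stored as coefficient lists truncated at `ORDER`, and the helper names `mul`, `residual`, and `expand_simple_root` are hypothetical. It relies on the fact that, at order $\epsilon^k$ with $k \ge 1$, the unknown $x_k$ enters linearly with coefficient $\frac{\partial F}{\partial x}(x_0, 0) = 4x_0^3 - 2x_0$.

```python
from fractions import Fraction

ORDER = 2  # highest power of eps that is kept

def mul(a, b):
    """Product of two truncated series given as coefficient lists in eps."""
    c = [Fraction(0)] * (ORDER + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j <= ORDER:
                c[i + j] += ai * bj
    return c

def residual(x):
    """Coefficients of x**4 - x**2 + 2*eps**2*x + 6*eps with x = x(eps) substituted."""
    x2 = mul(x, x)
    x4 = mul(x2, x2)
    eps = [Fraction(0), Fraction(1)] + [Fraction(0)] * (ORDER - 1)
    eps2_x = mul(mul(eps, eps), x)
    return [x4[k] - x2[k] + 2 * eps2_x[k] + 6 * eps[k] for k in range(ORDER + 1)]

def expand_simple_root(x0):
    """Coefficients [x0, x1, ..., x_ORDER] of the series for a simple root x0."""
    x = [Fraction(x0)] + [Fraction(0)] * ORDER
    slope = 4 * x[0] ** 3 - 2 * x[0]      # dF/dx at (x0, 0), nonzero for a simple root
    for k in range(1, ORDER + 1):
        # With x_k still zero, the eps**k residual holds every term except slope * x_k.
        x[k] = -residual(x)[k] / slope
    return x

for x0 in (1, -1):
    print(x0, [str(c) for c in expand_simple_root(x0)])
```

Raising `ORDER` would produce further coefficients by the same loop, at the cost of longer series products.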
Each series is guaranteed to converge for $\epsilon \in [0, \rho)$ for some $\rho > 0$. If the series are truncated, then the three terms shown for each root would be called a perturbation approximation up to order $O(\epsilon^2)$. Such approximations can be used to estimate values of the roots for any given $\epsilon$, provided it is sufficiently small.

Expansion of repeated roots. To each repeated root $x_0$ at $\epsilon = 0$ there is a corresponding solution curve $x(\epsilon)$ for $\epsilon \ge 0$; that is, if $x_0$ has multiplicity $m$, then there will be $m$ solution curves $x(\epsilon)$. The series expansions of these curves can be found by using the substitution $\delta = \epsilon^{\alpha}$ where $\alpha = \frac{1}{m}$. In the present case, we have $m = 2$ corresponding to the double root $x_0 = 0, 0$. Substituting $\delta = \epsilon^{1/2}$ or $\epsilon = \delta^2$ into the original equation (5.15) we get

(5.24)  $x^4 - x^2 + 2\delta^4 x + 6\delta^2 = 0, \qquad 0 \le \delta \ll 1,$

and we consider expansions of the form

(5.25)  $x(\delta) = x_0 + \delta x_1 + \delta^2 x_2 + \delta^3 x_3 + \cdots, \qquad x_0 = 0, 0.$

Proceeding just as before, we substitute (5.25) into (5.24), and expand all terms in powers of $\delta$. Collecting terms, we obtain

(5.26)  $[x_0^4 - x_0^2] + \delta[4x_0^3 x_1 - 2x_0 x_1] + \delta^2[6x_0^2 x_1^2 + 4x_0^3 x_2 - x_1^2 - 2x_0 x_2 + 6] + \cdots = 0.$
In order for the above equation to hold for arbitrary values of $0 \le \delta \ll 1$, the coefficient of each power of $\delta$ must vanish. Setting each coefficient to zero we get the following.

$\delta^0$. $x_0^4 - x_0^2 = 0$. This is satisfied by the double root under consideration, which is $x_0 = 0, 0$.

$\delta^1$. $4x_0^3 x_1 - 2x_0 x_1 = 0$. For each value $x_0 = 0, 0$, this equation becomes empty ($0 = 0$), and gives no information on $x_1$.

$\delta^2$. $6x_0^2 x_1^2 + 4x_0^3 x_2 - x_1^2 - 2x_0 x_2 + 6 = 0$. For each value $x_0 = 0, 0$, this equation becomes $-x_1^2 + 6 = 0$, which yields $x_1 = \sqrt{6}, -\sqrt{6}$.

Similar to before, the above process can be continued up to any desired power of $\delta$. Our calculations thus far show that the first two terms in the expansion of the two repeated roots are

(5.27)  $x^{(3)}(\delta) = 0 + \sqrt{6}\,\delta + \cdots,$
        $x^{(4)}(\delta) = 0 - \sqrt{6}\,\delta + \cdots.$
The above expansions can be expressed in terms of the original parameter using the relation $\delta = \epsilon^{1/2}$, and each series is guaranteed to converge for $\epsilon \in [0, \rho)$ for some $\rho > 0$. If the series are truncated, then the terms shown form a perturbation approximation up to order $O(\delta)$, or equivalently $O(\epsilon^{1/2})$. For convenience, we use the same symbols $x^{(3)}$ and $x^{(4)}$, whether these curves are expressed as functions of $\delta$ or $\epsilon$.

[Figure 5.4. Axes $\epsilon$ and $x$; curves $x^{(1)}(\epsilon)$, $x^{(2)}(\epsilon)$, $x^{(3)}(\epsilon)$ and $x^{(4)}(\epsilon)$, with the values $x = 1$, $0$ and $-1$ marked on the $x$-axis.]
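As a rough numerical check of the approximations (5.23) and (5.27), each truncated series can be used as a starting guess for Newton's method applied to (5.15), and the converged root compared with the guess. This is a sketch under stated assumptions: Newton's method is a swapped-in root finder rather than anything used in the text, and the names `newton` and `approximations` are hypothetical.

```python
import math

def F(x, eps):
    """Left-hand side of (5.15)."""
    return x**4 - x**2 + 2 * eps**2 * x + 6 * eps

def dF(x, eps):
    """Derivative of F with respect to x."""
    return 4 * x**3 - 2 * x + 2 * eps**2

def newton(x, eps, steps=50):
    """Refine a starting guess x toward a nearby root of F(., eps)."""
    for _ in range(steps):
        x -= F(x, eps) / dF(x, eps)
    return x

def approximations(eps):
    """Truncated series (5.23) and (5.27), with delta = eps**0.5."""
    delta = math.sqrt(eps)
    return [1 - 3 * eps - 47 / 2 * eps**2,
            -1 + 3 * eps + 43 / 2 * eps**2,
            math.sqrt(6) * delta,
            -math.sqrt(6) * delta]

for eps in (1e-2, 1e-3, 1e-4):
    errors = [abs(newton(a, eps) - a) for a in approximations(eps)]
    print(f"eps={eps:g}  " + "  ".join(f"{e:.2e}" for e in errors))
```

The errors for the two simple roots should shrink much faster than those for the double root, since the former approximations keep terms through $\epsilon^2$ while the latter keep only the $\epsilon^{1/2}$ term.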
Picture of solutions. We now have a picture of the four roots of the equation $x^4 - x^2 + 2\epsilon^2 x + 6\epsilon = 0$ in terms of the parameter $0 \le \epsilon \ll 1$. The roots $x^{(1)}(\epsilon)$ and $x^{(2)}(\epsilon)$ are simple at $\epsilon = 0$, and have a finite slope and concavity at $\epsilon = 0$, which guide the direction of each solution curve for small values of $\epsilon$. In contrast, the roots $x^{(3)}(\epsilon)$ and $x^{(4)}(\epsilon)$ come together and form a double root at $\epsilon = 0$, have an infinite slope at $\epsilon = 0$, and exhibit a square-root type behavior for small values of $\epsilon$. The qualitative behavior of the roots is illustrated graphically in Figure 5.4. Note that such an illustration would be difficult in cases where the roots are complex.

Sketch of proof: Result 5.5.1. For convenience, we assume that the given root is $x_0 = 0$. This value of $x_0$ can always be arranged by shifting $x$, and redefining $F(x, \epsilon)$. Since $F(0, 0) = 0$, but $F(x, 0) \not\equiv 0$, a series in powers of $x$ gives $F(x, 0) = x^m g(x)$, where $m \ge 1$ is the multiplicity of the root, and $g(x)$ is analytic with $g(0) \ne 0$. Thus a simple root ($m = 1$) must have $\frac{\partial F}{\partial x}(0, 0) \ne 0$, and a repeated root ($m \ge 2$) must have $\frac{\partial F}{\partial x}(0, 0) = 0$.

To begin, we suppose that $x_0$ is a simple root. In this case, we must have $\frac{\partial F}{\partial x}(0, 0) \ne 0$, and an analytic version of the implicit function theorem guarantees the existence of a unique power series $x(\epsilon) = \sum_{n \ge 1} x_n \epsilon^n$, with positive radius of convergence $\rho$, such that $F(x(\epsilon), \epsilon) = 0$ for all $|\epsilon| < \rho$. Thus (5.11) has a solution curve that extends from $(0, 0)$, and there is only one such curve. Moreover, by removing any shift as described above, this curve can be written as in (5.12).
We next suppose that $x_0$ is a repeated root. In this case, we must have $\frac{\partial F}{\partial x}(0, 0) = 0$, and the implicit function theorem is no longer applicable. Thus we turn to a different result called the Weierstrass preparation theorem. This theorem guarantees that, in a neighborhood of $(0, 0)$, we have the factorization

(5.28)  $F(x, \epsilon) = q(x, \epsilon)[x^m + p_{m-1}(\epsilon)x^{m-1} + \cdots + p_1(\epsilon)x + p_0(\epsilon)],$

where $q(x, \epsilon)$ and $p_j(\epsilon)$ ($j = 0, \ldots, m-1$) are analytic functions, such that $q(x, \epsilon) \ne 0$ at all points in the neighborhood, and $p_j(0) = 0$. In this neighborhood, the equation $F(x, \epsilon) = 0$ will be satisfied if and only if

(5.29)  $x^m + p_{m-1}(\epsilon)x^{m-1} + \cdots + p_1(\epsilon)x + p_0(\epsilon) = 0.$

Note that, since they are analytic, and vanish at $\epsilon = 0$, we can write $p_j(\epsilon) = \sum_{k \ge 1} p_{jk}\epsilon^k$ for some coefficients $p_{jk}$.

We now suppose that the repeated root is nondegenerate in the sense that $\frac{\partial F}{\partial \epsilon}(0, 0) \ne 0$, which implies $p_{01} \ne 0$. Upon introducing the change of variable $x = \delta z$ and $\epsilon = \delta^m$, and the new coefficients $\tilde{p}_j(\epsilon) = \sum_{k \ge 1} p_{jk}\epsilon^{k-1}$, and then factoring out $\delta^m$, we find that (5.29) will be satisfied if

(5.30)  $z^m + \tilde{p}_{m-1}(\delta^m)\delta^{m-1}z^{m-1} + \cdots + \tilde{p}_1(\delta^m)\delta z + \tilde{p}_0(\delta^m) = 0.$

When $\delta = 0$, the above equation reduces to $z_0^m + p_{01} = 0$, which has $m$ simple roots $z_0^{(i)} \ne 0$ ($i = 1, \ldots, m$). Since each function $\tilde{p}_j(\delta^m)\delta^j$ is analytic in $\delta$, and each root is simple, the implicit function theorem can be applied. Thus $m$ different solution curves exist for (5.30), and hence for (5.29) and (5.11). Each of these curves has a series expansion $z^{(i)}(\delta) = \sum_{k \ge 0} z_k^{(i)}\delta^k$, which converges for all $|\delta| < \rho^{(i)}$, for some positive radius of convergence $\rho^{(i)}$. Using the relation $x = \delta z$ we obtain $x^{(i)}(\delta) = \sum_{k \ge 0} z_k^{(i)}\delta^{k+1}$. By relabeling coefficients, and removing any shift as described above, each of the curves can be written as in (5.14), where $\delta = \epsilon^{1/m}$, and each extends from $(x_0, 0)$.

We again consider the problem in (5.11). For completeness, we suppose that $x_0$ has multiplicity $m \ge 2$, but that $\frac{\partial F}{\partial x}(x_0, 0) = 0$ and $\frac{\partial F}{\partial \epsilon}(x_0, 0) = 0$, in which case the root is called degenerate. (The case of a simple root or $m = 1$ is always nondegenerate.) Similar to before, there are $m$ solution curves that extend from such a root, and the series for each curve can be written in powers of $\epsilon^{\alpha}$ for some rational $\alpha > 0$. However, in contrast to before, the identification of $\alpha$ for each curve is more involved; we may no longer have $\alpha = \frac{1}{m}$ for each. Although the above proof could be generalized, we instead take a more direct approach.

One way to find the exponents is to use the Newton polygon method. The basic idea is that the exponents can be found by a direct search using a change of variable similar to above, namely $x = x_0 + \delta z$ and $\epsilon = \delta^{\gamma}$, where $\gamma > 0$ is rational. The strategy is to identify values of $\gamma$ for which the resulting equation in $z$ will have either simple, or repeated but nondegenerate roots $z = z_0$ when $\delta = 0$. Moreover, values of $\gamma$ are sought for which $z_0 \ne 0$. For such roots, Result 5.5.1 applies, and a series can be developed as before.
The method proceeds until a total of $m$ expansions are found, corresponding to the $m$ solution curves associated with the original root. When the equation is written in the form of a polynomial, different candidate values for $\gamma$ can be obtained by equating the exponents of $\delta$ on different pairs of terms, as described below. A polygonal diagram can be introduced to represent these candidate values, but this diagram is not employed here.

Example 5.5.2. Consider

(5.31)  $x^5 + \epsilon x^3 + \epsilon x^2 + \epsilon^2 + \epsilon^6 = 0, \qquad 0 \le \epsilon \ll 1.$

When $\epsilon = 0$, this equation has the repeated root $x_0 = 0$ of multiplicity $m = 5$, which is degenerate since $\frac{\partial F}{\partial x}(x_0, 0) = 0$ and $\frac{\partial F}{\partial \epsilon}(x_0, 0) = 0$. Here we find expansions for the five solution curves that extend from this root. To begin, we introduce $x = x_0 + \delta z$ and $\epsilon = \delta^{\gamma}$, and consider

(5.32)  $\delta^5 z^5 + \delta^{\gamma+3} z^3 + \delta^{\gamma+2} z^2 + \delta^{2\gamma} + \delta^{6\gamma} = 0.$

We next proceed to find candidate values of $\gamma$ by matching exponents of $\delta$ on pairs of terms. Note that, assuming $z \to z_0 \ne 0$ as $\delta \to 0^+$, at least two exponents must match; otherwise, the exponents would all be distinct, and dividing out the lowest would lead to an inconsistent equation as $\delta \to 0^+$.

We begin by considering the first two terms in (5.32), and suppose the match $5 = \gamma + 3$, which gives $\gamma = 2$. For this value of $\gamma$, the coefficients of the equation are $\{\delta^5, \delta^5, \delta^4, \delta^4 + \delta^{12}\}$, which after dividing out the lowest common factor gives $\{\delta, \delta, 1, 1 + \delta^8\}$, and the equation becomes

(5.33)  $\delta z^5 + \delta z^3 + z^2 + 1 + \delta^8 = 0.$

When $\delta = 0$ we get $z_0^2 + 1 = 0$, which has two simple nonzero roots $z_0 = \pm i$. These generate two standard power series expansions for $z$ in terms of $\delta$, which can be found by the usual process applied to (5.33). These expansions can then be converted back to $x$ and $\epsilon$ using the change of variable $x = x_0 + \delta z$ and $\delta = \epsilon^{1/2}$.

Next, we consider the first and third terms in (5.32), and suppose the match $5 = \gamma + 2$, which gives $\gamma = 3$. For this value, the coefficients are $\{\delta^5, \delta^6, \delta^5, \delta^6 + \delta^{18}\}$, which gives $\{1, \delta, 1, \delta + \delta^{13}\}$, and the equation is

(5.34)  $z^5 + \delta z^3 + z^2 + \delta + \delta^{13} = 0.$

When $\delta = 0$ we get $z_0^5 + z_0^2 = 0$ or $z_0^2(z_0^3 + 1) = 0$, which has three simple nonzero roots $z_0 = -1, \frac{1}{2}(1 \pm i\sqrt{3})$. These generate three standard power series expansions for $z$ in terms of $\delta$, which can be found by the usual process applied to (5.34). Similar to before, these expansions can be converted back to $x$ and $\epsilon$ using the change of variable $x = x_0 + \delta z$ and $\delta = \epsilon^{1/3}$. Thus we have five expansions corresponding to all five repeated roots and the procedure terminates.

Note that, in the process of matching exponents in (5.32), some matches may be impossible, or may provide no new or useful information. For example, the match $\gamma + 3 = \gamma + 2$ is not possible, whereas the match $\gamma + 3 = 2\gamma$ gives $\gamma = 3$, and $\gamma + 2 = 2\gamma$ gives $\gamma = 2$, and both of these values are not new. The match $5 = 2\gamma$ gives $\gamma = \frac{5}{2}$, which yields the coefficients $\{\delta^5, \delta^{11/2}, \delta^{9/2}, \delta^5 + \delta^{15}\}$, or equivalently $\{\tilde{\delta}^{10}, \tilde{\delta}^{11}, \tilde{\delta}^{9}, \tilde{\delta}^{10} + \tilde{\delta}^{30}\}$, where $\tilde{\delta} = \delta^{1/2}$. After dividing out common factors and setting $\tilde{\delta} = 0$, we get $z_0^2 = 0$, which is undesirable since we seek simple nonzero roots for $z_0$. Finally, the match $\gamma + 2 = 6\gamma$ gives $\gamma = \frac{2}{5}$, which leads to the equation $\tilde{\delta}^{21}z^5 + \tilde{\delta}^{13}z^3 + \tilde{\delta}^{8}z^2 + 1 + \tilde{\delta}^{8} = 0$, where $\tilde{\delta} = \delta^{1/5}$. Here the equation with $\tilde{\delta} = 0$ is inconsistent, which indicates that the value of $\gamma$ is not suitable. We introduce $\tilde{\delta}$ as needed to obtain integer exponents (thus analytic coefficients), so that a standard power series in $\tilde{\delta}$ is guaranteed for simple roots.
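The search over exponent matches in Example 5.5.2 can be organized as a short enumeration. The sketch below is an illustration under stated assumptions rather than the Newton polygon construction itself: each term of (5.32) is encoded by hand as a triple, a candidate $\gamma$ is accepted only when at least two terms share the lowest exponent of $\delta$, and the count of nonzero leading-order roots is taken as the spread in powers of $z$ among those terms (counted with multiplicity, so simplicity would still need to be checked in general). All names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

# Each term of (5.32) as (power of z, a, b): the term carries delta**(a + b*gamma).
TERMS = [(5, 5, 0), (3, 3, 1), (2, 2, 1), (0, 0, 2), (0, 0, 6)]

def exponents(gamma):
    """Exponent of delta on each term for a given gamma."""
    return [a + b * gamma for _, a, b in TERMS]

def candidate_gammas():
    """Positive rational gamma values at which two exponents of delta coincide."""
    found = set()
    for (_, a1, b1), (_, a2, b2) in combinations(TERMS, 2):
        if b1 != b2:
            gamma = Fraction(a1 - a2, b2 - b1)
            if gamma > 0:
                found.add(gamma)
    return sorted(found)

def nonzero_leading_roots(gamma):
    """Number of nonzero roots z0 of the delta = 0 equation (0 if gamma is unusable)."""
    exps = exponents(gamma)
    lowest = min(exps)
    z_powers = [zp for (zp, _, _), e in zip(TERMS, exps) if e == lowest]
    return max(z_powers) - min(z_powers)

for gamma in candidate_gammas():
    print(f"gamma={gamma}  nonzero leading roots={nonzero_leading_roots(gamma)}")
```

Under these assumptions only $\gamma = 2$ and $\gamma = 3$ yield nonzero leading-order roots, two and three of them respectively, which accounts for all five solution curves.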
5.6. Regular differential case

Consider a regularly perturbed ordinary differential equation, together with a boundary or initial condition, of the form

(5.35)  $\dfrac{du}{dt} = F(t, u, \epsilon), \qquad u|_{t=0} = u_\#, \qquad t \ge 0, \qquad 0 \le \epsilon \ll 1.$

For generality, we suppose that $u = (u_1, \ldots, u_n)$ and $F = (F_1, \ldots, F_n)$ for some $n \ge 1$. Thus one-, two-, and higher-dimensional systems can be considered; the one-dimensional case is illustrated conceptually in Figure 5.5. We seek an expression for the solution $u(t, \epsilon)$.

[Figure 5.5. Axes $t$ and $u$; solution curves $u(t, \epsilon)$ for $\epsilon > 0$ and $\epsilon = 0$, both starting from the value $u_\#$ at $t = 0$.]

The following result says that, provided the function $F(t, u, \epsilon)$ can be expressed as a series, then so can the solution $u(t, \epsilon)$.

Result 5.6.1. Let $u_\#$ be given. If $F(t, u, \epsilon)$ is analytic at $(0, u_\#, 0)$, then a unique solution $u(t, \epsilon)$ of (5.35) exists. This solution is analytic at $(0, 0)$, and can be written as a single series in $\epsilon$, namely

(5.36)  $u(t, \epsilon) = u_0(t) + \epsilon u_1(t) + \epsilon^2 u_2(t) + \cdots.$

The series converges for $t \in [0, T)$ and $\epsilon \in [0, \rho)$ for some $T, \rho > 0$. The coefficients $u_k(t)$, $k \ge 0$ can be found from (5.35).

Thus a unique solution $u(t, \epsilon)$ will exist and be analytic at $(0, 0)$ provided that the function $F(t, u, \epsilon)$ is analytic at the initial point $(0, u_\#, 0)$. In view of Definition 5.3.1, the solution $u(t, \epsilon)$ can be written as a double power series in $(t, \epsilon)$, which converges for some positive radius in each variable. By summing over the series in $t$, we arrive at the form in (5.36), which is a standard power series in $\epsilon$ with time-dependent coefficients $u_k(t)$. This form of the solution will be the most convenient for our purposes. Note that $u(t, \epsilon)$ has derivatives of all orders in $t$ and $\epsilon$ within its domain of convergence. For simplicity, we will denote derivatives by $\frac{d}{dt}$ and $\frac{d}{d\epsilon}$, rather than $\frac{\partial}{\partial t}$ and $\frac{\partial}{\partial \epsilon}$.
As in the algebraic case, note that the above result is entirely local. The existence, uniqueness, and form of the solution $u(t, \epsilon)$ depends only on properties of the function $F(t, u, \epsilon)$ at the initial point $(0, u_\#, 0)$. The system in (5.35) is indeed regularly perturbed when the conditions of Result 5.6.1 are met, since the unique solution for $\epsilon > 0$ extends continuously in a suitable norm to $\epsilon = 0$. In the $n$-dimensional case with $u = (u_1, \ldots, u_n)$ and $F = (F_1, \ldots, F_n)$, the function $F$ is analytic if each of the component functions $F_i$ is analytic. Moreover, the series in (5.36) can be written in vector form $u(t, \epsilon) = \sum_{k \ge 0} u_k(t)\epsilon^k$, or in component form $u_i(t, \epsilon) = \sum_{k \ge 0} u_{i,k}(t)\epsilon^k$. As before, by a perturbation approximation of the solution up to order $O(\epsilon^n)$ we mean a truncated series with all terms up to and including $\epsilon^n$.

Example 5.6.1. Consider

(5.37)  $\dfrac{du}{dt} = -u + \sqrt{1 + \epsilon u}, \qquad u|_{t=0} = 2, \qquad t \ge 0, \qquad 0 \le \epsilon \ll 1.$

This system has a unique solution when $\epsilon > 0$ and also when $\epsilon = 0$. Here we seek an expansion of the solution $u(t, \epsilon)$. Note that $F(t, u, \epsilon) = -u + \sqrt{1 + \epsilon u}$ is analytic at $(t, u, \epsilon) = (0, 2, 0)$. According to the above result, the solution can be expanded in a series

(5.38)  $u(t, \epsilon) = u_0(t) + \epsilon u_1(t) + \epsilon^2 u_2(t) + \epsilon^3 u_3(t) + \cdots.$

For future reference, note that

(5.39)  $\dfrac{du}{d\epsilon}(t, \epsilon) = 0 + u_1(t) + 2\epsilon u_2(t) + 3\epsilon^2 u_3(t) + \cdots,$
        $\dfrac{d^2u}{d\epsilon^2}(t, \epsilon) = 0 + 0 + 2u_2(t) + 6\epsilon u_3(t) + \cdots.$

Substituting $u(t, \epsilon)$ into the differential equation in (5.37), and moving all terms to the left-hand side for convenience, we get

(5.40)  $a(t, \epsilon) + b(t, \epsilon) - c(t, \epsilon) = 0, \qquad t \ge 0,$

where $a(t, \epsilon) = \frac{du}{dt}(t, \epsilon)$, $b(t, \epsilon) = u(t, \epsilon)$, and $c(t, \epsilon) = \sqrt{1 + \epsilon u(t, \epsilon)}$. We next expand each of these terms in powers of $\epsilon$. For $a(t, \epsilon)$ and $b(t, \epsilon)$, the expansions are straightforward, and there is no need to use Taylor's formula; we get

(5.41)  $a(t, \epsilon) = \dfrac{du}{dt}(t, \epsilon) = \dfrac{du_0}{dt}(t) + \epsilon\dfrac{du_1}{dt}(t) + \cdots,$
        $b(t, \epsilon) = u(t, \epsilon) = u_0(t) + \epsilon u_1(t) + \cdots.$

For $c(t, \epsilon) = \sqrt{1 + \epsilon u(t, \epsilon)}$ we may use Taylor's formula, applied to $\epsilon$ with $t$ fixed, to find the first few terms in the expansion. Using the chain rule as needed for derivatives, together with (5.39), we get

(5.42)  $c(t, \epsilon) = c(t, 0) + \epsilon\dfrac{dc}{d\epsilon}(t, 0) + \dfrac{\epsilon^2}{2}\dfrac{d^2c}{d\epsilon^2}(t, 0) + \cdots$
        $= [(1 + \epsilon u)^{1/2}]_{\epsilon=0} + \epsilon\left[\tfrac{1}{2}(1 + \epsilon u)^{-1/2}\left(u + \epsilon\dfrac{du}{d\epsilon}\right)\right]_{\epsilon=0} + \cdots$
        $= [1] + \epsilon\left[\tfrac{1}{2}u_0(t)\right] + \cdots.$
5.6. Regular differential case
Substituting the expansions (5.41)–(5.42) into (5.40) and collecting terms by powers of $\epsilon$ we get
\[ \left[ \frac{du_0}{dt}(t) + u_0(t) - 1 \right] + \epsilon \left[ \frac{du_1}{dt}(t) + u_1(t) - \tfrac{1}{2} u_0(t) \right] + \cdots = 0, \quad t \ge 0. \tag{5.43} \]
A similar procedure can be applied to the initial condition in (5.37). Substituting $u(t, \epsilon)$ into that condition we get $u(0, \epsilon) - 2 = 0$, and using the series expansion (5.38) we get
\[ [u_0(0) - 2] + \epsilon u_1(0) + \epsilon^2 u_2(0) + \cdots = 0. \tag{5.44} \]
In order for the equations in (5.43) and (5.44) to hold for arbitrary values of $0 \le \epsilon \ll 1$, the coefficient of each power of $\epsilon$ must vanish. Setting each coefficient to zero we get the following sequence of initial-value problems for the functions $u_k(t)$, $k \ge 0$.

$\epsilon^0$. $u_0' + u_0 - 1 = 0$, $u_0(0) - 2 = 0$, $t \ge 0$. This is a first-order equation for $u_0(t)$, which can be solved using an integrating factor or separation of variables. Solving gives $u_0(t) = 1 + e^{-t}$.

$\epsilon^1$. $u_1' + u_1 - \tfrac{1}{2} u_0 = 0$, $u_1(0) = 0$, $t \ge 0$. Given the solution for $u_0(t)$, this is a first-order equation for $u_1(t)$, which can be solved using an integrating factor similar to before. Solving gives $u_1(t) = \tfrac{1}{2}(1 + t e^{-t} - e^{-t})$.

The above process can be continued up to any desired power of $\epsilon$. From our calculations thus far, the first two terms in the expansion of the solution of (5.37) are
\[ u(t, \epsilon) = \left[ 1 + e^{-t} \right] + \epsilon \left[ \tfrac{1}{2} (1 + t e^{-t} - e^{-t}) \right] + \cdots. \tag{5.45} \]
The series is guaranteed to converge for $t \in [0, \tau)$ and $\epsilon \in [0, \sigma)$ for some $\tau, \sigma > 0$, and we note that it may be possible to increase $\tau$ or $\sigma$ at the expense of decreasing the other. If the above series is truncated, then the terms shown form a perturbation approximation up to order $O(\epsilon)$, and such an approximation can be used to estimate the solution for any sufficiently small $\epsilon$.

Using the above approximation we can get a picture of how the solution $u(t, \epsilon)$ is influenced by the parameter $0 \le \epsilon \ll 1$. As illustrated in Figure 5.6, when $\epsilon = 0$ the solution curve monotonically decays from the value $u = 2$ at $t = 0$, and approaches the equilibrium value $u = 1$ as $t \to \infty$. For small $\epsilon > 0$, the solution curve is qualitatively similar, but lies strictly above the previous curve (with the exception of the initial point), and approaches a strictly higher equilibrium value, which according to the approximation is at $u = 1 + \frac{\epsilon}{2}$.

Figure 5.6. (Plot of $u$ versus $t$ showing the solution curves for $\epsilon = 0$ and $\epsilon = 0.1$.)
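The quality of the two-term approximation in Example 5.6.1 can be checked numerically. The following is a minimal sketch, not part of the original text: it integrates (5.37) with a hand-written classical Runge-Kutta scheme and compares the result with the two-term formula (5.45). The function names, the step size, and the time window $0 \le t \le 5$ are illustrative assumptions.

```python
import math

def rhs(u, eps):
    """Right-hand side of (5.37): du/dt = -u + sqrt(1 + eps*u)."""
    return -u + math.sqrt(1.0 + eps * u)

def solve_rk4(eps, t_end=5.0, dt=0.01):
    """Integrate (5.37) from u(0) = 2 with the classical 4th-order Runge-Kutta scheme."""
    n = int(round(t_end / dt))
    u = 2.0
    ts, us = [0.0], [u]
    for i in range(n):
        k1 = rhs(u, eps)
        k2 = rhs(u + 0.5 * dt * k1, eps)
        k3 = rhs(u + 0.5 * dt * k2, eps)
        k4 = rhs(u + dt * k3, eps)
        u += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        ts.append((i + 1) * dt)
        us.append(u)
    return ts, us

def two_term(t, eps):
    """Two-term perturbation approximation (5.45)."""
    u0 = 1.0 + math.exp(-t)
    u1 = 0.5 * (1.0 + t * math.exp(-t) - math.exp(-t))
    return u0 + eps * u1

def max_error(eps):
    """Largest gap between the numerical solution and (5.45) on the time grid."""
    ts, us = solve_rk4(eps)
    return max(abs(u - two_term(t, eps)) for t, u in zip(ts, us))

if __name__ == "__main__":
    for eps in (0.2, 0.1, 0.05):
        print(eps, max_error(eps))
```

Since the first neglected term is of order $\epsilon^2$, the printed error should shrink roughly by a factor of four each time $\epsilon$ is halved, at least for small $\epsilon$.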
Sketch of proof: Result 5.6.1. We consider the $n$-dimensional system in (5.35). We focus on the case $n = 1$, and note that the case $n > 1$ is similar. To begin, the assumption that $F(t, u, \epsilon)$ is analytic at $(0, u_\#, 0)$ implies that it is analytic in an open neighborhood, and thus in a closed subset $D = \{(t, u, \epsilon) \mid |t| \le a, \ |u - u_\#| \le b, \ |\epsilon| \le c\}$ for some $a, b, c > 0$. Since all of its partial derivatives are continuous and bounded in $D$, there exist constants $M, K > 0$ such that $|F(t, u, \epsilon)| \le M$ and $|F(t, u, \epsilon) - F(t, v, \epsilon)| \le K|u - v|$ for all $(t, u, \epsilon)$ and $(t, v, \epsilon)$ in $D$. The second inequality is a Lipschitz condition, implied by the boundedness of $\frac{\partial F}{\partial u}$.

Let $E = \{(t, \epsilon) \mid |t| < \tau, \ |\epsilon| < \sigma\}$, where $\tau < \min(a, \frac{b}{M}, \frac{1}{K})$ and $\sigma < c$. Moreover, let $U$ be the set of all functions $u(t, \epsilon)$ that are analytic in $E$ and that satisfy $u(0, \epsilon) = u_\#$ and $\|u - u_\#\| \le b$, where $\|\cdot\|$ denotes the supremum norm, defined by $\|u\| = \sup_{(t,\epsilon) \in E} |u(t, \epsilon)|$. With this norm, the set $U$ is a complete metric space, which follows from the fact that the analytic property and the conditions $u(0, \epsilon) = u_\#$ and $\|u - u_\#\| \le b$ are all preserved under uniform convergence.

To establish the existence and uniqueness of a solution to (5.35), we consider an integrated version of the equation given by
\[ u(t, \epsilon) = u_\# + \int_0^t F(s, u(s, \epsilon), \epsilon) \, ds. \tag{5.46} \]
If this equation has a unique solution $u \in U$, then the function $u(t, \epsilon)$ will also be a unique solution of (5.35), including the initial condition. Motivated by (5.46), for any $w \in U$, we consider the transformed function $\mathcal{Q}w$ defined by $(\mathcal{Q}w)(t, \epsilon) = u_\# + \int_0^t F(s, w(s, \epsilon), \epsilon) \, ds$. Since $\|w - u_\#\| \le b$, the composite function $F(s, w(s, \epsilon), \epsilon)$ is analytic for all $(s, \epsilon)$ in $E$, and the function $(\mathcal{Q}w)(t, \epsilon)$ is well defined for all $(t, \epsilon)$ in $E$. Specifically, the (contour) integral in the definition of $\mathcal{Q}w$ is independent of the path from $s = 0$ to $s = t$, which we may assume is a line segment. Using the fact that the integral of an analytic function is analytic, together with the bound on $F$, we find that $\mathcal{Q}w$ is analytic in $E$, and satisfies $(\mathcal{Q}w)(0, \epsilon) = u_\#$ and $\|\mathcal{Q}w - u_\#\| \le \tau M$, where $\tau M < b$, which implies that $\mathcal{Q}w \in U$ for all $w \in U$. Moreover, from the Lipschitz property of $F$, we get $\|\mathcal{Q}w - \mathcal{Q}v\| \le \tau K \|w - v\|$, where $\tau K < 1$.

The contraction mapping theorem can now be applied to conclude that there is a unique $u \in U$ that satisfies $u = \mathcal{Q}u$, which implies that (5.46) and thus (5.35) has a unique solution. Since it is analytic in $E$, we have the representation $u(t, \epsilon) = \sum_{j,k \ge 0} \gamma_{jk} t^j \epsilon^k$ for some coefficients $\gamma_{jk}$. By introducing $u_k(t) = \sum_{j \ge 0} \gamma_{jk} t^j$, we get $u(t, \epsilon) = \sum_{k \ge 0} u_k(t) \epsilon^k$, which is a series of the form in (5.36).
5.7. Case study
Setup. To illustrate some of the preceding results we study a simplified, planar model for ballistic targeting. As shown in Figure 5.7, we consider the problem of aiming a weapon such as a rifle for a long-range shot. We suppose that the weapon is at the origin $(0, 0)$, and that the target is at a given point $(a, b)$. For a given firing speed $c = \sqrt{u_0^2 + v_0^2}$, which is the initial speed imparted to the bullet by the weapon, the problem is to find the line of sight that is required for the bullet to strike the target.
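Figure 5.7. (Sketch in the $x$, $y$ plane showing the initial velocity $(u_0, v_0)$, gravity $g$, the line of sight, the trajectory, the aiming angle $\theta$, the aiming height $h$, and the target $(a, b)$.)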
Note that, if gravity and air resistance were absent, then the bullet velocity would be constant, and the trajectory would be a line, coincident with the line of sight. In this case the aiming problem would be trivial: the line of sight would be directed at the target. In contrast, when gravity and air resistance are present, the bullet velocity is nonconstant, and the trajectory is a curved path, which is only initially tangent to the line of sight. In this case the aiming problem becomes nontrivial: the line of sight must be directed at a point sufficiently above the target so that the curved path will intersect the target. We seek to determine how the line of sight, or equivalently the aiming angle $\theta$ or aiming height $h$, should be chosen depending on the target location $(a, b)$, firing speed $c$, gravitational acceleration $g$, and a small parameter $\epsilon$ that quantifies air resistance effects. For convenience, we consider the aiming angle measured from the horizontal, and note that the angle measured from the line to the target may be more natural.

Outline of model. With a coordinate system as shown above, let $(x, y)$ denote the location of the bullet at any time $t \ge 0$. Equivalently, let $\vec{r} = x\vec{i} + y\vec{j}$ denote the position vector, where $\vec{i}$ and $\vec{j}$ are the standard unit vectors in the positive coordinate directions. The velocity and acceleration vectors at any time are then $\dot{\vec{r}} = \dot{x}\vec{i} + \dot{y}\vec{j}$ and $\ddot{\vec{r}} = \ddot{x}\vec{i} + \ddot{y}\vec{j}$, where a superposed dot denotes a derivative with respect to time.

We suppose that only two forces act on the bullet after it exits the weapon: one due to gravity, and another due to air resistance, as illustrated in Figure 5.8. The force of gravity has the form $\vec{F}_{\text{gravity}} = -mg\vec{j}$, where $m$ is the bullet mass. Using a simple model for high-speed motion through air, we assume that the force due to air resistance has a direction opposite to $\dot{\vec{r}}$, and a magnitude proportional to $|\dot{\vec{r}}|^2$. Specifically, we assume $\vec{F}_{\text{air}} = -\alpha |\dot{\vec{r}}| \dot{\vec{r}} = -\alpha (\dot{x}^2 + \dot{y}^2)^{1/2} (\dot{x}\vec{i} + \dot{y}\vec{j})$, where $\alpha \ge 0$ is an air resistance coefficient. We suppose that $\dot{x} > 0$, so that the bullet is always traveling rightward, and moreover suppose that $\dot{x} \gg |\dot{y}|$, which will be the case when the bullet path is nearly horizontal. Under these conditions, the speed factor $(\dot{x}^2 + \dot{y}^2)^{1/2} > 0$ can be approximated by $\dot{x} > 0$, and thus we consider a simplified model of air resistance given by $\vec{F}_{\text{air}} = -\alpha \dot{x} (\dot{x}\vec{i} + \dot{y}\vec{j})$. This simplification will make some of the following calculations easier.

Figure 5.8. (Sketch of the bullet at position $\vec{r}$ with the forces $\vec{F}_{\text{air}}$ and $\vec{F}_{\text{gravity}}$.)

Newton's law of motion for the bullet requires that the product of its mass and acceleration be equal to the sum of the applied forces, or equivalently, $m\ddot{\vec{r}} = \vec{F}_{\text{gravity}} + \vec{F}_{\text{air}}$. Writing this equation in components, and considering initial conditions, we obtain the system
\[ \begin{aligned} &\ddot{x} = -\epsilon (\dot{x})^2, \quad \dot{x}|_{t=0} = u_0, \quad x|_{t=0} = 0, \\ &\ddot{y} = -\epsilon \dot{x}\dot{y} - g, \quad \dot{y}|_{t=0} = v_0, \quad y|_{t=0} = 0, \\ &\epsilon = \alpha/m, \quad c = \sqrt{u_0^2 + v_0^2}. \end{aligned} \tag{5.47} \]
Any solution $(x, y)(t)$ of the above system is a bullet trajectory with prescribed initial conditions. Note that the air resistance parameter is expected to be small in the sense that $0 \le \epsilon \ll 1$. Also, note that the magnitude $c > 0$ is known, but not the individual velocity components $(u_0, v_0)$. The targeting problem can now be stated: for given values of $a, b, c, g, \epsilon$, we seek initial velocity components $(u_0, v_0)$ such that the path $(x, y)(t)$ will intersect the target point $(a, b)$. Once the velocity components are found, the aiming angle is given by $\tan\theta = \frac{v_0}{u_0}$, which completely determines the line of sight. Note that the aiming height can be found from the relation $\frac{h + b}{a} = \frac{v_0}{u_0}$. Consistent with the assumption that $\dot{x} > 0$ at all times, we assume $u_0 > 0$ and $a > 0$.

Analysis of model. Here we study the targeting problem associated with (5.47) in two cases. In the first case, we consider $g > 0$ and $\epsilon = 0$, which corresponds to including the effects of gravity, but not air resistance. In the second case, we consider $g > 0$ and $\epsilon > 0$, which corresponds to including the effects of both gravity and air resistance.

Case of $g > 0$ and $\epsilon = 0$. The differential equations take the simple form $\ddot{x} = 0$ and $\ddot{y} = -g$, and the initial conditions are $\dot{x}|_{t=0} = u_0$ and $\dot{y}|_{t=0} = v_0$, together with $x|_{t=0} = 0$ and $y|_{t=0} = 0$. The differential equations can be solved by explicit integration, and applying the initial conditions, we find that the bullet trajectory is $x(t) = u_0 t$ and $y(t) = v_0 t - \frac{1}{2} g t^2$ for all $t \ge 0$. The bullet will strike the target at some time $t = t_f > 0$ if and only if $x(t_f) = a$ and $y(t_f) = b$. From the first condition we can solve for the time to obtain $t_f = \frac{a}{u_0}$, and substituting this result into the second condition we obtain the relation $v_0 = \left(\frac{b}{a}\right) u_0 + \left(\frac{ga}{2}\right) \frac{1}{u_0}$. The velocity components must also satisfy the initial speed condition $u_0^2 + v_0^2 = c^2$. Thus we arrive at two equations for the two unknowns $u_0$ and $v_0$. Note that the two equations define two curves in the $u_0$, $v_0$ plane, and that any simultaneous solution of the equations must correspond to an intersection point of these curves.

The strike condition $v_0 = \left(\frac{b}{a}\right) u_0 + \left(\frac{ga}{2}\right) \frac{1}{u_0}$ corresponds to a hyperbolic-type curve, and the initial speed condition $u_0^2 + v_0^2 = c^2$ corresponds to a circle. Depending on the specified values of $a, b, c, g$, these curves can have two, one, or no intersections. The case when there are two intersections is illustrated in Figure 5.9. For each of the intersection points $(u_0^*, v_0^*)$ there is a corresponding aiming angle $\tan\theta = \frac{v_0^*}{u_0^*}$ and aiming height $\frac{h + b}{a} = \frac{v_0^*}{u_0^*}$. Note that solution 1 would have a low aiming angle, whereas solution 2 would have a high aiming angle. Only solution 1 would be consistent with our assumption of a nearly horizontal bullet path. In this special case of $\epsilon = 0$, the intersection points of the curves can be found exactly, although the algebra required is somewhat tedious. These points can also be found with the aid of numerical rootfinding software.
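Figure 5.9. (Left: the strike condition and speed condition curves in the $u_0$, $v_0$ plane, with intersection points soln1 and soln2. Right: the corresponding trajectories soln1 and soln2 in the $x$, $y$ plane through the target $(a, b)$.)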
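For the case $\epsilon = 0$, the intersection points of the strike and speed conditions can be computed directly. The sketch below is an illustration rather than the author's method: it eliminates $v_0$ from the two conditions, which (under the stated assumptions $a > 0$, $u_0 > 0$) gives the quadratic $(1 + b^2/a^2) U^2 + (gb - c^2) U + g^2 a^2/4 = 0$ in $U = u_0^2$. The function name `aiming_solutions` and the sample values are hypothetical.

```python
import math

def aiming_solutions(a, b, c, g):
    """Intersections (u0, v0) of the strike and speed conditions when eps = 0.

    Solutions are returned with the low aiming angle (largest u0) first.
    An empty list means the target is out of range.
    """
    p = 1.0 + (b / a) ** 2        # coefficient of U**2, where U = u0**2
    q = g * b - c ** 2            # coefficient of U
    r = (g * a) ** 2 / 4.0        # constant term
    disc = q * q - 4.0 * p * r
    if disc < 0.0:
        return []
    candidates = {(-q + sign * math.sqrt(disc)) / (2.0 * p) for sign in (1.0, -1.0)}
    solutions = []
    for U in sorted(candidates, reverse=True):
        if U > 0.0:
            u0 = math.sqrt(U)
            v0 = (b / a) * u0 + (g * a / 2.0) / u0   # strike condition
            solutions.append((u0, v0))
    return solutions

if __name__ == "__main__":
    # Sample values chosen only for illustration.
    for u0, v0 in aiming_solutions(a=100.0, b=0.0, c=50.0, g=9.8):
        print(u0, v0, math.degrees(math.atan2(v0, u0)))
```

A negative discriminant corresponds to the out-of-range situation, and a zero discriminant to the tangent case with a single intersection.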
Depending on the specified values of $a, b, c, g$, it may also happen that the strike and speed condition curves have only one intersection, at which point they are tangent, or the curves may have no intersections. In the latter case, the target is "out of range," and there is no aiming angle for which the bullet will strike the target. Specifically, gravity will pull the bullet below the target before the bullet can reach the target.

Case of $g > 0$ and $\epsilon > 0$. The differential equations now take the form $\ddot{x} = -\epsilon (\dot{x})^2$ and $\ddot{y} = -\epsilon \dot{x}\dot{y} - g$, and the initial conditions are $\dot{x}|_{t=0} = u_0$ and $\dot{y}|_{t=0} = v_0$, together with $x|_{t=0} = 0$ and $y|_{t=0} = 0$. Obtaining an explicit expression for the bullet path is difficult, and thus we now seek a series expansion for each of the components $x(t, \epsilon)$ and $y(t, \epsilon)$. Considering only the first two terms in each series, the differential equations and initial conditions defined by the coefficients of $\epsilon^0$ and $\epsilon^1$ can be found and solved in the usual way, and we obtain an approximate path $x(t, \epsilon) = x_0(t) + \epsilon x_1(t)$ and $y(t, \epsilon) = y_0(t) + \epsilon y_1(t)$.

As before, the bullet will strike the target at some time $t = t_f > 0$ if and only if $x(t_f, \epsilon) = a$ and $y(t_f, \epsilon) = b$. From the first condition we can solve for the time $t_f$; although there may be multiple solutions, only one is physically meaningful in the sense that it tends to the previous value $\frac{a}{u_0}$ in the limit $\epsilon \to 0^+$. Substituting the relevant solution for $t_f$ into the condition $y(t_f, \epsilon) = b$ we obtain a relation of the form $v_0 = f(u_0, a, b, g, \epsilon)$, which corresponds to a perturbed or distorted version of the hyperbolic-type curve considered above. And just as before, the velocity components must also satisfy the initial speed condition $u_0^2 + v_0^2 = c^2$. Thus we again arrive at two equations for the two unknowns $u_0$ and $v_0$. Depending on the specified values of $a, b, c, g, \epsilon$, these equations can again have two, one, or no solutions. The details for this case are considered in the Exercises.
5.8. Poincaré–Lindstedt method
The standard series expansion of a solution can have shortcomings for some types of differential equation problems. A notable type is the class of problems with periodic solutions. In this case, the standard expansion may contain terms that are nonperiodic and grow in time. Although the full sum of the series will converge to a periodic function, any truncation of the series will yield a nonperiodic approximation with a growing error. Here we show that this shortcoming can be corrected by using an enhanced expansion based on a scale transformation.

Setup. For illustration, we consider
\[ \frac{dx}{dt} = y, \quad x|_{t=0} = 1, \qquad \frac{dy}{dt} = -kx - \epsilon x^3, \quad y|_{t=0} = 0, \qquad t \ge 0, \tag{5.48} \]
where $k > 0$ and $0 \le \epsilon \ll 1$ are parameters. To develop the expansion it will be convenient to rewrite this system in the equivalent form
\[ \frac{d^2x}{dt^2} = -kx - \epsilon x^3, \quad \frac{dx}{dt}\Big|_{t=0} = 0, \quad x|_{t=0} = 1, \quad t \ge 0. \tag{5.49} \]
The above system has a periodic solution when $\epsilon = 0$ and also when $\epsilon > 0$. This fact can be deduced from the path equation for (5.48), which gives the first integral $E(x, y) = kx^2 + \frac{1}{2}\epsilon x^4 + y^2$. The solution path through the point $(x, y)|_{t=0} = (1, 0)$ is the curve $E(x, y) = C$, where $C = E(1, 0)$. This curve is a closed loop for any $\epsilon \ge 0$, which implies that the solution is periodic. Here we seek series expansions for $x(t, \epsilon)$ and $y(t, \epsilon)$. For convenience, we focus on the formulation in (5.49), and only consider $x(t, \epsilon)$. Once this function is known, then so is the other, since $y(t, \epsilon) = \frac{dx}{dt}(t, \epsilon)$.
Standard method. For motivation, and to understand the issues involved, we follow the standard procedure and consider the expansion
\[ x(t, \epsilon) = x_0(t) + \epsilon x_1(t) + \epsilon^2 x_2(t) + \cdots. \tag{5.50} \]
Substituting $x(t, \epsilon)$ into the differential equation and initial conditions in (5.49) we get
\[ \frac{d^2x}{dt^2}(t, \epsilon) + kx(t, \epsilon) + \epsilon g(t, \epsilon) = 0, \quad \frac{dx}{dt}(0, \epsilon) = 0, \quad x(0, \epsilon) - 1 = 0, \tag{5.51} \]
where $g(t, \epsilon) = x^3(t, \epsilon)$. As before, we expand each term in powers of $\epsilon$. For the function $g(t, \epsilon)$ we may use Taylor's formula, applied to $\epsilon$ with $t$ fixed, which gives
\[ \begin{aligned} g(t, \epsilon) &= g(t, 0) + \epsilon \frac{dg}{d\epsilon}(t, 0) + \cdots \\ &= \left[ x^3(t, \epsilon) \right]_{\epsilon=0} + \epsilon \left[ 3x^2(t, \epsilon) \frac{dx}{d\epsilon}(t, \epsilon) \right]_{\epsilon=0} + \cdots \\ &= \left[ x_0^3(t) \right] + \epsilon \left[ 3x_0^2(t) x_1(t) \right] + \cdots. \end{aligned} \tag{5.52} \]
Substituting (5.52) and (5.50) into the differential equation in (5.51), and collecting terms by powers of $\epsilon$, we get
\[ \left[ \frac{d^2x_0}{dt^2}(t) + kx_0(t) \right] + \epsilon \left[ \frac{d^2x_1}{dt^2}(t) + kx_1(t) + x_0^3(t) \right] + \cdots = 0, \quad t \ge 0. \tag{5.53} \]
We also expand each of the two initial conditions in (5.51). Substituting (5.50) into these conditions and collecting terms we get
\[ \frac{dx_0}{dt}(0) + \epsilon \frac{dx_1}{dt}(0) + \cdots = 0, \qquad [x_0(0) - 1] + \epsilon x_1(0) + \cdots = 0. \tag{5.54} \]
In order for the equations in (5.53) and (5.54) to hold for arbitrary values of $0 \le \epsilon \ll 1$, the coefficient of each power of $\epsilon$ must vanish. Setting each coefficient to zero we get the following sequence of initial-value problems for the functions $x_k(t)$, $k \ge 0$.

$\epsilon^0$. $x_0'' + kx_0 = 0$, $x_0'(0) = 0$, $x_0(0) = 1$, $t \ge 0$. This is a linear, homogeneous second-order equation for $x_0(t)$. The general solution can be found using standard methods based on the roots of a characteristic polynomial. Solving gives $x_0(t) = \cos(\beta t)$, where $\beta = \sqrt{k} > 0$.

$\epsilon^1$. $x_1'' + kx_1 = -x_0^3$, $x_1'(0) = 0$, $x_1(0) = 0$, $t \ge 0$. Given $x_0(t)$ from above, this is a linear, inhomogeneous second-order equation for $x_1(t)$. The general solution takes the form $x_1(t) = x_1^h(t) + x_1^p(t)$, where $x_1^h(t) = C_1 \cos(\beta t) + C_2 \sin(\beta t)$ is the solution of the homogeneous equation, with arbitrary constants $C_1$ and $C_2$, and $x_1^p(t)$ is any particular solution of the inhomogeneous equation $x_1'' + kx_1 = -\cos^3(\beta t)$. To solve this equation it is convenient to simplify the right-hand side using the triple-angle identity $\cos^3(\theta) = \frac{1}{4}\cos(3\theta) + \frac{3}{4}\cos(\theta)$. In this case, the differential equation becomes
\[ x_1'' + kx_1 = -\tfrac{1}{4}\cos(3\beta t) - \tfrac{3}{4}\cos(\beta t). \tag{5.55} \]
Using the method of undetermined coefficients, and noting that the inhomogeneous term $\frac{3}{4}\cos(\beta t)$ is contained in the homogeneous solution $x_1^h(t)$, we propose that the particular solution will have the form
\[ x_1^p(t) = A\cos(3\beta t) + B\sin(3\beta t) + t\left[ D\cos(\beta t) + E\sin(\beta t) \right], \tag{5.56} \]
where $A$, $B$, $D$, and $E$ are constants to be determined. Substituting (5.56) into (5.55) and matching terms on both sides of the equation, and noting that $\beta^2 = k$, we find a particular solution with $A = \frac{1}{32\beta^2}$, $B = 0$, $D = 0$, and $E = -\frac{3}{8\beta}$. Combining the homogeneous and particular solutions then gives the general solution
\[ x_1(t) = C_1\cos(\beta t) + C_2\sin(\beta t) + \frac{1}{32\beta^2}\cos(3\beta t) - \frac{3t}{8\beta}\sin(\beta t). \tag{5.57} \]
The initial conditions $x_1(0) = 0$ and $x_1'(0) = 0$ can now be applied, and we find that $C_1 = -\frac{1}{32\beta^2}$ and $C_2 = 0$, and the solution for $x_1(t)$ is completely determined.

The above process can be continued up to any desired power of $\epsilon$. Based on our calculations thus far, the first two terms in the expansion of the solution of (5.49) are
\[ x(t, \epsilon) = \cos(\beta t) + \epsilon \left[ \frac{\cos(3\beta t)}{32\beta^2} - \frac{\cos(\beta t)}{32\beta^2} - \frac{3t\sin(\beta t)}{8\beta} \right] + \cdots. \tag{5.58} \]
As before, the series is guaranteed to converge for $t \in [0, \tau)$ and $\epsilon \in [0, \sigma)$ for some $\tau, \sigma > 0$, and the first few terms of the series provide an approximation for sufficiently small $\epsilon$. The term with $t\sin(\beta t)$ is called a secular term in the expansion. Unlike the other terms shown, which are all periodic, this term is nonperiodic and grows in time. Thus any truncation of the series will yield a nonperiodic approximation with a growing error.

Poincaré–Lindstedt method. We again consider a series expansion for the solution $x(t, \epsilon)$ of the system (5.49). In contrast to before, we consider a scaled time variable $s = \omega t$, where $\omega > 0$ is a constant, and we further let $\omega$ depend on the constant parameter $\epsilon$. We assume that $\omega = \omega(\epsilon)$ is an analytic function, with expansion
\[ \omega(\epsilon) = \omega_0 + \epsilon\omega_1 + \epsilon^2\omega_2 + \cdots. \tag{5.59} \]
Note that the condition $\omega > 0$ for $0 \le \epsilon \ll 1$ requires $\omega_0 > 0$. Aside from this, the scale $\omega$ can be arbitrary; the system in (5.49) and its solution $x(t, \epsilon)$ can always be written in terms of a scaled variable $s = \omega t$.

The essential idea of the Poincaré–Lindstedt method is to apply the scale transformation to the original system and change variables from $x, t$ to $x, s$. For this purpose, the usual derivative relations from Result 2.3.1 are applicable, which yield
\[ \frac{dx}{dt} = \omega \frac{dx}{ds}, \qquad \frac{d^2x}{dt^2} = \omega^2 \frac{d^2x}{ds^2}. \tag{5.60} \]
The coefficients $\omega_k$, $k \ge 0$, in the expansion of $\omega$ can then be chosen so that the expansion of $x(s, \epsilon)$ is clean and free of secular terms. The resulting solution in the variables $x, s$ can then be converted back to $x, t$ as desired.

To illustrate the method, we consider the system in (5.49), and change variables from $x, t$ to $x, s$. Using the above derivative relations, and noting that the interval $t \ge 0$ becomes $s \ge 0$, we obtain
\[ \omega^2 \frac{d^2x}{ds^2} = -kx - \epsilon x^3, \quad \omega \frac{dx}{ds}\Big|_{s=0} = 0, \quad x|_{s=0} = 1, \quad s \ge 0. \tag{5.61} \]
Let $\omega(\epsilon)$ be as in (5.59), and consider a usual expansion for $x(s, \epsilon)$ as outlined in Result 5.6.1, namely
\[ x(s, \epsilon) = x_0(s) + \epsilon x_1(s) + \epsilon^2 x_2(s) + \cdots. \tag{5.62} \]
We next substitute (5.59) and (5.62) into (5.61) and expand all terms in powers of $\epsilon$. To this end, it is convenient to introduce the functions
\[ f(s, \epsilon) = \omega^2(\epsilon) \frac{d^2x}{ds^2}(s, \epsilon), \qquad g(s, \epsilon) = x^3(s, \epsilon), \qquad h(s, \epsilon) = \omega(\epsilon) \frac{dx}{ds}(s, \epsilon). \tag{5.63} \]
Each of these functions can be expanded using Taylor's formula, applied to $\epsilon$ with $s$ fixed. For $f(s, \epsilon)$ we obtain
\[ \begin{aligned} f(s, \epsilon) &= f(s, 0) + \epsilon \frac{df}{d\epsilon}(s, 0) + \cdots \\ &= \left[ \omega^2 \frac{d^2x}{ds^2} \right]_{\epsilon=0} + \epsilon \left[ 2\omega \frac{d\omega}{d\epsilon} \frac{d^2x}{ds^2} + \omega^2 \frac{d}{d\epsilon}\left( \frac{d^2x}{ds^2} \right) \right]_{\epsilon=0} + \cdots \\ &= \left[ \omega_0^2 \frac{d^2x_0}{ds^2} \right] + \epsilon \left[ 2\omega_0\omega_1 \frac{d^2x_0}{ds^2} + \omega_0^2 \frac{d^2x_1}{ds^2} \right] + \cdots. \end{aligned} \tag{5.64} \]
Similarly, for $g(s, \epsilon)$ we get
\[ \begin{aligned} g(s, \epsilon) &= g(s, 0) + \epsilon \frac{dg}{d\epsilon}(s, 0) + \cdots \\ &= \left[ x^3 \right]_{\epsilon=0} + \epsilon \left[ 3x^2 \frac{dx}{d\epsilon} \right]_{\epsilon=0} + \cdots \\ &= \left[ x_0^3 \right] + \epsilon \left[ 3x_0^2 x_1 \right] + \cdots, \end{aligned} \tag{5.65} \]
and for $h(s, \epsilon)$ we get
\[ \begin{aligned} h(s, \epsilon) &= h(s, 0) + \epsilon \frac{dh}{d\epsilon}(s, 0) + \cdots \\ &= \left[ \omega \frac{dx}{ds} \right]_{\epsilon=0} + \epsilon \left[ \frac{d\omega}{d\epsilon} \frac{dx}{ds} + \omega \frac{d}{d\epsilon}\left( \frac{dx}{ds} \right) \right]_{\epsilon=0} + \cdots \\ &= \left[ \omega_0 \frac{dx_0}{ds} \right] + \epsilon \left[ \omega_1 \frac{dx_0}{ds} + \omega_0 \frac{dx_1}{ds} \right] + \cdots. \end{aligned} \tag{5.66} \]
We can now substitute (5.62)–(5.66) into (5.61) and move all terms to the left-hand side and collect them by powers of $\epsilon$. We do this in the differential equation, and each of the two initial conditions. Setting each coefficient to zero we then obtain the following sequence of initial-value problems for the functions $x_k(s)$, which now involve the coefficients $\omega_k$, $k \ge 0$.

$\epsilon^0$. $\omega_0^2 x_0'' + kx_0 = 0$, $\omega_0 x_0'(0) = 0$, $x_0(0) = 1$, $s \ge 0$. This is a linear, homogeneous second-order equation for $x_0(s)$. Aside from being positive, the coefficient $\omega_0 > 0$ can be chosen arbitrarily. The specific choice $\omega_0 = \sqrt{k}$ is convenient; this will simplify various calculations that follow. With this choice the equation takes the clean form $x_0'' + x_0 = 0$, and the resulting solution, taking into account the initial conditions, becomes $x_0(s) = \cos(s)$.

$\epsilon^1$. $\omega_0^2 x_1'' + kx_1 = -x_0^3 - 2\omega_0\omega_1 x_0''$, $\omega_0 x_1'(0) = -\omega_1 x_0'(0)$, $x_1(0) = 0$, $s \ge 0$. Given $\omega_0$ and $x_0(s)$ from above, this is a linear, inhomogeneous second-order equation for $x_1(s)$. Using the fact that $\omega_0 = \sqrt{k}$ and $x_0(s) = \cos(s)$ we can rewrite the differential equation as $x_1'' + x_1 = \frac{2\omega_1}{\omega_0}\cos(s) - \frac{1}{\omega_0^2}\cos^3(s)$. As before, it is convenient to simplify the right-hand side using the triple-angle identity $\cos^3(\theta) = \frac{1}{4}\cos(3\theta) + \frac{3}{4}\cos(\theta)$. In this case, the differential equation becomes
\[ x_1'' + x_1 = \left( \frac{2\omega_1}{\omega_0} - \frac{3}{4\omega_0^2} \right) \cos(s) - \frac{1}{4\omega_0^2}\cos(3s). \tag{5.67} \]
Because $\cos(s)$ is a solution of the homogeneous equation, it will lead to a secular term in the particular solution for $x_1(s)$. Hence we choose $\omega_1$ to eliminate the term with $\cos(s)$. Setting $\left( \frac{2\omega_1}{\omega_0} - \frac{3}{4\omega_0^2} \right) = 0$ we get
\[ \omega_1 = \frac{3}{8\omega_0}. \tag{5.68} \]
With this choice of $\omega_1$ the differential equation becomes
\[ x_1'' + x_1 = -\frac{1}{4\omega_0^2}\cos(3s). \tag{5.69} \]
The general solution, formed by combining the homogeneous and a particular solution, takes the form
\[ x_1(s) = C_1\cos(s) + C_2\sin(s) + \frac{1}{32\omega_0^2}\cos(3s), \tag{5.70} \]
where $C_1$ and $C_2$ are arbitrary constants. The initial conditions $x_1(0) = 0$ and $x_1'(0) = -\frac{\omega_1}{\omega_0} x_0'(0) = 0$ imply that $C_1 = -\frac{1}{32\omega_0^2}$ and $C_2 = 0$, and the solution for $x_1(s)$ is completely determined.

The above process can be continued up to any desired power of $\epsilon$. The success of the method relies on the assumption that the arbitrary coefficients $\omega_k$ can be chosen to eliminate subsequent secular terms as they arise, and that the resulting series for $\omega(\epsilon)$ will converge. Our calculations thus far provide a two-term perturbation approximation for both $x(s, \epsilon)$ and $\omega(\epsilon)$, namely
\[ x(s, \epsilon) \approx x_0(s) + \epsilon x_1(s), \qquad \omega(\epsilon) \approx \omega_0 + \epsilon\omega_1, \qquad s = \omega(\epsilon) t. \tag{5.71} \]
More specifically, we have
\[ x(s, \epsilon) \approx \cos(s) + \epsilon \left[ \frac{\cos(3s)}{32\omega_0^2} - \frac{\cos(s)}{32\omega_0^2} \right], \qquad s = \left( \omega_0 + \frac{3\epsilon}{8\omega_0} \right) t, \qquad \omega_0 = \sqrt{k}. \tag{5.72} \]
In contrast to before, note that all terms shown are periodic, and that any truncation of such an expansion will yield an approximation that is periodic. Thus, by using an enhanced expansion technique such as the Poincaré–Lindstedt method, we can avoid secular terms that arise when using the standard method. When the Poincaré–Lindstedt method is successful, any truncation of the resulting expansion provides an approximation with an error that does not grow as before, but instead is uniformly bounded in time. Note that the above expression in variables $x, s$ can be converted back to variables $x, t$ by direct substitution.
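The difference between the two expansions can be seen numerically. The following sketch is an illustration under stated assumptions, not part of the original text: it integrates the system (5.48) with a hand-written classical Runge-Kutta scheme and measures the largest error of the standard two-term expansion (5.58) and of the Poincaré–Lindstedt two-term expansion (5.72). The function names, the sample values $k = 1$, $\epsilon = 0.1$, and the time window $0 \le t \le 40$ are hypothetical choices.

```python
import math

def duffing_rk4(k, eps, t_end, dt=0.005):
    """Integrate x' = y, y' = -k*x - eps*x**3 with x(0) = 1, y(0) = 0 (system (5.48))."""
    def f(x, y):
        return y, -k * x - eps * x ** 3
    n = int(round(t_end / dt))
    x, y = 1.0, 0.0
    ts, xs = [0.0], [x]
    for i in range(n):
        k1x, k1y = f(x, y)
        k2x, k2y = f(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y)
        k3x, k3y = f(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y)
        k4x, k4y = f(x + dt * k3x, y + dt * k3y)
        x += dt * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
        y += dt * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0
        ts.append((i + 1) * dt)
        xs.append(x)
    return ts, xs

def standard_two_term(t, k, eps):
    """Standard expansion (5.58), which contains the secular term t*sin(beta*t)."""
    beta = math.sqrt(k)
    x1 = (math.cos(3.0 * beta * t) - math.cos(beta * t)) / (32.0 * k) \
        - 3.0 * t * math.sin(beta * t) / (8.0 * beta)
    return math.cos(beta * t) + eps * x1

def pl_two_term(t, k, eps):
    """Poincare-Lindstedt expansion (5.72) written in the original time variable."""
    w0 = math.sqrt(k)
    s = (w0 + 3.0 * eps / (8.0 * w0)) * t
    return math.cos(s) + eps * (math.cos(3.0 * s) - math.cos(s)) / (32.0 * k)

def max_errors(k=1.0, eps=0.1, t_end=40.0):
    """Largest errors of the two approximations on the time grid."""
    ts, xs = duffing_rk4(k, eps, t_end)
    err_std = max(abs(x - standard_two_term(t, k, eps)) for t, x in zip(ts, xs))
    err_pl = max(abs(x - pl_two_term(t, k, eps)) for t, x in zip(ts, xs))
    return err_std, err_pl

if __name__ == "__main__":
    print(max_errors())
```

Lengthening the time window should make the error of the standard expansion grow, while the error of the scaled expansion should stay comparatively small over the same window.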
5.9. Singular algebraic case
We next extend the results for regularly perturbed algebraic equations to the singular case. Rather than state a general result, we illustrate the essential ideas using an example. We consider
\[ \epsilon x^4 - x - 1 = 0, \quad 0 < \epsilon \ll 1. \tag{5.73} \]
This equation is singularly perturbed since not all solutions for $\epsilon > 0$ can be continued to $\epsilon = 0$. Note that, although $\epsilon = 0$ is not of direct interest, we still use it to guide our developments. Similar to the regular case, we seek an expansion for each of the four roots $x(\epsilon)$.
Remarks. For motivation, we note that (5.73) has 4 roots when $\epsilon > 0$ and only 1 root when $\epsilon = 0$. This informs us that 3 roots are singular, which in the algebraic case must become unbounded (infinite) as $\epsilon \to 0^+$, and 1 root is regular, which remains defined as $\epsilon \to 0^+$. We also note that a regular series as outlined in Result 5.5.1 can only represent a regular root, since a series in powers of $\epsilon$ or more generally $\epsilon^{1/m}$ with $m \ge 1$ will remain defined as $\epsilon \to 0^+$. Thus a special approach is needed to obtain an expansion for each of the four roots.

Procedure. To construct a series expansion for each root, we first identify the total number of roots, and the number of regular and singular type. As outlined above, we have 4 total roots: 1 regular and 3 singular.

Regular roots. A series expansion for each regular root can be constructed using the standard approach based on simple and repeated types at $\epsilon = 0$, as outlined in Result 5.5.1. Setting $\epsilon = 0$ in equation (5.73) we get $-x_0 - 1 = 0$, which gives $x_0 = -1$. Thus the one regular root is simple at $\epsilon = 0$. (A single root must be simple.) Based on this, we consider the regular expansion
\[ x(\epsilon) = x_0 + \epsilon x_1 + \epsilon^2 x_2 + \epsilon^3 x_3 + \cdots, \quad x_0 = -1. \tag{5.74} \]
Note that an expansion in powers of $\delta = \epsilon^{1/m}$ or similar would be needed for a repeated root of multiplicity $m$. In the usual way, we next substitute (5.74) into (5.73), expand all terms in powers of $\epsilon$, and collect coefficients. The result is as follows.

$\epsilon^0$. $-x_0 - 1 = 0$. This is satisfied by the root under consideration, which is $x_0 = -1$.

$\epsilon^1$. $x_0^4 - x_1 = 0$. For the given value of $x_0$, we get a corresponding value of $x_1$, specifically $x_1 = x_0^4 = 1$.

The above process can be continued up to any desired power of $\epsilon$. From our calculations, the first two terms in the expansion of the one regular root are
\[ x^{(1)}(\epsilon) = -1 + \epsilon + \cdots. \tag{5.75} \]
Singular roots. A series expansion for each singular root can be constructed by introducing a scale transformation. The purpose of this change of variable is to make the problem regular, so that previous results can be applied. We consider a change of variable of the form $z = \epsilon^\alpha x$ or $x = z/\epsilon^\alpha$, where $\alpha > 0$ is an exponent to be determined. When this exponent is properly chosen, the singular and regular roots in $x$ will be converted to nonzero and zero roots in $z$ at $\epsilon = 0$. Normally, singular roots in $x$ will have $z \ne 0$ at $\epsilon = 0$, while regular roots in $x$ will have $z = 0$ at $\epsilon = 0$. Note that $z \ne 0$ implies that $x$ becomes unbounded as $\epsilon \to 0^+$.

We next substitute $x = z/\epsilon^\alpha$ into (5.73), and then simplify the result to obtain a leading coefficient of unity, which gives a normalized equation in $z$. Specifically, we get
\[ \begin{aligned} &\epsilon (\epsilon^{-\alpha} z)^4 - \epsilon^{-\alpha} z - 1 = 0, \\ &\epsilon^{1-4\alpha} z^4 - \epsilon^{-\alpha} z - 1 = 0, \\ &z^4 - \epsilon^{3\alpha-1} z - \epsilon^{4\alpha-1} = 0. \end{aligned} \tag{5.76} \]
The exponent $\alpha$ can now be determined: we seek the smallest value of $\alpha > 0$ that will make the normalized equation regularly perturbed. This will be the case when each power of $\epsilon$ is nonnegative, which requires $3\alpha - 1 \ge 0$ and $4\alpha - 1 \ge 0$, or equivalently $\alpha \ge \frac{1}{3}$ and $\alpha \ge \frac{1}{4}$. The smallest value of $\alpha$ that satisfies both conditions is $\alpha = \frac{1}{3}$. Substituting this exponent into (5.76) gives the regularly perturbed equation $z^4 - z - \epsilon^{1/3} = 0$. This result can be written with integer exponents (thus analytic coefficients) by relabeling the parameter, namely
\[ z^4 - z - \tilde{\epsilon} = 0, \quad \tilde{\epsilon} = \epsilon^{1/3}. \tag{5.77} \]
Since the above equation is regularly perturbed, a series expansion for each root can be constructed using the standard approach based on simple and repeated types at π Μ = 0. As outlined above, we seek roots with π§0 β 0, as these will correspond to the original singular roots. Setting π Μ = 0 in equation (5.77) we get π§40 β π§0 = 0, or equivalently π§0 (π§30 β 1) = 0, which requires π§0 = 0 or π§30 = 1. We discard π§0 = 0, and only consider π§30 = 1, which 1 1 yields π§0 = 1, 2 (β1 + πβ3), 2 (β1 β πβ3). Thus we have three nonzero roots π§0 at π Μ = 0, which are all simple. Based on this, we consider the regular expansions π§(π)Μ = π§0 + ππ§Μ 1 + π2Μ π§2 + β― ,
(5.78)
π§0 = 1,
β1+πβ3 β1βπβ3 , 2 . 2
Μ or similar would be needed for a repeated Note that an expansion in powers of πΏ = π1/π root of multiplicity π. As before, we next substitute (5.78) into (5.77), expand all terms in powers of π,Μ and collect coefficients. The result is as follows. π0Μ .
π§40 β π§0 = 0.
π§0 = 1,
This is satisfied by the three roots under consideration, which are
β1+πβ3 β1βπβ3 , . 2 2
π1Μ . 4π§30 π§1 β π§1 β 1 = 0. For each value of π§0 , we get a corresponding value of π§1 , 1 1 1 1 specifically π§1 = 4π§3 β1 = 3 , 3 , 3 . 0
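The above process can be continued up to any desired power of $\tilde{\epsilon}$. Based on our calculations, the first two terms in the expansion of the three relevant roots for $z$ are

$$\begin{aligned} z^{(a)}(\tilde{\epsilon}) &= 1 + \tfrac13 \tilde{\epsilon} + \cdots, \\ z^{(b)}(\tilde{\epsilon}) &= \frac{-1+i\sqrt3}{2} + \tfrac13 \tilde{\epsilon} + \cdots, \\ z^{(c)}(\tilde{\epsilon}) &= \frac{-1-i\sqrt3}{2} + \tfrac13 \tilde{\epsilon} + \cdots. \end{aligned} \tag{5.79}$$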
Each series for $z$ in terms of $\tilde{\epsilon}$ can be converted into a series for $x$ in terms of $\epsilon$ using the relations $\tilde{\epsilon} = \epsilon^{1/3}$ and $x = z/\epsilon^{\alpha}$, where $\alpha = \tfrac13$. Note that $\tilde{\epsilon}$ and $\epsilon^{\alpha}$ will in general be different, although they are the same in this example. Thus the three original singular roots have the expansions

$$\begin{aligned} x^{(2)}(\epsilon) &= \frac{1}{\epsilon^{1/3}} + \frac13 + \cdots, \\ x^{(3)}(\epsilon) &= \frac{-1+i\sqrt3}{2\epsilon^{1/3}} + \frac13 + \cdots, \\ x^{(4)}(\epsilon) &= \frac{-1-i\sqrt3}{2\epsilon^{1/3}} + \frac13 + \cdots. \end{aligned} \tag{5.80}$$
The results in (5.80) and (5.75) are the expansions of all four roots of the original equation in (5.73). The overall procedure can be applied in the same way to other singularly perturbed algebraic equations.

Note. If further considered, the root of (5.77) with $z_0 = 0$ would generate an expansion for the regular root of the original equation. However, an expansion obtained in terms of $z$ and $\tilde{\epsilon}$, while equivalent, would be more cumbersome and less efficient than that obtained by working directly with the original equation in terms of $x$ and $\epsilon$. Thus it is advantageous to deal with the regular and singular roots separately as described above.
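The singular-root expansions in (5.80) can be checked numerically. The following Python sketch is an added illustration, not part of the procedure in the text. It assumes the original equation (5.73) is $\epsilon x^4 - x - 1 = 0$, as implied by the first line of (5.76), and it refines each two-term approximation with Newton's method. The helper names `two_term_singular_roots`, `newton_refine`, and `max_gap` are hypothetical.

```python
import cmath

def two_term_singular_roots(eps):
    """Two-term approximations x ~ z0 / eps**(1/3) + 1/3, as in (5.80)."""
    cube_roots_of_unity = [cmath.exp(2j * cmath.pi * k / 3) for k in range(3)]
    scale = eps ** (-1.0 / 3.0)
    return [z0 * scale + 1.0 / 3.0 for z0 in cube_roots_of_unity]

def newton_refine(x, eps, steps=50):
    """Newton iteration for p(x) = eps*x**4 - x - 1 (assumed form of (5.73))."""
    for _ in range(steps):
        p = eps * x**4 - x - 1
        dp = 4 * eps * x**3 - 1
        x = x - p / dp
    return x

def max_gap(eps):
    """Largest distance between a two-term approximation and its refined root."""
    return max(abs(newton_refine(x, eps) - x)
               for x in two_term_singular_roots(eps))

for eps in (1e-3, 1e-6, 1e-9):
    print(f"eps={eps:g}  max gap={max_gap(eps):.3e}  eps^(1/3)={eps ** (1 / 3):.3e}")
```

If the expansions are correct, the reported gap should shrink roughly in proportion to $\epsilon^{1/3}$, which is the size of the first neglected term in (5.80).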
5.10. Singular differential case

We next extend the results for regularly perturbed differential equations to the singular case. As before, rather than state a general result, we illustrate the essential ideas using an example. We consider

$$\epsilon \frac{d^2 y}{dx^2} + (1 + \epsilon)\frac{dy}{dx} + y = 0, \quad 0 \le x \le 1, \qquad y|_{x=0} = 0, \quad y|_{x=1} = 1, \quad 0 < \epsilon \ll 1. \tag{5.81}$$

This system is singularly perturbed since the unique solution for $\epsilon > 0$ cannot be continued to $\epsilon = 0$; specifically, there is no solution in the latter case. Note that, although $\epsilon = 0$ is not of direct interest, we still use it to guide our developments. As in the regular case, we seek an expansion for the solution $y(x, \epsilon)$.

Remarks. For the case $\epsilon > 0$, the system consists of a linear, second-order differential equation with two boundary conditions, and has a unique solution. In contrast, for the case $\epsilon = 0$, the system consists of a linear, first-order differential equation with two boundary conditions, and has no solution. This informs us that the unique solution for $\epsilon > 0$ must become undefined or break down as $\epsilon \to 0^+$. For systems involving a differential equation, a typical mode of break down is

$$\frac{d^m y}{dx^m}(x, \epsilon) \to \pm\infty \quad \text{as} \quad \epsilon \to 0^+, \quad \text{for some } m \text{ and } x. \tag{5.82}$$
Numerical experimentation with (5.81) shows that the break down in the solution occurs in the slope at the left side of the domain, specifically $\frac{dy}{dx}(0, \epsilon) \to \infty$ as $\epsilon \to 0^+$ as illustrated in Figure 5.10. For small values of $\epsilon > 0$, the slope is large in a thin region adjacent to $x = 0$; and as $\epsilon \to 0^+$, the region becomes thinner and the slope becomes infinite. Outside of this thin region the solution behaves regularly in the sense that there is no singular behavior as $\epsilon \to 0^+$. The thin region adjacent to $x = 0$ is called the boundary layer or inner region, and the remaining portion of the domain that includes $x = 1$ is called the outer region. This is a qualitative partitioning of the domain; the point of transition between the two regions is left unspecified.

[Figure 5.10. Sketch of the solution $y(x, \epsilon)$ on $0 \le x \le 1$, with the inner region adjacent to $x = 0$ and the outer region extending to $x = 1$.]

In our developments, we restrict attention to problems with singular behavior at only one point, which will be the left or right end of an interval, and we outline a procedure for finding only the first term in an expansion of the solution. Problems involving more than one point of singular behavior, which may include points in the interior of the domain, and procedures for finding high-order terms in an expansion, are not pursued here.

Procedure. To construct an expansion for the solution $y(x, \epsilon)$ we first identify the location of the inner region or boundary layer. We assume the location is given; in the present example, it is on the left end of the solution interval $0 \le x \le 1$. For generality, we denote the boundary layer endpoint by $q_{\mathrm{in}}$, and the other end point by $q_{\mathrm{out}}$, so the interval becomes $q_{\mathrm{in}} \le x \le q_{\mathrm{out}}$, where $q_{\mathrm{in}} = 0$ and $q_{\mathrm{out}} = 1$. We next split the problem into two parts corresponding to the inner and outer regions. In each region, the solution must satisfy the differential equation, and any boundary condition in that region. In the current example, one boundary condition is specified in each region or side; but more generally, a region or side may contain some, all, or none of the boundary conditions. The two parts for the current example are

$$\text{(Outer)} \quad \begin{cases} \epsilon \dfrac{d^2 y}{dx^2} + (1 + \epsilon)\dfrac{dy}{dx} + y = 0, & x \le q_{\mathrm{out}}, \\ y|_{x = q_{\mathrm{out}}} = 1. \end{cases} \tag{5.83}$$

$$\text{(Inner)} \quad \begin{cases} \epsilon \dfrac{d^2 y}{dx^2} + (1 + \epsilon)\dfrac{dy}{dx} + y = 0, & x \ge q_{\mathrm{in}}, \\ y|_{x = q_{\mathrm{in}}} = 0. \end{cases} \tag{5.84}$$
Note that, for problems in which the boundary layer is on the right side, the above parts would be interchanged: the outer region would be $x \ge q_{\mathrm{out}} = 0$, and the inner region would be $x \le q_{\mathrm{in}} = 1$.

Outer problem. In the outer region, the solution behaves regularly, and no special treatment is required. Thus we consider a regular expansion of the form

$$y(x, \epsilon) = y_0(x) + \epsilon y_1(x) + \epsilon^2 y_2(x) + \cdots. \tag{5.85}$$
In the usual way, we can substitute (5.85) into (5.83), expand all terms in powers of $\epsilon$, and collect coefficients. Restricting attention to only the leading-order term $y_0(x)$, we get

$$\frac{dy_0}{dx} + y_0 = 0, \qquad y_0(q_{\mathrm{out}}) = 1, \qquad x \le q_{\mathrm{out}}. \tag{5.86}$$

Solving this equation using separation of variables we obtain $y_0(x) = e^{1-x}$, and truncating the series in (5.85) at the first term, we get the following leading-order approximation of the outer solution

$$y^{\mathrm{out}}(x, \epsilon) = y_0(x) = e^{1-x}. \tag{5.87}$$

For later reference, let $I^{\mathrm{out}}$ denote the value of the outer solution at the inner endpoint $q_{\mathrm{in}}$ in the limit $\epsilon \to 0^+$, that is

$$I^{\mathrm{out}} = \lim_{\epsilon \to 0^+} y^{\mathrm{out}}(q_{\mathrm{in}}, \epsilon). \tag{5.88}$$
Since $q_{\mathrm{in}} = 0$ and there is no dependence on $\epsilon$, we get $I^{\mathrm{out}} = e$ as illustrated in Figure 5.11. For problems in which the boundary layer is on the right, the definition of $I^{\mathrm{out}}$ in (5.88) would be the same, but the figure would be flipped: the outer region and $q_{\mathrm{out}}$ would be on the left.

[Figure 5.11. The outer solution $y^{\mathrm{out}}(x, \epsilon)$ on $q_{\mathrm{in}} \le x \le q_{\mathrm{out}}$, taking the value 1 at $q_{\mathrm{out}}$ and the value $I^{\mathrm{out}}$ at $q_{\mathrm{in}}$.]

Inner problem. In the inner region, the solution behaves singularly, and special treatment is required. Motivated by the algebraic case, we introduce a scale transformation or change of variable to make the problem regular in this region. We consider the change of variable

$$\xi = (x - q_{\mathrm{in}})/\epsilon^{\alpha}, \qquad x = q_{\mathrm{in}} + \epsilon^{\alpha} \xi, \tag{5.89}$$

where $\alpha > 0$ is an exponent to be determined. The essential idea is to apply this transformation to the inner system and change variables from $x, y$ to $\xi, y$. For this purpose, the usual derivative relations from Result 2.3.1 are applicable, which yield

$$\frac{dy}{dx} = \epsilon^{-\alpha} \frac{dy}{d\xi}, \qquad \frac{d^2 y}{dx^2} = \epsilon^{-2\alpha} \frac{d^2 y}{d\xi^2}. \tag{5.90}$$

Note that the solution curve in the $\xi, y$-plane will be a horizontally stretched version of the curve in the $x, y$-plane, where the stretching factor is $\epsilon^{-\alpha}$. When the exponent is properly chosen, the stretching will cancel the steepening slope at $x = q_{\mathrm{in}}$ when $\epsilon \to 0^+$.
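The effect of the stretching in (5.89) and (5.90) can be illustrated on the example (5.81). The sketch below is an added illustration under two assumptions that are not stated at this point in the text. First, it uses the closed-form solution $y = (e^{-x} - e^{-x/\epsilon})/(e^{-1} - e^{-1/\epsilon})$, which follows from the characteristic roots $-1$ and $-1/\epsilon$ of (5.81). Second, it anticipates the exponent $\alpha = 1$ that is determined later in this section. The function names are hypothetical.

```python
import math

def slope_at_left_end(eps):
    """dy/dx at x = 0 for the assumed closed-form solution of (5.81)."""
    denom = math.exp(-1.0) - math.exp(-1.0 / eps)
    return (-1.0 + 1.0 / eps) / denom

def stretched_slope_at_left_end(eps, alpha=1.0):
    """dy/dxi at xi = 0, using dy/dxi = eps**alpha * dy/dx from (5.90)."""
    return eps**alpha * slope_at_left_end(eps)

for eps in (1e-1, 1e-2, 1e-3):
    print(eps, slope_at_left_end(eps), stretched_slope_at_left_end(eps))
```

In the original variable the slope grows without bound as $\epsilon$ decreases, while in the stretched variable it settles toward a finite value, which is the sense in which the stretching cancels the steepening.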
We next apply the change of variable (5.89) to the inner system (5.84), noting that the interval $x \ge q_{\mathrm{in}}$ becomes $\xi \ge 0$, and then simplify the differential equation so that the highest-order term has a coefficient of unity. This yields a normalized system in the variables $\xi, y$, which is

$$\begin{cases} \dfrac{d^2 y}{d\xi^2} + (\epsilon^{\alpha-1} + \epsilon^{\alpha})\dfrac{dy}{d\xi} + \epsilon^{2\alpha-1} y = 0, & \xi \ge 0, \\ y|_{\xi=0} = 0. \end{cases} \tag{5.91}$$

The exponent $\alpha$ can now be determined: we seek the smallest value of $\alpha > 0$ that will make the normalized system regularly perturbed. This will be the case when each power of $\epsilon$ is nonnegative, which requires $\alpha - 1 \ge 0$, $\alpha \ge 0$ and $2\alpha - 1 \ge 0$, or equivalently $\alpha \ge 1$, $\alpha \ge 0$ and $\alpha \ge \tfrac12$. The smallest value of $\alpha$ that satisfies all three conditions is $\alpha = 1$. Substituting this exponent into (5.91) we get

$$\begin{cases} \dfrac{d^2 y}{d\xi^2} + (1 + \epsilon)\dfrac{dy}{d\xi} + \epsilon y = 0, & \xi \ge 0, \\ y|_{\xi=0} = 0. \end{cases} \tag{5.92}$$
Since the above equation is regularly perturbed, a series expansion for the inner solution in the variables $\xi, y$ can now be obtained in the usual way. Thus we consider a regular expansion of the form

$$y(\xi, \epsilon) = y_0(\xi) + \epsilon y_1(\xi) + \epsilon^2 y_2(\xi) + \cdots. \tag{5.93}$$

Similar to before, we can substitute (5.93) into (5.92), expand all terms in powers of $\epsilon$, and collect coefficients. Restricting attention to only the leading-order term $y_0(\xi)$, we get

$$\frac{d^2 y_0}{d\xi^2} + \frac{dy_0}{d\xi} = 0, \qquad y_0(0) = 0, \qquad \xi \ge 0. \tag{5.94}$$
The general solution of the above differential equation, obtained by considering the associated characteristic polynomial, is $y_0(\xi) = C_1 + C_2 e^{-\xi}$, where $C_1$ and $C_2$ are arbitrary constants. The boundary condition $y_0(0) = 0$ then implies $C_1 + C_2 = 0$, or equivalently $C_2 = -C_1$. Thus the leading-order approximation of the inner solution is

$$y^{\mathrm{in}}(\xi, \epsilon) = y_0(\xi) = C_1(1 - e^{-\xi}). \tag{5.95}$$

The remaining constant $C_1$ will be determined in the matching step outlined below. We can express the inner solution in terms of the original variables $x, y$ using the change of variable in (5.89), where $q_{\mathrm{in}} = 0$ and $\alpha = 1$, which gives

$$\left[ y^{\mathrm{in}}(\xi, \epsilon) \right]_{\xi = (x - q_{\mathrm{in}})/\epsilon^{\alpha}} = C_1(1 - e^{-x/\epsilon}). \tag{5.96}$$

For later reference, let $H^{\mathrm{in}}$ denote the value of the inner solution at the outer end point $x = q_{\mathrm{out}}$ in the limit $\epsilon \to 0^+$, which in view of the relation $\xi = (x - q_{\mathrm{in}})/\epsilon^{\alpha}$, corresponds to

$$H^{\mathrm{in}} = \lim_{\epsilon \to 0^+} \left[ y^{\mathrm{in}}(\xi, \epsilon) \right]_{\xi = (q_{\mathrm{out}} - q_{\mathrm{in}})/\epsilon^{\alpha}}. \tag{5.97}$$
Since $\xi \to \infty$ when $q_{\mathrm{out}} > q_{\mathrm{in}}$, we note that $H^{\mathrm{in}}$ corresponds to a horizontal asymptote, as illustrated in Figure 5.12. In the current example, we obtain $H^{\mathrm{in}} = \lim_{\xi \to \infty} y^{\mathrm{in}}(\xi, \epsilon) = C_1$. For problems in which the boundary layer is on the right, the definition of $H^{\mathrm{in}}$ in (5.97) would be the same, but the figure would be flipped: the inner region and $q_{\mathrm{in}}$ would be on the right. In this case, $H^{\mathrm{in}}$ would be a horizontal asymptote in the direction $\xi \to -\infty$ since $q_{\mathrm{out}} < q_{\mathrm{in}}$.

[Figure 5.12. The inner solution $y^{\mathrm{in}}(x, \epsilon)$ on $q_{\mathrm{in}} \le x \le q_{\mathrm{out}}$, rising from 0 at $q_{\mathrm{in}}$ toward the horizontal asymptote $H^{\mathrm{in}}$.]

Matching. To match the inner and outer approximations we set

$$H^{\mathrm{in}} = I^{\mathrm{out}}. \tag{5.98}$$
The overall approximation procedure will be successful provided that this condition can be satisfied, with both sides of the equation finite, and provided that all arbitrary constants in the approximations are determined. In the current example, the matching condition becomes $C_1 = e$, and hence the procedure will be successful.

Composite approximation. The approximation procedure is completed by combining the inner and outer approximations as illustrated in Figure 5.13. The outer approximation satisfies the outer boundary conditions, and takes the value $I^{\mathrm{out}}$ at the opposite end of the domain. Similarly, the inner approximation satisfies the inner boundary conditions, and tends to the value $H^{\mathrm{in}}$ at the opposite end of the domain. By adding the two graphs, we obtain an approximation over the entire domain, but with a common offset in the boundary conditions. Thus a combined or composite approximation is obtained by removing the offset.

[Figure 5.13. Three panels on $q_{\mathrm{in}} \le x \le q_{\mathrm{out}}$: the graphs of $y^{\mathrm{in}}$ and $y^{\mathrm{out}}$ with the levels $H^{\mathrm{in}}$ and $I^{\mathrm{out}}$; the sum $y^{\mathrm{in}} + y^{\mathrm{out}}$; and the composite $y^{\mathrm{in}} + y^{\mathrm{out}} - I^{\mathrm{out}}$.]

The result of the overall procedure is called the leading-order composite approximation and is given by

$$y^{\mathrm{comp}}(x, \epsilon) = y^{\mathrm{out}}(x, \epsilon) + \left[ y^{\mathrm{in}}(\xi, \epsilon) \right]_{\xi = (x - q_{\mathrm{in}})/\epsilon^{\alpha}} - I^{\mathrm{out}}. \tag{5.99}$$

In the current example, we obtain

$$y^{\mathrm{comp}}(x, \epsilon) = e^{1-x} - e^{1-(x/\epsilon)}, \qquad 0 \le x \le 1. \tag{5.100}$$
The above expression is the first or leading-order term in an expansion of the exact solution $y(x, \epsilon)$ of the original problem in (5.81). Procedures exist for determining higher-order terms in the expansion, but they are more involved and not addressed here. Although it is only leading-order, the approximation in (5.100) captures the essential character of the exact solution. Specifically, for small $\epsilon > 0$, the approximation has large slope in a thin boundary layer region adjacent to $x = 0$, and in the limit $\epsilon \to 0^+$, the slope becomes infinite leading to a jump discontinuity at $x = 0$.

Note. The scale or stretching transformation in (5.89) is only considered in the boundary layer region to counteract the infinite derivative that develops in the solution as $\epsilon \to 0^+$. If considered over the entire interval, this transformation would excessively stretch the solution in the outer region; it would stretch this part of the solution into a flat curve as $\epsilon \to 0^+$. For this reason, the problem is split into parts, which are treated differently, and then the parts are combined as described above.

Discussion. Here we explain how the function $y^{\mathrm{comp}}(x, \epsilon)$ defined by (5.99) approximates the solution $y(x, \epsilon)$ of the original system in (5.81). To begin, note that the solution of the original problem must also be a solution to the outer problem in (5.83) and the inner problem in (5.84). For each of these two problems, a series expansion of the solution was found, using a change of variable in the inner case, and the leading-order term in each solution was denoted as $y^{\mathrm{out}}(x, \epsilon)$ and $y^{\mathrm{in}}(\xi, \epsilon)$. Each of these functions is thus a leading-order approximation of $y(x, \epsilon)$, but each is only relevant and satisfies a given boundary condition in its respective region. To understand the composite approximation we need a more explicit description of the outer and inner regions. For this purpose, we introduce a matching point $x_{\mathrm{m}}$, which can be interpreted as the point where the two regions meet. (More appropriately, we could suppose that the two regions overlap, and $x_{\mathrm{m}}$ would be some point in the overlap.)
To define this point, we recall the change of variable used in the inner region, $x = q_{\mathrm{in}} + \epsilon^{\alpha} \xi$, where $\alpha > 0$ is the exponent defined in the procedure. Based on this expression we define a matching point by $x_{\mathrm{m}} = q_{\mathrm{in}} + C\epsilon^{\beta}$, where $C > 0$ and $0 < \beta < \alpha$ are given constants. By design, the matching point depends on $\epsilon$, and will not remain at a fixed location on the $x$-axis as $\epsilon \to 0^+$, and will also not remain at a fixed location on the $\xi$-axis when the variable is changed. Using the matching point, we denote the outer region by $x \in (x_{\mathrm{m}}, q_{\mathrm{out}}]$, and the inner region by $x \in [q_{\mathrm{in}}, x_{\mathrm{m}}]$. Equivalently, by the change of variable, the inner region corresponds to $\xi \in [0, \xi_{\mathrm{m}}]$, where $\xi_{\mathrm{m}} = C\epsilon^{\beta - \alpha}$. The form of the point ensures that $x_{\mathrm{m}} \to q_{\mathrm{in}}$ and $\xi_{\mathrm{m}} \to \infty$ as $\epsilon \to 0^+$. Thus the matching point tends to the left-most possible edge of the outer region in the sense that $x_{\mathrm{m}} \to q_{\mathrm{in}}$, and simultaneously, it tends to the right-most possible edge of the stretched inner region in the sense that $\xi_{\mathrm{m}} \to \infty$.

The motivation for the matching rule used in the procedure is now evident. Since $x_{\mathrm{m}} \to q_{\mathrm{in}}$, the solution of the outer problem at the matching point tends to the value $I^{\mathrm{out}}$ as $\epsilon \to 0^+$. Similarly, since $\xi_{\mathrm{m}} \to \infty$, the solution of the inner problem at the matching point tends to the value $H^{\mathrm{in}}$ as $\epsilon \to 0^+$. Since the outer and inner solutions must agree at the matching point as $\epsilon \to 0^+$, we obtain the rule $I^{\mathrm{out}} = H^{\mathrm{in}}$.

The manner in which $y^{\mathrm{comp}}(x, \epsilon)$ provides a leading-order approximation can now be described. For any point in the outer region $x \in (x_{\mathrm{m}}, q_{\mathrm{out}}]$ we have

$$y^{\mathrm{comp}}(x, \epsilon) = y^{\mathrm{out}}(x, \epsilon) + R(x, \epsilon), \tag{5.101}$$

where

$$R(x, \epsilon) = \left[ y^{\mathrm{in}}(\xi, \epsilon) \right]_{\xi = (x - q_{\mathrm{in}})/\epsilon^{\alpha}} - I^{\mathrm{out}}. \tag{5.102}$$
Since $x > x_{\mathrm{m}}$ we have $\xi > \xi_{\mathrm{m}}$, and we deduce that $\lim_{\epsilon \to 0^+} R(x, \epsilon) = 0$, which follows from the definition of $H^{\mathrm{in}}$ and the matching rule. Thus, in the outer region, the composite approximation is equal to the leading-order solution $y^{\mathrm{out}}(x, \epsilon)$ of the outer problem, plus a remainder term $R(x, \epsilon)$ which becomes vanishingly small as $\epsilon$ becomes vanishingly small. Moreover, because the outer solution satisfies the relevant boundary conditions at $q_{\mathrm{out}}$, so will the composite approximation, up to the remainder term.

Similarly, for any point in the inner region $x \in [q_{\mathrm{in}}, x_{\mathrm{m}}]$, we have

$$y^{\mathrm{comp}}(x, \epsilon) = \left[ y^{\mathrm{in}}(\xi, \epsilon) \right]_{\xi = (x - q_{\mathrm{in}})/\epsilon^{\alpha}} + \hat{R}(x, \epsilon), \tag{5.103}$$

where

$$\hat{R}(x, \epsilon) = y^{\mathrm{out}}(x, \epsilon) - I^{\mathrm{out}}. \tag{5.104}$$

Since $q_{\mathrm{in}} \le x \le x_{\mathrm{m}}$ and $x_{\mathrm{m}} \to q_{\mathrm{in}}$, we deduce that $\lim_{\epsilon \to 0^+} \hat{R}(x, \epsilon) = 0$, which follows from the definition of $I^{\mathrm{out}}$. Thus, in the inner region, the composite approximation is equal to the leading-order solution $y^{\mathrm{in}}(\xi, \epsilon)$ of the inner problem, plus a remainder term $\hat{R}(x, \epsilon)$ which becomes vanishingly small as $\epsilon$ becomes vanishingly small. Moreover, because the inner solution satisfies the relevant boundary conditions at $q_{\mathrm{in}}$, so will the composite approximation, up to the remainder term.

By design, the function $y^{\mathrm{comp}}(x, \epsilon)$ is defined over the entire interval $x \in [q_{\mathrm{in}}, q_{\mathrm{out}}]$, and in each of the subintervals $[q_{\mathrm{in}}, x_{\mathrm{m}}]$ and $(x_{\mathrm{m}}, q_{\mathrm{out}}]$, it is equal to the relevant leading-order solution of the differential equation, and satisfies the relevant boundary condition, up to a remainder term that vanishes as $\epsilon \to 0^+$.
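For the example (5.81), the quality of the composite approximation (5.100) can be examined directly. The following Python sketch is an added illustration. It assumes the closed-form solution $y = (e^{-x} - e^{-x/\epsilon})/(e^{-1} - e^{-1/\epsilon})$, which follows from the characteristic roots $-1$ and $-1/\epsilon$ of the differential equation but is not written out in the text. The function names are hypothetical.

```python
import math

def y_exact(x, eps):
    """Assumed closed-form solution of (5.81), built from the roots -1 and -1/eps."""
    return (math.exp(-x) - math.exp(-x / eps)) / (math.exp(-1.0) - math.exp(-1.0 / eps))

def y_comp(x, eps):
    """Leading-order composite approximation (5.100)."""
    return math.exp(1.0 - x) - math.exp(1.0 - x / eps)

def max_error(eps, n=2000):
    """Largest |y_exact - y_comp| on a uniform grid of n + 1 points in [0, 1]."""
    return max(abs(y_exact(i / n, eps) - y_comp(i / n, eps)) for i in range(n + 1))

for eps in (0.2, 0.1, 0.05):
    print(f"eps={eps}: max error = {max_error(eps):.3e}")
```

The maximum error over the whole interval, including the boundary layer, decreases as $\epsilon$ decreases. In this particular linear example the decrease is very rapid; that is a feature of the example and should not be expected of leading-order composite approximations in general.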
5.11. Case study

Setup. Singularly perturbed differential equations arise in a number of applications. Here we consider an example that arises in the modeling of an interface. As illustrated in Figure 5.14, we study a planar model for the interface between a liquid and gas in equilibrium. We suppose that a liquid of mass density $\rho$ occupies a portion of a rectangular container with an open top, which is exposed to a gas at a fixed, constant pressure $p_0$, with gravitational acceleration $g$ oriented vertically downward. We consider a coordinate system as shown, where the lateral position of the origin is midway between the left and right walls, and the vertical position is as described below. We suppose that the container is of size $2L > 0$ along the $x$-direction, and of size $w > 0$ along the $z$-direction into the page. In a planar slice of the container as illustrated, the interface appears as a curve $y(x)$. Physically, the interface can be identified with the top layer of the liquid, which acts like an elastic skin, and which is held in a stretched state by molecular forces. The net effect of these forces is quantified by the surface tension $\sigma$. Common experience tells us that the interface will be almost flat, except near the walls of the container, where the liquid surface will rise up and form a wetting angle $\gamma \ge 0$, or dip down and form an angle $\gamma \le 0$; this portion of the interface is called the meniscus. We seek to describe how the interface curve $y(x)$, and specifically the shape and height of the meniscus, depend on the given parameters $\rho, p_0, g, L, \sigma, \gamma$. Note that $\rho$ is a mass per unit volume, $p_0$ is a force per unit area, and $\sigma$ is a force per unit length.

[Figure 5.14. Planar slice of the container with walls at $x = -L$ and $x = L$: gas above and liquid below the interface curve $y(x)$, gravity $g$ pointing downward, and wetting angle $\gamma$ at the wall. A zoom of the interface shows the surface tension $\sigma$ in edge and top views of a patch of length $w$ in the $z$-direction.]

Outline of model. We assume that, in any given equilibrium state, the pressures in the gas and liquid are known functions of the vertical coordinate $y$. For the gas, we assume that the pressure is independent of $y$, so $p_{\mathrm{gas}}(y) \equiv p_0$. For the liquid, due to its weight, we assume that the pressure depends linearly on $y$, namely $p_{\mathrm{liq}}(y) = C - \rho g y$, where $C$ is an arbitrary constant, as required by the laws of hydrostatics. The vertical position of the origin is implicitly chosen so that $C = p_0$, which will be convenient for the derivation of the model. With this choice of origin the interface height $y|_{x=0}$ will not be known explicitly, but will instead be determined as part of the solution. Once $y|_{x=0}$ is known, then so too will be the origin; for instance, if $y|_{x=0}$ is positive, then the origin is below the interface by this amount. Note that when $\frac{\sigma}{\rho g L^2} \ll 1$, which is the case of interest here, the value $y|_{x=0}$ will be extremely small, and the origin will be extremely close to, and for all practical purposes on, the interface.

We next consider a small piece $\Gamma$ of the interface curve at an arbitrary point $(x, y(x))$ as illustrated in Figure 5.15. Provided that the piece is sufficiently small, and that the curve is twice continuously differentiable, the piece can be described to arbitrary accuracy as an arc of a circle of radius $r(x)$ and central angle $\phi$, which is aligned with the unit tangent and normal vectors $\vec{T}(x)$ and $\vec{N}(x)$ as shown. From calculus, the curvature of this circular arc is given by

$$\kappa(x) = \frac{1}{r(x)} = \frac{y''(x)}{[1 + (y'(x))^2]^{3/2}}. \tag{5.105}$$
Note that the size or length of $\Gamma$ is determined by $\phi$, and that $\Gamma$ will shrink to the central point $(x, y(x))$ in the limit $\phi \to 0^+$. In equilibrium, the forces acting on $\Gamma$ must be balanced. Forces arise due to the surface tension within the interface, and the pressures of the liquid and gas on the opposite sides of the interface. Since our goal is to consider the limit $\phi \to 0^+$, it will suffice to consider the pressures evaluated at the point $(x, y(x))$; variations in the pressures over $\Gamma$ will not contribute in the limit. To express the forces, we recall that the interface curve represents a slice of a surface, and thus a piece $\Gamma$ of the curve represents a patch of the surface. The patch has length $w$ in the $z$-direction, along which the surface tension is distributed, and has an area $w r(x) \phi$, over which the pressures are distributed.

[Figure 5.15. A small piece $\Gamma$ of the interface at $(x, y(x))$, drawn as a circular arc of radius $r(x)$ and central angle $\phi$, with unit tangent $\vec{T}(x)$ and normal $\vec{N}(x)$. The surface tension $\sigma$ acts at angle $\phi/2$ at each end, and the pressures $p_{\mathrm{gas}}$ and $p_{\mathrm{liq}}$ act on opposite sides.]

The forces due to the surface tension on the right and left sides of $\Gamma$ are

$$\begin{aligned} \vec{F}_{\sigma,\mathrm{right}} &= \sigma w \cos(\tfrac{\phi}{2}) \vec{T}(x) + \sigma w \sin(\tfrac{\phi}{2}) \vec{N}(x), \\ \vec{F}_{\sigma,\mathrm{left}} &= -\sigma w \cos(\tfrac{\phi}{2}) \vec{T}(x) + \sigma w \sin(\tfrac{\phi}{2}) \vec{N}(x). \end{aligned} \tag{5.106}$$

Similarly, the forces due to the liquid and gas pressures on the opposite sides of $\Gamma$ are

$$\begin{aligned} \vec{F}_{\mathrm{liq}} &= p_{\mathrm{liq}}(y(x)) \, w r(x) \phi \, \vec{N}(x), \\ \vec{F}_{\mathrm{gas}} &= -p_{\mathrm{gas}}(y(x)) \, w r(x) \phi \, \vec{N}(x). \end{aligned} \tag{5.107}$$

Balance of forces requires $\vec{F}_{\mathrm{liq}} + \vec{F}_{\mathrm{gas}} + \vec{F}_{\sigma,\mathrm{left}} + \vec{F}_{\sigma,\mathrm{right}} = \vec{0}$. Note that the components in the tangential direction $\vec{T}(x)$ are automatically balanced, which is due to symmetry and the fact that the surface tension is constant. For the components in the normal direction $\vec{N}(x)$, after using the explicit expressions for the pressures and dividing by $w r(x) \phi$, and noting that $\kappa(x) = 1/r(x)$, we obtain the relation

$$-\rho g y(x) + \frac{2\sigma\kappa(x) \sin(\frac{\phi}{2})}{\phi} = 0. \tag{5.108}$$
In the limit $\phi \to 0^+$, in which $\Gamma$ shrinks to a point, we note that $\frac{\sin(\phi/2)}{\phi} \to \frac12$, and thus we obtain $\sigma\kappa(x) = \rho g y(x)$. Substituting for the curvature from (5.105), we obtain the system

$$\frac{\sigma}{\rho g} \frac{d^2 y}{dx^2} = y \left[ 1 + \left( \frac{dy}{dx} \right)^2 \right]^{3/2}, \quad 0 \le x \le L, \qquad \frac{dy}{dx}\Big|_{x=0} = 0, \quad \frac{dy}{dx}\Big|_{x=L} = \tan\gamma. \tag{5.109}$$

The nonlinear differential equation above is called the Young–Laplace equation. It represents a local balance of forces that must hold at each point along the interface curve $y(x)$. Here we consider only the above planar version of the equation; there are various generalizations. Note that, by symmetry, we need only consider half of the interface corresponding to the interval $0 \le x \le L$. The boundary conditions correspond to a zero slope at $x = 0$, which is required by symmetry and the fact that the point of the interface midway between the walls is a local extremum, and a prescribed slope at $x = L$, which is dictated by the wetting angle $\gamma$. We seek an expression for the curve $y(x)$ and the meniscus end point height $y_\# = y|_{x=L}$, and assume $\rho > 0$, $g > 0$, $\sigma > 0$, $L > 0$ and $\gamma \in (0, \frac{\pi}{2})$ are given constants.

Analysis of model. It is convenient to introduce dimensionless variables defined by the scale transformation $h = y/L$ and $s = x/L$. In terms of these variables, the system takes the form

$$\epsilon \frac{d^2 h}{ds^2} = h \left[ 1 + \left( \frac{dh}{ds} \right)^2 \right]^{3/2}, \quad 0 \le s \le 1, \qquad \frac{dh}{ds}\Big|_{s=0} = 0, \quad \frac{dh}{ds}\Big|_{s=1} = \tan\gamma, \quad \epsilon = \frac{\sigma}{\rho g L^2}. \tag{5.110}$$
Singularly perturbed features. Since $\sigma$ is a microscopic quantity, whereas $\rho g L^2$ is macroscopic, we naturally have $0 < \epsilon \ll 1$. We note that the parameter $\epsilon$ is the coefficient of the highest-order term, and in the extreme case when $\epsilon = 0$, the only solution of the differential equation is $h(s) \equiv 0$, which satisfies the boundary condition at $s = 0$, but not at $s = 1$. Thus any solution for $\epsilon > 0$ cannot be continued to $\epsilon = 0$ and the system is singularly perturbed. Intuitively, we expect that the system has a boundary layer in the meniscus region at $s = 1$. However, in contrast to the example outlined in the previous section, the singular behavior in the boundary layer does not appear in the first derivative or slope $\frac{dh}{ds}$, but rather the second derivative $\frac{d^2 h}{ds^2}$. Specifically, $\frac{dh}{ds}|_{s=1}$ remains bounded and equal to $\tan\gamma$ in the limit $\epsilon \to 0^+$, whereas $\frac{d^2 h}{ds^2}|_{s=1}$ grows unbounded. Thus the boundary layer or inner region is on the right, with endpoint $q_{\mathrm{in}} = 1$, and the outer region is on the left, with endpoint $q_{\mathrm{out}} = 0$. In dimensionless variables, the meniscus end point $(x, y) = (L, y_\#)$ becomes $(s, h) = (1, h_\#)$, where $h_\# = y_\#/L$.

Outer equations. The equations for the interface curve in the outer region $s \ge q_{\mathrm{out}}$ are the differential equation $\epsilon \frac{d^2 h}{ds^2} - h[1 + (\frac{dh}{ds})^2]^{3/2} = 0$, together with the boundary condition $\frac{dh}{ds}|_{s=q_{\mathrm{out}}} = 0$. In this region, no special treatment is required, and we consider a regular series expansion for $h(s, \epsilon)$. Upon substituting the expansion into the differential equation and boundary condition, and collecting the coefficients of $\epsilon^0$, we get

$$-h_0 \left[ 1 + \left( \frac{dh_0}{ds} \right)^2 \right]^{3/2} = 0, \qquad \frac{dh_0}{ds}\Big|_{s=q_{\mathrm{out}}} = 0, \qquad s \ge q_{\mathrm{out}}. \tag{5.111}$$
By inspection, since the sum within the brackets is positive, we note that the only solution of the differential equation is $h_0(s) \equiv 0$. Moreover, although there are no arbitrary constants in this solution, we note that it satisfies the boundary condition. Thus the leading-order outer solution, and the associated intercept used for matching, are

$$h^{\mathrm{out}}(s, \epsilon) = h_0(s) \equiv 0, \qquad I^{\mathrm{out}} = \lim_{\epsilon \to 0^+} h^{\mathrm{out}}(q_{\mathrm{in}}, \epsilon) = 0. \tag{5.112}$$

For future reference, the matching condition will be $H^{\mathrm{in}} = I^{\mathrm{out}} = 0$, and the leading-order composite solution will be $h^{\mathrm{comp}} = h^{\mathrm{out}} + h^{\mathrm{in}} - I^{\mathrm{out}} = h^{\mathrm{in}}$. Thus the leading-order inner solution will itself be the composite solution, provided that the matching condition can be satisfied.
Inner equations. The equations for the interface curve in the inner region $s \le q_{\mathrm{in}}$ are the differential equation $\epsilon \frac{d^2 h}{ds^2} = h[1 + (\frac{dh}{ds})^2]^{3/2}$, together with the boundary condition $\frac{dh}{ds}|_{s=q_{\mathrm{in}}} = \tan\gamma$. In this region we introduce a scale or stretching transformation that will counteract the infinite (second) derivative that develops in the solution; equivalently, the transformation will make the inner equations regularly perturbed. We consider the change of variable $\xi = \epsilon^{-\alpha}(s - q_{\mathrm{in}})$ and $u = \epsilon^{-\beta} h$, where $\alpha > 0$ and $\beta > 0$ are exponents to be determined. Substituting this change of variable into the inner equations, and using the relevant derivative relations, we find that $\alpha = \tfrac12$ and $\beta = \tfrac12$ are the smallest exponents that will make the system regular. Remarkably, for this choice of exponents, all factors of $\epsilon$ cancel out, and the normalized inner equations become

$$\frac{d^2 u}{d\xi^2} = u \left[ 1 + \left( \frac{du}{d\xi} \right)^2 \right]^{3/2}, \qquad \frac{du}{d\xi}\Big|_{\xi=0} = \tan\gamma, \qquad \xi \le 0. \tag{5.113}$$

The above equations imply that, in the variables $\xi, u$, the inner solution is independent of $\epsilon$. Thus the expansion of the inner solution consists of only one term, and we have $u^{\mathrm{in}}(\xi, \epsilon) = u^{\mathrm{in}}(\xi) = u_0(\xi)$. For convenience, we use the notation $u(\xi)$ in place of $u^{\mathrm{in}}(\xi)$. Since the outer solution is identically zero (in $h$ and also $u$), the matching condition for the inner solution is

$$0 = H^{\mathrm{in}} = \lim_{\epsilon \to 0^+} \left[ u^{\mathrm{in}}(\xi, \epsilon) \right]_{\xi = (q_{\mathrm{out}} - q_{\mathrm{in}})/\epsilon^{\alpha}} = \lim_{\xi \to -\infty} u(\xi). \tag{5.114}$$
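The exponents $\alpha = \beta = \tfrac12$ quoted above can be traced through the substitution. The following is a sketch of that step, written out here as an added derivation rather than quoted from the text; primes denote derivatives with respect to the stretched variable.

```latex
% Substitute h = \epsilon^{\beta} u and s = q_{in} + \epsilon^{\alpha} \xi into the inner equations:
\begin{align*}
\epsilon^{1+\beta-2\alpha}\, u'' &= \epsilon^{\beta}\, u \left[ 1 + \epsilon^{2(\beta-\alpha)} (u')^2 \right]^{3/2},
& \epsilon^{\beta-\alpha}\, u'(0) &= \tan\gamma . \\
% Normalize so that u'' and u'(0) have coefficient one:
u'' &= \epsilon^{2\alpha-1}\, u \left[ 1 + \epsilon^{2(\beta-\alpha)} (u')^2 \right]^{3/2},
& u'(0) &= \epsilon^{\alpha-\beta} \tan\gamma . \\
% Every power of \epsilon must be nonnegative:
2\alpha - 1 &\ge 0, \quad \beta - \alpha \ge 0, \quad \alpha - \beta \ge 0
& &\Longrightarrow \quad \alpha = \beta \ge \tfrac12 .
\end{align*}
```

Under this reading, the smallest admissible choice is $\alpha = \beta = \tfrac12$, for which each power of $\epsilon$ reduces to $\epsilon^0 = 1$, consistent with the statement that all factors of $\epsilon$ cancel in (5.113).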
The general solution of the differential equation in (5.113) subject to the matching condition (5.114) can be found by direct integration. To carry out this integration we make some further assumptions consistent with the case when $\tan\gamma > 0$: we assume that $u > 0$ and $\frac{du}{d\xi} > 0$ in the inner region, and also $\lim_{\xi \to -\infty} \frac{du}{d\xi}(\xi) = 0$. In this case, we find that the solution is given by (see Exercise 23)

$$\xi = F(u) - B \quad \text{where} \quad F(u) = \sqrt{4 - u^2} + \ln\left( \frac{u}{2 + \sqrt{4 - u^2}} \right). \tag{5.115}$$

Here $B$ is an arbitrary constant. Note that the solution has the implicit form $\xi = \xi(u)$, rather than $u = u(\xi)$. The functions $\xi = \xi(u)$ and $u = u(\xi)$ are both well defined, and have graphs that are monotonic with positive slope, provided that $u \in (0, \sqrt2)$. In the inner variables, the meniscus end point $(s, h) = (1, h_\#)$ corresponds to $(\xi, u) = (0, u_\#)$. At this point, the solution must satisfy $\frac{du}{d\xi}|_{\xi=0} = \tan\gamma$, or equivalently $\frac{d\xi}{du}|_{u=u_\#} = \frac{1}{\tan\gamma}$. In view of (5.115), we obtain the condition

$$F'(u_\#) = \frac{1}{\tan\gamma}. \tag{5.116}$$

The above algebraic equation has a unique root $u_\# \in (0, \sqrt2)$ for any given $\gamma \in (0, \frac{\pi}{2})$, which can be found explicitly. Moreover, since $u = u_\#$ when $\xi = 0$, the arbitrary constant in (5.115) must have the value $B = F(u_\#)$. Thus a unique solution to the inner equations is obtained, which satisfies the matching condition required by the outer solution, along with the given boundary condition at the meniscus endpoint.

Composite solution. As noted earlier, the leading-order composite solution for the interface curve will be equal to the leading-order inner solution, which has the implicit form $\xi = F(u) - F(u_\#)$. By reversing the change of variables from $\xi, u$ to $s, h$, and then from $s, h$ to $x, y$, we obtain a solution of the original problem. The meniscus endpoint height $y_\#$ can be obtained directly from the change of variable relations $y_\# = L h_\#$ and $h_\# = \epsilon^{1/2} u_\#$. Thus the meniscus height itself is determined by the root of the algebraic equation in (5.116). This problem is further explored in the Exercises.

Note. Solving the inner system in (5.113) appears to be just as difficult as solving the original system in (5.110). However, there are important conceptual differences that make the inner system more tractable than the original. For the original system, the limit $\epsilon \to 0^+$ fundamentally changes the form of the differential equation, while the interval $s \in [0, 1]$ and boundary conditions remain fixed. For the inner system, the differential equation has fixed coefficients and its form does not change; instead, the limit $\epsilon \to 0^+$ gives rise to matching conditions as $\xi \to -\infty$. The fact that conditions are imposed at infinity, rather than at a finite point, makes the inner system more readily solvable.
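The root of (5.116) and the resulting meniscus height can be computed with a few lines of Python. The sketch below is an added illustration and rests on several assumptions that are not taken from the text. The simplified derivative $F'(u) = (2 - u^2)/(u\sqrt{4 - u^2})$ was worked out by hand from (5.115). The closed form $u_\# = 2\sin(\gamma/2)$ used for comparison was derived independently, and is offered only as one expression consistent with the remark that the root can be found explicitly. The physical values are illustrative, roughly water in air in SI units. All function names are hypothetical.

```python
import math

def F(u):
    """F(u) from (5.115)."""
    s = math.sqrt(4.0 - u * u)
    return s + math.log(u / (2.0 + s))

def F_prime(u):
    """Hand-simplified derivative of F: (2 - u^2) / (u * sqrt(4 - u^2))."""
    return (2.0 - u * u) / (u * math.sqrt(4.0 - u * u))

def solve_u_sharp(gamma, tol=1e-12):
    """Bisection for F'(u) = 1/tan(gamma) on (0, sqrt(2)), where F' decreases."""
    target = 1.0 / math.tan(gamma)
    lo, hi = 1e-12, math.sqrt(2.0)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if F_prime(mid) > target:
            lo = mid  # F' is still too large, so the root lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)

def meniscus_height(gamma, sigma, rho, g, L):
    """y# = L * sqrt(eps) * u#, with eps = sigma / (rho * g * L**2) as in (5.110)."""
    eps = sigma / (rho * g * L * L)
    return L * math.sqrt(eps) * solve_u_sharp(gamma)

gamma = math.radians(60.0)
print(solve_u_sharp(gamma), 2.0 * math.sin(gamma / 2.0))
# Illustrative parameter values only; they are assumptions, not data from the text.
print(meniscus_height(gamma, sigma=0.072, rho=1000.0, g=9.81, L=0.05))
```

Since $y_\# = L\epsilon^{1/2} u_\#$ and $\epsilon = \sigma/(\rho g L^2)$, the container half-width $L$ cancels, so at leading order the computed height depends only on $\sigma$, $\rho$, $g$ and $\gamma$.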
Reference notes

Perturbation methods have been used to study various types of equations in a wide range of applications, from celestial mechanics to quantum mechanics, across all branches of math and science. Here we touched only the elementary theory for algebraic and ordinary differential equations. The main results outlined here on the existence and convergence of perturbation series were based on fundamental results about analytic functions. For background on such functions and other results from complex analysis see Churchill and Brown (2014) and Gunning and Rossi (2009).

Perturbed algebraic equations as considered here fall within the general theory of implicit equations and algebraic curves; see Casas-Alvero (2000), Krantz and Parks (2002), and Wall (2004). An important application in this setting is the study of eigenvalues and eigenvectors of matrices and more general operators; see Kato (1982) and Rellich (1969). Perturbed ordinary differential equations as considered here fall within the general theory of differential equations with parameters. Such equations are discussed in the classic text by Coddington and Levinson (1955), and also Hille (1976).

Perturbation methods for a wealth of examples are considered in Bender and Orszag (1999). The Poincaré–Lindstedt method is one example from a collection that is designed for equations with periodic structure at one or more scales. Other methods in this collection include those based on averaging and homogenization; see for example the texts by Holmes (2013) and Sanders, Verhulst, and Murdock (2007). Boundary layers and related phenomena such as interior layers and turning points arise in various applications involving ordinary and partial differential equations; see Gie et al. (2018), Holmes (2013), and Neu (2015). Expansion methods for boundary layer problems, employing higher-order matching rules than considered herein, are discussed in Kevorkian and Cole (1981) and Bender and Orszag (1999).
Exercises
133
Specialized methods for linear equations, which exploit representations of solutions in terms of integrals or special functions, are also important; examples of such methods, including the WKB method among others, can be found in Bender and Orszag (1999) and Holmes (2013).
Exercises 1. Verify the order statement for each function π(π) in the limit π β 0+ . Each π(π) is as described or given, and continuous for π β (0, 1]. (a) If π(π) is bounded, then π(π) = π(1). (b) If π(π) β 0 as π β 0+ , then π(π) = π(1). (c) If π(π) = βπ(1 β π), then π(π) = π(π1/2 ). (d) If π(π) = π sin(π2 ), then π(π) = π(π3 ). π
2
(e) If π(π) = β«0 πβπ ππ , then π(π) = π(π). (f) If π(π) = πβ1/π , then π(π) = π(ππ ) for any π β₯ 0. 2. Recall that, if β(π) has a series expansion in whole powers of π, with a positive radius of convergence about π = 0, then the expansion can be found using Taylorβs 1 formula β(π) = β(0) + πββ² (0) + 2! π2 ββ³ (0) + β―. (a) Let π(π) = ln(1+π). Use Taylorβs formula to find the first three nonzero terms in the series expansion in powers of π. (b) Let π(π) = ln(1 + βπ). Explain why this function cannot have a series expansion in powers of π. [Are derivatives defined at π = 0?] (c) Using (a), find the first three nonzero terms in a series expansion for π(π) in powers of π1/2 . 3. If π₯(π) is analytic at π = 0, with expansion π₯(π) = π₯0 + ππ₯1 + π2 π₯2 + β―, then each function π(π) below is also analytic at π = 0, with expansion π(π) = π0 + ππ1 + π2 π2 + β―. Find π0 , π1 , π2 in terms of π₯π , π β₯ 0. 1 . 1+ππ₯(π)
(a) π(π) = π2 π₯3 (π).
(b) π(π) =
(c) π(π) = (1 + ππ₯(π))3/2 .
(d) π(π) = sin(ππ₯(π)).
(e) π(π) = πβπ₯(π) .
(f) π(π) = π₯2 (π)π₯β² (π).
4. For each regularly perturbed equation, find a two-term perturbation approximation of each real or complex root, where $0 \le \varepsilon \ll 1$.
(a) $x^3 + 2x + \varepsilon = 0$.
(b) $x^4 - x^2 + \varepsilon = 0$.
(c) $x^4 + 2\varepsilon x^2 - \varepsilon^2 x - 4 = 0$.
(d) $x^3 - 4\varepsilon x^2 - x = 0$.
(e) $x^4 - 2x^3 - 2\varepsilon = 0$.
(f) $(x + 1)^3 = \varepsilon x + 2\varepsilon^2$.
(g) $x^4 = 2x^2 + \varepsilon x$.
(h) $x^3 = 5x^2 - 6x - \varepsilon$.
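5. Find a two-term perturbation approximation of each real or complex root, where $0 \le \varepsilon \ll 1$. [Some or all roots may be degenerate.]
(a) $x^3 + \varepsilon x + \varepsilon^2 = 0$.
(b) $x^3 + \varepsilon^2 x + \varepsilon^3 = 0$.
(c) $x^3 + \varepsilon^2 x + \varepsilon = 0$.
(d) $x^4 + \varepsilon x + \varepsilon = 0$.
(e) $x^4 + \varepsilon x + \varepsilon^2 = 0$.
(f) $x^4 + \varepsilon^2 x + \varepsilon^3 = 0$.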
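A two-term root approximation of the kind requested in Exercises 4 and 5 can be checked numerically. The following is a minimal sketch in Python with NumPy, using the illustrative equation $x^3 - x + \varepsilon = 0$, which is an assumption chosen for this sketch and is not one of the exercises. Its unperturbed roots are $x_0 \in \{-1, 0, 1\}$, and the $O(\varepsilon)$ balance gives $x_1 = -1/(3x_0^2 - 1)$. The function names are hypothetical.

```python
import numpy as np

def two_term_roots(eps):
    """Two-term approximations x0 + eps*x1 for x^3 - x + eps = 0 (illustrative equation)."""
    approx = []
    for x0 in (-1.0, 0.0, 1.0):           # roots of the unperturbed equation x^3 - x = 0
        x1 = -1.0 / (3.0 * x0**2 - 1.0)   # from the O(eps) balance (3*x0^2 - 1)*x1 + 1 = 0
        approx.append(x0 + eps * x1)
    return np.array(approx)

def numerical_roots(eps):
    """Roots of x^3 - x + eps = 0 from numpy.roots, sorted in ascending order."""
    roots = np.roots([1.0, 0.0, -1.0, eps])
    return np.sort(roots.real)            # all three roots are real for small eps

eps = 0.01
approx = np.sort(two_term_roots(eps))
exact = numerical_roots(eps)
errors = np.abs(approx - exact)           # expected to be of order eps^2 or smaller
print(approx, exact, errors)
```

The same comparison can be repeated for several values of $\varepsilon$ to see whether the error shrinks at the expected rate.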
6. To estimate the roots of $x^3 - 4.01x + 0.02 = 0$, we may consider $x^3 - (4 + \varepsilon)x + 2\varepsilon = 0$. Find a perturbation approximation of each real or complex root up to order $O(\varepsilon^2)$, and numerically estimate the roots of the original equation.
7. To estimate the real roots of $4 - x^2 = e^{0.1x}$, we may consider $4 - x^2 = e^{\varepsilon x}$. Find a perturbation approximation of each real root up to order $O(\varepsilon^2)$, and numerically estimate the roots of the original equation.
8. To estimate the real root of $\sin(x + \frac{\pi}{30}) = 4x$, we may consider $\sin(x + \varepsilon) = 4x$. Find a perturbation approximation of the real root up to order $O(\varepsilon^2)$, and numerically estimate the root of the original equation. [Note that the only real solution of $\sin(x_0) = 4x_0$ is $x_0 = 0$.]
9. For each regularly perturbed equation, find a perturbation approximation of the solution up to order $O(\varepsilon)$, where $0 \le \varepsilon \ll 1$.
(a) $\frac{du}{dt} + u = \frac{1}{1 + \varepsilon u}$,
(b)
πα΅ ππ‘
=
(c)
πα΅ ππ‘
+ 2π’ + 6π = (2 + ππ’)3 ,
π’|π‘=0 = 1,
(d)
πα΅ ππ‘
= π’2 β ππ‘,
π‘ β₯ 0.
(e)
πα΅ ππ‘
= π‘πβα΅ + π,
6α΅ , 2+πα΅
π’|π‘=0 = 0,
π’|π‘=0 = 4,
π‘ β₯ 0.
π‘ β₯ 0.
π’|π‘=0 = 1, π’|π‘=0 = 0,
π‘ β₯ 0.
π‘ β₯ 0.
(f)
πα΅ ππ‘
(g)
π2 α΅ ππ‘2
= ππβπ‘ ππ‘ ,
(h)
π2 α΅ ππ‘2
+
= 1 + ππ’2 sin π‘, πα΅
πα΅ ππ‘
π’|π‘=0 = 0,
πα΅ | ππ‘ π‘=0
= 1,
πα΅ | ππ‘ π‘=0
= βππ’2 ,
π‘ β₯ 0. π’|π‘=0 = 0,
= 1,
π‘ β₯ 0.
π’|π‘=0 = 0,
π‘ β₯ 0.
10. For each regularly perturbed equation, find a two-term perturbation approximation of the solution, where 0 β€ π βͺ 1. (a)
πα΅ ππ‘
+ 4π’ = (1 + π ππ‘ )β1/2 ,
πα΅
(b)
πα΅ ππ‘
+ 4π’ = π( ππ‘ )2 π’ + 2π,
(c)
πα΅ ππ‘
= πβπα΅ β π’,
(d)
πα΅ ππ‘
= 1 + sin( α΅ ),
(e)
π2 α΅ ππ‘2
+ π( ππ‘ )2 = sin π‘,
(f)
π2 α΅ ππ‘2
= β1 + ππ’,
πα΅
π
πα΅
π’|π‘=0 = 1,
π‘ β₯ 0.
π’|π‘=0 = 1,
π‘ β₯ 0.
π’|π‘=0 = 4,
π‘ β₯ 0.
π’|π‘=0 = 1, πα΅ | ππ‘ π‘=0
πα΅ | ππ‘ π‘=0
π‘ β₯ 0. = 1,
= 1,
π’|π‘=0 = 0,
π’|π‘=0 = 0,
π‘ β₯ 0.
π‘ β₯ 0.
11. A model for the vertical motion of a projectile that includes the effect of nonconstant gravity is shown below. Here π¦ is height, π‘ is time, π is gravitational acceleration at the surface of the earth, π
is the radius of the earth, and π£ 0 is the launch velocity. π π2π¦ =β , 2 ππ‘ ((π¦/π
) + 1)2
ππ¦ | = π£0 , ππ‘ π‘=0
π¦|π‘=0 = 0,
π‘ β₯ 0.
(a) Let π = π‘/π and π’ = π¦/π. Find the scales π and π so that the scaled equations 1 πα΅ π2 α΅ become ππ2 = β (πα΅+1)2 , ππ |π=0 = 1, π’|π=0 = 0. Identify the parameter π. (b) Find a perturbation approximation of the solution π’(π, π) up to order π(π) assuming 0 β€ π βͺ 1. (c) Express the results in terms of π¦, π‘, π, π
and π£ 0 . Under what condition on π, π
and π£ 0 would we have 0 β€ π βͺ 1? 12. A model for a projectile that includes both air resistance and nonconstant gravity is shown below, in dimensionless form, where π¦ is height, π‘ is time, and 0 β€ π βͺ 1 is a parameter. π 2 π¦ ππ¦ 1 , + =β 2 ππ‘ ππ‘ (ππ¦ + 1)2
ππ¦ | = 1, ππ‘ π‘=0
π¦|π‘=0 = 0.
Find a perturbation approximation of the solution π¦(π‘, π) up to order π(π).
13. In dimensionless form, a model for a thermo-chemical reaction is shown below, where π’ and π are the concentration and temperature of the reactant, π‘ is time, and 0 β€ π βͺ 1 is a parameter. ππ’/ππ‘ = 1 β π’ππ(πβ1) , ππ/ππ‘ = βπ + π’π
π(πβ1)
π’|π‘=0 = 1, ,
π|π‘=0 = 2,
π‘ β₯ 0.
Find a perturbation approximation of the solution (π’, π)(π‘, π) up to order π(π). 14. Use the PoincarΓ©βLindstedt method to obtain a two-term perturbation approximation for each of the following, where 0 β€ π βͺ 1. (a)
π2 π₯ ππ‘2
+ 9π₯ = ππ₯3 β ππ₯,
(b)
π2 π₯ ππ‘2
+ 4π₯ = ππ₯( ππ‘ )2 ,
(c)
π2 π₯ ππ‘2
+ 4π₯ = 8ππ₯3 ,
(d)
π2 π₯ ππ‘2
ππ₯
ππ₯ | ππ‘ π‘=0 ππ₯ | ππ‘ π‘=0
ππ₯ | ππ‘ π‘=0
π2 π₯
+ 9π₯ = ππ₯2 ( ππ‘2 ),
= 0,
π₯|π‘=0 = 1,
π‘ β₯ 0.
= 0,
π₯|π‘=0 = 1,
π‘ β₯ 0.
= 1,
ππ₯ | ππ‘ π‘=0
π₯|π‘=0 = 0,
= 1,
π‘ β₯ 0.
π₯|π‘=0 = 0,
π‘ β₯ 0.
15. In dimensionless form, a model of a visco-elastic spring-mass system is shown ππ₯ ππ₯ below, where π(π, π, ππ‘ ) = 1 + ππ( ππ‘ )2 is a velocity-dependent spring stiffness coefficient, 0 β€ π βͺ 1 is a parameter, and π = Β±1 is a sign factor that determines whether the spring gets stiffer or softer with velocity. ππ₯ π2π₯ ππ₯ + π(π, π, )π₯ = 0, | = 0, π₯|π‘=0 = 1, π‘ β₯ 0. ππ‘ ππ‘ π‘=0 ππ‘2 Use the PoincarΓ©βLindstedt method to obtain a two-term perturbation approximation of the solution π₯(π‘, π, π). 16. For each singularly perturbed equation, find a two-term perturbation approximation of each regular and singular root, whether real or complex, where 0 < π βͺ 1. (a) ππ₯3 + π₯2 β π₯ + π = 0.
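(b) $\varepsilon x^3 + x + 1 = 0$.
(c) $\varepsilon^2 x^3 + 4\varepsilon x^2 + 2x + 1 = 0$.
(d) $\varepsilon x^4 + 4x^2 + \varepsilon x + 9 = 0$.
(e) $\varepsilon x^5 - \varepsilon x^2 - x + 2 = 0$.
(f) $\varepsilon x^4 + \varepsilon x^2 - x + 3 = 0$.
(g) $\varepsilon^2 x^4 + \varepsilon x^2 + x + 1 = 0$.
(h) $\varepsilon^2 x^6 - \varepsilon x^4 - x^3 + 8 = 0$.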
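For singularly perturbed equations such as those in Exercise 16, a numerical check can help confirm that a singular root grows like a negative power of $\varepsilon$. The sketch below uses the illustrative equation $\varepsilon x^2 + x + 1 = 0$, an assumption made for this sketch and not one of the exercises. Its regular root is $x \approx -1 - \varepsilon$, and the rescaling $x = z/\varepsilon$ gives the singular root $x \approx -1/\varepsilon + 1$. The variable names are hypothetical.

```python
import numpy as np

eps = 0.01

# Illustrative equation (not one of the exercises): eps*x^2 + x + 1 = 0.
regular_approx = -1.0 - eps          # regular root: x ~ -1 - eps
singular_approx = -1.0 / eps + 1.0   # singular root: x ~ -1/eps + 1, from x = z/eps

# Both roots are real for small eps; after sorting, the singular root comes first.
roots = np.sort(np.roots([eps, 1.0, 1.0]).real)
singular_exact, regular_exact = roots

regular_error = abs(regular_approx - regular_exact)     # expected size: order eps^2
singular_error = abs(singular_approx - singular_exact)  # expected size: order eps
print(regular_error, singular_error)
```

Note that the absolute error in the singular root is larger than in the regular root, since the neglected term in its expansion is of order $\varepsilon$ rather than $\varepsilon^2$.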
17. Experiments performed on a nonideal gas at fixed volume show that the pressure π > 0 is related to the absolute temperature π > 0 as 3
2
π π π π = π( ) + π( ) + πΎ( ), π0 π0 π0 π0 where π0 is a reference pressure, π0 is a reference temperature, and π, π and πΎ are dimensionless constants with values 0 < π βͺ π β€ πΎ β€ 1. Here we develop an approximate inverse of this relation.
π
π
(a) Let π¦ = π , π₯ = π and π = π, and consider the cubic equation ππ₯3 + ππ₯2 + 0 0 πΎπ₯ β π¦ = 0. Find a two-term perturbation approximation to each root π₯ for any given π¦ and 0 < π βͺ π β€ πΎ β€ 1. (b) The given relation π¦ = ππ₯3 +ππ₯2 +πΎπ₯ has the property that π¦ β [0, π+π+πΎ] for π₯ β [0, 1]. Which root π₯ = π₯(π¦, π, π, πΎ) in (a) has the property that π₯ β [0, 1] (approximately) for π¦ β [0, π + π + πΎ]? 18. For each singularly perturbed equation, find a leading-order composite approximation of the solution assuming a boundary layer on the indicated side of the interval, where 0 < π βͺ 1. π2 π¦
ππ¦
π2 π¦
ππ¦
π2 π¦
ππ¦
π2 π¦
ππ¦
(a) π ππ₯2 + 4 ππ₯ + π¦ = 3, π¦|π₯=0 = 2, π¦|π₯=1 = 1, 0 β€ π₯ β€ 1, left. (b) π ππ₯2 + 2 ππ₯ + ππ¦ = 0, π¦|π₯=0 = 0, π¦|π₯=1 = 0, 0 β€ π₯ β€ 1, left. (c) π ππ₯2 β 2 ππ₯ β 4π¦ = 0, π¦|π₯=0 = 3, π¦|π₯=2 = 6, 0 β€ π₯ β€ 2, right. (d) π ππ₯2 β π₯ ππ₯ = β4, π¦|π₯=1 = 1, π¦|π₯=3 = 3, 1 β€ π₯ β€ 3, right. π2 π¦
(e) π ππ₯2 +
ππ¦ ππ₯
1
1
= βπ¦2 , π¦|π₯=0 = 4 , π¦|π₯=1 = 2 , 0 β€ π₯ β€ 1, left.
19. For each singularly perturbed equation, find a leading-order composite approximation of the solution assuming a boundary layer at π‘ = 0, where 0 < π βͺ 1. For problems involving time, such a boundary layer is called an initial layer. πα΅
(a) π ππ‘ + π’ = πβπ‘ , π’|π‘=0 = 2, 0 β€ π‘ < β. π2 α΅
(b) π ππ‘2 +
πα΅ ππ‘
πα΅ | ππ‘ π‘=0
+ π’2 = 0,
π2 α΅
πα΅
= 0, π’|π‘=0 = 3, 0 β€ π‘ < β.
πα΅
(c) π ππ‘2 + (π‘ + 1)2 ππ‘ = 1, π ππ‘ |π‘=0 = 1, π’|π‘=0 = 1, 0 β€ π‘ < β. π2 α΅
πα΅
π2 α΅
πα΅
πα΅
(d) π ππ‘2 + 2 ππ‘ = π‘π’2 , π ππ‘ |π‘=0 = 2, π’|π‘=0 = 5, 0 β€ π‘ < β. πα΅
(e) π ππ‘2 + 4 ππ‘ β ππ’ = π‘, π ππ‘ |π‘=0 = 6, π’|π‘=0 = 3, 0 β€ π‘ < β. 20. For each singularly perturbed system, find a leading-order composite approximation of the solution assuming an initial layer at π‘ = 0, where π‘ β₯ 0 and 0 < π βͺ 1. (a)
ππ₯ ππ‘
= π¦ β 2π₯ + ππ₯π¦, π ππ‘ = π₯ β π¦ β ππ₯π¦, π₯|π‘=0 = 2, π¦|π‘=0 = 1.
ππ¦
(b)
ππ₯ ππ‘
= π sin π₯ β π¦, π ππ‘ = π₯ β π¦ β 2 + ππ¦3 , π₯|π‘=0 = 1, π¦|π‘=0 = 0.
ππ¦
21. In dimensionless form, a model for a two-step chemical reaction involving constituents π, π , π, π is X+Y
ππ₯ 1 = β π₯π¦ + π’, ππ‘ π π₯|π‘=0 = π₯# ,
ππ¦ 1 = β π₯π¦ + π’, ππ‘ π π¦|π‘=0 = π¦# ,
U
V
ππ’ 1 = π₯π¦ β π’ β πΎπ’, ππ‘ π π’|π‘=0 = 0,
ππ£ = πΎπ’, ππ‘ π£|π‘=0 = 0.
Here π₯, π¦, π’, π£ are concentrations, π‘ is time, π, πΎ are positive constants, and we ππ₯ ππ¦ assume π₯# β₯ π¦# > 0. The first two equations imply ππ‘ = ππ‘ , which gives π‘ π¦(π‘) = π₯(π‘) β π, where π = π₯# β π¦# . The last equation implies π£(π‘) = β«0 πΎπ’(π‘)Μ ππ‘.Μ By substituting for π¦, and combining the first and third equations, the system reduces to ππ₯ π = βπ₯(π₯ β π) + ππ’, π₯|π‘=0 = π₯# , ππ‘ ππ’ ππ₯ =β β πΎπ’, π’|π‘=0 = 0. ππ‘ ππ‘ When the reaction π + π β π is much faster than the other two, we have 0 < π βͺ 1, and the system is singularly perturbed with an initial layer at π‘ = 0. (a) Find leading-order composite approximations for π₯(π‘, π) and π’(π‘, π) on the interval 0 β€ π‘ < β. (b) Using the results from (a), find explicit leading-order expressions for π¦(π‘, π) and π£(π‘, π). (c) Using {π₯# , π¦# , πΎ, π} = {1, 0.7, 0.6, 0.01}, make plots of π’, π£ versus π‘ β [0, 10]. What appears to be the terminal value of π£ as π‘ β β? At what time is the reaction 90% complete, that is, when does π£ reach 90% of its terminal value? 22. The MichaelisβMenten model of a biochemical reaction catalyzed by an enzyme leads to the following reduced, dimensionless system ππ₯ = βπ₯ + π₯π¦ + ππ¦, π₯|π‘=0 = 1, ππ‘ ππ¦ π = π₯ β π₯π¦ β πΎπ¦, π¦|π‘=0 = 0. ππ‘ Here π₯, π¦ are the concentrations of the substrate and enzyme-substrate complex, π‘ is time, and π, πΎ, π are positive constants. Under typical conditions, we have 0 < π βͺ 1, and the system is singularly perturbed with an initial layer at π‘ = 0. Find leading-order composite approximations for π₯(π‘, π) and π¦(π‘, π) on the interval 0 β€ π‘ < β. [The outer solution will have an implicit form.] Note: In singularly perturbed chemical kinetics problems, the leading-order outer solution is called a quasi-steady-state approximation.
23. As described in Section 5.11, the liquid-gas interface model leads to the following inner equation, where π’ is the stretched height, and π is the stretched horizontal coordinate. π2π’ ππ’ 2 3/2 = π’[1 + ( ) ] , π β€ 0. 2 ππ ππ Here we solve the above equation subject to the matching condition π’ β 0 as πα΅ π π β ββ, along with the boundary condition ππ |π=0 = tan πΎ, where 0 < πΎ < 2 πα΅ is a given angle. Consistent with tan πΎ > 0, we suppose π’ > 0 and ππ > 0 in the πα΅ inner region, and also ππ β 0 as π β ββ. πα΅
ππ£
(a) Introduce the first-order system ππ = π£, ππ = π’[1 + π£2 ]3/2 and consider the ππ£ equation for πα΅ . Using the condition π£ β 0 when π’ β 0, together with π£ > 0 α΅β4βα΅2 when π’ > 0, show that the solution of this equation is π£ = 2βα΅2 . Note that the solution is well defined for π’ β (0, β2). πα΅ ππ
(b) From (a) we get
=
α΅β4βα΅2 , 2βα΅2
or equivalently
ππ πα΅
2βα΅2 . Show that the α΅β4βα΅2 2 π’ +ln(π’/[2 + β4 β π’2 ]),
=
general solution is π = πΉ(π’)βπ΅, where πΉ(π’) = β4 β and π΅ is an arbitrary constant. Note that, since it is monotonic for π’ β (0, β2), the function π = π(π’) has an inverse π’ = π’(π). Note also that the matching condition π’ β 0+ as π β ββ, or equivalently π β ββ as π’ β 0+ , is satisfied. (c) In the inner variables, the meniscus end point is (π, π’) = (0, π’# ). At this point, 1 ππ πα΅ the solution must satisfy ππ |π=0 = tan πΎ, or equivalently πα΅ |α΅=α΅# = tan πΎ . Using (b), show that this equation has a unique solution π’# β (0, β2) for any π given πΎ β (0, 2 ). (d) Since π’ = π’# when π = 0 show that the arbitrary constant from part (b) must have the value π΅ = πΉ(π’# ). Thus the complete inner solution for the interface curve has the form π = πΉ(π’) β πΉ(π’# ). Mini-project 1. A basic problem in ballistics is to determine how to aim a given weapon in order for a bullet to strike a given target as described in Section 5.7. Many factors are involved, and all are important in long-range shots. Considering only gravity and air resistance, a simple model for the near-horizontal motion of a bullet is y
g line o
(u ,v ) 0 0
ΞΈ aiming angle
t
f sigh
aiming
h height
trajectory
x
(a,b)
π₯Μ = βπ(π₯)Μ 2 ,
π₯|Μ π‘=0 = π’0 , π₯|π‘=0 = 0,
π¦ Μ = βππ₯π¦ Μ Μ β π,
π¦|Μ π‘=0 = π£ 0 ,
π¦|π‘=0 = 0.
Here (π₯, π¦) is the bullet position, π‘ is time, π is gravitational acceleration, π is an air resistance coefficient, and (π’0 , π£ 0 ) is the bullet firing velocity; we assume π’20 + π£20 = π2 , where π > 0 is a constant that depends on the weapon. The problem is to determine
the pair (π’0 , π£ 0 ), or equivalently the aiming angle π or height β, required for the bullet trajectory to intersect a fixed target at (π, π). Here we study the influence of gravity and air resistance on this problem. All quantities are in units of meters and seconds. (a) For the case of π = 0, solve for the path (π₯, π¦)(π‘) and briefly describe how the targeting problem may have zero, one or two solutions for (π’0 , π£ 0 ) depending on π, π, π and π. If π = 500 and π = 10, what value of (π’0 , π£ 0 ) gives the lowest aiming angle to strike a target at (π, π) = (500, 2)? What is the aiming height β? How much time is required for the impact? [Give all results to 4 decimal places.] (b) For the case of small π > 0, find an approximation to the path (π₯, π¦)(π‘, π) up to and including π(π) terms. If π = 500 and π = 10 as before, and π = 0.0002, what value of (π’0 , π£ 0 ) now gives the lowest aiming angle to strike the target at (π, π) = (500, 2)? What is the aiming height β? How much time is now required for the impact? [When considering the strike conditions, note that the physically meaningful solutions should tend to those in part (a) as π β 0+ . Give all results to 4 decimal places.] (c) Use Matlab or other similar software to numerically solve the differential equations for the bullet path and confirm your results in (a) and (b). Specifically, using your computed values of (π’0 , π£ 0 ) and impact times, make and superimpose plots of the bullet trajectories and verify that both hit the given target. Mini-project 2. Due to the high velocities involved, the motion of Mercury (M) around the Sun (S) is described by a corrected form of Newtonβs laws, where the correction is based on Einsteinβs theory of relativity. When time is eliminated, the equations for the orbital path π = π(π) are y perihelion S M (r,ΞΈ)
x
π2π’ 1 πΎπ’2 +π’= + , 2 π π ππ ππ’ = 0, | ππ π=π0
π β₯ π0 ,
π’|π=π0 =
1+π . π
aphelion
In the above, π’ = 1/π is the inverse of the radius, π is a constant related to gravity, πΎ is a small constant that quantifies relativistic effects, and π is a constant that defines the initial radius. Here we study the above system with (πΎ > 0), and without (πΎ = 0), the relativistic correction term, whose effect we seek to understand. Relevant physical dimensions are [π] = πΏ, [πΎ] = πΏ2 and [π] = 1. We assume π > 0, 0 < π < 1, πΎ β₯ 0, and π0 are given. As M moves around S the angle π will continually increase: it will reach the value π0 + 2ππ after π complete revolutions. The differential equation above describes how π, and hence (π₯, π¦), vary with π. (a) Consider the scale transformation π£ = ππ’ and π = π β π0 . [Although the transformation includes a shift, the derivative relations will not be changed.] Show that the
system can be written as follows for an appropriate parameter π: π2π£ + π£ = 1 + ππ£2 , ππ2
ππ£ | = 0, ππ π=0
π£|π=0 = 1 + π,
π β₯ 0.
(b) Solve the system in (a) with π = 0, which corresponds to neglecting the relativistic correction. Given that π(π) = π/π£(π), where π = π β π0 , show by inspection that the smallest value of π (perihelion of orbit) occurs at the angle π = π0 + 2ππ, and the largest value (aphelion) occurs at π = π0 + (2π + 1)π. Does the location of the perihelion/aphelion in the π₯π¦-plane change with each revolution π β₯ 0? (c) Solve the system in (a) with 0 < π βͺ 1, which corresponds to including the relativistic correction. Use the PoincarΓ©βLindstedt method with π£(π , π) = π£ 0 (π ) + ππ£ 1 (π ) + β― ,
π = ππ,
π = π0 + ππ1 + β― .
You need only determine π£ 0 , π0 and π1 . Using π(π) = π/π£ 0 (π ), where π = (π0 + ππ1 )π πΎ and π = π β π0 , show the smallest value of π now occurs at π = π0 + 2ππ(1 + π2 βπΎ ), πΎ
and the largest value occurs at π = π0 + (2π + 1)π(1 + π2 βπΎ ). Does the location of the perihelion/aphelion change now? What is the change in the angle of the aphelion between revolution π and π + 1? π
(d) Use Matlab or other software to simulate the original system using π0 = 4 , π = 1 and π = 0.7. For the cases πΎ = 0 and πΎ = 0.01, make a plot of the orbital curve (π₯, π¦) = (π(π) cos π, π(π) sin π) for π β [π0 , π0 + 2ππ] for π = 10. Do the simulations agree with your analysis in (b) and (c)? Does the location of the aphelion change by the expected amount? Note: Astronomical observations show that the perihelion and aphelion of Mercury advance with each revolution of the orbit in agreement with the relativistic theory. Mini-project 3. Due to surface tension, a liquid-gas interface will rise up or dip down at a solid boundary to form a meniscus as outlined in Section 5.11. In the planar case, the shape of the interface or meniscus curve π¦(π₯) is described by gas
y
g meniscus βL height liquid
0
y(x) x L
Ξ³
ππ¦ 2 3/2 π π2π¦ = π¦[1 + ( )] , ππ ππ₯2 ππ₯ ππ¦ | = 0, ππ₯ |π₯=0
0 β€ π₯ β€ πΏ,
ππ¦ | = tan πΎ. ππ₯ |π₯=πΏ
In the above, π is the mass density of the liquid, π is gravitational acceleration, and π and πΎ are the surface tension and wetting angle of the interface. Under typical conditions, the above problem is singularly perturbed and has a boundary layer at π₯ = πΏ. Here we develop an approximate solution and predict the height of the meniscus. We assume π π > 0, π > 0, π > 0, πΏ > 0 and πΎ β (0, 2 ) are given constants. All quantities are in units of kilograms, meters and seconds.
For convenience, we introduce π = π₯/πΏ and β = π¦/πΏ, and consider the following π dimensionless form of the above system, where 0 < π = πππΏ2 βͺ 1. Note that the outer and inner endpoints in π will be πout = 0 and πin = 1. π
π2β πβ 2 3/2 = β[1 + )] , ( ππ ππ 2
πβ | = 0, ππ |π =0
πβ | = tan πΎ, ππ |π =1
0 β€ π β€ 1.
(a) For the outer problem, show that the leading-order term of the outer solution is β0 (π ) β‘ 0. As a result, show that the matching condition will be π» in = 0 = πΌ out , and that the leading-order composite approximation will be the inner approximation. (b) For the inner problem, use the change of variable π = πβπΌ (π β πin ) and π’ = πβπ½ β 1 1 and show that the choice πΌ = 2 and π½ = 2 yields a regular system with no explicit π parameter, namely ππ’ | π2π’ ππ’ 2 3/2 = π’[1 + ( ) ] , = tan πΎ, π β€ 0. 2 ππ ππ |π=0 ππ Thus, in the variables π, π’, the inner solution is independent of π so that π’(π, π) = π’(π) = π’0 (π). (c) Solution curves of (b) which satisfy the matching condition in (a) have the implicit form π = πΉ(π’) β π΅, where πΉ(π’) = β4 β π’2 + ln(π’/[2 + β4 β π’2 ]), and π΅ is an arbitrary constant. Verify that these curves satisfy the differential equation (see hint below). (d) In the inner variables, the meniscus endpoint is (π, π’) = (0, π’# ). At this point, 1 πα΅ ππ the solution must satisfy ππ |π=0 = tan πΎ, or equivalently πα΅ |α΅=α΅# = tan πΎ . Using (c), π show that this equation has a unique solution π’# β (0, β2) for any given πΎ β (0, 2 ). Moreover, since π’ = π’# when π = 0, show that π΅ = πΉ(π’# ). (e) The inner and also composite approximation of the meniscus curve is π = πΉ(π’) β πΉ(π’# ). Note that (π, π’) = (0, π’# ) corresponds to (π₯, π¦) = (πΏ, π¦# ), where π¦# is the meniscus height. Using your expression for π’# from (d), find an explicit expression for π¦# . What are the values of π’# and π¦# in the case when π = 0.05, π = 1000, π = 10, πΏ = 0.07 2π and πΎ = 5 ? (f) Use Matlab or similar software for boundary-value problems to simulate the original system for the parameter values in (e). Is the prediction of the meniscus height π¦# in agreement with the simulation? Based on the prediction, assuming 0 < π βͺ 1, how would an increase or decrease in the gravitational acceleration π affect the meniscus height π¦# ? 
Does the height π¦# depend on the container size πΏ? ππ
πα΅
πα΅
Hint for (c): πα΅ = πΉ β² (π’), so ππ = 1/πΉ β² (π’). Introduce π(π’) = 1/πΉ β² (π’) so that ππ = π(π’) π2 α΅ and ππ2 = πβ² (π’)π(π’). Is the differential equation satisfied when these expressions are πα΅ ππ substituted? [Note that πα΅ and ππ are defined and positive for π’ β (0, β2).]
Chapter 6
Calculus of variations
A wide variety of problems in modeling are concerned with finding the minimum or maximum value of a function, and the corresponding inputs that produce such a value. When the domain of the function is a subset of the real line, plane, or some higher-dimensional space, the techniques of differential calculus can be used to characterize those points that are minimizing or maximizing. Alternatively, when the domain is a collection of graphs or more general curves, then other techniques are required to characterize those curves that are minimizers or maximizers for the function. Such problems cannot be solved by the tools of elementary calculus alone, but require a more elaborate theory known as the calculus of variations. Here we outline a basic version of this theory. We consider both first- and second-order problems involving graphs and more general curves in the plane, with various types of essential and natural boundary conditions, and constraints. We focus on necessary conditions for optimizers and establish sufficient conditions in different cases. The theory is illustrated with various applications, including some problems in optimal control.
6.1. Preliminaries

Throughout our developments we consider various sets whose elements are real-valued functions $y(x)$ of a real variable $x \in [a, b]$. The most basic of these sets are defined as follows, where $k \ge 0$ denotes an integer.

Definition 6.1.1. By $C^k[a, b]$ we mean the set of functions on $[a, b]$ with $k$ continuous derivatives, specifically

\[ C^k[a, b] = \{\, y : [a, b] \to \mathbb{R} \mid y(x), y'(x), \ldots, y^{(k)}(x) \text{ continuous} \,\}. \tag{6.1} \]

Thus $C^0[a, b]$ is the set of all functions that are continuous, $C^1[a, b]$ is the set of all functions that are continuous and have a first derivative that is continuous, and so on; see Figure 6.1. Here continuity is understood to hold over the entire interval $[a, b]$, including the end points. For any given $k$ and $[a, b]$ the set $C^k[a, b]$ is a linear space, which means that if $u$ and $v$ are functions in the set, then so is $\alpha u + \beta v$ for any constants $\alpha$ and $\beta$, so

\[ u, v \in C^k[a, b] \;\Longrightarrow\; \alpha u + \beta v \in C^k[a, b], \quad \forall \alpha, \beta \in \mathbb{R}. \tag{6.2} \]

[Figure 6.1. Two graphs over $[a, b]$: a function $y$ that is in $C^0$ but not $C^1$, and a function $y$ that is in $C^1$.]

More generally, a set of functions $V$ is called a linear space if

\[ u, v \in V \;\Longrightarrow\; \alpha u + \beta v \in V, \quad \forall \alpha, \beta \in \mathbb{R}. \tag{6.3} \]
Note that a linear space is simply a real vector space as defined in linear algebra, where the elements are functions. The sets defined above are called the $C^k$-spaces.

Example 6.1.1. (1) Consider $V = \{ y \in C^0[1, 2] \mid y(1) = 0 \}$. This set is a linear space since each condition of membership is preserved under arbitrary linear combinations. Specifically, if $u \in C^0[1, 2]$ and $v \in C^0[1, 2]$, then $(\alpha u + \beta v) \in C^0[1, 2]$ for all $\alpha, \beta$. Also, if $u(1) = 0$ and $v(1) = 0$, then $(\alpha u + \beta v)(1) = 0$ for all $\alpha, \beta$.

(2) Consider $V = \{ y \in C^1[0, 3] \mid y(3) = 4 \}$. This set is not a linear space since a condition of membership is not preserved under arbitrary linear combinations. Specifically, if $u(3) = 4$ and $v(3) = 4$, then $(\alpha u + \beta v)(3) = 4\alpha + 4\beta \ne 4$ for some $\alpha, \beta$.

We will be interested in optimization problems that involve a given set of functions and a real-valued quantity defined on this set. Such a quantity, whose input is a function and output is a number, is defined next.

Definition 6.1.2. By a functional $F$ on a set of functions $V$ we mean a mapping $F : V \to \mathbb{R}$. A functional $F$ is called linear if $V$ is a linear space and

\[ F(\alpha u + \beta v) = \alpha F(u) + \beta F(v), \quad \forall u, v \in V, \quad \forall \alpha, \beta \in \mathbb{R}. \tag{6.4} \]

Thus a functional is a mapping that associates a number $F(y) \in \mathbb{R}$ to any given function $y \in V$. Such mappings arise naturally in many contexts. As illustrated in the example below, functionals can be either linear or nonlinear, and can be defined in a number of different ways. We will be primarily interested in functionals defined through an integral expression.

Example 6.1.2. (1) Consider $V = \{ y \in C^0[0, 1] \mid y(1) = 0 \}$, $F(y) = \int_0^1 x^2 y(x)\,dx$. The expression for $F$ defines a functional, since it produces a number for a given function $y$. This functional may be described as the integral type, due to its form. Moreover, this functional is linear, which follows from the fact that $V$ is a linear space, and from properties of integrals, which imply

\[ F(\alpha u + \beta v) = \int_0^1 x^2 (\alpha u(x) + \beta v(x))\,dx = \alpha \int_0^1 x^2 u(x)\,dx + \beta \int_0^1 x^2 v(x)\,dx = \alpha F(u) + \beta F(v). \tag{6.5} \]

(2) Consider $V = C^1[0, 1]$, $F(y) = \int_0^1 \left[ x y(x) + 4 (y'(x))^2 \right] dx$. The expression for $F$ defines a functional; it is of the integral type similar to before, but is nonlinear due to the squared term. Note that $F$ is defined on the entire space $C^1[0, 1]$, and could not be similarly defined on $C^0[0, 1]$.

(3) Consider $V = C^2[0, 1]$, $F(y) = y'(\tfrac{1}{2}) + e^{-y(0)} + \max_{[0,1]} |y''(x)|$. The expression for $F$ defines a functional; it is of the nonintegral type, and is nonlinear. Note that a functional may involve local or point information about the input, such as $y(0)$ and $y'(\tfrac{1}{2})$, as well as global information, such as $\max_{[0,1]} |y''(x)|$, which denotes the maximum of the absolute value of $y''(x)$ in the interval $x \in [0, 1]$.

Just as calculus can be viewed as the study of functions, calculus of variations can be viewed as the study of functionals. In the remainder of our developments, we outline an elementary theory of minima and maxima for functionals. Ideally, a systematic study of their continuity and differentiability properties should also be considered, but this will not be pursued here.
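The distinction between linear and nonlinear functionals in Example 6.1.2 can be illustrated numerically. The following is a minimal sketch in Python with NumPy, in which functions are represented by samples on a uniform grid and integrals are approximated by the composite trapezoid rule. The grid size, the test functions, and all function names are assumptions made for this sketch.

```python
import numpy as np

def integrate(values, x):
    """Composite trapezoid rule for samples `values` on a uniform grid `x`."""
    h = x[1] - x[0]
    return h * (np.sum(values) - 0.5 * (values[0] + values[-1]))

x = np.linspace(0.0, 1.0, 2001)

def F_linear(y):
    """F(y) = integral over [0, 1] of x^2 * y(x), as in part (1) of the example."""
    return integrate(x**2 * y, x)

def F_nonlinear(y, dy):
    """F(y) = integral over [0, 1] of x*y(x) + 4*(y'(x))^2, as in part (2)."""
    return integrate(x * y + 4.0 * dy**2, x)

# Two sample functions that vanish at x = 1, and arbitrary constants.
u = 1.0 - x
v = np.sin(np.pi * x)
alpha, beta = 2.0, -3.0

lhs = F_linear(alpha * u + beta * v)
rhs = alpha * F_linear(u) + beta * F_linear(v)   # agrees with lhs up to rounding

# For y(x) = x, doubling the input does not double the nonlinear functional.
y, dy = x, np.ones_like(x)
doubled = F_nonlinear(2.0 * y, 2.0 * dy)   # close to 50/3
twice = 2.0 * F_nonlinear(y, dy)           # close to 26/3
print(lhs - rhs, doubled, twice)
```

A check of this type can only suggest linearity for the sampled inputs; the general statement follows from the properties of integrals used in (6.5).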
6.2. Absolute extrema

A minimum or maximum value of a functional is called an extremum. The input associated with such a value is also called an extremum, or more appropriately an extremizer, to distinguish it from the output. An extremum can be of the absolute (global) type, or relative (local) type, or both. The definition of an absolute extremum is especially simple. It is based on a comparison with all elements of a set, and does not involve any notion of distance or closeness within the set.

Definition 6.2.1. Let $F : V \to \mathbb{R}$ be given. A function $y_* \in V$ is called an absolute minimizer of $F$ if

\[ F(y) \ge F(y_*), \quad \forall y \in V. \tag{6.6} \]

Similarly, $y_*$ is called an absolute maximizer if

\[ F(y) \le F(y_*), \quad \forall y \in V. \tag{6.7} \]

Thus a function $y_*$ is an absolute minimizer if it gives the smallest or minimum value of $F$ over the set $V$, and similarly, $y_*$ is an absolute maximizer if it gives the largest or maximum value. In some cases, the existence or not of an absolute extremum can be established by observation and a straightforward analysis. In other cases, the existence of such an extremum is more delicate. Methods to produce candidates for absolute extrema will be presented later; for the moment, we proceed by observation.
Example 6.2.1. Consider finding absolute extrema of $F : V \to \mathbb{R}$, where $V = \{ y \in C^0[0, 1] \mid y(0) = 0 \}$, and $F(y) = \int_0^1 y^2(x)\,dx$.

Absolute minimizer. The positive quadratic form of the integrand suggests that a minimizer may exist. Specifically, we note that the functional satisfies the lower bound $F(y) \ge 0$ for all $y \in V$. Moreover, we note that $F(y) = 0$ when and only when $y(x) \equiv 0$, and the zero function is in $V$. Thus $y_*(x) \equiv 0$ is an absolute minimizer; it satisfies $F(y) \ge F(y_*)$ for all $y \in V$. Note that this minimizer is unique, since there are no other functions with this property.

Absolute maximizer. Based on the positive quadratic form of the integrand, we expect that the functional has no upper bound, and hence no absolute maximizer. To show this, consider the function $\hat{y}(x) = cx$, where $c$ is a constant. Note that $\hat{y} \in V$ for any $c$, and by direct computation $F(\hat{y}) = \int_0^1 c^2 x^2\,dx = \tfrac{1}{3} c^2$. Since $F(\hat{y}) \to \infty$ as $c \to \infty$ we deduce that there is no absolute maximizer. Specifically, there is no $y_* \in V$ which satisfies $F(y) \le F(y_*)$ for all $y \in V$, because $F(y_*)$ would be a fixed number, and $F(\hat{y})$ would be greater than this number for sufficiently large $c$. Thus no function in $V$ gives a largest value of $F$.

Example 6.2.2. Consider finding absolute extrema of $F : V \to \mathbb{R}$, where $F(y) = \int_0^1 y^2(x)\,dx$, but now $V = \{ y \in C^0[0, 1] \mid y(1) = 1 \}$.

Absolute minimizer. As before, we note that $F(y) \ge 0$, and $F(y) = 0$ when and only when $y(x) \equiv 0$. But now the zero function is not in $V$. Thus the lower bound is not reached by any function in the set, and we have $F(y) > 0$ for all $y \in V$. However, there are functions that come arbitrarily close to reaching the lower bound. One such example is the piecewise linear function $\hat{y}$ shown in Figure 6.2. This function is in $V$ for any $0 < \varepsilon < 1$, and a direct computation gives $F(\hat{y}) = \tfrac{1}{3}\varepsilon$. Since $F(\hat{y}) \to 0^+$ as $\varepsilon \to 0^+$ we deduce that there is no absolute minimizer. Specifically, there is no $y_* \in V$ which satisfies $F(y) \ge F(y_*)$ for all $y \in V$, because $F(y_*)$ would be a fixed positive number, and $F(\hat{y})$ would be less than this number for sufficiently small $\varepsilon$. Thus no function in $V$ gives a smallest value of $F$. Note also that the function $\hat{y}$ considered here does not approach a function in $V$ as $\varepsilon \to 0^+$. Specifically, $\hat{y}$ approaches a function with a jump discontinuity at $x = 1$, and such a function is not contained in $V$.

[Figure 6.2. Graph of the piecewise linear function $\hat{y}(x)$ on $[0, 1]$, with $1 - \varepsilon$ and $1$ marked on the $x$-axis and $1$ marked on the $y$-axis.]

Absolute maximizer. As before, we expect that the functional has no upper bound, and hence no absolute maximizer. To show this, consider the function $\hat{y}(x) = x + c(1 - x)$, where $c$ is a constant. Note that $\hat{y} \in V$ for any $c$, and by direct computation $F(\hat{y}) = \tfrac{1}{3}(1 + c + c^2)$. Since $F(\hat{y}) \to \infty$ as $c \to \infty$ we deduce that there is no absolute maximizer as before.
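The two limiting arguments in Example 6.2.2 can be explored numerically. The sketch below, in Python with NumPy, evaluates the functional on a grid for a family of piecewise linear functions and for the family $x + c(1 - x)$. The specific form of the piecewise linear function (zero on $[0, 1 - \varepsilon]$, then rising linearly to 1 at $x = 1$) is an assumption consistent with the stated value $\tfrac{1}{3}\varepsilon$; the grid size and the names `F` and `ramp` are also assumptions of this sketch.

```python
import numpy as np

x = np.linspace(0.0, 1.0, 200001)

def F(y):
    """F(y) = integral over [0, 1] of y(x)^2, by the composite trapezoid rule."""
    f = y**2
    h = x[1] - x[0]
    return h * (np.sum(f) - 0.5 * (f[0] + f[-1]))

def ramp(eps):
    """Assumed piecewise linear function: 0 on [0, 1 - eps], rising linearly to 1 at x = 1."""
    return np.maximum(0.0, (x - (1.0 - eps)) / eps)

# Values approach the lower bound 0 without reaching it.
small_values = [(eps, F(ramp(eps)), eps / 3.0) for eps in (0.5, 0.1, 0.01)]

# Values grow without bound as c increases; each function satisfies y(1) = 1.
large_values = [(c, F(x + c * (1.0 - x)), (1.0 + c + c**2) / 3.0)
                for c in (1.0, 10.0, 100.0)]

for row in small_values + large_values:
    print(row)
```

Each printed row lists the parameter, the numerical value of the functional, and the closed-form value from the example, so the two can be compared directly.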
6.3. Local extrema

Aside from the absolute type, we also consider extrema of a local type. Such extrema may be more likely to exist and easier to find. The definition of a local extremum requires the concept of a neighborhood within a set of functions. Here we consider neighborhoods defined using the standard family of norms in the linear space $C^k[a,b]$. The notation $\max_{a \le x \le b} |v(x)|$ means the maximum of the absolute value of $v(x)$ in the interval $x \in [a,b]$.

Definition 6.3.1. Let $V \subset C^k[a,b]$ and $m \le k$ be given. By the $C^m$-norm of $v \in V$ we mean the number

$$\|v\|_{C^m} = \max_{a \le x \le b} |v(x)| + \max_{a \le x \le b} |v'(x)| + \cdots + \max_{a \le x \le b} |v^{(m)}(x)|. \tag{6.8}$$

The distance between $u \in V$ and $y \in V$ is the number $\|u - y\|_{C^m}$. By the $C^m$-neighborhood of $y \in V$ of radius $\delta > 0$ we mean the set

$$N_{C^m}(y, \delta) = \{u \in V \mid \|u - y\|_{C^m} \le \delta\}. \tag{6.9}$$

Thus the $C^m$-norm of $v$ is a measure of the magnitude or size of $v$, which is based on the maximum absolute value of the function and its derivatives up through order $m$. Two functions $u, y$ are close in this norm when the distance between them $\|u - y\|_{C^m}$ is small. The $C^m$-neighborhood of $y$ consists of all functions $u$ that are within a given distance. By design, we define a neighborhood as a subset of $V$, and do not consider any functions outside of this set. Note that the $C^m$-norm has all the properties of a vector norm as defined in linear algebra.
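[Figure 6.3. Left: the function $y_*$ on $[0,1]$. Right: the strip of half-width $\delta$ about the graph of $y_*$, containing a function $u$.]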
Example 6.3.1. Let $V = \{y \in C^0[0,1] \mid y(0) = 0\}$ and consider $y_* \in V$ as shown in the left part of Figure 6.3. The $C^0$-norm of this function is $\|y_*\|_{C^0} = \max_{[0,1]} |y_*(x)| = 2$. For given $\delta > 0$, the $C^0$-neighborhood is $N_{C^0}(y_*, \delta) = \{u \in V \mid \|u - y_*\|_{C^0} \le \delta\}$. Using the fact that $\|v\|_{C^0} \le \delta$ if and only if $|v(x)| \le \delta$ for all $x \in [0,1]$, the neighborhood can be written in the more explicit form

$$N_{C^0}(y_*, \delta) = \{u \in V \mid |u(x) - y_*(x)| \le \delta\}. \tag{6.10}$$

Thus $N_{C^0}(y_*, \delta)$ is the set of all functions $u$ whose graphs are contained within a strip of half-width $\delta$ about the graph of $y_*$, as shown in the right part of Figure 6.3.
Example 6.3.2. Let $V = \{y \in C^1[0,1] \mid y(0) = 0\}$ and consider $y_* \in V$ as shown in the top left part of Figure 6.4. In view of (6.8), note that the $C^1$-norm of a function can be written as the sum of two separate $C^0$-norms, namely $\|y_*\|_{C^1} = \|y_*\|_{C^0} + \|y_*'\|_{C^0}$. For given $\delta > 0$, the $C^1$-neighborhood is $N_{C^1}(y_*, \delta) = \{u \in V \mid \|u - y_*\|_{C^1} \le \delta\}$. Using the noted relation between norms, this neighborhood can be written in a more explicit form similar to before, namely

$$N_{C^1}(y_*, \delta) = \{u \in V \mid |u(x) - y_*(x)| \le \hat{\delta},\ |u'(x) - y_*'(x)| \le \tilde{\delta},\ \hat{\delta} + \tilde{\delta} \le \delta\}. \tag{6.11}$$
Thus $N_{C^1}(y_*, \delta)$ consists of all functions $u$ that satisfy two conditions: $u, y_*$ are within some $C^0$-distance $\hat{\delta}$, and $u', y_*'$ are within some $C^0$-distance $\tilde{\delta}$, where $\hat{\delta} + \tilde{\delta} \le \delta$, as shown in the bottom left part of Figure 6.4. Note that a $C^1$-neighborhood of $y_*$ is much more restrictive than a $C^0$-neighborhood. For example, as illustrated in the right part of Figure 6.4, the function $u$ is in a $C^1$-neighborhood of $y_*$ for some small $\delta$. In contrast, the function $v$ is in a $C^0$-neighborhood, but not a $C^1$-neighborhood. Specifically, $v$ is close to $y_*$ for all $x \in [0,1]$, but the slope of $v$ is not close to the slope of $y_*$ at some points.

[Figure 6.4. Top left: the function $y_*$. Bottom left: the two closeness conditions that define a $C^1$-neighborhood. Right: a function $u$ in a $C^1$-neighborhood of $y_*$, and a function $v$ in a $C^0$-neighborhood only.]
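The gap between the two notions of closeness can be made concrete with a small computation. The sketch below is illustrative only: the perturbation $h = v - y_*$ (a low-amplitude, high-frequency sine) and the helper `sup_norm` are hypothetical choices, and the maxima are approximated by sampling on a grid.

```python
import math

def sup_norm(f, a=0.0, b=1.0, n=10000):
    """Approximate max |f(x)| on [a, b] by sampling n + 1 grid points."""
    return max(abs(f(a + (b - a) * i / n)) for i in range(n + 1))

# Hypothetical difference h = v - y_*, with h(0) = 0: small values, steep slopes.
amp, freq = 0.01, 100.0
h = lambda x: amp * math.sin(freq * x)
dh = lambda x: amp * freq * math.cos(freq * x)  # derivative of h

dist_c0 = sup_norm(h)                 # C^0 distance between v and y_*
dist_c1 = sup_norm(h) + sup_norm(dh)  # C^1 distance, from (6.8) with m = 1

print(f"C^0 distance ~ {dist_c0:.4f}")
print(f"C^1 distance ~ {dist_c1:.4f}")
```

Here the $C^0$ distance is about 0.01 while the $C^1$ distance is about 1.01, so such a $v$ lies in every $C^0$-neighborhood of radius $\delta \ge 0.01$ but in no $C^1$-neighborhood of radius $\delta < 1$.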
Using the above notion of a neighborhood, we can now define local extrema for a functional on a subset of a $C^k$-space. This definition should be compared to that for absolute extrema given earlier.

Definition 6.3.2. Let $V \subset C^k[a,b]$, $F : V \to \mathbb{R}$ and $m \le k$ be given. A function $y_* \in V$ is called a local minimizer of $F$ in the $C^m$-norm if

$$F(u) \ge F(y_*), \quad \forall u \in N_{C^m}(y_*, \delta) \text{ for some } \delta > 0. \tag{6.12}$$

Similarly, $y_*$ is called a local maximizer if

$$F(u) \le F(y_*), \quad \forall u \in N_{C^m}(y_*, \delta) \text{ for some } \delta > 0. \tag{6.13}$$
Thus a function $y_*$ is a local minimizer in the $C^m$-norm if it gives the smallest value of $F$ over some $C^m$-neighborhood of $y_*$ in $V$; and similarly, it is a local maximizer if it gives the largest value. Methods outlined later will produce candidates $y_*$ for local extrema for various different types of functionals and sets. To show that a candidate $y_*$ is a local extremum we will need to verify that it satisfies the above definition for some radius $\delta > 0$. If the definition is satisfied for an arbitrary radius, no matter how large, then the candidate is an absolute extremum; in this case, the neighborhood is the entire set. Thus the difference between a local and absolute extremum can be understood in terms of the radius of the neighborhood. Note that an absolute extremum must be a local extremum for every norm and neighborhood. In contrast, a local extremum that is not absolute may only be an extremum in some norm and neighborhood. As the next example shows, a local extremum in one norm may not be an extremum in another.

Example 6.3.3. Consider the set of functions $V = \{y \in C^1[0,1] \mid y(0) = \alpha\}$ and functional $F(y) = \int_0^1 (y'(x))^2 (\beta - (y'(x))^2)\,dx$, where $\alpha > 0$ and $\beta > 0$ are given constants. Methods outlined later will produce a candidate for a local minimizer, namely $y_*(x) \equiv \alpha$. Here we determine if this candidate is an actual local minimizer in the $C^m$-norm for some $m \le 1$. For reference, note that $y_*'(x) \equiv 0$, and hence $F(y_*) = 0$.

Claim: $y_*$ is a local minimizer in the $C^1$-norm. To establish this, let $\delta > 0$ be given and consider the neighborhood $N_{C^1}(y_*, \delta)$ as shown in Figure 6.5.

[Figure 6.5. The $C^1$-neighborhood of $y_* \equiv \alpha$: a function $u$ within distance $\hat{\delta}$ of $y_*$, with slope $u'$ within distance $\tilde{\delta}$ of $y_*' \equiv 0$, where $\hat{\delta} + \tilde{\delta} \le \delta$.]

For any $u \in N_{C^1}(y_*, \delta)$
as illustrated, we have $|u(x) - \alpha| \le \hat{\delta} \le \delta$ and also $|u'(x) - 0| \le \tilde{\delta} \le \delta$, for all $x \in [0,1]$. Provided that $\delta \le \sqrt{\beta}$, we will have $\beta - (u'(x))^2 \ge 0$, and consequently

$$F(u) = \int_0^1 (u'(x))^2 (\beta - (u'(x))^2)\,dx \ge 0. \tag{6.14}$$

Since $F(y_*) = 0$, we find that

$$F(u) \ge F(y_*), \quad \forall u \in N_{C^1}(y_*, \delta) \text{ for any } \delta \in (0, \sqrt{\beta}]. \tag{6.15}$$

Thus $y_*$ is a local minimizer in the $C^1$-norm.

Claim: $y_*$ is not a local minimizer in the $C^0$-norm. To establish this, let $\delta > 0$ be given and consider the neighborhood $N_{C^0}(y_*, \delta)$ as shown in Figure 6.6. For any $\varepsilon \in (0, \delta]$ consider the function $v \in N_{C^0}(y_*, \delta)$ defined by

$$v(x) = \alpha + \varepsilon \sin(x/\varepsilon^2). \tag{6.16}$$
For small values of the parameter $\varepsilon$, the function $v$ will be as close as desired to $y_*$, but the slope of $v$ will be very oscillatory and large compared to the zero slope of $y_*$.

[Figure 6.6. The oscillatory function $v$ within the strip of half-width $\delta$ about $y_* \equiv \alpha$.]

If we consider parameter values of the form $\varepsilon = \frac{1}{\sqrt{2\pi n}}$, where $n$ is a positive integer, then provided that $n \ge \frac{1}{2\pi\delta^2}$, we have $0 < \varepsilon \le \delta$, and provided that $n > \frac{2\beta}{3\pi}$, we find by direct evaluation that

$$F(v) = \int_0^1 (v'(x))^2 (\beta - (v'(x))^2)\,dx = \frac{\pi n}{2}(2\beta - 3\pi n) < 0. \tag{6.17}$$

However, as noted earlier, we have $F(y_*) = 0$. Thus every $C^0$-neighborhood of $y_*$ contains functions such as $v$ with lower values of $F$, and it follows that $y_*$ is not a local minimizer in the $C^0$-norm.
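The direct evaluation in (6.17) can be cross-checked numerically. The sketch below is illustrative: the function names and the value $\beta = 1$ are hypothetical choices, and a midpoint rule stands in for the exact integral. It uses $v'(x) = \varepsilon^{-1}\cos(x/\varepsilon^2)$, which follows from (6.16).

```python
import math

def integrate(f, a=0.0, b=1.0, n=20000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

def F_numeric(n_osc, beta):
    """F(v) for v(x) = alpha + eps*sin(x/eps^2) with eps = 1/sqrt(2*pi*n_osc).
    The constant alpha drops out because F depends only on v'."""
    eps = 1.0 / math.sqrt(2.0 * math.pi * n_osc)
    dv = lambda x: math.cos(x / eps**2) / eps  # v'(x)
    return integrate(lambda x: dv(x) ** 2 * (beta - dv(x) ** 2))

def F_closed_form(n_osc, beta):
    """Right-hand side of (6.17)."""
    return 0.5 * math.pi * n_osc * (2.0 * beta - 3.0 * math.pi * n_osc)

beta = 1.0  # illustrative value; any beta > 0 could be used
for n_osc in (1, 2, 5):
    print(n_osc, F_numeric(n_osc, beta), F_closed_form(n_osc, beta))
```

For $\beta = 1$ the threshold $\frac{2\beta}{3\pi}$ is below 1, so every positive integer $n$ already gives a negative value of $F(v)$, and the values decrease as $n$ grows.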
6.4. Necessary conditions

Here we outline a set of general necessary conditions for the local extrema of a functional. Although only necessary, they will provide a practical means of identifying candidates. The conditions are based on the concept of admissible variations. We first introduce a space of such variations, and then proceed to the important idea of the variation of a function.

Definition 6.4.1. Consider a set of functions of the form

$$V = \{y \in C^k[a,b] \mid G_1(y) = c_1, \ldots, G_J(y) = c_J\}, \tag{6.18}$$

where $G_j : C^k[a,b] \to \mathbb{R}$ are linear functionals and $c_j \in \mathbb{R}$ are constants for $j = 1, \ldots, J$. By the space of admissible variations associated with $V$ we mean the linear space

$$V_0 = \{h \in C^k[a,b] \mid G_1(h) = 0, \ldots, G_J(h) = 0\}. \tag{6.19}$$
Thus $V_0$ is a linear space associated with $V$, defined by homogeneous versions of the membership conditions. By design, the space $V_0$ has the property that if $y_1 \in V$ and $y_2 \in V$, then $y_2 - y_1 \in V_0$. Also, if $y_1 \in V$ and $h \in V_0$, then $y_1 + h \in V$. These properties follow from the linearity of the membership conditions $G_j(y) = c_j$. For any given $V$ as considered above, the identification of $V_0$ is straightforward.

Example 6.4.1. (1) For $V = \{y \in C^2[0,1] \mid y(0) = 2,\ y'(1) = 0,\ \int_0^1 y(x)\,dx = 3\}$, the space of variations is $V_0 = \{h \in C^2[0,1] \mid h(0) = 0,\ h'(1) = 0,\ \int_0^1 h(x)\,dx = 0\}$. (2) For $V = C^2[0,1]$, with no additional conditions, the space of variations is $V_0 = C^2[0,1]$, with no additional conditions.

The space $V_0$ is useful for describing neighborhoods of a given function $y_* \in V$. Specifically, to each $u$ in a neighborhood of $y_*$ there is a unique $h \in V_0$ such that $u = y_* + h$, namely $h = u - y_*$, and $u \in N_{C^m}(y_*, \delta)$ if and only if $\|h\|_{C^m} \le \delta$. More importantly, the space $V_0$ is useful for describing distortions or variations of a given function $y_* \in V$. This latter idea will provide the foundation for a theory of local extrema.

Definition 6.4.2. Let $y_* \in V$ and $h \in V_0$ be given. By variations of $y_*$ in the direction $h$ we mean the family of functions

$$y_* + \varepsilon h \in V, \quad \varepsilon \in \mathbb{R}. \tag{6.20}$$
For each $\varepsilon$, the function $y_* + \varepsilon h$ can be understood as a distorted version of $y_*$ within the set $V$. The direction of the distortion is determined by $h$, and the level or scale of the distortion is determined by $\varepsilon$. The functions $y_* + \varepsilon h$ and $y_*$ coincide when $\varepsilon = 0$, and remain close to each other in any $C^m$-norm for small values of $\varepsilon > 0$ and $\varepsilon < 0$. From a geometrical standpoint, the family $y_* + \varepsilon h$ can also be interpreted as an abstract line in the set $V$, which passes through the element $y_*$, where $\varepsilon$ is the coordinate along the line, and $h$ is the direction.

Example 6.4.2. Consider $V = \{y \in C^2[0,1] \mid y(0) = 0,\ y(1) = 1\}$, with space of variations $V_0 = \{h \in C^2[0,1] \mid h(0) = 0,\ h(1) = 0\}$. Moreover, consider the function $y_*(x) = x$ in $V$. Figure 6.7 illustrates some sample elements $h$ in $V_0$, and the resulting variations $y_* + \varepsilon h$ in $V$, corresponding to positive and negative values of the parameter $\varepsilon$.

[Figure 6.7. Sample elements $h$ in $V_0$ and the resulting variations $y_* + \varepsilon h$ of $y_*(x) = x$ on $[0,1]$, for $\varepsilon > 0$ and $\varepsilon < 0$.]
We can now state a set of general necessary conditions that the local extrema of a functional must satisfy. These conditions are based on the concept of a variation as introduced above, and results from single-variable calculus. In the following statement, the indicated derivatives with respect to the parameter are assumed to exist.

Result 6.4.1. [necessary conditions] Let $V \subset C^k[a,b]$, $F : V \to \mathbb{R}$ and $m \le k$ be given. If $y_* \in V$ is a local minimizer of $F$ in the $C^m$-norm, then it must satisfy

$$\left[\frac{d}{d\varepsilon} F(y_* + \varepsilon h)\right]_{\varepsilon=0} = 0, \quad \forall h \in V_0, \tag{6.21}$$

and

$$\left[\frac{d^2}{d\varepsilon^2} F(y_* + \varepsilon h)\right]_{\varepsilon=0} \ge 0, \quad \forall h \in V_0. \tag{6.22}$$
For a local maximizer, change $\ge$ to $\le$ in condition (6.22).

The above result follows from simple considerations. Specifically, if the functional $F(y)$ has a local minimum at $y = y_*$, then the single-variable function $f(\varepsilon) = F(y_* + \varepsilon h)$ has a local minimum at $\varepsilon = 0$. Results from single-variable calculus then require that $\frac{df}{d\varepsilon}(0) = 0$ and $\frac{d^2 f}{d\varepsilon^2}(0) \ge 0$, provided that these derivatives exist, and this must hold for any given $h$. In the case of a local maximum, the condition $\frac{d^2 f}{d\varepsilon^2}(0) \ge 0$ is replaced by $\frac{d^2 f}{d\varepsilon^2}(0) \le 0$. Note that the conditions are independent of the specific $C^m$-norm associated with a given extremum $y_*$. For brevity, the derivatives in (6.21) and (6.22) are denoted as $\delta F(y_*, h)$ and $\delta^2 F(y_*, h)$, and are called the first and second variation of $F$ at $y_*$ in the direction $h$.

The conditions outlined in the above result are necessary, but not sufficient. They remain insufficient even if the inequality in (6.22) is made strict. Whereas a strict inequality is sufficient in the case of functions defined on finite-dimensional domains, it is no longer sufficient in the case of functionals defined on infinite-dimensional domains. This lack of sufficiency can be attributed to the difference between finite and infinite dimension. Although only necessary, the above conditions provide a practical means of identifying candidates for extrema. In the remainder of our developments we will specialize these conditions to problems involving different types of functionals and sets. Before proceeding, we illustrate the basic idea with an example.

Example 6.4.3. Consider the set $V = \{y \in C^2[0,1] \mid y(0) = 0,\ y(1) = 1\}$, with space of variations $V_0 = \{h \in C^2[0,1] \mid h(0) = 0,\ h(1) = 0\}$, and consider the functional $F(y) = \int_0^1 2y(x) + (y'(x))^2\,dx$. Here we find candidates $y_*$ for local extrema.

Variations. For any fixed $y \in V$ and $h \in V_0$, we seek expressions for the derivatives $[\frac{d}{d\varepsilon} F(y + \varepsilon h)]_{\varepsilon=0}$ and $[\frac{d^2}{d\varepsilon^2} F(y + \varepsilon h)]_{\varepsilon=0}$, which for brevity are denoted as $\delta F(y, h)$ and $\delta^2 F(y, h)$. From the definition of $F$, we have

$$F(y + \varepsilon h) = \int_0^1 2(y(x) + \varepsilon h(x)) + (y'(x) + \varepsilon h'(x))^2\,dx. \tag{6.23}$$

Differentiating with respect to $\varepsilon$, and noting that the derivative can be taken inside the integral, and using the chain rule where needed, we get

$$\frac{d}{d\varepsilon} F(y + \varepsilon h) = \int_0^1 2h(x) + 2(y'(x) + \varepsilon h'(x))h'(x)\,dx, \tag{6.24}$$

and

$$\frac{d^2}{d\varepsilon^2} F(y + \varepsilon h) = \int_0^1 2h'(x)h'(x)\,dx. \tag{6.25}$$
Setting $\varepsilon = 0$ we obtain expressions for the first and second variations, namely

$$\delta F(y, h) = \int_0^1 2h(x) + 2y'(x)h'(x)\,dx, \tag{6.26}$$

and

$$\delta^2 F(y, h) = \int_0^1 2h'(x)h'(x)\,dx. \tag{6.27}$$

We next rewrite the first variation $\delta F(y, h)$ in a more useful form using the integration-by-parts formula $\int u\,dv = uv - \int v\,du$. Specifically, applying this formula to the term $\int 2y'h'\,dx$, with $u = 2y'$ and $dv = h'\,dx$, we get $du = 2y''\,dx$ and $v = h$, and we obtain

$$\delta F(y, h) = \int_0^1 2h(x) - 2y''(x)h(x)\,dx + \big[2y'(x)h(x)\big]_{x=0}^{x=1}. \tag{6.28}$$

Since $h \in V_0$, we have $h(0) = 0$ and $h(1) = 0$, and it follows that the boundary term $[2y'(x)h(x)]_{x=0}^{x=1}$ is zero. Thus we get the convenient expression

$$\delta F(y, h) = \int_0^1 (2 - 2y''(x))h(x)\,dx. \tag{6.29}$$
First-order condition. We now make the observation that, if $y_* \in V$ is a local extremum of $F$, then it must satisfy

$$\delta F(y_*, h) = \int_0^1 (2 - 2y_*''(x))h(x)\,dx = 0, \quad \forall h \in V_0. \tag{6.30}$$

Note that all factors in the integrand are continuous, and that the integral must vanish for every choice of $h$ in $V_0$. As we will see, this condition will hold when and only when the factor multiplying $h(x)$ in the integrand vanishes throughout the integration interval, that is, $2 - 2y_*''(x) = 0$ for all $x \in [0,1]$. This equation, combined with the boundary conditions specified in $V$, gives a boundary-value problem for the function $y_*(x)$, namely

$$2 - 2y_*''(x) = 0, \quad y_*(0) = 0, \quad y_*(1) = 1, \quad 0 \le x \le 1. \tag{6.31}$$

These equations can be solved in the usual way, and we obtain the function $y_*(x) = \frac{1}{2}(x + x^2)$. This function is in $V$ and is the only candidate for a local extremum.

Second-order condition. If the sign of $\delta^2 F(y_*, h)$ can be determined, for any given $h$ in $V_0$, then more information on the candidate $y_*$ can be obtained. From (6.27), and the fact that $(h'(x))^2 \ge 0$ for all $x \in [0,1]$, we get the straightforward result that

$$\delta^2 F(y_*, h) = \int_0^1 2(h'(x))^2\,dx \ge 0, \quad \forall h \in V_0. \tag{6.32}$$

This informs us that $y_*$ could be a local minimizer, but not a local maximizer. A further analysis is required to determine whether this candidate is an actual local minimizer in the $C^m$-norm for some $m$.
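The computations in Example 6.4.3 can be checked numerically for one particular direction. The sketch below is illustrative only: the direction $h(x) = \sin(\pi x)$ is a hypothetical choice of an element of $V_0$, the derivatives are supplied analytically, and a midpoint rule approximates the integrals.

```python
import math

def integrate(f, a=0.0, b=1.0, n=20000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

def F(y, dy):
    """F(y) = integral over [0, 1] of 2 y + (y')^2, with y' passed as dy."""
    return integrate(lambda x: 2.0 * y(x) + dy(x) ** 2)

# Candidate from (6.31) and its derivative.
y_star = lambda x: 0.5 * (x + x * x)
dy_star = lambda x: 0.5 + x

# Hypothetical direction in V0: h(0) = h(1) = 0.
h = lambda x: math.sin(math.pi * x)
dh = lambda x: math.pi * math.cos(math.pi * x)

first_variation = integrate(lambda x: 2.0 * h(x) + 2.0 * dy_star(x) * dh(x))  # (6.26)
second_variation = integrate(lambda x: 2.0 * dh(x) ** 2)                       # (6.27)

def F_perturbed(eps):
    """F evaluated on the variation y_* + eps * h."""
    return F(lambda x: y_star(x) + eps * h(x),
             lambda x: dy_star(x) + eps * dh(x))

print("first variation :", first_variation)
print("second variation:", second_variation)
print("F(y* + 0.1 h) - F(y*):", F_perturbed(0.1) - F_perturbed(0.0))
```

Because this $F$ is quadratic in $y$, the difference $F(y_* + \varepsilon h) - F(y_*)$ equals $\varepsilon\,\delta F(y_*, h) + \frac{1}{2}\varepsilon^2\,\delta^2 F(y_*, h)$ exactly, so the last printed value should agree with $\frac{1}{2}(0.1)^2$ times the second variation up to quadrature error.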
6.5. First-order problems

We consider the problem of finding local extrema for a functional $F : V \to \mathbb{R}$, where the set of functions is

$$V = \{y \in C^2[a,b] \mid y(a) = \alpha,\ y(b) = \beta\}, \tag{6.33}$$

the space of variations is

$$V_0 = \{h \in C^2[a,b] \mid h(a) = 0,\ h(b) = 0\}, \tag{6.34}$$

and the functional is

$$F(y) = \int_a^b L(x, y, y')\,dx. \tag{6.35}$$

Here $[a,b]$ is a given interval, $\alpha, \beta$ are given constants, and $L(x, y, y')$ is a given integrand. Unless indicated otherwise, we assume that $L$ is twice continuously differentiable for all $x \in [a,b]$, $y \in \mathbb{R}$ and $y' \in \mathbb{R}$. The integrand $L$ is called the Lagrangian for the functional.

The above problem is said to be of first-order type, since the functional $F$ involves derivatives of at most first order. Moreover, the problem is said to be of fixed-fixed type, since the functions in $V$ are fixed at both ends. The continuity requirements for the functions in $V$, and for the integrand $L$, ensure that the functional $F$ is finite for each input. They also ensure that the general necessary conditions in Result 6.4.1 can be rewritten in a local, pointwise form involving only continuous quantities. These continuity requirements can be relaxed, but at the expense of more complicated statements. The following result outlines some implications of the general necessary conditions, when specialized to the problem considered here.

Result 6.5.1. Let $F : V \to \mathbb{R}$ be defined as in (6.33)–(6.35). If $y_* \in V$ is a local minimizer of $F$ in the $C^m$-norm for some $m$, then

$$\left[\frac{d}{d\varepsilon} F(y_* + \varepsilon h)\right]_{\varepsilon=0} = 0, \quad \forall h \in V_0, \tag{6.36}$$

and

$$\left[\frac{d^2}{d\varepsilon^2} F(y_* + \varepsilon h)\right]_{\varepsilon=0} \ge 0, \quad \forall h \in V_0. \tag{6.37}$$

Condition (6.36) implies that $y_*$ must satisfy

$$\begin{cases} \dfrac{\partial L}{\partial y}(x, y, y') - \dfrac{d}{dx}\left[\dfrac{\partial L}{\partial y'}(x, y, y')\right] = 0, & a \le x \le b, \\[1ex] y(a) = \alpha, \quad y(b) = \beta. \end{cases} \tag{6.38}$$

Condition (6.37) implies that $y_*$ must also satisfy

$$\frac{\partial^2 L}{\partial y'\,\partial y'}(x, y, y') \ge 0, \quad a \le x \le b. \tag{6.39}$$

For a local maximizer, change $\ge$ to $\le$ in conditions (6.37) and (6.39).
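As a supplementary sketch (not part of the original statement), the total derivative in (6.38) can be expanded with the chain rule, assuming as above that $L$ is twice continuously differentiable and $y \in C^2[a,b]$. All partial derivatives of $L$ are evaluated at $(x, y(x), y'(x))$.

```latex
\begin{align*}
\frac{d}{dx}\left[\frac{\partial L}{\partial y'}\right]
  &= \frac{\partial^2 L}{\partial x\,\partial y'}
   + \frac{\partial^2 L}{\partial y\,\partial y'}\,y'
   + \frac{\partial^2 L}{\partial y'\,\partial y'}\,y'' , \\[1ex]
\frac{\partial L}{\partial y}
   - \frac{\partial^2 L}{\partial x\,\partial y'}
   - \frac{\partial^2 L}{\partial y\,\partial y'}\,y'
   - \frac{\partial^2 L}{\partial y'\,\partial y'}\,y'' &= 0 ,
   \qquad a \le x \le b .
\end{align*}
```

In this form the highest derivative that can appear is $y''$, and its coefficient is the same second partial derivative whose sign is constrained by (6.39).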
The conditions in (6.38) and (6.39) are pointwise in the sense that they must be satisfied at every point $x \in [a,b]$. The equations in (6.38) provide a boundary-value problem that every local extremum must satisfy; they are called the Euler–Lagrange equations. The differential equation in this boundary-value problem is at most second-order, and may be linear or nonlinear. The inequality in (6.39) is a further condition that must be satisfied; it is called the Legendre condition, and can be used to partially classify an extremum.

The conditions outlined above are necessary, but not sufficient. Thus these conditions can only be used to find candidates for extrema, and a separate analysis would be required to determine which candidates, if any, are actual extrema. Such an analysis can be straightforward in some problems, but can be significantly involved in others, and may require a number of additional technical results. As noted earlier in the discussion of Result 6.4.1, the above conditions would still not be sufficient even if the inequalities in (6.37) and (6.39) were made strict.

Following standard terminology, any solution of (6.38), and hence a candidate, is called an extremal. Note that the boundary-value problem in (6.38) may have one, none, or multiple solutions; hence a functional may have as many local extrema. The possibility of no or multiple solutions is an intrinsic feature of boundary-value problems. The issue with solutions normally does not arise when solving the differential equation itself. Specifically, the theory of initial-value problems guarantees, under mild conditions, that the differential equation will have a general solution involving arbitrary constants. (These constants reflect arbitrary initial conditions.) Instead, the issue normally arises when attempting to fit this general solution to boundary conditions specified at two distinct points.
There may be a unique set of constants that will fit such boundary conditions, or none, or many.

Sketch of proof: Result 6.5.1. We first discuss how (6.36) implies (6.38). This basic argument will be repeated in various forms when other types of problems are considered. To begin, consider any fixed $y \in V$ and $h \in V_0$. From the definition of $F$ in (6.35) we have

$$F(y + \varepsilon h) = \int_a^b L(x, y + \varepsilon h, y' + \varepsilon h')\,dx. \tag{6.40}$$

Differentiating with respect to $\varepsilon$, and noting that the derivative can be taken inside the integral, and using the chain rule, we get

$$\frac{d}{d\varepsilon} F(y + \varepsilon h) = \int_a^b \frac{\partial L}{\partial y}(x, y + \varepsilon h, y' + \varepsilon h')\,h + \frac{\partial L}{\partial y'}(x, y + \varepsilon h, y' + \varepsilon h')\,h'\,dx. \tag{6.41}$$

Setting $\varepsilon = 0$ we obtain an expression for the first variation, namely

$$\delta F(y, h) = \int_a^b f h + g h'\,dx, \tag{6.42}$$

where for brevity we use the notation $f = \frac{\partial L}{\partial y}(x, y, y')$ and $g = \frac{\partial L}{\partial y'}(x, y, y')$. As before, we next write the above expression in a more useful form using the integration-by-parts formula $\int u\,dv = uv - \int v\,du$. Specifically, applying this formula to the term $\int g h'\,dx$, with $u = g$ and $dv = h'\,dx$, we get $du = g'\,dx$ and $v = h$, and we obtain

$$\delta F(y, h) = \int_a^b f h - g' h\,dx + \big[g h\big]_{x=a}^{x=b}. \tag{6.43}$$

In the above, note that $g'$ means $\frac{dg}{dx}$, or equivalently, $\frac{d}{dx}\left[\frac{\partial L}{\partial y'}(x, y, y')\right]$. Since $h \in V_0$, we have $h(a) = 0$ and $h(b) = 0$, and it follows as before that the boundary term $[g h]_{x=a}^{x=b}$ is zero. Thus we get the expression

$$\delta F(y, h) = \int_a^b (f - g')h\,dx. \tag{6.44}$$

We now observe that, if $y_* \in V$ is a local extremum, then the condition in (6.36) requires

$$\delta F(y_*, h) = \int_a^b (f - g')h\,dx = 0, \quad \forall h \in V_0. \tag{6.45}$$
Note that all factors in the integrand are continuous, and that the integral must vanish for every choice of $h$ in $V_0$. By a result known as the fundamental lemma, to be outlined later, this condition will hold when and only when the factor multiplying $h$ in the integrand vanishes throughout the integration interval, that is, $\frac{\partial L}{\partial y} - \frac{d}{dx}\left[\frac{\partial L}{\partial y'}\right] = 0$ for all $x \in [a,b]$. This equation, combined with the boundary conditions specified in $V$, gives the boundary-value problem in (6.38).

We next describe how (6.37) implies (6.39). Returning to (6.41), we differentiate a second time with respect to $\varepsilon$, and then set $\varepsilon = 0$ to obtain an expression for the second variation, namely

$$\delta^2 F(y, h) = \int_a^b p(h')^2 + 2q h h' + r h^2\,dx, \tag{6.46}$$

where for brevity we use the notation $p = \frac{\partial^2 L}{\partial y'\,\partial y'}(x, y, y')$, $q = \frac{\partial^2 L}{\partial y\,\partial y'}(x, y, y')$ and $r = \frac{\partial^2 L}{\partial y\,\partial y}(x, y, y')$. We now observe that, if $y_* \in V$ is a local extremum, for example a minimizer, then the condition in (6.37) requires

$$\delta^2 F(y_*, h) = \int_a^b p(h')^2 + 2q h h' + r h^2\,dx \ge 0, \quad \forall h \in V_0. \tag{6.47}$$

Similar to before, note that all factors in the integrand are continuous, and that the integral must be nonnegative for every choice of $h$ in $V_0$. By a result which we call the sign lemma, to be outlined later, the above condition implies that the factor multiplying the highest derivatives of $h$ must be nonnegative throughout the integration interval, that is, $p \ge 0$ for all $x \in [a,b]$. This condition is the inequality stated in (6.39). The intuitive explanation is that functions $h$ can be chosen to localize the integrand around any point, and for such functions the higher derivatives will be much larger than the lower ones, and the term $p(h')^2$ will dominate.
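For reference, the second differentiation of (6.41) can be written out explicitly. The following is a supplementary sketch under the stated smoothness assumption on $L$, which also allows the two mixed partial derivatives to be combined into a single term; all partial derivatives are evaluated at $(x, y + \varepsilon h, y' + \varepsilon h')$.

```latex
\begin{align*}
\frac{d^2}{d\varepsilon^2} F(y + \varepsilon h)
  &= \int_a^b \frac{d}{d\varepsilon}\left[
       \frac{\partial L}{\partial y}\,h + \frac{\partial L}{\partial y'}\,h'
     \right] dx \\
  &= \int_a^b
       \frac{\partial^2 L}{\partial y\,\partial y}\,h^2
     + 2\,\frac{\partial^2 L}{\partial y\,\partial y'}\,h\,h'
     + \frac{\partial^2 L}{\partial y'\,\partial y'}\,(h')^2 \, dx .
\end{align*}
```

Setting $\varepsilon = 0$ evaluates the partial derivatives at $(x, y, y')$ and gives the three terms of the second variation.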
Example 6.5.1. Consider the set $V = \{y \in C^2[0,1] \mid y(0) = 0,\ y(1) = 1\}$, and functional $F(y) = \int_0^1 4x^2 y - (y')^2\,dx$. Here we find all extremals, partially classify them with the sign condition, and then determine if they are actual local extrema using the definition.

Extremals. The Lagrangian is $L(x, y, y') = 4x^2 y - (y')^2$, and its partial derivatives are $\partial L/\partial y = 4x^2$ and $\partial L/\partial y' = -2y'$. The differential equation to consider is $\frac{\partial L}{\partial y} - \frac{d}{dx}\left[\frac{\partial L}{\partial y'}\right] = 0$, which becomes $4x^2 - \frac{d}{dx}[-2y'] = 0$, or equivalently $4x^2 + 2y'' = 0$, and the interval $a \le x \le b$ becomes $0 \le x \le 1$. In view of the boundary conditions in $V$, the boundary-value problem for an extremal is

$$y'' + 2x^2 = 0, \quad y(0) = 0, \quad y(1) = 1, \quad 0 \le x \le 1. \tag{6.48}$$

The differential equation can be explicitly integrated, and the general solution is $y = -\frac{1}{6}x^4 + Cx + D$, where $C$ and $D$ are arbitrary constants. Applying the boundary conditions, we get $D = 0$ and $C = \frac{7}{6}$, and we obtain the unique extremal $y_* = -\frac{1}{6}x^4 + \frac{7}{6}x$. This is the only candidate for a local extremum.

Sign condition. Since $\frac{\partial L}{\partial y'}(x, y, y') = -2y'$, we get $\frac{\partial^2 L}{\partial y'\,\partial y'}(x, y, y') = -2$. Here the required second partial is a constant, but more generally it may depend on $x$, $y$, and $y'$. Substituting the extremal $y_*$ and its derivative $y_*'$ into this expression, we get the constant function $\frac{\partial^2 L}{\partial y'\,\partial y'}(x, y_*, y_*') \equiv -2$ for $x \in [0,1]$. The fact that this expression is $\le 0$ for all $x \in [0,1]$ informs us that $y_*$ could be a local maximizer, but not a local minimizer.

Analysis of candidate. To determine if $y_*$ is a local maximizer in the $C^m$-norm for some $m$, we attempt to verify the definition of a maximizer. To begin, let $m \le 2$ and $\delta > 0$ be given; we will adjust these as needed as we proceed. Consider any $u$ in the neighborhood $N_{C^m}(y_*, \delta)$ and let $h = u - y_*$. Note that $h$ is in $V_0$, since it is the difference of two functions in $V$. From the definition of $F$, and the fact that $u = y_* + h$, we have

$$F(u) = \int_0^1 4x^2 (y_* + h) - (y_*' + h')^2\,dx. \tag{6.49}$$
Expanding and grouping terms on the right-hand side, and again using the definition of $F$, we get

$$F(u) = F(y_*) + \int_0^1 4x^2 h - 2y_*' h'\,dx - \int_0^1 (h')^2\,dx. \tag{6.50}$$

Using integration-by-parts on the term $\int -2y_*' h'\,dx$, and noting that $h(0) = 0$ and $h(1) = 0$ since $h \in V_0$, we get

$$F(u) = F(y_*) + \int_0^1 (4x^2 + 2y_*'')h\,dx - \int_0^1 (h')^2\,dx. \tag{6.51}$$

From the differential equation in (6.48), we note that $4x^2 + 2y_*'' = 0$ for all $x \in [0,1]$. Using this observation, together with the fact that $-(h')^2 \le 0$ for all $x \in [0,1]$, we obtain the result that

$$F(u) - F(y_*) = -\int_0^1 (h')^2\,dx \le 0, \quad \forall u \in N_{C^m}(y_*, \delta). \tag{6.52}$$

The above result shows that $y_*$ is a local maximizer for any $m \le 2$ and any $\delta > 0$; in fact, it is an absolute maximizer.
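The identity (6.52) lends itself to a quick numerical check. The sketch below is illustrative only: the perturbation $h(x) = \sin(\pi x)$ is a hypothetical element of $V_0$, derivatives are supplied analytically, and a midpoint rule approximates the integrals.

```python
import math

def integrate(f, a=0.0, b=1.0, n=20000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

def F(y, dy):
    """F(y) = integral over [0, 1] of 4 x^2 y - (y')^2, with y' passed as dy."""
    return integrate(lambda x: 4.0 * x * x * y(x) - dy(x) ** 2)

# Extremal from (6.48) and its derivative.
y_star = lambda x: -x**4 / 6.0 + 7.0 * x / 6.0
dy_star = lambda x: -2.0 * x**3 / 3.0 + 7.0 / 6.0

# Hypothetical perturbation in V0: h(0) = h(1) = 0.
h = lambda x: math.sin(math.pi * x)
dh = lambda x: math.pi * math.cos(math.pi * x)

F_star = F(y_star, dy_star)
F_u = F(lambda x: y_star(x) + h(x), lambda x: dy_star(x) + dh(x))

gap = F_u - F_star                            # left-hand side of (6.52)
predicted = -integrate(lambda x: dh(x) ** 2)  # right-hand side of (6.52)

print("F(y*)        :", F_star)
print("F(u) - F(y*) :", gap)
print("-int (h')^2  :", predicted)
```

The two last values should agree up to quadrature error and are negative, consistent with $y_*$ giving a larger value of $F$ than the perturbed function $u = y_* + h$.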
6.6. Simplifications, essential results

The Euler–Lagrange (E–L) differential equation in (6.38) may be difficult to solve depending on the Lagrangian. Normally, we expect that the equation will have a second-order form, and that its general solution will have two arbitrary constants. In the special cases outlined below, it is possible to express the equation in a simplified or reduced first-order form, which may be easier to solve when the original form is nonlinear. The reduced form is called a first integral of the Euler–Lagrange equation. Note that the reduced form will contain one arbitrary constant, and the process of solving it will introduce the second arbitrary constant.

Result 6.6.1. [reduced forms of E–L equation] Let $L = L(x, y, y')$ be the Lagrangian function for the Euler–Lagrange differential equation in (6.38).

(1) If $L$ is independent of $y$, so $L = L(x, y')$, then every solution $y \in C^2[a,b]$ of the Euler–Lagrange equation must satisfy

$$\frac{\partial L}{\partial y'}(x, y') = A, \tag{6.53}$$

where $A$ is an arbitrary constant.

(2) If $L$ is independent of $x$, so $L = L(y, y')$, then every solution $y \in C^2[a,b]$ of the Euler–Lagrange equation must satisfy

$$L(y, y') - y'\,\frac{\partial L}{\partial y'}(y, y') = A, \tag{6.54}$$

where $A$ is an arbitrary constant.

Sketch of proof: Result 6.6.1. The results follow from simple manipulation of the differential equation in (6.38). In the first case, when $L = L(x, y')$, we find that $\frac{\partial L}{\partial y} = 0$, and the equation in (6.38) becomes $-\frac{d}{dx}\left[\frac{\partial L}{\partial y'}\right] = 0$. This equation can now be integrated with respect to $x$, and we obtain $\frac{\partial L}{\partial y'} = A$, where $A$ is an arbitrary constant. In the second case, when $L = L(y, y')$, we can multiply the equation in (6.38) by $y'$ to obtain

$$0 = y'\left(\frac{\partial L}{\partial y} - \frac{d}{dx}\left[\frac{\partial L}{\partial y'}\right]\right) = y'\,\frac{\partial L}{\partial y} - y'\,\frac{d}{dx}\left[\frac{\partial L}{\partial y'}\right]. \tag{6.55}$$

Using the chain rule to expand $\frac{d}{dx}\left[\frac{\partial L}{\partial y'}\right]$, and noting $\frac{d}{dx}y = y'$ and $\frac{d}{dx}y' = y''$, we obtain

$$0 = y'\,\frac{\partial L}{\partial y} - y' y'\,\frac{\partial^2 L}{\partial y\,\partial y'} - y' y''\,\frac{\partial^2 L}{\partial y'\,\partial y'}. \tag{6.56}$$
Adding and subtracting the term $y''\,\frac{\partial L}{\partial y'}$, we find that the right-hand side may be written in a simplified way, namely

$$\begin{aligned} 0 &= y'\,\frac{\partial L}{\partial y} + y''\,\frac{\partial L}{\partial y'} - y''\,\frac{\partial L}{\partial y'} - y' y'\,\frac{\partial^2 L}{\partial y\,\partial y'} - y' y''\,\frac{\partial^2 L}{\partial y'\,\partial y'} \\ &= \frac{d}{dx}\left[L - y'\,\frac{\partial L}{\partial y'}\right]. \end{aligned} \tag{6.57}$$

Similar to before, the equation $\frac{d}{dx}\left[L - y'\,\frac{\partial L}{\partial y'}\right] = 0$ can be integrated with respect to $x$, and we obtain $L - y'\,\frac{\partial L}{\partial y'} = A$, where $A$ is an arbitrary constant.

The next two examples illustrate the different forms that the E–L differential equation may take depending on the Lagrangian. The case $L = L(x, y')$ is considered in the first example, and $L = L(y, y')$ in the second. Note that the two cases are not mutually exclusive, and the simpler of the two reduced forms can be chosen when applicable.

Example 6.6.1. Consider $L(x, y, y') = \sqrt{1 + (y')^2}$. Working with the original form of the equation $\frac{\partial L}{\partial y} - \frac{d}{dx}\left[\frac{\partial L}{\partial y'}\right] = 0$, and expanding the derivatives, we get

$$\frac{y''(y')^2}{(1 + (y')^2)^{3/2}} - \frac{y''}{(1 + (y')^2)^{1/2}} = 0. \tag{6.58}$$
We could attempt to simplify the above second-order differential equation and eventually find a general solution. Alternatively, since $L$ is independent of $y$, we may consider the reduced form of the equation, which is $\frac{\partial L}{\partial y'} = A$, where $A$ is a constant. Using the expression for $L$, we get the reduced equation

$$\frac{y'}{\sqrt{1 + (y')^2}} = A. \tag{6.59}$$

The above equation can be rearranged to get $y' = \pm\frac{A}{\sqrt{1 - A^2}}$, or more simply $y' = B$, where $B$ is a constant. This equation can now be explicitly integrated and we get the general solution $y = Bx + C$, where $B$ and $C$ are arbitrary constants.

Example 6.6.2. Consider $L(x, y, y') = \frac{(y')^2}{1 + y^2}$. Working with the original form of the equation $\frac{\partial L}{\partial y} - \frac{d}{dx}\left[\frac{\partial L}{\partial y'}\right] = 0$, and expanding the derivatives, we get

$$\frac{2y(y')^2}{(1 + y^2)^2} - \frac{2y''}{1 + y^2} = 0. \tag{6.60}$$

As before, we could attempt to simplify the above second-order differential equation and find a general solution. However, since $L$ is independent of $x$, we may consider the reduced form of the equation, which is $L - y'\,\frac{\partial L}{\partial y'} = A$, where $A$ is a constant. Using the expression for $L$, we get the reduced equation

$$-\frac{(y')^2}{1 + y^2} = A. \tag{6.61}$$

The above equation informs us that $A \le 0$. For convenience, let $A = -B^2$, where $B$ is a constant. Then the above equation can be rearranged to get $y' = \pm B\sqrt{1 + y^2}$, or more simply $y' = C\sqrt{1 + y^2}$, where $C = \pm B$ is a constant. This is a first-order equation that
can be solved using separation of variables. Specifically, using the notation $\frac{dy}{dx}$ in place of $y'$, and with the help of a table of integrals, we get

$$\int \frac{dy}{\sqrt{1 + y^2}} = \int C\,dx \quad \text{which gives} \quad \sinh^{-1}(y) = Cx + D, \tag{6.62}$$
where $\sinh^{-1}(y)$ is the inverse hyperbolic sine function and $D$ is a constant. Thus the general solution is $y = \sinh(Cx + D)$, where $C$ and $D$ are arbitrary constants.

In the sections that follow, an associated Euler–Lagrange equation will be derived for different types of problems. In all cases, the derivation will rely on a result known as the fundamental lemma of the calculus of variations, as mentioned earlier. Here we outline a version of this lemma that will be useful for a number of problems.

Result 6.6.2. [fundamental lemma] Let integers $k \ge m \ge 0$ and an interval $[a,b]$ be given. If a function $w \in C^0[a,b]$ satisfies $\int_a^b w(x)h(x)\,dx = 0$ for all $h \in C^k[a,b]$, where $h^{(j)}(a) = 0$ and $h^{(j)}(b) = 0$ for $j = 0, \ldots, m$, then $w(x) \equiv 0$ for all $x \in [a,b]$.

Sketch of proof: Result 6.6.2. The result follows from a straightforward argument based on continuity. For contradiction, suppose that $w(x_\#) \ne 0$ for some $x_\# \in (a,b)$, say $w(x_\#) > 0$. Then, by continuity, there exists a number $\delta > 0$ such that $w(x) > 0$ for all $x \in (x_\# - \delta, x_\# + \delta) \subset (a,b)$. Also, we can explicitly construct a so-called bump or test function $h \in C^k[a,b]$, with $h(x) > 0$ for all $x \in (x_\# - \delta, x_\# + \delta)$, and with $h(x) = 0$ for all $x \notin (x_\# - \delta, x_\# + \delta)$, as illustrated in Figure 6.8. Note that such a function will satisfy $h^{(j)}(a) = 0$ and $h^{(j)}(b) = 0$ for all $j \ge 0$, and will visibly have the form of a bump, with zero segments on both ends.

[Figure 6.8. A bump function $h(x)$ on $[a,b]$, positive on $(x_\# - \delta, x_\# + \delta)$ and zero elsewhere.]

For such a function the product $w(x)h(x)$ will be positive when $x \in (x_\# - \delta, x_\# + \delta)$, and zero otherwise, and we get

$$\int_a^b w(x)h(x)\,dx = \int_{x_\# - \delta}^{x_\# + \delta} w(x)h(x)\,dx > 0. \tag{6.63}$$

But this contradicts the assumption that $\int_a^b w(x)h(x)\,dx = 0$, and a similar argument can be made if $w(x_\#) < 0$. Hence we must have $w(x_\#) = 0$ for all $x_\# \in (a,b)$. Moreover, since $w$ is continuous on $[a,b]$, it must also be zero at the end points $a$ and $b$.

In addition to an Euler–Lagrange equation, an associated Legendre condition will also be stated for different types of problems. In all cases, this condition will follow from a sign lemma as mentioned earlier. Here we outline a version of this lemma that will be applicable to various problems.
Result 6.6.3. [sign lemma] Let integers $m \ge n \ge r \ge 0$, an interval $[a,b]$ and functions $p_{ij} \in C^0[a,b]$ for $i,j = 0,\dots,n$ be given, and consider

$$I(h) = \int_a^b \sum_{i=0}^{n}\sum_{j=0}^{n} p_{ij}\,h^{(i)}h^{(j)}\,dx. \tag{6.64}$$

If $I(h) \ge 0$ for all $h \in C^m[a,b]$, where $h^{(k)}(a) = 0$ and $h^{(k)}(b) = 0$ for $k = 0,\dots,r$, then $p_{nn} \ge 0$ for all $x \in [a,b]$. Similarly, if $I(h) \le 0$, then $p_{nn} \le 0$.

Sketch of proof: Result 6.6.3. The result follows from continuity similar to before. Since the case $n = 0$ is straightforward, we assume $n \ge 1$. To begin, assume $I(h) \ge 0$ for all $h$ as described, and for contradiction suppose $p_{nn}(x_\#) < 0$ for some $x_\# \in (a,b)$, and let $\eta = -p_{nn}(x_\#)$. Then, by continuity, there exists a number $\delta > 0$ such that $p_{nn}(x) < -\tfrac12\eta$ for all $x \in (x_\# - \delta, x_\# + \delta) \subset (a,b)$. Next, let $q \ge m+1$ and $J \ge 1$ be odd integers, where $q$ is fixed and $J$ is arbitrary, and consider the piecewise-defined function

$$h(x) = \begin{cases} \dfrac{1}{J^n}\cos^q\!\left[\dfrac{J\pi(x - x_\#)}{2\delta}\right], & x \in (x_\# - \delta, x_\# + \delta),\\[1ex] 0, & x \notin (x_\# - \delta, x_\# + \delta). \end{cases} \tag{6.65}$$

By design, we have $h \in C^m[a,b]$, and also $h^{(k)}(a) = 0$ and $h^{(k)}(b) = 0$ for all $k$. Moreover, for $k \le n-1$, the derivatives have the property that $\lim_{J\to\infty} h^{(k)}(x) = 0$ uniformly for $x \in [a,b]$. Crucially, the derivative $h^{(n)}(x)$ is bounded and does not vanish as $J \to \infty$, but instead has the property that $K = \int_a^b (h^{(n)})^2\,dx$ is a constant independent of $J$. These observations follow from a trigonometric reduction formula, which for $q$ odd, states $\cos^q\theta = \sum_j C_j\cos(j\theta)$, where the sum extends over odd integers $j = 1, 3, \dots, q$, and $C_j$ are positive constants. From this it follows that each derivative $h^{(k)}$ is a sum of the form $\pm J^{k-n}\left(\frac{\pi}{2\delta}\right)^k \sum_j j^k C_j\cos(j\theta)$ or $\pm J^{k-n}\left(\frac{\pi}{2\delta}\right)^k \sum_j j^k C_j\sin(j\theta)$, where $\theta = \frac{J\pi(x - x_\#)}{2\delta}$, and the integral of $(h^{(n)})^2$ can be characterized explicitly to get $K = \delta\left(\frac{\pi}{2\delta}\right)^{2n}\sum_j j^{2n}C_j^2$.

Finally, let $R(h) = I(h) - \int_a^b p_{nn}(h^{(n)})^2\,dx$. Since the functions $p_{ij}$ are all continuous, and $h^{(k)} \to 0$ for $k \le n-1$, and $h^{(n)}$ is bounded, it follows that $R(h) \to 0$ as $J \to \infty$. Moreover, since $p_{nn}(x) < -\tfrac12\eta$ in $(x_\# - \delta, x_\# + \delta)$ and $h^{(n)}$ vanishes outside this interval, we have

$$I(h) = R(h) + \int_a^b p_{nn}(h^{(n)})^2\,dx < R(h) - \tfrac12\eta K. \tag{6.66}$$

For sufficiently large $J$, the above implies that $I(h) < 0$, which contradicts the assumption that $I(h) \ge 0$. Hence we must have $p_{nn}(x_\#) \ge 0$ for all $x_\# \in (a,b)$. Moreover, since $p_{nn}$ is continuous on $[a,b]$, it must also be nonnegative at the end points $a$ and $b$. Note that a similar conclusion can be reached under the assumption that $I(h) \le 0$.
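The claim that $K$ does not depend on $J$ can be checked numerically. The sketch below is only an illustration under assumed values: it takes $n = 1$ and $q = 3$ (so that $\cos^3\theta = \tfrac34\cos\theta + \tfrac14\cos 3\theta$), and the names `h_prime`, `K_numeric` and `K_formula`, as well as the choices of $x_\#$ and $\delta$, are hypothetical.

```python
import math

def h_prime(x, x0, delta, J, q=3):
    """Derivative of the test function in (6.65) for the assumed case n = 1."""
    if abs(x - x0) >= delta:
        return 0.0
    theta = J * math.pi * (x - x0) / (2 * delta)
    # h = (1/J) * cos(theta)**q and d(theta)/dx = J*pi/(2*delta), so J cancels
    return -q * (math.pi / (2 * delta)) * math.cos(theta) ** (q - 1) * math.sin(theta)

def K_numeric(J, x0=0.5, delta=0.25, steps=20000):
    """Midpoint-rule approximation of the integral of (h')**2 over the support."""
    a = x0 - delta
    dx = 2 * delta / steps
    return sum(h_prime(a + (i + 0.5) * dx, x0, delta, J) ** 2 for i in range(steps)) * dx

def K_formula(delta=0.25):
    """K = delta*(pi/(2*delta))**2 * sum_j j**2 * C_j**2 with C_1 = 3/4, C_3 = 1/4."""
    return delta * (math.pi / (2 * delta)) ** 2 * (1 * 0.75 ** 2 + 9 * 0.25 ** 2)

for J in (1, 3, 5, 11):
    print(J, K_numeric(J), K_formula())
```

For each odd $J$ the two printed values should agree closely, while the amplitude of $h$ itself is bounded by $1/J$.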
6.7. Case study

Setup. To illustrate the preceding results we study a problem in the design of a playground slide. We consider a slide in two dimensions as illustrated in Figure 6.9, where the initial point is on the left at a given height $h$ above ground level, and the terminal point is on the right at a given distance $\ell$ along ground level, with gravitational acceleration $g$ oriented vertically downwards. Beginning from the initial point with an initial speed $c$, a user of the slide will get accelerated and transported to the terminal point under the influence of gravity. Normally, the quicker the descent, the greater the thrill of the ride. Here we seek the shape of the slide that will optimize this thrill. Specifically, we seek the profile curve $y(x)$ that will provide the quickest descent, or equivalently, minimize the travel time $T$ down the slide, where $x$ and $y$ are coordinates as shown, and $h$, $\ell$, $g$ and $c$ are given positive constants.

[Figure 6.9. Slide with profile curve $y(x)$, initial height $h$, initial speed $c$, gravitational acceleration $g$, and ground level along the $x$-axis.]

In the idealized case, when the user is modeled as a mass particle which never breaks contact with the slide, and the sliding motion is planar and occurs without friction or air resistance, the above problem is equivalent to the classic brachistochrone problem, which was one of the earliest problems studied in the calculus of variations. Various generalizations could be considered, for example friction and other external forces, rotational motion of the mass in addition to sliding, or profile curves in three-dimensional space instead of two. Here we focus on the idealized version described above.

Travel time. We represent a user of the slide as a particle of arbitrary mass $m > 0$. We suppose that the particle is at the point $(x,y) = (0,h)$ at time $t = 0$, and arrives at the point $(x,y) = (\ell,0)$ at time $t = T$, so that, by definition, $T$ is the travel time. At any instant during the motion, the particle has a position $(x,y)$ and a velocity $\left(\frac{dx}{dt}, \frac{dy}{dt}\right)$. Since the particle can only move along the slide, we have $y = y(x)$, and the chain rule implies $\frac{dy}{dt} = \frac{dy}{dx}\frac{dx}{dt}$. Using the convenient notation $u = \frac{dx}{dt}$ and $v = \frac{dy}{dt}$, this chain rule relation becomes $v = \frac{dy}{dx}\,u$.
Assuming no friction or air resistance, conservation of energy requires that the total energy of the particle at any time $t \ge 0$ must be equal to that at $t = 0$. Considering kinetic and potential energy, and using ground level as the reference for potential, we have

$$\tfrac12 m(u^2 + v^2) + mgy = \tfrac12 m(u_0^2 + v_0^2) + mgy_0, \tag{6.67}$$

where $u_0^2 + v_0^2 = c^2$ and $y_0 = h$ are the squared speed and height at time $t = 0$. After eliminating the arbitrary mass $m$, and substituting the relation $v = \frac{dy}{dx}\,u$, we obtain, after slight simplification,

$$u^2 = \frac{2g(k - y)}{1 + (y')^2}, \tag{6.68}$$

where $k = \frac{c^2}{2g} + h > 0$ is a constant. Here we use the notation $y' = \frac{dy}{dx}$. For future reference, we note that $k$ represents an upper bound for the particle height $y$, that would be attained if all kinetic energy were converted to potential.

Assuming $\frac{dx}{dt} = u > 0$ throughout the travel, so that the particle is always moving to the right, we get the useful result that

$$\frac{dx}{dt} = \sqrt{2g}\left[\frac{k - y}{1 + (y')^2}\right]^{1/2}. \tag{6.69}$$
An integral expression for the travel time $T$ in terms of the profile curve $y(x)$ can now be written. Specifically, beginning from a simple integral identity, and performing a change of variable using the above expression for $\frac{dx}{dt}$, we get

$$T = \int_{t=0}^{t=T} dt = \int_{x=0}^{x=\ell} \frac{dt}{dx}\,dx = \frac{1}{\sqrt{2g}}\int_0^{\ell}\left[\frac{1 + (y')^2}{k - y}\right]^{1/2} dx. \tag{6.70}$$
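As a quick illustration of (6.70), the integral can be evaluated numerically for any trial profile. The sketch below is not part of the text's development: the function names, the midpoint rule, and the sample values of $\ell$, $h$, $c$ are all assumptions. It compares the numerical value for the straight-line profile $y = h(1 - x/\ell)$ with the closed form obtained by integrating (6.70) directly for that profile.

```python
import math

def travel_time(y, dy, ell, h, c, g=9.81, steps=20000):
    """Midpoint-rule approximation of the travel time integral (6.70)."""
    k = c ** 2 / (2 * g) + h                 # energy constant k from (6.68)
    dx = ell / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx
        total += math.sqrt((1 + dy(x) ** 2) / (k - y(x))) * dx
    return total / math.sqrt(2 * g)

def straight_line_time(ell, h, c, g=9.81):
    """Closed form of (6.70) for the straight profile y = h*(1 - x/ell)."""
    k = c ** 2 / (2 * g) + h
    slope_factor = math.sqrt(1 + (h / ell) ** 2)
    return slope_factor * (2 * ell / h) * (math.sqrt(k) - math.sqrt(k - h)) / math.sqrt(2 * g)

ell, h, c = 4.0, 2.0, 1.0                    # sample design parameters (assumed)
line = lambda x: h * (1 - x / ell)
dline = lambda x: -h / ell
print(travel_time(line, dline, ell, h, c), straight_line_time(ell, h, c))
```

Any other admissible profile with $y(0) = h$, $y(\ell) = 0$ and $y(x) < k$ can be passed in the same way to compare travel times.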
Restated problem. To state the problem in the notation of this chapter, we consider finding minimizers for a functional $F : V \to \mathbb{R}$ given by

$$F(y) = \frac{1}{\sqrt{2g}}\int_0^{\ell} L(x, y, y')\,dx, \tag{6.71}$$

where $L(x, y, y')$ is the integrand defined in (6.70). Note that, since the constant factor outside of the integral can be eliminated from the Euler–Lagrange equation, there is no need to include it with the integrand. Moreover, based on the first-order form of the functional, and the fact that the initial and terminal points of the slide are prescribed, we seek minimizers among the set of functions

$$V = \{\,y \in C^2[0,\ell] \mid y(0) = h,\ y(\ell) = 0,\ y(x) < k\,\}. \tag{6.72}$$
The additional, upper bound condition $y(x) < k$ for all $x \in [0,\ell]$ is a consequence of energy conservation. Specifically, in view of the initial and terminal points, the particle height $y$ could never reach or exceed $k$ in any motion from one end of the slide to the other, and consequently, the integrand $L$ is real and finite only when the bound is satisfied. (The case of zero initial speed is special and not considered here; in this case, we would have $c = 0$ and $k = h$, so that the upper bound would be attained at the initial point, and the integrand must then be allowed to become infinite.) In our case, the upper bound will play no active role in our developments; we can simply verify that it is satisfied at the end of our analysis. Observe that any candidate minimizer would favor a larger separation between $y$ and $k$, since a smaller separation would tend to increase the integrand and thus the travel time.

System of equations. Every extremal of $F$ in $V$ must satisfy the system of equations in (6.38). In view of the integrand $L$ in (6.70), we note that the general form of the Euler–Lagrange equation $\frac{\partial L}{\partial y} - \frac{d}{dx}\left[\frac{\partial L}{\partial y'}\right] = 0$ will be tedious. However, since $L$ is independent of $x$, we may instead consider the reduced form of the equation given in Result 6.6.1, which is $L - y'\frac{\partial L}{\partial y'} = A$, where $A$ is a constant. Using the expression for $L$, we get, after some simplification,

$$[1 + (y')^2]^{-1/2} = A\,[k - y]^{1/2}. \tag{6.73}$$

The above equation informs us that $A > 0$ since the two factors in brackets are positive. By rearranging the equation we get $1 + (y')^2 = \frac{1}{A^2(k - y)}$, or more simply $(y')^2 = \frac{B}{k - y} - 1$, where $B = \frac{1}{A^2}$. Noting that $y' = \frac{dy}{dx}$, and including boundary conditions, we find that every extremal must satisfy the system

$$\left(\frac{dy}{dx}\right)^2 = \frac{B - (k - y)}{k - y}, \qquad y(0) = h, \qquad y(\ell) = 0, \qquad 0 \le x \le \ell, \tag{6.74}$$
where $k > 0$ is a given constant, and $B > 0$ is an arbitrary constant.

The differential equation. For convenience in constructing the general solution of the differential equation, we consider an alternative description of a solution curve. Specifically, instead of the cartesian description $y = y(x)$, $0 \le x \le \ell$, we consider a general parametric description $x = p(s)$, $y = q(s)$, $a \le s \le b$. Here $s$ is an arbitrary parameter along the curve, $p(s)$ and $q(s)$ are arbitrary functions, and $[a,b]$ is an arbitrary interval. We may suppose that $\frac{dx}{ds} > 0$ so that the curve is traced left to right. The fact that a parametric description of a solution curve involves two arbitrary functions is advantageous: we can choose one of the functions $p(s)$ or $q(s)$ to simplify the differential equation, and then solve for the other.

To illustrate the parametric approach, we substitute the calculus relation $\frac{dy}{dx} = \frac{dy}{ds}\big/\frac{dx}{ds}$ into the differential equation in (6.74), and then rearrange terms to separate $\frac{dx}{ds}$ and $\frac{dy}{ds}$, which gives

$$\left(\frac{dx}{ds}\right)^2 = \frac{(k - y)}{B - (k - y)}\left(\frac{dy}{ds}\right)^2. \tag{6.75}$$

We may now choose $y = q(s)$ to simplify the right-hand side of the equation. Any choice can be made, provided that it leads to a curve for which $\frac{dy}{dx}$ and $\frac{d^2y}{dx^2}$ are defined and continuous, which can be verified in the end. Motivated by the form of the quotient above, we let $k - y = B\sin^2(s)$. This corresponds to $y = k - B\sin^2(s)$, which gives $\frac{dy}{ds} = -2B\sin(s)\cos(s)$, and using the identity $1 - \sin^2(s) = \cos^2(s)$, we get

$$\left(\frac{dx}{ds}\right)^2 = 4B^2\sin^4(s). \tag{6.76}$$

The above equation implies $\frac{dx}{ds} = \pm 2B\sin^2(s)$. By choice, we orient the curve so that $\frac{dx}{ds} > 0$, which corresponds to the positive case $\frac{dx}{ds} = 2B\sin^2(s)$. This equation can now be explicitly integrated to obtain $x = E + B[s - \tfrac12\sin(2s)]$, where $E$ is a constant. Thus, in parametric form, the general solution curve of the differential equation is

$$x = E + B[s - \tfrac12\sin(2s)], \qquad y = k - B\sin^2(s), \qquad a \le s \le b. \tag{6.77}$$

In other problems we could now attempt to eliminate the parameter $s$, and thereby obtain a cartesian description of the curve involving only $x$ and $y$; that is not possible here.
The curve in (6.77) can be written in a simpler, more symmetric form. Specifically, using the substitutions $\phi = 2s$, $\alpha = 2a$, $\beta = 2b$ and $D = \frac{B}{2}$, we obtain

$$x = E + D[\phi - \sin\phi], \qquad y = k - D[1 - \cos\phi], \qquad \alpha \le \phi \le \beta. \tag{6.78}$$

Here $k > 0$ is a given constant, while $D > 0$, $E$, $\alpha$, $\beta$ are arbitrary constants. Because of periodicity, and to obtain a curve for which $\frac{dy}{dx}$ and $\frac{d^2y}{dx^2}$ are defined and continuous, we restrict attention to the interval $0 < \alpha < \beta < 2\pi$. Specifically, an inspection of (6.78) reveals that a vertical tangent or cusp with $\frac{dy}{dx} = \pm\infty$ occurs when $\frac{dx}{d\phi} = 0$, or equivalently when $\phi$ is an integer multiple of $2\pi$, and thus we must avoid such points.

The general curve defined above is called a cycloid curve. A description of this curve can be found in many texts on elementary calculus, and it has an interesting geometrical characterization that is independent of the slide problem considered here: as a circular wheel is rolled along a flat surface, a point on the perimeter of the wheel will trace out such a curve, up to a congruence. Although the cycloid given in (6.78) is the general solution of the Euler–Lagrange equation, we have not yet found an extremal. We must now consider if values of the arbitrary constants can be found to satisfy the boundary conditions.

The boundary conditions. In view of the general curve in (6.78), the boundary conditions in (6.74) require that $(x,y) = (0,h)$ when $\phi = \alpha$, and also $(x,y) = (\ell,0)$ when $\phi = \beta$. By writing each condition for $x$ and $y$ separately, we obtain

$$\begin{aligned} E + D[\alpha - \sin\alpha] &= 0, & k - D[1 - \cos\alpha] &= h,\\ E + D[\beta - \sin\beta] &= \ell, & k - D[1 - \cos\beta] &= 0. \end{aligned} \tag{6.79}$$
Thus the boundary conditions yield four equations for four unknown constants $D$, $E$, $\alpha$, $\beta$. Note that the determination of solutions is nontrivial due to the nonlinear form of the equations. The existence or not of an extremal, and the possibility of multiple extremals, ultimately relies on our ability to characterize solutions of these equations for any given values of the slide design parameters $h$, $\ell$, $g$, $c$, where $k = \frac{c^2}{2g} + h$.

Summary of results. Here we summarize two results about extremals. We give a hint of the proof of the first result, and note that complete proofs of both results are tedious and outside the scope considered here.

Slide result 1. For any positive design parameters $h$, $\ell$, $g$, $c$, the equations in (6.79) have a unique solution $D$, $E$, $\alpha$, $\beta$, under the restrictions $D > 0$ and $0 < \alpha < \beta < 2\pi$. Thus there is a unique extremal; it is a cycloid curve.

Slide result 2. The unique cycloid extremal is the absolute minimizer of the functional $F$ in the set $V$; no other curve provides a quicker travel time.

Note that the minimizing curve is a specific arc of a cycloid, where the arc is defined by the interval $[\alpha,\beta]$. Depending on the design parameters, this interval may contain the point $\phi = \pi$ in its interior, which affects the qualitative properties of the curve as shown in Figure 6.10. Specifically, when $\pi \notin (\alpha,\beta)$, the minimizing curve that provides the quickest travel time is strictly downhill from the initial point to the terminal point, as intuitively expected. However, when $\pi \in (\alpha,\beta)$, the minimizing curve will extend below ground level and contain an uphill portion as illustrated; although the mass particle must travel uphill for some distance, this curve will nonetheless provide the quickest travel time. Interestingly, the physical construction of such a slide would require some digging!

[Figure 6.10. Minimizing cycloid arcs in the $xy$-plane for the two cases $\pi \notin (\alpha,\beta)$ and $\pi \in (\alpha,\beta)$.]

Sketch of proof for result 1. We suppose that positive constants $h$, $\ell$, $g$, $c$ are given, and seek constants $D$, $E$, $\alpha$, $\beta$ that satisfy (6.79), where $D > 0$ and $0 < \alpha < \beta < 2\pi$. To get a glimpse of the basic ideas, we consider the special limiting case when $c = 0$, which requires the relaxed restrictions $0 \le \alpha < \beta < 2\pi$. Although this limiting case is excluded in the results outlined above, it leads to a simpler analysis, which serves as a guide for the general case.

The assumption that $c = 0$ implies the simple and useful result that $k = h$. From the second equation in (6.79) we get $D[1 - \cos\alpha] = 0$, and using the restrictions that $D > 0$ and $0 \le \alpha < 2\pi$, we deduce that $\alpha = 0$. Substituting this result into the first equation in (6.79) then implies $E = 0$. Since $\beta - \sin\beta > 0$ for all $0 < \beta < 2\pi$, the third equation in (6.79) gives $D = \ell/[\beta - \sin\beta]$, which can then be substituted into the fourth equation, and we obtain

$$\alpha = 0, \qquad E = 0, \qquad D = \frac{\ell}{\beta - \sin\beta}, \qquad \frac{1 - \cos\beta}{\beta - \sin\beta} = \frac{h}{\ell}. \tag{6.80}$$

We now observe that the single-variable function $f(\beta) = \frac{1 - \cos\beta}{\beta - \sin\beta}$ has a graph that is monotone decreasing on the interval $0 < \beta < 2\pi$, and has the property that $\lim_{\beta\to 0^+} f(\beta) = \infty$ and $\lim_{\beta\to 2\pi^-} f(\beta) = 0$. Thus for any given $h > 0$ and $\ell > 0$ there is a unique $\beta_\# \in (0, 2\pi)$ that satisfies the root equation $f(\beta_\#) = \frac{h}{\ell}$. Note that there is no simple expression for this root, and hence it must be found numerically. Once this root is known, we have $\beta = \beta_\#$ and $D = \ell/[\beta_\# - \sin\beta_\#]$, and a unique solution of (6.79) is obtained.
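Since the root $\beta_\#$ must be found numerically, a simple bisection is enough for the limiting case $c = 0$. The following is a minimal sketch, not a method prescribed by the text: the function names and the bracket end points are assumptions, and the bracket deliberately stays away from $\beta = 0$, where the ratio is numerically ill-conditioned.

```python
import math

def f(beta):
    """Ratio (1 - cos(beta)) / (beta - sin(beta)) from (6.80)."""
    return (1 - math.cos(beta)) / (beta - math.sin(beta))

def solve_cycloid(h, ell, tol=1e-12):
    """Bisection for the root of f(beta) = h/ell in (0, 2*pi); returns (beta, D)."""
    target = h / ell
    lo, hi = 1e-3, 2 * math.pi - 1e-9   # f decreases from large values to near 0
    if not f(hi) < target < f(lo):
        raise ValueError("h/ell lies outside the range covered by this bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > target:
            lo = mid                    # root is to the right because f decreases
        else:
            hi = mid
    beta = 0.5 * (lo + hi)
    D = ell / (beta - math.sin(beta))   # third equation of (6.79) with E = 0
    return beta, D

beta, D = solve_cycloid(h=2.0, ell=math.pi)
print(beta, D)   # for h/ell = 2/pi the root is beta = pi, giving D = 1
```

The sample call uses $h/\ell = 2/\pi$ because $f(\pi) = 2/\pi$ exactly, which makes the result easy to check by hand.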
6.8. Natural boundary conditions

We consider the problem of finding local extrema for a functional $F : V \to \mathbb{R}$, where the set of functions is

$$V = \{\,y \in C^2[a,b] \mid y(a) = \alpha\,\}, \qquad y(b) \text{ free}, \tag{6.81}$$

the space of variations is

$$V_0 = \{\,h \in C^2[a,b] \mid h(a) = 0\,\}, \qquad h(b) \text{ free}, \tag{6.82}$$

and the functional is

$$F(y) = \int_a^b L(x, y, y')\,dx + [G(y)]_{x=b}. \tag{6.83}$$

Here $[a,b]$ is a given interval, $\alpha$ is a given constant, $L(x,y,y')$ is a given integrand, and $G(y)$ is a given function. Unless indicated otherwise, we assume that $L$ is twice continuously differentiable for all $x \in [a,b]$, $y \in \mathbb{R}$ and $y' \in \mathbb{R}$, and that $G$ is continuously differentiable for all $y \in \mathbb{R}$. The trivial case $G \equiv 0$ is typical.

Similar to before, the above problem is of first-order type, since the functional $F$ involves derivatives of at most first order. However, in contrast to before, the problem is now of fixed-free type, since the functions in $V$ have a free end at the point $x = b$; that is, the value of $y(b)$ is arbitrary. The term $[G(y)]_{x=b}$ in the functional is called a free-end term; it is evaluated at the single point $x = b$. For instance, if $G(y) = y^2 + e^{-y}$, then $[G(y)]_{x=b} = y^2(b) + e^{-y(b)}$. The following result outlines some implications of the general necessary conditions, when specialized to the above problem.

Result 6.8.1. Let $F : V \to \mathbb{R}$ be defined as in (6.81)–(6.83). If $y_* \in V$ is a local minimizer of $F$ in the $C^m$-norm for some $m$, then

$$\left[\frac{d}{d\varepsilon}F(y_* + \varepsilon h)\right]_{\varepsilon=0} = 0, \qquad \forall h \in V_0, \tag{6.84}$$

and

$$\left[\frac{d^2}{d\varepsilon^2}F(y_* + \varepsilon h)\right]_{\varepsilon=0} \ge 0, \qquad \forall h \in V_0. \tag{6.85}$$

Condition (6.84) implies that $y_*$ must satisfy

$$\begin{cases} \dfrac{\partial L}{\partial y}(x,y,y') - \dfrac{d}{dx}\left[\dfrac{\partial L}{\partial y'}(x,y,y')\right] = 0, \quad a \le x \le b,\\[1.5ex] y(a) = \alpha, \qquad \left[\dfrac{\partial L}{\partial y'}(x,y,y') + \dfrac{dG}{dy}(y)\right]_{x=b} = 0. \end{cases} \tag{6.86}$$

Condition (6.85) implies that $y_*$ must also satisfy

$$\frac{\partial^2 L}{\partial y'\,\partial y'}(x,y,y') \ge 0, \qquad a \le x \le b. \tag{6.87}$$

For a local maximizer, change $\ge$ to $\le$ in conditions (6.85) and (6.87).
Thus the conditions for local extrema of the fixed-free problem are similar to those for the fixed-fixed problem, but with an important difference. Specifically, the Euler–Lagrange differential equation is the same as before, but now there is a new type of boundary condition associated with the free end; it is called a natural boundary condition. In contrast, the boundary condition associated with the fixed end, which is explicitly specified in the set $V$, is called an essential boundary condition. As in the fixed-fixed case, the boundary-value problem in (6.86) may have one, none, or multiple solutions. Note that the natural condition is a property of an extremizing function. While every function in $V$ is required to satisfy the essential condition, only the local extrema of $F$ are required to satisfy the additional, natural condition in order to be extremizing.
Note also that problems of the free-fixed and free-free types would all involve different combinations of natural and essential boundary conditions. It is important to note that the natural boundary condition for a free end at $x = a$ is slightly different than for a free end at $x = b$; there is a difference of sign. Just as before, the above conditions are necessary, but not sufficient. These conditions are still not sufficient even if the inequalities in (6.85) and (6.87) are made strict. The conditions can only be used to find candidates for extrema, and a separate analysis would be required to determine which candidates, if any, are actual extrema.

Sketch of proof: Result 6.8.1. Here we show how (6.84) implies (6.86), including the natural boundary condition. To begin, consider any fixed $y \in V$ and $h \in V_0$. From the definition of $F$ in (6.83) we have

$$F(y + \varepsilon h) = \int_a^b L(x, y + \varepsilon h, y' + \varepsilon h')\,dx + [G(y + \varepsilon h)]_{x=b}. \tag{6.88}$$

Just as in the fixed-fixed case, we differentiate with respect to $\varepsilon$, and take the derivative inside the integral and use the chain rule, and then set $\varepsilon = 0$. The resulting expression for the first variation is

$$\delta F(y,h) = \int_a^b (p\,h + q\,h')\,dx + [r\,h]_{x=b}, \tag{6.89}$$

where for brevity we use the notation $p = \frac{\partial L}{\partial y}(x,y,y')$, $q = \frac{\partial L}{\partial y'}(x,y,y')$ and $r = \frac{dG}{dy}(y)$. As before, we next write the above expression in a more useful form using the integration-by-parts formula $\int u\,dv = uv - \int v\,du$. Specifically, applying this formula to the term $\int q\,h'\,dx$, with $u = q$ and $dv = h'\,dx$, we get $du = q'\,dx$ and $v = h$, and we obtain

$$\delta F(y,h) = \int_a^b (p\,h - q'h)\,dx + [q\,h]_{x=a}^{x=b} + [r\,h]_{x=b}. \tag{6.90}$$

In the above, note that $q'$ means $\frac{dq}{dx}$, or equivalently, $\frac{d}{dx}\left[\frac{\partial L}{\partial y'}(x,y,y')\right]$. Also, $[q\,h]_{x=a}^{x=b} = [q\,h]_{x=b} - [q\,h]_{x=a}$. Since $h \in V_0$, we have $h(a) = 0$, and it follows that the boundary term at $x = a$ is zero. Thus we get the expression

$$\delta F(y,h) = \int_a^b (p - q')h\,dx + [q + r]_{x=b}\,h(b). \tag{6.91}$$

We now observe that, if $y_* \in V$ is a local extremum, then the condition in (6.84) requires

$$\delta F(y_*,h) = \int_a^b (p - q')h\,dx + [q + r]_{x=b}\,h(b) = 0, \qquad \forall h \in V_0. \tag{6.92}$$

As a special case, the above equation must also hold for the subset of $V_0$ with $h(b) = 0$. So a local extremum must also satisfy

$$\int_a^b (p - q')h\,dx = 0, \qquad \forall h \in C^2[a,b],\ h(a) = 0,\ h(b) = 0. \tag{6.93}$$
The above condition is the same as considered in the fixed-fixed case. By the fundamental lemma, due to the continuity of $p$ and $q'$, this condition will hold when and only when $p - q' = 0$ for all $x \in [a,b]$. Substituting this information into (6.92), we find that $[q + r]_{x=b}\,h(b) = 0$ for all $h \in V_0$, and by choosing any $h$ with $h(b) \ne 0$ we find that $[q + r]_{x=b} = 0$. The equation $p - q' = 0$ for $x \in [a,b]$ is the Euler–Lagrange equation, and $[q + r]_{x=b} = 0$ is the natural boundary condition. When these are combined with the essential boundary condition specified in $V$, we obtain the boundary-value problem in (6.86).

Example 6.8.1. Consider the set $V = \{\,y \in C^2[0,1] \mid y(0) = 3\,\}$, and functional $F(y) = \int_0^1 \left[(y')^2 - xy' + y^2\right] dx + y^2(1)$. Here we find all extremals, partially classify them with the sign condition, and then determine if they are actual local extrema using the definition.

Extremals. There is a free end at $x = 1$. The Lagrangian is $L(x,y,y') = (y')^2 - xy' + y^2$, and its partial derivatives are $\partial L/\partial y = 2y$ and $\partial L/\partial y' = 2y' - x$. The free-end term has $G(y) = y^2$, and its derivative is $dG/dy = 2y$. The differential equation to consider is $\frac{\partial L}{\partial y} - \frac{d}{dx}\left[\frac{\partial L}{\partial y'}\right] = 0$, which becomes $2y - \frac{d}{dx}[2y' - x] = 0$, or equivalently $2y - 2y'' + 1 = 0$. The natural boundary condition at the free end is $\left[\frac{\partial L}{\partial y'} + \frac{dG}{dy}\right]_{x=1} = 0$, which becomes $[2y' - x + 2y]_{x=1} = 0$, or equivalently $2y'(1) - 1 + 2y(1) = 0$. In view of the essential boundary condition in $V$, the boundary-value problem for an extremal is

$$y'' - y = \tfrac12, \qquad y(0) = 3, \qquad y'(1) + y(1) = \tfrac12, \qquad 0 \le x \le 1. \tag{6.94}$$

The differential equation can be solved using standard methods for linear, inhomogeneous equations, and the general solution is $y = C_1e^{x} + C_2e^{-x} - \tfrac12$, where $C_1$ and $C_2$ are arbitrary constants. Applying the boundary conditions, we get $C_1 = \frac{1}{2e}$ and $C_2 = \frac{7e - 1}{2e}$, and we obtain a unique extremal $y_*$. This is the only candidate for a local extremum.
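The constants in the extremal can be verified directly by substituting into (6.94). The short sketch below does this with the exact derivatives of $y = C_1e^{x} + C_2e^{-x} - \tfrac12$; the function names and the sample points are assumptions made only for this check.

```python
import math

C1 = 1 / (2 * math.e)
C2 = (7 * math.e - 1) / (2 * math.e)

def y(x):
    """Candidate extremal y = C1*e^x + C2*e^(-x) - 1/2."""
    return C1 * math.exp(x) + C2 * math.exp(-x) - 0.5

def dy(x):
    """First derivative of the candidate extremal."""
    return C1 * math.exp(x) - C2 * math.exp(-x)

def d2y(x):
    """Second derivative of the candidate extremal."""
    return C1 * math.exp(x) + C2 * math.exp(-x)

# Residuals of the three conditions in (6.94); each should be near zero.
ode_residual = max(abs(d2y(x) - y(x) - 0.5) for x in (0.0, 0.25, 0.5, 0.75, 1.0))
essential_residual = abs(y(0.0) - 3.0)
natural_residual = abs(dy(1.0) + y(1.0) - 0.5)
print(ode_residual, essential_residual, natural_residual)
```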
Sign condition. Since $\frac{\partial L}{\partial y'}(x,y,y') = 2y' - x$, we get $\frac{\partial^2 L}{\partial y'\,\partial y'}(x,y,y') = 2$. Here the required second partial is a constant, but more generally it may depend on $x$, $y$ and $y'$. Substituting the extremal $y_*$ and its derivative $y_*'$ into this expression, we get the constant function $\frac{\partial^2 L}{\partial y'\,\partial y'}(x,y_*,y_*') \equiv 2$ for $x \in [0,1]$. The fact that this expression is $\ge 0$ for all $x \in [0,1]$ informs us that $y_*$ could be a local minimizer, but not a local maximizer.

Analysis of candidate. To determine if $y_*$ is a local minimizer in the $C^m$-norm for some $m$, we attempt to verify the definition of a minimizer. To begin, let $m \le 2$ and $\delta > 0$ be given; we will adjust these as needed. Consider any $u$ in the neighborhood $N_{C^m}(y_*,\delta)$ and let $h = u - y_*$. Note that $h$ is in $V_0$, since it is the difference of two functions in $V$. From the definition of $F$, and the fact that $u = y_* + h$, we have

$$F(u) = \int_0^1 \left[(y_*' + h')^2 - x(y_*' + h') + (y_* + h)^2\right] dx + (y_*(1) + h(1))^2. \tag{6.95}$$

Expanding and grouping terms on the right-hand side, and again using the definition of $F$, we get

$$F(u) = F(y_*) + \int_0^1 \left[2y_*'h' - xh' + 2y_*h + (h')^2 + h^2\right] dx + 2y_*(1)h(1) + h^2(1). \tag{6.96}$$

Using integration-by-parts on the term $\int (2y_*' - x)h'\,dx$, and noting that $h(0) = 0$ since $h \in V_0$, we get

$$F(u) = F(y_*) + \int_0^1 \left[(2y_* + 1 - 2y_*'')h + (h')^2 + h^2\right] dx + (2y_*'(1) - 1 + 2y_*(1))h(1) + h^2(1). \tag{6.97}$$

From the differential equation and natural boundary condition in (6.94), we note that $2y_* + 1 - 2y_*'' = 0$ for all $x \in [0,1]$ and $2y_*'(1) - 1 + 2y_*(1) = 0$. Using this observation, together with the fact that $(h')^2 \ge 0$ and $h^2 \ge 0$ for all $x \in [0,1]$, we obtain the result that

$$F(u) - F(y_*) = \int_0^1 \left[(h')^2 + h^2\right] dx + h^2(1) \ge 0, \qquad \forall u \in N_{C^m}(y_*,\delta). \tag{6.98}$$

The above result shows that $y_*$ is a local minimizer for any $m \le 2$ and any $\delta > 0$; in fact, it is an absolute minimizer.
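The identity (6.98) can also be checked numerically for a particular variation. The sketch below is an illustration only: it assumes the sample variation $h(x) = x^2$ (which satisfies $h(0) = 0$), evaluates $F$ with a midpoint rule, and compares the gap $F(y_* + h) - F(y_*)$ with the exact value $\int_0^1 (4x^2 + x^4)\,dx + 1 = \tfrac{38}{15}$. All function names are hypothetical.

```python
import math

C1 = 1 / (2 * math.e)
C2 = (7 * math.e - 1) / (2 * math.e)

def y_star(x):
    return C1 * math.exp(x) + C2 * math.exp(-x) - 0.5

def dy_star(x):
    return C1 * math.exp(x) - C2 * math.exp(-x)

def F(u, du, steps=20000):
    """Midpoint-rule value of the functional in Example 6.8.1."""
    dx = 1.0 / steps
    integral = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx
        integral += (du(x) ** 2 - x * du(x) + u(x) ** 2) * dx
    return integral + u(1.0) ** 2       # free-end term y^2(1)

def u(x):
    return y_star(x) + x ** 2           # perturbed function with h(x) = x^2

def du(x):
    return dy_star(x) + 2 * x

gap = F(u, du) - F(y_star, dy_star)
expected = 4 / 3 + 1 / 5 + 1            # integral of (h')^2 + h^2, plus h(1)^2
print(gap, expected)
```

The linear terms in $h$ drop out of the gap precisely because $y_*$ satisfies both the differential equation and the natural boundary condition.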
6.9. Case study

Setup. To illustrate the role of boundary conditions we study a problem in the optimal steering control of a boat. We consider driving a boat across a channel of moving water as illustrated in Figure 6.11, where the departure point $P$ is on the left bank, and the arrival point $Q$ is on the right bank. We suppose that the water motion is everywhere parallel to the banks, and that the water speed $w(x)$ is a given function of position across the channel, and could possibly change sign within the channel. Moreover, we suppose that the boat moves at constant speed $\sigma$ relative to the water, and that the steering angle $\theta(x)$ with respect to the horizontal axis can be adjusted or controlled as desired by a driver. Throughout our developments, $x$ and $y$ will denote coordinates as shown, and $\ell$ will denote the width of the channel. We assume that $\ell > 0$ and $\sigma > 0$ are given constants, and that $w(x)$ and $\theta(x)$ are continuously differentiable.

[Figure 6.11. Boat crossing a channel from $P$ to $Q$: path $y(x)$, water speed $w(x)$, boat speed $\sigma$, and steering angle $\theta(x)$.]
As the boat moves across the channel, the path $y(x)$ and steering angle $\theta(x)$ are directly related, and the travel time $T$ can be expressed in terms of either. Here we seek the path and steering angle that will minimize the travel time under different boundary conditions. Note that, since boundary conditions will be specified on the path, we treat $y(x)$ as the primary variable rather than $\theta(x)$. Once an optimal path $y(x)$ is known, the corresponding optimal steering angle $\theta(x)$ can be found.

It will be convenient to introduce the speed ratio $r(x) = w(x)/\sigma$, and we will assume that the magnitude of the water speed is everywhere less than the boat speed so that $-1 < r(x) < 1$. Moreover, we will assume that the boat path is a graph (one $y$ for each $x$) with two continuous derivatives, which requires $-\frac{\pi}{2} < \theta(x) < \frac{\pi}{2}$. We note that a different approach to the problem, involving more general curves rather than graphs, would be needed if these restrictions were removed.

Steering-path relation. The relation between the steering angle and path follows from simple considerations about velocity. Specifically, let $(x,y)$ be the position of the boat at any time $t \ge 0$, so that its velocity is given by

$$\vec{v} = \frac{dx}{dt}\,\vec{i} + \frac{dy}{dt}\,\vec{j}, \tag{6.99}$$

where $\vec{i}$ and $\vec{j}$ are the standard unit vectors in the positive coordinate directions. Equivalently, noting that $\vec{v}$ is the resultant of two contributions, namely the velocity of the water, plus the velocity of the boat relative to the water, we also have

$$\vec{v} = [w(x)\,\vec{j}\,] + [\sigma\cos\theta(x)\,\vec{i} + \sigma\sin\theta(x)\,\vec{j}\,]. \tag{6.100}$$

From (6.99) and (6.100) we get $\frac{dx}{dt} = \sigma\cos\theta(x)$ and $\frac{dy}{dt} = w(x) + \sigma\sin\theta(x)$, which is a dynamical system for the boat position $(x,y)(t)$ as a function of time. By dividing these component equations we obtain the associated path equation for this system, namely

$$\frac{dy}{dx} = \frac{w(x) + \sigma\sin\theta(x)}{\sigma\cos\theta(x)} = \frac{r(x) + \sin\theta(x)}{\cos\theta(x)}. \tag{6.101}$$

The above relation implies that, if the steering angle $\theta(x)$ is known, then the resulting path $y(x)$ can be found. Moreover, the importance of the restriction $-\frac{\pi}{2} < \theta(x) < \frac{\pi}{2}$ can now be seen; it guarantees that $\frac{dy}{dx}$ will be defined and continuous (finite) when $r(x)$ and $\theta(x)$ are given. Thus the restriction on the angle implies that the curve $(x,y)(t)$ can be described as a graph $y(x)$.

Inverted relation. For our purposes, it will be convenient to invert the relation in (6.101) and express $\theta(x)$ in terms of $y'(x)$. After rearranging the equation, and using $y'$ in place of $\frac{dy}{dx}$, and omitting the argument $x$ in all functions for brevity, we get

$$\sin\theta = y'\cos\theta - r. \tag{6.102}$$

Squaring both sides and using the identity $\sin^2\theta = 1 - \cos^2\theta$, we arrive at a quadratic equation which can be solved for $\cos\theta$. Consistent with the restriction $-\frac{\pi}{2} < \theta < \frac{\pi}{2}$, we choose the positive root and obtain

$$\cos\theta = \frac{r\,y' + \sqrt{(y')^2 + 1 - r^2}}{(y')^2 + 1}. \tag{6.103}$$
The above expression can be put into an alternative form. Specifically, we can rationalize the numerator to get

$$\cos\theta = \frac{1 - r^2}{\sqrt{(y')^2 + 1 - r^2} - r\,y'}. \tag{6.104}$$

The above relations imply that, if the path $y(x)$ or its derivative $y'(x)$ are known, then the corresponding steering angle $\theta(x)$ can be found. Moreover, the importance of the restriction $-1 < r(x) < 1$ can now be seen; it guarantees that $\cos\theta(x)$ will be real and positive when $r(x)$ and $y'(x)$ are given. Note that, to obtain an angle in the interval $-\frac{\pi}{2} < \theta(x) < \frac{\pi}{2}$, we would substitute (6.104) into (6.102) and use inverse sine. Thus the restriction on the speed ratio implies that a steering angle function $\theta(x)$ will exist for any given path function $y(x)$. Note that, if this restriction were removed, then there may be no angle function that is consistent with a given path, that is, some paths might not be achievable. For instance, in the case when $r(x) > 1$, so that the water speed is positive and exceeds the boat speed at all locations in the channel, the boat would not be able to move along a path that is upstream, or even horizontal.
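The recipe just described, substituting (6.104) into (6.102) and applying inverse sine, is easy to try out. The sketch below is an illustration with hypothetical function names; it recovers the steering angle from a slope $y'$ and speed ratio $r$, and then uses (6.101) to confirm that the angle reproduces the slope.

```python
import math

def steering_angle(dy, r):
    """Steering angle in (-pi/2, pi/2) from slope dy = y'(x) and speed ratio r."""
    if not -1 < r < 1:
        raise ValueError("the speed ratio must satisfy -1 < r < 1")
    cos_t = (1 - r ** 2) / (math.sqrt(dy ** 2 + 1 - r ** 2) - r * dy)   # (6.104)
    sin_t = dy * cos_t - r                                              # (6.102)
    sin_t = max(-1.0, min(1.0, sin_t))      # guard against rounding outside [-1, 1]
    return math.asin(sin_t)

def slope(theta, r):
    """Path slope dy/dx produced by steering angle theta, from (6.101)."""
    return (r + math.sin(theta)) / math.cos(theta)

# Heading straight across (y' = 0) in a current with r = 0.5 requires
# steering upstream at 30 degrees, that is theta = -pi/6.
theta = steering_angle(0.0, 0.5)
print(theta, slope(theta, 0.5))
```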
Travel time. In view of the condition β 2 < π(π₯) < 2 , we find that ππ‘ = π cos π(π₯) > 0, which implies that π₯ increases monotonically with time. We suppose that the boat is at π₯ = 0 at time π‘ = 0, and arrives at π₯ = β at time π‘ = π, so that, by definition, π is the travel time across the channel. An integral expression for the travel time in terms of the path π¦(π₯) can now be written. Specifically, beginning from a simple integral identity, and performing a change of variable using the above expression ππ₯ for ππ‘ , and then using the expression in (6.104), we get π‘=π
$$T = \int_{t=0}^{t=T} dt = \int_{x=0}^{x=\ell} \frac{dt}{dx}\,dx = \frac{1}{b}\int_0^{\ell} \frac{\sqrt{(y')^2 + 1 - k^2} - k y'}{1 - k^2}\,dx. \tag{6.105}$$
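The travel-time integral can be approximated numerically for a given path. The sketch below is illustrative only: the function name `travel_time`, its arguments, and the use of a composite midpoint rule are choices made here, not taken from the text. The callable `k` is assumed to return the speed ratio at each `x`, and `b` denotes the boat speed.

```python
import math

def travel_time(yprime, k, ell, b, n=2000):
    """Approximate the travel time in (6.105) by the composite midpoint rule.

    yprime(x): slope of the path y(x)           (hypothetical argument name)
    k(x):      speed ratio, required in (-1, 1) (hypothetical argument name)
    ell, b:    channel width and boat speed
    """
    dx = ell / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * dx
        yp, kx = yprime(x), k(x)
        if not -1 < kx < 1:
            raise ValueError("speed ratio must satisfy -1 < k(x) < 1")
        total += (math.sqrt(yp**2 + 1 - kx**2) - kx * yp) / (1 - kx**2)
    return total * dx / b

# Still water, straight across: the time is ell / b.
t_still = travel_time(lambda x: 0.0, lambda x: 0.0, ell=10.0, b=2.0)
# Constant current k = 0.6, horizontal path: ell / (b * sqrt(1 - k^2)).
t_hold = travel_time(lambda x: 0.0, lambda x: 0.6, ell=10.0, b=2.0)
```

With these inputs the two values are 5 and 6.25 up to rounding, which illustrates that holding a horizontal line against a current costs time compared with crossing in still water.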
Two problems. To illustrate the role and significance of boundary conditions we consider two different optimal steering problems.

[Figure 6.12. Panels (a) and (b), each with axes $x$ and $y$ and starting point $P$ $(0,0)$; panel (a) also shows the landing point $Q$ $(\ell, h)$.]
Fixed-fixed problem. We seek a path $y(x)$ and steering angle $\theta(x)$ that will minimize the travel time from a given point $P = (0, 0)$ on the left bank to a given point $Q = (\ell, h)$ on the right bank, as illustrated in Figure 6.12a. Equivalently, we seek minimizers for a functional $T : V \to \mathbb{R}$, where

$$T(y) = \frac{1}{b}\int_0^{\ell} L(x, y, y')\,dx, \qquad V = \{\, y \in C^2[0, \ell] \mid y(0) = 0,\ y(\ell) = h \,\}. \tag{6.106}$$

Here $L(x, y, y')$ is the integrand defined in (6.105). Every candidate for a minimizer must satisfy the Euler–Lagrange differential equation as usual, together with the two essential boundary conditions at $x = 0$ and $x = \ell$. Because $L(x, y, y')$ has no explicit dependence on $y$, the differential equation can be written in a reduced form, and its solution can be expressed in the form of an integral with two arbitrary constants. Consideration of the boundary conditions then leads to a simple equation for one of these constants, and a nonlinear equation for the other. Note that once an optimal path $y(x)$ is known, the corresponding steering angle $\theta(x)$ can be found as described earlier.

Fixed-free problem. We seek a path $y(x)$ and steering angle $\theta(x)$ that will minimize the travel time from a given point $P = (0, 0)$ on the left bank to any point on the right bank, as illustrated in Figure 6.12b. In other words, we seek the quickest route from $P$ to the other side of the channel, with no bias or preference on the landing point. In this case the landing point on the right bank is now free, and we simply want to reach the right bank as quickly as possible. Here we seek minimizers for a functional $T : W \to \mathbb{R}$, where

$$T(y) = \frac{1}{b}\int_0^{\ell} L(x, y, y')\,dx, \qquad W = \{\, y \in C^2[0, \ell] \mid y(0) = 0 \,\}. \tag{6.107}$$
As before, $L(x, y, y')$ is the integrand defined in (6.105). Every candidate for a minimizer must again satisfy the Euler–Lagrange differential equation, together with an essential boundary condition at the fixed end $x = 0$, and a natural boundary condition at the free end $x = \ell$. Similar to before, the general solution of the differential equation can be expressed in the form of an integral with two arbitrary constants. However, in contrast to before, the boundary conditions now lead to two simple, explicit equations for these constants. Once an optimal path $y(x)$ is known, the corresponding steering angle $\theta(x)$ can again be found.

Summary of results. Let a channel width $\ell$, a boat speed $b$, and a continuously differentiable water speed $w(x)$ be given, and assume the speed ratio satisfies the restriction $-1 < k(x) < 1$.

(1) The fixed-free problem for $T : W \to \mathbb{R}$ has a unique extremal, and it is an absolute minimizer.

(2) The fixed-fixed problem for $T : V \to \mathbb{R}$ either has a unique extremal, and it is an absolute minimizer, or no extremal for a given landing point $(\ell, h)$.

In the fixed-fixed problem, the possibility of no extremal may arise depending on the given data, which would indicate that no graph in $V$ would provide a quickest route from $P$ to $Q$. In this case we note that a quickest route may exist among a more general set of curves. For instance, if the water speed $w(x)$ is as shown, and $Q$ is far downstream from $P$, then a quickest route may conceivably take the form of a piecewise-defined curve, where one piece is a vertical line segment oriented with the flow along the middle of the channel where the water speed is the fastest. In contrast, when $Q$ is upstream from $P$ by a sufficient amount, then a quickest route could conceivably favor vertical line segments oriented opposite to the flow near one or both banks where the water speed is the slowest, and the route becomes less intuitive. Various details for the above two problems are explored in the Exercises.
6.10. Second-order problems

We consider the problem of finding local extrema for a functional $F : V \to \mathbb{R}$, where the set of functions is

$$V = \{\, y \in C^4[a, b] \mid y(a) = \alpha,\ y'(a) = \gamma,\ y(b) = \beta,\ y'(b) = \eta \,\} \quad \text{(or some or all of these conditions free)}, \tag{6.108}$$

the space of variations is

$$V_0 = \{\, h \in C^4[a, b] \mid h(a) = 0,\ h'(a) = 0,\ h(b) = 0,\ h'(b) = 0 \,\} \quad \text{(or some or all of these conditions free)}, \tag{6.109}$$

and the functional is

$$F(y) = \int_a^b L(x, y, y', y'')\,dx. \tag{6.110}$$
Here $[a, b]$ is a given interval, $\alpha$, $\beta$, $\gamma$, $\eta$ are given constants, and $L(x, y, y', y'')$ is a given integrand. Unless indicated otherwise, we assume that $L$ is three times continuously differentiable for all $x \in [a, b]$, $y \in \mathbb{R}$, $y' \in \mathbb{R}$ and $y'' \in \mathbb{R}$. In the case when some of the conditions in $V$ are free, the functional could be defined to include associated free-end terms, but that level of generality will not be pursued. The integrand $L$ is called the Lagrangian as before.

The above problem is said to be of second-order type, since the functional $F$ involves derivatives of at most second order. The continuity requirements for the functions in $V$, and for the integrand $L$, ensure that the functional $F$ is finite for each input. They also ensure that the general necessary conditions in Result 6.4.1 can be rewritten in a local, pointwise form involving only continuous quantities. These continuity requirements can be relaxed, but at the expense of more complicated statements. The following result outlines some implications of the general necessary conditions, when specialized to the above problem.

Result 6.10.1. Let $F : V \to \mathbb{R}$ be defined as in (6.108)–(6.110). If $y_* \in V$ is a local minimizer of $F$ in the $C^m$-norm for some $m$, then

$$\left[\frac{d}{d\varepsilon} F(y_* + \varepsilon h)\right]_{\varepsilon = 0} = 0, \qquad \forall h \in V_0, \tag{6.111}$$

and

$$\left[\frac{d^2}{d\varepsilon^2} F(y_* + \varepsilon h)\right]_{\varepsilon = 0} \ge 0, \qquad \forall h \in V_0. \tag{6.112}$$

Condition (6.111) implies that $y_*$ must satisfy

$$\begin{cases} \dfrac{\partial L}{\partial y} - \dfrac{d}{dx}\left[\dfrac{\partial L}{\partial y'}\right] + \dfrac{d^2}{dx^2}\left[\dfrac{\partial L}{\partial y''}\right] = 0, \quad a \le x \le b, \\[1ex] y(a) = \alpha,\ y'(a) = \gamma,\ y(b) = \beta,\ y'(b) = \eta, \\[1ex] \text{(or natural boundary conditions if any are free).} \end{cases} \tag{6.113}$$

Condition (6.112) implies that $y_*$ must also satisfy

$$\frac{\partial^2 L}{\partial y'' \partial y''}(x, y, y', y'') \ge 0, \qquad a \le x \le b. \tag{6.114}$$

For a local maximizer, change $\ge$ to $\le$ in conditions (6.112) and (6.114).
The equations in (6.113) provide a boundary-value problem that every local extremum must satisfy; they are called the Euler–Lagrange equations as before. The differential equation in this boundary-value problem is at most fourth-order, and may be linear or nonlinear. The boundary conditions appearing in (6.113) are essential in the sense that they are explicitly specified in the set $V$. If any of these conditions is removed from $V$, then an associated natural boundary condition would appear in (6.113). The inequality in (6.114) is a further condition that must be satisfied; it is called the Legendre condition as before, and can be used to partially classify an extremum.

The conditions outlined above are necessary, but not sufficient. Thus these conditions can only be used to find candidates for extrema, and a separate analysis would be required to determine which candidates, if any, are actual extrema. As noted before, these conditions are still not sufficient even if the inequalities in (6.112) and (6.114) are made strict. The boundary-value problem in (6.113) may have one, none, or multiple solutions; any solution, and hence a candidate, is called an extremal. The natural boundary conditions that may arise in (6.113) are summarized below.

Result 6.10.2. Let $F : V \to \mathbb{R}$ be defined as in (6.108)–(6.110). If any essential boundary condition is removed, then local extrema must satisfy a corresponding natural boundary condition.

(1) If $y(a)$ is free, the natural condition is $\left[\frac{\partial L}{\partial y'} - \frac{d}{dx}\left[\frac{\partial L}{\partial y''}\right]\right]_{x=a} = 0$.

(2) If $y'(a)$ is free, the natural condition is $\left[\frac{\partial L}{\partial y''}\right]_{x=a} = 0$.

(3) If $y(b)$ is free, the natural condition is $\left[\frac{\partial L}{\partial y'} - \frac{d}{dx}\left[\frac{\partial L}{\partial y''}\right]\right]_{x=b} = 0$.

(4) If $y'(b)$ is free, the natural condition is $\left[\frac{\partial L}{\partial y''}\right]_{x=b} = 0$.

Sketch of proof: Results 6.10.1 and 6.10.2. Here we show how (6.111) implies (6.113), including the various natural boundary conditions that can arise. To begin, consider any fixed $y \in V$ and $h \in V_0$. From the definition of $F$ in (6.110) we have

$$F(y + \varepsilon h) = \int_a^b L(x,\ y + \varepsilon h,\ y' + \varepsilon h',\ y'' + \varepsilon h'')\,dx. \tag{6.115}$$
Just as in the previous cases, we differentiate with respect to $\varepsilon$, take the derivative inside the integral and use the chain rule, and then set $\varepsilon = 0$. The resulting expression for the first variation is

$$\delta F(y, h) = \int_a^b p h + q h' + r h''\,dx, \tag{6.116}$$

where for brevity we use the notation $p = \frac{\partial L}{\partial y}(x, y, y', y'')$, $q = \frac{\partial L}{\partial y'}(x, y, y', y'')$ and $r = \frac{\partial L}{\partial y''}(x, y, y', y'')$. As before, we next write the above expression in a more useful form using the integration-by-parts formula $\int u\,dv = uv - \int v\,du$. Specifically, applying this formula to the term $\int q h'\,dx$, with $u = q$ and $dv = h'\,dx$, and also to the term $\int r h''\,dx$, with $u = r$ and $dv = h''\,dx$, we get

$$\delta F(y, h) = \int_a^b p h - q' h - r' h'\,dx + \big[q h\big]_{x=a}^{x=b} + \big[r h'\big]_{x=a}^{x=b}. \tag{6.117}$$
We again apply the integration-by-parts formula to the term $\int -r' h'\,dx$, with $u = -r'$ and $dv = h'\,dx$, and we get, after collecting the boundary terms,

$$\delta F(y, h) = \int_a^b (p - q' + r'')\,h\,dx + \big[(q - r')\,h\big]_{x=a}^{x=b} + \big[r h'\big]_{x=a}^{x=b}. \tag{6.118}$$

In the above, note that $q'$ means $\frac{dq}{dx}$, or equivalently $\frac{d}{dx}\left[\frac{\partial L}{\partial y'}\right]$, and so on. Also, $\big[r h'\big]_{x=a}^{x=b} = \big[r h'\big]_{x=b} - \big[r h'\big]_{x=a}$, and so on.
We now observe that, if $y_* \in V$ is a local extremum, then the condition in (6.111) requires

$$\delta F(y_*, h) = 0, \qquad \forall h \in V_0. \tag{6.119}$$

Regardless of whether any essential boundary conditions are specified, the space $V_0$ will always contain functions that vanish at the ends. Thus a special case of the above condition is

$$\int_a^b (p - q' + r'')\,h\,dx = 0, \qquad \forall h \in C^4[a, b], \quad h(a) = 0,\ h'(a) = 0,\ h(b) = 0,\ h'(b) = 0. \tag{6.120}$$

By the fundamental lemma, due to the continuity of $p$, $q'$, and $r''$, this condition will hold if and only if $p - q' + r'' = 0$ for all $x \in [a, b]$. This is the Euler–Lagrange differential equation in (6.113). Substituting this information into (6.119), we then find that

$$(q - r')(b)\,h(b) - (q - r')(a)\,h(a) + r(b)\,h'(b) - r(a)\,h'(a) = 0, \qquad \forall h \in V_0. \tag{6.121}$$

The above expression is trivial when all the essential boundary conditions are specified. However, when any of $h(a)$, $h'(a)$, $h(b)$, or $h'(b)$ is free, then the corresponding coefficient must vanish in order for the above condition to hold, and this gives the corresponding natural boundary condition.

Example 6.10.1. Consider the set $V = \{\, y \in C^4[0, 1] \mid y(0) = 1,\ y(1) = 0,\ y'(1) = 0 \,\}$, and functional $F(y) = \int_0^1 (y'')^2 + y y' + (y')^2 - y\,dx$. Here we find all extremals, partially classify them with the sign condition, and then determine if they are actual local extrema using the definition.
Extremals. Note that $[a, b] = [0, 1]$ and that $y'(a)$ is free, since it is not specified in $V$. The Lagrangian is $L(x, y, y', y'') = (y'')^2 + y y' + (y')^2 - y$, and its partial derivatives are $\partial L/\partial y = y' - 1$, $\partial L/\partial y' = y + 2y'$ and $\partial L/\partial y'' = 2y''$. The differential equation to consider is $\frac{\partial L}{\partial y} - \frac{d}{dx}\left[\frac{\partial L}{\partial y'}\right] + \frac{d^2}{dx^2}\left[\frac{\partial L}{\partial y''}\right] = 0$, which becomes $y' - 1 - \frac{d}{dx}[y + 2y'] + \frac{d^2}{dx^2}[2y''] = 0$, or equivalently $-1 - 2y'' + 2y'''' = 0$. The natural boundary condition associated with $y'(a)$ being free is $\left[\frac{\partial L}{\partial y''}\right]_{x=a} = 0$, which becomes $[2y'']_{x=0} = 0$, or equivalently $y''(0) = 0$. In view of the essential boundary conditions in $V$, the boundary-value problem for an extremal is

$$y'''' - y'' = \frac{1}{2}, \qquad y(0) = 1,\ y(1) = 0,\ y'(1) = 0,\ y''(0) = 0, \qquad 0 \le x \le 1. \tag{6.122}$$

The differential equation can be solved using standard methods for linear, inhomogeneous equations, and the general solution is $y = C_1 + C_2 x + C_3 e^{x} + C_4 e^{-x} - \frac{1}{4}x^2$, where $C_1, \ldots, C_4$ are arbitrary constants. Applying the boundary conditions, we find unique values for these constants, and we obtain a unique extremal $y_*$. This is the only candidate for a local extremum.
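The constants in the general solution can be found by solving a 4-by-4 linear system built from the four boundary conditions in (6.122). The following is a minimal sketch rather than the method of the text: the helper `solve_linear` and the name `y_star` are hypothetical, and plain Gaussian elimination is used only to keep the example self-contained.

```python
import math

def solve_linear(A, rhs):
    """Solve A c = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda i: abs(M[i][col]))
        M[col], M[piv] = M[piv], M[col]
        for i in range(col + 1, n):
            f = M[i][col] / M[col][col]
            for j in range(col, n + 1):
                M[i][j] -= f * M[col][j]
    c = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * c[j] for j in range(i + 1, n))
        c[i] = (M[i][n] - s) / M[i][i]
    return c

e = math.e
# Unknowns (C1, C2, C3, C4) in y = C1 + C2 x + C3 e^x + C4 e^-x - x^2/4.
# The contribution of -x^2/4 is moved to the right-hand side.
A = [
    [1.0, 0.0, 1.0, 1.0],      # y(0)   = C1 + C3 + C4               = 1
    [1.0, 1.0, e, 1.0 / e],    # y(1)   = C1 + C2 + C3 e + C4/e - 1/4 = 0
    [0.0, 1.0, e, -1.0 / e],   # y'(1)  = C2 + C3 e - C4/e - 1/2      = 0
    [0.0, 0.0, 1.0, 1.0],      # y''(0) = C3 + C4 - 1/2               = 0
]
rhs = [1.0, 0.25, 0.5, 0.5]
C1, C2, C3, C4 = solve_linear(A, rhs)

def y_star(x):
    """The extremal of Example 6.10.1 with the computed constants."""
    return C1 + C2 * x + C3 * math.exp(x) + C4 * math.exp(-x) - 0.25 * x * x
```

Solving the same system by hand gives $C_1 = \frac{1}{2}$, $C_4 = -\frac{3e}{8}$, $C_3 = \frac{1}{2} + \frac{3e}{8}$ and $C_2 = \frac{1}{8} - \frac{e}{2} - \frac{3e^2}{8}$, which the computed values should reproduce to rounding error.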
Sign condition. Since $\frac{\partial L}{\partial y''}(x, y, y', y'') = 2y''$, we get $\frac{\partial^2 L}{\partial y'' \partial y''}(x, y, y', y'') = 2$. Here the required second partial is a constant, but more generally it may depend on $x$, $y$, $y'$, and $y''$. Substituting the extremal $y_*$ and its derivatives into this expression, we get the constant function $\frac{\partial^2 L}{\partial y'' \partial y''}(x, y_*, y_*', y_*'') \equiv 2$ for $x \in [0, 1]$. The fact that this expression is $\ge 0$ for all $x \in [0, 1]$ informs us that $y_*$ could be a local minimizer, but not a local maximizer.

Analysis of candidate. To determine if $y_*$ is a local minimizer in the $C^m$-norm for some $m$, we attempt to verify the definition of a minimizer. Similar to previous cases, let $m \le 4$ and $\delta > 0$ be given, and consider any $u$ in the neighborhood $N_{C^m}(y_*, \delta)$, and let $h = u - y_*$. Note that $h$ is in $V_0$, since it is the difference of two functions in $V$. Through a tedious, but straightforward analysis we obtain the result that

$$F(u) - F(y_*) \ge 0, \qquad \forall u \in N_{C^m}(y_*, \delta). \tag{6.123}$$
Thus we conclude that $y_*$ is a local minimizer for any $m \le 4$ and $\delta > 0$; in fact, it is an absolute minimizer.
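One way to fill in the analysis behind (6.123) is sketched below. It is a reconstruction, not necessarily the argument intended in the text. It uses only the fact that the Lagrangian of Example 6.10.1 is a quadratic polynomial in $y$, $y'$, $y''$, that $\delta F(y_*, h) = 0$ for every $h \in V_0$, and that $h(0) = h(1) = 0$ for such $h$.

```latex
\begin{align*}
F(y_* + h) - F(y_*)
  &= \delta F(y_*, h) + \int_0^1 (h'')^2 + h h' + (h')^2 \, dx \\
  &= \int_0^1 (h'')^2 + (h')^2 \, dx + \tfrac{1}{2}\big[ h^2 \big]_{x=0}^{x=1} \\
  &= \int_0^1 (h'')^2 + (h')^2 \, dx \;\ge\; 0 .
\end{align*}
```

Since no restriction on the size of $h$ is used, the inequality holds for every $u = y_* + h$ in $V$, which is consistent with $y_*$ being an absolute minimizer.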
6.11. Case study

Setup. To illustrate an application involving a second-order problem, and the role of different boundary conditions, we study a problem in the optimal acceleration control of a car. We consider driving between two given points, $P$ and $Q$, along a road in a fixed time interval $[0, T]$ as shown in Figure 6.13. We suppose that the motion occurs in a plane and that the car is always in contact with the road. We represent the car as a particle of mass $m$ and the road as a curve. The arclength (distance) coordinate along the road is denoted by $s$, which is chosen so that $s = 0$ corresponds to point $P$, and $s = \ell$ corresponds to point $Q$. Since we consider motion in a fixed time interval, we have $s = 0$ when $t = 0$, and $s = \ell$ when $t = T$. The road may contain topographical features such as hills and valleys, as described by an inclination angle $\theta(s)$ with respect to a horizontal axis, which is a given function of arclength.
[Figure 6.13. Road from $P$ to $Q$ in the $x$-$y$ plane, showing the car at arclength $s$, the inclination angle $\theta(s)$, and the direction of gravity $g$.]
The motion of the car is described by a function $s(t)$ which gives its position versus time. Throughout the motion we suppose that the acceleration of the car can be influenced by a driver as desired. The driver can add positive acceleration by pressing on a gas pedal, or negative acceleration by pressing on a brake pedal, where the magnitude of the acceleration can vary according to the amount of pressure on the pedal. The driver could also choose to add zero acceleration and simply allow the car to coast. We denote the acceleration input from the driver by a function $u(t)$. In addition to this input, we assume that the car is subject to gravity, which is vertically downward with gravitational acceleration $g$. We also consider an air resistance force, which is proportional and directly opposed to the velocity of the car, with constant of proportionality $\eta$, and for convenience we introduce the parameter $\mu = \eta/m$.

In general, a driver will need to use the gas and brake pedals in order to travel from $P$ to $Q$. Here we seek a driving input or control $u(t)$ that will minimize the total pedal usage throughout the travel under different boundary conditions. In other words, we consider the question of how to achieve the trip while using the gas and brake pedals as little as possible. We assume that $m$, $g$, $\ell$, and $T$ are positive constants, $\eta$ and $\mu$ are nonnegative constants, and that $\theta(s)$ and $u(t)$ are twice continuously differentiable functions with $-\frac{\pi}{2} < \theta(s) < \frac{\pi}{2}$.

Description of road. We suppose that the road is described by a planar curve $(x, y)(s)$, where $x$ and $y$ are cartesian coordinates as shown with some fixed origin, and $s$ is an arclength parameter as described above. The unit tangent vector along the road is $\vec{\tau}(s) = x'(s)\,\vec{i} + y'(s)\,\vec{j}$, where a prime denotes a derivative with respect to $s$, and $\vec{i}$ and $\vec{j}$ are the standard unit vectors along the positive $x$ and $y$ directions.
By definition of the inclination angle, we have $\vec{\tau}(s) = \cos\theta(s)\,\vec{i} + \sin\theta(s)\,\vec{j}$. Note that $\theta(s)$ can be determined from $(x, y)(s)$ using the relation $\tan\theta(s) = y'(s)/x'(s)$, and consistent with the assumption $-\frac{\pi}{2} < \theta(s) < \frac{\pi}{2}$, we assume $x'(s) > 0$. We also consider a unit normal vector defined by $\vec{n}(s) = -\sin\theta(s)\,\vec{i} + \cos\theta(s)\,\vec{j}$, in the upward direction from soil to air. In addition to being orthonormal at each point along the road, these vectors satisfy the relation $\vec{\tau}\,'(s) = \kappa(s)\,\vec{n}(s)$, where $\kappa(s) = \theta'(s)$ is the (signed) curvature of the road at $s$. Note that the road is concave up in regions where $\kappa(s) > 0$, concave down where $\kappa(s) < 0$, and straight where $\kappa(s) = 0$.

Motion of car. At any time $t \in [0, T]$, the car has an arclength position $s(t)$ along the road, and a position vector $\vec{r}(s(t)) = x(s(t))\,\vec{i} + y(s(t))\,\vec{j}$ in the plane. Using the chain rule, we find that the velocity and acceleration of the car are given by $\dot{\vec{r}} = \dot{s}\,\vec{\tau}$ and $\ddot{\vec{r}} = \ddot{s}\,\vec{\tau} + \kappa\,\dot{s}^2\,\vec{n}$, where a dot denotes a derivative with respect to time $t$. We assume that the car is subject to a force $\vec{F}_d = m u\,\vec{\tau}$ due to driver input, a force $\vec{F}_a = -\eta\,\dot{\vec{r}}$ due to air resistance, a force $\vec{F}_g = -m g\,\vec{j}$ due to gravity, and a force $\vec{F}_c = \lambda\,\vec{n}$ due to contact with the road. In any motion, the road can only push up ($\lambda > 0$) to support the car, and cannot pull down ($\lambda < 0$) or effectively be absent ($\lambda = 0$).

According to Newton's law, the motion of the car must satisfy the equation $m\,\ddot{\vec{r}} = \vec{F}_d + \vec{F}_a + \vec{F}_g + \vec{F}_c$. By taking the dot product with $\vec{\tau}$, and dividing by $m$ and using the relation $\mu = \eta/m$, we find that the tangential components of acceleration and force must satisfy

$$\ddot{s} = u - \mu\,\dot{s} - g\sin\theta(s). \tag{6.124}$$
The above equation implies that, if the inclination angle $\theta(s)$ and control input $u(t)$ are given, then the car position $s(t)$ can be found. By taking the dot product of the original equation with $\vec{n}$, we find that the normal components of acceleration and force must satisfy $\lambda = m\,\kappa\,\dot{s}^2 + m g\cos\theta$. Note that this equation leads to a condition for contact. Specifically, contact will exist or not exist depending on whether $\lambda > 0$ or $\lambda \le 0$. For instance, along regions where the road is concave down, so that $\kappa < 0$, contact will be lost if the velocity $\dot{s}$ is too large; the car (and its driver!) will feel weightless as they separate from the road. In contrast, along regions where the road is concave up or straight, so that $\kappa \ge 0$, contact will exist for any velocity. The contact force between car and road (and driver and seat) will noticeably increase through concave up regions at higher velocities.

For our purposes, it will be convenient to invert the relation in (6.124) and express $u(t)$ in terms of $s(t)$. Specifically, after rearranging, we obtain

$$u = \ddot{s} + \mu\,\dot{s} + g\sin\theta(s). \tag{6.125}$$
The above relation shows that, if the inclination angle $\theta(s)$ and motion $s(t)$ are known, then the corresponding control function $u(t)$ required to produce the motion can be found.

Gas and brake usage. As a measure of the total gas and brake pedal usage in the time interval $[0, T]$ we consider the functional

$$F = \int_0^T u^2(t)\,dt. \tag{6.126}$$
Note that larger values of the driver input $u(t)$, over longer periods of time, will give larger positive values of $F$. Also, a zero value of $u(t)$, over the entire period of time, will give a zero value of $F$. And in all cases we have $F \ge 0$. The above is called a cost functional for the problem: it assigns a quantitative "cost" to any given driver input $u(t)$. In a game where the objective is to use the pedals as little as possible, a high score in the game would correspond to a low value of the cost, and a low score would correspond to a high value of the cost. Thus cost can be understood as the inverse of a score.

Note that the definition of the cost functional is subjective just as the scoring rules of any game are subjective. The simple example above is for purposes of illustration only. The integrand need not depend on the input in a uniform, sign-independent way. More generally, the integrand could depend on the sign of $u(t)$, for instance to reflect different contributions to cost associated with fuel consumption or rate of wear on the vehicle. Also, the integrand could include contributions that explicitly depend on time $t$ and the motion $s(t)$.

As the car moves along the road, the input $u(t)$ and motion $s(t)$ are directly related, and the cost functional $F$ can be expressed in terms of either. Since boundary conditions will be specified on the motion rather than the control, we treat $s(t)$ as the primary variable rather than $u(t)$. Using (6.125) and (6.126), we obtain an expression for the cost assigned to any driver input in terms of the motion, namely

$$F = \int_0^T \big(\ddot{s} + \mu\,\dot{s} + g\sin\theta(s)\big)^2\,dt. \tag{6.127}$$
Note that the above functional is of second-order type in the motion $s(t)$ since it involves the first and second derivatives $\dot{s}(t)$ and $\ddot{s}(t)$.

Two problems. We consider two different types of optimal driving problems corresponding to different boundary conditions.

Problem 1. Consider a trip in which the car starts from rest at point $P$ at time $t = 0$, and must come to a stop at point $Q$ at time $t = T$. We seek a driving control function $u(t)$ that will minimize the total gas and brake pedal usage while accomplishing this trip. Equivalently, working with the motion $s(t)$, we seek a minimizer for the cost functional $F : V \to \mathbb{R}$, where

$$F(s) = \int_0^T L(t, s, \dot{s}, \ddot{s})\,dt, \qquad V = \{\, s \in C^4[0, T] \mid s(0) = 0,\ \dot{s}(0) = 0,\ s(T) = \ell,\ \dot{s}(T) = 0 \,\}. \tag{6.128}$$

Here $L(t, s, \dot{s}, \ddot{s})$ is the integrand defined in (6.127). Every candidate for a minimizer must satisfy the Euler–Lagrange differential equation, together with four essential boundary conditions at $t = 0$ and $t = T$. When the road has a nontrivial topography defined by an arbitrary inclination angle $\theta(s)$, the Euler–Lagrange equation will be nonlinear and numerical procedures will generally be required to find extremals. In contrast, when the road has a trivial topography, for instance when it is straight and flat so that $\theta(s) \equiv 0$, the equation will be linear and extremals can be found by the usual solution techniques. Once an optimal motion $s(t)$ is known, the corresponding control $u(t)$ can be found using (6.125), and the contact condition could be checked to assess the feasibility of the motion. This problem is explored in the Exercises.

Problem 2. Consider a trip in which the car again starts from rest at point $P$ at time $t = 0$, and must reach or cross point $Q$ at time $t = T$, but where the velocity at point $Q$ is allowed to be free. We seek a driving control function $u(t)$ that will minimize the total gas and brake pedal usage while accomplishing this different version of the trip. Equivalently, working with the motion $s(t)$, we seek a minimizer for the cost functional $F : W \to \mathbb{R}$, where

$$F(s) = \int_0^T L(t, s, \dot{s}, \ddot{s})\,dt, \qquad W = \{\, s \in C^4[0, T] \mid s(0) = 0,\ \dot{s}(0) = 0,\ s(T) = \ell \,\}. \tag{6.129}$$
Here, as before, $L(t, s, \dot{s}, \ddot{s})$ is the integrand defined in (6.127). Every candidate for a minimizer must again satisfy the Euler–Lagrange differential equation, together with three essential boundary conditions at $t = 0$ and $t = T$, along with an appropriate natural boundary condition at $t = T$. Remarks similar to those above can also be made here, and we note that any optimal motion $s(t)$ and corresponding control $u(t)$ will generally be different from before. This problem is also explored in the Exercises.
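The relation between a motion, its control, and its cost can be illustrated numerically. The sketch below is not from the text: the function `control_and_cost`, its argument names, the default value `g=9.81` (SI units assumed), and the midpoint-rule quadrature are all choices made here. The cubic rest-to-rest motion is a hypothetical trial motion that satisfies the four essential boundary conditions of Problem 1.

```python
import math

def control_and_cost(s, sdot, sddot, theta, T, mu=0.0, g=9.81, n=4000):
    """Return (u, F): the control from (6.125) and the cost from (6.127).

    s, sdot, sddot: callables for the motion and its first two time derivatives
    theta:          callable for the inclination angle as a function of arclength
    The cost integral is approximated with the composite midpoint rule.
    """
    def u(t):
        return sddot(t) + mu * sdot(t) + g * math.sin(theta(s(t)))
    dt = T / n
    F = sum(u((i + 0.5) * dt) ** 2 for i in range(n)) * dt
    return u, F

# Hypothetical data: flat road, no air resistance, cubic rest-to-rest motion.
ell, T = 100.0, 10.0
s = lambda t: ell * (3 * (t / T) ** 2 - 2 * (t / T) ** 3)
sdot = lambda t: ell * (6 * (t / T) - 6 * (t / T) ** 2) / T
sddot = lambda t: ell * (6 - 12 * (t / T)) / T ** 2
u, F = control_and_cost(s, sdot, sddot, theta=lambda arc: 0.0, T=T)
```

In this special case the integrand of (6.127) reduces to $\ddot{s}^2$, the Euler–Lagrange equation reduces to $s'''' = 0$, and the cubic above is its solution under the boundary conditions of Problem 1; the exact cost is $12\ell^2/T^3 = 120$, which the quadrature reproduces closely. For a hilly road or nonzero $\mu$ the same function can be used to compare candidate motions, but the cubic is then only a trial motion.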
6.12. Constraints

Given two functionals $F, G : V \to \mathbb{R}$, and a constant $c \in \mathbb{R}$, we consider the problem of finding local extrema of $F(y)$ subject to the constraint $G(y) = c$. We suppose that the set of functions is

$$V = \{\, y \in C^2[a, b] \mid y(a) = \alpha,\ y(b) = \beta \,\} \quad \text{(or one or both free)}, \tag{6.130}$$

the space of variations is

$$V_0 = \{\, h \in C^2[a, b] \mid h(a) = 0,\ h(b) = 0 \,\} \quad \text{(or one or both free)}, \tag{6.131}$$

and the functionals are

$$F(y) = \int_a^b L(x, y, y')\,dx, \qquad G(y) = \int_a^b M(x, y, y')\,dx. \tag{6.132}$$
Here $[a, b]$ is a given interval, $\alpha$, $\beta$ are given constants, and $L(x, y, y')$ and $M(x, y, y')$ are given integrands. Unless indicated otherwise, we assume that $L$ and $M$ are twice continuously differentiable for all $x \in [a, b]$, $y \in \mathbb{R}$ and $y' \in \mathbb{R}$. In the case when some of the conditions in $V$ are free, the functionals could be defined to include associated free-end terms, but that level of generality will not be considered.

The above problem is called an isoperimetric problem; it arises in various classic applications in geometry, where the integral constraint $G(y) = c$ represents a fixed arclength or perimeter. There are many variants that could be considered, which may involve multiple constraints of the integral type, higher-order functionals, and also constraints of a local or pointwise type. Here we consider only the version outlined above. As before, the continuity requirements for the functions in $V$, and for the integrands $L$ and $M$, ensure that the functionals $F$ and $G$ are finite for each input. They also ensure that the general necessary condition outlined below can be rewritten in a local, pointwise form involving only continuous quantities. These continuity requirements can be relaxed, but at the expense of more complicated statements.

Whereas all previous results have exploited facts from single-variable calculus, the following result exploits a fact from two-variable calculus, regarding the local extrema of functions subject to constraints. Thus, instead of a one-parameter family of variations, we now consider a two-parameter family as described next.

Result 6.12.1. Let $F, G : V \to \mathbb{R}$ be defined as in (6.130)–(6.132). Let $y_* \in V$ be given, and assume that it is not an extremal of $G(y)$, and consider any fixed $h_1 \in V_0$ such that $\delta G(y_*, h_1) \ne 0$. For arbitrary $h_2 \in V_0$ consider the functions $\hat{F}, \hat{G} : \mathbb{R}^2 \to \mathbb{R}$ defined by

$$\hat{F}(\varepsilon_1, \varepsilon_2) = F(y_* + \varepsilon_1 h_1 + \varepsilon_2 h_2), \qquad \hat{G}(\varepsilon_1, \varepsilon_2) = G(y_* + \varepsilon_1 h_1 + \varepsilon_2 h_2). \tag{6.133}$$

If $y_* \in V$ is a local extremum of $F(y)$ subject to $G(y) = c$ in the $C^m$-norm for some $m$, then $(0, 0) \in \mathbb{R}^2$ is a local extremum of $\hat{F}(\varepsilon_1, \varepsilon_2)$ subject to $\hat{G}(\varepsilon_1, \varepsilon_2) = c$. Thus a number $\lambda \in \mathbb{R}$ must exist such that

$$\begin{pmatrix} \dfrac{\partial \hat{F}}{\partial \varepsilon_1} \\[1.5ex] \dfrac{\partial \hat{F}}{\partial \varepsilon_2} \end{pmatrix}_{(0,0)} + \lambda \begin{pmatrix} \dfrac{\partial \hat{G}}{\partial \varepsilon_1} \\[1.5ex] \dfrac{\partial \hat{G}}{\partial \varepsilon_2} \end{pmatrix}_{(0,0)} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}. \tag{6.134}$$

The number $\lambda$ is independent of $h_1$ and $h_2$, and the above condition implies that $y_*$ must satisfy

$$\begin{cases} \left(\dfrac{\partial L}{\partial y} - \dfrac{d}{dx}\left[\dfrac{\partial L}{\partial y'}\right]\right) + \lambda\left(\dfrac{\partial M}{\partial y} - \dfrac{d}{dx}\left[\dfrac{\partial M}{\partial y'}\right]\right) = 0, \quad a \le x \le b, \\[1ex] G(y) = c, \quad \lambda \text{ constant}, \\[1ex] y(a) = \alpha,\ y(b) = \beta, \\[1ex] \text{(or natural boundary conditions if any are free).} \end{cases} \tag{6.135}$$
The condition in (6.134) is a result from two-variable differential calculus. This condition arises in the study of constrained optimization and is known as the Lagrange multiplier rule; the constant $\lambda$ is called the multiplier. The two-parameter family of variations in (6.133) is introduced for the sole purpose of exploiting this elegant result.

The equations in (6.135) provide a boundary-value problem that every local extremum must satisfy; they are called the Euler–Lagrange equations as before. The differential equation in this boundary-value problem is at most second-order, may be linear or nonlinear, and now contains the constant $\lambda$. This additional unknown is balanced by an additional equation, which is the constraint condition $G(y) = c$. The boundary conditions appearing in (6.135) are essential in the sense that they are explicitly specified in the set $V$. If any of these conditions is removed from $V$, then an associated natural boundary condition would appear.

The conditions outlined above are necessary, but not sufficient. Thus these conditions can only be used to find candidates for extrema, and a separate analysis would be required to determine which candidates, if any, are actual extrema. A corresponding Legendre-type condition to further classify candidates can also be derived. However, this condition is significantly involved and difficult to apply for constrained problems and is omitted for brevity. The boundary-value problem in (6.135) may have one, none, or multiple solutions; any solution, and hence a candidate, is called an extremal as before. The natural boundary conditions that may arise in (6.135) are summarized below.
Result 6.12.2. Let $F, G : V \to \mathbb{R}$ be defined as in (6.130)–(6.132), and let $\lambda \in \mathbb{R}$ be the multiplier from (6.135). If any essential boundary condition is removed, then local extrema must satisfy a corresponding natural boundary condition.

(1) If $y(a)$ is free, the natural condition is $\left[\frac{\partial L}{\partial y'} + \lambda\frac{\partial M}{\partial y'}\right]_{x=a} = 0$.

(2) If $y(b)$ is free, the natural condition is $\left[\frac{\partial L}{\partial y'} + \lambda\frac{\partial M}{\partial y'}\right]_{x=b} = 0$.

Sketch of proof: Results 6.12.1 and 6.12.2. Here we show how (6.134) implies (6.135), including the two natural boundary conditions that can arise. To begin, let $y \in V$ and $h_1, h_2 \in V_0$ be arbitrary. From the definitions of $\hat{F}$ and $\hat{G}$ in (6.133) we have

$$\begin{aligned} \hat{F}(\varepsilon_1, \varepsilon_2) &= \int_a^b L(x,\ y + \varepsilon_1 h_1 + \varepsilon_2 h_2,\ y' + \varepsilon_1 h_1' + \varepsilon_2 h_2')\,dx, \\ \hat{G}(\varepsilon_1, \varepsilon_2) &= \int_a^b M(x,\ y + \varepsilon_1 h_1 + \varepsilon_2 h_2,\ y' + \varepsilon_1 h_1' + \varepsilon_2 h_2')\,dx. \end{aligned} \tag{6.136}$$

For each of $i = 1$ and $i = 2$, we differentiate with respect to $\varepsilon_i$, take the derivative inside the integral and use the chain rule, and then set $(\varepsilon_1, \varepsilon_2) = (0, 0)$. The resulting expressions are
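$$\frac{\partial \hat{F}}{\partial \varepsilon_i}\bigg|_{(0,0)} = \int_a^b p_L h_i + q_L h_i'\,dx, \qquad \frac{\partial \hat{G}}{\partial \varepsilon_i}\bigg|_{(0,0)} = \int_a^b p_M h_i + q_M h_i'\,dx, \tag{6.137}$$

where for brevity we use the notation $p_L = \frac{\partial L}{\partial y}$, $q_L = \frac{\partial L}{\partial y'}$, $p_M = \frac{\partial M}{\partial y}$, and $q_M = \frac{\partial M}{\partial y'}$. As before, we next rewrite each expression in a more useful form using the integration-by-parts formula $\int u\,dv = uv - \int v\,du$. Specifically, applying this formula to the terms $\int q_L h_i'\,dx$ and $\int q_M h_i'\,dx$, we get

$$\frac{\partial \hat{F}}{\partial \varepsilon_i}\bigg|_{(0,0)} = \int_a^b r_L h_i\,dx + \big[q_L h_i\big]_{x=a}^{x=b}, \qquad \frac{\partial \hat{G}}{\partial \varepsilon_i}\bigg|_{(0,0)} = \int_a^b r_M h_i\,dx + \big[q_M h_i\big]_{x=a}^{x=b}, \tag{6.138}$$

where we use the notation $r_L = p_L - \frac{d}{dx} q_L$ and $r_M = p_M - \frac{d}{dx} q_M$. Note that, by definition of the first variations of $F$ and $G$, the above expressions can be succinctly written as $\frac{\partial \hat{F}}{\partial \varepsilon_i}\big|_{(0,0)} = \delta F(y, h_i)$ and $\frac{\partial \hat{G}}{\partial \varepsilon_i}\big|_{(0,0)} = \delta G(y, h_i)$.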
We now suppose that $y_*$ is an extremum of $F$ subject to $G = c$, so that $(0, 0)$ is an extremum of $\hat{F}$ subject to $\hat{G} = c$. Under the assumption that $y_*$ is not an extremal of $G$, we can fix $h_1$ such that $\delta G(y_*, h_1) \ne 0$, and let $h_2$ be arbitrary. This implies that $(0, 0)$ will not be a critical point of $\hat{G}$, and by the multiplier rule from two-variable calculus, a number $\lambda$ must exist such that

$$\begin{aligned} \frac{\partial \hat{F}}{\partial \varepsilon_1} + \lambda\frac{\partial \hat{G}}{\partial \varepsilon_1} &= \int_a^b (r_L + \lambda r_M)\,h_1\,dx + \big[(q_L + \lambda q_M)\,h_1\big]_a^b = 0, \\ \frac{\partial \hat{F}}{\partial \varepsilon_2} + \lambda\frac{\partial \hat{G}}{\partial \varepsilon_2} &= \int_a^b (r_L + \lambda r_M)\,h_2\,dx + \big[(q_L + \lambda q_M)\,h_2\big]_a^b = 0. \end{aligned} \tag{6.139}$$
The above two equations can be written as $\delta F(y_*, h_1) + \lambda\,\delta G(y_*, h_1) = 0$ and $\delta F(y_*, h_2) + \lambda\,\delta G(y_*, h_2) = 0$. For fixed $h_1$, but arbitrary $h_2$ such that $\delta G(y_*, h_2) \ne 0$, we find that $-\lambda = \frac{\delta F(y_*, h_1)}{\delta G(y_*, h_1)} = \frac{\delta F(y_*, h_2)}{\delta G(y_*, h_2)}$. Thus the number $\lambda$ is a constant whose value depends on $y_*$, but not on $h_1$ or $h_2$. Since the two equations in (6.139) have an identical form, and $h_2$ is arbitrary, they are equivalent to the single equation

$$\int_a^b w h\,dx + q(b)\,h(b) - q(a)\,h(a) = 0, \qquad \forall h \in V_0, \tag{6.140}$$

where $w = r_L + \lambda r_M$ and $q = q_L + \lambda q_M$. Regardless of whether any essential boundary conditions are specified, the space $V_0$ will always contain functions that vanish at the ends. Thus a special case of the above condition is

$$\int_a^b w h\,dx = 0, \qquad \forall h \in C^2[a, b], \quad h(a) = 0, \quad h(b) = 0. \tag{6.141}$$
By the fundamental lemma, due to the continuity of π€, this condition will hold when and only when π€ = 0 for all π₯ β [π, π]. This is the EulerβLagrange differential equation in (6.135). Substituting this information into (6.140), we then find that (6.142)
π(π)β(π) β π(π)β(π) = 0,
ββ β V0 .
The above expression is trivial when all the essential boundary conditions are specified. However, when either of $h(a)$ or $h(b)$ is free, or both, then the corresponding coefficient must vanish in order for the above condition to hold, and this gives the corresponding natural boundary condition.

Example 6.12.1. Consider the set $\mathcal{V} = \{y \in C^2[0,1] \mid y(0) = 0,\ y(1) = 0\}$, and functionals $F(y) = \int_0^1 (y')^2\,dx$ and $G(y) = \int_0^1 xy\,dx$. Here we find all extremals of $F$ subject to $G = 1$, and then determine if they are actual local extrema using the definition.

Extremals. The integrands are $L(x,y,y') = (y')^2$ and $M(x,y,y') = xy$, and their partial derivatives are $\partial L/\partial y = 0$, $\partial L/\partial y' = 2y'$, $\partial M/\partial y = x$, and $\partial M/\partial y' = 0$. The differential equation to consider is $\big(\frac{\partial L}{\partial y} - \frac{d}{dx}\big[\frac{\partial L}{\partial y'}\big]\big) + \lambda\big(\frac{\partial M}{\partial y} - \frac{d}{dx}\big[\frac{\partial M}{\partial y'}\big]\big) = 0$, which becomes $-2y'' + \lambda x = 0$, where $\lambda$ is an unknown constant. In view of the essential boundary conditions in $\mathcal{V}$, we get the equations

\[
y'' = \frac{\lambda}{2} x, \qquad y(0) = 0, \quad y(1) = 0, \qquad 0 \le x \le 1.
\tag{6.143}
\]

The differential equation can be explicitly integrated, and its general solution is $y = \frac{\lambda}{12} x^3 + Cx + D$, where $C$ and $D$ are unknown constants. Applying the boundary conditions, we find $D = 0$ and $C = -\frac{\lambda}{12}$, and the function becomes $y = \frac{\lambda}{12}(x^3 - x)$. The constraint condition $G = 1$ must also be satisfied, and this requires

\[
\int_0^1 xy\,dx = 1 \quad\text{which becomes}\quad \int_0^1 \frac{\lambda}{12}(x^4 - x^2)\,dx = 1.
\tag{6.144}
\]
The above equation implies $\lambda = -90$, and we obtain a unique extremal $y_* = -\frac{15}{2}(x^3 - x)$. This is the only candidate for a local extremum.

Analysis of candidate. To determine if $y_*$ is a local extremum in the $C^m$-norm for some $m$, we attempt to verify the definition of a minimizer or maximizer. Similar to previous cases, let $m \le 2$ and $\delta > 0$ be given, and consider any $u$ in the neighborhood $N_{C^m}(y_*, \delta)$ such that $G(u) = 1$. Moreover, let $h = u - y_*$ and note that $h$ is in $\mathcal{V}_0$, since it is the difference of two functions in $\mathcal{V}$. From the definition of $F$, and the fact that $u = y_* + h$, we have

\[
F(u) = \int_0^1 (y_*' + h')^2\,dx.
\tag{6.145}
\]

Expanding and grouping terms on the right-hand side, and again using the definition of $F$, we get

\[
F(u) = F(y_*) + \int_0^1 2y_*' h'\,dx + \int_0^1 (h')^2\,dx.
\tag{6.146}
\]

Using integration-by-parts on the term $\int 2y_*' h'\,dx$, and noting that $h(0) = 0$ and $h(1) = 0$ since $h \in \mathcal{V}_0$, we get

\[
F(u) = F(y_*) - \int_0^1 2y_*'' h\,dx + \int_0^1 (h')^2\,dx.
\tag{6.147}
\]

From the differential equation in (6.143), we note that $2y_*'' = \lambda x$ for all $x \in [0,1]$, and so $2y_*'' h = \lambda x h = \lambda x u - \lambda x y_*$ for all $x \in [0,1]$, where we have used the fact that $h = u - y_*$. Substituting this result into (6.147), and using the definition of $G$, we get

\[
\begin{aligned}
F(u) - F(y_*) &= -\int_0^1 \lambda x u\,dx + \int_0^1 \lambda x y_*\,dx + \int_0^1 (h')^2\,dx\\
&= -\lambda G(u) + \lambda G(y_*) + \int_0^1 (h')^2\,dx.
\end{aligned}
\tag{6.148}
\]

Since both $y_*$ and $u$ satisfy the constraint, that is, $G(y_*) = 1$ and $G(u) = 1$, the above terms with $\lambda$ cancel, and we obtain the result that

\[
F(u) - F(y_*) = \int_0^1 (h')^2\,dx \ge 0, \qquad \forall u \in N_{C^m}(y_*, \delta) \text{ with } G(u) = 1.
\tag{6.149}
\]
Thus we conclude that $y_*$ is a local minimizer subject to the constraint $G = 1$ for any $m \le 2$ and $\delta > 0$; in fact, it is an absolute minimizer subject to the constraint $G = 1$.

Example 6.12.2. In some cases, the unknown constant $\lambda$ can affect the form of the solution of the differential equation. For instance, for a differential equation such as

\[
y'' - \lambda y = 0,
\tag{6.150}
\]

the form of the solution depends on $\lambda$, specifically

\[
y = \begin{cases}
C_1 e^{\sqrt{\lambda}\,x} + C_2 e^{-\sqrt{\lambda}\,x}, & \text{if } \lambda > 0,\\
C_1 + C_2 x, & \text{if } \lambda = 0,\\
C_1 \cos(\beta x) + C_2 \sin(\beta x), & \text{if } \lambda < 0,
\end{cases}
\tag{6.151}
\]

where $\beta = \sqrt{|\lambda|}$. All possible solutions of the differential equation must be considered, and those that satisfy the boundary conditions and constraint would be the extremals.
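The computations in Example 6.12.1 can be sanity-checked numerically. The Python sketch below is only an illustration: the helper `simpson` and the perturbation $h(x) = \sin(\pi x) + 2\sin(2\pi x)$ are choices made here, not part of the text. This $h$ vanishes at both ends and satisfies $\int_0^1 x h\,dx = 0$, so $u = y_* + h$ is admissible with $G(u) = 1$, and the identity (6.149) predicts $F(u) - F(y_*) = \int_0^1 (h')^2\,dx$.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    step = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * step)
    return total * step / 3

lam = -90.0                                        # multiplier found from (6.144)
y_star = lambda x: lam / 12 * (x**3 - x)           # candidate extremal
dy_star = lambda x: lam / 12 * (3 * x**2 - 1)

# Illustrative perturbation (an assumption of this sketch): h(0) = h(1) = 0, G(h) = 0.
h = lambda x: math.sin(math.pi * x) + 2 * math.sin(2 * math.pi * x)
dh = lambda x: math.pi * math.cos(math.pi * x) + 4 * math.pi * math.cos(2 * math.pi * x)

G = lambda y: simpson(lambda x: x * y(x), 0.0, 1.0)      # constraint functional
F = lambda dy: simpson(lambda x: dy(x) ** 2, 0.0, 1.0)   # F depends only on y'

G_star = G(y_star)                                 # should be close to 1
G_h = G(h)                                         # should be close to 0
gap = F(lambda x: dy_star(x) + dh(x)) - F(dy_star) # F(u) - F(y*)
h_energy = F(dh)                                   # integral of (h')^2
```

Comparing `gap` with `h_energy` tests the cancellation of the $\lambda$ terms in (6.148) for this one perturbation; it does not replace the general argument.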
6.13. Case study

Setup. To illustrate the preceding results we study the problem of determining the equilibrium shape of a hanging chain. We consider a chain in two dimensions as shown in Figure 6.14, which is suspended above the ground by its endpoints, which are fixed atop two support columns. We assume that the chain is inextensible, but perfectly flexible, and has a positive total mass that is distributed uniformly along its length, and hence subject to gravity.

[Figure 6.14: a chain hanging between two supports at $x = -L$ and $x = L$, with vertical offset $2\alpha$ between the endpoints and gravity $g$ acting downward.]

We consider a coordinate system as shown, where the horizontal position of the origin is halfway between the fixed endpoints, and the vertical position is also halfway between these points. We suppose that the chain has length $\ell$, and mass per unit length $\rho$, and that gravitational acceleration $g$ is oriented vertically downwards. The horizontal distance between the endpoints is $2L$, and the coordinates of these points are $(-L, \alpha)$ and $(L, -\alpha)$, where $\alpha$ is an offset parameter that may be zero, positive, or negative. Thus the vertical distance between the endpoints is $2|\alpha|$. The shape or profile of the chain is described by a curve $y(x)$. According to a principle from physics, among all possible shapes of length $\ell$, the observed shape minimizes potential energy. We seek to characterize this shape for any given positive constants $\ell$, $\rho$, $g$, $L$, and offset parameter $\alpha$.

Length, energy functionals. Let $y(x)$, $x \in [-L, L]$, be an arbitrary profile curve which is twice continuously differentiable. To develop expressions for its length and potential energy, we consider a uniform partition of the curve into $n$ segments as illustrated in Figure 6.15. For $i = 1, \dots, n$, we let segment $i$ denote the piece of curve between nodes $(x_{i-1}, y_{i-1})$ and $(x_i, y_i)$, and let $\Delta x = x_i - x_{i-1} = \frac{2L}{n}$ denote the horizontal spacing between nodes, and let $\Delta y = y_i - y_{i-1}$ denote the vertical spacing.
Since we consider the limit $n \to \infty$, or equivalently $\Delta x \to 0$, we can approximate each segment as linear, and we note that its center of mass will coincide with its midpoint, which we denote by $(\bar x_i, \bar y_i)$.

[Figure 6.15: uniform partition of the profile curve over $[-L, L]$ with nodes $(x_0, y_0), \dots, (x_n, y_n)$; segment $i$ joins node $i-1$ to node $i$, with spacings $\Delta x$ and $\Delta y$.]

Each segment $i$ will have a length $G_i$, mass $m_i$, and gravitational potential energy $E_i$. The length of a segment is $G_i = (\Delta x^2 + \Delta y^2)^{1/2} = \big(1 + (\frac{\Delta y}{\Delta x})^2\big)^{1/2} \Delta x$. Similarly, the mass of a segment is $m_i = \rho G_i$, and the potential energy is $E_i = m_i g \bar y_i = \rho g \bar y_i \big(1 + (\frac{\Delta y}{\Delta x})^2\big)^{1/2} \Delta x$. By summing up the contributions from all the segments, and employing the limit definition of an integral, we obtain the length $G$ and energy $E$ for any given profile curve $y(x)$, $x \in [-L, L]$, namely

\[
G = \lim_{n\to\infty} \sum_{i=1}^n G_i = \int_{-L}^{L} \sqrt{1 + (y')^2}\,dx,
\qquad
E = \lim_{n\to\infty} \sum_{i=1}^n E_i = \int_{-L}^{L} \rho g y \sqrt{1 + (y')^2}\,dx.
\tag{6.152}
\]
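The limiting process behind (6.152) can be illustrated numerically. The Python sketch below sums the segment lengths and energies for a polygonal approximation and compares them with closed-form values of the integrals. The function name `chain_sums`, the test profile $y = \cosh x$ on $[-1, 1]$, and the values $\rho = g = 1$ are assumptions made here for illustration; for this profile the integrals evaluate to $2\sinh 1$ and $1 + \tfrac12\sinh 2$.

```python
import math

def chain_sums(y, L, n, rho=1.0, g=1.0):
    """Total length and potential energy of an n-segment polygonal approximation."""
    dx = 2 * L / n
    length = 0.0
    energy = 0.0
    for i in range(1, n + 1):
        x0, x1 = -L + (i - 1) * dx, -L + i * dx
        y0, y1 = y(x0), y(x1)
        seg = math.hypot(dx, y1 - y0)       # G_i = sqrt(dx^2 + dy^2)
        y_mid = 0.5 * (y0 + y1)             # height of the segment midpoint
        length += seg
        energy += rho * seg * g * y_mid     # E_i = m_i * g * ybar_i with m_i = rho * G_i
    return length, energy

length, energy = chain_sums(math.cosh, 1.0, 2000)
exact_length = 2 * math.sinh(1.0)            # integral of sqrt(1 + sinh^2) = cosh
exact_energy = 1.0 + 0.5 * math.sinh(2.0)    # integral of cosh^2 over [-1, 1]
```

Increasing `n` should bring the sums closer to the integrals, in line with the limit definition used above.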
Restated problem. To characterize the observed shape of a hanging chain, we consider a set of functions $\mathcal{V}$, and seek minimizers of $E$ subject to $G = \ell$, where

\[
E(y) = \int_{-L}^{L} \mathcal{L}(x, y, y')\,dx, \qquad
G(y) = \int_{-L}^{L} \mathcal{M}(x, y, y')\,dx, \qquad
\mathcal{V} = \{y \in C^2[-L, L] \mid y(-L) = \alpha,\ y(L) = -\alpha\}.
\tag{6.153}
\]

Here $\mathcal{L}(x, y, y')$ and $\mathcal{M}(x, y, y')$ are the integrands defined in (6.152). Every extremal must satisfy the Euler–Lagrange differential equation, together with the constraint and boundary conditions. Note that the general form of the differential equation can be written in the compact form $\frac{\partial \mathcal{N}}{\partial y} - \frac{d}{dx}\big[\frac{\partial \mathcal{N}}{\partial y'}\big] = 0$, where $\mathcal{N} = \mathcal{L} + \lambda \mathcal{M}$ is a combined integrand, and $\lambda$ is the multiplier for the constraint.

Reduced equation. Based on the expressions for $\mathcal{L}$ and $\mathcal{M}$, we note that the original form of the Euler–Lagrange equation will be tedious. However, since $\mathcal{N}$ is independent of $x$, we may instead consider the reduced form of the equation given in Result 6.6.1, which is $\mathcal{N} - y' \frac{\partial \mathcal{N}}{\partial y'} = A$, where $A$ is a constant. Using the expression for $\mathcal{N}$, we get, after some simplification,

\[
\rho g y + \lambda = A\,[1 + (y')^2]^{1/2}.
\tag{6.154}
\]
Inspection of the above equation reveals that there are two basic types of solutions corresponding to $A = 0$ and $A \neq 0$. The case $A = 0$ gives the trivial solution $y = -\frac{\lambda}{\rho g} \equiv$ constant, and this solution will not satisfy the constraint $G = \ell$, except in the special situation when $\ell = 2L$. To find the general, nontrivial solution of the equation, we assume that $A \neq 0$, and rearrange the equation to get

\[
(y')^2 = \frac{(\rho g y + \lambda)^2 - A^2}{A^2}.
\tag{6.155}
\]
Solution of reduced equation. As was done in Section 6.7, instead of a cartesian description $y(x)$ of a solution curve, we consider a parametric description $x = \varphi(s)$ and $y = \psi(s)$. Here $s$ is an arbitrary parameter along the curve, and $\varphi(s)$ and $\psi(s)$ are arbitrary functions. Similar to before, we can choose one of these functions to simplify the differential equation, and then solve for the other. To proceed, we substitute the calculus relation $\frac{dy}{dx} = \frac{dy}{ds} \big/ \frac{dx}{ds}$ into the differential equation in (6.155), and then rearrange terms to separate $\frac{dx}{ds}$ and $\frac{dy}{ds}$, which gives

\[
\Big(\frac{dx}{ds}\Big)^2 = \frac{A^2}{(\rho g y + \lambda)^2 - A^2} \Big(\frac{dy}{ds}\Big)^2.
\tag{6.156}
\]

We may now choose $y = \psi(s)$ to simplify the right-hand side of the equation. Any choice can be made, provided that it leads to a curve for which $\frac{dy}{dx}$ and $\frac{d^2y}{dx^2}$ are defined and continuous, which can be verified in the end. Motivated by the form of the quotient above, we let $\rho g y + \lambda = A\cosh(s)$. This corresponds to $y = -\frac{\lambda}{\rho g} + \frac{A}{\rho g}\cosh(s)$, which gives $\frac{dy}{ds} = \frac{A}{\rho g}\sinh(s)$, and using the identity $\cosh^2(s) - 1 = \sinh^2(s)$, we get

\[
\Big(\frac{dx}{ds}\Big)^2 = \Big(\frac{A}{\rho g}\Big)^2.
\tag{6.157}
\]

The above equation implies $\frac{dx}{ds} = \pm\frac{A}{\rho g}$. Note that, depending on $A$, we can always choose the sign to get $\frac{dx}{ds} > 0$, so that the curve will be oriented left to right. This equation can be explicitly integrated to obtain $x = \pm\frac{A}{\rho g} s + B$, where $B$ is a constant, which can be rearranged to get $s = \pm\frac{(x - B)\rho g}{A}$. Since $\cosh(s) = \cosh(-s)$, we note that the choice of sign does not affect $y$, and we get

\[
y = -\frac{\lambda}{\rho g} + \frac{A}{\rho g}\cosh\Big(\frac{\rho g x}{A} - \frac{\rho g B}{A}\Big),
\tag{6.158}
\]

where $\lambda$, $A \neq 0$ and $B$ are unknown constants. By straightforward substitutions, this function can be put into the cleaner form

\[
y = -\mu + \frac{L}{\eta}\cosh\Big(\frac{\eta x}{L} + \sigma\Big),
\tag{6.159}
\]

where $\mu$, $\eta \neq 0$ and $\sigma$ are unknown constants, into which $\rho$ and $g$ have been absorbed. The general curve defined above is called a catenary curve. A description of this curve can be found in many texts on elementary calculus.

Constraint, boundary conditions. Although the catenary given in (6.159) is the general, nontrivial solution of the Euler–Lagrange equation, we have not yet obtained an extremal. We must now consider if values of the unknown constants can be found to satisfy the constraint and boundary conditions. An analysis reveals that different interesting cases depending on $\ell$, $L$ and $\alpha$ arise, and that extremals exist in pairs. The symmetric case when $\alpha = 0$ is studied in the Exercises.
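To make the role of the constraint concrete, the following Python sketch treats only the symmetric case $\alpha = 0$. It assumes the catenary is written as $y = -\mu + \frac{L}{\eta}\cosh(\frac{\eta x}{L})$, where the labels $\mu$ and $\eta$ and the function name `solve_eta` are chosen here for illustration. Under that assumption the length of the curve over $[-L, L]$ is $\frac{2L}{\eta}\sinh\eta$, so the constraint reduces to the scalar equation $\sinh(\eta)/\eta = \ell/(2L)$, solved below by bisection for $\eta > 0$. The root $-\eta$ gives the mirror-image curve, consistent with extremals occurring in pairs.

```python
import math

def solve_eta(ell, L, tol=1e-12):
    """Solve sinh(eta)/eta = ell/(2L) for eta > 0 by bisection (requires ell > 2L)."""
    if ell <= 2 * L:
        raise ValueError("a nontrivial catenary needs ell > 2L")
    target = ell / (2 * L)
    f = lambda eta: math.sinh(eta) / eta - target
    lo, hi = 1e-9, 1.0
    while f(hi) < 0:              # sinh(eta)/eta increases for eta > 0, so expand until bracketed
        hi *= 2
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

L, ell = 1.0, 3.0                                  # illustrative values
eta = solve_eta(ell, L)
mu = (L / eta) * math.cosh(eta)                    # enforces y(-L) = y(L) = 0
y = lambda x: -mu + (L / eta) * math.cosh(eta * x / L)
length = 2 * L * math.sinh(eta) / eta              # closed-form length of this catenary
```

With these values the chain sags below the supports, so `y(0.0)` is negative while `y(L)` and `y(-L)` vanish up to rounding.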
6.14. A sufficient condition

Here we outline a simple sufficient condition for local extrema, to supplement the necessary conditions considered up to now. The condition is based on the idea of concavity, and is straightforward and explicit, but stringent. Although less stringent conditions are available, they are more difficult to use and not pursued here. We state a condition for only a certain type of first-order problem and note that similar statements hold for more general problems.

To state the result, we suppose that $y_* \in \mathcal{V}$ is a given extremal of a functional $F : \mathcal{V} \to \mathbb{R}$, where $\mathcal{V} \subset C^2[a,b]$ and $F(y) = \int_a^b L(x, y, y')\,dx$. The functions in $\mathcal{V}$ may satisfy fixed or free conditions at each end, but for simplicity we assume that $F$ has no free-end terms outside of the integral. Also, we suppose that $R = (c, d) \times (p, q)$ is a given rectangle that contains the range of the extremal and its derivative, in the sense that $c < y_*(x) < d$ and $p < y_*'(x) < q$ for all $x \in [a,b]$.

It will be convenient to use the notation $L(x, v, w)$ and consider the integrand as a function of three independent variables $x, v, w$. So if $L(x, y, y') = e^x y + (y')^2$, then $L(x, v, w) = e^x v + w^2$. For each fixed $x_0 \in [a,b]$ we consider the graph of $L(x_0, v, w)$ over the $v,w$-plane and call it the $L$-graph associated with $x_0$. Note that the $L$-graph is a surface, and the rectangle $R$ is an open set in its domain.

Result 6.14.1. [concavity theorem] Let $L(x, v, w)$ be given, and consider a functional $F : \mathcal{V} \to \mathbb{R}$, where $F(y) = \int_a^b L(x, y, y')\,dx$, and $\mathcal{V} \subset C^2[a,b]$ is a set of functions with fixed or free conditions at each end. Moreover, let $y_* \in \mathcal{V}$ be an extremal and let $R$ be an open rectangle that contains the range of $(y_*, y_*')$.

(1) If for every $x_0 \in [a,b]$ the $L$-graph is concave up over $R$, then $y_*$ is a local minimizer of $F$ in the $C^1$-norm.

(2) If for every $x_0 \in [a,b]$ the $L$-graph is concave up over the entire $v,w$-plane, then $y_*$ is an absolute minimizer of $F$.

For a maximizer, change concave up to concave down in (1) and (2).

Thus a given extremal $y_*$ is guaranteed to be an actual extremum when the integrand $L(x, v, w)$ satisfies a concavity property in $v, w$ for every $x$. The extremum is local or absolute depending on whether the concavity of the integrand is local or absolute (global). Note that these conditions are sufficient, but not necessary. That is, an extremal can be an extremum even when the above conditions do not hold. Note also that the extremal is assumed to exist.

The terms concave up and concave down refer to the usual concepts from elementary calculus. A function of one variable is called concave up (down) in an interval when its graph remains on or above (below) each tangent line in the interval. Similarly, a function of two variables is called concave up (down) in a region when its
graph remains on or above (below) each tangent plane in the region. For twice continuously differentiable functions as considered here, concavity is determined by the second derivatives. The term convexity is sometimes used in place of concavity.

[Figure 6.16: graph of a concave-up function $L(w)$ together with its tangent line $L_{\mathrm{tang}}(w)$ at $w_1$; a second point $w_2$ is marked on the $w$-axis.]
Sketch of proof: Result 6.14.1. We initially consider the case when $F(y) = \int_a^b L(y')\,dx$, where $L(w)$ is a given function of one variable, which we assume to be concave up for all $w \in \mathbb{R}$. Let $y_* \in \mathcal{V}$ be an extremal and let $u \in \mathcal{V}$ be arbitrary, and introduce $h = u - y_* \in \mathcal{V}_0$. For any given $x \in [a,b]$, let $w_1 = y_*'(x)$ and $w_2 = u'(x)$. Since the $L$-graph is concave up as illustrated in Figure 6.16, we have

\[
L(w_2) \ge L_{\mathrm{tang}}(w_2).
\tag{6.160}
\]

Here $L_{\mathrm{tang}}(w)$ is the function for the tangent line at $w_1$, namely, $L_{\mathrm{tang}}(w) = L(w_1) + \frac{dL}{dw}(w_1)(w - w_1)$. Substituting for $w_2$ and $w_1$ in (6.160), and using the notation $\frac{\partial L}{\partial y'}$ in place of $\frac{dL}{dw}$, we get

\[
L(u') \ge L(y_*') + \frac{\partial L}{\partial y'}(y_*')(u' - y_*').
\tag{6.161}
\]

The above inequality holds for every $x \in [a,b]$. Integrating, and using the fact that $h = u - y_*$, we obtain

\[
\int_a^b L(u')\,dx \ge \int_a^b L(y_*')\,dx + \int_a^b \frac{\partial L}{\partial y'}(y_*')\,h'\,dx.
\tag{6.162}
\]

The above expression says $F(u) \ge F(y_*) + \delta F(y_*, h)$. Since $y_*$ is an extremal, we have $\delta F(y_*, h) = 0$, and we get $F(u) \ge F(y_*)$ for arbitrary $u$, which implies that $y_*$ is an absolute minimizer. Note that the opposite conclusion would be obtained when $L(w)$ is concave down.
Consider now the case when $F(y) = \int_a^b L(x, y, y')\,dx$, where $L(x, v, w)$ is a given function. For every $x \in [a,b]$, we assume that this function is concave up for all $(v, w) \in R$, where $R$ is a fixed, open rectangle. Consider also a given extremal $y_* \in \mathcal{V}$, and suppose the range of $(y_*, y_*')$ is contained in $R$. Note that, since $R$ is open, there is a $\delta > 0$ such that if $u \in N_{C^1}(y_*, \delta)$, then the range of $(u, u')$ is also contained in $R$, and similar to before we introduce $h = u - y_* \in \mathcal{V}_0$. For any fixed $x \in [a,b]$, let $(v_1, w_1) = (y_*(x), y_*'(x))$ and $(v_2, w_2) = (u(x), u'(x))$. Since the $L$-graph is concave up in the region $R$, we have

\[
L(x, v_2, w_2) \ge L_{\mathrm{tang}}(x, v_2, w_2),
\tag{6.163}
\]

where $L_{\mathrm{tang}}(x, v, w)$ is the function for the tangent plane at $(v_1, w_1)$. Substituting for $(v_2, w_2)$ and $(v_1, w_1)$ in (6.163), using the usual formula for the tangent plane, and the notation $\frac{\partial L}{\partial y}$ and $\frac{\partial L}{\partial y'}$ in place of $\frac{\partial L}{\partial v}$ and $\frac{\partial L}{\partial w}$, we get

\[
L(x, u, u') \ge L(x, y_*, y_*') + \frac{\partial L}{\partial y}(x, y_*, y_*')(u - y_*) + \frac{\partial L}{\partial y'}(x, y_*, y_*')(u' - y_*').
\tag{6.164}
\]

The above inequality holds for every $x \in [a,b]$. Integrating, and using the fact that $h = u - y_*$, we obtain an expression of the same form as before, namely $F(u) \ge F(y_*) + \delta F(y_*, h)$. Since $y_*$ is an extremal, we have $\delta F(y_*, h) = 0$, and we get $F(u) \ge F(y_*)$ for arbitrary $u \in N_{C^1}(y_*, \delta)$, which implies that $y_*$ is a local minimizer. Note that the opposite conclusion would be obtained when $L(x, v, w)$ is concave down, and the local result would become absolute when $R$ is the entire $v,w$-plane.

Example 6.14.1. Consider $F : \mathcal{V} \to \mathbb{R}$, where $\mathcal{V} \subset C^2[0,1]$ is a set of functions with fixed or free conditions at each end.

(1) Let $F(y) = \int_0^1 e^{-x} y^2 + (1 + x^2)(y')^2\,dx$. For any fixed $x_0 \in [0,1]$, we have $L(x_0, v, w) = A_0 v^2 + B_0 w^2$, where $A_0 = e^{-x_0} > 0$ and $B_0 = 1 + x_0^2 > 0$. Here the $L$-graph is an elliptic paraboloid, which is concave up over the entire $v,w$-plane. By Result 6.14.1, if an extremal $y_* \in \mathcal{V}$ exists, then it must be an absolute minimizer.

(2) Let $F(y) = \int_0^1 (1 - c^2(x) + (y')^2)^{1/2} - c(x)\,y'\,dx$, where $-1 < c(x) < 1$ is a given coefficient function. For any fixed $x_0 \in [0,1]$, we have $L(x_0, v, w) = (1 - c_0^2 + w^2)^{1/2} - c_0 w$, where $c_0 = c(x_0)$. Here the $L$-graph is independent of $v$, and we note that $\frac{\partial^2 L}{\partial w^2} > 0$ for all $w$, so the graph is similar to a parabolic cylinder, which is concave up over the entire $v,w$-plane. By Result 6.14.1, if an extremal $y_* \in \mathcal{V}$ exists, then it must be an absolute minimizer.
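The concavity claims in Example 6.14.1 can be spot-checked with a numerical second-derivative test. The Python sketch below estimates the Hessian of $(v, w) \mapsto L(x_0, v, w)$ by central differences and tests positive semidefiniteness on a small grid. The helper names, the tolerance, the grid, and the particular coefficient `c(x) = 0.5*sin(x)` are all assumptions of this sketch; a grid check is evidence, not a proof.

```python
import math

def hessian(L, x0, v, w, d=1e-4):
    """Central-difference second derivatives of (v, w) -> L(x0, v, w)."""
    Lvv = (L(x0, v + d, w) - 2 * L(x0, v, w) + L(x0, v - d, w)) / d**2
    Lww = (L(x0, v, w + d) - 2 * L(x0, v, w) + L(x0, v, w - d)) / d**2
    Lvw = (L(x0, v + d, w + d) - L(x0, v + d, w - d)
           - L(x0, v - d, w + d) + L(x0, v - d, w - d)) / (4 * d**2)
    return Lvv, Lvw, Lww

def concave_up_at(L, x0, v, w, tol=1e-5):
    """True when the estimated Hessian is positive semidefinite up to tol."""
    Lvv, Lvw, Lww = hessian(L, x0, v, w)
    return Lvv >= -tol and Lww >= -tol and Lvv * Lww - Lvw**2 >= -tol

# Integrand of part (1).
L1 = lambda x, v, w: math.exp(-x) * v**2 + (1 + x**2) * w**2

# Integrand of part (2) with a hypothetical coefficient satisfying |c(x)| < 1.
c = lambda x: 0.5 * math.sin(x)
L2 = lambda x, v, w: math.sqrt(1 - c(x)**2 + w**2) - c(x) * w

grid = [(i / 4, v, w) for i in range(5) for v in (-2.0, 0.0, 2.0) for w in (-2.0, 0.0, 2.0)]
ok1 = all(concave_up_at(L1, *p) for p in grid)
ok2 = all(concave_up_at(L2, *p) for p in grid)
```

For part (2) the Hessian is only semidefinite, since the integrand does not depend on $v$; this is why the test uses non-strict inequalities.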
Example 6.14.2. Consider $F : \mathcal{V} \to \mathbb{R}$, where $F(y) = \int_0^1 6(y')^2 - (y')^4\,dx$, $\mathcal{V} = \{y \in C^2[0,1] \mid y(0) = 0,\ y(1) = k\}$, and $k$ is a constant. This functional has a unique extremal given by $y_*(x) = kx$. For any fixed $x_0 \in [0,1]$, we have $L(x_0, v, w) = 6w^2 - w^4$, which is independent of both $x_0$ and $v$. The $L$-graph is concave up in the rectangle $R_u = \{(v, w) \mid -1 < w < 1\}$, and concave down in the rectangles $R_{d1} = \{(v, w) \mid w < -1\}$ and $R_{d2} = \{(v, w) \mid w > 1\}$. Thus, if $-1 < k < 1$, then the range of $(y_*, y_*')$ is contained in $R_u$, and $y_*$ is a local minimizer in the $C^1$-norm by Result 6.14.1. Moreover, if $k < -1$ or $k > 1$, then the range of $(y_*, y_*')$ is contained in $R_{d1}$ or $R_{d2}$, and $y_*$ is a local maximizer.
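A short numerical experiment illustrates the classification in Example 6.14.2. The Python sketch below evaluates $L''(w) = 12 - 12w^2$ at the slope of the extremal, and also compares $F$ at the straight line with $F$ at a nearby admissible curve. The slope argument `slope`, the perturbation $h(x) = t\sin(\pi x)$ with $t = 0.01$, and the midpoint quadrature are choices made here for illustration only.

```python
import math

def d2L(w):
    """Second derivative of L(w) = 6 w^2 - w^4."""
    return 12.0 - 12.0 * w**2

def classify(slope):
    """Classify the extremal y*(x) = slope * x from the sign of L''(slope)."""
    s = d2L(slope)
    if s > 0:
        return "local minimizer"
    if s < 0:
        return "local maximizer"
    return "inconclusive"

def F(dy, n=2000):
    """Midpoint-rule value of F for a curve with derivative dy(x) on [0, 1]."""
    dx = 1.0 / n
    total = 0.0
    for i in range(n):
        w = dy((i + 0.5) * dx)
        total += (6 * w**2 - w**4) * dx
    return total

def gap(slope, t=0.01):
    """F(y* + h) - F(y*) for h(x) = t sin(pi x), which vanishes at x = 0 and x = 1."""
    du = lambda x: slope + t * math.pi * math.cos(math.pi * x)
    return F(du) - F(lambda x: slope)

labels = {s: classify(s) for s in (-2.0, -0.5, 0.5, 2.0)}
```

For a slope inside $(-1, 1)$ the value of `gap` should be positive, and for a slope outside $[-1, 1]$ it should be negative, matching the labels.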
Example 6.14.3. Consider $F : \mathcal{V} \to \mathbb{R}$, where $F(y) = \int_0^1 y(y')^2\,dx$ and $\mathcal{V} = \{y \in C^2[0,1] \mid y(0) = 1,\ y(1) = 4\}$. This functional has a unique extremal given by $y_*(x) = (1 + 7x)^{2/3}$. For any fixed $x_0 \in [0,1]$, we have $L(x_0, v, w) = vw^2$, which is independent of $x_0$. An appropriate second-derivative test shows that the $L$-graph is neither concave up nor concave down in any open region of the $v,w$-plane. Thus the sufficient conditions in Result 6.14.1 do not hold. Nevertheless, by direct verification, we find that $y_*$ is a local minimizer in the $C^0$-norm in $\mathcal{V}$. To show this, we first note that $y_*$ satisfies the Euler–Lagrange equation $2y_* y_*'' + (y_*')^2 = 0$, and has the properties that $y_* \ge 1$ and $y_*'' < 0$. Next, let $u \in \mathcal{V}$ be arbitrary and let $h = u - y_* \in \mathcal{V}_0$. Then

\[
\begin{aligned}
F(u) - F(y_*) &= \int_0^1 (y_* + h)(y_*' + h')^2 - y_*(y_*')^2\,dx\\
&= \int_0^1 h(y_*')^2 + 2y_* y_*' h' + 2h y_*' h' + (y_* + h)(h')^2\,dx.
\end{aligned}
\tag{6.165}
\]

Using integration-by-parts on the term $\int 2y_* y_*' h'\,dx$, noting that $h(0) = 0$ and $h(1) = 0$, and then using the differential equation, we find that the above expression reduces to

\[
F(u) - F(y_*) = \int_0^1 2h y_*' h' + (y_* + h)(h')^2\,dx.
\tag{6.166}
\]

Next, using the fact that $2hh' = (h^2)'$, and performing an integration-by-parts on the term $\int y_*' (h^2)'\,dx$, we get

\[
F(u) - F(y_*) = \int_0^1 -y_*'' h^2 + (y_* + h)(h')^2\,dx.
\tag{6.167}
\]

Since $y_*'' < 0$, we have $-y_*'' h^2 \ge 0$, and since $y_* \ge 1$, we have $(y_* + h)(h')^2 \ge 0$ for any $h \ge -1$. Restricting to $-1 \le h \le 1$, and noting $h = u - y_*$, we obtain

\[
F(u) - F(y_*) \ge 0, \qquad \forall u \in N_{C^0}(y_*, \delta) \text{ for any } \delta \in (0, 1].
\tag{6.168}
\]

Thus $y_*$ is a local minimizer in the $C^0$-norm, even though the sufficient conditions in Result 6.14.1 do not hold.
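The properties of the extremal used in Example 6.14.3 are easy to confirm numerically. The Python sketch below checks the boundary values, the Euler–Lagrange residual $2y_* y_*'' + (y_*')^2$, the sign of $y_*''$, and the sign of $F(u) - F(y_*)$ for a few sample perturbations. The derivative formulas are obtained here by hand from $y_* = (1 + 7x)^{2/3}$, and the perturbations $h(x) = t\sin(\pi x)$ with $|t| \le 1$ and the midpoint quadrature are illustrative assumptions.

```python
import math

def y_star(x):
    return (1 + 7 * x) ** (2 / 3)

def dy_star(x):
    return (14 / 3) * (1 + 7 * x) ** (-1 / 3)

def d2y_star(x):
    return -(98 / 9) * (1 + 7 * x) ** (-4 / 3)

def residual(x):
    """Left side of the Euler-Lagrange equation 2 y y'' + (y')^2 = 0."""
    return 2 * y_star(x) * d2y_star(x) + dy_star(x) ** 2

def F(y, dy, n=4000):
    """Midpoint-rule value of the integral of y (y')^2 over [0, 1]."""
    dx = 1.0 / n
    return sum(y((i + 0.5) * dx) * dy((i + 0.5) * dx) ** 2 * dx for i in range(n))

def gap(t):
    """F(y* + h) - F(y*) for h(x) = t sin(pi x), so that |h| <= |t|."""
    u = lambda x: y_star(x) + t * math.sin(math.pi * x)
    du = lambda x: dy_star(x) + t * math.pi * math.cos(math.pi * x)
    return F(u, du) - F(y_star, dy_star)

xs = [i / 10 for i in range(11)]
max_residual = max(abs(residual(x)) for x in xs)
gaps = [gap(t) for t in (-1.0, -0.3, 0.3, 1.0)]
```

All entries of `gaps` should be positive, in agreement with (6.168) for perturbations satisfying $-1 \le h \le 1$.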
Example 6.14.4. Consider $F : \mathcal{V} \to \mathbb{R}$, where $F(y) = \int_0^\pi (y')^2 - k^2 y^2\,dx$, $\mathcal{V} = \{y \in C^2[0, \pi] \mid y(0) = 0,\ y(\pi) = 1\}$, and $k > 0$ is a noninteger constant. This functional has a unique extremal given by $y_*(x) = \sin(kx)/\sin(k\pi)$. (There is no extremal when $k$ is an integer.) For any fixed $x_0 \in [0, \pi]$, we have $L(x_0, v, w) = w^2 - k^2 v^2$, which is independent of $x_0$. Here the $L$-graph is a hyperbolic paraboloid (saddle) over the entire $v,w$-plane, and the sufficient conditions in Result 6.14.1 do not hold. As before, we may attempt a direct verification, so we let $u \in \mathcal{V}$ be arbitrary, and let $h = u - y_* \in \mathcal{V}_0$. Then by straightforward calculation we obtain

\[
F(u) - F(y_*) = \int_0^\pi (h')^2 - k^2 h^2\,dx.
\tag{6.169}
\]

Since it contains a difference of squares, the sign of the integral is not obvious. If the integral is positive for some $h$, and negative for other $h$, then the extremal will not be an extremum. To help with the analysis, we bring in a technical result about continuously differentiable functions that vanish at each end, called Wirtinger's (or Poincaré's) inequality, which states

\[
\int_a^b (h')^2\,dx \ge \Big(\frac{\pi}{b - a}\Big)^2 \int_a^b h^2\,dx, \qquad \forall h \in C^1[a,b], \quad h(a) = 0, \quad h(b) = 0.
\tag{6.170}
\]

Using (6.170) in (6.169) we get, after minor simplification, and with $a = 0$ and $b = \pi$,

\[
F(u) - F(y_*) \ge \int_0^\pi (1 - k^2) h^2\,dx.
\tag{6.171}
\]

In the case when $0 < k < 1$, the above inequality yields $F(u) - F(y_*) \ge 0$ for arbitrary $u$, which implies that $y_*$ is an absolute minimizer. On the other hand, when $k > 1$, the above inequality is not helpful: it says that $F(u) - F(y_*)$ is greater than or equal to a nonpositive quantity. In this case, we return to (6.169) and show by direct example that the integral can be positive or negative depending on $h$. For instance, let $h(x) = \varepsilon \sin(nx)$, where $\varepsilon > 0$ is an arbitrary coefficient, and $n > 0$ is an integer. Then $h \in \mathcal{V}_0$ and $F(u) - F(y_*) = \frac{1}{2}\pi\varepsilon^2(n^2 - k^2)$, which is positive if $n > k$, and negative if $n < k$. Moreover, $\varepsilon$ can be chosen so that $u$ is within any given neighborhood of $y_*$. This shows that $y_*$ is not an extremum in the case when $k > 1$.
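The sign change described in Example 6.14.4 can be reproduced numerically. The Python sketch below evaluates the integral in (6.169) for $h(x) = \varepsilon\sin(nx)$ by a midpoint rule and compares it with the closed form $\frac12\pi\varepsilon^2(n^2 - k^2)$. The function names, the sample value $k = 1.5$, and the step count are assumptions made here for illustration.

```python
import math

def gap(k, n, eps=0.1, m=4000):
    """Midpoint-rule value of the integral of (h')^2 - k^2 h^2 over [0, pi], h = eps sin(n x)."""
    dx = math.pi / m
    total = 0.0
    for i in range(m):
        x = (i + 0.5) * dx
        h = eps * math.sin(n * x)
        dh = eps * n * math.cos(n * x)
        total += (dh**2 - k**2 * h**2) * dx
    return total

def gap_formula(k, n, eps=0.1):
    """Closed form (1/2) pi eps^2 (n^2 - k^2) quoted in the example."""
    return 0.5 * math.pi * eps**2 * (n**2 - k**2)

k = 1.5                      # a noninteger constant with k > 1
below = gap(k, 1)            # n < k: expected negative
above = gap(k, 2)            # n > k: expected positive
```

Because both signs occur for arbitrarily small `eps`, no neighborhood of the extremal can make it a minimizer or a maximizer when $k > 1$.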
Reference notes

The calculus of variations is a vast subject whose theory spans across a multitude of levels, from elementary to advanced, with applications in all branches of science. Only the most elementary parts of the subject were discussed here. A wealth of additional information can be found in more specialized texts. More complete treatments of the elementary theory are given in the recent work by Kot (2014), and the classic works by Gelfand and Fomin (1963) and Bliss (1925). A treatment of the elementary theory with a focus on constraints, of both equality and inequality type, along with some elements of optimal control can be found in Troutman (1996), Smith (1974) and Bliss (1930). An introduction to the more advanced theory is given in Dacorogna (2015) and Buttazzo, Giaquinta and Hildebrandt (1998). Although we focused primarily on necessary conditions, the theory of sufficient conditions is an important part of the subject which can be approached in different ways. Various treatments of such conditions can be found in the texts above.

The results for the key problems considered in the case studies and exercises can be found in the literature, or are direct consequences of results therein. For the brachistochrone or slide problem, the solvability of the boundary conditions and the minimizing property of the cycloid are established in the classic book by Bliss (1925); for more recent treatments, see Lawlor (1996), Troutman (1996) and Coleman (2012). For the minimal surface of revolution problem, the solvability of the boundary conditions and the characterization of the two catenoid extremals are also established in Bliss (1925); see also Kot (2014). For the boat steering problem, the uniqueness and minimizing properties of the extremals follow from straightforward concavity arguments. For the hanging chain problem, the existence and characterization of the two catenary extremals are established in Troutman (1996), where the problem is reformulated in an alternative way, and concavity arguments are employed. For a proof of Wirtinger's inequality, see Dacorogna (2015) or Dym and McKean (1972).
Exercises

1. Let $F(y) = \int_0^1 (1 + x)(y')^2\,dx$, $\mathcal{V} = \{y \in C^2[0,1] \mid y(0) = 0,\ y(1) = 1\}$. Consider $\hat y(x) = \frac{1}{\ln 2}\ln(1 + x)$ and $\tilde y(x) = x^2$ in $\mathcal{V}$.
(a) Find $\delta F(y, h)$ and $\delta^2 F(y, h)$ for arbitrary $y \in \mathcal{V}$ and $h \in \mathcal{V}_0$.
(b) Show $\delta F(\hat y, h) = 0$ for every $h \in \mathcal{V}_0$, while $\delta F(\tilde y, h) \neq 0$ for some $h \in \mathcal{V}_0$. Thus $\hat y$ is a candidate for a local extremum, but not $\tilde y$.
(c) Use $\delta^2 F(\hat y, h)$ to partially classify $\hat y$.

2. Let $F(y) = \int_0^1 (y')^2 (4 - (y')^2)\,dx$, $\mathcal{V} = \{y \in C^2[0,1] \mid y(0) = 1\}$. Consider $\hat y(x) \equiv 1$ (constant) and $\tilde y(x) = 1 + x^2$ in $\mathcal{V}$.
(a) Find $\delta F(y, h)$ and $\delta^2 F(y, h)$ for arbitrary $y \in \mathcal{V}$ and $h \in \mathcal{V}_0$.
(b) Show $\delta F(\hat y, h) = 0$ for every $h \in \mathcal{V}_0$, while $\delta F(\tilde y, h) \neq 0$ for some $h \in \mathcal{V}_0$. Thus $\hat y$ is a candidate for a local extremum, but not $\tilde y$.
(c) Use $\delta^2 F(\hat y, h)$ to partially classify $\hat y$.

3. Find the extremals, if any, for the following functionals and boundary conditions.
(a) $F(y) = \int_0^1 (y')^2 + 3y + 2x\,dx$, $\quad y(0) = 0,\ y(1) = 4$.
(b) $F(y) = \int_1^2 \frac{(y')^2}{x^3}\,dx$, $\quad y(1) = 1,\ y(2) = 2$.
(c) $F(y) = \int_0^1 (y')^2 + y e^x + y^2\,dx$, $\quad y(0) = 0,\ y(1) = 0$.
(d) $F(y) = \int_1^2 x^2 (y')^2 + y^2\,dx$, $\quad y(1) = 0,\ y(2) = 1$.
(e) $F(y) = \int_1^2 x^3 (y')^2 - 4y\,dx$, $\quad y(1) = 3,\ y(2) = 4$.
(f) $F(y) = \int_0^1 x y y'\,dx$, $\quad y(0) = 0,\ y(1) = 3$.
(g) $F(y) = \int_{-1}^1 2y y' - 3x\,dx$, $\quad y(-1) = -2,\ y(1) = 2$.
(h) $F(y) = \int_0^1 y^3 + e^x y\,dx$, $\quad y(0) = 3,\ y(1) = 6$.

4. For each functional and set of boundary conditions, find the unique extremal, and show it is an absolute minimizer or maximizer.
(a) $F(y) = \int_0^1 x^2 + 4y - (y')^2\,dx$, $\quad y(0) = 0,\ y(1) = 1$.
(b) $F(y) = \int_{-1}^1 y^2 + 4(y')^2\,dx$, $\quad y(-1) = -1,\ y(1) = 0$.
(c) $F(y) = \int_0^1 x - (y - y')^2\,dx$, $\quad y(0) = 0,\ y(1) = 1$.
(d) $F(y) = \int_0^2 y^2 + y y' + (y' - 2)^2\,dx$, $\quad y(0) = 1,\ y(2) = 2$.
5. Let $y_* \in \mathcal{V}$ be an extremal of $F : \mathcal{V} \to \mathbb{R}$, where $p, q \in C^2[a,b]$ are given functions and
$\mathcal{V} = \{y \in C^2[a,b] \mid y(a) = \alpha,\ y(b) = \beta\}$, $\quad F(y) = \int_a^b p(x)(y')^2 + q(x)y^2\,dx$.
Show that, if $p, q$ are positive in $[a,b]$, then $y_*$ is an absolute minimizer.

6. Prove the following alternative forms of the fundamental lemma, where $C_0^1[a,b] = \{h \in C^1[a,b] \mid h(a) = 0,\ h(b) = 0\}$.
(a) If $f \in C^0[a,b]$ and $\int_a^b f h'\,dx = 0$ for all $h \in C_0^1[a,b]$, then $f$ is a constant function.
(b) If $f, g \in C^0[a,b]$ and $\int_a^b f h + g h'\,dx = 0$ for all $h \in C_0^1[a,b]$, then $g \in C^1[a,b]$ and $g' = f$.
Note: The result in (a) is called the duBois-Reymond lemma. It can be used to derive an alternative form of the Euler–Lagrange equation, with weaker continuity requirements.

7. Find the extremals, if any, where $k > 1$ is a constant.
(a) $F(y) = \int_1^2 \frac{\sqrt{1 + (y')^2}}{x}\,dx$, $\quad y(1) = 3,\ y(2) = 4$.
(b) $F(y) = \int_0^1 \sqrt{1 + k(y')^2}\,dx$, $\quad y(0) = 5,\ y(1) = 7$.
(c) $F(y) = \int_0^1 y\sqrt{1 + (y')^2}\,dx$, $\quad y(0) = 1,\ y(1) = 4$.
(d) $F(y) = \int_0^1 y(y')^2\,dx$, $\quad y(0) = k,\ y(1) = 4k$.
(e) $F(y) = \int_0^1 y^2 (y')^2\,dx$, $\quad y(0) = 1,\ y(1) = k$.
8. Light travels from a point source $S = (1, h)$ to a point receiver $R = (0, 1)$ through an atmosphere with index of refraction $n(x, y)$. According to Fermat's principle, light emitted at $S$ and arriving at $R$ travels along a ray or path $y(x)$ that minimizes the time-of-travel functional
$F(y) = \int_0^1 n(x, y)\sqrt{1 + (y')^2}\,dx$, $\quad y \in C^2[0,1]$, $\quad y(0) = 1$, $\quad y(1) = h$.

[Figure: light path $y(x)$ through the atmosphere from the source $S = (1, h)$ to the receiver $R = (0, 1)$, with an obstacle between them; the $x$-axis is marked at $0$, $1/2$ and $1$.]

(a) Find the extremal of $F$ in the case $n(x, y) = 1/y$, where $h > 0$ is a constant. Assume $y > 0$.
(b) Find all values of $h$ for which the light path $y(x)$ will be blocked by the obstacle ($S$ not visible to $R$).

9. Consider a scenario as in Exercise 8, but with an arbitrary index of refraction $n = n(y) > 0$. Show that if $n(y)$ decreases with height, that is $\frac{dn}{dy} < 0$, then any extremal $y(x)$ must be concave down, that is $y'' < 0$. Note: The index of refraction in the atmosphere is believed to decrease with altitude, which implies that light paths will be concave. As a result, when we see the setting sun vanish on the horizon, it is actually already below the horizon, and has been below for some time!

10. Consider the daily food intake of an individual. Let $y$ be the mass of a given food type in the stomach at time $t \in [0, b]$, where $t = 0$ is wake time, and $t = b$ is bed time, and suppose the rate of change satisfies
$y' = -ky + u$, $\quad t \in [0, b]$, $\quad y = y(t)$, $\quad u = u(t)$.

[Figure: food mass $y$ versus time $t$, from wake time $t = 0$ to bed time $t = b$, with the target level $c$ marked.]

Here $ky$ is the food breakdown rate, and $u$ is the external control, where $u > 0$ represents "eating" and $u < 0$ "purging". (We assume $y \to 0$ during sleep.) A measure of the daily intake imbalance is
$F = \int_0^b r(y - c)^2 + u^2\,dt$.
Here $c$ is a target amount for the mass, and $r$ is a weighting factor. The parameters $b, k, c, r$ are positive constants. Note that larger values of $F$ correspond to larger deviations from the target, or larger eating or purging events, over longer periods. For given conditions, we seek a food intake schedule to minimize the imbalance.
(a) Write the functional as $F(y) = \int_0^b L(t, y, y')\,dt$, by eliminating $u$.
(b) Find the unique extremal $y(t)$ given $y \in C^2[0, b]$, $y(0) = 0$, $y(b) = c$.
(c) Find the control curve $u(t)$ associated with the extremal in (b).
(d) Plot $y(t)$ and $u(t)$ for the case $\{b, k, c, r\} = \{12, 0.5, 1, 1\}$ in dimensionless units. Briefly describe the optimal food intake schedule suggested by the curves. Are there time periods where $y > c$ or $u < 0$? Aside from continuous snacking, when should the larger meals occur? Is the target food mass reached before bed time?

11. Consider $F : \mathcal{V} \to \mathbb{R}$, where
$\mathcal{V} = \{y \in C^2[a,b] \mid y(b) = \beta\}$, $\quad y(a)$ free, $\quad F(y) = \int_a^b L(x, y, y')\,dx + [G(y)]_{x=a}$.
Derive the natural boundary condition at $x = a$ for an extremal $y_* \in \mathcal{V}$.

12. Find the extremals, if any, for the following functionals and boundary conditions.
(a) $F(y) = \int_0^3 e^{2x}((y')^2 - y^2)\,dx$, $\quad y(0) = 1,\ y(3)$ free.
(b) $F(y) = \int_0^1 \frac{1}{2}(y')^2 + y' y + y' + y\,dx$, $\quad y(0) = \frac{1}{2},\ y(1)$ free.
(c) $F(y) = \int_0^1 (y')^2 + 3y\,dx + y^2(1)$, $\quad y(0) = 4,\ y(1)$ free.
(d) $F(y) = \int_0^4 (y')^2 - 2y\,dx + y^2(0)$, $\quad y(0)$ free, $y(4) = 2$.
(e) $F(y) = \int_0^2 (y')^2 - 4y + y^2\,dx + 3y(2)$, $\quad y(0)$ free, $y(2)$ free.

13. Find the unique extremal for $F(y) = \int_0^1 (y')^2 + y + y^2\,dx$, $y(0) = 0$, $y(1)$ free. Show the extremal is an absolute minimizer or maximizer.

14. Find the unique extremal for $F(y) = \int_0^\pi (y')^2 - y^2\,dx$, $y(0) = 1$, $y(\pi)$ free. Show by example that the extremal is neither an absolute minimizer nor an absolute maximizer.

15. Find the extremals, if any, where $0 < c < 1$ is a constant. Assume $y(x) > 0$ in parts (b),(c).
(a) $F(y) = \int_0^1 \sqrt{1 - c^2 + (y')^2} - c y'\,dx$, $\quad y(0) = 4,\ y(1)$ free.
(b) $F(y) = \int_0^1 \frac{\sqrt{1 + (y')^2}}{y}\,dx$, $\quad y(0) = c,\ y(1)$ free.
1
(c) πΉ(π¦) = β« βπ¦(1 + (π¦β² )2 ) ππ₯,
π¦(0) free, π¦(1) = 2.
0
1
(d) πΉ(π¦) = β« πβπ¦ (π¦β² )2 ππ₯,
π¦(0) free, π¦(1) = π.
0 1
16. Consider πΉ(π¦) = β«0 [ (π¦β² )2 + ππ¦β² ]πβ2π¦ ππ₯ where π¦(0) = 1, π¦(1) is free, and π is a constant. Find the extremals. Show there are no extremals if π β₯ π# , and only one extremal if π < π# , where π# is a number that you should find. 17. Find the extremals assuming π¦ > 0 for 1
πΉ(π¦) = β« 0
ππ¦β² β (π¦β² )2 ππ₯, π¦
π¦(0) = 1,
π¦(1) free,
π constant.
Show there are no extremals when π < πβ , two extremals when πβ < π < πββ , and only one allowable extremal when π > πββ , where πβ and πββ are numbers that you should find. What happens to the extremals when π = πβ and π = πββ ? 18. Consider πΉ βΆ V β β, where V = {π¦ β πΆ 4 [π, π] | π¦(π) = πΌ, π¦β² (π) = πΎ},
$y(b)$ and $y'(b)$ free,
$$F(y) = \int_a^b L(x, y, y', y'')\,dx + [G(y)]_{x=b} + [H(y')]_{x=b}.$$
Derive the natural boundary conditions at $x = b$ for an extremal $y_* \in V$.
19. Find the extremals, if any, for the following functionals and boundary conditions.
(a) $F = \int_0^1 \big[yy' + (y'')^2\big]\,dx$, $y(0) = 0$, $y'(0) = 1$, $y(1) = 2$, $y'(1) = 4$.
(b) $F = \int_0^1 \big[y + yy'y''\big]\,dx$, $y(0) = 2$, $y'(0) = 2$, $y(1) = 3$, $y'(1) = 3$.
(c) $F = \int_0^1 \big[(y')^2 + (y'')^2\big]\,dx$, $y(0) = 0$, $y'(0) = 1$, $y(1) = 2$, $y'(1)$ free.
(d) $F = \int_0^1 \big[y + 5yy'' + (y' + y'')^2\big]\,dx$, $y(0) = 1$, $y'(0) = 0$, $y(1)$ free, $y'(1) = 0$.
20. Repeat the optimal food intake problem in Exercise 10, assuming the value of $y$ at the final time is free.
21. Find the extremals of $F$, if any, subject to the constraint $G = 1$, with the given boundary conditions.
(a) $F = \int_0^{\pi} (y')^2\,dx$, $G = \int_0^{\pi} y^2\,dx$, $y(0) = 0$, $y(\pi) = 0$.
(b) $F = \int_0^1 \big[y + (y')^2\big]\,dx$, $G = \int_0^1 xy'\,dx$, $y(0) = 0$, $y(1) = 1$.
(c) $F = \int_0^{\pi/2} \big[(y')^2 + 2yy' - y^2\big]\,dx$, $G = \int_0^{\pi/2} 6y\,dx$, $y(0) = 0$, $y(\pi/2) = 1$.
(d) $F = \int_0^{\pi} \big[(y')^2 + 4y\big]\,dx$, $G = \int_0^{\pi} y^2\,dx$, $y(0) = 0$, $y(\pi)$ free.
22. Find the extremals of πΉ = β«0 β1 + (π¦β² )2 ππ₯, if any, subject to the constraint πΊ = 1 β«0 π¦ ππ₯ = π΄, where π¦(0) = 0, π¦(1) = 0 and π΄ > 0 is a constant. Explain why the π π case 0 < π΄ β€ 8 is different from π΄ > 8 . Note: The above is a variant of Queen Didoβs problem, which is to find a plane curve that encloses the greatest area among all curves of a given length. In the above, only graphs are considered, and we seek a graph of shortest length among all graphs that enclose a given area. 23. Consider πΉ, πΊ βΆ V β β, where V = {π¦ β πΆ 4 [π, π] | π¦(π) = πΌ, π¦β² (π) = πΎ, π¦(π) = π½, π¦β² (π) = π}, π
π
πΉ(π¦) = β«π πΏ(π₯, π¦, π¦β² , π¦β³ ) ππ₯,
πΊ(π¦) = β«π π(π₯, π¦, π¦β² , π¦β³ ) ππ₯.
Derive the EulerβLagrange boundary-value problem for an extremal of πΉ subject to the constraint πΊ = π. π
24. Let V = {π¦ β πΆ 2 [0, π] | π¦(0) = 0, π¦(π) = 0}, πΉ(π¦) = β«0 (π¦β² )2 ππ₯, and πΊ(π¦) = π β«0 π¦2 ππ₯. The EulerβLagrange equation implies that, for every integer π β 0, the 2 function π¦β,π (π₯) = ( π )1/2 sin(ππ₯) is an extremal of πΉ subject to πΊ = 1. (a) For π = Β±1, show that π¦β,π is an absolute minimizer of πΉ subject to πΊ = 1. [Hint: Use Wirtingerβs inequality (6.170).] (b) For π β Β±1, show that π¦β,π is not an absolute minimizer of πΉ subject to πΊ = 1. 25. Consider the car acceleration problem outlined in Section 6.11. In dimensionless units, let {π, β, π} = {1, 1, 10} and consider the case of a straight, flat road so that π(π ) β‘ 0. y
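[Figure: a car travelling along a road in the $x$-$y$ plane from point $P$ to point $Q$, subject to gravity $g$ and air resistance $\eta$. At $P$: $s = 0$ when $t = 0$. At $Q$: $s = \ell$ when $t = b$.]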
(a) For $\eta = 0$, find the unique extremal $s(t)$ for the problem in (6.128). Determine the corresponding optimal control $u(t)$. Qualitatively describe the $u$ versus $t$ curve in terms of "foot pressure" on the gas and brake pedals.
(b) For $\eta = 0$, find the unique extremal $s(t)$ for the problem in (6.129). Determine the corresponding optimal control $u(t)$. Qualitatively describe the $u$ versus $t$ curve as before.
(c) Repeat (a) for the case $\eta = 4$.
(d) Repeat (b) for the case $\eta = 4$.
Mini-project 1. A soap film will form a curved surface of revolution when stretched between two rings. If the profile of the film is denoted by $y(x)$, $x \in [-L, L]$, then the surface area is given by
$$F(y) = 2\pi \int_{-L}^{L} y\sqrt{1 + (y')^2}\,dx.$$
[Figure: soap film with profile $y(x)$ on $[-L, L]$, stretched between two rings of radius $\alpha$ and $\beta$.]
The natural, observed shape of the film can be described as that which minimizes the area functional πΉ in the set π = {π¦ β πΆ 2 [βπΏ, πΏ] | π¦(βπΏ) = πΌ, π¦(πΏ) = π½, π¦(π₯) > 0}. Here we explore this minimization problem in the case when the two rings are the same size so that πΌ = π½, and for different values of the aspect ratio πΌ/πΏ. We assume that πΌ = π½ > 0 and πΏ > 0 are given constants. All quantities are dimensionless. (a) Write and solve the EulerβLagrange differential equation for a local minimizer of πΉ in π. [One way is to consider the reduced equation, and parametric description π₯ = π(π ) and π¦ = π(π ), with π(π ) = π΄ cosh(π ), and then eliminate π .] By renaming constants πΏ ππ₯ as necessary, show that the general solution can be written as π¦(π₯) = π cosh( πΏ + π), where π > 0 and π are arbitrary constants. (b) Write the boundary conditions for π¦ in the symmetric case when πΌ = π½. Show that these conditions imply the following equations for π and π, where πΎ = πΌ/πΏ, π=0
and
cosh(π) β πΎπ = 0.
(c) In the second equation above, πΎ is given, and π is an unknown. Show that this equation has two solutions if πΎ > πΎ# , one solution if πΎ = πΎ# , and no solution if 0 < πΎ < πΎ# , where πΎ# is an appropriate number which you should find. Each solution for π, together with π = 0, gives a candidate curve π¦β β π. Hence we have two, one or no candidates for a local minimizer depending on the value of the ratio πΎ.
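As a numerical companion to part (c), the following sketch locates the positive roots of cosh(c) - gamma*c = 0 for a given ratio gamma by scanning for sign changes and refining each bracket with bisection. It is only an illustration under stated assumptions: the names `c`, `gamma`, `residual`, `bisect`, and `find_roots` are our own labels, the scan range (0, 10] and the grid size are arbitrary choices, and a double root at the threshold value of the ratio would be missed by a sign-change scan.

```python
import math


def residual(c, gamma):
    """Left-hand side of cosh(c) - gamma*c = 0 (c is our label for the unknown)."""
    return math.cosh(c) - gamma * c


def bisect(f, lo, hi, tol=1e-12):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


def find_roots(gamma, c_max=10.0, n=2000):
    """Scan (0, c_max] on a uniform grid and refine every sign change."""
    roots = []
    step = c_max / n
    for i in range(1, n):
        a, b = i * step, (i + 1) * step
        if residual(a, gamma) * residual(b, gamma) < 0.0:
            roots.append(bisect(lambda c: residual(c, gamma), a, b))
    return roots


if __name__ == "__main__":
    # Two sample ratios, chosen arbitrarily for illustration.
    for gamma in (1.0, 2.0):
        print(gamma, find_roots(gamma))
```

Each returned root, together with a zero shift constant, corresponds to one candidate curve. An empty list means no candidate was found in the scanned range.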
(d) When πΎ > πΎ# and there are two candidate curves π¦β , it can be shown that the candidate with the smaller π is a local minimizer whereas the other is not. Find and plot these curves for the case πΎ = 2 and indicate which is the local minimizer; for the plot interval [βπΏ, πΏ] use πΏ = 1. When πΎ β€ πΎ# , it can be shown that there are no local minimizers of πΉ in π; in this case, a surface of minimum area no longer has a profile in π. What do you think might happen to the film in this case? Note: The above results for πΎ > πΎ# and πΎ β€ πΎ# can be illustrated by direct experiment with some bubble solution and wire rings. Note that decreasing πΎ is equivalent to increasing πΏ with fixed πΌ. The general curve in (a) is called a catenary, and the surface of revolution is called a catenoid. Mini-project 2. A drone boat is to be driven across a channel of moving water as described in Section 6.9. We suppose that the steering angle π with respect to the horizontal axis can be controlled remotely, and that the boat moves at constant speed π relative to the water. If we let π¦(π₯), π₯ β [0, β] denote the path of the boat, then the travel time π(π¦) along the path, and steering angle π(π₯) required for the path, are y
w(x)
β
water y(x) Ο boat
P
π(π¦) = Q
ΞΈ(x)
cos π = x
β1 β π2 + (π¦β² )2 β ππ¦β² 1 β« ππ₯, π 0 1 β π2 1 β π2
β1 β π2 + (π¦β² )2 β ππ¦β²
, sin π = π¦β² cos π β π.
0
Here π(π₯) = π€(π₯)/π, where π€(π₯) is the speed of the water. We seek the path π¦(π₯) that will minimize the travel time π(π¦) under different boundary conditions. We assume that the water speed is everywhere less than the boat speed, so that β1 < π(π₯) < 1, and that the boat path is a graph (one π¦ for each π₯) with two continuous derivatives, which π π requires β 2 < π(π₯) < 2 . For concreteness, we use π = 1 and π€(π₯) = ππ₯(β β π₯), where π = 3.5 and β = 1 are constants. All quantities are dimensionless. (a) Independent of boundary conditions, consider the EulerβLagrange differential ππΏ equation for local minimizers of π(π¦). Show that the equation becomes ππ¦β² = π΄, which gives π¦β² (π₯) = π(π₯, π΄), and thus the general solution is π₯
π¦(π₯) = π΅ + β« π(π₯,Μ π΄) ππ₯,Μ 0
where π΄ and π΅ are arbitrary constants. Here π(π₯, π΄) is a function which you should find. Show that π(π₯, π΄) will be defined for all π₯ β [0, 1] only if π΄# < π΄ < π΄# , where π΄# and π΄# are bounds which you should find. (b) Consider the fixed-fixed problem where the boat must depart from π = (0, 0) and arrive at π = (1, 3). In this case, an optimal path is a minimizer of π βΆ V β β, where V = {π¦ β πΆ 2 [0, 1] | π¦(0) = 0, π¦(1) = 3}. Find a unique extremal using the solution from (a). [The boundary conditions will give a nonlinear equation for π΄; it can be solved numerically using the interval (π΄# , π΄# ) as a guide.]
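The numerical step suggested in part (b) can be sketched generically as follows. The code solves an equation of the form "integral of g(x, A) over [0, 1] equals a target" for A, using composite Simpson quadrature and bisection. It is a sketch, not the project's solution: `g_demo` is a hypothetical stand-in chosen only so that the answer is easy to check (the actual g(x, A) is the one derived in part (a)), and the function names, grid size, and tolerance are assumptions. Bisection also presumes that the integral is continuous in A and that the mismatch changes sign across the supplied bracket, which should lie strictly inside the admissible interval for A.

```python
def simpson(f, a, b, n=200):
    """Composite Simpson rule on [a, b] with n subintervals (n made even)."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def solve_for_A(g, target, A_lo, A_hi, tol=1e-10):
    """Find A in [A_lo, A_hi] with integral_0^1 g(x, A) dx = target."""

    def mismatch(A):
        return simpson(lambda x: g(x, A), 0.0, 1.0) - target

    m_lo = mismatch(A_lo)
    if m_lo * mismatch(A_hi) > 0.0:
        raise ValueError("bracket does not enclose a root")
    while A_hi - A_lo > tol:
        A_mid = 0.5 * (A_lo + A_hi)
        m_mid = mismatch(A_mid)
        if m_lo * m_mid <= 0.0:
            A_hi = A_mid
        else:
            A_lo, m_lo = A_mid, m_mid
    return 0.5 * (A_lo + A_hi)


def g_demo(x, A):
    """Hypothetical stand-in, NOT the g(x, A) of the mini-project."""
    return A * (1.0 + x)


# With the stand-in, the integral is 1.5*A, so a target of 3 gives A = 2.
A_star = solve_for_A(g_demo, target=3.0, A_lo=0.0, A_hi=5.0)
print(A_star)
```

To use it for the boat problem, replace `g_demo` with your own g(x, A) and pass a bracket slightly inside the bounds found in part (a), since the integrand may blow up at the bounds themselves.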
(c) Consider the fixed-free problem where the boat must depart from π but is free to arrive at any point on the other side. In this case, an optimal path is a minimizer of π βΆ W β β, where W = {π¦ β πΆ 2 [0, 1] | π¦(0) = 0}. Find a unique extremal using the solution from (a). (d) It can be shown that the unique extremals found in (b) and (c) are absolute minimizers. Using the expressions for sin π(π₯) and cos π(π₯), and the substitution π¦β² (π₯) = π(π₯, π΄), make a plot of the steering angle π(π₯) for each problem. Also, make a plot of each path π¦(π₯). Given that V β W, briefly explain why the travel time for the optimal path in (c) must be less than or equal to that in (b). Mini-project 3. A chain of given length will hang in a curved shape when its ends are held fixed at two given points as outlined in Section 6.13. Using coordinates as shown, the natural shape of the chain can be described as that curve π¦(π₯), π₯ β [βπΏ, πΏ] which minimizes the chain potential energy πΈ(π¦), subject to the length constraint πΊ(π¦) = β, where y 2Ξ±
x
πΏ
πΈ(π¦) = β« πππ¦β1 + (π¦β² )2 ππ₯, βπΏ
πΏ
g
πΊ(π¦) = β« β1 + (π¦β² )2 ππ₯. βπΏ
x=βL
x=L
In the above, β is the chain length, π is the chain mass per unit length, π is gravitational acceleration, and πΌ is an offset parameter such that π¦(βπΏ) = πΌ and π¦(πΏ) = βπΌ. Here we explore this minimization problem in the case of zero offset so that πΌ = 0, and for different values of the length ratio πΎ = β/(2πΏ). We assume π, π, β, and πΏ are positive constants. All quantities are dimensionless. (a) Write and solve the EulerβLagrange differential equation for extremals. By filling in the details omitted in the text, and excluding the trivial solution with π¦(π₯) β‘ constant, show that the general solution can be written as πΏ ππ₯ π¦(π₯) = βπ + cosh ( + π), π πΏ where π, π β 0, and π are unknown constants. (b) Show that the boundary and constraint conditions π¦(βπΏ) = 0, π¦(πΏ) = 0 and πΊ(π¦) = β imply the following equations for π, π, and π, where πΎ = β/(2πΏ), π=0
and
π=
πΏ cosh(π) π
and
sinh(π) = πΎ. π
(c) In the last equation above, πΎ > 0 is given, and π β 0 is an unknown. Show that this equation has two solutions if πΎ > 1, and no solution if πΎ < 1. Hence we have two or no extremals depending on the ratio πΎ. What is the physical reason there can be no extremal if πΎ < 1? What is the only possible extremal when πΎ = 1? [This is the trivial solution excluded in part (a).]
(d) In the case when πΎ > 1 and there are two extremals, it can be shown that one is an absolute minimizer and the other an absolute maximizer. Find and make plots of these curves for the case when β = 0.5 and πΏ = 0.1. Indicate which is the minimizer and maximizer; this should be clear. What is the middle sag-depth π = |π¦(0)| for the energy-minimizing shape? Note 1: The above results can be illustrated by direct experiment with an open necklace. You should be able to compute and verify the sag-depth at the middle (and other locations) from knowledge of the necklace length β and the separation distance 2πΏ. Note that the results do not depend on π and π. Note 2: The case when β = β(2πΏ)2 + (2πΌ)2 , which corresponds to πΎ = 1 when πΌ = 0, is special. This value of β is a minimum of the constraint functional πΊ, and the only possible shape of the chain is a line, but this line need not satisfy the EulerβLagrange equation since Result 6.12.1 does not hold for an extremum of πΊ. This line happens to satisfy the EulerβLagrange equation when πΌ = 0, but not when πΌ β 0.
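For the numerical part of (d), a minimal sketch is given below. It solves sinh(c)/c = gamma for the positive root by bisection and then evaluates a middle sag-depth. The names `c`, `gamma`, `chain_root`, and `sag_depth` are our own labels. The expression half_span*(cosh(c) - 1)/c used for the sag is our reading of |y(0)| from the general solution in (a) combined with the conditions in (b), so it is an assumption to be checked against your own derivation. By symmetry of sinh(c)/c, the other root is the negative of the one returned.

```python
import math


def chain_root(gamma, tol=1e-12):
    """Positive root c of sinh(c)/c = gamma; assumes gamma is clearly above 1."""
    if gamma <= 1.0:
        raise ValueError("no nonzero root unless gamma > 1")

    def f(c):
        return math.sinh(c) - gamma * c

    # f is negative for small positive c because sinh(c) is close to c there.
    lo, hi = 1e-6, 1.0
    while f(hi) < 0.0:  # expand until sinh(c) overtakes gamma*c
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def sag_depth(chain_length, half_span):
    """Assumed formula for |y(0)|: half_span * (cosh(c) - 1) / c."""
    gamma = chain_length / (2.0 * half_span)
    c = chain_root(gamma)
    return half_span * (math.cosh(c) - 1.0) / c


if __name__ == "__main__":
    # Values from part (d): chain length 0.5 and half-span 0.1.
    print(chain_root(0.5 / (2.0 * 0.1)), sag_depth(0.5, 0.1))
```

The same two functions can be used to compare a computed sag with a measured one in the necklace experiment of Note 1, given the necklace length and the separation distance.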
The analysis and interpretation of mathematical models is an essential part of the modern scientific process. Topics in Applied Mathematics and Modeling is designed for a one-semester course in this area aimed at a wide undergraduate audience in the mathematical sciences. The prerequisite for access is exposure to the central ideas of linear algebra and ordinary differential equations. The subjects explored in the book are dimensional analysis and scaling, dynamical systems, perturbation methods, and calculus of variations. These are immense subjects of wide applicability and a fertile ground for critical thinking and quantitative reasoning, in which every student of mathematics should have some experience. Students who use this book will enhance their understanding of mathematics, acquire tools to explore meaningful scientific problems, and increase their preparedness for future research and advanced studies. The highlights of the book are case studies and mini-projects, which illustrate the mathematics in action. The book also contains a wealth of examples, figures, and regular exercises to support teaching and learning. The book includes opportunities for computer-aided explorations, and each chapter contains a bibliography with references covering further details of the material.
This series was founded by the highly respected mathematician and educator, Paul J. Sally, Jr.