Graduate Texts in Mathematics
Rabi Bhattacharya Edward C. Waymire
Continuous Parameter Markov Processes and Stochastic Differential Equations
Graduate Texts in Mathematics
299
Graduate Texts in Mathematics Series Editors: Patricia Hersh, University of Oregon, Eugene, OR, USA Ravi Vakil, Stanford University, Stanford, CA, USA Jared Wunsch, Northwestern University, Evanston, IL, USA Advisory Editors: Alexei Borodin, Massachusetts Institute of Technology, Cambridge, MA, USA Richard D. Canary, University of Michigan, Ann Arbor, MI, USA David Eisenbud, University of California, Berkeley & SLMath, Berkeley, CA, USA Brian C. Hall, University of Notre Dame, Notre Dame, IN, USA June Huh, Princeton University, Princeton, NJ, USA Akhil Mathew, University of Chicago, Chicago, IL, USA Peter J. Olver, University of Minnesota, Minneapolis, MN, USA John Pardon, Princeton University, Princeton, NJ, USA Jeremy Quastel, University of Toronto, Toronto, ON, Canada Wilhelm Schlag, Yale University, New Haven, CT, USA Barry Simon, California Institute of Technology, Pasadena, CA, USA Melanie Matchett Wood, Harvard University Yufei Zhao, Massachusetts Institute of Technology, Cambridge, MA, USA
Graduate Texts in Mathematics bridge the gap between passive study and creative understanding, offering graduate-level introductions to advanced topics in mathematics. The volumes are carefully written as teaching aids and highlight characteristic features of the theory. Although these books are frequently used as textbooks in graduate courses, they are also suitable for individual study.
Rabi Bhattacharya • Edward C. Waymire
Continuous Parameter Markov Processes and Stochastic Differential Equations
Rabi Bhattacharya Department of Mathematics University of Arizona Tucson, AZ, USA
Edward C. Waymire Department of Mathematics Oregon State University Corvallis, OR, USA
ISSN 0072-5285 ISSN 2197-5612 (electronic) Graduate Texts in Mathematics ISBN 978-3-031-33294-4 ISBN 978-3-031-33296-8 (eBook) https://doi.org/10.1007/978-3-031-33296-8 Mathematics Subject Classification: 47D07, 60H05, 60H30, 60J25, 60J27, 60J28, 60J35, 60J70, 60J74, 60J76, 60E07, 60F17, 60G44, 60G51, 60G52, 60G53 © Springer Nature Switzerland AG 2023 This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Switzerland AG The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland Paper in this product is recyclable.
Dedicated to Gouri (In loving memory), to Linda, and to Vijay K Gupta (In memory of a dedicated scientific collaborator and dear friend to both of us.)
Preface
The subject of Markov processes in continuous time has an elegant and profound mathematical theory and a great diversity of applications in all fields of natural and social sciences, technology, and engineering. It is our objective to present this theory and many of its applications as simply as we can. Apart from a graduate course in analysis and one in probability, the book is essentially self-contained. For the probability background, A Basic Course in Probability Theory (2016), referred to as BCPT throughout the text, suffices. A few technical results from analysis are proved in Appendices. There are rare but occasional references to the companion stochastic process texts (Bhattacharya and Waymire 2021, 2022). Alternative references can be used as well. The first chapter reviews Doob's martingale theory as an essential tool used throughout the book. We begin then with an exposition of how a semigroup of transition operators Tt (t > 0), describing the expected evolution in time t of all appropriate functions f(x) of an initial state x, is specified by the derivative of Tt f(x) at time t = 0. This derivative A, called the infinitesimal generator, embodies the law of evolution determining the Markov process. The generator A is in integro-differential form, including the class of all second order linear elliptic operators, indicating a deep connection with partial differential equations. Next, after establishing a broad criterion for right continuity of sample paths, two classes of Markov processes are presented both analytically, using semigroups, and probabilistically, describing pathwise behavior. Jump Markov processes are those that remain in a state for an exponentially distributed random time before moving to another state. Parameters of the exponential times depend on the holding state. Poisson and compound Poisson processes, as well as continuous time
birth-and-death chains, are examples. This is followed by a chapter on processes with independent increments, also called Lévy processes. This class includes the ubiquitous Brownian motion. Leaving that out, the fascinating Lévy–Khinchin–Itô theory exhibits the process as an accumulation of jumps from a Poisson random field. The most important continuous time stochastic processes are diffusions, which are real or vector-valued Markov processes having continuous sample paths. Much of the book is devoted to this topic. A diffusion may be thought of as locally Brownian, and Itô's beautiful and rich theory of stochastic integrals and stochastic differential equations for the construction of general diffusions is developed around this theme. A centerpiece of this theory is Itô's Lemma, with stochastic differentials and integrals replacing Newtonian calculus. Fairly complete asymptotic properties of these processes in one dimension, due to Feller, such as transience, null and positive recurrence, and explosion, are presented in several chapters. In view of its general importance in Markov process theory, a rather detailed account is presented of the speed of convergence to equilibrium of Harris recurrent Markov processes in discrete time, and one- and multi-dimensional diffusions in continuous time, using the method of coupling. Multi-dimensional extensions of Feller's criteria, announced by Khasminskii, are derived in two chapters under broad conditions on the coefficients of the elliptic infinitesimal generator. Also developed in this text is a widely applicable functional central limit theorem (FCLT) for general ergodic Markov processes. It says, in essence, that the FCLT holds for integrals of all square integrable functions in the range of the infinitesimal generator. This provides in many instances a computable dispersion matrix for the asymptotic Brownian motion. Among several applications, we give a probabilistic derivation of the Taylor–Aris theory of solute transport in a capillary, and the asymptotic Gaussian distribution of a scaled diffusion on Rk with periodic coefficients. A transformation rule, known as the Cameron–Martin–Girsanov Theorem, computes the Radon–Nikodym derivative of the distribution on finite intervals [0, T] of one diffusion with respect to another with the same dispersion coefficient but different drifts. This has many uses, one being the proof that a diffusion with nonsingular diffusion matrix has full support. In one dimension, the Hille–Yosida Theorem for semigroups is the cornerstone of Feller's theory for the construction of all one-dimensional regular diffusions. Feller's seminal theory also characterizes all possible boundary conditions. An exposition of this theory in a graduate text is one of the highlights of this book. The probabilistic understanding of the main boundary conditions is also emphasized.
Our approach to the construction of certain classes of Markov processes as functions of Markov processes yields, among others, the construction of reflecting and periodic diffusions and diffusions on a torus. A chapter on Paul Lévy's local time of Brownian motion is included, leading to applications such as his construction of reflecting Brownian motion, as well as the intriguing behavior under mixed boundary conditions. The treatment of local times uses some material from a chapter on integration with respect to L2-martingales that precedes it. This latter chapter contains the Doob–Meyer decomposition as well. The intimate connection between stochastic differential equations and elliptic and parabolic partial differential equations of second order, presented in two chapters, attests to the mathematical depth of this part of probability and is used extensively for the asymptotic theory that follows. The chapters on special topics include one on multi-phase homogenization of periodic diffusions with a large spatial parameter "a." Two examples illustrate this fascinating theory. One displays the passage of the solute profile through Gaussian and non-Gaussian phases as dispersivity rapidly increases with time, while the other has different Gaussian phases, but no substantial increase in dispersion with time. This topic is motivated by widely observed growth in dispersion of solutes in heterogeneous media such as aquifers. Another special topics chapter, titled "Skew Brownian Motion," is also motivated by solute transport in porous media. Here also, interesting anomalous asymptotic behavior occurs when the diffusion has change points in the dispersion parameter. Another special topics chapter is devoted to a brief exposition of the Stroock–Varadhan martingale problem, with an application to the approximation of a one-dimensional diffusion by a scaled birth-and-death chain. The authors are indebted to Springer Editors Loretta Bartolini and Elizabeth Loew for their guidance, and also to anonymous reviewers for their valuable suggestions. We would like to thank our colleagues Sunder Sethuraman, William Faris, and Enrique Thomann for their continued advice and encouragement. We also thank the University of Arizona graduate students Duncan Bennett and Eric Roon for technical assistance with some TikZ and LaTeX code. The authors gratefully acknowledge partial support (DMS-1811317, DMS-1408947) from the National Science Foundation during the preparation of this book.
Tucson, AZ, USA
Corvallis, OR, USA
July 17, 2023
Rabi Bhattacharya Edward C. Waymire
Course Suggestions
(A) Discontinuous Markov Processes: 1–4, 5, 26
(B) Stochastic Differential Equations: 1, 6–9, 10, 11, 12, 20, 23
(C) Diffusions and PDEs: 2, 8, 9, 13, 15–16, 21, 22, 25
(D) Equilibria and Speed of Convergence: 11, 14, 18, 22
(E) Fluctuation Laws and Approximation in Distribution: 1, 17, 24, 25

Chapter Dependency Diagram
DMP = Discontinuous Markov Processes; SC = Stochastic Calculus; D = Diffusion
[Flowchart "Chapter Relationships and Dependencies": Chapters 1–26 arranged along the DMP, SC, and D tracks, indicating the prerequisite relationships among chapters.]
A Trilogy on Stochastic Processes
• Random Walk, Brownian Motion and Martingales (GTM 292)
• Stationary Processes and Discrete Time Markov Processes (GTM 293)
• Continuous Parameter Markov Processes and Stochastic Differential Equations (GTM 299)

The trilogy provides a systematic exposition of stochastic processes—a branch of mathematics that increasingly impacts every area of science and technology.

GTM 292. This volume begins with a broad coverage of random walk, Brownian motion, and martingales—three elegant pillars of probability theory. Brownian motion with its universality shines as a star in the center of probability, with the scaled random walk and the widely applicable martingale functional central limit theorem (FCLT) converging to it. There is an extensive treatment of branching processes with many applications, along with a branching-like treatment, initiated by Le Jan and Sznitman, of the most basic equations of fluid dynamics, namely, the Navier–Stokes equations; this is among the most challenging unsolved problems of mathematics, with broad implications for physics. Discrete renewal theory, using coupling, lays the foundation for analyzing the speed of convergence to equilibrium for general Markov processes—a recurring theme in the trilogy. Blackwell's general renewal theory is essential for the computation of ruin probabilities in insurance, especially for heavy-tailed claim sizes which may aggravate the chance of a collapse.

GTM 293. The first part of the second volume is a detailed treatment of Kolmogorov's prediction theory for processes with only weak stationarity, namely, that of second order moments. Together with Wiener's independent result for such processes in continuous time, this has been one of the most important developments in the engineering literature. This is followed by Birkhoff's ergodic theorem for strictly stationary processes which, historically, had provided the affirmation of the long anticipated equality of the time average and the phase average of systems under ergodicity in statistical mechanics. At the other extreme are chaotic dynamical systems, some of which are illustrated. Asymptotic behavior of discrete parameter
Markov processes, which may be thought of as random dynamical systems, is studied in depth, including Harris recurrence. Standing apart are monotone Markov processes which may not even be irreducible, but still converge to equilibrium in special distances. The latter arise in the study of sustainable resources, queuing, etc. For functions of ergodic Markov processes, the Billingsley-Ibragimov FCLT is derived. A rather new exposition of the large deviation theory is presented covering both the classical i.i.d. case of Cramer and Sanov, as well as the Markovian case due to Donsker and Varadhan. There is a chapter on random fields which arise in physics that includes the FKG inequality and a central limit theorem under positive dependence due to Newman. GTM 299 The contents of the final volume of the trilogy are summarized in the Preface.
Contents

1. A Review of Martingales, Stopping Times, and the Markov Property
2. Semigroup Theory and Markov Processes
3. Regularity of Markov Process Sample Paths
4. Continuous Parameter Jump Markov Processes
5. Processes with Independent Increments
6. The Stochastic Integral
7. Construction of Diffusions as Solutions of Stochastic Differential Equations
   7.1 Construction of One-Dimensional Diffusions
   7.2 Extension to Multidimensional Diffusions
   7.3 An Extension of the Itô Integral & SDEs with Locally Lipschitz Coefficients
   7.4 Strong Markov Property
   7.5 An Extension to SDEs with Nonhomogeneous Coefficients
   7.6 An Extension to k-Dimensional SDE Governed by r-Dimensional Brownian Motion
8. Itô's Lemma
   8.1 Asymptotic Properties of One-Dimensional Diffusions: Transience and Recurrence
9. Cameron–Martin–Girsanov Theorem
10. Support of Nonsingular Diffusions
11. Transience and Recurrence of Multidimensional Diffusions
12. Criteria for Explosion
13. Absorption, Reflection, and Other Transformations of Markov Processes
    13.1 Absorption
    13.2 General One-Dimensional Diffusions on Half-Line with Absorption at Zero
    13.3 Reflecting Diffusions
14. The Speed of Convergence to Equilibrium of Discrete Parameter Markov Processes and Diffusions
15. Probabilistic Representation of Solutions to Certain PDEs
    15.1 Feynman–Kac Formula for Multidimensional Diffusion
    15.2 Kolmogorov Forward Equation (The Fokker–Planck Equation)
16. Probabilistic Solution of the Classical Dirichlet Problem
17. The Functional Central Limit Theorem for Ergodic Markov Processes
    17.1 A Functional Central Limit Theorem for Diffusions with Periodic Coefficients
18. Asymptotic Stability for Singular Diffusions
19. Stochastic Integrals with L2-Martingales
20. Local Time for Brownian Motion
21. Construction of One-Dimensional Diffusions by Semigroups
22. Eigenfunction Expansions of Transition Probabilities for One-Dimensional Diffusions
23. Special Topic: The Martingale Problem
24. Special Topic: Multiphase Homogenization for Transport in Periodic Media
25. Special Topic: Skew Random Walk and Skew Brownian Motion
26. Special Topic: Piecewise Deterministic Markov Processes in Population Biology
A. The Hille–Yosida Theorem and Closed Graph Theorem
References
Related Textbooks and Monographs
Symbol Index
Author Index
Subject Index
Chapter 1
A Review of Martingales, Stopping Times, and the Markov Property
Primarily for ease of reference, this chapter contains an expository overview of some of the most essential aspects of martingale theory used in this text, especially as it pertains to continuous parameter Markov processes.
Among the dominant notions from the theory of stochastic processes used to develop the theory of stochastic differential equations are those of martingales, stopping times, and the Markov property. This chapter provides an expository account of this basic background material that will be used repeatedly in the ensuing applications to Markov processes and stochastic differential equations. A more exhaustive and detailed treatment is presented in BCPT.¹ First, let us recall the meaning of the Markov property of a discrete parameter stochastic process X = (X0, X1, . . . ) defined on a probability space (Ω, F, P), each Xn taking values in a measurable space (S, S). Suppose that the initial state is deterministic, say X0 = x0 for some x0 ∈ S, and X has (time-homogeneous or stationary) transition probabilities p(x, dy). Here p(x, dy) is, for each x ∈ S, a probability measure on (S, S) such that x → p(x, B) is measurable for every B ∈ S. Then, the distribution of the process X is a probability measure Px0 on the infinite product space (S∞, S⊗∞) of possible sample paths given by

$$P_{x_0}(A_0 \times A_1 \times \cdots \times A_n \times S^\infty) = P(\{\omega \in \Omega : X_i(\omega) \in A_i,\ 0 \le i \le n\} \mid X_0 = x_0), \tag{1.1}$$
¹ Throughout, BCPT refers to Bhattacharya and Waymire (2016), A Basic Course in Probability Theory.
for finite dimensional events A0 × A1 × · · · × An × S∞, Ai ∈ S, 0 ≤ i ≤ n. Let Fn = σ(X0, X1, . . . , Xn) be the σ-field of subsets of Ω of this form, i.e., the σ-field generated by X0, X1, . . . , Xn. In (1.1), the notation "|X0 = x0" is used to indicate conditional probability given σ(X0) evaluated at X0 = x0. Then, the Markov property may be expressed as follows: For each m ≥ 1 (fixed), the conditional distribution of the after-m process X_m^+ = (Xm, Xm+1, Xm+2, . . . ) given Fm is PXm, where PXm is the probability Px with x = Xm. The Markov property may be expressed in terms of transition probabilities as: the conditional distribution of the state Xm+1 given the past and present Fm is p(Xm, dy). To see that this one-step Markov property is sufficient, observe that with it and iterated conditioning one has, for any bounded measurable f : Sn+1 → R,

$$E\big(f(X_m, X_{m+1}, \ldots, X_{m+n}) \mid \mathcal{F}_{m+n-1}\big) = \int_S f(X_m, \ldots, X_{m+n-1}, x_{m+n})\, p(X_{m+n-1}, dx_{m+n}) = g_{n-1}(X_m, \ldots, X_{m+n-1}), \tag{1.2}$$

for a measurable function gn−1 : Sn → R. Next, conditioning on Fm+n−2, one obtains a function gn−2(Xm, . . . , Xm+n−2). Continuing in this way, one arrives at

$$E\big(f(X_m, X_{m+1}, \ldots, X_{m+n}) \mid \mathcal{F}_m\big) = E\big(g_1(X_m, X_{m+1}) \mid \mathcal{F}_m\big) = g_0(X_m). \tag{1.3}$$
On the other hand, a similar iteration (with m = 0) yields a computation of Ex f(X0, . . . , Xn). In this way one arrives at E(f(X0, X1, . . . , Xm) | σ(X0)) = g0(X0).
The consistent family of finite-dimensional distributions so obtained determines the distribution Px0 for every initial state x0. By integrating Px0 with respect to a probability measure μ(dx0) on (S, S) one arrives at the distribution Pμ of the Markov process on (S∞, S⊗∞) having the transition probability p(x, dy) and initial distribution μ(dx). In this notation Px0 ≡ Pδx0. An increasing sequence of sub-σ-fields (of F) F1 ⊆ F2 ⊆ · · · ⊆ Fm ⊆ · · · is called a filtration on (Ω, F). If, in addition, one has that each Xn is Fn-measurable, then we say that the process X is adapted to the filtration. This would be the case, for example, in the minimal case Fn = σ(X0, . . . , Xn). A stopping time with respect to the filtration is a nonnegative, possibly infinite, random variable τ defined on Ω such that

$$[\tau \le n] \in \mathcal{F}_n \qquad \forall\, n \in \mathbb{Z}_+ = \{0, 1, 2, \ldots\} \cup \{\infty\}. \tag{1.4}$$
If τ is a stopping time, then the pre-τ σ-field may be defined by

$$\mathcal{F}_\tau = \{A \in \mathcal{F} : A \cap [\tau \le n] \in \mathcal{F}_n\ \forall\, n \in \mathbb{Z}_+\}. \tag{1.5}$$
A typical example of a stopping time is the first hitting time of a set B ∈ S, defined by

$$\tau_B = \inf\{n \ge 0 : X_n \in B\}. \tag{1.6}$$
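As a concrete illustration of a one-step transition probability p(x, dy) and of the first hitting time (1.6), here is a minimal Python sketch (not from the text). The three-state chain, its transition matrix, and all numerical parameters are hypothetical choices made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# A hypothetical transition probability p(x, dy) on S = {0, 1, 2},
# encoded as a stochastic matrix: p[x, y] = p(x, {y}).
p = np.array([[0.5, 0.5, 0.0],
              [0.2, 0.5, 0.3],
              [0.0, 0.4, 0.6]])

def simulate_chain(x0, n_steps):
    """Simulate X_0, X_1, ..., X_n from the one-step transition probability."""
    path = [x0]
    for _ in range(n_steps):
        path.append(rng.choice(3, p=p[path[-1]]))
    return path

def hitting_time(path, B):
    """First hitting time tau_B = inf{n >= 0 : X_n in B} (inf if never hit)."""
    for n, x in enumerate(path):
        if x in B:
            return n
    return float("inf")

# Estimate E_0[tau_B] for B = {2} by Monte Carlo.
samples = [hitting_time(simulate_chain(0, 200), B={2}) for _ in range(2000)]
print("estimated E_0[tau_B]:", np.mean(samples))
```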
The following proposition summarizes some simple and important properties.

Proposition 1.1
(i) A nonnegative constant random variable τ is a stopping time.
(ii) If τ1 and τ2 are stopping times, then so are τ1 ∧ τ2 = min{τ1, τ2}, τ1 ∨ τ2 = max{τ1, τ2}, and τ1 + τ2.
(iii) If in addition τ1 ≤ τ2, then Fτ1 ⊆ Fτ2.

The importance of stopping times for Markov processes is in the following extension of the Markov property to the strong Markov property: the conditional distribution of the after-τ process Xτ+ = {Xτ, Xτ+1, Xτ+2, . . . } given Fτ is Px evaluated at x = Xτ, denoted PXτ = Px|x=Xτ, on the event [τ < ∞]. This means precisely that

$$E\big[\mathbf{1}_{A \cap [\tau < \infty]}\, g(X_\tau^+)\big] = E\big[\mathbf{1}_{A \cap [\tau < \infty]}\, (E_x\, g)\big|_{x = X_\tau}\big] \qquad \text{for all } A \in \mathcal{F}_\tau \text{ and all bounded measurable } g. \tag{1.7}$$

For a continuous parameter Markov process {Xt : t ≥ 0}, the (time-homogeneous) transition probabilities p(t; x, dy), t > 0, satisfy the Chapman–Kolmogorov equation

$$p(t + s; x, B) = \int_S p(t; y, B)\, p(s; x, dy), \qquad s, t > 0,\ B \in \mathcal{S}, \tag{1.8}$$
where S is the Borel σ-field of the Polish state space S. Note that the right side integrates Px(Xt+s ∈ B | Xs = y) = p(t; y, B) over the distribution p(s; x, dy) of Xs, given X0 = x. It is simple to see from this that if p(t; x, dy) is known for t ≤ h, for some h > 0, however small, then the entire family can be constructed. That is, one needs some infinitesimal condition for the transition probability, such as its "derivative" at t = 0, since the initial condition is given: p(0; x, dy) = δx(dy). This is explained in detail in the next chapter. Given the family p(t; x, dy), t > 0, x ∈ S, we consider the canonical construction of {Xt : t ≥ 0} on a suitable, but general, path space Ω. Such a path space is the set of all càdlàg functions on [0, ∞), namely, functions ω with values in the metric space S which are right continuous, i.e., ω(t+) = ω(t), and have left limits ω(t−), as s → t from the right or left, respectively. The process is given by the coordinate projection Xt = ω(t). We will study the topology on the space D([0, ∞) : S) ≡ D(0, ∞) := Ω of càdlàg functions induced by the Skorokhod metric on it. First, consider D([0, T]), the space of càdlàg functions on [0, T] for each T > 0. Let Λ be the set of all continuous strictly increasing functions λ from [0, T] onto [0, T]. Define the metric d on D([0, T]) as

$$d(f, g) = \inf\big\{\varepsilon > 0 : \sup\{|f(t) - g(\lambda(t))| : 0 \le t \le T\} \le \varepsilon \tag{1.9}$$

for some λ ∈ Λ satisfying

$$\sup\{|\lambda(t) - t| : 0 \le t \le T\} \le \varepsilon\big\}. \tag{1.10}$$
sup{|f (t) − g(λ(t))| : 0 ≤ t ≤ T } = sup{|f (λ−1 (t) − g(t)| : 0 ≤ t ≤ T },
and .
sup{|λ(t) − t| : 0 ≤ t ≤ T } = sup{|λ−1 (t) − t| : 0 ≤ t ≤ T }.
Finally, the triangle inequality follows from |f (t) − h(γ ◦ η(t))| ≤ |f (t) − g(η(t))| + |g(η(t)) − h(γ ◦ η(t))|,
.
and |γ ◦ η(t) − t| ≤ |γ ◦ η(t) − η(t)| + |η(t) − t| for all γ , η ∈ Λ, t ≥ 0.
.
1 A Review of Martingales, Stopping Times, and the Markov Property
5
Define a modulus w on D([0, T ]) as follows: For 0 ≤ t, s > 0, s + t < T , f ∈ D([0, T ]), w(f : [t, t + s)) = sup{|f (u) − f (v)| : t ≤ u < v < t + s}.
.
The following lemma2 plays an important role in deriving properties of the Skorokhod topology. Lemma 1 Let f ∈ D([0, T ]). Then, for every ε > 0, there exist k and 0 = t0 < t1 < · · · < tk = T such that w(f : [ti , ti+1 )) < ε for all i = 0, . . . , k − 1.
.
(1.11)
Proof Consider the supremum sˆ of all s, 0 ≤ s ≤ T , for which the assertion holds on [0, s]. Since f is right continuous at 0, sˆ > 0. Since f (ˆs − ) exists, [0, sˆ ] can be decomposed with the above property. Suppose, if possible, sˆ < T . But the right continuity of f at sˆ implies that the property holds on [0, sˆ + δ] for some δ > 0, leading to a contradiction. It follows from the lemma that for f ∈ D([0, T ]), given any ε > 0, there are at most finitely many points t in [0, T ] such that |f (t) − f (t−)| ≥ ε. Letting ε = 1/n, (n = 1, 2, . . . ), a countable dense subset of D([0, T ]) is the collection of those f , which take rational values at T and have constant rational values in intervals [ti (n), ti+1 (n))(i = 0, 1, . . . , k(n)) of the kind in the lemma. This leads to the following result. Proposition 1.3 (a) The space D([0, T ]) with the Skorokhod topology induced by the metric d is a separable metric space. (b) If f is continuous at t and fn → f in the metric d, then fn (t) converges to f (t), and if f is continuous on [0, T ], then the convergence is uniform on [0, T ]. (c) The relative topology of the set C = C[0, T ] of continuous functions on [0, T ] into S is that of uniform convergence on [0, T ]. (d) Let f ∈ D([0, T ]) and f (n) = f at the set of points t (r) = {ti (r), i = 0, . . . , k(r)}, r = 1, 2, . . . , n, as above, and joined by steps between neighboring points s, t for s ≤ u < t. Then, f (n) → f in the metric d. Proof The proofs of (a),(d) follow from the steps to the separability argument above. Part (b) follows from the definition of the metric d, and part (c) is a consequence (b). We now follow Billingsley’s argument3 to prove the following. Proposition 1.4 The projections Πt (1),...,t (k) on D([0, T ]) into S k , defined by Πt (1),...,t (k) (f ) = (f (t (1)), f (t (2)), . . . , f (t (k)))
.
2 See 3 See
Billingsley (1968), p. 110. Billingsley (1968), p.121.
6
1 A Review of Martingales, Stopping Times, and the Markov Property
are Borel measurable (from D([0, T [) into (S k , S ⊗k ) for all time points 0 ≤ t (1) ≤ · · · ≤ t (k) ≤ T , for all k. Proof Π0 is continuous (on D([0, T ]) into S) due to right continuity at 0, and ΠT is continuous at T by definition of d. It is enough to prove Πt is measurable for every t, 0 < t < T . Fix such a t. Let fn → f in the metric d. Then, fn → f at all points of continuity of f , i.e., all points outside an at most countable set. Also, fn are uniformly bounded. Therefore, as n → ∞, .
1 δ
[t,t+δ]
fn (s)ds →
1 δ
[t,t+δ]
f (s)ds,
for all δ > 0, (t + δ < T ).
This implies that 1δ [t,t+δ] f (s)ds is a continuous real-valued function in the Skorokhod topology and, by right-continuity of f at t, it converges to f (t) as δ ↓ 0. This proves the measurability of Πt . The space D([0, T [) is Polish under the Skorokhod topology. It is not complete under the metric d, but the topology under d is metrizable with a metric d0 under which D([0, T ]) is complete.4 One defines the space D([0, ∞)) of càdlàg functions on [0, ∞) into a Polish space S, with a metric d(f, g) =
.
1≤n 0. Let us now show that the finite dimensional distributions also have the Feller property. For this, let g(y1 , . . . , yk ) be a real-valued bounded continuous function on S k , and let 0 < t1 < · · · < tk ≤ T . Denote Ft = σ (Xs : 0 ≤ s ≤ t), t > 0. With
4 See
Billingsley (1968), pp. 113–116.
1 A Review of Martingales, Stopping Times, and the Markov Property
7
X0 = x, .
Ex ϕ(Xt1 , . . . , Xtk ) = Ex E(ϕ(Xt1 , . . . , Xtk )|Ftk−1 )) = Ex ϕ(Xt1 , . . . , Xtk−1 , y)p(tk − tk−1 ; Xtk−1 , dyk ) S
= Ex φ1 (Xt1 , . . . , Xtk−1 ), where ϕ1 (y1 , . . . , yk−1 ) =
ϕ(y1 , . . . yk−1 , yk )p(tk − tk−1 ; yk−1 , dyk )
.
S
at Xti = yi , i = 1, . . . , k − 1, which is a continuous function of yk−1 , for given y1 , . . . , yk−2 . But it is also continuous in the remaining variables y1 , . . . , yk−2 , by the continuity property of integration. Repeating this one arrives at Ex ϕ(Xt1 , . . . , Xtk ) = Ex φk−1 (Xt1 ) =
ϕk−1 (y1 )p(t1 ; x, dy1 ),
.
(1.12)
S
where ϕk−1 is a continuous function on S: ϕk−1 (y1 ) =
ϕk−2 (y1 , y2 )p(t2 − t1 ; y1 , dy2 ).
.
S
This proves the assertion that finite-dimensional distributions under Px are weakly continuous in x. We are now ready to prove the following result. Proposition 1.5 If the transition probabilities p(t; x, dy), t > 0, have the Feller property, then x → Px is weakly continuous on the Polish space S into the Polish space of all probability measures on D([0, T ]) and on D([0, ∞)). Proof Let φ be a bounded real-valued continuous function on D([0, T ]). By Proposition 1.3(d), there is a countable set of points in [0, T ] such that every f ∈ D([0, T ]) is a limit of functions f (n) , as n → ∞, whose values depend on the projection of D([0, T ]) onto a finite set of points in [0, T ], i.e., f (n) are finite dimensional. Choose and fix ε > 0. Let n(ε) be such that Ex φ = φ(f )dPx differs from Ex φn ≡ φ(f (n) )dPx by no more than ε/2 for n ≥ n(ε). Let x (k) → x as k → ∞. Choose k = k(ε) so that Ex (k) φ(n(ε)) differs from Ex φ(n(ε)) by no more than ε/2 for k ≥ k(ε). Then, Ex (k) φ differs from Ex φ by no more than ε for all k ≥ k(ε). This argument extends to Px on D([0, ∞)) in view of the definition of the metric .d(f, g) = 2−n dn (f, g), f, g ∈ D[0, ∞). 1≤n 1, then EMn ≤ q p E|Xn |p , where q is the p 1 1 conjugate exponent defined by q + p = 1, i.e., q = p−1 . Corollary 1.16 Let {Xt : t ∈ [0, T ]} be a right-continuous nonnegative {Ft }submartingale with E|XT |p < ∞ for some p ≥ 1. Then, MT := sup{Xs : 0 ≤ s ≤ T } is FT -measurable and, for all λ > 0, 1 .P (MT > λ) ≤ λp
p
[MT >λ]
XT dP ≤
1 p EXT , (p ≥ 1), λp
(1.25)
and for all p > 1, p
p
EMT ≤ q p EXT ,
.
(q =
p ). p−1
Proof Consider the nonnegative submartingale {Xj 2−n T : j = 0, 1, . . . , 2n }, for each n = 1, 2, . . . , and let Mn := max{Xj 2−n T : 0 ≤ j ≤ 2n }. For λ > 0, [Mn > λ] ↑ [MT > λ] as n ↑ ∞. In particular, MT is FT -measurable. By Theorem 1.14, P (Mn > λ) ≤
.
1 λp
p
[Mn >λ]
XT dP ≤
1 p EXT . λp
Letting n ↑ ∞, (1.25) is obtained. The second inequality is obtained similarly.
¹⁴ See BCPT, Proposition 1.4.  ¹⁵ See BCPT, Theorem 3.4.
Exercises
15
For the next Remark, recall that t → Xt is stochastically continuous if |Xt − Xs| → 0 in probability as s → t, (t ≥ 0). Remark 1.5 (Regularization of Continuous Parameter Submartingales) Let {Xt : t ≥ 0} be a submartingale such that t → Xt is stochastically continuous. Then there is a version {X̃t : t ≥ 0} of {Xt : t ≥ 0} having càdlàg sample paths almost surely (e.g., see Theorem 3.1, Chapter 3).
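The following is a minimal simulation sketch (not from the text) of the first inequality of Corollary 1.16, Doob's maximal inequality, for one simple nonnegative submartingale, Xt = |Bt| with B a discretized Brownian motion. The choice of process, of λ and p, and of the discretization are hypothetical, made only to illustrate how P(MT > λ) compares with E[XT^p]/λ^p.

```python
import numpy as np

rng = np.random.default_rng(1)
T, n_steps, n_paths, lam, p = 1.0, 1000, 20000, 1.5, 2

dt = T / n_steps
# X_t = |B_t| is a nonnegative submartingale; simulate discretized paths.
increments = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
B = np.cumsum(increments, axis=1)
X = np.abs(B)

M_T = X.max(axis=1)                    # (discretized) running maximum over [0, T]
lhs = np.mean(M_T > lam)               # estimate of P(M_T > lambda)
rhs = np.mean(X[:, -1] ** p) / lam**p  # estimate of E[X_T^p] / lambda^p

print(f"P(M_T > {lam}) ~= {lhs:.4f}  <=  E[X_T^{p}]/lambda^{p} ~= {rhs:.4f}")
```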
Exercises 1. (i) Prove (1.18). (ii) Provide a proof of Le Cam’s coupling approximation (1.19). [Hint: For i.i.d. uniformly distributed Uj , j ≥ 1, on (0, 1), define
Xj =
.
0 0 ≤ Uj < 1 − pj 1 1 − pj ≤ Uj < 1,
and define
Yj =
.
⎧ ⎪ 0 0 ≤ Uj < e−pj ⎪ ⎪ ⎪ ⎪ ⎨1 e−pj ≤ Uj < e−pj + pj e−pj ⎪ 2 e−pj + pj e−pj ≤ Uj < e−pj + pj e−pj + ⎪ ⎪ ⎪ ⎪ ⎩.. . etc.
pj2 −p j 2! e
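The coupling described in this hint can be coded directly: the same uniform Uj is pushed through the Bernoulli(pj) and Poisson(pj) quantile functions, so the two variables disagree only with small probability, which is what drives Le Cam's total variation bound. The sketch below (not from the text) uses hypothetical values of pj and SciPy's Poisson quantile function for the Yj construction.

```python
import numpy as np
from scipy.stats import poisson

rng = np.random.default_rng(2)
p_j = np.array([0.05, 0.10, 0.02, 0.08])   # hypothetical small success probabilities
n_trials = 200000

U = rng.uniform(size=(n_trials, len(p_j)))
X = (U >= 1.0 - p_j).astype(int)           # X_j ~ Bernoulli(p_j), as in the hint
Y = poisson.ppf(U, mu=p_j).astype(int)     # Y_j ~ Poisson(p_j), same uniforms (quantile coupling)

# Any coordinate mismatch bounds the chance that the sums of the X_j and Y_j differ,
# which in turn bounds the total variation distance appearing in (1.19).
mismatch = np.mean((X != Y).any(axis=1))
print("P(some X_j != Y_j) ~=", mismatch, "   sum of p_j^2 =", np.sum(p_j**2))
```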
2. Give a proof of Proposition 1.1. n 3. (a) Prove Theorem 1.7. (b) (Yule Process) Let T = ∪∞ n=0 {1, 2} , and view each v = (v1 , . . . , vn ) ∈ T, n ≥ 1, as an nth generation vertex of height |v| = n, in a binary tree rooted at the zeroth generation vertex θ ∈ {1, 2}0 , |θ | = 0. Next, let {Tv , v ∈ T} be a random field of i.i.d. exponential random variables with intensity parameter λ, indexed by T, and consider the set V1 (t) = |v|−1 |v| {v ∈ T : t ≥ 0, where −1 j =0 = 0, and j =0 Tv|j ≤ t < j =0 Tv|j }, v|j = (v1 , . . . , vj ), v ∈ T, j ≤ |v|. As the evolutionary set V1 (t) evolves, when an exponentially distributed clock rings for a vertex v, then it is removed and two offspring vertices v1, v2 are incorporated. The stochastic process Yt = |V1 (t)|, t ≥ 0, counting the number of (offspring) vertices at time t, is referred to as a Yule process (compare Definition 1.3), starting at Y0 = |V1 (0)| = 1, i.e., V1 (0) = {θ }. Note that the sample paths of Y are nondecreasing step functions with unit jumps. Let 0 = β0 < β1 < · · · denote the successive jump times of Y , i.e., βk = inf{t ≥ βk−1 : Yt = k + 1}. (i) Show that for arbitrary k ≥ 1, conditionally given β0 = 0 and given σ {β1 , . . . , βk−1 }, the increment βk − βk−1 is exponentially distributed with
parameter kλ. In particular, βk − βk−1 , k = 1, 2, . . . are independent random variables. [Hint: P (β1 > t) = e−λt , t ≥ 0. Next, P (β2 −β1 > t) = P (T (1) ∧ T (2) > t) = e−λt e−λt = e−2λt , and more generally for k ≥ 2, given β1 , . . . , βk−1 , βk − βk−1 is the minimum of k i.i.d. exponentially distributed random variables with intensity λ. ] (ii) Show that for fixed t ≥ 0, Yt has a (marginal) geometric distribution with parameter pt = e−λt . In particular, Y does not have stationary nor independent increments. [Hint: Compute the distribution of Yt by a similar method as used for the Poisson distribution, i.e., write P (Yt = k) = P (βk ≤ t < βk+1 ) as a difference of probabilities and apply integration by parts.] λs −1 (iii) Show that Y has the o.s.p. with Ft (s) = eeλt −1 , 0 ≤ s ≤ t. [Hint: Given [Yt = n + 1] there are n jump times β1 < β2 < · · · < βn . The joint pdf of (β1 , β2 − β1 , . . . , βn+1 − βn ) is the product (n + 1)!λn+1 e−λ
n+1 j =1 j tj
, tj
≥
0. Thus, by a change of n+1
n+1 e−λ j =1 j (sj −sj −1 ) = variable, (β1 , . . . , βn+1 ) has n pdf (n + 1)!λ (n + 1)!λn+1 e−λ{(n+1)sn+1 − j =1 sj } , 0 = s0 < s1 < · · · < sn+1 . Normalize by P (Yt = n + 1) and integrate sn+1 over [t, ∞) to obtain the desired λsj conditional pdf n! nj =1 eλeλt −1 , 0 < s1 < s2 < · · · < sn ≤ t.] (iv) Show that Y is a Markov process. [Hint: Apply Theorem 1.7 to Y − 1.] (v) Show that the limit W = limt→∞ e−λt Yt exists a.s. and is exponentially distributed with mean one. [Hint: Check that e−λt Yt , t ≥ 0, is a uniformly integrable, nonnegative martingale. In particular, supt Ee−2λt Yt2 = −λt supt e−2λt 2−e < ∞. Show that W = e−λβ1 (W (1) +W (2) ), in distribution, e−2λt where W (1) , W (2) are conditionally i.i.d. given β1 , and check that the unique positive, nonconstant, mean one solution to the corresponding equation for ϕ(r) = EerW , namely, after conditioning on Tθ = β1 , ϕ(r) = 1 Eϕ 2 (re−λβ1 ), ϕ(0) = 1, is ϕ(r) = 1−r , r < 1. For this, make a change 1 r 2 of variable, ϕ(r) = r 0 ϕ (u)du, ϕ(0) = 1. Differentiating it follows 1 that ϕ(r) = 1+cr , for a constant c. Finally, observe that 1 = EW = ϕ (0) = −c.]
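To accompany the Yule process exercise, here is a short simulation sketch (not from the text) based on part (i): the inter-jump times βk − βk−1 are drawn as independent exponentials with rates kλ, and the geometric marginal of part (ii) is checked empirically. The parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
lam, t, n_paths = 1.0, 1.5, 50000

def yule_at(t):
    """Y_t simulated via part (i): beta_k - beta_{k-1} ~ Exponential(rate k*lambda)."""
    k, clock = 1, rng.exponential(1.0 / lam)
    while clock <= t:
        k += 1
        clock += rng.exponential(1.0 / (k * lam))
    return k

samples = np.array([yule_at(t) for _ in range(n_paths)])

# Part (ii): Y_t should be geometric with success probability p_t = exp(-lambda*t),
# so P(Y_t = 1) = p_t and E[Y_t] = 1/p_t = exp(lambda*t).
p_t = np.exp(-lam * t)
print("P(Y_t = 1):", np.mean(samples == 1), "vs", p_t)
print("E[Y_t]    :", samples.mean(), "vs", 1.0 / p_t)
```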
Chapter 2
Semigroup Theory and Markov Processes
Semigroups associated with Markov processes are introduced as examples together with some general theory and basic properties, including the Hille– Yosida theorem.
To construct a discrete parameter homogeneous Markov process with state space (S, S), one only needs to specify the initial state, or initial distribution, and the one-step transition probability p(x, dy) (see BCPT,¹ pp. 187–189, or Bhattacharya and Waymire 2021, pp. 9–11). There is, in general, no such simple mechanism in the continuous parameter case to construct the transition probability p(t; x, dy). This is because the latter transition probabilities must satisfy the following requirements (Exercise 5):
.
Definition 2.1 Let (S, S) be a measurable space. A family of homogeneous continuous parameter transition probabilities consists of functions p(t; ·, ·), t > 0, on S × S with the following properties:
(i) For each t > 0 and each x ∈ S, B → p(t; x, B) is a probability measure on S.
(ii) For each t > 0 and each B ∈ S, x → p(t; x, B) is a measurable function from S into R (with respect to the σ-field S on S and the Borel σ-field B(R) on R).
(iii) (Chapman–Kolmogorov Equation) For all t > 0, s > 0, x ∈ S, B ∈ S,
¹ Throughout, BCPT refers to Bhattacharya and Waymire (2016), A Basic Course in Probability Theory.
$$p(t + s; x, B) = \int_S p(t; z, B)\, p(s; x, dz).$$
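As a quick numerical sanity check of (iii), the sketch below (not from the text) verifies the Chapman–Kolmogorov equation for the Brownian transition probability p(t; x, dy) with density N(x, t), integrating over the intermediate state by a simple quadrature rule; the particular values of s, t, x, and the set B are hypothetical.

```python
import numpy as np

def p_density(t, x, y):
    """Brownian transition density: the N(x, t) density evaluated at y."""
    return np.exp(-(y - x) ** 2 / (2 * t)) / np.sqrt(2 * np.pi * t)

s, t, x = 0.3, 0.7, 0.4
B = (0.0, 1.0)                           # the set B = [0, 1]
yy = np.linspace(B[0], B[1], 4001)

# Left side: p(t + s; x, B).
lhs = np.trapz(p_density(t + s, x, yy), yy)

# Right side: integral over z of p(t; z, B) p(s; x, dz).
zz = np.linspace(x - 10, x + 10, 4001)
p_t_z_B = np.array([np.trapz(p_density(t, z, yy), yy) for z in zz])
rhs = np.trapz(p_t_z_B * p_density(s, x, zz), zz)

print(lhs, rhs)   # the two sides agree up to quadrature error
```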
In view of the Chapman–Kolmogorov equation (iii) in Definition 2.1, one needs to determine the transition probability only on an arbitrarily small interval (0, t0). Indeed, in this chapter, we show that the derivative of the transition probability operator at t = 0 determines the transition probability for all times! We begin with the primary example for this chapter, Markov semigroups, and then consider a number of specific examples to further elucidate the nature and properties of naturally occurring semigroups. Example 1 (Markov Semigroups) Let (S, S) be a metric space with Borel σ-field S. Let p(t; x, dy) be a transition probability on S. Let B(S) denote the set of all real-valued bounded measurable functions on S. B(S) is a Banach space with respect to the "sup"-norm ||f|| = sup |f(x)|,
.
f ∈ B(S).
(2.1)
x∈S
Define the transition operator .Tt : B(S) → B(S), by (Tt f )(x) =
f (y)p(t; x, dy),
.
t > 0.
(2.2)
S
By property (iii) of the transition probability .p(t; x, dy), .{Tt : t > 0} is a oneparameter semigroup of (commuting) linear operators: Tt+s f = Tt (Ts f ), f ∈ B(S), i.e., Tt Ts = Ts Tt .
.
(2.3)
Also, each .Tt is a contraction: ||Tt f || ≤ ||f ||,
.
f ∈ B(S),
(2.4)
as is easily seen from (2.2), .Tt f (x) being the average of f with respect to the probability measure .p(t; x, dy). Note that taking .f = 1B , B ∈ S, implies the Chapman–Kolmogorov equations in the form p(t + s; x, B) =
p(t; x, dy)p(s; y, B)
.
S
=
p(s; x, dy)p(t; y, B), s, t ≥ 0, B ∈ S.
(2.5)
S
Example 2 (Initial Value Problem for a System of Linear Ordinary Differential Equations) Consider the system of linear equations
2 Semigroup Theory and Markov Processes
.
d u(t) ≡ u(t) ˙ = Au(t), dt
19
(u(t) = (u(1) (t), u(2) (t), . . . , u(m) (t))) t ≥ 0, (2.6)
with initial condition u(0) = f = (f (1) , f (2) , . . . , f (m) ) ∈ Rm ,
.
(2.7)
and A = ((qij )), a constant (i.e., time-independent) m × m matrix.
.
(2.8)
The unique continuous solution (on .[0, ∞)]) to (2.6), (2.7) is u(t) = etA f :=
∞ n t
.
n=0
n!
An f,
(2.9)
as is easily checked by term-by-term differentiation. Writing Tt f = etA f,
(2.10)
.
one has a one-parameter family .{Tt : t > 0} of linear operators on the finitedimensional Banach space .Rm satisfying the semigroup property Tt+s = Tt Ts ,
.
since .e(t+s)A = etA esA , as may be easily checked by expanding the three exponentials and using the binomial theorem. Equations such as (2.6) often arise as equations of motion for a dynamical system. More generally, if .X is a Banach space and A is a bounded linear operator on .DA = X (into .X ), then Tt f ≡ etA f
.
(t > 0, f ∈ X )
(2.11)
tA defines a one-parameter ∞ t n n semigroup of bounded linear operators .{Tt = e : t > 0}, tA with .e = n=0 n! A . In particular, .u(t) ≡ Tt f is a solution of the initial value problem
.
d u(t) = Au(t), u(0) = f, dt
t > 0.
(2.12)
If f is such that .Tt f → f as .t ↓ 0 (in the Banach space norm), then .u(t) = Tt f is the unique solution to (2.12), which is continuous at .t = 0 and satisfies .u(0) = f .
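The finite-dimensional case of (2.9)–(2.12) is easy to experiment with. The following Python sketch (not from the text) computes Tt f = e^{tA} f by the truncated exponential series for a hypothetical 3 × 3 matrix A and checks both the semigroup property and that u(t) = Tt f satisfies du/dt = Au; the matrix, vector f, and tolerances are illustrative choices only.

```python
import numpy as np

def expm_series(M, n_terms=60):
    """Truncated exponential series sum_n M^n / n!  (cf. (2.9))."""
    out, term = np.eye(M.shape[0]), np.eye(M.shape[0])
    for n in range(1, n_terms):
        term = term @ M / n
        out = out + term
    return out

A = np.array([[-2.0, 1.0, 1.0],      # hypothetical bounded operator on R^3
              [0.5, -1.0, 0.5],
              [1.0, 1.0, -2.0]])
f = np.array([1.0, 0.0, -1.0])

t, s, h = 0.4, 0.3, 1e-5
Tt = lambda u: expm_series(u * A)

# Semigroup property: T_{t+s} f = T_t (T_s f).
print(np.allclose(Tt(t + s) @ f, Tt(t) @ (Tt(s) @ f)))

# u(t) = T_t f solves du/dt = A u (2.12): compare a difference quotient with A u(t).
du = (Tt(t + h) @ f - Tt(t) @ f) / h
print(np.allclose(du, A @ (Tt(t) @ f), atol=1e-3))
```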
20
2 Semigroup Theory and Markov Processes
The next example illustrates u(t) = Tt f as a solution to an initial value problem in the context of a Markov chain on a countable state space S. Example 3 (Markov Chains) First, let S be a finite set, say S = {1, 2, . . . , n}, and let S be the class of all subsets of S. Suppose p(t; i, j) is a transition probability on S. Then,

$$(T_t f)(i) = \sum_{j \in S} f(j)\, p(t; i, j) \qquad (i \in S). \tag{2.13}$$
Taking .f = fj := 1{j } , the semigroup property implies the so-called Chapman– Kolmogorov equations for .p(t; i, j ). Namely, p(t + s; i, j ) =
.
p(s; i, k)p(t; k, j ) =
k
p(t; i, k)p(s; k, j ),
i, j ∈ S, t ≥ 0,
k
(2.14) or as matrix products for the transition probability matrix: p(t) = ((p(t; i, j )))ij ∈S , p(t + s) = p(s)p(t) = p(t)p(s).
.
(2.15)
Assume that for all .i, j , the function .t → p(t; i, j ) is continuous (from the right) at t = 0. This is equivalent to the property that the semigroup .Tt is strongly continuous on .C(S : R) (see Definition 2.6). In particular, for each pair .(i, j ), one has the finite limit
.
qij ≡ lim
.
t↓0
p(t; i, j ) − δij p(t; i, j ) − p(0; i, j ) = lim , t↓0 t t
(2.16)
so that by term-by-term differentiation at .t = 0 of (2.13), one has Af (i) =
n
.
qij f (j ).
(2.17)
j =1
Since n .
p(t; i, j ) = 1
(i ∈ S),
(2.18)
j =1
one also has (a)
n
.
j =1
qij = 0.
(2.19)
2 Semigroup Theory and Markov Processes
21
Also, since .p(t; i, i) ≤ δii = 1, .p(t; i, j ) ≥ δij = 0 if .i = j , (2.16) yields, qii ≤ 0,
(b)
.
qij ≥ 0 if i = j.
(2.20)
Thus, the matrix (operator) .A = ((qij ))i,j ∈S has the following properties: (i) Nonpositive diagonals: .qii ≤ 0, i ∈ S. (ii) Nonnegative off-diagonals: .qij ≥ 0, i = j, i, j ∈ S. (iii) Zero row sums: . j ∈S qij = 0. By uniqueness of the solution u(t) = etA f,
.
t ≥ 0,
to the initial value problem u(t) ˙ = Au(t), t > 0,
.
u(0) = f,
(2.21)
∞ m t Ar ). m!
(2.22)
one therefore has Tt f = etA f,
.
(etA =
m=0
In particular, letting .fj be the indicator of .{j }, p(t; i, j ) = Tt fj (i) = etA
.
(i,j )
.
(2.23)
Conversely: Proposition 2.1 Let .S = {1, 2, . . . , n}. Given any .n × n matrix .A = ((qij ))i,j ∈S satisfying (i)–(iii) above, (2.23) defines a semigroup of transition probabilities
+ .p(t; i, j ) such that .p (0 ; i, j ) = qij , i, j ∈ S. Proof Note that (2.20) implies p(t; i, j ) ≥ 0 for all i, j ∈ S,
.
and p(t; i, i) ≤ 1, for all i ∈ S, for all t > 0.
.
On the other hand, letting .f ≡ 1, one has .
j ∈S
p(t; i, j ) = (Tt f )(i).
(2.24)
22
2 Semigroup Theory and Markov Processes
But .Tt f ≡ 1 solves (2.21) for .u(0) ≡ 1 (in view of (2.19)). Hence, by uniqueness of the solution to (2.21), .
p(t; i, j ) = (Tt 1)(i) = 1.
(2.25)
j ∈S
By specializing (2.21) to the functions .fj = 1{j } , (or using (2.23) directly), one gets tqij + o(t)
p(t; i, j ) =
.
as t ↓ 0, i = j,
1 − tqii + o(t) as t ↓ 0, i = j.
(2.26)
This completes the proof.
Remark 2.1 In this context, the quantity .qij is referred to as the rate of transition from state i to state j .(i = j ); of course .qii is determined by .qij .(j = i) since
qii = −
.
qij .
{j :j =i}
The parameters .qij , i = j , and .−qii , i, j ∈ S will also be referred to as the infinitesimal parameters (or infinitesimal rates). A more illuminating description of .qij ’s will be given later. Next, assume that S is denumerable (i.e., infinite but countable). Let .A = ((qij )) be a matrix satisfying properties (i)–(iii). This implicitly requires that .
qij < ∞
for all i ∈ S.
(2.27)
{j :j =i}
Then, it is easily checked that Theorem 2.2 (Bounded Rates Theorem) Under the conditions (i)–(iii) on .A = ((qij ))i,j ∈S , one has that Af (i) =
.
qij f (j ), i ∈ S, f ∈ B(S),
j ∈S
defines a bounded linear operator on .B(S) if and only if .
sup{|qii | : i ∈ S} < ∞.
If (2.28) holds, then Tt f ≡ eAt f :=
∞ n n t A f
.
n=0
n!
,
f ∈ B(S),
(2.28)
2 Semigroup Theory and Markov Processes
23
defines a Markov semigroup whose transition probabilities are given by p(t; i, j ) = (etA )(i,j ) .
.
(2.29)
The proof of this theorem is entirely analogous to that of the finite (state space) case and left as an exercise. Remark 2.2 In case the matrix (operator) A is unbounded (i.e., (2.28) fails), then it maybe shown that there exists at least one sub-Markovian transition probability (i.e., . j p(t; i, j ) ≤ 1) satisfying (2.26). However, uniquenss is a more delicate affair (to be treated in Chapter 6). Our program in this chapter is to show that, under appropriate conditions, oneparameter semigroups on a Banach space are of the form (2.10), suitably defined and interpreted to allow unbounded operators A as well. Before pursuing this program, let us consider two more examples. Example 4 (Initial Value Problem for Parabolic PDE’s) The equation of heat conduction (or diffusion of a solute) in a homogeneous medium (without boundary) is .
∂u(t, x) = DΔx u (t > 0, x ∈ Rk ), ∂t
(2.30)
with the initial condition u(0, x) = f (x).
(2.31)
.
Here .u(t, x) is the temperature in the medium at time t at a point x (In the physical problem, .k = 3), .f (x) is the initial temperature at this point, and D is a positive constant. Here .Δx is the Laplacian Δx =
.
k ∂2 . ∂xj2 j =1
(2.32)
By taking Fourier transforms of both sides (as functions of x, for fixed t) of (2.30), one arrives at .
∂ u(t, ˆ ξ) ˆ ξ ) (t > 0, ξ ∈ Rk ) = −D|ξ |2 u(t, ∂t
(2.33)
whose general solution is u(t, ˆ ξ ) = c(ξ )e−tD|ξ | . 2
.
In view of (2.31), which implies (assuming f is integrable) .u(0, ˆ ξ ) = fˆ(ξ ),
(2.34)
24
2 Semigroup Theory and Markov Processes
c(ξ ) = fˆ(ξ ).
(2.35)
.
The expression (2.34) of .u(t, ˆ ξ ) as a product of Fourier transforms leads to the convolution formula u(t, x) = f ∗ ϕt ,
.
where .ϕˆt (ξ ) = e−tD|ξ | , i.e., .ϕ is the Gauss kernel 2
ϕt (x) = ( √
1
.
and, therefore, .u(t, x) =
4π Dt
|x|2
)k e− 4Dt ,
1 f (y)ϕt (x − y)dy = ( √ )k k 4π Dt R
Rk
f (y)e−
(2.36)
|y−x|2 4Dt
dy,
(2.37)
which is linear in f and may be expressed as u(t, ·) = Tt f (= f ∗ ϕt )
(2.38)
Tt+s f = Tt Ts f,
(2.39)
.
and satisfies .
since .ϕt+s = ϕt ∗ ϕs . The operation (2.37) can be extended to the Banach space B(Rk ) or to its subspace .C0 (Rk ) of all real bounded continuous functions on .Rk that vanish at infinity in the sense that given .ε > 0, there is a compact set .Kε ⊂ Rk , such that .|f (x)| < ε if .x ∈ Kεc (see Exercise 1).
.
Remark 2.3 Since continuous functions that vanish at infinity are also uniformly continuous (Exercise 3), the Banach space .C0 (Rk ) is sometimes a more convenient choice over .Cb (Rk ). The semigroup resulting from this example is variously referred to as a Gauss, normal, or heat semigroup. Note that .ϕt (y − x)dy = p(t; x, dy) is a transition probability (namely, that of a Brownian motion with diffusion coefficient 2D). Thus, Example 4 belongs to the class of Markov semigroups of Example 1. Also, when the Equation (2.30) is expressed in the form (2.33), the solution (in the Fourier frequency domain) is of the form 2 u(t, ˆ ξ ) = e−tD|ξ | fˆ(ξ ),
.
(2.40)
2 Semigroup Theory and Markov Processes
25
−tD|ξ |2 . In this sense, the semigroup .T may be i.e., .T t f is a multiplication of .fˆ by .e t viewed to be of the form .etDΔ in the sense that .etDΔ f is, in the frequency domain, ) ˆ −tD|ξ |2 fˆ(ξ ) = et A(ξ is the representation of .DΔ in the frequency .e f (ξ ), where .A domain (namely, multiplication by .−D|ξ |2 ). So, Example 4 also belongs to the class of Example 2 extended to unbounded A! A general method of extension is forthcoming in the form of the Hille–Yosida theorem.
Remark 2.4 Although (2.30) is the typical form of the heat equation in pde theory, in probability there is typically a factor of . 12 in front of .DΔ on the right-hand side. As a result, the 4 in the formula (2.36) is replaced by a 2. This is natural from the perspective of standardizing the dispersion rate as a variance parameter noted above. Let A be a linear operator defined on a linear subspace .DA (into .X ). The notion of a closed linear operator is as follows. Definition 2.2 A linear operator A having domain .DA in the Banach space .X is said to be closed if its graph .GA = {(f, Af ) : f ∈ DA } is a closed subset of .X × X in the product topology. Definition 2.3 Given a linear operator A defined on a linear space .DA ⊂ X , denote the operator .(λI − A) : DA → X simply by .λ − A. The resolvent set of A is then ρ(A) = {λ ∈ F : λ − A is one-to-one and onto, (λ − A)−1 is bounded},
.
where F is the scalar field (real .R or .C). ∅, then A is closed. Proposition 2.3 If .ρ(A) = Proof Suppose .λ ∈ ρ(A); then, .(λ − A)−1 is bounded on .X (into .X ) and, therefore, obviously closed. Hence, .λ − A is closed, which implies A is closed. Definition 2.4 For .λ ∈ ρ(A), .Rλ ≡ (λ − A)−1 is called the resolvent operator. Proposition 2.4 (The Resolvent Identity) Let .λ, μ ∈ ρ(A). Then, (λ − A)−1 − (μ − A)−1 ≡ Rλ − Rμ = (μ − λ)Rλ Rμ .
.
In particular, .Rλ and .Rμ commute. Proof Let .g ∈ X . There exists .f ∈ DA such that .(μ − A)f = g. Hence, .
((λ − A)−1 − (μ − A)−1 )g = [(λ − A)−1 − (μ − A)−1 ](μ − A)f
(2.41)
26
2 Semigroup Theory and Markov Processes
= (λ − A)−1 (μ − λ + λ − A)f − (μ − A)−1 (μ − A)f = (μ − λ)Rλ f + f − f = (μ − λ)Rλ Rμ g. Definition 2.5 A one-parameter contraction semigroup on .X is a family of maps {Tt : t > 0} on .X such that (i)–(iii) of the following hold.
.
(i) .Tt is a bounded linear operator on .X , for each .t > 0, (ii) .Tt+s = Tt Ts for all s, t > 0, (iii) .||Tt || ≤ 1 for all t > 0. The subspace .X0 = {f ∈ X : ||Tt f − f || → 0 as t ↓ 0} is called the center of the semigroup. Proposition 2.5 The center .X0 is a Banach space. Proof The main thing to check is that .X0 is closed, as the vector space properties are easily seen to be inherited. Let .f0 ∈ X0 . Then, there exists .δ > 0 such that .||Ttn f − f || > δ for a sequence .tn ↓ 0. If .||f − f0 || < δ/4, then ||Ttn f − f || ≥ ||Ttn f0 − f0 || − ||Ttn (f − f0 )|| − ||f − f0 || > δ −
.
δ δ δ − = . 4 4 2
Hence, .f ∈ X0 , proving closedness of .X0 . Thus, .X0 is a Banach space.
Definition 2.6 In case .X = X0 , the semigroup is said to be of class .C0 or strongly continuous on .X . The reason for the latter terminology is the following (also see Exercise 1): On [0, ∞), one has (take .T0 f = f )
.
t → Tt f is continuous for all f ∈ X0 .
.
(2.42)
For one has ||Tt+s f − Tt f || = ||Tt (Ts f − f )|| ≤ ||Ts f − f || → 0,
.
as .s ↓ 0. Although we have made use of contraction in the last inequality, all that is needed is that .||Tt || < ∞, which is of course true. It should also be emphasized that Proposition 2.6 .{Tt : t > 0} is a strongly continuous semigroup on .X0 . Proof All that needs to be checked is that the semigroup leaves .X0 invariant. That is, .f ∈ X0 ⇒ Tt f ∈ X0 . But this is proved by (2.42) and the display following it.
2 Semigroup Theory and Markov Processes
27
Definition 2.7 The infinitesimal generator A of a one-parameter semigroup .{Tt : t > 0} is defined by Af = lim
.
t↓0
Tt f − f t
(2.43)
for all .f ∈ X , for which this limit exists. The set of all such f is called the domain of the infinitesimal generator and denoted by .DA . To include its domain, the pair .(A, DA ) is often referred to as the infinitesimal generator. Clearly .DA ⊂ X0 . Theorem 2.7 Let .{Tt : t > 0} be a strongly continuous contraction semigroup on X . Then,
.
t → Tt f is continuous on .[0, ∞) for all f ∈ X . DA = X (i.e., A is densely defined). .DA coincides with the range of .Rλ for each .λ > 0. .Tt f ∈ DA for all f ∈ DA , .Tt and A commute, i.e.,
a. b. c. d.
. .
ATt f = Tt Af.
.
(2.44)
e. If .f ∈ DA ,
t
Tt f − f =
Ts Af ds.
.
(2.45)
0
f. A is a closed linear operator. g. If .Reλ > 0, then .λ ∈ ρ(A), and ||Rλ || ≤
.
∞
Rλ f =
1 , Reλ
e−λt Tt f dt
.
for all f ∈ X .
(2.46)
(2.47)
0
Proof Part (a) has been already proved (see proof of (2.42)). Since .t → ||Tt f − f || is bounded and continuous and .λe−λt dt converges weakly to the Dirac measure .δ0 as .λ → ∞ (.λ real), it follows that . for all f ∈ X , one has ||λ
∞
e
.
−λt
Tt f dt − f || = ||
0
∞
0 ∞
≤
(Tt f − f )λe−λt dt||
||Tt f − f ||λe−λt dt
0
→
0
∞
||Tt f − f ||δ0 (dt) = ||f − f || = 0.
(2.48)
28
2 Semigroup Theory and Markov Processes
(Note that .Tt → I strongly as .t ↓ 0, and one sets .T0 = I ). Next, write Sλ f =
.
∞
e−λt Tt f dt
(f ∈ X )
0
and note that .
∞ ∞ Th Sλ f − Sλ f 1 e−λt Tt+h f dt − e−λt Tt f dt} = { h 0 h 0 ∞ ∞ 1 −λ(t−h) e Tt f dt − e−λt Tt f dt} = { h h 0 h ∞ 1 λh −λt λh = {(e − 1) e Tt f dt − e Tt f dt} h 0 0 → λSλ f − f strongly as h ↓ 0.
In other words, .Sλ f ∈ DA for all f ∈ X and all .λ > 0 (or, in the complex cases, all .λ such that .Reλ > 0). Moreover, ASλ f = λSλ f − f,
.
or (λ − A)Sλ f = f
for all f ∈ X
.
(2.49)
Hence, .Sλ = (λ − A)−1 , which exists and is bounded: .||Sλ || ≤ λ1 . Thus, .Rλ = Sλ and ∞ −1 .Rλ f = (λ − A) f = e−λt Tt f dt, (f ∈ X ), (2.50) 0
proving (2.47). By (2.48), .
lim λRλ f = f
λ→∞
( for all f ∈ X ).
(2.51)
Since the range of .λRλ ≡ λ(λ − A)−1 is the domain of .λ(λ − A) and, therefore, of A, A is closed, proving (f). Now, .λRλ f ∈ DA for all .f ∈ X . Hence, the left side of (2.51) implies .D A = X , proving (b) and (c). If .X is a complex Banach space, one may similarly obtain ||Rλ f || ≤ ||f ||/Reλ
.
(λ such that Reλ > 0),
proving (g). Suppose .f ∈ DA , .t > 0. Then,
2 Semigroup Theory and Markov Processes
.
Th Tt f − Tt f Tt (Th f − f ) = → Tt Af h h
29
as h ↓ 0,
proving that (d) holds, i.e., .Tt f ∈ DA and (2.44), Finally, (e) follows from the fundamental theorem of calculus, just as in the case of real-valued functions. It is sufficient for our purposes to assume .X to be a real Banach space. The following is the most important theorem of this chapter. Theorem 2.8 (Hille–Yosida Theorem) A linear operator A on .X is the infinitesimal generator of a strongly continuous contraction semigroup if and only if a. A is densely defined. b. For all .λ > 0, one has .λ ∈ ρ(A) and .||Rλ || ≤ 1/λ. The proof of this theorem is deferred to Appendix A. Theorem 2.9 (The Abstract Cauchy Problem) Let .(A, DA ) be the infinitesimal generator of a contraction semigroup .{Tt : t ≥ 0} on .X . If .f ∈ DA , then .u(t) = Tt f is the unique (strongly) differentiable solution to the equation .
du(t) = Au(t) (0 ≤ t < ∞), dt
(2.52)
subject to the “initial condition” ||u(t) − f || → 0 as t ↓ 0.
.
(2.53)
Proof That .Tt f is a solution follows from Theorem 2.7, specifically (2.44), (2.45). To prove uniqueness, let .v(t) be any solution. Then, .s → Tt−s v(s) is (strongly) differentiable for .0 ≤ s ≤ t. Indeed, by (2.45), .
d (Tt−s v(s)) ds Tt−s−h v(s + h) − Tt−s v(s) = lim h→0 h Tt−s−h v(s + h) − Tt−s v(s + h) Tt−s v(s + h) − Tt−s v(s) } + = lim { h→0 h h dv(s) 1 t−s Tu Av(s + h)du} + Tt−s = lim {− h→0 ds h t−s−h = −Tt−s Av(s) + Tt−s Av(s) = 0,
since .v(s + h) ∈ DA and .v(s) satisfies (2.52). Hence, .s → Tt−s v(s) is independent of s. Set .s = 0, t to get .Tt v(0) = T0 v(t) = v(t), i.e., .Tt f = v(t).
30
2 Semigroup Theory and Markov Processes
Example 5 (Birth–Death with Two Reflecting Barriers) .S = {1, 2, . . . , n}, .qii = −λi .(1 ≤ i ≤ n), .qi,i+1 = λi μi .(1 ≤ i ≤ n − 1), .qi,i−1 = λi δi .(2 ≤ i ≤ n), .qij = 0 for all other pairs .(i, j ). Here, .μi > 0 and .δi > 0 for all .2 ≤ i ≤ n − 1, .μ1 = 1, .δn = 1, .μi + δi = 1, .λi > 0 for all i. Example 6 (Pure Birth) .S = {0, 1, 2, . . .}, .qii = −λi , .qi,i+1 = λi . for all i, where λi > 0 . for all i. If .λi is a bounded sequence, then there exists a unique Markov semigroup with infinitesimal parameters .A = ((qij )). An especially simple case is when .λi = λ . for all i. In this case,
.
⎡ 0 ⎢0 ⎢ B = ⎢0 ⎣ .. .
etA = e−λt etλB ,
.
+1 0 0 .. .
0 +1 0 .. .
0 0 +1 .. .
⎤ 0 ... 0 . . .⎥ ⎥ 0 . . .⎥ ⎦ .. . . . .
(2.54)
Note that .(Bf )(i) = f (i + 1), so that .(B r f )(i) = f (i + r), and p(t; i, j ) = (etA )i,j = e−λt (etλB fj )i fj (i) := δj i
.
= e−λt
=
0
∞ (λt)r r=0
r!
j −i e−λt (λt) (j −i)!
(B r fj )(i) = e−λt
∞ (λt)r r=0
r!
fj (i + r)
if j < i if j ≥ i
(2.55)
This provides an alternative approach in defining the homogeneous Poisson process. Definition 2.8 The semigroup .etA defined by (2.54) is referred to as the Poisson semigroup, and the Markov process is the homogeneous Poisson process, with intensity parameter .λ. Note: To accommodate the possibility that a Markov chain may leave the state space in finite time t, it is fruitful to allow substochastic transition probabilities: . S p(t; x, dy) < 1, and therefore, in Definition (2.1), replace (i) by (i ) : B → p(t; x, B) is a measure on S such that p(t; x, S) ≤ 1.
.
We permit the substochastic relaxation for the transition probabilities without further remark. In contrast, transition probabilities for which . S p(t; x, dy) = 1 for all .x ∈ S, t ≥ 0, are said to be stochastic or conservative. If one has .p(t; x, S) < 1 for some .t > 0, .x ∈ S, one may introduce a fictitious state .s∞ and consider the new state space .S = S ∪ {s∞ }, with the .σ -field .S = S ∪ {B ∪ {s∞ } : B ∈ S}, and transition probability
2 Semigroup Theory and Markov Processes
31
p(t; x, B) = p(t; x, B)
if x ∈ S, B ∈ S,
.
p(t; x, {s∞ }) = 1 − p(t; x, S)
if x ∈ S,
p(t; s∞ , {s∞ }) = 1
for all t ≥ 0.
(2.56)
It is then straightforward to check conditions (i), (ii), and (iii) of Definition 2.1. There are other ways of augmenting p also; some of these will be discussed later in this chapter. Example 7 (Convolution Semigroups) Let .S = Rk (k ≥ 1). Suppose .{Qt : 0 ≤ t < ∞} is a one-parameter family of probability measures on .Rk , i.e., on its Borel .σ -field, such that Qt ∗ Qs = Qt+s ,
.
lim Qt = δ0 ,
(2.57)
t↓0
where the limit is in the sense of weak convergence of probability measures:
f dQt →
.
f dδ0 = f (0)
ast ↓ 0,
for all real, bounded continuous f on .R. In a convenient abuse of notation, one may define linear operators Qt f (x) =
.
Rk
f (x + y)Qt (dy) =
Rk
f (y )p(t; x, dy ),
f ∈ Cb (Rk : Rk ), (2.58)
where as a Markov semigroup, writing .B − x = {y − x : y ∈ B}, .B ⊂ Rk , the transition probabilities (see (2.2)) are given by p(t; x, B) = Qt (B − x),
.
B ∈ B(Rk ), x ∈ Rk , t ≥ 0.
Then, Qt+s f (x) =
.
Rk
f (x + y)Qt+s (dy)
=
Rk
=
Rk
f (x + y)(Qt ∗ Qs )(dy) Rk
f (x + y1 + y2 )Qt (dy1 )Qs (dy2 )
(2.59)
32
2 Semigroup Theory and Markov Processes
=
Rk
(
=
Rk
Rk
f (x + y1 + y2 )Qs (dy2 ))Qt (dy1 )
(Qs f )(x + y1 )Qt (dy1 ) = (Qt Qs f )(x)
(2.60)
Therefore, .{Qt : 0 ≤ t < ∞} is a one-parameter semigroup on .Cb (Rk ). It is called a convolution semigroup. Since, by the weak convergence (2.57) Qt f (x) − f (x) =
.
Rk
[f (x + y) − f (x)]Qt (dy)
goes to zero uniformly in x, if f is uniformly continuous (and bounded), .{Qt : 0 ≤ t < ∞} is strongly continuous on .C0 (Rk ), the space of continuous functions vanishing at infinity. Convolution semigroups have already occurred in previous examples: y2
− √ 1 e 2σ 2 t dy, (see Remark 2.4). 2π σ 2 t n = e−λt (λt) Qt (R \ {nh : n = 0, 1, 2, . . .}) n! ,
1. .Qt (dy) =
2. .Qt ({nh}) nonzero number. (1)
(2)
= 0, where h is a given
(m)
Note that if .Qt , Qt , . . . , Qt are m convolution semigroups, then the semigroup (2) (m) corresponding to .Q(1) = Qt is another, i.e., t ∗ Qt ∗ . . . ∗ Qt .
(1)
(Qt
(2)
∗ Qt (1)
= (Qt
(1)
(m)
(2) (m) ∗ . . . ∗ Qt ) ∗ (Q(1) s ∗ Qs ∗ . . . ∗ Qs ) (2)
∗ Q(1) s ) ∗ (Qt (2)
(m)
∗ Q(2) s ) ∗ . . . (Qt (m)
= Qt+s ∗ Qt+s ∗ . . . ∗ Qt+s .
∗ Q(m) s ) (2.61)
It turns out that all convolution semigroups may be obtained from the Gauss (or normal) and Poisson semigroups as components and using (2.61) and taking limits. This is the topic of Chapter 5. Before concluding this chapter, let us briefly record the abstract forms of Kolmogorov’s backward and forward equations. The abstract equations are (see (2.44)) .
d Tt f = ATt f (backward equation), dt
d Tt f = Tt Af (forward equation). dt (2.62)
For Markov chains on countable state spaces S in Example 3, these take the form
Exercises
33
.
j ∈S
j ∈S
f (j )p (t; i, j ) =
qij
j ∈S
f (k)p(t; j, k), i ∈ S,
k∈S
f (j )p (t; i, j ) = ( qj k f (k))p(t; i, j ), i ∈ S.
(2.63)
j ∈S k∈S
Much will be discovered about these equations for important classes of Markov processes in forthcoming chapters.
Exercises 1. Consider the k-dimensional Gauss semigroup {Tt } defined by the transition 1 − 2t1 |y−x|2 probabilities p(t; x, dy) = dy in Example 4 with X = k e (2π t) 2
B(Rk ), with the sup-norm. Show that the center is the space of continuous functions on Rk that vanish at infinity. In particular, show that {Tt } is not strongly continuous as a semigroup on the space of bounded continuous functions. 2 2. In Example 4, let X = L2 (Rk ), with the L2 -norm: ||f || = Rk |f | dμ, μ
3.
4.
5.
6.
being Lebesgue measure on Rk . (a) Show that Tt defined on X is a contraction semigroup, with generator Δ. (b) Is this semigroup strongly continuous? (a) Show that the space of continuous functions on Rk that vanish at infinity is a separable Banach space. (b) Show that a continuous function on Rk that vanishes at infinity is uniformly continuous. (Positive Maximum Principle) Consider a strongly continuous Markov semigroup on the Banach space X ≡ C0 (S) of continuous functions vanishing at infinity on a locally compact metric space S, endowed with the sup-norm, having infinitesimal generator (A, DA ). Suppose that f ∈ DA , and 0 ≤ f (x0 ) = supx∈S f (x) for some x0 ∈ S. Show that Af (x0 ) ≤ 0. [Hint: Note that Tt f (x0 )−f (x0 ) = S f (y)(p(t; x0 , dy)−f (x0 ) ≤ f (x0 )(p(t; x0 , S)−1) ≤ 0, and use the definition of Af (x0 ).] Let (S, S) be the state space of a continuous parameter Markov process X = {Xt : t ≥ 0}. Assume that S is a Polish space and S is its Borel σ -field. Prove that the transition probabilities of the Markov process satisfy (i), (ii), and (iii) of Definition 2.1. [Hint: On a Polish space, one may assume the existence of regular conditional probabilities; see BCPT pp. 29–30, Breiman 1968, p78.] (Compound Poisson Process) Let N = {Nt : t ≥ 0} be a homogeneous Poisson process with intensity ρ > 0. Let Y1 , Y2 , . . . be an iid sequence of integervalued random variables, independent of {Nt : t ≥ 0}. Define an integer-valued
34
2 Semigroup Theory and Markov Processes
t compound Poisson process by Xt = N t ≥ 0, where N0 = 0, Y0 = j =0 Yj , x0 ∈ Z. (i) Prove that Xt , t ≥ 0, is a process with independent increments and is, therefore, Markov. (ii) Find the infinitesimal generator Q of the Markov process {Xt : t ≥ 0} in (i) in terms of the probability mass function f of Y1 , and the Poisson parameter, ρ > 0. 7. (Continuous Time Simple Symmetric Random Walk) Let the probability generating function f in Exercise 6 be that of symmetric Bernoulli, f (1) = f (−1) = 12 . (i) Compute the infinitesimal generator Q. (ii) Show that pii (t) = ∞ (m) t m 1 − ρt + 32 ρ 2 t 2 + O(t 3 ), as t ↓ 0. [Hint: pii (t) = m=0 qii m! , where Qm = ((qij(m) )).] 8. (Deterministic Motion) Consider the initial value problem ∂u(t,x) = μ · ∇u, ∂t k ∂ , (t > 0, x ∈ Rk ), with where μ = (μ1 , . . . , μk ) ∈ Rk , ∇ = j =1 ∂x j initial condition u(0, x) = f (x), f ∈ L1 (Rk ). Apply the Fourier method of Example 4, to interpret the semigroup etA for an unbounded operator A in this case. [Hint: This involves the Fourier transform of the point mass measure δa (dx) (see BCPT, p.82).] 9. (Resolvent of Brownian Motion) Show that the resolvent of the onedimensional standard√ Brownian motion semigroup can be expressed as Rλ f (x) = √1 R e− 2λ|y−x| f (y)dy. 2π 10. (Resolvent of the Poisson Process) Calculate the resolvent of the Poisson semigroup with intensity parameter ρ > 0. [Hint: Use a change ∞of variables to express the integral in terms of the Gamma function Γ (s) = 0 t s−1 e−t dt, noting that Γ (k + 1) = k! for nonnegative integers k.] 11. Consider an integer-valued continuous parameter Markov process. Show for bounded f that Rλ f (i) = p ˆ (λ)f (j ), where, for λ > 0, pˆ ij (λ) = ij j ∞ −λt e p (t)dt. ij 0 12. (Homogeneous Cox Process) The Cox process, or doubly stochastic Poisson process, X = {Xt : t ≥ 0} is defined for a given nonnegative random variable Λ with distribution μ as conditionally given ρ, a Poisson process with intensity Λ. (i) In the case EΛ2 < ∞, show that Var(Xt ) = EΛt + Var(Λ)t 2 . (ii) Use (i) to conclude that X cannot have independent increments if Λ is not a.s. constant. [Hint: Note homogeneity of increments and consider what that would imply for independent increments.] (iii) Show that X is not Markov unless Λ is a.s. constant. [Hint: Check the failure of the following special case of the semigroup property p00 (t +s) = p00 (t)p00 (s), s, t > 0, for this process, unless Λ is an a.s. constant. Assume the semigroup property and take s = t. Express the consequence in terms of Var(e−Λt ).]
Chapter 3
Regularity of Markov Process Sample Paths
This chapter provides general constructions of Markov processes on compact and locally compact separable state spaces from their Feller transition probabilities and arbitrary initial distributions.
In this chapter, Markov processes, having a locally compact, separable state space, are constructed with right-continuous sample paths having left limits, often referred to as càdlàg paths. An alternative construction is presented in Chapter 21 without topological restrictions on the state space beyond that of being a metric space; see Theorems 21.2 and 21.4. Consider the Poisson process .{Xt : t ≥ 0} with intensity parameter .λ > 0 in Definition 1.3 from Chapter 1. Although the sample paths are step functions with unit jumps, for any .ε > 0, and fixed t, one has a weak form of “continuity” in that .
lim P (|Xt − Xs | > ε) = 1 − e−λ|t−s| → 0 as s → t.
s→t
With this example in mind, the following is a natural definition. Definition 3.1 A stochastic process .{Xt } with values in a metric space .(S, d) is said to be continuous in probability, or stochastically continuous, at t if for each .ε > 0, .
lim P (d(Xt , Xs ) > ε) = 0.
s→t
If this holds for all t, then the process is said to be continuous in probability, or stochastically continuous.
© Springer Nature Switzerland AG 2023 R. Bhattacharya, E. C. Waymire, Continuous Parameter Markov Processes and Stochastic Differential Equations, Graduate Texts in Mathematics 299, https://doi.org/10.1007/978-3-031-33296-8_3
35
36
3 Regularity of Markov Process Sample Paths
Definition 3.2 Stochastic processes .{X˜ t } and .{Xt } defined on a common probability space .(Ω, F, P ) are said to be stochastically equivalent if .P (Xt = X˜ t ) = 1 for each .t ≥ 0. In this case, .{X˜ t } is also called a version of .{Xt }. Theorem 3.1 Let .{Xt } be a submartingale or supermartingale with respect to a filtration .{Ft }. Suppose that .{Xt } is continuous in probability at each .t ≥ 0. Then, there is a stochastic process .{X˜ t } stochastically equivalent to .{Xt }, such that with probability one, the sample paths of .{X˜ t } are bounded on compact intervals .a ≤ t ≤ b, .(a, b ≥ 0), and are right-continuous and have left-hand limits at each .t > 0, i.e., having càdlàg sample paths. Proof Fix a rational .T > 0, and let .QT denote the set of rational numbers in .[0, T ]. Write .QT = ∞ R n=1 n , where each .Rn is a finite subset of .[0, T ] and .T ∈ R1 ⊂ R2 ⊂ · · · . By Doob’s maximal inequality (see Chapter 1), one has P (max |Xt | > λ) ≤
.
t∈Rn
E|XT | , λ
n = 1, 2, . . . .
Therefore, P ( sup |Xt | > λ) ≤ lim P (max |Xt | > λ) ≤
.
n→∞
t∈QT
t∈Rn
E|XT | . λ
In particular, the paths of .{Xt : t ∈ QT } are bounded with probability one. Let (c, d) be any interval in .R and let .U (T ) (c, d) denote the number of upcrossings of (T ) (c, d) is the limit of the number .(c, d) by the process .{Xt : t ∈ QT }. Then, .U (n) .U (c, d) of upcrossings of .(c, d) by .{Xt : t ∈ Rn } as .n → ∞. By the upcrossing inequality (see BCPT1 Theorem 3.11), one has .
EU (n) (c, d) ≤
.
E|XT | + |c| . d −c
Since the .U (n) (c, d) are nondecreasing with n, it follows that .U (T ) (c, d) is a.s. finite. Taking unions over all .(c, d), with .c, d rational, it follows that with probability one, .{Xt : t ∈ QT } has only finitely many upcrossings of any finite interval. In particular, therefore, left- and right-hand limits must exist at each .t < T a.s. To construct a right-continuous version of .{Xt }, define .X˜ t = lims→t + ,s∈QT Xs for ˜ t } is in fact stochastically equivalent to .{Xt } now follows from .t < T . That .{X continuity in probability; i.e., .X˜ t = p−lims→t + Xs = Xt since a.s. limits and limits in probability must a.s. coincide. Since T is arbitrarily large, the proof is complete.
1 Throughout,
Theory.
BCPT refers to Bhattacharya and Waymire (2016), A Basic Course in Probability
3 Regularity of Markov Process Sample Paths
37
The above martingale regularity theorem can now be applied to the problem of constructing Markov processes with regular sample path properties. Let us recall : Definition 3.3 Markov transition probabilities .p(t; x, dy) are said to have the Feller property if for each (bounded) continuous function f , the function .x → f (y)p(t; x, dy) = Ex f (Xt ) is continuous. In this case, the corresponding S Markov process is referred to as a Feller process. Theorem 3.2 Let .{Xt : t ≥ 0} be a Feller Markov process with a locally compact and separable metric state space S with (Feller) transition probability function p(t; x, B) = P (Xt+s ∈ B | Xs = x), x ∈ S, s, t ≥ 0, B ∈ B(S),
(i)
.
such that for each .ε > 0, (ii)
.
p(t; x, B c (x : ε)) = o(1) as t → 0+ uniformly for x ∈ S.
Then, there is a stochastically equivalent version .{X˜ t } of .{Xt } a.s. having càdlàg sample paths at each .t < ζ∞ , where / S}, ζ∞ = inf{t > 0 : Xt ∈
.
is an extended real-valued random variable, referred to as the explosion time for the process .{Xt }, to be constructed in the proof. Proof Consider first the case that S is a compact metric space. It is enough to prove that .{Xt } a.s. has left- and right-hand limits at each t, for then .{Xt } can be modified as .X˜ t = lims−t + Xs , which by stochastic continuity will provide a version of .{Xt }. It is not hard to check stochastic continuity from (i) and (ii). Let .f ∈ C(S) be an arbitrary continuous function on S. Consider the semigroup .{Tt } acting on .C(S) by Tt f (x) = Ex (Xt ),
.
x ∈ S, t ≥ 0.
For .λ > 0, denote its resolvent by
∞
Rλ f (x) =
.
e−λs Ts f (x) ds,
x ∈ S.
0
Consider the process .{Yt } defined by Yt = e−λt Rλ f (Xt ),
.
t ≥ 0,
(3.1)
where .f ∈ C(S) is a fixed but arbitrary nonnegative function on S. Since .E|Yt | < ∞, .Yt is .Ft -measurable, .Tt f (x) ≥ 0 .(x ∈ S, t ≥ 0), and E{Yt+h |Ft } := e−λ(t+h) E{Rλ f (Xt+h )|Ft } = e−λ(t+h) Th Rλ f (Xt )
.
38
3 Regularity of Markov Process Sample Paths
= e
−λt
≤ e−λt
∞
e
−λ(s+h)
Ts+h f (Xt ) ds = e
0
−λt
∞
e−λs Ts f (Xt ) ds
h ∞
eλs Ts f (Xt ) ds = Yt ,
(3.2)
0
it follows that .{Yt } is a nonnegative supermartingale with respect to .Ft = σ {Xs s ≤ t}, .t ≥ 0. Applying the martingale regularity result of Theorem 3.1, we obtain a version .{Y˜ t } of .{Yt }, whose sample paths are a.s. right continuous with left-hand limits at each t. Thus, the same is true for .{λeλt Yt } = {λRλ f (Xt )}. Since .λRλ f → f uniformly as .λ → ∞, the process .{f (Xt )} must, therefore, a.s. have left- and right-hand limits at each t. The same will be true for any .f ∈ C(S), since one can write .f = f + − f − with .f + and .f − continuous nonnegative functions on S. So we have that for each .f ∈ C(S), .{f (Xt )} is a process whose left- and right-hand limits exist at each t (with probability one). As remarked at the outset, it will be enough to argue that this means that the process .{Xt } will a.s. have left- and righthand limits. Since S is a compact metric space, it has a countable dense subset .{xn }. The functions .fn : S → R defined by .fn (x) = ρ(xn , x), .x ∈ S, are continuous for the metric .ρ, and separate points of S in the sense that if .x = y, then for some n, .fn (x) = fn (y). In view of the above, for each n, .{fn (Xt )} is a process whose left- and right-hand limits exist at each t with probability one. From the countable union of events of probability 0 having probability 0, it follows that, with probability one, for all n the left- and right-hand limits exist at each t for .{fn (Xt )}. One may restrict the sample space to a set of sample points of probability one, such that .f (Xt ) has left- and right-hand limits for every sample point, for every n. But this means that the left- and right-hand limits exist at each t for .{Xt }, since the .fn ’s separate points; i.e., if either limit, say left, fails to exist at some .t , then the sample path − , contradicting the .t → Xt must have at least two distinct limit points as .t → t corresponding property for all the processes .{fn (Xt )}, .n = 1, 2, . . . . In this case, one has .ζ∞ = ∞ a.s., i.e., no explosion occurs since .sup{fn (x) : x ∈ S, n ≥ 1} < ∞. In the case that S is locally compact, one may adjoin a point at infinity, denoted .s∞ (∈ / S), to S. The topology of the one point compactification on .S¯ = S ∪ {s∞ } defines a neighborhood system for .s∞ by complements of compact subsets of S. ¯ Let .B(S) be the .σ -field generated by .{B, {s∞ }}. The transition probability function ¯ B∈ .p(t; x, B), .(t ≥ 0, x ∈ S, B ∈ B(S)) is extended to .p(t; ¯ x, B) .(t ≥ 0, x ∈ S, ¯ S)) ¯ by making .s∞ an absorbing state; i.e., .p(t; ¯ s∞ , B) = 1 if and only if .s∞ ∈ B, B( ¯ otherwise .p(t; .B ∈ B, ¯ s∞ , B) = 0. If the conditions of Theorem 3.2 are fulfilled for ¯ The basic Theorem 3.2 .p, ¯ then one obtains a regular process .{X¯ t } with state space .S. provides a process .{X˜ t t < ζ∞ } with state space S, whose sample paths are rightcontinuous with left-hand limits by restricting to .t < ζ∞ .
Exercises
39
Exercises 1.
2.
3. 4. 5.
(i) Prove that two stochastically equivalent processes have the same finitedimensional distributions. (ii) Prove that under the hypothesis of Theorem 3.2, {Xt } is stochastically continuous. (iii) Prove that, under the hypothesis of Theorem 3.2, λRλ f → f uniformly as λ → ∞. [Hint: Check that as λ → ∞, λe−λt dt converges weakly to δ0 (dt). Also, under the condition (ii), the supnorm ||Tt f − f || → 0 as t → 0.] Assume the framework of Theorem 3.2. Give conditions for u(t, x) = Ex u0 (X(t)), t ≥ 0, to solve the abstract Cauchy problem stated in Theorem 2.9. [Hint: Consider continuity and decay conditions on the initial data.] Show that there is a càdlàg version of the Poisson process defined by the transition probabilities in (2.55). Show that there is a càdlàg version of the process associated with the transition probabilities (2.59) defined by the convolution semigroup. (An Explosive Process) Let {Tn }∞ n=1 be i.i.d. mean one exponentially distributed random variables, and {λn }, a sequence of positive numbers. Xt = n−1 Define n −1 −1 0 if 0 ≤ t < λ−1 T , and X = n if λ T ≤ t < λ 1 t j j =1 j Tj for j =1 j 1 n = 2, 3, . . . . Let S = {0, 1, 2, . . . }. (i) Show that {Xt } is a Markov process. [Hint: Use induction on n to show for fixed 0 < s < t, P (Xt ≤ n|Xu , u ≤ s) = P (Xt ≤ n|Xs ) on [Xs = m] for each m = 0, 1, . . . , n, noting that σ (Xu , u ≤ s) = σ (Xs , λ−1 j Tj , j ≤ Xs ) −1 λ T .] and [Xs = m] = [Sm ≤ s < Sm+1 ], Sm = m j j =1 j 1 < ∞. Show that ζ = inf{t > 0 : Xt ∈ / S} = (ii) Assume ∞ ∞ ∞ −1 n=1 λn ∞ −1 n=1 λn Tn < ∞ with probability one. [Hint: Calculate E n=1 λj Tj .] (iii) (Yule Process) The Yule process may be defined by the choice λj = j λ, j =1, 2 . . . (compare Chapter 1, Exercise 3). Show, more gener∞ 1 ally, if = ∞, then {Xt } is nonexplosive. [Hint: Show that j =1 λ ∞ j1 E exp{− j =1 λj Tj } = 0.]
Chapter 4
Continuous Parameter Jump Markov Processes
Two problems are addressed in this chapter. The first is to compute the space-time structure of Markov processes on a general state space having piecewise constant sample paths. The second is to provide a construction of continuous parameter Markov processes having prescribed infinitesimal parameters, including a treatment of solutions to the associated Cauchy problem (backward differential equations). This includes Feller’s method of successive approximations to analytically construct the minimal solution to both the backward and forward equations as possibly substochastic transition probabilities having prescribed infinitesimal parameters and satisfying the Chapman–Kolmogorov equations.
Let .{Xt : t ≥ 0} be a Markov process on a measurable state space .(S, S), whose sample paths are piecewise constant between jumps. Such processes are said to be pure jump Markov processes. We begin by considering the joint distribution of the time spent in the initial state and its displacement at the time of the first jump. This analysis is facilitated by a combination of the Markov property and the pure jump sample path regularity. Unless otherwise stated, it will be assumed in this chapter that S is a locally compact and separable metric space with Borel .σ -field .S. In particular, a countable set S with the discrete topology provides an important special case. Theorem 4.1 Assume that .{Xt : 0 ≤ t < ∞} is a Markov process on a locally compact and separable metric space .(S, S) starting at .X0 = x, and let .T0 ≡ T0 (x) := inf{t > 0 : Xt = x}. Then, there is a nonnegative measurable function .λ on S such that .T0 has an exponential distribution with parameter .λ(x). © Springer Nature Switzerland AG 2023 R. Bhattacharya, E. C. Waymire, Continuous Parameter Markov Processes and Stochastic Differential Equations, Graduate Texts in Mathematics 299, https://doi.org/10.1007/978-3-031-33296-8_4
41
42
4 Continuous Parameter Jump Markov Processes
Moreover, .XT0 and .T0 are independent random variables. In the case .λ(x) = 0, the degeneracy of the exponential distribution can be interpreted to mean that x is an absorbing state, i.e., .Px (T0 = ∞) = 1. Proof Define .ψ(t, x, B) = Px (T (x) > t, XT (x) ∈ B), .x ∈ S, B ∈ S. Let .Xs+ denote the after-s process defined by (Xs+ )t = Xs+t ,
.
t ≥ 0.
Then, conditioning on .Fs = σ (Xu : 0 ≤ u ≤ s), one has .[T (x) > s] ∈ Fs , so that ψ(s + t, x, B) = Px (T (x) > s, T (Xs+ ) > t, XT (x) ∈ B)
.
= Px (T (x) > s, T (Xs+ ) > t, XT (Xs ) ∈ B) = Ex (1[T (x)>s] ψ(t, Xs , B)) = Ex (1[T (x)>s] ψ(t, x, B)), since .Xs = X0 = x on .[T (x) > s]. Letting .t ↓ 0 in (4.1), since .ψ(t, x, B)) ↑ Px (XT (x) ∈ B) and .ψ(s + t, x, B) ↑ ψ(s, x, B), one has Px (T (x) > s, XT (x) ∈ B) = Px (T (x) > s)Px (XT (x) ∈ B),
.
(4.1)
proving independence. Next, observe that .ψ(t, x, S) = Px (T (x) > t), and apply the last line of (4.1) to .B = S, to get Px (T (x) > s + t) = Px (T (x) > s)Px (T (x) > t).
.
(4.2)
The only nonnegative, right continuous solutions to this equation are of the form Px (T0 (X) > t) = e−λ(x)t ,
.
t ≥ 0,
for a constant .λ(x) (Exercise 1). Since .t → Px (T0 (X) > t) is nondecreasing, λ(x) ≥ 0. The measurability of .x → λ(x) follows from that of .x → Px (T0 (X) > t).
.
Remark 4.1 For the specification of the infinitesimal generator A of the Markov process in Theorem 4.1, let .B(S) denote the Banach space of bounded measurable functions on S endowed with the sup norm, and let .Tt (t ≥ 0) be the semigroup of transition operators on .B(S). Then, for .f ∈ B(S), denoting by K the transition operator of the Markov chain with transition probability .k(x, dy),
Tt f (x) = exp{−λ(x)t}f (x) +
(λ(x) exp{−λ(x)u}f (y)k(x, dy)du
.
(0,t] S
= exp{−λ(x)t}f (x) + (1 − exp{−λ(x)t})Kf (x)).
(4.3)
4 Continuous Parameter Jump Markov Processes
43
For .f ∈ DA , using .
Af (x) = lim
.
t↓
exp{−λ(x)t} = 1 − λ(x)t = o(t),
Tt f (x) − f (x) = −λ(x)f (x) + λ(x)Kf (x) (f ∈ DA ), t
(4.4)
the domain .DA comprising all f for which the limit holds uniformly on S. In particular, if .λ(x) is bounded on S, then A is a bounded operator on .B(S). If .λ(x) is continuous and .x → k(x, dy) is weakly continuous, then it follows from the above that .x → p(t; x, dy) is weakly continuous for all .t > 0. However, as stated above, in the case .λ(x) is bounded, no weak continuity condition is required for the Markov process generated by A. Note that on countable state spaces S, which are the most common and important in many applications, the continuity properties are automatic under the discrete topology, since every function on S is continuous. Finally, if .λ(x) is unbounded, explosion may occur. Also, .k(x, dy) may be a subprobability kernel, i.e., .k(x, S) may be less than 1 for some x. For the following corollary to Theorem 4.1, let .
τ0 ≡ τ0 (X) := 0, T0 ≡ T0 (X) := τ1 ,
τn ≡ τn (X) := inf{t > τn−1 : Xt = Xτn−1 }, Tn ≡ Tn (X) := τn+1 − τn
n ≥ 1.
(4.5)
Corollary 4.2 Let .{Xt : t ≥ 0} be a continuous parameter Markov process on a locally compact and separable metric space having the infinitesimal parameters .λ(x), k(x, dy) and initial state x. Then, .{Yn := Xτn : n = 0, 1, 2, . . .} is a discrete parameter Markov process having the transition probabilities .k(x, dy). Also, conditionally given .{Yn : n ≥ 0}, the holding times .T0 (X), .T1 (X), .T2 (X), . . . are independent exponentially distributed random variables with parameters .λ(x0 ), .λ(x1 ), .λ(x2 ), . . . on the event .[Y0 = x0 , Y1 = x1 , Y2 = x2 , . . . ]. Proof Theorem 4.1 implies that .T0 (X) and the after-.T0 (X) process, .XT+0 , are independent by the strong Markov property (Exercise 2). The discrete parameter Markov process .{Yn } defined in Corollary 4.2 is typically referred to as the skeleton process. Next, let us show that conversely given the infinitesimal parameters .λ(x), k(x, dy), there is a semigroup of, possibly substochastic, transition probabilities .p(t, x, dy) satisfying the backward and forward equations for these parameters (see (2.63) in Chapter 2). Let A denote the infinitesimal generator of the Markov process .{Xt }, and let .p(t; x, dy) be its transition probability. For .t > 0, C ∈ S, the backward and forward equations take the form
44
4 Continuous Parameter Jump Markov Processes
∂p(t; x, C) = −λ(x)p(t; x, C) + p(t; x1 , C)λ(x)k(x, dx1 ), [B] : ∂t S . ∂p(t; x, C) = − λ(x1 )p(t; x, dx1 ) + λ(x1 )p(t; x, dx1 )k(x1 , C), [F ] : ∂t C S (4.6) with p(0+ ; x, dx1 ) = δ{x} (dx1 ), x ∈ S.
.
(4.7)
These may be expressed in equivalent mild or integrated form as follows (Exercise 4): For .C ∈ S, .t > 0, [B] : p(t; x, C)=e−λ(x)t δ{x} (C)+λ(x)
t
e−λ(x)t1 k(x, dx1 )p(t−t1 ; x1 , C)dt1 ,
.
[F ] : p(t; x, C)=e−λ(x)t δ{x} (C)+ where .k(t1 ; x1 , C) =
0
t 0
S
(4.8) λ(x1 ) k(t1 ; x1 , C)p(t−t1 ; x, dx1 )dt1 ,
S
e−λ(x2 )t1 k(x1 , dx2 ), x1 ∈ S, C ∈ S.
(4.9)
C d Tt f (x) = Remark 4.2 Let .f = 1C . Then, (4.6[B]) is the backward equation . dt (ATt f )(x), formally applied to f . Similarly, (4.6[F]) is the forward equation d . dt Tt f (x) = (Tt Af )(x) applied to .f = 1C . However, the forward equation is meant generally to describe the evolution of an initial probability distribution .μ(dx) = f (x)ν(dx), say. Applied to such a function f , the forward equation looks at the evolution of its density. More generally, if .μ(dx) is a probability measure on .(S, S), then defining
μ(t, C) =
μ(dx)p(t; x, C),
.
t ≥ 0, C ∈ S;
S
then integrating both sides of (4.6[F]) with respect to .μ(dx), the forward equations take a form familiar in physics and referred to as the Fokker-Planck, or continuity equation, governing how the rate at which probability mass in a region C of the state space changes in time as the result of spatial movement of a jump Markov process. Namely, for .C ∈ S, t > 0, ∂μ(t, C) =− . ∂t
λ(x1 )μ(t, dx1 ) +
C
=−
λ(x2 )k(x2 , C)μ(t, dx2 ) S
λ(x)k(x, C c )μ(t, dx) + C
λ(x)k(x, C)μ(t, dx). (4.10) Cc
4 Continuous Parameter Jump Markov Processes
45
The steady state .μ(t, C) = μ(C), t ≥ 0, if there is one, corresponds to a choice of μ(dx) such that . λ(x)k(x, C c )μ(dx) = λ(x)k(x, C)μ(dx), C ∈ S.
.
Cc
C
We will proceed in two steps: (i) Show that there is a smallest nonnegative solution to [B] and [F], in a sense to be defined, and (ii) show that this solution satisfies the Chapman–Kolmogorov equations, i.e., semigroup property p(s + t; x, C) =
p(s; x, dx1 )p(t; x1 , C)
.
S
=
p(t; x, dx1 )p(s; x1 , C), x ∈ S, s, t ≥ 0,
(4.11)
S
that defines a corresponding Markov process. The approach to (i) was introduced by Feller (1940) based on a method of successive approximations that may be interpreted as truncating1 the number of jumps to occur prior to time t. Proposition 4.3 (Feller’s Method of Successive Approximations) Given any infinitesimal parameters .λ(x), k(x, dy), there exists a smallest nonnegative solution .p(t; x, dy) of the backward equations [B] satisfying the initial condition (4.7). Moreover, .p(t; x, dy) solves the forward equations [F] with (4.7), as well. Proof To solve [B], start with the first approximation p(0) (t; x, dx1 ) = e−λ(x)t k (0) (x, dx1 )
.
(x ∈ S, t ≥ 0)
(4.12)
where k (0) (x, dx1 ) = δ{x} (dx1 ),
.
(4.13)
and compute successive approximations, recursively for .n = 1, 2, . . . , .C ∈ S, by .p
(n) (t; x, C) = p (0) (t; x, C) +
t 0
S
λ(x)e−λ(x)(t−t1 ) p (n−1) (t1 ; x1 , C)k(x, dx1 )dt1 .
(4.14) Then, .p(0) (t; x, C) ≤ p(1) (t; x, C) and, by induction, p(n−1) (t; x, C) ≤ p(n) (t; x, C),
.
n = 1, 2, . . . .
1 This method is also effective in the analysis of mild forms of the incompressible Navier-Stokes equations in Fourier space by (semi-Markov) probabilistic methods introduced by Le Jan and Sznitman (1997); also see Bhattacharya and Waymire (2021) and Dascaliuc et al. (2023a,b).
46
4 Continuous Parameter Jump Markov Processes
Thus, for any .x ∈ S, C ∈ S, p(t; x, C) = lim p(n) (t; x, C)
.
n→∞
(4.15)
exists and 0 ≤ p(t; x, C) ≤ p(t; x, S) ≤ 1,
x ∈ S, C ∈ S, t ≥ 0.
.
This latter upper bound follows from .p(0) (t; x, S) ≤ e−λ(x)t , so that .p(1) (t; x, S) ≤ 1. Thus, inductively, p(n+1) (t; x, S) ≤ e−λ(x)t +
t
.
λ(x)e−λ(x)s ds = 1.
0
That .p(t; x, C) solves the backward equations follows from Lebesgue’s monotone convergence theorem. To prove minimality, suppose that .g(t; x, C), x ∈ S, C ∈ S, is another solution to [B]. Then, .g(t; x, C) ≥ p(0) (t; x, C), and inductively comparing [B] to (4.14), .g(t; x, C) ≥ p(n) (t; x, C) for all .n ≥ 1. Minimality follows by passing to the limit as .n → ∞. To prove that the forward equations [F] are also satisfied for .p(t; x, dx1 ), we apply a similar truncation of the number of jumps prior to time t but consider the recursion derived in terms of the time of the last jump (see Remark 4.3). Using the same notation for the new recursive probabilities, start the recursion with (4.7), but modify the recursion for .C ∈ S, .n ≥ 1, by p
.
(n)
(t; x, C) = p
(0)
(t; x, C) +
t 0
λ(x1 ) k(t − t1 ; x1 , C)p(n−1) (t1 ; x, dx1 )dt1 .
S
(4.16) Again, while the same notations for the recursions (4.14) and (4.16) are used, they define distinct recursions. As above, one may check, inductively, that .0 ≤ p(n) (t; x, C) ≤ p(n+1) (t; x, C) for all .x ∈ S, C ∈ S, .t ≥ 0, .n = 0, 1, . . . . Thus, φ(t; x, C) = lim p(n) (t; x, C)
.
n→∞
exists and solves the forward equation [F] by the monotone convergence theorem. To prove that, in fact, φ(t; x, dy) = p(t; x, dy),
.
we show by induction that, in fact, φ (n) (t; x, C) = β (n) (t; x, C),
.
x ∈ S, t ≥ 0, C ∈ S, n = 0, 1, 2 . . . ,
(4.17)
4 Continuous Parameter Jump Markov Processes
47
where .β (n) (t) and .φ (n) (t) denote the nth approximate solutions to the backward and forward recursions, respectively. The case .n = 0 is true by definition. Assume that (4.17), [B] and [F], hold for .n ≤ m. Then, starting with [F] and using the induction hypothesis, one has .
φ (m+1) (t; x, C) = φ (0) (t; x, C) + = φ (0) (t; x, C) + +
t 0
S
t S
C
0
S
C
t
t−t1
0
C
λ(x1 )e−λ(x2 )(t−t1 ) k(x1 , dx2 )β (m) (t1 ; x, dx1 ))dt1
0
e−λ(x2 )(t−t1 ) k(x1 , dx2 )δ{x} (dx1 )λ(x)e−λ(x)t1 dt1
e−λ(x2 )t1 λ(x1 )k(x1 , dx2 )e−λ(x)t2 λ(x)k(x, dx3 )
S
×β (m−1) (t − t1 − t2 ; x3 , dx1 )dt2 dt1 t = φ (0) (t; x, C) + e−λ(x2 )(t−t1 ) λ(x)e−λ(x)t1 k(x, dx2 )dt1 +
t 0
S
C
S
0 t
C
e−λ(x2 )t1 e−λ(x)(t2 −t1 ) λ(x1 )k(x1 , dx2 )λ(x)k(x, dx3 )
t1
×β (m−1) (t − t2 ; x3 , dx1 )dt2 dt1 = F1 + F2 + F3 .
(4.18)
On the other hand, starting with [B], the induction hypothesis yields .
β (m+1) (t; x, C) =β
(0)
(t; x, C) +
t 0
= β (0) (t; x, C) + +
t
t t 0
S
0
S
0
e−λ(x)t1 λ(x)k(x, dx1 )φ (m) (t − t1 ; x1 , C)dt1 S
S
e−λ(x)t1 λ(x)k(x, dx1 )e−λ(x1 )(t−t1 ) δx1 (C)dt1
e−λ(x)t1 λ(x)k(x, dx1 )λ(x2 )e−λ(x3 )t2 k(x2 , dx3 )
C
×φ (m−1) (t − t1 − t2 ; x1 , dx2 )dt2 dt1 t = β (0) (t; x, C) + e−λ(x)t1 λ(x)k(x, dx1 )e−λ(x1 )(t−t1 ) dt1 +
t 0
S
C
S
0 t
C
e−λ(x)t1 λ(x)k(x, dx1 )λ(x2 )e−λ(x3 )(t2 −t1 ) k(x2 , dx3 )
t1
×φ (m−1) (t − t2 , x1 , dx2 )dt2 dt1 = B1 + B2 + B3 .
(4.19)
48
4 Continuous Parameter Jump Markov Processes
One readily sees by definition that .F1 = B1 . Also .F2 = B2 involves renaming integration variable .x2 as .x1 and a change of variables .t − t1 → t1 . Checking that .F3 = B3 is similarly accomplished, using the induction hypothesis, renaming space variables, and change of variables in time, respectively. Thus, one obtains term by term agreement between the right sides of (4.18) and (4.19). Remark 4.3 Note that .p(0) (t; x, dx1 ) is the distribution of the process starting at x, until the time of the first jump. Similarly, .p(1) (t; x, dx1 ) is the distribution of the process until the time of the second jump, and so on. Theorem 4.4 Given any infinitesimal parameters .λ(x), k(x, dy), there exists a smallest nonnegative solution .p(t; x, dy) of the backward equations [B] and the forward equations [F], for the initial condition (4.7). This solution satisfies the Chapman–Kolmogorov semigroup equations (4.11) and the Feller property .limt→0+ p(t; x, dx1 ) = δ{x} (dx1 ), .x ∈ S. However, p(t; x, dx1 ) ≤ 1
.
for all x ∈ S, all t ≥ 0.
(4.20)
S
In case equality holds in this last inequality (4.20) for all .x ∈ S and .t ≥ 0, there does not exist any other nonnegative solution of [B], nor [F], and satisfying the initial condition (4.7). Proof The existence of the minimal solution has been settled. Our task is to prove the Chapman–Kolmogorov equations for the the minimal solution .p(t; x, dy) to the backward equations for these parameters. This is accomplished as follows. Define, for .C ∈ S, d
.
(n)
(t; x, C) =
δx (C)e−λ(x)t
if n = 0
β (n) (t; x, C) − β (n−1) (t; x, C).
if n ≥ 1.
(4.21)
The strategy is to prove for .s, t ≥ 0, β (n) (t + s; x, C) =
n
.
d (m) (s; x, dx1 )β (n−m) (t : x1 , C).
(4.22)
S m=0
Then, the Chapman–Kolmogorov equations will follow in the limit as .n → ∞ using monotone convergence and the telescoping sum p(t; x, C) =
∞
.
m=0
d(t; x, C),
x ∈ S, t ≥ 0.
4 Continuous Parameter Jump Markov Processes
49
The proof of (4.22) is by induction. It is obvious for .n = 0. To simplify notation for the backward iteration, let ε(t; x, C) =
λ(x)e−λ(x) k(x, C)
if λ(x) > 0,
0
otherwise.
.
Then, (4.14) can be expressed as β
.
(n+1)
(t; x, C) = e
−λ(x)t
δ{x} (C) +
t 0
ε(t − t1 , x, dx1 )β (n) (t1 , x1 , C), n ≥ 0.
S
As a lemma, one may also check by induction and the recursion for [B] (Exercise 5), d (m+1) (t; x, C) =
t
ε(t1 ; x, dx1 )d (m) (t − t1 ; x1 , C)dt1 m ≥ 0.
.
0 .
β (n+1) (s + t; x, C) = δx (C)e−λ(x)(s+t) + = δx (C)e−λ(x)(t+s) + +
s 0
S
+
0
S
=
S
+ =
S
s 0
S
0
S
s
S
t+s
ε(t2 ; x, dx2 )β (n) (s + t − t2 ; x2 , C)dt2 ε(t2 ; x, dx2 )β (n) (s + t − t2 ; x2 , C)dt1
t+s s
S
ε(t2 ; x, dx2 )β (n) (s + t − t2 ; x2 , C)dt2
ε(t2 ; x, dx2 )β (n) (s + t − t2 ; x2 , C)dt2
= δx (C)e−λ(x)(t+s) + +
s+t
ε(t2 ; x, dx2 )β (n) (s + t − t2 ; x2 , C)dt2
= δx (C)e−λ(x)(t+s) + s
(4.23)
S
S
t
δx (dx1 )e−λ(x)s
0
S
ε(t − t1 ; x1 , dx2 )β (n) (t1 ; x2 , C)dt1
ε(t2 ; x, x2 )β (n) (s + t − t2 ; x2 , C)dt2
δx (dx1 )e−λ(x)s {δx1 (C)e−λ(x1 )t +
t 0
S
ε(t − t1 ; x1 , dx2 )β (n) (t1 ; x2 , C)dt1 }
n s { ε(t2 ; x, dx2 )d (m) (s − t2 ; x1 , dx2 )}dt2 β (n−m) (t; x1 , C) S m=0
0
S
δx (dx1 )e−λ(x1 )s β (n+1) (t; x1 , C)) +
n S m=0
d (m+1) (s; x, dx1 )β (n−m) (t; x1 , C)
50
4 Continuous Parameter Jump Markov Processes
=
=
S
d (0) (s; x, dx1 )β (n+1) (t; x1 , C) +
n+1 S m=0
n+1 S m=1
d (m) (s; x, dx1 )β (n+1−m) (t; x1 , C)
d (m) (s; x, dx1 )β (n+1−m) (t; x1 , C)
(4.24)
This completes the induction, and the Chapman–Kolmogorov equations now follow in the limit as .n → ∞. To allow for the possibility that .p(t; x, S) < 1, one may adjoin a fictitious state, denoted .s∞ , to S and extend .p(t; x, S) < 1 to S = S ∪ {s∞ },
.
S = S ∨ σ ({s∞ }),
by p(t; x, {s∞ }) = 1 − p(t; x, S), x ∈ S,
.
p(t; s∞ , {s∞ }) = 1.
Then, using the Kolmogorov extension theorem, one may construct a Markov process .X = {Xt : t ≥ 0} on a probability space .(Ω, F, P ), having state space .(S, S) and transition probabilities .p(t; x, dy). Alternatively, one may use Kolmogorov’s extension theorem to construct a sequence of nonnegative mean one exponentially distributed random variables .{T n : n ≥ 0} and, independently, a Markov chain .{Yn : n = 0, 1, . . . } on S with transition probability kernel .k(x, dy). Let Tm = λ−1 (Ym )T m , m ≥ 1.
.
Then, .X may be constructed as Xt =
.
Y Nt
for finite Nt
s∞
otherwise
(4.25)
where Nt = inf{n ≥ 0 :
n
.
Tm > t}.
(4.26)
m=0
That the two processes are stochastically equivalent follows by checking that their finite dimensional distributions coincide (Exercise 3). Definition 4.1 The Markov process, .X or its restriction to .0 ≤ t < ζ , where / S}, ζ := inf{t ≥ 0 : X t ∈
.
4 Continuous Parameter Jump Markov Processes
51
is referred to as the minimal process. The random variable .ζ is referred to as the explosion time. The connection between .p(n) (t; x, dy) and .p(t; x, dy) and the minimal process .X are as follows. Proposition 4.5 For .x ∈ S, C ∈ S, t ≥ 0, a. .p(n) (t; x, C) = Px (X t ∈ C, Nt < n), b. .p(t; x, C) = Px (Xt ∈ C, ζ > t). c. .Px (ζ > t) = p(t; x, S).
n = 1, 2, . . . .
Proof One has, conditioning on .T0 and .Y1 .
Px (Xt ∈ C, Nt < n + 1) = Ex Px (Xt ∈ C, Nt < n + 1|σ (Y1 , T0 )) = Ex {1C (x)1[T0 >t] } + Ex {Px (Xt−T0 ∈ C, Nt−T0 < n|σ (Y1 , T0 ))1[T0 ≤ t]} t = δ{x} (C)e−λ(x)t + λ(x)e−λ(x)t1 k(x, dx1 )Px (Xt−t1 ∈ dx1 , Nt−t1 < n)dt1 . 0
S
(4.27) Thus, .p(n) (t; x, C) and .Px (Xt ∈ C, Nt < n) are solutions to the same recursion (4.14) with common initialization .δ{x} (C)e−λ(x)t for .n = 0 and, by induction, must coincide; (b) follows in the limit as .n → ∞; since .[ζ > t] = [Nt < ∞]; and (c) follows since the added state .s∞ to S is absorbing for .{Xt : t ≥ 0}. In particular, .[X t ∈ S] = [ζ > t] Remark 4.4 Theorem 4.1 and Corollary 4.2 describe the structure of a jump Markov process with infinitesimal parameters .λ(x), .p(x, dy) (.x ∈ S). If conversely, one can prove that the process Y , say, in Corollary 4.2 is Markov, then the rather ungainly proof of the Chapman–Kolmogorov property would be unnecessary; for the process Y satisfies the properties in Proposition 4.5 of the minimal process. The proof of the Markov property of Y is hinted at for a countable state space S in Bhattacharya and Waymire (2009), Theorem 5.4, pp. 278–279. Proposition 4.6 The minimal process is a conservative pure jump process if and only if Px (
∞
.
n=0
1 < ∞) = 0 λ(Yn )
Proof Assume first that . ∞ k=0
1 λ(Yk )
for all x ∈ S.
= ∞ a.s. Recall that
52
4 Continuous Parameter Jump Markov Processes
ζ = lim τn = lim
.
n→∞
n→∞
n
Tk
k=0
where conditionally given .σ (Y0 , Y1 , . . . ) and the random variables .T0 , T1 , . . . are independent and exponentially distributed with respective parameters .λ(Y0 ), λ(Y1 ), . . . . By the dominated convergence theorem, one has for each .x ∈ S, Ex e−ζ = lim Ex e−Σk=0 Tk = lim Ex E{e−Σk=0 Tk | σ (Y0 , Y1 , . . . )} n
.
n
n→∞
n→∞
= lim Ex n→∞
n
λ(Yk ) 1 + λ(Yk )
k=0
= Ex lim
n→∞
n
1 λ(Yk ) = Ex lim n n→∞ 1 + λ(Yk ) k=0 (1 +
k=0
1 λ(Yk ) )
.
Now,
.
log( n
1
k=0 (1 +
1 λ(Yk ) )
)=−
n
log(1 +
k=0
≤−
1 ) λ(Yk )
n
1 λ(Yk ) 1 + λ(Y1 k ) k=0
=−
n k=0
1 1 + λ(Yk )
since .log(1+x) ≥ x/(1+x) for .x ≥ 0. In particular, .log( n 1 1 ) → −∞ a.s. k=0 (1+ λ(Yk ) ) n 1 as .n → ∞. Thus, . k=0 (1 + λ(Yk ) ) → ∞ a.s. as .n → ∞. Again, by the dominated convergence theorem, one has .Ex e−ζ = 0. Thus, .ζ = +∞ a.s. Conversely if .ζ = ∞ a.s., then these calculations show that .Ee−ζ = 0 and, therefore almost surely, ∞ .
k=0
Thus, . ∞ k=0
1 λ(Yk )
(1 +
1 ) = ∞. λ(Yk )
= ∞ a.s.
Once again, we arrive at (see Remark 4.1), Corollary 4.7 (Bounded Rates) If .supx∈S λ(x) < ∞, then the minimal process is nonexplosive. Corollary 4.8 If the embedded spatial process .{Yk }∞ k=0 is pointwise recurrent, then the minimal process is nonexplosive. The following theorem and proposition of Reuter (1957) also provide useful approaches to the determination of explosion/nonexplosion events.
4 Continuous Parameter Jump Markov Processes
53
Theorem 4.9 (Reuter’s Conditions for Explosion) Let .λ(x), .k(x, dy) be given infinitesimal rates for .X. Define a linear operator Af (x) = λ(x)
{f (y) − f (x)}k(x, dy),
.
x ∈ S.
S
Then, .X, starting from .x0 , is explosive if and only if there is a bounded, nonnegative function u and real number .γ > 0, such that .u(x0 ) > 0 for some .x0 , and Au(x) ≥ γ u(x),
.
∀x ∈ S.
Proof Define a bounded, nonnegative function uγ (x) = Ex e−γ ζ , x ∈ S.
.
Then, .Px0 (ζ < ∞) > 0 if and only if .uγ (x0 ) > 0. Moreover, conditioning on the process up to the time .τ1 of the first jump,
uγ (x) = Ex Ex (e−γ ζ |Fτ1 )
= Ex e−γ τ1 uγ (Xτ1 ) λ(x) = uγ (y)k(x, dy). λ(x) + γ S
.
(4.28)
Equivalently, Auγ (x) = γ uγ (x),
.
x ∈ S.
If .Px0 (ζ < ∞) > 0, then .uγ (x0 ) > 0. Thus, necessity of the condition for explosion with positive probability is proved. For sufficiency, suppose that u is a nonnegative bounded function, say .u ≤ B, such that for some .γ > 0, Au(x) ≥ γ u(x),
.
∀ x ∈ S.
(4.29)
Define a martingale by Mt = e−γ t u(X t ) −
t
.
e−γ s (Au − γ u)(Xs )ds
0
= e−γ t u(X t ) − It ,
t ≥ 0,
(4.30)
where .t → It is a nondecreasing function by (4.29). In particular, therefore, e−γ t u(X t ) = Mt + It , t ≥ 0, is a uniformly bounded submartingale. Let .τn denote the time of the nth jump. By Doob’s stopping time theorem, .e−γ τn u(X τn ), n ≥ 1, is a submartingale and
.
54
4 Continuous Parameter Jump Markov Processes
Ex0 e−γ ·0 u(X0 ) ≤ Ex0 e−γ τn u(X τn ) ≤ BEx0 e−γ τn ≤ BPx0 (ζ < ∞).
.
(4.31)
Letting .n → ∞, one has u(x0 ) ≤ BEx0 e−γ ζ .
.
Thus, .Px0 (ζ < ∞) ≥ B −1 u(x0 ) > 0.
Proposition 4.10 Let .λ(x), .k(x, dy) be given infinitesimal rates for .X. Assume that there is a nonnegative function u with the property that for any sequence .{xn } in S such that .limn λ(xn ) = ∞, one also has .limn u(xn ) = ∞, and for which there is a real number .γ > 0 such that Au(x) ≤ γ u(x),
.
∀ x ∈ S,
where Af (x) = λ(x)
{f (y) − f (x)}k(x, dy),
.
x ∈ S.
S
Then, .X is not explosive. Proof Since .e−γ t u(Xt ), t ≥ 0, is a nonnegative supermartingale, one has for any .T > 0, Ex e−γ (τn ∧T ) u(Xτn ∧T ) ≤ u(x),
.
∀x ∈ S,
where, again, .τn denotes the time of the nth jump. Letting .T → ∞, one has by Fatou’s lemma that Ex e−γ τn u(Xτn ) ≤ u(x),
.
∀x ∈ S.
(4.32)
Now, by the same argument as used proof of Proposition 4.6, .P (ζ = in the 1 limn τn < ∞) > 0 implies that .P ( ∞ n=1 λ(Xτ ) < ∞) > 0. Thus, .limn λ(X τn ) → n ∞ is an event with positive probability. In view of the condition on u in the statement of the theorem, the event that .u(Xτn ) → ∞ has positive probability as well. But (4.32) makes this impossible, i.e., an event with probability zero. Corollary 4.11 Let .λ(x), .k(x, dy) be given infinitesimal rates. If .γ supx S λ(y)k(x, dy) < ∞, x ∈ S, then .X, starting at .x ∈ S, is nonexplosive. Proof Define .u(x) = λ(x), x ∈ S. Then, Au(x) = λ(x)
{λ(y) − λ(x)}k(x, dy)
.
S
=
4 Continuous Parameter Jump Markov Processes
55
≤ λ(x)
λ(y)k(x, dy) S
≤ γ λ(x) = γ u(x). In the case .Px (ζ = ∞) < 1, there are various ways of continuing .X for t ≥ ζ so as to preserve the Markov property and the backward equations for the t : t ≥ 0}. An illustration is provided in Example 1. An continued process .{X important consequence is the role that these continuations play in the nonuniqueness of nonnegative solutions to the backward equations. The case of a countable state space S plays a large role in the present context. Continuous parameter jump Markov chains are often defined in terms of an array
.
Q = ((qij ))i,j ∈S
.
of parameters such that qij ≥ 0, i = j, and
.
qij = 0, for all i ∈ S.
(4.33)
j
Q then specifies the infinitesimal parameters in the case .qii = 0, by λi = −qii , and k(i, {j }) =
.
qij , i = j, k(i, {i}) = 0. λi
If .λi = −qii = 0, then the convention of treating i as an absorbing state is invoked, i.e., .k(i, {i}) = 1. The infinitesimal rates are given by an array .Q = ((qij )) satisfying (4.33). Q is also referred to as the Q-matrix in this context. The role of the embedded Markov chain is particularly revealing in considerations of transience and recurrence properties. Definition 4.2 Let .{Xt : t ≥ 0} be a continuous parameter Markov chain on a countable state space S. A state .j ∈ S is recurrent if .Pj (sup{t ≥ 0 : Xt = j } = ∞) = 1. Otherwise, .j ∈ S is said to be transient. Theorem 4.12 Let .{Xt : t ≥ 0} be a continuous parameter Markov chain on a countable state space S. A state .j ∈ S is transient or recurrent for .{Xt } according to whether j is transient or recurrent for the embedded discrete parameter Markov chain .{Yn : n = 0, 1, . . . }. Proof The result is trivially true in the case of absorption and left aside. If .j ∈ S is recurrent for .{Xt }, then .{Yn } will return to j at infinitely many jump times of .{Xt } with probability one, and j is thus recurrent for .{Yn }. Conversely, if j is recurrent fro .{Yn }, then under .Pj , the successive return times
56
4 Continuous Parameter Jump Markov Processes (r)
(r−1)
= inf{t ≥ τj
τj
.
(0)
where .τj (1) (r) .τ j , τj
: Xt = j },
r = 1, 2, . . . ,
is the initial holding time in state j , have i.i.d. positive increments (r−1)
− τj
, r ≥ 2. In particular, therefore, .Pj -a.s., (n)
τj
.
(1)
= τj
+
n (r) (r−1) (τj − τj ) → ∞. r=2
Thus, j is recurrent for .{Xt },
In view of the almost sure dichotomy between transient and recurrent states of discrete parameter Markov chains (Exercise 16), one has the following. Corollary 4.13 A state .j ∈ S is transient for .{Xt } if and only if .Pj (sup{t ≥ 0 : Xt = j } < ∞) = 1. In this case, j is transient for the embedded Markov chain. Example 1 (Continuous Parameter Birth–Death Chains) For given sequences of nonnegative numbers .{βi }i∈Z , .{δi }i∈Z , the minimal Markov chain on .S = Z specified by the infinitesimal rates qi,i+1 = βi , qi,i−1 = δi , qii = −λi := −(βi + δi ), qij = 0, |i − j | ≥ 2, i, j ∈ Z, (4.34)
.
is referred to as a (continuous parameter) birth–death chain. The further specialization .δi = 0 for all .i ∈ Z is referred to as a pure birth process, and .βi = 0 for all .i ∈ Z defines the pure death process. The state space may naturally be further restricted to .S = Z+ in cases .βi = δi = 0 for .i ≤ −1. It follows from Proposition 4.6 that a pure birth process is nonexplosive if and only if
.
∞ 1 = ∞. λn n=1
Now, consider an explosive pure birth process, e.g., .λi = i12 , .i ∈ S = Z+ . Let .{X t : 0 ≤ t < ζ } denote the minimal process on S starting at .i = 0. Consider the (0) extension .Xt , 0 ≤ t ≤ ζ (0) := ζ of .{Xt : 0 ≤ t < ζ } by an instantaneous return to zero, i.e., (0)
Xt
.
= X t , 0 ≤ t < ζ,
(0)
Xζ = 0.
Next define Rn :=
n
.
j =0
ζ (j ) , n ≥ 0, R−1 = 0,
4 Continuous Parameter Jump Markov Processes (n)
and let .{Xt
57 (0)
: 0 ≤ t < ζ (n) }, n ≥ 1, be i.i.d. copies of .Xt . Define XRn−1 = XRn = 0,
.
and t = X
.
(0) Xt Xt(n)
if 0 ≤ t ≤ ζ (0) if Rn−1 ≤ t ≤ Rn , n ≥ 1.
(4.35)
t Since .ζ (n) , n ≥ 0, are i.i.d. positive random variables, one has .Rn → ∞ a.s.. So .X is defined for all .t ≥ 0. Clearly t = j ) > P0 (Xt = j ) = p(t; 0, {j }), P0 (X
.
t = j ] is a proper subset. We leave it as Exercise 6 to check that since .[X t = j ] ⊂ [X the backward equations hold, but the forward equations fail. Note that the extension could have been made by an instantaneous return to any state .k ∈ Z+ , or for that may to a randomly selected state, independent of the process. Such fascinating considerations2 lead to infinitely many solutions to the backward equations. More generally, explosion starting in state j implies that j is transient. In view of Theorem 4.12, the criteria for transience or recurrence of a continuous parameter birth–death chain on .S = {0, 1, . . . } with reflecting boundary at 0 are the same as those for the discrete parameter embedded Markov chain, namely, convergence or δ1 ···δj divergence of the series . ∞ n=1 β1 ···βj (Exercise 11 or see Bhattacharya and Waymire 2022). To derive necessary and sufficient conditions for explosion of the birth–death process on .S = {0, 1, . . . } with reflection at 0, one may apply Theorem 4.9 as follows. Proposition 4.14 (Reuter) Let .{Xt : t ≥ 0} be a continuous parameter birth– death process on .S = {0, 1, . . . } with reflection at 0 and starting in state .j ∈ S. Assume .βj δj > 0, j = 1, 2 . . . , β0 = 1, δ0 = 0, and .0 < λj < ∞, j ≥ 0. Then, .{Xt } is .Pj -a.s. explosive for each .j ∈ S if and only if
.
n ∞ δ1 · · · δn β1 · · · βj 1 < ∞, β1 · · · βn δ1 · · · δj λj βj n=1
j =1
∞ δ1 · · · δj < ∞. β1 · · · βj j =1
Proof Suppose that u is a positive function, such that .Au(j ) = γ u(j ), j ≥ 0, for γ > 0, where
.
Au(j ) = λj {βj u(j + 1) + δj u(j − 1) − u(j )},
.
2 See
j ≥ 0 (u(−1) = 0).
Freedman (1983) for a more extensive treatment of such extensions.
58
4 Continuous Parameter Jump Markov Processes
Then, writing .β˜j = λj βj , δ˜j = λj δj , j ≥ 0, .λj = β˜j + δ˜j , .
δ˜j β˜j
=
δj βj
,
β˜0 u(1) = (γ + β˜0 )u(0) β˜j u(j + 1) + δ˜j u(j − 1) = (γ + β˜j + δ˜j )u(j ), j ≥ 1.
(4.36)
Then, u(j + 1) =
.
(γ + λj )u(j ) − δ˜j u(j − 1) , β˜j
j ≥ 0.
(4.37)
Letting .v(j + 1) = u(j + 1) − u(j ), j ≥ 0, and taking .u(0) = 1 to standardize the solution, it follows that .v(1) = u(1) − 1 = γ˜ and β0
v(j + 1) =
.
δj γ v(j ) + u(j ), j ≥ 0. βj β˜j
(4.38)
In particular, inductively one has .v(j ) > 0, j ≥ 0, and therefore, u must be strictly increasing. Now, one may recursively compute v(j + 1) =
.
≥
δj γ γ δj · · · δ1 γ u(j − 1) + · · · + u(0) u(j ) + ˜ ˜ βj βj −1 β˜0 βj · · · β1 βj δj γ γ δj · · · δ1 γ + ··· + . + ˜ ˜ βj βj −1 β˜0 βj · · · β1 βj
(4.39)
Now, $\lim_{j\to\infty} u(j) = \sum_{j=1}^{\infty} v(j) + 1 = \infty$ if $\sum_{n=1}^{\infty} \frac{\delta_1\cdots\delta_n}{\beta_1\cdots\beta_n}\sum_{j=1}^{n}\frac{\beta_1\cdots\beta_j}{\delta_1\cdots\delta_j}\frac{1}{\lambda_j\beta_j} = \infty$, i.e., $u$ is an unbounded solution, and therefore $\{X_t\}$ is nonexplosive. On the other hand, again by monotonicity,
$$v(j+1) \le \gamma\Big(\frac{1}{\tilde\beta_j} + \frac{\delta_j}{\beta_j}\frac{1}{\tilde\beta_{j-1}} + \cdots + \frac{\delta_j\cdots\delta_1}{\beta_j\cdots\beta_1}\frac{1}{\tilde\beta_0}\Big)u(j).$$
So,
$$u(j+1) \le \Big(1 + \frac{\gamma}{\tilde\beta_j} + \frac{\delta_j}{\beta_j}\frac{\gamma}{\tilde\beta_{j-1}} + \cdots + \frac{\delta_j\cdots\delta_1}{\beta_j\cdots\beta_1}\frac{\gamma}{\tilde\beta_0}\Big)u(j) \le \exp\Big(\frac{\gamma}{\tilde\beta_j} + \frac{\delta_j}{\beta_j}\frac{\gamma}{\tilde\beta_{j-1}} + \cdots + \frac{\delta_j\cdots\delta_1}{\beta_j\cdots\beta_1}\frac{\gamma}{\tilde\beta_0}\Big)u(j), \quad j \ge 1, \tag{4.40}$$
$$\cdots \le \exp\Big(\gamma\sum_{k=0}^{j}\Big\{\frac{1}{\tilde\beta_k} + \frac{\delta_k}{\beta_k}\frac{1}{\tilde\beta_{k-1}} + \cdots + \frac{\delta_k\cdots\delta_2}{\beta_k\cdots\beta_2}\frac{1}{\tilde\beta_1} + \frac{\delta_k\cdots\delta_1}{\beta_k\cdots\beta_1}\frac{1}{\tilde\beta_0}\Big\}\Big)u(0).$$
Thus, if $\sum_{n=1}^{\infty}\frac{\delta_1\cdots\delta_n}{\beta_1\cdots\beta_n}\sum_{j=1}^{n}\frac{\beta_1\cdots\beta_j}{\delta_1\cdots\delta_j}\frac{1}{\lambda_j\beta_j} < \infty$ and $\sum_{k=1}^{\infty}\frac{\delta_k\cdots\delta_1}{\beta_k\cdots\beta_1} < \infty$, then $u$ must be bounded, and $\{X_t\}$ explodes.
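As a quick plausibility check of Reuter's criterion, one can compute partial sums of the two series for concrete rate choices. The following is a small illustrative sketch (the specific rates are assumptions chosen for illustration, not from the text); the inner sum is carried by a recursion to avoid forming the very large intermediate products.

```python
def reuter_partial_sums(beta, delta, lam, N=2000):
    """Partial sums of the two series in Proposition 4.14 (Reuter's criterion).
    beta(j), delta(j) are the birth/death probabilities (beta_0 = 1, delta_0 = 0),
    lam(j) the total jump rates; both sums staying bounded suggests explosion."""
    s1 = 0.0   # sum_n prod_{i<=n}(delta_i/beta_i) * sum_{j<=n} prod_{i<=j}(beta_i/delta_i)/(lam_j*beta_j)
    s2 = 0.0   # sum_j prod_{i<=j}(delta_i/beta_i)
    a = 0.0    # a_n = n-th term of the first series, via a_n = (delta_n/beta_n) a_{n-1} + 1/(lam_n*beta_n)
    prod = 1.0
    for n in range(1, N + 1):
        r = delta(n) / beta(n)
        a = r * a + 1.0 / (lam(n) * beta(n))
        prod *= r
        s1 += a
        s2 += prod
    return s1, s2

# Example (illustrative rates only): upward drift with quadratically growing total rates.
# Both partial sums stabilize quickly, suggesting explosion.
print(reuter_partial_sums(beta=lambda j: 0.9, delta=lambda j: 0.1, lam=lambda j: (1 + j) ** 2))
```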
For another interesting application of Theorem 4.12, see Exercise 13.

Remark 4.5 Even in the case of countable state spaces, general continuous parameter Markov chains are capable of more exotic behavior than presented here, e.g., see Chung (1967) and Freedman (1983). The examples by Lévy (1951), Dobrushin (1956), Feller and McKean (1956), and Blackwell (1958) are especially instructive in identifying nuances of the general theory.
Exercises

1. (Cauchy's Functional Equation) Suppose that a positive right-continuous function $\varphi$ on $[0,\infty)$ satisfies $\varphi(s+t) = \varphi(s)\varphi(t)$ for all $s, t \ge 0$. Prove that $\varphi$ is of the form $\varphi(t) = e^{-\lambda t}$, for some $\lambda \in \mathbb{R}$. [Hint: Let $f = \log\varphi$. Then $f$ satisfies $f(s+t) = f(s) + f(t)$, $s, t \ge 0$, implying $f(mx) = mf(x)$ and $f(\frac{m}{n}x) = \frac{m}{n}f(x)$ for all $m, n \in \mathbb{Z}_+$, $x \ge 0$ (noting that $f(x) = nf(\frac{1}{n}x)$). Now use right-continuity to prove $f(cx) = cf(x)$ for all $c > 0$ and hence $f(x) = xf(1) = \theta x$ for all $x \ge 0$ and for some constant $\theta$.]
2. Supply the details for the proof of Corollary 4.2. [Hint: Use the strong Markov property.]
3. Show that the two definitions of the minimal process $X$ obtained from the Kolmogorov extension theorem applied to $p(t; x, dy)$ and (4.25), respectively, are stochastically equivalent.
4. Establish the equivalence of the integro-differential and mild forms of the backward and forward equations given in (4.8).
5. Give a proof of (4.23). [Hint: Use mathematical induction.]
6. Show that the transition probabilities of the instantaneous return to zero extension $\tilde{X}$ of the minimal process fail to satisfy the forward equations, but not the backward equations. [Hint: For $n \ge 0$, show that $\tilde{p}_{00}(t) = \sum_{n=0}^{\infty} P_0(\tilde{X}_t = 0, N_t = n) = \sum_{n=0}^{\infty}\int_0^t p_{0,0}(t-s)f_0^{*n}(s)\,ds$, where $f_0$ is the pdf of $\zeta$, since the times between successive explosions are i.i.d. with pdf $f_0$, and $p_{0,0}(t) = e^{-\lambda_0 t}$. Differentiate to see the forward equations fail. To show the backward equations in this case, write $f_0 = e_{\lambda_0} * f_1$, where $e_{\lambda_0}(t) = \lambda_0 e^{-\lambda_0 t}$, $t \ge 0$, and $f_1$ is the pdf of the first explosion time starting from $i_0 = 1$. Check that $\sum_{n=1}^{\infty} f_0^{*n}(t) = \lambda_0\tilde{p}_{10}(t)$.]
7. Compute the transition probabilities of the homogeneous Poisson process by Feller's method of successive approximation.
8. (A Pure Death Process) Let $S = \{0, 1, 2, \dots\}$ and let $q_{i,i-1} = \delta > 0$, $q_{ii} = -\delta$, $i \ge 1$, $q_{ij} = 0$ otherwise. (i) Calculate $p_{ij}(t) \equiv p(t; i, \{j\})$, $t \ge 0$, using successive approximations. (ii) Calculate $E_i X_t$ and $\mathrm{Var}_i X_t$.
9. (Birth–Death with Linear Rates) The continuous parameter birth–death chain with linear rates on $S = \{0, 1, 2, \dots\}$ is defined by the (minimal) Markov chain $\{X_t : t \ge 0\}$ with $q_{i,i+1} = i\beta$, $q_{i,i-1} = i\delta$, $i \ge 0$, where $\beta, \delta > 0$, $q_{ij} = 0$ if $|j - i| > 1$, $q_{01} = 0$. Let $m_i(t) = E_i X_t$, $s_i(t) = E_i X_t^2$.
   (i) Use the forward equation to show $m_i'(t) = (\beta - \delta)m_i(t)$, $m_i(0) = i$.
   (ii) Show $m_i(t) = i e^{(\beta-\delta)t}$.
   (iii) Show $s_i'(t) = 2(\beta - \delta)s_i(t) + (\beta + \delta)m_i(t)$.
   (iv) Show that
$$s_i(t) = \begin{cases} i e^{2(\beta-\delta)t}\big\{i + \frac{\beta+\delta}{\beta-\delta}(1 - e^{-(\beta-\delta)t})\big\}, & \text{if } \beta \ne \delta,\\ i(i + 2\beta t), & \text{if } \beta = \delta.\end{cases}$$
   (v) Calculate $\mathrm{Var}_i X_t$.
10. (Pure Birth with Affine Linear Rates) Consider a pure birth process on $S = \{0, 1, 2, \dots\}$ having infinitesimal parameters $q_{ii} = -(\nu + i\lambda)$, $q_{i,i+1} = \nu + i\lambda$, $q_{ik} = 0$ if $k < i$ or if $k > i+1$, where $\nu, \lambda > 0$. Successively solve the forward equation for $p_{00}(t)$, $p_{01}(t)$ and inductively for $p_{0n}(t) = \frac{\nu(\nu+\lambda)\cdots(\nu+(n-1)\lambda)}{n!\lambda^n}e^{-\nu t}[1 - e^{-\lambda t}]^n$. The case $\nu = 0$ is referred to as the Yule linear growth process (see Exercise 3, Chapter 1).
11. Show that a discrete parameter birth and death process on $S = \{0, 1, \dots\}$ with reflecting boundary at 0 and $\beta_j\delta_j > 0$, $j \ge 1$, is transient if and only if $\sum_{j=1}^{\infty}\frac{\delta_1\cdots\delta_j}{\beta_1\cdots\beta_j} < \infty$. [Hint: Compute the probability of hitting $i$ before $k$ from an initial state $j \in \{i, \dots, k\}$, and the probability of eventually hitting $i$ by passing to the limit as $k \to \infty$.]
12. Consider a recurrent birth and death process on $S = \{0, 1, \dots\}$ with reflecting boundary at 0 and $\beta_j\delta_j > 0$, $j \ge 1$. (a) Show $\sum_{j=1}^{\infty}\frac{\beta_0\cdots\beta_{j-1}}{\delta_1\cdots\delta_j} < \infty$ implies that, for a suitable choice of normalizing constant $m_0$, $m_j = \frac{\beta_0\cdots\beta_{j-1}}{\delta_1\cdots\delta_j}m_0$ is an invariant probability for the embedded discrete parameter Markov chain. [Hint: Check time-reversibility.] (b) Define $\pi_1 = \frac{\lambda_0}{\lambda_1\delta_1}\pi_0$, $\pi_j = \frac{\lambda_0}{\lambda_j}m_j$, $j \ge 2$. Show that if $\sum_{j=1}^{\infty}\frac{\lambda_0}{\lambda_j}m_j < \infty$, then, for a suitable choice of $\pi_0$, $\{\pi_j\}_{j=0}^{\infty}$ is the unique
invariant probability for the continuous parameter Markov chain. [Hint: Check that the forward equations apply.]
13. ($\alpha$-Riccati Jump Process) The $\alpha$-Riccati process arises in a variety of applications³ and can be associated with a variety of critical phenomena. Let $\mathbb{T} = \cup_{n=0}^{\infty}\{1,2\}^n$, $\{1,2\}^0 = \{\emptyset\}$, and view $\mathbb{T}$ as a binary tree graph rooted at $\emptyset$. The height of $v = (v_1, \dots, v_n) \in \mathbb{T}$ is defined by $|v| = n$ and $|\emptyset| = 0$. If $v \in \mathbb{T}$, then $vj = (v_1, \dots, v_n, j)$, $\emptyset j = (j)$, $j \in \{1, 2\}$, denotes the first-generation offspring of $v$, and $v|m = (v_1, \dots, v_m)$, $m \le n = |v|$, is the $m$-generation ancestor of $v$. For $|v| = n$, $\overleftarrow{v} := v|(n-1)$ denotes the immediate ancestor of $v$. Then $u, v \in \mathbb{T}$ are connected by an edge if $u = \overleftarrow{v}$ or $v = \overleftarrow{u}$, to make $\mathbb{T}$ a graph. The $\alpha$-Riccati process is the jump Markov process on the denumerable evolutionary space $S \equiv \mathcal{E}$ consisting of finite subsets $V \subset \cup_{n=0}^{\infty}\{1,2\}^n$ with the following properties: (i) $V = \{\emptyset\} \in \mathcal{E}$; (ii) if $V \in \mathcal{E}$ and $v \in V$, then $V^{(v)} := V\setminus\{v\}\cup\{v1, v2\} \in \mathcal{E}$. Define
$$A^{(\alpha)}f(V) = \sum_{v\in V}\alpha^{|v|}\{f(V^{(v)}) - f(V)\} = \lambda_\alpha(V)\sum_{W\in\mathcal{E}}\{f(W) - f(V)\}k_\alpha(V, W), \qquad V \in \mathcal{E},$$
where $\lambda_\alpha(V) = \sum_{v\in V}\alpha^{|v|}$, $V \in \mathcal{E}$, and
$$k_\alpha(V, W) = \begin{cases}\alpha^{|v|}/\lambda_\alpha(V), & \text{if } W = V^{(v)} \text{ for some } v \in V,\\ 0 & \text{otherwise.}\end{cases}$$
(a) Show⁴ that the $\alpha = 1/2$-Riccati is a standard Poisson process. (b) Show that the $\alpha$-Riccati is explosive if and only if $\alpha > 1$. [Hint: Let $\lambda_\beta(V) = \sum_{v\in V}\beta^{|v|}$ and observe that $A^{(\alpha)}\lambda_\beta = (2\beta - 1)\lambda_{\alpha\beta}$. Also note that $\lambda_1(V) = |V|$ is the cardinality of $V$. To prove explosion, consider $u(V) = \sqrt{|V|}/\sqrt{|V|+1}$, $V \in \mathcal{E}$; use $\sqrt{|V|+1} + 1 \le 2\sqrt{|V|+1}$ and $\lambda_\alpha(V) \ge |V|$ for $\alpha > 1$, together with the fact that $x \mapsto (\sqrt{x+1}-\sqrt{x})\sqrt{x}$, $x \ge 1$, is bounded below by $\gamma = \sqrt{2}-1 < 1$. To prove nonexplosion, consider $u(V) = |V|$.]
14. Consider a pure birth process on $S = \{0, 1, 2, \dots\}$ defined by infinitesimal parameters $q_{k,k+1} = \lambda_k > 0$, $q_{kk} = -\lambda_k$.
   (i) If $\sum_0^{\infty}\lambda_k^{-1} = \infty$ and $\sum_0^{\infty}\lambda_k^{-2} < \infty$, then show that the variance of $\tau_n = T_0 + T_1 + \cdots + T_{n-1}$ (the time to reach $n$, starting from 0) goes to a finite limit as $n \to \infty$. Use Kolmogorov's zero–one law to show that as $n \to \infty$, $\tau_n - \sum_{k=0}^{n-1}\lambda_k^{-1}$ converges (for all sample paths outside a set of probability zero) to a finite random variable $\eta$.
   (ii) If $\sum_0^{\infty}\lambda_k^{-1} = \infty = \sum_0^{\infty}\lambda_k^{-2}$ but $\sum_0^{\infty}\lambda_k^{-3} < \infty$, show that
³ See Athreya (1985), Aldous and Shields (1988), Best and Pfaffelhuber (2010), Bhattacharya and Waymire (2021), and Dascaliuc et al. (2018a,b, 2023a,b,c). This includes a role for $\alpha$-Riccati as an idealized model for self-similar incompressible Navier–Stokes equations.
⁴ See Aldous and Shields (1988) and Dascaliuc et al. (2018a).
$$\Big(\tau_n - \sum_{k=0}^{n-1}\lambda_k^{-1}\Big)\Big/\Big(\sum_{k=0}^{n-1}\lambda_k^{-2}\Big)^{1/2}$$
is asymptotically (as $n \to \infty$) Gaussian with mean 0 and variance 1. [Hint: Use a central limit theorem, BCPT,⁵ p. 81.]
15. Let $\{X_t : t \ge 0\}$ be a Poisson process with parameter $\lambda > 0$. Let $T_0$ denote the time of the first occurrence. (i) Let $N = X_{2T_0} - X_{T_0}$ and calculate $\mathrm{Cov}(N, T_0)$. (ii) Calculate the (conditional) expected value of the time $T_0 + \cdots + T_{r-1}$ of the $r$th occurrence given $X_t = n > r$.
16. Prove the transience and recurrence dichotomy of a state $j$ for a discrete parameter Markov chain. [Hint: Use the strong Markov property.]
17. (Continuous Parameter Branching Markov Chain) At time 0, $X_0 = i_0 \ge 1$. Each of the $i_0$ particles, independently of the others, waits an exponential length of time with parameter $\lambda > 0$ and then splits into a random number $k$ of particles with probability $f_k$, independently of the holding time, where $f_k \ge 0$, $f_0 > 0$, $0 < f_0 + f_1 < 1$, and $\sum_{k=0}^{\infty}f_k = 1$. The progeny continue to split according to this process, and $X_t$ counts the total number present at time $t \ge 0$. If $X_s = 0$, then $X_t = 0$ for all $t \ge s$, i.e., 0 is an absorbing state. Determine the infinitesimal parameters $Q = ((q_{ij}))$, and $\lambda_i$, $k_{ij} \equiv k(i, \{j\})$.
18. Let $p(t) = ((p_{ij}(t)))$ denote the transition probabilities for the continuous parameter branching Markov process given in Exercise 17. Let $g_t^{(i)}(r) = \sum_{j=0}^{\infty}p_{ij}(t)r^j$, $i = 0, 1, 2, \dots$. (i) Use the Chapman–Kolmogorov equations to show $g_{t+s}^{(i)}(r) = g_t^{(i)}(g_s^{(1)}(r))$. (ii) Let $\rho = P(X_t = 0 \text{ for some } t > 0 \mid X_0 = 1)$. Show that $\rho$ is the smallest nonnegative root of the equation $h(r) = 0$, where $h(r) = \sum_{j=0}^{\infty}q_{1j}r^j$. (iii) Show that if $\sum_{j=0}^{\infty}jf_j \le 1$, then $\rho = 1$.
19. (Continuous Parameter Critical Binary Branching) Consider the critical binary continuous parameter branching Markov chain⁶ $\{X_t : t \ge 0\}$ defined by the offspring distribution $f_0 = f_2 = \frac{1}{2}$ in the context of Exercise 17. (i) Verify that $\{X_t : t \ge 0\}$ is a birth–death process on $S = \{0, 1, 2, \dots\}$ with absorption at 0 and birth–death rates $q_{i,i+1} = q_{i,i-1} = i\lambda/2$, $i \ge 1$. (ii) Let $M_t = \#\{0 \le s \le t : X_s - X_{s-} = -1\}$ denote the number of deaths in time 0 to $t$. Show that $\{(X_t, M_t) : t \ge 0\}$ is a continuous parameter Markov chain on $S = \mathbb{Z}_+ \times \mathbb{Z}_+$ with infinitesimal rates $A = ((a_{ij}))$,
BCPT refers to Bhattacharya and Waymire (2016), A Basic Course in Probability Theory. 6 This exercise is based on questions about the main channel lengths in a river network by Gupta et al. (1990). Also see Durrett et al. (1991).
Exercises
63
where ai,j = λ2 i1 if j = i + (1, 0) or j = i + (−1, 1), aij = −λi1 if i = j, and ai,j = 0 otherwise. (iii) Let g(t, u, v) = E(1,0) uXt v Mt , 0 ≤ u, v ≤ 1, t ≥ 0. Use Kolmogorov’s ∂g λ 2 forward equation to show ∂g ∂t = 2 (u + v − 2u) ∂u , g(0, u, v) = v. (iv) Integrate the characteristic curve defined by λ γ (t) = − j (γ 2 (t) − 2γ (t)tv), 2
.
d g(t, γ (t), v) = 0, dt
g(0, γ (0), v) = γ (0) to obtain g(t, u, v) =
1 λ(c −c )t 1 2 1 λ(c −c )t 1 2 (u−c2 )−(u−c1 )e 2 1 2
c1 (u−c2 )+c2 (c1 −u)e 2
where c1 = c1 (v) = 1 − (1 − v)V2 , c2 = c2 (v) = 1 + (1 − v) . (v) Let τ0 = inf{t ≥ 0 : Xt = 0}. Show that 1
.P Mτ0 = n =
2
n
(−1)
n−1
2n − 1 −2n+1 1 2 = , n = 1, 2, . . . 2n − 1 n
[Hint: ∞ n=1 P (Mτ0 = n) = limv→∞ g(t, 1, v), 0 ≤ v < 1.] (vi) Using Stirling’s formula show that
3 1 P Mτ0 = n ∼ √ n− 2 2 π
.
as n → ∞.
n ˆ (vii) Let hn = P (Mτ0 = n)E{τ0 | Mτ0 = n}, n ≥ 1, and h(v) = ∞ n=1 hn v . Show that √ 2 2 1−v c1 (v) 2 ˆ ) = − log( log(1 − .h(v) = − ). √ c2 (v) λ λ 1+ 1−v [Hint: P (τ0 > t | Mτ0 = n) = √ 2 π λ
1
P (Mτ0 =n)−P (Xt =0,Mτ0 =n) .] P (Mτ0 =n)
(viii) E{τ0 | Mτ0 = n} ∼ n 2 as n → ∞. [Hint: Apply a Tauberian theorem for Laplace transforms, Feller 1971, Chapter XIII.6.]
Chapter 5
Processes with Independent Increments
The goal is to characterize all processes continuous in probability having independent increments and certain sample path regularity as a sum of (possibly degenerate) continuous Brownian motion and a limit of independent Poisson superpositions of terms representing jumps. It should be noted that while we restrict to .R-valued processes for simplicity of exposition, the methods and results readily extend to multidimensional .Rk -valued processes.
Let us begin by revisiting some essential examples of processes with independent increments from the perspective of the present chapter.1 Example 1 (Poisson and Compound Poisson Processes) Let .N = {Nt : t ≥ 0} denote a Poisson process with intensity parameter .ρ > 0 and .Y0 = 0, Y1 , Y2 , . . . an i.i.d. sequence of nonnegative random variables distributed as .γ (dy), independently of N. Consider the compound Poisson process defined by Xt =
Nt
.
Yj , X0 = 0,
t > 0.
j =1
1 The
primary focus of this chapter is the representation theory of stochastic processes having stationary, independent increments. A variety of interesting applications can be found in Woyczy´nski (2001) and in Barndorff-Nielsen et al. (2001). The approach here essentially follows that of Itô and Rao (1961), Itô (2004). © Springer Nature Switzerland AG 2023 R. Bhattacharya, E. C. Waymire, Continuous Parameter Markov Processes and Stochastic Differential Equations, Graduate Texts in Mathematics 299, https://doi.org/10.1007/978-3-031-33296-8_5
65
66
5 Processes with Independent Increments
Let ψ(λ) = Ee−λY1 ,
.
λ≥0
The assumption of nonnegative summands assures finiteness of .ψ(λ). For the general theory, Laplace transforms are replaced by characteristic functions. It is straightforward to check that .X = {Xt : t ≥ 0} is a right-continuous stochastic process with stationary and independent increments. In fact, the sample paths are right-continuous step functions. Then, conditioning on N, one has Ee−λ(Xt+s −Xs ) = Ee−λXt = E(ψ(λ))Nt
.
∞ (ρtψ(λ))n
e−ρt = exp{ρt (ψ(λ) − 1)} n! n=0 = exp{−t (1 − e−λy )ν(dy)}, λ ≥ 0,
=
R
(5.1)
where ν(dy) = ργ (dy).
.
Example 2 (Lévy’s Nondecreasing Processes with Stationary Independent Increments) In this example, we consider a structural lemma that includes the Poisson process and the compound Poisson process for nonnegative .Y1 but is significantly richer than these examples. Lemma 1 (Lévy (1954)) Suppose that X is a right-continuous process having stationary independent increments such that 0 ≤ X0 ≤ Xs ≤ Xt ,
.
0 ≤ s ≤ t.
Then, Ee−λ(Xt −X0 ) = e−tψ(λ) ,
.
λ > 0,
where
∞
ψ(λ) = mλ +
.
(1 − e−λx )ν(dx),
0
∞ for a constant .m ≥ 0 and a Borel measure .ν(dx) on .(0, ∞), such that . 0 (1 − e−x )ν(dx) < ∞. Proof Without loss of generality assume that .X0 = 0, else replace .Xt by .Xt − X0 for .t ≥ 0. Observe from stationary independent increments, writing .g(t) = Ee−λXt for fixed .λ > 0, one has
5 Processes with Independent Increments
67
g(t) = g(t − s)g(s),
0≤s 0 and differentiating (5.2) with respect to .λ > 0, one arrives at 1 t
ψ (λ)etψ(λ) =
.
∞
xe−λx P (Xt ∈ dx),
t > 0, λ > 0.
0
Since the left side has a limit, namely, .ψ (λ), as .t ↓ 0, the same must be true on the right side. Notice that for each .t > 0, the right side is the Laplace transform of a measure . xt P (Xt ∈ dx). Thus, in view of convergence on the left side, the limit a-priori exists as .t ↓ 0 on the right side and is, therefore, also a Laplace transform of a measure, say .ν˜ (dx), by the extended continuity theorem2 for Laplace transforms. That is,
∞
ψ (λ) =
.
e−λx ν˜ (dx),
λ > 0.
(5.3)
0
Also, .ψ(0) = 0. Hence, integrating both sides of (5.3), one obtains using the FubiniTonelli theorem, λ ∞
ψ(λ) =
exp{−yx}˜ν (dx)dy
.
0
0
λ
=
{
0
{0}
exp{−yx}˜ν (dx) +
= λν˜ ({0}) + = λm +
∞
0+
∞
0+
∞
0+
exp{−yx}˜ν (dx)}dy
1 − e−λx ν˜ (dx) x
(1 − e−λx )ν(dx),
(5.4)
where .m = {0} exp{−yx}˜ν (dx) = ν˜ ({0}) and .ν(dx) = x1 ν˜ (dx) on .(0, ∞). Since ∞ −x )ν(dx) ≤ ψ(1) < ∞, the proof of the lemma is now complete. . + (1 − e 0
2 See
Feller (1968, 1971), p. 433.
68
5 Processes with Independent Increments
Remark 5.1 Stochastic processes of the type considered in Lemma 1 are often referred to as subordinators for their role as “random clocks” subordinate to other processes (see Example 3). Lévy in fact went much further than this lemma, to derive the following sample path representation in terms of a Poisson random measure .N(dt × dx) with intensity measure .dt × ν(dx). In particular, ( A dtν(dx))n − dtν(dx) e A , A ∈ B([0, ∞) × (0, ∞)), n = 0, 1 . . . . .P (N (A) = n) = n!
For ease of reference, the complete definition of the Poisson random measure (or Poisson random field) is as follows. Definition 5.1 A Poisson random measure on a measurable space .(S, S) is a random field .N = {N(A) : A ∈ S}, i.e., a stochastic process indexed by .S, defined on a probability space .(Ω, F, P ) such that a. .N(A) has a Poisson distribution for each .A ∈ S. b. .N(A1 ), . . . , N(An ) are independent for any disjoint .A1 , . . . , An , n ≥ 1 in .S. c. For almost all .ω ∈ Ω, .A → N(A, ω), A ∈ S, is a measure. ρ(A) = EN(A), A ∈ S, is referred to as the intensity measure.
.
That the set function .A → ρ(A), A ∈ S, is a measure follows immediately from Lebesgue’s monotone convergence theorem. The following lemma provides a formula for the so-called Laplace functional of the Poisson random field. Proposition 5.1 Suppose that N is a Poisson random field on .(S, S) with intensity measure .ρ(dz). If f is a non-negative measurable function on S, then Ee
.
−λ
S
f (z)N (dz)
= exp{
(1 − e−λf (z) )ρ(dz)},
λ ≥ 0.
(5.5)
S
Proof Suppose that .f (z) = a1A (z), z ∈ S, for a constant .a ≥ 0 and measurable set A ∈ S. Then,
.
Ee−λ
.
S
f (z)N (dz)
= Ee−λaN (A) = exp{−ρ(A)(1 − e−λa )} = exp{ (1 − e−λf (z) )ρ(dz)},
(5.6)
S
since .1 − e−λf (z) = (1 − e−λa )1A (z). The formula extends to .f (z) = m j =1 aj 1Aj (z), z ∈ S, for .aj ≥ 0, and disjoint measurable .Aj ∈ S, j = 1, . . . , m, by independence of “increments,” since
5 Processes with Independent Increments
Ee−λ
.
S
f (z)N (dz)
69
=E
m
e−λaj N (Aj )
j =1 m
=
exp{−ρ(Aj )(1 − e−λaj )}
j =1
= exp{−
m
ρ(Aj )(1 − e−λaj )}
j =1
(1 − e−λf (z) )ρ(dz)}.
= exp{
(5.7)
S
From here, one may use Lebesgue’s monotone convergence theorem to obtain the general formula (Exercise 2). Corollary 5.2 The stochastic process given in Lemma 1 is stochastically equivalent to the process defined by Xt = X0 + mt +
t
.
0
∞
0+
xN(ds × dx),
t ≥ 0,
(5.8)
where N is the Poisson random field on .S = [0, ∞) × [0, ∞) with intensity measure ρ(dt × dx) = dtν(dx).
.
Proof Without loss of generality, assume .X0 = 0, or replace .Xt by .Xt − X0 , t ≥ 0 in the following. By the corresponding properties of the Poisson random measure N, (5.8) defines a process with stationary and independent increments. To also see that this process is equivalent in distribution to the process of Lemma 1, it suffices, by the uniqueness theorem for Laplace transforms3 to calculate .Ee−λXt for fixed .t > 0. Let .ft (s, x) = x1[0,t] (s), s, x ≥ 0 Then, applying (5.5) to the function .ft , one has Ee−λXt = e−λmt Ee−λ
.
= e−λmt Ee−λ
t ∞
0 0+
xN (ds×dx)
[0,∞)×[0,∞) ft (s,x)N (ds×dx)
= exp{−tmλ − t
∞ 0+
(1 − e−λx )ν(dx)}.
(5.9)
This calculation, taken together with the stationary and independent increments property, proves stochastic equivalence between the process analyzed in Lemma 1 and the process defined by (5.8).
3 See
Feller (1968, 1971), Chapter XIII.
70
5 Processes with Independent Increments
Remark 5.2 Lemma 1 and Corollary 5.2 characterize all processes with stationary, nonnegative independent increments having cádlág sample paths. Adding a term ct is redundant if .c > 0, but if .c < 0, such an addition provides a slightly enhanced class of processes. An additional consequence for the structure of the Laplace transform occurs if, in addition to the conditions of Lemma 1, one further assumes and a scaling relation of the form (see Example 3): For each .λ > 0, there is a .α = α(λ) > 0, such that Xt =dist λXα −1 t ,
.
t ≥ 0, X0 = 0,
(5.10)
or, equivalently, .λXt =dist Xα −1 ( 1 )t , λ > 0, t ≥ 0. In particular, it follows, by λ iterating the scaling relation (Exercise 4), that one has α(λ1 λ2 ) = α(λ1 )α(λ2 ),
.
λ1 , λ2 > 0.
(5.11)
Thus, one may apply Cauchy’s functional formula to .α(ex ), x ∈ R, (see Exercise 1 in Chapter 4), to conclude that α ≡ α(λ) = λθ ,
.
for some θ ≥ 0.
(5.12)
Such processes are referred to as one-sided stable processes , and the exponent .θ is referred to as a stable law scaling exponent. The case in which .θ = 0 corresponds to the degenerate case, in which .Xt = 0 a.s. Corollary 5.3 In addition to the conditions of Lemma 1, assume the scaling relation (5.10), with scaling exponent .θ . Then, .0 ≤ θ ≤ 1. Moreover, Xt =
.
ψ(1)t mt +
t ∞ 0
0+
if θ = 1, xN(ds × dx),
if 0 ≤ θ < 1,
where for .θ < 1, ν(dx) =
.
ψ(1)θ dx, Γ (1 − θ )x θ +1
and .N (ds × dx) is the Poisson random measure on .(0, ∞) × (0, ∞) with intensity measure .dt × ν(dx) and .m ≥ 0. Proof Observe for arbitrary .t, λ ≥ 0, one has e−tψ(λ) = Ee−λXt = Ee−Xλθ t = e−λ
θ tψ(1)
.
Thus, ψ(λ) = ψ(1)λθ ,
.
λ ≥ 0.
.
5 Processes with Independent Increments
71
Using Lemma 1, ψ(1)λθ = ψ(λ) = μλ +
.
∞
0+
(1 − e−λx )ν(dx).
In particular, .θ = 1 corresponds to a degeneracy for which .Xt = γ t, for some .γ , and .ψ(1) = log EeX1 = γ . For .θ > 0, d . dλ
∞
0+
(1 − e−λx )ν(dx) =
d ψ(1)λθ , dλ
yields that the Laplace transform of .ν˜ (dx) = xν(dx) is given by .ψ(1)θ λθ−1 , λ > 0. Since the Laplace transform is a non-increasing function of .λ > 0, one has .θ < 1. ψ(1)θ From here, one may identify the asserted formula .ν˜ (dx) = xν(dx) = Γ (1−θ)x θ dx as the inverse Laplace transform of .ψ(1)θ λθ−1 , λ > 0, by using a simple change of variables (.y = λx) as follows: .
∞ 0+
e
−λx
ψ(1) ψ(1) dx = ( Γ (1 − θ ) Γ (1 − θ )x θ
∞
0+
y −θ e−y dy)λθ−1 = ψ(1)λθ−1 .
Example 3 (First Passage Time Processes) Let .{Bt } be a standard Brownian motion starting at zero. Fix .0 < a < ∞. The first passage time to a is defined by 4
τa := inf{t ≥ 0 : Bt ≥ a},
.
a ∈ R.
Consider the nondecreasing stochastic process .{τa : a ≥ 0}. Proposition 5.4 Under .P0 , the stochastic process .{τa : 0 ≤ a < ∞} is a process with independent increments. Moreover, the increments are homogeneous, i.e., for .0 < a < b, 0 < c < d, the distribution of .τb − τa is the same as that of .τd − τc if .d − c = b − a, both distributions being the same as the distribution of .τb−a under .P0 . Proof Recall that the conditional distribution of the after-.τa process .Bτ+a given the past up to time .τa is .PBτa = Pa , since .Bτa = a on the event .[τa < ∞] and .P0 (τa < ∞) = 1. Thus, .{Bt∧τa } and .{Bτa +t } are independent stochastic process and the distribution of .Bτ+a under .P0 is .Pa . That is, .Bτ+a := {(Bτ+a )t } has the same distribution as that of a Brownian motion starting at a. Also, starting from zero, the state b can be reached only after a has been reached. Hence, with .P0 -probability 1,
4 This example partially motivated the introduction of the inverse Gaussian distributions by Barndorff-Nielsen and Halgreen (1977) and Barndorff-Nielsen et al. (1978).
72
5 Processes with Independent Increments
τb = τa + inf{t ≥ 0 : Xτa +t = b} = τa + τb Xτ+a ,
.
(5.13)
where .τb (Bτ+a ) is the first hitting time of b by the after-.τa process .Bτ+a . Since this last hitting time depends only on the after-.τa process, it is independent of .τa , which is measurable with respect to .Fτa . Hence, .τb − τa ≡ τb (Bτ+a ) and .τa are independent, and the distribution of .τb − τa under .P0 is the distribution of .τb under .Pa . This last distribution is the same as the distribution of .τb−a under .P0 . For a Brownian motion .{Bt } starting at a, .{Bt − a} is a Brownian motion starting at zero; and if .τb is the first passage time of .{Bt } to b, then it is also the first passage time of .{Bt − a} to .b − a. Finally, if .0 < a1 < a2 < · · · < an are arbitrary, then these arguments applied to .τan−1 shows that .τan − τan−1 is independent of .{τai : 1 ≤ i ≤ n − 1}, and its distribution (under .P0 ) is the same as the distribution of .τan −an−1 under .P0 . Thus, the first passage process for Brownian motion starting at zero falls within the framework of Example 2, with the stochastically equivalent representation presented there. In particular, also note that .τ0 = 0 and, for .λ > 0, λτa = inf{λt : Bt ≥ a}
.
1
= inf{λt : λ− 2 Bλt ≥ a} = τ
1
λ2 a
.
(5.14)
Thus, 1
α(λ) = λ 2 ,
.
θ = 1/2.
(5.15)
For contrast, let us now consider a (non-degenerate) process having stationary independent increments and almost sure continuous sample paths. Example 4 (Brownian Motion) The Brownian motion starting at zero, .X = {Xt = μt + σ Bt : t ≥ 0}, X0 = 0, is a stochastic process having continuous sample paths with stationary and independent Gaussian increments, uniquely specified by the parameters .μ, σ 2 > 0. Moreover, 1 2 2 σ t
Ee−λXt = e−λμt+ 2 λ
.
,
λ ∈ R.
(5.16)
Almost sure sample path continuity uniquely sets this example apart from the previous (non-degernate) examples, thereby occupying a special place in the general theory. In the spirit of these examples, the goal of this chapter is to characterize all processes continuous in probability and having independent increments and certain sample path regularity as a sum of (possibly degenerate) continuous Brownian motion and a limit of independent Poisson superpositions of terms representing
5 Processes with Independent Increments
73
jumps. The founders5 of the theory are Bruno de Finetti, Paul Lévy, Yu Khinchine, and A.N. Kolmogorov. Our exposition follows that of K. Itô. Definition 5.2 a. A stochastic process .X = {Xt : t ≥ 0}, X0 = 0, having independent increments is referred to as an additive process. b. A Lévy process is an additive stochastic process .{Xt : t ≥ 0}, continuous in probability, whose sample paths are a.s. right-continuous with left-hand limits. c. A Lévy process having stationary increments will be referred to as a homogeneous Lévy process. Remark 5.3 The condition that .X0 = 0 is by standard convention. Naturally, all else being satisfied, one may always replace .Xt by .Xt − X0 if needed. Brownian motion and the compound Poisson process are prototypical examples of homogeneous Lévy processes. This is explained by the following two theorems and their corollaries. Theorem 5.5 If .{Xt : t ≥ 0} is an additive process having a.s. continuous sample paths, then the increments of .{Xt : t ≥ 0} must be Gaussian. Proof Let .s < t. Recall that a continuous function on a compact interval is uniformly continuous. Thus, ∞ P (∩∞ n=1 ∪k=1 [|Xu − Xv |
0, there is a number .δ = δ(ε) > 0 such that P (|Xu − Xv | < ε whenever |u − v| < δ, s ≤ u, v ≤ t) > 1 − ε.
.
Let .{εn }∞ n=1 be a sequence of positive numbers decreasing to zero and partition .(s, t] (n) (n) (n) into subintervals .s = t0 < t1 < · · · < tkn = t of lengths less than .δn = δ(εn ). Then, Xt − Xs =
.
kn kn {X(tj(n) ) − X(tj(n) )} := Xj(n) −1 j =1
(n)
j =1
(n)
where .X1 , . . . , .Xkn are independent (triangular array). Observe that the truncated random variables
5 See
Bose et al. (2002) for a review of the history of this development.
74
5 Processes with Independent Increments
.
(n) (n) X˜ j := Xj 1[|X(n) | 0, .Xt has a Poisson distribution with .EXt = λ(t) for some .λ(t) > 0. Partition .(0, t] into n intervals of the form .(ti−1 , ti ], .i = 1, . . . , n having equal lengths .Δ = t/n. It is to be noted that explicit dependence of these parameters .ti on the fixed value t, as well as others as they occur, will be suppressed to keep the notation simple. Let n (n) .A i=1 1A(n) (the number of time intervals i = [X(ti ) − X(ti−1 ) ≥ 1] and .Sn = i with at least one jump occurrence). Let D denote the shortest distance between jumps in the path .Xs , .0 < s ≤ t. Then, .P (Xt = Sn ) ≤ P (0 < D ≤ t/n) → 0 as .n → ∞, since .[Xt = Sn ] implies that there is at least one interval containing two or more jumps. Now, by the Poisson approximation (Chapter 1, Lemma 2), we have uniformly over .B ⊂ {0, 1, 2, . . . }, |P (Sn ∈ B) −
.
λm e−λ | ≤ ρn max pi , i≤n m!
(5.17)
m∈B
n (n) (n) where .pi ≡ pi = P (Ai ), 1 ≤ i ≤ n, .ρn = i=1 pi , .λ = − log θ (t), for (n) .θ (t) := P (Xt = 0) < 1. Since .max{p : i = 1, . . . , n} ≤ δn := P (0 < D ≤ nt ) i as .n → ∞, we only need to show that
6 Throughout, BCPT refers to Bhattacharya and Waymire (2016), A Basic Course in Probability Theory.
5 Processes with Independent Increments n .
75
pi(n) → − log θ (t) as n → ∞.
i=1
For this, note that .θ (t) =
.
n
(n) c i=1 P ((Ai ) ),
log θ (t) =
n
so that
(n)
log(1 − pi ) = −
i=1
n
(n)
pi
− εn ,
i=1
n (n)2 where .0 < εn < /(1 − δn ) → 0 as .n → ∞. Hence, . ni=1 pi(n) → i=1 pi − log θ (t). Now use Scheffe’s theorem for the convergence in total variation for the (n) Poisson distribution with parameter . ni=1 pi converging to that parameter .λ. Corollary 5.8 If .X = {Xt : t ≥ 0} is a stochastically continuous process with stationary independent increments, whose sample paths are right-continuous step functions, then X must be a compound Poisson process. In particular, if the sample paths are right-continuous step functions with positive unit jumps, then X is a (homogeneous) Poisson process. Proof Let .N(t) = #.{s ≤ t : Xs − Xs − = 0} denote the number of jumps that take place in the time interval 0 to t. For .s < t, .N(t) − N(s) counts the number of jumps in the interval .(s, t], which is independent of .N(s). In particular, .{N(t) : t ≥ 0} is a process continuous in probability (Exercise 3) with independent increments having sample paths that are right-continuous step functions, which grow by positive unit jumps, with .N(0) = 0. Thus, .{N(t) : t ≥ 0} must be a (homogeneous) Poisson process by Theorem 5.7. Denoting the successive jump times .0 = τ0 < τ1 < τ2 < · · · , we have Xt = X0 +
N (t)
{Xτj − Xτj −1 }.
.
j =1
Recall that the inter-arrival sequence .τj − τj −1 , j ≥ 1 is an i.i.d. sequence of exponentially distributed random variables (see Theorem 4.1). Apply the strong Markov property to see that .{N(t) : t ≥ 0} is independent of the i.i.d. sequence of jump sizes .Yj = Xτj − Xτj −i , .j = 1, 2, . . . (see Corollary 4.2). Remark 5.4 It follows from methods of Chapter 3 concerning regularization of martingales that the assumed sample path regularity required here is not really a restriction. The regularized process is referred to as the Lévy modification of X in the present context.
76
5 Processes with Independent Increments
Throughout .X = {Xt : t ≥ 0} is a given Lévy process defined on a complete7 probability space .(Ω, F , P ). Definition 5.3 Let .X = {Xt : t ≥ 0} be a Lévy process defined on a complete probability space .(Ω, F , P ). A collection of .σ -fields .{Ast : 0 ≤ s < t, s, t ∈ [0, ∞)} is called additive if a. .Ars ∨ Ast = Art b. For .0 ≤ t0 < t1 < · · · < tn , .{At0 ,t1 , At1 t2 , . . . , Atn−1 tn } is a collection of independent .σ -fields. For .0 ≤ s < t < ∞, the .σ -fields Dst (X) = σ {Xv − Xu : s ≤ u < v ≤ t}
.
are referred to as differential .σ -fields associated with X. We say that X is adapted to an additive family .{Ast }s,t if .Xt − Xs is .Ast -measurable for .s < t. Note that if .{Ast } is additive, then the family of .σ -field completions .{Ast } is also additive (Exercise 16). Also, if X is adapted to an additive family .{Ast }, then X is an additive process and .Dst (X) ⊂ Ast , provided that .X0 = 0 almost surely. Next, let us analyze the number of jumps of various nonzero sizes which occur in sample paths of a Lévy process X within a specified time t. Since almost all sample paths are right-continuous with left limits, the counting random variables will only be defined up to a subset of .ω ∈ Ω of probability zero without further mention of this. The (random) set of jump times is denoted by Γ = Γ (ω) := {t > 0 : Xt (ω) − Xt − (ω) = 0}.
.
(5.18)
Also we consider the set of jump times and sizes jointly, as given by J = J (ω) := {(t, Xt (ω) − Xt − (ω)) : t ∈ Γ (ω)}.
.
(5.19)
Let R0 = R\{0}.
.
Note that J (ω) ⊂ (0, ∞) × R0
.
is a countable set (Exercise 15). Let .B0 denote the collection of all Borel subsets of .(0, ∞) × R0 , and define
7 See
BCPT p. 225.
5 Processes with Independent Increments
77
R0 := {A ∈ B0 : A ⊂ (0, δ) × {Δ : |Δ| > δ −1 } for some δ > 0}.
.
Notice that since .δ may be arbitrarily large, membership in .R0 is unrestrictive if one considers fixed positive jump sizes. One may check that .R0 is closed under finite unions and relative complements, i.e., a measure-theoretic ring of subsets, but generally not under countable unions nor complements. However, .R0 is also closed under countable intersections (Exercise 18). Define the possibly infinite valued counting random variables N(A) = N(A, ω) := card{A ∩ J (ω)},
.
A ∈ B0 .
(5.20)
Note that for cádlág sample paths, .N(A) must be finite for .A ∈ R0 . Proposition 5.9 For .A ∈ B0 , either .N(A) is a Poisson distributed random variable with finite intensity .ν(A) := EN(A) or .N(A) is .+∞ with probability one. Moreover, the random set function .A → N(A), A ∈ B0 is a.s. a measure, and the intensity .A → ν(A) is a measure on .B0 . If .A ∈ R0 , then .N(A) is Poisson distributed with finite intensity. Proof First one must check that .N(A) is a random variable; i.e., .ω → N(A, ω) is measurable. The proof of measurability is outlined in Exercise 19 where, in fact, one shows that for .A ∈ R0 and .A ⊂ (s, t] × R0 , for some .0 ≤ s < t, .N(A) is measurable with respect to .Ds,t (X). Consider .A ∈ R0 , and let At := A ∩ ((0, t] × R0 ) , t ≥ 0.
.
The stochastic process .Nt := N(At ) is a right-continuous nondecreasing step function with unit jumps,.N0 = 0. Moreover, for any .0 ≤ s < t, Nt − Ns = N(At \As ) = N(A ∩ ((s, t] × R0 ))
.
is measurable with respect to the additive differential filtration .Ds,t (X), and hence is an additive process. Finally, one has continuity in probability at each t, since for any .ε > 0, there is a .δ > 0 such that P (|Nt − Ns | > ε) ≤ P (|Xt − Xs | > δ −1 ) → 0 as s → t.
.
In view of Corollary 5.8, .Nt − Ns is Poisson distributed for .0 ≤ s < t. Since A ∈ R0 , it follows that .A = At for a sufficiently large t, and hence .N(A) = Nt has a Poisson distribution with finite intensity. Now, more generally for .A ∈ B0 , there is a nondecreasing sequence .B1 ⊂ B2 ⊂ · · · in .R0 such that .A = ∪∞ n=1 Bn . Thus, .N (A) is the limit of a nondecreasing sequence of Poisson distributed random variables with respective (nondecreasing) intensities .νn = ν(Bn ) < ∞, n ≥ 1. Let .ν := limn νn . If .ν < ∞, then it follows that .N (A) is Poisson distributed with parameter .ν. If .ν = ∞, then .P (N (A) < ∞) = 0. Finally, note that for
.
78
5 Processes with Independent Increments
fixed .ω, .A → N (A, ω) is simply a counting measure, in particular a nonnegative integer valued countably additive set function with .N(∅, ω) = 0. It follows from Lebesgue’s monotone convergence theorem that .A → ν(A) = EN(A) is also a measure on .B0 . Remark 5.5 We will make the convention that a Poisson distribution with infinite intensity corresponds to an extended nonnegative integer valued random variable N with .P (N = 0) = 1. Definition 5.4 The random measure N will be referred to as the space-time jump counting measure associated with X. The intensity measure .ν(·) is referred to as the Lévy measure of X. The random field .N = {N(A) : A ∈ B0 } obtained from the Lévy process X provides counts of the numbers of jumps of specified range of sizes in specified periods of time. From N, one may measure the contributions to the evolution of X arising solely from jumps by introducing the finite sums:
M(A) = M(A, ω) :=
.
z,
A ∈ R0 .
(5.21)
(t,z)∈A∩J (ω)
Since k k−1 k , ])), N(A ∩ ((0, ∞) × ( n→∞ n n n
M(A) = lim
.
(5.22)
k
the map .ω → M(A, ω) is measurable for each .A ∈ R0 . In fact, it follows that M inherits measurability with respect to the additive differential filtration .{Ds,t (X) : 0 ≤ s < t < ∞} from that of N ; see Exercise 19. A commonly used alternative notation for (5.22) is to write .M(A) = z1A (t, z)N (dt × dz), A ∈ R0 . (5.23) R0
(0,∞)
With this, we may consider the evolution of X apart from jumps by defining Yt (A) := Xt − M(At ),
.
A ∈ R0 ,
t ≥ 0,
(5.24)
where as before .At = A ∩ ((0, t] × R0 ). Definition 5.5 The measure M in (5.23) will be referred to as the space-time accumulated jump size measure associated with X. The processes .{Yt (A) : t ≥ 0}, for .A ∈ R0 , in (5.24) will be referred to as the jump reduced processes associated with X.
5 Processes with Independent Increments
79
The following lemma8 is fundamental to discerning the structure of the Lévy process X. Lemma 2 (Itô’s Fundamental Lemma) Let .A = {Ast : s < t, s, t ∈ [0, T ]} be an additive family of .σ -fields, .X = {Xt } a Lévy process, and .Y = {Yt } a Lévy process having Poisson distributed increments, both X and Y being adapted to .A. If with probability one there is no common jump point for X and Y , then the two processes are independent. That is, if P (Xt = Xt − and Yt = Yt − for some t ∈ (0, T ]) = 0,
.
then .σ {Xt : t ∈ T } and .σ {Yt : t ∈ T } are independent. Proof It is sufficient to establish independence of .Xt − Xs and .Yt − Ys for fixed 0 ≤ s < t. Let .s < tn0 < tn2 < · · · < tnn = t be a partition of .(s, t] into n disjoint subintervals .(tn,j −1 , tnj ], j = 1, . . . , n, of equal length. Define,
.
.
Xnj = Xtnj − Xtn,j −1 ,
Ynj = Ytnj − Ytn,j −1 (5.25)
Xt − Xs =
n
Xnj
j =1
Yt − Ys =
n
Ynj .
j =1
The hypothesis of the lemma implies that as .n → ∞, simultaneously for all n, Xnj = 0 on all intervals where .Ynj > 0. Thus, there exists .N = N(ω) such that for all .n ≥ N (ω), the above is true. Given .ε > 0, there then exists N (nonrandom), such that with probability .1 − ε, .Xnj = 0 on all intervals, where .Ynj > 0 for all .n ≥ N. The small probability .ε may be thought to be of that of the exceptional j ’s for which .Xnj = 0 but .Ynj > 0. So, we can assume that these do not occur. We will consider n of the form .n = 2m because as m increases, the exceptional set decreases to the empty set in finite (random) time, under the hypothesis. So for all m .n ≥ 2 0 = N for some positive integer .m0 , one has disjoint sets J of intervals, and .{1, . . . , n}\J = I , say, such that .Xnj = 0, Ynj > 0 for .j ∈ J and .Ynj = 0 for .j ∈ I . The sets J and I depend on .ω, in general, but one may split the sample space into disjoint sets corresponding to different (nonrandom) sets J of intervals. We may also assume that on those intervals where .Ynj > 0, it is the case that .Ynj = 1. This is because the probability that on any one of the n subintervals the event .[Ynj > 1] occurs is no more than .
ne−
.
8 See
λ(t−s) n
O(
(λ(t − s))2 1 ) = O( ), 2 n n
Itô (2008), p. 58, for a more detailed proof of this fundamental lemma.
80
5 Processes with Independent Increments
where .λ is the parameter of the Poisson process Y and each of the subinterval lengths is . t−s n . Fix an integer .k ≥ 0. We will consider the conditional distribution of .Xt − Xs given .Yt − Ys = k. Let n be very large as indicated above. Note that the pairs .(Xnj , Ynj ), j = 1, . . . , n are i.i.d., in view of the stationary independent increments. The event .Yt − Ys = k is the disjoint union of . nk sets J corresponding to the choices of k indices j for which .Ynj = 1, with .Ynj = 0 for the complementary set I of indices. We consider the conditional distribution of .Xt − Xs , given such a set J , and show that it does not depend on k or J . Consider the distribution of thetwo conditionally independent pairs of random variables: .( nj=1 Ynj 1[Ynj >0] , nj=1 Xnj 1[Ynj >0] ), n .( 1[Ynj =0] , nj=1 Xnj 1[Ynj =0] ). The event .[Yt − Ys = k] means that the first j =1 Ynj pair is .(k, j ∈J Xnj 1[Ynj >0] ) and the second pair is .(0, nj=1 Xnj 1[Ynj =0] ), where J is the index set of k intervals, in which .Ynj = 1, and I is the complementary set .{1, 2, . . . , }\J . On the set J , . j ∈J Xnj = o(1) in probability, denoted .op (1), because it is the sum of k i.i.d random variables, each of the order .op (1/n). This last fact makes use of the hypothesis of the lemma, which implies (see Remark 5.6) P (Xnj = 0, Ynj > 0) = op (1/n).
.
Hence, conditionally given .Yt − Ys = k, . j ∈I Xnj 1[Ynj =0] is, asymptotically in n probability, the same as the sum . j =1 Xnj = Xt − Xs . In terms of conditional expectations, this may be expressed as E(exp{iξ
n
.
Xnj }|J ) − E(exp{iξ
Xnj }|J ) → 0
(5.26)
j ∈I
j =1
in probability as .n → ∞ and .E(exp{iξ j ∈I Xnj }|J ) = E exp{iξ j ∈I Xnj }. This n shows that the conditional distribution of .Xt − Xs = j =1 Xnj is asymptotically the same no matter what .k = Yt − Ys is. This implies .Xt − Xs is independent of .Yt − Ys . Remark 5.6 Let .Θn = P (Xnj 1[Ynj >0] = 0). Then, P (Xnj = 0, Ynj > 0) ≤ 1 − Θn ,
.
so that P (Xnj = 0 for for all j such that Ynj > 0) ≤ (1 − Θn )n ,
.
which converges, as .n → ∞, to P (In the interval [s, t] jumps of X and Y take place on disjoint sets) = 1,
.
5 Processes with Independent Increments
81
by hypothesis, unless .Xt − Xs = 0 with probability one (a case in which the independence is trivially true). This implies .Θn = o(1/n). One may perhaps also argue more simply that the event .P (Xnj = 0) = 1 − exp{−γ (t − s)/n} = O(1/n), where .γ is the parameter of the exponential jump distribution of X. But the probability that X and Y have both jumps in the interval .[(j −1)(t −s)/n, j (t −s)/n] is of smaller order than .O(1/n). Let At := A ∩ ((0, t] × R0 ) , t ≥ 0.
.
The stochastic process Nt ≡ Nt (A) := N(At )
.
(5.27)
is a.s. a right-continuous nondecreasing step function with unit jumps, .N0 = 0. Proposition 5.10 For each .A ∈ R0 , the jump reduced process .{Yt (A) : t ≥ 0} associated with X by (5.24) is a Lévy process and is independent of the Poisson process .{Nt (A) : t ≥ 0}. Proof Observe that for fixed .A ∈ R0 , .{Yt (A) : t ≥ 0} is a Lévy process and {Nt (A) : t ≥ 0} is a Poisson process with step function sample paths having unit jumps. Moreover, these two processes are adapted to a common additive differential filtration .{Ds,t : 0 ≤ s < t < ∞}. Clearly, by sample path, the two processes cannot have a common Jump time. Thus, the assertion follows from Lemma 2.
.
Proposition 5.11 For disjoint .A1 , . . . , An ∈ R0 , n ≥ 1, the stochastic processes i = {N i ≡ N(Ai ) : t ≥ 0}, .i = 1, 2, . . . n and .{Y (∪n Aj ) : t ≥ 0} are .N t t t j =1 mutually independent. Moreover, this independence statement is also valid for the corresponding space-time accumulated jump size processes .M i , i = 1, 2, . . . n, in place of the jump count processes .N i , i = 1, 2, . . . . Proof A simple recursive relationship provides the key to tracking independence. Observe that if one denotes the jump reduced (hence Lévy) process .Y (A) = {Yt (A) : t ≥ 0} associated with .X = {Xt : t ≥ 0} by .YX (A) and similarly denote the space-time jump counting measure associated with X by .NX , then, for the processes i i i j .N , Yt := Yt (∪ j =1 A ), i = 1, 2 . . . n, one has a recursive relationship given by Nti+1 = NY i (Ai+1 t ),
.
and
Yti+1 = (YY i )t (Ai+1 ), i = 0, 1 . . . , n − 1,
where .Y 0 = X. Thus, one has, denoting the join9 of .σ -fields by .∨, .σ (N i+1 ) ∨ σ (Y i+1 ) ⊂ σ (Y i ), i = 0, 1, . . . , n − 1, and hence
9 The
join refers to the .σ -field generated by a union of .σ -fields.
82
5 Processes with Independent Increments
σ (N i+1 ) ∨ σ (N i+1 ) ∨ · · · ∨ σ (N n ) ∨ σ (Y n ) ⊂ σ (Y i ),
i = 0, 1, . . . , n − 1.
.
Applying the previous proposition (i.e., for .n = 1) to the Lévy process .{Yti (Ai+1 ) : t ≥ 0}, it follows that .σ (N i+1 ) is independent of .σ (Y i+1 ). Now let .Fi ∈ σ (N i ), i = 1, 2, . . . , n and .G ∈ σ (Y n ). Then .Fi+1 ∩ · · · ∩ Fn ∩ G ∈ σ (Y i ) for each .i = 0, 1 . . . , n − 1. Therefore, P (∩nj=1 Fj ∩ G) = P (F1 )P (∩nj=2 Fj ∩ G) = · · · =
n
.
P (Fj )P (G).
j =1
That is, .σ (N 1 ), . . . , σ (N n ), σ (Y n ) are independent. The independence readily extends to the corresponding accumulated jump size processes .M i , i = 1, 2, . . . n, in place of the jump count processes .N i , i = 1, 2, . . . , via the a.s. limit formulae k k−1 k N i (A ∩ (0, ∞) × ( , ]), i = 1, . . . , n. n→∞ n n n
M i (A) = lim
.
k
Corollary 5.12 For any disjoint .A1 , . . . , An in .B0 , .n ≥ 1, the Poisson random variables .N (A1 ), . . . , N (An ) are independent. Proof First, consider the case that each .Ai ∈ R0 , i = 1, 2, . . . n. Then, for sufficiently large t one has .Ai = Ait , i = 1, . . . , n. So the assertion is immediate in this case. More generally, for any .Ai ∈ B0 there is a nondecreasing sequence ∞ B . Thus, each .Ai ∈ A := i .Bi,1 ⊂ Bi,2 ⊂ · · · in .R0 such that .A = ∪ i n=1 i,n σ (Bi,1 , Bi,2 , . . . ), i = 1, 2, . . . n and .Ai , i = 1, 2, . . . , n are independent .σ -fields. Proposition 5.13 For .A ∈ R0 , iξ M(A) .Ee = exp{−
(1 − eiξ z )ν(dt × dz)},
ξ ∈ R.
A
If furthermore .A ⊂ {(t, z) : |z| < a} for some .0 < a < ∞, then
EM(A) =
zν(dt × dz),
.
and
Var(M(A)) =
A
Proof For .A ∈ R0 and integer .n ≥ 1, let Ank := {(t, z) ∈ A :
.
so that a.s.
z2 ν(dt × dz). A
k+1 k a 0, .A = (0, t] × {z : |z| > δ} ∈ R0 . It follows from Proposition 5.9 that EN(A) =
ν(dt × dz) < ∞.
.
A
On the other hand, for “small jumps,” one also has the following. Proposition 5.14 For .A = {(t, z) : 0 ≤ t ≤ s, |z| < 1}, . z2 ν(dt × dz) < ∞. A
Proof Let .t, δ > 0 and consider .Js (δ) = {(s, z) : 0 < s ≤ t, δ < |z| < 1} ∈ R0 . Since .{M(Js (δ)) : 0 < s ≤ t} and .{Xs −M(Js (δ)) : 0 < s ≤ t} are independent and .Jt (δ) = J (δ), it follows that .M(J (δ)) and .Xt − M(J (δ)) are independent. Thus,
84
5 Processes with Independent Increments
Eeiξ Xt = Eeiξ M(J (δ)) Eeiξ {Xt −M(J (δ))} .
.
Thus, using Proposition 5.13 and the bound .cos(x) ≤ 1 −
x2 4 ,
for .|x| ≤ 1,
|Eeiξ Xt | ≤ |Eeiξ M(J (δ)) | = exp{
(cos(ξ z) − 1)ν(ds × dz)}
.
≤ exp{−
ξ2 4
J (δ)
z2 ν(ds × dz)},
|ξ | < 1.
J (δ)
If .
{(s,z):0 1/2,
.
88
5 Processes with Independent Increments
and using (5.34), .∪nj=m+1 Aj ∩ Bj ⊂ [Xm,n > ε], so that, since the events .Aj ∩ Bj , .m + 1 ≤ j ≤ n, are disjoint, one has by the previous lemma that n .
P (Aj ∩ Bj ) ≤ β(m, n, T , ε).
j =m+1
But the events .Aj and .Bj are independent and .P (Bj ) > 1/2. Thus,
.
n 1 1 P (Aj ) P (∪nj=m+1 Aj ) ≤ 2 2 j =m+1
n
≤
n
P (Aj )P (Bj ) =
j =m+1
P (Aj ∩ Bj )
j =m+1
≤ β(m, n, T , ε).
(5.35)
Since .[maxm≤j ≤n Xmj > 2ε] ⊂ ∪nj=m+1 Aj the proof of the lemma is complete. Although .sup0 0.
Now for any two positive numbers .λ1 , λ2 , and .z > 0, one has ν(bλ−1 z, ∞) = λ1 λ2 ν(z, ∞) = ν(bλ−1 b−1 z, ∞). 1 λ2 1 λ2
.
Thus, if .ν(1, ∞) > 0, then .bλ1 λ2 = bλ1 bλ2 is positive, continuous, and log-linear on .(0, ∞). Hence, .bλ = λα for some real number .α. Moreover, since .λν(z, ∞) is nondecreasing in .λ, one has .α > 0. Let .θ = α1 > 0. Taking .λ = zθ , one obtains θ + = ν(1, ∞) > 0, we have .z ν(z, ∞) = ν(1, ∞) and hence, writing .ν ν(dz) =
.
ν+ dz zθ+1
on (0, ∞).
Defining .ν + = 0 in the case .ν(1, ∞) = 0 makes the formula true when .ν(1, ∞) = 0 as well. By repeating all of the above considerations with .z < 0, one obtains − = ν(−∞, 0) ≥ 0 such that .ν ν(dz) =
.
ν− dz |z|θ+1
on (−∞, 0).
To check that one obtains the same exponent .θ in both cases, use the relation (1) .EN −z) ∪ (z, ∞)) = ENλ(2) ((0, λ ((0, t] × (−∞, t] × (−∞, −z) ∪ (z, ∞)). Finally, note that since . {z:|z|≤1} z2 ν(dz) < ∞ and . {z:|z|>1} ν(dz) < ∞, one has either + = ν − = 0 or .0 < θ < 2. In this case, the formula for the characteristic function .ν follows directly from the Lévy-Khinchine formula. If .ν + = ν − = 0, then .N = 0 and Xt = mt +
.
√
vBt ,
(v > 0), t ≥ 0.
The nondegeneracy assumption is used to obtain .v > 0. Standard scaling properties of Brownian motion now imply .θ = 2. Definition 5.8 The parameter .θ is called the stable law exponent. In the case .cλ = 0, the stable process is said to be self-similar. Example 3 provides an important self-similar example. The scaling relations 1 (5.10) and (5.39) are easily seen to be equivalent by replacing .λ by .λ− θ in (5.10), since .α(λ) = λθ in the latter.
Exercises
95
Exercises t 1. Show that if Xt = N j =1 Yj , t > 0, X0 = 0, is a compound Poisson process with i.i.d. nonnegative increments Yj , j ≥ 1, independent of the Poisson process Nt , t ≥ 0 with intensity parameter ρ > 0, then m = 0 in Lemma 1, and ν(dx) = ρP (Y1 ∈ dx). 2. For the Poisson random measure, show that ρ(A) = EN(A), A ∈ S, defines a measure and verify the formula (5.5). 3. Show that the counting process in {N(t) : t ≥ 0} in the proof of Corolllary 5.8 is continuous in probability. 4. Verify the multiplicative property (5.11). [Hint: Iteratively calculate e−tα(λ1 λ2 )ψ(1) = Ee−Xα(λ1 λ2 )t .] 5. (Heavy Tails) Suppose that X is a stable law with E|X|m < ∞ for all m = 1, 2, . . . . Show that X must be Gaussian. If X has a stable law of exponent θ ∈ (0, 2), determine the largest real number m such that E|X|m < ∞. 6. (Self-similarity) A real-valued stochastic process X = {Xt : t ≥ 0}, X0 = 0, is said to be self-similar if for each λ > 0 there is a constant cλ > 0 such that the stochastic process {Xλt : t ≥ 0} has the same distribution as {cλ Xt : t ≥ 0}. (i) Show that standard Brownian motion started at 0 is self-similar with cλ = 1 λ2 . (ii) Show that in general cλ = λγ for some parameter γ ≥ 0, referred to as the scaling exponent. (iii) Show that if X = {Xt : t ≥ 0} is a stochastically continuous, stationary, self-similar process, then X is a.s. constant. [Hint: Note that by stationarity Xt+h = Xt in distribution for arbitrary h ≥ 0.] 7. Show that the Gaussian process in Theorem 5.5 may be represented as a deterministic translation of a time change of Brownian motion. 8. Let λ > 0 and let τa be the first passage time to a > 0 for standard Brownian √ motion B starting at zero. Show that τa < ∞ a.s., and Ee−λτa = e− 2λa . [Hint: For y > 0, P (max0≤s≤t Bs > y) ≤ P (max0≤s≤t {Bs − 2
y 2t s}
y2
> y2 ) ≤ e− 2t , so
that P (τa < ∞) = 1. Let Mt = exp{λBt − λ2 t }, t ≥ 0. Then, 1 = EMt∧τa since Mt∧τa , t ≥ 0, is a martingale. Let t → ∞.] 9. (Infinitely Divisible Transition Probabilities) Supppose that X = {Xt : t ≥ 0} is a homogeneous Lévy process. Apply the theory developed in this chapter to show that X is a Markov process on R with transition probabilities p(t; x, B) = qt (B − x), x ∈ R, B ∈ B(R), where qt is a probability on R with characteristic function of the form qˆt (ξ ) = exp(tϕ(ξ )), where ϕ(ξ ) = imξ − v2 ξ 2 + R (eiξ z − iξ z 1 − 1+z 2 )ν(dz).
10. Let B = {(Bt(1) , Bt(2) ) : t ≥ 0} denote two-dimensional standard Brownian (2) motion starting at (0, 0). Let τy = inf{t ≥ 0 : Bt ≥ y}. Show that (1) C = {Cy := Bτy : y ≥ 0} is a (symmetric) Cauchy process, i.e., a Lévy
96
5 Processes with Independent Increments
process whose increments are distributed according to the symmetric Cauchy distribution. [Hint: Use the strong Markov property to argue that {Cy : y ≥ 0} (2) is a Markov process, and use the first passage time density for {Bt , t ≥ 0} and a change of variables to compute the stationary transition probabilities y p(y; x, dz) = π1 y 2 +(z−x) 2 dz for C. ] 11. Show that for processes with right-continuous sample paths having left limits, continuity in probability at t is equivalent to the condition that Xt = Xt − a.s. 12. Let X = {Xt : t ≥ 0} and Y = {Yt : t ≥ 0} be homogeneous Poisson processes defined on a probability space (Ω, F, P ). Suppose that a.s. X and Y do not have a common jump time. Give an alternative argument to Lemma 2 for this special case, to prove that X and Y are independent based on their respective holding time sequences. t (ξ ) = Eeiξ(Xt −X0 ) in the following. In 13. Calculate the characteristic functions Q each case, show directly that the distribution of Xt is infinitely divisible. (a) {Xt : t ≥ 0} is Brownian motion. (b) {Xt : t ≥ 0} is a compound Poisson process. 14. Suppose that Xt = Yt + Zt , t ≥ 0, where Y = {Yt : t ≥ 0} is a compound Poisson process and Z = {Zt : t ≥ 0} is a Brownian motion, independent of Y . Show that {Xt : t ≥ 0} is a process with stationary independent increments. 15. Show that J (ω) := {(t, Xt (ω) − Xt − (ω)) : t ∈ Γ (ω)} is a countable set, where Γ (ω) := {t > 0 : Xt (ω) − Xt − (ω) = 0}. [Hint: Consider jumps of size n1 or greater for each n = 1, 2, . . . }.] 16. (i) Show that if {Ast } is additive, then the family of σ -field completions {Ast } is also additive. (ii) Show that if X is adapted to an additive family {Ast }, then X is an additive process and Dst (X) ⊂ Ast , provided that X0 = 0 almost surely. (iii) Prove that additivity is preserved under convergence in probability. 17. Show that σ -field completions of an additive differential filtration yield an additive differential filtration. Show that X = {Xt : t ≥ 0}, X0 = 0, is an additive process if and only if it is adapted to an additive differential filtration. 18. Consider R0 := {A ∈ B0 : A ⊂ (0, δ) × {Δ : |Δ| > δ −1 } for some δ > 0}. Show that R0 is closed under finite unions and relative complements, and countable intersections but generally not under countable unions nor complements. [Hint: If ∩n An = ∅, then it belongs to R0 by definition. If A ⊂ B ∈ R0 then B\A ⊂ B implies B\A ∈ R0 .] 19. Define N (A) = N(A, ω) := card{A ∩ J (ω)}, A ∈ R0 . Show that ω → N(A, ω) is measurable by completing the following steps: Let D = {Ds,t : 0 ≤ s < t < ∞} denote the additive differential filtration associated with the Lévy process X. For 0 < s < t, z > 0, Ss,t,z := (s, t] × (z, ∞), and let Qs,t denote the rationals in (s, t].
Exercises
97
(i) Show N(Ss,t,z ) is measurable with respect to {Ds,t : 0 < s < t < ∞}. [Hint: Show [N(Ss,t,z ) ≥ 1] ∈ Ds,t by expressing as countable unions and intersections of events of the form [Xq − Xp ≥ z + 1/m], p, q ∈ Qs,t . Then, proceed inductively by expressing [N(Ss,t,z ) ≥ k + 1] = ∪q∈Qs,t [N(Ss,q,z ) ≥ k] ∩ [N(Sq,t,z ) ≥ 1].] (ii) Show that N(A ∩ Ss,t,z ) is measurable with respect to Ds,t for A ∈ B0 . [Hint: Consider the collection of all A ∈ B0 such that N(A ∩ Ss,t,z ) is measurable with respect to Ds,t , and use the π -λ theorem.] (iii) Show that N(A ∩ S˜s,t,z ) is measurable with respect to Ds,t for A ∈ B0 , where S˜s,z,t := (s, t] × (−∞, −z), z > 0. [Hint: Repeat similar steps as for Ss,t,z .] (iv) For A ∈ R0 , show N(A) is measurable with respect to Ds,t . [Hint: Consider A ∈ R0 with A ⊂ (s, t] × R\{0} for some 0 < s < t. Consider A = A ∩ Ss,t,z ∪ A ∩ S˜s,t,z for small z > 0. ] 20. Show that if one also assumes stationarity of increments in Theorem 5.7, then one may in fact directly conclude that ρn = npn ≤ − log P (Xt = 0) and c n P (Xt = 0) > 0. [Hint: P (Xt = 0) = P (∩ni=1 A(n) i ) = (1 − pn ) . Conclude (n) that if P (Xt = 0) = 0, then P (Ai ) = 1 for each i and hence Xt ≥ n for each n. ]
Chapter 6
The Stochastic Integral
In this chapter, Itô’s stochastic integral and some of its basic properties are carefully introduced. The essential role of martingale theory is made manifestly clear through this development.
A one-dimensional Brownian motion .X = {Xt : t ≥ 0} starting at x and having drift coefficient .μ and diffusion coefficient .σ 2 may be defined by the transformation of standard Brownian motion .B = {Bt : t ≥ 0} given by Xt = x + μt + σ Bt ,
.
t ≥ 0.
(6.1)
Symbolically, this may be viewed as the integration, which “defines” the equation in differential form denoted by dXt = μdt + σ dBt ,
.
X0 = x.
(6.2)
Along these same lines, a diffusion .X = {Xt : t ≥ 0} on .R may be thought of as a Markov process that is locally like a Brownian motion. That is, in a sense to be made precise, the following relation holds, dXt = μ(Xt )dt + σ (Xt )dBt ,
.
(6.3)
where $\mu(\cdot), \sigma(\cdot)$ are given functions on $\mathbb{R}$ and $B = \{B_t : t \ge 0\}$ is a standard one-dimensional Brownian motion. In other words, conditionally given $\{X_s : 0 \le s \le t\}$, for sufficiently smooth $\mu(\cdot)$ and $\sigma^2(\cdot)$, in a small time interval $(t, t + dt]$, the displacement $dX_t := X_{t+dt} - X_t$ is approximately the Gaussian random variable
μ(Xt )dt + σ (Xt )(Bt+dt − Bt ), having mean .μ(Xt )dt and variance .σ 2 (Xt )dt. However, the sample paths of Brownian motion are continuous yet wild (nowhere differentiable). To be precise, one has the following result.
.
Proposition 6.1 Let .B = {Bt : t ≥ 0} be a standard Brownian motion starting at zero. Fix .t > 0. Then, one has
.
max n |
1≤k≤2
k−1
(B(m+1)2−n t −Bm2−n t )2 −k2−n t| → 0
a.s.
as n → ∞.
(6.4)
m=0
2 −n Proof The partial sums .Yk ≡ Yk,n = k−1 m=0 (B(m+1)2−n t − Bm2−n t ) − k2 t, k = n 1, . . . , 2 , form a martingale since the summands are mean zero independent random variables. Writing .N = 2n and noting that the N terms are i.i.d. with the common distribution of .Z 2 − EZ 2 , where Z is .N(0, 2−n t), it follows from Doob’s maximal inequality (Corollary 1.16) that for all n, n
n
P ( max n |Yk | ≥ 2− 4 ) ≤ EYN2 /2− 2
.
1≤k≤2
3
n
= N 2 2 var(Z 2 ) = 2 2 n var(Z 2 ) n 3 3 t ≤ 2 2 n EZ 4 = 2 2 n 3( )2 = 3t 2 2− 2 . N
(6.5)
Since the right-hand side is summable (over .n = 1, 2, . . . ), it follows from the Borel-Cantelli lemma that there is a finite .n0 (ω), depending on the sample point .ω ∈ Ω, such that n
P ( max n |Yk | < 2− 4 for all n ≥ n0 ) = 1.
.
1≤k≤2
(6.6)
In particular, this implies (6.4).
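The dyadic quadratic variation statement (6.4) is easy to visualize numerically: for a simulated Brownian path, the sums of squared increments over the level-$n$ dyadic partitions of $[0,t]$ settle down to $t$, while the sums of absolute increments blow up, consistent with the unbounded variation noted in the next paragraph. A minimal sketch (illustrative only; the path is sampled exactly on the finest dyadic grid and then coarsened):

```python
import numpy as np

rng = np.random.default_rng(6)
t, n_max = 1.0, 20

# One Brownian path on the finest dyadic grid of mesh 2^{-n_max} t.
increments = rng.normal(0.0, np.sqrt(t / 2 ** n_max), size=2 ** n_max)
B = np.concatenate([[0.0], np.cumsum(increments)])

for n in (4, 8, 12, 16, 20):
    step = 2 ** (n_max - n)            # coarsen to the level-n dyadic grid
    incr = np.diff(B[::step])
    print(n, (incr ** 2).sum(), np.abs(incr).sum())
# squared-increment sums approach t = 1; absolute-increment sums grow like 2^{n/2}
```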
As a result of Proposition 6.1, outside a set of probability 0, a Brownian path is of unbounded variation on every nondegenerate interval (Exercise 1). Thus, (6.3) cannot be regarded as an ordinary differential equation such as dXt /dt = μ(Xt ) + σ (Xt )dBt /dt.
.
The precise sense in which (6.3) is valid and, in fact, may be solved to construct a diffusion with coefficients .μ(·), σ 2 (·), is the goal of this chapter and the next. The first task is to provide a useful definition of the integrated version of (6.3) expressed as:
t
Xt = x +
.
0
μ(Xs )ds +
t
σ (Xs )dBs . 0
(6.7)
6 The Stochastic Integral
101
The first integral may generally be viewed as an ordinary Riemann integral in cases, for example, when .μ(·) is continuous and .s → Xs is continuous. But the second integral cannot be defined as a Riemann-Stieltjes integral, since the Brownian paths are of unbounded variation on .[0, t] for every .t > 0. On the other hand, for a constant function .σ (x) ≡ σ , the second integral has the obvious meaning as .σ · (Bt − B0 ), so that (6.7) becomes
t
Xt = x +
.
μ(Xs )ds + σ (Bt − B0 ).
(6.8)
0
It turns out that if, for example, .μ(x) is Lipschitzian, then (6.8) may be solved sample path-wise, more or less by Picard’s well-known method of iteration for ordinary differential equations, to obtain a continuous solution .t → Xt ; see Exercise 3. To set the general mathematical framework for defining the stochastic integral, let .(Ω, F, P ) be a probability space on which a standard one-dimensional Brownian motion .B = {Bt : t ≥ 0} is defined. Suppose .{Ft : t ≥ 0} is an increasing family of sub-.σ -fields of .F, referred to as a filtration of .F, such that .
(i) Bs is Fs -measurable (s ≥ 0). (ii) {Bt − Bs : t ≥ s} is independent of (events in) Fs (s ≥ 0).
(6.9)
Condition (i) says that the Brownian motion .B = {Bt : t ≥ 0} is adapted to the filtration .{Ft : t ≥ 0}. For example, one may take .Ft = σ {Bs : 0 ≤ s ≤ t}. This is the smallest .Ft one may take in view of (i). Often it is important to take larger .Ft . As an example, let .Ft = σ [{Bs : 0 ≤ s ≤ t}, .{Zλ : λ ∈ Λ}], where .{Zλ : λ ∈ Λ} is a family of random variables independent of .{Bt : t ≥ 0}. For technical convenience, also assume that .Ft , .F are P –complete, i.e., if .N ∈ F and .P (N) = 0, then N and each of its subsets belong toeach .Ft . For the ensuing definitions, fix two numbers (time points) .0 ≤ α < β < ∞. Definition 6.1 A process .f = {f (t) : α ≤ t ≤ β} is progressively measurable if, for each t, the map .(s, ω) → f (s, ω) is .B[α, t] ⊗ Ft -measurable on .[α, t] × Ω, where .B[α, t] is the Borel .σ -field on .[α, t]. Remark 6.1 If the process .f = {ft : α ≤ t ≤ β} is either right-continuous, or leftcontinuous, then it is progressively measurable (e.g., see BCPT,1 Proposition 3.6, p. 60.). Definition 6.2 A real-valued progressively measurable stochastic process .f = {f (t) : α ≤ t ≤ β} is said to be a nonanticipative functional on .[α, β]. Such an f is said to belong to .M[α, β] if
1 Throughout,
Theory.
BCPT refers to Bhattacharya and Waymire (2016), A Basic Course in Probability
102
6 The Stochastic Integral
β
E
.
f 2 (t)dt < ∞.
(6.10)
α
If .f ∈ M[α, β] for all .β > α, then f is said to belong to .M[α, ∞). Remark 6.2 It is worthwhile for later considerations to note that .M[α, β] is a subspace of .L2 ([α, β] × Ω, B[α, β] ⊗ F, λ × P ) with the norm ||f ||2L2 ([α,β]×Ω,B⊗F ,λ×P ) := E(
β
.
|f (s)|2 ds),
f ∈ M([α, β]).
(6.11)
α
To initiate the definition of the stochastic integral, one may begin by consideration of nonanticipative random step functionals to serve as integrands in a type of “Riemann sum”. One may easily check by inspection that the following step functionals are indeed non-anticipative. Definition 6.3 A real-valued stochastic process .{f (t) : α ≤ t ≤ β} is said to be a nonanticipative step functional on .[α, β] if there exists a finite set of time points .α = t0 < t1 < · · · < tm = β and random variables .fj .(0 ≤ j ≤ m), such that .
(i) fj is Ftj -measurable (0 ≤ j ≤ m). (ii) f (t) = fj for tj ≤ t < tj +1 (0 ≤ j ≤ m − 1), f (β) = fm .
(6.12)
Definition 6.4 (Itô Integral of a Step Functional) The stochastic integral, or the Itô integral, of the nonanticipative step functional .f = {f (t) : α ≤ t ≤ β} in (6.12) is the stochastic process defined for .α ≤ t ≤ β by
t
f (s)dBs
.
α
⎧ ⎪ ⎨f (t0 )(Bt − Bt0 ) ≡ f (α)(Bt − Bα ) for t ∈ [α, t1 ], k := ⎪ f (tj −1 )(Btj − Btj −1 ) + f (tk )(Bt − Btk ) for t ∈ (tk , tk+1 ]. ⎩ j =1
(6.13) Observe that the Riemann-type sum (6.13) is obtained by taking the value of the integrand at the left end point of a time interval .(tj −1 , tj ]. As a consequence, for each t .t ∈ [α, β] the Itô integral . α f (s)dBs is .Ft –measurable, i.e., it is nonanticipative. Proposition 6.2 The Itô integral given in Definition 6.4 has the following properties. a. (Continuous, Nonanticipative, Additive) If f is a nonanticipative step functional t on .[α, β], then .t → α f (s)dBs is nonanticipative and is continuous for every .ω ∈ Ω. The integral is also additive in the sense
6 The Stochastic Integral
103
s
t
f (u)dBu +
.
α
t
f (u)dBu =
s
(α ≤ s < t ≤ β).
f (u)dBu ,
(6.14)
α
b. (Linearity) If f , g are nonanticipative step functionals on .[α, β], then for any .a, b ∈ R, af + bg is a nonanticipative step functional and
t
.
t
(af (s) + bg(s))dBs = a
α
f (s)dBs + b
α
t
(6.15)
g(s)dBs . α
c. (Martingale) Suppose f is a nonanticipative step functional on .[α, β] such that
β
E
.
β
f 2 (t)dt ≡
α
(Ef 2 (t))dt < ∞.
(6.16)
α
t Then, .{Iα (t) := α f (s)dBs : α ≤ t ≤ β} is a square integrable .{Ft }– t martingale, i.e., .E( α f (s)dBs )2 < ∞ and the martingale property,
t
E(Iα (t) | Fs ) = E(
.
s
f (u)dBu | Fs ) =
α
f (u)dBu = Iα (s),
(6.17)
α
holds for .α ≤ s < t ≤ β. d. (Centered) If (6.16) holds, then
t
E(
.
f (s)dBs |Fα ) = 0,
α
t
E
f (s)dBs = 0
(6.18)
α
e. (Itô Isometry—.L2 Norm Preserving Property) If .f, g are square-integrable, nonanticipative step functionals on .[α, β], then
t
E((
t
f (s)dBs )2 |Fα ) = E(
.
α
f 2 (s)ds | Fα )
α
and, more generally, E((
t
t
f (s)dBs )(
.
α
t
g(s)dBs )|Fα ) = E(
α
f (s)g(s)ds | Fα )
(6.19)
α
for .α ≤ t ≤ β. In particular, .
L2 (Ω,P ) ≡ E( = E
t
f (s)dBs )(
α t
t
g(s)dBs ) α
f (s)g(s)ds α
:= f, g L2 ([α,t]×Ω,λ×P ) .
(6.20)
104
6 The Stochastic Integral
f. (Quadratic Martingales) Suppose that f is a square-integrable, nonanticipative functional. The process
t
Qα (t) := (
.
t
f (s)dBs )2 −
α
(a ≤ t ≤ β),
f 2 (s)ds,
(6.21)
α
is a martingale. Proof (a) and (b) follow from Definition 6.4. s (c) Let f be given by (6.12). As . α f (u)dBu is .Fs -measurable, it follows from (6.14) that
t
E(
.
f (u)dBu | Fs ) =
α
s
t
f (u)dBu + E(
α
f (u)dBu | Fs ).
(6.22)
s
Now, by (6.13), if .tj −1 < s ≤ tj and .ti < t ≤ ti+1 , then
t
.
s
f (u)dBu = f (s)(Btj − Bs ) + f (tj )(Btj +1 − Btj ) + · · · + f (ti−1 )(Bti − Bti−1 ) + f (ti )(Bt − Bti ).
(6.23)
Observe that, for .s < t , .Bt − Bs is independent of .Fs (property (6.9)(ii)), so that E(Bt − Bs | Fs ) = 0
.
(s < t ).
(6.24)
Applying this to (6.23), E(f (s)(Btj − Bs ) | Fs ) = f (s)E(Btj − Bs | Fs ) = 0,
.
E(f (tj )(Btj +1 − Btj ) | Fs ) = E[E(f (tj )(Btj +1 − Btj ) | Ftj ) | Fs ] = E[f (tj )E(Btj +1 − Btj | Ftj ) | Fs ] = 0 E(f (ti )(Bt − Bti ) | Fs ) = E[f (ti )E(Bt − Bti ) | Fs )] = 0.
(6.25)
Therefore, E(
.
t
f (u)dBu | Fs ) = 0
(s < t).
(6.26)
s
From this and (6.14), the martingale property (6.17) follows. The relation (6.18) is an immediate consequence of (6.26). In order to prove the first inequality in (e), let .t ∈ [α, t1 ]. Then,
6 The Stochastic Integral
E((
t
.
105
f (s)dBs )2 |Fα ) = E(f 2 (α)(Bt − Bα )2 | Fα )
α
= f 2 (α)E((Bt − Bα )2 | Fα ) t = f 2 (α)(t − α) = E( f 2 (s)ds | Fα ),
(6.27)
α
by independence of .Fα and .Bt − Bα . If .t ∈ (ti , ti+1 ], then, by (6.13), E((
.
= E[{
t α i
j =1
f (s)dBs )2 |Fα ) f (tj −1 )(Btj − Btj −1 ) + f (ti )(Bt − Bti )}2 |Fα ].
(6.28)
Now the contribution of the product terms in (6.28) is zero. For, if .j < k , then E[f (tj −1 )(Btj − Btj −1 )f (tk−1 )(Btk − Btk−1 ) | Fα ]
.
= E[E(· · · | Ftk−1 ) | Fα ] = E[f (tj −1 )(Btj − Btj −1 )f (tk−1 )E(Btk − Btk−1 | Ftk−1 ) | Fα ] = 0.
(6.29)
Therefore, (6.28) reduces to E[(
.
=
t
i j =1
=
i j =1
α
f (s)dBs )2 |Fα ]
E(f 2 (tj −1 )(Btj − Btj −1 )2 | Fα ) + E(f 2 (ti )(Bt − Bti )2 | Fα )
E[f 2 (tj −1 )E((Btj − Btj −1 )2 | Ftj −1 ) | Fα ]
+E[f 2 (ti )E((Bt − Bti )2 | Fti ) | Fα ] =
i j =1
= E(
E(f 2 (tj −1 )(tj − tj −1 ) | Fα ) + E(f 2 (ti )(t − ti ) | Fα )
i
(tj − tj −1 )f 2 (tj −1 ) + (t − ti )f 2 (ti ) | Fα )
j =1
= E(
t α
f 2 (s)ds | Fα ).
(6.30)
106
6 The Stochastic Integral
The relation (6.19) follows by the polar identity with respect to an innerproduct, . f, g
=
1 (||f + g||2 − ||f − g||2 ), 4
applied conditionally given .Fα . The last statement in (e) follows from the relation (6.19) by taking expectation. More general versions of (6.19) and (6.18) are proved later without appeal to the polar identity; see Theorem 7.3. The proof of (f) follows similarly from additivity of the integral and the Itô isometry property (e), using basic properties of the filtration .{Ft : α ≤ t ≤ β} and conditional expectation. Remark 6.3 This proposition and its extension to more general class of integrands provides important classes of martingales, namely, Itô integrals and suitably centered squares of Itô integrals. Taking .f ≡ 1 provides the special quadratic martingale .Bt2 − t, t ≥ 0, already used, for example, to obtain the expected time to escape an interval for Brownian motion by optional stopping theory (Theorem 1.12). Specifically, letting τa,b = inf{t : Bt ∈ {a, b}}, B0 = x ∈ (a, b),
.
Theorem 1.12 may be applied to obtain x = EB0 = EBτa,b = aP (τa ≤ τb ) + bP (τa > τb ) = b + (a − b)P (τa ≤ τb ), (6.31)
.
and x 2 = EB02 − 0 = EBτ2 − Eτ,
.
(6.32)
from which it follows that P (τa ≤ τb ) =
.
b−x and Eτ = (b − x)(x − a). b−a
(6.33)
t The process . α f 2 (s)ds, α ≤ t ≤ β, is referred to as the quadratic variation of the t martingale .Iα (t) = α f (s)dBs , α ≤ t ≤ β. In particular, the standard Brownian motion .B = {Bt : t ≥ 0} is a continuous martingale with quadratic variation t. The converse is true as well! This is Lévy’s characterization of Brownian motion (see Exercise 1 in Chapter 8). Observe that according to property (f), the sub-martingale t 2 .( f (s)dB s ) , α ≤ t ≤ β, differs from a martingale by the quadratic variation α t process . f t := α f 2 (s)ds. The connection between this definition of quadratic variation from the point of view of martingales and that familiar to analysis is developed in Exercise 2.
6 The Stochastic Integral
107
Remark 6.4 The proof of Proposition 6.2 suggests an often used heuristic formalism of stochastic calculus: EdBt = E(Bt+dt − Bt ) = 0,
.
Cov(dBt , dBs ) = 0, s = t, Var(dBt ) = dt. (6.34)
We will see that the extension of the class of integrands to non-anticipative functionals in .M[α, β] is also consistent with this formalism. The next task is to extend the definition of the stochastic integral to the larger class of nonanticipative functionals to serve as integrands, by approximating these by step functionals. The following is a standard result from real analysis, whose proof is included for completeness. Proposition 6.3 Let .f ∈ M[α, β]. Then, there exists a sequence .{fn }∞ n=1 of nonanticipative step functionals belonging to .M[α, β] such that E
.
β
(fn (t) − f (t))2 dt −→ 0 as n → ∞.
(6.35)
α
Proof The proof follows by a standard “molifier” argument from analysis for L2 −spaces. Extend .f (t) to .(−∞, ∞) by setting it zero outside .[α, β]. For .ε > 0, write .gε (t) := f (t − ε). Let .ψ be a symmetric, continuous, nonnegative 1 (nonrandom) function that vanishes outside .(−1, 1) and satisfies . −1 ψ(x)dx = 1.ε Write .ψε (x) := ψ(x/ε)/ε. ε Then, .ψε vanishes outside .(−ε, ε) and satisfies . −ε ψε (x)dx = 1. Now define .g := gε ∗ ψε , i.e., .
g ε (t) =
ε
.
−ε
gε (t − x)ψε (x)dx =
t
=
ε −ε
f (y)ψε (t − ε − y)dy
f (t − ε − x)ψε (x)dx (y = t − ε − x).
(6.36)
t−2ε
Note that, for each .ω ∈ Ω, the Fourier transform of .g ε is gˆ ε (ξ ) = gˆ ε (ξ )ψˆ ε (ξ ) = eiξ ε fˆ(ξ )ψˆ ε (ξ ),
.
so that, by the Plancherel identity (BCPT, (6.36), p. 112.), .
∞
1 |g (t) − f (t)| dt = 2π −∞ ε
2
=
1 2π
∞
−∞ ∞ −∞
|gˆ ε (ξ ) − fˆ(ξ )|2 dξ |fˆ(ξ )2 | · |eiξ ε ψˆ ε (ξ ) − 1|2 dξ.
The last integrand is bounded by .4|fˆ(ξ )|2 . Since
(6.37)
108
6 The Stochastic Integral
E
.
|fˆ(ξ )|2 dξ ≡
∞
−∞
Ω
|fˆ(ξ )|2 dξ dP = 2π
Ω
= 2π E
[α,β]
∞
−∞
f 2 (t)dt < ∞,
f 2 (t)dt dP (6.38)
it follows, by Lebesgue’s dominated convergence theorem, that E
∞
.
−∞
|g ε (t) − f (t)|2 dt −→ 0 as ε → 0.
(6.39)
This proves that there exists a sequence .{gn }∞ n=1 of continuous nonanticipative functionals in .M[α, β] such that
β
E
.
(gn (s) − f (s))2 ds −→ 0.
(6.40)
α
It is now enough to prove that if g is a continuous element of .M[α, β], then there exists a sequence .{hn }∞ n=1 of nonanticipative step functionals in .M[α, β] such that
β
E
.
(hn (s) − g(s))2 ds −→ 0.
(6.41)
α
For this, first assume that g is also bounded, .|g| ≤ M. Define, for .0 ≤ k ≤ n − 1, hn (t) = g(α +
.
k+1 k k (β − α), (β − α)) if α + (β − α) ≤ t < α + n n n
hn (β) = g(β). Use Lebesgue’s dominated convergence theorem to prove (6.41). If g is unbounded, then apply (6.41) with g replaced by its truncation .g M , which agrees with g on .{t ∈ [α, β] : |g(t)| ≤ M} and equals .−M on the set .{t ∈ [α, β] : g(t) < −M} and M on .{t ∈ [α, β] : g(t) > M}. Note that .g M = (g ∧ M) ∨ (−M) is a continuous and bounded element of .M[α, β], and apply Lebesgue’s dominated convergence theorem to get .
β
E
(g M (s)−g(s))2 ds −→ 0 as M → ∞.
α
We are now ready to define the stochastic integral of an arbitrary element f in M[α, β].
.
Theorem 6.4 Given .f ∈ M[α, β], there is a sequence .{fn }∞ n=1 of squareintegrable nonanticipative step functionals satisfying (6.35) and such that the sequence of stochastic processes defined by
6 The Stochastic Integral
109
In (t) :=
t
fn (s)dBs ,
.
α ≤ t ≤ β,
α
is a.s. uniformly Cauchy. That is, with probability one, max |In (t) − Im (t)| → 0
.
α≤t≤β
as .n, m → ∞. Proof Given .f ∈ M[α, β], let .{fn }∞ n=1 be a sequence of step functionals For each pair of positive integers .n, m, the stochastic integral satisfying (6.35). t . (f − f )(s)dB . (α ≤ t ≤ β) is a square integrable martingale, by Proposim s α n tion 6.2(c). Therefore, by Doob’s maximal inequality (Theorem 1.14) and the Itô isometry (6.19), for all .ε > 0, P (Mn,m > ε) ≤
β
E(fn (s) − fm (s))2 ds/ε2 ,
(6.42)
(fn (s) − fm (s))dBs | : α ≤ t ≤ β}.
(6.43)
.
α
where
t
Mn,m := max{|
.
α
Choose an increasing sequence .{nk } of positive integers such that the right side of (6.42) is less than .1/k 2 for .ε = 1/k 2 , if .n, m ≥ nk . Denote by A the set A := {ω ∈ Ω : Mnk ,nk+1 (ω) ≤ 1/k 2 for all sufficiently large k}.
.
(6.44)
By the Borel-Cantelli lemma, .P (A) = 1. Now on A the series . k Mnk ,nk+1 < ∞. Thus, given any .δ > 0, there exists a positive integer .k(δ) (depending on .ω ∈ A) such that .Mnj ,n ≤ Mnk ,nk+1 < δ ∀ j, ≥ k(δ). k≥k(δ)
t In other words, on the set A the sequence .{ α fnk (s)dBs : α ≤ t ≤ β} is Cauchy t in the supremum distance (6.43). Therefore, . α fnk (s)dBs .(α ≤ t ≤ β) converges uniformly to a continuous function, outside a set of probability zero. Simply relabel this subsequence as .{fn }∞ n=1 to obtain the assertion of the theorem. t Definition 6.5 For .f ∈ M[α, β], the uniform limit of . α fn (s)dBs .(α ≤ t ≤ β) on A as constructed in Theorem t 6.4 is called the stochastic integral, or the Itô integral, of f and is denoted by . α f (s)dBs .(α ≤ t ≤ β). It will be assumed that t .t → α f (s)dBs is continuous for all .ω ∈ A, with an arbitrary specification as a
110
6 The Stochastic Integral
nonrandom constant on .Ac . If .f ∈ M[α, ∞), then such a continuous version exists t for . α f (s)dBs on the infinite interval .[α, ∞). It should be noted that the Itô integral of f is well defined up to a null set. For if .{fn }, .{gn } are two sequences of nonanticipative step functionals both satisfying (6.35), then
β
E
.
(fn (s) − gn (s))2 ds −→ 0.
(6.45)
α
t ∞ If .{fnk }∞ , .{g }∞ are subsequences of .{fn }∞ n=1 , .{gm }m=1 such that . α fnk (s)dBs t k=1 mk k=1 and . α gmk (s)dBs both converge uniformly to some processes .Y = {Yt : α ≤ t ≤ β}, .Z = {Zt : α ≤ t ≤ β}, respectively, outside a set of zero probability, then .
β
E
β
(Ys − Zs )2 ds ≤ (β − α) lim E
α
k→∞
α
(fnk (s) − gmk (s))2 ds = 0,
by (6.45) and Fatou’s lemma. As .{Ys : α ≤ s ≤ β}, .{Zs : α ≤ s ≤ β} are a.s. continuous, one must t have .Ys = Zs for .α ≤ s ≤ β, outside a P -null set. Note also that . α f (s)dBs is .Ft -measurable for .α ≤ t ≤ β, .f ∈ M[α, β]. The following generalization of Proposition 6.2 is almost immediate. Theorem 6.5 Properties (a)–(f) of Proposition 6.2 hold for the extension of the Itô integral to .M[α, β]. Proof This follows from the corresponding properties of the approximating step functionals .{fnk }∞ k=1 , on taking limits (Exercise 4). Example 5 Let .f (s) = Bs . Then, .f ∈ M[α, ∞). Fix .t > 0. Define .fn (t) = Bt and fn (s) := Bj 2−n t
.
if j 2−n t ≤ s < (j + 1)2−n t
(0 ≤ j ≤ 2n − 1).
Then, writing .Bj,n := Bj 2−n t ,
t
.
0
fn (s)dBs =
n −1 2
Bj,n (Bj +1,n − Bj,n )
j =0 2 −1 1 = − (Bj +1,n − Bj,n )2 − 12 B02 + 12 Bt2 2 n
j =0
→ − 12 t − 12 B02 + 12 Bt2
a.s. (as n → ∞).
The last convergence follows from Exercise 1. Hence,
(6.46)
Exercises
111
.
0
t
Bs dBs =
1 1 2 (Bt − B02 ) − t. 2 2
(6.47)
Notice that this example explicitly demonstrates (i) the almost sure (sample pathwise) convergence of the partial sums defining the integral, via the strong law of large numbers, and (ii) the a.s. uniformity, via the scaling property of the displacements. Theorem 6.4 shows more generally for any .f ∈ M[α, β], there is always a partition, for which this almost sure uniform convergence happens. the formal application of ordinary calculus would yield t As remarked1 earlier, 2 − B 2 ). Of course, unlike (6.47), and contrary to the basic (B . B dB = s s t 0 0 2 martingale property requirement for Itô integrals of Theorem 6.5, the usual rules of calculus do not yield a martingale. An alternative method of evaluation of this integral that exploits the martingale structure of the Itô integral (namely, Itô’s lemma) will be given in Chapter 8. Notice that if one replaced the values of f at the left end points of the subintervals by those at the right end points, then one would get, in place of the first sum in (6.46), 2n−1 .
j =0
2 −1 1 1 Bj +1,n (Bj +1,n − Bj,n ) = (Bj +1,n − Bj,n )2 + (Bt2 − B02 ), 2 2 n
j =0
which converges a.s. to . 12 (Bt2 − B02 ) + 12 t. Again, this limit is not a martingale. The following useful formula is left as Exercise 7. Also see Exercise 8. Proposition 6.6 (Integration by Parts) Let .f (t) .(α ≤ t ≤ β) be a nonrandom continuously differentiable function. Then, .
α
β
f (s)dBs = f (β)Bβ − f (α)Bα −
β
Bs f (s)ds.
α
Notice that if .f (s) is a random nonanticipative functional, then the derivative f (s) may not be anticipative (assuming it exists).
.
Exercises 1. (Quadratic Variation of Brownian Motion) Let {Bt : t ≥ 0} be a standard Brownj ian motion. Prove that, for every t > 0, as n → ∞, max1≤j ≤2n |{ i=1 (Bi2−n t − B(i−1)2−n t )2 − j 2−n t}| → 0, a.s. [Hint: Apply Doob’s maximal inequality to the martingale Xj , 1 ≤ j ≤ 2n , defined within the curly brackets to get n n n P (max1≤j ≤2n |Xj | ≥ 2− 4 ) ≤ EX22n /2− 2 = 2− 2 Var(B1 − B0 )2 .]
112
6 The Stochastic Integral
2. (Unbounded Variation) Let B be a standard Brownian motion starting at zero. n Define Vn = 2j =1 |Bj 2−n − B(j −1)2−n |. n
n
(i) Verify that EVn = 2 2 E|B1 |. [Hint: 2 2 (Bj 2−n − B(j −1)2−n ) has the same distribution as B1 .] (ii) Show that VarVn = Var|B1 |. (iii) Show that, with probability one, B is of unbounded variation on every interval [s, t], s < t. [Hint: Vn is a nondecreasing sequence and therefore has a limit. By Chebyshev’s inequality, P (Vn > M) → 1 as n → ∞ for any M > 0.] 3. Consider the stochastic integral equation (6.8) with μ(·) Lipschitzian, |μ(x) − μ(y)| ≤ M|x − y|. Solve this equation by the method of successive approximations, with the nth approximation Xt(n) given by
(n) .Xt
=x+
t
0
μ(Xs(n−1) )ds + σ (Bt − B0 ) (n)
[Hint: For each t > 0, write δn (t) := max{|Xs T > 0. Then, .
δn (T ) ≤ M ≤ M n−1
T
T
δn−1 (s)ds ≤ M 2
0
tn−1
dtn−2 · · ·
ds 0
0
tn−1
T ds
0
0
t2 0
(0)
(n ≥ 1), Xt (n−1)
− Xs
≡ x.
| : 0 ≤ s ≤ t}. Fix
δn−2 (tn−2 ))dtn−2 ≤ · · ·
δ1 (t1 )dt1 ≤ M n−1 δ1 (T )T n−1 /(n − 1)!
Hence, n δn (T ) converges to a finite limit, which implies the uniform con(n) vergence of X(n) = {Xs : 0 ≤ s ≤ T } to a finite and continuous limit X = {Xs : 0 ≤ s ≤ T }.] 4. Prove Theorem 6.5 in detail. β 5. Suppose fn , f ∈ M[α, β] and E α (fn (s) − f (s))2 ds → 0 as n → ∞. (i) Prove that Un := max{|
.
t
fn (s)dBs −
α
t
f (s)dBs | : α ≤ t ≤ β} −→ 0
α
in probability. (ii) If {Y (t) : α ≤ t ≤ β} is a stochastic process with continuous sample paths t and α fn (s)dBs → Y (t) in probability for all t ∈ [α, β], then prove that P (Y (t) =
.
α
t
f (s)dBs , ∀ t ∈ [α, β]) = 1.
Exercises
113
6. Prove the convergence in (6.46). [Hint: Apply space-time scaling to express the n 2 , where Z , . . . , Z n are i.i.d. sum as an average of the form 2tn 2j =1 Zn,j n,1 n,2 standard normal. Use Borel-Cantelli I to prove a strong law of large numbers for such an average.] 7. Prove the integration by parts formula in Proposition 6.6. [Hint: Rewrite the indicated Riemann-type sum in terms of differences of f .] 8. Apply the integration by parts formula to express Lévy’s fractional Brownian t (h) motion2 with parameter h > 0, Bt = 0 s h dBs , t ≥ 0, as a Riemann integral of Brownian motion paths. 9. Use the integration by parts formula and the martingale property of stochastic t integrals to show that tBt − 0 Bs ds, t ≥ 0, is a martingale. Also establish the martingale property directly using properties of conditional expectation.
2 Lévy
(1953) introduced fractional Brownian motion in a general framework of Gaussian processes. His definition was subsequently modified by Mandelbrot and Van Ness (1968) for an application to the Hurst problem in hydrology. See Bhattacharya and Waymire (2022) for a rather comprehensive treatment of this problem.
Chapter 7
Construction of Diffusions as Solutions of Stochastic Differential Equations
In this chapter it is shown how to use Itô’s stochastic calculus to specify a diffusion in terms of given drift and dispersion coefficients as a solution to the corresponding stochastic differential equation. This presents a powerful probabilistic alternative to the analytic theory of determining the transition probabilities from semigroups generated by a corresponding infinitesimal operator. The two approaches combine to provide a powerful overall tool of both probabilistic and analytic importance.
In the last chapter a precise meaning was given to the stochastic differential equation (6.3) in terms of its integrated formulation (6.7). The present chapter is devoted to the solution of (6.3) (or (6.7)). The chapter is organized into subsections in which the scope of the theory is gradually extended to broader classes of coefficients and/or higher dimensions. Specifically, the goals are to (i) establish existence and uniqueness of a continuous process .X = {Xt : t ≥ α} in .M[α, ∞) such that dXt = μ(Xt )dt + σ (Xt )dBt ,
.
t > α, X(α) = Xα ,
(7.1)
for given Lipschitz coefficients .μ, σ 2 and .Fα -measurable random variable .Xα in one dimension; (ii) derive the Markov property (from uniqueness) and strong Markov properties (from Markov property and Feller property); (iii) extend to higher dimensional equations of the (suitably interpreted) form dXt = μ(Xt )dt + σ (Xt )dBt ,
t > α, X(α) = Xα ,
(7.2)
© Springer Nature Switzerland AG 2023 R. Bhattacharya, E. C. Waymire, Continuous Parameter Markov Processes and Stochastic Differential Equations, Graduate Texts in Mathematics 299, https://doi.org/10.1007/978-3-031-33296-8_7
115
.
116
7 Construction of Diffusions
where .X and .μ are column vectors, .σ is a matrix, and .B a multidimensional standard Brownian motion of consistent dimensions for the indicated algebraic operations; (iv) relax the Lipschitz conditions to local Lipschitz conditions for an existence theorem of a minimal process, together with the possibility of explosion; (v) derive the Markov and strong Markov properties for the minimal process; and (vi) extend (i)–(iv) to time-dependent coefficients .μ(x, t), σ (x, t).
7.1 Construction of One-Dimensional Diffusions Let .μ(x) and .σ (x) be two real–valued functions on .R that are Lipschitzian, i.e., there exists a constant .M > 0 such that |μ(x) − μ(y)| ≤ M|x − y|,
.
|σ (x) − σ (y)| ≤ M|x − y|,
∀ x, y.
(7.3)
The first order of business is to show that Equation (7.1) is valid as a stochastic integral equation in the sense of Itô, i.e.,
t
Xt = Xα +
.
t
μ(Xs )ds +
α
σ (Xs )dBs ,
(t ≥ α).
(7.4)
α
Theorem 7.1 Suppose .μ(·), .σ (·) satisfy (7.3). Then, for each .Fα –measurable square integrable random variable .Xα , there exists a unique (except on a set of zero probability) continuous nonanticipative functional .{Xt : t ≥ α} belonging to .M[α, ∞) that satisfies (7.4). Proof We prove “existence” by the method of successive approximation. Fix .T > α. Let Xt(0) := Xα ,
α ≤ t ≤ T.
.
Define, recursively, (n+1) .Xt
t
:= Xα + α
μ(Xs(n) )ds
+ α
t
σ (Xs(n) )dBs ,
(7.5)
α ≤ t ≤ T.
(7.6)
For example, (1)
Xt
.
(2)
Xt
= Xα + μ(Xα )(t − α) + σ (Xα )(Bt − Bα ), t = Xα + μ([Xα + μ(Xα )(s − α) + σ (Xα )(Bs − Bα )])ds +
α t
σ ([Xα + μ(Xα )(s − α) + σ (Xα )(Bs − Bα )])dBs , α ≤ t ≤ T .
α
(7.7)
7.1 Construction of One-Dimensional Diffusions
117
(n)
Note that, for each n, .{Xt : α ≤ t ≤ T } is a continuous nonanticipative functional on .[α, T ]. Also, for .n ≥ 1, .
(n+1) (n) 2 − Xt Xt t t (n) (n−1) (n) (n−1) =[ μ Xs − μ Xs ds + σ Xs − σ Xs dBs ]2 α
α
t (n−1) 2 t (n) (n−1) 2 (n) μ(Xs − μ Xs ds + 2 σ Xs − σ Xs dBs . (7.8)
≤2
α
α
Write (n)
Dt
.
:= E( max (Xs(n) − Xs(n−1) )2 ).
(7.9)
α≤s≤t
Taking expectations of the maximum in (7.8), over .α ≤ t ≤ T , and using (7.3) we get (n+1)
DT
.
≤ 2EM 2 (
T
α
|Xs(n) − Xs(n−1) |ds)2
t
+2E max ( α≤t≤T
α
(σ (Xs(n) ) − σ (Xs(n−1) ))dBs )2 .
(7.10)
Now, using the Cauchy–Schwarz inequality, one has
T
E(
.
α
|Xs(n) − Xs(n−1) |ds)2 ≤ (T − α)E
T α
T
≤ (T − α) α
(Xs(n) − Xs(n−1) )2 ds
Ds(n) ds.
(7.11)
Also, by Doob’s maximal inequality Theorem 1.14, and Theorem 6.5 (Itô isometry), .
E max (
t
α≤t≤T
≤ 4E(
α
T
≤ 4M 2
(n)
(n−1)
(n)
(n−1)
(σ (Xs ) − σ (Xs
α
(σ (Xs ) − σ (Xs
T α
(n)
E(Xs
))dBs )2 ))dBs )2 = 4
(n−1) 2 ) ds ≤ 4M 2
− Xs
T
T α
α
(n)
(n−1) 2 )) ds
E(σ (Xs ) − σ (Xs (n)
(7.12)
Ds ds.
Using (7.11) and (7.12) in (7.10), obtain (n+1) .D T
T
≤ (2M (T − α) + 8M ) 2
2
α
Ds(n) ds
= c1 α
T
Ds(n) ds
(n ≥ 1), (7.13)
118
7 Construction of Diffusions
say. An analogous, but simpler, calculation gives (1)
DT ≤ 2(T − α)2 Eμ2 (Xα ) + 8(T − α)Eσ 2 (Xα ) = c2 ,
.
(7.14)
where .c2 is a finite positive number (Exercise 5). It follows from (7.13), (7.14), and induction that for .n = 0, 1, . . . , (n+1) .D T
≤
c1n
T
sn
dsn
α
α
s2
dsn−1 · · · α
Ds(1) ds2 ≤ c2 1
(T − α)n c1n . n!
(7.15)
By Chebyshev’s inequality, (n+1)
P ( max |Xt
.
α≤t≤T
− Xt | > 2−n ) ≤ (n)
22n c2 (T − α)n c1n . n!
(7.16)
Since the right side is summable in n, it follows from the Borel–Cantelli lemma that P ( max |Xt(n+1) − Xt(n) | > 2−n for infinitely many n) = 0.
.
α≤t≤T
(7.17)
Let N denote the set within parentheses in (7.17). Outside N , the series (0)
Xt
.
+
∞ (n+1) (n) (n) (Xt − Xt ) = lim Xt n→∞
n=0
(7.18)
converges absolutely, uniformly for .α ≤ t ≤ T , to a nonanticipative functional (n) {Xt : α ≤ t ≤ T }, say. Since each .Xt is continuous and the convergence is uniform, the limit is continuous. For continuous nonanticipative functionals .f, g on .[α, T ], define the norm (when finite):
.
||f − g||2 := E max (f (s) − g(s))2 .
.
α≤s≤T
(n)
Writing .X = {Xs : α ≤ s ≤ T }, .X(n) = {Xs (see (7.15), (7.18)), ||X − X(n) || ≤
∞
.
||X(m+1) − X(m) || ≤ (c2
m=n
In particular, .||X|| < ∞, and .X ∈ M[α, T ]. To prove that X satisfies (7.4), note that
(7.19)
: α ≤ s ≤ T }, one has
∞ (T − α)m c1m 1 ) 2 → 0. m! m=n
(7.20)
7.1 Construction of One-Dimensional Diffusions
.
t
max |
α≤t≤T
α
≤M
α
μ(Xs(n) )ds
T
|Xs(n)
t
+ α
119
σ (Xs(n) )dBs
− Xs |ds + max | α≤t≤T
t
α
t
−
t
μ(Xs )ds −
α
σ (Xs )dBs |
α
(σ (Xs(n) ) − σ (Xs ))dBs |
≤ M(T − α) max |Xs(n) − Xs | α≤s≤T
t
+ max | α≤t≤T
α
(σ (Xs(n) ) − σ (Xs ))dBs |.
(7.21)
The first term on the right side goes to zero by the a.s. uniform convergence proved above. The second term goes to zero a.s. as .n → ∞ by Theorem 6.4 and (7.20). Thus the right side of (7.6) converges uniformly to the right side of (7.4) as .n → ∞. Since .X(n) converges uniformly to X on .[α, T ] outside a set of zero probability, (7.4) holds on .[α, T ] with probability one. It remains to prove the uniqueness to the solution to (7.4). Let .Y = {Yt : α ≤ t ≤ T } be another solution. Then write ϕt := E(max{|Xs − Ys | : α ≤ s ≤ t})2
.
(n)
(n)
and use (7.9)–(7.13) with .ϕt in place of .Dt , .Xt in place of .Xt (n−1) .Xt , to get ϕT ≤ c1
and .Yt in place of
T
(7.22)
ϕs ds.
.
α
Since .t → ϕt is nondecreasing, iteration of (7.22) leads, just as in (7.14), (7.15), to ϕT ≤
.
c2 (T − α)n c1n −→ 0 as n → ∞. n!
Hence, .ϕT = 0, i.e., .Xt = Yt on .[α, T ] outside a set of zero probability.
(7.23)
The next result identifies the stochastic process .X = {Xt : t ≥ 0} solving (7.4), in the case .α = 0, as a diffusion with drift .μ(·), diffusion coefficient .σ 2 (·), and initial distribution as the distribution of the given random variable .X0 . Theorem 7.2 Assume (7.3). For each .x ∈ R let .Xx = {Xtx : t ≥ 0} denote the unique continuous solution in .M[0, ∞) of Itô’s integral equation Xtx = x +
t
.
0
μ(Xsx )ds +
t 0
σ (Xsx )dBs
Then .Xx = {Xtx : t ≥ 0} is a Markov process.
(t ≥ 0).
(7.24)
120
7 Construction of Diffusions
Proof By the additivity of the Riemann and stochastic integrals, Xtx = Xsx +
t
μ(Xux )du +
.
s
t s
σ (Xux )dBu ,
(t ≥ s).
(7.25)
Consider also the equation, for .z ∈ R,
t
Xt = z +
t
μ(Xu )du +
.
s
σ (Xu )dBu ,
(t ≥ s).
(7.26)
s
Let us write the solution to (7.26) (in .M[s, ∞)) as a.s. equal to a function θ (s, t; z, Bst ) where .Bst := {Bu − Bs : s ≤ u ≤ t}. It may be seen from the successive approximation scheme (7.5)–(7.7) that .θ (s, t; z, Bst ) is measurable in t t x .(z, Bs ) with respect to .B(R) ⊗ σ (Bs ) (Exercise 6). As .{Xu : u ≥ s} is a continuous stochastic process in .M[s, ∞) and is, by (7.25), a solution to (7.26) with .z = Xsx , it follows from the uniqueness of this solution that .
Xtx = θ (s, t; Xsx , Bst ),
(t ≥ s), a.s.
.
(7.27)
Since .Xsx is .Fs –measurable and .Fs and .Bst are independent, (7.27) implies, using the substitution property for conditional expectation,1 that the conditional distribution of .Xtx given .Fs is the distribution of the random variable .θ (s, t; z, Bst ), say x x .q(s, t; z, dθ ), evaluated at .z = Xs . Since .σ {Xu : 0 ≤ u ≤ s} ⊆ Fs , the Markov property is proved. To prove homogeneity of this Markov process, consider for every t+h .h > 0, the solution .θ (s + h, t + h; z, B s+h ) of Xt+h = z +
t+h
μ(Xu )du +
.
s+h
t+h
σ (Xu )dBu ,
(t ≥ s).
(7.28)
s+h
Writing .Yu = Xu+h and with the change of variables .u = u − h, (7.28) may be expressed as Yt = z +
t
.
s
= z+
s
t
μ(Xu +h )du + μ(Yu )du +
t
s
t
σ (Xu +h )dBu +h
σ (Yu )d B˜ u ,
(t ≥ s),
(7.29)
s
where .B˜ u = Bu +h , so that the process of the increments .{B˜ u − B˜ s ≡ Bu +h − Bs+h : s ≤ u ≤ t} ≡ B˜ st has the same distribution as .Bst . Hence .Yt has the
1 See
BCPT,2 p. 38. BCPT refers to Bhattacharya and Waymire (2016), A Basic Course in Probability Theory.
2 Throughout,
7.1 Construction of One-Dimensional Diffusions
121
same distribution as .Xt , i.e., .q(s + h, t + h; z, dθ ) = q(s, t; z, dθ ). This proves homogeneity. Definition 7.1 The family of Markov processes .Xz = {Xtz : t ≥ 0}, z ∈ R, solving (7.4) (with .α = 0) for respective initial values .X0 = z ∈ R, defines a diffusion on .R with drift .μ(·) and diffusion parameter (coefficient) .σ (·), starting at .z ∈ R. A Word on Terminology To be consistent with the predominant use in the literature, some abuse of terminology in reference to .σ (·) vs .σ 2 (·) as the ‘diffusion coefficient’ will occur in the text. This is largely due to the explicit appearance of .σ (·) as a ‘coefficient’ in the stochastic differential equation and .σ 2 (·) as a ‘coefficient’ in the associated infinitesimal generator (see (7.30) below). As a result, some reliance on the context becomes necessary; however, we shall refer to .σ (·) as the diffusion parameter in instances where it may not be clear otherwise. We will denote the transition probability of the diffusions .Xz = {Xtz : t ≥ 0}, .z ∈ R by .p(t; x, dy), .(x ∈ R, t ≥ 0). Under the assumption (7.3) and in keeping with the “locally Brownian” intuitive description of the diffusion, one may compute the infinitesimal moments in terms of the drift and diffusion coefficients as follows (Exercise 8): (i) E(Xtx − x) = μ(x)t + o(t), t ↓ 0.
.
(7.30)
.
(ii) Var(Xtx ) = E(Xtx − x)2 + o(t) = σ 2 (x)t + o(t), t ↓ 0.
(7.31)
In the next chapter (using Itô’s lemma), it will also follow that P (|Xtx − x| > ε) = o(t), t ↓ 0.
.
(7.32)
Example 1 (Ornstein–Uhlenbeck Process and the Langevin Equation) Let .μ(x) = (n) −γ x, .σ (x) = σ, x ∈ R. Then the successive approximations .Xt (see Equations (7.5)–(7.7)) are given by (assuming .B0 = 0) (0)
≡ X0 ,
(1)
= X0 +
Xt
.
Xt
(2) Xt
t
−γ X0 ds + σ Bt = X0 (1 − tγ ) + σ Bt ,
0
= X0 +
t
−γ {X0 (1 − sγ ) + σ Bs }ds + σ Bt
0
t 2γ 2 )X0 − γ σ = (1 − tγ + 2!
0
t
Bs ds + σ Bt ,
122
7 Construction of Diffusions
(3)
Xt
s s2γ 2 )X0 − γ σ Bu du + σ Bs }ds + σ Bt 2! 0 0 t s t t 3γ 3 t 2γ 2 )X0 + γ 2 σ − Bu du ds − γ σ Bs ds + σ Bt = (1 − tγ + 3! 2! 0 0 0 t t t t 3γ 3 t 2γ 2 2 − )X0 + γ σ ( ds)Bu du − γ σ Bs ds + σ Bt = (1−tγ + 2! 3! 0 u 0 t t t 2γ 2 t 3γ 3 2 = (1−tγ + (t − u)Bu du − γ σ Bs ds + σ Bt . − )X0 + γ σ 2! 3! 0 0 = X0 +
t
−γ {(1 − sγ +
(7.33) Assume, as an induction hypothesis, (n)
Xt
.
=(
t n n−1 (−tγ )m (t − s)m−1 Bs ds + σ Bt . )X0 + (−γ )m σ m! 0 (m − 1)!
m=0
(7.34)
m=1
Now use (7.6) to check that (7.34) holds for .n + 1, replacing n. Therefore, (7.34) holds for all n. But as .n → ∞, the right side converges (for every .ω ∈ Ω) to e
.
−tγ
X0 − γ σ
t
e−γ (t−s) Bs ds + σ Bt .
(7.35)
0
Hence, .Xt equals (7.35). In particular, with .X0 = x, Xtx = e−tγ x − γ σ
.
t
e−γ (t−s) Bs ds + σ Bt .
(7.36)
0
As a special case, for .γ > 0, .σ = 0, (7.36) gives a representation of an Ornstein– Uhlenbeck process as a functional of a Brownian motion. The corresponding stochastic differential equation for this choice of drift and diffusion parameters is referred to as the Langevin equation.
7.2 Extension to Multidimensional Diffusions In order to construct multidimensional diffusions by the method of Itô it is necessary to define stochastic integrals for vector–valued integrands with respect to the increments of a multidimensional Brownian motion. This turns out to be rather straightforward. (1) (k) Let .B = {Bt = (Bt , . . . , Bt ) : t ≥ 0} be a standard k-dimensional Brownian motion with respect to a filtration .{Ft : t ≥ 0}. For example, one may take .Ft = σ (Bs : s ≤ t). Assume (6.9) for .{Bt : t ≥ 0}. Define a vector–valued stochastic process
7.2 Extension to Multidimensional Diffusions
123
f = {f(t) = (f (1) (t), . . . , f (k) (t)) : α ≤ t ≤ β}
.
to be a nonanticipative step functional on .[α, β] if (6.12) holds for some finite set of time points .t0 = α < t1 < · · · < tm = β and some .Fti -measurable random vectors k .fi .(0 ≤ i ≤ m) with values in .R . Definition 7.2 (Stochastic Integral) The stochastic integral, or the Itô integral of a nonanticipative step functional .f on .[α, β] is defined by
t
.
α
f(s) · dBs
f(t0 ) · (Bt − Bt0 ) for t ∈ [α, t1 ], := i j =1 f(tj −1 ) · (Btj − Btj −1 ) + f(ti ) · (Bt − Bti ) for t ∈ (ti , ti+1 ].
(7.37)
Here .· (dot) denotes the Euclidean inner product, f(s) · (Bt − Bs ) =
k
.
(i)
f (i) (s)(Bt − Bs(i) ),
(7.38)
i=1
with .f(s) = (f (1) (s), . . . , f (k) (s)). It follows from this definition that the stochastic integral (7.37) is the sum of k one-dimensional stochastic integrals,
t
.
f(s) · dBs =
k
α
t
α
i=1
f (i) (s)dBs(i) .
(7.39)
Parts (a), (b) of Proposition 6.2 extend immediately. In order to extend part (c) assume, as in (6.16),
β
.
E|f(u)| du ≡ 2
α
k i=1
β
E(f (i) (u))2 du < ∞.
(7.40)
α
The square integrability and the martingale property of the stochastic integral follow from those of each of its k summands in (7.39). Similarly, one has
t
E(
.
t
f(s) · dBs |Fα ) = 0, E(
α
f(s) · dBs ) = 0.
(7.41)
α
It remains to prove the analog of (6.19), namely, E((
.
α
t
f(u) · dBu )2 |Fα ) = E( α
t
|f(u)|2 du|Fα ).
(7.42)
124
7 Construction of Diffusions
t (i) In view of (6.19) applied to . α f (i) (u)dBu .(1 ≤ i ≤ k), it is enough to prove that the product terms vanish, i.e.,
t
E[(
.
α
f (i) (u)dBu(i) )(
t
(j )
f (j ) (u)dBu )|Fα ] = 0 for i = j.
(7.43)
α
To see this, note that, for .s < s < u < u , .
(j )
(i)
(j )
E[f (i) (s)(Bs − Bs(i) )f (j ) (u)(Bu − Bu ) | Fα ] (j )
(i)
(j )
= E[E(f (i) (s)(Bs − Bs(i) )f (j ) (u)(Bu − Bu ) | Fu ) | Fα ] (j )
(i)
(j )
= E[f (i) (s)(Bs − Bs(i) )f (j ) (u)E(Bu − Bu | Fu ) | Fα ] = 0. (7.44) Also, by the independence of .Fs and .{Bs − Bs : s ≥ s}, and by the independence (j ) of .{Bt(i) : t ≥ 0} and .{Bt : t ≥ 0} for .i = j , .
(j )
(j )
(i) (j ) E[f (i) (s)(Bs(i) (s)(Bs − Bs ) | Fα ]
− Bs )f (j )
(i)
(j )
= E[E(f (i) (s)(Bs − Bs(i) )f (j ) (s)(Bs − Bs ) | Fs ) | Fα ] (i)
(j )
(j )
= E[f (i) (s)f (j ) (s){E((Bs − Bs(i) )(Bs − Bs ) | Fs )} | Fα ] = 0. One may similarly prove, for all .f, g ∈ M[α, T ], .α ≤ t ≤ T , E((
t
f (s)dBs )(
.
α
t
t
g(s)d Bs )|Fα ) = E(
α
f (s)g(s)ds | Fα ).
(7.45)
α
Thus, Proposition 6.2 is fully extended to k-dimension. For the extension to more general nonanticipative functionals, consider the following analog of Definition 6.2 Definition 7.3 If .f = {f(t) = (f (1) (t), . . . , f (k) (t)) : α ≤ t ≤ β} is a progressively measurable stochastic process with values in .Rk such that .f(t) is .Ft –measurable for all .t ∈ [α, β], then .f is called a nonanticipative functional on .[α, β] with values in k .R . If, in addition,
β
E
.
|f(t)|2 dt < ∞,
(7.46)
α
then .f is said to belong to .M[α, β]. If .f ∈ M[α, β] for all .β > α, then .f belongs to M[α, ∞).
.
For .f ∈ M[α, β] with values in .Rk one may apply the results in Chapter 6 to t (i) each of the k components . α f (i) (u)dBu .(1 ≤ i ≤ k) to extend Proposition 6.3 to t k-dimension and to define the stochastic integral . α f(u)dBu for a k-dimensional
7.2 Extension to Multidimensional Diffusions
125
f ∈ M[α, β]. The following extension of Theorem 6.5 on Itô isometry is now immediate.
.
Theorem 7.3 Suppose .f, g ∈ M[α, β] with values in .Rk . Then t a. .t → α f(u) · dBu is continuous and additive.
t b. . α f(u) · dBu : α ≤ t ≤ β is a square integrable .{Ft : α ≤ t ≤ β}–martingale and t t .E(( f(u) · dBu )2 |Fα ) = E( |f(u)|2 du|Fα ), (7.47) α
and .(
t α
α
f(u) · dBu )2 −
t α
|f(u)|2 du, .α ≤ t ≤ β, is a martingale.
c. E[
t
t
f(s)dBs
.
α
t
g(s)dBs |Fα ] =
α
E[f(s) · g(s)|Fα ]ds
(7.48)
α
t t t d. (Itô Isometry) .E[ α f(s)dBs α g(s)dBs ] = α E[f(s) · g(s)]ds if .f, g both satisfy (7.46). e. If .r = r , .f, g ∈ M[α, β] are real-valued, then
t
E[
f (s)dBs(r)
.
α
t α
g(s)dBs(r ) ] = 0, α ≤ t ≤ T .
The final task is to construct multidimensional diffusions. For this let .μ(i) (x), k .σij (x) .(1 ≤ i, j ≤ k) be real-valued Lipschitzian functions on .R . Then, writing, .σ (x) for the .k × k matrix .((σij (x))), |μ(x) − μ(y)| ≤ M|x − y|,
.
σ (x) − σ (y) ≤ M|x − y|,
(x, y ∈ Rk ),
(7.49)
for some positive constant M. Here . · denotes the matrix norm, D := sup{|Dz| : z ∈ Rk , |z| = 1}.
.
Consider the k-dimensional stochastic differential equation dXt = μ(Xt )dt + σ (Xt )dBt ,
.
(t ≥ α),
(7.50)
defined by the vector stochastic integral equation,
t
Xt = Xα +
.
α
μ(Xs )ds +
t
σ (Xs )dBs ,
(t ≥ s),
α
which, in turn, is a shorthand for the system of k equations
(7.51)
126
7 Construction of Diffusions
(i) .Xt
=
Xα(i)
t
+
t
μ (Xs )ds + (i)
α
σ i (Xs ) · dBs ,
(1 ≤ i ≤ k).
(7.52)
α
Here .σ i (x) is the k-dimensional vector, .(σi1 (x), . . . , σik (x)). The proofs of the following theorems are entirely analogous to those of Theorems 7.1 and 7.2 (Exercises 3 and 4). Basically, one only needs to write 2 |X(n+1) − X(n) t t |
.
t
| α
(n−1) (μ(X(n) ))ds|2 s ) − μ(Xs
(Xt(n+1) − Xt(n) )2 , t in place of ( (μ(Xs(n) ) − μ(Xs(n−1) ))ds)2
instead of
α
(n−1) 2 σ (X(n) ) , s ) − σ (Xs
for
(σ (Xs(n) ) − σ (Xs(n−1) )2 ,
etc.
Theorem 7.4 Suppose .μ(·), σ (·) satisfy (7.49). Then, for each .Fα –measurable square integrable random vector .Xα , there exists a unique (up to a P –null set) continuous element .X = {Xt : t ≥ α} of .M[α, ∞) such that (7.51) holds. Theorem 7.5 Suppose .μ(·), σ (·) satisfy (7.49), and let .Xx = {Xxt : t ≥ 0} denote the unique (up to a P –null set) continuous nonanticipative functional in .M[0, ∞) satisfying the integral equation
t
Xt = x +
.
0
t
μ(Xs )ds +
σ (Xs )dBs ,
(t ≥ 0).
(7.53)
0
Then .Xx = {Xxt : t ≥ 0} is a Markov process. Definition 7.4 The family of Markov processes .Xx = {Xxt : t ≥ 0}, x ∈ Rk define diffusion on .Rk with drift .μ(·) and diffusion matrix .σ (·)σ (·), .σ (·) being the transpose .σ (·). Remark 7.1 It may be noted that in Theorems 7.4 and 7.5 it is not assumed that σ (x) is positive definite. It is known from the theory of partial differential equations that the positive definiteness guarantees the existence of a density for the transition probability and its smoothness (see Friedman 1964), but it is not needed for the Markov property.
.
7.3 An Extension of the Itô Integral & SDEs with Locally Lipschitz Coefficients In this subsection we present a further extension of the previously developed theory to accommodate the construction of diffusions with locally Lipschitz drift and diffusion coefficients (parameters): For each .n, there is a constant .Mn such that for .|x|, |y| ≤ n,
7.3 An Extension of the Itô Integral & SDEs with Locally Lipschitz Coefficients
|μ(x) − μ(y)| ≤ Mn |x − y|
and |σ (x) − σ (y)| ≤ Mn |x − y|.
.
127
(7.54)
Example 2 For a simple example in anticipation of new phenomena, consider μ(x) = x 2 , x ∈ R, with .σ 2 ≡ 0. One has .|μ(x)−μ(y)| = |x +y||x −y| ≤ 2n|x −y| for .|x| ≤ n, |y| ≤ n. Observe that the solution to .dX(t) = X2 (t)dt, X(0) = 1, namely, the process .X(t) = (1 − t)−1 , 0 ≤ t < 1, “explodes”out of the state space at .t = 1.
.
The following result provides a natural way in which to view integration over intervals determined by a bounded stopping time. Proposition 7.6 Let .f ∈ M[α, β] and .τ a (bounded) stopping time with values in [α, β]. Then, writing
.
If (t) :=
t
α ≤ t ≤ β,
f (s)dB(s),
.
(7.55)
α
one has
β
If (τ ) =
.
α
(7.56)
f (s)1[τ ≥s] dB(s).
Proof First note that .s → 1[τ ≥s] is left-continuous on .[α, β] and, therefore, (s, ω) → 1[τ (ω)≥s] is nonanticipative, i.e., progressively measurable. Thus .g(s) := f (s)1[τ ≥s] ∈ M[α, β]. Assume first that .τ has finitely many values .si .(1 ≤ i ≤ k), .s1 < s2 < · · · < sk in .[α, β]. Then ⎧ ⎪ for s ∈ [α, s1 ], ⎪ ⎨1 .1[τ ≥s] = 1[τ ≥si+1 ] for s ∈ (si , si+1 )](i = 1, 2, · · · , k − 1), ⎪ ⎪ ⎩0 for s ∈ (s , β)]. .
k
By Theorem 7.3(a), writing .g(s) = f (s)1[τ ≥s] , .Ig (t) =
t α
g(s)dB(s), one has
s1
Ig (s1 ) =
f (s)dB(s),
.
α
s1
Ig (sj ) =
f (s)dB(s) +
α
i=1 s1
=
j −1
f (s)dB(s) +
α
Ig (β) = Ig (sk ).
j −1 i=1
si+1 si
f (s)1[τ ≥si+1 ] dB(s)
1[τ ≥si+1 ]
si+1
f (s)dB(s),
(j = 2, · · · , k),
si
(7.57)
128
7 Construction of Diffusions
Since on .[τ = sj ], one has .1[τ ≥si+1 ] = 1 for .i = 1, 2, · · · j − 1, and .1[τ ≥si+1 ] = 0 for .i ≥ j ,
s1
Ig (β) =
.
f (s)dB(s) +
α
i=1 sj
=
j −1
si+1
f (s)dB(s) si
f (s)dB(s) = If (sj ) = If (τ )
(j = 1, . . . , k)
on [τ = sj ],
α
proving (7.56). For general .τ (with values in .[α, β]) define the approximation .τ (n) = α + (β − α)i/2n on the event .[α + (β − α)(i − 1)/2n ≤ τ < α + (β − α)i/2n ] n (n) = β on .[τ = β]. Then .τ (n) ↓ τ , and .[τ (n) ≥ s] ↓ [τ ≥ s] .(i = 1, 2, · · · , 2 ), .τ for .s ∈ [α, β], as .n ↑ ∞. Writing .gn (s, ω) = f (s)1[τ (n) (ω)≥s] , one then has
β
E
.
(gn (s) − g(s))2 ds −→ 0
as n → ∞,
α
i.e., .gn → g in .M[α, β]. Hence, there exists a sequence of integers .n1 < n2 < · · · such that .Ignk (t) → Ig (t) uniformly over .t ∈ [α, β] as .k → ∞, outside a P – null set. But .Ignk (β) = If (τ (nk ) ) for all k (outside a P –null set). Taking limits as .k → ∞, we get (7.56). Remark 7.2 Note that in (7.57) we have made use of the relation .
t
gf (u)dB(u) = g
s
t
f (u)dB(u) s
for g a bounded .Fs –measurable random variable. This obviously holds for nonanticipative step functionals f and, therefore, for all .f ∈ M[s, t]. Our next task is to extend the notion of a stochastic integral to a larger class of summands than .M[α, β]. Definition 7.5 The set of all progressively measurable stochastic processes f satisfying
β
P(
.
f 2 (s)ds < ∞) = 1
(7.58)
α
is denoted by .L[α, β]. If (7.58) holds .∀ .β > α, then this space is .L[α, ∞). t We can proceed with the definition of . α f (s)dB(s) for .f ∈ L[α, ∞) as follows: Define the .{Ft : α ≤ t ≤ β}–stopping times τ
.
(n)
(ω) = inf{t ≥ α : α
t
f 2 (s, ω)ds ≥ n} ∧ n,
(integer n ≥ α).
7.3 An Extension of the Itô Integral & SDEs with Locally Lipschitz Coefficients
129
t Then .τ (n) ↑ ∞ a.s. as .n ↑ ∞. (If, for some .ω, . α f 2 (s, ω)ds < n .∀ t, then (n) (ω) = ∞ ∧ n = n; if . t f 2 (s, ω)ds ↑ ∞ as .t ↑ ∞, then .inf{t ≥ α : .τ α t 2 α f (s, ω)ds ≥ n} ↑ ∞ as .n ↑ ∞). Let fn (s) := f (s)1[τ (n) ≥s]
(n ≥ α, n integer).
.
Then
T
.
α
fn2 (s)ds ≤
τ (n) (ω)
f 2 (s, ω)ds ≤ n.
α
Hence .fn ∈ M[α, ∞) and, applying Proposition 7.6 with .τ = τ (m) ∧ t for .m < n, and noting that .fn (t) = fm (t) = f (t) for .t ≤ τ (m) , one has Ifn (t ∧ τ (m) ) =
t
.
α
fn (s)1[τ (m) ≥s] dBs =
α
t
f (s)1[τ (m) ≥s] dBs
≡ Ifm (t) a.s.,
(7.59)
so that, using continuity in t for all terms in (7.59), Ifn (t) = Ifm (t) ∀ t ≤ τ (m) , a.s., if n > m.
.
(7.60)
Definition 7.6 Let .f ∈ L[α, ∞). The stochastic integral
t
If (t) =
.
f (s)dBs , α ≤ s < ∞,
α
is defined by If (t) := Ifm (t) ∀ t ≤ τ (m)
.
(m = [α] + 1, [α] + 2, . . . ).
(7.61)
If .f ∈ L[α, β], then the same construction holds for .α ≤ β, with .τ (n) (ω) = tt ≤ t 2 (n) 2 inf{t ∈ [α, β] : α f (s, ω)ds ≥ n}, and .τ (ω) = β if . α f (s, ω)ds < n for all .t ≤ β. Remark 7.3 Note that .t → If (t) is continuous outside a P –null set if .f ∈ L[α, ∞). With the above definition of the stochastic integral, Proposition 7.6 may be extended to .f ∈ L[α, ∞) and bounded stopping times .τ ≤ β. To see this note that, with .fn as defined above, If (τ ) = lim Ifn (τ ) = lim
.
n→∞
≡ lim
n→∞ α
n→∞ α
β
β
fn (s)1[τ ≥s] dBs
f (s)1[τ ≥s] 1[τ (n) ≥s] dBs = lim Ig (β ∧ τ (n) ), n→∞
130
7 Construction of Diffusions
where .g(s) := f (s)1[τ ≥s] . We used Proposition 7.6 in the second and the last equalities above. Now let .n ↑ ∞ and use the continuity of the stochastic integral to get .If (τ ) = Ig (β). Proposition 7.6 and its extension given in Remark 7.3 allow us to construct diffusions with locally Lipschitz coefficients (parameters); recall (7.54). Assume that .μ(·), σ (·) are locally Lipschitzian. One may still construct a diffusion on .Rk satisfying (7.34) up to a random time, .ζ , referred to as explosion time, as suggested by the definition to follow below. For this, let .{Xxt,n : t ≥ 0} be the solution of (7.34) with globally Lipschitzian coefficients .μn (·), σ n (·) satisfying μn (y) = μ(y)
and
.
σ n (y) = σ (y)
for |y| ≤ n.
(7.62)
We may, for example, let .μn (y) = μ(ny/|y|) and .σ n (y) = σ (ny/|y|) for .|y| > n, so that (7.33) holds for all .x, y with .M = Mn . Let us show that, if .|x| ≤ n, Xxt,m (ω) = Xxt,n (ω)
.
for 0 ≤ t < ζn (ω)
(m ≥ n)
(7.63)
outside a set of zero probability, where ζn (ω) := inf{t ≥ 0 : |Xxt,n (ω)| = n}.
(7.64)
.
To prove (7.63), first note that if .m ≥ n then μm (y) = μn (y) and σ m (y) = σ n (y), for |y| ≤ n,
.
so that for .0 ≤ t < ζn (ω), μm (Xxt,n ) = μn (Xxt,n )
.
and
σ m (Xxt,n ) = σ n (Xxt,n ).
(7.65)
It follows from (7.65), Proposition 7.6 with .t ∧ ζn for .τ and continuity of the stochastic integral, that for .0 ≤ t ≤ ζn ,
t
.
0
μm (Xxs,n )ds
+ 0
t
σ m (Xxs,n )dBs
t
= 0
μn (Xxs,n )ds
t
+ 0
σ n (Xxs,n )dBs
almost surely. Therefore, a.s. for .0 ≤ t ≤ ζn , Xxt,m − Xxt,n =
t
.
0
[μm (Xxs,m ) − μm (Xxs,n )]ds +
0
t
[σ m (Xxs,m ) − σ m (Xxs,n )]dBu . (7.66)
In particular, from sample path continuity, it follows that .ζn , n ≥ 1, is an a.s. monotone sequence so that .ζn ↑ a.s. to (the explosion time) .ζ , say, as .n ↑ ∞, and
7.4 Strong Markov Property
131
Xxt (ω) = lim Xxt,n (ω),
.
n→∞
(7.67)
t < ζ (ω),
exists a.s., and has a.s. continuous sample paths on .[0, ζ (ω)). If .P (ζ < ∞) > 0, then we say that explosion occurs. In this case, one may continue the process .{Xxt } for .t ≥ ζ (ω) by setting Xxt (ω) = “∞”,
.
t ≥ ζ (ω),
(7.68)
where “.∞” is a new state, usually taken to be the point at infinity in the one-point compactification of .Rk . In addition to all open subsets of .Rk , complements of all compact subsets of .Rk with the new point “.∞” adjoined to them define the topology of .Rk ∪ {“∞”}. The two points .+∞ and .−∞ may be used in one dimension. Using the Markov property of .{Xxt,n } for each n, it is not difficult to show that x k .{Xt : t ≥ 0} defined by (7.67) and (7.68) is a Markov process on .R ∪ {“∞”}, called the minimal diffusion generated by (Exercise 9) A=
.
1 ∂2 ∂ μ(i) (x) (i) , dij (x) (i) (j ) + 2 ∂x ∂x ∂x
(7.69)
with drift vector .μ(·) and diffusion matrix .d(·) = σ (·)σ (·).
7.4 Strong Markov Property In this sub-section we derive the important result that the diffusion .{Xx , x ∈ Rk } has the strong Markov property. We will denote the distribution of .Xx by .Qx . Theorem 7.7 Under the hypothesis of Theorem 7.5, .Xx .(x ∈ Rk ) has the strong Markov property with respect to .Fτ , for every .{Ft : t ≥ 0}–stopping time .τ . According to Theorem 1.6, we only need to show the Feller property, i.e., the map .x → p(t; x, dy) is weakly continuous. The following proposition, which is of independent interest, implies this. Proposition 7.8 Under the hypothesis of Theorem 7.5, one has .
E( max |Xxs − Xzs |2 )
(7.70)
0≤s≤t
≤ 3|x − z|2 [1 + (3t + 12)M 2 t exp{
3M 2 t 2 + 12M 2 t}], 2
0 ≤ t ≤ T , T > 0, 1
with the matrix norm in (7.49) taken as .C = (Trace C C) 2 . Proof Since
132
7 Construction of Diffusions
x .Xt
− Xzt
t
=x−z+ 0
{μ(Xxs ) − μ(Xzs )}ds
+
t
0
{σ (Xxs ) − σ (Xzs )}dBs , t ≥ 0,
one has .
max |Xxs − Xzs | ≤ |x − z| + M
0≤s≤t
s
+ max | 0≤s≤t
0
t
0
|Xxs − Xzs |ds
{σ (Xxu ) − σ (Xzu )}dBu |,
t ≥ 0.
Using the Cauchy–Schwarz inequality on the middle term and Doob’s maximal inequality (for second moment) for the third, one gets Dt := E( max |Xxs − Xzs |2 )
.
0≤s≤t
≤ 3|x − z|2 + 3M 2 t
t
0
E|Xxs − Xzs |2 ds + 12M 2
t
≤ 3|x − z|2 + (3M 2 t + 12M 2 )
Ds ds,
t 0
E|Xxs − Xzs |2 ds
t ≥ 0.
0
We have here used the relations s . E( max | (σ (Xxu ) − σ (Xzu ))dBu |2 ) 0≤s≤t
0
k s ( (σ i · (Xxu ) − σ i (Xzu )) · dBu ]2 ) = E( max 0≤s≤t
i=1
0
t k ≤ 4 E|σ i (Xxu ) − σ i (Xuz )|2 du 0
i=1
t
=4 0
Eσ (Xxu ) − σ (Xzu )2 du
≤ 4M 2 0
t
E|Xxu − Xzu |2 du.
The inequality (7.70) now follows from Gronwall’s inequality below.
Proof of Theorem 7.7 It follows from (7.70) that .Xx converges to .Xz in probability as .x → z, with respect to the topology of uniform convergence on compact intervals in the space .C([0, ∞) : Rk ). Hence .Xx converges in distribution to .Xz as .x → z. That is, .x → Qx is weakly continuous.
7.5 An Extension to SDEs with Nonhomogeneous Coefficients
133
Suppose an integrable .α : [0, T ] → R
Lemma 1 (Gronwall’s Inequality) satisfies the inequality
t
α(t) ≤ C0 + H (t)
0 ≤ t ≤ T,
α(s)ds,
.
(7.71)
0
where H is a nonnegative integrable function on .[0, T ]. Then
t
α(t) ≤ C0 [1 + H (t)
exp{
.
0
t
0 ≤ t ≤ T.
H (u)du}ds],
(7.72)
s
t Proof Write .g(t) = 0 α(s)ds. Then the inequality (7.71) may be expressed as t t
.g (t) − H (t)g(t) ≤ C0 or .(d/dt)[exp{− 0 H (s)ds}g(t)] ≤ C0 exp{− 0 H (s)ds}. Integrating this one gets g(t) ≤ C0 e
.
t 0
t
H (s)ds 0
e−
s 0
H (u)du
ds = C0
t
exp{ 0
t
H (u)du}ds. s
Using this estimate in the right side of (7.71), one gets (7.72). If one uses the matrix norm 1
C := max |Cx| = (largest eigenvalue of C C) 2
.
|x|=1
(under the assumption (7.49)), then in k dimensions the right side of (7.70) changes 2 2 to .3|x − z|2 [1 + (3t + 12k)M 2 t exp{ 3M2 t + 12M 2 kt}] (Exercise 12(b)). As noted in the previous subsection on processes with locally Lipschitz coefficients, one may establish the Markov property for the minimal process. In fact, one may prove the strong Markov property for the minimal process as well (Exercise 9). Criteria for explosion/nonexplosion based on on the coefficients of the diffusion are provided in Chapter 12.
7.5 An Extension to SDEs with Nonhomogeneous Coefficients The construction of (nonhomogeneous) diffusions with time-dependent drift and diffusion coefficients, say .μ(x, t) and .σ (x, t), will be solved by viewing these as “homogeneous coefficients” for a space-time process .(Xt , t) evolving in .Rk × [0, ∞). So suppose that we are given functions .μ(i) (x, t), σij (x, t) on .Rk × [0, ∞) into .R. Write .μ(x, t) for the vector whose ith component is .μ(i) (x, t), σ (x, t) for the .k × k matrix whose .(i, j ) element is .σij (x, t). Given .s ≥ 0, we would like to solve the stochastic differential equation
134
7 Construction of Diffusions
dXt = μ(Xt , t)dt + σ (Xt , t)dBt
(t ≥ s),
.
Xs = Z,
(7.73)
where .Z is .Fs -measurable, .E|Z|2 < ∞. It is not difficult to extend the proofs of Theorems 7.1, 7.2 (also see Theorems 7.4, 7.5) to prove the following result (Exercise 10). Theorem 7.9 Assume .
|μ(x, t) − μ(y, s)| ≤ M{|x − y| + |t − s|}, σ (x, t) − σ (y, s) ≤ M{|x − y| + |t − s|},
(7.74)
for some constant M. a. Then for every .Fs -measurable k-dimensional square integrable .Z, there exists a unique nonanticipative continuous solution .X = {Xt : t ≥ s} in .M[s, ∞) of
t
Xt = Z +
.
s
t
μ(Xu , u)du +
(t ≥ s).
σ (Xu , u)dBu
(7.75)
s
b. Let .{Xs,x : t ≥ s} denote the solution of (7.75) with .Z = x ∈ Rk . Then it is a t (generally nonhomogeneous) Markov process with (initial) state .x at time s and transition probability
p(s , t; z, B) = P (Xst z ∈ B)
.
(s ≤ s ≤ t).
.
Remark 7.4 To solve (7.73), one may introduce an additional coordinate t and solve the augmented equation for .Yt = (Xt , t) , with a drift .(μ(x, t), 1) , diffusion matrix .(σ (x, t), 0) augmented by a column of zeros, and a .(k + 1)-dimensional Brownian (k+1) ), .t ≥ 0. motion .(Bt , Bt
7.6 An Extension to k−Dimensional SDE Governed by r−Dimensional Brownian Motion Finally, we consider k–dimensional diffusions .Xt = (Xt , . . . , Xt ) governed by r-dimensional Brownian motions .Bt = (Bt(1) , . . . , Bt(r) ) illustrated by the following examples. (1)
(k)
Example 3 (k = 1, r =2) Suppose that .Bj = {Bj (t) : t ≥ 0}, j = 1, 2, are two independent one-dimensional Brownian motions and one wishes to consider the one-dimensional diffusion defined by
Exercises
135
dX(t) = σ1 dB1 (t) + σ2 dB2 (t),
(7.76)
.
for positive constants .σ1 , σ2 . Then, up to specification of the initial state, .X = {X(t) : t ≥ 0} is well-defined and equivalently given by the one-dimensional diffusion equation dX(t) =
.
σ12 + σ22 dB(t).
(7.77)
The following example illustrates another possible dimensional effect. Example 4 (k = 2, r = 1) Suppose that .{Bt : t ≥ 0} is a one-dimensional standard Brownian motion and consider the two-dimensional diffusion .{(X1 (t), X2 (t)) : t ≥ 0} defined by .
dX1 (t) = σ1 dBt dX2 (t) = σ2 dBt .
(7.78)
In this case, the “two-dimensional” process is also well defined and is determined by a single one-dimensional Brownian motion. More generally, let .μ(·) be a Lipschitz function on .Rk into .Rk and .σ (·) a Lipschitz function on .Rk into the space of .k × r matrices. Here r need not equal k. Consider now the Equations (7.50), (7.51) with these changes. In particular, .σi· (x) := (σi1 (x), · · · , σir (x)) is the i-th row of .σ (x), .1 ≤ i ≤ k: Xt(i) = Xα(i) +
t
.
μ(i) (Xs )ds +
α
t r
(j )
σij (Xs )dBs ,
(1 ≤ i ≤ k).
α j =1
The proofs of Theorems 7.4, 7.5 go over without any essential change (Exercise 11).
Exercises

1. If h is a Lipschitzian function on R and X is a square integrable random variable, then prove that Eh²(X) < ∞. [Hint: Look at h(X) − h(0).]
2. Assume the hypothesis of Theorem 7.1 with α = 0.
   (i) Prove that E(X_t − X_0)² ≤ 4M²(t + 1) ∫_0^t E(X_s − X_0)² ds + 4t²Eμ²(X_0) + 4tEσ²(X_0).
   (ii) Write D_t := E(max{(X_s − X_0)² : 0 ≤ s ≤ t}), 0 ≤ t ≤ T. Show from (i) that (a) D_t ≤ d_1 ∫_0^t D_s ds + d_2, where d_1 = 4M²(t + 4), d_2 = 4t²Eμ²(X_0) + 16tEσ²(X_0); (b) D_T ≤ d_2 e^{d_1 T}.
3. Write out a proof of Theorem 7.4 by extending step by step the proof of Theorem 7.1.
4. Write out a proof of Theorem 7.5 along the lines of that of Theorem 7.2.
5. Calculate the bound on D_T^(1) in (7.14).
6. Show that measurability of θ(s, t; z, B_{st}) follows from (7.5)–(7.7).
7. Consider the Gaussian diffusion X^x = {X_t^x : t ≥ 0} given by (7.36). (i) Show that EX_t^x = e^{−tγ}x. (ii) Compute Cov(X_t^x, X_{t+h}^x). (iii) Compute the transition probability distribution p(t; x, dy) for the Ornstein–Uhlenbeck process. [Hint: The finite-dimensional distributions have a Gaussian density.] (iv) For the case γ > 0, σ ≠ 0, compute the asymptotic distribution of X_t as t → ∞. [Hint: The Riemann integral ∫_0^t e^{−γ(t−s)} B_s ds is a limit of partial sums, and each partial sum has a Gaussian distribution.]
8. (i) Prove (7.30) and (7.32) under the assumption (7.3). [Hint: Show that the distribution of X_{t+h} − X_t, given X_t = x, differs from the distribution of μ(x)h + σ(x)B_h by o(h) as h ↓ 0, where B_h = B_{t+h} − B_t is Gaussian with mean zero and variance h. Use the expected squared error between the two quantities.] (ii) Write out a corresponding proof for the multidimensional case.
9. Show that the minimal process X^x defined by (7.67), (7.68) on R^k ∪ {∞} is Markov and strong Markov.
10. (a) Write out a detailed proof of Theorem 7.9 along the lines of those of Theorems 7.1, 7.2. (b) Show that a different proof of Theorem 7.9 follows directly from Theorems 7.4, 7.5, using Remark 7.4, assuming |μ(x, t) − μ(y, s)| ≤ M(|x − y| + |t − s|), |σ(x, t) − σ(y, s)| ≤ M(|x − y| + |t − s|).
11. Verify that the proofs of existence and uniqueness of solutions and their Markov property remain essentially unchanged for rectangular systems.
12. (a) (i) Suppose (7.49) holds with respect to the matrix norm ‖C‖ := (Trace C′C)^{1/2} in k dimensions (k > 1). Prove that the estimates of D_T^{(n+1)} satisfy (7.13) (with c_1 as in (7.14)), and that D_T^{(1)} satisfies (7.14) with μ²(·) and σ²(·) replaced by |μ(·)|² and ‖σ(·)‖², respectively. (ii) How do these estimates change if the matrix norm used is ‖C‖ = max_{|x|=1} |Cx| = (largest eigenvalue of C′C)^{1/2}? (b) Show that with the matrix norm as in (a)(ii) (used in assumption (7.33)), the right side of (7.70) needs to be modified as indicated following the proof of Gronwall's inequality (Lemma 1).
13. Let A and σ be k × k real matrices and use Picard iteration to solve the k-dimensional Langevin equation dX(t) = AX(t)dt + σ dB(t), t > 0, X(0) = x. Under what conditions is the solution a nondegenerate k-dimensional Gaussian process?
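As a numerical companion to Exercise 13 (a sketch, not part of the text: the matrices A and σ, the step size, and the sample size below are arbitrary illustrative choices), one may simulate the k-dimensional Langevin equation by Euler–Maruyama and check that the sample mean of X(T) approaches e^{TA}x, which is what Picard iteration yields for the mean of the solution.

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)
A = np.array([[-1.0, 0.5], [0.0, -2.0]])    # illustrative drift matrix
sig = np.array([[0.3, 0.0], [0.1, 0.2]])    # illustrative diffusion matrix (k x k)
x0 = np.array([1.0, -1.0])
T, n_steps, n_paths = 1.0, 400, 20000
dt = T / n_steps

X = np.tile(x0, (n_paths, 1))               # one row per simulated path
for _ in range(n_steps):
    dB = rng.normal(0.0, np.sqrt(dt), size=(n_paths, 2))
    X = X + X @ A.T * dt + dB @ sig.T       # Euler-Maruyama step, vectorized over paths

print("empirical mean of X(T):", X.mean(axis=0))
print("exact mean e^{TA} x0:  ", expm(T * A) @ x0)
```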
Chapter 8
Itô’s Lemma
This chapter includes one of the most impactful “chain rule” formulae of the Itô calculus, the celebrated Itô’s lemma. Both standard and novel illustrative applications are included. In addition, well-known and useful criteria for recurrence and transience of one-dimensional diffusions are derived as an elegant consequence of Itô’s theory.
As observed in Proposition 6.1 and Exercises 1, 2 in Chapter 6, the sample paths of Brownian motion are nowhere differentiable.

Let t > 0. Then one has, for i ≠ j, as n → ∞, a.s.,

max_{1≤k≤2^n} | Σ_{m=0}^{k−1} (B^(i)_{(m+1)2^{−n}t} − B^(i)_{m2^{−n}t})(B^(j)_{(m+1)2^{−n}t} − B^(j)_{m2^{−n}t}) | → 0.   (8.10)
Proof Note that the N = 2^n terms (B^(i)_{(m+1)2^{−n}t} − B^(i)_{m2^{−n}t})(B^(j)_{(m+1)2^{−n}t} − B^(j)_{m2^{−n}t}), m = 0, 1, ..., 2^n − 1, are independent with common mean zero. Therefore, summing them up to k ≤ 2^n defines a martingale sequence Ỹ_k ≡ Ỹ_{k,n}, for k = 1, ..., N = 2^n. Applying Doob's maximal inequality, one obtains

P( max_{1≤k≤2^n} | Σ_{m=0}^{k−1} (B^(i)_{(m+1)2^{−n}t} − B^(i)_{m2^{−n}t})(B^(j)_{(m+1)2^{−n}t} − B^(j)_{m2^{−n}t}) | > 2^{−n/4} ) ≤ E(Ỹ_N)² / 2^{−n/2} = N 2^{n/2} E(B^(i)_{2^{−n}t})² E(B^(j)_{2^{−n}t})² = N(2^{−n}t)² 2^{n/2} = 2^{−n/2} t².   (8.11)

Thus, one may apply the Borel–Cantelli lemma to obtain the assertion.

The following are important corollaries to Itô's lemma as formulated in (8.7).
Corollary 8.2 Suppose that ϕ is a twice continuously differentiable real-valued function with bounded derivatives on R^k. Let X^x be the k-dimensional diffusion defined by

dX^x_t = μ(X^x_t) dt + σ(X^x_t) dB_t,   (8.12)

with Lipschitz coefficients μ(x), σ(x). Then

M_t := ϕ(X^x_t) − ∫_0^t Aϕ(X^x_s) ds,   t ≥ 0,

is a martingale, where

A = Σ_{1≤i≤k} μ^(i)(x) ∂/∂x^i + (1/2) Σ_{1≤i,j≤k} d_{ij}(x) ∂²/∂x^i∂x^j,   (8.13)

for D = ((d_{ij})) = σσ′.

As a direct consequence of the above corollary and the optional stopping theorem, one obtains the following result.

Corollary 8.3 Under the hypothesis of Corollary 8.2, assume that Aϕ(x) = 0 for all x. Then for any stopping time τ such that {X_{τ∧t} : t ≥ 0} is uniformly integrable, one has

Eϕ(X^x_τ) = ϕ(x),   x ∈ R^k.

Definition 8.1 A function ϕ ∈ D_A such that Aϕ(x) = 0 for all x is said to be A-harmonic.

Example 1 For an example application, consider the evaluation of ∫_0^t B_s dB_s. Consider Y(t) = (1/2)B²(t). Then, by Itô's lemma, noting dB(t) = 0·dt + 1·dB_t,

dY(t) = (1/2) dt + B_t dB_t.

It follows that

∫_0^t B_s dB_s = Y(t) − (1/2)t = (1/2)B_t² − (1/2)t.

Similarly, one may evaluate ∫_0^t B_s² dB_s = (1/3)B_t³ − ∫_0^t B_s ds.
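A quick Monte Carlo sanity check of the identity ∫_0^t B_s dB_s = (1/2)B_t² − (1/2)t can be carried out with left-endpoint (Itô) Riemann sums. This is only an illustrative numerical sketch, with the horizon and step count chosen arbitrarily.

```python
import numpy as np

rng = np.random.default_rng(2)
t, n = 1.0, 200_000
dt = t / n
dB = rng.normal(0.0, np.sqrt(dt), size=n)
B = np.concatenate(([0.0], np.cumsum(dB)))

# Ito integral: value of B at the *left* endpoint times the increment.
ito_sum = np.sum(B[:-1] * dB)
exact = 0.5 * B[-1] ** 2 - 0.5 * t
print(ito_sum, exact)   # the two numbers agree up to discretization error
```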
8 Itô’s Lemma
143
Example 2 Using the fact that stochastic integrals are martingales with expectation zero, one may obtain a simple recursion between the even moments EB_t^{2m} as follows. By Itô's lemma,

EB_t^{2m} = E( ∫_0^t (2m(2m − 1)/2) B_s^{2m−2} ds + ∫_0^t 2mB_s^{2m−1} dB_s ) = (2m(2m − 1)/2) ∫_0^t EB_s^{2m−2} ds.

Thus EB_t² = t, EB_t⁴ = 3t², .... The same argument as above applied to a function ϕ(t, y) on [0, T] × R, such that ϕ_0 := ∂ϕ/∂t, ϕ′, ϕ″ are continuous and bounded, leads to

dϕ(t, Y(t)) = {ϕ_0(t, Y(t)) + ϕ′(t, Y(t))f(t) + (1/2)ϕ″(t, Y(t))g²(t)} dt + ϕ′(t, Y(t))g(t) dB_t.   (8.14)

To state an extension to multidimensions, let ϕ(t, y) be a real-valued function on [0, T] × R^m that is once continuously differentiable in t and twice in y. Write

∂_0ϕ(t, y) := ∂ϕ(t, y)/∂t,   ∂_rϕ(t, y) := ∂ϕ(t, y)/∂y^(r)   (1 ≤ r ≤ m).   (8.15)

Let B = {B_t : t ≥ 0} be a k-dimensional standard Brownian motion satisfying conditions ((6.9)(i), (ii)) (with B_s replaced by B_s). Suppose Y(t) is a vector of m processes of the form

Y(t) = (Y^(1)(t), ..., Y^(m)(t)),   Y^(r)(t) = Y^(r)(0) + ∫_0^t f_r(s) ds + ∫_0^t g_r(s)·dB_s   (1 ≤ r ≤ m).   (8.16)

Here, f_1, ..., f_m are real-valued and g_1, ..., g_m vector-valued (with values in R^k) nonanticipative functionals belonging to M[0, T]. Also the Y^(r)(0) are F_0-measurable square integrable random variables. One may express (8.16) in the differential form

dY^(r)(t) = f_r(t) dt + g_r(t)·dB_t   (1 ≤ r ≤ m),   (8.17)

or, equivalently, expressing Y and f as m × 1 vectors, g an m × k matrix with row vectors g_r, r = 1, ..., m, and B as a k × 1 vector, one may also express the system, suitably interpreted coordinate-wise, as

dY(t) = f(t) dt + g(t) dB_t,   t > 0.   (8.18)

Itô's lemma says
dϕ(t, Y(t)) = {∂_0ϕ(t, Y(t)) + Σ_{r=1}^m ∂_rϕ(t, Y(t)) f_r(t) + (1/2!) Σ_{1≤r,r′≤m} ∂_r∂_{r′}ϕ(t, Y(t)) (g_r(t)·g_{r′}(t))} dt + Σ_{r=1}^m ∂_rϕ(t, Y(t)) g_r(t)·dB_t.   (8.19)

In order to arrive at this, write

dϕ(t, Y(t)) = ϕ(t + dt, Y(t + dt)) − ϕ(t, Y(t)) = [ϕ(t + dt, Y(t + dt)) − ϕ(t, Y(t + dt))] + [ϕ(t, Y(t + dt)) − ϕ(t, Y(t))]
= ∂_0ϕ(t, Y(t)) dt + Σ_{r=1}^m ∂_rϕ(t, Y(t)) dY^(r)(t) + (1/2!) Σ_{1≤r,r′≤m} ∂_r∂_{r′}ϕ(t, Y(t)) dY^(r)(t) dY^(r′)(t).   (8.20)
In the usual Newton–Leibniz calculus, of course, the contribution of the last sum to the differential would be zero. But there is one term in the product dY^(r)(t) dY^(r′)(t) which is of the order dt and must, therefore, be retained in computing the stochastic differential. This term is (see (8.17))

(g_r(t)·dB_t)(g_{r′}(t)·dB_t) = ( Σ_{i=1}^k g_r^(i)(t) dB_t^(i) )( Σ_{j=1}^k g_{r′}^(j)(t) dB_t^(j) ) = Σ_{i=1}^k g_r^(i)(t) g_{r′}^(i)(t) (dB_t^(i))² + Σ_{i≠j} g_r^(i)(t) g_{r′}^(j)(t) dB_t^(i) dB_t^(j).   (8.21)

Now, as seen above (see (8.10)), the first sum in (8.21) equals

( Σ_{i=1}^k g_r^(i)(t) g_{r′}^(i)(t) ) dt = g_r(t)·g_{r′}(t) dt.   (8.22)

To show that the contribution of the second term in (8.21) to ϕ(t, Y(t)) − ϕ(0, Y(0)) over any interval [0, t] is zero, note that, for i ≠ j,
8 Itô’s Lemma
Zk,n :=
k−1
.
145
gr(i) (m2−n t)gr ( (j )
m=0
m (j ) (j ) (i) t)(B (i) m+1 − B m t )(B m+1 − B m t ), n t t n n n 2 2n 2 2 2 1 ≤ k ≤ 2n ,
(8.23)
is a martingale. This is true as the conditional expectation of the .m−th summand, given .Fm2−n t , is zero. Therefore, as in (8.10), .
max |Zk,n |→0
1≤k≤2n
as n → ∞.
a.s.
(8.24)
Thus, .
i=j
(j )
(j )
gr(i) (t)gr (t)dBt(i) dBt
=0
(i = j ).
(8.25)
Using (8.22), (8.25) in (8.20), Itô’s lemma (8.19) is obtained. A more elaborate argument is given below. For ease of reference, here is a statement of Itô’s lemma from a perspective of stochastic calculus. Theorem 8.4 (Itô’s Lemma) Assume .f1 , . . . , .fm , .g1 , . . . , .gm belong to .L[0, T ], and .Y (r) (0) .(1 ≤ r ≤ m) are .F0 -measurable and square integrable. Let .Y = {Y (t) : t ≥ 0} denote the process defined by (8.16). Let .ϕ(t, y) be a real-valued function on .[0, T ] × Rm , which is once continuously differentiable in t and twice in .y. Then (8.19) holds, i.e., for .0 ≤ s < t ≤ T ,
t
ϕ(t, Y(t)) − ϕ(s, Y(s)) =
.
{∂0 ϕ(u, Y(u)) +
s
m
∂r ϕ(u, Y(u))fr (u)
r=1
+
+
1 2!
∂r ∂r ϕ(u, Y(u))gr (u) · gr (u)}du
1≤r,r ≤m
m
t
∂r ϕ(u, Y(u))gr (u) · dBu .
(8.26)
r=1 s
Proof Assume first that f_r, g_r are nonanticipative step functionals bounded by a constant and that ∂_0ϕ, ∂_rϕ, ∂_r∂_{r′}ϕ are bounded on [0, T] × R^m. The general case follows by approximating f_r, g_r by step functionals and taking limits in probability. Fix s < t (0 ≤ s < t ≤ T). In view of the additivity of the (Riemann and stochastic) integrals, it is enough to consider the case f_r(s′) = f_r(s), g_r(s′) = g_r(s) for s ≤ s′ ≤ t. Let t_0^(n) = s < t_1^(n) < ··· < t_{N_n}^(n) = t be a sequence of partitions such that
δ_n := max{t_{i+1}^(n) − t_i^(n) : 0 ≤ i ≤ N_n − 1} → 0   as n → ∞.

Write

ϕ(t, Y(t)) − ϕ(s, Y(s)) = Σ_{i=0}^{N_n−1} [ϕ(t_{i+1}^(n), Y(t_{i+1}^(n))) − ϕ(t_i^(n), Y(t_{i+1}^(n)))] + Σ_{i=0}^{N_n−1} [ϕ(t_i^(n), Y(t_{i+1}^(n))) − ϕ(t_i^(n), Y(t_i^(n)))].   (8.27)

The first sum on the right may be expressed as

Σ_{i=0}^{N_n−1} (t_{i+1}^(n) − t_i^(n)) {∂_0ϕ(t_i^(n), Y(t_{i+1}^(n))) + R_{n,i}^(1)},   (8.28)

where

|R_{n,i}^(1)| ≤ max{ |∂_0ϕ(u, Y(t_{i+1}^(n))) − ∂_0ϕ(u′, Y(t_{i+1}^(n)))| : t_i^(n) ≤ u ≤ u′ ≤ t_{i+1}^(n) } → 0

uniformly in i (for each ω), as n → ∞, because ∂_0ϕ is uniformly continuous on [s, t] × {Y(u, ω) : s ≤ u ≤ t}. Hence, (8.28) converges, for all ω, to

J_0 := ∫_s^t ∂_0ϕ(u, Y(u)) du.   (8.29)
By a Taylor expansion, the second sum on the right in (8.27) may be expressed as

Σ_{i=0}^{N_n−1} Σ_{r=1}^m (Y^(r)(t_{i+1}^(n)) − Y^(r)(t_i^(n))) ∂_rϕ(t_i^(n), Y(t_i^(n)))
+ (1/2!) Σ_{1≤r,r′≤m} Σ_{i=0}^{N_n−1} (Y^(r)(t_{i+1}^(n)) − Y^(r)(t_i^(n)))(Y^(r′)(t_{i+1}^(n)) − Y^(r′)(t_i^(n))) × {∂_r∂_{r′}ϕ(t_i^(n), Y(t_i^(n))) + R^(2)_{r,r′,n,i}},   (8.30)

where R^(2)_{r,r′,n,i} → 0 uniformly in i (for each ω), because (u, y) → ∂_r∂_{r′}ϕ(u, y) and t → Y(t) are continuous. Using (8.16), the in-probability limit of the first sum in (8.30) is expressed as

Σ_{r=1}^m Σ_{i=0}^{N_n−1} {(t_{i+1}^(n) − t_i^(n)) f_r(t_i^(n)) + g_r(t_i^(n))·(B_{t_{i+1}^(n)} − B_{t_i^(n)})} ∂_rϕ(t_i^(n), Y(t_i^(n)))
→ J_11 := Σ_{r=1}^m [ ∫_s^t f_r(u) ∂_rϕ(u, Y(u)) du + ∫_s^t ∂_rϕ(u, Y(u)) g_r(u)·dB_u ].   (8.31)

For ∂_rϕ is continuous and bounded, so that

Σ_{i=0}^{N_n−1} ∫_{t_i^(n)}^{t_{i+1}^(n)} (∂_rϕ(t_i^(n), Y(t_i^(n))) − ∂_rϕ(u, Y(u))) g_r(u)·dB_u = ∫_s^t h_n(u)·dB_u → 0

in probability, since ∫_s^t |h_n(u)|² du → 0. It remains to find the limiting value of the second sum in (8.30) (excluding the remainder). For this, write

Σ_{i=0}^{N_n−1} (Y^(r)(t_{i+1}^(n)) − Y^(r)(t_i^(n)))(Y^(r′)(t_{i+1}^(n)) − Y^(r′)(t_i^(n))) ∂_r∂_{r′}ϕ(t_i^(n), Y(t_i^(n)))
= Σ_{i=0}^{N_n−1} {(t_{i+1}^(n) − t_i^(n)) f_r(t_i^(n)) + g_r(t_i^(n))·(B_{t_{i+1}^(n)} − B_{t_i^(n)})} × {(t_{i+1}^(n) − t_i^(n)) f_{r′}(t_i^(n)) + g_{r′}(t_i^(n))·(B_{t_{i+1}^(n)} − B_{t_i^(n)})} ∂_r∂_{r′}ϕ(t_i^(n), Y(t_i^(n))).   (8.32)

By (8.10), (8.23) and (8.24), (8.32) converges a.s. (for a suitable sequence of partitions) to

lim_n Σ_{i=0}^{N_n−1} { Σ_{j=1}^k g_r^(j)(t_i^(n))(B^(j)_{t_{i+1}^(n)} − B^(j)_{t_i^(n)}) }{ Σ_{j=1}^k g_{r′}^(j)(t_i^(n))(B^(j)_{t_{i+1}^(n)} − B^(j)_{t_i^(n)}) } ∂_r∂_{r′}ϕ(t_i^(n), Y(t_i^(n)))
= lim_n Σ_{i=0}^{N_n−1} Σ_{j=1}^k g_r^(j)(t_i^(n)) g_{r′}^(j)(t_i^(n)) (B^(j)_{t_{i+1}^(n)} − B^(j)_{t_i^(n)})² ∂_r∂_{r′}ϕ(t_i^(n), Y(t_i^(n)))
= Σ_{j=1}^k lim_n Σ_{i=0}^{N_n−1} g_r^(j)(t_i^(n)) g_{r′}^(j)(t_i^(n)) (t_{i+1}^(n) − t_i^(n)) ∂_r∂_{r′}ϕ(t_i^(n), Y(t_i^(n)))
= J_12 := Σ_{j=1}^k ∫_s^t g_r^(j)(u) g_{r′}^(j)(u) ∂_r∂_{r′}ϕ(u, Y(u)) du.   (8.33)

The proof is completed by adding J_0, J_11, and J_12.
Applying Itô’s lemma to the diffusion .Y(t) = Xt .(t ≥ 0) constructed in Chapter 6, the following “SDE version”, also called Itô’s lemma, is obtained immediately. Corollary 8.5 (Itô’s Lemma for Diffusions) Let .X be a diffusion given by the solution to the stochastic differential equation (7.50), with .α = 0 and .μ(·), σ (·) Lipschitzian. Assume .ϕ(t, y) satisfies the hypothesis of Theorem 8.4 with .m = k. a. Then one has the relation ϕ(t, Xt ) = ϕ(s, Xs ) +
.
t
{∂0 ϕ(u, Xu ) + (Aϕ)(u, Xu )} du
s
+
k
t
∂r ϕ(u, Xu )σ r (Xu ) · dBu ,
(8.34)
r=1 s
where .A is the differential operator (Aϕ)(u, x) :=
.
1 2
drr (x)∂r ∂r ϕ(u, x) +
1≤r,r ≤k
k
μ(r) (x)∂r ϕ(u, x),
(8.35)
r=1
for ((drr (x))) := σ (x)σ (x).
.
(8.36)
b. In particular, if .∂0 ϕ, ∂rr ϕ are bounded, then
t
Zt := ϕ(t, Xt ) −
.
0
{∂0 ϕ(u, Xu ) + (Aϕ)(u, Xu )} du
(0 ≤ t ≤ T ), (8.37)
is a .{Ft }-martingale. Note that part (b) is an immediate consequence of part (a), since .Zt − Zs equals the stochastic integral appearing in the right-hand side of (8.34), whose conditional expectation, given .Fs , is zero. Remark 8.1 (Itô’s Lemma for Diffusions with Locally Lipschitz Coefficients) An appropriate version of Corollary 8.5 extends to the case of diffusions with locally Lipschitz coefficients. For example, by Theorem 8.4, (8.34) holds when t is changed to .t ∧ τ where .τ = inf{u ≥ 0 : |Xux | ≥ R} for every R such that .|x| < R. The (symmetric) matrix .Hϕ := ((∂r ∂r ϕ))1≤r,r ≤m appearing in the definition of .Aϕ is referred to as the Hessian matrix. Borrowing other notions from vector calculus, such as the gradient vector of the (scalar) function .ϕ defined by ∂ϕ .grad ϕ(t, x) ≡ ∇ϕ(t, x) := ( ∂xr (t, x))1≤r≤m , one may more succinctly express the operator .A as
8 Itô’s Lemma
149
Aϕ(t, x) = (1/2) Trace(D Hϕ)(t, x) + μ·∇ϕ(t, x).   (8.38)
The Hessian may also be recast as Hϕ = ∇′∇ϕ. As will be illustrated in later examples and exercises, a useful result for the product of two one-dimensional diffusions Z_1 and Z_2, say, may be obtained from the higher dimensional version of Itô's lemma as follows.

Proposition 8.6 Suppose that Z = (Z_1, Z_2) is given by the stochastic differential equation

dZ(t) = μ(Z(t)) dt + σ(Z(t)) dB_t,   Z(0) = (z_1, z_2),

for Lipschitzian μ, σ. Then, letting Z̃ = (Z_2, Z_1), D = σσ′ = ((d_{r,r′}))_{1≤r,r′≤2},

Z_1(t)Z_2(t) = z_1z_2 + ∫_0^t {μ(Z(s))·Z̃(s) + d_12(Z(s))} ds + ∫_0^t {Z_2(s)σ_11(Z(s)) + Z_1(s)σ_21(Z(s))} dB_s^(1) + ∫_0^t {Z_2(s)σ_12(Z(s)) + Z_1(s)σ_22(Z(s))} dB_s^(2),   (8.39)

where d_12(Z(s)) = σ_11(Z(s))σ_21(Z(s)) + σ_12(Z(s))σ_22(Z(s)).

Proof Apply Itô's lemma to ϕ(Z_1(t), Z_2(t)) = Z_1(t)Z_2(t), noting that Hϕ(z_1, z_2) = [[0, 1], [1, 0]] and ∇ϕ(z_1, z_2) = (z_2, z_1). Along the way, also make use of the fact that d_12 = σ_11σ_21 + σ_12σ_22 = d_21, so that DHϕ(z_1, z_2) = [[d_12, d_11], [d_22, d_12]]. In particular, (1/2)Trace(DHϕ)(z) = d_12 and μ·∇ϕ(z) = μ(z)·z̃ for z = (z_1, z_2).

Remark 8.2 (Integration by Parts) Proposition 8.6 may be viewed as another version of "integration by parts". In particular, in the notation introduced there, with

dZ_i(t) = μ^(i)(Z(t)) dt + σ_i(Z(t))·dB_t,   i = 1, 2,

as a mnemonic device one may formally express the result as

∫_0^t Z_1(s) dZ_2(s) = Z_1(s)Z_2(s)|_0^t − ∫_0^t Z_2(s) dZ_1(s) − ∫_0^t d_12(Z(s)) ds,

where d_12(Z) = σ_1(Z)·σ_2(Z).
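For two independent standard Brownian motions Z_1 = B^(1), Z_2 = B^(2) one has σ_1·σ_2 = 0, so d_12 vanishes and the mnemonic above reduces to ∫_0^t Z_1 dZ_2 + ∫_0^t Z_2 dZ_1 = Z_1(t)Z_2(t). The following throwaway sketch (horizon and step count arbitrary) checks this with left-endpoint Itô sums.

```python
import numpy as np

rng = np.random.default_rng(3)
t, n = 1.0, 200_000
dt = t / n
dB1 = rng.normal(0.0, np.sqrt(dt), size=n)
dB2 = rng.normal(0.0, np.sqrt(dt), size=n)
B1 = np.concatenate(([0.0], np.cumsum(dB1)))
B2 = np.concatenate(([0.0], np.cumsum(dB2)))

# Left-endpoint (Ito) sums for the two stochastic integrals.
lhs = np.sum(B1[:-1] * dB2) + np.sum(B2[:-1] * dB1)
print(lhs, B1[-1] * B2[-1])   # agree up to discretization error; d12 = 0 here
```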
Example 3 (Langevin Equation and Ornstein–Uhlenbeck Process) Recall the Langevin equation defining the Ornstein–Uhlenbeck process,

dX(t) = −γX(t) dt + σ dB_t,   X_0 = x_0,   (8.40)

which can be solved by Picard iteration (see Chapter 7, Example 1). Alternatively, by Itô's lemma, taking ϕ(t, x) = e^{γt}x, one may check that

d(e^{γt}X(t)) = σe^{γt} dB_t.

Thus,

X(t) = x_0 e^{−γt} + σ ∫_0^t e^{−γ(t−s)} dB_s.

Applying integration by parts (Proposition 6.6) to the stochastic integral, one obtains the previously derived formula

X(t) = x_0 e^{−γt} − γσ ∫_0^t e^{−γ(t−s)} B_s ds + σB_t.

Example 4 (Mean-Reverting Ornstein–Uhlenbeck) Consider the process defined by

dX(t) = (μ − γX(t)) dt + σ dB(t),   X(0) = x,

where γ > 0. Applying Itô's lemma to ϕ(t, X(t)) = e^{γt}X(t), ϕ(t, y) = e^{γt}y, one obtains the solution

X(t) = μ/γ + (x − μ/γ) e^{−γt} + σ ∫_0^t e^{−γ(t−s)} dB(s),   t ≥ 0.

Example 5 (Geometric Brownian Motion) Consider the stochastic differential equation

dX(t) = μX(t) dt + σX(t) dB_t,   X(0) = x_0 > 0,   (8.41)

arising, for example, in mathematical finance as a model of the "yield" dX(t)/X(t) on a stock of price X(t) at time t. One may not a priori conclude from the defining equation (8.41) that τ_0 = inf{t ≥ 0 : X(t) = 0} = +∞ a.s. However, notice that on the event [τ_0 > t], Itô's lemma provides that d log X(t) = (1/X(t)) dX(t) − {(1/2)σ²X²(t)/X²(t)} dt = μ dt − (1/2)σ² dt + σ dB_t. Thus,

log X(t) = log x_0 + (μ − σ²/2)t + σB_t,   0 ≤ t < τ_0.

Now observe that the geometric Brownian motion defined by

X(t) ≡ ϕ(t, B_t) = x_0 exp{σB_t + μt − σ²t/2}

solves (8.41) for all times t ≥ 0. In particular, it follows that τ_0 = ∞ with probability one.

Example 6 (Affine Linear Coefficients) Consider the stochastic differential equation

dX(t) = (a_1 + b_1X(t)) dt + (a_2 + b_2X(t)) dB_t,   t > 0,   X(0) = x_0 > 0.

Writing

dX(t) = (b_1X(t) dt + b_2X(t) dB_t) + (a_1 dt + a_2 dB_t),

the solution is verified by the integration by parts formula applied to a method analogous to the integrating factor or, more generally, the "variation of parameters" method of ordinary differential equations; see Exercise 10. Let Z_1 denote the geometric Brownian motion starting at one defined by

dZ_1(t) = b_1Z_1(t) dt + b_2Z_1(t) dB_t,   t > 0,   Z_1(0) = 1.

Also define

Z_2(t) = x_0 + ∫_0^t (a_1 − a_2b_2) Z_1^{−1}(s) ds + ∫_0^t a_2 Z_1^{−1}(s) dB_s,   t ≥ 0,

where

Z_1^{−1}(t) = exp{−b_2B_t − (b_1 − b_2²/2)t},   t ≥ 0.

Now, using the integration by parts formula for the product given in Proposition 8.6, it follows that X(t) = Z_1(t)Z_2(t). In particular,

X(t) = e^{b_2B_t + (b_1 − b_2²/2)t} { x_0 + ∫_0^t (a_1 − a_2b_2) e^{−b_2B_s − (b_1 − b_2²/2)s} ds + ∫_0^t a_2 e^{−b_2B_s − (b_1 − b_2²/2)s} dB_s }.   (8.42)

Example 7 (Logistic Population Growth Model) Consider the stochastic differential equation

dX(t) = rX(t)(k − X(t)) dt + σX(t) dB(t),   X(0) = x > 0,

where r, k are nonnegative parameters describing the growth rate and carrying capacity, respectively. Observe that, starting from x > 0, the process cannot reach zero; this follows from results in Section 8.1, especially (8.55). This is an example of an inaccessible boundary, to be more fully discussed in Chapter 21. Consider Y(t) = 1/X(t), t ≥ 0. From Itô's lemma, one obtains

dY(t) = r dt + (σ² − rk)Y(t) dt − σY(t) dB(t),   Y(0) = x^{−1}.

In particular, therefore, in the notation of the previous Example 6, Y is a process with affine linear coefficients a_1 = r, b_1 = σ² − rk and a_2 = 0, b_2 = σ. The solution follows from (8.42). Specifically, one has

X(t) = 1/Y(t) = exp{σB_t + (kr − σ²/2)t} / ( x^{−1} + r ∫_0^t exp{σB_s + (kr − σ²/2)s} ds ),   t ≥ 0.   (8.43)
Remark 8.3 Throughout this and subsequent chapters, unless otherwise specified, the stopping times are .{Ft : t ≥ 0}–stopping times where .{Ft : t ≥ 0} is a P – complete filtration for which (6.9) holds (for multidimensional .B when appropriate).
8.1 Asymptotic Properties of One-Dimensional Diffusions: Transience and Recurrence

While the focus of this subsection is on a complete characterization of the transience/recurrence dichotomy for one-dimensional diffusions, the following lemma obviously applies more generally. To set some notation, denote the open ball of radius R centered at x_0 and its boundary, respectively, by

B(x_0 : R) := {x : |x − x_0| < R},   ∂B(x_0 : R) := {x : |x − x_0| = R}.   (8.44)

Also define the escape time from B(x_0 : R) by a process X^x = {X^x_t : t ≥ 0} by

τ ≡ τ^(x)_{∂B} := inf{t ≥ 0 : |X^x_t − x_0| = R}.   (8.45)

Lemma 1 Let μ(x), σ(x) be locally Lipschitzian on R^k. Write σ_i(x) for the i-th row of the matrix σ(x). Assume that, for some i, d_{ii}(x) = |σ_i(x)|² > 0 for all x ∈ R^k. Then, for every x ∈ B(x_0 : R), Eτ^(x)_{∂B} < ∞.

Proof Without loss of generality, let i = 1. Let ϕ be a twice continuously differentiable function with compact support that equals −exp{cx^(1)} on B̄(x_0 : R) := {x : |x − x_0| ≤ R}, for c > 0 (x = (x^(1), ..., x^(k))). Then Aϕ(x) = (−(1/2)c²d_{11}(x) − cμ^(1)(x)) exp{cx^(1)}, so that for c large enough there exists δ > 0 satisfying Aϕ(x) ≤ −δ < 0 for x ∈ B̄(x_0 : R). By Itô's lemma (Corollary 8.5), ϕ(X^x_t) − ∫_0^t Aϕ(X^x_s) ds (t ≥ 0) is a martingale whose expectation is ϕ(x) for all t ≥ 0, if X^x_0 ≡ x. Clearly, τ(ω) ∧ t ≤ t < ∞ for all ω ∈ Ω, and |X^x_{τ∧t}| is uniformly bounded and therefore uniformly integrable for all t ≥ 0. Thus, by the optional stopping theorem (see BCPT, p. 61; throughout, BCPT refers to Bhattacharya and Waymire (2016), A Basic Course in Probability Theory), for X^x_0 ≡ x ∈ B(x_0 : R) = {z : |z − x_0| < R}, one has

E(ϕ(X^x_{τ∧t})) − E ∫_0^{τ∧t} Aϕ(X^x_s) ds = ϕ(x) = −exp{cx^(1)}.

But for s ≤ τ ∧ t, X^x_s ∈ B̄(x_0 : R), and the integral above is no more than −δ(τ ∧ t), so that −E ∫_0^{τ∧t} Aϕ(X^x_s) ds ≥ δE(τ ∧ t). Hence,

δE(τ ∧ t) ≤ ϕ(x) − Eϕ(X^x_{τ∧t}) ≤ g(x) := ϕ(x) − min{ϕ(y) : |y − x_0| ≤ R} < ∞.

Now, let t ↑ ∞ to get Eτ ≤ g(x)/δ.
Let X^x = {X^x_t : t ≥ 0}, x ∈ R, be the diffusion on R with locally Lipschitzian coefficients μ(·), σ(·), with σ²(x) > 0 for all x. Let τ^x_a := τ_a ∘ X^x be the first passage time of X^x to {a}, and τ^x_{a,b} := τ^x_a ∧ τ^x_b. For a given pair c < d, define

ψ(x; c, d) := P(X^x reaches c before d),   ϕ(x; c, d) := P(X^x reaches d before c).   (8.46)

By Lemma 1, P(τ^x_{c,d} < ∞) ≡ P(X^x reaches {c, d} in finite time) = 1, i.e., ϕ(x; c, d) = 1 − ψ(x; c, d).

Proposition 8.7 Under the above hypothesis on μ(·), σ(·),

a. ϕ(x; c, d) is the unique continuous solution to the equation

Aϕ(x) = 0 for c < x < d,   ϕ(c) = 0,  ϕ(d) = 1,   (8.47)

where

Aϕ(x) := (1/2)σ²(x)ϕ″(x) + μ(x)ϕ′(x).   (8.48)

b. This solution is given by

ϕ(x; c, d) = ∫_c^x exp{−I(c, y)} dy / ∫_c^d exp{−I(c, y)} dy   (c < x < d),   I(c, y) := ∫_c^y [2μ(z)/σ²(z)] dz.   (8.49)

Proof That the function ϕ in (8.49) satisfies (8.47) is easy to check. Also, one may easily extend ϕ to a twice continuously differentiable function on R with compact support (Exercise 4). Now, apply Itô's lemma (Corollary 8.5) to this extended function to get, using the optional sampling theorem,

Eϕ(X^x_{τ^x_{c,d} ∧ t}) − ϕ(x) = E ∫_0^{τ^x_{c,d} ∧ t} 0 ds = 0.   (8.50)

Letting t ↑ ∞, one obtains

ϕ(x) = Eϕ(X^x_{τ^x_{c,d}}) = P(X^x reaches d before c).

The uniqueness of the solution is easily shown, by directly solving (8.47) by integration (Exercise 5).

It is convenient to introduce notation for the (possibly infinite) time of the last visit to a by X^x as follows (with the convention that the supremum of the empty set is zero):

η^x_a := sup{t : X^x_t = a},   x, a ∈ R.   (8.51)

Definition 8.2 A diffusion {X^x : x ∈ R} is said to be (point) recurrent if ρ_{xa} := P(τ^x_a < ∞) ≡ P(X^x reaches a) = 1 for all x, a ∈ R. It is said to be transient if P(η^x_a = ∞) = 0 for all x, a.

Remark 8.4 It may be noted (Exercise 8) that {X^x : x ∈ R} is recurrent if and only if

P(η^x_a = ∞) = 1   for all x, a ∈ R,   (8.52)

recalling that the supremum of an empty set is taken to be zero in the definition of η^x_a.
Theorem 8.8 Assume μ(·), σ(·) are locally Lipschitzian and that σ(·) does not vanish anywhere. Then (a) the diffusion is either recurrent, or it is transient. (b) The diffusion is recurrent if and only if

∫_{−∞}^0 exp{I(y, 0)} dy = ∞   and   ∫_0^∞ exp{−I(0, y)} dy = ∞.   (8.53)
Proof (b) Let d > x. Then, by (8.49),

ρ_{xd} ≡ P(τ^x_d < ∞) = lim_{c↓−∞} ϕ(x; c, d) = lim_{c↓−∞} ∫_c^x exp{−I(c, y)} dy / ∫_c^d exp{−I(c, y)} dy.

Since exp{−I(c, y)} = exp{−I(c, 0)}·exp{−I(0, y)} (with the convention that ∫_a^b = −∫_b^a if b < a), one may cancel the common factor exp{−I(c, 0)} from the numerator and the denominator to get

ρ_{xd} = lim_{c↓−∞} ∫_c^x exp{−I(0, y)} dy / ∫_c^d exp{−I(0, y)} dy = lim_{c↓−∞} [ ∫_c^0 exp{−I(0, y)} dy + ∫_0^x exp{−I(0, y)} dy ] / [ ∫_c^0 exp{−I(0, y)} dy + ∫_0^d exp{−I(0, y)} dy ].

The last limit is 1 if and only if lim_{c↓−∞} ∫_c^0 exp{−I(0, y)} dy = ∞, which is the first relation in (8.53). Interpreting ∞/∞ = 1, one then has the general relation

ρ_{xd} = ∫_{−∞}^x exp{−I(0, y)} dy / ∫_{−∞}^d exp{−I(0, y)} dy   (x < d).   (8.54)

Similarly, if c < x, then letting d ↑ ∞ in ψ(x; c, d), one shows that

ρ_{xc} ≡ P(τ^x_c < ∞) = ∫_x^∞ exp{−I(0, y)} dy / ∫_c^∞ exp{−I(0, y)} dy   (c < x),   (8.55)

and it is 1 if and only if the second relation in (8.53) holds.

(a) Suppose that the process is not recurrent. Then there exist c < d such that ρ_{cd} < 1 or ρ_{dc} < 1. We will first show that, for all x ∈ (c, d), there is a last visit to c or d with probability one, i.e.,

P(η^x_c < ∞ or η^x_d < ∞) = 1.   (8.56)

For this, define the stopping times

τ_1 := inf{t ≥ 0 : X^x_t = d},   τ_2 := inf{t > τ_1 : X^x_t = c},
τ_{2r+1} := inf{t > τ_{2r} : X^x_t = d},   τ_{2r+2} := inf{t > τ_{2r+1} : X^x_t = c}   (r ≥ 1).

Then by the strong Markov property, writing θ_r := P(τ_r < ∞), one has

θ_1 = P(τ_1 < ∞),   θ_{2r} = θ_1(ρ_{dc}ρ_{cd})^{r−1}ρ_{dc},   θ_{2r+1} = θ_1(ρ_{dc}ρ_{cd})^r,   r ≥ 1.

Since ρ_{dc}ρ_{cd} < 1, Σ_{r=1}^∞ θ_r < ∞. Therefore, (8.56) follows from the Borel–Cantelli lemma. Let us now use this to show that the process must be transient. Suppose it is not transient. Then δ := P(η^x_a = ∞) > 0 for some x, a. Define the stopping times

γ_1 := inf{t ≥ 0 : X^x_t = a},   γ_2 := inf{t > γ_1 + 1 : X^x_t = a},   γ_{r+1} := inf{t > γ_r + 1 : X^x_t = a}   (r ≥ 1).

Note that

P(γ_r < ∞) ≥ δ for every r ≥ 1,

which implies

P(γ_r < ∞ and X^x reaches both c and d after γ_r) ≥ δρ_{ac}ρ_{cd}ρ_{dc} > 0 for every r ≥ 1.

Since, by construction, γ_{r+1} − γ_r ≥ 1, it follows that

P(sup{t : X^x_t = c} > r and sup{t : X^x_t = d} > r) ≥ δρ_{ac}ρ_{cd}ρ_{dc} for all r ≥ 1,

so that

P(η^x_c = ∞ and η^x_d = ∞) ≥ δρ_{ac}ρ_{cd}ρ_{dc} > 0.

This contradicts (8.56).

Remark 8.5 Observe that in one dimension, with locally Lipschitz coefficients, the condition σ²(x) > 0 for all x ∈ R implies that

P(X^x will eventually reach a) > 0   for all x, a ∈ R,

since (8.54) and (8.55) are always positive. Such a process is said to be irreducible.

Example 8 (Recurrence/Transience of One-Dimensional Brownian Motion) Consider X^x_t = x + μt + σB_t, t ≥ 0. Clearly, the criterion provides that the process is recurrent if and only if μ = 0. Moreover, in this case, P_x(τ_d < τ_c) = (x − c)/(d − c), c ≤ x ≤ d. More generally, see Exercise 9. For μ < 0, say, one has for c ≤ x ≤ d,

P_x(τ_d < τ_c) = (e^{2|μ|c/σ²} − e^{2|μ|x/σ²}) / (e^{2|μ|c/σ²} − e^{2|μ|d/σ²}).   (8.57)

In particular, letting c ↓ −∞, one has

P_x(τ_d < ∞) = e^{−(2|μ|/σ²)(d − x)}.   (8.58)

Letting M^x := sup_{t≥0} X^x_t, it follows that M^x − x is exponentially distributed with parameter λ = 2|μ|/σ².
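Formula (8.58) is easy to test by simulation. The sketch below is only illustrative (drift, level, step size, horizon, and sample size are arbitrary choices); it estimates P_x(τ_d < ∞) for Brownian motion with negative drift by checking whether each discretized path reaches d before a fixed large horizon.

```python
import numpy as np

rng = np.random.default_rng(5)
mu, sigma, x, d = -1.0, 1.0, 0.0, 1.0
dt, horizon, n_paths = 1e-3, 20.0, 5000
n_steps = int(horizon / dt)

y = np.full(n_paths, x)
hit = np.zeros(n_paths, dtype=bool)
for _ in range(n_steps):
    y = y + mu * dt + sigma * np.sqrt(dt) * rng.standard_normal(n_paths)
    hit |= (y >= d)        # record paths that have reached the level d

print("simulated:     ", hit.mean())
print("formula (8.58):", np.exp(-2.0 * abs(mu) * (d - x) / sigma**2))
```

With negative drift, paths that ever reach d do so early, so truncating at a moderate horizon introduces only a small downward bias.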
Exercises

1. (Lévy martingale characterization of Brownian motion) Show that if X = {X_t : t ≥ 0} is a continuous martingale starting at x = 0 with quadratic variation t, then X is standard Brownian motion. [Hint: Use Itô's lemma to check that for each λ ∈ R, {e^{λX_t − λ²t/2} : t ≥ 0} is a martingale, and deduce from the constant expected value property and the moment generating function that X_t − X_s is Gaussian with mean zero and variance t − s. Extend this calculation to increments X_{t_0}, X_{t_1} − X_{t_0}, ..., X_{t_m} − X_{t_{m−1}}, 0 < t_0 < t_1 < ··· < t_m, m ≥ 1, by iteratively conditioning.]
2. (Geometric Brownian Motion) In Example 5, show that starting from x > 0, (a) {X_t : t ≥ 0} reaches every state y > 0 with positive probability, whatever be μ and σ² > 0, and (b) it reaches every state with probability one if μ ≤ (1/2)σ². (c) Compute the time-asymptotic behavior of X_t in each of the cases (i) μ > (1/2)σ², (ii) μ < (1/2)σ², and (iii) μ = (1/2)σ². [Hint: Express X explicitly in terms of B.]
3. Give a proof of the integration by parts formula Proposition 6.6 using Itô's lemma.
4. Let ϕ : [c, d] → R be twice continuously differentiable on [c, d], with ϕ′(c), ϕ″(c) denoting right-hand derivatives and ϕ′(d), ϕ″(d) denoting left-hand derivatives. Show that ϕ has a twice continuously differentiable extension with compact support to all of R.
5. Prove uniqueness of the solution to (8.47). [Hint: Give an argument from which the solution is explicitly computed.]
6. (Ornstein–Uhlenbeck) Consider the Ornstein–Uhlenbeck process X defined by Example 3. Use Itô's lemma to compute (a) the mean and variance of X_t, and (b) the characteristic function Ee^{irX_t} = E cos(rX_t) + iE sin(rX_t), r ∈ R. (c) Show that lim sup_{t→∞} X_t = +∞ and lim inf_{t→∞} X_t = −∞, regardless of how small the parameter σ ≠ 0 might be.
7. (Multidimensional Ornstein–Uhlenbeck) Let Γ be a k × k real matrix with rows γ_i, i = 1, ..., k, and σ > 0. Let B denote k-dimensional standard Brownian motion and let Y denote the process defined by dY^(r)(t) = γ_r·Y(t) dt + σ dB_t^(r), t > 0, Y^(r)(0) = y^(r), r = 1, ..., k. Equivalently, dY(t) = ΓY(t) dt + σ dB_t, t > 0, Y(0) = y. Defining e^{Γt} = Σ_{n=0}^∞ Γ^n t^n/n!, note that Γ commutes with e^{Γt} and (d/dt)e^{Γt} = Γe^{Γt}. Establish the representation (component-wise) Y(t) = e^{Γt}y + σB_t + σ ∫_0^t Γe^{Γ(t−s)}B_s ds, t ≥ 0. [Hint: Denote the ij-entry of e^{−Γt} by ε_{ij}(t), and check that (d/dt)ε_{ij}(t) = −Σ_{m=1}^k γ_{im}ε_{mj}(t). Define ϕ_r(t, y) = Σ_{j=1}^k ε_{rj}(t)y^(j) and use Itô's lemma to show dϕ_r(t, Y(t)) = σ Σ_{j=1}^k ε_{rj}(t) dB_t^(j).]
8. Show that the condition (8.52) is necessary and sufficient for (point) recurrence as defined in Definition 8.2.
9. Show that a one-dimensional diffusion with zero drift and strictly positive Lipschitz diffusion coefficient is always a recurrent process.
10. (Variation of Parameters) Consider the ordinary differential equation x′(t) = a(t)x(t) + b(t), t > 0, x(0) = x_0. Assume ψ(t) = e^{∫_0^t a(s)ds} is a well-defined, positive solution to ψ′(t) = a(t)ψ(t), t ≥ 0. Show that x(t) = ψ(t)(x_0 + ∫_0^t b(s)ψ^{−1}(s) ds), t ≥ 0, is a solution to the given problem.
Chapter 9
Cameron–Martin–Girsanov Theorem
The topic of this chapter is a useful change-of-measure formula based on the mutual absolute continuity, together with a relatively simple formula for the Radon–Nikodym derivative, of the distributions of a pair of diffusions over a finite time interval that have the same diffusion coefficient. Two derivations are provided of this important technique, each with its own merits.
A change-of-measure formula under a drift change for nonsingular diffusions is one of the cornerstones of the theory of stochastic differential equations. To have some feel for the theory, consider two Brownian motions, say standard Brownian motion B = {B_t : t ≥ 0} starting at zero and Brownian motion X = {X_t ≡ B_t + μt : t ≥ 0} starting at zero having nonzero drift μ. The distributions of B and X are mutually singular probabilities on the path space C[0, ∞), e.g., by the strong law of large numbers (see Exercise 11). However, restricted to a finite time horizon 0 ≤ t ≤ T < ∞, the distributions of B_{[0,T]} := {B_t : 0 ≤ t ≤ T} and X_{[0,T]} := {X_t : 0 ≤ t ≤ T} will be seen to be mutually absolutely continuous on C[0, T]. This and much more, including a formula for the Radon–Nikodym derivative, make up the Cameron–Martin–Girsanov theory for diffusions. In general, the Radon–Nikodym derivative provides a useful transformation between probability distributions of diffusions with different drifts but common nonsingular diffusion coefficients. The change-of-measure formula was first derived by Cameron and Martin (1944), (1945), for the computation of distributions of functionals of Brownian motion under translation by nonrandom as well as random functions. The present generalization to all nonsingular diffusions is due to Girsanov (1960). We provide two different approaches toward its derivation, each with its own merits. Basic to both approaches is the following result on exponential martingales.
As usual, let .(Ω, F, P ) be a probability space on which is defined a kdimensional standard Brownian motion .B, with respect to a P -complete filtration .{Ft : t ≥ 0}. Proposition 9.1 Let .ft , .α ≤ t ≤ T , be a nonanticipative bounded functional with values in .Rk . Then t 1 t |fs |2 ds}, α ≤ t ≤ T , (9.1) .Mt := exp{ fs · dBs − 2 α α is a .{Ft : t ≥ 0}-martingale. In particular, .EMt = EMα = 1, .α ≤ t ≤ T . Proof First assume .f is a bounded nonanticipative step functional: .ft = gi for .ti ≤ t < ti+1 .(α = t0 < t1 < · · · < tm = T ), .0 ≤ i ≤ m − 1, .fT = gm , where .gi is .Fti -measurable and bounded .(0 ≤ i ≤ m). If .t ∈ [tj , tj +1 ), and .s ∈ [ti , ti+1 ), .i < j , then .Fs ⊆ Ftj so that E[Mt | Fs ] = E[E(Mt | Ftj ) | Fs ]
.
1 = E[Mtj E(exp{gj · (Bt − Btj ) − (t − tj )|gj |2 } | Ftj )|Fs ] 2 = E[Mtj | Fs ], (9.2) since the conditional distribution of .gj · (Bt − Btj ), given .Ftj , is normal with mean 2
zero and variance .(t − tj )|gj |2 , and .E exp{Y − σ2 } = 1 for every normal random variable Y with mean zero and variance .σ 2 (i.e., .θ → exp{θ 2 σ 2 /2} is the moment generating function of Y ). If .i < j − 1, then .Fs ⊆ Ftj −1 and taking conditional expectation of .Mtj given .Ftj −1 , as in (9.2), one gets E[Mt | Fs ] = E[Mtj | Fs ] = E[Mtj −1 | Fs ].
.
Proceeding in this manner, one finally arrives at E[Mt | Fs ] = E[Mti+1 | Fs ]
.
1 = Ms E(exp{gi · (Bti+1 − Bs ) − (ti+1 − s)|gi |2 } | Fs ) 2 = Ms . If .s, t ∈ [tj , tj +1 ), .s < t, then .Mt = Ms exp{gj (Bt − Bs ) − 12 (t − s)|gj |2 }, and .E[Mt | Fs ] = Ms · 1 = Ms , proving the martingale property. Now let .f be an arbitrary bounded nonanticipative step functional on .[α, T ]. There exists a uniformly bounded (by a constant A, say) sequence of nonanticipative t (n) t (n) (n) step functionals .f(n) such that .Yt := α fs · dBs − 12 α |fs |2 ds converges t t in probability to . α fs · dBs − 12 α |fs |2 ds = Yt , as shown in the proof of Proposition 6.3. Then, writing
9 Cameron–Martin–Girsanov Theorem (n)
Mt
.
161 (n)
:= exp{Yt },
Mt := exp{Yt },
one has (n) 2 .E(Mt )
t
= E exp{2 α
f(n) s
t
· dBs − 2 α
2 |f(n) s | ds} · exp{
≤ 1 · exp{(t − α)A2 } < ∞,
α
t
2 |f(n) s | ds}
(α ≤ t ≤ T ).
Hence, .{Mt(n) : n ≥ 1} is uniformly integrable, and it converges to .Mt in .L1 (as (n) (n) .n → ∞), .α ≤ t ≤ T . Therefore, taking limits in the identity .Ms = E[Mt | Fs ] .(α ≤ s < t ≤ T ), one obtains the desired result: .Ms = E[Mt | Fs ]. Corollary 9.2 Let .f = {ft : α ≤ t ≤ T } ∈ L[α, T ], i.e., .f is nonanticipative on T t t [α, T ] and . α |fs |2 ds < ∞. Then .Mt := exp{ α fs · dBs − 12 α |fs |2 ds}, .0 ≤ t ≤ T , is a supermartingale, i.e., .E[Mt | Fs ] ≤ Ms . In particular, .EMt ≤ 1, .(α ≤ t ≤ T ).
.
Proof One may find a sequence of bounded nonanticipative functionals .f(n) = (n) {ft : α ≤ t ≤ T } that converge in probability and in .L2 (dt × dP ) to .f t (n) t (n) (n) as .n → ∞. Then .Ms,t := exp{ s fu · dBu − 12 s |fu |2 du} → Ms,t := t t exp{ s fu · dBu − 12 s |fu |2 du} in probability. Hence, by Fatou’s lemma, writing (n) (n) .Mt := Mα,t , .Mt := Mα,t , one has for all .A ∈ Fs .(α ≤ s < t ≤ T ), (n)
E(Mt 1A ) ≡ E(Ms 1A Ms,t ) = E( lim Ms 1A Ms,t )
.
n→∞
(n) (n) ≤ lim E(Ms 1A Ms,t ) = lim E[E(Ms 1A Ms,t | Fs )] n→∞
=
n→∞
(n) lim E[Ms 1A E(Ms,t n→∞
| Fs )] = E(Ms 1A ),
(n) since .E(Ms,t | Fs ) = 1 by Proposition 9.1.
Corollary 9.3 Suppose .f = (f (1) , . . . , f (k) ) is a nonanticipative functional on .[0, T ] such that E exp{θ
.
T
|f(s)|2 ds} < ∞
(9.3)
0
for some .θ > 1. Then .Mt , .0 ≤ t ≤ T , defined by (9.1) is a martingale. (j )
Proof Define .fn (t) := (f (j ) (t) ∧ n) ∨ (−n) .(1 ≤ j ≤ k). Then .fn = (1) (k) (n) nonanticipative functional on .[0, T ]. Hence .Mt := (fn , . . . , fn ) is a bounded t 1 t 2 exp{ 0 fn (s)·dBs − 2 0 |fn (s)| ds}, .0 ≤ t ≤ T , is a martingale, by Proposition 9.1. Since .Mt(n) → Mt a.s. as .n → ∞, it is enough to prove that .Mt(n) , .n ≥ 1, is uniformly integrable. For .θ1 > 1, one has
162
9 Cameron–Martin–Girsanov Theorem
(n) θ .E(Mt ) 1
t
= E exp{θ1
θ1 fn (s) · dBs − 2
0
t
|fn (s)|2 ds}
0
θ13 t fn (s) · dBs − |fn (s)|2 ds} = E[exp{θ1 2 0 0 (θ 3 − θ1 ) t · exp{ 1 |fn (s)|2 ds}]. 2 0 t
By Hölder’s inequality, the last expectation is no greater than
1 θ14 t . fn (s) · dBs − |fn (s)|2 ds}) θ1 2 0 0 t 3 θ − θ1 θ1 1− 1 )( 1 ) ·(E exp{( |fn (s)|2 ds}) θ1 θ1 − 1 2 0 t (θ1 + 1) 1− 1 = 1 · (E exp{θ12 |fn (s)|2 ds}) θ1 . 2 0 (E exp{θ12
t
Choose .θ1 close enough to 1 such that .
θ12 (θ1 +1) 2
E(Mt(n) )θ1 ≤ (E exp{θ
t
.
≤ θ . Then |fn (s)|2 ds})
1− θ1
1
0 t
≤ (E exp{θ
|f(s)|2 ds})
1− θ1
1
.
0 (n)
(n)
Hence, by (9.3), .sup{E(Mt )θ1 : n ≥ 1} < ∞, so that .Mt (n) and .Mt = lim Mt is a martingale.
is uniformly integrable
Remark 9.1 It has been proved by Novikov (1972) that if 1 .E exp{ 2
T
|f(s)|2 ds} < ∞,
(9.4)
0
i.e. .θ = 1/2 in (9.3), then .Mt , .0 ≤ t ≤ T , is a martingale. For our first approach to the main result (CMG Theorem), we will need Paul Lévy’s important characterization of Brownian motion as a continuous martingale with the appropriate quadratic variation. Theorem 9.4 (Lévy’s Martingale Characterization of Brownian Motion) Let Zt = (Zt(1) , . . . , Zt(k) ) .t ≥ 0, be an .Rk -valued stochastic process adapted to a filtration .{Ft : t ≥ 0} of .F, defined on a probability space .(Ω, F, P ). Then .Zt −Z0 , is a k-dimensional standard Brownian motion if and only if
.
9 Cameron–Martin–Girsanov Theorem
163
i. .t → Zt is continuous on .[0, ∞), a.s., (i) (i) ii. .Zt − Z0 , .t ≥ 0, is a .{Ft : t ≥ 0}–martingale for each .i = 1, 2, . . . , k, and (j ) (j ) (i) iii. .(Zt − Z0(i) )(Zt − Z0 ) − δij t, .t ≥ 0, is a .{Ft : t ≥ 0}-martingale for each pair .(i, j ), .1 ≤ i ≤ j ≤ k. Proof The “only if” part is obvious. To prove the “if” part, we need to show that for arbitrary m and .0 = t0 < t1 < · · · < tm , .Ztj − Ztj −1 , .1 ≤ j ≤ m, are independent Gaussian random vectors with zero mean vectors and dispersion matrices .(tj − tj −1 )I , .1 ≤ j ≤ m, where I is the .k × k identity matrix. For this, it is enough to prove, for all such m and .t0 , t1 , . . . . , tm , .i ≤ j ≤ m, E exp{i
m
.
1 (tj − tj −1 )|ξ j |2 }, ξ j ∈ Rk . 2 m
ξ j · (Ztj − Ztj −1 )} = exp{−
j =1
(9.5)
j =1
Without loss of generality, assume .ξ (j ) = 0 .∀ j (else, just delete those .Ztj − Ztj −1 terms for which .ξ (j ) = 0). Write .
tj −1,r,n = tj −1 + r2−n (tj − tj −1 )
(r = 0, 1, . . . , 2n ),
Zj,r,n = ξ j · (Ztj −1,r,n − Ztj −1,r−1,n )
(r = 1, . . . , 2n ),
m
n
2 m
ξ j · (Ztj − Ztj −1 ) =
j =1
Zj,r,n .
(9.6)
j =1 r=1
Then, for each n, .Zj,r,n .(1 ≤ j ≤ m, .1 ≤ r ≤ 2n ) are martingale differences when the indices .(j, r) are ordered by the values of .tj −1,r,n ≡ tj −1 + r2−n (tj − tj −1 ). One has, by (ii) and (iii), .
E(Zj,r,n | Ftj −1,r,n−1 ) = 0, 2 2 := E(Zj,r,n | Ftj −1,r,n−1 ) = 2−n (tj − tj −1 )|ξ j |2 , σj,r,n n
sn2
:=
2 m
2 σj,r,n =
j =1 r=1
m
(tj − tj −1 )|ξ j |2 ,
j =1 2n
Ln (ε) :=
m
2 E(Zj,r,n 1[|Zj,r,n |>ε] | Ftj −1,r,n−1 ), (ε > 0).
(9.7)
j =1 r=1
By the martingale central limit theorem,1 if we prove that .Ln (ε) → 0 in probability as .n → ∞, then (9.5) will follow from (9.7). For this, first note that .Ln (ε)
1 See
Bhattacharya and Waymire (2022), Theorem 15.1.
164
9 Cameron–Martin–Girsanov Theorem
2 is uniformly bounded (for all n and .ε) by . m j =1 (tj − tj −1 )|ξ j | . Also, by the assumption (i) of continuity, .δn := max{|Zj,r,n | : 1 ≤ j ≤ m, .1 ≤ r ≤ 2n } → 0 a.s. as .n → ∞. Therefore, .ELn (ε) → 0 as .n → ∞, for every .ε > 0. Remark 9.2 The assumptions (ii), (iii) of Theorem 9.4 are satisfied by every square integrable process .Xt , .t ≥ 0, with (homogeneous) independent mean-zero increments (if the process is scaled so as to have variance one for .Xt+1 − Xt ), for the special case .k = 1. For example, if .Nt , .t ≥ 0, is a Poisson process with intensity parameter (mean per unit time) .λ = 1, then .Xt := Nt − t satisfies (ii), (iii) with .k = 1. Thus, the continuity assumption (i) is crucial for the martingale characterization of Brownian motion. t For the case of a stochastic integral .Zt := 0 f(u) · dBu , a more direct proof of Lévy’s characterization is sketched in Exercise 5. As an important consequence of Lévy’s characterization of Brownian motion, we get the following result that says that every stochastic integral is a standard Brownian motion when time is measured by quadratic variation. In other words, a large class of martingales (adapted to Brownian motion) with continuous paths can be expressed as time-changed Brownian motions.2 Corollary9.5 (Time Change of Stochastic Integrals) Let .f ∈ L[0, ∞) be such t that .t → 0 |f(s)|2 ds is strictly increasing and goes to .∞ (a.s.) as .t → ∞. Define s the stopping times .τt := inf{s ≥ 0 : 0 |f(u)|2 du = t}, .t ≥ 0. Then .Yt := If (τt ), s .t ≥ 0, is a standard one-dimensional Brownian motion, where .If (s) := 0 f(u)·dBu .(s ≥ 0). Proof Since the inverse of a strictly increasing continuous function on .[0, ∞) into [0, ∞) is continuous, .t → τt is continuous (a.s.) and so is .t → Yt . Now, assuming first that .f ∈ M[0, ∞), .If (t), .t ≥ 0, is a .{Ft : t ≥ 0}–martingale. Hence, by the optional stopping rule, .If (τt ), .t ≥ 0, is a .{Fτt : t ≥ 0}–martingale. Also, since t 2 2 .St := I (t) − f 0 |f(s)| ds, .t ≥ 0, is a martingale, so is .Xt := Sτt .(t ≥ 0). But 2 .Xt = Yt − t, .t ≥ 0. Hence, by Lévy’s characterization, .Yt , .t ≥ 0, is a standard onedimensional Brownian motion. For a general .f ∈ L[0, ∞], let .fn (t) := f(t)1[τn >t] T 2 .(n = 1, 2, . . . ). Then .E 0 |fn (u)| du ≤ n .∀ T , so that .fn ∈ M[0, ∞]. Therefore, a (n) (n) minor modification of Theorem 9.4 implies .Yt := Ifn (τt ) is a standard Brownian s (n) motion on .[0, n], where .τt := inf{s ≥ 0 : 0 |f(u)|2 1[τn >u] du = t} = inf{s ≥ 0 : s∧τn (n) |f(u)|2 du = t} = τt for .t ≤ n (since .τn ≥ τt ). Thus, .Yt = Ifn (τt ) = If (τt ), 0 .0 ≤ t ≤ n. Hence, .If (τt ) is a standard Brownian motion on .[0, n] for every .n ≥ 1. .
2 In a startling and sensational recent discovery of a sealed letter written some 60 years ago to the Academy of Sciences in Paris, it was revealed that by the age of 25, Wolfgang Doeblin had obtained this result among several others in the development of the stochastic calculus. See Handwerk and Willems (2007), Davis and Etheridge (2006) for historical accounts.
9 Cameron–Martin–Girsanov Theorem
165
t Remark 9.3 The assumption .t → 0 |f(u)|2 du is strictly increasing may be s dispensed by defining .τt := inf{s ≥ 0 : 0 |f(u)|2 du > t}. .τt is then an .{Ft : t ≥ 0}–optional time, which is also a .{Ft }–stopping time if .{Ft : t ≥ 0} is right continuous (see BCPT3 p. 59). One may thus take the filtration to be .Ft+ := ∩ε>0 Ft+ε .(t ≥ 0) if necessary. It may be shown that, outside a P –null set, .t → If (τt ) is continuous.4 The proof of the desired martingale properties of 2 .Yt := If (τt ) and .Yt − t .(t ≥ 0) follows as above. Example 1 (One-Dimensional Diffusion) Consider a one-dimensional diffusion satisfying dX(t) = σ (X(t))dB(t),
.
X(0) = x,
where .σ is a positive Lipschitz function on .R bounded away from zero. To “find”the time-change .τt that makes .Z(t) = X(τt ), t ≥ 0, a standard Brownian motion, consider that a continuous strictly increasing process .t → τt , τ0 = 0, would make .Z(t), t ≥ 0 a continuous martingale. Thus, in view of Corollary 9.5, it is sufficient to determine such a process .τt , t ≥ 0, for which Z has the same quadratic variation as Brownian motion, namely, t. Now, by Itô’s lemma,
τt
Z (t) − t = X (τt ) − t = x +
.
2
2
2
t
σ (Xs )ds − t + 2
σ (Xs )dBs , t ≥ 0.
0
0
So the martingale property of .Z 2 (t) − t, t ≥ 0, may be cast as a definition of .τt as follows τt . σ 2 (X(s))ds = t, t ≥ 0. 0
Indeed, it follows from Corollary 9.5 that .Z(t) = X(τt ), t ≥ 0, is standard Brownian motion starting at x. Conversely, .X(t) = B(τt−1 ), t ≥ 0, can be viewed as a time change of Brownian motion, where .τt−1 is given by by .
0
τt−1
ds = t, σ 2 (X(s))
t ≥ 0.
Remark 9.4 One may relax the assumption on .σ (·) to be just measurable and bounded away from zero, using Feller’s construction of all one-dimensional diffusions in Chapter 21. However, the construction of diffusions presented in Chapter 7 requires that .σ (·) be Lipschitz. An alternative brief treatment of time change to
3 Throughout,
BCPT refers to Bhattacharya and Waymire (2016), A Basic Course in Probability Theory. 4 See Ikeda and Watanabe (1989), pp. 85–86.
166
9 Cameron–Martin–Girsanov Theorem
Brownian motion is given in McKean (1969) that does not require the martingale characterization of Brownian motion. Specifically, one has the following: Theorem 9.6 Let .g(s), s ≥ 0, be an .Rk valued nonanticipative process belonging to .M[0, t ∞), 2or .M[0, T ] for some .T > 0. Assume that the quadratic variation .qt = t ≥ 0, of the (one-dimensional) process defined by the stochastic 0 |g(s)| ds, t integral .I (t) = 0 g(s) · dBs , t ≥ 0, is strictly increasing in t. Define the stopping times .τt = inf{s ≥ 0 : qs = t}, t ≥ 0. Then .Wt = I (τt ), t ≥ 0, is a standard one-dimensional Brownian motion with respect to the filtration .Gt = Fτt , t ≥ 0. Proof For arbitrary .s < t, and .ξ ∈ R, consider the process .C(t) = exp{iξ I (t) + 1 2 2 ξ qt }, t ≥ 0. By Itô’s lemma, one has ξ2 1 dC(t) = C(t){iξ g(t) · dBt + (iξ |g(t)|)2 dt + ( |g(t)|)2 dt} 2 2 = C(t)iξ g(t) · dBt ,
.
(9.8)
so that .C(t), t ≥ 0, is a .Ft -martingale. By the optional stopping theorem, it follows that .C(τt ), t ≥ 0, is a .Gt -martingale. Therefore, noting that .C(τt ) = exp{iξ I (τt ) + 1 2 2 ξ t}, one has for all .s < t, ξ2 1 E exp{iξ I (τt ) + ξ 2 t|Gs } = exp{iξ I (τs ) + s}, 2 2
.
(9.9)
From this, one obtains that E exp{iξ(I (τt ) − I (τs ))|Gs } = exp{−
.
ξ2 (t − s)}, 2
(9.10)
proving that .I (τt ) − I (τs ) is independent of .Gs and that it has the Gaussian distribution with mean zero and variance .t − s. Moreover, .t → Wt = I (τt ), t ≥ 0, is a composition of continuous functions and therefore continuous. For an application of time change, one may obtain the following property of two-dimensional Brownian motion when viewed as a random path in the complex plane5 Corollary 9.7 Let .B = (B (1) , B (2) ) be a two-dimensional standard Brownian motion starting at the origin and let .f ≡ u + iv : C → C be a nonconstant analytic function on the complex plane. Then .(u(B), v(B)) is a time change of twodimensional Brownian motion. Proof Observe that by the Cauchy-Riemann equations, namely,
5 This
result is generally attributed to Lévy (1948).
9 Cameron–Martin–Girsanov Theorem
167
ux = −vy , uy = vx ,
.
u and v are harmonic functions so that by Itô’s lemma, .{u(Bt ) : t ≥ 0} and {v(Bt ) : t ≥ 0} are stochastic integrals. Moreover, by the Cauchy-Riemann equations, .| grad u|2 = | grad v|2 and .grad u · grad v = 0. Thus, the quadratic variation and therefore time change to Brownian motion are the same for both stochastic integrals.
.
Remark 9.5 Analytic functions f for which .f (z) = 0 for all .z ∈ C are referred to as conformal maps. In this context, Corollary 9.7 is often cited as a form of a “conformal invariance”property of Brownian motion; see Lawler (2008) for a general perspective on the utility of conformal invariance in the context of scaling limits of discrete lattice systems. Remark 9.6 One may observe that the martingale .v(B) induced by the imaginary part of .f (B) is represented by a stochastic integral that may be viewed as a linear transformation of the stochastic integral defining the martingale induced by the real part .u(B). Such “martingale transforms” play a probabilistic role related to that of Hilbert and Riesz transforms in harmonic analysis; see Exercise 3. The following lemma is useful to note for computations in proofs to folllow. Lemma 1 Let .B = {Bt : t ≥ 0} denote a k-dimensional standard Brownian motion on .(Ω, F, P ), and let .γ (u) ≡ γ (u, · ) be a nonanticipative .Rk –valued functional T on .[0, T ] such that .E exp{θ 0 |γ (u)|2 du} < ∞ for some .θ > 1. Define Mt := exp{
.
t
γ (u) · dBu −
0
1 2
t
|γ (u)|2 du}, 0 ≤ t ≤ T .
0
Then under P , one has dMt = Mt γ (t) · dBt ,
.
i.e., .Mt = 1 +
t 0
(9.11)
Ms γ (s) · dBs , t ≥ 0, is a martingale.
Proof By Itô’s lemma, under P , one has 1 1 dMt = Mt [γ (t) · dBt − |γ (t)|2 dt] + Mt |γ (t)|2 dt = Mt γ (t) · dBt . 2 2
.
Thus, the assertion follows.
(9.12)
The next corollary paves the way to prove the main result of this chapter. Corollary 9.8 Let .B = {Bt : t ≥ 0} denote a k-dimensional standard Brownian motion on .(Ω, F, P ), and let .γ (u) ≡ γ (u, · ) be a nonanticipative .Rk -valued T functional on .[0, T ] such that .E exp{θ 0 |γ (u)|2 du} < ∞ for some .θ > 1, and t t define .Mt := exp{ 0 γ (u) · dBu − 12 0 |γ (u)|2 du}, .0 ≤ t ≤ T .
168
9 Cameron–Martin–Girsanov Theorem
a. Then a probability measure on .FT is defined by Q(A) = E1A MT ≡
∀ A ∈ FT ,
MT dP
.
(9.13)
A
and one has Q(A) = E1A Mt
∀ A ∈ Ft , 0 ≤ t ≤ T ,
.
(9.14)
i.e., denoting the respective restrictions of .Q, P to .Ft by .Q[0,t] , P[0,t] one has dQ[0,t] . dP[0,t] = Mt on .Ft . b. Under Q, B˜ t := Bt −
t
0 ≤ t ≤ T,
γ (u)du,
.
0
is a standard k-dimensional Brownian motion. Proof a. That Q is a probability measure on .FT follows from the fact that .Mt , .0 ≤ t ≤ T , is a martingale so that .EMT = EM0 = 1. Also, if .A ∈ Ft .(0 ≤ t ≤ T ), then .E1A MT = E[1A Mt ], by the martingale property. t(i) .(1 ≤ i ≤ k), .0 ≤ b. By Theorem 9.4, we need to prove that, under Q, (1) .B (j ) (i) t B t − δij t .(0 ≤ t ≤ T ) are t ≤ T , are mean-zero martingales and (2) .B martingales. For .A ∈ Fs , .s < t ≤ T , writing .EQ and .EP .(= E) for expectations under Q and P , respectively, on .FT , we have t(i) Mt | Fs )]. t(i) ) = EP (1A B t(i) Mt ) = EP [1A EP (B EQ (1A B
.
Using Lemma 1, one has under P that .Mt = 1 + .
t 0
(9.15)
Ms γ (s) · dBs , and
t(i) Mt ) d(B t = d[(Bt(i) − γ (i) (u)du)Mt ] 0
= =
d(Bt(i) Mt ) − γ (i) (t)Mt dt Mt dBt(i) (i)
= (Bt −
+ Bt(i) dMt
t 0
−(
+ Mt γ
t
γ (i) (u)du)dMt
0 (i)
(t)dt − γ
(i)
t
(t)Mt dt − (
γ (i) (u)du)dMt
0 (i)
γ (i) (u)du)dMt + Mt dBt .
(9.16)
9 Cameron–Martin–Girsanov Theorem
169
t(i) Mt is a stochastic integral and, therefore, a (mean-zero) Hence, under P , .B martingale. Using this in (9.15), one gets .∀ A ∈ Fs , t(i) ) = EP (1A B s(i) Ms ) = EQ (1A B s(i) ), EQ (1A B
.
(9.17)
t , .0 ≤ t ≤ T , is a mean-zero martingale .(1 ≤ i ≤ k). Next note that so that .B under P , using Itô’s lemma, (i)
.
t(i) B t d[(B
(j )
− δij t)Mt ]
t(i) d B t(j ) + B t(j ) d B t(i) + δij − δij )Mt t(j ) − δij t)dMt + (B t(i) B = (B t(i) B t(i) d B t(j ) + B t(j ) d B t(i) ). t(j ) − δij t)dMt + Mt (B = (B t(i) B t(j ) − δij t)Mt is a stochastic integral and, therefore, a Thus, under P , .(B t(i) B t(j ) − δij t, martingale. Therefore, as in the proof of (9.17), using (9.16), .B .0 ≤ t ≤ T , is a martingale under Q. We are now ready to state and prove the CMG Theorem. Consider the diffusion Xx,0 t =x+
t
.
0
μ(u, Xx,0 u )du +
t 0
σ (u, Xx,0 u )dBu ,
(9.18)
where .μ(u, y), .σ (u, y) are measurable (on .[0, T ] × Rk ) and spatially Lipschitzian, uniformly over .u ∈ [0, T ]. Let .γ (u, y) be measurable on .[0, T ] × Rk into .Rk , spatially Lipschitzian (uniformly on .[0, T ]) and such that .σ (u, y)γ (u, y) is also spatially Lipschitzian uniformly on .[0, T ]. Define Mt := exp{
.
0
t
γ (u, Xx,0 u ) · dBu −
1 2
t 0
2 |γ (u, Xx,0 u )| du},
0 ≤ t ≤ T.
(9.19)
Let Q be the probability measure on .FT defined by Q(A) =
MT dP
.
(A ∈ FT ).
(9.20)
A
Theorem 9.9 (Cameron–Martin–Girsanov Theorem (CMG)) Assume that for T 2 some .θ > 1, .E exp{θ 0 |γ (u, Xx,0 u )| du} < ∞. Under the above notation and hypotheses (9.18)–(9.20), the following are true. a. Under Q, Bt := Bt −
.
0
t
γ (u, Xx,0 u )du,
0 ≤ t ≤ T,
(9.21)
170
9 Cameron–Martin–Girsanov Theorem
is a standard .{Ft : t ≥ 0}–Brownian motion on .Rk (on the interval of time .[0, T ]). b. Under Q, .Xx,0 u .(0 ≤ u ≤ T ) is a diffusion with drift .μ(u, y) + σ (u, y)γ (u, y) and diffusion coefficient .σ (u, y). x,0 under P and Q, respectively, are c. The distributions .PTx,0 and .Qx,0 T , say of .X mutually absolutely continuous and related by (9.23). Proof (a) is a special case of Corollary 9.8. Part (b) follows on rewriting (9.18) as x,0 .Xt
= x+
t
0
x,0 x,0 {μ(u, Xx,0 u ) + σ (u, Xu )γ (u, Xu )}du +
t 0
σ (u, Xx,0 u )d Bu ,
(0 ≤ t ≤ T ), (9.22) Bu .0 ≤ u ≤ T , is a standard k-dimensional .{Ft : t ≥ noting that, under Q, . 0}-Brownian motion. To prove part (c), note that for every Borel subset D of x,0 x,0 k .C([0, T ] : R ), the set .A := [X [0,T ] ∈ D] ∈ FT , where .X[0,T ] (ω) is the function .t → Xx,0 t (ω), .0 ≤ t ≤ T , .∀ ω ∈ Ω. Hence, x,0 .Q T (D)
= Q(A) =
MT dP , A
PTx,0 (D) = P (A),
x,0 so that .Qx,0 T (D) = 0 if and only if .PT (D) = 0.
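As a concrete numerical illustration of this change of measure (a sketch, not taken from the text: the constant drift μ, horizon, and test functional below are arbitrary choices), an expectation of a functional of Brownian motion with constant drift can be recovered by reweighting driftless Brownian samples with the factor M_T = exp{μB_T − μ²T/2}, i.e., E_Q f(B_T) = E_P[f(B_T) M_T].

```python
import numpy as np

rng = np.random.default_rng(6)
mu, T, n_paths = 0.7, 1.0, 400_000

B_T = rng.normal(0.0, np.sqrt(T), size=n_paths)   # endpoint of driftless BM under P
f = lambda z: (z > 1.0).astype(float)             # test functional of the endpoint

direct = f(B_T + mu * T).mean()                   # simulate the drifted process directly
weights = np.exp(mu * B_T - 0.5 * mu**2 * T)      # M_T for the constant drift gamma = mu
reweighted = (f(B_T) * weights).mean()            # Girsanov reweighting of driftless paths
print(direct, reweighted)                         # both estimate P(B_T + mu*T > 1)
```

The agreement of the two estimates is the finite-dimensional shadow of the mutual absolute continuity asserted in part (c).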
Corollary 9.10 Let the vector fields (of R^k into R^k) μ(t, y) and β(t, y) be spatially Lipschitz uniformly for t ∈ [0, T]. Also assume σ(t, y) is spatially Lipschitz uniformly for t ∈ [0, T] and that it is nonsingular, such that

γ(t, y) := σ^{−1}(t, y)(β(t, y) − μ(t, y)),   0 ≤ t ≤ T,   (9.24)

is bounded. Then the distributions on C([0, T] : R^k) of the two diffusions with drifts μ(t, y) and β(t, y), respectively, and having the same diffusion coefficient σ(t, y), are mutually absolutely continuous.

Corollary 9.10 is an immediate consequence of part (c) of Theorem 9.9.

Remark 9.7 Note that the vector field γ(u) in Corollary 9.8 is only required to be nonanticipative, and need not be continuous or Lipschitzian, as long as the integrability condition is satisfied (e.g., if γ is bounded on [0, T]). Similarly, part (a) of Theorem 9.9 holds if γ(u, X^{x,0}_u) is nonanticipative and bounded on [0, T] × Ω. Let now μ(u, y) ≡ 0, σ(u, y) spatially Lipschitz (uniformly on [0, T]), and γ(u, y) bounded measurable on [0, T] × R^k (into R^k). Then the representation (9.22) holds under Q. However, its Markov property is not immediately clear, since the conditions for the pathwise uniqueness of the solution to an equation such
9 Cameron–Martin–Girsanov Theorem
171
as (9.22) are not met. The Markov property was proved by Stroock and Varadhan (1969), who showed that parts (b) and (c) still hold; indeed, one requires only that −1 (u, y) are continuous and bounded. The same argument leads to the .σ (u, y), σ conclusion of Corollary 9.10 under the same assumption on .σ , and letting .μ = 0 and .β(u, y) bounded and measurable. In particular, one may define a diffusion on k .R with drift .μ(u, y) bounded and measurable and diffusion .σ (u, y) continuous and uniformly elliptic (i.e., .σ , σ −1 both bounded). This result is quite deep (see Stroock and Varadhan (1979)). We now discuss the second approach to the CMG Theorem following McKean (1969). Assume .μ(·), σ (·) locally Lipschitzian, .σ (x) nonsingular for all .x. Consider the diffusion .Xx0 on .Rk governed by x0 .Xt
t
= x0 + 0
σ (Xxs 0 )dBs ,
(9.25)
where .{Bt : t ≥ 0} is a standard k-dimensional Brownian motion with respect to a filtration .{Ft : t ≥ 0} on a probability space. Assume that this diffusion is nonexplosive. This is automatic for .k = 1 by the tests of Feller in Chapter 12 but requires additional conditions on .σ σ in higher dimensions to be provided in Chapter 12. Let .Yx0 be the diffusion governed by the Itô equation Yxt 0 = x0 +
t
.
0
μ(Yxs 0 )ds +
t 0
σ (Yxs 0 )dBs .
(9.26)
Let .ζ denote the explosion time of .Yx0 . Write 0 Mt,x := exp{ 0
.
0
t
σ −1 (Xxs 0 )μ(Xxs 0 )·dBs −
1 2
t 0
|σ −1 (Xxs 0 )μ(Xxs 0 )|2 ds}.
(9.27)
Below “distribution,” without qualification, refers to distribution under P , and .E denotes expectation under P . Theorem 9.11 (Cameron–Martin–Girsanov Theorem (CMG)) above hypothesis, one has, for each .t ≥ 0, 0 P (Yx[0,t] ∈ C, ζ > t) = E(1[Xx0
.
[0,t] ∈C]
0 Mt,x ) 0
Under the
(9.28)
for every Borel subset C of .C([0, t] : Rk ). Here, the subscript .[0, t] indicates restriction of the processes to the time interval .[0, t]. Proof Define, for each positive integer .n > |x0 |, τn := inf{t ≥ 0 : |Xxt 0 | = n},
.
τ˜n := inf{t ≥ 0 : |Yxt 0 | = n}.
(9.29)
172
9 Cameron–Martin–Girsanov Theorem
It is enough to prove that, for each n, and each .t > 0, 0 P (Yx[0,t] ∈ C, τ˜n > t) = E(1[Xx0
.
[0,t] ∈C]∩[τn >t]
(9.30)
Mt ).
To prove (9.30), one may arbitrarily modify .μ(·), .σ (·) such that .μ(x) vanishes for sufficiently large values of .|x|, and .σ (x) = I for sufficiently large values of .|x|, and .μ(·), .σ (·) are bounded and globally Lipschitzian. With this modification, define for .0 ≤ s ≤ t, s Mt,x := exp{ 0
.
s
t
σ −1 (Xxu0 )μ(Xxu0 )dBu −
1 2
t s
|σ −1 (Xxu0 )μ(Xxu0 )|2 du}.
(9.31)
0 , .s ≥ 0, is a .{F } 0 Then .Ms,x s s≥0 -martingale, by Proposition 9.1, and .Ms,x0 = 1 + 0 t 0 x x −1 (X 0 )μ(X 0 ) · dB . As before, define the probability measure .Q on u x0 u u 0 Mu,x0 σ .Ft by 0 Qx0 (A) = E(1A Mt,x ) 0
.
∀ A ∈ Ft .
(9.32)
We will first show that, under .Qx0 , .Xsx0 , .0 ≤ s ≤ t, is a Markov process with respect to .{Fs }0≤s≤t . Let G be a real bounded .Fs -measurable (test) function and g a real bounded Borel measurable function on .Rk . Then 0 0 0 EQx0 Gg(Xxs+u ) = EGg(Xxs+u )Ms+u,x 0
.
0 s 0 = E[GMs,x g(Xxs+u )Ms+u,x ] 0 0 0 s 0 E(g(Xxs+u )Ms+u,x | Fs ) = E[GMs,x 0 0 0 0 {E(g(Xyu )Mu,y }y=Xx0 ], = E[GMs,x 0
(9.33)
s
s 0 since (i) .{Bs+u − Bs , 0 ≤ u ≤ t − s} is independent of .Fs and (ii) .(Xxs+u , Ms+u,x ) 0 y x0 0 is the same function of .(Xs , {Bs+u − Bs , 0 ≤ u ≤ u}) as .(Xu , Mu,y ) is of .(y, {Bu , 0 ≤ u ≤ u}). The last expression of (9.33) equals 0 E[GMs,x (EQy g(Xyu ))y=Xx0 ] = EQx0 [G(EQy g(Xyu ))y=Xx0 ]. 0
.
s
s
0 That is, under .Qx0 , the conditional distribution of .Xxs+u , given .Fs is the .Qy y x0 distribution of .Xu on .[Xs = y]. This proves the Markov property of .Xxs , .0 ≤ s ≤ t, under .Qx0 . It remains to prove that the .Qx0 -distribution of .Xxs 0 .0 ≤ s ≤ t, is the same as the distribution of .Yxs 0 , .0 ≤ s ≤ t. For this, let g be a bounded function on k .R with bounded and continuous first-order derivatives. Then, by Itô’s lemma, and recalling Lemma 1,
9 Cameron–Martin–Girsanov Theorem
173
$$
\begin{aligned}
E_{Q_{x_0}} g(X^{x_0}_s) - g(x_0) &= E\big[g(X^{x_0}_s)\,M^{0}_{s,x_0}\big] - g(x_0)\\
&= E\Big[\int_0^s M^{0}_{u,x_0}\,\big(\sigma'(X^{x_0}_u)\nabla g(X^{x_0}_u)\big)\cdot dB_u
 + \frac12\int_0^s \sum_{r,r'}(\partial_r\partial_{r'}g)(X^{x_0}_u)\,(\sigma\sigma')_{rr'}(X^{x_0}_u)\,M^{0}_{u,x_0}\,du\\
&\qquad + \int_0^s M^{0}_{u,x_0}\,g(X^{x_0}_u)\,\sigma^{-1}(X^{x_0}_u)\mu(X^{x_0}_u)\cdot dB_u
 + \int_0^s \big(\sigma'(X^{x_0}_u)\nabla g(X^{x_0}_u)\cdot \sigma^{-1}(X^{x_0}_u)\mu(X^{x_0}_u)\big)\,M^{0}_{u,x_0}\,du\Big]\\
&= E\Big[\int_0^s (Ag)(X^{x_0}_u)\,M^{0}_{u,x_0}\,du\Big]
 = \int_0^s \big[E_{Q_{x_0}} Ag(X^{x_0}_u)\big]\,du,
\end{aligned}
$$
where A is the generator of the diffusion $Y^{x_0}_s$, $0 \le s \le t$ (the expectations of the stochastic integrals vanish, the integrands being bounded). This shows that the $Q_{x_0}$-distribution of $X^{x_0}_s$, $0 \le s \le t$, is the same as the distribution (under P) of $Y^{x_0}_s$, $0 \le s \le t$, since the two Markov processes have the same generator. Since $t \ge 0$ is arbitrary, the theorem is proved.

Corollary 9.12 Under the hypothesis of Theorem 9.11,
a. The distribution of the explosion time $\zeta$ of $Y^{x_0}$ is given by
$$
P(\zeta > t) = E\,M^{0}_{t,x_0}, \qquad (9.34)
$$
so that
b. $Y^{x_0}_t$ is nonexplosive if and only if $E\,M^{0}_{t,x_0} = 1$ for all t (or, equivalently, $\{M^{0}_{t,x_0},\ t \ge 0\}$ is a martingale) for every $x_0 \in \mathbb R^k$, and in this case
c. The distribution of $Y^{x_0}_{[0,T]}$ is absolutely continuous with respect to the distribution of $X^{x_0}_{[0,T]}$ for all $T > 0$, and all $x_0$.

Remark 9.8 Since necessary and sufficient conditions are known for nonexplosion in one dimension, Corollary 9.12 provides a characterization of the process $M^{0}_{t,x_0}$, $t \ge 0$ $(x_0 \in \mathbb R^k)$, being a martingale. The use of Novikov's condition here follows as a consequence. For $k > 1$, one may use Khas'minskii's sufficient condition for nonexplosion to arrive at a criterion for the martingale property of $M^{0}_{t,x_0}$, $t \ge 0$, more general than what an application of Novikov's condition yields (see Chapter 12). In Theorem 9.11, the diffusion $X_t$ is taken to be driftless. However, one can use a general nonexplosive diffusion
$$
X^{x_0}_t = x_0 + \int_0^t \beta(X^{x_0}_s)\,ds + \int_0^t \sigma(X^{x_0}_s)\,dB_s \qquad (9.35)
$$
and arrive at the following extension, more or less by the same arguments (Exercise 8).

Corollary 9.13 Assume $\beta(\cdot)$ and $\sigma(\cdot)$ are locally Lipschitzian, $\sigma(x)$ is nonsingular for all x, and $X^{x}$ $(x \in \mathbb R^k)$ is nonexplosive. Then, if $\mu(\cdot)$ is locally Lipschitzian, one has for every $t > 0$, $x_0 \in \mathbb R^k$,
$$
P\big(Y^{x_0}_{[0,t]} \in C,\ \zeta > t\big) = E\big(1_{[X^{x_0}_{[0,t]} \in C]}\, M^{0}_{t,x_0}\big), \qquad (9.36)
$$
where C is a Borel subset of $C([0,t] : \mathbb R^k)$ and
$$
M^{0}_{t,x_0} := \exp\Big\{ \int_0^t \sigma^{-1}(X^{x_0}_s)\big[\mu(X^{x_0}_s) - \beta(X^{x_0}_s)\big]\cdot dB_s \;-\; \frac12 \int_0^t \big|\sigma^{-1}(X^{x_0}_s)\big[\mu(X^{x_0}_s)-\beta(X^{x_0}_s)\big]\big|^2\, ds \Big\}. \qquad (9.37)
$$
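The change of measure (9.36)–(9.37) also suggests a simple simulation device: expectations of functionals of the drift-$\mu$ diffusion Y may be estimated by simulating the drift-$\beta$ diffusion X and weighting each sample path by the exponential factor $M^{0}_{t,x_0}$. The following is a minimal one-dimensional Euler–Maruyama sketch of this idea; the helper `girsanov_estimate` and all parameter choices are illustrative assumptions, not part of the text.

```python
import numpy as np

rng = np.random.default_rng(0)

def girsanov_estimate(f, mu, beta, sigma, x0, t, n_steps=1000, n_paths=20000):
    """Estimate E f(Y_t) for dY = mu(Y)dt + sigma(Y)dB by simulating
    dX = beta(X)dt + sigma(X)dB and weighting with the factor in (9.37)."""
    dt = t / n_steps
    x = np.full(n_paths, x0, dtype=float)
    logM = np.zeros(n_paths)
    for _ in range(n_steps):
        dB = rng.normal(scale=np.sqrt(dt), size=n_paths)
        h = (mu(x) - beta(x)) / sigma(x)        # sigma^{-1}(mu - beta) along X
        logM += h * dB - 0.5 * h**2 * dt        # log of the exponential factor
        x += beta(x) * dt + sigma(x) * dB       # Euler-Maruyama step for X
    return np.mean(f(x) * np.exp(logM))

# Sanity check against a case with a known answer: beta = 0, mu = 0.5, sigma = 1,
# so that Y_t ~ N(x0 + 0.5 t, t) and E Y_1 = 0.5.
est = girsanov_estimate(lambda y: y, lambda y: 0.5 + 0 * y, lambda y: 0 * y,
                        lambda y: 1 + 0 * y, x0=0.0, t=1.0)
print(est, "vs exact", 0.5)
```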
Example 2 (First Passage Time Under Drift) Recall that the hitting time $\tau_y$ of a point y by a one-dimensional standard Brownian motion $\{B_t : t \ge 0\}$ starting at zero has a distribution that may be readily computed from the reflection principle (strong Markov property),
$$
P\Big(\max_{0\le s\le t} B_s > y\Big) = 2P(B_t > y), \qquad y > 0,
$$
to obtain the probability density
$$
f_y(t) := \frac{|y|}{\sqrt{2\pi t^3}}\,e^{-\frac{y^2}{2t}}.
$$
This calculation extends to Brownian motion with drift $\mu$ by an application of CMG and the optional stopping theorem with respect to the pair of stopping times $\tau_y \wedge t \le t$, as follows. Applying CMG, conditioning on $\mathcal F_{\tau_y \wedge t}$, and then using the optional stopping rule, one has
$$
P(\inf\{s : B_s + \mu s = y\} \le t) = E\big(1_{[\tau_y \le t]}M_t\big) = E\big(1_{[\tau_y \le t]}E(M_t \mid \mathcal F_{\tau_y\wedge t})\big) = E\big(1_{[\tau_y \le t]}M_{\tau_y\wedge t}\big). \qquad (9.38)
$$
Now, on $[\tau_y \le t]$ one has $M_{\tau_y\wedge t} = M_{\tau_y}$. Thus,
$$
P(\inf\{s : B_s + \mu s = y\} \le t) = E\big(1_{[\tau_y\le t]}M_{\tau_y}\big)
= \int_0^t e^{\mu y - \frac12\mu^2 s}\,\frac{|y|}{\sqrt{2\pi s^3}}\,e^{-\frac{y^2}{2s}}\,ds
= \int_0^t \frac{|y|}{\sqrt{2\pi s^3}}\,e^{-\frac{(y-\mu s)^2}{2s}}\,ds. \qquad (9.39)
$$
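The density appearing in (9.39) is easy to check against a direct simulation of the drifted Brownian motion. The sketch below discretizes the path on a fine grid and compares the empirical crossing probability with the integral of the density; all parameter values are illustrative assumptions.

```python
import numpy as np
from scipy import integrate

rng = np.random.default_rng(1)
mu, y, t = 0.8, 1.0, 2.0
n_paths, n_steps = 50_000, 2_000
dt = t / n_steps

x = np.zeros(n_paths)
hit = np.zeros(n_paths, dtype=bool)
for _ in range(n_steps):
    x += mu * dt + rng.normal(scale=np.sqrt(dt), size=n_paths)
    hit |= (x >= y)                              # discretized first passage of level y

density = lambda s: abs(y) / np.sqrt(2 * np.pi * s**3) * np.exp(-(y - mu * s)**2 / (2 * s))
exact, _ = integrate.quad(density, 0, t)
print(hit.mean(), "vs", exact)                   # the two numbers should roughly agree
```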
Example 3 (Boundary Value Distribution of Brownian Motion on a Half-Space) Let $B = \{B_t : t \ge 0\}$ be a k-dimensional standard Brownian motion starting at $x \in H_k := \{x : x_k > 0\}$ $(k \ge 2)$. $H_k$ is referred to as the upper half-space in $\mathbb R^k$. Let $\tau = \inf\{t : B_t \in \partial H_k\}$, where $\partial H_k = \{x \in \mathbb R^k : x_k = 0\}$. Since the coordinates of B are independent standard Brownian motions, $\tau$ is simply the first passage time of the standard Brownian motion $B^{(k)}$ to 0 starting from $x_k > 0$. Let us compute the distribution of $B_\tau = (B^{(1)}_\tau, \dots, B^{(k-1)}_\tau, 0)$. For a Borel measurable set $A = A_{k-1} \times \{0\} \subseteq \partial H_k$, note that $P(B_\tau \in A) = E\,\psi((B^{(1)},\dots,B^{(k-1)}), \tau)$, where $\psi(\mathbf g, t) = 1_{A_{k-1}}(\mathbf g(t))$ for a continuous function $\mathbf g \in C([0,\infty) : \mathbb R^{k-1})$ and a real number $t \ge 0$. Since $\tau$ is a function of $B^{(k)}$, it is independent of $(B^{(1)},\dots,B^{(k-1)})$. Thus, one may apply the substitution property of conditional expectation to write
$$
P(B_\tau \in A) = E\big[E\{\psi((B^{(1)},\dots,B^{(k-1)}), \tau) \mid \sigma(\tau)\}\big],
$$
where, writing $x^{k-1} = (x_1,\dots,x_{k-1})$,
$$
E\big[\psi((B^{(1)},\dots,B^{(k-1)}), \tau) \mid \tau\big] = P\big[(B^{(1)}_t,\dots,B^{(k-1)}_t) \in A_{k-1}\big]\Big|_{t=\tau}
= \Big(\frac{1}{\sqrt{2\pi\tau}}\Big)^{k-1}\int_{A_{k-1}} e^{-\frac{1}{2\tau}|y - x^{k-1}|^2}\,dy.
$$
Thus, applying Fubini–Tonelli and making the change of variable $s = (|y - x^{k-1}|^2 + x_k^2)/(2t)$, one has
$$
\begin{aligned}
P(B_\tau \in A) &= \Big(\frac{1}{\sqrt{2\pi}}\Big)^k x_k \int_{A_{k-1}}\int_0^\infty \Big(\frac{2s}{|y - x^{k-1}|^2 + x_k^2}\Big)^{\frac{k+2}{2}} e^{-s}\,\frac{|y - x^{k-1}|^2 + x_k^2}{2s^2}\,ds\,dy\\
&= \Gamma\Big(\frac{k}{2}\Big)\pi^{-\frac k2}\int_{A_{k-1}} \frac{x_k}{(|y - x^{k-1}|^2 + x_k^2)^{\frac k2}}\,dy. \qquad (9.40)
\end{aligned}
$$
The probability density function of $(B^{(1)}_\tau,\dots,B^{(k-1)}_\tau)$ can be read off from here. Note that in the case $k = 2$ this is a Cauchy density (also see Exercise 3).
Exercises

1. (Elementary Change of Measure) Suppose that Z is a standard normal random variable defined on $(\Omega, \mathcal F, P)$, and let $X = \sigma Z$, $Y = \sigma Z + \mu$. Then X, Y are random variables on $(\Omega, \mathcal F, P)$ with densities $f_a(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-a)^2}{2\sigma^2}}$, $x \in \mathbb R$, for $a = 0$ and $a = \mu$, respectively. (A numerical illustration of this reweighting is sketched after this exercise list.)
   (a) Show that the probability distributions $P_X$ and $P_Y$ on $\mathbb R$ are mutually absolutely continuous with $\frac{dP_Y}{dP_X}(x) = \exp\big(\frac{\mu}{\sigma^2}x - \frac{\mu^2}{2\sigma^2}\big)$. [Hint: Write $P_Y(B) = \int_B f_Y(y)\,dy = \int_B \frac{f_Y(x)}{f_X(x)} f_X(x)\,dx$, $B \in \mathcal B$.]
   (b) Show $P(Y \in B) = E\,\mathbf 1(Y \in B) = E(\mathbf 1(X \in B)\,M)$, where $M = \exp\big(\frac{\mu}{\sigma^2}X - \frac{\mu^2}{2\sigma^2}\big)$.
   (c) Define a probability Q on $(\Omega, \mathcal F, P)$ by $Q(A) = \int_A M(\omega)\,P(d\omega)$, $A \in \mathcal F$. Show that under Q, X has density $f_\mu$, i.e., $Q_X$ is the normal distribution with mean $\mu$ and variance $\sigma^2$.
2. Show that two Brownian motions $X_{[0,T]}$ and $Y_{[0,T]}$ starting at zero having the same drift parameter $\mu$, but different diffusion coefficients $\sigma_1^2 \ne \sigma_2^2$, have mutually singular distributions. [Hint: It suffices to take $\mu = 0$ and consider the law of the iterated logarithm.]
3. (Probabilistic Construction of Hilbert Transform) The Hilbert transform $f_1$ of a "rapidly decreasing" $C^\infty$ real-valued function f is defined in harmonic analysis by the requirement that the harmonic extension of $f + if_1$ to the upper half-plane be complex analytic. Here, rapidly decreasing means that f together with all its derivatives vanishes at $\infty$ faster than any power of x. An analytic formula is known to be given in terms of the Fourier transform by $\hat f_1(\xi) = i\,\mathrm{sgn}(\xi)\hat f(\xi)$, $\xi \in \mathbb R$. This exercise concerns an alternative probabilistic representation of $f_1$. The harmonic extension $u(x,y)$, $x \in \mathbb R$, $y > 0$, of a real-valued function f on $\mathbb R$ to the upper half-plane is given by
$$
u(x,y) = E_{(x,y)} f\big(B^{(1)}_\tau\big),
$$
where $B_t = (B^{(1)}_t, B^{(2)}_t)$, $t \ge 0$, is two-dimensional Brownian motion and $\tau := \inf\{t \ge 0 : B^{(2)}_t = 0\}$ is the hitting time of the boundary $\mathbb R$ of the half-space $\mathbb R \times (0,\infty)$.
   (a) Show that $u(x,y) = \int_{\mathbb R} P_y(x-t)f(t)\,dt$, where $P_y(x) = \frac{1}{\pi}\frac{y}{y^2 + x^2}$ is the so-called Poisson kernel (Cauchy density).
   (b) Show that the martingale transform defined by $V_t = \int_0^t A\nabla u(B_s)\cdot dB_s$, where $A = \begin{pmatrix} 0 & 1\\ -1 & 0\end{pmatrix}$, provides a version of $u_1(B_t)$, where $u_1(x,y)$ is the harmonic extension of the Hilbert transform $f_1$ of f to the upper half-plane. [Hint: Use Itô's lemma and the Cauchy–Riemann equations applied to the analytic function $u(x,y) + iu_1(x,y)$ in the upper half-plane.]
   (c) Show that $f_1(x) = E[V_\tau \mid B^{(1)}_\tau = x]$, $x \in \mathbb R$.
4. Give an example of the exponential martingale $M^{0}_{t,x_0}$, $t \ge 0$, as defined in (9.31), where Novikov's condition (9.4) fails. [Hint: Use Khas'minskii's criterion of nonexplosion.]
5. Let $B_t$, $t \ge 0$, be a standard one-dimensional Brownian motion with respect to a filtration $\{\mathcal F_t : t \ge 0\}$, and let $f(t)$, $t \ge 0$, be a nonanticipative functional belonging to $L[0,\infty)$ such that $\int_0^t f^2(s)\,ds \to \infty$ a.s. as $t \uparrow \infty$. Let $\tau_t := \inf\{s \ge 0 : \int_0^s f^2(u)\,du = t\}$, $t \ge 0$. Prove that $I_f(\tau_t)$, $t \ge 0$, has the same finite-dimensional distributions as Brownian motion, where $I_f(t) := \int_0^t f(u)\,dB_u$, $t \ge 0$. [Hint: Use characteristic functions.] Extend this to k-dimensional stochastic integrals.
6. Let $\mathbf f(\cdot)$ be a k-dimensional process belonging to $L[0,\infty)$. Use the supermartingale property of the exponential martingale $M_t$ defined by (9.1) to prove the exponential inequality $P\big(\sup_{t \ge 0}\int_0^t \mathbf f(s)\cdot dB_s - \frac{\alpha}{2}\int_0^t |\mathbf f(s)|^2\,ds > \beta\big) \le e^{-\alpha\beta}$ $(\alpha > 0, \beta > 0)$.
7. Let $\sigma(t)$, $t \ge 0$, be a $(k\times d)$-matrix valued nonanticipative functional such that $\int_0^t |\sigma(u)|^2\,du < \infty$ $\forall\, t > 0$ and $\int_0^t |\sigma(u)|^2\,du \to \infty$ a.s. as $t \to \infty$. Give a direct proof (using characteristic functions) that $I_\sigma(t) := \int_0^t \sigma(u)\,dB_u$, $t \ge 0$ ($B_t$, $t \ge 0$, a d-dimensional standard Brownian motion), is a k-dimensional standard Brownian motion if and only if the components $X^{(i)}_t := I_{\sigma_i}(t)$, $1 \le i \le k$ ($\sigma_i(\cdot)$ is the ith row of $\sigma(\cdot)$), satisfy $dX^{(i)}_t\, dX^{(j)}_t = \delta_{ij}\,dt$ (in the sense of Itô's lemma).
8. Write out a proof of Corollary 9.13.
9. Let $\tau_a = \inf\{t \ge 0 : B_t = a\}$. Define the stochastic process $Y = \{Y_t : t \ge 0\}$ as follows: $Y_t = B_t$, $t \le \tau_a$; $Y_t = 2a - B_t$, $t \ge \tau_a$. (i) Show that Y has the same distribution as B. (ii) Show that $P(B_t \ge x, \tau_a < t) = P(Y_t \ge 2a - x, \tau_a < t) = P(Y_t \ge 2a - x, M_t > a)$, where $M_t = \max\{B_s : 0 \le s \le t\}$. (iii) Find the joint distribution of $(B_t, M_t)$. [Hint: $B_{\tau_a + s} = B_{\tau_a} + B_{\tau_a+s} - B_{\tau_a}$, and $B_{\tau_a+s} - B_{\tau_a}$, $s \ge 0$, is a standard Brownian motion starting at zero, independent of $\mathcal F_{\tau_a}$. In particular, the latter is invariant under reflection about zero. Use this to argue that $\{B_{\tau_a+s} : s \ge 0\}$ has the same distribution as $B_{\tau_a} + (B_{\tau_a} - B_{\tau_a+s})$, $s \ge 0$.]
10. Use the CMG Theorem to derive the joint distribution of $X^x_t := B^x_t + t\mu$ and $\max_{0\le s\le t} X^x_s$, where $B^x_t$, $t \ge 0$, is a standard Brownian motion starting at x. [Hint: Use the joint distribution of $B^x_t$ and $\max_{0\le s\le t} B^x_s$ (e.g., see Corollary 7.12, p. 84, in Bhattacharya and Waymire (2021)).]
11. Show that the distributions of two Brownian motions with the same diffusion coefficient but different drifts are mutually singular as probabilities on $C[0,\infty)$. [Hint: Use the strong law of large numbers.]
12. (Martingale Characterization of Poisson Process) Suppose that $\{X_t : t \ge 0\}$ is a stochastic process with right-continuous step-function sample paths with positive unit jumps and $X_0 = 0$. Suppose that there is a continuous nondecreasing function $m(t)$ with $m(0) = 0$ such that $M_t := X_t - m(t)$, $t \ge 0$, is a martingale. Show that $\{X_t : t \ge 0\}$ is a (possibly nonhomogeneous) Poisson process.
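As a quick numerical illustration of Exercise 1 (and of the reweighting mechanism used throughout this chapter), the sketch below draws samples of $X = \sigma Z$ and weights them by $M = \exp(\mu X/\sigma^2 - \mu^2/(2\sigma^2))$; the weighted moments agree with those of $Y = \sigma Z + \mu$. The parameter values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
sigma, mu, n = 2.0, 1.5, 1_000_000

Z = rng.standard_normal(n)
X = sigma * Z
M = np.exp(mu * X / sigma**2 - mu**2 / (2 * sigma**2))   # Radon-Nikodym weight

print(np.mean(M))                                  # ~ 1 (the weight has expectation one)
print(np.mean(X * M), "vs", mu)                    # weighted mean of X ~ mean of Y
print(np.mean((X - mu)**2 * M), "vs", sigma**2)    # weighted variance ~ variance of Y
```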
Chapter 10
Support of Nonsingular Diffusions
This chapter uses the Cameron–Martin–Girsanov theorem to show that the distribution $P_x$ of a nonsingular nonexplosive diffusion starting at x has full support in $C([0,\infty) : \mathbb R^k)$. Strong and weak maximum principles for Dirichlet problems associated with the infinitesimal generator are obtained as corollaries.
In this chapter, we use the Cameron–Martin–Girsanov Theorem (Theorem 9.11 or Corollary 9.13) to show that the distribution on $C([0,\infty) : \mathbb R^k)$ of every nonexplosive nonsingular diffusion
$$
X^{x_0}_t = x_0 + \int_0^t \mu(X^{x_0}_s)\,ds + \int_0^t \sigma(X^{x_0}_s)\,dB_s, \qquad t \ge 0, \qquad (10.1)
$$
has full support $\mathcal C_{x_0} := \{\mathbf f \in C([0,\infty) : \mathbb R^k) : \mathbf f(0) = x_0\}$.

Theorem 10.1 (Support Theorem) Suppose $\mu(\cdot)$, $\sigma(\cdot)$ are locally Lipschitzian, $\sigma(x)$ is nonsingular for all x, and $X^{x_0}$ is nonexplosive. Then the support of the distribution of $X^{x_0}$ is $\mathcal C_{x_0}$.
.
Proof It is enough to prove that for every .T > 0, .ε > 0, and a continuously differentiable .f ∈ Cx0 with bounded derivative .f , one has P ( max |Xxt 0 − f(t)| ≤ ε) > 0.
.
0≤t≤T
(10.2)
Let $\mathbf f'(t) = \mathbf c(t)$, so that $\mathbf f(t) = x_0 + \int_0^t \mathbf c(u)\,du$. Then $Z_t := X^{x_0}_t - \mathbf f(t)$, $t \ge 0$, is a (nonhomogeneous) diffusion,
$$
Z_t = \int_0^t \big[\mu(Z_s + \mathbf f(s)) - \mathbf c(s)\big]\,ds + \int_0^t \sigma(Z_s + \mathbf f(s))\,dB_s. \qquad (10.3)
$$
Then (10.2) is equivalent to
$$
P\Big(\max_{0\le t\le T}|Z_t| \le \varepsilon\Big) > 0. \qquad (10.4)
$$
By Theorem 9.9, this is equivalent to proving
$$
P\Big(\max_{0\le t\le T}|Y_t| \le \varepsilon\Big) > 0, \qquad (10.5)
$$
where Y is governed by the equation
$$
Y_t = \int_0^t (-M Y_s)\,ds + \int_0^t \sigma_{\mathbf f}(s, Y_s)\,dB_s, \qquad (10.6)
$$
$\sigma_{\mathbf f}(s,y) := \sigma(y + \mathbf f(s))$, and M is a (large) positive number to be specified later. Note that for a fixed $\mathbf f$ and $\varepsilon > 0$, the probability in (10.5) depends only on the values of $\mu$ and $\sigma$ in a compact neighborhood of $x_0$. Hence, we may assume, without loss of generality, that $\mu_{\mathbf f}(s,y) := \mu(y + \mathbf f(s))$, $\sigma_{\mathbf f}$, $\sigma_{\mathbf f}^{-1}$ are bounded and Lipschitzian (to satisfy the hypothesis for $\mu$ and $\sigma$ in Corollary 9.10). Now (10.5) is implied by
$$
E\tau \ge T, \qquad \tau := \inf\{t \ge 0 : |Y_t| = \varepsilon\}. \qquad (10.7)
$$
To derive the last inequality (for a sufficiently large M), write $a(s,y) := \sigma_{\mathbf f}(s,y)\sigma_{\mathbf f}(s,y)'$, with entries $a_{ij}(s,y)$, and define
$$
d(s,y) := \sum_{i,j} a_{ij}(s,y)\,\frac{y^{(i)}y^{(j)}}{|y|^2}, \qquad
B(s,y) := \sum_i a_{ii}(s,y), \qquad
C(y) := -2M|y|^2. \qquad (10.8)
$$
Then, applying the generator $A_s := \frac12\sum a_{ij}(s,y)\,\partial^2/\partial y^{(i)}\partial y^{(j)} - M\sum y^{(i)}\,\partial/\partial y^{(i)}$ to a radial function $\varphi(y) = F(|y|)$, with $F(\cdot)$ twice differentiable on $[0,\varepsilon]$,
$$
\begin{aligned}
A_s\varphi(y) &= \frac12\,d(s,y)\,F''(|y|) + \frac{B(s,y) + C(y) - d(s,y)}{2|y|}\,F'(|y|)\\
&= \frac12\,d(s,y)\Big[F''(|y|) + \Big\{\frac{B(s,y)+C(y)}{d(s,y)} - 1\Big\}\frac{F'(|y|)}{|y|}\Big].
\end{aligned}
\qquad (10.9)
$$
We would like to find a function F on $[0,\varepsilon]$ such that $F(0) > T$, $F(\varepsilon) = 0$, $F'(r) < 0$, and $A_s\varphi(y) \ge -1$ for $0 \le s \le T$, $|y| \equiv r \le \varepsilon$. For then Itô's lemma and optional stopping would yield $E\varphi(Y_\tau) - \varphi(0) \ge E(-\tau)$, or $E\tau \ge F(0) > T$. One such function is
$$
F(r) := \int_r^\varepsilon\Big(\frac{2}{\Lambda}\int_0^u \exp\Big\{-\int_v^u \frac{\beta(v')}{v'}\,dv'\Big\}\,dv\Big)\,du, \qquad
\beta(u) := \frac{k\Lambda}{\lambda} - \frac{2Mu^2}{\Lambda} - 1, \qquad (10.10)
$$
where $\lambda > 0$ is a lower bound and $\Lambda < \infty$ an upper bound for the eigenvalues of $a(s,y)$ for $0 \le s \le T$, $|y| \le \varepsilon$. Note that $\{B(s,y) + C(y)\}/d(s,y) - 1 \le \beta(|y|)$ for $0 \le s \le T$, $|y| \le \varepsilon$. Also
$$
F'(r) = -\frac{2}{\Lambda}\int_0^r \exp\Big\{-\int_v^r \frac{\beta(v')}{v'}\,dv'\Big\}\,dv < 0,
$$
$$
F''(r) = -\frac{2}{\Lambda}\Big[1 - \frac{\beta(r)}{r}\int_0^r \exp\Big\{-\int_v^r \frac{\beta(v')}{v'}\,dv'\Big\}\,dv\Big]
       = -\frac{2}{\Lambda} - \frac{\beta(r)}{r}\,F'(r),
$$
so that, since $F'(|y|) < 0$,
$$
F''(|y|) + \Big\{\frac{B(s,y)+C(y)}{d(s,y)} - 1\Big\}\frac{F'(|y|)}{|y|} \;\ge\; -\frac{2}{\Lambda}, \qquad 0 \le s \le T,\ |y| \le \varepsilon. \qquad (10.11)
$$
Using (10.11) in (10.9), together with $d(s,y) \le \Lambda$, we get
$$
A_s\varphi(y) \ge -1, \qquad 0 \le s \le T,\ 0 < |y| \le \varepsilon. \qquad (10.12)
$$
Applying Itô's lemma (and the optional stopping rule) to $\varphi(Y_t)$ (see Exercise 1), one gets $E\tau \ge F(0)$. From (10.10) it is clear that $F(0) \to \infty$ as $M \to \infty$. Hence we can choose M such that $E\tau \ge T$.
Corollary 10.2 (The Strong Maximum Principle) Let G be a connected open set, and let $A = \frac12\sum a_{ij}(x)\,\partial^2/\partial x^{(i)}\partial x^{(j)} + \sum b_i(x)\,\partial/\partial x^{(i)}$ be the generator of a diffusion such that $a^{-1}(\cdot) \equiv (\sigma(\cdot)\sigma(\cdot)')^{-1}$ is bounded. If $u(\cdot)$ is A-harmonic, i.e., $Au(x) = 0$ $\forall\, x \in G$, then u cannot attain its maximum or minimum in G unless u is constant on G.

Proof Let $x \in G$ be such that $u(x) \ge u(y)$ $\forall\, y \in G$. Let $\delta > 0$ be such that $B(x:\delta) := \{y : |y - x| \le \delta\} \subseteq G$. Consider the diffusion $X^x$ with generator A, starting at x. Then, by the strong Markov property, one has (Exercise 2)
$$
u(x) = E\,u(X^x_\tau), \qquad (10.13)
$$
where $\tau := \inf\{t \ge 0 : |X^x_t - x| = \delta\}$. If $\pi(x,dy)$ denotes the distribution of $X^x_\tau$ (on $\{y : |y-x| = \delta\}$), then (10.13) may be expressed as
$$
u(x) = \int_{\{y\,:\,|y-x| = \delta\}} u(y)\,\pi(x, dy). \qquad (10.14)
$$
It follows easily from the Support Theorem 10.1 that the support of $\pi(x,dy)$ is (all of) $\{y : |y-x| = \delta\}$ (Exercise 3). Since u is continuous (on $\{y : |y-x| = \delta\}$) and $u(y) \le u(x)$ $\forall\, y$, one must have $u(y) = u(x)$ $\forall\, y \in \{y : |y-x| = \delta\}$ in view of (10.14). Since the same is true for every $\delta' \in (0, \delta]$, $u(y) = u(x)$ $\forall\, y \in B(x : \delta)$. This shows that the set B of points $x \in G$ such that $u(x) = \sup_{z\in G} u(z)$ is an open set. But B is also a closed subset of G, since $B = \{x \in G : u(x) = M\}$, where $M = \sup_{z\in G} u(z)$. Hence B is either empty or $B = G$.

Among the important applications of maximum principles are those to uniqueness problems for important classes of partial differential equations (also see Chapter 16).

Corollary 10.3 (The Weak Maximum Principle) Let G be a bounded open set, and let A satisfy the hypothesis of Corollary 10.2. If f is a continuous function on $\partial G$, g is a continuous function on G, and $u_1, u_2$ are both solutions to the Dirichlet problem
$$
Au_j(x) = g(x)\ \ (x \in G), \qquad \lim_{x\to b} u_j(x) = f(b)\ \ (b \in \partial G),\ (j = 1,2), \qquad (10.15)
$$
then $u_1 \equiv u_2$ (on $\overline G$). Proof Without essential loss of generality, assume G is connected (else, apply the following argument to each connected component of G). Since $v := u_1 - u_2$ satisfies $Av(x) = 0$ for $x \in G$, $v(b) = 0$ for $b \in \partial G$, and v is continuous on $\overline G$, v attains its minimum as well as its maximum on $\partial G$, which are both zero. Therefore $v \equiv 0$ on $\overline G$.
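The representation $u(x) = E\,u(X^x_\tau)$ underlying (10.13)–(10.15) also yields a Monte Carlo method for the Dirichlet problem: run the diffusion from x until it leaves G and average the boundary data at the exit point. The sketch below does this for Brownian motion (so $A = \tfrac12\Delta$, $g \equiv 0$) on the unit disc with boundary data $f(y) = y_1$, whose harmonic extension is $u(x) = x_1$; all names and parameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)

def dirichlet_mc(x, f_boundary, dt=1e-4, n_paths=20_000):
    """Estimate u(x) = E_x f(B_tau), tau = exit time of the unit disc."""
    pos = np.tile(np.asarray(x, dtype=float), (n_paths, 1))
    alive = np.ones(n_paths, dtype=bool)
    values = np.zeros(n_paths)
    while alive.any():
        pos[alive] += rng.normal(scale=np.sqrt(dt), size=(alive.sum(), 2))
        r = np.linalg.norm(pos, axis=1)
        exited = alive & (r >= 1.0)
        if exited.any():
            # project the exit points onto the circle and record the boundary data
            values[exited] = f_boundary(pos[exited] / r[exited, None])
            alive[exited] = False
    return values.mean()

est = dirichlet_mc([0.3, 0.2], lambda y: y[:, 0])
print(est, "vs exact", 0.3)
```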
Exercises

1. (i) For k = 1, show that $\varphi(y)$ in Theorem 10.1 is twice continuously differentiable on $(-\varepsilon, \varepsilon)$, and may be extended to a bounded function with bounded first and second derivatives on all of $\mathbb R$. Then justify the use of Itô's lemma and the optional stopping rule to derive the desired result $E\tau \ge T$.
   (ii) For k > 1, and arbitrary $0 < \delta < \varepsilon$, show that $\varphi(y)$ is twice continuously differentiable on $\delta < |y| < \varepsilon$, and can be extended to a bounded function $\varphi_\delta$ with bounded first and second order derivatives on $\mathbb R^k$. [Hint: Derive expressions for derivatives of a radial function, or look ahead to the display in the proof of Lemma 8 in the next chapter.] Starting at $y_0$ such that $\delta < |y_0| < \varepsilon$, and applying Itô's lemma to this extended function, show that $E\big(\tau^{y_0}_\delta \wedge \tau^{y_0}_\varepsilon \wedge t\big) \ge F(|y_0|) - E\,F\big(|Y_{\tau^{y_0}_\delta\wedge\tau^{y_0}_\varepsilon\wedge t}|\big)$ for all $t \ge 0$, where $\tau^{y_0}_\theta = \inf\{t \ge 0 : |Y^{y_0}_t| = \theta\}$, $\theta > 0$. Then let $t \uparrow \infty$, and then let $\delta \downarrow 0$, to get $E\big(\tau^{y_0}_{0+}\wedge\tau^{y_0}_\varepsilon\big) \ge F(|y_0|) - E\,F\big(|Y_{\tau^{y_0}_{0+}\wedge\tau^{y_0}_\varepsilon}|\big)$, where $\tau^{y_0}_{0+} := \lim_{\delta\downarrow 0}\tau^{y_0}_\delta$. Prove that $\tau^{y_0}_{0+} = \infty$ a.s., and first derive $E\tau^{y_0}_\varepsilon \ge F(|y_0|)$, and then $E\tau^{0}_\varepsilon \ge F(0)$.
2. Derive (10.13).
3. Let $\tau$ be the hitting time of the sphere $\{y : |y - x| = \delta\}$. Show that the support of the distribution $\pi(x, dy)$ of $X^x_\tau$ is full under the hypothesis of Corollary 10.2.
Chapter 11
Transience and Recurrence of Multidimensional Diffusions
In this chapter criteria are developed for the transience, null recurrence, and positive recurrence of multidimensional diffusions generated by a second-order elliptic operator, denoted by L, with diffusion matrix $((a_{ij}(\cdot)))_{1\le i,j\le k}$ nonsingular and continuous, and drift vector $b(\cdot)$ measurable and bounded on compact subsets of $\mathbb R^k$.
In this chapter we analyze[1] the long-time behavior of diffusions generated by the elliptic operator
$$
L = \frac12\sum_{i,j=1}^k a_{ij}(x)\frac{\partial^2}{\partial x_i\,\partial x_j} + \sum_{j=1}^k b_j(x)\frac{\partial}{\partial x_j} \qquad (11.1)
$$
under the following conditions, assumed throughout this chapter. Criteria[2] were announced by Khas'minskii (1960) under additional smoothness conditions.

Remark 11.1 The new notation L is being used here, in place of the earlier notation A in preceding chapters for elliptic operators generating a diffusion, in order to indicate that the smoothness assumptions for A are relaxed here.

[1] The generality in the approach presented in this chapter follows Bhattacharya (1978).
[2] Unaware of Khas'minskii's (1960) announcement, Friedman (1973) derived analogous criteria for recurrence and transience under additional growth and smoothness conditions. Alternative derivations of Khas'minskii's announced results under additional smoothness conditions may be found in Pinsky and Dieck (1995).
Condition (A): The matrix $((a_{ij}(x)))$ is real-symmetric and positive definite for each $x \in \mathbb R^k$, and the functions $a_{ij}(x)$, $1 \le i,j \le k$, are continuous. The functions $b_j(x)$, $1 \le j \le k$, are real-valued, Borel measurable, and bounded on compacts.

A key role is played by the following property of the diffusions to be considered here.

Definition 11.1 (Strong Feller Property) A Markov process $X = \{X(t) : t \ge 0\}$ starting at $x \in \mathbb R^k$ under $P_x$ is said to have the strong Feller property if for every bounded measurable function $f : \mathbb R^k \to \mathbb R$, the function $x \mapsto E_x f(X(t))$ is continuous on $\mathbb R^k$ for each fixed $t > 0$.

The following extension of the construction of diffusions to possibly non-smooth coefficients on $\mathbb R^k$, $k \ge 2$, is due to Stroock and Varadhan (1979). For the case of Lipschitz coefficients it is a consequence of Itô's lemma (see Corollary 8.5) and Theorem 10.1.

Theorem 11.1 In addition to Condition (A), suppose that $a_{ij}, b_j$ are bounded on $\mathbb R^k$. Then for each $x \in \mathbb R^k$ there is a unique probability measure $P_x$ on $\Omega = C([0,\infty) : \mathbb R^k)$, with the Borel $\sigma$-algebra for the topology of uniform convergence on compact subsets of $[0,\infty)$, such that
(i) $P_x(X(0) = x) = 1$, $x \in \mathbb R^k$;
(ii) for every bounded $f : \mathbb R^k \to \mathbb R$ having bounded and continuous first- and second-order derivatives, the process
$$
M(t) = f(X(t)) - \int_0^t Lf(X(s))\,ds, \qquad t \ge 0,
$$
is a martingale under .Px with respect to the filtration .Ft = σ (X(u) : u ≤ t), t ≥ 0. Furthermore, (a) X is a strong Markov and strong Feller process, and (b) The support of .Px is .{ω ∈ C([0, ∞) : Rk ) : ω(0) = x}. Remark 11.2 It may be noted that Theorem 11.1 holds more generally under condition A if explosion does not occur (see Definition 11.4). A sufficient condition is provided in the next chapter. The first step to relax the conditions for this construction to those of Condition (A) is to consider continuous truncations of the coefficients of L outside a large ball centered at the origin. Specifically, for each integer .N ≥ 1, define truncated coefficients of L as follows: (i) .aij,N (x) = aij (x), bj,N (x) = bj (x), for |x| ≤ N, 1 ≤ i, j ≤ k. (ii) .aij,N (x) = aij (x0 ), bj,N (x) = bj,N (x0 ) for x = cx0 such that |x0 | = N, c > 1. In addition, define
$$
L_N = \frac12\sum_{i,j=1}^k a_{ij,N}(x)\frac{\partial^2}{\partial x_i\,\partial x_j} + \sum_{j=1}^k b_{j,N}(x)\frac{\partial}{\partial x_j}, \qquad N = 1, 2, \dots.
$$
Definition 11.2 The coordinate process on $C([0,\infty) : \mathbb R^k)$ with distributions $P_{x,N}$, $x \in \mathbb R^k$, associated with the operator $L_N$ in accordance with Theorem 11.1, will be referred to as the diffusion with generator $L_N$.

Under Condition (A) it is a somewhat technical matter that one can construct a canonical model of the diffusion process on the enlarged state space $\mathbb R^k \cup \{\infty\}$ obtained by adjoining a fictitious state, denoted $\infty$, to $\mathbb R^k$ under the one-point compactification topology (see Stroock and Varadhan (1979)). Under the additional condition that the coefficients of L are Lipschitz, the result follows from Corollary 8.2 and Remark 8.1.

Theorem 11.2 Assume Condition (A) holds. Then for each $x \in \mathbb R^k$ there is a probability measure $P_x$ on the Borel $\sigma$-field of $\Omega = C([0,\infty) : \mathbb R^k \cup \{\infty\})$, for the one-point compactification of $\mathbb R^k \cup \{\infty\}$ and the corresponding topology of uniform convergence on compact subsets of $[0,\infty)$ on $\Omega$, such that the coordinate projection process $X = \{X(t) : t \ge 0\}$ on $\Omega$ is a strong Markov process under $P_x$, $x \in \mathbb R^k \cup \{\infty\}$, and for each twice continuously differentiable function f,
$$
M(t) = f(X(t)) - \int_0^t Lf(X(s))\,ds, \qquad t \ge 0,
$$
is a martingale under each .Px , x ∈ Rk ∪ {∞}. We will continue to represent the coordinate-projection process with state space Rk ∪ {∞} with the same notation, X and .Px . The coordinate-projection processes associated with .LN , N ≥ 1, through Theorem 11.1 will continue to be denoted by k .Px,N , x ∈ R . Also .Ft will denote the filtration generated by .σ (X(s) : 0 ≤ s ≤ t), 3 .t ≥ 0, and .Fτ denotes the pre-.τ sigmafield if .τ is a stopping time (see BCPT p. 59). k Recall that for an open set .U ⊂ R , and .x ∈ U , the escape time .
$$
\eta_U(x) = \inf\{t \ge 0 : X(t) \notin U\} = \inf\{t \ge 0 : X(t) \in \partial U\}, \qquad (11.2)
$$
is a stopping time. Definition 11.3 Let L denote the elliptic operator (11.1) under the assumptions of Condition (A), and let .X = {X(t) : t ≥ 0} be the coordinate-projection process having distributions .Px , x ∈ Rk . Let
[3] Throughout, BCPT refers to Bhattacharya and Waymire (2016), A Basic Course in Probability Theory.
B(x : r) = {y ∈ Rk : |y − x| < r}.
.
(i) A point .x ∈ Rk is said to be recurrent for the diffusion if for any .ε > 0, Px (X(t) ∈ B(x : ε) for a sequence of t ↑ ∞) = 1.
.
(ii) A point .x ∈ Rk is said to be transient if Px (|X(t)| → ∞ as t → ∞) = 1.
.
If all points are recurrent, or if all points are transient, then the diffusion is said to be recurrent, or to be transient, respectively. The following lemma follows from the main theorem (see Stroock and Varadhan (1979), Lemma 9.2.2, page 234), and is stated without proof. For elliptic operators with Hölder continuous coefficients, it is a standard result in the theory of partial differential equations (see Friedman (1964)) that the transition probability has a density .p(t; x, y) which is differentiable in both .x, y, for all .t > 0. Lemma 1 Assume Condition (A). For each Borel measurable set .B ⊂ Rk , the map k .(t, x) → p(t; x, B) is continuous on .(0, ∞) × R . In particular, for all .(t, x) ∈ k (0, ∞) × R , .p(t; x, dy) is absolutely continuous with respect to Lebesgue measure with a density .p(t; x, y) ≥ 0, in the sense that .p(t; x, B) = B p(t; x, y)dy, for all .B ∈ S. Lemma 2 Let .(S, ρ) be a metric space with Borel .σ -field .S. Let .(x, t) → p(t; x, B) be continuous for each .B ∈ S. Consider a Markov process .{X(t) : t ≥ 0} on a probability space .(Ω, F, Px ), x ∈ S, having transition probability .p(t; x, dy). Then for every bounded measurable function g on .Ω, the function .x → Ex g is continuous on S. Proof Define the filtration .Ft = σ (X(s), 0 ≤ s ≤ t), t ≥ 0. Suppose g is a finite dimensional measurable function on .Ω, say, g(X) = g(X(0), X(t1 ), . . . , X(tk )),
.
0 < t1 < · · · < tk .
Then, Ex g = Ex [E(g(X(0), X(t1 ), . . . , X(tk ))|Ftk−1 )]
.
= Ex gk−1 (X(0), X(t1 ), . . . , X(tk−1 )),
(11.3)
where .gk−1 (x, x1 , . . . , xk−1 ) = S g(x, x1 ,. . . , xk−1 , xk )p(tk − tk−1 ; xk−1 , dxk ). Continuing in this manner, .g1 (x, x1 ) = S g(x, x1 , x2 )p(t2 − t1 ; x1 , dx2 ), and therefore, .Ex g = S g0 (x, x1 )p(t1 ; x, dx1 ). By assumption, the last integral is
11 Transience and Recurrence of Multidimensional Diffusions
189
continuous, since a bounded measurable function can be approximated uniformly by simple functions.4 Restricting to indicator functions .g = 1B of measurable sets, it follows that the class .C of sets such that .x → Px (B) is continuous contains the field of all finite dimensional sets. But if .Bn ↑ B then .Px (Bn ) ↑ Px (B) for .Bn a finite dimensional sequence. Therefore, .x → Px (B) is lower semicontinuous for all sets B belonging to the .σ -field generated by the field of finite dimensional sets. Turning to complements, it follows that .x → Px (B c ) = 1 − Px (B) is lower semicontinuous. That is .x → Px (B) is upper semicontinuous. Hence .x → Px (B) is continuous for all B in .C, which is, therefore, the .σ -field .F. An entirely analogous argument holds for bounded measurable functions. Definition 11.4 (Explosion Time) Assume Condition (A). The explosion time for X is defined by ζ = lim ηB(0:N ) ,
.
N ↑∞
where .B(0 : N) = {x ∈ Rk : |x| < N }, and .ηB(0:N ) is the escape time from the ball .B(0 : N ). For .x ∈ Rk , the probability measure .Px is said to be conservative if .Px (ζ = ∞) = 1. Definition 11.5 Assume Condition (A). A Borel measurable function .f : Rk → R is said to be L-harmonic on an open set G if it is bounded on compacts, and for all .x ∈ G, f (x) = Ex f (X(ηU )),
.
for every neighborhood U of x having compact closure .U in G. Under additional smoothness of L, the following lemma follows from Corollary 8.3 and Corollary 10.3 Lemma 3 Assume Condition (A) holds. a. Every L-harmonic function on an open set .G ⊂ Rk is continuous on G. b. (Maximum Principle) Let f be a non-negative L-harmonic function on a connected open set .G ⊂ Rk . Then f is either strictly positive on G or .f ≡ 0 on G. Proof To prove part (a), let f be a L-harmonic function on an open set .G ⊂ Rk . For .x ∈ U , an open subset of G whose closure .U in G is compact, one has f (x) = Ex f (X(ηU )) = Ex,N f (X(ηU )),
.
4 See
BCPT, p. 221, (2.4).
190
11 Transience and Recurrence of Multidimensional Diffusions
for .B(0 : N) ⊃ U . This makes f continuous on G, by Lemma 2. For part (b), suppose that .f ≥ 0 on G and .f (x0 ) = 0, x0 ∈ G, and let B denote the open ball centered at .x0 of radius .ε > 0, sufficiently small that .B ⊂ G. Then 0 = f (x0 ) = Ex0 f (X(ηB )) =
f (y)p(x0 , dy),
.
∂B
where .p(x0 , dy) is the distribution of .X(ηB ) under .Px0 , and therefore under .Px0 ,N for .N > |x0 | + ε, as well. By Theorem 11.1, the support of .p(x0 , dy) is .∂B (Exercise 1). Since .f ≥ 0 and continuous, it follows that .f = 0 on .∂B. Thus, .f = 0 on G. Lemma 4 Assume Condition (A) holds. a. If .U ⊂ Rk is a non-empty, open subset of .Rk , such that there is a non-empty open connected set V , .V ∩ U = ∅, then .x → Px (ηU < ∞) is positive and continuous on U . b. If .U1 , U2 ⊂ Rk are two non-empty, open subsets of .Rk , such that .U 1 ∩ U 2 = ∅, c and .U 2 = Rk \U 2 is connected, then .x → Px (ηU c < ηU c ) is positive and c
1
c
2
c
continuous on .U 1 ∩U 2 , where .ηU c is the escape time of the open set .U i , .i = 1, 2. i
Proof It follows from the strong Markov property that both functions are Lharmonic and, therefore, continuous. To prove positivity in (b), and therefore c c positivity in (a), let .x ∈ U 1 ∩ U 2 . Let .B ⊂ U1 be an open ball and .B1 a bounded open set such that .x ∈ B1 , .B ⊂ B1 ⊂ {x ∈ R: |x| < N }, .B1 ∩ U 2 = ∅. Then, Px (ηU c < ηU c ) ≥ Px (ηB c < ηB c ) = Px,N (ηB c < ηB c ) > 0.
.
1
2
1
1
The final expression is positive since Px,N ({ω ∈ C([0, ∞) : Rk ) : ω(0) = x}) = 1.
.
.
Lemma 5 Assume Condition (A) holds. If .Px0 is conservative for some .x0 ∈ Rk , then .Px is conservative for all .x ∈ Rk , and the process .{X(t) : t ≥ 0} has the strong Feller property. Proof Since .x → Px (ζ < ∞) is harmonic, the first assertion follows. The strong Feller property follows from Lemma 2. Lemma 6 Assume Condition (A) holds. If U is a bounded open subset of .Rk with escape time .ηU , then for all .t > 0, .x ∈ Rk , E x ηU ≤
.
t . 1 − supx Px (ηU > t)
(11.4)
11 Transience and Recurrence of Multidimensional Diffusions
191
Proof For fixed but arbitrary .t > 0, let .q = supx Px (ηU > t), and .At = [ηU > t]. Then, by the strong Markov property, for .s ≥ 0, Px (At+s ) = Ex {1As PX(s) (At )} ≤ qPx (As ).
.
Iterating, one has Px (Ant ) ≤ q n ,
.
n = 1, 2, . . . .
Thus, E x ηU =
∞
.
Ex 1[nt 0, .q = supx Px (ηU > t). Lemma 7 Assume Condition (A) holds. If U is a bounded open subset of .Rk , then .
sup EηU < ∞. x∈U
Proof Let N be sufficiently large so that .B(0 : N) ⊃ U . Then .Ex ηU = Ex,N ηU for x ∈ U . Fix .t0 > 0. Since .Px,N (ω ∈ C([0, ∞) : Rk ) : ω(0) = x) = 1, one has
.
.
sup Px,N (ηU > t0 ) ≤ Px,N (|X(t0 )| < N) < 1, ∀ x ∈ U . x∈U
The assertion now follows by applying Lemma 6.
192
11 Transience and Recurrence of Multidimensional Diffusions
Proposition 11.3 Assume Condition (A) holds. The following are equivalent. a. The diffusion is recurrent. b. .Px (X(t) ∈ U for some t ≥ 0) = 1 for all .x ∈ Rk , nonempty open sets .U ⊂ Rk . c. There is a compact set .K ⊂ Rk such that .Px (X(t) ∈ K for some t ≥ 0) = 1 for all .x ∈ Rk . d. .Px (X(t) ∈ U for a sequence of t ↑ ∞) = 1 for all .x ∈ Rk , nonempty open sets k .U ⊂ R . e. For each .x ∈ Rk , there is a point .z ∈ Rk , a pair of numbers .0 < r0 < r1 , and a point .y ∈ ∂B(z : r1 ) = {w : |w − z| = r1 } such that .Py (ηB c (x:r0 ) < ∞) = 1. Proof The implications (b).⇒(c), (b).⇒(e), and (d).⇒(a) are obvious. To prove (a).⇒(b), let .x ∈ Rk , and let U be a nonempty open set such that .x ∈ / U . Let .B ⊂ U be an open ball, and choose .ε > 0 such that .B(x : ε)∩B = ∅. Let .U1 ⊃ B(x : ε)∪B be a bounded open set. Define .η1 = ηU1 , .η2i = inf{t > η2i−1 : X(t) ∈ ∂B(x : ε)}, .η2i+1 = inf{t > η2i : X(t) ∈ ∂U1 }, i = 1, 2, . . . . By Lemma 7, and recurrence of x, the .ηi ’s are .Px -a.s. finite stopping times. Consider the events, .A0
= [X(t) ∈ B for some t ∈ [0, η1 )], Ai = [X(t) ∈ B for some t ∈ [η2i−1 , η2i )], i ≥ 2. c
Since .y → Py (ηB c < ηB(x:ε)c ) > 0 is positive and continuous on .B ∩ B(x : ε) by Lemma 4(b),
c
δ = inf Py (ηB c < ηB(x:ε)c ) > 0.
.
y∈∂U1
Using the strong Markov property and induction on n, one obtains .Px (∩ni=0 Aci ) ≤ (1 − δ)n . Thus, Px (X(t) ∈ U for no t ≥ 0) ≤ Px (X(t) ∈ Bfor no t ≥ 0) ≤ lim Px ∩ni=0 Aci = 0.
.
n
Next, to prove (b).⇒(d), let .x ∈ Rk , U a nonempty open set, B an open ball, and .ε > 0 such that .B ∩ B(x : ε) = ∅ and .B ⊂ U . Define θ1 = inf{t ≥ 0 : X(t) ∈ ∂B(x : ε)}, θ2i = inf{t ≥ θ2i−1 : X(t) ∈ ∂B},
.
and θ2i+1 = inf{t ≥ θ2i : X(t) ∈ ∂B(x : ε)}, i ≥ 1.
.
By (b) and the strong Markov property, the .θi ’s are .Px -a.s. finite. Also .θi ↑ ∞ Px -a.s. as .i ↑ ∞; for otherwise, with .Px -positive probability the sequences ∞ and .{X(θ )}∞ converge to a common limit, which is impossible .{X(θ2i−1 )} 2i i=1 i=1 since .∂B(x : ε) and .∂B are disjoint. To prove (c).⇒(b), Let K be as in (c), B an arbitrary open ball, and .x ∈ Rk . Let U be an open ball containing .B ∪ K. Define .
11 Transience and Recurrence of Multidimensional Diffusions
193
η1 = ηK c ,
.
η2i = inf{t ≥ η2i−1 : X(t) ∈ ∂U }, η2i+1 = inf{t ≥ η2i : X(t) ∈ K}, i ≥ 1.
By (c), the strong Markov property and Lemma 7, .ηi , i ≥ 1, are .Px -a.s. finite. Now proceed as in the proof that (a).⇒(b). Specifically, define the .Ai ’s by replacing .ηi by .η , i ≥ 1, respectively, and let .δ + infy∈K Py (A1 ). then .δ > 0 by Lemma 4(b), and i Px (X(t) ∈ B for no t ≥ 0) ≤ Px (∩ni=1 Aci ) ≤ (1 − δ)n ,
.
n = 1, 2 . . . .
To prove (e).⇒(c), take .K = B(z : r0 ) and use Lemma 4(a), and the maximum principle of Lemma 3(b). Theorem 11.4 Assume Condition (A) holds. a. If there is a recurrent point, then the diffusion is recurrent. b. If there is no recurrent point, then the diffusion is transient. Proof To prove part (a), suppose y is a recurrent point. Choose .0 < r0 < r1 , z ∈ Rk such that .|y − z| = r1 . In Proposition 11.3(a.⇒b), it is shown that Py (X(t) ∈ ∂B(z : r0 ) for some t ≥ 0) = Py (X(t) ∈ B(z : r0 ) for some t ≥ 0) = 1.
.
By Proposition 11.3(e), the diffusion is recurrent. To prove part (b), suppose that no point in .Rk is recurrent. Fix .x ∈ Rk , and let .r > |x| be an otherwise arbitrary positive number. By Proposition 11.3(e) and the maximum principle (Lemma 3(b)), for each .r1 > r one has δr1 ≡ sup Py (ηB(0:r)c < ∞) < 1.
.
|y|=r1
Define η1 = inf{t ≥ 0 : X(t) ∈ ∂B(0 : r1 )}, η2i = inf{t ≥ η2i−1 : X(t) ∈ B(0 : r)},
.
and η2i+1 = inf{t ≥ η2i : X(t) ∈ ∂B(0 : r1 )}, i ≥ 1.
.
By Lemma 7 and the strong Markov property, for all .i ≥ 1, .
Px (X(t) ∈ B(0 : r) for some sequence of t’s increasing to infinity) ≤ Px (η2i+1 < ∞)
194
11 Transience and Recurrence of Multidimensional Diffusions
= Ex (1[η2i−1 r) = 1. Since .r > |x| is arbitrary, the proof is complete. We now turn to computable criteria to determine whether a multidimensional diffusion is recurrent or transient. However, this will require some additional notation. Fix .r0 > 0 and for .x, z ∈ Rk , let .x = x − z. (i) k
Az (x) =
aij (x + z)
.
i,j =1
xi xj |x |
, 2
Bz (x) = 2
k
xi bi (x + z).
i=1
(ii) k
Cz (x) =
.
aii (x + z),
β z (r) = sup
|x |=r
i=1
Bz (x) − Az (x) + Cz (x) . Az (x)
(iii) β z (r) = inf
.
|x |=r
Bz (x) − Az (x) + Cz (x) , Az (x)
α z (r) = sup Az (x). |x |=r
(iv) α z (r) = min Az (x),
.
|λ |=r
I z (r) =
r r0
β z (u) du, u
I z (r) =
r
r0
β z (u) u
du.
Lemma 8 Let .F ∈ C 2 ((0, ∞)), and let .z ∈ Rk . Define the function f (x) = F (|x − z|),
.
x ∈ Rk , |x − z| > 0.
Then 2Lf (x) = Az (x)F (|x − z|) +
.
F (|x − z|) [Bz (x) − Az (x) + Cz (x)]. |x − z|
11 Transience and Recurrence of Multidimensional Diffusions
195
Proof Straightforward differentiations yield for .|x − z| > 0, ∂f (x) (xi − zi ) = F (|x − z|) ∂xi |x − z|
.
∂ 2 f (x) ∂xi2
=
(xi − zi )2 (xi − zi )2 F (|x − z|) F (|x − z|) − F (|x − z|) + 2 3 |x − z| |x − z| |x − z|
(xi − zi )(xj − zj ) (xi − zi )(xj − zj ) ∂ 2 f (x) = F (|x − z|) − F (|x − z|), i = j. ∂xi xj |x − z|2 |x − z|3
From here one may directly check the asserted expression for .2Lf (x).
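The radial identity in Lemma 8 can be confirmed symbolically in a small dimension. The following spot check (k = 2, generic constant coefficients, a concrete radial profile F) is purely illustrative and is not part of the text; it uses the computer algebra package sympy.

```python
import sympy as sp

x1, x2, z1, z2 = sp.symbols('x1 x2 z1 z2', real=True)
a11, a12, a22, b1, b2 = sp.symbols('a11 a12 a22 b1 b2', real=True)

r = sp.sqrt((x1 - z1)**2 + (x2 - z2)**2)
F = lambda s: s**3 + sp.log(s)               # any smooth radial profile will do
f = F(r)                                     # f(x) = F(|x - z|)

xs, xp = [x1, x2], [x1 - z1, x2 - z2]
a = [[a11, a12], [a12, a22]]

Lf = sp.Rational(1, 2) * sum(a[i][j] * sp.diff(f, xs[i], xs[j])
                             for i in range(2) for j in range(2)) \
     + b1 * sp.diff(f, x1) + b2 * sp.diff(f, x2)

A_z = sum(a[i][j] * xp[i] * xp[j] for i in range(2) for j in range(2)) / r**2
B_z = 2 * (xp[0] * b1 + xp[1] * b2)
C_z = a11 + a22
s = sp.Symbol('s', positive=True)
Fp, Fpp = sp.diff(F(s), s).subs(s, r), sp.diff(F(s), s, 2).subs(s, r)

rhs = A_z * Fpp + (Fp / r) * (B_z - A_z + C_z)
print(sp.simplify(2 * Lf - rhs))             # expected output: 0
```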
Theorem 11.5 Assume Condition (A) holds. ∞ a. If for some .r0 > 0 and .z ∈ Rk , . r0 exp(−I z (r))dr = ∞, then the diffusion associated with L is recurrent. ∞ b. If for some .r0 > 0 and .z ∈ Rk , . r0 exp(−I z (r))dr < ∞, then the diffusion associated with L is transient. Proof ∞ To prove part (a), assume that for some .r0 . r0 exp(−I z (r))dr = ∞. Define, F (r) =
r
.
exp(−I z (u))du = ∞,
>
0 and .z
∈
Rk ,
f (x) = F (|x − z|), |x − z| ≥ r0 .
r0
Let .z ∈ Rk be such that .r = |x − z| > r0 . Define stopping times by the escape times ηB(z:r0 ) = inf{t ≥ 0 : X(t) ∈ ∂B(z : r0 )}, ηN = ηB(z:r0 ) ∧ ηB(x:N ) .
.
(11.6)
By Theorem 11.1 and optional sampling, M(t) = f (X(t ∧ ηN )) −
t∧ηN
Lf (X(s))ds,
.
t ≥ 0,
0
is a .Px -martingale, provided .|x − z| < N. Thus, for .r = |x − z| > r0 , .
2Ex F (|X(t ∧ ηN ) − z|) − 2F (r) t∧ηN = Ex 2Lf (X(s))ds 0
≤ Ex
0
= 0, since
t∧ηN
Az (X(s)){F (|X(s) − z|) +
F (|X(s) − z|) β z (|X(s) − z|)}ds |X(s) − z| (11.7)
196
11 Transience and Recurrence of Multidimensional Diffusions
F (u) ≥ 0,
F (u) +
.
1 F (u)β z (u) = 0, u
u ≥ r0 .
Letting .t ↑ ∞ in (11.6), and recalling that .ηN < ∞ .Px -a.s., one obtains
r
Ex F (|X(ηN ) − z|) ≤ F (r) =
exp{−I z (u)}du.
.
(11.8)
r0
On evaluating the left side of (11.8), one has Px (ηB(z:r0 ) > ηB(x:N ) )
N
.
exp{−I z (u)}du ≤
r
exp{−I z (u)}du.
r0
(11.9)
r0
Now let .N ↑ ∞, to obtain r r
exp{−I z (u)}du
r0
exp{−I z (u)}du
Px (ηB(z:r0 ) = ∞) ≤ lim N0
.
N ↑∞
= 0.
(11.10)
Thus, .Px (ηB(z:r0 ) < ∞) = 1, and the diffusion is recurrent by Proposition 11.3(e). ∞ To prove part (b), assume that for some .r0 > 0 and .z ∈ Rk , . r0 exp(−I z (r))dr < ∞. Define r .G(r) = exp{−I z (u)}du, g(x) = G(|x − z|), |x − z| ≥ r0 . r0
Since .G (u) ≥ 0 and .G (u) + u1 G (u)β z (u) = 0 for .u ≥ r0 , one obtains, as above, Ex G(|X(ηN ) − z|) − G(|x − z|) ≥ 0,
.
or Px (ηB(z:r0 ) > ηB(x:N ) )
N
.
r0
exp{−I z (u)}du ≥
|x−z| r0
exp{−I z (u)}du.
(11.11)
Hence, letting .N ↑ ∞, one has |x−z| r
Px (ηB(z:r0 ) = ∞) ≥ 0 ∞
.
r0
exp{−I z (u)}du
exp{−I z (u)}du
> 0.
Thus, the diffusion is not recurrent by Proposition 11.3, and therefore it is transient by Theorem 11.4. We may now use a renewal or regeneration method to provide conditions for the existence of a unique invariant probability (see Bhattacharya and Waymire (2022),
11 Transience and Recurrence of Multidimensional Diffusions
197
Chapter 16, in the discrete parameter case). Recall (11.2) for the definition of the escape time .ηU . Lemma 9 Assume Condition (A) holds. a. The diffusion is recurrent and admits a finite invariant measure if there is a .z ∈ Rk such that .
sup y∈∂B(z:r1 )
Ey ηB(z:r0 )c < ∞,
for some .0 < r0 < r1 . b. Assume that the process is recurrent, but there is a .z ∈ Rk and .0 < r0 < r1 such that Ey ηB(z:r0 )c = ∞,
.
for all .y ∈ ∂B(z : r1 ). Then there does not exist a finite invariant measure. Proof Consider part (a) first, and assume there is a .z ∈ Rk such that .
sup y∈∂B(z:r1 )
Ey ηB(z:r0 )c < ∞,
for some .0 < r0 < r1 . Then the diffusion is recurrent by Proposition 11.3(e). Define η1 ≡ η1 (X) = inf{t ≥ 0 : X(t) ∈ ∂B(z : r0 )},
.
η2i ≡ η2i (X) = inf{t ≥ η2i−1 : X(t) ∈ ∂B(z : r1 )},
.
and η2i+1 ≡ η2i+1 (X) = inf{t ≥ η2i : X(t) ∈ ∂B(z : r0 )}, i ≥ 1.
.
Under the assumption of part (a) and the strong Markov property, the sequence X˜ 0 , X˜ i = X(η2i+1 ), i = 0, 1, . . . is a recurrent Markov chain on the compact state space .S = ∂B(z : r0 ), and therefore has an invariant probability .π˜ (dx) for the Feller transition probabilities5 given by
.
p(x, ˜ A) =
Py (X(η3 ) ∈ A)Px (X(η2 ) ∈ dy),
.
∂B(z:r1 )
for .A ∈ B(S), x ∈ S = ∂B(z : r0 ). In particular, .π˜ extends to Borel measurable sets .C ⊂ Rk , by defining a sigma-finite measure by
5 See
Bhattacharya and Waymire (2022), Proposition 8.6.
198
11 Transience and Recurrence of Multidimensional Diffusions
π(C) =
Ex ρC π˜ (dx),
.
(11.12)
S
where ρC =
η3
1C (X(t))dt
.
η1
denotes the occupation time of C by the process X during a single cycle. Equivalently, letting .ρ denote the time to complete a cycle starting in .S = ∂B(z : r0 ), .π is defined by the condition that for any bounded continuous function f on .Rk ,
.
Rk
f (x)π(dx) =
ρ
Ex
f (X(s))ds π˜ (dx).
(11.13)
0
S
To prove that .π is an invariant probability we use the following formula of Kha´ (1960). For a random variable .g(X), and a stopping time .τ such that sτminskii + .Ex g(X )dt exists, where .Xt+ denotes the after-t process defined by .Xt+ (s) = t 0 X(t + s), s ≥ 0, Ex
.
τ
τ
EX(t) g(X)dt = Ex
0
0
g(Xt+ )dt.
(11.14)
To check (11.14), apply Fubini’s theorem to the right side and then condition on .Ft to get, Ex
.
0
τ
g(Xt+ )dt = Ex =
0
∞
∞
Ex E 1[τ >t] g(Xt+ )|Ft dt
∞
Ex 1[τ >t] {E g(Xt+ )|Ft }dt
0
=
1[τ >t] g(Xt+ )dt
Ex 1[τ >t] g(Xt+ )dt
0
=
∞
0
= Ex
τ
EX(t) g(X)dt.
(11.15)
0
Now, with .τ = ρ, .h ∈ C0 (Rk ), .f (x) = Ex h(X(t)) in (11.13), and using (11.14) as well, . Tt h(x)π(dx) = Ex h(X(t))π(dx) Rk
Rk
11 Transience and Recurrence of Multidimensional Diffusions
=
ρ
Ex S
0
0
ρ
Ex
= S
ρ
Ex
= S
S
S
S
h(X(t + s))ds π˜ (dx) h(X(u))duπ˜ (dx)
t ρ
h(X(u))duπ˜ (dx)
0 ρ+t
Ex
+
h(Xs+ (t))ds π˜ (dx)
ρ+t
Ex
=
EX(s) h(X(t))ds π˜ (dx)
0
Ex
=
199
h(X(u))duπ˜ (dx) −
Ex S
ρ
t
h(X(u))duπ˜ (dx)
0
(11.16) Using (11.13), one has
Ex
.
S
ρ+t
h(X(u))duπ˜ (dx) = S
ρ
Ex EX˜ 1
Ex
= S
t
t
h(X(u))duπ˜ (dx)
0
h(X(u))duπ˜ (dx).
(11.17)
0
Thus, by (11.16) and (11.17), and again using (11.13),
.
Rk
Tt h(x)π(dx) =
Ex S
0
ρ
h(X(u))duπ˜ (dx) =
Rk
h(x)π(dx).
(11.18)
This proves the invariance of .π as defined by (11.12). To prove part (b) observe that if the expected value (of the return time) is infinite, starting from the boundary of the large sphere, then the invariant measure constructed in part (a) is not finite. Since every sigma-finite invariant measure is a multiple of this one, as follows from Proposition 11.6 or Theorem 11.7, the diffusion has no finite invariant measure. For further analysis, especially concerning the convergence of the positive recurrent diffusion to equilibrium, we consider the discrete parameter Markov chain .{X(n) : n = 0, 1, . . . }, called the discrete skeleton of the diffusion .{X(t) : t ≥ 0}. It is defined to be recurrent or transient as in Definition 11.3, but replacing t by n. We have the following corollary to Proposition 11.3 and Theorem 11.4. Proposition 11.6 Assume Condition (A). If the diffusion is recurrent, then so is its discrete skeleton .{X(n) : n ≥ 0}, and if the diffusion is transient then so is .{X(n) : n ≥ 0}.
200
11 Transience and Recurrence of Multidimensional Diffusions
Proof Let .x0 be a transient point of the diffusion. Then, by definition, so is it for the discrete skeleton. If .x0 is not a transient point of the diffusion, then there exists .N > 0 such that .x0 ∈ B(0 : N) and that .Py (B(0 : N) is reached in finite time) = 1, for all .y ∈ Rk . For arbitrary fixed .ε > 0, let D be the event that .X(n) reaches .B(0 : δ), where .0 < δ < ε, at some time .n ≥ 1. By the strong Feller property, .y → Py (D) is continuous and positive on .B(0 : N) whose minimum is denoted by .λ > 0. If c .λ < 1, then conditioning on the event .D and using the strong Markov property, the conditional probability that .B(0 : δ) is reached, after reaching .B(0 : N) is again at least .λ. Continuing in this way, the probability that by the nth try .B(0 : δ) is reached is . 1≤m≤n−1 λ(1 − λ)m−1 , which converges to 1 as .n → ∞. This shows that .Px0 (X(n), n ≥ 1, reaches B(x0 : ε)) = 1. By the strong Markov property, it follows that .x0 is a recurrent point for .{X(n) : n ≥ 0}. The argument actually shows that .λ = 1. Remark 11.4 Note that .Py (D) = Ey f (X(1)), where f (x) = Px (X(n) reaches B(0 : δ) for some n ≥ 0).
.
Therefore, .y → Py (D) is continuous by the strong Feller property. Focusing on the discrete skeleton is especially useful for the following important result. Theorem 11.7 Assume Condition (A). (a) Suppose .{X(t) : t ≥ 0}, is a recurrent diffusion, and that there exists a measure .ν such that for some .x0 ∈ Rk and .r > 0, .ν(B(x0 : r)) > 0 and p(1; x, A) ≥ ν(A)
∀ x ∈ B(x0 : r), A ∈ B(Rk ).
.
(11.19)
Then the diffusion has a sigma-finite invariant measure .π , unique up to a scalar multiple. (b) In addition to the assumption in (a), assume the diffusion satisfies .
sup
|x−x0 |=r1
Ex τ1 < ∞,
(11.20)
for any .x0 ∈ Rk and .r1 > r > 0, where .τ1 = inf{t ≥ 0 : X(t) ∈ B(x0 : r)} is the escape time from .B(x0 : r). Then the diffusion has a unique invariant probability .π and, moreover, whatever be the initial distribution, the distribution of the after-t process .Xt+ (s) = {X(t + s) : s ≥ 0} converges in total variation norm to .Pπ , as .t → ∞. Proof (a)We have proved that if the diffusion is recurrent so is .{X(n) : n ≥ 0}. The condition (11.19) implies that .{X(n) : n ≥ 0}, is Harris recurrent, and by Theorem 20.4 in Bhattacharya and Waymire (2022), it has a sigma-finite invariant measure .π , unique up to a scalar multiple. Since every such measure is also invariant for the diffusion .{X(t) : t ≥ 0} (Exercise 12), part (a) is proved. To prove part (b), we again consider the discrete process .{X(n) : n ≥ 0}, and prove the corresponding result
11 Transience and Recurrence of Multidimensional Diffusions
201
for it. Note that, given .sup|x−x0 |=r1 Ex τ1 < ∞ for the diffusion, one needs to check sup|x−x0 |=r1 Ex η < ∞ where .η = inf{n > 0 : X(n) ∈ B(x0 : r)}, for given .x0 , .0 < r < r1 . Consider the following stopping times: .τ0 = 0, .τ1 := inf{t > 0 : X(t) ∈ B(x0 , r)}, .τ2 = inf{t > τi : X(t) ∈ ∂B(x0 : r1 )}, .τ2i+1 = inf{t > τ2i : X(t) ∈ B(x0 : r)}, .τ2i+2 = inf{t > τ2i+1 : X(t) ∈ ∂B(x0 : r1 )}, (i = 0, 1, . . . ). Choose .r , .0 < r < r < r1 . Define .τ 2i+1 = inf{t > τ2i+1 : X(t) ∈ B(x0 : r ). Let .Ei = {τ2i+1 − τ2i+1 ≥ 1}, i ≥ 0. Clearly, .η ≤ τ2i+2 on .Ei ; i.e., .η ≤ τ2θ+2 , where .θ := inf{i ≥ 0 : τ2i+1 − τ2i+1 ≥ 1}. By the strong .δ := min{Px (E0 ) : Feller property, n sup |x − x0 | = r1 } > 0. Hence .Ex η ≤ Ex τ1 + ∞ (1 − δ) |x−x0 |=r Ex τ2n+2 ≤ n=0 ∞ n Ex τ1 + n=0 (1−δ) n sup|x−x0 |=r1 Ex (τ2 −τ0 ). Note that .C = sup|x−x0 |=r1 Ex τ1 < ∞ by hypothesis. This implies .sup|x−x0 |=r1 Ex (τ2i+1 − τ2i ) ≤ C, i ≥ 1. Also, .sup|x−x |=r Ex (τ2i+2 − τ2i+1 ) is finite, .i ≥ 1, since the escape time from a compact 0 1 set has finite moments of all orders, uniformly for all initial x in the set. Thus we have proved .supx∈B(x0 :r ) Ex η < ∞. By Theorem 20.4 in Bhattacharya and
.
Waymire (2022), since .{X(n) : n ≥ 0} is locally minorized on .A0 = B(x0 : r ), the discrete time process .{X(n) : n ≥ 0}, has a unique invariant probability .π and whatever be the initial distribution .μ, the distribution .Qn of the after n-process + .Xn = {X(n + m) : m = 0, 1, . . . } converges in total variation distance to the distribution .P˜π , say, of the stationary distribution of .{X(n) : n ≥ 0}, under the initial distribution .π . The desired result (b) for the diffusion easily follows, because the diffusion has a unique invariant probability and the convergence of .Xt+ in total variation as .t → ∞ is an easy consequence of the convergence of .Xn+ as .n → ∞. Remark 11.5 From standard results in PDE (See, e.g., Friedman (1964)), the local minorization (11.19) holds in the case the coefficients of L are Hölder continuous (Exercise 9), ensuring Harris recurrence of .{X(n) : n ≥ 0}. We expect something similar to be true under Condition (A) and recurrence to imply Harris recurrence using the estimates for the transition probability density in Stroock and Varadhan (1979), Chapter 9, Section 2. We now turn to verifiable conditions for null and positive recurrence based on the coefficients of L. Theorem 11.8 Assume Condition (A) holds. a. If for some .r0 > 0 and .z ∈ Rk ,
∞
.
r0
exp(−I z (r))dr = ∞,
∞
r0
1 exp(−I z (r))dr < ∞, α z (r)
then there exists a finite invariant measure (unique up to constant multiples) for the diffusion associated with L. b. If for some .r0 > 0 and .z ∈ Rk ,
202
11 Transience and Recurrence of Multidimensional Diffusions
∞ .
r0
N
exp(−I z (r))dr = ∞,
s r0 exp(−I z (s))( r0 (I z (r))/α z (r))dr)ds = ∞, N N →∞ r0 exp(−I z (r))dr lim
then the recurrent diffusion does not admit a finite invariant measure. ∞ Proof To prove (a), assume for some .r0 > 0 and .z ∈ Rk , . r0 exp(−I z (r))dr = ∞ ∞ and . r0 α 1(r) exp(−I z (r))dr < ∞. Define, z
F (r) = −
r
∞
exp(−I z (s))(
.
s
r0
1 exp{I z (u)}du)ds, α z (u)
r > r0 .
Then, for .r ≥ r0 ,
F (r) = − exp{−I z (r)}
.
r
∞
1 exp{I z (u)}du < 0, α z (u)
and F (r) = −
.
β z (r) 1 . F (r) + α z (r) r
Let f (x) = F (|x − z|),
.
|x − z| ≥ r0 .
Then, using these last two calculations, 2LF (x) ≥
.
Az (x) ≥ 1, αz
|x − z| ≥ r0 .
If .ηB(x0 ;r) , ηN are as in (11.6), then as in the proof of Theorem 11.5, one has
t∧ηN
2Ex F (|X(t ∧ ηN − z|) − 2F (|x − z|) = Ex
2Lf (X(s))ds
.
0
≥ Ex (t ∧ ηN ),
r0 ≤ |x − z| ≤ N.
First, letting .t ↑ ∞ and then .N ↑ ∞, from here one has, since .ηN ↑ ηB(x:η0 )c Px -a.s. (due to recurrence),
.
Ex ηB(x:η0 )c ≤ −2F (|x − z|).
.
Now apply Lemma 9(a).
11 Transience and Recurrence of Multidimensional Diffusions
203
∞ To prove (b) assume that for some .r0 > 0 and .z ∈ Rk , . r0 exp(−I z (r))dr = ∞, N r0
and .limN →∞
s exp(−I z (s))( r (I z (r))/α z (r))dr)ds 0 N r exp(−I z (r)dr
= ∞. Define,
0
G(r) =
r
s
exp(−I z (s))(
.
r0
r0
1 exp{I z (u)}du)ds, α z (u)
|x − z| > r0 ,
and g(x) = G(|x − z|) |x − z| > r0 .
.
Then, for .r ≥ r0 , .G (r) ≥ 0, and .G (r) = − Therefore one has
β z (r) 1 r G (r)+ α z (r) , so that .2Lg(x)
Ex (t ∧ ηN ) ≥ 2Ex G(|X(t ∧ ηN ) − z|) − 2G(|x − z|).
.
≤ 1.
(11.21)
Thus, letting .t ↑ ∞, from here one obtains Ex ηN ≥ 2Ex G(|X(ηN ) − z|) − 2G(|x − z|)
.
= 2Px (ηB(x:N ) < ηB(x:η0 )c )G(N ) − 2G(|x − z|). Now let .N ↑ ∞, using (11.11), |x−z| Ex η ≥ 2 lim
.
N →∞
r0
N r0
exp{−I z (u)}du
exp{−I z (u)}du
G(N) − 2G(|x − z|) = ∞.
The proof is now complete by Lemma 9(b). criterion6
Remark 11.6 The null recurrence presented in Theorem 11.8(b) does not quite match with the one announced by Kha´sminskii, which we are unable to verify. In view of the criteria for transience and recurrence provided by Theorem 8.8, we will focus on criteria for null and positive recurrence to complete the picture in one dimension. Definition 11.6 Let τyx := inf{t ≥ 0 : X(t) = y}, X(0) = x,
.
x, y ∈ R.
A recurrent diffusion is said to be positive recurrent (or ergodic) if .Eτax < ∞ .∀ x, a, and null recurrent if .Eτax = ∞ for some .x, a. 6 To date the authors are unaware of a proof nor counterexample for Kha´sminskii’s announced condition for null recurrence.
204
11 Transience and Recurrence of Multidimensional Diffusions
One-dimensional diffusions admit a very special representation in terms of the direction and rate at which they exit subintervals of .R. As is more fully exploited in Chapter 21, one of Feller’s great innovations for one-dimensional diffusion was to recognize the essential roles of scale and speed functions defined as follows. Fix an arbitrary .x0 ∈ R (e.g., .x0 = 0), and define the scale function s(x) ≡ s(x0 ; x) :=
x
exp{−I (x0 , z)}dz,
.
(11.22)
x0
where
z
I (x, z) :=
.
{2μ(y)/σ 2 (y)}dy,
x
with the convention I (z, x) = −I (x, z), and I (x) ≡ I (0, x).
.
Also define the speed function m(x) ≡ m(x0 ; x) :=
x
.
(2/σ 2 (z)) exp{I (x0 , z)}dz.
(11.23)
x0
With this now observe that one may express the form of the differential operator A=
.
1 2 d2 d σ (x) 2 + μ(x) , 2 dx dx
(11.24)
d d . dm(x) ds(x)
(11.25)
as A=
.
(x) df (x) ds(x) df (x) Here one defines . df ds(x) = dx dx = f (x)/s (x), and . dm(x) = f (x)/m (x) according to the familiar chain rule from calculus. Next observe that the solution of the two-point boundary value problem (8.47):
Aϕ(x) = 0 for c < x < d,
.
ϕ(c) = 0, ϕ(d) = 1,
is simply .c1 s(x)+c2 . In particular, one may apply the boundary conditions to obtain the probability of escaping to the right given in (8.49) in the equivalent form P (τdx < τcx ) =
.
s(d) − s(x) , s(d) − s(c)
c ≤ x ≤ d.
(11.26)
11 Transience and Recurrence of Multidimensional Diffusions
205
Similarly P (τcx < τdx ) =
.
s(x) − s(c) , s(d) − s(c)
c ≤ x ≤ d.
(11.27)
It also now follows that the conditions (8.53) for recurrence may be expressed as s(−∞) = −∞ and .s(∞) = ∞. Next consider another two-point boundary value problem.
.
.
Aγ (x) = −1 γ (c) = 0, γ (d) = 0,
c < x < d,
(11.28)
2
d d where A is the operator . 21 σ 2 (x) dx 2 + μ(x) dx . By Itô’s lemma,
t
Z(t) := γ (Xx (t)) −
.
−1ds ≡ γ (Xx (t)) + t
(t ≥ 0),
(11.29)
0 x is a martingale. With .τ∂x ≡ τ∂(c,d) := τcx ∧ τdx .(c ≤ x ≤ d), .{Z(τ∂x ∧ t) : t ≥ 0} is bounded by .max{|γ (y)| : c ≤ y ≤ d} + τ , and is therefore uniformly integrable, since by Lemma 8.4 .Eτ∂x < ∞. Applying the optional stopping theorem one obtains, noting that .γ (Xx (τ∂x )) = 0,
Eτ∂x = γ (x).
(11.30)
.
Observe that one may express the solution of the two-point boundary value x problem (11.28) as .γ (x) = c1 s(x) − c m(y)ds(y) + c2 . Applying the boundary conditions .γ (c) = γ (d) = 0, yields the solution in the form Eτ∂x = γ (x) =
.
s(c; x) s(c; d)
d c
m(c; y)s (c; y)dy −
x
m(c; y)s (c; y)dy.
(11.31)
c
Thus the criteria for positive recurrence may be expressed as that of recurrence s(−∞) = −∞, s(∞) = ∞, together with
.
Eτcx =
x
.
(m(∞) − m(y))ds(y) < ∞ ∀ x > c ⇐⇒ m(∞) < ∞,
(11.32)
c
and Eτdx =
.
d
(m(y) − m(−∞))ds(y) < ∞ ∀ x < d. ⇐⇒ m(−∞) > −∞.
x
The next result is a one-dimensional version of Theorem 11.7.
(11.33)
206
11 Transience and Recurrence of Multidimensional Diffusions
Theorem 11.9 Suppose .μ(·), σ (·), are locally Lipschitz and .σ 2 (x) > 0 for all x. (a) Let .s(0) = 0. If .s(∞) = ∞ and .s(−∞) = −∞, then the diffusion is recurrent; if one of .s(−∞), .s(∞) is finite, then the diffusion is transient. (b) If the diffusion is recurrent, but at least one of the integrals in (11.32), (11.33) diverges, i.e., if at least one of .m(−∞), .m(∞) is not finite, then the diffusion is null recurrent and it has a sigmafinite invariant measure which is unique up to scalar multiples. (c) Suppose the diffusion is recurrent. If both integrals in (11.32) (11.33) converge, i.e., if .m(−∞) and .m(∞) are both finite, then there is a unique invariant probability, and the after-t process .Xt+ converges in total variation distance to .Pπ as .t → ∞, no matter what the initial distribution may be. Proof We have proved the criteria for recurrence, transience, and of null and positive recurrence above. In order to prove the uniqueness and convergence assertions in (b) and (c), apply the one-dimensional analog of Theorem 11.7 as follows. Let .a < b < c < d. If the diffusion is recurrent, then .[a, b] is a recurrent set. Also, the local minorization condition (11.19) holds, i.e., .p(1; x, A) ≥ ν(A) for all .x ∈ [a, b], A a Borel subset of .[a, b], and .ν(A) = A p(1; x0 , y)dy, where .p(1; x0 , y) = minx∈[a,b] p(1; x, y) > 0 (see Exercise 9). Hence part (b)follows from Theorem 11.7, part (a). For part (c), define the stopping times .τ1 = inf{t ≥ 0 : X(t) ∈ [c, d]}, .τ2 = inf{t > τ1 , X(t) ∈ [a, b]}, .τ2i+1 = inf{t > τ2i , X(t) ∈ [c, d]}, x .τ2i+2 = inf{t > τ2i+1 : X(t) ∈ [a, b]} .(i = 1, 2, . . . ), .τz = inf{t > 0 : d X(t) = z, X(0) = x}. Then .supy∈[c,d] Ey τ2 = Eτb < ∞, by (11.32). Similarly, a .supy∈[a,b] Ey τ1 = Eτc < ∞, by (11.33). Going through the same argument over cycles .[τ2i , τ2i+2 ](i ≥ 1), as in the proof of part (b) of Theorem 11.7, the proof is complete. Proposition 11.10 Suppose .μ(·), σ (·) are locally Lipschitzian with .σ 2 (x) > 0 for all x. Then the diffusion .X = {X(t) : t ≥ 0} admits a unique invariant probability if it is positive recurrent, and the invariant probability is absolutely continuous with respect to Lebsgue measure, i.e. .π(dx) = π(x)dx, where the density .π(x) is given by π(x) = θ −1 ·
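Feller's scale and speed functions reduce the one-dimensional classification in Theorem 11.9 to a pair of quadratures, which is easy to automate. The sketch below computes truncated versions of $s(\pm\infty)$ and $m(\pm\infty)$ for $dX = \mu(X)\,dt + \sigma(X)\,dB$ by crude numerical integration; watching how these truncated integrals grow with the cutoff indicates the classification. The function name, the cutoff, and the examples (Ornstein–Uhlenbeck versus Brownian motion with drift) are illustrative assumptions only.

```python
import numpy as np
from scipy import integrate

def scale_speed_integrals(mu, sigma2, x0=0.0, cutoff=6.0):
    """Truncated s(+-inf) - s(x0) and m(+-inf) - m(x0) from (11.22)-(11.23)."""
    I = lambda x: integrate.quad(lambda y: 2 * mu(y) / sigma2(y), x0, x)[0]
    s_prime = lambda x: np.exp(-I(x))                   # scale density s'(x)
    m_prime = lambda x: 2.0 / sigma2(x) * np.exp(I(x))  # speed density m'(x)
    q = lambda g, a, b: integrate.quad(g, a, b, limit=200)[0]
    return {"s(+inf)": q(s_prime, x0, cutoff), "s(-inf)": q(s_prime, -cutoff, x0),
            "m(+inf)": q(m_prime, x0, cutoff), "m(-inf)": q(m_prime, -cutoff, x0)}

# Ornstein-Uhlenbeck (mu = -x, sigma^2 = 1): the scale integrals blow up with the
# cutoff (recurrence) while the speed integrals stay bounded (positive recurrence).
print(scale_speed_integrals(lambda x: -x, lambda x: 1.0))
# Brownian motion with drift 1: s(+inf) stays bounded as the cutoff grows (transience).
print(scale_speed_integrals(lambda x: 1.0, lambda x: 1.0))
```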
.
1 σ 2 (x)
eI (0,x) = θ −1 m (x),
−∞ < x < ∞,
(11.34)
where .θ −1 is the normalizing constant, and m is the speed function defined by (11.23) with .x0 = 0. Proof To prove absolute continuity of the invariant probability .π(dx), note that if λ(B) = 0, where.λ is Lebesgue measure on .R, then .p(1; x, B) = 0 for all .x ∈ R. Hence .π(B) = R p(1; x, B)π(dx) = 0. Next, fix .c < d. For any initial .X(0), define
.
.
η1 ≡ η1 (X) := τc (X) = inf{t ≥ 0 : X(t) = c}, η2 := inf{t > η1 : X(t) = d},
(11.35) η2n+1 := inf{t > η2n : X(t) = c},
η2n+2 := inf{t > η2n+1 : X(t) = d}, (n ≥ 1).
11 Transience and Recurrence of Multidimensional Diffusions
207
Since .(η2i+1 − η2i−1 ) .(i = 1, 2, . . . ) are i.i.d., it easily follows from the strong law of large numbers that the unique invariant probability is given by π(B) :=
.
1 E E(η3 − η1 )
η3
1B (X(s))ds
B ∈ B(R),
(11.36)
η1
To show that .π(x) in (11.34) is the invariant density, let f be a twice continuously differentiable nonnegative function on .R with compact support. By (11.36), letting .π(dy) be the invariant probability, η3 1 E f (X(s))ds E(η3 − η1 ) η1 η2 η3 f (X(s))ds + f (X(s))ds] = c0 E[
.
R
f (y)π(dy) =
η1
η2
τd
= c0 [E
f (X (s))ds + E c
0
τc
f (Xd (s))ds]
0
= c0 [ϕ1 (d) + ϕ2 (c)],
(11.37)
where .c0 := E(η31−η1 ) , say, and, writing .Xx to denote the process X starting to the martingales .ϕi (Xx (t)) − t at x, xby the optional stopping rule applied τc x i (X (s))ds, i = 1, 2, .ϕ1 (x) := E 0 f (X (s))ds .(x ≥ c), and .ϕ2 (x) := 0Aϕ τd x E 0 f (X (s))ds .(x ≤ d), where .ϕ1 , ϕ2 are the solutions of Aϕ1 (x) = −f (x) (x > c), ϕ1 (c) = 0, ϕ1 (x) → finite limit as x ↑ +∞,
.
(11.38) Aϕ2 (x) = −f (x) (x < d), ϕ2 (d) = 0, ϕ2 (x) → finite limit as x ↓ −∞. Now suppose that f vanishes off .(a, b) and, use .A = conditions at c, d, respectively, to integrate (11.38) as
x
ϕ1 (x) =
.
b
{
c
d d dm ds
and the boundary
f (y)m (y)dy}s (z)dz
(11.39)
f (y)m (y)dy}s (z)dz
(11.40)
z
and
d
ϕ2 (x) =
.
x
{
z
a
Now simply check that c0 [ϕ1 (d) + ϕ2 (c)] =
.
R
f (x)π(x)dx,
The result now follows from (11.37).
π(x) := θ
1 σ 2 (x)
eI (0,x) .
(11.41)
208
11 Transience and Recurrence of Multidimensional Diffusions
d bounded f , one Remark 11.7 Since . dt R Tt f (x)π(dx) = 0 for all (smooth) way to look for the invariant density is to try to solve . R Af (x)π(x)dx = 0. In particular, after an integration by parts (twice), one has . R f (y)A∗ π(y)dy = 0, ∂ 2 ( 1 σ 2 (y)π(y))
2 where .A∗ π(y) := − ∂(μ(y)π(y)) . From here one may compute the ∂y ∂y 2 asserted invariant probability in Proposition 11.10 by solving .A∗ π = 0 with decay at infinities (required for integrability of .π ).
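Formula (11.34) is straightforward to evaluate numerically. The sketch below computes the unnormalized density $\sigma^{-2}(x)e^{I(0,x)}$ for the Ornstein–Uhlenbeck diffusion $dX = -X\,dt + dB$, normalizes it by quadrature, and compares it with the N(0, 1/2) density, the known stationary law of this process; the names and the example are illustrative assumptions.

```python
import numpy as np
from scipy import integrate, stats

mu, sigma2 = (lambda x: -x), (lambda x: 1.0)     # Ornstein-Uhlenbeck coefficients

def unnormalized_pi(x):
    I = integrate.quad(lambda y: 2 * mu(y) / sigma2(y), 0.0, x)[0]
    return np.exp(I) / sigma2(x)                 # proportional to m'(x) in (11.34)

xs = np.linspace(-4.0, 4.0, 401)
vals = np.array([unnormalized_pi(x) for x in xs])
pi = vals / integrate.trapezoid(vals, xs)        # normalize numerically

print(np.max(np.abs(pi - stats.norm(scale=np.sqrt(0.5)).pdf(xs))))   # close to 0
```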
Example 1 (Stochastic Logistic Model) The logistic model with additive noise is given by the Itô equation

dX(t) = rX(t)(k − X(t))dt + σX(t)dB(t),   X(0) = x > 0,   (r > 0; k > 0; σ ≠ 0).   (11.42)

If σ = 0, then it reduces to the deterministic logistic growth model.7 The state "0" is an absorbing state, i.e., if the process starts at 0 or reaches 0, then it stays there forever (it is the "state of extinction"). If σ ≠ 0, the diffusion8 does not reach zero, starting from x > 0. Hence we take the state space to be (0, ∞). The value k is often called the carrying capacity of the population: it represents the population size that the environment (available resources) can carry or sustain. A pathwise solution to the SDE is given in Chapter 8, Example 7. The reader is invited to provide the proofs of the following results using the results of this chapter (Exercise 11).

A. 2rk/σ² < 1: The process approaches 0 as t → ∞, i.e., the population is "threatened with extinction" or "becomes endangered". The diffusion is transient, and the probability of actually hitting 0 in finite time is zero, if the process starts at x > 0.
B. 2rk/σ² = 1: In this case the diffusion on (0, ∞) is null recurrent.
C. 2rk/σ² > 1: The diffusion {X(t) : t ≥ 0} is positive recurrent on (0, ∞), and approaches a unique steady state as t → ∞, no matter what the initial state may be.

With harvesting, the model is, for r > 0, k > 0, t > 0, θ > 0, given by

dX(t) = rX(t)(k − X(t))dt + θ dt + σX(t)dB(t),   X(0) = x > 0,   (11.43)

where the positive constant θ denotes the rate of consumption or harvesting. In this model X(t) → 0 as t → ∞, but never reaches zero. Next consider a model in which the harvesting rate is proportional to the stock, i.e., for r > 0, k > 0, σ ≠ 0,

dX(t) = rX(t)(k − X(t))dt − θX(t)dt + σX(t)dB(t),   X(0) = x > 0.   (11.44)

7 See Bacaer (2011) for an interesting historical account of this model.
8 The results for the stochastic model (σ ≠ 0) are given in Bhattacharya and Majumdar (2022). Multiplicative random perturbations on population growth are treated in the Special Topics Chapter 26.
Then
a. The diffusion is transient if and only if kr − θ < σ²/2.
b. If kr − θ = σ²/2, the process is null recurrent, and
c. If kr − θ > σ²/2, the diffusion {X(t) : t ≥ 0} is positive recurrent and it converges to a unique steady state (invariant probability).

For additional examples of compelling mathematical interest see those at the end of the special topics Chapter 14.

Remark 11.8 In the one-dimensional case, the asymptotic criteria for transience, recurrence, and positive recurrence in Theorem 11.9 hold with Feller's scale function s(·) and speed function m(·) (see Chapter 21, Definition 21.2) in place of the corresponding quantities defined by (11.22), (11.23), thus relaxing condition (A). In the multidimensional case, condition (A) may be relaxed a little by allowing isolated singularities, as long as the integral conditions in Theorems 11.5, 11.8 hold.
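As a quick numerical companion to Example 1 (and to Exercise 11 below), the following sketch simulates (11.42) by a crude Euler–Maruyama scheme in the three regimes A, B, C: the transient case drifts toward 0, the positive recurrent case fluctuates about a steady state, and the borderline case is included only for completeness. The parameter values, step size, and the clipping of the discretized path at a tiny positive level are illustrative choices only, not part of the text.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_logistic(r, k, sigma, x0, T=200.0, dt=1e-3):
    """Crude Euler-Maruyama path of dX = r X (k - X) dt + sigma X dB, X(0) = x0."""
    n = int(T / dt)
    x = np.empty(n + 1)
    x[0] = x0
    for i in range(n):
        dB = rng.normal(scale=np.sqrt(dt))
        x[i + 1] = x[i] + r * x[i] * (k - x[i]) * dt + sigma * x[i] * dB
        x[i + 1] = max(x[i + 1], 1e-12)   # keep the discretized path in (0, oo); illustration only
    return x

# With r = k = 1, the regimes of Example 1 correspond to 2rk/sigma^2 < 1, = 1, > 1.
for sigma in (2.0, np.sqrt(2.0), 0.5):
    path = simulate_logistic(r=1.0, k=1.0, sigma=sigma, x0=0.5)
    print(f"2rk/sigma^2 = {2.0 / sigma**2:4.2f}:  X(T) = {path[-1]:.3e},  time average = {path.mean():.3f}")
```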
Exercises

1. Under the hypothesis of Lemma 3, prove that the distribution p(x₀, dy) of X(τ_B) on ∂B has full support ∂B. [Hint: If p(x₀, dy) does not have full support on ∂B, then there exists a non-empty open subset O ⊂ ∂B such that p(x₀, O) = 0. Pick a point z ∈ O, and let ω₀ ∈ C([0, ∞) : R^k) be such that ω₀(t) = z for some t. This implies that the open subset V of C([0, ∞) : R^k) with ω(0) = x, and comprising all those ω such that ω(t) ∈ O, has probability zero.]
2. Let the dimension k = 1, and μ(y) = 0, σ²(y) = (1 + |y|^a), y ∈ R. (i) Show that the diffusion is recurrent, whatever be a ∈ R. (ii) Show that the diffusion is null recurrent if a ≤ 1. (iii) Are there values of a for which the diffusion is positive recurrent?
3. Let the dimension k = 1, and μ(y) = θ ≠ 0, σ²(y) = (1 + |y|^a), y ∈ R. (i) Show that the diffusion is transient for a < 1, and positive recurrent for a > 1. (ii) What happens if a = 1?
4. Let the dimension k = 1, and μ(y) = 0, σ²(y) = e^{−|y|}, y ∈ R. Show that the diffusion is positive recurrent and compute its invariant distribution.
5. Classify the following diffusions as transient, null recurrent or positive recurrent.
   (i) μ(x) ≡ μ ≠ 0, σ²(x) ≡ σ² > 0.
   (ii) μ(x) ≡ 0, σ²(x) ≡ σ² > 0.
   (iii) μ(x) = βx (β > 0), σ²(x) ≡ σ² > 0.
   (iv) μ(x) = βx (β < 0), σ²(x) ≡ σ² > 0.
6. Consider a dimension k ≥ 2. Let the drift vector μ(·) = 0, and the diffusion matrix given by (1 + |y|^a)I_k, y ∈ R^k, for some constant a, I_k being the identity matrix. (i) Show that, for all a, the diffusion is recurrent if k = 2 and transient if k > 2. (ii) Prove that if a > 2, the diffusion is positive recurrent for k = 2.
7. Let k ≥ 2 be the dimension, and let μ(y) = θy for some θ > 0, and let the diffusion matrix be (1 + |y|^a)I_k. (i) Check that for all a < 2, the diffusion is transient, whatever be the dimension k ≥ 2. (ii) Show that for a > 2, the diffusion is recurrent for k = 2, and transient for k > 2. In this model the diffusion is transient for k > 2, whatever be a. (iii) Show that for k = 2, the diffusion is null recurrent if a ≥ 2.
8. Next let k ≥ 1, and consider the model in Exercise 7, but with θ < 0. Show that the diffusion is positive recurrent for all values of a.
9. Assume that the coefficients of L are Hölder continuous and ((a_{ij}(x))) is positive definite for all x. Prove that p(1; x, y) is bounded away from zero on every disc B(x₀, δ). [Hint: Under the Dirichlet (or zero-)boundary condition, the fundamental solution p_N(t; x, y) to the initial-boundary value problem for L on B(0 : N) is continuous in t, x, y (t > 0), and for all x, y ∈ B(0 : N). In particular, inf_{x∈B(x₀:δ)} p_N(1; x, y) = p_N(1; x₀, y), say, is positive if B(x₀ : δ) ⊂ B(0 : N). Since, as N → ∞, p_N(1; x, y) ↑ p(1; x, y), the transition probability of the diffusion at t = 1, it follows that p(1; x, y) ≥ p_N(1; x₀, y) for all x ∈ B(x₀ : δ).]
10. Suppose {X(t) : t ≥ 0} is a positive recurrent diffusion satisfying the condition in Lemma 9(a). Prove that the distribution of the after-t process X_t^+ = {X(t + s) : s ≥ 0} under the distribution μ converges in variation distance to P_π as t → ∞, where π is the invariant probability of the diffusion. [Hint: By Theorem 11.7, the distribution P̃, say, of X_n^+ = {X(n + m) : m = 0, 1, 2, . . .} converges in variation distance to P_π̃, the distribution of {X(n) : n = 0, 1, 2, . . .} under the invariant initial distribution π. In particular, p(n; x, dy) converges in variation distance to π as n → ∞. Use the Chapman–Kolmogorov equation p(t; x, dy) = ∫_{R^k} p(n; x, dz) p(t − n; z, dy), t > n, to argue that p(t; x, dy) converges in variation distance to π(dy) as t → ∞. Finally, for an arbitrary Borel set B ⊂ C([0, ∞) : R^k), use P_x(X_t^+ ∈ B) = ∫_{R^k} P_y(B) p(t; x, dy) to argue that P_x(X_t^+ ∈ B) → P_π(B), uniformly for all such Borel subsets B.]
11. Provide the details to verify each case of Example 1.
12. For the proof of part (a) of Theorem 11.7, prove that a sigma-finite invariant measure for the discrete skeleton {X(n) : n ≥ 0} is a sigma-finite invariant measure for the diffusion {X(t) : t ≥ 0}. [Hint: Suppose π is a sigma-finite invariant measure of {X(n) : n ≥ 0}. Choose any non-negative, bounded measurable function f on R^k such that f = 0 on R^k\B, where B is compact and π(B) < ∞. Consider {X(n) : n ≥ 0}. One has ∫_{R^k} E_x(f(X(n))) π(dx) = ∫_{R^k} f(x) π(dx), n = 0, 1, 2, . . ., by assumption. Also, by applying the above equality to T_t f in place of f, one has ∫_{R^k} T_{1+t} f dπ = ∫_{R^k} T_t f dπ for all 0 ≤ t ≤ 1. Hence the blocks {∫_{R^k} T_t f dπ : n ≤ t < n + 1} are all identical. Thus, it follows that (1/t) ∫_{[0,t]} (∫_{R^k} T_s f(x) π(dx)) ds → c(f) := ∫_{[0,1]} (∫_{R^k} T_s f(x) π(dx)) ds, as t → ∞. Apply this to the function T_t f to show that π is invariant under time shift of the diffusion.]
Chapter 12
Criteria for Explosion
The phenomenon of explosion of a multidimensional nonsingular diffusion is considered in this chapter. In particular, conditions are provided that are sufficient for both explosion and nonexplosion. In one dimension, the conditions are also necessary.
Recall the simple Example 2 of Chapter 7 for a stark illustration of explosion of a diffusion having locally Lipschitzian coefficients. We derive in this chapter verifiable criteria for explosion (see Section 7.3). Throughout this chapter it is assumed, unless otherwise specified,1 that

μ(·), σ(·) satisfy Condition (A) of Chapter 11.   (12.1)
First consider the case of one-dimensional diffusions. It is convenient to use Feller's scale and speed functions, given by (11.22) and (11.23) respectively:

s(x) := ∫_0^x e^{−I(0,y)} dy = ∫_0^x exp{−∫_0^y (2μ(z)/σ²(z)) dz} dy,   (12.2)

m(x) := ∫_0^x (2/σ²(y)) e^{I(0,y)} dy = ∫_0^x (2/σ²(y)) exp{∫_0^y (2μ(z)/σ²(z)) dz} dy.   (12.3)
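A minimal numerical sketch of how (12.2)–(12.3) can be used in practice: the code below approximates s(±R) and m(±R) on a grid and reads off the recurrence/transience and positive-recurrence criteria (Theorem 11.9) from how these values behave as R grows. The two drift choices, an Ornstein–Uhlenbeck-type drift μ(x) = −x and a constant drift μ(x) ≡ 1, both with σ ≡ 1, are illustrative assumptions, not examples from the text.

```python
import numpy as np

def scale_speed(mu, sigma2, R, n=40001):
    """Approximate s(R), s(-R), m(R), m(-R) of (12.2)-(12.3) by trapezoidal sums."""
    y = np.linspace(-R, R, n)
    dy = y[1] - y[0]
    i0 = n // 2                                    # index of y = 0 (n is odd)
    f = 2.0 * mu(y) / sigma2(y)
    F = np.concatenate(([0.0], np.cumsum((f[1:] + f[:-1]) * dy / 2.0)))
    I = F - F[i0]                                  # I(0, y)
    s_int, m_int = np.exp(-I), 2.0 * np.exp(I) / sigma2(y)
    S = np.concatenate(([0.0], np.cumsum((s_int[1:] + s_int[:-1]) * dy / 2.0)))
    M = np.concatenate(([0.0], np.cumsum((m_int[1:] + m_int[:-1]) * dy / 2.0)))
    return S[-1] - S[i0], S[0] - S[i0], M[-1] - M[i0], M[0] - M[i0]

one = lambda y: np.ones_like(y)
for name, mu in [("mu(x) = -x (positive recurrent)", lambda y: -y),
                 ("mu(x) = +1 (transient)         ", lambda y: np.ones_like(y))]:
    for R in (4.0, 8.0):
        sR, s_mR, mR, m_mR = scale_speed(mu, one, R)
        print(f"{name}  R={R}:  s(R)={sR:10.3e}  s(-R)={s_mR:10.3e}  m(R)={mR:12.3f}  m(-R)={m_mR:12.3f}")
```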
1 The positivity condition is removed in the treatment of one-dimensional diffusions in Chapter 21.
Note that a diffusion X^x, x ∈ R, having drift μ(·) and diffusion coefficient σ²(·) has the generator

L = (1/2) σ²(x) d²/dx² + μ(x) d/dx = (d/dm(x)) (d/ds(x)).   (12.4)

By Proposition 8.7, for all c < x < d,

P(τ_d^x < τ_c^x) = (s(x) − s(c))/(s(d) − s(c)),   ρ_{xd} = (s(x) − s(−∞))/(s(d) − s(−∞)),
P(τ_c^x < τ_d^x) = (s(d) − s(x))/(s(d) − s(c)),   ρ_{xc} = (s(∞) − s(x))/(s(∞) − s(c)),   (12.5)
with the convention ∞/∞ = 1. By Theorem 8.8, the diffusion is recurrent if and only if s(−∞) = −∞ and s(∞) = ∞ (also see Exercise 2). Here ρ_{xy} is the probability that, starting at x, the diffusion ever reaches y. The following result shows that if the diffusion is nonrecurrent (i.e., if it is transient), then the diffusion X^x(t) (t ≥ 0) approaches an infinite limit +∞ or −∞ as t increases. In one dimension let us set X^x(t) = ∞ for t ≥ ζ^x if X^x(t) → ∞ as t ↑ ζ^x < ∞, where ζ^x is the explosion time. Similarly, set X^x(t) = −∞ for t ≥ ζ^x if X^x(t) → −∞ as t ↑ ζ^x < ∞.

Proposition 12.1
a. If s(−∞) = −∞ and s(∞) < ∞, then

P(inf_{t≥0} X^x(t) > −∞ and lim_{t↑ζ^x} X^x(t) = +∞) = 1.   (12.6)

b. If s(−∞) > −∞ and s(∞) = ∞, then

P(sup_{t≥0} X^x(t) < ∞ and lim_{t↑ζ^x} X^x(t) = −∞) = 1.   (12.7)

c. If s(−∞) > −∞ and s(∞) < ∞, then

P(inf_{t≥0} X^x(t) > −∞ and lim_{t↑ζ^x} X^x(t) = +∞) = (s(x) − s(−∞))/(s(∞) − s(−∞)),
P(sup_{t≥0} X^x(t) < ∞ and lim_{t↑ζ^x} X^x(t) = −∞) = (s(∞) − s(x))/(s(∞) − s(−∞)).   (12.8)
Proof (a) By (12.5), ρ_{xd} = 1 for all x < d, if s(−∞) = −∞. Letting d ↑ ∞, one gets P(sup_{t≥0} X^x(t) = ∞) = 1. If, also, s(∞) < ∞, then by (12.5), ρ_{xc} ↓ 0 as c ↓ −∞ (x > c). Hence P(inf_{t≥0} X^x(t) = −∞) = 0. Let M > x be an integer. Then, letting N be an integer greater than M, and using the strong Markov property with τ_N^x,

P(lim_{t↑ζ^x} X^x(t) < M) = P(lim_{t↑ζ} X^N(t) < M) ≤ ρ_{NM} ≡ P(X^N(t) = M for some t ≥ 0) → 0 as N ↑ ∞.

Hence P(lim_{t↑ζ^x} X^x(t) < M) = 0 for every M < ∞, proving (12.6). The proof of (b) is entirely analogous.
(c) Let x > c. Then P(X^x never reaches c) ≡ 1 − ρ_{xc} = 1 − (s(∞) − s(x))/(s(∞) − s(c)) ↑ (s(x) − s(−∞))/(s(∞) − s(−∞)) (as c ↓ −∞) = P(inf_{t≥0} X^x(t) > −∞). Similarly, P(sup_{t≥0} X^x(t) < ∞) is given by (s(∞) − s(x))/(s(∞) − s(−∞)). To prove (12.8), it is then enough to show that X^x(t) goes to either +∞ or −∞ as t ↑ ζ^x, outside a P-null set. For this, let M be a large integer such that −M < x < M. Let N > M. Then, by (12.5) applied to c = −N and d = N, τ^x_{−N,N} := τ^x_{−N} ∧ τ^x_N < ∞ a.s. Using the strong Markov property with respect to τ^x_{−N,N} = τ, say, one has

P(|X^x(t)| > M ∀ t > τ^x_{−N,N})
 = 1 − E{P(X^y reaches [−M, M])_{y=X^x(τ)}}
 = 1 − E[1_{[X^x(τ)=−N]} P(X^{−N} reaches −M) + 1_{[X^x(τ)=N]} P(X^N reaches M)]
 = 1 − {P(τ^x_{−N} < τ^x_N) ρ_{−N,−M} + P(τ^x_N < τ^x_{−N}) ρ_{N,M}} → 1,

since ρ_{−N,−M} = (s(−N) − s(−∞))/(s(−M) − s(−∞)) and ρ_{N,M} = (s(∞) − s(N))/(s(∞) − s(M)) both go to zero as N ↑ ∞. Hence P(lim_{t↑ζ^x} |X^x(t)| ≥ M) = 1. Now, let M ↑ ∞ to get the desired result P(lim_{t↑ζ^x} |X^x(t)| = ∞) = 1.

We are now ready to derive Feller's necessary and sufficient condition for explosion. To simplify notation, we will omit the superscript x in τ_z^x := inf{t ≥ 0 : X^x(t) = z} and write it as τ_z. Define

τ_{+∞} := lim_{n↑∞} τ_n,   τ_{−∞} := lim_{n↑∞} τ_{−n},   (12.9)
and note that by Proposition 12.1, if at least one of s(−∞), s(∞) is finite, then lim_{t↑ζ^x} X^x(t) exists and equals +∞ or −∞, a.s., and

ζ^x = τ_{+∞} on [lim_{t↑ζ^x} X^x(t) = +∞],   ζ^x = τ_{−∞} on [lim_{t↑ζ^x} X^x(t) = −∞].   (12.10)
Also, in the case .s(−∞) = −∞ and .s(∞) = ∞, the diffusion is recurrent (Theorem 8.8) and, clearly, .ζ x = ∞ a.s. for every x, as are .τ+∞ and .τ−∞ . Hence,
explosion occurs (with positive probability) if and only if P(τ_{+∞} < ∞) > 0 or P(τ_{−∞} < ∞) > 0, for some x.

Theorem 12.2 (Feller's Test for Explosion) Let k = 1, I(z) := ∫_0^z (2μ(v)/σ²(v)) dv.
a. Suppose the integrals (i), (ii) below both diverge. Then P(ζ^x = ∞) = 1 for all x.
b. If the integral (i) converges, then P(τ_{+∞} < ∞) > 0, and if the integral (ii) converges, then P(τ_{−∞} < ∞) > 0.

(i) ∫_0^∞ exp{−I(y)} [ ∫_0^y (exp{I(v)}/σ²(v)) dv ] dy,
(ii) ∫_{−∞}^0 exp{−I(y)} [ ∫_y^0 (exp{I(v)}/σ²(v)) dv ] dy.   (12.11)
Proof (a) Suppose the integrals in (12.11) both diverge. We will construct a nonnegative function ϕ(z) such that (1) ϕ(z) ↑ ∞ as 0 ≤ z ↑ ∞ and ϕ(z) ↑ ∞ as z ↓ −∞ (z < 0), and (2) Aϕ(z) ≤ ϕ(z) for all z. Given such a function, let ψ(t, z) := e^{−t} ϕ(z). By Itô's lemma and Doob's optional stopping rule,

Eψ(τ_n ∧ τ_{−n} ∧ t, X^x(τ_n ∧ τ_{−n} ∧ t)) − ϕ(x) = E ∫_0^{τ_n ∧ τ_{−n} ∧ t} e^{−s} [Aϕ(X^x(s)) − ϕ(X^x(s))] ds ≤ 0.   (12.12)

This means ϕ(x) ≥ Eψ(τ_n ∧ τ_{−n} ∧ t, X^x(τ_n ∧ τ_{−n} ∧ t)), so that

ϕ(x) ≥ e^{−t} (ϕ(n) ∧ ϕ(−n)) P(τ_n ∧ τ_{−n} ≤ t),   P(τ_n ∧ τ_{−n} ≤ t) ≤ e^t ϕ(x)/(ϕ(n) ∧ ϕ(−n)).

Letting n ↑ ∞ in the last inequality, we get P(ζ^x ≤ t) = 0 ∀ t ≥ 0. This implies P(ζ^x < ∞) = 0. To construct such a ϕ, write

ϕ₁(z) := ∫_0^z e^{−I(y)} m(y) dy = 2 ∫_0^z exp{−I(y)} [ ∫_0^y (exp{I(v)}/σ²(v)) dv ] dy,   z ≥ 0,
ϕ₁(z) := −∫_z^0 e^{−I(y)} m(y) dy = 2 ∫_z^0 exp{−I(y)} [ ∫_y^0 (exp{I(v)}/σ²(v)) dv ] dy,   z < 0,   (12.13)
and check that .Lϕ1 (z) = 1 .(z ∈ R). Now define .ϕ(z) := 1 + ϕ1 (z), which satisfies all the requirements above. (b) Assume first that the integral (i) in (12.11) converges. We will construct a function .ϕ(z) such that (1) .ϕ(z) increases to a finite positive limit as .z ↑ ∞,
ϕ(0) = 0, and (2) Lϕ(z) ≥ ϕ(z) ∀ z > 0. Given such a function, consider ψ(t, z) := e^{−t} ϕ(z), and apply Itô's lemma to get, as in (12.12) but with τ = τ_n ∧ τ_0 ∧ t and with '≥ 0' in place of '≤ 0',

ϕ(x) ≤ Eψ(τ_n ∧ τ_0 ∧ t, X^x(τ_n ∧ τ_0 ∧ t)) ≤ ϕ(n) P(τ_n ≤ τ_0 ∧ t) + e^{−t},   x > 0,

since ϕ(0) = 0. From this, one gets P(τ_n ≤ τ_0 ∧ t) + e^{−t} ≥ ϕ(x)/ϕ(n), and letting n → ∞, we have

e^{−t} + P(τ_{+∞} ≤ τ_0 ∧ t) ≥ ϕ(x)/ϕ(∞) > 0.

Now let t → ∞ to derive the relation

P(τ_{+∞} < ∞) ≥ ϕ(x)/ϕ(∞) > 0.   (12.14)
It remains to construct ϕ. For this, let ϕ₀(z) = 1, and recursively define (note that ϕ₁(z) is as given in (12.13) for z ≥ 0)

ϕ_m(z) := 2 ∫_0^z exp{−I(y)} [ ∫_0^y (e^{I(v)}/σ²(v)) ϕ_{m−1}(v) dv ] dy,   (z ∈ R), m ≥ 1.   (12.15)

Check that Lϕ_m(z) = ϕ_{m−1}(z). Observe that ϕ_m(z) ↑ as z ↑ (z ≥ 0) and, as we will see by induction,

ϕ_m(z) ≤ ϕ₁^m(z)/m!   ∀ m ≥ 1, z ≥ 0.   (12.16)
Assuming this for a given m, one has

ϕ_{m+1}(z) ≤ 2 ∫_0^z exp{−I(y)} [ ϕ_m(y) ∫_0^y (exp{I(v)}/σ²(v)) dv ] dy
 ≤ 2 ∫_0^z exp{−I(y)} (ϕ₁^m(y)/m!) [ ∫_0^y (exp{I(v)}/σ²(v)) dv ] dy
 = ∫_0^z dϕ₁^{m+1}(y)/(m + 1)! = ϕ₁^{m+1}(z)/(m + 1)!,
proving (12.16), since it clearly holds for m = 1. Now, by assumption ϕ₁(z) < ϕ₁(∞) < ∞. Hence, there exists a positive integer r such that ϕ₁^r(z)/r! < 1 for all z; applying the preceding argument with ϕ(z) := Σ_{m=1}^r ϕ_m(z) (as in the proof of Theorem 12.3(b) below) yields P(τ_{+∞} < ∞) > 0. The case P(τ_{−∞} < ∞) > 0, when the integral (ii) in (12.11) converges, is proved analogously (see Exercise 5).

Definition 12.1 In the case that P(ζ^x = ∞) = 1 for all x, we say that the diffusion is nonexplosive or conservative. Note that if the diffusion X on I is nonexplosive then P_x(X(t) ∈ I) = 1 for all x ∈ I.

To derive a criterion for explosion for a diffusion X^x(t), t ≥ 0, in dimension k > 1, the following notation will be used. Recall the notation of (12.1).
D(x) = ((a_{ij}(x))) := σ(x)σ(x)′,   A(x) := Σ_{i,j} a_{ij}(x) x^{(i)} x^{(j)}/|x|²,
B(x) := 2 Σ_i x^{(i)} μ^{(i)}(x),   C(x) := Σ_i a_{ii}(x),
ᾱ(r) := max_{|x|=r} A(x),   β̄(r) := max_{|x|=r} [(B(x) + C(x))/A(x) − 1],   Ī(r) := ∫_{r₀}^r (β̄(u)/u) du,
α̲(r) := min_{|x|=r} A(x),   β̲(r) := min_{|x|=r} [(B(x) + C(x))/A(x) − 1],   I̲(r) := ∫_{r₀}^r (β̲(u)/u) du,   (12.17)

where r₀ > 0 is a given constant. It is simple to check that for every real-valued twice continuously differentiable function F on (0, ∞), for the function ϕ(x) := F(|x|), one has (Exercise 4)

2(Aϕ(x)) = A(x) F″(|x|) + [(B(x) + C(x) − A(x))/|x|] F′(|x|),   (|x| > 0).   (12.18)
Theorem 12.3 (Khas'minskii's Test for Explosion)
a. P(ζ < ∞) = 0 if ∫_1^∞ e^{−Ī(r)} ( ∫_1^r (exp{Ī(u)}/ᾱ(u)) du ) dr = ∞,
b. P(ζ < ∞) > 0 if ∫_1^∞ e^{−I̲(r)} ( ∫_1^r (exp{I̲(u)}/α̲(u)) du ) dr < ∞.

Proof (a) Assume that the right side of (a) diverges. Fix r₀ ∈ [0, |x|). Define the following functions on (r₀, ∞),
ϕ₀(v) ≡ 1,   ϕ_n(v) = 2 ∫_{r₀}^v e^{−Ī(r)} ( ∫_{r₀}^r (exp{Ī(u)} ϕ_{n−1}(u)/ᾱ(u)) du ) dr,   n ≥ 1.   (12.19)
Then the radial functions ϕ_n(|y|) on (r₀ ≤ |y| < ∞) satisfy Lϕ_n(|y|) ≤ ϕ_{n−1}(|y|) (Exercise 4). Now, as in the proof of Theorem 12.2, apply Itô's lemma to the function

ϕ(t, y) := exp{−t} ϕ(|y|),   where ϕ(|y|) = 1 + ϕ₁(|y|),

extended to all of R^k as a twice continuously differentiable function. Then, writing τ_R := inf{t ≥ 0 : |X^x(t)| = R},

Eϕ(τ_{r₀} ∧ τ_R ∧ t, X^x(τ_{r₀} ∧ τ_R ∧ t)) ≤ ϕ(|x|),

or

e^{−t} ϕ(R) P(τ_R ≤ τ_{r₀} ∧ t) ≤ ϕ(|x|),   P(τ_R ≤ τ_{r₀} ∧ t) ≤ e^t ϕ(|x|)/ϕ(R).
Letting .R → ∞, we get .P (ζ ≤ τr0 ∧ t) = 0 for all .t ≥ 0. Now let .r0 ↓ 0 to get P (ζ < t) = 0 .∀ t, since .τr0 → ∞ a.s. as .r0 ↓ 0. It follows that .P (ζ < ∞) = 0. (b) This follows by replacing upper bars by lower bars in the definition (12.19) and noting that for the resulting functions, .Lϕr (|y|) ≥ ϕr−1 (|y|). The rest of the proof follows the proof of the second half of part (a) of Theorem 12.2, with r r .ϕ(|y|) = m=1 ϕm (|y|), where r is so chosen that .ϕ1 (∞)/r! < 1 (Exercise 4). .
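Before turning to the exercises, here is a small numerical illustration of Feller's one-dimensional test (Theorem 12.2): truncated versions of integral (i) in (12.11) either stabilize (suggesting convergence, hence explosion with positive probability) or keep growing with the truncation level (suggesting divergence, hence no explosion). The two drifts μ(x) = x and μ(x) = x², with σ ≡ 1, and the truncation levels are illustrative assumptions only.

```python
import numpy as np

def feller_integral_i(mu, sigma2, Y, n=4000):
    """Truncated integral (i) of (12.11): int_0^Y exp{-I(y)} [ int_0^y exp{I(v)}/sigma^2(v) dv ] dy."""
    y = np.linspace(0.0, Y, n + 1)
    dy = y[1] - y[0]
    f = 2.0 * mu(y) / sigma2(y)
    I = np.concatenate(([0.0], np.cumsum((f[1:] + f[:-1]) * dy / 2.0)))       # I(y)
    g = np.exp(I) / sigma2(y)                                                 # inner integrand
    inner = np.concatenate(([0.0], np.cumsum((g[1:] + g[:-1]) * dy / 2.0)))   # int_0^y ...
    outer = np.exp(-I) * inner
    return float(((outer[1:] + outer[:-1]) * dy / 2.0).sum())

one = lambda y: np.ones_like(y)
for name, mu in [("mu(x) = x   (integral (i) diverges: no explosion)", lambda y: y),
                 ("mu(x) = x^2 (integral (i) converges: explosion)  ", lambda y: y ** 2)]:
    vals = [feller_integral_i(mu, one, Y) for Y in (3.0, 5.0, 7.0)]
    print(name, " truncated at Y = 3, 5, 7:", ", ".join(f"{v:.3f}" for v in vals))
```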
Exercises

1. Prove that no explosion occurs for a one-dimensional diffusion if μ(·) ≡ 0, and σ²(·) > 0 is locally Lipschitzian.
2. Check the following examples of drift and diffusion coefficients for recurrence, transience, explosion, or nonexplosion, where k is a nonnegative integer, β, σ are positive parameters.
   (a) μ(x) = −βx^{2k+1}, σ(x) = σ.
   (b) μ(x) = −βx^{2k}, σ(x) = σ.
   (c) μ(x) = −βx^{2k+1}, σ(x) = σ · (1 + x²).
   (d) μ(x) = −βx^{2k}, σ(x) = σ · (1 + x²).
3. Let σ²(·) be bounded away from zero and infinity. Assume μ(x) is bounded on x ≤ 0, and μ(x) ≍ x^c as x → ∞ (i.e., there exist 0 < A₁ ≤ A₂ such that A₁ ≤ μ(x)/x^c ≤ A₂ for all sufficiently large x). Prove that explosion occurs if and only if c > 1.
4. (i) Derive (12.18). (ii) For the function ϕ_r in (12.19), prove that Lϕ_r(|y|) ≤ ϕ_{r−1}(|y|). (iii) If ϕ_r are defined as in (12.19), but with upper bars replaced by lower bars, prove that Lϕ_r(|y|) ≥ ϕ_{r−1}(|y|). (iv) Write out the details of the proof of part (b) of Theorem 12.3. (v) Suppose that the functions A(x) and B(x) + C(x) are radial, i.e., functions of |x|. Prove that the criteria in Theorem 12.3 are then precise, i.e., they provide necessary and sufficient conditions for explosion.
5. Prove part (b) of Theorem 12.2 for the case when the second integral (ii) in (12.11) converges. [Hint: Relabel the Markov process by the transformation x → −x, and then use the result for τ_{+∞}.]
6. Suppose σ²(z) > 0 for all z, bounded away from zero and infinity for z ≤ 0, σ(z) ≍ z^{1+β} and μ(z) ≍ z^{1+α} as z → ∞ (α > 0, β > 0). Also assume μ(z) is bounded for z ≤ 0. Prove that explosion occurs if and only if 2β < α.
7. Let σ²(z) > 0 for all z, and μ(z) < 0 for z > A, μ(z) > 0 for z < −B, where A and B are positive numbers. Show that the diffusion is nonexplosive.
8. (i) Let k = 2 and consider a diffusion dX(t) = σ(X(t))dB(t), σ(·) nonsingular. Show that there is no explosion if D(x) = σ(x)σ(x)′ has equal eigenvalues for every x. In particular, consider the case σ(x) = (1 + |x|^a)I₂ for some a ∈ (−∞, ∞). (ii) Let k > 2 and μ = 0, and suppose the eigenvalues of D(x) are all equal to α(x), say, where α(x) ≥ d₁ + c₁|x|^a, for some constants d₁ ≥ 0, c₁ > 0, a > 2. Show that the diffusion is explosive.
9. Prove that if s(∞) = ∞, then P(τ_{+∞} < ∞) = 0. [Hint: Use Proposition 12.1 and Theorem 12.2.] Similarly, show that P(τ_{−∞} < ∞) = 0 if s(−∞) = −∞.
Chapter 13
Absorption, Reflection, and Other Transformations of Markov Processes
This chapter builds on the construction of unrestricted Markov processes on a measurable state space, by further constructing Markov processes by means of various transformations such as reflection, rotation, periodic embeddings, and so on. Much more generally, a special class of transformations of an unrestricted diffusion is shown to preserve the Markov property. These transformed processes are shown to be diffusions when the transformation is also sufficiently smooth.
We begin this chapter with the simple yet useful observation that a one-to-one measurable map of a state space (S, S) of a Markov process {X_t : t ≥ 0} onto a state space (S′, S′) gives rise to a Markov process1 {Y_t := ϕ(X_t), t ≥ 0} on (S′, S′). After all, this is merely a relabeling of the states. For example, if {X_t : t ≥ 0} is a diffusion on R with coefficients μ(·) and σ(·), and ϕ is a twice continuously differentiable strictly increasing (or strictly decreasing) map on R onto (a, b) (−∞ ≤ a < b ≤ ∞), then {Y_t : t ≥ 0} is said to be a diffusion on (a, b). By Itô's lemma,

dY_t = dϕ(X_t) = {ϕ′(X_t)μ(X_t) + (1/2)σ²(X_t)ϕ″(X_t)}dt + ϕ′(X_t)σ(X_t)dB_t = μ̃(Y_t)dt + σ̃(Y_t)dB_t   (t ≥ 0),   (13.1)
1 See Rogers and Pitman (1981) and references therein for some other approaches to conditions for the Markov property of functions of Markov processes.
where

μ̃(y) := (ϕ′μ + (1/2)σ²ϕ″)(ϕ^{−1}(y)),   σ̃(y) := (ϕ′σ)(ϕ^{−1}(y)).   (13.2)
We will say {Y_t : t ≥ 0} is a diffusion on (a, b) with coefficients μ̃(·), σ̃(·). In this manner, one can define diffusions on arbitrary nondegenerate open intervals (a, b). Conversely, given such a diffusion on an interval (a, b) (with one or both of a, b finite), one can construct an "equivalent" diffusion on R by a C²-diffeomorphism ϕ on (a, b) onto R. A point worth noting here is that a conservative diffusion on (a, b), with at least one of the end points a, b finite, cannot be defined for arbitrarily specified Lipschitzian coefficients. The coefficients μ̃(·) and σ̃(·) of a conservative diffusion on (a, b) must be such that a C²-diffeomorphism onto R will give rise to a conservative diffusion on R. Following the line of proof of Feller's criteria for explosion (Theorem 12.2), one may show that a diffusion on (a, b), with coefficients μ(·), σ(·), is conservative if and only if, for some x₀ ∈ (a, b) (Exercise 2),

∫_a^{x₀} m(x) ds(x) = −∞,   ∫_{x₀}^b m(x) ds(x) = ∞,   (13.3)
where s(x) and m(x) are defined, for an arbitrarily fixed x₀ ∈ (a, b), by

s(x) := ∫_{x₀}^x exp{−I(y)} dy,   m(x) := ∫_{x₀}^x (2/σ²(y)) exp{I(y)} dy,   I(x) := ∫_{x₀}^x (2μ(y)/σ²(y)) dy.   (13.4)
Note that if the first integral in (13.3) is convergent, then P(τ_a < ∞) > 0, where τ_a = lim_{z↓a} τ_z, τ_z := inf{t ≥ 0 : Y_t = z}. Although one can prove the assertion above, as well as necessary and sufficient conditions for transience, null and positive recurrence on (a, b), by using the results in Chapter 8 and at the end of Chapter 11 on S = (−∞, ∞) via a twice continuously differentiable diffeomorphism ϕ : (a, b) → (−∞, ∞), it is more useful to derive the criteria directly. The method is the same as in Chapters 8 and 11. In Exercise 2, one shows that the diffusion on (a, b) is recurrent if and only if

s(a) = −∞,   s(b) = +∞,   (13.5)

and it is positive recurrent if and only if, in addition to (13.5),

m(a) > −∞,   m(b) < ∞.   (13.6)
Definition 13.1 (Feller’s Boundary Classification) If the first integral in (13.3) is convergent, then ‘a’ is said to be accessible. Similarly, ‘b’ is accessible if the second integral in (13.3) converges. If a boundary is not accessible, then it is said to be inaccessible. Example 1 (A Zero-Seeking Brownian Motion) Consider the process on .R defined by dZ(t) = −(1 − σ 2 )(1 + Z 2 (t))Z(t)dt + σ (1 + Z 2 (t))dB(t),
.
Z(0) = z.
Define .X(t) = ϕ(Z(t)), t ≥ 0, where .ϕ : R → (− π2 , π2 ) is given by .ϕ(z) = tan−1 (z), z ∈ (−∞, ∞). Then X is constrained to the interval .(− π2 , π2 ) and satisfies dX(t) = − tan(X(t))dt + σ dB(t).
.
States of .X(t) located away from the origin, symmetrically on either side, experience an increasingly large pull back toward the origin as a function of distance, becoming infinite at the boundaries. It is interesting to consider the longtime behavior of X in terms of the parameter .σ ≥ 0 (Exercise 3). In a similar fashion as above, one can define diffusions on .C 2 -diffeomorphic images of .Rk .(k > 1), such as the unit ball .B(0 : 1), or the positive orthant .(0, ∞)k , etc. Itô’s lemma may be used to derive the coefficients of the diffusion on the image space in terms of those on .Rk , and vice versa.
13.1 Absorption Let G be an open subset of .Rk .(k ≥ 1) on which are defined locally Lipschitzian coefficients .μ(·), σ (·) which may be extended to all of .Rk as locally Lipschitzian so that the diffusion on .Rk is nonexplosive. For a given initial state .x ∈ G, let .Xx be the diffusion on .Rk with these extended coefficients. Then define the map x
Xt := Xx (t ∧ τ x ),
.
τ (f ) := inf{t ≥ 0 : f (t) ∈ ∂G}, f ∈ C([0, ∞) : G), τ x := τ (Xx ),
(x ∈ G).
(13.7)
x
The process .X .(x ∈ G) is said to be a diffusion on .G having coefficients .μ(·), σ (·), with absorption on .∂G. Let .{Ft : t ≥ 0} denote the filtration with respect to which the Markov property of the unrestricted diffusion .Xx .(x ∈ Rk ) is defined. x
Proposition 13.1 .X .(x ∈ G) is a Markov process on the state space .G, with the transition probability .p(t; x, dy) defined by p(t; x, dy) := δx (dy)
.
∀ x ∈ ∂G, t ≥ 0,
222
13 Absorption, Reflection, and Other Transformations of Markov Processes
p(t; x, B) :=
P (τ x > t, Xxt ∈ B) x ∈ G, B ∈ B(G), P (τ x ≤ t, Xxτ x ∈ B) x ∈ G, B ∈ B(∂G).
(13.8)
x
Proof If .x ∈ ∂G, then .Xt = x∀ t ≥ 0 and it is clearly (trivially) Markov with the transition probability .δx (dy). Suppose .x ∈ G, B Borel subset of G. Then, writing x x + .(Xs ) for the after-s process .{X s+t : t ≥ 0}, one has x
.
P (Xs+t ∈ B | Fs ) = P (Xxs+t ∈ B, τ x > s + t | Fs ) = P (Xxs+t ∈ B, τ x > s, τ ((Xxs )+ ) > t | Fs ) y
= 1{τ x >s} [P (Xt ∈ B, τ y > t)]y=Xxs =
y [P (Xt
(strong Markov property) x
∈ B, τ y > t)]y=Xx ≡ p(t; Xs , B), s
x
x
since .Xs = Xxs if .τ x > s, and .p(t; Xs , B) is zero on .[τ x ≤ s](B ⊂ G). Now, let .B ⊆ ∂G. Then x
.
P (Xs+t ∈ B | Fs ) = P (Xxτ x ∈ B, τ x ≤ s + t | Fs ) = P (Xxτ x ∈ B, τ x ≤ s | Fs ) + P (Xxτ x ∈ B, s < τ x ≤ s + t | Fs ) = 1[τ x ≤s]∩[Xx x ∈B] + 1[τ x >s] P (τ ((Xxs )+ ) ≤ t, (Xxs )+ ∈ B | Fs ) τ ((Xx )+ ) τ
=
s
x 1[τ x ≤s] p(t; Xs , B) + 1[τ x >s] [P (τ y x
≤
y t, Xτ y
∈ B)]y=Xxs
x
= 1[τ x ≤s] p(t; Xs , B) + 1[τ x >s] p(t; Xs , B) x
= p(t; Xs , B). Example 2 (Brownian Motion on a Half-Line with Absorption at Zero) Let .{Btx : t ≥ 0} be a Brownian motion on .R = (−∞, ∞) with .B0x = x, where .x < 0. Let .τ be as above, with .G = (−∞, 0), .∂G = {0}. Consider Brownian motion W , reflected at .0 : Wt = Btx for .t ≤ τ x and .Wt = −Btx for .t > τ x . By the strong Markov property, W has the same distribution as .B x . Also .τ x ≡ τ (B x ) = τ (W ). Therefore, for all .c < 0, p(t; x, (−∞, c]) = P (Btx ≤ c, τ x > t)
.
= P (Btx ≤ c) − P (Btx ≤ c, τ x ≤ t) Now,
(13.9)
13.1 Absorption .
223
P (Btx ≤ c, τ (B x ) ≤ t) = P (Wt ≤ c, τ (W ) ≤ t) = P (Btx ≥ −c, τ (W ) ≤ t) = P (Btx ≥ −c, τ (B x ) ≤ t) (Since‘Btx ≥ −c ⇒ ‘τ (B x ) ≤ t ). (13.10)
= P (Btx ≥ −c) Using (13.10) in (13.9), one obtains
p(t; x, (−∞, c]) = P (Btx ≤ c) − P (Btx ≥ −c).
.
(13.11)
Also, p(t; x, {0}) = P (τ x ≤ t).
(13.12)
.
Thus, on (−∞, 0), p(t; x, dy) has the density (obtained by differentiating (13.11) with respect to c, and then setting c = y)

q₁(t; x, y) := p(t; x, y) − p(t; x, −y) = p(t; x, y) − p(t; −x, y)   (x < 0, y < 0),   (13.13)

where p(t; x, y) is the transition probability density of a standard Brownian motion,

p(t; x, y) = (1/√(2πt)) e^{−(y−x)²/(2t)}.

Hence,

q₁(t; x, y) = (1/√(2πt)) e^{−(y−x)²/(2t)} − (1/√(2πt)) e^{−(y+x)²/(2t)}   (x < 0, y < 0).   (13.14)
(13.14)
Note that the discrete component of .p(t; x, dy) puts mass .p(t; x, {0}) = P (τ x ≤ t) at .{0}. One may write this probability explicitly as p(t; x, {0}) = 1 −
q1 (t; x, y)dy.
.
(13.15)
(−∞,0)
The transition probability of Brownian motion on .[0, ∞) with absorption at ‘0’ may be obtained from that on .(−∞, 0] by making the transformation .x → −x on the state space .(−∞, 0] to .[0, ∞). In this case, the density component of the transition probability .p(t; x, dy) .(x ∈ (0, ∞)) is obtained from (13.14) as q2 (t; x, y) = q1 (t; −x, −y) = p(t; −x, −y) − p(t; −x, y)
.
= p(t; x, y) − p(t; x, −y) = p(t; x, y) − p(t; −x, y),
(x > 0, y > 0)
(13.16)
224
13 Absorption, Reflection, and Other Transformations of Markov Processes
13.2 General One-Dimensional Diffusions on Half-Line with Absorption at Zero We first consider locally Lipschitz coefficients .μ(·), σ (·) on .[0, ∞) with .μ(0) = 0 (and .σ (y) = 0, .y ∈ [0, ∞)). Extend these coefficients to all of .R as follows: μ(−x) = −μ(x),
.
σ (−x) = σ (x)
(x > 0).
(13.17)
Then the unrestricted diffusion .{Xt : t ≥ 0} on .(−∞, ∞) satisfies the Itô equation with the same coefficients, namely, .μ(·), .σ (·), as does .{Yt := −Xt , t ≥ 0}. In other words, they have the same law, starting from the same initial point. In particular, they have the same transition probability. One may now repeat the arguments in Example 2 verbatim, with Brownian motion .{Bt : t ≥ 0} replaced by .{Xt : t ≥ 0}. In particular, assuming that .Xt has a transition probability density .p(t; x, y) that satisfies the backward equations, the transition probability .p(t; x, dy) of the process absorbed at 0, .x ∈ (−∞, 0), has the density component q(t; x, y) = p(t; x, y)−p(t; x, −y) = p(t; x, y)−p(t; −x, y),
(x < 0, y < 0). (13.18) Similarly (or using relabeling .x → −x), the density component of the transition probability .p for the diffusion on .[0, ∞), with absorption at 0, satisfies the same equation as (13.18), but for .x > 0, .y > 0.
.
Remark 13.1 In the case .μ(x) = 0 in the specification of the drift on .(0, ∞), one may still extend .μ(·) and .σ (·) to .(−∞, 0) by (13.17), and one may take .μ(0) to be arbitrary, say .μ(0). As will be seen in Chapter 21, the construction of Feller’s onedimensional diffusion2 is determined by the scale function .s(·) and speed function .m(·), which are not affected by changing the values of .μ(·) at one point. Thus, the formula (13.18) still holds even if .μ(0−) = 0. Remark 13.2 By (13.18), one has .
lim q(t; x, y) = 0
(t > 0, y < 0),
(13.19)
lim q(t; x, y) = 0
(t > 0, x < 0).
(13.20)
(t > 0, x < 0, y < 0),
(13.21)
x↑0
and .
y↑0
Also, .q(t; x, y) satisfies .
2 See
∂q(t; x, y) = (Aq)(t; x, y) ∂t
Chapter 21, or Mandl (1968).
13.2 General One-Dimensional Diffusions on Half-Line with Absorption at Zero
225
where .A = μ(x) d/dx + 12 σ 2 (x)d 2 /dx 2 . It follows that .q(t; x, y) is the fundamental solution to the initial-value (or, Cauchy) problem: ∂u(t, x) = Au(t, x) ∂t lim u(t, x) = 0 .
x↑0
(t > 0, x ∈ (−∞, 0)), (t > 0),
lim u(t, x) = f (x),
(13.22)
t↓0
where f is a bounded uniformly continuous function on .(−∞, 0) satisfying the boundary condition .f (0+) = 0, and in this case, the solution to (13.22) is given by u(t, x) =
(t > 0, x < 0).
f (y)q(t; x, y)dy,
.
(13.23)
(−∞,0)
More generally, it may be shown that if G is an open set with a smooth boundary (e.g., .G = {x ∈ Rk : |x| < 1}) and A=
.
μ(i) (x) ∂/∂x (i) +
1 aij (x)∂ 2 /∂x (i) ∂x (j ) , 2 i,j
i
with .μ(·), .a(·) = σ (·)σ (·) bounded and Lipschitzian and the eigenvalues of .a(·) being bounded away from zero and infinity, then the density component .q(t; x, y) of the distribution .p(t; x, dy) .(x ∈ G) is the fundamental solution to the initial boundary value problem3 .
∂u(t, x) = Au(t, x) (t > 0, x ∈ G), ∂t lim u(t, x) = 0 (t > 0), x→∂G (x∈G)
lim u(t, x) = f (x) t↓0
(x ∈ G),
(13.24)
for bounded uniformly continuous f on G satisfying the boundary condition .
lim f (x) = 0
x→b (x∈G)
∀ b ∈ ∂G.
(13.25) x
Let the diffusion on .G, with absorption on .∂G, be denoted by .X , as in Proposition 13.1. Note that if f can be extended to a twice continuously differentiable function as a bounded function with continuous and bounded derivatives, and if
3 See
Friedman (1964).
226
13 Absorption, Reflection, and Other Transformations of Markov Processes
f satisfies the boundary condition (13.25), then by Itô’s lemma and the optional x x stopping rule (noting that .Xt = Xxτ x ∧t ), the function .u(t, x) := Ef (Xt ) satisfies x
u(t, x) − f (x) ≡ Ef (Xt ) − f (x) τ ∧t t =E (Af )(Xxs )ds = E (Af )(Xxs∧τ )ds
.
0
t
=E 0
x
(Af )(Xs )ds =
0
t
(T s Af )(x)ds,
(13.26)
0
where .T s g(x) := G g(y)p(s; x, dy)), so that, using the fact that .T t and its generator A (restricted to functions f that vanish on .∂G) commute, .
∂u(t, x) = T t Af (x) = AT t f (x) ≡ Au(t, x) ∂t
(t > 0, x ∈ G),
yielding the first equation in (13.24). The third equation is clear from (13.26). For the middle equation, use the fact that if .x .(∈ G) → b ∈ ∂G, then, for any given x → 0, a.s., .τ x ∧ t → 0 a.s., .Xx .t > 0, .τ τ x ∧t → b a.s., and, therefore, .u(t, x) = x Ef (Xτ x ∧t ) → f (b) = 0. Example 3 (k-Dimensional Brownian Motion on a Half-Space with Absorption at the Boundary) Let .Bx be a Brownian motion on .Rk .(k > 1), .x ∈ [0, ∞) × Rk−1 = x x x := inf{t ≥ {y = (y (1) , . . . , y (k) ) : y (1) ≥ 0}. Define .Bt := Bt∧τ x , where .τ (1) x 0 : (Btx )(1) = 0} = τ x , say. The transition probability of .Bt has the density component p(t; x, y) = q2 (t; x (1) , y (1) )
k
.
p(t; x (j ) , y (j ) )
x, y ∈ (0, ∞) × Rk−1 ,
j =2
where .q2 is as in (13.16) and .p(t; x, y) is the transition probability density of a one-dimensional standard Brownian motion. The singular part of .p(t; x, dy), for k−1 is given by the following, where we write .x = (x (2) , . . . , x (k) ) .x ∈ (0, ∞) × R 1 (1) and .τ x as the first passage time to the boundary, starting at .(x (1) , , x1 ), .
p(t; x, {0} × C) (1)
= P (τ x ≤ t, Bx x (1) ∈ C) τ t 1 gx (1) (s)Φk−1 (s − 2 (C − x1 ))ds, =
C ∈ B(Rk−1 ),
(13.27)
0
where .gx (·) is the first passage time density (to 0) of a one-dimensional standard Brownian motion starting at .x (1) > 0. This can be obtained from (13.15) on differentiation with respect to t but is given more explicitly in (16.6) of Bhattacharya
13.2 General One-Dimensional Diffusions on Half-Line with Absorption at Zero
227
and Waymire (2021). Also see Example 1, Chapter 5. Here, .Φk−1 is the standard (k − 1)-dimensional normal distribution (with mean vector zero and identity dispersion matrix). In the case .k = 2, this distribution can be expressed in closed form (Exercise 5).
.
Example 4 (Markov Property for Functions of a Markov Process) We next consider the problem of constructing Markov processes .Yt , .t ≥ 0, which are functions of a Markov process .Xt , .t ≥ 0, say, .Yt = ϕ(Xt ), where .ϕ is not one-to-one, and thus .Yt may be viewed as a reduction of .Xt . Here is a preliminary result. Proposition 13.2 Let .{Xtx , t ≥ 0}, .x ∈ S, be a time-homogeneous Markov process on a measurable state space .(S, S), having transition probability .p(t; x, dy) and semigroup of transition operators .Tt , .t ≥ 0. Suppose .ϕ is a measurable map on .(S, S) onto .(S , S ). If .Tt (f ◦ ϕ) is a (measurable) function of .ϕ for every bounded y measurable f on .S , then .Yt := ϕ(Xtx ), .t ≥ 0, is a time-homogeneous Markov process on .(S , S ), with .y = ϕ(x) .(x ∈ S), having the transition probability −1 (C)) .(x ∈ ϕ −1 ({y})). .q(t; y, C) := p(t; x, ϕ Proof Suppose .Xtx , .t ≥ 0, has the Markov property with respect to the filtration −1 ({y}), .{Ft , t ≥ 0}. Then, using this Markov property, for a given .y ∈ S , and .x ∈ ϕ one has for .C ∈ S , y
x P (Ys+t ∈ C | Fs ) = P (ϕ(Xs+t ) ∈ C | Fs )
.
x = P (Xs+t ∈ ϕ −1 (C) | Fs ) = p(t; Xsx , ϕ −1 (C)) = Tt (f ◦ ϕ)(Xsx ),
where .f = 1C . By assumption, .Tt (f ◦ ϕ)(z) = v(t, ϕ(z)) for a measurable function y → v(t, y). Hence,
.
y
y
P (Ys+t ∈ C | Fs ) = v(t, ϕ(Xsx )) = v(t, Ys ),
.
y
proving the Markov property of .{Yt , t ≥ 0} with transition probability .q(t; y, C) = p(t; x, ϕ −1 (C)). Since it is generally not possible to calculate .p(t; x, dy) explicitly (in order to check the hypothesis of Proposition 13.2), the next result can often be used to provide important examples. Consider a group .G of transformations g on S such that p(t; g −1 x, g −1 B) = p(t; x, B)
.
∀ t > 0, x ∈ S, B ∈ S.
(13.28)
Here, each g is a bimeasurable map on S onto S. We say a measurable function .ϕ on (S, S) onto .(S , S ) is a maximal invariant if (1) .ϕ ◦g = ϕ∀ g ∈ G (invariance) and (2) if .ψ is an invariant function: .(S, S) → (S , S ), i.e., .ψ ◦ g = ψ .∀ g ∈ G, then there exists a measurable function .γ : S → S such that .ψ = γ ◦ ϕ (maximality). One may, for most purposes, think of .S as the space of orbits .{gx : g ∈ G}, .x ∈ S.
.
228
13 Absorption, Reflection, and Other Transformations of Markov Processes
Theorem 13.3 Under the hypothesis (13.28), .Yt := ϕ(Xt ), .t ≥ 0, is a Markov process, where .ϕ is a maximal invariant under .G. Proof For all .C ∈ S , .y ∈ S , .x ∈ ϕ −1 ({y}), so that .Yt = ϕ(Xtx ), one has y
x P (Ys+t ∈ C | Fs ) = P (Xs+t ∈ ϕ −1 (C) | Fs ) = p(t; Xsx , ϕ −1 (C)). y
.
(13.29)
Now by assumption (13.28) and invariance of .ϕ one has .∀ g ∈ G, p(t; x, ϕ −1 (C)) = p(t; g −1 x, g −1 ϕ −1 (C)) = p(t; g −1 x, (ϕ ◦ g)−1 (C))
.
= p(t; g −1 x, ϕ −1 (C)), which shows that .x → p(t; x, ϕ −1 (C)) is invariant under .G. Hence, because of the maximality of .ϕ, .p(t; x, ϕ −1 (C)) = q(t; ϕ(x), C), say, for some function q such that .y → q(t; y , C) is measurable on .(S , S ). Hence, by (13.29), y y P Ys+t ∈ C | Fs = q(t; ϕ(Xsx ), C) = q(t; Ys , C).
.
We will consider several applications of Proposition 13.2 and Theorem 13.3. The following immediate corollary allows one to verify the hypothesis of Theorem 13.3 in the case of diffusions. Corollary 13.4 If the generator A of a diffusion .{Xt : t ≥ 0} is invariant under a group of transformations, and .ϕ is the maximal invariant, then .{Yt = ϕ(Xt ) : t ≥ 0} is a Markov process. Example 5 (General Diffusions with a Markovian Radial Component) Consider a diffusion on .Rk .(k > 1) governed by the Itô equation dXt = μ(Xt )dt + σ (Xt )dBt ,
.
t > 0.
(13.30)
To determine conditions under which the norm .|Xt |, t ≥ 0, is a Markov process, one may look for conditions under which the generator A maps (smooth) radial functions to radial functions, i.e., smooth functions of the form, .h(x) = F (|x|). In anticipation of such conditions, let .d(x) := Σaij (x)x (i) x (j ) /|x|2 .(x = 0), .C(x) := Σaii (x), (i) μ(i) (x). Assume that the functions .d(x) and .B(x)+C(x) .B(x) := 2x·μ(x) ≡ 2Σx are radial. Assume also that .σ (·) is nonsingular, and write d(x) = α(|x|),
.
β(|x|) =
B(x) + C(x) − d(x) , d(x)
(x = 0).
(13.31)
If F is a twice continuously differentiable bounded function on .(0, ∞) with bounded derivatives, then for the radial function .h(x) := F (|x|) on .Rk \{0}, one has (see Lemma 8, Chapter 11),
13.2 General One-Dimensional Diffusions on Half-Line with Absorption at Zero
229
1 B(x) + C(x) − d(x) d(x)F (|x|) + F (|x|) (x = 0), 2 2|x|
(13.32)
Ah(x) =
.
so that A transforms radial functions to radial functions. It follows from Proposition 13.2 that .Rt := |Xt |, .t ≥ 0, is a diffusion on .(0, ∞) with generator AR :=
.
1 β(r) d d2 α(r)[ 2 + ]. 2 r dr dr
(13.33)
(see Exercise 6), with inaccessible boundaries 0 and .∞, or, equivalently, .log Rt , t ≥ 0, is a conservative diffusion on .(−∞, ∞). In this example, the group G in Corollary 13.4 is the group of rotations, i.e., the orthogonal group.
.
Example 6 (Bessel Processes) Let .k ≥ 2, .{Bt , t ≥ 0} be a standard Brownian motion on .Rk . As a special case of the previous example, .Rt := |Bt |, .t ≥ 0, is a diffusion on .(0, ∞) with generator AR :=
.
k−1 d 1 d2 + , 2 dr 2 2r dr
(13.34)
with inaccessible boundaries 0 and .∞. Example 7 (Squared Bessel Process) Let .k ≥ 2 and .Rt2 = |Bt |2 , t ≥ 0 where .B is standard Brownian motion started at .0. From the preceding example, it follows that 2 2 2 .Rt = |Bt | , .t ≥ 0, is a diffusion on .(0, ∞), since .x → x is a diffeomorphism. Here is an alternative description of the process. By Itô’s lemma, one has dRt2 = kdt + 2Bt · dBt .
.
Now observe, noting also that 0 is an inaccessible boundary for R, the process .B˜ t := t −1 k (j ) (j ) Rs j =1 Bs dBs , t > 0, is a continuous martingale with quadratic variation 0t (j ) d . = 0 |f(s)| s = t, where .f is the vector-valued function with components .fs (j ) ˜ Bs /Rs . Thus, by Lévy’s characterization, Theorem 9.4, .B is a one-dimensional standard Brownian motion. Thus, letting .Xt := Rt2 , t ≥ 0, one has dXt = kdt +
.
|Xt |d B˜ t , t > 0,
X0 = x,
for .x ≥ 0, is referred to as the squared Bessel process starting at x. Example 8 (Diffusions on a Torus) On .Rk .(k ≥ 1) let .μ(·), .σ (·) be Lipschitzian and periodic, μ(x + ν) = μ(x),
.
σ (x + ν) = σ (x)
∀ x ∈ Rk , ∀ ν ∈ Zk .
Consider the diffusion .{Xt : t ≥ 0}, having these coefficients, so that
(13.35)
230
13 Absorption, Reflection, and Other Transformations of Markov Processes
dXt = μ(Xt )dt + σ (Xt )dBt .
.
Check that for any given .ν ∈ Zk , .Yt := Xt + ν satisfies the same equation dYt = μ(Yt )dt + σ (Yt )dBt .
.
Therefore, Theorem 13.3 or Corollary 13.4 applies with .G as the group of all translations by elements .ν of .Zk . A maximal invariant is given by .ϕ(x) := x mod 1 ≡ (x (1) mod 1, . . . , x (k) mod 1), .x ∈ Rk . Thus, .ϕ takes values in the torus .T := {x mod 1 : x ∈ Rk }, which as a measurable space may be labeled as .([0, 1)k , .B([0, 1)k )) .(ν mod 1 := 0, ν ∈ Zk ). In particular, .T is compact. The Markov process .Zt := ϕ(Xt ), .t ≥ 0, is called a diffusion on the (unit) torus. Example 9 (Brownian Motion on a Circle and on a Torus) For each .a = (a (1) , a (2) , . . . , a (k) ) with .a (i) > 0(1 ≤ i ≤ k), one may define a diffusion .{Zt , t ≥ 0} on the torus .Ta := {x mod a ≡ (x (1) mod a (1) , . . . , x (k) mod a (k) ) : x ∈ Rk } by setting .Zt := Xt mod a. The transition probability density of .{Zt , t ≥ 0} with respect to Lebesgue measure or .[0, a1 ) × · · · × [0, ak ) is easily shown to be
q(t; z, z ) =
.
p(t; z, z + (ν (1) a1 , . . . , ν (k) ak )),
(z, z ∈ Ta , t > 0).
ν∈Zk
p(t; x, y) := ( √
1 2π t
)k exp{−
|y − x|2 } 2t
(x, y ∈ Rk ).
(13.36)
As a special case, the Brownian motion on the unit circle .S 1 may be defined as .Zt := Bt mod 2π , .t ≥ 0. Its transition probability density with respect to Lebesgue measure on .S 1 (labeled as .[0, 2π )) is q(t; z, z ) :=
.
1 1 (2π t)− 2 exp{− (z + 2π ν − z)2 }, 2t
z, z ∈ [0, 2π ).
(13.37)
ν∈Z
13.3 Reflecting Diffusions Let .μ(·) and .σ (·) be locally Lipschitzian on .R, such that the diffusion with these coefficients is nonexplosive. First assume also that .μ(·) is an odd function and .σ (·) is an even function, μ(−x) = −μ(x),
.
σ (−x) = σ (x),
(x ∈ R).
(13.38)
Then a diffusion .X := {Xt : t ≥ 0}, with these coefficients and .Y := {Yt := −Xt : t ≥ 0}, satisfy the same Itô equation
13.3 Reflecting Diffusions
231
dXt = μ(Xt )dt + σ (Xt )dBt ,
.
t , dYt = μ(Yt )dt + σ (Yt )d B
(13.39)
t = −Bt : t ≥ 0}, is also a standard Brownian motion, just as .{Bt : t ≥ 0}. where .{B Therefore, .{Xt : t ≥ 0}, and .{Yt : t ≥ 0}, have the same transition probability .p(t; x, dy). This is equivalent to saying that p(t; x, B) = p(t; −x, −B),
B ∈ B(R).
.
(13.40)
In other words, the transition probability is invariant (i.e., satisfies (13.28)) under the reflection group .{g, e}, where .gx = −x. A maximal invariant under this group is 2 .|x| (another, equivalent, one is .x ). Hence, by Theorem 13.3, .Zt := |Xt |, .t ≥ 0, is a Markov process on .S = [0, ∞). It is called a diffusion on .[0, ∞) having coefficients 4 .μ(·), .σ (·), with reflection at 0. If the process .X has a transition probability density .p(t; x, y) then Z has the transition probability density q(t; x, y) := p(t; x, y) + p(t; x, −y)
.
= p(t; x, y) + p(t; −x, y)
(x > 0, y > 0, t > 0) (13.41)
satisfying the backward equations . ∂q ∂t = Aq. Next, suppose one is given .μ(·) and .σ (·) on .[0, ∞), which are locally Lipschitz, 2 .σ (·) > 0. Extend them to .R by (13.38). Change .μ(0) to 0. As pointed out in Remark 13.1, the diffusion is well defined as a Feller diffusion (see Chapter 21), and (13.40), (13.41) hold. Remark 13.3 It follows that .q(t; x, y) satisfies the same equation in .(t, x) as p(t; x, y). Namely,
.
.
∂q(t; x, y) = Aq(t; x, y), ∂t
0 < x, y < ∞ (t > 0),
(13.42)
and, by (13.41), satisfies the boundary condition .
lim x↓0
∂q (t; x, y) = 0 ∂x
(t > 0, y > 0).
(13.43)
Example 10 (Reflecting Brownian Motion on .[0, ∞)) If .Btx , .t ≥ 0, is a standard Brownian motion starting at .x > 0, then .Ztx := |Btx |, .t ≥ 0, is a reflecting Brownian motion on .[0, ∞) whose transition probability density is q(t; x, y) := p(t; x, y) + p(t; x, −y),
.
4 See
(x > 0, y > 0, t > 0)
Friedman (1964) for existence of a transition probability density.
232
13 Absorption, Reflection, and Other Transformations of Markov Processes 1
p(t; x, y) := (2π t)− 2 exp{−
1 (y − x)2 }. 2t
(13.44)
Note that .
∂q(t; x, y) =0 ∂x x↓0 ∂q(t; x, y) =0 ∂x y↓0
(t > 0, y > 0), (t > 0, x > 0).
(13.45)
The relations (13.45) also hold for the general nonsingular reflecting diffusion on [0, ∞). In the case of Brownian motion with drift .μ = 0, one may check that the process .Yt = |μt + Bt |, t ≥ 0, does not have the Markov property (Exercise 1).
.
Remark 13.4 Suppose .μ(x), σ (x) = 0 are bounded, locally Lipschitzian, specified on .[0, ∞). If .μ(0) = 0, one may still define .σ (x), μ(x) on .R satisfying (13.38) for .x = 0. In this case, .μ(·) is not continuous at 0. However, as noted in Remark 13.1, there still exists a symmetric diffusion with these extended coefficients (one may, in fact, modify .μ(·) to let .μ(0) = 0, without changing Feller’s generator .A = d d dm(x) ds(x) ). Thus, a reflecting Brownian motion on .[0, ∞) may be constructed again as .Zt := |Xt |, .t ≥ 0. Alternatively, one may use pde to obtain the fundamental solution .q(t; x, y) satisfying (13.42), (13.43). Example 11 (Reflecting Diffusions on .[0, 1]) Suppose that Lipschitzian coefficients μ(·) and .σ (·) > 0 are given on .[0, 1]. Assume first that
.
μ(0) = 0,
.
μ(1) = 0.
(13.46)
Extend these coefficients to .[−1, 1] by setting μ(−x) := −μ(x),
.
σ (−x) := σ (x),
0 < x ≤ 1.
(13.47)
Now extend .μ(·), σ (·) to all .R periodically with period 2: μ(x + 2ν) = μ(x),
.
σ (x + 2ν) = σ (x),
−∞ < x < ∞, ν ∈ Z.
(13.48)
Let .Xt , .t ≥ 0, be a diffusion on .R with these extended coefficients. Then .Zt := Xt mod 2, .t ≥ 0, is a diffusion on the circle .{x mod 2 : x ∈ R}. Hence, .Zt − 1 is a Markov process on .[−1, 1), which is symmetric, i.e., .Zt − 1 and .−Zt + 1 have the same distribution (when .−1 and .+1 are identified). Thus, .Yt := |Zt − 1| is a Markov process on .[0, 1], which we call the reflecting diffusion on .[0, 1] with coefficients .μ(·), σ (·), satisfying (13.47). One may also express .Yt = ϕ(Xt ), where .ϕ is a maximal invariant under the group .G generated by .θ0 = reflection around 0, .θ0 x = −x, and .g1 = translation by 2. The group .G is also generated by .θ0 and the reflection .θ1 around .1 : θ1 x = 2 − x .(= 1 + 1 − x). Identified as the orbit of x,
13.3 Reflecting Diffusions
233
ϕ(x) ≈ o(x) = {±x + 2ν : ν ∈ Z} .(x ∈ R), since the map .x → o(x) on .[0, 1] is one-to-one and onto the space of orbits, the reflecting diffusion on .[0, 1] may also be defined in this manner. In particular, the transition probability density .q(t; x, y) of .Yt , .t ≥ 0, is given in terms of the transition probability density .p(t; x, y) of the unrestricted diffusion as (Exercise 10)
.
.q(t; x, y)
=
p(t; x + 1, y + 1 + 2ν) +
ν∈Z
p(t; x + 1, −y + 1 + 2ν) x, y ∈ [0, 1]), t > 0.
ν∈Z
(13.49) As pointed out in Remark 13.4, given any bounded Lipschitzian .μ(·), σ (·) > 0 on .[0, 1], one can simply change .μ(·), if necessary, so that .μ(0) = μ(1) = 0, and use the above construction for a symmetric diffusion with .μ(·) having possible discontinuities at integer points. Example 12 (Reflecting Brownian Motion on .[0, 1]) For a one- dimensional standard Brownian motion .Bt , .t ≥ 0, the reflecting diffusion .Yt = |Zt − 1|, where .Zt = Bt mod 2, has the transition probability density (13.49) with .p(t; x, y) = 1 (2π t)− 2 exp{−(x − y)2 /2t}. Example 13 (Reflecting Brownian Motion on .[0, 1]k , k ≥ 2) Given a standard kdimensional Brownian motion .Bt , .t ≥ 0, the reflecting Brownian motion on .[0, 1]k (1) (k) (i) (i) (i) (i) is .Yt = (Yt , . . . , Yt ), .t ≥ 0, where .Yt = |Zt − 1|, .Zt := Bt mod 2. Example 14 (General Multidimensional Diffusions with Reflection on a Half Space) Let .μ(x), .σ (x) be prescribed on .H k = Hk ∪ ∂Hk , where .Hk = {x ∈ Rk , x (1) > 0}, k (1) = 0}. Assume .μ(·), .σ (·) to be Lipschitzian, .σ (·) symmetric, .∂Hk = {x ∈ R , x satisfying μ(1) (x) = 0,
σ1j (x) = 0,
.
2 ≤ j ≤ k,
for x ∈ ∂Hk .
(13.50)
Extend the coefficients to all of .Rk by setting, for all .x = (x (1) , . . . , x (k) ), the vector (1) , x (2) , . . . , x (k) ), and letting .x = (−x μ(1) (x) = −μ(1) (x),
.
σ11 (x) = σ11 (x),
σ1j (x) = −σ1j (x)
μ(i) (x) = μ(i) (x)
for 2 ≤ i ≤ k,
σij (x) = σij (x)
for 2 ≤ i, j ≤ k,
for 2 ≤ j ≤ k,
(x ∈ Rk ).
Then writing .Xt = (−Xt(1) , Xt(2) , . . . , Xt(k) ) one has dXt = μ(Xt )dt + σ (Xt )dBt ,
.
(1) dX t
≡
(1) −dXt
= −μ (Xt )dt − (1)
k j =1
(j )
σ1j (Xt )dBt
(13.51)
234
13 Absorption, Reflection, and Other Transformations of Markov Processes
(1)
= μ(1) (Xt )dt + σ11 (Xt )dB t
+
k
(j )
σ1j (Xt )dBt
j =2 (1)
= μ(1) (Xt )dt + σ11 (Xt )dB t
+
k
(j )
σ1j (Xt )dBt ,
j =2 (1)
where .B t
(1)
:= −Bt . Also, for .i ≥ 2, (i)
(i)
dX t = dXt = μ(i) (Xt )dt +
.
k
(j )
σij (Xt )dBt
j =1
= μ (Xt )dt (i)
(1) + σi1 (Xt )dB t
+
k
(j )
σij (Xt )dBt .
j =2
In other words, .Xt satisfies the Itô equation .dXt = μ(Xt )dt + σ (Xt )dBt , (1) (2) (k) where .Bt = (−Bt , Bt , . . . , Bt ) is a standard Brownian motion on .Rk . Hence, .Xt and .Xt have the same transition probability. It follows that .Yt := (|Xt(1) |, Xt(2) , . . . , Xt(k) ), .t ≥ 0, is a Markov process on .H k . We will call .Yt , .t ≥ 0, the reflecting diffusion on .H k with reflection on .∂Hk . Remark 13.5 It is known5 that the unrestricted diffusion, whose coefficients satisfy (13.50), (13.51), has a transition probability density .p(t; x, y) if, in addition to the assumed smoothness of the coefficients, .σ (·) is assumed nonsingular. Then the transition probability .q(t; x, y) of the reflecting diffusion .Yt is given by q(t; x, y) = p(t; x, y) + p(t; x, y)
.
= p(t; x, y) + p(t; x, y),
(x, y ∈ Hk ).
(13.52)
This implies that q satisfies .∂q/∂t = Aq and the Neumann boundary condition .
∂q(t; x, y) = 0 (y ∈ Hk ), ∂x (1) x (1) =0
∂q(t; x, y) = 0, (x ∈ Hk ). ∂y (1) y (1) =0 (13.53)
As indicated in earlier remarks (see Remark 13.4), the condition ‘.μ(1) (x) = 0 for .x ∈ ∂H ’, may be dispensed with. To understand the role of the second condition ‘.σ1j (x) = 0 for .2 ≤ j ≤ k for .x ∈ ∂Hk ’, note first that in view of the assumed symmetry of the matrix .σ (·), this condition is equivalent to
5 See
Friedman (1964).
13.3 Reflecting Diffusions
235
a1j (x) = 0
2 ≤ j ≤ k, x ∈ ∂Hk ,
for
.
(13.54)
a(x) := σ (x)σ (x). The following example shows that (13.54) ensures that the reflection is in the “normal” direction as opposed to a “conormal” direction as explained below. We have proved the following result for the process .{Yt : t ≥ 0}. Theorem 13.5 Given locally Lipschitz coefficients .μ(·), .σ (·), .σ (·) symmetric and positive definite, on the half space .H k = {x = (x (1) , x (2) , . . . , x (k) ) ∈ Rk : x (1) ≥ 0}, satisfying (13.17), there exists a diffusion .{Y(t) : t ≥ 0}, on the half-space k : x (1) = 0}, whose transition .(Hk ) reflecting at the boundary .∂H k = {x ∈ R probability density .q(t; x, y) obeys the backward equation . ∂q ∂t = Aq in .Hk , where for .a(·) = σ (·)σ (·) , ∂q(t; x, y) ∂ 2 q(t; x, y) 1 + μ(i) (x) . aij (x) 2 ∂x (i) x (j ) ∂x (i)
Aq =
.
1≤i,j ≤k
(13.55)
1≤i≤k
and the Neumann boundary condition (13.53) on .∂H k . Example 15 (Diffusion on a Half-Space of .Rk with Conormal Reflection at the Boundary) Consider an arbitrary half-space H k,γ = {x = (x (1) , x (2) , . . . , x (k) ) ∈ Rk , γ · x ≥ 0},
.
(i) (j ) is the usual inner where .γ is a unit vector in .Rk and .γ · x = 1≤i≤k γ x product in .Rk . Suppose we are given a locally Lipschitz drift .ν(·) ∈ H k,γ , and a constant symmetric positive definite matrix .a = ((aij )). Let O be an orthogonal 1
1
transformation that maps .a 2 γ /|a 2 γ | to .e = (1, 0, 0, . . . , 0). Let .{Yt : t ≥ 0}, be 1 the diffusion on .H k given by Theorem 13.5 with drift .μ(·) = (a 2 O)−1 ν(·), and the diffusion matrix .I (identity matrix). It is simple to check that 1
X(t) = a 2 OYt ,
.
t ≥ 0, 1
is a diffusion on .H k,γ with drift .ν and diffusion matrix .a, .a 2 O being a diffeomorphism on .H k onto .H k,γ . Expressed in terms of this transformation, the backward equation for the transition probability density p, say, of the process .{Xt : t ≥ 0} is ∂p . ∂t = Ap, where Ap =
.
∂ 2p ∂p 1 aij (i) (j ) + μ(i) (i) , 2 ∂x ∂x ∂x 1≤i,j ≤k
1≤i≤k
and the boundary condition becomes (Exercise 12)
(13.56)
236
13 Absorption, Reflection, and Other Transformations of Markov Processes
(aγ ) · (grad)x p(t; x, y) = 0, or
.
(
1≤i≤k 1≤j ≤k
aij γi )
∂p(t; x, y) = 0. ∂x (i)
(13.57)
The boundary condition (13.57) is called a conormal boundary condition (see Exercise 12). Remark 13.6 Suppose one is given an open set .G ⊂ Rk .(k > 1) with (1) a smooth boundary .∂G, (2) an infinitesimal generator .A with coefficients .μ(·) and .σ (·) Lipschitzian on .G, .σ (·) having eigenvalues bounded away from zero and infinity, and (3) a continuous field of unit vectors .γ (x) .(x ∈ ∂G) such that .γ (x) · n(x) ≥ C > 0 for all .x ∈ ∂G, where .n(x) is the unit outward normal to .∂G at .x. Then there exists a reflecting diffusion on .G whose transition probability density .q(t; x, y) satisfies6 .
∂q(t; x, y) = Aq(t; x, y) ∂t γ (x) · gradx q(t; x, y) = 0
(x, y ∈ G, t > 0), (x ∈ ∂G, y ∈ G, t > 0).
Exercises For the exercises below, inaccessible boundaries are those that, starting from the interior, cannot be reached in finite time. 1. For .μ = 0, (a) show that .Yt = |μt + Bt |, t ≥ 0, does not have the Markov property. [Hint:Compute the transition probabilities .p(t, y0 , y1 ) for Y on .[0, ∞) and show that the Chapman–Kolmogorov equations fail for .p(t + t, , 0, 0).] (b) Using the method of this chapter, how would one construct a diffusion on .[0, ∞) with drift .μ, diffusion coefficient 1, and a reflecting boundary condition . ∂p(t;x,y) |x=0 = 0. Prove that the diffusion is transient. ∂x 2. Let .μ(·), .σ (·) > 0 be locally Lipschitzian on .(a, b), .a < b, with either .a > −∞ or .b < ∞, or both a and b are finite. Define the scale and speed functions .s(x) and .m(x) as in (13.4). Assume (13.3) holds. (i) Prove that a and b are inaccessible. (ii) Prove that the diffusion is recurrent if and only if .s(a) := limx↓a s(x) = −∞ and .s(b) := limx↑b s(x) = ∞. (iii) Prove that the diffusion is positive recurrent if and only if .m(a) > −∞ and .m(b) < ∞. [Hint: Use .A ≡ 12 σ 2 (x)(d 2 /dx 2 ) + μ(x)(d/dx) =
6 See
Ikeda and Watanabe (1989), pp. 217-232, or Stroock and Varadhan (1979), pp. 147-225. For the special case of Brownian motion on the two-dimensional closed unit disc .G with general oblique (or, non-conormal) reflection, see McKean (1969), pp. 126-132.
Exercises
237
(d/dm(x))(d/ds(x)), compute (1) .P (Xx reaches c before d) as .[s(x) − s(c)]/[s(d) − s(c)], .a < c < x < d < b, and (2) .γ (x) := expected time for .Xx to reach .{c, d}, as the solution of .Aγ (x) = −1, .γ (c) = γ (d) = 0.] (iv) Derive (13.3) from Feller’s criterion for explosion on .(−∞, ∞) (Theorem 12.2), using a diffeomorphism .ϕ : (a, b) → R. 3. Determine values of the parameter .σ under which the zero-seeking Brownian motion given in Example 1 is transient, null-recurrent, and ergodic. 4. Consider the diffusion on .(0, 1) with coefficients .μ(x) = C1 (1 − x) − C2 x, 1 .σ (x) = (x(1 − x)) 2 , .C1 ≥ 0, .C2 ≥ 0. (i) Show that 0 and 1 are inaccessible if and only if .C1 ≥ 12 , .C2 ≥ 12 . (ii) If .C1 ≥ 12 , .C2 ≥ 12 , show that the diffusion on .(0, 1) is positive recurrent. (iii) Check that .m(0) and .m(1) are finite if .c1 > 0, c2 > 0. Hence, with the additional condition of recurrence, it is not enough for positive recurrence that the speed measure is finite at the boundary. 5. Prove that in Example 3, the distribution .p(t; x, dy) on the boundary k (1) = 0}, with .x away from the boundary (i.e., .x (1) > 0), as .{y ∈ R , y given in (13.27), has for .k = 2 then density (with respect to Lebesgue (1) measure .dy (2) on the boundary) .p(t; x, (0, y (2) )) = xπ [(x (1) )2 + (y (2) − x (2) )2 ]−1 exp{−[(x (1) )2 + (y (2) − x (2) )2 ]/2t} which converges, as .t ↑ ∞, (2) (2) to the Cauchy density .f (y (2) ; x (1) , x (2) ) = (π x (1) )−1 [1 + ( yx (1) − xx (1) )2 ]−1 , (2) ∈ R .(x (1) > 0, .x (2) ∈ R). .y 6. (i) Find conditions on .α(·) > 0 and .β(·) such that the diffusion on .(0, ∞) with generator (13.33) has inaccessible boundaries. (ii) In the case of inaccessible boundaries, find conditions on .α(·), .β(·) such that the diffusion is (a) null recurrent, or (b) positive recurrent, or (c) transient. 7. Consider Example 6. (i) Show that .R = {Rt : t ≥ 0} is a Markov process on .S = (0, ∞) and that for twice continuously differentiable functions f with compact support, d R one has . dt Tt f = AR TtR f , where .TtR , t ≥ 0, are the transition operators for .{Rt : t ≥ 0}. (“0” is an example of an inaccessible boundary for the process R). (ii) Since the range of .|Bt | is .[0, ∞), one may consider the radial Brownian motion on .S = [0, ∞). Show that .R = {Rt : t ≥ 0} is a Markov process on the state space .S = [0, ∞), with the same transition probability .p(t; r, dr) for .r > 0 as in (i), but with .p(t; 0, (0, ∞)) = 1 for all .t > 0 for .k ≥ 2. (In Feller’s boundary classification, “0” is an example of an entrance boundary for the process R). 8. (i) If .a < b are finite, and .μ(x) = 0, σ 2 (x) > 0 on .(a, b), then show that the boundaries .a, b are accessible. (ii) If .a = −∞, .b = +∞, .μ(x) = 0, σ 2 (x) > 0 on .(−∞, ∞), then show that the boundaries are inaccessible and, indeed, the
diffusion is recurrent. (iii) Under the additional condition on σ²(·) in (ii), is the diffusion recurrent?
9. (Diffusion on the Unit Sphere) Suppose a diffusion on R^k, k > 1, is nonexplosive with locally Lipschitzian coefficients satisfying μ(cx) = cμ(x), σ(cx) = ±cσ(x) for all c ∈ R. (i) Show that its transition probability is invariant under the group G_+ of transformations g_c(x) = cx, c > 0, and that a maximal invariant under this group is ϕ(x) = x/|x| (if x ≠ 0), ϕ(0) = 0. (ii) Consider the example μ(x) = ax, and for some γ ≠ 0, σ_{i,i}(x) = γx^{(i+1)}, 1 ≤ i ≤ k − 1, σ_{kk}(x) = −x^{(1)}, σ_{i,i+1}(x) = −γx^{(i)}, 1 ≤ i ≤ k − 1, σ_{k1}(x) = x^{(k)}, and all other coefficients zero. Show that σ(x)x = 0 for all x, and that R_t² := |X_t^x|² equals |x|² exp{(2γ² − a²)t}, t ≥ 0, a deterministic process. Show also that if 2γ² = a², then, for every initial x ≠ 0, X_t^x, t ≥ 0, is a Markov process on the sphere {y ∈ R^k : |y| = |x|}.
10. Verify (13.49) and establish the backward and forward boundary conditions: ∂q/∂x = 0 at x = 0, 1, and ∂q/∂y = 0 at y = 0, 1, respectively.
11. (a) Consider k-dimensional standard Brownian motion on the orthant O_k = {x ∈ R^k : x^{(i)} ≥ 0, 1 ≤ i ≤ k}, with normal reflection at the boundary. That is, the process is X_t^x = (|B_t^{(1)}|, . . . , |B_t^{(k)}|), t ≥ 0, x = (x^{(1)}, . . . , x^{(k)}), where B_t = (B_t^{(1)}, . . . , B_t^{(k)}), t ≥ 0, is a k-dimensional standard Brownian motion starting at x. (i) Compute its transition probability density. (ii) Show that the process is null recurrent for k ≥ 1.
12. In Example 15, verify the assertion that the diffeomorphism a^{1/2}O on H^k onto H^{k,γ} transforms the diffusion {Y_t : t ≥ 0} on H^k with drift μ(·) and diffusion matrix I, and normal reflection at the boundary {x : x^{(1)} = 0}, to a process {X_t : t ≥ 0} on H^{k,γ} with drift ν(·) = a^{1/2}Oμ(·) and diffusion matrix a, and the conormal boundary condition in (13.57).
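The boundary classification used in Exercises 2 and 4 can be probed numerically. The following sketch is not part of the text; the parameter values, grids, and helper names are our own choices, and the trapezoidal integrals are only rough probes of whether the scale function diverges at the boundary 0 for the diffusion of Exercise 4.

```python
import numpy as np

# Exercise 4: diffusion on (0,1) with mu(x) = C1*(1-x) - C2*x, sigma^2(x) = x*(1-x).
# Probe whether s(0+) = -infinity (inaccessibility of the boundary 0).
C1, C2 = 0.7, 0.7          # try values above and below 1/2
x_ref = 0.5                # reference point for the scale integral

def I_int(x, n=4000):
    # I(x) = int_{x_ref}^{x} 2*mu(z)/sigma^2(z) dz, computed on a log-spaced grid
    z = np.geomspace(min(x, x_ref), max(x, x_ref), n)
    val = np.trapz(2.0 * (C1 * (1 - z) - C2 * z) / (z * (1 - z)), z)
    return val if x >= x_ref else -val

def s_prime(x):
    # scale density s'(x) = exp(-I(x))
    return np.exp(-I_int(x))

for eps in [1e-2, 1e-4, 1e-6]:
    u = np.geomspace(eps, x_ref, 2000)
    s_eps = -np.trapz([s_prime(v) for v in u], u)      # s(eps) - s(x_ref)
    print(f"eps = {eps:.0e}:  s(eps) - s(x_ref) ~ {s_eps:.2f}")
# For C1 >= 1/2 the printed values keep decreasing without bound as eps -> 0
# (0 is inaccessible); for C1 < 1/2 they stabilize at a finite limit.
```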
Chapter 14
The Speed of Convergence to Equilibrium of Discrete Parameter Markov Processes and Diffusions
Coupling methods and discrete renewal theory are used to derive criteria for polynomial as well as exponential rates of convergence to equilibrium for Harris recurrent Markov processes in discrete time and diffusions in continuous time.
In this chapter, based mostly on Wasielak (2009) and Bhattacharya and Wasielak (2012), criteria for polynomial and exponential rates of convergence of Markov processes to equilibrium are derived, both for discrete time Harris positive recurrent Markov processes and for d-dimensional diffusions, d ≥ 1, using the method of coupling. Lindvall (2002) used this method, which he attributes to Jim Pitman, to obtain polynomial rates of convergence of one-dimensional diffusions to equilibrium; a more general description of the method is also given there. Some criteria for polynomial convergence of multidimensional diffusions were derived by Davies (1986), also using estimates of moments of the coupling time. To state the necessary notions from renewal theory (see, e.g., Bhattacharya and Waymire (2021), Chapter 8), let {Y_n : n = 1, 2, . . . } be a sequence of i.i.d. strictly positive integer-valued random variables, with lattice span one, having the probability mass function (pmf) f(j), j ≥ 1; and let Y_0 be another non-negative integer-valued random variable independent of {Y_n : n = 1, 2, . . . }; the distribution of Y_0 is the delay distribution. This refers to the times Y_n needed to repair successive items, with Y_0 being the delay in starting the repair. Assume μ = EY_1 < ∞. Let

S_n := Y_0 + · · · + Y_n,    N_n := inf{k ≥ 0 : S_k ≥ n}.
.Nn is the nth renewal epoch, .SNn , n ≥ 0 is the renewal process, and .Rn = SNn −n is the residual lifetime (or overshoot) of the last renewal prior to the nth epoch. It helps to think about the renewal process as a point process .{Sn } on the integers with the points .Sn , n ≥ 0, laid out with blue dots on an axis. It is well known that .Rn , n ≥ 1, is a Markov process on the space .Z+ of non-negative integers, with the unique invariant distribution given by the integrated tail distribution1
g(j) = (1/μ) Σ_{i>j} f(i),    j = 0, 1, 2, . . . .
This is to be coupled with renewals of the process

S̃_n = Ỹ_0 + Ỹ_1 + · · · + Ỹ_n,
where Ỹ_n, n ≥ 1, are i.i.d. with the same common distribution as Y_n, n ≥ 1, and Ỹ_0 is an independent delay. Let Ñ_n, S̃_{Ñ_n}, R̃_n be defined for this sequence as for the original sequence, but with Ỹ_0 assumed to have the invariant distribution g. One may think of this point process {S̃_n : n ≥ 0} as placed with red dots on the same axis as {S_n : n ≥ 0}. Coupling occurs at a time T when a red dot falls on a blue dot or, equivalently,

T = inf{n ≥ 0 : R_n = R̃_n} ≡ inf{n ≥ 0 : S_{N_n} = S̃_{Ñ_n}}.     (14.1)
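The construction above can be simulated directly. The sketch below is our own illustration (the pmf f and all names are invented for the example): it generates one renewal sequence with an arbitrary delay and one with the stationary delay drawn from g, and records the first epoch n at which the two residual lifetimes agree, as in (14.1).

```python
import numpy as np

rng = np.random.default_rng(0)

support = np.array([1, 2, 3, 4])
f = np.array([0.4, 0.3, 0.2, 0.1])                  # pmf of Y1 on {1,...,4}, lattice span one
mu = float(np.sum(support * f))                      # E[Y1]
g = np.array([f[support > j].sum() / mu for j in range(support.max())])  # integrated tail pmf

def residual_process(delay, horizon):
    """R_n = S_{N_n} - n for n = 0,...,horizon-1, for a delayed renewal sequence."""
    points, s = [], delay
    while s < horizon + support.max():
        points.append(s)
        s += rng.choice(support, p=f)
    points = np.array(points)
    return np.array([points[np.searchsorted(points, n)] - n for n in range(horizon)])

def coupling_time(horizon=5000):
    R = residual_process(delay=rng.choice(support, p=f), horizon=horizon)        # arbitrary delay
    R_tilde = residual_process(delay=rng.choice(len(g), p=g), horizon=horizon)   # stationary delay
    hits = np.nonzero(R == R_tilde)[0]
    return int(hits[0]) if hits.size else None

T_samples = [coupling_time() for _ in range(2000)]
T_samples = [t for t in T_samples if t is not None]
print("estimated E[T] ~", np.mean(T_samples))
```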
For estimating moments of T, the following lemma will be used (see Lindvall (2002), Lemma (4.1)).

Lemma 10 (i) For each ρ > 0, there exists a constant C = C(ρ) such that ER_n ≤ C + ρn. (ii) Suppose E(Y_1^β) < ∞ for some β ≥ 1. Then

ER_n^β ≤ n EY_1^β for all n ≥ 1.

(iii) If ϕ is an increasing non-negative function on Z_+, then for all n ≥ 1,

Eϕ(R_n) ≤ n Eϕ(Y_1).
Proof (i) Since R_n = S_{N_n} − n, using Wald's second identity (see, e.g., Bhattacharya and Waymire (2021), Chapter 19 and p. 142, or BCPT, p. 72), one has

ER_n = μEN_n − n.

By the renewal theorem,³

μEN_n / n → 1 as n → ∞,

i.e., ER_n / n → 0. Therefore, given any ρ > 0, however small, there exists n_1 such that ER_n < ρn for n ≥ n_1.

1 Bhattacharya and Waymire (2021), p. 108.
2 Throughout, BCPT refers to Bhattacharya and Waymire (2016), A Basic Course in Probability Theory.
Take C = max{ER_n : n ≤ n_1}.
(iii) One has,

P(R_n = k) = Σ_{j≥1} P(R_n = k, S_j < n, S_{j+1} ≥ n)
= Σ_{j≥1} Σ_{0≤i≤n−1} P(S_j = i, Y_{j+1} = n + k − i)
= Σ_{j≥1} Σ_{0≤i≤n−1} f(n + k − i) P(S_j = i)
= Σ_{0≤i≤n−1} ( Σ_{j≥1} P(S_j = i) ) f(n + k − i)
≤ Σ_{0≤i≤n−1} f(n + k − i),    k ≥ 0.
Eϕ(R_n) = Σ_{k≥0} ϕ(k) P(R_n = k) ≤ Σ_{k≥0} ϕ(k) Σ_{0≤i≤n−1} f(n + k − i)
= Σ_{0≤i≤n−1} ( Σ_{k≥0} ϕ(k) f(n + k − i) )
≤ Σ_{0≤i≤n−1} Σ_{k≥0} ϕ(n + k − i) f(n + k − i)
= Σ_{0≤i≤n−1} Σ_{k′≥n−i} ϕ(k′) f(k′)
≤ n Eϕ(Y_1).

We have used the facts that (a)

Σ_{j≥1} P(S_j = i) = E( Σ_{j≥1} 1_{[S_j = i]} ) ≤ 1 for all i,

and (b) ϕ(k) ≤ ϕ(n + k − i) for i < n.

3 See, e.g., Bhattacharya and Waymire (2021), Chapter 8, Theorem 8.5.
Part (ii) is a special case of (iii) with .ϕ(k) = k β .
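As a quick sanity check of Lemma 10(ii), one can estimate ER_n^β by simulation and compare it with the (rather crude) bound nEY_1^β. The sketch below is our own; the pmf, the delay distribution, and the parameter choices are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)

support = np.array([1, 2, 3, 4])
f = np.array([0.4, 0.3, 0.2, 0.1])       # pmf of Y1
beta, n, reps = 2.0, 50, 20000

def residual_at(n, delay):
    # R_n = S_{N_n} - n for a delayed renewal sequence with inter-renewal pmf f
    s = delay
    while s < n:
        s += rng.choice(support, p=f)
    return s - n

samples = np.array([residual_at(n, delay=rng.integers(0, 5)) for _ in range(reps)])
lhs = np.mean(samples.astype(float) ** beta)
rhs = n * np.sum((support.astype(float) ** beta) * f)
print(f"E[R_n^beta] ~ {lhs:.3f}   <=   n*E[Y1^beta] = {rhs:.3f}")
```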
The lemma is helpful in the proof of the following result of Lindvall (2002), Theorem (4.2), which, in turn, will be used to derive the main results of this chapter.

Theorem 14.1 (Lindvall) Let {Y_n : n ≥ 1} be an i.i.d. sequence of strictly positive integer-valued random variables with lattice span 1, and Y_0 a non-negative integer-valued random variable independent of {Y_n : n ≥ 1}. Assume EY_0^β < ∞, EY_1^β < ∞, for some β ≥ 1. Let {Ỹ_n : n ≥ 1} be a sequence independent of {Y_n : n ≥ 1}, but with the same common distribution as Y_1, and with Ỹ_0 having the invariant distribution of the residual renewal process R̃_n. Assume EỸ_0^β < ∞. Then ET^β < ∞, where T is the coupling time defined by (14.1).

Remark 14.1 Before proceeding with the rather long proof, it is important to note that the case β = 1 simply follows from the fact that the Markov chain (R_n, R̃_n), n ≥ 1, is positive recurrent. For it has the unique invariant probability (mass function) g(·) × g(·).

Proof Following Lindvall (2002), Chapter 2, but with some more details from Wasielak (2009), we define stopping times involving the sequences S, S̃ and corresponding values of these sequences for a possible match. Recalling that⁴

P(R_n = 0) → g(0) > 0 as n → ∞,

one can find 0 < γ < g(0) and an integer n_0 such that

P(R_n = 0) > γ for all n ≥ n_0.

Now define, recursively,

τ = inf{j : B_j = 0}, A_0 = S_{n_0}, ν_0 = inf{j : S̃_j ≥ S_{n_0}}, B_0 = S̃_{ν_0} − S_{n_0},
A_1 = S̃_{ν_0 + n_0}, ν_1 = inf{j > ν_0 : S_j ≥ A_1}, B_1 = S_{ν_1} − A_1,     (14.2)

4 See Bhattacharya and Waymire (2021), (8.17).
A2n+1 = S˜ν2n +n0 , n ≥ 0, A2n = Sν2n−1 +n0 , (n ≥ 1), ν2n+1 = inf{j > ν2n : Sj ≥ A2n+1 }, B2n+1 = Sν2n+1 − A2n+1 , n ≥ 1, ν2n = inf{j > ν2n−1 : S˜j ≥ A2n }, B2n = S˜ν2n − A2n n ≥ 1.
Note that .Bj = 0 implies a coupling at the j th step. This stopping time is attained in .τ steps of lengths .Bj , j ≥ 0. Therefore, writing U2n+1 = Sν2n+1 − S˜ν2n ≥ B2n+1 ,
.
U2n+2 = S˜ν2n+2 − Sν2n+1 ≥ B2n+2 , (n ≥ 0),
(14.3)
one has T ≤ S˜ν0 − Sn0 +
1≤j ≤τ
Uj 1[τ ≥j ]
1≤j j −1] .
(14.4)
1≤j 1|G0 ) = P (B1 = 0|G0 ) < 1 − γ .
.
Suppose the inequality holds for j P (τ ≥ j |Gj −1 ) ≡ P (τ > j − 1|Gj −1 ) ≤ (1 − γ )j −1 .
.
Since .P (Bj +1 = 0|Gj ) < 1 − γ , one has P (τ ≥ j + 1|Gj ) ≡ P (τ > j |Gj )
.
= E(1[τ >j −1] 1[Bj +1 =0] |Gj ) = E(1[τ >j −1] E(1[Bj +1 =0] |Gj ) < E(1[τ >j −1] (1 − γ )) < (1 − γ )j ,
completing the induction argument. Thus to prove that .ET β < ∞, we need to show that . E(Bi−1 1[τ ≥j ] ) < ∞. (14.14) j ≥1
We have earlier used the fact .[τ ≥ j ] = [τ > j − 1] ∈ Gj −1 . Therefore, arguing as in (14.11), but using Lemma 10(i), with .ρ = 1/2, .
E(B2n+2 |G2n+1 ) = E((S˜ν2n+2 − S˜ν2n +n0 ) − (Sν2n+1 +n0 − S˜ν2n +n0 ))|G2n+1 ) ≤ C + 1/2B2n+1 , n ≥ 0.
One may similarly estimate .E(B2n+1 |G2n ) and arrive at E(Bj −2 |Gj −1 ) ≤ C + 1/2Bj −2 , j ≥ 3.
.
(14.15)
Using this, we have, for .j ≥ 3, E(Bj −1 1[τ ≥j ] ) = E(E(Bj −1 |Gj −2 )1[τ ≥j −1] )
.
≤ E((C + 1/2Bj −2 )1[τ ≥j −1] ) ≤ C(1 − γ )j −1 + 1/2E(Bj −2 1[τ ≥j −1] ) ≤ C(1 − γ )j −1 + 1/2(C(1 − γ )j −1 + (1/2)E(Bj −3 1[τ ≥j −2] ) .. . ≤C
(1 − γ )j −2−k (1/2)k + (1/2)j −1 E(B0 1[τ ≥j ]
0≤k≤j −2
=C
(1 − γ )j −2−k (1/2)k + (1/2)j −1 EY˜0
0≤k≤j −2
≤C
(1 − γ )j −1−k (1/2)k
0≤k≤j −1
≤ Cj (max{1 − γ , 1/2})j −1 . Before proceeding further, let us state also a criterion for the finiteness of .E exp{ρT } for some .ρ > 0.
Theorem 14.2 (Lindvall) Assume that for some .γ > 0, .E exp{γ Y0 } < ∞, E exp{γ Y˜0 } < ∞, .E exp{γ Y1 } < ∞. Then there exists .ρ > 0 such that .E(exp{ρT }) < ∞. .
Proof We have, for .ρ > 0, by the monotone convergence theorem, E exp{ρT } ≤ E exp{ρ(S˜ν0 − Sn0 +
.
Uj 1[τ >j ] )}
j >0
= lim E exp{ρ((S˜ν0 − Sn0 ) + n→∞
Uj 1[τ >j ] )}.
(14.16)
0j ] )}
0 0, then there exists .0 < ρ ≤ s such that E exp{ρ(η2 − η1 )} < ∞, Eπ exp{ρη1 } < ∞,
.
and (14.28) holds.
7 See Bhattacharya and Waymire (2022), Proposition 20.3, Theorem 20.4.
Proof (a) Let us first show that .supx∈A0 Ex η1r < ∞. Note that .θτj is independent of .τj , j ≥ 1, (see (14.24), (14.25)). Therefore,
Ex η1r =
Ex τjr 1[θτi =0 for 1≤i≤j −1,θτj =1] ,
.
1≤j r0 . Write .τr0 = τ∂Br0 , .τr0 ,N = τr0 ∧ τ∂BN , .
wα (x) = Ex τrα0 , Ex τrα0 ,N = Ex (τr0 ∧ τ∂BN )α , wα,N (x) = Ex (τr0 ∧ τ∂BN )α , Cn,α = n!(2
[r0 ,∞)
s(r)1−1/α m (r)dr)n .
(14.46) By the proof of Theorem 11.8, part (a), providing the upper bound below for w1 (x) = Ex τr0 , given by
.
w1 (x) ≤ 2
.
[r0 ,|x|)
(exp{−I (r)}(
=2
[r0 ,|x|)
s (u)
= 2(s(|x|)
[u,∞)
[u,∞)
m (r)drdu
[|x|,∞)
exp{I (u)}/a(u))du)dr
m (r)dr +
≤ 2(s(|x|)s(|x|)1/α−1 )
[|x|,∞)
[r0 ,|x|]
s(u)m (u)du)
s(u)1−1/α m (u)du
+s(|x|)1/α = 2s(|x|)1/α
[r0 ,|x|]
s(u)1−1/α m (u)du)
s(u)1−1/α m (u)du.
[r0 ,∞)
(14.47)
Let .ε ∈ (0, 1]. Then, wε (x) :=
.
Ex τrε0
≤ (Ex τr0 ) ≤ (2 ε
[r0 ,∞)
s(u)1−1/α m (u)du)ε s(|x|)ε/α . (14.48)
We will prove the following by induction, wn+ε (x) ≤ Cn+ε,α s(|x|)
.
Here Cn+ε,α := (
.
n+ε α
(i + ε))(2
1≤i≤n
[r0 ,∞)
for n + ε ≤ α.
s(u)1−1/α m (u)du)n+ε .
(14.49)
(14.50)
Note that (14.49) has already been proved for .n = 0, with the product in (14.50) over an empty set taken to be 1. Assume (14.49) for .n + ε, and let .n + 1 + ε ≤ α. Lemma 11 and Itô’s lemma (Corollary 8.5), applied to the stopping time .τr0 ,N , imply wn+1+ε,N (x) = −Ex
.
[0,τr0 ,N )
Lwn+1+ε,N (X(u))du
= (n + 1 + ε)Ex
[0,τr0 ,N )
wn+ε,N (X(u)du
≤ (n + 1 + ε)Ex
[0,τr0 ,N )
wn+ε (X(u)du
≤ (n + 1 + ε)Cn+ε,α Ex
[0,τr0 ,N )
s(|X(u)|)(n+ε)/α du, (14.51)
where the last step follows from the induction hypothesis. Next by Lemma 12 below, and integration by parts, we get .
Ex
[0,τr0 ,N )
s(|X(u)|)(n+ε)/α du
≤ Ex
[r0,N ,∞)
exp{−I (u)}
[r0 ,u]
(s(u)(n+ε)/α dv)du
≤ s(|x|)(n+1+ε)/α
[|x|,∞)
s(u)1−1/α m (u)du
+s(|x|)(n+1+ε)/α ≤(
[r0 ,∞)
[r0 ,|x|]
s(u)(1−1/α) m (u)du
s(u)1−1/α m (u)du)s(|x|)(n+1+ε)/α .
(14.52)
Substituting this in (14.51), and letting .N ↑ ∞, the induction is completed. Now take .β = n + ε. For Lemma 12, define for a nonnegative continuous function h on .[r0 , ∞),
F (r) =
.
[r0 ,r]
exp{−I (u)}
[r0 ,u]
and for a positive integer N ,
(h(v)/a(v)) exp{I (v)}dv)du, r0 ≤ r < ∞,
F (r; N) =
F (r)
for r0 ≤ r ≤ N,
0
for r ≥ 2N,
.
such that .F (r; N ) is twice continuously differentiable on .[r0 , ∞). Also write h(r; N ) = h(r) for .r0 ≤ r ≤ N, and continuous on .[r0 , ∞). The following result may be found in Bhattacharya and Ramasubramanian (1982), Lemma 2.1(iv).
.
Lemma 12 One has for .x ∈ ∂Br1 ,
.Ex
[0,τr0 ]
h(|X(s)|)ds ≤ 2
[r0 ,∞)
exp{−I (u)}
[r0 ,u]
(h(v)/a(v)) exp{I (v)}dv)du,
if the integral on the right converges. Proof Consider the function .f (x; N) = F (|x|; N), and note that (See Chapter 11, Lemma 8) 2Lf (x; N ) = A(x)F (|x|; N) + (B(x) − A(x) + C(x))F (|x|; N )/|x|.
.
One also has for .r0 ≤ r ≤ N, . F (r; N ) = exp{−I (r)}
[r0 ,r]
(h(v)/a(v)) exp{I (v)}dv) ≥ 0,
F (r; N ) = −(β(r)/r) exp{−I (r)}
[r0 ,r]
(h(v)/a(v)) exp{I (v)}dv + h(r)/a(r)
= −(β(r)/r)F (r; N) + h(r)/a(r), a(r)(F (r; N) + (β(r)/r)F (r; N)) = h(r).
(14.53)
Therefore (Exercise 4), 2Lf (x; N) ≥ h(|x|; N) for |x| ≤ N.
.
By Itô’s lemma (Corollary 8.5), .f (X(t); N) − martingale, so that
[0,t] Lf (X(s); N)ds, t
≥ 0, is a
Ex
h(|X(s); N)|ds ≤ 2
.
[0,t]
[0,t]
Ex Lf (X(s); N)ds
= 2(Ex F (|X(t); N)|) − F (|x|)).
(14.54)
By the optional sampling theorem (Chapter 1), one may take .t = τ∂BN ∧N in (14.54), and then let .N ↑ ∞, using the monotone convergence theorem, to get the desired result. Our next task is to consider the discrete time Markov process .{X(n) : n = 0, 1, . . . }, i.e., consider the skeleton of the process .{X(t) : t ≥ 0} at times .t = n = 0, 1, 2, . . . . Its one-step transition probability is .p(x, dy) := p(1; x, y)dy. Let .0 < r < r1 , and .A0 = Br2 . Note that .
p(x, B) ≥ λϕ(B) for all x ∈ Br1 , B ∈ B(R), ϕ(B) := m(B∩B r1 )/m(B r1 ),
(14.55)
with m being Lebesgue measure on .R, and λ = min{p(1; x, y) : x, y ∈ B r1 }m(B r1 ).
.
Define the following stopping time for .{X(n) : n = 0, 1, . . . }: For .0 < r < r1 , η1 = inf{n ≥ 1 : X(n) ∈ B r }.
.
(14.56)
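The return time η_1 of the skeleton chain can be explored by simulation. The following sketch is our own illustration, not part of the text: it uses an Euler discretization of a two-dimensional diffusion with an inward drift of our choosing and estimates moments of the first unit-time index at which the skeleton re-enters the closed ball of radius r.

```python
import numpy as np

rng = np.random.default_rng(2)

# Skeleton return time eta_1 = inf{n >= 1 : |X(n)| <= r} for dX = mu(X)dt + dB,
# with the inward drift mu(x) = -x/(1+|x|) (an arbitrary choice for illustration).
r, alpha = 1.0, 2.0
dt, substeps = 0.01, 100            # one unit of time = 100 Euler steps

def mu(x):
    return -x / (1.0 + np.linalg.norm(x))

def skeleton_return_time(x0, max_units=2000):
    x = np.array(x0, dtype=float)
    for n in range(1, max_units + 1):
        for _ in range(substeps):
            x = x + mu(x) * dt + np.sqrt(dt) * rng.standard_normal(2)
        if np.linalg.norm(x) <= r:
            return n
    return max_units                 # truncation; should rarely be hit here

start = np.array([3.0, 0.0])
etas = np.array([skeleton_return_time(start) for _ in range(500)], dtype=float)
print("E[eta_1] ~", etas.mean(), "   E[eta_1^alpha] ~", np.mean(etas ** alpha))
```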
The proof of the following result is quite analogous to the proof of the corresponding result, with .α = 1, in Theorem 11.7. Recall the stopping times .
τ0 = 0,
τ1 = inf{t ≥ 0 : |X(t)| = r}
τ2i+1 = inf{t > τ2i : |X(t)| = r},
τ2i = inf{t > τ2i−1 : |X(t)| = r1 }, i ≥ 1.
Also, let τ2i+1 = inf{t > τ2i+1 : |X(t)| = r }, 0 < r < r < r1 (i ≥ 0).
.
Proposition 14.7 If for some .0 < r1 < r2 , .α ≥ 1, one has
C := sup Ex τ1α < ∞,
(14.57)
.
|x|=r1
recalling the notation in (14.39), then sup Ex η1α < ∞.
.
x∈B r
Proof Consider the events Ei = [τ2i+1 − τ2i+1 ≥ 1],
.
i ≥ 0.
Let .δ := min|x|=r1 Px (E0 ) > 0, by the strong Feller property, and the facts that (i) Px has full support, starting at x, for every x, and (ii) .{x : |x| = r} is compact (see Exercise 16(a)). Let .θ = inf{i ≥ 0 : 1Ei = 1}. Then, since .η1 ≤ τ2θ+2 ,
.
α η1α ≤ τ2θ+2
.
=(
θ
(τ2i+2 − τ2i ))α
i=0
≤ (θ + 1)α−1
θ
(τ2i+2 − τ2i )α .
i=0
Therefore, for all .x ∈ B r , Ex η1α ≤
.
∞ (1 − δ)n (n + 1)α−1 c(α) < ∞, n=0
α ). where .c(α) = 2α−1 (sup|x|=r1 Ex τ1α + sup|x|=r Ex τ∂B r 1
In view of Propositions 14.4(a), Theorem 14.6 and Proposition 14.7, under the assumptions of Theorem 14.6, one has as .n → ∞, ||Pμ ◦ Xn+ − Pπ ◦ Xn+ || = o(n−α+1 ),
.
(14.58)
for the discrete parameter process .{X(n) : n ≥ 0}. Now observe that the adjoint operator Tt∗ ν(dy) :=
p(t; x, dy)ν(dx)
.
S
is a contraction on the Banach space .M of all finite signed measures with total variation norm (Exercise 5)
||ν|| :=
sup
.
B∈B(Rd )
|ν(B)|.
Therefore, with ν = v − π,
.
∗ v = Tt−[t] μ,
(μ a probability measure),
one has .
∗ ||Tt∗ μ − π || = ||Tt∗ (μ − π )|| ≤ ||T[t] (μ − π )||, ∗n ||Tt∗n μ − π || = ||Tt∗n (μ − π )|| ≤ ||T[t] (μ − π )||.
(14.59)
From (14.58) and (14.59) it follows that sup
.
B∈B(Rd )
|Pμ (X(t) ∈ B) − π(B)| = o(t −α+1 ) as t → ∞.
Applying this to the after-t process .{Xt+ }, where Xt+ (s) = X(t + s), s ≥ 0,
.
we have one of the main results of this chapter. Theorem 14.8 (Polynomial Rates of Convergence of Diffusions to Equilibrium) Suppose Assumptions 1,2 hold. Then the diffusion .{X(t) : t ≥ 0} has a unique invariant probability .π . a. If, also, (14.43) holds for some .α ≥ 1, then ||Px ◦ Xt+ − Pπ ◦ Xt+ || = o(t −α+1 ), as t → ∞, for all x ∈ Rd ,
.
(14.60)
uniformly on compact subsets of .Rd . b. If, in addition to the hypothesis of (a), .Eπ τ1α < ∞, where .τ1 is defined in (14.39), then ||Px ◦ Xt+ − Pπ ◦ Xt+ || = o(t −α ), as t → ∞,
.
(14.61)
uniformly (in x) on compact sets. If also .Eμ τ1α < ∞, then .Px may be replaced by .Pμ in (14.61). Proof This follows from Theorem 14.3(a), Theorem 14.6, Proposition 14.4(a), and (14.59). We finally turn to a criterion for exponential rates of convergence. As in the case of deriving polynomial rates of convergence for a diffusion {X(t)} from the
corresponding rates for its discrete skeleton .{X(n)}, we will first prove the following analog of Proposition 14.7. Proposition 14.9 With the notation as before, assume that there exists .γ > 0 such that .
sup Ex exp{γ τ1 } < ∞.
|x|=r1
Then there exists .0 < γ < γ , such that .
sup Ex exp{γ η1 }. x∈Br
Proof With the same notation as in the proof of Proposition 14.7, for .i ≥ 1, and some .γ > 0 sufficiently small, Ex exp{γ (τ2i+2 − τ2i )} = Ex (Ex exp{γ (τ2i+2 − τ2i )}|Fτ2i ))
.
≤ sup Ex exp{γ (τ2i+2 − τ2i )} = b(γ ), say, |x|=r1
since, by the strong Markov property, the .Fτ2i -distribution of .X(t), t ≥ 0, is its distribution starting at .X(τ2i ) ∈ {x : |x| = r1 }. Therefore, .
Ex (exp{γ
n (τ2i+2 − τ2i )}1∩n−1 E c ∩En ) j =0 j
i=0
n−1 (τ2i+2 − τ2i )}1∩n−1 E c )Ex (exp{γ (τ2n+2 − τ2n )}1En |Fτ2n ) = Ex (exp{γ j =0 j
i=0
≤ b(γ )Ex (exp{γ
n−1 (τ2i+2 − τ2i )}1∩n−1 E c ). i=0
j =0 j
c Now, .En−1 ∈ Fτ2n−1 , and .F2n−2 ⊂ Fτ2n−1 ⊂ F2n , so that .
c |Fτ Ex (exp{γ (τ2n − τ2n−2 )}1En−1 2n−2 ) c |Fτ = Ex (Ex (exp{γ (τ2n − τ2n−2 )}|Fτ2n−1 )1En−1 2n−2 ).
Note that .|X(τ2n−1 )| = r . Therefore, the last expression reduces to .
c |Fτ E(Ex (exp{γ (τ2n − τ2n−1 )}|Fτ2n−1 ) exp{γ (τ2n−1 − τ2n−2 )}1En−1 2n−2 )) c |Fτ ≤ Ex (exp{γ (τ2n−1 + 1 − τ2n−2 )}1En−1 2n−2 ) c |Fτ = c1 eγ Ex (exp{γ (τ2n−1 − τ2n−2 )}1En−1 2n−2 ),
where .c1 = sup|x|=r Ex exp{γ τ∂Br1 } < ∞, if .γ is sufficiently small (Exercise 16(b)). Finally, .
c |Fτ Ex (exp{γ (τ2n−1 − τ2n−2 )}1En−1 2n−2 ) c |Fτ = Ex (Ex (exp{γ (τ2n−1 − τ2n−2 )}1En−1 2n−1 )|Fτ2n−2 )) c |Fτ = Ex ((exp{γ (τ2n−1 − τ2n−2 )})Ex (1En−1 2n−2 ))
≤ Ex (exp{γ (τ2n−1 − τ2n−2 )}|Fτ2n−2 ) sup Px (τ∂Br ≤ 1) |x|=r
≤ (1 − δ)Ex exp{γ (τ2n−1 − τ2n−2 )} ≤ c1 eγ Ex sup Ex exp{γ τ1 }(1 − δ). |x|=r1
Continuing in this way, one arrives at Ex (exp{γ
.
n (τ2i+2 − τ2i )}1∩n−1 E c ∩En ) j =0 j
i=0
≤ (c1 eγ )n−1 b(γ )( sup exp{γ τ1 })n−1 (1 − δ)n , |x|=r1
where .c1 = sup|x|=r Ex exp{γ τ∂Br1 } and .b(γ ) = sup|x|=r1 Ex exp{γ (τ2 − τ0 )}. If, instead of .γ , one chooses .γ < γ sufficiently small, then one would get .
sup Ex exp{γ η1 } = sup Ex exp{γ η1 }. |x|≤r
x∈Br
Note that one needs .γ such that
c1 eγ sup Ex exp{γ τ1 } < (1 − δ)−1 .
.
|x|=r1
Except for .c1 , the other terms converge to 1 as .γ ↓ 0. Choosing .γ sufficiently close to .γ1 , one can reduce .c1 sufficiently and, at the same time make .δ large and .1 − δ sufficiently small. Theorem 14.10 (Exponential Rates of Convergence of Diffusions to Equilibrium) Suppose Assumptions 1 and 2 hold, and that for some .r1 > r0 > 0, .δ > 0, .λ > 0, .
Then
Trace(a(x)) − (2 − δ)A(x) + 2xμ(x) < −λ for |x| ≥ r0 . |x|2
(14.62)
(i) There exists .λ , .0 < λ < λ such that .
sup Ex exp{λ τ1 } < ∞,
(14.63)
x∈∂Br1
where .τ1 := τ∂Br0 . (ii) There exists .λ > 0 such that ||Px ◦ Xt+ − Pπ ◦ Xt+ || = o(exp{−λ t}) as t → ∞,
.
(14.64)
uniformly on compact subsets of .Rd . If .Eμ exp{λτ1 } < ∞, then (14.64) holds for .Pμ in place of .Px . Proof Let g(t, x) = exp{γ t}|x|δ = exp{γ t}g(x)
.
for some positive constant .γ to be determined later. Let .ηM = inf{t ≥ 0 : |X(t)| = M}. By Itô’s Lemma, .
Ex [exp{γ (τ1 ∧ ηM )}g(|X(τ1 ∧ ηM )|) − |x|δ = Ex exp{γ s}[γ |X(s)|δ + Lg(X(s)]ds, r0 ≤ |x| ≤ M. (14.65) [0,τ1 ∧ηM ]
Note that 2Lg(x) = δ|x|δ−2 [(δ − 2)A(x) + B(x) + C(x)].
.
Hence, by (14.62), 2Lg(x) ≤ −λδ|x|δ for |x| ≥ r0 .
.
Choose .γ = (δ/2)λ, to show that .γ |x|δ + Lg(x) ≤ 0 for .|x| ≥ r0 . Hence (14.65) yields Ex [exp{γ τ1 ∧ ηM }g(|X(τ1 ∧ ηM )|)] ≤ |x|δ for r0 ≤ |x| ≤ M.
.
Letting .M ↑ ∞, one obtains Ex exp{λ τ1 } ≤ r1δ for |x| = r1 ,
.
for all .0 < λ < λ. The proof of (14.63) is completed by using Proposition 14.9 and Proposition 14.4(b).
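As an illustration of the exponential rate (not taken from the text), the following sketch simulates the Ornstein–Uhlenbeck process dX = −X dt + dB, whose stationary law is N(0, 1/2), started far from equilibrium, and prints the Kolmogorov (sup-CDF) distance between the empirical law of X(t) and the stationary law. This is only a crude proxy for the total variation distance in (14.64); all parameter choices are ours.

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(3)

def stationary_cdf(z):
    # cdf of N(0, 1/2)
    return 0.5 * (1.0 + erf(z / sqrt(2 * 0.5)))

n_paths, dt, x0 = 20_000, 0.01, 5.0
x = np.full(n_paths, x0)
grid = np.linspace(-3, 3, 201)
for t_end in range(1, 7):
    for _ in range(int(1.0 / dt)):                      # advance one time unit
        x = x - x * dt + np.sqrt(dt) * rng.standard_normal(n_paths)
    emp = np.array([np.mean(x <= z) for z in grid])
    theo = np.array([stationary_cdf(z) for z in grid])
    print(f"t = {t_end}:  sup|F_t - F_inf| ~ {np.max(np.abs(emp - theo)):.4f}")
```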
Remark 14.6 In view of the results of Stroock and Varadhan (1979), the smoothness Assumption 1 may be relaxed to: (i) .a(x) is continuous and positive definite, (ii) the drift .μ(x) is Borel measurable and bounded on compacts (See Bhattacharya (1978)). In one dimension the most general assumptions are those of Feller (see Chapter 21). Remark 14.7 In the self-adjoint case, Chen (2006) and his collaborators have obtained sharp results on the speed of convergence in terms of the spectral gap. Also see Chen and Wang (1995), Mao (2006) in this regard. Remark 14.8 A successful coupling of a Markov process, in the sense described in this chapter, implies convergence in total variation distance of the process (being coupled) to the one under equilibrium. There are important classes of Markov processes with unique invariant probabilities and convergence in distribution to equilibrium under every initial state, but for which the convergence is not in total variation distance (see Bhattacharya and Waymire (2022), Chapters 18 and 19). See Exercise 11. Next let us consider some illustrative examples. Some may defy intuition. Example 1 (Polynomial Rates of Convergence to Equilibrium for One-Dimensional Diffusions) Let .{X(t)} be a positive recurrent one-dimensional diffusion. Although Theorem 14.8 applies, one may improve it, or simplify, because of point recurrence. Take two distinct points .x0 = 0, .x1 and define stopping times τj =: inf{t > τj −1 : X(t) = x0 after visiting x1 }, j ≥ 1,
.
τ0 = 0. Then .{τj − τj −1 : j ≥ 2} are i.i.d, and one may obtain a sufficient condition for rate of convergence .o(t −α+1 ), .α ≥ 1, of .Px ◦ Xt+ to equilibrium .Pπ ◦ Xt+ assuming .E(τ2 − τ1 )α < ∞. This amounts to (14.43), by deleting the ‘overbar’, namely,
.
.
[0,∞)
s(y)1−1/α m(dy) < ∞,
|s(y)|1−1/α m(dy) < ∞,
(14.66)
(−∞,0)
where .
I (y) =
(2μ(z)/σ (z))dz (y ∈ R), s(y) = 2
[0,y)
m(y) =
[0,y]
2(exp{I (y)}/σ 2 (y))dy,
[0,y)
exp{−I (z)}dz, (14.67)
with the usual convention, . (y,0) = − (0,y) . Usually it is the case that positive recurrence occurs if the drift .μ(y) < 0 for large positive y, and .μ(y) > 0 for y negative of large magnitude. Some curious examples follow. Assume, for simplicity, the coefficients to be locally Lipschitz. Let .μ(y) = 0 for all y, .σ 2 (y) > 0 and grows fast as .|y| → ∞, such that
m(∞) =
.
[0,∞)
(2/σ 2 (y))dy < ∞, −m(−∞) =
(2/σ 2 (y))dy < ∞. (−∞,0]
(14.68) To see this, first note that .s(y) = y for all y, implying .s(∞) = ∞, s(−∞) = −∞. Hence the diffusion is recurrent (Theorem 8.8); in particular, there is no explosion. Next, (14.68) says that .m(∞) < ∞, and .m(−∞) > −∞. That is, the diffusion is positive recurrent (Theorem 11.9(c) and (11.33)), with invariant probability .π having the density π (y) = m (y)/(m(∞) − m(−∞)).
.
(14.69)
(i) Consider the special case μ(·) = 0, σ²(y) = 1 + |y|^a for some a. Then check that the diffusion is null recurrent if a ≤ 1, and positive recurrent if a > 1; also, for a given α ≥ 1, the rate of convergence to equilibrium is o(t^{−α+1}) as t → ∞, if a > 2 − 1/α (Exercise 6). (ii) Now consider the case μ(y) ≡ θ ≠ 0, σ²(y) = 1 + |y|^a for some a. Then check that the diffusion is transient if a < 1, and positive recurrent if a > 1. If α ≥ 1 and a > 2 − 1/α, then the rate of convergence to equilibrium is o(t^{−α+1}) as t → ∞. What happens for a = 1 depends on the value of θ (Exercise 7). (iii) Finally, note that the specifications of μ(y) and σ²(y) in (i), (ii) are only needed for |y| > R, for some R, however large R may be (Exercise 12). Thus, surprisingly, although the constant drift tends to push the diffusion to ∞ if θ > 0, and toward −∞ if θ < 0, the large diffusion coefficient, with a > 1, prevents transience, and even allows convergence to a steady state!
.
Example 2 (Polynomial and Exponential Rates of Convergence to Equilibrium for Multi-dimensional Diffusions) First, here are analogs of Examples 1(i), (ii). Let d > 1, and let I_d denote the d × d identity matrix. We will denote the diffusion matrix by D(y). We will consider radial diffusions, although one may similarly consider non-radial diffusions with radial approximations. (i) Let μ(·) = 0, D(y) = (1 + |y|^a) I_d. Then, using the notation (14.42) with r_0 = 1, β(r) = d − 1, ã(r) = 1 + r^a, I(r) = (d − 1) ln r, s(r) = ln r if d = 2; s(r) = (1 − r^{−(d−2)})/(d − 2) if d > 2. Then, for all a, the diffusion is recurrent if d = 2 and transient if d > 2. If a > 2, the diffusion is positive recurrent for d = 2. In the latter case, the rate of convergence to equilibrium is o(t^{−α+1}) for all α ≥ 1, in view of Theorem 14.8. That is, the convergence to equilibrium is faster than any polynomial rate (Exercise 7)! If a ≤ 2, then the two-dimensional diffusion is null recurrent. (ii) Let μ(y) = θy for some non-zero θ, D(y) = (1 + |y|^a) I_d. In this case β(r) = d − 1 + (2θr²/(1 + r^a)). Suppose θ > 0. Then one may check that for all a ≤ 2, the diffusion is transient, whatever be the dimension d ≥ 2. For a > 2 the diffusion is recurrent for d = 2, and transient for d > 2. In
this model the diffusion is transient for .d > 2, whatever a is. For .d = 2, the diffusion is transient if .a < 2, and null recurrent if .a ≥ 2 (Exercise 8). (iii) Next, consider the case (ii), but with .θ < 0. Then the diffusion is positive recurrent for all values of a. The speed of convergence to equilibrium is exponential (1) for all values of a, if .d = 1, (2) for .a < 2, for all d, and (3) for .a = 2, if .|θ | > d − 2. This follows from Theorem 14.10 (Exercise 9). (iv) The results (i)-(iii) also hold if the hypothesis about .μ(y) and .D(y) hold only for .|y| > R, however large R may be (Exercise 13).
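The integrals behind Examples 1 and 2 are easy to probe numerically. The sketch below is our own companion to Example 1(ii) (parameter values, truncation, and grids are arbitrary choices); it only indicates, not proves, which of the integrals in (14.66)–(14.68) converge on the positive half-line.

```python
import numpy as np

# Example 1(ii): mu(y) = theta, sigma^2(y) = 1 + |y|^a.  Probe s(+inf), m(+inf),
# and the rate integral int s(y)^{1-1/alpha} m(dy) on a large truncated range.
theta, a, alpha = 0.5, 1.6, 2.0       # note a > 2 - 1/alpha here

y = np.geomspace(1e-3, 1e6, 200_000)  # grid on (0, infinity), truncated at 1e6
dy = np.diff(y, prepend=y[0])
sig2 = 1.0 + y ** a
I_y = np.cumsum(2.0 * theta / sig2 * dy)          # I(y) = int_0^y 2 mu/sigma^2
s_prime = np.exp(-I_y)                            # scale density
s_y = np.cumsum(s_prime * dy)                     # s(y)
m_prime = 2.0 * np.exp(I_y) / sig2                # speed density

print("s(1e6)        ~", s_y[-1], "  (recurrence needs s(+inf) = +inf)")
print("m(+inf)       ~", np.sum(m_prime * dy), "  (positive recurrence needs this finite)")
print("rate integral ~", np.sum(s_y ** (1 - 1 / alpha) * m_prime * dy),
      "  (finite suggests the o(t^{-alpha+1}) rate, cf. (14.66))")
```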
Exercises 1. Prove that, in the notation of(14.26), the integer-valued sequences {η0 , ηn − ηn − 1 : n ≥ 1} and {η˜ 0 , η˜ n − η˜ n−1 : n ≥ 1}, with η˜ 0 having the invariant distribution of the residual renewal process, are independent and satisfy the hypotheses of Theorem 14.1, with Yn = ηn − ηn−1 , Y˜n = η˜ n − η˜ n−1 , n ≥ 1. 2. Prove that a birth–death chain on a finite set, with both boundaries reflecting, has a unique invariant probability, and the convergence to equilibrium is exponential, no matter the initial state. 3. In Remark 14.5, directly prove Ex τ n < ∞, for all n ≥ 1. 4. Check (14.53) and that 2Lf (x; N) ≥ h(|x|; N) for |x| ≤ N. 5. Prove that the adjoint operator T ∗ is a contraction on the space of finite signed measures endowed with total variation norm. 6. Check the assertions in Example 1(i), and compute the invariant distribution when σ 2 (y) = exp{|y|}, μ(y) = 0. 7. Check the assertions in Example 1(ii), with μ(y) = θ , σ 2 (y) = 1 + |y|a . Describe the asymptotics in the case a = 1. 8. Check the assertions in Example 2(i). 9. Check the assertions in Example 2(ii). 10. Check the assertions in Example 2(iii). 11. Find a Markov process on an interval [a, b] with a transition probability p having the Feller property, such that p(n) (x, dy) is supported on a countable set for every n and every x ∈ [a, b], but has a unique absolutely continuous invariant probability. 12. Show that the specifications of μ(y), σ 2 (y) in Example 1 are needed only for |y| > R, however large R may be. 13. Prove the assertion in part (iv) of Example 2. 14. In Example 1(ii), assume that μ(y) → θ = 0 as |y| → ∞, and that σ 2 (y) ∼ 1 + |y|a as |y| → ∞. (Here ∼ means the ratio of the two sides goes to 1). Derive the asymptotics in this case without requiring specifications of μ(y) and σ 2 (y) for all y. 15. (Stochastic Logistic Growth Model) (a) Consider the one-dimensional diffusion of Example 1 in Chapter 11: dX(t) = rX(t)(k−X(t))dt +σ X(t)dB(t), t ≥ 0,
Exercises
where r > 0, k > 0, θ > 0. Estimate the speed of convergence to equilibrium if rk − θ > σ²/2.
16. (a) In the proof of Proposition 14.7, show that δ := inf_{|x|=r_1} P_x(τ′_1 − τ_1 ≥ 1) > 0. [Hint: If |x| = r_1, then P_x(τ′_1 − τ_1 ≥ 1) = P_x(X(τ_1) < ∞, τ′_1 − τ_1 ≥ 1) = E_x P_{X(τ_1)}(τ_{∂B_{r′}} ≥ 1) ≥ inf_{|x|=r} P_x(τ_{∂B_{r′}} ≥ 1) > 0, since, for |x| = r, P_x(τ_{∂B_{r′}} ≥ 1) > 0, because (i) P_x has full support, (ii) x → P_x(τ_{∂B_{r′}} ≥ 1) is continuous on |x| = r by the strong Feller property, and (iii) the last infimum is achieved on the compact set {x : |x| = r}.] (b) In the proof of Proposition 14.9, prove that sup_{|x|=r′} E_x exp{γ τ_{∂B_{r_1}}} < ∞ for a sufficiently small γ > 0. [Hint: P_x(τ_{∂B_{r_1}} > 1) < 1 for all x ∈ {x : |x| ≤ r′}, in view of the full support of P_x. Let θ := sup_{|x|≤r′} P_x(τ_{∂B_{r_1}} > 1). Then θ < 1 by the continuity of x → P_x(τ_{∂B_{r_1}} > 1) (strong Feller property). Also, iterating and using the strong Markov property repeatedly with the stopping time τ_{∂B_{r_1}}, one has P_x(τ_{∂B_{r_1}} > n) ≤ θ^n for all n. Hence E_x exp{γ τ_{∂B_{r_1}}} ≤ Σ_{n=0}^{∞} e^{γ(n+1)} θ^n < ∞ if e^γ < 1/θ.]
Chapter 15
Probabilistic Representation of Solutions to Certain PDEs
The theory and methods of partial differential equations (pdes) have a long history in the context of diffusions. This is natural from the more analytic perspective of semigroup theory. However, the introduction of stochastic calculus in terms of stochastic differential equations provides an alternative probabilistic approach in which to analyze pdes.
Recall from Chapter 2 that from an analytic point of view, the Markov property is partly conveyed in the semigroup property of the family of linear operators .{Tt : t ≥ 0} defined on the vector space of bounded, measurable functions f by Tt f (x) = Ex f (Xt ) =
.
Rk
t ≥ 0,
f (y)p(t; x, dy),
x ∈ Rk .
(15.1)
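For readers who want a computational picture of (15.1) and of the generator limit discussed below, here is a small Monte Carlo sketch (ours, not from the text; the coefficients and the test function are arbitrary choices): it estimates T_t f(x) = E_x f(X_t) by an Euler scheme for dX = μ(X)dt + σ(X)dB and compares (T_t f(x) − f(x))/t with Af(x) = μ(x)f′(x) + (1/2)σ²(x)f″(x) for small t.

```python
import numpy as np

rng = np.random.default_rng(4)

# dX = -X dt + dB (so mu(x) = -x, sigma(x) = 1) and f(x) = cos(x), for which
# Af(x) = x*sin(x) - cos(x)/2.
x0, n_paths, dt = 1.0, 200_000, 1e-3

def T_t_f(x, t):
    xs = np.full(n_paths, x)
    for _ in range(int(round(t / dt))):
        xs = xs + (-xs) * dt + np.sqrt(dt) * rng.standard_normal(n_paths)
    return np.mean(np.cos(xs))

Af_exact = x0 * np.sin(x0) - 0.5 * np.cos(x0)
for t in [0.2, 0.1, 0.05]:
    approx = (T_t_f(x0, t) - np.cos(x0)) / t
    print(f"t = {t}:  (T_t f - f)/t ~ {approx:.4f}   vs   Af(x) = {Af_exact:.4f}")
```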
That is, one has Tt+s f = Tt Ts f,
.
s, t ≥ 0.
Let .Cb (Rk ) be the Banach space of bounded, continuous functions on .Rk with the uniform norm. Observe that .Tt : Cb (Rk ) → Cb (Rk ) is implied by the Feller property established as a result of Proposition 7.8. The necessity of .Tt f − f = o(1) uniformly as .t ↓ 0, may be recast in terms of Itô’s lemma as follows: Tt f (x) − f (x) = E{f (Xxt − f (x)} = E
.
0
t
Af (Xsx )ds.
(15.2)
Further conditions on .f ∈ Cb (Rk ) generally involve bounds on norms of x x .μ(Xs ), ∇(Xs ), and, more generally, .Af as follows. As noted earlier, the Lipschitz condition on .μ, for example, implies |μ(Xxs )| ≤ |x| + K|Xxs − x|.
.
A condition of polynomial growth of .∇f , i.e., .|∇f (x)| ≤ c(1 + |x|2m ) for some positive constant c and nonnegative integer m, controls the size of .∇f (Xxs ) in (15.2) in terms of a moment bound. In this regard, it is useful to recall Doob’s .Lp maximal inequality for moments of right-continuous martingales .{Zt : α ≤ t ≤ β} with p .E|Zt | < ∞, p > 1 (Theorem 1.15), E| max Zt |p ≤ (
.
α≤t≤β
p p ) E|Zβ |p . p−1
(15.3)
A function .f : Rk → R is said to vanish at infinity if .|f (x)| → 0 as .|x| → ∞. Proposition 15.1 Consider the semigroup .{Tt : t ≥ 0} defined by (15.1) on the space .Cb (Rk ), where .{Xx : x ∈ Rk } is the diffusion defined by Lipschitzian coefficients .μ(·), σ (·) having Lipschitz constant K. Recall the linear operator A introduced in (8.13). Namely, A=
.
1≤i≤k
μ(i) (x)
1 ∂2 ∂ + a (x) , ij ∂x i 2 ∂x i ∂x j
D = ((aij )) = σ σ t .
1≤i,j ≤k
Then the semigroup and its generator have the following properties: 1. (a) . Tt f − f → 0 as .t ↓ 0, for every real-valued Lipschitzian f on .Rk vanishing at infinity. (b) If .μ(·), σ (·) are bounded, then .||Tt f − f || → 0 for all bounded, uniformly continuous functions f . 2. Let f be a real-valued twice continuously differentiable function on .Rk . a. If .Af and .grad f are polynomially bounded (i.e., .|Af (x)| ≤ c1 (1 + |x|m1 ), m .| grad f (x)| ≤ c2 (1 + |x| 2 ), then .Tt f (x) → f (x) as .t ↓ 0, uniformly on every bounded set of .x’s. b. If .Af is bounded and .grad f polynomially bounded, then . Tt f − f → 0. Proof Given .ε > 0, let .K(ε) be such that .|f (x)| < ε/2 if .|x| > K(ε). For .|x| ≤ K(ε), .|Ef (Xtx ) − f (x)| < C(ε)t for some constant .C(ε) (using Exercise 7(i) with .||X0 ||2m = K(ε)); hence .|Tt f (x) − f (x)| < ε if .t < ε/C(ε). For .|x| > K(ε), .|f (x)| < ε/2, so that .|Tt f (x) − f (x)| < |Tt f (x)| + |f (x)| < ε, if .t < ε/(2||f ||). Thus, if .t < δ := min{ε/C(ε), ε/(2||f ||)}, then .||Tt f − |f || < ε. For (1b), see Exercise 8. For (2a), use Itô’s lemma and Exercise 7(iii). For (2b), use Exercise 6(ii).
Definition 15.1 (Infinitesimal Generator) Let .D denote the set of all real-valued bounded continuous functions f on .Rk such that . t −1 (Tt f − f ) − g → 0 as .t ↓ 0, for some bounded continuous g (i.e., .g = ((d/dt)Tt f )t=0 ), where the convergence ˆ for of the difference quotient to the derivative is uniform on .Rk . Write .g = Af ˆ such f . The operator .A on the domain .D is called the infinitesimal generator of .{Tt : t ≥ 0}. Remark 15.1 In this context, .A is often referred to as the formal infinitesimal generator, according to which .Af (x) is given by the indicated pointwise operations for all twice continuously differentiable functions f . ˆ one To determine functions in .D together with an explicit recipe to compute .A requires conditions such that Tt f − f − tg = o(t),
uniformly ast ↓ 0.
.
Once again, this may be analyzed by an application of Itô’s lemma to .f (Xxt ), minimally requiring f to be twice continuously differentiable to get started. Proposition 15.2 Assume that .μ(·) and .σ (·) are Lipschitz. 1. Every twice continuously differentiable f , which vanishes outside some bounded ˆ = Af where .A is given by (8.35) or in set, belongs to .D and, for such f , .Af Proposition 15.1. 2. Let f be a real-valued, bounded, twice continuously differentiable function on k .R such that .grad f is polynomially bounded, and .Af (x) → 0 as .|x| → ∞. ˆ = Af. Then .f ∈ D and .Af ˆ Suppose .f, g ∈ D are such that .f (y) = g(y) in a 3. (Local Property of .A) ˆ (x) = Ag(x). ˆ neighborhood .B(x : ε) ≡ {y : |y − x| < ε} of .x. Then .Af Proof For (a) suppose .f (x) = 0 for .|x| ≥ R. For each .ε > 0 and .x ∈ Rk , use Itô’s lemma to obtain .|Tt f (x) − f (x) − tAf (x)| ≤ tθ (Af, ε) + 2t Af P max |Xxs − x| ≥ ε , 0≤s≤t
where .θ (h, ε) = sup{|h(y)−h(z)| : y, z ∈ Rk , .|y−z| < ε}. As .ε ↓ 0, .θ (Af, ε) → 0, and for each .ε > 0, P ( max |Xxs − x| ≥ ε) → 0
.
0≤s≤t
as t ↓ 0
c uniformly for all .x in .BR+ε := {y : |y| ≤ R +ε} (see Exercise 8(iii)). For .x ∈ BR+ε ,
|Tt f (x) − f (x) − tAf (x)| = |E
.
0
t
Af (Xxs )ds| ≤ t Af P (τ x ≤ t),
where .τ x := inf{t ≥ 0 : |Xxt | = R}. But .P (τ x ≤ t) ≤ sup{P (τ y ≤ t) : |y| = R+ε}, by the strong Markov property applied to the stopping time .τ := inf{t ≥ 0 : |Xxt | = R + ε}. Now P (τ y ≤ t) ≤ P ( max |Xys − y| ≥ ε) = o(t)
.
0≤s≤t
as t ↓ 0,
uniformly for all .y satisfying .|y| = R + ε. This argument extends to give (b) (see ˆ consider .Tt f (x) − Tt g(x) = o(t) Exercise 6). To prove the local property (c) of .A as .t ↓ 0, by Exercise 8. Example 1 (Brownian Motion) In the case .μ = 0, σ = I, .Xxt = x + Bt , t ≥ 0 is k-dimensional standard Brownian motion starting at .x and .A = 12 Δ. Proposition 15.3 (Initial-Value Problem) Let .μ(·), σ (·) be Lipschitzian. Adopt the notation of Definition 15.1. ˆ t f = Tt Af ˆ . a. If .f ∈ D, then .Tt f ∈ D for all .t ≥ 0, and .AT b. (Existence of a Solution) Let f satisfy the hypothesis 1 or 2 of Proposition 15.2. The function .u(t, x) := Tt f (x) satisfies Kolmogorov’s backward equation ∂u(t, x) ˆ = Au(t, x) ∂t
.
(t > 0, x ∈ Rk ),
and the initial condition u(t, x) → f (x)
.
as t ↓ 0, uniformly on Rk .
If, in addition, .u(t, x) is twice continuously differentiable in .x, for .t > 0, then ˆ Au(t, x) = Au(t, x). c. (Uniqueness) Suppose .v(t, x) is continuous on .[0, ∞) × Rk }, once continuously differentiable in .t > 0 and twice continuously differentiable in .x ∈ Rk , and satisfies .
.
∂v(t,x) ∂t
= Av(t, x)
(t > 0, x ∈ Rk ),
limt↓0 v(t, x) = f (x)
(x ∈ Rk ),
where f is continuous and polynomially bounded. If .Av(t, x) and .| grad v(t, x)| = |(∂/∂x (1) , . . . , ∂/∂x (k) )v(t, x)| are polynomially bounded uniformly for every compact set of time points in .(0, ∞), then .v(t, x) = u(t, x) := Ef (Xxt ).
.
Proof For (a), .
1 Tt (Th f − f ) ˆ → Tt Af (Th (Tt f ) − Tt f ) = h h
in “sup norm,” since . Tt g ≤ g for all bounded measurable g. Part (b) follows from Proposition 15.2 and part 1. For part (c) on uniqueness, for each .t > 0 use Itô’s lemma to the function .w(s, x) = v(t − s, x) to get Ev(ε, Xxt−ε ) − v(t, x) = E
t
.
where .v0 (t, y) := Ef (Xxt ).
∂ ∂t v(t, y).
ε
−v0 (r, Xxt−r ) + Av(r, Xxt−r ) dr = 0,
Let .ε ↓ 0, to get .Ev(0, Xxt ) = v(t, x), i.e., .v(t, x) =
Proposition 15.4 (Dirichlet Problem) Let G be a bounded open subset of .Rk . Assume that .μ(·), σ (·) are Lipschitzian and that, for some i, .aii (x) := |σ i (x)|2 > 0 ¯ Suppose v is a twice continuously differentiable function on G, for .x ∈ G. ¯ satisfying continuous on .G, Av(x) = −g(x)
.
v(x) = f (x)
(x ∈ G), (x ∈ ∂G),
¯ respectively. Assume where f and g are (given) continuous functions on .∂G and .G, that v can be extended to a twice continuously differentiable function on .Rk . Then v(x) = Ef (Xxτ ) + E
τ
.
0
g(Xxs )ds
¯ (x ∈ G),
where .τ := inf{t ≥ 0 : Xxt ∈ ∂G}. Proof Note that v can be taken to be twice continuously differentiable on .Rk with compact support. Apply Itô’s lemma to .{v(Xxt ) : t ≥ 0)}, and then use the optional stopping theorem. Remark 15.2 It is known that twice-continuously differentiable functions on a bounded open set .G ⊂ Rk with derivatives having continuous limits at the boundary k 1 .∂G have extensions to continuously differentiable functions on .R if .∂G is .C 1 smooth.
15.1 Feynman–Kaˆc Formula for Multidimensional Diffusion Proposition 15.5 (Feynman–Kaˆc Formula) Let μ(·), σ (·) be Lipschitzian. Suppose u(t, x) is a continuous function on [0, ∞) × Rk , once continuously differentiable in t for t > 0, and twice continuously differentiable in x on Rk , satisfying
1 For example, see Stein (1970), Chapter VI.
∂u(t, x) = Au(t, x) + v(x)u(t, x) ∂t
(t > 0, x ∈ Rk ),
u(0, x) = f (x), (15.4) where f is a polynomially bounded continuous function on Rk , and v is a continuous function on Rk that is bounded above. If grad u(t, x) and Au(t, x) are polynomially bounded in x uniformly for every compact set of t in (0, ∞), then .
u(t, x) =
.
E(f (Xxt ) exp{
t
0
v(Xxs )ds})
(t ≥ 0, x ∈ Rk ).
Proof For each t > 0, apply Itô’s lemma to the (k + 1) × (k + 1) stochastic differential equation for Y(s) := (Y1 (s), Y2 (s)) = (Xxs ,
s
.
0
v(Xxs )ds ), ϕ(s, y) = u(t − s, y1 ) exp(y2 ),
where y = (y1 , y2 ).
Remark 15.3 An obviously equivalent form of (15.4) frequently occurs in applications with v(x)u(t, x) replaced by −v(x)u(t, x) under the condition2 that v is bounded below, often, in fact, nonnegative. The Feynman–Kaˆc representation admits a particularly nice probabilistic interpretation in this context; see Exercise 3. Also, in the case that λ is a (positive) constant, note that the Feynman–Kaˆc formula may be obtained with the aid of Proposition 15.2 by transforming to an initial value problem for w(t, x) = eλt u(t, x) by introducing an integrating factor. Example 2 (Integrated Geometric Brownian Motion) Let B = {Bt : t ≥ 0} denote standard Brownian motion started at 0. Also let μ, σ 2 > 0 be given real number parameters and define At :=
.
t
exp{μs + σ Bs }ds,
t ≥ 0.
(15.5)
0
The stochastic process A = {At : t ≥ 0} arises in mathematical finance where one considers so-called Asian options. These are derivative contracts that are written to be contingent on the value of averaged price process A over some specified time horizon T , rather than over the geometric Brownian motion model Zt = exp{μt + σ Bt }, 0 ≤ t ≤ T , often used in finance to model underlying prices in continuous time. One motivation for the development of such contracts is that they are less susceptible to short-term manipulation of the underlying asset.3 The
2 See Chen et al. (2003) for a derivation of the Itô (1965) condition on the potential for validity of a Feynman–Kac formula.
3 Also see Thomann and Waymire (2003) for an alternative application to contracts written on natural resources.
stochastic process (15.5) is also of interest in the physics of polymers and other disordered systems.4 As an illustration of the use of the Feynman–Kaˆc formula, we will use it here to derive a second-order linear parabolic equation governing the density a(t, y) of At , t ≥ 0. In particular, although At is clearly not a Markov process, we will derive the following surprising fact.5 Proposition 15.6 The density defined by P (At ∈ dy) = a(t, y)dy exists for t > 0 and satisfies .
∂2 1 1 ∂a ∂ = 2 { σ 2 y 2 a} − {[( σ 2 + μ)y + 1]a}, t > 0, a(0, y) = δ0 (y). ∂t ∂y 2 ∂y 2
Proof One may directly check that At has a density (Exercise 1). Let Zt := μt + σ Bt and define Xtx := e−Zt (x + At ), t ≥ 0. Then one may readily check that the jointly defined process {(Xtx , Zt ) : t ≥ 0} is a Markov process with drift 2 2 σ x −σ 2 x vector m = (1 + 12 σ 2 x − μx, μ) and diffusion matrix D = −σ 2 x σ 2 ˆ ξ ) := E exp(iξ At ), t ≥ 0. Also define ϕ(t, z, ξ ) = (Exercise 2). Let a(t, E exp(iξ ez At ) = E0,z hξ (Xt , Zt ), where hξ (x, z) = exp(iξ xeξ ). Then an application of the Feynman–Kac formula to ϕ(t, z, ξ ) yields .
∂ϕ = A(z) ϕ + iξ ez ϕ, ∂t
ϕ(0, z, ξ ) = 1,
(15.6)
2
∂ ∂ + 12 σ 2 ∂z ˆ ξ ) = ϕ(t, 0, ξ ) = E exp(iξ At ). where A(z) = μ ∂z 2 . Now observe that a(t, Thus, performing the indicated differentiations in (15.6), it follows that
.
1 ∂ 2 aˆ ∂ aˆ 1 ∂ aˆ = σ 2 ξ 2 2 + ( σ 2 + μ)ξ + iξ ez a. ˆ ∂t 2 2 ∂ξ ∂ξ
Now apply the Fourier inversion formula a(t, y) = complete the proof.
4 See
1 2π
∞
−∞ e
−iξy a(t, ˆ ξ )dξ ,
to
Comtet et al. (1998) and references therein to the mathematical physics literature. Bhattacharya et al. (2001) and Rogers and Shi (1995) for this and for v(t, x) := Ef (x + At ) for a class of homogeneous functions f . More recently, a relatively simple explicit calculation of the Laplace transform of the distribution of the hitting time has been found by Metzler (2013).
5 See
15.2 Kolmogorov Forward Equation (The Fokker–Planck Equation) We close this chapter with Kolmogorov’s forward equation, also referred to as the Fokker–Planck equation. Let us begin with the form of the backward and forward equations, respectively, in the context of the transition probability densities of an unrestricted multidimensional diffusion on .S = Rk . 1 ∂p ∂ 2p ∂p = . aij (x) (i) (j ) + μ(i) (x) (i) , ∂t 2 ∂x ∂x ∂x k
k
k
i=1 j =1
i=1
k k k ∂ μ(i) (y)p 1 ∂ 2 aij (y)p ∂p = − , ∂t 2 ∂y (i) ∂y (j ) ∂y (i) i=1 j =1
(15.7)
i=1
Consider a diffusion .{Xt } on .S = Rk whose coefficients are Lipschitzian, with nonsingular diffusion coefficient. Let .p(t; x, y) be the transition probability density of .{Xt }. Letting .f = 1B the semigroup property may be expressed as p(s + t; x, y)dy =
.
B
( p(t; z, y)p(s; x, z)dz)dy B
(15.8)
S
for all Borel sets .B ⊂ S. This implies the Chapman–Kolmogorov equation p(s + t; x, y) =
p(t; z, y)p(s; x, z)dz = (Ts f )(x),
.
(15.9)
S
where .f (z) := p(t; z, y). A somewhat informal derivation of the backward equation for p may be based on (15.9) as follows. One has (Af )(x) = lim
.
s↓0
(Ts f )(x) − f (x) . s
(15.10)
But the right side equals .∂p(t; x, y)/∂t, and the left side is .Ap(t; x, y). This is precisely the desired backward equation for p. In applications to physical sciences, the equation of greater interest is often Kolmogorov’s forward equation, or the Fokker–Planck equation, governing the probability density function of .Xt , when .X0 has an arbitrary initial distribution .π . Suppose for simplicity that .π has a density g. Then the density of .Xt is given by (T∗t g)(y) :=
g(x)p(t; x, y)dx.
.
S
(15.11)
The operator .T∗t transforms a probability density g into another probability density. More generally, it transforms any integrable g into an integrable function .T∗t g. It is adjoint (transpose) to .Tt in the sense that inserting a factor .f (y) and integrating (15.11) with respect to Lebesgue measure dy, one obtains Tt∗ g, f :=
.
S
(Tt∗ g)(y)f (y)dy =
g(x)(Tt f )(x)dx = g, Tt f .
(15.12)
S
Here .u, v = u(x)v(x) dx. If f is twice continuously differentiable and vanishes outside a compact subset of S, then one may differentiate with respect to t in (15.12) and interchange the orders of integration and differentiation to get, using the backward equation,
.
∂ ∗ ∂ Tt g, f = g, Tt f = g, Tt Af ∂t ∂t = Tt∗ g, Af ≡ (Tt∗ g)(y)(Af )(y)dy.
(15.13)
S
Now, assuming that .f, h are both twice continuously differentiable and that f vanishes outside a compact set, and h is integrable with respect to Lebesgue measure, integration by parts yields
∂f ∂ 2f 1 μ(i) (y) (i) ]dy aij (y) (i) (j ) + 2 ∂y ∂y ∂y k
h, Af =
h(y)[
.
S
i=1
1 ∂ 2 (aij (y)h) ∂(μ(i) (y)h) − ]f (y)dy 2 ∂y (i) ∂y (j ) ∂y (i) k
k
[ S
k
i=1 j =1
=
k
i=1 j =1
k
i=1
∗
= A h, f ,
(15.14)
where .A∗ is the formal adjoint of .A defined by ∂ 1 (A∗ h)(y) = div{−h(y)μ(i) (y) + grad( )(aij (y)h(y))}1≤i≤k . 2 ∂yj
.
j
(15.15) Applying (15.14) in (15.13), with .T∗t g in place of h, one gets
.
∂ ∗ T g, f = A∗ T∗t g, f . ∂t t
(15.16)
Since (15.16) holds for sufficiently many functions f , all infinitely differentiable functions vanishing outside some closed, bounded subset of S for instance, we get (Exercise 11),
.
∂ ∗ (T g)(y) = A∗ (T∗t g)(y). ∂t t
(15.17)
That is, .
S
∂p(t; x, y) g(x)dx = ∂t
S
1 ∂ 2 (aij (y)p) ∂(μ(i) (y)p) − ]g(x)dx. 2 ∂y (i) ∂y (j ) ∂y (i) k
k
[
k
i=1 j =1
i=1
(15.18) Since (15.18) holds for sufficiently many functions g , we get Kolmogorov’s forward equation for the transition probability density p (Exercise 12). Remark 15.4 (A Physical Description of the Forward Equation) Given an initial concentration .c0 (y) of solute in a fluid, the concentration .c(t, y) at y at time t is given by .c(t, y)
=
R3
c0 (z)p(t; z, y)dz,
(15.19)
where .p(t; z, y) is the transition probability density of the position process of an individual solute particle. Einstein’s derivation of the diffusion equation for .A = 12 DΔ is a special case (see Einstein 1905), where .p(t; z, y) is the transition probability density of a three-dimensional Brownian motion with dispersion matrix .DI3 , .D > 0. Now, Kolmogorov’s forward equation for .c(t, y) may be expressed as .
∂c(t, y) = A∗ c(t, y) = −divJ (t, y), ∂t
(15.20)
with .J = (J1 , J2 , J3 ) given by .Ji (t, y)
= c(t, y)μ(i) (y) −
1 ∂ (aij (y)c(t, y)), i = 1, 2, 3. ∂y 2
(15.21)
j
The forward equation is also referred to as the Fokker–Planck equation in this context. From a physical perspective, one may further understand the quantity J defined by (15.21) as follows. The increase in the amount of solute in a small region .[y, y + Δy] during a small time interval .[t, t + Δt] is approximately .
∂c(t, y) ΔtΔy. ∂t
(15.22)
On the other hand, if .v(t, y) = (v1 (t, y), v2 (t, y), v3 (t, y)) denotes the velocity of the solute at y at time t, moving as a continuum, then .v(t, y)Δt flows into the region at y during .[t, t + Δt]. Hence the amount of solute that flows into the region at y during .[t, t + Δt] is approximately .v(t, y)c(t, y)Δt, while the amount passing out at .y + Δy during .[t, t + Δt] is approximately .v(t, y + Δy)c(t, y + Δy)Δt. Therefore, the change in the amount of solute in .[y, y + Δy] during .[t, t + Δt] is approximately
Exercises
283 .[v(t, y)c(t, y)
− v(t, y + Δy)c(t, y + Δy)]Δt.
(15.23)
Assuming mass cannot escape, equating (15.22) and (15.23) and dividing by .ΔtΔy, after letting .Δt, .Δy = (Δy (1) , . . . , Δ(k) ) go to zero, one has .
∂ c(t, y) = −div(v(t, y)c(t, y)). ∂t
(15.24)
The equation (15.24) is generally referred to as the equation of continuity or the equation of mass conservation. The quantity .J (t, y) = v(t, y)c(t, y) is called the flux of the solute, which is seen to be the rate at which the solute mass per unit volume is displaced. In the present case, therefore, the flux is given by (15.21). Also see Exercise 4 when there is an additional source, other than this flux, from which particles are created or annihilated at the rate .g(t, y) at y at time t.
Exercises 1. Show that At in Example 2 has a density for each t > 0. 2. Prove the Markov property for the process {(At , Zt ) : t ≥ 0} in Example 2 and calculate the drift vector m and diffusion matrix a. [Hint: Use Itô’s lemma.] 3. Assume the conditions for the Feynman–Kaˆc formula stated in Proposition 15.5, with v(x) = −λ(x) where λ(x) is a nonnegative continuous function. Enlarge the probability space (Ω, F , P ) for Xx , x ∈ Rk to adjoin a (possibly infinite) nonnegative random variable ζ such that conditionally given σ {Xs : 0 ≤ s ≤ t}, ζ is distributed with hazard rate h(t) = λ(Xt ), t ≥ 0; that is, t x ˜ xt = Xxt , 0 ≤ t < ξ. P (ζ > t|σ {Xs : 0 ≤ s ≤ t}) := e− 0 λ(Xs )ds , t ≥ 0. Let X x ˜ t ), t ≥ 0. (ii) Prove that {X˜ t : t ≥ 0} is a (i) Show that u(t, x) = Ef (X (nonconservative) Markov process with generator A˜ = A + v(·) = A − λ(·), where (Aλ(·))f (x) = Af (x) − λ(x)f . Consider the special case λ(x) = λ > 0 is a constant. 4. (Sources and Sinks: Duhamel’s principle) (i) Consider the one-dimensional diffusion equation given by ∂u ∂t c(t, y) = d − dy (μ(y)c(t, y)) + 12 (σ 2 (y)c(t, y)) + g(t, y), t > 0, c(0+, y) = c0 (y), y ∈ R, for Lipschitz coefficients μ(y), σ 2 (y) > 0. Use that c0 (y) is integrable and g(t, y) is a bounded, continuous function such that R |g(t, x)|p(t; x, y)dx < ∞, where p(t; x, y) is transition probability d2 d density6 of the diffusion with generator A = 12 σ 2 (x) dx 2 + μ(x) dx .
6 See Friedman (1964) for the existence of a density.
Check that c(t, y) = R c0 (x)p(t; x, y)dx + 0t R g(s, x)p(t − s; x, y)dxds is a solution, and provide a probability heuristic for the formula. State and solve the corresponding backward equation. [Hint :Assume lims↓0 R g(t − s, x)p(t; x, y)dy = g(t, x).] (ii) Consider the one-dimensional nonhomogeneous equation with Dirichlet ∂u 1 2 ∂2u boundary conditions: ∂t∂ u(t, x) = μ(x) ∂x + 2 σ (x) ∂x 2 +h(t, x), t > 0, a < x < b, u(t, a+) = u(t, b−) = 0, and u(0+, x) = f (x), a ≤ x ≤ b, where μ(x), σ 2 (x) > 0 are Lipschitz, h is a bounded, continuous function on [0, ∞) × [a, b], and f is a bounded, continuous function on [a, b] such that f (a) = f (b) = 0. Check7 that u(t, x) = 0t [a,b] h(s, y)q(t − s, x, y)dyds + [a,b] f (y)q(t, x, y)dy, t > 0 is a solution, where q(t, x, y) is the transition probability density component of a diffusion absorbed d2 d at a, b, with generator A = 12 σ 2 (x) dx 2 + μ(x) dx . State and solve the corresponding Fokker–Planck equation. (iii) State and prove analogs of (i), (ii) in multidimension.
5. Suppose that g(·) is a real-valued, bounded, nonanticipative functional on [0, T ]. Prove that t .t → ( g(u)dBu )m g(t)
M[0, T ]
belongs to
0
for all positive integers m. 6. (i) Let m be a positive integer, and g a nonanticipative functional on [0, T ] such that E 0T g 2m (t)dt < ∞. Prove that
T
.E(
g(t)dBt )2m ≤ (m(2m − 1))m T m−1
0
T
Eg 2m (t)dt.
0
[Hint: For bounded nonanticipative step functionals g , use Itô’s lemma to get .Jt
:= E(
t 0
g(s)dBs )2m = m(2m − 1)
t 0
E{(
s 0
g(u)dBu )2m−2 g 2 (s)}ds,
so that Jt increases as t increases, and .
t dJt = m(2m − 1)E{( g(u)dBu )2m−2 g 2 (t)} dt 0 t ≤ m(2m − 1){E( g(u)dBu )2m }(2m−2)/2m (Eg 2m (t))1/m 0
1− 1 = m(2m − 1)Jt m (Eg 2m (t))1/m ,
by Hölder’s inequality.] 7 See
Friedman (1964) for the existence of a density.
Exercises
(ii) Extend (i) to multidimension, i.e., for nonanticipative functionals f = T {(f1 (t), . . . , fk (t)) : 0 ≤ t ≤ T } satisfying E 0 fi2m (t)dt < ∞ (1 ≤ i ≤ k ), prove that .E(
T
f(t) · dBt )2m ≤ k 2m−1 (m(2m − 1))m T m−1
0
7.
k
E
i=1
0
T
fi2m (t)dt.
(i) In addition to the hypothesis of Theorem 7.1 assume that EXα2m < ∞, where m is a positive integer. Prove that for 0 ≤ t − α ≤ 1, . Xt
− Xα 2m ≤ c1 (m, M)[(t − α) μ(Xα ) 2m + (t − α)1/2 σ (Xα ) 2m ] = c1 (m, M)ϕ(t − α),
say, with ϕ(t − α) ↓ 0 as t ↓ α). Here · 2m is the L2m -norm for random variables, and c1 (m, M) depends only s on m and M . [Hint: Let 2m = E max δt = E maxα≤s≤t |X − X | | s α α≤s≤t α (μ(Xu ) − μ(Xα )|du + (s − s α)μ(Xα ) + α (σ (Xu ) − σ (Xα ))dBu + σ (Xα )(Bs − Bα )|2m ≤ 42m−1 [· · · ], and then use Grownwall’s inequality.] (ii) Deduce from (i) that supα≤s≤t EXs2m < ∞ for all t ≥ α . (iii) Extend (i), (ii) to multidimension. 8. Let μ(·), σ (·) be Lipschitzian. Prove that (a) for every ε > 0, .P (
max |Xxs − x| ≥ ε) = o(t)
0≤s≤t
as t ↓ 0,
uniformly for every bounded set of x’s. Show that the convergence is uniform for all x ∈ Rk if μ(·), σ (·) are bounded as well as Lipschitzian.[Hint: .
P ( max |Xxs − x| ≥ ε) 0≤s≤t
≤ P(
t 0
|μ(Xxs )|ds ≥
s ε ε ) + P ( max | σ (Xxu )dBu | ≥ ) = I1 + I2 , 2 2 0≤s≤t 0
say. To estimate I1 note that |μ(Xxs )| ≤ |μ(x)|+M|Xxs −x|, and use Chebyshev’s inequality for the fourth moment and Exercise 7(iii). To estimate I2 use Doob’s Lp -maximal inequality (15.3) for the p = 4th moment, Exercise 6(ii) with m = 2, and Exercise 7(iii).] (b) Prove that I1 , I2 go to zero as t ↓ 0 uniformly on Rk if μ(·) and σ (·) are also bounded. 9. Show that if (15.16) holds for all infinitely differentiable functions that vanish outside a compact set, then the forward equations hold. x of X x 10. Let μ(·), σ (·) be Lipschitzian. Show that the solution Xt,ε t,ε = x + t t x )ds + x )dB converges as ε → 0, to the solution of the μ(X εσ (X s s,ε s,ε 0 0 x x x t ordinary differential equation dY dt = μ(Yt ), Y0 = x, in the norm distance x x 2 2 ||X − Y || defined by ||Z|| = E max0≤s≤t |Zs | .
11. Supply the details to derive (15.17). 12. Show that the class of infinitely differentiable functions g vanishing outside compact sets is sufficient to establish (15.18).
Chapter 16
Probabilistic Solution of the Classical Dirichlet Problem
The classical Dirichlet problem seeks to find functions with specified boundary values on a domain for which the function is harmonic on the interior. The intrinsic connection between the Laplacian operator and Brownian motion leads to a beautiful interplay between pde and stochastic calculus presented in this chapter.
In physics, a conservative force is the gradient of a potential, which is a scalar function. The Laplacian of this function is the divergence of its gradient, and its vanishing implies equilibrium in many problems in physics, such as gravitation, electricity, and magnetism. The Dirichlet problem is to determine the potential in a (usually bounded) region given its values on the boundary. It also arises in the problem of equilibrium temperature distribution. Throughout this chapter, we consider standard Brownian motions $B^x = \{B^x_t : t \ge 0\}$ in $\mathbb{R}^k$ (dimension $k \ge 2$), starting at $x \in \mathbb{R}^k$. A real-valued function $f$ defined on a nonempty open subset $G$ of $\mathbb{R}^k$ is said to be harmonic in $G$ if it is twice differentiable in $G$ and satisfies
$$\Delta f(x) = 0 \quad \text{for all } x = (x_1, \ldots, x_k) \in G, \tag{16.1}$$
where $\Delta$ is the Laplacian
$$\Delta = \sum_{i=1}^{k} \partial^2/\partial x_i^2. \tag{16.2}$$
Suppose now that $G$ is bounded. The classical Dirichlet problem consists in finding, for each continuous real-valued function $\varphi$ on $\partial G$, a function $u$ on $\overline{G} = G \cup \partial G$ that is (i) harmonic in $G$ and (ii) continuous on $\overline{G}$ with $u(x) = \varphi(x)$ $\forall\, x \in \partial G$. It is known that the problem is not always solvable in this generality (see Itô and McKean 1963, pp. 261–264, for the famous counterexample known as Lebesgue's thorn), and that certain smoothness assumptions need to be imposed on $\partial G$ to ensure the existence of a $u$ satisfying (i) and (ii). One such condition is that every point $x$ on the boundary is a Poincaré point.

Definition 16.1 A point $x \in \partial G$ is said to be a Poincaré point if there exists a truncated cone $C$ with vertex at $x$, $C\setminus\{x\} \subset G^c$.

Let $C([0,\infty) \to \mathbb{R}^k)$ denote the set of all continuous functions on $[0,\infty)$ into $\mathbb{R}^k$. Define
$$\tau(f) = \tau^{\partial G}(f) := \inf\{t \ge 0 : f(t) \in \partial G\}, \qquad f \in C([0,\infty) \to \mathbb{R}^k). \tag{16.3}$$
Our main objective in this chapter is to prove the theorem stated below of Kakutani and Doob. This will be followed by an application to calculate the probability $u(x)$ that a $k$-dimensional Brownian motion starting at $x \in G := \{y : R_1 < |y - a| < R_2\}$, where $a \in \mathbb{R}^k$, $0 < R_1 < R_2$, will reach $F_1 = \{y \in \mathbb{R}^k : |y - a| = R_1\}$ before $F_2 = \{y \in \mathbb{R}^k : |y - a| = R_2\}$. Observe that $u = \varphi$ on $\partial G = F_1 \cup F_2$, where $\varphi(x) = 1$, $x \in F_1$ and $\varphi(x) = 0$, $x \in F_2$. Moreover,
$$u(x) = E\varphi\big(B^x_{\tau(B^x)}\big), \qquad x \in G. \tag{16.4}$$
In view of the next theorem, this probability may be analytically computed by solving an appropriate Dirichlet problem.

Theorem 16.1 (Doob–Kakutani) Let $G$ be a bounded nonempty open subset of $\mathbb{R}^k$.
a. For every real-valued bounded Borel measurable function $\varphi$ on $\partial G$ (regarded as a metric space with the Euclidean distance), the function
$$u(x) := E\varphi\big(B^x_{\tau(B^x)}\big), \qquad x \in G, \tag{16.5}$$
is harmonic in $G$.
b. Suppose that every point of $\partial G$ is a Poincaré point. Then, for each bounded continuous $\varphi$ on $\partial G$, $u$ is the unique solution of the Dirichlet problem satisfying (i) and (ii).

We will prove the theorem in several steps, each step stated as a lemma.

Lemma 1 Under the hypothesis of part (a) of the theorem, $u$ has the mean value property: For every $r > 0$ such that $B(x : r) \equiv \{y : |y - x| < r\} \subseteq G$ one has
$$u(x) = \int_{S^{k-1}} u(x + r\theta)\,\mu(d\theta), \tag{16.6}$$
where $\mu$ is the uniform distribution (normalized surface area measure) on the unit sphere $S^{k-1} = \{y : |y| = 1\}$.

Proof Fix $x \in G$, and let $r > 0$ be such that $B(x : r) \subseteq G$. Let $\eta := \inf\{t \ge 0 : |B^x_t - x| = r\}$. Since the after-$\eta$ process $B^x_{\eta+} \equiv \{B^x_{\eta+t} : t \ge 0\}$ hits $\partial G$ at the same point as $B^x$ does, one has the following relations in view of the strong Markov property with respect to the stopping time $\eta$: Conditioning on the pre-$\eta$ $\sigma$-field $\mathcal{G}_\eta$, one has
$$\varphi\big(B^x_{\tau(B^x)}\big) = \varphi\big((B^x_{\eta+})_{\tau(B^x_{\eta+})}\big),$$
$$u(x) = E\big(E\big[\varphi\big((B^x_{\eta+})_{\tau(B^x_{\eta+})}\big)\,\big|\,\mathcal{G}_\eta\big]\big) = E\big(E\big[\varphi\big((B^y)_{\tau(B^y)}\big)\big]_{y = B^x_\eta}\big) = E\,u(B^x_\eta). \tag{16.7}$$
Since the distribution of .B x is invariant under orthogonal transformations around x, i.e., .B x − x is distributed as .B 0 , the same as that of .O(B x − x), for every orthogonal transformation O, it follows that the distribution of .Bηx is the uniform distribution on the sphere .{y : |y − x| = r}. By a change of variables .y → rθ , or .θ = y/r, k−1 , one arrives at (16.6) from (16.7). .θ ∈ S Lemma 2 If a function u has the mean value property in G, then it is infinitely differentiable and harmonic in G. Proof Let .a ∈ G. Find .R > 0 such that .B(a; 2R) ⊆ G. We first show that u is infinitely differentiable in .B(a : R). For this, choose an infinitely differentiable radial function .ψ(x) = g(|x|) whose support is contained in .B(0 : R) and such that R2 2 −1 . ψ(x)dx = 1 (for example, let .g(r) = ck exp{−( 4 − r ) } for .0 ≤ r < R/3, R R k−1 dr = 1). Define .g(r) = 0 for .r ≥ 2 , such that .ck 0 g(r)r u(x) =
u(x)
for x ∈ B(a : 2R)
0
for x ∈ Rk \B(a : 2R).
.
For .y ∈ B(a : R) one has, using the polar transformation, (u ψ)(y) ≡
.
u(y + x)ψ(x)dx =
= ck
S k−1
u(y + x)g(|x|)dx B(0:R)
R 0
u(y + rθ )g(r)r k−1 drμ(dθ )
(16.8)
R
= ck
u(y + rθ )μ(dθ ))dr
g(r)r k−1 ( S k−1
0
= u(y),
(16.9)
where .ck is the surface area of .S k−1 . Since .u ψ is infinitely differentiable everywhere, u is infinitely differentiable in .B(a : R), by (16.9). Thus u is infinitely differentiable in G. To show that u is harmonic, again let .a ∈ G. In a sufficiently small neighborhood of a (contained in G), Taylor expansion yields up to .O(|x −a|3 ) .u(x)
= u(a) +
k
(xi − ai )(
k ∂ 2 u(x) 1 ∂u(x) (xi − ai )(xj − ai )( )a . )a + 2 ∂xi ∂xj ∂xi
(16.10)
i,j =1
i=1
Integrating both sides with respect to the uniform distribution .μp (dx), say, on the sphere .{x : |x − a| = ρ} = S(a : ρ), for sufficiently small .ρ, and using the mean value property of u, one gets u(a) = u(a) +
.
k ρ 2 ∂ 2 u(x) ( )a + O(ρ 3 ). 2 2k ∂x i i=1
(16.11)
Note that (16.11) results from rotational invariance of .μρ . Specifically, .
(xi − ai )μρ (dx) = 0
(1)
∀ i,
S(a:ρ)
(xi − ai )(xj − aj )μρ (dx) = 0
(2) S(a:ρ)
(xi − ai ) μρ (dx) =
(x1 − a1 )2 μρ (dx)
2
(3)
∀ i = j,
S(a:ρ)
S(a:ρ)
1 (xi − ai )2 μρ (dx) k k
= S(a:ρ)
∀ i.
(16.12)
i=1
For (1) the .μρ –distribution of .(xi − ai ) is the same as that of .−(xi − ai ), (2) the μρ –distribution of .x − a is the same as that of .x − a , where .xi − ai = −(xi − ai ) and .xj − aj = xj − aj .∀ j = i, (3) the .μρ –distribution of .xi − ai is the same for all i. Now (16.11) implies
.
.
ρ2 Δu(a) = O(ρ 3 ) 2k
which is only possible if .Δu(a) = 0.
as ρ ↓ 0,
(16.13)
Remark 16.1 This follows from a special case of Proposition 15.4 applied to balls, and Lemma 1(Exercise). But here is a direct proof. The converse of Lemma 1 is also true: If u is a twice continuously differentiable function in G which is harmonic, then u has the mean value property. To see this fix .a ∈ G, and consider the function f (r) =
.
S k−1
u(a + rθ )μ(dθ )
(r > 0 sufficiently small).
Then
d .f (r) = ( u(a + rθ ))μ(dθ ) = S k−1 dr
S k−1
ν, grad u(a + rθ )μ(dθ ),
where . , denotes the Euclidean inner product and .ν is the outward unit normal to S k−1 at .θ , i.e., .ν = θ = (θ1 , θ2 , . . . , θk ). Hence, by the Gauss divergence theorem,
.
f (r) = ck r k
∇(grad u)(a + rx)dx
.
=
ck r k
B(0:1)
(Δu)(a + rx)dx = 0, B(0:1)
where .ck is a normalizing constant .(ck B(0:1) dx = 1). Thus .r → f (r) is a constant and, hence, .f (r) = u(a) ≡ limr↓0 f (r). Lemmas 1 and 2 complete the proof of part (a) of Theorem 16.1. For part (b) we need to show that u is continuous on .G, i.e., .u(x) ≡ Eϕ(Bτx(B x ) ) → u(b) = ϕ(b) as .x → b for every .b ∈ ∂G. To require this for every bounded continuous .ϕ on .∂G is equivalent to the requirement that Bτx(B x ) converges in distribution to δ{b} asx → b
.
(∀ b ∈ ∂G),
(16.14)
where (see (16.3) .Bτx(B x ) = inf{t ≥ 0 : Btx ∈ ∂G}. Lemma 3 Let .b ∈ ∂G. Suppose that for every .δ > 0, one has .
lim P (τ (B x ) > δ) = 0.
x→b, x∈G
(16.15)
Then (16.14) holds. Proof Let .ϕ be a bounded continuous function on .∂G. Given .ε > 0, let .η1 > 0 be such that .|ϕ(y) − ϕ(b)| < ε/3 if .|y − b| < 2η1 , .y ∈ ∂G. Write .ϕ∞ := max{|ϕ(y)| : y ∈ ∂G}. Then
|u(x) − u(b)| = |Eϕ(Bτx(B x ) ) − ϕ(b)| ≤ E(|ϕ(Bτx(B x ) ) − ϕ(b)| · 1{τ (B x )≤δ} ) + 2ϕ∞ P (τ (B x ) > δ) ε P ( max |B x − b| < 2η1 , τ (B x ) ≤ δ) 3 0≤t≤δ t
δ). (16.16) 0≤t≤δ
For .|x − b| < η1 , .
P ( max |Btx − b| ≥ 2η1 ) ≤ P ( max |Btx − x| ≥ η1 ) 0≤t≤δ
0≤t≤δ
= P ( max |Bt0 | ≥ η1 ) −→ 0 0≤t≤δ
as δ ↓ 0.
(16.17)
Therefore, there exists .δ0 > 0 such that, if .|x − b| < η1 then 2ϕ∞ P ( max |Btx − b| ≥ 2η1 )
0 such that if .|x − b| < η2 then 2ϕ∞ P (τ (B x ) > δ0 )
0 : f (t) ∈ G }
.
(16.20)
c
(f ∈ C([0, ∞) → Rk )).
Note that if .x ∈ G, then .τ˜ ◦ B x = τ (B x ) > 0. Also note that .τ ◦ B x is a stopping time with respect to the filtration .Ft+ (= ∩ε>0 Ft+ε ), .t ≥ 0 (see BCPT1 Exercise 12, p.73). Here .Ft = σ (Bs : s ≤ t). It is simple to show that a .Ft -Brownian motion 1 Throughout,
Theory.
BCPT refers to Bhattacharya and Waymire (2016), A Basic Course in Probability
is also a $\{\mathcal{F}_{t+}\}$-Brownian motion (see Bhattacharya and Waymire 2021, Proposition 1.6, p. 79). It is not difficult to show that a $\{\mathcal{F}_t\}$-diffusion is also a $\{\mathcal{F}_{t+}\}$-diffusion (Exercise 4).
.
∀ x ∈ G.
(16.21)
Thus, it is enough to show that .P (τ˜ ◦ B x > δ) → 0 as .x → b .(x ∈ G). Write k .Ah = {f ∈ C([0, ∞) : R ) : f (t) ∈ G for .h ≤ t ≤ δ}, .0 < h < δ. Then .[τ˜ > δ] = limh↓0 Ah , so that P (τ˜ ◦ B x > δ) = lim P (B x ∈ Ah ).
.
h↓0
(16.22)
y
Now, by the Markov property, writing .ϕ(y) = P (Bt ∈ G for .0 ≤ t ≤ δ − h), one has for .0 < h < δ, y
P (B x ∈ Ah ) = E[(P (Bt ∈ G for 0 ≤ t ≤ δ − h))y=Bhx ] x = Eϕ(Bh ) = ϕ(y)p(h; x, y)dy.
.
Rk
(16.23)
Since .ϕ is bounded and .x → p(h; x, y) is continuous, .x → P (B x ∈ Ah ) is continuous on .Rk . Since (16.20) holds, .P (τ˜ ◦ B b > δ) = 0 .∀ δ > 0. Let .ε > 0. Fix any .δ > 0. Then one may, by (16.22), find .h > 0 such that P (B b ∈ Ah ) < ε.
.
Now find .η > 0 such that .P (B x ∈ Ah ) < ε if .|x − b| < η, using the continuity of x .x → P (B ∈ Ah ). Then P (τ˜ ◦ B x > δ) ≤ P (B x ∈ Ah ) < ε.
.
Lemma 5 If b is a Poincaré point of .∂G, and .ϕ a continuous function on .∂G, then u is continuous at b. Proof By Blumenthal’s zero–one law2 P (τ˜ ◦ B b = 0) = 0 or 1,
.
2 See
BCPT, Proposition 11.6, p.198.
(16.24)
since $[\tilde{\tau}\circ B^b = 0] \in \mathcal{F}_{0+} = \cap_{t>0}\mathcal{F}_t$, where $\mathcal{F}_t = \sigma(B^b_s : s \le t)$. Let $\tilde{C}$ be a cone with vertex at $b$ and $\tilde{C}\setminus\{b\} = C_b$, say, a truncation of $\tilde{C}$ contained in $G^c$ around $b$. By rotating $\tilde{C}$ one may find a finite number, say $n$, of rotations of $\tilde{C}$ whose union is $\mathbb{R}^k$. If $P(\tilde{\tau}\circ B^b = 0) = 0$, then $P(\tilde{\tau}\circ B^b > 0) = 1$, so that one can find $\delta_0 > 0$ for which $P(\tilde{\tau}\circ B^b > \delta_0) > 1 - \frac{1}{2n}$, which in turn implies $P(B^b_t \in C_b^c \text{ for } 0 < t \le \delta_0) > 1 - \frac{1}{2n}$. Now the probability of $[B^b_t \in \tilde{C}]$ is the same as that of $[B^b_t \in \tilde{C}_i]$ $(i = 1, \ldots, n)$, where $\tilde{C}_i$ is obtained from $\tilde{C}$ by one of the $n$ rotations whose union is $\mathbb{R}^k$. But $\cup_{i=1}^n \tilde{C}_i$ contains a ball $B(b : r)$, $r > 0$. Hence one would have $P(B^b_t \in B(b : r)) < n\cdot\frac{1}{2n} = \frac{1}{2}$ for all $t \in (0, \delta_0]$, which is of course not true, since letting $t \downarrow 0$ $(t > 0)$ one would get $1 = \lim_{t\downarrow 0} P(B^b_t \in B(b : r)) < \frac{1}{2}$.

It only remains to prove uniqueness. This is a consequence of

Lemma 6 (The Maximum Principle) Suppose $G$ is a connected open set and $u$ has the mean value property in $G$. Then $u$ cannot attain its supremum or infimum in $G$, unless $u$ is constant in $G$.

Proof Suppose $u$ attains its supremum at a point $x_0 \in G$. By the mean value property, for all sufficiently small $\rho$,
u(x_0) =
u(y)μρ (dy),
.
Sp (x0 )
where .μρ is the normalized surface area measure on .Sρ (x0 ) = {y : |y − x0 | = ρ} (i.e., .μρ (dy) is the distribution of .y = x0 + ρθ , when .θ has the rotation invariant distribution on .S k−1 ). Since u is continuous on .Sρ (x0 ), and .μρ assigns positive measure to every nonempty, open subset of .Sρ (i.e., support of .μρ is .Sρ (x0 )), and .u(y) ≤ u(x0 ) on .Sρ (x0 ), it follows that .u(y) = u(x0 ) .∀ y ∈ Sρ (x0 ) (for all sufficiently small .ρ). This means that the set .D ≡ {x ∈ G : u(x) = u(x0 )} is open, but D is clearly closed. Thus .G = D ∪ (G\D) is a union of two disjoint open sets. Since G is connected, this implies .D = G. Note that one can apply the argument to each connected component of an arbitrary open set G and conclude that if .u ≡ 0 on .∂G, then .u ≡ 0 in G (provided u has the mean value property in G and u is continuous on .G = G ∪ ∂G). The proof of the Theorem is now complete. If .u1 , u2 are two solutions, then .u1 − u2 = 0 on .∂G and has the mean value property. If .u1 (x) > u2 (x) at some .x ∈ G, .u1 (x) − u2 (x) > 0, contradicting the fact that the maximum is on .∂G. . Example 1 (Application to Transience and Recurrence) Although Proposition 16.2 follows from results in Chapter 11, we give a proof based on Theorem 16.1, with some new computations. As an application, we consider the problem of transience and recurrence of k-dimensional standard Brownian motion. Proposition 16.2 Two-dimensional Brownian motion is recurrent in the sense that given any .a ∈ R2 and .ε > 0, x P (ηB(a,ε) = ∞) = 1,
.
∀ x ∈ R2 ,
(16.25)
where x ηB(a,ε) := sup{t ≥ 0 : |Btx − a| ≤ ε}
.
(16.26)
is the time of the last visit to .B(a, ε). Every higher dimensional Brownian motion on .Rk , k ≥ 3, is transient in the sense that P |Btx | −→ ∞ as t → ∞ = 1
.
∀ x ∈ Rk .
(16.27)
Proof Let .0 < R1 < R2 . In Theorem 16.1 take G = {x : R1 < |x − a| < R2 }.
.
Then ∂G = {x : |x − a| − R1 } ∪ {x : |x − a| = R2 },
.
τ = τ ∂G (B x ) Define ϕ(x) =
1
for |x − a| = R1 ,
0
for |x − a| = R2 .
.
Then starting from .x ∈ G, the probability .u(x) of reaching the inner sphere of radius .R1 before reaching the outer sphere of radius .R2 may be expressed as .u(x) = Eϕ(Bτx ), satisfies Δu(x) = 0
∀ x ∈ G,
.
u(x) = ϕ(x)
∀ x ∈ ∂G.
(16.28)
For simplicity, let .a = 0 first. Then the function (16.28) is invariant under rotations so that .u(x) = g(|x|) for some g which is continuous on .[R1 , R2 ] and twice (actually, infinitely) continuously differentiable on .(R1 , R2 ). Applying the Laplacian to .g(|x|), one gets Δu(x) = g
(|x|) +
.
k−1 g (|x|). |x|
(16.29)
In other words, g satisfies g
(r) +
.
k−1 g (r) = 0 r g(R1 ) = 1,
R1 < r < R2 , g(R2 ) = 0.
(16.30)
Writing .f = g , one gets .f (r) +
k−1 r
f (r) = 0, whose general solution is
f (r) = c1 (
.
Hence .g(r) = c1
r
R1 (R1 /r
g(r) =
.
)k−1 dr
⎧ ⎨c1 R1 log
R1 k−1 ) . r
(16.31)
+ c2 , i.e., r R1
+ c2
k−1 1 ⎩−c1 R1 ( k−2 k−2 r
−
if k = 2 1 ) + c2 R1k−2
if k > 2,
where .c1 , c2 are determined by the boundary conditions in (16.30):
g(r) =
.
⎧ log R2 −log r ⎪ ⎪ ⎨ log R2 −log R1
if k = 2
⎪ ⎪ ⎩
if k > 2.
k−2 R1 k−2 R − R1 r 2
k−2 R 1− R1 2
(16.32)
Substituting .r = |x| in (16.32), one gets .u(x). For arbitrary .a use translation invariance, of Brownian motion to obtain, writing .τRx i for .τ∂B(a:Ri ) ◦ B x .(i = 1, 2), P τRx 1
.
⎧ log R2 − log |x − a| ⎪ ⎪ ⎪ ⎨ log R2 − log R1 < τRx 2 = k−2 − (R /R )k−2 ⎪ 1 2 ⎪ (R1 /|x − a|) ⎪ ⎩ 1 − (R1 /R2 )k−2
if k = 2 (16.33) if k > 2,
for .R1 ≤ |x − a| ≤ R2 . Letting .R2 ↑ ∞ in (16.33), one arrives at x 1 .P τR < ∞ = R1 k−2 1 ( |x−a| )
for |x − a| ≥ R1 , if k = 2, for |x − a| ≥ R1 ifk > 2.
The completion of the proof is left as (Exercise 3).
(16.34)
x = inf{t ≥ 0 : In (16.33) if one lets .R1 ↓ 0, then one gets .(∀ k ≥ 2), writing .τ{a} Btx = a}, x P (τ{a} < τRx 2 ) = 0
.
∀ x = a, |x − a| ≤ R2 ,
so that, .∀ k ≥ 2, x P (τ{a} < ∞) = 0
.
∀ x = a.
(16.35)
Thus, the probability of ever hitting any given point (other than the starting point) is zero. This is in contrast with the one-dimensional case.
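The hitting probabilities in (16.33) can also be estimated directly from the probabilistic representation $u(x) = E\varphi(B^x_\tau)$ of Theorem 16.1. The following Python sketch is an added illustration (not from the text); the radii, starting point, time step, and sample size are arbitrary, and the Euler time discretization introduces a small bias. It simulates three-dimensional Brownian motion in the annulus and compares the estimated probability of reaching the inner sphere first with (16.33) for $k = 3$.

```python
import numpy as np

def p_hit_inner_first(r0, R1, R2, dt=1e-3, n_paths=4000, seed=1):
    """Monte Carlo estimate that 3-d Brownian motion started at radius r0 reaches
    {|y - a| = R1} before {|y - a| = R2}; Euler time stepping, so slightly biased."""
    rng = np.random.default_rng(seed)
    x = np.zeros((n_paths, 3))
    x[:, 0] = r0
    hit_inner = np.zeros(n_paths, dtype=bool)
    active = np.ones(n_paths, dtype=bool)
    while active.any():
        x[active] += rng.normal(scale=np.sqrt(dt), size=(active.sum(), 3))
        r = np.linalg.norm(x, axis=1)
        hit_inner |= active & (r <= R1)
        active &= (r > R1) & (r < R2)
    return hit_inner.mean()

R1, R2, r0 = 1.0, 4.0, 2.0
exact = ((R1 / r0) - (R1 / R2)) / (1.0 - R1 / R2)      # (16.33) with k = 3
print("Monte Carlo:", p_hit_inner_first(r0, R1, R2), "   exact:", round(exact, 4))
```

For these values the exact probability is $1/3$; the Monte Carlo estimate should agree to within sampling and discretization error.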
Example 2 (Poisson Equation) Before we close this subsection, we briefly consider Poisson’s equation: For a given bounded continuous differentiable f on G find a v, which is continuous on .G, and satisfies Δv(x) = −f (x)
.
v(x) = 0
x ∈ G,
x ∈ ∂G,
(16.36)
where G is a bounded open set with a .C 1 -smooth boundary. This is a special case of Proposition 15.4, but we present a proof based on the results of this chapter and some new computations Assume that .∂G is smooth enough and f is such that f and its first derivatives may be continuously extended to all of .Rk . Let this extension .f be so chosen as to have compact support (for example, if G is a sphere or ellipsoid and the first order derivatives of f are uniformly continuous such an extension is easily shown to be possible). Assume .k > 2. The function f (y) 1 .w(x) = (16.37) dy ck Rk |x − y| satisfies the equation Δw(x) = −f (x)
.
x ∈ Rk .
(16.38)
(Exercise 2; here .ck is the surface area of .S k−1 ). The solution of (16.36) is then equivalent to the solution of the Dirichlet problem Δh(x) = 0
.
h(x) = w(x)
∀ x ∈ G, ∀ x ∈ ∂G.
(16.39)
For if h solves (16.39), then .v(x) ≡ w(x)−h(x)(x ∈ G) solves (16.36). Conversely, if .v(x) solves (16.36), then .h(x) ≡ w(x) − v(x) solves (16.39).
Exercises

1. In the proof of Lemma 5, it is said that $\mathbb{R}^k$ is the union of a finite number of rotations of the cone $\tilde{C}$ (around $b$). Prove this. [Hint: It is enough to prove that the sphere $S(b : 1) := \{x : |x - b| = 1\}$ is the union of a finite number of intersections of the sphere with rotated cones. For this use compactness of $S(b : 1)$.]
2. Prove (16.38).
3. Use (16.34) to complete the proof of Proposition 16.2.
4. Consider a diffusion on $\mathbb{R}^k$ (or on a domain $D \subset \mathbb{R}^k$). Let $\mathcal{F}_t := \sigma(X_s : 0 \le s \le t)$. Show that $\{X_s : s \ge 0\}$ is a diffusion with respect to the filtration $\mathcal{F}_{t+}$. [Hint: The conditional distribution of $X_t$, given $\mathcal{F}_{s+\varepsilon}$, has the density $p(t - s - \varepsilon; x, y)$ on $[X_{s+\varepsilon} = x]$. Let $\varepsilon \downarrow 0$.]
Chapter 17
The Functional Central Limit Theorem for Ergodic Markov Processes
In this chapter a broadly applicable functional central limit theorem for continuous parameter ergodic Markov processes is developed. It rests on the martingale central limit theorem applied to functions in the range of the infinitesimal generator. The chapter includes an example application to dispersion in a periodic medium and another to the seminal formula of Taylor and Aris for the long-time dispersion of solutes embedded in fluid flow.
Consider an arbitrary Markov process X with stationary transition probabilities p(t; x, dy) on a metric space S. A semigroup of transition operators .Tt , t ≥ 0, can be defined on the space .B(S) of real-valued, bounded (Borel) measurable functions on S by
.
Tt f (x) = Ex f (Xt ) ≡
f (y)p(t; x, dy),
.
x ∈ S, f ∈ B(S).
(17.1)
S
(Also see Chapter 2 on semigroups.) The semigroup property .Tt+s = Tt Ts = Ts Tt is a direct consequence of the Markov property. Moreover, if .B(S) is given the norm .||f || = supx∈S |f (x)|, then the averaging defining .Tt f (x) implies the contraction property .||Tt f || ≤ ||f ||. Moreover .Tt is a positive operator in the sense that .f ≥ 0 pointwise on S implies that .Tt f ≥ 0 pointwise on S for each .t ≥ 0. The infinitesimal generator .(A, DA ) of the Markov process X, or its associated semigroup, with respect to this norm is defined for .f ∈ DA by Af = g
.
if lim || t↓0
Tt f − f − g|| = 0, t
© Springer Nature Switzerland AG 2023 R. Bhattacharya, E. C. Waymire, Continuous Parameter Markov Processes and Stochastic Differential Equations, Graduate Texts in Mathematics 299, https://doi.org/10.1007/978-3-031-33296-8_17
(17.2) 299
for some .g ∈ B(S). The set .DA , so-defined, is called the domain of A. Suppose that the Markov process X has a unique invariant probability .π . Then the Hilbert space .L2 (S, π ) contains .B(S). Moreover, using the invariance of .π , one can check that .Tt , t ≥ 0 is a contraction semigroup of positive linear operators for the .L2 −norm .|| · ||2 on .L2 (S, π ), as well. Moreover, an infinitesimal generator .Aˆ can be defined on a domain .DAˆ ⊆ L2 (S, π ) in the same limit but with the norm replaced by .|| · ||2 . Note that .DA ⊆ DAˆ . Hence the generator .Aˆ is an extension of A. Having said all of this, we will continue to use .(A, DA ) for the generator on 2 .L (S, π ). In general the choice of the function space depends on what one is trying to quantify. Here is a repeat result from Chapter 2. Lemma 1 For .f ∈ DA , (i) .Tt f ∈ DA for all .t > 0, and (ii) .ATt f = Tt Af for all t > 0.
.
Proof By definition, and using commutativity and contraction properties of .Tt , t ≥ 0, one has ||(Ts (Tt f ) − Tt f )/s − Tt Af || = ||(Tt {(Ts f ) − f )/s − Af }||
.
≤ ||(Ts f − f )/s − Af || → 0
as s ↓ 0.
Lemma 2 (Dynkin’s Martingale) Suppose that .X = {Xt : t ≥ 0} is a Markov process on a metric space S, having right-continuous sample paths. Assume that for 1 .f ∈ DA , .(s, ω) → Af Xs (ω) is progressively measurable (see BCPT pp. 59–60). Then t .Zt := f (Xt ) − Af (Xs )ds, t ≥ 0, (17.3) 0
is a martingale with respect to .Ft = σ (Xs : 0 ≤ s ≤ t), t ≥ 0. Proof Note that the conditional distribution of the after-s process .Xs+ given .Fs is the same as the distribution of X, starting at .Xs . Also, using .Ts Af = ATs f , one has s s+t .E(Zt+s |Fs ) = E(f (Xt+s )|Fs ) − Af (Xu )du − E( Af (Xu )du|Fs ) = Tt f (Xs ) −
0 s
0
= Tt f (Xs ) −
0
1 Throughout,
Theory.
s
t
Af (Xu )du −
Tu Af (Xs )du 0
t
ATu f (Xs )du −
s
Af (Xu )du.
(17.4)
0
BCPT refers to Bhattacharya and Waymire (2016), A Basic Course in Probability
The middle integral in the last expression equals $\int_0^t \frac{d}{du}T_u f(X_s)\,du$ which, by the fundamental theorem of calculus (see Theorem 2.7), is $T_t f(X_s) - f(X_s)$. The martingale property of $Z$ now follows. Remark 17.1 The assumption$^2$ of progressive measurability implies that the integrand defining $Z_t$ is almost surely integrable. This will continue to be assumed without further mention. One may note that in applications to diffusions $X$, the sample paths are continuous functions, making this a nonissue. We will, in general, assume that the right-continuous Markov process is ergodic, and that $S$ is a Polish space equipped with its Borel $\sigma$-field.
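Lemma 2 is easy to test numerically for a concrete generator. The sketch below is an added illustration, not an example from the text: the Ornstein–Uhlenbeck choice $dX = -X\,dt + \sqrt{2}\,dB$ (generator $Af = f'' - xf'$) and the test function $f(x) = x^2$, for which $Af(x) = 2 - 2x^2$, are assumptions made here. It checks that $EZ_T = EZ_0 = x_0^2$ for the Dynkin martingale (17.3).

```python
import numpy as np

rng = np.random.default_rng(0)
x0, T, n_steps, n_paths = 1.5, 2.0, 2000, 20000
dt = T / n_steps
sqrt2dt = np.sqrt(2.0 * dt)

X = np.full(n_paths, x0)
integral = np.zeros(n_paths)                 # running value of \int_0^t Af(X_s) ds
for _ in range(n_steps):
    integral += (2.0 - 2.0 * X**2) * dt      # Af(x) = 2 - 2 x^2 for f(x) = x^2
    X += -X * dt + sqrt2dt * rng.normal(size=n_paths)

Z_T = X**2 - integral                        # Dynkin martingale (17.3) at time T
print("E Z_T ~", Z_T.mean(), "   E Z_0 =", x0**2)
```

Up to Monte Carlo and discretization error, the two printed values agree, as the martingale property requires.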
d dt Tt g|t=0 ,
d f dπ = Agdπ = dt S S
so that by the invariance of .π , one has
Tt gdπ |t=0 S
d = dt
gdπ = 0. S
The proof of part (b) will follow from a version of the martingale functional central limit theorem originally due to Billingsley (1961) and Ibragimov (1963) (see Bhattacharya and Waymire (2022), Theorem 15.5). One has
t
Tt g(x) = g(x) +
Ts Ag(x)ds
.
π − a.e.,
(17.5)
0
so that, by Lemma 2, Yn (g) := g(Xn ) − 0
2 See BCPT, Proposition 3.6.
3 Bhattacharya (1982).
n
Ag(Xs )ds,
.
n = 1, 2, . . . ,
(17.6)
is a square integrable martingale, and the difference sequence Δn (g) = g(Xn+1 ) − g(Xn ) −
n+1
n = 0, 1, 2 . . . ,
Ag(Xs )ds,
.
(17.7)
n
is stationary and ergodic. In particular, therefore, EΔn (g) = 0,
.
n = 1, 2, . . . .
(17.8)
So one has by the martingale functional central limit theorem that Zn (t) = n−1 2 (Y[nt] (g) + (nt − [nt])Δ[nt] (g)),
t ≥ 0,
(17.9)
Ag(Xs )ds}2 .
(17.10)
.
converges in distribution to .σ B with
1
σ 2 = E(Δ0 (g))2 = E{g(X1 ) − g(X0 ) −
.
0
Now, .
− 12
|Zn (t) + n
nt
Ag(Xs )ds| 0
− 12
≤n
(|g(X[nt] )| + |Δ[nt] (g)| +
[nt]+1 [nt]
|Ag(Xs )|ds)
1
= n− 2 (I1 ([nt]) + I2 ([nt]) + I3 ([nt]), say.
(17.11)
1
Let us check that each of .n− 2 Ij ([nt]), j = 1, 2, 3 is almost surely .o(1) as .n → ∞ by a simple Borel-Cantelli argument. Namely, for each .j = 1, 2, 3, one has for any .ε > 0, that ∞ .
∞ √ √ P (Ij (1) > ε m) P (Ij (m) > ε m) = m=1
m=1
≤
∞ 0
√ P (Ij (1) > ε s)ds
∞ 2 P (Ij (1) > r)rdr = EIj2 (1) < ∞. = 2 ε 0
Thus, for any .t0 > 0, one has .Pπ -a.s. that 1
.
sup n− 2 (I1 ([nt]) + I2 ([nt]) + I3 ([nt]) → 0, 0≤t≤t0
a.s. as n → ∞.
(17.12)
Finally to compute .σ in terms of the given parameters of the process, observe that, using the orthogonality of the martingale differences and stationarity, σ2
.
1
= E[g(X1 ) − g(X0 ) −
f (Xs )ds]2
0
= n[E(g(X 1 ) − g(X0 ))
2
n
1 n
+E(
Ag(Xs )ds) − 2E(g(X 1 ) − g(X0 )) 2
n
0
1 n
Ag(Xs )ds)].
0
Now, E[g(X 1 ) − g(X0 )] = 2
g 2 dπ − 2Eg(X0 )g(X 1 )
2
.
n
S
S
=2
n
g 2 dπ − 2
=2
gT 1 gdπ n
S
g 2 dπ − 2 g, T 1 g π
n
S
=2
g 2 dπ − 2 g, g + S
1 1 Ag + o( ) π n n
1 2 = − g, Ag π + o( ). n n Next E(
.
0
1 n
1 Ag(Xs )ds) ≤ E n
2
0
1 n
(Ag(Xs ))2 ds =
1 ||Ag||2 . n2
Using these last two estimates together with the Cauchy-Schwarz inequality, one has n1 − 32 ). Thus, one has .σ 2 = −2 g, Ag π + .E g(X 1 ) − g(X0 ) 0 Ag(Xs )ds = O(n n
o(1) as .n → ∞. Since .σ 2 is independent of n, the proof is complete.
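For a concrete check of Theorem 17.1 and the variance formula in part (c), consider, as an added illustration not taken from the text, the stationary Ornstein–Uhlenbeck process $dX = -X\,dt + \sqrt{2}\,dB$ with invariant law $\pi = N(0,1)$ and generator $Ag = g'' - xg'$. For $f(x) = x$ the equation $Ag = f$ is solved by $g(x) = -x$, so $\sigma^2 = -2\langle f, g\rangle_\pi = 2EX^2 = 2$. The following Python sketch (all step sizes and sample sizes are arbitrary choices) estimates $\mathrm{Var}\big(\int_0^T f(X_s)\,ds\big)/T$ by simulation.

```python
import numpy as np

rng = np.random.default_rng(0)
T, n_steps, n_paths = 50.0, 50000, 2000
dt = T / n_steps
sqrt2dt = np.sqrt(2.0 * dt)

X = rng.normal(size=n_paths)      # start in the invariant law pi = N(0,1)
S = np.zeros(n_paths)             # S = \int_0^T f(X_s) ds  with  f(x) = x
for _ in range(n_steps):
    S += X * dt
    X += -X * dt + sqrt2dt * rng.normal(size=n_paths)

# Var(S)/T -> sigma^2 = -2 <f, g>_pi = 2 as T -> infinity
print("Var(S)/T ~", S.var() / T, "   theoretical sigma^2 = 2")
```

The estimate should be close to $2$ (slightly below it for finite $T$), in agreement with Theorem 17.1(c).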
Remark 17.3 Note that we have used the notion of ergodicity of a discrete parameter sequence. This is considered in detail in Bhattacharya and Waymire (2022, Chapter 4). The notion is entirely analogous for continuous parameter processes. There are a number of important questions that naturally arise with regard to the application of Theorem 17.1. Here is a list, which will be answered in the order that they occur below:
Q1 How large is the range space .RA ? Q2 How can one guarantee that the asymptotic variance .σ 2 is positive? Q3 Under what additional conditions can one ensure that the asymptotic distribution holds regardless of the initial distribution for the Markov process? Proposition 17.2 In addition to the hypothesis of Theorem 17.1, assume that the semigroup .{Tt } is strongly continuous in .L2 (S, π ), i.e., .DA = L2 . Then, (a) .RA is dense in .1⊥ -the subspace of mean zero functions in .L2 . (b) If one also assumes that 0 is an isolated point of the spectrum of A, then .RA = 1⊥ . ˆ i.e., Proof (a) Let .NAˆ denote the null space of .A, ˆ = 0}. NAˆ := {f ∈ L2 : Af
.
Since .Aˆ has a closed graph, .NAˆ is a closed subspace. We will now see that .N ˆ is a one-dimensional space of constants. For this, let .f ∈ N ˆ . Applying A A Dynkin’s martingale formula (Lemma 2), .{f (Xt ) : t ≥ 0} is a square integrable stationary martingale converging to some shift-invariant square integrable random variable Z (Theorem 12.3 in Bhattacharya and Waymire (2021).) Shift invariance and ergodicity imply that Z is a constant .π −almost surely. Therefore, f (Xt ) = E(Z|σ (Xs , 0 ≤ s ≤ t)) = Z, a.s..
.
Thus .NAˆ is the one-dimensional subspace of constants, the eigenspace of 0. Next, note the general facts (i) the adjoint .A∗ is well-defined since .DA is dense in ⊥ 2 4 .L (S, π ), and (ii) Range.(A) = N ∗ (Exercise 1). For (a), it is enough to show A ⊥ that .NA∗ ⊃ N ˆ , i.e., .NA∗ ⊂ NAˆ . For this, let .h ∈ NAˆ ∗ . Then for all .g ∈ DAˆ , A d ˆ t g, h = Tt g, Aˆ ∗ h = 0, so that . Tt g, h = g, Tt∗ h is constant Tt g, h = AT . dt
in t. Hence .Tt∗ h = h for all t. Now, .||Tt h − h||2 = ||Tt h||2 + ||h||2 − 2 Tt h, h = 2||h||2 − 2 Tt h, h = 2||h||2 − 2 h, Tt∗ h = 2||h||2 − 2 h, h = 0. This proves that ˆ = 0, i.e., .h ∈ N ˆ . For .Tt h = h and, differentiating with respect to t, one gets .Ah A ˆ and for any .g ∈ 1⊥ , .f := ∞ Tt gdt part (b), 0 belongs to the resolvent set .ρ(A), 0 belongs to .DAˆ , and (Exercise 3) ˆ = −g. Af
.
(17.13)
Remark 17.4 It may be shown that (b) is both necessary and sufficient for .RAˆ = 1⊥ (see Bhattacharya (1982)). We next turn to question Q2. The following simple example indicates the necessity of addressing this question. 4 See
Yosida (1964), p.205.
Example 1 (Newman5 ) Let .t → Xt , t ≥ 0 denote the deterministic periodic motion on the unit circle .S 1 . Then the uniform distribution .π on the circle is the unique invariant probability. This strictly stationary Markov process is ergodic. If f is 1 bounded and . S 1 f dπ = 0, then . 0 f (Xs )ds is bounded, and hence .σ 2 = 0. Example 2 Suppose that .Aˆ is self-adjoint. Then under the hypothesis of Theoˆ g , rem 17.1, .σ 2 > 0 for every nonzero .f ∈ RAˆ . To see this, note that .σ 2 = −2 Ag, 2 6 ˆ = f . The spectral theorem implies that .σ > 0. where .Ag For the remainder of this chapter, it will be convenient to assume that the semigroup is .{Tt : t ≥ 0} is strongly continuous in .L2 (S, π ), i.e., the domain .DA of the generator A is dense in .L2 (S, π ). Remark 17.5 Suppose .π is an invariant probability for a transition probability p(t; x, dy) and A the infinitesimal generator on .L2 (S, π ). Since A is a closed operator, its null space
.
NA = {h ∈ L2 (S, π ) : Ah = 0}
.
is closed. Therefore, given .f ∈ RA , the range of A, there exists a unique .g ∈ DA ∩ NA⊥ such that .Ag = f . One may use this g in Theorem 17.1. From the first part of the proof of part (a) of Proposition 17.2, the following result is an easy corollary. Proposition 17.3 Suppose that X is a stationary, Markov process on S with transition probability .p(t; x, dy) and invariant probability .π . Then X is ergodic if and only if the null space is a one-dimensional space of constants. Equivalently, X is ergodic if and only if 0 is a simple eigenvalue of A. Our next task is to show that, at least under some reasonable additional conditions, the variance parameter .σ 2 in Theorem 17.1(c) is strictly positive unless .f = 0 .π -a.s.. Proposition 17.4 In addition to the hypothesis in Theorem 17.1, asssume that for each pair .(t, x) ∈ (0, ∞) × S, .p(t; x, dy) and .π(dy) are mutually absolutely continuous. Then if .f ∈ RAˆ and f is bounded .π -a.s., then the variance parameter 2 .σ in Theorem 17.1(c) is strictly positive, unless .f = 0, .π -a.s.. ˆ = f. Proof Suppose .f ∈ RAˆ , f bounded, and .f = 0. Let .g ∈ DAˆ be such that .Ag 2 Ift possible suppose .σ = 0. Since differences of the martingale .Yt = g(Xt ) − 0 f (Xs )ds over successive non-overlapping time intervals of equal length form a stationary sequence of martingale differences, each martingale difference must be zero a.s., so that one must have, choosing a separable version of the martingale if
5 Personal 6 See
communication by C.M. Newman. Bhattacharya and Waymire (2022).
necessary:
t
Q(g(Xt ) − g(X0 ) −
.
f (Xs )ds = 0 for all t ≥ 0) = 1,
0
where .(Ω, F, Q) is the underlying probability space for the Markov process .{Xs : t s ≥ 0}. Thus, with probability one, . 0 f (Xs )ds = g(Xt ) − g(X0 ) for all t. From this, one shows that7 .g(Xt ) − g(X0 ) = 0 Q-a.s.. Choose and fix .t > 0. Since g is not an a.s. constant, there exist numbers .a < b and sets .A, B ∈ B(S) such that (i) .g < a on A, .g > b on B, and (ii) .π(A) > 0, π(B) > 0. Hence, writing .p(t; x, y) for a strictly positive version of the density of .p(t; x, dy) with respect to .π(dy), Q(g(X0 ) < a, g(Xt ) > b) =
.
{y:g(y)>b} {x:g(x) 0,
which contradicts .g(Xt ) − g(X0 ) = 0 a.s..
Remark 17.6 Note that it is enough to assume the absolute continuity in Proposition 17.4 for sufficiently large t. Let us now consider a progressively measurable Markov process .{Xt : t > 0} with state space S, initial distribution .μ, and transition probability p defined on some probability space .(Ω, A, Q). As .μ varies let the probability space vary. In this notation, .Q = Qm . The tail .σ -field is .T = ∩t≥0 [Xs : s ≥ t]. The tail .σ -field is .Qμ -trivial if .Qμ (A) = 0 or 1 for all .A ∈ T . The following proposition is essentially proved in Orey (1971), Proposition 4.3, pp. 19–20. Proposition 17.5 Let p be a transition probability admitting an invariant initial distribution .μ. Then the following statements are equivalent: (a) The tail .σ -field is .Qμ -trivial for every initial distribution .μ. (b) .||p(t; x, dy) − π(dy)||v → 0 as .t → ∞,for every .x ∈ S. Here .||ν||v denotes the variation norm of a signed measure .ν. Remark 17.7 Note that if condition (b) (or (a)) of Proposition 17.5 holds, then for every .π -integrable f one has Qμ ( lim T −1
.
7 See
T →∞
Bhattacharya (1982).
T 0
f (Xs )ds =
f dm) = 1, S
(17.14)
whatever be the initial distribution .μ. For the event C within parentheses is the same T +t as .C◦θt = {limT →∞ T −1 t f (Xs )ds = S f dm} for every .t > 0. Hence the left side of (17.14) equals .Eμ d(Xt ) where .d(x) = Qδx (C) and .Eμ denotes expectation with respect to .μ. Now .Em d(Xt ) = Qm (C) = 1, by the ergodic theorem. Therefore, |Qμ (C) − 1| = |Eμ d(Xt ) − Eπ d(Xt )| =| d(y)(p(t; x, dy) − π(dy))μ(dx)|
.
S
S
≤ ||p(t; x, dy) − π(dy)||v → 0 as t → ∞. Hence (17.14) holds. We can now prove the following useful result. Theorem 17.6 Let p be a transition probability admitting an invariant initial distribution .π . Assume that .||p(t; x, dy) − π(dy)||v → 0 as .t → ∞, for all .x ∈ S. Suppose .f ∈ RAˆ . Then for every .μ, the .Qμ -distributions of the random functions − 1 nt .n 2 0 f (Xs )ds(t ≥ 0), converge weakly to the Wiener measure with zero drift and variance parameter .σ 2 given by Theorem 17.1(c). Proof Fix a probability measure .μ on S. Write .Zn for the random function − 1 nt .n 2 0 f (Xs )ds(t ≥ 0). Let .ψ be a real valued bounded continuous function on − 1 nt+h f (Xs )ds(t ≥ .C[0, ∞). For .h > 0 let .Zn,h denote the random function .n 2 h 0). Then .Eμ ψ(Zn,h ) = Eμh ψ(Zn ), where .μh is the .Qμ -distribution of .Xh . Note that .|Eμh ψ(Zn ) − Em ψ(Zn )| ≤ ||ψ||∞ ||p(h; x, dy) − m(dy)||v μ(dx) → 0 S
1
as .h → ∞, uniformly for all n. Choose .h(n) → ∞, .h(n) = o(n− 2 ). Then by Theorem 17.1, .Em ψ(Zn ) → C[0,∞) ψdWσ 2 , where .Wσ 2 is the limiting Wiener measure. One has, therefore, .limn Eμ ψ(Zn,h ) = C[0,∞) ψdWσ 2 . By the ergodic theorem and (17.14), .
− 12
h(n)+nt
sup |Zn,h(n) (t) − n
− 12
h(n)
f (Xs )ds| ≤ n
0
t≥0
0
in .Qμ -probability as .n → ∞. Finally, the change of time θn (t) = (t − h(n)/n) ∨ 0,
.
|f (Xs )|ds → 0
applied to the process .n− 2 (1968, pp. 144–145)),
h(n)+nt 0
f (Xs )ds, shows as .n → ∞, (see Billingsley
Eμ ψ(Zn ) →
.
C[0,∞)
ψdWσ 2 .
17.1 A Functional Central Limit Theorem for Diffusions with Periodic Coefficients For this example we will assume a symmetric positive definite matrix .a(x) = ((aij (x)))1≤i,j ≤k for each .x ∈ Rk , a vector field .b(x) = (bi (x))1≤i≤k , .x ∈ Rk , such that (i) Each component .aij (x) and .bi (x) is a periodic function. aij (x+m) = aij (x)
bi (x+m) = bi (x), for all m ∈ Zk , x ∈ Rk , 1 ≤ i, j ≤ k.
.
(ii) Each .aij (x) has a second order derivative. (iii) Each .bi (x) has a continuous first order derivative. These coefficients may be associated with Kolmogorov’s backward pde for the operator A=
.
aij (x)
i,j
∂2 ∂ + bi (x) , ∂xi ∂xj ∂xi i
(1)
(k)
and/or with the diffusion .X = {Xt = (Xt , . . . , Xt ) : t ≥ 0} governed by the stochastic differential equation Xt = X0 +
.
t
b(Xs )ds +
0
t
σ (Xs )dBs ,
t > 0,
(17.15)
0
where .σ (x) is the positive square root of .a(x) for each .x ∈ Rk . It is the latter sde perspective that is to be considered here, with the pde connection serving as a tool for some of the analysis. In particular, with some appeal to pde theory for the existence of densities, the ergodic theory associated with the diffusion .X may be obtained from that of its transformation X˙ := X mod1
.
on the (compact) torus T = {x mod1 : x ∈ Rk } ≡ [0, 1)k ,
.
as follows. ˙ The diffusion X has a Theorem 17.7 (Ergodic Theory for X Derived from .X) unique invariant probability .π(dx), absolutely continuous with respect to Lebesgue measure with density .π(x). Moreover, .
sup |p(t; x, y) − π(y)| ≤ ce−βt ,
t ≥ 0,
x∈T
for positive constants .c, β. Proof Recall the construction in Example 8 from Chapter 13 of a diffusion, denoted X˙ = {X˙ t : t ≥ 0}, with state space the torus .T for the prescribed Lipschitz and periodic coefficients. Let
.
p(t; ˙ x, dy) = p(t; ˙ x, y)dy,
.
t > 0, x, y ∈ T,
˙ where denote the transition probabilities for .X,
p(t; ˙ x, y) =
.
p(t; x, y + m),
x, y ∈ [0, 1)k .
m∈Zk
Since, from pde theory,8 one has .
inf p(t; x, y) > 0,
x,y∈T
t > 0,
it follows readily from (13.36) that .
inf p(t; ˙ x, y) > 0,
x,y∈T
t > 0,
as well. Moreover, .T is compact, so Doeblin’s condition holds, and the assertion of ˙ i.e., the existence of an absolutely continuous invariant the theorem applies to .X, ˙ Now, .π(x) has a periodic extension to an probability .π(dx) = π(x)dx on .T for .X. invariant measure for X, again denoted by .π(dx) = π(x)dx. The main goal for this example is to prove the next theorem from Bensoussan et al. (1978) and Bhattacharya (1985). This theorem is partially motivated by an application to homogenization of dispersion in porous media according to scale, a topic to be treated in more detail in the special topics Chapter 24.
8 See
Friedman (1964), pp. 44–46.
Theorem 17.8 Let .{Xt : t ≥ 0} be a diffusion on .Rk with differentiable periodic coefficients .b(·) and .σ (·), with period 1 (period lattice .Zk ), and generator A for the diffusion dXt = b(Xt )dt + σ (Xt )dBt ,
.
where .{Bt : t ≥ 0}, is a standard k-dimensional Brownian motion. That is, with a(x) = σ (x)σ (x) ,
.
A=
.
bj (·)
1≤≤j ≤k
1 ∂ + ∂xj 2
aij (x)
1≤≤i,j ≤k
∂2 . ∂xi ∂xj
(17.16)
Then 1
Zt,λ := λ− 2 (Xλt − λbt) ⇒ {Zt : t ≥ 0}, as λ → ∞,
.
(17.17)
where the limiting diffusion .Z = {Zt : t ≥ 0} is a k-dimensional Brownian motion with mean zero and .k × k dispersion matrix K=
.
T1
(Gradg(x) − Ik )a(x)(Gradg(x) − Ik ) π(dx),
(17.18)
where Gradg = (gradg1 , . . . , gradgk ) ≡ (∇g1 , . . . , ∇gk ).
.
Here .π is the unique invariant probability of the diffusion X˙ t := Xt mod1,
.
t ≥ 0,
on the unit k-dimensional torus .T1 , and .g = (g1 , . . . , gk ) where .gj is the unique mean-zero periodic solution of Agj = bj − bj , j = 1, . . . , k,
.
bj =
T1
bj (x)π(dx).
(17.19)
Proof By Itô’s lemma, writing .∇ = grad, and .σj (x) for the j th row of .σ (x), one has .
gj (Xt ) − gj (X0 ) 1 = ∇gj (Xs )dXs + 2 [0,t]
1≤≤i,i ≤k [0,t]
ai,i (Xs )
∂2 gj (Xs )ds ∂xi ∂xi
17.1 A Functional Central Limit Theorem for Diffusions with Periodic. . .
=
311
[0,t]
(bj (Xs ) − bj )ds +
= Xt,j −X0,j − tbj +
[0,t]
[0,t]
∇gj · σj · (Xs )dBs
∇gj · σj · (Xs )dBs −
[0,t]
σj · (Xs )dBs , j = 1, . . . , k. (17.20)
Hence, Xt − X0 − tb = g(Xt ) − g(X0 ) −
.
[0,t]
(Gradg(Xs ) − Ik )σ (Xs )dBs
(17.21)
Under the scaling in Theorem 17.1, since Grad(g(Xs )) = Grad(g(X˙ s ))
.
and σ (Xs ) = σ (X˙ s ),
.
the martingale FCLT9 applies if .X˙ 0 has the invariant distribution .π . Note that .gj , ∇gj are bounded, so that
.
1
λ− 2 (g(Xt ) − g(X0 )) → 0 as λ → ∞.
.
Hence (17.17) follows from (17.21). For an arbitrary initial distribution, the FCLT still holds in view of Theorem 17.6.

Example 3 (Taylor–Aris Theory of Solute Transport) In a classic paper, G. I. Taylor$^{10}$ showed that when a solute in dilute concentration $c_0(dx)$ is injected into a liquid flowing with a steady flow velocity through an infinite straight capillary of uniform circular cross section, the concentration $c(t, y)$ along the capillary becomes Gaussian as time increases. Moreover, the large-scale dispersion coefficient could be explicitly calculated in terms of the cross-sectional average velocity $U_0$, the capillary radius $a > 0$, and the molecular diffusion coefficient $D_0$ as
9 Bhattacharya and Waymire (2022), Theorem 15.5, or Billingsley (1968), Theorem 23.1.
10 Taylor (1953) had argued the seminal asymptotic formula $D \sim a^2U_0^2/48D_0$. The explicit computation (17.22) was given by Aris (1956) using the method of moments and eigenfunction expansions, and it is popularly called the Taylor–Aris formula.
11 Wooding (1960) extended the Taylor–Aris analysis and formula to dispersion of a solute in a unidirectional parabolic flow between two parallel planes separated by a distance $r$. Accordingly, in this setting the effective dispersion is $2D + 8r^2U^2/(945D)$.
312
17 Functional Central Limit Theorem
D :=
.
a 2 U02 + D0 . 48D0
(17.22)
In the classic treatments, the problem was viewed as that of determining the long time behavior of the solute concentration .c(t; y) as a solution to the Fokker-Planck equation for the concentration. Specifically, let G = {y = (y (1) , y) ˜ : −∞ < y (1) < ∞, |y| ˜ < a}, ∂G = {y : |y| ˜ = a}, y˜ = (y (2) , y (3) ) . .
(17.23)
The velocity of the liquid is along the capillary length (i.e., in the .y (1) direction) and is given, as the solution of a Navier-Stokes equation governing a steady nonturbulent flow, by the parabolic flow F (y) ˜ = U0 (1 −
.
|y| ˜2 ). a2
(17.24)
The parameter .U0 is the maximum velocity, attained at the center of the capillary. Using Gauss’ divergence theorem, the pde for the time evolution of the concentration from the perspective of conservation of mass (Fick’s law) is of the form c(t; y) =
p(t; x, y)c0 (dx),
.
(17.25)
G
where, .p(t; x, y) satisfies the Fokker-Plank equation, i.e., Kolmogorov forward equation (see Chapter 15), .
1 ∂p ∂ = D0 y p − (1) (F (y)p) ˜ ∂t 2 ∂y
(y ∈ G, t > 0).
(17.26)
The forward boundary condition is the no-flux condition y (2)
.
∂p ∂p + y (3) (3) ≡ n(y) · grad p = 0 (2) ∂y ∂y
(y ∈ ∂G, t > 0).
(17.27)
It was subsequently recognized by Bhattacharya and Gupta (1984) that this problem has an equivalent formulation in terms of the injected particles. Namely, for a dilute concentration, particle motions may be assumed to be independent, and their individual evolutions simply involve an ergodic theorem for the cross-sectional dispersion in the transverse direction to the fluid flow and a central limit theorem in the longitudinal direction of the flow. Specifically, since (i) the diffusion matrix is .D0 I, (ii) the drift velocity is along the .x (1) -axis and depends only on .x (2) , x (3) , and (iii) the boundary condition only involves .x (2) , x (3) , the individual immersed particle trajectories .{Xt } have the following representation:
Proposition 17.9 a. .{X˜ t := (Xt , Xt )} is a two-dimensional, reflecting Brownian motion on the disc .Ga := {y ∈ R2 : |y| ˜ ≤ a} with diffusion matrix .D0 I, where .I is the .2 × 2 identity matrix. t √ (1) (1) b. .Xt = X0 + 0 F (X˜ s )ds + D0 Bt , where .B = {Bt } is a one-dimensional standard Brownian motion, independent of .{X˜ t } and .X0(1) . (2)
(3)
Theorem 17.10 Let .X˜ and B denote the independent diffusions in Proposition 17.9. a. .X˜ is an ergodic Markov process whose unique invariant probability .π is the uniform distribution on on the disc .Ga . b. The radial process .{R˜ t := |X˜ t |} is a one-dimensional diffusion on .[0, a] whose transition probability density is q(t; r, r˜ ) :=
p(t; ˜ x, ˜ y)s ˜ r˜ (d y) ˜
.
{y:| ˜ y|=˜ ˜ r}
(|x| ˜ = r),
(17.28)
where .sr˜ (d y) ˜ is the arc length measure on the circle .{y˜ ∈ R2 : |y| ˜ = r˜ } and .p(t; x, ˜ y) ˜ is the transition probability of the diffusion .X˜ on .Ga . Moreover, .{R˜ t } has the unique invariant density π(r) =
.
2r , a2
0 ≤ r ≤ a,
(17.29)
and .
max |q(t; r, r˜ ) −
0≤r,˜r ≤a
2r | ≤ 2π ac1 e−c2 t a2
(t > 0),
(17.30)
for some constants .c1 > 0, c2 > 0. c. 1 (1) Yt(n) := n−1/2 (Xnt − U0 nt) ⇒ N(0, Dt), 2
.
(17.31)
where D :=
.
a 2 U02 + D0 , 48D0
(17.32)
Proof The Markov property for these functions of X, transition probabilities, and their generators are provided in Chapter 13. The transition probability density .p˜ of the Markov process .{X˜ t } is related to p by
p(t; ˜ x, ˜ y) ˜ =
∞
.
−∞
p(t; x, y)dy (1) .
(17.33)
˜ x, ˜ y) ˜ is positive and continuous in .x, ˜ y˜ ∈ Ga for The disc .Ga is compact and .p(t; each .t > 0. It follows, due to Doeblin minorization, that there is a unique invariant probability .π for .p˜ and that for some positive constants .c1 , c2 , one has .
max |p(t; ˜ x, ˜ y) ˜ − π(y)| ˜ ≤ c1 e−c2 t
(t > 0).
x, ˜ y∈B ˜ a
(17.34)
It is straightforward to check that .π(y) ˜ is the uniform density (Exercise 6) The generator of .{R˜ t } is given by the backward operator 1 d d2 1 ), 0 < r < a, D0 ( 2 + .A := r dr 2 dr
∂ = 0, dr r=a
(17.35)
and its transition probability density converges exponentially fast to the invariant density (17.29). To prove (17.31), write f (r) = 1 −
.
r2 a2
(0 ≤ r ≤ a).
(17.36)
Then F (y) ˜ = U0 f (|y|). ˜
.
Now by the central limit theorem (Theorem 17.1), it follows that as .n → ∞, {Zt(n) = n−1/2
nt
.
(f (R˜ s ) − Eπ f )ds : t ≥ 0} ⇒ σ B,
(17.37)
0
where Eπ f =
.
a
f (r)π(r)dr =
0
0
a
(1 −
1 r 2 2r )( )dr = . 2 a2 a2
(17.38)
The key calculation of the variance parameter .σ 2 of the limiting Gaussian yields
a
σ =2
.
2
(f (r) − Eπ f )h(r)π(r)dr
(17.39)
0
where h is a function in .L2 ([0, a], π ) satisfying Ah(r) = −(f (r) − Eπ f ) =
.
1 r2 − . 2 2 a
(17.40)
Such an h is unique up to the addition of a constant and is easily given by 1 r4 ( − 2r 2 ) + c3 8D0 a 2
h(r) =
.
(17.41)
where .c3 is an arbitrary constant, which may be taken to be zero in carrying out the integration in (17.39). One then obtains σ2 =
.
a2 . 48D0
(17.42)
Finally, since .{Bt } and .{X˜ t } in part (b) of Proposition 17.9 are independent, the formula for D follows. Remark 17.8 (17.31) and (17.32) represent Taylor’s main result as completed by Aris (1956). It is noteworthy that the effective dispersion D is larger than the molecular diffusion coefficient .D0 and grows quadratically with velocity. Remark 17.9 That the probabilisitic method is truly different from the use of eigenfunctions by Aris (1956) is shown in Bhattacharya and Gupta (1984) by deriving an identity involving the zeros of Bessel functions of order one by comparing the two methods and even conjecturing a result which seems to be new. Remark 17.10 The problem of extending these results to cover nonsmooth dispersion coefficients at the molecular scale arose in hydrologic experiments involving sharp heterogeneities in the subsurface medium. The conventional wisdom was that the large-scale dispersion would still be Gaussian with a dispersion coefficient of the Taylor-Aris general form (17.22), but involving ‘averaged’values of the molecular scale diffusion coefficient. In an analysis of this problem, Ramirez et al. (2006) showed that the probabilistic approach12 of Bhattacharya and Gupta (1984), together with properties of skew Brownian motion, leads to Gaussian dispersion, with an explicit formula for the large-scale dispersion. In particular the specific nature of the asymptotic dispersion is shown to involve both arithmetic and harmonic averaging of the molecular scale diffusion coefficient in the formula for the large-scale dispersion. Remark 17.11 (A Gaussian Random Field Indexed By .RA .) It follows immediately from Theorem 17.1 that one has the mean-zero Gaussian random field .G indexed by .RA , where G := {Z(f ) : f ∈ RA },
.
12 A technical issue occurs in that the independence of $\{B_t\}$ and $\{\tilde{X}_t\}$ in Proposition 17.9(b) no longer holds for finite time $t$.
with covariance .
Cov(Z(f1 ), Z(f2 )) = − f1 , g2 − f2 , g1 ,
where .gi , i = 1, 2, is the unique mean-zero element of .DA such that .Agi = fi , i = 1, 2.
Exercises 1. (a) Show that the adjoint operator A∗ is well defined if DA is dense in L2 (S, π ). What is the domain of A∗ ? [Hint: Let g ∈ L2 be such that the map f → Af, g is continuous on DA . Then it has a unique extension to L2 , and there exists a unique element A∗ g, say, such that A∗ g, f = Ag, f . DA∗ comprises all such g and A∗ g is the corresponding linear functional as defined.] (b) Prove that RA ⊂ NA⊥ . [Hint: Use Theorem 17.1(c) and Proposition 17.2.] 2. In the proof of Proposition 17.4, show that g(Xt ) − g(X0 ) = 0 π -a.s.. [Hint: Under the hypothesis, the martingale differences Δn g are zero π -a.s.. Hence one n has the identity g(Xh ) − 0 f (Xs )ds = g(X0 ). The martingale on the left side does not depend on n. It is uniformly integrable, and its limit, as n → ∞, is the same as its value for every integer n > 0. The same arguments show that n g(Xh ) − m f (Xs )ds = g(Xm ) for all m > n. But the limit of the martingale on the left side is the same as that of the preceding identity. Hence the two right sides must coincide, i.e., g(Xm ) = g(X0 ) a.s..] 3. Supply the details to prove (17.13). 4. Consider a positive recurrent diffusion on R (see Chapters 11 and 21). (a) Show RA = 1⊥ . (b) Show that f ∈ RA if and only if x that y x → 0 −∞ f (z)m(dz)dy ∈ L2 (R, π ), where π is the unique invariant probability. (c) Write out the density of π with respect to Lebesgue measure. [Hint: See Chapter 11.] (d) Check that for f ∈ RA , σ 2 = −2 f, g = ∞ 0 y 2 m(R) −∞ f (x){ x dy −∞ f (z)dz}m(dx). (e) Show that one needs to assume x f ∈ 1⊥ , and x → 0 { (−∞,y] f (z)m(dz)}dy ∈ L2 . 5. Write out the asymptotic variance for Theorem 17.8 with k = 1, i.e., T1 = {x mod1 : x ∈ R}, which may be viewed as the unit circle in the complex plain, {e2π ix : 0 ≤ x ≤ 1}. 6. Show that the invariant distribution for (17.33) in the Taylor-Aris example is uniform. [Hint: Use the backward equation (and boundary condition) for p˜ to show that ∂t∂ {|y|≤a} p(t; ˜ x, ˜ y)π( ˜ x)d ˜ x˜ = 0.] ˜ 2
2
U 7. Wooding (1960) Compute the asymptotic formula D = 2D + 8r 945D for dispersion of a solute in a unidirectional parabolic flow between two parallel planes separated by a distance r.
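As a numerical companion to Exercise 5 and to the one-dimensional case of Theorem 17.8, the following Python sketch is an added illustration with assumed coefficients, not part of the text. It takes $k = 1$, $b \equiv 0$, and a smooth periodic $a(\cdot)$; in that special case the corrector in (17.19) is $g \equiv 0$, the invariant density of $\dot{X}$ on the torus is proportional to $1/a$, and (17.18) reduces to $K = \int_{\mathcal{T}^1} a\,d\pi$, the harmonic mean of $a(\cdot)$. The sketch estimates $\mathrm{Var}(X_T)/T$ and compares it with this value.

```python
import numpy as np

rng = np.random.default_rng(0)
a = lambda x: 1.0 + 0.5 * np.sin(2.0 * np.pi * x)   # assumed periodic diffusion coefficient, b = 0
T, n_steps, n_paths = 100.0, 100000, 2000
dt = T / n_steps

X = np.zeros(n_paths)
for _ in range(n_steps):
    # dX = sigma(X) dB with sigma = sqrt(a); Euler-Maruyama step
    X += np.sqrt(a(X) * dt) * rng.normal(size=n_paths)

K = np.sqrt(0.75)   # harmonic mean of a: 1 / \int_0^1 dx/(1 + 0.5 sin 2*pi*x) = sqrt(1 - 0.25)
print("Var(X_T)/T ~", X.var() / T, "   effective diffusivity K =", K)
```

Note that the effective diffusivity (about $0.866$ here) is smaller than the arithmetic mean of $a(\cdot)$ (which is $1$), a typical homogenization effect.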
Chapter 18
Asymptotic Stability for Singular Diffusions
In the present chapter, asymptotic stability conditions for diffusions with linear drift and Lipschitz diffusion coefficients are presented. Although originally designed to provide criteria for the stability in distribution of multidimensional diffusions with linear drift and singular diffusion coefficients, the method presented may be used for nonsingular diffusions as well and also for some nonlinear drift.
In this chapter we consider diffusions .{Xx (t) : t ≥ 0} on .Rk defined as solutions to the stochastic differential equation
$$X^x(t) = x + \int_0^t BX^x(s)\,ds + \int_0^t \sigma(X^x(s))\,dW(s), \qquad t \ge 0, \tag{18.1}$$
where $B$ is a $k \times k$ matrix, $\sigma(\cdot)$ is a Lipschitz $k \times \ell$-matrix-valued function on $\mathbb{R}^k$, and $W = \{W(t) : t \ge 0\}$ is an $\ell$-dimensional standard Brownian motion. In particular, assume there is a $\lambda_0 \ge 0$ such that
.
x, y ∈ Rk ,
(18.2)
where .|| · || denotes matrix norm with respect to the Euclidean norm .| · |. The transition probability of the diffusion will be denoted .p(t; x, dy). The goal is to provide conditions on B and .σ (·) to ensure stability in distribution in the sense that there is a probability measure .π(dy) such that, for every initial state x, .p(t; x, dy) converges weakly to .π(dy) as .t → ∞. © Springer Nature Switzerland AG 2023 R. Bhattacharya, E. C. Waymire, Continuous Parameter Markov Processes and Stochastic Differential Equations, Graduate Texts in Mathematics 299, https://doi.org/10.1007/978-3-031-33296-8_18
317
Criteria were established in Chapter 11 in the case that the matrices .σ (·)σ (·)t are nonsingular. In such (nonsingular) cases, asymptotic stability is equivalent to the existence of a unique invariant probability. The emphasis here is on singular diffusions, although the method applies to nonsingular diffusions also. There are two components of the present method: (1) tightness and (2) asymptotic flatness defined below. Example 1 Let .k = 2 and take B=
.
−1 0 , 0 −1
x2 0 . −x1 0
σ (x) = c
Then .R 2 (t) = X12 (t) + X22 (t), t ≥ 0, satisfies dR 2 (t) = (c2 − 2)R 2 (t)dt,
.
so that R 2 (t) = R 2 (0)e(c
.
2 −2)t
,
t ≥ 0.
In the case $|c| = \sqrt{2}$, $R^2(t) = R^2(0)$ for all $t \ge 0$. Thus, the transition probabilities $\{p(t; x, dy) : t \ge 0\}$ of $\{X^x(t) : t \ge 0\}$ are tight for every $x$. However, there is an invariant probability on every circle, and the angular motion on the circle is a periodic diffusion. On the other hand, if $c \ne \sqrt{2}$, then the only invariant probability is the point mass at the origin, but if $|c| > \sqrt{2}$ and $X(0) \ne 0$, then $R^2(t) \to \infty$ a.s. as $t \to \infty$. If $|c| < \sqrt{2}$, then the diffusion is stochastically stable a.s. and, therefore, in distribution.
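A quick Euler–Maruyama check of the closed-form radial dynamics in Example 1 is given below; this is an added illustration, and the value of $c$, the initial point, and the step size are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
c, T, dt = 1.0, 2.0, 1e-4
n_steps = int(T / dt)
x = np.array([1.0, 0.5])                 # arbitrary starting point

for _ in range(n_steps):
    dW = rng.normal(scale=np.sqrt(dt))   # only the first column of sigma is nonzero
    x1, x2 = x
    x = x + np.array([-x1, -x2]) * dt + c * np.array([x2, -x1]) * dW

print("simulated R^2(T):", (x**2).sum(),
      "   exact R^2(0) exp((c^2-2)T):", (1.0**2 + 0.5**2) * np.exp((c**2 - 2.0) * T))
```

For $|c| < \sqrt{2}$ the simulated squared radius decays at the deterministic rate $c^2 - 2$, up to discretization error, as the closed form predicts.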
sup P (|Xx (t) − Xy (t)| > ε) → 0 as t → ∞. x,y∈K
1 The results presented here were originally obtained by Basak and Bhattacharya (1992).
(18.3)
Lemma 1 The following two conditions are sufficient to imply stochastic stability: (i) .{p(t; x, dy) : t ≥ 0} is tight, and (ii) the stochastic flow is asymptotically flat uniformly on compacts. Proof For each x tightness provides a probability measure .π x (dy), say, such that x .p(tm(x) ; x, dy) ⇒ π (dy) as .m → ∞ for an unbounded sequence .0 ≤ t1(x) < t2(x) < · · · . For .z ∈ Rk , tightness of .{p(tm(x) ; z, dy) : m ≥ 1} provides a common subsequence, say .{tm }, such that .p(tm ; x, dy) ⇒ π x , and .p(tm ; z, dy) ⇒ π z as x = π z . Then there is a continuous function g with compact .m → ∞. Suppose .π support, say K, such that
.
Rk
g(y)π x (dy) =
Rk
g(y)π z (dy),
(18.4)
but .
lim
m→∞ Rk
g(y)p(tm ; x, dy) = lim
m→∞ Rk
g(y)p(tm ; z, dy).
So (18.4) contradicts the assumption of asymptotic flatness uniformly on compacts. In order to establish asymptotic flatness uniformly on compacts, a stronger property of asymptotic flatness in the .δ-th mean, which means for .δ > 0, and compact .K ⊂ Rk , .
lim sup E|Xx (t) − Xy (t)|δ = 0.
t→∞ x,y∈K
(18.5)
The following example shows that asymptotic flatness uniformly on compacts, alone, is far from being sufficient for stability. Example 2 Let .k = 1, .b(x) = e−x , σ (x) ≡ 0. Then .Xx (t) = log(t + ex ) → ∞ as .t → ∞, but Xx (t) − Xz (t) = log
.
t + ex →0 t + ez
as .t → ∞, uniformly for .x, z in a compact set K. Let a(x) = σ (x)σ (x) , a(x, y) = (σ (x) − σ (y))(σ (x) − σ (y)) .
.
The main result for this chapter can be stated as follows. Theorem 18.1 Assume the Lipschitz condition (18.2).
(18.6)
a. Assume there is a symmetric positive definite matrix C and a positive constant .γ such that for .x = y, .2C(x
− y) · B(x − y) −
2C(x − y) · a(x, y)C(x − y) + tr(a(x, y)C) ≤ −γ |x − y|2 , (x − y) · C(x − y) (18.7)
where .· is the Euclidean dot product and .tr denotes the matrix trace. Then the diffusion .{Xx (t) : t ≥ 0} is stable in distribution. b. Assume there is a symmetric positive definite matrix C and constant .β > 0 such that for all sufficiently large .|x|, 2Cx · Bx −
.
2Cx · a(x)Cx + tr(a(x)C) ≤ −β|x|2 . x · Cx
(18.8)
Then an invariant probability exists. Corollary 18.2 Assume the Lipschitz condition (18.2). Then (18.7) is sufficient for stability. Proof This follows since for Lipschitz .σ (·), (18.2) implies (18.8) for every .β ∈ (0, λ), as can be seen as follows. Take .y = 0 in (18.7) and note that .
2Cx · a(x, 0)Cx + tr(a(x, 0)C) x · Cx 2Cx · a(x)Cx = 2Cx · Bx − + tr(a(x)C) + O(|x|), as |x| → ∞. x · Cx
− γ |x|2 ≥ 2Cx · Bx −
In the course of the proof of Theorem 18.1 below, one sees that (18.8) implies the existence but not the uniqueness implied by stability. The stronger condition (18.7) also implies asymptotic flatness (18.3). The existence of an invariant probability and asymptotic flatness together immediately yield uniqueness and stability, as demonstrated by the proof of Lemma 1. The following Corollary 18.3 may be viewed as a generalization of a well-known necessary and sufficient condition2 on the matrix B for stability in the case of a constant matrix .σ (·) ≡ σ . Also see Exercise 6. Corollary 18.3 Assume that .σ (·) is Lipschitz with coefficient .λ0 and all eigenvalues of B have negative real parts. Assume in addition that (k − 1)λ20
N
Then, .Mn → ∞ as .n → ∞, and Ev(Xtx ) = v(x) + E
t
.
0
t
= v(x) + E 0
t
+E 0
Lv(Xsx ))ds 1[|Xsx |>n] Lv(Xsx ))ds
1[|Xsx |≤n] Lv(Xsx ))ds
t
≤ −Mn E 0
1[|Xsx |>n] ds + Mt
= −Mn 0
t
c
p(s; x, B n )ds + Mt + v(x),
(18.21)
where the last equality is by Fubini–Tonelli, and B_n = {y : |y| < n}. Thus,

(1/t) ∫_0^t p(s; x, B̄_n^c) ds ≤ [v(x) − E v(X_t^x)]/(M_n t) + M/M_n,

and therefore

(1/t) ∫_0^t p(s; x, B̄_n^c) ds → 0  as n → ∞.

Thus, given δ > 0, there is a compact set K_δ = B̄_n, for sufficiently large n, such that

(1/t) ∫_0^t p(s; x, K_δ) ds > 1 − δ,  for all t > 0.

Thus, choose an unbounded, increasing sequence of times t_n, n ≥ 1, such that {p(t_n; x, dy)} is tight. Now apply Lemma 1 to prove part (b), and complete the proof of Theorem 18.1.
For nonsingular diffusions, the Khas'minskii criterion for positive recurrence given in Chapter 11 may be applied, but it is not very suitable if the infinitesimal generator is far from being radial. In the following example, the Khas'minskii criterion is not satisfied but (18.8) holds.

Example 3 Let

k = 2,  B = −I₂,  a(x) = (δ₁ + δ₂x₂²)I₂,

for constants δ₁ > 0, δ₂ ≥ 0. Recall, for nonsingular σ(·), the Khas'minskii criterion for stability, given by Theorem 11.8, computes as follows:

∫₁^∞ e^{−I(r)} dr = ∞ for all δ₂ ≥ 0,   ∫₁^∞ (1/α(r)) e^{I(r)} dr < ∞ if and only if δ₂ < 1.   (18.22)
In particular, according to this test, the diffusion is stable in distribution if .δ2 ∈ [0, 1). On the other hand, taking .C = I2 , the left side of (18.8) is .−2|x|2 . Thus the criterion (18.8) is satisfied, and the diffusion is stable in distribution regardless of the value of .δ2 ≥ 0.
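The computation behind the claim that, with C = I₂, the left side of (18.8) equals −2|x|² for this example can be checked symbolically. The following is a minimal sketch (illustrative only, not part of the text) using a computer algebra system.

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2', real=True)
d1, d2 = sp.symbols('delta1 delta2', positive=True)

x = sp.Matrix([x1, x2])
B = -sp.eye(2)                      # B = -I_2 as in Example 3
a = (d1 + d2 * x2**2) * sp.eye(2)   # a(x) = (delta1 + delta2 x2^2) I_2
C = sp.eye(2)

# Left side of (18.8): 2Cx.Bx - 2Cx.a(x)Cx/(x.Cx) + tr(a(x)C)
lhs = (2 * (C * x).dot(B * x)
       - 2 * (C * x).dot(a * C * x) / x.dot(C * x)
       + (a * C).trace())
print(sp.simplify(lhs))  # -2*x1**2 - 2*x2**2, i.e. -2|x|^2, independent of delta2
```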
It is natural to ask⁴ if a diffusion on ℝ^k would be stable in distribution if σ(·) is Lipschitz, σ(0) = 0, and all eigenvalues of B have negative real parts. Equation (18.9) shows that the answer is affirmative for k = 1. Example 1 provides a counterexample in the case k = 2, with σ(·) modified near the origin. The following two examples show that even when restricted to nonsingular diffusions, the answer is “yes” for k = 1, and “no” for k > 1.

Example 4 Let

k > 2,  B = −δI_k (δ > 0),  a(x) = dr²I_k  (d > 0, r = |x| ≥ 1).   (18.23)

Then a(·) is nonsingular and Lipschitz on ℝ^k. In this case, Khas'minskii's criterion is necessary as well as sufficient, and

I(r) = (k − 1 − 2δ/d) ln r,  r ≥ 1.
If δ/d < (k − 2)/2, then the first integral in (18.22) converges, implying that the diffusion is transient. If δ/d = (k − 2)/2, then the first integral in (18.22) diverges, as does the second integral. Thus the diffusion is null-recurrent. If δ/d > (k − 2)/2, then (18.22) holds and the diffusion is, therefore, positive recurrent and hence stable in distribution.

Example 5 Let

k = 2,  B = −δI₂ (δ > 0),  a(x) = ( λ₁x₂² + ε   −λ₁x₁x₂ ; −λ₁x₁x₂   λ₁x₁² + ε ),

where δ, λ₁, and ε are positive constants. Note that the positive definite square root of a(x) is Lipschitz in this case. Khas'minskii's criterion for recurrence (or transience) is necessary and sufficient here, and computes as follows:

β(r) := inf_{|x|=r} [(λ₁ − 2δ)|x|² + ε]/ε = 1 + ((λ₁ − 2δ)/ε) r²,   I(r) := ∫₁^r β(u)/u du.   (18.24)
If λ₁ = 2δ, then both integrals in (18.22) diverge, which implies that the diffusion is null recurrent. If λ₁ > 2δ, then the first integral in (18.22) is finite, so that the diffusion is transient. In particular, there is no invariant probability in this case.
4 This question was posed by L. Stettner during a 1986-IMA workshop at the University of Minnesota.
Example 6 In Theorem 18.1, if σ(·) is a constant matrix, singular or nonsingular, then a(x, y) vanishes. Hence if all the eigenvalues of B have negative real parts, then the diffusion has a unique invariant probability and it is stable in distribution. (See Exercise 5.)

Remark 18.1 It should be noted that Theorem 18.1 holds with Bx replaced by b(x), if the inequality (18.7) holds. There is no change in the proof. The problem of course is to show that the inequality (18.7) holds for a given b(·).

Example 7 (Nonlinear Drift) Let b(x) = (b₁(x), . . . , b_k(x)). Suppose b_j(x) is a polynomial in x_j alone, involving only odd powers, apart from a constant term:

b_j(x_j) = c_{j,0} + c_{j,1}x_j + ··· + c_{j,m(j)}x_j^{m(j)},

where m(j) ≥ 1, c_{j,1} < 0, and c_{j,3}, . . . , c_{j,m(j)} ≤ 0, for all j = 1, . . . , k. If σ(·) is a constant matrix, then (18.7) holds with C = I_k, and the diffusion is stable in distribution. If σ(·) is Lipschitz, |σ(x) − σ(y)| ≤ λ₀|x − y|, then (18.7) holds if 2λ > (k − 1)λ₀², where λ = min{|c_{j,1}| : j = 1, . . . , k} (Exercise 7).

Example 8 (Nonlinear Drift) Consider a nonlinear drift b(·). Assuming b(·) is continuously differentiable, one has the identity

b(x) − b(y) = ∫_{[0,1]} (d/dθ) b(y + θ(x − y)) dθ = ∫_{[0,1]} J(y + θ(x − y))(x − y) dθ,

where J(x) = grad b(x) ≡ ((∂b_i(x)/∂x_j)). If C is a symmetric positive definite matrix, then

C(x − y)·(b(x) − b(y)) = ∫_{[0,1]} C(x − y)·J(y + θ(x − y))(x − y) dθ
  = ∫_{[0,1]} J′(y + θ(x − y))C(x − y)·(x − y) dθ
  = (1/2) ∫_{[0,1]} ((CJ + J′C)(y + θ(x − y))(x − y))·(x − y) dθ.   (18.25)
Assume now that ½(CJ + J′C)(·) has all its eigenvalues less than or equal to −λ < 0. Then, if σ(·) is constant, (18.7) holds, and the diffusion with drift b(·) has a unique invariant measure, and it is stable in distribution. If, instead, σ(·) is Lipschitz, |σ(x) − σ(y)| ≤ λ₀|x − y|, then, as in the proof of Theorem 18.1, the diffusion is stable in distribution if 2λ > (k − 1)λ₀²‖C‖. In the special case b(x) = Bx considered in Corollary 18.3, namely, B has all its eigenvalues with negative real parts, one may apply the above with C = P (see (18.10)), to get (CJ + J′C) = (PB + B′P) = −I_k, i.e., 2λ = 1 (see (18.11)). As pointed out in Example 6, in this
case, if σ(·) is a constant matrix, the diffusion is stable in distribution. In the case σ(·) is Lipschitz, it follows from the above that the diffusion is stable in distribution if 1 > (k − 1)λ₀²‖P‖, the same result as Corollary 18.3. Example 6 is a special case of this method, with J diagonal. It is usually relatively simple to deal with dimension k = 2, since one can compute the eigenvalues by solving a quadratic equation (see Exercise 8).
Remark 18.2 In the case σ(·) is nonsingular and Lipschitz, one only needs to check the simpler condition (18.8) in order to deduce stability in distribution. This follows from Chapter 11, i.e., one only needs to prove tightness of the family {p(t; x, dy) : t > 0} of probability measures, for some x.

We close with the following interesting example introduced by Kolmogorov to further illustrate the scope of possibilities.

Example 9 (Kolmogorov's Nonsingular Diffusion with Singular Diffusion Coefficient) This example of Kolmogorov shows that a d-dimensional diffusion with a singular diffusion term of rank less than d can still be nonsingular, i.e., have a d-dimensional smooth density everywhere. Consider the two-dimensional diffusion defined by

dV_t = βV_t dt + σ dB_t,   dX_t = V_t dt,

where σ is a nonzero constant and {B_t : t ≥ 0} is a one-dimensional standard Brownian motion. Then {V_t : t ≥ 0}, starting at any arbitrary state, is a one-dimensional Ornstein–Uhlenbeck process, in particular with a unique invariant Gaussian density if β < 0. The position process {X_t : t ≥ 0} is also Gaussian, and the joint distribution of {(X_t, V_t) : t ≥ 0} has a positive two-dimensional Gaussian density for t > 0, regardless of the initial state (V₀, X₀) = (v₀, x₀). Hörmander's profound Hypoellipticity Theorem says that if the Lie algebra defined by certain d + 1 smooth vector fields based on the drift and diffusion coefficients is of rank d at every point of ℝ^d, then the d-dimensional diffusion has a smooth positive joint density everywhere.⁵
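The nondegeneracy in Example 9 is easy to see numerically. The following Euler–Maruyama sketch (illustrative only; the parameter values β = −1, σ = 1 and the step size are arbitrary choices, not prescribed by the text) simulates many independent copies of (V_t, X_t) and estimates the covariance matrix of (X_t, V_t); its determinant is strictly positive even though noise enters only through the V-equation.

```python
import numpy as np

rng = np.random.default_rng(0)

beta, sigma = -1.0, 1.0          # illustrative parameters
t_final, dt = 1.0, 1e-3
n_paths = 20_000
n_steps = int(t_final / dt)

V = np.zeros(n_paths)            # V_0 = 0
X = np.zeros(n_paths)            # X_0 = 0

for _ in range(n_steps):
    dB = rng.normal(0.0, np.sqrt(dt), size=n_paths)
    X += V * dt                              # dX = V dt (no noise term here)
    V += beta * V * dt + sigma * dB          # dV = beta*V dt + sigma dB

cov = np.cov(np.vstack([X, V]))
print("estimated covariance of (X_t, V_t):\n", cov)
print("determinant:", np.linalg.det(cov))    # strictly positive: nondegenerate law
```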
Exercises

1. Consider Example 1. (a) Let |c| = √2. Show that there is an invariant probability on every circle {x : |x| = r}. (b) Let |c| ≠ √2. Show that the only invariant probability is δ_{0}. Show also that there is no stabilizing distribution if |c| > √2, but that there is a stabilizing distribution if |c| < √2.
5 For a precise statement and derivation of an extension of Hörmander's result using Malliavin calculus, we refer to Chapter 5 of Ikeda and Watanabe (1989).
(c) Let |c| = √2, but modify σ(·) to be nonsingular on {x : |x| < r₀} for some r₀ > 0, and keep it the same as in the example for |x| > r₀. Investigate the asymptotics under this change.
2. (a) Prove that e^{A+B} = e^A e^B when the defining exponential series are (absolutely) convergent and A, B commute. (b) Prove (18.11). [Hint: B′e^{s(B′+B)} + e^{s(B′+B)}B = (d/ds)e^{s(B′+B)}, so that ∫₀^∞ {B′e^{s(B′+B)} + e^{s(B′+B)}B} ds = e^{s(B′+B)}|₀^∞ = −I_k.]
3. Show that stability in distribution implies that π is the unique invariant probability.
4. (Stochastic Stability of a Trap) If the drift is a Lipschitz function b(·), i.e., not necessarily linear, then a point x* such that b(x*) = σ(x*) = 0 is referred to as a trap. Show that if x* is a trap, then a sufficient condition that X_t^x → x* a.s. and in the δ-th mean, exponentially fast for every x as t → ∞, is given by 2C(x − x*)·b(x) − [2C(x − x*)·a(x)C(x − x*)]/[(x − x*)·C(x − x*)] + tr(a(x, x*)C) ≤ −γ|x − x*|² for all x ≠ x*.
5. Assume that σ(·) = σ is a constant matrix. (A numerical sketch related to this exercise appears after the list.) (i) Use the Picard method of successive approximation to show that X_t^x = e^{tB}x + ∫₀^t e^{(t−s)B}σ dW(s), t ≥ 0. (ii) Use Itô's Lemma to verify this solution. (iii) Show that Σ(t) = E(∫₀^t e^{(t−s)B}σ dW(s))(∫₀^t e^{(t−s)B}σ dW(s))′ = ∫₀^t e^{uB}σσ′e^{uB′} du. (iv) Assume that the real parts of the eigenvalues λ_j, j = 1, . . . , k, of B are negative. Show that Σ(t) → Σ := ∫₀^∞ e^{uB}σσ′e^{uB′} du as t → ∞. [Hint: The modulus of an eigenvalue of e^B is |e^{λ_j}| < 1, 1 ≤ j ≤ k, and therefore e^{tB}x → 0 as t → ∞.] (v) Show that {X_t^x : t ≥ 0} is stable with Gaussian invariant probability having mean zero and dispersion matrix Σ. (vi) Take B = −δI_k, where δ > 0. Compute the dispersion matrices Σ(t), Σ explicitly in this case, and show that they are singular if and only if σ is singular.
6. Give an example of a 2 × 2 matrix B whose eigenvalues have negative real parts, but B + B′ is not negative definite.
7. (Nonlinear drift) (a) Show that if the linear drift is replaced by an arbitrary Lipschitz drift b(·), then Theorem 18.1 holds with no essential change of proof other than replacing B(x − y) in (18.7) and Bx in (18.8) by b(x) − b(y) and b(x), respectively. (b) Verify the stability assertion in Example 7.
8. Consider the case k = 2, b(x) = (b₁(x), b₂(x)), where b₁(x) = −cx₁ − αx₁x₂², b₂(x) = −dx₂ − βx₁²x₂, and c, d, α, β are positive constants. (a) Show that the eigenvalues of ½(J(x) + J(x)′) are no more than max{−c, −d}. [Hint: Solve the equation det(½(J(x) + J(x)′) − λI₂) = 0.] (b) What conclusions can be drawn about stability in distribution in the cases (i) σ(·) is constant? (ii) σ(·) is Lipschitz?
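In connection with Exercise 5, the limiting dispersion matrix Σ = ∫₀^∞ e^{uB}σσ′e^{uB′} du can also be obtained, when the eigenvalues of B have negative real parts, as the solution of the Lyapunov equation BΣ + ΣB′ = −σσ′. The following sketch (matrices chosen arbitrarily for illustration) compares that algebraic solution with a direct quadrature of the integral.

```python
import numpy as np
from scipy.linalg import solve_lyapunov, expm

# Illustrative matrices; any B whose eigenvalues have negative real parts will do.
B = np.array([[-1.0, 2.0],
              [0.0, -3.0]])
sigma = np.array([[1.0, 0.0],
                  [0.5, 0.2]])
Q = sigma @ sigma.T

# Sigma solves the Lyapunov equation  B Sigma + Sigma B' = -sigma sigma'.
Sigma = solve_lyapunov(B, -Q)

# Compare with a Riemann-sum approximation of  int_0^infty e^{uB} sigma sigma' e^{uB'} du.
du, U = 1e-2, 20.0
grid = np.arange(0.0, U, du)
Sigma_quad = sum(expm(u * B) @ Q @ expm(u * B.T) * du for u in grid)

print(np.allclose(Sigma, Sigma_quad, atol=1e-2))  # True up to quadrature error
```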
Chapter 19
Stochastic Integrals with L²-Martingales
In this chapter the theory of stochastic integrals is extended to that with respect to square integrable martingales. This extension is enabled by the so-called Doob–Meyer decomposition of submartingales, Theorem 19.1, below. In the case of a discrete parameter submartingale {X_n : n ≥ 0} adapted to a filtration {F_n : n ≥ 0}, such a decomposition is simple to derive (the Doob decomposition) as follows. Let Z_n = E(X_n − X_{n−1} | F_{n−1}), n ≥ 1, and define A_n = Σ_{1≤m≤n} Z_m (n ≥ 1), A₀ = 0. Then

X_n = M_n + A_n (n ≥ 0);  M₀ = X₀,  M_n = X_n − A_n.   (19.1)
Here .An (n ≥ 0) is an increasing nonnegative process, and .Mn (n ≥ 0) is a martingale, both adapted to the filtration .{Fn : n ≥ 0}. Moreover, .An (n ≥ 0) is required to be predictable, i.e., .Fn−1 -measurable .(n ≥ 1). The continuous parameter case is more delicate, as we shall see below. Let .(Ω, F, P ) be a probability space, and .{Ft : t ≥ 0} a P -complete, right-continuous filtration .(Ft ⊂ F ∀t). Consider a right-continuous .{Ft }-adapted submartingale .{Xt : t ≥ 0}, with left limits. Recall that every stochastically continuous submartingale has such a version (Bhattacharya and Waymire (2021), Theorem 13.6). Throughout this chapter, unless otherwise specified, we will assume this set up.
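As a concrete illustration of the discrete Doob decomposition (19.1), the following sketch (the example is an illustrative choice, not from the text) computes A_n and M_n for the submartingale X_n = S_n², where S_n is a simple symmetric random walk; here Z_n = E(X_n − X_{n−1} | F_{n−1}) = 1, so A_n = n and M_n = S_n² − n is the familiar martingale.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000

steps = rng.choice([-1, 1], size=n)
S = np.concatenate([[0], np.cumsum(steps)])   # simple symmetric random walk, S_0 = 0
X = S**2                                      # submartingale X_n = S_n^2

# Doob decomposition: Z_n = E(X_n - X_{n-1} | F_{n-1}) = 1 for this example,
# since X_n - X_{n-1} = 2 S_{n-1} * step_n + 1 and E(step_n | F_{n-1}) = 0.
Z = np.ones(n)
A = np.concatenate([[0], np.cumsum(Z)])       # predictable increasing part, A_n = n
M = X - A                                     # martingale part, M_n = S_n^2 - n

print(np.allclose(X, M + A))                  # True: X_n = M_n + A_n exactly
```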
Definition 19.1 An {F_t}-adapted process {A_t : t ≥ 0} is said to be an integrable increasing process if (i) A₀ = 0 and t ↦ A_t is increasing with t, a.s., (ii) t ↦ A_t is a.s. right-continuous, and (iii) E(A_t) < ∞ for all t. Such a process is said to be natural if for every bounded right-continuous {F_t}-martingale {m_t}, one has

E ∫_{[0,t]} m_s dA_s = E ∫_{[0,t]} m_{s−} dA_s,   (19.2)

or, equivalently,

E(m_t A_t) = E ∫_{[0,t]} m_{s−} dA_s,  for all t ≥ 0.   (19.3)
In (19.3), the integration is in the usual sense of Lebesgue–Stieltjes integration (see BCPT¹ p. 228) with respect to the measure dA_s. To see the equivalence of the two conditions in (19.2) and (19.3), consider a partition Π_n : 0 = t₀ < t₁ < ··· < t_n = t of [0, t] with Δ_n := max{t_{i+1} − t_i : i = 0, . . . , n − 1} → 0 as n ↑ ∞. Then E(m_t A_t) = E(Σ_{0≤i≤n−1} m_t(A_{t_{i+1}} − A_{t_i})) = E(Σ_{0≤i≤n−1} m_{t_{i+1}}(A_{t_{i+1}} − A_{t_i})), using E(m_t | F_{t_{i+1}}) = m_{t_{i+1}} (i = 0, 1, . . . , n − 1). Letting n ↑ ∞, and using the right-continuity of m_s, 0 ≤ s ≤ t, the last sum converges to ∫_{[0,t]} m_s dA_s, establishing the equivalence.

Remark 19.1 If an integrable increasing process {A_t : t ≥ 0} is continuous, almost surely, then it is automatically natural. For on [0, t] there are at most countably many discontinuities of a right-continuous martingale m_s(ω), for each ω ∈ Ω, and with probability one, dA_s(ω), 0 ≤ s ≤ t, has no atom.

Definition 19.2 For each a > 0, denote by T_a the set of all {F_t}-stopping times τ ≤ a, a.s. A submartingale {X_t : t ≥ 0} is said to be of class DL if, for every a > 0, the family of random variables {X_τ : τ ∈ T_a} is uniformly integrable.
Remark 19.2 Note that every martingale, or nonnegative submartingale, {M_t : t ≥ 0} is of class DL. For, in view of the optional stopping theorem², if τ ≤ a, then

∫_{[|M_τ|≥λ]} |M_τ| dP ≤ ∫_{[|M_τ|≥λ]} |M_a| dP  and  P(|M_τ| ≥ λ) ≤ E|M_τ|/λ ≤ E|M_a|/λ.
Theorem 19.1 (Doob–Meyer Decomposition)³ Let {X_t : t ≥ 0} be a right-continuous submartingale of class DL. Then (a) {X_t} can be expressed as
1 Throughout, BCPT refers to Bhattacharya and Waymire (2016), A Basic Course in Probability Theory. 2 See BCPT, Theorem 3.8 and Remark 3.7. 3 The result is due to Meyer (1966). The method of proof is due to Rao (1969), as presented in Ikeda and Watanabe (1989).
X_t = M_t + A_t,

where M = {M_t : t ≥ 0} is a martingale and A = {A_t : t ≥ 0} is an integrable increasing process, which can be chosen to be natural to make this decomposition unique.

Proof Let us first prove the uniqueness of the decomposition, when A is natural. Suppose X_t = M_t + A_t = M′_t + A′_t are two decompositions. Consider a bounded right-continuous martingale {m_s : s ≥ 0}, with left limits. Fix t > 0, and consider partitions Π_n : 0 = t₀ < t₁ < ··· < t_n = t of [0, t] with Δ_n := max{t_{i+1} − t_i : i = 0, . . . , n − 1} ↓ 0 as n ↑ ∞. Then, using A_s − A′_s = M′_s − M_s and denoting the martingale M′_s − M_s = N_s,
.
[0,t]
ms− d(As − A s ) = lim
n→∞
= lim
n→∞
E(mti [(Ati+1 − A ti+1 ) − (Ati − A ti )])
0≤i≤n−1
E(mti (Nti+1 − Nti )) = 0,
(19.4)
0≤i≤n−1
since .E(Nti+1 |Fti ) = Nti . Hence we arrive at E
.
[0,t]
ms − d(As ) = E
[0,t]
ms − d(A s ).
Using the property that A and .A are natural (see (19.3)), one then has .E(mt At ) = E(mt A t ) for all bounded right-continuous martingales .{mt }. In particular, letting Y be any bounded random variable and letting .mt = E(Y |Ft ), one has E(Y At ) = E(mt At ) = E(mt A t ) = E(Y A t ),
.
for any bounded random variable Y . This implies .At = A t almost surely. Since this holds with probability one (simultaneously) on a countable dense subset of .[0, t] and A and .A are right-continuous, one gets .A = A a.s. and, therefore .M = M
a.s. . Next, let .Yt = Xt − E(Xa |Ft ), which is a non-positive submartingale on n n .[0, a], Ya = 0 a.s.. Consider the partitions .Πn : tj,n = j a/2 : j = 0, 1, .., 2 }(n = 1, 2, . . . ), of .[0, a]. Then, as in the Doob decomposition (19.1), (n)
Yt = At
.
+ m(n)t ,
(n)
m(n)t = Yt − At ,
(19.5)
where .m(n)t = E(m(n)a |Ft ), t ∈ Πn , (with .t = tk,n = ka/2n ), is a martingale. Here (n) .At ,n = [E(Ytj +1,n |Ftj,n ) − Ytj,n ] (k = 0, 1, . . . , 2n ), k 0≤j ≤k−1
19 Stochastic Integrals with .L2 -Martingales
332
is, for each n, an increasing process .A(n) . We will first show that {A(n) a : n = 1, 2, . . . } is uniformly integrable.
(19.6)
.
For this, let τ (n)λ = inf{tk−1,n : A(n) tk,n > λ},
.
k
if the infimum is attained, or .τ (n)λ = a, otherwise. Then .τ (n)λ ∈ Ta . Note (n) that, for .t ∈ Πn , Yt = −E(A(n) a |Ft ) + At , since .m(n)t = E(m(n)a |Ft ) = (n) (n) E(Ya − Aa |Ft ) = −E(Aa |Ft ). By the optional sampling theorem, .Yt∧τ (n)λ = (n) (n) −E(Aa |Ft∧τ (n)λ ) + At∧τ (n)λ .(t ∈ Πn ) is a submartingale. Also, letting .t = τ (n)λ , one has (n)
Yτ (n)λ = −E(A(n) a |Fτ (n)λ ) + Aτ (n)λ
.
≤ λ − E(A(n) a |Fτ (n)λ ),
(19.7)
(n)
(n)
noting that .Aτ (n)λ ≤ λ (by definition of .τ (n)λ ). Also, .[τ (n)λ < a] = [Aa Hence, taking expectations in (19.7), on the event .[τ (n)λ < a] =
[A(n) a
E(A(n) a 1[A(n) >λ] ) ≤ −E(Yτ (n)λ 1[A(n) >λ] ) + λP (τ (n)λ < a).
.
a
a
> λ].
> λ], to get (19.8)
One may use the equality in (19.7) with . λ2 in place of .λ and taking expectations on the event .[τ (n) λ < a] to get 2
.
(n)
− E(Yτ (n) λ 1[τ (n)λ/2 ε},
.
where the infimum over the empty set is .∞. Note that for each .n ≥ 1, A(n) a = E(Aa ∧ λ|Fa ) = Aa ∧ λ.
.
Hence .τn,ε = a implies (n)
At
.
− At ∧ λ ≤ ε ∀t ∈ [0, a].
Let .Θn (t) = tk+1,n if .t ∈ (tk,n , tk+1,n . Then .Θn (τn,ε ) is a stopping time, .τn,ε and Θn (τn,ε ) ∈ Ta ∀n. As .A(n) is decreasing with n, .τn,ε is increasing with n; also, t .Θn (t) ↓ t as .n ↑ ∞, .Θn (τn,ε ) ≥ τn,ε . Hence .
τε := lim τn,ε = lim Θn (τn,ε ),
.
n
n
(n)
as .n → ∞. Next, on .tk,n ≤ t < tk+1,n , the pair .{Aτn,ε := E(Atk,n ∧ (n) λ|Fτn,ε ), AΘn (τn,ε ) := E(Atk,n ∧ λ|FΘn (τn,ε ) ) = AΘn (τn,ε ) ∧ λ)} is a martingale. Therefore, E(1[tk−1,n ≤τn,ε ε) → 0 as .n → ∞. This convergence in probability implies that there exists a subsequence .{n(j ) : j = 1, 2, . . . } such that (n(j )) .supt∈[0,a] |At − At ∧ λ| → 0 almost surely as .j → ∞. Applying this to (19.16), one obtains .E (As − ∧ λ)dAs = E (As ∧ λ)dAs ∀ t ∈ [0, a], (19.20) [0,t]
[0,t]
which leads to E{
.
[0,t]
(As ∧ λ − As − ∧ λ)dAs } = 0.
(19.21)
Since the integrand is nonnegative and is positive only at points of discontinuity of As , which is at most a countable set, .dAs assigns zero probability at possible points of discontinuity. Thus .s → As ∧ λ is continuous almost surely. This being true for all .λ, the proof is complete that .s → As is almost surely continuous.
.
We next turn to the definition of integrals with respect to square integrable martingales, which is similar to those with Brownian motion, replacing independent increments by increments orthogonal to the past. Continue to assume that the underlying probability space .(Ω, F, P ) is P -complete and the .σ -fields of the filtration .{Ft } are also all P -complete. In addition, assume that .Ft is right-continuous, i.e., .Ft = Ft + ≡ ∩s>t Fs for every t. Note that every right-continuous submartingale/martingale with respect to a filtration .{Gt } is a submartingale/martingale with respect to the filtration .{Gt + } (Exercise 7). We will then assume that all submartingales/martingales are right-continuous with left limit. The following definitions are preparatory for the definition of stochastic integrals with respect .L2 martingales. Definition 19.5 We will denote by .M2 the class of square integrable .{Ft : t ≥ 0} martingales .{Mt : t ≥ 0}, right-continuous with left limits. The subclass of continuous square integrable martingales will be denoted by .Mc2 . If the process .{Mt : t ≥ 0} is such that there exists a sequence of .{Ft }-stopping times .τn ↑ ∞ almost surely as .n ↑ ∞, .{Mt ∧ τn : t ≥ 0} is a square integrable .{Ft }-martingale .(n = 1, 2, . . . ), then .{Mt : t ≥ 0} is said to be a .{Ft }-local martingale. The class of such local martingales will be denoted by .M2,l ; the subclass of continuous square c . integrable .{Ft }-local martingales will be denoted by .M2,l
Definition 19.6 A real-valued stochastic process .{f (t) : t ≥ 0} is said to be a nonanticipative step functional (with respect to .{Ft }) if there exist a sequence .0 = t0 < t1 < · · · < tm < · · · , .tn ↑ ∞ and a sequence of .Ftj -measurable random variables .fj (j = 0, 1, 2, . . . ), uniformly bounded by a constant, such that f (t) = fj −1 for tj −1 ≤ t < tj (j = 1, 2, . . . ).
.
(19.22)
The class of such functionals is denoted by .L0 . Let .L2 be the class of all .{Ft }-non-anticipative progressively measurable processes .f (·) such that ||f ||22,T := E
.
[0,T ]
f 2 (s)ds < ∞
∀ T > 0.
A norm .|| · ||2 on .L2 is defined by ||f ||22 =
.
2−k (||f ||2k 2 ∧ 1).
(19.23)
1≤k 0. (a) Then
E(
.
(Mtj,n − Mtj −1,n )2 − Ma )2 → 0
1≤j ≤2n
as .n → ∞. In particular, .
(Mtj,n − Mtj −1,n )2 → Ma
1≤j ≤2n
in probability. (b) . 1≤j ≤2n |Xtj,n − Xtj −1,n | → ∞ almost surely as .n → ∞.
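The convergence in part (a) is easy to visualize for Brownian motion, for which ⟨M⟩_t = t. The sketch below (an illustration, not part of the text) simulates a Brownian path on [0, a] and shows that the dyadic sums of squared increments approach a, while the sums of absolute increments blow up, as in part (b).

```python
import numpy as np

rng = np.random.default_rng(2)
a = 1.0
N = 2**18                                    # fine grid on [0, a]
dB = rng.normal(0.0, np.sqrt(a / N), size=N)
B = np.concatenate([[0.0], np.cumsum(dB)])   # Brownian path sampled on the grid

for n in [4, 8, 12, 16]:                     # dyadic partitions with 2^n intervals
    idx = np.arange(0, N + 1, N // 2**n)
    incr = np.diff(B[idx])
    print(n,
          np.sum(incr**2),                   # -> a   (quadratic variation)
          np.sum(np.abs(incr)))              # -> infinity (total variation)
```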
Proof 9 (a) First assume that M is bounded, .|Mt | ≤ K on .[0, a]. For .0 ≤ s < t ≤ t1 < t2 , one has E((Mt2 − Mt1 )2 |Ft ) = E((E(Mt2 − Mt1 )2 |Ft1 )|Ft )
.
= E((Mt22 − Mt21 )|Ft1 )|Ft ) = E(Mt22 − Mt21 )|Ft ), since .E(Mt2 Mt1 |Ft1 ) = Mt21 a.s. By the Doob Meyer-decomposition, this may be expressed as E((Mt2 − Mt1 )2 |Ft ) = E(Mt22 − Mt21 )|Ft )
.
= E(Mt2 − Mt1 |Ft ) a.s.
(19.25)
One then has E((Mt2 − Mt1 )2 |Ft ) = E( (Mtj,n − Mtj −1,n )2 − Ma )2
.
1≤j ≤2n
=E
[(Mtj,n − Mtj −1,n )2 − (Mtj,n − Mtj −1,n )]2
1≤j ≤2n
=
E[(Mtj,n − Mtj −1,n )2 − (Mtj,n − Mtj −1,n )]2 ,
(19.26)
1≤j ≤2n
since the expectation of the product terms vanish by (19.25), i.e., for .k > j , .
E[(Mtj,n − Mtj −1,n )2 − (Mtj,n − Mtj −1,n ] ×[(Mtk,n − Mtj −1,n )2 − (Mtk,n − Mtk−1,n )] = EE[(Mtj,n − Mtj −1,n )2 − (Mtj,n − Mtj −1,n ] ×[(Mtk,n − Mtj −1,n )2 − (Mtk,n − Mtk−1,n )|Ftj,n ],
and E([(Mtk,n − Mtj −1,n )2 − (Mtk,n − Mtk−1,n )]Ftj,n ) = 0 a.s.
.
Now the last expression of (19.26) is bounded above by .2( 1≤j ≤2n E(Mtj,n − Mtj −1,n )4 + 2E( 1≤j ≤2n (Mtj,n − Mtj −1,n )2 ). The expected value of the sum of the first term goes to zero as .n → ∞ (Exercise 4(b)). The expectation of the
9 This
proof follows Karatzas and Shreve (1991).
second sum is bounded by .Eδn Ma , where .δn = max{Mt − Ms : 0 ≤ s < t, t − s ≤ a2−n } ↓ 0 almost surely as .n ↑ ∞. By monotone convergence theorem, the expectation goes to zero. If M is not bounded on .[0, a], define the stopping time τK := inf{t ≥ 0 : |Mt | ≥ K or Mt ≥ K},
.
(K)
where K is a positive integer. Consider the martingale .Mt := Mt ∧τK . It is simple 2 to check that .M (K) is an .{Ft }-martingale (Exercise 6). Note that .Mt∧τ − Mt∧τK K is a bounded martingale. By the uniqueness of the Doob–Meyer decomposition, it follows that .M (K) t = Mt∧τK . Hence for partitions .Πn , one has .
lim E[
n→∞
1≤j ≤2n
− Mt(K) Mt(K) j,n j −1,n
2
− Ma∧τK ]2 = 0,
appealing to the result for the bounded case. Because .τK ↑ ∞ almost surely as K ↑ ∞ and, especially, .P (τK < a) → 0 as .K ↑ ∞, one obtains the desired convergence in (a). (b) This follows from (a) (see Exercise 3(c)).
.
Remark 19.5 The relation M, N = (M 2 + N 2 + 2MN)/4 − (M 2 + N 2 − 2MN)/4
.
implies that .M, N may be evaluated as the bounded quadratic variation part of MN expressed uniquely as a function of bounded variation starting at zero, plus a martingale, easing the computation perhaps. Remark 19.6 For the proof of part (a), it is not necessary to take any special sequence of partitions, as long as the mesh size goes to zero. For the proof of the almost sure result in part (b), any sequence of partitions which become finer with n suffices, since that ensures an increasing sequence of variations, because of the triangle inequality of real numbers. We now turn to two results which are sometimes useful in different contexts. Proposition 19.6 (Itô’s Formula for Square Integrable Martingales) Suppose M ∈ Mc2 and .f ∈ C 2 (R). Then
.
f (Mt ) − f (M0 ) =
.
[0,t]
f (Ms )dMs + (1/2)
[0,t]
f
(Ms )dMs .
Proof The proof is basically the same as for Itô’s Lemma for stochastic integrals with respect to Brownian motion .Bs with the quadratic variation dt replaced by .dMt = dAt , (Exercise 8).
Proposition 19.7 (Maximal Inequality and Quadratic Variation) Let .M ∈ Mc2 ,and .Mt∗ = max{|Ms | : 0 ≤ s ≤ t}, .M0 = 0. Then for .p ≥ 1, one has ∗2p
EMt
.
p
≤ C(p)EM, Mt ,
where .C(p) is a constant depending on p. Proof We will use Doob’s maximal inequality for moments, namely, ∗2p
EMt
.
≤ (2p/(2p − 1))2p E|Mt |2p ,
p > 1/2.
(19.27)
For .p = 1, this yields EMt∗2 ≤ 4E|Mt |2 = 4E(M, Mt ).
.
For .p > 1, apply Proposition 19.6 to .f (x) = |x|2p to get |Mt |2p =
.
[0,t]
2p|Ms |2p−1 sgn(Ms )dMs + p(2p − 1)
[0,t]
|Ms |2p−2 dMs .
Take expectations to get ∗2p 1−1/p
E|Mt |2p ≤ p(2p − 1)E|Mt∗ |2p−2 Mt ) ≤ p(2p − 1)(EMt
.
Now apply (19.27).
)
(EM p t )1/p .
Exercises 1. Let (S, S) be a measurable space and μ a finitely additive measure on the σ -field S. Suppose that for all An , A ∈ S, An ↓ ∅, μ(An ) ↓ 0. Show that μ is countably additive. 2. Let B = {(B1 (t), B2 (t), B3 (t))} be a standard Brownian motion in R3 , and define Mt = R1t , t ≥ 1, where {Rt = |B|t } is the Bessel process (Chapter 13, p Example 6). Show that supt≥1 EMt < ∞ for p < 3, and that {Mt : t ≥ 1} is a local martingale, but not a martingale. [Hint: Rt2 is the sum of three i.i.d. Gamma (chi-square) distributed random variables.] 3. Let {Xt : t ≥ 0} be an almost surely continuous real-valued process, Πn : tj,n = j a/2n , j = 0, 1, .., 2n (n = 1, 2, . . . ), be partitions of [0, a], a > 0. Assume that, for some p > 0, 1≤j ≤2n |Xtj,n − Xtj −1,n |p → Z in probability, where Z takes values in [0, ∞). Prove that (a) if q > p, then 1≤j ≤2n |Xtj,n −Xtj −1,n |q → q−p : 0 ≤ s < t ≤ a, t − s ≤ 0 in probability. [Hint: Let δn := max{|X t − Xs | −n a2 }. Then the sum is less than δn 1≤j ≤2n |Xtj,n − Xtj −1,n |p .] (b) Prove that
Exercises
if 0 < q < p, then 1≤j ≤2n |Xtj,n − Xtj −1,n |q → ∞ in probability on [Z > 0]. [Hint: Interchange p and q in the above argument, and then divide by δn .] (c) Suppose p > 1. Prove that the variation defined by limn 1≤j ≤2n |Xtj,n − Xtj −1,n | = ∞ almost surely on [Z > 0]. [Hint: The sum is monotone increasing in n.] 4. Let Πn : tj,n = j a/2n , j = 0, 1, .., 2n (n = 1, 2, . . . ) be partitions of [0, a], and let M ∈ Mc2 , M0 = 0. (a) Assume |Mt | ≤ N for N > 0. Prove that E[ 1≤j ≤2n (Mtj,n − Mtj −1,n )|2 ] ≤ 6N 4 . [Hint: E[ 1≤j ≤2n (Mtj,n − Mtj −1,n )2 |Ftj,n ) = E[ 1≤j ≤2n (Mt2j,n − Mt2j −1,n )|Ftk,n ] ≤ E[Xa2 ]|Ftk,n ] ≤ N 2 . Hence, E[ 1≤j ≤2n J +1≤k≤2n (Mtj,n − Mtj −1,n )2 (Mtk,n − Mtk−1,n )2 = E[ 1≤j ≤2n (Mtj,n − Mt j −1,n )2 E( J +1≤k≤2n (Mtk,n − Mtk−1 ,n )2 |Ftj,n )] ≤ N 2 N 2 = N 4 . Now use .E[ (Mtj,n − Mtj −1,n )4 ≤ 4N 2 E[ (Mtj,n − Mtj −1,n )2 ] ≤ 4N 4 . 1≤j ≤2n
1≤j ≤2n
Finally, one has .
E(
(Mtj,n − Mtj −1,n )2 )2
1≤j ≤2n
=E
1≤j ≤2n
|Mtj,n − Mtj −1,n |4 + 2E[
1≤j ≤2n
J +1≤k≤2n
(Mtj,n − Mtj −1,n )2
×(Mtk,n − Mtk−1,n )2 ≤ 6N 4 .
5. 6. 7.
8. 9.
(b) Under the hypothesis in (a), show that 1≤j ≤2n E(Mtj,n − Mtj −1,n )4 → 0 as n → ∞. [Hint: The expectation is bounded by E(δn 1≤j ≤2n (Mtj,n −Mtj −1,n )2 ), where δn = max{(Mt − Ms )2 : 0 ≤ s < t, t − s ≤ a2−n , t ∈ [0, a]}. Hence the expectation goes to 0, noting that δn ≤ 4N 2 , δn → 0 as n → ∞, by continuity of M, and using (a).] Prove Theorem 19.4 following the method of proofs of Proposition 6.2 and Theorem 6.5 in Chapter 6. Show that M (K) = Mt∧τK is an {Ft }-martingale. Suppose that {Xt : t ≥ 0} is a right-continuous martingale/submartingale with respect to a filtration{Ft }. Show that {Xt : t ≥ 0} is a martingale/submartingale with respect to {Ft + }. Write out a proof of Proposition 19.6 following the proof for stochastic integrals with Brownian motion {Bs } with Ms replaced by Bs . [Hint: Use Theorem 19.5.] Complete the proof of the statement in Remark 19.3.
Chapter 20
Local Time for Brownian Motion
Lévy’s local time for Brownian motion may be informally viewed as a (random) ‘density’ .(t, x) for the random measure of time during an interval of time .[0, t] at which the Brownian motion occupies an infinitesimally small spatial neighborhood of the point x. An interesting formula due to Skorokhod is used to obtain a formula for the local time at zero. A refined formula due to Tanaka is also included.
Brownian paths are continuous but otherwise quite irregular and chaotic, as is indicated by the fact that when a path hits a point x, it visits every neighborhood of x, however small, infinitely many times, before going off somewhere else. Recall also that the motion is instantaneous, i.e., there is no interval of constancy. This makes Paul Lévy’s local time of Brownian motion appear magical! Outside a set of probability zero, a path .ω has an occupation time density (with respect to Lebesgue measure), or local time, of the amount of time in .[0, t] that it spends in a set. As we shall indicate in Chapter 21, this local time may be used to construct all of Feller’s one-dimensional (instantaneous) diffusions on .R. Theorem 20.1 (Lévy (1948)) Let .{Bt : t ≥ 0} be a standard Brownian motion defined on a probability space .(Ω, F , P ), starting at some point. Outside a P -null set, there exists a function .(t, x) ≥ 0, continuous in .(t, x) and increasing with t, such that .2 (t, x)dx = 1[Bs ∈A] ds ∀ Borel sets A ⊂ R, t ≥ 0. (20.1) A
[0,t]
Definition 20.1 The function .(t, x) in Theorem 20.1 is called the local time of Brownian motion, and (20.1) is referred to as the occupation time formula. Remark 20.1 The factor 2 on the left side of (20.1) seems anomalous, but it is harmless and commonly used (because Brownian motion is generated by half times the Laplacian and this half factor naturally appears in Itô’s Lemma). Some heuristics are useful for motivating the proof of Theorem 20.1. One may informally think of .2(t, x) as a smooth version of . [0,t] δx (Bs )ds, where .δx (·) is the Dirac delta function at the point x. If one integrates over x, one then has . A (t, x)dx approximately given by .
1 2
[ A
[0,t]
δx (Bs )ds]dx =
1 2
[0,t]
1[Bs ∈A] ds.
Recalling Itô’s Lemma, one then looks for a smooth (twice differentiable) function f such that .f (z) is approximately the “density” of .δx (z). Such a nice density is .gn (z; x) which is a symmetric p.d.f around x vanishing outside .[x − 1/n, x + 1/n]. Then the corresponding .f = fn is, given by
fn (x) =
gn (y; x)dydz.
.
(−∞,x] (−∞,z]
By Itô’s Lemma,
1 .fn (Bt ) − fn (B0 ) = f (Bs )dBs + 2 [0,t]
[0,t]
fn (Bs )ds.
(20.2)
Thus local time . exists if .1/2 [0,t] fn (Bs )ds ≡ 1/2 [0,t] gn (Bs ; x)ds converges, as .n → ∞, to a function satisfying the conditions specified in the theorem. Given the specification of .gn , one has ⎧ ⎪ ⎪ ⎨1 .fn (y) → 1/2 ⎪ ⎪ ⎩0
if y > x if y = x
(20.3)
if y < x,
and .fn (z) → (z − x)+ as .n → ∞. From (20.2), (20.3), given the appropriate convergence, .(t, x) should be given by +
+
(t, x) = (Bt − x) − (B0 − x) −
.
[0,t]
1(x,∞) (Bs )dBs .
(20.4)
Proof (of Theorem 20.1) Having1 arrived to (20.4) by heuristics, we will actually show the random function (20.4) indeed satisfies the requirements of the theorem. To prove this consider, for an arbitrary .T > 0, the random function .Xx := {Xt : 0 ≤ t ≤ T }, where .Xt is given by the stochastic integral in (20.4). Then .x → Xx is a random function on .R into the metric space .C([0, T ] : R) of real-valued continuous functions f on .[0, T ] endowed with the sup norm: ||f || = max{|f (s)| : 0 ≤ s ≤ T }.
.
Then, by Lemma 1 below, E||Xx − Xz ||4 ≤ c(T )|x − z|2 .
.
(20.5)
for some constant depending only on T . By the Kolmogorov–Chentsov Theorem (see BCPT2 Theorem 10.2, p. 180), it follows that .x → Xx is almost surely continuous in the sup norm. Thus the random function in (20.4) is continuous on .[0, ∞) × R, almost surely. To prove (20.1), let f be a continuous function on .R with compact support, and
(y − x)+ f (x)dx
F (y) =
.
(−∞,∞)
(y − x)f (x)dx =
= (−∞,y]
[0,∞)
zf (y − z)dz.
Then, integrating by parts,
F (y) =
.
[0,∞)
zf (y − z)dz
=
f (u)du, (−∞,y]
and F (y) = f (y).
.
By Itô’s Lemma, F (Bt ) − F (B0 ) −
.
1 Ikeda
[0,t]
F (Bs )dBs = 1/2
[0,t]
f (Bs )ds.
(20.6)
and Watanabe (1989), pp. 113–116. BCPT refers to Bhattacharya and Waymire (2016), A Basic Course in Probability Theory.
2 Throughout,
348
20 Local Time for Brownian Motion
The integral on the left equals, by a Fubini type interchange of order (See Lemma 2), .
[0,t]
F (Bs )dBs =
[0,t]
(
f (u)1(u,∞) (Bs )du)dBs . (−∞,∞)
Combining this with the remaining term on the left side of (20.6), one gets .
F (Bt ) − F (B0 ) −
[0,t]
F (Bs )dBs
(f (u){(Bt − u)+ − (B0 − u)+ } −
=
[0,t]
(−∞,∞)
= 1/2
[0,t]
1[u,∞) (Bs )dBs )du
f (Bs )ds.
In particular, the expression in (20.4) is indeed the local time .(t, x). Lemma 1 Let .Xx (t) = [0,t] 1(x,∞) (Bs )dBs , where .{Bs : s ≥ 0} is a standard Brownian motion. Then E||Xx − Xz || ≡ E sup |Xx (t) − Xz (t)| ≤ c(T )|x − z|2 ,
.
0≤t≤T
where .c(T ) is a constant depending only on T . Proof Use the martingale moment inequality with quadratic variation (Proposition 19.7) for the martingale Mt ≡ Xx (t) − Xz (t) =
.
[0,t]
1(x,z] (Bs )dBs , 0 ≤ t ≤ T ,
(for .x > z, say), to obtain E||Xx − Xz ||4 ≤ c(2)EM 2T ,
.
(20.7)
where .M T is the quadratic variation of M on .[0, T ]: M T =
.
[0,T ]
1(x,z] (Bs )ds.
EM 2T = E{
.
=2
[0,T ]
[0,T ]
(20.8)
(
1(x,z] (Bs )ds)( [s,T ]
[0,T ]
1(x,z] (Bs )ds)}
E1(x,z] (Bs )1(x,z] (Bt )dt)ds.
(20.9)
Now for .s < t, the conditional distribution of .Bt , given .Bs , is normally distributed with mean .Bs and variance .t − s. Hence the conditional probability that .Bt ∈ (z, x], given .Bs , is √ √ √ Φ((x − Bs )/ t − s) − Φ((z − Bs )/ t − s) ≤ c|x − z|/ t − s,
.
for some absolute constant c. Also, if .B0 = x0 , √ P (z < Bs ≤ x)P (z − x0 ≤ Bs − x0 ≤ x − x0 ) ≤ c|x − z|/ s.
.
Hence EM 2T ≤ 2
.
[0,T ]
(
[s,T ]
= 2c2 |x − z|2 = 4c |x − z| 2
(t − s)−1/2 dt)s −1/2 ds
[0,T ]
2 [0,T ]
2(T − s)1/2 s −1/2 ds (T /s − 1)1/2 ds
= 8c2 T |x − z|2 = d(T )|x − z|2 ,
(20.10)
say (Exercise 1).
Lemma 2 Let f be a continuous function on .R with compact support and .{Bs } a standard Brownian motion. Then,
.
[0,t]
( (−∞,∞)
f (u)1(u,∞) (Bs )du)dBs =
f (u)( (−∞,∞)
[0,t]
1u,∞) (Bs )dBs )du.
(20.11)
Proof Let the support of f be .[a, b]. Define the partitions .Πn : tj,n = a + j (b − a)/2n ; j = 0, 1, . . . , 2n (n = 1, 2, . . . ), of .[a, b], .n = 1, 2, . . . . As shown in Theorem 20.1 and Lemma 1, .u → [0,t] 1u,∞) (Bs )dBs is continuous almost surely, uniformly for all t in a bounded interval. Hence the right side of (20.11) is the limit of −n . (b − a)2 f (tj,n )( 1(tj,n ,∞) (Bs )dBs ) = Fn (Bs )dBs , [0,t]
0≤j ≤2n−1
[0,t]
where Fn (y) :=
.
0≤j ≤2n−1
(b − a)2−n f (tj,n )1(tj,n ,∞) (y).
The sequence of Riemann sums .Fn is bounded and converges uniformly to F (y) = [0,t] f (u)1(u,∞) (y)du. This implies that the sequence of stochastic integrals . [0,t] Fn (Bs )dBs converges almost surely and in .L1 to . [0,t] F (Bs )dBs , namely, the left side of (20.11). Hence the two sides of (20.11) are equal almost surely.
.
In (20.4), replacing .{Bs } by .{−Bs }, one obtains for the case .x = 0, (20.4) changes ˜ 0), say, given by to the corresponding expression for the local time at zero, .(t, (−Bt )+ +
.
[0,t]
˜ 0). 1(−∞,0] (Bs )dBs = (t,
(20.12)
Since the time during .[0, t] that .{Bs : s ≥ 0} spends at zero is the same as that spent by .{−Bs :≥ 0}, ˜ 0) (t, 0) = (t,
.
almost surely,
so that by adding the two expressions (20.4) and (20.12), one arrives at Corollary 20.2 (Tanaka’s Formula) Let .B = {Bt : t ≥ 0} be a standard Brownian motion starting at zero. Then one has 2(t, 0) = |Bt | −
.
[0,t]
sgn(Bs )dBs
almost surely,
or, equivalently, changing B to .−B, 2(t, 0) = |Bt | +
.
[0,t]
sgn(Bs )dBs
almost surely.
(20.13)
Remark 20.2 Convexity of the functions .x → |x|, .x → x + plays an essential role in the Tanaka (1963) formula. Both the local time theory and Tanaka theorem for Brownian motion presented here have a natural extension to semimartingales3 and differences of convex functions, respectively, e.g., see Revuz and Yor (1999), Chapter VI. We conclude this chapter with another amazing representation by Lévy of the local time .(t) = (t, 0) at zero. For the proof of Lévy’s result, we will use an important lemma due to Skorokhod.4 It provides a unique representation of a continuous function f as the difference between a continuous non-negative .gz , say, starting at an arbitrarily given .z ≥ 0, and a nondecreasing continuous function k which only grows on the set of zeros of .gz . 3 Also
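A quick numerical illustration of Tanaka's formula (20.13) and of the occupation-time interpretation of ℓ(t, 0): the sketch below (illustrative only, with an Euler discretization of the stochastic integral) compares |B_t| − ∫₀^t sgn(B_s) dB_s with the ε-neighborhood occupation estimate; both approximate 2ℓ(t, 0).

```python
import numpy as np

rng = np.random.default_rng(3)
t_final, N = 1.0, 2_000_000
dt = t_final / N
dB = rng.normal(0.0, np.sqrt(dt), size=N)
B = np.concatenate([[0.0], np.cumsum(dB)])

# Tanaka's formula: 2*l(t,0) = |B_t| - int_0^t sgn(B_s) dB_s (Ito sum, left endpoints)
tanaka = np.abs(B[-1]) - np.sum(np.sign(B[:-1]) * dB)

# Occupation-time estimate: 2*l(t,0) ~ Leb{s <= t : |B_s| < eps} / (2*eps)
eps = 0.01
occupation = np.sum(np.abs(B[:-1]) < eps) * dt / (2 * eps)

print(tanaka, occupation)   # the two numbers agree up to discretization error
```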
see Appuhamillage et al. (2014), Ramirez et al. (2016) for consequences of continuity of local time for physics. 4 Skorokhod (1961, 1962).
Lemma 3 (Skorokhod’s Equation) Let5 .f (t) be a continuous real-valued function on .[0, ∞), f (0) = 0, and let .z ≥ 0 be given. Then there exists a unique continuous function .k(t) on .[0, ∞) such that (i) k is nondecreasing, (ii) .k(0) = 0, and (iii) .k(·) grows only on the zeros of the function .g(t) := z + k(t) + f (t), i.e., k is flat off .{t : g(t) = 0}. The function k is given by k(t) = max[0, max{−f (s) − z : 0 ≤ s ≤ t}], t ≥ 0.
.
(20.14)
Existence 6 We will check that .k(·) in (20.12) satisfies the desired criteria (i)–(iii). Criteria (i) and (ii) are obviously met. For (iii), first note that if .t1 < t2 , then k(t2 ) = max[k(t1 ), max{−f (s) − z : t1 ≤ s ≤ t2 ].
.
Consider now, for any .ε > 0, the open set .{s : g(s) > ε}, which is a countable union of disjoint open intervals. If .(t1 , t2 ) is one of these intervals, then k(t2 ) = max[k(t1 ), max{−f (s) − z : t1 ≤ s ≤ t2 ].
.
But k(ti ) + f (ti ) + z = g(ti ) = ε(i = 1, 2),
.
and .
max{−f (s) − z : t1 ≤ s ≤ t2 } = max{−g(s) + k(s) : t1 ≤ s ≤ t2 } ≤ −ε + k(t2 ).
But this says .k(t2 ) ≤ max[k(t1 ), k(t2 ) − ε], which cannot be true unless .k(t2 ) = k(t1 ). Hence .k(·) is flat on .{t : g(t) > ε}. This being true for every .ε > 0, (iii) is verified. (Uniqueness). To prove uniqueness, suppose .g˜ and .k˜ also satisfy the conditions ˜ Suppose there is a point .T > 0 such that .g(T ) > g(T ˜ ). Let above, but .g = g. t0 := sup{0 ≤ s < T , g(s) = g(s)}. ˜
.
˜ Note that one must have .g(s) > g(s) ˜ on .(t0 , T ], implying .k(s) > k(s) on .(t0 , T ]. Since .g(s) ˜ ≥ 0, one must have .g(s) > 0 on .(t0 , T ]. Hence .k(s) is flat on .(t0 , T ], so that .k(t0 ) = k(T ) and .g(t0 ) = g(T ). Since ˜ g(t) − g(t) ˜ = k(t) − k(t) ∀ t,
.
5 Ikeda
and Watanabe (1989), pp. 113–116; Karatzas and Shreve (1991), p. 233. and Shreve (1991), pp. 210–212.
6 Karatzas
this leads to the contradiction ˜ ) = k(t0 ) − k(T ˜ ) ≤ k(t0 ) − k(t ˜ 0 ) = 0. 0 < g(T ) − g(T ˜ ) = k(T ) − k(T
.
˜ for all t. Hence .g = g˜ on .[0, ∞) and, therefore, also .k(t) = k(t)
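Skorokhod's equation gives an explicit reflection map, which is easy to apply path by path. The following sketch (illustrative only) reflects a discretized Brownian path at 0 using k(t) = max(0, max_{s≤t}(−f(s) − z)) from (20.14) and checks that g = z + k + f stays nonnegative and that k increases only where g vanishes.

```python
import numpy as np

rng = np.random.default_rng(4)
N, dt = 100_000, 1e-4
f = np.concatenate([[0.0], np.cumsum(rng.normal(0.0, np.sqrt(dt), size=N))])  # f(0) = 0
z = 0.5

# Skorokhod reflection map: k(t) = max(0, max_{s<=t}(-f(s) - z)),  g = z + k + f.
k = np.maximum.accumulate(np.maximum(0.0, -f - z))
g = z + k + f

print(g.min() >= 0.0)                              # g is nonnegative
dk = np.diff(k)
print(np.allclose(g[1:][dk > 0], 0.0, atol=1e-12)) # k grows only at zeros of g
```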
(1961–1962)7
Remark 20.3 Skorokhod used his famous equation to construct reflecting diffusions. The method extends to multidimensional diffusions as well. We will use it to derive a theorem due to Lévy. A different proof, using Lévy’s representation of processes with independent increments, specialized to passage times .τa of a standard Brownian motion to .a ≥ 0, may be found in Itô and McKean (1965), pp. 42–48. Theorem 20.3 (Lévy) Let8 .X = {X(t); t ≥ 0} be a standard Brownian motion starting at zero, .Y (t) = max{X(s) : 0 ≤ s ≤ t}, and .(t) the local time of .{X(t)} at zero. Then .{(|X(t)|, 2(t)) : t ≥ 0} has the same distribution as .{(Y (t) − X(t), Y (t)) : t ≥ 0}. In particular, .Y (t) − X(t) has the same distribution as .|X(t)|, and .2(t) has the same distribution as .Y (t). Proof We will check that Tanaka’s formula (20.13) provides the Skorokhod representation needed, with .z = 0, f (s) = −Ws = −2(s, 0) + |Bs |, where .{Ws : s ≥ 0} is the standard Brownian motion starting at 0, as given by the stochastic integral in (20.13). That is, g(t) = |Bt | = k(t) + f (t),
.
with .k(t) = 2(t). Note that .k(0) = 0, k(t) increases with t, and it grows only on the set of zeros of .g(t) = |Bt |. By the uniqueness of the Skorokhod representation, k(t) = max{Ws : 0 ≤ s ≤ t} = Mt ,
.
say, .g(t) = Mt − Wt , and .(g(t), k(t)) ≡ (|Bt |, 2(t)) has the same distribution as (Mt − Wt , Mt ). Now apply this distributional result to the process .{X(s) : s ≥ 0} for .Ws = X(s).
.
Exercises 1. Show that the constant c in the expression d(T ) (at the end of the proof of Lemma 1) may be taken to be (2π )−1/2 . Using this and the known expression of c(2) (see Proposition 19.7 in Chapter 19 on stochastic integration with respect
7 See 8 See
Skorokhod (1961, 1962). Lévy (1948).
Exercises
to martingales), find a numerical upper bound for c(T ), apart from the constant multiple T . 2. (Occupation Time Formula) Show that (20.1) may be reformulated as t g(B )ds = 2 R g(x)(t, x)dx a.s., for bounded, measurable functions g. s 0 [Hint: Use simple function approximations to g.] 3. Show that, with probability one, (t, 0) = limε↓0 |{0≤s≤t:B4εs ∈(−ε,ε)}| , where |A| denotes the Lebesgue measure of a measurable set A, and {Bs : s ≥ 0} is a standard Brownian motion starting at zero. 4. Use Tanaka’s formula to show that the inverse function of t → (t, 0) is an increasing stable process, i.e., a stable subordinator. [Hint: See Example 3 in Chapter 5.] t . [Hint: By the reflection principle P (Y (t) > y) = 5. Show that E(t, 0) = 2π 2P (Bt > y), y ≥ 0. This formula may vary in the literature in accordance with Remark 20.1 on the factor of ‘2’.]
Chapter 21
Construction of One-Dimensional Diffusions by Semigroups
This chapter contains a treatment of Feller’s seminal contribution to a comprehensive theory of one-dimensional diffusions.
In a series of classic papers between 1952 and 1960, William Feller essentially showed that all regular Markov processes on an interval state space with continuous sample paths are generated (in the sense of semigroup theory) by generalized d d+ second-order differential operators1 of the form .Dm Ds+ = dm ds , with appropriate boundary conditions; see Remark 21.1 below. Here, s is a strictly increasing con+ (y)−f (x) tinuous function on the interval, .Ds+ f (x) ≡ dds f (x) = limy→x + fs(y)−s(x) denotes a right-sided derivative, and m is a strictly increasing right continuous function. A comprehensive account of this theory along with a wealth of additional facts appears in the book by Itô and McKean (1965). A somewhat simpler account than theirs may be found in the books of Mandl (1968) and Freedman (1971). Mandl (1968) gives Feller’s construction of semigroups generated by .Dm Ds+ , while Freedman’s (1971) chapter on diffusions provides the probabilistic counterpart, demonstrating that all regular diffusions are necessarily of this form. This chapter contains another treatment of Feller’s seminal contribution to the theory of diffusions. Here the term “regular diffusion” is in the sense of Freedman (1971). That is, Definition 21.1 A regular diffusion is a Markov process having continuous sample paths, which has positive probabilities of moving to the left and to the right
1 Especially
see Feller (1957) in this regard.
instantaneously, starting from any point in the interior. “Instantaneously” means in any time interval, however small. To be on familiar ground, first consider classical second-order differential operators L=
.
d2 1 d a(x) 2 + b(x) , 2 dx dx
(21.1)
where .a(·) and .b(·) are Borel measurable functions on an interval .(r0 , r1 ), bounded on compact intervals; .a(·) is positive on .(r0 , r1 ) and bounded away from zero on compact intervals. Assume that zero lies in the interior of the interval .(r0 , r1 ), the state space (This may be achieved by a simple translation). Remark 21.1 Much of Feller’s seminal work was motivated by an interest in mathematical biology, especially genetics.2 These and a number of other of other interesting examples appear in selected examples and exercises at the end of this chapter. Write x y dr}dy, s(x) = 0 exp{− 0 2b(r) ya(r) x 2 . 2b(r) m(x) = 0 a(y) exp{ 0 a(r) dr}dy.
(21.2)
Then .
ds(x) = exp{− dx
x 0
2b(r) dr}, a(r)
dm(x) 2 = exp{ dx a(x)
x
0
2b(r) dr} a(r)
(21.3)
and df (x) df (x) d + s = Dm / ds(x) dx dx x 2b(r) = Dm [f (x) exp{ dr}] 0 a(r) x dm(x) d 2b(r) [f (x) exp{ dr}]/ = dx dx 0 a(r)
Dm Ds+ f (x) = Dm
.
=
1 a(x)f (x) + b(x)f (x) = Lf (x), 2
for all twice-differentiable functions f on I .
2 See
the survey article by Peskir (2015) for a penetrating overview.
x ∈ (r0 , r1 ), (21.4)
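The identification Lf = ½a f″ + b f′ = D_m D_s⁺ f via the scale function s and speed function m of (21.2) is straightforward to check numerically. The sketch below is illustrative only; the Ornstein–Uhlenbeck choice a(x) ≡ 1, b(x) = −x and the test function are arbitrary, not prescribed by the text.

```python
import numpy as np

# Ornstein-Uhlenbeck coefficients, chosen only for illustration.
a = lambda x: np.ones_like(x)
b = lambda x: -x

x = np.linspace(-2.0, 2.0, 4001)
h = x[1] - x[0]

# s'(x) = exp(-int_0^x 2b/a dr),  m'(x) = (2/a(x)) exp(int_0^x 2b/a dr)  (cf. (21.2)-(21.3))
I = np.cumsum(2 * b(x) / a(x)) * h
I -= I[np.argmin(np.abs(x))]               # normalize so the integral vanishes at 0
s_prime = np.exp(-I)
m_prime = (2.0 / a(x)) * np.exp(I)

f = np.sin(x)                              # test function
fp = np.gradient(f, x)                     # f'
df_ds = fp / s_prime                       # D_s f = f'/s'
DmDsf = np.gradient(df_ds, x) / m_prime    # D_m D_s f = (d/dx D_s f)/m'

Lf = 0.5 * a(x) * (-np.sin(x)) + b(x) * np.cos(x)       # (1/2) a f'' + b f'
print(np.max(np.abs(DmDsf[100:-100] - Lf[100:-100])))   # small discretization error
```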
Writing, .I = [r0 , r1 ], where .−∞ ≤ r0 < r1 ≤ ∞, the interval .[r0 , r1 ] represents the two-point compactification of .(r0 , r1 ) obtained by identifying .(r0 , r1 ) with a finite interval .(a, b) by a homeomorphism and then extending this homeomorphism to .[r0 , r1 ] and .[a, b], .r0 being the image of a, and .r1 being the image of b. We then write .C(I ) for the set of all real-valued continuous functions on this compact space I . One may, in case .r0 or .r1 is infinite, identify .C(I ) with the set of all real-valued continuous functions on I having finite limits at both end points. .C(I ) is a Banach space under “sup” norm, and this is the norm we will use in this chapter unless otherwise specified. For the general case, we define the following. Definition 21.2 The scale function .s(·) is a continuous strictly increasing function on .(r0 , r1 ). The speed function .m(·) is a right-continuous strictly increasing function on .(r0 , r1 ). One may view .m(·) as the distribution function for the Lebesgue-Stieltjes measure, denoted dm or .m(dx), and referred to as the speed measure: .m(x) = m(dz), x ∈ R (see BCPT3 p. 228). (0,x] It will be assumed throughout that .m(0) = 0 and 0 is an interior point of I . This is simply achieved by translation since adding a constant to the speed function does not change the speed measure. In particular, .Ds+ denotes the right-hand derivative, i.e., Ds+ f (x) = lim
.
y→x +
f (y) − f (x) , r0 < x < r1 . s(y) − s(x)
To simplify the calculations a little, we shall also take .s(x) = x. This choice of scale function is referred to as natural scale. In this case, Ds+ f (x) ≡ Dx+ f (x) = f (x + ).
.
Natural scale can always be achieved by a change of variables transforming .x → s(x). In this case the derivative .Ds+ will be denoted .Dx+ . Denote by .DL the domain of the operator .L = Dm Ds+ from .C(I ) to .C(I ), i.e., DL = {f ∈ C(I ) : Lf ∈ C(I )}.
.
(21.5)
To be clear, on the assumed natural scale .s(x) = x, .f ∈ DL ⊂ C(I ) requires that the one-sided derivative .Dx+ f (x) ≡ f (x + ) exist for all .x ∈ I and that 4 + .Dx f induces a signed measure absolutely continuous with respect to the measure 5 .m(dx) and having Radon–Nikodym derivative .Dm f . In addition, for .f ∈ DL , it
3 Throughout,
BCPT refers to Bhattacharya and Waymire (2016), A Basic Course in Probability Theory. 4 See BCPT, pp. 6, 228. 5 See BCPT p. 250.
is required that Lf (x) := Dm Dx+ f (x), r0 < x < r1 ,
.
(21.6)
exists, as well as their limits .Lf (ri ) = Dm Dx+ f (ri ), i = 0, 1, and that Lf , so defined, belongs to .C(I ). Remark 21.2 The notations .m(dx) and .dm(x) will both be used to denote the speed measures. Lemma 1 .DL is dense in .C(I ). Proof Let .f ∈ C(I ). Fix .ε > 0. The goal is to approximate f uniformly by a function .h ∈ DL , where the domain is as defined above. Keep in mind that such an approximate must have a right-hand derivative, .Dx+ h, that is the distribution function of a Lebesgue-Stieltjes measure having a continuous Radon–Nikodym derivative (density), .Dm Dx+ h, with respect to the speed measure m. First let us note that without loss of generality, for the approximation, it may be assumed that f is continuously differentiable from the start. To see this, begin with a polygonal approximation by a step function .f1 of f such that .f1 is constant at each endpoint of I , and .||f1 − f || < ε. Then, let .φε be an infinitely differentiable function on .R vanishing outside .(− δ2ε , δ2ε ) where .|f1 (x) − f1 (y)| < ε if .|x − y| ≤ δε . Consider the extension .f˜1 (if necessary) of .f1 to .(−∞, ∞) by letting it have the value .f1 (r0 ) on .(−∞, r0 ) and the value .f (r1 ) on .(r1 , ∞). Let .f˜2 = f˜1 ∗ φε . Let .f2 be the restriction of .f˜2 to .[r0 , r1 ] (in case .r1 is finite). Then, .||f2 − f1 || < ε. Thus .||f2 − f || < 2ε. So, let us assume that .f ∈ C(I ) is smooth from the start. Next, let us find a function .f3 that is absolutely continuous with respect to the speed measure m and that has a continuous Radon–Nikodym derivative .Dm f3 with respect to m to approximate the derivative .f of f in the sense that .
|f (y) − f3 (y)|dy < ε.
(21.7)
I
To construct such a function .f3 , reconsider the polygonal approximation .f1 to .f by step functions, let .δ > 0 be arbitrarily small. Fix a finite interval, say .J = [j0 , j1 ] ⊂ (r0 , r1 ), on which .f1 is constant. Let .ϕi , i = 0, 1, be a pair of nonnegative continuous functions, vanishing off of .[j0 − δ, j0 ] and .j1 , j1 + δ], respectively, x such that . I ϕi (x)m(dx) = 1, i = 0, 1. Then, the function .j (x) = r0 (ϕ0 (y) − ϕ1 (y))m(dy), x ∈ I , is continuous and satisfies j (x) =
1
on [j0 , j1 ]
0
on (−∞, j0 − δ) ∪ (j1 + δ, ∞),
.
0 ≤ j (x) ≤ 1 on .[j0 − δ] ∪ [j1 , j1 + δ], and one has
.
|1J (x) − j (x)|dx < 2δ.
.
I ψi .(i = 0, 1), where .ψi is a continuous One may, for example, take .ϕi = ψ (z)m(dz) R i nonnegative function, . R ψi (z)m(dz) > 0 .(i = 0, 1), .ψ0 vanishes outside .[j0 − δ, j0 ], and .ψ1 vanishes outside .[j1 , j1 + δ]. Applying this procedure to each step interval of .f1 and taking linear combinations, with suitably small .δ, one obtains .f3 , as desired. Now define x y .h(x) = f (0) + Dm f3 (z)m(dz)dy, x ∈ I. 0
r0
Then, Dm Dx+ h(x) = Dm f3 (x), x ∈ I,
h ∈ DL ,
.
and
x
||f − h|| = sup |
.
x∈I
0
y
Dm f3 (z)m(dz))dy|
(f (y) − f3 (y))dy|
0
x∈I
r0 x
= sup | ≤
(f (y) −
|f (y) − f3 (y)|dy < ε.
I
It will be important later to distinguish between different types of possible boundary behaviors of .m(·) and .s(·) (or x, when in the natural scale). Definition 21.3 The boundary point .ri (.i = 0 or 1) is inaccessible if |
.
ri
m(x)dx| = ∞,
(21.8)
0
and accessible otherwise.
x 0 Recall that (see (21.2)) .m(x) < 0 for .x < 0, and . 0 = − x if .x < 0. (This is the convention in (21.2) and elsewhere). Definition 21.4 An accessible boundary .ri is regular if |
.
[0,ri )
and is an exit boundary otherwise.
xdm(x)| < ∞,
(21.9)
Remark 21.3 Note that for the integral in (21.8) to be infinity, .ri must be infinity (i = 0, 1). Hence .ri is inaccessible if it is plus or minus infinity. Thus, .ri is regular if .ri and .m(ri ) are both finite, and .ri is an exit boundary if it is finite, but .m(ri ) is plus or minus infinity.
.
Definition 21.5 An inaccessible boundary .ri is natural if the integral in (21.9) is infinite (i.e., if and only if both integrals are infinite), and an entrance bondary if the integral in (21.9) is finite, but (21.8) is infinite. Our first goal is to determine explicit conditions on s and m in terms of .ri to uniquely determine the existence of a strongly continuous contraction semigroup with generator .(L, DL ). The approach to achieve this is to show (by the Hille–Yosida Theorem) that λ ∈ ρ(L) and ||Rλ || ≤
.
1 ∀λ > 0 iff r0 , r1 are inaccessible boundaries. λ
Toward this end, fix .λ > 0 and try to obtain a solution .u ∈ DL of (λ − L)u = f
(21.10)
.
for an arbitrary .f ∈ C(I ). This is accomplished by obtaining a Green’s function G(x, y) (also called a fundamental solution) for the operator .λ − L such that
.
u(x) = (Gf )(x) ≡
G(x, y)f (y)dm(y)
.
(21.11)
(r0 ,r1 )
solves (21.10) for every .f ∈ C(I ). This is carried out by completing the following steps: (i)
Construct, by successive approximation, a function .g(x) = g(x; λ) that satisfies (λ − L)g(x) = 0
.
r0 < x < r1 ,
(21.12)
is positive and decreasing on .(r0 , 0) and increasing on .(0, r1 ). (ii) Set g+ (x) = g(x)
r1
g(y)
.
x
−2
dy,
g− (x) = g(x)
x
g(y)−2 dy.
(21.13)
r0
One notes (see Lemma 5 below) that .g+ is decreasing, .g− increasing, and .g± solve (21.12). (iii) Use .g+ , .g− , to construct a Green’s function, .G ∝ g− g+ , in the usual manner (developed in detail below, see (21.44)).
These steps will be carried out in a sequence of lemmas that provide a proof of the following theorem; specifically, steps (i)–(iii) are proven in Lemmas 4, 5, and 7. In the process of proving the theorem, a number of results applicable to the accessible cases will be derived as well. Theorem 21.1 (a) The operator L with domain .DL is the infinitesimal generator of a (strongly continuous) contraction semigroup on .C(I ) if and only if .r0 and .r1 are inaccessible boundaries. (b) This semigroup is Markovian and has a (Feller continuous) transition probability .p(t; x, dy) on I satisfying .
1 lim {sup p(t; x, Bεc (x))} = 0; for all ε > 0. t↓0 t x∈I
(21.14)
Here .Bε (x) = {y : |y − x| < ε}. As noted above, the proof is accomplished by a series of lemmas. Definition 21.6 The Wronskian .W (v1 , v2 ) of functions .v1 , v2 is defined by the following determinant: W (v1 , v2 )(x) = (
.
d d v1 (x))v2 (x) − v1 (x) v2 (x) = det dx dx
d d dx v1 (x) dx v2 (x) v1 (x) v2 (x)
(21.15) Lemma 2 Suppose that .v1 , v2 are both solutions of .(λ − L)v = 0. Then the Wronskian (21.15) is independent of x. It is identically zero if and only if .v1 , .v2 are linearly dependent. Proof The second part of the lemma is clear from the definition of W . For the first part, we will show that the variation of W is zero on .(x, y] ⊂ I . Note that .W (x) ≡ v (x)v2 (x) − v1 (x)v (x) is of bounded variation on finite subintervals of 1 2 .(r0 , r1 ). Hence, for .r0 < y < x < r1 , W (x) − W (y) =
dW (z)
.
(y,x]
=
(y,x]
= (y,x]
d(v1 (z)v2 (z)) − (v1 v2 (z))dz
− (y,x]
(y,x]
+
(v1 v2 (z))dz
d(v2 (z)v1 (z))
v2 (z) (y,x]
dv1 (z) dm(z) dm(z)
−
v1 (z) (y,x]
dv2 (z) dm(z) dm(z)
=
v2 (z)(Lv1 (z))dm(z) −
(y,x]
=
v1 (z)(Lv2 (z))dm(z) (y,x]
v2 (z)λv1 (z)dm(z) − (y,x]
v1 (z)λv2 (z)dm(z) (y,x]
= 0. Lemma 3 Let .v1 , v2 be two solutions of .(λ − L)v = 0, such that .W (v1 , v2 ) = 0. Then every solution v of this equation is a unique linear combination of .v1 and .v2 . In particular, .v1 , v2 are independent solutions if and only if .W (v1 , v2 ) = 0. Proof The second part of the lemma is contained in Lemma 2. For the first part, let v1 , v2 be two linearly independent solutions and v an arbitrary solution of
.
(λ − L)v = 0.
.
There exist unique constants .c1 , c2 such that c1 v1 (0) + c2 v2 (0) = v(0),
.
d d d v1 )(0) + c2 ( v2 )(0) = ( v)(0), dx dx dx d d v2 (0) are linearly independent in .R2 v1 (0) , and . v2 (0), dx since . v1 (0), dx by Lemma 2. We claim .u0 := c1 v1 + c2 v2 = v. To prove this note that .(L − λ)u0 = 0, .u0 (0) = v(0), and .u0 (0) = v (0). Therefore, .W (u0 , v)(0) = 0, and hence, .W (u0 , v)(z) = 0 for all z. This implies, by Lemma 2, that .v = cu0 for some constant c. But then .c = 1 since this is the only constant for which .v(0) = .cu0 (0). c1 (
.
Lemma 4 There is a function .g ∈ DL such that d d g(x) = λg(x), dm dx x x and .1 + λ 0 m(y)dy ≤ g(x) ≤ exp(λ 0 m(y)dy). Lg(x) =
(21.16)
.
Proof Throughout we use the convention for .x < 0,
h(z)dm(z) = −
.
(0,x]
[x,0)
h(z)dm(z),
Recursively define the functions
0
x
0
h(z)m(dz) = −
h(z)m(dz). x
(21.17)
21 One Dimensional Diffusions by Semigroups
g (x) ≡ 1,
.
0
g
n+1
x
(x) =
363
g n (z)dm(z))dy, (n = 1, 2, . . . ).
( 0
(21.18)
(0,y]
Then one has Lg n+1 = g n (x)
(n = 0, 1, 2 . . . ).
.
(21.19)
By induction check that, for each .n ≥ 1, .g n (x) is positive for .x = 0 (.g n (0) = 0), strictly decreasing for .x < 0, and strictly increasing for .x > 0. Write ∞
g(x) = g(x; λ) =
.
λn g n (x).
(21.20)
n=0
At least formally, $Lg(x) = \lambda g(x)$ in view of (21.19). To prove the desired convergence properties, we use induction on n to get, for $x > 0$,
.
n+1
x
(x) =
g n (z)dm(z))dy
(
0
(0,y] x
≤
g (y)m(y)dy =
0
g n (yn )m(yn )dyn
g n−1 (yn−1 )m(yn−1 )dyn−1 )m(yn )dyn
0
x
≤
yn
y1
···
0
0
=
yn
( 0
.. .
x
0
x
≤
n
m(y0 )m(y1 ) · · · m(yn )dy0 dy1 · · · dyn
0
(g 1 (x))n+1 , (n + 1)!
(21.21)
z where .g 1 (z) = 0 m(y)dy, i.e., .dg 1 (z) = m(z)dz. Similarly, for .x < 0, g
.
n+1
0
(x) =
( x
[y,0) 0
≤−
g n (x)dm(z))dy
g n (y)m(y)dy
x
.. . ≤
g 1 (x))n+1 , (n + 1)!
where .g 1 (z) = −
0 z
m(y)dy > 0, i.e., .dg 1 = m(z)dz. As a consequence
dg n+1 (x) |=| .| dx
(0,x]
(0,x]
(g(x))n
≤
(g 1 (y))n dm(y)| n!
g (y)dm(y)| ≤ | n
n!
|m(x)|,
(n = 1, 2, . . . ).
(21.22)
It follows from (21.21) that the function g in (21.20) converges absolutely, uniformly on compact subsets of .(r0 , r1 ), and that g(x) ≤ exp{λg 1 (x)}
(21.23)
.
It follows from (21.22) that the series of derivatives absolutely, uniformly on compacts, and ∞ .
n=0
∞
.
0
d n λn dx g (x) converges
∞
|λn
dg n (x) dg n (x) 1 |= | ≤ |λm(x)|eλg (x) . |λn dx dx
(21.24)
n=1
Hence g is continuously differentiable and one has ∞
d g(x) = λn . dx
g
n−1
(z)dm(z) = λ
(0,x]
n=1
g(z)dm(z).
(21.25)
(0,x]
Hence, Lg(x) =
.
d d g(x) = λg(x), dm dx
(21.26)
and by (21.21) ( and the obvious fact that .g ≥ g 0 + λg 1 ), 1 + λg 1 (x) ≤ g(x) ≤ exp{λg 1 (x)}.
.
(21.27)
Example 1 (1/2 Laplacian) In the case $s(x) = x$, $m(x) = 2x$ on $(-\infty, \infty)$, $L = \frac{1}{2}\frac{d^2}{dx^2}$, and one obtains from (21.20) that
$g(x; \lambda) = \sum_{n=0}^{\infty} \frac{2^n \lambda^n x^{2n}}{(2n)!} = \cosh(\sqrt{2\lambda}\,x),$
so that
$g_+(x) = \cosh(\sqrt{2\lambda}\,x)\Big(\frac{1}{\sqrt{2\lambda}} - \frac{\tanh(\sqrt{2\lambda}\,x)}{\sqrt{2\lambda}}\Big) = \frac{e^{-\sqrt{2\lambda}\,x}}{\sqrt{2\lambda}},$
and
$g_-(x) = \cosh(\sqrt{2\lambda}\,x)\Big(\frac{1}{\sqrt{2\lambda}} + \frac{\tanh(\sqrt{2\lambda}\,x)}{\sqrt{2\lambda}}\Big) = \frac{e^{\sqrt{2\lambda}\,x}}{\sqrt{2\lambda}},$
and, therefore, $W(g_-, g_+) = \frac{2}{\sqrt{2\lambda}}$. Thus, in accordance with (21.44) below,
$G(x, y) = \frac{1}{2\sqrt{2\lambda}}\,e^{-\sqrt{2\lambda}\,|y - x|}, \qquad -\infty < x, y < \infty.$
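The following short numerical check is not part of the original text; it is a minimal sketch, assuming the grid, test function, and truncation of the real line chosen below, that the kernel G above inverts $\lambda - \frac{1}{2}\frac{d^2}{dx^2}$ as asserted in (21.45)-(21.46).

```python
# Minimal numerical sketch (not from the text): with speed measure m(dy) = 2 dy,
# the kernel G(x, y) above should satisfy (lambda - (1/2) d^2/dx^2)(Gf) = f.
# Grid, test function and truncation are arbitrary illustrative choices.
import numpy as np

lam = 1.3
x = np.linspace(-8.0, 8.0, 2001)            # truncation of (-infty, infty)
h = x[1] - x[0]
a = np.sqrt(2.0 * lam)

G = np.exp(-a * np.abs(x[:, None] - x[None, :])) / (2.0 * a)
f = np.exp(-x**2)                           # a smooth test function

u = (G * f[None, :] * 2.0).sum(axis=1) * h  # (Gf)(x) = int G(x,y) f(y) m(dy)
u_xx = np.gradient(np.gradient(u, h), h)
resid = lam * u - 0.5 * u_xx - f            # should vanish up to discretization error
print(np.abs(resid[200:-200]).max())        # small, away from the artificial cut-off
```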
Exercise 14 contains another illustrative calculation along these lines. Lemma 5 The functions .g+ , g− defined in (21.13) are, respectively, strictly decreasing and strictly increasing on I , and each solves (21.26). Proof In view of the first inequality in (21.27), one has g+ (x) ≤ g(x)
r1
.
(1 + λg 1 (z))−2 dz
x r1
= g(x) x
g(x) ≤ |m(x)|
(1 + λg 1 (z))−2
r1
dg 1 (z) m(z)
(1 + λg 1 (z))−2 dg 1 (z)
(21.28)
x
=
g(x) {(1 + λg 1 (x))−1 − (1 + λg 1 (r1 ))−1 } λ|m(x)|
≤
g(x) (1 + λg 1 (x))−1 , λ|m(x)|
which is finite (for .x > 0 and, therefore, for all x). Similarly, g− (x) ≤
.
≤
g(x) {(1 + λg 1 (x))−1 − (1 + λg 1 (r0 ))−1 } λ|m(x)| g(x) (1 + λg 1 (x))−1 λ|m(x)|
(21.29)
is finite. Now let us check that .g+ and .g− also satisfy the eigenvalue problem of the form (21.26), or, equivalently, (21.12). Using the definition of .g+ and (21.26) for g, one has
.
x
λ 0
g+ (y)dm(y)
x
=λ
r1
1
g(y)( 0
g 2 (z)
y
dz)dm(y)
r1 d + 1 dz)dm(y) g (y )( 2 y g (z) 0 dm x r1 1 = dz)dg (y) ( 2 y g (z) 0 r1 x r1 1 1 + + d = g (x ) dz)dy dz − g (y ) ( 2 2 dy y g (z) x g (z) 0 r1 1 1 1 + = g (x ) + . dz − 2 (z) g(x) g(0) g x
=
x
(21.30)
Now, since .g (0+ ) = 0, one has g+ (0+ ) = lim
.
x→0+
1 (g(x) x
= − lim
g(x)
x→0+
r1 x
x
dz − g(0) g 2 (z)
dz 0 g 2 (z)
x
=−
r1 0
dz ) g 2 (z)
1 . g(0)
(21.31)
Thus, using this and the definition of .g+ , the last line of (21.30) is precisely (x + ) − g (0+ ). Thus, g+ +
.
g+ (x + ) − g+ (0+ ) = λ
x
.
0
g+ (y)dm(y).
(21.32)
Hence, one has that .Lg+ = λg+ . Similarly, one has that Lg− (x) =
.
d d g− (x) = λg− (x). dm dx
Next note that g is strictly positive, and (21.27)) so that .
.
d dx g(x)
(21.33)
is increasing in x (see (21.25),
r1 d d g+ (x) = ( g(x)) g(z)−2 dz − g(x)−1 dx dx x r1 d < g(z)−2 g(z)dz − g(x)−1 = −g(r1 )−1 ≤ 0. dz x
Here .g(r1 ) = limx→r1 g(x). Hence, .g+ is decreasing (strictly). Similarly,
(21.34)
.
d g− (x) > g(r0 )−1 ≥ 0, dx
(g(r0 ) = lim g(x))
(21.35)
x→r0
so that .g− is strictly increasing. Next let .v1 , v2 be two solutions of (21.12), i.e., (λ − L)vi (x) = 0,
for r0 < x < r1 (i = 1, 2).
.
(21.36)
The next order of business is to study the boundary behavior of $g_\pm$ and $\frac{dg_\pm}{dx}$. The following table of values at $r_i$ is obtained as limits as $x \to r_i$ with $x \in (r_0, r_i)$.
Lemma 6 One has the following boundary behavior for .g± given in Table 21.1. Proof d d (I). Since . dm dx g+ = λg+ , one has for .0 < x < x0 < r1 , integrating with respect to .m(dz), d d g+ )(x0 ) − ( g+ )(x) = λ .( g+ (z)dm(z). dx dx (x,x0 ]
Next, integrating with respect to dx,
(x0 − x)g+ (x0 ) − g+ (x0 ) + g+ (x) = λ
.
(x,x0 ]
(
(z,x0 ]
g+ (y)m(dy))dz. (21.37)
Therefore, g(x) =
.
g+ (x0 ) − (x0 − x)g+ (x0 ) + λ
(x,x0 ]
≥λ
(x,x0 ]
(
(z,x0 ]
(
(z,x0 ]
g+ (y)m(dy))dz
g+ (y)m(dy))dz.
d g+ < 0) Letting .x0 ↑ r1 , one gets (since . dx
.g+ (x)
≥λ
(x,r1 ]
( (z,r1 )
g+ (y)dm(y))dz ≥ λg+ (r1 )
(z,r1 ]
(m(r1 ) − m(z))dz.
(21.38)
If .r1 is exit or natural, then .∞
=
(0,r1 )
ydm(y) =
r1 0
(m(r1 ) − m(y))dy
so that, by (21.38), .g+ (r1 ) must vanish. If .r1 is regular, then .r1 , .m(r1 ) are both finite, g is bounded away from zero and infinity on .[x, r1 ) and, therefore, .g+ (x)
≡ g(x)
(x,r1 ]
g −2 (y)dy → 0
Table 21.1 Boundary behavior of $g_\pm$ at $r_1$

|     |                          | Natural  | Exit      | Entrance            |
| I   | $g_+(r_1)$               | 0        | 0         | $> 0$               |
| II  | $g_-(r_1)$               | $\infty$ | $< \infty$| $\infty$            |
| III | $\frac{d}{dx}g_+(r_1)$   | 0        | $< 0$     | 0                   |
| IV  | $\frac{d}{dx}g_-(r_1)$   | $\infty$ | $\infty$  | $\ge 0$, $< \infty$ |
as .x ↑ r1 . This validates the first row in Table 21.1, except for the case of an entrance boundary, which we establish as follows. The function .g+ (x) is d g ≤ 0; also, for .x > 0, decreasing, so that . dx + .(
d d g± )(x) = ( g± )(0) + λ g± (z)dm(z), dx dx (0,x]
(21.39)
d g is increasing, and, in particular, the limit at .r , which shows that . dx ± 1 d .( g )(r ) exists and is . ≤ 0 . If . r is inaccessible, then + 1 1 dx .g(r1 )
≥ (1 + λg 1 (r1 )) = ∞,
so that .g(r1 )−1 = 0. Letting .x ↑ r1 , in the first equality in (21.30), one gets d d d dx g+ )(r1 ) ≥ 0 (since .( dx g)(x) ≥ 0 for .x > 0). Hence .( dx g+ )(r1 ) = 0 if .r1 is inaccessible. Now, using this, one has
.(
.
− log g+ (r1 ) + log g+ (0) r1 ( d g+ )(x) = dx − dx g+ (x) 0 r1 d d g )(x) ( dx g+ )(r1 ) − ( dx d + = dx, (( g+ )(r1 ) = 0) g (x) dx + 0 r1 d d λ ( g+ ) = λg+ g+ (y)dm(y))dx, ( = g (x) dm dm + 0 (x,r1 ) r1 r1 g+ (x) ( ≤λ dm(y))dx = λ ( dm(y))dx 0 g+ (x) (x,r1 ) 0 (x,r1 ) y =λ ( dx)dm(y) = λ ydm(y) < ∞, (21.40) (0,r1 )
0
(0,r1 )
since .r1 is an entrance boundary. This shows .g+ (r1 ) = 0.
(II). By the Lemma 2, there exist unique real numbers .c1 , c2 such that .g(x) = c1 g+ (x) + c2 g− (x). Clearly, .c1 , c2 are nonzero (since .g(x) is strictly decreasing for .x < 0 and strictly increasing for .x > 0) and cannot both be negative (since .g(x) > 0 for .x = 0). Indeed, both are positive. For suppose .c1 < 0 and .c2 > 0; d g(x) = c d g (x) + c d g (x) > 0 for all x , but this is not true. then . dx 2 dx − 1 dx + d g(x) is negative for all x . Similarly, .c1 > 0 and .c2 < 0 would imply that . dx 1 Next note that .g(r1 ) = ∞ if and only if .g (r1 ) = ∞ if and only if .g− (r1 ) = ∞. Hence, .g− (r1 ) = ∞ if and only if .r1 is inaccessible. d g )(r ) = 0 if .r is (III). In establishing (I), we have already proved that .( dx + 1 1 d g )(r ) ≤ 0 if .r is inaccessible (i.e., natural or entrance). The assertion .( dx + 1 1 regular, is obvious. Assume now that .r1 is exit. By Lemma 1, .0
d d g− )(0)g+ (0) − g− (0)( g+ )(0) = W (g− , g+ ) dx dx d d = lim {( g− )(x)g+ (x) − g− (x)( g+ )(x)}. dx x↑r1 dx 0, dx
d g (r ))(r ) < 0. which implies that .( dx + 1 y d1 g− (z)dz ≥ 0 (IV). One has .g− (y) = g− (0) + 0y dz d g )(0). Hence y( dz −
y d d dz g− (z)dz ≥ 0 ( dz g− )(0)dz =
.(
d d g− (y)dm(y) g− )(x) − ( g− )(0) = λ dx dx (0,x] d ≥λ y( g− )(0)dm(y) dz (0,x] d = λ( g− )(0) ydm(y). dz (0,x]
(21.43) d g )(0) > 0), Letting .x ↑ r1 , one gets (since .( dz − .(
d g− )(r1 ) = ∞, dx
if .r1 is natural or exit. If .r1 is regular, then .r1 < ∞, .m(r1 ) < ∞, .g 1 (r1 ) < ∞, and, d g)(r ) < ∞. Since .( d g)(r ) = c ( d g )(r ) + therefore, .g(r1 ) < ∞ and .( dx 1 1 1 dx + 1 dx d d g )(r ) < ∞. If .r c2 ( dx g− )(r1 ) for positive constants .c1 , c2 , it follows that .( dx − 1 1 d g )(r ) = ∞, then the right side is entrance, then .g+ (r1 ) > 0 (by (I)), and if .( dx − 1 of (21.41) would be .+∞, a contradiction since .W (g− , g+ ) = W (g− , g+ )(0) is finite. The proof of (I)–(IV) is complete; the proof of the boundary behavior at .r0 is entirely analogous. Let us now show that the desired Green’s function for .λ − L is given as follows: Lemma 7 The Green’s function for .λ − L is given by G(x, y) =
$G(x, y) = \begin{cases} \dfrac{g_-(x)\,g_+(y)}{W(g_-, g_+)} & \text{if } r_0 < x < y < r_1, \\[4pt] G(y, x) & \text{if } r_0 < y \le x < r_1. \end{cases}$   (21.44)
For every bounded measurable real-valued function f on I, define
$(Gf)(x) := \int_{(r_0, r_1)} G(x, y) f(y)\,dm(y), \qquad r_0 < x < r_1.$   (21.45)
The integral is bounded, and for all $f \in C(I)$,
$(\lambda - L)Gf(x) = f(x), \qquad r_0 < x < r_1.$   (21.46)
Proof To show that the integral on the right converges, write .W = W (g− , g+ ) and note that
21 One Dimensional Diffusions by Semigroups .
W (Gf )(x)
371
= g+ (x)
(r0 ,x]
= λ−1 g+ (x)
(
d d
(r0 ,x] dm dy
+λ−1 g− (x) = λ−1 g+ (x)
g− (y)f (y)dm(y) + g− (x)
(
d d
(r0 ,x]
g+ (y)f (y)dm(y)
g− (y))f (y)dm(y)
(x,r1 ) dm dy
f (y)d(
(x,r1 )
g+ (y))f (y)dm(y),
d d g− (y)) + λ−1 g− (x) f (y)d( g+ (y)), dy dy (x,r1 )
(21.47) which is bounded in magnitude by .
λ−1 g+ (x)|{(
d d g− (y))y=x − ( g− (y))y=r0 }|||f ||∞ dy dy
+λ−1 g− (x)|{(
d d g+ (y))y=r1 − ( g+ (y))y=x }|||f ||∞ . dy dy
(21.48)
To prove, (21.46), from (21.47) one gets .
d d W (Gf )(x) = ( g+ )(x) dx dx +(
(r0 ,x]
d g− )(x) dx
g− (y)f (y)dm(y)
g+ (y)f (y)dm(y),
(x,r1 )
so that .
d d W (Gf )(x) dm dx d d g+ )(x) =( g− (y)f (y)dm(y) dm dx (r0 ,x] +(
d d d g− )(x) g+ )(x)g− (x)f (x) + ( dm dx dx
d g− )(x)g+ (x)f (x) dx = λg+ (x) g− (y)f (y)dm(y) + λg− (x)
(x,r1 )
g+ (y)f (y)dm(y)
−(
(r0 ,x]
(x,r1 )
g+ (y)f (y)dm(y)−Wf (x)
= λW (Gf )(x) − Wf (x).
The next lemma deals with the boundary behavior of Gf . Lemma 8 Let .f ∈ C(I ). (a) Then .Gf ∈ C(I ). (b) If .ri is accessible, then .(Gf )(ri ) := limx→ri (Gf )(x) = 0 (c) If .ri is natural, then .(Gf )(ri ) = λ−1 f (ri ) (.i = 1, 2). Proof Take .ri = r1 . Consider first the case: f vanishes in a neighborhood of .r1 . Let .f (x) = 0 for .x > r (.r < r1 ). Then W (Gf )(x) = g+ (x)
.
(r0 ,x]
g− (y)f (y)dm(y) + g− (x)
(x,r1 )
g+ (y)f (y)dm(y) (21.49)
→ g+ (r1 )
(r0 ,r]
g− (y)f (y)dm(y) as x ↑ r1 .
If .r1 is accessible (i.e., regular or exit) or natural, then .g+ (r1 ) = 0 and .(Gf )(r1 ) = 0. Next consider the case: .f ≡ 1. In this case (see the last equality in (21.47) ): .
W (Gf )(x) d d g− )(x) − ( g− )(r0 )) dx dx d d +λ−1 g− (x)(( g+ )(r1 ) − ( g+ )(x)) dx dx d d = λ−1 W − λ−1 g+ (x)( g− )(r0 ) + λ−1 g− (x)( g+ )(r1 ). dx dx
= λ−1 g+ (x)((
(21.50)
In case .r1 is accessible, then .g+ (r1 ) = 0, .g− (r1 ) < ∞, and one has .
W lim (Gf )(x) x↑r1
= λ−1 W + λ−1 g− (r1 )( = λ−1 lim (( x↑r1
= λ−1 lim ( x↑r1
d g+ )(r1 ) dx
d d d g− )(x)g+ (x) − g− (x)( g+ )(x) + g− (x)( g+ )(x)) dx dx dx
d g− )(x)g+ (x). dx
(21.51)
Now one has (
.
d d g− )(x) = ( g− )(0) + dx dx ≤(
(0,x]
λg− (y)dm(y)
d g− )(0) + λg− (r1 )m(x), dx
so that .
lim (
x↑r1
d d g− )(x)g+ (x) ≤ ( g− )(0)g+ (r1 ) + λg− (r1 ) lim sup m(x)g+ (x) dx dx x↑r1 = λg− (r1 ) lim sup m(x)g+ (x) x↑r1
=0 if .r1 is accessible. For, in this case .g− (r1 ) < ∞, .m(r1 ) < ∞, .g+ (r1 ) = 0. Thus, if r1 is accessible, then .(Gf )(r1 ) = 0 in case f vanishes in a neighborhood of .r1 or if .f ≡ 1. Any .f ∈ C(I ) may be approximated uniformly by a linear combination of functions of these two types. Also (21.50) implies (.W (G1)(x) ≤ λ−1 W for all x) .
||Gf ||∞ ≤ λ−1 ||f ||∞ ,
.
(21.52)
which holds whatever the type of boundaries. Therefore, the assertion concerning an accessible boundary is proved. d If .r1 is natural, and .f ≡ 1, then .( dx g+ )(r1 ) = 0 and .g+ (r1 ) = 0 by (21.50), (Gf )(x) = λ−1 −
.
λ−1 d g+ (x)( g− )(r0 ) → λ−1 as x ↑ r1 , W dx
(21.53)
while, as shown earlier, .Gf (x) → 0 as .x ↑ r1 for all f vanishing in a neighborhood of .r1 . Therefore, using (21.52) and approximation, .
lim (Gf )(x) = λ−1 f (r1 )
x↑r1
(21.54)
for every .f ∈ C(I ), in case .r1 is natural. d It remains to consider the case: .r1 is entrance. In this case .( dx g+ )(r1 ) = 0, and (21.50) yields λ−1 d g+ (x)( g− )(r0 ) dx W d → λ−1 − λ−1 g+ (r1 )( g− )(r0 ) = λ−1 as x → r1 . dx
(G1)(x) = λ−1 −
.
(21.55)
We are now in a position to complete the proof of Theorem 21.1.
Proof of Theorem 21.1 (a) (Sufficiency) .DL is dense in .C(I ) by Lemma 1. Fix .λ > 0. By Lemmas 7 and 8, .u ≡ Gf ∈ DL and solves (λ − L)u = f.
.
(21.56)
for every .f ∈ C(I ). Fix .f ∈ C(I ). Let us show that .u ≡ Gf is the unique element of .DL satisfying (21.56). If .u1 is another such element, then .(λ−L)(u− u1 )(x) = 0 for .r0 < x < r1 . Therefore by Lemma 3, .u − u1 = c1 g+ + c2 g− for some constants .c1 , c2 . If .r1 is inaccessible, then .g− (r1 ) = ∞ (Lemma 6 (III)) and .g+ (r1 ) < ∞. Therefore in order that .u − u1 ∈ C(I ), one must have .c2 = 0. Similarly, the inaccessibility of .r0 would imply .c1 = 0 in order that .u − u1 ∈ C(I ). Hence .u − u1 ≡ 0. Thus .u = Gf is the unique element of .DL satisfying (21.56), i.e., .λ − L is one-to-one on .DL and its range is .C(I ). Further, (21.52) implies that ||(λ − L)−1 f ||∞ = ||Gf ||∞ ≤ λ−1 ||f ||∞ .
.
Thus, by the Hille–Yosida Theorem A.1, L on .DL is the infinitesimal generator of a strongly continuous contraction semigroup on .C(I ). (Necessity) If .r1 is accessible, then .0 < g− (r1 ) < ∞. Hence for any given .f ∈ C(I ), .Gf +c1 g− ∈ DL and is a solution of .(λ−L)u = f for every .c1 ∈ R. Thus .λ−L is not one-to-one. Similarly, if .r0 is accessible then .Gf +c2 g+ ∈ DL and is a solution of .(λ − L)u = f for every .c2 ∈ R. Thus if .r0 , r1 are not both inaccessible, L on .DL is not the infinitesimal generator of a strongly continuous semigroup on .C(I ). (b) Let .Tt denote the semigroup generated by L (with domain .DL ) on .C(I ). By the Riesz representation theorem, there exists a finite signed measure .p(t; x, dy) such that .(Tt f )(x) = f (y)p(t; x, dy) for all f ∈ C(I ). (21.57) I
The Laplace transform of .t → (Tt f )(x) is the resolvent (see (2.47), Chapter 2) (λ − L)−1 f (x) =
.
∞
e−λt (Tt f )(x)dt (t ≥ 0).
(21.58)
0
By uniqueness of the solution of .(L − λ)u = f , (λ − L)−1 f (x) = (Gf )(x).
.
(21.59)
But for .f ≥ 0, .(Gf )(x) ≥ 0. This implies that .p(t; x, dy) is a measure, and (21.52) implies that .0 ≤ p(t; x, B) ≤ 1 for all Borel subsets B of I . Measurability of .x → p(t; x, B) follows from that of .x → Tt f (x) for all .f ∈ C(I ) (Exercise 2(i)). The
Chapman–Kolmogorov equation is just a restatement of the semigroup property of dg− + Tt . Also, since .r0 , .r1 are inaccessible .( dg dx )(r1 ) = 0, .( dx )(r0 ) = 0. Hence (21.50) yields
.
(G1)(x) =
.
1 for all x ∈ I. λ
(21.60)
Since the Laplace transform is one-to-one on the set of all bounded continuous functions on .[0, ∞) (Exercise 2(ii)), and .
1 = λ
∞
e−λt dt,
(21.61)
0
it follows that p(t; x, I ) ≡
1p(t; x, dy) = 1, (for all t > 0, x ∈ I ).
.
(21.62)
I
Thus .p(t; x, dy) is a transition probability function on I . Feller continuity is simply the statement .x → Tt f (x) is continuous if .f ∈ C(I ), which is clearly true. It remains to prove (21.14). Given .ε > 0 and .x ∈ I , there exists .fx ∈ DL such that (Exercise 4), .fx (y) ≥ 0 for all .y ∈ I and fx (y) =
.
0 for |y − x| ≤ 3ε , 1 for |y − x| ≥ 2ε 3.
(21.63)
Then if .|x − x| ≤ ε/3, one has .fx (x ) = 0 and
.
Tt fx (x ) − fx (x ) t {x ∈I :|x −x|< ε } sup
3
Tt fx (x ) t {x ∈I :|x −x|< 3ε } 1 f (y)p(t; x , dy) ≥ sup ε } {x ∈I :|x −x|< } t {y∈I :|y−x|≥ 2ε 3 =
sup
3
= ≥
sup
{x ∈I :|x −x|< 3ε }
1 2ε p(t; x , {y : |y − x| ≥ }) 3 t
1 p(t; x , {y : |y − x | ≥ ε}). {x ∈I :|x −x|< ε } t sup
3
But .Lfx (x ) = 0 if .|x − x| ≤ 3ε , and
(21.64)
.
sup |
x ∈I
Tt fx (x ) − fx (x ) − Lfx (x )| → 0 as t ↓ 0. t
(21.65)
Hence .
1 p(t; x , {y − x | ≥ ε}) → 0 as t ↓ 0. {x ∈I :|x −x"≤ ε } t sup
(21.66)
3
Assume without essential loss of generality that I is a finite interval. This is achieved, e.g., by taking as a diffeomorphic image of .(r0 , r1 ) a finite interval .(a, b) and then treating .[a, b] as I . This may mean abandoning the natural scale; but that is of no consequence. (In any case, the two-point compactification reduces I to a compact set). Now find a finite set of points .r0 = x0 < x1 < · · · < xn = r1 such that .|xi−1 −xi | ≤ 3ε (.i = 0, 1, . . . , n−1). Obtain nonnegative functions .fxi with the properties (21.63) (with .xi = x). Then (21.66) holds with .x = xi (.i = 1, 2, . . . , n). Hence, . sup
1
x∈I t
p(t; x, {y : |y − x| ≥ ε}) = max
1
sup
0≤i≤n x ∈I :|x −xi |≤ ε t
p(t; x , {y ∈ I : |y − x | ≥ ε})
3
→ 0 as t ↓ 0.
The proof of Theorem 21.1 is now complete.
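As an aside (not in the original text), condition (21.14) is easy to inspect numerically in the Brownian case, where $p(t; x, B_\varepsilon^c(x)) = P(|B_t - x| \ge \varepsilon)$ does not depend on x; the values of $\varepsilon$ and t below are arbitrary.

```python
# Aside (not from the text): for standard Brownian motion,
# p(t; x, B_eps^c(x)) = P(|B_t - x| >= eps) = erfc(eps / sqrt(2 t)) for every x,
# so condition (21.14) amounts to erfc(eps / sqrt(2 t)) / t -> 0 as t -> 0.
from math import erfc, sqrt

eps = 0.5
for t in [1e-1, 1e-2, 1e-3, 1e-4]:
    print(t, erfc(eps / sqrt(2.0 * t)) / t)   # decreases rapidly to 0
```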
The “necessity” part of Theorem 21.1 (and its proof) shows that if at least one of the boundary points .ri (.i = 0, 1) is accessible, then there exist more than one solution .u ∈ DL to .(λ−L)u = f for any given .f ∈ C(I ). Thus in order to construct a semigroup in this case, it is necessary to restrict the domain of L further (by means of what may be called “boundary conditions”). The next two theorems precisely tell us these restrictions. In preparation we will require the processes associated with the generator L in Theorem 21.1. For the statement of the following theorem, consider the space .S [0,∞) of all functions on .[0, ∞) into a Polish space .(S, ρ), with the Kolmogorov .σ -field .G generated by finite or countably many coordinates. Denote by .X(t) the projection on the t-th coordinate: .X(t)(ω) = ω(t), t ≥ 0, .ω ∈ S [0,∞) . Let .Px denote the distribution on .(S [0,∞) , G) of the Markov process with a transition probability .p(t; y, dz), starting at .X(0) = x. Theorem 21.2 (Dynkin-Snell Criterion for Continuity of Sample Paths of Markov Processes) Assume that for every .ε > 0, .
lim t↓0
1 sup p(t; x, B c (x : ε)) = 0, t x∈S
where B(x : ε) = {y ∈ S : ρ(y, x) < ε},
.
B c (x : ε) = S\B(x : ε).
(21.67)
Let .Ω = C([0, ∞) : S) be the set of all continuous functions on .[0, ∞) into S, considered as a subset of the product space .S [0,∞) . Then Px∗ (Ω) = 1,
(21.68)
.
where .Px∗ denotes the outer measure on subsets of .S [0,∞) defined by the probability space .(S [0,∞) , G, Px ). For the proof of the theorem, we will need the following lemma. Lemma 9 For .ε > 0, δ > 0, define Γ (ε, δ) = sup sup p(t; x, B c (x : ε)).
.
(21.69)
t r0 and .Af ∈ C(I ) (Af )(r0 ) = lim (Af )(y) = (Lf )(r0 ),
.
y↓r0
(21.83)
which is of the form (21.77) for .i = 0, with α0 = R, Q0 (dy) = AQπ0 (dy), β0 = 1.
.
(21.84)
In case .Q + R = ∞, one has (r(y)−f (r0 )) 1 π(tk ; dy) − r(tk )f (r0 )] (Af )(r0 ) tk [q(tk ) I y−r0 .0 = = lim 1 tk ↓0 Q+R tk (q(tk ) + r(tk )) f (y)−f (r0 ) π0 (dy) f (r0 ) I y−r0 − = 1 1+S 1+ S S f (y) − f (r0 ) 1 f (r0 ), = π0 (dy) − (21.85) y − r0 1+S 1+S I S = 1 in the which is of the form (21.77) for .i = 0. If .S = ∞ then one must take . S+1 above computation and (21.85) becomes f (y) − f (r0 ) π0 (dy) = 0, (21.86) . y − r0 I
and if $S = 0$, (21.85) becomes $f(r_0) = 0$. In any case one has a relation of the form $\Phi_0(f) = 0$, with $\beta_0 = 0$ and at least one of $\alpha_0, Q_0$ not zero. Finally, if $\beta_i = 0$ and $\int \frac{dQ_i(x)}{|r_i - x|} < \infty$, then $f \to \Phi_i(f)$ is a nonzero bounded linear functional on $C(I)$ (indeed $|\Phi_i(f)| \le d_i\|f\| + 2\|f\|\int \frac{dQ_i(x)}{|r_i - x|}$). Hence $\{f \in C(I) : \Phi_i(f) = 0\}$ is a closed linear subspace of $C(I)$ which is not equal to $C(I)$ and is, therefore, not dense in $C(I)$. As a consequence $\mathcal{D}_A = \mathcal{D}_L \cap \{f : \Phi_i(f) = 0\}$ is not dense in $C(I)$, contradicting part (b) of the theorem on p. 6.
We now show that each pair of boundary conditions Φi (f ) = 0,
.
i = 0, 1,
gives rise to a unique strongly continuous Markov semigroup .{Tt } on .C(I ): i.e., the measure .p(t; x, dy) in the Riesz representation of the bounded linear functional .f → Tt f (x) is a transition (sub)-probability on I for each fixed .t ≥ 0, x ∈ I . Theorem 21.6 Let .Φi (f ) = 0 (.i = 0, 1) be nontrivial boundary conditions of the form (21.77) which satisfy (21.78). Let A be the operator defined on DA = DL ∩ {f : Φi (f ) = 0 for i = 0, 1}
.
by (Af )(x) = (Lf )(x), r0 < x < r1 .
.
Then A is the infinitesimal generator of a strongly continuous contraction semigroup on .C(I ) or a proper closed subspace of .C(I ). Proof The proof will be carried out in several steps. First let us normalize the functions .g+ , g− (see (21.13), (21.20)). Write v+ =
.
g+ , g+ (r0 )
v− =
g− . g− (r1 )
(21.87)
Then v+ (r0 ) = 1, v+ (r1 ) = 0; v− (r0 ) = 0, v− (r1 ) = 1.
.
(21.88)
Choose and fix .λ > 0 arbitrarily. Given .f ∈ C(I ) the general solution of (λ − L)u(x) = f (x),
.
r0 < x < r1 ,
(21.89)
is (see Lemma 3) u = Gf + c1 v+ + c2 v−
.
(21.90)
where Gf is defined by (21.45), for suitable constants .ci ≡ ci (f ). The boundary conditions for u may be expressed as Φ0 (v+ ) Φ0 (v− ) c1 Φ0 (Gf ) =− . . Φ1 (v+ ) Φ1 (v− ) c2 Φ1 (Gf )
(21.91)
We need to show that the .2 × 2 matrix on the left is nonsingular. Consider the function v(x) = v+ (x) + v− (x).
.
(21.92)
It satisfies: (i) .Dm Dx+ v = λv > 0, (ii) .v(r0 ) = v(r1 ) = 1. It follows from the strictly increasing property of .Dx+ v (implied by (i) ) that v is strictly convex and, hence, (a) .v(x) < 1 for .r0 < x < r1 , and (b) .Dx+ v(r0 ) < 0, .Dx+ v(r1 ) > 0, i.e., + + + + .Dx v+ (r0 ) < −Dx v− (r0 ), .Dx v+ (r1 ) > −Dx v− (r1 ). Now
.
Φ0 (v+ ) = (α0 + β0 ) + Φ1 (v+ ) =
I
I
1 − v+ (x) dQ0 (x), x − r0
−v+ (x) dQ1 (x), r1 − x
−v− (x) dQ0 (x), I x − r0 1 − v− (x) dQ1 (x), Φ1 (v− ) = (α1 + β1 ) + r1 − x I Φ0 (v− ) =
−v− (x) + (x) By property (a) above, . 1−v x−r0 > | x−r0 | = it then follows that
v− (x) x−r0
(21.93)
for all .x = r0 ; since .α0 +β0 ≥ 0,
Φ0 (v+ ) > |Φ0 (v− )|
(21.94)
α0 = β0 = 0 and Q0 (I − {r0 }) = 0.
(21.95)
Φ1 (v− ) > |Φ1 (v+ )|
(21.96)
α1 = β1 and Q1 (I − {r1 }) = 0.
(21.97)
.
unless .
Similarly, .
unless .
If .α0 = β0 = 0 and .Q0 (I − {r0 }) = 0, then .Q0 ({r0 }) ≡ q0 > 0, and Φ0 (v+ ) = −q0 (Dx+ v+ )(r0 ), Φ0 (v− ) = −q0 (Dx+ v− )(r0 ),
.
and, by property (b) above, Φ0 (v+ ) > |Φ0 (v− )|.
.
(21.98)
Similarly, if .α1 = β1 = 0 and .Q1 (I − {r1 }) = 0, then Φ1 (v− ) > |Φ1 (v1 )|.
.
(21.99)
Thus in all cases, the determinant of the .2 × 2 matrix in (21.91) is positive, so that c1 , c2 are uniquely determined by the boundary conditions (21.91). Hence there exists a unique element .u ∈ DA such that
.
(λ − A)u = f.
.
(21.100)
Next let us show that f ≥ 0 implies u ≥ 0,
.
where u is the solution in .DA of (21.100). For this note that (see Lemma 8) (Gf )(ri ) = 0,
.
i = 0, 1,
and, (see the proof of Lemma 8 and Table 21.1), (Dx+ Gf )(r0 ) ≥ 0, and (Dx+ Gf )(r1 ) ≤ 0.
.
Hence, Φ0 (Gf ) = I . Φ1 (Gf ) = I
−(Gf )(x) x−r0 dQ0 (x) −(Gf )(x) r1 −x dQ1 (x)
≤ 0, ≤ 0.
(21.101)
Now the diagonal elements of the matrix in (21.91) are positive, off-diagonal elements nonpositive, and the determinant of the matrix is positive (by calculations (21.93–(21.99) ). Therefore, the elements of the inverse matrix are all nonnegative. It then follows from (21.91) and (21.101) that .c0 ≥ 0, .c1 ≥ 0. Therefore, .u(x) = (Gf )(x) + c1 v+ (x) + c2 v− (x) ≥ 0 for all x. Finally, one needs to show that the solution u in .DA of (λ − A)u = f
.
satisfies, for each .f ∈ C(I ), the inequality ||u|| ≤
.
1 ||f ||. λ
(21.102)
Because of the fact that .f1 ≥ f2 implies .u1 ≥ u2 , where .ui is the solution in .DA of (λ − A)ui = fi
.
i = 1, 2,
in order to prove (21.102), it is enough to prove (21.102) for .f ≡ 1. Since the function u(x) = λ−1 + k1 v+ (x) + k2 v− (x)
.
(21.103)
is in .DL and satisfies .(λ − L)u(x) = 0 (.r0 < x < r1 ), whatever be the constants k1 , k2 , the unique solution to
.
(λ − A)u = λ−1
.
in .DA is given by (21.103) with .k1 , k2 determined by the equations .
0 = Φ0 (λ−1 ) + k1 Φ0 (v+ ) + k2 Φ0 (v− ) 0 = Φ1 (λ−1 ) + k1 Φ1 (v+ ) + k2 Φ1 (v− ),
i.e., (writing .Δ for the determinant of the .2 × 2 matrix discussed earlier), 1 Φ1 (v− ) −Φ1 (v+ ) −Φ0 (λ−1 ) k1 = . . k2 −Φ1 (λ−1 ) Δ −Φ0 (v− ) Φ0 (v+ )
(21.104)
But .Φ0 (λ−1 ) = (α0 + β0 )λ−1 ≥ 0, .Φ1 (λ−1 ) = (α1 + β1 )λ−1 ≥ 0. Hence .k1 ≤ 0, .k2 ≤ 0 (note that the elements of the .2 × 2 matrix in (21.104) are all nonnegative). Thus .u(x) appearing in (21.103), with .k1 , k2 given by (21.104), satisfies u(x) ≤ λ−1 for r0 ≤ x ≤ r1 .
.
(21.105)
To complete the proof of the theorem, one still needs to prove that .DA is dense in C(I ). However, this follows from the following general fact from functional analysis (Exercise 18).
.
Lemma 10 If .Φ is an unbounded linear functional on a Banach space .X , then its kernel {f ∈ X : Φ(f ) = 0}
.
is dense in .X . In fact, if .Φi (.1 ≤ i ≤ k) are a finite number of unbounded linear functionals, then the set .{f ∈ X : Φi (f ) = 0 ∀ i} is dense in .X The proof of this lemma is outlined in Exercise 18.
Remark 21.6 In the case that one boundary, say .r0 , is accessible, and the other .r1 is inaccessible, then as noted in the proof of Theorem 21.1, for .f ∈ C(I ), .(λ − Dm Dx+ )−1 f (x) = Gf (x) + cv+ (x) for some constant c. If both .r0 and .r1 are inaccessible then, again as shown in the proof of Theorem 21.1, the resolvent is Gf . One may rewrite the boundary condition (21.77) as (see the note following the statement of Theorem 21.5): αi f (ri ) + (−1)
.
i+1
γi (Dx+ f )(ri ) +
I \{ri }
f (ri ) − f (x) dQi (x) + βi (Lf )(ri ) = 0, |ri − x| (21.106)
where .γi = Qi ({ri )}). Definition 21.7 The boundary condition (21.106) is said to be local if the integral is zero, i.e., if .Q(I − {ri }) = 0.
The reason for this nomenclature is clear: a local boundary condition does not restrict the functions away from an arbitrarily small neighborhood of the boundary point. Definition 21.8 The diffusion is conservative if .p(t; x, I ) = 1 for all .t > 0 and all x ∈ I.
.
If .p(t; x, I ) < 1, then one may add a fictitious state .r∞ , say, to the state space and obtain a transition probability (Exercise) on .I = I ∪ {r∞ }: .
p(t; x, B) = p(t; x, B) if x ∈ I, B ∈ B(I ), p(t; x, {r∞ }) = 1 − p(t; x, I ) if x ∈ I, p(t; r∞ , {r∞ }) = 1.
(21.107)
There are many other ways of augmenting p so as to obtain a transition probability. No matter how this is done up to the time of leaving I , the process is determined by p. More precisely, if .{X(t) : t ≥ 0} is the Markov process with such a Feller continuous transition probability .p having right continuous sample paths, letting .r∞ be an added isolated point in .I , while giving I the usual topology, then p is the transition sub-probability corresponding to the semigroup: Tt f (x) = Ex {f (X(t)) · 1[τ >t] },
.
x ∈ I, f bounded and measurable.
(21.108)
To conclude this chapter, we shall consider a number of illustrative and important examples of Feller's boundary conditions and their probabilistic meanings. To this end, let $r_0$ be a regular boundary in Feller's classification (Theorem 21.5), with natural scale. We first consider the local boundary conditions, where the behavior at $r_0$ depends only on the behavior of the diffusion in its immediate vicinity. We assume that $r_1$ is either inaccessible or has a local boundary condition as well. Example 2 (Dirichlet Boundary Condition) We begin with the so-called Dirichlet boundary condition at $r_0$, namely, $f(r_0) = 0$,
.
Φ1 (f ) = 0,
(21.109)
where .Φ1 (f ) = 0 is a local boundary condition at .r1 . This is the case in the general condition (21.106) with .α0 > 0, and all other terms equal to 0 for .i = 0. Then the d d domain .DL of the generator .L = dm dx is contained in .C(I ) ∩ {f ∈ C(I ) : f (r0 ) = 0}. For .f ∈ DL this implies .
d Tt f = LTt f ∈ DL , dt
so that Tt f (r0 ) = 0 for all t.
.
Since .DL contains functions which are positive except at the boundaries, the last condition, namely,
f (y)p(t; r0 , dy) = 0 = lim
.
x→r0 I
I
f (y)p(t; x, dy) = 0 ∀ t,
implies that on reaching .r0 the diffusion .X = {Xt : t ≥ 0} vanishes. In particular, p(t; r0 , I ) = 0. The same argument applies to the condition .f (r1 ) = 0. In particular, if (21.109) holds, then the diffusion is not conservative: .p(t; x, I ) ≤ Px (τ > t), where .τ is the first time the diffusion reaches .r0 , with equality under certain boundary conditions at .r1 .
.
Example 3 (Absorbing Boundary Condition) Next consider the boundary conditions Lf (r0 ) = 0,
.
(21.110)
with .r1 inaccessible or local, i.e., (21.106) with .β0 > 0 and all other terms being zero for .i = 0. The boundary condition at .r0 implies that for .f ∈ DL , .
d Tt f (r0 ) = LTt f (r0 ) = 0. dt
Hence .t → Tt f (r0 ) is a constant, namely, .f (r0 ). Therefore, p(t; r0 , {r0 }) = 1
.
∀ t ≥ 0.
We will call this an absorbing boundary condition. Here the diffusion stays at .r0 forever after reaching it. Once again, the same applies to .r1 with boundary condition .Lf (r1 ) = 0. This terminology is, unfortunately, at variance with that used by Feller (1954), where the Dirichlet boundary condition in which the process vanishes, .f (r0 ) = 0, is called “absorbing”, and the boundary condition in which the process remains at the boundary, .Lf (r0 ) = 0, is called “adhesive”. Our only justification is that in all our previous work, we have used the present terminology. Example 4 (Reflecting Boundary Condition) Another local boundary condition at r0 is given by
.
f (r0+ ) ≡
.
d f (r0+ ) = 0, dx
Φ1 (f ) = 0,
(21.111)
i.e., in (21.106) take .γ0 > 0 and all other terms zero for .i = 0. This is referred to as .r0 being a reflecting boundary. It arises if, in (21.77), .Q0 is the point mass .δr0 (dx) at .r0 , .α0 = 0, β0 = 0. The integrand in (21.77) in this case is defined to be + .f (r ). Here the diffusion returns to .(r0 , r1 ] immediately, and continuously, after 0 reaching .r0 . In Chapter 13 it is shown that such a diffusion may be achieved by
folding a symmetrically extended process around .r0 . In particular, if .r0 is taken to be zero, without loss of generality, then a diffusion .{Xt : t ≥ 0} on .[0, r1 ] with the given scale and speed functions can be extended to .[−r1 , r1 ] by adjusting the scale function so that .s(0) = 0, and letting .s(−x) = −s(x), and .m(−x) = −m(x). By replacing .m(x) by .m(x + ), one can make the left-continuous m on .[−r1 , 0) right-continuous. In any case for continuous m no such adjustment is necessary. Denoting the diffusion with this extended coefficient as .{Yt : t ≥ 0}, the stochastic process .{|Yt | : t ≥ 0} on .[0, r1 ] is shown to be a diffusion with the reflecting boundary condition (21.111) at 0 (Exercise 7). The diffusion is conservative if .r1 is inaccessible. If both boundaries are reflecting, i.e., f (r0+ ) = 0,
.
f (r1− ) = 0,
(21.112)
then by the method of Chapter 13, a pathwise construction may be provided, and the diffusion is conservative. The method of Chapter 13 allows one to provide useful representations of transition probabilities of the transformed diffusion in terms of the original one. In (21.106) this is the case if $\gamma_0 > 0$, $\gamma_1 > 0$ and all other terms are 0. Example 5 (Mixed (or Elastic) Boundary Condition) Our last example is a mixed local boundary, or elastic local boundary, condition at $r_0$ (corresponding to $\alpha_0 > 0$, $\gamma_0 > 0$ and all other terms zero for $i = 0$ in (21.106)), namely, with $\gamma = \alpha_0/\gamma_0$,
Φ1 (f ) = 0.
.
(21.113)
To describe the probabilistic behavior, without loss of generality take .r0 = 0, and recall the local time .(t, ω) of a standard Brownian motion .{Bt : t ≥ 0} at 0 is the same as the local time for the reflected Brownian motion at 0. Here .ω is a sample path of the reflected Brownian motion. Proposition 21.7 Let .X = {Xt = |Bt | : t ≥ 0} be the reflected Brownian motion. Consider the stochastic process: ζ (t, ·) = exp{−2γ (t, ·)},
t ≥ 0.
.
Let $r_\infty$ be an arbitrary symbol, referred to as a "cemetery" state, adjoined to the state space. Define the process Y on $[0, \infty) \cup \{r_\infty\}$ by $Y_t =$
.
Xt
for
r∞
if t ≥ ζ,
t < ζ,
(21.114)
where, conditionally given the .σ -field .Ft generated by .{Xs : 0 ≤ s ≤ t}, Px (ζ > t|Ft ) =
.
1
if t < τ0 ,
exp{−2γ (t − τ0 )}
if t ≥ τ0 .
(21.115)
Here τ0 = inf{s ≥ 0 : Xs = 0}.
.
Then the process .Zt := {Yt : t < ζ } is a diffusion on .[0, ∞) satisfying the boundary condition (21.113) at .r0 . We will postpone the rather long proof of this proposition until after we conclude our next example, namely, that of a nonlocal boundary condition. Example 6 (A Nonlocal Boundary Condition) One may represent the boundary condition (21.106), at .ri as .Φi (f ) ≡ αi f (ri ) − Θi f (x)μi (dx) − (−1)i γi f (ri ) + βi Lf (ri ) = 0, (i = 0, 1), I
where we assume .Θi =
I \{ri }
|ri − x|
(21.116)
−1
Qi (dx) ≡ Θi
μi (dx) < ∞,
(21.117)
I
μi being a probability measure on .I \{ri }. The term .f (ri ) I \{ri } |ri − x|−1 Qi (dx) is absorbed in .αi f (ri ). Consider the case .i = 0, with .Φ1 (f ) = 0, a local boundary condition at .ri , or .r1 is inaccessible. The diffusion on reaching .r0 stays there for a time .η, which is independent of the path up to the time of arrival at .r0 and is exponentially distributed with parameter .α0 . After time .η, it jumps to a point, say .Z1 , in .(r0 , r1 ] randomly, and independently of the past, according to the distribution .μ0 . Then the process starts afresh from the point .Z1 in .(r0 , r1 ], in case of nonabsorption if .Z1 = r1 , until it reaches .r0 again and repeats this process. If .r1 is an absorbing boundary and .Z1 = r1 , then the process stays at .r1 thereafter. The Markovian nature of the process is established by the same argument as used for jump Markov processes in Chapter 4, or by the fact that the process is generated by a contraction semigroup. To check the boundary condition, note that for .f ∈ C(I ), .
Tt f (r0 ) = f (r0 )P (η > t) + α0
e
.
−α0 s
ds
(0,t]
(r0 ,r1 ]
f (x)μ0 (dx),
(21.118)
so that .
d Tt f (r0 ) = −α0 f (r0 )e−α0 t + α0 e−α0 t dt
(r0 ,r1 ]
f (x)μ0 (dx),
(21.119)
and, as a consequence, Lf (r0 ) = −α0 f (r0 ) + α0
.
(r0 ,r1 ]
f (x)μ0 (dx),
(21.120)
which is the same as (21.116) for .i = 0 with .Θ0 = α0 , β0 = 1. If the boundary condition at .r1 is local or .r1 is inaccessible, then this Markov process has continuous sample paths in between jumps from .r0 . We now provide a proof of Proposition 21.7. Proof of Proposition 21.7 Let B be a Borel subset of .[0, ∞). Then, writing .Fs = σ {Xu : 0 ≤ u ≤ s} and .Xs+ as the after-s X process: .(Xs+ )t = Xs+t , and noting that [ζ > s + t] = [ζ > s, ζ (Xs+ ) > t],
.
one has + ) ∈ B, ζ > s, ζ (X + ) > t|F ) = 1 t s [ζ >s] PXs (Xt ∈ B, ζ > t) = Py (Zt ∈ B)|y=Zs , s
.Py ((Xs
(21.121)
proving the Markov property of the diffusion. To check that it satisfies the boundary condition (21.113), we consider the resolvent (Laplace transform) of .t → f (t, ·) = E· f (Zt ) ∈ DL , t ≥ 0, satisfying the boundary conditions (21.113). That is, fˆ(λ, x) =
.
[0,∞)
e−λt Ex f (Zt )dt
= Ex {
[0,∞)
= Ex { = Ex
[0,τ0 ]
[0,∞)
= Ex
[0,∞)
e−λt e−2γ (t) f (Xt )dt} e−λt f (Xt )dt} + Ex e−λt f (Xt )dt − Ex
[τ0 ,∞)
[τ0 ,∞)
(e−λt e
[0,∞)
0
f ((Xτ+0 )t )dt
e−λt f (Xt )dt + Ex {e−λτ0 fˆ(λ, 0)}
e−λt f (Xt )dt − Ex (e−λτ0 E0
= Ex (e−λτ0 (fˆ(λ, 0) − E0
−2γ (Xτ+ )
[0,∞)
e−λt f (Xt )dt) + Ex [e−λτ0 fˆ(λ, 0)
e−λt f (Xt )dt)) + Ex
[0,∞)
e−λt f (Xt )dt.
(21.122) Recall7 that Ex e−λτ0 = e−x
.
√
2λ
,
so that (21.122) becomes fˆ(λ, x) = e−x
.
7 Bhattacharya
√
2λ
(fˆ(λ, 0) − E0
[0,∞)
e−λt f (Xt )dt) + Ex
[0,∞)
e−λt f (Xt )dt
and Waymire (2021), p. 85–86, Proposition 7.15 and Remark 7.8.
(21.123)
Writing g(x) = fˆ(λ, x),
.
one has g(x) = (λ − L)f (x),
.
or, (λ − L)g(x) = f (x),
.
where L is the generator of the diffusion .{Zt : t ≥ 0}. The derivative of g at .x = 0 yields, using the fact that the derivative of the last summand in (21.123) vanishes (which is the boundary condition for the reflecting Brownian motion .{Xt }), .
√ √ d g(x)|x=0 = − 2λg(0) + 2λE0 e−λt f (Xt )dt. dx [0,∞)
(21.124)
Now the last summand on the right in (21.124) can be expressed as (Exercise 9) √ .
2λE0
[0,∞)
e
−λt
f (Xt )dt =
√
[0,∞)
e−
2λy
f (y)dy.
(21.125)
We will now evaluate .g(0) = fˆ(λ, 0) and show that √ √ √ 2λg(0) = { 2λ/(γ + 2λ)}
.
[0,∞)
√
e−
2λy
f (y)dy.
(21.126)
Plugging this and the quantity in (21.125) in (21.124) yields the desired relation (21.113), namely, g (0) = γ g(0).
.
(21.127)
Since every function .g ∈ DL is of the form .(λ − L)−1 f for some f in the center of the semigroup, the proof of the proposition would be complete, once (21.126) is proved. To derive (21.126) we follow the method of Itô-McKean (1965). Consider the alternative representation of the Brownian local time .(t) at 0, also due to Lévy (1948), and proved in Chapter 20. Namely, that .2(t, ·) has the same distribution as the maximum .Mt of a standard Brownian motion .{Bs : s ≥ 0} on .[0, t], starting at zero. Also the reflecting Brownian motion .Xt has the representation (in law)8 as
8 Bhattacharya
and Waymire (2021), Theorem 19.3, p. 230.
Mt − Bt , and the joint distribution is the same as that of .(Mt , Mt − Bt ). So that, using the joint distribution9 of .(Mt , Bt ), one has
.
g(0) = E0
.
=
[0,∞)
[0,∞)
e
e−λt e−γ Mt f (Mt − Bt )dt
−λt
dt
[0,∞)
e−γ z dz
2(2z − x) −(2z−x)2 /2t f (z − x) √ e dx (−∞,z] 2π t 3 √ e−γ z e− 2λ(2z−x) f (z − x)dxdz = ·
[0,∞)
= (γ +
(−∞,z]
√
2λ)
−1
√
[0,∞)
e−
2λz
(21.128)
f (z)dz,
which is the same as (21.126). For the calculation of the time integral in the third line, see Bhattacharya and Waymire (2021), Corollary 7.12, Proposition 7.15, and Remark 7.8. For the integral in the fourth line, make the transformation .(x, z) → y = (z − x, z) to get e
.
[0,∞)
−γ z
=
[0,∞)
√
e−
2λ(2z−x)
f (z − x)dxdz
(−∞,z]
e−2γ z
=
([0,∞)
√
e− (0∞)
e−(
√
2λ+γ )z
√ 2λy − 2λz
e
dz
√
[0,∞)
e−
f (y)dxdy
2λy
f (y)dy.
Remark 21.7 It is clear from the long proof above of a precise probabilistic representation under the boundary condition (21.113), or (21.127), that for general diffusions such a representation would be difficult to obtain. Feller (1954) shows that in the general case, a diffusion with the elastic boundary condition may be constructed as a limiting process. When $r_0$ is reached, let the process jump to a point $x_0$ in the interior with probability $\theta_0$, and with probability $1 - \theta_0$ let the process terminate. With a probability $\kappa_0$, this process then vanishes on return to $r_0$, and with probability $1 - \kappa_0$ continues as described before. One lets $x_0 \to r_0$ and $\theta_0 \to 1$ (while $\kappa_0$ remains fixed); then the limiting process has a boundary condition at $r_0$ of the type (21.127). Feller provides a detailed and rather long derivation of the constant $\gamma$.
9 See
Bhattacharya and Waymire (2021), Corollary 7.12, p. 82.
Remark 21.8 (Time Change) There is a general method, called time change, which transforms Brownian motion .{Bt } into arbitrary diffusions. For a diffusion with inaccessible boundaries in natural scale and speed measure .m(dy)(= dm(y)), let 1 2
M(t) :=
.
(t, y)m(dy), (−∞,∞)
where .(t, y) is the local time of Brownian motion. Let T (t) := max{s : M(s) = t},
.
i.e., T is the inverse of M. Then it may be shown10 that .{Xt := BT (t) } is a diffusion in natural scale on .(−∞, ∞). Similar constructions can be made for diffusions with local boundary conditions on half-lines11 .(0, ∞) or .[0, ∞). The problem of changing the original scale function .s(x) (see, e.g., (21.4)) to natural scale natural scale and carrying out computations in the natural scale is often a little inconvenient. So we summarize the boundary classification in terms of a given scale .s(x) and speed function .m(x) here. (i) A boundary .ri is inaccessible if |
.
[0,ri )
m(x)d + s(x)| = ∞.
(21.129)
Starting from the interior $(r_0, r_1)$, the probability of reaching the boundary $r_i$ is zero. (ii) An accessible boundary $r_i$ is regular if
.
[0,ri )
s(x)dm(x)| < ∞.
(21.130)
(iii) An accessible boundary .ri is an exit boundary if |
.
[0,ri )
s(x)dm(x)| = ∞.
(21.131)
(iv) An inaccessible boundary $r_i$ is natural if, in addition to (21.129), the quantity in (21.131) is also infinite. (v) An inaccessible boundary $r_i$ is an entrance boundary if the quantity in (21.131) is finite. The classification in Lemma 6 remains valid when expressed in terms of the scale function $s(x)$ and speed function $m(x)$.
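The following rough numerical sketch (not part of the original text) applies the criteria (21.129)-(21.131), with s and m anchored at an interior pivot, to two of the examples below (radial Brownian motion and the Wright-Fisher diffusion). The divergence test (growth of the Riemann sums as the cutoff shrinks), the grids, and the cutoffs are heuristic choices.

```python
# Rough numerical sketch (not from the text) of (21.129)-(21.131): with s and m
# anchored at an interior pivot x0, a boundary r0 is inaccessible when
# I1 = |int m ds| diverges near r0; the second integral I2 = |int s dm| then
# separates regular/exit (accessible case) and entrance/natural (inaccessible case).
import numpy as np

def boundary_integrals(s_prime, m_prime, x0, r0, cutoff):
    x = np.linspace(r0 + cutoff, x0, 200001)
    dx = x[1] - x[0]
    s = np.cumsum(s_prime(x)) * dx; s -= s[-1]   # anchored so that s(x0) = 0
    m = np.cumsum(m_prime(x)) * dx; m -= m[-1]   # anchored so that m(x0) = 0
    I1 = np.sum(np.abs(m) * s_prime(x)) * dx
    I2 = np.sum(np.abs(s) * m_prime(x)) * dx
    return I1, I2

for cutoff in [1e-2, 1e-4, 1e-6]:
    # radial Brownian motion, k = 2 (Example 10): s'(x) = 1/x, m'(x) = 2x, boundary r0 = 0
    bessel = boundary_integrals(lambda x: 1.0 / x, lambda x: 2.0 * x, 1.0, 0.0, cutoff)
    # Wright-Fisher diffusion, c = 1 (Example 12): s'(x) = 1, m'(x) = 1/(x(1-x)), r0 = 0
    wf = boundary_integrals(lambda x: np.ones_like(x),
                            lambda x: 1.0 / (x * (1.0 - x)), 0.5, 0.0, cutoff)
    print(cutoff, bessel, wf)
# Radial BM: I1 grows without bound, I2 stabilizes (entrance boundary).
# Wright-Fisher: I1 stabilizes, I2 grows without bound (exit boundary).
```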
Itô and McKean (1965), pp.167–176. Itô and McKean (1963).
394
21 One Dimensional Diffusions by Semigroups
Asymptotic behaviors of one-dimensional diffusions, such as (i) recurrence or transience and (ii) positive recurrence or null recurrence, are as analyzed for nonsingular diffusions with smooth coefficients in Chapter 11. The proofs given there in terms of scale and speed functions apply to Feller’s general diffusions as well. We conclude with a number of examples of one-dimensional diffusions12 Example 7 (Brownian Motion) I = [−∞, ∞],
.
L=
1 d2 . 2 dx 2
Both boundaries are inaccessible and natural. Here .s(x) = x, m(x) = 2x. Example 8 (Ornstein–Uhlenbeck Process) I = [−∞, ∞],
.
L = βx
1 d2 d + σ2 2 dx 2 dx
with .β, σ nonzero constants. Both boundaries .−∞, ∞ are inaccessible and natural. Example 9 (Dirichlet, Absorbing, and Reflecting Boundary Conditions) I = [0, ∞],
.
L=
1 d2 . 2 dx 2
Here 0 is a regular boundary and .∞ is a natural boundary. With boundary condition f (0) = 0, .r0 = 0 is a regular Dirichlet boundary; under .f (0+ ) = 0, 0 is a reflecting boundary; and under .f (0) = 0, .r0 = 0 is an absorbing boundary. In the first case, the diffusion is nonconservative. But in the last two cases, the diffusion is conservative.
.
Example 10 (Radial Brownian Motion) Let k be an integer, .k ≥ 2, and I = [0, ∞],
.
L=
k−1 d 1 d2 . + 2x dx 2 dx 2
Fixing a point, say 1, in the interior of I , one may compute .s(x) and .m(x) as in (21.4) but with the integrals taken over .[1, x). Boundaries are inaccessible, but .r0 = 0 is entrance, while .r1 = ∞ is natural. This process arises as .Xt = |Bt |, t ≥ 0, where .{Bt } is a k-dimensional standard Brownian motion (see Chapter 13), .k > 1. One could include 0 in the state space .(0, ∞); but from the interior the probability of reaching 0 is zero, while starting from zero the process would immediately move to the interior, never returning to 0.
12 See
Mandl (1968), Karlin and Taylor (1981) for more examples.
Exercises
395
Example 11 (Raleigh Process) This is a more general form of Example 10, with I = [0, ∞],
.
L=
β −1 d 1 d2 . + 2x dx 2 dx 2
For .1 ≤ β < 2, 0 is a regular boundary, and a boundary condition needs to be attached (e.g., .f (0) = 0, or .f (0) = 0, etc.) to define the diffusion; for .β ≥ 2, .r0 = 0 is an entrance boundary. For .β < 1, 0 is an exit boundary. Example 12 (Wright–Fisher Gene Frequency Without Mutation) I = [0, 1],
.
L = cx(1 − x)
d2 , c > 0. dx 2
Here .r0 = 0 and .r1 = 1 are accessible and exit boundaries, respectively. Remark 21.9 The Wright–Fisher diffusion model arises in a scaling limit of a discrete time Markov chain in which x and .1 − x are gene frequencies of A-genes and a-genes, respectively, without mutation or selective advantage.
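A small simulation sketch (not part of the original text) illustrates the exit character of the boundaries in Example 12: Euler-Maruyama paths of $dX_t = \sqrt{2cX_t(1 - X_t)}\,dB_t$ started in the interior reach 0 or 1 in finite time. The step size, horizon, and parameter values are arbitrary.

```python
# Simulation sketch (not from the text): paths of the Wright-Fisher diffusion of
# Example 12 (generator c x(1-x) d^2/dx^2) are absorbed at the exit boundaries.
import numpy as np

rng = np.random.default_rng(0)
c, dt, n_steps, n_paths = 1.0, 1e-4, 200000, 20
x = np.full(n_paths, 0.5)
hit_time = np.full(n_paths, np.nan)

for k in range(n_steps):
    alive = np.isnan(hit_time)
    if not alive.any():
        break
    dB = rng.normal(0.0, np.sqrt(dt), size=n_paths)
    x[alive] += np.sqrt(2.0 * c * x[alive] * (1.0 - x[alive])) * dB[alive]
    x = np.clip(x, 0.0, 1.0)
    just_hit = alive & ((x <= 0.0) | (x >= 1.0))
    hit_time[just_hit] = (k + 1) * dt

print("fraction absorbed:", np.mean(~np.isnan(hit_time)))
print("mean absorption time of absorbed paths:", np.nanmean(hit_time))
```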
Exercises 1. Let g, g ,
d dm g
be continuous.
d g = 0, then g is a constant on (r0 , r1 ). (a) Show that if g = 0 on (r0 , r1 ) or dm dg [Hint: Let x < y, g(y) − g(x) = (x,y] dm (z)dm(z) = 0.]
d d (b) Show that for x < y, g(y) − g(x) = (x,y] (x,z] dm g(u)m(du) dz + du
(y − x)g (x). [Hint: The inner integral equals g (z) − g (x).] (c) Let g1 , g2 be two linearly independent solutions of Lg − λg = 0. Show that every solution g is a linear combination of g1 , g2 . [Hint: There exist unique constants c1 , c2 such that c1 g1 (0) + c2 g2 (0) = g(0), and c1 g1 (0) + c2 g2 (0) = g (0); recall that W (g1 , g2 ) is a constant. Then consider W (c1 g1 + c2 g2 , g) and show this implies g = c(c1 g1 + c2 g2 ) for some constant c.]
2. Suppose that {Tt } is a strongly continuous contraction semigroup on C(I ). (a) Prove that x → p(t; x, B) is measurable for every B ∈ B(I ) and t > 0. (b) Prove that the Laplace transform is one-to-one on the space of all bounded continuous functions on [0, ∞). [Hint: Let f be bounded and continuous, and fˆ(λ) = [0,∞) e−λt f (t)dt its Laplace transform on (0, ∞). Then
(−1)k k ˆ(k) (λ) = f (k) (λ) = (−1)k [0,∞) e−λt t k f (t)dt. Consider [λx] k=0 k! λ f Nλ x [0,∞) P ( λ ≤ t )f (t)dt, where Nλ is Poisson distributed with parameter
396
21 One Dimensional Diffusions by Semigroups
λ, so that Nλ /λ converges in distribution to the Dirac measure δ{i} . The last integral converges, as λ → ∞, to [0,x] f (t)dt for all x, and thus determining f . Note that the assumption of continuity may be replaced by measurability with the same conclusion.] 3. Prove that for every ε > 0 and x ∈ (r0 , r1 ), there exists fx ∈ D such that fx (y) ≥ 0 ∀y ∈ I , and (21.63) holds. [Hint: Let g(y) = 1 if |y − x| > ε/2, and g(y) = 0 if |y − x| < 1/2. Now consider the convolution of g with a twice continuously differentiable p.d.f. ϕε with support in {y : |y| ≤ ε/6}.] 4. Suppose {Bt : t ≥ 0} is a standard one-dimensional Brownian motion. It was shown in Chapter 13 that {|Bt | : t ≥ 0} is a Markov process. Prove that the latter process is generated as in Example 3. 5. (a) Calculate s(x) and m(x) in Example 4, with the point 1, instead of 0, as the pivot. (b) Let {Bt : t ≥ 0} be a k-dimensional standard Brownian motion. The radial Brownian motion defined by {|Bt | : t ≥ 0} was shown to be a Markov process in Chapter 13. Show that the generator of this process is as in Example 4. 6. Show that the diffusion in Example 6 is that in Example 5 but with 0 absorbing at 1 vanishing. 7. Suppose that s(·), m(·) are given strictly increasing continuous functions on [0, ∞), s(0) = 0 = m(0). A probabilistic method of constructing a diffusion on [0, ∞) with these scale and speed functions, with reflecting boundary condition f (0+ ) = 0, and with either inaccessible or local boundary condition at ∞, is the following. First extend the functions s(·), m(·) on (−∞, ∞) by letting s(−x) = s(x), m(−x) = −m(x) for x > 0. Let {Xt : t ≥ 0} be a diffusion on (−∞, ∞) with these extended coefficients and the same boundary condition at −∞ as at ∞. Show that the folded process {Yt = |Xt | : t ≥ 0} is a diffusion on [0, ∞) with scale and speed functions as prescribed, boundary condition f (0) = 0, and the given boundary condition at ∞, as mentioned. d2 8. Let L = 12 dx 2 on the open half-line (0, ∞) with absorbing boundary condition ∞ f (0+ ) = 0. Compute (λ − L)−1 f (x) = 0 G(x, y)f (y)m(dy) + c(f )v+ (x), for a suitable constant c(f ). [Hint: Use Lemma 8(b) to see that c(f ) = 0.] 9. (a) Verify (21.125). [Hint: Use Example 1 to get the Laplace transform of √ √ 1 − − 2λ(y−x) 2λ|x−y| the reflected Brownian motion Xt as √ (e 1[0,y) (x)). +e 2 2λ √ − 2λy f (y)dy.] (b) Check (21.128) Then (21.125) is (0,∞) e 10. (Skew Brownian Motion) Fix 0 < α < 1. The α-skew Brownian motion can be defined as the diffusion on R having generator L = dmdd + s for the scale function 2(1 − α)−1 x, if x < 0 , .s(x) = 2α −1 x, if x ≥ 0
Exercises
397
and speed function m(x) =
.
(1 − α)x,
if x < 0
αx,
if x ≥ 0
.
Show that .DL
= {f ∈ C(R) : (1 − α)f (0− ) = αf (0+ ), f exists & continuous onR\{0}}.
11. (Branching Diffusion) Compute the scale and speed functions, and verify the boundary conditions for the diffusion having generator of the form I = d d2 [0, 1], L = β dx 2 + αx dx , where α, β are positive constants. 12. Assume natural scale, but suppose that the speed measure has a positive mass at x0 . For f ∈ DL , show that f (x0+ ) − f (x0− ) = I h(y)m(dy) for some h ∈ C(I ). [Hint: f ∈ DL is in the range of the resolvent.] 13. Assume s(x) = x, x ∈ (−∞, ∞), with inaccessible boundaries ±∞. Assume G(x, y) = G(−x, y) for all x, y. (i) Show G(x, y) = G(−x, −y) = G(x, −y) for all x, y. [Hint: Use symmetry of G(x, y).] (ii) Show that p(t; x, dy) = p(t; −x, dy), x ≥ 0. 14. (Slowly Reflecting (or Sticky) Point at 0) Let s(x) = x, m(dx) = 2dx + 2γ δ{0} (dx), −∞ < x < ∞, for a given parameter γ > 0. The speed function with m(0) = 0 is m(x) =
.
2x
if x ≥ 0,
2x − 2γ
if x < 0
.
Fix an arbitrary resolvent parameter λ > 0. (i) Show
√ cosh( 2λx) x ≥ 0 .g(x; λ) = √ √ √ cosh( 2λx) − 2λγ sinh( 2λx) x < 0. [Hint: Make the recursive computation in (21.18).] (ii) Show g+ (x) =
.
√
e−
2λx √ e− 2λx
√ √ − γ 2λ sinh( 2λx)
if x ≥ 0 if x < 0,
398
21 One Dimensional Diffusions by Semigroups
and g− (x) =
√
e
.
e
√
2λx
+γ
√ √ 2λ sinh( 2λx)
if x ≥ 0,
2λx
if x < 0.
W (g− , g+ ) = 2(λγ +
.
√
,
2λ).
[Hint: The integrals involving hyperbolic trigonometric functions are computable in closed form by simple substitution methods, and G(x, y) is unchanged by scaling g+ or g− by nonzero constants.] (iii) Show that ⎧ √ √ 2λ − 2λ|y−x| ⎪ ⎨ 2+γ √ e −
√ √ √γ 2λ e− 2λ|x+y| , x · y ≥ 0 2λ+γ λ) 2λ+γ λ) 4( 4( √ .G(x, y) = ⎪ ⎩ √ 1 e− 2λ|y−x| , x · y < 0. 2( 2λ+γ λ)
(iv) Show that G(x, y) is unchanged in the case of the natural scale function s(x) = x, and odd speed function m(x) =
.
2x + γ
if x ≥ 0,
2x − γ
if x < 0
.
[Hint: Adding a constant to the speed function does not change the speed measure.] (v) Show13 that the set {t ≥ 0 : Xt = 0} of zeroes of sticky Brownian motion has positive Lebesgue ∞ measure. [Hint: Consider (21.58) and (21.59), and check that E0 0 e−λt 1[Xt ∈R\{0}] dt < λ1 by an exact calculation of √ √ 2λ 1 . Note that R G(0, y)1R\{0} (y)m(dy) = R G(0, y)m(dy) = λ 2λ+γ λ
1 λ .]
15. Consider Feller’s diffusion X on the half-line I = [0, ∞) with natural scale d2 and speed functions s(x) = x, m(x) = 2x, x > 0, i.e., L = 12 dx 2 on (0, ∞), respectively, with boundary condition f (0+) = γf (0+), for a positive nonnegative constant γ . This is often viewed as an interpolation between reflecting and absorbing boundary conditions. For γ = 0 one obtains a reflecting boundary at x = 0, i.e., f (0+ ) = 0, and absorption may be viewed as the case γ = ∞, i.e., f (0+ ) = γ1 f (0+ ) = 0. Show that the zero set {t ≥ 0 : Xt = 0} has positive Lebesgue measure for 0 < γ < ∞. [Hint: Note
13 Slowly
reflecting behavior was discovered by Feller (1952). See Peskir (2015) for a historical account. See Salminen and Stenlund (2021) for this and other interesting examples. Also see Bass (2014) for an analysis of the corresponding stochastic differential equation.
(0)+γ (Gh) (0) . From that (λ−L)−1 f (x) = Gf (x)+cg+ (x), where c = − −(Gh) (0)+γ g (0) −g+ + here apply the same hint as in Exercise 14.] d2 2 16. Let I = [0, ∞), and L = 12 dx 2 on C (I ) together with each case of Feller’s general boundary condition, αf (0) − γf (0) + βf (0) = 0, α, β, γ ≥ 0, α + β + γ = 1, enumerated below. In each case, describe the respective sample path behavior of the restricted Brownian motion upon reaching the boundary r0 = 0. (i) f (0) = 0. (ii) f (0) = 0. (iii) f (0) = 0. (iv) αf (0) − γf (0) = 0, (αβ > 0). (v) βf (0) = γf (0), (βγ > 0). [Hint: Review the chapter examples and Exercise 15.] 17. Check that the proofs of (i) transience or recurrence and (ii) null recurrence or positive recurrence, given in Chapter 8, apply to Feller’s general onedimensional diffusions. 18. (a) Let X be a Banach space and f an unbounded linear functional on X . Prove that the kernel of f is dense in X . [Hint: Suppose x ∈ / ker(f ). There is a sequence {xn } such that |xn | = 1, xn > n for all n = 1, 2, . . . . Consider yn = x − ff(x(x)n ) xn . Then yn ∈ ker(f ) and yn → x.] (b) Prove that the linear functionals f (ri ) and Lf (ri ) are unbounded in C(I ). (c) In the case of a Dirichlet condition f (ri ) = 0, the semigroup is strongly continuous on a proper closed subspace C0 (I ) of C(I ).
Chapter 22
Eigenfunction Expansions of Transition Probabilities for One-Dimensional Diffusions
We consider one-dimensional non-singular diffusions with Dirichlet or Neumann (reflecting) boundary conditions. The generators of these diffusions are self-adjoint on .L2 ([r0 , r1 ], m(dx)), where .m(dx) is the speed measure on the state space .I = [r0 , r1 ]. This leads to eigenfunction expansions of the transition probability densities of the diffusions with these generators.
2
d Let .L = dmd + s be the infinitesimal generator of a Feller diffusion on .C(I ), .I = [r0 , r1 ], with Dirichlet or Neumann (reflecting) boundary conditions on .ri (i = 0, 1). Here, .s(x) is the scale function, and .m(x) is the speed function, as defined in Chapter 21, and .m(dx) is the speed measure with density .m (x). Recall that .r0 , r1 are finite in this case, as are .s(x) and .m(x). Feller’s results apply under more general regular boundary conditions, but, for simplicity, we only consider the Dirichlet and reflecting boundary conditions. We will denote by .DL the domain of the generator of the corresponding diffusion on .L2 ([r0 , r1 ], m(dx)), extending it from the generator on .C(I ) considered in Chapter 21. Let .p(t; x, dy) denote the transition probability, or sub-probability (depending on the boundary condition). This also defines the semigroup .{Tt : t ≥ 0}, on .L2 ([r0 , r1 ], m(dx)):
$T_t f(x) = \int_{[r_0, r_1]} f(y)\,p(t; x, dy), \qquad f \in L^2([r_0, r_1], m(dx)).$   (22.1)
It is a strongly continuous contraction semigroup on the Hilbert space $L^2[r_0, r_1] := L^2([r_0, r_1], m(dx))$, or on a subspace $L^2_0[r_0, r_1]$ if at least one of the boundaries has the Dirichlet boundary condition. Strong continuity holds, since on the Banach space $C(I)$, the sup norm $\|(\lambda - L)^{-1}\|$ is larger (i.e., no smaller) than the $L^2$-norm
||(λ − L)−1 ||2 restricted to .C(I ), and .||(λ − L)−1 || ≤ λ1 , as proved in Chapter 21. It follows from the Hille-Yosida Theorem that .{Tt : t ≥ 0} is a contraction semigroup in .L2 [r0 , r1 ]; also see Exercise 1 for another proof of contraction. For simplicity, we will make the following:
.
Assumption : s(x) and m(x) are continuously differentiable.
.
(22.2)
Theorem 22.1 Under Dirichlet or Neumann boundary conditions, the semigroup {Tt : t ≥ 0} and its generator are self-adjoint in .L2 [r0 , r1 ], or a proper subspace of it in case at least one of the boundaries has Dirichlet condition.
.
Proof Let .f, g ∈ DL , under Dirichlet or Neumann boundary conditions on .ri ( i.e., f (ri ) = 0 or .f (ri ) = 0, and same for g), .i = 0, 1. Then, writing the inner product in .L2 [r0 , r1 ], as .f, g, one has
.
Lf, g =
.
= = = =
[r0 ,r1 ]
(
d2 f )(x)g(x)m(dx) dmds
[r0 ,r1 ]
1 d ( f (x))g(x)m(dx) dm s (x)
[r0 ,r1 ]
m (x)
[r0 ,r1 ]
1 d f (x)]g(x)dx [ dx s (x)
[r0 ,r1 ]
1 f (x)g (x)dx, s (x)
1
d 1 [ f (x)]g(x)m (x)dx dx s (x)
(22.3)
using the boundary conditions .f (ri )g(ri ) = 0 for .i = 0, 1. The last expression in (22.3) is symmetric in f and g. The next result provides for eigenfunction expansions of transition probability densities under the Dirichlet or Neumann boundary conditions. Theorem 22.2 Under Dirichlet or Neumann boundary conditions on .ri (i = 0, 1): (a) For all .λ > 0, the resolvent operator .f → (λ − L)−1 f = [0,∞) e−λt Tt f dt is compact on .L2 [r0 , r1 ] or on a proper subspace of it in the case of Dirichlet boundary conditions. (b) The transition probability, or sub-probability, density .p(t; x, y) with respect to .m(dy) has an absolutely convergent expansion . n≥0 exp{αn t}ϕn (x)ϕn (y), where .αn are the nonpositive eigenvalues of L with corresponding normalized eigenfunctions .ϕn . The convergence is uniform on .I × I . with kernel . n≥0 G(x, y), say, Proof The self-adjoint Green’s operator .f → Gf under the boundary conditions imposed, is smaller than .G(x, y), as argued in the proof of Theorem 21.6. Therefore, by a standard theorem in functional analysis is a compact self-adjoint (see Appendix A in Bhattacharya and Waymire (2022)), .G
operator on .H = L2 [r0 , r1 ], or a proper closed subspace .H0 of it in case of Dirichlet −λt p(t; x, y)dt, and it has an y) = boundary condition. Note also that .G(x, [0,∞) e expansion .
βn ϕn (x)ϕn (y),
n≥0
where $\beta_n$ are the nonnegative eigenvalues of the Green's operator $\hat G$ with corresponding normalized eigenfunctions $\varphi_n$ ($n \ge 0$), $\sum_n \beta_n^2 < \infty$ (Theorem A.3 in Appendix A, Bhattacharya and Waymire (2022)). The orthonormal sequence is complete in the Hilbert space, because the domain of $\hat G$ is dense in the Hilbert space. Because $\hat G = (\lambda - L)^{-1}$, and for $u \in \mathcal{D}_L$, $u = (\lambda - L)^{-1} f$, one has
$u(x) = \sum_n \beta_n\,\varphi_n(x)\int_I \varphi_n(y) f(y)\,m(dy),$
and Lu(x) = λu(x) − f (x) =
.
(λβn − 1)f, ϕn ϕn (x). n
In particular, Lϕn (x) = (λβn − 1)ϕn (x).
.
That is, .αn := λβn − 1 are eigenvalues of L with corresponding eigenfunctions .ϕn (n = 0, 1, . . . ). Next consider the equation
.
.
d Tt f = LTt f = Tt Lf dt
Specializing to .f = ϕn , one obtains exp{tαn }ϕn (∀n). Hence Tt f =
.
.
(f ∈ DL ).
d dt Tt ϕn
= αn Tt ϕn , yielding .Tt ϕn =
exp{αn t}ϕn (x)
ϕn (y)f (y)m(dy). I
n≥0
This holds for all f in the Hilbert space H or .H0 , as the case may be. It follows that the density .p(t; x, y) (with respect to .m(dy)) has the expansion p(t; x, y) =
.
exp{αn t}ϕn (x)ϕn (y),
(22.4)
n≥0
the convergence being absolute and uniform.
404
22 Eigenfunction Expansions of Transition Probabilities
Example 1 (Brownian Motion with Two Reflecting Boundaries) In this case, .I = [0, d], .x0 = 0, .s(x) = x, .m(x) = 2x/σ 2 , and L=
.
σ 2 d2 , 2 dx 2
(22.5)
with the backward boundary condition d =0 . dx x=0,d
(t > 0; y ∈ I ).
(22.6)
We seek eigenvalues .α of L and corresponding eigenfunctions .ϕ. That is, consider solutions of .
1 2 σ ϕ (x) = αϕ(x), . 2
(22.7)
ϕ (0) = ϕ (d) = 0.
(22.8)
Check that the functions ϕn (x) = bm cos(nπ
.
x ) d
(n = 0, 1, 2, . . . ),
(22.9)
satisfy (22.7), (22.8), with .α given by α = αn = −
.
n2 π 2 σ 2 2d 2
(n = 0, 1, 2, . . . ).
(22.10)
Now .m (x) = 2/σ 2 . Since this is a constant, one may use dy in place of dm in Theorems 22.1, 22.2. Using .
d
d
12 dx = d,
0
cos2 (
0
d nπ x )dx = , 2 d
(22.11)
the normalizing constant .bn is given by bn =
.
2 d
(n ≥ 1);
b0 =
1 . d
(22.12)
To explicitly see that (22.9) and (22.10) provide all the eigenvalues and eigenfunctions, one can check the completeness of .{ϕn : n = 0, 1, 2, . . . } in .L2 ([0, d], dy). This may be done by using Fourier series (Exercise 9). Thus, for .t > 0, 0 < x, y < d,
22 Eigenfunction Expansions of Transition Probabilities ∞
p(t; x, y) =
.
405
eαn t ϕn (x)ϕn (y)
n=0 ∞ σ 2 π 2 n2 nπ x 1 2 nπy exp{− t} cos( + ) cos( ). 2 d d d d 2d
=
n=1
(22.13) Notice that lim p(t; x, y) =
.
t→∞
1 , d
(22.14)
the convergence being exponentially fast, uniformly with respect to .x, y. Thus, as .t → ∞, the distribution of the process at time t converges to the uniform distribution on .[0, d]. In particular, this limiting distribution provides the invariant initial distribution for the process. Example 2 (Brownian Motion with Two Absorbing Boundaries) Here .I = [0, d], − 2μ x
2
2
2μ
x
σ σ σ 2 −1), and the transition probability (1−e σ 2 ), .m(x) = μ x0 = 0, .s(x) = 2μ 2 (e has a density .p(t; x, y) on .0 < x, y < d.
.
L=
.
d σ 2 d2 +μ , 2 dx 2 dx
(22.15)
with backward boundary conditions on .u ∈ DL , u(0) = 0 u(d) = 0.
.
(22.16)
for .t > 0. Check that the functions ϕn (x) = bn exp{−
.
nπ x μx ) } sin( d σ2
(n = 1, 2, . . . )
(22.17)
are eigenfunctions corresponding to eigenvalues α = αn = −
.
n2 π 2 σ 2 μ2 − , 2d 2 2σ 2
(n = 1, 2, . . . ).
(22.18)
(0 ≤ x ≤ d),
(22.19)
Since m (x) = exp{
.
the normalizing constants are
2μx } σ2
406
22 Eigenfunction Expansions of Transition Probabilities
bn =
.
2 d
(n = 1, 2, . . . ).
(22.20)
The eigenfunction expansion of the density function .p(t; x, y) with respect to m(dy) = m (y)dy is therefore
.
p(t; x, y) =
∞
.
eαn t ϕn (x)ϕn (y)π(y)
n=1 ∞
−μ(y + x) μ2 t n2 π 2 σ 2 t 2 } exp{− } exp{− } = exp{ d σ2 2σ 2 2d 2 n=1
nπy nπ x ) sin( ) × sin( d d
(t > 0, 0 < x, y < d).
(22.21)
The density .p(t; ˜ x, y), say, with respect to Lebesgue measure dy is p(t; ˜ x, y) = p(t; x, y)m (y)
.
= p(t; x, y)2 exp{ =
2μy } σ2
μ2 μ 2 exp{μ(y − x)/σ 2 } exp{− 2 } exp{− 2 t} d 2σ 2σ ∞ 2πy 2π x n2 π 2 σ 2 ) cos( ), t} sin( exp{− × d d 2d 2
(22.22)
n=1
for .t > 0, 0 < x, y < d. It remains to calculate p(t; x, {0}) = Px (Xt = 0),
.
p(t; x, {d}) = Px (Xt = d),
0 < x < d. (22.23)
Note that p(t; x, {0}) + p(t; x, {d}) = 1 −
p(t; ˜ x, y)dy
.
(0,d)
μx 2 exp{− 2 } d σ ∞ nπ x n2 π 2 σ 2 t ) } sin( an exp{− × 2 d 2d
= 1−
n=1
(22.24) where
22 Eigenfunction Expansions of Transition Probabilities
d
an =
exp{
.
0
407
nπy μy )dy. } cos( 2 d σ
(22.25)
Thus, it is enough to determine .p(t; x, {0}); the function .p(t; x, {d}) is then determined from (22.24). Thus, .p(t; x, {0}) = ϕ(x) − ϕ(y)p(t; ˜ x, y)dy (22.26) (0,d)
where .ϕ(x) = Px ({Xt } reaches 0 before .d). In particular, ⎧ ⎨ d−x if μ = 0 d .ϕ(x) = 2} ⎩ 1−exp{2μ(d−x)/σ if μ = 0. 1−exp{2dμ/σ 2 }
(22.27)
Substituting (22.21), (22.27) in (22.26) yields .p(t; x, {0}). As further application of the eigenfunction expansions on compact intervals, one may obtain formulae on the half-line by passage to a limit as illustrated by the examples to follow. Example 3 (Brownian Motion on Half-line with One Reflecting Boundary) Let .I = [0, ∞), .x0 = 0, .s(x) = x, .m(x) = σ22 x. We seek the transition density p with respect to Lebesgue measure, since .m (y) is a constant. Then one has σ 2 d2 , 2 dx 2
L=
.
I = [0, ∞),
d |x=0 = 0. dx
(22.28)
This diffusion is the limiting form of that in Example 1 as .d ↑ ∞. Letting .d ↑ ∞ in (22.13) one has ∞ nπ x nπy 1 σ 2 π 2 (n/d)2 t} cos( ) cos( ) exp{− d↑∞ d 2 d d n=−∞
p(t; x, y) = lim
.
=
∞ −∞
1 = 2
1 = 2π =
exp{−
tσ 2 π 2 u2 } cos(π xu) cos(πyu)du 2
∞
−∞
exp{−
∞
−∞
tσ 2 π 2 u2 }[cos(π(x + y)u) + cos(π(x − y)u)]du 2
(e−iξ(x+y) + e−iξ(x−y) ) exp{−
tσ 2 ξ 2 }dξ, 2
(ξ = π u)
1 (x + y)2 (x − y)2 (exp{− } + exp{− }) (2π tσ 2 )1/2 2tσ 2 2tσ 2 (t > 0, 0 ≤ x, y < ∞).(22.29)
408
22 Eigenfunction Expansions of Transition Probabilities
The last equality in (22.29) follows from Fourier inversion and the fact that exp{−tσ 2 ξ 2 /2} is the characteristic function of the normal distribution with mean zero and variance .tσ 2 .
.
Example 4 (Brownian Motion on Half-line with Absorbing Boundary) .I = [0, ∞), 2
− 2μ x
2
2μ
x
σ σ σ2 x0 = 0, .s(x) = 2μ (1 − e σ 2 ), .m(x) = μ − 1). The transition density 2 (e function .p(t; ˜ x, y) with respect to Lebesgue measure for .x, y ∈ (0, ∞) is obtained as a limit of (22.21) as .d ↑ ∞. That is,
.
.p(t; x, y)
∞ μ(y − x) μ2 t π 2 σ 2 tu2 } sin(π xu) sin(πyu)du − } exp{− 2 σ2 2σ 2 0 ∞ μ(y − x) μ2 t σ 2 tξ 2 exp{ } sin(ξ x) sin(ξy)dξ − } exp{− 2 2 2 σ 2σ 0 ∞ μ(y − x) μ2 t σ 2 tξ 2 exp{ } − } exp{− 2 σ2 2σ 2 0
= 2 exp{ =
2 π
=
1 π
×[cos(ξ(x − y)) − cos(ξ(x + y))]dξ ∞ 2 2 μ(y − x) μ2 t 1 exp{ − } (e−iξ(x−y) − e−iξ(x+y) )e−σ tξ /2 dξ = 2 2 2π σ 2σ −∞ = exp{
μ2 t 1 μ(y − x) − } 2 σ 2σ 2 (2π σ 2 t)1/2
×[exp{−
(x + y)2 (x − y)2 } − exp{− }]. 2 2σ t 2σ 2 t
(22.30)
Integration of (22.30) with respect to y over .(0, ∞) yields .p(t; x, {0}c ), which is the probability that, starting from x, the first passage time to zero is greater than t. Note that Examples 2 and 4 yield the distributions of the maximum, and the minimum and the joint distribution of the maximum, minimum, and the state at time t of a diffusion with constant coefficients over the time period .[0, t] (Exercises 4–5).
Exercises 1. (a) Prove that the semigroup {Tt : t ≥ 0} under reflecting boundary 2 conditions at r0 , r1 , is contracting in L2 ([r 0 , r21 ], dm). [Hint: ||Tt f ||2 = 2 ( f (y)p(t; x, dy)) m(dx) = = I I f (y)p(t; x, dy)m(dx) I I2 f (y)m(dy).] I
Exercises
409
(b) Prove that if at least one of ri has the Dirichlet boundary condition, while the other has a Dirichlet or Neumann boundary condition, then {Tt : t ≥ 0} is a contracting semigroup on the closed subspace H0 ⊂ L2 [r0 , r1 ]. [Hint: The transition probability is smaller than that in part (a).] 2. Let p(t; x, dy) be the transition probability of a Markov process on a state space (S, S) which is strongly continuous and contracting on an appropriate real Banach space X . Prove that the spectrum of the semigroup is contained in (−∞, 0]. [Hint: Every λ > 0 belongs to the resolvent set.] 3. Assume that the nonzero eigenvalues of L are bounded away from zero. (i) In the case of two reflecting boundaries, prove that .
lim p(t; x, y) = π(y)
t→∞
(x, y ∈ [0, d])
where π(y) = m (y)/m(I ). Show that this convergence is exponentially fast, uniformly in x, y. (ii) Prove that π(y)dy is the unique invariant probability for p. (iii) Prove (ii) without the assumption concerning eigenvalues. 4. Use Example 4 to find (i) The distribution of the minimum mt of a Brownian motion {Xs } over the time interval [0, t] (ii) The distribution of the maximum Mt of {Xt } over [0, t] (iii) The joint distribution of (Xt , mt ) 5. Use Example 2 to compute the joint distribution of (Xt , mt , Mt ), using the notation of Exercise 4 above. 6. For a Brownian motion with drift μ and diffusion coefficient σ 2 > 0, compute the p.d.f of τa ∧ τb when X0 = x and a < x < b. 7. Establish the identities .
∞
(2π σ 2 t)−1/2
[exp{−
n=−∞
=
(2nd + y + x)2 (2nd + y − x)2 }] } + exp{− 2 2σ 2 t 2σ t
∞ nπy 2 1 σ 2 π 2 n2 t nπ x ) cos( ), + exp{− } cos( d d d d 2d 2 n=1
∞
(2π σ 2 t)−1/2
[exp{−
n=−∞
=
2 d
∞ n=1
exp{−
(2nd + y + x)2 (2nd + y − x)2 }] } − exp{− 2σ 2 t 2σ 2 t
nπy nπ x σ 2 π 2 n2 t ) sin( ), (0 ≤ x, y ≤ d). } sin( d d 2d 2
410
22 Eigenfunction Expansions of Transition Probabilities
Use these to derive Jacobi’s identity for the theta function: θ (z) :=
.
∞
∞ 1 1 1 exp{−π n2 /z} ≡ √ θ ( ), z > 0). exp{−π n2 z} + √ z n=−∞ z z n=−∞
[Hint: Compare (22.13) and (13.49).] 8. (Hermite Polynomials and the Ornstein-Uhlenbeck Process) generator L = 12 d 2 /dx 2 − x d/dx on the state space R1 .
Consider the
(i) Prove that L is symmetric on an appropriate dense subspace of 2 L2 (R1 , e−x dx). (ii) Check that L has eigenfunctions Hn (x) := (−1)n exp{x 2 }(d n /dx n ) exp{−x 2 },
.
the so-called Hermite polynomials, with corresponding eigenvalues n = 0, −1, −2, . . . . (iii) Give some justification for the expansion p(t; x, y) = e−y
2
∞
.
n=0
cn e−nt Hn (x)Hn (y),
1 cn := √ n . π 2 n!
9. According to the theory of Fourier series, the functions cos nx (n = 0, 1, 2, . . . ) and sin nx (n = 1, 2, . . . ) form a complete orthogonal system in L2 ([−π, π ], dx). Use this to prove the following: (i) The functions cos nx (n = 0, 1, 2, . . . ) form a complete orthogonal sequence in L2 ([0, π ], dx). [Hint: Let f ∈ L2 ([0, π ], dx). Make an even extension of f to [−π, π ], and show that this f may be expanded in L2 ([−π, π ], dx) in terms of cos nx (n = 0, 1, . . . ).] (ii) The functions sin x (n = 1, 2, . . . ) form a complete orthogonal sequence in L2 ([0, π ], dx). [Hint: Extend f to [−π, π ] by setting f (−x) = −f (x) for x ∈ [−π, 0).]
Chapter 23
Special Topic: The Martingale Problem
The martingale problem as formulated by Stroock and Varadhan provides an alternative approach to semigroup theory when defining a diffusion with a given infinitesimal generator by requiring a large class of functions of the process satisfy a natural martingale condition related to the infinitesimal operator for Markov processes. The basic elements of this theory is presented with an application to the approximation of a one-dimensional diffusion by a scaled birth-and-death Markov chain.
To appreciate a basic role of martingales in the theory of Markov processes, consider a Markov semigroup, i.e., a strongly continuous contraction semigroup .{Tt : t ≥ 0} on the Banach space .Cb (S), where S is Polish. The following result has been referred to as Dynkin’s martingale in Chapter 17. Theorem 23.1 If .f ∈ DA , the domain of the generator A of the Markov processes {X(t) : t ≥ 0}, then .M(t) = f (X(t)) − [0,t] Af (X(u))du, t ≥ 0, is a martingale
.
Remark 23.1 One may note by reversing the proof that the martingale relation is equivalent to the semigroup relation .Tt f − f = (0,t] Ts Af ds (for all f in .DA ). In his classic work, Feller (1954) constructed the most general one-dimensional (regular) Markov processes with continuous sample paths by means of semigroup theory and also found the most general boundary conditions for them (see Chapter 21). For multidimensional diffusions, however, semigroup theory for constructing diffusions is not as convenient as that by Itô’s seminal theory of stochastic differential equations.
© Springer Nature Switzerland AG 2023 R. Bhattacharya, E. C. Waymire, Continuous Parameter Markov Processes and Stochastic Differential Equations, Graduate Texts in Mathematics 299, https://doi.org/10.1007/978-3-031-33296-8_23
411
412
23 Special Topic: The Martingale Problem
In this chapter we provide a brief introduction1 to the so-called martingale problem for diffusions due to Stroock and Varadhan ((1969), (1979)), extending Itô’s theory to possibly nonsmooth coefficients. Here, informally, the generator is a second-order elliptic operator: A= .
μi (x)
∂2 1 ∂ aij (x) ; + 2 ∂xi ∂xj ∂xi 1≤i,j ≤k
1≤i≤k
(23.1)
[a(x) = ((aij (x)))] a(x) being a nonnegative definite matrix for each .x = (x1 , . . . , xk ) ∈ S = Rk . In Itô’s seminal theory of stochastic differential equations (SDE) presented in Chapters 6–8, the Markov process governed by A is constructed as the solution of the SDE, expressed equivalently in terms of .μ = (μ1 , . . . , μk ), σ (·)σ (·) = a(·), σ (·) = ((σij (·)))1≤i,j ≤k by
.
dX(t) = μ(X(t))dt + σ (X(t))dB(t), .X(t) = X(0) + μ(X(s))ds + σ (X(s))dB(s), .
[0,t]
[0,t]
(23.2) (23.3)
or, dXi (t) = μi (X(t))dt +
.
σij (X(t))dBj (t) (1 ≤ i ≤ k).
(23.4)
1≤j ≤k
Here .B = (B1 , . . . , Bk ) is a standard k-dimensional Brownian motion, and .X(0) is independent of the Brownian motion. .f (X(t))− Under the assumption that .μ and .σ are Lipschitz (or locally Lipschitz), 2 k Af (X(s))ds is a martingale (local martingale) for every .f ∈ C (R ), the set b [0,t] k of all real-valued bounded functions on .R having bounded derivatives of first and second order. In the absence of the smoothness, assumptions made on .μ and .σ the stochastic integral in (23.3) may still make sense, but the proof of the existence of a pathwise unique solution may break down. Tanaka’s one-dimensional equation dX(t) = sgn X(t)dB(t), X(0) = 0,
.
(23.5)
provides a well-known example. Let .W (t), .t ≥ 0 be a standard Brownian motion starting at 0. Define the stochastic integral
1 In
addition to that of its originators, Stroock and Varadhan (1979), a more thorough treatment than the present chapter can be found in the text by Ethier and Kurtz (1985).
23 Special Topic: The Martingale Problem
413
Y (t) =
.
[0,t]
sgn(W (s))dW (s),
dY (t) = sgn W (t)dW (t).
(23.6)
By Lévy’s Theorem 9.4, because Y has quadratic variation t on .[0, t], it is a standard Brownian motion. Note also that . [0,t] sgn(W (s))dY (s) = 2 [0,t] sgn (W (s))dW (s) = W (t). Thus, W is a solution to Tanaka’s SDE: dW (t) = sgn W (t)dY (t),
.
(23.7)
but driven by the Brownian motion Y , not B. One can of course choose B or W in (23.6) to get B as the solution .dB(t) = sgn B(t)dY (t). Apart from the fact that one has to construct a new driving Brownian motion Y , the solution W is not pathwise unique. For example, .−W (t), .t ≥ 0 is another solution. However, the two solutions have the same law. Although there does not seem to be a universal agreement on nomenclature, we shall call a solution of an SDE which may not be pathwise unique or may not be driven by the underlying Brownian motion, a weak solution. The martingale problem of Stroock and Varadhan (1969,1979) may be stated as follows, generalizing the martingale formulation indicated at the outset (by the semigroup formalism). Definition 23.1 (Martingale Problem) Let the second order operator A be as in (23.1). Let .C[0, ∞) ≡ C([0, ∞) : Rk ) be the set of all continuous .Rk -valued functions on .[0, ∞), .B the .σ -field of all Borel measurable subsets of the metric space .C[0, ∞) endowed with the topology of uniform convergence on the bounded intervals of .[0, ∞). Denote the canonical filtration .Bt = σ {x(u), 0 ≤ u, ≤ t} where .x(u) is the projection on the uth coordinate of .C[0, ∞). Then a solution to the (local) martingale problem for .(x, A) is a probability P on .(C[0, ∞), B) such that 2 k .P (x(0) = x) = 1 and for all .f ∈ C (R ), f (x(t)) −
.
[0,t]
Af (x(s))ds is a local {Bt } martingale under P .
(23.8)
Here “local” means .f (x(t ∧ τn )) − [0,t∧τn ] Af (x(s))ds is a .{Bt }-martingale for all .n ∈ N, .τn := inf{t ≥ 0 : |x(t)| ≥ n}. In more detail one says .(C([0, ∞), {Bt }, B, P ) is a solution to the local martingale problem for .(x, A). Remark 23.2 The lower case notation for the formulation adopted here is merely a mathematical convenience, indicating the use of the canonical model .Ω = C([0, ∞)). Let the coefficients .μ and .σ be measurable and locally bounded. It follows from Itô’s Lemma that if .X(t), .t ≥ 0 is a weak solution to the SDE (23.2) with .X(0) = x, then its distribution P (on .(C[0, ∞), B) is a solution to the local martingale problem .(x, A). The converse is also true. Namely,
414
23 Special Topic: The Martingale Problem
Proposition 23.2 If there exists an augmented solution-space .(Ω, F, P ), with a filtration .{Bt } such that (23.8) holds, then there exists a filtration .{Ft } and an .{Ft }adapted Brownian motion .Bt , .t ≥ 0 and a .{Ft }-adapted process .Xt such that (23.2) holds. Proof To prove this, for simplicity, we assume the diffusion matrix to be nonsingular. First let .σ (·) be the positive square root of .a(·) (obtained from the spectral decomposition of .a(·)). Consider on .(C[0, ∞), {Bt }, B, P ), M(t) = x(t) − x(0) −
.
[0,t]
μ(x(s))ds,
(23.9)
where .M(t) = (M 1 (t), . . . , M k (t)). It follows from (23.8), with .f (x) = xi that .M i is a local martingale (.i = 1, . . . , k). Also consider the local martingales corresponding to .f (x) = xi xj in (23.8). That is, (ij )
.Mt
:= xi (t)xj (t) − xi (0)xj (0) −
[0,t]
[xj (s)μi (x(s)) + xi (s)μj (x(s)) + ai j (x(s))]ds (23.10)
is a local martingale for every pair .(i, j ). This may be expressed as (see Remark 19.5, and Exercise 1)2 (ij ) i Mt = xi (0)xj (0) + xj (0)xi (0) + xj (s)M (ds) + xi (s)M j (ds) [0,t]
[0,t]
.
+ M i , M j t −
[0,t]
aij (x(s))ds,
i j . M , M t
(23.11) i .M ,
j (see where is the quadratic covariation of the martingales .M i j Definition 19.7). Then . M , M t − [0,t] aij (x(s))ds is a continuous local martingale of locally finite variation, starting at 0; hence it equals 0 a.s. That is, i j . M , M t = [0,t] aij (x(s))ds, almost surely. Then one has
Wt : =
.
Wti
=
[0,t]
σ (x(s))−1 M(ds),
1≤j ≤k [0,t]
σ ij (x(s))M j (ds); ((σ ij (·))) = σ −1 (·),
d i d j Wt , Wt = σ i (x(t))σ j (x(t)) M i , M j = δij . dt dt 1≤≤k
2 See
Scheutzow (2019) for a more general result.
23 Special Topic: The Martingale Problem
415
Hence .Wt is a standard k-dimensional Brownian motion, .M(dt) = σ (x(t))dWt , and (23.9) reduces to .x(t) = x(0) + μ(x(s))ds + σ (x(s))dWs . [0,t]
[0,t]
Having established the equivalence of the (local) existence of a weak solution to that of a solution to the (local) martingale problem for .(x, A) for every initial .x, we briefly address the question of existence of a solution and uniqueness. Itô’s theory shows if the coefficients .μ(·) and .σ (·) are locally Lipschitz, then this solution exists and is (pathwise) unique up to explosion time (see Chapter 12). Note that, in Itô’s theory, it is not necessary to assume that .σ (·) or .a(·) are positive definite. For the relationship between the smoothness of .σ (·) and that of .a(·), we refer to Friedman (1975, Chapter 6). According to standard PDE theory (see Friedman, loc. cit.), in case (i) .a(·) is uniformly elliptic, i.e., the eigenvalues are bounded away from zero and infinity, (ii) .μ(·) and .a(·) are locally Hölder continuous, there exists a smooth transition probability density of the diffusion with these coefficients satisfying Kolmogorov’s backward and forward equations. Of course for the existence and uniqueness of the diffusion (possibly with explosion) with a continuous probability density, it is enough to have the preceding conditions (i) and (ii), hold locally. Using their martingale method and some hard analysis, Stroock and Varadhan (1979) prove the following : Theorem 23.3 (Stroock and Varadhan) If .μ(·) is measurable and bounded on compacts, and if the smallest eigenvalue of .a(·) is bounded away from zero on compacts, then there exists a (local) weak solution, which is unique in law for .(x, A) for every .x. Recall that the uniqueness implies the Markov property of the (weak) solution (see the proof of Theorem 7.2). When the (local) martingale problem for .(x, A) has a solution (for every initial .x) that is unique in law, we say that the problem is well posed. The martingale method has been used by Stroock and Varadhan (1979, Theorem 11.2.3) to derive convergence of scaled Markov chains to diffusions3 Following them, for each fixed .h > 0, let .ph (x, dz) be the transition probability 1 of a Markov chain .{Xn } on the lattice .ΔZ, .Δ = h 2 , in the canonical model {0,...,∞} .(ΔZ , Fmh = σ {x(j h) : j = 0, . . . , m}). By .x(j h) we mean the (random) j th iterate of this chain. As in the case of the FCLT (Chapter 17), let .Phx denote the distribution of the polygonal process on .(C[0, ∞) : R) for a Markov chain with transition probability .ph (x, dy) starting at x. Here .Ω has the usual topology of uniform convergence on compact intervals of .[0, ∞). That is,
3 See
Durrett (1996) Chapter 8, Sec 7, for a nice treatment of convergence for both discrete and continuous parameter Markov chains with interesting examples.
416
23 Special Topic: The Martingale Problem
(i) .Phx (x(0) = x) = 1 x(mh) + (t−mh) (ii) .Phx [x(t) = (m+1)h−t) Δ x((m + 1)h), mh ≤ t ≤ (m + 1)h] = Δ 1 for all m = 0, 1, . . . (iii) The conditional distribution of .x((m+1)h) given .Fmh is .ph (j Δ, dy) on the set .{x(·) : x(mh) = j Δ}. Denote by .Ah the linear operator on the space .Cb (R) given by Ah f (x) =
(f (x) − f (z))ph (x, dz),
.
(23.12)
and note that f (x(j h)) −
Ah f (x(kh)), (j = 0, 1, . . . ) is a {Fj h }-martingale
.
0≤k≤j −1
(23.13) under the Markov chain, starting at .x(0) or, equivalently, under .Phx 1 μh = h .
1 ah (x) = h
(Exercise 2). Let
{|z−x|≤1}
(z − x)ph (x, dz)
(23.14) (z − x) ph (x, dz). 2
{|z−x|≤1}
Proposition 23.4 Assume that for all .R > 0 there exist functions .μ(·) and .a(·) such that (i) .limh↓0 sup{|x|≤R} |μh (x) − μ(x)| = 0, (ii) .limh↓0 sup{|x|≤R} |ah (x) − a(x)| = 0 (iii) .limh↓0 sup{|x|≤R} h1 |x − z|3 ph (x, dz) = 0 2
d d ∞ Define the operator .L = μ(x) dx + 21 a(x) dx 2 . Then for every .f ∈ C0 (the space 1 of infinitely differentiable functions vanishing at infinity), . h AAh f (x) → Lf (x) as .h ↓ 0, uniformly on compact subsets of .R.
Proof Consider the Taylor expansion .f (z) = f (x) + f2 (x, z) + r(x, z), 1 f2 (x, z) = (z − x)f (x) + (z − x)2 f (x), 2 .
(23.15)
|r(x, z)| ≤ c(f )|z − x| , 3
c(f ) being a constant depending on f . Let
.
Lh = μh (x)
.
1 d d2 + ah (x) 2 . 2 dx dx
(23.16)
23 Special Topic: The Martingale Problem
417
It is simple to check, using definitions (23.12), (23.14), and (23.16), the Taylor expansion, and (23.4)(iii) that 1 c(f ) | Ah f (x) − Lh f (x)| ≤ h h
.
|z − x|3 ph (x, dz) → 0,
(23.17)
uniformly on compacts as .h → 0. The proposition now follows from the assumptions (23.4)(i)–(iii). Theorem 23.5 In addition to (23.4)(i)–(iii), assume: 1 .(iii)’: lim sup |x − z|3 ph (x, dz) = 0. h↓0 x∈R h Also, assume .
sup {ah (x) + |μh (x)|} < ∞
(23.18)
h≥0, x∈R h converges weakly to and that the initial point .x(0) = xh → x0 as .h ↓ 0. Then .Pxh the distribution of the diffusion with drift .μ(·) and diffusion coefficient .a(·), starting at x, provided the martingale problem for .(x, L) is well posed.
Proof Note that, for any .ε > 0, h Pxh (|x(m + h) − x(mh)| ≥ ε) ≤ sup
.
|x − z|3 ph (x, dz) ≤ h · o(h),
x∈R
as .h ↓ 0. Therefore .
Phxh (x(m + h) − x(mh)| ≥ ε) → 0 as h ↓ 0 for all T > 0.
(23.19)
0≤m≤T / h
We now invoke a result adapted to Markov chains (Stroock and Varadhan (1979, n Theorem 1.4.11) to conclude that (23.19) implies that .{Phxh : n = 1, 2, . . . } is pren compact, for every sequence .hn ↓ 0. In view of Proposition 23.4, it now follows that every limit point is the distribution of a diffusion with drift .μ(·) and diffusion coefficient .a(·) starting at .x0 provided the martingale problem for .(x, L) is well posed for all x. Example 5 (Birth-and-Death Approximation of Diffusions) Let .μ(·) and .a(·) be bounded and Lipschitz, .a(·) bounded away from zero. Consider the birth–death chain on .ΔZ, .Δ = h1/2 , with transition probabilities:
418
23 Special Topic: The Martingale Problem
.
ph (iΔ, (i − 1)Δ) =
a(iΔ) μ(iΔ)Δ =: δi (h) − 2a0 2
ph (iΔ, (i + 1)Δ) =
a(iΔ) μ(iΔ)Δ =: βi (h), + 2 2a0
ph (iΔ, iΔ) = 1 − βi (h) − δi (h). Here, .a0 = supx∈R a(x). Note that, for sufficiently small .h > 0, .ph (iΔ, j Δ) are probabilities, .(j = i − 1, i, i + 1). Then, with .x = iΔ, .
μh (iΔ) =
1 h
{|z−x|≤1}
(z − x)ph (x, dz) = μ(iΔ),
a(iΔ) 1 ah (iΔ) = (z − x)2 ph (x, dz) = h {|z−x|≤1} a0 1 lim sup |x − z|3 ph (x, dz) = 0. h↓0 x∈R h For .f ∈ C0∞ , at .x = iΔ, 1 1 . Ah (f (x)) = h h
1 a(iΔ) +o(1) (f (z)−f (x))ph (x, dz) = μ(iΔ)f (x)+ f (x) 2 2a0
as .h ↓ 0, uniformly on .R. If we take .a0 = 1, then the desired convergence follows. In general, one can use a simple scaling to get the desired result. The boundedness restriction on .a(·) and .μ(·) may be relaxed by localization, still requiring that .a(·) is bounded away from zero on compacts, and both .μ(·) and .a(·) are Lipschitz. Remark 23.3 As an alternative to the martingale method, one may prove the approximation in Example 5 and also that of the transition probability density of the diffusion, by appealing to numerical analysis, laid out in detail in Bhattacharya and Waymire (1990, 2009), pp. 386–389.
Exercises 1. Verify Equation (23.11). 2. Show that f (x(j h)) − 0≤k≤j −1 Ah f (x(kh)) for j = 0, 1, . . . is a {Fj h }martingale under the Markov chain, starting at x(0) (see Equation (23.13)). 3. Suppose that P is a unique solution to the martingale problem for (x, A). Let ψt (B) = P ({x ∈ C0 [0∞) : x(t) ∈ B}), B ∈ B(Rk ), t ≥ 0. (i) Show that {ψ sense that t : t ≥ 0} is a (weak) solution to the forward equation in the 2 (Rk ), t ≥ 0. f (y)ψ (dy) = f (y)ψ (dx) + ψ (y)Af (y)ds, f ∈ C k k t 0 s 0 R R [0,t]
Exercises
419
[Hint: Take expected values in M(t).] (ii) Suppose that ψt (dy) = ϕ(t, y)dy, t > 0 has a twice continuously differentiable density ϕ(t, y) with respect to Lebesgue measure. Verify Kolmogorov’s forward equation for f ∈ C02 (Rk ): .
k k k ∂ϕ 1 ∂ 2 (aij (y)ϕ(t, y)) ∂(μ(i) (y)ϕ(t, y)) − ]f (y)dy. [ (t, y) = ∂t ∂y (i) ∂y (j ) ∂y (i) Rk 2 i=1 j =1 i=1
[Hint: Use integration-by-parts as in Chapter 15.]
Chapter 24
Special Topic: Multiphase Homogenization for Transport in Periodic Media
There have been numerous studies, especially in the hydrology literature, mostly empirical, showing that the diffusivity of a solute concentration immersed in a fluid flow increases with spatial scale, sometimes at a rather large rate. This serves to motivate the mathematical theory presented in this chapter. Striking examples are provided to demonstrate the effect of scale as one successively passes from small to larger scales. The main results in this chapter are based on Bhattacharya and Goetze (Bernoulli 1:81–123, 1995) and Bhattacharya (Ann Appl Probab 951–1020, 1999).
Thermal and electric conduction in composite media and diffusion of matter through them are problems of great importance.1 Examples of such composite media are natural heterogeneous material such as soils, polycrystals, wood, animal and plant tissue, fiber composites, etc. One of the applications of interest in this chapter is the transport of solutes, such as pollutants, in an aquifer. The concentration .c(t, y) of matter at time t at a point y in such problems satisfies a Fokker-Planck equation: .
∂c(t, y)/∂t =
∂ ∂2 1 (Dij (y)c(t, y)) − (vi (y)c(t, y)), ∂yi ∂yj ∂yi 2 1≤i,j ≤k
1≤i≤k
c(0, dy) = δx (dy).
1 See Ben Arous
(24.1)
and Owhadi (2003), Bensoussan et al. (1978), and Fried and Combarnous (1971).
© Springer Nature Switzerland AG 2023 R. Bhattacharya, E. C. Waymire, Continuous Parameter Markov Processes and Stochastic Differential Equations, Graduate Texts in Mathematics 299, https://doi.org/10.1007/978-3-031-33296-8_24
421
422
24 ST: Multiphase Homogenization
Here .D(·) = ((Dij (·)))1≤i,j ≤k is a .k × k positive definite matrix-valued function depending on the local properties of the medium, and .v(·) is a vector field which arises from other sources. Throughout the spatial dimension is .k > 1. The initial concentration may be more general than a point source at some point x. One may also think of (24.1) as the equation of transport, or diffusion, of a substance in a turbulent fluid. Indeed, in a classic study, Richardson (1926) analyzed data on atmospheric diffusion over 12 or more different orders of spatial scale and conjectured the relationship DL ∝ L(t)4/3 ,
.
DL (t) := L(t)2 /t,
(24.2)
where .DL is the diffusivity at spatial scale L and .L(t) is the root-mean squared distance from the mean flow. To be specific, diffusivity is dispersion per unit time, or the expected squared distance traveled in a unit of time due to dispersion. This relationship was later shown by Batchelor (1952) and Obukhov (1941) to be consistent with Kolmogorov’s dimensional analysis showing that the velocity spectrum of turbulent fluid is proportional to .L1/3 . The velocity v we consider is smooth and not turbulent. There have been numerous studies, especially in the hydrology literature, mostly empirical, (see, e.g., Fried and Combarnous (1971)), showing that the diffusivity of a solute concentration increases with spatial scale, sometimes at a rather large rate. Some theoretical studies such as Winter et al. (1984), Gelhar and Axness (1983), have looked at the dispersion assuming that the velocity v is an ergodic random field and analyzed the relationship between the correlation structure and distance. The present study provides, in particular, the relationship between .L(t) and .DL (t) at different spatial scales for the transport of a substance, under a smooth velocity v composed of a local component b and a large-scale component .β: v(y) = b(y) + β(y/a).
(24.3)
.
Leaving aside the beginning part (Theorems 24.1, 24.2), b and .β, and therefore v are assumed to be divergence-free, i.e., .divv = 0, thereby preventing local accumulation, and for the most part, b and .β are assumed to be (spatially) periodic. The spatial parameter “a” is assumed to be “large” compared to the local scale, and precise asymptotics in law are provided over times t in relation to “a.” Mathematically, or even physically, (24.1) represents the diffusion of solute particles in a composite medium governed by the stochastic differential equation: dX(t) = [b(X(t)) + β(X(t)/a)]dt + σ (X(t))dB(t),
.
X(0) = 0,
(24.4)
where .σ (·) is the positive square root of .D(·) and .{B(t) : t ≥ 0} is a standard kdimensional Brownian motion. The velocity v is assumed to be of the form (24.3). We write .x = (x1 , . . . , xk ), b = (b1 , . . . , bk ), β = (β1 , . . . βk ).
24 ST: Multiphase Homogenization
423
We begin with two interesting examples to illustrate the theory underlying the evolution of concentration through Gaussian and non-Gaussian phases and different diffusivities as time progresses. Example 1 Let .k = 2 and .{B(t) = (B1 (t), B2 (t)) : t ≥ 0} a two-dimensional standard Brownian motion. Consider the diffusion process .{X(t) = (X1 (t), X2 (t)) : t ≥ 0}, defined by dX1 (t) = [c1 + c2 sin(2π X2 (t)) + c3 cos(2π X2 (t))/a)]dt + dB1 (t),
.
dX2 (t) = dB2 (t),
X(0) = (0, 0).
(24.5)
The constants .cj (j = 1, 2, 3) are all nonzero. Here N denotes the Normal distribution, .L stands for law or distribution, .⇒ denotes convergence in law, and .⇒ .L(Z) denotes convergence in law to the distribution of a random variable Z. Also, the ratio of two sides of .∼ converges to one. In this example, k = 2, b1 (y) = c1 + c2 sin(2πy2 ),
.
β1 (y) = c3 cos(2πy2 /a),
while .b2 (y) = β2 (y) = 0. One could make it a k-dimensional example, .k > 2, by letting .bj (y) = βj (y) = 0, for .2 < j ≤ k, with noise .dBj (t). Then .Xj (t), 2 < j ≤ k, would just be independent Brownian motions, with diffusivity one in these coordinates. This is an example of stratified media, with the growth in diffusivity only in the first coordinate. Table 24.1 displays this growth with time. The first line represents an asymptotic Normal distribution around the mean flow. In this range of time the large-scale effect can be ignored. Due to periodicity, Z(t) := X(t) mod2π
.
is an ergodic diffusion on .T1 , the unit two-dimensional torus, with the uniform invariant distribution. Hence the mean flow, namely, mean of .b(Z(t)), over time t converges to the phase average . b of .b(·) with respect to the uniform distribution on the unit torus. The stochastic integral is then the √ integral of a martingale with stationary ergodic increments and, when divided by . t, converges to the Normal
Table 24.1 Scale regimes for Example 1. Time scale a , the diffusion is again asymptotically normal but with a large diffusivity. After this the diffusion settles down to this final Gaussian phase with diffusivity gradually going down, in the absence of any further source of heterogeneity. If one expresses the relationship between L and .DL as D L ∝ Lθ ,
.
(24.6)
then .θ grows from 0 to 1 with time and then starts going down to 0 after the final phase (Exercise 11). Note that if in the final phase one lets .t∼a 2+γ for some .γ ≥ 0, then 2
DL ∼a 2 ∼t 2+γ .
.
Therefore, .θ attains the maximum value 1 at .γ = 0 and decreases as .γ increases. The range in the first line, namely, .1 > a 2 , the diffusivity does not really grow, except for a constant term. Theorem 24.3 analyzes the precise mathematical distinction between the two cases. First Phase Asymptotics Without essential loss of generality, we may take as the initial point .x = 0. This essentially means that the concentration spreads starting with a point mass injection at a point we label 0. One may assume a somewhat more general concentration profile which has not spread very far. Let .{X(t) : t ≥ 0} be as defined in (24.5). We use constants .ci , ci , di , etc., independent of t and a. Theorem 24.1 Let .x0 = 0. Assume that .b(·) and its first order derivatives are bounded, and .β(·), D(·) and their first and second-order derivatives are bounded. Let .X(t) be as in (24.4), and let .Y (t) be defined by dY (t) = [b(Y (t)) + β(0)]dt + σ (Y (t))dB(t),
.
Y (0) = 0,
(24.8)
which ignores the large-scale variability. Let .Pt be the distribution of .{X(s) : 0 ≤ s ≤ t} and .Qt the distribution of .{Y (s) : 0 ≤ s ≤ t}. (i) Then there exist constants .dj (j = 1, . . . , 5) such that for .a, t > 1, .
||Pt − Qt ||T V ≤ d1
t 3/2 t 3/2 t t 3/2 t 3/2 t 3/2 + d2 + d3 ≤ d4 2 + d2 ≤ d5 . 2 a a a a a a
(24.9)
Here .|| · ||T V is total variation norm. (ii) In the special case .bj = 0 for .2 ≤ j ≤ k, ∂βj .βj (0) = 0 for .2 ≤ j ≤ k and . ∂x1 = 0 for .1 ≤ j ≤ k, one has .d2 = 0, so that ||Pt − Qt ||T V ≤ d1
.
t 3/2 t + d3 . 2 a a
(24.10)
If, in addition, as in Example 1, .∇β(0) = 0, then one may take .d3 = 0 in (24.10). Proof By the Cameron–Martin–Girsanov Theorem (see Theorem 9.9), the RadonNikodym derivative of .Pt with respect to .Qt is given by .eVt , where
426
24 ST: Multiphase Homogenization
Vt =
.
[0,t]
σ −1 (Y (s)){β(Y (s)/a) − β(Y (0)/a)}dB(s)
−(1/2)
[0,t]
|σ −1 (Y (s)){β(Y (s)/a) − β(Y (0)/a)}|2 ds.
Since .{eVt : t ≥ 0} is a martingale, .EeVt = 1, and + − 0 = E 1 − eVt = E 1 − eVt − E 1 − eVt .
.
Also .(1 − ex )+ = 0 if .x > 0, and .1 − ex ≤ |x| ∧ 1 for .x ≤ 0. Therefore, ||Pt − Qt ||T V = E|1 − eVt | = 2E(1 − eVt )+ ≤ 2E(|Vt | ∧ 1).
.
(24.11)
Now E|Vt | ≤ I1 (t) + I2 (t),
.
where denoting the set of eigenvalues of D(y) := σ (y)σ (y),
.
by .ρ(D(y)), and λ = inf ρ(D(y)),
.
y∈Rk
one has (see (7.42)), .
I1 (t) = E| 2
=E
[0,t]
[0,t]
σ −1 (Y (s)){β(Y (s)/a) − β(Y (0)/a)}dB(s)|2
|σ −1 (Y (s)){β(Y (s)/a) − β(Y (0)/a)}ds|2
≤ (1/λ)
[0,t]
E|{β(Y (s)/a) − β(Y (0)/a)}ds|2 ,
I2 (t) = (1/2) ≤ (1/2λ)
[0,t]
[0,t]
E|σ −1 (Y (s)){β(Y (s)/a) − β(Y (0)/a)}|2 ds
E|{β(Y (s)/a) − β(Y (0)/a)}ds|2 .
By Itô’s Lemma (Theorem 8.4), denoting ∇ = grad,
.
(24.12)
24 ST: Multiphase Homogenization
427
one has .
βj (Y (s)/a) − βj (Y (0)/a)
= L0 βj (·/a)Y (s )ds + [0,s]
[0,s]
∇βj (·/a)(Y (s )) · σ (Y (s ))dB(s ), (24.13)
where .L0 is the generator of the diffusion .{Y (t) : t ≥ 0}, and recalling .D(y) = ((Dii (y))) = σ (y)σ (y), L0 = (b(·) + β(0)) · ∇ +
.
.
1 2
Dii (·)
1≤i,,i ≤k
∂2 , ∂yi ∂yi
L0 (βj (·)/a)(Y (s )
= (1/2a 2 )
Dii (Y (s )
1≤i,i ≤k
∂ 2 βj (Y (s )/a ∂yi ∂yi
+(1/a)(b(Y (s )) + β(0)) · ∇βj (·)(Y (s )/a).
(24.14)
Therefore, denoting the Riemann integral on the right side of (24.13) by .J1j (s) and the stochastic integral by .J2j (s), one has E(βj (Y (s)/a) − βj (Y (0)/a))2 ≤ 2EJ1j (s)2 + 2EJ2j (s)2 .
.
(24.15)
Letting .|| · ||∞ denote the supremum of the Euclidean norm of a real-, vector- or matrix-valued function on .Rk , we have .
EJ1j (s)2 ≤ s 2 [(c1 /a 4 )||D||2∞ || max i,i
∂ 2 βj (·)||2∞ + (2/a 2 )s 2 ||(b(·) + β(0)) · ∇βj (·)||2∞ ], ∂yi ∂yi
EJ2j (s)2 ≤ (s/a 2 )||D(·)||∞ ||∇βj (·)||2∞ .
.
Thus
.
[0,t] 1≤j ≤k
(EJ1j (s)2 + EJ2j (s)2 )ds
≤ (t 3 /a 4 )c2 ||D||2∞
1≤j ≤k
|| max i,i
∂ 2 βj (·)||2∞ ∂yi ∂yi
(24.16)
428
24 ST: Multiphase Homogenization
+(2t 3 /3a 2 )
1≤j ≤k
||b(·) + β(0)) · ∇βj (·)||2∞
+(t 2 /2a 2 )||D(·)||∞
1≤j ≤k
||∇βj (·)||2∞ = I (t),
(24.17)
say. Hence, noting (24.12), (24.15)–(24.17), we get √ E|Vt | ≤ I1 (t) + I2 (t) ≤ (1/ λ) I (t) + (1/2λ)I (t).
.
(24.18)
From this it follows (see Exercise 1) that E(|Vt | ∧ 1) ≤ c3 I (t).
.
Thus (Exercise 1(ii)) E(|Vt | ∧ 1) ≤ c3 I (t) ≤ d1 t 3/2 /a 2 + d2 t 3/2 /a + d3 (t/a).
.
(24.19)
For part (ii), notice that, under the added conditions, the middle term in the right side of (24.17) and, therefore, in (24.19) vanishes. Theorem 24.1 leads to a first phase asymptotic Gaussian law, where .X(·) can be approximated by .Y (·). We will assume that the local velocity .b(·) and the non-singular matrix .σ (·) are differentiable and periodic with period one in each coordinate, i.e., with period lattice .Zk . Then the process Z(t) := Y (t) mod1
.
is a diffusion on the unit torus T1 = {(y1 mod1, y2 mod1, . . . , yk mod1) : y ∈ Rk }.
.
We denote its generator also by .L0 (see (24.14)) but restricted to periodic functions with period one. It has been shown in Chapter 17 that the functional central limit theorem holds for the process .Y (·) (Theorem 17.8), centered at .( b + β(0))t, and it is asymptotically Gaussian having limiting dispersion matrix K, where . b is the mean of .b(Z(t)) with respect to the invariant distribution .π(y)dy of .Z(t), .
b=
b1 , . . . , bk ), T1 b(y)π(y)dy = (
K=
T1
(gradψ(x) − Ik )D(x)(gradψ(x) − Ik )T π(x)dx,
(24.20)
where the components of .ψ(y) = (ψ1 (y), . . . , ψk (y)) are unique mean zero (periodic) solutions of
24 ST: Multiphase Homogenization
429
L0 ψj (y) = bj (y) − bj ,
.
j = 1, . . . , k.
(24.21)
Also .Ik is the .k × k identity matrix. We thus have the following result. Theorem 24.2 Assume that .b(·) is continuously differentiable, .σ (·) is Lipschitzian, and both are periodic of period one in each coordinate, as is the continuously differentiable matrix .σ . Also assume that .β has continuous and bounded first and second derivatives. a. Then in the limit as n → ∞, a → ∞, such that n/a 2/3 → 0,
(24.22)
√ [X(nt) − nt ( b + β(0))]/ n,
(24.23)
.
the process .
t ≥ 0,
starting at 0, converges in law to a k-dimensional Brownian motion with a dispersion matrix K given by (24.20). b. Under the additional hypothesis in Theorem 24.1(ii), the convergence holds over the wider time scale .1 t, one only needs to evaluate the asymptotic distribution of √
c3 /(a t)
.
[0,t]
cos(2π B2 (s)/a)ds.
For this phase, we take “a” to be a positive integer. Since the function f (y) = −(c3 a 2 /2π 2 ) cos(2πy/a)
.
(24.25)
430
24 ST: Multiphase Homogenization
satisfies .
1
f (y) = c3 cos(2πy/a), 2
Itô’s lemma shows that (24.25) equals .
√ (1/a t)(c3 a 2 /2π 2 )[cos(2π B2 (t)/a) − 1] √ (c3 a/π ) sin(2π B2 (s)/a)dB2 (s) +(1/a t)) √ ≈ (c3 /π t)
[0,t]
[0,t]
sin(2π B2 (s)/a)dB2 (s),
(24.26)
where .≈ indicates that the two sides differ by a quantity which converges to zero. Note that the first term on the left in (24.26) goes to zero, since .t >> a 2 . Now the stochastic integraI .I (t) is a martingale whose quadratic variation is Q(t) = (c32 /π 2 )(1/t)
.
=L
(c32 /π 2 )(1/t)
[0,t]
[0,t]
sin2 (2π B2 (s)/a)ds sin2 (2π B2 (s/a 2 )ds
= (c32 /π 2 )(1/t/a 2 ) → (c32 /π 2 )
[0,1]
[0,t/a 2 ]
sin2 (2π B2 (u))du
sin2 (2πy)dy = (c32 /2π 2 ),
(24.27)
almost surely, as .t/a 2 → ∞, by the ergodic theorem (Exercise 6). Note that .2π B2 (u) mod1 is a diffusion on the unit torus which converges in variation norm to the invariant probability, namely, the uniform distribution, exponentially fast as .u → ∞, uniformly with respect to the initial distribution. Thus (24.27) implies the desired result, via the martingale CLT (see, e.g., Bhattacharya and Waymire (2022), Theorem 15.5) (Exercise 6). Another way to derive the convergence to the Gaussian law is via time change using stochastic integrals (see Corollary 9.5) (Exercise 8). In Example 2, the computations for the middle and last rows of Table 24.2 are a little more elaborate, as we show now. Example 4 (Example 2 Asymptotics) As pointed out above, the convergence in law to the Gaussian in the first row of Table 24.2 follows by direct computation. Here we present the computations for the intermediate phase in row 2 in some detail. As before, we let .X(0) = 0. One then has
24 ST: Multiphase Homogenization
.
431
X1 (t) − t (c1 + c3 ) √ t 1 1 1 = c2 √ sin(2π Z(s))ds + c3 √ [cos(2π Z(s))/a)] − 1]ds + √ B1 (t), t [0,t] t [0,t] t
(24.28) where .X2 (s) = B2 (s) + sδ and .Z(s) = X2 (s) mod1 is a diffusion on the unit two-dimensional torus .T1 . Now .{Z(s) : s ≥ 0} is a Markov process on the onedimensional torus .[0, 1], with 0 and 1 identified and the distribution of .Z(s) converges exponentially fast in variation norm, as .s → ∞, to its invariant probability, namely, the uniform distribution. Note also that, under the invariant distribution, .E sin(2π Z(s))
=
[0,1]
sin(2πy)dy = 0.
It follows from Theorem 17.1 that √
.(1/
t)
[0,t]
sin(2π Z(s))ds ⇒ N (0, ν), ν = −2
[0,1]
(sin 2πy)u(y)dy,
where .u(y) is the mean-zero solution of .
1
u (y) + δu (y) = sin 2πy. 2
A direct computation shows .u(y)
= −(2δ/(π 2 + δ 2 )[(cos 2πy)/2π + sin 2πy/2δ].
Plugging this in the expression for .ν , one gets .ν
= (2δ/(π 2 + δ 2 ))(1/2δ)
[0,1]
(sin2 2πy)dy = 1/(2(π 2 + δ 2 ).
With the contribution from the independent last term in (24.28), one arrives at the desired result, if the middle term on the right in (24.25) goes to zero. Proof of this is indicated in Exercise 5. We now turn to the asymptotics for .t >> a 2 . Use Itô’s Lemma to write .
X1 (t) − X1 (0) − c1 t B1 (t) 1 w(X2 (t)) − w(X2 (0)) w (X2 (s))dB2 (s)) + √ −√ = , √ √ t t [0,t] t t
(24.29)
where w is the mean-zero (under the invariant probability of .Z(t)) solution of .
1
w (y) + δw (y) = c2 sin 2πy + c3 cos(2πy/a). 2
432
24 ST: Multiphase Homogenization
By a direct computation, one obtains w(y) = −(c2 δ/(π 2 + δ 2 ))[(cos 2πy)/2π + (sin 2πy)/2δ]
.
+(c2 δa 3 /(δ 2 a 2 + π 2 )[sin(2πy/a)/2π − cos(2πy/a)/2δa],
.
w (y) = −(c2 δ/(π 2 + δ 2 ))[(sin 2πy) + π(cos 2πy)/δ] +(c2 δa 2 /(π 2 + δ 2 a 2 )[cos(2πy/a) + (π/δa) sin(2πy/a)] = I1 (y) + I2 (y),
say. Since w is bounded, the first term on the right in (24.29) can be neglected. Also, omitting the last term in .w (y) above, which is of the order .1/a , the term involving the stochastic integral on the right side of (24.29) is a martingale √
.(1/
t)
[0,t]
(I1 (X2 (s) + I2 (X2 (s))dB2 (s),
and its quadratic variation is .
1 I12 (X2 (s)) + I22 (X2 (s)) + 2I1 (X2 (s))I2 (X2 (s)) ds. t [0,t]
As argued in the case (24.27), .
1 I 2 (X2 (s))ds → c22 /2(π 2 + δ 2 ) as t → ∞. t [0,t] 1
Also, .
1 1 I22 (X2 (s)) = (c3 /δ)2 cos2 (2π(B2 (s)/a + sδ/a))ds t [0,t] t [0,t] cos2 (2π(B2 (s/a 2 ) + sδ/a))ds = L(c3 /δ)2 (1/t)) [0,t]
= (c3 /δ)2 (1/(t/a 2 ))
[0,t/a 2 ]
cos2 (2π(B2 (s ) + as δ))ds
→ c32 /2δ 2 in probability as t/a 2 → ∞.
For the Brownian motion .{2π B2 (u) mod1 : u ≥ 0}, on the unit circle converges to the uniform distribution in total variation, uniformly over all initial states, and 2 . [0,1] (cos 2πy)dy = 1/2. Thus one may use the ergodic theorem to get the limit. We will next show that the integral of the product term in the quadratic variation goes to zero as .t/a 2 → ∞. For this express it as
24 ST: Multiphase Homogenization
.
433
2 [sin 2π(B2 (s) + sδ) cos 2π(B2 (s)/a + sδ/a)]ds t [0,t] sin 2π(aB2 (s) + a 2 sδ) cos 2π(B2 (s) + asδ)ds =L 2/(t/a 2 ) = 2/(t/a 2 )
[0,t/a 2 ]
[0,t/a 2 ]
sin 2π(aZ2 (s)) cos 2π(Z2 (s))ds,
where .Z2 (s) = (B2 (s)+asδ) mod1. Since, as argued earlier, .Z(s) := (B2 (s)+y) mod1 is an ergodic diffusion on the unit circle approaching its invariant probability (uniform distribution) exponentially fast in total variation distance, uniformly in y , as .s → ∞. The last expression in the above display equals .(2/ta
2)
[0,t/a 2 ]
[sin 2π((a + 1)Z(s)) + sin 2π((a − 1)Z(s))]ds.
Now, uniformly in “a ” (positive integer), .(1/A)
[0,A]
[sin(2π(a + 1)Z(s)))ds → 0 in probability
(24.30)
as .A → ∞ (Exercise 9). Finally, the last term on the right in (24.29), namely, √ t is independent of the other terms and has the standard normal distribution with mean zero and variance one. Collecting the terms together, one gets a Gaussian distribution with mean zero and having variance .1 + c22 /2(π 2 + δ 2 ) + c32 /2δ 2 .
.B1 (t)/
Final Phase Asymptotics For the analysis of the limiting properties of .{X(t) : t ≥ 0}, the following assumptions will be made: A1 .b(·) and .β(·) are continuously differentiable and periodic with period one (in each coordinate), i.e., with period lattice .Zk . The dispersion matrix D of .X(·) is a constant positive definite matrix, .D = σ σ . A2 .divb(·) = 0 = divβ(·). A3 “a” is a positive integer. Remark 24.1 In A3 it is enough to require that “a” is rational: .a = p/q, with positive integers p and q having no common factor other than 1. One may then take the unit of spatial scale as .(1/q)-th of the original unit, in which case p takes the place of “a.” From a physical point of view, one may consider “a” to be a large (fixed) number and the unit of time to be that needed by the mean flow to travel one unit of distance. Remark 24.2 One may take the matrix .σ , and therefore D, to be spatially periodic with period lattice .Zk . Asymptotic results are still valid, although computations become more complicated. However, because of the first phase result, namely, Theorem 24.2, for the final phase, one may assume .σ and D to be constant matrices, without any essential loss of generality for the problem at hand.
434
24 ST: Multiphase Homogenization
Remark 24.3 It is not necessary that the coefficients, .a(·), β(·) have period one, i.e., a period lattice .Zk . Any common period lattice suffices, because a linear map can transform the lattice to .Zk . We begin with the fact mentioned earlier (see (24.20) and (24.21) and the paragraph preceding them) that, for a given “a” as .t → ∞, the diffusion with periodic coefficients .b(·), .β(·/a) is asymptotically Gaussian with the dispersion matrix K given by (24.20), (24.21). For simplicity it will be assumed that the local dispersion matrix is D, where D is a constant positive definite matrix. As before, we assume .X(0) = 0. Thus .X(t) = b(X(s)) + β(X(s)/a)ds + σ B(t), D = σ σ
(24.31) [0,t]
The corresponding diffusion .X(t) mod a on the k-dimensional torus .Ta can be shown to converge to equilibrium in variation norm with distance bounded above by .ca k/2 exp{−c t/a 2 } (Bhattacharya (1999); Diaconis and Stroock (1991); Fill (1991)). Unfortunately, this leads to a relaxation time of the order .t >> a 2 log a. To get the optimal order .t >> a 2 , we refer to a result of Franke (2004). To state this result, it is convenient to consider Y (t) = X(a 2 t)/a,
.
Z(t) = Y (t) mod1.
Now .Z(·) is a diffusion on the unit k-dimensional torus .T1 , whose generator is Aa =
.
∂2 1 Dij + (ab(a·) + aβ(·)) · ∇. 2 ∂yi ∂yj
(24.32)
1≤i,j ≤k
It is also the generator of .Y (t) but restricted to functions which are periodic with period one. Let .qv,D (t; x, y) be the transition probability density of a diffusion on the unit torus with dispersion matrix D as in (24.32) but an arbitrary divergence-free velocity .v(·) in place of .ab(a·) + aβ(·). Then the result in Franke (2004) says that δ := inf[ min qv,D (1; x, y) : v(·) is divergence-free] > 0.
.
x,y∈T1
It now easily follows from Doeblin’s minorization (see BCPT2 p. 213) that .
sup{
T1
2 Throughout,
Theory.
|qv,D (t; x, y) − 1|dy : x, y ∈ T1 , v(·) is divergence free} ≤ c e−δt ,
BCPT refers to Bhattacharya and Waymire (2016), A Basic Course in Probability
24 ST: Multiphase Homogenization
435
for some constant .c > 0. This immediately leads to sup
.
Ta |pa (t; x, y) − a −k |dy ≤ c exp{−δt/a 2 },
(24.33)
x∈Ta
where .pa is the transition probability density of .X(t) moda, on the big torus .Ta . One may now proceed assuming that, over this time scale, .{X(t) moda : t ≥ 0}, is an ergodic stationary process. It is convenient to consider the rescaled process Y (t) = X(a 2 t)/a,
.
t ≥ 0.
Let .Z(t) = Y (t) mod1, t ≥ 0, be a stationary ergodic diffusion on the unit torus T1 . The infinitesimal generator of the diffusion .Z(·) on the torus, as well as that of .Y (·) restricted to periodic functions, will be denoted by .
Aa =
.
∂2 1 Dij + a(b(a·) + β(·)) · ∇. 2 ∂yi ∂yj
(24.34)
1≤j ≤k
Aa acts on a dense subset .L(Aa ) of .L2 (T1 , dy) and, in view of the exponential ergodicity, the range of .Aa is .1⊥ , the set of all mean zero square integrable functions on .T1 (Theorem 17.8). Therefore, there exists a (unique) solution .gj ∈ 1⊥ of the equation
.
j (1 ≤ j ≤ k), Aa gj (y) = bj (ay) + βj (y) − bj − β
.
(24.35)
where, for an integrable function f , f =
.
[0,1]k
f (y)dy
is the mean of f under the invariant distribution 1dy on .T1 . Writing .g = for the vector means of b and .β, it follows (g1 , . . . , gk ), and similarly . b and .β ∂gi )), by Itô’s Lemma that, recalling .gradg(y) := (( ∂y j ) = a(g(Y (t))−g(Y (0)))− .Y (t)−Y (0)−at ( b+β
[0,t]
[agradg(Y (s))−Ik ]σ dW (t),
(24.36) where .W (t) = B(a 2 t)/a, t ≥ 0, is a standard k-dimensional Brownian motion. 3 It follows from the martingale (functional) central limit √ theorem that .Y (·) is asymptotically a Brownian motion. In particular, for .a/ t → 0,
3 Billingsley
(1968), Theorem 23.1. Bhattacharya and Waymire (2022), Theorem 15.5.
436
24 ST: Multiphase Homogenization
√ )]/ t ⇒ N(0, K), [Y (t) − Y (0) − at ( b+β
.
(24.37)
where Kjj = Ejj + Djj , Ejj = a 2 gradgj (y)D(gradgj (y)) dy.
.
[0,1]k
(24.38)
This follows from (24.36), and the fact that due to periodic boundary conditions: .
[0,1]k
∂gj dy = 0, for all j, j . ∂yj
(24.39)
The main task now is to analyze the asymptotic behavior of .Ejj as a function of “a.” We will outline this analysis, leaving some of the details to Bhattacharya (1999). Introduce the complex Hilbert spaces .H 0 , H 1 as follows: 0 = {h : Rk → C : h is periodic with period lattice Zk , 2 .H [0,1]k |h(y)| dy < ∞, [0,1]k h(y)dy = 0}, f, g0 = [0,1]k f (y)g(y)dy. 1 0 2 .H = {h ∈ H : [0,1]k |∇h| (dy) < ∞}, where 1 .f, g = 2
1≤j,j ≤k n≥1
[0,1]k
∂f ∂g dy ∂yj ∂yj
= −Df, g0 1 = gradf (y)Dgradg(y)dy, 2 [0,1]k and .g(y) denotes the complex conjugate of .g(y). Note that Ejj = a 2 (gj , gj + gj , gj 1 ).
.
(24.40)
It is useful to introduce the second-order differential operator: D=
.
1 2
1≤j,j ≤k
Dij
∂2 ∂yj ∂yj
(24.41)
Since .b(ay) is rapidly oscillating (with period .1/a), it may be shown that the term involving .b(ay) in (24.35)–(24.36) is well approximated by its average, namely, . b, and the generator .Aa in (24.36) can be approximated by
24 ST: Multiphase Homogenization
A=
.
437
∂2 1 Dij + a( b + β(·)) · ∇, 2 ∂yi ∂yj
b=
1≤j ≤k
[0,1]k
b(y)dy.
(24.42)
(see Bhattacharya (1999), pp. 971–976, for details). One may now express .Aa and A as
.
Sa := D −1 (b(a·) + β(·)) · ∇,
Aa = D(I + aSa ),
.
(24.43)
and, A = D(I + aS),
.
S := D −1 ( b + β(·) · ∇),
(24.44)
where I denotes the identity operator on .H 1 . The operators .Sa and .S : H 0 → H 1 are skew symmetric and compact. Skew symmetry follows on integration by parts and the divergence-free condition on .b(·) and .β(·) (Exercise 10). To prove compactness, namely, that any bounded subset of .H 0 is mapped into a compact set in .H 1 , one may use the Fourier transform as an isometry on .H 0 (up to a constant multiple) or, equivalently, the Parseval relation.4 Denote by .fˆ(r) the Fourier coefficients .fˆ(r) = f (y)e−2π ir·y dy = (1/2π )k f (y)e−ir·y dy, (r ∈ Zk ). [0,1]k
[−π,π ]k
On .H 0 , fˆ(0) = 0. Now let .H 2 be the subspace of .H 1 having square integrable derivatives up to order two. Then .g → Dg is an invertible map on .H 2 onto .H 0 . Indeed, if .f = Dg, then, on integration by parts, .fˆ(r) =
[0,1]k
(Dg)(y)e−2π ir·y dy
= (1/2)
Djj (
1≤j,j ≤k
= (−2π 2 )
[0,1]k
e−2π ir·y
∂ 2 g(y) )dy ∂yi ∂yj
Djj (rj rj )g(r), ˆ
(24.45)
1≤j,j ≤k
g(r) ˆ = (−1/2π 2 )(
.
Djj (rj rj ))−1 fˆ(r),
(24.46)
1≤j,j ≤k
|g(r)| ˆ ≥ (2λM π 2 )−1 |fˆ(r)|/|r|2 ,
.
4 See
BCPT, p. 133.
r ∈ Zk \{0}).
(24.47)
438
24 ST: Multiphase Homogenization
Here .λM is the maximum eigenvalue of D. We now prove that .D −1 : H 0 → H 1 is compact. One needs to show that given a bounded sequence .fn bounded by 1 in norm .|| · ||0 , there exists a subsequence .fn , say, .n = 1, 2, . . . , such that gn = D −1 fn → g := D −1 f,
.
for some .f ∈ H 0 , the convergence being in .H 1 . Since a bounded subset of .H 0 is weakly compact, by Alaoglu’s Theorem (see Lemma 2, Chapter 19), there exists a subsequence .fn converging weakly to some .f ∈ H 0 . This implies .fˆn (r) → fˆ(r) as .n → ∞, for all .r ∈ Zk \{0}. .fˆn (0) = 0, fˆ(0) = 0. Then (see (24)), .||gn
− g||21 = −D(gn − g), gn − g0 = −fn − f, D −1 (fn − f )0 ≤ (1/2λm π 2 ){ |fˆn (r) − fˆ(r)|2 + (1/R 2 ) |fˆn (r) − fˆ(r)|2 } |r|>R
0 0 from the mean flow in interpreting experimental results. The original version of the results in this chapter appeared in Bhattacharya and Goetze (1995). For some additional literature related to this chapter, we refer to Bhattacharya (1985), Bhattacharya et al. (1989), Bhattacharya (1999), Bhattacharya et al. (1999), Avellaneda and Majda (1992), and Owhadi (2003).
Exercises 1. (i) Let √ a, b,√c be positive numbers c ≤ (a + b) ∧ 1. Show that c ≤ (2 max{ a, b}) ∧ 1. [Hint: Obvious for b ≥ 1, or a ≥ 1. For a < 1, b < 1,
Exercises
443
Fig. 24.2 Increased transverse dispersion with Peclet number
√ √ a + b ≤ max{2a, 2b} < 2 max{ a, b.}.] (ii) Show that (c γ + c
γ 2 ) ∧ 1 ≤ [c + c
+ 2(c
)1/2 ]γ ), for nonnegative c , c
, γ . 2. (i) Prove that the invariant distribution π in each of the Examples 1, 2, is the uniform distribution on the torus [0, 1] × [0, 1] (with 0 and i 1 identified)[Hint: Compute the adjoint L∗0 , and note that divb(·) ≡ i=1,2 ∂b ∂yi = 0, and D(·) = D is a constant matrix.] (ii) Compute the mean vectors b and the 2×2 dispersion matrices K of the Gaussian law in Examples 1, 2 directly, without using (24.20). 3. Verify the convergence in the second row of Table 24.1. [Hint: The integral of the first term of X1 (t) is of the order t, which, when divided by t 2 /a 2 , goes to zero in the given range of t. For the second term of X1 (t),Taylor expansion around 0 yields cos(2π B2 (s)/a) − 1 = −(2π 2 /a 2 )B22 (s) + (8π 3 /6a 2 )B22 (s)θ , where |θ | ≤ 1, a.s. The second summand on the right in the expansion can be neglected, when its integral over [0, t] is√divided by t 2 /a 2 ; for the first summand, use the fact that the processes { tB2 (s/t) : s ≥ 0} and The term B1 (t)/t 2 /a 2 → 0 in {B2 (s) : s ≥ 0} have the same √ distribution.4/3 2 2 probability, because t /a >> t iff t >> a .] 4. Verify the convergence in the third row of Table 24.1. [Hint: By Itô’s lemma, sin(2π B2 (s))ds = − sin(2π B2 (t))/2π 2 + (1/π ) [0,t] cos(2π B2 (s))dB2 (s); [0,t]
444
5.
6. 7. 8. 9.
10. 11. 12.
13. 14.
24 ST: Multiphase Homogenization
E[ [0,t] sin(2π B2 (s))ds]2 = O(t), so that (1/t) [0,t] sin(2π B2 (s))ds → 0 in probability. Also, [0,t] cos(2π B2 (s)/a)ds =L [0,t] cos(2π B2 (s/a 2 ))ds = a 2 [0,1] cos(2π B2 (u))du.] Prove that the middle term on the right in (24.28) goes to zero as a → ∞ and t/a 2/3 → r > 0.[Hint: By a Taylor expansion, up to the second derivative, the middle term is seen to be O(t −1/2 [0,t] (B2 (s) + δs)/a)2 ds), which is of the order of r 5/2 a −1/3 → 0.] Justify the use of Birkhoff’s ergodic theorem for the convergence in the last line of (24.27). Show how the martingale CLT can be used to obtain the convergence to the appropriate Gaussian law for the case t >> a 2 . Instead of the martingale CLT, suggested in Exercise 7, use the time change (Corollary 9.5) to derive the desired convergence for t >> a 2 . Prove 24.30. [Hint: The mean of the quantity on the left goes to zero as A → ∞, and its variance,or square, goes to zero, since the covariance c(s, s ), say, of the integrand process goes to zero exponentially fast as |s-s → ∞, uniformly in “a.”] Prove that (i) the operators Sa and S are skew symmetric, and (ii) (b(a·)+β(·))· ∇, and ( b + β(·)) · ∇ : H 1 → H 1 , are bounded. Compute the exponent of diffusivity θ ; see (24.6) in Tables 24.1, 24.2, in the different ranges of time displayed. 1 ) = c3 cos 2πy2 belongs to the range of ( Prove that (β1 − β b + β(·)) · ∇g = ∂g ∂g 1 , of the form y + dy mod1 + δ . [Hint: Find g ∈ H (c1 + c3 cos 2πy2 ) ∂y 1 2 ∂y2 1 satisfying the equation.] Verify that the approximation of gj by hj (see (24.42)) does not affect the asymptotic dispersion in Examples 1, 2. Verify Remark 24.5.
Chapter 25
Special Topic: Skew Random Walk and Skew Brownian Motion
Skew Brownian motion was introduced by Itô and McKean (Illinois J Math 7:181–231, 1963) in a paper devoted to probabilistic constructions of Markov processes satisfying Feller’s conditions for the classification of all one-dimensional diffusions. Its eventual foundational role in modeling the presence of certain types of interfacial heterogeneities is now well documented. Applications include solute transport, ecology, finance, and, in general, phenomena associated with the occurrence of sharp heterogeneities. This chapter relies rather heavily on the Skorokhod embedding and Doob’s maximal inequality for martingales to provide a skew random walk approximation to skew Brownian motion and may be viewed as an illustration of their use in this context.
For motivation1 of the processes introduced here, consider a suspension of a dilute concentration .c0 (y) of solute in a one-dimensional saturated medium at time zero. Suppose that the dispersion rate is .D − on .(−∞, 0) and .D + on .(0, ∞). If .D − = D + then .y = 0 is referred to as an interface. A basic model in partial differential equation for the time evolution (dispersion) of such a concentration .c(t, y) is given by .
∂c 1 ∂ ∂c = (D(y) ), ∂t 2 ∂y ∂y
y = 0,
c(0+ , y) = c0 (y)
(25.1)
1 Also
see Ouknine (1991), and the survey articles by Lejay (2006), and by Ramirez et al. (2013), as well as the Example 3, Remark 17.10 in Chapter 17. The paper Ramirez (2012) also provides an intriguing application to population biology.
© Springer Nature Switzerland AG 2023 R. Bhattacharya, E. C. Waymire, Continuous Parameter Markov Processes and Stochastic Differential Equations, Graduate Texts in Mathematics 299, https://doi.org/10.1007/978-3-031-33296-8_25
445
446
25 Special Topic: Skew Random Walk and Skew Brownian Motion
together with a condition of continuity of flux across the interface D+
.
∂c ∂c (t, 0+ ) − D − (t, 0− ) = 0. ∂y ∂y
(25.2)
In the case .D + = D − = D, as explained in Chapter 2, Example 4, the solution may be represented as c(t, y) = Ey c0 (Yt ) =
.
R
√
1 2π Dt
e−
(y−z)2 2Dt
t ≥ 0,
c0 (z)dz,
(25.3)
where Y is a Brownian motion with zero drift and diffusion coefficient D (Exercise 6). It is natural to inquire about the nature of the particle motion in the presence of an interface. The answer will follow from the considerations to follow. The skew Brownian motion with parameter .α ∈ (0, 1), .B (α) is a one-dimensional stochastic process on .(−∞, ∞) that behaves like a standard Brownian motion on each of the half-lines .(−∞, 0) and .(0, ∞). In particular, we will see that .B (α) is a strong Markov process having continuous sample paths and homogeneous transition probability density .p(α) (t; x, y) satisfying .
1 ∂ 2 p(α) ∂p(α) = , 2 ∂x 2 ∂t
x, y ∈ R, t > 0.
(25.4)
However, at the interface .x = 0, one has the following relation: α
.
∂p(α) (t; x, y) ∂p(α) (t; x, y) |x=0− = 0, |x=0+ − (1 − α) ∂x ∂x
t > 0, −∞ < y < ∞. (25.5)
In the case .α = 12 , the interface condition is simply continuity of the derivative at the backward variable .x = 0, and the transition probabilities therefore coincide with those of a standard Brownian motion. To cast this in the framework of Feller’s semigroup theory, define speed and scale functions by s(x) =
.
α −1 x, x ≥ 0, (1 − α)−1 x,
x < 0,
m(x) =
2αx, x ≥ 0, 2(1 − α)x. x < 0,
.
(25.6)
Then one has L = Dm Ds+ =
.
and
1 d2 on R\{0}, 2 dx 2
(25.7)
25 Special Topic: Skew Random Walk and Skew Brownian Motion
Ds f (0+ ) − Ds f (0− ) =
.
447
df df (x)|x=0+ − (x)|x=0− = αf (0+ ) − (1 − α)f (0− ). ds ds (25.8)
Thus, in view of (21.5), .f ∈ DL if and only if .f exists and is continuous on .(−∞, 0) ∪ (0, ∞), .f (0+ ) = f (0− ), and .αf (0+ ) = (1 − α)f (0− ). The inaccessibility of the boundaries .±∞ imposes no further restrictions on the domain; see Definition 21.3. However, it is also possible to give a direct probabilistic construction of .α-skew Brownian motion from standard Brownian motion B. Recall that by continuity of the paths of the Brownian motion, the (random) set .{t : Bt = 0} is the complement of the inverse image of the closed set .{0} and, therefore, an open subset of .(0, ∞). Using the fact that every open subset of .R is a countable disjoint union of open intervals, one may express .{t : Bt = 0} as a countable disjoint union of open intervals .J1o , J2o , . . . , referred to as excursion intervals of B. So, while the excursions of Brownian motion cannot be ordered, like the rational numbers they can be enumerated. Note that .Bt = 0 at an endpoint t of any .Jko . We denote the closure of .Jko by .Jk . Let .A1 , A2 , . . . be an i.i.d. sequence of .±1 Bernoulli random variables, independent of B, with .P (An = 1) = α. The .α-skew Brownian motion starting at zero is the process .B (α) defined by Bt(α) =
∞
.
Ak 1Jk (t)|Bt |,
t ≥ 0.
(25.9)
k=1
That is, one makes independent tosses of a coin to decide if a particular excursion of .|B| should remain positive or should undergo a sign change. Proposition 25.1 .B (α) is a Markov process having continuous sample paths and stationary transition probabilities .p(α) (t; x, y) given by
.
p
(α)
(t; x, y) =
⎧ (y−x)2 ⎪ √ 1 e− 2t + ⎪ ⎪ 2π t ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ 2 ⎪ ⎪ √ 1 − (y−x) ⎪ ⎪ ⎨ 2π t e 2t −
2
(y+x) (2α−1) √ e− 2t 2π t
2
(y+x) (2α−1) √ e− 2t 2π t
if x ≥ 0, y > 0 if x ≤ 0, y < 0
⎪ ⎪ (y−x)2 ⎪ ⎪ √2α e− 2t if x ≤ 0, y > 0 ⎪ ⎪ 2π t ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ (y−x)2 ⎩ 2(1−α) √ e− 2t if x ≥ 0, y < 0. 2π t
Proof Let .k(t) = k for .t ∈ Jko , and take the right-continuous extension to extend to the value at an endpoint t; i.e., .k(t) = k when t is the left endpoint of .Jk . Then (α) (α) .Bt = Ak(t) |Bt |. Also .k(t) is constant on an open interval. It follows that .t → Bt is continuous on each excursion interval. Moreover, since .Bt = 0 for .t ∈ ∂Jk , k ≥ 1, the continuity extends to the endpoints of excursion intervals. Let .Ft := σ {|Bs |, 0 ≤
448
25 Special Topic: Skew Random Walk and Skew Brownian Motion (α)
s ≤ t} ∨ σ {A1 , A2 , . . . } ⊃ σ (Bs , s ≤ t). Then, for .0 ≤ s < t and a nonnegative, measurable function g one has using the Markov property for .|B| and independence of .|B| of the i.i.d. sign changes .A1 , A2 , . . . , .
E{g(Bt(α) )|Fs } = E{g(Ak(t) |Bt |)|Fs } =
∞
E{g(Ak |Bt |)1Jk (t)|Fs }
k=1
= 1[Ak(t) = 1]
∞
E{g(|Bt |)1Jk (t)|Fs }
k=1
+1[Ak(t) = −1]
∞
E{g(−|Bt |)1Jk (t)|Fs }
k=1
= 1[Ak(t) = 1]E{g(|Bt |)|σ (|Bs |, Ak , k ≥ 1)} +1[Ak(t) = −1]E{g(−|Bt |)|σ (|Bs |, Ak , k ≥ 1)} = E{g(Ak(t) |Bt |)|σ (|Bs |, Ak , k ≥ 1)} = E{g(Bt(α) )|σ (|Bs |, Ak , k ≥ 1)} = E{g(Bt(α) )|σ (Bs(α)) }. The Markov property follows from the smoothing property of conditional expec(α) tation (see BCPT2 p. 38), since .Fs ⊃ σ (Bu , u ≤ s). The computation of the (α) transition probabilities .p (t; x, y) will follow from the strong Markov property for Brownian motion: specifically, the reflection principle. The bulk of these are left as Exercise 3; however observe, for example, that for .y > 0, .
(α)
P0 (Bt =
∞
> y) [P0 (Bt > y, An = +1, t ∈ Jn ) + P0 (Bt < −y, An = +1, t ∈ Jn )]
n=1
= αP0 (Bt > y) + αP0 (Bt < −y).
(25.10)
In particular, it follows that 2α − y 2 p(α) (t; 0, y) = √ e 2t . 2π t
.
2 Throughout, BCPT refers to Bhattacharya and Waymire (2016), A Basic Course in Probability Theory.
25 Special Topic: Skew Random Walk and Skew Brownian Motion
449
One may check that for fixed y, the transition probabilities of the skew Brownian motion are continuous in the backward variable x. This implies the Feller property (see Chapter 1, Definition 1.1). The Feller property enjoys numerous important consequences in the general theory of Markov processes. The following result is chief among these when taken together with the sample path continuity. Theorem 25.2 Skew Brownian motion has the strong Markov property. Proof In view of the sample path continuity and the Feller property, the strong Markov property follows by Theorem 1.6. The stochastic process corresponding to the interfacial dispersion problem posed at the outset of this chapter now follows by a rescaling of an .α-skew Brownian motion for an appropriate choice of .α. Specifically one has the following consequence; see Exercise 9. √ (α ∗ ) ) where .s(y) = D + y, y ≥ 0, Corollary 25.3 The process defined by .Y = s(B √ √ + and .s(y) = D − y, y ≤ 0, and .α ∗ = √ + D√ − is a Markov process having D + D continuous sample paths with stationary transition probabilities given by .
p(t; x, y) √ √ ⎧ 2 + − (y−x)2 ⎪ √ 1 √D −√D exp{− (y+x) }] x > 0, y > 0, } + [exp{− ⎪ + ⎪ 2π D + t − + D+ 2D + t 2D t D ⎪ ⎪ √ √ ⎪ ⎪ 2 2 − + ⎪ (y+x) (y−x) ⎪ ⎨ √ 1 − [exp{− 2D − t } − √D + −√D − exp{− 2D − t }] x < 0, y < 0, 2π D t D + D √ √ = ⎪ (y D − −x D + )2 ⎪ } x ≤ 0, y > 0, ⎪ √ + 2 √ − √ 1 exp{− − ⎪ 2D D + t 2π t ⎪ D + D ⎪ √ √ ⎪ ⎪ 2 − + ⎪ D ) ⎩√ 2√ √ 1 exp{− (y D −x } x ≥ 0, y < 0. + − 2D − D + t D + D
2π t
(25.11) Moreover, for each fixed .x ∈ R, one has .
∂p(t; x, y) 1 ∂ ∂p (D(y) ), = 2 ∂y ∂y ∂t
y = 0,
(25.12)
∂p ∂p (t; x, 0+ ) − D − (t; x, 0− ) = 0. ∂y ∂y
(25.13)
together with continuity of flux across the interface D+
.
Note that, unlike the case of skew Brownian motion noted earlier, (25.11) is jointly continuous in the forward and backward transition variables .x, y. In fact .p(t; x, y) is symmetric in .x, y. However, this symmetry is not a property of the transition probabilities for skew Brownian motion given in Proposition 25.1.
450
25 Special Topic: Skew Random Walk and Skew Brownian Motion
Skew random walk is a natural discretization of skew Brownian motion defined as follows. Definition 25.1 The skew random walk is a discrete Markov chain .{Yn : n = 0, 1, 2, . . . } on the integers .Z having transition probabilities
.
(α)
pij
⎧ 1 ⎪ 0, j = i ± 1 ⎪ 2 if i = ⎪ ⎪ ⎪ ⎨ = α if i = 0, j = 1 ⎪ ⎪ ⎪ ⎪ ⎪ ⎩ 1 − α if i = 0, j = −1.
Remark 25.1 One may note that the Markov process does not skip states and, as such, is an example of a discrete parameter birth–death Markov chain. In fact, the terminology “skew random walk”is a misnomer since the process does not have independent increments.3 One may anticipate an approximation like that of the functional central limit theorem connecting random walks to Brownian motion will also hold here.4 However, in view of the non-Lipschitz nature of the coefficients, the martingale approximation developed in Chapter 23, Example 5 does not directly apply here. The remainder of this chapter gives a proof based on the Skorokhod embedding method for skew Brownian motion.5 Let us recall that by an application of the Skorokhod embedding theorem,6 there is a sequence of times .T1 < T2 < · · · such that .BT1 has a symmetric Bernoulli .±1distribution, and .BTi+1 −BTi .(i ≥ 0) are i.i.d. with with a symmetric .±1-distribution. Moreover, .Ti+1 − Ti .(i ≥ 0) are i.i.d. with mean one; also see Exercise 5. Lemma 1 (Discrete Excursion Representation) Let .S = {Sn : n = 0, 1, 2, . . . } be a simple symmetric random walk starting at 0, and let .J˜π1 , J˜π2 , . . . denote an enumeration of the excursions of .S away from zero for a fixed but arbitrary permutation .π of the natural numbers. In particular .|Sn | > 0 if .n ∈ J˜πk . Define
3 Iksanov
and Pilipenko (2023) and Dong et al. (2023) recently extended the theory to skew stable Lévy process as a limit of a sequence of stable Lévy processes perturbed at zero. 4 Convergence of the distribution at a fixed time point was first announced in Harrison and Shepp (1981). The suggested “fourth moment proof”along the lines of that given for simple symmetric random walk (i.e., .α = 1/2) based on convergence of finite dimensional distributions and tightness is quite laborious and tricky due to the lack of independence of the increments. A full proof was given by Brooks and Chacon (1983). 5 Sooie-Ho Loke co-authored this approach while a graduate student at Oregon State University. However, a more general functional central limit theorem by Cherny et al. (2002) appeared earlier, so that the collaboration with Loke remained unpublished until this book chapter. 6 See BCPT, p. 200, or Bhattacharya and Waymire (2021), p.204.
25 Special Topic: Skew Random Walk and Skew Brownian Motion
S0(α) = 0,
Sn(α) =
.
∞
451
1J˜π (n)A˜ k |Sn |, n ≥ 1, k
k=1
where .A˜ 1 , A˜ 2 , . . . is an i.i.d. sequence of Bernoulli .±1-random variables, independent of .S, with .P (A˜ 1 = 1) = α. Then .S (α) is distributed as .α-skew random walk. (α)
Proof Let G be a bounded .σ (Sj , j ≤ n) measurable function. We will first show (α)
(α)
(α)
(α)
that .P (Sn+1 = Sn + 1 | Sj , j ≤ n) = 1/2 on .[Sn .
> 0]. For this consider
(α)
E{G(S1 , . . . , Sn(α) )1[S (α) =S (α) +1] 1[S (α) >0] } n+1
n
n
(α)
(α)
= E{E[G(S1 , . . . , Sn(α) )1[S (α) =S (α) +1] 1[S (α) >0] | Sj , j ≤ n]} n
n+1
n
(α)
(α)
= E{G(S1 , . . . , Sn(α) )E[1[S (α) =S (α) +1] 1[S (α) >0] | Sj , j ≤ n]} n
n+1
n
(α)
(α)
= E{G(S1 , . . . , Sn(α) )E[1[S (α) =S (α) +1] 1[S (α) >0] 1[S (α) >0] | Sj , j ≤ n]} n
n+1
n
n+1
(α)
(α)
= E{G(S1 , . . . , Sn(α) )E[1[|Sn+1 |=|Sn |+1] 1[S (α) >0] 1[S (α) >0] | Sj , j ≤ n]} n
(α)
n+1
(α)
= E{G(S1 , . . . , Sn(α) )E[1[|Sn+1 |=|Sn |+1] 1[S (α) >0] | Sj , j ≤ n]} n
(α)
(α)
= E{G(S1 , . . . , Sn(α) )1[S (α) >0] E[1[|Sn+1 |=|Sn |+1] | Sj , j ≤ n]} n
1
(α) . = E G(S1 , . . . , Sn(α) )1[S (α) >0] · n 2
(α) The proof that .P (Sn+1 = Sn(α) + 1 | Sj(α) , j ≤ n) = 1/2 on .[Sn(α) < 0] is by entirely similar considerations and left to the reader. (α) (α) (α) So let us turn to showing that .P (Sn+1 = Sn + 1 | Sj , j ≤ n) = α on
[Sn(α) = 0]. For this consider
.
.
(α)
E{G(S1 , . . . , Sn(α) )1[S (α) =S (α) +1] 1[S (α) =0] } n+1
n
n
= E{E[G(S1(α) , . . . , Sn(α) )1[S (α) =1] 1[S (α) =0] | {Sj(α) }n1 , Sn(α) = 0]} n+1
(α)
= E{E[G(S1 , . . . , Sn(α) )1[n+1∈J
n
(α) (α) ] π(S1 ,...,Sn )
1[A
(α) (α) =1] π(S1 ,...,Sn )
(α)
×1[|Sn+1 |=1] 1[S (α) =0] | {Sj }n1 , Sn(α) = 0]} n
(α)
= E{E[G(S1 , . . . , Sn(α) )1[n+1∈J (α)
(α) (α) ] π(S1 ,...,Sn )
×1[|Sn |=0] 1[S (α) =0] | {Sj }n1 , Sn(α) = 0]} n
1[A
(α) (α) =1] π(S1 ,...,Sn )
452
25 Special Topic: Skew Random Walk and Skew Brownian Motion (α)
= E{G(S1 , . . . , Sn(α) )1[S (α) =0] 1[n+1∈J n
×E[1[A
(α) (α) =1] π(S1 ,...,Sn )
(α) (α) ] π(S1 ,...,Sn )
(α)
1[|Sn+1 |=1] 1[|Sn |=0] | {Sj }n1 , Sn(α) = 0]}
Now since the event .[|Sn | = 0] implies the event .[|Sn+1 | = 1], the indicator random variable .1[|Sn+1 |=1] 1[|Sn |=0] is measurable with respect to the .σ -field (α) n (α) .σ ({S = 0). Also since the event .[|Sn | = 0] implies that .|Sn+1 | is j }1 , Sn in a completely new excursion and the fact that .{Ak } is independent of .{Sk }, (α) n (α) .1[A = 0). Thus we have, (α) (α) =1] is independent of .σ ({Sj }1 , Sn π(S1
,...,Sn )
(α)
(α)
E[G(S1 , . . . , Sn )1[S (α) =S (α) +1] 1[S (α) =0] ]
.
n+1
n
n
(α)
= E[G(S1 , . . . , Sn(α) )1[S (α) =0] 1[n+1∈J n
× E[1[A
(α) (α) =1] π(S1 ,...,Sn )
.
(α) (α) ] π(S1 ,...,Sn )
(α)
]E[1[|Sn+1 |=1] 1[|Sn |=0] | {Sj }n1 , Sn(α) = 0]]
= E[G(S1(α) , . . . , Sn(α) )1[S (α) =0] · α] n
The proof of the next lemma is left as an exercise. (α) Lemma 2 The discrete parameter stochastic process . Sm = BT(α) ,m = S0(α) = 0, m 1, 2, . . . , is distributed as an .α-skew random walk.
Lemma 3 For any .λ > 0, the distribution of .α-skew Brownian motion is invariant 1 (α) under the space-time transformation .{λ− 2 Bλt : t ≥ 0}. Proof Note that the Markov property and continuity of sample paths are preserved by this continuous bijective transformation on path space .C[0, ∞). So it is sufficient to check, by inspection, that 1
1
1
λ 2 p(α) (λt; λ 2 x, λ 2 y) = p(α) (t; x, y),
.
x, y ∈ R, t ≥ 0.
The following lemma might be viewed as a symmetrization reminiscent of von Neumann’s algorithm for simulating a fair coin by tossing it twice;7 also see Exercise 4. Lemma 4 (Walsh’s Martingale) Define a reversal function
7 This
martingale was identified by Walsh (1978), who also pointed out skew Brownian motion as an example of a (continuous) diffusion having a discontinuous local time; also see Ramirez et al. (2006) in this regard.
25 Special Topic: Skew Random Walk and Skew Brownian Motion
453
rα (x) = αx1(−∞,0] (x) + (1 − α)x1[0,∞) (x), x ∈ R.
.
Then .rα (B (α) ) is a martingale. (α)
Proof Fix .α and let .Rt = rα (Bt ), t ≥ 0. A proof can be obtained by a direct calculation of .E{Rt |σ (Bu : u ≤ s)} via the Markov property and formulae for the transition probabilities of the skew Brownian motion. First consider the case (α) .Bs > 0. Then, as a result of the indicated cancellations of integrals, one has .
(α)
E{rα (Bt )|σ (Bu : u ≤ s)} = rα (y)p(α) (t − s; Bs(α) , y)dy R
∞
= (1 − α)
yp 0
(1 − α) =√ 2π(t − s)
(α)
∞
(t
− s; Bs(α) , y)dy
y{e−
(α) (y−Bs )2 2(t−s)
− e−
−α 0 (α) (y+Bs )2 2(t−s)
∞
yp(α) (t − s; Bs(α) , −y)dy
}dy
0
= rα (Bs(α) ). The last equality follows by observing that the last integral is the mean of a (α) (α) Brownian motion starting at .Bs and absorbed at zero. The case that .Bs ≤ 0 is similar and left to the reader. Lemma 5 For each .k = 1, 2, . . . ,and .m ≥ 2, (α)
(α)
E|rα (Bk+1 ) − rα (Bk )|m ≤ 1.
.
Proof Similarly to the previous lemma, a proof can be obtained by a direct calculation via the Markov property and formulae for the transition probabilities of the skew Brownian motion. Namely, .
(α) E|rα (Bk+1 ) − rα (Bk(α) )|m ∞ ∞ = (1 − α)m |y − x|m p(1; x, y)p(k; 0, x)dydx
+α m
+
−∞
0 0
0 0
−∞ −∞
∞ 0
+
0
0
−∞ 0
∞
|y − x|m p(1; x, y)p(k; 0, x)dydx
|αy − (1 − α)x|m p(1; x, y)p(k; 0, x)dydx |(1 − α)y − αx|m p(1; x, y)p(k; 0, x)dydx.
454
25 Special Topic: Skew Random Walk and Skew Brownian Motion
For each of these terms, substituting the formulae for the transition probabilities p(1; x, y), one may check that each of the inner integrals (with respect to y) is bounded by one for any x and .α. Adding up the integrals with respect to x yields an upper bound by a total probability of one.
.
The proofs of the following properties of the reversal function are straightforward to check and left to the reader. Lemma 6 For any nonnegative function f on .R (i) .rα (maxx f (x)) = maxx rα (f (x)). (ii) For any .x ∈ R, .rα (|x|) ≤ |rα (x)| iff α ≥ 1/2. (iii) For .x ≥ y, |rα (x − y)| ≤ |rα (x) − rα (y)| iff α ≥ 1/2. Define the polygonal random function .S (α,n) on .[0, 1] as follows: (α)
(α,n)
St
.
(α)
(α)
Sk−1 k − 1 Sk − Sk−1 + n(t − := √ ) √ n n n k for t ∈ [ k−1 n , n ], 1 ≤ k ≤ n.
(α,n)
That is, .St
=
(α)
Sk √
n
at points .t =
k n
(25.14) (α,n)
(0 ≤ k ≤ n), and .t → St
.
interpolated between the endpoints of each
is linearly
k interval .[ k−1 n , n ].
Theorem 25.4 .S (α,n) converges in distribution to the .α-skew Brownian motion .B (α) as .n → ∞. Proof Let .Tk , .k ≥ 1, be the embedding times of simple symmetric random walk {Sk = BTk : k = 0, 1, 2, . . . } in a standard Brownian motion .{Bt : t ≥ 0} starting at zero. That is, the simple symmetric random walk .{Sk : k = 0, 1, 2, . . . } has the same distribution as .{ Sk := BTk : k = 0, 1, 2, . . . }. It now follows that the (α) (α) Sm ≡ skew random walk .{Sm : m = 0, 1, 2, . . . } has the same distribution as .{ (α) (α,n) has the same distribution as BTm : m = 0, 1, 2, . . . }. In particular, therefore, .S 1 (α,n) (α) − . S (α,n) defined by . Sk/n := n 2 BTk .(k = 0, 1, . . . , n) and with linear interpolation between .k/n and .(k + 1)/n .(k = 0, 1, . . . , n − 1). Also, define, for each .n = (α) t(α,n) := n− 12 Bnt 1, 2, . . . , the skew Brownian motion .B , .t ≥ 0. We will show that .
.
(α,n) t(α,n) | −→ 0 max | St −B
0≤t≤1
in probability as n → ∞,
(25.15)
which implies the asserted weak convergence (see Exercise 2). Now .
t(α,n) |. max | St(α,n) − B
0≤t≤1
(25.16)
25 Special Topic: Skew Random Walk and Skew Brownian Motion 1
455
≤ n− 2 max |BTk − Bk | (α)
(α)
1≤k≤n
(α,n) + max { max | St(α,n) − Sk/n | + n− 2 1
0≤k≤n−1
k k+1 n ≤t≤ n
= In(1) + In(2) + In(3) ,
max |Bt(α) − Bk(α) |}
k≤t≤k+1
say.
(25.17)
˜ − S (α) ˜ Now, writing .Z˜ k = S (α) k k−1 , as .n → ∞ one readily has 1 1 In(2) ≤ n− 2 max{|Z˜ k | : 1 ≤ k ≤ n} ≤ √ → 0. n
.
For the third term, let .ε > 0 and fix a .k ∈ {0, 1, . . . , n − 1}. Note that using the lemma on properties of the reversal function one has .
(α)
P ( max |Bt k≤t≤k+1
(α)
− Bk | >
√
nε)
√ ≤ P ( max |rα (Bt(α) ) − rα (Bk(α) )| > rα ( nε)) k≤t≤k+1
√ (α) (α) = P ( max |rα (Bt ) − rα (Bk )| > rα ( nε)) k≤t≤k+1
√ (α) (α) = P ( max |rα (Bt ) − rα (Bk )| > (1 − α) nε). k≤t≤k+1
Thus, by Walsh’s martingale lemma and Doob’s maximal inequalty, one has n E|r (B (α) ) − r (B (α) )|4 √ α k+1 α k 1 (α) (α) .P ( max max |Bt − Bk | > nε) ≤ ≤ c(α, ε) , n 1≤k≤n k≤t≤k+1 (1 − α)4 n2 ε4 k=1
(25.18) (3)
for a positive constant .c(α, ε). It follows that .P (In > ε) ≤ c(α, ε) n1 . In particular, 1
In(3) ≤ n− 2
.
(α)
max max{|Bt
0≤k≤n−1
(α)
− Bk | : k ≤ t ≤ k + 1} → 0 in probability. (25.19)
Hence we need to prove, as .n → ∞, 1
In(1) := n− 2 max |BTk − Bk | −→ 0 in probability. (α)
.
1≤k≤n
(α)
(25.20)
This follows exactly as for the case of the convergence of simple symmetric random walk to Brownian motion. Namely, since .Tn /n → 1 a.s., by the strong law of large numbers, it follows that
456
25 Special Topic: Skew Random Walk and Skew Brownian Motion
εn := max |
.
1≤k≤n
k Tk − | −→ 0 n n
as n → ∞ (almost surely).
(25.21)
So, in view of (25.21), there exists for each .ε > 0 an integer .nε such that .P (εn < ε) > 1 − ε for all .n ≥ nε . Hence with probability greater than .1 − ε one has for all .n ≥ nε , the estimate In(1) ≤
.
=
1
max
n− 2 |Bs(α) − Bt | =
max
|B s
|s−t|≤nε, 0≤s,t≤n+nε
|s −t |≤ε, 0≤s ,t ≤1+ε
−→ 0
(α,n)
(α)
−B t
(α,n)
d
| =
max
|s −t |≤ε, 0≤s ,t ≤1+ε
|B s/n − Bt/n | (α,n)
max
|s−t|≤nε, 0≤s,t≤n(1+ε)
(α)
(α,n)
(α)
|Bs − Bt |
as ε ↓ 0, (α)
by the continuity of .t → Bt . Given .δ > 0 one may then choose .ε = εδ such that (1) (1) for all .n ≥ n(δ) := nεδ , .P (In > δ) < δ. Hence .In → 0 in probability. A practical value of this extension is in the design of numerical schemes8 for more general models. Hoteit et al. (2002) describes four different probability schemes from the existing literature. The skew random walk convergence provides a theoretical justification for one of these schemes. Another interesting application that arises in this general context of solute transport involves experiments9 to determine the so-called breakthrough curves, or passage time distributions, in the presence of an interface. In particular, a distinct skewness was empirically observed in these experiments that was eventually explained as a consequence of skew diffusion.10 This application motivated the following explicit computation of the first passage time for skew Brownian motion obtained by Appuhamillage and Sheldon (2012). It involves first characterizing the distribution of ranked excursion heights of skew Brownian motion. The ranked excursion heights are then used to obtain a formula11 for the first passage time distribution. Let (α)
(α)
M1 (t) ≥ M2 (t) ≥ . . . ≥ 0
.
(α)+
be the ranked decreasing sequence of excursion heights .sups∈Jm ∩[0,t] Bs ranging (α)+ over all m such that .Jm ∩ [0, t] is nonempty. Note that .Bs denotes the positive
8 See
Lejay and Martinez (2006), Bokil et al. (2020), and references therein, for more extensive treatments of the numerical problems. 9 Berkowitz et al. (2009). 10 See Appuhamillage et al. (2010, 2011). 11 Appuhamillage and Sheldon (2012) provide refined expressions for these formulae as well.
25 Special Topic: Skew Random Walk and Skew Brownian Motion
457
(α)
part of .Bs , so that a negative excursion has height zero and the height of the final excursion is included in the ranked list even if that excursion is incomplete. For the purposes of coupling the various skew Brownian motions represented (α) by (25.9), the dependence on .α ∈ (0, 1) of the excursion flips .{Am ≡ Am : m = 0, 1, . . .} will be made explicit, i.e., (α)
Bt
.
=
∞
1Jm (t)A(α) m |Bt |.
(25.22)
m=1 (α/β)
A coupling construction: Let .{Am : .m = 0, 1, . . .} be a sequence of i.i.d. .±1 (α/β) Bernoulli random variables independent of .A(β) and B with .P (Am = 1) = α/β. (α) Define .Am as follows: (α) .Am
=
(α/β)
(β)
1
Am = 1, Am
−1
otherwise.
= 1,
(α)
By construction, the sequence .{Am : m = 0, 1, . . .} consists of i.i.d. .±1 Bernoulli (α) random variables that are independent of B with .P (Am = 1) = α. Hence, by using (α) .Am , m ≥ 1, as the excursion signs in (25.22), we obtain an .α-skew Brownian motion .B (α) . This is a two-stage construction: first, define .B (β) by independently setting each excursion of .|B| to be positive with probability .β; then, for each positive excursion of .B (β) , independently decide whether to keep it positive (with probability .α/β), or flip it to be negative (with probability .1 − α/β). Theorem 25.5 (Appuhamillage and Sheldon (2012)) Fix .y ≥ 0, .t > 0 and α, β ∈ (0, 1). For each .j = 1, 2, . . ., the following relation between ranked excursion heights of .α- and .β-skew Brownian motions holds:
.
(α)
P0 (Mj (t) > y) =
.
∞ k−1 (β) (1 − βα )k−j ( βα )j P0 (Mk (t) > y). j −1
(25.23)
k=1
The proof of Theorem 25.5 involves the following inversion formula (Exercise 10). Lemma 7 (Pitman and Yor (2001), Lemma 9) Let bk =
.
∞ m am , k
k = 0, 1, . . .
m=0
be the binomial moments of a nonnegative sequence .(am , m = 0, 1, . . .). Let k B(θ ) := ∞ k=0 bk θ and suppose .B(θ1 ) < ∞ for some .θ1 > 1. Then
.
458
25 Special Topic: Skew Random Walk and Skew Brownian Motion
am =
.
∞ k bk , (−1)k−m m
m = 0, 1, . . . ,
k=0
where the series is absolutely convergent. Proof of Theorem 25.5 For .α < β, by the two-stage coupling construction of the (β) (α) (α) excursion sign .Am that .Mj (t) = MHj (t), where .Hj has a negative binomial distribution:
h−1 .P (Hj = h) = (1 − βα )h−j ( βα )j . j −1 Hence (α) .P0 (M j (t)
∞ k−1 (β) (1 − βα )−j ( βα )j P0 (Mk (t) > y). > y) = j −1
(25.24)
k=1
For .β < α, the relation can be inverted by an application of Lemma 7. Let .k := j −1 and .m := h − 1. Then one can write (25.23) as (α) .P0 (M k+1 (t)
∞ m (β) (1 − βα )m−k ( βα )k+1 P0 (Mm+1 (t) > y). > y) = k
(25.25)
m=0
We then apply Lemma 7 to the sequences bk := (1 − βα )k ( βα )−k−1 P0 (Mk+1 (t) > y), am := (1 − βα )m P0 (Mm+1 (t) > y). (α)
.
(β)
After simplifying, we obtain (β) .P0 (M j (t)
∞ h−1 (α) (1 − βα )h−j ( βα )j P0 (Mh (t) > y). > y) = j −1 h=1
(25.26)
Let Ty(α) = inf{s ≥ 0 : Bs(α) = y}
.
denote the first time for .α-skew Brownian motion to reach y. In the case of Brownian motion the following formula was obtained by Csaki and Hu (2003) as a limit of the corresponding results for simple symmetric random walk using reflection principle calculations left as an Exercise 11.
25 Special Topic: Skew Random Walk and Skew Brownian Motion
459
Lemma 8 (Csaki and Hu (2003)) In the case of standard Brownian motion (.α = 1/2), one has (1/2)
P0 (Mj
.
y (t) > y) = 2(1 − Φ((2j − 1) √ )), t
(25.27)
where .Φ is the standard normal distribution function. Corollary 25.6 Fix .y ≥ 0 and .t > 0. Then for each .j = 1, 2, . . ., the distribution (α) of .Mj (t) is given by the formula P0 (Mj(α) (t) > y) =
.
∞ h−j j √ h−1 2α 1 − Φ((2h − 1)y/ t) , 2 1 − 2α j −1 h=1
where .Φ(·) is the standard normal distribution function. Proof The result is immediate from Theorem 25.5 using Lemma 8 and taking .β = 1/2 in (25.23). Corollary 25.7 Fix .t > 0. Then .
(α)
P0 (Ty ∈ dt) ⎧ 2 (2h−1)y ⎨2α ∞ (1 − 2α)h−1 √ } dt, y > 0 exp{− ((2h−1)y) h=1 3/2 2t 2π t = 2 ((2h−1)y) ⎩2(1 − α) ∞ (2α − 1)h−1 (2h−1)(−y) √ } dt, y < 0. exp{− h=1 3/2 2t 2π t
Proof For the case .y > 0 and .t > 0, we have the following relation between the distributions of .Ty(α) and the highest excursion of skew Brownian motion started at 0: P0 (Ty(α) < t) = P0 (M1(α) (t) > y).
.
Thus using Corollary 25.6, one has P0 (Ty(α) < t) = P0 (M1(α) (t) > y) .
∞ h−1 = 4α (1 − 2α) h=1
∞ (2h−1)y √ t
1 z2 √ exp{− } dz. 2 2π
The result is immediate after taking the derivative of the above expression with (α) (1−α) respect to t. For the case .y < 0, use the relation .P0 (Ty ∈ dt) = P0 (T−y ∈ dt) and the case .y > 0.
460
25 Special Topic: Skew Random Walk and Skew Brownian Motion
Exercises 1. Compute the first passage time density (breakthrough curves) for onedimensional Brownian motion, with and without drift. 2. Show that if {Xn : n = 0, 1, . . . } is a sequence of random maps with values in a Banach space S such that Xn converges in distribution to X, then for any sequence of random maps {Yn : n = 0, 1, . . . } such that ||Yn || → 0 in probability as n → ∞, where 0 is the zero vector in S, one has that Xn + Yn converges in distribution to X. [Hint: By Alexandrov’s theorem,12 it suffices to check that Ef (Xn + Yn ) → Ef (X) as n → ∞ for bounded, uniformly continuous functions f : S → R.] 3. Verify the formulae for the transition probabilities for skew Brownian motion. (α) [Hint: Begin with the calculation of P0 (Bt > y) in the case x = 0 and y > 0. Next consider the case x > 0, y > 0 by conditioning on the hitting time τ0 = inf{t : Bt = 0} of 0 starting from x. The case x < 0, y > 0 is also obtained from the reflection principle. Finally, use symmetry to obtain the case x > 0, y < 0 from x < 0, y > 0.] 4. Let {Yn : n ≥ 0} denote the skew random walk with parameter α starting at zero, and let rα be the reversal function rα (x) = (1−α)x1[0,∞) (x)+αx1(−∞,0] . Show that Mn = rα (Yn ), n = 0, 1, 2, . . . is a martingale. 5. Show that the embedding of simple symmetric random walk in Brownian motion is recursively by waiting for the Brownian motion to escape intervals of unit length centered at respective hitting positions; i.e., starting at S0 = B0 = 0, S1 = ±1 according to BT1 = ±1, S2 = S1 ± 1 according to BT2 = BT1 ± 1, etc. 6. Verify that in the case of constant diffusion coefficient D + = D − = D, the solution to (25.1) for twice-continuously differentiable initial data c0 , is given by c(t, y) = Ey c0 (Yt ), t ≥ 0, y ∈ R, where Y is a Brownian motion with diffusion coefficient D and zero drift. 7. For fixed y, the transition probabilities p(α) (t; x, y) satisfy both the interface condition and continuity in the backward variable x. 8. Verify that for fixed x the transition probabilities p(t; x, y) given by (25.11) satisfy the continuity of flux condition in the forward variable y, and is continuous in the forward variable y. In fact, p(t; x, y) is jointly continuous in the backward and forward variables. 9. Verify Corollary 25.3. 10. (Pitman and Yor Inversion Lemma) Prove Lemma 7. [Hint: Define A(z) = ∞ dk m m=1 am z . Note that k!bk = dzk A(1) so that A(1 + θ ) = B(θ ), |θ | < θ1 − 1. k Now compute A(z) = B(z − 1) = ∞ k=1 bk (z − 1) by the binomial (Taylor) k expansion of (z − 1) . Note absolute convergence from A(2 + z) = B(1 + z).]
12 See
BCPT, p. 137.
Exercises
461
11. Let {Sn : n ≥ 0}, S= 0 be the simple symmetric random walk. Let ρ0 = 0, ρj = inf{k > ρj −1 : Sk = 0}, j ≥ 1 denote the successive return times to zero, τj = ρj − ρj −1 , j ≥ 1 the length of time between returns, and Nn = |{0 < k ≤ n : Sk = 0}| the number of returns to zero by time n. Also let μj = maxρj −1 0,
(26.2)
is a general model with this feature, generalizing the logistic and exponential models as special cases, and will be assumed throughout the treatment presented 1 Kingsland
(1982) provides a very nice summary of this history with extensive references. Also see Bacaer (2011) for the special case of logistic growth. 2 The Richards (1959) model is also known as the theta-logistic model Lande et al. (2003); Gilpin and Ayala (1973). © Springer Nature Switzerland AG 2023 R. Bhattacharya, E. C. Waymire, Continuous Parameter Markov Processes and Stochastic Differential Equations, Graduate Texts in Mathematics 299, https://doi.org/10.1007/978-3-031-33296-8_26
463
464
26 Piecewise Deterministic Markov Processes
in this chapter. Here the parameter k is referred to as the carrying capacity of the population. Note from (26.1) that the population size N increases for .N < k and decreases for .N > k, with a unique maximum at .N = k. While these and other classical models differ in details, the solutions share a common averaging dynamic that is noteworthy. Namely, a suitably transformed3 measure of the population size evolves as a temporally weighted average between the (transformed) initial population size and the (transformed) carrying capacity. The proof rests on a more basic fact that every increasing, continuously differentiable function .x(t), .0 ≤ t < ∞, can be represented as a weighted average by time-varying weights of its initial data .x0 and its asymptotic limit .x∞ < ∞ (assumed finite for simplicity) as follows: Lemma 1 Suppose that .x : [0, ∞) → [x0 , x∞ ) is continuously differentiable and monotonically increasing with .x∞ = limt→∞ x(t) < ∞, and let .ν > 0 be arbitrary. Then there exists a monotone (increasing or decreasing) function .h : [x0 , x∞ ] → R such that h(x(t)) = h(x∞ )(1 − e−νt ) + h(x0 )e−νt , t ≥ 0.
.
Proof Since .x(t) is continuously differentiable and increasing, it is invertible with continuously differentiable inverse .τ (x), where .τ : [x0 , x∞ ) → [0, +∞), such that .τ (x0 ) = 0, .limx→x∞ τ (x) = +∞, and .dτ/dx > 0. Let .c1 and .c2 be arbitrary real numbers such that .c1 = c2 . Define .h : [x0 , x∞ ] → R as follows: h(x) := c1 (1 − e−ντ (x) ) + c2 e−ντ (x)
.
(26.3)
Then h is continuously differentiable, with derivative .
dh dτ = (c1 − c2 )ν . dx dx
Since .c1 = c2 , .ν > 0 and .dτ/dx > 0, it follows that h is increasing (if .c1 > c2 ) or decreasing (if .c1 < c2 ). In either case, h is invertible, and by taking the inverse in the definition (26.3) and setting .x = x(t), we obtain the claimed result by noting that .c1 = h(x∞ ) and .c2 = h(x0 ) (recall that .τ (x0 ) = 0 and .limx→x∞ τ (x) .= +∞). Such averaging dynamics explicitly reveal the (non-transformed) function in Lemma 1 to be represented by x(t) = h−1 h(x∞ )(1 − e−νt ) + h(x0 )e−νt , t ≥ 0.
.
(26.4)
3 One-to-one transformations, e.g., logarithms, exponentials, reciprocals, centering and scalings, are often found to be convenient when graphing and analyzing biological data.
26 Piecewise Deterministic Markov Processes
465
In particular, if h is such that for all .N(0) = N0 ∈ (0, k], h(N(t)) = h(k)(1 − e−νt ) + h(N0 )e−νt , t ≥ 0,
.
then .x(t) = h(N(t)) will solve the affine-linear equation .
dx(t) d = {h(k)(1 − e−νt ) + h(N0 )e−νt } dt dt = −ν{h(k)(1 − e−νt ) + h(N0 )e−νt − h(k)} = −νh(x(t)) + νh(k),
(26.5)
for all .t ≥ 0 and all .N0 ∈ (0, k]. The converse holds as well. This simple realization will be useful for analyzing the disturbance model to be introduced below (see (26.8)). In particular, for Richards growth (also see Exercise 5), h(N) = 1/N θ
.
transforms the nonlinear equation (26.1) (for (26.2)) into the affine linear equation (26.5) with .ν = θ r. The solution of (26.1), (26.2) is given by .
1 1 1 = θ (1 − e−θrt ) + θ e−θrt , N θ (t) k N0
(26.6)
or equivalently, .N θ (t) evolves as a harmonic average of .k θ and .N0θ of the form N θ (t) =
.
1 1 kθ
(1 − e−θr t ) + 1θ e−θr t N0
.
(26.7)
Next consider the possibility of episodic disturbances of the population, i.e., random catastrophes. Familiar examples of catastrophes are numerous, including severe storms, meteor impacts, epidemics, forest fires, floods, droughts, infestations, volcanic eruptions, and so on. The episodic nature of such disturbances means that it is more natural to represent their occurrence times as a Poisson process in time with intensity parameter .λ determining the mean frequency of occurrence. One can model the resulting mortality either by subtracting a random number from the population or by assuming that only a random fraction of the population survives the disturbance. However, the latter multiplicative model is natural in this case since it scales with the population size, i.e., the mortality in an additive model, can be larger than the total size of the population. Note that whether the mortality due to a
466
26 Piecewise Deterministic Markov Processes
catastrophe is additive4 or multiplicative, the resulting stochastic process for the population size, .N (t), is no longer continuous but rather piecewise continuous, with jump discontinuities occurring at the times of catastrophes.5 When treating disturbance models, the blanket assumption that the disturbance factors are positive with positive probability is made throughout. Obviously if disturbances .D are permitted to destroy the entire population with probability one, i.e., .P (D = 0) = 1, then there can be no recovery for the given model. The stochastic process of interest here falls within a general class of piecewise deterministic Markov processes6 singled out by Davis (1984),7 in which a singlespecies population undergoes deterministic growth determined by an ordinary differential equation (26.1) but which also experiences random, episodic disturbances that remove a random fraction of the population. In the present model, net growth is deterministic, while the frequency and magnitude of disturbances that lead to mortality are treated as stochastic. The competition between the population’s net reproductive rate and its mortality rate due to disturbances sets up a situation where critical thresholds can be computed in terms of model parameters that determine what will happen to the size of the population in the long term. This model can be expressed more precisely as follows: .
dN (t) = G(N(t)), dt
τi ≤ t < τi+1 ,
N (τi ) = Di N(τi− ),
N (0) = N0 > 0, with N0 < k,
i = 0, 1, 2, . . . , (26.8)
where .0 = τ0 < τ1 < τ2 < · · · is the sequence of arrival times of a Poisson renewal process .{Λ(t) : t ≥ 0} with intensity .λ > 0 and .D1 , D2 , . . . is a sequence of independent and identically distributed (i.i.d.) disturbance factors on the interval .[0, 1] and independent of the arrival time process. These disturbance factors determine the fraction of the population that survives a given disturbance. The Richards growth curve .G = Gθ will be considered for specificity of results.8 One may note that .Gθ (0) = 0 implies that .N = 0 is an absorbing state for (26.8). In particular the Dirac (point mass) probability distribution .δ0 is an 4 See Chapter 11, Example 1 for results in the case of additive perturbations of the population. Also
see Goncalves et al. (2022), Schlomann (2018) for alternative models. and Tuckwell (1978, 1981, 1997) appear to be among the earliest to consider population dynamics models that included random catastrophes. In each of their papers, these authors modeled the disturbance times with a Poisson event process and represented the growth between disturbances by deterministic, logistic growth. 6 This class of models gained attention in Bayesian statistics, e.g., Azais and Bouguet (2018), cellular biology, e.g., Crudu et al. (2012), micro-insurance, e.g., Henshaw (2022), and fluid flows, e.g., Majda and Tong (2016), to name a few other applications outside of the present chapter. 7 Davis (1984) provides a complete characterization of the piecewise deterministic Markov processes in terms of the general form of their infinitesimal generator. 8 More general considerations involving the qualitative features of the growth curves are included in Peckham et al. (2018). 5 Hanson
26 Piecewise Deterministic Markov Processes
467
invariant (equilibrium) distribution. In particular, (26.8) defines a reducible Markov process in the sense that the state .N = 0 is inaccessible from states in .(0, ∞). We are interested in conditions under which this is the only invariant distribution, as well as conditions in which another invariant distribution also exists on the interval .(0, k). Of course, having two distinct invariant distributions on .[0, k) means that there will be infinitely many since randomizing between the two is also invariant. In addition to the continuous time model (26.8), a natural discrete time model is obtained by considering the population sizes at the sequence of times at which disturbances occur. That is, Nn = Dn N(τn− ),
.
n = 0, 1, 2, . . . ,
τ0− = 0,
X0 = 1,
(26.9)
where .Nn is the random size of the population immediately after the nth episodic disturbance. The left-hand limit N(τn −) = lim N(t)
.
t↑τn
refers to the population size just before the nth disturbance. That such discrete parameter Markov processes may be viewed as i.i.d. iterated maps9 can be used to great advantage, especially for non-irreducible Markov processes, will naturally emerge for this application. Theorem 26.1 (Threshold for Convergence in Distribution) Let .τn be a sequence of arrival times of a Poisson process with intensity .λ > 0 and .Dn be a sequence of i.i.d. random disturbance variables on .[0, 1] which is independent of the Poisson process. Consider .Gθ (N ) = rN(1 − (N/k)θ ) for some .r > 0, .θ > 0, and .k > 0. a. If .E [ln(D1 )] + λr > 0, then .{Nn }∞ n=0 converges in distribution to a unique invariant distribution with support on .(0, k). b. If .E [ln(D1 )] + λr < 0, then .{Nn }∞ n=0 converges in distribution to 0. Moreover, in this case, the convergence to .δ{0} is exponentially fast in the (Prokhorov) metric of convergence in distribution. Proof The proof relies on some ergodic theory for discrete parameter Markov processes represented as i.i.d. iterated random maps in rather interesting ways. The specific conditions employed to prove the existence of unique invariant probabilities can be found in from Bhattacharya and Waymire (2022), Theorem 18.6, Bhattacharya and Majumdar (2007), Theorems 3.1 and 7.1, or in Diaconis and Freedman (1999). First, let us note that the discrete-time disturbance model associated with (26.7) is
9 See
Bhattacharya and Waymire (2022), Bhattacharya and Majumdar (2007), Diaconis and Freedman (1999), Schreiber (2012) for a more general perspective.
468
26 Piecewise Deterministic Markov Processes
Nn = Dn
.
1 ( k1θ
(1 − e−rθTn ) +
1 θ Nn−1
e−rθTn )1/θ
,
(n ≥ 1) .
(26.10)
The reciprocal transform for the analysis of the Richards growth model is given by Jn = 1/Nnθ ∈ (k 1−θ , ∞). Define .Yn = exp (−rθ Tn ), .An = Yn Dn−θ , and .Bn = (1 − Yn ) k −θ Dn−θ . One then has
.
Jn = An Jn−1 + Bn ,
.
J0 = 1/N0θ .
(26.11)
From (26.11) it follows from the ergodic theory (cited above) for iterated affine random maps on the complete and separable metric space .[ k1 , ∞) that under the condition that E(ln B1 )+ < ∞,
E ln A1 < 0,
.
(26.12)
Jn = 1/Nnθ , evolves to a unique, invariant distribution on .[k −θ , ∞). These conditions are readily checked. From this the assertion (a) follows because the map θ −θ , ∞) is continuous with a continuous inverse. .x → 1/x of .(0, k] onto .[k The recursion (26.10) can also be directly expressed as iterated random function dynamics, .
Nn = γA(n) ◦ γA(n−1) ◦ · · · ◦ γA(1) (N0 ),
(26.13)
.
where the random vectors (i)
(i)
A(i) = (α1 , α2 ),
.
i ≥ 1,
are i.i.d. with independent components such that .α1 = D1 ∈ (0, 1), .α2 is exponentially distributed with parameter .λ > 0, and γA (x) ≡ γ(α1 ,α2 ) (x) = α1
.
1 1 kθ
(1 − e−rα2 ) + x1θ e−rα2
.
(26.14)
To prove (b) observe that these iterated maps are Lipschitz maps on the complete and separable metric space .[0, k] with Lipschitz coefficient .α1 erα2 . Moreover .δ{0} is the unique invariant probability, with exponential convergence, under the condition (cited above) that 0 > E ln(α1 erα2 ) = E ln D1 +
.
r . λ
26 Piecewise Deterministic Markov Processes
469
Remark 26.1 While the threshold condition does not depend on the parameter .θ , when it exists, the details of the asymptotic distribution will depend on .θ . The following theorem describes the relationship between steady state distributions of the continuous and discrete time evolutions.10 Let us review that between disturbances, the deterministic law of evolution of the population continuously in time is given by an equation of the general form .
dN(t) = Gθ (N (t)), dt
N (0) = x,
(26.15)
whose solution from (26.6) may be expressed as N(t) = gθ (t, x), t ≥ 0, x > 0,
.
where the population flows .x → gθ (t, x) are continuous, one-to-one maps with a continuous inverse, such that .gθ (0, x) = x, and .gθ (s+t, x) = gθ (t, gθ (s, x)), s, t ≥ 0, x > 0. In particular, the uninterrupted evolutions considered here have unique solutions at all times for a given initial value. Also, the discrete-time post-disturbed population model is given by N0 = x,
Nn = Dn gθ (Tn , Nn−1 ), n = 1, 2, . . . .
.
(26.16)
Theorem 26.2 (Continuous and Discrete Time Invariant Distributions) Let gθ (t, x) be the flow of the deterministic system (26.15). Then
.
a. Given an invariant distribution .π for the discrete time post-disturbance population model (26.16), let Y be a random variable with distribution .π , and let T be an exponentially distributed random variable with parameter .λ, independent of Y . Then μ(C) = P (gθ (T , Y ) ∈ C),
.
C ⊂ (0, ∞),
is an invariant probability measure for the corresponding continuous time disturbance model (26.8). b. Given an invariant probability measure .μ, for the continuous time disturbance model (26.8), let Y be a random variable with distribution .μ, and let X be distributed as the random disturbance factor .D1 in .(0, 1), independent of Y . Then π(C) = P (XY ∈ C),
.
C ⊂ (0, ∞),
10 This result follows as a special case of a much more general theory for piecewise deterministic Markov processes given by Costa (1990). A proof for a more general class of growth rates is also provided in Peckham et al. (2018).
470
26 Piecewise Deterministic Markov Processes
is an invariant distribution for the corresponding discrete time post-disturbance model (26.16). Proof For an invariant probability .μ, continuous time evolution can be expressed in terms of the semigroup of linear contraction operators defined on .L2 (μ) by T (t)f (x) = Ex f (N (t)),
t ≥ 0, x > 0,
.
via its infinitesimal generator given by Af (x) =
.
d f (gθ (t, x))|t=0 + λ{Ef (Xx) − f (x)}. dt
To derive this simply observe that up to .o(t) error as .t ↓ 0, either one or no disturbance will occur in the time interval .[0, t). Thus .
f (gθ (t, x))e−λt − f (x) 1 t T (t)f (x) − f (x) + = E (f (Xgθ (s, x))) λe−λs ds + o(t). t t 0 t
The first term is, by the product differentiation rule, .
d f (gθ (t, x))e−λt − f (gθ (0, x))e−λ0 f (gθ (t, x))e−λt |t=0 → dt t d = f (gθ (t, x))|t=0 − λf (x). dt
The second term is .λEf (Xx) in the limit as .t ↓ 0. Since .μ is an invariant probability for this continuous time evolution, then one d has from the adjoint (Fokker-Planck) equation .A∗ μ = dt μ = 0 for the adjoint operator. In particular, for f belonging to the domain of A as an (unbounded) operator on .L2 (μ), 0 =< f, A∗ μ >=< Af, μ >=
∞
.
Af (x)μ(dx), f ∈ DA ⊂ L2 (μ).
0
In the case of the discrete time evolution, the one-step transition operator is defined by Mf (x) = Ef (Xgθ (T , x)), x > 0.
.
The condition for .π to be an invariant probability for the discrete time evolution is that for .f ∈ L2 (π ) ⊂ L1 (π ), .
0
∞
∞
Mf (x)π(dx) =
f (x)π(dx). 0
26 Piecewise Deterministic Markov Processes
471
In particular, it suffices to consider indicator functions .f = 1C , C ⊂ (0, ∞), in which case one has ∞ . P (Xgθ (T , x) ∈ C) π(dx) = π(C). 0
These are the essential calculations required for the proofs of (a) and (b). Let’s begin with part (a). First note from the definition of .μ that
∞
.
∞ ∞
Af (x)μ(dx) =
0
0
Af (gθ (t, y))λe−λt dt π(dy).
0
Now, in view of the above calculation of A, one has
∞
.
Af (gθ (t, y))λe−λt dt =
0
∞
∂f (gθ (t, x)) + λ[Ef (Xgθ (t, x)) − f (gθ (t, x)])λe−λt dt. ∂t
( 0
After an integration by parts, this yields
∞
.
Af (gθ (t, y))λe−λt dt = λ{Ef (Xgθ (T , x)) − f (x)}
0
Thus, using this and the invariance of .π for the discrete process, one has .
∞
∞
Af (x)μ(dx) = λ
0
{Ef (Xgθ (T , x)) − f (x)}π(dx) = 0.
0
This proves part (a). To prove part (b), first apply A to the function .x → P (Xgθ (T , x) ∈ C). First note from the composition property and an indicated change of variable .P (Xgθ (T , x)
∈ C) = P (Xgθ (T + t, x) ∈ C) = eλt
∞ t
P (Xg(s, x) ∈ C)λe−λs ds.
In particular the first term of .AP (Xgθ (T , x) ∈ C) is .
d P (Xgθ (T , x) ∈ C)|t=0 = λ{P (Xgθ (T + t, x) ∈ C) − P (Xx ∈ C)}. dt
Adding this to the second term yields AP (Xgθ (T , x) ∈ C) = λ{
.
∞
P (Xgθ (T , y) ∈ C)P (Xx ∈ dy) − P (Xx ∈ C)}.
0
Integrating with respect to the continuous time invariant distribution .μ yields
472
26 Piecewise Deterministic Markov Processes
∞
0=λ
.
0
{
∞
P (Xgθ (T , y) ∈ C)P (Xx ∈ dy) − P (Xx ∈ C)}μ(dx),
0
or equivalently
∞ ∞
.
0
P (Xgθ (T , y) ∈ C)P (Xx ∈ dy)μ(dx) =
0
∞
P (Xx ∈ C)μ(dx).
0
∞ But since by definition .π(dy) = 0 P (Xx ∈ dy)μ(dx), this is precisely the condition ∞ . P (Xgθ (T , y) ∈ C)π(dy) = π(C), 0
i.e., that .π is an invariant probability for the discrete time distribution.
The following corollary demonstrates the relationship between invariant distributions of the discrete-time and continuous time stochastic processes as stated in general in Theorem 26.2 for the Richards growth model. Corollary 26.3 Assume that the conditions of Theorem 26.1 hold and that 1/p := r/λ > −E ln D1 .
.
Then the rescaled continuous time disturbed Richards model . Nk has the invariant cumulative distribution function
x
μk (0, x] =
(
.
0
y −θ − x −θ λ ) θr πk (dy) y −θ − 1
0 ≤ x ≤ 1,
(26.17)
where .πk is the rescaled invariant distribution for the discrete-time distributed Richards model from (a) in Theorem 26.1. Proof Assume .r/λ > −E(ln D1 ). Let .πk (dx) denote the invariant distribution for the discrete-time post-disturbance Richards model, rescaled by k to a distribution on .[0, 1]. That is if Y denotes a random variable with distribution .π(dx) on .[0, k], then let .Yk = k −1 Y and denote its distribution by .πk (dx) on .[0, 1]. Scaling the post-disturbance evolution accordingly, one has Nn = Dn [e−rθt + . k Thus, letting
Nn−1 k
−θ
1
(1 − e−rθT )]− θ .
26 Piecewise Deterministic Markov Processes
473
gθ (T , Y ) = [e−rθT + Yk−θ (1 − e−rθT )]− θ , 1
.
where T is exponentially distributed with parameter .λ > 0 and independent of .Yk , by Theorem 26.2 the invariant distribution of the population size rescaled by . k1 can be computed as follows: P (gθ (T , Yk ) ≤ x) = P (T ≥ −
.
x −θ − Yk−θ 1 ), Yk ≤ x) ln( θr 1 − Yk−θ
= E[1[Yk ≤x] ( =
x
( 0
x −θ − Yk−θ 1 − Yk−θ
λ
) θr ]
y −θ − x −θ λ ) θr πk (dy) y −θ − 1
0 ≤ x ≤ 1.
Corollary 26.4 Assume a uniformly distributed disturbance on .[0, 1] and .θ = 1. Then the invariant distribution function for the (rescaled) population in the continuous time Richards (logistic) growth model is given by the Beta distribution μk [0, x] = Cp x 1−p ,
.
0 ≤ x ≤ q, p =
λ . r
Proof For uniformly distributed disturbances on .[0, 1] and .θ = 1, one has 1/p = r/λ > 1 = −E ln D1 .
.
Thus the discrete parameter invariant probability exists, and one may compute as in the previous corollary that .Yk has the pdf .Cp (1 − y)p y −p , 0 ≤ y ≤ 1. From here one has
x
μk [0, x] =
(
.
0
y −1 − x −1 p ) Cp (1 − y)p y −p dy y −1 − 1
= Cp x 1−p ,
0 ≤ x ≤ 1, p =
λ . r
A natural extension of the disturbances introduced in this chapter would allow for climatic effects that could produce a gradual increase in the average frequency of disturbances in the Poisson process. That is, .Λ(t), t ≥ 0, would be replaced by a time-inhomogeneous Poisson process with a nondecreasing intensity function θ .λ(t), t ≥ 0, e.g., .λ(t) = t , or .λ(t) = log(1 + t), t ≥ 0. However, if
474
26 Piecewise Deterministic Markov Processes
∞
.
0 λ(t)dt = ∞, then it is possible to homogenize the Poisson process by an appropriate (nonlinear) change in time scale (Exercise 3). Under such a time change, the general form of the model remains the same, but the parameters of the logistic curve then become time-dependent. Another way in which the model may be modified is to permit immigrations into the population after the catastrophic events. Such modifications could make interesting projects for further analysis.
Exercises 1. For simple exponential growth, G∞ (N ) = rN for some r > 0, N(τn− ) = Nn−1 er Tn
(i) Show that Nn = N0 nk=1 [er Tk Dk ]. where N0 is a given initial condition in (0, k), Tn ≡ τn − τn−1 (n > 1) is the random time interval between disturbances and Dn is the fraction of the population that survives the nth disturbance. (ii) Show that Yn = exp(r Tn ) takes values in [1, ∞) and has a Paraeto distribution with cumulative distribution function FYn (s) = 1 − s −p , s > 0, where p := λ/r > 0. Here, EYn = p/(p − 1) = (λ/r)/((λ/r) − 1) if λ/r > 1 and is infinite otherwise. 2. (Disturbed Exponential Growth Model) Let τn be a sequence of arrival times of a Poisson process with intensity λ > 0, and Dn be a sequence of i.i.d. random disturbance variables on [0, 1] which is independent of the Poisson process. Consider G∞ (N ) = rN for some r > 0. (a) Show (i) If 0 < E [ln D1 ] + λr < ∞, then Nn → ∞ a.s. as n → ∞. (ii) If −∞ < E [ln D1 ] + λr < 0, then Nn → 0 a.s. as n → ∞. ⎧ ⎪ if E(D1 ) + λr < 1 ⎪ ⎨0, (b) Show that E (Nn ) → N0 , if E(D1 ) + λr = 1 . ⎪ ⎪ ⎩∞, if E(D ) + r > 1 1
λ
[Hint:Consider the recursive equations for the expected values.] ∞ 3. Assume 0 λ(t)dt = ∞. Show that it is possible to homogenize the Poisson process to a unit rate Poisson process by an appropriate (nonlinear) change in time scale. 4. For the logistic model (θ = 1), show that E
.
⎧ ⎨
ED1−1 , λ k(1− r (ED1−1 )−1))
1 → ⎩∞, Nn
if E D1−1 − if E D1−1 −
r λ
−1
11 This
growth curve dates back to Gompertz (1825).
Appendix A
The Hille–Yosida Theorem and Closed Graph Theorem
For simplicity, we assume the Banach space .X in the next theorem to be real. See Definitions 2.3, 2.4, and 2.7 for context. Theorem A.1 (Hille–Yosida Theorem) A linear operator A on .X is the infinitesimal generator of a strongly continuous contraction semigroup if and only if (1) A is densely defined and (2) for all .λ > 0, .λ ∈ ρ(A), and .||Rλ || ≤ 1/λ. Proof The necessity proof (“only if”) is contained in Theorem 2.7. Sufficiency. Consider the resolvent operator .Rn for .n = 1, 2, . . .. Since .(n − A)Rn g = g = Rn (n − A)g .∀g ∈ DA , ARn g = Rn Ag
.
∀g ∈ DA
(A.1)
Also, .ARn is a bounded operator. Indeed, .(n − A)Rn = I implies, by hypothesis, ARn = nRn − I.
.
(||ARn || ≤ ||nRn || + ||I || ≤ 2).
(A.2)
Now .∀g ∈ DA , .Rn (n − A)g = g implies ||(nRn − I )g|| = ||Rn Ag|| ≤
.
1 ||Ag|| → 0. n
Since .||nRn − I || ≤ 2 and .DA is dense in .X , the last relation implies ||nRn g − g|| → 0.
.
∀g ∈ X .
(A.3)
(n = 1, 2, . . .)
(A.4)
Now define (see (A.1)) A(n) = nARn = n(nRn − I ).
.
© Springer Nature Switzerland AG 2023 R. Bhattacharya, E. C. Waymire, Continuous Parameter Markov Processes and Stochastic Differential Equations, Graduate Texts in Mathematics 299, https://doi.org/10.1007/978-3-031-33296-8
477
478
A The Hille–Yosida Theorem and Closed Graph Theorem
Then .A(n) is a bounded operator (.||A(n) || ≤ n||nRn − I || ≤ 2n) and, by (A.1) and (A.3), A(n) f = nARn f = nRn Af → Af
∀f ∈ DA .
.
(n)
For each .n = 1, 2, . . . consider the semigroup .{etA contraction:
: 0 ≤ t < ∞}. This is a
||etA || = ||etn(nRn −I ) || = ||e−nt · en (n)
.
≤ e−nt e||tn
2R
n ||
(A.5)
2R
n
||
≤ e−nt · ent = 1.
Now one has, using the lemma below: (n)
(m)
||etA f − etA
.
f || ≤ t||A(n) f − A(m) f ||
(∀f ∈ X ).
Combining this with (A.5), one gets (n)
(m)
||etA f − etA
.
f || → 0 as n, m → ∞, ∀f ∈ DA .
(A.6)
Now define the operators .Tt by (n)
Tt f = lim etA f.
.
(f ∈ DA )
n→∞
(n)
Since .||Tt f || ≤ limn→∞ ||etA f || ≤ ||f ||, .||Tt || ≤ 1. Hence .Tt can be uniquely extended to .X as a contraction. Moreover, (n)
(n)
Tt+s f = lim e(t+s)A f = lim etA
.
n→∞
n→∞
(n)
· esA f = Tt Ts f. (n)
Now for .f ∈ DA , (see (2.45) applied to the semigroup .etA ) (n)
Tt f − f = lim etA f − f
.
n→∞
= lim =
n→∞ 0 t
t
(n)
esA A(n) f ds
Ts Af ds
(n)
(since ||esA || ≤ 1, and A(n) f → f ).
0
(A.7) Hence .t → Tt f is continuous on .[0, ∞) .∀f ∈ DA . Therefore, .limt↓0 Tt f = f ∀f ∈ DA . Since .DA = X , it follows that .X0 = X . Also, (A.7) implies that the domain of the infinitesmal generator B, say, of .{Tt : 0 ≤ t < ∞} contains .DA and
.
A The Hille–Yosida Theorem and Closed Graph Theorem
479
on .DA one has .B = A. Since .1 ∈ ρ(A), .1 ∈ ρ(B) (by the necessity part), .I − B, and .I − A are one-to-one on their respective domains, agree on .DA , and are onto .X . Hence .DA = DB . Lemma 1 Let A and B be bounded operators which commute. If .||etA || ≤ 1, tB || ≤ 1 .∀t ≥ 0, then .∀t ≥ 0, .||e ||etA f − etB f || ≤ t||Af − Bf ||
∀f ∈ X .
.
Proof For every positive integer n, one may write etA f − etB f =
n
.
e
k−1 n tA
e
n−k n tB
t t e n Af − e n B f .
k=1
Hence t
t
||etA f − etB f || ≤ n||e n A f − e n B f ||.
.
(A.8)
But t
||
.
enA − I − A|| → 0, t/n
t
||
enB − I − B|| → 0. t/n
Hence t
n||e
.
t nA
f −e
t nB
t
e nBf − f e n Af − f − || → t||Af − Bf ||. f || = t|| t/n t/n
Therefore, taking the limit in (A.8) as .n → ∞, the desired inequality is obtained. The notion of a closed operator given in Definition 2.2 is particularly useful in the context of linear operators on a Banach space. Its proof can be based on the Baire category theorem for metric spaces. Along the way we also prove the open mapping theorem, a useful tool for checking continuity of inverse maps to continuous linear bijections. Let us start with, Theorem A.2 (Baire Category Theorem) Let .X be a complete metric space with metric .ρ. If .U1 , U2 , . . . are dense open subsets of .X , then their intersection ∞ U is dense in .X . In particular, .∩∞ U = ∅. .∩ n=1 n n=1 n Proof We wish to show that .∩∞ n=1 Un = ∅ meets any non-empty open .V ⊂ X . By hypothesis, there is a .u1 ∈ U1 ∩ V and .r1 chosen sufficiently small such that .B(u1 : r1 ) ⊂ U1 ∩ V , 0 < r1 < 1. Proceed inductively for .n ≥ 2 to get .un ∈ Un , .0 < rn < 1/n, such that .B(un : rn ) ⊂ Un ∩ B(un−1 : rn−1 ). Then .{un } is a Cauchy sequence since .ρ(uk , um ) < 2rn < 2/n for .k, m > n. By completeness of .X , there
is a $u \in \mathcal{X}$ such that $u_n \to u$. For every $m > n$, $u_m \in B(u_n : r_n) \subset \overline{B(u_n : r_n)} \subset U_n$. So $u \in U_n$ for all $n$. Moreover, $u \in \overline{B(u_1 : r_1)}$ and hence $u \in V$.

The proof of the next theorem is aided by a lemma.

Lemma 2 Let $A : \mathcal{X} \to \mathcal{Y}$ be a bounded, linear, surjective map between Banach spaces $\mathcal{X}, \mathcal{Y}$. Then there is an $r > 0$ such that

$$B_{\mathcal{Y}}(0 : r) \subset A(B_{\mathcal{X}}(0 : 1)),$$

where the subscript label $\cdot = \mathcal{X}$ or $\mathcal{Y}$ denotes the space in which the indicated open ball $B_{\cdot}(0 : t)$ is defined, $t = r, 1$.

Proof Since $A$ is linear and surjective, one may express $\mathcal{Y} = \cup_{n=1}^{\infty} \overline{A(B_{\mathcal{X}}(0 : n))}$ as a union of closed sets. It follows from the Baire category theorem that $\overline{A(B_{\mathcal{X}}(0 : n))}$ has a non-empty interior for some $n$. Thus, scaling by $1/n$, the interior of $\overline{A(B_{\mathcal{X}}(0 : 1))}$ is non-empty. Thus, there is a $y_0 \in \mathcal{Y}$ and $r > 0$ such that

$$y_0 \in B_{\mathcal{Y}}(y_0 : 4r) \subset \overline{A(B_{\mathcal{X}}(0 : 1))}.$$

By symmetry one also has $-y_0 \in \overline{A(B_{\mathcal{X}}(0 : 1))}$. Noting $B_{\mathcal{Y}}(y_0 : 4r) = y_0 + B_{\mathcal{Y}}(0 : 4r)$ and since each $x \in B_{\mathcal{Y}}(0 : 4r)$ may be expressed $x = (y_0 + x) + (-y_0)$, it follows that

$$B_{\mathcal{Y}}(0 : 4r) \subset \overline{A(B_{\mathcal{X}}(0 : 1))} + \overline{A(B_{\mathcal{X}}(0 : 1))} = 2\,\overline{A(B_{\mathcal{X}}(0 : 1))},$$

with $R + S := \{r + s : r \in R, s \in S\}$, $R, S \subset \mathcal{Y}$, and the last equality derived from convexity, i.e., for convex $C$, $C + C = 2C$ (Exercise). Therefore, again by scaling, one has

$$B_{\mathcal{Y}}(0 : 2r) \subset \overline{A(B_{\mathcal{X}}(0 : 1))}. \tag{A.9}$$

Now, let $y \in B_{\mathcal{Y}}(0 : r)$. We seek $x \in B_{\mathcal{X}}(0 : 1)$ such that $Ax = y$. Using (A.9) for $2y \in B_{\mathcal{Y}}(0 : 2r)$ and $\varepsilon > 0$, there is a $z \in \mathcal{X}$ such that $\|2z\|_{\mathcal{X}} < 1$ and $\|2y - A2z\|_{\mathcal{Y}} < 2\varepsilon$. So, let $z_1 \in \mathcal{X}$ be such that $\|z_1\|_{\mathcal{X}} < 1/2$ and $\|Az_1 - y\|_{\mathcal{Y}} < r/2$. Now iterate the application of (A.9) for $4(Az_1 - y) \in B_{\mathcal{Y}}(0 : 2r)$ to obtain $z_2 \in \mathcal{X}$ such that $\|z_2\|_{\mathcal{X}} < 1/4$, $\|A(z_1 + z_2) - y\|_{\mathcal{Y}} < r/4$, and so on. This procedure yields $\{z_n\}$ in $\mathcal{X}$ such that $\|z_n\|_{\mathcal{X}} < 2^{-n}$, $\|A(z_1 + \cdots + z_n) - y\|_{\mathcal{Y}} < r2^{-n}$. In particular, $\{z_1 + \cdots + z_n\}$ being Cauchy in the Banach space $\mathcal{X}$, there is a $z \in \mathcal{X}$ such that $\lim_n (z_1 + \cdots + z_n) = z \in B(0 : 1)$, and $Az = y$.

Theorem A.3 (Open Mapping Theorem) If $A : \mathcal{X} \to \mathcal{Y}$ is a bounded, linear, surjective map between two Banach spaces $\mathcal{X}, \mathcal{Y}$, then $A$ is an open map, i.e., $A$ maps open subsets of $\mathcal{X}$ onto open subsets of $\mathcal{Y}$.
Proof Let $U \subset \mathcal{X}$ be an open set and $y \in A(U)$, say $y = Ax$, $x \in U$. Let us show that there is an open ball containing $y$ and contained in $A(U)$. Choose $r > 0$ such that $x + B_{\mathcal{X}}(0 : r) \subset U$, and therefore $y + A(B_{\mathcal{X}}(0 : r)) \subset A(U)$. By Lemma 2, there is a $t > 0$ such that $B_{\mathcal{Y}}(0 : t) \subset A(B_{\mathcal{X}}(0 : r))$. Thus $A(U) \supset y + B_{\mathcal{Y}}(0 : t) = B_{\mathcal{Y}}(y : t)$, i.e., $A(U)$ is open.

Note that if $A : \mathcal{X} \to \mathcal{Y}$ is a bounded linear map on normed linear spaces then, by continuity, the graph $\{(x, Ax) : x \in \mathcal{X}\}$ is a closed subset of the product space $\mathcal{X} \times \mathcal{Y}$ given the product topology (Exercise). In the case of Banach spaces, the converse holds, and, among other things, this provides an alternative approach to proving that a linear map is bounded.

Theorem A.4 (Closed Graph Theorem) Let $A$ be a linear map on a Banach space $\mathcal{X}$ into a Banach space $\mathcal{Y}$. If the graph $G_A = \{(x, Ax) : x \in \mathcal{X}\}$ is a closed subset of the product space $\mathcal{X} \times \mathcal{Y}$, then $A$ is a bounded linear map.

Proof Since $G_A$ is a closed subspace of the Banach space $\mathcal{X} \times \mathcal{Y}$ for the norm $|||(x, y)||| = \|x\|_{\mathcal{X}} + \|y\|_{\mathcal{Y}}$, $G_A$ is also a Banach space for the norm $|||\cdot|||$. The projection of $G_A$ onto $\mathcal{X}$ given by $\hat{A}(x, Ax) = x$, $x \in \mathcal{X}$, defines a bounded, linear bijection, i.e., one-to-one and onto. Thus, $\hat{A}$ is open by Theorem A.3, and hence $\hat{A}^{-1} : \mathcal{X} \to G_A$ is continuous. Composing $\hat{A}^{-1}$ with the (continuous) second coordinate projection of $\mathcal{X} \times \mathcal{Y}$ gives $A$, so $A$ is also continuous.
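A standard illustration of Theorem A.4, not part of the text, is the Hellinger–Toeplitz argument: let $H$ be a Hilbert space and let $A : H \to H$ be linear and symmetric, i.e., $\langle Ax, y\rangle = \langle x, Ay\rangle$ for all $x, y \in H$. If $x_n \to x$ and $Ax_n \to z$, then for every $y \in H$,

$$\langle z, y\rangle = \lim_n \langle Ax_n, y\rangle = \lim_n \langle x_n, Ay\rangle = \langle x, Ay\rangle = \langle Ax, y\rangle,$$

so $z = Ax$. Hence the graph of $A$ is closed, and by Theorem A.4, $A$ is bounded.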
References
Aldous D, Shields P (1988) A diffusion limit for a class of randomly-growing binary trees. Prob Thry Rel Fields 79(4):509–542 Appuhamillage T, Sheldon D (2012) First passage time of skew Brownian motion. J Appl Probab 49(3):685–696 Appuhamillage TA, Bokil VA, Thomann E, Waymire E, Wood BD (2010) Solute transport across an interface: a Fickian theory for skewness in breakthrough curves. Water Resour Res 46(7):W07511. https://doi.org//10.1029/2009WR008258 Appuhamillage T, Bokil V, Thomann E, Waymire E, Wood B (2011) Occupation and local times for skew Brownian motion with applications to dispersion across an interface. Ann Appl Probab 21(1):183–214 Appuhamillage TA, Bokil VA, Thomann EA, Waymire EC, Wood BD (2014) Skew disperson and continuity of local time. J Stat Phys 156(2):384–394 Aris R (1956) On the dispersion of a solute in a fluid flowing through a tube. Proc Roy Soc Ser A 235:67–77 Arnold L (1974) Stochastic differential equations: theory and applications. Wiley-Interscience, New York Athreya KB (1985) Discounted branching random walks. Adv Appl Probab 17(1):53–66 Athreya KB, Ney P (1978) A new approach to the limit theory of recurrent Markov chains. Trans Amer Math Soc 245:493–501 Avellaneda M, Majda AJ (1992) Mathematical models with exact renormalization for turbulent transport. II. Fractal interfaces, non-Gaussian statistics and the sweeping effect. Comm Math Phys 146:139–204 Azais R, Bouguet F (Eds) (2018) Statistical inference for piecewise-deterministic Markov processes. John Wiley & Sons Bacaer N (2011) Verlulst and the logistic equation (1838). In: A short history of mathematical population dynamics. Springer, London Barndorff-Nielsen O, Halgreen C (1977) Infinite divisibility of the hyperbolic and generalized inverse Gaussian distributions. Zeit Wahr und verw Gebiete 38(4):309–311 Barndorff-Nielsen O, Blaesild P, Halgreen C (1978) First hitting time models for the generalized inverse Gaussian distribution. Stochast Proc Appl 7(1):49–54 Barndorff-Nielsen OE, Mikosch T, Resnick SI (Eds) (2001) Lévy processes: theory and applications. Springer Science & Business Media, Berlin Basak GK, Bhattacharya RN (1992) Stability in distribution for a class of singular diffusions. Ann Probab 20(1):312–321
Bass R (2014) A stochastic differential equation with a sticky point. Electron J Probab 19(32):1–22 Batchelor GK (1952) The effect of homogeneous turbulence on material lines and surfaces. Proc R Soc Lond Ser A Math Phys Sci 213(1114):349–366 Bell J (2015) The Dunford-Pettis theorem, University of Toronto Ben Arous G, Owhadi H (2003) Multiscale homogenization with bounded ratios and anomalous slow diffusion. Comm Pure Appl. Math. A J Issue Courant Instit Math Sci 56(1):80–113 Bensoussan A, Lions JL, Papanicolaou G (1978) Asymptotic analysis for periodic structures, vol 374. American Mathematical Society, Providence Berkowitz B, Cortis A, Dror I, Scher H (2009) Laboratory experiments on dispersive transport across interfaces: the role of flow direction. Water Resour Res 45(2): W02201. https://doi.org// 10.1029/2008WR007342 Best K, Pfaffelhuber P (2010) The Aldous-Shields model revisited with application to cellular ageing. Elect Comm Probab 15:475–488 Bhattacharya RN (1978) Criteria for recurrence and existence of invariant measures for multidimensional diffusions. Ann Prob 6:541–553 Bhattacharya R (1985) A central limit theorem for diffusions with periodic coefficients. Ann Prob 13(2):385–396 Bhattacharya R (1999) Multiscale diffusion processes with periodic coefficients and an application to solute transport in porous media. Ann Appl Probab 9:951–1020 Bhattacharya RN (1982) On the functional central limit theorem and the law of the iterated logarithm for Markov processes. Zeit für Wahr und verw Geb 60(2):185–201 Bhattacharya RN, Goetze F (1995) Time scales for Gaussian approximation and its break down under a hierarchy of periodic spatial heterogeneities. Bernoulli 1:81–123. Correction ibid 1996:107–108 Bhattacharya RN, Gupta VK (1984) On the Taylor-Aris theory of solute transport in a capillary. SIAM J Appl Math 44:33–39 Bhattacharya R, Majumdar M (2007) Random dynamical systems: theory and applications. Cambridge University Press, Cambridge Bhattacharya R, Majumdar M (2022) Extinct or endangered: possible or inevitable. In: Pure and applied functional analysis. Special Issue in Honor of Roy Radner, Zaslavski A, Ali Khan M (eds) Bhattacharya RN, Ramasubramanian S (1982) Recurrence and ergodicity of diffusions. J Multivar Anal 12(1):95–122 Bhattacharya R, Wasielak A (2012) On the speed of convergence of multidimensional diffusions to equilibrium. Stoch Dyn 12(1):1150003 Bhattacharya R, Waymire E (1990, 2009) Stochastic processes with applications. Wiley, New York; Published (2009) In: Classics in applied mathematics series, vol 61. SIAM, Philadelphia Bhattacharya R, Waymire E (2016) A basic course in probability theory. Springer, New York. Errata: https://sites.science.oregonstate.edu/~waymire/ Bhattacharya R, Waymire E (2021) Random walk, brownian motion, and martingales. Graduate texts in mathematics. Springer, New York Bhattacharya R, Waymire E (2022) Stationary processes and discrete parameter Markov processes. Graduate texts in mathematics. Springer, New York Bhattacharya RN, Gupta VK, Walker H (1989) Asymptotics of solute dispersion in periodic porous media. SIAM J Appl Math 49(1):86–98 Bhattacharya R, Denker M, Goswami A (1999) Speed of convergence to equilibrium and to normality for diffusions with multiple periodic scales. Stoch Proc Appl 80(1):55–86 Bhattacharya R, Thomann E, Waymire E (2001) A note on the distribution of integrals of geometric Brownian motion. Stat Probab Lett 55(2):187–192 Billingsley P (1961) The Lindeberg-Levy theorem for martingales. 
Proc Amer Math Soc 12(5):788–792 Billingsley P (1968) Convergence of probability measures. Wiley, New York Billingsley P (1995) Probability and measure, 3rd edn. Wiley, New York
Blackwell D (1958) Another countable Markov process with only instantaneous states. Ann Math Statist 29(1):313–316 Bokil VA, Gibson NL, Nguyen SL, Thomann EA, Waymire EC (2020) An Euler-Maruyama method for diffusion equations with discontinuous coefficients and a family of interface conditions. J Comp App Math 368:112545 Bose A, Dasgupta A, Rubin H (2002) A contemporary review and bibliography of infinitely divisible distributions and processes. Sankhy¯a Indian J Statist Ser A 64:763–819 Breiman L (1968) Probability. Addison Wesley, Reading. SIAM, Philadelphia Brooks JK, Chacon RV (1983) Diffusions as a limit of stretched Brownian motions. Adv Math 49(2):109–122 Cameron RH, Martin WT (1944) An expression for the solution of a general class of non-linear integral equations. Amer J Math 66:281–298 Cameron RH, Martin WT (1945) Transformations of Wiener integrals under a general class of linear transformations. Trans Amer Math Soc 58:184–219 Chen MF, Wang FY (1995) Estimation of the first eigenvalue of second order elliptic operators. J Funct Anal 131(2):345–363 Chen MF (2006) Eigenvalues, inequalities, and ergodic theory. Springer Science & Business Media, Berlin Chen L, Dobson S, Guenther R, Orum C, Ossiander M, Thomann E, Waymire E (2003) On Itô’s complex measure condition. Institute of Mathematical Statistics lecture notes-monograph series, pp 65–80 Cherny AS, Shiryaev AN, Yor M (2002) Limit behavior of the “horizontal–vertical”random walk and some extensions of the Donsker–Prokhorov invariance principle.Th Probab Appl 47(3):498–516 Chung KL (1967) Markov chains with stationary transition probabilities. Springer Comtet A, Monthus C, Yor M (1998) Exponential functionals of Brownian motion and disordered systems. J Appl Probab 35:255–271 Costa OLV (1990) Stationary distributions for piecewise-deterministic Markov processes. J Appl Probab 27(1):60–73 Crump KS (1975) On point processes having an order statistic structure. Sankhya Indian J Statist Ser A 37:396–404 Crudu A, Debussche A, Muller A, Radulescu O (2012) Convergence of stochastic gene networks to hybrid piecewise deterministic processes. Ann Appl Probab 1822–1859 Csáki E, Hu Y (2003) Lengths and heights of random walk excursions. Discrete random walks, DRW03, Paris, France, pp 45–52. hal-01183932 Dascaliuc R, Michalowski N, Thomann E, Waymire E (2018a) A delayed Yule process. Proc Amer Math Soc 146(3):1335–1346 Dascaliuc R, Thomann EA, Waymire EC (2018b) Stochastic explosion and non-uniqueness for .α-Riccati equation. J Math Anal Appl 476(1):53–85. Errata (2023): https://arxiv.org/abs/2303. 05482 Dascaliuc R, Pham TN, Thomann E, Waymire EC (2023a) Doubly stochastic Yule cascades (part I): the explosion problem in the time-reversible case. J Funct Anal 284(1):109722 Dascaliuc R, Pham TN, Thomann E, Waymire EC (2023b) Doubly stochastic Yule cascades (part II): the explosion problem in the non-reversible case. Ann de’l Inst Henri Poincaré (to appear) Dascaliuc R, Pham T, Thomann EA, Waymire EC (2023c) Errata to ‘Stochastic explosion and non-uniqueness for .α-Riccati equation’. J Math Anal Appl 476(1). https://arxiv.org/abs/2303. 05482 Davies PL (1986) Rates of convergence to the stationary distribution for k-dimensional diffusion processes. J Appl Prob 23(2):370–384 Davis MHA (1984) Piecewise-deterministic Markov processes: a general class of non-diffusion stochastic models. J R Statist Soc Ser B (Methodol) 46(3):353–388 Davis M, Etheridge A (2006) Louis Bachelier’s theory of speculation: the origins of modern finance. 
Princeton University Press, Princeton
Diaconis P, Stroock D (1991) Geometric bounds for eigenvalues of Markov chains. Ann Appl Probab 1:36–61 Diaconis P, Freedman D (1999) Iterated random functions. SIAM Rev 41(1):45–76 Dobrushin RL (1956) An example of a countable homogeneous Markov process all states of which are instantaneous (Russian). Teor Veroyatnost i Primenen 1:481–485 Dong C, Iksanov O, Pilipenko A (2023) On a discrete approximation of a skew stable Lé vy process. arXiv:2302.07298 Durrett R (1996) Stochastic calculus: a practical introduction, vol 6. CRC Press, Boca Raton Durrett R, Kesten H, Waymire E (1991) On weighted heights of random trees. J Theor Probab 4(1):223–237 Dunford N, Schwartz JT (1963, 1988) Linear operators, part 1: general theory, vol 10. Wiley, New York Dynkin EB (1965) Markov processes, vol 1. Springer, Berlin Einstein A (1905) On the movement of small particles suspended in a stationary liquid demanded by the molecular-kinetic theory of heat (English translation, 1956). Investigations on the theory of the Brownian movement. Ann Phys 322:549560 Ethier SN, Kurtz TG (1985) Markov processes: characterization and convergence. Wiley, New York Feller W (1940) On the integro-differential equations of purely discontinuous Markoff processes. Trans Amer Math Soc 48(3):488–515 Feller W (1952) The parabolic differential equations and the associated semigroups of transformations. Ann Math 55:468–519 Feller W (1954) Diffusion processes in one dimension. Trans Amer Math Soc 77(1):1–31 Feller W (1957) Generalized second order differential operators and their lateral conditions. Ill J Math 1(4):459–504 Feller W (1968, 1971) An introduction to probability theory and its applications, vol 1, 3rd edn, vol 2, 2nd edn. Wiley, New York Feller W, McKean HP (1956) A diffusion equivalent to a countable Markov chain. Proc Nat Acad Sci 42(6):351–354 Fill JA (1991) Eigenvalue bounds on convergence to stationarity for nonreversible Markov chains, with an application to the exclusion process. Ann Appl Probab 1:62–87 Folland GB (1984) Real analysis: modern techniques and their applications, vol 40. Wiley, Hoboken Franke B (2004) Integral inequalities for the fundamental solutions of diffusions on manifolds with divergence-free drift. Mathematische Zeitschrift 246(1):373–403 Freedman D (1971) Brownian motion and diffusion. Holden-Day, New York. Reprint Springer, New York Freedman D (1983) Constructing the general Markov chain. In: Approximating countable Markov chains. Springer, New York Fried JJ, Combarnous MA (1971) Dispersion in porous media. In: Advances in hydroscience, vol 7, pp 169–282. Elsevier, Amsterdam Friedman A (1964) Partial differential equations of parabolic type. Reprinted (2008), Courier Dover Publications, Mineola Friedman A (1973) Wandering out to infinity of diffusion processes. Trans Amer Math Soc 184:185–203 Friedman A (1975) Stochastic differential equations and applications. Academic, New York Gelhar LW, Axness CL (1983) Three-dimensional stochastic analysis of macrodispersion in aquifers. Water Resour Res 19(1):161–180 Gilpin ME, Ayala FJ (1973) Global models of growth and competition. Proc Natl Acad Sci USA 70(12):Part I, 3590–3593 Girsanov IV (1960) On transforming a certain class of stochastic processes by absolutely continuous substitution of measures. Thry Probab Its Appl 5(3):285–301
Gompertz B (1825) XXIV. On the nature of the function expressive of the law of human mortality, and on a new mode of determining the value of life contingencies. In a letter to Francis Baily, Esq. FRS & c. Philos Trans R Soc Lond 115:513–583 Goncalves B, Huillet T, Löcherbach E (2022) On population growth with catastrophes. Stoch Models 38(2):214–249 Gupta VK, Mesa OJ, Waymire E (1990) Tree-dependent extreme values: the exponential case. J Appl Probab 27(1):124–133 Handwerk A, Willems H (2007) Wolfgang Doeblin: a mathematician rediscovered. Springer, Berlin Hanson FB, Tuckwell HC (1978) Persistence times of populations with large random fluctuations. Theoret Populat Biol 14:46–61 Hanson FB, Tuckwell HC (1981) Logistic growth with random density independent disasters. Theoret Populat Biol 19:1–18 Hanson FB, Tuckwell HC (1997) Population growth with randomly distributed jumps. J Math Biol 36:169–187 Harrison M, Shepp L (1981) On skew Brownian motion. Ann Probab 9(2):309–313 Henshaw K (2022) Mathematical perspectives on insurance for low-income populations. Doctoral Dissertation, University of Liverpool, UK Hoteit H, Mose R, Younes A, Lehmann F, Ackerer P (2002) Three-dimensional modeling of mass transfer in porous media using the mixed hybrid finite elements and the random-walk methods. Math Geol 34(4):435–456 Ibragimov IA (1963) A central limit theorem for a class of dependent random variables. Thry Probab Appl 8(1):83–89 Ikeda N, Watanabe S (1989) Stochastic differential equations and diffusion processes, 2nd edn. North-Holland, Amsterdam Iksanov A, Pilipenko A (2023) On a skew stable Lévy process. Stochast Proc Appl 156:44–68 Itô K (2004) Stochastic processes: lectures given at Aarhus University. Springer Science & Business Media, Berlin Itô K (1965) Generalized uniform complex measures in the Hilbertian metric space with the application to the Feynman integral. Proc Fifth Berkeley Symp Math Stat Probab II:145–161 Itô K, McKean HP (1963) Brownian motions on a half line, Illinois. J Math 7:181–231 Itô K, McKean HP (1965) Diffusion processes and their sample paths. Reprint Springer, Berlin Itô K, Rao KM (1961) Lectures on stochastic processes, vol 24. Tata Institute of Fundamental Research, Bombay Karatzas I, Shreve SE (1991) Brownian motion and stochastic calculus, 2nd edn. Springer, New York Karlin S, Taylor HE (1981) A second course in stochastic processes. Elsevier, Amsterdam Kha´sminskii RZ (1960) Ergodic properties of recurrent diffusions and stabilization of the Cauchy problem for parabolic equations, Teoriya Veroyat Primen 5:7–28 Kingsland S (1982) The refractory model: the logistic curve and the history of population ecology. Quart Rev Biol 57(1):29–52 Lande R, Engen S, Saether B-E (2003) Stochastic population dynamics in ecology and conservation, Oxford series in ecology and evolution. Oxford University Press, New York, p 396 Lawler GF (2008) Conformally invariant processes in the plane, no 114, American Mathematical Society, Providence Le Jan YL, Sznitman AS (1997) Stochastic cascades and 3-dimensional Navier-Stokes equations. Probab Thry Rel Fields 109(3):343–366 Lejay A (2006) On the constructions of the skew Brownian motion. Probab Surv 3:413–466 Lejay A, Martinez M (2006) A scheme for simulating one-dimensional diffusion processes with discontinuous coefficients. Ann Appl Probab 16(1):107–139 Lévy P (1948) Processus stochastiques et mouvement brownien. Gauthier-Villars, Paris Lévy P (1951) Systems Markoviens et stationnaires, Cas denombrable. 
Ann Sci de l’Ecole Normale Superieure de Paris 68:327–381
Lévy P (1953) Random functions: general theory with special reference to Laplacian random functions, vol 1, no 12. University of California Press, Chicago Lévy P (1954) Théorie de l’addition des variables aléatories, 2nd edn. Gauthier-Villars, Paris (1st edn. 1937) Lindvall T (2002) Lectures on the coupling method. Courier Corporation, North Chelmsford Majda AJ, Tong XT (2016) Geometric ergodicity for piecewise contracting processes with applications for tropical stochastic lattice models. Commun Pure Appl Math 69(6):1110–1153 Mandl P (1968) Analytical treatment of one-dimensional Markov processes, vol 151. Springer, Berlin Mandelbrot BB, Van Ness JW (1968) Fractional Brownian motions, fractional noises and applications. SIAM Rev 10(4):422–437 Mao YH (2006) Convergence rates in strong ergodicity for Markov processes. Stoch Proc Appl 116(12):1964–1976 McKean HP (1969) Stochastic integrals. Academic, New York Metzler A (2013) The Laplace transform of hitting times of integrated geometric Brownian motion. J Appl Probab 50(1):295–299 Meyer PA (1966) Probability and potentials, vol 1318. Blaisdell Publishing Company, Waltham Meyn S, Tweedie RL (1993) Markov chains and stochastic stability. Cambridge University Press, Cambridge Novikov A (1972) On an identity for stochastic integrals. Thry Probab Appl 17(4):717–720 Nummelin E (1978) Splitting technique for Harris recurrent Markov chains. Z Wahrs Verw Geb 43:309–318 Obukhov A (1941) Spectral energy distribution in a turbulent flow. Izv Akad Nauk SSSR Ser Geogr i Geofiz 5:453–466 Orey S (1971) Limit theorems for Markov chain transition probabilities. van Nostrand, London Ouknine Y (1991) Skew-Brownian motion” and derived processes. Thry Prob Its Appl 35(1):163– 169 Owhadi H (2003) Anomalous slow diffusion from perpetual homogenization. Ann Prob 31(4):1935–1969 Peckham SD, Waymire EC, De Leenheer P (2018) Critical thresholds for eventual extinction in randomly disturbed population growth models. J Math Bio 77(2):495–525 Peskir G (2015) William feller, selected papers II. Springer, Berlin, pp 77–93 Pinsky RG, Dieck TT (1995) Positive harmonic functions and diffusion, vol 45. Cambridge University Press, Cambridge Pitman J, Yor M (2001) On the distribution of ranked heights of excursions of a Brownian bridge. Ann Probab 29(1):361–384 Ramirez JM (2012) Population persistence under advection-diffusion in river networks. J Math Bio 65(5):919–942 Ramirez JM, Thomann EA, Waymire EC, Haggerty R, Wood B (2006) A generalized Taylor-Aris formula and skew diffusion. Multiscale Model Simulat 5(3):786–801 Ramirez JE, Thomann EA, Waymire EC (2013) Advection-disperson across interfaces. Stat Surv 28(4):487–509 Ramirez JM, Thomann EA, Waymire EC (2016) Continuity of local time: an applied perspective. In: The fascination of probability, statistics and their applications; In: Honour of Ole E. Barndorff-Nielsen, Podolskij M, Steltzer R, Thorbjornsen S, Veraart AED (eds). Springer, Berlin Rao KM (1969) On decomposition theorems of Meyer. Math Scandinavica 24(1):66–78 Reed M, Simon B (1972) Methods of modern mathematical physics, vol 1. Academic, New York Reuter GEH (1957) Denumerable Markov processes and the associated contraction semigroups on l. Acta Mathematica 97:1–46 Revuz D, Yor M (1999) Continuous martingales and brownian motion, 3rd edn. Springer, Berlin Richards FJ (1959) A flexible growth function for empirical use. J Exper Botany 10:290–300 Rogers LC, Pitman JW (1981) Markov functions. 
Ann Probab 9:573–582 Rogers LCG, Shi Z (1995) The value of an Asian option. J Appl Probab 32(4):1077–1088
Salminen P, Stenlund D (2021) On occupation times of one-dimensional diffusions. J Theoret Probab 34(2):975–1011 Scheutzow M (2019) Lecture notes in stochastic processes III, BMS advanced course. http://page. math.tu-berlin.de/~scheutzow/WT4main.pdf Schlomann BH (2018) Stationary moments, diffusion limits, and extinction times for logistic growth with random catastrophes. J Theor Bio 454:154–163 Schreiber SJ (2012) Persistence for stochastic difference equations: a mini-review. J Differ Equ Appl 18(8):1381–1403 Skorokhod AV (1961) Stochastic equations for diffusion processes in a bounded region, I. Theory Probab Appl 6(3):264–274 Skorokhod AV (1962) Stochastic equations for diffusion processes in a bounded region, II. Theory Probab Appl 7(1):3–23 Stein EM (1970) Singular integrals and differentiability properties of functions. Princeton University Press, Princeton Stroock DW, Varadhan SS (1969) Diffusion processes with continuous coefficients, I; II. Comm Pure Appl Math 22(3):345–400; 22(4):479–530 Stroock DW, Varadhan SRS (1979) Multidimensional diffusion processes. Springer, Berlin Tanaka H (1963) Note on continuous additive functionals of the 1-dimensional Brownian motion Path. Z Wahr verw Gebiete 1:251–257 Taylor GI (1953) Dispersion of Soluble matter in a solvent flowing through a tube. Proc R Soc Ser A 219:186–203 Thomann E, Waymire EC (2003) Contingent claims on assets with conversion costs. J Stat Plan Infer 113:403–417 Walsh J (1978) A diffusion with a discontinuous local time. Asterisque 52(53):37–45 Wasielak A (2009) Various limiting criteria for multidimensional diffusion processes, PhD Thesis University of Arizona, Tucson Winter CL, Neuman SP, Newman CM (1984) Prediction of far-field subsurface radionuclide dispersion coefficients from hydraulic conductivity measurements: a multidimensional stochastic theory with application to fractured rocks (No. NUREG/CR-3612), Arizona University, Tucson (USA), Department of Hydrology and Water Resources Wooding RA (1960) Rayleigh instability of a thermal boundary layer in flow through a porous medium. J Fluid Mech 9:183–192 Woyczy´nski WA (2001) Lévy processes in the physical sciences. In: Lévy processes. Birkhauser, Boston, pp 241–266 Yosida K (1964) Functional analysis, 1st ed. Springer, New York
Related Textbooks and Monographs
The following is a list of some supplementary and/or follow-up reading, including the books cited in the text. Arnold L (1974) Stochastic differential equations: theory and applications. Wiley-Interscience, New York Asmussen S, Hering H (1983) Branching processes. Birkhäuser, Boston Athreya KB, Ney PE (1972) Branching processes. Springer, New York Bensoussan A, Lions JL, Papanicolaou G (1978) Asymptotic analysis for periodic structures, vol 374. American Mathematical Society, Providence Bhattacharya R, Waymire E (1990) Stochastic processes with applications. Wiley, New York; Published (2009) In: Classics in applied mathematics series, vol 61, SIAM, Philadelphia Bhattacharya R, Majumdar M (2007) Random dynamical systems: theory and applications. Cambridge University Press, Cambridge Bhattacharya R, Waymire E (2016) A basic course in probability. Springer, New York. Errata: https://sites.science.oregonstate.edu/~waymire/ Bhattacharya R, Waymire E (2021) Random walk, brownian motion, and martingales. Graduate texts in mathematics. Springer, New York Bhattacharya R, Waymire E (2022) Stationary processes and discrete parameter Markov processes. Graduate texts in mathematics. Springer, New York Billingsley P (1968) Convergence of probability measures. Wiley, New York Billingsley P (1995) Probability and measure, 3rd edn. Wiley, New York Breiman L (1968) Probability. Addison Wesley, Reading. Reprint SIAM, Philadelphia Chen MF (2006) Eigenvalues, inequalities, and ergodic theory. Springer Science & Business Media, Berlin Chung KL (1967) Markov chains with stationary transition probabilities. Springer Chung KL, Williams RJ (1990) Introduction to stochastic integration, 2nd edn. Birkhauser, Boston Dieudonné J (1960) Foundations of modern analysis. Academic, New York Doob JL (1953) Stochastic processes. Wiley, New York Dunford N, Schwartz JT (1963, 1988) Linear operators, part 1: general theory, vol 10. Wiley, New York Durrett R (1984) Brownian motion and martingales in analysis. Wadsworth, Belmont Durrett R (1995) Probability theory and examples, 2nd edn. Wadsworth, Brooks & Cole, Pacific, Grove Durrett R (1996) Stochastic calculus: a practical introduction, vol 6. CRC Press, Boca Raton Dym H, McKean HP (1972) Fourier series and integrals. Academic, New York
Dynkin EB (1965) Markov processes, vol 1, 2. Springer, Berlin Ethier SN, Kurtz TG (1985) Markov processes: characterization and convergence. Wiley, New York Feller W (1968, 1971) An introduction to probability theory and its applications, vol 1. 3rd edn, vol 2, 2nd edn. Wiley, New York Folland GB (1984) Real analysis: modern techniques and their applications, vol 40. Wiley, Hoboken Freedman D (1971) Brownian motion and diffusion. Holden-Day, New York. Reprint Springer, Berlin Freedman D (1983) Constructing the general Markov chain. In: Approximating countable Markov chains. Springer, New York Friedman A (1964) Partial differential equations of parabolic type. Reprinted Courier Dover Publications, Mineola (2008) Friedman A (1969) Partial differential equations, vol 1. Holt, Reinhart and Winston, New York, p 969 Friedman A (1975) Stochastic differential equations and applications. Academic, New York Gihman II, Skokohod AV (1974) The theory of stochastic processes I. Springer, New York. English translation by S. Kotz of the original Russian published in 1971 by Nauka, Moscow Gilbarg D, Trudinger NS (1977) Elliptic partial differential equations of second order, vol 224, no 2. Springer, Berlin Hall P, Heyde CC (1980) Martingale limit theory and its application. Academic, New York Harris TE (1963) The theory of branching processes. Springer, Berlin Ikeda N, Watanabe S (1989) Stochastic differential equations and diffusion processes, 2nd edn. North-Holland, Amsterdam Itô K (2004) Stochastic processes: lectures given at Aarhus University Springer Science & Business Media, Berlin Itô K, McKean HP (1965) Diffusion processes and their sample paths. Reprint Springer, Berlin Itô K, Rao KM (1961) Lectures on stochastic processes, vol 24. Tata Institute of Fundamental Research, Bombay Jagers P (1975) Branching processes with applications to biology. Wiley, New York Kallenberg O (2001) Foundations of modern probability, 2nd edn. Springer, New York Karatzas I, Shreve SE (1991) Brownian motion and stochastic calculus, 2nd edn. Springer, New York Karlin S, Taylor HM (1975) A first course in stochastic processes. Academic, New Yok Karlin S, Taylor HM (1981) A second course in stochastic processes. Elsevier, Amsterdam Kha´sminskii R (2011) Stochastic stability of differential equations, vol 66. Springer Science& Business Media, Berlin Lawler G (2008) Conformally invariant processes in the plane. American Mathematical Society, Providence Lévy P (1925) Calcul des probabilités. Gauthier-Villars, Paris Lévy P (1948) Processus stochastiques et mouvement brownien. Gauthier-Villars, Paris Lévy P (1954) Théorie de l’addition des variables aléatories, 2nd edn. Gauthier-Villars, Paris. (1st edn. (1937)) Liggett TM (1985) Interacting particle systems. Springer, New York Liggett TM (2010) Continuous time markov processes: an introduction. Graduate studies in mathematics, vol 112. American Mathematical Society, Providence Lindvall T (2002) Lectures on the coupling method. Courier Corporation, North Chelmsford Mandl P (1968) Analytical treatment of one-dimensional Markov processes, vol 151. Springer, Berlin McKean HP (1969) Stochastic integrals. Academic, New York Meyer PA (1966) Probability and potentials, vol 1318. Blaisdell Publishing Company, Waltham Meyn S, Tweedie RL (1993) Markov chains and stochastic stability. Cambridge University Press, Cambridge
Neveu J (1971) Mathematical foundations of the calculus of probability. Holden-Day, San Francisco Neveu J (1975) Discrete parameter martingales. North-Holland, Amsterdam Nualart D (1995) The malliavin calculus and related topics. Springer, New York Oksendal B (1998) Stochastic differential equations, 5th edn. Springer, Berlin Orey S (1971) Limit theorems for Markov chain transition probabilities. van Nostrand, London Parthasarathy KR (1967) Probability measures on metric spaces. Academic, New York Pinsky RG, Dieck TT (1995) Positive harmonic functions and diffusion, vol 45. Cambridge University Press, Cambridge Reed M, Simon B (1972) Methods of modern mathematical physics, vol 1. Academic, New York Revuz D, Yor M (1999) Continuous martingales and brownian motion, 3rd edn. Springer, Berlin Rogers LCG, Williams D (2000) Diffusions, Markov processes and martingales, vol 1, 2, 2nd edn. Cambridge University Press, Cambridge Royden HL (1988) Real analysis, 3rd edn. MacMillan, New York Stein EM (1970) Singular integrals and differentiability properties of functions. Princeton University Press, Princeton Stroock DW, Varadhan SRS (1979) Multidimensional diffusion processes. Springer, Berlin Yosida K (1964) Functional analysis, 1st edn. Springer, New York
Symbol Index
Special Sets and Functions: By convention, .X(t) and .Xt in continuous parameter, and .X(n) and .Xn discrete parameter, may be used interchangeably indexing stochastic processes. In the classic notation of G.H. Hardy, one writes .a(x) = O(b(x)) to mean that there is a constant c (independent of x) such that .|a(x)| ≤ c|b(x)| for all x. Also .a(x) = o(b(x)) indicates that the ratio .a(x)/b(x) → 0 according to specified limit. .Z+ , set of non-negative integers .Z++ , set of positive integers .R+ , set of non-negative real numbers .R++ , set of positive real numbers .R0 = R\{0} .∂A, boundary of set A o .A , interior of set A − .A , closure of set A c .A , complement of set A .1B (x), 1B , indicator function of B .[X ∈ B], inverse image of the set B under X .#A, .|A|, .card(A), cardinality for finite set A .δx Dirac delta (point mass) .⊗, .σ -field product . , product of .σ −fields .p(t; x, dy), homogeneous (stationary) transition probability .p(s, t; x, dy), nonhomogeneous (nonstationary) transition probability
distribution of Markov process with initial value x .Ex expected value with respect to .Px + + .Xt = {Xt (s) ≡ Xt+s : s ≥ 0}, after-t process t .Bs = {Bu − Bs : s ≤ u ≤ t} .Fτ pre-.τ .σ -field .B , .B (S) Borel .σ -field .B0 collection of Borel subsets of .(0, ∞) × R0 .R0 = {A ∈ B0 : A ⊂ (0, δ) × {Δ : |Δ| > δ −1 for some δ > 0} .Dst (X) differential .σ -field associated with X .Δst X = X(t) − X(s) .ρ(A), resolvent set for linear operator A .B(x : r), open ball centered at x of radius r; .B(x : r), the closed ball .∂B(x : r), the boundary set of .B(x : r) Function Spaces, Elements and Operations: .(R, R) generic measurable range space for random map .K generic path space for stochastic process .C[0, 1], set of continuous, real-valued functions defined on .[0, 1] ∞ .S , infinite product space, space of infinite sequences in S .C([0, ∞) : S), set of continuous functions on .[0, ∞) with values in topological space S .Cb (S) or .Cb (S : R), set of continuous bounded, real-valued functions on a metric (or topological) space S .B(S), set of bounded, measurable real-valued functions on a measurable space .(S, S ) .Px
$C_0(S)$ or $C_0(S : \mathbb{R})$, space of continuous real-valued functions on a metric (or topological) space $S$ that vanish at infinity
$C(S : \mathbb{C})$, set of continuous, complex-valued functions on $S$
$D([0, \infty) : S)$, $D[0, T]$, Skorokhod spaces of càdlàg functions
$M[\alpha, \beta]$, space of real-valued progressively measurable stochastic processes $f(t)$, $\alpha \le t \le \beta$, such that $E\int_\alpha^\beta f^2(t)\,dt < \infty$
$L[\alpha, \beta]$, space of real-valued progressively measurable stochastic processes $f(t)$, $\alpha \le t \le \beta$, such that $P(\int_\alpha^\beta f^2(s)\,ds < \infty) = 1$
$N_A$, the null space (or kernel) of the operator $A$
$R_A$, the range space of the operator $A$
$\langle \cdot, \cdot \rangle$, inner product
$\mathbf{1}^{\perp} = \{f : \langle f, 1\rangle = 0\}$
(gradient vector) $\nabla = \mathrm{grad} = (\frac{\partial}{\partial x_j})_{1 \le j \le m}$
(divergence) $\mathrm{div}\, b(x) = \nabla \cdot b(x) = \sum_{j=1}^m \frac{\partial b_j(x)}{\partial x_j}$, $b = (b_1, \ldots, b_m)$
(generalized derivative) $D_h f(x) = \frac{df(x)}{dg(x)} = \lim_{h \to 0} \frac{f(x+h) - f(x)}{g(x+h) - g(x)}$
$D_x f(x) = \frac{df(x)}{dx}$
(Laplacian) $\Delta = \nabla \cdot \nabla = \sum_{j=1}^m \frac{\partial^2}{\partial x_j^2}$
(Hessian matrix) $\nabla' \nabla = ((\frac{\partial^2}{\partial x_i \partial x_j}))_{1 \le i, j \le m}$
$R_\lambda$, resolvent operator
$\mathcal{X}_0$, center of semigroup on Banach space $\mathcal{X}$
$A^*$, $T_t^*$, adjoint operator to $A$, $T_t$, respectively
$\zeta$, explosion time
$\ell(t, x)$, local time
$e_i$, $i$-th coordinate of unit vector $e$
i.o., infinitely often
$f * g$, convolution of functions
$Q_1 * Q_2$, convolution of probabilities
Cov, covariance
Var, variance
$\Pi_{t(1), \ldots, t(m)}(f)$, finite-dimensional projection of $f$ to $(f(t_1), \ldots, f(t_m))$
$\mathrm{sgn}(x) \equiv \frac{x}{|x|} = 1$ if $x \ge 0$, $= -1$ if $x < 0$
$\langle M \rangle_T$, quadratic variation of martingale $M_t$ on interval $0 \le t \le T$
$\Rightarrow$, weak convergence
$o_P$, little o in probability
'$\infty$' or $s_\infty$, compactification point at infinity
$A'$, $v'$, matrix, including vector, transpose
$\mathrm{tr}(A) = \sum_{i=1}^k a_{ii}$, matrix trace of $A = ((a_{ij}))_{1 \le i, j \le k}$
p-lim, limit in probability
Author Index
A Aldous, 61 Appuhamillage, 456, 457 Aris, 311, 315 Arnold, 320 Athreya, 61, 251 Avellenda, 442 Axeness, 422 Ayala, 463 Azais, 466
B Bacaer, 208, 463 Barndorff-Nielsen, 65, 71 Basak, 318 Bass, 398 Batchelor, 422 Bell, 334 Ben Arous, 421 Bensoussan, 309, 421, 429 Berkowitz, 456 Best, 61 Bhattacharya, 10, 45, 51, 61, 113, 177, 185, 196, 200, 208, 239, 240, 242, 248, 251, 258, 261, 268, 279, 293, 301, 303, 306, 309, 311, 312, 315, 318, 329, 418, 424, 429, 430, 434–436, 438, 442, 461, 466, 467 Billingsley, 5, 301, 308, 311, 435 Blackwell, 59 Blaesild, 71 Bokil, 350, 456 Bose, 73
Bouguet, 466 Breiman, 33 Brooks, 450
C Cameron, 159 Chacon, 450 Chen, 268 Cherny, 450 Chung, 59 Combarnous, 421, 422, 441 Comtet, 279 Cortis, 456 Costa, 469 Crudu, 466 Csáki, 458, 460, 461
D Dascaliuc, 61 DasGupta, 73 Davies, 239 Davis, 466 Debussche, 466 de Finetti, 73 De Leenheer, 466, 469 Denker, 442 Diaconis, 434, 467 Dieck, 185 Dobrushin, 59 Doeblin, 164 Dong, 450 Doob, 288
498 Dror, 456 Dunford, 333, 334 Durrett, 62, 415 Dynkin, 257, 376, 379 E Einstein, 282 Engen, 463 Ethier, 411 F Feller, 45, 59, 67, 171, 211, 214, 220, 268, 367, 392, 397, 411, 447 Fill, 434 Folland, 334 Foster, 255 Franke, 434 Freedman, 59, 467 Fried, 421, 422, 441 Friedman, 126, 185, 188, 225, 231, 234, 283, 309, 415 G Gelhar, 422 Gibson, 456 Giplin, 463 Girsanov, 159 Goetze, 442 Goswami, 442 Gonclalves, 466 Gupta, 62, 312, 315, 442 H Haggerty, 315 Halgreen, 71 Harrison, 450 Hormander, 327 Hoteit, 456 Hu, 458, 460, 461 Huillet, 466 I Ibragimov, 301 Ikeda, 165, 236, 327, 330, 335, 346, 349, 350 Iksanov, 450 Itô, 73, 78, 89, 288, 352, 391, 393, 411, 445 K Kakutani, 288 Karatzas, 339, 349–351
Author Index Karlin, 394 Kesten, 62 Kha´sminskii, 173, 176, 185, 198, 203, 216, 324, 325 Khinchine, 73, 91 Kingsland, 463 Kolmogorov, 73 Kurtz, 411
L Lande, 463 Lawler, 167 Le Cam, 10 Lehmann, 456 Lévy, 59, 66, 73, 89, 91, 113, 162, 166, 345, 350, 352, 391, 413 Le Jan, 45 Lejay, 445, 456 Lindvall, 239, 240, 242, 246 Löcherback, 466 Lions, 309, 421, 429 Loke, 450
M Majda, 442, 466 Majumdar, 208, 466, 467 Mandelbrot, 113 Mandl, 224, 394 Mao, 268 Martin, 159 Martinez, 456 McKean, 59, 165, 171, 236, 288, 352, 391, 393, 445 Mesa, 62 Metzler, 279 Meyer, 330 Meyn, 254, 256 Michalowski, 61 Mikosch, 65 Monthus, 279 Mose, 456 Muller, 466
N Neuman, 422 Newman, 304, 422 Ney, 251 Nguyen, 456 Novikov, 162, 173, 176 Nummelin, 251
Author Index O Obhukov, 422 Orey, 306 Ouknine, 445 Owhadi, 421, 442
P Papanicolaou, 309, 421, 429 Peckham, 466, 469 Peskir, 397 Pettis, 333, 334 Pfaffelhuber, 61 Pham, 61 Pilipenko, 450 Pinsky, 185 Pitman, 219, 239, 457, 460
R Radulescu, 466 Ramasubramanian, 261 Ramirez, 315, 350, 445, 452 Rao, 330 Reed, 438 Resnick, 65 Reuter, 52, 57 Revuz, 350 Richards, 463 Richardson, 422 Rogers, 219, 279 Rubin, 73
S Saether, 463 Salminen, 398 Scher, 456 Scheutzow, 414 Schlomann, 466 Schreiber, 467 Schwarz, 334 Sheldon, 456, 457 Shepp, 450 Shi, 279 Shields, 61 Shiryaev, 450 Shreve, 339, 349–351
499 Simon, 438 Skorokhod, 350, 352 Snell, 376 Stein, 277 Stenlund, 398 Stettner, 324 Stroock, 171, 186–188, 201, 236, 268, 412, 415, 417, 434 Sznitman, 45
T Tanaka, 350, 413 Taylor, 311, 315, 394 Thomann, 61, 278, 279, 315, 350, 445, 452, 456 Tong, 466 Tweedie, 254–256
V van Ness, 113 Varadhan, 171, 186–188, 201, 236, 268, 412, 415, 417 von Neumann, 452
W Walker, 442 Walsh, 452, 455 Wasielak, 239, 242, 258 Watanabe, 165, 236, 327, 330, 335, 346, 349, 350 Waymire, 10, 45, 51, 61, 62, 177, 196, 200, 239, 240, 242, 248, 251, 268, 278, 279, 293, 301, 303, 311, 315, 329, 350, 418, 430, 435, 438, 445, 452, 456, 461, 466, 467, 469 Winter, 422 Wood, 315, 350, 456 Wooding, 311, 316 Woyczy´nksi, 65
Y Yor, 279, 350, 450, 457, 460 Younes, 456
Subject Index
A Absorbing boundaries for diffusions, 221 Absorbing boundary, 387, 394 Absorbing boundary condition, 387 Abstract Cauchy problem, 20, 29 Accessible boundary, 221, 359 Adapted, 2 Adapted process, 76 Additive process, 73 Additive .σ -field, 76 Adhesive boundary condition, 387 Adjoint operator, 263 Affine linear coefficients, 151 After-m process, 2 After-.τ process, 3 .A-harmonic function, 142, 181 Alaoglu theorem, 335 .α-Riccati model, 61 Asian options, 278 Asymptotically flat stochastic flow, 318 Averaging dynamic, 464
B Backward equation, 276 Backward equations (abstract form), 32 Bessel processes, 229 Birth-death Markov chain, 56, 450 Birth-death with two reflecting boundaries, 29 Bounded (infinitesimal) rates, 22 Bounded rates, 52 Branching diffusion, 397 Breakthrough curve, 456 Brownian motion, 8, 72, 74, 276, 394
on circle, 230 on half-space, 175 on half-space with absorbing boundary, 226 on torus, 230 with absorprtion at zero, 222 with conormal reflection, 235 with one absorbing boundary, 408 with one reflecting boundary, 407 with two absorbing boundaries, 405 with two reflecting boundaries, 403 zero-seeking, 221
C Càdlàg, 4, 36 Càdlàg paths, 35 Càdlàg sample paths, 379 Cameron–Martin–Girsanov (CMG) Theorem, 169, 171 Carrying capacity, 208, 464 Cauchy initial value problem, 225 Cauchy process, 96 Cauchy’s functional equation, 59 Center of semigroup, 26 Chapman–Kolmogorov equation, 3, 17, 18, 20, 280 Class .C0 semigroup, 26 Class DL submartingale, 330 Closed linear operator, 25 Compound Poisson process, 33, 65, 75 Conformal invariance, 167 Conormal boundary condition, 236 Conormal reflection, 235 Conservative transition probabilities, 30
502 Conservative diffusion, 189, 216, 220, 386 Conservative process, 51 Continuity equation, 44, 283 Continuity in probability, 35 Continuity of flux, 446 Continuous in probability, 36 Continuous parameter Kolmogorov maximal inequality, 86 Continuous parameter martingale, 11 Continuous parameter optional stopping, 13 Continuous parameter transition probabilities, 17 Continuous time random walk, 34 Contraction operator, 18 Contraction semigroup, 26 Convolution semigroups, 31, 91 Coupling inequality, 10 Coupling method, 239 Coupling time, 240 Cox process, 34
D Delay distribution, 239 Denumerable set, 22 Derivatives of radial functions, 195 Deterministic motion, 34 Differential .σ -field, 76 Diffusion, 130, 221 diffusion, viii Diffusion approximation by birth-death chain, 417 Diffusion coefficient, 121 Diffusion equation, 23 Diffusion on a torus, 229 Diffusion on bounded domain, 221 Diffusion on open interval, 220 Diffusion on unit sphere, 238 Diffusion parameter, 121 Diffusions with a Markovian radial compoment, 228 Dirichlet boundary, 386, 394 Dirichlet problem, 182, 189, 277 Dirichlet problem, classical, 288 Discrete excursion representation, 450 Discrete parameter martingale, 11 Discrete parameter optional stopping, 12 Discrete skeleton of diffusion, 199 Distribution of Markov process, 1 Divergence free, 422 Domain of infinitesimal generator, 26, 357 Doob-Kakutani theorem, 288 Doob’s maximal inequality, 13 Doob-Meyer decomposition, 330
Subject Index Doubly stochastic Poisson process, 34 Drift coefficient, 121 Drift vector, 131 Duhamel’s principle, 283 Dunford–Pettis compactness criterion, 333 Dunford–Pettis theorem, 334 Dyadic numbers, 8 Dynkin’s martingale, 300, 411 Dynkin-Snell theorem, 376
E Ehrenfest model, 254 Elastic boundary, 392 Elastic boundary condition, 388 Entrance boundary, 360 Entrance boundary example, 237 Ergodic diffusion, 203, 206 Escape probability for Brownian motion, 156 Escape time, 187 Evolutionary space, 61 Excursion, 447 Existence of density for diffusion, 188 Existence of invariant measure for diffusion, 201 Existence of invariant probability for one-dimensional diffusion, 206 Exit boundary, 359 Explosion, 39, 51 Explosion of jump Markov process, 52 Explosion of multi-dimensional diffusion, 216 Explosion of one-dimensional diffusion, 214 Explosion time, 37, 130, 189 Exponential martingale, 159 Exponential rates of convergence, 266, 269
F Feller boundary classification, 220, 367, 379 Feller process, 37 Feller property, 6, 37 Feller property for skew Brownian motion, 449 Feynman–Kaˆc formula, 277 Filtration, 2, 8 adapted, 2 First passage time process, 71 Flat stochastic flow, 318, 319 Flux, 283 Fokker-Planck equation, 44, 280, 282, 470 Forward equation, 208, 282, 419, 470 Forward equation for multi-dimensional diffusion, 280 Forward equations (abstract form), 32 Foster-Tweedie criterion, 255
Subject Index Fourier frequency domain, 24 Fractional Brownian motion, 113 Functional central limit theorem for Markov processes, 301 Function of a Makov process, 227 Fundamental solution, 360
G Gaussian random field limit, 315 Gauss kernel, 24 Gauss semigroup, 24 Geometric Brownian motion, 150, 151, 157 Geometric ergodicity, 254, 255 Gompertz model, 475 Graph of linear operator, 25 Green’s function, 360, 370 Grownwall inequality, 133
H Harmonic, 287 Harmonic average, 465 Heat equation, 23 Heat semigroup, 24 Hermite polynomials, 410 Hessian, 148 Hilbert transform, 167, 176 Homogeneous Lévy measure, 93 Homogeneous Lévy process, 73, 92 Hormander’s Hypoellipticity Theorem, 327 Hypoellipticity, 327
I Inaccessible boundary, 152, 221, 359 Inaccessible boundary example, 237 Incomplete Gamma function, 475 Infinitely divisible distribution, 91 Infinitesimal genertor, 275 for the compound Poisson process, 33 multidimensional diffusion, 275 of semigroup, 26 Infinitesimal parameters, 22 Infinitesimal rates, 22, 55 Initial value problem, 276 for ordinary differential equations, 18 for parabolic partial differential equations, 23 Integrable increasing process, 330 Integrated geometric Brownian motion, 278 Integration by parts, 13, 149
503 Intensity measure, 68 Interface, 445 Invariant function of Markov process, 227 Invariant probability for diffusion, 196 Inverse Gaussian process, 71 Irreducible one-dimensional diffusion, 156 Itô’s lemma, 139, 143, 145, 148 for diffusions, 148 for square integrable martingales, 341 Itô integral, 102, 109, 123 step functional, 102 Itô isometry, 103, 125
J Jacobi identity for theta function, 410 Join of .σ -fields, 81 Jump process, 51 Jump reduced process, 78
K Kha´sminskii’s test for explosion, 216 Kolmogorov maximal inequality, 86 Kolmogorov’s nonsingular diffusion equation with singular diffusion coefficient, 327
L Lévy-Itô decomposition, 89, 90 Lévy-Khinchine formula, 91 Lévy martingale characterization of Brownian motion, 157 Lévy measure, 78, 93 Lévy modification, 75 Lévy process, 73 Lévy’s martingale characterization of Brownian motion, 106 Lévy stable process, 93 Langevin equation, 122, 137, 150 Laplace functional, 68 Laplacian operator, 23 Lebesgue’s thorn, 288 L-harmonic function, 189 Liapounov function, 321 Lipschitzian, local, global, 130 Local boundary condition, 385 Local martingale, 337, 413 Local time for Brownian motion, 345 Logistic growth model, 151 Logistic model with additive noise, 208 Logistic model with harvesting, 208
504 M Malliavin calculus, 327 Markov chain, 20 Markov function of a Markov process, 227 Markov process distribution, 1 Markov property discrete parameter, 1 Markov sample path regularity, 37 Markov semigroups, 18 Martingale characterization of Brownian motion, 162 Martingale characterization of the Poisson process, 177 Martingale problem, 413 Martingale regularity theorem, 36 Martingale transform, 176 Mass conservation, 283 Mathematical finance, 278 Matrix norm, 133, 136 Maximal inequality, 342 Maximal inequality for moments, 14 Maximal invariant, 227, 231 Maximum principle, 33, 189, 294 Mean reverting Ornstein–Uhlenbeck, 150 Method of successive approximation, 116 Minimal process for diffusion, 131 Mixed local boundary condition, 388 Molifier, 107 Multidimensional diffusion, 125 Multidimensional Ornstein–Uhlenbeck process, 157
N Natural boundary, 360 Natural integrable increasing process, 330 Natural scale, 357 Naviier-Stokes equation, 61 Neumann boundary condition, 234 Non-anticipative functional, 101 Non-anticipative functional on .[α, β], 124 Non-anticipative step functional, 338 Non-anticipative step functional on .[α, β], 102 Non-decreasing process with stationary independent increments, 66 Non-explosion, 51 Non-explosive diffusion, 216 Non-homogeneous diffusion, 133 Non-uniqueness for backward equations, 55 Normal semigroup, 24, 32 Null-recurrent diffusion, 203
Subject Index O Occupation time formula, 345, 346, 353 One-dimensional diffusion, 121 One-dimensional diffusion with absorption at zero, 224 One-sided stable process, 70 Optional stopping theorem, 12, 13 Orbit, 227 Ornstein–Uhlenbeck diffusion, 394 Ornstein–Uhlenbeck process, 121, 122, 150, 157 Overshoot, 240
P Parabolic partial differential equation , 274 Peclet number, 442 Pitman and Yor inversion lemma, 457, 460 Poincaré point, 288 Poisson approximation to binomial, 10 Poisson coupling approximation, 10 Poisson equation, 297 Poisson kernel, 176 Poisson process, 35, 65, 74 Poisson random field, 68 Poisson random measure, 68 Poisson semigroup, 30, 32 Polar identity, 106 Polymers, 279 Polynomial rates of convergence, 264, 268, 269 Population growth, 151 Positive-recurrent diffusion, 203 Pre-.τ -sigmafield, 3 Progressively measurable, 101 Progressive measurability, 101 Pure birth explosion, 39 Pure birth process, 30, 56 Pure death process, 56, 60 Pure jump, 51 Pure jump Markov process, 41
Q Q-matrix, 55 Quadratic martingale, 104 Quadratic variation, 106, 338, 342 Quadratic variation of Brownian motion, 100
R Radial Brownian motion, 394 Raleigh process, 395 Random catastrophes, 465
Subject Index Rate of transition, 22 Recurrence criteria for diffusion, 154, 195 Recurrence of 2d-Brownian motion, 294 Recurrence of diffusion, 294 Recurrent diffusion, 203 Recurrent point, 154 Recurrent point for multidimensional diffusion, 187 Recurrent state of Markov chain, 55 Reflecting boundary, 394 Reflecting boundary condition, 387 Reflecting Brownian motion, 231, 396 Reflecting Brownian motion on .[0, 1], 233 Reflecting Brownian motion on .[0, 1]k , 233 Reflecting diffusion, 230, 232, 234 on half-space, 233 on .[0, 1], 232 Regeneration method, 196 Regular boundary, 359 Regular diffusion, 355 Regularization of continuous parameter submartingales, 15 Regular submartingale, 335 Renewal method, 196 Renewal process, 240 Renewal theory, 239 Residual life time, 240 Residual lifetime, 240 Resolvent identity, 25 Resolvent operator, 25 Resolvent set, 25 Reuter’s condition for explosion, 52 Richards growth model, 463 Riesz transform, 167 Right-continuous Markov process, 379
S Scale function, 204, 357 Scale function for skew Brownian motion, 446 Scaling exponent, 95 Scaling exponent for stable law, 70 Scaling relation, 70 Self-similar process, 94, 95 Singular diffusion, 318 Skeleton process, 43 Skew Brownian motion, 315, 396, 446 Skew Brownian motion transition probabilities, 447 Skew random walk, 450 Skorokhod’s equation, 351
505 Skorokhod topology, 4 Slowly reflecting boundary, 398 Slowly reflecting point, 397 Speed function for skew Brownian motion, 446 Speed function, 204, 357 Speed measure, 357 Squared Bessel process, 229 Stability in distribution, 317 Stability under non-linear drift, 328 Stable law exponent, 94 Stable process, 93, 353 Sticky boundary, 398 Sticky point, 397 Stochastically continuous, 35 Stochastic continuity, 15 Stochastic differential equations (SDE) non-homogeneou, 133 Stochastic equivalence, 35 Stochastic flow, 318 Stochastic integral, 109, 123, 128, 129 step functional, 102 Stochastic integration by parts, 111 Stochastic logistic model, 208 Stochastic stability of a trap, 328 Stochastic transition probabilities, 30 Stopping time, 2, 8 Strong aperiodicity, 251 Strong Feller property, 186 Strongly continuous semigroup, 26, 305 Strong Markov property, 3, 8 Strong Markov property for skew Brownian motion, 449 Strong maximum principle, 181 Subordinator, 68, 353 Substochastic transition probability, 30 Support theorem, 179
T Tanaka’s equation, 413 Tanaka’s formula, 350 Theta function, 410 Theta-logistic growth curve, 463 Theta-logistic model, 463 Time change, 392 of diffusion, 165 of stochastic integral, 164 Transience criteria for diffusion, 154, 195 Transience of k-dimensional Brownian motion (.k > 2), 294 Transience of diffusion, 294 Transient point, 154
506 Transient point for multidimensional diffusion, 187 Transient state of Markov chain, 55 Transition operator, 18 Transition rate, 22 Trap, 328 Tree graph, 61 Trivial tail .σ -field, 306 Truncated generator .LN , 187 Two–dimensional Brownian motion, 294 Two-point boundary value problem, 204 U Unbounded variation, 112 V Vanish at infinity, 24, 274 Variation of parameters, 158 Vertex height, 61
Subject Index W Walsh martingale for skew Brownian motion, 452 Weak Feller property, 6 Weak forward equation, 418 Weak maximum principle, 182 Weak solution to sde, 413 Weak star topology, 335 Well-posedness of the (local) martingale problem, 415 Wright–Fisher diffusion model, 395 Wronskian, 361
Y Yule process, 39, 60
Z Zero-seeking Brownian motion, 221