English Pages 403 Year 2013
Sixth Edition
Stochastic Differential Equations An Introduction with Applications
Universitext
Bernt Øksendal
The front cover shows sample paths Xt(ω1), Xt(ω2), Xt(ω3) and Xt(ω4) of a geometric Brownian motion Xt(ω), i.e. of the solution of a (1-dimensional) stochastic differential equation of the form

    dXt/dt = (r + α·Wt)Xt,    t ≥ 0; X0 = x,

where x, r and α are constants and Wt = Wt(ω) is white noise. This process is often used to model "exponential growth under uncertainty". See Chapters 5, 10, 11 and 12. The figure is a computer simulation for the case x = r = 1, α = 0.6. The mean value of Xt, E[Xt] = exp(t), is also drawn. Courtesy of Jan Ubøe, Norwegian School of Economics and Business Administration, Bergen.
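For readers who wish to experiment, the cover figure can be reproduced numerically. The sketch below is not the book's code (it assumes NumPy is available); it uses the fact that the Itô interpretation of the equation, made precise in Chapter 5, has the explicit solution Xt = x·exp((r − α²/2)t + αBt), where Bt is Brownian motion, so that E[Xt] = x·exp(rt).

```python
import numpy as np

# Monte Carlo sketch of the cover's geometric Brownian motion at time T,
# using the explicit solution X_T = x * exp((r - alpha^2/2)*T + alpha*B_T).
rng = np.random.default_rng(0)
x, r, alpha = 1.0, 1.0, 0.6            # the cover's parameters
T, n_paths = 1.0, 100_000

B_T = rng.normal(0.0, np.sqrt(T), size=n_paths)   # B_T ~ N(0, T)
X_T = x * np.exp((r - 0.5 * alpha**2) * T + alpha * B_T)
print(X_T.mean())   # close to E[X_T] = x * exp(r*T) = e ≈ 2.718
```

With 100,000 paths the sample mean agrees with exp(T) to about two decimal places, matching the mean curve drawn on the cover.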
For further volumes: http://www.springer.com/series/223
Bernt Øksendal
Stochastic Differential Equations An Introduction with Applications
Sixth Edition, Sixth Corrected Printing With 14 Figures
Bernt Øksendal
Department of Mathematics
University of Oslo
Blindern, Oslo
Norway
Editorial board:
Sheldon Axler, San Francisco State University
Vincenzo Capasso, Università degli Studi di Milano
Carles Casacuberta, Universitat de Barcelona
Angus MacIntyre, Queen Mary, University of London
Kenneth Ribet, University of California, Berkeley
Claude Sabbah, CNRS, École Polytechnique
Endre Süli, University of Oxford
Wojbor Woyczyński, Case Western Reserve University
ISBN 978-3-540-04758-2
ISBN 978-3-642-14394-6 (eBook)
DOI 10.1007/978-3-642-14394-6
Springer Heidelberg New York Dordrecht London
Library of Congress Control Number: 2010930618
Mathematics Subject Classification (2000): 60H10, 60G35, 60G40, 60G44, 93E20, 60J45, 60J25

© Springer-Verlag Berlin Heidelberg 1985, 1989, 1992, 1995, 1998, 2003, 6th corrected printing 2013

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. Exempted from this legal reservation are brief excerpts in connection with reviews or scholarly analysis or material supplied specifically for the purpose of being entered and executed on a computer system, for exclusive use by the purchaser of the work. Duplication of this publication or parts thereof is permitted only under the provisions of the Copyright Law of the Publisher's location, in its current version, and permission for use must always be obtained from Springer. Permissions for use may be obtained through RightsLink at the Copyright Clearance Center. Violations are liable to prosecution under the respective Copyright Law.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
While the advice and information in this book are believed to be true and accurate at the date of publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for any errors or omissions that may be made. The publisher makes no warranty, express or implied, with respect to the material contained herein. Printed on acid-free paper Springer is part of Springer Science+Business Media (www.springer.com)
To My Family Eva, Elise, Anders and Karina
We have not succeeded in answering all our problems. The answers we have found only serve to raise a whole set of new questions. In some ways we feel we are as confused as ever, but we believe we are confused on a higher level and about more important things. Posted outside the mathematics reading room, Tromsø University
Preface to the Sixth Corrected Printing of the Sixth Edition
This printing contains a number of corrections of errors which, albeit not very fundamental, have been annoying and a source of confusion for readers. I am grateful to Kai Liu, Long Zhao and Lasse Leskelä for pointing out these errors and thereby helping me to improve the book. And once more I want to thank Dina Haraldsson for her kind assistance with the LaTeX files.
Oslo, May 2013 Bernt Øksendal
Preface to the Fifth Corrected Printing of the Sixth Edition
The biggest change in this printing is in Theorem 10.1.9, where the proof of part b) has been corrected and rewritten. I am indebted to Yang Ge for pointing out the gap in the previous proof and for other helpful suggestions. In addition some other corrections and improvements here and there have been carried out, mainly based on useful comments from (in alphabetical order), Nikolai Chernov, Giulia Di Nunno, Ulrik Skre Fjordheim, Sindre Froyn, Helge Holden, Gautam Iyer, Torstein Nilssen, Jan Ubøe, Zhigus Yan and Yan Zhu. I thank them all for their help. I am also grateful to Dina Haraldsson for once again offering her valuable typing assistance.
Blindern, April 2010 Bernt Øksendal
Preface to the Fourth Corrected Printing of the Sixth Edition
Basically all the corrections in this printing are due to Ralf Forster. I am grateful for his valuable help to improve the book. I am also grateful to Jerome Stein for inspiring discussions, which led to the new Exercise 8.18. I also want to thank Dina Haraldsson, who also this time helped me with her expert typing.
Blindern, February 2007 Bernt Øksendal
Preface to the Third Corrected Printing of the Sixth Edition
In this corrected printing some misprints have been corrected and other improvements in the presentation have been carried out. The exercises to which a (partial or complete) solution is provided (in the back of the book) are marked with an asterisk *. I wish to thank the following persons (in alphabetical order) for their helpful comments: Holger van Bargen, Catriona Byrne, Mark Davis, Per-Ivar Faust, Samson Jinya, Paul Kettler, Alex Krouglov, Mauro Mariani, John O'Hara, Agnès Sulem, Bjørn Thunestvedt and Vegard Trondsen. My special thanks go to Dina Haraldsson for her careful and skilled typing.
Blindern, August 2005 Bernt Øksendal
Preface to the Sixth Edition
This edition contains detailed solutions of selected exercises. Many readers have requested this, because it makes the book more suitable for self-study. At the same time new exercises (without solutions) have been added. They have all been placed at the end of each chapter, in order to facilitate the use of this edition together with previous ones. Several errors have been corrected and formulations have been improved. This has been made possible by the valuable comments from (in alphabetical order) Jon Bohlin, Mark Davis, Helge Holden, Patrick Jaillet, Chen Jing, Natalia Koroleva, Mario Lefebvre, Alexander Matasov, Thilo Meyer-Brandis, Keigo Osawa, Bjørn Thunestvedt, Jan Ubøe and Yngve Williassen. I thank them all for helping to improve the book. My thanks also go to Dina Haraldsson, who once again has performed the typing and drawn the figures with great skill.
Blindern, September 2002 Bernt Øksendal
Preface to Corrected Printing, Fifth Edition
The main corrections and improvements in this corrected printing are in Chapter 12. I have benefitted from useful comments from a number of people, including (in alphabetical order) Fredrik Dahl, Simone Deparis, Ulrich Haussmann, Yaozhong Hu, Marianne Huebner, Carl Peter Kirkebø, Nikolay Kolev, Takashi Kumagai, Shlomo Levental, Geir Magnussen, Anders Øksendal, Jürgen Potthoff, Colin Rowat, Stig Sandnes, Lones Smith, Setsuo Taniguchi and Bjørn Thunestvedt. I want to thank them all for helping me make the book better. I also want to thank Dina Haraldsson for proficient typing.
Blindern, May 2000 Bernt Øksendal
Preface to the Fifth Edition
The main new feature of the fifth edition is the addition of a new chapter, Chapter 12, on applications to mathematical finance. I found it natural to include this material as another major application of stochastic analysis, in view of the amazing development in this field during the last 10–20 years. Moreover, the close contact between the theoretical achievements and the applications in this area is striking. For example, today very few firms (if any) trade with options without consulting the Black & Scholes formula! The first 11 chapters of the book are not much changed from the previous edition, but I have continued my efforts to improve the presentation throughout and correct errors and misprints. Some new exercises have been added. Moreover, to facilitate the use of the book each chapter has been divided into subsections. If one doesn't want (or doesn't have time) to cover all the chapters, then one can compose a course by choosing subsections from the chapters. The chart below indicates what material depends on which sections. For example, to cover the first two sections of the new Chapter 12 it is recommended that one (at least) covers Chapters 1–5, Chapter 7 and Section 8.6.
Chapter 10, and hence Section 9.1, are necessary additional background for Section 12.3, in particular for the subsection on American options. In my work on this edition I have benefitted from useful suggestions from many people, including (in alphabetical order) Knut Aase, Luis Alvarez, Peter Christensen, Kian Esteghamat, Nils Christian Framstad, Helge Holden, Christian Irgens, Saul Jacka, Naoto Kunitomo and his group, Sure Mataramvura, Trond Myhre, Anders Øksendal, Nils Øvrelid, Walter Schachermayer, Bjarne Schielderop, Atle Seierstad, Jan Ubøe, Gjermund Våge and Dan Zes. I thank them all for their contributions to the improvement of the book. Again Dina Haraldsson demonstrated her impressive skills in typing the manuscript – and in finding her way in the LaTeX jungle! I am very grateful for her help and for her patience with me and all my revisions, new versions and revised revisions . . .
Blindern, January 1998 Bernt Øksendal
Preface to the Fourth Edition
In this edition I have added some material which is particularly useful for the applications, namely the martingale representation theorem (Chapter IV), the variational inequalities associated to optimal stopping problems (Chapter X) and stochastic control with terminal conditions (Chapter XI). In addition solutions and extra hints to some of the exercises are now included. Moreover, the proof and the discussion of the Girsanov theorem have been changed in order to make it easier to apply, e.g. in economics. And the presentation in general has been corrected and revised throughout the text, in order to make the book better and more useful. During this work I have benefitted from valuable comments from several persons, including Knut Aase, Sigmund Berntsen, Mark H. A. Davis, Helge Holden, Yaozhong Hu, Tom Lindstrøm, Trygve Nilsen, Paulo Ruffino, Isaac Saias, Clint Scovel, Jan Ubøe, Suleyman Ustunel, Qinghua Zhang, Tusheng Zhang and Victor Daniel Zurkowski. I am grateful to them all for their help. My special thanks go to Håkon Nyhus, who carefully read large portions of the manuscript and gave me a long list of improvements, as well as many other useful suggestions. Finally I wish to express my gratitude to Tove Møller and Dina Haraldsson, who typed the manuscript with impressive proficiency.
Oslo, June 1995
Bernt Øksendal
Preface to the Third Edition
The main new feature of the third edition is that exercises have been included to each of the chapters II–XI. The purpose of these exercises is to help the reader to get a better understanding of the text. Some of the exercises are quite routine, intended to illustrate the results, while other exercises are harder and more challenging and some serve to extend the theory. I have also continued the effort to correct misprints and errors and to improve the presentation. I have benefitted from valuable comments and suggestions from Mark H. A. Davis, Håkon Gjessing, Torgny Lindvall and Håkon Nyhus. My best thanks to them all. A quite noticeable non-mathematical improvement is that the book is now typed in TeX. Tove Lieberg did a great typing job (as usual) and I am very grateful to her for her effort and infinite patience.
Oslo, June 1991
Bernt Øksendal
Preface to the Second Edition
In the second edition I have split the chapter on diffusion processes in two, the new Chapters VII and VIII: Chapter VII treats only those basic properties of diffusions that are needed for the applications in the last 3 chapters. The readers that are anxious to get to the applications as soon as possible can therefore jump directly from Chapter VII to Chapters IX, X and XI. In Chapter VIII other important properties of diffusions are discussed. While not strictly necessary for the rest of the book, these properties are central in today’s theory of stochastic analysis and crucial for many other applications. Hopefully this change will make the book more flexible for the different purposes. I have also made an effort to improve the presentation at some points and I have corrected the misprints and errors that I knew about, hopefully without introducing new ones. I am grateful for the responses that I have received on the book and in particular I wish to thank Henrik Martens for his helpful comments. Tove Lieberg has impressed me with her unique combination of typing accuracy and speed. I wish to thank her for her help and patience, together with Dina Haraldsson and Tone Rasmussen who sometimes assisted on the typing.
Oslo, August 1989
Bernt Øksendal
Preface to the First Edition
These notes are based on a postgraduate course I gave on stochastic differential equations at Edinburgh University in the spring 1982. No previous knowledge about the subject was assumed, but the presentation is based on some background in measure theory. There are several reasons why one should learn more about stochastic differential equations: They have a wide range of applications outside mathematics, there are many fruitful connections to other mathematical disciplines and the subject has a rapidly developing life of its own as a fascinating research field with many interesting unanswered questions. Unfortunately most of the literature about stochastic differential equations seems to place so much emphasis on rigor and completeness that it scares many nonexperts away. These notes are an attempt to approach the subject from the nonexpert point of view: Not knowing anything (except rumours, maybe) about a subject to start with, what would I like to know first of all? My answer would be: 1) In what situations does the subject arise? 2) What are its essential features? 3) What are the applications and the connections to other fields? I would not be so interested in the proof of the most general case, but rather in an easier proof of a special case, which may give just as much of the basic idea in the argument. And I would be willing to believe some basic results without proof (at first stage, anyway) in order to have time for some more basic applications. These notes reflect this point of view. Such an approach enables us to reach the highlights of the theory quicker and easier. Thus it is hoped that these notes may contribute to fill a gap in the existing literature. The course is meant to be an appetizer. If it succeeds in awaking further interest, the reader will have a large selection of excellent literature available for the study of the whole story. Some of this literature is listed at the back.
In the introduction we state 6 problems where stochastic differential equations play an essential role in the solution. In Chapter II we introduce the basic mathematical notions needed for the mathematical model of some of these problems, leading to the concept of Ito integrals in Chapter III. In Chapter IV we develop the stochastic calculus (the Ito formula) and in Chapter V we use this to solve some stochastic differential equations, including the first two problems in the introduction. In Chapter VI we present a solution of the linear filtering problem (of which problem 3 is an example), using the stochastic calculus. Problem 4 is the Dirichlet problem. Although this is purely deterministic we outline in Chapters VII and VIII how the introduction of an associated Ito diffusion (i.e. solution of a stochastic differential equation) leads to a simple, intuitive and useful stochastic solution, which is the cornerstone of stochastic potential theory. Problem 5 is an optimal stopping problem. In Chapter IX we represent the state of a game at time t by an Ito diffusion and solve the corresponding optimal stopping problem. The solution involves potential theoretic notions, such as the generalized harmonic extension provided by the solution of the Dirichlet problem in Chapter VIII. Problem 6 is a stochastic version of F.P. Ramsey’s classical control problem from 1928. In Chapter X we formulate the general stochastic control problem in terms of stochastic differential equations, and we apply the results of Chapters VII and VIII to show that the problem can be reduced to solving the (deterministic) Hamilton-Jacobi-Bellman equation. As an illustration we solve a problem about optimal portfolio selection. After the course was first given in Edinburgh in 1982, revised and expanded versions were presented at Agder College, Kristiansand and University of Oslo. 
Every time about half of the audience have come from the applied section, the others being so-called "pure" mathematicians. This fruitful combination has created a broad variety of valuable comments, for which I am very grateful. I particularly wish to express my gratitude to K.K. Aase, L. Csink and A.M. Davie for many useful discussions. I wish to thank the Science and Engineering Research Council, U.K. and Norges Almenvitenskapelige Forskningsråd (NAVF), Norway for their financial support. And I am greatly indebted to Ingrid Skram, Agder College and Inger Prestbakken, University of Oslo for their excellent typing – and their patience with the innumerable changes in the manuscript during these two years. Oslo, June 1985
Bernt Øksendal
Note: Chapters VIII, IX, X of the First Edition have become Chapters IX, X, XI of the Second Edition.
Contents
1 Introduction
  1.1 Stochastic Analogs of Classical Differential Equations
  1.2 Filtering Problems
  1.3 Stochastic Approach to Deterministic Boundary Value Problems
  1.4 Optimal Stopping
  1.5 Stochastic Control
  1.6 Mathematical Finance

2 Some Mathematical Preliminaries
  2.1 Probability Spaces, Random Variables and Stochastic Processes
  2.2 An Important Example: Brownian Motion
  Exercises

3 Itô Integrals
  3.1 Construction of the Itô Integral
  3.2 Some Properties of the Itô Integral
  3.3 Extensions of the Itô Integral
  Exercises

4 The Itô Formula and the Martingale Representation Theorem
  4.1 The 1-Dimensional Itô Formula
  4.2 The Multi-Dimensional Itô Formula
  4.3 The Martingale Representation Theorem
  Exercises

5 Stochastic Differential Equations
  5.1 Examples and Some Solution Methods
  5.2 An Existence and Uniqueness Result
  5.3 Weak and Strong Solutions
  Exercises

6 The Filtering Problem
  6.1 Introduction
  6.2 The 1-Dimensional Linear Filtering Problem
  6.3 The Multidimensional Linear Filtering Problem
  Exercises

7 Diffusions: Basic Properties
  7.1 The Markov Property
  7.2 The Strong Markov Property
  7.3 The Generator of an Itô Diffusion
  7.4 The Dynkin Formula
  7.5 The Characteristic Operator
  Exercises

8 Other Topics in Diffusion Theory
  8.1 Kolmogorov's Backward Equation. The Resolvent
  8.2 The Feynman-Kac Formula. Killing
  8.3 The Martingale Problem
  8.4 When is an Itô Process a Diffusion?
  8.5 Random Time Change
  8.6 The Girsanov Theorem
  Exercises

9 Applications to Boundary Value Problems
  9.1 The Combined Dirichlet-Poisson Problem. Uniqueness
  9.2 The Dirichlet Problem. Regular Points
  9.3 The Poisson Problem
  Exercises

10 Application to Optimal Stopping
  10.1 The Time-Homogeneous Case
  10.2 The Time-Inhomogeneous Case
  10.3 Optimal Stopping Problems Involving an Integral
  10.4 Connection with Variational Inequalities
  Exercises

11 Application to Stochastic Control
  11.1 Statement of the Problem
  11.2 The Hamilton-Jacobi-Bellman Equation
  11.3 Stochastic Control Problems with Terminal Conditions
  Exercises

12 Application to Mathematical Finance
  12.1 Market, Portfolio and Arbitrage
  12.2 Attainability and Completeness
  12.3 Option Pricing
  Exercises

Appendix A: Normal Random Variables
Appendix B: Conditional Expectation
Appendix C: Uniform Integrability and Martingale Convergence
Appendix D: An Approximation Result
Solutions and Additional Hints to Some of the Exercises
References
List of Frequently Used Notation and Symbols
Index
1 Introduction
To convince the reader that stochastic differential equations is an important subject let us mention some situations where such equations appear and can be used:
1.1 Stochastic Analogs of Classical Differential Equations

If we allow for some randomness in some of the coefficients of a differential equation we often obtain a more realistic mathematical model of the situation.

Problem 1. Consider the simple population growth model

    dN/dt = a(t)N(t),    N(0) = N0 (constant)            (1.1.1)
where N(t) is the size of the population at time t, and a(t) is the relative rate of growth at time t. It might happen that a(t) is not completely known, but subject to some random environmental effects, so that we have

    a(t) = r(t) + "noise",

where we do not know the exact behaviour of the noise term, only its probability distribution. The function r(t) is assumed to be nonrandom. How do we solve (1.1.1) in this case?

Problem 2. The charge Q(t) at time t at a fixed point in an electric circuit satisfies the differential equation

    L·Q″(t) + R·Q′(t) + (1/C)·Q(t) = F(t),    Q(0) = Q0, Q′(0) = I0            (1.1.2)
where L is inductance, R is resistance, C is capacitance and F(t) is the potential source at time t.

B. Øksendal, Stochastic Differential Equations, Universitext, DOI 10.1007/978-3-642-14394-6_1, © Springer-Verlag Berlin Heidelberg 2013
Again we may have a situation where some of the coefficients, say F(t), are not deterministic but of the form

    F(t) = G(t) + "noise".            (1.1.3)
How do we solve (1.1.2) in this case? More generally, the equation we obtain by allowing randomness in the coefficients of a differential equation is called a stochastic differential equation. This will be made more precise later. It is clear that any solution of a stochastic differential equation must involve some randomness, i.e. we can only hope to be able to say something about the probability distributions of the solutions.
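Although the rigorous meaning of the noise term is only developed in the following chapters, Problem 1 can already be explored numerically. The sketch below is not from the book (NumPy is assumed, and all parameter values are illustrative): it interprets a(t) = r + "noise" as the Itô equation dNt = rNt dt + αNt dBt and applies the simplest discretization, the Euler–Maruyama scheme.

```python
import numpy as np

def euler_maruyama_growth(N0, r, alpha, T, n_steps, rng):
    """Simulate dN_t = r*N_t dt + alpha*N_t dB_t by Euler-Maruyama."""
    dt = T / n_steps
    N = N0
    for _ in range(n_steps):
        dB = rng.normal(0.0, np.sqrt(dt))    # Brownian increment over dt
        N += r * N * dt + alpha * N * dB     # one Euler-Maruyama step
    return N

rng = np.random.default_rng(1)
samples = [euler_maruyama_growth(100.0, 0.05, 0.1, 1.0, 400, rng)
           for _ in range(1000)]
print(np.mean(samples))   # close to N0 * exp(r*T) = 100 * e^0.05 ≈ 105.1
```

Each sample path is random, but the sample mean approaches N0·exp(rT), illustrating the remark above that one can only hope to say something about the probability distribution of the solution.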
1.2 Filtering Problems

Problem 3. Suppose that we, in order to improve our knowledge about the solution, say of Problem 2, perform observations Z(s) of Q(s) at times s ≤ t. However, due to inaccuracies in our measurements we do not really measure Q(s) but a disturbed version of it:

    Z(s) = Q(s) + "noise".            (1.2.1)
So in this case there are two sources of noise, the second coming from the error of measurement. The filtering problem is: What is the best estimate of Q(t) satisfying (1.1.2), based on the observations Z(s) in (1.2.1), where s ≤ t ? Intuitively, the problem is to “filter” the noise away from the observations in an optimal way. In 1960 Kalman and in 1961 Kalman and Bucy proved what is now known as the Kalman-Bucy filter. Basically the filter gives a procedure for estimating the state of a system which satisfies a “noisy” linear differential equation, based on a series of “noisy” observations. Almost immediately the discovery found applications in aerospace engineering (Ranger, Mariner, Apollo etc.) and it now has a broad range of applications. Thus the Kalman-Bucy filter is an example of a recent mathematical discovery which has already proved to be useful – it is not just “potentially” useful. It is also a counterexample to the assertion that “applied mathematics is bad mathematics” and to the assertion that “the only really useful mathematics is the elementary mathematics”. For the Kalman-Bucy filter – as the whole subject of stochastic differential equations – involves advanced, interesting and first class mathematics.
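The continuous-time Kalman-Bucy filter is derived in Chapter 6. As a foretaste, here is a minimal discrete-time, scalar analogue — a standard textbook sketch, not taken from this book, with purely illustrative model values:

```python
import numpy as np

def kalman_filter(z, a, q, r, x0, p0):
    """Estimate x_k from noisy observations z_k, where
    x_k = a*x_{k-1} + process noise (variance q),
    z_k = x_k + measurement noise (variance r)."""
    x, p = x0, p0
    estimates = []
    for zk in z:
        x, p = a * x, a * a * p + q      # predict
        k = p / (p + r)                  # Kalman gain
        x = x + k * (zk - x)             # update with the new observation
        p = (1 - k) * p
        estimates.append(x)
    return estimates

rng = np.random.default_rng(2)
true_x, z, x = [], [], 0.0
for _ in range(200):
    x = 0.99 * x + rng.normal(0.0, 0.1)      # hidden state
    true_x.append(x)
    z.append(x + rng.normal(0.0, 0.5))       # noisy observation
est = kalman_filter(z, a=0.99, q=0.01, r=0.25, x0=0.0, p0=1.0)

err_filter = np.mean([(e - t) ** 2 for e, t in zip(est, true_x)])
err_raw = np.mean([(zk - t) ** 2 for zk, t in zip(z, true_x)])
print(err_filter < err_raw)   # True: filtering beats the raw observations
```

The filtered estimates have a markedly smaller mean-square error than the raw observations, which is precisely the sense in which the filter "filters the noise away in an optimal way".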
1.3 Stochastic Approach to Deterministic Boundary Value Problems

Problem 4. The most celebrated example is the stochastic solution of the Dirichlet problem: Given a (reasonable) domain U in R^n and a continuous function f on the boundary ∂U of U, find a function f̃, continuous on the closure Ū of U, such that

(i) f̃ = f on ∂U
(ii) f̃ is harmonic in U, i.e.

    Δf̃ := Σ_{i=1}^{n} ∂²f̃/∂x_i² = 0 in U.
In 1944 Kakutani proved that the solution could be expressed in terms of Brownian motion (which will be constructed in Chapter 2): f˜(x) is the expected value of f at the first exit point from U of the Brownian motion starting at x ∈ U . It turned out that this was just the tip of an iceberg: For a large class of semielliptic second order partial differential equations the corresponding Dirichlet boundary value problem can be solved using a stochastic process which is a solution of an associated stochastic differential equation.
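Kakutani's representation can be checked by simulation. The sketch below is not from the book; it approximates Brownian motion by a random walk with Gaussian steps of variance dt, and estimates the harmonic extension on the unit disk of the boundary function f(x1, x2) = x1. Since this f is itself harmonic, the exact value at the starting point (0.3, 0.2) is simply 0.3.

```python
import numpy as np

rng = np.random.default_rng(3)

def value_at_exit(start, f, dt=4e-3):
    """Run (approximate) Brownian motion from `start` until it leaves the
    unit disk, then evaluate f at the exit point."""
    pos = np.array(start, dtype=float)
    while pos @ pos < 1.0:                          # still inside the disk
        pos += rng.normal(0.0, np.sqrt(dt), size=2)
    return f(pos)

f = lambda p: p[0]                                  # harmonic boundary data
vals = [value_at_exit((0.3, 0.2), f) for _ in range(3000)]
print(np.mean(vals))   # close to the exact harmonic value 0.3
```

The average of f over the simulated exit points agrees with the exact solution up to Monte Carlo error and the small bias of the discrete walk overshooting the boundary.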
1.4 Optimal Stopping

Problem 5. Suppose a person has an asset or resource (e.g. a house, stocks, oil...) that she is planning to sell. The price Xt at time t of her asset on the open market varies according to a stochastic differential equation of the same type as in Problem 1:

    dXt/dt = rXt + αXt · "noise",

where r, α are known constants. The discount rate is a known constant ρ. At what time should she decide to sell?

We assume that she knows the behaviour of Xs up to the present time t, but because of the noise in the system she can of course never be sure at the time of the sale if her choice of time will turn out to be the best. So what we are searching for is a stopping strategy that gives the best result in the long run, i.e. maximizes the expected profit when the inflation is taken into account. This is an optimal stopping problem. It turns out that the solution can be expressed in terms of the solution of a corresponding boundary value problem
(Problem 4), except that the boundary is unknown (free) as well and this is compensated by a double set of boundary conditions. It can also be expressed in terms of a set of variational inequalities.
1.5 Stochastic Control

Problem 6 (An optimal portfolio problem). Suppose that a person has two investment possibilities:

(i) A safe investment (e.g. a bond), where the price X0(t) per unit at time t grows exponentially:

    dX0/dt = ρX0,            (1.5.1)

where ρ > 0 is a constant.

(ii) A risky investment (e.g. a stock), where the price X1(t) per unit at time t satisfies a stochastic differential equation of the type discussed in Problem 1:

    dX1/dt = (μ + σ · "noise")X1,            (1.5.2)

where μ > ρ and σ ∈ R \ {0} are constants.

At each instant t the person can choose how large a portion (fraction) ut of his fortune Vt he wants to place in the risky investment, thereby placing (1 − ut)Vt in the safe investment. Given a utility function U and a terminal time T the problem is to find the optimal portfolio ut ∈ [0, 1], i.e. find the investment distribution ut; 0 ≤ t ≤ T which maximizes the expected utility of the corresponding terminal fortune V_T^(u):

    max_{0 ≤ ut ≤ 1} E[U(V_T^(u))]            (1.5.3)
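For logarithmic utility this problem has a well-known closed-form answer (Merton's fraction), which is easy to verify numerically. The sketch below is not from the book; it uses the fact that if a constant fraction u is held in the risky asset, then E[log V_T] = log v0 + (ρ + u(μ − ρ) − u²σ²/2)T, a concave quadratic in u maximized at u* = (μ − ρ)/σ². The parameter values are illustrative.

```python
import numpy as np

# Grid search over the constant fraction u held in the risky asset,
# using the closed-form expected log-utility of the terminal fortune.
mu, rho, sigma, T, v0 = 0.10, 0.04, 0.30, 1.0, 1.0

u = np.linspace(0.0, 1.0, 1001)
expected_log_utility = np.log(v0) + (rho + u * (mu - rho)
                                     - 0.5 * u**2 * sigma**2) * T
u_best = u[np.argmax(expected_log_utility)]
print(u_best)   # close to (mu - rho)/sigma**2 = 0.06/0.09 ≈ 0.667
```

The general problem, with an arbitrary utility function and a time-varying control ut, is the subject of Chapter 11, where it is attacked via the Hamilton-Jacobi-Bellman equation.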
1.6 Mathematical Finance

Problem 7 (Pricing of options). Suppose that at time t = 0 the person in Problem 6 is offered the right (but without obligation) to buy one unit of the risky asset at a specified price K and at a specified future time t = T. Such a right is called a European call option. How much should the person be willing to pay for such an option? This problem was solved when Fischer Black and Myron Scholes (1973) used stochastic analysis and an equilibrium argument to compute a theoretical value for the price, the now famous Black–Scholes option price formula. This theoretical value agreed well with the prices that had already been established
as an equilibrium price on the free market. Thus it represented a triumph for mathematical modelling in finance. It has become an indispensable tool in the trading of options and other financial derivatives. In 1997 Myron Scholes and Robert Merton were awarded the Nobel Prize in Economics for their work related to this formula. (Fischer Black died in 1995.) We will return to these problems in later chapters, after having developed the necessary mathematical machinery. We solve Problem 1 and Problem 2 in Chapter 5. Problems involving filtering (Problem 3) are treated in Chapter 6, the generalized Dirichlet problem (Problem 4) in Chapter 9. Problem 5 is solved in Chapter 10 while stochastic control problems (Problem 6) are discussed in Chapter 11. Finally we discuss applications to mathematical finance in Chapter 12.
2 Some Mathematical Preliminaries
2.1 Probability Spaces, Random Variables and Stochastic Processes

Having stated the problems we would like to solve, we now proceed to find reasonable mathematical notions corresponding to the quantities mentioned and mathematical models for the problems. In short, here is a first list of the notions that need a mathematical interpretation:

(1) A random quantity
(2) Independence
(3) Parametrized (discrete or continuous) families of random quantities
(4) What is meant by a “best” estimate in the filtering problem (Problem 3)
(5) What is meant by an estimate “based on” some observations (Problem 3)?
(6) What is the mathematical interpretation of the “noise” terms?
(7) What is the mathematical interpretation of the stochastic differential equations?

In this chapter we will discuss (1)–(3) briefly. In the next chapter we will consider (6), which leads to the notion of an Itô stochastic integral (7). In Chapter 6 we will consider (4)–(5). The mathematical model for a random quantity is a random variable. Before we define this, we recall some concepts from general probability theory. The reader is referred to e.g. Williams (1991) for more information.

Definition 2.1.1 If Ω is a given set, then a σ-algebra F on Ω is a family F of subsets of Ω with the following properties:

(i) ∅ ∈ F
(ii) F ∈ F ⇒ F^C ∈ F, where F^C = Ω \ F is the complement of F in Ω
(iii) A1, A2, ... ∈ F ⇒ A := ∪_{i=1}^∞ A_i ∈ F
B. Øksendal, Stochastic Differential Equations, Universitext, DOI 10.1007/978-3-642-14394-6 2, © Springer-Verlag Berlin Heidelberg 2013
The pair (Ω, F) is called a measurable space. A probability measure P on a measurable space (Ω, F) is a function P: F → [0, 1] such that

(a) P(∅) = 0, P(Ω) = 1
(b) if A1, A2, ... ∈ F and {A_i}_{i=1}^∞ is disjoint (i.e. A_i ∩ A_j = ∅ if i ≠ j) then

P( ∪_{i=1}^∞ A_i ) = Σ_{i=1}^∞ P(A_i) .
The triple (Ω, F, P) is called a probability space. It is called a complete probability space if F contains all subsets G of Ω with P-outer measure zero, i.e. with

P^*(G) := inf{P(F); F ∈ F, G ⊂ F} = 0 .

Any probability space can be made complete simply by adding to F all sets of outer measure 0 and by extending P accordingly. From now on we will assume that all our probability spaces are complete. The subsets F of Ω which belong to F are called F-measurable sets. In a probability context these sets are called events and we use the interpretation

P(F) = “the probability that the event F occurs” .

In particular, if P(F) = 1 we say that “F occurs with probability 1”, or “almost surely (a.s.)”. Given any family U of subsets of Ω there is a smallest σ-algebra H_U containing U, namely

H_U = ∩{H; H σ-algebra on Ω, U ⊂ H} .

(See Exercise 2.3.) We call H_U the σ-algebra generated by U. For example, if U is the collection of all open subsets of a topological space Ω (e.g. Ω = R^n), then B = H_U is called the Borel σ-algebra on Ω and the elements B ∈ B are called Borel sets. B contains all open sets, all closed sets, all countable unions of closed sets, all countable intersections of such countable unions etc.

If (Ω, F, P) is a given probability space, then a function Y: Ω → R^n is called F-measurable if

Y^{−1}(U) := {ω ∈ Ω; Y(ω) ∈ U} ∈ F

for all open sets U ⊂ R^n (or, equivalently, for all Borel sets U ⊂ R^n). If X: Ω → R^n is any function, then the σ-algebra H_X generated by X is the smallest σ-algebra on Ω containing all the sets

X^{−1}(U) ;  U ⊂ R^n open .
It is not hard to show that

H_X = {X^{−1}(B); B ∈ B} ,

where B is the Borel σ-algebra on R^n. Clearly, X will then be H_X-measurable and H_X is the smallest σ-algebra with this property. The following result is useful. It is a special case of a result sometimes called the Doob–Dynkin lemma. See e.g. M. M. Rao (1984), Prop. 3, p. 7.

Lemma 2.1.2 If X, Y: Ω → R^n are two given functions, then Y is H_X-measurable if and only if there exists a Borel measurable function g: R^n → R^n such that

Y = g(X) .

In the following we let (Ω, F, P) denote a given complete probability space. A random variable X is an F-measurable function X: Ω → R^n. Every random variable induces a probability measure μ_X on R^n, defined by

μ_X(B) = P(X^{−1}(B)) .

μ_X is called the distribution of X. If ∫_Ω |X(ω)| dP(ω) < ∞ then the number

E[X] := ∫_Ω X(ω) dP(ω) = ∫_{R^n} x dμ_X(x)

is called the expectation of X (w.r.t. P). More generally, if f: R^n → R is Borel measurable and ∫_Ω |f(X(ω))| dP(ω) < ∞ then we have

E[f(X)] := ∫_Ω f(X(ω)) dP(ω) = ∫_{R^n} f(x) dμ_X(x) .
The L^p-spaces

If X: Ω → R^n is a random variable and p ∈ [1, ∞) is a constant we define the L^p-norm of X, ‖X‖_p, by

‖X‖_p = ‖X‖_{L^p(P)} = ( ∫_Ω |X(ω)|^p dP(ω) )^{1/p} .

If p = ∞ we set

‖X‖_∞ = ‖X‖_{L^∞(P)} = inf{N ∈ R; |X(ω)| ≤ N a.s.} .
The corresponding L^p-spaces are defined by

L^p(P) = L^p(Ω) = {X: Ω → R^n; ‖X‖_p < ∞} .

With this norm the L^p-spaces are Banach spaces, i.e. complete (see Exercise 2.19), normed linear spaces. If p = 2 the space L^2(P) is even a Hilbert space, i.e. a complete inner product space, with inner product

(X, Y)_{L^2(P)} := E[X · Y] ;  X, Y ∈ L^2(P) .
The mathematical model for independence is the following:

Definition 2.1.3 Two subsets A, B ∈ F are called independent if

P(A ∩ B) = P(A) · P(B) .

A collection A = {H_i; i ∈ I} of families H_i of measurable sets is independent if

P(H_{i1} ∩ ··· ∩ H_{ik}) = P(H_{i1}) ··· P(H_{ik})

for all choices of H_{i1} ∈ H_{i1}, ···, H_{ik} ∈ H_{ik} with different indices i1, ..., ik. A collection of random variables {X_i; i ∈ I} is independent if the collection of generated σ-algebras H_{X_i} is independent. If two random variables X, Y: Ω → R are independent then

E[XY] = E[X]E[Y] ,

provided that E[|X|] < ∞ and E[|Y|] < ∞. (See Exercise 2.5.)

Definition 2.1.4 A stochastic process is a parametrized collection of random variables {X_t}_{t∈T} defined on a probability space (Ω, F, P) and assuming values in R^n.

The parameter space T is usually (as in this book) the halfline [0, ∞), but it may also be an interval [a, b], the non-negative integers and even subsets of R^n for n ≥ 1. Note that for each t ∈ T fixed we have a random variable

ω → X_t(ω) ;  ω ∈ Ω .

On the other hand, fixing ω ∈ Ω we can consider the function

t → X_t(ω) ;  t ∈ T
which is called a path of Xt . It may be useful for the intuition to think of t as “time” and each ω as an individual “particle” or “experiment”. With this picture Xt (ω) would represent the position (or result) at time t of the particle (experiment) ω.
Sometimes it is convenient to write X(t, ω) instead of X_t(ω). Thus we may also regard the process as a function of two variables

(t, ω) → X(t, ω)

from T × Ω into R^n. This is often a natural point of view in stochastic analysis, because (as we shall see) there it is crucial to have X(t, ω) jointly measurable in (t, ω). Finally we note that we may identify each ω with the function t → X_t(ω) from T into R^n. Thus we may regard Ω as a subset of the space Ω̃ = (R^n)^T of all functions from T into R^n. Then the σ-algebra F will contain the σ-algebra B generated by sets of the form

{ω; ω(t1) ∈ F1, ···, ω(tk) ∈ Fk} ,  F_i ⊂ R^n Borel sets .
Therefore one may also adopt the point of view that a stochastic process is a probability measure P on the measurable space ((R^n)^T, B). The (finite-dimensional) distributions of the process X = {X_t}_{t∈T} are the measures μ_{t1,...,tk} defined on R^{nk}, k = 1, 2, ..., by

μ_{t1,...,tk}(F1 × F2 × ··· × Fk) = P[X_{t1} ∈ F1, ···, X_{tk} ∈ Fk] ;  t_i ∈ T .
Here F1, ..., Fk denote Borel sets in R^n. The family of all finite-dimensional distributions determines many (but not all) important properties of the process X. Conversely, given a family {ν_{t1,...,tk}; k ∈ N, t_i ∈ T} of probability measures on R^{nk} it is important to be able to construct a stochastic process Y = {Y_t}_{t∈T} having ν_{t1,...,tk} as its finite-dimensional distributions. One of Kolmogorov's famous theorems states that this can be done provided {ν_{t1,...,tk}} satisfies two natural consistency conditions. (See Lamperti (1977) or Kallenberg (2002).)

Theorem 2.1.5 (Kolmogorov's extension theorem) For all t1, ..., tk ∈ T, k ∈ N let ν_{t1,...,tk} be probability measures on R^{nk} s.t.

ν_{tσ(1),...,tσ(k)}(F1 × ··· × Fk) = ν_{t1,...,tk}(F_{σ^{−1}(1)} × ··· × F_{σ^{−1}(k)})   (K1)

for all permutations σ on {1, 2, ..., k} and

ν_{t1,...,tk}(F1 × ··· × Fk) = ν_{t1,...,tk,t_{k+1},...,t_{k+m}}(F1 × ··· × Fk × R^n × ··· × R^n)   (K2)

for all m ∈ N, where (of course) the set on the right hand side has a total of k + m factors. Then there exists a probability space (Ω, F, P) and a stochastic process {X_t} on Ω, X_t: Ω → R^n, s.t.

ν_{t1,...,tk}(F1 × ··· × Fk) = P[X_{t1} ∈ F1, ···, X_{tk} ∈ Fk] ,

for all t_i ∈ T, k ∈ N and all Borel sets F_i.
2.2 An Important Example: Brownian Motion

In 1828 the Scottish botanist Robert Brown observed that pollen grains suspended in liquid performed an irregular motion. The motion was later explained by the random collisions with the molecules of the liquid. To describe the motion mathematically it is natural to use the concept of a stochastic process B_t(ω), interpreted as the position at time t of the pollen grain ω. We will generalize slightly and consider an n-dimensional analog.

To construct {B_t}_{t≥0} it suffices, by the Kolmogorov extension theorem, to specify a family {ν_{t1,...,tk}} of probability measures satisfying (K1) and (K2). These measures will be chosen so that they agree with our observations of the pollen grain behaviour: Fix x ∈ R^n and define

p(t, x, y) = (2πt)^{−n/2} · exp(−|x − y|^2 / 2t)  for y ∈ R^n, t > 0 .
If 0 ≤ t1 ≤ t2 ≤ ··· ≤ tk define a measure ν_{t1,...,tk} on R^{nk} by

ν_{t1,...,tk}(F1 × ··· × Fk) = ∫_{F1×···×Fk} p(t1, x, x1) p(t2 − t1, x1, x2) ··· p(tk − t_{k−1}, x_{k−1}, x_k) dx1 ··· dxk   (2.2.1)

where we use the notation dy = dy1 ··· dyk for Lebesgue measure and the convention that p(0, x, y)dy = δ_x(y), the unit point mass at x. Extend this definition to all finite sequences of t_i's by using (K1). Since ∫_{R^n} p(t, x, y)dy = 1 for all t ≥ 0, (K2) holds, so by Kolmogorov's theorem there exists a probability space (Ω, F, P^x) and a stochastic process {B_t}_{t≥0} on Ω such that the finite-dimensional distributions of B_t are given by (2.2.1), i.e.

P^x(B_{t1} ∈ F1, ···, B_{tk} ∈ Fk) = ∫_{F1×···×Fk} p(t1, x, x1) ··· p(tk − t_{k−1}, x_{k−1}, x_k) dx1 ··· dxk .   (2.2.2)
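Behind the consistency of these measures lies the Chapman–Kolmogorov identity ∫_{R^n} p(s, x, z) p(t, z, y) dz = p(s + t, x, y), which is what makes dropping an intermediate time point in (2.2.1) harmless. This is not hard to verify analytically; the following sketch (not from the text) checks it numerically for n = 1, with quadrature bounds and step size chosen arbitrarily:

```python
import math

def p(t, x, y):
    """Transition density p(t, x, y) of 1-dimensional Brownian motion."""
    return (2 * math.pi * t) ** -0.5 * math.exp(-(x - y) ** 2 / (2 * t))

def convolved(s, t, x, y, lo=-20.0, hi=20.0, steps=4000):
    """Trapezoidal approximation of the integral of p(s,x,z)*p(t,z,y) dz."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        z = lo + i * h
        w = 0.5 if i in (0, steps) else 1.0   # trapezoid endpoint weights
        total += w * p(s, x, z) * p(t, z, y)
    return total * h

lhs = convolved(0.7, 1.3, x=0.2, y=-0.5)
rhs = p(2.0, 0.2, -0.5)   # Chapman-Kolmogorov predicts lhs == rhs
```

Because the integrand decays like a Gaussian, the trapezoid rule on a wide grid is essentially exact here, so the two numbers agree to many digits.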
Definition 2.2.1 Such a process is called (a version of ) Brownian motion starting at x (observe that P x (B0 = x) = 1). The Brownian motion thus defined is not unique, i.e. there exist several quadruples (Bt , Ω, F , P x ) such that (2.2.2) holds. However, for our purposes this is not important, we may simply choose any version to work with. As we shall soon see, the paths of a Brownian motion are (or, more correctly, can be chosen to be) continuous, a.s. Therefore we may identify (a.a.) ω ∈ Ω with a continuous function t → Bt (ω) from [0, ∞) into Rn . Thus we may adopt the point of view that Brownian motion is just the space C([0, ∞), Rn ) equipped with certain probability measures P x (given by (2.2.1) and (2.2.2) above).
This version is called the canonical Brownian motion. Besides having the advantage of being intuitive, this point of view is useful for the further analysis of measures on C([0, ∞), R^n), since this space is Polish (i.e. a complete separable metric space). See Stroock and Varadhan (1979).

We state some basic properties of Brownian motion:

(i) B_t is a Gaussian process, i.e. for all 0 ≤ t1 ≤ ··· ≤ tk the random variable Z = (B_{t1}, ..., B_{tk}) ∈ R^{nk} has a (multi)normal distribution. This means that there exists a vector M ∈ R^{nk} and a non-negative definite matrix C = [c_{jm}] ∈ R^{nk×nk} (the set of all nk × nk matrices with real entries) such that

E^x[ exp( i Σ_{j=1}^{nk} u_j Z_j ) ] = exp( −(1/2) Σ_{j,m} u_j c_{jm} u_m + i Σ_j u_j M_j )   (2.2.3)

for all u = (u1, ..., u_{nk}) ∈ R^{nk}, where i = √(−1) is the imaginary unit and E^x denotes expectation with respect to P^x. Moreover, if (2.2.3) holds then

M = E^x[Z]  is the mean value of Z   (2.2.4)

and

c_{jm} = E^x[(Z_j − M_j)(Z_m − M_m)]  is the covariance matrix of Z .   (2.2.5)

(See Appendix A). To see that (2.2.3) holds for Z = (B_{t1}, ..., B_{tk}) we calculate its left hand side explicitly by using (2.2.2) (see Appendix A) and obtain (2.2.3) with
M = E^x[Z] = (x, x, ···, x) ∈ R^{nk}   (2.2.6)

and

C = ( t1 In   t1 In   ···   t1 In
      t1 In   t2 In   ···   t2 In
        ⋮       ⋮             ⋮
      t1 In   t2 In   ···   tk In )   (2.2.7)
Hence

E^x[B_t] = x  for all t ≥ 0   (2.2.8)

and

E^x[(B_t − x)^2] = nt ,  E^x[(B_t − x)(B_s − x)] = n min(s, t) .   (2.2.9)

Moreover,

E^x[(B_t − B_s)^2] = n(t − s)  if t ≥ s ,   (2.2.10)

since

E^x[(B_t − B_s)^2] = E^x[(B_t − x)^2 − 2(B_t − x)(B_s − x) + (B_s − x)^2] = n(t − 2s + s) = n(t − s) ,  when t ≥ s .
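The moment formulas (2.2.8)–(2.2.9) are easy to check by simulation, sampling B_s and B_t through independent Gaussian increments as (2.2.1) suggests. A minimal sketch, not from the text (n = 2, starting point x = (1, 2); sample sizes are arbitrary):

```python
import math
import random

# Estimate E^x[B_t] and E^x[(B_t - x).(B_s - x)] = n * min(s, t) for n = 2.
rng = random.Random(0)
x, s, t, N = (1.0, 2.0), 0.5, 1.0, 20000
mean_bt = [0.0, 0.0]
cov_est = 0.0
for _ in range(N):
    bs = [xi + rng.gauss(0.0, math.sqrt(s)) for xi in x]        # B_s ~ N(x, s*I)
    bt = [bi + rng.gauss(0.0, math.sqrt(t - s)) for bi in bs]   # independent increment
    mean_bt = [m + bi / N for m, bi in zip(mean_bt, bt)]
    cov_est += sum((b1 - xi) * (b2 - xi) for b1, b2, xi in zip(bt, bs, x)) / N
# Expect mean_bt close to x = (1, 2) and cov_est close to n*min(s,t) = 2*0.5 = 1
```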
(ii) B_t has independent increments, i.e.

B_{t1}, B_{t2} − B_{t1}, ···, B_{tk} − B_{t_{k−1}} are independent for all 0 ≤ t1 < t2 < ··· < tk .   (2.2.11)

To prove this we use the fact that normal random variables are independent iff they are uncorrelated. (See Appendix A). So it is enough to prove that

E^x[(B_{t_i} − B_{t_{i−1}})(B_{t_j} − B_{t_{j−1}})] = 0  when t_i < t_j ,   (2.2.12)
which follows from the form of C:

E^x[B_{t_i}B_{t_j} − B_{t_{i−1}}B_{t_j} − B_{t_i}B_{t_{j−1}} + B_{t_{i−1}}B_{t_{j−1}}] = n(t_i − t_{i−1} − t_i + t_{i−1}) = 0 .

From this we deduce that B_s − B_t is independent of F_t if s > t.

(iii) Finally we ask: Is t → B_t(ω) continuous for almost all ω? Stated like this the question does not make sense, because the set H = {ω; t → B_t(ω) is continuous} is not measurable with respect to the Borel σ-algebra B on (R^n)^{[0,∞)} mentioned above (H involves an uncountable number of t's). However, if modified slightly the question can be given a positive answer. To explain this we need the following important concept:

Definition 2.2.2 Suppose that {X_t} and {Y_t} are stochastic processes on (Ω, F, P). Then we say that {X_t} is a version of (or a modification of) {Y_t} if

P({ω; X_t(ω) = Y_t(ω)}) = 1  for all t .

Note that if X_t is a version of Y_t, then X_t and Y_t have the same finite-dimensional distributions. Thus from the point of view that a stochastic process is a probability law on (R^n)^{[0,∞)} two such processes are the same, but nevertheless their path properties may be different. (See Exercise 2.9.)

The continuity question of Brownian motion can be answered by using another famous theorem of Kolmogorov:

Theorem 2.2.3 (Kolmogorov's continuity theorem) Suppose that the process X = {X_t}_{t≥0} satisfies the following condition: For all T > 0 there exist positive constants α, β, D such that

E[|X_t − X_s|^α] ≤ D · |t − s|^{1+β} ;  0 ≤ s, t ≤ T .   (2.2.13)

Then there exists a continuous version of X.

For a proof see for example Stroock and Varadhan (1979, p. 51) or Kallenberg (2002), Thm. 3.23. For Brownian motion B_t it is not hard to prove that (see Exercise 2.8)

E^x[|B_t − B_s|^4] = n(n + 2)|t − s|^2 .   (2.2.14)
So Brownian motion satisfies Kolmogorov's condition (2.2.13) with α = 4, D = n(n + 2) and β = 1, and therefore it has a continuous version. From now on we will assume that B_t is such a continuous version. Finally we note that

If B_t = (B_t^{(1)}, ···, B_t^{(n)}) is n-dimensional Brownian motion, then the 1-dimensional processes {B_t^{(j)}}_{t≥0}, 1 ≤ j ≤ n, are independent, 1-dimensional Brownian motions .   (2.2.15)
Exercises

2.1. Suppose that X: Ω → R is a function which assumes only countably many values a1, a2, ... ∈ R.

a) Show that X is a random variable if and only if

X^{−1}(a_k) ∈ F  for all k = 1, 2, ...   (2.2.16)
b) Suppose (2.2.16) holds. Show that

E[|X|] = Σ_{k=1}^∞ |a_k| P[X = a_k] .   (2.2.17)

c) If (2.2.16) holds and E[|X|] < ∞, show that

E[X] = Σ_{k=1}^∞ a_k P[X = a_k] .

d) If (2.2.16) holds and f: R → R is measurable and bounded, show that

E[f(X)] = Σ_{k=1}^∞ f(a_k) P[X = a_k] .
2.2. Let X: Ω → R be a random variable. The distribution function F of X is defined by

F(x) = P[X ≤ x] .

a) Prove that F has the following properties:

(i) 0 ≤ F ≤ 1, lim_{x→−∞} F(x) = 0, lim_{x→∞} F(x) = 1 .
(ii) F is increasing (= non-decreasing).
(iii) F is right-continuous, i.e. F(x) = lim_{h→0, h>0} F(x + h) .
b) Let g: R → R be measurable such that E[|g(X)|] < ∞. Prove that

E[g(X)] = ∫_{−∞}^{∞} g(x) dF(x) ,

where the integral on the right is interpreted in the Lebesgue–Stieltjes sense.

c) Let p(x) ≥ 0 be a measurable function on R. We say that X has the density p if

F(x) = ∫_{−∞}^{x} p(y) dy  for all x .

Thus from (2.2.1)–(2.2.2) we know that 1-dimensional Brownian motion B_t at time t with B_0 = 0 has the density

p(x) = (1/√(2πt)) exp(−x^2/2t) ;  x ∈ R .

Find the density of B_t^2.

2.3.
Let {H_i}_{i∈I} be a family of σ-algebras on Ω. Prove that

H = ∩_{i∈I} H_i

is again a σ-algebra.
2.4.* a) Let X: Ω → R^n be a random variable such that

E[|X|^p] < ∞  for some p, 0 < p < ∞ .

Prove Chebychev's inequality:

P[|X| ≥ λ] ≤ E[|X|^p] / λ^p  for all λ ≥ 0 .

Hint: ∫_Ω |X|^p dP ≥ ∫_A |X|^p dP, where A = {ω: |X| ≥ λ} .

b) Suppose there exists k > 0 such that

M = E[exp(k|X|)] < ∞ .

Prove that P[|X| ≥ λ] ≤ M e^{−kλ} for all λ ≥ 0 .

2.5.
Let X, Y : Ω → R be two independent random variables and assume for simplicity that X and Y are bounded. Prove that E[XY ] = E[X]E[Y ] .
Hint: Assume |X| ≤ M, |Y| ≤ N. Approximate X and Y by simple functions

ϕ(ω) = Σ_{i=1}^{m−1} a_i X_{F_i}(ω),  ψ(ω) = Σ_{j=1}^{n−1} b_j X_{G_j}(ω),  respectively,

where F_i = X^{−1}([a_i, a_{i+1})), G_j = Y^{−1}([b_j, b_{j+1})), −M = a0 < a1 < ... < am = M, −N = b0 < b1 < ... < bn = N. Then

E[X] ≈ E[ϕ] = Σ_i a_i P(F_i),  E[Y] ≈ E[ψ] = Σ_j b_j P(G_j)

and

E[XY] ≈ E[ϕψ] = Σ_{i,j} a_i b_j P(F_i ∩ G_j) ... .
2.6.* Let (Ω, F, P) be a probability space and let A1, A2, ... be sets in F such that

Σ_{k=1}^∞ P(A_k) < ∞ .

Prove the Borel–Cantelli lemma:

P( ∩_{m=1}^∞ ∪_{k=m}^∞ A_k ) = 0 ,

i.e. the probability that ω belongs to infinitely many A_k's is zero.

2.7.* a) Suppose G1, G2, ..., Gn are disjoint subsets of Ω such that

Ω = ∪_{i=1}^n G_i .
Prove that the family G consisting of ∅ and all unions of some (or all) of G1, ..., Gn constitutes a σ-algebra on Ω.

b) Prove that any finite σ-algebra F on Ω is of the type described in a).

c) Let F be a finite σ-algebra on Ω and let X: Ω → R be F-measurable. Prove that X assumes only finitely many possible values. More precisely, there exists a disjoint family of subsets F1, ..., Fm ∈ F and real numbers c1, ..., cm such that

X(ω) = Σ_{i=1}^m c_i X_{F_i}(ω) .
2.8. Let B_t be Brownian motion on R, B_0 = 0. Put E = E^0.

a) Use (2.2.3) to prove that

E[e^{iuB_t}] = exp(−(1/2)u^2 t)  for all u ∈ R .
b) Use the power series expansion of the exponential function on both sides, compare the terms with the same power of u and deduce that E[B_t^4] = 3t^2 and more generally that

E[B_t^{2k}] = ( (2k)! / (2^k · k!) ) t^k ;  k ∈ N .

c) If you feel uneasy about the lack of rigour in the method in b), you can proceed as follows: Prove that (2.2.2) implies that

E[f(B_t)] = (1/√(2πt)) ∫_R f(x) e^{−x^2/2t} dx

for all functions f such that the integral on the right converges. Then apply this to f(x) = x^{2k} and use integration by parts and induction on k.

d) Prove (2.2.14), for example by using b) and induction on n.

2.9.* To illustrate that the (finite-dimensional) distributions alone do not give all the information regarding the continuity properties of a process, consider the following example: Let (Ω, F, P) = ([0, ∞), B, μ) where B denotes the Borel σ-algebra on [0, ∞) and μ is a probability measure on [0, ∞) with no mass on single points. Define

X_t(ω) = 1 if t = ω, 0 otherwise

and

Y_t(ω) = 0  for all (t, ω) ∈ [0, ∞) × [0, ∞) .

Prove that {X_t} and {Y_t} have the same distributions and that X_t is a version of Y_t. And yet we have that t → Y_t(ω) is continuous for all ω, while t → X_t(ω) is discontinuous for all ω.

2.10. A stochastic process X_t is called stationary if {X_t} has the same distribution as {X_{t+h}} for any h > 0. Prove that Brownian motion B_t has stationary increments, i.e. that the process {B_{t+h} − B_t}_{h≥0} has the same distribution for all t.

2.11. Prove (2.2.15).

2.12. Let B_t be Brownian motion and fix t0 ≥ 0. Prove that

B̃_t := B_{t0+t} − B_{t0} ;  t ≥ 0

is a Brownian motion.
2.13.* Let B_t be 2-dimensional Brownian motion and put

D_ρ = {x ∈ R^2 ; |x| < ρ}  for ρ > 0 .

Compute P^0[B_t ∈ D_ρ].
2.14.* Let B_t be n-dimensional Brownian motion and let K ⊂ R^n have zero n-dimensional Lebesgue measure. Prove that the expected total length of time that B_t spends in K is zero. (This implies that the Green measure associated with B_t is absolutely continuous with respect to Lebesgue measure. See Chapter 9).

2.15.* Let B_t be n-dimensional Brownian motion starting at 0 and let U ∈ R^{n×n} be a (constant) orthogonal matrix, i.e. U U^T = I. Prove that

B̃_t := U B_t

is also a Brownian motion.

2.16. (Brownian scaling). Let B_t be a 1-dimensional Brownian motion and let c > 0 be a constant. Prove that

B̃_t := (1/c) B_{c^2 t}

is also a Brownian motion.

2.17.* If X_t(·): Ω → R is a continuous stochastic process, then for p > 0 the p'th variation process of X_t, ⟨X, X⟩_t^{(p)}, is defined by

⟨X, X⟩_t^{(p)}(ω) = lim_{Δt_k→0} Σ_{t_k≤t} |X_{t_{k+1}}(ω) − X_{t_k}(ω)|^p  (limit in probability)

where 0 = t1 < t2 < ... < tn = t and Δt_k = t_{k+1} − t_k. In particular, if p = 1 this process is called the total variation process and if p = 2 this is called the quadratic variation process. (See Exercise 4.7.) For Brownian motion B_t ∈ R we now show that the quadratic variation process is simply

⟨B, B⟩_t^{(2)}(ω) = ⟨B, B⟩_t(ω) = t  a.s.
Proceed as follows:

a) Define ΔB_k = B_{t_{k+1}} − B_{t_k} and put

Y(t, ω) = Σ_{t_k≤t} (ΔB_k(ω))^2 .

Show that

E[ ( Σ_{t_k≤t} (ΔB_k)^2 − t )^2 ] = 2 Σ_{t_k≤t} (Δt_k)^2

and deduce that Y(t, ·) → t in L^2(P) as Δt_k → 0.

b) Use a) to prove that a.a. paths of Brownian motion do not have a bounded variation on [0, t], i.e. the total variation of Brownian motion is infinite, a.s.

2.18. a) Let Ω = {1, 2, 3, 4, 5} and let U be the collection U = {{1, 2, 3}, {3, 4, 5}} of subsets of Ω. Find the smallest σ-algebra containing U (i.e. the σ-algebra H_U generated by U).

b) Define X: Ω → R by

X(1) = X(2) = 0,  X(3) = 10,  X(4) = X(5) = 1.
Is X measurable with respect to H_U?

c) Define Y: Ω → R by

Y(1) = 0,  Y(2) = Y(3) = Y(4) = Y(5) = 1.

Find the σ-algebra H_Y generated by Y.

2.19. Let (Ω, F, μ) be a probability space and let p ∈ [1, ∞]. A sequence {f_n}_{n=1}^∞ of functions f_n ∈ L^p(μ) is called a Cauchy sequence if

‖f_n − f_m‖_p → 0  as n, m → ∞.
The sequence is called convergent if there exists f ∈ L^p(μ) such that f_n → f in L^p(μ). Prove that every convergent sequence is a Cauchy sequence. A fundamental theorem in measure theory states that the converse is also true: Every Cauchy sequence in L^p(μ) is convergent. A normed linear space with this property is called complete. Thus the L^p(μ) spaces are complete.

2.20. Let B_t be 1-dimensional Brownian motion, σ ∈ R be constant and 0 ≤ s < t. Use (2.2.2) to prove that

E[exp(σ(B_s − B_t))] = exp( (1/2) σ^2 (t − s) ) .   (2.2.18)
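As a numerical companion to Exercise 2.17 (not part of the text): on a fine partition of [0, 1] the quadratic variation sums Σ(ΔB_k)² land close to t = 1, while the total variation sums Σ|ΔB_k| grow like the square root of the number of partition points. A minimal sketch with an arbitrary partition size:

```python
import math
import random

def variation_sums(t=1.0, n=100000, seed=0):
    """Quadratic and total variation sums of one sampled Brownian path over [0, t]."""
    rng = random.Random(seed)
    dt = t / n
    quad = total = 0.0
    for _ in range(n):
        db = rng.gauss(0.0, math.sqrt(dt))
        quad += db * db    # contributes to sum (Delta B_k)^2
        total += abs(db)   # contributes to sum |Delta B_k|
    return quad, total

quad, total = variation_sums()
# Expect quad close to t = 1, while total is large (of order sqrt(2n/pi), here about 250)
```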
3 Itô Integrals
3.1 Construction of the Itô Integral

We now turn to the question of finding a reasonable mathematical interpretation of the “noise” term in the equation of Problem 1 in the Introduction:

dN/dt = (r(t) + “noise”)N(t)

or more generally in equations of the form

dX/dt = b(t, X_t) + σ(t, X_t) · “noise” ,   (3.1.1)

where b and σ are some given functions. Let us first concentrate on the case when the noise is 1-dimensional. It is reasonable to look for some stochastic process W_t to represent the noise term, so that

dX/dt = b(t, X_t) + σ(t, X_t) · W_t .   (3.1.2)
Based on many situations, for example in engineering, one is led to assume that W_t has, at least approximately, these properties:

(i) t1 ≠ t2 ⇒ W_{t1} and W_{t2} are independent.
(ii) {W_t} is stationary, i.e. the (joint) distribution of {W_{t1+t}, ..., W_{tk+t}} does not depend on t.
(iii) E[W_t] = 0 for all t.

However, it turns out there does not exist any “reasonable” stochastic process satisfying (i) and (ii): Such a W_t cannot have continuous paths. (See Exercise 3.11.) If we require E[W_t^2] = 1 then the function (t, ω) → W_t(ω) cannot even be measurable, with respect to the σ-algebra B × F, where B is the Borel σ-algebra on [0, ∞). (See Kallianpur (1980, p. 10).)
Nevertheless it is possible to represent W_t as a generalized stochastic process called the white noise process. That the process is generalized means that it can be constructed as a probability measure on the space S′ of tempered distributions on [0, ∞), and not as a probability measure on the much smaller space R^{[0,∞)}, like an ordinary process can. See e.g. Hida (1980), Adler (1981), Rozanov (1982), Hida, Kuo, Potthoff and Streit (1993), Kuo (1996) or Holden, Øksendal, Ubøe and Zhang (1996).

We will avoid this kind of construction and rather try to rewrite equation (3.1.2) in a form that suggests a replacement of W_t by a proper stochastic process: Let 0 = t0 < t1 < ··· < tm = t and consider a discrete version of (3.1.2):

X_{k+1} − X_k = b(t_k, X_k)Δt_k + σ(t_k, X_k)W_k Δt_k ,   (3.1.3)

where

X_j = X(t_j),  W_k = W_{t_k},  Δt_k = t_{k+1} − t_k .
We abandon the W_k-notation and replace W_k Δt_k by ΔV_k = V_{t_{k+1}} − V_{t_k}, where {V_t}_{t≥0} is some suitable stochastic process. The assumptions (i), (ii) and (iii) on W_t suggest that V_t should have stationary independent increments with mean 0. It turns out that the only such process with continuous paths is the Brownian motion B_t. (See Knight (1981) or Kallenberg (2002), Thm. 13.4). Thus we put V_t = B_t and obtain from (3.1.3):

X_k = X_0 + Σ_{j=0}^{k−1} b(t_j, X_j)Δt_j + Σ_{j=0}^{k−1} σ(t_j, X_j)ΔB_j .   (3.1.4)
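In modern terminology, the scheme (3.1.4) is the Euler–Maruyama method, and it can be run directly once each ΔB_j is sampled as an independent N(0, Δt_j) variable. A minimal sketch (not from the text), using the coefficients b(t, x) = rx and σ(t, x) = αx of Problem 1 with r = 1, α = 0.6 as in the front-cover simulation, and checking the Monte Carlo average against the exact mean E[X_t] = x_0 e^{rt}:

```python
import math
import random

def euler_maruyama(b, sigma, x0, T, n, rng):
    """One path of the discrete scheme (3.1.4) on a uniform partition of [0, T],
    with each Delta B_j sampled as an independent N(0, Delta t) variable."""
    dt = T / n
    x, t = x0, 0.0
    for _ in range(n):
        db = rng.gauss(0.0, math.sqrt(dt))
        x = x + b(t, x) * dt + sigma(t, x) * db
        t += dt
    return x

r, alpha, x0, T = 1.0, 0.6, 1.0, 1.0     # Problem 1 with r = 1, alpha = 0.6
rng = random.Random(0)
paths, steps = 4000, 400
est_mean = sum(euler_maruyama(lambda t, x: r * x,
                              lambda t, x: alpha * x,
                              x0, T, steps, rng)
               for _ in range(paths)) / paths
exact_mean = x0 * math.exp(r * T)        # E[X_T] = e^T, cf. the cover figure
```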
Is it possible to prove that the limit of the right hand side of (3.1.4) exists, in some sense, when Δt_j → 0? If so, then by applying the usual integration notation we should obtain

X_t = X_0 + ∫_0^t b(s, X_s)ds + “∫_0^t σ(s, X_s)dB_s”   (3.1.5)
and we would adopt as a convention that (3.1.2) really means that X_t = X_t(ω) is a stochastic process satisfying (3.1.5). Thus, in the remainder of this chapter we will prove the existence, in a certain sense, of

“∫_0^t f(s, ω)dB_s(ω)”

where B_t(ω) is 1-dimensional Brownian motion starting at the origin, for a wide class of functions f: [0, ∞] × Ω → R. Then, in Chapter 5, we will return to the solution of (3.1.5).
Suppose 0 ≤ S < T and f(t, ω) is given. We want to define

∫_S^T f(t, ω)dB_t(ω) .   (3.1.6)
It is reasonable to start with a definition for a simple class of functions f and then extend by some approximation procedure. Thus, let us first assume that f has the form

φ(t, ω) = Σ_{j≥0} e_j(ω) · X_{[j·2^{−n}, (j+1)2^{−n})}(t) ,   (3.1.7)
where X denotes the characteristic (indicator) function and n is a natural number. For such functions it is reasonable to define

∫_S^T φ(t, ω)dB_t(ω) = Σ_{j≥0} e_j(ω)[B_{t_{j+1}} − B_{t_j}](ω) ,   (3.1.8)

where

t_k = t_k^{(n)} = k·2^{−n}  if S ≤ k·2^{−n} ≤ T
                = S         if k·2^{−n} < S
                = T         if k·2^{−n} > T
However, without any further assumptions on the functions e_j(ω) this leads to difficulties, as the next example shows. Here – and in the following – E means the same as E^0, the expectation w.r.t. the law P^0 for Brownian motion starting at 0. And P means the same as P^0.

Example 3.1.1 Choose

φ1(t, ω) = Σ_{j≥0} B_{j·2^{−n}}(ω) · X_{[j·2^{−n}, (j+1)2^{−n})}(t)

φ2(t, ω) = Σ_{j≥0} B_{(j+1)2^{−n}}(ω) · X_{[j·2^{−n}, (j+1)2^{−n})}(t) .
Then

E[ ∫_0^T φ1(t, ω)dB_t(ω) ] = Σ_{j≥0} E[B_{t_j}(B_{t_{j+1}} − B_{t_j})] = 0 ,

since {B_t} has independent increments. But

E[ ∫_0^T φ2(t, ω)dB_t(ω) ] = Σ_{j≥0} E[B_{t_{j+1}} · (B_{t_{j+1}} − B_{t_j})] = Σ_{j≥0} E[(B_{t_{j+1}} − B_{t_j})^2] = T ,  by (2.2.10) .
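The computation in Example 3.1.1 is easy to reproduce by simulation (a sketch, not from the text): averaging the left-endpoint sums and the right-endpoint sums over many sampled paths gives roughly 0 and T respectively, and pathwise the two sums differ by exactly Σ(ΔB_j)², which concentrates around T. Partition size and sample count below are arbitrary:

```python
import math
import random

def endpoint_sums(T=1.0, n=256, seed=0):
    """Left-endpoint (phi_1) and right-endpoint (phi_2) sums for one path."""
    rng = random.Random(seed)
    dt = T / n
    b = left = right = 0.0
    for _ in range(n):
        db = rng.gauss(0.0, math.sqrt(dt))
        left += b * db           # B_{t_j} * (B_{t_{j+1}} - B_{t_j})
        right += (b + db) * db   # B_{t_{j+1}} * (B_{t_{j+1}} - B_{t_j})
        b += db
    return left, right

N = 5000
mean_left = mean_right = 0.0
for s in range(N):
    l, r = endpoint_sums(seed=s)
    mean_left += l / N
    mean_right += r / N
# Expect mean_left near 0 and mean_right near T = 1
```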
So, in spite of the fact that both φ1 and φ2 appear to be very reasonable approximations to

f(t, ω) = B_t(ω) ,

their integrals according to (3.1.8) are not close to each other at all, no matter how large n is chosen. This only reflects the fact that the variations of the paths of B_t are too big to enable us to define the integral (3.1.6) in the Riemann–Stieltjes sense. In fact, one can show that the paths t → B_t of Brownian motion are nowhere differentiable, almost surely (a.s.). (See Breiman (1968)). In particular, the total variation of the path is infinite, a.s.

In general it is natural to approximate a given function f(t, ω) by

Σ_j f(t_j^*, ω) · X_{[t_j, t_{j+1})}(t)

where the points t_j^* belong to the intervals [t_j, t_{j+1}], and then define ∫_S^T f(t, ω)dB_t(ω) as the limit (in a sense that we will explain) of Σ_j f(t_j^*, ω)[B_{t_{j+1}} − B_{t_j}](ω) as n → ∞. However, the example above shows that – unlike the Riemann–Stieltjes integral – it does make a difference here what points t_j^* we choose. The following two choices have turned out to be the most useful ones:

1) t_j^* = t_j (the left end point), which leads to the Itô integral, from now on denoted by

∫_S^T f(t, ω)dB_t(ω) ,

and

2) t_j^* = (t_j + t_{j+1})/2 (the mid point), which leads to the Stratonovich integral, denoted by

∫_S^T f(t, ω) ∘ dB_t(ω) .

(See Protter (2004, Th. V. 5.30)). At the end of this chapter we will explain why these choices are the best and discuss the relations and distinctions between the corresponding integrals.

In any case one must restrict oneself to a special class of functions f(t, ω) in (3.1.6), also if they have the particular form (3.1.7), in order to obtain a reasonable definition of the integral. We will here present Itô's choice t_j^* = t_j. The approximation procedure indicated above will work out successfully provided that f has the property that each of the functions ω → f(t_j, ω) only depends on the behaviour of B_s(ω) up to time t_j. This leads to the following important concepts:
Definition 3.1.2 Let B_t(ω) be n-dimensional Brownian motion. Then we define F_t = F_t^{(n)} to be the σ-algebra generated by the random variables B_i(s); 1 ≤ i ≤ n, 0 ≤ s ≤ t. In other words, F_t is the smallest σ-algebra containing all sets of the form

{ω; B_{t1}(ω) ∈ F1, ···, B_{tk}(ω) ∈ Fk} ,

where t_j ≤ t and F_j ⊂ R^n are Borel sets, j ≤ k = 1, 2, ... (We assume that all sets of measure zero are included in F_t).

One often thinks of F_t as “the history of B_s up to time t”. A function h(ω) will be F_t-measurable if and only if h can be written as the pointwise a.e. limit of sums of functions of the form

g1(B_{t1}) g2(B_{t2}) ··· gk(B_{tk}) ,

where g1, ..., gk are bounded continuous functions and t_j ≤ t for j ≤ k, k = 1, 2, .... (See Exercise 3.14.) Intuitively, that h is F_t-measurable means that the value of h(ω) can be decided from the values of B_s(ω) for s ≤ t. For example, h1(ω) = B_{t/2}(ω) is F_t-measurable, while h2(ω) = B_{2t}(ω) is not. Note that F_s ⊂ F_t for s < t (i.e. {F_t} is increasing) and that F_t ⊂ F for all t.

Definition 3.1.3 Let {N_t}_{t≥0} be an increasing family of σ-algebras of subsets of Ω. A process g(t, ω): [0, ∞) × Ω → R^n is called N_t-adapted if for each t ≥ 0 the function ω → g(t, ω) is N_t-measurable.

Thus the process h1(t, ω) = B_{t/2}(ω) is F_t-adapted, while h2(t, ω) = B_{2t}(ω) is not. We now describe our class of functions for which the Itô integral will be defined:

Definition 3.1.4 Let V = V(S, T) be the class of functions f(t, ω): [0, ∞) × Ω → R such that

(i) (t, ω) → f(t, ω) is B × F-measurable, where B denotes the Borel σ-algebra on [0, ∞).
(ii) f(t, ω) is F_t-adapted.
(iii) E[ ∫_S^T f(t, ω)^2 dt ] < ∞.
3 Itô Integrals
The Itô Integral

For functions $f \in \mathcal{V}$ we will now show how to define the Itô integral
$$I[f](\omega) = \int_S^T f(t, \omega)\, dB_t(\omega),$$
where $B_t$ is 1-dimensional Brownian motion.

The idea is natural: First we define $I[\phi]$ for a simple class of functions $\phi$. Then we show that each $f \in \mathcal{V}$ can be approximated (in an appropriate sense) by such $\phi$'s, and we use this to define $\int f\, dB$ as the limit of $\int \phi\, dB$ as $\phi \to f$.

We now give the details of this construction: A function $\phi \in \mathcal{V}$ is called elementary if it has the form
$$\phi(t, \omega) = \sum_j e_j(\omega) \cdot \mathcal{X}_{[t_j, t_{j+1})}(t). \tag{3.1.9}$$

Note that since $\phi \in \mathcal{V}$ each function $e_j$ must be $\mathcal{F}_{t_j}$-measurable. Thus in Example 3.1.1 above the function $\phi_1$ is elementary while $\phi_2$ is not.

For elementary functions $\phi(t, \omega)$ we define the integral according to (3.1.8), i.e.
$$\int_S^T \phi(t, \omega)\, dB_t(\omega) = \sum_{j \ge 0} e_j(\omega)\,[B_{t_{j+1}} - B_{t_j}](\omega). \tag{3.1.10}$$
Now we make the following important observation:

Lemma 3.1.5 (The Itô isometry) If $\phi(t, \omega)$ is bounded and elementary then
$$E\Big[\Big(\int_S^T \phi(t, \omega)\, dB_t(\omega)\Big)^2\Big] = E\Big[\int_S^T \phi(t, \omega)^2\, dt\Big]. \tag{3.1.11}$$

Proof. Put $\Delta B_j = B_{t_{j+1}} - B_{t_j}$. Then
$$E[e_i e_j \Delta B_i \Delta B_j] = \begin{cases} 0 & \text{if } i \ne j \\ E[e_j^2] \cdot (t_{j+1} - t_j) & \text{if } i = j \end{cases}$$
using that $e_i e_j \Delta B_i$ and $\Delta B_j$ are independent if $i < j$. Thus
$$E\Big[\Big(\int_S^T \phi\, dB\Big)^2\Big] = \sum_{i,j} E[e_i e_j \Delta B_i \Delta B_j] = \sum_j E[e_j^2] \cdot (t_{j+1} - t_j) = E\Big[\int_S^T \phi^2\, dt\Big]. \qquad \square$$
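The isometry can be illustrated numerically. The sketch below is a minimal Monte Carlo experiment (NumPy; the grid, sample size and seed are arbitrary illustrative choices, not part of the text). It takes the elementary integrand $e_j = B_{t_j}$ and compares estimates of the two sides of (3.1.11):

```python
import numpy as np

rng = np.random.default_rng(0)
T, n_steps, n_paths = 1.0, 64, 100_000
dt = T / n_steps

# Brownian increments ΔB_j and left endpoints B_{t_j} for each path (B_0 = 0)
dB = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
B_left = np.cumsum(dB, axis=1) - dB

# Elementary integrand e_j = B_{t_j}, which is F_{t_j}-measurable
ito_integral = np.sum(B_left * dB, axis=1)        # Σ_j e_j ΔB_j

lhs = np.mean(ito_integral**2)                    # E[(∫ φ dB)²]
rhs = np.mean(np.sum(B_left**2, axis=1) * dt)     # E[∫ φ² dt]
print(lhs, rhs)  # the two estimates agree up to Monte Carlo error
```

Both estimates come out near $T^2/2$, the exact value of $E\big[\int_0^T B_t^2\, dt\big]$ for this choice of $\phi$.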
The idea is now to use the isometry (3.1.11) to extend the definition from elementary functions to functions in $\mathcal{V}$. We do this in several steps:

Step 1. Let $g \in \mathcal{V}$ be bounded and $g(\cdot, \omega)$ continuous for each $\omega$. Then there exist elementary functions $\phi_n \in \mathcal{V}$ such that
$$E\Big[\int_S^T (g - \phi_n)^2\, dt\Big] \to 0 \quad \text{as } n \to \infty.$$

Proof. Define $\phi_n(t, \omega) = \sum_j g(t_j, \omega) \cdot \mathcal{X}_{[t_j, t_{j+1})}(t)$. Then $\phi_n$ is elementary since $g \in \mathcal{V}$, and
$$\int_S^T (g - \phi_n)^2\, dt \to 0 \quad \text{as } n \to \infty, \text{ for each } \omega,$$
since $g(\cdot, \omega)$ is continuous for each $\omega$. Hence $E\big[\int_S^T (g - \phi_n)^2\, dt\big] \to 0$ as $n \to \infty$, by bounded convergence. $\square$

Step 2. Let $h \in \mathcal{V}$ be bounded. Then there exist bounded functions $g_n \in \mathcal{V}$ such that $g_n(\cdot, \omega)$ is continuous for all $\omega$ and $n$, and
$$E\Big[\int_S^T (h - g_n)^2\, dt\Big] \to 0.$$
Proof. Suppose $|h(t, \omega)| \le M$ for all $(t, \omega)$. For each $n$ let $\psi_n$ be a non-negative, continuous function on $\mathbb{R}$ such that

(i) $\psi_n(x) = 0$ for $x \le -\frac{1}{n}$ and $x \ge 0$, and
(ii) $\int_{-\infty}^{\infty} \psi_n(x)\, dx = 1$.

Define
$$g_n(t, \omega) = \int_0^t \psi_n(s - t)\, h(s, \omega)\, ds.$$
Then $g_n(\cdot, \omega)$ is continuous for each $\omega$ and $|g_n(t, \omega)| \le M$. Since $h \in \mathcal{V}$ we can show that $g_n(t, \cdot)$ is $\mathcal{F}_t$-measurable for all $t$. (This is a subtle point; see e.g. Karatzas and Shreve (1991), p. 133 for details.) Moreover,
$$\int_S^T (g_n(s, \omega) - h(s, \omega))^2\, ds \to 0 \quad \text{as } n \to \infty, \text{ for each } \omega,$$
since $\{\psi_n\}_n$ constitutes an approximate identity. (See e.g. Hoffman (1962, p. 22).) So by bounded convergence
$$E\Big[\int_S^T (h(t, \omega) - g_n(t, \omega))^2\, dt\Big] \to 0 \quad \text{as } n \to \infty,$$
as asserted. $\square$
Step 3. Let $f \in \mathcal{V}$. Then there exists a sequence $\{h_n\} \subset \mathcal{V}$ such that $h_n$ is bounded for each $n$ and
$$E\Big[\int_S^T (f - h_n)^2\, dt\Big] \to 0 \quad \text{as } n \to \infty.$$

Proof. Put
$$h_n(t, \omega) = \begin{cases} -n & \text{if } f(t, \omega) < -n \\ f(t, \omega) & \text{if } -n \le f(t, \omega) \le n \\ n & \text{if } f(t, \omega) > n. \end{cases}$$
Then the conclusion follows by dominated convergence. $\square$

That completes the approximation procedure.
We are now ready to complete the definition of the Itô integral
$$\int_S^T f(t, \omega)\, dB_t(\omega) \quad \text{for } f \in \mathcal{V}.$$
If $f \in \mathcal{V}$ we choose, by Steps 1–3, elementary functions $\phi_n \in \mathcal{V}$ such that
$$E\Big[\int_S^T |f - \phi_n|^2\, dt\Big] \to 0.$$
Then define
$$I[f](\omega) := \int_S^T f(t, \omega)\, dB_t(\omega) := \lim_{n \to \infty} \int_S^T \phi_n(t, \omega)\, dB_t(\omega).$$
The limit exists as an element of $L^2(P)$, since $\big\{\int_S^T \phi_n(t, \omega)\, dB_t(\omega)\big\}$ forms a Cauchy sequence in $L^2(P)$, by (3.1.11).
We summarize this as follows:

Definition 3.1.6 (The Itô integral) Let $f \in \mathcal{V}(S, T)$. Then the Itô integral of $f$ (from $S$ to $T$) is defined by
$$\int_S^T f(t, \omega)\, dB_t(\omega) = \lim_{n \to \infty} \int_S^T \phi_n(t, \omega)\, dB_t(\omega) \quad (\text{limit in } L^2(P)) \tag{3.1.12}$$
where $\{\phi_n\}$ is a sequence of elementary functions such that
$$E\Big[\int_S^T (f(t, \omega) - \phi_n(t, \omega))^2\, dt\Big] \to 0 \quad \text{as } n \to \infty. \tag{3.1.13}$$
Note that such a sequence $\{\phi_n\}$ satisfying (3.1.13) exists by Steps 1–3 above. Moreover, by (3.1.11) the limit in (3.1.12) exists and does not depend on the actual choice of $\{\phi_n\}$, as long as (3.1.13) holds. Furthermore, from (3.1.11) and (3.1.12) we get the following important

Corollary 3.1.7 (The Itô isometry)
$$E\Big[\Big(\int_S^T f(t, \omega)\, dB_t\Big)^2\Big] = E\Big[\int_S^T f^2(t, \omega)\, dt\Big] \quad \text{for all } f \in \mathcal{V}(S, T). \tag{3.1.14}$$

Corollary 3.1.8 If $f(t, \omega) \in \mathcal{V}(S, T)$ and $f_n(t, \omega) \in \mathcal{V}(S, T)$ for $n = 1, 2, \dots$ and $E\big[\int_S^T (f_n(t, \omega) - f(t, \omega))^2\, dt\big] \to 0$ as $n \to \infty$, then
$$\int_S^T f_n(t, \omega)\, dB_t(\omega) \to \int_S^T f(t, \omega)\, dB_t(\omega) \quad \text{in } L^2(P) \text{ as } n \to \infty.$$
We illustrate this integral with an example:

Example 3.1.9 Assume $B_0 = 0$. Then
$$\int_0^t B_s\, dB_s = \tfrac{1}{2}B_t^2 - \tfrac{1}{2}t.$$

Proof. Put $\phi_n(s, \omega) = \sum_j B_j(\omega) \cdot \mathcal{X}_{[t_j, t_{j+1})}(s)$, where $B_j = B_{t_j}$. Then
$$E\Big[\int_0^t (\phi_n - B_s)^2\, ds\Big] = E\Big[\sum_j \int_{t_j}^{t_{j+1}} (B_j - B_s)^2\, ds\Big] = \sum_j \int_{t_j}^{t_{j+1}} (s - t_j)\, ds = \sum_j \tfrac{1}{2}(t_{j+1} - t_j)^2 \to 0 \quad \text{as } \Delta t_j \to 0.$$
So by Corollary 3.1.8
$$\int_0^t B_s\, dB_s = \lim_{\Delta t_j \to 0} \int_0^t \phi_n\, dB_s = \lim_{\Delta t_j \to 0} \sum_j B_j \Delta B_j.$$
(See also Exercise 3.13.) Now
$$\Delta(B_j^2) = B_{j+1}^2 - B_j^2 = (B_{j+1} - B_j)^2 + 2B_j(B_{j+1} - B_j) = (\Delta B_j)^2 + 2B_j \Delta B_j,$$
and therefore, since $B_0 = 0$,
$$B_t^2 = \sum_j \Delta(B_j^2) = \sum_j (\Delta B_j)^2 + 2\sum_j B_j \Delta B_j$$
or
$$\sum_j B_j \Delta B_j = \tfrac{1}{2}B_t^2 - \tfrac{1}{2}\sum_j (\Delta B_j)^2.$$
Since $\sum_j (\Delta B_j)^2 \to t$ in $L^2(P)$ as $\Delta t_j \to 0$ (Exercise 2.17), the result follows. $\square$

The extra term $-\tfrac{1}{2}t$ shows that the Itô stochastic integral does not behave like ordinary integrals. In the next chapter we will establish the Itô formula, which explains the result in this example and which makes it easy to calculate many stochastic integrals.
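The identity in Example 3.1.9 lends itself to a quick pathwise check. The sketch below (NumPy; the uniform grid and the seed are arbitrary illustrative choices) builds the left-endpoint sums $\sum_j B_{t_j}\Delta B_j$ on a single simulated path and compares them with $\tfrac12 B_T^2 - \tfrac12 T$:

```python
import numpy as np

rng = np.random.default_rng(1)
T, n_steps = 1.0, 10_000
dt = T / n_steps

dB = rng.normal(0.0, np.sqrt(dt), size=n_steps)
B = np.concatenate(([0.0], np.cumsum(dB)))   # B at the grid points, B_0 = 0

ito_sum = np.sum(B[:-1] * dB)                # left-endpoint sums Σ B_{t_j} ΔB_j
closed_form = 0.5 * B[-1]**2 - 0.5 * T       # ½B_T² − ½T
print(ito_sum, closed_form)
```

The two numbers differ only because $\sum_j (\Delta B_j)^2$ is close to, but not exactly, $T$ on a finite grid; refining the grid shrinks the gap.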
3.2 Some properties of the Itô integral

First we observe the following:

Theorem 3.2.1 Let $f, g \in \mathcal{V}(0, T)$ and let $0 \le S < U < T$. Then

(i) $\int_S^T f\, dB_t = \int_S^U f\, dB_t + \int_U^T f\, dB_t$ for a.a. $\omega$,
(ii) $\int_S^T (cf + g)\, dB_t = c \cdot \int_S^T f\, dB_t + \int_S^T g\, dB_t$ ($c$ constant) for a.a. $\omega$,
(iii) $E\big[\int_S^T f\, dB_t\big] = 0$,
(iv) $\int_S^T f\, dB_t$ is $\mathcal{F}_T$-measurable.

Proof. This clearly holds for all elementary functions, so by taking limits we obtain this for all $f, g \in \mathcal{V}(0, T)$. $\square$
An important property of the Itô integral is that it is a martingale:

Definition 3.2.2 A filtration (on $(\Omega, \mathcal{F})$) is a family $\mathcal{M} = \{\mathcal{M}_t\}_{t \ge 0}$ of σ-algebras $\mathcal{M}_t \subset \mathcal{F}$ such that $0 \le s < t \Rightarrow \mathcal{M}_s \subset \mathcal{M}_t$ (i.e. $\{\mathcal{M}_t\}$ is increasing). An $n$-dimensional stochastic process $\{M_t\}_{t \ge 0}$ on $(\Omega, \mathcal{F}, P)$ is called a martingale with respect to a filtration $\{\mathcal{M}_t\}_{t \ge 0}$ (and with respect to $P$) if

(i) $M_t$ is $\mathcal{M}_t$-measurable for all $t$,
(ii) $E[|M_t|] < \infty$ for all $t$, and
(iii) $E[M_s | \mathcal{M}_t] = M_t$ for all $s \ge t$.

Here the expectation in (ii) and the conditional expectation in (iii) are taken with respect to $P = P^0$. (See Appendix B for a survey of conditional expectation.)

Example 3.2.3 Brownian motion $B_t$ in $\mathbb{R}^n$ is a martingale w.r.t. the σ-algebras $\mathcal{F}_t$ generated by $\{B_s;\ s \le t\}$, because
$$E[|B_t|]^2 \le E[|B_t|^2] = |B_0|^2 + nt,$$
and if $s \ge t$ then
$$E[B_s | \mathcal{F}_t] = E[B_s - B_t + B_t | \mathcal{F}_t] = E[B_s - B_t | \mathcal{F}_t] + E[B_t | \mathcal{F}_t] = 0 + B_t = B_t.$$
Here we have used that $E[(B_s - B_t) | \mathcal{F}_t] = E[B_s - B_t] = 0$ since $B_s - B_t$ is independent of $\mathcal{F}_t$ (see (2.2.11) and Theorem B.2.d)), and we have used that $E[B_t | \mathcal{F}_t] = B_t$ since $B_t$ is $\mathcal{F}_t$-measurable (see Theorem B.2.c)).

For continuous martingales we have the following important inequality due to Doob (see e.g. Stroock and Varadhan (1979), Theorem 1.2.3 or Revuz and Yor (1991), Theorem II.1.7):

Theorem 3.2.4 (Doob's martingale inequality) If $M_t$ is a martingale such that $t \to M_t(\omega)$ is continuous a.s., then for all $p \ge 1$, $T \ge 0$ and all $\lambda > 0$
$$P\Big[\sup_{0 \le t \le T} |M_t| \ge \lambda\Big] \le \frac{1}{\lambda^p} \cdot E[|M_T|^p].$$

We now use this inequality to prove that the Itô integral
$$\int_0^t f(s, \omega)\, dB_s$$
can be chosen to depend continuously on $t$:
Theorem 3.2.5 Let $f \in \mathcal{V}(0, T)$. Then there exists a $t$-continuous version of
$$\int_0^t f(s, \omega)\, dB_s(\omega);\quad 0 \le t \le T,$$
i.e. there exists a $t$-continuous stochastic process $J_t$ on $(\Omega, \mathcal{F}, P)$ such that
$$P\Big[J_t = \int_0^t f\, dB\Big] = 1 \quad \text{for all } t,\ 0 \le t \le T. \tag{3.2.1}$$

Proof. Let $\phi_n = \phi_n(t, \omega) = \sum_j e_j^{(n)}(\omega)\, \mathcal{X}_{[t_j^{(n)}, t_{j+1}^{(n)})}(t)$ be elementary functions such that
$$E\Big[\int_0^T (f - \phi_n)^2\, dt\Big] \to 0 \quad \text{when } n \to \infty.$$
Put
$$I_n(t, \omega) = \int_0^t \phi_n(s, \omega)\, dB_s(\omega)$$
and
$$I_t = I(t, \omega) = \int_0^t f(s, \omega)\, dB_s(\omega);\quad 0 \le t \le T.$$
Then $I_n(\cdot, \omega)$ is continuous, for all $n$. Moreover, $I_n(t, \omega)$ is a martingale with respect to $\mathcal{F}_t$, for all $n$:
$$E[I_n(s, \omega) | \mathcal{F}_t] = E\Big[\int_0^t \phi_n\, dB + \int_t^s \phi_n\, dB \,\Big|\, \mathcal{F}_t\Big] = \int_0^t \phi_n\, dB + E\Big[\sum_{t \le t_j^{(n)} \le t_{j+1}^{(n)} \le s} e_j^{(n)} \Delta B_j \,\Big|\, \mathcal{F}_t\Big]$$
$$= \int_0^t \phi_n\, dB + \sum_j E\Big[E\big[e_j^{(n)} \Delta B_j \,\big|\, \mathcal{F}_{t_j^{(n)}}\big] \,\Big|\, \mathcal{F}_t\Big] = \int_0^t \phi_n\, dB + \sum_j E\Big[e_j^{(n)} E\big[\Delta B_j \,\big|\, \mathcal{F}_{t_j^{(n)}}\big] \,\Big|\, \mathcal{F}_t\Big] = \int_0^t \phi_n\, dB = I_n(t, \omega) \tag{3.2.2}$$
when $t < s$, using Theorem B.3 and Theorem B.2.d). Hence $I_n - I_m$ is also an $\mathcal{F}_t$-martingale, so by the martingale inequality (Theorem 3.2.4) it follows that
$$P\Big[\sup_{0 \le t \le T} |I_n(t, \omega) - I_m(t, \omega)| > \epsilon\Big] \le \frac{1}{\epsilon^2} \cdot E\big[|I_n(T, \omega) - I_m(T, \omega)|^2\big] = \frac{1}{\epsilon^2}\, E\Big[\int_0^T (\phi_n - \phi_m)^2\, ds\Big] \to 0 \quad \text{as } m, n \to \infty.$$
Hence we may choose a subsequence $n_k \uparrow \infty$ s.t.
$$P\Big[\sup_{0 \le t \le T} |I_{n_{k+1}}(t, \omega) - I_{n_k}(t, \omega)| > 2^{-k}\Big] < 2^{-k}.$$
By the Borel–Cantelli lemma
$$P\Big[\sup_{0 \le t \le T} |I_{n_{k+1}}(t, \omega) - I_{n_k}(t, \omega)| > 2^{-k} \ \text{for infinitely many } k\Big] = 0.$$
So for a.a. $\omega$ there exists $k_1(\omega)$ such that
$$\sup_{0 \le t \le T} |I_{n_{k+1}}(t, \omega) - I_{n_k}(t, \omega)| \le 2^{-k} \quad \text{for } k \ge k_1(\omega).$$
Therefore $I_{n_k}(t, \omega)$ is uniformly convergent for $t \in [0, T]$, for a.a. $\omega$, and so the limit, denoted by $J_t(\omega)$, is $t$-continuous for $t \in [0, T]$, a.s. Since $I_{n_k}(t, \cdot) \to I(t, \cdot)$ in $L^2(P)$ for all $t$, we must have
$$I_t = J_t \ \text{a.s.,} \quad \text{for all } t \in [0, T].$$
That completes the proof. $\square$

From now on we shall always assume that $\int_0^t f(s, \omega)\, dB_s(\omega)$ means a $t$-continuous version of the integral.
Corollary 3.2.6 Let $f(t, \omega) \in \mathcal{V}(0, T)$ for all $T$. Then
$$M_t(\omega) = \int_0^t f(s, \omega)\, dB_s$$
is a martingale w.r.t. $\mathcal{F}_t$, and
$$P\Big[\sup_{0 \le t \le T} |M_t| \ge \lambda\Big] \le \frac{1}{\lambda^2} \cdot E\Big[\int_0^T f(s, \omega)^2\, ds\Big];\quad \lambda, T > 0. \tag{3.2.3}$$

Proof. This follows from (3.2.2), the a.s. $t$-continuity of $M_t$ and the martingale inequality (Theorem 3.2.4), combined with the Itô isometry (3.1.14). $\square$
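As a rough numerical illustration of the bound (3.2.3), take the simplest integrand $f \equiv 1$, so that $M_t = B_t$ and the right-hand side is $T/\lambda^2$. The sketch below (NumPy; the discretized supremum, sample size and seed are illustrative choices) estimates both sides:

```python
import numpy as np

rng = np.random.default_rng(2)
T, n_steps, n_paths, lam = 1.0, 500, 20_000, 2.0
dt = T / n_steps

# With f ≡ 1, M_t = B_t and E[∫₀ᵀ f² ds] = T, so (3.2.3) reads
#   P[sup_{0≤t≤T} |B_t| ≥ λ] ≤ T/λ²
dB = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
B = np.cumsum(dB, axis=1)

lhs = np.mean(np.max(np.abs(B), axis=1) >= lam)  # estimated probability
rhs = T / lam**2                                 # the bound, here 0.25
print(lhs, rhs)  # lhs is well below the bound rhs
```

The estimated probability is far below the bound, which is expected: Doob's inequality is a general-purpose estimate, not a sharp one for this particular martingale.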
3.3 Extensions of the Itô integral

The Itô integral $\int f\, dB$ can be defined for a larger class of integrands $f$ than $\mathcal{V}$. First, the measurability condition (ii) of Definition 3.1.4 can be relaxed to the following:

(ii)' There exists an increasing family of σ-algebras $\mathcal{H}_t$; $t \ge 0$ such that
a) $B_t$ is a martingale with respect to $\mathcal{H}_t$, and
b) $f_t$ is $\mathcal{H}_t$-adapted.

Note that a) implies that $\mathcal{F}_t \subset \mathcal{H}_t$. The essence of this extension is that we can allow $f_t$ to depend on more than $\mathcal{F}_t$ as long as $B_t$ remains a martingale with respect to the "history" of $f_s$; $s \le t$. If (ii)' holds, then $E[B_s - B_t | \mathcal{H}_t] = 0$ for all $s > t$, and if we inspect our proofs above, we see that this is sufficient to carry out the construction of the Itô integral as before.

The most important example of a situation where (ii)' applies (and (ii) doesn't) is the following: Suppose $B_t(\omega) = B_k(t, \omega)$ is the $k$'th coordinate of $n$-dimensional Brownian motion $(B_1, \dots, B_n)$. Let $\mathcal{F}_t^{(n)}$ be the σ-algebra generated by $B_1(s_1, \cdot), \dots, B_n(s_n, \cdot)$; $s_k \le t$. Then $B_k(t, \omega)$ is a martingale with respect to $\mathcal{F}_t^{(n)}$, because $B_k(s, \cdot) - B_k(t, \cdot)$ is independent of $\mathcal{F}_t^{(n)}$ when $s > t$. Hence we can choose $\mathcal{H}_t = \mathcal{F}_t^{(n)}$ in (ii)' above. Thus we have now defined $\int_0^t f(s, \omega)\, dB_k(s, \omega)$ for $\mathcal{F}_t^{(n)}$-adapted integrands $f(t, \omega)$. That includes integrals like
$$\int B_2\, dB_1 \quad \text{or} \quad \int \sin(B_1^2 + B_2^2)\, dB_2$$
involving several components of $n$-dimensional Brownian motion. (Here we have used the notation $dB_1 = dB_1(t, \omega)$ etc.)

This allows us to define the multi-dimensional Itô integral as follows:

Definition 3.3.1 Let $B = (B_1, B_2, \dots, B_n)$ be $n$-dimensional Brownian motion. Then $\mathcal{V}_{\mathcal{H}}^{m \times n}(S, T)$ denotes the set of $m \times n$ matrices $v = [v_{ij}(t, \omega)]$ where each entry $v_{ij}(t, \omega)$ satisfies (i) and (iii) of Definition 3.1.4 and (ii)' above, with respect to some filtration $\mathcal{H} = \{\mathcal{H}_t\}_{t \ge 0}$.

If $v \in \mathcal{V}_{\mathcal{H}}^{m \times n}(S, T)$ we define, using matrix notation,
$$\int_S^T v\, dB = \int_S^T \begin{pmatrix} v_{11} & \cdots & v_{1n} \\ \vdots & & \vdots \\ v_{m1} & \cdots & v_{mn} \end{pmatrix} \begin{pmatrix} dB_1 \\ \vdots \\ dB_n \end{pmatrix}$$
to be the $m \times 1$ matrix (column vector) whose $i$'th component is the following sum of (extended) 1-dimensional Itô integrals:
$$\sum_{j=1}^n \int_S^T v_{ij}(s, \omega)\, dB_j(s, \omega).$$
If $\mathcal{H} = \mathcal{F}^{(n)} = \{\mathcal{F}_t^{(n)}\}_{t \ge 0}$ we write $\mathcal{V}^{m \times n}(S, T)$, and if $m = 1$ we write $\mathcal{V}_{\mathcal{H}}^n(S, T)$ (respectively $\mathcal{V}^n(S, T)$) instead of $\mathcal{V}_{\mathcal{H}}^{1 \times n}(S, T)$ (respectively $\mathcal{V}^{1 \times n}(S, T)$). We also put
$$\mathcal{V}^{m \times n} = \mathcal{V}^{m \times n}(0, \infty) = \bigcap_{T > 0} \mathcal{V}^{m \times n}(0, T).$$
The next extension of the Itô integral consists of weakening condition (iii) of Definition 3.1.4 to

(iii)' $P\big[\int_S^T f(s, \omega)^2\, ds < \infty\big] = 1$.

Definition 3.3.2 $\mathcal{W}_{\mathcal{H}}(S, T)$ denotes the class of processes $f(t, \omega) \in \mathbb{R}$ satisfying (i) of Definition 3.1.4 and (ii)', (iii)' above. Similarly to the notation for $\mathcal{V}$ we put $\mathcal{W}_{\mathcal{H}} = \bigcap_{T > 0} \mathcal{W}_{\mathcal{H}}(0, T)$, and in the matrix case we write $\mathcal{W}_{\mathcal{H}}^{m \times n}(S, T)$ etc. If $\mathcal{H} = \mathcal{F}^{(n)}$ we write $\mathcal{W}(S, T)$ instead of $\mathcal{W}_{\mathcal{F}^{(n)}}(S, T)$ etc.

If the dimension is clear from the context we sometimes drop the superscript and write $\mathcal{F}$ for $\mathcal{F}^{(n)}$ and so on.

Let $B_t$ denote 1-dimensional Brownian motion. If $f \in \mathcal{W}_{\mathcal{H}}$ one can show that for all $t$ there exist step functions $f_n \in \mathcal{W}_{\mathcal{H}}$ such that $\int_0^t |f_n - f|^2\, ds \to 0$ in probability, i.e. in measure with respect to $P$. For such a sequence one has that $\int_0^t f_n(s, \omega)\, dB_s$ converges in probability to some random variable, and the limit only depends on $f$, not on the sequence $\{f_n\}$. Thus we may define
$$\int_0^t f(s, \omega)\, dB_s(\omega) = \lim_{n \to \infty} \int_0^t f_n(s, \omega)\, dB_s(\omega) \quad (\text{limit in probability}) \quad \text{for } f \in \mathcal{W}_{\mathcal{H}}. \tag{3.3.1}$$

As before there exists a $t$-continuous version of this integral. See Friedman (1975, Chap. 4) or McKean (1969, Chap. 2) for details. Note, however, that this integral is not in general a martingale. See for example Dudley's Theorem (Theorem 12.1.5). It is, however, a local martingale. See Karatzas and Shreve (1991), p. 146. See also Exercise 7.12.

A comparison of Itô and Stratonovich integrals

Let us now return to our original question in this chapter: We have argued that the mathematical interpretation of the white noise equation
$$\frac{dX}{dt} = b(t, X_t) + \sigma(t, X_t) \cdot W_t \tag{3.3.2}$$
is that $X_t$ is a solution of the integral equation
$$X_t = X_0 + \int_0^t b(s, X_s)\, ds + \text{``}\int_0^t \sigma(s, X_s)\, dB_s\text{''}, \tag{3.3.3}$$
for some suitable interpretation of the last integral in (3.3.3). However, as indicated earlier, the Itô interpretation of an integral of the form
$$\text{``}\int_0^t f(s, \omega)\, dB_s(\omega)\text{''} \tag{$*$}$$
is just one of several reasonable choices. For example, the Stratonovich integral is another possibility, leading (in general) to a different result. So the question still remains: Which interpretation of ($*$) makes (3.3.3) the "right" mathematical model for the equation (3.3.2)?

Here is an argument that indicates that the Stratonovich interpretation in some situations may be the most appropriate: Choose $t$-continuously differentiable processes $B_t^{(n)}$ such that for a.a. $\omega$
$$B^{(n)}(t, \omega) \to B(t, \omega) \quad \text{as } n \to \infty,$$
uniformly (in $t$) in bounded intervals. For each $\omega$ let $X_t^{(n)}(\omega)$ be the solution of the corresponding (deterministic) differential equation
$$\frac{dX_t}{dt} = b(t, X_t) + \sigma(t, X_t)\,\frac{dB_t^{(n)}}{dt}. \tag{3.3.4}$$
Then $X_t^{(n)}(\omega)$ converges to some function $X_t(\omega)$ in the same sense: For a.a. $\omega$ we have that $X_t^{(n)}(\omega) \to X_t(\omega)$ as $n \to \infty$, uniformly (in $t$) in bounded intervals.

It turns out (see Wong and Zakai (1969) and Sussman (1978)) that this solution $X_t$ coincides with the solution of (3.3.3) obtained by using Stratonovich integrals, i.e.
$$X_t = X_0 + \int_0^t b(s, X_s)\, ds + \int_0^t \sigma(s, X_s) \circ dB_s. \tag{3.3.5}$$
This implies that $X_t$ is the solution of the following modified Itô equation:
$$X_t = X_0 + \int_0^t b(s, X_s)\, ds + \tfrac{1}{2}\int_0^t \sigma'(s, X_s)\sigma(s, X_s)\, ds + \int_0^t \sigma(s, X_s)\, dB_s, \tag{3.3.6}$$
where $\sigma'$ denotes the derivative of $\sigma(t, x)$ with respect to $x$. (See Stratonovich (1966).)
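The connection (3.3.6) can be checked numerically in a simple case. Take $b \equiv 0$ and $\sigma(t, x) = \alpha x$, so the Stratonovich equation $dX_t = \alpha X_t \circ dB_t$ has the solution $X_t = X_0 e^{\alpha B_t}$ (by the ordinary chain rule), while the modified Itô equation (3.3.6) becomes $dX_t = \tfrac12\alpha^2 X_t\, dt + \alpha X_t\, dB_t$. The sketch below (Euler–Maruyama on a fine grid; NumPy, step size, parameters and seed are arbitrary illustrative choices) integrates the modified Itô equation along one path and compares with the exact Stratonovich solution:

```python
import numpy as np

rng = np.random.default_rng(3)
T, n_steps, alpha, x0 = 1.0, 20_000, 0.5, 1.0
dt = T / n_steps

dB = rng.normal(0.0, np.sqrt(dt), size=n_steps)

# Euler–Maruyama on the modified Itô equation (3.3.6):
# with b ≡ 0 and σ(t, x) = αx we have ½σ'σ = ½α²x, i.e.
#   dX = ½α²X dt + αX dB
x = x0
for db in dB:
    x += 0.5 * alpha**2 * x * dt + alpha * x * db

exact = x0 * np.exp(alpha * dB.sum())  # Stratonovich solution X_0 e^{αB_T}
print(x, exact)  # close on a fine grid
```

Dropping the $\tfrac12\alpha^2 X\, dt$ correction term in the loop would instead produce (a discretization of) the Itô solution $X_0 e^{\alpha B_T - \frac12\alpha^2 T}$, which differs from the Stratonovich one by a systematic factor.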
Therefore, from this point of view it seems reasonable to use (3.3.6) (i.e. the Stratonovich interpretation) – and not the Itô interpretation
$$X_t = X_0 + \int_0^t b(s, X_s)\, ds + \int_0^t \sigma(s, X_s)\, dB_s \tag{3.3.7}$$
– as the model for the original white noise equation (3.3.2).

On the other hand, the specific feature of the Itô model of "not looking into the future" (as explained after Example 3.1.1) seems to be a reason for choosing the Itô interpretation in many cases, for example in biology (see the discussion in Turelli (1977)). The difference between the two interpretations is illustrated in Example 5.1.1. Note that (3.3.6) and (3.3.7) coincide if $\sigma(t, x)$ does not depend on $x$. For example, this is the situation in the linear case handled in the filtering problem in Chapter 6.

In any case, because of the explicit connection (3.3.6) between the two models (and a similar connection in higher dimensions – see (6.1.3)), it will for many purposes suffice to do the general mathematical treatment for one of the two types of integrals. In general one can say that the Stratonovich integral has the advantage of leading to ordinary chain rule formulas under a transformation (change of variable), i.e. there are no second order terms in the Stratonovich analogue of the Itô transformation formula (see Theorems 4.1.2 and 4.2.1). This property makes the Stratonovich integral natural to use for example in connection with stochastic differential equations on manifolds (see Elworthy (1982) or Ikeda and Watanabe (1989)).

However, Stratonovich integrals are not martingales, as we have seen that Itô integrals are. This gives the Itô integral an important computational advantage, even though it does not behave so nicely under transformations (as Example 3.1.9 shows). For our purposes the Itô integral will be most convenient, so we will base our discussion on that from now on.
Exercises

Unless otherwise stated $B_t$ denotes Brownian motion in $\mathbb{R}$, $B_0 = 0$.

3.1.* Prove directly from the definition of Itô integrals (Definition 3.1.6) that
$$\int_0^t s\, dB_s = tB_t - \int_0^t B_s\, ds.$$
(Hint: Note that $\sum_j \Delta(s_j B_j) = \sum_j s_j \Delta B_j + \sum_j B_{j+1} \Delta s_j$.)
3.2. Prove directly from the definition of Itô integrals that
$$\int_0^t B_s^2\, dB_s = \tfrac{1}{3}B_t^3 - \int_0^t B_s\, ds.$$
3.3.* If $X_t\colon \Omega \to \mathbb{R}^n$ is a stochastic process, let $\mathcal{H}_t = \mathcal{H}_t^{(X)}$ denote the σ-algebra generated by $\{X_s(\cdot);\ s \le t\}$ (i.e. $\{\mathcal{H}_t^{(X)}\}_{t \ge 0}$ is the filtration of the process $\{X_t\}_{t \ge 0}$).
a) Show that if $X_t$ is a martingale w.r.t. some filtration $\{\mathcal{N}_t\}_{t \ge 0}$, then $X_t$ is also a martingale w.r.t. its own filtration $\{\mathcal{H}_t^{(X)}\}_{t \ge 0}$.
b) Show that if $X_t$ is a martingale w.r.t. $\mathcal{H}_t^{(X)}$, then
$$E[X_t] = E[X_0] \quad \text{for all } t \ge 0. \tag{$*$}$$
c) Give an example of a stochastic process $X_t$ satisfying ($*$) which is not a martingale w.r.t. its own filtration.

3.4.* Check whether the following processes $X_t$ are martingales w.r.t. $\{\mathcal{F}_t\}$:
(i) $X_t = B_t + 4t$
(ii) $X_t = B_t^2$
(iii) $X_t = t^2 B_t - 2\int_0^t sB_s\, ds$
(iv) $X_t = B_1(t)B_2(t)$, where $(B_1(t), B_2(t))$ is 2-dimensional Brownian motion.

3.5.* Prove directly (without using Example 3.1.9) that $M_t = B_t^2 - t$ is an $\mathcal{F}_t$-martingale.

3.6.* Prove that $N_t = B_t^3 - 3tB_t$ is a martingale.

3.7. A famous result of Itô (1951) gives the following formula for $n$ times iterated Itô integrals:
$$n! \int\limits_{0 \le u_1 \le \cdots \le u_n \le t} \Big(\cdots\Big(\int\Big(\int dB_{u_1}\Big) dB_{u_2}\Big)\cdots\Big)\, dB_{u_n} = t^{n/2}\, h_n\Big(\frac{B_t}{\sqrt{t}}\Big) \tag{3.3.8}$$
where $h_n$ is the Hermite polynomial of degree $n$, defined by
$$h_n(x) = (-1)^n e^{\frac{x^2}{2}} \frac{d^n}{dx^n}\Big(e^{-\frac{x^2}{2}}\Big);\quad n = 0, 1, 2, \dots$$
(Thus $h_0(x) = 1$, $h_1(x) = x$, $h_2(x) = x^2 - 1$, $h_3(x) = x^3 - 3x$.)
a) Verify that in each of these $n$ Itô integrals the integrand satisfies the requirements in Definition 3.1.4.
b) Verify formula (3.3.8) for $n = 1, 2, 3$ by combining Example 3.1.9 and Exercise 3.2.
c) Use b) to give a new proof of the statement in Exercise 3.6.

3.8.* a) Let $Y$ be a real valued random variable on $(\Omega, \mathcal{F}, P)$ such that $E[|Y|] < \infty$. Define
$$M_t = E[Y | \mathcal{F}_t];\quad t \ge 0.$$
Show that $M_t$ is an $\mathcal{F}_t$-martingale.
b) Conversely, let $M_t$; $t \ge 0$ be a real valued $\mathcal{F}_t$-martingale such that
$$\sup_{t \ge 0} E[|M_t|^p] < \infty \quad \text{for some } p > 1.$$
Show that there exists $Y \in L^1(P)$ such that $M_t = E[Y | \mathcal{F}_t]$. (Hint: Use Corollary C.7.)

3.9.* Suppose $f \in \mathcal{V}(0, T)$ and that $t \to f(t, \omega)$ is continuous for a.a. $\omega$. Then we have shown that
$$\int_0^T f(t, \omega)\, dB_t(\omega) = \lim_{\Delta t_j \to 0} \sum_j f(t_j, \omega)\Delta B_j \quad \text{in } L^2(P).$$
Similarly we define the Stratonovich integral of $f$ by
$$\int_0^T f(t, \omega) \circ dB_t(\omega) = \lim_{\Delta t_j \to 0} \sum_j f(t_j^*, \omega)\Delta B_j,\quad \text{where } t_j^* = \tfrac{1}{2}(t_j + t_{j+1}),$$
whenever the limit exists in $L^2(P)$. In general these integrals are different. For example, compute
$$\int_0^T B_t \circ dB_t$$
and compare with Example 3.1.9.
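For a concrete feeling for the difference between the two sums, they can be compared on a single simulated path. The sketch below is illustrative (NumPy; the grid and seed are arbitrary): Brownian motion is simulated on a grid twice as fine as the partition, so that the midpoint values $B_{t_j^*}$ are available.

```python
import numpy as np

rng = np.random.default_rng(4)
T, n_intervals = 1.0, 5_000
dt = T / n_intervals

# Simulate B on a grid twice as fine, so the midpoint values are available
dB_fine = rng.normal(0.0, np.sqrt(dt / 2), size=2 * n_intervals)
B = np.concatenate(([0.0], np.cumsum(dB_fine)))

left, mid, right = B[0:-1:2], B[1::2], B[2::2]  # B_{t_j}, B_{t_j*}, B_{t_{j+1}}
dB = right - left

ito = np.sum(left * dB)    # Itô sums:          → ½B_T² − ½T
strat = np.sum(mid * dB)   # Stratonovich sums: → ½B_T²
print(strat - ito)         # ≈ T/2
```

The gap between the two sums concentrates near $T/2$ as the mesh shrinks, which is exactly the $-\tfrac12 t$ term of Example 3.1.9.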
0≤s, t≤T .
40
3. Itˆ o Integrals
Prove that then we have
T f (t, ω)dBt = lim
Δtj →0
0
f (tj , ω)ΔBj
(limit in L1 (P ))
j
for any choice of tj ∈ [tj , tj+1 ]. In particular,
T
T f (t, ω) ◦ dBt .
f (t, ω)dBt = 0
0
(Hint: Consider E | f (tj , ω)ΔBj − f (tj , ω)ΔBj | .) j
j
3.11. Let Wt be a stochastic process satisfying (i), (ii) and (iii) (below (3.1.2)). Prove that Wt cannot have continuous paths. (Hint: Consider (N ) (N ) E[(Wt − Ws )2 ], where (N )
Wt
= (−N ) ∨ (N ∧ Wt ), N = 1, 2, 3, . . .) .
3.12.* As in Exercise 3.9 we let ◦dBt denote Stratonovich differentials. (i) Use (3.3.6) to transform the following Stratonovich differential equations into Itˆ o differential equations: (a) dXt = γXt dt + αXt ◦ dBt (b) dXt = sin Xt cos Xt dt + (t2 + cos Xt ) ◦ dBt (ii) Transform the following Itˆo differential equations into Stratonovich differential equations: (a) dXt = rXt dt + αXt dBt (b) dXt = 2e−Xt dt + Xt2 dBt 3.13. A stochastic process Xt (·): Ω → R is continuous in mean square if E[Xt2 ] < ∞ for all t and lim E[(Xs − Xt )2 ] = 0
s→t
for all t ≥ 0 .
a) Prove that Brownian motion Bt is continuous in mean square. b) Let f : R → R be a Lipschitz continuous function, i.e. there exists C < ∞ such that |f (x) − f (y)| ≤ C|x − y| Prove that Yt : = f (Bt ) is continuous in mean square.
for all x, y ∈ R .
c) Let $X_t$ be a stochastic process which is continuous in mean square and assume that $X_t \in \mathcal{V}(S, T)$, $T < \infty$. Show that
$$\int_S^T X_t\, dB_t = \lim_{n \to \infty} \int_S^T \phi_n(t, \omega)\, dB_t(\omega) \quad (\text{limit in } L^2(P))$$
where $\phi_n(t, \omega) = \sum_j X_{t_j}(\omega) \cdot \mathcal{X}_{[t_j, t_{j+1})}(t)$.
Show that
$$E[X | \mathcal{G}](\omega) = \frac{\int_{G_i} X\, dP}{P(G_i)} \quad \text{for } \omega \in G_i.$$
c) Suppose $X$ assumes only finitely many values $a_1, \dots, a_m$. Then from elementary probability theory we know that (see Exercise 2.1)
$$E[X | G_i] = \sum_{k=1}^m a_k P[X = a_k | G_i].$$
Compare with b) and verify that
$$E[X | G_i] = E[X | \mathcal{G}](\omega) \quad \text{for } \omega \in G_i.$$
Thus we may regard the conditional expectation as defined in Appendix B as a (substantial) generalization of the conditional expectation in elementary probability theory.

3.18. Let $B_t$ be 1-dimensional Brownian motion and let $\sigma \in \mathbb{R}$ be constant. Prove directly from the definition that
$$M_t := \exp\big(\sigma B_t - \tfrac{1}{2}\sigma^2 t\big);\quad t \ge 0$$
is an $\mathcal{F}_t$-martingale.
(Hint: If $s > t$ then $E[\exp(\sigma B_s - \tfrac{1}{2}\sigma^2 s) | \mathcal{F}_t] = E[\exp(\sigma(B_s - B_t)) \cdot \exp(\sigma B_t - \tfrac{1}{2}\sigma^2 s) | \mathcal{F}_t]$. Now use Theorem B.2 e), Theorem B.2 d) and Exercise 2.20.)
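A quick simulation consistent with the martingale property of Exercise 3.18: since a martingale with $M_0 = 1$ satisfies $E[M_t] = 1$ for every $t$, the sample mean of $M_t = \exp(\sigma B_t - \tfrac12\sigma^2 t)$ should stay close to 1. This is only an illustrative sketch (NumPy; $\sigma$, the times and the seed are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(5)
sigma, n_paths = 0.8, 1_000_000

for t in (0.5, 1.0, 2.0):
    B_t = rng.normal(0.0, np.sqrt(t), size=n_paths)   # B_t ~ N(0, t)
    M_t = np.exp(sigma * B_t - 0.5 * sigma**2 * t)
    print(t, M_t.mean())   # ≈ 1 for each t
```

The constancy of $E[M_t]$ is of course only a necessary condition for the martingale property (cf. Exercise 3.3 c)), but it is the part that is easy to see numerically.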
4 The Itˆ o Formula and the Martingale Representation Theorem
4.1 The 1-dimensional Itô formula

Example 3.1.9 illustrates that the basic definition of Itô integrals is not very useful when we try to evaluate a given integral. This is similar to the situation for ordinary Riemann integrals, where we do not use the basic definition but rather the fundamental theorem of calculus plus the chain rule in the explicit calculations. In this context, however, we have no differentiation theory, only integration theory. Nevertheless it turns out that it is possible to establish an Itô integral version of the chain rule, called the Itô formula. The Itô formula is, as we will show by examples, very useful for evaluating Itô integrals.

From the example
$$\int_0^t B_s\, dB_s = \tfrac{1}{2}B_t^2 - \tfrac{1}{2}t \quad \text{or} \quad \tfrac{1}{2}B_t^2 = \tfrac{1}{2}t + \int_0^t B_s\, dB_s, \tag{4.1.1}$$
we see that the image of the Itô integral $B_t = \int_0^t dB_s$ by the map $g(x) = \tfrac{1}{2}x^2$ is not again an Itô integral of the form
$$\int_0^t f(s, \omega)\, dB_s(\omega)$$
but a combination of a $dB_s$- and a $ds$-integral:
$$\tfrac{1}{2}B_t^2 = \int_0^t \tfrac{1}{2}\, ds + \int_0^t B_s\, dB_s. \tag{4.1.2}$$

B. Øksendal, Stochastic Differential Equations, Universitext, DOI 10.1007/978-3-642-14394-6 4, © Springer-Verlag Berlin Heidelberg 2013
It turns out that if we introduce Itô processes (also called stochastic integrals) as sums of a $dB_s$- and a $ds$-integral then this family of integrals is stable under smooth maps. Thus we define:

Definition 4.1.1 (1-dimensional Itô processes) Let $B_t$ be 1-dimensional Brownian motion on $(\Omega, \mathcal{F}, P)$. A (1-dimensional) Itô process (or stochastic integral) is a stochastic process $X_t$ on $(\Omega, \mathcal{F}, P)$ of the form
$$X_t = X_0 + \int_0^t u(s, \omega)\, ds + \int_0^t v(s, \omega)\, dB_s, \tag{4.1.3}$$
where $v \in \mathcal{W}_{\mathcal{H}}$, so that
$$P\Big[\int_0^t v(s, \omega)^2\, ds < \infty \ \text{for all } t \ge 0\Big] = 1 \tag{4.1.4}$$
(see Definition 3.3.2). We also assume that $u$ is $\mathcal{H}_t$-adapted (where $\mathcal{H}_t$ is as in (ii)', Section 3.3) and
$$P\Big[\int_0^t |u(s, \omega)|\, ds < \infty \ \text{for all } t \ge 0\Big] = 1. \tag{4.1.5}$$

If $X_t$ is an Itô process of the form (4.1.3), the equation (4.1.3) is sometimes written in the shorter differential form
$$dX_t = u\, dt + v\, dB_t. \tag{4.1.6}$$
For example, (4.1.1) (or (4.1.2)) may be represented by
$$d\big(\tfrac{1}{2}B_t^2\big) = \tfrac{1}{2}\, dt + B_t\, dB_t.$$

We are now ready to state the first main result in this chapter:

Theorem 4.1.2 (The 1-dimensional Itô formula) Let $X_t$ be an Itô process given by
$$dX_t = u\, dt + v\, dB_t.$$
Let $g(t, x) \in C^2([0, \infty) \times \mathbb{R})$ (i.e. $g$ is twice continuously differentiable on $[0, \infty) \times \mathbb{R}$). Then
$$Y_t = g(t, X_t)$$
is again an Itô process, and
$$dY_t = \frac{\partial g}{\partial t}(t, X_t)\, dt + \frac{\partial g}{\partial x}(t, X_t)\, dX_t + \tfrac{1}{2}\frac{\partial^2 g}{\partial x^2}(t, X_t) \cdot (dX_t)^2, \tag{4.1.7}$$
where $(dX_t)^2 = (dX_t) \cdot (dX_t)$ is computed according to the rules
$$dt \cdot dt = dt \cdot dB_t = dB_t \cdot dt = 0,\qquad dB_t \cdot dB_t = dt. \tag{4.1.8}$$
Before we prove Itô's formula let us look at some examples.

Example 4.1.3 Let us return to the integral
$$I = \int_0^t B_s\, dB_s \quad \text{from Chapter 3}.$$
Choose $X_t = B_t$ and $g(t, x) = \tfrac{1}{2}x^2$. Then
$$Y_t = g(t, B_t) = \tfrac{1}{2}B_t^2.$$
Then by Itô's formula,
$$dY_t = \frac{\partial g}{\partial t}\, dt + \frac{\partial g}{\partial x}\, dB_t + \tfrac{1}{2}\frac{\partial^2 g}{\partial x^2}(dB_t)^2 = B_t\, dB_t + \tfrac{1}{2}(dB_t)^2 = B_t\, dB_t + \tfrac{1}{2}\, dt.$$
Hence
$$d\big(\tfrac{1}{2}B_t^2\big) = B_t\, dB_t + \tfrac{1}{2}\, dt.$$
In other words,
$$\tfrac{1}{2}B_t^2 = \int_0^t B_s\, dB_s + \tfrac{1}{2}t, \quad \text{as in Chapter 3}.$$
t sdBs ? 0
From classical calculus it seems reasonable that a term of the form tBt should appear, so we put g(t, x) = tx and Yt = g(t, Bt ) = tBt . Then by Itˆ o’s formula, dYt = Bt dt + tdBt + 0 i.e. d(tBt ) = Bt dt + tdBt or
t tBt =
t Bs ds +
0
sdBs 0
or
$$\int_0^t s\, dB_s = tB_t - \int_0^t B_s\, ds,$$
which is reasonable from an integration-by-parts point of view. More generally, the same method gives

Theorem 4.1.5 (Integration by parts) Suppose $f(s, \omega)$ is continuous and of bounded variation with respect to $s \in [0, t]$, for a.a. $\omega$. (See Exercise 2.17.) Then
$$\int_0^t f(s)\, dB_s = f(t)B_t - \int_0^t B_s\, df_s.$$
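The integration-by-parts identity can be illustrated with the deterministic integrand $f(s) = s$ of Example 4.1.4. The sketch below (NumPy; uniform grid and seed are arbitrary illustrative choices) compares the two sides along one simulated path:

```python
import numpy as np

rng = np.random.default_rng(6)
T, n_steps = 1.0, 50_000
dt = T / n_steps
t = np.linspace(0.0, T, n_steps + 1)

dB = rng.normal(0.0, np.sqrt(dt), size=n_steps)
B = np.concatenate(([0.0], np.cumsum(dB)))

lhs = np.sum(t[:-1] * dB)                # ∫₀ᵀ s dB_s  (Itô sums)
rhs = T * B[-1] - np.sum(B[:-1] * dt)    # T·B_T − ∫₀ᵀ B_s ds
print(lhs, rhs)
```

Because $f$ here is smooth and deterministic, the discrepancy between the two discretized sides is tiny even on a moderate grid; no Itô correction term appears, in line with the theorem.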
Note that it is crucial for the result to hold that $f$ is of bounded variation. (See Exercise 4.3 for the general case.)

Sketch of proof of the Itô formula. First observe that if we substitute
$$dX_t = u\, dt + v\, dB_t$$
in (4.1.7) and use (4.1.8) we get the equivalent expression
$$g(t, X_t) = g(0, X_0) + \int_0^t \Big(\frac{\partial g}{\partial s}(s, X_s) + u_s\frac{\partial g}{\partial x}(s, X_s) + \tfrac{1}{2}v_s^2 \cdot \frac{\partial^2 g}{\partial x^2}(s, X_s)\Big)\, ds + \int_0^t v_s \cdot \frac{\partial g}{\partial x}(s, X_s)\, dB_s \tag{4.1.9}$$
where $u_s = u(s, \omega)$, $v_s = v(s, \omega)$.

Note that (4.1.9) is an Itô process in the sense of Definition 4.1.1.

We may assume that $g$, $\frac{\partial g}{\partial t}$, $\frac{\partial g}{\partial x}$ and $\frac{\partial^2 g}{\partial x^2}$ are bounded, for if (4.1.9) is proved in this case we obtain the general case by approximating by $C^2$ functions $g_n$ such that $g_n$, $\frac{\partial g_n}{\partial t}$, $\frac{\partial g_n}{\partial x}$ and $\frac{\partial^2 g_n}{\partial x^2}$ are bounded for each $n$ and converge uniformly on compact subsets of $[0, \infty) \times \mathbb{R}$ to $g$, $\frac{\partial g}{\partial t}$, $\frac{\partial g}{\partial x}$, $\frac{\partial^2 g}{\partial x^2}$, respectively. (See Exercise 4.9.) Moreover, from (3.3.1) we see that we may assume that $u(t, \omega)$ and $v(t, \omega)$ are elementary functions.

Using Taylor's theorem we get
$$g(t, X_t) = g(0, X_0) + \sum_j \Delta g(t_j, X_j) = g(0, X_0) + \sum_j \frac{\partial g}{\partial t}\Delta t_j + \sum_j \frac{\partial g}{\partial x}\Delta X_j$$
$$+ \tfrac{1}{2}\sum_j \frac{\partial^2 g}{\partial t^2}(\Delta t_j)^2 + \sum_j \frac{\partial^2 g}{\partial t \partial x}(\Delta t_j)(\Delta X_j) + \tfrac{1}{2}\sum_j \frac{\partial^2 g}{\partial x^2}(\Delta X_j)^2 + \sum_j R_j,$$
where $\frac{\partial g}{\partial t}$, $\frac{\partial g}{\partial x}$ etc. are evaluated at the points $(t_j, X_{t_j})$,
$$\Delta t_j = t_{j+1} - t_j,\quad \Delta X_j = X_{t_{j+1}} - X_{t_j},\quad \Delta g(t_j, X_j) = g(t_{j+1}, X_{t_{j+1}}) - g(t_j, X_{t_j}),$$
and $R_j = o(|\Delta t_j|^2 + |\Delta X_j|^2)$ for all $j$.

If $\Delta t_j \to 0$ then
$$\sum_j \frac{\partial g}{\partial t}\Delta t_j = \sum_j \frac{\partial g}{\partial t}(t_j, X_j)\Delta t_j \to \int_0^t \frac{\partial g}{\partial t}(s, X_s)\, ds \tag{4.1.10}$$
$$\sum_j \frac{\partial g}{\partial x}\Delta X_j = \sum_j \frac{\partial g}{\partial x}(t_j, X_j)\Delta X_j \to \int_0^t \frac{\partial g}{\partial x}(s, X_s)\, dX_s. \tag{4.1.11}$$
Moreover, since $u$ and $v$ are elementary we get
$$\sum_j \frac{\partial^2 g}{\partial x^2}(\Delta X_j)^2 = \sum_j \frac{\partial^2 g}{\partial x^2}u_j^2(\Delta t_j)^2 + 2\sum_j \frac{\partial^2 g}{\partial x^2}u_j v_j(\Delta t_j)(\Delta B_j) + \sum_j \frac{\partial^2 g}{\partial x^2}v_j^2 \cdot (\Delta B_j)^2, \tag{4.1.12}$$
where $u_j = u(t_j, \omega)$, $v_j = v(t_j, \omega)$.
The first two terms here tend to $0$ as $\Delta t_j \to 0$. For example,
$$E\Big[\Big(\sum_j \frac{\partial^2 g}{\partial x^2}u_j v_j(\Delta t_j)(\Delta B_j)\Big)^2\Big] = \sum_j E\Big[\Big(\frac{\partial^2 g}{\partial x^2}u_j v_j\Big)^2\Big](\Delta t_j)^3 \to 0 \quad \text{as } \Delta t_j \to 0.$$
We claim that the last term tends to
$$\int_0^t \frac{\partial^2 g}{\partial x^2}v^2\, ds \quad \text{in } L^2(P), \text{ as } \Delta t_j \to 0.$$
To prove this, put $a(t) = \frac{\partial^2 g}{\partial x^2}(t, X_t)v^2(t, \omega)$, $a_j = a(t_j)$ and consider
$$E\Big[\Big(\sum_j a_j(\Delta B_j)^2 - \sum_j a_j\Delta t_j\Big)^2\Big] = \sum_{i,j} E\big[a_i a_j\big((\Delta B_i)^2 - \Delta t_i\big)\big((\Delta B_j)^2 - \Delta t_j\big)\big].$$
If $i < j$ then $a_i a_j\big((\Delta B_i)^2 - \Delta t_i\big)$ and $(\Delta B_j)^2 - \Delta t_j$ are independent, so the terms vanish in this case, and similarly if $i > j$. So we are left with
$$\sum_j E\big[a_j^2\big((\Delta B_j)^2 - \Delta t_j\big)^2\big] = \sum_j E[a_j^2] \cdot E\big[(\Delta B_j)^4 - 2(\Delta B_j)^2\Delta t_j + (\Delta t_j)^2\big]$$
$$= \sum_j E[a_j^2] \cdot \big(3(\Delta t_j)^2 - 2(\Delta t_j)^2 + (\Delta t_j)^2\big) = 2\sum_j E[a_j^2] \cdot (\Delta t_j)^2 \to 0 \quad \text{as } \Delta t_j \to 0.$$
In other words, we have established that
$$\sum_j a_j(\Delta B_j)^2 \to \int_0^t a(s)\, ds \quad \text{in } L^2(P) \text{ as } \Delta t_j \to 0,$$
and this is often expressed shortly by the striking formula
$$(dB_t)^2 = dt. \tag{4.1.13}$$
The argument above also proves that $\sum_j R_j \to 0$ as $\Delta t_j \to 0$. That completes the proof of the Itô formula. $\square$

Remark. Note that it is enough that $g(t, x)$ is $C^2$ on $[0, \infty) \times U$, if $U \subset \mathbb{R}$ is an open set such that $X_t(\omega) \in U$ for all $t \ge 0$, $\omega \in \Omega$. Moreover, it is sufficient that $g(t, x)$ is $C^1$ w.r.t. $t$ and $C^2$ w.r.t. $x$.
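The heuristic $(dB_t)^2 = dt$ reflects the fact that the quadratic variation $\sum_j (\Delta B_j)^2$ concentrates at $t$ as the mesh shrinks. A minimal NumPy sketch (sample sizes, mesh choices and seed are arbitrary illustrative values):

```python
import numpy as np

rng = np.random.default_rng(7)
t, n_paths = 1.0, 1_000

for n_steps in (10, 100, 10_000):
    dt = t / n_steps
    dB = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
    qv = np.sum(dB**2, axis=1)            # Σ_j (ΔB_j)² over [0, t]
    print(n_steps, qv.mean(), qv.var())   # mean ≈ t, variance 2t·Δt_j → 0
```

Each $(\Delta B_j)^2$ has mean $\Delta t_j$ and variance $2(\Delta t_j)^2$, so the sum has mean $t$ and variance $2t\,\Delta t_j$ on a uniform mesh; this is exactly the computation used in the proof above.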
4.2 The Multi-dimensional Itô Formula

We now turn to the situation in higher dimensions: Let $B(t, \omega) = (B_1(t, \omega), \dots, B_m(t, \omega))$ denote $m$-dimensional Brownian motion. If each of the processes $u_i(t, \omega)$ and $v_{ij}(t, \omega)$ satisfies the conditions given in Definition 4.1.1 ($1 \le i \le n$, $1 \le j \le m$) then we can form the following $n$ Itô processes:
$$\begin{cases} dX_1 = u_1\, dt + v_{11}\, dB_1 + \cdots + v_{1m}\, dB_m \\ \quad\vdots \\ dX_n = u_n\, dt + v_{n1}\, dB_1 + \cdots + v_{nm}\, dB_m \end{cases} \tag{4.2.1}$$
Or, in matrix notation simply
$$dX(t) = u\, dt + v\, dB(t), \tag{4.2.2}$$
where
$$X(t) = \begin{pmatrix} X_1(t) \\ \vdots \\ X_n(t) \end{pmatrix},\quad u = \begin{pmatrix} u_1 \\ \vdots \\ u_n \end{pmatrix},\quad v = \begin{pmatrix} v_{11} & \cdots & v_{1m} \\ \vdots & & \vdots \\ v_{n1} & \cdots & v_{nm} \end{pmatrix},\quad dB(t) = \begin{pmatrix} dB_1(t) \\ \vdots \\ dB_m(t) \end{pmatrix}. \tag{4.2.3}$$
Such a process $X(t)$ is called an $n$-dimensional Itô process (or just an Itô process).
We now ask: What is the result of applying a smooth function to $X$? The answer is given by

Theorem 4.2.1 (The general Itô formula) Let
$$dX(t) = u\, dt + v\, dB(t)$$
be an $n$-dimensional Itô process as above. Let $g(t, x) = (g_1(t, x), \dots, g_p(t, x))$ be a $C^2$ map from $[0, \infty) \times \mathbb{R}^n$ into $\mathbb{R}^p$. Then the process
$$Y(t, \omega) = g(t, X(t))$$
is again an Itô process, whose component number $k$, $Y_k$, is given by
$$dY_k = \frac{\partial g_k}{\partial t}(t, X)\, dt + \sum_i \frac{\partial g_k}{\partial x_i}(t, X)\, dX_i + \tfrac{1}{2}\sum_{i,j} \frac{\partial^2 g_k}{\partial x_i \partial x_j}(t, X)\, dX_i\, dX_j,$$
where $dB_i\, dB_j = \delta_{ij}\, dt$ and $dB_i\, dt = dt\, dB_i = 0$.

The proof is similar to the 1-dimensional version (Theorem 4.1.2) and is omitted.

Example 4.2.2 Let $B = (B_1, \dots, B_n)$ be Brownian motion in $\mathbb{R}^n$, $n \ge 2$, and consider
$$R(t, \omega) = |B(t, \omega)| = \big(B_1^2(t, \omega) + \cdots + B_n^2(t, \omega)\big)^{\frac{1}{2}},$$
i.e. the distance to the origin of $B(t, \omega)$. The function $g(t, x) = |x|$ is not $C^2$ at the origin, but since $B_t$ never hits the origin, a.s. when $n \ge 2$ (see Exercise 9.7), Itô's formula still works and we get
$$dR = \sum_{i=1}^n \frac{B_i\, dB_i}{R} + \frac{n-1}{2R}\, dt.$$
The process $R$ is called the $n$-dimensional Bessel process because its generator (Chapter 7) is the Bessel differential operator $Af(x) = \tfrac{1}{2}f''(x) + \tfrac{n-1}{2x}f'(x)$. See Example 8.4.1.
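The Itô-formula computation in this example can be sanity-checked by simulation: integrating the increments of the SDE for $R$, driven by the same Brownian increments and with the coefficients evaluated along the true path, should track $|B(t)|$ computed directly. A rough sketch (NumPy; $n = 3$, a path started away from the origin, and an arbitrary grid and seed):

```python
import numpy as np

rng = np.random.default_rng(8)
n_dim, T, n_steps = 3, 1.0, 100_000
dt = T / n_steps

B = np.array([1.0, 0.0, 0.0])     # start away from the origin
R_euler = np.linalg.norm(B)

for dB in rng.normal(0.0, np.sqrt(dt), size=(n_steps, n_dim)):
    R = np.linalg.norm(B)
    # Euler step for dR = Σ_i B_i dB_i / R + (n−1)/(2R) dt
    R_euler += np.dot(B, dB) / R + (n_dim - 1) / (2.0 * R) * dt
    B = B + dB

print(R_euler, np.linalg.norm(B))  # the scheme tracks |B(t)| closely
```

Note the role of the drift term: summing only $\sum_i B_i\, dB_i / R$ would systematically undershoot $|B(t)|$; the $(n-1)/(2R)\, dt$ correction is exactly the second-order contribution from Itô's formula.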
4.3 The Martingale Representation Theorem

Let $B(t) = (B_1(t), \dots, B_n(t))$ be $n$-dimensional Brownian motion. In Chapter 3 (Corollary 3.2.6) we proved that if $v \in \mathcal{V}^n$ then the Itô integral
$$X_t = X_0 + \int_0^t v(s, \omega)\, dB(s);\quad t \ge 0$$
4 The Itˆ o Formula and the Martingale ... (n)
is always a martingale w.r.t. filtration Ft (and w.r.t. the probability measure (n) P ). In this section we will prove that the converse is also true: Any Ft martingale (w.r.t. P ) can be represented as an Itˆo integral (Theorem 4.3.4). This result, called the martingale representation theorem, is important for many applications, for example in mathematical finance. See Chapter 12. For simplicity we prove the result only when n = 1, but the reader can easily verify that essentially the same proof works for arbitrary n. We first establish some auxiliary results. Lemma 4.3.1 Fix T > 0. The set of random variables {φ(Bt1 , . . . , Btn ); ti ∈ [0, T ], φ ∈ C0∞ (Rn ), n = 1, 2, . . .} is dense in L2 (FT , P ). Proof. Let {ti }∞ i=1 be a dense subset of [0, T ] and for each n = 1, 2, . . . let Hn be the σ-algebra generated by Bt1 (·), . . . , Btn (·). Then clearly Hn ⊂ Hn+1 and FT is the smallest σ-algebra containing all the Hn ’s. Choose g ∈ L2 (FT , P ). Then by the martingale convergence theorem Corollary C.9 (Appendix C) we have that g = E[g|FT ] = lim E[g|Hn ] . n→∞
The limit is pointwise a.e. (P ) and in L^2 (FT , P ). By the Doob-Dynkin Lemma (Lemma 2.1.2) we can write, for each n,

E[g|Hn ] = gn (Bt1 , . . . , Btn )

for some Borel measurable function gn : R^n → R. Each such gn (Bt1 , . . . , Btn ) can be approximated in L^2 (FT , P ) by functions φn (Bt1 , . . . , Btn ) where φn ∈ C0^∞ (R^n ) and the result follows.

For an alternative proof of the next result see Exercise 4.17.

Lemma 4.3.2 The linear span of random variables of the type

exp{ ∫_0^T h(t) dBt (ω) − ½ ∫_0^T h^2 (t) dt } ;  h ∈ L^2 [0, T ] (deterministic)     (4.3.1)

is dense in L^2 (FT , P ).

Proof. Suppose g ∈ L^2 (FT , P ) is orthogonal (in L^2 (FT , P )) to all functions of the form (4.3.1). Then in particular
G(λ) := ∫_Ω exp{λ1 Bt1 (ω) + · · · + λn Btn (ω)} g(ω) dP (ω) = 0     (4.3.2)

for all λ = (λ1 , . . . , λn ) ∈ R^n and all t1 , . . . , tn ∈ [0, T ]. The function G(λ) is real analytic in λ ∈ R^n and hence G has an analytic extension to the complex space C^n given by

G(z) = ∫_Ω exp{z1 Bt1 (ω) + · · · + zn Btn (ω)} g(ω) dP (ω)     (4.3.3)

for all z = (z1 , . . . , zn ) ∈ C^n . (See the estimates in Exercise 2.8 b).) Since G = 0 on R^n and G is analytic, G = 0 on C^n . In particular, G(iy1 , iy2 , . . . , iyn ) = 0 for all y = (y1 , . . . , yn ) ∈ R^n . But then we get, for φ ∈ C0^∞ (R^n ),
∫_Ω φ(Bt1 , . . . , Btn ) g(ω) dP (ω)
 = ∫_Ω (2π)^{−n/2} ( ∫_{R^n} φ̂(y) e^{i(y1 Bt1 +···+yn Btn )} dy ) g(ω) dP (ω)
 = (2π)^{−n/2} ∫_{R^n} φ̂(y) ( ∫_Ω e^{i(y1 Bt1 +···+yn Btn )} g(ω) dP (ω) ) dy
 = (2π)^{−n/2} ∫_{R^n} φ̂(y) G(iy) dy = 0 ,     (4.3.4)

where

φ̂(y) = (2π)^{−n/2} ∫_{R^n} φ(x) e^{−i x·y} dx

is the Fourier transform of φ and we have used the inverse Fourier transform theorem

φ(x) = (2π)^{−n/2} ∫_{R^n} φ̂(y) e^{i x·y} dy
(see e.g. Folland (1984)). By (4.3.4) and Lemma 4.3.1 g is orthogonal to a dense subset of L^2 (FT , P ) and we conclude that g = 0. Therefore the linear span of the functions in (4.3.1) must be dense in L^2 (FT , P ) as claimed.

Suppose B(t) = (B1 (t), . . . , Bn (t)) is n-dimensional. If v(s, ω) ∈ V^n (0, T ) then the random variable

V (ω) := ∫_0^T v(t, ω) dB(t)     (4.3.5)
is FT^(n)-measurable and by the Itô isometry

E[V^2 ] = ∫_0^T E[v^2 (t, ·)] dt < ∞ ,  so V ∈ L^2 (FT^(n) , P ) .

The next result states that any F ∈ L^2 (FT^(n) , P ) can be represented this way:

Theorem 4.3.3 (The Itô representation theorem) Let F ∈ L^2 (FT^(n) , P ). Then there exists a unique stochastic process f (t, ω) ∈ V^n (0, T ) such that

F (ω) = E[F ] + ∫_0^T f (t, ω) dB(t) .     (4.3.6)
Proof. Again we consider only the case n = 1. (The proof in the general case is similar.) First assume that F has the form (4.3.1), i.e.

F (ω) = exp{ ∫_0^T h(t) dBt (ω) − ½ ∫_0^T h^2 (t) dt }

for some h(t) ∈ L^2 [0, T ]. Define

Yt (ω) = exp{ ∫_0^t h(s) dBs (ω) − ½ ∫_0^t h^2 (s) ds } ;  0 ≤ t ≤ T .

Then by Itô's formula

dYt = Yt (h(t) dBt − ½ h^2 (t) dt) + ½ Yt (h(t) dBt )^2 = Yt h(t) dBt

so that

Yt = 1 + ∫_0^t Ys h(s) dBs ;  t ∈ [0, T ] .

Therefore

F = YT = 1 + ∫_0^T Ys h(s) dBs

and hence E[F ] = 1. So (4.3.6) holds in this case. If F ∈ L^2 (FT , P ) is arbitrary, we can by Lemma 4.3.2 approximate F in L^2 (FT , P ) by linear combinations Fn of functions of the form (4.3.1). By linearity (4.3.6) also holds for linear combinations of functions of the form (4.3.1). Then for each n we have
Fn (ω) = E[Fn ] + ∫_0^T fn (s, ω) dBs (ω) ,  where fn ∈ V(0, T ) .

By the Itô isometry

E[(Fn − Fm )^2 ] = E[ ( E[Fn − Fm ] + ∫_0^T (fn − fm ) dB )^2 ]
 = (E[Fn − Fm ])^2 + ∫_0^T E[(fn − fm )^2 ] dt → 0  as n, m → ∞
so {fn } is a Cauchy sequence in L^2 ([0, T ] × Ω) and hence converges to some f ∈ L^2 ([0, T ] × Ω). Since fn ∈ V(0, T ) we have f ∈ V(0, T ). (A subsequence of {fn (t, ω)} converges to f (t, ω) for a.a. (t, ω) ∈ [0, T ] × Ω. Therefore f (t, ·) is Ft -measurable for a.a. t. So by modifying f (t, ω) on a t-set of measure 0 we can obtain that f (t, ω) is Ft -adapted.) Again using the Itô isometry we see that

F = lim_{n→∞} Fn = lim_{n→∞} ( E[Fn ] + ∫_0^T fn dB ) = E[F ] + ∫_0^T f dB ,

the limit being taken in L^2 (FT , P ). Hence the representation (4.3.6) holds for all F ∈ L^2 (FT , P ).

The uniqueness follows from the Itô isometry: Suppose

F (ω) = E[F ] + ∫_0^T f1 (t, ω) dBt (ω) = E[F ] + ∫_0^T f2 (t, ω) dBt (ω)

with f1 , f2 ∈ V(0, T ). Then

0 = E[ ( ∫_0^T (f1 (t, ω) − f2 (t, ω)) dBt (ω) )^2 ] = ∫_0^T E[(f1 (t, ω) − f2 (t, ω))^2 ] dt

and therefore f1 (t, ω) = f2 (t, ω) for a.a. (t, ω) ∈ [0, T ] × Ω.
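The representation (4.3.6) can be checked numerically in the classical special case F = B_T^2, where Itô's formula applied to B_t^2 gives E[F ] = T and integrand f (t) = 2B_t, so that B_T^2 = T + ∫_0^T 2B_t dB_t pathwise. The following sketch (our own illustration, not from the text) approximates the Itô integral by a left-endpoint Riemann sum on one simulated path.

```python
import math
import random

def ito_representation_check(T=1.0, steps=20000, seed=7):
    """Compare B_T^2 with E[B_T^2] + int_0^T 2 B_t dB_t on one path,
    using the left-endpoint (Ito) Riemann sum for the integral."""
    rng = random.Random(seed)
    dt = T / steps
    b = 0.0
    integral = 0.0
    for _ in range(steps):
        db = rng.gauss(0.0, math.sqrt(dt))
        integral += 2.0 * b * db   # integrand 2*B_t at the LEFT endpoint
        b += db
    return b * b, T + integral     # (F, E[F] + Ito integral)
```

The discrete discrepancy is Σ (ΔB)^2 − T, which vanishes in L^2 as the mesh goes to 0 — the same quadratic-variation fact that drives Itô's formula.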
Remark. The process f (t, ω) can be expressed in terms of the Fréchet derivative and also in terms of the Malliavin derivative of F (ω). See Clark (1970/71), Davis (1980) and Ocone (1984).

Theorem 4.3.4 (The martingale representation theorem) Let B(t) = (B1 (t), . . . , Bn (t)) be n-dimensional. Suppose Mt is an Ft^(n)-martingale (w.r.t. P ) and that Mt ∈ L^2 (P ) for all t ≥ 0. Then there exists a unique stochastic process g(s, ω) such that g ∈ V^(n) (0, t) for all t ≥ 0 and

Mt (ω) = E[M0 ] + ∫_0^t g(s, ω) dB(s)  a.s., for all t ≥ 0 .

Proof (n = 1). By Theorem 4.3.3 applied to T = t, F = Mt , we have that for all t there exists a unique f^(t) (s, ω) ∈ L^2 (Ft , P ) such that

Mt (ω) = E[Mt ] + ∫_0^t f^(t) (s, ω) dBs (ω) = E[M0 ] + ∫_0^t f^(t) (s, ω) dBs (ω) .
Now assume 0 ≤ t1 < t2 . Then

Mt1 = E[Mt2 |Ft1 ] = E[M0 ] + E[ ∫_0^{t2} f^(t2) (s, ω) dBs (ω) | Ft1 ]
 = E[M0 ] + ∫_0^{t1} f^(t2) (s, ω) dBs (ω) .     (4.3.7)

But we also have

Mt1 = E[M0 ] + ∫_0^{t1} f^(t1) (s, ω) dBs (ω) .     (4.3.8)

Hence, comparing (4.3.7) and (4.3.8) we get that

0 = E[ ( ∫_0^{t1} (f^(t2) − f^(t1) ) dB )^2 ] = ∫_0^{t1} E[(f^(t2) − f^(t1) )^2 ] ds

and therefore

f^(t1) (s, ω) = f^(t2) (s, ω)  for a.a. (s, ω) ∈ [0, t1 ] × Ω .
So we can define f (s, ω) for a.a. (s, ω) ∈ [0, ∞) × Ω by setting

f (s, ω) = f^(N) (s, ω)  if s ∈ [0, N ]

and then we get

Mt = E[M0 ] + ∫_0^t f^(t) (s, ω) dBs (ω) = E[M0 ] + ∫_0^t f (s, ω) dBs (ω)  for all t ≥ 0 .
Exercises

4.1.* Use Itô's formula to write the following stochastic processes Yt in the standard form dYt = u(t, ω)dt + v(t, ω)dBt for suitable choices of u ∈ R^n , v ∈ R^{n×m} and dimensions n, m:
a) Yt = Bt^2 , where Bt is 1-dimensional
b) Yt = 2 + t + e^{Bt} (Bt is 1-dimensional)
c) Yt = B1^2 (t) + B2^2 (t) where (B1 , B2 ) is 2-dimensional
d) Yt = (t0 + t, Bt ) (Bt is 1-dimensional)
e) Yt = (B1 (t) + B2 (t) + B3 (t), B2^2 (t) − B1 (t)B3 (t)), where (B1 , B2 , B3 ) is 3-dimensional.

4.2.* Use Itô's formula to prove that
∫_0^t Bs^2 dBs = (1/3) Bt^3 − ∫_0^t Bs ds .
4.3.* Let Xt , Yt be Itô processes in R. Prove that

d(Xt Yt ) = Xt dYt + Yt dXt + dXt · dYt .

Deduce the following general integration by parts formula:

∫_0^t Xs dYs = Xt Yt − X0 Y0 − ∫_0^t Ys dXs − ∫_0^t dXs · dYs .

4.4.
(Exponential martingales) Suppose θ(t, ω) = (θ1 (t, ω), . . . , θn (t, ω)) ∈ R^n with θk (t, ω) ∈ V[0, T ] for k = 1, . . . , n, where T ≤ ∞. Define

Zt = exp{ ∫_0^t θ(s, ω) dB(s) − ½ ∫_0^t θ^2 (s, ω) ds } ;  0 ≤ t ≤ T
where B(s) ∈ Rn and θ2 = θ · θ (dot product). a) Use Itˆ o’s formula to prove that dZt = Zt θ(t, ω)dB(t) . b) Deduce that Zt is a martingale for t ≤ T , provided that Zt θk (t, ω) ∈ V[0, T ]
for 1 ≤ k ≤ n .
Remark. A sufficient condition for Zt to be a martingale is the Kazamaki condition

E[ exp( ½ ∫_0^t θ(s, ω) dB(s) ) ] < ∞  for all t ≤ T .

... → 1 as n → ∞, we can conclude that (4.1.9) holds (a.s.) for g.

4.10. (Tanaka's formula and local time). What happens if we try to apply the Itô formula to g(Bt ) when Bt is 1-dimensional and g(x) = |x| ? In this case g is not C^2 at x = 0, so we modify g(x) near x = 0 to gε (x) as follows:

gε (x) = |x|  if |x| ≥ ε ;  gε (x) = ½ (ε + x^2 /ε)  if |x| < ε ,

where ε > 0.
a) Apply Exercise 4.8 b) to show that
gε (Bt ) = gε (B0 ) + ∫_0^t gε′ (Bs ) dBs + (1/2ε) · |{s ∈ [0, t]; Bs ∈ (−ε, ε)}|

where |F | denotes the Lebesgue measure of the set F .
b) Prove that

∫_0^t gε′ (Bs ) · X_{Bs ∈(−ε,ε)} dBs = ∫_0^t (Bs /ε) · X_{Bs ∈(−ε,ε)} dBs → 0

in L^2 (P ) as ε → 0. (Hint: Apply the Itô isometry to

E[ ( ∫_0^t (Bs /ε) · X_{Bs ∈(−ε,ε)} dBs )^2 ] . )

c) By letting ε → 0 prove that

|Bt | = |B0 | + ∫_0^t sign(Bs ) dBs + Lt (ω) ,     (4.3.12)

where

Lt = lim_{ε→0} (1/2ε) · |{s ∈ [0, t]; Bs ∈ (−ε, ε)}|  (limit in L^2 (P ))

and

sign(x) = −1 for x ≤ 0 ;  1 for x > 0 .
Lt is called the local time for Brownian motion at 0 and (4.3.12) is the Tanaka formula (for Brownian motion). (See e.g. Rogers and Williams (1987)).

4.11.* Use Itô's formula (for example in the form of Exercise 4.3) to prove that the following stochastic processes are {Ft }-martingales:
a) Xt = e^{t/2} cos Bt (Bt ∈ R)
b) Xt = e^{t/2} sin Bt (Bt ∈ R)
c) Xt = (Bt + t) exp(−Bt − t/2) (Bt ∈ R).
4.12. Let dXt = u(t, ω)dt + v(t, ω)dBt be an Itô process in R^n such that

E[ ∫_0^t |u(r, ω)| dr ] + E[ ∫_0^t |v v^T (r, ω)| dr ] < ∞  for all t ≥ 0 .

Suppose Xt is an {Ft^(n)}-martingale. Prove that

u(s, ω) = 0  for a.a. (s, ω) ∈ [0, ∞) × Ω .     (4.3.13)

Remarks:
1) This result may be regarded as a special case of the Martingale Representation Theorem.
2) The conclusion (4.3.13) does not hold if the filtration Ft^(n) is replaced by the σ-algebras Mt generated by Xs (·); s ≤ t, i.e. if we only assume that Xt is a martingale w.r.t. its own filtration. See e.g. the Brownian motion characterization in Chapter 8.

Hint for the solution: If Xt is an Ft^(n)-martingale, then deduce that

E[ ∫_t^s u(r, ω) dr | Ft^(n) ] = 0  for all s ≥ t .

Differentiate w.r.t. s to deduce that

E[u(s, ω)|Ft^(n) ] = 0  a.s., for a.a. s > t .
Then let t ↑ s and apply Corollary C.9.

4.13. Let dXt = u(t, ω)dt + dBt (u ∈ R, Bt ∈ R) be an Itô process and assume for simplicity that u is bounded. Then from Exercise 4.12 we know that unless u = 0 the process Xt is not an Ft -martingale. However, it turns out that we can construct an Ft -martingale from Xt by multiplying by a suitable exponential martingale. More precisely, define Yt = Xt Mt where

Mt = exp{ − ∫_0^t u(r, ω) dBr − ½ ∫_0^t u^2 (r, ω) dr } .

Use Itô's formula to prove that Yt is an Ft -martingale.
Remarks:
a) Compare with Exercise 4.11 c).
b) This result is a special case of the important Girsanov Theorem. It can be interpreted as follows: {Xt }t≤T is a martingale w.r.t. the measure Q defined on FT by dQ = MT dP (T < ∞). See Section 8.6.
4.14.* In each of the cases below find the process f (t, ω) ∈ V[0, T ] such that (4.3.6) holds, i.e.

F (ω) = E[F ] + ∫_0^T f (t, ω) dBt (ω) .

a) F (ω) = BT (ω)   b) F (ω) = ∫_0^T Bt (ω) dt   c) F (ω) = BT^2 (ω)
d) F (ω) = BT^3 (ω)   e) F (ω) = e^{BT (ω)}   f) F (ω) = sin BT (ω)
4.15. Let x > 0 be a constant and define

Xt = (x^{1/3} + (1/3) Bt )^3 ;  t ≥ 0 .

Show that

dXt = (1/3) Xt^{1/3} dt + Xt^{2/3} dBt ;  X0 = x .
4.16. By Exercise 3.8 we know that if Y is an FT -measurable random variable such that E[|Y |^2 ] < ∞ then the process

Mt := E[Y |Ft ] ;  0 ≤ t ≤ T

is a martingale with respect to {Ft }_{0≤t≤T} .
a) Show that E[Mt^2 ] < ∞ for all t ∈ [0, T ]. (Hint: Use Exercise 3.16.)
b) According to the martingale representation theorem (Theorem 4.3.4) there exists a unique process g(t, ω) ∈ V(0, T ) such that

Mt = E[M0 ] + ∫_0^t g(s, ω) dB(s) ;  t ∈ [0, T ].
Find g in the following cases:
(i) Y (ω) = B^2 (T )
(ii) Y (ω) = B^3 (T )
(iii) Y (ω) = exp(σB(T )); σ ∈ R is constant. (Hint: Use that exp(σB(t) − ½ σ^2 t) is a martingale.)

4.17. Here is an alternative proof of Theorem 4.3.3 which, in particular, does not use the complex analysis argument of Lemma 4.3.2. The idea of the proof is taken from Davis (1980), where it is extended to give a proof of the Clark representation formula. (See the Remark before Theorem 4.3.4.) In view of Lemma 4.3.1 it is enough to prove the following: Let Y = φ(Bt1 , . . . , Btn ) where 0 ≤ t1 < t2 < · · · < tn ≤ T and φ ∈ C0^∞ (R^n ). We want to prove that there exists f (t, ω) ∈ V(0, T ) such that

Y = E[Y ] + ∫_0^T f (t) dB(t).     (4.3.14)
a) Use the Itô formula to prove that if w = w(t, x1 , . . . , xk ) : [tk−1 , tk ] × R^k → R is once continuously differentiable with respect to t and twice with respect to xk , then

w(t, B(t1 ), . . . , B(tk−1 ), B(t)) = w(tk−1 , B(t1 ), . . . , B(tk−1 ), B(tk−1 ))
 + ∫_{tk−1}^t (∂w/∂xk )(s, B(t1 ), . . . , B(tk−1 ), B(s)) dB(s)
 + ∫_{tk−1}^t (∂w/∂s + ½ ∂^2 w/∂xk^2 )(s, B(t1 ), . . . , B(tk−1 ), B(s)) ds ;  t ∈ [tk−1 , tk ].
b) For k = 1, . . . , n define functions vk : [tk−1 , tk ] × R^k → R inductively as follows:

∂vn /∂t + ½ ∂^2 vn /∂xn^2 = 0 ;  tn−1 < t < tn
vn (tn , x1 , . . . , xn ) = φ(x1 , . . . , xn ) ,     (4.3.15)

and, for k = n − 1, n − 2, . . . , 1,

∂vk /∂t + ½ ∂^2 vk /∂xk^2 = 0 ;  tk−1 < t < tk
vk (tk , x1 , . . . , xk ) = vk+1 (tk , x1 , . . . , xk , xk ) .     (4.3.16)
Verify that the solution of (4.3.16) is, for t ∈ [tk−1 , tk ],

vk (t, x1 , . . . , xk ) = (2π(tk − t))^{−1/2} ∫_R vk+1 (tk , x1 , . . . , xk , y) exp( −(xk − y)^2 / (2(tk − t)) ) dy     (4.3.17)
 = E[ vk+1 (tk , x1 , . . . , xk , B_{tk −t}^{(xk )} ) ]  (compare with Theorem 8.1.1) .

In particular, w = vk satisfies the smoothness conditions of a).
c) Show that the representation (4.3.14) holds with

f (t, ω) = (∂vk /∂xk )(t, B(t1 ), . . . , B(tk−1 ), B(t))  for t ∈ [tk−1 , tk ].
[Hint: By (4.3.15) and a) we have

φ(B(t1 ), . . . , B(tn )) = vn (tn , B(t1 ), . . . , B(tn ))
 = vn (tn−1 , B(t1 ), . . . , B(tn−1 ), B(tn−1 )) + ∫_{tn−1}^{tn} (∂vn /∂xn )(s, B(t1 ), . . . , B(tn−1 ), B(s)) dB(s)
 = vn−1 (tn−1 , B(t1 ), . . . , B(tn−1 )) + ∫_{tn−1}^{tn} (∂vn /∂xn )(s, B(t1 ), . . . , B(tn−1 ), B(s)) dB(s) .

Now repeat the procedure with vn−1 (tn−1 , B(t1 ), . . . , B(tn−1 )) and proceed by induction.]
5 Stochastic Differential Equations
5.1 Examples and Some Solution Methods

We now return to the possible solutions Xt (ω) of the stochastic differential equation

dXt /dt = b(t, Xt ) + σ(t, Xt )Wt ,  b(t, x) ∈ R, σ(t, x) ∈ R     (5.1.1)

where Wt is 1-dimensional "white noise". As discussed in Chapter 3 the Itô interpretation of (5.1.1) is that Xt satisfies the stochastic integral equation

Xt = X0 + ∫_0^t b(s, Xs ) ds + ∫_0^t σ(s, Xs ) dBs

or in differential form

dXt = b(t, Xt )dt + σ(t, Xt )dBt .     (5.1.2)

Therefore, to get from (5.1.1) to (5.1.2) we formally just replace the white noise Wt by dBt /dt in (5.1.1) and multiply by dt. It is natural to ask:

(A) Can one obtain existence and uniqueness theorems for such equations? What are the properties of the solutions?
(B) How can one solve a given such equation?

We will first consider question (B) by looking at some simple examples, and then in Section 5.2 we will discuss (A). It is the Itô formula that is the key to the solution of many stochastic differential equations. The method is illustrated in the following examples.
B. Øksendal, Stochastic Differential Equations, Universitext, DOI 10.1007/978-3-642-14394-6 5, © Springer-Verlag Berlin Heidelberg 2013
Example 5.1.1 Let us return to the population growth model in Chapter 1:

dNt /dt = at Nt ,  N0 given

where at = rt + αWt , Wt = white noise, α = constant. Let us assume that rt = r = constant. By the Itô interpretation (5.1.2) this equation is equivalent to (here σ(t, x) = αx)

dNt = rNt dt + αNt dBt     (5.1.3)

or

dNt /Nt = r dt + α dBt .

Hence

∫_0^t dNs /Ns = rt + αBt  (B0 = 0) .     (5.1.4)
To evaluate the integral on the left hand side we use the Itô formula for the function

g(t, x) = ln x ;  x > 0

and obtain

d(ln Nt ) = (1/Nt ) · dNt + ½ (−1/Nt^2 )(dNt )^2 = dNt /Nt − (1/2Nt^2 ) · α^2 Nt^2 dt = dNt /Nt − ½ α^2 dt .

Hence

dNt /Nt = d(ln Nt ) + ½ α^2 dt

so from (5.1.4) we conclude

ln(Nt /N0 ) = (r − ½ α^2 )t + αBt

or

Nt = N0 exp((r − ½ α^2 )t + αBt ) .     (5.1.5)
For comparison, referring to the discussion at the end of Chapter 3, the Stratonovich interpretation of (5.1.3),

dN̄t = rN̄t dt + αN̄t ◦ dBt ,

would have given the solution

N̄t = N0 exp(rt + αBt ) .     (5.1.6)

The solutions Nt , N̄t are both processes of the type

Xt = X0 exp(μt + αBt )  (μ, α constants) .
Such processes are called geometric Brownian motions. They are important also as models for stochastic prices in economics. See Chapters 10, 11, 12.

Remark. It seems reasonable that if Bt is independent of N0 we should have

E[Nt ] = E[N0 ]e^{rt} ,     (∗)

i.e. the same as when there is no noise in at . To see if this is indeed the case, we let

Yt = e^{αBt}

and apply Itô's formula:

dYt = αe^{αBt} dBt + ½ α^2 e^{αBt} dt

or

Yt = Y0 + α ∫_0^t e^{αBs} dBs + ½ α^2 ∫_0^t e^{αBs} ds .

Since E[ ∫_0^t e^{αBs} dBs ] = 0 (Theorem 3.2.1 (iii)), we get

E[Yt ] = E[Y0 ] + ½ α^2 ∫_0^t E[Ys ] ds
i.e.

(d/dt) E[Yt ] = ½ α^2 E[Yt ] ,  E[Y0 ] = 1 .

So

E[Yt ] = e^{α^2 t/2} ,

and therefore – as anticipated – we obtain

E[Nt ] = E[N0 ]e^{rt} .

For the Stratonovich solution, however, the same calculation gives

E[N̄t ] = E[N0 ]e^{(r + α^2 /2)t} .

Now that we have found the explicit solutions Nt and N̄t in (5.1.5), (5.1.6) we can use our knowledge about the behaviour of Bt to gain information on these solutions. For example, for the Itô solution Nt we get the following:

(i) If r > ½ α^2 then Nt → ∞ as t → ∞, a.s.
(ii) If r < ½ α^2 then Nt → 0 as t → ∞, a.s.
(iii) If r = ½ α^2 then Nt will fluctuate between arbitrarily large and arbitrarily small values as t → ∞, a.s.
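The explicit solution (5.1.5) makes the mean identity E[Nt ] = E[N0 ]e^{rt} easy to test by simulation. The sketch below (our own illustration; the function name and parameters are not from the text) samples Nt directly from (5.1.5) with N0 = 1 and the cover-figure parameters r = 1, α = 0.6 and compares the sample mean with e^{rt}.

```python
import math
import random

def gbm_mean_check(r=1.0, alpha=0.6, t=1.0, paths=20000, seed=3):
    """Sample N_t = exp((r - alpha^2/2) t + alpha B_t), N_0 = 1,
    and return (sample mean of N_t, theoretical mean e^{r t})."""
    rng = random.Random(seed)
    mu = (r - 0.5 * alpha * alpha) * t   # drift of log N_t
    sd = alpha * math.sqrt(t)            # B_t ~ N(0, t)
    mean = sum(math.exp(mu + rng.gauss(0.0, sd))
               for _ in range(paths)) / paths
    return mean, math.exp(r * t)
```

Note that the sample mean of log(Nt /N0 ) concentrates around (r − α^2 /2)t instead, which is exactly the Itô correction that drives the trichotomy (i)–(iii) above.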
These conclusions are direct consequences of the formula (5.1.5) for Nt together with the following basic result about 1-dimensional Brownian motion Bt :

Theorem 5.1.2 (The law of iterated logarithm)

lim sup_{t→∞} Bt / √(2t log log t) = 1  a.s.

For a proof we refer to Lamperti (1977), §22. For the Stratonovich solution N̄t we get by the same argument that N̄t → 0 a.s. if r < 0 and N̄t → ∞ a.s. if r > 0. Thus the two solutions have fundamentally different properties and it is an interesting question which solution gives the best description of the situation.

Example 5.1.3 Let us return to the equation in Problem 2 of Chapter 1:

L Q′′t + R Q′t + (1/C) Qt = Ft = Gt + αWt .     (5.1.7)
We introduce the vector

X = X(t, ω) = (X1 , X2 ) = (Qt , Q′t )

and obtain

X′1 = X2
L X′2 = −R X2 − (1/C) X1 + Gt + αWt     (5.1.8)

or, in matrix notation,

dX = dX(t) = AX(t)dt + H(t)dt + KdBt     (5.1.9)

where

dX = (dX1 , dX2 ) ,  A = [ 0  1 ; −1/(CL)  −R/L ] ,  H(t) = [ 0 ; Gt /L ] ,  K = [ 0 ; α/L ] ,     (5.1.10)

and Bt is a 1-dimensional Brownian motion. Thus we are led to a 2-dimensional stochastic differential equation. We rewrite (5.1.9) as

exp(−At)dX(t) − exp(−At)AX(t)dt = exp(−At)[H(t)dt + KdBt ] ,     (5.1.11)

where for a general n × n matrix F we define exp(F ) to be the n × n matrix given by exp(F ) = Σ_{n=0}^∞ (1/n!) F^n . Here it is tempting to relate the left hand side to

d(exp(−At)X(t)) .

To do this we use a 2-dimensional version of the Itô formula (Theorem 4.2.1).
Applying this result to the two coordinate functions g1 , g2 of

g : [0, ∞) × R^2 → R^2  given by  g(t, x1 , x2 ) = exp(−At) (x1 , x2 ) ,

we obtain that

d(exp(−At)X(t)) = (−A) exp(−At)X(t)dt + exp(−At)dX(t) .

Substituted in (5.1.11) this gives

exp(−At)X(t) − X(0) = ∫_0^t exp(−As)H(s)ds + ∫_0^t exp(−As)K dBs

or

X(t) = exp(At)[ X(0) + exp(−At)KBt + ∫_0^t exp(−As)[H(s) + AKBs ] ds ] ,     (5.1.12)

by integration by parts (Theorem 4.1.5).

Example 5.1.4 Choose Xt = Bt , 1-dimensional Brownian motion, and

g(t, x) = e^{ix} = (cos x, sin x) ∈ R^2  for x ∈ R .
Then

Y (t) = g(t, Xt ) = e^{iBt} = (cos Bt , sin Bt )

is by Itô's formula again an Itô process. Its coordinates Y1 , Y2 satisfy

dY1 (t) = − sin(Bt )dBt − ½ cos(Bt )dt
dY2 (t) = cos(Bt )dBt − ½ sin(Bt )dt .

Thus the process Y = (Y1 , Y2 ), which we could call Brownian motion on the unit circle, is the solution of the stochastic differential equations

dY1 = −½ Y1 dt − Y2 dBt
dY2 = −½ Y2 dt + Y1 dBt .     (5.1.13)

Or, in matrix notation,

dY (t) = −½ Y (t)dt + KY (t)dBt ,  where K = [ 0  −1 ; 1  0 ] .
5 Stochastic Differential Equations
Other examples and solution methods can be found in the exercises of this chapter. For a comprehensive description of reduction methods for 1-dimensional stochastic differential equations see Gard (1988), Chapter 4.
5.2 An Existence and Uniqueness Result We now turn to the existence and uniqueness question (A) above. Theorem 5.2.1 (Existence and uniqueness theorem for stochastic differential equations). Let T > 0 and b(·, ·): [0, T ] × Rn → Rn , σ(·, ·): [0, T ] × Rn → Rn×m be measurable functions satisfying |b(t, x)| + |σ(t, x)| ≤ C(1 + |x|) ; x ∈ Rn , t ∈ [0, T ] for some constant C, (where |σ|2 = |σij |2 ) and such that |b(t, x) − b(t, y)| + |σ(t, x) − σ(t, y)| ≤ D|x − y| ;
(5.2.1)
x, y ∈ Rn , t ∈ [0, T ] (5.2.2)
for some constant D. Let Z be a random variable which is independent of the (m) σ-algebra F∞ generated by Bs (·), s ≥ 0 and such that E[|Z|2 ] < ∞ . Then the stochastic differential equation dXt = b(t, Xt )dt + σ(t, Xt )dBt ,
0 ≤ t ≤ T, X0 = Z
(5.2.3)
has a unique t-continuous solution Xt (ω) with the property that Xt (ω) is adapted to the filtration FtZ generated by Z and Bs (·); s ≤ t (5.2.4) and $ # T |Xt |2 dt < ∞ . (5.2.5) E 0
Remarks. Conditions (5.2.1) and (5.2.2) are natural in view of the following two simple examples from deterministic differential equations (i.e. σ = 0):

a) The equation

dXt /dt = Xt^2 ,  X0 = 1     (5.2.6)
corresponding to b(x) = x^2 (which does not satisfy (5.2.1)) has the (unique) solution

Xt = 1/(1 − t) ;  0 ≤ t < 1 ,

which explodes as t → 1; the linear growth condition (5.2.1) rules out this kind of explosion of |Xt (ω)| in finite time.

b) The equation

dXt /dt = 3Xt^{2/3} ;  X0 = 0     (5.2.7)

has more than one solution. In fact, for any a > 0 the function

Xt = 0 for t ≤ a ;  Xt = (t − a)^3 for t > a

solves (5.2.7). In this case b(x) = 3x^{2/3} does not satisfy the Lipschitz condition (5.2.2) at x = 0. Thus condition (5.2.2) guarantees that equation (5.2.3) has a unique solution. Here uniqueness means that if X1 (t, ω) and X2 (t, ω) are two t-continuous processes satisfying (5.2.3), (5.2.4) and (5.2.5) then

X1 (t, ω) = X2 (t, ω)  for all t ≤ T , a.s.     (5.2.8)
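The failure of uniqueness in example b) is easy to probe numerically: X ≡ 0 trivially solves (5.2.7), and the following sketch (our own check, not from the text) verifies that Xt = max(t − a, 0)^3 satisfies dX/dt = 3X^{2/3} as well, by comparing a central difference quotient with the right hand side on a grid.

```python
def ode_residual(a=0.5, t_max=2.0, steps=400):
    """Largest deviation |dX/dt - 3 X^{2/3}| on a grid for the
    nonzero solution X_t = max(t - a, 0)^3 of (5.2.7)."""
    h = t_max / steps
    worst = 0.0
    for k in range(1, steps):
        t = k * h
        x = max(t - a, 0.0) ** 3
        # central difference approximation of dX/dt
        dx = (max(t + h - a, 0.0) ** 3 - max(t - h - a, 0.0) ** 3) / (2 * h)
        worst = max(worst, abs(dx - 3.0 * x ** (2.0 / 3.0)))
    return worst
```

The residual is of order h^2, so both X ≡ 0 and the shifted cubic pass the check for every a > 0 — a whole family of solutions through the same initial point, exactly because b is not Lipschitz at 0.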
Proof of Theorem 5.2.1. The uniqueness follows from the Itô isometry (Corollary 3.1.7) and the Lipschitz property (5.2.2): Let X1 (t, ω) = Xt (ω) and X2 (t, ω) = X̂t (ω) be solutions with initial values Z, Ẑ respectively, i.e. X1 (0, ω) = Z(ω), X2 (0, ω) = Ẑ(ω), ω ∈ Ω. For our purposes here we only need the case Z = Ẑ, but the following more general estimate will be useful for us later, in connection with Feller continuity (Chapter 8).

Put a(s, ω) = b(s, Xs ) − b(s, X̂s ) and γ(s, ω) = σ(s, Xs ) − σ(s, X̂s ). Then

E[|Xt − X̂t |^2 ] = E[ ( Z − Ẑ + ∫_0^t a ds + ∫_0^t γ dBs )^2 ]
 ≤ 3E[|Z − Ẑ|^2 ] + 3E[ ( ∫_0^t a ds )^2 ] + 3E[ ( ∫_0^t γ dBs )^2 ]
 ≤ 3E[|Z − Ẑ|^2 ] + 3tE[ ∫_0^t a^2 ds ] + 3E[ ∫_0^t γ^2 ds ]
 ≤ 3E[|Z − Ẑ|^2 ] + 3(1 + t)D^2 ∫_0^t E[|Xs − X̂s |^2 ] ds .
So the function

v(t) = E[|Xt − X̂t |^2 ] ;  0 ≤ t ≤ T

satisfies

v(t) ≤ F + A ∫_0^t v(s) ds ,     (5.2.9)

where F = 3E[|Z − Ẑ|^2 ] and A = 3(1 + T )D^2 . By the Gronwall inequality (Exercise 5.17) we conclude that

v(t) ≤ F exp(At) .     (5.2.10)

Now assume that Z = Ẑ. Then F = 0 and so v(t) = 0 for all t ≥ 0. Hence

P [ |Xt − X̂t | = 0 for all t ∈ Q ∩ [0, T ] ] = 1 ,

where Q denotes the rational numbers. By continuity of t → |Xt − X̂t | it follows that

P [ |X1 (t, ω) − X2 (t, ω)| = 0 for all t ∈ [0, T ] ] = 1 ,     (5.2.11)

and the uniqueness is proved.

The proof of the existence is similar to the familiar existence proof for ordinary differential equations: Define Yt^(0) = X0 and Yt^(k) = Yt^(k) (ω) inductively as follows

Yt^(k+1) = X0 + ∫_0^t b(s, Ys^(k) ) ds + ∫_0^t σ(s, Ys^(k) ) dBs .
Then, a similar computation as for the uniqueness above gives

E[|Yt^(k+1) − Yt^(k) |^2 ] ≤ 3(1 + T )D^2 ∫_0^t E[|Ys^(k) − Ys^(k−1) |^2 ] ds ,

for k ≥ 1, t ≤ T and

E[|Yt^(1) − Yt^(0) |^2 ] ≤ 2C^2 t^2 E[(1 + |X0 |)^2 ] + 2C^2 t(1 + E[|X0 |^2 ]) ≤ A1 t     (5.2.12)
where the constant A1 only depends on C, T and E[|X0 |^2 ]. So by induction on k we obtain

E[|Yt^(k+1) − Yt^(k) |^2 ] ≤ A2^{k+1} t^{k+1} / (k + 1)! ;  k ≥ 0, t ∈ [0, T ]     (5.2.13)

for some suitable constant A2 depending only on C, D, T and E[|X0 |^2 ]. Hence, if λ denotes Lebesgue measure on [0, T ] and m > n ≥ 0 we get

‖Yt^(m) − Yt^(n) ‖_{L^2 (λ×P )} = ‖ Σ_{k=n}^{m−1} (Yt^(k+1) − Yt^(k) ) ‖_{L^2 (λ×P )}
 ≤ Σ_{k=n}^{m−1} ‖Yt^(k+1) − Yt^(k) ‖_{L^2 (λ×P )} = Σ_{k=n}^{m−1} ( ∫_0^T E[|Yt^(k+1) − Yt^(k) |^2 ] dt )^{1/2}
 ≤ Σ_{k=n}^{m−1} ( ∫_0^T A2^{k+1} t^{k+1} / (k + 1)! dt )^{1/2} = Σ_{k=n}^{m−1} ( A2^{k+1} T^{k+2} / (k + 2)! )^{1/2} → 0     (5.2.14)
as m, n → ∞.

Therefore {Yt^(n) }_{n=0}^∞ is a Cauchy sequence in L^2 (λ × P ). Hence {Yt^(n) }_{n=0}^∞ is convergent in L^2 (λ × P ). Define

Xt := lim_{n→∞} Yt^(n)  (limit in L^2 (λ × P )).

Then Xt is FtZ -measurable for all t, since this holds for each Yt^(n) . We prove that Xt satisfies (5.2.3): For all n and all t ∈ [0, T ] we have

Yt^(n+1) = X0 + ∫_0^t b(s, Ys^(n) ) ds + ∫_0^t σ(s, Ys^(n) ) dBs .

Now let n → ∞. Then by the Hölder inequality we get that

∫_0^t b(s, Ys^(n) ) ds → ∫_0^t b(s, Xs ) ds  in L^2 (P )

and by the Itô isometry it follows that

∫_0^t σ(s, Ys^(n) ) dBs → ∫_0^t σ(s, Xs ) dBs  in L^2 (P ).
We conclude that for all t ∈ [0, T ] we have

Xt = X0 + ∫_0^t b(s, Xs ) ds + ∫_0^t σ(s, Xs ) dBs  a.s.     (5.2.15)

i.e. Xt satisfies (5.2.3). It remains to prove that Xt can be chosen to be continuous. By Theorem 3.2.5 there is a continuous version of the right hand side of (5.2.15). Denote this version by X̃t . Then X̃t is continuous and

X̃t = X0 + ∫_0^t b(s, Xs ) ds + ∫_0^t σ(s, Xs ) dBs  for a.a. ω
 = X0 + ∫_0^t b(s, X̃s ) ds + ∫_0^t σ(s, X̃s ) dBs  for a.a. ω .
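The Picard iteration used in the existence proof is constructive and can be run directly on a discretized Brownian path. The sketch below (our own illustration; the equation dX = −X dt + dB with Lipschitz drift b(x) = −x and σ = 1 is our choice, not from the text) computes successive iterates and records the sup-distance between them, which the factorial estimate (5.2.13)–(5.2.14) predicts should collapse rapidly.

```python
import math
import random

def picard_iterates(T=1.0, steps=500, iters=8, seed=11):
    """Picard iteration Y^{k+1}_t = X0 + int_0^t b(Y^k_s) ds + B_t for
    dX = -X dt + dB, X0 = 0, on one fixed Brownian path; returns the
    sup-distances between successive iterates."""
    rng = random.Random(seed)
    dt = T / steps
    B = [0.0]                               # fixed Brownian path on the grid
    for _ in range(steps):
        B.append(B[-1] + rng.gauss(0.0, math.sqrt(dt)))
    y = [0.0] * (steps + 1)                 # Y^(0) = X0 = 0
    dists = []
    for _ in range(iters):
        new = [0.0] * (steps + 1)
        integral = 0.0
        for k in range(steps):
            integral += -y[k] * dt          # drift b(x) = -x
            new[k + 1] = integral + B[k + 1]  # sigma = 1: stoch. integral = B_t
        dists.append(max(abs(u - v) for u, v in zip(new, y)))
        y = new
    return dists
```

The distances decay roughly like T^k /k!, mirroring the bound (5.2.13), so a handful of iterations already pins down the solution path.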
5.3 Weak and Strong Solutions

The solution Xt found above is called a strong solution, because the version Bt of Brownian motion is given in advance and the solution Xt constructed from it is FtZ -adapted. If we are only given the functions b(t, x) and σ(t, x) and ask for a pair of processes ((X̃t , B̃t ), H̃t ) on a probability space (Ω, H, P ) such that (5.2.3) holds, then the solution X̃t (or more precisely (X̃t , B̃t )) is called a weak solution. Here H̃t is an increasing family of σ-algebras such that X̃t is H̃t -adapted and B̃t is an H̃t -Brownian motion, i.e. B̃t is a Brownian motion and B̃t is a martingale w.r.t. H̃t (and so E[B̃_{t+h} − B̃t |H̃t ] = 0 for all t, h ≥ 0). Recall from Chapter 3 that this allows us to define the Itô integral on the right hand side of (5.2.3) exactly as before, even though X̃t need not be FtZ -adapted. A strong solution is of course also a weak solution, but the converse is not true in general. See Example 5.3.2 below.

The uniqueness (5.2.8) that we obtain above is called strong or pathwise uniqueness, while weak uniqueness simply means that any two solutions (weak or strong) are identical in law, i.e. have the same finite-dimensional distributions. See Stroock and Varadhan (1979) for results about existence and uniqueness of weak solutions. A general discussion about strong and weak solutions can be found in Krylov and Zvonkin (1981).

Lemma 5.3.1 If b and σ satisfy the conditions of Theorem 5.2.1 then we have:

A solution (weak or strong) of (5.2.3) is weakly unique .
Sketch of proof. Let ((X̃t , B̃t ), H̃t ) and ((X̂t , B̂t ), Ĥt ) be two weak solutions. Let Xt and Yt be the strong solutions constructed from B̃t and B̂t , respectively, as above. Then the same uniqueness argument as above applies to show that Xt = X̃t and Yt = X̂t for all t, a.s. Therefore it suffices to show that Xt and Yt must be identical in law. We show this by proving by induction that if Xt^(k) , Yt^(k) are the processes in the Picard iteration defined by (5.2.12) with Brownian motions B̃t and B̂t , then

(Xt^(k) , B̃t ) and (Yt^(k) , B̂t )
This observation will be useful for us in Chapter 7 and later, where we will investigate further the properties of processes which are solutions of stochastic differential equations (Itˆo diffusions). From a modelling point of view the weak solution concept is often natural, because it does not specify beforehand the explicit representation of the white noise. Moreover, the concept is convenient for mathematical reasons, because there are stochastic differential equations which have no strong solutions but still a (weakly) unique weak solution. Here is a simple example: Example 5.3.2 (The Tanaka equation) Consider the 1-dimensional stochastic differential equation dXt = sign(Xt )dBt ; where sign(x) =
X0 = 0 .
(5.3.1)
% +1 if x ≥ 0 −1 if x < 0 .
Note that here σ(t, x) = σ(x) = sign(x) does not satisfy the Lipschitz condition (5.2.2), so Theorem 5.2.1 does not apply. Indeed, the equation (5.3.1) has no strong solution. To see this, let B̃t be a Brownian motion generating the filtration Ft and define

Yt = ∫_0^t sign(B̃s ) dB̃s .

By the Tanaka formula (4.3.12) (Exercise 4.10) we have

Yt = |B̃t | − |B̃0 | − L̃t (ω) ,

where L̃t (ω) is the local time for B̃t (ω) at 0. It follows that Yt is measurable w.r.t. the σ-algebra Gt generated by |B̃s (·)|; s ≤ t, which is clearly strictly contained in Ft . Hence the σ-algebra Nt generated by Ys (·); s ≤ t is also strictly contained in Ft .
Now suppose Xt is a strong solution of (5.3.1). Then by Theorem 8.4.2 it follows that Xt is a Brownian motion w.r.t. the measure P . (In case the reader is worried about the possibility of a circular argument, we point out that the proof of Theorem 8.4.2 is independent of this example!) Let Mt be the σ-algebra generated by Xs (·); s ≤ t. Since (sign(x))^2 = 1 we can rewrite (5.3.1) as

dBt = sign(Xt )dXt .

By the above argument applied to B̃t = Xt , Yt = Bt we conclude that Ft is strictly contained in Mt . But this contradicts that Xt is a strong solution. Hence strong solutions of (5.3.1) do not exist.

To find a weak solution of (5.3.1) we simply choose Xt to be any Brownian motion B̃t . Then we define B̂t by

B̂t = ∫_0^t sign(B̃s ) dB̃s = ∫_0^t sign(Xs ) dXs ,

i.e.

dB̂t = sign(Xt )dXt .

Then

dXt = sign(Xt )dB̂t ,

so Xt is a weak solution. Finally, weak uniqueness follows from Theorem 8.4.2, which – as noted above – implies that any weak solution Xt must be a Brownian motion w.r.t. P .
Exercises

5.1. Verify that the given processes solve the given corresponding stochastic differential equations: (Bt denotes 1-dimensional Brownian motion)
(i) Xt = e^{Bt} solves dXt = ½ Xt dt + Xt dBt
(ii) Xt = Bt /(1 + t) ; B0 = 0 solves

dXt = −(1/(1 + t)) Xt dt + (1/(1 + t)) dBt ;  X0 = 0

(iii) Xt = sin Bt with B0 = a ∈ (−π/2, π/2) solves

dXt = −½ Xt dt + √(1 − Xt^2 ) dBt  for t < inf{s > 0; Bs ∉ (−π/2, π/2)}

(iv) (X1 (t), X2 (t)) = (t, e^t Bt ) solves

d(X1 , X2 ) = (1, X2 ) dt + (0, e^{X1} ) dBt

(v) (X1 (t), X2 (t)) = (cosh(Bt ), sinh(Bt )) solves

d(X1 , X2 ) = ½ (X1 , X2 ) dt + (X2 , X1 ) dBt .

5.2.
A natural candidate for what we could call Brownian motion on the ellipse

{(x, y); x^2 /a^2 + y^2 /b^2 = 1}  where a > 0, b > 0

is the process Xt = (X1 (t), X2 (t)) defined by

X1 (t) = a cos Bt ,  X2 (t) = b sin Bt

where Bt is 1-dimensional Brownian motion. Show that Xt is a solution of the stochastic differential equation

dXt = −½ Xt dt + M Xt dBt ,  where M = [ 0  −a/b ; b/a  0 ] .

5.3.* Let (B1 , . . . , Bn ) be Brownian motion in R^n , α1 , . . . , αn constants. Solve the stochastic differential equation

dXt = rXt dt + Xt Σ_{k=1}^n αk dBk (t) ;  X0 > 0 .

(This is a model for exponential growth with several independent white noise sources in the relative growth rate).

5.4.* Solve the following stochastic differential equations:
(i) d(X1 , X2 ) = (1, 0) dt + [ 1  0 ; 0  X1 ] (dB1 , dB2 )
(ii) dXt = Xt dt + dBt (Hint: Multiply both sides with "the integrating factor" e^{−t} and compare with d(e^{−t} Xt ))
(iii) dXt = −Xt dt + e^{−t} dBt .

5.5.
a) Solve the Ornstein-Uhlenbeck equation (or Langevin equation) dXt = μXt dt + σdBt where μ, σ are real constants, Bt ∈ R.
The solution is called the Ornstein-Uhlenbeck process. (Hint: See Exercise 5.4 (ii).)
b) Find E[Xt ] and Var[Xt ] := E[(Xt − E[Xt ])^2 ].

5.6.* Solve the stochastic differential equation

dYt = r dt + αYt dBt

where r, α are real constants, Bt ∈ R. (Hint: Multiply the equation by the 'integrating factor' Ft = exp(−αBt + ½ α^2 t).)

5.7.* The mean-reverting Ornstein-Uhlenbeck process is the solution Xt of the stochastic differential equation

dXt = (m − Xt )dt + σdBt

where m, σ are real constants, Bt ∈ R.
a) Solve this equation by proceeding as in Exercise 5.5 a).
b) Find E[Xt ] and Var[Xt ] := E[(Xt − E[Xt ])^2 ].

5.8.* Solve the (2-dimensional) stochastic differential equation

dX1 (t) = X2 (t)dt + αdB1 (t)
dX2 (t) = −X1 (t)dt + βdB2 (t)

where (B1 (t), B2 (t)) is 2-dimensional Brownian motion and α, β are constants. This is a model for a vibrating string subject to a stochastic force. See Example 5.1.3.

5.9. Show that there is a unique strong solution Xt of the 1-dimensional stochastic differential equation

dXt = ln(1 + Xt^2 )dt + X_{Xt >0} Xt dBt ,  X0 = a ∈ R .
X0 = a ∈ R .
5.10. Let b, σ satisfy (5.2.1), (5.2.2) and let Xt be the unique strong solution of (5.2.3). Show that
E[|Xt|²] ≤ K1 · exp(K2 t)  for t ≤ T ,  (5.3.2)
where K1 = 3E[|Z|²] + 6C²T(T + 1) and K2 = 6(1 + T)C².
(Hint: Use the argument in the proof of (5.2.10).)
Exercises
Remark. With global estimates of the growth of b and σ in (5.2.1) it is possible to improve (5.3.2) to a global estimate of E[|Xt|²]. See Exercise 7.5.

5.11.* (The Brownian bridge). For fixed a, b ∈ R consider the following 1-dimensional equation
dYt = ((b − Yt)/(1 − t)) dt + dBt ;  0 ≤ t < 1 , Y0 = a .  (5.3.3)
Verify that
Yt = a(1 − t) + bt + (1 − t) ∫_0^t dBs/(1 − s) ;  0 ≤ t < 1  (5.3.4)
solves the equation, and prove that Yt → b as t → 1, a.s. The process Yt is called the Brownian bridge (from a to b).

5.12. (The motion of a pendulum with small random perturbations.) Consider a second order equation of the form
y''(t) + (1 + εWt)y(t) = 0 ;  y(0), y'(0) given,
where Wt is 1-dimensional white noise and ε > 0 is constant.
a) Discuss this equation, for example by proceeding as in Example 5.1.3.
b) Show that y(t) solves a stochastic Volterra equation of the form
y(t) = y(0) + y'(0) · t + ∫_0^t a(t, r)y(r)dr + ∫_0^t γ(t, r)y(r)dBr
where a(t, r) = r − t, γ(t, r) = ε(r − t).

5.13. As a model for the horizontal slow drift motions of a moored floating platform or ship responding to incoming irregular waves John Grue (1989) introduced the equation
x''_t + a0 x'_t + w²x_t = (T0 − α0 x'_t)ηWt ,
(5.3.5)
where Wt is 1-dimensional white noise, a0, w, T0, α0 and η are constants.
(i) Put Xt = (x_t, x'_t)ᵀ and rewrite the equation in the form
dXt = AXt dt + KXt dBt + M dBt ,
where
A = [  0     1  ]      K = α0η [ 0   0 ]      M = T0η [ 0 ]
    [ −w²  −a0 ] ,             [ 0  −1 ] ,             [ 1 ] .
(ii) Show that Xt satisfies the integral equation
Xt = ∫_0^t e^{A(t−s)} K Xs dBs + ∫_0^t e^{A(t−s)} M dBs  if X0 = 0 .
(iii) Verify that
e^{At} = (e^{−λt}/ξ) {(ξ cos ξt + λ sin ξt)I + A sin ξt} ,
where λ = a0/2, ξ = (w² − a0²/4)^{1/2}, and use this to prove that
x_t = η ∫_0^t (T0 − α0 y_s) g_{t−s} dBs  (5.3.6)
and
y_t = η ∫_0^t (T0 − α0 y_s) h_{t−s} dBs ,  with y_t := x'_t ,  (5.3.7)
where
g_t = (1/ξ) Im(e^{ζt}) ,
h_t = (1/ξ) Im(ζ̄ e^{ζt}) ,  ζ = −λ + iξ  (i = √−1) .
So we can solve for y_t first in (5.3.7) and then substitute in (5.3.6) to find x_t.

5.14. If (B1, B2) denotes 2-dimensional Brownian motion we may introduce complex notation and put
B(t) := B1(t) + iB2(t)  (i = √−1) .
B(t) is called complex Brownian motion.
(i) If F(z) = u(z) + iv(z) is an analytic function, i.e. F satisfies the Cauchy-Riemann equations
∂u/∂x = ∂v/∂y ,  ∂u/∂y = −∂v/∂x ;  z = x + iy ,
and we define Zt = F(B(t)), prove that
dZt = F′(B(t))dB(t) ,  (5.3.8)
where F′ is the (complex) derivative of F. (Note that the usual second order terms in the (real) Itô formula are not present in (5.3.8)!)
(ii) Solve the complex stochastic differential equation
dZt = αZt dB(t)  (α constant) .
For more information about complex stochastic calculus involving analytic functions see e.g. Ubøe (1987). 5.15. (Population growth in a stochastic, crowded environment) The nonlinear stochastic differential equation dXt = rXt (K − Xt )dt + βXt dBt ;
X0 = x > 0
(5.3.9)
is often used as a model for the growth of a population of size Xt in a stochastic, crowded environment. The constant K > 0 is called the carrying capacity of the environment, the constant r ∈ R is a measure of the quality of the environment and the constant β ∈ R is a measure of the size of the noise in the system. Verify that Xt =
exp{(rK − ½β²)t + βBt} / ( x^{−1} + r ∫_0^t exp{(rK − ½β²)s + βBs} ds ) ;  t ≥ 0  (5.3.10)
is the unique (strong) solution of (5.3.9). (This solution can be found by performing a substitution (change of variables) which reduces (5.3.9) to a linear equation. See Gard (1988), Chapter 4 for details.) 5.16.* The technique used in Exercise 5.6 can be applied to more general nonlinear stochastic differential equations of the form dXt = f (t, Xt )dt + c(t)Xt dBt ,
X0 = x
(5.3.11)
where f : R×R → R and c: R → R are given continuous (deterministic) functions. Proceed as follows:
a) Define the 'integrating factor'
Ft = Ft(ω) = exp( − ∫_0^t c(s)dBs + ½ ∫_0^t c²(s)ds ) .  (5.3.12)
Show that (5.3.11) can be written
d(Ft Xt) = Ft · f(t, Xt)dt .  (5.3.13)
b) Now define
Yt(ω) = Ft(ω)Xt(ω)  (5.3.14)
so that
Xt = Ft^{−1} Yt .  (5.3.15)
Deduce that equation (5.3.13) gets the form
dYt(ω)/dt = Ft(ω) · f(t, Ft^{−1}(ω)Yt(ω)) ;  Y0 = x .  (5.3.16)
Note that this is just a deterministic differential equation in the function t → Yt(ω), for each ω ∈ Ω. We can therefore solve (5.3.16) with ω as a parameter to find Yt(ω) and then obtain Xt(ω) from (5.3.15).
c) Apply this method to solve the stochastic differential equation
dXt = (1/Xt)dt + αXt dBt ;  X0 = x > 0  (5.3.17)
where α is constant.
d) Apply the method to study the solutions of the stochastic differential equation
dXt = Xt^γ dt + αXt dBt ;  X0 = x > 0  (5.3.18)
where α and γ are constants. For what values of γ do we get explosion?

5.17. (The Gronwall inequality) Let v(t) be a nonnegative function such that
v(t) ≤ C + A ∫_0^t v(s)ds  for 0 ≤ t ≤ T
for some constants C, A where A ≥ 0. Prove that
v(t) ≤ C exp(At)  for 0 ≤ t ≤ T .  (5.3.19)
[Hint: We may assume A ≠ 0. Define w(t) = ∫_0^t v(s)ds. Then w′(t) ≤ C + Aw(t). Deduce that
w(t) ≤ (C/A)(exp(At) − 1)  (5.3.20)
by considering f(t) := w(t) exp(−At). Use (5.3.20) to deduce (5.3.19).]
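A small numerical illustration of the inequality (a sketch, not part of the exercise; the function v below, built with an arbitrary nonnegative "slack" term, is just one concrete example satisfying the hypothesis):

```python
import numpy as np

# Build a v satisfying v(t) = C + A*int_0^t v(s) ds - slack(t) <= C + A*int_0^t v(s) ds
# on a grid, and check it stays under the Gronwall bound C*exp(A t).
C, A, T, n = 2.0, 1.5, 1.0, 20_000
t = np.linspace(0.0, T, n + 1)
dt = T / n
slack = 0.3 * np.sin(t) ** 2          # any nonnegative function works here
v = np.empty(n + 1)
v[0] = C - slack[0]
integral = 0.0
for k in range(n):
    integral += v[k] * dt             # left Riemann sum for int_0^t v(s) ds
    v[k + 1] = C + A * integral - slack[k + 1]
bound = C * np.exp(A * t)             # the Gronwall bound (5.3.19)
```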
5.18. The geometric mean reversion process Xt is defined as the solution of the stochastic differential equation dXt = κ(α − log Xt )Xt dt + σXt dBt ;
X0 = x > 0
(5.3.21)
where κ, α, σ and x are positive constants. This process was used by J. Tvedt (1995) to model the spot freight rate in shipping.
a) Show that the solution of (5.3.21) is given by
Xt := exp( e^{−κt} ln x + (α − σ²/(2κ))(1 − e^{−κt}) + σ e^{−κt} ∫_0^t e^{κs} dBs ) .  (5.3.22)
[Hint: The substitution Yt = log Xt transforms (5.3.21) into a linear equation for Yt.]
b) Show that
E[Xt] = exp( e^{−κt} ln x + (α − σ²/(2κ))(1 − e^{−κt}) + (σ²/(4κ))(1 − e^{−2κt}) ) .
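A Monte Carlo check of the expectation formula in b) (a sketch, not from the book; the parameter values are arbitrary): simulate the linear equation for Yt = log Xt by Euler steps, which is independent of the closed form, and compare E[X_T] with it:

```python
import numpy as np

# Ito's formula on Y = ln X gives dY = kappa*(alpha - sigma^2/(2 kappa) - Y) dt + sigma dB.
rng = np.random.default_rng(2)
kappa, alpha, sigma, x, T = 1.0, 0.5, 0.3, 1.0, 1.0
n_paths, n_steps = 200_000, 500
dt = T / n_steps
Y = np.full(n_paths, np.log(x))
for _ in range(n_steps):
    Y += kappa * (alpha - sigma**2 / (2 * kappa) - Y) * dt \
         + sigma * rng.normal(0.0, np.sqrt(dt), n_paths)
mc = np.exp(Y).mean()                 # Monte Carlo estimate of E[X_T]
theory = np.exp(np.exp(-kappa * T) * np.log(x)
                + (alpha - sigma**2 / (2 * kappa)) * (1 - np.exp(-kappa * T))
                + sigma**2 / (4 * kappa) * (1 - np.exp(-2 * kappa * T)))
```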
5.19. Let Y_t^{(k)} be the process defined inductively by (5.2.12). Show that {Y_t^{(n)}}_{n=0}^∞ is uniformly convergent for t ∈ [0, T], for a.a. ω. Since each Y_t^{(n)} is continuous, this gives a direct proof that Xt can be chosen to be continuous in Theorem 5.2.1.
[Hint: Note that
sup_{0≤t≤T} |Y_t^{(k+1)} − Y_t^{(k)}| ≤ ∫_0^T |b(s, Y_s^{(k)}) − b(s, Y_s^{(k−1)})| ds + sup_{0≤t≤T} | ∫_0^t (σ(s, Y_s^{(k)}) − σ(s, Y_s^{(k−1)})) dBs | .
Hence
P[ sup_{0≤t≤T} |Y_t^{(k+1)} − Y_t^{(k)}| > 2^{−k} ]
≤ P[ ∫_0^T |b(s, Y_s^{(k)}) − b(s, Y_s^{(k−1)})| ds > 2^{−k−1} ]
+ P[ sup_{0≤t≤T} | ∫_0^t (σ(s, Y_s^{(k)}) − σ(s, Y_s^{(k−1)})) dBs | > 2^{−k−1} ] .
Now use the Chebychev inequality, the Hölder inequality and the martingale inequality (Theorem 3.2.4), respectively, combined with (5.2.13), to prove that
P[ sup_{0≤t≤T} |Y_t^{(k+1)} − Y_t^{(k)}| > 2^{−k} ] ≤ (A3 T)^{k+1} / (k + 1)!
for some constant A3 < ∞. Therefore the result follows by the Borel-Cantelli lemma.]
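The Picard scheme of this exercise can be watched converging pathwise (a sketch, not from the book; the coefficients b(x) = sin x, σ(x) = cos x are arbitrary Lipschitz choices, and the integrals are replaced by sums on a fixed grid over one fixed Brownian path):

```python
import numpy as np

# Iterate Y^{(n+1)}_t = x + int_0^t b(Y^{(n)}_s) ds + int_0^t sigma(Y^{(n)}_s) dB_s
# on a grid and record sup_t |Y^{(n+1)} - Y^{(n)}|, which should decay rapidly.
rng = np.random.default_rng(4)
T, n = 1.0, 2000
dt = T / n
dB = rng.normal(0.0, np.sqrt(dt), n)
x = 0.5
Y = np.full(n + 1, x)                 # Y^{(0)} identically x
sups = []
for _ in range(10):
    drift = np.sin(Y[:-1])
    diff = np.cos(Y[:-1])
    Ynew = x + np.concatenate([[0.0], np.cumsum(drift * dt + diff * dB)])
    sups.append(np.max(np.abs(Ynew - Y)))
    Y = Ynew
```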
6 The Filtering Problem
6.1 Introduction

Problem 3 in the introduction is a special case of the following general filtering problem: Suppose the state Xt ∈ Rn at time t of a system is given by a stochastic differential equation
dXt/dt = b(t, Xt) + σ(t, Xt)Wt ,  t ≥ 0 ,  (6.1.1)
where b: Rn+1 → Rn , σ: Rn+1 → Rn×p satisfy conditions (5.2.1), (5.2.2) and Wt is p-dimensional white noise. As discussed earlier the Itˆo interpretation of this equation is (system)
dXt = b(t, Xt )dt + σ(t, Xt )dUt ,
(6.1.2)
where Ut is p-dimensional Brownian motion. We also assume that the distribution of X0 is known and independent of Ut. Similarly to the 1-dimensional situation (3.3.6) there is an explicit several-dimensional formula which expresses the Stratonovich interpretation of (6.1.1):
dXt = b(t, Xt)dt + σ(t, Xt) ◦ dUt
in terms of Itô integrals as follows:
dXt = b̃(t, Xt)dt + σ(t, Xt)dUt ,
where
b̃_i(t, x) = b_i(t, x) + ½ Σ_{j=1}^p Σ_{k=1}^n σ_{kj} (∂σ_{ij}/∂x_k) ;  1 ≤ i ≤ n .  (6.1.3)
(See Stratonovich (1966)). From now on we will use the Itô interpretation (6.1.2).
In the continuous version of the filtering problem we assume that the observations Ht ∈ Rm are performed continuously and are of the form
Ht = c(t, Xt) + γ(t, Xt) · W̃t ,  (6.1.4)
where c: Rn+1 → Rm, γ: Rn+1 → Rm×r are functions satisfying (5.2.1) and W̃t denotes r-dimensional white noise, independent of Ut and X0. To obtain a tractable mathematical interpretation of (6.1.4) we introduce
Zt = ∫_0^t Hs ds  (6.1.5)
and thereby we obtain the stochastic integral representation (observations)
dZt = c(t, Xt )dt + γ(t, Xt )dVt ,
Z0 = 0
(6.1.6)
where Vt is r-dimensional Brownian motion, independent of Ut and X0. Note that if Hs is known for 0 ≤ s ≤ t, then Zs is also known for 0 ≤ s ≤ t and conversely. So no information is lost or gained by considering Zt as our "observations" instead of Ht. But this allows us to obtain a well-defined mathematical model of the situation. The filtering problem is the following: Given the observations Zs satisfying (6.1.6) for 0 ≤ s ≤ t, what is the best estimate X̂t of the state Xt of the system (6.1.2) based on these observations? As we have pointed out earlier, it is necessary to find a precise mathematical formulation of this problem: By saying that the estimate X̂t is based on the observations {Zs ; s ≤ t} we mean that
X̂t(·) is Gt-measurable, where Gt is the σ-algebra generated by {Zs(·), s ≤ t} .  (6.1.7)
By saying that X̂t is the best such estimate we mean that
∫_Ω |Xt − X̂t|² dP = E[|Xt − X̂t|²] = inf{ E[|Xt − Y|²] ; Y ∈ K } .  (6.1.8)
6.2 The 1-Dimensional Linear Filtering Problem
87
We first establish the following useful connection between conditional expectation and projection: Lemma 6.1.1 Let H ⊂ F be a σ-algebra and let X ∈ L2 (P ) be F -measurable. Put N = {Y ∈ L2 (P ); Y is H-measurable} and let PN denote the (orthogonal) projection from the Hilbert space L2 (P ) into the subspace N . Then PN (X) = E[X|H] . Proof. Recall (see Appendix B) that E[X|H] is by definition the P -unique function from Ω to R such that (i) E[X|H] is H-measurable (ii) E[X|H]dP = XdP for all A ∈ H. A
A
Now PN (X) is H-measurable and
Y (X − PN (X))dP = 0
for all Y ∈ N .
Ω
In particular,
(X − PN (X))dP = 0
for all A ∈ H
A
i.e.
PN (X)dP =
A
XdP
for all A ∈ H .
A
Hence, by uniqueness, PN (X) = E[X|H].
t From the general theory of Hilbert spaces we know that the solution X of the problem (6.1.8) is given by the projection PKt (Xt ). Therefore Lemma 6.1.1 leads to the following useful result: Theorem 6.1.2
t = PKt (Xt ) = E[Xt |Gt ] . X
This is the basis for the general Fujisaki-Kallianpur-Kunita equation of filtering theory. See for example Bensoussan (1992), Davis (1984) or Kallianpur (1980).
6.2 The 1-Dimensional Linear Filtering Problem From now on we will concentrate on the linear case, which allows an explicit t (the Kalman-Bucy solution in terms of a stochastic differential equation for X filter ):
88
6 The Filtering Problem
In the linear filtering problem the system and observation equations have the form: (linear system)
(linear observations)
dXt = F (t)Xt dt+C(t)dUt ; F (t) ∈ Rn×n , C(t) ∈ Rn×p
(6.2.1)
dZt = G(t)Xt dt+D(t)dVt ; G(t) ∈ Rm×n , D(t) ∈ Rm×r
(6.2.2)
To be able to focus on the main ideas in the solution of the filtering problem, we will first consider only the 1-dimensional case: dXt = F (t)Xt dt + C(t)dUt ; F (t), C(t) ∈ R
(6.2.3)
(linear observations) dZt = G(t)Xt dt + D(t)dVt ; G(t), D(t) ∈ R
(6.2.4)
(linear system)
We assume (see (5.2.1)) that F, G, C, D are bounded on bounded intervals. Based on our interpretation (6.1.5) of Zt we assume Z0 = 0. We also assume that X0 is normally distributed (and independent of {Ut }, {Vt }). Finally we assume that D(t) is bounded away from 0 on bounded intervals. The (important) extension to the several-dimensional case (6.2.1), (6.2.2) is technical, but does not require any essentially new ideas. Therefore we shall only state the result for this case (in the next section) after we have discussed the 1-dimensional situation. The reader is encouraged to work out the necessary modifications for the general case for himself or consult Bensoussan (1992), Davis (1977) or Kallianpur (1980) for a full treatment. From now on we let Xt , Zt be processes satisfying (6.2.3), (6.2.4). Here is an outline of the solution of the filtering problem in this case. Step 1. Let L = L(Z, t) be the closure in L2 (P ) of functions which are linear combinations of the form c0 + c1 Zs1 (ω) + · · · + ck Zsk (ω) , Let PL
with sj ≤ t, cj ∈ R .
denote the projection from L2 (P ) onto L .
Then, with K as in (6.1.9), t = PK (Xt ) = E[Xt |Gt ] = PL (Xt ) . X Thus, the best Z-measurable estimate of Xt coincides with the best Z-linear estimate of Xt . Step 2. Replace Zt by the innovation process Nt :
t Nt = Zt − 0
Then
(GX)∧ s ds ,
where (GX)∧ s = PL(Z,s) (G(s)Xs ) = G(s)Xs .
6.2 The 1-Dimensional Linear Filtering Problem
89
(i) Nt has orthogonal increments, i.e. E[(Nt1 − Ns1 )(Nt2 − Ns2 )] = 0 for non-overlapping intervals [s1 , t1 ], [s2 , t2 ]. t = PL(N,t) (Xt ). (ii) L(N, t) = L(Z, t), so X Step 3. If we put dRt =
1 dNt , D(t)
then Rt is a 1-dimensional Brownian motion. Moreover, L(N, t) = L(R, t)
and
t = PL(N,t) (Xt ) = PL(R,t) (Xt ) = E[Xt ] + X
t 0
∂ E[Xt Rs ]dRs . ∂s
Step 4. Find an expression for Xt by solving the (linear) stochastic differential equation dXt = F (t)Xt dt + C(t)dUt . Step 5. Substitute the formula for Xt from Step 4 into E[Xt Rs ] and use t : Step 3 to obtain a stochastic differential equation for X t = ∂ E[Xt Rs ]s=t dRt + dX ∂s
t 0
∂2 E[Xt Rs ]dRs dt ∂t∂s
etc.
Before we proceed to establish Steps 1–5, let us consider a simple, but motivating example: Example 6.2.1 Suppose X, W1 , W2 , . . . are independent real random variables, E[X] = E[Wj ] = 0 for all j, E[X 2 ] = a2 , E[Wj2 ] = m2 for all j. Put Zj = X + Wj . of X based on {Zj ; j ≤ k} ? More What is the best linear estimate X precisely, let L = L(Z, k) = {c1 Z1 + · · · + ck Zk ; c1 , . . . , ck ∈ R} . Then we want to find
k = Pk (X) , X
where Pk denotes the projection into L(Z, k). We use the Gram-Schmidt procedure to obtain random variables A1 , A2 , . . . such that (i) E[Ai Aj ] = 0 for i = j (ii) L(A, k) = L(Z, k) for all k.
90
6 The Filtering Problem
Then k = X
k E[XAj ] j=1
E[A2j ]
Aj
for k = 1, 2, . . . .
(6.2.5)
k−1 from this by observing k and X We obtain a recursive relation between X that j−1 , (6.2.6) Aj = Zj − X which follows from Aj = Zj − Pj−1 (Zj ) = Zj − Pj−1 (X) ,
since Pj−1 (Wj ) = 0 .
By (6.2.6) j−1 )] = E[X(X − X j−1 )] = E[(X − X j−1 )2 ] E[XAj ] = E[X(Zj − X and
j−1 )2 ] = E[(X − X j−1 )2 ] + m2 . E[A2j ] = E[(X + Wj − X
Hence k = X k−1 + X
k−1 )2 ] E[(X − X k−1 ) . (Z − X k−1 )2 ] + m2 k E[(X − X
If we introduce Zk =
(6.2.7)
k 1 Zj , k j=1
then this can be simplified to k = X
a2 Zk . a2 + k1 · m2
(6.2.8)
(This can be seen as follows: Put αk =
a2 , a2 + k1 m2
Uk = αk Z k .
Then (i) Uk ∈ L(Z, k) (ii) X − Uk ⊥L(Z, k), since E[(X − Uk )Zi ] = E[XZi ] − αk E[Z k Zi ] 1 E[Zj Zi ] = E[X(X + Wi )] − αk k j 1 1 = a2 − αk E[(X +Wj )(X +Wi )] = a2 − αk [ka2 +m2 ] = 0 .) k k j
6.2 The 1-Dimensional Linear Filtering Problem
91
The result can be interpreted as follows: k ≈ Z k , while for small k the relation between a2 For large k we put X 2 and m becomes more important. If m2 a2 , the observations are to a large k is put equal to its mean value, 0. See extent neglected (for small k) and X also Exercise 6.11. This example gives the motivation for our approach: We replace the process Zt by an orthogonal increment process Nt (Step 2) t analogous to (6.2.5). Such a represenin order to obtain a representation for X tation is obtained in Step 3, after we have identified the best linear estimate with the best measurable estimate (Step 1) and established the connection between Nt and Brownian motion. Step 1. Z-Linear and Z-Measurable Estimates Lemma 6.2.2 Let X, Zs ; s ≤ t be random variables in L2 (P ) and assume that (X, Zs1 , Zs2 , . . . , Zsn ) ∈ Rn+1 has a normal distribution for all s1 , s2 , . . . , sn ≤ t, n ≥ 1. Then PL (X) = E[X|G] = PK (X) . In other words, the best Z-linear estimate for X coincides with the best Zmeasurable estimate in this case. ˇ = PL (X), X = X − X. ˇ Then we claim that X is independent of Proof. Put X G: Recall that a random variable (Y1 , . . . , Yk ) ∈ Rk is normal iff c1 Y1 + · · · + ck Yk is normal, for all choices of c1 , . . . , ck ∈ R. And an L2 -limit of normal variables is again normal (Appendix A). Therefore Zs1 , . . . , Zsn ) is normal for all s1 , . . . , sn ≤ t . (X, and Zsj are uncorrelated, for 1 ≤ j ≤ n. It follows sj ] = 0, X Since E[XZ (Appendix A) that and (Zs1 , . . . , Zsn ) are independent . X is independent from G as claimed. But then So X = E[XG ] · E[X] ˇ = E[XG X] = 0 for all G ∈ G E[XG (X − X)] ˇ . Since X ˇ is G-measurable, we conclude that i.e. XdP = XdP G
ˇ = E[X|G]. X
G
There is a curious interpretation of this result: Suppose X, {Zt }t∈T are L2 (P )-functions with given covariances. Among all possible distributions of (X, Zt1 , . . . , Ztn ) with these covariances, the normal distribution will be the “worst” w.r.t. estimation, in the following sense: For any distribution we have E[(X − E[X|G])2 ] ≤ E[(X − PL (X))2 ] , with equality for the normal distribution, by Lemma 6.2.2. (Note that the quantity on the right hand side only depends on the covariances, not on the distribution we might choose to obtain these covariances). For a broad discussion of similar conclusions, based on an information theoretical game between nature and the observer, see Tops¨oe (1978). Finally, to be able to apply Lemma 6.2.2 to our filtering problem, we need the following result: Lemma 6.2.3
#
Xt Mt = Zt
$
∈ R2
is a Gaussian process .
Proof. We may regard Mt as the solution of a 2-dimensional linear stochastic differential equation of the form # $ X0 ; (6.2.9) dMt = H(t)Mt dt + K(t)dBt , M0 = 0 where H(t) ∈ R2×2 , K(t) ∈ R2×2 and Bt is 2-dimensional Brownian motion. Use Picard iteration to solve (6.2.9), i.e. put (n+1) Mt
t = M0 +
H(s)Ms(n) ds
t +
0 (n)
K(s)dBs ,
n = 0, 1, 2, . . . (6.2.10)
0 (n)
Then Mt is Gaussian for all n and Mt → Mt in L2 (P ) (see the proof of Theorem 5.2.1) and therefore Mt is Gaussian (Theorem A.7). Step 2. The Innovation Process Before we introduce the innovation process we will establish a useful representation of the functions in the space L(Z, T ) = the closure in L2 (P ) of all linear combinations c0 + c1 Zt1 + · · · + ck Ztk ; 0 ≤ ti ≤ T, cj ∈ R .
If f ∈ L2 [0, T ], note that # T
2 $
E
f (t)dZt
2 $
# T f (t)G(t)Xt dt
=E
0
# T
0
0
T
# T f (t)G(t)Xt dt
+2E
2 $ f (t)D(t)dVt
+E
0
$ f (t)D(t)dVt
.
0
Since 2 $ # T
T f (t)G(t)Xt dt ≤ A1 · f (t)2 dt by the Cauchy-Schwartz inequality, E 0
0
# T E
2 $ f (t)D(t)dVt
T =
0
f (t)2 D2 (t)dt
by the Itˆo isometry
0
and {Xt }, {Vt } are independent, we conclude that
T
2
f (t)dt ≤ E
A0
2 $
# T
0
f (t)dZt
T ≤ A2
0
f 2 (t)dt ,
(6.2.11)
0
for some constants A0 , A1 , A2 not depending on f . We can now show Lemma 6.2.4 L(Z, T ) = {c0 +
T 0
f (t)dZt ; f ∈ L2 [0, T ], c0 ∈ R}.
Proof. Denote the right hand side by N (Z, T ). It is enough to show that a) N (Z, T ) ⊂ L(Z, T ) b) N (Z, T ) contains all linear combinations of the form c0 + c1 Zt1 + · · · + ck Ztk ;
0 ≤ ti ≤ T
c) N (Z, T ) is closed in L2 (P ) a): This follows from the fact that if f is continuous then
T f (t)dZt = lim
n→∞
0
j
f (j · 2−n ) · (Z(j+1)2−n − Zj·2−n ) .
b): Suppose 0 ≤ t1 < t2 < · · · < tk ≤ T . We can write k
ci Zti =
k−1
i=1
cj ΔZj
=
j=0
k−1
t j+1
cj dZt
=
j=0 t j
T k−1
cj X[tj ,tj+1 ) (t) dZt ,
j=0
0
where ΔZj = Ztj+1 − Ztj . c): This follows from (6.2.11) and the fact that L2 [0, T ] is complete.
Now we define the innovation process Nt as follows:
Nt = Zt − ∫_0^t (GX)^∧_s ds ,  where (GX)^∧_s = P_{L(Z,s)}(G(s)Xs) = G(s)X̂s ,  (6.2.12)
i.e.
dNt = G(t)(Xt − X̂t)dt + D(t)dVt .  (6.2.13)
Lemma 6.2.5 (i) Nt has orthogonal increments t (ii) E[Nt2 ] = D2 (s)ds 0
(iii) L(N, t) = L(Z, t) for all t ≥ 0 (iv) Nt is a Gaussian process Proof. (i): If s < t and Y ∈ L(Z, s) we have # t E[(Nt − Ns )Y ] = E
t =
r )dr + G(r)(Xr − X
s
r )Y ]dr + E G(r)E[(Xr − X
# t
s
t
$ D(r)dVr Y
s
$ DdV Y = 0 ,
s
r ⊥L(Z, r) ⊃ L(Z, s) for r ≥ s and V has independent increments. since Xr − X (ii): By Itˆo’s formula, with g(t, x) = x2 , we have d(Nt2 ) = 2Nt dNt + 12 2(dNt )2 = 2Nt dNt + D2 dt . So E[Nt2 ]
# t =E
$ 2Ns dNs +
0
Now
t Ns dNs = lim 0
t
Δtj →0
D2 (s)ds .
0
Ntj [Ntj+1 − Ntj ] ,
so since N has orthogonal increments we have # t E
$ Ns dNs = 0 ,
and (ii) follows .
0
(iii): It is clear that L(N, t) ⊂ L(Z, t) for all t ≥ 0. To establish the opposite inclusion we use Lemma 6.2.4. So choose f ∈ L2 [0, t] and let us see what functions can be obtained in the form
t
t f (s)dZs −
f (s)dNs = 0
t
0
t
t
f (s)dZs −
= 0
t #
t f (s) −
= 0
r dr f (r)G(r)X
0
# r $
t f (r) g(r, s)dZs dr − f (r)c(r)dr
0
0
$
0
t
f (r)g(r, s)dr dZs −
f (r)c(r)dr , 0
s
where we have used Lemma 6.2.2 and Lemma 6.2.4 to write, for each r,
(GX)∧ r
r = c(r) +
for some g(r, ·) ∈ L2 [0, r], c(r) ∈ R .
g(r, s)dZs 0
From the theory of Volterra integral equations (see e.g. Davis (1977), p. 125) there exists for all h ∈ L2 [0, t] an f ∈ L2 [0, t] such that
t f (s) −
f (r)g(r, s)dr = h(s). s
So by choosing h = X[0,t1 ] where 0 ≤ t1 ≤ t, we obtain
t
t f (r)c(r)dr +
0
t X[0,t1 ] (s)dZs = Zt1 ,
f (s)dNs = 0
0
which shows that L(N, t) ⊃ L(Z, t). t is a limit (in L2 (P )) of linear combinations of the form (iv): X M = c0 + c1 Zs1 + · · · + ck Zsk , Therefore
t1 , . . . , X tm ) (X
where sk ≤ t .
96
6 The Filtering Problem
is a limit of m-dimensional random variables (M (1) , . . . , M (m) ) whose components M (j) are linear combinations of this form. (M (1) , . . . , M (m) ) has a normal distribution since {Zt } is Gaussian, and therefore the limit has. Hence t } is Gaussian. It follows that {X
t Nt = Zt −
s ds G(s)X
0
is Gaussian, by a similar argument. Step 3. The Innovation Process and Brownian Motion
t s ds be the innovation process defined in Step 2. Recall Let Nt = Zt − G(s)X 0
that we have assumed that D(t) is bounded away from 0 on bounded intervals. Define the process Rt (ω) by dRt =
1 dNt (ω) ; D(t)
t ≥ 0, R0 = 0 .
(6.2.14)
Lemma 6.2.6 Rt is a 1-dimensional Brownian motion. Proof. Observe that (i) (ii) (iii) (iv)
Rt has continuous paths Rt has orthogonal increments (since Nt has) Rt is Gaussian (since Nt is) E[Rt ] = 0 and E[Rt Rs ] = min(s, t).
To prove the last assertion in (iv), note that by Itˆo’s formula d(Rt2 ) = 2Rt dRt + (dRt )2 = 2Rt dRt + dt , so, since Rt has orthogonal increments, E[Rt2 ]
t = E[
ds] = t . 0
Therefore, if s < t, E[Rt Rs ] = E[(Rt − Rs )Rs ] + E[Rs2 ] = E[Rs2 ] = s . Properties (i), (iii) and (iv) constitute one of the many characterizations of a 1dimensional Brownian motion (see Simon (1979), Theorem 4.3). (Alternatively, we could easily deduce that Rt has stationary, independent increments and
therefore – by continuity – must be Brownian motion, by the result previously referred to in the beginning of Chapter 3. For a general characterization of Brownian motion see Corollary 8.4.5.) Since L(N, t) = L(R, t) we conclude that
t = PL(R,t) (Xt ) . X
It turns out that the projection down to the space L(R, t) can be described very nicely: (compare with formula (6.2.5) in Example 6.2.1) Lemma 6.2.7 t = E[Xt ] + X
t 0
∂ E[Xt Rs ]dRs . ∂s
(6.2.15)
Proof. From Lemma 6.2.4 we know that t = c0 (t) + X
t
for some g ∈ L2 [0, t], c0 (t) ∈ R .
g(s)dRs 0
t ] = E[Xt ]. We have Taking expectations we see that c0 (t) = E[X t )⊥ (Xt − X
t
for all f ∈ L2 [0, t] .
f (s)dRs 0
Therefore #
$
t
E Xt
f (s)dRs 0
#
t =E X
g(s)f (s)ds =
=E 0
# t
f (s)dRs = E
0
t
$
# t
$
t
g(s)dRs 0
g(s)f (s)ds ,
$
t f (s)dRs 0
for all f ∈ L2 [0, t] ,
0
where we have used the Itˆ o isometry. In particular, if we choose f = X[0,r] for some r ≤ t, we obtain
r E[Xt Rr ] = g(s)ds 0
or g(r) = This completes Step 3.
∂ E[Xt Rr ] , ∂r
as asserted .
Step 4. An Explicit Formula for Xt This is easily obtained using Itˆo’s formula, as in the examples in Chapter 5. The result is t Xt = exp
# F (s)ds
0
t = exp
t X0 +
s −
exp 0
$ F (u)du C(s)dUs
0
t
t F (s)ds X0 + exp F (u)du C(s)dUs .
0
0
s
t In particular, we note that E[Xt ] = E[X0 ] exp( F (s)ds). 0
More generally, if 0 ≤ r ≤ t, (see Exercise 6.12) t Xt = exp
F (s)ds Xr +
r
t
t exp r
F (u)du C(s)dUs .
(6.2.16)
s
0t Step 5. The Stochastic Differential Equation for X We now combine the previous steps to obtain the solution of the filtering probt . Starting with the formula lem, i.e. a stochastic differential equation for X from Lemma 6.2.7 t = E[Xt ] + X
t f (s, t)dRs , 0
where f (s, t) =
∂ E[Xt Rs ] , ∂s
(6.2.17)
we use that
s Rs = 0
G(r) r )dr + Vs (Xr − X D(r)
from (6.2.13) and (6.2.14))
and obtain
s E[Xt Rs ] = 0
G(r) r ]dr , E[Xt X D(r)
where r = Xr − X r . X
(6.2.18)
Using formula (6.2.16) for Xt , we obtain r ] = exp E[Xt X
t
r ] = exp F (v)dv E[Xr X
r
t
F (v)dv S(r) ,
r
where r )2 ] , S(r) = E[(X
(6.2.19)
i.e. the mean square error of the estimate at time r ≥ 0. Thus
s E[Xt Rs ] = 0
G(r) exp D(r)
t
F (v)dv S(r)dr
r
so that G(s) exp f (s, t) = D(s)
t
F (v)dv S(s) .
(6.2.20)
s
We claim that S(t) satisfies the (deterministic) differential equation G2 (t) dS = 2F (t)S(t) − 2 S 2 (t) + C 2 (t) dt D (t)
(The Riccati equation) . (6.2.21)
To prove (6.2.21) note that by the Pythagorean theorem, (6.2.15) and the Itˆo isometry t ] + E[X t )2 ] = E[X 2 ] − 2E[Xt X 2 ] = E[X 2 ] − E[X 2] S(t) = E[(Xt − X t t t t t
(6.2.22) = T (t) − f (s, t)2 ds − E[Xt ]2 , 0
where T (t) = E[Xt2 ] .
(6.2.23)
Now by (6.2.16) and the Itˆo isometry we have t t
t 2 T (t) = exp 2 F (s)ds E[X0 ] + exp 2 F (u)du C 2 (s)ds , 0
0
s
using that X0 is independent of {Us }s≥0 . So t dT = 2F (t) · exp 2 F (s)ds E[X02 ] + C 2 (t) dt 0
t +
t 2F (t) exp 2 F (u)du C 2 (s)ds
0
s
i.e.
dT = 2F (t)T (t) + C 2 (t) . dt Substituting in (6.2.22) we obtain, using Step 4, dT dS = − f (t, t)2 − dt dt
t 2f (s, t) · 0
(6.2.24)
∂ f (s, t)ds − 2F (t)E[Xt ]2 ∂t
G2 (t)S 2 (t) −2 = 2F (t)T (t) + C (t) − D2 (t) 2
G2 (t)S 2 (t) , = 2F (t)S(t) + C 2 (t) − D2 (t)
t
f 2 (s, t)F (t)ds − 2F (t)E[Xt ]2
0
which is (6.2.21) .
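The Riccati equation (6.2.21) can be checked numerically in a constant-coefficient special case (a sketch, not from the book; with F = 0, C = c, G = 1, D = m and S(0) = 0 the solution is S(t) = mc·tanh(ct/m), as in Example 6.2.10 below with a = 0):

```python
import numpy as np

# Forward-Euler integration of dS/dt = -S^2/m^2 + c^2 against the closed form.
m, c = 1.0, 1.0
T, n = 2.0, 20_000
dt = T / n
S = 0.0                                 # S(0) = a^2 = 0 < mc
max_err = 0.0
for k in range(1, n + 1):
    S = S + (c**2 - S**2 / m**2) * dt   # Riccati right-hand side with F = 0
    max_err = max(max_err, abs(S - m * c * np.tanh(c * k * dt / m)))
```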
t : We are now ready for the stochastic differential equation for X From the formula t = c0 (t) + X
t f (s, t)dRs
where c0 (t) = E[Xt ]
0
it follows that t = c (t)dt + f (t, t)dRt + dX 0
t 0
∂ f (s, t)dRs dt , ∂t
since
u t 0
0
u = 0
So
u u ∂ ∂ f (s, t)dRs dt = f (s, t)dt dRs ∂t ∂t 0
s
u − c0 (u) − (f (s, u) − f (s, s))dRs = X
u f (s, s)dRs . 0
(6.2.25)
t = c (t)dt + dX 0
G(t)S(t) dRt + D(t)
t
f (s, t)dRs F (t)dt
0
or t − c0 (t))dt + t = c (t)dt + F (t) · (X dX 0
G(t)S(t) dRt D(t)
t dt + G(t)S(t) dRt , = F (t)X D(t)
(6.2.26)
since c0 (t) = F (t)c0 (t) (Step 4). If we substitute 1 t dt] [dZt − G(t)X dRt = D(t) we obtain t = (F (t) − dX
G2 (t)S(t) G(t)S(t) )Xt dt + dZt . D2 (t) D2 (t)
(6.2.27)
So the conclusion is:

Theorem 6.2.8 (The 1-dimensional Kalman-Bucy filter)
The solution X̂t = E[Xt|Gt] of the 1-dimensional linear filtering problem
(linear system)  dXt = F(t)Xt dt + C(t)dUt ;  F(t), C(t) ∈ R  (6.2.3)
(linear observations)  dZt = G(t)Xt dt + D(t)dVt ;  G(t), D(t) ∈ R  (6.2.4)
(with conditions as stated earlier) satisfies the stochastic differential equation
dX̂t = ( F(t) − G²(t)S(t)/D²(t) ) X̂t dt + ( G(t)S(t)/D²(t) ) dZt ;  X̂0 = E[X0]  (6.2.28)
where S(t) = E[(Xt − X̂t)²] satisfies the (deterministic) Riccati equation
dS/dt = 2F(t)S(t) − (G²(t)/D²(t)) S²(t) + C²(t) ,  S(0) = E[(X0 − E[X0])²] .

Example 6.2.9 (Noisy observations of a constant process)
Consider the simple case
(system)  dXt = 0, i.e. Xt = X0 ;  E[X0] = 0 , E[X0²] = a²
(observations)  dZt = Xt dt + m dVt ;  Z0 = 0
dZ = Xt + mWt , Wt = white noise) . dt
(6.2.29)
102
6 The Filtering Problem
First we solve the corresponding Riccati equation for t )2 ]: S(t) = E[(Xt − X dS 1 = − 2 S2 , S(0) = a2 dt m i.e.
a 2 m2 ; m2 + a 2 t t : This gives the following equation for X
t≥0.
S(t) =
t = − dX
a2 a2 t dt + dZt ; X m2 + a 2 t m2 + a 2 t
0 = E[X0 ] = 0 X
or t t exp d X 0
a2 ds 2 m + a2 s
t = exp 0
a2 a2 ds dZt 2 2 2 m +a s m + a2 t
which gives t = X
m2
m2 a2 0 + Zt ; X 2 2 +a t m + a2 t
t≥0.
This is the continuous analogue of Example 6.2.1. Example 6.2.10 (Noisy observations of a Brownian motion) If we modify the preceding example slightly, so that (system) (observations)
dXt = cdUt ; E[X0 ] = 0, E[X02 ] = a2 , c constant dZt = Xt dt + mdVt ,
the Riccati equation becomes dS 1 = − 2 S 2 + c2 , S(0) = a2 dt m or m2 dS = dt, (S = mc) . m2 c2 − S 2 This gives mc + s 2ct ; = K exp mc − s m Or
mc + a2 K= . mc − a2
(6.2.30)
⎧ K·exp( 2ct m )−1 ⎪ mc K·exp( 2ct ; if S(0) < mc ⎪ ⎪ m )+1 ⎨ S(t) = mc (constant) if S(0) = mc ⎪ ⎪ 2ct ⎪ m )+1 ⎩ mc K·exp( 2ct if S(0) > mc . K·exp( )−1 m
Thus in all cases the mean square error S(t) tends to mc as t → ∞ . For simplicity let us put a = 0, m = c = 1. Then S(t) =
exp(2t) − 1 = tanh(t) . exp(2t) + 1
t is The equation for X t dt + tanh(t)dZt , t = − tanh(t)X dX
0 = 0 X
or t ) = sinh(t)dZt . d(cosh(t)X So t = X
1 cosh(t)
t sinh(s)dZs . 0
If we return to the interpretation of Zt :
t Zt =
Hs ds , 0
where Hs are the “original” observations (see (6.1.4)), we can write t = X
1 cosh(t)
t sinh(s)Hs ds ,
(6.2.31)
0
t is approximately (for large t) a weighted average of the observations so X Hs , with increasing emphasis on observations as time increases. Remark. It is interesting to compare formula (6.2.31) with established formulas in forecasting. For example, the exponentially weighted moving average n suggested by C.C. Holt in 1958 is given by X n n = (1 − α)n Z0 + α X (1 − α)n−k Zk , k=1
where α is some constant; 0 ≤ α ≤ 1. See The Open University (1981), p. 16.
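The whole filter of Example 6.2.10 can be simulated (a sketch, not from the book; a = 0, m = c = 1, so S(t) = tanh t, and the path count, step size and seed are arbitrary choices): the empirical mean square error E[(X_T − X̂_T)²] should be close to S(T):

```python
import numpy as np

# System dX = dU, observations dZ = X dt + dV, filter dXhat = -S*Xhat dt + S dZ
# with S(t) = tanh(t), simulated over many paths by Euler steps.
rng = np.random.default_rng(3)
n_paths, T, n_steps = 2000, 3.0, 1500
dt = T / n_steps
X = np.zeros(n_paths)        # system state, X_0 = 0 (so a = 0)
Xhat = np.zeros(n_paths)     # filter estimate, Xhat_0 = E[X_0] = 0
t = 0.0
for _ in range(n_steps):
    dU = rng.normal(0.0, np.sqrt(dt), n_paths)
    dV = rng.normal(0.0, np.sqrt(dt), n_paths)
    dZ = X * dt + dV                      # observation increment
    S = np.tanh(t)                        # solution of the Riccati equation
    Xhat = Xhat + S * (dZ - Xhat * dt)    # Kalman-Bucy update
    X = X + dU                            # system step
    t += dt
mse = np.mean((X - Xhat) ** 2)            # should be close to S(T) = tanh(T)
```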
This may be written n = β −n Z0 + (β − 1)β −n−1 X
n
β k Zk ,
k=1 1 1−α
(assuming α < 1), which is a discrete version of (6.2.31), or – where β = more precisely – of the formula corresponding to (6.2.31) in the general case when a = 0 and m, c are not necessarily equal to 1. Example 6.2.11 (Estimation of a parameter) Suppose we want to estimate the value of a (constant) parameter θ, based on observations Zt satisfying the model dZt = θ M (t)dt + N (t)dBt , where M (t), N (t) are known functions. In this case the stochastic differential equation for θ is of course dθ = 0 , so the Riccati equation for S(t) = E[(θ − θt )2 ] is 2 M (t)S(t) dS =− dt N (t) which gives −1
t −1 2 −2 S(t) = S0 + M (s) N (s) ds 0
and the Kalman-Bucy filter is dθt =
M (t)S(t) (dZt − M (t)θt dt) . N (t)2
This can be written
S0−1
t +
M (s)2 N (s)−2 ds dθt + M (t)2 N (t)−2 θt dt = M (t)N (t)−2 dZt .
0
We recognize the left hand side as
t −1 d( S0 + M (s)2 N (s)−2 ds θt ) 0
so we obtain θt =
θ0 S0−1 + S0−1
+
t
0 t
M (s)N (s)−2 dZs
M (s)2 N (s)−2 ds
0
.
6.2 The 1-Dimensional Linear Filtering Problem
105
This estimate coincides with the maximum likelihood estimate in classical estimation theory if S0−1 = 0. See Liptser and Shiryaev (1978). For more information about estimates of drift parameters in diffusions and generalizations, see for example Aase (1982), Brown and Hewitt (1975) and Taraskin (1974). Example 6.2.12 (Noisy observations of a population growth) Consider a simple growth model (r constant) E[(X0 − b)2 ] = a2 ,
dXt = rXt dt, E[X0 ] = b > 0 , with observations dZt = Xt dt + mdVt ;
m constant .
The corresponding Riccati equation
$$\frac{dS}{dt} = 2rS - \frac{1}{m^2} S^2 , \qquad S(0) = a^2 ,$$
gives the logistic curve
$$S(t) = \frac{2rm^2}{1 + Ke^{-2rt}} , \qquad \text{where } K = \frac{2rm^2}{a^2} - 1 .$$
So the equation for $\widehat{X}_t$ becomes
$$d\widehat{X}_t = \Big( r - \frac{S}{m^2} \Big) \widehat{X}_t\, dt + \frac{S}{m^2}\, dZ_t ; \qquad \widehat{X}_0 = E[X_0] = b .$$
For simplicity let us assume that $a^2 = 2rm^2$, so that
$$S(t) = 2rm^2 \qquad \text{for all } t .$$
(In the general case $S(t) \to 2rm^2$ as $t \to \infty$, so this is not an unreasonable approximation for large $t$.) Then we get
$$d(\exp(rt)\widehat{X}_t) = \exp(rt)\, 2r\, dZ_t , \qquad \widehat{X}_0 = b ,$$
or
$$\widehat{X}_t = \exp(-rt)\Big[ 2r \int_0^t \exp(rs)\, dZ_s + b \Big] .$$
As in Example 6.2.10 this may be written
$$\widehat{X}_t = \exp(-rt)\Big[ 2r \int_0^t \exp(rs) H_s\, ds + b \Big] , \qquad \text{if } Z_t = \int_0^t H_s\, ds . \qquad (6.2.32)$$
For example, assume that $H_s = \beta$ (constant) for $0 \le s \le t$, i.e. that our observations (for some reason) give the same value $\beta$ for all times $s \le t$. Then
$$\widehat{X}_t = 2\beta - (2\beta - b)\exp(-rt) \to 2\beta \qquad \text{as } t \to \infty .$$
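A one-line check (an illustration, not from the text): with $H_s = \beta$ the integral in (6.2.32) can be evaluated explicitly, and the result matches the expression above. Constants are arbitrary:

```python
import math

r, b, beta, t = 0.5, 1.0, 2.0, 3.0   # illustrative constants

# (6.2.32) with H_s = beta: the integral 2r * int_0^t e^{rs} beta ds
# evaluates to 2*beta*(e^{rt} - 1).
lhs = math.exp(-r * t) * (2 * beta * (math.exp(r * t) - 1) + b)
rhs = 2 * beta - (2 * beta - b) * math.exp(-r * t)
print(abs(lhs - rhs))   # zero up to rounding
```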
If $H_s = \beta \cdot \exp(\alpha s)$, $s \ge 0$ ($\alpha$ constant), we get
$$\widehat{X}_t = \exp(-rt)\Big[ \frac{2r\beta}{r+\alpha}\big(\exp((r+\alpha)t) - 1\big) + b \Big] \approx \frac{2r\beta}{r+\alpha}\exp(\alpha t) \quad \text{for large } t .$$
Thus, only if $\alpha = r$, i.e. $H_s = \beta\exp(rs)$; $s \ge 0$, does the filter "believe" the observations in the long run. And only if $\alpha = r$ and $\beta = b$, i.e. $H_s = b\exp(rs)$; $s \ge 0$, does the filter "believe" the observations at all times.

Example 6.2.13 (Constant coefficients – general discussion) Now consider the system
$$dX_t = FX_t\, dt + C\, dU_t ; \qquad F, C \ \text{constants} \ne 0$$
with observations
$$dZ_t = GX_t\, dt + D\, dV_t ; \qquad G, D \ \text{constants} \ne 0 .$$
The corresponding Riccati equation
$$S' = 2FS - \frac{G^2}{D^2}S^2 + C^2 , \qquad S(0) = a^2$$
has the solution
$$S(t) = \frac{\alpha_1 - K\alpha_2 \exp\big(\frac{(\alpha_2 - \alpha_1)G^2}{D^2}\, t\big)}{1 - K \exp\big(\frac{(\alpha_2 - \alpha_1)G^2}{D^2}\, t\big)} ,$$
where
$$\alpha_1 = G^{-2}\big(FD^2 - D\sqrt{F^2D^2 + G^2C^2}\big) , \qquad \alpha_2 = G^{-2}\big(FD^2 + D\sqrt{F^2D^2 + G^2C^2}\big)$$
and
$$K = \frac{a^2 - \alpha_1}{a^2 - \alpha_2} .$$
This gives the solution for $\widehat{X}_t$ of the form
$$\widehat{X}_t = \exp\Big( \int_0^t H(s)\,ds \Big)\, \widehat{X}_0 + \frac{G}{D^2} \int_0^t \exp\Big( \int_s^t H(u)\,du \Big) S(s)\, dZ_s ,$$
where
$$H(s) = F - \frac{G^2}{D^2}\, S(s) .$$
For large $s$ we have $S(s) \approx \alpha_2$. This gives
$$\widehat{X}_t \approx \widehat{X}_0 \exp\Big( \Big(F - \frac{G^2\alpha_2}{D^2}\Big)t \Big) + \frac{G\alpha_2}{D^2} \int_0^t \exp\Big( \Big(F - \frac{G^2\alpha_2}{D^2}\Big)(t-s) \Big)\, dZ_s$$
$$= \widehat{X}_0 \exp(-\beta t) + \frac{G\alpha_2}{D^2} \exp(-\beta t) \int_0^t \exp(\beta s)\, dZ_s , \qquad (6.2.33)$$
where $\beta = D^{-1}\sqrt{F^2D^2 + G^2C^2}$. So we get approximately the same behaviour as in the previous example.
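As a numerical aside (not in the text), one can confirm that $\alpha_2$ is an equilibrium of the Riccati equation: the right hand side $2FS - \frac{G^2}{D^2}S^2 + C^2$ vanishes at $S = \alpha_2$. The constants below are arbitrary illustrations:

```python
import math

F, C, G, D = 0.8, 1.3, 2.0, 0.7    # arbitrary nonzero constants

root = math.sqrt(F**2 * D**2 + G**2 * C**2)
alpha2 = (F * D**2 + D * root) / G**2   # the larger root of the Riccati RHS

# Riccati right hand side evaluated at the claimed equilibrium:
residual = 2 * F * alpha2 - (G**2 / D**2) * alpha2**2 + C**2
print(abs(residual))   # zero up to rounding
```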
6.3 The Multidimensional Linear Filtering Problem

Finally we formulate the solution of the $n$-dimensional linear filtering problem (6.2.1), (6.2.2):

Theorem 6.3.1 (The Multi-Dimensional Kalman-Bucy Filter) The solution $\widehat{X}_t = E[X_t|\mathcal{G}_t]$ of the multi-dimensional linear filtering problem

(linear system) $\quad dX_t = F(t)X_t\, dt + C(t)\,dU_t ; \qquad F(t) \in \mathbb{R}^{n\times n},\ C(t) \in \mathbb{R}^{n\times p} \qquad (6.3.1)$

(linear observations) $\quad dZ_t = G(t)X_t\, dt + D(t)\,dV_t ; \qquad G(t) \in \mathbb{R}^{m\times n},\ D(t) \in \mathbb{R}^{m\times r} \qquad (6.3.2)$

satisfies the stochastic differential equation
$$d\widehat{X}_t = \big(F - SG^T(DD^T)^{-1}G\big)\widehat{X}_t\, dt + SG^T(DD^T)^{-1}\, dZ_t ; \qquad \widehat{X}_0 = E[X_0] \qquad (6.3.3)$$
where $S(t) := E[(X_t - \widehat{X}_t)(X_t - \widehat{X}_t)^T] \in \mathbb{R}^{n\times n}$ satisfies the matrix Riccati equation
$$\frac{dS}{dt} = FS + SF^T - SG^T(DD^T)^{-1}GS + CC^T ; \qquad S(0) = E[(X_0 - E[X_0])(X_0 - E[X_0])^T] . \qquad (6.3.4)$$
The condition on $D(t) \in \mathbb{R}^{m\times r}$ is now that $D(t)D(t)^T$ is invertible for all $t$ and that $(D(t)D(t)^T)^{-1}$ is bounded on every bounded $t$-interval.
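To make the matrix Riccati equation (6.3.4) concrete, here is a minimal Euler integration for a hypothetical constant-coefficient 2-dimensional system (all names and values are illustrative, not from the book). $S(t)$ settles at the steady-state error covariance, where the right hand side of (6.3.4) vanishes:

```python
F = [[0.0, 1.0], [-1.0, -0.2]]
C = [[0.1, 0.0], [0.0, 0.1]]
G = [[1.0, 0.0]]            # observe the first component only
Dinv = 1.0 / 0.3 ** 2       # (D D^T)^{-1} for the scalar noise D = 0.3

def mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def tr(A):
    return [list(r) for r in zip(*A)]

def riccati_rhs(S):
    # F S + S F^T - S G^T (D D^T)^{-1} G S + C C^T, as in (6.3.4)
    quad = mul(mul(mul(S, tr(G)), [[Dinv]]), mul(G, S))
    FS, SFt, CCt = mul(F, S), mul(S, tr(F)), mul(C, tr(C))
    return [[FS[i][j] + SFt[i][j] - quad[i][j] + CCt[i][j]
             for j in range(2)] for i in range(2)]

S = [[1.0, 0.0], [0.0, 1.0]]   # S(0) = Cov(X_0), assumed here
dt = 0.001
for _ in range(100_000):       # integrate (6.3.4) up to t = 100
    dS = riccati_rhs(S)
    S = [[S[i][j] + dt * dS[i][j] for j in range(2)] for i in range(2)]

resid = max(abs(v) for row in riccati_rhs(S) for v in row)
print(resid)   # near zero: S(t) has reached the steady state
```

The steady-state $S$ then yields a constant filter gain $SG^T(DD^T)^{-1}$ in (6.3.3), as in the constant-coefficient discussion of Example 6.2.13.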
A similar solution can be found for the more general situation

(system) $\quad dX_t = [F_0(t) + F_1(t)X_t + F_2(t)Z_t]\,dt + C(t)\,dU_t \qquad (6.3.5)$

(observations) $\quad dZ_t = [G_0(t) + G_1(t)X_t + G_2(t)Z_t]\,dt + D(t)\,dV_t , \qquad (6.3.6)$

where $X_t \in \mathbb{R}^n$, $Z_t \in \mathbb{R}^m$ and $B_t = (U_t, V_t)$ is $(n+m)$-dimensional Brownian motion, with appropriate dimensions on the matrix coefficients. See Bensoussan (1992) and Kallianpur (1980), who also treat the non-linear case. An account of non-linear filtering theory is also given in Pardoux (1979) and Davis (1984). For the solution of linear filtering problems governed by more general processes than Brownian motion (processes with orthogonal increments) see Davis (1977). For various applications of filtering theory see Bucy and Joseph (1968), Jazwinski (1970), Gelb (1974), Maybeck (1979) and the references in these books.
Exercises

6.1. (Time-varying observations of a constant) Prove that if the (1-dimensional) system is
$$dX_t = 0 , \qquad E[X_0] = 0 , \quad E[X_0^2] = a^2$$
and the observation process is
$$dZ_t = G(t)X_t\, dt + dV_t , \qquad Z_0 = 0 ,$$
then $S(t) = E[(X_t - \widehat{X}_t)^2]$ is given by
$$S(t) = \frac{1}{\frac{1}{S(0)} + \int_0^t G^2(s)\, ds} . \qquad (6.3.7)$$
We say that we have exact asymptotic estimation if $S(t) \to 0$ as $t \to \infty$, i.e. if
$$\int_0^\infty G^2(s)\, ds = \infty .$$
Thus for
$$G(s) = \frac{1}{(1+s)^p} \qquad (p > 0 \ \text{constant})$$
we have exact asymptotic estimation iff $p \le \frac{1}{2}$.

6.2. Consider the linear 1-dimensional filtering problem with no noise in the system:
(system) $\quad dX_t = F(t)X_t\, dt \qquad (6.3.8)$
(observations) $\quad dZ_t = G(t)X_t\, dt + D(t)\,dV_t \qquad (6.3.9)$

Put $S(t) = E[(X_t - \widehat{X}_t)^2]$ as usual and assume $S(0) > 0$.
a) Show that
$$R(t) := \frac{1}{S(t)}$$
satisfies the linear differential equation
$$R'(t) = -2F(t)R(t) + \frac{G^2(t)}{D^2(t)} ; \qquad R(0) = \frac{1}{S(0)} . \qquad (6.3.10)$$
b) Use (6.3.10) to prove that for the filtering problem (6.3.8), (6.3.9) we have
$$\frac{1}{S(t)} = \frac{1}{S(0)}\exp\Big( -2\int_0^t F(s)\,ds \Big) + \int_0^t \frac{G^2(s)}{D^2(s)} \exp\Big( -2\int_s^t F(u)\,du \Big)\, ds . \qquad (6.3.11)$$

6.3.
In Example 6.2.12 we found that
$$S(t) \to 2rm^2 \qquad \text{as } t \to \infty ,$$
so exact asymptotic estimation (Exercise 6.1) of $X_t$ is not possible. However, prove that we can obtain exact asymptotic estimation of $X_0$, in the sense that
$$E[(X_0 - E[X_0|\mathcal{G}_t])^2] \to 0 \qquad \text{as } t \to \infty .$$
(Hint: Note that $X_0 = e^{-rt}X_t$ and therefore $E[X_0|\mathcal{G}_t] = e^{-rt}\widehat{X}_t$, so that $E[(X_0 - E[X_0|\mathcal{G}_t])^2] = e^{-2rt}S(t)$.)

6.4.
Consider the multi-dimensional linear filtering problem with no noise in the system:

(system) $\quad dX_t = F(t)X_t\, dt ; \qquad X_t \in \mathbb{R}^n,\ F(t) \in \mathbb{R}^{n\times n} \qquad (6.3.12)$
(observations) $\quad dZ_t = G(t)X_t\, dt + D(t)\,dV_t ; \qquad G(t) \in \mathbb{R}^{m\times n},\ D(t) \in \mathbb{R}^{m\times r} \qquad (6.3.13)$

Assume that $S(t)$ is nonsingular and define $R(t) = S(t)^{-1}$. Prove that $R(t)$ satisfies the Lyapunov equation (compare with Exercise 6.2)
$$R'(t) = -R(t)F(t) - F(t)^T R(t) + G(t)^T (D(t)D(t)^T)^{-1} G(t) . \qquad (6.3.14)$$
(Hint: Note that since $S(t)S^{-1}(t) = I$ we have $S'(t)S^{-1}(t) + S(t)(S^{-1})'(t) = 0$, which gives $(S^{-1})'(t) = -S^{-1}(t)S'(t)S^{-1}(t)$.)

6.5.
(Prediction) In the prediction problem one seeks to estimate the value of the system $X$ at a future time $T$ based on the observations $\mathcal{G}_t$ up to the present time $t < T$. Prove that in the linear setup (6.2.3), (6.2.4) the predicted value
$$E[X_T|\mathcal{G}_t] , \qquad T > t ,$$
is given by
$$E[X_T|\mathcal{G}_t] = \exp\Big( \int_t^T F(s)\, ds \Big) \cdot \widehat{X}_t . \qquad (6.3.15)$$
(Hint: Use formula (6.2.16).)

6.6.
(Interpolation/smoothing) The interpolation or smoothing problem consists of estimating the value of the system $X$ at a time $s < t$, given the observations up to time $t$, $\mathcal{G}_t$. With notation as in (6.2.1), (6.2.2) one can show that $M_s := E[X_s|\mathcal{G}_t]$ satisfies the differential equation
$$\frac{d}{ds}M_s = F(s)M_s + C(s)C^T(s)S^{-1}(s)\big(M_s - \widehat{X}_s\big) ; \qquad M_t = \widehat{X}_t .$$

6.15. Suppose $X_t$ is given by
$$dX_t = \mu X_t\, dt + \sigma X_t\, dB_t ; \qquad X_0 = x > 0 . \qquad (6.3.20)$$
Here $\sigma \ne 0$ and $x$ are known constants. The parameter $\mu$ is also constant, but we do not know its value, only its probability distribution, which is assumed to be normal with mean $\bar{\mu}$ and variance $a^2$. We assume that $\mu$ is independent of $\{B_s\}_{s \ge 0}$ and that $E[\mu^2] < \infty$. We assume that we can observe the value of $X_t$ for all $t$. Thus we have access to the information ("$\sigma$-algebra") $\mathcal{M}_t$ generated by $X_s$; $s \le t$. Let $\mathcal{N}_t$ be the $\sigma$-algebra generated by $\xi_s$, $s \le t$, where
$$d\xi_t = \mu\, dt + \sigma\, dB_t ; \qquad \xi_0 = x . \qquad (6.3.21)$$
a) Prove that $\mathcal{M}_t = \mathcal{N}_t$.
b) Prove that
$$E[\mu|\mathcal{N}_t] = (\theta + \sigma^{-2}t)^{-1}\big(\bar{\mu}\theta + \sigma^{-2}\xi_t\big) , \qquad (6.3.22)$$
where
$$\theta = E[(\mu - \bar{\mu})^2]^{-1} , \qquad \bar{\mu} = E[\mu] . \qquad (6.3.23)$$
c) Define
$$\widehat{B}_t = \int_0^t \sigma^{-1}\big(\mu - E[\mu|\mathcal{M}_s]\big)\, ds + B_t . \qquad (6.3.24)$$
Prove that $\widehat{B}_t$ is a Brownian motion.
d) Prove that $\widehat{B}_t$ is $\mathcal{M}_t$-measurable for all $t$. Hence
$$\widehat{\mathcal{F}}_t \subseteq \mathcal{M}_t , \qquad (6.3.25)$$
where $\widehat{\mathcal{F}}_t$ is the $\sigma$-algebra generated by $\widehat{B}_s$; $s \le t$.
e) Prove that $\xi_t$ is $\widehat{\mathcal{F}}_t$-measurable for all $t$. Combined with d) and a) this gives that
$$\widehat{\mathcal{F}}_t = \mathcal{M}_t = \mathcal{N}_t .$$
f) Prove that
$$dX_t = E[\mu|\mathcal{M}_t]\,X_t\, dt + \sigma X_t\, d\widehat{B}_t .$$
Note that in this representation of $X_t$ all the coefficients are observable quantities.

6.16. Repeat Exercise 6.15, but this time with $X_t$ being a mean-reverting Ornstein-Uhlenbeck process given by
$$dX_t = (\mu - \rho X_t)\,dt + \sigma\, dB_t ; \qquad X_0 = x \in \mathbb{R} .$$
Here $\rho, \sigma \ne 0$ and $x$ are known constants, while $\mu$ is an unknown constant, as before. Conclude that $X_t$ can be given a representation of the form
$$dX_t = \big(E[\mu|\mathcal{M}_t] - \rho X_t\big)\,dt + \sigma\, d\widehat{B}_t .$$
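As a numerical aside (not part of the exercises), formula (6.3.22) says the conditional mean of $\mu$ is a precision-weighted average of the prior mean and the observation; for large $t$ it converges to the true drift. A sketch, taking $x = 0$ and illustrative values:

```python
import random, math

random.seed(2)

sigma, mu_bar, a = 0.2, 0.0, 1.0   # observation noise, prior mean, prior std
mu = random.gauss(mu_bar, a)        # "true" drift drawn from the prior
theta = 1.0 / a**2                  # prior precision, theta = E[(mu - mu_bar)^2]^{-1}

# Simulate xi_t = mu t + sigma B_t (taking xi_0 = x = 0) on a grid.
dt, n = 0.01, 100_000               # t runs up to 1000
xi = 0.0
for _ in range(n):
    xi += mu * dt + sigma * random.gauss(0.0, math.sqrt(dt))

t = n * dt
post_mean = (mu_bar * theta + xi / sigma**2) / (theta + t / sigma**2)
print(abs(post_mean - mu))          # small for large t
```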
7 Diffusions: Basic Properties
7.1 The Markov Property

Suppose we want to describe the motion of a small particle suspended in a moving liquid, subject to random molecular bombardments. If $b(t,x) \in \mathbb{R}^3$ is the velocity of the fluid at the point $x$ and time $t$, then a reasonable mathematical model for the position $X_t$ of the particle at time $t$ would be a stochastic differential equation of the form
$$\frac{dX_t}{dt} = b(t, X_t) + \sigma(t, X_t)W_t , \qquad (7.1.1)$$
where $W_t \in \mathbb{R}^3$ denotes "white noise" and $\sigma(t,x) \in \mathbb{R}^{3\times 3}$. The Itô interpretation of this equation is
$$dX_t = b(t, X_t)\,dt + \sigma(t, X_t)\,dB_t , \qquad (7.1.2)$$
where $B_t$ is 3-dimensional Brownian motion, and similarly (with a correction term added to $b$) for the Stratonovich interpretation (see (6.1.3)). In a stochastic differential equation of the form
$$dX_t = b(t, X_t)\,dt + \sigma(t, X_t)\,dB_t , \qquad (7.1.3)$$
where $X_t \in \mathbb{R}^n$, $b(t,x) \in \mathbb{R}^n$, $\sigma(t,x) \in \mathbb{R}^{n\times m}$ and $B_t$ is $m$-dimensional Brownian motion, we will call $b$ the drift coefficient and $\sigma$ – or sometimes $\frac{1}{2}\sigma\sigma^T$ – the diffusion coefficient (see Theorem 7.3.3). Thus the solution of a stochastic differential equation may be thought of as the mathematical description of the motion of a small particle in a moving fluid: Therefore such stochastic processes are called (Itô) diffusions.

In this chapter we establish some of the most basic properties and results about Itô diffusions:
7.1 The Markov property.
7.2 The strong Markov property.

B. Øksendal, Stochastic Differential Equations, Universitext, DOI 10.1007/978-3-642-14394-6 7, © Springer-Verlag Berlin Heidelberg 2013
7.3 The generator $A$ of $X_t$ expressed in terms of $b$ and $\sigma$.
7.4 The Dynkin formula.
7.5 The characteristic operator.
This will give us the necessary background for the applications in the remaining chapters.

Definition 7.1.1 A (time-homogeneous) Itô diffusion is a stochastic process $X_t(\omega) = X(t,\omega): [s,\infty) \times \Omega \to \mathbb{R}^n$ satisfying a stochastic differential equation of the form
$$dX_t = b(X_t)\,dt + \sigma(X_t)\,dB_t , \qquad t \ge s ; \quad X_s = x , \qquad (7.1.4)$$
where $B_t$ is $m$-dimensional Brownian motion and $b: \mathbb{R}^n \to \mathbb{R}^n$, $\sigma: \mathbb{R}^n \to \mathbb{R}^{n\times m}$ satisfy the conditions in Theorem 5.2.1, which in this case simplify to:
$$|b(x) - b(y)| + |\sigma(x) - \sigma(y)| \le D|x - y| ; \qquad x, y \in \mathbb{R}^n , \qquad (7.1.5)$$
i.e. that $b(\cdot)$ and $\sigma(\cdot)$ are Lipschitz continuous. We will denote the (unique) solution of (7.1.4) by $X_t = X_t^{s,x}$; $t \ge s$. If $s = 0$ we write $X_t^x$ for $X_t^{0,x}$.

Note that we have assumed in (7.1.4) that $b$ and $\sigma$ do not depend on $t$ but on $x$ only. We shall see later (Chapters 10, 11) that the general case can be reduced to this situation. The resulting process $X_t(\omega)$ will have the property of being time-homogeneous, in the following sense: Note that
$$X_{s+h}^{s,x} = x + \int_s^{s+h} b(X_u^{s,x})\,du + \int_s^{s+h} \sigma(X_u^{s,x})\,dB_u$$
$$= x + \int_0^h b(X_{s+v}^{s,x})\,dv + \int_0^h \sigma(X_{s+v}^{s,x})\,d\widetilde{B}_v , \qquad (u = s+v) \qquad (7.1.6)$$
where $\widetilde{B}_v = B_{s+v} - B_s$; $v \ge 0$. (See Exercise 2.12.) On the other hand of course
$$X_h^{0,x} = x + \int_0^h b(X_v^{0,x})\,dv + \int_0^h \sigma(X_v^{0,x})\,dB_v .$$
Since $\{\widetilde{B}_v\}_{v\ge 0}$ and $\{B_v\}_{v\ge 0}$ have the same $P^0$-distributions, it follows by weak uniqueness (Lemma 5.3.1) of the solution of the stochastic differential equation
$$dX_t = b(X_t)\,dt + \sigma(X_t)\,dB_t ; \qquad X_0 = x$$
that
$$\{X_{s+h}^{s,x}\}_{h\ge 0} \qquad \text{and} \qquad \{X_h^{0,x}\}_{h\ge 0}$$
have the same $P^0$-distributions, i.e. $\{X_t\}_{t\ge 0}$ is time-homogeneous.

We now introduce some notation which is common (and convenient) in the context of Markov processes: From now on we let $Q^x$ denote the probability law of a given (time-homogeneous) Itô diffusion $\{X_t\}_{t\ge 0}$ when its initial value is $X_0 = x \in \mathbb{R}^n$. The expectation with respect to $Q^x$ is denoted by $E^x[\,\cdot\,]$. Hence we have
$$E^x[f_1(X_{t_1}) \cdots f_k(X_{t_k})] = E[f_1(X_{t_1}^x) \cdots f_k(X_{t_k}^x)] \qquad (7.1.7)$$
for all bounded Borel functions $f_1, \ldots, f_k$ and all times $t_1, \ldots, t_k \ge 0$; $k = 1, 2, \ldots$, where $E = E_P$ denotes the expectation with respect to the probability law $P = P^0$ for $\{B_t\}_{t\ge 0}$ when $B_0 = 0$ (see Section 2.2).

As before we let $\mathcal{F}_t^{(m)}$ be the $\sigma$-algebra generated by $\{B_r ; r \le t\}$. Similarly we let $\mathcal{M}_t$ be the $\sigma$-algebra generated by $\{X_r ; r \le t\}$. We have established earlier (see Theorem 5.2.1) that $X_t$ is measurable with respect to $\mathcal{F}_t^{(m)}$, so $\mathcal{M}_t \subseteq \mathcal{F}_t^{(m)}$.

We now prove that $X_t$ satisfies the important Markov property: The future behaviour of the process given what has happened up to time $t$ is the same as the behaviour obtained when starting the process at $X_t$. The precise mathematical formulation of this is the following:

Theorem 7.1.2 (The Markov property for Itô diffusions) Let $f$ be a bounded Borel function from $\mathbb{R}^n$ to $\mathbb{R}$. Then, for $t, h \ge 0$,
$$E^x[f(X_{t+h})|\mathcal{F}_t^{(m)}](\omega) = E^{X_t(\omega)}[f(X_h)] . \qquad (7.1.8)$$
(See Appendix B for definition and basic properties of conditional expectation.) Here and in the following $E^x$ denotes the expectation w.r.t. the probability measure $Q^x$. Thus $E^y[f(X_h)]$ means $E[f(X_h^y)]$, where $E$ denotes the expectation w.r.t. the measure $P^0$. The right hand side means the function $E^y[f(X_h)]$ evaluated at $y = X_t(\omega)$.

Proof. Since, for $r \ge t$,
$$X_r(\omega) = X_t(\omega) + \int_t^r b(X_u)\,du + \int_t^r \sigma(X_u)\,dB_u ,$$
we have by uniqueness $X_r(\omega) = X_r^{t,X_t}(\omega)$. In other words, if we define
$$F(x, t, r, \omega) = X_r^{t,x}(\omega) \qquad \text{for } r \ge t ,$$
we have
$$X_r(\omega) = F(X_t, t, r, \omega) ; \qquad r \ge t . \qquad (7.1.9)$$
Note that $\omega \to F(x,t,r,\omega)$ is independent of $\mathcal{F}_t^{(m)}$. Using (7.1.9) we may rewrite (7.1.8) as
$$E[f(F(X_t, t, t+h, \omega))|\mathcal{F}_t^{(m)}] = E[f(F(x, 0, h, \omega))]_{x=X_t} . \qquad (7.1.10)$$
Put $g(x,\omega) = f \circ F(x, t, t+h, \omega)$. Then $(x,\omega) \to g(x,\omega)$ is measurable (see Exercise 7.6). Hence we can approximate $g$ pointwise boundedly by functions of the form
$$\sum_{k=1}^m \phi_k(x)\psi_k(\omega) .$$
Using the properties of conditional expectation (see Appendix B) we get
$$E[g(X_t,\omega)|\mathcal{F}_t^{(m)}] = E\Big[\lim \sum_k \phi_k(X_t)\psi_k(\omega)\,\Big|\,\mathcal{F}_t^{(m)}\Big] = \lim \sum_k \phi_k(X_t) \cdot E[\psi_k(\omega)|\mathcal{F}_t^{(m)}]$$
$$= \lim \sum_k E[\phi_k(y)\psi_k(\omega)|\mathcal{F}_t^{(m)}]_{y=X_t} = E[g(y,\omega)|\mathcal{F}_t^{(m)}]_{y=X_t} = E[g(y,\omega)]_{y=X_t} .$$
Therefore, since $\{X_t\}$ is time-homogeneous,
$$E[f(F(X_t, t, t+h, \omega))|\mathcal{F}_t^{(m)}] = E[f(F(y, t, t+h, \omega))]_{y=X_t} = E[f(F(y, 0, h, \omega))]_{y=X_t} ,$$
which is (7.1.10).
Remark. Theorem 7.1.2 states that Xt is a Markov process w.r.t. the family (m) (m) of σ-algebras {Ft }t≥0 . Note that since Mt ⊆ Ft this implies that Xt is also a Markov process w.r.t. the σ-algebras {Mt }t≥0 . This follows from Theorem B.3 and Theorem B.2 c)( Appendix B): (m)
E x [f (Xt+h )|Mt ] = E x [E x [f (Xt+h )|Ft ]|Mt ] = E x [E Xt [f (Xh )]|Mt ] = E Xt [f (Xh )] since E Xt [f (Xh )] is Mt -measurable.
7.2 The Strong Markov Property Roughly, the strong Markov property states that a relation of the form (7.1.8) continues to hold if the time t is replaced by a random time τ (ω) of a more general type called stopping time (or Markov time):
7.2 The Strong Markov Property
119
Definition 7.2.1 Let {Nt } be an increasing family of σ-algebras (of subsets of Ω). A function τ : Ω → [0, ∞] is called a (strict) stopping time w.r.t. {Nt } if for all t ≥ 0 . {ω; τ (ω) ≤ t} ∈ Nt , In other words, it should be possible to decide whether or not τ ≤ t has occurred on the basis of the knowledge of Nt . Note that if τ (ω) = t0 (constant) for all ω, then τ is trivially a stopping time w.r.t. any filtration, because in this case % Ω if t0 ≤ t {ω : τ (ω) ≤ t} = ∅ if t0 > t Example 7.2.2 Let U ⊂ Rn be open. Then the first exit time τU : = inf{t > 0; Xt ∈ / U} is a stopping time w.r.t. {Mt }, since {ω; τU ≤ t} = {ω; Xr ∈ / K m } ∈ Mt m
r∈Q r 0; Xt ∈ If we include the sets of measure 0 in Mt (which we do) * then the family {Mt } Ms (see Chung (1982, is right-continuous i.e. Mt = Mt+ , where Mt+ = s>t
Theorem 2.3.4., p. 61)) and therefore τH is a stopping time for any Borel set H (see Dynkin (1965 II, 4.5.C.e.), p. 111)). Definition 7.2.3 Let τ be a stopping time w.r.t. {Nt } and let N∞ be the smallest σ-algebra containing Nt for all t ≥ 0. Then the σ-algebra Nτ consists of all sets N ∈ N∞ such that N ∩ {τ ≤ t} ∈ Nt
for all t ≥ 0 .
In the case when Nt = Mt , an alternative and more intuitive description is: Mτ = the σ-algebra generated by {Xmin(s,τ ) ; s ≥ 0} .
(7.2.1)
(See Rao (1977, p. 2.15) or Stroock and Varadhan (1979, Lemma 1.3.3, p. 33).) (m) Similarly, if Nt = Ft , we get Fτ(m) = the σ-algebra generated by {Bs∧τ ; s ≥ 0} .
120
7 Diffusions: Basic Properties
Theorem 7.2.4 (The strong Markov property for Itˆ o diffusions) (m) Let f be a bounded Borel function on Rn , τ a stopping time w.r.t. Ft , τ < ∞ a.s. Then E x [f (Xτ +h )|Fτ(m) ] = E Xτ [f (Xh )]
for all h ≥ 0 .
(7.2.2)
Proof. We try to imitate the proof of the Markov property (Theorem 7.1.2). For a.a. ω we have that Xrτ,x (ω) satisfies τ +h
Xττ,x +h
τ +h
b(Xuτ,x )du
=x+
σ(Xuτ,x )dBu .
+
τ
τ
By the strong Markov property for Brownian motion (Gihman and Skorohod (1972, p. 30)) the process v = Bτ +v − Bτ ; B
v≥0 (m)
is again a Brownian motion and independent of Fτ
h Xττ,x +h
h b(Xττ,x +v )dv
=x+
+
0
. Therefore
σ(Xττ,x +v )dBv .
0
Hence {Xττ,x +h }h≥0 must coincide a.e. with the strongly unique (see (5.2.8)) solution Yh of the equation
h Yh = x +
h b(Yv )dv +
0
v . σ(Yv )dB
0 (m)
Since {Yh }h≥0 is independent of Fτ , {Xττ,x +h } must be independent also. Moreover, by weak uniqueness (Lemma 5.3.1) we conclude that 0,x {Yh }h≥0 , and hence {Xττ,x +h }h≥0 , has the same law as {Xh }h≥0 .
Put F (x, t, r, ω) = Xrt,x (ω)
for r ≥ t .
Then (7.2.2) can be written E[f (F (x, 0, τ + h, ω))|Fτ(m) ] = E[f (F (x, 0, h, ω))]x=Xτ0,x . Now, with Xt = Xt0,x ,
(7.2.3)
7.2 The Strong Markov Property τ +h
F (x, 0, τ + h, ω) = Xτ +h (ω) = x +
τ =x+
τ b(Xs )ds +
0
τ +h
= Xτ +
τ +h
b(Xs )ds + 0
τ +h
σ(Xs )dBs + 0
τ
σ(Xs )dBs 0
τ +h
b(Xs )ds + τ
τ +h
b(Xs )ds +
121
σ(Xs )dBs τ
σ(Xs )dBs τ
= F (Xτ , τ, τ + h, ω) . Hence (7.2.2) gets the form E[f (F (Xτ , τ, τ + h, ω))|Fτ(m) ] = E[f (F (x, 0, h, ω))]x=Xτ . Put g(x, t, r, ω) = f (F (x, t, r, ω)). As in the proof of Theorem 7.1.2 we may assume that g has the form φk (x)ψk (t, r, ω) . g(x, t, r, ω) = k (m)
Then, since Xττ,x of Fτ we get, using (7.2.3) +h is independent (m) E[g(Xτ , τ, τ + h, ω)|Fτ ] = E[φk (Xτ )ψk (τ, τ + h, ω)|Fτ(m) ] =
k
φk (Xτ )E[ψk (τ, τ +h, ω)|Fτ(m)] =
k
= =
E[φk (x)ψk (τ, τ +h, ω)|Fτ(m) ]x=Xτ
k (m) E[g(x, τ, τ + h, ω)|Fτ ]x=Xτ = E[g(x, τ, τ + h, ω)]x=Xτ 0,x E[f (Xττ,x +h )]x=Xτ = E[f (Xh )]x=Xτ = E[f (F (x, 0, h, ω))]x=Xτ
.
We now extend (7.2.2) to the following: (m) If f1 , · · · , fk are bounded Borel functions on Rn , τ an Ft -stopping time, τ < ∞ a.s. then E x [f1 (Xτ +h1 )f2 (Xτ +h2 ) · · · fk (Xτ +hk )|Fτ(m) ] = E Xτ [f1 (Xh1 ) · · · fk (Xhk )] (7.2.4) for all 0 ≤ h1 ≤ h2 ≤ · · · ≤ hk . This follows by induction: To illustrate the argument we prove it in the case k = 2: (m)
E x [f1 (Xτ +h1 )f2 (Xτ +h2 )|Fτ(m) ] = E x [E x [f1 (Xτ +h1 )f2 (Xτ +h2 )|Fτ +h1 ]|Fτ(m) ] (m)
= E x [f1 (Xτ +h1 )E x [f2 (Xτ +h2 )|Fτ +h1 ]|Fτ(m) ] = E x [f1 (Xτ +h1 )E Xτ +h1 [f2 (Xh2 −h1 )]|Fτ(m) ] = E Xτ [f1 (Xh1 )E Xh1 [f2 (Xh2 −h1 )]] (m)
= E Xτ [f1 (Xh1 )E x [f2 (Xh2 )|Fh1 ]] = E Xτ [f1 (Xh1 )f2 (Xh2 )] ,
as claimed .
122
7 Diffusions: Basic Properties
Next we proceed to formulate the general version we need: Let H be the set of all real M∞ -measurable functions. For t ≥ 0 we define the shift operator θt : H → H as follows: If η = g1 (Xt1 ) · · · gk (Xtk ) (gi Borel measurable, ti ≥ 0) we put θt η = g1 (Xt1 +t ) · · · gk (Xtk +t ) . Now extend in the natural way to all functions in H by taking limits of sums of such functions. Then it follows from (7.2.4) that E x [θτ η|Fτ(m) ] = E Xτ [η]
(7.2.5)
for all stopping times τ and all bounded η ∈ H, where (θτ η)(ω) = (θt η)(ω)
if τ (ω) = t .
Hitting distribution, harmonic measure and the mean value property We will apply this to the following situation: Let H ⊂ Rn be measurable and let τH be the first exit time from H for an Itˆo diffusion Xt . Let α be another stopping time, g a bounded continuous function on Rn and put α τH = inf{t > α; Xt ∈ / H} .
η = g(XτH )X{τH 2.
2−n ,
7.5 The Characteristic Operator
129
7.5 The Characteristic Operator We now introduce an operator which is closely related to the generator A, but is more suitable in many situations, for example in the solution of the Dirichlet problem. o diffusion. The characteristic operator Definition 7.5.1 Let {Xt } be an Itˆ A = AX of {Xt } is defined by E x [f (XτU )] − f (x) , U↓x E x [τU ]
Af (x) = lim
(7.5.1)
where the U s are * open sets Uk decreasing to the point x, in the sense that / U } is the first exit Uk+1 ⊂ Uk and Uk = {x}, and τU = inf{t > 0; Xt ∈ k
time from U for Xt . The set of functions f such that the limit (7.5.1) exists for all x ∈ Rn (and all {Uk }) is denoted by DA . If E x [τU ] = ∞ for all open U x, we define Af (x) = 0. It turns out that DA ⊆ DA always and that Af = Af
for all f ∈ DA .
(See Dynkin (1965 I, p. 143).) We will only need that AX and LX coincide on C 2 . To obtain this we first clarify a property of exit times. Definition 7.5.2 A point x ∈ Rn is called a trap for {Xt } if Qx ({Xt = x for all t}) = 1 . In other words, x is trap if and only if τ{x} = ∞ a.s. Qx . For example, if b(x0 ) = σ(x0 ) = 0, then x0 is a trap for Xt (by strong uniqueness of Xt ). Lemma 7.5.3 If x is not a trap for Xt , then there exists an open set U x such that E x [τU ] < ∞ . Proof. See Lemma 5.5 p. 139 in Dynkin (1965 I). Theorem 7.5.4 Let f ∈ C 2 . Then f ∈ DA and Af =
i
bi
∂f + ∂xi
1 2
i,j
(σσ T )ij
∂2f . ∂xi ∂xj
(7.5.2)
Proof. As before we let L denote the operator defined by the right hand side of (7.5.2). If x is a trap for {Xt } then Af (x) = 0. Choose a bounded open set V such that x ∈ V . Modify f to f0 outside V such that f0 ∈ C02 (Rn ). Then f0 ∈ DA (x) and 0 = Af0 (x) = Lf0 (x) = Lf (x). Hence Af (x) = Lf (x) = 0
130
7 Diffusions: Basic Properties
in this case. If x is not a trap, choose a bounded open set U x such that E x [τU ] < ∞. Then by Dynkin’s formula (Theorem 7.4.1) (and the following Remark (i)), writing τU = τ τ |E x [ {(Lf )(Xs ) − Lf (x)}ds]| x E [f (Xτ )] − f (x) 0 − Lf (x) = x E [τ ] E x [τ ] ≤ sup |Lf (x) − Lf (y)| → 0
as U ↓ x ,
y∈U
since Lf is a continuous function.
Remark. We have now obtained that an Itˆo diffusion is a continuous, strong Markov process such that the domain of definition of its characteristic operator includes C 2 . Thus an Itˆo diffusion is a diffusion in the sense of Dynkin (1965 I). Example 7.5.5 (Brownian motion on the unit circle) The characterisY1 tic operator of the process Y = from Example 5.1.4 satisfying the Y2 stochastic differential equations (5.1.13), i.e. ⎧ ⎨ dY1 = − 21 Y1 dt − Y2 dB ⎩
dY2 = − 21 Y2 dt + Y1 dB
is Af (y1 , y2 ) =
1 2
#
∂2f y22 2 ∂y1
$ 2 ∂2f ∂f ∂f 2∂ f − 2y1 y2 + y1 2 − y1 − y2 . ∂y1 ∂y2 ∂y2 ∂y1 ∂y2
This is because dY = − 12 Y dt + KY dB, where 0 −1 K= 1 0 so that dY = b(Y )dt + σ(Y )dB
with b(y1 , y2 ) =
− 12 y1 − 12 y2
and a=
T 1 2 σσ
=
1 2
,
y22 −y1 y2
σ(y1 , y2 ) = −y1 y2 y12
.
−y2 y1
Exercises
131
Example 7.5.6 Let D be an open subset of Rn such that τD < ∞ a.s. Qx for all x. Let φ be a bounded, measurable function on ∂D and define φ(x) = E x [φ(XτD )] (φ is called the X-harmonic extension of φ). Then if U is open, x ∈ U ⊂⊂ D, we have by (7.2.8) that τ )] = E x [E XτU [φ(Xτ )]] = E x [φ(Xτ )] = φ(x) E x [φ(X . D D U So φ ∈ DA and Aφ = 0 in D , in spite of the fact that in general φ need not even be continuous in D (See Example 9.2.1).
Exercises 7.1.* Find the generator of the following Itˆo diffusions: a) dXt = μXt dt + σdBt (The Ornstein-Uhlenbeck process) (Bt ∈ R; μ, σ constants). b) dXt = rXt dt + αXt dBt (The geometric Brownian motion) (Bt ∈ R; r, α constants). c) dYt = r dt + αYt dBt (Bt ∈ R; r, α constants) # $ dt d) dYt = where Xt is as in a) dXt # $ # $ # $ dX1 1 0 e) = dt + X1 dBt (Bt ∈ R) dX2 X2 e # $ # $ # $# $ dX1 1 1 0 dB1 f) = dt + 0 0 X1 dX2 dB2 g) X(t) = (X1 , X2 , · · · , Xn ), where dXk (t) = rk Xk dt + Xk ·
n
αkj dBj ;
1≤k≤n
j=1
((B1 , · · · , Bn ) is Brownian motion in Rn , rk and αkj are constants). 7.2.* Find an Itˆ o diffusion (i.e. write down the stochastic differential equation for it) whose generator is the following: a) Af (x) = f (x) + f (x) ; f ∈ C02 (R)
132
7. Diffusions: Basic Properties 2
∂f 1 2 2∂ f 2 2 b) Af (t, x) = ∂f ∂t + cx ∂x + 2 α x ∂x2 ; f ∈ C0 (R ), where c, α are constants. ∂f ∂f c) Af (x1 , x2 ) = 2x2 ∂x + ln(1 + x21 + x22 ) ∂x 1 2 2
2
f + 12 (1 + x21 ) ∂∂xf2 + x1 ∂x∂1 ∂x + 2 1
7.3.
1 2
·
∂2f ∂x22
; f ∈ C02 (R2 ).
Let Bt be Brownian motion on R, B0 = 0 and define Xt = Xtx = x · ect+αBt , where c, α are constants. Prove directly from the definition that Xt is a Markov process.
7.4.* Let Btx be 1-dimensional Brownian motion starting at x ∈ R+ . Put τ = inf{t > 0; Btx = 0} . a) Prove that τ < ∞ a.s. P x for all x > 0. (Hint: See Example 7.4.2, second part). b) Prove that E x [τ ] = ∞ for all x > 0. (Hint: See Example 7.4.2, first part). 7.5.
Let the functions b, σ satisfy condition (5.2.1) of Theorem 5.2.1, with a constant C independent of t, i.e. |b(t, x)| + |σ(t, x)| ≤ C(1 + |x|)
for all x ∈ Rn and all t ≥ 0 .
Let Xt be a solution of dXt = b(t, Xt )dt + σ(t, Xt )dBt . Show that E[|Xt |2 ] ≤ (1 + E[|X0 |2 ])eKt − 1 for some constant K independent of t. (Hint: Use Dynkin’s formula with f (x) = |x|2 and τ = t ∧ τR , where τR = inf {t > 0; |Xt | ≥ R}, and let R → ∞ to achieve the inequality 2
2
t
E[|Xt | ] ≤ E[|X0 | ] + K ·
(1 + E[|Xs |2 ])ds ,
0
which is of the form (5.2.9).) 7.6.
Let g(x, ω) = f ◦ F (x, t, t + h, ω) be as in the proof of Theorem 7.1.2. Assume that f is continuous.
Exercises
133
a) Prove that the map x → g(x, ·) is continuous from Rn into L2 (P ) by using (5.2.9). For simplicity assume that n = 1 in the following. b) Use a) to prove that (x, ω) → g(x, ω) is measurable. (Hint: For each (m) m = 1, 2, . . . put ξk = ξk = k · 2−m , k = 1, 2, . . . Then g(ξk , ·) · X{ξk ≤x 0
where Bt ∈ R; r, α are constants. a) Find the generator A of Xt and compute Af (x) when f (x) = xγ ; x > 0, γ constant. b) If r < 12 α2 then Xt → 0 as t → ∞, a.s. Qx (Example 5.1.1). But what is the probability p that Xt , when starting from x < R, ever hits the value R ? Use Dynkin’s formula with f (x) = xγ1 , γ1 = 1 − α2r2 , to prove that γ1 x p= . R c) If r > 12 α2 then Xt → ∞ as t → ∞, a.s. Qx . Put τ = inf{t > 0; Xt ≥ R} . Use Dynkin’s formula with f (x) = ln x, x > 0 to prove that E x [τ ] =
ln R x . r − 12 α2
(Hint: First consider exit times from (ρ, R), ρ > 0 and then let ρ → 0. You need estimates for (1 − p(ρ)) ln ρ , where p(ρ) = Qx [Xt reaches the value R before ρ ] , which you can get from the calculations in a), b).) 7.10. Let Xt be the geometric Brownian motion dXt = rXt dt + αXt dBt .
Exercises
135
Find E x [XT |Ft ] for t ≤ T by a) using the Markov property and b) writing Xt = x ert Mt , where Mt = exp(αBt − 12 α2 t)
is a martingale .
o diffusion in Rn and let f : Rn → R be a function such 7.11. Let Xt be an Itˆ that $ # ∞ Ex |f (Xt )|dt < ∞ for all x ∈ Rn . 0
Let τ be a stopping time. Use the strong Markov property to prove that # ∞ E
x
$ f (Xt )dt = E x [g(Xτ )] ,
τ
where # ∞ g(y) = E y
$ f (Xt )dt .
0
7.12. (Local martingales) An Nt -adapted stochastic process Z(t) ∈ Rn is called a local martingale with respect to the given filtration {Nt } if there exists an increasing sequence of Nt -stopping times τk such that τk → ∞ a.s. as k → ∞ and Z(t ∧ τk )
is an Nt -martingale for all k .
a) Show that if Z(t) is a local martingale and there exists a constant T ≤ ∞ such that the family {Z(τ )}τ ≤T is uniformly integrable (Appendix C) then {Z(t)}t≤T is a martingale. b) In particular, if Z(t) is a local martingale and there exists a constant K < ∞ such that E[Z 2 (τ )] ≤ K for all stopping times τ ≤ T , then {Z(t)}t≤T is a martingale. c) Show that if Z(t) is a lower bounded local martingale, then Z(t) is a supermartingale (Appendix C). d) Let φ ∈ W(0, T ). Show that
136
7. Diffusions: Basic Properties
t Z(t) :=
0≤t≤T
φ(s, ω)dB(s) ; 0
is a local martingale. 7.13. a) Let Bt ∈ R2 , B0 = x = 0. Fix 0 < < R < ∞ and define Xt = ln |Bt∧τ | ;
t≥0
where τ = inf {t > 0; |Bt | ≤
or |Bt | ≥ R} .
Prove that Xt is an Ft∧τ -martingale. (Hint: Use Exercise 4.8.) Deduce that ln |Bt | is a local martingale (Exercise 7.12). b) Let Bt ∈ Rn for n ≥ 3, B0 = x = 0. Fix > 0, R < ∞ and define Yt = |Bt∧τ |2−n ;
t≥0
where τ = inf{t > 0; |Bt | ≤
or |Bt | ≥ R} .
Prove that Yt is an Ft∧τ -martingale. Deduce that |Bt |2−n is a local martingale. 7.14. (Doob’s h-transform) Let Bt be n-dimensional Brownian motion, D ⊂ Rn a bounded open set and h > 0 a harmonic function on D (i.e. Δh = 0 in D). Let Xt be the solution of the stochastic differential equation dXt = ∇(ln h)(Xt )dt + dBt More precisely, choose an increasing sequence {Dk } of open subsets of ∞ Dk = D. Then for each k the equation D such that Dk ⊂ D and k=1
above can be solved (strongly) for t < τDk . This gives in a natural way a solution for t < τ : = lim τDk . k→∞
a) Show that the generator A of Xt satisfies Af =
Δ(hf ) 2h
for f ∈ C02 (D) .
In particular, if f = h1 then Af = 0. b) Use a) to show that if there exists x0 ∈ ∂D such that % 0 if y = x0 lim h(x) = ∞ if y = x0 x→y∈∂D (i.e. h is a kernel function), then
Exercises
137
lim Xt = x0 a.s.
t→τ
(Hint: Consider E x [f (XT )] for suitable stopping times T and with f = h1 ) In other words, we have imposed a drift on Bt which causes the process to exit from D at the point x0 only. This can also be formulated as follows: Xt is obtained by conditioning Bt to exit from D at x0 . See Doob (1984). 7.15.* Let Bt be 1-dimensional and define F (ω) = (BT (ω) − K)+ where K > 0, T > 0 are constants. By the Itˆo representation theorem (Theorem 4.3.3) we know that there exists φ ∈ V(0, T ) such that
T φ(t, ω)dBt .
F (ω) = E[F ] + 0
How do we find φ explicitly? This problem is of interest in mathematical finance, where φ may be regarded as the replicating portfolio for the contingent claim F (see Chapter 12). Using the Clark-Ocone formula (see Karatzas and Ocone (1991), Øksendal (1996) or Aase et al (2000)) one can deduce that φ(t, ω) = E[X[K,∞) (BT )|Ft ] ;
t 0; Xt = 0} and put
Xt for t ≤ τ 0 for t > τ . Prove that Yt is also a (strong) solution of (7.5.8). Why does not this contradict the uniqueness assertion of Theorem 5.2.1? (Hint: Verify that Yt =
t Yt = x +
1 1/3 ds 3 Ys
t +
0
Ys2/3 dBs
0
for all t by splitting the integrals as follows:
t
t∧τ =
0
t +
0
.)
t∧τ
7.18.* a) Let dXt = b(Xt )dt + σ(Xt )dBt ;
X0 = x
be a 1-dimensional Itˆ o diffusion with characteristic operator A. Let f ∈ C 2 (R) be a solution of the differential equation Af (x) = b(x)f (x) + 12 σ 2 (x)f (x) = 0 ;
x∈R.
(7.5.9)
Let (a, b) ⊂ R be an open interval such that x ∈ (a, b) and put τ = inf{t > 0; Xt ∈ (a, b)} . Assume that τ < ∞ a.s. Qx and define p = Qx [Xτ = b] . Use Dynkin’s formula to prove that p=
f (x) − f (a) . f (b) − f (a)
(7.5.10)
Exercises
139
In other words, the harmonic measure μx(a,b) of X on ∂(a, b) = {a, b} is given by μx(a,b) (b) =
f (x) − f (a) , f (b) − f (a)
μx(a,b) (a) =
f (b) − f (x) . f (b) − f (a)
(7.5.11)
b) Now specialize to the process X t = x + Bt ; Prove that p=
t≥0.
x−a . b−a
(7.5.12)
c) Find p if Xt = x + μt + σBt ;
t≥0
where μ, σ ∈ R are nonzero constants. 7.19. Let Btx be 1-dimensional Brownian motion starting at x > 0. Define τ = τ (x, ω) = inf{t > 0; Btx (ω) = 0} . From Exercise 7.4 we know that τ < ∞ a.s. P x and E x [τ ] = ∞ . What is the distribution of the random variable τ (ω) ? a) To answer this, first find the Laplace transform for λ > 0 . g(λ): = E x [e−λτ ] √ (Hint: Let Mt = exp(− 2λ Bt − λt). Then {Mt∧τ }t≥0 is a bounded martingale. ) √ [Solution: g(λ) = exp(− 2λ x) .] b) To find the density f (t) of τ it suffices to find f (t) = f (t, x) such that
∞ √ e−λt f (t)dt = exp(− 2λ x) for all λ > 0 0
i.e. to find the inverse Laplace transform of g(λ). Verify that x x2 f (t, x) = √ exp − ; t>0. 2t 2πt3
140
7. Diffusions: Basic Properties
7.20. (Population growth in a stochastic, crowded environment (II))
As an alternative to the model in Exercise 5.15 consider the equation

    dXt = r Xt(K − Xt)dt + α Xt(K − Xt)dBt ;   X0 = x ≥ 0 .

This equation does not satisfy the conditions for existence and uniqueness in Theorem 5.2.1. However, we can still prove that a unique strong solution exists by proceeding as follows:
a) For n = 1, 2, . . . define

    bn(y) = r y(K − y) if 0 ≤ y ≤ n ;   bn(y) = r n(K − n) if y > n

and

    σn(y) = α y(K − y) if 0 ≤ y ≤ n ;   σn(y) = α n(K − n) if y > n

and let Xt = Xt^(n) be the unique solution of

    dXt = bn(Xt)dt + σn(Xt)dBt ;   X0 = x .

Define

    τn = inf{t > 0 ; Xt^(n) = n} .

Show that

    Xt^(n) = Xt^(n+1)   for all t < τn

and use this to find a unique strong solution Xt for t < τ∞ := lim_{n→∞} τn.
b) Prove that τ∞ = ∞ a.s.
c) Prove that
(i) X0 = 0 ⇒ Xt = 0 for all t
(ii) X0 = K ⇒ Xt = K for all t
(iii) 0 < X0 < K ⇒ 0 < Xt < K for all t
(iv) X0 > K ⇒ Xt > K for all t.

For a discussion of optimal harvesting from this population model see Lungu and Øksendal (1997).
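The behaviour claimed in part c) can be illustrated by simulation. The following is a minimal Euler-Maruyama sketch (not part of the exercise; the parameter values r = α = K = 1 and the step size are illustrative only). Both coefficients vanish at X = 0 and X = K, so these points are fixed points of the scheme, in line with claims (i) and (ii).

```python
import math
import random

def simulate(x0, r, K, alpha, dt, n_steps, seed=1):
    """Euler-Maruyama sketch for dX = r X(K-X) dt + alpha X(K-X) dB."""
    rng = random.Random(seed)
    x = x0
    path = [x]
    sd = math.sqrt(dt)
    for _ in range(n_steps):
        dB = rng.gauss(0.0, sd)
        x = x + r * x * (K - x) * dt + alpha * x * (K - x) * dB
        path.append(x)
    return path

# An interior start stays (numerically) inside (0, K); starts at 0 or K stay put.
path = simulate(x0=0.5, r=1.0, K=1.0, alpha=1.0, dt=1e-3, n_steps=5000)
print(min(path), max(path))
```

For a rigorous proof one still needs the truncation argument of part a); the simulation only makes the invariance of 0, K and (0, K) plausible.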
8 Other Topics in Diffusion Theory
In this chapter we study some other important topics in diffusion theory and related areas. Some of these topics are not strictly necessary for the remaining chapters, but they are all central in the theory of stochastic analysis and essential for further applications. The following topics will be treated:

8.1 Kolmogorov's backward equation. The resolvent.
8.2 The Feynman-Kac formula. Killing.
8.3 The martingale problem.
8.4 When is an Itô process a diffusion?
8.5 Random time change.
8.6 The Girsanov formula.

B. Øksendal, Stochastic Differential Equations, Universitext, DOI 10.1007/978-3-642-14394-6_8, © Springer-Verlag Berlin Heidelberg 2013

8.1 Kolmogorov's Backward Equation. The Resolvent

In the following we let Xt be an Itô diffusion in Rn with generator A. If we choose f ∈ C₀²(Rn) and τ = t in Dynkin's formula (7.4.1) we see that u(t, x) = Ex[f(Xt)] is differentiable with respect to t and

    ∂u/∂t = Ex[Af(Xt)] .        (8.1.1)

It turns out that the right hand side of (8.1.1) can be expressed in terms of u also:

Theorem 8.1.1 (Kolmogorov's backward equation). Let f ∈ C₀²(Rn).
a) Define

    u(t, x) = Ex[f(Xt)] .        (8.1.2)

Then u(t, ·) ∈ DA for each t and

    ∂u/∂t = Au ;   t > 0, x ∈ Rn        (8.1.3)
    u(0, x) = f(x) ;   x ∈ Rn        (8.1.4)
where the right hand side is to be interpreted as A applied to the function x → u(t, x).
b) Moreover, if w(t, x) ∈ C^{1,2}(R × Rn) is a bounded function satisfying (8.1.3), (8.1.4) then w(t, x) = u(t, x), given by (8.1.2).

Proof. a) Let g(x) = u(t, x). Then since t → u(t, x) is differentiable we have

    (Ex[g(Xr)] − g(x))/r = (1/r) · Ex[E^{Xr}[f(Xt)] − Ex[f(Xt)]]
        = (1/r) · Ex[Ex[f(X_{t+r})|Fr] − Ex[f(Xt)|Fr]]
        = (1/r) · Ex[f(X_{t+r}) − f(Xt)]
        = (u(t + r, x) − u(t, x))/r → ∂u/∂t   as r ↓ 0 .

Hence

    Au = lim_{r↓0} (Ex[g(Xr)] − g(x))/r

exists and ∂u/∂t = Au, as asserted.

b) Conversely, to prove the uniqueness statement in b) assume that a function w(t, x) ∈ C^{1,2}(R × Rn) satisfies (8.1.3)–(8.1.4). Then

    Âw := −∂w/∂t + Aw = 0   for t > 0, x ∈ Rn        (8.1.5)

and

    w(0, x) = f(x) ,   x ∈ Rn .        (8.1.6)

Fix (s, x) ∈ R × Rn. Define the process Yt in R^{n+1} by Yt = (s − t, Xt^{0,x}), t ≥ 0. Then Yt has generator Â, and so by (8.1.5) and Dynkin's formula we have, for all t ≥ 0,

    E^{s,x}[w(Y_{t∧τR})] = w(s, x) + E^{s,x}[∫₀^{t∧τR} Âw(Yr)dr] = w(s, x) ,

where τR = inf{t > 0; |Xt| ≥ R}. Letting R → ∞ we get

    w(s, x) = E^{s,x}[w(Yt)] ;   ∀t ≥ 0 .

In particular, choosing t = s we get

    w(s, x) = E^{s,x}[w(Ys)] = E[w(0, Xs^{0,x})] = E[f(Xs^{0,x})] = Ex[f(Xs)] .
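Kolmogorov's backward equation can be checked in a case where u is known in closed form. The sketch below is an illustration not taken from the book: for Brownian motion (A = ½ d²/dx²) and f(x) = cos x one has u(t, x) = Ex[cos Bt] = e^{−t/2} cos x, since E[e^{iBt}] = e^{−t/2}; we verify ∂u/∂t = ½ ∂²u/∂x² by finite differences.

```python
import math

def u(t, x):
    # Closed form for Brownian motion: E^x[cos(B_t)] = exp(-t/2) cos(x).
    return math.exp(-t / 2.0) * math.cos(x)

t, x, h = 1.0, 0.7, 1e-5
du_dt = (u(t + h, x) - u(t - h, x)) / (2 * h)                  # left side of (8.1.3)
Au = 0.5 * (u(t, x + h) - 2 * u(t, x) + u(t, x - h)) / h**2    # A u = (1/2) u_xx
print(du_dt, Au)
```

The two finite-difference values agree to roughly 1e-6, as the backward equation predicts for this choice of A and f.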
Remark. If we introduce the operator Qt : f → E•[f(Xt)] then we have u(t, x) = (Qt f)(x) and we may rewrite (8.1.1) and (8.1.3) as follows:

    (d/dt)(Qt f) = Qt(Af) ;   f ∈ C₀²(Rn)        (8.1.1)'

    (d/dt)(Qt f) = A(Qt f) ;   f ∈ C₀²(Rn) .        (8.1.3)'

Thus the equivalence of (8.1.1) and (8.1.3) amounts to saying that the operators Qt and A commute, in some sense. Arguing formally, it is tempting to say that the solution of (8.1.1)' and (8.1.3)' is Qt = e^{tA} and therefore Qt A = AQt. However, this argument would require a further explanation, because in general A is an unbounded operator.

It is an important fact that if a positive multiple of the identity is subtracted from A, then the resulting operator always has an inverse. This inverse can be expressed explicitly in terms of the diffusion Xt:

Definition 8.1.2. For α > 0 and g ∈ Cb(Rn) we define the resolvent operator Rα by

    Rα g(x) = Ex[∫₀^∞ e^{−αt} g(Xt)dt] .        (8.1.7)

Lemma 8.1.3. Rα g is a bounded continuous function.

Proof. Since Rα g(x) = ∫₀^∞ e^{−αt} Ex[g(Xt)]dt, we see that Lemma 8.1.3 is a direct consequence of the next result:

Lemma 8.1.4. Let g be a lower bounded, measurable function on Rn and define, for fixed t ≥ 0,

    u(x) = Ex[g(Xt)] .

a) If g is lower semicontinuous, then u is lower semicontinuous.
b) If g is bounded and continuous, then u is continuous.
In other words, any Itô diffusion Xt is Feller-continuous.

Proof. By (5.2.10) we have

    E[|Xt^x − Xt^y|²] ≤ |y − x|² C(t) ,
where C(t) does not depend on x and y. Let {yn} be a sequence of points converging to x. Then

    Xt^{yn} → Xt^x   in L²(Ω, P) as n → ∞ .

So, by taking a subsequence {zn} of {yn} we obtain that

    Xt^{zn}(ω) → Xt^x(ω)   for a.a. ω ∈ Ω .

a) If g is lower bounded and lower semicontinuous, then by the Fatou lemma

    u(x) = E[g(Xt^x)] ≤ E[lim inf_{n→∞} g(Xt^{zn})] ≤ lim inf_{n→∞} E[g(Xt^{zn})] = lim inf_{n→∞} u(zn) .

Therefore every sequence {yn} converging to x has a subsequence {zn} such that u(x) ≤ lim inf_{n→∞} u(zn). That proves that u is lower semicontinuous.
b) If g is bounded and continuous, the result in a) can be applied both to g and −g. Hence both u and −u are lower semicontinuous and we conclude that u is continuous.

We now prove that Rα and α − A are inverse operators:

Theorem 8.1.5
a) If f ∈ C₀²(Rn) then Rα(α − A)f = f for all α > 0.
b) If g ∈ Cb(Rn) then Rα g ∈ DA and (α − A)Rα g = g for all α > 0.

Proof. a) If f ∈ C₀²(Rn) then by Dynkin's formula

    Rα(α − A)f(x) = (αRα f − Rα Af)(x)
        = α ∫₀^∞ e^{−αt} Ex[f(Xt)]dt − ∫₀^∞ e^{−αt} Ex[Af(Xt)]dt
        = [−e^{−αt} Ex[f(Xt)]]₀^∞ + ∫₀^∞ e^{−αt} (d/dt)Ex[f(Xt)]dt − ∫₀^∞ e^{−αt} Ex[Af(Xt)]dt
        = Ex[f(X0)] = f(x) .

b) If g ∈ Cb(Rn) then by the Markov property

    Ex[Rα g(Xt)] = Ex[E^{Xt}[∫₀^∞ e^{−αs} g(Xs)ds]]
        = Ex[Ex[θt(∫₀^∞ e^{−αs} g(Xs)ds)|Ft]] = Ex[Ex[∫₀^∞ e^{−αs} g(X_{t+s})ds|Ft]]
        = Ex[∫₀^∞ e^{−αs} g(X_{t+s})ds] = ∫₀^∞ e^{−αs} Ex[g(X_{t+s})]ds .
Integration by parts gives

    Ex[Rα g(Xt)] = α ∫₀^∞ e^{−αs} (∫ₜ^{t+s} Ex[g(Xv)]dv) ds .

This identity implies that Rα g ∈ DA and

    A(Rα g) = αRα g − g .
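The identity Rα(α − A)f = f of Theorem 8.1.5 a) can be verified numerically in one dimension. The sketch below is an illustration with assumptions not stated in the text: X is Brownian motion, α = 1, and we use the standard fact that the resolvent of (α − ½ d²/dx²) has kernel (1/√(2α)) e^{−√(2α)|x−y|}; the test function f(x) = e^{−x²} is Schwartz rather than compactly supported, which is enough for the identity to hold.

```python
import math

SQ2A = math.sqrt(2.0)        # sqrt(2*alpha) with alpha = 1

def f(y):                    # test function
    return math.exp(-y * y)

def g(y):                    # (alpha - A) f = f - (1/2) f'', f''(y) = (4y^2 - 2) e^{-y^2}
    return f(y) - 0.5 * (4 * y * y - 2) * f(y)

def kernel(x, y):            # resolvent kernel of (alpha - 1/2 d^2/dx^2), alpha = 1
    return math.exp(-SQ2A * abs(x - y)) / SQ2A

def simpson(func, a, b, n=4000):   # composite Simpson rule, n even
    h = (b - a) / n
    s = func(a) + func(b)
    for i in range(1, n):
        s += func(a + i * h) * (4 if i % 2 else 2)
    return s * h / 3.0

x = 0.3
# Split the integral at y = x, where the kernel has a kink, and truncate the tails.
lhs = simpson(lambda y: kernel(x, y) * g(y), x - 12.0, x) \
    + simpson(lambda y: kernel(x, y) * g(y), x, x + 12.0)
print(lhs, f(x))
```

The quadrature reproduces f(x) to high accuracy, confirming that Rα inverts α − A on this f.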
8.2 The Feynman-Kac Formula. Killing

With a little harder work we can obtain the following useful generalization of Kolmogorov's backward equation:

Theorem 8.2.1 (The Feynman-Kac formula). Let f ∈ C₀²(Rn) and q ∈ C(Rn). Assume that q is lower bounded.
a) Put

    v(t, x) = Ex[exp(−∫₀^t q(Xs)ds) f(Xt)] .        (8.2.1)

Then

    ∂v/∂t = Av − qv ;   t > 0, x ∈ Rn        (8.2.2)
    v(0, x) = f(x) ;   x ∈ Rn        (8.2.3)

b) Moreover, if w(t, x) ∈ C^{1,2}(R × Rn) is bounded on K × Rn for each compact K ⊂ R and w solves (8.2.2), (8.2.3), then w(t, x) = v(t, x), given by (8.2.1).

Proof. a) Let Yt = f(Xt), Zt = exp(−∫₀^t q(Xs)ds). Then dYt is given by (7.3.1) and

    dZt = −Zt q(Xt)dt .

So

    d(Yt Zt) = Yt dZt + Zt dYt ,   since dZt · dYt = 0 .

Note that since Yt Zt is an Itô process it follows from Lemma 7.3.2 that v(t, x) = Ex[Yt Zt] is differentiable w.r.t. t.
Therefore, with v(t, x) as in (8.2.1) we get

    (1/r)(Ex[v(t, Xr)] − v(t, x)) = (1/r) Ex[E^{Xr}[Zt f(Xt)] − Ex[Zt f(Xt)]]
        = (1/r) Ex[Ex[f(X_{t+r}) exp(−∫₀^t q(X_{s+r})ds)|Fr] − Ex[Zt f(Xt)|Fr]]
        = (1/r) Ex[Z_{t+r} · exp(∫₀^r q(Xs)ds) f(X_{t+r}) − Zt f(Xt)]
        = (1/r) Ex[f(X_{t+r})Z_{t+r} − f(Xt)Zt]
          + (1/r) Ex[f(X_{t+r})Z_{t+r} · (exp(∫₀^r q(Xs)ds) − 1)]
        → (∂/∂t)v(t, x) + q(x)v(t, x)   as r → 0 ,

because

    (1/r) f(X_{t+r})Z_{t+r} (exp(∫₀^r q(Xs)ds) − 1) → f(Xt)Zt q(X0)

pointwise boundedly. That completes the proof of a).

b) Assume that w(t, x) ∈ C^{1,2}(R × Rn) satisfies (8.2.2) and (8.2.3) and that w(t, x) is bounded on K × Rn for each compact K ⊂ R. Then

    Âw(t, x) := −∂w/∂t + Aw − qw = 0   for t > 0, x ∈ Rn        (8.2.4)

and

    w(0, x) = f(x) ;   x ∈ Rn .        (8.2.5)

Fix (s, x, z) ∈ R × Rn × R and define Zt = z + ∫₀^t q(Xs)ds and Ht = (s − t, Xt^{0,x}, Zt). Then Ht is an Itô diffusion with generator

    AH φ(s, x, z) = −∂φ/∂s + Aφ + q(x)(∂φ/∂z) ;   φ ∈ C₀²(R × Rn × R) .

Hence by (8.2.4) and Dynkin's formula we have, for all t ≥ 0, R > 0 and with φ(s, x, z) = exp(−z)w(s, x):

    E^{s,x,z}[φ(H_{t∧τR})] = φ(s, x, z) + E^{s,x,z}[∫₀^{t∧τR} AH φ(Hr)dr] ,

where τR = inf{t > 0; |Ht| ≥ R}.
Note that with this choice of φ we have by (8.2.4)

    AH φ(s, x, z) = exp(−z)[−∂w/∂s + Aw − q(x)w] = 0 .

Hence

    w(s, x) = φ(s, x, 0) = E^{s,x,0}[φ(H_{t∧τR})]
        = Ex[exp(−∫₀^{t∧τR} q(Xr)dr) w(s − t∧τR, X_{t∧τR})]
        → Ex[exp(−∫₀^t q(Xr)dr) w(s − t, Xt)]   as R → ∞ ,

since w(r, x) is bounded for (r, x) ∈ K × Rn. In particular, choosing t = s we get

    w(s, x) = Ex[exp(−∫₀^s q(Xr)dr) w(0, Xs^{0,x})] = v(s, x) ,   as claimed.
Remark (About killing a diffusion). In Theorem 7.3.3 we have seen that the generator of an Itô diffusion Xt given by

    dXt = b(Xt)dt + σ(Xt)dBt        (8.2.6)

is a partial differential operator L of the form

    Lf = Σi,j aij (∂²f/∂xi∂xj) + Σi bi (∂f/∂xi)        (8.2.7)

where [aij] = ½σσ^T, b = [bi]. It is natural to ask if one can also find processes whose generator has the form

    Lf = Σi,j aij (∂²f/∂xi∂xj) + Σi bi (∂f/∂xi) − cf ,        (8.2.8)

where c(x) is a bounded and continuous function.
If c(x) ≥ 0 the answer is yes and a process X̃t with generator (8.2.8) is obtained by killing Xt at a certain (killing) time ζ. By this we mean that there exists a random time ζ such that if we put

    X̃t = Xt   if t < ζ        (8.2.9)

and leave X̃t undefined if t ≥ ζ (alternatively, put X̃t = ∂ if t ≥ ζ, where ∂ ∉ Rn is some "coffin" state), then X̃t is also a strong Markov process and

    Ex[f(X̃t)] := Ex[f(Xt) · 𝒳_{[0,ζ)}(t)] = Ex[f(Xt) · e^{−∫₀^t c(Xs)ds}]        (8.2.10)

for all bounded continuous functions f on Rn.
Let v(t, x) denote the right hand side of (8.2.10) with f ∈ C₀²(Rn). Then

    lim_{t→0} (Ex[f(X̃t)] − f(x))/t = (∂v/∂t)|_{t=0} = (Av − cv)|_{t=0} = Af(x) − c(x)f(x) ,

by the Feynman-Kac formula.
So the generator of X̃t is (8.2.8), as required. The function c(x) can be interpreted as the killing rate:

    c(x) = lim_{t↓0} (1/t) Qx[X0 is killed in the time interval (0, t]] .

Thus by applying such a killing procedure we can come from the special case c = 0 in (8.2.7) to the general case (8.2.8) with c(x) ≥ 0. Therefore, for many purposes it is enough to consider the equation (8.2.7).
If the function c(x) ≥ 0 is given, an explicit construction of the killing time ζ such that (8.2.10) holds can be found in Karlin and Taylor (1981), p. 314. For a more general discussion see Blumenthal and Getoor (1968), Chap. III.
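Identity (8.2.10) can be illustrated by simulation in the simplest case. The sketch below is not from the text: it takes X to be Brownian motion and a constant killing rate c(x) = c, in which case ζ may be chosen as an Exp(c) time independent of the path; both sides of (8.2.10) are then estimated by Monte Carlo.

```python
import math
import random

def mc_killed_vs_weighted(x, t, c, n, seed=2):
    """Monte Carlo check of (8.2.10) for Brownian motion with constant rate c:
    E[f(X~_t)] (killing at zeta ~ Exp(c)) vs E[f(X_t) exp(-c t)]."""
    rng = random.Random(seed)
    f = math.cos
    killed, weighted = 0.0, 0.0
    for _ in range(n):
        bt = x + rng.gauss(0.0, math.sqrt(t))   # exact sample of B_t
        zeta = rng.expovariate(c)               # independent killing time
        if t < zeta:                            # survived: f(X~_t) = f(X_t)
            killed += f(bt)
        weighted += f(bt) * math.exp(-c * t)    # Feynman-Kac weight
    return killed / n, weighted / n

lhs, rhs = mc_killed_vs_weighted(x=0.2, t=1.0, c=0.5, n=200_000)
print(lhs, rhs)
```

Both estimates agree to Monte Carlo accuracy; for a state-dependent rate c(x) the exponential clock must be driven by ∫₀^t c(Xs)ds, as in the Karlin-Taylor construction cited above.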
8.3 The Martingale Problem

If dXt = b(Xt)dt + σ(Xt)dBt is an Itô diffusion in Rn with generator A and if f ∈ C₀²(Rn) then by (7.3.1)

    f(Xt) = f(x) + ∫₀^t Af(Xs)ds + ∫₀^t ∇f^T(Xs)σ(Xs)dBs .        (8.3.1)

Define

    Mt = f(Xt) − ∫₀^t Af(Xr)dr   (= f(x) + ∫₀^t ∇f^T(Xr)σ(Xr)dBr) .        (8.3.2)

Then, since Itô integrals are martingales (w.r.t. the σ-algebras {Ft^{(m)}}), we have for s > t

    Ex[Ms|Ft^{(m)}] = Mt .

It follows that

    Ex[Ms|ℳt] = Ex[Ex[Ms|Ft^{(m)}]|ℳt] = Ex[Mt|ℳt] = Mt ,

since Mt is ℳt-measurable. We have proved:
Theorem 8.3.1. If Xt is an Itô diffusion in Rn with generator A, then for all f ∈ C₀²(Rn) the process

    Mt = f(Xt) − ∫₀^t Af(Xr)dr

is a martingale w.r.t. {ℳt}.

If we identify each ω ∈ Ω with the function ωt = ω(t) = Xt^x(ω) we see that the probability space (Ω, ℳ, Qx) is identified with

    ((Rn)^{[0,∞)}, B, Q̃x)

where B is the Borel σ-algebra on (Rn)^{[0,∞)} (see Chapter 2). Thus, regarding the law of Xt^x as a probability measure Q̃x on B we can formulate Theorem 8.3.1 as follows:

Theorem 8.3.1'. If Q̃x is the probability measure on B induced by the law Qx of an Itô diffusion Xt, then for all f ∈ C₀²(Rn) the process

    Mt = f(Xt) − ∫₀^t Af(Xr)dr   (= f(ωt) − ∫₀^t Af(ωr)dr) ;   ω ∈ (Rn)^{[0,∞)}        (8.3.3)

is a Q̃x-martingale w.r.t. the Borel σ-algebras Bt of (Rn)^{[0,t]}, t ≥ 0. In other words, the measure Q̃x solves the martingale problem for the differential operator A, in the following sense:

Definition 8.3.2. Let L be a semi-elliptic differential operator of the form

    L = Σi bi (∂/∂xi) + Σi,j aij (∂²/∂xi∂xj)

where the coefficients bi, aij are locally bounded Borel measurable functions on Rn. Then we say that a probability measure P̃x on ((Rn)^{[0,∞)}, B) solves the martingale problem for L (starting at x) if the process

    Mt = f(ωt) − ∫₀^t Lf(ωr)dr ,   M0 = f(x)   a.s. P̃x

is a P̃x-martingale w.r.t. Bt, for all f ∈ C₀²(Rn). The martingale problem is called well posed if there is a unique measure P̃x solving the martingale problem.
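A necessary consequence of the martingale property in Theorem 8.3.1 is that E[Mt] is constant in t, and this can be checked numerically. The sketch below is an illustration not from the text: it takes X to be Brownian motion and f(x) = x², so that Af = 1 and Mt = Bt² − t by (8.3.2), and B_t is sampled exactly as a Gaussian.

```python
import math
import random

def mean_Mt(x, t, n, seed=3):
    # f(y) = y^2, A f = 1 for Brownian motion, so M_t = B_t^2 - t  (cf. (8.3.2)).
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(n):
        bt = x + rng.gauss(0.0, math.sqrt(t))   # exact sample of B_t under P^x
        acc += bt * bt - t
    return acc / n

x = 0.5
m1 = mean_Mt(x, 1.0, 200_000)
m4 = mean_Mt(x, 4.0, 200_000)
print(m1, m4, x * x)   # a martingale has constant expectation M_0 = f(x) = x^2
```

Constancy of the mean is of course much weaker than the full conditional-expectation property, but it is the part that is cheap to test by simulation.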
The argument of Theorem 8.3.1 actually proves that Q̃x solves the martingale problem for A whenever Xt is a weak solution of the stochastic differential equation

    dXt = b(Xt)dt + σ(Xt)dBt .        (8.3.4)

Conversely, it can be proved that if P̃x solves the martingale problem for

    L = Σi bi (∂/∂xi) + ½ Σi,j (σσ^T)ij (∂²/∂xi∂xj)        (8.3.5)

starting at x, for all x ∈ Rn, then there exists a weak solution Xt of the stochastic differential equation (8.3.4). Moreover, this weak solution Xt is a Markov process if and only if the martingale problem for L is well posed. (See Stroock and Varadhan (1979) or Rogers and Williams (1987)). Therefore, if the coefficients b, σ of (8.3.4) satisfy the conditions (5.2.1), (5.2.2) of Theorem 5.2.1, we conclude that

    Q̃x is the unique solution of the martingale problem for the operator L given by (8.3.5).        (8.3.6)

Lipschitz-continuity of the coefficients of L is not necessary for the uniqueness of the martingale problem. For example, one of the spectacular results of Stroock and Varadhan (1979) is that

    L = Σi bi (∂/∂xi) + Σi,j aij (∂²/∂xi∂xj)

has a unique solution of the martingale problem if [aij] is everywhere positive definite, aij(x) is continuous, b(x) is measurable and there exists a constant D such that

    |b(x)| + |a(x)|^{1/2} ≤ D(1 + |x|)   for all x ∈ Rn .
8.4 When is an Itô Process a Diffusion?

The Itô formula gives that if we apply a C² function φ: U ⊂ Rn → Rn to an Itô process Xt the result φ(Xt) is another Itô process. A natural question is: If Xt is an Itô diffusion will φ(Xt) be an Itô diffusion too? The answer is no in general, but it may be yes in some cases:

Example 8.4.1 (The Bessel process). Let n ≥ 2. In Example 4.2.2 we found that the process

    Rt(ω) = |B(t, ω)| = (B1(t, ω)² + ··· + Bn(t, ω)²)^{1/2}
satisfies the equation

    dRt = Σ_{i=1}^n (Bi/Rt) dBi + ((n−1)/(2Rt)) dt .        (8.4.1)

However, as it stands this is not a stochastic differential equation of the form (5.2.3), so it is not apparent from (8.4.1) that Rt is an Itô diffusion. But this will follow if we can show that

    Yt := Σ_{i=1}^n ∫₀^t (Bi/|B|) dBi

coincides in law with (i.e. has the same finite-dimensional distributions as) 1-dimensional Brownian motion B̃t. For then (8.4.1) can be written

    dRt = ((n−1)/(2Rt)) dt + dB̃t ,

which is of the form (5.2.3), thus showing by weak uniqueness (Lemma 5.3.1) that Rt is an Itô diffusion with generator

    Af(x) = ½f''(x) + ((n−1)/(2x)) f'(x)

as claimed in Example 4.2.2. One way of seeing that the process Yt coincides in law with 1-dimensional Brownian motion B̃t is to apply the following result:
Theorem 8.4.2. An Itô process

    dYt = v dBt ;   Y0 = 0,   with v(t, ω) ∈ V_H^{n×m},

coincides (in law) with n-dimensional Brownian motion if and only if

    vv^T(t, ω) = In   for a.a. (t, ω) w.r.t. dt × dP ,

where In is the n-dimensional identity matrix.

Note that in the example above we have

    Yt = ∫₀^t v dB   with   v = (B1/|B|, . . . , Bn/|B|) ,   B = (B1, . . . , Bn)^T ,        (8.4.2)
and since vv^T = 1, we get that Yt is a 1-dimensional Brownian motion, as required.

Theorem 8.4.2 is a special case of the following result, which gives a necessary and sufficient condition for an Itô process to coincide in law with a given diffusion. (We use the symbol ≈ for "coincides in law with".)

Theorem 8.4.3. Let Xt be an Itô diffusion given by

    dXt = b(Xt)dt + σ(Xt)dBt ,   b ∈ Rn ,  σ ∈ Rn×m ,  X0 = x ,

and let Yt be an Itô process given by

    dYt = u(t, ω)dt + v(t, ω)dBt ,   u ∈ Rn ,  v ∈ Rn×m ,  Y0 = x .

Then {Xt} ≈ {Yt} if and only if

    Ex[u(t, ·)|Nt] = b(Yt^x)   and   vv^T(t, ω) = σσ^T(Yt^x)        (8.4.3)

for a.a. (t, ω) w.r.t. dt × dP, where Nt is the σ-algebra generated by Ys; s ≤ t.

Proof. Assume that (8.4.3) holds. Let

    A = Σi bi (∂/∂xi) + ½ Σi,j (σσ^T)ij (∂²/∂xi∂xj)

be the generator of Xt and define, for f ∈ C₀²(Rn),

    Hf(t, ω) = Σi ui(t, ω) (∂f/∂xi)(Yt) + ½ Σi,j (vv^T)ij(t, ω) (∂²f/∂xi∂xj)(Yt) .
Then by Itô's formula (see (7.3.1)) we have, for s > t,

    Ex[f(Ys)|Nt] = f(Yt) + Ex[∫ₜ^s Hf(r, ω)dr|Nt] + Ex[∫ₜ^s ∇f^T v dBr|Nt]
        = f(Yt) + Ex[∫ₜ^s Ex[Hf(r, ω)|Nr]dr|Nt]
        = f(Yt) + Ex[∫ₜ^s Af(Yr)dr|Nt]   by (8.4.3) ,        (8.4.4)

where Ex denotes expectation w.r.t. the law Rx of Yt (see Lemma 7.3.2). Therefore, if we define
    Mt = f(Yt) − ∫₀^t Af(Yr)dr        (8.4.5)

then, for s > t,

    Ex[Ms|Nt] = f(Yt) + Ex[∫ₜ^s Af(Yr)dr|Nt] − Ex[∫₀^s Af(Yr)dr|Nt]
        = f(Yt) − Ex[∫₀^t Af(Yr)dr|Nt] = Mt .

Hence Mt is a martingale w.r.t. the σ-algebras Nt and the law Rx. By uniqueness of the solution of the martingale problem (see (8.3.6)) we conclude that Xt ≈ Yt.

Conversely, assume that Xt ≈ Yt. Choose f ∈ C₀². By Itô's formula (7.3.1) we have, for a.a. (t, ω) w.r.t. dt × dP,

    lim_{h↓0} (1/h)(Ex[f(Y_{t+h})|Nt] − f(Yt))
        = lim_{h↓0} (1/h) Ex[∫ₜ^{t+h} (Σi ui(s, ω)(∂f/∂xi)(Ys) + ½ Σi,j (vv^T)ij(s, ω)(∂²f/∂xi∂xj)(Ys)) ds | Nt]        (8.4.6)
        = Σi Ex[ui(t, ω)|Nt] (∂f/∂xi)(Yt) + ½ Σi,j Ex[(vv^T)ij(t, ω)|Nt] (∂²f/∂xi∂xj)(Yt) .        (8.4.7)
On the other hand, since Xt ≈ Yt we know that Yt is a Markov process. Therefore (8.4.6) coincides with

    lim_{h↓0} (1/h)(E^{Yt}[f(Yh)] − E^{Yt}[f(Y0)])
        = E^{Yt}[Σi ui(0, ω)(∂f/∂xi)(Y0)] + ½ E^{Yt}[Σi,j (vv^T)ij(0, ω)(∂²f/∂xi∂xj)(Y0)]
        = Σi E^{Yt}[ui(0, ω)] (∂f/∂xi)(Yt) + ½ Σi,j E^{Yt}[(vv^T)ij(0, ω)] (∂²f/∂xi∂xj)(Yt) .        (8.4.8)

Comparing (8.4.7) and (8.4.8) we conclude that

    Ex[u(t, ω)|Nt] = E^{Yt}[u(0, ω)]   and   Ex[vv^T(t, ω)|Nt] = E^{Yt}[vv^T(0, ω)]        (8.4.9)

for a.a. (t, ω).
On the other hand, since the generator of Yt coincides with the generator A of Xt we get from (8.4.8) that

    E^{Yt}[u(0, ω)] = b(Yt)   and   E^{Yt}[vv^T(0, ω)] = σσ^T(Yt)   for a.a. (t, ω) .        (8.4.10)

Combining (8.4.9) and (8.4.10) we conclude that

    Ex[u|Nt] = b(Yt)   and   Ex[vv^T|Nt] = σσ^T(Yt)   for a.a. (t, ω) .        (8.4.11)

From this we obtain (8.4.3) by using that in fact vv^T(t, ·) is always Nt-measurable, in the following sense:

Lemma 8.4.4. Let dYt = u(t, ω)dt + v(t, ω)dBt, Y0 = x be as in Theorem 8.4.3. Then there exists an Nt-adapted process W(t, ω) such that

    vv^T(t, ω) = W(t, ω)   for a.a. (t, ω) .

Proof. By Itô's formula we have (if Yi(t, ω) denotes component number i of Y(t, ω))

    Yi Yj(t, ω) = xi xj + ∫₀^t Yi dYj(s) + ∫₀^t Yj dYi(s) + ∫₀^t (vv^T)ij(s, ω)ds .

Therefore, if we put

    Hij(t, ω) = Yi Yj(t, ω) − xi xj − ∫₀^t Yi dYj − ∫₀^t Yj dYi ,   1 ≤ i, j ≤ n

then Hij is Nt-adapted and

    Hij(t, ω) = ∫₀^t (vv^T)ij(s, ω)ds .

Therefore

    (vv^T)ij(t, ω) = lim_{r↓0} (Hij(t, ω) − Hij(t − r, ω))/r
8.4 When is an Itˆ o Process a Diffusion?
155
Then we may regard Yt as noisy observations of the process B1 (t). So by Example 6.2.10 we have that 1 (t, ω))2 ] = tanh(t) , E[(B1 (t, ω) − B 1 (t, ω) = E[B1 (t)|Nt ] is the Kalman-Bucy filter. In particular, where B B1 (t, ω) cannot be Nt -measurable. 2) The process v(t, ω) need not be Nt -adapted either: Let Bt be 1dimensional Brownian motion and define dYt = sign(Bt )dBt
(8.4.12)
where % sign(z) =
1 if z > 0 −1 if z ≤ 0 .
Tanaka’s formula says that
t |Bt | = |B0 | +
sign(Bs )dBs + Lt
(8.4.13)
0
where Lt = Lt (ω) is local time of Bt at 0, a non-decreasing process which only increases when Bt = 0 (see Exercise 4.10). Therefore the σ-algebra Nt generated by {Ys ; s ≤ t} is contained in the σ-algebra Ht generated by {|Bs |; s ≤ t}. It follows that v(t, ω) = sign(Bt ) cannot be Nt -adapted. Corollary 8.4.5 (How to recognize a Brownian motion) Let dYt = u(t, ω)dt + v(t, ω)dBt be an Itˆ o process in Rn . Then Yt is a Brownian motion if and only if E x [u(t, ·)|Nt ] = 0
and
vv T (t, ω) = In
(8.4.14)
for a.a. (t, ω). Remark. Using Theorem 8.4.3 one may now proceed to investigate when the image Yt = φ(Xt ) of an Itˆ o diffusion Xt by a C 2 -function φ coincides in law with an Itˆo diffusion Zt . Applying the criterion (8.4.3) one obtains the following result: φ(Xt ) ∼ Zt
if and only if
]oφ A[f oφ] = A[f
(8.4.15)
for all second order polynomials f (x1 , . . . , xn ) = ai xi + cij xi xj (and 2 hence for all f ∈ C0 ) where A and A are the generators of Xt and Zt
156
8 Other Topics in Diffusion Theory
respectively. (Here o denotes function composition: (f ◦ φ)(x) = f (φ(x)).) For generalizations of this result, see Csink and Øksendal (1983), and Csink, Fitzsimmons and Øksendal (1990).
8.5 Random Time Change Let c(t, ω) ≥ 0 be an Ft -adapted process (with measurable paths). Define
t βt = β(t, ω) =
c(s, ω)ds .
(8.5.1)
0
We will say that βt is a (random) time change with time change rate c(t, ω). Note that β(t, ω) is also Ft -adapted and for each ω the map t → βt (ω) is non-decreasing. Define αt = α(t, ω) by αt = inf{s; βs > t} .
(8.5.2)
Then αt is a right -inverse of βt , for each ω : β(α(t, ω), ω) = t
for all t ≥ 0 .
(8.5.3)
Moreover, t → αt (ω) is right-continuous. If c(s, ω) > 0 for a.a. (s, ω) then t → βt (ω) is strictly increasing, t → αt (ω) is continuous and αt is also a left -inverse of βt : α(β(t, ω), ω) = t
for all t ≥ 0 .
(8.5.4)
In general ω → α(t, ω) is an {Fs }-stopping time for each t, since {ω; α(t, ω) < s} = {ω; t < β(s, ω)} ∈ Fs .
(8.5.5)
We now ask the question: Suppose Xt is an Itˆ o diffusion and Yt an Itˆo process as in Theorem 8.4.3. When does there exist a time change βt such that Yαt ! Xt ? (Note that αt is only defined up to time β∞ . If β∞ < ∞ we interpret Yαt ! Xt to mean that Yαt has the same law as Xt up to time β∞ ). Here is a partial answer (see Øksendal (1990)): Theorem 8.5.1 Let Xt , Yt be as in Theorem 8.4.3 and let βt be a time change with right inverse αt as in (8.5.1), (8.5.2) above. Assume that u(t, ω) = c(t, ω)b(Yt )
and
vv T (t, ω) = c(t, ω) · σσ T (Yt )
(8.5.6)
for a.a. t, ω. Then Yαt ! Xt . This result allows us to recognize time changes of Brownian motion:
Theorem 8.5.2. Let dYt = v(t, ω)dBt, v ∈ Rn×m, Bt ∈ Rm, be an Itô integral in Rn, Y0 = 0, and assume that

    vv^T(t, ω) = c(t, ω)In        (8.5.7)

for some process c(t, ω) ≥ 0. Let αt, βt be as in (8.5.1), (8.5.2). Then

    Yαt   is an n-dimensional Brownian motion .
n
vi (t, ω)dBi (t, ω), Y0 = 0, where
i=1
B = (B1 , . . . , Bn ) is a Brownian motion in Rn . Then t : = Yαt B
is a 1-dimensional Brownian motion ,
where αt is defined by (8.5.2) and βs =
s % n
+ vi2 (r, ω) dr .
(8.5.8)
i=1
0
Corollary 8.5.4 Let Yt , βs be as in Corollary 8.5.3. Assume that n
vi2 (r, ω) > 0
for a.a. (r, ω) .
(8.5.9)
i=1
t such that Then there exists a Brownian motion B βt . Yt = B
(8.5.10)
t = Yαt B
(8.5.11)
Proof. Let
be the Brownian motion from Corollary 8.5.3. By (8.5.9) βt is strictly increasing and hence (8.5.4) holds, So choosing t = βs in (8.5.11) we get (8.5.10). Corollary 8.5.5 Let c(t, ω) ≥ 0 be given and define
t 1 c(s, ω) dBs , Yt = 0
where Bs is an n-dimensional Brownian motion. Then Yαt
is also an n-dimensional Brownian motion .
158
8 Other Topics in Diffusion Theory
We now use this to prove that a time change of an Itˆo integral is again an t . First we construct Itˆo integral, but driven by a different Brownian motion B Bt : Lemma 8.5.6 Suppose s → α(s, ω) is continuous, α(0, ω) = 0 for a.a. ω. Fix t > 0 such that βt < ∞ a.s. and assume that E[αt ] < ∞. For k = 1, 2, . . . put % j · 2−k if j · 2−k ≤ αt tj = t if j · 2−k > αt and choose rj such that αrj = tj . Suppose f (s, ω) ≥ 0 is Fs -adapted, bounded and s-continuous for a.a. ω. Then
lim
k→∞
αt f (αj , ω)ΔBαj =
j
f (s, ω)dBs
a.s. ,
(8.5.12)
0
where αj = αrj , ΔBαj = Bαj+1 − Bαj and the limit is in L2 (Ω, P ). Proof. For all k we have E
#
αt f (αj , ω)ΔBαj −
j
f (s, ω)dBs
2 $
0
# α j+1 2 $ =E (f (αj , ω) − f (s, ω)dBs j
αj
αj+1 2 E (f (αj , ω) − f (s, ω))dBs = j
αj
# = E j
where fk (s, ω) =
# αt $ (f (αj , ω) − f (s, ω)) ds = E (f − fk )2 ds ,
α
j+1
2
αj
j
$
0
f (tj , ω)X[tj ,tj+1 ) (s) is the elementary approximation to f .
(See Corollary 3.1.8). This implies (8.5.12).
We now use this to establish a general time change formula for Itô integrals. An alternative proof in the case n = m = 1 can be found in McKean (1969, §2.8).

Theorem 8.5.7 (Time change formula for Itô integrals). Suppose c(s, ω) and α(s, ω) are s-continuous, α(0, ω) = 0 for a.a. ω and that E[αt] < ∞. Let Bs be an m-dimensional Brownian motion and let v(s, ω) ∈ V_H^{n×m} be bounded and s-continuous. Define

    B̃t = lim_{k→∞} Σj √(c(αj, ω)) ΔBαj = ∫₀^{αt} √(c(s, ω)) dBs .        (8.5.13)

Then B̃t is an (m-dimensional) F_{αt}^{(m)}-Brownian motion (i.e. B̃t is a Brownian motion and B̃t is a martingale w.r.t. F_{αt}^{(m)}) and

    ∫₀^{αt} v(s, ω)dBs = ∫₀^t v(αr, ω) √(α'r(ω)) dB̃r   a.s. P ,        (8.5.14)

where α'r(ω) is the derivative of α(r, ω) w.r.t. r, so that

    α'r(ω) = 1/c(αr, ω)   for a.a. r ≥ 0, a.a. ω ∈ Ω .        (8.5.15)
Proof. The existence of the limit in (8.5.13) and the second identity in (8.5.13) follow by applying Lemma 8.5.6 to the function 1 f (s, ω) = c(s, ω) . t is an Fα(m) Then by Corollary 8.5.5 we have that B t -Brownian motion. It remains to prove (8.5.14):
αt v(s, ω)dBs = lim
k→∞
0
= lim
k→∞
= lim
k→∞
t =
v(αj , ω)ΔBαj
j
2 v(αj , ω)
j
j
2 v(αj , ω)
. c(αj , ω) ΔBαj
1 j ΔB c(αj , ω)
2
v(αr , ω) 0
1 c(αj , ω)
1 r dB c(αr , ω)
and the proof is complete.
Example 8.5.8 (Brownian motion on the unit sphere in Rn ; n > 2) In Examples 5.1.4 and 7.5.5 we constructed Brownian motion on the unit circle. It is not obvious how to extend the method used there to obtain Brownian motion on the unit sphere S of Rn ; n ≥ 3. However, we may proceed as follows: Apply the function φ: Rn \ {0} → S defined by φ(x) = x · |x|−1 ;
x ∈ Rn \ {0}
to n-dimensional Brownian motion B = (B1, . . . , Bn). The result is a stochastic integral Y = (Y1, . . . , Yn) = φ(B) which by Itô's formula is given by

    dYi = ((|B|² − Bi²)/|B|³) dBi − Σ_{j≠i} (Bi Bj/|B|³) dBj − ((n−1)/2)(Bi/|B|³) dt ;   i = 1, 2, . . . , n .        (8.5.16)

Hence

    dY = (1/|B|) σ(Y)dB + (1/|B|²) b(Y)dt ,

where σ = [σij] ∈ Rn×n with σij(Y) = δij − Yi Yj; 1 ≤ i, j ≤ n, and

    b(y) = −((n−1)/2) · (y1, . . . , yn)^T ∈ Rn   (y1, . . . , yn are the coordinates of y ∈ Rn) .

Now perform the following time change: Define Zt(ω) = Yα(t,ω)(ω), where

    αt = βt^{−1} ,   β(t, ω) = ∫₀^t (1/|B|²) ds .

Then Z is again an Itô process and by Theorem 8.5.7

    dZ = σ(Z)dB̃ + b(Z)dt .

Hence Z is a diffusion with characteristic operator

    Af(y) = ½Δf(y) − ½ Σi,j yi yj (∂²f/∂yi∂yj) − ((n−1)/2) Σi yi (∂f/∂yi) ;   |y| = 1 .        (8.5.17)

Thus, φ(B) = B/|B| is – after a suitable change of time scale – equal to a diffusion Z living on the unit sphere S of Rn. Note that Z is invariant under orthogonal transformations in Rn (since B is). It is reasonable to call Z Brownian motion on the unit sphere S. For other constructions see Itô and McKean (1965, p. 269 (§7.15)) and Stroock (1971).

More generally, given a Riemannian manifold M with metric tensor g = [gij] one may define a Brownian motion on M as a diffusion on M whose characteristic operator A in local coordinates xi is given by ½ times the Laplace-Beltrami operator (here [g^{ij}] = [gij]^{−1})
    ΔM = (1/√(det g)) Σi ∂/∂xi (√(det g) Σj g^{ij} ∂/∂xj) .        (8.5.18)
See for example Meyer (1966, p. 256–270), McKean (1969, §4.3). The subject of stochastic differential equations on manifolds is also treated in Ikeda and Watanabe (1989), Emery (1989) and Elworthy (1982). Example 8.5.9 (Harmonic and analytic functions) Let B = (B1 , B2 ) be 2-dimensional Brownian motion. Let us investigate what happens if we apply a C 2 function φ(x1 , x2 ) = (u(x1 , x2 ), v(x1 , x2 )) to B: Put Y = (Y1 , Y2 ) = φ(B1 , B2 ) and apply Itˆo’s formula: dY1 = u1 (B1 , B2 )dB1 + u2 (B1 , B2 )dB2 + 12 [u11 (B1 , B2 ) + u22 (B1 , B2 )]dt and dY2 = v1 (B1 , B2 )dB1 + v2 (B1 , B2 )dB2 + 12 [v11 (B1 , B2 ) + v22 (B1 , B2 )]dt ,
where u1 = ∂u/∂x1 etc. So

    dY = b(B1, B2)dt + σ(B1, B2)dB ,

with

    b = ½ (Δu, Δv)^T ,   σ = [u1 u2 ; v1 v2] = Dφ   (the derivative of φ).

So Y = φ(B1, B2) is a martingale if (and, in fact, only if) φ is harmonic, i.e. Δφ = 0. If φ is harmonic, we get by Corollary 8.5.3 that

    φ(B1, B2) = (B̃^{(1)}_{β1}, B̃^{(2)}_{β2})

where B̃^{(1)} and B̃^{(2)} are two (not necessarily independent) versions of 1-dimensional Brownian motion, and

    β1(t, ω) = ∫₀^t |∇u|²(B1, B2)ds ,   β2(t, ω) = ∫₀^t |∇v|²(B1, B2)ds .

Since

    σσ^T = [|∇u|²  ∇u·∇v ; ∇u·∇v  |∇v|²]

we see that if (in addition to Δu = Δv = 0)

    |∇u|² = |∇v|²   and   ∇u · ∇v = 0        (8.5.19)

then

    Yt = Y0 + ∫₀^t σ dB   with   σσ^T = |∇u|²(B1, B2)I2 ,   Y0 = φ(B1(0), B2(0)) .

Therefore, if we let

    βt = β(t, ω) = ∫₀^t |∇u|²(B1, B2)ds ,   αt = βt^{−1} ,

we obtain by Theorem 8.5.2 that Yαt is a 2-dimensional Brownian motion.

Conditions (8.5.19) – in addition to Δu = Δv = 0 – are easily seen to be equivalent to requiring that the function φ(x + iy) = φ(x, y), regarded as a complex function, is either analytic or conjugate analytic. Thus we have proved a theorem of P. Lévy that φ(B1, B2) is – after a change of time scale – again Brownian motion in the plane if and only if φ is either analytic or conjugate analytic. For extensions of this result see Bernard, Campbell and Davie (1979), Csink and Øksendal (1983) and Csink, Fitzsimmons and Øksendal (1990).
8.6 The Girsanov Theorem

We end this chapter by discussing a result, the Girsanov theorem, which is fundamental in the general theory of stochastic analysis. It is also very important in many applications, for example in economics (see Chapter 12). Basically the Girsanov theorem says that if we change the drift coefficient of a given Itô process (with a nondegenerate diffusion coefficient), then the law of the process will not change dramatically. In fact, the law of the new process will be absolutely continuous w.r.t. the law of the original process and we can compute explicitly the Radon-Nikodym derivative. We now proceed to make this precise.

First we state (without proof) the useful Lévy characterization of Brownian motion. A proof can be found in e.g. Ikeda & Watanabe (1989), Theorem II.6.1, or in Karatzas & Shreve (1991), Theorem 3.3.16.

Theorem 8.6.1 (The Lévy characterization of Brownian motion). Let X(t) = (X1(t), . . . , Xn(t)) be a continuous stochastic process on a probability space (Ω, H, Q) with values in Rn. Then the following, a) and b), are equivalent:
a) X(t) is a Brownian motion w.r.t. Q, i.e. the law of X(t) w.r.t. Q is the same as the law of an n-dimensional Brownian motion.
b) (i) X(t) = (X1(t), . . . , Xn(t)) is a martingale w.r.t. Q (and w.r.t. its own filtration) and
   (ii) Xi(t)Xj(t) − δij·t is a martingale w.r.t. Q (and w.r.t. its own filtration) for all i, j ∈ {1, 2, . . . , n}.

Remark. In this Theorem one may replace condition (ii) by the condition
(ii)' The cross-variation processes ⟨Xi, Xj⟩t satisfy the identity

    ⟨Xi, Xj⟩t(ω) = δij·t   a.s.,  1 ≤ i, j ≤ n        (8.6.1)

where

    ⟨Xi, Xj⟩t = ¼[⟨Xi + Xj, Xi + Xj⟩t − ⟨Xi − Xj, Xi − Xj⟩t] ,        (8.6.2)

⟨Y, Y⟩t being the quadratic variation process. (See Exercise 4.7.)

Next we need an auxiliary result about conditional expectation:

Lemma 8.6.2 (Bayes' rule). Let μ and ν be two probability measures on a measurable space (Ω, G) such that dν(ω) = f(ω)dμ(ω) for some f ∈ L¹(μ). Let X be a random variable on (Ω, G) such that

    Eν[|X|] = ∫Ω |X(ω)|f(ω)dμ(ω) < ∞ .

Let H be a σ-algebra, H ⊂ G. Then

    Eν[X|H] · Eμ[f|H] = Eμ[fX|H]   a.s.        (8.6.3)
Proof. By the definition of conditional expectation (Appendix B) we have that if H ∈ H then

  ∫_H E_ν[X|H]f dμ = ∫_H E_ν[X|H]dν = ∫_H X dν = ∫_H Xf dμ = ∫_H E_μ[fX|H]dμ.   (8.6.4)

On the other hand, by Theorem B.3 (Appendix B) we have

  ∫_H E_ν[X|H]f dμ = E_μ[E_ν[X|H]f · χ_H] = E_μ[E_μ[E_ν[X|H]f · χ_H | H]]
    = E_μ[χ_H E_ν[X|H] · E_μ[f|H]] = ∫_H E_ν[X|H] · E_μ[f|H]dμ.   (8.6.5)
8 Other Topics in Diffusion Theory
Combining (8.6.4) and (8.6.5) we get

  ∫_H E_ν[X|H] · E_μ[f|H]dμ = ∫_H E_μ[fX|H]dμ.

Since this holds for all H ∈ H, (8.6.3) follows. □
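Bayes' rule (8.6.3) can be sanity-checked on a finite probability space, where conditional expectations reduce to weighted averages over the atoms of the σ-algebra. The following sketch is not from the text; all numerical values are illustrative.

```python
# Discrete sanity check of Bayes' rule (Lemma 8.6.2) on a four-point
# probability space.  The measure mu, density f, random variable X and
# the partition generating H are all illustrative choices.
omega = [0, 1, 2, 3]
mu = {w: 0.25 for w in omega}          # reference probability measure mu
f = {0: 0.4, 1: 1.2, 2: 0.8, 3: 1.6}   # density dnu/dmu (integrates to 1)
X = {0: 2.0, 1: -1.0, 2: 3.0, 3: 0.5}  # a random variable on Omega
atoms = [{0, 1}, {2, 3}]               # partition generating H

def cond_exp(g, measure, atom):
    """E_measure[g | atom] = integral of g over the atom / measure(atom)."""
    mass = sum(measure[w] for w in atom)
    return sum(g[w] * measure[w] for w in atom) / mass

nu = {w: f[w] * mu[w] for w in omega}  # dnu = f dmu
for atom in atoms:
    lhs = cond_exp(X, nu, atom) * cond_exp(f, mu, atom)
    rhs = cond_exp({w: f[w] * X[w] for w in omega}, mu, atom)
    assert abs(lhs - rhs) < 1e-12
print("Bayes' rule (8.6.3) verified on all atoms")
```

On each atom the product E_ν[X|H]·E_μ[f|H] agrees with E_μ[fX|H], exactly as (8.6.3) asserts.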
Before stating the Girsanov theorem we make some general remarks about absolute continuity of measures: Let (Ω, F, {F_t}_{t≥0}, P) be a filtered probability space (i.e. (Ω, F, P) is a probability space and {F_t}_{t≥0} is a filtration on (Ω, F)). Fix T > 0 and let Q be another probability measure on F_T. We say that Q is absolutely continuous w.r.t. P|F_T (the restriction of P to F_T), and write Q ≪ P|F_T, if

  P(H) = 0 ⇒ Q(H) = 0   for all H ∈ F_T.

By the Radon-Nikodym theorem this occurs if and only if there exists an F_T-measurable random variable Z_T(ω) ≥ 0 such that

  dQ(ω) = Z_T(ω)dP(ω)   on F_T.

In this case we write

  dQ/dP = Z_T   on F_T

and we call Z_T the Radon-Nikodym derivative of Q with respect to P. The following observation may be regarded as a weak partial converse of the Girsanov theorem:

Lemma 8.6.3 Suppose Q ≪ P|F_T with dQ/dP = Z_T on F_T. Then Q|F_t ≪ P|F_t for all t ∈ [0, T] and if we define

  Z_t := d(Q|F_t)/d(P|F_t)

then Z_t is a martingale w.r.t. F_t and P.

Proof. Since Q ≪ P on F_T and F_t ⊂ F_T it is obvious that Q ≪ P on F_t. Choose F ∈ F_t. Then

  E_P[χ_F · E_P[Z_T|F_t]] = E_P[E_P[χ_F · Z_T|F_t]] = E_P[χ_F · Z_T] = E_Q[χ_F] = E_P[χ_F · Z_t].

Since this holds for all F ∈ F_t we conclude that

  E_P[Z_T|F_t] = Z_t   a.s. P|F_t. □
We can now prove the first version of the Girsanov formula:

Theorem 8.6.4 (The Girsanov theorem I)
Let Y(t) ∈ R^n be an Itô process of the form

  dY(t) = a(t, ω)dt + dB(t);   t ≤ T, Y_0 = 0,

where T ≤ ∞ is a given constant and B(t) is n-dimensional Brownian motion. Put

  M_t = exp(−∫_0^t a(s, ω)dB_s − ½∫_0^t a²(s, ω)ds);   0 ≤ t ≤ T.   (8.6.6)

Assume that M_t is a martingale with respect to F_t^(n) and P. Define the measure Q on F_T^(n) by

  dQ(ω) = M_T(ω)dP(ω).   (8.6.7)

Then Q is a probability measure on F_T^(n) and Y(t) is an n-dimensional Brownian motion w.r.t. Q, for 0 ≤ t ≤ T.
Remarks.
(1) The transformation P → Q given by (8.6.7) is called the Girsanov transformation of measures.
(2) As pointed out in Exercise 4.4 the following Novikov condition (8.6.8) is sufficient to guarantee that {M_t}_{t≤T} is a martingale (w.r.t. F_t^(n) and P):

  E[exp(½∫_0^T a²(s, ω)ds)] < ∞.   (8.6.8)

In the proof of Theorem 8.6.4 one verifies that K_i(t) := M_t Y_i(t) is a martingale w.r.t. F_t^(n) and P; Bayes' rule (Lemma 8.6.2) then gives, for t > s,

  E_Q[Y_i(t)|F_s] = E[M_t Y_i(t)|F_s]/E[M_t|F_s] = E[K_i(t)|F_s]/E[M_t|F_s] = K_i(s)/M_s = Y_i(s),

which shows that Y_i(t) is a martingale w.r.t. Q. This proves (i). The proof of (ii) is similar and is left to the reader.

Remark. Theorem 8.6.4 states that for all Borel sets F_1, ..., F_k ⊂ R^n and all t_1, t_2, ..., t_k ≤ T, k = 1, 2, ..., we have

  Q[Y(t_1) ∈ F_1, ..., Y(t_k) ∈ F_k] = P[B(t_1) ∈ F_1, ..., B(t_k) ∈ F_k].   (8.6.13)

An equivalent way of expressing (8.6.7) is to say that Q ≪ P (Q is absolutely continuous w.r.t. P) with Radon-Nikodym derivative

  dQ/dP = M_T   on F_T^(n).   (8.6.14)
Note that M_T(ω) > 0 a.s., so we also have that P ≪ Q. Hence the two measures Q and P are equivalent. Therefore we get from (8.6.13)

  P[Y(t_1) ∈ F_1, ..., Y(t_k) ∈ F_k] > 0 ⇔ Q[Y(t_1) ∈ F_1, ..., Y(t_k) ∈ F_k] > 0
    ⇔ P[B(t_1) ∈ F_1, ..., B(t_k) ∈ F_k] > 0;   t_1, ..., t_k ∈ [0, T].   (8.6.15)
Example 8.6.5 Suppose Y(t) ∈ R^n is given by

  dY(t) = g(t)dt + dB(t),   0 ≤ t ≤ T,

where g: [0, T] → R^n is a continuous deterministic function. Then the Novikov condition (8.6.8) holds trivially and Y(t) is a Brownian motion w.r.t. Q, where

  dQ(ω) = exp(−∫_0^T g(s)dB(s) − ½∫_0^T g²(s)ds) dP(ω)   on F_T^(n).
Theorem 8.6.6 (The Girsanov theorem II)
Let Y(t) ∈ R^n be an Itô process of the form

  dY(t) = β(t, ω)dt + θ(t, ω)dB(t);   t ≤ T,   (8.6.16)

where B(t) ∈ R^m, β(t, ω) ∈ R^n and θ(t, ω) ∈ R^{n×m}. Suppose there exist processes u(t, ω) ∈ W_H^m and α(t, ω) ∈ W_H^n such that

  θ(t, ω)u(t, ω) = β(t, ω) − α(t, ω).   (8.6.17)

Put

  M_t = exp(−∫_0^t u(s, ω)dB_s − ½∫_0^t u²(s, ω)ds);   t ≤ T   (8.6.18)

and

  dQ(ω) = M_T(ω)dP(ω)   on F_T^(m).   (8.6.19)

Assume that M_t is a martingale (w.r.t. F_t^(m) and P). Then Q is a probability measure on F_T^(m), the process

  B̂(t) := ∫_0^t u(s, ω)ds + B(t);   t ≤ T   (8.6.20)
is a Brownian motion w.r.t. Q and in terms of B̂(t) the process Y(t) has the stochastic integral representation

  dY(t) = α(t, ω)dt + θ(t, ω)dB̂(t).   (8.6.21)

Remark. As in the Remark following Theorem 8.6.4 we note that the following Novikov condition is sufficient to guarantee that M_t is a martingale:

  E[exp(½∫_0^T u²(s, ω)ds)] < ∞.

Exercises

8.1. a) Find a bounded solution g of the initial value problem

  ∂g/∂t = ½Δg;   t > 0, x ∈ R^n,
  g(0, x) = φ(x),

where φ ∈ C_0² is given. (From general theory it is known that the solution is unique.)
b) Let ψ ∈ C_b(R^n) and α > 0. Find a bounded solution u of the equation

  (α − ½Δ)u = ψ   in R^n.

Prove that the solution is unique.

8.2.
Show that the solution u(t, x) of the initial value problem

  ∂u/∂t = ½β²x² ∂²u/∂x² + αx ∂u/∂x;   t > 0, x ∈ R,
  u(0, x) = f(x)   (f ∈ C_0²(R) given)

can be expressed as follows:

  u(t, x) = E[f(x · exp{βB_t + (α − ½β²)t})]
          = (1/√(2πt)) ∫_R f(x · exp{βy + (α − ½β²)t}) exp(−y²/(2t)) dy;   t > 0.
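For a quick numerical check of this representation one can pick a test function for which the expectation is explicit; with the (hypothetical) choice f(x) = x the right-hand side reduces to x·e^{αt}. A sketch with illustrative parameters:

```python
import math, random

# Numeric sanity check of the representation in Exercise 8.2 with the
# illustrative choice f(x) = x, for which u(t, x) = x * exp(alpha * t)
# in closed form.  All parameter values are illustrative.
random.seed(1)
alpha, beta, t, x = 0.3, 0.4, 2.0, 1.5

# Monte Carlo over B_t ~ N(0, t)
n = 200_000
mc = sum(x * math.exp(beta * random.gauss(0.0, math.sqrt(t))
                      + (alpha - 0.5 * beta * beta) * t)
         for _ in range(n)) / n

exact = x * math.exp(alpha * t)
print("Monte Carlo:", mc, " closed form:", exact)
```

The two numbers agree to within sampling error, consistent with the formula above.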
8.3.
(Kolmogorov's forward equation)
Let X_t be an Itô diffusion in R^n with generator

  Af(y) = Σ_{i,j} a_ij(y) ∂²f/(∂y_i∂y_j) + Σ_i b_i(y) ∂f/∂y_i;   f ∈ C_0²,

where a_ij ∈ C²(R^n) and b_i ∈ C¹(R^n) for all i, j, and assume that the transition measure of X_t has a density p_t(x, y), i.e. that

  E^x[f(X_t)] = ∫_{R^n} f(y)p_t(x, y)dy;   f ∈ C_0².   (8.6.34)

Assume that y → p_t(x, y) is smooth for each t, x. Prove that p_t(x, y) satisfies the Kolmogorov forward equation

  (d/dt)p_t(x, y) = A*_y p_t(x, y)   for all x, y,   (8.6.35)

where A*_y operates on the variable y and is given by

  A*_y φ(y) = Σ_{i,j} ∂²/(∂y_i∂y_j)(a_ij φ) − Σ_i ∂/∂y_i(b_i φ);   φ ∈ C²   (8.6.36)

(i.e. A*_y is the adjoint of A_y).
(Hint: By (8.6.34) and Dynkin's formula we have

  ∫_{R^n} f(y)p_t(x, y)dy = f(x) + ∫_0^t (∫_{R^n} A_y f(y)p_s(x, y)dy) ds;   f ∈ C_0².

Now differentiate w.r.t. t and use that

  ⟨Aφ, ψ⟩ = ⟨φ, A*ψ⟩   for φ ∈ C_0², ψ ∈ C²,

where ⟨·, ·⟩ denotes the inner product in L²(dy).)

8.4.
Let Bt be n-dimensional Brownian motion (n ≥ 1) and let F be a Borel set in Rn . Prove that the expected total length of time t that Bt stays in F is zero if and only if the Lebesgue measure of F is zero. Hint: Consider the resolvent Rα for α > 0 and then let α → 0.
8.5.
Show that the solution u(t, x) of the initial value problem

  ∂u/∂t = ρu + ½Δu;   t > 0, x ∈ R^n,
  u(0, x) = f(x)   (f ∈ C_0²(R^n) given)

(where ρ ∈ R is a constant) can be expressed by

  u(t, x) = (2πt)^{−n/2} exp(ρt) ∫_{R^n} f(y) exp(−|x − y|²/(2t)) dy.

8.6.
In connection with the deduction of the Black-Scholes formula for the price of an option (see Chapter 12) the following partial differential equation appears:

  ∂u/∂t = −ρu + αx ∂u/∂x + ½β²x² ∂²u/∂x²;   t > 0, x ∈ R,
  u(0, x) = (x − K)⁺;   x ∈ R,

where ρ > 0, α, β and K > 0 are constants and

  (x − K)⁺ = max(x − K, 0).

Use the Feynman-Kac formula to prove that the solution u of this equation is given by

  u(t, x) = (e^{−ρt}/√(2πt)) ∫_R (x · exp{(α − ½β²)t + βy} − K)⁺ e^{−y²/(2t)} dy;   t > 0.

(This expression can be simplified further. See Exercise 12.13.)
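This integral can be evaluated numerically and compared against the lognormal closed form e^{−ρt}(x e^{αt}Φ(d₁) − KΦ(d₂)), which follows from computing the expectation of the payoff directly; that closed form is a standard computation, not taken from the text. A quadrature sketch with illustrative parameters:

```python
import math

# Quadrature check of the Feynman-Kac formula from Exercise 8.6 against
# the lognormal closed form e^{-rho t}(x e^{alpha t} Phi(d1) - K Phi(d2)).
# Parameter values are illustrative.
rho, alpha, beta, K = 0.05, 0.1, 0.3, 1.0
t, x = 1.0, 1.2

def Phi(z):                        # standard normal CDF via erf
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# trapezoid rule over y in [-8 sqrt(t), 8 sqrt(t)]
n, a = 4000, 8.0 * math.sqrt(t)
h = 2.0 * a / n
integral = 0.0
for i in range(n + 1):
    y = -a + i * h
    payoff = max(x * math.exp((alpha - 0.5 * beta**2) * t + beta * y) - K, 0.0)
    w = 0.5 if i in (0, n) else 1.0
    integral += w * payoff * math.exp(-y * y / (2.0 * t))
u = math.exp(-rho * t) / math.sqrt(2.0 * math.pi * t) * integral * h

d1 = (math.log(x / K) + (alpha + 0.5 * beta**2) * t) / (beta * math.sqrt(t))
d2 = d1 - beta * math.sqrt(t)
closed = math.exp(-rho * t) * (x * math.exp(alpha * t) * Phi(d1) - K * Phi(d2))
print(u, closed)   # the two values should agree to several decimals
```

Note that the drift in the exponent is α, not ρ; only the discount factor in front uses ρ, exactly as in the formula above.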
8.7. Let X_t be a sum of Itô integrals of the form

  X_t = Σ_{k=1}^n ∫_0^t v_k(s, ω)dB_k(s),

where (B_1, ..., B_n) is n-dimensional Brownian motion. Assume that

  β_t := ∫_0^t Σ_{k=1}^n v_k²(s, ω)ds → ∞   as t → ∞, a.s.

Prove that

  lim sup_{t→∞} X_t/√(2β_t log log β_t) = 1   a.s.

(Hint: Use the law of the iterated logarithm.)

8.8.
Let Z_t be a 1-dimensional Itô process of the form

  dZ_t = u(t, ω)dt + dB_t.

Let G_t be the σ-algebra generated by {Z_s(·); s ≤ t} and define

  dN_t = (u(t, ω) − E[u|G_t])dt + dB_t.
Use Corollary 8.4.5 to prove that Nt is a Brownian motion. (If we interpret Zt as the observation process, then Nt is the innovation process. See Lemma 6.2.6.) 8.9.
Define α(t) = ½ ln(1 + ⅔t³). If B_t is a Brownian motion, prove that there exists another Brownian motion B̃_r such that

  ∫_0^{α(t)} e^s dB_s = ∫_0^t r dB̃_r.

8.10. Let B_t be a Brownian motion in R. Show that X_t := B_t² is a weak solution of the stochastic differential equation

  dX_t = dt + 2√|X_t| dB̃_t.   (8.6.37)
(Hint: Use Itô's formula to express X_t as a stochastic integral and compare with (8.6.37) by using Corollary 8.4.5.)

8.11.* a) Let Y(t) = t + B(t); t ≥ 0. For each T > 0 find a probability measure Q_T on F_T such that Q_T ∼ P and {Y(t)}_{t≤T} is Brownian motion w.r.t. Q_T. Use (8.6.9) to prove that there exists a probability measure Q on F_∞ such that

  Q|F_T = Q_T   for all T > 0.

b) Show that

  P[lim_{t→∞} Y(t) = ∞] = 1,

while

  Q[lim_{t→∞} Y(t) = ∞] = 0.

Why does this not contradict the Girsanov theorem?

8.12.* Let

  dY(t) = [0; 1] dt + [1 3; −1 −2] [dB_1(t); dB_2(t)];   t ≤ T.

Find a probability measure Q on F_T^(2) such that Q ∼ P and such that

  B̂(t) := [−3t; t] + [B_1(t); B_2(t)]

is a Brownian motion w.r.t. Q and

  dY(t) = [1 3; −1 −2] [dB̂_1(t); dB̂_2(t)].
8.13. Let b: R → R be a Lipschitz-continuous function and define X_t = X_t^x ∈ R by

  dX_t = b(X_t)dt + dB_t,   X_0 = x ∈ R.

a) Use the Girsanov theorem to prove that for all M < ∞, x ∈ R and t > 0 we have

  P[X_t^x ≥ M] > 0.

b) Choose b(x) = −r where r > 0 is constant. Prove that for all x

  X_t^x → −∞   as t → ∞ a.s.

Compare this with the result in a).

8.14. (Polar sets for the graph of Brownian motion)
Let B_t be 1-dimensional Brownian motion starting at x ∈ R.
a) Prove that for every fixed time t_0 > 0 we have

  P^x[B_{t_0} = 0] = 0.

b) Prove that for every (non-trivial) closed interval J ⊂ R_+ we have

  P^x[∃t ∈ J such that B_t = 0] > 0.
(Hint: If J = [t_1, t_2] consider P^x[B_{t_1} < 0 & B_{t_2} > 0] and then use the intermediate value theorem.)
c) In view of a) and b) it is natural to ask what closed sets F ⊂ R_+ have the property that

  P^x[∃t ∈ F such that B_t = 0] = 0.   (8.6.38)

To investigate this question more closely we introduce the graph X_t of Brownian motion, given by

  dX_t = [1; 0] dt + [0; 1] dB_t;   X_0 = [t_0; x_0],

i.e.

  X_t = X_t^{t_0,x_0} = [t_0 + t; B_t^{x_0}],   where B_0^{x_0} = x_0 a.s.

Then F satisfies (8.6.38) iff K := F × {0} is polar for X_t, in the sense that

  P^{t_0,x_0}[∃t > 0; X_t ∈ K] = 0   for all t_0, x_0.   (8.6.39)
The key to finding polar sets for a diffusion is to consider its Green operator R, which is simply the resolvent R_α with α = 0:

  Rf(t_0, x_0) = E^{t_0,x_0}[∫_{t_0}^∞ f(X_s)ds]   for f ∈ C_0(R²).

Show that

  Rf(t_0, x_0) = ∫_{R²} G(t_0, x_0; t, x)f(t, x)dt dx,

where

  G(t_0, x_0; t, x) = χ_{t>t_0} · (2π(t − t_0))^{−1/2} exp(−|x − x_0|²/(2(t − t_0)))   (8.6.40)

(G is the Green function of X_t.)
d) The capacity of K, C(K) = C_G(K), is defined by

  C(K) = sup{μ(K); μ ∈ M_G(K)},

where

  M_G(K) = {μ; μ measure on K s.t. ∫_K G(t_0, x_0; t, x)dμ(t, x) ≤ 1 for all t_0, x_0}.

A general result from stochastic potential theory states that

  P^{t_0,x_0}[X_t hits K] = 0 ⇔ C(K) = 0.   (8.6.41)

See e.g. Blumenthal and Getoor (1968, Prop. VI.4.3). Use this to prove that

  Λ_{1/2}(F) = 0 ⇒ P^{x_0}[∃t ∈ F such that B_t = 0] = 0,

where Λ_{1/2} denotes 1/2-dimensional Hausdorff measure (Folland (1984, §10.2)).

8.15. Let f ∈ C_0²(R^n) and α(x) = (α_1(x), ..., α_n(x)) with α_i ∈ C_0²(R^n) be given functions and consider the partial differential equation

  ∂u/∂t = Σ_{i=1}^n α_i(x) ∂u/∂x_i + ½ Σ_{i=1}^n ∂²u/∂x_i²;   t > 0, x ∈ R^n,
  u(0, x) = f(x);   x ∈ R^n.

a) Use the Girsanov theorem to show that the unique bounded solution u(t, x) of this equation can be expressed by
  u(t, x) = E^x[exp(∫_0^t α(B_s)dB_s − ½∫_0^t α²(B_s)ds) f(B_t)],

where E^x is the expectation w.r.t. P^x.
b) Now assume that α is a gradient, i.e. that there exists γ ∈ C¹(R^n) such that

  ∇γ = α.

Assume for simplicity that γ ∈ C_0²(R^n). Use Itô's formula to prove that (see Exercise 4.8)

  u(t, x) = exp(−γ(x)) E^x[exp(−½∫_0^t {|∇γ(B_s)|² + Δγ(B_s)}ds) exp(γ(B_t))f(B_t)].

c) Put v(t, x) = exp(γ(x))u(t, x). Use the Feynman-Kac formula to show that v(t, x) satisfies the partial differential equation

  ∂v/∂t = −½(|∇γ|² + Δγ)·v + ½Δv;   t > 0, x ∈ R^n,
  v(0, x) = exp(γ(x))f(x);   x ∈ R^n.

(See also Exercise 8.16.)
8.16. (A connection between B.m. with drift and killed B.m.)
Let B_t denote Brownian motion in R^n and consider the diffusion X_t in R^n defined by

  dX_t = ∇h(X_t)dt + dB_t;   X_0 = x ∈ R^n,   (8.6.42)

where h ∈ C_0¹(R^n).
a) There is an important connection between this process and the process Y_t obtained by killing B_t at a certain rate V. More precisely, first prove that for f ∈ C_0(R^n) we have

  E^x[f(X_t)] = E^x[exp(−∫_0^t V(B_s)ds) · exp(h(B_t) − h(x)) · f(B_t)],   (8.6.43)

where

  V(x) = ½|∇h(x)|² + ½Δh(x).   (8.6.44)
(Hint: Use the Girsanov theorem to express the left hand side of (8.6.43) in terms of B_t. Then use the Itô formula on Z_t = h(B_t) to achieve (8.6.44).)
b) Then use the Feynman-Kac formula to restate (8.6.43) as follows (assuming V ≥ 0):

  T_t^X(f, x) = exp(−h(x)) · T_t^Y(f · exp h, x),

where T_t^X, T_t^Y denote the transition operators of the processes X and Y, respectively, i.e. T_t^X(f, x) = E^x[f(X_t)] and similarly for Y.

8.17. Suppose Y(t) = [Y_1(t); Y_2(t)] ∈ R² is given by

  dY_1(t) = β_1(t)dt + dB_1(t) + 2dB_2(t) + 3dB_3(t),
  dY_2(t) = β_2(t)dt + dB_1(t) + 2dB_2(t) + 2dB_3(t),

where β_1, β_2 are bounded adapted processes. Show that there are infinitely many equivalent martingale measures Q for Y(t). (See Example 8.6.7.)

8.18. (The Girsanov theorem in stochastic control)
The following exercise is inspired by useful communications with Jerome Stein. It is mainly based on an example given in Fleming (1999), Chapter 2, Section 2.5. See also Platen and Rebolledo (1996).
Suppose we have a financial market with two investment possibilities:
(i) a risk free asset, where the unit price S_0(t) at time t is given by

  dS_0(t) = ρ(t)S_0(t)dt;   S_0(0) = 1;

(ii) a risky asset, where the unit price S_1(t) at time t is given by

  dS_1(t) = S_1(t)[μ(t)dt + σ(t)dB(t)];   S_1(0) > 0.

We assume that the coefficients ρ(t), μ(t) and σ(t) ≠ 0 are deterministic and satisfy

  ∫_0^T {|ρ(t)| + |μ(t)| + σ²(t)}dt < ∞.
A portfolio in this market can be represented by an F_t-adapted stochastic process π(t) which gives the fraction of the total wealth X(t) invested in the risky asset at time t. If we assume that π(t) is self-financing (see Chapter 12) then the corresponding wealth process X(t) = X_π(t) will have the dynamics

  dX(t) = (1 − π(t))X(t)ρ(t)dt + π(t)X(t)[μ(t)dt + σ(t)dB(t)],
  X(0) = x > 0 (constant).

The problem is to find the portfolio π* which maximizes the expected utility of the terminal wealth, E[U(X_π(T))], in the case when U is the power utility function, i.e.

  U(x) = x^γ   for some constant γ ∈ (0, 1).

(See Chapters 11 and 12 for further discussions of this type of optimal portfolio problems.)
a) Show that we can write

  E[X_π^γ(T)] = K E[Z_π exp(γ∫_0^T {(μ(t) − ρ(t))π(t) − ½(1 − γ)σ²(t)π²(t)}dt)],

where

  Z_π = Z_π(ω) = exp(∫_0^T γσ(t)π(t)dB(t) − ½γ²∫_0^T σ²(t)π²(t)dt)

and

  K = x^γ exp(γ∫_0^T ρ(t)dt)   (does not depend on π).

Define a new measure Q_π on F_T by

  dQ_π(ω) = Z_π(ω)dP(ω).

Then we can write

  E[X_π^γ(T)] = K E_{Q_π}[F(π)],

where

  F(π) = exp(γ∫_0^T {(μ(t) − ρ(t))π(t) − ½(1 − γ)σ²(t)π²(t)}dt).
b) By maximizing the integrand in the exponent of F(π) pointwise for each t, ω, show that

  F(π) ≤ F(π*) = exp((γ/2)∫_0^T (μ(t) − ρ(t))²/((1 − γ)σ²(t)) dt),

which is attained for

  π(t) = π*(t) := (μ(t) − ρ(t))/((1 − γ)σ²(t)).

c) Now assume, in addition to our previous assumptions, that the portfolios π we consider satisfy the Novikov condition

  E[exp((γ²/2)∫_0^T σ²(t)π²(t)dt)] < ∞.

Use this to conclude that

  sup_π E[X_π^γ(T)] = K exp((γ/2)∫_0^T (μ(t) − ρ(t))²/((1 − γ)σ²(t)) dt),

and if this is finite, then the optimal portfolio is π*(t), given in b) above.
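For constant coefficients everything in b) and c) is explicit, and the terminal wealth of a constant-fraction portfolio is lognormal, so the supremum can be checked by simulation. A sketch with illustrative parameter values (not from the text):

```python
import math, random

# Sketch of Exercise 8.18 b)-c) with constant coefficients (illustrative
# values).  For constant mu, rho, sigma the optimal fraction is
#   pi* = (mu - rho) / ((1 - gamma) sigma^2),
# and we Monte-Carlo E[X_{pi*}(T)^gamma] against the closed-form supremum.
random.seed(2)
mu, rho, sigma, gamma = 0.08, 0.03, 0.25, 0.5
T, x = 1.0, 1.0

pi_star = (mu - rho) / ((1.0 - gamma) * sigma**2)
value = x**gamma * math.exp(gamma * rho * T
                            + 0.5 * gamma * (mu - rho)**2 * T
                              / ((1.0 - gamma) * sigma**2))

# X(T) is lognormal for a constant-fraction portfolio
n = 200_000
drift = rho + pi_star * (mu - rho) - 0.5 * pi_star**2 * sigma**2
mc = sum((x * math.exp(drift * T + pi_star * sigma
                       * random.gauss(0.0, math.sqrt(T))))**gamma
         for _ in range(n)) / n

print("pi* =", pi_star, " closed-form value =", value, " Monte Carlo =", mc)
```

With these numbers π* = 1.6, i.e. the investor borrows to hold more than the total wealth in the risky asset, and the simulated expected utility matches the closed-form supremum.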
9 Applications to Boundary Value Problems
9.1 The Combined Dirichlet-Poisson Problem. Uniqueness

We now use results from the preceding chapters to solve the following generalization of the Dirichlet problem stated in the introduction:
Let D be a domain (open connected set) in R^n and let L denote a semi-elliptic partial differential operator on C²(R^n) of the form

  L = Σ_{i=1}^n b_i(x) ∂/∂x_i + Σ_{i,j=1}^n a_ij(x) ∂²/(∂x_i∂x_j),   (9.1.1)

where b_i(x) and a_ij(x) = a_ji(x) are continuous functions (see below). (By saying that L is semi-elliptic (resp. elliptic) we mean that all the eigenvalues of the symmetric matrix a(x) = [a_ij(x)]_{i,j=1}^n are non-negative (resp. positive) for all x.)

The Combined Dirichlet-Poisson Problem
Let φ ∈ C(∂D) and g ∈ C(D) be given functions. Find w ∈ C²(D) such that

  (i) Lw = −g in D,   (9.1.2)
  (ii) lim_{x→y, x∈D} w(x) = φ(y) for all y ∈ ∂D.   (9.1.3)

The idea of the solution is the following: First we find an Itô diffusion {X_t} whose generator A coincides with L on C_0²(R^n). To achieve this we simply choose a square root σ(x) ∈ R^{n×n} of the matrix 2a(x), i.e.

  ½σ(x)σ^T(x) = [a_ij(x)].   (9.1.4)

B. Øksendal, Stochastic Differential Equations, Universitext, DOI 10.1007/978-3-642-14394-6 9, © Springer-Verlag Berlin Heidelberg 2013
We assume that σ(x) and b(x) = [bi (x)] satisfy conditions (5.2.1) and (5.2.2) of Theorem 5.2.1. (For example, if each aij ∈ C 2 (D) is bounded and has bounded first and second partial derivatives, then such a square root σ can be found. See Fleming and Rishel (1975).) Next we let Xt be the solution of dXt = b(Xt )dt + σ(Xt )dBt
(9.1.5)
where B_t is n-dimensional Brownian motion. As usual we let E^x denote expectation with respect to the probability law Q^x of X_t starting at x ∈ R^n. Then our candidate for the solution w of (9.1.2), (9.1.3) is

  w(x) = E^x[φ(X_{τ_D}) · χ_{τ_D<∞}] + E^x[∫_0^{τ_D} g(X_t)dt].

A measurable set G ⊂ R^n is called thin (for X_t) if Q^x[T_G = 0] = 0 for all x, where T_G = inf{t > 0; X_t ∈ G} is the first hitting time of G. (Intuitively: For all starting points the process does not hit G immediately, a.s.) A countable union of thin sets is called semipolar. A measurable set F ⊂ R^n is called polar (for X_t) if Q^x[T_F < ∞] = 0 for all x. (Intuitively: For all starting points the process never hits F, a.s.) Clearly every polar set is semipolar, but the converse need not be true (consider the process in Example 9.2.1). However, condition (H) does hold for Brownian motion (see Blumenthal and Getoor (1968)). It follows from the Girsanov theorem that condition (H) holds for all Itô diffusions whose diffusion coefficient matrix has a bounded inverse and whose drift coefficient satisfies the Novikov condition for all T < ∞. We also need the following result, the proof of which can be found in Blumenthal and Getoor (1968, Prop. II.3.3):

Lemma 9.2.12 Let U ⊂ D be open and let I denote the set of irregular points of U. Then I is a semipolar set.

Theorem 9.2.13 Suppose X_t satisfies Hunt's condition (H). Let φ be a bounded continuous function on ∂D. Suppose there exists a bounded u ∈ C²(D) such that
  (i) Lu = 0 in D,
  (ii)_s lim_{x→y, x∈D} u(x) = φ(y) for all regular y ∈ ∂D.
Then u(x) = E^x[φ(X_{τ_D})].
9.2 The Dirichlet Problem. Regular Points
Proof. Let {D_k} be as in the proof of Theorem 9.1.1. By Lemma 9.2.3 b) u is X-harmonic and therefore

  u(x) = E^x[u(X_{τ_k})]   for all x ∈ D_k and all k.

If k → ∞ then X_{τ_k} → X_{τ_D} and so u(X_{τ_k}) → φ(X_{τ_D}) if X_{τ_D} is regular. From Lemma 9.2.12 we know that the set I of irregular points of ∂D is semipolar. So by condition (H) the set I is polar and therefore X_{τ_D} ∉ I a.s. Q^x. Hence

  u(x) = lim_{k→∞} E^x[u(X_{τ_k})] = E^x[φ(X_{τ_D})],   as claimed. □

Under what conditions is the solution u of the stochastic Dirichlet problem (9.2.6), (9.2.7) also a solution of the generalized Dirichlet problem (9.2.13), (9.2.14)? This is a difficult question and we will content ourselves with the following partial answer:

Theorem 9.2.14 Suppose L is uniformly elliptic in D, i.e. the eigenvalues of [a_ij] are bounded away from 0 in D. Let φ be a bounded continuous function on ∂D. Put

  u(x) = E^x[φ(X_{τ_D})].

Then u ∈ C^{2+α} locally in D for all α < 1 and u solves the Dirichlet problem (9.2.13), (9.2.14), i.e.
  (i) Lu = 0 in D,
  (ii)_r lim_{x→y, x∈D} u(x) = φ(y) for all regular y ∈ ∂D.
Remark. If k is an integer, α > 0 and G is an open set, C^{k+α}(G) denotes the set of functions on G whose partial derivatives up to k'th order are Lipschitz (Hölder) continuous with exponent α.

Proof. Choose an open ball Δ with Δ̄ ⊂ D and let f ∈ C(∂Δ). Then, from the general theory of partial differential equations, for all α < 1 there exists a continuous function u on Δ̄ such that u|Δ ∈ C^{2+α}(Δ) and

  Lu = 0 in Δ,   (9.2.16)
  u = f on ∂Δ   (9.2.17)

(see e.g. Dynkin (1965 II, p. 226)). Since u|Δ ∈ C^{2+α}(Δ) we have: If K is any compact subset of Δ there exists a constant C only depending on K and the C^α-norms of the coefficients of L such that

  ‖u‖_{C^{2+α}(K)} ≤ C(‖Lu‖_{C^α(Δ)} + ‖u‖_{C(Δ)}).   (9.2.18)

(See Bers, John and Schechter (1964, Theorem 3, p. 232).) Combining (9.2.16), (9.2.17) and (9.2.18) we obtain

  ‖u‖_{C^{2+α}(K)} ≤ C‖f‖_{C(∂Δ)}.   (9.2.19)

By uniqueness (Theorem 9.2.13) we know that

  u(x) = ∫ u(y)dμ_x(y),   (9.2.20)

where dμ_x = Q^x[X_{τ_Δ} ∈ dy] is the first exit distribution of X_t from Δ. From (9.2.19) it follows that

  |∫ f dμ_{x_1} − ∫ f dμ_{x_2}| ≤ C‖f‖_{C(∂Δ)}|x_1 − x_2|^α;   x_1, x_2 ∈ K.   (9.2.21)

Therefore

  ‖μ_{x_1} − μ_{x_2}‖ ≤ C|x_1 − x_2|^α;   x_1, x_2 ∈ K,   (9.2.22)

where ‖·‖ denotes the operator norm on measures on ∂Δ. So if g is any bounded measurable function on ∂Δ we know that the function

  ĝ(x) = ∫ g(y)dμ_x(y) = E^x[g(X_{τ_Δ})]

belongs to the class C^α(K). Since u(x) = E^x[u(X_{τ_U})] for all open sets U with Ū ⊂ D and x ∈ U (Lemma 9.2.4), this applies to g = u and we conclude that u ∈ C^α(M) for any compact subset M of D. We may therefore apply the solution to the problem (9.2.16), (9.2.17) once more, this time with f = u, and this way we obtain that u(x) = E^x[u(X_{τ_Δ})] belongs to C^{2+α}(M) for any compact M ⊂ D. Therefore (i) holds by Lemma 9.2.3 a).

To obtain (ii)_r we apply a theorem from the theory of parabolic differential equations: The Kolmogorov backward equation

  Lv = ∂v/∂t

has a fundamental solution v = p(t, x, y) jointly continuous in t, x, y for t > 0 and bounded in x, y for each fixed t > 0 (see Dynkin (1965 II), Theorem 0.4 p. 227). It follows (by bounded convergence) that the process X_t is a strong Feller process, in the sense that the function

  x → E^x[f(X_t)] = ∫_{R^n} f(y)p(t, x, y)dy
is continuous, for all t > 0 and all bounded, measurable functions f. In general we have: If X_t is a strong Feller Itô diffusion and D ⊂ R^n is open, then

  lim_{x→y, x∈D} E^x[φ(X_{τ_D})] = φ(y)   for all regular y ∈ ∂D and bounded φ ∈ C(∂D).   (9.2.23)

(See Theorem 13.3 p. 32–33 in Dynkin (1965 II).) Therefore u satisfies property (ii)_r and the proof is complete. □
Example 9.2.15 We have already seen (Example 9.2.1) that condition (9.1.3) does not hold in general. This example shows that it need not hold even when L is elliptic: Consider Example 9.2.11 again, in the case when the point 0 is not regular. Choose φ ∈ C(∂D) such that

  φ(0) = 1,   0 ≤ φ(y) < 1 for y ∈ ∂D \ {0}.

Since {0} is polar for B_t (see Exercise 9.7 a)) we have B_{τ_D} ≠ 0 a.s. and therefore

  u(0) = E^0[φ(B_{τ_D})] < 1.

By a slight extension of the mean value property (7.2.9) (see Exercise 9.4) we get

  E^0[u(X_{σ_k})] = E^0[φ(X_{τ_D})] = u(0) < 1,   (9.2.24)

where

  σ_k = inf{t > 0; B_t ∉ D ∩ {|x| < 1/k}},   k = 1, 2, ...

This implies that it is impossible that u(x) → 1 as x → 0. Therefore (9.1.3) does not hold in this case.
In general one can show that the regular points for Brownian motion are exactly the regular points in the classical potential theoretic sense, i.e. the points y on ∂D where the limit of the generalized Perron-Wiener-Brelot solution coincides with φ(y), for all φ ∈ C(∂D). See Doob (1984), Port and Stone (1978) or Rao (1977).

Example 9.2.16 Let D denote the infinite strip

  D = {(t, x) ∈ R²; x < R},   where R ∈ R,

and let L be the differential operator

  Lf(t, x) = ∂f/∂t + ½ ∂²f/∂x²;   f ∈ C²(D).

An Itô diffusion whose generator coincides with L on C_0²(R²) is (see Example 9.2.10)

  X_t = (s + t, B_t);   t ≥ 0,

and all the points of ∂D are regular for this process. It is not hard to see that in this case (9.1.14) holds, i.e. τ_D < ∞ a.s. (see Exercise 7.4). Assume that φ is a bounded continuous function on ∂D = {(t, R); t ∈ R}. Then by Theorem 9.2.5 the function

  u(s, x) = E^{s,x}[φ(X_{τ_D})]

is the solution of the stochastic Dirichlet problem (9.2.6), (9.2.7), where E^{s,x} denotes expectation w.r.t. the probability law Q^{s,x} for X starting at (s, x). Does u also solve the problem (9.2.13), (9.2.14)?

Using the Laplace transform it is possible to find the distribution of the first exit point on ∂D for X, i.e. to find the distribution of the first time t = τ that B_t reaches the value R. (See Karlin and Taylor (1975), p. 363. See also Exercise 7.19.) The result is

  P^x[τ ∈ dt] = g(x, t)dt,

where

  g(x, t) = (R − x)(2πt³)^{−1/2} exp(−(R − x)²/(2t))   for t > 0;   g(x, t) = 0 for t ≤ 0.

Thus the solution u may be written

  u(s, x) = ∫_0^∞ φ(s + t, R)g(x, t)dt = ∫_s^∞ φ(r, R)g(x, r − s)dr.   (9.2.25)
From the explicit formula for u it is clear that ∂u/∂s and ∂²u/∂x² are continuous, and we conclude that Lu = 0 in D by Lemma 9.2.3. So u satisfies (9.2.13). What about property (9.2.14)? It is not hard to see that for t > 0

  E^{t_0,x}[f(X_t)] = (2πt)^{−1/2} ∫_R f(t_0 + t, y) exp(−|x − y|²/(2t)) dy

for all bounded, (t, x)-measurable functions f. (See (2.2.2).) Therefore X_t is not a strong Feller process, so we cannot appeal to (9.2.23) to obtain (9.2.14). However, it is easy to verify directly that if |y| = R, t_1 > 0 then for all ε > 0 there exists δ > 0 such that

  |x − y| < δ, |t − t_1| < δ ⇒ Q^{t,x}[X_{τ_D} ∈ N] ≥ 1 − ε,

where N = [t_1 − ε, t_1 + ε] × {y}. And this is easily seen to imply (9.2.14).
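The first-passage density g(x, t) of Example 9.2.16 can be checked numerically: its integral over (0, t_0] must equal the reflection-principle probability P^x[τ ≤ t_0] = 2(1 − Φ((R − x)/√t_0)), a standard fact about Brownian first passage that is not derived in the text. A sketch:

```python
import math

# Numeric check of the first-passage density in Example 9.2.16:
# integrating g(x,t) = (R-x)(2 pi t^3)^(-1/2) exp(-(R-x)^2/(2t)) over
# (0, t0] should reproduce the reflection-principle probability
# P^x[tau <= t0] = 2(1 - Phi((R-x)/sqrt(t0))).  Values are illustrative.
R, x, t0 = 1.0, 0.0, 2.0

def g(t):
    return (R - x) / math.sqrt(2.0 * math.pi * t**3) \
           * math.exp(-(R - x)**2 / (2.0 * t))

# trapezoid rule on (eps, t0]; the density vanishes rapidly as t -> 0+
n, eps = 20000, 1e-8
h = (t0 - eps) / n
integral = sum((0.5 if i in (0, n) else 1.0) * g(eps + i * h)
               for i in range(n + 1)) * h

Phi = lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
closed = 2.0 * (1.0 - Phi((R - x) / math.sqrt(t0)))
print(integral, closed)   # should agree closely
```

The agreement confirms that g integrates to the correct first-passage probabilities, as required for (9.2.25) to make sense.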
9.3 The Poisson Problem ∂ 2 Let L = aij ∂x∂i ∂xj + bi ∂x be a semi-elliptic partial differential operator i n on a domain D ⊂ R as before and let Xt be an associated Itˆo diffusion, described by (9.1.4) and (9.1.5). In this section we study the Poisson problem (9.2.3), (9.2.4). For the same reasons as in Section 9.2 we generalize the problem to the following: The Generalized Poisson Problem Given a continuous function g on D find a C 2 function v in D such that a) b)
Lv = −g in D lim v(x) = 0 for all regular y ∈ ∂D x→y
(9.3.1) (9.3.2)
x∈D
Again we will first study a stochastic version of the problem and then investigate the relation between the corresponding stochastic solution and the solution (if it exists) of (9.3.1), (9.3.2): Theorem 9.3.1 (Solution of the stochastic Poisson problem) Assume that $ # τD x |g(Xs )|ds < ∞ for all x ∈ D . (9.3.3) E 0
(This occurs, for example, if g is bounded and E^x[τ_D] < ∞ for all x ∈ D.) Define

  v(x) = E^x[∫_0^{τ_D} g(X_s)ds].   (9.3.4)

Then

  Av = −g in D,   (9.3.5)

and

  lim_{t↑τ_D} v(X_t) = 0   a.s. Q^x, for all x ∈ D.   (9.3.6)

Proof. Choose U open, x ∈ U ⊂⊂ D. Put η = ∫_0^{τ_D} g(X_s)ds, τ = τ_U. Then by the strong Markov property (7.2.5)

  (E^x[v(X_τ)] − v(x))/E^x[τ] = (1/E^x[τ])(E^x[E^{X_τ}[η]] − E^x[η])
    = (1/E^x[τ])(E^x[E^x[θ_τ η|F_τ]] − E^x[η]) = (1/E^x[τ])E^x[θ_τ η − η].

Approximate η by sums of the form

  η^(k) = Σ_i g(X_{t_i}) χ_{t_i<τ_D} Δt_i.

…) is bounded, then the function w given by (9.3.15) solves the Dirichlet-Poisson problem, i.e.
  Lw = −g in D   (9.3.19)

and

  lim_{x→y, x∈D} w(x) = φ(y)   for all regular y ∈ ∂D.   (9.3.20)
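A simple numerical illustration of the stochastic solution (9.3.4): taking g ≡ 1 on D = (0, 1) gives v(x) = E^x[τ_D], which solves ½v'' = −1 with zero boundary values, i.e. v(x) = x(1 − x). A simple-random-walk approximation of Brownian motion (illustrative step size, not from the text) reproduces this:

```python
import random

# Monte Carlo sketch of the stochastic Poisson solution (9.3.4) with
# g = 1: for Brownian motion on D = (a, b), v(x) = E^x[tau_D] solves
# (1/2) v'' = -1, v(a) = v(b) = 0, i.e. v(x) = (x - a)(b - x).
# Brownian motion is approximated by a simple random walk on a grid of
# mesh h, each step advancing time by h^2 (illustrative discretisation).
random.seed(3)
a, b, h = 0.0, 1.0, 0.05
N = round((b - a) / h)        # number of grid cells
k0 = 6                        # start at x = a + k0 * h = 0.3
x = a + k0 * h
n_paths = 20_000

total_steps = 0
for _ in range(n_paths):
    k = k0
    while 0 < k < N:          # walk on the integer lattice until exit
        k += 1 if random.random() < 0.5 else -1
        total_steps += 1

estimate = (total_steps / n_paths) * h * h   # mean exit time ~ steps * h^2
print("E^x[tau] ≈", estimate, " exact:", x * (1.0 - x))
```

For the simple random walk the expected number of steps to exit is exactly k_0(N − k_0), so the estimate converges to x(1 − x) as the number of paths grows.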
The Green Measure

The solution v given by the formula (9.3.4) may be rewritten as follows:

Definition 9.3.4 The Green measure (of X_t w.r.t. D at x), G(x, ·), is defined by

  G(x, H) = E^x[∫_0^{τ_D} χ_H(X_s)ds],   H ⊂ R^n Borel set,   (9.3.21)

or

  ∫ f(y)G(x, dy) = E^x[∫_0^{τ_D} f(X_s)ds],   f bounded, continuous.   (9.3.22)

In other words, G(x, H) is the expected length of time the process stays in H before it exits from D. If X_t is Brownian motion, then

  G(x, H) = ∫_H G(x, y)dy,

where G(x, y) is the classical Green function w.r.t. D and dy denotes Lebesgue measure. See Doob (1984), Port and Stone (1978) or Rao (1977). See also Example 9.3.6 below. Note that using the Fubini theorem we obtain the following relation between the Green measure G and the transition measure for X_t in D, Q_t^D(x, H) = Q^x[X_t ∈ H, t < τ_D]:

  G(x, H) = E^x[∫_0^∞ χ_H(X_s) · χ_{[0,τ_D)}(s)ds] = ∫_0^∞ Q_t^D(x, H)dt.   (9.3.23)

From (9.3.22) we get

  v(x) = E^x[∫_0^{τ_D} g(X_s)ds] = ∫_D g(y)G(x, dy),   (9.3.24)
202
9 Applications to Boundary Value Problems
Also note that by using the Green function, we may regard the Dynkin formula as a generalization of the classical Green formula: Corollary 9.3.5 (The Green formula) Let E x [τD ] < ∞ for all x ∈ D and assume that f ∈ C02 (Rn ). Then
f (x) = E x [f (XτD )] − (LX f )(y)G(x, dy) . (9.3.25) D
In particular, if f ∈ C02 (D) we have
f (x) = − (LX f )(y)G(x, dy) .
(9.3.26)
D
(As before LX =
∂ bi ∂x + i
1 2
2
(σσ T )ij ∂x∂i ∂xj when
dXt = b(Xt )dt + σ(Xt )dBt . ) Proof. By Dynkin’s formula and (9.3.24) we have $ # τD
E [f (XτD )] = f (x) + E (LX f )(Xs )ds = f (x) + (LX f )(y)G(x, dy) . x
x
0
D
Remark. Combining (9.3.8) with (9.3.26) we see that if E x [τK ] < ∞ for all compacts K ⊂ D and all x ∈ D, then −R is the inverse of the operator A on C02 (D) : for all f ∈ C02 (D) .
A(Rf ) = R(Af ) = −f ,
(9.3.27)
More generally, for all α ≥ 0 we get the following analogue of Theorem 8.1.5: (A − α)(Rα f ) = Rα (A − α)f = −f
for all f ∈ C02 (D) .
(9.3.28)
The first part of this is already established in (9.3.10) and the second part follows from the following useful extension of the Dynkin formula E x [e−ατ f (Xτ )] = f (x) + E x
# τ
$ e−αs (A − α)f (Xs )ds .
(9.3.29)
0
If α > 0 this is valid for all stopping times τ ≤ ∞ and all f ∈ C02 (Rn ). (See Exercise 9.6.) Example 9.3.6 If Xt = Bt is 1-dimensional Brownian motion in a bounded interval (a, b) ⊂ R then we can compute the Green function G(x, y) explicitly. To this end, choose a bounded continuous function g: (a, b) → R and let us compute
  v(x) := E^x[∫_0^{τ_D} g(B_t)dt].

By Corollary 9.1.2 we know that v is the solution of the differential equation

  ½v''(x) = −g(x);   x ∈ (a, b),
  v(a) = v(b) = 0.

Integrating twice and using the boundary conditions we get

  v(x) = (2(x − a)/(b − a)) ∫_a^b (∫_a^y g(z)dz) dy − 2∫_a^x (∫_a^y g(z)dz) dy.

Changing the order of integration we can rewrite this as

  v(x) = ∫_a^b g(y)G(x, y)dy,

where

  G(x, y) = 2(x − a)(b − y)/(b − a) − 2(x − y) · χ_{(−∞,x)}(y).   (9.3.30)

We conclude that the Green function of Brownian motion in the interval (a, b) is given by (9.3.30).
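Formula (9.3.30) can be verified numerically: the function v(x) = ∫g(y)G(x, y)dy should satisfy ½v'' = −g at interior points. A finite-difference sketch with an illustrative test function g (not from the text):

```python
import math

# Numeric check of the Green function (9.3.30) on (a, b) = (0, 1):
# v(x) = integral of g(y) G(x, y) dy should solve (1/2) v'' = -g with
# v(0) = v(1) = 0.  The test function g is illustrative.
a, b = 0.0, 1.0
g = lambda y: math.sin(3.0 * y) + 1.0

def G(x, y):
    base = 2.0 * (x - a) * (b - y) / (b - a)
    return base - 2.0 * (x - y) if y < x else base

def v(x, n=4000):
    # trapezoid rule; the kink of G at y = x falls on a grid node below
    h = (b - a) / n
    return sum((0.5 if i in (0, n) else 1.0) * g(a + i * h) * G(x, a + i * h)
               for i in range(n + 1)) * h

# finite-difference check of (1/2) v'' = -g at an interior point
x, d = 0.4, 1e-3
second = (v(x + d) - 2.0 * v(x) + v(x - d)) / (d * d)
print(0.5 * second, -g(x))   # should be approximately equal
```

Note that G(x, y) = 2·min(x, y)(1 − max(x, y)) on (0, 1), the classical symmetric Green function of ½d²/dx².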
In higher dimensions n the Green function y → G(x, y) of Brownian motion starting at x will not be continuous at x. It will have a logarithmic singularity (i.e. a singularity of order ln(1/|x − y|)) for n = 2 and a singularity of the order |x − y|^{2−n} for n > 2.
Exercises

9.1.* In each of the cases below find an Itô diffusion whose generator coincides with L on C_0²:

a) Lf(t, x) = α ∂f/∂t + ½β² ∂²f/∂x²;   α, β constants
b) Lf(x_1, x_2) = a ∂f/∂x_1 + b ∂f/∂x_2 + ½(∂²f/∂x_1² + ∂²f/∂x_2²);   a, b constants
c) Lf(x) = αxf'(x) + ½β²f''(x);   α, β constants
d) Lf(x) = αf'(x) + ½β²x²f''(x);   α, β constants
e) Lf(x_1, x_2) = ln(1 + x_1²) ∂f/∂x_1 + x_2 ∂f/∂x_2 + x_2² ∂²f/∂x_1² + 2x_1x_2 ∂²f/(∂x_1∂x_2) + 2x_1² ∂²f/∂x_2².
9.2.* Use Theorem 9.3.3 to find the bounded solutions of the following boundary value problems:

(i) ∂u/∂t + ½ ∂²u/∂x² = φ(t, x); 0 < t < T, x ∈ R,
    u(T, x) = ψ(x); x ∈ R,
where φ, ψ are given bounded, continuous functions.

(ii) αxu'(x) + ½β²x²u''(x) = 0; 0 < x < x_0,
     u(x_0) = x_0²,
where α, β are given constants, α ≥ ½β².

(iii) If α < ½β² there are infinitely many bounded solutions of (ii), and an additional boundary condition e.g. at x = 0 is needed to provide uniqueness. Explain this in view of Theorem 9.3.3.

9.3.* Write down (using Brownian motion) and compare the solutions u(t, x) of the following two boundary value problems:
a) ∂u/∂t + ½Δu = 0 for 0 < t < T, x ∈ R^n;  u(T, x) = φ(x) for x ∈ R^n.
b) ∂u/∂t − ½Δu = 0 for 0 < t < T, x ∈ R^n;  u(0, x) = ψ(x) for x ∈ R^n.

9.4.
Let G and H be bounded open subsets of $\mathbb{R}^n$, $G \subset H$, and let $B_t$ be n-dimensional Brownian motion. Use the property (H) for $B_t$ to prove that
$$\inf\{t > 0;\ B_t \notin H\} = \inf\{t > \tau_G;\ B_t \notin H\} \quad \text{a.s.,}$$
i.e., with the terminology of (7.2.6),
$$\tau_H = \tau_H^{\alpha} , \quad \text{where } \alpha = \tau_G .$$
Use this to prove that if Xt = Bt then the mean value property (7.2.9) holds for all bounded open G ⊂ H, i.e. it is not necessary to require G ⊂⊂ H in this case. This verifies the statement (9.2.24).
9.5.
(The eigenvalues of the Laplacian) Let $D \subset \mathbb{R}^n$ be open, bounded and let $\lambda \in \mathbb{R}$.
a) Suppose there exists a solution $u \in C^2(D) \cap C(\bar D)$, u not identically zero, such that
$$-\tfrac{1}{2}\Delta u = \lambda u \ \text{ in } D , \qquad u = 0 \ \text{ on } \partial D . \tag{9.3.31}$$
Show that we must have $\lambda > 0$. (Hint: If $\frac{1}{2}\Delta u = -\lambda u$ in D then $\langle\frac{1}{2}\Delta u, u\rangle = -\lambda\langle u, u\rangle$, where
$$\langle u, v\rangle = \int_D u(x)v(x)\,dx .$$
Now use integration by parts.)
b) It can be shown that if D is smooth then there exist $0 < \lambda_0 < \lambda_1 < \cdots < \lambda_n < \cdots$, where $\lambda_n \to \infty$, such that (9.3.31) holds for $\lambda = \lambda_n$, $n = 0, 1, 2, \ldots$, and for no other values of $\lambda$. The numbers $\{\lambda_n\}$ are called the eigenvalues of the operator $-\frac{1}{2}\Delta$ in the domain D and the corresponding (nontrivial) solutions $u_n$ of (9.3.31) are called the eigenfunctions. There is an interesting probabilistic interpretation of the lowest eigenvalue $\lambda_0$. The following result indicates this: Put $\tau = \tau_D = \inf\{t > 0;\ B_t \notin D\}$, choose $\rho > 0$ and define
$$w_\rho(x) = E^x[\exp(\rho\tau)] ; \qquad x \in D .$$
Prove that if $w_\rho(x) < \infty$ for all $x \in D$ then $\rho$ is not an eigenvalue for $-\frac{1}{2}\Delta$. (Hint: Let u be a solution of (9.3.31) with $\lambda = \rho$. Apply Dynkin's formula to the process $dY_t = (dt, dB_t)$ and the function $f(t,x) = e^{\rho t}u(x)$ to deduce that $u(x) = 0$ for $x \in D$.)
c) Conclude that
$$\lambda_0 \ge \sup\{\rho;\ E^x[\exp(\rho\tau)] < \infty \text{ for all } x \in D\} .$$
(We have in fact equality here. See for example Durrett (1984), Chap. 8B.)

9.6. Prove formula (9.3.29), for example by applying the Dynkin formula to the process
$$dY_t = \begin{pmatrix} dt \\ dX_t \end{pmatrix}$$
and the function $g(y) = g(t,x) = e^{-\alpha t}f(x)$.
9.7. a) Let $B_t$ be Brownian motion in $\mathbb{R}^2$. Prove that
$$P^x[\exists\, t > 0;\ B_t = y] = 0 \quad \text{for all } x, y \in \mathbb{R}^2 .$$
(Hint: First assume $x \neq y$. We may choose $y = 0$. One possible approach would be to apply Dynkin's formula with
$$f(u) = \ln|u|$$
and $\tau = \inf\{t > 0;\ |B_t| \le \rho \text{ or } |B_t| \ge R\}$, where $0 < \rho < R$. Let $\rho \to 0$ and then $R \to \infty$. If $x = y$ consider $P^x[\exists\, t > \varepsilon;\ B_t = x]$ and use the Markov property.)
b) Let $B_t = (B_t^{(1)}, B_t^{(2)})$ be Brownian motion in $\mathbb{R}^2$. Prove that $\tilde B_t = (-B_t^{(1)}, B_t^{(2)})$ is also a Brownian motion.
c) Prove that $0 \in \mathbb{R}^2$ is a regular boundary point (for Brownian motion) of the plane region
$$D = \{(x_1,x_2) \in \mathbb{R}^2;\ x_1^2 + x_2^2 < 1\} \setminus \{(x_1,0);\ x_1 \ge 0\} .$$
d) Prove that $0 \in \mathbb{R}^3$ is an irregular boundary point (for Brownian motion) of the 3-dimensional region
$$U = \{(x_1,x_2,x_3) \in \mathbb{R}^3;\ x_1^2 + x_2^2 + x_3^2 < 1\} \setminus \{(x_1,0,0);\ x_1 \ge 0\} .$$
9.8.* a) Find an Itô diffusion $X_t$ and a measurable set G which is semipolar but not polar for $X_t$.
b) Find an Itô diffusion $X_t$ and a countable family of thin sets $H_k$; $k = 1, 2, \ldots$ such that $\bigcup_{k=1}^{\infty} H_k$ is not thin.
9.9.
a) Let $X_t$ be an Itô diffusion in $\mathbb{R}^n$ and assume that g is a non-constant locally bounded real $X_t$-harmonic function on a connected open set $G \subset \mathbb{R}^n$. Prove that g satisfies the following weak form of the maximum principle: g does not have a (local or global) maximum at any point of G. (Similarly, g satisfies the minimum principle.)
b) Give an example to show that a non-constant bounded $X_t$-harmonic function g can have a (non-strict) global maximum. (Hint: Consider uniform motion to the right.)
9.10.* Find the (stochastic) solution $f(t,x)$ of the boundary value problem
$$K(x)e^{-\rho t} + \frac{\partial f}{\partial t} + \alpha x\frac{\partial f}{\partial x} + \tfrac{1}{2}\beta^2 x^2\frac{\partial^2 f}{\partial x^2} = 0 \quad \text{for } x > 0,\ 0 < t < T$$
$$f(T,x) = e^{-\rho T}\phi(x) \quad \text{for } x > 0$$
where $K, \phi$ are given functions and $T, \rho, \alpha, \beta$ are constants, $\rho > 0$, $T > 0$. (Hint: Consider $dY_t = (dt, dX_t)$ where $X_t$ is a geometric Brownian motion.)

9.11. a) The Poisson kernel is defined by
$$P_r(\theta) = \frac{1 - r^2}{1 - 2r\cos\theta + r^2} = \frac{1 - |z|^2}{|1 - z|^2}$$
where $r \ge 0$, $\theta \in [0, 2\pi]$, $z = re^{i\theta} \in \mathbb{C}$ ($i = \sqrt{-1}$). The Poisson formula states that if D denotes the open unit disk in the plane $\mathbb{R}^2 = \mathbb{C}$ and $h \in C(\bar D)$ satisfies $\Delta h = 0$ in D then
$$h(re^{i\theta}) = \frac{1}{2\pi}\int_0^{2\pi} P_r(t - \theta)\,h(e^{it})\,dt .$$
Prove that the probability that Brownian motion, starting from z ∈ D, first exits from D at a set F ⊂ ∂D is given by
$$\frac{1}{2\pi}\int_F P_r(t - \theta)\,dt , \quad \text{where } z = re^{i\theta} .$$
b) The function
$$w = \phi(z) = i\,\frac{1+z}{1-z}$$
maps the disc $D = \{|z| < 1\}$ conformally onto the half plane $H = \{w = u + iv;\ v > 0\}$, with $\phi(\partial D) = \mathbb{R}$ and $\phi(0) = i$. Use Example 8.5.9 to prove that if $\mu$ denotes the harmonic measure for Brownian motion at the point $i = (0,1)$ for the half plane H then
$$\int_{\mathbb{R}} f(\xi)\,d\mu(\xi) = \frac{1}{2\pi}\int_0^{2\pi} f(\phi(e^{it}))\,dt = \frac{1}{2\pi i}\int_{\partial D} \frac{f(\phi(z))}{z}\,dz .$$
c) Substitute $w = \phi(z)$ (i.e. $z = \psi(w) := \phi^{-1}(w) = \frac{w-i}{w+i}$) in the integral above to show that
$$\int_{\mathbb{R}} f(\xi)\,d\mu(\xi) = \frac{1}{\pi}\int_{\partial H} f(w)\,\frac{dw}{|w - i|^2} = \frac{1}{\pi}\int_{-\infty}^{\infty} \frac{f(x)}{x^2 + 1}\,dx .$$
d) Show that the harmonic measure $\mu_H^w$ for Brownian motion in H at the point $w = u + iv \in H$ is given by
$$d\mu_H^w(x) = \frac{1}{\pi}\cdot\frac{v}{(x-u)^2 + v^2}\,dx .$$
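The density in d) is the Cauchy density centered at u with scale v, so its integral over an interval $F = (a,b)$ has the closed form $\frac{1}{\pi}\big(\arctan\frac{b-u}{v} - \arctan\frac{a-u}{v}\big)$, obtained by direct integration. A Python sketch (the function names are ours) comparing this with a quadrature of the density:

```python
import math

def harmonic_measure_density(x, u, v):
    """Density of mu_H^w at x, for w = u + iv with v > 0 (part d))."""
    return v / (math.pi * ((x - u) ** 2 + v ** 2))

def measure_of_interval(a, b, u, v, n=100000):
    """Trapezoidal quadrature of the density over F = (a, b)."""
    h = (b - a) / n
    s = 0.0
    for k in range(n + 1):
        x = a + k * h
        w = 0.5 if k in (0, n) else 1.0
        s += w * harmonic_measure_density(x, u, v)
    return s * h

u, v = 0.5, 2.0
a, b = -1.0, 3.0
# Antiderivative of the Cauchy density: arctan((x - u) / v) / pi.
exact = (math.atan((b - u) / v) - math.atan((a - u) / v)) / math.pi
assert abs(measure_of_interval(a, b, u, v) - exact) < 1e-6
print("harmonic measure check passed")
```

In particular the total mass over $\mathbb{R}$ is $\frac{1}{\pi}\big(\frac{\pi}{2} - (-\frac{\pi}{2})\big) = 1$, as a harmonic measure must have.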
9.12. (A Feynman-Kac formula for boundary value problems) Let $X_t$ be an Itô diffusion on $\mathbb{R}^n$ whose generator coincides with a given partial differential operator L on $C_0^2(\mathbb{R}^n)$. Let $D, \phi$ and g be as in Theorem 9.3.3 and let $q(x) \ge 0$ be a continuous function on $\mathbb{R}^n$. Consider the boundary value problem: Find $h \in C^2(D) \cap C(\bar D)$ such that
$$Lh(x) - q(x)h(x) = -g(x) \ \text{ on } D ; \qquad \lim_{x \to y} h(x) = \phi(y) , \quad y \in \partial D .$$
Show that if a bounded solution h exists, then
$$h(x) = E^x\Big[\int_0^{\tau_D} e^{-\int_0^t q(X_s)\,ds}\,g(X_t)\,dt + e^{-\int_0^{\tau_D} q(X_s)\,ds}\,\phi(X_{\tau_D})\Big] .$$
(Compare with the Feynman-Kac formula.) Hint: Proceed as in the proof of Theorem 8.2.1 b). For more information on stochastic solutions of boundary value problems see Freidlin (1985).

9.13. Let $D = (a,b)$ be a bounded interval.
a) For $x \in \mathbb{R}$ define
$$X_t = X_t^x = x + \mu t + \sigma B_t ; \qquad t \ge 0$$
where $\mu, \sigma$ are constants, $\sigma \neq 0$. Use Corollary 9.1.2 to compute
$$w(x) := E^x[\phi(X_{\tau_D})] + E^x\Big[\int_0^{\tau_D} g(X_t)\,dt\Big]$$
when $\phi\colon \{a,b\} \to \mathbb{R}$ and $g\colon (a,b) \to \mathbb{R}$ are given functions, g bounded and continuous.
b) Use the results in a) to compute the Green function G(x,y) of the process $X_t$. (Hint: Choose $\phi = 0$ and proceed as in Example 9.3.6.)

9.14. Let $D = (a,b) \subset (0,\infty)$ be a bounded interval and let
$$dX_t = rX_t\,dt + \alpha X_t\,dB_t ; \qquad X_0 = x \in (a,b)$$
be a geometric Brownian motion.
a) Use Corollary 9.1.2 to find $Q^x[X_{\tau_D} = b]$. (Hint: Choose $g = 0$ and $\phi(a) = 0$, $\phi(b) = 1$.)
b) Use Corollary 9.1.2 to compute
$$w(x) = E^x[\phi(X_{\tau_D})] + E^x\Big[\int_0^{\tau_D} g(X_t)\,dt\Big]$$
for given functions $\phi\colon \{a,b\} \to \mathbb{R}$ and $g\colon (a,b) \to \mathbb{R}$, g bounded and continuous. (Hint: The substitution $t = \ln x$, $w(x) = h(\ln x)$ transforms the differential equation
$$\tfrac{1}{2}\alpha^2 x^2 w''(x) + rxw'(x) = -g(x) ; \qquad x > 0$$
into the differential equation
$$\tfrac{1}{2}\alpha^2 h''(t) + (r - \tfrac{1}{2}\alpha^2)h'(t) = -g(e^t) ; \qquad t \in \mathbb{R} .)$$
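Part a) can also be checked by simulation. In the Python sketch below (not the book's derivation; function names are ours) the closed form comes from the standard scale-function computation: $s(x) = x^{1-2r/\alpha^2}$ satisfies $\frac{1}{2}\alpha^2 x^2 s'' + rxs' = 0$ when $r \neq \alpha^2/2$, so $Q^x[X_{\tau_D} = b] = \frac{s(x)-s(a)}{s(b)-s(a)}$. A crude Monte Carlo estimate (exact Gaussian increments of $\ln X_t$, with exit detection only at grid times) should agree within sampling and discretization error:

```python
import math, random

def hit_b_prob(x, a, b, r, alpha):
    """Scale-function formula for Q^x[X_{tau_D} = b], for r != alpha^2/2."""
    gamma = 1.0 - 2.0 * r / alpha ** 2
    s = lambda z: z ** gamma
    return (s(x) - s(a)) / (s(b) - s(a))

def hit_b_monte_carlo(x, a, b, r, alpha, n_paths=4000, dt=0.01):
    """Monte Carlo on the log-price; crude discretized exit detection."""
    random.seed(0)
    drift = (r - 0.5 * alpha ** 2) * dt
    vol = alpha * math.sqrt(dt)
    la, lb = math.log(a), math.log(b)
    hits = 0
    for _ in range(n_paths):
        l = math.log(x)
        while la < l < lb:
            l += drift + vol * random.gauss(0.0, 1.0)
        if l >= lb:
            hits += 1
    return hits / n_paths

x, a, b, r, alpha = 2.0, 1.0, 4.0, 0.05, 0.3
p_exact = hit_b_prob(x, a, b, r, alpha)
p_mc = hit_b_monte_carlo(x, a, b, r, alpha)
assert abs(p_exact - p_mc) < 0.05
print("exit probability: closed form %.3f, Monte Carlo %.3f" % (p_exact, p_mc))
```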
9.15.* a) Let $D = (a,b) \subset \mathbb{R}$ be a bounded interval and let $X_t = B_t$ be 1-dimensional Brownian motion. Use Corollary 9.1.2 to compute
$$h(x) = E^x[e^{-\rho\tau_D}\psi(B_{\tau_D})] + E^x\Big[\int_0^{\tau_D} \theta\,e^{-\rho t}B_t^2\,dt\Big]$$
for a given function $\psi\colon \{a,b\} \to \mathbb{R}$, when $\rho > 0$ and $\theta \in \mathbb{R}$ are constants. (Hint: Consider the Itô diffusion
$$dY_t = \begin{pmatrix} dY_t^{(1)} \\ dY_t^{(2)} \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix} dt + \begin{pmatrix} 0 \\ 1 \end{pmatrix} dB_t ; \qquad Y_0 = y = (s,x) .$$
Then
$$h(x) = w(0,x)$$
where
$$w(s,x) = w(y) = E^y[\phi(Y_{\tau_D})] + E^y\Big[\int_0^{\tau_D} g(Y_t)\,dt\Big]$$
with $\phi(y) = \phi(s,x) = e^{-\rho s}\psi(x)$ and $g(y) = g(s,x) = \theta\,e^{-\rho s}x^2$. Note that
$$\tau_D = \inf\{t > 0;\ B_t \notin (a,b)\} = \inf\{t > 0;\ Y_t^{(2)} \notin (a,b)\} = \inf\{t > 0;\ Y_t \notin \mathbb{R}\times(a,b)\} .$$
To find $w(s,x)$ solve the boundary value problem
$$\tfrac{1}{2}\frac{\partial^2 w}{\partial x^2} + \frac{\partial w}{\partial s} = -\theta\,e^{-\rho s}x^2 ; \qquad a < x < b$$
$$w(s,a) = e^{-\rho s}\psi(a) , \qquad w(s,b) = e^{-\rho s}\psi(b) .$$
To this end, try $w(s,x) = e^{-\rho s}h(x)$.)
To this end, try w(s, x) = e−ρs h(x).) −ρτ b) Use the method in a) to find E x [e D ]. (Compare with Exercise 7.19.) 9.16. (The Black-Scholes equation) a) Let D = (0, T ) × (0, ∞) ⊂ R2 , where T > 0 is a constant. Show ¯ of the boundary value that the unique solution w ∈ C 1,2 (D) ∩ C(D) problem , 2 −rw(s, x) +
∂w ∂w 1 2 2∂ w ∂s (s, x) + rx ∂x (s, x) + 2 σ x ∂x2 (s, x)
= 0 ; (s, x) ∈ D
w(T, x) = (x − K)+ ; x ≥ 0
(where r > 0, σ = 0 are constants) is given by w(s, x) = E s,x [e−r(T −s) (XT −s − K)+ ]
(9.3.32)
where Xt = Xtx is the geometric Brownian motion dXt = rXt dt + σXt dBt ;
X0 = x > 0 .
[Hint: Put u(s, x) = e−rs w(s, x). Then apply Theorem 9.2.13 and Theorem 9.1.1 to the boundary value problem ⎧ 2 ⎨∂u (s, x) + rx ∂u (s, x) + 1 σ 2 x2 ∂ u2 (s, x) = 0 ; (s, x) ∈ D ∂s ∂x 2 ∂x ⎩
lim u(Yt ) = e−r(s+τD ) (XτD − K)+ ; x ≥ 0
t→τD
where Yt = Yts,x = (s+ t, Xtx ) and τD = inf{t > 0; Yt ∈ D} = T − s.]
b) Use (9.3.32) to show that
$$w(0,x) = x\,\Phi\big(\eta + \tfrac{1}{2}\sigma\sqrt{T}\big) - Ke^{-rT}\,\Phi\big(\eta - \tfrac{1}{2}\sigma\sqrt{T}\big)$$
where
$$\eta = \sigma^{-1}T^{-1/2}\Big(\ln\frac{x}{K} + rT\Big)$$
and
$$\Phi(t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^t e^{-s^2/2}\,ds$$
is the normal distribution function. This is the celebrated Black-Scholes option pricing formula. See Corollary 12.3.8.
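The formula in b) can be verified numerically. The Python sketch below (function names are ours) compares the closed form for $w(0,x)$ with a direct quadrature of $E[e^{-rT}(X_T - K)^+]$ over the standard normal density, using $X_T = x\exp\big((r - \frac{1}{2}\sigma^2)T + \sigma\sqrt{T}\,Z\big)$:

```python
import math

def Phi(t):
    """Standard normal distribution function, via math.erf."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def bs_call(x, K, r, sigma, T):
    """w(0, x) from Exercise 9.16 b)."""
    eta = (math.log(x / K) + r * T) / (sigma * math.sqrt(T))
    half = 0.5 * sigma * math.sqrt(T)
    return x * Phi(eta + half) - K * math.exp(-r * T) * Phi(eta - half)

def bs_call_quadrature(x, K, r, sigma, T, n=20000, zmax=10.0):
    """E[e^{-rT}(X_T - K)^+] by trapezoidal quadrature over N(0, 1)."""
    h = 2.0 * zmax / n
    total = 0.0
    for k in range(n + 1):
        z = -zmax + k * h
        w = 0.5 if k in (0, n) else 1.0
        xT = x * math.exp((r - 0.5 * sigma ** 2) * T + sigma * math.sqrt(T) * z)
        total += w * max(xT - K, 0.0) * math.exp(-0.5 * z * z)
    return math.exp(-r * T) * total * h / math.sqrt(2.0 * math.pi)

x, K, r, sigma, T = 100.0, 95.0, 0.05, 0.2, 1.0
assert abs(bs_call(x, K, r, sigma, T) - bs_call_quadrature(x, K, r, sigma, T)) < 1e-3
print("Black-Scholes price: %.4f" % bs_call(x, K, r, sigma, T))
```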
10 Application to Optimal Stopping
B. Øksendal, Stochastic Differential Equations, Universitext, DOI 10.1007/978-3-642-14394-6_10, © Springer-Verlag Berlin Heidelberg 2013

10.1 The Time-Homogeneous Case

Problem 5 in the introduction is a special case of a problem of the following type:

Problem 10.1.1 (The optimal stopping problem) Let $X_t$ be an Itô diffusion on $\mathbb{R}^n$ and let g (the reward function) be a given function on $\mathbb{R}^n$, satisfying
a) $g(\xi) \ge 0$ for all $\xi \in \mathbb{R}^n$
b) g is continuous. (10.1.1)
Find a stopping time $\tau^* = \tau^*(x,\omega)$ (called an optimal stopping time) for $\{X_t\}$ such that
$$E^x[g(X_{\tau^*})] = \sup_{\tau} E^x[g(X_\tau)] \quad \text{for all } x \in \mathbb{R}^n , \tag{10.1.2}$$
the sup being taken over all stopping times $\tau$ for $\{X_t\}$. We also want to find the corresponding optimal expected reward
$$g^*(x) = E^x[g(X_{\tau^*})] . \tag{10.1.3}$$
Here $g(X_\tau)$ is to be interpreted as 0 at the points $\omega \in \Omega$ where $\tau(\omega) = \infty$ and as usual $E^x$ denotes the expectation with respect to the probability law $Q^x$ of the process $X_t$, $t \ge 0$, starting at $X_0 = x \in \mathbb{R}^n$. We may regard $X_t$ as the state of a game at time t; each $\omega$ corresponds to one sample of the game. For each time t we have the option of stopping the game, thereby obtaining the reward $g(X_t)$, or continuing the game in the hope that stopping it at a later time will give a bigger reward. The problem is of course that we do not know what state the game is in at future times, only the probability distribution of the "future". Mathematically, this means that the
possible "stopping" times we consider really are stopping times in the sense of Definition 7.2.1: the decision whether $\tau \le t$ or not should only depend on the behaviour of the Brownian motion $B_r$ (driving the process X) up to time t, or perhaps only on the behaviour of $X_r$ up to time t. So, among all possible stopping times $\tau$ we are asking for the optimal one, $\tau^*$, which gives the best result "in the long run", i.e. the biggest expected reward in the sense of (10.1.2).

In the following we will outline how a solution to this problem can be obtained using the material from the preceding chapter. Later in this chapter we shall see that our discussion of problem (10.1.2)–(10.1.3) also covers the apparently more general problems
$$g^*(s,x) = \sup_{\tau} E^{(s,x)}[g(s+\tau, X_\tau)] = E^{(s,x)}[g(s+\tau^*, X_{\tau^*})] \tag{10.1.4}$$
and
$$G^*(s,x) = \sup_{\tau} E^{(s,x)}\Big[\int_0^{\tau} f(s+t, X_t)\,dt + g(s+\tau, X_\tau)\Big] = E^{(s,x)}\Big[\int_0^{\tau^*} f(s+t, X_t)\,dt + g(s+\tau^*, X_{\tau^*})\Big] \tag{10.1.5}$$
where f is a given profit rate (or reward rate) function (satisfying certain conditions). We shall also discuss possible extensions of problem (10.1.2)–(10.1.3) to cases where g is not necessarily continuous or where g may assume negative values. A basic concept in the solution of (10.1.2)–(10.1.3) is the following:

Definition 10.1.2 A measurable function $f\colon \mathbb{R}^n \to [0,\infty]$ is called supermeanvalued (w.r.t. $X_t$) if
$$f(x) \ge E^x[f(X_\tau)] \tag{10.1.6}$$
for all stopping times $\tau$ and all $x \in \mathbb{R}^n$. If, in addition, f is also lower semicontinuous, then f is called l.s.c. superharmonic or just superharmonic (w.r.t. $X_t$).

Note that if $f\colon \mathbb{R}^n \to [0,\infty]$ is lower semicontinuous then by the Fatou lemma
$$f(x) \le E^x\big[\liminf_{k\to\infty} f(X_{\tau_k})\big] \le \liminf_{k\to\infty} E^x[f(X_{\tau_k})] \tag{10.1.7}$$
for any sequence $\{\tau_k\}$ of stopping times such that $\tau_k \to 0$ a.s. P. Combining this with (10.1.6) we see that if f is (l.s.c.) superharmonic, then
$$f(x) = \lim_{k\to\infty} E^x[f(X_{\tau_k})] \quad \text{for all } x , \tag{10.1.8}$$
for all such sequences $\tau_k$.
Remarks. 1) In the literature (see e.g. Dynkin (1965 II)) one often finds a weaker concept of $X_t$-superharmonicity, defined by the supermeanvalued property (10.1.6) plus the stochastic continuity requirement (10.1.8). This weaker concept corresponds to the $X_t$-harmonicity defined in Chapter 9.
2) If $f \in C^2(\mathbb{R}^n)$ it follows from Dynkin's formula that f is superharmonic w.r.t. $X_t$ if and only if $Af \le 0$, where A is the characteristic operator of $X_t$. This is often a useful criterion (see e.g. Example 10.2.1).
3) If $X_t = B_t$ is Brownian motion in $\mathbb{R}^n$ then the superharmonic functions for $X_t$ coincide with the (nonnegative) superharmonic functions in classical potential theory. See Doob (1984) or Port and Stone (1979).

We state some useful properties of superharmonic and supermeanvalued functions.

Lemma 10.1.3
a) If f is superharmonic (supermeanvalued) and $\alpha > 0$, then $\alpha f$ is superharmonic (supermeanvalued).
b) If $f_1, f_2$ are superharmonic (supermeanvalued), then $f_1 + f_2$ is superharmonic (supermeanvalued).
c) If $\{f_j\}_{j\in J}$ is a family of supermeanvalued functions, then $f(x) := \inf_{j\in J}\{f_j(x)\}$ is supermeanvalued if it is measurable (J is any set).
d) If $f_1, f_2, \ldots$ are superharmonic (supermeanvalued) functions and $f_k \uparrow f$ pointwise, then f is superharmonic (supermeanvalued).
e) If f is supermeanvalued and $\sigma \le \tau$ are stopping times, then $E^x[f(X_\sigma)] \ge E^x[f(X_\tau)]$.
f) If f is supermeanvalued and H is a Borel set, then $\hat f(x) := E^x[f(X_{\tau_H})]$ is supermeanvalued.

Proof of Lemma 10.1.3. a) and b) are straightforward.
c) Suppose $f_j$ is supermeanvalued for all $j \in J$. Then
$$f_j(x) \ge E^x[f_j(X_\tau)] \ge E^x[f(X_\tau)] \quad \text{for all } j .$$
So $f(x) = \inf_j f_j(x) \ge E^x[f(X_\tau)]$, as required.
d) Suppose $f_j$ is supermeanvalued, $f_j \uparrow f$. Then
$$f(x) \ge f_j(x) \ge E^x[f_j(X_\tau)] \quad \text{for all } j ,$$
so
$$f(x) \ge \lim_{j\to\infty} E^x[f_j(X_\tau)] = E^x[f(X_\tau)] ,$$
by monotone convergence. Hence f is supermeanvalued. If each $f_j$ is also lower semicontinuous then, if $y_k \to x$ as $k \to \infty$, we have
$$f_j(x) \le \liminf_{k\to\infty} f_j(y_k) \le \liminf_{k\to\infty} f(y_k) \quad \text{for each } j .$$
Hence, by letting $j \to \infty$,
$$f(x) \le \liminf_{k\to\infty} f(y_k) .$$
e) If f is supermeanvalued we have by the Markov property, when $t > s$,
$$E^x[f(X_t) \mid \mathcal{F}_s] = E^{X_s}[f(X_{t-s})] \le f(X_s) , \tag{10.1.9}$$
i.e. the process $\zeta_t = f(X_t)$ is a supermartingale w.r.t. the $\sigma$-algebras $\mathcal{F}_t$ generated by $\{B_r;\ r \le t\}$ (Appendix C). Therefore, by Doob's optional sampling theorem (see Gihman and Skorohod (1975, Theorem 6, p. 11)) we have
$$E^x[f(X_\sigma)] \ge E^x[f(X_\tau)]$$
for all stopping times $\sigma, \tau$ with $\sigma \le \tau$ a.s. $Q^x$.
f) Suppose f is supermeanvalued. By the strong Markov property (7.2.2) and formula (7.2.6) we have, for any stopping time $\alpha$,
$$E^x[\hat f(X_\alpha)] = E^x\big[E^{X_\alpha}[f(X_{\tau_H})]\big] = E^x\big[E^x[\theta_\alpha f(X_{\tau_H}) \mid \mathcal{F}_\alpha]\big] = E^x[\theta_\alpha f(X_{\tau_H})] = E^x[f(X_{\tau_H^\alpha})] \tag{10.1.10}$$
where $\tau_H^\alpha = \inf\{t > \alpha;\ X_t \notin H\}$. Since $\tau_H^\alpha \ge \tau_H$ we have by e)
$$E^x[\hat f(X_\alpha)] \le E^x[f(X_{\tau_H})] = \hat f(x) ,$$
so $\hat f$ is supermeanvalued.
The following concepts are fundamental:

Definition 10.1.4 Let h be a real measurable function on $\mathbb{R}^n$. If f is a superharmonic (supermeanvalued) function and $f \ge h$ we say that f is a superharmonic (supermeanvalued) majorant of h (w.r.t. $X_t$). The function
$$\bar h(x) = \inf_f f(x) ; \qquad x \in \mathbb{R}^n , \tag{10.1.11}$$
the inf being taken over all supermeanvalued majorants f of h, is called the least supermeanvalued majorant of h. Similarly, suppose there exists a function $\hat h$ such that
(i) $\hat h$ is a superharmonic majorant of h and
(ii) if f is any other superharmonic majorant of h then $\hat h \le f$.
Then $\hat h$ is called the least superharmonic majorant of h.
Note that by Lemma 10.1.3 c) the function $\bar h$ is supermeanvalued if it is measurable. Moreover, if h is lower semicontinuous, then $\hat h$ exists and $\hat h = \bar h$. Later we will prove that if g is nonnegative (or lower bounded) and lower semicontinuous, then $\hat g$ exists and $\hat g = \bar g$ (Theorem 10.1.7).

Let $g \ge 0$ and let f be a supermeanvalued majorant of g. Then if $\tau$ is a stopping time
$$f(x) \ge E^x[f(X_\tau)] \ge E^x[g(X_\tau)] .$$
So
$$f(x) \ge \sup_\tau E^x[g(X_\tau)] = g^*(x) .$$
Therefore we always have, if $\bar g$ exists,
$$\bar g(x) \ge g^*(x) \quad \text{for all } x \in \mathbb{R}^n . \tag{10.1.12}$$
What is not so easy to see is that the converse inequality also holds, i.e. that in fact
$$\bar g = g^* . \tag{10.1.13}$$
We will prove this after we have established a useful iterative procedure for calculating $\bar g$. Before we give such a procedure let us introduce a concept which is related to superharmonic functions:

Definition 10.1.5 A lower semicontinuous function $f\colon \mathbb{R}^n \to [0,\infty]$ is called excessive (w.r.t. $X_t$) if
$$f(x) \ge E^x[f(X_s)] \quad \text{for all } s \ge 0,\ x \in \mathbb{R}^n . \tag{10.1.14}$$
It is clear that a superharmonic function must be excessive. What is not so obvious is that the converse also holds:

Theorem 10.1.6 Let $f\colon \mathbb{R}^n \to [0,\infty]$. Then f is excessive w.r.t. $X_t$ if and only if f is superharmonic w.r.t. $X_t$.

Proof in a special case. Let L be the differential operator associated to X (given by the right hand side of (7.3.3)), so that L coincides with the generator A of X on $C_0^2$. We only prove the theorem in the special case when $f \in C^2(\mathbb{R}^n)$ and Lf is bounded: then by Dynkin's formula we have
$$E^x[f(X_t)] = f(x) + E^x\Big[\int_0^t Lf(X_r)\,dr\Big] \quad \text{for all } t \ge 0 ,$$
so if f is excessive then $Lf \le 0$. Therefore, if $\tau$ is a stopping time we get
$$E^x[f(X_{t\wedge\tau})] \le f(x) \quad \text{for all } t \ge 0 .$$
Letting $t \to \infty$ we see that f is superharmonic.
A proof in the general case can be found in Dynkin (1965 II, p. 5).

The first iterative procedure for the least superharmonic majorant $\hat g$ of g is the following:

Theorem 10.1.7 (Construction of the least superharmonic majorant) Let $g = g_0$ be a nonnegative, lower semicontinuous function on $\mathbb{R}^n$ and define inductively
$$g_n(x) = \sup_{t \in S_n} E^x[g_{n-1}(X_t)] , \tag{10.1.15}$$
where $S_n = \{k\cdot 2^{-n};\ 0 \le k \le 4^n\}$, $n = 1, 2, \ldots$. Then $g_n \uparrow \hat g$ and $\hat g$ is the least superharmonic majorant of g. Moreover, $\hat g = \bar g$.

Proof. Note that $\{g_n\}$ is increasing. Define $\check g(x) = \lim_{n\to\infty} g_n(x)$. Then
$$\check g(x) \ge g_n(x) \ge E^x[g_{n-1}(X_t)] \quad \text{for all } n \text{ and all } t \in S_n .$$
Hence
$$\check g(x) \ge \lim_{n\to\infty} E^x[g_{n-1}(X_t)] = E^x[\check g(X_t)] \quad \text{for all } t \in S = \bigcup_{n=1}^{\infty} S_n . \tag{10.1.16}$$
Since $\check g$ is an increasing limit of lower semicontinuous functions (Lemma 8.1.4), $\check g$ is lower semicontinuous. Fix $t \in \mathbb{R}$ and choose $t_k \in S$ such that $t_k \to t$. Then by (10.1.16), the Fatou lemma and lower semicontinuity
$$\check g(x) \ge \liminf_{k\to\infty} E^x[\check g(X_{t_k})] \ge E^x\big[\liminf_{k\to\infty} \check g(X_{t_k})\big] \ge E^x[\check g(X_t)] .$$
So $\check g$ is an excessive function. Therefore $\check g$ is superharmonic by Theorem 10.1.6 and hence $\check g$ is a superharmonic majorant of g. On the other hand, if f is any supermeanvalued majorant of g, then clearly by induction
$$f(x) \ge g_n(x) \quad \text{for all } n$$
and so $f(x) \ge \check g(x)$. This proves that $\check g$ is the least supermeanvalued majorant $\bar g$ of g. So $\check g = \hat g = \bar g$.

It is a consequence of Theorem 10.1.7 that we may replace the finite sets $S_n$ by the whole interval $[0,\infty]$:

Corollary 10.1.8 Define $h_0 = g$ and inductively
$$h_n(x) = \sup_{t \ge 0} E^x[h_{n-1}(X_t)] ; \qquad n = 1, 2, \ldots$$
Then $h_n \uparrow \hat g$.
Proof. Let $h = \lim h_n$. Then clearly $h \ge \check g = \hat g$. On the other hand, since $\hat g$ is excessive we have
$$\hat g(x) \ge \sup_{t \ge 0} E^x[\hat g(X_t)] .$$
So by induction
$$\hat g \ge h_n \quad \text{for all } n .$$
Thus $\hat g = h$ and the proof is complete.

We are now ready for our first main result on the optimal stopping problem. The following result is basically due to Dynkin (1963) (and, in a martingale context, Snell (1952)):

Theorem 10.1.9 (Existence theorem for optimal stopping) Let $g^*$ denote the optimal reward and $\hat g$ the least superharmonic majorant of a continuous reward function $g \ge 0$.
a) Then
$$g^*(x) = \hat g(x) . \tag{10.1.17}$$
b) For $\varepsilon > 0$ let
$$D_\varepsilon = \{x;\ g(x) < \hat g(x) - \varepsilon\} . \tag{10.1.18}$$
Suppose g is bounded. Then stopping at the first time $\tau_\varepsilon$ of exit from $D_\varepsilon$ is close to being optimal, in the sense that
$$|g^*(x) - E^x[g(X_{\tau_\varepsilon})]| \le \varepsilon \tag{10.1.19}$$
for all x.
c) For arbitrary continuous $g \ge 0$ let
$$D = \{x;\ g(x) < g^*(x)\} \quad \text{(the continuation region)} . \tag{10.1.20}$$
For $N = 1, 2, \ldots$ define $g_N = g \wedge N$, $D_N = \{x;\ g_N(x) < \widehat{g_N}(x)\}$ and $\sigma_N = \tau_{D_N}$. Then $D_N \subset D_{N+1}$, $D_N \subset D \cap g^{-1}([0,N))$, $D = \bigcup_N D_N$. If $\sigma_N < \infty$ a.s. $Q^x$ for all N then
$$g^*(x) = \lim_{N\to\infty} E^x[g(X_{\sigma_N})] . \tag{10.1.21}$$
d) In particular, if $\tau_D < \infty$ a.s. $Q^x$ and the family $\{g(X_{\sigma_N})\}_N$ is uniformly integrable w.r.t. $Q^x$ (Appendix C), then
$$g^*(x) = E^x[g(X_{\tau_D})]$$
and $\tau^* = \tau_D$ is an optimal stopping time.
Proof. First assume that g is bounded and define
$$g_\varepsilon(x) = E^x[\hat g(X_{\tau_\varepsilon})] \quad \text{for } \varepsilon > 0 . \tag{10.1.22}$$
Then $g_\varepsilon$ is supermeanvalued by Lemma 10.1.3 f). We claim that
$$\hat g(x) \le g_\varepsilon(x) \quad \text{for all } x . \tag{10.1.23}$$
To see this define
$$\beta := \sup_x\{g(x) - g_\varepsilon(x)\} . \tag{10.1.24}$$
Then $g_\varepsilon(x) + \beta$ is a supermeanvalued majorant of g. Hence
$$\hat g(x) \le g_\varepsilon(x) + \beta \quad \text{for all } x . \tag{10.1.25}$$
Choose $\alpha$ such that $0 < \alpha < \varepsilon$. Then there exists $x_0$ such that
$$\beta - \alpha < g(x_0) - g_\varepsilon(x_0) .$$
Clearly
$$0 \le \hat g(x_0) - g(x_0) \le g_\varepsilon(x_0) + \beta - g(x_0) < \alpha .$$
Hence
$$x_0 \in D_\alpha^C \subseteq D_\varepsilon^C \tag{10.1.26}$$
and therefore
$$g_\varepsilon(x_0) = E^{x_0}[\hat g(X_{\tau_\varepsilon})] = \hat g(x_0) .$$
This gives
$$\beta - \alpha < g(x_0) - g_\varepsilon(x_0) \le \hat g(x_0) - g_\varepsilon(x_0) = 0 . \tag{10.1.27}$$
Letting $\alpha \downarrow 0$, we conclude that
$$\beta \le 0 ,$$
which proves the claim (10.1.23). We conclude that $g_\varepsilon$ is a supermeanvalued majorant of $\hat g$. Therefore
$$\hat g \le g_\varepsilon = E[\hat g(X_{\tau_\varepsilon})] \le E[(g + \varepsilon)(X_{\tau_\varepsilon})] \le g^* + \varepsilon \tag{10.1.28}$$
and since $\varepsilon$ was arbitrary we have by (10.1.12)
$$\hat g = g^* .$$
If g is not bounded, let
$$g_N = \min(N, g) , \qquad N = 1, 2, \ldots$$
and as before let $\widehat{g_N}$ be the least superharmonic majorant of $g_N$. Then
$$g^* \ge g_N^* = \widehat{g_N} \uparrow h \quad \text{as } N \to \infty , \text{ where } h \ge \hat g$$
since h is a superharmonic majorant of g. Thus $h = \hat g = g^*$ and this proves (10.1.17) for general g.

From (10.1.28) and (10.1.17) we obtain (10.1.19).

Finally, to obtain c) and d) let us again first assume that g is bounded. Then, since $\tau_\varepsilon \uparrow \tau_D$ as $\varepsilon \downarrow 0$ and $\tau_D < \infty$ a.s., we have
$$E^x[g(X_{\tau_\varepsilon})] \to E^x[g(X_{\tau_D})] \quad \text{as } \varepsilon \downarrow 0 , \tag{10.1.29}$$
and hence by (10.1.28) and (10.1.17)
$$g^*(x) = E^x[g(X_{\tau_D})] \quad \text{if g is bounded} . \tag{10.1.30}$$
Finally, if g is not bounded define
$$h = \lim_{N\to\infty} \widehat{g_N} .$$
Then h is superharmonic by Lemma 10.1.3 d) and since $\widehat{g_N} \le \hat g$ for all N we have $h \le \hat g$. On the other hand $g_N \le \widehat{g_N} \le h$ for all N and therefore $g \le h$. Since $\hat g$ is the least superharmonic majorant of g we conclude that
$$h = \hat g . \tag{10.1.31}$$
Hence by (10.1.30), (10.1.31) we obtain (10.1.21):
$$g^*(x) = \lim_{N\to\infty} \widehat{g_N}(x) = \lim_{N\to\infty} E^x[g_N(X_{\sigma_N})] \le \lim_{N\to\infty} E^x[g(X_{\sigma_N})] \le g^*(x) .$$
Note that $\widehat{g_N} \le N$ everywhere, so if $g_N(x) < \widehat{g_N}(x)$ then $g_N(x) < N$ and therefore
$$g(x) = g_N(x) < \widehat{g_N}(x) \le \hat g(x) \quad \text{and} \quad g(x) = g_{N+1}(x) < \widehat{g_N}(x) \le \widehat{g_{N+1}}(x) .$$
Hence $D_N \subset D \cap \{x;\ g(x) < N\}$ and $D_N \subset D_{N+1}$ for all N. So by (10.1.31) we conclude that D is the increasing union of the sets $D_N$; $N = 1, 2, \ldots$ Therefore
$$\tau_D = \lim_{N\to\infty} \sigma_N .$$
So by (10.1.21) and uniform integrability we have
$$\hat g(x) = \lim_{N\to\infty} \widehat{g_N}(x) = \lim_{N\to\infty} E^x[g_N(X_{\sigma_N})] = E^x\big[\lim_{N\to\infty} g_N(X_{\sigma_N})\big] = E^x[g(X_{\tau_D})] ,$$
and the proof of Theorem 10.1.9 is complete.
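The iteration of Theorem 10.1.7/Corollary 10.1.8 has a transparent discrete analogue, sketched below in Python (the chain and the reward are our own illustrative choices, not from the book). For simple random walk on $\{0,\ldots,N\}$ absorbed at the endpoints, iterating $h_{n+1}(i) = \max\big(g(i), (Ph_n)(i)\big)$ from $h_0 = g$ converges to the least P-superharmonic majorant $\hat g$ of g; for this chain superharmonic means midpoint-concave, so $\hat g$ is the least concave majorant of g, and $D = \{i;\ g(i) < \hat g(i)\}$ is the continuation region:

```python
# Discrete analogue of the iterative construction of g_hat: value
# iteration h_{n+1} = max(g, P h_n) for simple random walk on {0,...,N}
# absorbed at 0 and N.  The limit is the least concave majorant of g.

N = 10
g = [0.0, 2.0, 1.0, 0.0, 0.0, 3.0, 0.0, 0.0, 1.0, 0.0, 0.0]

h = g[:]
for _ in range(3000):
    new = h[:]
    for i in range(1, N):              # interior states; endpoints absorb
        new[i] = max(g[i], 0.5 * (h[i - 1] + h[i + 1]))
    h = new

# g_hat majorizes g and is midpoint-concave in the interior.
assert all(h[i] >= g[i] for i in range(N + 1))
assert all(h[i] >= 0.5 * (h[i - 1] + h[i + 1]) - 1e-9 for i in range(1, N))
# Least concave majorant: linear through the hull points (1,2), (5,3), (10,0).
assert abs(h[4] - 2.75) < 1e-6 and abs(h[6] - 2.4) < 1e-6
D = [i for i in range(N + 1) if g[i] < h[i] - 1e-6]
print("continuation region D =", D)
```

The rule "stop at the first exit from D" then stops exactly where the reward touches its concave majorant, mirroring part d) of the theorem.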
Remarks. 1) Note that the sets $D$, $D_\varepsilon$ and $D_N$ are open, since $\hat g = g^*$ is lower semicontinuous and g is continuous.
2) By inspecting the proof of a) we see that (10.1.17) holds under the weaker assumption that $g \ge 0$ is lower semicontinuous.

The following consequence of Theorem 10.1.9 is often useful:

Corollary 10.1.10 Suppose there exists a Borel set H such that
$$g_H(x) := E^x[g(X_{\tau_H})]$$
is a supermeanvalued majorant of g. Then
$$g^*(x) = g_H(x) , \quad \text{so } \tau^* = \tau_H \text{ is optimal} .$$
Proof. If $g_H$ is a supermeanvalued majorant of g then clearly
$$\hat g(x) \le g_H(x) .$$
On the other hand we of course have
$$g_H(x) \le \sup_\tau E^x[g(X_\tau)] = g^*(x) ,$$
so $g^* = g_H$ by Theorem 10.1.7 and Theorem 10.1.9 a).

Corollary 10.1.11 Let $D = \{x;\ g(x) < \hat g(x)\}$ and put
$$\tilde g(x) = g_D(x) = E^x[g(X_{\tau_D})] .$$
If $\tilde g \ge g$ then $\tilde g = g^*$.
Proof. Since $X_{\tau_D} \notin D$ we have $g(X_{\tau_D}) = \hat g(X_{\tau_D})$ a.s. $Q^x$. So $\tilde g(x) = E^x[\hat g(X_{\tau_D})]$ is supermeanvalued since $\hat g$ is, and the result follows from Corollary 10.1.10.

Theorem 10.1.9 gives a sufficient condition for the existence of an optimal stopping time $\tau^*$. Unfortunately, $\tau^*$ need not exist in general. For example, if
$$X_t = t \quad \text{for } t \ge 0 \quad \text{(deterministic)}$$
and
$$g(\xi) = \frac{\xi^2}{1 + \xi^2} ; \qquad \xi \in \mathbb{R} ,$$
then $g^*(x) = 1$, but there is no stopping time $\tau$ such that
$$E^x[g(X_\tau)] = 1 .$$
However, we can prove that if an optimal stopping time $\tau^*$ exists, then the stopping time given in Theorem 10.1.9 is optimal:
Theorem 10.1.12 (Uniqueness theorem for optimal stopping) Define as before
$$D = \{x;\ g(x) < g^*(x)\} \subset \mathbb{R}^n .$$
Suppose there exists an optimal stopping time $\tau^* = \tau^*(x,\omega)$ for the problem (10.1.2) for all x. Then
$$\tau^* \ge \tau_D \quad \text{for all } x \in D \tag{10.1.32}$$
and
$$g^*(x) = E^x[g(X_{\tau_D})] \quad \text{for all } x \in \mathbb{R}^n . \tag{10.1.33}$$
Hence $\tau_D$ is an optimal stopping time for the problem (10.1.2).

Proof. Choose $x \in D$. Let $\tau$ be an $\mathcal{F}_t$-stopping time and assume $Q^x[\tau < \tau_D] > 0$. Since $g(X_\tau) < g^*(X_\tau)$ if $\tau < \tau_D$ and $g \le g^*$ always, we have
$$E^x[g(X_\tau)] = \int_{\{\tau < \tau_D\}} g(X_\tau)\,dQ^x + \int_{\{\tau \ge \tau_D\}} g(X_\tau)\,dQ^x$$

$$E^x[g(X_{\tau_0\wedge u})] = g(x) + E^x\Big[\int_0^{\tau_0\wedge u} Ag(X_s)\,ds\Big] > g(x) ,$$
so $g(x) < g^*(x)$ and therefore $x \in D$.

Example 10.1.13 Let $X_t = B_t$ be a Brownian motion in $\mathbb{R}^2$. Using that $B_t$ is recurrent in $\mathbb{R}^2$ (Example 7.4.2) one can show that the only (nonnegative) superharmonic functions in $\mathbb{R}^2$ are the constants (Exercise 10.2). Therefore
$$g^*(x) = \|g\|_\infty := \sup\{g(y);\ y \in \mathbb{R}^2\} \quad \text{for all } x .$$
So if g is unbounded then $g^* = \infty$ and no optimal stopping time exists. Assume therefore that g is bounded. The continuation region is
$$D = \{x;\ g(x) < \|g\|_\infty\} ,$$
so if $\partial D$ is a polar set, i.e. $\operatorname{cap}(\partial D) = 0$, where cap denotes the logarithmic capacity (see Port and Stone (1979)), then $\tau_D = \infty$ a.s. and no optimal stopping exists. On the other hand, if $\operatorname{cap}(\partial D) > 0$ then $\tau_D < \infty$ a.s. and
$$E^x[g(B_{\tau_D})] = \|g\|_\infty = g^*(x) ,$$
so $\tau^* = \tau_D$ is optimal.

Example 10.1.14 The situation is different in $\mathbb{R}^n$ for $n \ge 3$.
a) To illustrate this let $X_t = B_t$ be Brownian motion in $\mathbb{R}^3$ and let the reward function be
$$g(\xi) = \begin{cases} |\xi|^{-1} & \text{for } |\xi| \ge 1 \\ 1 & \text{for } |\xi| < 1 \end{cases} ; \qquad \xi \in \mathbb{R}^3 .$$
Then g is superharmonic (in the classical sense) in $\mathbb{R}^3$, so $g^* = g$ everywhere and the best policy is to stop immediately, no matter where the starting point is.
b) Let us change g to
$$h(x) = \begin{cases} |x|^{-\alpha} & \text{for } |x| \ge 1 \\ 1 & \text{for } |x| < 1 \end{cases}$$
for some $\alpha > 1$. Let $H = \{x;\ |x| > 1\}$ and define
$$\hat h(x) = E^x[h(B_{\tau_H})] = P^x[\tau_H < \infty] .$$
Then by Example 7.4.2
$$\hat h(x) = \begin{cases} 1 & \text{if } |x| \le 1 \\ |x|^{-1} & \text{if } |x| > 1 , \end{cases}$$
i.e. $\hat h = g$ (the function defined in a)), which is a superharmonic majorant of h. Therefore by Corollary 10.1.10
$$h^* = \hat h = g , \qquad D = H$$
and $\tau^* = \tau_H$ is an optimal stopping time.

Reward Functions Assuming Negative Values

The results we have obtained so far regarding the problem (10.1.2)–(10.1.3) are based on the assumptions (10.1.1). To some extent these assumptions can be relaxed, although neither can be removed completely. For example, we have noted that Theorem 10.1.9 a) still holds if $g \ge 0$ is only assumed to be lower semicontinuous. The nonnegativity assumption on g can also be relaxed. First of all, note that if g is bounded below, say $g \ge -M$ where $M > 0$ is a constant, then we can put $g_1 = g + M \ge 0$ and apply the theory to $g_1$. Since
$$E^x[g(X_\tau)] = E^x[g_1(X_\tau)] - M \quad \text{if } \tau < \infty \text{ a.s.,}$$
we have $g^*(x) = g_1^*(x) - M$, so the problem can be reduced to the optimal stopping problem for the nonnegative function $g_1$. (See Exercise 10.4.) If g is not bounded below, then problem (10.1.2)–(10.1.3) is not well-defined unless
$$E^x[g^-(X_\tau)] < \infty \quad \text{for all } \tau ,$$
where
$$g^-(x) = -\min(g(x), 0) . \tag{10.1.36}$$
If we assume that g satisfies the stronger condition that the family
$$\{g^-(X_\tau);\ \tau \text{ stopping time}\} \quad \text{is uniformly integrable,} \tag{10.1.37}$$
then basically all the theory from the nonnegative case carries over. We refer the reader to Shiryaev (1978) for more information. See also Theorem 10.4.1.
10.2 The Time-Inhomogeneous Case

Let us now consider the case when the reward function g depends on both time and space, i.e.
$$g = g(t,x)\colon \mathbb{R}\times\mathbb{R}^n \to [0,\infty) , \qquad g \text{ continuous} . \tag{10.2.1}$$
Then the problem is to find $g_0(x)$ and $\tau^*$ such that
$$g_0(x) = \sup_\tau E^x[g(\tau, X_\tau)] = E^x[g(\tau^*, X_{\tau^*})] . \tag{10.2.2}$$
To reduce this case to the original case (10.1.2)–(10.1.3) we proceed as follows: Suppose the Itô diffusion $X_t = X_t^x$ has the form
$$dX_t = b(X_t)\,dt + \sigma(X_t)\,dB_t ; \qquad t \ge 0 ,\ X_0 = x$$
where $b\colon \mathbb{R}^n \to \mathbb{R}^n$ and $\sigma\colon \mathbb{R}^n \to \mathbb{R}^{n\times m}$ are given functions satisfying the conditions of Theorem 5.2.1 and $B_t$ is m-dimensional Brownian motion. Define the Itô diffusion $Y_t = Y_t^{(s,x)}$ in $\mathbb{R}^{n+1}$ by
$$Y_t = \begin{pmatrix} s + t \\ X_t^x \end{pmatrix} ; \qquad t \ge 0 . \tag{10.2.3}$$
Then
$$dY_t = \begin{pmatrix} 1 \\ b(X_t) \end{pmatrix} dt + \begin{pmatrix} 0 \\ \sigma(X_t) \end{pmatrix} dB_t = \tilde b(Y_t)\,dt + \tilde\sigma(Y_t)\,dB_t \tag{10.2.4}$$
where
$$\tilde b(\eta) = \tilde b(t,\xi) = \begin{pmatrix} 1 \\ b(\xi) \end{pmatrix} \in \mathbb{R}^{n+1} , \qquad \tilde\sigma(\eta) = \tilde\sigma(t,\xi) = \begin{pmatrix} 0\cdots 0 \\ \sigma(\xi) \end{pmatrix} \in \mathbb{R}^{(n+1)\times m} ,$$
with $\eta = (t,\xi) \in \mathbb{R}\times\mathbb{R}^n$. So $Y_t$ is an Itô diffusion starting at $y = (s,x)$. Let $R^y = R^{(s,x)}$ denote the probability law of $\{Y_t\}$ and let $E^y = E^{(s,x)}$ denote the expectation w.r.t. $R^y$. In terms of $Y_t$ the problem (10.2.2) can be written
$$g_0(x) = g^*(0,x) = \sup_\tau E^{(0,x)}[g(Y_\tau)] = E^{(0,x)}[g(Y_{\tau^*})] \tag{10.2.5}$$
which is a special case of the problem
$$g^*(s,x) = \sup_\tau E^{(s,x)}[g(Y_\tau)] = E^{(s,x)}[g(Y_{\tau^*})] , \tag{10.2.6}$$
which is of the form (10.1.2)–(10.1.3) with $X_t$ replaced by $Y_t$. Note that the characteristic operator $\tilde A$ of $Y_t$ is given by
$$\tilde A\phi(s,x) = \frac{\partial\phi}{\partial s}(s,x) + A\phi(s,x) ; \qquad \phi \in C^2(\mathbb{R}\times\mathbb{R}^n) \tag{10.2.7}$$
where A is the characteristic operator of $X_t$ (working on the x-variables).
where A is the characteristic operator of Xt (working on the x-variables). Example 10.2.1 Let Xt = Bt be 1-dimensional Brownian motion and let the reward function be g(t, ξ) = e−αt+βξ ;
ξ∈R
where α, β ≥ 0 are constants. The characteristic operator A of Yts,x = is given by 2 (s, x) = ∂f + 1 · ∂ f ; Af f ∈ C2 . ∂s 2 ∂x2 Thus Ag = (−α + 12 β 2 )g ,
s+t x Bt
so if β 2 ≤ 2α then g ∗ = g and the best policy is to stop immediately. If β 2 > 2α we have U : = {(s, x); Ag(s, x) > 0} = R2 and therefore by (10.1.35) D = R2 and hence τ ∗ does not exist. If β 2 > 2α we can use Theorem 10.1.7 to prove that g ∗ = ∞: x
sup E (s,x) [g(Yt )] = sup E[e−α(s+t)+βBt ] t∈Sn
= sup [e
t∈Sn −α(s+t)
·e
βx+ 12 β 2 t
]
(see the remark following (5.1.6))
t∈Sn
= sup g(s, x) · e t∈Sn
(−α+ 12 β 2 )t
= g(s, x) · exp((−α + 12 β 2 )2n ) ,
so gn (s, x) → ∞ as n → ∞. Hence no optimal stopping exists in this case. Example 10.2.2 (When is the right time to sell the stocks? (Part 1)) We now return to a specific version of Problem 5 in the introduction: Suppose the price Xt at time t of a person’s assets (e.g. a house, stocks, oil ...) varies according to a stochastic differential equation of the form dXt = rXt dt + αXt dBt , X0 = x > 0 ,
where $B_t$ is 1-dimensional Brownian motion and $r, \alpha$ are known constants. (The problem of estimating $\alpha$ and r from a series of observations can be approached using the quadratic variation $\langle X, X\rangle_t$ of the process $\{X_t\}$ (Exercise 4.7) and filtering theory (Example 6.2.11), respectively.) Suppose that connected to the sale of the assets there is a fixed fee/tax or transaction cost $a > 0$. Then if the person decides to sell at time t the discounted net of the sale is
$$e^{-\rho t}(X_t - a) ,$$
where $\rho > 0$ is a given discounting factor. The problem is to find a stopping time $\tau$ that maximizes
$$E^{(s,x)}[e^{-\rho\tau}(X_\tau - a)] = E^{(s,x)}[g(\tau, X_\tau)] , \quad \text{where } g(t,\xi) = e^{-\rho t}(\xi - a) .$$
The characteristic operator $\tilde A$ of the process $Y_t = (s+t, X_t)$ is given by
$$\tilde Af(s,x) = \frac{\partial f}{\partial s} + rx\frac{\partial f}{\partial x} + \tfrac{1}{2}\alpha^2 x^2\frac{\partial^2 f}{\partial x^2} ; \qquad f \in C^2(\mathbb{R}^2) .$$
Hence
$$\tilde Ag(s,x) = -\rho e^{-\rho s}(x - a) + rxe^{-\rho s} = e^{-\rho s}\big((r - \rho)x + \rho a\big) .$$
So
$$U := \{(s,x);\ \tilde Ag(s,x) > 0\} = \begin{cases} \mathbb{R}\times\mathbb{R}_+ & \text{if } r \ge \rho \\ \{(s,x);\ x < \frac{a\rho}{\rho - r}\} & \text{if } r < \rho . \end{cases}$$
So if $r \ge \rho$ we have $U = D = \mathbb{R}\times\mathbb{R}_+$, so $\tau^*$ does not exist. If $r > \rho$ then $g^* = \infty$, while if $r = \rho$ then $g^*(s,x) = xe^{-\rho s}$. (The proofs of these statements are left as Exercise 10.5.) It remains to examine the case $r < \rho$. (If we regard $\rho$ as the sum of interest rate, inflation and tax etc., this is not an unreasonable assumption in applications.)

First we establish that the region D must be invariant w.r.t. t, in the sense that
$$D + (t_0, 0) = D \quad \text{for all } t_0 . \tag{10.2.8}$$
To prove (10.2.8) consider
$$D + (t_0, 0) = \{(t + t_0, x);\ (t,x) \in D\} = \{(s,x);\ (s - t_0, x) \in D\} = \{(s,x);\ g(s - t_0, x) < g^*(s - t_0, x)\}$$
$$= \{(s,x);\ e^{\rho t_0}g(s,x) < e^{\rho t_0}g^*(s,x)\} = \{(s,x);\ g(s,x) < g^*(s,x)\} = D ,$$
where we have used that
$$g^*(s - t_0, x) = \sup_\tau E^{(s-t_0,x)}[e^{-\rho\tau}(X_\tau - a)] = \sup_\tau E[e^{-\rho(\tau + (s - t_0))}(X_\tau^x - a)] = e^{\rho t_0}\sup_\tau E[e^{-\rho(\tau+s)}(X_\tau^x - a)] = e^{\rho t_0}g^*(s,x) .$$
10.2 The Time-Inhomogeneous Case
Therefore the connected component of D that contains U must have the form

    D(x0) = {(t, x); 0 < x < x0}    for some x0 ≥ aρ/(ρ − r) .

Note that D cannot have any other components, for if V is a component of D disjoint from U then Ag < 0 in V and so, if y ∈ V,

    E^y[g(Yτ)] = g(y) + E^y[ ∫_0^τ Ag(Yt)dt ] < g(y)

for all exit times τ bounded by the exit time from an x-bounded strip in V. From this we conclude by Theorem 10.1.9 c) that g∗(y) = g(y), which implies V = ∅.

Put τ(x0) = τD(x0) and let us compute

    g̃(s, x) = gx0(s, x) = E^{(s,x)}[g(Yτ(x0))] .    (10.2.9)

From Chapter 9 we know that f = g̃ is the (bounded) solution of the boundary value problem

    ∂f/∂s + rx ∂f/∂x + ½α²x² ∂²f/∂x² = 0    for 0 < x < x0    (10.2.10)
    f(s, x0) = e^{−ρs}(x0 − a) .

(Note that R × {0} does not contain any regular boundary points of D w.r.t. Yt = (s + t, Xt).) If we try a solution of (10.2.10) of the form

    f(s, x) = e^{−ρs}φ(x)

we get the following 1-dimensional problem

    −ρφ(x) + rxφ'(x) + ½α²x²φ''(x) = 0    for 0 < x < x0    (10.2.11)
    φ(x0) = x0 − a .

The general solution φ of (10.2.11) is

    φ(x) = C1 x^{γ1} + C2 x^{γ2} ,

where C1, C2 are arbitrary constants and

    γi = α^{−2}[ (½α² − r) ± √((r − ½α²)² + 2ρα²) ]    (i = 1, 2) ,    γ2 < 0 < γ1 .
Since φ(x) is bounded as x → 0 we must have C2 = 0, and the boundary requirement φ(x0) = x0 − a gives C1 = x0^{−γ1}(x0 − a). We conclude that the bounded solution f of (10.2.10) is

    gx0(s, x) = f(s, x) = e^{−ρs}(x0 − a)(x/x0)^{γ1} .    (10.2.12)

If we fix (s, x) then the value of x0 which maximizes gx0(s, x) is easily seen to be given by

    x0 = xmax = aγ1/(γ1 − 1)    (10.2.13)

(note that γ1 > 1 if and only if r < ρ). Thus we have arrived at the candidate gxmax(s, x) for g∗(s, x) = sup_τ E^{(s,x)}[e^{−ρτ}(Xτ − a)]. To verify that we indeed have gxmax = g∗ it would suffice to prove that gxmax is a supermeanvalued majorant of g (see Corollary 10.1.10). This can be done, but we do not give the details here, since this problem can be solved more easily by Theorem 10.4.1 (see Example 10.4.2).

The conclusion is therefore that one should sell the assets the first time their price reaches the value xmax = aγ1/(γ1 − 1). The expected discounted profit obtained from this strategy is

    g∗(s, x) = gxmax(s, x) = e^{−ρs} ((γ1 − 1)/a)^{γ1−1} (x/γ1)^{γ1} .
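As a quick numerical sanity check (my own sketch; the parameter values below are illustrative, not from the text), one can maximize x0 ↦ (x0 − a)(x/x0)^{γ1} on a grid — only the factor (x0 − a)x0^{−γ1} depends on x0 — and compare the result with the closed-form threshold (10.2.13):

```python
import math

# Illustrative parameters with r < rho, as assumed in the example
r, alpha, rho, a = 0.05, 0.3, 0.10, 1.0

# gamma1: the positive root of (1/2) a^2 g^2 + (r - (1/2) a^2) g - rho = 0
gamma1 = ((0.5 * alpha**2 - r)
          + math.sqrt((r - 0.5 * alpha**2)**2 + 2 * rho * alpha**2)) / alpha**2

# For fixed x < x0 the value g_{x0}(s,x) = e^{-rho s}(x0 - a)(x/x0)^{gamma1},
# so it suffices to maximize (x0 - a) * x0^(-gamma1) over x0 > a.
candidates = [a + k * 0.001 for k in range(1, 20000)]
x_best = max(candidates, key=lambda x0: (x0 - a) * x0 ** (-gamma1))

x_max = a * gamma1 / (gamma1 - 1)   # closed-form threshold (10.2.13)
```

The grid optimum agrees with (10.2.13) up to the grid spacing, and γ1 > 1 here since r < ρ.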
Remark. The reader is invited to check that the value x0 = xmax is the only value of x0 which makes the function x → gx0 (s, x)
(given by (10.2.9))
continuously differentiable at x0 . This is not a coincidence. In fact, it illustrates a general phenomenon which is known as the high contact (or smooth fit) principle. See Samuelson (1965), McKean (1965), Bather (1970) and Shiryaev (1978). This principle is the basis of the fundamental connection between optimal stopping and variational inequalities. Later in this chapter we will discuss some aspects of this connection. More information can be found in Bensoussan and Lions (1978) and Friedman (1976). See also Brekke and Øksendal (1991).
10.3 Optimal Stopping Problems Involving an Integral

Let

    dYt = b(Yt)dt + σ(Yt)dBt ,    Y0 = y    (10.3.1)
be an Itˆ o diffusion in Rk . Let g: Rk → [0, ∞) be continuous and let f : Rk → [0, ∞) be Lipschitz continuous with at most linear growth. (These conditions
can be relaxed. See (10.1.37) and Theorem 10.4.1.) Consider the optimal stopping problem: Find Φ(y) and τ∗ such that

    Φ(y) = sup_τ E^y[ ∫_0^τ f(Yt)dt + g(Yτ) ] = E^y[ ∫_0^{τ∗} f(Yt)dt + g(Yτ∗) ] .    (10.3.2)
This problem can be reduced to our original problem (10.1.2)–(10.1.3) by proceeding as follows: Define the Itô diffusion Zt in Rk × R = R^{k+1} by

    dZt = (dYt, dWt) = (b(Yt), f(Yt)) dt + (σ(Yt), 0) dBt ;    Z0 = z = (y, w) .    (10.3.3)

Then we see that

    Φ(y) = sup_τ E^{(y,0)}[Wτ + g(Yτ)] = sup_τ E^{(y,0)}[g̃(Zτ)]

with

    g̃(z) := g̃(y, w) := g(y) + w ;    z = (y, w) ∈ Rk × R .    (10.3.4)
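The reduction (10.3.3) can be mirrored directly in simulation: an extra coordinate Wt accumulates the running reward, so the integral term becomes a plain terminal reward Wτ + g(Yτ). A minimal Euler–Maruyama sketch (my own illustration; the degenerate test case with b = σ = 0 is chosen so the answer is known exactly):

```python
import math, random

def simulate_Z(b, sigma, f, y0, T, dt=1e-3, rng=random):
    """Euler-Maruyama for the augmented diffusion Z_t = (Y_t, W_t) of (10.3.3):
    dY = b(Y)dt + sigma(Y)dB and dW = f(Y)dt, so W_T approximates the integral
    of f(Y_t) over [0, T]."""
    y, w, t = y0, 0.0, 0.0
    while t < T:
        dB = rng.gauss(0.0, math.sqrt(dt))
        y, w, t = y + b(y) * dt + sigma(y) * dB, w + f(y) * dt, t + dt
    return y, w

# Degenerate sanity case: b = sigma = 0, f(y) = y, Y0 = 1, so W_T should equal T.
yT, wT = simulate_Z(lambda y: 0.0, lambda y: 0.0, lambda y: y, 1.0, T=2.0)
```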
This is again a problem of the type (10.1.2)–(10.1.3), with Xt replaced by Zt and g replaced by g̃. Note that the connection between the characteristic operators A_Y of Yt and A_Z of Zt is given by

    A_Z φ(z) = A_Z φ(y, w) = A_Y φ(y, w) + f(y) ∂φ/∂w ,    φ ∈ C²(R^{k+1}) .    (10.3.5)

In particular, if g̃(y, w) = g(y) + w ∈ C²(R^{k+1}) then

    A_Z g̃(y, w) = A_Y g(y) + f(y) .    (10.3.6)

Hence, in this general case the domain U of (10.1.34) gets the form

    U = {y; A_Y g(y) + f(y) > 0} .    (10.3.7)

Example 10.3.1 Consider the optimal stopping problem

    Φ(x) = sup_τ E^x[ ∫_0^τ θe^{−ρt}Xt dt + e^{−ρτ}Xτ ] ,

where

    dXt = αXt dt + βXt dBt ;    X0 = x > 0
is geometric Brownian motion (α, β, θ constants, θ > 0). We put

    dYt = (dt, dXt) = (1, αXt) dt + (0, βXt) dBt ;    Y0 = (s, x)

and

    dZt = (dYt, dWt) = (1, αXt, θe^{−ρt}Xt) dt + (0, βXt, 0) dBt ;    Z0 = (s, x, w) .

Then with

    f(y) = f(s, x) = θe^{−ρs}x ,    g(y) = e^{−ρs}x

and

    g̃(s, x, w) = g(s, x) + w = e^{−ρs}x + w

we have

    A_Z g̃ = ∂g̃/∂s + αx ∂g̃/∂x + ½β²x² ∂²g̃/∂x² + θe^{−ρs}x ∂g̃/∂w = (−ρ + α + θ)e^{−ρs}x .

Hence

    U = {(s, x, w); A_Z g̃(s, x, w) > 0} = R³ if ρ < α + θ ,    ∅ if ρ ≥ α + θ .

From this we conclude (see Exercise 10.6):

    If ρ ≥ α + θ then τ∗ = 0 and Φ(s, x, w) = g̃(s, x, w) = e^{−ρs}x + w .    (10.3.8)
    If α < ρ < α + θ then τ∗ does not exist and Φ(s, x, w) = (θx/(ρ − α))e^{−ρs} + w .    (10.3.9)
    If ρ ≤ α then τ∗ does not exist and Φ = ∞ .    (10.3.10)
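The value in (10.3.9) can be sanity-checked with a small computation (my own sketch; numbers illustrative): since E[Xt] = x e^{αt} for geometric Brownian motion, Fubini gives E[∫_0^∞ θe^{−ρt}Xt dt] = θx/(ρ − α), which a simple quadrature of the deterministic mean path reproduces.

```python
import math

alpha, theta, rho = 0.05, 0.10, 0.12   # illustrative, with alpha < rho < alpha + theta
x = 2.0

# E[X_t] = x * exp(alpha * t) for GBM, so the expected discounted running reward
# is the deterministic integral of theta * x * exp((alpha - rho) * t) over [0, inf).
dt, T = 1e-3, 400.0                    # T large enough that the tail is negligible
steps = int(T / dt)
numeric = sum(theta * x * math.exp((alpha - rho) * (k + 0.5) * dt) * dt
              for k in range(steps))   # midpoint rule

closed_form = theta * x / (rho - alpha)   # the value part of (10.3.9), with w = 0
```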
10.4 Connection with Variational Inequalities

The 'high contact principle' says, roughly, that – under certain conditions – the solution g∗ of (10.1.2)–(10.1.3) is a C¹ function on Rn if g ∈ C²(Rn). This is useful information which can help us to determine g∗. Indeed, this principle is so useful that it is frequently applied in the literature also in cases where its validity has not been rigorously proved.

Fortunately it turns out to be easy to prove a sufficient condition of high contact type, i.e. a kind of verification theorem for optimal stopping, which makes it easy to verify that a given candidate for g∗ (that we may have
found by guessing or intuition) is actually equal to g∗. The result below is a simplified variant of a result in Brekke and Øksendal (1991):

In the following we fix a domain G in Rk and we let

    dYt = b(Yt)dt + σ(Yt)dBt ;    Y0 = y    (10.4.1)

be an Itô diffusion in Rk. Define

    τG = τG(y, ω) = inf{t > 0; Yt(ω) ∉ G} .    (10.4.2)

Let f: Rk → R, g: Rk → R be continuous functions satisfying

    (a)  E^y[ ∫_0^{τG} f⁻(Yt)dt ] < ∞    for all y ∈ Rk    (10.4.3)

and

    (b)  the family {g⁻(Yτ); τ stopping time, τ ≤ τG} is uniformly integrable w.r.t. R^y (the probability law of Yt), for all y ∈ Rk .    (10.4.4)

Let T denote the set of all stopping times τ ≤ τG. Consider the following problem: Find Φ(y) and τ∗ ∈ T such that
    Φ(y) = sup_{τ∈T} J^τ(y) = J^{τ∗}(y) ,    (10.4.5)

where

    J^τ(y) = E^y[ ∫_0^τ f(Yt)dt + g(Yτ) ]    for τ ∈ T .

Note that since J^0(y) = g(y) we have

    Φ(y) ≥ g(y)    for all y ∈ G .    (10.4.6)
We can now formulate the variational inequalities. As usual we let

    L = L_Y = Σ_{i=1}^k b_i(y) ∂/∂y_i + ½ Σ_{i,j=1}^k (σσ^T)_{ij}(y) ∂²/∂y_i∂y_j

be the partial differential operator which coincides with the generator A_Y of Yt on C0²(Rk).

Theorem 10.4.1 (Variational inequalities for optimal stopping)
a) Suppose we can find a function φ: Ḡ → R such that
(i) φ ∈ C¹(G) ∩ C(Ḡ) ,
(ii) φ ≥ g on G and lim_{t→τG} φ(Yt) = g(YτG)·X_{τG<∞} a.s. R^y .
Define

    D = {x ∈ G; φ(x) > g(x)} .
(iii) Suppose Yt spends 0 time on ∂D a.s., i.e.

    E^y[ ∫_0^{τG} X_{∂D}(Yt)dt ] = 0    for all y ∈ G ,

(iv) and suppose that ∂D is a Lipschitz surface, i.e. ∂D is locally the graph of a function h: R^{k−1} → R such that there exists K < ∞ with

    |h(x) − h(y)| ≤ K|x − y|    for all x, y .

Moreover, suppose the following:
(v) φ ∈ C²(G\∂D) and the second order derivatives of φ are locally bounded near ∂D ,
(vi) Lφ + f ≤ 0 on G\D .

Then φ(y) ≥ Φ(y) for all y ∈ G.
Moreover, suppose the following: φ ∈ C 2 (G\∂D) and the second order derivatives of φ are locally bounded near ∂D Lφ + f ≤ 0 on G \ D. Then φ(y) ≥ Φ(y) for all y ∈ G.
b) Suppose, in addition to the above, that (vii) Lφ + f = 0 on D / D} < ∞ a.s. Ry for all y ∈ G (viii) τD : = inf{t > 0; Yt ∈ and (ix) the family {φ(Yτ ); τ ≤ τD , τ ∈ T } is uniformly integrable w.r.t. Ry , for all y ∈ G. Then
# τ
φ(y) = Φ(y) = sup E τ ∈T
and
y
$ f (Yt )dt + g(Yτ ) ;
y∈G
(10.4.7)
0
τ ∗ = τD
(10.4.8)
is an optimal stopping time for this problem. Proof. By (i), (iv) and (v) we can find a sequence of functions φj ∈ C 2 (G) ∩ C(G ), j = 1, 2, . . ., such that (a) φj → φ uniformly on compact subsets of G, as j → ∞ (b) Lφj → Lφ uniformly on compact subsets of G \ ∂D, as j → ∞ (c) {Lφj }∞ j=1 is locally bounded on G. (See Appendix D). ∞ ∞ GR . Let GR R=1 be a sequence of bounded open sets such that G = R=1
Put T_R = min(R, inf{t > 0; Yt ∉ G_R}) and let τ ≤ τG be a stopping time. Let y ∈ G. Then by Dynkin's formula

    E^y[φj(Y_{τ∧T_R})] = φj(y) + E^y[ ∫_0^{τ∧T_R} Lφj(Yt)dt ] .    (10.4.9)
Hence by (a), (b), (c), (iii) and bounded a.e. convergence,

    φ(y) = lim_{j→∞} E^y[ ∫_0^{τ∧T_R} (−Lφj(Yt))dt + φj(Y_{τ∧T_R}) ]
         = E^y[ ∫_0^{τ∧T_R} (−Lφ(Yt))dt + φ(Y_{τ∧T_R}) ] .    (10.4.10)

Therefore, by (ii), (iii) and (vi),

    φ(y) ≥ E^y[ ∫_0^{τ∧T_R} f(Yt)dt + g(Y_{τ∧T_R}) ] .

Hence by the Fatou lemma and (10.4.3), (10.4.4),

    φ(y) ≥ lim_{R→∞} E^y[ ∫_0^{τ∧T_R} f(Yt)dt + g(Y_{τ∧T_R}) ] ≥ E^y[ ∫_0^τ f(Yt)dt + g(Yτ) ] .

Since τ ≤ τG was arbitrary, we conclude that

    φ(y) ≥ Φ(y)    for all y ∈ G ,    (10.4.11)

which proves a).

We proceed to prove b): If y ∉ D then φ(y) = g(y) ≤ Φ(y), so by (10.4.11) we have

    φ(y) = Φ(y)  and  τ̂ = τ̂(y, ω) := 0 is optimal for y ∉ D .    (10.4.12)
Next, suppose y ∈ D. Let {Dk}_{k=1}^∞ be an increasing sequence of open sets such that D̄k ⊂ D, D̄k is compact and D = ∪_{k=1}^∞ Dk. Put

    τk = inf{t > 0; Yt ∉ Dk} ,    k = 1, 2, ...

By Dynkin's formula we have for y ∈ Dk,

    φ(y) = lim_{j→∞} φj(y) = lim_{j→∞} E^y[ ∫_0^{τk∧T_R} (−Lφj(Yt))dt + φj(Y_{τk∧T_R}) ]
         = E^y[ ∫_0^{τk∧T_R} (−Lφ(Yt))dt + φ(Y_{τk∧T_R}) ]
         = E^y[ ∫_0^{τk∧T_R} f(Yt)dt + φ(Y_{τk∧T_R}) ] .
So by uniform integrability and (ii), (vii), (viii) we get

    φ(y) = lim_{R,k→∞} E^y[ ∫_0^{τk∧T_R} f(Yt)dt + φ(Y_{τk∧T_R}) ]
         = E^y[ ∫_0^{τD} f(Yt)dt + g(YτD) ] = J^{τD}(y) ≤ Φ(y) .    (10.4.13)

Combining (10.4.11) and (10.4.13) we get φ(y) ≥ Φ(y) ≥ J^{τD}(y) = φ(y), so

    φ(y) = Φ(y)  and  τ̂(y, ω) := τD is optimal when y ∈ D .    (10.4.14)

From (10.4.12) and (10.4.14) we conclude that

    φ(y) = Φ(y)    for all y ∈ G .

Moreover, the stopping time τ̂ defined by

    τ̂(y, ω) = 0 for y ∉ D ,    τ̂(y, ω) = τD for y ∈ D

is optimal. By Theorem 10.1.12 we conclude that τD is optimal also.
Example 10.4.2 (When is the right time to sell the stocks? (Part 2)) To illustrate Theorem 10.4.1 let us apply it to reconsider Example 10.2.2: Rather than proving (10.2.8) and the following properties of D, we now simply guess/assume that D has the form

    D = {(s, x); 0 < x < x0}    for some x0 > 0 ,

which is intuitively reasonable. Then we solve (10.2.11) for arbitrary x0 and we arrive at the following candidate φ for g∗:

    φ(s, x) = e^{−ρs}(x0 − a)(x/x0)^{γ1}    for 0 < x < x0
    φ(s, x) = e^{−ρs}(x − a)                for x ≥ x0 .

The requirement that φ ∈ C¹ (Theorem 10.4.1 (i)) gives the value (10.2.13) for x0. It is clear that φ ∈ C² outside ∂D and by construction Lφ = 0 on D. Moreover, conditions (iii), (iv), (viii) and (ix) clearly hold. It remains to verify that

(ii) φ(s, x) > g(s, x) for 0 < x < x0, i.e. φ(s, x) > e^{−ρs}(x − a) for 0 < x < x0

and

(vi) Lφ(s, x) ≤ 0 for x > x0, i.e. Lg(s, x) ≤ 0 for x > x0 .

This is easily done by direct calculation (assuming r < ρ).
We conclude that φ = g ∗ and τ ∗ = τD is optimal (with the value (10.2.13) for x0 ).
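These verifications can also be carried out on a grid (my own numerical sketch, with illustrative parameters; the time factor e^{−ρs} is divided out): check Lφ = 0 on D by finite differences, φ > g on D, and Lg ≤ 0 for x > x0.

```python
import math

r, alpha, rho, a = 0.05, 0.3, 0.10, 1.0   # illustrative, with r < rho

gamma1 = ((0.5 * alpha**2 - r)
          + math.sqrt((r - 0.5 * alpha**2)**2 + 2 * rho * alpha**2)) / alpha**2
x0 = a * gamma1 / (gamma1 - 1)            # threshold (10.2.13)

def phi(x):
    # candidate value function with the factor e^{-rho s} divided out
    return (x0 - a) * (x / x0) ** gamma1 if x < x0 else x - a

def L(h, x, eps=1e-4):
    # operator acting on e^{-rho s} h(x): -rho h + r x h' + (1/2) alpha^2 x^2 h''
    d1 = (h(x + eps) - h(x - eps)) / (2 * eps)
    d2 = (h(x + eps) - 2 * h(x) + h(x - eps)) / eps**2
    return -rho * h(x) + r * x * d1 + 0.5 * alpha**2 * x**2 * d2

inside = [0.1 + 0.1 * k for k in range(int((x0 - 0.2) / 0.1))]   # grid in D
outside = [x0 + 0.1 * k for k in range(1, 50)]                   # grid above x0

ok_pde = all(abs(L(phi, x)) < 1e-5 for x in inside)        # L phi = 0 on D
ok_major = all(phi(x) > x - a for x in inside)             # condition (ii)
ok_out = all(L(lambda y: y - a, x) < 0.0 for x in outside) # condition (vi)
```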
Exercises

10.1.* In each of the optimal stopping problems below find the supremum g∗ and – if it exists – an optimal stopping time τ∗. (Here Bt denotes 1-dimensional Brownian motion.)
a) g∗(x) = sup_τ E^x[Bτ²]
b) g∗(x) = sup_τ E^x[|Bτ|^p], where p > 0
c) g∗(x) = sup_τ E^x[e^{−Bτ²}]
d) g∗(s, x) = sup_τ E^{(s,x)}[e^{−ρ(s+τ)} cosh Bτ], where ρ > 0 and cosh x = ½(e^x + e^{−x}).

10.2.* a) Prove that the only nonnegative (Bt-) superharmonic functions in R² are the constants.
(Hint: Suppose u is a nonnegative superharmonic function and that there exist x, y ∈ R² such that u(x) < u(y). Consider E^x[u(Bτ)], where τ is the first hitting time for Bt of a small disc centered at y.)
b) Prove that the only nonnegative superharmonic functions in R are the constants and use this to find g∗(x) when

    g(x) = xe^{−x} for x > 0 ,    g(x) = 0 for x ≤ 0 .

c) Let γ ∈ R, n ≥ 3 and define, for x ∈ Rn,

    fγ(x) = |x|^γ for |x| ≥ 1 ,    fγ(x) = 1 for |x| < 1 .

For what values of γ is fγ(·) ((Bt)-) harmonic for |x| > 1? Prove that fγ is superharmonic in Rn iff γ ∈ [2 − n, 0].

10.3.* Find g∗, τ∗ such that

    g∗(s, x) = sup_τ E^{(s,x)}[e^{−ρ(s+τ)}Bτ²] = E^{(s,x)}[e^{−ρ(s+τ∗)}Bτ∗²] ,
where Bt is 1-dimensional Brownian motion, ρ > 0 is constant.
Hint: First assume that the continuation region has the form D = {(s, x); −x0 < x < x0} for some x0 and then try to determine x0. Then apply Theorem 10.4.1.

10.4. Let Xt be an Itô diffusion on Rn and g: Rn → R⁺ a continuous reward function. Define

    ĝ(x) = sup{E^x[g(Xτ)]; τ stopping time, E^x[τ] < ∞} .

Show that ĝ = g∗. [Hint: If τ is a stopping time put τk = τ ∧ k for k = 1, 2, ... and consider E^x[g(Xτ)·X_{τ≤k}].]

10.5. Prove the statements from Example 10.2.2:
a) if r > ρ then g∗ = ∞,
b) if r = ρ then g∗(s, x) = xe^{−ρs}.

10.6. Prove statements (10.3.8), (10.3.9), (10.3.10) in Example 10.3.1.

10.7. As a supplement to Exercise 10.4 it is worth noting that if g is not bounded below then the two problems

    g∗(x) = sup{E^x[g(Xτ)]; τ stopping time}

and

    ĝ(x) = sup{E^x[g(Xτ)]; τ stopping time, E^x[τ] < ∞}

need not have the same solution. For example, if g(x) = x, Xt = Bt ∈ R, prove that

    g∗(x) = ∞ for all x ∈ R ,  while  ĝ(x) = x for all x ∈ R .

(See Exercise 7.4.)

10.8. Give an example with g not bounded below where Theorem 10.1.9 a) fails. (Hint: See Exercise 10.7.)

10.9.* Solve the optimal stopping problem

    Φ(x) = sup_τ E^x[ ∫_0^τ e^{−ρt}Bt² dt + e^{−ρτ}Bτ² ] .
10.10. Prove the following simple, but useful, observation, which can be regarded as an extension of (10.1.35): Let W = {(s, x); ∃τ with g(s, x) < E^{(s,x)}[g(s + τ, Xτ)]}. Then W ⊂ D.

10.11. Consider the optimal stopping problem

    g∗(s, x) = sup_τ E^{(s,x)}[e^{−ρ(s+τ)}Bτ⁺] ,

where Bt ∈ R and x⁺ = max{x, 0}.
a) Use the argument for (10.2.8) and Exercise 10.10 to prove that the continuation region D has the form

    D = {(s, x); x < x0}    for some x0 > 0 .

b) Determine x0 and find g∗.
c) Verify the high contact principle:

    ∂g∗/∂x = ∂g/∂x    when (s, x) = (s, x0) ,

where g(t, x) = e^{−ρt}x⁺.

10.12.* The first time the high contact principle was formulated seems to be in a paper by Samuelson (1965), who studied the optimal time for selling an asset, if the reward obtained by selling at the time t and when the price is ξ is given by

    g(t, ξ) = e^{−ρt}(ξ − 1)⁺ .

The price process is assumed to be a geometric Brownian motion Xt given by

    dXt = rXt dt + αXt dBt ,    X0 = x > 0 ,

where r < ρ. In other words, the problem is to find g∗, τ∗ such that

    g∗(s, x) = sup_τ E^{(s,x)}[e^{−ρ(s+τ)}(Xτ − 1)⁺] = E^{(s,x)}[e^{−ρ(s+τ∗)}(Xτ∗ − 1)⁺] .

a) Use the argument for (10.2.8) and Exercise 10.10 to prove that the continuation region D has the form

    D = {(s, x); 0 < x < x0}    for some x0 > ρ/(ρ − r) .
b) For a given x0 > ρ/(ρ − r) solve the boundary value problem

    ∂f/∂s + rx ∂f/∂x + ½α²x² ∂²f/∂x² = 0    for 0 < x < x0
    f(s, 0) = 0
    f(s, x0) = e^{−ρs}(x0 − 1)⁺

by trying f(s, x) = e^{−ρs}φ(x).
c) Determine x0 by using the high contact principle, i.e. by using that

    ∂f/∂x = ∂g/∂x    when x = x0 .

d) With f, x0 as in b), c) define

    γ(s, x) = f(s, x) for x < x0 ;    γ(s, x) = e^{−ρs}(x − 1)⁺ for x ≥ x0 .

Use Theorem 10.4.1 to verify that γ = g∗ and that τ∗ = τD is optimal.

10.13.* (A resource extraction problem) Suppose the price Pt of one unit of a resource (e.g. gas, oil) at time t varies like a geometric Brownian motion

    dPt = αPt dt + βPt dBt ;    P0 = p ,

where Bt is 1-dimensional Brownian motion and α, β are constants. Let Qt denote the amount of remaining resources at time t. Assume that the rate of extraction is proportional to the remaining amount, so that

    dQt = −λQt dt ;    Q0 = q ,

where λ > 0 is a constant. If the running cost rate is K > 0 and we stop the extraction at the time τ = τ(ω), then the expected total discounted profit is given by

    J^τ(s, p, q) = E^{(s,p,q)}[ ∫_0^τ (λPtQt − K)e^{−ρ(s+t)}dt + e^{−ρ(s+τ)}g(Pτ, Qτ) ] ,

where ρ > 0 is the discounting exponent and g(p, q) is a given bequest function giving the value of the remaining resource amount q when the price is p.
a) Write down the characteristic operator A of the diffusion process

    dXt = (dt, dPt, dQt) ;    X0 = (s, p, q)
and formulate the variational inequalities of Theorem 10.4.1 corresponding to the optimal stopping problem

    Φ(s, p, q) = sup_τ J^τ(s, p, q) = J^{τ∗}(s, p, q) .

b) Assume that g(p, q) = pq and find the domain U corresponding to (10.1.34), (10.3.7), i.e.

    U = {(s, p, q); A(e^{−ρs}g(p, q)) + f(s, p, q) > 0} ,

where f(s, p, q) = e^{−ρs}(λpq − K). Conclude that
(i) if ρ ≥ α then τ∗ = 0 and Φ(s, p, q) = pqe^{−ρs} ,
(ii) if ρ < α then D ⊃ {(s, p, q); pq > K/(α − ρ)} .
c) As a candidate for Φ when ρ < α we try a function of the form

    φ(s, p, q) = e^{−ρs}pq for 0 < pq ≤ y0 ;    φ(s, p, q) = e^{−ρs}ψ(pq) for pq > y0

for a suitable ψ: R → R and a suitable y0. Use Theorem 10.4.1 to determine ψ, y0 and to verify that with this choice of ψ, y0 we have φ = Φ and

    τ∗ = inf{t > 0; PtQt ≤ y0} ,  if ρ < α < ρ + λ .

d) What happens if ρ + λ ≤ α?

10.14.* (Finding the optimal investment time (I)) Solve the optimal stopping problem

    Ψ(s, p) = sup_τ E^{(s,p)}[ ∫_τ^∞ e^{−ρ(s+t)}Pt dt − Ce^{−ρ(s+τ)} ] ,

where

    dPt = αPt dt + βPt dBt ;    P0 = p ,

Bt is 1-dimensional Brownian motion and α, β, ρ, C are constants, 0 < α < ρ and C > 0. (We may interpret this as the problem of finding the optimal time τ for investment in a project. The profit rate after investment is Pt and the cost of the investment is C. Thus Ψ gives the maximal expected discounted net profit.)
Hint: Write

    ∫_τ^∞ e^{−ρ(s+t)}Pt dt = e^{−ρs}[ ∫_0^∞ e^{−ρt}Pt dt − ∫_0^τ e^{−ρt}Pt dt ] .

Compute E[ ∫_0^∞ e^{−ρt}Pt dt ] by using the solution formula for Pt (see Chapter 5) and then apply Theorem 10.4.1 to the problem

    Φ(s, p) = sup_τ E^{(s,p)}[ −∫_0^τ e^{−ρ(s+t)}Pt dt − Ce^{−ρ(s+τ)} ] .
10.15. Let Bt be 1-dimensional Brownian motion and let ρ > 0 be constant.
a) Show that the family {e^{−ρτ}Bτ; τ stopping time} is uniformly integrable w.r.t. P^x.
b) Solve the optimal stopping problem

    Φ(s, x) = sup_τ E^{(s,x)}[e^{−ρ(s+τ)}(Bτ − a)]

when a > 0 is constant. This may be regarded as a variation of Example 10.2.2/10.4.2 with the price process represented by Bt rather than Xt.

10.16. (Finding the optimal investment time (II)) Solve the optimal stopping problem

    Ψ(s, p) = sup_τ E^{(s,p)}[ ∫_τ^∞ e^{−ρ(s+t)}Pt dt − Ce^{−ρ(s+τ)} ] ,

where

    dPt = μ dt + σ dBt ;    P0 = p ,

with μ, σ ≠ 0 constants. (Compare with Exercise 10.14.)

10.17. a) Let

    dXt = μ dt + σ dBt ;    X0 = x ∈ R ,

where μ and σ are constants. Prove that if ρ > 0 is constant then

    E^x[ ∫_0^∞ e^{−ρt}|Xt| dt ] < ∞    for all x .

b) Solve the optimal stopping problem

    Φ(s, x) = sup_{τ≥0} E^{s,x}[ ∫_0^τ e^{−ρ(s+t)}(Xt − a)dt ] ,

where a ≥ 0 is a constant.
11 Application to Stochastic Control
11.1 Statement of the Problem

Suppose that the state of a system at time t is described by an Itô process Xt of the form

    dXt = dXt^u = b(t, Xt, ut)dt + σ(t, Xt, ut)dBt ,    (11.1.1)

where Xt ∈ Rn, b: R × Rn × U → Rn, σ: R × Rn × U → R^{n×m} and Bt is m-dimensional Brownian motion. Here ut ∈ U ⊂ Rk is a parameter whose value we can choose in the given Borel set U at any instant t in order to control the process Xt. Thus ut = u(t, ω) is a stochastic process. Since our decision at time t must be based upon what has happened up to time t, the function ω → u(t, ω) must (at least) be measurable w.r.t. Ft^{(m)}, i.e. the process ut must be Ft^{(m)}-adapted. Thus the right hand side of (11.1.1) is well-defined as a stochastic integral, under suitable assumptions on the functions b and σ. At the moment we will not specify the conditions on b and σ further, but simply assume that the process Xt satisfying (11.1.1) exists. See further comments on this at the end of this chapter.

Let {Xh^{s,x}}_{h≥s} be the solution of (11.1.1) such that Xs^{s,x} = x, i.e.

    Xh^{s,x} = x + ∫_s^h b(r, Xr^{s,x}, ur)dr + ∫_s^h σ(r, Xr^{s,x}, ur)dBr ;    h ≥ s ,

and let the probability law of Xt starting at x for t = s be denoted by Q^{s,x}, so that

    Q^{s,x}[Xt1 ∈ F1, ..., Xtk ∈ Fk] = P^0[Xt1^{s,x} ∈ F1, ..., Xtk^{s,x} ∈ Fk]    (11.1.2)

for s ≤ ti, Fi ⊂ Rn; 1 ≤ i ≤ k, k = 1, 2, ...

B. Øksendal, Stochastic Differential Equations, Universitext, DOI 10.1007/978-3-642-14394-6_11, © Springer-Verlag Berlin Heidelberg 2013
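A controlled equation of the form (11.1.1) can be simulated once b, σ and a Markov control u(t, x) are supplied. The Euler–Maruyama sketch below is my own illustration (coefficients are user-supplied and assumed regular enough for (11.1.1) to have a solution); the noise-free test case makes the answer exact.

```python
import math, random

def controlled_path(b, sigma, u, x0, t0, T, dt=1e-3, rng=random):
    """Euler-Maruyama for dX = b(t, X, u_t)dt + sigma(t, X, u_t)dB under a
    Markov control u_t = u(t, X_t)."""
    t, x = t0, x0
    while t < T:
        v = u(t, x)                         # control chosen from current state
        dB = rng.gauss(0.0, math.sqrt(dt))
        x += b(t, x, v) * dt + sigma(t, x, v) * dB
        t += dt
    return x

# Noise-free sanity case: dX = u dt with u(t, x) = 1 gives X_T = x0 + (T - t0).
xT = controlled_path(lambda t, x, v: v, lambda t, x, v: 0.0,
                     lambda t, x: 1.0, x0=0.0, t0=0.0, T=1.0)
```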
Let f: R × Rn × U → R (the "profit rate" function) and g: R × Rn → R (the "bequest" function) be given continuous functions, let G be a fixed domain in R × Rn and let T be the first exit time after s from G for the process {Xr^{s,x}}_{r≥s}, i.e.

    T = T^{s,x}(ω) = inf{r > s; (r, Xr^{s,x}(ω)) ∉ G} ≤ ∞ .    (11.1.3)

Suppose

    E^{s,x}[ ∫_s^T |f^{ur}(r, Xr)|dr + |g(T, XT)|·X_{T<∞} ] < ∞ .

Then considering the figure above and using some intuition we see that it is optimal to excite as much as possible if Xt is in the strip 0 < x < 1, to avoid exiting from G in the interval {t1} × (0, 1). Using that Xt is just a time change of Brownian motion (see Chapter 8) we conclude that this optimal control leads to a process X∗ which jumps immediately to the value 1 with probability x and to the value 0 with probability 1 − x, if the starting point is x ∈ (0, 1). If the starting point is x ∈ [1, ∞) we simply choose our control to be zero. In other words, heuristically we should have

    u∗(t, x) = ∞ if x ∈ (0, 1) ,    u∗(t, x) = 0 if x ∈ [1, ∞)    (11.2.60)

with corresponding expected payoff

    φ∗(s, x) = E^{s,x}[K(X∗_{t1})] = x if 0 ≤ x ≤ 1 ,    1 if x > 1 .    (11.2.61)
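The heuristic can be checked by simulation (my own rough sketch, assuming dynamics of the form dXt = u dBt as in the example; all parameters illustrative): for any constant control u the controlled process is a martingale, so it exits the strip (0, 1) at 1 with probability approximately equal to the starting point x; a larger u only makes the exit faster.

```python
import math, random

rng = random.Random(1)
u, dt, x_start = 1.0, 1e-3, 0.3
n_paths, hits = 2000, 0

for _ in range(n_paths):
    x = x_start
    # dX = u dB is a martingale, so P(exit at 1) should be close to x_start
    while 0.0 < x < 1.0:
        x += u * rng.gauss(0.0, math.sqrt(dt))
    if x >= 1.0:
        hits += 1

p_hat = hits / n_paths
```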
Thus we see that our candidate u∗ for optimal control is not continuous (not even finite!) and the corresponding optimal process Xt∗ is not an Itˆo diffusion (it is not even continuous). So to handle this case mathematically it is necessary to enlarge the family of admissible controls (and the family of corresponding processes). For example, one can prove an extended version of Theorem 11.2.2 which allows us to conclude that our choice of u∗ above does indeed give at least as good performance as any other Markov control u and that φ∗ given by (11.2.61) does coincide with the maximal expected payoff Φ defined by (11.2.57). This last example illustrates the importance of the question of existence in general, both of the optimal control u∗ and of the corresponding solution Xt
of the stochastic differential equation (11.1.1). We briefly outline some results in this direction:

With certain conditions on b, σ, f, g, ∂G and assuming that the set of control values is compact, one can show, using general results from nonlinear partial differential equations, that a smooth function φ exists such that

    sup_v {f^v(y) + (L^v φ)(y)} = 0    for y ∈ G

and

    φ(y) = g(y)    for y ∈ ∂G .

Then by a measurable selection theorem one can find a (measurable) function u∗(y) such that

    f^{u∗}(y) + (L^{u∗}φ)(y) = 0    (11.2.62)

for a.a. y ∈ G w.r.t. Lebesgue measure in R^{n+1}. Even if u∗ is only known to be measurable, one can show that the corresponding solution Xt = Xt^{u∗} of (11.1.1) exists (see Stroock and Varadhan (1979) for general results in this direction). Then by inspecting the proof of Theorem 11.2.2 one can see that it suffices to have (11.2.62) satisfied outside a subset of G with Green measure 0 (see Definition 9.3.4). Under suitable conditions on b and σ one can in fact show that the Green measure is absolutely continuous w.r.t. Lebesgue measure. Thus by (11.2.62) (and a strengthened Theorem 11.2.2) u∗ is an optimal control. We refer the reader to Fleming and Rishel (1975), Bensoussan and Lions (1978), Dynkin and Yushkevich (1979) and Krylov (1980) for details and further studies.
11.3 Stochastic control problems with terminal conditions

In many applications there are constraints on the types of Markov controls u to be considered, for example in terms of the probabilistic behaviour of Yt^u at the terminal time t = T. Such problems can often be handled by applying a kind of "Lagrange multiplier" method, which we now describe:

Consider the problem of finding Φ(y) and u∗(y) such that

    Φ(y) = sup_{u∈K} J^u(y) = J^{u∗}(y) ,    (11.3.1)

where

    J^u(y) = E^y[ ∫_0^{τG} f^u(Yt^u)dt + g(Y_{τG}^u) ] ,    (11.3.2)
and where the supremum is taken over the space K of all Markov controls u: R^{n+1} → U ⊂ Rk such that

    E^y[Mi(Y_{τG}^u)] = 0 ,    i = 1, 2, ..., l ,    (11.3.3)

where M = (M1, ..., Ml): R^{n+1} → R^l is a given continuous function with

    E^y[|M(Y_{τG}^u)|] < ∞    for all y, u ,    (11.3.4)

and we interpret g(Y_{τG}(ω)) as 0 if τG(ω) = ∞.

Now we introduce a related, but unconstrained problem as follows: For each λ ∈ R^l and each Markov control u define

    Jλ^u(y) = E^y[ ∫_0^{τG} f^u(Yt^u)dt + g(Y_{τG}^u) + λ · M(Y_{τG}^u) ] ,    (11.3.5)

where · denotes the inner product in R^l. Find Φλ(y) and uλ∗(y) such that

    Φλ(y) = sup_u Jλ^u(y) = Jλ^{uλ∗}(y) ,    (11.3.6)

without terminal conditions.

Theorem 11.3.1 Suppose that, for all λ ∈ Λ ⊂ R^l, we can find Φλ(y) and uλ∗ solving the (unconstrained) stochastic control problem (11.3.5)–(11.3.6). Moreover, suppose that there exists λ0 ∈ Λ such that

    E^y[M(Y_{τG}^{u_{λ0}∗})] = 0 .    (11.3.7)
Then Φ(y) := Φ_{λ0}(y) and u∗ := u_{λ0}∗ solve the constrained stochastic control problem (11.3.1)–(11.3.3).

Proof. Let u be a Markov control, λ ∈ Λ. Then by the definition of uλ∗ we have

    E^y[ ∫_0^{τG} f^{uλ∗}(Yt^{uλ∗})dt + g(Y_{τG}^{uλ∗}) + λ · M(Y_{τG}^{uλ∗}) ] = Jλ^{uλ∗}(y)
        ≥ Jλ^u(y) = E^y[ ∫_0^{τG} f^u(Yt^u)dt + g(Y_{τG}^u) + λ · M(Y_{τG}^u) ] .    (11.3.8)

In particular, if λ = λ0 and u ∈ K then

    E^y[M(Y_{τG}^{u_{λ0}∗})] = 0 = E^y[M(Y_{τG}^u)]
and hence by (11.3.8)

    J^{u_{λ0}∗}(y) ≥ J^u(y)    for all u ∈ K .

Since u_{λ0}∗ ∈ K the proof is complete.
For an application of this result, see Exercise 11.11.
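The mechanics of Theorem 11.3.1 — solve the λ-penalized problem, then tune λ until the constraint holds — can be seen in a toy one-period analogue (entirely my own illustration, not from the text): choose u to minimize θu² + λE[(x + u + σZ)²] with Z ~ N(0, 1), then bisect on λ until the terminal second moment hits a prescribed m².

```python
x, theta, sigma = 1.0, 1.0, 0.1
m2 = 0.25 + sigma**2        # target terminal second moment E[(X_1)^2]

def u_star(lam):
    # unconstrained minimizer of theta*u^2 + lam*((x + u)^2 + sigma^2)
    return -lam * x / (theta + lam)

def second_moment(lam):
    u = u_star(lam)
    return (x + u) ** 2 + sigma**2   # = E[(x + u + sigma*Z)^2]

# second_moment is decreasing in lam >= 0, so bisect for lam0
lo, hi = 0.0, 1e6
for _ in range(200):
    mid = 0.5 * (lo + hi)
    if second_moment(mid) > m2:
        lo = mid
    else:
        hi = mid
lam0 = 0.5 * (lo + hi)
```

With these numbers the multiplier converges to λ0 = 1, at which the constraint is met exactly; the same search-over-λ idea underlies Exercise 11.11.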
Exercises

11.1. Write down the HJB equation for the problem

    Ψ(s, x) = inf_u E^{s,x}[ ∫_0^∞ e^{−α(s+t)}(θ(Xt) + ut²)dt ] ,

where

    dXt = ut dt + dBt ;    Xt, ut, Bt ∈ R ,

α > 0 is a constant and θ: R → R is a given bounded, continuous function. Show that if Ψ satisfies the conditions of Theorem 11.2.1 and u∗ exists then

    u∗(t, x) = −½ e^{αt} ∂Ψ/∂x .

11.2. Consider the stochastic control problem

    Ψ0(s, x) = inf_u E^{s,x}[ ∫_s^∞ e^{−ρt}f0(ut, Xt)dt ] ,

where

    dXt = dXt^u = b(ut, Xt)dt + σ(ut, Xt)dBt ,    Xt ∈ Rn, ut ∈ Rk, Bt ∈ Rm ,

f0 is a given bounded continuous real function, ρ > 0 and the inf is taken over all time-homogeneous Markov controls u, i.e. controls u of the form u = u(Xt). Prove that

    Ψ0(s, x) = e^{−ρs}ξ(x) ,    where ξ(x) = Ψ0(0, x) .

(Hint: By definition of E^{s,x} we have

    E^{s,x}[ ∫_s^∞ e^{−ρt}f0(u(Xt), Xt)dt ] = E[ ∫_0^∞ e^{−ρ(s+t)}f0(u(X_{s+t}^{s,x}), X_{s+t}^{s,x})dt ] ,

where E denotes expectation w.r.t. P.)
11.3. Define

    dXt = rut Xt dt + αut Xt dBt ;    Xt, ut, Bt ∈ R ,

and

    Φ(s, x) = sup_u E^{s,x}[ ∫_0^∞ e^{−ρ(s+t)}f0(Xt)dt ] ,

where r, α, ρ are constants, ρ > 0 and f0 is a bounded continuous real function. Assume that Φ satisfies the conditions of Theorem 11.2.1 and that an optimal Markov control u∗ exists.
a) Show that

    sup_{v∈R} { e^{−ρt}f0(x) + ∂Φ/∂t + rvx ∂Φ/∂x + ½α²v²x² ∂²Φ/∂x² } = 0 .

Deduce that ∂²Φ/∂x² ≤ 0.
b) Assume that ∂²Φ/∂x² < 0. Prove that

    u∗(t, x) = − (r ∂Φ/∂x) / (α²x ∂²Φ/∂x²)

and that

    2α²( e^{−ρt}f0(x) + ∂Φ/∂t ) ∂²Φ/∂x² − r²(∂Φ/∂x)² = 0 .

c) Assume that ∂²Φ/∂x² = 0. Prove that ∂Φ/∂x = 0 and e^{−ρt}f0(x) + ∂Φ/∂t = 0.
d) Assume that ut∗ = u∗(Xt) and that b) holds. Prove that Φ(t, x) = e^{−ρt}ξ(x) and

    2α²(f0 − ρξ)ξ'' − r²(ξ')² = 0 .

(See Exercise 11.2.)
11.4. The assumptions in Theorem 11.2.1 often fail (see e.g. Exercise 11.10), so it is useful to have results in such cases also. For example, if we define Φa as in Theorem 11.2.3 then, without assuming that u∗ exists and without smoothness conditions on Φ, we have the Bellman principle (compare with (11.2.6)–(11.2.7))

    Φa(y) = sup_u E^y[ ∫_0^α f^u(Yr^u)dr + Φa(Yα^u) ]

for all y ∈ G and all stopping times α ≤ τG, the sup being taken over all Ft^{(m)}-adapted controls u. (See Krylov (1980, Th. 6, p. 150).) Deduce that if Φa ∈ C²(G) then

    f^v(y) + L^v Φa(y) ≤ 0    for all y ∈ G, v ∈ U .

11.5. Assume that f = 0 in (11.1.8) and that an optimal Markov control u∗ exists. Prove that the function Φ is superharmonic in G w.r.t. the process Yt^u, for any Markov control u. (Hint: See (11.2.6)–(11.2.7).)
σ1 > σ0
t are independent 1-dimensional Brownian motions. and Bt , B a) Let u(t, ω) denote the fraction of the fortune Zt (ω) which is placed in the riskier investment at time t. Show that (u)
= Zt (a1 u(t) + a0 (1 − u(t)))dt t ) . +Zt (σ1 u(t)dBt + σ0 (1 − u(t))dB
dZt = dZt
b) Assuming that u is a Markov control, u = u(t, Ztu ), find the generator Au of (t, Ztu ). c) Write down the HJB equation for the stochastic control problem (u) Φ(s, x) = sup E s,x (Zτˆ )γ u
264
11. Application to Stochastic Control
where τˆ = min(t1 − s, τ0 ), τ0 = inf{t > 0; Xt = 0} and t1 is a given future time (constant), γ ∈ (0, 1) is a constant. d) Find the optimal control u∗ for the problem in c). 11.7.* Consider the stochastic control problem
(system)
dXt = au(t)dt + u(t)dBt ;
X0 = x > 0
where Bt ∈ R, u(t) ∈ R and a ∈ R is a given constant, and Φ(s, x) = sup E s,x [(Xτˆ )γ ] ,
(performance)
u
where 0 < γ < 1 is constant and τˆ = inf{t > 0; Xt = 0} ∧ (T − s) , T being a given future time (constant). Show that this problem has the optimal control u∗ (t, x) =
ax 1−γ
with corresponding optimal performance 2 a (T − s)γ Φ(s, x) = xγ exp . 2(1 − γ) 11.8.
Use Dynkin’s formula to prove directly that a−b ∗ ,1 u (t, x) = min α2 (1 − γ) is the optimal control for the problem in Example 11.2.5, with utility function N (x) = xγ . (Hint: See the argument leading to (11.2.55).)
11.9. In Beneš (1974) the following stochastic control problem is considered:

    Ψ(s, x) = inf_u E^{s,x}[ ∫_0^∞ e^{−ρ(s+t)}Xt² dt ] ,

where

    dXt = dXt^{(u)} = aut dt + dBt ;    Xt, Bt ∈ R ,

and a, ρ are (known) constants, ρ > 0. Here the controls u are restricted to take values in U = [−1, 1].
a) Show that the HJB equation for this problem is

    inf_{v∈[−1,1]} { e^{−ρs}x² + ∂Ψ/∂s + av ∂Ψ/∂x + ½ ∂²Ψ/∂x² } = 0 .

b) If Ψ ∈ C² and u∗ exists, show that

    u∗(x) = −sign(ax) ,

where sign z = 1 if z > 0, −1 if z ≤ 0.
(Hint: Explain why x > 0 ⇒ ∂Ψ/∂x > 0 and x < 0 ⇒ ∂Ψ/∂x < 0.)

11.10. Let

    f0(x) = x² for 0 ≤ x ≤ 1 ,    √x for x > 1 ,

and put

    J^u(s, x) = E^{s,x}[ ∫_0^{τ0} e^{−ρ(s+t)}f0(Xt^u)dt ] ,    Φ(s, x) = sup_u J^u(s, x) ,

where

    dXt^u = ut dBt ;    t ≥ 0 ,

with control values ut ∈ R, Bt ∈ R and τ0 = inf{t > 0; Xt^u ≤ 0}.
a) Define

    φ(s, x) = (1/ρ) e^{−ρs} f̂(x)    for x ≥ 0, s ∈ R ,

where

    f̂(x) = x for 0 ≤ x ≤ 1 ,    √x for x > 1 .

Prove that J^u(s, x) ≤ φ(s, x) for all s, x and all (finite) Markov controls u.
(Hint: Put φ1(s, x) = (1/ρ)e^{−ρs}x for all s, x and φ2(s, x) = (1/ρ)e^{−ρs}√x for all s, x. Then J^u(s, x) ≤ φi(s, x) for i = 1, 2 by Theorem 11.2.2.)
b) Show that Φ(s, x) = φ(s, x).
(Hint: Consider J^{uk}(s, x), where

    uk(x) = k for 0 ≤ x < 1 ,    0 for x ≥ 1 ,

and let k → ∞.) Thus u∗ does not exist and Φ is not a C² function. Hence both conditions for the HJB (I) equation fail in this case.

11.11.* Consider a 1-dimensional version of the stochastic linear regulator problem of Example 11.2.4:

    Ψ(s, x) = inf_{u∈K} E^{s,x}[ ∫_0^{t1−s} ((Xr^u)² + θur²)dr ] ,    (11.3.9)

where

    dXt^u = ut dt + σdBt    for t ≥ 0 ,    X0 = x ,

ut, Bt ∈ R, σ, θ and t1 constants, θ > 0, t1 > 0. The infimum is taken over the space K of all Markov controls u satisfying

    E^{s,x}[(X_{t1−s}^u)²] = m² ,    where m is a constant .    (11.3.10)

Solve this problem by using Theorem 11.3.1.
(Hint: Solve for each λ ∈ R the unconstrained problem

    Ψλ(s, x) = inf_u E^{s,x}[ ∫_0^{t1−s} ((Xr^u)² + θur²)dr + λ(X_{t1−s}^u)² ]

with optimal control uλ∗. Then try to find λ0 such that

    E^{s,x}[(X_{t1−s}^{u_{λ0}∗})²] = m² .)
11.12.* Solve the stochastic control problem
$$\Psi(s,x) = \inf_u J^u(s,x) = J^{u^*}(s,x)$$
where
$$J^u(s,x) = E^{s,x}\Big[\int_0^\infty e^{-\rho(s+t)}\big(X_t^2 + \theta u_t^2\big)\,dt\Big]$$
and $dX_t = u_t\,dt + \sigma\,dB_t$, with $u_t, B_t\in\mathbb{R}$, and $\sigma\in\mathbb{R}$, $\rho>0$, $\theta>0$ are constants.
(Hint: Try $\psi(s,x) = e^{-\rho s}(ax^2+b)$ for suitable constants $a, b$ and apply Theorem 11.2.2.)
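Carrying out the hint (a sketch of the resulting algebra, worked out by us — not the book's solution; the function name is ours): substituting $\psi(s,x) = e^{-\rho s}(ax^2+b)$ into the HJB equation and minimizing over $v$ gives the candidate feedback $u^*(x) = -(a/\theta)x$ and two scalar equations for $a$ and $b$.

```python
import math

def riccati_constants(rho, theta, sigma):
    """Candidate constants in the ansatz psi(s,x) = e^{-rho s}(a x^2 + b).

    Plugging the ansatz into the HJB equation
        inf_v { e^{-rho s}(x^2 + theta v^2) + psi_s + v psi_x + (sigma^2/2) psi_xx } = 0
    and minimizing over v gives v* = -(a/theta) x and the scalar equations
        1 - a^2/theta - rho a = 0   (coefficient of x^2),
        sigma^2 a - rho b = 0       (constant term).
    """
    a = (-rho * theta + math.sqrt(rho**2 * theta**2 + 4 * theta)) / 2
    b = sigma**2 * a / rho
    return a, b

# sanity check that the x^2-coefficient equation is satisfied
a, b = riccati_constants(rho=0.5, theta=2.0, sigma=1.0)
assert abs(1 - a**2 / 2.0 - 0.5 * a) < 1e-12
```

The positive root is taken so that $\psi\ge 0$, as a value function of a nonnegative cost should be.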
11.13.* Consider the stochastic control problem
$$\Phi(s,x) = \sup_u E^{s,x}\Big[\int_0^\tau e^{-\rho(s+t)} u_t\,dt\Big]$$
where the (1-dimensional) system $X_t$ is given by
$$dX_t = dX_t^u = (1-u_t)\,dt + \sigma\,dB_t,$$
and $\rho>0$, $\sigma\ne 0$ are constants. The control $u_t = u_t(\omega)$ can assume any value in $U=[0,1]$ and
$$\tau = \inf\{t>0;\ X_t^u\le 0\} \qquad\text{(the time of bankruptcy).}$$
Show that if $\rho\ge\frac{2}{\sigma^2}$ then the optimal control is
$$u_t^* = 1 \qquad\text{for all } t$$
and the corresponding value function is
$$\Phi(s,x) = e^{-\rho s}\,\frac1\rho\Big[1 - \exp\Big(-\frac{\sqrt{2\rho}}{\sigma}\,x\Big)\Big]; \qquad x\ge 0.$$
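A quick numerical sanity check of the verification step (our own sketch, with hypothetical helper names): with $u\equiv 1$ the state is driftless and the running reward is $1$, so writing $\Phi(s,x) = e^{-\rho s}\varphi(x)$, the candidate $\varphi(x) = \rho^{-1}\big(1-e^{-\sqrt{2\rho}\,x/\sigma}\big)$ should satisfy $1 - \rho\varphi + \tfrac12\sigma^2\varphi'' = 0$.

```python
import math

def phi(x, rho, sigma):
    """Candidate (undiscounted) value: phi(x) = (1/rho)(1 - exp(-sqrt(2 rho) x / sigma))."""
    return (1.0 - math.exp(-math.sqrt(2 * rho) * x / sigma)) / rho

def hjb_residual(x, rho, sigma, h=1e-4):
    """Residual of 1 - rho*phi + (sigma^2/2) phi'' at x, for the control u = 1
    (zero drift, running reward 1); phi'' is approximated by central differences."""
    d2 = (phi(x + h, rho, sigma) - 2 * phi(x, rho, sigma) + phi(x - h, rho, sigma)) / h**2
    return 1.0 - rho * phi(x, rho, sigma) + 0.5 * sigma**2 * d2

# parameters satisfying rho >= 2/sigma^2 (here 2/sigma^2 is about 1.39)
rho, sigma = 2.0, 1.2
assert abs(hjb_residual(0.7, rho, sigma)) < 1e-5
```

The condition $\rho\ge 2/\sigma^2$ enters when checking that $u=1$ actually attains the supremum in the HJB equation, i.e. that $\varphi'(x)\le 1$ for all $x\ge 0$.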
11.14. The following problem is an infinite horizon version of Example 11.2.5, with consumption added. Suppose we have a market with two investment possibilities:
(i) a bond/bank account, where the price $X_0(t)$ at time $t$ is given by
$$dX_0(t) = \rho X_0(t)\,dt; \qquad X_0(0) = 1,\ \rho\ge 0 \text{ constant;}$$
(ii) a stock, where the price $X_1(t)$ at time $t$ is given by
$$dX_1(t) = \mu X_1(t)\,dt + \sigma X_1(t)\,dB(t); \qquad X_1(0) = x>0,$$
where $\mu, \sigma$ are constants, $\sigma\ne 0$.
Let $Y_0(t), Y_1(t)$ denote the amount of money that an agent at time $t$ has invested in the bonds and in the stocks, respectively. Assume that the agent at any time can choose her consumption rate $c(t) = c(t,\omega)\ge 0$ and the ratio
$$u(t) = u(t,\omega) = \frac{Y_1(t)}{Y_0(t)+Y_1(t)}$$
of her total wealth invested in the stocks. We assume that $c(t)$ and $u(t)$ are $\mathcal F_t$-adapted processes and that the dynamics of the total wealth $Z(t) = Y_0(t)+Y_1(t)$ is given by
$$dZ(t) = Z(t)\big[\{\rho(1-u(t)) + \mu u(t)\}\,dt + \sigma u(t)\,dB(t)\big] - c(t)\,dt; \qquad Z(0) = z>0.$$
Consider the problem to find $\Phi, c^*$ and $u^*$ such that
$$\Phi(s,z) = \sup_{c,u} J^{c,u}(s,z) = J^{c^*,u^*}(s,z),$$
where
$$J^{c,u}(s,z) = E^{s,z}\Big[\int_0^{\tau_0} e^{-\delta(s+t)}\,\frac{c^\gamma(t)}{\gamma}\,dt\Big],$$
where $\delta>0$ and $\gamma\in(0,1)$ are constants and
$$\tau_0 = \inf\{t>0;\ Z(t)\le 0\}\le\infty \qquad\text{(time of bankruptcy).}$$
Use Theorem 11.2.2 to prove that, under some conditions,
$$\Phi(s,z) = K e^{-\delta s} z^\gamma$$
for a certain value of the constant $K$. Find this value of $K$ and hence find the optimal consumption rate $c^*(t)$ and the optimal portfolio $u^*(t)$.
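The constants that come out of the HJB calculation can be organized as follows (a sketch of our own derivation, to be checked against the exercise; the function and variable names are ours, and the formulas assume the well-posedness condition $\nu>0$ below):

```python
import math

def merton_constants(rho, mu, sigma, delta, gamma):
    """Candidate solution of the HJB equation for Phi(s,z) = K e^{-delta s} z^gamma.

    Maximizing over u gives the constant fraction
        u* = (mu - rho) / (sigma^2 (1 - gamma)),
    and maximizing over c gives the linear consumption rule c* = nu * z with
        nu = (delta - gamma*rho - gamma*(mu-rho)^2 / (2 sigma^2 (1-gamma))) / (1-gamma),
    which requires nu > 0; the value constant is then K = nu^(gamma-1) / gamma.
    """
    u_star = (mu - rho) / (sigma**2 * (1 - gamma))
    nu = (delta - gamma * rho
          - gamma * (mu - rho)**2 / (2 * sigma**2 * (1 - gamma))) / (1 - gamma)
    if nu <= 0:
        raise ValueError("no finite value function for these parameters")
    K = nu**(gamma - 1) / gamma
    return u_star, nu, K

u_star, nu, K = merton_constants(rho=0.03, mu=0.08, sigma=0.2, delta=0.1, gamma=0.5)
```

Note that $u^*$ is constant in time and wealth, and $c^*(t)$ is proportional to current wealth — the qualitative structure one expects from the scaling $\Phi(s,z)\propto z^\gamma$.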
12 Application to Mathematical Finance
12.1 Market, portfolio and arbitrage

In this chapter we describe how the concepts, methods and results of the previous chapters can be applied to give a rigorous mathematical model of finance. We concentrate on the most fundamental issues and on the topics most closely related to the theory in this book. We emphasize that this chapter is only intended as a brief introduction to this exciting subject, which has developed rapidly in recent years and shows no signs of slowing down. For a more comprehensive treatment see for example Benth (2004), Bingham and Kiesel (1998), Elliott and Kopp (2005), Duffie (1996), Karatzas (1997), Karatzas and Shreve (1998), Shreve (2004), Lamberton and Lapeyre (1996), Musiela and Rutkowski (1997), Kallianpur and Karandikar (2000), Merton (1990), Shiryaev (1999) and the references therein.

First we give the mathematical definitions of some fundamental finance concepts. We point out that other mathematical models are also possible, and in fact actively investigated. Other models include more general (possibly discontinuous) semimartingales (see e.g. Barndorff-Nielsen (1998), Eberlein and Keller (1995), Schoutens (2003), Cont and Tankov (2004), Øksendal and Sulem (2007)) and even non-semimartingale models. See e.g. Cutland, Kopp and Willinger (1995), Lin (1995), Rogers (1997), Hu and Øksendal (2003), Elliott and Van der Hoek (2003), Biagini and Øksendal (2005), Øksendal (2005), Di Nunno et al. (2006), Biagini et al. (2007).

Definition 12.1.1 a) A market is an $\mathcal F_t^{(m)}$-adapted $(n+1)$-dimensional Itô process $X(t) = (X_0(t), X_1(t), \ldots, X_n(t))$; $0\le t\le T$, which we will assume has the form
$$dX_0(t) = \rho(t,\omega)X_0(t)\,dt; \qquad X_0(0) = 1 \tag{12.1.1}$$
and
$$dX_i(t) = \mu_i(t,\omega)\,dt + \sum_{j=1}^m \sigma_{ij}(t,\omega)\,dB_j(t) = \mu_i(t,\omega)\,dt + \sigma_i(t,\omega)\,dB(t); \qquad X_i(0) = x_i, \tag{12.1.2}$$
where $\sigma_i$ is row number $i$ of the $n\times m$ matrix $[\sigma_{ij}]$; $1\le i\le n$.
b) The market $\{X(t)\}_{t\in[0,T]}$ is called normalized if $X_0(t)\equiv 1$.
c) A portfolio in the market $\{X(t)\}_{t\in[0,T]}$ is an $(n+1)$-dimensional $(t,\omega)$-measurable and $\mathcal F_t^{(m)}$-adapted stochastic process
$$\theta(t,\omega) = (\theta_0(t,\omega), \theta_1(t,\omega), \ldots, \theta_n(t,\omega)); \qquad 0\le t\le T. \tag{12.1.3}$$
d) The value at time $t$ of a portfolio $\theta(t)$ is defined by
$$V(t,\omega) = V^\theta(t,\omega) = \theta(t)\cdot X(t) = \sum_{i=0}^n \theta_i(t)X_i(t), \tag{12.1.4}$$
where $\cdot$ denotes the inner product in $\mathbb R^{n+1}$.
e) The portfolio $\theta(t)$ is called self-financing if
$$\int_0^T\Big\{\Big|\theta_0(s)\rho(s)X_0(s) + \sum_{i=1}^n \theta_i(s)\mu_i(s)\Big| + \sum_{j=1}^m\Big[\sum_{i=1}^n \theta_i(s)\sigma_{ij}(s)\Big]^2\Big\}\,ds < \infty \quad\text{a.s.} \tag{12.1.5}$$
and
$$dV(t) = \theta(t)\cdot dX(t), \tag{12.1.6}$$
i.e.
$$V(t) = V(0) + \int_0^t \theta(s)\cdot dX(s) \qquad\text{for } t\in[0,T]. \tag{12.1.7}$$
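Part e) can be sanity-checked in discrete time (cf. the discussion of (12.1.12) in the comments below): if each rebalancing is financed entirely out of the current wealth — no money brought in or taken out — then the wealth increment over each period is exactly $\theta\cdot\Delta X$, so (12.1.7) holds as an identity. A minimal sketch (the price model and rebalancing rule are an arbitrary example of ours):

```python
import random

rng = random.Random(1)

X = [1.0, 100.0]                 # prices (X0, X1)
theta = [50.0, 0.5]              # initial holdings
V = theta[0] * X[0] + theta[1] * X[1]
rhs = V                          # V(0) + running sum of theta . dX

for _ in range(250):
    # prices move
    newX = [X[0] * 1.0002, X[1] * (1.0 + rng.gauss(0.0, 0.01))]
    rhs += theta[0] * (newX[0] - X[0]) + theta[1] * (newX[1] - X[1])
    X = newX
    V = theta[0] * X[0] + theta[1] * X[1]
    # rebalance: put a random fraction of wealth into asset 1, the rest into asset 0,
    # financed entirely by the current wealth V (so theta . X == V still)
    frac = rng.random()
    theta = [(1 - frac) * V / X[0], frac * V / X[1]]

assert abs(V - rhs) < 1e-6       # discrete analogue of (12.1.7)
```

If instead money were injected at some rebalancing date, `V` would jump while `rhs` would not, and the identity would fail — which is exactly what condition (12.1.6) rules out.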
Comments to Definition 12.1.1.
a) We think of $X_i(t) = X_i(t,\omega)$ as the price of security/asset number $i$ at time $t$. The assets number $1,\ldots,n$ are called risky because of the presence of their diffusion terms. They can for example represent stock investments. Asset number $0$ is called risk free because of the absence of a diffusion term (although $\rho(t,\omega)$ may depend on $\omega$). This asset can for example represent a bank investment. For simplicity we will assume that $\rho(t,\omega)$ is bounded.
b) Note that we can always make the market normalized by defining
$$\overline X_i(t) = X_0(t)^{-1}X_i(t); \qquad 1\le i\le n. \tag{12.1.8}$$
The market
$$\overline X(t) = (1, \overline X_1(t), \ldots, \overline X_n(t))$$
is called the normalization of $X(t)$. Thus normalization corresponds to regarding the price $X_0(t)$ of the safe investment as the unit of price (the numeraire) and computing the other prices in terms of this unit. Since
$$X_0(t) = \exp\Big(\int_0^t \rho(s,\omega)\,ds\Big)$$
we have
$$\xi(t) := X_0^{-1}(t) = \exp\Big(-\int_0^t \rho(s,\omega)\,ds\Big) > 0 \qquad\text{for all } t\in[0,T] \tag{12.1.9}$$
and
$$d\overline X_i(t) = d(\xi(t)X_i(t)) = \xi(t)\big[(\mu_i - \rho X_i)\,dt + \sigma_i\,dB(t)\big]; \qquad 1\le i\le n, \tag{12.1.10}$$
or
$$d\overline X(t) = \xi(t)\big[dX(t) - \rho(t)X(t)\,dt\big]. \tag{12.1.11}$$
c) The components $\theta_0(t,\omega),\ldots,\theta_n(t,\omega)$ represent the number of units of securities number $0,\ldots,n$, respectively, which an investor holds at time $t$.
d) This is simply the total value of all investments held at time $t$.
e) Note that condition (12.1.5) is required to make (12.1.7) well-defined. See Definition 3.3.2.
This part e) of Definition 12.1.1 represents a subtle point in the mathematical model. According to Itô's formula, equation (12.1.4) would lead to
$$dV(t) = \theta(t)\cdot dX(t) + X(t)\cdot d\theta(t) + d\theta(t)\cdot dX(t)$$
if $\theta(t)$ were also an Itô process. However, the requirement (12.1.6) stems from the corresponding discrete time model: If investments $\theta(t_k)$ are made at discrete times $t = t_k$, then the increase in the wealth $\Delta V(t_k) = V(t_{k+1}) - V(t_k)$ should be given by
$$\Delta V(t_k) = \theta(t_k)\cdot \Delta X(t_k), \tag{12.1.12}$$
where $\Delta X(t_k) = X(t_{k+1}) - X(t_k)$ is the change in prices, provided that no money is brought in or taken out of the system, i.e. provided the portfolio is self-financing. If we consider our continuous time model as a limit of the discrete time case as $\Delta t_k = t_{k+1} - t_k$ goes to $0$, then (12.1.6) (with the Itô interpretation of the integral) follows from (12.1.12).
f) Note that if $\theta$ is self-financing for $X(t)$ and
$$\overline V(t) = \theta(t)\cdot \overline X(t) = \xi(t)V^\theta(t) \tag{12.1.13}$$
is the value process of the normalized market, then by Itô's formula and (12.1.11) we have
$$d\overline V(t) = \xi(t)\,dV^\theta(t) + V^\theta(t)\,d\xi(t) = \xi(t)\theta(t)\,dX(t) - \rho(t)\xi(t)V^\theta(t)\,dt = \xi(t)\theta(t)\big[dX(t) - \rho(t)X(t)\,dt\big] = \theta(t)\,d\overline X(t). \tag{12.1.14}$$
Hence $\theta$ is also self-financing for the normalized market.

Remark. Note that by combining (12.1.4) and (12.1.6) we get
$$\theta_0(t)X_0(t) + \sum_{i=1}^n \theta_i(t)X_i(t) = V^\theta(0) + \int_0^t \theta_0(s)\,dX_0(s) + \sum_{i=1}^n\int_0^t \theta_i(s)\,dX_i(s). \tag{12.1.15}$$
Assume that $\theta_0(t)$ is an Itô process. Then, if we put
$$Y_0(t) = \theta_0(t)X_0(t),$$
we get
$$dY_0(t) = \rho(t)Y_0(t)\,dt + dA(t),$$
where
$$A(t) = \sum_{i=1}^n\Big[\int_0^t \theta_i(s)\,dX_i(s) - \theta_i(t)X_i(t)\Big].$$
This equation has the solution
$$\xi(t)Y_0(t) = \theta_0(0) + \int_0^t \xi(s)\,dA(s),$$
or
$$\theta_0(t) = \theta_0(0) + \int_0^t \xi(s)\,dA(s).$$
Using integration by parts we may rewrite this as
$$\theta_0(t) = \theta_0(0) + \xi(t)A(t) - A(0) - \int_0^t A(s)\,d\xi(s) \tag{12.1.16}$$
or
$$\theta_0(t) = V^\theta(0) + \xi(t)A(t) + \int_0^t \rho(s)A(s)\xi(s)\,ds. \tag{12.1.17}$$
This argument goes both ways, in the sense that if we define $\theta_0(t)$ by (12.1.17) (which is an Itô process), then (12.1.15) holds. Therefore, if $\theta_1(t),\ldots,\theta_n(t)$ are chosen, we can always make the portfolio $\theta(t) = (\theta_0(t),\theta_1(t),\ldots,\theta_n(t))$ self-financing by choosing $\theta_0(t)$ according to (12.1.17). Moreover, we are free to choose the initial value $V^\theta(0)$ of the portfolio.

We now make the following definition.

Definition 12.1.2 A portfolio $\theta(t)$ which satisfies (12.1.5) and which is self-financing is called admissible if the corresponding value process $V^\theta(t)$ is $(t,\omega)$ a.s. lower bounded, i.e. there exists $K = K(\theta) < \infty$ such that
$$V^\theta(t,\omega) \ge -K \qquad\text{for a.a. } (t,\omega)\in[0,T]\times\Omega. \tag{12.1.18}$$
This is the analogue of a tame portfolio in the context of Karatzas (1996). The restriction (12.1.18) reflects a natural condition in real life finance: there must be a limit to how much debt the creditors can tolerate. See Example 12.1.4.

Definition 12.1.3 An admissible portfolio $\theta(t)$ is called an arbitrage (in the market $\{X_t\}_{t\in[0,T]}$) if the corresponding value process $V^\theta(t)$ satisfies $V^\theta(0) = 0$ and
$$V^\theta(T)\ge 0 \quad\text{a.s.} \qquad\text{and}\qquad P\big[V^\theta(T)>0\big]>0.$$
In other words, $\theta(t)$ is an arbitrage if it gives a nonnegative increase in value from time $t=0$ to time $t=T$ a.s., and a strictly positive increase with positive probability. So $\theta(t)$ generates a profit without any risk of losing money.
Intuitively, the existence of an arbitrage is a sign of lack of equilibrium in the market: no real market equilibrium can exist in the long run if arbitrage opportunities exist. Therefore it is important to be able to determine whether a given market allows an arbitrage or not. Not surprisingly, this question turns out to be closely related to which conditions we impose on the portfolios one is allowed to use. We have defined our admissible portfolios in Definition 12.1.2 above, where condition (12.1.18) was motivated from a modelling point of view. One could also obtain a mathematically sensible theory with other conditions instead, for example with the $L^2$-condition
$$E_Q\Big[\int_0^T \sum_{i=1}^n |\theta_i(t)\sigma_i(t)|^2\,dt\Big] < \infty, \tag{12.1.19}$$
where $Q$ is the probability measure defined in Lemma 12.2.3.
In any case, some additional conditions are required on the self-financing portfolios: If we only require the portfolio to be self-financing (and to satisfy (12.1.5)) we can generate virtually any final value $V(T)$, as the next example illustrates:

Example 12.1.4 Consider the following market:
$$dX_0(t) = 0, \qquad dX_1(t) = dB(t); \qquad 0\le t\le T = 1.$$
Let
$$Y(t) = \int_0^t \frac{dB(s)}{\sqrt{1-s}} \qquad\text{for } 0\le t<1.$$
By Corollary 8.5.5 there exists a Brownian motion $\widetilde B(t)$ such that
$$Y(t) = \widetilde B(\beta_t), \qquad\text{where}\qquad \beta_t = \int_0^t \frac{ds}{1-s} = \ln\frac{1}{1-t} \qquad\text{for } 0\le t<1.$$
Let $M<\infty$ be a given constant and define
$$\tau := \tau_M := \inf\{t>0;\ \widetilde B(t) = M\} \qquad\text{and}\qquad \alpha := \alpha_M := \inf\{t>0;\ Y(t) = M\}.$$
Then
$$\tau<\infty \text{ a.s. (Exercise 7.4 a))} \qquad\text{and}\qquad \tau = \ln\frac{1}{1-\alpha}, \quad\text{so } \alpha<1 \text{ a.s.}$$
Define
$$\theta_1(t) = \begin{cases} \dfrac{1}{\sqrt{1-t}} & \text{for } 0\le t<\alpha \\[1mm] 0 & \text{for } \alpha\le t\le 1 \end{cases}$$
and choose $\theta_0(t)$ according to (12.1.17) and such that $V(0)=0$. Then $\theta(t) = (\theta_0(t),\theta_1(t))$ is self-financing and the corresponding value process is given by
$$V(t) = \int_0^{t\wedge\alpha}\frac{dB(s)}{\sqrt{1-s}} = Y(t\wedge\alpha) \qquad\text{for } 0\le t\le 1.$$
In particular,
$$V(1) = Y(\alpha) = M \quad\text{a.s.,}$$
so this portfolio generates the terminal wealth $M$ a.s. although the initial wealth is $0$.
In this case condition (12.1.5) reduces to
$$\int_0^1 \theta_1^2(s)\,ds < \infty \quad\text{a.s.}$$
Now
$$\int_0^1 \theta_1^2(s)\,ds = \int_0^\alpha \frac{ds}{1-s} = \ln\frac{1}{1-\alpha} = \tau < \infty \quad\text{a.s.,}$$
so (12.1.5) holds. But $\theta(t)$ is not admissible, because $V(t) = Y(t\wedge\alpha) = \widetilde B\big(\ln\frac{1}{1-t\wedge\alpha}\big)$ is not $(t,\omega)$-a.s. lower bounded for $(t,\omega)\in[0,1]\times\Omega$. Note that $\theta(t)$ does not satisfy (12.1.19) either, because in this case $Q=P$ and
$$E\Big[\int_0^1 \theta_1^2(s)\,ds\Big] = E[\tau] = \infty$$
(Exercise 7.4 b)).

This example illustrates that with portfolios only required to be self-financing and to satisfy (12.1.5), one can generate virtually any terminal value $V(T,\omega)$ from $V(0)=0$, even when the risky price process $X_1(t)$ is Brownian motion. This clearly contradicts the real life situation in finance, so a realistic mathematical model must put stronger restrictions than (12.1.5) on the portfolios allowed. One such natural restriction is (12.1.18), as we have adopted.
To emphasize the phenomenon illustrated by this example, we state the following striking result, which is due to Dudley (1977):
Theorem 12.1.5 Let $F$ be an $\mathcal F_T^{(m)}$-measurable random variable and let $B(t)$ be $m$-dimensional Brownian motion. Then there exists $\phi\in\mathcal W^m$ such that
$$F(\omega) = \int_0^T \phi(t,\omega)\,dB(t). \tag{12.1.20}$$
Note that $\phi$ is not unique. See Exercise 3.4.22 in Karatzas and Shreve (1991). See also Exercise 12.4.
This implies that for any constant $z$ there exists $\phi\in\mathcal W^m$ such that
$$F(\omega) = z + \int_0^T \phi(t,\omega)\,dB(t).$$
Thus, if we let $m=n$ and interpret $B_1(t) = X_1(t), \ldots, B_n(t) = X_n(t)$ as prices, and put $X_0(t)\equiv 1$, this means that we can, with any initial fortune
$z$, generate any $\mathcal F_T^{(m)}$-measurable final value $F = V(T)$, as long as we are allowed to choose the portfolio $\phi$ freely from $\mathcal W^m$. This again underlines the need for some extra restriction on the family of portfolios allowed, like condition (12.1.18).
How can we decide if a given market $\{X(t)\}_{t\in[0,T]}$ allows an arbitrage or not? The following simple result is useful:
Lemma 12.1.6 Suppose there exists a measure $Q$ on $\mathcal F_T^{(m)}$ such that $P\sim Q$ and such that the normalized price process $\{\overline X(t)\}_{t\in[0,T]}$ is a local martingale w.r.t. $Q$. Then the market $\{X(t)\}_{t\in[0,T]}$ has no arbitrage.

Proof. Suppose $\theta(t)$ is an arbitrage for $\{\overline X(t)\}_{t\in[0,T]}$. Let $\overline V^\theta(t)$ be the corresponding value process for the normalized market, with $\overline V^\theta(0)=0$. Then $\overline V^\theta(t)$ is a lower bounded local martingale w.r.t. $Q$, by (12.1.14). Therefore $\overline V^\theta(t)$ is a supermartingale w.r.t. $Q$, by Exercise 7.12. Hence
$$E_Q\big[\overline V^\theta(T)\big] \le \overline V^\theta(0) = 0. \tag{12.1.21}$$
But since $\overline V^\theta(T,\omega)\ge 0$ a.s. $P$ we have $\overline V^\theta(T,\omega)\ge 0$ a.s. $Q$ (because $Q\ll P$), and since $P[\overline V^\theta(T)>0]>0$ we have $Q[\overline V^\theta(T)>0]>0$ (because $P\ll Q$). This implies that
$$E_Q\big[\overline V^\theta(T)\big] > 0,$$
which contradicts (12.1.21). Hence arbitrages do not exist for the normalized price process $\{\overline X(t)\}$. It follows that $\{X(t)\}$ has no arbitrage (Exercise 12.1). $\square$

Definition 12.1.7 A measure $Q\sim P$ such that the normalized process $\{\overline X(t)\}_{t\in[0,T]}$ is a (local) martingale w.r.t. $Q$ is called an equivalent (local) martingale measure.

Thus Lemma 12.1.6 states that if there exists an equivalent local martingale measure, then the market has no arbitrage. In fact, the market then also satisfies the stronger condition "no free lunch with vanishing risk" (NFLVR). Conversely, if the market satisfies the NFLVR condition, then there exists an equivalent martingale measure. See Delbaen and Schachermayer (1994), (1995), (1997), Levental and Skorohod (1995) and the references therein. Here we will settle for a weaker result, which is nevertheless good enough for many applications:

Theorem 12.1.8 a) Suppose there exists a process $u(t,\omega)\in\mathcal V^m(0,T)$ such that, with $\widetilde X(t,\omega) = (X_1(t,\omega),\ldots,X_n(t,\omega))$,
$$\sigma(t,\omega)u(t,\omega) = \mu(t,\omega) - \rho(t,\omega)\widetilde X(t,\omega) \qquad\text{for a.a. } (t,\omega) \tag{12.1.22}$$
and such that
$$E\Big[\exp\Big(\tfrac12\int_0^T u^2(t,\omega)\,dt\Big)\Big] < \infty.$$
[...]

If $X_i(T,\omega) > K$ then the owner of the option will obtain the payoff $X_i(T,\omega) - K$ at time $T$, while if $X_i(T,\omega)\le K$ then the owner will not exercise his option and the payoff is $0$.
b) Similarly, the European put option gives the owner the right (but not the obligation) to sell one unit of security number $i$ at a specified price $K$ at time $T$. This option gives the owner the payoff
$$F(\omega) = (K - X_i(T,\omega))^+.$$

Theorem 12.3.2 a) Suppose (12.2.12) and (12.2.13) hold and let $Q$ be as in (12.2.2). Let $F$ be a (European) $T$-claim. Then
$$\operatorname{ess\,inf} F(\omega) \le p(F) \le E_Q[\xi(T)F] \le q(F) \le \infty. \tag{12.3.3}$$
b) Suppose, in addition to the conditions in a), that the market $\{X(t)\}$ is complete. Then the price of the (European) $T$-claim $F$ is
$$p(F) = E_Q[\xi(T)F] = q(F). \tag{12.3.4}$$
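The pricing rule (12.3.4) lends itself directly to Monte Carlo computation. A sketch (ours, not the book's; we assume a single risky asset with constant coefficients under $Q$, i.e. $dX_1 = \rho X_1\,dt + \beta X_1\,d\widetilde B$ — the Black–Scholes setting treated later in this section — and the function and parameter names are illustrative):

```python
import math
import random

def mc_price_call(x1, K, rho, beta, T, n_paths=200_000, seed=0):
    """Monte Carlo estimate of p(F) = E_Q[xi(T) F] for F = (X1(T) - K)^+,
    assuming dX1 = rho X1 dt + beta X1 dBtilde under Q (constant coefficients),
    so that xi(T) = e^{-rho T} is deterministic."""
    rng = random.Random(seed)
    disc = math.exp(-rho * T)
    total = 0.0
    for _ in range(n_paths):
        z = rng.gauss(0.0, 1.0)
        xT = x1 * math.exp((rho - 0.5 * beta**2) * T + beta * math.sqrt(T) * z)
        total += max(xT - K, 0.0)
    return disc * total / n_paths

p = mc_price_call(100.0, 100.0, 0.05, 0.2, 1.0)
```

For these parameters the estimate should land near the classical Black–Scholes value (about 10.45), up to Monte Carlo error.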
12.3 Option Pricing
Proof. a) Suppose $y\in\mathbb R$ and there exists an admissible portfolio $\varphi$ such that
$$V^\varphi_{-y}(T,\omega) = -y + \int_0^T \varphi(s)\,dX(s) \ge -F(\omega) \quad\text{a.s.,}$$
i.e., using (12.2.7) and Lemma 12.2.2,
$$-y + \int_0^T \sum_{i=1}^n \varphi_i(s)\xi(s)\sigma_i(s)\,d\widetilde B(s) \ge -\xi(T)F(\omega) \quad\text{a.s.,} \tag{12.3.5}$$
where $\widetilde B$ is defined in (12.2.3).
Since $\int_0^t \sum_{i=1}^n \varphi_i(s)\xi(s)\sigma_i(s)\,d\widetilde B(s)$ is a lower bounded local $Q$-martingale, it is a supermartingale, by Exercise 7.12. Hence
$$E_Q\Big[\int_0^t \sum_{i=1}^n \varphi_i(s)\xi(s)\sigma_i(s)\,d\widetilde B(s)\Big] \le 0$$
for all $t\in[0,T]$. Therefore, taking the expectation of (12.3.5) with respect to $Q$ we get
$$y \le E_Q[\xi(T)F].$$
Hence
$$p(F) \le E_Q[\xi(T)F],$$
provided such a portfolio $\varphi$ exists for some $y\in\mathbb R$. This proves the second inequality in (12.3.3). Clearly, if $y < F(\omega)$ for a.a. $\omega$, we can choose $\varphi = 0$. Hence the first inequality in (12.3.3) holds.
Similarly, if there exist $z\in\mathbb R$ and an admissible portfolio $\psi$ such that
$$z + \int_0^T \psi(s)\,dX(s) \ge F(\omega) \quad\text{a.s.,}$$
then, as in (12.3.5),
$$z + \int_0^T \sum_{i=1}^n \psi_i(s)\xi(s)\sigma_i(s)\,d\widetilde B(s) \ge \xi(T)F(\omega) \quad\text{a.s.}$$
Taking $Q$-expectations we get
$$z \ge E_Q[\xi(T)F],$$
provided such $z$ and $\psi$ exist. If no such $z, \psi$ exist, then $q(F) = \infty > E_Q[\xi(T)F]$.
b) Next, assume in addition that the market is complete. Then by completeness we can find (unique) $y\in\mathbb R$ and $\theta$ such that
$$-y + \int_0^T \theta(s)\,dX(s) = -F(\omega) \quad\text{a.s.,}$$
i.e. (by (12.2.7) and Lemma 12.2.2)
$$-y + \int_0^T \sum_{i=1}^n \theta_i(s)\xi(s)\sigma_i(s)\,d\widetilde B(s) = -\xi(T)F(\omega) \quad\text{a.s.,}$$
which gives, since $\int_0^t \sum_{i=1}^n \theta_i(s)\xi(s)\sigma_i(s)\,d\widetilde B(s)$ is a $Q$-martingale (Definition 12.2.4 c)),
$$y = E_Q[\xi(T)F].$$
Hence $p(F)\ge E_Q[\xi(T)F]$. Combined with a) this gives
$$p(F) = E_Q[\xi(T)F].$$
A similar argument gives that $q(F) = E_Q[\xi(T)F]$. $\square$

How to Hedge an Attainable Claim

We have seen that if $V_z^\theta(t)$ is the value process of an admissible portfolio $\theta(t)$ for the market $\{X(t)\}$, then $\overline V_z^\theta(t) := \xi(t)V_z^\theta(t)$ is the value process of $\theta(t)$ for the normalized market $\{\overline X(t)\}$ (Lemma 12.2.2). Hence we have
$$\xi(t)V_z^\theta(t) = z + \int_0^t \theta(s)\,d\overline X(s). \tag{12.3.6}$$
If (12.2.12) and (12.2.13) hold, then – with $Q, \widetilde B$ defined as before ((12.2.2) and (12.2.3)) – we can rewrite this as (see Lemma 12.2.3)
$$\xi(t)V_z^\theta(t) = z + \int_0^t \sum_{i=1}^n \theta_i(s)\xi(s)\sum_{j=1}^m \sigma_{ij}(s)\,d\widetilde B_j(s). \tag{12.3.7}$$
Therefore, the portfolio $\theta(t) = (\theta_0(t),\ldots,\theta_n(t))$ needed to hedge a given $T$-claim $F$ is given by
$$\xi(t,\omega)\,(\theta_1(t),\ldots,\theta_n(t))\,\sigma(t,\omega) = \phi(t,\omega), \tag{12.3.8}$$
i.e.
$$\widetilde\theta(t) = X_0(t)\phi(t)\Lambda(t),$$
where $\phi(t,\omega)\in\mathbb R^m$ is such that
$$\xi(T)F(\omega) = z + \int_0^T \phi(t,\omega)\,d\widetilde B(t) \tag{12.3.9}$$
(and $\theta_0(t)$ is given by (12.1.17)).
In view of this it is of interest to find the integrand $\phi(t,\omega)$ explicitly when $F$ is given. One way of doing this is by using a generalized version of the Clark–Ocone theorem from the Malliavin calculus. See Karatzas and Ocone (1991). A survey containing their result is in Øksendal (1996). See also Di Nunno et al. (2009).
In the Markovian case, however, there is a simpler method, which we now describe. It is a modification of a method used by Hu (1997).
Let $Y(t)$ be an Itô diffusion in $\mathbb R^k$ of the form
$$dY(t) = b(Y(t))\,dt + \sigma(Y(t))\,dB(t), \qquad Y(0) = y, \tag{12.3.10}$$
where $b:\mathbb R^k\to\mathbb R^k$ and $\sigma:\mathbb R^k\to\mathbb R^{k\times m}$ are given Lipschitz continuous functions. Assume that $Y(t)$ is uniformly elliptic, i.e. that there exists a constant $c>0$ such that
$$x^T\sigma(y)\sigma^T(y)x \ge c|x|^2 \qquad\text{for all } x\in\mathbb R^k,\ y\in\mathbb R^k. \tag{12.3.11}$$
Let $h\in C_0^2(\mathbb R^k)$ and define
$$g(t,y) = E_Q^y\big[h(Y(T-t))\big]; \qquad t\in[0,T],\ y\in\mathbb R^k. \tag{12.3.12}$$
Then by uniform ellipticity it is known that $g(t,y)\in C^{1,2}((0,T)\times\mathbb R^k)$ (see Dynkin 1965 II, Theorem 13.18 p. 53 and Dynkin 1965 I, Theorem 5.11 p. 162) and hence Kolmogorov's backward equation (Theorem 8.1.1) gives that
$$\frac{\partial g}{\partial t} + \sum_{i=1}^k b_i(y)\frac{\partial g}{\partial y_i} + \frac12\sum_{i,j=1}^k (\sigma\sigma^T)_{ij}(y)\frac{\partial^2 g}{\partial y_i\partial y_j} = 0. \tag{12.3.13}$$
Now put
$$Z(t) = g(t, Y(t)). \tag{12.3.14}$$
Then by Itô's formula we have, by (12.3.13),
$$dZ(t) = \frac{\partial g}{\partial t}(t,Y(t))\,dt + \sum_{i=1}^k \frac{\partial g}{\partial y_i}(t,Y(t))\,dY_i(t) + \frac12\sum_{i,j=1}^k (\sigma\sigma^T)_{ij}(Y(t))\frac{\partial^2 g}{\partial y_i\partial y_j}(t,Y(t))\,dt = \sum_{i=1}^k \frac{\partial g}{\partial y_i}(t,Y(t))\,\sigma_i(Y(t))\,dB(t). \tag{12.3.15}$$
Moreover,
$$Z(T) = g(T, Y(T)) = E_Q^{Y(T)}\big[h(Y(0))\big] = h(Y(T)) \tag{12.3.16}$$
and
$$Z(0) = g(0, Y(0)) = E_Q^y\big[h(Y(T))\big]. \tag{12.3.17}$$
Combining (12.3.15)–(12.3.17) we get
$$h(Y(T)) = E_Q^y\big[h(Y(T))\big] + \int_0^T \phi(t,\omega)\,dB(t), \tag{12.3.18}$$
where
$$\phi(t,\omega) = \sum_{i=1}^k \frac{\partial}{\partial y_i} E_Q^y\big[h(Y(T-t))\big]\Big|_{y=Y(t)}\,\sigma_i(Y(t)). \tag{12.3.19}$$
More generally, by approximating more general processes $Y(t)$ by uniformly elliptic processes $Y^{(n)}(t)$ and more general functions $h(y)$ by $C_0^2$ functions $H^{(m)}(y)$; $n, m = 1, 2, \ldots$, we obtain the following conclusion:

Theorem 12.3.3 Let $Y(t)\in\mathbb R^k$ be an Itô diffusion of the form (12.3.10) and assume that $h:\mathbb R^k\to\mathbb R$ is a given function such that
$$\frac{\partial}{\partial y_i} E_Q^y\big[h(Y(T-t))\big] \quad\text{exists for } i=1,\ldots,k \tag{12.3.20}$$
and
$$E_Q^y\Big[\int_0^T \phi^2(t,\omega)\,dt\Big] < \infty, \tag{12.3.21}$$
where
$$\phi(t,\omega) = \sum_{i=1}^k \frac{\partial}{\partial y_i} E_Q^y\big[h(Y(T-t))\big]\Big|_{y=Y(t)}\,\sigma_i(Y(t)). \tag{12.3.22}$$
Then we have the Itô representation formula
$$h(Y(T)) = E_Q\big[h(Y(T))\big] + \int_0^T \phi(t,\omega)\,dB(t). \tag{12.3.23}$$
To apply this result to find $\phi$ in (12.3.9) we argue as follows: Suppose the equation for $X(t)$ is Markovian, i.e.
$$dX_0(t) = \rho(X(t))X_0(t)\,dt; \qquad X_0(0) = 1,$$
$$dX_i(t) = \mu_i(X(t))\,dt + \sigma_i(X(t))\,dB(t); \qquad i = 1,\ldots,n,$$
where $\sigma_i$ is row number $i$ of the matrix $\sigma$. We rewrite this equation for $X(t)$ in terms of $\widetilde B(t)$ and get, using (12.2.12) and (12.2.3), the following system:
$$dX_0(t) = \rho(X(t))X_0(t)\,dt \tag{12.3.24}$$
$$dX_i(t) = \rho(X(t))X_i(t)\,dt + \sigma_i(X(t))\,d\widetilde B(t); \qquad 1\le i\le n. \tag{12.3.25}$$
Therefore Theorem 12.3.3 gives

Corollary 12.3.4 Let $X(t) = (X_0(t),\ldots,X_n(t))\in\mathbb R^{n+1}$ be given by (12.3.24)–(12.3.25) and assume that $h_0:\mathbb R^{n+1}\to\mathbb R$ is a given function such that
$$\frac{\partial}{\partial x_i} E_Q^x\big[\xi(T-t)h_0(X(T-t))\big] \quad\text{exists for } i=1,\ldots,n \tag{12.3.26}$$
and
$$E_Q^x\Big[\int_0^T \phi^2(t,\omega)\,dt\Big] < \infty, \tag{12.3.27}$$
where
$$\phi(t,\omega) := \sum_{i=1}^n \frac{\partial}{\partial x_i} E_Q^x\big[\xi(T-t)h_0(X(T-t))\big]\Big|_{x=X(t)}\,\sigma_i(X(t)). \tag{12.3.28}$$
Then we have the Itô representation formula
$$\xi(T)h_0(X(T)) = E_Q\big[\xi(T)h_0(X(T))\big] + \int_0^T \phi(t,\omega)\,d\widetilde B(t). \tag{12.3.29}$$

We summarize our results about the pricing and hedging of European $T$-claims as follows:

Theorem 12.3.5 Let $\{X(t)\}_{t\in[0,T]}$ be a complete market. Suppose (12.2.12) and (12.2.13) hold and let $Q, \widetilde B$ be as in (12.2.2), (12.2.3). Let $F$ be a European $T$-claim such that $E_Q[\xi(T)F]<\infty$. Then the price of the claim $F$ is
$$p(F) = E_Q[\xi(T)F]. \tag{12.3.30}$$
Moreover, to find a replicating (hedging) portfolio $\theta(t) = (\theta_0(t),\ldots,\theta_n(t))$ for the claim $F$ we first find (for example by using Corollary 12.3.4, if possible) an adapted process $\phi$ such that $E_Q\big[\int_0^T \phi^2(t)\,dt\big]<\infty$ and
$$\xi(T)F = E_Q[\xi(T)F] + \int_0^T \phi(t,\omega)\,d\widetilde B(t). \tag{12.3.31}$$
Then we choose $\widetilde\theta(t) = (\theta_1(t),\ldots,\theta_n(t))$ such that
$$\widetilde\theta(t,\omega)\,\xi(t,\omega)\,\sigma(t,\omega) = \phi(t,\omega) \tag{12.3.32}$$
and we choose $\theta_0(t)$ as in (12.1.17).

Proof. (12.3.30) is just part b) of Theorem 12.3.2. The equation (12.3.32) follows from (12.3.8). Note that the equation (12.3.32) has the solution
$$\widetilde\theta(t,\omega) = X_0(t)\phi(t,\omega)\Lambda(t,\omega), \tag{12.3.33}$$
where $\Lambda(t,\omega)$ is the left inverse of $\sigma(t,\omega)$ (Theorem 12.2.5). $\square$
Example 12.3.6 Suppose the market is $(X_0(t), X_1(t))$, where $X_0(t) = e^{\rho t}$ and $X_1(t)$ is an Ornstein–Uhlenbeck process
$$dX_1(t) = \alpha X_1(t)\,dt + \sigma\,dB(t); \qquad X_1(0) = x_1,$$
where $\rho>0$ and $\alpha, \sigma$ are constants, $\sigma\ne 0$. How do we hedge the claim
$$F(\omega) = \exp(X_1(T))\ ?$$
The portfolio $\theta(t) = (\theta_0(t),\theta_1(t))$ that we seek is given by (12.3.33), i.e.
$$\theta_1(t,\omega) = e^{\rho t}\sigma^{-1}\phi(t,\omega),$$
where $\phi(t,\omega)$ and $V(0) = z$ are uniquely given by (12.3.9), i.e.
$$\xi(T)F(\omega) = z + \int_0^T \phi(t,\omega)\,d\widetilde B(t)$$
or
$$F(\omega) = \exp(X_1(T)) = ze^{\rho T} + \int_0^T \phi_0(t,\omega)\,d\widetilde B(t),$$
where
$$\phi_0(t,\omega) = e^{\rho T}\phi(t,\omega).$$
To find $\phi(t,\omega)$ explicitly we apply Corollary 12.3.4:
Note that in terms of $\widetilde B$ we have
$$dX_1(t) = \rho X_1(t)\,dt + \sigma\,d\widetilde B(t); \qquad X_1(0) = x_1.$$
This equation has the solution (see Exercise 5.5)
$$X_1(t) = x_1 e^{\rho t} + \sigma\int_0^t e^{\rho(t-s)}\,d\widetilde B(s).$$
Hence, if we choose $h_0(x_1) = \exp(x_1)$ we have
$$E_Q^{x_1}\big[h_0(X_1(T-t))\big] = E_Q^{x_1}\big[\exp(X_1(T-t))\big] = E_Q\Big[\exp\Big(x_1 e^{\rho(T-t)} + \sigma\int_0^{T-t} e^{\rho(T-t-s)}\,d\widetilde B(s)\Big)\Big] = \exp\Big(x_1 e^{\rho(T-t)} + \frac{\sigma^2}{4\rho}\big(e^{2\rho(T-t)} - 1\big)\Big) \qquad\text{if } \rho\ne 0.$$
This gives, by (12.3.28),
$$\phi_0(t,\omega) = \frac{d}{dx_1} E_Q^{x_1}\big[h_0(X_1(T-t))\big]\Big|_{x_1=X_1(t)}\,\sigma = \sigma e^{\rho(T-t)}\exp\Big(X_1(t)e^{\rho(T-t)} + \frac{\sigma^2}{4\rho}\big(e^{2\rho(T-t)} - 1\big)\Big) \qquad\text{if } \rho\ne 0.$$
Therefore, by (12.3.33),
$$\theta_1(t) = \begin{cases} \exp\Big(X_1(t)e^{\rho(T-t)} + \dfrac{\sigma^2}{4\rho}\big(e^{2\rho(T-t)} - 1\big)\Big) & \text{if } \rho\ne 0 \\[2mm] \exp\Big(X_1(t) + \dfrac{\sigma^2}{2}(T-t)\Big) & \text{if } \rho = 0. \end{cases}$$
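The final formula is straightforward to evaluate. A direct transcription (the function name is ours), together with a consistency check that the $\rho\ne 0$ branch tends to the $\rho=0$ branch as $\rho\to 0$:

```python
import math

def theta1(x1_t, t, T, rho, sigma):
    """Hedge ratio from Example 12.3.6 for the claim F = exp(X1(T))
    on an Ornstein-Uhlenbeck asset; x1_t is the observed value X1(t)."""
    tau = T - t
    if rho != 0.0:
        return math.exp(x1_t * math.exp(rho * tau)
                        + sigma**2 / (4 * rho) * (math.exp(2 * rho * tau) - 1))
    return math.exp(x1_t + 0.5 * sigma**2 * tau)

# the two branches agree in the limit rho -> 0
a = theta1(0.5, 1.0, 3.0, 1e-9, 0.3)
b = theta1(0.5, 1.0, 3.0, 0.0, 0.3)
assert abs(a - b) / b < 1e-6
```

The $\rho=0$ branch is just the lognormal moment $E[e^G] = e^{\operatorname{Var}(G)/2}$ with $\operatorname{Var}(G) = \sigma^2(T-t)$, which is the limit of $\frac{\sigma^2}{4\rho}(e^{2\rho(T-t)}-1)$.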
The Generalized Black–Scholes Model

Let us now specialize to a situation where the market has just two securities $X_0(t), X_1(t)$, where $X_0, X_1$ are Itô processes of the form
$$dX_0(t) = \rho(t,\omega)X_0(t)\,dt \qquad\text{(as before)} \tag{12.3.34}$$
$$dX_1(t) = \alpha(t,\omega)X_1(t)\,dt + \beta(t,\omega)X_1(t)\,dB(t), \tag{12.3.35}$$
where $B(t)$ is 1-dimensional and $\alpha(t,\omega), \beta(t,\omega)$ are 1-dimensional processes in $\mathcal W$.
Note that the solution of (12.3.35) is
$$X_1(t) = X_1(0)\exp\Big(\int_0^t \beta(s,\omega)\,dB(s) + \int_0^t\big(\alpha(s,\omega) - \tfrac12\beta^2(s,\omega)\big)\,ds\Big). \tag{12.3.36}$$
The equation (12.2.12) gets the form
$$X_1(t)\beta(t,\omega)u(t,\omega) = X_1(t)\alpha(t,\omega) - X_1(t)\rho(t,\omega),$$
which has the solution
$$u(t,\omega) = \beta^{-1}(t,\omega)\big[\alpha(t,\omega) - \rho(t,\omega)\big] \qquad\text{if } \beta(t,\omega)\ne 0. \tag{12.3.37}$$
So (12.2.13) holds iff
$$E\Big[\exp\Big(\tfrac12\int_0^T \frac{(\alpha(s,\omega)-\rho(s,\omega))^2}{\beta^2(s,\omega)}\,ds\Big)\Big] < \infty. \tag{12.3.38}$$
[...] where $\rho(t), \beta(t)$ are deterministic and
$$E\Big[\exp\Big(\tfrac12\int_0^T \frac{(\alpha(t,\omega)-\rho(t))^2}{\beta^2(t)}\,dt\Big)\Big] < \infty,$$
[...] where $\rho, \alpha, \beta\ne 0$ are constants. Then the price $p$ at time $0$ of the European call option, with payoff
$$F(\omega) = (X_1(T,\omega) - K)^+, \tag{12.3.44}$$
where $K>0$ is a constant (the exercise price), is
$$p = x_1\,\Phi\big(\eta + \tfrac12\beta\sqrt T\big) - Ke^{-\rho T}\,\Phi\big(\eta - \tfrac12\beta\sqrt T\big), \tag{12.3.45}$$
where
$$\eta = \beta^{-1}T^{-1/2}\Big(\ln\frac{x_1}{K} + \rho T\Big) \tag{12.3.46}$$
and
$$\Phi(y) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^y e^{-\frac12 x^2}\,dx; \qquad y\in\mathbb R \tag{12.3.47}$$
is the standard normal distribution function.
b) The replicating portfolio $\theta(t) = (\theta_0(t),\theta_1(t))$ for the claim $F$ in (12.3.44) is given by
$$\theta_1(t,\omega) = \Phi\Big(\beta^{-1}(T-t)^{-1/2}\Big[\ln\frac{X_1(t)}{K} + \big(\rho + \tfrac12\beta^2\big)(T-t)\Big]\Big), \tag{12.3.48}$$
with $\theta_0(t,\omega)$ determined by (12.1.17) and $V^\theta(0) = p$.
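Formulas (12.3.45)–(12.3.48) are easy to evaluate numerically; $\Phi$ can be computed from the error function. A sketch (the function names are ours), checked against the commonly quoted value $p\approx 10.45$ for $x_1=K=100$, $\rho=0.05$, $\beta=0.2$, $T=1$:

```python
import math

def Phi(y):
    """Standard normal distribution function (12.3.47), via the error function."""
    return 0.5 * (1.0 + math.erf(y / math.sqrt(2.0)))

def bs_call(x1, K, rho, beta, T):
    """Black-Scholes call price (12.3.45)-(12.3.46) and the time-0 hedge
    ratio theta1 from (12.3.48)."""
    eta = (math.log(x1 / K) + rho * T) / (beta * math.sqrt(T))
    p = (x1 * Phi(eta + 0.5 * beta * math.sqrt(T))
         - K * math.exp(-rho * T) * Phi(eta - 0.5 * beta * math.sqrt(T)))
    theta1 = Phi(eta + 0.5 * beta * math.sqrt(T))   # number of shares held
    return p, theta1

p, d = bs_call(100.0, 100.0, 0.05, 0.2, 1.0)
assert abs(p - 10.4506) < 1e-3
```

Note $0 < \theta_1 < 1$ always, in line with the Remark below (12.3.48): the call is replicated without short selling.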
Remark. Note that, in particular, $\theta_1(t,\omega) > 0$ for all $t\in[0,T]$. This means that we can replicate the European call without short selling. For the European put the situation is different. See Exercise 12.16.

Proof. a) This follows by applying Theorem 12.3.7 a) to the function $f(z) = (z-K)^+$. Then the corresponding integral (12.3.42) can be written
$$p = \frac{e^{-\rho T}}{\beta\sqrt{2\pi T}}\int_\gamma^\infty \Big(x_1\exp\big[y + \big(\rho - \tfrac12\beta^2\big)T\big] - K\Big)\exp\Big(-\frac{y^2}{2\beta^2 T}\Big)\,dy,$$
where $\gamma = \ln\frac{K}{x_1} - \rho T + \frac12\beta^2 T$. This integral splits into two parts, both of which can be reduced to integrals of standard normal distribution type by completing the square. We leave the details to the reader. (Exercise 12.13.)
b) This follows similarly from Theorem 12.3.7 b). The function $f(z) = (z-K)^+$ is not $C^1$, but an approximation argument shows that formula (12.3.43) still holds if we represent $f'$ by
$$f'(z) = \mathcal X_{[K,\infty)}(z).$$
Then the rest follows by completing the square as in a). (Exercise 12.13.) $\square$

American options

The difference between European and American options is that in the latter case the buyer of the option is free to choose any exercise time $\tau$ before or at the given expiration time $T$ (and the guaranteed payoff may depend on both $\tau$ and $\omega$). This exercise time $\tau$ may be stochastic (depend on $\omega$), but only in such a way that the decision to exercise before or at a time $t$ depends only on the history up to time $t$. More precisely, we require that for all $t$ we have
$$\{\omega;\ \tau(\omega)\le t\}\in\mathcal F_t^{(m)}.$$
In other words, $\tau$ must be an $\mathcal F_t^{(m)}$-stopping time (Definition 7.2.1).

Definition 12.3.9 An American contingent $T$-claim is an $\mathcal F_t^{(m)}$-adapted, $(t,\omega)$-measurable and a.s. lower bounded continuous stochastic process $F(t) = F(t,\omega)$; $t\in[0,T]$, $\omega\in\Omega$. An American option on such a claim $F(t,\omega)$ gives the owner of the option the right (but not the obligation) to choose any stopping time $\tau(\omega)\le T$ as exercise time for the option, resulting in a payment $F(\tau(\omega),\omega)$ to the owner.
Let $F(t) = F(t,\omega)$ be an American contingent claim. Suppose you were offered a guarantee to be paid the amount $F(\tau(\omega),\omega)$ at the (stopping) time $\tau(\omega)\le T$ that you are free to choose. How much would you be willing to pay for such a guarantee?
We repeat the argument preceding Definition 12.3.1: If I – the buyer – pay the price $y$ for this guarantee, then I will have an initial fortune (debt) $-y$ in my investment strategy. With this initial fortune $-y$ it must be possible to find a stopping time $\tau\le T$ and an admissible portfolio $\varphi$ such that
$$V^\varphi_{-y}(\tau(\omega),\omega) + F(\tau(\omega),\omega) \ge 0 \quad\text{a.s.}$$
Thus the maximal price $p = p_A(F)$ the buyer is willing to pay is

(Buyer's price of the American contingent claim $F$)
$$p_A(F) = \sup\Big\{y;\ \text{there exist a stopping time } \tau\le T \text{ and an admissible portfolio } \varphi \text{ such that}$$
$$V^\varphi_{-y}(\tau(\omega),\omega) := -y + \int_0^{\tau(\omega)} \varphi(s)\,dX(s) \ge -F(\tau(\omega),\omega) \ \text{a.s.}\Big\}. \tag{12.3.49}$$

On the other hand, the seller could argue as follows: If I – the seller – receive the price $z$ for such a guarantee, then with this initial fortune $z$ it must be possible to find an admissible portfolio $\psi$ which generates a value process which at any time is not less than the amount promised to the buyer:
$$V_z^\psi(t,\omega)\ge F(t,\omega) \quad\text{a.s. for all } t\in[0,T].$$
Thus the minimal price $q = q_A(F)$ the seller is willing to accept is

(Seller's price of the American contingent claim $F$)
$$q_A(F) = \inf\Big\{z;\ \text{there exists an admissible portfolio } \psi \text{ such that for all } t\in[0,T] \text{ we have}$$
$$V_z^\psi(t,\omega) := z + \int_0^t \psi(s)\,dX(s) \ge F(t,\omega) \ \text{a.s.}\Big\}. \tag{12.3.50}$$

We can now prove a result analogous to Theorem 12.3.2. The result is basically due to Bensoussan (1984) and Karatzas (1988).

Theorem 12.3.10 (Pricing formula for American options)
a) Suppose (12.2.12) and (12.2.13) hold and let $Q$ be as in (12.2.2). Let $F(t) = F(t,\omega)$; $t\in[0,T]$, be an American contingent $T$-claim such that
$$\sup_{\tau\le T} E_Q\big[\xi(\tau)F(\tau)\big] < \infty. \tag{12.3.51}$$
Then
$$p_A(F) \le \sup_{\tau\le T} E_Q\big[\xi(\tau)F(\tau)\big] \le q_A(F) \le \infty. \tag{12.3.52}$$
b) Suppose, in addition to the conditions in a), that the market $\{X(t)\}$ is complete. Then
$$p_A(F) = \sup_{\tau\le T} E_Q\big[\xi(\tau)F(\tau)\big] = q_A(F). \tag{12.3.53}$$
Proof. a) We proceed as in the proof of Theorem 12.3.2: Suppose $y\in\mathbb R$ and there exist a stopping time $\tau\le T$ and an admissible portfolio $\varphi$ such that
$$V^\varphi_{-y}(\tau,\omega) = -y + \int_0^\tau \varphi(s)\,dX(s) \ge -F(\tau) \quad\text{a.s.}$$
Then as before we get
$$-y + \int_0^\tau \sum_{i=1}^n \varphi_i(s)\xi(s)\sigma_i(s)\,d\widetilde B(s) = \overline V^\varphi_{-y}(\tau) = \xi(\tau)V^\varphi_{-y}(\tau) \ge -\xi(\tau)F(\tau) \quad\text{a.s.}$$
Taking expectations with respect to $Q$ we get, since $\overline V^\varphi_{-y}(t)$ is a $Q$-supermartingale,
$$y \le E_Q\big[\xi(\tau)F(\tau)\big] \le \sup_{\tau\le T} E_Q\big[\xi(\tau)F(\tau)\big].$$
Since this holds for all such $y$ we conclude that
$$p_A(F) \le \sup_{\tau\le T} E_Q\big[\xi(\tau)F(\tau)\big]. \tag{12.3.54}$$
Similarly, suppose $z\in\mathbb R$ and there exists an admissible portfolio $\psi$ such that
$$V_z^\psi(t,\omega) = z + \int_0^t \psi(s)\,dX(s) \ge F(t) \quad\text{a.s. for all } t\in[0,T].$$
Then, as above, if $\tau\le T$ is a stopping time we get
$$z + \int_0^\tau \sum_{i=1}^n \psi_i(s)\xi(s)\sigma_i(s)\,d\widetilde B(s) = \overline V_z^\psi(\tau) = \xi(\tau)V_z^\psi(\tau) \ge \xi(\tau)F(\tau) \quad\text{a.s.}$$
Again, taking expectations with respect to Q and then supremum over τ ≤ T we get z ≥ sup EQ [ξ(τ )F (τ )] . τ ≤T
Since this holds for all such z, we get qA (F ) ≥ sup EQ [ξ(τ )F (τ )] .
(12.3.55)
τ ≤T
b) Next, assume in addition that the market is complete. Choose a stopping time τ ≤ T. Define

    F_k(t) = F_k(t, ω) = { k         if F(t, ω) ≥ k
                         { F(t, ω)   if F(t, ω) < k

and put

    G_k(ω) = X_0(T)ξ(τ)F_k(τ) .

Then G_k is a bounded T-claim, so by completeness we can find y_k ∈ R and a portfolio θ^(k) such that

    −y_k + ∫_0^T θ^(k)(s)dX(s) = −G_k(ω)   a.s.

and such that

    −y_k + ∫_0^t θ^(k)(s)dX(s)

is a Q-martingale. Then, by (12.2.8)–(12.2.9),

    −y_k + ∫_0^T θ^(k)(s)dX̄(s) = −ξ(T)G_k(ω) = −ξ(τ)F_k(τ)

and hence

    −y_k + ∫_0^τ θ^(k)(s)dX̄(s) = E_Q[−y_k + ∫_0^T θ^(k)(s)dX̄(s) | F_τ^(m)]
                                = E_Q[−ξ(τ)F_k(τ) | F_τ^(m)] = −ξ(τ)F_k(τ) .

From this we get, again by (12.2.8)–(12.2.9),

    −y_k + ∫_0^τ θ^(k)(s)dX(s) = −F_k(τ)   a.s.

and

    y_k = E_Q[ξ(τ)F_k(τ)] .

This shows that any price of the form E_Q[ξ(τ)F_k(τ)] for some stopping time τ ≤ T would be acceptable for the buyer of an American option on the claim F_k(t, ω). Hence

    p_A(F) ≥ p_A(F_k) ≥ sup_{τ≤T} E_Q[ξ(τ)F_k(τ)] .

Letting k → ∞ we obtain by monotone convergence

    p_A(F) ≥ sup_{τ≤T} E_Q[ξ(τ)F(τ)] .
It remains to show that if we put

    z = sup_{0≤τ≤T} E_Q[ξ(τ)F(τ)]                                        (12.3.56)

then there exists an admissible portfolio θ(s, ω) which superreplicates F(t, ω), in the sense that

    z + ∫_0^t θ(s, ω)dX(s) ≥ F(t, ω)   for a.a. (t, ω) ∈ [0, T] × Ω .    (12.3.57)

The details of the proof of this can be found in Karatzas (1997), Theorem 1.4.3. Here we only sketch the proof: Define the Snell envelope

    S(t) = sup_{t≤τ≤T} E_Q[ξ(τ)F(τ) | F_t^(m)] ;   0 ≤ t ≤ T .

Then S(t) is a supermartingale w.r.t. Q and {F_t^(m)}, so by the Doob–Meyer decomposition we can write

    S(t) = M(t) − A(t) ;   0 ≤ t ≤ T ,

where M(t) is a Q, {F_t^(m)}-martingale with M(0) = S(0) = z and A(t) is a nondecreasing process with A(0) = 0. It is a consequence of Lemma 12.2.1 that we can represent the martingale M as an Itô integral w.r.t. B̃. Hence

    z + ∫_0^t φ(s, ω)dB̃(s) = M(t) = S(t) + A(t) ≥ S(t) ;   0 ≤ t ≤ T     (12.3.58)

for some F_t^(m)-adapted process φ(s, ω). Since the market is complete, we know by Theorem 12.2.5 that σ(t, ω) has a left inverse Λ(t, ω). So if we define θ̂ = (θ_1, …, θ_n) by

    θ̂(t, ω) = X_0(t)φ(t, ω)Λ(t, ω) ,

then by (12.3.58) and Lemma 12.2.4 we get

    z + ∫_0^t θ̂ dX̄ = z + ∫_0^t Σ_{i=1}^n ξθ_i σ_i dB̃ = z + ∫_0^t φ dB̃ ≥ S(t) ;   0 ≤ t ≤ T .

Hence, by Lemma 12.2.3,

    z + ∫_0^t θ(s, ω)dX(s) ≥ X_0(t)S(t) ≥ X_0(t)ξ(t)F(t) = F(t) ;   0 ≤ t ≤ T .
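In discrete time the Snell envelope construction above becomes a finite backward induction: at each date, the value is the maximum of the immediate payoff and the discounted conditional expectation of the next-date value. The following sketch computes this for an American put in a binomial approximation of the market; all parameters are illustrative and not from the text.

```python
import numpy as np

def american_put_binomial(S0, K, u, d, R, n):
    """Backward induction for the (discounted) Snell envelope in an n-step
    binomial model; R = 1 + one-period interest rate. Returns the time-0
    value, i.e. the price of the American put in this discrete market."""
    q = (R - d) / (u - d)                       # risk-neutral up-probability
    # terminal stock prices (j = n, ..., 0 up-moves) and terminal payoff
    stock = S0 * u ** np.arange(n, -1, -1) * d ** np.arange(0, n + 1)
    value = np.maximum(K - stock, 0.0)
    for t in range(n - 1, -1, -1):
        stock = S0 * u ** np.arange(t, -1, -1) * d ** np.arange(0, t + 1)
        cont = (q * value[:-1] + (1 - q) * value[1:]) / R   # discounted E_Q[.|F_t]
        value = np.maximum(np.maximum(K - stock, 0.0), cont)  # max(payoff, continuation)
    return float(value[0])
```

With S0 = K = 100, u = 1.2, d = 0.8, R = 1.1 and two steps, the induction exercises early in the down state, so the American value exceeds the European one — the discrete analogue of A(t) being nontrivial in the Doob–Meyer decomposition.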
The Itô Diffusion Case: Connection to Optimal Stopping

Theorem 12.3.10 shows that pricing an American option is an optimal stopping problem. In the general case the solution to this problem can be expressed in terms of the Snell envelope. See e.g. El Karoui (1981) and Fakeev (1970). In the Itô diffusion case we get an optimal stopping problem of the type discussed in Chapter 10. We now consider this case in more detail: Assume the market is an (n + 1)-dimensional Itô diffusion X(t) = (X_0(t), X_1(t), …, X_n(t)); t ≥ 0 of the form (see (12.1.1)–(12.1.2))

    dX_0(t) = ρ(t, X(t))X_0(t)dt ;   X_0(0) = x_0 > 0                    (12.3.59)

and

    dX_i(t) = μ_i(t, X(t))dt + Σ_{j=1}^m σ_{ij}(t, X(t))dB_j(t)
            = μ_i(t, X(t))dt + σ_i(t, X(t))dB(t) ;   X_i(0) = x_i ,      (12.3.60)

where ρ, μ_i and σ_{ij} are given functions satisfying the conditions of Theorem 5.2.1. Moreover, assume that the American claim F(t) is Markovian, i.e.

    F(t) = h(X(t))                                                       (12.3.61)

for some lower bounded function h: R^{n+1} → R. Define

    Y(t) = [ s + t ]  ∈ R^{n+2}                                          (12.3.62)
           [ X(t)  ]

and put

    g(y) = g(s, x) = x_0^{−1} h(x) ;   x = (x_0, x_1, …, x_n) ∈ R^{n+1} . (12.3.63)

Then the problem to find the price

    p_A(F) = q_A(F) = sup_{τ≤T} E_Q[ξ(τ)F(τ)]                            (12.3.64)

in (12.3.53) can be regarded as a special case of the general optimal stopping problem considered in Section 10.4. Indeed, if we put

    Φ(y) = sup_{τ≤τ_G} E_Q^y[g(Y(τ))] ,                                  (12.3.65)

where

    τ_G = inf{t > 0; s + t ≥ T} = T − s                                  (12.3.66)

is the first exit time of Y(t) from the region

    G = {(s, x); s < T} ⊂ R^{n+2} ,                                      (12.3.67)

then

    p_A(F) = Φ(0, 1, x_1, …, x_n) .                                      (12.3.68)

To apply the theory of Section 10.4 to problem (12.3.65) we must rewrite equations (12.3.59)–(12.3.60) in terms of B̃(t). Using (12.2.12) and (12.2.3) we get, as in (12.3.24)–(12.3.25),

    dX_0(t) = ρ(t, X(t))X_0(t)dt ;   X_0(0) = x_0 > 0                     (12.3.69)
    dX_i(t) = ρ(t, X(t))X_i(t)dt + σ_i(t, X(t))dB̃(t) ;   X_i(0) = x_i ;  1 ≤ i ≤ n . (12.3.70)

We summarize this as follows:

Theorem 12.3.11 Suppose the market {X(t)} has the Markovian form (12.3.59)–(12.3.60), which is equivalent to (12.3.69)–(12.3.70), and that the American claim F(t) has the Markovian form (12.3.61). Then the price p_A(F) of the American option with this payoff F is the solution of the optimal stopping problem (12.3.65) at y = (0, 1, x_1, …, x_n).

Example 12.3.12 (The American put) Consider the Black–Scholes market

    dX_0(t) = ρX_0(t)dt ;                   X_0(0) = 1
    dX_1(t) = αX_1(t)dt + βX_1(t)dB(t) ;    X_1(0) = x_1 > 0

where ρ, α are constants and β ≠ 0 is constant. In terms of B̃ the system becomes

    dX_0(t) = ρX_0(t)dt ;                   X_0(0) = 1 ;
    dX_1(t) = ρX_1(t)dt + βX_1(t)dB̃(t) ;    X_1(0) = x_1 > 0 .

Therefore the generator L of the process Y(t) defined in (12.3.62) is given by

    Lf(s, x_0, x_1) = ∂f/∂s + ρx_0 ∂f/∂x_0 + ρx_1 ∂f/∂x_1 + ½β²x_1² ∂²f/∂x_1²   (12.3.71)
for f ∈ C_0²(R³). Therefore, according to Theorem 10.4.1, to find the value function Φ in (12.3.63)–(12.3.65) we look for a C¹ function φ such that

    φ(s, x_0, x_1) ≥ x_0^{−1} h(x_0, x_1)   for all s < T                (12.3.72)
    φ(T, x_0, x_1) = x_0^{−1} h(x_0, x_1) .                              (12.3.73)

Define the continuation region

    D = {(s, x_0, x_1); φ(s, x_0, x_1) > x_0^{−1} h(x_0, x_1)} .         (12.3.74)

Then we require φ to be C² outside ∂D and

    Lφ(s, x_0, x_1) ≤ 0   outside D̄                                      (12.3.75)

and

    Lφ(s, x_0, x_1) = 0   on D .                                         (12.3.76)

Suppose

    F(t) = (K − X_1(t))⁺   (the American put).

Then the function g in (12.3.63) is

    g(s, x_0, x_1) = x_0^{−1}(K − x_1)⁺ = e^{−ρs}(K − x_1)⁺ ,

so we can disregard the variable x_0 and use s instead. Then the variational inequalities (12.3.72)–(12.3.76) take the form

    φ(s, x_1) ≥ e^{−ρs}(K − x_1)⁺   for all s < T                         (12.3.77)
    φ(T, x_1) = e^{−ρT}(K − x_1)⁺                                         (12.3.78)
    D = {(s, x_1); φ(s, x_1) > e^{−ρs}(K − x_1)⁺}                         (12.3.79)
    ∂φ/∂s + ρx_1 ∂φ/∂x_1 + ½β²x_1² ∂²φ/∂x_1² ≤ 0   outside D̄             (12.3.80)
    ∂φ/∂s + ρx_1 ∂φ/∂x_1 + ½β²x_1² ∂²φ/∂x_1² = 0   in D .                 (12.3.81)
In this case we cannot factor out e^{−ρs} as we often did in Chapter 10, because that would lead to a conflict between (12.3.78) and (12.3.81). If such a φ is found, and the additional assumptions of Theorem 10.4.1 hold, then we can conclude that φ(s, x_1) = Φ(s, x_1), and hence p_A(F) = φ(0, x_1) is the option price at time t = 0. Moreover,

    τ* = τ_D = inf{t > 0; (s + t, X_1(t)) ∉ D}

is the corresponding optimal stopping time, i.e. the optimal time to exercise the American option. Unfortunately, even in this case it seems that an explicit analytic solution is very hard (possibly impossible) to find. However, there are interesting partial results and good approximation procedures. See e.g. Barles et al. (1995), Bather (1997), Jacka (1991), Karatzas (1997), Musiela and Rutkowski (1997) and the references therein. For example, it is known (see Jacka (1991)) that the continuation region D has the form

    D = {(t, x_1) ∈ (−∞, T) × R ; x_1 > f(t)} ,

i.e. D is the region above the graph of f, for some continuous, increasing function f: (0, T) → R.

Figure 12.1. The shape of the continuation region for the American put.

Thus the problem is to find the function f. In Barles et al. (1995) it is shown that

    f(t) ∼ K − βK √((T − t)|ln(T − t)|)   as t → T⁻ ,

in the sense that

    (f(t) − K) / (−βK √((T − t)|ln(T − t)|)) → 1   as t → T⁻ .

This indicates that the continuation region has the shape shown in Figure 12.1. But its exact form is still unknown. For the corresponding American call option the situation is much simpler. See Exercise 12.14.
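One widely used approximation procedure of the kind mentioned above (not among the references cited here) is the least-squares Monte Carlo method of Longstaff and Schwartz: simulate (12.3.69)–(12.3.70) under Q and estimate the continuation value at each exercise date by regression. The sketch below prices the American put of Example 12.3.12 this way; the parameters, the quadratic regression basis, and the time grid are all illustrative choices, not prescriptions from the text.

```python
import numpy as np

def lsm_american_put(S0, K, r, sigma, T, n_steps=50, n_paths=20_000, seed=0):
    """Least-squares Monte Carlo (Longstaff-Schwartz) estimate of the American
    put price under risk-neutral dynamics dX1 = r X1 dt + sigma X1 dB~."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    Z = rng.standard_normal((n_paths, n_steps))
    S = S0 * np.exp(np.cumsum((r - 0.5 * sigma ** 2) * dt
                              + sigma * np.sqrt(dt) * Z, axis=1))
    disc = np.exp(-r * dt)
    cash = np.maximum(K - S[:, -1], 0.0)          # exercise value at maturity
    for j in range(n_steps - 2, -1, -1):          # backward through exercise dates
        cash *= disc
        itm = K - S[:, j] > 0                     # regress on in-the-money paths only
        if itm.sum() > 10:
            coeff = np.polyfit(S[itm, j], cash[itm], 2)   # quadratic basis (a choice)
            cont = np.polyval(coeff, S[itm, j])           # estimated continuation value
            exercise = np.maximum(K - S[itm, j], 0.0)
            stop = exercise > cont
            idx = np.where(itm)[0][stop]
            cash[idx] = exercise[stop]
    return disc * float(np.mean(cash))
```

For S0 = 36, K = 40, r = 0.06, sigma = 0.2, T = 1 the estimate lies well above both the intrinsic value K − S0 = 4 and the European put price, reflecting the early-exercise premium A(t).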
Exercises

12.1. a) Prove that the price process {X(t)}_{t∈[0,T]} has an arbitrage iff the normalized price process {X̄(t)}_{t∈[0,T]} has an arbitrage.
b) Suppose {X(t)}_{t∈[0,T]} is normalized. Prove that {X(t)}_{t∈[0,T]} has an arbitrage iff there exists an admissible portfolio θ such that

    V^θ(0) ≤ V^θ(T)   a.s.   and   P[V^θ(T) > V^θ(0)] > 0 .              (12.3.82)

In other words, in normalized markets it is not essential that we require V^θ(0) = 0 for an arbitrage θ, only that the gain V^θ(T) − V^θ(0) is nonnegative a.s. and positive with positive probability.
(Hint: If θ is as in (12.3.82) define θ̂(t) = (θ̂_0(t), …, θ̂_n(t)) as follows: Let θ̂_i(t) = θ_i(t) for i = 1, …, n; t ∈ [0, T]. Then choose θ̂_0(0) such that V^θ̂(0) = 0 and define θ̂_0(t) according to (12.1.15) to make θ̂ self-financing. Then

    V^θ̂(t) = θ̂(t)·X(t) = ∫_0^t θ̂(s)dX(s) = ∫_0^t θ(s)dX(s) = V^θ(t) − V^θ(0) .)
12.2. Let θ(t) = (θ_0, …, θ_n) be a constant portfolio. Prove that θ is self-financing.

12.3. Suppose {X(t)} is a normalized market and that (12.2.12) and (12.2.13) hold. Suppose n = m and that σ is invertible with a bounded inverse. Suppose every bounded claim is attainable. Show that then any lower bounded claim F such that E_Q[F²] < ∞ is attainable.
(Hint: Choose bounded T-claims F_k such that

    F_k → F  in L²(Q)   and   E[F_k] = E[F] .

By assumption there exist admissible portfolios θ^(k) = (θ_0^(k), …, θ_n^(k)) and constants V_k(0) such that

    F_k(ω) = V_k(0) + ∫_0^T θ^(k)(s)dX(s) = V_k(0) + ∫_0^T θ̂^(k)(s)σ(s)dB̃(s) ,

where θ̂^(k) = (θ_1^(k), …, θ_n^(k)). It follows that

    V_k(0) = E_Q[F_k] → E_Q[F]   as k → ∞ .

By the Itô isometry the sequence {θ̂^(k)σ}_k is a Cauchy sequence in L²(λ × Q) and hence converges in this space. Conclude that there exists an admissible θ such that

    F(ω) = E_Q[F] + ∫_0^T θ(s)dX(s) .)
12.4. Let B(t) be 1-dimensional Brownian motion. Show that there exist θ_1(t, ω), θ_2(t, ω) ∈ W such that if we define

    V_1(t) = 1 + ∫_0^t θ_1(s, ω)dB(s) ,   V_2(t) = 2 + ∫_0^t θ_2(s, ω)dB(s) ;   t ∈ [0, 1]

then

    V_1(1) = V_2(1) = 0   and   V_1(t) ≥ 0 ,  V_2(t) ≥ 0

for a.a. (t, ω). Therefore both θ_1(t, ω) and θ_2(t, ω) are admissible portfolios for the claim F(ω) = 0 in the normalized market with n = 1 and X_1(t) = B(t). In particular, if we drop the martingale condition in Definition 12.2.4 b) we have no uniqueness of replicating portfolios, even if we require the portfolio to be admissible. (Note, however, that we have uniqueness if we require that θ ∈ V(0, 1), by Theorem 4.3.3.)
(Hint: Use Example 12.1.4 with M = −1 and with M = −2. Then define, for i = 1, 2,

    θ_i(t) = { 1/√(1 − t)   for 0 ≤ t < α_{−i}
             { 0            for α_{−i} ≤ t ≤ 1

and

    V_i(t) = i + ∫_0^t θ_i(s)dB(s) = i + Y(t ∧ α_{−i}) ;   0 ≤ t ≤ 1 .)

12.5. Prove the first part of Lemma 12.2.1, i.e. that B̃(t) given by (12.2.3) is an F_t^(m)-martingale (see the Remark b) following this lemma).

12.6.* Determine if the following normalized markets {X(t)}_{t∈[0,T]} allow an arbitrage. If so, find one.
a) (n = m = 2)
   dX_1(t) = 3dt + dB_1(t) + dB_2(t),
   dX_2(t) = −dt + dB_1(t) − dB_2(t).
b) (n = 2, m = 3)
   dX_1(t) = dt + dB_1(t) + dB_2(t) − dB_3(t)
   dX_2(t) = 5dt − dB_1(t) + dB_2(t) + dB_3(t)
c) (n = 2, m = 3)
   dX_1(t) = dt + dB_1(t) + dB_2(t) − dB_3(t)
   dX_2(t) = 5dt − dB_1(t) − dB_2(t) + dB_3(t)
d) (n = 2, m = 3)
   dX_1(t) = dt + dB_1(t) + dB_2(t) − dB_3(t)
   dX_2(t) = −3dt − 3dB_1(t) − 3dB_2(t) + 3dB_3(t)
e) (n = 3, m = 2)
   dX_1(t) = dt + dB_1(t) + dB_2(t)
   dX_2(t) = 2dt + dB_1(t) − dB_2(t)
   dX_3(t) = 3dt − dB_1(t) + dB_2(t)
f) (n = 3, m = 2)
   dX_1(t) = dt + dB_1(t) + dB_2(t)
   dX_2(t) = 2dt + dB_1(t) − dB_2(t)
   dX_3(t) = −2dt − dB_1(t) + dB_2(t)

12.7.* Determine which of the nonarbitrage markets {X(t)}_{t∈[0,T]} of Exercise 12.6 a)–f) are complete. For those which are not complete, find a T-claim which is not attainable.

12.8. Let B_t be 1-dimensional Brownian motion. Use Theorem 12.3.3 to find z ∈ R and φ(t, ω) ∈ V(0, T) such that

    F(ω) = z + ∫_0^T φ(t, ω)dB(t)

in the following cases:
(i) F(ω) = B²(T, ω)
(ii) F(ω) = B³(T, ω)
(iii) F(ω) = exp B(T, ω).
(Compare with the methods you used in Exercise 4.14.)

12.9.* Let B_t be n-dimensional Brownian motion. Use Theorem 12.3.3 to find z ∈ R and φ(t, ω) ∈ V^n(0, T) such that

    F(ω) = z + ∫_0^T φ(t, ω)dB(t)

in the following cases:
(i) F(ω) = B²(T, ω) (= B_1²(T, ω) + ⋯ + B_n²(T, ω))
(ii) F(ω) = exp(B_1(T, ω) + ⋯ + B_n(T, ω)).

12.10. Let X(t) be a geometric Brownian motion given by

    dX(t) = αX(t)dt + βX(t)dB(t) ,

where α and β are constants. Use Theorem 12.3.3 to find z ∈ R and φ(t, ω) ∈ V(0, T) such that

    X(T, ω) = z + ∫_0^T φ(t, ω)dB(t) .
12.11. Suppose the market is given by

    dX_0(t) = ρX_0(t)dt ;                  X_0(0) = 1
    dX_1(t) = (m − X_1(t))dt + σdB(t) ;    X_1(0) = x_1 > 0

(the mean-reverting Ornstein–Uhlenbeck process), where ρ > 0, m > 0 and σ ≠ 0 are constants.
a) Find the price E_Q[ξ(T)F] of the European T-claim

    F(ω) = X_1(T, ω) .

b) Find the replicating portfolio θ(t) = (θ_0(t), θ_1(t)) for this claim.
(Hint: Use Theorem 12.3.5, as in Example 12.3.6.)

12.12.* Consider a market (X_0(t), X_1(t)) ∈ R² where

    dX_0(t) = ρX_0(t)dt ;   X_0(0) = 1   (ρ > 0 constant) .

Find the price E_Q[ξ(T)F] of the European T-claim F(ω) = B(T, ω) and find the corresponding replicating portfolio θ(t) = (θ_0(t), θ_1(t)) in the following cases:
a) dX_1(t) = αX_1(t)dt + βX_1(t)dB(t) ;   α, β constants, β ≠ 0
b) dX_1(t) = c dB(t) ;   c ≠ 0 constant
c) dX_1(t) = αX_1(t)dt + σ dB(t) ;   α, σ constants, σ ≠ 0.

12.13. (The classical Black–Scholes formula) Complete the details in the proof of the Black–Scholes option pricing formula (12.3.45) and the corresponding replicating portfolio formula (12.3.48) of Corollary 12.3.8.

12.14. (The American call) Let X(t) = (X_0(t), X_1(t)) be as in Exercise 12.13. If the American T-claim is given by

    F(t, ω) = (X_1(t, ω) − K)⁺ ,   0 ≤ t ≤ T ,

then the corresponding option is called the American call. According to Theorem 12.3.10 the price of an American call is given by

    p_A(F) = sup_{τ≤T} E_Q[e^{−ρτ}(X_1(τ) − K)⁺] .

Prove that

    p_A(F) = e^{−ρT} E_Q[(X_1(T) − K)⁺] ,

i.e. that it is always optimal to exercise the American call at the terminal time T, if at all. Hence the price of an American call option coincides with that of a European call option.
(Hint: Define Y(t) = e^{−ρt}(X_1(t) − K).
a) Prove that Y(t) is a Q-submartingale (Appendix C), i.e.

    Y(t) ≤ E_Q[Y(s)|F_t]   for s > t .

b) Then use the Jensen inequality (Appendix B) to prove that Z(t) := e^{−ρt}(X_1(t) − K)⁺ is also a Q-submartingale.
c) Complete the proof by using Doob's optional sampling theorem (see the proof of Lemma 10.1.3 e)).)

12.15.* (The perpetual American put) Solve the optimal stopping problem

    Φ(s, x) = sup_{τ≥0} E^x[e^{−ρ(s+τ)}(K − X(τ))⁺]

where

    dX(t) = αX(t)dt + βX(t)dB(t) ;   X(0) = x > 0 .

Here ρ > 0, K > 0, α and β ≠ 0 are constants. If α = ρ then Φ(s, x) gives the price of the American put option with infinite horizon (T = ∞).
(Hint: Proceed as in Example 10.4.2.)

12.16.* Let

    dX_0(t) = ρX_0(t)dt ;                  X_0(0) = 1
    dX_1(t) = αX_1(t)dt + βX_1(t)dB(t) ;   X_1(0) = x_1 > 0

be the classical Black–Scholes market. Find the replicating portfolio θ(t) = (θ_0(t), θ_1(t)) for the following European T-claims:
a) F(ω) = (K − X_1(T, ω))⁺   (the European put)
b) F(ω) = (X_1(T, ω))² .

Remark. Note that in case a) we get θ_1(t) < 0 for all t ∈ [0, T], which means that we have to short-sell in order to replicate the European put. This is not necessary if we want to replicate the European call. See (12.3.48).
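Exercises 12.13–12.16 concern replicating portfolios in the Black–Scholes market. Replication can be observed numerically: the sketch below delta-hedges a European call along a simulated path, taking the standard Black–Scholes delta N(d_1) as the stock holding θ_1(t) (this closed form is assumed here, not derived from (12.3.48) in the text), and checks that the terminal replication error shrinks as the rebalancing grid is refined. All parameter values are illustrative.

```python
import numpy as np
from math import exp, log, sqrt
from statistics import NormalDist

Ncdf = NormalDist().cdf

def bs_call(S, K, r, sigma, tau):
    """Black-Scholes price of a European call with time tau > 0 to maturity."""
    d1 = (log(S / K) + (r + 0.5 * sigma ** 2) * tau) / (sigma * sqrt(tau))
    d2 = d1 - sigma * sqrt(tau)
    return S * Ncdf(d1) - K * exp(-r * tau) * Ncdf(d2)

def bs_delta(S, K, r, sigma, tau):
    """Assumed stock holding theta_1(t) of the replicating portfolio: N(d1)."""
    d1 = (log(S / K) + (r + 0.5 * sigma ** 2) * tau) / (sigma * sqrt(tau))
    return Ncdf(d1)

def hedge_error(n_steps, seed=0, S0=100.0, K=100.0, r=0.05, sigma=0.2, T=1.0):
    """Terminal error of a self-financing delta hedge started at the call price."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    S = S0
    shares = bs_delta(S0, K, r, sigma, T)
    cash = bs_call(S0, K, r, sigma, T) - shares * S0   # V(0) = option price
    for i in range(1, n_steps + 1):
        S *= exp((r - 0.5 * sigma ** 2) * dt + sigma * sqrt(dt) * rng.standard_normal())
        cash *= exp(r * dt)                            # money-market account X0
        tau = T - i * dt
        if tau > 0:
            new_shares = bs_delta(S, K, r, sigma, tau)
            cash -= (new_shares - shares) * S          # self-financing rebalancing
            shares = new_shares
    return cash + shares * S - max(S - K, 0.0)         # V(T) - F(omega)
```

The discrete hedge only approximates the continuous-time replicating portfolio, so the error is nonzero but decreases (roughly like n^{−1/2}) as the number of rebalancing dates grows.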
Appendix A: Normal Random Variables

Here we recall some basic facts which are used in the text.

Definition A.1 Let (Ω, F, P) be a probability space. A random variable X: Ω → R is normal if the distribution of X has a density of the form

    p_X(x) = (1/(σ√(2π))) · exp(−(x − m)²/(2σ²)) ,                       (A.1)

where σ > 0 and m are constants. In other words,

    P[X ∈ G] = ∫_G p_X(x)dx ,   for all Borel sets G ⊂ R .

If this is the case, then

    E[X] = ∫_Ω X dP = ∫_R x p_X(x)dx = m                                 (A.2)

and

    var[X] = E[(X − m)²] = ∫_R (x − m)² p_X(x)dx = σ² .                  (A.3)

More generally, a random variable X: Ω → R^n is called (multi-)normal N(m, C) if the distribution of X has a density of the form

    p_X(x_1, ⋯, x_n) = (√|A| / (2π)^{n/2}) · exp(−½ Σ_{j,k} (x_j − m_j)a_{jk}(x_k − m_k)) ,   (A.4)

where m = (m_1, ⋯, m_n) ∈ R^n and C^{−1} = A = [a_{jk}] ∈ R^{n×n} is a symmetric positive definite matrix. If this is the case then

    E[X] = m                                                             (A.5)

and A^{−1} = C = [c_{jk}] is the covariance matrix of X, i.e.

    c_{jk} = E[(X_j − m_j)(X_k − m_k)] .                                 (A.6)

B. Øksendal, Stochastic Differential Equations, Universitext, DOI 10.1007/978-3-642-14394-6, © Springer-Verlag Berlin Heidelberg 2013
Definition A.2 The characteristic function of a random variable X: Ω → R^n is the function φ_X: R^n → C (where C denotes the complex numbers) defined by

    φ_X(u_1, ⋯, u_n) = E[exp(i(u_1X_1 + ⋯ + u_nX_n))] = ∫_{R^n} e^{i⟨u,x⟩} · P[X ∈ dx] ,   (A.7)

where ⟨u, x⟩ = u_1x_1 + ⋯ + u_nx_n (and i ∈ C is the imaginary unit). In other words, φ_X is the Fourier transform of X (or, more precisely, of the measure P[X ∈ dx]). Therefore we have (see Kallenberg (2002), Thm. 5.3)

Theorem A.3 The characteristic function of X determines the distribution of X uniquely.

It is not hard to verify the following:

Theorem A.4 If X: Ω → R^n is normal N(m, C), then

    φ_X(u_1, ⋯, u_n) = exp(−½ Σ_{j,k} u_j c_{jk} u_k + i Σ_j u_j m_j) .  (A.8)

Theorem A.4 is often used as a basis for an extended concept of a normal random variable: We define X: Ω → R^n to be normal (in the extended sense) if φ_X satisfies (A.8) for some symmetric non-negative definite matrix C = [c_{jk}] ∈ R^{n×n} and some m ∈ R^n. So by this definition it is not required that C be invertible. From now on we will use this extended definition of normality. In the text we often use the following result:

Theorem A.5 Let X_j: Ω → R be random variables; 1 ≤ j ≤ n. Then

    X = (X_1, ⋯, X_n)   is normal

if and only if

    Y = λ_1X_1 + ⋯ + λ_nX_n   is normal for all λ_1, …, λ_n ∈ R .

Proof. If X is normal, then

    E[exp(iu(λ_1X_1 + ⋯ + λ_nX_n))] = exp(−½ Σ_{j,k} uλ_j c_{jk} uλ_k + i Σ_j uλ_j m_j)
                                    = exp(−½ u² Σ_{j,k} λ_j c_{jk} λ_k + iu Σ_j λ_j m_j) ,

so Y is normal with

    E[Y] = Σ_j λ_j m_j ,   var[Y] = Σ_{j,k} λ_j c_{jk} λ_k .

Conversely, if Y = λ_1X_1 + ⋯ + λ_nX_n is normal with E[Y] = m and var[Y] = σ², then

    E[exp(iu(λ_1X_1 + ⋯ + λ_nX_n))] = exp(−½u²σ² + ium) ,

where

    m = Σ_j λ_j E[X_j]

and

    σ² = E[(Σ_j λ_jX_j − Σ_j λ_jE[X_j])²] = E[(Σ_j λ_j(X_j − m_j))²]
       = Σ_{j,k} λ_jλ_k E[(X_j − m_j)(X_k − m_k)] ,

where m_j = E[X_j]. Hence X is normal.

Theorem A.6 Let Y_0, Y_1, …, Y_n be real random variables on Ω. Assume that X = (Y_0, Y_1, …, Y_n) is normal and that Y_0 and Y_j are uncorrelated for each j ≥ 1, i.e.

    E[(Y_0 − E[Y_0])(Y_j − E[Y_j])] = 0 ;   1 ≤ j ≤ n .

Then Y_0 is independent of {Y_1, ⋯, Y_n}.

Proof. We have to prove that

    P[Y_0 ∈ G_0, Y_1 ∈ G_1, …, Y_n ∈ G_n] = P[Y_0 ∈ G_0] · P[Y_1 ∈ G_1, …, Y_n ∈ G_n]   (A.9)

for all Borel sets G_0, G_1, …, G_n ⊂ R. We know that in the first line (and the first column) of the covariance matrix c_{jk} = E[(Y_j − E[Y_j])(Y_k − E[Y_k])] only the first entry, c_{00} = var[Y_0], is non-zero. Therefore the characteristic function of X satisfies

    φ_X(u_0, u_1, …, u_n) = φ_{Y_0}(u_0) · φ_{(Y_1,…,Y_n)}(u_1, …, u_n) ,

and this is equivalent to (A.9).

Finally we establish the following:

Theorem A.7 Suppose X_k: Ω → R^n is normal for all k and that X_k → X in L²(Ω), i.e.

    E[|X_k − X|²] → 0   as k → ∞ .

Then X is normal.

Proof. Since |e^{i⟨u,x⟩} − e^{i⟨u,y⟩}| ≤ |u| · |x − y|, we have

    E[{exp(i⟨u, X_k⟩) − exp(i⟨u, X⟩)}²] ≤ |u|² · E[|X_k − X|²] → 0   as k → ∞ .

Therefore E[exp(i⟨u, X_k⟩)] → E[exp(i⟨u, X⟩)] as k → ∞. So X is normal, with mean E[X] = lim E[X_k] and covariance matrix C = lim C_k, where C_k is the covariance matrix of X_k.
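Theorem A.5 lends itself to a quick numerical sanity check: for a multinormal sample, any linear combination should have (sample) mean Σ λ_j m_j and variance Σ λ_j c_{jk} λ_k. The vectors m, λ and the matrix C below are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(42)
m = np.array([1.0, -2.0])                     # illustrative mean vector
C = np.array([[2.0, 0.6],
              [0.6, 1.0]])                    # illustrative covariance matrix (s.p.d.)
X = rng.multivariate_normal(m, C, size=200_000)

lam = np.array([0.5, -1.5])                   # coefficients of Y = λ₁X₁ + λ₂X₂
Y = X @ lam
# By Theorem A.5, Y is normal with E[Y] = λ·m and var[Y] = λᵀCλ; the sample
# moments of Y should match these values up to Monte Carlo error.
```

Here λ·m = 3.5 and λᵀCλ = 1.85, so the sample mean and variance of Y should be close to those numbers.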
Appendix B: Conditional Expectation

Let (Ω, F, P) be a probability space and let X: Ω → R^n be a random variable such that E[|X|] < ∞. If H ⊂ F is a σ-algebra, then the conditional expectation of X given H, denoted by E[X|H], is defined as follows:

Definition B.1 E[X|H] is the (a.s. unique) function from Ω to R^n satisfying:
(1) E[X|H] is H-measurable
(2) ∫_H E[X|H] dP = ∫_H X dP , for all H ∈ H.

The existence and uniqueness of E[X|H] comes from the Radon–Nikodym theorem: Let μ be the measure on H defined by

    μ(H) = ∫_H X dP ;   H ∈ H .

Then μ is absolutely continuous w.r.t. P|H, so there exists a P|H-unique H-measurable function F on Ω such that

    μ(H) = ∫_H F dP   for all H ∈ H .

Thus E[X|H] := F does the job and this function is unique a.s. w.r.t. the measure P|H.
Note that (2) is equivalent to

(2)'    ∫_Ω Z · E[X|H] dP = ∫_Ω Z · X dP   for all H-measurable Z .

We list some of the basic properties of the conditional expectation:

Theorem B.2 Suppose Y: Ω → R^n is another random variable with E[|Y|] < ∞ and let a, b ∈ R. Then
a) E[aX + bY|H] = aE[X|H] + bE[Y|H]
b) E[E[X|H]] = E[X]
c) E[X|H] = X if X is H-measurable
d) E[X|H] = E[X] if X is independent of H
e) E[Y · X|H] = Y · E[X|H] if Y is H-measurable, where · denotes the usual inner product in R^n.

Proof. d): If X is independent of H we have for H ∈ H

    ∫_H X dP = ∫_Ω X · 𝒳_H dP = ∫_Ω X dP · ∫_Ω 𝒳_H dP = E[X] · P(H) ,

so the constant E[X] satisfies (1) and (2).
e): We first establish the result in the case when Y = 𝒳_H (where 𝒳 denotes the indicator function), for some H ∈ H. Then for all G ∈ H

    ∫_G Y · E[X|H] dP = ∫_{G∩H} E[X|H] dP = ∫_{G∩H} X dP = ∫_G Y X dP ,

so Y · E[X|H] satisfies both (1) and (2). Similarly we obtain that the result is true if Y is a simple function

    Y = Σ_{j=1}^m c_j 𝒳_{H_j} ,   where H_j ∈ H .

The result in the general case then follows by approximating Y by such simple functions.

Theorem B.3 Let G, H be σ-algebras such that G ⊂ H. Then

    E[X|G] = E[E[X|H]|G] .

Proof. If G ∈ G then G ∈ H and therefore

    ∫_G E[X|H] dP = ∫_G X dP .

Hence E[E[X|H]|G] = E[X|G] by uniqueness.

The following useful result can be found in Chung (1974), Theorem 9.1.4:

Theorem B.4 (The Jensen inequality) If φ: R → R is convex and E[|φ(X)|] < ∞ then

    φ(E[X|H]) ≤ E[φ(X)|H] .

Corollary B.5
(i) |E[X|H]| ≤ E[|X| | H]
(ii) |E[X|H]|² ≤ E[|X|² | H] .

Corollary B.6 If X_n → X in L² then E[X_n | H] → E[X | H] in L².
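When H is generated by a finite partition, Definition B.1 forces E[X|H] to be constant on each partition block, and property (2) forces that constant to be the block average of X. This makes the defining properties easy to verify directly in a small example (sample space and values below are arbitrary illustrative choices):

```python
import numpy as np

# Ω = {0,...,5} with uniform P; H is generated by the partition {0,1},{2,3},{4,5}.
P = np.full(6, 1 / 6)
X = np.array([1.0, 3.0, 2.0, 6.0, 5.0, 7.0])    # illustrative values of X
blocks = [[0, 1], [2, 3], [4, 5]]

condE = np.empty(6)
for blk in blocks:
    # E[X|H] is H-measurable, hence constant on each block; property (2)
    # then forces that constant to be the P-weighted block average of X.
    condE[blk] = np.sum(P[blk] * X[blk]) / np.sum(P[blk])
```

Property (2) of Definition B.1 can now be checked block by block, and summing over all blocks gives b) of Theorem B.2, E[E[X|H]] = E[X].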
Appendix C: Uniform Integrability and Martingale Convergence

We give a brief summary of the definitions and results which are the background for the applications in this book. For proofs and more information we refer to Doob (1984), Liptser and Shiryaev (1977), Meyer (1966) or Williams (1979).

Definition C.1 Let (Ω, F, P) be a probability space. A family {f_j}_{j∈J} of real, measurable functions f_j on Ω is called uniformly integrable if

    lim_{M→∞} ( sup_{j∈J} ∫_{|f_j|>M} |f_j| dP ) = 0 .

One of the most useful tests for uniform integrability is obtained by using the following concept:

Definition C.2 A function ψ: [0, ∞) → [0, ∞) is called a u.i. (uniform integrability) test function if ψ is increasing, convex (i.e. ψ(λx + (1 − λ)y) ≤ λψ(x) + (1 − λ)ψ(y) for all x, y ∈ [0, ∞), λ ∈ [0, 1]) and

    lim_{x→∞} ψ(x)/x = ∞ .

So for example ψ(x) = x^p is a u.i. test function if p > 1, but not if p = 1. The justification for the name in Definition C.2 is the following:

Theorem C.3 The family {f_j}_{j∈J} is uniformly integrable if and only if there is a u.i. test function ψ such that

    sup_{j∈J} ∫ ψ(|f_j|) dP < ∞ .

Definition C.4 Let {𝒩_t}_{t≥0} be an increasing family of σ-algebras and let N_t be an 𝒩_t-adapted stochastic process with E[|N_t|] < ∞ for all t. Then N_t is called a supermartingale (w.r.t. {𝒩_t}) if

    N_t ≥ E[N_s | 𝒩_t]   for all s > t .                                 (C.1)

Similarly, if (C.1) holds with the inequality reversed for all s > t, then N_t is called a submartingale. And if (C.1) holds with equality then N_t is called a martingale. As is customary we will assume that each 𝒩_t contains all the null sets of F, that t → N_t(ω) is right continuous for a.a. ω and that {𝒩_t} is right continuous, in the sense that 𝒩_t = ∩_{s>t} 𝒩_s for all t ≥ 0.

Theorem C.5 (Doob's martingale convergence theorem I) Let N_t be a right continuous supermartingale with the property that

    sup_{t>0} E[N_t⁻] < ∞ ,

where N_t⁻ = max(−N_t, 0). Then the pointwise limit

    N(ω) = lim_{t→∞} N_t(ω)

exists for a.a. ω and E[N⁻] < ∞.

Note, however, that the convergence need not be in L¹(P). In order to obtain this we need uniform integrability:

Theorem C.6 (Doob's martingale convergence theorem II) Let N_t be a right-continuous supermartingale. Then the following are equivalent:
1) {N_t}_{t≥0} is uniformly integrable
2) There exists N ∈ L¹(P) such that N_t → N a.e. (P) and N_t → N in L¹(P), i.e. ∫|N_t − N| dP → 0 as t → ∞.

Combining Theorems C.6 and C.3 (with ψ(x) = x^p) we get

Corollary C.7 Let M_t be a continuous martingale such that

    sup_{t>0} E[|M_t|^p] < ∞   for some p > 1 .

Then there exists M ∈ L¹(P) such that M_t → M a.e. (P) and

    ∫|M_t − M| dP → 0   as t → ∞ .

Finally, we mention that similar results can be obtained for the analogous discrete time super/sub-martingales {N_k, 𝒩_k}; k = 1, 2, …. Of course, no continuity assumptions are needed in this case. For example, we have the following result, which is used in Chapter 9:

Corollary C.8 Let M_k; k = 1, 2, … be a discrete time martingale and assume that

    sup_k E[|M_k|^p] < ∞   for some p > 1 .

Then there exists M ∈ L¹(P) such that M_k → M a.e. (P) and

    ∫|M_k − M| dP → 0   as k → ∞ .

Corollary C.9 Let X ∈ L^p(P) for some p ∈ [1, ∞], let {𝒩_k}_{k=1}^∞ be an increasing family of σ-algebras, 𝒩_k ⊂ F, and define 𝒩_∞ to be the σ-algebra generated by {𝒩_k}_{k=1}^∞. Then

    E[X|𝒩_k] → E[X|𝒩_∞]   as k → ∞ ,

a.e. P and in L^p(P).

Proof. We give the proof in the case p = 1. The process M_k := E[X|𝒩_k] is a u.i. martingale, so there exists M ∈ L¹(P) such that M_k → M a.e. P and in L¹(P), as k → ∞. It remains to prove that M = E[X|𝒩_∞]: Note that

    ‖M_k − E[M|𝒩_k]‖_{L¹(P)} = ‖E[M_k|𝒩_k] − E[M|𝒩_k]‖_{L¹(P)} ≤ ‖M_k − M‖_{L¹(P)} → 0   as k → ∞ .

Hence if F ∈ 𝒩_{k_0} and k ≥ k_0 we have

    ∫_F (X − M) dP = ∫_F E[X − M|𝒩_k] dP = ∫_F (M_k − E[M|𝒩_k]) dP → 0   as k → ∞ .

Therefore

    ∫_F (X − M) dP = 0   for all F ∈ ∪_{k=1}^∞ 𝒩_k ,

and hence E[X|𝒩_∞] = E[M|𝒩_∞] = M.
Appendix D: An Approximation Result

In this Appendix we prove an approximation result which was used in Theorem 10.4.1. We use the notation from that theorem.

Theorem D.1 Let D ⊂ V ⊂ R^n be open sets such that

    ∂D is a Lipschitz surface                                            (D.1)

and let φ: V → R be a function with the following properties:

    φ ∈ C¹(V) ∩ C(V̄)                                                    (D.2)
    φ ∈ C²(V \ ∂D) and the second order derivatives of φ are locally bounded near ∂D . (D.3)

Then there exists a sequence {φ_j}_{j=1}^∞ of functions φ_j ∈ C²(V) ∩ C(V̄) such that

    φ_j → φ uniformly on compact subsets of V, as j → ∞                  (D.4)
    Lφ_j → Lφ uniformly on compact subsets of V \ ∂D, as j → ∞           (D.5)
    {Lφ_j}_{j=1}^∞ is locally bounded on V .                             (D.6)

Proof. We may assume that φ is extended to a continuous function on the whole of R^n. Choose a C^∞ function η: R^n → [0, ∞) with compact support such that

    ∫_{R^n} η(y)dy = 1                                                   (D.7)

and put

    η_ε(x) = ε^{−n} η(x/ε)   for ε > 0, x ∈ R^n .                        (D.8)

Fix a sequence ε_j ↓ 0 and define

    φ_j(x) = (φ * η_{ε_j})(x) = ∫_{R^n} φ(x − z)η_{ε_j}(z)dz = ∫_{R^n} φ(y)η_{ε_j}(x − y)dy ,   (D.9)

i.e. φ_j is the convolution of φ and η_{ε_j}. Then it is well-known that φ_j(x) → φ(x) uniformly on compact subsets of any set in V where φ is continuous. See e.g. Folland (1984), Theorem 8.14 (c). Note that since η has compact support we need not assume that φ is globally bounded, just locally bounded (which follows from continuity). We proceed to verify (D.4)–(D.6): Let W ⊂ V be open with a Lipschitz boundary. Put

    V_1 = W ∩ D ,   V_2 = W \ D̄ .

Then V_1, V_2 are Lipschitz domains and integration by parts gives, for i = 1, 2 and x ∈ W \ ∂D,

    ∫_{V_i} φ(y) ∂²/∂y_k∂y_ℓ η_{ε_j}(x − y)dy
      = ∫_{∂V_i} φ(y) ∂/∂y_ℓ η_{ε_j}(x − y) n_k^i dν(y) − ∫_{V_i} ∂φ/∂y_k (y) ∂/∂y_ℓ η_{ε_j}(x − y)dy ,   (D.10)

where n_k^i is component number k of the outer unit normal n^i from V_i at ∂V_i. (This outer normal exists a.e. with respect to the surface measure ν on ∂V_i since ∂V_i is a Lipschitz surface.) Another integration by parts yields

    ∫_{V_i} ∂φ/∂y_k (y) ∂/∂y_ℓ η_{ε_j}(x − y)dy
      = ∫_{∂V_i} ∂φ/∂y_k (y) η_{ε_j}(x − y) n_ℓ^i dν(y) − ∫_{V_i} ∂²φ/∂y_k∂y_ℓ (y) η_{ε_j}(x − y)dy .   (D.11)

Combining (D.10) and (D.11) we get

    ∫_{V_i} φ(y) ∂²/∂y_k∂y_ℓ η_{ε_j}(x − y)dy
      = ∫_{∂V_i} [φ(y) ∂/∂y_ℓ η_{ε_j}(x − y) n_k^i − ∂φ/∂y_k (y) η_{ε_j}(x − y) n_ℓ^i] dν(y)
        + ∫_{V_i} ∂²φ/∂y_k∂y_ℓ (y) η_{ε_j}(x − y)dy ;   i = 1, 2 .       (D.12)

Adding (D.12) for i = 1, 2 and keeping in mind that the outer unit normal for V_i is the inner unit normal for V_{3−i} on ∂V_1 ∩ ∂V_2, we get

    ∫_W φ(y) ∂²/∂y_k∂y_ℓ η_{ε_j}(x − y)dy
      = ∫_{∂W} [φ(y) ∂/∂y_ℓ η_{ε_j}(x − y) N_k − ∂φ/∂y_k (y) η_{ε_j}(x − y) N_ℓ] dν(y)
        + ∫_W ∂²φ/∂y_k∂y_ℓ (y) η_{ε_j}(x − y)dy ,                        (D.13)

where N_k, N_ℓ are components number k, ℓ of the outer unit normal N from W at ∂W. If we fix x ∈ W \ ∂D then for j large enough we have η_{ε_j}(x − y) = 0 for all y outside W, and for such j we get from (D.13)

    ∫_{R^n} φ(y) ∂²/∂y_k∂y_ℓ η_{ε_j}(x − y)dy = ∫_{R^n} ∂²φ/∂y_k∂y_ℓ (y) η_{ε_j}(x − y)dy .   (D.14)

In other words, we have proved that

    ∂²φ_j/∂x_k∂x_ℓ (x) = (∂²φ/∂y_k∂y_ℓ * η_{ε_j})(x)   for x ∈ V \ ∂D .  (D.15)

Similarly, integration by parts applied to W gives, if j is large enough,

    ∫_W φ(y) ∂/∂y_k η_{ε_j}(x − y)dy = −∫_W ∂φ/∂y_k (y) η_{ε_j}(x − y)dy ,

from which we conclude that

    ∂φ_j/∂x_k (x) = (∂φ/∂y_k * η_{ε_j})(x)   for x ∈ V .                 (D.16)

From (D.15) and (D.16), combined with Theorem 8.14 (c) in Folland (1984), we get that

    ∂φ_j/∂x_k → ∂φ/∂x_k uniformly on compact subsets of V as j → ∞       (D.17)

and

    ∂²φ_j/∂x_k∂x_ℓ → ∂²φ/∂x_k∂x_ℓ uniformly on compact subsets of V \ ∂D as j → ∞ . (D.18)

Moreover, {∂φ_j/∂x_k}_{j=1}^∞ and {∂²φ_j/∂x_k∂x_ℓ}_{j=1}^∞ are locally bounded on V, by (D.15) and (D.16) combined with the assumptions (D.2), (D.3). We conclude that (D.4)–(D.6) hold.
Solutions and Additional Hints to Some of the Exercises
2.4.
a) Put A = {ω; |X(ω)| ≥ λ}. Then
p |X(ω)| dP (ω) ≥ |X(ω)|p dP (ω) ≥ λp P (A). Ω
A
b) By a) we have (choosing p = 1) P [|X| ≥ λ] = P [exp(k|X|) ≥ exp(k λ)] ≤ exp(−k λ)E[exp(k|X|)] .
2.6.
P
∞ ∞ m=1 k=m ∞
Ak = lim P m→∞
∞
Ak ≤ lim
k=m
m→∞
∞
P (Ak ) = 0
k=m
P (Ak ) < ∞.
since
k=1
Hence P ({ω; ω belongs to infinitely many Ak ’s}) = P ({ω; (∀m)(∃k ≥ m) s.t. ω ∈ Ak }) ∞ ∞ =0. = P ω; ω ∈ Ak m=1 k=m
2.7.
a) We must verify that (i) φ ∈ G (ii) F ∈ G ⇒ F C ∈ G (iii) F1 , . . . , Fk ∈ G ⇒ F1 ∪ F2 ∪ · · · ∪ Fk ∈ G Since each element F of G consists of a union of some of the Gi ’s, these statements are easily verified.
B. Øksendal, Stochastic Differential Equations, Universitext, DOI 10.1007/978-3-642-14394-6, © Springer-Verlag Berlin Heidelberg 2013
332
Solutions and Additional Hints to Some of the Exercises
b) Let F be a finite σ-algebra of subsets of Ω. For each x ∈ Ω define Fx = {F ∈ F; x ∈ F } Since F is finite we have Fx ∈ F and clearly Fx is the smallest set in F which contains x. We claim that for given x, y ∈ Ω the following holds: (i) Either Fx ∩ Fy = ∅ or Fx = Fy . To prove this we argue by contradiction: Assume that we both have (ii) Fx ∩ Fy = ∅ and Fx = Fy , e.g. Fx \ Fy = ∅. Then there are at most two possibilities: a) x ∈ Fx ∩ Fy Then Fx ∩ Fy is a set in F containing x and Fx ∩ Fy is strictly smaller than Fx . This is impossible. b) x ∈ Fx \ Fy Then Fx \ Fy is a set in F containing x and Fx \ Fy is strictly smaller than Fx . This is again impossible. Hence the claim (i) is proved. Therefore there exist x1 , . . . , xm ∈ Ω such that Fx1 , Fx2 , . . . , Fxm forms a partition of Ω. Hence any F ∈ F is a disjoint union of some of these Fxi ’s. Therefore F is of the form G described in a). c) Let X : Ω → R be F -measurable. Then for all x ∈ R we have {ω ∈ Ω; X(ω) = c} = X −1 ({c}) ∈ F Therefore X has the constant value c on a finite (possibly empty) union of the Fi ’s, where Fi = Fxi is as in b). 2.9.
With
1 if t = ω 0 otherwise and Yt (ω) = 0 for all (t, ω) ∈ [0, ∞) × [0, ∞) we have Xt (ω) =
P [Xt = Yt ] = P [Xt = 0] = P ({ω; ω = t}) = 1 . Hence Xt is a version of Yt . 2.13. Define Dρ = {x ∈ R2 ; |x| < ρ}. Then by (2.2.2) with n = 2 we have
2 2 − x +y 0 −1 e 2t dx dy . P [Bt ∈ Dρ ] = (2π t) Dρ
Introducing polar coordinates
Solutions and Additional Hints to Some of the Exercises
x = r cos θ ; y = r sin θ ;
0 ≤ r ≤ ρ,
333
0 ≤ θ ≤ 2π
we get 0
−1
2π ρ
P [Bt ∈ Dρ ] = (2π t)
ρ
2
re 0
− r2t
dr dθ
0 2
−ρ − r2 = (2π t)−1 · 2π t −e 2t = 1 − e 2t . 0
2.14. The expected total length of time that $B_t$ spends in the set $K$ is given by
$$E^x\Big[\int_0^\infty \mathcal{X}_K(B_t)\,dt\Big] = \int_0^\infty P^x[B_t \in K]\,dt = \int_0^\infty (2\pi t)^{-n/2}\int_K e^{-\frac{|x-y|^2}{2t}}\,dy\,dt = 0,$$
since $K$ has Lebesgue measure $0$.

2.15.
$$P[\widetilde B_{t_1} \in F_1, \ldots, \widetilde B_{t_k} \in F_k] = P[B_{t_1} \in U^{-1}F_1, \ldots, B_{t_k} \in U^{-1}F_k]$$
$$= \int_{U^{-1}F_1\times\cdots\times U^{-1}F_k} p(t_1,0,x_1)\,p(t_2-t_1,x_1,x_2)\cdots p(t_k-t_{k-1},x_{k-1},x_k)\,dx_1\cdots dx_k$$
$$= \int_{F_1\times\cdots\times F_k} p(t_1,0,y_1)\,p(t_2-t_1,y_1,y_2)\cdots p(t_k-t_{k-1},y_{k-1},y_k)\,dy_1\cdots dy_k$$
$$= P[B_{t_1} \in F_1, \ldots, B_{t_k} \in F_k],$$
by (2.2.1), using the substitutions $y_j = Ux_j$ and the fact that $|Ux_j - Ux_{j-1}|^2 = |x_j - x_{j-1}|^2$.

2.17. a)
$$E[(Y_n(t,\cdot) - t)^2] = E\Big[\Big(\sum_{k=0}^{2^n-1}(\Delta B_k)^2 - \sum_{k=0}^{2^n-1}2^{-n}t\Big)^2\Big] = E\Big[\Big(\sum_{k=0}^{2^n-1}\big((\Delta B_k)^2 - 2^{-n}t\big)\Big)^2\Big]$$
$$= \sum_{j,k=0}^{2^n-1} E\big[((\Delta B_j)^2 - 2^{-n}t)((\Delta B_k)^2 - 2^{-n}t)\big] = \sum_{k=0}^{2^n-1} E\big[((\Delta B_k)^2 - 2^{-n}t)^2\big]$$
$$= \sum_{k=0}^{2^n-1} E\big[(\Delta B_k)^4 - 2\cdot 2^{-n}t\,(\Delta B_k)^2 + 2^{-2n}t^2\big] = \sum_{k=0}^{2^n-1}\big(3\cdot 2^{-2n}t^2 - 2\cdot 2^{-2n}t^2 + 2^{-2n}t^2\big)$$
$$= \sum_{k=0}^{2^n-1} 2\cdot 2^{-2n}t^2 = 2\cdot 2^{-n}t^2 \to 0\qquad\text{as } n \to \infty,$$
using that $E[(\Delta B_k)^4] = 3(2^{-n}t)^2$ for a Gaussian increment with variance $2^{-n}t$.
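The rate $E[(Y_n(t)-t)^2] = 2\cdot 2^{-n}t^2$ can be observed directly by simulation. The sketch below (our own illustration, not the book's; names and sample sizes are arbitrary choices) averages $(Y_n(t)-t)^2$ over many independent dyadic partitions of Brownian paths:

```python
import math
import random

def qv_mse(n, t=1.0, reps=20_000, seed=1):
    """Monte Carlo estimate of E[(Y_n(t) - t)^2], where Y_n(t) is the
    quadratic variation of Brownian motion over the dyadic partition
    with 2^n subintervals; each increment is N(0, t/2^n)."""
    rng = random.Random(seed)
    dt = t / 2**n
    sd = math.sqrt(dt)
    total = 0.0
    for _ in range(reps):
        y = sum(rng.gauss(0.0, sd)**2 for _ in range(2**n))
        total += (y - t)**2
    return total / reps

for n in (2, 4, 6):
    print(n, qv_mse(n), 2 * 2**-n * 1.0**2)   # theory: 2 * 2^{-n} t^2
```

Halving the mesh (increasing $n$ by one) should roughly halve the printed mean squared error, as the computation above predicts.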
b) This follows from the following general result: if the quadratic variation of a real function over an interval is positive, then the total variation of the function over that interval is infinite.

3.1. Using the notation $\Delta X_j = X_{t_{j+1}} - X_{t_j}$ we have
$$tB_t = \sum_j \Delta(t_jB_{t_j}) = \sum_j\big(t_{j+1}B_{t_{j+1}} - t_jB_{t_j}\big) = \sum_j t_j\,\Delta B_j + \sum_j B_{t_{j+1}}\,\Delta t_j$$
$$\to \int_0^t s\,dB_s + \int_0^t B_s\,ds\qquad\text{as}\quad \Delta t_j \to 0.$$

3.3. a) Suppose $X_t$ is a martingale with respect to some filtration $\{\mathcal{N}_t\}_{t\ge0}$. Then $\mathcal{H}_t^{(X)} \subseteq \mathcal{N}_t$ and hence, for $s > t$,
$$E[X_s\,|\,\mathcal{H}_t^{(X)}] = E\big[E[X_s\,|\,\mathcal{N}_t]\,\big|\,\mathcal{H}_t^{(X)}\big] = E[X_t\,|\,\mathcal{H}_t^{(X)}] = X_t.$$
b) If $X_t$ is a martingale with respect to $\mathcal{H}_t^{(X)}$ then
$$E[X_t\,|\,\mathcal{H}_0^{(X)}] = X_0,\quad\text{i.e.}\quad E[X_t] = E[X_0]\ \text{for all } t \ge 0.\tag{$*$}$$
c) The process $X_t := B_t^3$ satisfies $(*)$, but it is not a martingale. To see this, choose $s > t$ and consider
$$E[B_s^3\,|\,\mathcal{F}_t] = E[(B_s-B_t)^3 + 3B_s^2B_t - 3B_sB_t^2 + B_t^3\,|\,\mathcal{F}_t]$$
$$= 0 + 3B_t\,E[B_s^2\,|\,\mathcal{F}_t] - 3B_t^2\,E[B_s\,|\,\mathcal{F}_t] + B_t^3 = 3B_t\,E[(B_s-B_t)^2 + 2B_sB_t - B_t^2\,|\,\mathcal{F}_t] - 2B_t^3$$
$$= 3B_t(s-t) + 6B_t^3 - 3B_t^3 - 2B_t^3 = 3B_t(s-t) + B_t^3 \neq B_t^3.$$
3.4. (i) If $X_t = B_t + t$ then $E[X_t] = B_0 + t \neq E[X_0]$, so $X_t$ does not satisfy $(*)$ of Exercise 3.3 b). Hence $X_t$ cannot be a martingale.
(ii) If $X_t = B_t^2$ then $E[X_t] = nt + B_0^2 \neq E[X_0]$, so $X_t$ cannot be a martingale.
(iii) If $X_t = t^2B_t - 2\int_0^t rB_r\,dr$ then, for $s > t$,
$$E[X_s\,|\,\mathcal{F}_t] = E[s^2B_s\,|\,\mathcal{F}_t] - 2\int_0^t rB_r\,dr - 2\int_t^s r\,E[B_r\,|\,\mathcal{F}_t]\,dr$$
$$= s^2B_t - 2\int_0^t rB_r\,dr - 2B_t\int_t^s r\,dr = s^2B_t - 2\int_0^t rB_r\,dr - B_t(s^2 - t^2) = t^2B_t - 2\int_0^t rB_r\,dr = X_t.$$
Hence $X_t$ is a martingale.
(iv) If $X_t = B_1(t)B_2(t)$ then
$$E[X_s\,|\,\mathcal{F}_t] = E[B_1(s)B_2(s)\,|\,\mathcal{F}_t] = E[(B_1(s)-B_1(t))(B_2(s)-B_2(t))\,|\,\mathcal{F}_t]$$
$$\quad + E[B_1(t)(B_2(s)-B_2(t))\,|\,\mathcal{F}_t] + E[B_2(t)(B_1(s)-B_1(t))\,|\,\mathcal{F}_t] + E[B_1(t)B_2(t)\,|\,\mathcal{F}_t]$$
$$= E[(B_1(s)-B_1(t))]\cdot E[(B_2(s)-B_2(t))] + 0 + 0 + B_1(t)B_2(t) = B_1(t)B_2(t) = X_t.$$
Hence $X_t$ is a martingale.

3.5. To prove that $M_t := B_t^2 - t$ is a martingale, choose $s > t$ and consider
$$E[M_s\,|\,\mathcal{F}_t] = E[B_s^2 - s\,|\,\mathcal{F}_t] = E[(B_s-B_t)^2 + 2B_sB_t - B_t^2\,|\,\mathcal{F}_t] - s = s - t + 2B_t^2 - B_t^2 - s = B_t^2 - t = M_t.$$

3.6. To prove that $N_t := B_t^3 - 3tB_t$ is a martingale, choose $s > t$ and consider
$$E[N_s\,|\,\mathcal{F}_t] = E[(B_s-B_t)^3 + 3B_s^2B_t - 3B_sB_t^2 + B_t^3\,|\,\mathcal{F}_t] - 3sB_t$$
$$= 3B_t\,E[B_s^2\,|\,\mathcal{F}_t] - 3B_t^2\cdot B_t + B_t^3 - 3sB_t = 3B_t\,E[(B_s-B_t)^2 + 2B_sB_t - B_t^2\,|\,\mathcal{F}_t] - 2B_t^3 - 3sB_t$$
$$= 3B_t(s-t) + B_t^3 - 3sB_t = B_t^3 - 3tB_t = N_t.$$

3.8. a) If $M_t = E[Y\,|\,\mathcal{F}_t]$ then for $s > t$,
$$E[M_s\,|\,\mathcal{F}_t] = E\big[E[Y\,|\,\mathcal{F}_s]\,\big|\,\mathcal{F}_t\big] = E[Y\,|\,\mathcal{F}_t] = M_t.$$

3.9.
$$\int_0^T B_t\circ dB_t = \tfrac12 B_T^2\qquad\text{if } B_0 = 0.$$
3.12. (i) a) $dX_t = (\gamma + \tfrac12\alpha^2)X_t\,dt + \alpha X_t\,dB_t$.
b) $dX_t = \tfrac12\sin X_t\,[\cos X_t - t^2]\,dt + (t^2 + \cos X_t)\,dB_t$.
(ii) a) $dX_t = (r - \tfrac12\alpha^2)X_t\,dt + \alpha X_t\circ dB_t$.
b) $dX_t = (2e^{-X_t} - X_t^3)\,dt + X_t^2\circ dB_t$.
3.15. Suppose
$$C + \int_S^T f(t,\omega)\,dB_t(\omega) = D + \int_S^T g(t,\omega)\,dB_t(\omega).$$
By taking expectations we get $C = D$. Hence
$$\int_S^T f(t,\omega)\,dB_t(\omega) = \int_S^T g(t,\omega)\,dB_t(\omega)$$
and therefore, by the Itô isometry,
$$0 = E\Big[\Big(\int_S^T \big(f(t,\omega) - g(t,\omega)\big)\,dB_t\Big)^2\Big] = E\Big[\int_S^T \big(f(t,\omega) - g(t,\omega)\big)^2\,dt\Big],$$
which implies that $f(t,\omega) = g(t,\omega)$ for a.a. $(t,\omega) \in [S,T]\times\Omega$.
4.1. a) $dY_t = 2B_t\,dB_t + dt$.
b) $dY_t = \big(1 + \tfrac12 e^{B_t}\big)\,dt + e^{B_t}\,dB_t$.
c) $dY_t = 2\,dt + 2B_1\,dB_1(t) + 2B_2\,dB_2(t)$.
d) $dY_t = \begin{pmatrix}dt\\dB_t\end{pmatrix} = \begin{pmatrix}1\\0\end{pmatrix}dt + \begin{pmatrix}0\\1\end{pmatrix}dB_t$.
e) $dY_1(t) = dB_1(t) + dB_2(t) + dB_3(t)$,
$dY_2(t) = dt - B_3(t)\,dB_1(t) + 2B_2(t)\,dB_2(t) - B_1(t)\,dB_3(t)$, or
$$dY_t = \begin{pmatrix}dY_1(t)\\dY_2(t)\end{pmatrix} = \begin{pmatrix}0\\1\end{pmatrix}dt + \begin{pmatrix}1 & 1 & 1\\ -B_3(t) & 2B_2(t) & -B_1(t)\end{pmatrix}\begin{pmatrix}dB_1(t)\\dB_2(t)\\dB_3(t)\end{pmatrix}.$$

4.2. By Itô's formula we get $d\big(\tfrac13 B_t^3\big) = B_t^2\,dB_t + B_t\,dt$. Hence
$$\tfrac13 B_t^3 = \int_0^t B_s^2\,dB_s + \int_0^t B_s\,ds.$$
4.3. If we apply the Itô formula with $g(x,y) = x\cdot y$ we get
$$d(X_tY_t) = d\big(g(X_t,Y_t)\big) = \frac{\partial g}{\partial x}(X_t,Y_t)\,dX_t + \frac{\partial g}{\partial y}(X_t,Y_t)\,dY_t$$
$$\quad + \tfrac12\frac{\partial^2 g}{\partial x^2}(X_t,Y_t)\,(dX_t)^2 + \frac{\partial^2 g}{\partial x\,\partial y}(X_t,Y_t)\,dX_t\,dY_t + \tfrac12\frac{\partial^2 g}{\partial y^2}(X_t,Y_t)\,(dY_t)^2$$
$$= Y_t\,dX_t + X_t\,dY_t + dX_t\,dY_t.$$
From this we obtain
$$X_tY_t = X_0Y_0 + \int_0^t Y_s\,dX_s + \int_0^t X_s\,dY_s + \int_0^t dX_s\,dY_s.$$

4.5. $E[B_t^6] = 15t^3$ if $B_0 = 0$.
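The moment identity in 4.5 follows from the general Gaussian moment formula $E[B_t^{2k}] = (2k-1)!!\,t^k$; since $B_t \sim N(0,t)$ it can also be verified directly by sampling. A small Monte Carlo sketch (our own check, not from the book; sample size and seed are arbitrary):

```python
import random

def sixth_moment_mc(t=1.0, n=400_000, seed=2):
    """Monte Carlo estimate of E[B_t^6] using B_t ~ N(0, t)."""
    rng = random.Random(seed)
    s = t ** 0.5
    return sum(rng.gauss(0.0, s)**6 for _ in range(n)) / n

print(sixth_moment_mc(), 15 * 1.0**3)   # theory: 15 t^3
```

The sixth power has heavy tails, so the estimate fluctuates noticeably; with 400 000 samples it still lands close to 15.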
4.11. b) By the Itô formula we have
$$d\big(e^{\frac t2}\sin B_t\big) = \tfrac12 e^{\frac t2}\sin B_t\,dt + e^{\frac t2}\cos B_t\,dB_t + \tfrac12 e^{\frac t2}(-\sin B_t)\,dt = e^{\frac t2}\cos B_t\,dB_t.$$
Since $f(t) := e^{\frac t2}\cos B_t \in \mathcal{V}(0,T)$ for all $T$, we conclude that $X_t = e^{\frac t2}\sin B_t$ is a martingale.
c) By the Itô formula (or Exercise 4.3) we get
$$d\big((B_t+t)\exp(-B_t - \tfrac t2)\big) = (B_t+t)\exp(-B_t - \tfrac t2)(-1)\,dB_t + \exp(-B_t - \tfrac t2)(dB_t + dt) + \exp(-B_t - \tfrac t2)(-1)\,dt$$
$$= \exp(-B_t - \tfrac t2)(1 - t - B_t)\,dB_t,$$
where we have used that $d\big(\exp(-B_t - \tfrac t2)\big) = -\exp(-B_t - \tfrac t2)\,dB_t$. Since
$$f(t) := \exp(-B_t - \tfrac t2)(1 - t - B_t) \in \mathcal{V}(0,T)$$
for all $T > 0$, we conclude that $X_t = (B_t+t)\exp(-B_t - \tfrac t2)$ is a martingale.

4.14. a)
$B_T(\omega) = \int_0^T 1\,dB_t$.
b) By integration by parts we have
$$\int_0^T B_t\,dt = TB_T - \int_0^T t\,dB_t = \int_0^T (T-t)\,dB_t.$$
c) By the Itô formula we have $d(B_t^2) = 2B_t\,dB_t + dt$. This gives
$$B_T^2 = T + \int_0^T 2B_t\,dB_t.$$
d) By the Itô formula we have $d(B_t^3) = 3B_t^2\,dB_t + 3B_t\,dt$. Combined with 4.14 b) this gives
$$B_T^3 = \int_0^T 3B_t^2\,dB_t + 3\int_0^T B_t\,dt = \int_0^T\big(3B_t^2 + 3(T-t)\big)\,dB_t.$$
e) Since $d\big(e^{B_t - \frac t2}\big) = e^{B_t - \frac t2}\,dB_t$ we have
$$e^{B_T - \frac T2} = 1 + \int_0^T e^{B_t - \frac t2}\,dB_t$$
or
$$e^{B_T} = e^{\frac T2} + \int_0^T e^{B_t + \frac12(T-t)}\,dB_t.$$
f) By Exercise 4.11 b) we have $d\big(e^{\frac t2}\sin B_t\big) = e^{\frac t2}\cos B_t\,dB_t$, or
$$e^{\frac T2}\sin B_T = \int_0^T e^{\frac t2}\cos B_t\,dB_t.$$
Hence
$$\sin B_T = \int_0^T e^{\frac12(t-T)}\cos B_t\,dB_t.$$

5.3.
$$X_t = X_0\cdot\exp\Big(\Big(r - \tfrac12\sum_{k=1}^n\alpha_k^2\Big)t + \sum_{k=1}^n\alpha_kB_k(t)\Big)\qquad(\text{if } B(0) = 0).$$

5.4. (i) $X_1(t) = X_1(0) + t + B_1(t)$,
$$X_2(t) = X_2(0) + X_1(0)B_2(t) + \int_0^t s\,dB_2(s) + \int_0^t B_1(s)\,dB_2(s),$$
assuming (as usual) that $B(0) = 0$.
(ii) $X_t = e^tX_0 + \int_0^t e^{t-s}\,dB_s$.
(iii) $X_t = e^{-t}X_0 + e^{-t}B_t$ (assuming $B_0 = 0$).
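The explicit geometric Brownian motion solution in 5.3 can be checked by simulation. With $n = 1$, $x = r = 1$ and $\alpha = 0.6$ (the parameters of the book's front-cover figure), one has $E[X_t] = x\,e^{rt}$, so $E[X_1] = e$. The sketch below (our own illustration; function names and sample sizes are arbitrary) samples the exact solution directly:

```python
import math
import random

def gbm_exact(x0, r, alpha, t, b):
    """Exact solution X_t = x0 exp((r - alpha^2/2) t + alpha B_t)
    of dX = r X dt + alpha X dB (Exercise 5.3 with n = 1)."""
    return x0 * math.exp((r - 0.5 * alpha**2) * t + alpha * b)

def mean_mc(x0=1.0, r=1.0, alpha=0.6, t=1.0, n=200_000, seed=3):
    """Monte Carlo estimate of E[X_t], sampling B_t ~ N(0, t)."""
    rng = random.Random(seed)
    s = math.sqrt(t)
    return sum(gbm_exact(x0, r, alpha, t, rng.gauss(0.0, s)) for _ in range(n)) / n

print(mean_mc(), math.e)   # E[X_1] = e when x = r = 1
```

The sample mean should agree with $e \approx 2.718$ to within a few thousandths, confirming that the drift correction $-\tfrac12\alpha^2 t$ in the exponent is exactly what makes $E[X_t] = x\,e^{rt}$.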
5.6. By the Itô formula we have
$$dF_t = F_t(-\alpha\,dB_t + \tfrac12\alpha^2\,dt) + \tfrac12 F_t\alpha^2\,dt = F_t(-\alpha\,dB_t + \alpha^2\,dt).$$
Therefore, by Exercise 4.3,
$$d(F_tY_t) = F_t\,dY_t + Y_t\,dF_t + dF_t\,dY_t = F_t\,dY_t + Y_tF_t(-\alpha\,dB_t + \alpha^2\,dt) + (-\alpha F_t\,dB_t)(\alpha Y_t\,dB_t)$$
$$= F_t(dY_t - \alpha Y_t\,dB_t) = F_t\,r\,dt.$$
Integrating this we get
$$F_tY_t = F_0Y_0 + \int_0^t rF_s\,ds$$
or
$$Y_t = Y_0F_t^{-1} + F_t^{-1}\int_0^t rF_s\,ds = Y_0\exp(\alpha B_t - \tfrac12\alpha^2t) + r\int_0^t \exp\big(\alpha(B_t - B_s) - \tfrac12\alpha^2(t-s)\big)\,ds.$$
5.7. a) $X_t = m + (X_0 - m)e^{-t} + \sigma\int_0^t e^{s-t}\,dB_s$.
b) $E[X_t] = m + (X_0 - m)e^{-t}$ and $\operatorname{Var}[X_t] = \dfrac{\sigma^2}{2}\big(1 - e^{-2t}\big)$.

5.8.
$$X(t) = \begin{pmatrix}X_1(t)\\X_2(t)\end{pmatrix} = \exp(tJ)X(0) + \exp(tJ)\int_0^t \exp(-sJ)\,M\,dB(s),$$
where
$$J = \begin{pmatrix}0&1\\-1&0\end{pmatrix},\qquad M = \begin{pmatrix}\alpha&0\\0&\beta\end{pmatrix},\qquad dB(s) = \begin{pmatrix}dB_1(s)\\dB_2(s)\end{pmatrix}$$
and
$$\exp(tJ) = I + tJ + \frac{t^2}{2}J^2 + \cdots + \frac{t^n}{n!}J^n + \cdots \in \mathbb{R}^{2\times2}.$$
Using that $J^2 = -I$ this can be rewritten as
$$X_1(t) = X_1(0)\cos t + X_2(0)\sin t + \alpha\int_0^t \cos(t-s)\,dB_1(s) + \beta\int_0^t \sin(t-s)\,dB_2(s),$$
$$X_2(t) = -X_1(0)\sin t + X_2(0)\cos t - \alpha\int_0^t \sin(t-s)\,dB_1(s) + \beta\int_0^t \cos(t-s)\,dB_2(s).$$
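The mean and variance formulas in 5.7 b) are easy to confirm with an Euler–Maruyama simulation of the mean-reverting Ornstein–Uhlenbeck equation $dX = (m - X)\,dt + \sigma\,dB$. The following sketch is our own sanity check (not part of the book's solution; all parameter values are arbitrary):

```python
import math
import random

def ou_stats(m=1.0, x0=3.0, sigma=0.5, t=2.0, steps=200, paths=10_000, seed=4):
    """Euler-Maruyama for dX = (m - X) dt + sigma dB; returns the sample
    mean and variance of X_t over many independent paths."""
    rng = random.Random(seed)
    dt = t / steps
    sd = math.sqrt(dt)
    xs = []
    for _ in range(paths):
        x = x0
        for _ in range(steps):
            x += (m - x) * dt + sigma * rng.gauss(0.0, sd)
        xs.append(x)
    mean = sum(xs) / paths
    var = sum((x - mean)**2 for x in xs) / paths
    return mean, var

mean, var = ou_stats()
print(mean, 1.0 + 2.0 * math.exp(-2.0))        # theory: m + (x0 - m) e^{-t}
print(var, 0.5**2 / 2 * (1 - math.exp(-4.0)))  # theory: sigma^2/2 (1 - e^{-2t})
```

Both sample statistics match the closed forms to within the discretization and Monte Carlo error.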
5.11. Hint: To prove that $\lim_{t\to1}(1-t)\int_0^t \frac{dB_s}{1-s} = 0$ a.s., put $M_t = \int_0^t \frac{dB_s}{1-s}$ for $0 \le t < 1$ and apply the martingale inequality to prove that
$$P\big[\sup\{(1-t)|M_t|;\ t \in [1-2^{-n},\,1-2^{-n-1}]\} > \epsilon\big] \le \epsilon^{-2}\cdot 2^{-n}.$$
Hence by the Borel–Cantelli lemma we obtain that for a.a. $\omega$ there exists $n(\omega) < \infty$ such that $n \ge n(\omega) \Rightarrow \omega \notin A_n$, where
$$A_n = \big\{\omega;\ \sup\{(1-t)|M_t|;\ t \in [1-2^{-n},\,1-2^{-n-1}]\} > 2^{-\frac n4}\big\}.$$
5.12. a) If we introduce
$$x_1(t) = y(t),\qquad x_2(t) = y'(t)\qquad\text{and}\qquad X(t) = \begin{pmatrix}x_1(t)\\x_2(t)\end{pmatrix},$$
then the equation $y''(t) + (1 + W_t)y(t) = 0$ can be written
$$X'(t) = \begin{pmatrix}x_1'(t)\\x_2'(t)\end{pmatrix} = \begin{pmatrix}y'(t)\\y''(t)\end{pmatrix} = \begin{pmatrix}x_2(t)\\-(1+W_t)x_1(t)\end{pmatrix},$$
which is interpreted as the stochastic differential equation
$$dX(t) = \begin{pmatrix}x_2(t)\,dt\\-x_1(t)\,dt - x_1(t)\,dB(t)\end{pmatrix} = \begin{pmatrix}0&1\\-1&0\end{pmatrix}\begin{pmatrix}x_1(t)\\x_2(t)\end{pmatrix}dt - \begin{pmatrix}0&0\\1&0\end{pmatrix}\begin{pmatrix}x_1(t)\\x_2(t)\end{pmatrix}dB_t = KX(t)\,dt - LX(t)\,dB_t,$$
where
$$K = \begin{pmatrix}0&1\\-1&0\end{pmatrix}\qquad\text{and}\qquad L = \begin{pmatrix}0&0\\1&0\end{pmatrix}.$$
b) By the above interpretation we have
$$y'(t) = y'(0) + \int_0^t y''(s)\,ds = y'(0) - \int_0^t y(s)\,ds - \int_0^t y(s)\,dB_s.$$
Hence, if we apply a stochastic Fubini theorem,
$$y(t) = y(0) + \int_0^t y'(s)\,ds = y(0) + y'(0)t - \int_0^t\!\!\int_0^s y(r)\,dr\,ds - \int_0^t\!\!\int_0^s y(r)\,dB_r\,ds$$
$$= y(0) + y'(0)t - \int_0^t\!\!\int_r^t y(r)\,ds\,dr - \int_0^t\!\!\int_r^t y(r)\,ds\,dB_r = y(0) + y'(0)t + \int_0^t (r-t)y(r)\,dr + \int_0^t (r-t)y(r)\,dB_r.$$
5.16. c)
$$X_t = \exp(\alpha B_t - \tfrac12\alpha^2t)\Big(x^2 + 2\int_0^t \exp(-2\alpha B_s + \alpha^2s)\,ds\Big)^{1/2}.$$
6.15. a) We have $X_t = X_0\exp\big(\sigma B_t + (\mu - \tfrac12\sigma^2)t\big) = x\exp(\xi_t - \tfrac12\sigma^2t)$. Therefore $\mathcal{M}_t \subseteq \mathcal{N}_t$. On the other hand, since
$$\xi_t = \tfrac12\sigma^2t + \ln\frac{X_t}{x}$$
we see that $\mathcal{N}_t \subseteq \mathcal{M}_t$. Hence $\mathcal{M}_t = \mathcal{N}_t$.
b) Consider the filtering problem
$$\text{(system)}\qquad d\mu = 0,\qquad \bar\mu = E[\mu],\qquad a^2 = E[(\mu - \bar\mu)^2] = \theta^{-1},$$
$$\text{(observations)}\qquad d\xi_t = \mu\,dt + \sigma\,dB_t;\qquad \xi_0 = 0.$$
By Example 6.2.9 the solution is
$$\widehat\mu = E[\mu\,|\,\mathcal{N}_t] = \frac{\sigma^2\bar\mu}{\sigma^2 + a^2t} + \frac{a^2\xi_t}{\sigma^2 + a^2t} = (\theta + \sigma^{-2}t)^{-1}(\bar\mu\theta + \sigma^{-2}\xi_t).$$
c) The innovation process $N_t$ of the filtering problem in b) is
$$N_t = \xi_t - \int_0^t \widehat\mu(s)\,ds.$$
Therefore, by a),
$$\widehat B_t = \int_0^t \sigma^{-1}\big(\mu - E[\mu\,|\,\mathcal{M}_s]\big)\,ds + B_t = \int_0^t \sigma^{-1}\big(\mu - E[\mu\,|\,\mathcal{N}_s]\big)\,ds + B_t = \sigma^{-1}N_t$$
is a Brownian motion by Lemma 6.2.6.
d) Since $\widehat B_t = \sigma^{-1}\big(\xi_t - \int_0^t \widehat\mu(s)\,ds\big)$, we see that $\widehat B_t$ is $\mathcal{N}_t$-measurable and hence $\mathcal{M}_t$-measurable by a).
e) We have
$$\sigma\,d\widehat B_t = d\xi_t - \widehat\mu(t)\,dt = d\xi_t - \frac{\bar\mu\theta + \sigma^{-2}\xi_t}{\theta + \sigma^{-2}t}\,dt.$$
Hence
$$d\xi_t - \frac{1}{\sigma^2\theta + t}\,\xi_t\,dt = \frac{\bar\mu\theta}{\theta + \sigma^{-2}t}\,dt + \sigma\,d\widehat B_t,$$
which gives
$$d\Big(\xi_t\exp\Big(-\int_0^t \frac{ds}{\sigma^2\theta + s}\Big)\Big) = \exp\Big(-\int_0^t \frac{ds}{\sigma^2\theta + s}\Big)\Big(\frac{\bar\mu\theta}{\theta + \sigma^{-2}t}\,dt + \sigma\,d\widehat B_t\Big)$$
or
$$d\Big(\frac{\xi_t}{\sigma^2\theta + t}\Big) = \frac{1}{\sigma^2\theta + t}\Big(\frac{\bar\mu\theta}{\theta + \sigma^{-2}t}\,dt + \sigma\,d\widehat B_t\Big).$$
Integrating, we conclude that
$$\xi_t = \bar\mu t + \sigma(\sigma^2\theta + t)\int_0^t \frac{d\widehat B_s}{\sigma^2\theta + s},$$
which shows that $\xi_t$ is measurable w.r.t. the σ-algebra generated by $\{\widehat B_s;\ s \le t\}$.
f) By combining (6.3.20), (6.3.24) and a) we have
$$dX_t = X_t(\mu\,dt + \sigma\,dB_t) = X_t\,d\xi_t = X_t\big(\widehat\mu(t)\,dt + \sigma\,d\widehat B_t\big) = E[\mu\,|\,\mathcal{M}_t]X_t\,dt + \sigma X_t\,d\widehat B_t.$$
7.1. a) $Af(x) = \mu xf'(x) + \tfrac12\sigma^2f''(x)$; $f \in C_0^2(\mathbb{R})$.
b) $Af(x) = rxf'(x) + \tfrac12\alpha^2x^2f''(x)$; $f \in C_0^2(\mathbb{R})$.
c) $Af(y) = rf'(y) + \tfrac12\alpha^2y^2f''(y)$; $f \in C_0^2(\mathbb{R})$.
d) $Af(t,x) = \dfrac{\partial f}{\partial t} + \mu x\dfrac{\partial f}{\partial x} + \tfrac12\sigma^2\dfrac{\partial^2 f}{\partial x^2}$; $f \in C_0^2(\mathbb{R}^2)$.
e) $Af(x_1,x_2) = \dfrac{\partial f}{\partial x_1} + x_2\dfrac{\partial f}{\partial x_2} + \tfrac12 e^{2x_1}\dfrac{\partial^2 f}{\partial x_2^2}$; $f \in C_0^2(\mathbb{R}^2)$.
f) $Af(x_1,x_2) = \dfrac{\partial f}{\partial x_1} + \tfrac12\dfrac{\partial^2 f}{\partial x_1^2} + \tfrac12 x_1^2\dfrac{\partial^2 f}{\partial x_2^2}$; $f \in C_0^2(\mathbb{R}^2)$.
g) $Af(x_1,\ldots,x_n) = \sum\limits_{k=1}^n r_kx_k\dfrac{\partial f}{\partial x_k} + \tfrac12\sum\limits_{i,j=1}^n x_ix_j\Big(\sum\limits_{k=1}^n \alpha_{ik}\alpha_{jk}\Big)\dfrac{\partial^2 f}{\partial x_i\,\partial x_j}$; $f \in C_0^2(\mathbb{R}^n)$.

7.2. a) $dX_t = dt + \sqrt2\,dB_t$.
b) $dX(t) = \begin{pmatrix}dX_1(t)\\dX_2(t)\end{pmatrix} = \begin{pmatrix}1\\cX_2(t)\end{pmatrix}dt + \begin{pmatrix}0\\\alpha X_2(t)\end{pmatrix}dB_t$.
c) $dX(t) = \begin{pmatrix}dX_1(t)\\dX_2(t)\end{pmatrix} = \begin{pmatrix}2X_2(t)\\\ln\!\big(1+X_1^2(t)+X_2^2(t)\big)\end{pmatrix}dt + \begin{pmatrix}X_1(t)&1\\1&0\end{pmatrix}\begin{pmatrix}dB_1(t)\\dB_2(t)\end{pmatrix}$.
(Several other diffusion coefficients are possible.)

7.4.
a), b). Let $\tau_k = \inf\{t > 0;\ B_t^x = 0 \text{ or } B_t^x = k\}$, $k > x > 0$, and put $p_k = P^x[B_{\tau_k} = k]$. Then by Dynkin's formula applied to $f(y) = y^2$ for $0 \le y \le k$ we get
$$E^x[\tau_k] = k^2p_k - x^2.\tag{S1}$$
On the other hand, Dynkin's formula applied to $f(y) = y$ for $0 \le y \le k$ gives
$$kp_k = x.\tag{S2}$$
Combining these two identities we get that
$$E^x[\tau] = \lim_{k\to\infty} E^x[\tau_k] = \lim_{k\to\infty} x(k - x) = \infty.\tag{S3}$$
Moreover, from (S2) we get
$$P^x[\exists t < \infty \text{ with } B_t = 0] = \lim_{k\to\infty} P^x[B_{\tau_k} = 0] = \lim_{k\to\infty}(1 - p_k) = 1,$$
so $\tau < \infty$ a.s. $P^x$.

7.15. By the Markov property (7.2.5) we have
$$E^x[\mathcal{X}_{[K,\infty)}(B_T)\,|\,\mathcal{F}_t] = E^x[\theta_t\mathcal{X}_{[K,\infty)}(B_{T-t})\,|\,\mathcal{F}_t] = E^{B_t}[\mathcal{X}_{[K,\infty)}(B_{T-t})] = E^y[f(B_{T-t})]_{y=B_t}$$
$$= \Big[\frac{1}{\sqrt{2\pi(T-t)}}\int_{\mathbb{R}} f(z)\,e^{-\frac{(z-y)^2}{2(T-t)}}\,dz\Big]_{y=B_t} = \frac{1}{\sqrt{2\pi(T-t)}}\int_K^\infty e^{-\frac{(z-B_t)^2}{2(T-t)}}\,dz,\tag{S4}$$
where $f(x) = \mathcal{X}_{[K,\infty)}(x)$.

7.18. a) Define
$$w(x) = \frac{f(x) - f(a)}{f(b) - f(a)};\qquad x \in [a,b].$$
Then
$$Lw(x) = 0\quad\text{in } D := (a,b)\qquad\text{and}\qquad w(a) = 0,\quad w(b) = 1.$$
Hence by the Dynkin formula
$$1\cdot P^x[X_\tau = b] + 0\cdot P^x[X_\tau = a] = w(x) + E^x\Big[\int_0^\tau Lw(X_t)\,dt\Big] = w(x),$$
i.e.
$$P^x[X_\tau = b] = \frac{f(x) - f(a)}{f(b) - f(a)}.$$
c)
$$p = \Big(\exp\Big(-\frac{2\mu b}{\sigma^2}\Big) - \exp\Big(-\frac{2\mu a}{\sigma^2}\Big)\Big)^{-1}\Big(\exp\Big(-\frac{2\mu x}{\sigma^2}\Big) - \exp\Big(-\frac{2\mu a}{\sigma^2}\Big)\Big).$$

8.1. a) $g(t,x) = E^x[\phi(B_t)]$.
b) $u(x) = E^x\big[\int_0^\infty e^{-\alpha t}\psi(B_t)\,dt\big]$.
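The exit-probability identities of Exercises 7.4 and 7.18 (for Brownian motion started at $x$, the probability of hitting $k$ before $0$ is $x/k$) have an exact discrete analogue: a symmetric simple random walk absorbed at $0$ and $k$ hits $k$ first with probability $x/k$. The following simulation (our own illustration, not part of the book) checks this:

```python
import random

def hit_top_prob(x=3, k=10, trials=50_000, seed=5):
    """Symmetric simple random walk started at x, absorbed at 0 and k:
    the discrete analogue of Exercise 7.4, with P[hit k before 0] = x/k."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        pos = x
        while 0 < pos < k:
            pos += 1 if rng.random() < 0.5 else -1
        if pos == k:
            wins += 1
    return wins / trials

print(hit_top_prob(), 3 / 10)
```

The expected duration of each trial is $x(k-x)$ steps, the discrete counterpart of the identity $E^x[\tau_k] = k^2p_k - x^2 = x(k-x)$ from (S1) and (S2).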
8.11. For $T > 0$ define the measure $Q_T$ on $\mathcal{F}_T$ by
$$dQ_T(\omega) = \exp(-B(T) - \tfrac12T)\,dP(\omega)\qquad\text{on } \mathcal{F}_T.$$
Then by the Girsanov theorem $Y(t) := t + B(t)$, $0 \le t \le T$, is a Brownian motion with respect to $Q_T$. Since
$$Q_T = Q_S\qquad\text{on } \mathcal{F}_t \text{ for all } t \le \min(S,T),$$
there exists a measure $Q$ on $\mathcal{F}_\infty$ such that
$$Q = Q_T\qquad\text{on } \mathcal{F}_T \text{ for all } T < \infty$$
(see Theorem 1.14 in Folland (1984)). By the law of iterated logarithm (Theorem 5.1.2) we know that $P[N] = 1$, where
$$N = \big\{\omega;\ \lim_{t\to\infty} Y(t) = \infty\big\}.$$
On the other hand, since $Y(t)$ is a Brownian motion w.r.t. $Q$ we have that $Q[N] = 0$. Hence $P$ is not absolutely continuous w.r.t. $Q$.
8.12. Here the equation $\sigma(t,\omega)u(t,\omega) = \beta(t,\omega)$ has the form
$$\begin{pmatrix}1&3\\-1&-2\end{pmatrix}\begin{pmatrix}u_1\\u_2\end{pmatrix} = \begin{pmatrix}0\\1\end{pmatrix},$$
which has the solution
$$u = \begin{pmatrix}u_1\\u_2\end{pmatrix} = \begin{pmatrix}-3\\1\end{pmatrix}.$$
Hence we define the measure $Q$ on $\mathcal{F}_T^{(2)}$ by
$$dQ(\omega) = \exp\big(3B_1(T,\omega) - B_2(T,\omega) - 5T\big)\,dP(\omega).$$

9.1.
a) $dX_t = \begin{pmatrix}\alpha\\0\end{pmatrix}dt + \begin{pmatrix}0\\\beta\end{pmatrix}dB_t$.
b) $dX_t = \begin{pmatrix}a\\b\end{pmatrix}dt + \begin{pmatrix}1&0\\0&1\end{pmatrix}dB_t$.
c) $dX_t = \alpha X_t\,dt + \beta\,dB_t$.
d) $dX_t = \alpha\,dt + \beta X_t\,dB_t$.
e) $dX_t = \begin{pmatrix}dX_1(t)\\dX_2(t)\end{pmatrix} = \begin{pmatrix}X_2(t)\\\ln\!\big(1+X_1^2(t)\big)\end{pmatrix}dt + \begin{pmatrix}\sqrt{2}\,X_2(t)&0\\X_1(t)&X_1(t)\end{pmatrix}\begin{pmatrix}dB_1(t)\\dB_2(t)\end{pmatrix}$.

9.2. (i) In this case we put
$$dX_t = \begin{pmatrix}dt\\dB_t\end{pmatrix} = \begin{pmatrix}1\\0\end{pmatrix}dt + \begin{pmatrix}0\\1\end{pmatrix}dB_t;\qquad X_0 = \begin{pmatrix}s\\x\end{pmatrix}$$
and
$$D = \{(s,x) \in \mathbb{R}^2;\ s < T\}.$$
Then $\tau_D = \inf\{t > 0;\ (s+t, B_t) \notin D\} = T - s$ and we get
$$u(s,x) = E^{s,x}\Big[\psi(B_{\tau_D}) + \int_0^{\tau_D} g(X_t)\,dt\Big] = E\Big[\psi(B_{T-s}^x) - \int_0^{T-s} \phi(s+t, B_t^x)\,dt\Big],$$
where $B_t^x$ is Brownian motion starting at $x$.
(ii) Define $X_t$ by
$$dX_t = \alpha X_t\,dt + \beta X_t\,dB_t;\qquad X_0 = x > 0,$$
and put $D = (0, x_0)$. If $\alpha \ge \tfrac12\beta^2$ then
$$\tau_D = \inf\{t > 0;\ X_t \notin D\} < \infty\ \text{a.s.}\qquad\text{and}\qquad X_{\tau_D} = x_0\ \text{a.s.}$$
(see Example 5.1.1). Therefore the only bounded solution is
$$u(x) = E^x[(X_{\tau_D})^2] = x_0^2\qquad\text{(constant)}.$$
(iii) If we try a solution of the form $u(x) = x^\gamma$ for some constant $\gamma$ we get
$$\alpha xu'(x) + \tfrac12\beta^2x^2u''(x) = \gamma\big(\alpha + \tfrac12\beta^2(\gamma - 1)\big)x^\gamma,$$
so $u(x)$ is a solution iff
$$\gamma = 1 - \frac{2\alpha}{\beta^2}.$$
With this value of $\gamma$ we get that the general solution of $\alpha xu'(x) + \tfrac12\beta^2x^2u''(x) = 0$ is
$$u(x) = C_1 + C_2x^\gamma,$$
where $C_1, C_2$ are arbitrary constants. If $\alpha < \tfrac12\beta^2$ then all these solutions are bounded on $(0,x_0)$ and we need an additional boundary value at $x = 0$ to get uniqueness. In this case $P[\tau_D = \infty] > 0$. If $\alpha > \tfrac12\beta^2$ then $u(x) = C_1$ is the only bounded solution on $(0,x_0)$, in agreement with Theorem 9.1.1 (and (ii) above).

9.3.
a) $u(t,x) = E^x[\phi(B_{T-t})]$.
b) $u(t,x) = E^x[\psi(B_t)]$.

9.8. a) Let $X_t \in \mathbb{R}^2$ be uniform motion to the right, as described in Example 9.2.1. Then each one-point set $\{(x_1,x_2)\}$ is thin (and hence semipolar) but not polar.
b) With $X_t$ as in a), let $H_k = \{(a_k,1)\}$, $k = 1,2,\ldots$, where $\{a_k\}_{k=1}^\infty$ is the set of rational numbers, and put $H = \bigcup_k H_k$. Then each $H_k$ is thin but
$$Q^{(x_1,1)}[T_H = 0] = 1\qquad\text{for all } x_1 \in \mathbb{R}.$$
9.10. Let $Y_t = Y_t^{s,x} = (s+t, X_t^x)$ for $t \ge 0$, where $X_t = X_t^x$ satisfies
$$dX_t = \alpha X_t\,dt + \beta X_t\,dB_t;\qquad t \ge 0,\ X_0 = x > 0.$$
Then the generator $\widehat A$ of $Y_t$ is given by
$$\widehat Af(s,x) = \frac{\partial f}{\partial s} + \alpha x\frac{\partial f}{\partial x} + \tfrac12\beta^2x^2\frac{\partial^2 f}{\partial x^2};\qquad f \in C_0^2(\mathbb{R}^2).$$
Moreover, with $D = \{(t,x);\ x > 0 \text{ and } t < T\}$ we have
$$\tau_D := \inf\{t > 0;\ Y_t \notin D\} = \inf\{t > 0;\ s + t > T\} = T - s.$$
Hence $Y_{\tau_D} = (T, X_{T-s})$. Therefore, by Theorem 9.3.3, the solution is
$$f(s,x) = E^x\Big[e^{-\rho T}\phi(X_{T-s}) + \int_0^{T-s} e^{-\rho(s+t)}K(X_t)\,dt\Big].$$
9.15. a) If we put $w(s,x) = e^{-\rho s}h(x)$ we get
$$\tfrac12\frac{\partial^2 w}{\partial x^2} + \frac{\partial w}{\partial s} = e^{-\rho s}\big(\tfrac12 h''(x) - \rho h(x)\big),$$
and the boundary value problem reduces to
(1) $\tfrac12 h''(x) - \rho h(x) = -\theta x^2$; $a < x < b$,
(2) $h(a) = \psi(a)$, $h(b) = \psi(b)$.
The general solution of (1) is
$$h(x) = C_1e^{\sqrt{2\rho}\,x} + C_2e^{-\sqrt{2\rho}\,x} + \frac{\theta}{\rho}x^2 + \frac{\theta}{\rho^2},$$
where $C_1, C_2$ are arbitrary constants. The boundary values (2) will determine $C_1, C_2$ uniquely.
b) To find
$$g(x,\rho) = g(x) := E^x\big[e^{-\rho\tau_D}\big]$$
we apply the above with $\psi = 1$, $\theta = 0$ and get
$$g(x) = C_1e^{\sqrt{2\rho}\,x} + C_2e^{-\sqrt{2\rho}\,x},$$
where the constants $C_1, C_2$ are determined by $g(a) = 1$, $g(b) = 1$. After some simplifications this gives
$$g(x) = \frac{\sinh\big(\sqrt{2\rho}(b-x)\big) + \sinh\big(\sqrt{2\rho}(x-a)\big)}{\sinh\big(\sqrt{2\rho}(b-a)\big)};\qquad a \le x \le b.$$

10.1. a) $g^*(x) = \infty$, $\tau^*$ does not exist.
b) $g^*(x) = \infty$, $\tau^*$ does not exist.
c) $g^*(x) = 1$, $\tau^* = \inf\{t > 0;\ B_t = 0\}$.
d) If $\rho < \tfrac12$ then $g^*(s,x) = \infty$ and $\tau^*$ does not exist. If $\rho \ge \tfrac12$ then $g^*(s,x) = g(s,x) = e^{-\rho s}\cosh x$ and $\tau^* = 0$.
10.2. a) Let $W$ be a closed disc centered at $y$ with radius $\rho < |x - y|$. Define
$$\tau_\rho = \inf\{t > 0;\ B_t^x \in W\}.$$
In $\mathbb{R}^2$ we know that $\tau_\rho < \infty$ a.s. (see Example 7.4.2). Suppose there exists a nonnegative superharmonic function $u$ such that
(1) $u(x) < u(y)$.
Then
(2) $u(x) \ge E^x\big[u(B_{\tau_\rho})\big]$.
Since $u$ is lower semicontinuous there exists $\rho > 0$ such that
(3) $\inf\{u(z);\ z \in W\} > u(x)$.
Combining this with (2) we get the contradiction
$$u(x) \ge E^x\big[u(B_{\tau_\rho})\big] > u(x).$$
b) The argument in a) also works for $\mathbb{R}$. Therefore
$$g^*(x) = \sup\{g(x);\ x \in \mathbb{R}\} = \sup\{xe^{-x};\ x > 0\} = \frac1e.$$
c) For $x \neq 0$ we have
$$\Delta(|x|^\gamma) = \sum_{i=1}^n\frac{\partial^2}{\partial x_i^2}\Big(\sum_{j=1}^n x_j^2\Big)^{\gamma/2} = \sum_{i=1}^n\frac{\partial}{\partial x_i}\Big(\frac\gamma2\Big(\sum_{j=1}^n x_j^2\Big)^{\frac\gamma2 - 1}2x_i\Big) = \gamma(\gamma + n - 2)|x|^{\gamma-2}.$$
So $|x|^\gamma$, and hence $f_\gamma(x) = \min(1, |x|^\gamma)$, is superharmonic iff $\gamma(\gamma + n - 2) \le 0$, i.e.
$$2 - n \le \gamma \le 0.$$
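The identity $\Delta(|x|^\gamma) = \gamma(\gamma + n - 2)|x|^{\gamma-2}$ is easy to confirm numerically with a central finite-difference Laplacian; in particular $\gamma = 2 - n$ gives the harmonic case. The following check is our own (not from the book; the test point and step size are arbitrary):

```python
def laplacian_r_gamma(point, gamma, h=1e-4):
    """Central finite-difference Laplacian of f(x) = |x|^gamma at a point
    of R^n (n = len(point)), away from the origin."""
    n = len(point)

    def f(p):
        return sum(c * c for c in p) ** (gamma / 2)

    total = 0.0
    for i in range(n):
        up = list(point); up[i] += h
        dn = list(point); dn[i] -= h
        total += (f(up) - 2 * f(point) + f(dn)) / h**2
    return total

pt = [0.5, -0.3, 0.8]                      # n = 3
print(laplacian_r_gamma(pt, -1.0))         # gamma = 2 - n: |x|^{2-n} is harmonic, ~0
r2 = sum(c * c for c in pt)
print(laplacian_r_gamma(pt, 1.5), 1.5 * 2.5 * r2 ** -0.25)   # gamma(gamma+n-2)|x|^{gamma-2}
```

Both printed pairs agree to several decimal places, matching the superharmonicity criterion $\gamma(\gamma + n - 2) \le 0$.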
√
2
2ρ x0
2 e +1 · 2√2ρ x , 0 ρ e −1
cosh( 2ρ x) √ and g ∗ (s, x) = e−ρs x20 cosh( for −x0 ≤ x ≤ x0 , where 2ρ x0 )
cosh ξ = 12 (eξ + e−ξ ) .
10.9. If $0 < \rho \le 1$ then $\Phi(x) = \frac1\rho x^2 + \frac1{\rho^2}$, but $\tau^*$ does not exist. If $\rho > 1$ then
$$\Phi(x) = \begin{cases}\dfrac1\rho x^2 + \dfrac1{\rho^2} + C\cosh(\sqrt{2\rho}\,x) & \text{for } |x| \le x^*\\[4pt] x^2 & \text{for } |x| > x^*\end{cases}$$
where $C > 0$, $x^* > 0$ are the unique solutions of the equations
$$C\cosh(\sqrt{2\rho}\,x^*) = \Big(1 - \frac1\rho\Big)(x^*)^2 - \frac1{\rho^2},$$
$$C\sqrt{2\rho}\,\sinh(\sqrt{2\rho}\,x^*) = 2\Big(1 - \frac1\rho\Big)x^*.$$

10.12. If $\rho > r$ then $g^*(s,x) = e^{-\rho s}(x_0 - 1)^+\big(\frac{x}{x_0}\big)^\gamma$ and $\tau^* = \inf\{t > 0;\ X_t \ge x_0\}$, where
$$\gamma = \alpha^{-2}\Big(\tfrac12\alpha^2 - r + \sqrt{\big(\tfrac12\alpha^2 - r\big)^2 + 2\alpha^2\rho}\Big)$$
and
$$x_0 = \frac{\gamma}{\gamma - 1}\qquad(\gamma > 1 \Leftrightarrow \rho > r).$$
10.13. If $\alpha \le \rho$ then $\tau^* = 0$. If $\rho < \alpha < \rho + \lambda$ then
$$G^*(s,p,q) = \begin{cases} e^{-\rho s}\,pq & \text{if } 0 < pq < y_0\\[4pt] e^{-\rho s}\Big(C_1(pq)^{\gamma_1} + \dfrac{\lambda}{\rho + \lambda - \alpha}\,pq - \dfrac K\rho\Big) & \text{if } pq \ge y_0\end{cases}$$
where
$$\gamma_1 = \beta^{-2}\Big(\tfrac12\beta^2 + \lambda - \alpha - \sqrt{\big(\tfrac12\beta^2 + \lambda - \alpha\big)^2 + 2\rho\beta^2}\Big),$$
$$y_0 = \frac{(-\gamma_1)K(\rho + \lambda - \alpha)}{(1 - \gamma_1)\rho(\alpha - \rho)} > 0\qquad\text{and}\qquad C_1 = \frac{(\alpha - \rho)\,y_0^{1-\gamma_1}}{(-\gamma_1)(\rho + \lambda - \alpha)}.$$
The continuation region is $D = \{(s,p,q);\ pq > y_0\}$. If $\rho + \lambda \le \alpha$ then $G^* = \infty$.

10.14. First assume that
Case 1: $\rho > \alpha$.
In this case the generator $L$ is given by
$$L\phi(s,p) = \frac{\partial\phi}{\partial s} + \alpha p\frac{\partial\phi}{\partial p} + \tfrac12\beta^2p^2\frac{\partial^2\phi}{\partial p^2}$$
and so (see (10.3.7))
$$U = \{(s,p);\ (-\rho)(-C) - p > 0\} = \{(s,p);\ p < \rho C\}.$$
In view of this we try a continuation region $D$ of the form
$$D = \{(s,p);\ 0 < p < p^*\}$$
for some $p^* > 0$ (to be determined).
We try a value function candidate of the form $\phi(s,p) = e^{-\rho s}\psi(p)$ where, by Theorem 10.4.1, the function $\psi$ is required to satisfy the following conditions:
(1) $L_0\psi(p) := -\rho\psi(p) + \alpha p\psi'(p) + \tfrac12\beta^2p^2\psi''(p) - p = 0$; $0 < p < p^*$
(2) $L_0\psi(p) \le 0$; $p > p^*$
(3) $\psi(p) = -C$; $p > p^*$
(4) $\psi(p) > -C$; $0 < p < p^*$.
The general solution of (1) is
$$\psi(p) = K_1p^{\gamma_1} + K_2p^{\gamma_2} + \frac{p}{\alpha - \rho},$$
where
(5) $\gamma_i = \beta^{-2}\Big(\tfrac12\beta^2 - \alpha \pm \sqrt{\big(\tfrac12\beta^2 - \alpha\big)^2 + 2\rho\beta^2}\Big)$; $i = 1,2$,
with $\gamma_2 < 0 < \gamma_1$, and $K_1, K_2$ are arbitrary constants. Since $\psi(p)$ must be bounded near $p = 0$ we must have $K_2 = 0$. Hence we put
(6) $\psi(p) = \begin{cases} K_1p^{\gamma_1} + \dfrac{p}{\alpha - \rho} & 0 \le p < p^*\\ -C & p \ge p^*\end{cases}$
If we require $\psi(p)$ to be continuous at $p = p^*$ we get the equation
(7) $K_1(p^*)^{\gamma_1} + \dfrac{p^*}{\alpha - \rho} = -C$.
If $\psi(p)$ is also $C^1$ at $p = p^*$ we get
(8) $K_1\gamma_1(p^*)^{\gamma_1 - 1} + \dfrac{1}{\alpha - \rho} = 0$.
Combining these two equations we get
(9) $p^* = \dfrac{C(\rho - \alpha)\gamma_1}{\gamma_1 - 1}$
and
(10) $K_1 = \dfrac{(p^*)^{1-\gamma_1}}{\gamma_1(\rho - \alpha)}$.
It is easy to see that
$$\gamma_1 > 1 \Leftrightarrow \rho > \alpha.$$
Since we have assumed $\rho > \alpha$, we get by (9) that $p^* > 0$ and by (10) that $K_1 > 0$. It remains to verify that with these values of $p^*$ and $K_1$ the function $\phi(s,p) = e^{-\rho s}\psi(p)$, with $\psi$ given by (6), satisfies all the conditions of Theorem 10.4.1. Many of these verifications are straightforward. However, to avoid false solutions, it is important to check (ii) and (vi), which correspond to (4) and (2) above:
Verification of (4): Define $h(p) = \psi(p) + C$. Then
$$h(p^*) = h'(p^*) = 0\qquad\text{by (7) and (8) above.}$$
Moreover, if $0 < p < p^*$ then
$$h''(p) = K_1\gamma_1(\gamma_1 - 1)p^{\gamma_1 - 2} > 0,$$
which implies that $h'(p) < 0$ and hence $h(p) > 0$ for $0 < p < p^*$, as required.
Verification of (2): For $p > p^*$ we have $\psi(p) = -C$ and hence
$$L_0\psi(p) = (-\rho)(-C) - p = \rho C - p\qquad\text{for all } p > p^*,$$
so $L_0\psi(p) \le 0$ if and only if $p^* \ge \rho C$, which holds because $U = \{(s,p);\ p < \rho C\}$ and $U \subset D$.
The remaining cases,
Case 2: $\rho = \alpha$,
Case 3: $\rho < \alpha$,
are left to the reader.
a1 −a2 +σ22 (1−γ) (σ12 +σ22 )(1−γ)
(constant),
Φ(s, x) = e
λ(t−t1 ) γ
x
for t < t1 , x > 0
where λ = 12 γ(1 − γ)[σ12 (u∗ )2 + σ22 (1 − u∗ )2 ] − γ[a1 u∗ + a2 (1 − u∗ )] .
11.7. Define
$$dY(t) = \begin{pmatrix}dt\\dX_t\end{pmatrix} = \begin{pmatrix}1\\a\,u(t)\end{pmatrix}dt + \begin{pmatrix}0\\u(t)\end{pmatrix}dB_t;\qquad Y(0) = \begin{pmatrix}s\\x\end{pmatrix}$$
and $G = \{(s,x);\ x > 0 \text{ and } s < T\}$. Then
$$\Phi(s,x) = \Phi(y) = \sup_{u(\cdot)} E^y[g(Y_{\tau_G})],$$
where $g(y) = g(s,x) = x^\gamma$ and $\tau_G = \inf\{t > 0;\ Y(t) \notin G\} = \tau$. We apply Theorem 11.2.2 and hence look for a function $\phi$ such that
(1) $\sup_{v\in\mathbb{R}}\{f^v(y) + (L^v\phi)(y)\} = 0$ for all $y \in G$,
where in this case $f^v(y) = 0$ and
$$L^v\phi(y) = L^v\phi(s,x) = \frac{\partial\phi}{\partial s} + av\frac{\partial\phi}{\partial x} + \tfrac12v^2\frac{\partial^2\phi}{\partial x^2}.$$
If we guess that $\frac{\partial^2\phi}{\partial x^2} < 0$ then the maximum of the function $v \mapsto L^v\phi(s,x)$ is attained at
(2) $v = v^*(s,x) = -\dfrac{a\,\frac{\partial\phi}{\partial x}(s,x)}{\frac{\partial^2\phi}{\partial x^2}(s,x)}$.
We try a function $\phi$ of the form
(3) $\phi(s,x) = f(s)x^\gamma$
for some function $f$ (to be determined). Substituted in (2) this gives
(4) $v^*(s,x) = -\dfrac{a\,f(s)\gamma x^{\gamma-1}}{f(s)\gamma(\gamma-1)x^{\gamma-2}} = \dfrac{ax}{1-\gamma}$
and (1) becomes
$$f'(s)x^\gamma + \frac{ax}{1-\gamma}\,a f(s)\gamma x^{\gamma-1} + \tfrac12\Big(\frac{ax}{1-\gamma}\Big)^2 f(s)\gamma(\gamma-1)x^{\gamma-2} = 0$$
or
(5) $f'(s) + \dfrac{a^2\gamma}{2(1-\gamma)}f(s) = 0$.
Combined with the terminal condition $\phi(y) = g(y)$ for $y \in \partial G$, i.e.
(6) $f(T) = 1$,
the equation (5) has the solution
(7) $f(s) = \exp\Big(\dfrac{a^2\gamma}{2(1-\gamma)}(T - s)\Big)$; $s \le T$.
With this value of $f$ it is now easily verified that $\phi(s,x) = f(s)x^\gamma$ satisfies all the conditions of Theorem 11.2.2, and we conclude that the value function is
$$\Phi(s,x) = \phi(s,x) = f(s)x^\gamma$$
and that
$$u^*(s,x) = v^*(s,x) = \frac{ax}{1-\gamma}$$
is an optimal Markov control.

11.11. Additional hints: For the solution of the unconstrained problem try a function $\phi_\lambda(s,x)$ of the form
$$\phi_\lambda(s,x) = a_\lambda(s)x^2 + b_\lambda(s),$$
for suitable functions $a_\lambda(s)$, $b_\lambda(s)$, with $\lambda \in \mathbb{R}$ fixed. By substituting this into the HJB equation we arrive at the equations
$$a_\lambda'(s) = \frac1\theta a_\lambda^2(s) - 1\quad\text{for } s < t_1,\qquad a_\lambda(t_1) = \lambda,$$
and
$$b_\lambda'(s) = -\sigma^2a_\lambda(s)\quad\text{for } s < t_1,\qquad b_\lambda(t_1) = 0,$$
with optimal control $u^*(s,x) = -\frac1\theta a_\lambda(s)x$. Now substitute this into the equation for $X_t^{u^*}$ and use the terminal condition to determine $\lambda_0$. If we put $s = 0$ for simplicity, then $\lambda = \lambda_0$ can be chosen as any solution of the equation
$$A\lambda^3 + B\lambda^2 + C\lambda + D = 0,$$
where
$$A = m^2(e^{t_1} - e^{-t_1})^2,$$
$$B = m^2(e^{2t_1} + 2 - 3e^{-2t_1}) - \sigma^2(e^{t_1} - e^{-t_1})^2,$$
$$C = m^2(-e^{2t_1} + 2 + 3e^{-2t_1}) - 4x^2 - 2\sigma^2(1 - e^{-2t_1}),$$
$$D = -m^2(e^{t_1} + e^{-t_1})^2 + 4x^2 + \sigma^2(e^{2t_1} - e^{-2t_1}).$$

11.12. If we introduce the process
$$dY(t) = \begin{pmatrix}dt\\dX_t\end{pmatrix} = \begin{pmatrix}1\\u_t\end{pmatrix}dt + \begin{pmatrix}0\\\sigma\end{pmatrix}dB_t;\qquad Y(0) = y = \begin{pmatrix}s\\x\end{pmatrix},$$
then the problem can be written
$$\Psi(s,x) = \inf_u E^y\Big[\int_0^\infty f(Y(t), u(t))\,dt\Big],$$
where
$$f(y,u) = f^u(y) = f^u(s,x) = e^{-\rho s}(x^2 + \theta u^2).$$
In this case we look for a function $\psi$ such that
(1) $\inf_{v\in\mathbb{R}}\Big\{e^{-\rho s}(x^2 + \theta v^2) + \dfrac{\partial\psi}{\partial s} + v\dfrac{\partial\psi}{\partial x} + \tfrac12\sigma^2\dfrac{\partial^2\psi}{\partial x^2}\Big\} = 0$; $(s,x) \in \mathbb{R}^2$.
If we guess that $\psi$ has the form $\psi(s,x) = e^{-\rho s}(ax^2 + b)$ then (1) gets the form
(2) $\inf_{v\in\mathbb{R}}\{x^2 + \theta v^2 - \rho(ax^2 + b) + 2axv + \sigma^2a\} = 0$; $x \in \mathbb{R}$.
The minimum is attained when
(3) $v = v^*(s,x) = -\dfrac{ax}\theta$
and substituting this into (2) we get
$$\Big(1 - \rho a - \frac{a^2}\theta\Big)x^2 + \sigma^2a - \rho b = 0\qquad\text{for all } x \in \mathbb{R}.$$
This is only possible if
(4) $a^2 + \rho\theta a - \theta = 0$
and
(5) $b = \dfrac{\sigma^2a}\rho$.
The positive root of (4) is
(6) $a = \tfrac12\big(-\rho\theta + \sqrt{\rho^2\theta^2 + 4\theta}\big)$.
We can now verify that with the values of $a$ and $b$ given by (6) and (5), respectively, the function $\psi(s,x) = e^{-\rho s}(ax^2 + b)$ satisfies all the requirements of Theorem 11.2.2, and we conclude that the value function is
$$\Psi(s,x) = \psi(s,x) = e^{-\rho s}(ax^2 + b)$$
and that
$$u^*(s,x) = v^*(s,x) = -\frac{ax}\theta$$
is an optimal control.

11.13. If we introduce the process
$$dY(t) = \begin{pmatrix}dt\\dX_t\end{pmatrix} = \begin{pmatrix}1\\1 - u_t\end{pmatrix}dt + \begin{pmatrix}0\\\sigma\end{pmatrix}dB_t;\qquad Y(0) = \begin{pmatrix}s\\x\end{pmatrix}$$
and
$$G = \{(s,x) \in \mathbb{R}^2;\ x > 0\},$$
then the problem gets the form
$$\Phi(s,x) = \sup_u E^y\Big[\int_0^{\tau_G} f(Y(t), u_t)\,dt\Big],$$
where $f(y,u) = f(s,x,u) = e^{-\rho s}u$. The corresponding HJB equation is
(1) $\sup_{v\in[0,1]}\Big\{e^{-\rho s}v + \dfrac{\partial\phi}{\partial s} + (1 - v)\dfrac{\partial\phi}{\partial x} + \tfrac12\sigma^2\dfrac{\partial^2\phi}{\partial x^2}\Big\} = 0$.
If we substitute
(2) $\phi(s,x) = e^{-\rho s}\dfrac1\rho\Big(1 - \exp\Big(-\sqrt{\tfrac{2\rho}{\sigma^2}}\,x\Big)\Big)$
then (1) gets the form
(3) $\sup_{v\in[0,1]}\Big\{-1 + \sqrt{\tfrac{2}{\rho\sigma^2}}\exp\Big(-\sqrt{\tfrac{2\rho}{\sigma^2}}\,x\Big) + v\Big(1 - \sqrt{\tfrac{2}{\rho\sigma^2}}\exp\Big(-\sqrt{\tfrac{2\rho}{\sigma^2}}\,x\Big)\Big)\Big\} = 0$.
If $\rho \ge \frac{2}{\sigma^2}$ then $1 - \sqrt{\frac{2}{\rho\sigma^2}}\exp\big(-\sqrt{\frac{2\rho}{\sigma^2}}\,x\big) \ge 0$ for all $x$, and hence the supremum in (3) is attained for $v = u^*(s,x) = 1$. Moreover, we see that the corresponding supremum in (3) is $0$. Hence by Theorem 11.2.2 we conclude that $\Phi = \phi$ and $u_t^* = 1$.

12.6. a) no arbitrage
b) no arbitrage
c) $\theta(t) = (a, 1, 1)$ is an arbitrage, where $a \in \mathbb{R}$ is chosen such that $V_\theta(0) = 0$
d) no arbitrage
e) arbitrages exist
f) no arbitrage.

12.7. a) complete
b) not complete. For example, the claim
$$F(\omega) = \int_0^T B_3(t)\,dB_3(t) = \tfrac12B_3^2(T) - \tfrac12T$$
cannot be hedged.
c) (arbitrages exist)
d) not complete
e) (arbitrages exist)
f) complete.
12.9. For the $n$-dimensional Brownian motion $B(t) = (B_1(t), \ldots, B_n(t))$ the representation formula (12.3.33) gets the form
$$h(B(T)) = E[h(B(T))] + \int_0^T \sum_{j=1}^n \frac{\partial}{\partial z_j}E^z\big[h(B(T-t))\big]_{z=B(t)}\,dB_j(t).$$
(i) If $h(x) = x^2 = x_1^2 + \cdots + x_n^2$ we get
$$E^z[h(B(T-t))] = E[(B^z(T-t))^2] = z^2 + n(T-t).$$
Hence
$$\frac{\partial}{\partial z_j}E^z[h(B(T-t))] = 2z_j$$
and we conclude that
$$B^2(T) = x^2 + nT + \int_0^T \sum_{j=1}^n 2B_j(t)\,dB_j(t),\qquad\text{with } B(0) = x.$$
(ii) If $h(x) = \exp(x_1 + \cdots + x_n)$ we get
$$E^z[h(B(T-t))] = E\big[\exp\big(z_1 + B_1(T-t) + \cdots + z_n + B_n(T-t)\big)\big] = \exp\Big(\frac n2(T-t) + \sum_{i=1}^n z_i\Big).$$
Hence
$$\frac{\partial}{\partial z_j}E^z[h(B(T-t))] = \exp\Big(\frac n2(T-t) + \sum_{i=1}^n z_i\Big)$$
and we conclude that if $B(0) = x$ then
$$\exp\Big(\sum_{i=1}^n B_i(T)\Big) = \exp\Big(\frac n2T + \sum_{i=1}^n x_i\Big) + \int_0^T \exp\Big(\frac n2(T-t) + \sum_{i=1}^n B_i(t)\Big)\big(dB_1(t) + \cdots + dB_n(t)\big).$$
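The identity $B^2(T) = |B(0)|^2 + nT + \int_0^T\sum_j 2B_j\,dB_j$ from (i) holds pathwise, so it can be checked on a single simulated path, approximating the Itô integral by a left-endpoint Riemann sum. The sketch below is our own check (not from the book; step count and seed are arbitrary):

```python
import math
import random

def representation_check(n=2, T=1.0, steps=4000, seed=7):
    """Pathwise check of |B(T)|^2 = nT + int_0^T sum_j 2 B_j dB_j for
    n-dimensional Brownian motion started at 0, with the Ito integral
    approximated by a left-endpoint Riemann sum."""
    rng = random.Random(seed)
    dt = T / steps
    sd = math.sqrt(dt)
    b = [0.0] * n
    integral = 0.0
    for _ in range(steps):
        for j in range(n):
            db = rng.gauss(0.0, sd)
            integral += 2 * b[j] * db   # evaluate the integrand BEFORE the increment
            b[j] += db
    lhs = sum(c * c for c in b)
    rhs = n * T + integral
    return lhs, rhs

print(representation_check())
```

The discrepancy between the two sides is exactly the fluctuation of $\sum (\Delta B)^2$ around its mean, which vanishes as the mesh is refined (compare Exercise 2.17).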
12.12. c) $E_Q[\xi(T)F] = \sigma^{-1}x_1\big(1 - \frac\alpha\rho\big)(1 - e^{-\rho T})$. The replicating portfolio is $\theta(t) = (\theta_0(t), \theta_1(t))$, where
$$\theta_1(t) = \sigma^{-1}\Big(1 - \Big(1 - \frac\alpha\rho\Big)e^{\rho(t-T)}\Big)$$
and $\theta_0(t)$ is determined by (12.1.17).

12.15.
$$\Phi(s,x) = \begin{cases} e^{-\rho s}(K - x) & \text{for } x \le x^*\\[4pt] e^{-\rho s}(K - x^*)\big(\frac{x}{x^*}\big)^\gamma & \text{for } x > x^*\end{cases}$$
where
$$\gamma = \beta^{-2}\Big(\tfrac12\beta^2 - \alpha - \sqrt{\big(\tfrac12\beta^2 - \alpha\big)^2 + 2\rho\beta^2}\Big)$$
and
$$x^* = \frac{K\gamma}{\gamma - 1} \in (0, K).$$
Hence it is optimal to stop the first time $X(t) \le x^*$.
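Since the square root in the formula for $\gamma$ exceeds $|\tfrac12\beta^2 - \alpha|$, one always has $\gamma < 0$, and therefore $x^* = K\gamma/(\gamma-1) \in (0, K)$ as claimed. A short numerical illustration (our own, not from the book; the parameter values are arbitrary):

```python
import math

def put_threshold(alpha, beta, rho, K):
    """gamma and x* from the solution of 12.15:
    gamma = beta^{-2}(beta^2/2 - alpha - sqrt((beta^2/2 - alpha)^2 + 2 rho beta^2)),
    x*    = K gamma / (gamma - 1), which lies in (0, K) since gamma < 0."""
    half = 0.5 * beta**2 - alpha
    gamma = (half - math.sqrt(half**2 + 2 * rho * beta**2)) / beta**2
    return gamma, K * gamma / (gamma - 1)

gamma, x_star = put_threshold(alpha=0.05, beta=0.3, rho=0.1, K=10.0)
print(gamma, x_star)
```

Raising the volatility $\beta$ pushes $\gamma$ toward $0$ and hence $x^*$ toward $0$: with more uncertainty it pays to keep the perpetual put alive longer before exercising.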
List of Frequently Used Notation and Symbols

$\mathcal{B}$: the Borel σ-algebra
$\mathcal{F}_t^{(m)}$: the σ-algebra generated by $\{B_s;\ s \le t\}$, $B_s$ $m$-dimensional
$\mathcal{F}_\tau$: the σ-algebra generated by $\{B_{s\wedge\tau};\ s \ge 0\}$ ($\tau$ is a stopping time)
$\perp$: orthogonal to (in a Hilbert space)
$\mathcal{M}_t$: the σ-algebra generated by $\{X_s;\ s \le t\}$ ($X_t$ is an Itô diffusion)
$\mathcal{M}_\tau$: the σ-algebra generated by $\{X_{s\wedge\tau};\ s \ge 0\}$ ($\tau$ is a stopping time)
$\partial G$: the boundary of the set $G$
$\bar G$: the closure of the set $G$
$G^c$: the complement of the set $G$
$G \subset\subset H$: $\bar G$ is compact and $\bar G \subset H$
$\operatorname{dist}(y, K)$: the distance from the point $y \in \mathbb{R}^n$ to the set $K \subset \mathbb{R}^n$
$\tau_G$: the first exit time from the set $G$ of a process $X_t$: $\tau_G = \inf\{t > 0;\ X_t \notin G\}$
$\mathcal{V}(S,T)$: Definition 3.3.1
$\mathcal{W}(S,T)$: Definition 3.3.2
(H): Hunt's condition (Chapter 9)
HJB: the Hamilton-Jacobi-Bellman equation (Chapter 11)
$I_n$: the $n \times n$ identity matrix
$\mathcal{X}_G$: the indicator function of the set $G$: $\mathcal{X}_G(x) = 1$ if $x \in G$, $\mathcal{X}_G(x) = 0$ if $x \notin G$
$P^x$: the probability law of $B_t$ starting at $x$
$P = P^0$: the probability law of $B_t$ starting at $0$
$Q^x$: the probability law of $X_t$ starting at $x$ ($X_0 = x$)
$R^{(s,x)}$: the probability law of $Y_t = (s+t, X_t^x)$, $t \ge 0$, with $Y_0 = (s,x)$ (Chapter 10)
$Q^{s,x}$: the probability law of $Y_t = (s+t, X_{s+t}^{s,x})$, $t \ge 0$, with $Y_0 = (s,x)$ (Chapter 11)
$P \ll Q$: the measure $P$ is absolutely continuous w.r.t. the measure $Q$
$P \sim Q$: $P$ is equivalent to $Q$, i.e. $P \ll Q$ and $Q \ll P$
$E^x$, $E^{(s,x)}$, $E^{s,x}$: the expectation operator w.r.t. the measures $Q^x$, $R^{(s,x)}$ and $Q^{s,x}$, respectively
$E_Q$: the expectation w.r.t. the measure $Q$
$E$: the expectation w.r.t. a measure which is clear from the context (usually $P^0$)
$s \wedge t$: the minimum of $s$ and $t$ ($= \min(s,t)$)
$s \vee t$: the maximum of $s$ and $t$ ($= \max(s,t)$)
$\sigma^T$: the transposed of the matrix $\sigma$
$\delta_x$: the unit point mass at $x$
$\delta_{ij}$: $\delta_{ij} = 1$ if $i = j$, $\delta_{ij} = 0$ if $i \neq j$
$\theta_t$: the shift operator: $\theta_t(f(X_s)) = f(X_{t+s})$ (Chapter 7)
$\theta(t)$: portfolio (see (12.1.3))
$V^\theta(t)$: $= \theta(t)\cdot X(t)$, the value process (see (12.1.4))
$V_z^\theta(t)$: $= z + \int_0^t \theta(s)\,dX(s)$, the value generated at time $t$ by the self-financing portfolio $\theta$ if the initial value is $z$ (see (12.1.7))
$\bar X(t)$: the normalized price vector (see (12.1.8)-(12.1.11))
$\xi(t)$: the discounting factor (see (12.1.9))
$:=$: equal to by definition
$\varlimsup$, $\varliminf$: the same as $\limsup$, $\liminf$
$\operatorname{ess\,inf} f$: $\sup\{M \in \mathbb{R};\ f \ge M \text{ a.s.}\}$
$\operatorname{ess\,sup} f$: $\inf\{N \in \mathbb{R};\ f \le N \text{ a.s.}\}$
$\square$: end of proof

"Increasing" is used with the same meaning as "nondecreasing", "decreasing" with the same meaning as "nonincreasing". In the strict cases "strictly increasing/strictly decreasing" are used.
Index

adapted process 25
adjoint operator 172
admissible control 245
admissible portfolio 273
σ-algebra 7
  generated by a family of sets 8
  generated by a random variable 8, 25
almost surely (a.s.) 8
American call option 312
American contingent T-claim 299
American options 299–308
American put option 305–308
American put option, perpetual 312
analytic functions (and Brownian motion) 81, 161
approximate identity 28
arbitrage 273
asymptotic estimation 108
attainable claim 282
Banach spaces 10
bankruptcy time 267, 268
Bayes' rule 163 (8.6.3)
Bellman principle 263
bequest function 244
Bessel process 49, 150
Black-Scholes equation 210–211
Black-Scholes formula 5, 173, 211, 297, 311
Borel-Cantelli lemma 17, 33
Borel sets, Borel σ-algebra 8
borrowing 255
Brownian bridge 79
Brownian motion
  complex 80
  on the ellipse 77
  the graph of 126
  on a Riemannian manifold 160
  in R^n 3, 12–15
  on the unit circle 69, 130
  on the unit sphere 159–160
  w.r.t. an increasing family H_t of σ-algebras 74
capacity 176
carrying capacity 81
Cauchy sequence (in L^p) 20
change of time 156
change of variable in an Itô integral 158
characteristic function 316
characteristic operator 129
Chebychev's inequality 16
closed loop control 245
coincide in law 151, 152
combined Dirichlet-Poisson problem 181, 200
complete market 282
complete normed linear space 10, 20
complete probability space 8
complex Brownian motion 80
conditional expectation 319–320
conditioned Brownian motion 137
contingent T-claim (American) 299
contingent T-claim (European) 282
continuation region 219, 306
continuous in mean square 41
B. Øksendal, Stochastic Differential Equations, Universitext, DOI 10.1007/978-3-642-14394-6, © Springer-Verlag Berlin Heidelberg 2013
376
Index
control, deterministic (open loop) 245 control, feedback (closed loop) 245 control, Markov 245 control, optimal 245 convolution 328 covariance matrix 13, 316 cross-variation processes 163 crowded environment 81 density (of a random variable) 16 deterministic control 245 diffusion coefficient 115 diffusion, Dynkin 130 diffusion, Itˆ o 115 Dirichlet-Poisson problem 181–185, 200 Dirichlet problem 3, 185, 187 generalized 192 stochastic version 187 distribution of a process 11, 18 of a random variable 9 distribution function (of a random variable) 15 Doob-Dynkin lemma 9, 41, 50 Doob-Meyer decomposition 303 Doob’s h-transform 136 drift coefficient 115 Dudley’s theorem 275 Dynkin’s formula 127 eigenvalues (of the Laplacian) 204 electric circuit 1 elementary function/process 26 elliptic partial differential operator 181 equivalent (local) martingale measure 168, 178, 276, 281 estimate (linear/measurable) 91 estimation exact asymptotic 108 of a parameter 104 European call option 5, 288, 311, 313 European contingent T -claim 282 European option 287 European put option 288 events 8 excessive function 217 expectation 9
explosion (of a diffusion) 71, 82 exponential martingale 55 feedback control 245 Feller-continuity 71, 143 Feynman-Kac formula 145, 208 filtering problem, general 2, 86–87 linear 2, 87–108 filtration 31, 38 finite-dimensional distributions (of a stochastic process) 11 first exit distribution (harmonic measure) 123, 138 first exit time 119 Gaussian process 13 generalized (distribution valued) process 22 generator (of an Itˆ o diffusion) 124, 126 geometric Brownian motion 67 geometric mean reversion process 83 Girsanov’s theorem 61, 162–171 Girsanov transformation 165 Green formula 202 function 176, 201–203 measure 19, 201, 259 operator 176 Gronwall inequality 72, 82–83 Hamilton-Jacobi-Bellman (HJB) equation 245–250 harmonic extension (w.r.t. an Itˆ o diffusion) 131 harmonic function 3, 133 (and Brownian motion) 161 harmonic function 3, 133 and Brownian motion 161 w.r.t. a diffusion 186–187 harmonic measure and Brownian motion 133, 208 of a diffusion 123, 138 Hausdorff measure 176 heat equation 184 operator 127 hedging portfolio 282, 290–295 Hermite polynomials 38
high contact (smooth fit) principle 230, 232, 239, 240
Hilbert space 10
hitting distribution 122, 123
Ht-Brownian motion 74
h-transform (of Brownian motion) 136
Hunt's condition (H) 192
independent 10
independent increments 14, 22
innovation process 92, 96
integration by parts (stochastic) 46, 55
interpolation (smoothing) 110
irregular point 190–192, 206
iterated Itô integrals 38
iterated logarithm (law of) 68
Itô diffusion 115, 116
Itô integral 24–37
  multidimensional 34–35
Itô interpretation (of a stochastic differential equation) 36, 66, 85
Itô isometry 26, 29
Itô process 44, 48
Itô representation theorem 52
Itô's formula 44, 49
Jensen inequality 321
Kalman-Bucy filter 2, 87, 101, 107, 155
Kazamaki condition 56
Kelly criterion 256
kernel function 136
killing (a diffusion) 147
killing rate 148, 177
Kolmogorov's backward equation 141
Kolmogorov's continuity theorem 14
Kolmogorov's extension theorem 11
Kolmogorov's forward equation 172
Langevin equation 77
Laplace-Beltrami operator 161
Laplace operator Δ 3, 57, 127, 133
Laplace transform 139
law of iterated logarithm 68
least superharmonic majorant 216
least supermeanvalued majorant 216
Lévy's characterization of Brownian motion 162
Lévy's theorem 162
linear regulator problem 252, 266
Lipschitz continuous 116, 150
Lipschitz surface 234, 327
local martingale 35, 135, 168
local time 58, 59, 75, 155
Lp-spaces 9
Lyapunov equation 109
Malliavin derivative 53
market 269
  complete 282, 283
  normalized 270, 271
Markov control 245
Markov process 118
Markov property 117
martingale 31, 33, 323
  convergence theorem 324
  inequality 31, 33
  local 35, 135
  problem 149
  representation theorem 49, 53
maximum likelihood 105
maximum principle 206
mean (of a random variable) 13
mean-reverting Ornstein-Uhlenbeck process 78
mean square error 99
mean value property
  classical 133
  for a diffusion 123
measurable function (w.r.t. a σ-algebra) 8
measurable sets (w.r.t. a σ-algebra) 8
measurable space 8
moving average, exponentially weighted 103
noise 1–4, 21–22, 66
normal distribution 13, 315–319
normalization (of a market process) 271
numeraire 271
observation process 86, 88
open loop control 245
optimal control 245
optimal investment time 241, 242
optimal performance 245
optimal portfolio selection 4, 254
optimal resource extraction 240
optimal stopping 3, 213–237, 304
optimal stopping existence theorem 219
optimal stopping time 213, 219, 223, 234
optimal stopping uniqueness theorem 223
option pricing 4, 211, 287–307
Ornstein-Uhlenbeck equation/process 77
orthogonal increments 89
outer measure 8
partial information control 245
path (of a stochastic process) 10
performance function 244
Perron-Wiener-Brelot solution 196
Poisson formula 207
Poisson kernel 207
Poisson problem 185, 197
Poisson problem (generalized) 197
Poisson problem (stochastic version) 187
polar set 175, 192
Polish space 13
population growth 1, 65, 81, 105, 139
portfolio 4, 270, 282, 290
prediction 110
probability measure 8
probability space 8
profit rate 212, 244
p'th variation process 19
quadratic variation process 19, 57
random time change 156
random variable 9
recurrent 128
regular point 190–192, 206
replicating portfolio 282, 290
resolvent operator 143
reward function 213
reward rate function 214
Riccati equation 99, 102, 106, 253
risky assets 255, 267
risk free assets 255, 267
scaling (Brownian) 19
self-financing portfolio 271
semi-elliptic partial differential operator 181
semi-polar set 192
separation principle 245, 254
shift operator 122
shortselling 255, 299, 313
smoothing (interpolation) 110
Snell envelope 303
square root (of a matrix) 181
stationary process 18, 21, 22
stochastic control 4, 243–261
stochastic differential equation
  definition 65
  existence and uniqueness of solution 70
  weak and strong solution 74–76
stochastic Dirichlet problem 187
stochastic integral 44
stochastic Poisson problem 197
stochastic process 10
stopping time 58, 119
Stratonovich integral 24, 35–37, 39, 40
Stratonovich interpretation (of a stochastic differential equation) 36, 66, 85
strong Feller process 194–195
strong Markov property 118–122
strong solution (of a stochastic differential equation) 74
strong uniqueness (of a stochastic differential equation) 70, 74
submartingale 326
superharmonic function 216, 263
superharmonic majorant 216
supermartingale (135), 216, 276, 303, 324
supermeanvalued function 216
supermeanvalued majorant 216
superreplicate 303
support (of a diffusion) 111
Tanaka's equation 75
Tanaka's formula 58, 59, 75, 155
terminal conditions (in stochastic control) 259–261, 266
thin set 192
time change formula
  Itô integrals 158
time-homogeneous 117, 261
total variation process 19
transient 128
transition
  measure 201
  operator 178
trap 129
uncorrelated 317
uniformly elliptic partial differential operator 193, 291
uniformly integrable 135, 323–324
utility function 4, 255
value function 245
value process 270
value process, normalized 271
variational inequalities (and optimal stopping) 3, 232–234
version (of a process) 12, 14, 32
Volterra equation
  deterministic 95
  stochastic 79
weak solution (of a stochastic differential equation) 74
weak uniqueness 74
well posed (martingale problem) 149
white noise 22, 65
Wiener criterion 192
X-harmonic 187
zero-one law 189