Stochastic Economic Dynamics [1 ed.] 9788763099820, 9788763001854

This book analyzes stochastic dynamic systems across a broad spectrum in economics and finance. The major unifying theme


Bjarne S. Jensen
Tapio Palokangas


Stochastic Economic Dynamics

Bjarne S. Jensen & Tapio Palokangas (Editors)

Stochastic Economic Dynamics

Copenhagen Business School Press

Stochastic Economic Dynamics
© Copenhagen Business School Press, 2007
Printed in Denmark by Narayana Press, Gylling
Cover design by BUSTO│Graphic Design
First edition 2007
e-ISBN 978-87-630-9982-0

Distribution:

Scandinavia
DBK, Mimersvej 4
DK-4600 Køge, Denmark
Tel +45 3269 7788
Fax +45 3269 7789

North America
International Specialized Book Services
920 NE 58th Ave., Suite 300
Portland, OR 97213, USA
Tel +1 800 944 6190
Fax +1 503 280 8832
Email: [email protected]

Rest of the World
Marston Book Services, P.O. Box 269
Abingdon, Oxfordshire, OX14 4YN, UK
Tel +44 (0) 1235 465500, fax +44 (0) 1235 465655
E-mail: [email protected]

All rights reserved. No part of this publication may be reproduced or used in any form or by any means graphic, electronic or mechanical including photocopying, recording, taping or information storage or retrieval systems - without permission in writing from Copenhagen Business School Press at www.cbspress.dk

Contents in Brief

Introduction

Part I: Developments in Stochastic Dynamics
1. Fractional Brownian Motion in Finance
2. Moment Evolution of Gaussian and Geometric Wiener Diffusions
3. Two-Dimensional Linear Dynamic Systems with Small Random Terms
4. Dynamic Theory of Stochastic Movement of Systems

Part II: Stochastic Dynamics in Basic Growth Models and Time Delays
5. Stochastic One-Sector and Two-Sector Growth Models in Continuous Time
6. Comparative Dynamics in a Stochastic Growth and Trade Model with a Variable Savings Rate
7. Inada Conditions and Global Dynamic Analysis of Basic Growth Models with Time Delays
8. Hopf Bifurcation in Growth Models with Time Delays

Part III: Intertemporal Optimization in Consumption, Finance, and Growth
9. Optimal Consumption and Investment Strategies in Dynamic Stochastic Economies
10. Differential Systems in Finance and Life Insurance
11. Uncertain Technological Change and Capital Mobility
12. Stochastic Control, Non-Depletion of Renewable Resources, and Intertemporal Substitution
13. Capital Accumulation in a Growth Model with Creative Destruction
14. Employment Cycles in a Growth Model with Creative Destruction

Table of Contents

Introduction
Bjarne S. Jensen and Tapio Palokangas

Part I: Developments in Stochastic Dynamics

1. Fractional Brownian Motion in Finance
Bernt Øksendal
1.1 Introduction
1.2 Framework and definitions
1.3 Classical white noise theory and Hida-Malliavin calculus
1.4 Fractional stochastic calculus
1.5 Summary of results
1.6 Concluding remarks

2. Moment Evolution of Gaussian and Geometric Wiener Diffusions
Bjarne S. Jensen, Chunyan Wang, and Jon Johnsen
2.1 Introduction
2.2 Structure of basic diffusion processes
2.3 Dynamics of first-order and second-order moments
2.4 Expectation vector functions
2.5 Covariance matrix functions
2.6 Probability density functions
2.7 Final comments
Appendices

3. Two-Dimensional Linear Dynamic Systems with Small Random Terms
Nishioka Kunio
3.1 Introduction
3.2 Non-random dynamic system
3.3 Lyapunov index of the random system
3.4 One-dimensional diffusion process in an interval
3.5 Spiral point and center
3.6 Saddle point
3.7 Improper and proper node

4. Dynamic Theory of Stochastic Movement of Systems
Masao Nagasawa
4.1 Dynamic theory of stochastic processes
4.2 Kinematic theory
4.3 Sample path equation in kinematic theory
4.4 Mechanics and the equation of motion
4.5 Evolution function and kinematic equation
4.6 Exponent of motion and initial condition
4.7 Examples
4.8 Schrödinger's wave theory and dynamic theory
4.9 Sample paths of motion governed by the Schrödinger equation
4.10 Interference phenomena and entangled motion

Part II: Stochastic Dynamics in Basic Growth Models and Time Delays

5. Stochastic One-Sector and Two-Sector Growth Models in Continuous Time
Bjarne S. Jensen and Martin Richter
5.1 Introduction
5.2 Neoclassical technologies and CES forms
5.3 Stochastic one-sector growth models
5.4 Boundaries, steady state, and convergence
5.5 Explicit steady-state distribution with CD technologies
5.6 Sample paths and asymptotic densities with CD and CES technologies
5.7 General equilibria of two-sector economies
5.8 Dynamics of two-sector economies
5.9 Sample paths of two-sector models and CES

6. Comparative Dynamics in a Stochastic Growth and Trade Model with a Variable Savings Rate
Zhu Hongliang and Huang Wenzao
6.1 Introduction
6.2 Stochastic dynamic systems for trading economies
6.3 Comparative dynamics and policy parameters

7. Inada Conditions and Global Dynamic Analysis of Basic Growth Models with Time Delays
Zhu Hongliang and Huang Wenzao
7.1 Introduction
7.2 Neoclassical growth model with a time delay
7.3 Dynamics with delays in production and depreciation
7.4 Persistent oscillation in a growth model with delay
7.5 Final comments

8. Hopf Bifurcation in Growth Models with Time Delays
Morten Brøns and Bjarne S. Jensen
8.1 Introduction
8.2 Dynamics of growth and cycles
8.3 Hopf bifurcation analysis
8.4 CD technologies and time delays
8.5 CES technologies and time delays
8.6 CES and delays with cycles, square waves, and chaos
8.7 Final comments

Part III: Intertemporal Optimization in Consumption, Finance, and Growth

9. Optimal Consumption and Investment Strategies in Dynamic Stochastic Economies
Claus Munk and Carsten Sørensen
9.1 Introduction
9.2 Consumption and investment in complete markets
9.3 Results for CRRA utility in general markets
9.4 Examples
9.5 Extensions
9.6 Concluding remarks
Appendix

10. Differential Systems in Finance and Life Insurance
Mogens Steffensen
10.1 Introduction
10.2 The differential equations of Thiele and Black-Scholes
10.3 Surplus and dividends
10.4 Intervention
10.5 Quadratic optimization
10.6 Utility optimization

11. Uncertain Technological Change and Capital Mobility
Paul A. de Hek
11.1 Introduction
11.2 Framework of the model
11.3 The effect of uncertainty on growth
11.4 Conclusion
Appendices

12. Stochastic Control, Non-Depletion of Renewable Resources, and Intertemporal Substitution
Nils Chr. Framstad
12.1 Introduction
12.2 The preferences
12.3 The optimal control problem
12.4 Non-optimality of immediate total depletion
12.5 Concluding remarks

13. Capital Accumulation in a Growth Model with Creative Destruction
Klaus Wälde
13.1 Introduction
13.2 Framework of the model
13.3 Solving the model
13.4 Cycles and growth
13.5 Conclusions
Appendices

14. Employment Cycles in a Growth Model with Creative Destruction
Tapio Palokangas
14.1 Introduction
14.2 Technology
14.3 R&D and capital accumulation
14.4 Capitalists
14.5 Wage settlement
14.6 Economic growth
14.7 Cycles
14.8 Conclusions

Introduction

Bjarne S. Jensen University of Southern Denmark and Copenhagen Business School Tapio Palokangas University of Helsinki and HECER

A unity of aim and a diversity of topics shaped the contents of this volume. Although each chapter can be read independently as a self-contained presentation, the book is more than a collection of the individual contributions. Difficult subjects are interrelated, juxtaposed, and examined for consistency in various disciplinary, theoretical, and empirical contexts. The major unifying theme of this joint work is the coherent and rigorous treatment of uncertainty and its implications for describing relevant stochastic processes through basic (prototype, core) models and differential equations of stochastic dynamics.

Part I: Developments in Stochastic Dynamics

1. Bernt Øksendal. Fractional Brownian Motion in Finance.

Stochastic processes in continuous time are here described by a fractional Brownian motion (a generalized Wiener process) in which the stochastic increments are not necessarily independent. The increments have a covariance function, for which the size of the Hurst parameter (H) is critically important. When H = 1/2, the fractional Brownian motion coincides with the classical Brownian motion. If H > 1/2, the stochastic increments have a positive autocorrelation (motion is persistent). If H < 1/2, increments have a negative autocorrelation (motion is antipersistent). This chapter gives a survey of the theory of stochastic calculus (integrals) with fractional Brownian motion and discusses the applications of fractional stochastic calculus to financial markets. Asset prices are described as solutions of stochastic differential equations that are driven by the generalized stochastic processes.

2. Bjarne S. Jensen, Chunyan Wang, and Jon Johnsen. Moment Evolution of Gaussian and Geometric Wiener Diffusions.

This chapter analyzes two basic stochastic models: the time-homogeneous Gaussian and the geometric Wiener diffusion of two-dimensional vector processes. Using the theory of stochastic processes and Itô's lemma, the probability distributions of the stochastic state vectors are described by the evolution of their moments (expectation vector and covariance matrix as functions of time). These moments satisfy certain systems of ordinary (deterministic) differential equations. By solving these ODEs, the authors present explicit solutions for the first-order and second-order moment functions. Kolmogorov's forward equation is used to derive the results by alternative methods and to gain information on the probability distributions. The general closed-form results for these moment evolutions, previously unavailable, have many applications in models of linear dynamics with uncertainty.

3. Nishioka Kunio. Two-Dimensional Linear Dynamic Systems with Small Random Terms.

Chapter 3 links up with chapter 2 and further studies the asymptotic behavior of the time paths of two-dimensional linear dynamic systems that are perturbed by small random terms. Economic growth is traditionally treated as a non-random dynamic system. If the system is linear and two-dimensional, it can be classified as one of five well-known types, according to its long-run (asymptotic) behavior. With uncertainty involved in economic growth, the asymptotic behavior of a system with random perturbations is important to investigate, for example, when analyzing steady-state properties of economic growth. If the random perturbations are small, the asymptotic behavior (time paths) of the linear stochastic system is the same as in non-random cases, unless the relevant dynamic system is a center or a proper node.

4. Masao Nagasawa. Dynamic Theory of Stochastic Movement of Systems.

The dynamic theory of stochastic movement of systems contains a general mathematical theory of random motion, consisting of two parts: stochastic kinematics and stochastic mechanics. The stochastic kinematics is analytically described by Kolmogorov's partial differential equation, which, with its drift coefficient and diffusion coefficient, uniquely characterizes the transition probability distribution when an initial distribution is prescribed. The stochastic mechanics contains the mechanical equation of motion, which, in addition to the Kolmogorov equation, includes a potential function of external forces. The potential function determines a so-called induced drift coefficient. This induced drift coefficient in turn enters a new kinematic (Kolmogorov) equation that fully describes the relevant transition probability density of the observed stochastic process. However, Kolmogorov equations are not easy to solve, except in some simple cases. Therefore, to analyze the stochastic processes, it is often better to use Itô's stochastic differential equations (SDE), and in solving them, we have the powerful tools of sample path analysis, in particular, Lévy's formula and Itô's formula.

The dynamic theory of stochastic motion (mechanics) is then applied to quantum mechanics (Schrödinger's complex "wave equation"). Sample paths in one and two dimensions of simple motions governed by Schrödinger's equation are illustrated. Finally, the methodology is applied to the Schrödinger equation with Coulomb potential to obtain the sample path of the electron in the hydrogen atom. Here the critical ("attractive") radius of the solved radial motions (sample paths) agrees with the classic Bohr radius expression for the "stationary states" of hydrogen. The conceptual existence of sample paths (stochastic trajectories) has been controversial (even denied) in quantum dynamics. Because sample paths and stochastic differential equations are the natural generalization of deterministic dynamics in economics, the chapter devotes considerable computational effort to demonstrating some particular sample paths for states of hydrogen motion that are governed by the universal Schrödinger equation.

Part II: Stochastic Dynamics in Basic Growth Models and Time Delays

5. Bjarne S. Jensen and Martin Richter. Stochastic One-Sector and Two-Sector Growth Models in Continuous Time.

This chapter extends the basic deterministic one-sector and two-sector growth models to a stochastic context in continuous time, using Wiener processes for the description of various sources of uncertainty in the growth rate of the labor force, the rate of capital depreciation, and the saving rate. The drift and diffusion coefficients of the stochastic dynamic systems are homogeneous of degree one in the two state variables, labor and capital, which allows a reduction to the one-dimensional stochastic dynamics of the capital-labor ratio. The crucial issue of absorbing boundaries for the stochastic growth models is rigorously examined, and simple criteria (sufficient conditions) for inaccessible boundaries are established, a subject that the literature has not yet adequately addressed. The steady-state probability distribution of the capital-labor ratios is derived from Kolmogorov's forward equation. For stochastic one-sector and two-sector growth models, the sample paths of particular state variables (during the transition to steady states or under persistent, endogenous growth) are simulated for many parametric specifications of the CD and CES sector technologies involved. The impacts of technology shocks are similarly demonstrated. All relevant sample paths are exhibited on both shorter and longer time horizons.

6. Zhu Hongliang and Huang Wenzao. Comparative Dynamics in a Stochastic Growth and Trade Model with a Variable Savings Rate.

The authors consider neoclassical two-sector growth models of a small country that is trading in both commodities in a stochastic environment in continuous time, and they use a saving function for which the rate of saving depends on the capital-labor ratio and a policy parameter. The global comparative dynamic properties of the capital accumulation process are studied with respect to changes in the policy parameter. By characterizing the entire time path of the capital accumulation process, the effect of the policy parameter can be determined. The time path of the capital-labor ratio satisfies a monotonicity property if the saving function changes monotonically with respect to a policy parameter. In addition, the impact of the policy parameter on the steady-state distribution of the capital-labor ratio is analyzed.

7. Zhu Hongliang and Huang Wenzao. Inada Conditions and Global Dynamic Analysis of Basic Growth Models with Time Delays.

In economics, time delays are often neglected in continuous time dynamics, no doubt due to the difficulty in solving and analyzing such models. Nevertheless, economic development depends not only on the current state, but also on past states (history), so delay phenomena also influence the dynamic characteristics of economic systems. This chapter introduces time delays in a particular neoclassical growth model. The global conditions for steady-state stability and/or persistent oscillation around a steady state are obtained and analyzed. It is shown that oscillations in growing economies are not rare, but common.

8. Morten Brøns and Bjarne S. Jensen. Hopf Bifurcation in Growth Models with Time Delays.

Chapter 8 complements the global analysis of particular difference-differential equations in chapter 7 by performing a non-linear local analysis of the dynamics of the delay model when the size of the time delay is close to a critical value. For this critical value, a Hopf bifurcation (of a fixed point into a closed orbit in the neighborhood of the equilibrium) occurs, that is, periodic solutions ("limit cycles") are created when the steady-state solution of the capital-labor ratio loses its local stability. Analytical criteria are derived to determine the stability types (supercritical or subcritical) of the periodic solutions (limit cycles). Finally, it is shown that the delay model with CES production functions can exhibit dynamics with solutions which have been observed in other delay-differential equations: square waves and chaos (aperiodic waves/cycles). Simulations illustrate the analytical results and theorems.

Part III: Intertemporal Optimization in Consumption, Finance, and Growth

9. Claus Munk and Carsten Sørensen. Optimal Consumption and Investment Strategies in Dynamic Stochastic Economies.

The authors derive optimal consumption and investment strategies of an investor with a CRRA utility of consumption and terminal wealth and with access to trade in a complete, but otherwise very general, financial market. Interest rates, excess expected returns, price volatilities, correlations, and consumer prices may all evolve stochastically over time, even with non-Markovian dynamics. The risks that individuals want to hedge are shown, as well as how to finance a desired real consumption process by investing in a market of nominal securities. The general results are extended to the case of a HARA utility and power-linear habit utility. The chapter also discusses how labor income and undiversifiable shocks should be included in the consumer price index. In the special case where real interest rates are Gaussian and real market prices of risk are deterministic, the chapter shows that CRRA investors hedge with a single real bond: with a utility of terminal wealth, a real zero-coupon bond maturing at the horizon; with a utility of intermediate consumption, a bond with continuous coupons proportional to the expected future real consumption rate under the forward martingale measure. The results are illustrated by two examples: (i) non-Markovian HJM term structure dynamics, (ii) stochastic volatility and excess returns in the stock market.

10. Mogens Steffensen. Differential Systems in Finance and Life Insurance.

Financial and life insurance mathematics share a common problem of valuation of future payment streams. However, the valuation principles (no arbitrage and diversification) differ because risks differ. In both financial and life insurance mathematics, the valuation problem reduces to calculating conditional expected values and the extrema of these values. If the risk process is Markovian, expected values can be characterized by solutions to systems of deterministic differential equations. Deterministic differential systems also appear in financial and life insurance decision-making. They characterize optimal expected values of future utility and optimal decisions. For both valuation and optimization, we derive some classical examples from finance and life insurance and generalize to situations that are relevant in both fields. We study valuation with participation and early exercise options, applications of the linear regulator, and generalized consumption problems. The collection of results and proofs demonstrates both the similarities and the small but important differences in the various problems.

11. Paul A. de Hek. Uncertain Technological Change and Capital Mobility.

Unpredictable variations in economic productivity may have a positive or negative effect on the average growth rate of output. This theoretical ambiguity result is not solely determined by the value of the elasticity of intertemporal substitution. The growth-uncertainty relationship depends on two factors: whether returns to scale in knowledge creation are increasing or non-increasing, and whether the elasticity of intertemporal substitution (of profits) is higher or lower than some critical value. Empirical studies concerning these two factors indicate that unpredictable variations in economic productivity have a negative effect on the average long-run growth rate.

12. Nils Chr. Framstad. Stochastic Control, Non-Depletion of Renewable Resources, and Intertemporal Substitution.

For a wide class of models concerning the optimal extraction of a renewable resource, it is well known that an expected profit maximizer with an infinite horizon does not deplete the resource completely if its relative growth rate is strictly greater than the discount rate. This principle is extended to preferences that have intertemporal substitution in direct utility rates and that exhibit risk aversion (or risk neutrality) sufficiently close to zero. For a CRRA utility, the effect of intertemporal substitution is seen more clearly. The model in this chapter is an Itô process driven by semi-martingales.

13. Klaus Wälde. Capital Accumulation in a Growth Model with Creative Destruction.

Capital accumulation and creative destruction are modeled together with risk-averse households. The novel aspect, risk-averse households, allows the use of well-known models not only for analyzing long-run growth as in the literature, but also short-run fluctuations. The model remains analytically tractable because of a very convenient property of household investment decisions in this stochastic setup.

14. Tapio Palokangas. Employment Cycles in a Growth Model with Creative Destruction.

This chapter constructs a model that explains economic growth with fluctuations in output and employment. The particular features of the model are the following. There is creative destruction in the sense that a new technology renders an old technology obsolete. There are efficiency wages in R&D. In production, there is union-employer bargaining over wages. The firms can increase the probability of a technological change for themselves by R&D. Learning-by-investment increases the productivity of labor in the consumption-good sector in proportion to the expected accumulation of capital. The main results are: In the long run, the economy follows a balanced-growth path that satisfies Kaldor's stylized facts. Wages in production grow on average in proportion to the level of productivity. The economy generates cycles around this long-run equilibrium. Capital stock swings up and down due to endogenous technological shocks. Because union-firm bargaining keeps real wages in proportion to the level of productivity, the labor-capital ratio is fixed and employment swings in proportion to capital stock. Thus, a stationary state equilibrium is characterized by involuntary unemployment, employment cycles and stable real wages in production.

Part I: Developments in Stochastic Dynamics

Chapter 1 Fractional Brownian Motion in Finance

Bernt Øksendal Center of Mathematics for Applications (CMA) Department of Mathematics, University of Oslo, and Norwegian School of Economics and Business Administration

1.1 Introduction

How can we model (as a function of time)

(i) the levels of a river?
(ii) the characteristics of solar activity?
(iii) the widths of consecutive annual rings of a tree?
(iv) the outdoor temperature at a given point?
(v) the values of the log returns $h_n$, defined by

\[ h_n = \log \frac{S(t_n)}{S(t_{n-1})}, \]

where $S(t)$ is the observed price at time $t$ of a given stock?

And how can we model

(vi) the turbulence in an incompressible fluid flow?
(vii) the electricity price in a liberalized electricity market?
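To make the log returns $h_n$ of example (v) concrete, here is a minimal Python sketch. The function name `log_returns` and the price series are hypothetical and chosen only for illustration; they do not come from the chapter.

```python
import math


def log_returns(prices):
    """Return h_n = log(S(t_n) / S(t_{n-1})) for consecutive observed prices."""
    if any(p <= 0 for p in prices):
        raise ValueError("prices must be strictly positive")
    return [math.log(later / earlier) for earlier, later in zip(prices, prices[1:])]


# Hypothetical observed prices S(t_0), ..., S(t_3); not data from the chapter.
prices = [100.0, 102.0, 101.0, 105.0]
h = log_returns(prices)
print(h)
```

Log returns add up over time: the sum of the $h_n$ equals the log of the ratio of the last price to the first, which is one reason they are a convenient quantity to model.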

The answer in all these cases is: by using a fractional Brownian motion! Examples (i)–(iii), (v) and (vi) are taken from Shiryaev (1999), example (iv) is from Brody, Syroka and Zervos (2002), and example (vii) is from Simonsen (2003). This wide range of potential applications makes fractional Brownian motion an interesting object to study.

1.2 Framework and definitions

Fractional Brownian motion is defined as follows:

Definition 1. Let $H \in (0, 1)$ be a constant. The (1-parameter) fractional Brownian motion (fBm) with Hurst parameter $H$ is the Gaussian process $B_H(t) = B_H(t, \omega)$, $t \in \mathbb{R}$, $\omega \in \Omega$, satisfying

\[ B_H(0) = E[B_H(t)] = 0 \quad \text{for all } t \in \mathbb{R} \tag{1} \]

and

\[ E[B_H(s) B_H(t)] = \tfrac{1}{2} \left\{ |s|^{2H} + |t|^{2H} - |s - t|^{2H} \right\}; \quad s, t \in \mathbb{R}. \tag{2} \]

Here $E$ denotes the expectation with respect to the probability law $P$ for $\{B_H(t)\}_{t \in \mathbb{R}} = \{B_H(t, \omega);\ t \in \mathbb{R},\ \omega \in \Omega\}$, where $(\Omega, \mathcal{F})$ is a measurable space.

If $H = \tfrac{1}{2}$ then $B_H(t)$ coincides with the classical Brownian motion, denoted by $B(t)$. If $H > \tfrac{1}{2}$ then $B_H(t)$ is persistent, in the sense that

\[ \rho_n := E[B_H(1) \cdot (B_H(n+1) - B_H(n))] > 0 \quad \text{for all } n = 1, 2, \ldots \tag{3} \]

and

\[ \sum_{n=1}^{\infty} \rho_n = \infty. \tag{4} \]
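The sign of $\rho_n$ can be read off directly from the covariance (2), since $\rho_n = E[B_H(1)B_H(n+1)] - E[B_H(1)B_H(n)]$. The sketch below evaluates this for a few Hurst parameters; the helper names `fbm_cov` and `rho` are illustrative assumptions, not notation from the chapter.

```python
def fbm_cov(s, t, hurst):
    """Covariance E[B_H(s) B_H(t)] from formula (2)."""
    return 0.5 * (abs(s) ** (2 * hurst) + abs(t) ** (2 * hurst) - abs(s - t) ** (2 * hurst))


def rho(n, hurst):
    """rho_n = E[B_H(1) (B_H(n+1) - B_H(n))], expanded by linearity of E."""
    return fbm_cov(1.0, n + 1.0, hurst) - fbm_cov(1.0, float(n), hurst)


for hurst in (0.3, 0.5, 0.7):
    values = [rho(n, hurst) for n in range(1, 6)]
    print(hurst, [round(v, 4) for v in values])
```

Expanding (2) gives $\rho_n = \tfrac{1}{2}\{(n+1)^{2H} - 2n^{2H} + (n-1)^{2H}\}$, a second difference of $x \mapsto x^{2H}$. That function is convex for $H > \tfrac{1}{2}$, linear for $H = \tfrac{1}{2}$ and concave for $H < \tfrac{1}{2}$, which accounts for the positive, zero and negative values printed above.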

If $H < \tfrac{1}{2}$ then $B_H(t)$ is anti-persistent, in the sense that

\[ \rho_n < 0 \quad \text{for all } n = 1, 2, \ldots \tag{5} \]

[...] $H > \tfrac{1}{2}$ in (i)–(v) and with $H < \tfrac{1}{2}$ in (vi) and (vii).

Another important property of fBm is self-similarity: For any $H \in (0, 1)$ and $\alpha > 0$ the law of $\{B_H(\alpha t)\}_{t \in \mathbb{R}}$ is the same as the law of $\{\alpha^H B_H(t)\}_{t \in \mathbb{R}}$.

In order to be able to apply fBm to study the situations above we need a stochastic calculus for fBm. However, if $H \neq \tfrac{1}{2}$ then $B_H(t)$ is not a semimartingale, so one cannot use the general theory of stochastic calculus for semimartingales on $B_H(t)$. For example, it is not a priori clear what a stochastic integral of the form

\[ \int_0^T \phi(t, \omega)\, dB_H(t) \]

should mean. The two most common constructions of such a stochastic integral are the following:

1.2.1 The pathwise or forward integral

This integral is denoted by

\[ \int_0^T \phi(t, \omega)\, d^- B_H(t). \]

If the integrand $\phi(t, \omega)$ is càglàd (left-continuous with right-sided limits) then this integral can be defined by Riemann sums, as follows: Let $0 = t_0 < t_1 < \cdots < t_N = T$ be a partition of $[0, T]$. Put $\Delta t_k = t_{k+1} - t_k$ and define

\[ \int_0^T \phi(t, \omega)\, d^- B_H(t) = \lim_{\Delta t_k \to 0} \sum_{k=0}^{N-1} \phi(t_k) \cdot (B_H(t_{k+1}) - B_H(t_k)), \tag{6} \]

if the limit exists (e.g. in probability). See Theorem 5. Note that with this definition the integration takes place with respect to $t$ for each fixed "path" $\omega \in \Omega$. Therefore this integral is often called the pathwise integral.

Using a classical integration theory due to Young one can prove that the pathwise integral (6) exists if the $p$-variation of $t \mapsto \phi(t, \omega)$ is finite for all $p > (1 - H)^{-1}$. See Norvaisa (2000) and the references therein. Since $t \mapsto B_H(t)$ has finite $q$-variation iff $q \geq \tfrac{1}{H}$, we see that if $H < \tfrac{1}{2}$ then this theory does not even include integrals like

\[ \int_0^T B_H(t)\, d^- B_H(t). \]

For this reason one often assumes that $H > \tfrac{1}{2}$ when dealing with forward integrals with respect to $B_H(t)$. In general

\[ E\left[ \int_0^T \phi(t, \omega)\, d^- B_H(t) \right] \neq 0, \tag{7} \]

even if the forward integral belongs to $L^1(P)$.

For $H > \tfrac{1}{2}$ the forward integral obeys Stratonovich-type integration rules. For example, if $f \in C^1(\mathbb{R})$ and

\[ X_t := \int_0^t \phi(s, \omega)\, d^- B_H(s) \quad \text{exists for all } t > 0 \]

then

\[ f(X_t) = f(0) + \int_0^t f'(X_s)\, d^- X_s, \tag{8} \]

where $d^- X_s = \phi(s, \omega)\, d^- B_H(s)$. (See e.g. Norvaisa (2000) and also Theorem 22.) For this reason the forward integral is also sometimes called the Stratonovich integral with respect to fBm. As a special case of (8) we note that

\[ \int_0^T B_H(t)\, d^- B_H(t) = \tfrac{1}{2} B_H^2(T) \quad \text{for } H > \tfrac{1}{2}. \tag{9} \]
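The special case (9) can be illustrated numerically with the Riemann sums of (6). The sketch below is one possible way to do this, not the chapter's method: it samples a single fBm path by a Cholesky factorisation of the covariance (2), which is an assumed simulation technique, and compares the left-point sum with $\tfrac{1}{2}B_H^2(T)$. The grid size, seed, value of $H$ and all function names are illustrative choices.

```python
import math
import random


def fbm_cov(s, t, hurst):
    """Covariance E[B_H(s) B_H(t)] from formula (2)."""
    return 0.5 * (abs(s) ** (2 * hurst) + abs(t) ** (2 * hurst) - abs(s - t) ** (2 * hurst))


def cholesky(matrix):
    """Lower-triangular factor L with L L^T = matrix (symmetric positive definite)."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def fbm_path(n_steps, horizon, hurst, rng):
    """Sample B_H at t_k = k * horizon / n_steps, k = 0..n_steps, with B_H(0) = 0."""
    times = [horizon * (k + 1) / n_steps for k in range(n_steps)]
    cov = [[fbm_cov(s, t, hurst) for t in times] for s in times]
    lower = cholesky(cov)
    noise = [rng.gauss(0.0, 1.0) for _ in range(n_steps)]
    values = [sum(lower[i][k] * noise[k] for k in range(i + 1)) for i in range(n_steps)]
    return [0.0] + values


def forward_riemann_sum(path):
    """Left-point sum of B_H(t_k) * (B_H(t_{k+1}) - B_H(t_k)), as in (6)."""
    return sum(b * (b_next - b) for b, b_next in zip(path, path[1:]))


rng = random.Random(0)
path = fbm_path(n_steps=200, horizon=1.0, hurst=0.75, rng=rng)
riemann = forward_riemann_sum(path)
target = 0.5 * path[-1] ** 2
gap = target - riemann  # algebraically equal to half the sum of squared increments
print(riemann, target, gap)
```

For any path the gap equals half the sum of squared increments. On a uniform grid that sum has expectation $T^{2H} N^{1-2H}$, which tends to zero as $N \to \infty$ exactly when $H > \tfrac{1}{2}$, in line with the restriction in (9).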

Moreover, a slight extension of (8) gives that the unique solution $X_t$ of the fractional forward stochastic differential equation

\[ d^- X(t) = \alpha(t, \omega) X(t)\, dt + \beta(t, \omega) X(t)\, d^- B_H(t); \quad X(0) = x > 0 \tag{10} \]

is

\[ X(t) = x \exp\left( \int_0^t \alpha(s, \omega)\, ds + \int_0^t \beta(s, \omega)\, d^- B_H(s) \right) \quad \text{for } H > \tfrac{1}{2}, \tag{11} \]

provided that the integrals on the right hand side exist.

1.2.2 The Skorohod (Wick-Itô) integral

This integral is denoted by

\[ \int_0^T \phi(t, \omega)\, \delta B_H(t). \]

It may be defined in terms of Riemann sums, as follows:

\[ \int_0^T \phi(t, \omega)\, \delta B_H(t) = \lim_{\Delta t_k \to 0} \sum_{k=0}^{N-1} \phi(t_k) \diamond (B_H(t_{k+1}) - B_H(t_k)), \tag{12} \]

where $\diamond$ denotes the Wick product (see Theorem 3). Thus the difference between this integral and the forward integral is the use of the Wick product instead of the ordinary product in the Riemann sums (12) and (6), respectively.

The Skorohod integral behaves in many ways like the Itô integral of classical Brownian motion. For example, we have

\[ E\left[ \int_0^T \phi(t, \omega)\, \delta B_H(t) \right] = 0 \tag{13} \]

if the integral belongs to $L^2(P)$. Moreover, if $f \in C^2(\mathbb{R})$ then we have the following Itô type formula

\[ f(B_H(t)) = f(0) + \int_0^t f'(B_H(s))\, \delta B_H(s) + H \int_0^t f''(B_H(s))\, s^{2H-1}\, ds, \tag{14} \]

valid for all $H \in (0, 1)$, provided that the left hand side and the last term on the right hand side both belong to $L^2(P)$ (see Biagini, Øksendal, Sulem and Wallner 2004).¹

¹ See also Bender (2003a), Elliott and van der Hoek (2003), Hu (2003) and Mishura (2002) for related results. In Duncan, Hu and Pasik-Duncan (2000) and Biagini and Øksendal (2004) Itô formulae for more general processes are proved, but valid only for $H > \tfrac{1}{2}$.

Note that as a special case of (14) we get

\[ \int_0^T B_H(t)\, \delta B_H(t) = \tfrac{1}{2} B_H^2(T) - \tfrac{1}{2} T^{2H}, \quad H \in (0, 1). \tag{15} \]
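As a check on (15), here is a short worked derivation from the Itô type formula (14). The choice $f(x) = x^2/2$ is the only ingredient not spelled out in the text.

```latex
% Take f(x) = x^2 / 2 in (14), so that f(0) = 0, f'(x) = x and f''(x) = 1.
\begin{align*}
\tfrac{1}{2} B_H^2(T)
  &= \int_0^T B_H(s)\, \delta B_H(s) + H \int_0^T s^{2H-1}\, ds \\
  &= \int_0^T B_H(s)\, \delta B_H(s) + \tfrac{1}{2} T^{2H},
\end{align*}
% because H \int_0^T s^{2H-1} ds = H \cdot T^{2H} / (2H) = T^{2H} / 2.
% Moving the last term to the left hand side gives (15).
```

Compared with the forward integral in (9), the Skorohod integral carries the extra correction term $-\tfrac{1}{2}T^{2H}$, which becomes the familiar $-\tfrac{1}{2}T$ of the classical Itô calculus when $H = \tfrac{1}{2}$.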

The Wick-Skorohod-Itˆo analogue of (10) is the equation δX(t) = α(t, ω)X(t)dt + β(t, ω)X(t)δBH (t);

X(0) = x > 0. (16)

Assume that α(t, ω) = α and β(t, ω) = β are constants. Then by a slight extension of the Itˆo formula (14) one obtains that the unique solution of (16) is X(t) = x exp(βBH (t) + αt − 12 β 2 t2H );

H ∈ (0, 1).

(17)
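One consequence of (17) that is easy to check numerically: since $B_H(t)$ is Gaussian with mean 0 and variance $t^{2H}$, the correction term $-\tfrac12\beta^2 t^{2H}$ makes $E[X(t)] = x e^{\alpha t}$ for every $H$. The following is a minimal sketch under that assumption; the function names and the trapezoidal quadrature are illustrative choices, not taken from the text.

```python
import math

def wick_gbm(x0, alpha, beta, hurst, t, b_h):
    """Closed form (17): value of X(t) given the value b_h of B_H(t)."""
    return x0 * math.exp(beta * b_h + alpha * t
                         - 0.5 * beta ** 2 * t ** (2 * hurst))

def wick_gbm_mean(x0, alpha, beta, hurst, t, n_grid=20001, z_max=12.0):
    """E[X(t)] by trapezoidal integration against the N(0, t^{2H}) law of B_H(t)."""
    sigma = t ** hurst  # standard deviation of B_H(t)
    step = 2.0 * z_max / (n_grid - 1)
    total = 0.0
    for k in range(n_grid):
        z = -z_max + k * step
        density = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        weight = 0.5 if k in (0, n_grid - 1) else 1.0
        total += weight * wick_gbm(x0, alpha, beta, hurst, t, sigma * z) * density
    return total * step

mean_value = wick_gbm_mean(x0=2.0, alpha=0.1, beta=0.4, hurst=0.7, t=1.5)
target = 2.0 * math.exp(0.1 * 1.5)  # x * exp(alpha * t)
```

With $H = \tfrac12$ the exponent in `wick_gbm` becomes $\beta b + (\alpha - \tfrac12\beta^2)t$, the familiar classical expression.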

Note that if $H = \tfrac{1}{2}$ then the formulas (15) and (17) reduce to the formulas obtained by the Itô formula for the classical Brownian motion. Later in this paper we will give a more detailed discussion about these two types of integration and their use in finance (Section 1.5). But first we recall the mathematical foundation of fractional Brownian motion calculus based on white noise theory (Sections 1.3 and 1.4).

1.3 Classical white noise theory and Hida-Malliavin calculus

In this section we give a brief review of some fundamental concepts and results from classical white noise theory. We refer to Holden, Øksendal, Ubøe and Zang (1996), Hida, Kuo, Potthoff and Streit (1993), and Kuo (1996) for more information.

Definition 2. Let $\mathcal{S}(\mathbb{R})$ be the Schwartz space of rapidly decreasing smooth functions on $\mathbb{R}$ and let $\Omega := \mathcal{S}'(\mathbb{R})$ be its dual, often called the space of tempered distributions. Then by the Bochner-Minlos theorem there exists a unique probability measure $P$ on the Borel subsets of $\Omega$ such that

$$\int_\Omega e^{i\langle\omega,f\rangle}\,dP(\omega) = e^{-\frac{1}{2}\|f\|^2_{L^2(\mathbb{R})}}; \quad f \in \mathcal{S}(\mathbb{R}) \tag{18}$$

where $i = \sqrt{-1}$, $\|f\|^2_{L^2(\mathbb{R})} = \int_{\mathbb{R}} f(x)^2\,dx$ and $\langle\omega,f\rangle = \omega(f)$ denotes the action of $\omega \in \Omega = \mathcal{S}'(\mathbb{R})$ on $f \in \mathcal{S}(\mathbb{R})$. This measure $P$ is called the white noise probability measure.

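From (18) it follows that

$$E[\langle\omega,f\rangle] = 0 \quad \text{for all } f \in \mathcal{S}(\mathbb{R}), \tag{19}$$

where

$$E[\langle\omega,f\rangle] = E_P[\langle\omega,f\rangle] = \int_\Omega \langle\omega,f\rangle\,dP(\omega)$$

denotes the expectation of $\langle\omega,f\rangle$ with respect to $P$. Moreover, (18) implies the isometry

$$E[\langle\omega,f\rangle^2] = \|f\|^2_{L^2(\mathbb{R})} \quad \text{for all } f \in \mathcal{S}(\mathbb{R}). \tag{20}$$

Using (19) and (20) we can extend the definition of $\langle\omega,f\rangle$ from $\mathcal{S}(\mathbb{R})$ to $L^2(\mathbb{R})$ as follows: If $f \in L^2(\mathbb{R})$ define

$$\langle\omega,f\rangle = \lim_{n\to\infty} \langle\omega,f_n\rangle \quad (\text{limit in } L^2(P)) \tag{21}$$

where $f_n \in \mathcal{S}(\mathbb{R})$ and $f_n \to f$ in $L^2(\mathbb{R})$. (It follows from (20) that the limit in (21) exists in $L^2(P)$ and is independent of the choice of the approximating sequence $\{f_n\}_{n=1}^\infty \subset \mathcal{S}(\mathbb{R})$.) In particular, we can for each $t \in \mathbb{R}$ define

$$\tilde{B}(t) := \tilde{B}(t,\omega) := \langle\omega, \chi_{[0,t]}(\cdot)\rangle \tag{22}$$

where

$$\chi_{[0,t]}(s) = \begin{cases} 1 & \text{if } 0 \le s \le t \\ -1 & \text{if } t \le s \le 0, \text{ except } t = s = 0 \\ 0 & \text{otherwise} \end{cases} \tag{23}$$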

By Kolmogorov's continuity theorem it can be proved that $\tilde{B}(t)$ has a continuous version, which we will denote by $B(t)$. Then we see that $B(t)$ is a continuous Gaussian process with mean

$$B(0) = E[B(t)] = 0 \quad \text{for all } t \tag{24}$$

and covariance

$$E[B(t_1)B(t_2)] = \int_{\mathbb{R}} \chi_{[0,t_1]}(s)\,\chi_{[0,t_2]}(s)\,ds = \begin{cases} \min(|t_1|,|t_2|) & \text{if } t_1 t_2 > 0 \\ 0 & \text{otherwise} \end{cases} \tag{25}$$

Therefore $B(t)$ is a (classical) Brownian motion with respect to $P$. Suppose

$$f(t) = \sum_k a_k\,\chi_{[t_k,t_{k+1})}(t)$$

is a step function, where $t_1 < t_2 < \cdots < t_N$ and $a_k \in \mathbb{R}$. Then by (22) and linearity we get

$$\langle\omega,f\rangle = \sum_k a_k \langle\omega, \chi_{[t_k,t_{k+1})}(\cdot)\rangle = \sum_k a_k \big(B(t_{k+1}) - B(t_k)\big) = \int_{\mathbb{R}} f(t)\,dB(t).$$

By taking limits of such step functions we obtain that

$$\langle\omega,f\rangle = \int_{\mathbb{R}} f(t)\,dB(t) \quad \text{for all } f \in L^2(\mathbb{R}). \tag{26}$$

In the following we let

$$h_n(x) := (-1)^n e^{x^2/2}\,\frac{d^n}{dx^n}\,e^{-x^2/2}; \quad n = 0, 1, 2, \ldots \tag{27}$$

be the Hermite polynomials and we let

$$\xi_n(x) := \pi^{-1/4}\,((n-1)!)^{-1/2}\,h_{n-1}(\sqrt{2}\,x)\,e^{-x^2/2}; \quad n = 1, 2, \ldots \tag{28}$$

be the Hermite functions. Then $\{\xi_n\}_{n=1}^\infty$ constitutes an orthonormal basis for $L^2(\mathbb{R})$. The first Hermite polynomials are: $h_0(x) = 1$, $h_1(x) = x$, $h_2(x) = x^2 - 1$, $h_3(x) = x^3 - 3x$, ...

Let $\mathcal{J}$ be the set of all multi-indices $\alpha = (\alpha_1, \alpha_2, \ldots)$ of finite length (i.e. $\alpha_k = 0$ for all $k$ large enough), with $\alpha_i \in \mathbb{N} \cup \{0\} = \{0, 1, 2, \ldots\}$ for all $i$. For $\alpha = (\alpha_1, \ldots, \alpha_m) \in \mathcal{J}$ define

$$H_\alpha(\omega) = h_{\alpha_1}(\langle\omega,\xi_1\rangle)\,h_{\alpha_2}(\langle\omega,\xi_2\rangle) \cdots h_{\alpha_m}(\langle\omega,\xi_m\rangle). \tag{29}$$

For example, if we put

$$\varepsilon^{(k)} = (0, 0, \ldots, 1) \in \mathbb{R}^k \quad (\text{the } k\text{'th unit vector}) \tag{30}$$

then we see that

$$H_{\varepsilon^{(k)}}(\omega) = h_1(\langle\omega,\xi_k\rangle) = \langle\omega,\xi_k\rangle = \int_{\mathbb{R}} \xi_k(t)\,dB(t). \tag{31}$$
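The Hermite polynomials in (27) satisfy the three-term recurrence $h_{k+1}(x) = x\,h_k(x) - k\,h_{k-1}(x)$, which gives a convenient way to compute them and to check the orthonormality of the Hermite functions (28) numerically. The sketch below is illustrative only: the function names are hypothetical and the inner product is approximated by a simple trapezoidal rule on a truncated interval.

```python
import math

def hermite_coefficients(n):
    """Coefficients (lowest degree first) of h_n via h_{k+1} = x*h_k - k*h_{k-1}."""
    previous, current = [1.0], [0.0, 1.0]  # h_0 = 1, h_1 = x
    if n == 0:
        return previous
    for k in range(1, n):
        shifted = [0.0] + current  # coefficients of x * h_k
        padded = previous + [0.0] * (len(shifted) - len(previous))
        nxt = [a - k * b for a, b in zip(shifted, padded)]
        previous, current = current, nxt
    return current

def hermite(n, x):
    """Evaluate h_n at x."""
    return sum(c * x ** i for i, c in enumerate(hermite_coefficients(n)))

def hermite_function(n, x):
    """xi_n(x) from (28), for n = 1, 2, ..."""
    norm = math.pi ** -0.25 / math.sqrt(math.factorial(n - 1))
    return norm * hermite(n - 1, math.sqrt(2.0) * x) * math.exp(-0.5 * x * x)

def inner_product(m, n, x_max=10.0, n_grid=4001):
    """Trapezoidal approximation of the L^2(R) inner product (xi_m, xi_n)."""
    step = 2.0 * x_max / (n_grid - 1)
    total = 0.0
    for k in range(n_grid):
        x = -x_max + k * step
        weight = 0.5 if k in (0, n_grid - 1) else 1.0
        total += weight * hermite_function(m, x) * hermite_function(n, x)
    return total * step
```

For instance, `hermite_coefficients(3)` returns the coefficients of $x^3 - 3x$, while `inner_product(3, 3)` and `inner_product(1, 3)` should come out numerically close to 1 and 0 respectively.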

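It is a fundamental fact that the family $\{H_\alpha\}_{\alpha\in\mathcal{J}}$ constitutes an orthogonal basis for $L^2(P)$. Indeed, we have:

Theorem 1. (The Wiener-Itô chaos expansion (I)). Let $F \in L^2(P)$. Then there exists a unique family $\{c_\alpha\}_{\alpha\in\mathcal{J}}$ of constants $c_\alpha \in \mathbb{R}$ such that

$$F(\omega) = \sum_{\alpha\in\mathcal{J}} c_\alpha H_\alpha(\omega) \quad (\text{convergence in } L^2(P)). \tag{32}$$

Moreover, we have the isometry

$$E[F^2] = \sum_{\alpha\in\mathcal{J}} c_\alpha^2\,\alpha! \tag{33}$$

where $\alpha! = \alpha_1!\,\alpha_2! \cdots \alpha_m!$ if $\alpha = (\alpha_1, \ldots, \alpha_m) \in \mathcal{J}$.

Example 1. For each $t \in \mathbb{R}$ the random variable $F(\omega) = B(t,\omega)$ belongs to $L^2(P)$. Its chaos expansion is

$$B(t) = \langle\omega, \chi_{[0,t]}(\cdot)\rangle = \Big\langle\omega, \sum_{k=1}^\infty (\chi_{[0,t]}, \xi_k)_{L^2(\mathbb{R})}\,\xi_k\Big\rangle = \sum_{k=1}^\infty (\chi_{[0,t]}, \xi_k)_{L^2(\mathbb{R})} \langle\omega,\xi_k\rangle = \sum_{k=1}^\infty \int_0^t \xi_k(s)\,ds\; H_{\varepsilon^{(k)}}(\omega), \tag{34}$$

where in general

$$(f,g)_{L^2(\mathbb{R})} = \int_{\mathbb{R}} f(t)g(t)\,dt.$$

We now use Theorem 1 to define stochastic test functions and stochastic distributions, as follows: In the following we use the notation

$$(2\mathbb{N})^\gamma := (2\cdot 1)^{\gamma_1} (2\cdot 2)^{\gamma_2} \cdots (2\cdot m)^{\gamma_m} \tag{35}$$

if $\gamma = (\gamma_1, \ldots, \gamma_m) \in \mathcal{J}$.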

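Definition 3. a) The space $(\mathcal{S})$ of Hida test functions is the set of all $\psi \in L^2(P)$ whose expansion

$$\psi(\omega) = \sum_{\alpha\in\mathcal{J}} a_\alpha H_\alpha(\omega)$$

satisfies

$$\sum_{\alpha\in\mathcal{J}} a_\alpha^2\,\alpha!\,(2\mathbb{N})^{k\alpha} < \infty \quad \text{for all } k = 1, 2, \ldots \tag{36}$$

b) The space $(\mathcal{S})^*$ of Hida distributions is the set of all formal expansions

$$G(\omega) = \sum_{\alpha\in\mathcal{J}} b_\alpha H_\alpha(\omega)$$

such that

$$\sum_{\alpha\in\mathcal{J}} b_\alpha^2\,\alpha!\,(2\mathbb{N})^{-q\alpha} < \infty \quad \text{for some } q \in \mathbb{N}. \tag{37}$$

We equip $(\mathcal{S})$ with the projective topology and $(\mathcal{S})^*$ with the inductive topology. Then $(\mathcal{S})^*$ becomes the dual of $(\mathcal{S})$ and the action of $G \in (\mathcal{S})^*$ on $\psi \in (\mathcal{S})$ is given by

$$\langle G,\psi\rangle = \langle G,\psi\rangle_{(\mathcal{S})^*,(\mathcal{S})} = \sum_{\alpha\in\mathcal{J}} \alpha!\,a_\alpha b_\alpha. \tag{38}$$

Note that

$$(\mathcal{S}) \subset L^2(P) \subset (\mathcal{S})^*. \tag{39}$$

Moreover, if $G \in L^2(P)$ then

$$\langle G,\psi\rangle = E[G\cdot\psi] \quad \text{for all } \psi \in (\mathcal{S}). \tag{40}$$

Definition 4. (Integration in $(\mathcal{S})^*$). Suppose $Z : \mathbb{R} \to (\mathcal{S})^*$ has the property that

$$\langle Z(t),\psi\rangle \in L^2(\mathbb{R}, dt) \quad \text{for all } \psi \in (\mathcal{S}).$$

Then the integral

$$\int_{\mathbb{R}} Z(t)\,dt$$

is defined to be the unique element of $(\mathcal{S})^*$ such that

$$\Big\langle \int_{\mathbb{R}} Z(t)\,dt, \psi \Big\rangle = \int_{\mathbb{R}} \langle Z(t),\psi\rangle\,dt \quad \text{for all } \psi \in (\mathcal{S}). \tag{41}$$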

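Such functions $Z(t)$ are called integrable in $(\mathcal{S})^*$.

Example 2. (White noise). Define

$$W(t) = \sum_{k=1}^\infty \xi_k(t)\,H_{\varepsilon^{(k)}}(\omega); \quad t \in \mathbb{R}. \tag{42}$$

Then by Definition 3.b we see that $W(t) \in (\mathcal{S})^*$ for all $t$. Moreover

$$\int_0^t W(s)\,ds = \sum_{k=1}^\infty \int_0^t \xi_k(s)\,ds\; H_{\varepsilon^{(k)}}(\omega) = B(t), \tag{43}$$

by Example 1. In other words, the function $t \to B(t)$ is differentiable in $(\mathcal{S})^*$ and

$$\frac{d}{dt} B(t) = W(t) \quad \text{in } (\mathcal{S})^*. \tag{44}$$

This justifies the name white noise for $W(t)$.

We now recall the definition of the Wick product, which was originally introduced by the physicist G. Wick in the early 1950's as a renormalization operation in quantum physics, but has later turned out to be central in stochastic analysis as well:

Definition 5. (The Wick product). Let

$$F(\omega) = \sum_{\alpha\in\mathcal{J}} a_\alpha H_\alpha(\omega) \in (\mathcal{S})^* \quad \text{and} \quad G(\omega) = \sum_{\beta\in\mathcal{J}} b_\beta H_\beta(\omega) \in (\mathcal{S})^*.$$

Then the Wick product of $F$ and $G$, $F \diamond G$, is defined by

$$(F \diamond G)(\omega) = \sum_{\alpha,\beta\in\mathcal{J}} a_\alpha b_\beta H_{\alpha+\beta}(\omega) = \sum_{\gamma\in\mathcal{J}} \Big(\sum_{\alpha+\beta=\gamma} a_\alpha b_\beta\Big) H_\gamma(\omega). \tag{45}$$

One can easily verify that the Wick product is a commutative, associative and distributive (over addition) binary operation on both $(\mathcal{S})$ and on $(\mathcal{S})^*$. Moreover, note that

$$F \diamond G = F \cdot G \quad \text{if either } F \text{ or } G \text{ is deterministic.} \tag{46}$$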

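Example 3. If

$$F(\omega) = \int_{\mathbb{R}} f(t)\,dB(t) \quad \text{and} \quad G(\omega) = \int_{\mathbb{R}} g(t)\,dB(t)$$

with $f, g \in L^2(\mathbb{R})$ (deterministic), then

$$F \diamond G = F \cdot G - (f,g), \tag{47}$$

where $(f,g) = (f,g)_{L^2(\mathbb{R})}$.

Proof: Using (45) and that $h_2(x) = x^2 - 1$ we get

$$\begin{aligned}
F \diamond G &= \langle\omega,f\rangle \diamond \langle\omega,g\rangle = \Big(\sum_{k=1}^\infty (f,\xi_k)\langle\omega,\xi_k\rangle\Big) \diamond \Big(\sum_{\ell=1}^\infty (g,\xi_\ell)\langle\omega,\xi_\ell\rangle\Big) \\
&= \sum_{k,\ell=1}^\infty (f,\xi_k)(g,\xi_\ell)\,H_{\varepsilon^{(k)}+\varepsilon^{(\ell)}} \\
&= \sum_{k\ne\ell} (f,\xi_k)(g,\xi_\ell)\,H_{\varepsilon^{(k)}} H_{\varepsilon^{(\ell)}} + \sum_{k=1}^\infty (f,\xi_k)(g,\xi_k)\,h_2(\langle\omega,\xi_k\rangle) \\
&= \sum_{k,\ell=1}^\infty (f,\xi_k)(g,\xi_\ell)\,H_{\varepsilon^{(k)}} H_{\varepsilon^{(\ell)}} - \sum_{k=1}^\infty (f,\xi_k)(g,\xi_k) \\
&= \langle\omega,f\rangle \cdot \langle\omega,g\rangle - (f,g).
\end{aligned}$$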
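To see Definition 5 and formula (47) at work in the simplest setting, one can restrict to random variables that depend on a single standard Gaussian variable $X = \langle\omega,\xi_1\rangle$. Their chaos expansions are then $\sum_n a_n h_n(X)$ and the Wick product (45) reduces to a convolution of coefficient sequences. The sketch below makes exactly that one-variable assumption; the function names are hypothetical.

```python
def wick_product(a, b):
    """Coefficients of F <> G when F = sum a[n] h_n(X) and G = sum b[n] h_n(X)."""
    result = [0.0] * (len(a) + len(b) - 1)
    for i, a_i in enumerate(a):
        for j, b_j in enumerate(b):
            result[i + j] += a_i * b_j  # h_i <> h_j = h_{i+j}, as in (45)
    return result

def hermite(n, x):
    """h_n(x) via the recurrence h_{k+1}(x) = x*h_k(x) - k*h_{k-1}(x)."""
    previous, current = 1.0, x
    if n == 0:
        return previous
    for k in range(1, n):
        previous, current = current, x * current - k * previous
    return current

def evaluate(coefficients, x):
    """Value of sum_n coefficients[n] * h_n(x)."""
    return sum(c * hermite(n, x) for n, c in enumerate(coefficients))

# F = 2*X and G = 3*X correspond to f = 2*xi_1 and g = 3*xi_1, so (f, g) = 6.
f_coeffs, g_coeffs = [0.0, 2.0], [0.0, 3.0]
fg_wick = wick_product(f_coeffs, g_coeffs)  # coefficients of 6*h_2(X)
```

Evaluating `fg_wick` at any sample value $x$ of $X$ gives $6(x^2 - 1) = (2x)(3x) - 6$, which is (47) with $(f,g) = 6$.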

One reason for the importance of the Wick product is the following result [we refer to Holden, Øksendal, Ubøe and Zang (1996) for a proof and more information]:

Theorem 2. Suppose that $Y(t,\omega)$ is a stochastic process which is Skorohod integrable. Then $Y(t) \diamond W(t)$ is integrable in $(\mathcal{S})^*$ and

$$\int_{\mathbb{R}} Y(t)\,\delta B(t) = \int_{\mathbb{R}} Y(t) \diamond W(t)\,dt, \tag{48}$$

where the left hand side denotes the Skorohod integral of $Y(\cdot)$ with respect to $B(\cdot)$.

The Skorohod integral is an extension of the classical Itô integral, in the sense that if $Y(t,\omega)$ is measurable w.r.t. the $\sigma$-algebra $\mathcal{F}_t$ generated by $B(s,\omega)$; $s \le t$, for all $t$ (i.e. if $Y(\cdot)$ is $\mathcal{F}_t$-adapted) and

$$E\left[\int_0^T Y^2(t,\omega)\,dt\right] < \infty, \tag{49}$$

then

$$\int_0^T Y(t)\,\delta B(t) = \int_0^T Y(t)\,dB(t), \quad \text{the classical Itô integral.} \tag{50}$$

The integral on the right hand side of (48) may exist even if $Y$ is not Skorohod integrable. Therefore we may regard the right hand side of (48) as an extension of the Skorohod integral and we call it the extended Skorohod integral. We will use the same notation

$$\int_{\mathbb{R}} Y(t)\,\delta B(t)$$

for the extended Skorohod integral.

Example 4. Using Wick calculus in $(\mathcal{S})^*$ we get

$$\int_0^T B(T)\,\delta B(t) = \int_0^T B(T) \diamond W(t)\,dt = B(T) \diamond \int_0^T W(t)\,dt = B(T) \diamond B(T) = B^2(T) - T, \tag{51}$$

by Example 3 with $f = g = \chi_{[0,T]}$.

The following result gives a useful interpretation of the Skorohod integral as a limit of Riemann sums:

Theorem 3. Let $Y : [0,T] \to (\mathcal{S})^*$ be a caglad function, i.e. $Y(t)$ is left-continuous with right sided limits. Then $Y$ is Skorohod integrable over $[0,T]$ and

$$\int_{\mathbb{R}} Y(t)\,\delta B(t) = \lim_{\Delta t_j \to 0} \sum_{j=0}^{N-1} Y(t_j) \diamond \big(B(t_{j+1}) - B(t_j)\big) \tag{52}$$

where the limit is taken in $(\mathcal{S})^*$ and $0 = t_0 < t_1 < \cdots < t_N = T$ is a partition of $[0,T]$, $\Delta t_j = t_{j+1} - t_j$, $j = 0, \ldots, N-1$.

Proof: This is an easy consequence of Theorem 2.

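We also note the following:

Theorem 4. Let $Y : \mathbb{R} \to (\mathcal{S})^*$. Suppose $Y(t)$ has the expansion

$$Y(t) = \sum_{\alpha\in\mathcal{J}} c_\alpha(t)\,H_\alpha(\omega); \quad t \in \mathbb{R}$$

where

$$c_\alpha \in L^2(\mathbb{R}) \quad \text{for all } \alpha \in \mathcal{J}.$$

Then

$$\int_{\mathbb{R}} Y(t)\,\delta B(t) = \sum_{\alpha\in\mathcal{J},\,k\in\mathbb{N}} (c_\alpha,\xi_k)\,H_{\alpha+\varepsilon^{(k)}}(\omega), \tag{53}$$

provided that the right hand side converges in $(\mathcal{S})^*$. In particular, if

$$\int_{\mathbb{R}} Y(t)\,\delta B(t) \in L^2(P)$$

then

$$E\left[\int_{\mathbb{R}} Y(t)\,\delta B(t)\right] = 0. \tag{54}$$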

1.3.1 The forward integral

We have already noted that the Skorohod integral is an extension of the classical Itô integral to integrands which are not necessarily adapted. There is another natural extension of this type, called the forward integral, which we now define:

Definition 6. The forward integral of a function $Y : \mathbb{R} \to (\mathcal{S})^*$ is defined by

$$\int_{\mathbb{R}} Y(t)\,d^- B(t) = \lim_{\varepsilon\to 0} \int_{\mathbb{R}} Y(t)\,\frac{B(t+\varepsilon) - B(t)}{\varepsilon}\,dt,$$

provided that the limit exists in $(\mathcal{S})^*$.

We refer to Nualart and Pardoux (1988), and Russo and Vallois (2000) for more information about the forward integral. At this stage we will settle with the following result, which gives an easy comparison with the Skorohod integral (see Theorem 3).

Theorem 5. Suppose that $Y : [0,T] \to (\mathcal{S})^*$ is caglad and forward integrable over $[0,T]$. Then

$$\int_0^T Y(t)\,d^- B(t) = \lim_{\Delta t_j \to 0} \sum_{j=0}^{N-1} Y(t_j) \cdot \big(B(t_{j+1}) - B(t_j)\big) \quad (\text{limit in } (\mathcal{S})^*). \tag{55}$$

Proof: This follows by a Fubini argument. See e.g. (2.24) in Biagini and Øksendal (2004) for a proof.

We say that $X(t)$ is a forward Itô process if

$$X(t) = x + \int_0^t u(s,\omega)\,ds + \int_0^t v(s,\omega)\,d^- B(s); \quad t \ge 0 \tag{56}$$

for some measurable processes $u(s,\omega), v(s,\omega) \in \mathbb{R}$ (not necessarily adapted) such that

$$\int_0^t |u(s,\omega)|\,ds < \infty \tag{57}$$

and the Itô forward integral

$$\int_0^t v(s,\omega)\,d^- B(s) \tag{58}$$

exists for all $t > 0$. In that case we use the shorthand notation

$$d^- X(t) = u(t)\,dt + v(t)\,d^- B(t); \quad X(0) = x \tag{59}$$

for the integral equation (56). For such processes we have the following Itô formula:

Theorem 6² (Itô formula for forward processes). Let $f \in C^2(\mathbb{R})$ and define $Y(t) = f(X(t))$. Then $Y(t)$ is a forward Itô process and

$$d^- Y(t) = f'(X(t))\,d^- X(t) + \tfrac{1}{2} f''(X(t))\,v^2(t)\,dt. \tag{60}$$

² Russo and Vallois (2000).

1.3.2 Stochastic differentiation

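We now make use of our explicit knowledge of the space $\Omega = \mathcal{S}'(\mathbb{R})$ to define differentiation with respect to $\omega$, as follows:

Definition 7. a) Let $F : \Omega \to \mathbb{R}$, $\gamma \in L^2(\mathbb{R})$. Then the directional derivative of $F$ in the direction $\gamma$ is defined by

$$D_\gamma F(\omega) = \lim_{\varepsilon\to 0} \frac{F(\omega + \varepsilon\gamma) - F(\omega)}{\varepsilon} \tag{61}$$

provided that the limit exists in $(\mathcal{S})^*$.

b) Suppose there exists a function $\psi : \mathbb{R} \to (\mathcal{S})^*$ such that

$$D_\gamma F(\omega) = \int_{\mathbb{R}} \psi(t)\gamma(t)\,dt \quad \text{for all } \gamma \in L^2(\mathbb{R}). \tag{62}$$

Then we say that $F$ is differentiable and we call $\psi(t)$ the stochastic gradient of $F$ (or the Hida-Malliavin derivative of $F$). We use the notation $D_t F = \psi(t)$ for the stochastic gradient of $F$ at $t \in \mathbb{R}$.

Note that, in spite of the notation, $D_t F$ is not a derivative w.r.t. $t$ but (a kind of) derivative w.r.t. $\omega \in \Omega$.

Example 5. Suppose

$$F(\omega) = \langle\omega,f\rangle = \int_{\mathbb{R}} f(s)\,dB(s)$$

for some $f \in L^2(\mathbb{R})$. Then by linearity

$$D_\gamma F(\omega) = \lim_{\varepsilon\to 0} \frac{1}{\varepsilon}\big[\langle\omega + \varepsilon\gamma, f\rangle - \langle\omega,f\rangle\big] = \langle\gamma,f\rangle = \int_{\mathbb{R}} f(t)\gamma(t)\,dt$$

for all $\gamma \in L^2(\mathbb{R})$. We conclude that $F$ is differentiable and

$$D_t\left(\int_{\mathbb{R}} f(s)\,dB(s)\right) = f(t) \quad \text{for a.a. } t. \tag{63}$$

(Note that this is only valid for deterministic integrands $f$. See Theorem 11 for the general case.)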

We note two useful chain rules for stochastic differentiation:

Theorem 7. (Chain rule I). Let $\varphi : \mathbb{R}^n \to \mathbb{R}$ be a Lipschitz continuous function, i.e. there exists $C < \infty$ such that

$$|\varphi(x) - \varphi(y)| \le C|x - y| \quad \text{for all } x, y \in \mathbb{R}^n.$$

Let $X = (X_1, \ldots, X_n)$ where each $X_i : \Omega \to \mathbb{R}$ is differentiable. Then $\varphi(X)$ is differentiable and

$$D_t \varphi(X) = \sum_{k=1}^n \frac{\partial\varphi}{\partial x_k}(X)\,D_t X_k. \tag{64}$$

We refer to Nualart (1995) for a proof.

If $f(x) = \sum_{m=0}^\infty a_m x^m$ is a real analytic function and $X \in (\mathcal{S})^*$ we put

$$f^\diamond(X) = \sum_{m=0}^\infty a_m X^{\diamond m}, \tag{65}$$

provided the sum converges in $(\mathcal{S})^*$. We call $f^\diamond(X)$ the Wick version of $f(X)$. A similar definition applies to real analytic functions on $\mathbb{R}^n$.

Theorem 8. (The Wick chain rule). Let $f : \mathbb{R}^n \to \mathbb{R}$ be real analytic and let $X = (X_1, \ldots, X_n) \in ((\mathcal{S})^*)^n$. Then if $f^\diamond(X) \in (\mathcal{S})^*$

$$D_t\big(f^\diamond(X)\big) = \sum_{k=1}^n \Big(\frac{\partial f}{\partial x_k}\Big)^{\diamond}(X) \diamond D_t X_k; \quad t \in \mathbb{R}. \tag{66}$$

We refer to Biagini, Øksendal, Sulem and Wallner (2004) for a proof. Note that by Example 5 and the chain rule (64) we have

$$D_t H_\alpha(\omega) = \sum_{i=1}^m \alpha_i\,H_{\alpha-\varepsilon^{(i)}}(\omega)\,\xi_i(t) \in (\mathcal{S})^* \quad \text{for all } t. \tag{67}$$

In fact, using the topology for $(\mathcal{S})^*$ one can prove:

Theorem 9. Let $F \in (\mathcal{S})^*$. Then $F$ is differentiable, and if $F$ has the expansion

$$F(\omega) = \sum_{\alpha\in\mathcal{J}} c_\alpha H_\alpha(\omega)$$

then

$$D_t F(\omega) = \sum_{\alpha,i} c_\alpha \alpha_i\,H_{\alpha-\varepsilon^{(i)}}(\omega)\,\xi_i(t) \quad \text{for all } t \in \mathbb{R}. \tag{68}$$

The stochastic gradient is the key to the connection between forward integrals and Skorohod integrals:

Theorem 10. Suppose $Y : \mathbb{R} \to (\mathcal{S})^*$ is caglad. Then

$$\int_0^T Y(t)\,d^- B(t) = \int_0^T Y(t)\,\delta B(t) + \int_0^T D_{t+} Y(t)\,dt \quad \text{for all } T > 0, \tag{69}$$

provided that the integrals exist, where $D_{t+} Y(t) = \lim_{s\to t+} D_s Y(t)$.

We now mention without proofs some of the most fundamental results from stochastic differential and integral calculus. For proofs we refer to Nualart and Pardoux (1988) and Biagini, Øksendal, Sulem and Wallner (2004).

Theorem 11. (Fundamental theorem of stochastic calculus). Suppose $Y(\cdot) : \mathbb{R} \to (\mathcal{S})^*$ and $D_t Y(\cdot) : \mathbb{R} \to (\mathcal{S})^*$ are Skorohod integrable. Then

$$D_t\left(\int_{\mathbb{R}} Y(s)\,\delta B(s)\right) = \int_{\mathbb{R}} D_t Y(s)\,\delta B(s) + Y(t). \tag{70}$$

Theorem 12. (Relation between the Wick product and the ordinary product). Suppose $g \in L^2(\mathbb{R})$ is deterministic and that $F \in L^2(P)$. Then

$$F \diamond \int_{\mathbb{R}} g(t)\,dB(t) = F \cdot \int_{\mathbb{R}} g(t)\,dB(t) - \int_{\mathbb{R}} g(t)\,D_t F\,dt. \tag{71}$$

Corollary 1. Let $g \in L^2(\mathbb{R})$ be deterministic and $F \in L^2(P)$. Then

$$E\left[F \cdot \int_{\mathbb{R}} g(t)\,dB(t)\right] = E\left[\int_{\mathbb{R}} g(t)\,D_t F\,dt\right] \tag{72}$$

provided that the integrals converge.

Theorem 13. (Integration by parts). Let $F \in L^2(P)$ and assume that $Y : \mathbb{R} \times \Omega \to \mathbb{R}$ is Skorohod integrable with

$$\int_{\mathbb{R}} Y(t)\,\delta B(t) \in L^2(P).$$

Then

$$F \int_{\mathbb{R}} Y(t)\,\delta B(t) = \int_{\mathbb{R}} F\,Y(t)\,\delta B(t) + \int_{\mathbb{R}} Y(t)\,D_t F\,dt \tag{73}$$

provided that the integral on the extreme right converges in $L^2(P)$.

This immediately gives the following generalization of Corollary 1:

Corollary 2. Let $F$ and $Y(t)$ be as in Theorem 13. Then

$$E\left[F \int_{\mathbb{R}} Y(t)\,\delta B(t)\right] = E\left[\int_{\mathbb{R}} Y(t)\,D_t F\,dt\right]. \tag{74}$$

Theorem 14. (The Itô-Skorohod isometry). Suppose that $Y : \mathbb{R} \times \Omega \to \mathbb{R}$ is Skorohod integrable with

$$\int_{\mathbb{R}} Y(t)\,\delta B(t) \in L^2(P).$$

Then

$$E\left[\Big(\int_{\mathbb{R}} Y(t)\,\delta B(t)\Big)^2\right] = E\left[\int_{\mathbb{R}} Y^2(t)\,dt\right] + E\left[\int_{\mathbb{R}}\int_{\mathbb{R}} D_t Y(s)\,D_s Y(t)\,ds\,dt\right]. \tag{75}$$

Using Theorem 12 we obtain the following relation between forward integrals and Skorohod integrals:

Theorem 15. Suppose that $Y : [0,T] \to (\mathcal{S})^*$ is caglad and Skorohod integrable over $[0,T]$. Moreover, suppose that

$$\int_0^T D_{t+} Y(t)\,dt$$

exists, where $D_{t+} Y(t) = \lim_{s\to t+} D_s Y(t)$. Then

$$\int_0^T Y(t)\,d^- B(t) = \int_0^T Y(t)\,\delta B(t) + \int_0^T D_{t+} Y(t)\,dt. \tag{76}$$

1.4 Fractional stochastic calculus

We now consider the corresponding calculus for fractional Brownian motion $B_H(t)$ with arbitrary Hurst parameter $H \in (0,1)$. It turns out that it is possible to transform the calculus for $B(t)$ into the calculus for $B_H(t)$ by means of an operator $M$. This is the idea of Elliott and Van der Hoek (2003), which we now describe. The approach of Elliott and Van der Hoek (2003) represents an extension to all $H \in (0,1)$ of the fractional white noise calculus for $H \in (\tfrac{1}{2},1)$ introduced by Hu and Øksendal (2003). For details we refer to Elliott and Van der Hoek (2003), Hu and Øksendal (2003), Biagini, Øksendal, Sulem and Wallner (2004) and Biagini, Hu, Øksendal and Zang. See also Decreusefond and Üstünel (1998), and Nualart (2004) for an alternative approach.

Definition 8. For $H \in (0,1)$ put

$$c_H = \Big[2\Gamma(H - \tfrac{1}{2}) \cos\Big(\frac{\pi}{2}(H - \tfrac{1}{2})\Big)\Big]^{-1} \big[\Gamma(2H+1)\sin(\pi H)\big]^{1/2} \tag{77}$$

where $\Gamma(\cdot)$ is the Gamma function. Define the operator $M = M^H$ on $\mathcal{S}(\mathbb{R})$ by

$$\widehat{Mf}(y) = c_H\,|y|^{\frac{1}{2}-H}\,\hat{f}(y); \quad f \in \mathcal{S}(\mathbb{R}), \tag{78}$$

where in general

$$\hat{g}(y) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} e^{-ixy} g(x)\,dx$$

is the Fourier transform of $g$. Let $L^2_H(\mathbb{R})$ be the closure of $\mathcal{S}(\mathbb{R})$ in the norm

$$\|f\|^2_{L^2_H(\mathbb{R})} = (Mf, Mf)_{L^2(\mathbb{R})} = \int_{\mathbb{R}} (Mf(x))^2\,dx; \quad f \in \mathcal{S}(\mathbb{R}). \tag{79}$$

Then the operator M extends in a natural way to an isometry between the two Hilbert spaces L2 (R) and L2H (R). Note that f , M g) = (M f, M g) = (f, M 2 g) (M

for f, g ∈ L2H (R).

(80)

Now define

$$\tilde{B}_H(t) = \tilde{B}_H(t,\omega) = \langle\omega, M\chi_{[0,t]}\rangle. \tag{81}$$

Then by Section 1.3 we see that $\tilde{B}_H(t)$ is a Gaussian process with mean 0 and covariance

$$E[\tilde{B}_H(s)\tilde{B}_H(t)] = (M\chi_{[0,s]}, M\chi_{[0,t]}) = (\chi_{[0,s]}, \chi_{[0,t]})_{L^2_H(\mathbb{R})} = \tfrac{1}{2}\big(|s|^{2H} + |t|^{2H} - |s-t|^{2H}\big), \tag{82}$$

by (A.10) in Elliott and Van der Hoek (2003). Therefore $\tilde{B}_H(t)$ has a continuous version, denoted by $B_H(t)$, which is a fractional Brownian motion with Hurst coefficient $H$.

Arguing as in Section 1.3 we see that if

$$f(t) = \sum_j a_j\,\chi_{[t_j,t_{j+1})}(t)$$

is a (deterministic) step function, then

$$\langle\omega, Mf\rangle = \sum_j a_j \big(B_H(t_{j+1}) - B_H(t_j)\big) = \int_{\mathbb{R}} f(t)\,dB_H(t).$$

On the other hand, we know that

$$\langle\omega, Mf\rangle = \int_{\mathbb{R}} Mf(t)\,dB(t).$$

Therefore

$$\int_{\mathbb{R}} f(t)\,dB_H(t) = \int_{\mathbb{R}} Mf(t)\,dB(t) \tag{83}$$

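for all step functions $f$, and hence for all $f \in L^2_H(\mathbb{R})$. The chaos expansion of $B_H(t) \in L^2(P)$ is

$$B_H(t) = \langle\omega, M\chi_{[0,t]}\rangle = \Big\langle\omega, \sum_{k=1}^\infty (M\chi_{[0,t]}, \xi_k)\,\xi_k\Big\rangle = \sum_{k=1}^\infty (\chi_{[0,t]}, M\xi_k) \langle\omega,\xi_k\rangle = \sum_{k=1}^\infty \int_0^t M\xi_k(s)\,ds\; H_{\varepsilon^{(k)}}(\omega). \tag{84}$$

Therefore, if we define fractional white noise $W_H(t)$ by

$$W_H(t) = \sum_{k=1}^\infty M\xi_k(t)\,H_{\varepsilon^{(k)}}(\omega), \tag{85}$$

then $W_H(t) \in (\mathcal{S})^*$ and

$$\frac{dB_H(t)}{dt} = W_H(t) \quad \text{in } (\mathcal{S})^*. \tag{86}$$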

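In view of this and Theorem 2 the following definition is natural:

Definition 9. The Skorohod integral of a function $Y : \mathbb{R} \to (\mathcal{S})^*$ with respect to $B_H(t)$ is defined by

$$\int_{\mathbb{R}} Y(t)\,\delta B_H(t) = \int_{\mathbb{R}} Y(t) \diamond W_H(t)\,dt, \tag{87}$$

provided that $Y(t) \diamond W_H(t)$ is integrable in $(\mathcal{S})^*$.

We can in a natural way extend the $M$-operator to functions $Y : \mathbb{R} \to (\mathcal{S})^*$ whose chaos expansion

$$Y(t) = \sum_{\alpha\in\mathcal{J}} c_\alpha(t)\,H_\alpha(\omega)$$

has coefficients $c_\alpha \in L^2_H(\mathbb{R})$, as follows:

$$MY(t) = \sum_{\alpha\in\mathcal{J}} Mc_\alpha(t)\,H_\alpha(\omega).$$

This is well-defined if the series converges in $(\mathcal{S})^*$. With this extension of $M$ we note that the connection between the classical white noise $W(t)$ and the fractional white noise $W_H(t)$ can be written

$$W_H(t) = MW(t); \quad t \in \mathbb{R}. \tag{88}$$

Combining this with Definition 9 we get

Theorem 16. Let $Y : \mathbb{R} \to (\mathcal{S})^*$. Suppose $Y(t)$ has the expansion

$$Y(t) = \sum_{\alpha\in\mathcal{J}} c_\alpha(t)\,H_\alpha(\omega); \quad t \in \mathbb{R}$$

where

$$c_\alpha(\cdot) \in L^2_H(\mathbb{R}) \quad \text{for all } \alpha \in \mathcal{J}.$$

Then

$$\int_{\mathbb{R}} Y(t)\,\delta B_H(t) = \sum_{\alpha\in\mathcal{J},\,k\in\mathbb{N}} (c_\alpha, e_k)_{L^2_H(\mathbb{R})}\,H_{\alpha+\varepsilon^{(k)}}(\omega), \tag{89}$$

provided that the right hand side converges in $(\mathcal{S})^*$.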

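Note in particular that if

$$\int_{\mathbb{R}} Y(t)\,\delta B_H(t) \in L^2(P),$$

then

$$E\left[\int_{\mathbb{R}} Y(t)\,\delta B_H(t)\right] = 0. \tag{90}$$

Proof:

$$\begin{aligned}
\int_{\mathbb{R}} Y(t)\,\delta B_H(t) &= \int_{\mathbb{R}} Y(t) \diamond W_H(t)\,dt = \int_{\mathbb{R}} Y(t) \diamond \sum_{k=1}^\infty M\xi_k(t)\,H_{\varepsilon^{(k)}}(\omega)\,dt \\
&= \sum_{\alpha,k} (c_\alpha, M\xi_k)\,H_{\alpha+\varepsilon^{(k)}}(\omega) = \sum_{\alpha,k} (Mc_\alpha, \xi_k)\,H_{\alpha+\varepsilon^{(k)}}(\omega)
\end{aligned} \tag{91}$$

$$= \sum_{\alpha,k} (c_\alpha, e_k)_{L^2_H(\mathbb{R})}\,H_{\alpha+\varepsilon^{(k)}}(\omega). \tag{92}$$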

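We also note the following relation between the Skorohod integrals w.r.t. $B_H(\cdot)$ and $B(\cdot)$:

$$\int_{\mathbb{R}} Y(s)\,\delta B_H(s) = \int_{\mathbb{R}} M_s Y(s)\,\delta B(s), \tag{93}$$

where $M_s$ indicates that $M$ is operating on the variable $s$. This follows from (92) and Theorem 4.

Example 6. What is

$$\int_0^T B_H(t)\,\delta B_H(t)?$$

We can answer this by using Wick calculus as in Example 4:

$$\int_0^T B_H(t)\,\delta B_H(t) = \int_0^T B_H(t) \diamond W_H(t)\,dt = \int_0^T B_H(t) \diamond \frac{d}{dt} B_H(t)\,dt = \Big[\tfrac{1}{2} B_H^{\diamond 2}(t)\Big]_0^T = \tfrac{1}{2} B_H^{\diamond 2}(T) = \tfrac{1}{2} B_H^2(T) - \tfrac{1}{2} T^{2H}, \tag{94}$$

because, by (81) and (47),

$$\begin{aligned}
B_H^{\diamond 2}(T) &= \langle\omega, M\chi_{[0,T]}\rangle \diamond \langle\omega, M\chi_{[0,T]}\rangle \\
&= \langle\omega, M\chi_{[0,T]}\rangle \cdot \langle\omega, M\chi_{[0,T]}\rangle - (M\chi_{[0,T]}, M\chi_{[0,T]}) \\
&= B_H(T) \cdot B_H(T) - (\chi_{[0,T]}, \chi_{[0,T]})_{L^2_H(\mathbb{R})} \\
&= B_H^2(T) - T^{2H} \quad (\text{by (A.10) in Elliott and Van der Hoek (2003)}).
\end{aligned} \tag{95}$$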

This result could also have been deduced from the following version of the Itô formula.

Theorem 17³ (Itô formula for fractional Skorohod integrals). Let $f(s,x) : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ belong to $C^{1,2}(\mathbb{R} \times \mathbb{R})$ and assume that the three random variables

$$f(t, B_H(t)), \quad \int_0^t \frac{\partial f}{\partial s}(s, B_H(s))\,ds \quad \text{and} \quad \int_0^t \frac{\partial^2 f}{\partial x^2}(s, B_H(s))\,s^{2H-1}\,ds$$

all belong to $L^2(P)$. Then

$$f(t, B_H(t)) = f(0,0) + \int_0^t \frac{\partial f}{\partial s}(s, B_H(s))\,ds + \int_0^t \frac{\partial f}{\partial x}(s, B_H(s))\,\delta B_H(s) + H \int_0^t \frac{\partial^2 f}{\partial x^2}(s, B_H(s))\,s^{2H-1}\,ds. \tag{96}$$

Proof: There are several versions of this result. See Mishura (2002), van der Hoek and Biagini, Øksendal, Sulem and Wallner (2004). This result is valid for all $H \in (0,1)$, but if we restrict ourselves to $\tfrac{1}{2} < H < 1$ there is a more general Itô formula in Duncan, Hu and Pasik-Duncan (2000) and Biagini and Øksendal (2004).

³ Biagini, Øksendal, Sulem and Wallner (2004), Theorem 3.8.

Example 7. Let $\alpha$, $\beta \ne 0$ be constants. The fractional Skorohod equation

$$\delta Y(t) = \alpha Y(t)\,dt + \beta Y(t)\,\delta B_H(t); \quad Y(0) > 0 \tag{97}$$

i.e.

$$Y(t) = Y(0) + \int_0^t \alpha Y(s)\,ds + \int_0^t \beta Y(s)\,\delta B_H(s); \quad t \ge 0$$

has the unique solution

$$Y(t) = Y(0) \exp\left(\beta B_H(t) + \alpha t - \tfrac{1}{2}\beta^2 t^{2H}\right); \quad t > 0. \tag{98}$$

This follows by applying Theorem 17 to the process $X(t) = \alpha t - \tfrac{1}{2}\beta^2 t^{2H} + \beta B_H(t)$ and the function $f(x) = Y(0)\exp x$. In analogy with the classical case we call this process $Y(t)$ the geometric Skorohod fractional Brownian motion. Note that if we put $H = \tfrac{1}{2}$ we get the classical geometric Brownian motion.

We proceed to consider differentiation:

Definition 10. The Hida-Malliavin derivative $D_t^{(H)}$ (or stochastic gradient) of an element $F \in (\mathcal{S})^*$ is defined by

$$D_t^{(H)} F = M^{-1} D_t F; \quad t \in \mathbb{R}. \tag{99}$$

By Theorem 9 we see that if $F$ has the expansion

$$F(\omega) = \sum_{\alpha\in\mathcal{J}} c_\alpha H_\alpha(\omega)$$

then

$$D_t^{(H)} F = \sum_{\alpha\in\mathcal{J},\,i\in\mathbb{N}} c_\alpha \alpha_i\,H_{\alpha-\varepsilon^{(i)}}(\omega)\,e_i(t); \quad t \in \mathbb{R}. \tag{100}$$

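We can now formulate the fractional analogue of Theorem 11:

Theorem 18⁴ (Fractional fundamental theorem of calculus). Suppose $Y(\cdot) : \mathbb{R} \to (\mathcal{S})^*$ and $D_t^{(H)} Y(\cdot) : \mathbb{R} \to (\mathcal{S})^*$ are Skorohod integrable w.r.t. $B_H$. Then

$$D_t^{(H)} \int_{\mathbb{R}} Y(s)\,\delta B_H(s) = \int_{\mathbb{R}} D_t^{(H)} Y(s)\,\delta B_H(s) + Y(t). \tag{101}$$

⁴ Biagini, Øksendal, Sulem and Wallner (2004), Theorem 5.3.

Proof: By (99), (93) and Theorem 11 we get

$$\begin{aligned}
D_t^{(H)} \int_{\mathbb{R}} Y(s)\,\delta B_H(s) &= M_t^{-1} D_t \int_{\mathbb{R}} M_s Y(s)\,\delta B(s) \\
&= M_t^{-1} \int_{\mathbb{R}} D_t (M_s Y(s))\,\delta B(s) + M_t^{-1} M_t Y(t) \\
&= \int_{\mathbb{R}} M_t^{-1} D_t (M_s Y(s))\,\delta B(s) + Y(t) \\
&= \int_{\mathbb{R}} D_t^{(H)} (M_s Y(s))\,\delta B(s) + Y(t) \\
&= \int_{\mathbb{R}} M_s \big(D_t^{(H)} Y(s)\big)\,\delta B(s) + Y(t) \\
&= \int_{\mathbb{R}} D_t^{(H)} Y(s)\,\delta B_H(s) + Y(t).
\end{aligned}$$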

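Theorem 19⁵ (Fractional integration by parts). Let $F \in L^2(P)$ and assume that $Y : \mathbb{R} \times \Omega \to \mathbb{R}$ is Skorohod integrable w.r.t. $B_H$ with $\int_{\mathbb{R}} Y(t)\,\delta B_H(t) \in L^2(P)$. Then

$$F \int_{\mathbb{R}} Y(t)\,\delta B_H(t) = \int_{\mathbb{R}} F\,Y(t)\,\delta B_H(t) + \int_{\mathbb{R}} Y(t)\,M_t^2 D_t^{(H)} F\,dt. \tag{102}$$

Proof: By (93), Theorem 13 and (99) we get

$$\begin{aligned}
F \int_{\mathbb{R}} Y(t)\,\delta B_H(t) &= F \int_{\mathbb{R}} M_t Y(t)\,\delta B(t) \\
&= \int_{\mathbb{R}} F\,M_t Y(t)\,\delta B(t) + \int_{\mathbb{R}} M_t Y(t)\,D_t F\,dt \\
&= \int_{\mathbb{R}} M_t (F\,Y(t))\,\delta B(t) + \int_{\mathbb{R}} M_t Y(t)\,M_t D_t^{(H)} F\,dt \\
&= \int_{\mathbb{R}} F\,Y(t)\,\delta B_H(t) + \int_{\mathbb{R}} Y(t)\,M_t^2 D_t^{(H)} F\,dt.
\end{aligned}$$

⁵ Biagini, Øksendal, Sulem and Wallner (2004), Theorem 5.3.

Corollary 3. Let $F$ and $Y(t)$ be as in Theorem 19. Then

$$E\left[F \int_{\mathbb{R}} Y(t)\,\delta B_H(t)\right] = E\left[\int_{\mathbb{R}} Y(t)\,M_t^2 D_t^{(H)} F\,dt\right]. \tag{103}$$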

We also note the following fractional version of Theorem 14:

Theorem 20⁶ (The fractional Itô-Skorohod isometry). Suppose $Y : \mathbb{R} \times \Omega \to \mathbb{R}$ is Skorohod-integrable with respect to $B_H$ with

$$\int_{\mathbb{R}} Y(t)\,\delta B_H(t) \in L^2(P).$$

Then

$$E\left[\Big(\int_{\mathbb{R}} Y(t)\,\delta B_H(t)\Big)^2\right] = E\left[\int_{\mathbb{R}} (MY(t))^2\,dt\right] + E\left[\int_{\mathbb{R}}\int_{\mathbb{R}} D_t^{(H)} M_s^2 Y(s) \cdot D_s^{(H)} M_t^2 Y(t)\,ds\,dt\right]. \tag{104}$$

Proof: This follows by combining Theorem 14 with (93) and (99). We omit the details.

⁶ Elliott and Van der Hoek (2003).

Finally we turn to the fractional forward integral. This is defined in the same way as in the classical case (Definition 6):

Definition 11. The forward integral of a function $Y : \mathbb{R} \to (\mathcal{S})^*$ with respect to $B_H(t)$ is defined by:

$$\int_{\mathbb{R}} Y(t)\,d^- B_H(t) = \lim_{\varepsilon\to 0} \int_{\mathbb{R}} Y(t)\,\frac{B_H(t+\varepsilon) - B_H(t)}{\varepsilon}\,dt, \tag{105}$$

provided that the limit exists in $(\mathcal{S})^*$.

Just as in Theorem 5 we have:

Theorem 21. Suppose $Y : [0,T] \to (\mathcal{S})^*$ is caglad and forward integrable over $[0,T]$ w.r.t. $B_H(\cdot)$. Then

$$\int_0^T Y(t)\,d^- B_H(t) = \lim_{\Delta t_j \to 0} \sum_{j=0}^{N-1} Y(t_j) \cdot \big(B_H(t_{j+1}) - B_H(t_j)\big) \tag{106}$$

(limit in $(\mathcal{S})^*$).

Remark 1. In the special case when $Y = Y(t,\omega) : [0,T] \times \Omega \to \mathbb{R}$ is a classical stochastic process (and $Y(t,\cdot) \in (\mathcal{S})^*$ for all $t$) and the limit in (105) exists for a.a. $\omega$, the forward integral of $Y$ coincides with the pathwise integral (or more precisely the left Young (LY) integral of $Y$) with respect to $dB_H(t)$. See Norvaisa (2000) for details. ∇

Definition 12. A function $Y : [0,T] \to (\mathcal{S})^*$ with expansion

$$Y(t) = \sum_{\alpha\in\mathcal{J}} c_\alpha(t)\,H_\alpha(\omega)$$

belongs to the space $\mathbb{D}_{1,2}^{(H)}$ if

$$\|Y\|^2_{\mathbb{D}_{1,2}^{(H)}} := \sum_{\alpha\in\mathcal{J}} \sum_{i=1}^\infty \alpha_i\,\alpha!\,(c_\alpha,\xi_i)^2 < \infty$$

where

$$(c_\alpha,\xi_i) = \int_0^T c_\alpha(s)\,\xi_i(s)\,ds.$$

The analogue of Theorem 15 is the following:

Theorem 22. Suppose that $Y : [0,T] \to (\mathcal{S})^*$ is cadlag and Skorohod integrable over $[0,T]$ w.r.t. $B_H(t)$. Moreover, suppose that $Y \in \mathbb{D}_{1,2}^{(H)}$. Then

$$\int_0^T \big[M_t^2 D_t^{(H)} Y(u)\big]_{u=t}\,dt \quad \text{exists in } L^2(P)$$

and

$$\int_0^T Y(t)\,d^- B_H(t) = \int_0^T Y(t)\,\delta B_H(t) + \int_0^T \big[M_t^2 D_t^{(H)} Y(u)\big]_{u=t}\,dt. \tag{107}$$

Proof: We refer to Biagini and Øksendal (2004) for details. See also Mishura (2002).

We end this section by giving an Itô formula for forward integrals w.r.t. fractional Brownian motion: A forward fractional Itô process is a process of the form

$$X(t) = x + \int_0^t u(s,\omega)\,ds + \int_0^t v(s,\omega)\,d^- B_H(s); \quad t \ge 0 \tag{108}$$

where $u(s,\omega)$ and $v(s,\omega)$ are real-valued, measurable (not necessarily adapted) processes such that

$$\int_0^t |u(s,\omega)|\,ds < \infty \quad \text{and} \quad \int_0^t v(s,\omega)\,d^- B_H(s) \quad \text{exists a.e.}$$

In this case we use the shorthand notation

$$d^- X(t) = u(t)\,dt + v(t)\,d^- B_H(t); \quad X(0) = x. \tag{109}$$

Theorem 23. (An Itô formula for forward fractional processes). Suppose $H \in (\tfrac{1}{2}, 1)$. Let $f \in C^1(\mathbb{R})$ and put $Y(t) = f(X(t))$, where $X(t)$ is given by (108). Then

$$d^- Y(t) = f'(X(t))\,d^- X(t). \tag{110}$$

Proof: This is a classic result about forward (pathwise) integration. A direct proof can be found in Biagini and Øksendal (2004). See also Norvaisa (2000), Nualart (2004) and Russo and Vallois (2000) and the references therein. If $f$ possesses higher order regularity then a corresponding (but more complicated) Itô formula can be obtained for lower values of $H$. See e.g. Coutin and Qian (2002) and Gradinaru, Nourdin, Russo and Vallois (2002).

Example 8. The fractional forward equation

$$d^- X(t) = \alpha X(t)\,dt + \beta X(t)\,d^- B_H(t); \quad X(0) = x > 0 \tag{111}$$

has for $\tfrac{1}{2} < H < 1$ the unique solution

$$X(t) = x \exp(\beta B_H(t) + \alpha t); \quad t \ge 0. \tag{112}$$

1.5 Summary of results

We now use the mathematical machinery described in the earlier sections to study finance models involving fBm. We have seen that there are two natural ways of defining integration with respect to fBm:

(a) The pathwise (forward) integration
(b) The Skorohod integration.

Therefore we discuss these two cases separately:

1.5.1 The pathwise integration model ($\tfrac{1}{2} < H < 1$)

For simplicity we concentrate on the simplest nontrivial type of market, namely on the fBm version of the classical Black-Scholes market, as follows: Suppose there are two investment possibilities:

(i) A safe or risk free investment, with price dynamics

$$dS_0(t) = r S_0(t)\,dt; \quad S_0(0) = 1 \tag{113}$$

and

(ii) a risky investment, with price dynamics

$$d^- S_1(t) = \mu S_1(t)\,dt + \sigma S_1(t)\,d^- B_H(t); \quad S_1(0) = x > 0, \tag{114}$$

where $r$, $\mu$, $\sigma \ne 0$ and $x > 0$ are constants. By Example 8 we know that the solution of this equation is

$$S_1(t) = x \exp(\sigma B_H(t) + \mu t); \quad t \ge 0. \tag{115}$$

Let $\{\mathcal{F}_t^H\}_{t\ge 0}$ be the filtration of $B_H(\cdot)$, i.e. $\mathcal{F}_t^H$ is the $\sigma$-algebra generated by the random variables $B_H(s)$, $s \le t$.

A portfolio in this market is a 2-dimensional $\mathcal{F}_t^H$-adapted stochastic process $\theta(t) = (\theta_0(t), \theta_1(t))$ where $\theta_i(t)$ gives the number of units of investment number $i$ held at time $t$, $i = 0, 1$. The corresponding wealth process $V^\theta(t)$ is defined by

$$V^\theta(t) = \theta(t) \cdot S(t) = \theta_0(t) S_0(t) + \theta_1(t) S_1(t), \tag{116}$$

where $S(t) = (S_0(t), S_1(t))$. We say that $\theta$ is pathwise self-financing if

$$d^- V^\theta(t) = \theta(t) \cdot d^- S(t) \tag{117}$$

i.e.

$$V^\theta(t) = V^\theta(0) + \int_0^t \theta_0(s)\,dS_0(s) + \int_0^t \theta_1(s)\,d^- S_1(s). \tag{118}$$

If, in addition, $V^\theta(t)$ is lower bounded, then we call the portfolio $\theta$ (pathwise) admissible.

Definition 13. A pathwise admissible portfolio $\theta$ is called an arbitrage if the corresponding wealth process $V^\theta(t)$ satisfies the following three conditions:

$$V^\theta(0) = 0 \tag{119}$$

$$V^\theta(T) \ge 0 \quad \text{a.s.} \tag{120}$$

$$P[V^\theta(T) > 0] > 0. \tag{121}$$

Remark 2. The non-existence of arbitrage in a market is a basic equilibrium condition. It is not possible to make a sensible mathematical theory for a market with arbitrage. Therefore one of the first things to check in a mathematical finance model is whether arbitrages exist. In the above pathwise fBm market the existence of arbitrage was proved by Rogers (1997). Subsequently several simple examples of arbitrage were found. See e.g. Dasgupta (1997), Salopek (1998) and Shiryaev (1999). Note, however, that the existence of arbitrage in this pathwise model is already a direct consequence of Theorem 7.2 in Delbaen and Schachermayer (1994): There it is proved in general that if there is no arbitrage using simple portfolios (with pathwise products), then the price process is a semimartingale. Hence, since the process $S_1(t)$ given by (114) is not a semimartingale, an arbitrage must exist.

Here is a simple arbitrage example, due to Dasgupta (1997) and Shiryaev (1998): For simplicity assume that

$$\mu = r \quad \text{and} \quad \sigma = x = 1. \tag{122}$$

Define

$$\theta_0(t) = 1 - \exp(2B_H(t)), \quad \theta_1(t) = 2(\exp(B_H(t)) - 1). \tag{123}$$

Then the corresponding wealth process is

$$\begin{aligned}
V^\theta(t) &= \theta_0(t) S_0(t) + \theta_1(t) S_1(t) \\
&= (1 - \exp(2B_H(t))) \exp(rt) + 2(\exp(B_H(t)) - 1) \exp(B_H(t) + rt) \\
&= \exp(rt)(\exp(B_H(t)) - 1)^2 > 0 \quad \text{for a.a. } (t,\omega).
\end{aligned} \tag{124}$$

This portfolio is self-financing, since

$$\begin{aligned}
\theta_0(t)\,dS_0(t) + \theta_1(t)\,d^- S_1(t) &= (1 - \exp(2B_H(t)))\,r \exp(rt)\,dt + 2(\exp(B_H(t)) - 1) S_1(t)\,[r\,dt + d^- B_H(t)] \\
&= r \exp(rt)(\exp(B_H(t)) - 1)^2\,dt + 2\exp(rt)(\exp(B_H(t)) - 1)\exp(B_H(t))\,d^- B_H(t) \\
&= d\big(\exp(rt)(\exp(B_H(t)) - 1)^2\big) = d^- V^\theta(t).
\end{aligned}$$

∇
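The algebra behind (124) can be checked numerically for arbitrary values $b$ of $B_H(t)$: under (122) the bond and stock prices are $S_0(t) = e^{rt}$ and $S_1(t) = e^{b + rt}$, and the holdings (123) combine to $e^{rt}(e^b - 1)^2$. The following sketch does only that pointwise check (it does not simulate fBm or verify the self-financing property), and the function names are hypothetical.

```python
import math

def wealth(t, b, r):
    """V(t) = theta_0*S_0 + theta_1*S_1 for the portfolio (123) under (122)."""
    s0 = math.exp(r * t)       # bond price, from (113)
    s1 = math.exp(b + r * t)   # stock price (115) with mu = r, sigma = x = 1
    theta0 = 1.0 - math.exp(2.0 * b)
    theta1 = 2.0 * (math.exp(b) - 1.0)
    return theta0 * s0 + theta1 * s1

def closed_form(t, b, r):
    """Right-hand side of (124)."""
    return math.exp(r * t) * (math.exp(b) - 1.0) ** 2

# Compare both expressions on a few (t, b) pairs, with b standing for B_H(t).
samples = [(0.0, 0.0), (0.5, -1.2), (1.0, 0.3), (2.0, 1.5)]
differences = [abs(wealth(t, b, 0.05) - closed_form(t, b, 0.05))
               for t, b in samples]
```

Since $B_H(0) = 0$, the initial wealth is `wealth(0.0, 0.0, r) == 0.0`, while for $t > 0$ the wealth is strictly positive whenever $B_H(t) \ne 0$, which matches conditions (119) to (121).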

We have proved: Theorem 24.7 The portfolio θ(t) = (θ0 (t), θ1 (t)) given by (123) is a (pathwise) arbitrage in the (pathwise) fractional Black-Scholes market given by (113), (114) and (122). In view of this result the pathwise f Bm model is not suitable in ﬁnance, at least not in this simple form (but possibly in combination with classical Brownian motion). 7

42

Dasgupta (1997) and Shiryaev (1999).

Fractional Brownian Motion in Finance

1.5.2 The Wick-Skorohod integration model (0 < H < 1)

We now consider the Wick-Skorohod integration version of the market (113)–(114). Mathematically the model below is an extension to $H \in (0, 1)$ of the model introduced in Hu and Øksendal (2003) for $H \in (\tfrac12, 1)$. (Subsequently a related model, also valid for all $H \in (0, 1)$, was presented in Elliott and Van der Hoek (2003).) However, compared to Hu and Øksendal (2003) we give a different interpretation of the mathematical concepts involved:

Assume that the values $S_0(t)$, $S_1(t)$ of the risk free asset (e.g. bond) and the risky asset (e.g. stock), respectively, are given by

(bond)
$$dS_0(t) = rS_0(t)\,dt; \qquad S_0(0) = 1 \tag{125}$$

and (stock)
$$\delta S_1(t) = \mu S_1(t)\,dt + \sigma S_1(t)\,\delta B_H(t); \qquad S_1(0) = x > 0 \tag{126}$$

where $r$, $\mu$, $\sigma \neq 0$ and $x > 0$ are constants. By Example 7 the solution of equation (126) is

$$S_1(t) = x\exp\big(\sigma B_H(t) + \mu t - \tfrac12\sigma^2 t^{2H}\big); \qquad t \geq 0. \tag{127}$$
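As an illustration of (127), the following Python sketch evaluates $S_1(t)$ for a given value of $B_H(t)$ and checks numerically that $E[S_1(t)] = x e^{\mu t}$, i.e. that the correction term $-\tfrac12\sigma^2 t^{2H}$ exactly offsets the mean of $\exp(\sigma B_H(t))$. It uses only the fact that $B_H(t)$ is Gaussian with mean 0 and variance $t^{2H}$. The function names, the trapezoidal quadrature and the parameter values are assumptions made for this sketch.

```python
import math


def stock_value(x, mu, sigma, hurst, t, b):
    """S1(t) from (127) when B_H(t) takes the value b."""
    return x * math.exp(sigma * b + mu * t - 0.5 * sigma ** 2 * t ** (2.0 * hurst))


def mean_stock_value(x, mu, sigma, hurst, t, n=20000, width=10.0):
    """E[S1(t)] for t > 0, integrating over B_H(t) ~ N(0, t^(2H)) by the trapezoidal rule."""
    sd = t ** hurst                      # standard deviation of B_H(t)
    lo, hi = -width * sd, width * sd     # truncate the integral at +/- 10 standard deviations
    step = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        b = lo + i * step
        weight = 0.5 if i in (0, n) else 1.0
        density = math.exp(-0.5 * (b / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))
        total += weight * stock_value(x, mu, sigma, hurst, t, b) * density
    return total * step


# Illustrative parameters: x = 2, mu = 0.05, sigma = 0.4, H = 0.7, t = 2.
print(mean_stock_value(2.0, 0.05, 0.4, 0.7, 2.0), 2.0 * math.exp(0.05 * 2.0))
```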

In this Wick-Skorohod model $S_1(t)$ does not represent the observed stock price at time $t$, but we give it a different interpretation: We assume that $S_1(t)$ represents in a broad sense the total value of the company and that it is not observed directly. Instead we adopt a quantum mechanical point of view, regarding $S_1(t, \omega)$ as a stochastic distribution in $\omega$ (represented mathematically as an element of $(\mathcal{S})^*$), and regarding the actual observed stock price $\hat S(t)$ as the result of applying $S_1(t, \cdot) \in (\mathcal{S})^*$ to a stochastic test function $\psi(\cdot) \in (\mathcal{S})$. In other words,

$$\hat S(t) := \langle S(t, \cdot), \psi(\cdot)\rangle = \langle S(t), \psi\rangle, \tag{128}$$

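where in general $\langle F, \psi\rangle$ denotes the action of a stochastic distribution $F \in (\mathcal{S})^*$ on a stochastic test function $\psi \in (\mathcal{S})$. (See Section 1.3.) We call such stochastic test functions $\psi$ market observers. We will assume that they have the form

$$\psi(\omega) = \exp^{\diamond}\Big(\int_{\mathbb{R}} h(t)\,dB_H(t)\Big) = \exp\Big(\int_{\mathbb{R}} h(t)\,dB_H(t) - \tfrac12\|h\|^2_{L^2_H(\mathbb{R})}\Big) \quad\text{for some } h \in L^2_H(\mathbb{R}). \tag{129}$$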

The set of all linear combinations of such $\psi$ is dense in both $(\mathcal{S})$ and $(\mathcal{S})^*$. Moreover, these $\psi$ are normalized, in the sense that

$$E\Big[\exp^{\diamond}\Big(\int_{\mathbb{R}} h(t)\,dB_H(t)\Big)\Big] = 1 \quad\text{for all } h \in L^2_H(\mathbb{R}). \tag{130}$$

We let $\mathcal{D}$ denote the set of all market observers of the form (129).

Similarly, a generalized portfolio is another adapted process

$$\theta(t) = \theta(t, \omega) = (\theta_0(t, \omega), \theta_1(t, \omega)); \qquad (t, \omega) \in [0, T] \times \Omega$$

representing a general strategy for choosing the number of units of investment number $i$ at time $t$; $i = 0, 1$. (For example, $\theta_1(t)$ could be the usual "buy and hold" strategy, consisting of buying a certain number of stocks at a stopping time $\tau_1(\omega)$ and holding them until another stopping time $\tau_2(\omega) > \tau_1(\omega)$. Or $\theta_1(t)$ could be the strategy to hold a fixed fraction of the current wealth in stocks.) If the actual observed price at time $t$ is $\hat S_1(t) = \langle S_1(t, \cdot), \psi(\cdot)\rangle$, the actual number of stocks held is

$$\hat\theta_1(t) := \langle\theta_1(t, \cdot), \psi(\cdot)\rangle. \tag{131}$$

Thus the actual observed wealth $\hat V_1(t)$ held in the risky asset corresponding to this portfolio is

$$\hat V_1(t) = \langle\theta_1(t), \psi\rangle \cdot \langle S_1(t), \psi\rangle. \tag{132}$$

By Lemma 1 below this can be written

$$\hat V_1(t) = \langle\theta_1(t) \diamond S_1(t), \psi\rangle, \tag{133}$$

where $\diamond$ denotes the Wick product. In fact, $F := \theta_1(t) \diamond S_1(t)$ is the unique $F \in (\mathcal{S})^*$ such that

$$\langle F, \psi\rangle = \langle\theta_1(t), \psi\rangle \cdot \langle S_1(t), \psi\rangle \quad\text{for all } \psi \in \mathcal{D}. \tag{134}$$

In view of this it is natural to define the generalized total wealth process $V(t, \omega)$ associated to $\theta(t, \omega)$ by the Wick product

$$V(t, \cdot) = \theta(t, \cdot) \diamond S(t, \cdot) = \theta_0(t)S_0(t) + \theta_1(t) \diamond S_1(t). \tag{135}$$

Similarly, if we consider a discrete time market model and keep the generalized portfolio process

$$\theta(t) = \theta(t_k, \omega); \qquad t_k \leq t < t_{k+1}$$

constant from $t = t_k$ to $t = t_{k+1}$, the corresponding change in the generalized wealth process is

$$\Delta V(t_k) = \theta(t_k) \diamond \Delta S(t_k), \tag{136}$$

where

$$\Delta V(t_k) = V(t_{k+1}) - V(t_k), \qquad \Delta S(t_k) = S(t_{k+1}) - S(t_k).$$

If we sum this over $k$ and take the limit as $\Delta t_k = t_{k+1} - t_k$ goes to 0, we end up with the following generalized wealth process formula

$$V(T) = V(0) + \int_0^T \theta(t) \diamond dS(t) = V(0) + \int_0^T \theta(t)\,\delta S(t), \tag{137}$$

where $\delta S(t)$ means that the integral is interpreted in the (Wick-Itô-) Skorohod sense. Therefore, by (125)–(126),

$$V(T) = V(0) + \int_0^T r\theta_0(t)S_0(t)\,dt + \int_0^T \mu\,\theta_1(t) \diamond S_1(t)\,dt + \int_0^T \sigma\,\theta_1(t) \diamond S_1(t)\,\delta B_H(t). \tag{138}$$

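We now prove the fundamental result which explains why the Wick product suddenly appears in (133) above:

Lemma 1. a) Let $F, G \in (\mathcal{S})^*$. Then

$$\langle F \diamond G, \psi\rangle = \langle F, \psi\rangle \cdot \langle G, \psi\rangle \quad\text{for all } \psi \in \mathcal{D}. \tag{139}$$

b) Moreover, if $Z \in (\mathcal{S})^*$ is such that

$$\langle Z, \psi\rangle = \langle F, \psi\rangle \cdot \langle G, \psi\rangle \quad\text{for all } \psi \in \mathcal{D}$$

then $Z = F \diamond G$.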
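Lemma 1 a) can be illustrated in a one-dimensional toy setting. The Python sketch below is an illustration under simplifying assumptions, not the setting of the lemma itself: the white noise space is replaced by a single standard normal variable $Z$, the Wick exponential of $aZ$ is $\exp(aZ - a^2/2)$, and the Wick product of two Wick exponentials is taken to be the Wick exponential of the sum, $\exp^{\diamond}(aZ) \diamond \exp^{\diamond}(bZ) = \exp^{\diamond}((a+b)Z)$. The names `wick_exp` and `expect` and the coefficients `a`, `b`, `c` are hypothetical.

```python
import math


def wick_exp(a, z):
    """Wick exponential exp(a*z - a^2/2) of a*Z, evaluated at Z = z."""
    return math.exp(a * z - 0.5 * a * a)


def expect(func, lo=-12.0, hi=12.0, n=24000):
    """E[func(Z)] for Z ~ N(0, 1), computed with the trapezoidal rule."""
    step = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        z = lo + i * step
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * func(z) * math.exp(-0.5 * z * z)
    return total * step / math.sqrt(2.0 * math.pi)


a, b, c = 0.7, -0.4, 0.9  # illustrative coefficients of F, G and the observer psi

# Pairings <F, psi> = E[F * psi] and <G, psi> = E[G * psi].
pair_f = expect(lambda z: wick_exp(a, z) * wick_exp(c, z))
pair_g = expect(lambda z: wick_exp(b, z) * wick_exp(c, z))

# Pairing of the Wick product F <> G = wick_exp(a + b, .) with psi.
pair_wick = expect(lambda z: wick_exp(a + b, z) * wick_exp(c, z))

# Pairing of the ordinary pointwise product F * G with psi, for comparison.
pair_ordinary = expect(lambda z: wick_exp(a, z) * wick_exp(b, z) * wick_exp(c, z))

print(pair_f * pair_g, pair_wick, pair_ordinary)
```

In this toy setting the Wick pairing agrees with the product of the two pairings, both being $\exp((a+b)c)$, whereas the ordinary product gives $\exp(ab + ac + bc)$, which is different whenever $ab \neq 0$.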

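Proof: a) Choose

$$\psi = \exp^{\diamond}\Big(\int_{\mathbb{R}} h(t)\,dB_H(t)\Big) \in \mathcal{D}.$$

We may assume that

$$F = \exp^{\diamond}\Big(\int_{\mathbb{R}} f(t)\,dB_H(t)\Big) \quad\text{and}\quad G = \exp^{\diamond}\Big(\int_{\mathbb{R}} g(t)\,dB_H(t)\Big)$$

for some $f, g \in L^2_H(\mathbb{R})$, because the set of all linear combinations of such Wick exponentials is dense in $(\mathcal{S})^*$. For such $F$, $G$, $\psi$ we have

$$\langle F, \psi\rangle = E[F \cdot \psi] \quad\text{and}\quad \langle G, \psi\rangle = E[G \cdot \psi].$$

Therefore

$$\begin{aligned}
\langle F \diamond G, \psi\rangle
&= E\Big[\exp^{\diamond}\Big(\int_{\mathbb{R}} (f+g)\,dB_H\Big) \cdot \exp^{\diamond}\Big(\int_{\mathbb{R}} h\,dB_H\Big)\Big] \\
&= E\Big[\exp\Big(\int_{\mathbb{R}} (f+g)\,dB_H - \tfrac12\|f+g\|^2_{L^2_H(\mathbb{R})}\Big) \cdot \exp\Big(\int_{\mathbb{R}} h\,dB_H - \tfrac12\|h\|^2_{L^2_H(\mathbb{R})}\Big)\Big] \\
&= E\Big[\exp\Big(\int_{\mathbb{R}} (f+g+h)\,dB_H - \tfrac12\|f\|^2_{L^2_H(\mathbb{R})} - \tfrac12\|g\|^2_{L^2_H(\mathbb{R})} - \tfrac12\|h\|^2_{L^2_H(\mathbb{R})} - (f, g)_{L^2_H(\mathbb{R})}\Big)\Big] \\
&= E\Big[\exp\Big(\int_{\mathbb{R}} (f+g+h)\,dB_H - \tfrac12\|f+g+h\|^2_{L^2_H(\mathbb{R})} + (f, h)_{L^2_H(\mathbb{R})} + (g, h)_{L^2_H(\mathbb{R})}\Big)\Big] \\
&= E\Big[\exp^{\diamond}\Big(\int_{\mathbb{R}} (f+g+h)\,dB_H\Big)\Big] \cdot \exp\big((f+g, h)_{L^2_H(\mathbb{R})}\big) \\
&= \exp\big((f+g, h)_{L^2_H(\mathbb{R})}\big).
\end{aligned} \tag{140}$$

On the other hand, a similar computation gives

$$\begin{aligned}
\langle F, \psi\rangle \cdot \langle G, \psi\rangle
&= E\Big[\exp^{\diamond}\Big(\int_{\mathbb{R}} f\,dB_H\Big) \cdot \exp^{\diamond}\Big(\int_{\mathbb{R}} h\,dB_H\Big)\Big] \cdot E\Big[\exp^{\diamond}\Big(\int_{\mathbb{R}} g\,dB_H\Big) \cdot \exp^{\diamond}\Big(\int_{\mathbb{R}} h\,dB_H\Big)\Big] \\
&= \exp\big((f, h)_{L^2_H(\mathbb{R})}\big) \cdot \exp\big((g, h)_{L^2_H(\mathbb{R})}\big) = \exp\big((f+g, h)_{L^2_H(\mathbb{R})}\big).
\end{aligned} \tag{141}$$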

Comparing (140) and (141) we get a).

b) This follows from the fact that the set of linear combinations of elements of $\mathcal{D}$ is dense in $(\mathcal{S})$, and $(\mathcal{S})^*$ is the dual of $(\mathcal{S})$.

Remark 3. We emphasize that this model for fBm in finance does not a priori assume that the Wick product models the growth of wealth. In fact, the Wick product comes as a mathematical consequence of the basic assumption that the observed value is the result of applying a test function to a distribution process describing in a broad sense the value of a company. This way of thinking stems from microcosmos (quantum mechanics), but it has been argued that it is often a good description of macrocosmos situations as well. Here is an example: An agent from an opinion poll firm stops a man on the street and asks him what political party he would vote for if there was an election today. Often this man on the street does not really have a firm opinion about this beforehand (he is in a diffuse state of mind politically), but the contact with the agent forces him to produce an answer. In a similar sense the general state of a company does not really have a noted stock price a priori, but brings out a number (price) when confronted with a market observer (the stock market). ∇

In view of the above we now make the following definitions:

Definition 14. a) The total wealth process $V^{\theta}(t)$ corresponding to a portfolio $\theta(t)$ in the Wick-Skorohod model is defined by

$$V^{\theta}(t) = \theta(t) \diamond S(t). \tag{142}$$

b) A portfolio $\theta(t)$ is called Wick-Skorohod self-financing if

$$\delta V^{\theta}(t) = \theta(t)\,\delta S(t) \tag{143}$$

i.e.

$$V^{\theta}(t) = V^{\theta}(0) + \int_0^t \theta_0(s)\,dS_0(s) + \int_0^t \theta_1(s)\,\delta S_1(s). \tag{144}$$

In particular, we assume that the two integrals in (144) exist.

By the Girsanov theorem for fBm [see e.g. Molchan (1969), Valkeila (1999), [EvdV], Hu and Øksendal (2003)] there exists a probability measure $Q$ on $(\Omega, \mathcal{F})$ such that $Q$ is equivalent to $P$ (i.e. $Q$ has the same null sets as $P$) and such that

$$\hat B_H(t) := \frac{\mu - r}{\sigma}\,t + B_H(t) \tag{145}$$

is a fractional Brownian motion w.r.t. $Q$. Replacing $B_H(t)$ by $\hat B_H(t)$ in (144) we get

$$e^{-rt}V^{\theta}(t) = V^{\theta}(0) + \int_0^t e^{-rs}\sigma\,\theta_1(s) \diamond S_1(s)\,\delta\hat B_H(s). \tag{146}$$

Definition 15. We call a portfolio $\theta(t)$ Wick-Skorohod admissible if it is Wick-Skorohod self-financing and $\theta_1(s) \diamond S_1(s)$ is Skorohod integrable w.r.t. $\hat B_H(s)$.

Definition 16. A Wick-Skorohod admissible portfolio $\theta(t)$ is called a strong arbitrage if the corresponding total wealth process $V^{\theta}(t)$ satisfies

$$V^{\theta}(0) = 0 \tag{147}$$

$$V^{\theta}(T) \in L^2(Q) \quad\text{and}\quad V^{\theta}(T) \geq 0 \ \text{ a.s. } P \tag{148}$$

$$P[V^{\theta}(T) > 0] > 0. \tag{149}$$

The following result was first proved by Hu and Øksendal (2003) for the case $\tfrac12 < H < 1$ and then extended to arbitrary $H \in (0, 1)$ by Elliott and Van der Hoek (2003) (in a related model):

Theorem 25. There is no strong arbitrage in the Wick-Skorohod fractional Black-Scholes market (125)–(126).

Proof: If we take the expectation with respect to $Q$ of both sides of (146) with $t = T$ we get, by (90),

$$e^{-rT}E_Q[V^{\theta}(T)] = V^{\theta}(0).$$

From this we see that (147)–(149) cannot all hold: $V^{\theta}(0) = 0$ forces $E_Q[V^{\theta}(T)] = 0$, while (148)–(149) together with the equivalence of $Q$ and $P$ would give $E_Q[V^{\theta}(T)] > 0$.

Remark 4. Note that the non-existence of a strong arbitrage in this market (where the value process $S_1(t)$ is not a semimartingale) is not in conflict with the result of Delbaen and Schachermayer (1994) mentioned in Remark 2, because in this market the underlying products are Wick products, not ordinary pathwise products. ∇

We proceed to discuss completeness in this market:

Definition 17. The market is called (Wick-Skorohod) complete if for every $\mathcal{F}_T^{(H)}$-measurable random variable $F \in L^2(Q)$ there exists an admissible portfolio $\theta(t) = (\theta_0(t), \theta_1(t))$ such that

$$F = V^{\theta}(T) \quad\text{a.s.} \tag{150}$$

By (146) we see that this is equivalent to requiring that there exists $\phi$ such that

$$e^{-rT}F(\omega) = e^{-rT}E_Q[F] + \int_0^T \phi(s, \omega)\,\delta\hat B_H(s), \tag{151}$$

where

$$\phi(s) = e^{-rs}\sigma\,\theta_1(s) \diamond S_1(s). \tag{152}$$

If such a $\phi$ can be found, then we put

$$\theta_1(s) = \sigma^{-1}e^{rs}\,S_1(s)^{\diamond(-1)} \diamond \phi(s). \tag{153}$$

It was proved by Hu and Øksendal (2003) (for $\tfrac12 < H < 1$) and subsequently by Elliott and Van der Hoek (2003) in a related market (for arbitrary $H \in (0, 1)$) that this market is complete. In fact, we have:

Theorem 26 (Hu and Øksendal (2003), Elliott and Van der Hoek (2003)). Let $F \in L^2(Q)$ be $\mathcal{F}_T^{(H)}$-measurable. Then $F = V^{\theta}(T)$ a.s. for $\theta(t) = (\theta_0(t), \theta_1(t))$, with

$$\theta_1(t) = \sigma^{-1}e^{-r(T-t)}\,S_1(t)^{\diamond(-1)} \diamond \tilde E_Q\big[\hat D_t^{(H)}F \mid \mathcal{F}_t^{(H)}\big], \tag{154}$$

where $\tilde E_Q[\cdot \mid \cdot]$ denotes the quasi-conditional expectation and $\hat D_t^{(H)}$ is the fractional Hida-Malliavin derivative with respect to $\hat B_H(\cdot)$ (see Hu and Øksendal (2003) and Elliott and Van der Hoek (2003) for details). The other component, $\theta_0(t)$, is then uniquely determined by the self-financing condition (144).

In the Markovian case, i.e. when $F(\omega) = f(B_H(T))$ for some function $f : \mathbb{R} \to \mathbb{R}$, we can give a more explicit expression for the replicating portfolio $\theta(t)$. This is achieved by using the following representation theorem, due to C. Bender (2003a). It has the same form as in the well-known classical case ($H = \tfrac12$):

Theorem 27 (Bender (2003a)). Let $f : \mathbb{R} \to \mathbb{R}$ be such that $E[f^2(B_H(T))] < \infty$. Then

$$f(B_H(T)) = E[f(B_H(T))] + \int_0^T \phi(t, \omega)\,dB_H(t), \tag{155}$$

where

$$\phi(t, \omega) = \frac{\partial}{\partial x}E[f(x + B_H(T - t))]\Big|_{x = B_H(t)}. \tag{156}$$

In view of the interpretation of the observed wealth $\hat V(t)$ as the result of applying a test function $\psi \in \mathcal{D}$ to the general wealth process $V(t)$, i.e.

$$\hat V(t) = \langle V(t), \psi\rangle, \tag{157}$$

the following alternative definition of an arbitrage is natural (compare with Definition 16):

Definition 18. A Wick-Skorohod admissible portfolio $\theta(t)$ is called a weak arbitrage if the corresponding total wealth process $V^{\theta}(t)$ satisfies

$$V^{\theta}(0) = 0 \tag{158}$$

$$\langle V^{\theta}(T), \psi\rangle \geq 0 \quad\text{for all } \psi \in \mathcal{D} \tag{159}$$

$$\langle V^{\theta}(T), \psi\rangle > 0 \quad\text{for some } \psi \in \mathcal{D}. \tag{160}$$

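Do weak arbitrages exist? The answer is yes. Here is an example, due to C. Bender (2003b):

Example 9 (A weak arbitrage). For $\varepsilon > 0$ define

$$K_{\varepsilon}(x) = \begin{cases} -1 & \text{if } |x| \leq \varepsilon \\ \phantom{-}1 & \text{if } |x| > \varepsilon. \end{cases} \tag{161}$$

Then there exists $\varepsilon_0 > 0$ such that

$$\int_{\mathbb{R}} K_{\varepsilon_0}(x)\exp\big(-\tfrac12 x^2\big)\,dx = 0. \tag{162}$$

By a variant of Lemma 2.6 in Bender (2002) we have

$$E\Big[K(\langle\omega, f\rangle)\exp\big(\langle\omega, g\rangle - \tfrac12\|g\|^2_{L^2_H(\mathbb{R})}\big)\Big] = (2\pi)^{-1/2}\,\|f\|^{-1}_{L^2_H(\mathbb{R})}\int_{\mathbb{R}} K(u)\exp\Big(-\frac{(u - (f, g)_{L^2_H(\mathbb{R})})^2}{2\|f\|^2_{L^2_H(\mathbb{R})}}\Big)\,du, \tag{163}$$

for all bounded $K : \mathbb{R} \to \mathbb{R}$, $f, g \in L^2_H(\mathbb{R})$. Applying (163) to $f = \chi_{[0,1]}$ and $\langle\omega, f\rangle = B_H(1)$ we get

$$E[K_{\varepsilon_0}(B_H(1))] = 0 \tag{164}$$

$$E\Big[K_{\varepsilon_0}(B_H(1))\exp\big(\langle\omega, g\rangle - \tfrac12\|g\|^2_{L^2_H(\mathbb{R})}\big)\Big] \geq 0 \quad\text{for all } g \in L^2_H(\mathbb{R}) \tag{165}$$

$$E\Big[K_{\varepsilon_0}(B_H(1))\exp\big(\langle\omega, \chi_{[0,1]}\rangle - \tfrac12\|\chi_{[0,1]}\|^2_{L^2_H(\mathbb{R})}\big)\Big] > 0. \tag{166}$$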
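The threshold $\varepsilon_0$ in (162) and the sign conditions (164)–(166) can be made concrete. The following Python sketch is illustrative only. It uses that $B_H(1)$ is standard normal and that $\|\chi_{[0,1]}\|_{L^2_H(\mathbb{R})} = 1$, so that by (163) the expectation in (165) depends on $g$ only through the number $m = (\chi_{[0,1]}, g)_{L^2_H(\mathbb{R})}$ and equals $E[K_{\varepsilon_0}(Z + m)]$ for $Z \sim N(0, 1)$. The helper name `pairing` and the sample values of `m` are hypothetical.

```python
from statistics import NormalDist

std = NormalDist()          # law of B_H(1), which has mean 0 and variance 1
eps0 = std.inv_cdf(0.75)    # P(|Z| <= eps0) = 1/2, which is equivalent to (162)


def pairing(m, eps=eps0):
    """E[K_eps(Z + m)] for Z ~ N(0, 1), i.e. the right-hand side of (163) with ||f|| = 1, (f, g) = m."""
    inside = std.cdf(eps - m) - std.cdf(-eps - m)   # P(|Z + m| <= eps)
    return 1.0 - 2.0 * inside                       # +1 outside the band, -1 inside


print("eps0 =", eps0)
for m in (0.0, 0.25, -0.5, 1.0, 3.0):
    print(f"m = {m:5.2f}   E[K(Z + m)] = {pairing(m):.6f}")
```

The value at $m = 0$ corresponds to (164), the non-negativity over all $m$ to (165), and the strictly positive value at $m = 1$ (the choice $g = \chi_{[0,1]}$) to (166).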

Now consider the Skorohod fractional market (125)–(126) with

$$r = \mu = 0, \qquad \sigma = T = 1.$$

Then, by (127), $S_0(t) = 1$ and $S_1(t) = x\exp\big(B_H(t) - \tfrac12 t^{2H}\big)$.

Moreover, $\hat B_H(t) = B_H(t)$ and $P = Q$. Hence by Theorem 26 and (4.50) there exists a Skorohod self-financing portfolio $\theta(t) = (\theta_0(t), \theta_1(t))$ such that

$$K_{\varepsilon_0}(B_H(1)) = V^{\theta}(1) = \int_0^T \theta_1(s)\,\delta S(s) \quad\text{a.s.} \tag{167}$$

Then $V^{\theta}(0) = 0$ and by (165), (166) and (129) we see that (159) and (160) hold. Hence $\theta(t)$ is a weak arbitrage.

1.5.3 A connection between the pathwise and the Wick-Skorohod model

In spite of the fundamental differences in the features of the pathwise model and the Wick-Skorohod model, it turns out that there is a close relation between them. Assume $H \in (\tfrac12, 1)$. Fix $\psi \in \mathcal{D}$ and define the function $b_H : [0, T] \to \mathbb{R}$ by

$$b_H(t) = \langle B_H(t), \psi\rangle = E[B_H(t) \cdot \psi]. \tag{168}$$

Then for $p > 1$ and any partition $\mathcal{P} : 0 = t_0 < t_1 < \cdots < t_N = T$ of $[0, T]$ we have

$$\begin{aligned}
\sum_{j=0}^{N-1} |b_H(t_{j+1}) - b_H(t_j)|^p
&= \sum_{j=0}^{N-1} \big|E[(B_H(t_{j+1}) - B_H(t_j)) \cdot \psi]\big|^p \\
&\leq \sum_{j=0}^{N-1} \big(E[|B_H(t_{j+1}) - B_H(t_j)|^p]^{1/p} \cdot E[\psi^q]^{1/q}\big)^p \\
&\leq C\sum_{j=0}^{N-1} E[|B_H(t_{j+1}) - B_H(t_j)|^p],
\end{aligned}$$

where $\tfrac1p + \tfrac1q = 1$. Hence, by a known property of fBm,

$$\sup_{\mathcal{P}} \sum_{j=0}^{N-1} |b_H(t_{j+1}) - b_H(t_j)|^p < \infty \quad\text{iff}\quad p \geq \frac{1}{H}.$$

In this sense the continuous function $b_H(t)$ is at least as regular as a generic path of a fractional Brownian motion $B_H(t, \omega)$. Therefore we can define integration with respect to $b_H(t)$ just as we define pathwise integration with respect to $B_H(t)$. Now suppose we start with the wealth generating formula in the Wick-Skorohod model

$$V^{\theta}(T) = V^{\theta}(0) + \int_0^T \phi(s, \omega)\,\delta B_H(s). \tag{169}$$

Suppose $\phi$ is càglàd and $\psi \in \mathcal{D}$. Then this gives

$$\begin{aligned}
\hat V^{\theta}(T) = \langle V^{\theta}(T), \psi\rangle
&= V^{\theta}(0) + \Big\langle\int_0^T \phi(s, \omega)\,\delta B_H(s), \psi\Big\rangle \\
&= V^{\theta}(0) + \lim_{\Delta t_j \to 0}\sum_{j=0}^{N-1} \big\langle\phi(t_j) \diamond (B_H(t_{j+1}) - B_H(t_j)), \psi\big\rangle \\
&= V^{\theta}(0) + \lim_{\Delta t_j \to 0}\sum_{j=0}^{N-1} \langle\phi(t_j), \psi\rangle\,\langle B_H(t_{j+1}) - B_H(t_j), \psi\rangle \\
&= V^{\theta}(0) + \lim_{\Delta t_j \to 0}\sum_{j=0}^{N-1} \hat\phi(t_j)\,(b_H(t_{j+1}) - b_H(t_j)) \\
&= V^{\theta}(0) + \int_0^T \hat\phi(t)\,db_H(t).
\end{aligned} \tag{170}$$

We can summarize this as follows:

Theorem 28. If $H > \tfrac12$ the mapping $F \to \langle F, \psi\rangle$; $F \in L^2(P)$ transforms the Wick-Skorohod fractional Brownian motion model into the pathwise fractional Brownian motion model. If $H = \tfrac12$ this mapping transforms the Wick-Skorohod Brownian motion model into the classical Brownian motion model.

1.6 Concluding remarks

At first glance there seems to be a disagreement between the existence of arbitrage in the (fractional) pathwise model (see Theorem 24) and the non-existence of a (strong) arbitrage in the Wick-Skorohod model (Theorem 25). The above discussion, including in particular Theorem 4.16, serves to explain this apparent contradiction: The arbitrages in the pathwise model correspond to the weak arbitrages in the Wick-Skorohod model (see Example 9), and not to the (non-existent) strong arbitrages.

In spite of the mathematical coherence of the Wick-Skorohod model, there is still a lot of controversy about its economic interpretation and features. We refer to the discussions in Björk and Hult (2005), and Sottinen and Valkeila (2003) for more details.

Acknowledgements: I am grateful to Christian Bender, Tomas Björk, Nils Christian Framstad, Walter Schachermayer and John van der Hoek for helpful communication.

References:

Bender, C. (2002) The Fractional Itô Integral, Change of Measure and Absence of Arbitrage. Manuscript.

Bender, C. (2003a) "An Itô Formula for Generalized Functionals of a Fractional Brownian Motion with Arbitrary Hurst Parameter." Stochastic Processes and Their Applications 104: 81–106.

Bender, C. (2003b) Construction of a Weak Arbitrage. Manuscript, May.

Biagini, F., Hu, Y., Øksendal, B., and Zhang, T. Fractional Brownian Motion and Applications. Springer-Verlag (Forthcoming).

Biagini, F., and Øksendal, B. (2004) Forward Integrals and an Itô Formula for Fractional Brownian Motion. Preprint, Dept. of Mathematics, University of Oslo 22/2004.

Biagini, F., Øksendal, B., Sulem, A., and Wallner, N. (2004) "An Introduction to White Noise Theory and Malliavin Calculus for Fractional Brownian Motion." The Proceedings of the Royal Society 460: 347–372.

Björk, T., and Hult, H. (2005) "A Note on the Wick Products and the Fractional Black-Scholes Model." Finance and Stochastics 9: 197–209.

Brody, D., Syroka, J., and Zervos, M. (2002) "Dynamical Pricing of Weather Derivatives." Quantitative Finance 2: 189–198.

Coutin, L., and Qian, Z. (2002) "Stochastic Analysis, Rough Path Analysis and Fractional Brownian Motions." Prob. Theory Related Fields 122: 108–140.

Dasgupta, A. (1997) Fractional Brownian Motion: Its Properties and Applications to Stochastic Integration. Ph. D. thesis, Dept. of Statistics, Univ. of North Carolina at Chapel Hill.

Decreusefond, L., and Üstünel, A. S. (1998) "Stochastic Analysis of the Fractional Brownian Motion." Potential Analysis 10: 177–214.

Delbaen, F., and Schachermayer, W. (1994) "A General Version of the Fundamental Theorem of Asset Pricing." Mathematische Annalen 300: 463–520.

Duncan, T. E., Hu, Y., and Pasik-Duncan, B. (2000) "Stochastic Calculus for Fractional Brownian Motion." SIAM Journal on Control and Optimization 38: 582–612.

Elliott, R., and van der Hoek, J. (2003) "A General Fractional White Noise Theory and Applications to Finance." Mathematical Finance 13: 301–330.

Gradinaru, M., Nourdin, I., Russo, F., and Vallois, P. (2002) m-order integrals and generalized Itô's formula: the case of a fractional Brownian motion with any Hurst index. Preprint.

Hida, T., Kuo, H.-H., Potthoff, J., and Streit, L. (1993) White Noise Analysis. Kluwer.

Holden, H., Øksendal, B., Ubøe, J., and Zhang, T. (1996) Stochastic Partial Differential Equations. Birkhäuser.

Hu, Y. (2003) Integral Transformations and Anticipative Calculus for Fractional Brownian Motion. Manuscript.

Hu, Y., and Øksendal, B. (2003) "Fractional White Noise Calculus and Application to Finance." Infinite Dimensional Analysis, Quantum Probability and Related Topics 6: 1–32.

Kuo, H.-H. (1996) White Noise Distribution Theory. CRC Press.

Mishura, Y. (2002) Fractional Stochastic Integration and Black-Scholes Equation for Fractional Brownian Model with Stochastic Volatility. Manuscript, December.

Molchan, G. (1969) "Gaussian Processes with Spectra Which are Asymptotically Equivalent to a Power of λ." Theory of Probability and Its Applications 14: 530–530.

Norvaisa, R. (2000) "Modelling of Stock Price Changes. A Real Analysis Approach." Finance and Stochastics 4: 343–369.

Nualart, D. (1995) The Malliavin Calculus and Related Topics. Springer-Verlag.

Nualart, D. (2004) Stochastic Integration with Respect to Fractional Brownian Motion and Applications. Preprint.

Nualart, D., and Pardoux, E. (1988) "Stochastic Calculus with Anticipating Integrands." Probability Theory and Related Fields 78: 555–581.

Rogers, L.C. (1997) "Arbitrage with Fractional Brownian Motion." Math. Finance 7: 95–105.

Russo, F., and Vallois, P. (2000) "Stochastic Calculus with Respect to Continuous Finite Quadratic Variation Processes." Stochastics and Stochastics Reports 70: 1–40.

Salopek, D. M. (1998) "Tolerance to Arbitrage." Stochastic Processes and Their Applications 76: 217–230.

Shiryaev, A. (1999) Essentials of Stochastic Finance. World Scientific Publishing Company.

Simonsen, I. (2003) "Measuring Anti-Correlations in the Nordic Electricity Spot Market by Wavelets." Physica A: Statistical Mechanics and Its Applications 322: 597–606.

Sottinen, T., and Valkeila, E. (2003) "On Arbitrage and Replication in the Fractional Black-Scholes Pricing Model." Statistics and Decisions 21: 93–108.

Valkeila, E. (1999) On Some Properties of Geometric Fractional Brownian Motion. Preprint, Univ. of Helsinki, May.

J. van der Hoek: Private Communication.

Chapter 2 Moment Evolution of Gaussian and Geometric Wiener Diﬀusions

Bjarne S. Jensen Chunyan Wang University of Southern Denmark and Copenhagen Business School Jon Johnsen Department of Mathematical Sciences, Aalborg University

2.1

Introduction

The purpose of this chapter is to analyse two basic stochastic models in the plane: the time-homogeneous Gaussian and geometric Wiener diffusions. Using the theory of stochastic processes and the Itô lemma, the probability distributions of the stochastic state vectors are described by the evolution of their moments (expectation and covariance as functions of time). These moments satisfy certain systems of (deterministic) ordinary differential equations (ODE). We solve these ODE and present explicit solutions (time paths) for the first-order and second-order moments. The forward Kolmogorov equation is used to derive the same moment functions by alternative solution methods and to gain further information on the probability distributions.

Motivation. Uncertainty or incompleteness evidently prevails in the process descriptions of many scientific disciplines. Hence, stochastic models must often be used for an adequate mathematical representation of the dynamic systems. For any stochastic process $X(t)$, an ideal but unattainable situation is to get an explicit formula for the probability distribution $P(t, x)$ of $X(t)$ at any future instant $t$. We deal with situations where we want to obtain a collection of distributions $P(t, x)$ corresponding to various drift and diffusion coefficients, $a(x, t)$ and $B(x, t)$, which enter the system of stochastic differential equations (SDE) for $X(t)$ in the Itô form as well as the forward Kolmogorov partial differential equation (PDE).

Our intention is to elucidate to what extent the usual methods of the stochastic literature can provide explicit (closed form) expressions for $P(t, x)$. It turns out that both Itô's lemma and the forward Kolmogorov equation give explicit formulas only in the very simplest cases: the Gaussian diffusion (GD) and the geometric Wiener diffusion (GWD), in which $a(X, t)$ and $B(X, t)$ are suitable first-order polynomials in the state variables $X$ alone. For these two time-homogeneous diffusion processes, it is possible in the two-dimensional case to write up the complete expressions for the evolution of the mean vector and covariance matrix and also for their probability distributions. To our knowledge, these moment solution formulas have not been derived before, although their explicit derivation facilitates the understanding and modeling of stochastic processes in physics, biology, economics, finance and technological sciences. Given the increasing usage of stochastic differential equations, the results are likely to be of general interest.

Overview of results. In section 3, the evolution of the mean vector $m(t) = [m_x(t)\ m_y(t)]$ for the GD and GWD models is given in Theorem 1 with asymptotics in Corollary 1.

In section 4, the evolution of the covariance matrix $\Sigma(t)$, written as the covariance vector $\sigma(t) = [\sigma_{xx}(t)\ \sigma_{xy}(t)\ \sigma_{yy}(t)]$, for the GD and GWD models is given in Theorem 2, with asymptotics for covariances and correlation coefficients of the GD model in Corollary 1-2.

In section 5, using the Kolmogorov partial differential equation, the evolution of the density functions and the time paths for the mean vector $m(t)$ and the covariance matrix $\Sigma(t)$ of the GD and GWD models are corroborated in Theorem 3 and Theorem 4. A density function for the GWD transition probability distribution does not exist, however, but the probability measure is given.

The limitations of the methods. Regarding the difficulties of deriving the explicit moment formulas, it is easy to understand our restriction to diffusion processes in the plane. Indeed, the time dependence often emanates from an exponential matrix $e^{tA}$, and in higher dimensions, it is in general impossible to write the eigenvalues of $A$ as functions of its entries (in contrast to cases where the characteristic polynomial has degree two). In three dimensions, there are six distinct second-order moments, and hence six linear differential equations to be solved explicitly. But according to Abel's theorem, even the general quintic polynomial is unsolvable algebraically. For planar diffusion processes, however, the roots of our cubic characteristic polynomial become simple expressions of at most eight fundamental drift and diffusion parameters.

By the Itô Lemma, the ordinary differential equations for the first and second order moments are uncoupled when drift and diffusion coefficients depend linearly on the state variables, $x$. In all other cases, knowledge of the full distribution $P(t, x)$ is required just to write down the moment differential equations. Alternatively, one could get an infinite number of coupled differential equations in moments of arbitrarily high order. We therefore only consider the GD and GWD models.

When applying the Kolmogorov equation, it is necessary to know beforehand that $P(t, x)$ has a density function. For our GWD model, this condition is not fulfilled (since we require the drift coefficient $Ax$ to have an arbitrary matrix $A$, we must, as explained below, use the Wiener process of dimension one). Moreover, the uniqueness of the obtained solution needs to be proved.

Needless to say, heavy calculations are involved in obtaining the final formulas for the second-order moment evolutions. The complications arise more from the intricate interconnections of the steps than from the difficulty of any step in particular. While computers cannot collect the intermediate elements into compact formulas, computer programs (here Maple IV) can check and confirm the final explicit solutions that we obtain.

2.2 Structure of basic diffusion processes

2.2.1 Stochastic preliminaries

Consider the stochastic differential equations (SDE) in the Itô form,

$$dX = a(X, t)\,dt + B(X, t)\,dw \tag{1}$$

$$X(t) \in \mathbb{R}^n, \qquad a \in \mathbb{R}^n, \qquad B \in \mathbb{R}^{n \times r}, \qquad w(t) \in \mathbb{R}^r, \tag{2}$$

where $a(X, t)$ is an $n$-dimensional vector function, called the drift coefficient, and $B(X, t)$ is an $n \times r$ matrix function, called the diffusion coefficient. The elements of $a$ and $B$ are Borel-measurable functions from $[0, \infty) \times \mathbb{R}^n$ into $\mathbb{R}$. The stochastic state vector is $X(t) \in \mathbb{R}^n$, while $x \in \mathbb{R}^n$ denotes the state variable.

The random (stochastic) vector, $dw \in \mathbb{R}^r$, represents the noise in the stochastic dynamic system (1). As a stochastic process, $w(t)$ is assumed to be an $r$-dimensional standard Wiener process with $t \in \mathbb{R}$, and hence $w(t)$ has continuous (but nowhere differentiable) sample paths (phase paths, trajectories).

The drift coefficient $a(X, t)$ determines the local drift (change, increment) of the expected value (mean, average trend of evolution) of the stochastic process $X(t)$ in a short interval of time from $t$ to $t + dt$ under the condition that $X(t) = x$. The matrix product $BB^T$ ($T$ = transpose) of the diffusion coefficient $B(X, t)$ determines the local dispersion (the size of the central second-order moments, the mean square deviation of the stochastic process $X(t)$ from the original position $x$) during a short period of time from $t$ to $t + dt$ (cf. Prohorov and Rozanov (1969, pp. 258, 282)).

From probability theory, it is well known that, if the multi-dimensional functions $a(x, t)$ and $B(x, t)$ satisfy both the Lipschitz and the linear growth conditions and are continuous with respect to $t$, then the stochastic process $X(t)$, solving (1), is a continuous Markov process with transition distribution (density) functions that are, under certain regularity assumptions, uniquely determined by only their first- and second-order moments. These moments are then completely described by, respectively, the drift and diffusion coefficients in (1). Such continuous Markov processes are called Itô diffusion processes.

In addition, when the transition density function $p(x, t \mid x_0)$ of a diffusion process exists, it satisfies the (forward) Kolmogorov equation (PDE),

$$\frac{\partial p(x, t \mid x_0)}{\partial t} = \frac12\sum_{j,k=1}^{n}\big[B(x, t)B^T(x, t)\big]_{j,k}\,\frac{\partial^2 p(x, t \mid x_0)}{\partial x_j\,\partial x_k} - \sum_{j=1}^{n} a_j(x, t)\,\frac{\partial p(x, t \mid x_0)}{\partial x_j}, \tag{3}$$

where the elements of $a(x, t)$ and $B(x, t)B^T(x, t)$ enter, respectively, as the coefficients of the first-order and second-order partial derivatives (cf. Prohorov (1969, p. 282)). More generally, the distribution itself, $P(t, x \mid x_0)$, solves (3). See also Appendix D.

Moreover, for any function of a diffusion process, $X(t)$, Itô's Lemma gives the following result (cf. Karatzas and Shreve (1988), and Øksendal (2005, p. 48)):

Lemma 1 (Itô). Let $X(t) \in \mathbb{R}^n$ be a general diffusion process defined as in (1). If $F(x, t)$ is an arbitrary $C^2$ map from $\mathbb{R}^{n+1} \to \mathbb{R}$, then

$$dF(X, t) = F_t\,dt + F_x^T\,dX + \tfrac12\,dX^T F_{xx}\,dX \tag{4}$$

i.e., $F(X, t)$, determined by the diffusion process $X(t)$, is again a diffusion process, where $F_t$ and $F_x$ represent, respectively, the partial derivatives with respect to $t$ and $x$ of the function $F(x, t)$, and $F_{xx}$ represents the Hessian matrix of the function $F(x, t)$, and $(dw_i)^2 = dt\ \forall i$; $dw_i \cdot dw_j = 0$ for $i \neq j$; $(dt)^2 = 0$; $dt \cdot dw_i = 0$.

Using this Lemma, one can study many properties of the diffusion process $X(t)$ governed by (1). A well-known decisive property of a diffusion process is that its conditional transition probability, under certain regularity assumptions, is uniquely determined by only the first-order and second-order moments, which again are completely determined by the drift and diffusion coefficients. Therefore, it is sufficient to study the functions for the first-order and the second-order moments of diffusion processes. By Lemma 1, the following lemma for the moment statistics of the Itô diffusion process (1) can be derived (cf. Pugachev and Sinitsyn (1987, p. 302)):

Lemma 2. The mean vector $m(t)$ and variance-covariance matrix $\Sigma(t)$ of the transition probability distribution for the family of solutions to the stochastic differential equation (1) satisfy the deterministic ordinary differential equations (ODE),

$$\dot m = \frac{dm(t)}{dt} = E\{a(X, t)\} \tag{5}$$

$$\dot\Sigma(t) = \frac{d\Sigma(t)}{dt} = E\big\{a(X, t)\,[X^T - m^T(t)] + [X - m(t)]\,a^T(X, t) + B(X, t)B^T(X, t)\big\} \tag{6}$$

where

$$m(t) = [m_1\ m_2\ \ldots\ m_n]^T = [E(X_1)\ E(X_2)\ \ldots\ E(X_n)]^T \tag{7}$$

and

$$\Sigma(t) = (\sigma_{ij})_{n \times n}, \qquad \sigma_{ij} = E[(X_i - m_i)(X_j - m_j)], \tag{8}$$

for $i, j = 1, 2, \ldots, n$.

Therefore, $m(t)$ and $\Sigma(t)$ of the dynamic stochastic system (1) can be studied by solving the differential equations (5)-(6), and the future probability behavior of the diffusion process $X(t)$ is described by the time paths of these first-order and second-order moments.

Remark 1. The differential equations for the moments usually constitute an infinite coupled system, because the right hand sides of (5)-(6) and the equations for the other moments contain moments of arbitrarily high order. Because knowledge about the full probability distribution (probability density function) is necessary to solve (5) and (6), numerical methods are much more used in practice. For more details, see Soong (1973). But when the drift and diffusion coefficients are linear functions in the state variables $x$, the solution of (5) may be inserted in (6). However, although we specialize to dimension $n = 2$, and for the GWD model to a one-dimensional Wiener process, the resulting system for the variances and covariance, $\sigma_{xx}(t)$, $\sigma_{xy}(t)$, $\sigma_{yy}(t)$, is only barely solvable. The difficulty is to determine $e^{tA}$ (cf. Appendices B and C). ∇

2.2.2 Linear Itô diffusions in the plane

Henceforth, we make three assumptions [occasionally, the state vector $x$ is written as $(x, y)$, and similarly for $X(t)$]:

Assumption 1. The stochastic dynamic system (1) is a time-homogeneous (independent of time) system in the Euclidean plane, i.e.,

$$a(x, t) = a(x), \qquad B(x, t) = B(x); \qquad x \in \mathbb{R}^2. \tag{9}$$

Assumption 2. The drift coefficients are linear functions of the state variables

$$a(x) = \begin{bmatrix} a_1(x) \\ a_2(x) \end{bmatrix} = Ax = \begin{bmatrix} a & b \\ c & d \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix}; \qquad x = \begin{bmatrix} x \\ y \end{bmatrix}. \tag{10}$$

Assumption 3. The diffusion coefficients and the vector $dw$ may, respectively, take two forms, either

$$B(x)\,dw(t) = \begin{bmatrix} \beta_{11} & \beta_{12} \\ \beta_{21} & \beta_{22} \end{bmatrix}\begin{bmatrix} dw_1 \\ dw_2 \end{bmatrix}, \tag{11}$$

i.e., the noise vector has two different independent random elements, and the diffusion coefficients are given by a constant matrix, or

$$B(x)\,dw(t) = \begin{bmatrix} \beta x \\ \beta y \end{bmatrix} dw, \tag{12}$$

i.e., the noise vector has only one random element, but the components of the diffusion coefficient depend on the state of the system; (12) is often called a geometric Wiener diffusion (GWD) process.

Combining Assumptions 1-3, we get two basic time-homogeneous diffusion models in the Euclidean plane:

GD model: General Bivariate Gaussian Diffusion: Using (9), (10) and (11), we have, in compact notation,

$$dX = AX\,dt + B\,dw \tag{13}$$

or explicitly,

$$\begin{aligned} dX &= (aX + bY)\,dt + \beta_{11}\,dw_1 + \beta_{12}\,dw_2, \\ dY &= (cX + dY)\,dt + \beta_{21}\,dw_1 + \beta_{22}\,dw_2. \end{aligned} \tag{14}$$

GWD model: Bivariate Geometric Wiener Diffusion: With (9), (10) and (12), we have

$$dX = AX\,dt + \beta X\,dw \tag{15}$$

or in explicit form,

$$\begin{aligned} dX &= (aX + bY)\,dt + \beta X\,dw, \\ dY &= (cX + dY)\,dt + \beta Y\,dw. \end{aligned} \tag{16}$$
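Both models are straightforward to simulate. The Python sketch below uses the Euler-Maruyama scheme, a standard discretization that is not discussed in this chapter; the function names, the parameter values and the number of steps are illustrative assumptions. The GD model (14) uses two independent Wiener increments and a constant matrix $B$, while the GWD model (16) uses a single increment scaled by the current state.

```python
import math
import random


def euler_maruyama(A, x0, T, n, diffusion, n_noise, rng):
    """Simulate one path of dX = A X dt + B(X) dw on [0, T] with n Euler steps; return X(T)."""
    dt = T / n
    x, y = x0
    for _ in range(n):
        # Independent Wiener increments, each N(0, dt).
        dw = [rng.gauss(0.0, math.sqrt(dt)) for _ in range(n_noise)]
        bx, by = diffusion(x, y, dw)   # the two components of B(X) dw
        dx = (A[0][0] * x + A[0][1] * y) * dt + bx
        dy = (A[1][0] * x + A[1][1] * y) * dt + by
        x, y = x + dx, y + dy
    return x, y


def gd_noise(B):
    """Additive noise of the GD model (14): constant 2x2 matrix B, two Wiener increments."""
    return lambda x, y, dw: (B[0][0] * dw[0] + B[0][1] * dw[1],
                             B[1][0] * dw[0] + B[1][1] * dw[1])


def gwd_noise(beta):
    """Multiplicative noise of the GWD model (16): one Wiener increment scaled by the state."""
    return lambda x, y, dw: (beta * x * dw[0], beta * y * dw[0])


rng = random.Random(0)                 # fixed seed so the run is reproducible
A = [[-1.0, 0.5], [0.0, -2.0]]         # illustrative drift matrix
gd_end = euler_maruyama(A, (1.0, 1.0), 1.0, 1000, gd_noise([[0.3, 0.0], [0.1, 0.2]]), 2, rng)
gwd_end = euler_maruyama(A, (1.0, 1.0), 1.0, 1000, gwd_noise(0.2), 1, rng)
print(gd_end, gwd_end)
```

Averaging many such paths gives Monte Carlo estimates of the mean vector and covariance matrix, which can be compared with the moment equations of Lemma 2.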

Remark 2. In practice, it is reasonable to think that the factors influencing the states of the system interact, and to assume that the uncertainties attributable to the interacting factors are correlated. Therefore, it is necessary to include dependent Wiener processes that are correlated with covariance matrix $\Sigma$. The stochastic model takes the form

$$\begin{aligned} dX &= (aX + bY)\,dt + \beta_{11}\,dW_1 + \beta_{12}\,dW_2, \\ dY &= (cX + dY)\,dt + \beta_{21}\,dW_1 + \beta_{22}\,dW_2. \end{aligned} \tag{17}$$

In this case, $W_1$ and $W_2$ are not independent Wiener processes, but by using a linear transformation to replace the correlated Wiener processes with independent Wiener processes, the diffusion model (14) can be obtained: Rewriting (17) in the vector-matrix form

$$dX = AX\,dt + B\,dW, \qquad W \sim N(0, \Sigma) \tag{18}$$

and setting $w = DW$ with $D = \Sigma^{-1/2}$, we obtain independent Wiener processes $w \sim N(0, 1)$. By replacing $W$ with $D^{-1}w$, (18) is transformed into the stochastic differential equations (14). In the following, we will therefore focus only on the analysis of the probability properties of the diffusion process with independent Wiener processes. ∇

2.3 Dynamics of first-order and second-order moments

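Applying Lemma 2 to (14) and (16), we get ordinary differential equations (ODE) for the moments of both the Gaussian and geometric Wiener diffusions.

Proposition 1. The components of the mean vector function, $m = [m_x \; m_y]^T$, satisfy the differential equations
\[ \dot m = \begin{bmatrix} \dot m_x \\ \dot m_y \end{bmatrix} = Am = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} m_x \\ m_y \end{bmatrix} \tag{19} \]
for both the GD and GWD models, (14) and (16), respectively.

Proof: By applying (5) to (14) or (16),
\[ \dot m = \begin{bmatrix} \dot m_x \\ \dot m_y \end{bmatrix} = E\{AX\} = E \begin{bmatrix} ax + by \\ cx + dy \end{bmatrix} = \begin{bmatrix} a m_x + b m_y \\ c m_x + d m_y \end{bmatrix}, \]
which is (19). □

Proposition 2. The covariance vector function, $\sigma = [\sigma_{xx} \; \sigma_{xy} \; \sigma_{yy}]^T$ (equivalent to the covariance matrix Σ), satisfies the differential equations:

GD model:
\[ \dot\sigma = C\sigma + \delta, \tag{20} \]
or explicitly
\[ \begin{bmatrix} \dot\sigma_{xx} \\ \dot\sigma_{xy} \\ \dot\sigma_{yy} \end{bmatrix} = \begin{bmatrix} 2a & 2b & 0 \\ c & a+d & b \\ 0 & 2c & 2d \end{bmatrix} \begin{bmatrix} \sigma_{xx} \\ \sigma_{xy} \\ \sigma_{yy} \end{bmatrix} + \begin{bmatrix} \beta_{11}^2 + \beta_{12}^2 \\ \beta_{11}\beta_{21} + \beta_{12}\beta_{22} \\ \beta_{21}^2 + \beta_{22}^2 \end{bmatrix}. \tag{21} \]

GWD model:
\[ \dot\sigma = \tilde C\sigma + \tilde\delta(t) = (C + \beta^2 I)\sigma + \tilde\delta(t), \tag{22} \]
or explicitly
\[ \begin{bmatrix} \dot\sigma_{xx} \\ \dot\sigma_{xy} \\ \dot\sigma_{yy} \end{bmatrix} = \begin{bmatrix} 2a + \beta^2 & 2b & 0 \\ c & a+d+\beta^2 & b \\ 0 & 2c & 2d + \beta^2 \end{bmatrix} \begin{bmatrix} \sigma_{xx} \\ \sigma_{xy} \\ \sigma_{yy} \end{bmatrix} + \beta^2 \begin{bmatrix} m_x^2 \\ m_x m_y \\ m_y^2 \end{bmatrix}. \tag{23} \]

Proof: Applying (6) to (14) gives the differential equation for the covariance functions; cf. (11),
\[ \frac{d}{dt}\Sigma(t) = \begin{bmatrix} \dot\sigma_{xx} & \dot\sigma_{xy} \\ \dot\sigma_{xy} & \dot\sigma_{yy} \end{bmatrix} = E\left\{ \begin{bmatrix} ax+by \\ cx+dy \end{bmatrix} \begin{bmatrix} x - m_x & y - m_y \end{bmatrix} \right\} + E\left\{ \begin{bmatrix} x - m_x \\ y - m_y \end{bmatrix} \begin{bmatrix} ax+by & cx+dy \end{bmatrix} \right\} + E\left\{ \begin{bmatrix} \beta_{11} & \beta_{12} \\ \beta_{21} & \beta_{22} \end{bmatrix} \begin{bmatrix} \beta_{11} & \beta_{21} \\ \beta_{12} & \beta_{22} \end{bmatrix} \right\}. \tag{24} \]
Since $E[x(x - m_x)] = E[(x - m_x)^2] = \sigma_{xx}$, etc., the right-hand side of (24) becomes
\[ \begin{bmatrix} 2a\sigma_{xx} + 2b\sigma_{xy} & c\sigma_{xx} + (a+d)\sigma_{xy} + b\sigma_{yy} \\ c\sigma_{xx} + (a+d)\sigma_{xy} + b\sigma_{yy} & 2c\sigma_{xy} + 2d\sigma_{yy} \end{bmatrix} + BB^T. \tag{25} \]
By letting $\sigma = [\sigma_{xx} \; \sigma_{xy} \; \sigma_{yy}]^T$ and using (24) and (25), we get (21). To obtain (23), we apply (6) to (16); the first two terms in (6) are the same as those in (24), while the third term is, cf. (12),
\[ E(BB^T) = E\left\{ \begin{bmatrix} \beta x \\ \beta y \end{bmatrix} \begin{bmatrix} \beta x & \beta y \end{bmatrix} \right\} = E \begin{bmatrix} \beta^2 x^2 & \beta^2 xy \\ \beta^2 xy & \beta^2 y^2 \end{bmatrix}. \]
As $E(x^2) = \sigma_{xx} + m_x^2$, etc., the combination of terms as before yields (23). □

Remark 3. Since the homogeneous diffusion process is a time-homogeneous Gaussian diffusion, the differential equations (21) can also be derived directly from the following expression
\[ \dot\Sigma = A\Sigma + \Sigma A^T + BB^T \tag{26} \]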

where Σ = (σ_ij) is the 2 × 2 covariance matrix, while B and A are given as in (14). The expression (26) is obtained in, e.g., Jacobsen (1970, pp. 12). ∇

The differential equations for the moments of both the Gaussian and geometric Wiener diffusion processes have now been obtained. The mean functions of the two processes satisfy the same differential equation. By contrast, the differential equation for the covariance function of the Gaussian diffusion differs from that of the geometric Wiener diffusion. For the geometric Wiener diffusion, (23) shows that the parameter β appears not only in the non-homogeneous part but also in the homogeneous part of the differential equations. In addition, the mean functions m_x(t) and m_y(t), which are functions of time t satisfying (19), now appear as components in the non-homogeneous part of (23). If one has β_1 and β_2 (instead of a single β for both state variables), or moreover two different Wiener processes dw_1 and dw_2 instead of just dw, then the equation for the covariance function in (23) becomes much more complex and probably impossible to solve. It is worthwhile to try to solve (19)-(23) explicitly in closed form, in order to study the probability distribution, nonsingularity, stability, etc., of the diffusion processes.

2.3.1 Eigenvalues of A, C, and C̃

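As a preparation, we make a few observations on the drift coefficient matrix A in (19). It has a trace, determinant, and characteristic polynomial given by
\[ |A| = ad - bc, \qquad \operatorname{tr} A = a + d, \tag{27} \]
\[ |A - \lambda I| = \lambda^2 - (\operatorname{tr} A)\lambda + |A|. \tag{28} \]
The latter (28) has discriminant $4\Delta^2$ with
\[ \Delta^2 = \tfrac14 (a+d)^2 - (ad - bc) = \tfrac14 (d-a)^2 + bc \quad\Leftrightarrow\quad \frac{bc}{\Delta^2} = 1 - \frac{(d-a)^2}{4\Delta^2}. \tag{29} \]
Henceforth, we use the notation:
\[ \Delta^2 > 0: \ \Delta \equiv +\sqrt{\Delta^2}, \qquad \Delta^2 < 0: \ \Delta \equiv +\sqrt{|\Delta^2|}. \tag{30} \]
Generally, the eigenvalues of A may be written, cf. (28)-(30), as
\[ \Delta^2 > 0: \quad \lambda_1 = \tfrac12 (a+d) + \Delta, \quad \lambda_2 = \tfrac12 (a+d) - \Delta \tag{31} \]
\[ \Delta^2 = 0: \quad \lambda = \tfrac12 (a+d) \tag{32} \]
\[ \Delta^2 < 0: \quad \lambda_1 = \tfrac12 (a+d) + i\Delta, \quad \lambda_2 = \tfrac12 (a+d) - i\Delta \tag{33} \]
[…]
For b ≠ 0, the real eigenvectors of A are proportional to (1, α_1), (1, α_2) and (1, α), with slopes
\[ \Delta^2 > 0: \quad \alpha_1 = \frac{d-a}{2b} + \frac{\Delta}{b}, \quad \alpha_2 = \frac{d-a}{2b} - \frac{\Delta}{b} \tag{34} \]
\[ \Delta^2 = 0: \quad \alpha = \frac{d-a}{2b} = \frac{2c}{a-d} \tag{35} \]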

When bc = 0, d ≠ a, one of the axes contains an eigenvector, whereas the other eigenvector has a slope given as (d − a)/b or c/(a − d).

With reference to the fundamental parameter elements of A, (19), we can now analyze the Itô coefficient matrix C, (20)-(21), defined in Proposition 2. Obviously C is real; it has trace and determinant
\[ \operatorname{tr} C = 3(a+d) = 3 \operatorname{tr} A, \tag{36} \]
\[ |C| = 4(a+d)(ad - bc) = 4 \operatorname{tr} A \, |A|, \tag{37} \]
and its characteristic polynomial is
\[ |C - \lambda I| = \lambda^3 - 3(\operatorname{tr} A)\lambda^2 + 2\left[(\operatorname{tr} A)^2 + 2|A|\right]\lambda - 4(\operatorname{tr} A)|A|. \tag{38} \]
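The identities (36)-(38) are easy to confirm numerically. The sketch below builds C from the drift parameters, compares its trace and determinant with (36)-(37), and evaluates the cubic (38) at tr A and tr A ± 2Δ, which are roots because (38) factors as (λ − tr A)(λ² − 2 tr A λ + 4|A|). The helper names and the parameter values are illustrative assumptions only.

```python
import math

def ito_matrix(a, b, c, d):
    """Coefficient matrix C of (21) for the drift matrix A = [[a, b], [c, d]]."""
    return [[2 * a, 2 * b, 0.0],
            [c, a + d, b],
            [0.0, 2 * c, 2 * d]]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def char_poly(lam, a, b, c, d):
    """The cubic on the right-hand side of (38)."""
    tr, det = a + d, a * d - b * c
    return lam ** 3 - 3 * tr * lam ** 2 + 2 * (tr ** 2 + 2 * det) * lam - 4 * tr * det

a, b, c, d = 0.3, 1.2, 0.5, -0.8                    # arbitrary values with Delta^2 > 0
C = ito_matrix(a, b, c, d)
tr_A, det_A = a + d, a * d - b * c
delta = math.sqrt(0.25 * (d - a) ** 2 + b * c)      # Delta of (29)-(30)

trace_gap = (C[0][0] + C[1][1] + C[2][2]) - 3 * tr_A          # should vanish by (36)
det_gap = det3(C) - 4 * tr_A * det_A                          # should vanish by (37)
root_values = [char_poly(r, a, b, c, d)
               for r in (tr_A, tr_A + 2 * delta, tr_A - 2 * delta)]
print(trace_gap, det_gap, root_values)
```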

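When comparing with the eigenvalues of A, λ_1 and λ_2, or λ, (31)-(33), we get the following result:

Lemma 3. The eigenvalues of the Itô matrix C, (20)-(21), associated with the Gaussian diffusions, are in the nonsingular case, |C| ≠ 0, given by
\[ \Delta^2 > 0: \quad \lambda_1^C = \operatorname{tr} A, \quad \lambda_2^C = \operatorname{tr} A + 2\Delta = 2\lambda_1, \quad \lambda_3^C = \operatorname{tr} A - 2\Delta = 2\lambda_2 \tag{39} \]
\[ \Delta^2 = 0: \quad \lambda_1^C = \lambda_2^C = \lambda_3^C = \operatorname{tr} A \tag{40} \]
\[ \Delta^2 < 0: \quad \lambda_1^C = \operatorname{tr} A, \quad \lambda_2^C = \operatorname{tr} A + 2i\Delta, \quad \lambda_3^C = \operatorname{tr} A - 2i\Delta \tag{41} \]
[…]
\[ \Delta^2 > 0: \quad m(t) = e^{\frac12 (a+d)t} \begin{bmatrix} \cosh(\Delta t) - \frac{d-a}{2\Delta}\sinh(\Delta t) & \frac{b}{\Delta}\sinh(\Delta t) \\ \frac{c}{\Delta}\sinh(\Delta t) & \cosh(\Delta t) + \frac{d-a}{2\Delta}\sinh(\Delta t) \end{bmatrix} m_0 \tag{45} \]
\[ \Delta^2 = 0: \quad m(t) = e^{\frac12 (a+d)t} \begin{bmatrix} 1 - \frac12 (d-a)t & bt \\ ct & 1 + \frac12 (d-a)t \end{bmatrix} m_0 \tag{46} \]
\[ \Delta^2 < 0: \quad m(t) = e^{\frac12 (a+d)t} \begin{bmatrix} \cos(\Delta t) - \frac{d-a}{2\Delta}\sin(\Delta t) & \frac{b}{\Delta}\sin(\Delta t) \\ \frac{c}{\Delta}\sin(\Delta t) & \cos(\Delta t) + \frac{d-a}{2\Delta}\sin(\Delta t) \end{bmatrix} m_0 \tag{47} \]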

where $m_0 = [m_x(0) \; m_y(0)]^T$ is the initial value of the expectation function at time t = 0.

Proof: Although the infinite series $e^{tA}$ converges for all t and all square matrices A, it seems impossible in general to express $e^{tA}$ in closed form. But in the two-dimensional case, the series may be summed in terms of elementary functions by an application of the Hamilton–Cayley Theorem.⁵ □

Figure 1: Drift parameter regions and the mean functions (45)-(47)

⁵ Cf. Appendix B, and Jensen (1994, p. 307).
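The closed forms (45)-(47) can be compared with a direct summation of the exponential series. The sketch below is an illustration only: `mean_matrix` and `expm_series` are names chosen here, and the truncated Taylor series merely serves as an independent reference value.

```python
import math

def mean_matrix(a, b, c, d, t):
    """Closed-form e^{tA} for A = [[a, b], [c, d]], following the cases (45)-(47)."""
    d2 = 0.25 * (d - a) ** 2 + b * c            # Delta^2 of (29)
    h = 0.5 * (d - a)
    if d2 > 0:
        D = math.sqrt(d2)
        c0, s0 = math.cosh(D * t), math.sinh(D * t) / D
    elif d2 < 0:
        D = math.sqrt(-d2)
        c0, s0 = math.cos(D * t), math.sin(D * t) / D
    else:
        c0, s0 = 1.0, t
    g = math.exp(0.5 * (a + d) * t)
    return [[g * (c0 - h * s0), g * b * s0],
            [g * c * s0, g * (c0 + h * s0)]]

def expm_series(A, t, terms=60):
    """Reference value: truncated Taylor series of e^{tA}."""
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[sum(term[i][m] * A[m][j] for m in range(n)) * t / k
                 for j in range(n)] for i in range(n)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

# One illustrative parameter set per case: Delta^2 > 0, < 0, = 0.
for a, b, c, d in [(0.3, 1.2, 0.5, -0.8), (0.0, 1.0, -1.0, 0.0), (1.0, 1.0, 0.0, 1.0)]:
    closed = mean_matrix(a, b, c, d, 0.7)
    series = expm_series([[a, b], [c, d]], 0.7)
    print(max(abs(closed[i][j] - series[i][j]) for i in range(2) for j in range(2)))
```

The mean at time t is then obtained by applying the returned matrix to m_0.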

2.4.2 Asymptotics of the mean vector function

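For convenience we shall write f ∼ g whenever the functions f(t) and g(t) are such that the ratio f(t)/g(t) tends to 1 for t → ∞. Moreover, for two vector functions u and v, the relation u ∼ v means that u_j ∼ v_j for each j.

Corollary 1. The asymptotic (long-run) behavior for t → ∞ of the expectation function m(t) in Theorem 1 is given by
\[ \Delta^2 > 0: \quad \begin{cases} \lambda_1 < 0, \ \lambda_2 < 0: & \lim_{t\to\infty} m(t) = 0 \\[4pt] \lambda_1 > 0, \ \frac{m_{0y}}{m_{0x}} \neq \alpha_2: & m(t) \sim \frac12 e^{\left(\frac12 (a+d) + \Delta\right)t} \begin{bmatrix} 1 - \frac{d-a}{2\Delta} & \frac{b}{\Delta} \\ \frac{c}{\Delta} & 1 + \frac{d-a}{2\Delta} \end{bmatrix} m_0 \\[4pt] \lambda_1 > 0, \ \lambda_2 < 0, \ \frac{m_{0y}}{m_{0x}} = \alpha_2: & \lim_{t\to\infty} m(t) = 0 \\[4pt] \lambda_1 > 0, \ \lambda_2 > 0, \ \frac{m_{0y}}{m_{0x}} = \alpha_2: & m(t) \sim \frac12 e^{\left(\frac12 (a+d) - \Delta\right)t} \begin{bmatrix} 1 + \frac{d-a}{2\Delta} & -\frac{b}{\Delta} \\ -\frac{c}{\Delta} & 1 - \frac{d-a}{2\Delta} \end{bmatrix} m_0 \end{cases} \tag{48} \]
\[ \Delta^2 \le 0: \quad \operatorname{tr} A < 0: \quad \lim_{t\to\infty} m(t) = 0. \]

Proof: Writing the components of m(t), (45), in alternative explicit form gives
\[ m_x(t) = \tfrac12 e^{\frac12 (a+d+2\Delta)t} \left[ \left(1 - \tfrac{d-a}{2\Delta}\right) m_x(0) + \tfrac{b}{\Delta} m_y(0) \right] + \tfrac12 e^{\frac12 (a+d-2\Delta)t} \left[ \left(1 + \tfrac{d-a}{2\Delta}\right) m_x(0) - \tfrac{b}{\Delta} m_y(0) \right], \tag{49} \]
\[ m_y(t) = \tfrac12 e^{\frac12 (a+d+2\Delta)t} \left[ \tfrac{c}{\Delta} m_x(0) + \left(1 + \tfrac{d-a}{2\Delta}\right) m_y(0) \right] + \tfrac12 e^{\frac12 (a+d-2\Delta)t} \left[ -\tfrac{c}{\Delta} m_x(0) + \left(1 - \tfrac{d-a}{2\Delta}\right) m_y(0) \right]. \tag{50} \]
When λ_1 < 0 and λ_2 < 0, the coefficients of t in the exponents above are negative, and hence $\lim_{t\to\infty} m_x(t) = 0$ and $\lim_{t\to\infty} m_y(t) = 0$. In the other cases, the terms with $e^{\frac12 (a+d+2\Delta)t}$ dominate; the terms with $e^{\frac12 (a+d-2\Delta)t}$ dominate only for the special initial values $r_0 = m_y(0)/m_x(0) = \alpha_2$, for which the coefficients of $e^{\frac12 (a+d+2\Delta)t}$ vanish. In this way, the Corollary is verified. □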

Remark 4. For parameter regions of the drift coefficients, as depicted on the plane (tr A, |A|), cf. (27)-(29), the geometry of the global phase portraits and the evolution (exact time paths) of the mean functions (45)-(47) are shown in Fig. 1. As seen further below, the parameter regions for classifying the global behavior of the covariance functions also correspond to the regions depicted in Fig. 1. Those GD and GWD processes with asymptotic stationary probability distributions are located in the interior of the second (upper-left, shaded) quadrant in Fig. 1.⁶ ∇

Whereas the sample paths of the diffusion processes are always continuous but nondifferentiable, the evolution of their moments is smooth (described by C −1 -curves), as illustrated by the mean functions in Fig. 1.

2.5 Covariance matrix functions

2.5.1 Solutions to the differential equations

From (20)-(23), the covariance function $\sigma = [\sigma_{xx} \; \sigma_{xy} \; \sigma_{yy}]^T$, which contains the three independent elements of the covariance matrix Σ(t) for both the Gaussian and geometric Wiener diffusions, satisfies a 3-dimensional ordinary differential equation of the general symbolic form
\[ \dot\sigma(t) = C\sigma(t) + \delta(t). \tag{51} \]
The complete solution of (51) can be symbolically written as
\[ \sigma(t) = e^{tC} \left[ \sigma(0) + \int_0^t e^{-\tau C} \delta(\tau)\, d\tau \right], \qquad e^{0C} = I, \quad e^{-tC} = \left(e^{tC}\right)^{-1}, \tag{52} \]
where the parameter $\sigma(0) \in \mathbb{R}^3$ plays the role of initial value of the solution. For a given particular solution $\bar\sigma(t)$ of (51), the complete solution σ(t) passing through σ_0 at t = 0, σ(0) = σ_0, is
\[ \sigma(t) = e^{tC} \sigma^*(0) + \bar\sigma(t), \tag{53} \]
\[ \sigma^*(0) = \sigma_0 - \bar\sigma(0), \tag{54} \]
where σ*(0) denotes the initial value of the solution to the homogeneous system corresponding to (51). This symbolic notation is used below with specific matrices.

⁶ For a full dynamic description of the trajectory configurations in Fig. 1, see Jensen (1994, pp. 235).
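For a constant forcing vector δ and a nonsingular C, a constant particular solution solves Cσ̄ + δ = 0, and (53)-(54) then give the complete solution. The sketch below compares this with a direct Runge-Kutta integration of (51); all function names and parameter values are illustrative assumptions, and the Taylor series for the matrix exponential is used only as a convenient reference.

```python
def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def solve3(m, rhs):
    """Cramer's rule for a nonsingular 3x3 system m x = rhs."""
    dm = det3(m)
    return [det3([[rhs[i] if k == j else m[i][k] for k in range(3)]
                  for i in range(3)]) / dm for j in range(3)]

def matvec(m, v):
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

def expm_series(C, t, terms=60):
    """Truncated Taylor series of e^{tC}."""
    result = [[float(i == j) for j in range(3)] for i in range(3)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[sum(term[i][m] * C[m][j] for m in range(3)) * t / k
                 for j in range(3)] for i in range(3)]
        result = [[result[i][j] + term[i][j] for j in range(3)] for i in range(3)]
    return result

def closed_form(C, delta, sigma0, t):
    sigma_bar = solve3(C, [-x for x in delta])              # C sigma_bar + delta = 0
    start = [s - sb for s, sb in zip(sigma0, sigma_bar)]    # sigma*(0) of (54)
    hom = matvec(expm_series(C, t), start)
    return [h + sb for h, sb in zip(hom, sigma_bar)]        # (53)

def rk4(C, delta, sigma0, t, steps=2000):
    """Classical fourth-order Runge-Kutta integration of (51) with constant delta."""
    f = lambda s: [x + dl for x, dl in zip(matvec(C, s), delta)]
    h, s = t / steps, list(sigma0)
    for _ in range(steps):
        k1 = f(s)
        k2 = f([si + 0.5 * h * ki for si, ki in zip(s, k1)])
        k3 = f([si + 0.5 * h * ki for si, ki in zip(s, k2)])
        k4 = f([si + h * ki for si, ki in zip(s, k3)])
        s = [si + h / 6 * (p + 2 * q + 2 * r + w)
             for si, p, q, r, w in zip(s, k1, k2, k3, k4)]
    return s

a, b, c, d = -1.0, 0.5, -0.3, -0.8                           # illustrative drift
C = [[2 * a, 2 * b, 0.0], [c, a + d, b], [0.0, 2 * c, 2 * d]]
delta = [0.17, 0.03, 0.09]                                   # BB^T entries for B = [[0.4, 0.1], [0.0, 0.3]]
sigma0 = [0.2, 0.0, 0.1]
print(closed_form(C, delta, sigma0, 1.0))
print(rk4(C, delta, sigma0, 1.0))
```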

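Exponential matrices $e^{tC}$ and $e^{t\tilde C}$

As a first step in determining the solution in (52), we use the eigenvalues of our specific C and C̃, cf. Lemma 3, to calculate the relevant exponential matrices, $e^{tC}$ and $e^{t\tilde C}$.

Proposition 3. The exponential matrix $e^{tC}$ of the GD model with C, (20)-(21), is given by
\[ e^{tC} = e^{(a+d)t} M(t), \qquad e^{0C} = M(0) = I, \tag{55} \]
where the matrix M(t) is presented in Tables 1-3 as follows:

Table 1: The nonsingular case, |C| ≠ 0, with matrix M(t) as M_+(t), M_0(t), M_−(t) for, respectively, Δ² > 0, Δ² = 0 and Δ² < 0.
Table 2: The singular case, |C| = 0 and tr A = 0, with M(t) as M_+(t), M_0(t), M_−(t) for, respectively, Δ² > 0, Δ² = 0 and Δ² < 0.
Table 3: The singular case, |C| = 0 and |A| = 0, with a single M(t) matrix.

The exponential matrix $e^{t\tilde C}$ of the GWD model with C̃, (22)-(23), is given by
\[ e^{t\tilde C} = e^{\beta^2 t} e^{tC} = e^{(a+d+\beta^2)t} M(t), \qquad e^{0\tilde C} = M(0) = I, \tag{56} \]
where the matrix M(t) is presented in Tables 1-3.

Proof: For Δ² > 0, the eigenvalues of C are distinct, and [Appendix C]
\[ e^{tC} = e^{\lambda_1^C t} \frac{(C - \lambda_2^C I)(C - \lambda_3^C I)}{(\lambda_1^C - \lambda_2^C)(\lambda_1^C - \lambda_3^C)} + e^{\lambda_2^C t} \frac{(C - \lambda_1^C I)(C - \lambda_3^C I)}{(\lambda_2^C - \lambda_1^C)(\lambda_2^C - \lambda_3^C)} + e^{\lambda_3^C t} \frac{(C - \lambda_1^C I)(C - \lambda_2^C I)}{(\lambda_3^C - \lambda_1^C)(\lambda_3^C - \lambda_2^C)}. \]
By inserting the eigenvalues of C for Δ² > 0, cf. (39), we get
\[ \begin{aligned} e^{tC} ={}& \frac{1}{8\Delta^2} e^{(\operatorname{tr} A + 2\Delta)t} \left[ C^2 - 2(\operatorname{tr} A - \Delta) C + \operatorname{tr} A (\operatorname{tr} A - 2\Delta) I \right] \\ &+ \frac{1}{8\Delta^2} e^{(\operatorname{tr} A - 2\Delta)t} \left[ C^2 - 2(\operatorname{tr} A + \Delta) C + \operatorname{tr} A (\operatorname{tr} A + 2\Delta) I \right] \\ &- \frac{1}{4\Delta^2} e^{(\operatorname{tr} A)t} \left[ C^2 - 2(\operatorname{tr} A) C + (\operatorname{tr} A - 2\Delta)(\operatorname{tr} A + 2\Delta) I \right] \\ ={}& \frac{e^{(\operatorname{tr} A)t}}{4\Delta^2} \left\{ \left(\cosh(2\Delta t) - 1\right) C^2 + 2\left( \operatorname{tr} A - \operatorname{tr} A \cosh(2\Delta t) + \Delta \sinh(2\Delta t) \right) C \right\} \\ &+ \frac{e^{(\operatorname{tr} A)t}}{4\Delta^2} \left\{ (\operatorname{tr} A)^2 \cosh(2\Delta t) - 2\Delta \operatorname{tr} A \sinh(2\Delta t) - 4|A| \right\} I. \end{aligned} \tag{57} \]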

Using that
\[ C^2 = \begin{bmatrix} 4a^2 + 2bc & 6ab + 2bd & 2b^2 \\ 3ac + dc & (a+d)^2 + 4bc & 3bd + ab \\ 2c^2 & 6cd + 2ac & 4d^2 + 2bc \end{bmatrix}, \tag{58} \]
a regrouping of the terms in (57) leads to M_+(t) in Table 1. Similarly, when Δ² < 0, there are also three distinct eigenvalues, and following the same procedure with (41), we get M_−(t) in Table 1.

For the case Δ² = 0, it is seen from (40) that C has three identical eigenvalues, $\lambda_1^C = \lambda_2^C = \lambda_3^C = \lambda^C = \operatorname{tr} A$. Hence, by Appendix C,
\[ e^{tC} = e^{\lambda^C t} \left[ I + t\,(C - \lambda^C I) + \tfrac12 t^2 (C - \lambda^C I)^2 \right] = e^{(\operatorname{tr} A)t} \left[ I + (C - \operatorname{tr} A\, I)\,t + \tfrac12 (C - \operatorname{tr} A\, I)^2 t^2 \right]. \tag{59} \]
By inserting the matrices, cf. (21),
\[ C - \operatorname{tr} A\, I = \begin{bmatrix} a - d & 2b & 0 \\ c & 0 & b \\ 0 & 2c & d - a \end{bmatrix}, \tag{60} \]
\[ (C - \operatorname{tr} A\, I)^2 = \begin{bmatrix} (d-a)^2 + 2bc & -2b(d-a) & 2b^2 \\ -c(d-a) & 4bc & b(d-a) \\ 2c^2 & 2c(d-a) & (d-a)^2 + 2bc \end{bmatrix} \tag{61} \]
into (59), M_0(t) in Table 1 is obtained. The singular cases in Tables 2-3 are the relevant simplifications of M_+(t), M_0(t) and M_−(t) in Table 1. From (23), we have $\tilde C = C + \beta^2 I$, where C and β²I commute, so that
\[ e^{t\tilde C} = e^{\beta^2 t I} \cdot e^{tC} = e^{\beta^2 t} \cdot e^{tC}. \]
□

Proposition 3 is a major result of our investigation. Having obtained explicit expressions for the exponential matrices $e^{tC}$ and $e^{t\tilde C}$, the problem is now, by (53) and (52), to calculate the particular solutions of the differential equations (21) and (23).
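A numerical cross-check of Proposition 3 is straightforward. By (26) with B = 0, the homogeneous covariance equation is solved by Σ(t) = e^{tA} Σ(0) e^{tA^T}, so the entries of e^{tC} are quadratic expressions in the entries of e^{tA}. The sketch below uses this observation (the helper name `symmetric_square` is introduced here, not in the text) and also compares two entries with the M_+(t) expressions of Table 1; the parameter values are arbitrary with Δ² > 0.

```python
import math

def expm_series(A, t, terms=60):
    """Truncated Taylor series of e^{tA} for a square matrix A."""
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[sum(term[i][m] * A[m][j] for m in range(n)) * t / k
                 for j in range(n)] for i in range(n)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def symmetric_square(N):
    """Map N = e^{tA} to the matrix acting on [s_xx, s_xy, s_yy] in N S N^T."""
    (n11, n12), (n21, n22) = N
    return [[n11 * n11, 2 * n11 * n12, n12 * n12],
            [n11 * n21, n11 * n22 + n12 * n21, n12 * n22],
            [n21 * n21, 2 * n21 * n22, n22 * n22]]

a, b, c, d, t = 0.3, 1.2, 0.5, -0.8, 0.7
A = [[a, b], [c, d]]
C = [[2 * a, 2 * b, 0.0], [c, a + d, b], [0.0, 2 * c, 2 * d]]

exp_tC = expm_series(C, t)
from_A = symmetric_square(expm_series(A, t))
gap = max(abs(exp_tC[i][j] - from_A[i][j]) for i in range(3) for j in range(3))

# Two entries of e^{(a+d)t} M_+(t) from Table 1 (case Delta^2 > 0).
D2 = 0.25 * (d - a) ** 2 + b * c
D = math.sqrt(D2)
g = math.exp((a + d) * t)
entry_13 = g * b * b / (2 * D2) * (math.cosh(2 * D * t) - 1)
entry_22 = g * (b * c / D2 * (math.cosh(2 * D * t) - 1) + 1)
print(gap, exp_tC[0][2] - entry_13, exp_tC[1][1] - entry_22)
```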

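Table 1. The matrices M_+(t), M_0(t), M_−(t) for the nonsingular case: |C| ≠ 0.

\[ M_+(t) = \begin{bmatrix}
\left(1 - \frac{bc}{2\Delta^2}\right)\cosh(2\Delta t) - \frac{d-a}{2\Delta}\sinh(2\Delta t) + \frac{bc}{2\Delta^2} &
-\frac{b(d-a)}{2\Delta^2}\left(\cosh(2\Delta t) - 1\right) + \frac{b}{\Delta}\sinh(2\Delta t) &
\frac{b^2}{2\Delta^2}\left(\cosh(2\Delta t) - 1\right) \\[6pt]
-\frac{c(d-a)}{4\Delta^2}\left(\cosh(2\Delta t) - 1\right) + \frac{c}{2\Delta}\sinh(2\Delta t) &
\frac{bc}{\Delta^2}\left(\cosh(2\Delta t) - 1\right) + 1 &
\frac{b(d-a)}{4\Delta^2}\left(\cosh(2\Delta t) - 1\right) + \frac{b}{2\Delta}\sinh(2\Delta t) \\[6pt]
\frac{c^2}{2\Delta^2}\left(\cosh(2\Delta t) - 1\right) &
\frac{c(d-a)}{2\Delta^2}\left(\cosh(2\Delta t) - 1\right) + \frac{c}{\Delta}\sinh(2\Delta t) &
\left(1 - \frac{bc}{2\Delta^2}\right)\cosh(2\Delta t) + \frac{d-a}{2\Delta}\sinh(2\Delta t) + \frac{bc}{2\Delta^2}
\end{bmatrix} \]

\[ M_0(t) = \begin{bmatrix}
\left[1 - \frac12 (d-a)t\right]^2 & 2bt - b(d-a)t^2 & b^2 t^2 \\[4pt]
ct - \frac12 c(d-a)t^2 & 1 - \frac12 (d-a)^2 t^2 & bt + \frac12 b(d-a)t^2 \\[4pt]
c^2 t^2 & 2ct + c(d-a)t^2 & \left[1 + \frac12 (d-a)t\right]^2
\end{bmatrix} \]

\[ M_-(t) = \begin{bmatrix}
\left(1 - \frac{bc}{2\Delta^2}\right)\cos(2\Delta t) - \frac{d-a}{2\Delta}\sin(2\Delta t) + \frac{bc}{2\Delta^2} &
-\frac{b(d-a)}{2\Delta^2}\left(\cos(2\Delta t) - 1\right) + \frac{b}{\Delta}\sin(2\Delta t) &
\frac{b^2}{2\Delta^2}\left(\cos(2\Delta t) - 1\right) \\[6pt]
-\frac{c(d-a)}{4\Delta^2}\left(\cos(2\Delta t) - 1\right) + \frac{c}{2\Delta}\sin(2\Delta t) &
\frac{bc}{\Delta^2}\left(\cos(2\Delta t) - 1\right) + 1 &
\frac{b(d-a)}{4\Delta^2}\left(\cos(2\Delta t) - 1\right) + \frac{b}{2\Delta}\sin(2\Delta t) \\[6pt]
\frac{c^2}{2\Delta^2}\left(\cos(2\Delta t) - 1\right) &
\frac{c(d-a)}{2\Delta^2}\left(\cos(2\Delta t) - 1\right) + \frac{c}{\Delta}\sin(2\Delta t) &
\left(1 - \frac{bc}{2\Delta^2}\right)\cos(2\Delta t) + \frac{d-a}{2\Delta}\sin(2\Delta t) + \frac{bc}{2\Delta^2}
\end{bmatrix} \]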

Table 2. The matrices M_+(t), M_0(t), M_−(t) for the singularity: |C| = 0, with tr A = 0.

\[ M_+(t) = \frac{1}{2\Delta^2} \begin{bmatrix}
(a^2 + \Delta^2)\cosh(2\Delta t) + 2a\Delta\sinh(2\Delta t) + bc &
2ab\left[\cosh(2\Delta t) - 1\right] + 2b\Delta\sinh(2\Delta t) &
b^2\left[\cosh(2\Delta t) - 1\right] \\[4pt]
ac\left[\cosh(2\Delta t) - 1\right] + c\Delta\sinh(2\Delta t) &
2bc\left[\cosh(2\Delta t) - 1\right] + 2\Delta^2 &
bd\left[\cosh(2\Delta t) - 1\right] + b\Delta\sinh(2\Delta t) \\[4pt]
c^2\left[\cosh(2\Delta t) - 1\right] &
2cd\left[\cosh(2\Delta t) - 1\right] + 2c\Delta\sinh(2\Delta t) &
(d^2 + \Delta^2)\cosh(2\Delta t) + 2d\Delta\sinh(2\Delta t) + bc
\end{bmatrix} \]

\[ M_0(t) = \begin{bmatrix}
(at + 1)^2 & 2bt(at + 1) & b^2 t^2 \\
ct(at + 1) & 2bct^2 + 1 & bt(dt + 1) \\
c^2 t^2 & 2ct(dt + 1) & (dt + 1)^2
\end{bmatrix} \]

\[ M_-(t) = \frac{1}{2\Delta^2} \begin{bmatrix}
(a^2 + \Delta^2)\cos(2\Delta t) + 2a\Delta\sin(2\Delta t) + bc &
2ab\left[\cos(2\Delta t) - 1\right] + 2b\Delta\sin(2\Delta t) &
b^2\left[\cos(2\Delta t) - 1\right] \\[4pt]
ac\left[\cos(2\Delta t) - 1\right] + c\Delta\sin(2\Delta t) &
2bc\left[\cos(2\Delta t) - 1\right] + 2\Delta^2 &
bd\left[\cos(2\Delta t) - 1\right] + b\Delta\sin(2\Delta t) \\[4pt]
c^2\left[\cos(2\Delta t) - 1\right] &
2cd\left[\cos(2\Delta t) - 1\right] + 2c\Delta\sin(2\Delta t) &
(d^2 + \Delta^2)\cos(2\Delta t) + 2d\Delta\sin(2\Delta t) + bc
\end{bmatrix} \]

Table 3. The matrix M(t) for the singularity: |C| = 0, with |A| = 0.

\[ M(t) = \frac{1}{(a+d)^2} \begin{bmatrix}
\left[a e^{\frac12 (a+d)t} + d e^{-\frac12 (a+d)t}\right]^2 &
2ab\,e^{(a+d)t} + 2b(d-a) - 2bd\,e^{-(a+d)t} &
2b^2\left[\cosh((a+d)t) - 1\right] \\[4pt]
ac\,e^{(a+d)t} + c(d-a) - dc\,e^{-(a+d)t} &
4bc\left[\cosh((a+d)t) - 1\right] + (a+d)^2 &
bd\,e^{(a+d)t} - b(d-a) - ab\,e^{-(a+d)t} \\[4pt]
2c^2\left[\cosh((a+d)t) - 1\right] &
2cd\,e^{(a+d)t} - 2c(d-a) - 2ac\,e^{-(a+d)t} &
\left[d e^{\frac12 (a+d)t} + a e^{-\frac12 (a+d)t}\right]^2
\end{bmatrix} \]

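Particular solutions

Proposition 4. For the GD model, (20)-(21), particular solutions $\bar\sigma(t)$ are provided by the following expressions. In the nonsingular case, |C| = 4|A| tr A ≠ 0, $\bar\sigma(t)$ can be taken as a constant vector:
\[ \bar\sigma = \begin{bmatrix} \bar\sigma_{xx} \\ \bar\sigma_{xy} \\ \bar\sigma_{yy} \end{bmatrix} = \frac{1}{2|A| \operatorname{tr} A} \begin{bmatrix} -|A|(\beta_{11}^2 + \beta_{12}^2) - (d\beta_{11} - b\beta_{21})^2 - (b\beta_{22} - d\beta_{12})^2 \\ cd\beta_{11}^2 - 2ad\beta_{11}\beta_{21} + ab\beta_{21}^2 + cd\beta_{12}^2 - 2ad\beta_{12}\beta_{22} + ab\beta_{22}^2 \\ -|A|(\beta_{21}^2 + \beta_{22}^2) - (c\beta_{11} - a\beta_{21})^2 - (c\beta_{12} - a\beta_{22})^2 \end{bmatrix} \tag{62} \]
In the singular case, |C| = 4|A| tr A = 0, the particular solution $\bar\sigma(t)$ is as follows: if
\[ \operatorname{tr} A = 0: \quad \bar\sigma(t) = \begin{bmatrix} \gamma t \\[2pt] \frac{1}{2b}\left[\gamma - (\beta_{11}^2 + \beta_{12}^2)\right] - \frac{a}{b}\gamma t \\[2pt] -\frac{c}{b}\gamma t - \frac{a}{b^2}\gamma - \frac{\beta_{11}\beta_{21} + \beta_{12}\beta_{22}}{b} \end{bmatrix} \tag{63} \]
where
\[ \gamma = \frac{1}{2|A|}\left[ (d\beta_{11} - b\beta_{21})^2 + (d\beta_{12} - b\beta_{22})^2 \right] + \frac12 (\beta_{11}^2 + \beta_{12}^2), \]
or if
\[ |A| = 0: \quad \bar\sigma(t) = \begin{bmatrix} \gamma t \\[2pt] \frac{1}{2b}\left[\gamma - (\beta_{11}^2 + \beta_{12}^2)\right] - \frac{a}{b}\gamma t \\[2pt] \frac{a^2}{b^2}\gamma t + \frac{a+d}{2b^2}\left(\beta_{11}^2 + \beta_{12}^2 - \gamma\right) - \frac{a}{b^2}\gamma - \frac{\beta_{11}\beta_{21} + \beta_{12}\beta_{22}}{b} \end{bmatrix} \tag{64} \]
where γ now denotes $(\operatorname{tr} A)^{-2}\left[(d\beta_{11} - b\beta_{21})^2 + (d\beta_{12} - b\beta_{22})^2\right]$.

For the GWD model with |C̃| ≠ 0, (22)-(23), a particular component $\tilde\sigma(t)$ is
\[ \tilde\sigma(t) = \int_0^t e^{-\tau\tilde C}\,\tilde\delta(\tau)\,d\tau = \left(1 - e^{-\beta^2 t}\right) \begin{bmatrix} m_x^2(0) \\ m_x(0) m_y(0) \\ m_y^2(0) \end{bmatrix}, \tag{65} \]
where [$e^{-t\tilde C}$ is the inverse of (56); $e^{-t\tilde C} = e^{-(a+d+\beta^2)t} M(-t)$, $M(-t) = [M(t)]^{-1}$]
\[ e^{-t\tilde C}\,\tilde\delta(t) = e^{-\beta^2 t} \beta^2 \begin{bmatrix} m_x^2(0) \\ m_x(0) m_y(0) \\ m_y^2(0) \end{bmatrix}, \tag{66} \]
and m_x(0) and m_y(0) are the initial values of the mean function m(t), cf. (45)-(47) in Theorem 1.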

Proof: Concerning (62)-(64) for the GD model, see Appendix A. For the GWD model, (23) shows that the nonhomogeneous part depends on the mean function, which is a function of time t. Inserting $\tilde\delta(t)$, (22)-(23), and (45)-(47) into (52), we get (65) and (66) after remarkable simplification. □

Remark 5. The RHS of formula (65) for the GWD model is another main result of this study. It is noteworthy how the initial mean values of m_x and m_y enter there in precisely the same manner as they enter in $\tilde\delta(t)$, cf. (22)-(23). The reductions leading to the RHS of formula (65) are really lengthy; it is decisive, first, to have Theorem 1 on the means (45)-(47) available, and second, to have an extensive cancelation of terms in the integrand (66). The details are left out to save space, but a technical report may be obtained from the authors. Alternatively, one could also make a computer-aided calculation of the integral (65), using Maple or similar software. ∇

General covariance vector solutions

Finally, by (52)-(53), (55)-(56), and Propositions 3-4, we get:

Theorem 2. The covariance vector $\sigma(t) = [\sigma_{xx}(t) \; \sigma_{xy}(t) \; \sigma_{yy}(t)]^T$ that explicitly solves (20)-(21) for the GD model with the nonsingular Itô coefficient matrix, |C| ≠ 0, is [cf. (53), (55), (62), (63) and (64)]
\[ \sigma(t) = e^{(a+d)t} M(t)\,\sigma^*(0) + \bar\sigma(t). \tag{67} \]
This vector σ(t) is presented in Tables 4, 5, and 6 for the three cases: Δ² > 0, Δ² = 0, and Δ² < 0, respectively.

The covariance vector $\sigma(t) = [\sigma_{xx}(t) \; \sigma_{xy}(t) \; \sigma_{yy}(t)]^T$ that explicitly solves (22)-(23) for the GWD model is [cf. (52), (56) and (65)]
\[ \sigma(t) = e^{(a+d+\beta^2)t} M(t) \left[ \sigma(0) + \tilde\sigma(t) \right]. \tag{68} \]
This vector σ(t) is presented in Tables 7, 8, and 9 for the cases: Δ² > 0, Δ² = 0, and Δ² < 0, respectively.

Table 4. GD function σ(t), (53), (55), (62), (29), for Δ² > 0, |C| ≠ 0:

\[ \sigma_{xx}(t) = e^{(a+d)t} \left\{ k_1^* e^{2\Delta t} + k_2^* e^{-2\Delta t} + k_3^* \right\} + \bar\sigma_{xx} \tag{69} \]
\[ \begin{aligned}
k_1^* &= \tfrac14 \left(1 - \tfrac{d-a}{2\Delta}\right)^2 \sigma_{xx}^*(0) + \tfrac{b}{2\Delta}\left(1 - \tfrac{d-a}{2\Delta}\right) \sigma_{xy}^*(0) + \tfrac{b^2}{4\Delta^2} \sigma_{yy}^*(0) \\
k_2^* &= \tfrac14 \left(1 + \tfrac{d-a}{2\Delta}\right)^2 \sigma_{xx}^*(0) - \tfrac{b}{\Delta}\left(\tfrac12 + \tfrac{d-a}{4\Delta}\right) \sigma_{xy}^*(0) + \tfrac{b^2}{4\Delta^2} \sigma_{yy}^*(0) \\
k_3^* &= \tfrac{b}{2\Delta^2} \left[ c\,\sigma_{xx}^*(0) + (d-a)\,\sigma_{xy}^*(0) - b\,\sigma_{yy}^*(0) \right] \\
& k_1^* + k_2^* + k_3^* = \sigma_{xx}^*(0)
\end{aligned} \]

\[ \sigma_{xy}(t) = e^{(a+d)t} \left\{ k_4^* e^{2\Delta t} + k_5^* e^{-2\Delta t} + k_6^* \right\} + \bar\sigma_{xy} \tag{70} \]
\[ \begin{aligned}
k_4^* &= \tfrac{c}{4\Delta}\left(1 - \tfrac{d-a}{2\Delta}\right) \sigma_{xx}^*(0) + \tfrac{bc}{2\Delta^2} \sigma_{xy}^*(0) + \tfrac{b}{4\Delta}\left(1 + \tfrac{d-a}{2\Delta}\right) \sigma_{yy}^*(0) \\
k_5^* &= -\tfrac{c}{4\Delta}\left(1 + \tfrac{d-a}{2\Delta}\right) \sigma_{xx}^*(0) + \tfrac{bc}{2\Delta^2} \sigma_{xy}^*(0) - \tfrac{b}{4\Delta}\left(1 - \tfrac{d-a}{2\Delta}\right) \sigma_{yy}^*(0) \\
k_6^* &= \tfrac{d-a}{4\Delta^2} \left[ c\,\sigma_{xx}^*(0) + (d-a)\,\sigma_{xy}^*(0) - b\,\sigma_{yy}^*(0) \right] \\
& k_4^* + k_5^* + k_6^* = \sigma_{xy}^*(0)
\end{aligned} \]

\[ \sigma_{yy}(t) = e^{(a+d)t} \left\{ k_7^* e^{2\Delta t} + k_8^* e^{-2\Delta t} + k_9^* \right\} + \bar\sigma_{yy} \tag{71} \]
\[ \begin{aligned}
k_7^* &= \tfrac{c^2}{4\Delta^2} \sigma_{xx}^*(0) + \tfrac{c}{2\Delta}\left(1 + \tfrac{d-a}{2\Delta}\right) \sigma_{xy}^*(0) + \tfrac14 \left(1 + \tfrac{d-a}{2\Delta}\right)^2 \sigma_{yy}^*(0) \\
k_8^* &= \tfrac{c^2}{4\Delta^2} \sigma_{xx}^*(0) - \tfrac{c}{2\Delta}\left(1 - \tfrac{d-a}{2\Delta}\right) \sigma_{xy}^*(0) + \tfrac14 \left(1 - \tfrac{d-a}{2\Delta}\right)^2 \sigma_{yy}^*(0) \\
k_9^* &= \tfrac{c}{2\Delta^2} \left[ -c\,\sigma_{xx}^*(0) - (d-a)\,\sigma_{xy}^*(0) + b\,\sigma_{yy}^*(0) \right] \\
& k_7^* + k_8^* + k_9^* = \sigma_{yy}^*(0)
\end{aligned} \]

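Table 5. GD function σ(t), (53), (55), (62), for Δ² = 0, |C| ≠ 0:

\[ \sigma_{xx}(t) = e^{(a+d)t} \left\{ \left[1 - \tfrac12 (d-a)t\right]^2 \sigma_{xx}^*(0) + 2bt\left[1 - \tfrac12 (d-a)t\right] \sigma_{xy}^*(0) + b^2 t^2 \sigma_{yy}^*(0) \right\} + \bar\sigma_{xx} \tag{72} \]
\[ \sigma_{xy}(t) = e^{(a+d)t} \left\{ \left[ct - \tfrac12 c(d-a)t^2\right] \sigma_{xx}^*(0) + \left[1 - \tfrac12 (d-a)^2 t^2\right] \sigma_{xy}^*(0) + \left[bt + \tfrac12 b(d-a)t^2\right] \sigma_{yy}^*(0) \right\} + \bar\sigma_{xy} \tag{73} \]
\[ \sigma_{yy}(t) = e^{(a+d)t} \left\{ c^2 t^2 \sigma_{xx}^*(0) + 2ct\left[1 + \tfrac12 (d-a)t\right] \sigma_{xy}^*(0) + \left[1 + \tfrac12 (d-a)t\right]^2 \sigma_{yy}^*(0) \right\} + \bar\sigma_{yy} \tag{74} \]

Table 6. GD function σ(t), (53), (55), (62), for Δ² < 0, |C| ≠ 0:

\[ \sigma_{xx}(t) = e^{(a+d)t} \left\{ k_{10}^* \cos(2\Delta t) + k_{11}^* \sin(2\Delta t) + k_{12}^* \right\} + \bar\sigma_{xx} \tag{75} \]
\[ \begin{aligned}
k_{10}^* &= \left(1 - \tfrac{bc}{2\Delta^2}\right) \sigma_{xx}^*(0) - \tfrac{b(d-a)}{2\Delta^2} \sigma_{xy}^*(0) + \tfrac{b^2}{2\Delta^2} \sigma_{yy}^*(0) \\
k_{11}^* &= \tfrac{1}{\Delta} \left[ -\tfrac{d-a}{2} \sigma_{xx}^*(0) + b\,\sigma_{xy}^*(0) \right] \\
k_{12}^* &= \tfrac{b}{2\Delta^2} \left[ c\,\sigma_{xx}^*(0) + (d-a)\,\sigma_{xy}^*(0) - b\,\sigma_{yy}^*(0) \right] \\
& k_{10}^* + k_{12}^* = \sigma_{xx}^*(0)
\end{aligned} \]

\[ \sigma_{xy}(t) = e^{(a+d)t} \left\{ k_{13}^* \cos(2\Delta t) + k_{14}^* \sin(2\Delta t) + k_{15}^* \right\} + \bar\sigma_{xy} \tag{76} \]
\[ \begin{aligned}
k_{13}^* &= \tfrac{1}{\Delta^2} \left[ -\tfrac{c(d-a)}{4} \sigma_{xx}^*(0) + bc\,\sigma_{xy}^*(0) + \tfrac{b(d-a)}{4} \sigma_{yy}^*(0) \right] \\
k_{14}^* &= \tfrac{1}{2\Delta} \left[ c\,\sigma_{xx}^*(0) + b\,\sigma_{yy}^*(0) \right] \\
k_{15}^* &= \tfrac{d-a}{4\Delta^2} \left[ c\,\sigma_{xx}^*(0) + (d-a)\,\sigma_{xy}^*(0) - b\,\sigma_{yy}^*(0) \right] \\
& k_{13}^* + k_{15}^* = \sigma_{xy}^*(0)
\end{aligned} \]

\[ \sigma_{yy}(t) = e^{(a+d)t} \left\{ k_{16}^* \cos(2\Delta t) + k_{17}^* \sin(2\Delta t) + k_{18}^* \right\} + \bar\sigma_{yy} \tag{77} \]
\[ \begin{aligned}
k_{16}^* &= \tfrac{c^2}{2\Delta^2} \sigma_{xx}^*(0) + \tfrac{c(d-a)}{2\Delta^2} \sigma_{xy}^*(0) + \left(1 - \tfrac{bc}{2\Delta^2}\right) \sigma_{yy}^*(0) \\
k_{17}^* &= \tfrac{1}{\Delta} \left[ c\,\sigma_{xy}^*(0) + \tfrac{d-a}{2} \sigma_{yy}^*(0) \right] \\
k_{18}^* &= \tfrac{(-c)}{2\Delta^2} \left[ c\,\sigma_{xx}^*(0) + (d-a)\,\sigma_{xy}^*(0) - b\,\sigma_{yy}^*(0) \right] \\
& k_{16}^* + k_{18}^* = \sigma_{yy}^*(0)
\end{aligned} \]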

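Table 7. GWD function σ(t), (52), (56), (65), for Δ² > 0.

\[ \begin{aligned} \sigma_{xx}(t) = e^{(a+d+\beta^2)t} \Big\{ & k_1 e^{2\Delta t} + k_2 e^{-2\Delta t} + k_3 + \left(1 - e^{-\beta^2 t}\right) \times \Big[ \tfrac14 \left[ \left(1 - \tfrac{d-a}{2\Delta}\right) m_x(0) + \tfrac{b}{\Delta} m_y(0) \right]^2 e^{2\Delta t} \\ & + \tfrac14 \left[ \left(1 + \tfrac{d-a}{2\Delta}\right) m_x(0) - \tfrac{b}{\Delta} m_y(0) \right]^2 e^{-2\Delta t} + \tfrac{b}{2\Delta^2} \left[ c\,m_x^2(0) + (d-a)\,m_x(0) m_y(0) - b\,m_y^2(0) \right] \Big] \Big\}, \end{aligned} \tag{78} \]
where k_1, k_2 and k_3 are given in Table 4 with σ*_xx(0), σ*_xy(0), σ*_yy(0) replaced by σ_xx(0), σ_xy(0), σ_yy(0),

\[ \begin{aligned} \sigma_{xy}(t) = e^{(a+d+\beta^2)t} \Big\{ & k_4 e^{2\Delta t} + k_5 e^{-2\Delta t} + k_6 + \left(1 - e^{-\beta^2 t}\right) \times \Big[ \Big( \tfrac{c}{4\Delta}\left(1 - \tfrac{d-a}{2\Delta}\right) m_x^2(0) + \tfrac{bc}{2\Delta^2} m_x(0) m_y(0) + \tfrac{b}{4\Delta}\left(1 + \tfrac{d-a}{2\Delta}\right) m_y^2(0) \Big) e^{2\Delta t} \\ & + \Big( -\tfrac{c}{4\Delta}\left(1 + \tfrac{d-a}{2\Delta}\right) m_x^2(0) + \tfrac{bc}{2\Delta^2} m_x(0) m_y(0) - \tfrac{b}{4\Delta}\left(1 - \tfrac{d-a}{2\Delta}\right) m_y^2(0) \Big) e^{-2\Delta t} \\ & + \tfrac{d-a}{4\Delta^2} \left[ c\,m_x^2(0) + (d-a)\,m_x(0) m_y(0) - b\,m_y^2(0) \right] \Big] \Big\}, \end{aligned} \tag{79} \]
where k_4, k_5 and k_6 are given in Table 4 with σ*_xx(0), σ*_xy(0), σ*_yy(0) replaced by σ_xx(0), σ_xy(0), σ_yy(0),

\[ \begin{aligned} \sigma_{yy}(t) = e^{(a+d+\beta^2)t} \Big\{ & k_7 e^{2\Delta t} + k_8 e^{-2\Delta t} + k_9 + \left(1 - e^{-\beta^2 t}\right) \times \Big[ \tfrac14 \left[ \tfrac{c}{\Delta} m_x(0) + \left(1 + \tfrac{d-a}{2\Delta}\right) m_y(0) \right]^2 e^{2\Delta t} \\ & + \tfrac14 \left[ \tfrac{c}{\Delta} m_x(0) - \left(1 - \tfrac{d-a}{2\Delta}\right) m_y(0) \right]^2 e^{-2\Delta t} + \tfrac{c}{2\Delta^2} \left[ -c\,m_x^2(0) - (d-a)\,m_x(0) m_y(0) + b\,m_y^2(0) \right] \Big] \Big\}, \end{aligned} \tag{80} \]
where k_7, k_8 and k_9 are given in Table 4 with σ*_xx(0), σ*_xy(0), σ*_yy(0) replaced by σ_xx(0), σ_xy(0), σ_yy(0).

Table 8. GWD function σ(t), (52), (56), (65), for Δ² = 0.

\[ \begin{aligned} \sigma_{xx}(t) = e^{(a+d+\beta^2)t} \Big\{ & \left[1 - \tfrac12 (d-a)t\right]^2 \sigma_{xx}(0) + 2bt\left[1 - \tfrac12 (d-a)t\right] \sigma_{xy}(0) + b^2 t^2 \sigma_{yy}(0) \\ & + \left(1 - e^{-\beta^2 t}\right) \left[ \left(1 - \tfrac12 (d-a)t\right) m_x(0) + bt\,m_y(0) \right]^2 \Big\} \end{aligned} \tag{81} \]
\[ \begin{aligned} \sigma_{xy}(t) = e^{(a+d+\beta^2)t} \Big\{ & ct\left[1 - \tfrac12 (d-a)t\right] \sigma_{xx}(0) + \left[1 - \tfrac12 (d-a)^2 t^2\right] \sigma_{xy}(0) + bt\left[1 + \tfrac12 (d-a)t\right] \sigma_{yy}(0) \\ & + \left(1 - e^{-\beta^2 t}\right) \Big( ct\left[1 - \tfrac12 (d-a)t\right] m_x^2(0) + \left[1 - \tfrac12 (d-a)^2 t^2\right] m_x(0) m_y(0) + bt\left[1 + \tfrac12 (d-a)t\right] m_y^2(0) \Big) \Big\} \end{aligned} \tag{82} \]
\[ \begin{aligned} \sigma_{yy}(t) = e^{(a+d+\beta^2)t} \Big\{ & c^2 t^2 \sigma_{xx}(0) + 2ct\left[1 + \tfrac12 (d-a)t\right] \sigma_{xy}(0) + \left[1 + \tfrac12 (d-a)t\right]^2 \sigma_{yy}(0) \\ & + \left(1 - e^{-\beta^2 t}\right) \left[ ct\,m_x(0) + \left(1 + \tfrac12 (d-a)t\right) m_y(0) \right]^2 \Big\} \end{aligned} \tag{83} \]

Table 9. GWD function σ(t), (52), (56), (65), for Δ² < 0.

\[ \begin{aligned} \sigma_{xx}(t) = e^{(a+d+\beta^2)t} \Big\{ & k_{10} \cos(2\Delta t) + k_{11} \sin(2\Delta t) + k_{12} + \left(1 - e^{-\beta^2 t}\right) \times \Big[ \Big( \left(1 - \tfrac{bc}{2\Delta^2}\right) m_x^2(0) - \tfrac{b(d-a)}{2\Delta^2} m_x(0) m_y(0) + \tfrac{b^2}{2\Delta^2} m_y^2(0) \Big) \cos(2\Delta t) \\ & + \Big( -\tfrac{d-a}{2\Delta} m_x^2(0) + \tfrac{b}{\Delta} m_x(0) m_y(0) \Big) \sin(2\Delta t) + \tfrac{b}{2\Delta^2} \left[ c\,m_x^2(0) + (d-a)\,m_x(0) m_y(0) - b\,m_y^2(0) \right] \Big] \Big\}, \end{aligned} \tag{84} \]
where k_10, k_11 and k_12 are given in Table 6 with σ*_xx(0), σ*_xy(0), σ*_yy(0) replaced by σ_xx(0), σ_xy(0), σ_yy(0),

\[ \begin{aligned} \sigma_{xy}(t) = e^{(a+d+\beta^2)t} \Big\{ & k_{13} \cos(2\Delta t) + k_{14} \sin(2\Delta t) + k_{15} + \left(1 - e^{-\beta^2 t}\right) \times \Big[ \Big( -\tfrac{c(d-a)}{4\Delta^2} m_x^2(0) + \tfrac{bc}{\Delta^2} m_x(0) m_y(0) + \tfrac{b(d-a)}{4\Delta^2} m_y^2(0) \Big) \cos(2\Delta t) \\ & + \Big( \tfrac{c}{2\Delta} m_x^2(0) + \tfrac{b}{2\Delta} m_y^2(0) \Big) \sin(2\Delta t) + \tfrac{d-a}{4\Delta^2} \left[ c\,m_x^2(0) + (d-a)\,m_x(0) m_y(0) - b\,m_y^2(0) \right] \Big] \Big\}, \end{aligned} \tag{85} \]
where k_13, k_14 and k_15 are given in Table 6 with σ*_xx(0), σ*_xy(0), σ*_yy(0) replaced by σ_xx(0), σ_xy(0), σ_yy(0),

\[ \begin{aligned} \sigma_{yy}(t) = e^{(a+d+\beta^2)t} \Big\{ & k_{16} \cos(2\Delta t) + k_{17} \sin(2\Delta t) + k_{18} + \left(1 - e^{-\beta^2 t}\right) \times \Big[ \Big( \tfrac{c^2}{2\Delta^2} m_x^2(0) + \tfrac{c(d-a)}{2\Delta^2} m_x(0) m_y(0) + \left(1 - \tfrac{bc}{2\Delta^2}\right) m_y^2(0) \Big) \cos(2\Delta t) \\ & + \Big( \tfrac{c}{\Delta} m_x(0) m_y(0) + \tfrac{d-a}{2\Delta} m_y^2(0) \Big) \sin(2\Delta t) + \tfrac{(-c)}{2\Delta^2} \left[ c\,m_x^2(0) + (d-a)\,m_x(0) m_y(0) - b\,m_y^2(0) \right] \Big] \Big\}, \end{aligned} \tag{86} \]
where k_16, k_17 and k_18 are given in Table 6 with σ*_xx(0), σ*_xy(0), σ*_yy(0) replaced by σ_xx(0), σ_xy(0), σ_yy(0).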
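The closed-form GWD covariance (68), with the particular component (65), can be cross-checked against a direct numerical integration of the coupled moment equations (19) and (22)-(23). The following is a sketch under stated assumptions: the function names, the Runge-Kutta reference integrator and the parameter values are all chosen here for illustration, and e^{tC} = e^{(a+d)t} M(t) is evaluated by a truncated Taylor series rather than from the tables.

```python
import math

def matvec(m, v):
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def expm_series(C, t, terms=60):
    """Truncated Taylor series of e^{tC}."""
    n = len(C)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[sum(term[i][m] * C[m][j] for m in range(n)) * t / k
                 for j in range(n)] for i in range(n)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def gwd_closed_form(A, beta, m0, sigma0, t):
    """sigma(t) from (68), using the particular component (65)."""
    (a, b), (c, d) = A
    C = [[2 * a, 2 * b, 0.0], [c, a + d, b], [0.0, 2 * c, 2 * d]]
    mu0 = [m0[0] ** 2, m0[0] * m0[1], m0[1] ** 2]
    g = 1.0 - math.exp(-beta ** 2 * t)                     # factor in (65)
    inner = [s + g * m for s, m in zip(sigma0, mu0)]
    scale = math.exp(beta ** 2 * t)                        # e^{tC~} = e^{beta^2 t} e^{tC}
    return [scale * x for x in matvec(expm_series(C, t), inner)]

def gwd_rk4(A, beta, m0, sigma0, t, steps=4000):
    """Reference: RK4 integration of the mean equations (19) and covariance equations (23)."""
    (a, b), (c, d) = A
    b2 = beta ** 2
    def f(z):
        mx, my, sxx, sxy, syy = z
        return [a * mx + b * my,
                c * mx + d * my,
                (2 * a + b2) * sxx + 2 * b * sxy + b2 * mx * mx,
                c * sxx + (a + d + b2) * sxy + b * syy + b2 * mx * my,
                2 * c * sxy + (2 * d + b2) * syy + b2 * my * my]
    h, z = t / steps, list(m0) + list(sigma0)
    for _ in range(steps):
        k1 = f(z)
        k2 = f([zi + 0.5 * h * ki for zi, ki in zip(z, k1)])
        k3 = f([zi + 0.5 * h * ki for zi, ki in zip(z, k2)])
        k4 = f([zi + h * ki for zi, ki in zip(z, k3)])
        z = [zi + h / 6 * (p + 2 * q + 2 * r + w)
             for zi, p, q, r, w in zip(z, k1, k2, k3, k4)]
    return z[2:]

A = [[-0.5, 0.4], [0.2, -0.7]]                             # illustrative values
closed = gwd_closed_form(A, 0.3, (1.0, 2.0), [0.1, 0.02, 0.2], 1.0)
numeric = gwd_rk4(A, 0.3, (1.0, 2.0), [0.1, 0.02, 0.2], 1.0)
print(closed)
print(numeric)
```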

Remark 6. It should be noted that the moment formulas (69)-(77) for the GD model with |C| ≠ 0 have, independent of the sign of Δ², constant terms $\bar\sigma_{xx}$, $\bar\sigma_{xy}$, $\bar\sigma_{yy}$, (62), depending on both the drift (A) and diffusion (B) parameters. The terms k*_i in Tables 4-6 are independent of the diffusion parameters (B). For the moments (78)-(86) of the GWD model, however, the diffusion parameter β is always involved in the long-run covariance vector solutions, except for the trivial long-run stationary solution "0", see (89)-(90). ∇

Remark 7. In both models, the initial value σ(0) may be thought of as an uncertainty in the measurement of the initial state vector, X(0). Then the explicit formulae allow a discussion of whether this initial uncertainty σ(0) or the diffusion coefficient B plays the dominating role for σ(t), and hence for the future deviations of X(t) from the mean value m(t). Symbolic versions of the formulas for σ(t), Σ(t), are for the GD and GWD models, in the case of σ(0) = 0, found in Theorems 3 and 4 below. ∇

2.5.2 Asymptotics of the covariance vector function

From (69)-(86), the next result follows immediately.

Corollary 2. The asymptotic or long-run behavior, as t → ∞, for the covariance vector function σ(t) of the nonsingular GD model with |C| ≠ 0 is given by [cf. Lemma 3]
\[ \Delta^2 > 0: \quad \begin{cases} \lambda_1 < 0, \ \lambda_2 < 0: & \lim_{t\to\infty} \sigma(t) = \bar\sigma \\[4pt] \lambda_1 > 0: & \sigma(t) \sim e^{(a+d+2\Delta)t} \begin{bmatrix} k_1^* \\ k_4^* \\ k_7^* \end{bmatrix} \end{cases} \tag{87} \]
\[ \Delta^2 \le 0: \quad \operatorname{tr} A < 0: \quad \lim_{t\to\infty} \sigma(t) = \bar\sigma \tag{88} \]
where $\bar\sigma$ is given in (62) and k*_1, k*_4, k*_7 are given in (69)-(71).

The asymptotic behavior, as t → ∞, for the covariance vector function σ(t) of the GWD model is given by
\[ \Delta^2 > 0: \quad \begin{cases} \lambda_1 < 0, \ \lambda_2 < 0: & \lim_{t\to\infty} \sigma(t) = 0 \\[4pt] \lambda_1 > 0: & \sigma(t) \sim e^{(a+d+2\Delta+\beta^2)t} \begin{bmatrix} k_1 + \kappa_1 \\ k_4 + \kappa_4 \\ k_7 + \kappa_7 \end{bmatrix} \end{cases} \tag{89} \]
\[ \Delta^2 \le 0: \quad a + d + \beta^2 < 0: \quad \lim_{t\to\infty} \sigma(t) \equiv 0 \tag{90} \]
where k_1, k_4, k_7 are given in (78)-(80), and
\[ \begin{aligned}
\kappa_1 &= \tfrac14 \left[ \left(1 - \tfrac{d-a}{2\Delta}\right) m_x(0) + \tfrac{b}{\Delta} m_y(0) \right]^2 \\
\kappa_4 &= \tfrac{c}{4\Delta}\left(1 - \tfrac{d-a}{2\Delta}\right) m_x^2(0) + \tfrac{bc}{2\Delta^2} m_x(0) m_y(0) + \tfrac{b}{4\Delta}\left(1 + \tfrac{d-a}{2\Delta}\right) m_y^2(0) \\
\kappa_7 &= \tfrac14 \left[ \tfrac{c}{\Delta} m_x(0) + \left(1 + \tfrac{d-a}{2\Delta}\right) m_y(0) \right]^2
\end{aligned} \tag{91} \]

The correlation coefficient of the GD model with |C| ≠ 0 has the following asymptotics, in the notation of (62) and (69)-(77):
\[ \begin{aligned}
\Delta^2 > 0: \quad & \begin{cases} \lambda_1 < 0, \ \lambda_2 < 0: & \lim_{t\to\infty} \rho(t)^2 = \dfrac{\bar\sigma_{xy}^2}{\bar\sigma_{xx}\bar\sigma_{yy}} \\[8pt] \lambda_1 > 0: & \lim_{t\to\infty} \rho(t)^2 = \dfrac{[k_4^*]^2}{k_1^* k_7^*} = 1 \end{cases} \\[6pt]
\Delta^2 = 0: \quad & \begin{cases} \operatorname{tr} A < 0: & \lim_{t\to\infty} \rho(t)^2 = \dfrac{\bar\sigma_{xy}^2}{\bar\sigma_{xx}\bar\sigma_{yy}} \\[8pt] \operatorname{tr} A > 0: & \lim_{t\to\infty} \rho(t)^2 = 1 \end{cases} \\[6pt]
\Delta^2 < 0: \quad & \operatorname{tr} A < 0: \quad \lim_{t\to\infty} \rho(t)^2 = \dfrac{\bar\sigma_{xy}^2}{\bar\sigma_{xx}\bar\sigma_{yy}}
\end{aligned} \tag{92} \]
For the GWD model [cf. (78)-(86) and (89)]
\[ \Delta^2 > 0: \quad \lim_{t\to\infty} \rho(t)^2 = \frac{[k_4 + \kappa_4]^2}{[k_1 + \kappa_1][k_7 + \kappa_7]}, \qquad \Delta^2 = 0: \quad \lim_{t\to\infty} \rho(t)^2 = \text{constant}. \tag{93} \]

Proof: By the definition, $\rho^2(t) = \dfrac{\sigma_{xy}^2(t)}{\sigma_{xx}(t)\,\sigma_{yy}(t)}$, and by the asymptotics for the covariance function in the first part above, the asymptotics for the correlation function in the second part are obtained. □
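The stationary limit $\bar\sigma$ appearing in (87)-(88) can be checked against its defining property: a constant particular solution of (20)-(21) must satisfy Cσ̄ + δ = 0. The sketch below evaluates formula (62) and the residual of that linear system; the function names and parameter values are illustrative assumptions.

```python
def sigma_bar(a, b, c, d, B):
    """Constant particular solution (62) of the GD covariance equations (nonsingular case)."""
    (b11, b12), (b21, b22) = B
    det_A, tr_A = a * d - b * c, a + d
    k = 1.0 / (2.0 * det_A * tr_A)
    sxx = k * (-det_A * (b11 ** 2 + b12 ** 2)
               - (d * b11 - b * b21) ** 2 - (b * b22 - d * b12) ** 2)
    sxy = k * (c * d * b11 ** 2 - 2 * a * d * b11 * b21 + a * b * b21 ** 2
               + c * d * b12 ** 2 - 2 * a * d * b12 * b22 + a * b * b22 ** 2)
    syy = k * (-det_A * (b21 ** 2 + b22 ** 2)
               - (c * b11 - a * b21) ** 2 - (c * b12 - a * b22) ** 2)
    return [sxx, sxy, syy]

def residual(a, b, c, d, B, s):
    """Left-hand side of C s + delta, cf. (21); it vanishes for a stationary s."""
    (b11, b12), (b21, b22) = B
    delta = [b11 ** 2 + b12 ** 2, b11 * b21 + b12 * b22, b21 ** 2 + b22 ** 2]
    return [2 * a * s[0] + 2 * b * s[1] + delta[0],
            c * s[0] + (a + d) * s[1] + b * s[2] + delta[1],
            2 * c * s[1] + 2 * d * s[2] + delta[2]]

a, b, c, d = -1.0, 0.5, -0.3, -0.8           # illustrative stable drift (tr A < 0, |A| > 0)
B = [[0.4, 0.1], [0.2, 0.3]]
s = sigma_bar(a, b, c, d, B)
res = residual(a, b, c, d, B, s)
rho2 = s[1] ** 2 / (s[0] * s[2])             # long-run squared correlation, cf. (92)
print(s, res, rho2)
```

For a stable drift matrix and a nonsingular B the resulting variances are positive and the squared correlation lies strictly between 0 and 1.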

Remark 8. In Corollary 2, we have only given the asymptotics for the correlation coefficients of the GD model with |C| ≠ 0. It is interesting to note that for the GD model with the singular Itô coefficient matrix, |C| = 0, e.g., in Table 2 with Δ² < 0, tr A = 0 and (63), the asymptotics [dominated by the particular solutions (63)] of the correlation coefficient give a constant: $\lim_{t\to\infty} \rho(t)^2 = -\dfrac{a^2}{bc} > 0$; see the related ellipses and circles of the mean functions in Fig. 1. ∇

Stationarity. A nonsingular diffusion process is said to be stationary, or to have a stationary version, if it admits a time-invariant probability distribution. It is well known that a necessary and sufficient condition for the GD processes (13), with |Σ(t)| ≠ 0, to be stationary, cf. (87)-(88), is that all eigenvalues of the drift coefficient matrix A, cf. (31)-(33), or equivalently of the Itô coefficient matrix C, cf. (39)-(41), have negative real parts. Similarly, we see from (89) and (90) that the necessary and sufficient condition for the GWD processes (15) to be stationary is that all eigenvalues of the Itô coefficient matrix C̃, cf. (42), have negative real parts; the stationary GWD process is trivial, cf. Remark 6.

Moreover, we note that the asymptotically stable deterministic system remains stochastically asymptotically stable upon addition of arbitrarily strong disturbances ("arithmetic noise"), e.g. in the GD model. Such a stability property, determined entirely by the drift coefficient matrix A, disappears with higher (3-dimensional) diffusion processes. Furthermore, we note that, with the addition of geometric noise, e.g. in the GWD model, this stability property cannot be determined entirely by the drift matrix A. The GWD model can only be stable when λ_1 < −β². Hence the stability region for the GWD model is smaller than for the GD model, see Fig. 1.

2.5.3 Singularity of the covariance matrix

A diffusion process is said to be nonsingular if its covariance matrix Σ(t) is nonsingular for all t > 0, or singular if Σ(t) is singular for all t > 0. For the GD model, (13)-(14), Σ(t), n = 2, is nonsingular if and only if
\[ \operatorname{rank}\,(B \;\; AB \;\; A^2 B \;\cdots\; A^{n-1} B) = n. \tag{94} \]
This rank condition is equivalent to hypoellipticity of the forward Kolmogorov equation, cf. Theorem 3 below. We state:

Lemma 4. Let X(t) be described by the GD model, (13)-(14). When |B| ≠ 0, X(t) is always nonsingular. When |B| = 0, X(t) is nonsingular if and only if the columns of B in (11) are not proportional to any eigenvector of A. The eigenvectors are proportional to (1, α_1), (1, α_2) and (1, α), where α_1, α_2 and α are given in (34)-(35), i.e.,
\[ \frac{\beta_{21}}{\beta_{11}} \ \text{or} \ \frac{\beta_{22}}{\beta_{12}} \neq \frac{d-a}{2b} \pm \frac{\Delta}{b} = \alpha_{1,2}. \tag{95} \]

Proof: When |B| ≠ 0, (94) clearly holds. But if |B| = 0, i.e. β_11 β_22 − β_12 β_21 = 0, then rank (B) ≠ 2. Consider
\[ (B \;\; AB) = \begin{bmatrix} \beta_{11} & \beta_{12} & a\beta_{11} + b\beta_{21} & a\beta_{12} + b\beta_{22} \\ \beta_{21} & \beta_{22} & c\beta_{11} + d\beta_{21} & c\beta_{12} + d\beta_{22} \end{bmatrix}. \tag{96} \]
Calculating all the determinants of the 2 × 2 submatrices, we get the same criteria for the singularity of the submatrices, namely that
\[ \frac{\beta_{11}}{\beta_{21}} \cdot \frac{\beta_{22}}{\beta_{12}} = 1 \quad \text{and} \quad \frac{\beta_{22}}{\beta_{12}} = \frac{d-a}{2b} \pm \frac{\Delta}{b} = \alpha_i, \quad i = 1, 2. \tag{97} \]
Given |B| = 0, the first equality is of course satisfied. Hence, the only way to avoid singularity is that the second equality is not satisfied. Evidently, β_22/β_12 is the slope of the column vector of the diffusion matrix B, (11). Thus, the necessary and sufficient condition for X(t) of the GD model to be nonsingular is that none of the column vectors of B coincide with any eigenvector of A, (34), (35). See also Fig. 1. □

2.6 Probability density functions

2.6.1 Kolmogorov's forward equation

In this section, we take an alternative approach to the treatment of the problems with the diffusion models in (13)-(14) and (15)-(16). For simplicity, the initial conditions are taken as deterministic, i.e., x_0(ω) is independent of ω ∈ Ω. Our models are written compactly as
\[ dX = a(X)\,dt + B(X)\,dw, \qquad X(0) = x_0. \tag{98} \]
The transition probability distribution, P(s, y, t, A), is given by
\[ P(s, y, t, A) = P\left( X(t) \in A \mid X(s) = y \right) \tag{99} \]
for every t > s > 0, y ∈ R^n and each Borel set A ⊂ R^n. Moreover, provided that P(s, y, t, ·), a probability measure on R^n, is known to have a transition probability density function p(s, y, t, x), then p(0, x_0, t, x) solves the following system of equations (written with u(t, x) as the unknown), cf. (3):
\[ \frac{\partial u}{\partial t} = \frac12 \sum_{j,k=1}^n \frac{\partial^2 \left( b_{j,k}(x)\,u \right)}{\partial x_j \partial x_k} - \sum_{j=1}^n \frac{\partial \left( a_j(x)\,u \right)}{\partial x_j} \quad \text{for } t > 0, \tag{100} \]
\[ 1 = \int_{\mathbb{R}^n} u(t, x)\,dx \quad \text{for each } t > 0, \tag{101} \]
\[ \delta_{x_0}(x) = \lim_{t \to 0^+} u(t, x); \tag{102} \]

where $b_{j,k}(x)$ is shorthand for the $jk$-th entry of $B(x)B(x)^T$, and $\delta_z$ denotes the point mass (Dirac delta function) at $z$ in $\mathbb{R}^n$. See Remark 10 below for assumptions and references.

The probability density $p(s, y, t, x)$ is a fundamental solution to the partial differential equation (PDE) in (100), which is known as Kolmogorov's forward equation or the Fokker–Planck equation. See also Appendix D.

Our results in the previous sections 2.4 and 2.5 may be found by solving (100)–(102) for the GD and GWD models. In addition, by doing so (or by the mere attempt to do so), further information may be derived; ultimately one finds the probability distribution of the stochastic process $X(t)$ at each fixed $t > 0$, and not just the first-order and second-order moments of the diffusion $X(t)$.

The GD model

For the Gaussian diffusion model, the approach to solving (100)–(102) is well known. A complete analysis was given in Hörmander (1967). Here we only need to give a brief account.

The main tool is the Fourier transformation, $\mathcal{F}$. Recall that $\mathcal{S}(\mathbb{R}^n)$ denotes the Fréchet space of rapidly decreasing $C^\infty$ functions, that is, smooth $\varphi(x)$ that for all multi-indices $\alpha$ and $\beta$ in $\mathbb{N}_0^n$ satisfy

$$\sup \bigl\{\, |x^\alpha \partial^\beta \varphi(x)| \;:\; x \in \mathbb{R}^n \,\bigr\} < \infty; \tag{103}$$

where $x^\alpha = x_1^{\alpha_1} \cdots x_n^{\alpha_n}$ and $\partial^\beta = \partial_x^\beta = (\partial/\partial x_1)^{\beta_1} \cdots (\partial/\partial x_n)^{\beta_n}$. Then $\mathcal{F}\varphi$ defined by

$$\mathcal{F}\varphi(\xi) = \int_{\mathbb{R}^n} e^{-i\xi \cdot x} \varphi(x)\, dx \tag{104}$$

is a linear continuous bijection $\mathcal{F} : \mathcal{S}(\mathbb{R}^n) \to \mathcal{S}(\mathbb{R}^n)$, the inverse of which is given as $\mathcal{F}^{-1}\psi(x) = (2\pi)^{-n} \mathcal{F}\psi(-x)$. Moreover, the formulae

$$\mathcal{F}(\partial_x^\alpha \varphi) = i^{|\alpha|} \xi^\alpha \cdot \mathcal{F}\varphi, \qquad \mathcal{F}(x^\alpha \varphi) = i^{|\alpha|} \partial_\xi^\alpha \mathcal{F}\varphi, \tag{105}$$

are valid for all multi-indices $\alpha \in \mathbb{N}_0^n$ and all $\varphi \in \mathcal{S}(\mathbb{R}^n)$.

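Furthermore, $\mathcal{F}$ extends to a linear continuous bijection $\mathcal{S}'(\mathbb{R}^n) \to \mathcal{S}'(\mathbb{R}^n)$, where $\mathcal{S}'(\mathbb{R}^n)$ denotes the dual space of tempered distributions. More precisely, any element $v$ of $\mathcal{S}'(\mathbb{R}^n)$ is a continuous linear map $\mathcal{S}(\mathbb{R}^n) \to \mathbb{C}$, and, with the value at a given $\varphi \in \mathcal{S}(\mathbb{R}^n)$ denoted by $\langle v, \varphi \rangle$, $\mathcal{F}v$ is the element of $\mathcal{S}'(\mathbb{R}^n)$ for which

$$\langle \mathcal{F}v, \varphi \rangle = \langle v, \mathcal{F}\varphi \rangle \quad \text{for every } \varphi \in \mathcal{S}(\mathbb{R}^n). \tag{106}$$

The space $\mathcal{S}'(\mathbb{R}^n)$ contains $L^p(\mathbb{R}^n)$ for every $p \in [1, \infty]$ and $M(\mathbb{R}^n)$, the Radon measures of finite total variation. In fact, the inclusion $M(\mathbb{R}^n) \subset \mathcal{S}'(\mathbb{R}^n)$ is given by

$$\langle \mu, \varphi \rangle = \int_{\mathbb{R}^n} \varphi\, d\mu.$$

Moreover, $\mathcal{F}\mu(\xi) = \int e^{-i\xi \cdot x}\, d\mu(x)$, since by (106),

$$\langle \mathcal{F}\mu, \varphi \rangle = \int_{\mathbb{R}^n} \mathcal{F}\varphi(x)\, d\mu(x) = \Bigl\langle \int e^{-i\xi \cdot x}\, d\mu(x),\, \varphi \Bigr\rangle. \tag{107}$$

In particular, if $\mu$ on $\mathbb{R}^n$ is given as the probability distribution of a stochastic variable $X$, this means that $\mathcal{F}\mu(-\xi) = E(e^{i\xi \cdot X})$, i.e., the characteristic function of $X$. As an example, $\mathcal{F}\delta_z = e^{-i\xi \cdot z}$.

Recall also that $\partial^\alpha u$ is a well-defined element of $\mathcal{S}'(\mathbb{R}^n)$ for each $u$ therein and each multi-index $\alpha$, namely

$$\langle \partial^\alpha u, \varphi \rangle = \langle u, (-1)^{|\alpha|} \partial^\alpha \varphi \rangle.$$

Using this definition, the identities in (105) carry over to all $u \in \mathcal{S}'(\mathbb{R}^n)$. Further details about these standard techniques are found in Rudin (1973).

Concerning the solution $u(t, x)$ in (100), one can now for each $t \ge 0$ compute the Fourier transform in the $x$-variable and denote this by $\mathcal{F}u(t, \xi)$ or by $\hat{u}(t, \xi)$. Given formula (105), and because the diffusion coefficients $b_{j,k}(x)$ in the GD model are independent of $x$, the transformed system

$$\frac{\partial \hat{u}}{\partial t} = -\frac{1}{2} \sum_{j,k=1}^{n} b_{j,k}\, \xi_j \xi_k\, \hat{u} + \sum_{j,k=1}^{n} a_{j,k}\, \xi_j\, \frac{\partial \hat{u}}{\partial \xi_k}, \tag{108}$$

$$1 = \hat{u}(t, 0) \quad \text{for each } t > 0, \tag{109}$$

$$e^{-i\xi \cdot x_0} = \lim_{t \to 0^+} \hat{u}(t, \xi) \tag{110}$$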

is equivalent to (100)–(102). As noted in Hörmander (1967), when the drift coefficient $A = (a_{j,k})$, as in the GD model, is a constant matrix, then the Fourier transformed system (108)–(110) is solved uniquely by

$$\hat{u}(t, \xi) = \exp\Bigl( -i x_0 \cdot e^{tA^T} \xi - \frac{1}{2} \int_0^t (e^{sA^T} \xi)^T B B^T (e^{sA^T} \xi)\, ds \Bigr) \tag{111}$$

for $t > 0$. By comparison with the characteristic function for Gaussian random variables, one gets the well-known result (112)–(113):

Theorem 3. The GD model, (13)–(14), has a unique solution $X(t)$, which, for each $t > 0$, has a Gaussian distribution with the mean vector and covariance matrix [with initial value $\Sigma(0) = 0$] formally given by:

$$m(t) = e^{tA} x_0 \tag{112}$$

$$\Sigma(t) = \int_0^t e^{sA} B B^T e^{sA^T}\, ds \tag{113}$$

In particular, $m(t)$ is given explicitly in (45)–(47) of Theorem 1, and $\Sigma(t)$ is given explicitly by the expression (67) of Theorem 2 (with $\sigma(0) = 0$), and it is positive definite, i.e., $\Sigma(t) > 0$, if and only if Kolmogorov's equation (100) is hypoelliptic with the values of $a_{j,k}$ and $b_{j,k}$ used in the GD model; i.e., if and only if the rank condition (94) is fulfilled.

Proof: See Theorem 8.2.10 in Arnold (1973) for the Gaussian distribution and the expressions for the mean and covariance (even between different times $s \ne t$). The equivalence with the hypoellipticity was observed in Hörmander (1967). □
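As a numerical illustration of (112)–(113), the following sketch evaluates $m(t)$ and $\Sigma(t)$ by brute force. It is not the authors' method: the helper names (`expm`, `mean`, `covariance`, `integrand`), the truncated power series for $e^{tA}$ and the Simpson rule are assumptions chosen only to keep the example self-contained. Differentiating (113) gives the standard identity $\dot\Sigma = A\Sigma + \Sigma A^T + BB^T$, which can serve as a consistency check.

```python
import math


def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def transpose(X):
    return [list(row) for row in zip(*X)]


def axpy(X, Y, alpha=1.0):
    """Return X + alpha * Y entrywise."""
    return [[x + alpha * y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def expm(A, t, terms=60):
    """exp(tA) by a truncated power series (adequate for moderate |tA|)."""
    n = len(A)
    term = [[float(i == j) for j in range(n)] for i in range(n)]
    total = [row[:] for row in term]
    for k in range(1, terms):
        term = [[t / k * v for v in row] for row in matmul(term, A)]
        total = axpy(total, term)
    return total


def integrand(A, B, s):
    """The matrix e^{sA} B B^T e^{sA^T} under the integral in (113)."""
    E = expm(A, s)
    return matmul(matmul(E, matmul(B, transpose(B))), transpose(E))


def mean(A, x0, t):
    """m(t) = e^{tA} x0, formula (112)."""
    E = expm(A, t)
    return [sum(E[i][j] * x0[j] for j in range(len(x0))) for i in range(len(x0))]


def covariance(A, B, t, panels=200):
    """Sigma(t) from (113) by composite Simpson quadrature (panels even)."""
    n = len(A)
    h = t / panels
    total = [[0.0] * n for _ in range(n)]
    for i in range(panels + 1):
        weight = 1.0 if i in (0, panels) else (4.0 if i % 2 else 2.0)
        total = axpy(total, integrand(A, B, i * h), weight)
    return [[h / 3.0 * v for v in row] for row in total]
```

In the one-dimensional case $A = (a)$, $B = (b)$ the quadrature can be compared with the elementary closed form $\Sigma(t) = b^2 (e^{2at} - 1)/(2a)$.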

Observe that since the rank condition (94) involves only $A$ and $B$, and not $t$, one has $\Sigma(t) > 0$ for every $t > 0$ if and only if $\Sigma(t) > 0$ holds at a single time $t$.

The formula for $\hat{u}(t, \xi)$ in (111) has another merit: Suppose that $A$ and $B$ in (98) are such that $\Sigma(t)$ in (113) is only positive semidefinite. Then there is a subspace $N \subset \mathbb{R}^n$ such that the integral in (111) vanishes for every $\xi \in N$; consequently $|\hat{u}(t, \xi)| = 1$ for all $\xi \in N$. Therefore the transition probability $P(0, x_0, t, A)$ in (99) cannot have a density function $p$, because on the one hand, such a density would necessarily belong to $L^1$ with respect to the Lebesgue measure on $\mathbb{R}^n$, so $\hat{p}(0, x_0, t, \xi)$ would go to 0 for $|\xi| \to \infty$, and on the other hand, $p$ would solve (100)–(102) and therefore have $|\hat{p}(0, x_0, t, \xi)| = 1$ for $\xi \in N$, as we have just seen. This contradiction immediately leads to the following result:

Proposition 5. When $x_0$ is independent of $\omega$ in $\Omega$, the properties:
(i) the GD model has a transition probability density $p(0, x_0, t, x)$,
(ii) the covariance function $\Sigma(t)$ is invertible for every $t > 0$,
(iii) the covariance function $\Sigma(t)$ is invertible for some $t > 0$,
are equivalent.

Remark 9. The formula in (113) is useful as a point of departure for an alternative determination of $\Sigma(t)$. In fact, in two dimensions where $e^{tA}$ can be written down explicitly, we have made a computer-aided calculation using Maple, and the result coincides with the formulae of Theorem 2, when the initial value $\Sigma(0)$ equals zero. Specifically, this means that the GD density function $u(t, x) = p(0, x_0, t, x)$ becomes

$$p(0, x_0, t, x) = \frac{1}{\sqrt{(2\pi)^n |\Sigma(t)|}}\, e^{-\frac{1}{2} (x - m(t))^T \Sigma(t)^{-1} (x - m(t))} \tag{114}$$

with $m(t)$, $\Sigma(t)$ as in Theorems 1–2, provided $\Sigma(t) > 0$. ∇

Remark 10. That the GD density function (114) solves (100)–(102) is known under the assumption that, first of all, $p(s, y, t, x)$ exists; secondly, the necessary derivatives

$$\frac{\partial p}{\partial t}, \qquad a_j(x) \frac{\partial p}{\partial x_j}, \qquad b_{j,k}(x) \frac{\partial^2 p}{\partial x_j \partial x_k}, \qquad j, k = 1, \dots, n \tag{115}$$

exist as continuous functions; and, thirdly, the transition function of a diffusion process, $P(s, y, t, A)$, should have $a(x)$ and $(b_{j,k}(x))_{j,k=1,\dots,n}$ as the drift and diffusion coefficients with $y$-uniform convergence, cf. the definition of diffusion processes in Gikman and Skorokhod (1969, p. 375). As observed in their Remark 1, the existence of the derivatives in (115) is unnecessary because these always exist in the distribution sense in $\mathcal{S}'(\mathbb{R}^n)$.

However, the natural questions are not settled hereby: once a function solving (100)–(102) is obtained, it still has to be verified that it coincides with the density, $p(0, x_0, t, x)$, even when the latter is known to exist [in the GD case, this is clear, however, since the adopted method shows the uniqueness of the solution]. Moreover, since $P(0, x_0, t, \cdot)$ is a probability measure for each $t > 0$, it is in $\mathcal{S}'(\mathbb{R}^n)$, cf. (106)–(107), so it is reasonable to ask whether it solves (100) in the distribution sense. Furthermore, it is a question whether any solution $u(t, x)$ of (100)–(102) coincides with $P(0, x_0, t, x)$ in general. ∇

Geometric Wiener Diffusion

The GWD model (15)–(16) will not have probability densities.

Proposition 6. For the GWD model, (15)–(16), the solution of Kolmogorov's forward equation with side conditions, (100)–(102), is supported by a half-line in $\mathbb{R}^n$. Hence a probability density function, $p(0, x_0, t, x)$, does not exist.

Proof: The forward Kolmogorov equation (100) is here

$$\frac{\partial u}{\partial t} = \frac{1}{2} \sum_{j,k=1}^{n} \beta^2 x_j x_k\, \frac{\partial^2 u}{\partial x_j \partial x_k} - \sum_{j,k=1}^{n} a_{j,k}\, x_k\, \frac{\partial u}{\partial x_j}, \tag{116}$$

since $b_{j,k}(x) = \beta^2 x_j x_k$ for $j, k = 1, \dots, n$. The solution to (116), satisfying (101)–(102), is the probability measure $u(t, dx)$ on $\mathbb{R}^n$, defined for each $t > 0$ by the fulfillment of

$$\int_{\mathbb{R}^n} \varphi(x)\, u(t, dx) = \int_{-\infty}^{\infty} \varphi\bigl(e^{t(A - \frac{\beta^2}{2} I) + r\beta I} x_0\bigr)\, \frac{1}{\sqrt{2\pi t}}\, e^{-\frac{r^2}{2t}}\, dr \tag{117}$$

for every continuous bounded function $\varphi(x)$ on $\mathbb{R}^n$; cf. Remark 11. Clearly, $u(t, dx) = 0$ outside of the set

$$\bigl\{\, x \in \mathbb{R}^n \;\bigm|\; \exists r \in \mathbb{R} : x = e^{t(A - \frac{1}{2}\beta^2 I) + r\beta I} x_0 \,\bigr\}, \tag{118}$$

which is a ($t$-dependent) half-line, since

$$e^{t(A - \frac{1}{2}\beta^2 I) + r\beta I} x_0 = e^{r\beta}\, e^{t(A - \frac{1}{2}\beta^2 I)} x_0. \tag{119}$$

As $u(t, dx)$ is supported by the Lebesgue null-set (118), the transition probability, $P(0, x_0, t, dx)$, has no probability density function. □

Remark 11. That the $u(t, dx)$ stated in (117) actually is a fundamental solution of Kolmogorov's equation associated with the GWD model, (15)–(16), will not be proved here; the verification is quite lengthy and requires full use of the various techniques for partial differential equations. In fact, our "proof" here is based on determination of the solution to a corresponding GD model (for the 'log' of $X$) followed by a transformation as a distribution density; cf. Hörmander (1985, Ch. 6) (a technical report may be obtained from the authors). Moreover, in view of Remark 10, we are not able to definitely conclude that the solution (117) equals the transition probability $P(0, x_0, t, dx)$.

Theorem 4. Assuming that the probability measure $u(t, dx)$ in (117) is the probability distribution $P(0, x_0, t, dx)$ of $X(t)$ described by the GWD model, (15)–(16), then inserting $\varphi(x) = x$ into (117), we get

$$E[X(t)] = \int_{\mathbb{R}^n} x\, u(t, dx) = \int_{-\infty}^{\infty} e^{t(A - \frac{1}{2}\beta^2 I) + r\beta I} \cdot x_0\, \frac{1}{\sqrt{2\pi t}}\, e^{-\frac{r^2}{2t}}\, dr = e^{tA} \cdot x_0 \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi t}}\, e^{-\frac{(r - t\beta)^2}{2t}}\, dr = e^{tA} \cdot x_0 \tag{120}$$

and by insertion of $\varphi(x) = x \cdot x^T$ in (117), we get [with $\Sigma(0) = 0$]

$$E[X(t)X(t)^T] = \int_{\mathbb{R}^n} x \cdot x^T\, u(t, dx) = e^{\beta^2 t} (e^{tA} x_0)(e^{tA} x_0)^T \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi t}}\, e^{-\frac{(r - 2t\beta)^2}{2t}}\, dr = e^{\beta^2 t} (e^{tA} x_0)(e^{tA} x_0)^T. \tag{121}$$

Hence by (120) and (121), we obtain the formal expression

$$\Sigma(t) = E(XX^T) - E(X)E(X)^T = (e^{\beta^2 t} - 1)(e^{tA} x_0)(e^{tA} x_0)^T. \tag{122}$$

Expanding (122) by the explicit formulae (45)–(47) of Theorem 1 for the mean values, $m(t) = e^{tA} x_0$, we find precisely the expression (68) for $\sigma(t)$ in Theorem 2 [with $\sigma(0) = 0$], as shown in (78)–(86). Accordingly, under the assumption that $u(t, dx)$ in (117) is the probability distribution, we have not only the convenient expression in (122), but also independent evidence that parts of (78)–(86) are correct. Formula (122) also shows that $\Sigma(t)$ is singular for every $t > 0$, when $\Sigma(0) = 0$, since $y = e^{tA} x_0$ gives $y \cdot y^T = (y_j y_k)$, which has rank 1.

2.6.2 Formulae from stochastic integration

The conclusion in Proposition 6 above that a density function does not exist can also be reached in the following way. The stochastic differential equation

$$dX = AX\, dt + BX\, dw \tag{123}$$

with $X(0) = x_0$, is known from the stochastic integrals to have the solution

$$X(t) = \exp\Bigl( t\bigl(A - \tfrac{1}{2} B^2\bigr) + (w(t) - w(0)) B \Bigr) \cdot x_0, \qquad t > 0, \tag{124}$$

when $A$ and $B$ commute. In particular, this holds when $B = \beta I$ and $A$ is arbitrary, as in our GWD model. See Arnold (1973, Thm. 8.5.2, Rem. 8.5.9) for a more general formula for $m$ independent Wiener processes. Since $[A, B] = 0$, (124) becomes

$$X(t) = e^{t(A - \frac{1}{2}\beta^2 I)}\, e^{(w(t) - w(0))\beta I} \cdot x_0 = e^{t(A - \frac{1}{2}\beta^2 I)} \bigl( e^{(w(t) - w(0))\beta} \cdot x_0 \bigr). \tag{125}$$

The vector $e^{(w(t) - w(0))\beta} \cdot x_0$ is in $\operatorname{span}(x_0)$ for each $t > 0$ and $\omega \in \Omega$, and the linear mapping $\operatorname{span}(x_0) \to \mathbb{R}^n$ given by $e^{t(A - \frac{1}{2}\beta^2 I)}$ has at most one-dimensional range; hence by (125), the vector $X(t)$ lies in a ($t$-dependent) subspace of $\mathbb{R}^n$ of dimension at most one. Hence $P(0, x_0, t, \cdot)$ must vanish outside

$$e^{t(A - \frac{1}{2}\beta^2 I)} \bigl(\operatorname{span}(x_0)\bigr),$$

and for this reason, the measure $P(0, x_0, t, \cdot)$ does not have a density function.
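The Gaussian integrals behind (120)–(122) can be checked numerically in the scalar case $n = 1$, where (124) reads $X(t) = \exp\bigl(t(a - \beta^2/2) + \beta r\bigr) x_0$ with $r = w(t) - w(0) \sim N(0, t)$. The sketch below is only an illustration under that assumption; the function names and the Simpson rule are hypothetical choices, not part of the original derivation. It also evaluates the determinant of the rank-one matrix in (122) for a given vector $y = e^{tA} x_0$.

```python
import math


def gaussian_expectation(g, t, half_width=12.0, panels=4000):
    """Simpson approximation of the integral of g(r) against the N(0, t) density."""
    s = math.sqrt(t)
    lo = -half_width * s
    h = 2.0 * half_width * s / panels
    total = 0.0
    for i in range(panels + 1):
        r = lo + i * h
        weight = 1.0 if i in (0, panels) else (4.0 if i % 2 else 2.0)
        density = math.exp(-r * r / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)
        total += weight * g(r) * density
    return total * h / 3.0


def gwd_value(a, beta, t, r, x0):
    """Scalar version of (124): the state reached when w(t) - w(0) = r."""
    return math.exp(t * (a - 0.5 * beta * beta) + beta * r) * x0


def gwd_moments(a, beta, t, x0):
    """First and second moments obtained by integrating as in (120) and (121)."""
    m1 = gaussian_expectation(lambda r: gwd_value(a, beta, t, r, x0), t)
    m2 = gaussian_expectation(lambda r: gwd_value(a, beta, t, r, x0) ** 2, t)
    return m1, m2


def gwd_covariance_2d(y, beta, t):
    """Sigma(t) = (e^{beta^2 t} - 1) y y^T from (122), for a given y = e^{tA} x0."""
    c = math.exp(beta * beta * t) - 1.0
    return [[c * y[i] * y[j] for j in range(2)] for i in range(2)]


def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]
```

For instance, `gwd_moments(0.3, 0.4, 2.0, 1.5)` can be compared with $e^{ta} x_0$ and $e^{\beta^2 t} e^{2ta} x_0^2$, and `det2(gwd_covariance_2d(y, beta, t))` vanishes up to rounding, in line with the rank-1 remark above.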

2.7 Final comments

The evolution of the expectation and the covariance matrix of planar time-homogeneous diffusion processes is seldom systematically presented in the literature. We have obtained the complete set of solutions for the moments of the general bivariate Gaussian diffusion process and a broad class of planar geometric Wiener diffusion processes. They were derived from the ordinary differential equations generated by Itô's Lemma. These closed-form moment solutions, expressed in terms of the basic drift and diffusion coefficients, may offer useful insights and should find applications in empirical studies of the diffusion dynamics within many areas of economic, social and biological process analysis.

We have also related the moment solutions explicitly to the corresponding evolution of the transition probability density (or distribution), generated by Kolmogorov's forward equation, which may play a large role in future research on stochastic differential equations.

Appendix A: Particular solutions for the GD model

Based on the theory of ordinary differential equations, the particular solution of the affine differential equation

$$\dot{\sigma} = C\sigma + \delta \tag{126}$$

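can be obtained as follows:

1. When the coefficient matrix $C$ is nonsingular, i.e. $|C| \ne 0$, the particular solution is the critical point

$$\bar{\sigma} = -C^{-1}\delta. \tag{127}$$

In Proposition 2,

$$C = \begin{pmatrix} 2a & 2b & 0 \\ c & a + d & b \\ 0 & 2c & 2d \end{pmatrix}, \qquad \delta = \begin{pmatrix} \beta_{11}^2 + \beta_{12}^2 \\ \beta_{11}\beta_{21} + \beta_{12}\beta_{22} \\ \beta_{21}^2 + \beta_{22}^2 \end{pmatrix}, \tag{128}$$

hence, since $|C| = 4\, (\operatorname{tr} A)\, |A|$,

$$C^{-1} = -\frac{1}{4\, (\operatorname{tr} A)\, |A|} \begin{pmatrix} -2d^2 - 2|A| & 4bd & -2b^2 \\ 2cd & -4ad & 2ab \\ -2c^2 & 4ac & -2a^2 - 2|A| \end{pmatrix}. \tag{129}$$

By inserting (128) and (129) into (127), then (62) follows.

2. When $C$ is singular, i.e. $|C| = 0$, then (21) implies

$$\dddot{\sigma}_{xx} - 3 (\operatorname{tr} A)\, \ddot{\sigma}_{xx} + 2 \bigl[ (\operatorname{tr} A)^2 + 2|A| \bigr] \dot{\sigma}_{xx} - 4 (\operatorname{tr} A) |A|\, \sigma_{xx} = 2|A| (\beta_{11}^2 + \beta_{12}^2) + 2 (d\beta_{11} - b\beta_{21})^2 + 2 (d\beta_{12} - b\beta_{22})^2. \tag{130}$$

When $\operatorname{tr} A = a + d = 0$, (130) becomes

$$\dddot{\sigma}_{xx} + 4|A|\, \dot{\sigma}_{xx} = 2|A| (\beta_{11}^2 + \beta_{12}^2) + 2 (d\beta_{11} - b\beta_{21})^2 + 2 (d\beta_{12} - b\beta_{22})^2. \tag{131}$$

Assuming that $\sigma_{xx} = \gamma t$, then by (131),

$$\gamma = \frac{\beta_{11}^2 + \beta_{12}^2}{2} + \frac{(d\beta_{11} - b\beta_{21})^2 + (d\beta_{12} - b\beta_{22})^2}{2|A|}. \tag{132}$$

Based on (21), we get the relationship between $\sigma_{xx}$ and $\sigma_{xy}$ as

$$\sigma_{xy} = \frac{1}{2b} \bigl( \dot{\sigma}_{xx} - 2a\sigma_{xx} - \beta_{11}^2 - \beta_{12}^2 \bigr), \tag{133}$$

and the relationship between $\sigma_{yy}$ and $\sigma_{xx}$ as

$$\sigma_{yy} = \frac{1}{2b^2} \bigl[ \ddot{\sigma}_{xx} - (3a + d)\dot{\sigma}_{xx} + (2a^2 + 2ad - 2bc)\sigma_{xx} + (a + d)(\beta_{11}^2 + \beta_{12}^2) - 2b(\beta_{11}\beta_{21} + \beta_{12}\beta_{22}) \bigr]. \tag{134}$$

By inserting (132) and $\sigma_{xx} = \gamma t$ into (133) and (134), the particular solution of (63) is obtained.

When $|A| = ad - bc = 0$, (130) becomes

$$\dddot{\sigma}_{xx} - 3 (\operatorname{tr} A)\, \ddot{\sigma}_{xx} + 2 (\operatorname{tr} A)^2 \dot{\sigma}_{xx} = 2 (d\beta_{11} - b\beta_{21})^2 + 2 (d\beta_{12} - b\beta_{22})^2. \tag{135}$$

Assuming that $\sigma_{xx} = \gamma t$, we get from (135):

$$\gamma = (\operatorname{tr} A)^{-2} \bigl[ (d\beta_{11} - b\beta_{21})^2 + (d\beta_{12} - b\beta_{22})^2 \bigr]. \tag{136}$$

By inserting (136) and $\sigma_{xx} = \gamma t$ into (133) and (134), the particular solution of (64) is obtained.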

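Appendix B: Exponential matrices in two dimensions

To calculate $e^{tA}$ for a matrix $A$ with entries $a$, $b$, $c$, and $d$, decompose it as

$$A = \frac{1}{2}(a + d) I + B, \qquad B = A - \frac{1}{2}(a + d) I = \begin{pmatrix} \frac{a - d}{2} & b \\ c & \frac{d - a}{2} \end{pmatrix}. \tag{137}$$

Since the matrices in (137) commute,

$$e^{tA} = e^{\frac{1}{2}(a + d)t}\, e^{tB}. \tag{138}$$

From the Hamilton–Cayley theorem, (28), $\operatorname{tr} B = 0$ and (29), we get

$$B^2 = (\operatorname{tr} B) B - |B| I = \Delta^2 I, \tag{139}$$

and from this it is inferred that

$$B^{2n} = (\Delta^2)^n I, \qquad B^{2n+1} = (\Delta^2)^n B, \qquad n \in \mathbb{N}_0 = \{0\} \cup \mathbb{N}. \tag{140}$$

Using (140) and the definition in (137), we have

$$e^{tB} = \sum_{n=0}^{\infty} \frac{t^n B^n}{n!} = \sum_{n=0}^{\infty} \frac{t^{2n} B^{2n}}{(2n)!} + \sum_{n=0}^{\infty} \frac{t^{2n+1} B^{2n+1}}{(2n+1)!} = \sum_{n=0}^{\infty} \frac{t^{2n} (\Delta^2)^n}{(2n)!}\, I + \sum_{n=0}^{\infty} \frac{t^{2n+1} (\Delta^2)^n}{(2n+1)!}\, B. \tag{141}$$

In case $\Delta^2 = 0$, and with $0^0 = 1$, $0! = 1$, we immediately obtain from (141)

$$\Delta^2 = 0: \qquad e^{tB} = I + tB. \tag{142}$$

Thus, (142), (138) and (137) establish (46). In case $\Delta^2 > 0$, (141) gives, cf. (30),

$$\Delta^2 > 0: \qquad e^{tB} = \sum_{n=0}^{\infty} \frac{(\Delta t)^{2n}}{(2n)!}\, I + \frac{1}{\Delta} \sum_{n=0}^{\infty} \frac{(\Delta t)^{2n+1}}{(2n+1)!}\, B = \cosh(\Delta t)\, I + \frac{1}{\Delta} \sinh(\Delta t)\, B. \tag{143}$$

In case $\Delta^2 < 0$, (141) gives analogously, cf. (30),

$$\Delta^2 < 0: \qquad e^{tB} = \sum_{n=0}^{\infty} \frac{(-1)^n (\Delta t)^{2n}}{(2n)!}\, I + \frac{1}{\Delta} \sum_{n=0}^{\infty} \frac{(-1)^n (\Delta t)^{2n+1}}{(2n+1)!}\, B = \cos(\Delta t)\, I + \frac{1}{\Delta} \sin(\Delta t)\, B. \tag{144}$$

Thus, (143), (144) and (138), (137), respectively, establish (45) and (47).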
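The three cases (142)–(144) translate directly into a short routine. The sketch below is an illustration, not the authors' code: the function names are hypothetical, $\Delta^2 = -|B| = (a - d)^2/4 + bc$ is read off from (139), and it is assumed (definition (30) is not reproduced here) that $\Delta$ denotes $\sqrt{|\Delta^2|}$ in (143)–(144). A truncated power series is included only as an independent cross-check.

```python
import math


def exp_2x2(a, b, c, d, t):
    """e^{tA} for A = [[a, b], [c, d]] via the decomposition (137)-(138)."""
    B = [[0.5 * (a - d), b], [c, 0.5 * (d - a)]]
    delta_sq = 0.25 * (a - d) ** 2 + b * c  # equals -det(B), cf. (139)
    if delta_sq > 0:
        delta = math.sqrt(delta_sq)
        f, g = math.cosh(delta * t), math.sinh(delta * t) / delta  # (143)
    elif delta_sq < 0:
        delta = math.sqrt(-delta_sq)  # assumed meaning of Delta in (144)
        f, g = math.cos(delta * t), math.sin(delta * t) / delta  # (144)
    else:
        f, g = 1.0, t  # (142)
    scale = math.exp(0.5 * (a + d) * t)  # scalar factor in (138)
    return [[scale * (f * (i == j) + g * B[i][j]) for j in range(2)]
            for i in range(2)]


def exp_series(A, t, terms=60):
    """Truncated power series of e^{tA}, used only as a cross-check."""
    term = [[1.0, 0.0], [0.0, 1.0]]
    total = [row[:] for row in term]
    for k in range(1, terms):
        term = [[t / k * sum(term[i][m] * A[m][j] for m in range(2))
                 for j in range(2)] for i in range(2)]
        total = [[total[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return total
```

The exact comparison `delta_sq == 0` (the final `else` branch) is adequate for matrices with exactly representable entries; with noisy input a small tolerance would be a more robust design.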

Appendix C: Exponentials in three dimensions

To calculate the exponential matrix $e^{tC}$, we need the following

Lemma 5. If a $3 \times 3$ matrix $A$ has three equal eigenvalues $\lambda$, then

$$e^{tA} = e^{\lambda t} \Bigl[ I + t (A - \lambda I) + \frac{1}{2} t^2 (A - \lambda I)^2 \Bigr], \tag{145}$$

and if it has three distinct eigenvalues $\lambda_1$, $\lambda_2$ and $\lambda_3$, then

$$e^{tA} = e^{\lambda_1 t}\, \frac{(A - \lambda_2 I)(A - \lambda_3 I)}{(\lambda_1 - \lambda_2)(\lambda_1 - \lambda_3)} + e^{\lambda_2 t}\, \frac{(A - \lambda_1 I)(A - \lambda_3 I)}{(\lambda_2 - \lambda_1)(\lambda_2 - \lambda_3)} + e^{\lambda_3 t}\, \frac{(A - \lambda_1 I)(A - \lambda_2 I)}{(\lambda_3 - \lambda_1)(\lambda_3 - \lambda_2)}. \tag{146}$$

Proof: See Apostol (1969). □

Appendix D: The Basic Wiener Process and the Evolution of Transition Probability Densities

Bjarne S. Jensen, University of Southern Denmark and Copenhagen Business School
Mogens E. Larsen, Department of Mathematics, University of Copenhagen

The fundamental stochastic process in continuous time is the Wiener process (Brownian motion). The transition probability from any initial state (interval) to any other state (interval) is a fixed probability (unaffected by the past history of the stochastic process/states). A brief review of the methodology in Einstein (1905) is instructive and useful for an exact derivation of the partial differential equation (PDE) that generates the dynamics (evolution) of the probability densities from a Wiener process. The general solution of this PDE clarifies the origin of the conditional (transition) probability density function of the Wiener process and how it is calculated. Our demonstrations also provide a background for stochastic differential equations (SDE) and their solutions (sample paths), which are driven by the standard Wiener process.

Transition probability density and Kolmogorov's equation

Suppose we have a stochastic process described by the conditional density function, $p(x, t \mid x_0)$, with $x$ as the state variable (position, point) and $t$ as the continuous time variable, describing for each value of $t$ the distribution of the states, e.g., the position of particles.

For a small time step (interval), $\tau$, the conditional distribution of states at time $t + \tau$ depends on its neighboring distribution. The probability density of the increments during the period $\tau$ is $f(\Delta; \tau)$, with $\Delta$ as changes (distance) in the state (position) variable $x$ from any initial state (position), $x_0$. The probability distribution of increments is assumed to be symmetric, $f(\Delta; \tau) = f(-\Delta; \tau)$, and to have mean zero:

$$E(\Delta) = \int_{-\infty}^{\infty} \Delta f(\Delta; \tau)\, d\Delta = 0; \tag{147}$$

$$\int_{-\infty}^{\infty} f(\Delta; \tau)\, d\Delta = 1; \tag{148}$$

and a variance proportional to the size of $\tau$:

$$\sigma_\Delta^2 = \int_{-\infty}^{\infty} \Delta^2 f(\Delta; \tau)\, d\Delta = b^2 \tau; \qquad b > 0 \quad (b^2 = D). \tag{149}$$

The conditional probability density function $p(x, t + \tau \mid x_0)$ at $t + \tau$ can then be obtained as a weighted average of the neighboring conditional probability density functions

$$p(x, t + \tau \mid x_0) = \int_{-\infty}^{\infty} p(x + \Delta, t \mid x_0) f(\Delta; \tau)\, d\Delta \tag{150}$$

or alternatively, in view of the symmetry of $f$ assumed above, as a convolution of $p(x, t \mid x_0)$ and $f(\Delta; \tau)$ [cf. (162) below]

$$p(x, t + \tau \mid x_0) = \int_{-\infty}^{\infty} p(x + \Delta, t \mid x_0) f(-\Delta; \tau)\, d\Delta. \tag{151}$$

The Taylor series of $p(x, t \mid x_0)$, developed from $(x, t)$ in both directions, yields

$$p(x, t + \tau \mid x_0) = p(x, t \mid x_0) + \frac{\partial p(x, t \mid x_0)}{\partial t}\, \tau + \cdots \tag{152}$$

and

$$p(x + \Delta, t \mid x_0) = p(x, t \mid x_0) + \frac{\partial p(x, t \mid x_0)}{\partial x}\, \Delta + \frac{1}{2} \frac{\partial^2 p(x, t \mid x_0)}{\partial x^2}\, \Delta^2 + \cdots \tag{153}$$

With substitution of (152) and (153) into, respectively, the left and right side of (150), we obtain

$$p(x, t \mid x_0) + \frac{\partial p(x, t \mid x_0)}{\partial t}\, \tau = p(x, t \mid x_0) \int_{-\infty}^{\infty} f(\Delta; \tau)\, d\Delta + \frac{\partial p(x, t \mid x_0)}{\partial x} \int_{-\infty}^{\infty} \Delta f(\Delta; \tau)\, d\Delta + \frac{1}{2} \frac{\partial^2 p(x, t \mid x_0)}{\partial x^2} \int_{-\infty}^{\infty} \Delta^2 f(\Delta; \tau)\, d\Delta. \tag{154}$$

Reduction of (154) yields, using (147), (148) and (149),

$$\frac{\partial p(x, t \mid x_0)}{\partial t}\, \tau = \frac{1}{2} b^2 \tau\, \frac{\partial^2 p(x, t \mid x_0)}{\partial x^2} \tag{155}$$

which shows that, approximately and independently of $\tau$, the conditional density function $p(x, t \mid x_0)$ satisfies the diffusion ("heat") equation [Kolmogorov's PDE]:

$$\frac{\partial p(x, t \mid x_0)}{\partial t} = \frac{1}{2} b^2\, \frac{\partial^2 p(x, t \mid x_0)}{\partial x^2}. \tag{156}$$

Thus, by making the simple general descriptive assumptions (148)–(150), and then the first-order (152) and the second-order expansions (153), Einstein (1905, p. 556) gave a clear-cut procedure for deriving the diffusion equation (156), whose solution (with $b^2 = D$) was already well known to physicists.

To solve the partial differential equation (156), we use here the Fourier transformation ($\mathcal{F}$)

$$\mathcal{F} p(x, t \mid x_0) = \hat{p}(\xi, t \mid x_0) = \int_{-\infty}^{\infty} p(x, t \mid x_0)\, e^{-i\xi(x - x_0)}\, dx \tag{157}$$

to obtain the ordinary differential equation in $t$:

$$\frac{\partial \hat{p}}{\partial t}(\xi, t \mid x_0) = \frac{d\hat{p}}{dt}(\xi, t \mid x_0) = \frac{1}{2} b^2 (-i\xi)^2\, \hat{p}(\xi, t \mid x_0) = -\frac{1}{2} b^2 \xi^2\, \hat{p}(\xi, t \mid x_0). \tag{158}$$

Hence the solution of (158) is

$$\hat{p}(\xi, t \mid x_0) = \hat{\upsilon}(\xi)\, e^{-\frac{1}{2} b^2 \xi^2 t}. \tag{159}$$

Using the inverse Fourier transformation, we derive the solution of (156) as a convolution [cf. (162)] of the inverse transforms of the two factors in (159). We obtain $\upsilon(x - x_0)$, cf. (159), as

$$\upsilon(x - x_0) = \mathcal{F}^{-1} \hat{\upsilon}(\xi) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \hat{\upsilon}(\xi)\, e^{i\xi(x - x_0)}\, d\xi \tag{160}$$

and the Gauss kernel, $p_N(x, t \mid x_0)$, as

$$p_N(x, t \mid x_0) = \mathcal{F}^{-1} e^{-\frac{1}{2} b^2 \xi^2 t} = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-\frac{1}{2} b^2 \xi^2 t}\, e^{i\xi(x - x_0)}\, d\xi = \frac{1}{\sqrt{2\pi b^2 t}}\, e^{-\frac{1}{2} \frac{(x - x_0)^2}{b^2 t}}. \tag{161}$$

As the inverse Fourier transform takes a product into a convolution, we get, cf. (159)–(161),

$$p(x, t \mid x_0) = \mathcal{F}^{-1} \hat{p}(\xi, t \mid x_0) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \hat{p}(\xi, t \mid x_0)\, e^{i\xi(x - x_0)}\, d\xi = \upsilon(x - x_0) * p_N(x, t \mid x_0) = \int_{-\infty}^{\infty} \upsilon(x - x_0 - s)\, p_N(s, t \mid x_0)\, ds. \tag{162}$$

Now, in the sense of distributions (generalized functions), the Gauss kernel (161) converges to the Dirac $\delta(x - x_0)$ function for $t \to 0$. Hence (162) gives

$$p(x, 0 \mid x_0) = \lim_{t \to 0} p(x, t \mid x_0) = \upsilon(x - x_0) * \delta(x - x_0) = \upsilon(x - x_0). \tag{163}$$

If we originally had chosen the boundary condition as

$$\upsilon(x - x_0) = \delta(x - x_0) \tag{164}$$

then we would have the conditional probability density function as

$$p(x, t \mid x_0) = \delta(x - x_0) * p_N(x, t \mid x_0) = p_N(x, t \mid x_0), \tag{165}$$

i.e., we may consider $p(x, t \mid x_0)$ and the Gauss kernel (161) as the solution (162) to Kolmogorov's equation (156) with the boundary condition (164). See also Bachelier (1900, p. 46; 1964, p. 38).
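That the Gauss kernel (161) satisfies the diffusion equation (156) and has unit mass, cf. (148), can also be confirmed numerically. The following sketch uses central finite differences and a trapezoidal sum; the function names, step sizes and grid are illustrative assumptions rather than part of the derivation above.

```python
import math


def gauss_kernel(x, t, x0=0.0, b=1.0):
    """The Gauss kernel p_N(x, t | x0) of (161)."""
    var = b * b * t
    return math.exp(-0.5 * (x - x0) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def heat_residual(x, t, x0=0.0, b=1.0, h=1e-3):
    """dp/dt - (b^2 / 2) d^2p/dx^2 by central differences; near 0 if (156) holds."""
    dp_dt = (gauss_kernel(x, t + h, x0, b) - gauss_kernel(x, t - h, x0, b)) / (2.0 * h)
    d2p_dx2 = (gauss_kernel(x + h, t, x0, b) - 2.0 * gauss_kernel(x, t, x0, b)
               + gauss_kernel(x - h, t, x0, b)) / (h * h)
    return dp_dt - 0.5 * b * b * d2p_dx2


def total_mass(t, x0=0.0, b=1.0, points=2001):
    """Trapezoidal approximation of the integral of p_N over x0 +/- 10 std devs."""
    width = 10.0 * b * math.sqrt(t)
    h = 2.0 * width / (points - 1)
    values = [gauss_kernel(x0 - width + i * h, t, x0, b) for i in range(points)]
    return h * (sum(values) - 0.5 * (values[0] + values[-1]))
```

The residual is of the order of the finite-difference truncation error, so it shrinks with the step `h` until rounding in the second difference takes over.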

Acknowledgements: We wish to thank P. Alsholm (Technical University of Denmark) and J.M.S. Jensen (Vinci Computers) for kind assistance in computer calculations, and G. Grubb, M. Jacobsen (University of Copenhagen), H. Spliid, H. Holst (Technical University of Denmark) for valuable comments and discussions. For discussions about the Kolmogorov equation, thanks are also due to Y. V. Prohorov (Steklov-Institute for Mathematics, Moscow). Bjarne S. Jensen thanks the Danish Social Sciences Research Council for Grant no. 9500740.

References:

Apostol, T. M. (1969) "Some Explicit Formulas for the Exponential Matrix e^{tA}." American Mathematical Monthly 76: 289-292.

Arnold, L. (1973) Stochastic Differential Equations: Theory and Applications. John Wiley and Sons, New York, London, Sydney, Toronto. German original published by R. Oldenbourg Verlag, Munich 1973.

Bachelier, L. (1900) Théorie de la Spéculation. Annales scientifiques de l'Ecole normale supérieure, (3), No. 1018. Gauthier-Villars, Paris. English translation in: Cootner, P.H. (ed.) The Random Character of Stock Market Prices. MIT Press, Cambridge Mass., 1964: 17–78.

Chang, F. and Malliaris, A. G. (1987) "Asymptotic Growth under Uncertainty: Existence and Uniqueness." Review of Economic Studies 54: 169-174.

Einstein, A. (1905) "Über die von der molekularkinetischen Theorie der Wärme geforderte Bewegung von in ruhenden Flüssigkeiten suspendierten Teilchen." Annalen der Physik 17: 549-560. Translated as "On the Motion of Small Particles in Liquids at Rest Required by the Molecular-Kinetic Theory of Heat" in the following volumes: (i) Fürth, R. (ed.) Albert Einstein: Investigations on the Theory of the Brownian Movement. Dover Publ., New York. (ii) Stachel, J. (ed.) Einstein's Miraculous Year. Five Papers that Changed the Face of Physics. Princeton University Press, Princeton, 2005.

Gikman, I. I. and Skorokhod, A. V. (1969) Introduction to the Theory of Random Processes. W. B. Saunders Company, Philadelphia, London, Toronto. Russian original published by Nauka, Moscow, 1965.

Hörmander, L. (1967) "Hypoelliptic Second Order Differential Equations." Acta Mathematica 119: 147–171.

Hörmander, L. (1983, 1985) The Analysis of Linear Partial Differential Operators. Grundlehren der mathematischen Wissenschaften, vol. 256, 257, 274, 275. Springer Verlag, Berlin.

Itô, K. and McKean, H. P. Jr. (1974) Diffusion Processes and Their Sample Paths. Second printing. Springer-Verlag, New York.

Jacobsen, M. (1991) Homogeneous Gaussian Diffusions in Finite Dimensions. Institute of Mathematical Statistics, University of Copenhagen, Preprint No. 3.

Jensen, B. S. (1994) The Dynamic Systems of Basic Economic Growth Models. Kluwer Academic Publishers, Dordrecht/Boston/London.

Karatzas, I. and Shreve, S. (1988) Brownian Motions and Stochastic Calculus. Springer-Verlag, New York.

Merton, R. C. (1975) "An Asymptotic Theory of Growth Under Uncertainty." Review of Economic Studies 47: 375-393.

Prohorov, Y. V. and Rozanov, Y. A. (1969) Probability Theory. Springer-Verlag.

Pugachev, V. S. and Sinitsyn, I. N. (1987) Stochastic Differential Systems. John Wiley and Sons, New York.

Rudin, W. (1973) Functional Analysis. McGraw-Hill.

Selby, S. M. (1973) Standard Mathematical Tables. 21. ed. Chemical Rubber Co., Cleveland, Ohio.

Soong, T. T. (1973) Random Differential Equations in Science and Engineering. Academic Press, New York.

Øksendal, B. (2005) Stochastic Differential Equations. 6. edition. Series Universitext, Springer-Verlag, Berlin.

Chapter 3 Two-Dimensional Linear Dynamic Systems with Small Random Terms

Nishioka Kunio Faculty of Commerce, Chuo University, Tokyo, Japan

3.1 Introduction

Let $(\Omega, \mathcal{F}, P)$ be a standard probability space¹ and $\{W(t, \omega), t \ge 0\}$ be an $\mathbb{R}^1$-valued Brownian motion on $(\Omega, \mathcal{F}, P)$. We consider a linear stochastic differential equation (SDE for short):

$$dx^\varepsilon(t, \omega) = A \cdot x^\varepsilon(t, \omega)\, dt + \varepsilon\, G \cdot x^\varepsilon(t, \omega)\, dW(t, \omega), \qquad x^\varepsilon(0, \omega) = x^* \tag{1}$$

where $A$ and $G$ are constant regular $2 \times 2$ matrices, $\varepsilon$ is a positive real number, and $x^* = (x_1^*, x_2^*)$ is a point in $\mathbb{R}^2$. There exists a unique solution of SDE (1), which defines a random dynamical system $\{x^\varepsilon(t, \omega), t \ge 0\}$ in $\mathbb{R}^2$. Note that the origin $0 = (0, 0)$ is a singular point of our dynamical system for any $\varepsilon \ge 0$, that is, $x^\varepsilon(t, \omega) \equiv 0$ for all $t \ge 0$ if $x^\varepsilon(0, \omega) = 0$.

¹ $\Omega$ is a sample space, whose element $\omega \in \Omega$ denotes an individual experiment. $\mathcal{F}$ is a $\sigma$-field of $\Omega$ and $P$ is a probability measure on $(\Omega, \mathcal{F})$. See Ito and McKean (1968) or Revuz and Yor (1999) for strict definitions of probability space, Brownian motion, SDE, and related notions.

Let $\varepsilon = 0$ in SDE (1), and we have an ordinary differential equation

$$\frac{dx}{dt}(t) = A \cdot x(t), \qquad x(0) = x^*, \tag{2}$$

whose solution defines a non-random dynamical system $\{x(t), t \ge 0\}$. It is easy to investigate the asymptotic behavior of $x(t)$ as $t \to \infty$. According to this asymptotic behavior, the origin 0 is classified as a spiral point, a center, a saddle point, an improper node, or a proper node. (See section 3.2 in this paper and/or Coddington and Levinson (1955), Ch. 15, §1.)

Our problem in this note is to answer the question: Does that classification for the non-random system (2) remain valid for the random system (1) with sufficiently small $\varepsilon$?

This note is organized as follows. In sections 3.2 and 3.3, we discuss the non-random system (2) and review the facts needed to investigate our problem. We discuss the random system (1) for a spiral point and a center in section 3.4, for a saddle point in section 3.5, and for an improper and a proper node in section 3.6. Combining Theorems 1, 2, 3, 4, 5, and 6, we present the following main result:

Main Theorem.² If the origin 0 is a spiral point, an improper node, or a saddle point for the non-random system (2), then the same holds for the random system (1) with small $\varepsilon > 0$. However, if the origin is a center or a proper node for the non-random system (2), this is not necessarily true for the random system (1), even when $\varepsilon > 0$ is small.

² See Remark 3 for an intuitive explanation of this theorem.

3.2 Non-random dynamic system

We begin by remarking that there is a regular matrix $Q$ such that the transformed matrix $Q \cdot A \cdot Q^{-1}$ or $-Q \cdot A \cdot Q^{-1}$ is one of the following canonical forms:

$$\begin{aligned}
&\text{(I)} & &\begin{pmatrix} a_1 & -a_2 \\ a_2 & a_1 \end{pmatrix} & &\text{with } a_1 < 0 \text{ and } a_2 > 0, \\
&\text{(II)} & &\begin{pmatrix} 0 & -a_2 \\ a_2 & 0 \end{pmatrix} & &\text{with } a_2 > 0, \\
&\text{(III)} & &\begin{pmatrix} a_1 & 0 \\ 0 & a_2 \end{pmatrix} & &\text{with } a_1 < 0 < a_2, \\
&\text{(IV)} & &\begin{pmatrix} a_1 & 0 \\ 0 & a_2 \end{pmatrix} & &\text{with } a_2 < a_1 < 0, \\
&\text{(V)} & &\begin{pmatrix} a_1 & 0 \\ a_2 & a_1 \end{pmatrix} & &\text{with } a_1 < 0 \text{ and } a_2 > 0, \\
&\text{(VI)} & &\begin{pmatrix} a_1 & 0 \\ 0 & a_1 \end{pmatrix} & &\text{with } a_1 < 0.
\end{aligned} \tag{3}$$

For simplicity, we assume that the matrix $A$ is one of the above canonical forms (I) through (VI), and consider the following pair of non-random dynamical systems instead of (2) alone:

$$\frac{dx}{dt}(t) = A \cdot x(t), \quad x(0) = x^*, \qquad \frac{d\tilde{x}}{dt}(t) = \tilde{A} \cdot \tilde{x}(t), \quad \tilde{x}(0) = x^*, \tag{4}$$

where we put

$$\tilde{A} \equiv -A. \tag{5}$$

In order to analyze (4), we introduce new coordinate functions for $\{x(t) = (x_1(t), x_2(t)), t \ge 0\}$, that is

$$\theta(t) \equiv \tan^{-1}\bigl(x_2(t)/x_1(t)\bigr), \qquad \rho(t) \equiv \log \|x(t)\|. \tag{6}$$

The function

$$L(x^*) \equiv \lim_{t \to \infty} \frac{\rho(t)}{t}, \qquad x^* = x(0) \tag{7}$$

is called the Lyapunov index, which indicates exponential stability or instability of the dynamical system (4). In fact, if $L(x^*) \ne 0$, then

$$\|x(t)\| \sim K \exp\{L(x^*)\, t\} \quad \text{as } t \to \infty, \qquad x^* = x(0),$$

where $K$ is a positive constant.
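The limit in (7) is easy to approximate numerically. The sketch below integrates $dx/dt = A x$ with a classical Runge–Kutta step, renormalizes the state after every step to avoid overflow or underflow, and accumulates $\rho(t) = \log \|x(t)\|$. The function names, the horizon `T` and the step `dt` are illustrative assumptions.

```python
import math


def rk4_step(A, x, dt):
    """One classical Runge-Kutta step for the planar system dx/dt = A x."""
    def f(v):
        return (A[0][0] * v[0] + A[0][1] * v[1], A[1][0] * v[0] + A[1][1] * v[1])
    k1 = f(x)
    k2 = f((x[0] + 0.5 * dt * k1[0], x[1] + 0.5 * dt * k1[1]))
    k3 = f((x[0] + 0.5 * dt * k2[0], x[1] + 0.5 * dt * k2[1]))
    k4 = f((x[0] + dt * k3[0], x[1] + dt * k3[1]))
    return (x[0] + dt / 6.0 * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]),
            x[1] + dt / 6.0 * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]))


def lyapunov_index(A, x0, T=40.0, dt=1e-3):
    """Finite-horizon estimate rho(T)/T of the Lyapunov index in (7)."""
    steps = int(round(T / dt))
    norm = math.hypot(x0[0], x0[1])
    rho = math.log(norm)  # rho(0)
    x = (x0[0] / norm, x0[1] / norm)
    for _ in range(steps):
        x = rk4_step(A, x, dt)
        norm = math.hypot(x[0], x[1])
        rho += math.log(norm)  # accumulate the growth of log ||x||
        x = (x[0] / norm, x[1] / norm)  # renormalize to unit length
    return rho / (steps * dt)
```

For a matrix of the canonical form (I) in (3), say $a_1 = -0.5$ and $a_2 = 2$, the estimate is close to $a_1$; for form (III) with $x_2^* \ne 0$ it approaches $a_2$ as `T` grows.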

For the other dynamical system $\{\tilde{x}(t), t \ge 0\}$ in (4), we define $\tilde{\theta}(t)$, $\tilde{\rho}(t)$, and $\tilde{L}(x^*)$ by the corresponding functions in (6) and (7), respectively.

If the matrix $A$ is (I) in (3), we call the origin a spiral point for the dynamical system (4) (see Figure 1). By a simple calculation, we see that

$$\theta(t) = \theta(0) + a_2 t, \qquad \rho(t) = \rho(0) + a_1 t. \tag{8}$$

So the spiral point is characterized by the fact:

$$\lim_{T \to \infty} \frac{\theta(T)}{T} = a_2 \ne 0, \quad L(x^*) = a_1 < 0, \qquad \lim_{T \to \infty} \frac{\tilde{\theta}(T)}{T} = -a_2 \ne 0, \quad \tilde{L}(x^*) = -a_1 > 0. \tag{9}$$

Figure 1: A spiral point (I), and a center (II)

If $A$ is (II), then it is just the special case of (8) where $a_1 = 0$, and the origin is called a center (see Figure 1). A center is distinguished by the fact:

$$\lim_{T \to \infty} \frac{\theta(T)}{T} = a_2 \ne 0, \quad L(x^*) = 0, \qquad \lim_{T \to \infty} \frac{\tilde{\theta}(T)}{T} = -a_2 \ne 0, \quad \tilde{L}(x^*) = 0.$$

If the matrix $A$ is (III) or (IV) in (3), then

$$\tan \theta(t) = \tan \theta(0)\, \exp\{(a_2 - a_1) t\}, \qquad \rho(t) = \rho(0) + a_2 t + (a_1 - a_2) \int_0^t \cos^2 \theta(s)\, ds. \tag{10}$$

The origin 0 is called a saddle point in the case of (III) (see Figure 2), and it is distinguished by the following:

$$\lim_{t \to \infty} \theta(t) = \begin{cases} \pi/2 & \text{if } \theta(0) \in (0, \pi) \\ 3\pi/2 & \text{if } \theta(0) \in (\pi, 2\pi), \end{cases} \tag{11a}$$

$$L(x^*) = a_2 > 0 \quad \text{if } x^* = (x_1^*, x_2^*) \text{ with } x_2^* \ne 0, \tag{11b}$$

$$\lim_{t \to \infty} \tilde{\theta}(t) = \begin{cases} 0 & \text{if } \tilde{\theta}(0) \in (-\pi/2, \pi/2) \\ \pi & \text{if } \tilde{\theta}(0) \in (\pi/2, 3\pi/2), \end{cases} \tag{11c}$$

$$\tilde{L}(x^*) = -a_1 > 0 \quad \text{if } x^* = (x_1^*, x_2^*) \text{ with } x_1^* \ne 0. \tag{11d}$$

Figure 2: A saddle point (III), and an improper node (IV)

The origin is an improper node in the case (IV) (see Figure 2), and the following holds:

$$\lim_{t \to \infty} \theta(t) = \text{same as the right side of (11c)}, \tag{12a}$$

$$L(x^*) = a_1 < 0 \quad \text{if } x^* = (x_1^*, x_2^*) \text{ with } x_1^* \ne 0, \tag{12b}$$

$$\lim_{t \to \infty} \tilde{\theta}(t) = \text{same as the right side of (11a)}, \tag{12c}$$

$$\tilde{L}(x^*) = -a_2 > 0 \quad \text{if } x^* = (x_1^*, x_2^*) \text{ with } x_2^* \ne 0. \tag{12d}$$

When the matrix $A$ is (V) in (3), then

$$\tan \theta(t) = \tan \theta(0) + a_2 t, \qquad \rho(t) = \rho(0) + a_1 t + \frac{a_2}{2} \int_0^t \sin 2\theta(s)\, ds,$$

and we also say that the origin 0 is an improper node (see Figure 3). This improper node is characterized by the fact:

$$\lim_{t \to \infty} \theta(t) = \begin{cases} \pi/2 & \text{if } \theta(0) \in (-\pi/2, \pi/2] \\ -\pi/2 & \text{if } \theta(0) \in (\pi/2, 3\pi/2], \end{cases} \tag{13a}$$

$$L(x^*) = a_1 < 0, \tag{13b}$$

$$\lim_{t \to \infty} \tilde{\theta}(t) = \begin{cases} -\pi/2 & \text{if } \theta(0) \in [-\pi/2, \pi/2) \\ \pi/2 & \text{if } \theta(0) \in [\pi/2, 3\pi/2), \end{cases} \tag{13c}$$

$$\tilde{L}(x^*) = -a_1 > 0. \tag{13d}$$

Figure 3: An improper node (V), and a proper node (VI)

If $A$ is (VI) in (3), then

$$\theta(t) = \theta(0) \text{ for all } t \ge 0, \qquad \rho(t) = \rho(0) + a_1 t, \tag{14}$$

where the origin 0 is called a proper node (see Figure 3), and it is distinguished by the fact:

$$\theta(t) = \theta(0) \text{ for all } t \ge 0, \quad L(x^*) = a_1 < 0, \qquad \tilde{\theta}(t) = \theta(0) \text{ for all } t \ge 0, \quad \tilde{L}(x^*) = -a_1 > 0. \tag{15}$$

3.3 Lyapunov index of the random system

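Let $e(\theta)$ and $e^\dagger(\theta)$ be two-dimensional vectors such that

$$e(\theta) \equiv (\cos \theta, \sin \theta), \qquad e^\dagger(\theta) \equiv (\sin \theta, -\cos \theta). \tag{16}$$

We denote by $\langle x, y \rangle$ the inner product of vectors $x$ and $y$ in $\mathbb{R}^2$, and by $g_{ij}$ the $ij$-element of the matrix $G$, and so on.

As in the previous section, we use the same coordinate functions for the random dynamical system $\{x^\varepsilon(t, \omega) = (x_1^\varepsilon(t, \omega), x_2^\varepsilon(t, \omega)), t \ge 0\}$, that is

$$\theta^\varepsilon(t, \omega) \equiv \tan^{-1}\bigl(x_2^\varepsilon(t, \omega)/x_1^\varepsilon(t, \omega)\bigr), \qquad \rho^\varepsilon(t, \omega) \equiv \log \|x^\varepsilon(t, \omega)\|. \tag{17}$$

Applying Itô's formula, we know that the diffusion process $\{\theta^\varepsilon(t, \omega), t \ge 0\}$ satisfies the SDE

$$d\theta^\varepsilon(t, \omega) = b^\varepsilon(\theta^\varepsilon(t, \omega))\, dt + \varepsilon\, \sigma(\theta^\varepsilon(t, \omega))\, dW(t, \omega) \tag{18}$$

where

$$\sigma(\theta) \equiv -\langle e^\dagger(\theta), G \cdot e(\theta) \rangle = g_{21} \cos^2 \theta + (g_{22} - g_{11}) \cos \theta \sin \theta - g_{12} \sin^2 \theta, \qquad b^\varepsilon(\theta) \equiv -\langle e^\dagger(\theta), A \cdot e(\theta) \rangle + \varepsilon^2\, b_G(\theta) \tag{19}$$

with

$$b_G(\theta) \equiv \langle e^\dagger(\theta), G \cdot e(\theta) \rangle \langle e(\theta), G \cdot e(\theta) \rangle = -\sigma(\theta) \bigl[ g_{11} \cos^2 \theta + (g_{12} + g_{21}) \cos \theta \sin \theta + g_{22} \sin^2 \theta \bigr].$$

On the other hand, the diffusion process $\{\rho^\varepsilon(t, \omega), t \ge 0\}$ is a solution of the SDE

$$d\rho^\varepsilon(t, \omega) = Q^\varepsilon(\theta^\varepsilon(t, \omega))\, dt + \varepsilon\, R(\theta^\varepsilon(t, \omega))\, dW(t, \omega) \tag{20}$$

where

$$R(\theta) \equiv \langle e(\theta), G \cdot e(\theta) \rangle, \qquad Q^\varepsilon(\theta) \equiv \langle e(\theta), A \cdot e(\theta) \rangle + \varepsilon^2\, q_G(\theta) \tag{21}$$

with

$$q_G(\theta) \equiv \frac{1}{2} |G \cdot e(\theta)|^2 - \langle e(\theta), G \cdot e(\theta) \rangle^2.$$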
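The coefficient functions in (19) and (21) are simple enough to code directly, and the pair of SDEs (18) and (20) can then be simulated with an Euler–Maruyama scheme. The sketch below is only an illustration under stated assumptions: the function names, the step size, the horizon and the seeding are hypothetical choices, and the discretization is not part of the original analysis.

```python
import math
import random


def e_vec(theta):
    return (math.cos(theta), math.sin(theta))


def e_dag(theta):
    return (math.sin(theta), -math.cos(theta))


def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]


def apply(M, v):
    return (M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1])


def sigma(G, theta):
    """Diffusion coefficient of the angle, first formula in (19)."""
    return -dot(e_dag(theta), apply(G, e_vec(theta)))


def b_G(G, theta):
    Ge = apply(G, e_vec(theta))
    return dot(e_dag(theta), Ge) * dot(e_vec(theta), Ge)


def b_eps(A, G, eps, theta):
    """Drift of the angle, second formula in (19)."""
    return -dot(e_dag(theta), apply(A, e_vec(theta))) + eps ** 2 * b_G(G, theta)


def R(G, theta):
    return dot(e_vec(theta), apply(G, e_vec(theta)))


def q_G(G, theta):
    Ge = apply(G, e_vec(theta))
    return 0.5 * dot(Ge, Ge) - dot(e_vec(theta), Ge) ** 2


def Q_eps(A, G, eps, theta):
    """Drift of rho in (20)-(21)."""
    return dot(e_vec(theta), apply(A, e_vec(theta))) + eps ** 2 * q_G(G, theta)


def simulate_rho_over_T(A, G, eps, theta0=0.0, T=200.0, dt=0.01, seed=0):
    """Euler-Maruyama for (18) and (20) with rho(0) = 0; returns rho(T)/T."""
    rng = random.Random(seed)
    theta, rho = theta0, 0.0
    steps = int(round(T / dt))
    for _ in range(steps):
        dW = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment over dt
        rho += Q_eps(A, G, eps, theta) * dt + eps * R(G, theta) * dW
        theta += b_eps(A, G, eps, theta) * dt + eps * sigma(G, theta) * dW
    return rho / (steps * dt)
```

Two sanity checks follow from the formulas themselves: for $\varepsilon = 0$ and $A$ of the form (I) in (3), $Q^\varepsilon \equiv a_1$; and for $G = gI$ one gets $\sigma \equiv 0$ and $q_G \equiv -g^2/2$, so the simulated ratio fluctuates around $\langle e, Ae \rangle - \varepsilon^2 g^2 / 2$.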

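Definition 1. As in (7), a non-random function³

\[ L^\varepsilon(x^*) \equiv \lim_{t\to\infty} \frac{\rho^\varepsilon(t,\omega)}{t} \quad \text{a.s.}, \qquad x^* = x^\varepsilon(0,\omega) \tag{22} \]

is called the Lyapunov index of the random system $\{x^\varepsilon(t,\omega),\ t\ge 0\}$, if it exists.

As is well known, this Lyapunov index determines the stability of the random system $\{x^\varepsilon(t,\omega),\ t\ge 0\}$:⁴

Proposition 1. Assume that the Lyapunov index $L^\varepsilon(x^*)$ exists with probability one. If $L^\varepsilon(x^*) < 0$, then the random system $\{x^\varepsilon(t,\omega),\ t\ge 0\}$ is asymptotically stable, that is,

\[ \lim_{t\to\infty} |x^\varepsilon(t,\omega)| = 0 \quad \text{a.s.}, \qquad x^* = x^\varepsilon(0,\omega). \]

If instead $L^\varepsilon(x^*) > 0$, then the random system is asymptotically unstable, that is,

\[ \lim_{t\to\infty} |x^\varepsilon(t,\omega)| = \infty \quad \text{a.s.}, \qquad x^* = x^\varepsilon(0,\omega). \]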

Notation 1. In the manner of section 3.2, we suppose that the matrix A is one of the canonical forms (3). Besides SDE (1), we consider the SDE

\[ d\tilde x^\varepsilon(t,\omega) = \tilde A\cdot\tilde x^\varepsilon(t,\omega)\,dt + \varepsilon\,G\cdot\tilde x^\varepsilon(t,\omega)\,dW(t,\omega), \qquad \tilde x^\varepsilon(0,\omega) = x^* \tag{23} \]

where $\tilde A = -A$ as in (5). We denote the random system defined by SDE (23) by $\{\tilde x^\varepsilon(t,\omega),\ t\ge 0\}$, and the functions corresponding to (17) and (22) by $\tilde\theta^\varepsilon(t,\omega)$, $\tilde\rho^\varepsilon(t,\omega)$, and $\tilde L^\varepsilon(x^*)$ respectively.

We now consider how to calculate the Lyapunov index (22). From SDE (20), it follows that

\[ \rho^\varepsilon(T,\omega) - \rho^\varepsilon(0,\omega) = \int_0^T Q^\varepsilon(\theta^\varepsilon(t,\omega))\,dt + \varepsilon\int_0^T \langle e(\theta^\varepsilon(t,\omega)),\ G\cdot e(\theta^\varepsilon(t,\omega))\rangle\,dW(t,\omega). \]

³ The sign 'a.s.' in the equation below is an abbreviation of 'almost surely', which is to say that a stochastic event occurs with probability one.
⁴ Khas'minski (1980).

Since the function $\langle e(\theta),\ G\cdot e(\theta)\rangle$ is bounded, the representation theorem of continuous martingales⁵ and the law of the iterated logarithm⁶ imply that

\[ \lim_{T\to\infty}\frac{\rho^\varepsilon(T,\omega)}{T} = \lim_{T\to\infty}\frac{1}{T}\int_0^T Q^\varepsilon(\theta^\varepsilon(t,\omega))\,dt \quad \text{a.s.}, \tag{24} \]

which gives a way to compute the Lyapunov index for the random system (1).

Remark 1. Since the random system (1) is a two-dimensional linear system, we enjoy the following remarkable facts:
(a) the function $Q^\varepsilon$ in SDE (20) depends only on the variable $\theta$,
(b) and the variable $\rho^\varepsilon(t,\omega)$ does not appear in SDE (18) for $\{\theta^\varepsilon(t,\omega),\ t\ge 0\}$.
In conclusion, we can calculate the Lyapunov index $L^\varepsilon(x^*)$ of the random system (1) if we can analyze the asymptotic behavior of the stochastic process $\{\theta^\varepsilon(t,\omega),\ t\ge 0\}$ on the unit circle. ∇

3.4 One-dimensional diffusion process in an interval

Before analyzing the stochastic process $\{\theta^\varepsilon(t,\omega),\ t\ge 0\}$ on the unit circle, we consider a one-dimensional diffusion process $\{x(t,\omega),\ t\ge 0\}$ defined by the SDE

\[ dx(t,\omega) = b(x(t,\omega))\,dt + \sigma(x(t,\omega))\,dW(t,\omega). \tag{25} \]

Assumption 1. Let $[\alpha,\beta]$ be a closed interval, and suppose the following:
(a) The coefficients $\sigma$ and $b$ are continuous functions on $R^1$ and satisfy the Lipschitz condition, i.e., there is a constant $K$ such that $|\sigma(x)-\sigma(y)| \le K|x-y|$ and $|b(x)-b(y)| \le K|x-y|$ for all $x, y$.
(b) $\sigma(x) > 0$ if $\alpha < x < \beta$, and $\sigma(\alpha) = 0 = \sigma(\beta)$.

⁵ See Revuz and Yor (1999), Ch. V, (1.7) Theorem.
⁶ See Revuz and Yor (1999), Ch. II, (1.9) Theorem.

We define Feller's canonical scale function of the diffusion process (25) by

\[ S(x) \equiv \int_{x^\dagger}^{x} s(x^\dagger, y)\,dy, \qquad \alpha < x < \beta, \tag{26} \]

where $x^\dagger$ is a fixed point in the open interval $(\alpha,\beta)$ and

\[ s(x^\dagger, x) \equiv \exp\Big\{ -\int_{x^\dagger}^{x} \frac{2\,b(y)}{\sigma(y)^2}\,dy \Big\}, \qquad \alpha < x < \beta. \tag{27} \]

The first hitting time of a point $x^* \in [\alpha,\beta]$ is defined as

\[ \tau(x^*,\omega) \equiv \begin{cases} \inf\{t > 0 : x(t,\omega) = x^*\}, \\ \infty & \text{if the above set } \{\ \} \text{ is empty.} \end{cases} \]

For each point $x^*$, we introduce the functions⁷

\[ h^-(x^*) \equiv \lim_{\delta\downarrow 0} E_{x^*}[\exp\{-\tau(x^*-\delta,\omega)\}], \qquad h^+(x^*) \equiv \lim_{\delta\downarrow 0} E_{x^*}[\exp\{-\tau(x^*+\delta,\omega)\}]. \]

It is known from Kolmogorov's 0-1 law⁸ that the values of $h^-(x^*)$ and $h^+(x^*)$ are 0 or 1. Following Itô and McKean (1968), Ch. 3, §3.4, we classify a point $x^* \in [\alpha,\beta]$:
(i) A point $x^*$ is regular, if $h^-(x^*) = 1$ and $h^+(x^*) = 1$.
(ii) It is a left shunt, if $h^-(x^*) = 1$ and $h^+(x^*) = 0$, and a right shunt, if $h^-(x^*) = 0$ and $h^+(x^*) = 1$.
(iii) It is a trap, if $h^-(x^*) = 0$ and $h^+(x^*) = 0$.
Moreover, the trap $x^*$ is called right (left) repelling, if

\[ S(x^*+) \equiv \lim_{\delta\downarrow 0} S(x^*+\delta) = -\infty \qquad (\text{resp. } S(x^*-) = \infty) \]

for Feller's canonical scale function $S$ in (26). On the other hand, it is right (left) attracting, if $S(x^*+) > -\infty$ (resp. $S(x^*-) < \infty$).

⁷ From now on, $E_{x^*}[\,\cdot\,]$ denotes the conditional expectation with respect to the probability measure $P$ under the condition $x(0,\omega) = x^*$.
⁸ See Itô and McKean (1968), §3.3.
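The attracting/repelling distinction depends only on whether the scale function (26) stays finite at the boundary, which can be explored numerically. The sketch below is an illustration under assumptions and not part of the original text: the function names, the midpoint quadrature, and the example coefficients $b(x) = \mu x(1-x)$, $\sigma(x) = s\,x(1-x)$ on $[0,1]$ with reference point $x^\dagger = 1/2$ are all chosen here. For these coefficients $2b/\sigma^2 = \kappa/(x(1-x))$ with $\kappa = 2\mu/s^2$, so $s(1/2, y) = ((1-y)/y)^\kappa$, and $S(0+)$ is finite exactly when $\kappa < 1$.

```python
import math

def scale_function(b, sigma, x_ref, x, n=20000):
    """Approximate S(x) = int_{x_ref}^{x} s(x_ref, y) dy from (26)-(27).

    A midpoint rule is used for the outer integral, and the exponent
    -int 2 b / sigma^2 is accumulated along the same grid.
    """
    h = (x - x_ref) / n          # negative when x lies to the left of x_ref
    exponent = 0.0               # running value of -int_{x_ref}^{y} 2 b / sigma^2
    total = 0.0
    for i in range(n):
        y = x_ref + (i + 0.5) * h
        rate = 2.0 * b(y) / sigma(y) ** 2
        # density s(x_ref, y) at the midpoint: advance the exponent by half a step
        total += math.exp(exponent - 0.5 * rate * h) * h
        exponent -= rate * h
    return total

def make_coefficients(mu, s):
    """Example coefficients on [0, 1] with b(0) = 0 and sigma(0) = sigma(1) = 0."""
    def b(x):
        return mu * x * (1.0 - x)
    def sigma(x):
        return s * x * (1.0 - x)
    return b, sigma

def closed_form_kappa_2(x):
    """Exact S(x) for kappa = 2 and x_ref = 1/2: antiderivative of ((1 - y) / y)^2."""
    def F(y):
        return -1.0 / y - 2.0 * math.log(y) + y
    return F(x) - F(0.5)
```

With $\kappa = 2$ the values $S(0.1)$, $S(0.01)$, $S(0.001)$ decrease without bound (roughly like $-1/\delta$), which is the right repelling case at $\alpha = 0$; with $\kappa = 1/2$ they approach the finite limit $-(\pi/4 + 1/2)$, which is the right attracting case.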

Let Assumption 1 hold. Then the following table (28) lists all possible combinations of the boundary points $\alpha$ and $\beta$ in the foregoing classification.

             b(α)        S(α+)          b(β)        S(β−)
Case 3.1:    b(α) > 0                   b(β) < 0
Case 3.2:    b(α) > 0                   b(β) > 0
Case 3.3:    b(α) < 0                   b(β) > 0
Case 3.4:    b(α) = 0    S(α+) = −∞     b(β) < 0
Case 3.5:    b(α) = 0    S(α+) > −∞     b(β) < 0
Case 3.6:    b(α) = 0    S(α+) = −∞     b(β) > 0
Case 3.7:    b(α) = 0    S(α+) > −∞     b(β) > 0
Case 3.8:    b(α) = 0    S(α+) = −∞     b(β) = 0    S(β−) = ∞
Case 3.9:    b(α) = 0    S(α+) = −∞     b(β) = 0    S(β−) < ∞
Case 3.10:   b(α) = 0    S(α+) > −∞     b(β) = 0    S(β−) < ∞        (28)

Here $b$ is the coefficient function in SDE (25) and $S$ is Feller's canonical scale function (26).

Following the classification in table (28), we review the behavior of the diffusion process $\{x(t,\omega),\ t\ge 0\}$ in the interval $[\alpha,\beta]$. Proofs of each part of the next proposition can be found in Friedman (1976), Itô and McKean (1968), Khas'minski (1967), Maruyama and Tanaka (1957), and Nishioka (1976a, 1976b).

Proposition 2. Let Assumption 1 hold and let $\{x(t,\omega),\ t\ge 0\}$ be the diffusion process defined by SDE (25).

(i) In Case 3.1, the boundary point $\alpha$ is a right shunt and $\beta$ is a left shunt. Therefore $x(t,\omega) \in (\alpha,\beta)$ a.s. for all $t > 0$ if $x^* = x(0,\omega) \in [\alpha,\beta]$, and there exists an invariant probability measure $\mu$ such that

\[ \lim_{T\to\infty}\frac{1}{T}\int_0^T f(x(t,\omega))\,dt = \int_\alpha^\beta f(y)\,\mu(dy) \quad \text{a.s.} \tag{29} \]

for each function $f$ which is summable with respect to $\mu$.

(ii) When Case 3.2 holds, $\alpha$ and $\beta$ are both right shunts. So $x(t,\omega) > \alpha$ a.s. for all $t > 0$ and $E_{x^*}[\tau(\beta,\omega)] < \infty$ if $x^* \in [\alpha,\beta]$.

(iii) If Case 3.3 holds, then $\alpha$ is a left shunt and $\beta$ is a right shunt. Therefore $E_{x^*}[\min\{\tau(\alpha,\omega),\ \tau(\beta,\omega)\}] < \infty$ for each $x^* \in (\alpha,\beta)$, and it holds that⁹

\[ P_{x^*}[\tau(\alpha,\omega) < \tau(\beta,\omega)] = \frac{S(\beta-) - S(x^*)}{S(\beta-) - S(\alpha+)}, \qquad P_{x^*}[\tau(\beta,\omega) < \tau(\alpha,\omega)] = \frac{S(x^*) - S(\alpha+)}{S(\beta-) - S(\alpha+)}. \]

(iv) If Case 3.4 holds, then $\alpha$ is a right repelling trap and $\beta$ is a left shunt. So it holds that

\[ x(t,\omega) \in (\alpha,\beta) \ \text{ a.s. for all } t > 0 \ \text{ if } x^* = x(0,\omega) \in (\alpha,\beta], \tag{30} \]

and there exists an invariant measure¹⁰ $\mu$ such that

\[ \lim_{T\to\infty} \frac{\int_0^T f(x(t,\omega))\,dt}{\int_0^T g(x(t,\omega))\,dt} = \frac{\int_\alpha^\beta f(y)\,\mu(dy)}{\int_\alpha^\beta g(y)\,\mu(dy)} \quad \text{a.s.} \tag{31} \]

for all functions $f$ and $g$ which are summable with respect to the measure $\mu$ and satisfy $\int_\alpha^\beta g(y)\,\mu(dy) \ne 0$.

(v) In Case 3.5, $\alpha$ is a right attracting trap and $\beta$ is a left shunt. Then (30) holds, but

\[ P_{x^*}\big[\lim_{t\to\infty} x(t,\omega) = \alpha\big] = 1 \qquad \text{for each } x^* \in (\alpha,\beta]. \]

⁹ Here and later on, $P_{x^*}[\tau(\alpha,\omega) < \tau(\beta,\omega)]$ denotes the conditional probability of the stochastic event $\{\omega : \tau(\alpha,\omega) < \tau(\beta,\omega)\}$ under the condition $x(0,\omega) = x^*$, and so on.
¹⁰ This $\mu$ is not necessarily a probability measure.

(vi) Let Case 3.6 hold, so that $\alpha$ is a right repelling trap and $\beta$ is a right shunt. Then it holds that

\[ P_{x^*}[\tau(\beta,\omega) < \infty] = 1 \qquad \text{for each } x^* \in (\alpha,\beta]. \]

(vii) In Case 3.7, $\alpha$ is a right attracting trap and $\beta$ is a right shunt. Therefore it holds that

\[ P_{x^*}\big[\lim_{t\to\infty} x(t,\omega) = \alpha\big] = \frac{S(\beta-) - S(x^*)}{S(\beta-) - S(\alpha+)} \quad \text{and} \quad P_{x^*}[\tau(\beta,\omega) < \infty] = \frac{S(x^*) - S(\alpha+)}{S(\beta-) - S(\alpha+)} \qquad \text{for each } x^* \in (\alpha,\beta). \]

(viii) If Case 3.8 holds, then $\alpha$ is a right repelling trap and $\beta$ is a left repelling trap. Therefore

\[ x(t,\omega) \in (\alpha,\beta) \ \text{ a.s. for all } t \ge 0 \ \text{ if } x^* = x(0,\omega) \in (\alpha,\beta), \tag{32} \]

and there exists an invariant measure¹¹ $\mu$ such that (31) holds.

(ix) Let Case 3.9 hold, so that $\alpha$ is a right repelling trap and $\beta$ is a left attracting trap. Then (32) is valid, but

\[ P_{x^*}\big[\lim_{t\to\infty} x(t,\omega) = \beta\big] = 1 \qquad \text{for each } x^* \in (\alpha,\beta). \]

(x) When Case 3.10 holds, $\alpha$ is a right attracting trap and $\beta$ is a left attracting trap. Therefore (32) is true, but

\[ P_{x^*}\big[\lim_{t\to\infty} x(t,\omega) = \alpha\big] = \frac{S(\beta-) - S(x^*)}{S(\beta-) - S(\alpha+)} \quad \text{and} \quad P_{x^*}\big[\lim_{t\to\infty} x(t,\omega) = \beta\big] = \frac{S(x^*) - S(\alpha+)}{S(\beta-) - S(\alpha+)} \qquad \text{for each } x^* \in (\alpha,\beta). \]

3.5 Spiral point and center

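Let the matrix A be (I) in (3), so that the origin 0 is a spiral point of the non-random system (4) (see Figure 1), which is characterized by (9). In this setting, SDEs (18) and (20) become

\[ \theta^\varepsilon(T,\omega) - \theta^\varepsilon(0,\omega) = a_2 T + \varepsilon^2\int_0^T b_G(\theta^\varepsilon(t,\omega))\,dt + \varepsilon\int_0^T \sigma(\theta^\varepsilon(t,\omega))\,dW(t,\omega), \tag{33} \]

\[ \rho^\varepsilon(T,\omega) - \rho^\varepsilon(0,\omega) = a_1 T + \varepsilon^2\int_0^T q_G(\theta^\varepsilon(t,\omega))\,dt + \varepsilon\int_0^T R(\theta^\varepsilon(t,\omega))\,dW(t,\omega). \tag{34} \]

¹¹ This $\mu$ is not necessarily a probability measure.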

Step 1. First we suppose that

\[ (g_{11} - g_{22})^2 + 4\,g_{12}\,g_{21} < 0 \tag{35} \]

holds for the matrix $G = (g_{ij})$ in the random system (1). Since (35) implies that the coefficient function $\sigma(\theta)$ in SDE (18) does not vanish, the process $\{\theta^\varepsilon(t,\omega),\ t\ge 0\}$ is a non-degenerate diffusion process and there exists an invariant probability measure $\mu^\varepsilon$ such that

\[ \lim_{T\to\infty}\frac{1}{T}\int_0^T \varphi(\theta^\varepsilon(t,\omega))\,dt = \int_0^{2\pi} \varphi(y)\,\mu^\varepsilon(dy) \quad \text{a.s.} \tag{36} \]

for each continuous function $\varphi$ on the unit circle.

Step 2. Next we suppose that the matrix $G = (g_{ij})$ satisfies

\[ (g_{11} - g_{22})^2 + 4\,g_{12}\,g_{21} \ge 0. \tag{37} \]

Note that the function $\sigma(\theta)$ in SDE (18) is a trigonometric polynomial of degree 2 and is periodic with period $\pi$. Therefore, if (37) holds, then $\sigma(\theta)$ vanishes at four points on the unit circle, say $\gamma_1$, $\gamma_2$, $\gamma_3\ (= \gamma_1 + \pi)$, and $\gamma_4\ (= \gamma_2 + \pi)$, where $\gamma_1$ may equal $\gamma_2$. (See Figure 4.) From (28), all $\gamma_k$'s are right (anti-clockwise) shunts, because $b^\varepsilon(\theta) = a_2 + \varepsilon^2 b_G(\theta) > 0$ for any $\theta$, owing to the smallness of $\varepsilon$. Now we can apply Proposition 2 (ii) to $\{\theta^\varepsilon(t),\ t\ge 0\}$ in the interval $[\gamma_k, \gamma_{k+1}]$. Repeating this procedure on the intervals $[\gamma_1,\gamma_2]$, $[\gamma_2,\gamma_3]$, $\cdots$, we see that

\[ \theta^\varepsilon(t,\omega) > \theta^* - \pi \ \text{ for all } t\ge 0 \quad \text{and} \quad \lim_{t\to\infty}\theta^\varepsilon(t,\omega) = \infty \ \text{ a.s.}, \qquad E_{\theta^*}[\tau(\theta^\dagger,\omega)] < \infty \ \text{ for each point } \theta^\dagger \text{ on the unit circle.} \tag{38} \]

Therefore there exists an invariant probability measure such that (36) holds.

Lemma 1. Let the matrix A be (I) or (II) in (3) and let $\varepsilon$ be small. Then there exists an invariant measure $\mu^\varepsilon$ such that (36) holds.
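The dichotomy between (35) and (37) is just the sign of the discriminant of the quadratic in $\tan\theta$ obtained from $\sigma(\theta) = 0$. The following Python sketch is an illustration only (the function name `sigma_zeros`, the tolerances and the sample matrices are assumptions made here): it returns the zeros $\gamma_k$ of $\sigma$ in $[0, 2\pi)$, or an empty list when (35) holds.

```python
import math

def sigma(theta, G):
    """sigma(theta) = g21 cos^2 + (g22 - g11) cos sin - g12 sin^2, as in (19)."""
    (g11, g12), (g21, g22) = G
    c, s = math.cos(theta), math.sin(theta)
    return g21 * c * c + (g22 - g11) * c * s - g12 * s * s

def sigma_zeros(G, tol=1e-12):
    """Zeros of sigma in [0, 2 pi), sorted; empty if condition (35) holds."""
    (g11, g12), (g21, g22) = G
    disc = (g11 - g22) ** 2 + 4.0 * g12 * g21
    if disc < 0.0:
        return []                                   # sigma never vanishes
    if abs(g12) > tol:
        # Dividing sigma = 0 by cos^2 gives g12 t^2 - (g22 - g11) t - g21 = 0, t = tan(theta).
        root = math.sqrt(disc)
        base = [math.atan((g22 - g11 + sign * root) / (2.0 * g12))
                for sign in (1.0, -1.0)]
    elif abs(g21) > tol or abs(g22 - g11) > tol:
        # g12 = 0: sigma = cos(theta) * (g21 cos(theta) + (g22 - g11) sin(theta)).
        base = [math.pi / 2.0, math.atan2(-g21, g22 - g11)]
    else:
        raise ValueError("sigma vanishes identically (G is a multiple of the identity)")
    reduced = []
    for angle in base:
        a = angle % math.pi                         # sigma has period pi
        if math.pi - a < 1e-9:
            a = 0.0
        if all(abs(a - other) > 1e-9 for other in reduced):
            reduced.append(a)                       # a double root is kept once
    reduced.sort()
    return reduced + [a + math.pi for a in reduced]
```

For example, the symmetric matrix with $g_{12} = g_{21} = 1$ and zero diagonal gives the four zeros $\pi/4, 3\pi/4, 5\pi/4, 7\pi/4$, whereas the antisymmetric matrix with $g_{12} = 1$, $g_{21} = -1$ satisfies (35) and gives none.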

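Figure 4: Positions of $\gamma_k$'s on the unit circle

From Lemma 1 we see that

\[ \lim_{T\to\infty}\frac{1}{T}\int_0^T b_G(\theta^\varepsilon(t,\omega))\,dt \quad \text{and} \quad \lim_{T\to\infty}\frac{1}{T}\int_0^T q_G(\theta^\varepsilon(t,\omega))\,dt \]

each converge a.s. to a constant. In addition, since the functions $\sigma(\theta)$ and $R(\theta)$ are bounded, the law of the iterated logarithm implies that

\[ \Big|\lim_{T\to\infty}\frac{\theta^\varepsilon(T,\omega)}{T} - a_2\Big| \le 16\,\varepsilon^2\,\|G\|^2 \quad \text{a.s.}, \tag{39a} \]

\[ a_1 - 16\,\varepsilon^2\,\|G\|^2 \le L^\varepsilon(x^*) = \lim_{T\to\infty}\frac{\rho^\varepsilon(T,\omega)}{T} \le a_1 + 4\,\varepsilon^2\,\|G\|^2 \quad \text{a.s.}, \tag{39b} \]

where $\|G\| \equiv \max_{i,j} |(G)_{ij}|$.

After repeating similar arguments for $\tilde\theta^\varepsilon(T,\omega)$ and $\tilde\rho^\varepsilon(T,\omega)$, we have the following theorem.

Theorem 1. Suppose that the origin 0 is a spiral point of the non-random system (4) characterized by (9). Then the origin is also a spiral point of the related random system for sufficiently small $\varepsilon > 0$. More precisely, for any $\delta > 0$, there exists a positive $\varepsilon^*$ such that the following inequalities hold with probability one: if $\varepsilon < \varepsilon^*$, then

\[ \Big|\lim_{T\to\infty}\frac{\theta^\varepsilon(T,\omega)}{T} - a_2\Big| < \delta, \qquad |L^\varepsilon(x^*) - a_1| < \delta, \]
\[ \Big|\lim_{T\to\infty}\frac{\tilde\theta^\varepsilon(T,\omega)}{T} + a_2\Big| < \delta, \qquad |\tilde L^\varepsilon(x^*) + a_1| < \delta. \tag{40} \]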
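The statement of Theorem 1 can be explored with a simple Euler–Maruyama discretization of (33) and (34). The sketch below is illustrative and rests on assumptions that are not in the original text: the step size, horizon and seed are arbitrary, the sample matrix $G$ is invented, and the drifts $a_2$ and $a_1$ are taken directly from (33) and (34) rather than from an explicit matrix $A$. It returns the finite-horizon estimates $\theta^\varepsilon(T)/T$ and $\rho^\varepsilon(T)/T$.

```python
import math
import random

def coefficients(theta, G):
    """Return (sigma, R, b_G, q_G) at angle theta, following (19) and (21)."""
    (g11, g12), (g21, g22) = G
    c, s = math.cos(theta), math.sin(theta)
    Ge = (g11 * c + g12 * s, g21 * c + g22 * s)
    R = c * Ge[0] + s * Ge[1]                  # <e, G e>
    sigma = -(s * Ge[0] - c * Ge[1])           # -<e_dagger, G e>
    b_G = -sigma * R
    q_G = 0.5 * (Ge[0] ** 2 + Ge[1] ** 2) - R ** 2
    return sigma, R, b_G, q_G

def simulate_spiral(a1, a2, G, eps, T=500.0, dt=0.005, seed=0):
    """Euler-Maruyama for (33)-(34); returns (theta(T)/T, rho(T)/T)."""
    rng = random.Random(seed)
    n_steps = round(T / dt)
    sqrt_dt = math.sqrt(dt)
    theta, rho = 0.0, 0.0
    for _ in range(n_steps):
        sigma, R, b_G, q_G = coefficients(theta, G)
        dW = rng.gauss(0.0, sqrt_dt)           # one Brownian increment drives both equations
        theta += (a2 + eps ** 2 * b_G) * dt + eps * sigma * dW
        rho += (a1 + eps ** 2 * q_G) * dt + eps * R * dW
    return theta / (n_steps * dt), rho / (n_steps * dt)

# Assumed sample data: (g11 - g22)^2 + 4 g12 g21 < 0, so sigma never vanishes.
G_SAMPLE = ((0.5, 1.0), (-1.0, 0.2))
```

With `a1 = -0.5`, `a2 = 1.0` and `eps = 0.1`, the two estimates should land within the bounds (39a) and (39b) of $a_2$ and $a_1$, up to the discretization and finite-horizon error of the sketch.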

A center is a special case of the spiral point, and (40) is valid with $a_1 = 0$. We present two counterexamples in which the Lyapunov index $L^\varepsilon(x^*)$ is not zero even though $\varepsilon$ is sufficiently small.

Example 1. Let the matrix A be (II), so that the origin 0 is a center of the non-random system (4), which is characterized by (10).

(i) First we set

\[ G = \begin{pmatrix} 0 & c \\ -c & 0 \end{pmatrix}, \qquad c \ne 0, \]

in the related random system (1). In this setting, SDE (20) becomes

\[ \rho^\varepsilon(T,\omega) - \rho^\varepsilon(0,\omega) = \frac{\varepsilon^2 c^2}{2}\,T + \varepsilon\int_0^T R(\theta^\varepsilon(t,\omega))\,dW(t,\omega). \]

From Theorem 1 and the law of the iterated logarithm we derive the following relations instead of (40):

\[ \Big|\lim_{T\to\infty}\frac{\theta^\varepsilon(T,\omega)}{T} - a_2\Big| < \delta, \qquad L^\varepsilon(x^*) = \varepsilon^2 c^2/2 > 0, \]
\[ \Big|\lim_{T\to\infty}\frac{\tilde\theta^\varepsilon(T,\omega)}{T} + a_2\Big| < \delta, \qquad \tilde L^\varepsilon(x^*) = \varepsilon^2 c^2/2 > 0. \]

From Proposition 1, we see that for any $\varepsilon > 0$

\[ \lim_{t\to\infty}|x^\varepsilon(t,\omega)| = \infty \ \text{ a.s.} \quad \text{and} \quad \lim_{t\to\infty}|\tilde x^\varepsilon(t,\omega)| = \infty \ \text{ a.s.} \]

in this random system, and the origin 0 is closer to an unstable spiral point than to a center characterized by (10).

(ii) Next we set

\[ G = \begin{pmatrix} c & 0 \\ 0 & c \end{pmatrix}, \qquad c \ne 0, \]

in the related random system. In this set-up, SDE (20) is

\[ \rho^\varepsilon(T,\omega) - \rho^\varepsilon(0,\omega) = -\frac{\varepsilon^2 c^2}{2}\,T + \varepsilon\int_0^T R(\theta^\varepsilon(t,\omega))\,dW(t,\omega), \]

and we have the following relations instead of (40):

\[ \Big|\lim_{T\to\infty}\frac{\theta^\varepsilon(T,\omega)}{T} - a_2\Big| < \delta, \qquad L^\varepsilon(x^*) = -\varepsilon^2 c^2/2 < 0, \]
\[ \Big|\lim_{T\to\infty}\frac{\tilde\theta^\varepsilon(T,\omega)}{T} + a_2\Big| < \delta, \qquad \tilde L^\varepsilon(x^*) = -\varepsilon^2 c^2/2 < 0. \]

So we obtain that, for any $\varepsilon > 0$,

\[ \lim_{t\to\infty}|x^\varepsilon(t,\omega)| = 0 \ \text{ a.s.} \quad \text{and} \quad \lim_{t\to\infty}|\tilde x^\varepsilon(t,\omega)| = 0 \ \text{ a.s.}, \]

and the origin is nearer to a stable spiral point than to a center.

Remark 2. When the origin is a center of the non-random system, it is neither stable nor unstable. But it becomes stable or unstable after the foregoing random terms are added. ∇

Now the following result is evident.

Theorem 2. Let the origin be a center of the non-random system (4), i.e. the matrix A is (II) in (3) and (10) holds. Then the origin is not necessarily a center of the related random system even though $\varepsilon > 0$ is sufficiently small.

3.6 Saddle point

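Let the matrix A be (III) in (3), so that the origin 0 is a saddle point of the non-random system (4) (see Figure 2). As the related random system, we first consider $\{\theta^\varepsilon(t,\omega),\ t\ge 0\}$ given by SDE (18), whose drift becomes

\[ b^\varepsilon(\theta) = \frac{a_2 - a_1}{2}\,\sin 2\theta + \varepsilon^2 b_G(\theta), \qquad a_1 < 0 < a_2. \tag{41} \]

Since this $b^\varepsilon$ is a periodic function with period $\pi$ and $\varepsilon$ is small, there are four points

\[ \xi_1^\varepsilon < \eta_1^\varepsilon < \xi_2^\varepsilon\ (= \xi_1^\varepsilon + \pi) < \eta_2^\varepsilon\ (= \eta_1^\varepsilon + \pi) \]

on the unit circle such that

\[ b^\varepsilon(\theta) \begin{cases} = 0 & \text{if } \theta = \xi_1^\varepsilon,\ \xi_2^\varepsilon,\ \eta_1^\varepsilon,\ \eta_2^\varepsilon, \\ > 0 & \text{if } \theta \in (\xi_1^\varepsilon, \eta_1^\varepsilon) \cup (\xi_2^\varepsilon, \eta_2^\varepsilon), \\ < 0 & \text{if } \theta \in (\eta_1^\varepsilon, \xi_2^\varepsilon) \cup (\eta_2^\varepsilon, \xi_1^\varepsilon + 2\pi). \end{cases} \tag{42} \]

First we suppose that $\sigma(\theta) > 0$ for all $\theta$. In this set-up, $\{\theta^\varepsilon(t,\omega),\ t\ge 0\}$ is a non-degenerate diffusion process on the unit circle, and there exists an invariant probability measure $\mu^\varepsilon$ such that (36) holds. The invariant probability measure is

\[ \mu^\varepsilon(d\theta) \equiv \frac{\big(m_1^\varepsilon(\theta) + m_2^\varepsilon(\theta)\big)\,d\theta}{\int_0^{2\pi}\big(m_1^\varepsilon(x) + m_2^\varepsilon(x)\big)\,dx}, \tag{43} \]

where (and later on) we set the function¹²

\[ s^\varepsilon(\theta^\dagger, \theta) \equiv \exp\Big\{ -\frac{1}{\varepsilon^2}\int_{\theta^\dagger}^{\theta} \frac{2\,b^\varepsilon(x)}{\sigma(x)^2}\,dx \Big\} \tag{44} \]

and define the functions $m_1^\varepsilon$ and $m_2^\varepsilon$ as follows:

(a) If $\theta \in [\xi_1^\varepsilon, \xi_2^\varepsilon) = [\xi_1^\varepsilon, \xi_1^\varepsilon + \pi)$,

\[ m_1^\varepsilon(\theta) \equiv \frac{1}{\varepsilon^2\,\sigma(\theta)^2\,s^\varepsilon(\xi_1^\varepsilon, \theta)}\int_\theta^{\xi_2^\varepsilon} s^\varepsilon(\xi_1^\varepsilon, x)\,dx, \qquad m_2^\varepsilon(\theta) \equiv \frac{s^\varepsilon(\theta, \xi_2^\varepsilon)}{\varepsilon^2\,\sigma(\theta)^2}\int_{\xi_1^\varepsilon}^{\theta} s^\varepsilon(\xi_1^\varepsilon, x)\,dx. \]

(b) If $\theta \in [\xi_2^\varepsilon, \xi_2^\varepsilon + \pi) = [\xi_2^\varepsilon, \xi_1^\varepsilon + 2\pi)$,

\[ m_1^\varepsilon(\theta) \equiv \frac{1}{\varepsilon^2\,\sigma(\theta)^2\,s^\varepsilon(\xi_2^\varepsilon, \theta)}\int_\theta^{\xi_2^\varepsilon + \pi} s^\varepsilon(\xi_2^\varepsilon, x)\,dx, \qquad m_2^\varepsilon(\theta) \equiv \frac{s^\varepsilon(\theta, \xi_2^\varepsilon + 2\pi)}{\varepsilon^2\,\sigma(\theta)^2}\int_{\xi_2^\varepsilon}^{\theta} s^\varepsilon(\xi_1^\varepsilon, x)\,dx. \]

¹² This is the same function as (27).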

We shall investigate the asymptotic behavior of the process $\theta^\varepsilon(t,\omega)$ for large $t$ and small $\varepsilon$.

Proposition 3. Let the matrix A be (III) in (3) and suppose that $\sigma(\theta) > 0$ for all $\theta$, i.e. the matrix $G = (g_{ij})$ satisfies (35).
(i) The process $\{\theta^\varepsilon(t,\omega),\ t\ge 0\}$ is a non-degenerate diffusion process on the unit circle.
(ii) We define the probability measure $\mu^\varepsilon$ by (43). Then

\[ \frac{1}{T}\int_0^T \varphi(\theta^\varepsilon(t,\omega))\,dt \to \int_0^{2\pi} \varphi(\theta)\,\mu^\varepsilon(d\theta) \quad \text{a.s. as } T\to\infty \]

for each periodic continuous function $\varphi$ with period $\pi$. Moreover it holds that

\[ \int_0^{2\pi} \varphi(\theta)\,\mu^\varepsilon(d\theta) \to \varphi\Big(\frac{\pi}{2}\Big) \quad \text{as } \varepsilon\to 0. \]

Using this proposition, we can calculate the Lyapunov index of the random system in our setting.

Corollary 1. Make the same assumptions as in the previous proposition. Then the Lyapunov index $L^\varepsilon(x^*)$ of the random system (1) converges to $a_2$ as $\varepsilon\to 0$, where $a_2$ is the Lyapunov index of the corresponding non-random system (4).

Before the proof of Proposition 3, we prove the corollary.

Proof: From Definition 1, (24), and Proposition 3, we see that

\[ L^\varepsilon(x^*) = \int_0^{2\pi} Q^\varepsilon(\theta)\,\mu^\varepsilon(d\theta) \to Q^0\Big(\frac{\pi}{2}\Big) = a_2 \quad \text{as } \varepsilon\to 0. \qquad \Box \]
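The concentration of the invariant measure at $\pi/2$ and the resulting limit $L^\varepsilon(x^*) \to a_2$ can be illustrated by simulating the angle process and averaging along the path, as in (24). The sketch below is hypothetical and makes several assumptions beyond the text: the drift is taken from (41), $Q^\varepsilon(\theta) = a_1\cos^2\theta + a_2\sin^2\theta + \varepsilon^2 q_G(\theta)$ is used (which corresponds to a diagonal matrix $A$ with entries $a_1, a_2$), and the sample $G$ is antisymmetric so that $\sigma \equiv c > 0$. Step size, horizon, seed and the starting angle are arbitrary choices.

```python
import math
import random

def coefficients(theta, G):
    """Return (sigma, R, b_G, q_G) at angle theta, following (19) and (21)."""
    (g11, g12), (g21, g22) = G
    c, s = math.cos(theta), math.sin(theta)
    Ge = (g11 * c + g12 * s, g21 * c + g22 * s)
    R = c * Ge[0] + s * Ge[1]
    sigma = -(s * Ge[0] - c * Ge[1])
    return sigma, R, -sigma * R, 0.5 * (Ge[0] ** 2 + Ge[1] ** 2) - R ** 2

def simulate_saddle(a1, a2, G, eps, T=1000.0, dt=0.01, seed=1, theta0=0.3):
    """Time averages of phi(theta) = sin^2(theta) and of Q^eps along an Euler-Maruyama path."""
    rng = random.Random(seed)
    n_steps = round(T / dt)
    sqrt_dt = math.sqrt(dt)
    theta = theta0
    phi_sum, q_sum = 0.0, 0.0
    for _ in range(n_steps):
        sigma, R, b_G, q_G = coefficients(theta, G)
        c, s = math.cos(theta), math.sin(theta)
        phi_sum += s * s
        q_sum += a1 * c * c + a2 * s * s + eps ** 2 * q_G
        drift = 0.5 * (a2 - a1) * math.sin(2.0 * theta) + eps ** 2 * b_G   # (41)
        theta += drift * dt + eps * sigma * rng.gauss(0.0, sqrt_dt)
    return phi_sum / n_steps, q_sum / n_steps

# Assumed sample noise matrix with sigma(theta) = 1 for every theta.
G_ROTATION = ((0.0, -1.0), (1.0, 0.0))
```

With `a1 = -1.0`, `a2 = 1.0` and `eps = 0.1`, the average of $\sin^2\theta$ should be close to $\varphi(\pi/2) = 1$ and the averaged $Q^\varepsilon$ close to $a_2$; decreasing `eps` should tighten both.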

In order to prove Proposition 3, we need the following result, which is known as the Laplace method¹³.

¹³ One can find its proof in many books and papers. See Nevelson (1964), for instance.

Lemma 2. (Laplace method) Let $g$ and $f$ be continuous functions defined on a closed interval $[\alpha,\beta]$. Suppose that
(a) there is a unique point $x^* \in [\alpha,\beta]$ such that $\max_{x\in[\alpha,\beta]} f(x) = f(x^*)$,
(b) in a neighborhood of $x^*$, $f$ is a function of class $C^2$ and $f''(x^*) \ne 0$,
(c) and $g(x^*) > 0$.
For a positive number $K$, we put

\[ J(K) \equiv \int_\alpha^\beta g(x)\,\exp\{K f(x)\}\,dx. \]

(i) If $\alpha < x^* < \beta$, then

\[ J(K) = \sqrt{\frac{2\pi}{-K f''(x^*)}}\ g(x^*)\,\exp\{K f(x^*)\}\,\Big(1 + o\Big(\frac{1}{K}\Big)\Big) \quad \text{as } K\to\infty. \]

(ii) If $x^*$ equals $\alpha$ or $\beta$ and if $f'(x^*) = 0$, then

\[ J(K) = \sqrt{\frac{\pi}{-K f''(x^*)}}\ g(x^*)\,\exp\{K f(x^*)\}\,\Big(1 + o\Big(\frac{1}{K}\Big)\Big) \quad \text{as } K\to\infty. \]

(iii) If $x^*$ equals $\alpha$ or $\beta$ and if $f'(x^*) \ne 0$, then

\[ J(K) = \frac{1}{K\,|f'(x^*)|}\ g(x^*)\,\exp\{K f(x^*)\}\,\Big(1 + o\Big(\frac{1}{K}\Big)\Big) \quad \text{as } K\to\infty. \]
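The interior case (i) of the Laplace method is easy to check numerically. The following sketch is an illustration under assumptions (the function names, the midpoint quadrature and the test functions $f(x) = -(x - 0.3)^2$, $g(x) = 1 + x$ on $[0, 1]$ are chosen here and do not come from the text): it compares a direct quadrature of $J(K)$ with the leading term $\sqrt{2\pi/(-K f''(x^*))}\ g(x^*)\exp\{K f(x^*)\}$.

```python
import math

def laplace_integral(g, f, alpha, beta, K, n=20000):
    """Midpoint-rule approximation of J(K) = int_alpha^beta g(x) exp(K f(x)) dx."""
    h = (beta - alpha) / n
    total = 0.0
    for i in range(n):
        x = alpha + (i + 0.5) * h
        total += g(x) * math.exp(K * f(x)) * h
    return total

def interior_leading_term(g_star, f_star, f2_star, K):
    """Leading term of case (i): interior maximum x* with f''(x*) = f2_star < 0."""
    return g_star * math.sqrt(2.0 * math.pi / (-K * f2_star)) * math.exp(K * f_star)

def f_example(x):
    """Assumed test function with unique maximum at x* = 0.3, f(x*) = 0, f''(x*) = -2."""
    return -(x - 0.3) ** 2

def g_example(x):
    """Assumed positive weight function, g(x*) = 1.3."""
    return 1.0 + x

def ratio(K):
    """Quadrature value of J(K) divided by its Laplace approximation."""
    exact = laplace_integral(g_example, f_example, 0.0, 1.0, K)
    return exact / interior_leading_term(1.3, 0.0, -2.0, K)
```

Evaluating `ratio(K)` for increasing `K` (for instance 100, 1000, 10000) shows the quotient approaching 1, which is the content of case (i) for this particular pair of functions.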

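Proof of Proposition 3: The assertion (i) and the first half of (ii) are well-known results¹⁴. So we shall prove the last half of statement (ii). First note that the equality

\[ \int_0^{2\pi} \varphi(\theta)\,\mu^\varepsilon(d\theta) = 2\int_{\xi_1^\varepsilon}^{\xi_2^\varepsilon} \varphi(\theta)\,\mu^\varepsilon(d\theta), \qquad \xi_2^\varepsilon = \xi_1^\varepsilon + \pi, \]

is true, since $\varphi$ and $\mu^\varepsilon$ are periodic with period $\pi$. We define a function $F^\varepsilon(\theta)$ by

\[ F^\varepsilon(\theta) \equiv \int_{\xi_1^\varepsilon}^{\theta} \frac{2\,b^\varepsilon(x)}{\sigma(x)^2}\,dx \]

and apply Lemma 2 to the integral

\[ J \equiv \int_{\xi_1^\varepsilon}^{\xi_2^\varepsilon} \varphi(\theta)\,m_1^\varepsilon(\theta)\,d\theta = \int_{\xi_1^\varepsilon}^{\xi_2^\varepsilon} \psi^\varepsilon(\theta)\,\exp\Big\{\frac{1}{\varepsilon^2}F^\varepsilon(\theta)\Big\}\,d\theta, \]

where

\[ \psi^\varepsilon(\theta) \equiv \frac{\varphi(\theta)}{\varepsilon^2\,\sigma(\theta)^2}\int_\theta^{\xi_2^\varepsilon} \exp\Big\{-\frac{1}{\varepsilon^2}F^\varepsilon(x)\Big\}\,dx. \]

¹⁴ See Khas'minski (1967), Ch. 4, for instance.

In our set-up, $\sigma(\theta) > 0$ for all $\theta$. Since (42) holds, $\max_{x\in[\xi_1^\varepsilon,\,\xi_2^\varepsilon)} F^\varepsilon(x) = F^\varepsilon(\eta_1^\varepsilon)$. Now we have

\[ J \sim \psi^\varepsilon(\eta_1^\varepsilon)\ \varepsilon\,\sqrt{\frac{2\pi}{|F^{\varepsilon\prime\prime}(\eta_1^\varepsilon)|}}\ \exp\Big\{\frac{1}{\varepsilon^2}F^\varepsilon(\eta_1^\varepsilon)\Big\} \quad \text{when } \varepsilon\to 0. \tag{45} \]

We again apply Lemma 2 to the function $\psi^\varepsilon(\eta_1^\varepsilon)$ in (45):

\[ \psi^\varepsilon(\eta_1^\varepsilon) = \frac{\varphi(\eta_1^\varepsilon)}{\varepsilon^2\,\sigma(\eta_1^\varepsilon)^2}\int_{\eta_1^\varepsilon}^{\xi_2^\varepsilon} \exp\Big\{-\frac{1}{\varepsilon^2}F^\varepsilon(x)\Big\}\,dx \sim \frac{\varphi(\eta_1^\varepsilon)}{\varepsilon^2\,\sigma(\eta_1^\varepsilon)^2}\ \varepsilon\,\sqrt{\frac{\pi}{|F^{\varepsilon\prime\prime}(\xi_2^\varepsilon)|}}\ \exp\Big\{-\frac{1}{\varepsilon^2}F^\varepsilon(\xi_2^\varepsilon)\Big\} \quad \text{when } \varepsilon\to 0. \tag{46} \]

Combining (45) with (46), we derive that

\[ J \sim \frac{\varphi(\eta_1^\varepsilon)}{\sigma(\eta_1^\varepsilon)^2}\ \frac{\sqrt{2}\,\pi\,\exp\big\{\big(F^\varepsilon(\eta_1^\varepsilon) - F^\varepsilon(\xi_2^\varepsilon)\big)/\varepsilon^2\big\}}{\sqrt{|F^{\varepsilon\prime\prime}(\eta_1^\varepsilon)\,F^{\varepsilon\prime\prime}(\xi_2^\varepsilon)|}} \quad \text{when } \varepsilon\to 0. \]

Recall that

\[ s^\varepsilon(\theta, \xi_2^\varepsilon) = \exp\big\{\big(F^\varepsilon(\theta) - F^\varepsilon(\xi_2^\varepsilon)\big)/\varepsilon^2\big\}, \]

and repeat the analogous arguments as before. Then it follows that

\[ \int_{\xi_1^\varepsilon}^{\xi_2^\varepsilon} \varphi(\theta)\,m_2^\varepsilon(\theta)\,d\theta \sim \frac{\varphi(\eta_1^\varepsilon)}{\sigma(\eta_1^\varepsilon)^2}\ \frac{\sqrt{2}\,\pi\,\exp\big\{\big(F^\varepsilon(\eta_1^\varepsilon) - F^\varepsilon(\xi_2^\varepsilon)\big)/\varepsilon^2\big\}}{\sqrt{|F^{\varepsilon\prime\prime}(\eta_1^\varepsilon)\,F^{\varepsilon\prime\prime}(\xi_1^\varepsilon)|}} \]

when $\varepsilon\to 0$. Consequently we obtain that

\[ \int_{\xi_1^\varepsilon}^{\xi_2^\varepsilon} \varphi(\theta)\,\mu^\varepsilon(d\theta) \sim \frac{1}{2}\,\varphi(\eta_1^\varepsilon) \sim \frac{1}{2}\,\varphi\Big(\frac{\pi}{2}\Big) \quad \text{as } \varepsilon\to 0, \]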

and the proof is complete. □

Next we suppose that $\sigma(\theta)$ vanishes. Recalling (19), we see that the function $\sigma(\theta)$ is a trigonometric polynomial of degree 2 with period $\pi$. So there are at most four points

\[ 0 \le \gamma_1 \le \gamma_2 < \gamma_3\ (= \gamma_1 + \pi) \le \gamma_4\ (= \gamma_2 + \pi) < 2\pi \]

such that

\[ \sigma(\theta) \begin{cases} = 0 & \text{if } \theta = \gamma_k\text{'s}, \\ \ne 0 & \text{if } \theta \ne \gamma_k\text{'s}. \end{cases} \tag{47} \]

We classify the positions of the $\gamma_k$'s on the unit circle as follows:

Case 5.1 :   0 < γ1 < γ2 < π/2
Case 5.2 :   0 < γ1 < π/2 < γ2 < π
Case 5.3 :   π/2 < γ1 < γ2 < π
Case 5.4 :   0 = γ1,  0 < γ2 < π/2
Case 5.5 :   0 = γ1,  π/2 < γ2 < π
Case 5.6 :   0 < γ1 < π/2,  γ2 = π/2
Case 5.7 :   π/2 = γ1,  π/2 < γ2 < π
Case 5.8 :   0 = γ1,  π/2 = γ2
Case 5.9 :   0 < γ1 = γ2 < π/2
Case 5.10 :  π/2 < γ1 = γ2 < π
Case 5.11 :  0 = γ1 = γ2
Case 5.12 :  π/2 = γ1 = γ2.                    (48)

Case 5.1: In this case, all $\gamma_k$'s are right (= anti-clockwise) shunts, which are discussed in Proposition 2 (ii). (See Figure 5.) So we may repeat all discussions in Step 2 of section 3.5, and obtain the same assertion as in Lemma 1. In this case, the invariant probability measure is

\[ \mu^\varepsilon(d\theta) \equiv \frac{1}{N^\varepsilon}\,m_k^\varepsilon(\theta)\,d\theta \qquad \text{for } \gamma_k \le \theta < \gamma_{k+1}, \tag{49} \]

where $\gamma_5 \equiv \gamma_1 + 2\pi$ and we set

\[ m_k^\varepsilon(\theta) \equiv \frac{1}{\varepsilon^2\,\sigma(\theta)^2\,s^\varepsilon(\gamma_k^\dagger, \theta)}\int_\theta^{\gamma_{k+1}} s^\varepsilon(\gamma_k^\dagger, x)\,dx, \qquad \gamma_k \le \theta < \gamma_{k+1}, \quad k = 1, \cdots, 4, \]

\[ N^\varepsilon \equiv \sum_{k=1}^{4}\int_{\gamma_k}^{\gamma_{k+1}} m_k^\varepsilon(\theta)\,d\theta, \]

in which the function $s^\varepsilon$ is given by (44) and each $\gamma_k^\dagger$ is a fixed point chosen from the open interval $(\gamma_k, \gamma_{k+1})$.

Figure 5: Case 5.1

Proposition 4. Suppose that Case 5.1 in (48) holds. Let $\varepsilon$ be small. Then the statements in Proposition 3 (ii) are true, except that the invariant measure $\mu^\varepsilon$ is defined by (49).

Proof: We denote by $U(\gamma_k)$ a small open neighborhood of each $\gamma_k$. By a simple calculation, we see that

\[ \frac{dm_k^\varepsilon}{d\theta}(\theta) = \frac{1}{2\,b(\theta) - \varepsilon^2\,\sigma(\theta)\,\sigma'(\theta)}. \]

This is uniformly bounded in $\theta \in U(\gamma_k)$ with respect to small $\varepsilon$. So there are a constant $K$ and a positive number $\delta$ such that

\[ \sup_{\varepsilon\le\delta}\ \sum_{k=1}^{4}\int_{U(\gamma_k)} m_k^\varepsilon(\theta)\,d\theta < K. \]

Since $\sigma(\theta) > 0$ if $\theta \in [0,2\pi] - \cup_{k=1}^{4} U(\gamma_k)$, we may repeat an argument similar to that in the proof of Proposition 3 (ii), after replacing $[0,2\pi]$ with $[0,2\pi] - \cup_{k=1}^{4} U(\gamma_k)$ and the definition of $\mu^\varepsilon$ with (49). □

The situation of Case 5.1 is analogous to some other cases, in which an invariant measure exists, and we have the following results.

Corollary 2. Let Case 5.2, 5.3, 5.4, 5.5, 5.9, 5.10, or 5.11 in (48) hold and let $\varepsilon$ be small. Then the statement in Proposition 3 (ii) is true, except that the invariant measure $\mu^\varepsilon$ is slightly different.

Proof: The proof of this corollary can be found in Nishioka (1976b), so we omit it. □

Case 5.6: By an argument similar to that in Case 5.4, we have the following inequalities:

\[ \frac{-k_1}{\pi/2 - \theta} \le -\frac{2\,b^\varepsilon(\theta)}{\sigma(\theta)^2} \le \frac{-k_2}{\pi/2 - \theta} \qquad \text{if } \theta \in \Big[\frac{\pi}{2} - \delta,\ \frac{\pi}{2}\Big), \]

\[ \frac{k_3}{\pi/2 - \theta} \le -\frac{2\,b^\varepsilon(\theta)}{\sigma(\theta)^2} \le \frac{k_4}{\pi/2 - \theta} \qquad \text{if } \theta \in \Big(\frac{\pi}{2},\ \frac{\pi}{2} + \delta\Big], \]

where the $k_j$'s are some positive constants and $\delta > 0$. By the arguments in section 3.4, these inequalities imply that $\gamma_2 = \pi/2$ is a right and left attracting trap. On the other hand, $\gamma_1$ is a right (= anti-clockwise) shunt, and Proposition 2 (v) is applicable to $\{\theta^\varepsilon(t),\ t\ge 0\}$ in the interval $(\gamma_1, \pi/2)$ and (vii) to it in $(-\pi/2, \gamma_1)$. (See Figure 6.)

Figure 6: Case 5.6

The next conclusion follows directly from Proposition 2:

Proposition 5. Suppose that Case 5.6 in (48) holds and $\varepsilon$ is small.
(i) If $\theta^* = \theta^\varepsilon(0,\omega) \in (-\pi/2, \gamma_1)$, then the statement of Proposition 2 (vii) holds with $\alpha = -\pi/2$ and $\beta = \gamma_1$.
(ii) If $\theta^* = \theta^\varepsilon(0,\omega) \in (\gamma_1, \pi/2)$, then $\theta^\varepsilon(t) \to \pi/2$ a.s. as $t\to\infty$.

(iii) If $\theta^* = \theta^\varepsilon(0,\omega)$ is neither 0 nor $\pi$, then

\[ L^\varepsilon(x^*) = Q^\varepsilon\Big(\frac{\pi}{2}\Big) = a_2 - \frac{\varepsilon^2}{2}\,g_{22}^2 \quad \text{a.s.} \tag{50} \]

The situation of Case 5.6 is very similar to some other cases, in which an attracting trap exists, and we have the following results.

Corollary 3. Let Case 5.7, 5.8, or 5.12 in (48) hold and let $\varepsilon$ be small.
(i) If $\theta^* = \theta^\varepsilon(0,\omega)$ does not equal any of the trap points $(0, \pi/2, \pi, 3\pi/2)$, then $\theta^\varepsilon(t,\omega)$ converges to the attracting traps almost surely.
(ii) If $\theta^* = \theta^\varepsilon(0,\omega)$ equals one of the trap points, then $\theta^\varepsilon(t,\omega)$ stays there almost surely.

Proof: The proof of this corollary can be found in Nishioka (1976b), so we omit it. □

Generally speaking, it is not natural to expect that the diffusion process $\{\theta^\varepsilon(t),\ t\ge 0\}$ converges a.s. to a point as $t\to\infty$. In fact such behavior is essentially determined by a property of the random term $\varepsilon\,G\cdot x^\varepsilon(t,\omega)\,dW(t,\omega)$ in SDE (1), as shown in Lemma 1, Propositions 3 and 5, and so on. Therefore we present a slightly wider characterization of a saddle point than the original (11a) – (11d). The new characterization requires that the following equalities¹⁵ hold instead of (11a) and (11c):

\[ \lim_{T\to\infty}\frac{1}{T}\int_0^T \varphi(\theta(t))\,dt = \varphi\Big(\frac{\pi}{2}\Big) \quad \text{if } \theta(0) \ne 0 \text{ or } \pi, \]
\[ \lim_{T\to\infty}\frac{1}{T}\int_0^T \varphi(\tilde\theta(t))\,dt = \varphi(0) \quad \text{if } \theta(0) \ne \pi/2 \text{ or } 3\pi/2 \tag{51} \]

for each periodic continuous function $\varphi$ with period $\pi$. Note that (11a) and (11c) imply (51), but the converse is not necessarily true.

¹⁵ The limits in (51) are known as Toeplitz type limits, which are introduced in order to extend the concept of limit for a sequence and a series.

We assert that the random dynamical system (1) satisfies the above requirement.

Theorem 3. Let the origin 0 be a saddle point of the non-random dynamical system (4), which is characterized by (11b), (11d), and (51). If $\varepsilon$ is small, then the origin is also a saddle point of the related random system. More precisely, for any $\delta > 0$ there exists a positive $\varepsilon^*$ such that the next inequalities hold with probability one: if $\varepsilon < \varepsilon^*$, then

\[ |L^\varepsilon(x^*) - a_2| < \delta \quad \text{if } x^* = (x_1^*, x_2^*) \text{ with } x_2^* \ne 0, \tag{52} \]
\[ |\tilde L^\varepsilon(x^*) + a_1| < \delta \quad \text{if } x^* = (x_1^*, x_2^*) \text{ with } x_1^* \ne 0, \tag{53} \]
\[ \Big|\lim_{T\to\infty}\frac{1}{T}\int_0^T \varphi(\theta^\varepsilon(t,\omega))\,dt - \varphi\Big(\frac{\pi}{2}\Big)\Big| < \delta \quad \text{if } \theta^* = \theta^\varepsilon(0,\omega) \ne 0 \text{ or } \pi, \tag{54} \]
\[ \Big|\lim_{T\to\infty}\frac{1}{T}\int_0^T \varphi(\tilde\theta^\varepsilon(t,\omega))\,dt - \varphi(0)\Big| < \delta \quad \text{if } \theta^* = \tilde\theta^\varepsilon(0,\omega) \ne \frac{\pi}{2} \text{ or } \frac{3\pi}{2}, \tag{55} \]

for each periodic continuous function $\varphi$ with period $\pi$.

Proof: We have already proved the validity of (52) and (54) by the propositions in this section. For the random system $\{\tilde x^\varepsilon(t,\omega),\ t\ge 0\}$ in Notation 1, we consider the diffusion process $\{\tilde\theta^\varepsilon(t,\omega),\ t\ge 0\}$ given by SDE (18), except that

\[ b^\varepsilon(\theta) = \frac{a_1 - a_2}{2}\,\sin 2\theta + \varepsilon^2 b_G(\theta), \qquad a_1 < 0 < a_2, \]

while the diffusion process $\{\tilde\rho^\varepsilon(t),\ t\ge 0\}$ is given by SDE (20), but with

\[ Q^\varepsilon(\theta) = -a_1 + (a_1 - a_2)\sin^2\theta + \varepsilon^2 q_G(\theta). \]

So after a little modification, we may apply those propositions to $\{\tilde\theta^\varepsilon(t),\ t\ge 0\}$ and $\{\tilde\rho^\varepsilon(t),\ t\ge 0\}$, and then (53) and (55) follow. □

3.7 Improper and proper node

First we suppose that the matrix A is (IV) in (3), so that the origin 0 is an improper node of the non-random system (4). (See Figure 2.) We also extend the characterization (12a) – (12d) of an improper node. We replace (12a) and (12c) by the requirement that

\[ \lim_{T\to\infty}\frac{1}{T}\int_0^T \varphi(\theta(t))\,dt = \varphi(0) \quad \text{if } \theta(0) \ne \pi/2 \text{ or } 3\pi/2, \]
\[ \lim_{T\to\infty}\frac{1}{T}\int_0^T \varphi(\tilde\theta(t))\,dt = \varphi\Big(\frac{\pi}{2}\Big) \quad \text{if } \theta(0) \ne 0 \text{ or } \pi \tag{56} \]

for each periodic continuous function $\varphi$ with period $\pi$.

In this case, the random system $\{x^\varepsilon(t),\ t\ge 0\}$ satisfies the same SDE as the random system in the saddle point case, except for the signs of $a_1$ and $a_2$. So we need only a little modification to obtain the following result:

Theorem 4. Suppose that the matrix A is (IV) in (3) and the origin 0 is an improper node of the non-random dynamical system (4) characterized by (12b), (12d), and (56). If $\varepsilon$ is small, then the origin is also an improper node of the related random system. In detail, for any $\delta > 0$ there exists an $\varepsilon^* > 0$ such that the following inequalities hold a.s.: if $\varepsilon < \varepsilon^*$, then

\[ |L^\varepsilon(x^*) - a_1| \le \delta \quad \text{if } x^* = (x_1^*, x_2^*) \text{ with } x_1^* \ne 0, \]
\[ |\tilde L^\varepsilon(x^*) + a_1| \le \delta \quad \text{if } x^* = (x_1^*, x_2^*) \text{ with } x_1^* \ne 0, \]
\[ \Big|\lim_{T\to\infty}\frac{1}{T}\int_0^T \varphi(\theta^\varepsilon(t,\omega))\,dt - \varphi(0)\Big| \le \delta \quad \text{if } \theta^* = \theta^\varepsilon(0,\omega) \ne \frac{\pi}{2} \text{ or } \frac{3\pi}{2}, \]
\[ \Big|\lim_{T\to\infty}\frac{1}{T}\int_0^T \varphi(\tilde\theta^\varepsilon(t,\omega))\,dt - \varphi\Big(\frac{\pi}{2}\Big)\Big| \le \delta \quad \text{if } \theta^* = \tilde\theta^\varepsilon(0,\omega) \ne 0 \text{ or } \pi, \]

for each periodic continuous function $\varphi$ with period $\pi$.

Let the matrix A be (V) in (3) (see Figure 3). In this case, the origin 0 is an improper node of the non-random system, which is characterized by (13a) – (13d). We replace the characterization (13a) and (13c) by the condition

\[ \lim_{T\to\infty}\frac{1}{T}\int_0^T \varphi(\theta(t))\,dt = \varphi\Big(\frac{\pi}{2}\Big) \quad \text{and} \quad \lim_{T\to\infty}\frac{1}{T}\int_0^T \varphi(\tilde\theta(t))\,dt = \varphi\Big(\frac{\pi}{2}\Big) \tag{57} \]

for each periodic continuous function $\varphi$ with period $\pi$. In the random system (1) of this set-up, we have

\[ d\theta^\varepsilon(t,\omega) = b^\varepsilon(\theta^\varepsilon(t,\omega))\,dt + \varepsilon\,\sigma(\theta^\varepsilon(t,\omega))\,dW(t,\omega) \]

with

\[ b^\varepsilon(\theta) = a_2\cos^2\theta + \varepsilon^2 b_G(\theta), \qquad a_1 < 0 < a_2. \]

The main part of the function $b^\varepsilon(\theta)$ is the non-negative term

\[ a_2\cos^2\theta \tag{58} \]

that has zeros of degree 2 at the points $\theta = \pi/2$ and $\theta = 3\pi/2$. Therefore we must treat this case more delicately than the previous saddle point case, but the argument is essentially similar.

Theorem 5. Suppose that the matrix A is (V) in (3) and the origin is an improper node of the non-random dynamical system, which is characterized by (13b), (13d), and (57). If $\varepsilon$ is small, then the origin is also an improper node of the related random system. In detail, for any $\delta > 0$ there exists an $\varepsilon^* > 0$ such that the next inequalities hold a.s.: if $\varepsilon < \varepsilon^*$, then

\[ |L^\varepsilon(x^*) - a_1| \le \delta, \qquad |\tilde L^\varepsilon(x^*) + a_1| \le \delta, \tag{59a} \]
\[ \Big|\lim_{T\to\infty}\frac{1}{T}\int_0^T \varphi(\theta^\varepsilon(t,\omega))\,dt - \varphi\Big(\frac{\pi}{2}\Big)\Big| \le \delta, \tag{59b} \]
\[ \Big|\lim_{T\to\infty}\frac{1}{T}\int_0^T \varphi(\tilde\theta^\varepsilon(t,\omega))\,dt - \varphi\Big(\frac{\pi}{2}\Big)\Big| \le \delta \tag{59c} \]

for each periodic continuous function $\varphi$ with period $\pi$.

Proof: We prove the theorem in a way analogous to the arguments developed in section 3.6 in order to prove Theorem 3.

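But the proof needs a different and more complicated classification than (48), since the main term (58) is different from that in the saddle point case. So we omit the proof, whose essential part can be found in Nishioka (1976b), Proof of Theorem 3. □

Let the matrix A be (VI) in (3). The origin 0 is a proper node of the non-random system (4) (see Figure 3) and it is characterized by (15), which should be extended into:

\[ L(x^*) = a_1 < 0 \quad \text{and} \quad \tilde L(x^*) = -a_1 > 0, \tag{60a} \]
\[ \lim_{T\to\infty}\frac{1}{T}\int_0^T \varphi(\theta(t))\,dt = \varphi(\theta(0)), \tag{60b} \]
\[ \lim_{T\to\infty}\frac{1}{T}\int_0^T \varphi(\tilde\theta(t))\,dt = \varphi(\theta(0)) \tag{60c} \]

for each periodic continuous function $\varphi$ with period $\pi$.

In the proper node case, $\{\rho^\varepsilon(t,\omega),\ t\ge 0\}$ and $\{\tilde\rho^\varepsilon(t,\omega),\ t\ge 0\}$ satisfy the following SDEs:

\[ d\rho^\varepsilon(t,\omega) = a_1\,dt + \varepsilon^2 q_G(\theta^\varepsilon(t,\omega))\,dt + \varepsilon\,R(\theta^\varepsilon(t,\omega))\,dW(t,\omega), \]
\[ d\tilde\rho^\varepsilon(t,\omega) = -a_1\,dt + \varepsilon^2 q_G(\tilde\theta^\varepsilon(t,\omega))\,dt + \varepsilon\,R(\tilde\theta^\varepsilon(t,\omega))\,dW(t,\omega). \]

Therefore the law of the iterated logarithm implies that

\[ L^\varepsilon(x^*) = a_1 + \varepsilon^2\lim_{T\to\infty}\frac{1}{T}\int_0^T q_G(\theta^\varepsilon(t,\omega))\,dt \quad \text{a.s.}, \]
\[ \tilde L^\varepsilon(x^*) = -a_1 + \varepsilon^2\lim_{T\to\infty}\frac{1}{T}\int_0^T q_G(\tilde\theta^\varepsilon(t,\omega))\,dt \quad \text{a.s.}, \]

once we prove that

\[ \lim_{T\to\infty}\frac{1}{T}\int_0^T q_G(\theta^\varepsilon(t,\omega))\,dt \quad \text{and} \quad \lim_{T\to\infty}\frac{1}{T}\int_0^T q_G(\tilde\theta^\varepsilon(t,\omega))\,dt \]

each converge a.s. to a constant. Then it follows that

\[ L^\varepsilon(x^*) \to a_1 \ \text{ a.s.} \quad \text{and} \quad \tilde L^\varepsilon(x^*) \to -a_1 \ \text{ a.s.} \quad \text{as } \varepsilon\to 0. \]

However, for the diffusion process $\{\theta^\varepsilon(t,\omega),\ t\ge 0\}$, we cannot assert any general result, as shown in the following example.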

Example 2. Let the matrix A be (VI) in (3).

(i) We set

\[ G = \begin{pmatrix} 0 & -c \\ c & 0 \end{pmatrix}, \qquad c \ne 0, \]

in SDE (1), and this setting gives

\[ d\theta^\varepsilon(t,\omega) = \varepsilon\,c\,dW(t,\omega). \]

So the invariant probability measure of this $\{\theta^\varepsilon(t),\ t\ge 0\}$ is the normalized Lebesgue measure on the unit circle, and we obtain that

\[ \lim_{T\to\infty}\frac{1}{T}\int_0^T \varphi(\theta^\varepsilon(t,\omega))\,dt = \frac{1}{2\pi}\int_0^{2\pi} \varphi(\theta)\,d\theta \quad \text{a.s. for all } \varepsilon > 0. \]

This is far from (60b) of the non-random system in the proper node case.

(ii) When we put

\[ G = \begin{pmatrix} 0 & c \\ c & 0 \end{pmatrix}, \qquad c \ne 0, \]

in SDE (1), we have

\[ d\theta^\varepsilon(t,\omega) = \varepsilon^2 c^2\cos 2\theta^\varepsilon(t,\omega)\,\sin 2\theta^\varepsilon(t,\omega)\,dt + \varepsilon\,c\,\cos 2\theta^\varepsilon(t,\omega)\,dW(t,\omega). \]

The random term of the above SDE vanishes at the points

\[ \gamma_1 \equiv \pi/4, \qquad \gamma_2 \equiv 3\pi/4, \qquad \gamma_3 \equiv 5\pi/4, \qquad \gamma_4 \equiv 7\pi/4, \]

and all of them are right and left attracting traps. So Proposition 2 (x) gives the next fact: set $\gamma_5 \equiv \gamma_1 + 2\pi = 9\pi/4$. Then for each $\varepsilon > 0$,

\[ \lim_{t\to\infty}\theta^\varepsilon(t,\omega) = \begin{cases} \gamma_k \text{ or } \gamma_{k+1} & \text{if } \theta^* = \theta^\varepsilon(0,\omega) \in (\gamma_k, \gamma_{k+1}), \\ \gamma_k & \text{if } \theta^* = \theta^\varepsilon(0,\omega) = \gamma_k \end{cases} \qquad (k = 1, \ldots, 4) \]

with probability one. This is also different from (60b) or the original (15) of the corresponding non-random system.

Remark 3. There is no angular motion in the non-random system (4) if the origin is a proper node. Therefore in this case, the angular part $\{\theta^\varepsilon(t,\omega),\ t\ge 0\}$ of the random system (1) is determined by the added random term $\varepsilon\,G\cdot x^\varepsilon(t,\omega)\,dW(t,\omega)$ only, and, as this example shows, a certain $G$ may break the characterization (60b) and (15) of the non-random system in the proper node case. Similarly, there is no radial motion in the non-random system when the origin is a center, and the resulting situation is much the same, as we show in Example 1. ∇

Now the following assertion is evident.

Theorem 6. Let the origin be a proper node of the non-random system (4), i.e. the matrix A is (VI) in (3) and (60a) – (60c) hold. Then the origin is not necessarily a proper node of the related random system even though $\varepsilon > 0$ is sufficiently small.

References

Coddington, E. A., and Levinson, N. (1955) Theory of Ordinary Differential Equations. McGraw-Hill.
Friedman, A. (1976) Stochastic Differential Equations and Applications, Vol. 2. Academic Press.
Itô, K., and McKean, Jr., H. P. (1968) Diffusion Processes and Their Sample Paths. Springer Verlag.
Khas'minski, R. Z. (1967) "Necessary and Sufficient Conditions for the Asymptotic Stability of Linear Stochastic Systems." Theory of Probability and its Applications, SIAM 12: 167–172.
Khas'minski, R. Z. (1980) Stochastic Stability of Differential Equations. English Ed., Sijthoff & Noordhoff.
Maruyama, G., and Tanaka, H. (1957) "Some Properties of One Dimensional Diffusion Processes." Memoirs of the Faculty of Science, Kyushu University. Series A. Mathematics 13: 117–141.
Nevelson, M. B. (1964) "On the Behavior of the Invariant Measure of a Diffusion Process with Small Diffusion on a Circle." Theory of Probability and its Applications, SIAM 9: 125–131.
Nishioka, K. (1975) "Approximation Theorem on Stochastic Stability." Proceedings of the Japan Academy, Supplement 51 Suppl.: 795–797.
Nishioka, K. (1976a) "On the Stability of Two-Dimensional Linear Stochastic Systems." Kōdai Mathematical Seminar Reports 27: 211–230.
Nishioka, K. (1976b) "Asymptotic Behaviors of Two-Dimensional Autonomous Systems with Small Random Perturbations." Journal of Mathematics of Kyoto University 16: 56–69.
Revuz, D., and Yor, M. (1999) Continuous Martingales and Brownian Motion. Third ed. Springer Verlag.


Chapter 4 Dynamic Theory of Stochastic Movement of Systems

Masao Nagasawa
Institute of Mathematics, University of Zürich

4.1

Dynamic theory of stochastic processes

The dynamic theory of stochastic processes consists of two parts, kinematics and mechanics.¹ The dynamic theory concerns an evolution equation

$$
\frac{\partial u}{\partial t} + \frac{1}{2}\sum_{i,j=1}^{d} (\sigma^2(t,x))^{ij}\,\frac{\partial^2 u}{\partial x^i\,\partial x^j} + \sum_{i=1}^{d} b^i(t,x)\,\frac{\partial u}{\partial x^i} + c(t,x)\,u = 0, \tag{1}
$$

which contains a potential function $c(t,x)$, and the diffusion matrix $\sigma(t,x)\colon [a,b]\times\mathbb{R}^d \to \mathbb{R}^d\times\mathbb{R}^d$ and drift vector $b(t,x)\colon [a,b]\times\mathbb{R}^d \to \mathbb{R}^d$ must be prescribed. The case with no potential term can be treated in the framework of the conventional theory of Markov processes of Kolmogorov and Itô, which is a kinematic theory, as will be explained. In kinematics we have the kinematic equation

$$
\frac{\partial u}{\partial t} + \frac{1}{2}\sum_{i,j=1}^{d} (\sigma^2(t,x))^{ij}\,\frac{\partial^2 u}{\partial x^i\,\partial x^j} + \sum_{i=1}^{d} b^i(t,x)\,\frac{\partial u}{\partial x^i} = 0, \tag{2}
$$

which contains the drift terms $b^i(t,x)\,\partial u/\partial x^i$ but no potential term. The kinematic equation determines Markov (diffusion) processes, i.e., the movement of systems. By contrast, we have the equation of motion in the mechanics part of the dynamic theory. The equation of motion contains the potential function $c(t,x)$ of external forces as in (1). External forces influence the movement of systems, but not in a direct way. As will be explained, the potential function determines a drift vector through the equation of motion. The induced drift vector then defines the kinematic equation. The kinematic equation finally describes sample paths of the movement of the observed systems. We must therefore clarify the mathematical structures which connect three notions: external force, induced drift vector and sample paths of the movement.

¹ In Nagasawa (1993) the two parts of the dynamic theory, kinematics and mechanics, are called q-representation and p-representation.

4.2

Kinematic theory

There is an important class of theories that we call kinematic theories. In kinematic theories one handles the kinematic equation given in (2). According to the analytic method of Kolmogorov (1931), equation (2) characterizes a Markov (diffusion) process uniquely, if an initial distribution is prescribed. To discuss the kinematic equation in (2) in applications, one must take a crucial step, namely an appropriate choice of the diffusion and drift coefficients $\sigma(t,x)$ and $b(t,x)$, which are determined through careful analysis of the systems under consideration, depending on the chosen models.² By using the fundamental solution $q(s,x;t,y)$ of equation (2), we set

$$
Q(s,x;t,B) = \int_B q(s,x;t,y)\,dy, \qquad t-s>0.
$$

Then it is a transition probability, that is, $Q(s,x;t,B)$ satisfies the normality condition

$$
Q(s,x;t,\mathbb{R}^d) = 1, \tag{3}
$$

and obeys the Chapman–Kolmogorov equation

$$
Q(s,x;t,B) = \int_{\mathbb{R}^d} Q(s,x;r,dy)\,Q(r,y;t,B), \qquad s\le r\le t. \tag{4}
$$
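As a quick numerical illustration of (3) and (4), the following sketch assumes the simplest kinematic equation: one dimension, constant $\sigma$ and no drift, so that $q$ is the Gaussian kernel. The helper names `gauss_kernel` and `integrate` and all numerical values are illustrative choices, not taken from the text. The Chapman–Kolmogorov equation is checked at the level of densities, $q(s,x;t,y)=\int q(s,x;r,z)\,q(r,z;t,y)\,dz$.

```python
import math

def gauss_kernel(s, x, t, y, sigma=1.0):
    """Transition density q(s, x; t, y) of sigma * (Brownian motion), assumed here."""
    var = sigma ** 2 * (t - s)
    return math.exp(-(y - x) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def integrate(func, lo=-12.0, hi=12.0, n=4000):
    """Composite trapezoidal rule on [lo, hi]."""
    h = (hi - lo) / n
    total = 0.5 * (func(lo) + func(hi))
    for k in range(1, n):
        total += func(lo + k * h)
    return total * h

s, r, t = 0.0, 0.4, 1.0      # times with s <= r <= t
x, y_target = 0.3, -0.5      # start point and end point

# Normality condition (3): the kernel integrates to one in y.
normality = integrate(lambda y: gauss_kernel(s, x, t, y))

# Chapman-Kolmogorov (4): going s -> t directly equals going s -> r -> t.
direct = gauss_kernel(s, x, t, y_target)
via_r = integrate(lambda z: gauss_kernel(s, x, r, z) * gauss_kernel(r, z, t, y_target))

print(normality, direct, via_r)
```

Both identities hold up to quadrature error for this kernel; a kernel with a potential term would fail the first check.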

Moreover, with a transition probability $Q(s,x;t,B)$ and an initial distribution density $\mu_a(x_0)$ at the switch-on time $t=a$, we can construct a Markov process $\{X_t,\ t\in[a,b],\ Q\}$ through the finite dimensional distributions

$$
Q[f(X_a, X_{t_1},\dots,X_{t_{n-1}},X_b)] = \int \mu_a(x_0)\,dx_0\, q(a,x_0;t_1,x_1)\,dx_1\, q(t_1,x_1;t_2,x_2)\,dx_2\, q(t_2,x_2;t_3,x_3)\,dx_3 \cdots q(t_{n-1},x_{n-1};b,x_n)\,dx_n\, f(x_0,x_1,\dots,x_n), \tag{5}
$$

where $a<t_1<\cdots<t_{n-1}<b$, and $f(x_0,x_1,\dots,x_n)$ is any bounded measurable function on the space $(\mathbb{R}^d)^{n+1}$, cf. Kolmogorov (1933). Equation (5) determines a Markov process uniquely, and is a fundamental equation in the conventional theory of Markov processes of Kolmogorov. For equation (5) the normality condition in (3) is indispensable. The (marginal) distribution $Q[X_t\in dx]$ of a Markov process is a special case of equation (5). We write the distribution as

$$
Q[X_t\in dx] = e^{2R(t,x)}\,dx, \tag{6}
$$

with a density $\mu(t,x)=e^{2R(t,x)}$, and call $R(t,x)$ the 'exponent of distribution'. A Markov process is therefore determined by an exponent of distribution $R(a,x)$ at the switch-on time $t=a$ and a transition probability $Q(s,x;t,B)$ through equation (5).

We carefully note that the finite dimensional distribution, i.e., equation (5), belongs to neither classical analysis nor functional analysis, and is unfamiliar to people who are accustomed to working with classical mathematics. In other words, with equation (5), we go into a new mathematical field called sample path analysis, leaving classical and functional analysis. This is the key to discussing and understanding stochastic processes.

² Cf. e.g. Jensen and Richter (2007), Jensen, Wang and Johnsen (2007), this volume.

4.3

Sample path equation in kinematic theory

The kinematic equation in (2), together with (5), is equivalent to Itô's stochastic differential equation

$$
X_t = X_a + \int_a^t \sigma(s,X_s)\,dB_s + \int_a^t b(s,X_s)\,ds, \tag{7}
$$

which is the equation of sample paths $X_t(\omega)$ (abbreviated as $X_t$), where $B_t$ ($=B_t(\omega)$) is a $d$-dimensional Brownian motion (Wiener process) defined on a probability space $\{\Omega,\mathcal{F},P\}$, and $X_a$ denotes the initial value (position) at the switch-on time $t=a$.
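Equation (7) is also what one discretizes in order to draw approximate sample paths. The following is a minimal Euler–Maruyama sketch in one dimension. The scheme is a standard approximation and is not prescribed by the text; the function name `euler_maruyama` and the coefficient choices in the usage lines are illustrative assumptions.

```python
import math
import random

def euler_maruyama(sigma, b, x_a, a, t_end, n_steps, rng):
    """Approximate one path of dX = sigma(t, X) dB + b(t, X) dt on [a, t_end] (d = 1)."""
    dt = (t_end - a) / n_steps
    t, x = a, x_a
    path = [x]
    for _ in range(n_steps):
        dB = rng.gauss(0.0, math.sqrt(dt))      # Brownian increment over dt
        x = x + sigma(t, x) * dB + b(t, x) * dt
        t += dt
        path.append(x)
    return path

# Usage with assumed coefficients: constant diffusion 0.5 and restoring drift -x.
rng = random.Random(0)
path = euler_maruyama(lambda t, x: 0.5, lambda t, x: -x, 1.0, 0.0, 2.0, 1000, rng)

# Sanity check without noise: the scheme then solves dx/dt = -x, so x(2) is near exp(-2).
det_path = euler_maruyama(lambda t, x: 0.0, lambda t, x: -x, 1.0, 0.0, 2.0, 1000, rng)
print(path[-1], det_path[-1], math.exp(-2.0))
```

Each step uses only the current position, which is the discrete counterpart of the Markov property.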

We can decompose the movement $X_t$ into two parts

$$
X_t = X_r + \int_r^t \sigma(s,X_s)\,dB_s + \int_r^t b(s,X_s)\,ds, \tag{8}
$$

where $a\le r\le t$, and

$$
X_r = X_a + \int_a^r \sigma(s,X_s)\,dB_s + \int_a^r b(s,X_s)\,ds, \tag{9}
$$

i.e., from the switch-on time $a$ to the present time $r$, and from the present time $r$ to the future time $t$. Then equation (8) shows that the process $X_t$ depends on the information of the past only through the position $X_r$ at the present time $r$, which is given by equation (9). In other words, the process $X_t$ from the present time $r$ to the future time $t$ does not depend on the detail of the past history. This property is called the Markov property of the process $X_t$.

The first integral on the right-hand side of (7) is Itô's stochastic integral. We often write equation (7) in a differential form

$$
dX_t = \sigma(t,X_t)\,dB_t + b(t,X_t)\,dt, \tag{10}
$$

or component-wise

$$
dX_t^i = (\sigma(t,X_t)\,dB_t)^i + b^i(t,X_t)\,dt, \qquad i=1,\dots,d. \tag{11}
$$

We can compute the differential of $f(t,X_t)$, for $f\in C^2([a,b]\times\mathbb{R}^d)$, with the Itô formula. If we take up to the second order differentials of $X_t$, i.e., up to $dX_t^i\,dX_t^j$, then we get

$$
df(t,X_t) = \frac{\partial f}{\partial t}\,dt + \sum_{i=1}^{d}\frac{\partial f}{\partial x^i}\,dX_t^i + \frac{1}{2}\sum_{i,j=1}^{d}\frac{\partial^2 f}{\partial x^i\,\partial x^j}\,dX_t^i\,dX_t^j. \tag{12}
$$

In classical analysis we take only the first order differentials, and consider

$$
df(t,X_t) = \frac{\partial f}{\partial t}\,dt + \sum_{i=1}^{d}\frac{\partial f}{\partial x^i}\,dX_t^i, \tag{13}
$$

where we assume that the second order differentials $dX_t^i\,dX_t^j$ are of small order compared to the first order differentials. But in sample path analysis this is not the case, since P. Lévy's symbolic formulas hold:

$$
dB_t^i\,dB_t^j = \delta^{ij}\,dt, \qquad dB_t^i\,dt = 0, \qquad\text{and}\qquad (dt)^2 = 0. \tag{14}
$$

Hence the second order differentials $dX_t^i\,dX_t^j$ are not of small order compared to the first order differentials, since we have $dB_t^i\,dB_t^i = dt$, $i=1,\dots,d$. Combining equations (11), (12) and (14), we get the Itô formula

$$
df(t,X_t) = \left(\frac{\partial}{\partial t} + \frac{1}{2}\sum_{i,j=1}^{d}(\sigma^2)^{ij}\frac{\partial^2}{\partial x^i\,\partial x^j} + \sum_{i=1}^{d} b^i\frac{\partial}{\partial x^i}\right) f(t,X_t)\,dt + \sum_{i=1}^{d}\frac{\partial f(t,X_t)}{\partial x^i}\,(\sigma\,dB_t)^i. \tag{15}
$$
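P. Lévy's rule $dB_t\,dB_t = dt$ in (14) can be observed numerically. The sketch below, in one dimension with illustrative values of `T` and `n`, sums the squared Brownian increments over $[0,T]$ and also checks the Itô formula for $f(x)=x^2$, that is, $B_T^2 = 2\int_0^T B_s\,dB_s + T$. The variable names are assumptions made for this illustration.

```python
import math
import random

rng = random.Random(1)
T, n = 1.0, 100_000          # time horizon and number of increments (assumed values)
dt = T / n

B = 0.0                      # Brownian path value, B_0 = 0
quad_var = 0.0               # running sum of (dB)^2
ito_sum = 0.0                # running Ito sum of B dB (left endpoint)
for _ in range(n):
    dB = rng.gauss(0.0, math.sqrt(dt))
    quad_var += dB * dB
    ito_sum += B * dB        # integrand evaluated before the increment
    B += dB

# Levy's rule: the sum of (dB)^2 is close to T, not to zero.
# Ito formula for x^2: B_T^2 - 2 * (Ito sum) equals that same sum.
print(quad_var, B * B - 2.0 * ito_sum, T)
```

With the classical rule (13) the second printed number would be zero; the extra term $T$ is exactly the contribution of the second order differential.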

Itô's stochastic differential equation (7) and the Itô formula above are not only of theoretical importance but also powerful mathematical devices in the theory of Markov processes. Equation (15) proves that the transition density of $X_t$ in (7) satisfies (2). The kinematic equation in (2), which is often called Kolmogorov's equation, is not easy to solve except in some simple cases. Therefore, to analyze the Markov (diffusion) process $X_t$ it is often better to handle Itô's stochastic differential equation in (7). In solving it, we have extremely powerful tools, the so-called sample path analysis, in particular P. Lévy's formulae in (14) and Itô's formula in (15).

4.4

Mechanics and the equation of motion

The single equation in (1) does not help us. The equation of motion in the dynamic theory of random motion is given by a pair of twin evolution equations with a scalar potential $c(t,x)$

$$
\begin{aligned}
\frac{\partial\phi}{\partial t} + \frac{1}{2}\Delta\phi + c(t,x)\,\phi &= 0,\\
-\frac{\partial\hat\phi}{\partial t} + \frac{1}{2}\Delta\hat\phi + c(t,x)\,\hat\phi &= 0,
\end{aligned}
\tag{16}
$$

where $\Delta$ denotes the Laplace–Beltrami operator

$$
\Delta = \nabla\cdot\nabla = \frac{1}{\sqrt{\det|\sigma^2(x)|}}\,\frac{\partial}{\partial x^i}\left(\sqrt{\det|\sigma^2(x)|}\,(\sigma^2(x))^{ij}\,\frac{\partial}{\partial x^j}\right),
$$

which is necessary for discussing duality. The equation of motion in the general case with a vector potential $b(t,x)$ is a pair of twin evolution equations

$$
\begin{aligned}
\frac{\partial\phi}{\partial t} + \frac{1}{2}(\nabla + b(t,x))^2\phi + c(t,x)\,\phi &= 0,\\
-\frac{\partial\hat\phi}{\partial t} + \frac{1}{2}(\nabla - b(t,x))^2\hat\phi + c(t,x)\,\hat\phi &= 0,
\end{aligned}
\tag{17}
$$

which are in formal duality with respect to $d\tilde{x}\,dt$, where $d\tilde{x} = \sqrt{\det|\sigma^2(x)|}\,dx$, since we have

$$
\int g\,(\nabla + b(t,x))^2 f\,d\tilde{x} = \int f\,(\nabla - b(t,x))^2 g\,d\tilde{x},
$$

for any smooth $f$ and $g$ vanishing at infinity. This is the duality relation between $(\nabla + b(t,x))^2$ and $(\nabla - b(t,x))^2$ with respect to the measure $d\tilde{x} = \sqrt{\det|\sigma^2(x)|}\,dx$.

When $b$ and $c$ are independent of time, we often consider stationary solutions. In this case, substituting $\phi(t,x) = e^{\lambda t}\varphi(x)$ into the first evolution equation in (17), we get

$$
\lambda\varphi + \frac{1}{2}(\nabla + b(x))^2\varphi + c(x)\,\varphi = 0,
$$

which is an eigenvalue problem, and plays a crucial role in quantum physics.

We carefully note that the equation of motion in (17) (or (16)) does not belong to the conventional theory of Markov processes of Kolmogorov and Itô. In fact, let $p(s,x;t,y)$ be the fundamental solution of the twin equations of motion in (17). Then it does not satisfy the normality condition in (3). Instead, we have

$$
\int p(s,x;t,y)\,dy \ne 1,
$$

because of the potential terms in equation (17). Hence $p(s,x;t,y)\,dy$ is not a transition probability, and equation (5) of Kolmogorov is not applicable. This means that we cannot apply the conventional theory of Markov processes to equations with potential terms. We need a new method, which will be explained in the following, for constructing stochastic processes.
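The failure of the normality condition can be seen in a special case that is easy to compute by hand. The following worked example is an illustration under stated assumptions and is not taken from the text: one dimension, constant $\sigma$, no vector potential, and a constant potential $c(t,x)\equiv c_0$. The symbols $c_0$ and $g$ are introduced here only for this purpose.

```latex
% Assumed special case: d = 1, constant \sigma, b = 0, c(t,x) \equiv c_0.
\[
  p(s,x;t,y) = e^{c_0 (t-s)}\, g(s,x;t,y), \qquad
  g(s,x;t,y) = \frac{1}{\sqrt{2\pi\sigma^2 (t-s)}}
               \exp\!\Big(-\frac{(y-x)^2}{2\sigma^2 (t-s)}\Big).
\]
% g solves the equation without potential in the variables (s,x), hence
\[
  \frac{\partial p}{\partial s} + \frac{1}{2}\sigma^2 \frac{\partial^2 p}{\partial x^2} + c_0\, p
  = e^{c_0 (t-s)}\Big(\frac{\partial g}{\partial s}
    + \frac{1}{2}\sigma^2 \frac{\partial^2 g}{\partial x^2}\Big) = 0 ,
\]
% while the total mass is no longer one:
\[
  \int p(s,x;t,y)\,dy = e^{c_0 (t-s)} \neq 1 \qquad \text{unless } c_0 = 0 .
\]
```

Already a constant potential rescales the total mass, so $p(s,x;t,y)\,dy$ cannot serve as a transition probability in (5).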

Let $p(s,x;t,y)$ be the fundamental solution of the twin equations of motion in (17) (or (16)), and $\{\hat\phi_a(x), \phi_b(y)\}$ be a pair of functions which are normalized as

$$
\int \hat\phi_a(x)\,dx\,p(a,x;b,y)\,\phi_b(y)\,dy = 1. \tag{18}
$$

We then get solutions $\phi(t,x)$ and $\hat\phi(t,x)$ of the twin equations of motion in (17) by

$$
\begin{aligned}
\phi(t,x) &= \int p(t,x;b,y)\,\phi_b(y)\,dy,\\
\hat\phi(t,x) &= \int \hat\phi_a(z)\,dz\,p(a,z;t,x).
\end{aligned}
\tag{19}
$$

We will call $\phi(t,x)$ the 'evolution function' and $\hat\phi(t,x)$ the 'time-reversed (or backward) evolution function'. The condition in (18) implies the normality condition in the dynamic theory

$$
\int \hat\phi(t,x)\,\phi(t,x)\,dx = 1.
$$

Making use of the triplet $\{p(s,x;t,y), \hat\phi_a(x), \phi_b(y)\}$, we can construct a stochastic process $\{X_t,\ t\in[a,b],\ Q\}$ through the finite dimensional distributions

$$
Q[f(X_a, X_{t_1},\dots,X_{t_{n-1}},X_b)] = \int dx_0\,\hat\phi_a(x_0)\,p(a,x_0;t_1,x_1)\,dx_1\,p(t_1,x_1;t_2,x_2)\,dx_2 \cdots p(t_{n-1},x_{n-1};b,x_n)\,\phi_b(x_n)\,dx_n\,f(x_0,x_1,\dots,x_n). \tag{20}
$$

This is a new method for constructing stochastic processes in the dynamic theory. As a special case of (20), the distribution of the stochastic process $\{X_t,\ t\in[a,b],\ Q\}$ is given by

$$
Q[X_t\in dx] = \hat\phi(t,x)\,\phi(t,x)\,dx.
$$

We carefully compare equation (20) with Kolmogorov's equation in (5) which defines a Markov process. Equation (5) has only an initial function $\mu_a(x)$, and is defined by the fundamental solution $q(s,x;t,y)$ of the kinematic equation. By contrast, equation (20) has an initial function $\hat\phi_a(x)$ and in addition a terminal function $\phi_b(y)$, and is defined by the fundamental solution $p(s,x;t,y)$ of the twin equations of motion. This is a decisive point that makes the dynamic theory of random motion completely different from the conventional theory of Markov processes of Kolmogorov and Itô.

4.5

Evolution function and kinematic equation

Since the process $\{X_t,\ t\in[a,b],\ Q\}$ constructed by equation (20) depends on the initial and terminal functions $\{\hat\phi_a(x), \phi_b(y)\}$, it is not a Markov process with $p(s,x;t,y)$, which is the fundamental solution of the twin equations of motion in (17). We can nevertheless find a basic relation between the process $\{X_t,\ t\in[a,b],\ Q\}$ and a Markov process. We otherwise cannot discuss the equation of sample paths in (7). This will be explained in what follows.

Let $\phi(t,x)$ and $\hat\phi(t,x)$ be the evolution function and time-reversed (or backward) evolution function given in (19). Then we have

Theorem 1. The twin evolution functions $\phi(t,x)$ and $\hat\phi(t,x)$ induce the forward drift vector $a(t,x)$ and backward drift vector $\hat a(t,x)$ by

$$
a(t,x) = \frac{\sigma^2\nabla\phi(t,x)}{\phi(t,x)}, \qquad \hat a(t,x) = \frac{\sigma^2\nabla\hat\phi(t,x)}{\hat\phi(t,x)}. \tag{21}
$$

We introduce a new transition density $q(s,x;t,y)$.

Definition 1. Let $p(s,x;t,y)$ be the fundamental solution of the twin equations of motion in (17). By using an evolution function $\phi(t,x)$, we define a new transition density by

$$
q(s,x;t,y) = \frac{1}{\phi(s,x)}\,p(s,x;t,y)\,\phi(t,y). \tag{22}
$$

Theorem 2. (i) The function $q(s,x;t,y)$ defined by (22) is the fundamental solution of diffusion equations in formal duality

$$
\begin{aligned}
\frac{\partial u}{\partial t} + \frac{1}{2}\Delta u + (b(t,x) + a(t,x))\cdot\nabla u &= 0,\\
-\frac{\partial\mu}{\partial t} + \frac{1}{2}\Delta\mu - \operatorname{div}\big((b(t,x) + a(t,x))\,\mu\big) &= 0,
\end{aligned}
\tag{23}
$$

which contain the drift vector $a(t,x)$ induced by an evolution function $\phi(t,x)$. The function $q(s,x;t,y)$ obeys the Chapman–Kolmogorov equation in (4), and satisfies the normality condition

$$
\int q(s,x;t,y)\,dy = 1, \qquad s<t. \tag{24}
$$

(ii) The process $\{X_t,\ t\in[a,b],\ Q\}$ constructed through equation (20) is a Markov process which has the transition probability $q(s,x;t,y)\,dy$, and its distribution is given by

$$
Q[X_t\in dx] = \hat\phi(t,x)\,\phi(t,x)\,dx,
$$

with a density $\mu(t,x) = \hat\phi(t,x)\,\phi(t,x)$, which satisfies the second equation in (23).

The first equation in (23) is often called the Kolmogorov–Smoluchowski equation, which describes the transition of motion (a stochastic process). The second equation in (23) is called the Fokker–Planck equation, which is exclusively for distribution densities. The two equations must be strictly distinguished from each other to avoid confusion and misunderstandings. I carefully note that even though one knows distribution densities, one cannot see the motion itself. This fact is nowadays well-known in the theory of stochastic processes, but was not known in the 1920s (and is not well understood even nowadays), and people computed only distribution densities by solving the Fokker–Planck equation. Therefore, discussing the motion itself was pure guesswork at the time. This fact is extremely important when we look at the history of physics. By contrast, the twin equations of motion in (16) or (17) describe the transition (or evolution) in normal time and in reversed time, respectively. Therefore, they play exactly the same role, although time runs in opposite directions.

For proving Theorem 2, we rewrite equation (20) into equation (5). In performing this we manipulate equation (20). We first insert $\phi(a,x_0)\,\phi^{-1}(a,x_0)$ just after $\hat\phi_a(x_0)$. We then replace $\phi^{-1}(a,x_0)\,p(a,x_0;t_1,x_1)\,\phi(t_1,x_1)$ by $q(a,x_0;t_1,x_1)$, by using the formula in (22). Repeating this procedure, we finally reach $q(t_{n-1},x_{n-1};b,x_n)$ at the tail of the equation. Thus we get equation (5) with the initial distribution density $\mu_a(x) = \hat\phi_a(x)\,\phi(a,x)$ and the transition function $q(s,x;t,y)$. This proves that the process constructed by (20) is a Markov process with the transition probability $q(s,x;t,y)\,dy$. For details we refer to Nagasawa (1993, 2000).
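The manipulation in the proof can be written out in the shortest non-trivial case. The following display is a sketch for a single intermediate time $t_1$ (that is, $n=2$), using only (19), (20) and (22), together with $\phi(b,x)=\phi_b(x)$, which follows from (19) at $t=b$.

```latex
\begin{align*}
Q[f(X_a, X_{t_1}, X_b)]
 &= \int dx_0\, \hat\phi_a(x_0)\, p(a,x_0;t_1,x_1)\,dx_1\,
      p(t_1,x_1;b,x_2)\,\phi_b(x_2)\,dx_2\, f(x_0,x_1,x_2) \\
 &= \int dx_0\, \hat\phi_a(x_0)\,\phi(a,x_0)\,
      \frac{p(a,x_0;t_1,x_1)\,\phi(t_1,x_1)}{\phi(a,x_0)}\,dx_1\,
      \frac{p(t_1,x_1;b,x_2)\,\phi(b,x_2)}{\phi(t_1,x_1)}\,dx_2\, f(x_0,x_1,x_2) \\
 &= \int \mu_a(x_0)\,dx_0\, q(a,x_0;t_1,x_1)\,dx_1\,
      q(t_1,x_1;b,x_2)\,dx_2\, f(x_0,x_1,x_2),
 \qquad \mu_a(x_0) = \hat\phi_a(x_0)\,\phi(a,x_0).
\end{align*}
```

The inserted factors $\phi(t_1,x_1)$ cancel in pairs, so the last line has exactly the form of (5).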
Thus an evolution function $\phi(t,x)$ given by (19) determines a drift vector $a(t,x)$ by (21), and the induced drift vector $a(t,x)$ then determines, together with the prescribed drift vector $b(t,x)$, the kinematic equation in (23); hence we finally get sample paths of a Markov process $X_t$, which has the drift vector $b(t,x) + a(t,x)$. Let us write this as a diagram:

Potential $c(t,x)$ ⟹ Induced drift $a(t,x)$ ⟹ Sample paths $X_t$.

Remark 1. By using a time-reversed (or backward) evolution function $\hat\phi(t,x)$ in (19), we define also a time-reversed (or backward) transition density

$$
\hat q(s,x;t,y) = \hat\phi(s,x)\,p(s,x;t,y)\,\frac{1}{\hat\phi(t,y)},
$$

which satisfies the time-reversed normality condition

$$
\int dx\,\hat q(s,x;t,y) = 1, \qquad s<t,
$$

and the function $\hat q(s,x;t,y)$ is the fundamental solution of the time-reversed kinematic equation

$$
\begin{aligned}
-\frac{\partial\hat u}{\partial t} + \frac{1}{2}\Delta\hat u + (-b(t,x) + \hat a(t,x))\cdot\nabla\hat u &= 0,\\
\frac{\partial\mu}{\partial t} + \frac{1}{2}\Delta\mu - \operatorname{div}\big((-b(t,x) + \hat a(t,x))\,\mu\big) &= 0.
\end{aligned}
$$

When we discuss the time-reversed description, we read equation (20) from right to left with a clock running backwards. ∇

4.6

Exponent of motion and initial condition

We introduce a new pair of variables defined by

$$
R(t,x) = \frac{1}{2}\log\phi(t,x)\,\hat\phi(t,x) \qquad\text{and}\qquad S(t,x) = \frac{1}{2}\log\frac{\phi(t,x)}{\hat\phi(t,x)}, \tag{25}
$$

and represent the evolution function $\phi(t,x)$ and time-reversed (or backward) evolution function $\hat\phi(t,x)$, by using the pair of functions $R(t,x)$ and $S(t,x)$, in the exponential form as

$$
\phi(t,x) = e^{R(t,x)+S(t,x)}, \qquad \hat\phi(t,x) = e^{R(t,x)-S(t,x)}. \tag{26}
$$

The distribution density of our process depends on the exponent of distribution $R(t,x)$, but does not depend on the function $S(t,x)$, since $\mu(t,x) = e^{2R(t,x)}$. By contrast the drift vector $a(t,x)$ (resp. $\hat a(t,x)$) depends on $S(t,x)$. In fact, let a pair of functions $R(t,x)$ and $S(t,x)$ be defined by (25). Then the formulae in (21) yield

$$
a(t,x) = \sigma^2(\nabla R(t,x) + \nabla S(t,x)), \qquad \hat a(t,x) = \sigma^2(\nabla R(t,x) - \nabla S(t,x)). \tag{27}
$$
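The step from (21) and (26) to (27) is a short computation, spelled out here for convenience; nothing beyond (21), (25) and (26) is assumed.

```latex
\begin{align*}
\nabla\phi &= \nabla e^{R+S} = (\nabla R + \nabla S)\,\phi
  \quad\Longrightarrow\quad
  a = \sigma^2\,\frac{\nabla\phi}{\phi} = \sigma^2(\nabla R + \nabla S), \\
\nabla\hat\phi &= \nabla e^{R-S} = (\nabla R - \nabla S)\,\hat\phi
  \quad\Longrightarrow\quad
  \hat a = \sigma^2\,\frac{\nabla\hat\phi}{\hat\phi} = \sigma^2(\nabla R - \nabla S), \\
\phi\,\hat\phi &= e^{(R+S)+(R-S)} = e^{2R} = \mu .
\end{align*}
```

The last line shows why $S$ drops out of the distribution density while it remains in both drift vectors.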

We will call the function $S(t,x)$ the 'exponent of motion', and the pair of functions $\{R(a,x), S(a,x)\}$ the 'initial condition' of the movement of a system. Then we have

Theorem 3. Let an initial condition $\{R(a,x), S(a,x)\}$ at the switch-on time $t=a$ be prescribed. Set $\hat\phi_a(x) = e^{R(a,x)-S(a,x)}$. Then the time-reversed evolution function is given by

$$
\hat\phi(t,x) = \int \hat\phi_a(y)\,dy\,p(a,y;t,x), \tag{28}
$$

where $p(s,z;t,x)$ is the fundamental solution of the twin equations of motion in (17). Further, the evolution function $\phi(t,x)$ is given as a solution of a linear integral equation

$$
\phi_a(z) = \int p(a,z;t,x)\,dx\,\phi(t,x), \tag{29}
$$

where $\phi_a(x) = e^{R(a,x)+S(a,x)}$, which is also a known function.

We can get the terminal function $\phi_b(x) = \phi(b,x)$ at the terminal time $t=b$ by equation (29), hence the terminal function $\phi_b(x)$ is determined by an initial condition $\{R(a,x), S(a,x)\}$. Since the initial and terminal functions $\{\hat\phi_a, \phi_b\}$ are determined by the initial condition, we can use a triplet $\{p(s,x;t,y), R(a,x), S(a,x)\}$ instead of the triplet $\{p(s,x;t,y), \hat\phi_a, \phi_b\}$. In other words, the process $\{X_t,\ t\in[a,b],\ Q\}$ is uniquely determined by the fundamental solution $p(s,x;t,y)$ of the twin equations of motion in (17) and an initial condition $\{R(a,x), S(a,x)\}$.

We carefully note that one can prescribe an initial condition $\{R(a,x), S(a,x)\}$ in our dynamic theory. By contrast, only an initial distribution with a density $e^{2R(a,x)}$, i.e., only the exponent of distribution $R(a,x)$, can be prescribed in the conventional theory of Markov processes of Kolmogorov and Itô. I will demonstrate this definitive advantage of my dynamic theory with simple examples.

4.7

Examples

We will consider examples in one dimension for simplicity.

Example 1. The free movement of a system in one dimension is governed by the equation of motion, which is a pair of twin equations

$$
\begin{aligned}
\frac{\partial\phi}{\partial t} + \frac{1}{2}\sigma^2\frac{\partial^2\phi}{\partial x^2} &= 0,\\
-\frac{\partial\hat\phi}{\partial t} + \frac{1}{2}\sigma^2\frac{\partial^2\hat\phi}{\partial x^2} &= 0,
\end{aligned}
\tag{30}
$$

where $\sigma$ is a constant. The fundamental solution of the equation of motion in (30) is

$$
p(s,x;t,y) = \frac{1}{\sqrt{2\pi\sigma^2(t-s)}}\exp\left(-\frac{(y-x)^2}{2\sigma^2(t-s)}\right).
$$

We take an evolution function

$$
\phi(t,x) = e^{-(\kappa^2/2)\,t + \kappa x/\sigma}, \tag{31}
$$

which is a solution of the first equation in (30), where $\kappa$ is an arbitrary constant. We then get drift

$$
a(t,x) = \sigma^2\,\frac{\partial\log\phi(t,x)}{\partial x} = \sigma\kappa,
$$

in view of equation (21). The transition density of the free movement, which is a Markov process in one dimension, is given by

$$
q(s,x;t,y) = \frac{1}{\sqrt{2\pi\sigma^2(t-s)}}\exp\left(-\frac{(y-x)^2}{2\sigma^2(t-s)} - \frac{1}{2}\kappa^2(t-s) + \frac{\kappa}{\sigma}(y-x)\right),
$$

in view of (22). Then functions defined by

$$
u(t,x) = \int q(t,x;b,z)\,dz\,f(z) \qquad\text{and}\qquad \mu(t,x) = \int \mu(y)\,dy\,q(a,y;t,x)
$$

are solutions of the kinematic equation

$$
\begin{aligned}
\frac{\partial u}{\partial t} + \frac{1}{2}\sigma^2\frac{\partial^2 u}{\partial x^2} + \sigma\kappa\,\frac{\partial u}{\partial x} &= 0,\\
-\frac{\partial\mu}{\partial t} + \frac{1}{2}\sigma^2\frac{\partial^2\mu}{\partial x^2} - \sigma\kappa\,\frac{\partial\mu}{\partial x} &= 0,
\end{aligned}
$$

with constant drift $\sigma\kappa$. We note that $\mu(t,x)$ is the distribution density of the process. Therefore, sample paths of the free movement are given by

$$
X_t = X_a + \sigma B_{t-a} + \sigma\kappa\,(t-a). \tag{32}
$$

Thus the free movement shows random zigzag motion as the Brownian motion $\sigma B_{t-a}$, and moreover it has drift $\sigma\kappa(t-a)$, although no external force is in existence.
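The claims of Example 1 can be checked numerically. The sketch below builds $q$ from $p$ and $\phi$ exactly as in (22) and (31), then verifies by quadrature that $q$ has total mass one and that its mean is shifted by $\sigma\kappa(t-s)$, in agreement with (32). The parameter values and the helper `integrate` are illustrative assumptions.

```python
import math

sigma, kappa = 0.8, 1.5          # illustrative parameter values (assumed)

def p(s, x, t, y):
    """Fundamental solution of the free equation of motion (30)."""
    var = sigma ** 2 * (t - s)
    return math.exp(-(y - x) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def phi(t, x):
    """Evolution function (31)."""
    return math.exp(-(kappa ** 2 / 2.0) * t + kappa * x / sigma)

def q(s, x, t, y):
    """New transition density defined by formula (22)."""
    return p(s, x, t, y) * phi(t, y) / phi(s, x)

def integrate(func, lo=-15.0, hi=15.0, n=6000):
    """Composite trapezoidal rule on [lo, hi]."""
    h = (hi - lo) / n
    total = 0.5 * (func(lo) + func(hi))
    for k in range(1, n):
        total += func(lo + k * h)
    return total * h

s, x, t = 0.0, 0.2, 1.0
mass_q = integrate(lambda y: q(s, x, t, y))        # normality condition (24)
mean_q = integrate(lambda y: y * q(s, x, t, y))    # mean position at time t
expected_mean = x + sigma * kappa * (t - s)        # drift predicted by (32)
print(mass_q, mean_q, expected_mean)
```

Multiplying and dividing by $\phi$ tilts the Gaussian kernel $p$ so that its mean moves with velocity $\sigma\kappa$.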

Example 2. We now consider the movement of a system governed by the twin equations of motion with Hooke's potential

$$
\begin{aligned}
\frac{\partial\phi}{\partial t} + \frac{1}{2}\sigma^2\frac{\partial^2\phi}{\partial x^2} - \frac{1}{2}\kappa^2x^2\phi &= 0,\\
-\frac{\partial\hat\phi}{\partial t} + \frac{1}{2}\sigma^2\frac{\partial^2\hat\phi}{\partial x^2} - \frac{1}{2}\kappa^2x^2\hat\phi &= 0.
\end{aligned}
\tag{33}
$$

We set, respectively,

$$
\phi = e^{\lambda t}\varphi(x) \qquad\text{and}\qquad \hat\phi = e^{-\lambda t}\varphi(x) \tag{34}
$$

in equation (33), then we get an eigenvalue problem

$$
-\frac{1}{2}\sigma^2\frac{d^2\varphi}{dx^2} + \frac{1}{2}\kappa^2x^2\varphi = \lambda\varphi.
$$

Hence the process has a stationary distribution density

$$
\mu(x) = \phi(t,x)\,\hat\phi(t,x) = \varphi^2(x),
$$

in view of (34). For the smallest eigenvalue $\lambda_0 = \sigma\kappa/2$, we have the associated eigenfunction

$$
\varphi(x) = \beta e^{-\kappa x^2/(2\sigma)}, \tag{35}
$$

where $\beta$ is a normalizing constant. The drift coefficient is therefore

$$
a(x) = \sigma^2\,\frac{d}{dx}\big(\log e^{\lambda t}\varphi(x)\big) = -\sigma\kappa x,
$$

in view of equation (21), hence the kinematic equation is

$$
\frac{\partial u}{\partial t} + \frac{1}{2}\sigma^2\frac{\partial^2 u}{\partial x^2} - \sigma\kappa x\,\frac{\partial u}{\partial x} = 0.
$$

Sample paths of the motion are given by solutions of a stochastic differential equation

$$
X_t = X_a + \sigma B_{t-a} - \sigma\kappa\int_a^t ds\,X_s. \tag{36}
$$

Drift $a(x) = -\sigma\kappa x$ induces the tendency of the movement towards the origin, added to the Brownian motion $\sigma B_{t-a}$, hence sample paths cannot stay far away from the origin for a long time. It is a stochastic generalization of the classic harmonic oscillation. In view of (35) it has the stationary distribution with a Gaussian density

$$
\mu(x) = \varphi^2(x) = \beta^2 e^{-\kappa x^2/\sigma}.
$$
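The sample path equation (36) can be simulated to see the stationary Gaussian density emerge. The sketch below uses an Euler–Maruyama discretization (a standard approximation, not a method stated in the text) with assumed parameter values, and compares the long-run average of $X_t^2$ with the variance $\sigma/(2\kappa)$ of the density $\beta^2 e^{-\kappa x^2/\sigma}$.

```python
import math
import random

sigma, kappa = 1.0, 1.0          # illustrative parameter values (assumed)
dt, n_steps = 0.01, 200_000      # step size and number of Euler steps (assumed)

rng = random.Random(42)
x = 0.0                          # start at the origin
sum_sq = 0.0
for _ in range(n_steps):
    dB = rng.gauss(0.0, math.sqrt(dt))
    # Euler-Maruyama step for dX = sigma dB - sigma * kappa * X dt, cf. (36)
    x += sigma * dB - sigma * kappa * x * dt
    sum_sq += x * x

empirical_var = sum_sq / n_steps             # time average of X^2 along the path
stationary_var = sigma / (2.0 * kappa)       # variance of beta^2 exp(-kappa x^2 / sigma)
print(empirical_var, stationary_var)
```

The two numbers agree up to discretization and sampling error, which illustrates that the restoring drift $-\sigma\kappa x$ keeps the path near the origin.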

4.8

Schrödinger's wave theory and dynamic theory

The relation between the dynamic theory and Schrödinger's wave theory should be explained. Although we can treat the general case with a vector potential, we will consider, for simplicity, the Schrödinger equations with a scalar potential $V(t,x)$ and with a constant coefficient $\sigma$, namely

$$
\begin{aligned}
i\,\frac{\partial\psi}{\partial t} + \frac{1}{2}\sigma^2\Delta\psi - V(t,x)\,\psi &= 0,\\
-i\,\frac{\partial\bar\psi}{\partial t} + \frac{1}{2}\sigma^2\Delta\bar\psi - V(t,x)\,\bar\psi &= 0,
\end{aligned}
\tag{37}
$$

where $\sigma^2 = h/(2\pi m)$, $h$ is Planck's constant and $m$ is the mass of an electron. We represent the solution of the first equation in (37) as a complex-valued exponential function

$$
\psi(t,x) = e^{R(t,x)+iS(t,x)}. \tag{38}
$$

We identify $R(t,x)$ and $S(t,x)$ of the complex-valued exponential function in (38) with the exponent of distribution and the exponent of motion $\{R, S\}$, and set

$$
\phi(t,x) = e^{R(t,x)+S(t,x)}, \qquad \hat\phi(t,x) = e^{R(t,x)-S(t,x)}. \tag{39}
$$

Then the real-valued exponential functions $\phi$ and $\hat\phi$ satisfy the twin equations of motion

$$
\begin{aligned}
\frac{\partial\phi}{\partial t} + \frac{1}{2}\sigma^2\Delta\phi + c(t,x)\,\phi &= 0,\\
-\frac{\partial\hat\phi}{\partial t} + \frac{1}{2}\sigma^2\Delta\hat\phi + c(t,x)\,\hat\phi &= 0,
\end{aligned}
\tag{40}
$$

where

$$
c(t,x) = -\big(V(t,x) + \tilde V(t,x)\big), \tag{41}
$$

with

$$
\tilde V(t,x) = \sigma^2(\nabla S)^2 + 2\,\frac{\partial S}{\partial t}, \tag{42}
$$

which we will call 'self potential', since it is not caused by external forces. Its physical meaning will be clarified later on in Remark 2. Moreover, we get a fundamental relation

$$
\psi(t,x)\,\bar\psi(t,x) = \phi(t,x)\,\hat\phi(t,x), \tag{43}
$$

between the wave function $\psi(t,x)$ and the evolution functions $\phi(t,x)$ and $\hat\phi(t,x)$. Equation (43) implies that the intensity $\psi(t,x)\,\bar\psi(t,x)$ of the complex-valued wave $\psi(t,x)$ coincides with the distribution density $\phi(t,x)\,\hat\phi(t,x)$ of the stochastic process $X_t$ determined by the equation of motion in (40). Cf. Nagasawa (1993, 2000) for a proof.

We will quickly look at the history of quantum physics. Based on the wave equation in (37), Schrödinger (1926) developed a wave theory of electrons. It was successful in computing the energy. But the wave theory failed in explaining the fact that an electron is always found at a point. He then recognized the necessity of a particle theory for electrons, and discussed the stochastic motion of particles in Schrödinger (1931). However, it was not fully successful, since he could not find the formula given in (27) (he did not know the existence of the exponent of motion $S(t,x)$), and postponed further discussions about the relation between his theory of stochastic motion and quantum mechanics. The dynamic theory of stochastic movement explained in the present exposition is a further development of an idea in Schrödinger (1931).

We carefully note that the Schrödinger equation in (37) is a complex-valued counterpart of the twin equations of motion in (40). The Schrödinger equation is the complex-valued evolution equation, and a useful mathematical device in the dynamic theory of stochastic motion of particles. In the last century people interpreted the Schrödinger equation as a wave equation, and reached conceptual confusion as a result.

4.9

Sample paths of motion governed by the Schrödinger equation

We will consider, for simplicity, the Schrödinger equation

$$
i\,\frac{\partial\psi}{\partial t} + \frac{1}{2}\sigma^2\Delta\psi - V(t,x)\,\psi = 0, \tag{44}
$$

whose solution we write as a complex-valued exponential function

$$
\psi(t,x) = e^{R(t,x)+iS(t,x)}. \tag{45}
$$

In Schrödinger's wave theory, $\psi(t,x)$ is a complex-valued wave function. However, we will regard equation (44) not as a wave equation but as a complex-valued counterpart of the twin equations of motion of particles in (40), based on the equivalence explained in the preceding section. Hence we identify $R(t,x)$ and $S(t,x)$ of $\psi(t,x)$ in (45) with the exponent of distribution and the exponent of motion. We can then apply the first formula in equation (27), and get a drift vector

$$
a(t,x) = \sigma^2(\nabla R(t,x) + \nabla S(t,x)), \tag{46}
$$

with which we define the equation of sample paths in (7). In this case with a constant $\sigma$ we get

$$
X_t = X_a + \sigma B_{t-a} + \int_a^t a(s,X_s)\,ds, \tag{47}
$$

where $B_t$ denotes the $d$-dimensional Brownian motion, and $X_a$ is an initial position, which is a random variable independent of the Brownian motion $B_t$.

Example 3. We consider the Schrödinger equation in one dimension with no potential function

$$
i\,\frac{\partial\psi}{\partial t} + \frac{1}{2}\sigma^2\frac{\partial^2\psi}{\partial x^2} = 0.
$$

We take a special solution

$$
\psi(t,x) = e^{\kappa x/\sigma + i(\kappa^2/2)\,t}.
$$

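Based on the equivalence explained in the preceding section, we identify

$$
R(t,x) = \frac{\kappa}{\sigma}\,x \qquad\text{and}\qquad S(t,x) = \frac{\kappa^2 t}{2}
$$

of $\psi(t,x)$ with the exponent of distribution and the exponent of motion. Then we get

$$
a(t,x) = \sigma^2\,\frac{\partial R}{\partial x} = \sigma\kappa,
$$

by equation (46). Therefore, the equation of sample paths in (47) is in this simple case

$$
X_t = X_a + \sigma B_{t-a} + \sigma\kappa\,(t-a),
$$

which coincides with equation (32) of the free motion with a constant drift $\sigma\kappa$.

Example 4. We consider the Schrödinger equation in one dimension

$$
i\,\frac{\partial\psi}{\partial t} + \frac{1}{2}\sigma^2\frac{\partial^2\psi}{\partial x^2} - \frac{1}{2}\kappa^2x^2\psi = 0,
$$

with Hooke's potential $V(x) = \kappa^2x^2/2$. Substituting $\psi(t,x) = e^{-i\lambda t}\varphi(x)$, we get

$$
-\frac{1}{2}\sigma^2\frac{d^2\varphi}{dx^2} + \frac{1}{2}\kappa^2x^2\varphi = \lambda\varphi.
$$

For the smallest eigenvalue $\lambda_0 = \sigma\kappa/2$, we have the associated eigenfunction $\varphi(x) = \beta e^{-\kappa x^2/(2\sigma)}$, hence $\psi(t,x) = \beta e^{-\kappa x^2/(2\sigma)}e^{-i\sigma\kappa t/2}$. We then identify

$$
R(t,x) = -\frac{\kappa}{2\sigma}\,x^2 + \log\beta \qquad\text{and}\qquad S(t,x) = -\frac{\sigma\kappa t}{2}
$$

of $\psi(t,x)$ with the exponent of distribution and the exponent of motion. Then we get

$$
a(t,x) = \sigma^2\,\frac{\partial R}{\partial x} = -\sigma\kappa x,
$$

by equation (46). Therefore, the equation of sample paths in (47) is in this case

$$
X_t = X_a + \sigma B_{t-a} - \sigma\kappa\int_a^t ds\,X_s,
$$

which coincides with equation (36), which is the sample path equation of the stochastic generalization of the classic harmonic oscillation.

Example 5. We consider the Schrödinger equation with Hooke's potential in two dimensions in the polar coordinates

$$
i\,\frac{\partial\psi}{\partial t} + \frac{1}{2}\sigma^2\left(\frac{1}{r}\frac{\partial}{\partial r}\Big(r\,\frac{\partial\psi}{\partial r}\Big) + \frac{1}{r^2}\frac{\partial^2\psi}{\partial\eta^2}\right) - \frac{1}{2}\kappa^2r^2\psi = 0, \tag{48}
$$

where $\sigma^2 = \hbar/m$, $\hbar = h/(2\pi)$ and $\kappa$ is a constant. By substituting $\psi(t,r,\eta) = e^{-i\lambda t}\psi(r,\eta)$, and by separating variables as $\psi(r,\eta) = R(r)\,\Phi(\eta)$, we get

$$
\begin{aligned}
-\frac{1}{2}\sigma^2\,\frac{1}{r}\frac{d}{dr}\Big(r\,\frac{dR}{dr}\Big) + \left(\frac{1}{2}\kappa^2r^2 + \frac{1}{2}\sigma^2\frac{m^2}{r^2}\right)R &= \lambda R,\\
-\frac{d^2\Phi}{d\eta^2} &= m^2\,\Phi.
\end{aligned}
$$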

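For the angular equation we take complex-valued solutions

$$
\Phi = e^{im\eta}, \qquad m = 0, \pm1, \pm2, \dots.
$$

Solutions to the eigenvalue problem of the radial part are known. The eigenvalues are $\lambda_{|m|+n} = \sigma\kappa(|m|+n+1)$, where $m = 0, \pm1, \pm2, \dots$ and $n = 0, 2, 4, \dots$. Associated eigenfunctions are given by

$$
R_{m,n}(r) = F_{|m|,n}\!\left(\sqrt{\frac{\kappa}{\sigma}}\;r\right)\exp\!\left(-\frac{\kappa}{2\sigma}\,r^2\right),
$$

where $F_{|m|,n}(x)$ is a polynomial function of $x$, cf. (11–13) in Pauling and Wilson (1935).

For the case of the smallest eigenvalue $\lambda_0 = \sigma\kappa$ ($m=0$, $n=0$, and $F_{|0|,0}(r) = 1$) we have $\psi(t,r,\eta) = \beta e^{-\kappa r^2/(2\sigma) - i\sigma\kappa t}$. We identify

$$
R = -\frac{\kappa r^2}{2\sigma} \qquad\text{and}\qquad S = -\sigma\kappa t
$$

with the exponent of distribution and the exponent of motion. Then we get the drift function of the motion

$$
a(r) = \sigma^2\,\frac{\partial R}{\partial r} = -\sigma\kappa r,
$$

in view of (46). The kinematic equation is therefore

$$
\frac{\partial u}{\partial t} + \frac{1}{2}\sigma^2\left(\frac{1}{r}\frac{\partial}{\partial r}\Big(r\,\frac{\partial u}{\partial r}\Big) + \frac{1}{r^2}\frac{\partial^2 u}{\partial\eta^2}\right) - \sigma\kappa r\,\frac{\partial u}{\partial r} = 0.
$$

To get stochastic differential equations, we must expand the Laplacian, and rewrite it as

$$
\frac{\partial u}{\partial t} + \frac{1}{2}\sigma^2\left(\frac{\partial^2 u}{\partial r^2} + \frac{1}{r^2}\frac{\partial^2 u}{\partial\eta^2}\right) + \frac{1}{2}\sigma^2\,\frac{1}{r}\frac{\partial u}{\partial r} - \sigma\kappa r\,\frac{\partial u}{\partial r} = 0,
$$

in the form of equation (2). We then get the sample path equations

$$
dr_t = \sigma\,dB_t^1 + \left(\frac{1}{2}\sigma^2\frac{1}{r_t} - \sigma\kappa r_t\right)dt, \qquad d\eta_t = \frac{\sigma}{r_t}\,dB_t^2,
$$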

where $B_t^1$ and $B_t^2$ are independent one-dimensional Brownian motions. In this case we can also write the kinematic equation in the rectangular coordinates as

$$
\frac{\partial u}{\partial t} + \frac{1}{2}\sigma^2\left(\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}\right) - \sigma\kappa\left(x\,\frac{\partial u}{\partial x} + y\,\frac{\partial u}{\partial y}\right) = 0.
$$

We then get a Markov process $X_t = (X_t^1, X_t^2)$, which is a solution of a system of stochastic differential equations

$$
X_t^i = X_a^i + \sigma B_{t-a}^i - \sigma\kappa\int_a^t ds\,X_s^i, \qquad i = 1, 2.
$$

Hence $X_t = (X_t^1, X_t^2)$ has a stationary distribution density

$$
\mu = \psi\bar\psi = \varphi^2 = \beta^2 e^{-\kappa r^2/\sigma},
$$

and is the stochastic generalization of the classic harmonic oscillation in two dimensions.

We now choose $m = \pm1$ and $n = 0$ for the first excited eigenvalue $\lambda_1 = 2\sigma\kappa$. In this case, $F_{|\pm1|,0}(r) = 2r$, and we get two complex-valued eigenfunctions

$$
\begin{aligned}
\varphi_{+1}(r,\eta) &= \beta r\,e^{-\kappa r^2/(2\sigma)}\,e^{i\eta}, && \text{for } m = 1,\\
\varphi_{-1}(r,\eta) &= \beta r\,e^{-\kappa r^2/(2\sigma)}\,e^{-i\eta}, && \text{for } m = -1.
\end{aligned}
\tag{49}
$$

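With the eigenfunctions, we define complex-valued functions by

$$
\psi_{\pm1}(t,r,\eta) = e^{-i2\sigma\kappa t}\,\varphi_{\pm1}(r,\eta) = \beta r\,e^{-\kappa r^2/(2\sigma) + i(-2\sigma\kappa t \pm \eta)},
$$

which satisfy the Schrödinger equation in (48). We identify

$$
R = \log\beta r - \frac{\kappa}{2\sigma}\,r^2 \qquad\text{and}\qquad S_{\pm1} = -2\sigma\kappa t \pm \eta
$$

of $\psi_{\pm1}(t,r,\eta)$ with the exponent of distribution and the exponent of motion. In view of equation (46) we get

$$
\begin{aligned}
a^{r}(r,\eta) &= \sigma^2\,\frac{\partial R}{\partial r} = \sigma^2\frac{1}{r} - \sigma\kappa r,\\
a^{\eta}_{\pm1}(r,\eta) &= \sigma^2\,\frac{1}{r}\frac{\partial S_{\pm1}}{\partial\eta} = \pm\frac{\sigma^2}{r},
\end{aligned}
\tag{50}
$$

which give the drift vectors in the polar coordinates. Therefore, the kinematic equation is

$$
\frac{\partial u}{\partial t} + \frac{1}{2}\sigma^2\left(\frac{1}{r}\frac{\partial}{\partial r}\Big(r\,\frac{\partial u}{\partial r}\Big) + \frac{1}{r^2}\frac{\partial^2 u}{\partial\eta^2}\right) + \left(\sigma^2\frac{1}{r} - \sigma\kappa r\right)\frac{\partial u}{\partial r} \pm \frac{\sigma^2}{r}\,\frac{1}{r}\frac{\partial u}{\partial\eta} = 0,
$$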

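that is

$$
\frac{\partial u}{\partial t} + \frac{1}{2}\sigma^2\left(\frac{\partial^2 u}{\partial r^2} + \frac{1}{r^2}\frac{\partial^2 u}{\partial\eta^2}\right) + \frac{1}{2}\sigma^2\,\frac{1}{r}\frac{\partial u}{\partial r} + \left(\sigma^2\frac{1}{r} - \sigma\kappa r\right)\frac{\partial u}{\partial r} \pm \frac{\sigma^2}{r}\,\frac{1}{r}\frac{\partial u}{\partial\eta} = 0. \tag{51}
$$

The kinematic equation (51) then implies the sample path equations

$$
\begin{aligned}
dr_t &= \sigma\,dB_t^1 + \frac{1}{2}\sigma^2\frac{1}{r_t}\,dt + \left(\sigma^2\frac{1}{r_t} - \sigma\kappa r_t\right)dt,\\
d\eta_t &= \frac{\sigma}{r_t}\,dB_t^2 \pm \frac{\sigma^2}{r_t^2}\,dt,
\end{aligned}
\tag{52}
$$

where $B_t^1$ and $B_t^2$ are independent one-dimensional Brownian motions. The radial motion $r_t$ does not hit the origin because of repulsive drift $3\sigma^2/(2r)$ (i.e., the Bessel process, cf. McKean (1960)), and is attracted by drift $-\sigma\kappa r$. Moreover, solving $(3/2)\sigma^2/r - \sigma\kappa r = 0$, we get

$$
\bar r = \sqrt{\frac{3\sigma}{2\kappa}}.
$$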
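The zero of the radial drift can be cross-checked numerically. The sketch below finds the root of $3\sigma^2/(2r) - \sigma\kappa r$ by bisection and compares it with $\sqrt{3\sigma/(2\kappa)}$. As an additional observation that is not stated in the text, it also locates the mode of the radial density implied by $\mu = e^{2R} = \beta^2 r^2 e^{-\kappa r^2/\sigma}$ with the area element $r\,dr\,d\eta$, that is, the maximum of $r^3 e^{-\kappa r^2/\sigma}$. Parameter values and helper names are illustrative assumptions.

```python
import math

sigma, kappa = 0.7, 1.3          # illustrative parameter values (assumed)

def radial_drift(r):
    """Total radial drift in (52): (1/2) sigma^2 / r + sigma^2 / r - sigma * kappa * r."""
    return 1.5 * sigma ** 2 / r - sigma * kappa * r

def bisect(f, lo, hi, tol=1e-12):
    """Root of a function that is positive at lo and negative at hi."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

r_bar_numeric = bisect(radial_drift, 1e-3, 10.0)
r_bar_formula = math.sqrt(3.0 * sigma / (2.0 * kappa))

def log_radial_density(r):
    """log of r^3 exp(-kappa r^2 / sigma), up to an additive constant."""
    return 3.0 * math.log(r) - kappa * r ** 2 / sigma

grid = [k * 1e-4 for k in range(1, 50001)]    # r in (0, 5]
r_mode = max(grid, key=log_radial_density)

print(r_bar_numeric, r_bar_formula, r_mode)
```

In this computation the zero of the drift and the mode of the radial density coincide, which matches the picture of a particle attracted toward the radius $\bar r$.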

If $r < \bar r$, then the radial drift function is positive, while it is negative for $r > \bar r$. Hence our particle is attracted toward the zero point $\bar r$ of the drift function. The angular motion $\eta_t$ with drift $\pm\sigma^2/r^2$ induces rotational motion. Therefore, the particle makes random motion in a two-dimensional (American) doughnut of an average radius $\bar r$, with anti-clockwise rotation for the drift $\sigma^2/r^2$ and clockwise rotation for the drift $-\sigma^2/r^2$, respectively.

Remark 2. For the case of the smallest eigenvalue $\lambda_0 = \sigma\kappa$, we can understand the motion quite naturally as the stochastic generalization of the classic harmonic oscillation in two dimensions. But, for the first excited eigenvalue $\lambda_1 = 2\sigma\kappa$, it is not so easy to understand the motion, if we only look at the Schrödinger equation in (48). In fact, it seems our particle knows that it must be in a two-dimensional doughnut. But how did it know this? This is hard to understand, since there is only Hooke's potential $\kappa^2 r^2/2$ in the Schrödinger equation. How can Hooke's force confine our particle in the doughnut?

Now, let us look at the twin equations of motion in (40). They have a potential function $-c(r) = V(r) + \tilde V(r)$ with the self-potential

$$\tilde V(r) = \sigma^2(\nabla S)^2 + 2\frac{\partial S}{\partial t},$$

and in the case of the first excited eigenvalue $\lambda_1 = 2\sigma\kappa$

$$\tilde V(r) = \sigma^2/r^2 - 4\sigma\kappa.$$

Hence $-c(r)$ is $\infty$ at the origin. Moreover, we have $dc(r)/dr = -\kappa^2 r + 2\sigma^2/r^3$, hence the potential function $-c(r)$ becomes minimum at

$$\tilde r = \sqrt{\sqrt{2}\,\sigma/\kappa}.$$
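The location of this minimum can be verified directly from the potential $-c(r) = \kappa^2 r^2/2 + \sigma^2/r^2 - 4\sigma\kappa$. The sketch below assumes illustrative values of σ and κ (not from the text) and hypothetical function names; it also prints the zero of the radial drift, $\sqrt{3\sigma/(2\kappa)}$, for comparison, since the two radii are close but not identical.

```python
import math

def potential(r, sigma, kappa):
    """-c(r) = V(r) + V~(r) for the first excited state:
    Hooke's potential plus the self-potential sigma^2/r^2 - 4 sigma kappa."""
    return 0.5 * kappa**2 * r**2 + sigma**2 / r**2 - 4.0 * sigma * kappa

def potential_slope(r, sigma, kappa):
    """d(-c)/dr = kappa^2 r - 2 sigma^2 / r^3."""
    return kappa**2 * r - 2.0 * sigma**2 / r**3

# Illustrative values only (assumption, not from the text).
sigma, kappa = 1.0, 2.0
r_tilde = math.sqrt(math.sqrt(2.0) * sigma / kappa)   # minimum of -c(r)
r_bar = math.sqrt(1.5 * sigma / kappa)                # zero of the radial drift

print("r_tilde:", r_tilde, " slope there:", potential_slope(r_tilde, sigma, kappa))
print("r_bar  :", r_bar)
print("-c(r) at 0.9 r_tilde, r_tilde, 1.1 r_tilde:",
      potential(0.9 * r_tilde, sigma, kappa),
      potential(r_tilde, sigma, kappa),
      potential(1.1 * r_tilde, sigma, kappa))
```
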

Therefore, our particle must stay near the radius $\tilde r$. We now understand that our particle learned from the potential function $-c(r)$ that it must be in the two-dimensional doughnut. However, to see the motion of our particle we must analyze the kinematic equation in (51) or (52), in particular the drift vector given by (50). Through this we have seen that our particle makes the rotational motion in the doughnut together with the random motion. ∇

Remark 3. We once more look at the radial motion $r_t$ described by

$$\frac{\partial u}{\partial t} + \frac{1}{2}\sigma^2\frac{\partial^2 u}{\partial r^2} + \left(\frac{3}{2}\sigma^2\frac{1}{r} - \sigma\kappa r\right)\frac{\partial u}{\partial r} = 0.$$

To have another view of the motion, we apply the so-called time change, that is, $X_t(\omega) = r_{\tau^{-1}(t,\omega)}(\omega)$, where

$$\tau^{-1}(t,\omega) = \sup\{s;\ \tau(s,\omega) \le t\}, \qquad \tau(t,\omega) = \int_0^t \alpha(r_s(\omega))\,ds,$$

and $\alpha(r) = 1/r$, cf. e.g., Nagasawa (1993, 2000). Then the motion $X_t = r_{\tau^{-1}(t)}$ satisfies a stochastic differential equation

$$dX_t = \sigma\sqrt{X_t}\,dB_t + \left(3\sigma^2/2 - \sigma\kappa X_t^2\right)dt,$$

in which the coefficient $\sigma\sqrt{x}$ vanishes at the origin, but the drift has no singularity. ∇

Example 6. We consider the motion of an electron in a hydrogen atom and take the Schrödinger equation

$$i\frac{\partial\psi}{\partial t} + \frac{1}{2}\sigma^2\Delta\psi - V(r)\psi = 0,$$

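with the Coulomb potential $V(r) = -\alpha/r$, where $\alpha$ is a constant. In the spherical coordinates $(r, \theta, \eta)$, $x = r\sin\theta\cos\eta$, $y = r\sin\theta\sin\eta$ and $z = r\cos\theta$, it is

$$i\frac{\partial\psi}{\partial t} + \frac{1}{2}\sigma^2\left\{\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial\psi}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial\psi}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2\psi}{\partial\eta^2}\right\} - V(r)\psi = 0. \tag{53}$$

Substituting $\psi = e^{-i\lambda t}\phi$, we get

$$-\frac{1}{2}\sigma^2\left\{\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial\phi}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial\phi}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2\phi}{\partial\eta^2}\right\} + V(r)\phi = \lambda\phi.$$

We then apply the separation of variables, namely by substituting $\phi = R(r)\Theta(\theta)\Phi(\eta)$, we get

$$-\frac{d^2\Phi}{d\eta^2} = m^2\Phi,$$

$$-\frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\frac{d\Theta}{d\theta}\right) + \frac{m^2}{\sin^2\theta}\Theta = \beta\Theta,$$

$$-\frac{1}{2}\sigma^2\left\{\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) - \frac{\beta}{r^2}R\right\} + V(r)R = \lambda_n R,$$

where

$$\lambda_n = -\frac{\alpha^2}{2\sigma^2}\frac{1}{n^2}, \quad n = 1, 2, 3, \ldots,$$

and

$$\beta = l(l+1), \quad l = |m|, |m|+1, \ldots.$$

For each $n$ we can choose

$$l = 0, 1, 2, \ldots, n-1, \qquad m = 0, \pm 1, \pm 2, \ldots, \pm l, \tag{54}$$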

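cf. e.g. section V-21 of Pauling and Wilson (1935). Each $(n, l, m)$ determines the motion of an electron in a hydrogen atom.

For $n = 1$, $(n, l, m) = (1, 0, 0)$, we get a solution $\phi(r, \theta, \eta) = \beta e^{-\alpha r/\sigma^2}$, where $\beta$ is a normalizing constant. Hence the distribution of the motion $X_t = (r_t, \theta_t, \eta_t)$ is given by

$$P[X_t \in d(r, \theta, \eta)] = \beta^2 e^{-2\alpha r/\sigma^2}\, r^2\,dr\,\sin\theta\,d\theta\,d\eta.$$

The solution of the Schrödinger equation is

$$\psi(t, (r, \theta, \eta)) = \beta e^{-\alpha r/\sigma^2 - i\lambda_1 t}.$$

By identifying $-\alpha r/\sigma^2$ and $-\lambda_1 t$ with the exponents of distribution and motion, we get the evolution function

$$\varphi(t, (r, \theta, \eta)) = \beta e^{-\alpha r/\sigma^2 - \lambda_1 t}.$$

The evolution function determines a drift vector by

$$a = \sigma^2\frac{\operatorname{grad}\varphi}{\varphi} = (-\alpha, 0, 0).$$

Therefore, the kinematic equation in the spherical coordinates is

$$\frac{\partial u}{\partial t} + \frac{1}{2}\sigma^2\left\{\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial u}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial u}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2 u}{\partial\eta^2}\right\} - \alpha\frac{\partial u}{\partial r} = 0.$$

The kinematic equation describes the motion of an electron in a hydrogen atom. But to see sample paths of the motion we need stochastic differential equations. To get them we first expand the Laplacian in the kinematic equation, and rewrite it as

$$\frac{\partial u}{\partial t} + \frac{1}{2}\sigma^2\frac{\partial^2 u}{\partial r^2} + \frac{1}{2}\frac{\sigma^2}{r^2}\frac{\partial^2 u}{\partial\theta^2} + \frac{1}{2}\frac{\sigma^2}{r^2\sin^2\theta}\frac{\partial^2 u}{\partial\eta^2} + \sigma^2\frac{1}{r}\frac{\partial u}{\partial r} + \frac{1}{2}\frac{\sigma^2}{r^2}\cot\theta\frac{\partial u}{\partial\theta} - \alpha\frac{\partial u}{\partial r} = 0,$$

in the form of equation (2). Then the kinematic equation in this form implies the sample path equations for the motion of an electron in a hydrogen atom, when $n = 1$,

$$dr_t = \sigma\,dB_t^1 + \sigma^2\frac{1}{r_t}\,dt - \alpha\,dt,$$

$$d\theta_t = \frac{\sigma}{r_t}\,dB_t^2 + \frac{1}{2}\sigma^2\frac{1}{r_t^2}\cot\theta_t\,dt,$$

$$d\eta_t = \frac{\sigma}{r_t\sin\theta_t}\,dB_t^3,$$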

where $B_t^1$, $B_t^2$ and $B_t^3$ are independent one-dimensional Brownian motions, $r_t > 0$, $\theta_t \in (0, \pi)$ and $\eta_t$ is the angular motion. The radial motion $r_t$ does not hit the origin because of the repulsive drift $\sigma^2/r$ and is attracted with drift $-\alpha$ (see Remark 3 and Remark 4 below). Solving $\sigma^2/r - \alpha = 0$, we get $\bar r = \sigma^2/\alpha$. If $r < \bar r$, then the radial drift function is positive, while it is negative for $r > \bar r$. Hence our electron is attracted toward the zero point $\bar r$ of the drift function, and often moves near the radius $\bar r$. Since $\sigma^2 = h/2\pi m$ and $\alpha = 2\pi e^2/h$, where $h$ is the Planck constant, and $m$ and $-e$ are the mass and electric charge of an electron, we have

$$\bar r = \frac{h^2}{4\pi^2 m e^2}.$$

The classic Bohr radius agrees with this. The drift term $\sigma^2\cot\theta/(2r^2)$ of the motion $\theta_t$ is singular at $\theta = 0$ and $\pi$, hence the electron is repelled from the z-axis and moves near the xy-plane. We carefully note that our electron does not rotate around a proton in this case of the lowest energy, since the angular motion $\eta_t$ has no drift.

Remark 4. Stochastic processes with drift can be treated with the help of the Maruyama–Girsanov theorem, cf., e.g. Nagasawa (1993, 2000). We consider, for instance, the radial motion. Let $\{B_t, P\}$ be a one-dimensional Brownian motion, and set $X_t = X_a + \sigma B_{t-a}$. We define an exponential functional by

$$M_a^t = \exp\left\{\int_a^t \sigma^{-1}a(X_s)\,dB_s - \frac{1}{2}\int_a^t \left(\sigma^{-1}a(X_s)\right)^2 ds\right\},$$

where

$$a(r) = \sigma^2\frac{1}{r} - \alpha,$$

and a new probability measure $R$ by

$$\frac{dR}{dP} = M_a^b.$$

Moreover, set

$$\sigma\tilde B_{t-a} = \sigma B_{t-a} - \int_a^t a(X_s)\,ds.$$

Then $\{\tilde B_t, R\}$ is a one-dimensional Brownian motion, and $X_t$ can be written as

$$X_t = X_a + \sigma\tilde B_{t-a} + \int_a^t a(X_s)\,ds,$$

hence $\{X_t, R\}$ gives the radial motion $r_t$ with the drift vector $a(r) = \sigma^2/r - \alpha$ under the transformed probability measure $R = M_a^b P$. Therefore, $R[f(X_t)] = P[M_a^t f(X_t)]$. ∇
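The radial motion for $n = 1$, $dr_t = \sigma\,dB_t^1 + (\sigma^2/r_t - \alpha)\,dt$, can be explored with a simple Euler–Maruyama discretization. The sketch below is only an illustration under stated assumptions: the parameter values, step size, seed and function name are hypothetical, and the reflection at a small positive floor is a numerical guard of the discretization, not part of the theory. Since the radial part of the distribution given above is proportional to $r^2 e^{-2\alpha r/\sigma^2}$, its mean is $3\sigma^2/(2\alpha)$, and the long-run time average of the simulated path should come out roughly there, somewhat above the zero $\sigma^2/\alpha$ of the drift.

```python
import math
import random

def time_average_radius(sigma, alpha, r0, dt, n_steps, seed=0):
    """Euler-Maruyama scheme for dr = sigma dB + (sigma^2/r - alpha) dt.
    Returns the time average of r along one simulated path."""
    rng = random.Random(seed)
    sqrt_dt = math.sqrt(dt)
    r = r0
    total = 0.0
    for _ in range(n_steps):
        drift = sigma**2 / r - alpha
        r = r + drift * dt + sigma * sqrt_dt * rng.gauss(0.0, 1.0)
        # Numerical guard (assumption of this sketch): the exact motion never
        # reaches the origin, but a discrete step can overshoot it.
        r = max(abs(r), 1e-2)
        total += r
    return total / n_steps

# Illustrative values only (assumption, not from the text).
sigma, alpha = 1.0, 1.0
r_bar = sigma**2 / alpha                  # zero of the radial drift
mean_theory = 1.5 * sigma**2 / alpha      # mean of density ~ r^2 exp(-2 alpha r / sigma^2)
mean_r = time_average_radius(sigma, alpha, r0=r_bar, dt=0.005, n_steps=600_000)

print("zero of drift      :", r_bar)
print("mean of density    :", mean_theory)
print("simulated time avg :", mean_r)
```

The time average is a statistical estimate, so repeated runs with other seeds will scatter around the theoretical mean.
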

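For $n = 2$, we can choose $(l, m)$ according to (54). We consider an interesting case of $(n, l, m) = (2, 1, \pm 1)$. In this case we get complex-valued solutions

$$\phi_{+1}(r, \theta, \eta) = \beta\frac{\alpha}{\sigma^2}\frac{r}{2}\,e^{-\alpha r/(2\sigma^2)}\sin\theta\,e^{i\eta},$$

$$\phi_{-1}(r, \theta, \eta) = \beta\frac{\alpha}{\sigma^2}\frac{r}{2}\,e^{-\alpha r/(2\sigma^2)}\sin\theta\,e^{-i\eta},$$

where $\beta$ is a normalizing constant, and solutions of the Schrödinger equation

$$\psi_{\pm 1}(t, (r, \theta, \eta)) = \beta\frac{\alpha}{\sigma^2}\frac{r}{2}\,e^{-\alpha r/(2\sigma^2) \pm i\eta - i\lambda_2 t}\sin\theta.$$

Therefore, the evolution functions are

$$\varphi_{\pm 1}(t, (r, \theta, \eta)) = \beta\frac{\alpha}{\sigma^2}\frac{r}{2}\,e^{-\alpha r/(2\sigma^2) \pm \eta - \lambda_2 t}\sin\theta.$$

The drift vectors determined by the evolution functions $\varphi\,(= \varphi_{\pm 1})$ are

$$a(r, \theta, \eta) = \sigma^2\frac{\operatorname{grad}\varphi}{\varphi} = \frac{\sigma^2}{\varphi}\left(\frac{\partial\varphi}{\partial r},\ \frac{1}{r}\frac{\partial\varphi}{\partial\theta},\ \frac{1}{r\sin\theta}\frac{\partial\varphi}{\partial\eta}\right),$$

and in the spherical coordinates we get

$$a_r(r, \theta, \eta) = \sigma^2\frac{1}{r} - \frac{\alpha}{2}, \qquad a_\theta(r, \theta, \eta) = \frac{\sigma^2}{r}\cot\theta, \qquad a_\eta(r, \theta, \eta) = \pm\frac{\sigma^2}{r\sin\theta}.$$

Therefore, the kinematic equation in the spherical coordinates is

$$\frac{\partial u}{\partial t} + \frac{1}{2}\sigma^2\left\{\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial u}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial u}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2 u}{\partial\eta^2}\right\} \pm a\cdot\nabla u = 0,$$

where

$$\pm a\cdot\nabla u = \left(\sigma^2\frac{1}{r} - \frac{\alpha}{2}\right)\frac{\partial u}{\partial r} + \frac{\sigma^2}{r}\cot\theta\,\frac{1}{r}\frac{\partial u}{\partial\theta} \pm \frac{\sigma^2}{r\sin\theta}\,\frac{1}{r\sin\theta}\frac{\partial u}{\partial\eta}.$$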

To get stochastic differential equations, we first expand the Laplacian in the kinematic equation, and rewrite it in the form of equation (2), i.e.,

$$\frac{\partial u}{\partial t} + \frac{1}{2}\sigma^2\frac{\partial^2 u}{\partial r^2} + \frac{1}{2}\frac{\sigma^2}{r^2}\frac{\partial^2 u}{\partial\theta^2} + \frac{1}{2}\frac{\sigma^2}{r^2\sin^2\theta}\frac{\partial^2 u}{\partial\eta^2} + \sigma^2\frac{1}{r}\frac{\partial u}{\partial r} + \frac{1}{2}\frac{\sigma^2}{r^2}\cot\theta\frac{\partial u}{\partial\theta} \pm a\cdot\nabla u = 0.$$

Then the kinematic equation in this form implies the sample path equations for the motion of an electron in a hydrogen atom, when $(n, l, m) = (2, 1, \pm 1)$,

$$dr_t = \sigma\,dB_t^1 + \sigma^2\frac{1}{r_t}\,dt + \left(\sigma^2\frac{1}{r_t} - \frac{\alpha}{2}\right)dt,$$

$$d\theta_t = \frac{\sigma}{r_t}\,dB_t^2 + \frac{1}{2}\frac{\sigma^2}{r_t^2}\cot\theta_t\,dt + \frac{\sigma^2}{r_t^2}\cot\theta_t\,dt,$$

$$d\eta_t = \frac{\sigma}{r_t\sin\theta_t}\,dB_t^3 \pm \frac{\sigma^2}{r_t^2\sin^2\theta_t}\,dt,$$

where $B_t^1$, $B_t^2$ and $B_t^3$ are independent one-dimensional Brownian motions, $r_t > 0$, $\theta_t \in (0, \pi)$ and $\eta_t$ is the angular motion. Because of the singularity of the radial drift function $2\sigma^2/r$ at the origin, our electron is repelled strongly from the origin and the origin is inaccessible, but the electron is attracted toward the origin with constant drift $-\alpha/2$ (see Remark 4). Solving $\sigma^2(2/r) - \alpha/2 = 0$, we get $\bar r = 4\sigma^2/\alpha$. If $r < \bar r$, then the radial drift function is positive, while it is negative for $r > \bar r$. Hence our electron is attracted toward the zero point $\bar r$ of the drift function, and often moves near the radius $\bar r$. Since $\sigma^2 = h/2\pi m$ and $\alpha = 2\pi e^2/h$, we have

$$\bar r = \frac{h^2}{4\pi^2 m e^2}\,2^2.$$
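As a numerical check of the radii $\bar r = \sigma^2/\alpha$ for $n = 1$ and $\bar r = 4\sigma^2/\alpha$ for $n = 2$, one can insert physical constants into $\sigma^2 = h/2\pi m$ and $\alpha = 2\pi e^2/h$. The sketch below assumes Gaussian (CGS) units, which is the unit system in which $\alpha = 2\pi e^2/h$ holds without further factors; the constant values are rounded standard values and the variable names are hypothetical.

```python
import math

# Rounded physical constants in Gaussian (CGS) units (assumption of this sketch).
H_PLANCK = 6.62607015e-27      # Planck constant h, erg s
M_ELECTRON = 9.1093837e-28     # electron mass m, g
E_CHARGE = 4.80320471e-10      # elementary charge e, statcoulomb

sigma2 = H_PLANCK / (2.0 * math.pi * M_ELECTRON)    # sigma^2 = h / (2 pi m)
alpha = 2.0 * math.pi * E_CHARGE**2 / H_PLANCK      # alpha = 2 pi e^2 / h

r_bar_n1 = sigma2 / alpha          # zero of the radial drift for n = 1
r_bar_n2 = 4.0 * sigma2 / alpha    # zero of the radial drift for (n, l, m) = (2, 1, +-1)

print("r_bar (n = 1) in cm:", r_bar_n1)
print("r_bar (n = 2) in cm:", r_bar_n2)
```

The first value should reproduce the familiar Bohr radius of roughly half an ångström, and the second is four times as large.
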

The classic Bohr radius for $n = 2$ agrees with this. The drift term $3\sigma^2\cot\theta/(2r^2)$ of the motion $\theta_t$ is singular at $\theta = 0$ and $\pi$, hence the electron is repelled strongly from the z-axis and moves near the xy-plane. Our electron therefore moves in a three-dimensional doughnut with an average radius $\bar r$ around a proton. Moreover, the drift terms $\pm\sigma^2/(r^2\sin^2\theta)$ of the angular motion $\eta_t$ induce rotational motion, hence the electron makes anti-clockwise rotation if $m = 1$, and clockwise rotation if $m = -1$. The rotational motion of the electron induces the magnetic moment of the hydrogen atom in this case of $(n, l, m) = (2, 1, \pm 1)$. For the case of further excited motion, cf. section 4.6 of Nagasawa (2000).

I remark here that we can more generally treat the motion of a charged particle in an electro-magnetic field. In this case we must handle the Schrödinger equation with a vector potential. An interesting case of the motion in a homogeneous magnetic field is analyzed in Nagasawa (2002*).

The analysis of the typical examples in this section clarifies that the Schrödinger equation is a complex-valued alternative of the twin equations of motion, and one can use it for computing the energy and distribution density of the motion, but one needs the dynamic theory of stochastic motion for analyzing the sample paths of the motion of particles. The Schrödinger equation is in fact the complex-valued evolution equation and its solutions are complex-valued evolution functions in the dynamic theory, and we can use them in analyzing the sample paths of the motion of particles.

4.10 Interference phenomena and entangled motion

The ‘interference’ is originally a notion in wave theories, and the stripes-like patterns of distributions in the problem of double slits have been understood in wave mechanics and quantum mechanics as the typical effect of the ‘wave property’ of electrons. This is not correct. Moreover, it has been said that the theory of stochastic processes cannot solve the double slits problem. This is, in a sense, true. As will be explained, Kolmogorov–Itô’s conventional theory of Markov processes does not have the mathematical structures for discussing the double slits problem. But does the dynamic theory of stochastic motion? We will explain that the stripes-like patterns of distributions in the double slits problem are a result caused by the entangled motion of an electron in the dynamic theory of stochastic motion.

We shoot an electron and observe it at a screen. In between the electron gun and the screen we set double slits. In the wave theory of electrons, the wave function $\psi$ splits into two parts $\psi_1$ and $\psi_2$ at the double slits, and at the screen we get $\psi_1(t, x) + \psi_2(t, x)$, which is called the superposition of the wave functions $\psi_1(t, x)$ and $\psi_2(t, x)$, where we ignore the normalization condition for simplicity. Therefore, the intensity of the wave is given by

$$|\psi_1(t, x) + \psi_2(t, x)|^2 = |\psi_1(t, x)|^2 + |\psi_2(t, x)|^2 + \left(\overline{\psi}_1(t, x)\psi_2(t, x) + \psi_1(t, x)\overline{\psi}_2(t, x)\right),$$

where $\overline{\psi}_1\psi_2 + \psi_1\overline{\psi}_2$ is the interference of the wave functions $\psi_1(t, x)$ and $\psi_2(t, x)$. However, a single electron arrives at a point on the screen, and does not show this intensity. This means that the wave theory failed in solving the problem of double slits.

Nevertheless, the intensity $|\psi_1 + \psi_2|^2$ is realized as a statistical distribution of many electrons. In fact, if we shoot electrons one by one, so that only a single electron is in the apparatus, then electrons will arrive at the screen one by one successively, and the distribution density of those electrons shows the existence of ‘interference’ $\overline{\psi}_1\psi_2 + \psi_1\overline{\psi}_2$ statistically. Therefore, this ‘interference’ cannot be the interference of waves in conventional wave theories, but it is a statistical effect, which is caused by the entanglement of motion. We will analyze it with the dynamic theory of stochastic motion.

We first note carefully that there is no ‘entanglement of motion’ in the classical theory of Markov processes, because one can prescribe only initial distributions in the conventional theory of Markov processes. To see this, let us assume that the path of an electron goes through either the first slit or the second slit. If we apply the classical theory of Markov processes to the motion of an electron after the double slits, the Markov process starts at the double slits with an initial distribution density $\frac{1}{2}(\mu_1(x) + \mu_2(x))$, where $\mu_1(x)$ and $\mu_2(x)$ depend on the width of the first and second slits and the distance of the slits. The Markov process will then arrive at the screen with the distribution density $\frac{1}{2}(\mu_1(t, x) + \mu_2(t, x))$, which shows no entanglement. We can understand this as follows: we consider two Markov processes starting at the two slits with the distribution densities $\mu_1(x)$ and $\mu_2(x)$, respectively. We make the superposition of two Markov processes. Through this superposition in the conventional theory of Markov processes we get no entanglement.
We now analyze the double slits problem with the dynamic theory of stochastic motion of particles, which is different from the conventional theory of Markov processes, as we have seen. In the dynamic theory, the motion of an electron is determined by an evolution function $\varphi(t, x)$ and a backward evolution function $\hat\varphi(t, x)$. At the double slits they are decomposed into

$$\varphi_1(t, x) = e^{R_1(t,x) + S_1(t,x)}, \quad \hat\varphi_1(t, x) = e^{R_1(t,x) - S_1(t,x)}, \qquad \varphi_2(t, x) = e^{R_2(t,x) + S_2(t,x)}, \quad \hat\varphi_2(t, x) = e^{R_2(t,x) - S_2(t,x)}. \tag{55}$$

To describe the motion of an electron after the double slits, we apply the equivalence of a complex-valued exponential function and a pair of real-valued exponential functions, explained in Section 4.8, in which we identify the pair of functions $\{R, S\}$ that appears as the exponents of the complex-valued function in equation (38) and as the exponents of the real-valued functions in equation (39). By using the pairs $\{R_1, S_1\}$ and $\{R_2, S_2\}$ in (55), we first define complex-valued exponential functions by

$$\psi_1(t, x) = e^{R_1(t,x) + iS_1(t,x)}, \qquad \psi_2(t, x) = e^{R_2(t,x) + iS_2(t,x)}, \tag{56}$$

and apply the superposition $\psi(t, x) = \psi_1(t, x) + \psi_2(t, x)$, where we ignore the normalization condition for simplicity. We then represent $\psi(t, x)$ in the exponential form as

$$\psi(t, x) = e^{R^*(t,x) + iS^*(t,x)}. \tag{57}$$

Let $t = t_0$ at the double slits. Then $\{R^*(t_0, x), S^*(t_0, x)\}$ is the initial condition at $t = t_0$ for the motion of an electron after the double slits. We define a pair of real-valued exponential functions

$$\varphi(t, x) = e^{R^*(t,x) + S^*(t,x)}, \qquad \hat\varphi(t, x) = e^{R^*(t,x) - S^*(t,x)}, \tag{58}$$

by using $\{R^*(t, x), S^*(t, x)\}$ in (57). For the motion of an electron after the double slits, we thus get the evolution function $\varphi(t, x)$ and time-reversed evolution function $\hat\varphi(t, x)$, which determine a Markov process with an entangled drift vector

$$a(t, x) = \sigma^2\nabla\left(R^*(t, x) + S^*(t, x)\right),$$

in view of (27), hence the Markov process is described by the kinematic equation

$$\frac{\partial u}{\partial t} + \frac{1}{2}\sigma^2\Delta u + a(t, x)\cdot\nabla u = 0,$$

where $\sigma^2 = \hbar/m$ with $\hbar = h/2\pi$. In this way we get an entangled Markov process $\{X_t, Q\}$ which describes the motion of an electron from the double slits to the screen. Then, in view of the basic relation in (43), the distribution density of the entangled process is given by

$$\mu(t, x) = \hat\varphi(t, x)\varphi(t, x) = \overline{\psi}(t, x)\psi(t, x) = |\psi_1(t, x) + \psi_2(t, x)|^2 = |\psi_1(t, x)|^2 + |\psi_2(t, x)|^2 + \psi_1(t, x)\overline{\psi}_2(t, x) + \overline{\psi}_1(t, x)\psi_2(t, x), \tag{59}$$

where $\psi_1(t, x)$ and $\psi_2(t, x)$ are given by (56). Therefore, we get

$$\mu(t, x) = e^{2R_1(t,x)} + e^{2R_2(t,x)} + 2e^{R_1(t,x) + R_2(t,x)}\cos\left(S_1(t, x) - S_2(t, x)\right). \tag{60}$$

One shoots electrons one by one, with a time lag long enough that only a single electron is in the apparatus, to avoid interactions between electrons. Then electrons arrive on the screen one by one, and the distribution density of the electrons on the screen statistically shows the effect of entangled motion

$$2e^{R_1(t,x) + R_2(t,x)}\cos\left(S_1(t, x) - S_2(t, x)\right). \tag{61}$$

This should not be confused with the interference of waves. Let me remind you that we have discussed the double slits problem in the framework of the dynamic theory of random motion, which is a particle theory, not a wave theory. As remarked already, the wave theory cannot explain the double slits experiment.

The problem of double slits was for a long time a problem of gedankenexperiments. It was, however, realized experimentally by Tonomura (cf. e.g. Tonomura (1994)), and his experiment clearly shows that the stripes-like pattern of the distribution density of many electrons in the double slits problem is not a wave phenomenon but a purely statistical phenomenon. I emphasize that if an electron is a wave and the stripes-like pattern is caused by the wave, then the pattern must already appear with a single electron. But this does not occur. Therefore, Tonomura’s experiment of double slits definitively denies the claim in Schrödinger’s wave mechanics that an electron is a wave, and also the claim in quantum mechanics that an electron has the wave property. The experiment justifies the dynamic theory of stochastic motion.

We have shown that the stripes-like pattern is not caused by the so-called wave ‘property’ of an electron, but it is a ‘phenomenon’, which is produced by the motion of an electron, i.e., by the ‘entanglement of the motion’ of an electron. Property and phenomenon are different notions, and we should distinguish them clearly.

We notice moreover that the existence of the exponent of motion $S(t, x)$ plays a decisive role in equation (61), which is the formula on the effect of entanglement of motion. This fact had not been recognized in the conventional theory of Markov processes. We carefully note that the exponent of motion $S(t, x)$ was found through the analysis of time reversal of Markov processes and the duality relation between semi-groups of a space-time Markov process and its time reversed process, cf. Nagasawa (1993, 2000).
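The algebra behind (59) and (60) can be checked numerically at a single point $(t, x)$. The following is a minimal sketch: the sample values of $R_1, S_1, R_2, S_2$ are arbitrary illustrations, and the function names are hypothetical. It builds $\psi_1 + \psi_2$ as in (56), extracts $\{R^*, S^*\}$ as in (57), forms the pair (58), and compares the product $\hat\varphi\varphi$ with the right-hand side of (60) and with the sum $e^{2R_1} + e^{2R_2}$ that lacks the entanglement term (61).

```python
import cmath
import math

def superposed_density(R1, S1, R2, S2):
    """|psi_1 + psi_2|^2 with psi_k = exp(R_k + i S_k), as in (56) and (59)."""
    psi = cmath.exp(complex(R1, S1)) + cmath.exp(complex(R2, S2))
    return abs(psi) ** 2

def density_formula_60(R1, S1, R2, S2):
    """Right-hand side of (60)."""
    return (math.exp(2 * R1) + math.exp(2 * R2)
            + 2 * math.exp(R1 + R2) * math.cos(S1 - S2))

def evolution_pair(R1, S1, R2, S2):
    """Pair (58) built from {R*, S*} of the superposition (57).
    S* is only determined up to multiples of 2 pi; the principal value is used."""
    psi = cmath.exp(complex(R1, S1)) + cmath.exp(complex(R2, S2))
    R_star = math.log(abs(psi))
    S_star = cmath.phase(psi)
    return math.exp(R_star + S_star), math.exp(R_star - S_star)

# Arbitrary sample exponents (R1, S1, R2, S2) at one point (t, x); illustration only.
samples = [(-0.3, 0.4, -0.1, 2.0), (0.2, 1.0, 0.2, 1.0 + math.pi / 2), (-1.0, 0.0, 0.5, 3.0)]

for R1, S1, R2, S2 in samples:
    phi, phi_hat = evolution_pair(R1, S1, R2, S2)
    no_entanglement = math.exp(2 * R1) + math.exp(2 * R2)
    print(phi * phi_hat, density_formula_60(R1, S1, R2, S2), no_entanglement)
```

In each printed row the first two numbers agree, while the third differs from them by the term (61), which vanishes only where $\cos(S_1 - S_2) = 0$, as in the second sample.
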

Acknowledgements: I would like to express my gratitude to Professor Bjarne S. Jensen of Copenhagen Business School, who kindly arranged for me to give a lecture at the University of Copenhagen in 2004 on the theme explained in this exposition.

References:

Aspect, A., Dalibard, J., and Roger, G. (1982) “Experimental Test of Bell’s Inequalities Using Time-Varying Analyzers.” Physical Review Letters 49: 1804–1807.

Bell, J.S. (1964) “On the Einstein Podolsky Rosen Paradox.” Physics 1: 195–202.

Born, M. (1926) “Zur Quantenmechanik der Stossvorgänge.” Zeitschrift für Physik 37: 863–867.

Condon, E.U. (1962) “60 Years of Quantum Physics.” Physics Today 15: 37–49.

Einstein, A., Podolsky, B., and Rosen, N. (1935) “Can Quantum-Mechanical Description of Physical Reality Be Considered Complete?” Physical Review 47: 777–780.

Jensen, B.S., and Richter, M. (2007) “Stochastic Neoclassical One-Sector and Two-Sector Growth Models with Uncertainty in Continuous Time.” In this volume.

Jensen, B.S., Wang, C., and Johnsen, J. (2007) “Moment Evolution of Gaussian and Geometric Wiener Diffusion – Derived by the Itô Lemma and Kolmogorov Equation.” In this volume.

Kolmogoroff, A. (1931) “Über die Analytischen Methoden in der Wahrscheinlichkeitsrechnung.” Mathematische Annalen 104: 415–458.

Kolmogoroff, A. (1933) “Grundbegriffe der Wahrscheinlichkeitsrechnung.” Ergebnisse der Mathematik 2: Heft 3. Springer-Verlag.

McKean, H.P. (1960) “The Bessel motion and a singular integral equation.” Memoir of College of Science University of Kyoto, Series A. Mathematics 33: 317–322.

Nagasawa, M. (1993) Schrödinger Equations and Diffusion Theory. Birkhäuser Verlag, Basel, Boston, Berlin.

Nagasawa, M. (1997) “On the locality of hidden variable theories in quantum physics.” Chaos, Solitons and Fractals 8: 1773–1792.

Nagasawa, M. (2000) Stochastic Processes in Quantum Physics. Birkhäuser Verlag, Basel, Boston, Berlin.

Nagasawa, M. (2002) “On quantum particles.” Chaos, Solitons and Fractals 13: 1393–1405.

Nagasawa, M. (2002) “A note on a remark by Landau regarding a charged particle in a magnetic field.” Chaos, Solitons and Fractals 14: 1065–1070.

Nagasawa, M. (2003) Schrödinger’s Dilemma and Dream, Stochastic Processes and Wave Mechanics (in Japanese). Morikita Shuppan (publishing), Tokyo.

Nagasawa, M., and Schröder, K. (1997) “A note on the locality of Gudder’s hidden-variable theory.” Chaos, Solitons and Fractals 8: 1793–1805.

Nagasawa, M., and Tanaka, H. (1999a) “Stochastic differential equations of pure-jumps in relativistic quantum theory.” Chaos, Solitons and Fractals 10: 1265–1280.

Nagasawa, M., and Tanaka, H. (1999b) “Time dependent subordination and Markov processes with jumps.” Séminaire de Probabilités 34: 257–288. Lecture Notes in Mathematics 1729, Springer.

Nagasawa, M., and Tanaka, H. (1999c) “The principle of variation for relativistic quantum particles.” Séminaire de Probabilités 35: 1–27. Lecture Notes in Mathematics 1775, Springer.

Pauling, L., and Wilson, E.B. (1935) Introduction to Quantum Mechanics with Applications to Chemistry. McGraw-Hill Book Co. Inc., New York.

Schrödinger, E. (1926a) “Quantisierung als Eigenwertproblem (1. Mitteilung).” Annalen der Physik 79: 336–376.

Schrödinger, E. (1926, II) “Über das Verhältnis der Heisenberg–Born–Jordanschen Quantenmechanik zu der meinen.” Annalen der Physik 79: 734–756.

Schrödinger, E. (1931) “Über die Umkehrung der Naturgesetze.” Sitzungsberichte der Preussischen Akademie der Wissenschaften, Physikalisch-Mathematische Klasse, 144–153.

Tonomura, A. (1994) Microscopic world visualized with electron beam holography (in Japanese). Nihon Hyoronsha, Tokyo.

Part II: Stochastic Dynamics of Basic Growth Models and Time Delays

Chapter 5 Stochastic One-Sector and Two-Sector Growth Models in Continuous Time

Bjarne S. Jensen
University of Southern Denmark and Copenhagen Business School

Martin Richter
Copenhagen Business School
Danske Research, Danske Bank, Copenhagen

5.1 Introduction

To set the stage for the main content, exposition, and organisation of our subject matter (stochastic neoclassical models of capital accumulation in continuous time with steady-state or persistent growth per capita), we will review some fundamental issues of “stochastic processes” [parameterized collections (sequences) of stochastic variables] that were first raised in the seminal papers of Mirman (1972, 1973), Brock and Mirman (1972, 1973), Merton (1975), Bourguignon (1974).

Since the concept of a steady state equilibrium has played an important role in both positive and optimal theory of economic growth, Mirman asked whether the same questions can be posed in random growth models as in deterministic growth models: “In what sense should one even discuss the random evolution of the system? How does one define a concept in the random case analogous to the deterministic steady state? Added to these questions are the usual questions of existence, uniqueness, and stability for the random analogue of the steady state”, Mirman (1973, p. 220).

Then he redefined the concept of a steady state in a stochastic sense: “This is done by using the distribution function of possible capital labor ratios generated by the stochastic growth process. Having defined the steady state in terms of a distribution function, we then show that, for each admissible policy, the corresponding stochastic system has a unique steady state distribution, which is a degenerate distribution in the deterministic theory. Moreover, it is shown that this unique steady state distribution is stable in the sense that the set of possible states of the system converges over time to a well-defined set, the analogue of the deterministic steady state, which supports the unique steady state distribution. Finally, it is shown that the sequence of distributions converges to a unique steady state distribution”, Mirman (1973, p. 220).

In implementing this research program and in the mathematical analysis, Mirman (1972, p. 224) first assumed that his random variables A(t), the “technology shocks”, were independent, identically distributed in discrete time, and at any time independent of the capital labor ratios, k(t); further, it was assumed that the shocks A(t) were always strictly positive and finite (bounded away from both zero and infinity). To further simplify the analysis and avoid the possibility of a steady state at either zero or infinity, the technology (production function) was assumed to satisfy the simple derivative conditions of Inada (satisfied by the CD technology). With these assumptions, Mirman used mathematical techniques similar to those in the theory of Markov chains to establish the stochastic generalization of the Solow growth model by showing the existence, uniqueness, and stability of stationary probability measures.

With his assumptions, he proved that a stationary measure will always exist, and the stationary measure will be unique if the recurrent states all communicate and admit no cyclically moving subsets. Stability meant that iterates (sequences) of the transition probability tend to a unique asymptotic (time invariant) probability measure (distribution). The tools of the Markov processes were used to demonstrate such stability (convergence) to the unique stationary (steady-state) distribution of the capital-labor ratio. In short, particular neoclassical assumptions and “techniques from positive deterministic growth theory were combined with the tools of Markov processes to achieve a positive theory of stochastic economic growth”, Mirman (1973, p. 230).

Mirman (1973) had assumed that his random variables A(t) (technology shocks), which influence the production process, are strictly positive and finite. “More precisely, for any capital stock, output can neither be arbitrarily large nor arbitrarily small (even with arbitrarily small probability)”, but “it was not clear where the bounds of possible random effects should be set”, Mirman (1972, p. 271). Next, he addressed this problem with arbitrarily large/small outputs. Still, the existence of a stationary measure (steady-state distribution) was demonstrated with a fixed-point argument, Mirman (1972, p. 279): “However, it is possible that there exists a positive probability of extinction or positive probability of an infinite capital stock in the stochastically generalized notion of a steady state”. Mirman (1972, p. 271) studied conditions “for the existence of stationary measures having zero probability at zero and infinity”, which is analogous to imposing the Inada conditions on the production process. For recent advances in the field of stochastic neoclassical growth models in discrete time, see Schenk-Hoppe (2002), Lau (2002).

The first and important extension of the stochastic study by Mirman and Brock of the discrete-time, neoclassical one-sector growth model to continuous-time stochastic processes was done by Merton (1975) and Bourguignon (1974). Besides existence and uniqueness, much more specific (parametric) structures of the steady-state (asymptotic, limit) distributions for the capital-labor ratio and other variables could now be examined.
Thus, Merton (1975, 1990) derived density functions and first and second moments that would be obtained in a steady state, and a comparison of the results and biases (in expected value) between deterministic (certainty) and stochastic modelling was rigorously derived for a stochastic growth model with a CD production function. The source of uncertainty was not positive technology shocks; instead, the uncertainty element in Merton (1975) affected the evolution of the labor (population) stock, which he assumed followed a geometric Wiener process. With the latter, the boundary problems at zero and infinity for the capital-labor ratio were essentially absent from the first stochastic Solow growth model in continuous time.

In the neoclassical one-sector growth model, the sources of uncertainty were extended by Bourguignon (1974) to saving and depreciation rates, and the more general CES technology was adopted. Hence, the boundary problem for the capital-labor ratio naturally came into focus, and the result was that “uncertainty can make the neoclassical model closer to the Harrod-Domar-type models of growth in introducing the possibility of a collapse of the economy”, Bourguignon (1974, p. 142). In particular, uncertainty in the saving rate posed (without further parameter restrictions) a serious problem of absorption at the lower boundary (k = 0). Since 1975, the consequence of such critical boundary problems has been that this field of research in continuous-time stochastic growth models has not matured and has in fact mostly disappeared from the economic literature.

A new start is needed and is here attempted, partly by first resolving the older methodological problems with absorbing boundaries and steady state and partly by extending the stochastic neoclassical framework to one-sector and two-sector models with parameters generating endogenous (persistent) per capita growth. Simulations of sample paths and asymptotic density functions will illustrate our Theorems and the properties of the parametric stochastic processes.

The study of deterministic general equilibrium dynamics in two-sector and multi-sector growth models has been reviewed and extended in Jensen [2003], Jensen and Larsen [2005], with emphasis on factor allocation, output composition, and the dualities for commodity and factor prices. Sample paths of the stochastic two-sector analogue are here discussed and demonstrated.

For our purposes and as benchmarks, production functions of the CD and CES form are used in both the stochastic one-sector and two-sector growth models. These technologies must therefore first be introduced and adequately described.

5.2 Neoclassical technologies and CES forms

The sector technologies are, in stochastic one-sector and two-sector dynamics, described by nonnegative, smooth, concave, homogeneous production functions, $F_i(L_i, K_i)$, $i = 1, 2$, with constant returns to scale in labor and capital,

$$Y_i = F_i(L_i, K_i) = L_i F_i(1, k_i) \equiv L_i f_i(k_i) \equiv L_i y_i, \quad L_i \neq 0; \qquad F_i(0, 0) = 0 \tag{1}$$

where the function $f_i(k_i)$ is strictly concave and monotonically increasing in the capital-labor ratio $k_i \in [0, \infty)$, i.e.,

$$\forall k_i > 0: \quad f_i'(k_i) = df_i(k_i)/dk_i > 0, \qquad f_i''(k_i) = d^2 f_i(k_i)/dk_i^2 < 0 \tag{2}$$

The sectorial output elasticities, $\epsilon_{L_i}$, $\epsilon_{K_i}$, $\epsilon_i$, with respect to marginal and proportional factor variation, are, cf. (1),

$$\epsilon_{L_i} \equiv E(Y_i, L_i) \equiv \frac{\partial Y_i}{\partial L_i}\frac{L_i}{Y_i} = \frac{MP_{L_i}}{AP_{L_i}} = 1 - \frac{k_i f_i'(k_i)}{f_i(k_i)} > 0, \quad k_i \neq 0 \tag{3}$$

$$\epsilon_{K_i} \equiv E(Y_i, K_i) \equiv \frac{\partial Y_i}{\partial K_i}\frac{K_i}{Y_i} = \frac{MP_{K_i}}{AP_{K_i}} = \frac{k_i f_i'(k_i)}{f_i(k_i)} = E(y_i, k_i) > 0 \tag{4}$$

$$\epsilon_i \equiv \epsilon_{L_i} + \epsilon_{K_i} = 1. \tag{5}$$

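At any point on the isoquants, the marginal rates of technical substitution, $\omega_i(k_i)$, are, by (2), positive monotonic functions,

$$\omega_i(k_i) = \frac{MP_{L_i}}{MP_{K_i}} = \frac{f_i(k_i)}{f_i'(k_i)} - k_i = \frac{\epsilon_{L_i}}{\epsilon_{K_i}}\,k_i > 0, \quad \forall k_i > 0. \tag{6}$$

CES Production Functions

General CES forms of $F_i(L_i, K_i)$, (1), with $\gamma_i > 0$, $0 < a_i < 1$, $\sigma_i > 0$, are

$$Y_i = F_i(L_i, K_i) = \gamma_i L_i^{1-a_i} K_i^{a_i} = L_i \gamma_i k_i^{a_i} \equiv L_i f_i(k_i) \tag{7}$$

$$Y_i = F_i(L_i, K_i) = \gamma_i \left[(1 - a_i) L_i^{\frac{\sigma_i - 1}{\sigma_i}} + a_i K_i^{\frac{\sigma_i - 1}{\sigma_i}}\right]^{\frac{\sigma_i}{\sigma_i - 1}} \tag{8}$$

$$\phantom{Y_i} = L_i \gamma_i \left[(1 - a_i) + a_i k_i^{(\sigma_i - 1)/\sigma_i}\right]^{\sigma_i/(\sigma_i - 1)} \equiv L_i f_i(k_i) \tag{9}$$

$$f_i'(k_i) = \gamma_i a_i k_i^{a_i - 1}, \qquad f_i'(k_i) = \gamma_i a_i \left[a_i + (1 - a_i) k_i^{-(\sigma_i - 1)/\sigma_i}\right]^{1/(\sigma_i - 1)} \tag{10}$$

where (7) is the Cobb-Douglas (CD) case, $\sigma_i = 1$, and the first expression in (10) is its derivative. The limits of $f_i(k_i)$ and $f_i'(k_i)$ become ($\forall i: \sigma_i \gtrless 1 \Rightarrow a_i^{\sigma_i/(\sigma_i - 1)} \lessgtr 1$),

$$\sigma_i < 1: \begin{cases} \lim_{k_i \to 0} f_i(k_i) = 0, & \lim_{k_i \to \infty} f_i(k_i) = \gamma_i (1 - a_i)^{\frac{\sigma_i}{\sigma_i - 1}} \\ \lim_{k_i \to 0} f_i'(k_i) = \gamma_i a_i^{\frac{\sigma_i}{\sigma_i - 1}}, & \lim_{k_i \to \infty} f_i'(k_i) = 0 \end{cases} \tag{11}$$

$$\sigma_i = 1: \begin{cases} \lim_{k_i \to 0} f_i(k_i) = 0, & \lim_{k_i \to \infty} f_i(k_i) = \infty \\ \lim_{k_i \to 0} f_i'(k_i) = \infty, & \lim_{k_i \to \infty} f_i'(k_i) = 0 \end{cases} \tag{12}$$

$$\sigma_i > 1: \begin{cases} \lim_{k_i \to 0} f_i(k_i) = \gamma_i (1 - a_i)^{\frac{\sigma_i}{\sigma_i - 1}}, & \lim_{k_i \to \infty} f_i(k_i) = \infty \\ \lim_{k_i \to 0} f_i'(k_i) = \infty, & \lim_{k_i \to \infty} f_i'(k_i) = \gamma_i a_i^{\frac{\sigma_i}{\sigma_i - 1}} \end{cases} \tag{13}$$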
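The intensive-form functions (7), (9), (10) and the finite limits in (11) and (13) can be checked numerically. The following is a minimal sketch; the function names `ces_f` and `ces_fprime` and the default parameter values are illustrative assumptions, not part of the original text.

```python
def ces_f(k, gamma=1.0, a=0.3, sigma=0.5):
    """Intensive-form production function f(k): CD form (7) if sigma == 1, else CES form (9)."""
    if sigma == 1.0:
        return gamma * k ** a
    r = (sigma - 1.0) / sigma
    return gamma * ((1.0 - a) + a * k ** r) ** (1.0 / r)


def ces_fprime(k, gamma=1.0, a=0.3, sigma=0.5):
    """Marginal product of capital f'(k), eq. (10)."""
    if sigma == 1.0:
        return gamma * a * k ** (a - 1.0)
    r = (sigma - 1.0) / sigma
    return gamma * a * (a + (1.0 - a) * k ** (-r)) ** (1.0 / (sigma - 1.0))


def finite_limits(gamma=1.0, a=0.3, sigma=0.5):
    """Finite limit values from (11) and (13): the bounded limit of f and of f'.

    For sigma < 1 these are lim f (k -> infinity) and lim f' (k -> 0);
    for sigma > 1 they are lim f (k -> 0) and lim f' (k -> infinity).
    """
    e = sigma / (sigma - 1.0)
    return gamma * (1.0 - a) ** e, gamma * a ** e
```

For example, `ces_f(1e12, sigma=0.5)` should be close to the first value returned by `finite_limits(sigma=0.5)`, in line with (11).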

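For the CES technologies, the monotonic relations between marginal rates of substitution, factor proportions, and output elasticities are, cf. (8)-(10),

$$\omega_i = \frac{1 - a_i}{a_i}\,k_i^{1/\sigma_i}, \qquad k_i = \frac{1}{c_i}\,[\omega_i]^{\sigma_i}, \qquad c_i = \left(\frac{1 - a_i}{a_i}\right)^{\sigma_i}, \qquad i = 1, 2. \tag{14}$$

$$\epsilon_{K_i} = \left[1 + \frac{1 - a_i}{a_i}\,k_i^{\frac{1 - \sigma_i}{\sigma_i}}\right]^{-1} = \frac{1}{1 + c_i \omega_i^{1 - \sigma_i}}, \qquad \epsilon_{L_i} = \frac{c_i \omega_i^{1 - \sigma_i}}{1 + c_i \omega_i^{1 - \sigma_i}} \tag{15}$$

With two-sector models and CES technologies, it is apparent from (14) that sectorial factor ratio ("intensity") reversals are avoided if and only if $\sigma_1 = \sigma_2$ and $a_1 \neq a_2$. Hence, with $\sigma_1 \neq \sigma_2$, there will be a reversal point, $(k_i, \omega_i) = (\bar{k}, \bar{\omega})$:

$$\bar{k} = \left[\frac{a_1(1 - a_2)}{a_2(1 - a_1)}\right]^{\frac{\sigma_1 \sigma_2}{\sigma_2 - \sigma_1}} = \left[\frac{c_2^{\sigma_1}}{c_1^{\sigma_2}}\right]^{\frac{1}{\sigma_2 - \sigma_1}}, \qquad \bar{\omega} = \left[\frac{c_2}{c_1}\right]^{\frac{1}{\sigma_2 - \sigma_1}} \tag{16}$$

5.3 Stochastic one-sector growth models

5.3.1 Introduction of stochastic elements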

The standard deterministic neoclassical one-sector growth model is described by the ordinary differential equations (ODE), cf. (1),

$$dL/dt \equiv \dot{L} = Ln \tag{17}$$
$$dK/dt \equiv \dot{K} = L s f(k) - \delta K \tag{18}$$
$$dk/dt \equiv \dot{k} = s f(k) - (n + \delta)k \tag{19}$$
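As a deterministic benchmark, (19) can be integrated numerically. The sketch below assumes the CD form $f(k) = \gamma k^a$ from (7), for which setting $\dot{k} = 0$ in (19) gives the steady state $\kappa = [s\gamma/(n + \delta)]^{1/(1-a)}$. The forward-Euler step, the function names, and the parameter values used in the example are illustrative assumptions.

```python
def solow_path_cd(k0, s, n, delta, gamma, a, dt=0.01, steps=60_000):
    """Forward-Euler integration of (19) with the CD function f(k) = gamma * k**a.

    Returns the capital-labor ratio after `steps` steps of length `dt`.
    """
    k = k0
    for _ in range(steps):
        k += (s * gamma * k ** a - (n + delta) * k) * dt
    return k


def cd_steady_state(s, n, delta, gamma, a):
    """Non-trivial root of s*f(k) = (n + delta)*k for the CD function."""
    return (s * gamma / (n + delta)) ** (1.0 / (1.0 - a))
```

With, say, `s=0.2, n=0.02, delta=0.05, gamma=1.0, a=0.3`, the path started at `k0=1.0` approaches the value returned by `cd_steady_state`.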

This general model becomes, with uncertainty (stochastic elements $\varepsilon_i$) in the growth rate of labor $n$, the gross saving rate $s$, and the capital depreciation rate $\delta$,

$$\dot{L} = L(n + \beta_1 \varepsilon_1) \tag{20}$$
$$\dot{K} = L(s + \phi_3(k)\varepsilon_3) f(k) - (\delta + \beta_2 \varepsilon_2) K \tag{21}$$

where $\beta_i \geq 0$, and $(\varepsilon_1, \varepsilon_2, \varepsilon_3)$ are "white noise" (stochastic processes with a constant spectral density function), related to Wiener processes $(w_1, w_2, w_3)$ with the correlation structure

$$d\langle w_i, w_j \rangle = \rho_{ij}\,dt, \qquad \rho_{ii} = 1, \qquad i, j = 1, 2, 3 \tag{22}$$

and $\langle w_i, w_j \rangle$ is the quadratic variation process for the components of the Wiener process, Karatzas and Shreve (1991), Øksendal (2003). For the formal connection between Wiener processes and "white noise", see Holden et al. (1996, chap. 3). The function $\phi_3$ is, as later explained, here conveniently chosen (to avoid boundary problems at zero) as

$$\phi_3(k) = \beta_3 \tanh(\lambda_3 k), \qquad k \in [0, \infty), \qquad \phi_3(0) = 0, \qquad \phi_3(\infty) = \beta_3 \tag{23}$$

For the labor and capital stock, the stochastic differential equations (SDE) associated with (20)-(21) are given by

$$dL = Ln\,dt + L\beta_1\,dw_1 \tag{24}$$
$$dK = (sLf(k) - \delta K)\,dt - \beta_2 K\,dw_2 + Lf(k)\phi_3(k)\,dw_3 \tag{25}$$
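Sample paths of the two-dimensional system (24)-(25) can be approximated with an Euler-Maruyama step. The following is a minimal sketch under stated assumptions: the correlated Wiener increments in (22) are generated from independent standard normals via a Cholesky factor of the correlation matrix, and the names `cholesky`, `simulate_LK`, and all argument names are hypothetical, chosen for illustration only.

```python
import math
import random


def cholesky(R):
    """Lower-triangular C with C C^T = R, for a symmetric positive-definite matrix R."""
    n = len(R)
    C = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = R[i][j] - sum(C[i][m] * C[j][m] for m in range(j))
            C[i][j] = math.sqrt(acc) if i == j else acc / C[j][j]
    return C


def simulate_LK(L0, K0, s, n, delta, beta, lam3, R, f, T=10.0, steps=1000, seed=0):
    """Euler-Maruyama approximation of (24)-(25).

    beta = (beta1, beta2, beta3); R is the 3x3 correlation matrix (rho_ij) of (22);
    f is the intensive-form production function f(k).
    Returns a list of (t, L, K) tuples of length steps + 1.
    """
    rng = random.Random(seed)
    C = cholesky(R)
    dt = T / steps
    L, K = L0, K0
    path = [(0.0, L, K)]
    for i in range(steps):
        z = [rng.gauss(0.0, 1.0) for _ in range(3)]
        # Correlated increments dw_r with d<w_i, w_j> = rho_ij dt.
        dw = [math.sqrt(dt) * sum(C[r][c] * z[c] for c in range(r + 1)) for r in range(3)]
        k = K / L
        phi3 = beta[2] * math.tanh(lam3 * k)  # eq. (23)
        dL = L * n * dt + L * beta[0] * dw[0]
        dK = (s * L * f(k) - delta * K) * dt - beta[1] * K * dw[1] + L * f(k) * phi3 * dw[2]
        L, K = L + dL, K + dK
        path.append(((i + 1) * dt, L, K))
    return path
```

The ratio `K / L` along the returned path gives an approximate sample path of $k(t)$; the sketch does not guard against a discretized path crossing zero.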

The drift and diffusion coefficients of the stochastic dynamic system (24)-(25) are homogeneous functions of degree one in the state variables $L$ and $K$. The homogeneity of degree one allows us to reduce the two-dimensional stochastic system (24)-(25) to one-dimensional stochastic dynamics of the capital-labor ratio. As an alternative to the uncertainty in the saving rate, we also consider uncertainty (stochastic element $\varepsilon_4$) in technology, more precisely, uncertainty ($\varepsilon_4$) in the total productivity parameter ($\gamma$) of the production function $f(k)$, i.e.,

$$\dot{K} = Ls(\gamma + \phi_4(k)\varepsilon_4)\,[f(k)/\gamma] - (\delta + \beta_2 \varepsilon_2) K \tag{26}$$

where $\phi_4$ is similar to $\phi_3$ defined in equation (23). Hence, with (26), the stochastic differential equation (25) is replaced by

$$dK = (sLf(k) - \delta K)\,dt - \beta_2 K\,dw_2 + Ls\,[f(k)/\gamma]\,\phi_4(k)\,dw_4. \tag{27}$$

The stochastic differential equations (24)-(25) or (24), (27) represent a two-dimensional stochastic system, driven by a three-dimensional Wiener process. For the purposes of simulations, i.e., computing the sample paths $L(t, \omega)$ and $K(t, \omega)$ or the ratio, $k(t, \omega) = K(t, \omega)/L(t, \omega)$, it is sufficient to use the equations (24), (25) or (24), (27). In fact, the SDE (24) is the well-known geometric Wiener process. There is in general no closed form expression for the solutions (sample paths) for $K(t, \omega)$ or $k(t, \omega)$. However, to precisely examine the absorbing boundary conditions and stationarity conditions for the diffusion process, $k(t)$, it is necessary to obtain an analytical expression for $k(t)$ as given by a particular one-dimensional Wiener process. Fortunately, as both drift and diffusion coefficients are homogeneous functions of degree one in $K$ and $L$, we can analytically describe $k(t)$ as a one-dimensional SDE, driven by a one-dimensional Wiener process, where the relevant drift coefficient and diffusion coefficient now need to be exactly determined, cf. Jensen and Wang (1999, Lemma 1).

5.3.2 The SDE of the capital-labor ratio

Theorem 1. The stochastic neoclassical dynamics for the capital-labor ratio $k(t)$ of (24)-(25) is a diffusion process given by the SDE,

$$dk = -(K/L^2)\,dL + (1/L)\,dK + (K/L^3)\,dL^2 - (1/L^2)\,dL\,dK \tag{28}$$

$$dk = \left\{[s - \rho_{13}\beta_1\phi_3(k)]\,f(k) - \left[n + \delta - (\beta_1^2 + \rho_{12}\beta_1\beta_2)\right]k\right\}dt - \beta_1 k\,dw_1 - \beta_2 k\,dw_2 + \phi_3(k)f(k)\,dw_3 \tag{29}$$

The SDE (29) can in its domain be given in the compact form,

$$dk = a(k)\,dt + b(k)\,dw, \qquad k(t) \in (0, \infty) \tag{30}$$

with the drift coefficient,

$$a(k) = \bar{s}(k)f(k) - \Theta k \tag{31}$$
$$\bar{s}(k) = s - \rho_{13}\beta_1\phi_3(k), \qquad \Theta = n + \delta - (\beta_1^2 + \rho_{12}\beta_1\beta_2) \tag{32}$$

and the diffusion coefficient,

$$b^2(k) = \beta^2 k^2 + \phi_3(k)^2 f(k)^2 - \rho\,\phi_3(k)f(k)k \tag{33}$$
$$\beta^2 = \beta_1^2 + \beta_2^2 + 2\rho_{12}\beta_1\beta_2, \qquad \rho = 2(\rho_{13}\beta_1 + \rho_{23}\beta_2). \tag{34}$$

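Proof: Itô's Lemma: Let $X(t) \in R^n$ be a general diffusion process, and let $F(X)$ be an arbitrary $C^2$ map from $R^n \to R$; then

$$dF(X) = F_x^T\,dX + \tfrac{1}{2}\,dX^T F_{xx}\,dX \tag{35}$$

i.e., $F(X)$, determined by the diffusion process $X(t)$, is again a diffusion process, where $F_x$ represents the partial derivatives with respect to $x$ of the function $F(x)$, $F_{xx}$ represents the Hessian matrix of the function $F(x)$, and where $(dw_i)^2 = dt\ \forall i$; $dw_i \cdot dw_j = 0$ for $i \neq j$; $(dt)^2 = 0$; $dt \cdot dw_i = 0$. Hence, with

$$X = (L, K)^T, \qquad dX = (dL, dK)^T, \qquad F(X) = K/L \equiv k, \tag{36}$$

$$F_X = \begin{pmatrix} \partial F/\partial L \\ \partial F/\partial K \end{pmatrix} = \begin{pmatrix} -K/L^2 \\ 1/L \end{pmatrix}, \qquad F_{XX} = \begin{pmatrix} \dfrac{\partial^2 F}{\partial L^2} & \dfrac{\partial^2 F}{\partial L\,\partial K} \\[2mm] \dfrac{\partial^2 F}{\partial K\,\partial L} & \dfrac{\partial^2 F}{\partial K^2} \end{pmatrix} = \begin{pmatrix} 2K/L^3 & -1/L^2 \\ -1/L^2 & 0 \end{pmatrix} \tag{37}$$

we get, cf. (35)-(37),

$$dk = -\frac{K}{L^2}\,dL + \frac{1}{L}\,dK + \frac{1}{2}\left[\frac{2K}{L^3}\,dL^2 - 2\,\frac{1}{L^2}\,dL\,dK + 0 \cdot dK^2\right] \tag{38}$$

$$\phantom{dk} = -\frac{K}{L^2}\,dL + \frac{1}{L}\,dK + \frac{K}{L^3}\,dL^2 - \frac{1}{L^2}\,dL\,dK \tag{39}$$

which is (28). Inserting (24) and (25) into (39), and using the correlation structure (22) for the products $dw_i\,dw_j$, gives

$$\begin{aligned}
dk &= -\frac{K}{L^2}\,L(n\,dt + \beta_1\,dw_1) + \frac{1}{L}\,L\left[\{sf(k) - \delta k\}\,dt - \beta_2 k\,dw_2 + \phi_3(k)f(k)\,dw_3\right] \\
&\quad + \frac{K}{L^3}\,L^2 (n\,dt + \beta_1\,dw_1)^2 - \frac{1}{L^2}\,L(n\,dt + \beta_1\,dw_1)\,L\left[\{sf(k) - \delta k\}\,dt - \beta_2 k\,dw_2 + \phi_3(k)f(k)\,dw_3\right] \\
&= -k(n\,dt + \beta_1\,dw_1) + \{sf(k) - \delta k\}\,dt - \beta_2 k\,dw_2 + \phi_3(k)f(k)\,dw_3 \\
&\quad + k(n\,dt + \beta_1\,dw_1)^2 - (n\,dt + \beta_1\,dw_1)\left[\{sf(k) - \delta k\}\,dt - \beta_2 k\,dw_2 + \phi_3(k)f(k)\,dw_3\right] \\
&= -nk\,dt - \beta_1 k\,dw_1 + sf(k)\,dt - \delta k\,dt - \beta_2 k\,dw_2 + \phi_3(k)f(k)\,dw_3 \\
&\quad + k\beta_1^2\,dt - \beta_1\,dw_1\left[-\beta_2 k\,dw_2 + \phi_3(k)f(k)\,dw_3\right] \\
&= -nk\,dt - \beta_1 k\,dw_1 + sf(k)\,dt - \delta k\,dt - \beta_2 k\,dw_2 + \phi_3(k)f(k)\,dw_3 \\
&\quad + k\beta_1^2\,dt + \rho_{12}\beta_1\beta_2 k\,dt - \rho_{13}\beta_1\phi_3(k)f(k)\,dt \\
&= \left\{sf(k) - \rho_{13}\beta_1\phi_3(k)f(k) - nk - \delta k + (\beta_1^2 + \rho_{12}\beta_1\beta_2)k\right\}dt \\
&\quad - \beta_1 k\,dw_1 - \beta_2 k\,dw_2 + \phi_3(k)f(k)\,dw_3
\end{aligned} \tag{40}$$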

which establishes (29). Finally, using Lévy's characterization, the local martingale term, $-\beta_1 k\,dw_1 - \beta_2 k\,dw_2 + \phi_3(k)f(k)\,dw_3$, can be simplified to $b(k)\,dw$, where $w$ is a new one-dimensional Wiener process. The diffusion coefficient $b(k)$ can be calculated by determining the quadratic variation of $-\beta_1 k\,dw_1 - \beta_2 k\,dw_2 + \phi_3(k)f(k)\,dw_3$. Hence, we get

$$\begin{aligned}
b(k)\,dw &\equiv -\beta_1 k\,dw_1 - \beta_2 k\,dw_2 + \phi_3(k)f(k)\,dw_3 \\
b^2(k)\,dt &\equiv \left[-\beta_1 k\,dw_1 - \beta_2 k\,dw_2 + \phi_3(k)f(k)\,dw_3\right]^2 \\
b^2(k)\,dt &= \beta_1^2 k^2\,dt + \beta_2^2 k^2\,dt + \phi_3(k)^2 f(k)^2\,dt + 2\beta_1\beta_2 k^2\,[dw_1, dw_2] \\
&\quad - 2\beta_1 k\,\phi_3(k)f(k)\,[dw_1, dw_3] - 2\beta_2 k\,\phi_3(k)f(k)\,[dw_2, dw_3] \\
b^2(k)\,dt &= \beta_1^2 k^2\,dt + \beta_2^2 k^2\,dt + \phi_3(k)^2 f(k)^2\,dt + 2\rho_{12}\beta_1\beta_2 k^2\,dt \\
&\quad - 2\rho_{13}\beta_1 k\,\phi_3(k)f(k)\,dt - 2\rho_{23}\beta_2 k\,\phi_3(k)f(k)\,dt \\
b^2(k) &= (\beta_1^2 + \beta_2^2 + 2\rho_{12}\beta_1\beta_2)k^2 + \phi_3(k)^2 f(k)^2 - 2(\rho_{13}\beta_1 + \rho_{23}\beta_2)\,\phi_3(k)f(k)k
\end{aligned} \tag{41}$$

which is succinctly summarized in (33) together with (34). □
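The quadratic-variation step in (41) is the quadratic form $v^T R\,v$, with loading vector $v = (-\beta_1 k,\ -\beta_2 k,\ \phi_3(k)f(k))$ and correlation matrix $R = (\rho_{ij})$ from (22). The sketch below compares that quadratic form with the closed form (33)-(34) at a single state; the function names and the numerical values in the usage note are illustrative assumptions.

```python
def b2_formula(k, f_k, phi3_k, beta1, beta2, rho12, rho13, rho23):
    """Squared diffusion coefficient b^2(k) from (33)-(34)."""
    beta_sq = beta1 ** 2 + beta2 ** 2 + 2.0 * rho12 * beta1 * beta2
    rho = 2.0 * (rho13 * beta1 + rho23 * beta2)
    return beta_sq * k ** 2 + (phi3_k * f_k) ** 2 - rho * phi3_k * f_k * k


def b2_quadratic_form(k, f_k, phi3_k, beta1, beta2, rho12, rho13, rho23):
    """The same quantity computed directly as v^T R v, as in the derivation (41)."""
    v = [-beta1 * k, -beta2 * k, phi3_k * f_k]
    R = [[1.0, rho12, rho13],
         [rho12, 1.0, rho23],
         [rho13, rho23, 1.0]]
    return sum(v[i] * R[i][j] * v[j] for i in range(3) for j in range(3))
```

Calling both functions with the same arguments, e.g. `(2.0, 1.2, 0.05, 0.1, 0.08, 0.3, -0.2, 0.4)`, should return the same number up to rounding.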

The sample path (trajectory) of the process $k(t)$, (30)-(34), is formally given by

$$k(t; \omega) = k(0) + \int_0^t a(k[u; \omega])\,du + \int_0^t b(k[u; \omega])\,dw_u(\omega) \tag{42}$$

where $k(0)$ is a fixed initial condition, and $\omega$ symbolizes a particular realization of the Wiener process. The sample path (128) in this paper is approximated by the Euler scheme, Kloeden and Platen (1995).

Remark 1. Note that the correlation $\rho_{23}$ does not enter the drift coefficient in (31)-(32), because the coefficient of $dK^2$ is zero, cf. (37)-(38). Furthermore, note that our introduction of uncertainties (24)-(25) implies that the deterministic accumulation parameters, $s$, $n$, $\delta$, only appear in the drift coefficient, (31)-(32), but not in the diffusion coefficient, (33)-(34). Some diffusion parameters $\beta_i$, $\rho_{ij}$, however, may appear in the drift coefficient, (31)-(32).

From the derived diffusion process of the capital-labor ratio $k(t)$ in Theorem 1, we can now analyze the evolution of the one-sector economy, with emphasis on long-run behavior (asymptotic properties). The drift and diffusion coefficients, however, govern the evolution only at interior points of the state space. To fully define a diffusion process, the behavior at any boundary points requires separate specification. For our purposes of studying the long-run evolution of nontrivial states, we need to carefully examine the conditions that will make the boundaries, $k = 0$ or $k = \infty$, inaccessible for any finite time ($t < \infty$). If $a(0) = 0$ and $b(0) = 0$, as is often seen, cf. (30)-(34) and (11)-(13), then $k(t) = 0$ is an absorbing boundary, i.e., the sample paths $k(t)$ remain at the zero position once it is attained. Even if $a(0) \neq 0$ or $b(0) \neq 0$, and hence $k = 0$ is not an absorbing state, we cannot admit negative state values of $k(t)$, i.e., a viable (working) economic diffusion model must require that the inaccessibility of the boundary state $k = 0$ is ensured by imposing sufficient parameter restrictions on the actual drift and diffusion coefficients. As the incremental Wiener processes $dw_i(t) \in (-\infty, \infty)$ may occasionally take on very large negative values, the drift and diffusion coefficients must indeed be carefully studied to prevent the random variable $k(t)$ from hitting the lower boundary, $k = 0$.

5.4 Boundaries, steady-state, and convergence

5.4.1 Terminology, concepts and definitions

Let the transition probability in case of a one-dimensional stochastic process $X(t)$ be denoted

$$P(x, t; x_0, t_0) = \Pr[X(t) \leq x \mid X(t_0) = x_0] \tag{43}$$

where $X(t)$ is the state of the process at instant $t$. The transition probability distribution $P(x, t; x_0, t_0)$ is assumed to have a probability density function $p(x, t; x_0, t_0)$, defined everywhere.

Boundary conditions. In terms of notation in (30), in the one-dimensional case, we define the following indefinite integrals (functions),

$$J(x) = \int_{x_0}^{x} \frac{a(u)}{b^2(u)}\,du; \qquad s(x) = \exp\{-2J(x)\}, \qquad S(x) = \int_{x_0}^{x} s(u)\,du \tag{44}$$

$$m(x) = \frac{\exp\{2J(x)\}}{b^2(x)} = \frac{1}{b^2(x)s(x)}, \qquad M(x) = \int_{x_0}^{x} m(u)\,du \tag{45}$$

The functions $s(x)$, $S(x)$, $m(x)$, and $M(x)$ are called, respectively, the scale density function, the scale function, the speed density function, and the speed measure of the stochastic process $X(t)$; cf. Karlin and Taylor (1981, p. 194-96, p. 229).

Inaccessible boundaries. Let the diffusion process $X(t)$ have two boundaries $r_1 < r_2$. Sufficient conditions: The boundaries $r_1$ and $r_2$ are inaccessible, if

$$\forall x_0 \in [r_1, r_2]: \quad S(r_1) = -\infty; \quad S(r_2) = +\infty; \quad \text{equivalently, } s(x) \text{ is not integrable on the closed interval } [r_i, x_0] \tag{46}$$

or if

$$S(r_i) = \lim_{x \to r_i} S(x) = \lim_{x \to r_i} \int_{x_0}^{x} s(x)\,dx = \mp\infty; \qquad i = 1, 2 \tag{47}$$

The necessary and sufficient condition is

$$\Sigma(r_i) \equiv \int_{x_0}^{r_i} [S(r_i) - S(x)]\,m(x)\,dx = +\infty; \qquad i = 1, 2 \tag{48}$$

Existence of steady-state distribution. A time-invariant distribution function $P(x)$ exists if and only if $S(r_i) = \mp\infty$ and $M(x)$ is finite at $r_i$, i.e.

$$|M(r_i)| < \infty; \qquad i = 1, 2 \tag{49}$$

The existence of the steady-state distribution $P(x)$, which implies inaccessible boundaries, also implies the convergence of the nonstationary distribution functions $P(x, t)$ towards $P(x)$ as $t \to \infty$.

Existence of steady-state density function. A time-invariant probability density function $p(x)$ exists if and only if the speed density $m(x)$ satisfies

$$\int_{r_1}^{r_2} m(x)\,dx < \infty, \qquad p(x) = m\,m(x), \qquad \int_{r_1}^{r_2} p(x)\,dx = 1 \tag{50}$$

where the constant $m$ (without argument) is the normalizing constant. The conditions and formulas above can be applied directly when we have the same stochastic differential equation (drift and diffusion coefficients) for the whole interval of $x$. For more details, see Karlin and Taylor (1981) and Mandl (1968).

Remark 2. It is well-known (Karlin and Taylor, 1981, p. 359) that the solution (Itô integral) to the stochastic differential equation (24) is the geometric Wiener process with the continuous sample paths (stochastic trajectories, realizations),

$$L(t) = L_0 \exp\{(n - \tfrac{1}{2}\beta_1^2)t + \beta_1 w_1(t)\}; \qquad 2n \gtrless \beta_1^2: \quad \lim_{t \to \infty} L(t) = \begin{cases} \infty \\ 0 \end{cases} \tag{51}$$

Moreover, the boundaries, zero or infinity, are inaccessible, as the sample paths (51) cannot attain either of the two boundaries in finite time. It is instructive to prove the latter statement as a prelude to the general procedure of proving inaccessibility. From (24) and (44), the scale density $s(L)$ and the scale function $S(L)$ become

$$s(L) = \exp\Big\{-2\int_{L_0}^{L} nL/(\beta_1^2 L^2)\,dL\Big\} = (L/L_0)^{-2n/\beta_1^2} \tag{52}$$

$$S(L) = \int_{L_0}^{L} s(L)\,dL = \frac{L_0}{1 - 2n/\beta_1^2}\left[(L/L_0)^{1 - 2n/\beta_1^2} - 1\right] \tag{53}$$

As $S(0) = -\infty$ for $2n \geq \beta_1^2$, the latter is a sufficient parameter condition for the inaccessibility of the boundary $L = 0$. But despite the finite $S(0) = -L_0/(1 - 2n/\beta_1^2)$ for $2n < \beta_1^2$, the boundary $L = 0$ may still not be attainable. The speed density $m(L)$ is, cf. (45) and (52),

$$m(L) = \frac{(L/L_0)^{2n/\beta_1^2}}{\beta_1^2 L^2} = \frac{L_0^{-2n/\beta_1^2}}{\beta_1^2}\,L^{2n/\beta_1^2 - 2} \tag{54}$$

and we must now, with (53)-(54) and (48), evaluate

$$\Sigma(0) = \int_0^{L_0} [S(L) - S(0)]\,m(L)\,dL = \frac{1}{\beta_1^2 (1 - 2n/\beta_1^2)} \int_0^{L_0} \frac{dL}{L} = \infty \tag{55}$$

Thus, $\Sigma(0) = +\infty$ says that it takes infinite time to reach the zero boundary from any interior state, i.e., $L = 0$ is after all inaccessible for $2n < \beta_1^2$, as it cannot be attained in finite time. The same analysis can be applied to the boundary $L = \infty$, which is not attainable in finite time either. From this examination of the labor diffusion process (24), which has no steady-state distribution, it is clear that boundary problems for the capital-labor ratio diffusion are essentially due to boundary problems associated with the capital stock diffusion process.

5.4.2 Boundary conditions - neoclassical growth models

Labor growth and capital depreciation rates are uncertain

From (30)-(34) with $\beta_3 = 0$, and the CES, $f = f_i$, (7)-(9), we have,

$$\sigma = 1: \quad dk = \{s\gamma k^{a} - \Theta k\}\,dt - \beta k\,dw, \tag{56}$$

$$\sigma \neq 1: \quad dk = \{s\gamma[(1 - a) + a k^{(\sigma - 1)/\sigma}]^{\sigma/(\sigma - 1)} - \Theta k\}\,dt - \beta k\,dw. \tag{57}$$
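A sample path of the CD diffusion (56) can be approximated with an Euler-Maruyama step. This is a minimal sketch under stated assumptions: the function names `theta_beta` and `simulate_k_cd` and all parameter values in the usage note are hypothetical, and the discretized path is only an approximation, so the sketch raises an error if a step lands at or below zero instead of silently continuing.

```python
import math
import random


def theta_beta(n, delta, beta1, beta2, rho12):
    """Theta from (32) and beta from (34)."""
    theta = n + delta - (beta1 ** 2 + rho12 * beta1 * beta2)
    beta = math.sqrt(beta1 ** 2 + beta2 ** 2 + 2.0 * rho12 * beta1 * beta2)
    return theta, beta


def simulate_k_cd(k0, s, gamma, a, theta, beta, T=200.0, steps=20_000, seed=0):
    """Euler-Maruyama approximation of dk = (s*gamma*k**a - theta*k) dt - beta*k dw, eq. (56).

    Returns the list of simulated values k(0), k(dt), ..., k(T).
    """
    rng = random.Random(seed)
    dt = T / steps
    k = k0
    path = [k]
    for _ in range(steps):
        dw = rng.gauss(0.0, math.sqrt(dt))
        k += (s * gamma * k ** a - theta * k) * dt - beta * k * dw
        if k <= 0.0:
            raise ValueError("discretized path left the state space (0, infinity)")
        path.append(k)
    return path
```

With `beta = 0.0` the scheme reduces to a forward-Euler integration of the deterministic drift, and the path approaches $(s\gamma/\Theta)^{1/(1-a)}$.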

Theorem 2. The sufficient conditions for the diffusion processes of one-sector neoclassical growth models to have inaccessible boundaries, with CES technologies, (56)-(57), and uncertainties in both labor growth and capital depreciation, are:

$$\sigma < 1: \begin{cases} k = 0: & 2(n + \delta - s\gamma a^{\sigma/(\sigma - 1)}) \leq \beta_1^2 - \beta_2^2 \\ k = \infty: & 2(n + \delta) \geq \beta_1^2 - \beta_2^2 \end{cases} \tag{58}$$

$$\sigma = 1: \begin{cases} k = 0: & \text{always inaccessible} \\ k = \infty: & 2\Theta + \beta^2 \geq 0 \Leftrightarrow 2(n + \delta) \geq \beta_1^2 - \beta_2^2 \end{cases} \tag{59}$$

$$\sigma > 1: \begin{cases} k = 0: & \text{always inaccessible} \\ k = \infty: & 2(n + \delta - s\gamma a^{\sigma/(\sigma - 1)}) \geq \beta_1^2 - \beta_2^2 \end{cases} \tag{60}$$

The parametric conditions, (58)-(60), with strict inequalities, ensure the existence and the long-run convergence of the stochastic capital-labor ratio $k(t)$ to a time-invariant (steady-state) probability distribution $P(k)$.

Proof: $\sigma = 1$: For an arbitrary $k_0 \in (0, \infty)$, the scale density function is given by, cf. (44), (56),

$$s(k) = \exp\Big\{-2\int_{k_0}^{k} \frac{s\gamma k^{a} - \Theta k}{\beta^2 k^2}\,dk\Big\},$$

0 [...] 0 and $2s\gamma/[(1 - a)\beta^2] > 0$, cf. (7), the exponential term in (63) will dominate and explode for $k \to 0$, and hence $S(0)$ diverges, i.e., $S(0) = -\infty$. Thus, the lower boundary $k = 0$ is always inaccessible, irrespective of the size of the drift and diffusion parameters.

$$S(\infty) = \int_{k_0}^{\infty} s(k)\,dk \equiv m_0 \int_{k_0}^{\infty} k^{2(\Theta/\beta^2)} \exp\Big\{\frac{2s\gamma}{(1 - a)\beta^2}\,k^{-(1 - a)}\Big\}\,dk \tag{64}$$

Since $1 - a > 0$, cf. (7), the divergence of $S(\infty)$ here only depends on the polynomial term in (64) with the exponent $2(\Theta/\beta^2)$. Hence, divergence of $S(\infty)$ requires that $2(\Theta/\beta^2) \geq -1$, or, equivalently, $2(n + \delta) \geq \beta_1^2 - \beta_2^2$. Thus, the upper boundary $k = \infty$ is made inaccessible by imposing the parameter restriction stated in (59).

$\sigma \neq 1$: From (57), we have the expression, cf. (61),

$$s(k) = \exp\Big\{-2\int_{k_0}^{k} \frac{s\gamma[(1 - a) + a k^{(\sigma - 1)/\sigma}]^{\sigma/(\sigma - 1)} - \Theta k}{\beta^2 k^2}\,dk\Big\} \tag{65}$$

$\sigma < 1$: From (65) and (11), we have,

$$S(0) = \lim_{k \to 0} \int_{k_0}^{k} s(k)\,dk = \int_{k_0}^{0} (k/k_0)^{-2[s\gamma a^{\sigma/(\sigma - 1)} - \Theta]/\beta^2}\,dk \tag{66}$$

Hence, it follows from (66) that the divergence of $S(0)$ to $-\infty$ requires that the exponent must be less than or equal to $-1$, or equivalently, $2s\gamma a^{\sigma/(\sigma - 1)} \geq 2\Theta + \beta^2$, which is the lower boundary condition in (58). From (65) and (11), we have,

$$S(\infty) = \lim_{k \to \infty} \int_{k_0}^{k} s(k)\,dk = \int_{k_0}^{\infty} \Big(\frac{k}{k_0}\Big)^{\frac{2\Theta}{\beta^2}} \exp\Big\{\frac{2s\gamma(1 - a)^{\frac{\sigma}{\sigma - 1}}}{\beta^2}\Big(\frac{1}{k} - \frac{1}{k_0}\Big)\Big\}\,dk \tag{67}$$

The polynomial term in (67) decides the divergence of $S(\infty)$; it diverges to $+\infty$ if the exponent $2\Theta/\beta^2 \geq -1$, which gives the upper inaccessibility condition in (58).

$\sigma > 1$: From (65) and (13), we have,

$$S(0) = \lim_{k \to 0} \int_{k_0}^{k} s(k)\,dk = \int_{k_0}^{0} \Big(\frac{k}{k_0}\Big)^{\frac{2\Theta}{\beta^2}} \exp\Big\{\frac{2s\gamma(1 - a)^{\frac{\sigma}{\sigma - 1}}}{\beta^2}\Big(\frac{1}{k} - \frac{1}{k_0}\Big)\Big\}\,dk \tag{68}$$

The exponential term in (68) will always explode for $k \to 0$. Hence, $S(0)$ is diverging, and accordingly, $k = 0$ is inaccessible, irrespective of parameter restrictions. From (65) and (13), we have,

$$S(\infty) = \lim_{k \to \infty} \int_{k_0}^{k} s(k)\,dk = \int_{k_0}^{\infty} (k/k_0)^{-2[s\gamma a^{\sigma/(\sigma - 1)} - \Theta]/\beta^2}\,dk \tag{69}$$

$S(\infty)$ is divergent if and only if the exponent of the polynomial in (69) is larger than or equal to $-1$, which gives the parametric inaccessibility restriction as stated in (60). □

Remark 3. Corresponding to the CD case, (59), Bourguignon (1974, pp. 153-54) gave the upper-boundary inaccessibility condition as

$$2b/c \geq -1 \quad \Leftrightarrow \quad 2(n + \delta) \geq \beta_1^2 - \beta_2^2 - \beta_1\beta_2\rho_{12} \tag{70}$$

which differs from our simpler expression in (59). The result (70) is due to a "misprint" in his formula for $dk$ (p. 146), equivalent to our (38)-(39). Thus, we observe that, in contrast to his result, (70), the correlation $\rho_{12}$ has no implication for the inaccessibility of the upper boundary. Incidentally, note that $\rho_{12}$ does not enter the boundary condition for the CES cases, (58), (60). The LHS of the boundary conditions (58)-(60) represent, with $\beta_1 = \beta_2 = 0$, the necessary and sufficient conditions for the existence of a non-trivial (non-zero and finite) deterministic steady state. Evidently, $\beta_1 > 0$ makes it easier to avoid the trivial boundary $k = 0$, but more likely to explode. Note that $\beta_2 > 0$ makes it easier to avoid explosion, but more likely to hit the boundary $k = 0$. The economic-mathematical intuition of such $\beta_i > 0$ effects is left to the reader.

The saving rate is uncertain

It was seen in (30)-(34) that the uncertainty in saving behaviour will always introduce nonlinear terms in the diffusion coefficient. Inaccessible boundaries here raise conditions that clash with common, deterministic, dynamic regularity properties. By (30)-(34) with $\beta_1 = \beta_2 = 0$, $\lambda_3 = \infty$, $\phi_3(k) = \beta_3$, cf. (23), we get

$$\sigma = 1: \quad dk = \{s\gamma k^{a} - (n + \delta)k\}\,dt + \beta_3\gamma k^{a}\,dw, \tag{71}$$

$$\sigma \neq 1: \quad dk = \{sf(k) - (n + \delta)k\}\,dt - \beta_3 f(k)\,dw, \qquad f(k): (9) \tag{72}$$

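Theorem 3. The diffusion process (30)-(34) with CD or CES functions, (7), (9), and uncertainty only in the saving rate, $\beta_1 = \beta_2 = 0$, $\beta_3 \neq 0$, will have boundary properties and sufficient inaccessibility conditions as follows:

$$\sigma < 1: \begin{cases} k = 0: & 2[s\gamma a^{\sigma/(\sigma - 1)} - (n + \delta)] \geq [\beta_3\gamma a^{\sigma/(\sigma - 1)}]^2 \\ k = \infty: & \text{always inaccessible} \end{cases} \tag{73}$$

$$\sigma = 1: \begin{cases} k = 0: & \text{inaccessible, if } a > \tfrac{1}{2}; \text{ possibly accessible, if } a < \tfrac{1}{2} \\ k = \infty: & \text{always inaccessible} \end{cases} \tag{74}$$

$$\sigma > 1: \begin{cases} k = 0: & \text{possibly accessible} \\ k = \infty: & 2[s\gamma a^{\sigma/(\sigma - 1)} - (n + \delta)] \leq [\beta_3\gamma a^{\sigma/(\sigma - 1)}]^2 \end{cases} \tag{75}$$

Proof: CD case. Applying (44) to (71) gives,

$$s(k) = \exp\Big\{\frac{(n + \delta)(k^{2(1 - a)} - k_0^{2(1 - a)}) - 2s\gamma(k^{1 - a} - k_0^{1 - a})}{\beta_3^2\gamma^2(1 - a)}\Big\} \tag{76}$$

From (44) and (76), we have,

$$S(0) = \int_{k_0}^{0} s(k)\,dk \equiv m_0 \int_{k_0}^{0} \exp\Big\{\frac{(n + \delta)k^{2(1 - a)} - 2s\gamma k^{1 - a}}{\beta_3^2\gamma^2(1 - a)}\Big\}\,dk \tag{77}$$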

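Since $1 - a > 0$, $S(0)$ will converge, and, accordingly, $k = 0$ may possibly be accessible. To decide whether $k = 0$ is in fact inaccessible, we must calculate $\Sigma(0)$. In the CD case, we have, for the lower boundary $k = 0$, cf. (48),

$$\Sigma(0) = \int_{k_0}^{0} [S(0) - S(k)]\,m(k)\,dk \tag{78}$$

The limit of the integrand $S(k)m(k)$ in (78) is, cf. (44)-(45), (77),

$$\lim_{k \to 0} S(k)m(k) = \lim_{k \to 0} \frac{m_0 \int_{k_0}^{k} \exp\{[\beta_3^2\gamma^2(1 - a)]^{-1}[(n + \delta)k^{2(1 - a)} - 2s\gamma k^{1 - a}]\}\,dk}{\beta_3^2\gamma^2 k^{2a} \exp\{[\beta_3^2\gamma^2(1 - a)]^{-1}[(n + \delta)k^{2(1 - a)} - 2s\gamma k^{1 - a}]\}} \tag{79}$$

Since $[\beta_3^2\gamma^2(1 - a)]^{-1} > 0$, the limit (79) converges iff $2(1 - a) > 1$, i.e., $a < \tfrac{1}{2}$. Hence, with $a > \tfrac{1}{2}$, $\Sigma(0)$, (78), will be divergent, and thus the lower boundary is inaccessible. From (44) and (76), we have,

$$S(\infty) = \int_{k_0}^{\infty} s(k)\,dk \equiv m_0 \int_{k_0}^{\infty} \exp\Big\{\frac{(n + \delta)k^{2(1 - a)} - 2s\gamma k^{1 - a}}{\beta_3^2\gamma^2(1 - a)}\Big\}\,dk \tag{80}$$

Since $1 - a > 0$, and $k^{2(1 - a)}$ is the dominating term, $S(\infty)$ will always diverge, i.e., $k = \infty$ is inaccessible.

CES case. Applying (44) to (72) gives

$$s(k) = \exp\Big\{-2\int_{k_0}^{k} \frac{s\gamma[(1 - a) + a k^{(\sigma - 1)/\sigma}]^{\sigma/(\sigma - 1)} - (n + \delta)k}{\beta_3^2\gamma^2[(1 - a) + a k^{(\sigma - 1)/\sigma}]^{2\sigma/(\sigma - 1)}}\,dk\Big\} \tag{81}$$

$\sigma < 1$: From (81), we have, for small $k$, cf. (11),

$$S(0) = \lim_{k \to 0} \int_{k_0}^{k} s(k)\,dk = \int_{k_0}^{0} \exp\Big\{-2\int_{k_0}^{k} \frac{s\gamma a^{\sigma/(\sigma - 1)} - (n + \delta)}{\beta_3^2\gamma^2 a^{2\sigma/(\sigma - 1)}}\,k^{-1}\,dk\Big\}\,dk = \int_{k_0}^{0} (k/k_0)^{\frac{-2[s\gamma a^{\sigma/(\sigma - 1)} - (n + \delta)]}{\beta_3^2\gamma^2 a^{2\sigma/(\sigma - 1)}}}\,dk \tag{82}$$

$S(0)$ diverges if the exponent of $k/k_0$ is less than or equal to $-1$, which immediately gives the condition (73). From (81), we have, for large $k$, cf. (11),

$$\begin{aligned}
S(\infty) = \lim_{k \to \infty} \int_{k_0}^{k} s(k)\,dk &= \int_{k_0}^{\infty} \exp\Big\{-2\int_{k_0}^{k} \frac{s\gamma(1 - a)^{\sigma/(\sigma - 1)} - (n + \delta)k}{\beta_3^2\gamma^2(1 - a)^{2\sigma/(\sigma - 1)}}\,dk\Big\}\,dk \\
&= \int_{k_0}^{\infty} \exp\Big\{\frac{(n + \delta)(k^2 - k_0^2) - 2s\gamma(1 - a)^{\sigma/(\sigma - 1)}(k - k_0)}{\beta_3^2\gamma^2(1 - a)^{2\sigma/(\sigma - 1)}}\Big\}\,dk \\
&\equiv m_{30} \int_{k_0}^{\infty} \exp\Big\{\frac{n + \delta}{\beta_3^2\gamma^2(1 - a)^{2\sigma/(\sigma - 1)}}\,k^2 - \frac{2s}{\beta_3^2\gamma(1 - a)^{\sigma/(\sigma - 1)}}\,k\Big\}\,dk
\end{aligned} \tag{83}$$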

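As $1 - a > 0$, the constant denominators in (83) are positive, and since the $k^2$ term is the dominating term in the exponential expression, $S(\infty)$ will always be divergent; hence, the upper boundary is always inaccessible.

$\sigma > 1$: From (81), we have, for small $k$, cf. (13),

$$\begin{aligned}
S(0) = \lim_{k \to 0} \int_{k_0}^{k} s(k)\,dk &= \int_{k_0}^{0} \exp\Big\{-2\int_{k_0}^{k} \frac{s\gamma(1 - a)^{\sigma/(\sigma - 1)} - (n + \delta)k}{\beta_3^2\gamma^2(1 - a)^{2\sigma/(\sigma - 1)}}\,dk\Big\}\,dk \\
&\equiv m_{01} \int_{k_0}^{0} \exp\Big\{\frac{n + \delta}{\beta_3^2\gamma^2(1 - a)^{2\sigma/(\sigma - 1)}}\,k^2 - \frac{2s}{\beta_3^2\gamma(1 - a)^{\sigma/(\sigma - 1)}}\,k\Big\}\,dk
\end{aligned} \tag{84}$$

Since $1 - a > 0$, $S(0)$ will always converge, and hence $k = 0$ may possibly be accessible. Whether in fact $k = 0$ is attainable in finite time requires similar evaluations as shown above, cf. (78)-(79), (55). From (81), we have, for large $k$, cf. (13),

$$S(\infty) = \lim_{k \to \infty} \int_{k_0}^{k} s(k)\,dk = \int_{k_0}^{\infty} \exp\Big\{-2\int_{k_0}^{k} \frac{s\gamma a^{\sigma/(\sigma - 1)} - (n + \delta)}{\beta_3^2\gamma^2 a^{2\sigma/(\sigma - 1)}}\,k^{-1}\,dk\Big\}\,dk = \int_{k_0}^{\infty} (k/k_0)^{\frac{-2[s\gamma a^{\sigma/(\sigma - 1)} - (n + \delta)]}{\beta_3^2\gamma^2 a^{2\sigma/(\sigma - 1)}}}\,dk \tag{85}$$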

$S(\infty)$ diverges if the exponent of $k/k_0$ is larger than or equal to $-1$, which is equivalent to the upper inaccessibility condition in (75). □

With uncertainty in the saving rate, the drift and diffusion coefficients in (20)-(21) now have similar nonlinear elements that, if dominating, will prevent us from satisfying a sufficient lower inaccessibility condition, as the scale function $S(k)$ at $k = 0$ is now finite, whenever $\sigma \geq 1$. The factor accumulation process is likely to be much more severely affected (large volatility) by uncertainty about the saving rate than by uncertainties in labor growth and depreciation rates. The lack of any parametric restrictions preventing the accessibility of the absorbing boundary $k = 0$ (implosion, "economic collapse"), cf. (74), (75), represents a critical stochastic dynamic model complication for the system (30)-(34) and a mathematical issue to be adequately resolved below.

5.4.3 General parameter uncertainty and inaccessible boundaries

To dampen the impact of the Wiener process $dw_3$ near $k = 0$ in our (30)-(34), the random element $\varepsilon_3$ in the saving parameter must be state-dependent, and to preserve (24)-(25) as a homogeneous stochastic dynamic system, the function $\phi_3(k)$ was chosen, cf. (23). For economic reasons, the actual shape of $\phi_3(k)$ on the domain $k \in [0, \infty)$ is chosen as a monotonically increasing curve, but this curve should also, to avoid creating excessive saving parameter volatility, be bounded above by a horizontal asymptote. With these two stipulations upon relevant selections of $\phi_3(k)$, one choice might be the logistic (S-shaped) curve described by the well-known exponential expression. But among the exponentials, a relevant and convenient choice of $\phi_3(k)$, with proper domain and range for our purposes, is (23).

Theorem 4. The sufficient conditions for the general diffusion process (30)-(34), with CES technologies and uncertainties in labor growth, capital depreciation, and saving rates, to have inaccessible lower and upper boundaries are:

$$k = 0;\ \sigma < 1: \quad 2(n + \delta - s\gamma a^{\sigma/(\sigma - 1)}) \leq \beta_1^2 - \beta_2^2 \tag{86}$$
$$k = 0;\ \sigma \geq 1: \quad \text{always inaccessible} \tag{87}$$
$$k = \infty;\ \sigma \leq 1: \quad 2(n + \delta) \geq \beta_1^2 - \beta_2^2 \tag{88}$$
$$k = \infty;\ \sigma > 1: \quad 2(n + \delta - s\gamma a^{\sigma/(\sigma - 1)}) \geq \beta_1^2 - \beta_2^2 - \Delta \tag{89}$$

$$\Delta \equiv [\beta_3\gamma a^{\sigma/(\sigma - 1)}]^2 - 2\rho_{23}\beta_2\beta_3\gamma a^{\sigma/(\sigma - 1)}.$$

Proof: The hyperbolic function $\phi_3(k) = \beta_3\tanh(\lambda_3 k)$, (23), is

$$\phi_3(k) = \beta_3\tanh(\lambda_3 k) = \beta_3\,(e^{\lambda_3 k} - e^{-\lambda_3 k})/(e^{\lambda_3 k} + e^{-\lambda_3 k}), \qquad k \geq 0 \tag{90}$$

It is well-known and easily verified from (90) that for

$$\text{small } k: \quad \phi_3(k) \sim \beta_3\lambda_3 k \quad \Leftrightarrow \quad \phi_3(k)/\beta_3\lambda_3 k \to 1 \text{ as } k \to 0 \tag{91}$$
$$\text{large } k: \quad \phi_3(k) \sim \beta_3 \quad \Leftrightarrow \quad \phi_3(k)/\beta_3 \to 1 \text{ as } k \to \infty \tag{92}$$

Lower boundary. $\sigma = 1$: With the CD production function (7), the scale density function $s(k)$ now becomes, cf. (44), (90),

$$s(k) = \exp\Big\{-2\int_{k_0}^{k} \frac{[s - \rho_{13}\beta_1\phi_3(k)]\gamma k^{a} - \Theta k}{\beta^2 k^2 + \phi_3^2(k)\gamma^2 k^{2a} - \rho\,\phi_3(k)\gamma k^{1 + a}}\,dk\Big\} \tag{93}$$

Since $a < 1$, keeping the dominating terms in the numerator and the denominator of (93) gives, for small $k$, cf. (91),

$$s(k) \sim \exp\Big\{-2\int_{k_0}^{k} \frac{s\gamma k^{a}}{\beta^2 k^2}\,dk\Big\} = \exp\Big\{\frac{2s\gamma}{(1 - a)\beta^2}\,(k^{-(1 - a)} - k_0^{-(1 - a)})\Big\} \tag{94}$$

Since $1 - a > 0$, it is seen, from (94), that $S(0) = \int_{k_0}^{0} s(k)\,dk$ is diverging at $k = 0$, cf. (63), and hence, the lower boundary is inaccessible.

$\sigma < 1$: With the CES function (9), the scale function with the dominating terms becomes, cf. (9), (11), (91), (94),

$$S(0) = \lim_{k \to 0} \int_{k_0}^{k} s(k)\,dk = \int_{k_0}^{0} (k/k_0)^{2[\Theta - s\gamma a^{\sigma/(\sigma - 1)}]/\beta^2}\,dk \tag{95}$$

Hence, it follows from (95) that the divergence of $S(0)$ requires that the exponent $2[\Theta - s\gamma a^{\sigma/(\sigma - 1)}]/\beta^2 \leq -1$, which is the lower boundary condition in (86), cf. (66).

$\sigma > 1$: With the CES function (9), the scale function with the dominating terms becomes, cf. (13),

$$S(0) = \lim_{k \to 0} \int_{k_0}^{k} s(k)\,dk = \lim_{k \to 0} \int_{k_0}^{k} \exp\Big\{\frac{2s\gamma(1 - a)^{\frac{\sigma}{\sigma - 1}}}{\bar{b}^2}\,(k^{-1} - k_0^{-1})\Big\}\,dk \tag{96}$$

where $\bar{b}^2 \equiv \beta^2 + \beta_3^2\lambda_3^2\gamma^2(1 - a)^{2\sigma/(\sigma - 1)} - \rho\beta_3\lambda_3\gamma(1 - a)^{\sigma/(\sigma - 1)} > 0$. Since the parameter $2s\gamma(1 - a)^{\sigma/(\sigma - 1)}/\bar{b}^2$ in the exponential term is always positive, it is seen by (96) that $S(0)$ is always diverging; hence, $k = 0$ is inaccessible, irrespective of parameter restrictions, cf. (68).

Upper boundary. $\sigma = 1$: With the CD function (7), the scale density $s(k)$ becomes, cf. (44), (90),

$$s(k) = \exp\Big\{-2\int_{k_0}^{k} \frac{[s - \rho_{13}\beta_1\phi_3(k)]\gamma k^{a} - \Theta k}{\beta^2 k^2 + \phi_3^2(k)\gamma^2 k^{2a} - \rho\,\phi_3(k)\gamma k^{1 + a}}\,dk\Big\} \tag{97}$$

Since $a < 1$, keeping the dominating terms in the numerator and denominator of (97) gives, for large $k$, cf. (92),

$$s(k) \sim \exp\Big\{-2\int_{k_0}^{k} \frac{-\Theta k}{\beta^2 k^2}\,dk\Big\} = (k/k_0)^{2\Theta/\beta^2} \tag{98}$$

The divergence of $S(\infty)$ from (98) is analogous to the result in (64), (59); hence, we have (88) for $\sigma = 1$.

$\sigma < 1$: With the CES function (9), the scale density $s(k)$ becomes, keeping the dominant terms for large $k$, cf. (97), (92),

$$s(k) \sim \exp\Big\{-2\int_{k_0}^{k} \frac{-\Theta k}{\beta^2 k^2}\,dk\Big\} = (k/k_0)^{2\Theta/\beta^2} \tag{99}$$

which is the same as (98), and the divergence of $S(\infty)$ is analogous to (67), (58).

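$\sigma > 1$: From (97), (9), (13), (92), we have, for large $k$,

$$s(k) \sim \exp\Big\{-2\int_{k_0}^{k} \frac{(s - \rho_{13}\beta_1\beta_3)\gamma a^{\sigma/(\sigma - 1)}k - \Theta k}{\beta^2 k^2 + \beta_3^2\gamma^2 a^{2\sigma/(\sigma - 1)}k^2 - \rho\beta_3\gamma a^{\sigma/(\sigma - 1)}k^2}\,dk\Big\} = \exp\Big\{-2\int_{k_0}^{k} (\bar{a}/\bar{b}^2)\,k^{-1}\,dk\Big\} = (k/k_0)^{-2\bar{a}/\bar{b}^2} \tag{100}$$

$$\bar{a} \equiv (s - \rho_{13}\beta_1\beta_3)\gamma a^{\frac{\sigma}{\sigma - 1}} - \Theta, \qquad \bar{b}^2 \equiv \beta^2 + \beta_3^2\gamma^2 a^{\frac{2\sigma}{\sigma - 1}} - \rho\beta_3\gamma a^{\frac{\sigma}{\sigma - 1}} > 0.$$

Now $S(\infty)$ from (100) diverges, cf. the analogue (85) and (75), if the exponent is sufficiently large: $-2\bar{a}/\bar{b}^2 \geq -1$. Rewriting the exponent, using (32), (34), gives our condition (89), where $\Delta \equiv \bar{b}^2 - \beta^2 + 2\rho_{13}\beta_1\beta_3\gamma a^{\sigma/(\sigma - 1)} = [\beta_3\gamma a^{\sigma/(\sigma - 1)}]^2 - 2\rho_{23}\beta_2\beta_3\gamma a^{\sigma/(\sigma - 1)}$. □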
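The sufficient conditions (86)-(89) of Theorem 4 are simple parameter inequalities and can be evaluated mechanically. The following is a minimal sketch; the function name `theorem4_conditions`, its argument order, and the example parameter values are illustrative assumptions.

```python
def theorem4_conditions(sigma, s, n, delta, gamma, a, beta1, beta2, beta3, rho23):
    """Evaluate the sufficient inaccessibility conditions (86)-(89).

    Returns (lower_ok, upper_ok): whether the sufficient condition holds for
    the lower boundary k = 0 and for the upper boundary k = infinity.
    """
    base = beta1 ** 2 - beta2 ** 2
    if sigma == 1.0:
        # (87) and (88): CD case.
        return True, 2.0 * (n + delta) >= base
    A = a ** (sigma / (sigma - 1.0))  # a^{sigma/(sigma-1)}
    if sigma < 1.0:
        lower_ok = 2.0 * (n + delta - s * gamma * A) <= base   # (86)
        upper_ok = 2.0 * (n + delta) >= base                   # (88)
        return lower_ok, upper_ok
    # sigma > 1: (87) and (89), with Delta as defined below (89).
    Delta = (beta3 * gamma * A) ** 2 - 2.0 * rho23 * beta2 * beta3 * gamma * A
    upper_ok = 2.0 * (n + delta - s * gamma * A) >= base - Delta
    return True, upper_ok
```

For instance, with `sigma=2.0, s=0.9, n=0.02, delta=0.05, gamma=1.0, a=0.9` and small `beta` values, the upper condition (89) fails, which is the parameter region of persistent growth.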

The assumptions about $\phi_3(k)$, (23), have removed the uncertainty in saving rates (21) entirely from the lower boundary problems with $\sigma \geq 1$, cf. (74) and (75), because, with (23), we now have that (87) holds, irrespective of the size of any drift and diffusion parameters. With $\sigma < 1$, proper parameter restrictions (86) can safeguard against attaining $k = 0$. Thus, for any substitution elasticity of the CES technology, the stochastic neoclassical growth model of Theorem 1, (29)-(34), is made fully workable without any boundary problems (extinction, explosion).

5.4.4 Neoclassical SDE and asymptotic non-stationarity

The relaxation of the sufficient inaccessibility condition $S(\infty) = \infty$ does not itself in the long run imply an explosion. Still, to avoid any risk of implosion, we want to keep the sufficient condition $S(0) = -\infty$. But, together with a finite $S(\infty)$, we have the following well-known implications (with probability one),

$$[S(0) = -\infty \wedge S(\infty) < \infty] \Rightarrow \lim_{t \to \infty} k(t) = \infty \Rightarrow \lim_{t \to \infty} E[k(t)] = \infty \tag{101}$$

Within our stochastic neoclassical growth model, a finite $S(\infty)$ is simply equivalent to reversing the inequality in (88)-(89). For $\sigma \leq 1$, the reverse of (88) is $2(n + \delta) \leq \beta_1^2 - \beta_2^2$. The latter implies that $k(t) \to \infty$ as $t \to \infty$, but it is a pathological case, as the reversal of (88) also implies that $L(t) \to 0$ (although never reached in finite time). In short, no relevant stochastic endogenous growth is possible with $\sigma \leq 1$. Hence, as in the deterministic case, stochastic endogenous economic growth requires that the marginal product of capital is bounded below, i.e., $\sigma > 1$, cf. (13).

By reversing (89), the sufficient condition of persistent growth becomes, cf. (101), (87),

$$\sigma > 1: \quad S(\infty) < \infty \Leftrightarrow s\gamma a^{\frac{\sigma}{\sigma - 1}} \geq n + \delta + \tfrac{1}{2}(-\beta_1^2 + \beta_2^2 + \Delta) \tag{102}$$

which is the stochastic analogue to the deterministic condition (with only $n + \delta$ on the RHS) of endogenous (persistent) growth; see Jensen and Wang (1997, p. 93), Jensen and Larsen (1987). We note from (102) that it is generally more difficult (higher saving rates are required) to achieve persistent economic growth per capita in the face of uncertainty, as $n - \tfrac{1}{2}\beta_1^2 > 0$ is now taken for granted in (102), and $\Delta$ is always positive when $\rho_{23} = 0$, cf. (89). Uncertainties in the accumulation of capital, (23), (25), ($\beta_2 \neq 0$, $\beta_3 \neq 0$) make the stochastic analogue (102) harder to satisfy. The rapidity of stochastic growth is not directly seen by (102). However, with $S(\infty) < \infty$, the stochastic differential equation (30) is, asymptotically,

$$dk \sim \bar{a}\,k\,dt + \bar{b}\,k\,dw \equiv \{(s - \rho_{13}\beta_1\beta_3)\gamma a^{\frac{\sigma}{\sigma - 1}} - \Theta\}\,k\,dt + [\beta^2 + \beta_3^2\gamma^2 a^{\frac{2\sigma}{\sigma - 1}} - \rho\beta_3\gamma a^{\frac{\sigma}{\sigma - 1}}]^{\frac{1}{2}}\,k\,dw \tag{103}$$

i.e., geometric Wiener processes with sample paths:

$$k(t) \sim k_0 \exp\{(\bar{a} - \bar{b}^2/2)\,t + \bar{b}\,w(t)\}; \qquad 2\bar{a} > \bar{b}^2 \tag{104}$$

It is easily verified that the exponential growth condition, $\bar{a} - \tfrac{1}{2}\bar{b}^2 > 0$ in (104), is equivalent to (102) with strict inequality: substituting $\Theta$ from (32) and $\beta^2$, $\rho$ from (34), the terms in $\rho_{12}$ and $\rho_{13}$ cancel. Thus, the stochastic condition (102) is indeed the analogue of deterministic exponential per capita growth in the neoclassical growth model.

5.5 Explicit steady-state distribution with CD technologies

Having obtained the conditions for the existence of and convergence to a steady-state (time-invariant) distribution, cf. (59), we also want to obtain as a benchmark, with CD sector technologies, a closed form expression for the time invariant probability density function $p(k)$ and the distribution function $P(k)$ of the diffusion process (56). It turns out that the benchmark distribution function $P(k)$ for the CD economy can be expressed by gamma $\Gamma(\alpha)$ and incomplete gamma functions $\Gamma(\alpha, x_0)$, which are generally defined, respectively, by the improper integrals,

$$\Gamma(\alpha) \equiv \int_0^{\infty} x^{\alpha - 1} e^{-x}\,dx, \qquad \Gamma(\alpha, x_0) \equiv \int_{x_0}^{\infty} x^{\alpha - 1} e^{-x}\,dx, \qquad \alpha > 0 \tag{105}$$

Theorem 5. The time invariant (steady-state) distribution $P(k)$ for the stochastic process (56), cf. Theorem 1, will have a density function $p(k)$ if and only if:

$$2\Theta + \beta^2 > 0 \quad \Leftrightarrow \quad 2(n + \delta) > \beta_1^2 - \beta_2^2 \tag{106}$$

With (106), the time invariant probability density function $p(k)$ is in closed form,

$$p(k) = c_0\,k^{-2[1 + \Theta/\beta^2]} \exp\{-c\,k^{-(1 - a)}\}, \qquad 0 \ [\ldots]$$

$$[\ldots]\ 0 \quad \Leftrightarrow \quad n + \delta > \beta_1^2 + \rho_{12}\beta_1\beta_2 \tag{111}$$
$$[\ldots] \quad \Leftrightarrow \quad 2(n + \delta) > 3\beta_1^2 + 4\rho_{12}\beta_1\beta_2 + \beta_2^2 \tag{112}$$

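With (111)-(112), the steady-state distribution $P(k)$, (110), will have first-order and second-order moments given by,

$$E(k) = \frac{\Gamma(\alpha^{*})}{\Gamma(\alpha)}\,c^{(1 - a)^{-1}}, \qquad E(k^2) = \frac{\Gamma(\alpha^{**})}{\Gamma(\alpha)}\,c^{2(1 - a)^{-1}}, \qquad \sigma^2 = E(k^2) - [E(k)]^2 \tag{113}$$

where

$$\alpha^{*} = \alpha - (1 - a)^{-1} \qquad \text{and} \qquad \alpha^{**} = \alpha - 2(1 - a)^{-1} \tag{114}$$

and $\alpha$ was given by (108).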
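The moment formulas (113)-(114) can be evaluated with log-gamma functions and cross-checked against direct numerical integration of the density. This is a sketch under a loudly stated assumption: the constants $c = 2s\gamma/[(1 - a)\beta^2]$ and $\alpha = (2\Theta + \beta^2)/[(1 - a)\beta^2]$ used below are derived here from the speed density of (56) via the substitution $x = c\,k^{-(1 - a)}$; they are assumed, not confirmed, to coincide with the constants defined in (107)-(108). The function names and parameter values are likewise illustrative.

```python
import math


def cd_stationary_moments(s, gamma, a, theta, beta_sq):
    """Mean and variance of the stationary density of (56), following (113)-(114).

    ASSUMPTION: c and alpha are derived in this sketch from the speed density,
    not copied from the source's definitions.
    """
    u = 1.0 - a
    c = 2.0 * s * gamma / (u * beta_sq)
    alpha = (2.0 * theta + beta_sq) / (u * beta_sq)
    alpha1 = alpha - 1.0 / u   # alpha*  in (114)
    alpha2 = alpha - 2.0 / u   # alpha** in (114)
    if alpha2 <= 0.0:
        raise ValueError("second moment does not exist (alpha** <= 0)")
    mean = math.exp(math.lgamma(alpha1) - math.lgamma(alpha)) * c ** (1.0 / u)
    second = math.exp(math.lgamma(alpha2) - math.lgamma(alpha)) * c ** (2.0 / u)
    return mean, second - mean ** 2


def numeric_mean(s, gamma, a, theta, beta_sq, k_max=200.0, steps=200_000):
    """Mean of the density proportional to k^{-2(1+theta/beta^2)} exp(-c k^{-(1-a)}),
    computed with the trapezoidal rule on (0, k_max]."""
    u = 1.0 - a
    c = 2.0 * s * gamma / (u * beta_sq)
    q = 2.0 * (1.0 + theta / beta_sq)
    h = k_max / steps
    norm = 0.0
    first = 0.0
    for i in range(1, steps + 1):
        k = i * h
        w = 0.5 if i == steps else 1.0          # trapezoid end weight; integrand vanishes at k = 0
        dens = k ** (-q) * math.exp(-c * k ** (-u))
        norm += w * dens
        first += w * k * dens
    return first / norm
```

With `s=0.2, gamma=1.0, a=0.3, theta=0.06, beta_sq=0.02`, the two functions should agree on the mean to several digits.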

Bjarne S. Jensen, Martin Richter Proof: For the stochastic dynamic system (56), the speed densities are, cf. (44)–(45), m(k) =

exp{−2

k

sγ ua − Θu β 2 u2 2 β k2

˜ k

du}

0 0

$$ \iff \Theta > 0, \qquad \alpha^{**} > 0 \iff 2\Theta > \beta^2 \tag{119} $$

which gives the moment existence restrictions (111)–(112), and Theorem 5 is established. □

5.6 Sample paths and asymptotic densities with CD and CES technologies

We include simulations of both sample paths and asymptotic (long-run stationary) densities of the stochastic growth models. For different sets of model parameters, we calculate the steady-state values (mean, mode, standard deviation) of the long-run stationary processes, or alternatively, simulate particular sample paths with infinity as attractor for parameters with stochastic endogenous (persistent) growth.

If they exist, steady-state values (κ) of the capital-labor ratio in deterministic one-sector growth models are the critical points of (19):

$$ [\,\dot k = 0 \iff k(t) = \kappa\,] \iff \Big[\, \frac{f(\kappa)}{\kappa} = \frac{n+\delta}{s} \,\Big]; \qquad AP_K(\kappa) = (n+\delta)/s \tag{120} $$

Closed-form expressions for the root values (κ) - LHS of (120) with CD and CES technologies, (122) - are given in Table 1, with explicit expressions for other steady-state properties, cf. (121–123), (4),

$$ \omega(k) = \frac{1-a}{a}\, k^{1/\sigma} \tag{121} $$
$$ f(k) = \gamma k^{a}; \qquad f(k) = \gamma \big[(1-a) + a k^{(\sigma-1)/\sigma}\big]^{\sigma/(\sigma-1)} \tag{122} $$
$$ f'(k) = \gamma a k^{a-1}; \qquad f'(k) = \gamma a \big[a + (1-a) k^{-(\sigma-1)/\sigma}\big]^{1/(\sigma-1)} \tag{123} $$

These formulas of Table 1 show explicitly how six basic (structural) parameters - factor accumulation parameters (s, n, δ) and technology parameters (γ, a, σ) - determine various steady-state values (certainty equivalents). The tabulated CES formulas are rather elaborate parametric expressions, except $AP_K(\kappa)$ or its reciprocal (K/Y) at (κ). The actual invariance of $AP_K(\kappa)$ to changes in technology parameters is a peculiarity that is solely tied to one-sector growth models [hence absent in Table 3 below] together with the concept of steady states (balanced growth: $\dot L/L = \dot K/K = \dot Y/Y$), and the assumptions of: i) $\dot L/L = n$, and ii) constant gross saving rates, (s). By (19):

$$ d\ln k(t)/dt = \dot k/k \equiv \hat k = s\,AP_K(k) - (n+\delta) \tag{124} $$

This growth equation (124) has played an important role in empirical convergence studies, cf. Sala-i-Martin (1996, p. 1342), Quah (1996), Barro and Sala-i-Martin (1992), Barro (1991). Here the invariance of $AP_K(\kappa)$, (120), (124), allows a simple numerical consistency check of all the calculations of f(κ) and (κ) in Table 2.

Although little commentary is allowed or necessary here, the extensive set of CD and CES parameter cases in Table 2 - illustrating the formulas of Table 1 - deserves careful study and scrutiny, as such systematic steady-state numbers (certainty equivalents) are seldom shown; Table 2 is also important as a benchmark for the corresponding asymptotic expectations, E(k), and the stochastic growth model simulations.
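The consistency check mentioned above can be sketched numerically. The following minimal example, with illustrative function names that are not from the text, solves (120) for κ under the CES form (122) and then confirms that f(κ)/κ equals (n + δ)/s. The closed-form root used in the code is obtained by inverting (122) and is an assumption insofar as it may be arranged differently from the expression printed in Table 1.

```python
def ces_output(k, gamma, a, sigma):
    """Per capita CES output f(k), cf. (122); requires sigma != 1."""
    r = (sigma - 1.0) / sigma
    return gamma * ((1.0 - a) + a * k**r) ** (1.0 / r)

def ces_steady_state(s, n, delta, gamma, a, sigma):
    """Root kappa of f(kappa)/kappa = (n + delta)/s, cf. (120), for CES."""
    r = (sigma - 1.0) / sigma
    x = ((n + delta) / (gamma * s)) ** r - a   # equals (1 - a) * kappa**(-r)
    if x <= 0.0:
        raise ValueError("no finite steady state for these parameters")
    return ((1.0 - a) / x) ** (1.0 / r)

# CES case 1 of Table 2: s=0.20, n=0.02, delta=0.05, gamma=1, a=0.25, sigma=0.5
s, n, delta, gamma, a, sigma = 0.20, 0.02, 0.05, 1.0, 0.25, 0.5
kappa = ces_steady_state(s, n, delta, gamma, a, sigma)
f_kappa = ces_output(kappa, gamma, a, sigma)
omega_kappa = (1.0 - a) / a * kappa ** (1.0 / sigma)   # wage-rental ratio, (121)

print(round(kappa, 3), round(f_kappa, 3), round(omega_kappa, 3))
print(f_kappa / kappa, (n + delta) / s)   # the two numbers should agree
```

For this case Table 2 lists κ = 3.476, f(κ) = 1.217 and ω(κ) = 36.252, which gives a direct comparison for the printed values.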

In addition to the six basic accumulation and technology parameters for (κ), E(k) is also affected by the uncertainty (volatility) and correlation parameters: (β1, β2, β3, β4, λ3, λ4, ρij), cf. (20–27), that are involved in the drift and diffusion coefficients, (31–34), of the stochastic capital-labor ratio, k(t). As most cases in Table 2 show, despite their theoretical distinctions, the actual values of (κ) and E(k) nearly coincide. Moreover, as the CD cases (8-9) show, the additional impact of (β1 = 0.01, β2 = 0.03) on E(k) needs 5 decimals to be seen. In the CD cases (14-15) and CES case (10), the starred values of (β3, λ3) indicate that they are both zero and instead represent (β4, λ4). The CD cases (13-14) and CES cases (9-10) in Table 2 show that interchanging the level and the uncertainty of (s) and (γ) is equivalent as to the impacts on: (κ), E(k), σ(k), mode(k).

The stationary densities p(k) are obtained, cf. (115-117), by numerically integrating and normalizing their respective speed densities, (44-45), with Mathematica. Some p(k) are exhibited in Figures 5.1 - 5.10. The statistics in Table 2 - expectations, standard deviations, cf. (118) - for the stationary distribution are also calculated by using Mathematica, but closed forms of σ(k) are used as a control whenever they exist, cf. CD, (113). The modes are obtained by solving (125).

In Merton (1975), with a one-sector CD-technology and uncertainty only in labor growth, it follows, Merton (1975, p. 383, footnote 1), that the deterministic steady-state value (κ) has the same value as the mode of the stationary (steady-state) distribution, and that (κ) is not equal to the expectation E(k) of the stationary distribution. But with uncertainty in both labor and depreciation rates, β2 ≠ 0, the mode and (κ) do not necessarily coincide. The mode(k) (the most probable long-run value of k) can be obtained by differentiating the density p(k) or the speed density m(k). By setting the derivative equal to zero, we have, cf. (44)–(45),

$$ p'(k) = 0 \iff m'(k) = 0 \iff b(k)\,b'(k) = a(k) \tag{125} $$

Solving (125), together with (30–34) and β3 = 0, (56), gives

$$ \text{mode}(k) = [\gamma s/(\Theta + \beta^2)]^{1/(1-a)} = [\gamma s/(n + \delta + \beta_2^2 + \rho_{12}\beta_1\beta_2)]^{1/(1-a)} \tag{126} $$

Since with CD, Table 1, we have $\kappa = [\gamma s/(n+\delta)]^{1/(1-a)}$, it follows that

$$ \text{mode}(k) = \kappa \iff \beta_2 = -\rho_{12}\beta_1; \qquad \beta_2 \neq 0 \tag{127} $$

This equality can only be satisfied if β1 ≥ β2. With ρ12 = 0, β2 > 0, and CD, cf. (126), we always find: mode(k) < κ. See also Table 2.
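A small sketch of (126)–(127), assuming ρ12 = 0 for the tabulated cases (the correlation values are not listed in Table 2) and using illustrative function names, compares the deterministic steady state κ with the mode for the CD cases 1-3, where β3 = 0:

```python
def cd_kappa(s, n, delta, gamma, a):
    """Deterministic CD steady state, kappa = [gamma*s/(n+delta)]^(1/(1-a))."""
    return (gamma * s / (n + delta)) ** (1.0 / (1.0 - a))

def cd_mode(s, n, delta, gamma, a, beta1, beta2, rho12=0.0):
    """Mode of the stationary CD density with beta3 = 0, cf. (126)."""
    denom = n + delta + beta2**2 + rho12 * beta1 * beta2
    return (gamma * s / denom) ** (1.0 / (1.0 - a))

base = dict(s=0.20, n=0.02, delta=0.05, gamma=1.0)
rows = []
for a in (0.20, 0.25, 0.40):          # CD cases 1, 2, 3 of Table 2
    kappa = cd_kappa(a=a, **base)
    mode = cd_mode(a=a, beta1=0.01, beta2=0.03, **base)
    rows.append((a, kappa, mode))
    print(a, round(kappa, 3), round(mode, 3))

# Condition (127): with beta2 = -rho12*beta1 the mode equals kappa.
mode_equal = cd_mode(a=0.40, beta1=0.03, beta2=0.03, rho12=-1.0, **base)
```

With ρ12 = 0 and β2 > 0 the denominator in (126) exceeds n + δ, so the computed mode lies below κ in each row, in line with the remark after (127).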

The mode does not coincide with the expectation; nevertheless, the stationary densities p(k), which are not normal and are frequently very spiked, are often close to being symmetric (see the numbers in Table 2 and their shapes in Figures 5.1 - 5.10). But for some stationary distributions with basic parameters close to the boundary of stationarity (endogenous growth), we find heavy-tailed distributions with an expectation significantly larger than the mode.

We have chosen in Fig. 5.1 - 5.11 to exhibit the sample paths for k(t), ω(t), and y(t). All simulations are done using a simple Euler scheme for the underlying stochastic differential equations (SDE). Hence, in one-sector models, we simulate the sample paths for the capital-labor ratio, k(t), of the SDE, (30). The sample path (trajectory) of the stochastic process k(t), (30), is formally given by

$$ k(t) = k[t;\omega] = k(0) + \int_0^t a(k[u;\omega])\,du + \int_0^t b(k[u;\omega])\,dw_u[\omega] \tag{128} $$

where k(0) is a fixed initial condition and [ω] symbolizes a particular realization of the Wiener process. This sample path (128) is thus approximated by the Euler scheme, Kloeden & Platen (1995). The random numbers (realizations) used in the simulations are generated by the Ran2 generator from Numerical Recipes in C++ with the same initial seed. All processes are sampled with the same step size.

After having obtained the simulated sample path k(t), by numerically solving the SDE, (128), the sample paths for the wage-rental ratio, MRS = ω(t), and labor productivity, $AP_L$ = y(t) - with the same realization of the Wiener process - are determined, for every simulated time point, by inserting the sample path k(t), (128), into the CD-CES equations, (121–122):

$$ \omega(t) = \frac{1-a}{a}\, k(t)^{1/\sigma} \tag{129} $$
$$ y(t) = \gamma k(t)^{a}; \qquad y(t) = \gamma \big[(1-a) + a\, k(t)^{(\sigma-1)/\sigma}\big]^{\sigma/(\sigma-1)} \tag{130} $$

As observed in Table 2 and from (129), the numerical values of k(t) in Fig. 5.1 - 5.11 are located below (above) the values of ω(t), in CD cases, if a < 1/2 (a > 1/2), and if these conditions in CES cases are combined with σ ≤ 1 (σ ≥ 1). The values of y(t), (130), are, with the selected size of (γ), located below the sample path of k(t).
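The simulation procedure described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' code: it uses the CD drift a(k) = sγk^a − Θk and diffusion b(k) = βk with β3 = 0 (the forms implied by (126)), sets ρ12 = 0, replaces the Ran2 generator by Python's standard `random` module, and picks a hypothetical step size of 0.1.

```python
import math
import random

def euler_path_cd(k0, s, n, delta, gamma, a, beta1, beta2,
                  rho12=0.0, t_end=500.0, dt=0.1, seed=1):
    """Euler scheme for dk = (s*gamma*k**a - Theta*k) dt + beta*k dw (CD, beta3 = 0)."""
    theta = n + delta - (beta1**2 + rho12 * beta1 * beta2)
    beta = math.sqrt(beta1**2 + beta2**2 + 2.0 * rho12 * beta1 * beta2)
    rng = random.Random(seed)              # same seed -> same Wiener realization
    n_steps = int(round(t_end / dt))
    k = k0
    path = [k]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))   # Wiener increment ~ N(0, dt)
        k += (s * gamma * k**a - theta * k) * dt + beta * k * dw
        k = max(k, 1e-12)                    # keep k positive so k**a stays defined
        path.append(k)
    return path

# CD case 1 of Table 2, two initial values driven by the same realization
params = dict(s=0.20, n=0.02, delta=0.05, gamma=1.0, a=0.20, beta1=0.01, beta2=0.03)
path_low = euler_path_cd(k0=1.0, **params)
path_high = euler_path_cd(k0=10.0, **params)

a_cd, gamma_cd = params["a"], params["gamma"]
omega_low = [(1.0 - a_cd) / a_cd * k for k in path_low]   # (129) with sigma = 1
y_low = [gamma_cd * k**a_cd for k in path_low]            # (130), CD branch

print(path_low[-1], path_high[-1])
```

Because both paths share the same seed, they are driven by the same shocks and the influence of the initial value dies out over time, which is the merging behavior described for the figures.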


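Table 1. Steady-state values (certainty equivalents) of one-sector growth models

Variables, with the CD (σ = 1) and CES (σ ≠ 1) expressions:

Capital-labor ratio: K/L = κ
  CD:  $[\gamma s/(n+\delta)]^{1/(1-a)}$
  CES: $(1-a)^{\sigma/(\sigma-1)} \big[ (\gamma s/(n+\delta))^{(1-\sigma)/\sigma} - a \big]^{\sigma/(1-\sigma)}$

Average product of L: $AP_L$ = f(κ)
  CD:  $\gamma^{1/(1-a)} [s/(n+\delta)]^{a/(1-a)}$
  CES: $\gamma (1-a)^{\sigma/(\sigma-1)} \big[ 1 - a (\gamma s/(n+\delta))^{(\sigma-1)/\sigma} \big]^{\sigma/(1-\sigma)}$

Marginal product of L: $MP_L(\kappa)$
  CD:  $(1-a)\,\gamma^{1/(1-a)} [s/(n+\delta)]^{a/(1-a)}$
  CES: $\gamma (1-a)^{\sigma/(\sigma-1)} \big[ 1 - a (\gamma s/(n+\delta))^{(\sigma-1)/\sigma} \big]^{1/(1-\sigma)}$

Average product of K: $AP_K(\kappa)$, f(κ)/κ
  CD:  $(n+\delta)/s$
  CES: $(n+\delta)/s$

Capital-output ratio: K/Y, κ/f(κ)
  CD:  $s/(n+\delta)$
  CES: $s/(n+\delta)$

Marginal product of K: $MP_K(\kappa)$ = f′(κ)
  CD:  $a(n+\delta)/s$
  CES: $a\,\gamma^{(\sigma-1)/\sigma} [(n+\delta)/s]^{1/\sigma}$

Wage-rental ratio: ω(κ)
  CD:  $\frac{1-a}{a} [\gamma s/(n+\delta)]^{1/(1-a)}$
  CES: $\frac{1}{a} (1-a)^{\sigma/(\sigma-1)} \big[ (\gamma s/(n+\delta))^{(1-\sigma)/\sigma} - a \big]^{1/(1-\sigma)}$

Capital share: $\epsilon_K(\kappa) = 1 - \epsilon_L(\kappa)$
  CD:  $a$
  CES: $a\,[\gamma s/(n+\delta)]^{(\sigma-1)/\sigma}$

Per capita consumption: $c_L(\kappa)$
  CD:  $(1-s) f(\kappa)$
  CES: $(1-s) f(\kappa)$

Per capita saving: $s_L(\kappa)$
  CD:  $s f(\kappa)$
  CES: $s f(\kappa)$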

Table 2. Numerical cases for one-sector growth models: CD (σ = 1) and CES

Parameter values - stationary models, CD (σ = 1)

case  s     n     δ     γ    a     σ    β1    β2    β3     λ3
1     0.20  0.02  0.05  1.0  0.20  1.0  0.01  0.03  0.00   0
2     0.20  0.02  0.05  1.0  0.25  1.0  0.01  0.03  0.00   0
3     0.20  0.02  0.05  1.0  0.40  1.0  0.01  0.03  0.00   0
4     0.20  0.02  0.05  3.0  0.40  1.0  0.01  0.03  0.00   0
5     0.20  0.02  0.05  1.0  0.40  1.0  0.00  0.00  0.10   1
6     0.25  0.02  0.05  1.0  0.40  1.0  0.00  0.00  0.05   1
7     0.25  0.02  0.05  1.0  0.40  1.0  0.00  0.00  0.10   1
8     0.30  0.02  0.08  1.0  0.40  1.0  0.00  0.00  0.10   1
9     0.30  0.02  0.08  1.0  0.40  1.0  0.01  0.03  0.10   1
10    0.20  0.02  0.08  1.0  0.40  1.0  0.01  0.03  0.10   1
11    0.20  0.01  0.08  1.0  0.40  1.0  0.00  0.00  0.10   1
12    0.20  0.01  0.08  2.0  0.40  1.0  0.00  0.00  0.10   1
13    0.20  0.01  0.08  0.3  0.40  1.0  0.00  0.00  0.10   1
14    0.30  0.01  0.08  0.2  0.40  1.0  0.00  0.00  0.10*  1*
15    0.33  0.01  0.08  1.0  0.40  1.0  0.00  0.00  0.10*  1*
16    0.25  0.02  0.05  1.0  0.60  1.0  0.01  0.03  0.10   1

Model characteristics, CD (σ = 1)

case  κ       E(k)     mode(k)  σ(k)   ω(κ)    f(κ)/κ  f′(κ)  f(κ)    ε_K(κ)  K/Y    (n+δ)/s
1     3.715   3.718    3.656    0.353  14.858  0.350   0.070  1.300   0.20    2.857  0.350
2     4.054   4.057    3.986    0.398  12.163  0.350   0.088  1.419   0.25    2.857  0.350
3     5.753   5.753    5.632    0.631  8.629   0.350   0.140  2.014   0.40    2.857  0.350
4     35.900  35.900   35.143   3.936  53.849  0.350   0.140  12.565  0.40    2.857  0.350
5     5.753   5.736    5.685    0.694  8.629   0.350   0.140  2.014   0.40    2.857  0.350
6     8.345   8.341    8.329    0.403  12.517  0.280   0.112  2.336   0.40    3.571  0.280
7     8.345   8.329    8.282    0.806  12.517  0.280   0.112  2.336   0.40    3.571  0.280
8     6.240   6.22862  6.194    0.600  9.360   0.333   0.133  2.080   0.40    3.000  0.333
9     6.240   6.22858  6.102    0.830  9.360   0.333   0.133  2.080   0.40    3.000  0.333
10    3.175   3.162    3.072    0.541  4.762   0.500   0.200  1.587   0.40    2.000  0.500
11    3.784   3.770    3.726    0.517  5.676   0.450   0.180  1.703   0.40    2.222  0.450
12    12.014  11.969   11.833   1.644  18.021  0.450   0.180  5.406   0.40    2.222  0.450
13    0.509   0.508    0.504    0.033  0.763   0.450   0.180  0.229   0.40    2.222  0.450
14    0.509   0.509    0.504    0.033  0.763   0.300   0.120  0.153   0.40    3.333  0.300
15    8.719   8.707    8.719    0.079  13.078  0.273   0.109  2.378   0.40    3.667  0.273
16    24.105  23.960   22.949   4.312  16.070  0.280   0.168  6.749   0.60    3.571  0.280

Parameter values - stationary models, CES

case  s     n     δ     γ    a     σ    β1    β2    β3    λ3
1     0.20  0.02  0.05  1.0  0.25  0.5  0.01  0.03  0.0   0
2     0.20  0.02  0.05  1.0  0.40  0.5  0.01  0.03  0.0   0
3     0.20  0.02  0.05  1.0  0.60  0.5  0.01  0.03  0.0   0
4     0.20  0.02  0.05  3.0  0.60  0.5  0.01  0.03  0.0   0
5     0.20  0.02  0.05  1.0  0.40  1.5  0.01  0.03  0.0   0
6     0.25  0.02  0.05  1.0  0.40  1.5  0.01  0.03  0.0   0
7     0.25  0.02  0.05  1.0  0.60  1.5  0.01  0.03  0.0   0
8     0.20  0.02  0.05  1.0  0.60  1.5  0.01  0.03  0.0   0
9     0.20  0.01  0.08  0.3  0.40  1.5  0.00  0.00  0.1   1
10    0.30  0.01  0.08  0.2  0.40  1.5  0.00  0.00  0.1*  1*
11    0.20  0.02  0.05  1.0  0.40  2.0  0.01  0.03  0.0   0
12    0.20  0.02  0.05  1.0  0.40  3.0  0.01  0.03  0.0   0
13    0.20  0.02  0.05  1.0  0.40  7.0  0.01  0.03  0.0   0
14    0.25  0.02  0.05  1.0  0.40  3.0  0.01  0.03  0.0   0

Model characteristics, CES

case  κ        E(k)     mode(k)  σ(k)     ω(κ)     f(κ)/κ  f′(κ)  f(κ)     ε_K(κ)  K/Y    (n+δ)/s
1     3.476    3.479    3.428    0.309    36.252   0.350   0.031  1.217    0.088   2.857  0.350
2     4.095    4.097    4.035    0.375    25.157   0.350   0.049  1.433    0.140   2.857  0.350
3     5.643    5.642    5.552    0.539    21.228   0.350   0.074  1.975    0.210   2.857  0.350
4     19.929   19.949   19.657   1.756    264.766  0.350   0.024  6.975    0.070   2.857  0.350
5     7.633    7.635    7.412    0.988    5.815    0.350   0.199  2.672    0.568   2.857  0.350
6     13.148   13.147   12.724   1.795    8.356    0.280   0.171  3.681    0.611   3.571  0.280
7     401.664  398.071  345.493  120.230  36.293   0.280   0.257  112.466  0.917   3.571  0.280
8     55.715   55.493   51.178   12.365   9.725    0.350   0.298  19.500   0.851   2.857  0.350
9     0.523    0.523    0.518    0.033    0.974    0.450   0.157  0.235    0.349   2.222  0.450
10    0.523    0.523    0.518    0.033    0.974    0.300   0.105  0.157    0.349   3.333  0.300
11    9.806    9.812    9.429    1.471    4.697    0.350   0.237  3.432    0.676   2.857  0.350
12    15.469   15.507   14.502   3.021    3.737    0.350   0.282  5.414    0.805   2.857  0.350
13    191.698  203.598  105.010  182.373  3.178    0.350   0.344  67.094   0.984   2.857  0.350
14    99.222   99.721   82.518   34.630   6.944    0.280   0.262  27.782   0.935   3.571  0.280

Parameters - Endogenous growth models (CES), with limits for k → ∞

case  s     n     δ     γ    a     σ    β1    β2    β3  λ3  ω(k)  f(k)/k  f′(k)
1     0.25  0.02  0.05  1.0  0.40  4.0  0.01  0.03  0   0   ∞     0.295   0.295
2     0.20  0.02  0.05  1.0  0.60  2.0  0.01  0.03  0   0   ∞     0.360   0.360
3     0.25  0.02  0.05  1.0  0.60  2.0  0.01  0.03  0   0   ∞     0.360   0.360
4     0.30  0.02  0.05  1.0  0.60  2.0  0.01  0.03  0   0   ∞     0.360   0.360

Among the alternative initial values k(0) for sample paths (128) and deterministic trajectories (smooth curves) are: [k(0) = 1, 10, 30]; corresponding values of ω(0), y(0), follow from (129–130). With different parameter cases selected from Table 2, the relevant scaling of the vertical axis in Fig. 5.1 - 5.11 is, for illustrative visual and comparative purposes, a delicate graphic problem (in particular without colors). However, the salient features of many sample paths arising from stationary and non-stationary stochastic dynamics are apparent in our Fig. 5.1 - 5.11 for one-sector stochastic growth models.

In the CD case of Fig. 5.1, the influence of the respective initial values upon sample paths of ω(t) has mostly disappeared around (t = 100), cf. (LHS, transient movements), and the two curves have henceforth visually merged on the long-run time interval, (t = 100 − 500). The asymptotic (stationary) probability density of ω(t) is also depicted in Fig. 5.1, cf. (RHS, light). The sample paths of k(t) - and (transient/long-run) deterministic (black) trajectories and a horizontal line for (κ) - are similarly shown in Fig. 5.1, together with p(k), (115-117), cf. (RHS, dark). To avoid clutter, the sample paths of y(t) are presented without a density curve on the RHS of Fig. 5.1.

The overall pattern of transient motion and long-run evolution for the sample paths of Fig. 5.1 is essentially repeated for the alternative CD and CES sample paths exhibited in Fig. 5.2 - 5.10. But, evidently, the higher the long-run (steady-state) level of the sample paths is located, the larger is the volatility of the process (and the standard deviation of its asymptotic density) - simply because the diffusion coefficients, b(k), (33-34), are increasing functions of (k).
It is noteworthy that in the short run (transient motion), the upward (downward) parts of sample paths look more regular (monotone) and deceptively less noisy than the underlying reality (process). The uncertainty (volatility) is hidden by the fact that the occurrence of some negative random shocks from the diffusion term, b(k) dw, has been masked (absorbed) by the temporary upward contribution from the dominating drift term, a(k) dt - which later loses its dominance and then leaves the scene free for the volatility effects of the diffusion term, b(k) dw. Only with a low speed (drift) are bicycles susceptible to shocks and show erratic motions.

Remark 4. For the stationary cases, the “mean reverting effect” for the simulated sample paths, k(t), is ensured by strong mixing properties of the processes. This has the effect that, far from the equilibrium, the “mean reverting effect kills the volatility of the process”. Closer to the mode of the distributions, we have more volatility, and the process behavior is more like a random walk. The mean reverting effect is also depicted in expectations, E[k(t)], since E[k(t)] converges exponentially fast towards the stationary mean value.

As mentioned at the beginning of this section, parametric variation within the basic stochastic growth model (30-34) can also generate non-stationary stochastic processes with persistent (endogenous) growth solutions (sample paths) for the capital-labor ratio, k(t). We will briefly exhibit the stochastic solutions to (57) under parametric CES regimes that satisfy the fundamental sufficient (with probability one) condition (102) for long-run per capita growth - which for (57) [with β3 = 0] just corresponds to reversing the condition, (60). Condition (102) with (β3 = 0) is easily verified to be satisfied by the last four CES cases given in Table 2 [and of course by no other parameter cases listed in Table 2; CES case 13 is close to, but not sufficient, as: 0.0687 < 0.0704]. The sizes of the parameters (a) and (s) are seen in Table 2 to be critical for reducing the actual size of the substitution elasticity (σ) that is required for satisfying (102).

The endogenous stochastic growth paths for k(t), y(t) and ω(t), cf. (128–130), for the CES cases (3-4) are simulated and displayed in Fig. 5.11. The time scale (unit: year, quarter, month) of economic growth models is seldom given much attention. However, whatever unit is appropriate, the sample paths in Fig. 5.11 demonstrate the character of the possible growth solutions to the stochastic process (57). We see also that the growth effects of a higher saving rate are significant on an extended time horizon. Moreover, as mentioned above, for the transient upward part of sample paths, a strong drift term can now, besides generating long-term growth, also absorb negative shocks along sample paths; fast motion (growth) is helpful.
Such a stochastic growth path will, like deterministic growth, exhibit a pronounced tendency for monotone evolution. For large values of k(t), the SDE (57) is approximated by an analogously parameterized geometric Wiener process, (103), with sample paths, (104), and their explicit expectation and standard deviation given by, cf. Dixit and Pindyck (1994, p.71), de La Grandville (2001, p.292):

$$ E[k(t)] \sim k_0 e^{\bar a t}, \qquad \sigma(t) = k_0 e^{\bar a t}\, (e^{\bar b^2 t} - 1)^{1/2} \tag{131} $$

Incidentally, we note that (131) is the one-dimensional version of the expectation vector and the covariance matrix of the GWD (Geometric Wiener Diffusion) model, cf. Jensen et al (2007, Theorem 4, p. 88). Geometric Wiener processes have no asymptotic stationary density functions (curves) of the kind shown in Fig. 5.1 - 5.10.
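The quoted comparison 0.0687 < 0.0704 for CES case 13 and the moments (131) can be illustrated with a short sketch. Condition (102) is not reproduced in this section; the code assumes it takes the form sγa^{σ/(σ−1)} > Θ + β²/2 with β3 = 0 and ρ12 = 0, which matches the two numbers quoted for case 13 and corresponds to ā − b̄²/2 > 0 in (104) when ā = sγa^{σ/(σ−1)} − Θ and b̄ = β. These identifications and all function names are assumptions made for illustration.

```python
import math

def growth_condition_sides(s, n, delta, gamma, a, sigma, beta1, beta2, rho12=0.0):
    """Return (lhs, rhs) of the assumed form of (102):
    s*gamma*a**(sigma/(sigma-1)) > Theta + beta^2/2, with beta3 = 0."""
    theta = n + delta - (beta1**2 + rho12 * beta1 * beta2)
    beta_sq = beta1**2 + beta2**2 + 2.0 * rho12 * beta1 * beta2
    lhs = s * gamma * a ** (sigma / (sigma - 1.0))
    rhs = theta + 0.5 * beta_sq
    return lhs, rhs

def gwp_mean_sd(k0, a_bar, b_bar, t):
    """Expectation and standard deviation of a geometric Wiener process, cf. (131)."""
    mean = k0 * math.exp(a_bar * t)
    sd = mean * math.sqrt(math.exp(b_bar**2 * t) - 1.0)
    return mean, sd

common = dict(n=0.02, delta=0.05, gamma=1.0, beta1=0.01, beta2=0.03)

# CES case 13 of Table 2 (stationary, close to the growth boundary)
lhs13, rhs13 = growth_condition_sides(s=0.20, a=0.40, sigma=7.0, **common)
print(round(lhs13, 4), round(rhs13, 4))

# The four endogenous growth cases of Table 2: (s, a, sigma)
endogenous = [(0.25, 0.40, 4.0), (0.20, 0.60, 2.0), (0.25, 0.60, 2.0), (0.30, 0.60, 2.0)]
results = [growth_condition_sides(s=s, a=a, sigma=sig, **common) for s, a, sig in endogenous]

# Moments (131) for endogenous case 3 after 100 time units, starting from k0 = 10
lhs3, rhs3 = results[2]
a_bar = lhs3 - (rhs3 - 0.5 * 0.001)      # a_bar = lhs - Theta, with beta^2 = 0.001
mean100, sd100 = gwp_mean_sd(k0=10.0, a_bar=a_bar, b_bar=math.sqrt(0.001), t=100.0)
print(mean100, sd100)
```

Under these assumptions the left-hand side exceeds the right-hand side for each of the four endogenous growth cases, while case 13 falls just short.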

Figure 5.1: CD: case 1 [plot omitted: sample paths of k(t), ω(t), y(t) and stationary densities]

Figure 5.2: CD: case 3 [plot omitted: sample paths of k(t), ω(t), y(t) and stationary densities]

Figure 5.3: CD: case 4 [plot omitted: sample paths of k(t), ω(t), y(t) and stationary densities]

Figure 5.4: CD: case 8 [plot omitted: sample paths of k(t), ω(t), y(t) and stationary densities]

Figure 5.5: CD: case 9 [plot omitted: sample paths of k(t), ω(t), y(t) and stationary densities]

Figure 5.6: CD: case 16 [plot omitted: sample paths of k(t), ω(t), y(t) and stationary densities]

Figure 5.7: CES: case 2 [plot omitted: sample paths of k(t), ω(t), y(t) and stationary densities]

Figure 5.8: CES: case 3 [plot omitted: sample paths of k(t), ω(t), y(t) and stationary densities]

Figure 5.9: CES: case 5 [plot omitted: sample paths of k(t), ω(t), y(t) and stationary densities]

The uncertainty (volatility) expressed by σ(t) in (131) seems overwhelming. However, the formula (131) describes the potential volatility associated with all (infinitely many) possible sample paths (realizations, stochastic simulations) for k(t) = k[t; ω], (128); hence (131) says nothing about an individual (single) sample path (realization, simulation), however erratic it may appear. Thus, the σ(t) expression of volatility in (131) does not contradict the calm picture of the evolutions (sample paths) exhibited in Fig. 5.11. Economic history (society and individual) is fortunately not repeated (replayed), and the unique growth histories (paths), although uncertain, look like the calm (monotone) sample paths exhibited in Fig. 5.11. Evidently, stochastic growth models contribute to our understanding of historical time series observed from growing economies.

Figure 5.10: CES: case 11 [plot omitted: sample paths of k(t), ω(t), y(t) and stationary densities]

Figure 5.11: CES: case 3 and 4 - endogenous growth [plot omitted: two panels with sample paths of k(t), y(t), ω(t)]

5.7 General equilibria of two-sector economies

Great emphasis was naturally first given to labor and capital accumulation in aggregate (one-sector) growth models. An extensive literature on two-sector growth models, however, began in the 1960s. The seminal work on two-sector growth models with flexible sector technologies was done by Uzawa (1961-62, 1963), Solow (1961-62), Inada (1963), Drandakis (1963). The main expositions and references to the early two-sector growth literature are: Stiglitz & Uzawa (1969), Burmeister & Dobell (1970), Wan Jr. (1971), Gandolfo (1980). The study of general equilibrium dynamics in two-sector and multisector growth models has been reviewed and extended in Jensen (2003), Jensen & Larsen (2005) - with emphasis on factor allocation, output composition, and the dualities for commodity and factor prices.

5.7.1 Factor Endowment Allocation and Prices

We now consider an economy consisting of a capital good industry (sector) and a consumer good industry, labeled i = 1, 2, respectively. The factor endowments, total labor force (L) and the total capital stock (K), are inelastically supplied and are fully employed (utilized):

$$ L = L_1 + L_2, \qquad L_1/L + L_2/L \equiv \lambda_{L_1} + \lambda_{L_2} \equiv 1 \tag{132} $$
$$ K = K_1 + K_2, \qquad K_1/K + K_2/K \equiv \lambda_{K_1} + \lambda_{K_2} \equiv 1 \tag{133} $$
$$ k \equiv K/L \equiv \lambda_{L_1} k_1 + \lambda_{L_2} k_2 \equiv k_2 + (k_1 - k_2)\lambda_{L_1} \tag{134} $$
$$ \lambda_{L_1} \equiv (k - k_2)/(k_1 - k_2), \qquad \lambda_{K_i} \equiv (k_i/k)\,\lambda_{L_i}, \qquad k_1 \neq k_2 \tag{135} $$

where the factor allocation fractions are denoted $\lambda_{L_i}$, $\lambda_{K_i}$, (132-133). Free factor mobility between the two industries and efficient factor allocation impose the common MRS condition, cf. (6),

$$ \omega = \omega_1(k_1) = \omega_2(k_2) \tag{136} $$

For the variables k1 and k2 to satisfy (136), it is, beyond (2) and (6), further required that the intersection of the sectorial ranges for ω1(k1) and ω2(k2) is not empty,

$$ \omega_i(k_i) \in \Omega_i = [\underline{\omega}_i, \bar\omega_i] \subseteq R_+, \qquad \omega \in \Omega \equiv \Omega_1 \cap \Omega_2 = [\underline{\omega}, \bar\omega] \neq \emptyset \tag{137} $$

The two industries are assumed to operate under perfect competition (zero excess profit); absolute (money) input (factor) prices (w, r) are the same in both industries; and absolute (money) output (product, commodity) prices (P1, P2) represent unit cost. Hence, we have the competitive producer equilibrium equations,

$$ w = P_i \cdot MP_{L_i}, \qquad r = P_i \cdot MP_{K_i}; \qquad \omega = w/r, \qquad P_i \neq 0 \tag{138} $$
$$ Y_i = L y_i \lambda_{L_i}, \qquad P_i Y_i = wL_i + rK_i, \qquad \epsilon_{L_i} = wL_i/P_iY_i, \qquad \epsilon_{K_i} + \epsilon_{L_i} = 1 \tag{139} $$
$$ p \equiv \frac{P_1}{P_2} = \frac{MP_{L_2}}{MP_{L_1}} = \frac{f_2(k_2) - k_2 f_2'(k_2)}{f_1(k_1) - k_1 f_1'(k_1)} = \frac{MP_{K_2}}{MP_{K_1}} = \frac{f_2'(k_2)}{f_1'(k_1)} \tag{140} $$

Gross domestic product, Y, is the monetary value of sector outputs,

$$ Y \equiv P_1Y_1 + P_2Y_2 = L(P_1 y_1 \lambda_{L_1} + P_2 y_2 \lambda_{L_2}) \equiv Ly \tag{141} $$

and is, with (138-140), equal to the total factor incomes:

$$ Y = wL + rK = L(w + rk) = L(\omega + k)\,P_i f_i'(k_i) = Ly \tag{142} $$

Hence, the factor income distribution shares, $\delta_K + \delta_L = 1$, become,

$$ \delta_K \equiv \frac{rK}{Y} \equiv \frac{rk}{y}, \qquad \delta_L \equiv \frac{wL}{Y}; \qquad \delta_K \equiv \frac{k}{\omega + k}, \qquad \frac{\delta_K}{\delta_L} \equiv \frac{k}{\omega} \tag{143} $$

The macro equivalence of total revenues and total expenditures gives the decomposition of GDP (141) into expenditure shares, $s_i$, as

$$ s_i = P_iY_i/Y, \qquad \sum_{i=1}^{2} s_i \equiv \sum_{i=1}^{2} P_iY_i/Y = 1 \tag{144} $$

Lemma 1. The macro factor income shares, $\delta_L$, $\delta_K$, (143), are expenditure-weighted combinations of the sectorial factor (cost) shares,

$$ \delta_L = \sum_{i=1}^{2} s_i \epsilon_{L_i}, \qquad \delta_K = \sum_{i=1}^{2} s_i \epsilon_{K_i}, \qquad \delta_K + \delta_L = 1 \tag{145} $$

The factor allocation fractions, (132-133), are obtained by,

$$ L_i/L = \lambda_{L_i} = s_i \epsilon_{L_i}/\delta_L, \qquad K_i/K = \lambda_{K_i} = s_i \epsilon_{K_i}/\delta_K \tag{146} $$

The total factor endowment ratio, (134), satisfies the identity, cf. (143):

$$ K/L \equiv k \equiv \frac{\omega\,\delta_K}{\delta_L} \equiv \omega \sum_{i=1}^{2} s_i \epsilon_{K_i} \Big/ \sum_{i=1}^{2} s_i \epsilon_{L_i} \tag{147} $$

which is a representation of Walras's law.
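The accounting identities of Lemma 1 can be checked numerically. The sketch below uses hypothetical factor prices and sector allocations (none of the numbers come from the text), builds sector revenues from the zero-profit condition in (139), and verifies (145)–(147); the names `eps_L`, `eps_K` stand for the sectorial cost shares.

```python
def two_sector_accounts(w, r, L1, L2, K1, K2):
    """Shares implied by zero-profit revenues P_i Y_i = w L_i + r K_i, cf. (139)."""
    L, K = L1 + L2, K1 + K2
    revenue = [w * L1 + r * K1, w * L2 + r * K2]           # P_i Y_i
    Y = sum(revenue)                                        # GDP, (141)-(142)
    eps_L = [w * L1 / revenue[0], w * L2 / revenue[1]]      # sectorial labor shares
    eps_K = [1.0 - e for e in eps_L]                        # sectorial capital shares
    s = [rev / Y for rev in revenue]                        # expenditure shares, (144)
    return {
        "L": L, "K": K, "Y": Y, "s": s, "eps_L": eps_L, "eps_K": eps_K,
        "delta_L": w * L / Y, "delta_K": r * K / Y,         # macro shares, (143)
    }

# Hypothetical numbers, chosen only for illustration
w, r = 2.0, 0.5
L1, L2, K1, K2 = 30.0, 70.0, 200.0, 150.0
acc = two_sector_accounts(w, r, L1, L2, K1, K2)

delta_L_lemma = sum(si * e for si, e in zip(acc["s"], acc["eps_L"]))   # (145)
delta_K_lemma = sum(si * e for si, e in zip(acc["s"], acc["eps_K"]))   # (145)
lambda_L1 = acc["s"][0] * acc["eps_L"][0] / acc["delta_L"]             # (146)
lambda_K1 = acc["s"][0] * acc["eps_K"][0] / acc["delta_K"]             # (146)
k_from_shares = (w / r) * delta_K_lemma / delta_L_lemma                # (147)

print(delta_L_lemma, acc["delta_L"])
print(lambda_L1, L1 / acc["L"], lambda_K1, K1 / acc["K"])
print(k_from_shares, acc["K"] / acc["L"])
```

Each printed pair should coincide up to rounding error, since the identities hold for any positive prices and allocations.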

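Proof: By definition, we have,

$$ \delta_L = wL/Y = [wL_1 + wL_2]/Y, \qquad \delta_K = rK/Y = [rK_1 + rK_2]/Y \tag{148} $$

From (139) and (144), we get

$$ wL_i = \epsilon_{L_i} P_i Y_i = s_i \epsilon_{L_i} Y, \qquad rK_i = \epsilon_{K_i} P_i Y_i = s_i \epsilon_{K_i} Y \tag{149} $$

Hence, by (148) and (149) we obtain (145). Next, we obtain

$$ \lambda_{L_i} = \frac{L_i}{L} = \frac{wL_i}{wL} = \frac{s_i \epsilon_{L_i} Y}{\delta_L Y} = \frac{s_i \epsilon_{L_i}}{\delta_L} \tag{150} $$
$$ \lambda_{K_i} = \frac{K_i}{K} = \frac{rK_i}{rK} = \frac{s_i \epsilon_{K_i} Y}{\delta_K Y} = \frac{s_i \epsilon_{K_i}}{\delta_K} \tag{151} $$

as stated in (146). □

5.7.2 Commodity prices and factor prices with CES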

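The connection between relative factor (service) prices and relative commodity prices follows from (6, 136, 138, 140),

$$ p(\omega) = \frac{P_1}{P_2}(\omega) = \frac{MP_{K_2}[k_2(\omega)]}{MP_{K_1}[k_1(\omega)]} = \frac{f_2'[k_2(\omega)]}{f_1'[k_1(\omega)]}, \qquad \omega = w/r \tag{152} $$

The exact form of the function (152) needs particular attention. With (10) and (14), the relative commodity prices (comparative costs) (152) become, with $\sigma_i = 1$, $\sigma_i \neq 1$, and $\sigma_1 = \sigma_2 = \sigma$, respectively,

$$ p(\omega) = \frac{f_2'[k_2(\omega)]}{f_1'[k_1(\omega)]} = \frac{\gamma_2 a_2 k_2(\omega)^{a_2-1}}{\gamma_1 a_1 k_1(\omega)^{a_1-1}} = \frac{\gamma_2 a_2^{a_2} (1-a_2)^{1-a_2}}{\gamma_1 a_1^{a_1} (1-a_1)^{1-a_1}}\, \omega^{(a_2-a_1)} \tag{153} $$

$$ p(\omega) = \frac{f_2'[k_2(\omega)]}{f_1'[k_1(\omega)]} = \frac{\gamma_2 a_2 \big[a_2 + (1-a_2) k_2(\omega)^{-(\sigma_2-1)/\sigma_2}\big]^{1/(\sigma_2-1)}}{\gamma_1 a_1 \big[a_1 + (1-a_1) k_1(\omega)^{-(\sigma_1-1)/\sigma_1}\big]^{1/(\sigma_1-1)}} = \frac{\gamma_2 a_2^{\sigma_2/(\sigma_2-1)} (1 + c_2 \omega^{1-\sigma_2})^{1/(\sigma_2-1)}}{\gamma_1 a_1^{\sigma_1/(\sigma_1-1)} (1 + c_1 \omega^{1-\sigma_1})^{1/(\sigma_1-1)}} \tag{154} $$

$$ p(\omega) = \frac{\gamma_2}{\gamma_1} \Big(\frac{a_2}{a_1}\Big)^{\sigma/(\sigma-1)} \Big[\frac{1 + c_2 \omega^{1-\sigma}}{1 + c_1 \omega^{1-\sigma}}\Big]^{1/(\sigma-1)}, \qquad c_i = \Big(\frac{1-a_i}{a_i}\Big)^{\sigma_i} \tag{155} $$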
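As a check on the algebra in (153)–(154), the following sketch evaluates p(ω) in two ways: directly as the ratio of marginal products in (152), and through the closed forms. It assumes the sector relation $k_i(\omega) = [a_i\omega/(1-a_i)]^{\sigma_i}$, which is the inverse of (121) and is referenced in the text as (14). Function names and the test value ω = 5 are illustrative.

```python
def k_of_omega(omega, a, sigma):
    """Sector capital-labor ratio at wage-rental ratio omega (assumed form of (14))."""
    return (a * omega / (1.0 - a)) ** sigma

def marginal_product_k(k, gamma, a, sigma):
    """f'(k) for CD (sigma = 1) or CES (sigma != 1), cf. (123)."""
    if sigma == 1.0:
        return gamma * a * k ** (a - 1.0)
    inner = a + (1.0 - a) * k ** (-(sigma - 1.0) / sigma)
    return gamma * a * inner ** (1.0 / (sigma - 1.0))

def price_ratio_direct(omega, sector1, sector2):
    """p(omega) = f2'(k2(omega)) / f1'(k1(omega)), cf. (152)."""
    values = []
    for gamma, a, sigma in (sector1, sector2):
        values.append(marginal_product_k(k_of_omega(omega, a, sigma), gamma, a, sigma))
    return values[1] / values[0]

def closed_form_term(omega, gamma, a, sigma):
    """Sector term of the closed forms: (153) for CD, (154) for CES."""
    if sigma == 1.0:
        return gamma * a**a * (1.0 - a) ** (1.0 - a) * omega ** (a - 1.0)
    c = ((1.0 - a) / a) ** sigma
    return gamma * a ** (sigma / (sigma - 1.0)) * (1.0 + c * omega ** (1.0 - sigma)) ** (1.0 / (sigma - 1.0))

def price_ratio_closed(omega, sector1, sector2):
    return closed_form_term(omega, *sector2) / closed_form_term(omega, *sector1)

# Sectors as (gamma_i, a_i, sigma_i); gamma_i = 1 is an assumption
ces_pair = ((1.0, 0.4, 1.2), (1.0, 0.5, 1.5))
cd_pair = ((1.0, 0.2, 1.0), (1.0, 0.5, 1.0))
p_ces_direct = price_ratio_direct(5.0, *ces_pair)
p_ces_closed = price_ratio_closed(5.0, *ces_pair)
p_cd_direct = price_ratio_direct(5.0, *cd_pair)
p_cd_closed = price_ratio_closed(5.0, *cd_pair)
print(p_ces_direct, p_ces_closed, p_cd_direct, p_cd_closed)
```

The direct and closed-form values should agree for each pair, which confirms that the second equality in (154) follows from substituting $k_i(\omega)$.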

5.7.3 Walrasian general equilibrium and CES

As to the demand (expenditure share) decomposition between consumption and investment (saving), we shall employ the "neoclassical" saving assumption, which has been standard in much of the growth literature. It is immaterial for our purposes whether investment is controlled by owners or managers. Hence, we use the aggregate monetary saving function: S = sY,

0 (n + δ)/s σ1 σ1 −1

γ1 a1

> (n + δ)

(173) (174)

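σ1 ≤ 1: (Sufficient condition) Persistent growth of k(t) is impossible.

σ1 > 1: Necessary and sufficient conditions for $\lim_{t\to\infty} k(t) = \infty$ are:

$$ \sigma_1 > 1,\ \sigma_2 < 1: \qquad \gamma_1 a_1^{\sigma_1/(\sigma_1-1)} > (n+\delta) \tag{175} $$
$$ \sigma_1 > 1,\ \sigma_2 > 1: \qquad \gamma_1 a_1^{\sigma_1/(\sigma_1-1)} > (n+\delta)/s \tag{176} $$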

except that (175) is occasionally not sufficient for small initial values.

Proof: See Jensen (2003). The proposition follows essentially from a straightforward examination of the sign of h(k), (169), by a comparative evaluation of $f_1'(k_1[\omega])$, (172), and $\delta_K(\omega)$, (162). The difference (s) in the RHS constant of the inequalities comes from $\delta_K(\omega)$, (162), taking values 1 or s - for ω → 0 or for ω → ∞ depending on the size of σ2. The boundary behavior of the marginal product of capital, cf. (11 - 13), in the capital good sector, $f_1'(k_1[\omega])$, (172), is crucial for a steady state or persistent growth. □

Proposition 1 shows explicitly that the global existence issues of any steady state or persistent growth depend on the size of the key parameters: σi, a1, γ1, s, n, δ. While the accumulation parameters (s, n, δ) play some role, the fundamental role of the technology parameters in the capital good sector (σ1, γ1, a1) for deciding the types of the long-run evolution in the CGE growth models complies with observation and economic intuition, and confirms the strategic importance ascribed to capital good industries by economic historians and the general public, cf. Mahalanobis (1955), Rosenberg (1963).

The most important parameter in Proposition 1 is the substitution elasticity in the capital good sector, σ1. The total productivity parameter γ1 in the capital-good sector matters in all the stated conditions (173-175). If we restrict γ1 = 1 and if σ1 ≥ 2, then (176) is usually satisfied for other relevant parameters, e.g. high saving rates. The critical role in Proposition 1 is played by σ1 rather than σ2.

The conclusions in Proposition 1 contrast sharply with earlier standard literature on two-sector growth models. Stiglitz & Uzawa (1969, p.407) reported (with neoclassical saving) that a sufficient condition for uniqueness and stability of (convergence to) balanced (steady state) growth paths is: “substitution elasticity in each sector greater than or equal to one.” This condition is neither necessary nor sufficient for the long-run steady state family. Indeed, a high value of σ1 would preclude the existence of steady-state growth, cf. (175 - 176).

5.8.2 Neoclassical SDE of the capital-labor ratio

The deterministic model (164-165) becomes, with uncertainties (stochastic elements $\epsilon_i$) in the growth rate of labor, (n), the gross saving rate, (s), and the capital depreciation rate (δ):

$$ \dot L = L(n + \beta_1 \epsilon_1) \tag{177} $$
$$ \dot K = L(s + \phi_3(k)\,\epsilon_3)\,y/P_1 - (\delta + \beta_2 \epsilon_2)K \tag{178} $$

For the labor and capital stock, the associated stochastic differential equations (SDE) to (177–178) are given by

$$ dL = Ln\,dt + L\beta_1\,dw_1 \tag{179} $$
$$ dK = (Ls\,y/P_1 - \delta K)\,dt - \beta_2 K\,dw_2 + L\,(y/P_1)\,\phi_3(k)\,dw_3 \tag{180} $$

Theorem 6. The stochastic neoclassical dynamics for the capital-labor ratio of the two-sector model (179-180) is given by the SDE

$$ dk = a(k)\,dt + b(k)\,dw, \qquad k \in (0, \infty) \tag{181} $$

where the drift coefficient a(k) and diffusion coefficient b(k) are,

$$ \begin{aligned} a(k) &= \bar s(k)(y/P_1) - \Theta k, \qquad \bar s(k) = s - \rho_{13}\beta_1\phi_3(k) \\ \Theta &= n + \delta - (\beta_1^2 + \rho_{12}\beta_1\beta_2) \\ b^2(k) &= \beta^2 k^2 + \phi_3(k)^2 (y/P_1)^2 - \rho\,\phi_3(k)(y/P_1)k \\ \beta^2 &= \beta_1^2 + \beta_2^2 + 2\rho_{12}\beta_1\beta_2, \qquad \rho = 2(\rho_{13}\beta_1 + \rho_{23}\beta_2) \end{aligned} \tag{182} $$

where $y/P_1$ is given by, cf. (142), (168),

$$ \frac{y}{P_1} = (\omega + k)\, f_1'(k_1(\omega)) = (\Psi^{-1}(k) + k)\, f_1'(k_1[\Psi^{-1}(k)]) \equiv \Upsilon(k) \tag{183} $$

Proof: The proof of Theorem 6 is a replication of the proof of Theorem 1 with f(k) in (31), (33), replaced by: Υ(k), (183). □

For the one-sector growth models, we obtained explicit conditions for the lower and upper boundaries - zero and ∞ - to be inaccessible boundaries. These explicit conditions - stated in Theorems 2-4 - were established after lengthy calculation, as their proofs showed. With the examples considered here, the βi terms only have a second-order effect, and the intuition from Proposition 1 will help to understand the stochastic system. We have not calculated the exact boundary conditions, which is probably not feasible due to the complex structure of $y/P_1$, cf. (183).

5.9 Sample paths of two-sector models and CES

We include a few simulations of stochastic two-sector growth models with some variation of the parameter to show the critical parameter values in long-run stationary processes and, alternatively, with inﬁnity as the attractor, in stochastic endogenous growth. Numerical parameter cases illustrating steady-state values (κ, ω ¯ ), (170–172), with CD/CES technologies are collected in Table 3. The sample path of the stochastic process, k(t), in two-sector growth models, (181), is formally given by : t t a(k[u; ω]) du + b(k[u; ω]) dwu [ω] (184) k(t) = k[t; ω] = k(0) + 0

0

where k(0) is a fixed initial condition and [ω] symbolizes a particular realization of the Wiener process. This sample path (184) is approximated by the Euler scheme, Kloeden & Platen (1995). After the simulated sample path, k(t), has been determined, the sample paths for ω(t), k1(t), and y/P1(t) for the two-sector model (with the same realization of the Wiener process) are determined by using the respective equations, (169), (14), (163), (183), and (172):

$$\omega(t) = \Psi^{-1}[k(t)]; \qquad k_i(t) = \left(\frac{a_i\,\Psi^{-1}[k(t)]}{1-a_i}\right)^{\sigma_i} \tag{185}$$

$$y/P_1(t) = \Upsilon[k(t)] = \left[\Psi^{-1}[k(t)] + k(t)\right] f_1'\!\left(k_1\!\left[\Psi^{-1}[k(t)]\right]\right) \tag{186}$$

Some sample paths, (184)-(186), are exhibited in Fig. 5.12-5.14. From the CGE relationship, ω = Ψ^{-1}(k), (169), (163), the stationary density function for ω, called ϕ(ω), is the stationary density function, p(k), transformed by the function Ψ^{-1}. Hence, using the transformation theorem of densities, ϕ(ω) in Fig. 5.12-5.14 is

$$\varphi(\omega) = p(\Psi(\omega))\,\Psi'(\omega) \tag{187}$$
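The Euler scheme mentioned above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `euler_maruyama`, the drift and diffusion functions, and the linear stand-in for Ψ^{-1} are all hypothetical, since the coefficients of (181) are not restated in this section.

```python
import math
import random
from typing import Callable, List


def euler_maruyama(a: Callable[[float], float],
                   b: Callable[[float], float],
                   k0: float, horizon: float, n_steps: int,
                   rng: random.Random) -> List[float]:
    """Approximate k(t) = k(0) + int a(k) du + int b(k) dw on [0, horizon]."""
    h = horizon / n_steps
    path = [k0]
    for _ in range(n_steps):
        k = path[-1]
        dw = rng.gauss(0.0, math.sqrt(h))   # Wiener increment ~ N(0, h)
        path.append(k + a(k) * h + b(k) * dw)
    return path


# Hypothetical coefficients, for illustration only (not the model's (181)).
drift = lambda k: 0.25 * k ** 0.4 - 0.07 * k
diffusion = lambda k: 0.03 * k

k_path = euler_maruyama(drift, diffusion, k0=5.0, horizon=100.0,
                        n_steps=10_000, rng=random.Random(1))
# A derived series reuses the same realization by mapping k_path pointwise.
omega_path = [1.5 * k for k in k_path]      # stand-in for Psi^{-1}(k)
```

Because the derived series is a pointwise map of the simulated k-path, it automatically shares the realization of the Wiener process, which is the property used for ω(t), k1(t) and y/P1(t) in the text.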

Table 3. Numerical cases for two sector growth models: CD and CES

Parameters

type  case  s     n     δ     a1   a2   σ1   σ2   β1    β2    β3   λ3
Parameters - stationary models
CES   1     0.25  0.02  0.05  0.4  0.5  0.5  0.7  0.01  0.03  0.0  0
CES   2     0.25  0.02  0.05  0.6  0.5  0.5  0.7  0.01  0.03  0.1  1
CES   3     0.25  0.02  0.05  0.4  0.5  1.2  1.5  0.01  0.03  0.0  0
CES   4     0.25  0.02  0.05  0.4  0.5  1.2  1.5  0.01  0.03  0.1  1
CES   5     0.25  0.02  0.05  0.4  0.5  1.5  1.2  0.01  0.03  0.0  0
CES   6     0.25  0.02  0.05  0.4  0.5  1.5  1.2  0.01  0.03  0.1  1
CD    1     0.20  0.02  0.05  0.2  0.5  1.0  1.0  0.01  0.03  0.0  0
CD    2     0.20  0.02  0.05  0.4  0.5  1.0  1.0  0.01  0.03  0.0  0
Parameters - endogenous growth models
CES   1     0.25  0.02  0.05  0.6  0.5  2.0  1.5  0.01  0.03  0.0  0
CES   2     0.25  0.02  0.05  0.6  0.5  2.0  1.5  0.01  0.03  0.1  1
CES   3     0.30  0.02  0.05  0.6  0.5  2.0  1.5  0.01  0.03  0.1  1

Model characteristics (stationary models; f1, f1' and f1/k1 are evaluated at k1(ω̄))

type  case  κ       E(k)    mode(k)  σ(k)   ω̄       f1/k1  f1'    f1     ε_K1(ω̄)  δK(ω̄)
CES   1     5.583   5.586   5.502    0.507  14.684  0.439  0.077  1.374  0.176    0.275
CES   2     7.586   7.564   7.435    0.950  20.606  0.354  0.075  1.969  0.212    0.269
CES   3     11.196  11.192  10.902   1.369  6.029   0.389  0.182  2.063  0.468    0.650
CES   4     11.196  11.146  10.769   1.797  6.029   0.389  0.182  2.063  0.468    0.650
CES   5     13.198  13.150  12.725   1.797  8.499   0.277  0.170  3.739  0.613    0.607
CES   6     13.198  13.113  12.501   2.398  8.499   0.277  0.170  3.739  0.613    0.607
CD    1     4.357   4.361   4.288    0.414  5.546   0.770  0.154  1.068  0.200    0.440
CD    2     5.878   5.878   5.754    0.644  6.368   0.420  0.168  1.783  0.400    0.480

Limits for k → ∞ (endogenous growth models)

type  case  f1/k1  f1'    f1
CES   1     0.360  0.360  ∞
CES   2     0.360  0.360  ∞
CES   3     0.360  0.360  ∞

Remark. Whereas the stationary probability density (188) and the first part of Proposition 1, (173-174), refer to figures like Fig. 5.12-5.14, the second part of Proposition 1, (175-176), refers to and provides the parameter values that generate persistent (endogenous) economic growth in figures like Fig. 5.15. Proposition 1 serves as a deterministic substitute for the two-sector analogues of the stochastic Theorems 2-4 and the stochastic condition, (102), of one-sector growth models.
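The sector-1 entries of Table 3 can be checked from ω̄ and the technology parameters alone. The sketch below assumes a CES form with unit efficiency parameter, f(k) = [(1 − a) + a k^ρ]^(1/ρ) with ρ = (σ − 1)/σ, and the CES relation between k_i and ω in (185); the function names are hypothetical and the normalization is an assumption, not something stated in this section.

```python
def ces(k: float, a: float, sigma: float) -> float:
    """CES intensive production function with unit efficiency (assumed)."""
    if sigma == 1.0:                      # Cobb-Douglas limit
        return k ** a
    rho = (sigma - 1.0) / sigma
    return ((1.0 - a) + a * k ** rho) ** (1.0 / rho)


def sector_capital(omega: float, a: float, sigma: float) -> float:
    """Sector capital-labour ratio implied by the wage-rental ratio omega."""
    return (a * omega / (1.0 - a)) ** sigma


def sector_row(omega: float, a: float, sigma: float):
    """Return (f/k, f', f, k f'/f) for one sector at the given omega."""
    k = sector_capital(omega, a, sigma)
    f = ces(k, a, sigma)
    avg = f / k
    marg = a * avg ** (1.0 / sigma)       # f'(k) for the CES form above
    return avg, marg, f, marg / avg


ces_case_1 = sector_row(omega=14.684, a=0.4, sigma=0.5)
cd_case_1 = sector_row(omega=5.546, a=0.2, sigma=1.0)
```

Under these assumptions the two calls agree with the tabulated sector-1 values for CES case 1 and CD case 1 to three decimals, and the limit of f1/k1 for a1 = 0.6, σ1 = 2.0 is a1^(σ1/(σ1 − 1)) = 0.36, matching the endogenous growth rows.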

Figure 5.12: CES II: case 1 (sample paths of kt, ωt, k1t and y/P1t)

Figure 5.13: CES II: case 3 (sample paths of kt, ωt, k1t and y/P1t)

Figure 5.14: CES II: case 6 (sample paths of kt, ωt, k1t and y/P1t)

Stochastic One-Sector and Two-Sector Models

In (187), p is the basic stationary density function for k, as defined by the drift and diffusion coefficients: (44–45), (115–117). The leading role of the capital-good sector as an engine of persistent growth is exhibited by the sample paths in Fig. 5.15. For proper scaling of the vertical axis, the sample path of k1(t) is not included in the left panel, whereas it appears in the right panel. The comments given above on Fig. 5.11 apply similarly to Fig. 5.15.

Stochastic (probability) issues appear, according to Pierre-Simon Laplace, because we are partly knowing and partly ignorant. The dynamics of stochastic and deterministic growth models share a common element: the drift coefficient, which constitutes the substance of our knowledge, based on economic theory and empirical (verified) data. The course of future events is admittedly uncertain, but the diffusion coefficient, similarly based on solid theoretical and empirical premises, governs the volatility of outcomes. The continuous-time probability tools of Norbert Wiener and Kiyoshi Itô allow probability calculations to be made about observable future time paths. Stochastic dynamics, properly employed, far from making economic growth models more abstract, serves as a powerful liberating framework that enables the analysis and integration of ever more realistic and complicated hypotheses. Stochastic and deterministic growth models together contribute to a genuine understanding of historical time series in economics and other disciplines.

Figure 5.15: CES II: case 1 - endogenous growth (left panel: sample paths of kt, k2t, y/P1t and ωt; right panel: the same series together with k1t)

References:

Aghion, P. and Howitt, P. (1998) Endogenous Growth Theory. Cambridge, MA and London, England: The MIT Press.
Arrow, K.J., Chenery, H.B., Minhas, B.S., and Solow, R.M. (1961) “Capital-Labour Substitution and Economic Efficiency.” Review of Economics and Statistics 43: 225–250.
Barro, R.J. (1991) “Economic Growth in a Cross-Section of Countries.” Quarterly Journal of Economics 106: 407–443.
Barro, R.J. and Sala-i-Martin, X. (1992) “Convergence.” Journal of Political Economy 100: 223–251.
Bourguignon, F. (1974) “A Particular Class of Continuous-Time Stochastic Growth Models.” Journal of Economic Theory 9: 141–158.
Burmeister, E., and Dobell, A.R. (1970) Mathematical Theories of Economic Growth. London: MacMillan.
Chang, F.R., and Malliaris, A.G. (1987) “Asymptotic Growth under Uncertainty – Existence and Uniqueness.” Review of Economic Studies 54: 169–174.
De Long, J.B., and Summers, L.H. (1991) “Equipment Investment and Economic Growth.” Quarterly Journal of Economics 106: 445–502.
Dixit, A.K., and Pindyck, R.S. (1994) Investment under Uncertainty. New Jersey: Princeton University Press.
Gandolfo, G. (1980) Economic Dynamics: Methods and Models. 2. ed. Amsterdam: North-Holland.
Gandolfo, G. (1997) Economic Dynamics. Berlin/New York: Springer Verlag.
Itô, K., and McKean, Henry P. Jr. (1965) Diffusion Processes and Their Sample Paths. Berlin: Springer-Verlag.
Jensen, B.S. and Larsen, M.E. (1987) “Growth and Long-Run Stability.” Acta Applicandae Mathematicae 9: 219–237. Also in: Jensen (1994).
Jensen, B.S. (1994) The Dynamic Systems of Basic Economic Growth Models. Dordrecht: Kluwer Academic Publishers.
Jensen, B.S. and Wang, C. “General Equilibrium Dynamics of Basic Trade Models for Growing Economies.” In: Bjarne S. Jensen and Kar-yiu Wong (eds.) (1997) Dynamics, Economic Growth, and International Trade. Ann Arbor: University of Michigan Press.
Jensen, B.S. and Wang, C. (1999) “Basic Stochastic Dynamic Systems of Growth and Trade.” Review of International Economics 7: 378–402.
Jensen, B.S., Richter, M., Wang, C., and Alsholm, P.K. (2001) “Saving Rates, Trade, Technology, and Stochastic Dynamics.” Review of Development Economics 5: 182–204.
Jensen, B.S. (2003) “Walrasian General Equilibrium Allocations and Dynamics in Two-Sector Growth Models.” German Economic Review 4: 53–87.
Jensen, B.S., and Larsen, M.E. (2005) “General Equilibrium Dynamics of Multi-Sector Growth Models.” Journal of Economics Supplement 10: 17–56.
Jensen, B.S., Wang, C., and Johnsen, J. (2007) “Moment Evolution of Gaussian and Geometric Wiener Diffusions - derived by Itô’s Lemma and Kolmogorov’s Forward Equation.” This volume.
Jones, R.W. (1965) “The Structure of Simple General Equilibrium Models.” The Journal of Political Economy 73: 557–72.
Karlin, S., and Taylor, H.M. (1981) A Second Course in Stochastic Processes. N.Y.: Academic Press.
Kemp, M.C. (1969) The Pure Theory of International Trade and Investment. New Jersey: Prentice-Hall.
Kloeden, P.E., and Platen, E. (1995) Numerical Solutions to Stochastic Differential Equations. 2. ed. Berlin: Springer Verlag.
Klump, R. (1995) “On the Institutional Determinants of Economic Development – Lessons from a Stochastic Neoclassical Growth Model.” Jahrbuch für Wirtschaftswissenschaft 46: 138–51.
de La Grandville, O. (2001) Bond Pricing and Portfolio Analysis. Cambridge, Mass.: MIT Press.
Lau, S.-H.P. (2002) “Further Inspection of the Stochastic Growth Model by an Analytical Approach.” Macroeconomic Dynamics 6: 748–757.
Mahalanobis, P.C. (1955) “The approach of operational research to planning in India.” Sankhya: Indian Journal of Statistics 16: 3–62.
Mandl, P. (1968) Analytical Treatment of One-Dimensional Markov Processes. (Die Grundlehren der mathematischen Wissenschaften in Einzeldarstellungen, Band 151) Berlin: Springer-Verlag.
Malliaris, A.G., and Brock, W.A. (1982) Stochastic Methods in Economics and Finance. Amsterdam: North-Holland/Elsevier.
Mas-Colell, A., Whinston, M.D., and Green, J.R. (1995) Microeconomic Theory. New York: Oxford University Press.
Merton, R.C. (1975) “An Asymptotic Theory of Growth under Uncertainty.” Review of Economic Studies 42: 375–393.
Merton, R.C. (1990) Continuous-Time Finance. Cambridge MA: Blackwell.
Minhas, B.S. (1962) “The Homohypallagic Production Function, Factor Intensity Reversals, and the Heckscher-Ohlin Theorem.” The Journal of Political Economy 70: 138–156.
Quah, D.T. (1996) “Empirics for Economic Growth and Convergence.” European Economic Review 40: 1353–1375.
Rosenberg, N. (1963) “Capital Goods, Technology, and Economic Growth.” Oxford Economic Papers 15: 217–227.
Sala-i-Martin, X.X. (1996) “Regional Cohesion: Evidence and Theories of Regional Growth and Convergence.” European Economic Review 40: 1325–1352.
Sandmo, A. (1970) “The Effect of Uncertainty on Saving Decisions.” Review of Economic Studies 37: 353–60.
Schenk-Hoppé, K.R. (2002) “Is there a Golden Rule for the Stochastic Solow Growth Model?” Macroeconomic Dynamics 6: 457–475.
Solow, R.M. (1961-62) “Note on Uzawa’s Two-Sector Model of Economic Growth.” Review of Economic Studies 29: 48–50. Also in: Stiglitz and Uzawa (1969).
Stigum, B.P. (1972) “Balanced Growth under Uncertainty.” Journal of Economic Theory 5: 42–68.
Stiglitz, J.E. and Uzawa, H. (eds.) (1969) Readings in the Modern Theory of Economic Growth. Cambridge (Mass.): M.I.T. Press.
Uzawa, H. (1961-62) “On a Two-Sector Model of Economic Growth: I.” Review of Economic Studies 29: 40–47.
Uzawa, H. (1963) “On a Two-Sector Model of Economic Growth: II.” Review of Economic Studies 30: 105–18. Also in: Stiglitz and Uzawa (1969).
Wan, H.Y., Jr. (1971) Economic Growth. N.Y.: Harcourt Brace Jovanovich.
Øksendal, B. (2005) Stochastic Differential Equations – An Introduction with Applications. 6. ed. Series: Universitext, Springer-Verlag.

Chapter 6 Comparative Dynamics in a Stochastic Growth and Trade Model with a Variable Savings Rate

Zhu Hongliang School of Management and Engineering, Nanjing University, Nanjing, P.R.China Huang Wenzao Department of Financial Mathematics, Peking University, Beijing, P.R.China

6.1 Introduction

The central purpose of theories of economic growth is to understand the factors behind the long-run growth of economies, and to explain differences in their growth performances. A wide class of growth models has been explored in the past decades, see Barro and Sala-i-Martin (1995), Burmeister and Dobell (1970). Most of these models, however, are deterministic and hence ignore fundamental uncertainties, which affect productivity and cause diversity between nations. Such uncertainties are intrinsic features of dynamic economic systems, and stochastic elements will appear in any economic growth process generated by factor endowment accumulation and technological change, since many sources of uncertainty exist in population growth, production processes, consumer behavior, government expenditure, and policy decisions.

In this chapter, we consider the neoclassical two-sector growth model of a small country that trades in both commodities under uncertainty. This model originates in the open neoclassical two-sector growth model of Deardorff (1974), Deardorff and Hanson (1978), and Jensen and Wang (1997) in Jensen and Wong (1997), and was extended to a stochastic environment in continuous time by Jensen and Wang (1999). In addition, we use a savings function s(k, θ) instead of the constant rate of savings in Jensen and Wang (1999), i.e., the rate of saving depends on the capital-labor ratio and on any policy parameter. On the dependence of the savings rate on the capital-labor ratio, see Solow (1956). Policy instruments such as the (initial) stock of money, the rate of capital income taxation, or other government regulations enter into our analysis as parameters. A perturbation in these parameters will influence the dynamic behavior of the economy. Similar savings functions have been studied by Atkinson and Stiglitz (1980), Boadway (1979), Chang and Malliaris (1987) and Merton (1975).

We study the global comparative dynamic properties of the capital accumulation process with respect to any policy parameter. By characterizing the entire time path of the capital accumulation process, the effect of a policy parameter on the behavior of the entire capital accumulation path can be determined. We show that the time path of the capital-labor ratio is monotone with respect to any policy parameter, if the savings function changes monotonically with respect to that policy parameter. In addition, we analyze the impact of the policy parameter on the steady-state distribution of the capital-labor ratio in the stochastic growth and trade model.

6.2 Stochastic dynamic systems for trading economies

Two-sector growth models were first studied systematically by Shinkai (1960) and Uzawa (1961). Owing to the fundamental differences between capital goods and consumer goods, production is divided into two sectors, a capital-good sector and a consumer-good sector, which are described by different production functions. Here we briefly present the structure of the neoclassical two-sector growth and trade model. The sector technologies are described by neoclassical production functions exhibiting constant returns to scale,

$$Y_i = F_i(L_i, K_i) = L_i F_i(1, K_i/L_i) = L_i f_i(k_i) = L_i y_i, \quad i = 1, 2 \tag{1}$$

The factor endowments (L, K) belong to the diversification cone Ck:

$$C_k = \{(L, K) \in \mathbb{R}_+^2 \mid k_1 < K/L < k_2 \ \text{or}\ k_2 < K/L < k_1\}. \tag{2}$$

The two-sector economy is assumed to operate under perfect competition; money factor prices (w, r) are the same in both sectors, and output prices (P1, P2) represent unit cost. Hence, we have the competitive general equilibrium relations, Pi Yi = rKi + wLi, i = 1, 2. Gross domestic product Y is the monetary value of outputs from both sectors and represents aggregated gross factor incomes (li = Li/L):

$$Y = P_1 Y_1 + P_2 Y_2 = L(P_1 y_1 l_1 + P_2 y_2 l_2) = rK + wL = L(rk + w) = Ly. \tag{3}$$

The small, open, competitive two-sector economy trades at international prices determined in the world market, i.e., at the exogenous terms of trade p = P1/P2. Let Qi, i = 1, 2, denote the domestic demand for investment (good 1) and consumption (good 2), respectively; then Qi = Yi − Xi, where Xi, i = 1, 2, are the net exports of the two goods. The trade balance is assumed to satisfy the constraint

$$P_1 X_1 + P_2 X_2 = 0. \tag{4}$$

Then

$$Y = P_1 Y_1 + P_2 Y_2 = P_1 Q_1 + P_1 X_1 + P_2 Q_2 + P_2 X_2 = P_1 Q_1 + P_2 Q_2, \tag{5}$$

i.e., trade equilibrium prevails with no foreign borrowing or lending allowed.

Lemma 1 (Deardorff (1974), Jensen and Wang (1997)). With given prices (P1, P2) and the monotonicity and concavity conditions, the GNP function y(k) is a concave C^1-class function on (0, ∞), and y(k) has a linear segment in the diversification cone Ck.

Next we set up the neoclassical deterministic dynamic system for the small two-sector trading economy. The proportion of gross income Y, cf. (3), that is saved is given by the savings ratio s(k, θ). The savings ratio, as a function, is allowed to depend on k and, more importantly, on any policy parameter θ. This general assumption is made because any policy parameter will influence the savings-consumption decision, and thereby the dynamic behavior of the trading economy, through the savings ratio.

With the depreciation of capital, (δP1K), and per capita income y = Y/L, cf. (3), the factor accumulation equations become:

$$\dot L = \frac{dL}{dt} = nL, \qquad \dot K = \frac{dK}{dt} = L\,s(k,\theta)\frac{y}{P_1} - \delta K = L\left\{s(k,\theta)\left[y_1 l_1 + \frac{y_2}{p}\,l_2\right] - \delta k\right\} \equiv L\,g(k). \tag{6}$$

Thus, the dynamics of the capital-labor ratio k, for 0 < k < ∞, is obtained as

$$\dot k = g(k) - nk = s(k,\theta)\frac{y}{P_1} - (n+\delta)k = s(k,\theta)\left[y_1 l_1 + \frac{y_2}{p}\,l_2\right] - (n+\delta)k. \tag{7}$$

By Lemma 1, the complete dynamic system of the small trading economy is (7): either a linear dynamic system, operating within the diversification cone Ck with "fixed coefficient" sector technologies, k1*, k2*, y1* = f1(k1*), y2* = f2(k2*), or a nonlinear dynamic system, operating outside the diversification cone Ck with complete specialization. Following Merton (1975), our source of uncertainty is the size of the population. Introducing stochastic elements into the growth rate of labor in (6), we have

$$\dot L = L(n + \epsilon_1), \qquad \dot K = L\,s(k,\theta)\frac{y}{P_1} - \delta K, \tag{8}$$

where n represents the expected rate of growth of the population per unit of time, and the random variable ε1 is formally given as ε1 = β1 dWt/dt, with β1 > 0, where Wt is a standard Wiener process and β1² is the instantaneous variance. Hence the stochastic model (8) can be written as the stochastic differential equations, for 0 ≤ t < ∞:

$$dL = L(n\,dt + \beta_1\,dW), \qquad dK = \left[L\,s(k,\theta)\frac{y}{P_1} - \delta K\right]dt, \tag{9}$$

defined on the whole nonnegative orthant for L and K. Like (6)-(7), the system (9) consists of three subsystems, allowing for, respectively, specialization in good 1, nonspecialization (diversification), and specialization in good 2. Using the tools of Itô's stochastic calculus, we obtain:

Lemma 2. In the stochastic two-sector growth model with trade, (9), the stochastic dynamics of the capital-labor ratio, k(t), is a diffusion process given by, with k2* > k1*,

$$dk = \left[s(k,\theta)\frac{y}{P_1} - (n + \delta - \beta_1^2)k\right]dt - \beta_1 k\,dW, \quad k \in [0, \infty). \tag{10}$$

Explicitly, for the three subintervals of k,

i) 0 < k ≤ k1*:

$$dk = [s(k,\theta)f_1(k) - (n + \delta - \beta_1^2)k]\,dt - \beta_1 k\,dW; \tag{11a}$$

ii) k1* < k < k2*:

$$dk = [s(k,\theta)(\tilde\Theta_2 + \tilde\Theta_1 k) - (n + \delta - \beta_1^2)k]\,dt - \beta_1 k\,dW; \tag{11b}$$

iii) k ≥ k2*:

$$dk = \left[s(k,\theta)\frac{f_2(k)}{p} - (n + \delta - \beta_1^2)k\right]dt - \beta_1 k\,dW; \tag{11c}$$

where

$$\tilde\Theta_1 = \frac{y_1^* - (y_2^*/p)}{k_1^* - k_2^*} > 0, \qquad \tilde\Theta_2 = \frac{(y_2^*/p)\,k_1^* - y_1^* k_2^*}{k_1^* - k_2^*} > 0.$$

Proof: See Jensen and Wang (1999, p. 382) with β2 = β3 = 0. □
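The three regimes of Lemma 2 can be collected in a single income function, which makes it easy to see that the linear segment (11b) joins (11a) and (11c) continuously at k1* and k2*. The sketch below is illustrative only: it assumes Cobb-Douglas sector technologies, and the exponents, the value chosen for k1*, and the names `make_income` and `drift` are hypothetical. The price p is derived so that the linear segment is a common tangent to f1 and f2/p.

```python
from typing import Callable


def make_income(f1: Callable[[float], float], f2: Callable[[float], float],
                p: float, k1s: float, k2s: float) -> Callable[[float], float]:
    """Income per worker in units of good 1, y/P1, as a function of k.

    Assumes k1s < k2s are the cone boundaries k1*, k2* for the given p.
    """
    y1s, y2s = f1(k1s), f2(k2s)
    theta1 = (y1s - y2s / p) / (k1s - k2s)
    theta2 = ((y2s / p) * k1s - y1s * k2s) / (k1s - k2s)

    def income(k: float) -> float:
        if k <= k1s:
            return f1(k)                 # specialization in good 1, (11a)
        if k < k2s:
            return theta2 + theta1 * k   # diversification, (11b)
        return f2(k) / p                 # specialization in good 2, (11c)

    return income


def drift(k, s, income, n=0.02, delta=0.05, beta1=0.01):
    """Drift coefficient of (10) for a given savings rate s."""
    return s * income(k) - (n + delta - beta1 ** 2) * k


# Hypothetical Cobb-Douglas example; p is chosen to make the chord tangent.
a1, a2 = 0.3, 0.5
k1s = 1.2
k2s = k1s * ((1 - a1) / a1) * (a2 / (1 - a2))
p = (1 - a2) * k2s ** a2 / ((1 - a1) * k1s ** a1)
income = make_income(lambda k: k ** a1, lambda k: k ** a2, p, k1s, k2s)
```

With these choices the slope of the linear segment equals f1'(k1*), so the resulting income function is concave with a linear piece, as described in Lemma 1.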

6.3 Comparative dynamics and policy parameters

In this section, we analyze the impact of any policy parameter on the entire capital accumulation path, as well as on the stochastic steady states. Assume that the savings ratio s(k, θ) satisfies ∂s/∂θ ≠ 0, which means that if θ1 and θ2 are two alternative values of the policy parameter θ with θ1 < θ2, then for all k > 0:

$$s(k,\theta_1)f(k) < (>)\ s(k,\theta_2)f(k), \quad \text{if } \frac{\partial s}{\partial\theta} > (<)\ 0. \tag{12}$$

Theorem 1. If ∂s/∂θ > (<) 0 for all k > 0, and θ1 < θ2, then the respective time paths, $k_t^{\theta_1}$, $k_t^{\theta_2}$, for $k_t$ satisfy:

$$k_t^{\theta_1} \le (\ge)\ k_t^{\theta_2} \quad \text{a.s. } P, \text{ for all } t > 0. \tag{13}$$

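Proof: Let $s_\theta > 0$ (the proof is identical for $s_\theta < 0$). We consider the comparative dynamics for the three subintervals of k, respectively.

1) 0 < k ≤ k1*: Given any k, k' ∈ (0, k1*], let x = |k − k'|. For ε > 0, define the indicator function $\chi_{(0,\varepsilon)}$ of the set (0, ε):

$$\chi_{(0,\varepsilon)}(x) = \begin{cases} 1, & \text{if } x \in (0,\varepsilon), \\ 0, & \text{if } x \notin (0,\varepsilon). \end{cases} \tag{14}$$

Then

$$\lim_{\varepsilon\to 0}\int \chi_{(0,\varepsilon)}(x)\,\frac{1}{x^2}\,dx = \infty. \tag{15}$$

We can choose a sequence $\{a_n\} \subset (0, 1]$, n = 1, 2, ..., such that $a_0 = 1$, $a_n \ge a_{n+1}$, $\lim_{n\to\infty} a_n = 0$, and for any n ≥ 1,

$$\int \chi_{(a_n, a_{n-1})}(x)\,\frac{1}{x^2}\,dx = n. \tag{16}$$

Choose the sequence of continuous functions $\{\rho_n(x)\}$, $\rho_n(x) = \frac{1}{n x^2}$; it has support on the interval $(a_n, a_{n-1})$. Let $\Delta_t = k_t^{\theta_1} - k_t^{\theta_2}$ and consider the functions:

$$\Psi_n(\Delta_t) = \int_0^{|\Delta_t|}\!\!\int_0^y \rho_n(x)\,dx\,dy, \qquad \Phi_n(\Delta_t) = \Psi_n(\Delta_t)\,\chi_{(0,\infty)}(\Delta_t). \tag{17}$$

Then $\Psi_n$ is twice continuously differentiable on $\mathbb{R}_+$, and, for a fixed y and sufficiently large n, $\int_0^y \rho_n(x)\,dx = 1$. Note that:

$$\lim_{n\to\infty}\Phi_n(\Delta_t) = \chi_{(0,\infty)}(\Delta_t)\lim_{n\to\infty}\Psi_n(\Delta_t) = \chi_{(0,\infty)}(\Delta_t)\,|\Delta_t| = \sup\{0, \Delta_t\} \equiv \Delta_t^+. \tag{18}$$

For $\Delta_t \le 0$, $\Phi_n(\Delta_t) = 0$. And for $\Delta_t > 0$, we have

$$0 < \Phi_n'(\Delta_t) = \Psi_n'(\Delta_t) = \int_0^{\Delta_t} \rho_n(x)\,dx \le 1. \tag{19}$$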
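The sequence {a_n} in (16) can be written down explicitly, which makes the normalization of ρ_n easy to verify. Since the integral of 1/x² over (a_n, a_{n−1}) equals 1/a_n − 1/a_{n−1}, condition (16) with a_0 = 1 gives 1/a_n = 1 + n(n + 1)/2. The short check below uses exact rational arithmetic; the function names are hypothetical.

```python
from fractions import Fraction


def a(n: int) -> Fraction:
    """a_0 = 1 and 1/a_n - 1/a_{n-1} = n, so 1/a_n = 1 + n(n+1)/2."""
    return Fraction(1, 1 + n * (n + 1) // 2)


def rho_mass(n: int) -> Fraction:
    """Integral of rho_n(x) = 1/(n x^2) over its support (a_n, a_{n-1})."""
    return (1 / a(n) - 1 / a(n - 1)) / n


masses = [rho_mass(n) for n in range(1, 6)]
endpoints = [a(n) for n in range(0, 6)]
```

Each ρ_n therefore integrates to one over its support, and x²ρ_n(x) = 1/n on that support, which is the bound that makes the second-order Itô term vanish as n → ∞ later in the proof.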

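Moreover, $\Phi_n''(\Delta_t) = \rho_n(\Delta_t)$. From (11a),

$$d\Delta_t = [s(k_t^{\theta_1},\theta_1)f_1(k_t^{\theta_1}) - s(k_t^{\theta_2},\theta_2)f_1(k_t^{\theta_2})]\,dt + (\delta + n - \beta_1^2)(k_t^{\theta_2} - k_t^{\theta_1})\,dt + \beta_1(k_t^{\theta_2} - k_t^{\theta_1})\,dW_t. \tag{20}$$

From Itô's Lemma,

$$\begin{aligned}
d\Phi_n(\Delta_t) &= \Phi_n'(\Delta_t)\,d\Delta_t + \tfrac{1}{2}\Phi_n''(\Delta_t)(d\Delta_t)^2 \\
&= \Phi_n'(\Delta_t)[s(k_t^{\theta_1},\theta_1)f_1(k_t^{\theta_1}) - s(k_t^{\theta_2},\theta_2)f_1(k_t^{\theta_2})]\,dt \\
&\quad + \Phi_n'(\Delta_t)(\delta + n - \beta_1^2)(k_t^{\theta_2} - k_t^{\theta_1})\,dt \\
&\quad + \tfrac{1}{2}\Phi_n''(\Delta_t)\,\beta_1^2 (k_t^{\theta_2} - k_t^{\theta_1})^2\,dt \\
&\quad + \Phi_n'(\Delta_t)\,\beta_1 (k_t^{\theta_2} - k_t^{\theta_1})\,dW_t.
\end{aligned} \tag{21}$$

A result in Karatzas and Shreve (1991) shows that there exists a real-valued function $Z(k): \mathbb{R}_+ \to \mathbb{R}_+$, which satisfies the Lipschitz condition and is such that

$$s(k,\theta_1)f_1(k) \le Z(k) \le s(k,\theta_2)f_1(k), \quad \text{if } \frac{\partial s}{\partial\theta} > 0. \tag{22}$$

Then $s(k_t^{\theta_1},\theta_1)f_1(k_t^{\theta_1}) \le Z(k_t^{\theta_1})$ and $s(k_t^{\theta_2},\theta_2)f_1(k_t^{\theta_2}) \ge Z(k_t^{\theta_2})$. Therefore,

$$\Phi_n'(\Delta_t)[s(k_t^{\theta_1},\theta_1)f_1(k_t^{\theta_1}) - s(k_t^{\theta_2},\theta_2)f_1(k_t^{\theta_2})]\,dt \le \Phi_n'(\Delta_t)[Z(k_t^{\theta_1}) - Z(k_t^{\theta_2})]\,dt, \tag{23}$$

since $0 < \Phi_n'(\Delta_t) \le 1$; also $E(dW_t) = 0$ for any t ≥ 0.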

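Hence, by taking expectations in the integral form of (21), we have

$$E\,\Phi_n(\Delta_t) \le \xi_1 \int_0^t E(\Delta_s^+)\,ds + \frac{1}{2}\,E\int_0^t \Phi_n''(\Delta_s)\,\beta_1^2 (\Delta_s)^2\,ds, \tag{24}$$

where $\xi_1$ is a constant. By $\Phi_n''(\Delta_t) = \rho_n(\Delta_t)$ and $\rho_n(x)\,x^2 \le 1/n$,

$$E\,\Phi_n(\Delta_t) \le \xi_1 \int_0^t E(\Delta_s^+)\,ds + \frac{t\beta_1^2}{2n}. \tag{25}$$

Letting n → ∞, we have

$$E(\Delta_t^+) \le \xi_1 \int_0^t E(\Delta_s^+)\,ds.$$

Using the Gronwall inequality, it follows that $E(\Delta_t^+) = 0$. This implies that $\Delta_t^+ = 0$ a.s. P, and hence $\Delta_t \le 0$ a.s. P for 0 < k ≤ k1*, i.e.,

$$k_t^{\theta_2} \ge k_t^{\theta_1} \quad \text{a.s. } P, \text{ for all } t > 0. \tag{26}$$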

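2) For k1* < k < k2*, the stochastic dynamics of the capital-labor ratio is given by

$$dk = [s(k,\theta)(\tilde\Theta_1 k + \tilde\Theta_2) - (\delta + n - \beta_1^2)k]\,dt - \beta_1 k\,dW. \tag{27}$$

Let $\Delta_t = k_t^{\theta_1} - k_t^{\theta_2}$, and define the same functions $\rho_n(x)$, $\Phi_n(\Delta_t)$ as in case 1). For $\Delta_t \le 0$, $\Phi_n(\Delta_t) = 0$, and for $\Delta_t > 0$, we have

$$\begin{aligned}
d\Delta_t &= [s(k_t^{\theta_1},\theta_1)(\tilde\Theta_1 k_t^{\theta_1} + \tilde\Theta_2) - (\delta + n - \beta_1^2)k_t^{\theta_1}]\,dt - \beta_1 k_t^{\theta_1}\,dW_t \\
&\quad - [s(k_t^{\theta_2},\theta_2)(\tilde\Theta_1 k_t^{\theta_2} + \tilde\Theta_2) - (\delta + n - \beta_1^2)k_t^{\theta_2}]\,dt + \beta_1 k_t^{\theta_2}\,dW_t \\
&= [s(k_t^{\theta_1},\theta_1)(\tilde\Theta_1 k_t^{\theta_1} + \tilde\Theta_2) - s(k_t^{\theta_2},\theta_2)(\tilde\Theta_1 k_t^{\theta_2} + \tilde\Theta_2)]\,dt \\
&\quad + (\delta + n - \beta_1^2)(k_t^{\theta_2} - k_t^{\theta_1})\,dt + \beta_1(k_t^{\theta_2} - k_t^{\theta_1})\,dW_t.
\end{aligned} \tag{28}$$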

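From Itô's Lemma,

$$\begin{aligned}
d\Phi_n(\Delta_t) &= \Phi_n'(\Delta_t)\,d\Delta_t + \tfrac{1}{2}\Phi_n''(\Delta_t)(d\Delta_t)^2 \\
&= \Phi_n'(\Delta_t)[s(k_t^{\theta_1},\theta_1)(\tilde\Theta_1 k_t^{\theta_1} + \tilde\Theta_2) - s(k_t^{\theta_2},\theta_2)(\tilde\Theta_1 k_t^{\theta_2} + \tilde\Theta_2)]\,dt \\
&\quad + \Phi_n'(\Delta_t)(\delta + n - \beta_1^2)(k_t^{\theta_2} - k_t^{\theta_1})\,dt + \tfrac{1}{2}\Phi_n''(\Delta_t)\,\beta_1^2(k_t^{\theta_2} - k_t^{\theta_1})^2\,dt \\
&\quad + \Phi_n'(\Delta_t)\,\beta_1(k_t^{\theta_2} - k_t^{\theta_1})\,dW_t.
\end{aligned} \tag{29}$$

From $s(k_t^{\theta_2},\theta_2) > s(k_t^{\theta_1},\theta_1)$ and $\tilde\Theta_1 > 0$, it follows that

$$\begin{aligned}
s(k_t^{\theta_1},\theta_1)(\tilde\Theta_1 k_t^{\theta_1} + \tilde\Theta_2) - s(k_t^{\theta_2},\theta_2)(\tilde\Theta_1 k_t^{\theta_2} + \tilde\Theta_2)
&\le s(k_t^{\theta_2},\theta_2)[\tilde\Theta_1 k_t^{\theta_1} + \tilde\Theta_2 - (\tilde\Theta_1 k_t^{\theta_2} + \tilde\Theta_2)] \\
&= s(k_t^{\theta_2},\theta_2)\,\tilde\Theta_1 (k_t^{\theta_1} - k_t^{\theta_2}) \\
&\le \tilde\Theta_1 |k_t^{\theta_2} - k_t^{\theta_1}|.
\end{aligned}$$

Note that $0 < \Phi_n'(\Delta_t) \le 1$ and $E(dW_t) = 0$ for any t ≥ 0. Therefore, from (29), we have

$$E\,\Phi_n(\Delta_t) \le \xi_2 \int_0^t E(\Delta_s^+)\,ds + \frac{1}{2}\,E\int_0^t \Phi_n''(\Delta_s)\,\beta_1^2(\Delta_s)^2\,ds, \tag{30}$$

where $\xi_2 = \tilde\Theta_1 + \delta + n - \beta_1^2$. From $\Phi_n''(\Delta_t) = \rho_n(\Delta_t)$, the inequality (30) becomes

$$E\,\Phi_n(\Delta_t) \le \xi_2 \int_0^t E(\Delta_s^+)\,ds + \frac{t\beta_1^2}{2n}. \tag{31}$$

Letting n → ∞, we obtain

$$E(\Delta_t^+) \le \xi_2 \int_0^t E(\Delta_s^+)\,ds. \tag{32}$$

By the Gronwall inequality, $E(\Delta_t^+) = 0$, so that $\Delta_t^+ = 0$ a.s. P. This implies $\Delta_t \le 0$ a.s. P, and for k1* < k < k2*,

$$k_t^{\theta_2} \ge k_t^{\theta_1} \quad \text{a.s. } P, \text{ for all } t > 0. \tag{33}$$

3) k ≥ k2*: The proof is similar to case 1) and is therefore omitted.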

This completes the proof of Theorem 1. □
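The ordering in Theorem 1 can be illustrated numerically by driving two Euler paths of (11a) with the same Wiener increments. This is only a sketch under strong simplifying assumptions: a Cobb-Douglas f1(k) = k^a, a savings rate that is constant in k and equal to the policy parameter, a threshold k1* taken large enough that the paths stay in the first regime, and hypothetical parameter values and function names.

```python
import math
import random


def simulate_pair(s_low: float, s_high: float, k0: float = 1.0,
                  a: float = 0.3, n: float = 0.02, delta: float = 0.05,
                  beta1: float = 0.1, horizon: float = 50.0,
                  steps: int = 5000, seed: int = 0):
    """Euler paths of (11a) for two savings rates, driven by one Wiener path."""
    rng = random.Random(seed)
    h = horizon / steps
    c = n + delta - beta1 ** 2
    low, high = [k0], [k0]
    for _ in range(steps):
        dw = rng.gauss(0.0, math.sqrt(h))       # shared increment
        for path, s in ((low, s_low), (high, s_high)):
            k = path[-1]
            path.append(k + (s * k ** a - c * k) * h - beta1 * k * dw)
    return low, high


low, high = simulate_pair(s_low=0.20, s_high=0.25)
ordered = all(x <= y for x, y in zip(low, high))
```

For a small step size the Euler map is increasing in both k and s, so the discrete paths inherit the ordering step by step; the simulation is an illustration of the statement, not a substitute for the proof.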

In Theorem 1, the savings rate s(k, θ) depends monotonically on the policy parameter θ. Such a monotonicity restriction is common in conventional models when analyzing the effect of a policy parameter, see Atkinson and Stiglitz (1980). Theorem 1 shows that the time path of the capital-labor ratio for a small trading economy enjoys the following property with respect to the policy parameter θ: If the savings function depends positively (negatively) on θ for all k > 0, then the capital-labor ratio with parameter value θ2 always lies above (below) the capital-labor ratio with parameter value θ1 in each time period.

As is done in Jensen and Wang (1999), we can analyze the existence of the steady-state distribution in the stochastic model (12); moreover, with Cobb-Douglas sector technologies, we can obtain a closed-form expression for the time-invariant distribution function of the diffusion process. As an implication of Theorem 1, we offer a comparative dynamic analysis of the steady-state distributions. Define the distribution function $\{F_t^\theta\}$, t ≥ 0, associated with the time path of the capital-labor ratio $\{k_t^\theta\}$ as:

$$F_t^\theta(k) = P[k_t^\theta \le k], \quad t \ge 0.$$

Then the steady-state distribution is defined by

$$F^\theta = \lim_{t\to\infty} F_t^\theta.$$

Corollary 1. If $s_\theta > (<)\ 0$ for all k > 0, and θ1 < θ2, then the respective steady-state distributions satisfy: $F^{\theta_1} \ge (\le)\ F^{\theta_2}$ for all k ≥ 0.

Proof: From Theorem 1, with $s_\theta > (<)\ 0$, we have $k_t^{\theta_1} \le (\ge)\ k_t^{\theta_2}$. Then for all t ≥ 0, we get

$$F_t^{\theta_1}(k) = P[k_t^{\theta_1} \le k] \ \ge (\le)\ P[k_t^{\theta_2} \le k] = F_t^{\theta_2}(k). \tag{34}$$

Since $F^\theta = \lim_{t\to\infty} F_t^\theta$, letting t → ∞ in the above inequalities yields the result that $F^{\theta_1} \ge (\le)\ F^{\theta_2}$ if $s_\theta > (<)\ 0$ for all k > 0. □

This corollary says that the steady-state distribution of the stochastic dynamics of a small two-sector trading economy with policy parameter value θ2 dominates (is dominated by) the steady-state distribution of the economy with parameter value θ1, if the savings rate depends positively (negatively) on the policy parameter.

Acknowledgements: This research was supported in part by NNSF, and the Fund for "Study on the Evolution of Complex Economic System" at "Innovation Center of Economic Transition and Development of Nanjing University" of Ministry of Education, China. We would like to thank Professor Bjarne S. Jensen, Copenhagen Business School, for many valuable discussions and suggestions.

References:

Atkinson A.B., and Stiglitz J.E. (1980) Lectures on Public Economics. McGraw-Hill Inc., New York.
Barro R.J., and Sala-i-Martin X. (1995) Economic Growth. New York.
Boadway R. (1979) “Long-run Tax Incidence: A Comparative Dynamic Approach.” Review of Economic Studies 46: 505–511.
Burmeister E., and Dobell A.R. (1970) Mathematical Theories of Economic Growth. MacMillan, London.
Chang F.R., and Malliaris A.G. (1987) “Asymptotic Growth under Uncertainty: Existence and Uniqueness.” Review of Economic Studies 54: 169–174.
Deardorff A.V. (1974) “A Geometry of Growth and Trade.” Canadian Journal of Economics 7: 295–306.
Deardorff A.V., and Hanson J.A. (1978) “Accumulation and a Long-run Heckscher-Ohlin Theorem.” Economic Inquiry 16: 288–292.
Jensen B.S. (1994) The Dynamic Systems of Basic Economic Growth Models. Dordrecht: Kluwer Academic Publishers.
Jensen B.S., and Larsen M.E. (1987) “Growth and Long-Run Stability.” Acta Applicandae Mathematicae 9: 219–237.
Jensen B.S. and Wong K.Y. (1997) Dynamics, Economic Growth, and International Trade. Ann Arbor, University of Michigan Press.
Jensen B.S., and Wang C. (1999) “Basic Stochastic Dynamic Systems of Growth and Trade.” Review of International Economics 7: 378–402.
Karatzas I., and Shreve S.E. (1991) Brownian Motion and Stochastic Calculus. Berlin, Springer-Verlag.
Merton R.C. (1975) “An Asymptotic Theory of Growth under Uncertainty.” Review of Economic Studies 42: 375–393.
Shinkai Y. (1960) “On the Equilibrium Growth of Capital and Labor.” International Economic Review 1: 107–111.
Solow R.M. (1956) “A Contribution to the Theory of Economic Growth.” Quarterly Journal of Economics 70: 65–94.
Uzawa H. (1961) “On a Two-sector Model of Economic Growth.” Review of Economic Studies 29: 40–47.

Chapter 7 Inada Conditions and Global Dynamic Analysis of Basic Growth Models with Time Delays

Zhu Hongliang School of Management and Engineering, Nanjing University, Nanjing, P.R.China Huang Wenzao Department of Financial Mathematics, Peking University, Beijing, P.R.China

7.1 Introduction

Standard growth models exhibit solutions that in the long run converge smoothly to a unique equilibrium (steady state) from any positive initial point. But actual observations of most variables, from any country, show many fluctuations. Arrow and Smale (1980) pointed out that the economic system can be seen as an evolutionary complex system, and that it is necessary to study economic theory from the viewpoint of nonlinear dynamics. Much evidence indicates that economic fluctuations come not only from exogenous events, but also from the inner structure of the economic dynamic system, see Zhang (1990a, 1990b), Jarsulic (1993), Lorenz (1989) and Puu (1991).

It is notable, however, that time delays are often neglected in continuous-time dynamics in economics, to a great extent, no doubt, because of the difficulty of solving and analyzing such models. Nevertheless, economic development depends not only on the current state, but also on past states (history). Gandolfo (1997) points out that delay dynamical systems are much more suitable than differential equations alone or difference equations alone for an adequate treatment of dynamic economic phenomena. In recent years, the theory of functional differential equations has made considerable progress. Invernizzi and Medio (1991) probed the relationship between time delay and chaos in economic systems, and argued that time delay is important in bringing about chaos. Chukwu (1996, 1998) discussed the controllability of some economic growth models with time delay, and Boucekkine et al. (1997) studied an economic growth model by numerical solution methods, and found that time delays have great impacts on the dynamic properties.

In this chapter, we consider the neoclassical economic growth model with time delays, and our focus is on obtaining global conditions for steady-state stability or oscillations. The standard growth model in labour, L, and capital, K, accumulation (k = K/L) is:

$$\dot L(t) = nL(t), \qquad \dot K(t) = sF(L(t), K(t)) - \delta K(t), \tag{1}$$

$$\dot k(t) = sf(k(t)) - (n + \delta)k(t), \tag{2}$$

where 0 < s < 1, 0 < δ < 1, n > 0 represent the gross saving rate, the depreciation rate, and the growth rate of labour, respectively, and F(L(t), K(t)) is a neoclassical production function, homogeneous of degree one, i.e., F(L(t), K(t))/L(t) = F(1, k(t)) = f(k(t)). Moreover, the production function f(k) is here assumed to satisfy the Inada conditions:

$$f'(k) > 0, \quad f''(k) < 0 \ \text{ for all } k > 0; \qquad f(0) = 0, \quad f(\infty) = \infty; \qquad \lim_{k\to\infty} f'(k) = 0, \quad \lim_{k\to 0} f'(k) = \infty. \tag{3}$$

A standard result in growth theory is:

Lemma 1. System (2) has a unique positive equilibrium point (steady state), k = κ, and κ is globally asymptotically stable under the Inada condition (3).

7.2 Neoclassical growth model with time delays

In the production function, we introduce a delay (time lag) τ (a positive constant) in the full utilization (productive operation) of acquired (installed) capital,

$$\dot L(t) = nL(t), \qquad \dot K(t) = sF(L(t), K(t - \tau)) - \delta K(t). \tag{4}$$

The time derivative of the capital/labour ratio, k(t) = K(t)/L(t), is obtained from (4). Since L(t − τ)/L(t) = e^{−nτ}, we have K(t − τ)/L(t) = e^{−nτ} k(t − τ), and so we get the following delay equation for k(t):

$$\dot k(t) = sf(\beta k(t - \tau)) - (n + \delta)k(t); \qquad \beta = e^{-n\tau} > 0 \tag{5}$$

where f(βk(t)) = F(1, βk(t)) satisfies the Inada condition. We study the qualitative behavior of the delay differential equation (5), and discuss its economic meaning. Assuming initial time t0 = 0, define $E_{t_0} = [-\tau, 0]$ as the initial region. Let $\mathbb{R}_+ = (0, +\infty)$, and let $C^+ = C([-\tau, 0], \mathbb{R}_+)$ be the space of continuous functions from [−τ, 0] to $\mathbb{R}_+$. For every φ ∈ C⁺, define the norm

$$\|\varphi\| = \sup_{-\tau \le \theta \le 0} |\varphi(\theta)|.$$

Theorem 1. If the initial function, k0(t) ≡ Φ(t), for t ∈ [−τ, 0], belongs to C⁺, then the solution k(t) of the delay differential equation (5) exists and is unique for t ∈ [0, +∞). The equilibrium point (steady state) κ is obtained from (5) as

$$sf(\beta\kappa) = (n + \delta)\kappa, \tag{6}$$

and it is globally asymptotically stable.

Proof: The existence and uniqueness of the solutions in Theorem 1 follow easily from standard theory, see Hale (1993). Since f(βk) satisfies the Inada condition, there exists a unique positive equilibrium point κ in (6). We must prove the global attractivity of κ.

As the first step, we show that the solution $k(t)$ of the delay differential equation (5) is positive and bounded for all $t \ge 0$. Assume that $k(t)$ is not always positive for $t \ge 0$; then there must be a $t_1 > 0$ such that $k(t) > 0$ for $t \in [0, t_1)$, but $k(t_1) = 0$. Thus we have $\dot k(t_1) \le 0$, i.e.,

$$s f(\beta k(t_1 - \tau)) \le (n+\delta)k(t_1) = 0.$$

Since $f(\beta k) > 0$ for $k > 0$, and $f(0) = 0$, then $f(\beta k(t_1 - \tau)) = 0$. Thus $k(t_1 - \tau) = 0$, which contradicts the definition of $t_1$. So $k(t) > 0$ for $t \ge 0$. By the Inada condition, we have

$$\lim_{k\to\infty} \frac{s f(\beta k)}{(n+\delta)k} = \lim_{k\to\infty} \frac{s\beta f'(\beta k)}{n+\delta} = 0.$$

Thus, there is an $N > 0$ such that $s f(\beta k) < (n+\delta)k$ for all $k \ge N$. If $k(t)$ is unbounded, then there is a $t_2 > 0$ such that $\dot k(t_2) \ge 0$, $k(t_2) \ge N$, and $k(t) < k(t_2)$ for $t \in [0, t_2)$; hence, $s f(\beta k(t_2)) < (n+\delta)k(t_2)$. Since $f'(k) > 0$, then

$$s f(\beta k(t_2 - \tau)) < s f(\beta k(t_2)) < (n+\delta)k(t_2).$$

But $\dot k(t_2) \ge 0$ leads to

$$s f(\beta k(t_2 - \tau)) \ge (n+\delta)k(t_2),$$

which is a contradiction.

Next we show that if $k(t)$ is a solution of equation (5) with $0 < k(0) < \kappa$, where $\kappa$ is the equilibrium of equation (5), then $k(t) > k_m$ for all $t > \tau$, where

$$k_m = \min\{k(t) : t \in [0, \tau]\}. \tag{7}$$

From the above, we know that $k_m > 0$ and $k_m < \kappa$. If the claim does not hold, then there exists a $t_1 > \tau$ such that $k(t_1) = k_m$; but for $t \in [0, t_1)$, we have $k(t) > k_m$, and $\dot k(t_1) \le 0$. Thus

$$s f(\beta k(t_1 - \tau)) \le (n+\delta)k(t_1).$$

Therefore, $k(t_1 - \tau) < k(t_1) = k_m$, which contradicts the definition of $k_m$, (7). Thus for $t > \tau$, we have $k(t) > \min\{k(t) : t \in [0, \tau]\} = k_m$.

Next, we prove the global attractivity of the equilibrium $\kappa$. Denote

$$u = \limsup_{t\to\infty} |k(t) - \kappa|.$$

As proved above, $k(t)$ is positive and bounded, so $u < +\infty$. We must prove $u = 0$. Otherwise, if $u > 0$, then one of the following two statements holds:

(i) There exists a sequence $\{t_i\}$ with $t_i > t_{i-1}$, $\lim_{i\to\infty} t_i = +\infty$, and

$$\lim_{i\to\infty} k(t_i) = \kappa + u; \tag{8}$$

(ii) There exists a sequence $\{t_i\}$ with $t_i > t_{i-1}$, $\lim_{i\to\infty} t_i = +\infty$, and

$$\lim_{i\to\infty} k(t_i) = \kappa - u. \tag{9}$$

Assume that (i) holds; then there exists an $\varepsilon > 0$ such that

$$s f(\beta(u + \varepsilon + \kappa)) < (n+\delta)(u - \varepsilon + \kappa). \tag{10}$$

For this $\varepsilon$, from (8), there is a $T = T(\varepsilon) > \tau$ such that for $t \ge T - \tau$, we have $k(t) < u + \varepsilon + \kappa$. In the following, we consider two cases: (ia) $k(t)$ is not monotone; (ib) $k(t)$ is monotone.

First, assume that $k(t)$ is not monotone; then there is a $t' > T$ such that

$$\dot k(t') = 0, \qquad k(t') - \kappa > u - \varepsilon.$$

Thus,

$$s f(\beta k(t' - \tau)) = (n+\delta)k(t') > (n+\delta)(u - \varepsilon + \kappa).$$

From (10), we obtain

$$s f(\beta k(t' - \tau)) > (n+\delta)(u - \varepsilon + \kappa) > s f(\beta(u + \varepsilon + \kappa)).$$

But $f'(k) > 0$, so $k(t' - \tau) > u + \varepsilon + \kappa$. This contradicts $k(t) < u + \varepsilon + \kappa$.

Next assume that $k(t)$ is monotone; then

$$\lim_{t\to\infty} \dot k(t) = s f(\beta(\kappa + u)) - (n+\delta)(\kappa + u) < 0.$$

This leads to $\lim_{t\to\infty} k(t) = -\infty$, which contradicts $\lim_{t\to\infty} k(t) = \kappa + u$.

Now if (ii) holds, we can, with no loss of generality, let $0 < k(0) < \kappa$, and choose $0 < \varepsilon < k_m$, where $k_m = \min\{k(t) : t \in [0, \tau]\}$, such that

$$s f(\beta(\kappa - u - \varepsilon)) > (n+\delta)(\kappa - u + \varepsilon).$$

The remaining proof is similar to case (i) with (ia), (ib), so we omit it. This completes the proof that $u = 0$, i.e., the global attractivity of $\kappa$.

As the second step, we prove the local stability of the equilibrium $\kappa$ of the delay equation (5). Consider the linearized system around $\kappa$ in equation (5):

$$\dot k(t) = s\beta f'(\beta\kappa)\,k(t-\tau) - (n+\delta)k(t). \tag{11}$$

Its characteristic equation is

$$\lambda = s\beta f'(\beta\kappa)\,e^{-\lambda\tau} - (n+\delta). \tag{12}$$

By the Inada condition, we know that $s\beta f'(\beta\kappa) < n+\delta$. Thus, all roots of equation (12) have negative real parts, see Gopalsamy (1992); i.e., the equilibrium point $\kappa$ of the delay system (5) is asymptotically stable. With both global attractivity and asymptotic stability of $\kappa$, the proof is completed. □

Figure 1: Convergence of the delay system

By comparing Lemma 1 and Theorem 1, we find that introducing any delay (any size of $\tau$) for capital, $K(t-\tau)$, in the production function does not change the global stability property of the equilibrium (steady state). The standard growth model, at least with Inada conditions (Lemma 1), is quite robust to such delays that occur only within the neoclassical production function, i.e., the nonlinear part of (5).

Numerical simulation of the delay system (5) shows that, in contrast to the smooth monotone solutions of the standard model, the solution of the delay model (5) converges to the positive equilibrium state $\kappa$ with some fluctuation; see Figure 1, where, for simplicity, we choose the parameters

$$s = \tfrac{1}{2}, \qquad n = \tfrac{3}{2}, \qquad \delta = \tfrac{1}{2}, \qquad \tau = 3$$

and $f(k) = k^{1/2}$, which give the delay model

$$\dot k(t) = \tfrac{1}{2}\,\beta^{1/2} k^{1/2}(t-3) - 2k(t),$$

where $\beta = e^{-n\tau} = e^{-9/2}$, and from (6), $\kappa = \tfrac{1}{16}e^{-9/2}$. Let $k(t) = \kappa e^{x(t)}$; then the delay model can be transformed into

$$\dot x(t) = 2\left(\exp\left[\tfrac{1}{2}x(t-3) - x(t)\right] - 1\right).$$
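The transformed equation lends itself to a quick numerical check. The following is a minimal sketch, not the authors' simulation code: it integrates $\dot x(t) = 2(\exp[\tfrac12 x(t-3) - x(t)] - 1)$ with a fixed-step explicit Euler scheme, assuming a constant initial function $x(t) = x_0$ on $[-3, 0]$. The function name, step size, time horizon and initial value are illustrative assumptions.

```python
import math


def simulate_transformed_model(x0=1.0, tau=3.0, t_end=60.0, dt=0.01):
    """Explicit Euler integration of x'(t) = 2*(exp(x(t - tau)/2 - x(t)) - 1).

    Assumes a constant history x(t) = x0 on [-tau, 0] (an illustrative choice).
    Returns the list of values x(0), x(dt), ..., x(t_end).
    """
    lag = int(round(tau / dt))      # number of grid steps spanning one delay
    steps = int(round(t_end / dt))
    x = [x0] * (lag + 1)            # grid values on [-tau, 0]
    for _ in range(steps):
        x_now = x[-1]
        x_lag = x[-1 - lag]         # value one delay earlier
        x.append(x_now + dt * 2.0 * (math.exp(0.5 * x_lag - x_now) - 1.0))
    return x[lag:]


path = simulate_transformed_model()
kappa = math.exp(-4.5) / 16.0                  # steady state obtained from (6)
k_path = [kappa * math.exp(v) for v in path]   # back-transform k = kappa*exp(x)
```

Under these assumptions, `path` should decay toward 0 and `k_path` toward $\kappa$, in line with Theorem 1; a different initial function or a finer step changes the transient but should not change the limit.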

7.3

Dynamics with delays in production and depreciation

In this section, the delay model (5) is extended by also allowing for a time delay in capital depreciation. When, for various reasons, a certain amount of time is required before acquired capital can be utilized (operated) efficiently, the actual depreciation of capital could properly be postponed, too. For simplicity, the same time lag (delay, $\tau$) will be used for installation and depreciation of capital. Thus, the extended delay model of economic growth becomes:

$$\dot L(t) = nL(t), \qquad \dot K(t) = sF(L(t), K(t-\tau)) - \delta K(t-\tau). \tag{13}$$

Hence the time derivative of $k(t)$ gives (after some manipulations) the following delay equation for $k(t)$:

$$\dot k(t) = s f(\beta k(t-\tau)) - \delta\beta k(t-\tau) - nk(t), \tag{14}$$

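where $s, n, \tau, \delta, \beta$ are parameters as previously stated, and $f(k)$ satisfies the Inada conditions. The standard example of $f(k)$ meeting (3) is the CD technology:

$$f(k) = \gamma k^{\alpha}; \qquad \gamma > 0, \quad 0 < \alpha < 1. \tag{15}$$

Thus from (14)-(15), we have the delay differential equation for $k(t)$:

$$\dot k(t) = s\gamma\beta^{\alpha} k^{\alpha}(t-\tau) - \delta\beta k(t-\tau) - nk(t), \tag{16}$$

$$\dot k(t) = B[k(t-\tau)] - nk(t), \tag{17}$$

where $B(k) = s\gamma\beta^{\alpha}k^{\alpha} - \delta\beta k$. Here $B(k) = 0$ gives

$$k^{*} = 0, \qquad k^{**} = \left(\frac{s\gamma}{\delta}\right)^{\frac{1}{1-\alpha}}\frac{1}{\beta},$$

and $B'(k) = 0$ gives

$$k = k_M = \left(\frac{s\gamma\alpha}{\delta}\right)^{\frac{1}{1-\alpha}}\frac{1}{\beta},$$

i.e.,

$$B(k_M) = \max\{B(k) : k \in [0, k^{**}]\}. \tag{18}$$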
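The closed-form expressions for $k^{**}$ and $k_M$ can be checked numerically. The sketch below is illustrative only: the function names and the parameter values ($s = 0.2$, $\gamma = 1$, $\beta = 0.15$, $\delta = 0.05$, $\alpha = 1/3$) are assumptions chosen for the check, not values taken from the text.

```python
def B(k, s, gamma, beta, delta, alpha):
    """B(k) = s*gamma*beta**alpha * k**alpha - delta*beta*k, as in (17)."""
    return s * gamma * beta**alpha * k**alpha - delta * beta * k


def k_double_star(s, gamma, beta, delta, alpha):
    """Positive root k** of B(k) = 0."""
    return (s * gamma / delta) ** (1.0 / (1.0 - alpha)) / beta


def k_max(s, gamma, beta, delta, alpha):
    """Maximizer k_M of B on [0, k**], obtained from B'(k) = 0."""
    return (s * gamma * alpha / delta) ** (1.0 / (1.0 - alpha)) / beta


# Hypothetical parameter values, chosen only for illustration.
params = dict(s=0.2, gamma=1.0, beta=0.15, delta=0.05, alpha=1.0 / 3.0)
k_ss = k_double_star(**params)   # k**
k_m = k_max(**params)            # k_M
```

With these values, `B(k_ss, **params)` should vanish up to rounding, and `B` evaluated at `k_m` should exceed its value at nearby points, consistent with (18).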

Theorem 2. If the initial function, $k_0(t) \equiv \Phi(t)$ for $t \in [-\tau, 0]$, belongs to $C^+$, then the solution $k(t)$ of the delay differential equation (16) exists and is unique for $t \in [0, +\infty)$. The unique equilibrium point (steady state) $\kappa$ is obtained from (16) as

$$\kappa = \left(\frac{s\gamma\beta^{\alpha}}{n + \delta\beta}\right)^{\frac{1}{1-\alpha}}. \tag{19}$$

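i) If $\delta$ satisfies

$$\delta < \frac{1}{\beta}\left(\frac{1}{\alpha}\right)^{\frac{\alpha}{1-\alpha}}\left(\frac{n}{1-\alpha}\right), \tag{20}$$

then there exists a $T > 0$ such that the solution $k(t)$ to (16) for $t \ge T$ is bounded:

$$k(t) < k^{**} = \left(\frac{s\gamma}{\delta}\right)^{\frac{1}{1-\alpha}}\frac{1}{\beta}. \tag{21}$$

ii) If $\delta$ satisfies [...], then

$$|k - \kappa| > \left|\frac{1}{n}B(k) - \kappa\right|; \qquad k \in [k_M, k^{**}], \quad k \ne \kappa. \tag{23}$$

iii) If $\delta$ satisfies

$$\delta \le \frac{n}{\beta}\,\frac{\alpha}{1-\alpha}, \tag{24}$$

then the steady state, (19), is globally asymptotically stable.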

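Proof: As to existence and uniqueness of the solutions $k(t)$ to (16), see Theorem 1.

Part i): We first show that, if (20) holds, then

$$\kappa \le \frac{1}{n}B(k_M) < k^{**}. \tag{25}$$

From (18), it is obvious that $\kappa \le \frac{1}{n}B(k_M)$. In order to get $\frac{1}{n}B(k_M) < k^{**}$, i.e.,

$$\frac{1}{n}\left(s\gamma\beta^{\alpha}k_M^{\alpha} - \delta\beta k_M\right) < \left(\frac{s\gamma}{\delta}\right)^{\frac{1}{1-\alpha}}\frac{1}{\beta},$$

we need (20).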

If for all large $t$, i.e., $t \ge T$, where $T$ is a sufficiently large number, $k(t) \ge \kappa$, then $\dot k(t) \le 0$, thus $\lim_{t\to+\infty} k(t) = \kappa$, and the conclusion is true. If $k(t)$ oscillates around the equilibrium $\kappa$, then there must be a $t' > 2\tau$ such that $k(t') > \kappa$ and $\dot k(t') = 0$. We have

$$k(t') = \frac{1}{n}B(k(t' - \tau)) \le \frac{1}{n}B(k_M) < k^{**}.$$

Therefore, there exists a $T > 0$ such that $k(t) < k^{**}$ for $t \ge T$.

Part ii): Denote

$$u = \limsup_{t\to+\infty} |k(t) - \kappa|.$$

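From (21), $u < +\infty$. We assume $u > 0$. First we show that, if (21*) holds, then $k_M < \kappa$. Indeed, comparing (18) and (19),

$$k_M = \left(\frac{s\gamma\alpha}{\delta}\right)^{\frac{1}{1-\alpha}}\frac{1}{\beta} < \left(\frac{s\gamma\beta^{\alpha}}{n+\delta\beta}\right)^{\frac{1}{1-\alpha}} = \kappa \quad\Longleftrightarrow\quad \delta > \frac{\alpha}{\beta}\,\frac{n}{1-\alpha}.$$

Thus, for $k \in (0, \tilde k]$, where $\tilde k$ satisfies $0 < \tilde k < k_M$ and $B(\kappa) = B(\tilde k)$, we have

$$|k - \kappa| > \left|\frac{1}{n}B(k) - \kappa\right|.$$

When $k \in (\tilde k, k_M]$, we have from (23),

$$\kappa - k > \kappa - k' > \frac{1}{n}B(k') - \kappa \ge 0,$$

where $k_M < k' < \kappa$, and $B(k) = B(k')$. To summarize, we obtain

$$|k - \kappa| > \left|\frac{1}{n}B(k) - \kappa\right|, \qquad 0 < k < k^{**}, \quad k \ne \kappa. \tag{26}$$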

If the solution $k(t)$ of system (16) is monotone, then we have $u = 0$. If $k(t)$ oscillates around the equilibrium state $\kappa$, then by the continuity of $B(k)$ and (26), there exists an $\varepsilon > 0$ such that for $\xi \in [-\varepsilon, \varepsilon]$,

$$\left|\frac{1}{n}B(\kappa + u + \xi) - \kappa\right| < u - \varepsilon, \tag{27}$$

and there exists a $T > 0$ such that for $t \ge T$,

$$|k(t) - \kappa| < u + \varepsilon. \tag{28}$$

Let $t_1 > T + \tau$ be such that $|k(t_1) - \kappa| > u - \varepsilon$, $k(t_1) > \kappa$, and $\dot k(t_1) \ge 0$. Thus, $B(k(t_1 - \tau)) \ge nk(t_1)$, i.e.,

$$\frac{1}{n}B(k(t_1 - \tau)) - \kappa > u - \varepsilon. \tag{29}$$

From (28), $|k(t_1 - \tau) - \kappa| < u + \varepsilon$; then by (27), we have

$$\frac{1}{n}B(k(t_1 - \tau)) - \kappa < u - \varepsilon.$$

This contradicts (29), hence $u = 0$. So (23) gives the range of variation of the solutions around the equilibrium $\kappa$ of the delay system (16).

Part iii): If the solution $k(t)$ of system (14) is monotone, then similarly, we have $u = 0$. In the following, we assume that $k(t)$ is not monotone.

When $k_M = \kappa$, we can prove that for $t_0 > 0$, if $k(t_0) \le \kappa$, then $k(t) \le \kappa$ for any $t \ge t_0$. Otherwise, if there exists a $t_1 > \tau$ such that

$$k(t_1) > \kappa, \qquad \dot k(t_1) \ge 0,$$

and $k(t) < k(t_1)$ for $t \in [t_0, t_1)$, then we have

$$B(\kappa) \ge B(k(t_1 - \tau)) \ge nk(t_1) > n\kappa,$$

which is a contradiction. If $k(t_0) > \kappa$, and $k(t) > \kappa$ for all $t \ge t_0$, then $\dot k(t) < 0$. Thus $k(t)$ is monotone, and so we have $\lim_{t\to+\infty} k(t) = \kappa$. In the following, without loss of generality, we assume that $k(t) \le \kappa$ for $t \ge t_0$. Obviously, for $k \in (0, k_M]$, we have

$$|k - \kappa| > \left|\frac{1}{n}B(k) - \kappa\right|.$$

As in the proof of ii), we obtain $\lim_{t\to+\infty} k(t) = \kappa$.

When $\delta < \frac{n}{\beta}\frac{\alpha}{1-\alpha}$, i.e., $k_M > \kappa$, it is easily shown that for $k \in (0, k_M]$,

$$|k - \kappa| > \left|\frac{1}{n}B(k) - \kappa\right|.$$

As in the proof of ii), we obtain $\lim_{t\to+\infty} k(t) = \kappa$, which proves the global attractivity of $\kappa$.

Next, consider the linearized equation of system (16) around $\kappa$:

$$\dot k(t) = [n\alpha - \delta\beta(1-\alpha)]\,k(t-\tau) - nk(t). \tag{30}$$

Its characteristic equation is:

$$\lambda = [n\alpha - \delta\beta(1-\alpha)]\,e^{-\lambda\tau} - n. \tag{31}$$

Since $n > n\alpha - \delta\beta(1-\alpha)$, and by condition (24), we have

$$\delta \le \frac{n}{\beta}\,\frac{\alpha}{1-\alpha} < \frac{n}{\beta}\,\frac{\alpha+1}{1-\alpha},$$

i.e., $n > |n\alpha - \delta\beta(1-\alpha)|$. Thus, all roots of equation (31) have negative real parts, so the equilibrium point $\kappa$ is locally stable. The latter, together with the global attractivity of $\kappa$, establishes that $\kappa$, (19), of the delay model (16) is globally asymptotically stable. □

Example. We consider some actual values of the parameters in condition (24). The parameter $n$ ("natural rate of proliferation") has the range $0.005 \le n \le 0.3$, cf. Jensen and Wong (1997). If we choose $n = 0.02$, capital depreciation rate $\delta = 0.05$, $\beta = 0.15$, and $\alpha = \frac{1}{3}$, then

$$\delta = 0.05 < \frac{n}{\beta}\,\frac{\alpha}{1-\alpha} = \frac{0.02}{0.15}\cdot\frac{1}{2} = \frac{1}{15}.$$

Condition (24) is met by these parameter values, and $\kappa$ is globally asymptotically stable.

7.4

Persistent oscillation in a growth model with delays

In this section, we consider the dynamics of the model (16) when it is not asymptotically stable. We shall prove that in some circumstances, the solutions $k(t)$ will oscillate around the equilibrium (steady state) $\kappa$. First, we give the definition of oscillation around $\kappa$:

Definition 1. Let $k(t)$ be a solution of the delay system (16) with the initial function $\Phi \in C^+$. If there exists a sequence $\{t_i\}$, $t_i \to \infty$ as $i \to \infty$, such that $k(t_i) = \kappa$, we call $k(t)$ an oscillatory solution around the equilibrium point.

We give the main result:

Theorem 3. Let $k(t)$ be a solution of the delay system (16), (17). If the depreciation rate belongs to the interval

$$\frac{n}{\beta}\,\frac{\alpha}{1-\alpha} < \delta < \frac{1}{\beta}\left(\frac{1}{\alpha}\right)^{\frac{\alpha}{1-\alpha}}\left(\frac{n}{1-\alpha}\right), \tag{32}$$

then

$$0 < k(t) \le \frac{1}{n}B(k_M) < k^{**} = \left(\frac{s\gamma}{\delta}\right)^{\frac{1}{1-\alpha}}\frac{1}{\beta}, \qquad t \ge 0.$$

a) Under condition (32), if $0 < k(0) < \kappa$, then there exist $\eta > 0$, $T > 0$ such that $k(t) \ge \eta$ for $t \ge T$.

b) Under condition (32), if two additional conditions hold:

$$\text{i)}\quad nk_M < B\!\left(\frac{1}{n}B(k_M)\right), \tag{33}$$

$$\text{ii)}\quad D\tau \ge 1; \qquad D = -B'(k_\Delta), \quad k_\Delta = \frac{1}{n}B\!\left(\frac{1}{n}B(k_M)\right), \tag{34}$$

then the solutions of the delay model (16) oscillate around the equilibrium point $\kappa$.
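The two thresholds on $\delta$ that separate the regimes of Theorems 2 and 3 are easy to evaluate. The sketch below is illustrative only: the function names and regime labels are assumptions, and the numerical values are those of the Example ($n = 0.02$, $\beta = 0.15$, $\alpha = 1/3$). Note that lying inside the interval (32) is only a necessary part of Theorem 3 b); conditions (33) and (34) must be checked separately.

```python
def delta_thresholds(n, beta, alpha):
    """Return (lower, upper): the bound in (24) and the bound in (20)/(32)."""
    lower = (n / beta) * alpha / (1.0 - alpha)
    upper = (1.0 / beta) * (1.0 / alpha) ** (alpha / (1.0 - alpha)) * n / (1.0 - alpha)
    return lower, upper


def classify_delta(delta, n, beta, alpha):
    """Label the regime of the depreciation rate delta (labels are illustrative)."""
    lower, upper = delta_thresholds(n, beta, alpha)
    if delta <= lower:
        return "condition (24): globally asymptotically stable"
    if delta < upper:
        return "interval (32): bounded, oscillation possible"
    return "outside (20): boundedness not covered"


lower, upper = delta_thresholds(n=0.02, beta=0.15, alpha=1.0 / 3.0)
regime = classify_delta(0.05, n=0.02, beta=0.15, alpha=1.0 / 3.0)
```

For the Example's parameters, `lower` equals $1/15$ up to rounding, so $\delta = 0.05$ falls under condition (24).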

Proof: It is easy to verify that if condition (32) holds, then there exists a $T > 0$ such that $k(t) \le \frac{1}{n}B(k_M)$ for $t \ge T$. Without loss of generality, we assume that the delay model (16) has the initial function $\Phi \in C^+$, and

$$\|\Phi\| \le \frac{1}{n}B(k_M).$$

In the following, we also prove that $k(t) \le \frac{1}{n}B(k_M)$ for all $t \ge 0$. Otherwise, there exists a $t_1 > 0$ such that

$$\dot k(t_1) \ge 0, \qquad k(t_1) > \frac{1}{n}B(k_M),$$

and for $t \in [0, t_1)$,

$$k(t) \le \frac{1}{n}B(k_M).$$

Since $\dot k(t_1) \ge 0$, then

$$B(k(t_1 - \tau)) \ge nk(t_1) > B(k_M),$$

which is a contradiction, cf. $B(k_M)$, (18).

Suppose that there exists a $t_2 > 0$ such that $k(t) > 0$ for $t \in [0, t_2)$, and $k(t_2) = 0$. Then we have $\dot k(t_2) \le 0$, i.e.,

$$B(k(t_2 - \tau)) \le nk(t_2) = 0.$$

If $B(k(t_2 - \tau)) = 0$, then $k(t_2 - \tau) = 0$, which contradicts the definition of $t_2$. If $B(k(t_2 - \tau)) < 0$, then $k(t_2 - \tau) > k^{**}$, a contradiction. Hence,

$$0 < k(t) \le \frac{1}{n}B(k_M).$$

Condition (32) gives the boundedness of the solutions of the delay model (16).

a) If the statement is not true, then there must be a $t_1 > \tau$ such that

$$k(t_1) = \min\{k(t) : t \in [0, t_1]\}, \qquad B(k(t_1)) < B\!\left(\frac{1}{n}B(k_M)\right),$$

and $\dot k(t_1) \le 0$. We choose $0 < k_1 < \kappa$ with $k_1 < k(t_1) < \frac{1}{n}B(k_1)$. Since $0 < k(0) < \kappa$, then $0 < k(t_1) < \kappa$, and $\dot k(t_1) \le 0$ leads to

$$B(k(t_1 - \tau)) \le nk(t_1) < B(k_1).$$

Because $B(k)$ is an increasing function for $k \in [0, k_M]$, and is a decreasing function for $k \in [\kappa, \infty)$, then either $k(t_1 - \tau) < k(t_1)$, or

$$k(t_1 - \tau) > \frac{1}{n}B(k_M) = \eta > 0.$$

But $k(t_1 - \tau) < k(t_1)$ contradicts the definition of $t_1$. The statement holds.

b) We first assert that, if for $t \in [-\tau, 0]$,

$$k(t) \in \left[k_M, \frac{1}{n}B(k_M)\right],$$

then for all $t \ge 0$:

$$k(t) \in \left[k_M, \frac{1}{n}B(k_M)\right]. \tag{35}$$

From (33), $\kappa > k_M$, and there exists a $k_1$ with $0 < k_1 < k_M$ such that $B(\frac{1}{n}B(k)) > nk$ for $k \in [k_1, k_M]$. If (35) does not hold, then by a), there exists a $t_1 > 0$ with $k_1 < k(t_1) < k_M$, such that for $t \in [0, t_1)$,

$$k(t_1) < k(t) \le \frac{1}{n}B(k_M),$$

and $\dot k(t_1) < 0$. Thus,

$$B(k(t_1 - \tau)) < nk(t_1). \tag{36}$$

Since $B(k) \ge nk$ for $k \in [0, k_M]$, then $k(t_1 - \tau) > \kappa$. By the monotonicity of $B(k)$, we have

$$B(k(t_1 - \tau)) \ge B\!\left(\frac{1}{n}B(k_M)\right) > nk_M > nk(t_1),$$

which contradicts (36).

Next, we prove the oscillation of the solutions. Without loss of generality, we assume that $k(0) > \kappa$. First, we show that there exists an $e_1 > 0$ such that

$$e_1 = \inf\{t : t > 0, \; k(t) = \kappa\}.$$

From (34) with $D = -B'(k_\Delta)$, we can choose a $\Delta > 0$ such that

$$\Delta < \min\left\{\frac{1}{n}B(k_M) - \kappa, \; \kappa - k_M\right\}.$$

Then for $|k - \kappa| \le \Delta$, we have $|B(k) - B(\kappa)| \ge D|k - \kappa|$. Obviously, if $k(t) \ge \kappa$ for $t \in [0, t_0]$, then $\dot k(t) \le 0$ for $t \in [0, t_0]$. Let

$$t_1 = \inf\{t : t > 0, \; k(t) \le \kappa + \Delta\}.$$

If $k(0) \le \kappa + \Delta$, then $t_1 = 0$. Now assume $k(0) > \kappa + \Delta$. For $0 \le t \le t_1$, we have

$$\dot k(t) = [B(k(t - \tau)) - B(\kappa)] - n[k(t) - \kappa].$$

Therefore,

$$\dot k(t) \le -n[k(t) - \kappa] \le -n\Delta.$$

Thus,

$$t_1 \le -\frac{1}{n\Delta}[k(t_1) - k(0)] \le \frac{1}{n\Delta}\left[\frac{1}{n}B(k_M) - \kappa - \Delta\right].$$

If $0 < k(t) - \kappa \le \Delta$ for $t \in [t_1, t_1 + \tau]$, then for $t \in [t_1, t_1 + \tau]$,

$$\dot k(t) \le B(k(t - \tau)) - B(\kappa) \le D(\kappa - k(t - \tau)) \le -D\Delta.$$

This leads to

$$k(t_1 + \tau) \le k(t_1) - D\Delta\tau = \kappa + \Delta(1 - D\tau) \le \kappa,$$

which is a contradiction; so $t_1 + \tau \ge e_1$. Moreover, it is easy to verify that $\dot k(e_1) < 0$; otherwise, if $\dot k(e_1) = 0$, then $k(e_1 - \tau) < k_M$, which is a contradiction.

Next, we prove that there exists an $e_2$ such that

$$e_2 = \inf\{t : t > e_1, \; k(t) = \kappa\}.$$

Denote $e^* = \min\{e_2, e_1 + \tau\}$. Note that if no such $e_2$ exists, we let $e_2 = \infty$. Therefore, for $t \in (e_1, e^*)$, we have $k(t) < \kappa$. Since

$$\dot k(t) = [B(k(t - \tau)) - B(\kappa)] - n[k(t) - \kappa],$$

then

$$\frac{d}{dt}\left[(k(t) - \kappa)e^{nt}\right] = e^{nt}[B(k(t - \tau)) - B(\kappa)] < 0, \tag{37}$$

i.e., $(k(t) - \kappa)e^{nt}$ is monotonically decreasing on $(e_1, e^*)$; so $e^* = e_1 + \tau$. Thus $k_M \le k(t) < \kappa$ for $t \in (e_1, e_1 + \tau)$. For $t \in (e_1 + \tau, e_2)$, we have $\dot k(t) \ge 0$.

Now, we prove that $e_2$ is finite. Let

$$t_2 = \inf\{t : t \ge e_1 + \tau, \; k(t) \ge \kappa - \Delta\}.$$

If $k(e_1 + \tau) \ge \kappa - \Delta$, then $t_2 = e_1 + \tau$. Now assume that $k(e_1 + \tau) < \kappa - \Delta$. For $t \in [e_1 + \tau, t_2]$, we have $\dot k(t) \ge n\Delta$. Therefore,

$$k(t_2) - k(e_1 + \tau) \ge n\Delta(t_2 - e_1 - \tau),$$

i.e.,

$$t_2 \le e_1 + \tau + \frac{1}{n\Delta}(\kappa - k_M - \Delta).$$

If we assume that $e_2 > t_2 + \tau$, then for $t \in [t_2, t_2 + \tau]$,

$$\dot k(t) \ge B(k(t - \tau)) - B(\kappa) \ge D(\kappa - k(t - \tau)) > D\Delta.$$

Integrating both sides gives

$$k(t_2 + \tau) > k(t_2) + D\Delta\tau = \kappa + \Delta(D\tau - 1) > \kappa,$$

which is a contradiction. So instead, we conclude that $e_2 \le t_2 + \tau$, i.e.,

$$\tau < e_2 \le t_2 + \tau \le e_1 + 2\tau + \frac{1}{n\Delta}(\kappa - k_M - \Delta) \le t_1 + 3\tau + \frac{1}{n\Delta}(\kappa - k_M - \Delta) \le 3\tau + \frac{1}{n\Delta}\left[\frac{1}{n}B(k_M) - k_M - 2\Delta\right].$$

Similarly, it can be shown that for $t \in (e_2, e_2 + \tau)$,

$$\frac{d}{dt}\left[(k(t) - \kappa)e^{nt}\right] > 0. \tag{38}$$

By repeating the analysis from (37) above, (38) can be proved. □

Theorem 3 demonstrates that if delays in economic growth models are large enough, such delays can generate persistent oscillatory solutions in continuous-time dynamics.

7.5

Final comments

In complex socio-economic systems, from information collection and decision making to investment implementation, long time delays can be involved. Obviously, delay phenomena will influence the dynamic characteristics of economic systems. But in many delay situations, their implications are not sufficiently recognized or effectively studied, and so many actual projects do not achieve good results. The effect of delays on the stability of systems is a difficult topic to handle. We have given conditions for the global asymptotic stability of an economic growth model with time delay, and analyzed the oscillation around the steady state. The main contribution to economic growth theory is that we investigate a kind of nonlinear dynamic phenomenon, namely fluctuations. We may emphasize that oscillations are not rare, but common, in growing economies.

Acknowledgements: This research was supported in part by NNSF, and the Fund for "Study on the Evolution of Complex Economic System" at the "Innovation Center of Economic Transition and Development of Nanjing University" of the Ministry of Education, China. We would like to thank Professor Bjarne S. Jensen, Copenhagen Business School, for many valuable discussions and suggestions.

References:

Boucekkine R., Licandro O., and Paul C. (1997) "Differential-Difference Equations in Economics: On the Numerical Solution of Vintage Capital Growth Models." Journal of Economic Dynamics and Control 21: 347-362.

Chukwu E.N. (1996) "Universal Laws for the Control of Global Economic Growth with Nonlinear Hereditary Dynamics." Applied Mathematics and Computation 78: 19-81.

Chukwu E.N. (1998) "On the Controllability of Nonlinear Economic Systems with Delay: The Italian Example." Applied Mathematics and Computation 95: 245-274.

Gandolfo G. (1997) Economic Dynamics: Methods and Models. 3rd ed. North-Holland, Amsterdam.

Gopalsamy K. (1992) Stability and Oscillations in Delay Differential Equations of Population Dynamics. Kluwer Academic Publishers, Boston.

Hale J.K. (1993) Theory of Functional Differential Equation. Springer-Verlag, New York.

Invernizzi S., and Medio A. (1991) "On Lags and Chaos in Economic Dynamic Models." Journal of Mathematical Economics 20: 521-550.

Jarsulic M. (ed.) (1993) Non-Linear Dynamics in Economic Theory. Edward Elgar Publi. Com.

Jensen B.S. (1994) The Dynamic Systems of Basic Economic Growth Models. Dordrecht: Kluwer Academic Publishers.

Jensen B.S., and Wong K.Y. (1997) Dynamics, Economic Growth, and International Trade. Ann Arbor, University of Michigan Press.

Jensen B.S., and Larsen M.E. (1987) "Growth and Long-Run Stability." Acta Applicandae Mathematicae 9: 219-237.

Smale S. (1980) The Mathematics of Time: Essays on Dynamical Systems, Economic Processes. Berlin, Springer-Verlag.

Zhang W.B. (1990a) Economic Dynamics, Growth and Development. Lecture Notes in Economics and Mathematical Systems, Vol. 350. Springer-Verlag.

Zhang W.B. (1990b) Synergetic Economics: Dynamics, Nonlinear, Instability, Non-equilibrium, Fluctuations and Chaos. Springer-Verlag.

Chapter 8 Hopf Bifurcation in Growth Models with Time Delays

Morten Brøns Department of Mathematics, Technical University of Denmark Bjarne S. Jensen University of Southern Denmark and Copenhagen Business School

8.1

Introduction

Time delays (time lags) in production/capital utilization and accumulation dynamics were introduced into basic aggregate growth models by Zhu and Huang (2007). Using a CD production function (Inada conditions), they performed a global analysis of the delay differential ("mixed difference-differential") equation. They showed that for sufficiently small delays, the steady-state solution (equilibrium) is globally stable, but that the steady-state solution of the capital-labor ratio loses this global stability property when the delay (time lag) is above a certain critical value (length). Furthermore, they showed that for certain values of the time delay, all solutions oscillate persistently (but are not necessarily strictly periodic).

The purpose of the present paper is, with CD and for particular CES technologies, to complement this global dynamic analysis by performing a nonlinear, local analysis of the dynamics of the delay (time lag) model when the size of the time delay (lag) is close to the critical value. We show that for this critical delay value, a Hopf bifurcation (of a fixed point into a closed orbit in a neighborhood of the equilibrium) occurs, i.e., periodic solutions ("limit cycles") are created when the steady-state solution (equilibrium) of the capital-labor ratio loses its local stability. We also derive an analytical expression which determines the stability type (supercritical or subcritical) of the periodic solutions ("limit cycles"). Finally, we show that the delay model with CES can exhibit dynamics with solutions that have previously been observed in other (electrodynamic, engineering) delay differential equations, namely square waves and chaos (aperiodic waves/cycles).

Economic models of business cycles [as recurrent fluctuations (upswing/downswing) in economic activity] have a long tradition as a specialized branch of economic theory, with many hypotheses or paradigms exhibited. However, regarding their dynamic properties, it is worth emphasizing, as do Gabisch and Lorenz (1987, p. 3):

Whether or not a dynamic model of an economy is a business cycle model, i.e. a model which allows for fluctuations of major economic variables for a considerable amount time, does not depend on the general motivated and paradigmatic features of a model, but rather on its mathematical structure. While, e.g. the introduction of a certain lag-structure in the production function can certainly not be a distinctive mark from an economic point of view, this structure may be the essential dynamic structure which allows for oscillatory motions of an economy. Therefore, this text will concentrate on those features of dynamic economic models which constitute the essential fluctuation-generating forces.

It is the nature and consequences of these delays (time lags) that our theorem and solutions will demonstrate. The relevance and proper interpretation of all economic time lags (duration of delays) will depend on the time units (period) of measurement (year, quarter, month, day) for the economic variables. The delays/lags are not necessarily equal to integer multiples of any time unit. Treating economic variables as continuous-time processes, the actually occurring delays in such dynamic models can be studied for any length (real number). Often an analytical dilemma arises in economic dynamics, as noted by Goodwin (1990, p. 1):

There are two broad types of dynamical equation systems: continuous time and discrete time; both have frequently been used in economics. The latter arise because there are significant time-lags in an economy. The trouble is that these occur in the context of economic activity which is substantially continuous, so that one should formulate mixed difference-differential systems, a procedure the complications of which place it beyond the scope of this book.

We examine the complications and show the advantages of a rigorous approach to delay issues, using a mixed dynamic system in continuous time.

8.2

Dynamics of growth and cycles

Let us briefly review the derivation of the dynamics for the capital-labor ratio in growth models with time delays. The starting point is the neoclassical economic growth model,

$$\dot L = nL, \qquad \dot K = sY(t) = sF(L, K) = sLF(1, k) \equiv Ls\gamma f(k); \qquad k(t) = K(t)/L(t), \tag{1}$$

where $K$ is capital, $L$ is labour, and $n$ and $s$ are standard parameters ($n > 0$, $0 < s < 1$). The production function $F$ is homogeneous of degree one; the TFP ("total factor productivity") parameter ($\gamma$) of $F$ is here explicitly specified together with the saving (investment) parameter ($s$), as this is a practical procedure in the numerical simulations below, cf. (7). The growth model is then modified to include a delay (time lag) $\tau$ in the productive operation (utilization) of installed (acquired) capital (machinery),

$$\dot L = nL, \qquad \dot K = sY(t) = sF(L, K_\tau); \qquad K_\tau = K(t - \tau), \tag{2}$$

with the standard notation $K_\tau = K(t - \tau)$. From (2), we obtain the time derivative of $k(t) = K(t)/L(t)$ as

$$\begin{aligned}
\dot k &= \frac{L\dot K - K\dot L}{L^2} = \frac{sF(L, K_\tau)}{L} - n\frac{K}{L} = sF\!\left(1, \frac{K_\tau}{L}\right) - nk \\
&= s\gamma f\!\left(\frac{K_\tau}{L_\tau}\frac{L_\tau}{L}\right) - nk = s\gamma f\!\left(k_\tau\frac{L_0 e^{n(t-\tau)}}{L_0 e^{nt}}\right) - nk \\
&= s\gamma f(e^{-n\tau}k_\tau) - nk; \qquad k_\tau \equiv K_\tau/L_\tau = K(t-\tau)/L(t-\tau),
\end{aligned} \tag{3}$$

i.e.,

$$\dot k = s\gamma f(e^{-n\tau}k_\tau) - nk. \tag{4}$$

The delay model (4) may be combined with depreciation of the productively operating capital stock (same time lag, delay, $\tau$) to obtain the basic delay model, cf. Zhu and Huang (2006),

$$\dot k = s\gamma f(e^{-n\tau}k_\tau) - \delta_\tau e^{-n\tau}k_\tau - nk. \tag{5}$$

For mathematical analysis, it is convenient to introduce a new variable

$$q = e^{-n\tau}k = K_\tau/L; \qquad q_\tau = q(t - \tau) = e^{-n\tau}k_\tau = e^{-n\tau}(K_\tau/L_\tau), \tag{6}$$

which implies that the standard economic growth model with delay (5) becomes

$$\dot q = e^{-n\tau}\dot k = \beta f(q_\tau) - \delta q_\tau - nq \equiv h(q, q_\tau); \qquad \beta = s\gamma e^{-n\tau}; \quad \delta = \delta_\tau e^{-n\tau}. \tag{7}$$

The family of solutions $q(t)$ to (7) will generally display a more oscillatory time path instead of the monotonicity usually seen with $\tau = 0$.

8.3

Hopf bifurcation analysis

We will first assume that there exists an equilibrium (steady-state) solution, $q^*$, to (7):

$$\forall t: \quad q(t) = q^* = q^*_\tau = q^*(t - \tau) = e^{-n\tau}k^* = e^{-n\tau}k^*_\tau = e^{-n\tau}k^*(t - \tau), \tag{8}$$

satisfying

$$\dot q = h(q^*, q^*) = \beta f(q^*) - (\delta + n)q^* = 0. \tag{9}$$

It will be convenient to introduce the variable $z$ (deviation of $q$ from the equilibrium),

$$z = q - q^*; \qquad z_\tau = z(t - \tau) = q_\tau - q^* = q(t - \tau) - q^*; \qquad z = 0 \Leftrightarrow z_\tau = 0. \tag{10}$$

We next assume that $f$, (7), is analytic (can be expanded as a Taylor series), and as our dynamic analysis will be local, we make a Taylor expansion of $h$, (7), up to cubic (third) order at the equilibrium ($z = 0$), to explicitly obtain the parametrized dynamics:

$$\dot z = h(q^* + z, q^* + z_\tau) = A_0 z + A_1 z_\tau + \frac{A_2}{2}z_\tau^2 + \frac{A_3}{6}z_\tau^3 + O(z_\tau^4), \tag{11}$$

where the Taylor coefficients are

$$A_0 = \frac{\partial h}{\partial q}(q^*, q^*) = -n, \quad A_1 = \frac{\partial h}{\partial q_\tau}(q^*, q^*) = \beta f'(q^*) - \delta, \quad A_2 = \frac{\partial^2 h}{\partial q_\tau^2}(q^*, q^*) = \beta f''(q^*), \quad A_3 = \frac{\partial^3 h}{\partial q_\tau^3}(q^*, q^*) = \beta f'''(q^*). \tag{12}$$

The basic properties and results are summarized in:

Theorem 1. Assume that $f$, (7), is analytic, and assume the existence of a steady-state (equilibrium) solution $q^*$, (9), i.e.,

$$h(q^*, q^*) = \beta f(q^*) - (\delta + n)q^* = 0. \tag{13}$$

If the ratio (R) of the ﬁrst-order Taylor coeﬃcients satisﬁes the condition, −n τ0 . Corresponding to τ0 , (15), and the periodic solution of the linearization of (11), its angular velocity (angular frequency), ω0 , and period, T0 , are given by: 1

ω0 = [A21 − A20 ] 2 ;

T0 = 2π/ω0 = 1/ν0

(ν0 : frequency)

(16)

Furthermore, a family of small-amplitude periodic solutions bifurcates from the steady state $q^*$, (13). The periodic solutions (limit cycles) exist for delays $\tau$ in an interval either to the left or to the right of $\tau_0$. The important qualitative properties of the periodic solutions (limit cycles) are determined by the sign and the numerical value of a number (cubic expansion parameter), $\tau_2$, given by:

$$\begin{aligned}
\tau_2 &= \frac{1}{8\omega_0^2}\left\{\frac{2 - A_1\tau_0[2R^2 + 6R - 11] + R[4R(1+R) - 13]}{A_1(1+R)(5-4R)}A_2^2 - (A_1\tau_0 - R)A_3\right\} \\
&= \frac{2 - A_1\tau_0[2(A_0/A_1)^2 + 6(A_0/A_1) - 11]}{8A_1^3[1 - (A_0/A_1)^2][1 + (A_0/A_1)][5 - 4(A_0/A_1)]}A_2^2 \\
&\quad + \frac{(A_0/A_1)[4(A_0/A_1)(1 + (A_0/A_1)) - 13]}{8A_1^3[1 - (A_0/A_1)^2][1 + (A_0/A_1)][5 - 4(A_0/A_1)]}A_2^2 \\
&\quad - \frac{1}{8}\,\frac{A_1\tau_0 - (A_0/A_1)}{A_1^2[1 - (A_0/A_1)^2]}A_3.
\end{aligned} \tag{17}$$

If $\tau_2 > 0$, the limit cycles existing for $\tau > \tau_0$ (supercritical) are stable/attractive; if $\tau_2 < 0$, the limit cycles existing for $\tau < \tau_0$ (subcritical) are unstable/repulsive. The period $T_2$ of the limit cycles is given by

$$T_2 = \frac{2\pi}{\omega_0}\left(1 - \frac{\omega_2}{\omega_0\tau_2}[\tau - \tau_0]\right) + O([\tau - \tau_0]^2), \tag{18}$$

where

$$\omega_2 = \frac{1}{8\omega_0}\left[\frac{2R^2 + 6R - 11}{(1+R)(5-4R)}A_2^2 + A_1A_3\right]. \tag{19}$$

The absolute/numerical value of $\tau_2$ tells how fast the amplitude of the limit cycles grows with the size of the delay deviation $\tau - \tau_0$; the amplitude $\varepsilon$ of the limit cycle is given by

$$\varepsilon = \sqrt{\frac{\tau - \tau_0}{\tau_2}} + O(\tau - \tau_0). \tag{20}$$

The smaller $|\tau_2|$, the quicker the amplitude grows, according to (20). In the atypical case $\tau_2 = 0$, the computations must be continued to a higher order than three in (11).

Proof: The stability of the equilibrium $z = 0$ of (11) can be determined by the linearization

$$\dot z = A_0 z + A_1 z_\tau. \tag{21}$$

Solutions of this linear delay differential equation (21) have the complex form

$$z(t) = Ce^{(\alpha + i\omega)t} = Ce^{\alpha t}(\cos\omega t + i\sin\omega t). \tag{22}$$

If all solutions have $\alpha < 0$, then the equilibrium $z = 0$ is stable, while it is unstable if there are solutions with $\alpha > 0$. Hence, a change of stability occurs at a value $\tau_0$ of the delay when there are solutions $z = Ce^{i\omega_0 t}$. Inserting such a solution, and taking real and imaginary parts, yields

$$0 = A_0 + A_1\cos\omega_0\tau_0, \qquad \omega_0 = -A_1\sin\omega_0\tau_0, \tag{23}$$

which gives

$$\cos\omega_0\tau_0 = -A_0/A_1, \qquad \sin\omega_0\tau_0 = -\frac{\omega_0}{A_1}. \tag{24}$$

Then squaring and adding yields a critical angular velocity (angular frequency), $\omega_0$, as

$$\omega_0 = \sqrt{A_1^2 - A_0^2} = |A_1|\,[1 - (A_0/A_1)^2]^{\frac{1}{2}}. \tag{25}$$

Hence a critical delay (time lag) $\tau_0$ exists, cf. (24), (25),

$$\tau_0 = \frac{1}{\omega_0}\arccos(-A_0/A_1) = \frac{\arccos(-A_0/A_1)}{|A_1|\,[1 - (A_0/A_1)^2]^{\frac{1}{2}}}, \tag{26}$$

if

$$|A_0/A_1| < 1. \tag{27}$$

Inserting the Taylor coefficient expressions (12) in (27) and (26) establishes (14) and (15). At $\tau = \tau_0$, where the local dynamics changes from exponential decay towards the equilibrium to the existence of exponentially growing solutions, the linearized delay differential equation (21) has periodic solutions with the period

$$T_0 = \frac{2\pi}{\omega_0} = 1/\nu_0 \quad (\nu_0: \text{frequency}). \tag{28}$$
This indicates the existence of periodic solutions (limit cycles) for the full nonlinear delay differential equation (11) for delays $\tau$ close to $\tau_0$. In fact, the Hopf bifurcation theorem states that small-amplitude periodic solutions exist (Diekmann, Gils, Lunel and Walther 1995; Guckenheimer and Holmes 1983). We proceed to look for these periodic solutions (limit cycles) following Morris (1976). Since the period cannot be expected to be exactly given by (28) for all $\tau$, the exact determination/computation of the period ($T$), or angular velocity ($\omega$), is an important part of our task. We now introduce a scaled time variable

$$\theta = \omega t, \tag{29}$$

where $\omega$ is a parameter/constant close to the critical angular velocity, $\omega_0$. This transforms (11) into

$$\omega z' = A_0 z + A_1 z_{\omega\tau} + \frac{A_2}{2}z_{\omega\tau}^2 + \frac{A_3}{6}z_{\omega\tau}^3 + O(z_{\omega\tau}^4); \qquad z' = \frac{\partial z}{\partial\theta}. \tag{30}$$

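The coefficients $A_i$, $i = 0, \ldots, 3$, are here the same as in (12). In (30) we will look for $2\pi$-periodic solutions, and determine the parameter $\omega = \omega(\tau)$ as we go along such that they have period $2\pi$. Let $\varepsilon$ be some measure of the amplitude of a periodic solution

$$z = z(\theta, \varepsilon). \tag{31}$$

We expand the periodic solution in a Taylor series in the amplitude $\varepsilon$,

$$z(\theta, \varepsilon) = z^{(1)}(\theta)\varepsilon + z^{(2)}(\theta)\varepsilon^2 + z^{(3)}(\theta)\varepsilon^3 + \ldots, \tag{32}$$

where

$$z^{(n)}(\theta) = \frac{1}{n!}\frac{\partial^n z}{\partial\varepsilon^n}(\theta, 0), \tag{33}$$

and correspondingly expand $\omega$ and the delay $\tau$ at $\omega_0$, $\tau_0$, where the oscillations are initiated:

$$\omega = \omega_0 + \omega_1\varepsilon + \omega_2\varepsilon^2 + \ldots, \tag{34}$$

$$\tau = \tau_0 + \tau_1\varepsilon + \tau_2\varepsilon^2 + \ldots. \tag{35}$$

Inserting these expansions in (30), we collect terms of the same order in $\varepsilon$, and set each of the terms in the resulting power series equal to zero. To order $\varepsilon^1$, we get [$z^{(1)\prime} = \partial^2 z(\theta, 0)/\partial\varepsilon\,\partial\theta$; $z^{(1)}_{\omega_0\tau_0} = z^{(1)}(\theta - \omega_0\tau_0)$]:

$$\omega_0 z^{(1)\prime} - A_0 z^{(1)} - A_1 z^{(1)}_{\omega_0\tau_0} = 0. \tag{36}$$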

When (ω0 , τ0 ) fulﬁl (24), the complete solution of (36) is, z (1) (θ) = a cos θ + b sin θ. 254

(37)

We can fix the measure of the amplitude by picking the initial conditions $z(0) = 0$, $z'(0) = \epsilon$, which gives

\[ z^{(1)}(0) = 0, \quad z^{(1)\prime}(0) = 1, \quad z^{(j)}(0) = 0, \quad z^{(j)\prime}(0) = 0 \ \text{ for } j \ge 2. \tag{38} \]

With this, we get

\[ z^{(1)}(\theta) = \sin\theta. \tag{39} \]

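Proceeding to order $\epsilon^2$, we get the equation

\[ \omega_0 z^{(2)\prime} - A_0 z^{(2)} - A_1 z^{(2)}_{\omega_0\tau_0} = -\omega_1 z^{(1)\prime} - A_1(\omega_0\tau_1 + \omega_1\tau_0)\, z^{(1)\prime}_{\omega_0\tau_0} + \frac{A_2}{2}\left(z^{(1)}_{\omega_0\tau_0}\right)^2. \tag{40} \]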

With the solution for $z^{(1)}$ from (39), (40) can be rewritten as

\[ \omega_0 z^{(2)\prime} - A_0 z^{(2)} - A_1 z^{(2)}_{\omega_0\tau_0} = \frac{1}{4}A_2 + \left[-\omega_1 + A_0(\omega_1\tau_0 + \omega_0\tau_1)\right]\cos\theta + \omega_0(\omega_1\tau_0 + \omega_0\tau_1)\sin\theta + \frac{1}{4}\frac{A_2}{A_1^2}\left[(\omega_0^2 - A_0^2)\cos 2\theta + 2A_0\omega_0\sin 2\theta\right]. \tag{41} \]

The homogeneous part of this inhomogeneous linear equation is identical to the equation for $z^{(1)}$. Hence the $\sin\theta$ and $\cos\theta$ terms on the right-hand side are resonant with the solutions of the homogeneous equation and give rise to non-periodic solutions. To obtain periodic solutions of period $2\pi$, we must require that the resonant terms vanish, i.e.

\[ -\omega_1 + A_0(\omega_1\tau_0 + \omega_0\tau_1) = 0, \qquad \omega_0(\omega_1\tau_0 + \omega_0\tau_1) = 0. \tag{42} \]

These are linear equations in $\tau_1$, $\omega_1$, with solution

\[ \tau_1 = 0, \qquad \omega_1 = 0. \tag{43} \]

With (43), we can find, by insertion in (41) and using the initial conditions (38), a periodic solution of (41) of the form

\[ z^{(2)}(\theta) = a_0 + a_1\cos\theta + b_1\sin\theta + a_2\cos 2\theta + b_2\sin 2\theta \tag{44} \]

The result is

\[ a_0 = \frac{A_2}{4(A_0 + A_1)}, \tag{45} \]

\[ a_1 = \frac{A_2(2A_1 - A_0)(A_1 - A_0)}{2A_1(A_1 + A_0)(5A_1 - 4A_0)}, \tag{46} \]

\[ b_1 = \frac{A_2\,\omega_0(A_1 - A_0)}{2A_1(A_1 + A_0)(5A_1 - 4A_0)}, \tag{47} \]

\[ a_2 = \frac{A_2(A_1^2 - 2A_0^2 + 2A_1A_0)}{2A_1(A_1 + A_0)(5A_1 - 4A_0)}, \tag{48} \]

\[ b_2 = \frac{A_2\,\omega_0(A_1 - A_0)}{2A_1(A_1 + A_0)(5A_1 - 4A_0)}. \tag{49} \]

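Finally, turning to order $\epsilon^3$, we obtain

\[ \omega_0 z^{(3)\prime} - A_0 z^{(3)} - A_1 z^{(3)}_{\omega_0\tau_0} = -\omega_2 z^{(1)\prime} - A_1(\omega_0\tau_2 + \omega_2\tau_0)\, z^{(1)\prime}_{\omega_0\tau_0} + A_2\, z^{(1)}_{\omega_0\tau_0} z^{(2)}_{\omega_0\tau_0} + \frac{A_3}{6}\left(z^{(1)}_{\omega_0\tau_0}\right)^3. \tag{50} \]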

From the previous calculations, the right-hand side is known. It has the form of a trigonometric polynomial,

\[ u_0 + u_1\cos\theta + v_1\sin\theta + u_2\cos 2\theta + v_2\sin 2\theta + u_3\cos 3\theta + v_3\sin 3\theta. \tag{51} \]

The expressions for the coefficients are long and complicated. We are only interested in $u_1$ and $v_1$, as they multiply the resonant terms, which must vanish to allow periodic solutions. One obtains

\[ u_1 = A_0\omega_0\tau_2 + (\tau_0 A_0 - 1)\omega_2 + \frac{-11A_1^2A_2^2 + 4A_1A_0A_2^2 - 4A_1A_0^2A_3 + A_1^2A_0A_3 + 5A_1^3A_3 + 4A_0^2A_2^2}{8A_1^2(A_1 + A_0)(5A_1 - 4A_0)}\,\omega_0, \tag{52} \]

\[ v_1 = \omega_0^2\tau_2 + \omega_0\tau_0\omega_2 + \frac{4A_1A_0^2A_2^2 + 4A_0^3A_2^2 - 4A_1A_0^3A_3 + A_1^2A_0^2A_3 + 5A_1^3A_0A_3 + 2A_1^3A_2^2 - 13A_1^2A_0A_2^2}{8A_1^2(A_1 + A_0)(5A_1 - 4A_0)}. \tag{53} \]

Solving the equations $u_1 = 0$, $v_1 = 0$, one obtains $\tau_2$, (17), and $\omega_2$, (19), after simplifications using (14), (15), (16).

Solving (35) for $\epsilon$ yields (20). The amplitude $\epsilon$ is only defined when $\tau - \tau_0$ has the same sign as $\tau_2$; hence the periodic solutions exist only on one side of $\tau_0$, as described in Theorem 1. Inserting (20) in (34) yields

\[ \omega = \omega_0 + \frac{\omega_2}{\tau_2}[\tau - \tau_0] + O([\tau - \tau_0]^2). \tag{54} \]

In the original time variable the period is $T = 2\pi/\omega$. Inserting (54) in this expression for $T$ and making a Taylor expansion in $\tau - \tau_0$ finally gives (18). We omit the proof of the final statement of the theorem, concerning stability of the periodic solutions. The result is a standard property of the Hopf bifurcation (Diekmann, Gils, Lunel, Walther 1995).

The expression (17) for $\tau_2$ is complicated, and no direct interpretation in economic terms seems to be possible. In particular, it depends on both second and third derivatives of $f$, so convexity properties alone are not sufficient to determine $\tau_2$. Hence, one must turn to concrete computations, as we do now for specific illustrative examples.

8.4 CD technologies and time delays

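We apply Theorem 1 to equation (7) with a CD production function,

\[ F(L, K) = \gamma L^{1-a}K^{a} = L\gamma f(k) \equiv L\gamma k^{a}; \qquad 0 < a < 1 \tag{55} \]

\[ f(q) = q^{a} = (e^{-n\tau}k)^{a}; \quad f'(q) = aq^{a-1}, \quad f''(q) = a(a-1)q^{a-2}, \quad f'''(q) = a(a-1)(a-2)q^{a-3} \tag{56} \]

Equation (13) has a unique steady state (equilibrium), cf. (56), (8), (7),

\[ q^* = \left(\frac{\beta}{n+\delta}\right)^{\frac{1}{1-a}} = \left(\frac{s\gamma}{n+\delta}\right)^{\frac{1}{1-a}} e^{\frac{-n\tau}{1-a}} = \kappa\, e^{\frac{-n\tau}{1-a}}; \qquad k^* = q^* e^{n\tau} = \kappa\, e^{\frac{-an\tau}{1-a}}; \qquad \kappa = q^* e^{\frac{n\tau}{1-a}} \tag{57} \]

From the simple exact CD expressions (57), we note the general inequalities

\[ q^* < k^* < \kappa \tag{58} \]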
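The steady-state expressions (57) and the ordering (58) can be checked numerically. The following is a minimal sketch, not the authors' code: the function name and the parameter values in the usage line are hypothetical, and it assumes the relation $\beta = s\gamma\, e^{-n\tau}$ that (57) implies.

```python
import math


def cd_steady_states(s_gamma, n, delta, a, tau):
    """Return (q_star, k_star, kappa) for the CD case, following (57).

    Assumption: beta = s_gamma * exp(-n * tau), as implied by (57).
    """
    beta = s_gamma * math.exp(-n * tau)
    q_star = (beta / (n + delta)) ** (1.0 / (1.0 - a))      # operating capital per worker
    k_star = q_star * math.exp(n * tau)                     # installed capital per worker
    kappa = (s_gamma / (n + delta)) ** (1.0 / (1.0 - a))    # steady state without delay
    return q_star, k_star, kappa


# Hypothetical parameter values, chosen only for illustration.
q_star, k_star, kappa = cd_steady_states(s_gamma=1.0, n=0.02, delta=0.05, a=0.3, tau=10.0)
print(q_star, k_star, kappa)
```

For any positive delay and growth rate the three values should be ordered as in (58), and they coincide when $\tau = 0$.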

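Next we find, with (12), (56), (57),

\[ A_0 = -n, \tag{59} \]

\[ A_1 = a(n+\delta) - \delta, \tag{60} \]

\[ A_2 = \beta\left(\frac{\beta}{n+\delta}\right)^{\frac{a-2}{1-a}} a(a-1), \tag{61} \]

\[ A_3 = \beta\left(\frac{\beta}{n+\delta}\right)^{\frac{a-3}{1-a}} a(a-1)(a-2). \tag{62} \]

The bifurcation condition (14) becomes

\[ |R| = |A_0/A_1| = \left|\frac{-n}{a(n+\delta) - \delta}\right| < 1 \iff \frac{n}{\delta} < \frac{1-a}{1+a} < 1 \tag{63} \]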

Hence a Hopf bifurcation to periodic solutions can occur only when the depreciation parameter $\delta$ is sufficiently large. In particular, when the delay in the depreciation of capital is omitted, corresponding to $\delta = 0$, the bifurcation condition is not fulfilled. Thus, with $\delta = 0$, the steady state is asymptotically stable for all delays $\tau$. This is in agreement with the results of Zhu and Huang (2007), where it is even shown that the steady state (equilibrium) is globally attracting.

Returning to situations where (63) is fulfilled, the bifurcation analysis can be applied. According to Theorem 1, the exact critical delay $\tau_0$ is

\[ \tau_0 = \frac{\arccos(-A_0/A_1)}{|A_1|\,[1 - (A_0/A_1)^2]^{1/2}} = \frac{\arccos\left(\frac{n}{a(n+\delta)-\delta}\right)}{|a(n+\delta)-\delta|\left[1 - \left(\frac{n}{a(n+\delta)-\delta}\right)^2\right]^{1/2}} \]

and the exact cubic expansion parameter $\tau_2$ is

\[ \tau_2 = \frac{1}{8\omega_0^2} \times \left\{ \frac{\left[2 - A_1\tau_0\,(2R^2 + 6R - 11) + R\,(4R(1+R) - 13)\right] A_2^2}{A_1(1+R)(5-4R)} - (A_1\tau_0 - R)A_3 \right\} \tag{64} \]

\[ = \frac{1}{8[(a(n+\delta)-\delta)^2 - n^2]} \times \left\{ \frac{\left[2 - (a(n+\delta)-\delta)\tau_0\,(2R^2 + 6R - 11) + R\,(4R(1+R) - 13)\right]\left[\beta\left(\frac{\beta}{n+\delta}\right)^{\frac{a-2}{1-a}} a(a-1)\right]^2}{[a(n+\delta)-\delta]\,[1+R]\,[5-4R]} - \left((a(n+\delta)-\delta)\tau_0 - R\right)\beta\left(\frac{\beta}{n+\delta}\right)^{\frac{a-3}{1-a}} a(a-1)(a-2) \right\}, \qquad R = \frac{-n}{a(n+\delta)-\delta}. \tag{65} \]

A general determination of the sign of $\tau_2$ in (65) as a function of the parameters is a huge task. For specific applications a numerical computation is needed, but special cases may be analyzed in detail. As an example, we now consider the limit of small $a$. Using Taylor's theorem, the critical length of the delay becomes

\[ \tau_0 = \frac{\arccos(-A_0/A_1)}{|A_1|\,[1 - (A_0/A_1)^2]^{1/2}} = \frac{\arccos(-n/\delta)}{|\delta|\,[1 - (n/\delta)^2]^{1/2}} + O(a) \tag{66} \]

The cubic expansion parameter similarly becomes

\[ \tau_2 = \frac{[(1/\beta)(1 + n/\delta)]^2\,(n + \tau_0\delta^2)}{4(1 - n/\delta)}\, a + O(a^2). \tag{67} \]

We note that $\tau_2 \to 0$ for $a \to 0$. As $n/\delta < 1$ follows from (63), each factor in the coefficient of $a$ in (67) is positive, i.e. $\tau_2$ is positive. Hence the Hopf bifurcation is supercritical.
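The small-$a$ limit can be compared with the exact expressions numerically. The sketch below is illustrative only: the function names are hypothetical, the exact branch follows (59)-(65) with the bracket read as $2 - A_1\tau_0(2R^2+6R-11) + R(4R(1+R)-13)$, and the parameter values in the usage lines are chosen for illustration rather than taken from the text.

```python
import math


def cd_hopf(beta, n, delta, a):
    """Return (tau0, tau2) for the CD case, or None if condition (63) fails."""
    a0 = -n                                   # (59)
    a1 = a * (n + delta) - delta              # (60)
    r = a0 / a1
    if abs(r) >= 1.0:
        return None                           # no Hopf bifurcation
    base = beta / (n + delta)
    a2 = beta * base ** ((a - 2.0) / (1.0 - a)) * a * (a - 1.0)              # (61)
    a3 = beta * base ** ((a - 3.0) / (1.0 - a)) * a * (a - 1.0) * (a - 2.0)  # (62)
    omega0_sq = a1 * a1 - a0 * a0             # equals A1^2 (1 - R^2)
    tau0 = math.acos(-r) / math.sqrt(omega0_sq)
    bracket = 2.0 - a1 * tau0 * (2 * r * r + 6 * r - 11) + r * (4 * r * (1 + r) - 13)
    tau2 = (bracket * a2 ** 2 / (a1 * (1 + r) * (5 - 4 * r))
            - (a1 * tau0 - r) * a3) / (8.0 * omega0_sq)                      # (64)
    return tau0, tau2


def cd_tau2_small_a(beta, n, delta, a):
    """Leading-order tau2 from (66)-(67); requires n < delta."""
    ratio = n / delta
    tau0 = math.acos(-ratio) / (abs(delta) * math.sqrt(1.0 - ratio * ratio))
    return ((1.0 / beta) * (1.0 + ratio)) ** 2 * (n + tau0 * delta ** 2) * a / (4.0 * (1.0 - ratio))


# Illustrative parameter values (an assumption, not a calibrated CD example).
exact = cd_hopf(beta=0.7, n=0.07, delta=0.12, a=1e-4)
approx = cd_tau2_small_a(beta=0.7, n=0.07, delta=0.12, a=1e-4)
print(exact, approx)
```

With $\delta = 0$ the function returns `None` for every $0 < a < 1$, in line with the remark that the steady state is then stable for all delays.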

8.5 CES technologies and time delays

Next we study the Hopf bifurcation of (7) with

\[ F(L, K) = \gamma\left[(1-a)L^{(\sigma-1)/\sigma} + aK^{(\sigma-1)/\sigma}\right]^{\sigma/(\sigma-1)}; \qquad 0 < a < 1,\ \sigma > 0,\ \sigma \neq 1 \tag{68} \]

\[ f(k) = \left[(1-a) + ak^{(\sigma-1)/\sigma}\right]^{\sigma/(\sigma-1)}. \tag{69} \]

Solving (13) with (69), we get the unique steady state, cf. (57), (58), (7),

\[ q^* = \left\{\frac{1}{1-a}\left[\left(\frac{\beta}{n+\delta}\right)^{\frac{1-\sigma}{\sigma}} - a\right]\right\}^{\frac{\sigma}{1-\sigma}}; \qquad q^* < k^* < \kappa \ (= q^* : \tau = 0) \tag{70} \]

which is economically meaningful when the parameters satisfy

\[ \frac{1}{a}\left(\frac{\beta}{n+\delta}\right)^{(1-\sigma)/\sigma} > 1. \tag{71} \]

The bifurcation condition (14) is

\[ |A_0/A_1| = \left|\frac{-n}{\beta^{(\sigma-1)/\sigma}(n+\delta)^{1/\sigma}a - \delta}\right| < 1. \tag{72} \]

For the CES production function, a delay in the capital stock depreciation is also needed to obtain a critical (bifurcation) time delay. For $\delta = 0$, the economic condition (71) for existence of a steady state is

\[ \frac{1}{a}\left(\frac{\beta}{n}\right)^{(1-\sigma)/\sigma} > 1. \tag{73} \]

But for $\delta = 0$, the bifurcation condition (72) is

\[ \frac{1}{a}\left(\frac{\beta}{n}\right)^{(1-\sigma)/\sigma} < 1. \tag{74} \]

Evidently, both cannot be satisfied simultaneously.

The general expressions for $\tau_2$ and $\omega_2$ are formidable, and we omit them here. Again, as an example, we consider the limit of small $a$. The critical delay from (15) is

\[ \tau_0 = \frac{\arccos(-A_0/A_1)}{[A_1^2 - A_0^2]^{1/2}} = \frac{\arccos(-n/\delta)}{|\delta|\,[1 - (n/\delta)^2]^{1/2}} + O(a) \tag{75} \]

so $n/\delta < 1$ is needed for the existence of a critical delay. Next, we get

\[ \tau_2 = \frac{[(1/\beta)(1 + n/\delta)]^{\frac{\sigma+1}{\sigma}}\,(n + \tau_0\delta^2)\,\delta^{\frac{1-\sigma}{\sigma}}}{[8\sigma^2/(\sigma+1)]\,(1 - n/\delta)}\, a + O(a^2). \tag{76} \]

Here we also see that $\tau_2 \to 0$ for $a \to 0$. As each of the factors in the coefficient of $a$ is positive, $\tau_2$ is positive and the Hopf bifurcation is supercritical.

8.6 CES and delays with cycles, square waves, and chaos

Next we turn to numerical simulations to demonstrate that a number of typical features of delay differential equations appear in the present model (7) with the CES production function (69). The simulations are performed with a simple Euler method, using a time step $\Delta t = \tau/200$. Tests with smaller time steps reproduced the same solutions with high accuracy. Two sets of basic economic parameters, shown in Table 1, are used together with varying values of the delay $\tau$.

Consider a general delay differential equation

\[ \dot{x} = h(x_\tau, x), \tag{77} \]

and consider $x_0 \neq y_0$ such that

\[ h(x_0, y_0) = h(y_0, x_0) = 0. \tag{78} \]

It is well known (e.g., Chow, Hale, and Huang 1992; Ivanov and Sharkovsky 1992) that under certain conditions the general equation (77) allows a periodic solution of the square-wave type with period approximately $2\tau$. The periodic solution spends approximately time $\tau$ at each of $x_0$, $y_0$, with rapid transitions between these almost constant states. Without going into detail, we demonstrate below the possibility of this feature with the parameters of the CES delay model, cf. Case 1.

Parameter Case 1. Fig. 1(a) shows the periodic solution for a slightly supercritical value of $\tau$. As expected from the Hopf theory, the solution is close to harmonic. As $\tau$ is increased, the periodic solution gradually turns into a square wave, as shown in Fig. 1(b). Solving (78) yields $x_0 = 2.57$, $y_0 = 11.4$, which agrees with the numerically obtained levels in the square wave. In Table 2, we see a dramatic difference between $q^*$ and $k^*$; this steady-state discrepancy measures the excess capital (installed compared to operating) per worker, i.e. the long-run accumulation of idle (non-operating) capital stocks, due to a time delay of the particular/critical size, $\tau_0$. Fig. 1(b) with square waves is extreme (generated by large values of TPF and delay), but it looks a bit like the so-called "ceiling/floor" models, patterns that have been associated with, e.g., the housing industry. Rather large values of the basic parameters for the capital intensity and labor growth in Case 1 may also support alternating time paths of "ceiling/floor" type for this and similar industries.

                        Case 1                     Case 2
Basic parameters
  σ                     0.5                        0.5
  a                     0.6                        0.3
  n                     0.07                       0.01
  β                     0.7                        0.1
  δ                     0.12                       0.02
Hopf parameters
  q*                    7.710                      4.333
  τ0                    44.96                      154.9
  τ2                    0.5837                     1.700
  ω0                    5.506 × 10^-2              1.412 × 10^-2
  T0                    114.1                      445.0
  ω2                    -5.986 × 10^-4             -1.278 × 10^-4
  T2                    114.1 + 2.125(τ - τ0)      445.0 + 5.729(τ - τ0)

Table 1: Parameter values for numerical simulations, together with computed quantities from Theorem 1: critical point q*, critical delay τ0, cubic expansion coefficient τ2, angular frequency of the linearized system ω0, expansion coefficient for the angular frequency ω2, and approximate period T2 of limit cycles as a function of the deviation of the delay from the critical value.

  β       τ       sγ          q*       k*           κ
  0.7     50      23.2        7.71     255.3        303.5
  0.7     150     25420.9     7.71     279993.5     334483.4

Table 2: Parameter values (steady-state capital-labor ratios) corresponding to the solutions in Fig. 1.

[Figure 1, panels (a) and (b): q(t) plotted against t for 0 ≤ t ≤ 1200.]

Figure 1: Numerical simulations for Case 1. (a): τ = 50. (b): τ = 150. The dashed line represents the equilibrium q*.

Parameter Case 2. We shall demonstrate transition to chaos. As $\tau_2$ is larger than in Case 1, the amplitude of the periodic solution grows more slowly with $\tau$. An almost harmonic periodic solution quite far from the critical value $\tau_0$ is shown in Fig. 2(a). However, by increasing $\tau$ slightly as in Fig. 2(b), a period-doubling bifurcation occurs, as the solution now consists of repeated pairs of different peaks. This is also demonstrated when the solution is plotted as a parameterized curve $(q(t), q(t-\tau))$, as in the right panels. It is well known, e.g., Guckenheimer and Holmes (1983), that a period doubling is typically the first step in a route to chaos. Indeed, a further increase of $\tau$, Fig. 2(c), shows a solution with a complicated pattern of different peaks. The solution appears chaotic. Liapunov exponents can be computed to substantiate this claim (cf. Guckenheimer and Holmes 1983).

  β       τ       sγ       q*       k*        κ
  0.1     220     0.90     4.33     39.08     42.40
  0.1     240     1.10     4.33     47.73     51.95
  0.1     255     1.28     4.33     55.45     60.52

Table 3: Parameter values (steady-state capital-labor ratios) corresponding to the solutions in Fig. 2.

In Table 3, the difference between $q^*$ and $k^*$ is less pronounced than in Table 2. Still, the steady-state (long-run) discrepancy is about tenfold. The importance of, and gain from, eliminating any delays (slack) in capital utilization is again demonstrated, cf. $\kappa$; the delay model provides some evidence for the increased economic-financial attention to "turn-key delivery/contracting" and "just in time" supply-chain management efforts.
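The computed quantities in Tables 1 and 2 can be recomputed from the formulas above. The following is a minimal sketch under stated assumptions, not the authors' code. It is restricted to $\sigma = 0.5$, where (69) reduces to $f(q) = q/((1-a)q + a)$; it assumes $A_1 = \beta f'(q^*) - \delta$, $A_2 = \beta f''(q^*)$, $A_3 = \beta f'''(q^*)$, mirroring the CD coefficients (59)-(62), and $\beta = s\gamma\, e^{-n\tau}$ as implied by (57). The function names are hypothetical.

```python
import math


def ces_half_hopf(a, n, beta, delta):
    """Hopf quantities for the CES model with sigma = 0.5 (illustrative sketch).

    Assumption: A1 = beta*f'(q*) - delta, A2 = beta*f''(q*), A3 = beta*f'''(q*),
    with f(q) = q / ((1 - a) * q + a).
    """
    q_star = (beta / (n + delta) - a) / (1.0 - a)   # (70) with sigma = 0.5
    d = (1.0 - a) * q_star + a                      # denominator of f at q*
    a0 = -n
    a1 = beta * a / d ** 2 - delta                  # f'(q)   = a / d^2
    a2 = -2.0 * beta * a * (1.0 - a) / d ** 3       # f''(q)  = -2a(1-a) / d^3
    a3 = 6.0 * beta * a * (1.0 - a) ** 2 / d ** 4   # f'''(q) = 6a(1-a)^2 / d^4
    r = a0 / a1
    if abs(r) >= 1.0:
        raise ValueError("bifurcation condition |A0/A1| < 1 is violated")
    omega0 = math.sqrt(a1 * a1 - a0 * a0)
    tau0 = math.acos(-r) / omega0
    bracket = 2.0 - a1 * tau0 * (2 * r * r + 6 * r - 11) + r * (4 * r * (1 + r) - 13)
    tau2 = (bracket * a2 ** 2 / (a1 * (1 + r) * (5 - 4 * r))
            - (a1 * tau0 - r) * a3) / (8.0 * omega0 ** 2)   # general form (64)
    return {"q_star": q_star, "tau0": tau0, "tau2": tau2,
            "omega0": omega0, "T0": 2.0 * math.pi / omega0}


def ces_half_levels(a, n, beta, delta, tau):
    """Return (s_gamma, q_star, k_star, kappa), assuming beta = s_gamma*exp(-n*tau)."""
    s_gamma = beta * math.exp(n * tau)
    q_star = (beta / (n + delta) - a) / (1.0 - a)
    k_star = q_star * math.exp(n * tau)
    kappa = (s_gamma / (n + delta) - a) / (1.0 - a)   # steady state with tau = 0
    return s_gamma, q_star, k_star, kappa


case1 = ces_half_hopf(a=0.6, n=0.07, beta=0.7, delta=0.12)
case2 = ces_half_hopf(a=0.3, n=0.01, beta=0.1, delta=0.02)
print(case1)
print(case2)
print(ces_half_levels(a=0.6, n=0.07, beta=0.7, delta=0.12, tau=50.0))
```

Under these assumptions the printed values should land close to the q*, τ0, τ2, ω0 and T0 entries of Table 1 and to the τ = 50 row of Table 2. The coefficient ω2 is not recomputed, since its expression (19) is not reproduced in this section.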

[Figure 2, panels (a), (b) and (c): left panels plot q(t) against t; right panels plot q(t − τ) against q(t).]

Figure 2: Numerical simulations for parameter Case 2. Left panels show time paths, q(t), and the dashed lines represent the steady state (equilibrium), q*. Right panels show the pair (q, qτ) in a phase plane; the marker represents the equilibrium, (q, qτ) = (q*, q*). (a): τ = 220. (b): τ = 240. (c): τ = 255.

Regarding the occurrence of chaotic motion, it is well known that, cf. Lorenz (1989, p. 139):

While chaotic dynamics in discrete-time systems can already occur in one-dimensional systems like the logistic equation, the equivalent phenomenon in continuous time is restricted to at least three-dimensional systems. Canonically, chaos cannot occur in a two-dimensional system, because a trajectory cannot intersect itself. The cyclical motion in a two-dimensional system is thus restricted to a monotonically damped, or explosive oscillations, and closed orbits. The fact that chaos can occur in three-dimensional continuous time systems, the Lorenz attractor or the Rössler attractor can be illustrated with the help of so-called Poincaré sections and maps.

As to the dimensional aspects, any continuous-time model with delays is essentially "infinite dimensional". Hence we can see chaotic time paths of our single delay differential equation in Fig. 2. When the general and chaotic time paths are exhibited as phase portraits in Fig. 3, the fundamental difference between dynamic systems with delays (b) and systems with no delays (a) is highlighted. Moreover, the portrait (b) observationally integrates the phenomena of economic growth and cycles.
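The Euler scheme with time step $\Delta t = \tau/200$ used for the simulations can be sketched as follows. This is a hedged illustration, not the authors' program. Equation (7) is not reproduced in this section, so the right-hand side $h(q_\tau, q) = -nq + \beta f(q_\tau) - \delta q_\tau$ is an assumption, inferred from $A_0 = -n$, the form of $A_1$ in (72), and the square-wave levels $x_0 = 2.57$, $y_0 = 11.4$ reported for Case 1. The constant initial history and all function names are likewise hypothetical.

```python
def f_ces_half(q, a):
    """CES intensive production function (69) for sigma = 0.5."""
    return q / ((1.0 - a) * q + a)


def h(q_delayed, q, a, n, beta, delta):
    """Right-hand side h(x_tau, x) of (77); ASSUMED form of model (7)."""
    return -n * q + beta * f_ces_half(q_delayed, a) - delta * q_delayed


def simulate(a, n, beta, delta, tau, q_init, t_end, steps_per_delay=200):
    """Explicit Euler integration with dt = tau / steps_per_delay.

    The history on [-tau, 0] is taken to be constant (an assumption).
    Returns (dt, path), where path[i] approximates q(i * dt).
    """
    dt = tau / steps_per_delay
    path = [q_init] * (steps_per_delay + 1)        # samples on [-tau, 0]
    n_steps = int(round(t_end / dt))
    for _ in range(n_steps):
        q_now = path[-1]
        q_lag = path[-1 - steps_per_delay]         # value one delay earlier
        path.append(q_now + dt * h(q_lag, q_now, a, n, beta, delta))
    return dt, path[steps_per_delay:]              # drop the history before t = 0


# Case 1 parameters from Table 1 with the delay of Fig. 1(a).
dt, path = simulate(a=0.6, n=0.07, beta=0.7, delta=0.12,
                    tau=50.0, q_init=8.0, t_end=1200.0)
# Pairs (q(t), q(t - tau)) of the kind plotted in the right panels of Fig. 2.
pairs = [(path[i], path[i - 200]) for i in range(200, len(path))]
print(len(path), min(path), max(path))
```

Storing the whole path in a list keeps the delayed value a simple index lookup; a fixed-length ring buffer would reduce memory if only the latest delay window is needed.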

Figure 3: Phase portraits of {L(t), K(t) = L(t)k(t)} from CES without delay (a), cf. Jensen et al. (2005), and with delay (b), Table 1.

8.7 Final comments

The search for various cycles in historical economic data has a long tradition; in Western economies it goes essentially back to the take-off of industrialization with its rapid population growth and economic growth per capita. Two distinct "causes" (explanations, paradigms) of recurrent industrial fluctuations can be found in the mainstream economic literature. One category of models relies heavily on exogenous forces (stochastic shocks) to start as well as to maintain cycles. Another type of model generates cycles endogenously by its own formal (mathematical) structure. We have only been concerned with the latter.

However, in their pure forms, both types of cyclical models have mostly ignored that industrial fluctuations (shorter or longer "waves") occur against the backdrop of more or less steady economic growth. Despite many devices for "detrending" observed economic series, the dynamics of business cycle theory has been divorced from the dynamic systems of the basic (mainstream) growth models, which methodologically entered late in formal quantitative economic theory. Mathematical physics and mechanics also began with equilibrium analysis and progressed to analyzing periodic motions (harmonic oscillators). It seems instructive now to recall a perspective on model building from Poincaré (1952, pp. 181-182):

Long ago it was said: If Tycho had had instruments ten times as precise, we would never have had a Kepler, or a Newton, or astronomy. It is misfortune for a science to be born too late, when the means of observation have become too perfect. That is what is happening at this moment with respect to physical chemistry; the founders are hampered in their general grasp by third and fourth decimal places; happily they are men of robust faith. As we get to know the properties of matter better, we see that continuity reigns. From the work of Andrews and Van der Waals, we see how the transition from the liquid to the gaseous state is made, and that is not abrupt. Similarly, there is no gap between the liquid and solid states. With this tendency there is no doubt a loss of simplicity. Such and such an effect was represented by straight lines; it is now necessary to connect these lines by more and more complicated curves. On the other hand, unity is gained. Separate categories quieted but did not satisfy the mind.

Hence, beginning with aspirations, mathematical tools, and data for cyclical "fine tuning" (epi-cycles) can hamper progress in overall economic understanding and the successful application of dynamic models to secularly growing economies. Thus, if we look at the numerical size (length) of the critical delays ($\tau_0$) for Hopf bifurcations, and the delays ($\tau$) necessary for the limit cycles or "chaos", it is clear that the relevant time unit cannot be decades or years, and hardly quarters. But if these delays (lags) refer to months or days, their proper interpretation and the economic impacts of such delays become more relevant for useful insights into the dynamics behind such observed data series. Aperiodic oscillations ("chaos") on a small time scale are not "strange economics", but their mathematical model is fairly complicated. However, the conclusion from our extension of the standard aggregate growth model is that delay differential equations offer a powerful mathematical instrument for a coherent treatment (integration), as exhibited in Fig. 3, of economic growth and cycle models.

References:

Chow, S.N., Hale, J.K., and Huang, W. (1992) "From Sine Waves to Square Waves in Delay Equations." Proceedings of the Royal Society of Edinburgh 120A: 223–229.

Diekmann, O., Gils, S.M. van, Verduyn Lunel, S.M. and Walther, H.-O. (1995) Delay Equations. Springer-Verlag, New York.

Gabisch, G., and Lorenz, H.W. (1987) Business Cycle Theory. Springer Verlag.

Goodwin, R.M. (1990) Chaotic Economic Dynamics. Oxford University Press.

Lorenz, H.W. (1989) Business Cycle Theory. Springer Verlag.

Guckenheimer, J., and Holmes, P. (1983) Nonlinear Oscillations, Dynamical Systems, and Bifurcations of Vector Fields. Springer Verlag, New York.

Ivanov, A.F., and Sharkovsky, A.N. (1992) "Oscillations in Singularly Perturbed Delay Equations." Dynamics Reported (New Series) 1: 164–224.

Jensen, B.S. (1994) The Dynamic Systems of Basic Economic Growth Models. Kluwer Academic Publishers, Dordrecht (trans. Peking University Press, Beijing).

Jensen, B.S., Alsholm, P.K., Larsen, M.E. and Jensen, J.M. (2005) "Dynamic Structure, Exogeneity, Phase Portraits, Growth Paths, and Scale and Substitution Elasticities." Review of International Economics 13: 59–89.

Morris, H.C. (1976) "A Perturbative Approach to Periodic Solutions of Delay-Differential Equations." Journal of the Institute of Mathematics and its Applications 18: 15–24.

Poincaré, H. (1952) Science and Hypothesis. Dover Publishers, New York.

Puu, T. (2003) Attractors, Bifurcations, and Chaos. Springer Verlag.

Zhu, H., and Huang, W. (2007) "The Economic Growth Models with Time Delays." Chapter 7 in this volume.

Part III: Intertemporal Optimization in Consumption, Finance, and Growth

Chapter 9 Optimal Consumption and Investment Strategies in Dynamic Stochastic Economies

Claus Munk
Department of Business and Economics, University of Southern Denmark

Carsten Sørensen
Department of Finance, Copenhagen Business School

9.1 Introduction

Individuals save in order to transfer consumption opportunities over time. They must determine how much to save (and hence how much to consume now) and how the savings should be allocated to different financial assets. Of course, the optimal decisions will depend on the preferences of the individual and on the price dynamics of the financial assets. It is extremely important to study how the dynamics of financial investment opportunities, represented by interest rates, expected returns, volatilities, and correlations, affect the optimal consumption and investment decisions of various individuals. Since investors care about the real returns on their investments, the stochastic variations in the prices of consumer goods should also be taken into account. Recently, numerous papers have focused on one or a few sources of asset price uncertainty. In this chapter, we give a unified analysis by deriving the optimal consumption and investment strategy in a complete financial market where prices follow continuous, not necessarily Markovian, stochastic processes. We focus on investors with constant relative risk aversion (CRRA), but extend the results to hyperbolic absolute risk aversion (HARA) and to power-linear habit formation preferences. Our general result shows how individuals will optimally hedge time-variations in investment opportunities and points out exactly which risks are to be hedged. We discuss how the results of recent studies come out as special cases of our analysis and provide several new examples.

The intertemporal consumption and investment decision of a utility-maximizing investor is a classical problem of financial economics dating back to Samuelson (1969) and Merton (1969), who derive the optimal strategies for investors with time-additive CRRA utility in a market with constant investment opportunities. Basically, all investors should invest in the same mean-variance optimal, speculative portfolio and in the riskless asset. While Merton (1971, 1973b) gave some preliminary results in the presence of stochastic shifts in investment opportunities, several recent papers have extended and concretized the analysis. We will now give a short review of some of these papers and then describe our analysis and results in more detail.

Focusing on interest rate risk, Sørensen (1999) derives the optimal investment strategy when the short-term interest rates and bond prices follow the Vasicek (1977) model and the stock market index has a constant excess rate of return and volatility. Brennan and Xia (2000) allow for a two-factor version of the Vasicek model, but their analysis and results are basically the same. Grasselli (2000) assumes instead that the short rate complies with the Cox, Ingersoll and Ross (1985) model. All three papers find that for CRRA utility of terminal wealth only, the optimal investment strategy is to combine the speculative portfolio and the zero-coupon bond expiring at the investment horizon (or a portfolio replicating this zero-coupon bond).
The higher the risk aversion, the higher the portfolio weight in the bond and the lower the portfolio weight in the speculative portfolio. In particular, the bond to stock ratio increases with the risk aversion, explaining the asset allocation puzzle identified by Canner, Mankiw and Weil (1997). Within the class of Gaussian term structure models of the type introduced by Heath, Jarrow and Morton (1992), Munk and Sørensen (2004) discuss the sensitivity of the optimal investment strategy with respect to the dynamics and the current form of the term structure of interest rates.

Other papers generalize the dynamics of stock prices relative to the standard assumption of a constant excess rate of return and volatility. Both Kim and Omberg (1996) and Wachter (2002) derive closed-form expressions for the optimal portfolio when the excess expected stock market return follows a mean-reverting Gaussian process and interest rates and volatilities are constant. Kim and Omberg consider an investor with CRRA utility of terminal wealth only, which enables them to allow for non-perfect correlation between the stock price and the excess expected return. Wachter assumes a perfect negative correlation in order to be able to explicitly solve the optimization of a time-additive CRRA utility of consumption. The presence of mean reversion in stock prices increases the demand for stocks, especially for long-term investors. Barberis (2000) and Xia (2001) explicitly allow for parameter uncertainty in similar settings.

A few recent papers have investigated the effects of stochastic stock price volatility on portfolio choice. Chacko and Viceira (2005) study the case where the inverse of the stock price variance follows a mean-reverting square-root process and the expected stock return is either constant or affine in the variance. They derive an explicit, but only approximately correct, expression for the optimal strategies of an infinitely lived investor with recursive preferences for consumption. Calibrating their model to U.S. stock returns, they conclude that the intertemporal hedging demand for stocks due to stochastic volatility in their model is lower in size than the hedging demands generated by variations in interest rates or excess returns. Liu and Pan (2003) also consider stochastic stock market volatility and add jump price risk. However, they also allow agents to invest in stock market derivatives and obtain a closed-form solution assuming a mean-reverting square-root process for the stock price variance and a market price of variance risk proportional to the price volatility.

When individuals save by investing in financial assets, they are interested in the real returns these assets offer.
None of the papers referred to above explicitly takes inflation risk into account, so their results apply to real asset price dynamics. However, in most financial markets the traded bonds are nominal in the sense that they promise some fixed monetary payments.¹ Similarly, the available short-term deposits promise a given nominal interest rate. Due to stochastic variations in the prices of consumer goods, such bonds and deposits have risky real returns. So far, only very few papers have explicitly incorporated inflation risk in the asset allocation problem, and they all consider rather simple, specialized models. In both Brennan and Xia (2002) and Campbell and Viceira (2001), the real interest rate is described by a Vasicek model and the expected inflation dynamics is given by an Ornstein-Uhlenbeck process. The term structure of nominal interest rates is therefore described by a two-factor model. While Brennan and Xia assume CRRA utility preferences, Campbell and Viceira apply a recursive utility specification in an infinite horizon setting. Munk, Sørensen and Vinther (2004) take a one-factor model for nominal interest rates while the implied term structure of real interest rates is described by a two-factor model, which makes it impossible to replicate real bonds by trading in nominal securities. They incorporate mean reversion in stock prices and derive the optimal investment strategy for investors with CRRA utility of terminal wealth. They demonstrate that stocks may be used as a non-perfect substitute for real bonds for hedging long-term real interest rate risk in cases where the stock is negatively correlated with the real interest rate.

¹ In some countries, e.g. the United States and the United Kingdom, inflation-indexed bonds are traded, but only for a few maturities and often with a modest turnover.

In this chapter, we first study a very general complete financial market of nominal securities with prices following continuous, not necessarily Markovian, stochastic processes. Interest rates, excess expected returns, price volatilities, correlations, and consumer prices may all evolve stochastically over time. Despite the general market setting, we are able to derive an explicit and very precise characterization of the optimal consumption and, more remarkably, the optimal investment strategy of an investor with CRRA utility of consumption and/or terminal wealth. This result pinpoints exactly what risks individuals want to hedge and shows how to finance a desired real consumption process by investing in a market of nominal securities. We discuss how most of the studies listed above come out as special cases of our analysis. We extend our general result to the case of HARA utility and power-linear habit utility. We also discuss how to include labor income.
Furthermore, we derive conditions under which an undiversifiable shock to the consumer price index can be allowed without changing the structure of the optimal strategies. Such a model feature is also present in Brennan and Xia (2002) in a model with very simple dynamics, but here we establish precisely when such an incompleteness will not ruin the form of the optimal policies.

As a special case of our general analysis, we focus on the case where real interest rates are Gaussian and real market prices of risk are deterministic. Under these assumptions, the optimal investment strategy of a CRRA investor combines the speculative portfolio and a single real bond hedging changes in real investment opportunities, even though these changes may be generated by a multi-dimensional Brownian motion. With utility from terminal wealth only, the hedge bond is the real zero-coupon bond maturing at the horizon of the investor. With utility from intermediate consumption, the hedge bond has a continuous coupon proportional to the expected future real consumption rate under the forward martingale measure. This result links the optimal hedge strategy and the optimal real consumption strategy closely together. While this result is to some extent known from Munk and Sørensen (2004), in this chapter we illustrate and extend the result by an example featuring specialized inflation uncertainty and where nominal interest rates follow a possibly non-Markovian, multi-factor Gaussian Heath-Jarrow-Morton model. In the example, we demonstrate how the optimal hedge strategy can equally well be implemented by nominal bonds that match the expected nominal consumption under the forward martingale measure that is relevant with respect to nominal valuation.

In a second example, we use an analogy to Markovian term structure models to solve in closed form the investment strategy of a CRRA investor who can invest in a stock and a stock derivative in a complete market setting. The model of the equity market is adopted from Heston (1993) and features stochastic volatility and excess return on the stock. Besides providing a closed-form solution for the optimal investment strategy, we also present numerical results based on realistic model parameters and focus on how changes in the investment opportunity set can be optimally hedged by investing relatively small amounts in a straddle position written on the stock.

The rest of the chapter is organized as follows. In Section 9.2, we set up a general complete financial market and briefly review the martingale approach for general time-separable utility functions.
In Section 9.3, we focus on CRRA utility and derive a general condition that the portfolio optimally hedging changes in the opportunity set must satisfy. We show how to obtain specific results by: (1) using an analogy to Markovian term structure models, and (2) specializing to Gaussian, not necessarily Markovian, real interest rates. Section 9.4 provides the two specific examples, including the example with non-Markovian Heath-Jarrow-Morton term structure dynamics. Section 9.5 contains the extensions to HARA or habit utility, to labor income, and to undiversifiable inflation risk. Finally, Section 9.6 concludes the chapter.


Claus Munk, Carsten Sørensen

9.2 Consumption and investment in complete markets

In this section, we describe the general economic model and state the utility maximization problem of an individual investor. We allow for non-Markovian dynamics of prices, but restrict ourselves to a complete market setting.

9.2.1 Information structure

Let $z$ be a $d$-dimensional standard Brownian motion on a filtered probability space $(\Omega, \mathcal F, \mathbb F, P)$, where $\mathbb F = \{\mathcal F_t \mid t \in \mathcal T\}$ is a right-continuous filtration. Here we take $\mathcal T = [0, T]$ for some $T > 0$ representing the time horizon of the investor considered. $\mathcal F_t$ is the augmentation of the $\sigma$-algebra generated by $\{z_u \mid 0 \le u \le t\}$. It is assumed that $\mathcal F_0$ is the $\sigma$-algebra generated by the zero sets of $P$ and that $\mathcal F = \mathcal F_T$. Below, all statements involving stochastic variables are assumed to hold almost surely with respect to $P$, and all stochastic processes are assumed to be adapted to $\mathbb F$.

9.2.2 Consumption good and inflation

We assume that the economy has a single consumption good (see footnote 2) with a unit price $\Pi_t$ that follows a stochastic process with dynamics

$$d\Pi_t = \Pi_t\left[\pi_t\,dt + \sigma_{\Pi t}^\top\,dz_t\right]. \tag{1}$$

Since $d\Pi_t/\Pi_t$ is the realized inflation rate over the next instant, $\pi_t$ is the expected inflation rate and $\sigma_{\Pi t}$ is the volatility of the inflation rate.

9.2.3 Financial assets

The agents in the economy have access to continuous trading in (at least) $d + 1$ financial assets without transaction costs. One asset is an instantaneously nominally riskless asset called the savings account with nominal price $A_t$ satisfying

$$dA_t = R_t A_t\,dt, \tag{2}$$

where $R$ is the continuously compounded short-term nominal interest rate process. Note that this asset is not riskless in real terms since the real price $A_t/\Pi_t$ has a diffusion term. The other $d$ assets are nominally risky with nominal prices given by the vector $P_t = (P_{1t}, \dots, P_{dt})^\top$ satisfying

$$dP_t = \operatorname{diag}(P_t)\left[(R_t \mathbf{1} + \sigma_{Pt}\Lambda_t)\,dt + \sigma_{Pt}\,dz_t\right]. \tag{3}$$

Here, $\Lambda$ is an $\mathbb R^d$-valued $L^2[0, T]$ stochastic process of nominal market prices of risk, and $\sigma_P$ is an $\mathbb R^{d\times d}$-valued stochastic process determining volatilities and correlations of the financial assets (see footnote 3). The processes $R$, $\Lambda$, and $\sigma_P$ are assumed to be progressively measurable with respect to $\mathbb F$ and such that the equations (2) and (3) are well-defined (see footnote 4). Note that we allow for non-Markovian dynamics of the investment opportunity set. The volatility process $\sigma_{Pt}$ is assumed to satisfy the non-degeneracy assumption

$$\exists\,\varepsilon > 0\ \ \forall (x, t) \in \mathbb R^d \times \mathcal T:\quad x^\top \sigma_{Pt}\sigma_{Pt}^\top x \ge \varepsilon \|x\|^2. \tag{4}$$

Footnote 2. For consumption/investment problems with multiple consumption goods see, e.g., Breeden (1979), Cuoco and Liu (2000), and Damgaard, Fuglsbjerg and Munk (2003).

As a consequence of condition (4), $\sigma_{Pt}$ has full rank $d$, implying the dynamic completeness of the market (see footnote 5). The unique nominal pricing kernel in this economy is the process $M = (M_t)$ defined by

$$M_t = \exp\left\{-\int_0^t R_s\,ds - \int_0^t \Lambda_s^\top\,dz_s - \frac12\int_0^t \|\Lambda_s\|^2\,ds\right\}, \tag{5}$$

with dynamics

$$dM_t = -M_t\left[R_t\,dt + \Lambda_t^\top\,dz_t\right].$$

The unique real pricing kernel is $m = (m_t)$ where $m_t = M_t\Pi_t/\Pi_0$ so that

$$dm_t = -m_t\left[(R_t - \pi_t + \Lambda_t^\top\sigma_{\Pi t})\,dt + (\Lambda_t - \sigma_{\Pi t})^\top\,dz_t\right].$$

The implied real short-term interest rate is therefore (see footnote 6)

$$r_t = R_t - \pi_t + \Lambda_t^\top\sigma_{\Pi t},$$

and the real market price of risk vector is $\lambda_t = \Lambda_t - \sigma_{\Pi t}$. We can then write the real pricing kernel as

$$m_t = \exp\left\{-\int_0^t r_s\,ds - \int_0^t \lambda_s^\top\,dz_s - \frac12\int_0^t \|\lambda_s\|^2\,ds\right\}. \tag{6}$$

See, e.g., Cochrane (2001) or Duffie (2001) for more on pricing kernels.

Footnote 3. $L^2[0, T]$ is the set of adapted stochastic processes $x$ such that $\int_0^T \|x_t\|^2\,dt < \infty$ almost surely. Similarly, $L^1[0, T]$ is the set of adapted processes $x$ with $\int_0^T \|x_t\|\,dt < \infty$ almost surely.

Footnote 4. In addition to the $L^2[0, T]$ assumption on $\Lambda$, it suffices that $R \in L^1[0, T]$ and $\sigma_P \in L^2[0, T]$.

Footnote 5. The market is potentially complete, but by restricting the set of admissible portfolio processes, various types of incompleteness can be modeled. This complicates the solution of the utility maximization problem considerably, cf. Cvitanic and Karatzas (1992) and Cuoco (1997).

Footnote 6. The real rate can also be computed as the rate of return of the portfolio $x_t = (\sigma_{Pt}^\top)^{-1}\sigma_{\Pi t}$, which is riskless in real terms.

9.2.4 The individual's choice problem

We consider an investor seeking to maximize his expected remaining life-time utility by choosing a trading and consumption strategy appropriately. We assume that the investor does not receive income from non-traded assets so that zero is a natural lower bound on the wealth process. (The effects of income will be studied in Section 9.5.3.) We can represent the trading strategy of the investor by an $\mathbb R^d$-valued progressively measurable stochastic process $x = (x_1, \dots, x_d)^\top$ with $x_{it}$ denoting the fraction of wealth invested in the $i$'th risky asset at time $t$. The fraction of wealth invested in the savings account is residually determined as $x_{0t} = 1 - x_t^\top\mathbf 1$. A real consumption strategy is a progressively measurable process $c = (c_t)$ with the corresponding nominal consumption given by $C_t = c_t\Pi_t$. Given a trading strategy $x$ and a nominal consumption strategy $C$, the nominal wealth $W_t = W_t^{C,x}$ of the investor evolves according to

$$dW_t^{C,x} = \left(R_t W_t^{C,x} + W_t^{C,x} x_t^\top\sigma_{Pt}\Lambda_t - C_t\right)dt + W_t^{C,x} x_t^\top\sigma_{Pt}\,dz_t. \tag{7}$$

We denote by $\mathcal C$ the set of all consumption processes $C \in L^1[0, T]$ and by $L^1_+$ the set of $\mathcal F_T$-measurable random variables $W$ with finite expectations. A consumption/terminal wealth pair $(C, W) \in \mathcal C \times L^1_+$ is called admissible with initial wealth $W_0$ if a trading strategy $x \in L^2[0, T]$ exists such that $W_0^{C,x} = W_0$ and $W_T^{C,x} = W$. In that case, the trading strategy $x$ is said to finance $(C, W)$. The expected life-time utility of the agent is assumed to be of the time-additive form

$$E\left[\int_0^T U_1(C_t/\Pi_t, t)\,dt + U_2(W_T/\Pi_T)\right],$$

where $U_1(\cdot, t)$ and $U_2(\cdot)$ are strictly increasing and concave $C^1(0, \infty)$ functions with $U_1'(\infty, t) \equiv \lim_{c\uparrow\infty} U_1'(c, t) = 0$ and $U_1'(0, t) \equiv \lim_{c\downarrow 0} U_1'(c, t) = \infty$, where the primes denote partial derivatives with respect to the first argument. Similarly for $U_2(\cdot)$. It follows from the martingale approach initiated by Pliska (1986) and formalized by Karatzas, Lehoczky and Shreve (1987) and Cox and Huang (1989, 1991) that we can find the optimal consumption $C^*$ and terminal wealth level $W^*$ by solving the static problem

$$\sup_{(C,W)\in\mathcal C\times L^1_+} E\left[\int_0^T U_1(C_s/\Pi_s, s)\,ds + U_2(W/\Pi_T)\right], \tag{8}$$

$$\text{s.t.}\quad E\left[\int_0^T M_t C_t\,dt + M_T W\right] \le W_0. \tag{9}$$

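Subsequently, a portfolio $x^*$ financing $(C^*, W^*)$ must be found. Due to our assumption that the inflation risk is spanned, we can alternatively let the agent directly choose a real consumption process $c = (c_t)$ and a real terminal wealth level $w$. Given a real consumption process $c$ and a portfolio process $x$, the real wealth of the investor, $w_t^{c,x} = W_t^{c\Pi,x}/\Pi_t$, evolves as

$$\begin{aligned}
dw_t^{c,x} &= \left[w_t^{c,x}\left(R_t - \pi_t + \sigma_{\Pi t}^\top\sigma_{\Pi t} + x_t^\top\sigma_{Pt}(\Lambda_t - \sigma_{\Pi t})\right) - c_t\right]dt + w_t^{c,x}\left(x_t^\top\sigma_{Pt} - \sigma_{\Pi t}^\top\right)dz_t \\
&= \left[w_t^{c,x}\left(r_t + (x_t^\top\sigma_{Pt} - \sigma_{\Pi t}^\top)\lambda_t\right) - c_t\right]dt + w_t^{c,x}\left(x_t^\top\sigma_{Pt} - \sigma_{\Pi t}^\top\right)dz_t.
\end{aligned} \tag{10}$$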
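The equality of the two drift expressions in (10) rests on $r_t = R_t - \pi_t + \Lambda_t^\top\sigma_{\Pi t}$ and $\lambda_t = \Lambda_t - \sigma_{\Pi t}$. The following minimal Python sketch checks it numerically at a single point in time. The function name `real_wealth_coefficients` and all numerical inputs are illustrative assumptions, not part of the chapter.

```python
def dot(a, b):
    """Inner product of two equally long lists."""
    return sum(x * y for x, y in zip(a, b))


def real_wealth_coefficients(x, sigma_P, Lambda, sigma_Pi, R, pi):
    """Drift (per unit of real wealth, before consumption) and diffusion in (10).

    Returns the drift written in nominal quantities, the drift written in
    real quantities, and the diffusion vector x' sigma_P - sigma_Pi'.
    """
    d = len(x)
    # Row vector x' sigma_P
    x_sigma = [sum(x[i] * sigma_P[i][j] for i in range(d)) for j in range(d)]
    lam = [L - s for L, s in zip(Lambda, sigma_Pi)]   # real market price of risk
    r = R - pi + dot(Lambda, sigma_Pi)                # real short rate
    diffusion = [xs - s for xs, s in zip(x_sigma, sigma_Pi)]
    drift_nominal_form = R - pi + dot(sigma_Pi, sigma_Pi) + dot(x_sigma, lam)
    drift_real_form = r + dot(diffusion, lam)
    return drift_nominal_form, drift_real_form, diffusion


if __name__ == "__main__":
    # Hypothetical two-asset market
    sigma_P = [[0.20, 0.00], [0.06, 0.15]]
    out = real_wealth_coefficients([0.4, 0.3], sigma_P, [0.3, 0.1],
                                   [0.01, 0.02], R=0.05, pi=0.02)
    print(out)
```

With the portfolio $x_t = (\sigma_{Pt}^\top)^{-1}\sigma_{\Pi t}$ the diffusion vector vanishes and the drift reduces to the real short rate, which is the real savings account mentioned in footnote 6.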

In terms of real consumption and wealth, the utility maximization problem is formulated as follows:

$$\sup_{(c,w)\in\mathcal C\times L^1_+} E\left[\int_0^T U_1(c_s, s)\,ds + U_2(w)\right], \tag{11}$$

$$\text{s.t.}\quad E\left[\int_0^T m_t c_t\,dt + m_T w\right] \le w_0. \tag{12}$$

Given a solution $(c^*, w^*)$ we have to find a portfolio $x^*$ financing $(c^*, w^*)$ in the sense that $w_T^{c^*,x^*} = w^*$.

9.2.5 General solution technique

By Lagrangian theory, the optimal solution to the static problem will satisfy

$$U_1'(c_t, t) = \psi m_t, \qquad U_2'(w) = \psi m_T,$$

where $\psi$ is such that the inequality constraint holds as an equality. Let $I_1(\cdot, t)$ denote the inverse of the marginal utility function $U_1'(\cdot, t)$ and $I_2(\cdot)$ the inverse of $U_2'(\cdot)$. Define

$$H(\psi) = E\left[\int_0^T m_t I_1(\psi m_t, t)\,dt + m_T I_2(\psi m_T)\right].$$

By concavity of the utility functions, $H(\cdot)$ is a decreasing function. Assume that $H(\psi)$ is finite for all $\psi$. Then $H(\cdot)$ has an inverse denoted by $\mathcal Y(\cdot)$ so that $\psi = \mathcal Y(w_0)$, and the optimal solution to (11)–(12) can be written as

$$c_t^* = I_1(\mathcal Y(w_0) m_t, t), \qquad t \in [0, T], \tag{13}$$

$$w^* = I_2(\mathcal Y(w_0) m_T). \tag{14}$$

The real wealth process under the optimal policy is given by

$$w_t^* = \frac{1}{m_t} E_t\left[\int_t^T m_s c_s^*\,ds + m_T w^*\right]. \tag{15}$$
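To make the inversion $\psi = \mathcal Y(w_0)$ concrete, the sketch below computes $H(\psi)$ by numerical integration and solves $H(\psi) = w_0$ by bisection. It relies on several simplifying assumptions that are not in the text at this point: power utility of terminal wealth only, $U_2(w) = e^{-\beta T} w^{1-\gamma}/(1-\gamma)$, a one-dimensional Brownian motion, and a constant real rate and real market price of risk, so that $m_T$ is log-normal. All function names and parameter values are hypothetical.

```python
import math


def normal_expectation(f, n=4000, z_max=10.0):
    """Midpoint-rule approximation of E[f(Z)] for Z ~ N(0, 1)."""
    h = 2.0 * z_max / n
    total = 0.0
    for i in range(n):
        z = -z_max + (i + 0.5) * h
        total += f(z) * math.exp(-0.5 * z * z)
    return total * h / math.sqrt(2.0 * math.pi)


def make_H(r, lam, T, beta, gamma):
    """Build H(psi) = E[m_T I_2(psi m_T)] for terminal-wealth power utility."""
    def m_T(z):
        # Real pricing kernel at T with constant r and lam, cf. (6)
        return math.exp(-(r + 0.5 * lam ** 2) * T - lam * math.sqrt(T) * z)

    def I2(y):
        # Inverse of U_2'(w) = exp(-beta T) w^(-gamma)
        return (y * math.exp(beta * T)) ** (-1.0 / gamma)

    def H(psi):
        return normal_expectation(lambda z: m_T(z) * I2(psi * m_T(z)))

    return H


def solve_multiplier(H, w0, lo=1e-12, hi=1e12, tol=1e-12):
    """Bisection on log(psi); uses that H is decreasing in psi."""
    a, b = math.log(lo), math.log(hi)
    while b - a > tol:
        mid = 0.5 * (a + b)
        if H(math.exp(mid)) > w0:
            a = mid   # budget not exhausted yet: psi must be larger
        else:
            b = mid
    return math.exp(0.5 * (a + b))


if __name__ == "__main__":
    H = make_H(r=0.02, lam=0.3, T=10.0, beta=0.03, gamma=4.0)
    print(solve_multiplier(H, w0=100.0))
```

Under these assumptions the multiplier is also available in closed form, which gives a simple consistency check on the numerical inversion.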

The indirect utility is the expected future utility generated by the optimal policies, i.e.

$$\begin{aligned}
V_0 &= E\left[\int_0^T U_1(c_t^*, t)\,dt + U_2(w^*)\right] \\
&= E\left[\int_0^T U_1\big(I_1(\mathcal Y(w_0) m_t, t), t\big)\,dt + U_2\big(I_2(\mathcal Y(w_0) m_T)\big)\right].
\end{aligned}$$

A drawback of the martingale approach is that the optimal investment policy is only given implicitly by the martingale representation theorem. Therefore, it is generally not clear how to implement $(c^*, w^*)$ by a trading strategy $x^*$ such that

$$w_t^{c^*,x^*} = w_t^*, \qquad t \in [0, T].$$

In the case of logarithmic utility, it can be shown (see, e.g., Karatzas and Shreve (1988, Example 3.6.6)) that the optimal investment strategy is to invest the fractions

$$x_t^{\log} = (\sigma_{Pt}^\top)^{-1}\Lambda_t = (\sigma_{Pt}^\top)^{-1}(\lambda_t + \sigma_{\Pi t}) \tag{16}$$

of wealth in the $d$ risky assets and the remaining fraction $1 - (x_t^{\log})^\top\mathbf 1$ in the nominal savings account. For general utility functions, the optimal investment strategy can be represented rather abstractly in terms of stochastic integrals of Malliavin derivatives by the Clark-Ocone formula, cf. Ocone and Karatzas (1991), but to derive an explicit expression for the optimal portfolio for non-logarithmic utility functions, it is generally recognized that the price dynamics must be specialized. Cox and Huang (1989) show that when the state-price density and the risky asset prices constitute a Markovian system, the optimal investment strategy is given in terms of the solution of a linear second-order partial differential equation. More explicit results are given in the case of a deterministically changing investment opportunity set, cf., e.g., Cvitanic and Karatzas (1992). In the following sections, by contrast, we provide a closed-form expression for the optimal investment strategy in a very general, possibly non-Markovian, market setting.

9.3 Results for CRRA utility in general markets

We will characterize the optimal investment strategy of a CRRA utility investor in the general market setting outlined above. Therefore, we define

$$U_1(c, t) = \varepsilon_1 e^{-\beta t}\frac{c^{1-\gamma}}{1-\gamma}, \qquad U_2(w) = \varepsilon_2 e^{-\beta T}\frac{w^{1-\gamma}}{1-\gamma}, \qquad \gamma > 0, \tag{17}$$

which for $\gamma = 1$ is interpreted as the limiting case of logarithmic utility. The parameter $\beta$ is the investor's subjective time preference rate. The non-negative constants $\varepsilon_1$ and $\varepsilon_2$ allow for different weightings of intermediate and terminal consumption, including the case with utility from intermediate consumption only ($\varepsilon_2 = 0$) and the case with utility from terminal wealth only ($\varepsilon_1 = 0$). We will state the optimal consumption and investment strategies in terms of the stochastic process $Q = (Q_t)$, defined by

$$Q_t = \varepsilon_1^{1/\gamma}\int_t^T e^{-\frac{\beta}{\gamma}(s-t)} q_t^s\,ds + \varepsilon_2^{1/\gamma} e^{-\frac{\beta}{\gamma}(T-t)} q_t^T, \tag{18}$$

where

$$q_t^s = E_t\left[\left(\frac{m_s}{m_t}\right)^{1-\frac{1}{\gamma}}\right]. \tag{19}$$

If we write the dynamics of $q_t^s$ in the form

$$dq_t^s = q_t^s\left[\mu_{qt}^s\,dt + (\sigma_{qt}^s)^\top\,dz_t\right], \tag{20}$$

it follows from a Leibniz-type rule for stochastic processes proved in the Appendix that

$$\sigma_{Qt} = \frac{\varepsilon_1^{1/\gamma}\int_t^T e^{-\frac{\beta}{\gamma}(s-t)} q_t^s \sigma_{qt}^s\,ds + \varepsilon_2^{1/\gamma} e^{-\frac{\beta}{\gamma}(T-t)} q_t^T \sigma_{qt}^T}{\varepsilon_1^{1/\gamma}\int_t^T e^{-\frac{\beta}{\gamma}(s-t)} q_t^s\,ds + \varepsilon_2^{1/\gamma} e^{-\frac{\beta}{\gamma}(T-t)} q_t^T}. \tag{21}$$
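Equation (21) says that $\sigma_{Qt}$ is a weighted average of the volatilities $\sigma_{qt}^s$, with weights proportional to the discounted values $q_t^s$. The sketch below evaluates (21) by the midpoint rule for a scalar Brownian motion. The callables `q` and `sigma_q` are user-supplied and purely hypothetical here; the chapter derives their actual form only for specific models.

```python
import math


def sigma_Q(q, sigma_q, t, T, beta, gamma, eps1, eps2, n=2000):
    """Evaluate (21) for scalar volatilities.

    q(t, s) and sigma_q(t, s) are callables giving q_t^s and its volatility.
    """
    w1 = eps1 ** (1.0 / gamma)
    w2 = eps2 ** (1.0 / gamma)
    h = (T - t) / n
    num = 0.0
    den = 0.0
    for i in range(n):
        s = t + (i + 0.5) * h
        weight = w1 * math.exp(-beta / gamma * (s - t)) * q(t, s) * h
        num += weight * sigma_q(t, s)
        den += weight
    # Lump-sum term from utility of terminal wealth
    terminal = w2 * math.exp(-beta / gamma * (T - t)) * q(t, T)
    num += terminal * sigma_q(t, T)
    den += terminal
    return num / den


if __name__ == "__main__":
    # Illustrative inputs only
    q = lambda t, s: math.exp(-0.02 * (s - t))
    sq = lambda t, s: 0.01 * (1.0 - math.exp(-0.3 * (s - t))) / 0.3
    print(sigma_Q(q, sq, 0.0, 10.0, beta=0.03, gamma=4.0, eps1=1.0, eps2=1.0))
    print(sigma_Q(q, sq, 0.0, 10.0, beta=0.03, gamma=4.0, eps1=0.0, eps2=1.0))
```

With $\varepsilon_1 = 0$ all weight sits on the horizon date and the function returns $\sigma_{qt}^T$; with intermediate consumption the result is pulled towards the volatilities of shorter dates.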

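Theorem 1. The optimal real consumption strategy of a CRRA investor is

$$c_t^* = \varepsilon_1^{1/\gamma}\frac{w_t^*}{Q_t}. \tag{22}$$

The optimal investment strategy is given by

$$x_t^* = (\sigma_{Pt}^\top)^{-1}\left[\frac{1}{\gamma}\Lambda_t + \left(1 - \frac{1}{\gamma}\right)\sigma_{\Pi t} + \sigma_{Qt}\right]. \tag{23}$$

The indirect utility of the investor is

$$V_t = \frac{1}{1-\gamma} Q_t^{\gamma} (w_t^*)^{1-\gamma}. \tag{24}$$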

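In these expressions, $w^*$ denotes the real wealth process generated by the optimal strategies.

Proof: With CRRA utility we have

$$I_1(y, t) = \varepsilon_1^{1/\gamma} e^{-\frac{\beta}{\gamma}t} y^{-\frac{1}{\gamma}}, \qquad I_2(y) = \varepsilon_2^{1/\gamma} e^{-\frac{\beta}{\gamma}T} y^{-\frac{1}{\gamma}},$$

so that we get

$$H(\psi) = \psi^{-\frac{1}{\gamma}} E\left[\varepsilon_1^{1/\gamma}\int_0^T e^{-\frac{\beta}{\gamma}t} m_t^{1-\frac{1}{\gamma}}\,dt + \varepsilon_2^{1/\gamma} e^{-\frac{\beta}{\gamma}T} m_T^{1-\frac{1}{\gamma}}\right] = \psi^{-\frac{1}{\gamma}} Q_0,$$

and consequently $\mathcal Y(w_0) = Q_0^{\gamma} w_0^{-\gamma}$. The optimal consumption rate and terminal wealth are thus

$$c_t^* = \varepsilon_1^{1/\gamma} e^{-\frac{\beta}{\gamma}t} m_t^{-\frac{1}{\gamma}}\frac{w_0}{Q_0}, \tag{25}$$

$$w^* = \varepsilon_2^{1/\gamma} e^{-\frac{\beta}{\gamma}T} m_T^{-\frac{1}{\gamma}}\frac{w_0}{Q_0}. \tag{26}$$

Substituting into (15) we obtain the optimal real wealth level

$$w_t^* = \frac{w_0}{Q_0} e^{-\frac{\beta}{\gamma}t} Q_t m_t^{-\frac{1}{\gamma}}.$$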

Combining this with (25), we obtain the expression for the optimal consumption process given in the theorem. By Itô's Lemma, the dynamics of optimal real wealth is

$$dw_t^* = \dots\,dt + w_t^*\left[-\frac{1}{\gamma}\frac{dm_t}{m_t} + \frac{dQ_t}{Q_t}\right] = \dots\,dt + w_t^*\left[\frac{1}{\gamma}\lambda_t + \sigma_{Qt}\right]^\top dz_t,$$

where we leave the drift term unspecified. Aligning this with the wealth dynamics for a general portfolio $x$ in (10) and using the relation $\lambda_t = \Lambda_t - \sigma_{\Pi t}$, we derive the expression for the optimal portfolio stated in the theorem. The computation of the indirect utility applies the fact that

$$w_s^* = w_t^* e^{-\frac{\beta}{\gamma}(s-t)}\left(\frac{m_s}{m_t}\right)^{-\frac{1}{\gamma}}\frac{Q_s}{Q_t}$$

for all $s$ and $t$ in $[0, T]$. Consequently, we can write consumption in $[t, T]$ and terminal wealth in terms of time $t$ values:

$$c_s^* = \varepsilon_1^{1/\gamma} e^{-\frac{\beta}{\gamma}(s-t)}\left(\frac{m_s}{m_t}\right)^{-\frac{1}{\gamma}}\frac{w_t^*}{Q_t}, \qquad w_T^* = \varepsilon_2^{1/\gamma} e^{-\frac{\beta}{\gamma}(T-t)}\left(\frac{m_T}{m_t}\right)^{-\frac{1}{\gamma}}\frac{w_t^*}{Q_t}. \tag{27}$$

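The value function becomes

$$\begin{aligned}
V_t &= E_t\left[\int_t^T e^{-\beta(s-t)}\varepsilon_1\frac{(c_s^*)^{1-\gamma}}{1-\gamma}\,ds + e^{-\beta(T-t)}\varepsilon_2\frac{(w_T^*)^{1-\gamma}}{1-\gamma}\right] \\
&= \frac{1}{1-\gamma} E_t\Bigg[\int_t^T e^{-\beta(s-t)}\varepsilon_1\,\varepsilon_1^{\frac{1-\gamma}{\gamma}} e^{-\frac{\beta(1-\gamma)}{\gamma}(s-t)}\left(\frac{m_s}{m_t}\right)^{-\frac{1-\gamma}{\gamma}}\left(\frac{w_t^*}{Q_t}\right)^{1-\gamma} ds \\
&\qquad\qquad\quad + e^{-\beta(T-t)}\varepsilon_2\,\varepsilon_2^{\frac{1-\gamma}{\gamma}} e^{-\frac{\beta(1-\gamma)}{\gamma}(T-t)}\left(\frac{m_T}{m_t}\right)^{-\frac{1-\gamma}{\gamma}}\left(\frac{w_t^*}{Q_t}\right)^{1-\gamma}\Bigg] \\
&= \frac{1}{1-\gamma}\left(\frac{w_t^*}{Q_t}\right)^{1-\gamma} Q_t = \frac{1}{1-\gamma} Q_t^{\gamma}(w_t^*)^{1-\gamma}
\end{aligned}$$

as claimed. □

We see that it is optimal to consume a time- and state-dependent fraction of wealth. With constant real investment opportunities, i.e. a constant real short rate $r_t$ and a constant real market price of risk $\lambda_t$, we have

$$Q_t = \frac{1}{\xi}\left(\varepsilon_1^{1/\gamma} + \left[\xi\varepsilon_2^{1/\gamma} - \varepsilon_1^{1/\gamma}\right] e^{-\xi(T-t)}\right),$$

where

$$\xi = \frac{\beta}{\gamma} + \left(1 - \frac{1}{\gamma}\right)\left(r + \frac{1}{2\gamma}\lambda^\top\lambda\right).$$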
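The closed form for $Q_t$ under constant real investment opportunities can be checked against a direct numerical evaluation of (18), using that $q_t^s = \exp\{-(1-1/\gamma)(r + \lambda^\top\lambda/(2\gamma))(s-t)\}$ in that case. The following sketch does so and also reports the consumption/wealth ratio $\varepsilon_1^{1/\gamma}/Q_t$ from (22). Function names and parameter values are illustrative assumptions, and the closed form as coded requires $\xi \neq 0$.

```python
import math


def xi(beta, gamma, r, lam):
    """The constant xi for a constant real rate r and real price-of-risk vector lam."""
    lam_sq = sum(l * l for l in lam)
    return beta / gamma + (1.0 - 1.0 / gamma) * (r + lam_sq / (2.0 * gamma))


def Q_closed_form(tau, beta, gamma, r, lam, eps1, eps2):
    """Closed-form Q_t as a function of the remaining horizon tau = T - t."""
    x = xi(beta, gamma, r, lam)
    a1 = eps1 ** (1.0 / gamma)
    a2 = eps2 ** (1.0 / gamma)
    return (a1 + (x * a2 - a1) * math.exp(-x * tau)) / x


def Q_quadrature(tau, beta, gamma, r, lam, eps1, eps2, n=20000):
    """Direct midpoint-rule evaluation of definition (18)."""
    lam_sq = sum(l * l for l in lam)
    rate_q = (1.0 - 1.0 / gamma) * (r + lam_sq / (2.0 * gamma))
    q = lambda u: math.exp(-rate_q * u)          # q_t^{t+u}
    a1 = eps1 ** (1.0 / gamma)
    a2 = eps2 ** (1.0 / gamma)
    h = tau / n
    integral = sum(math.exp(-beta / gamma * (i + 0.5) * h) * q((i + 0.5) * h)
                   for i in range(n)) * h
    return a1 * integral + a2 * math.exp(-beta / gamma * tau) * q(tau)


if __name__ == "__main__":
    args = dict(beta=0.03, gamma=3.0, r=0.02, lam=[0.3, 0.1], eps1=1.0, eps2=0.5)
    Q = Q_closed_form(20.0, **args)
    print(Q, Q_quadrature(20.0, **args))
    print("consumption/wealth ratio:", args["eps1"] ** (1.0 / args["gamma"]) / Q)
```

For $\gamma = 1$ the constant $\xi$ collapses to $\beta$, so the same function reproduces the log-utility expression for $Q_t$.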

Since $\sigma_{Qt}$ is then zero, we are back at the Merton (1969) solution adapted to our setting, which explicitly takes inflation risk into account. Since the difference between the optimal portfolio with and without stochastic investment opportunities is given by the term $(\sigma_{Pt}^\top)^{-1}\sigma_{Qt}$, we may interpret this as a hedge against shifts in real investment opportunities. Note that the only shifts that the investor wants to hedge are shifts that change expectations of a power of the future real pricing kernel. In particular, only changes in the real short rate and the real market price of risk are of concern to the investor in this complete market setting. We can also see that the hedge term is determined by the volatility of the wealth/consumption ratio. In a later section, we will strengthen the link between the hedge portfolio and the optimal consumption strategy in a simplified framework.

We will now rewrite the optimal investment strategy to better understand its structure and to simplify the comparison with the literature that does not explicitly take inflation risk into account. So far we have allowed the individual to invest in $d$ nominally risky securities and one nominally instantaneously riskless security, the nominal savings account. Due to market completeness, we can combine these $d + 1$ securities so that we obtain a security which is instantaneously riskless in real terms. This is achieved by the portfolio with weights $\tilde x_t = (\sigma_{Pt}^\top)^{-1}\sigma_{\Pi t}$ in the risky assets and the weight $1 - \tilde x_t^\top\mathbf 1$ in the nominal savings account. While the dynamics of the real value $\tilde p_{0t}$ of this real savings account is of course $d\tilde p_{0t} = \tilde p_{0t} r_t\,dt$, the dynamics of the nominal value $\tilde P_{0t} = \tilde p_{0t}\Pi_t$ is given by

$$d\tilde P_{0t} = \tilde P_{0t}\left[(r_t + \pi_t)\,dt + \sigma_{\Pi t}^\top\,dz_t\right].$$

Suppose we allow the individual to invest in the real savings account instead of the nominal savings account and also in the same $d$ risky assets as before.
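A portfolio $\bar x$ of the $d$ risky assets is then accompanied by a position of $1 - \bar x^\top\mathbf 1$ in the real savings account. For a given portfolio $\bar x$ and a consumption process $C$, the nominal wealth $W_t^{C,\bar x}$ will evolve as

$$dW_t^{C,\bar x} = \left[W_t^{C,\bar x}\left(r_t + \pi_t + \bar x_t^\top\left(\sigma_{Pt} - \mathbf 1\sigma_{\Pi t}^\top\right)\Lambda_t\right) - C_t\right]dt + W_t^{C,\bar x}\left(\sigma_{\Pi t}^\top + \bar x_t^\top\left(\sigma_{Pt} - \mathbf 1\sigma_{\Pi t}^\top\right)\right)dz_t.$$

Comparing this with (7), we can conclude that a portfolio $x_t$ in the "old" set of assets (including the nominal savings account) is equivalent to the portfolio

$$\bar x_t = \left(\sigma_{Pt}^\top - \sigma_{\Pi t}\mathbf 1^\top\right)^{-1}\left(\sigma_{Pt}^\top x_t - \sigma_{\Pi t}\right)$$

in the "new" set of assets (including the real savings account). In particular, the optimal portfolio in (23) corresponds to

$$\bar x_t^* = \left(\left(\sigma_{Pt} - \mathbf 1\sigma_{\Pi t}^\top\right)^\top\right)^{-1}\left[\frac{1}{\gamma}\lambda_t + \sigma_{Qt}\right]. \tag{28}$$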
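A small numerical sketch may help to see how (23), the change of asset menu, and (28) fit together. It treats a hypothetical market with $d = 2$ and a given hedge volatility $\sigma_{Qt}$. All names and numbers are assumptions made for illustration; the $2\times 2$ systems are solved by Cramer's rule to keep the example free of external libraries.

```python
def transpose2(A):
    """Transpose of a 2x2 matrix stored as nested lists."""
    return [[A[0][0], A[1][0]], [A[0][1], A[1][1]]]


def solve2(A, b):
    """Solve the 2x2 linear system A y = b by Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(b[0] * A[1][1] - A[0][1] * b[1]) / det,
            (A[0][0] * b[1] - A[1][0] * b[0]) / det]


def optimal_portfolio(sigma_P, Lambda, sigma_Pi, sigma_Q, gamma):
    """Equation (23): risky-asset weights alongside the nominal savings account."""
    rhs = [Lambda[i] / gamma + (1.0 - 1.0 / gamma) * sigma_Pi[i] + sigma_Q[i]
           for i in range(2)]
    return solve2(transpose2(sigma_P), rhs)


def to_real_menu(x, sigma_P, sigma_Pi):
    """Equivalent risky-asset weights when the real savings account is used."""
    # A = sigma_P' - sigma_Pi 1'
    A = [[sigma_P[j][i] - sigma_Pi[i] for j in range(2)] for i in range(2)]
    # rhs = sigma_P' x - sigma_Pi
    rhs = [sum(sigma_P[j][i] * x[j] for j in range(2)) - sigma_Pi[i]
           for i in range(2)]
    return solve2(A, rhs)


if __name__ == "__main__":
    sigma_P = [[0.20, 0.00], [0.06, 0.15]]
    Lambda, sigma_Pi, sigma_Q = [0.3, 0.1], [0.01, 0.02], [0.02, -0.01]
    x_star = optimal_portfolio(sigma_P, Lambda, sigma_Pi, sigma_Q, gamma=4.0)
    print(x_star, to_real_menu(x_star, sigma_P, sigma_Pi))
```

Both representations generate the same nominal wealth diffusion, which is what makes them equivalent; only the bookkeeping of the riskless position differs.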

Note that $\sigma_{Pt} - \mathbf 1\sigma_{\Pi t}^\top$ is the volatility matrix of the real asset prices $P_t/\Pi_t$. A similar expression for the optimal portfolio was given by Munk and Sørensen (1999) in a real economy. The result in the theorem shows how to implement the strategy in a nominal economy.

Next, let us look at some benchmark risk aversion parameters. For a log utility investor ($\gamma = 1$), we get

$$Q_t = \frac{1}{\beta}\left(\varepsilon_1 + [\beta\varepsilon_2 - \varepsilon_1] e^{-\beta(T-t)}\right)$$

and $\sigma_{Qt} = 0$, so that the well-known optimal strategies for log investors are obtained. In particular, log investors do not hedge changes in investment opportunities. For infinitely risk averse investors, interpreted as the limit $\gamma \to \infty$, we have (assuming $\varepsilon_1, \varepsilon_2 > 0$)

$$Q_t = \int_t^T E_t\left[\frac{m_s}{m_t}\right]ds + E_t\left[\frac{m_T}{m_t}\right],$$

which is the time $t$ real price of a real bond with a continuous coupon of one (consumption unit) in the time interval $[t, T]$ and a lump sum time $T$ payment of one (consumption unit). An infinitely risk averse investor will not invest speculatively, but simply try to replicate the riskless real annuity bond.

To apply the theorem, we have to identify $Q_t$ and its volatility vector $\sigma_{Qt}$. Below, we discuss two ways to obtain such specific results.

9.3.1 Specific results by analogy to term structure models

The theorem generalizes recent results in affine and quadratic Markovian frameworks, cf. Liu (1999), Brennan and Xia (2000), Sørensen (1999), Grasselli (2000), and Wachter (2002). To obtain their results, first rewrite the relevant expectation of the pricing kernel as

$$\begin{aligned}
q_t^s &= E_t\left[\left(\frac{m_s}{m_t}\right)^{1-\frac{1}{\gamma}}\right] \\
&= E_t\left[\exp\left\{-\left(1 - \frac{1}{\gamma}\right)\int_t^s\left(r_u + \frac12\|\lambda_u\|^2\right)du - \left(1 - \frac{1}{\gamma}\right)\int_t^s \lambda_u^\top\,dz_u\right\}\right] \\
&= E_t^{Q^{(\gamma)}}\left[e^{-\int_t^s r_u^{(\gamma)}\,du}\right],
\end{aligned} \tag{29}$$

where we have defined the process $r^{(\gamma)}$ by

$$r_t^{(\gamma)} = \left(1 - \frac{1}{\gamma}\right) r_t + \frac{1}{2\gamma}\left(1 - \frac{1}{\gamma}\right)\|\lambda_t\|^2, \tag{30}$$

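and $Q^{(\gamma)}$ is the probability measure under which the process $z^{(\gamma)}$ defined by

$$z_t^{(\gamma)} = z_t + \left(1 - \frac{1}{\gamma}\right)\int_0^t \lambda_u\,du \tag{31}$$

is a standard Brownian motion. Combining this observation with the well-known zero-coupon bond pricing results in affine and quadratic term structure models, cf. Duffie and Kan (1996) and Leippold and Wu (2000), we can recover Liu's results. For example, if $r^{(\gamma)}$ has an affine drift and variance under the probability measure $Q^{(\gamma)}$, functions $a^{(\gamma)}$ and $b^{(\gamma)}$ will exist such that

$$q_t^s = E_t\left[\left(\frac{m_s}{m_t}\right)^{1-\frac{1}{\gamma}}\right] = e^{-a^{(\gamma)}(s-t) - b^{(\gamma)}(s-t)\, r_t^{(\gamma)}},$$

and hence

$$\sigma_{Qt} = -\sigma_{r^{(\gamma)},t}\;\frac{\varepsilon_1^{1/\gamma}\int_t^T b^{(\gamma)}(s-t)\, e^{-\hat a^{(\gamma)}(s-t) - b^{(\gamma)}(s-t) r_t^{(\gamma)}}\,ds + \varepsilon_2^{1/\gamma}\, b^{(\gamma)}(T-t)\, e^{-\hat a^{(\gamma)}(T-t) - b^{(\gamma)}(T-t) r_t^{(\gamma)}}}{\varepsilon_1^{1/\gamma}\int_t^T e^{-\hat a^{(\gamma)}(s-t) - b^{(\gamma)}(s-t) r_t^{(\gamma)}}\,ds + \varepsilon_2^{1/\gamma}\, e^{-\hat a^{(\gamma)}(T-t) - b^{(\gamma)}(T-t) r_t^{(\gamma)}}},$$

where

$$\hat a^{(\gamma)}(\tau) = a^{(\gamma)}(\tau) + \frac{\beta}{\gamma}\tau,$$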

and $\sigma_{r^{(\gamma)},t}$ is the (absolute) volatility of the process $r^{(\gamma)}$ defined above. For example, with utility of terminal wealth only ($\varepsilon_1 = 0$, $\varepsilon_2 = 1$), constant market prices of risk, and the real short rate following a one-factor Vasicek model

$$dr_t = \kappa(\bar r - r_t)\,dt - \sigma_r^\top\,dz_t,$$

we get that $\sigma_{r^{(\gamma)},t} = -(1 - 1/\gamma)\sigma_r$ and

$$b^{(\gamma)}(\tau) = \frac{1}{\kappa}\left(1 - e^{-\kappa\tau}\right) \equiv b(\tau),$$

and consequently

$$\sigma_{Qt} = \left(1 - \frac{1}{\gamma}\right)\sigma_r\, b(T - t).$$

Since $\sigma_r b(T - t)$ is the volatility of a real zero-coupon bond maturing at $T$, we see that the optimal portfolio combines the speculative portfolio and a position in the real zero-coupon bond maturing at the end of the investor's horizon. This is the main result of Sørensen (1999).

9.3.2 Results under Gaussian real interest rate dynamics

Following ideas originally laid out in Munk and Sørensen (1999), it is possible to describe the optimal consumption and investment strategies of CRRA investors in cases where the real interest rate, $r_t$, follows a Gaussian process and the real market price of risk process, $\lambda_t$, is a deterministic function of time. In this case, the real pricing kernel is log-normally distributed. Hence, it is possible to directly evaluate the relevant expectations that enter the definition of the stochastic process $Q$ defined in (18). First note that the real price at time $t$ of a real zero-coupon bond which pays off one consumption unit at time $s \ge t$ is given by

$$B_t^s = E_t\left[\frac{m_s}{m_t}\right] = \exp\left\{E_t\left[\ln\frac{m_s}{m_t}\right] + \frac12\operatorname{Var}_t\left[\ln\frac{m_s}{m_t}\right]\right\}.$$

It follows that

$$\begin{aligned}
q_t^s &= E_t\left[\left(\frac{m_s}{m_t}\right)^{1-\frac{1}{\gamma}}\right] \\
&= \exp\left\{\left(1 - \frac{1}{\gamma}\right)E_t\left[\ln\frac{m_s}{m_t}\right] + \frac12\left(1 - \frac{1}{\gamma}\right)^2\operatorname{Var}_t\left[\ln\frac{m_s}{m_t}\right]\right\} \\
&= (B_t^s)^{1-\frac{1}{\gamma}}\exp\left\{\frac{1-\gamma}{2\gamma^2}\, g(t, s)\right\},
\end{aligned} \tag{32}$$

where for notational simplicity we have introduced the deterministic function

$$g(t, s) = \operatorname{Var}_t\left[\ln\frac{m_s}{m_t}\right] = \operatorname{Var}_t\left[-\int_t^s r_u\,du - \frac12\int_t^s\|\lambda_u\|^2\,du - \int_t^s\lambda_u^\top\,dz_u\right]. \tag{33}$$

Substituting the above expressions for $q_t^s$ into (18) now provides the optimal real consumption and investment strategy and indirect utility of a CRRA investor through Theorem 1. In particular, the optimal hedge behavior is obtained by investing in a portfolio with diffusion term $\sigma_{Qt}$. In the given context, the relevant hedge portfolio at any time $t$ can be characterized as a real coupon bond with a continuous coupon payment stream where the future coupon rate at time $s \ge t$ must be chosen to be equal to the conditional expected future consumption rate at time $s$, where the expectations are taken under the so-called forward martingale measures $\bar Q^s$ introduced by Jamshidian (1987) and Geman (1989). In general, expectations under the subjective probability measure $P$ and the time $s$ forward martingale measure differ. We are concerned with valuation in real terms, and the subjective probability measure $P$ and the relevant time $s$ forward martingale measure $\bar Q^s$ are interlinked through the relation

$$E_t\left[\frac{m_s}{m_t}X\right] = B_t^s\, E_t^{\bar Q^s}[X], \tag{34}$$

which must be satisfied for any sufficiently well-behaved random variable $X$, cf., e.g., Duffie (2001). All in all, we formally have the following corollary to Theorem 1.

Corollary 1. If the real interest rate, $r_t$, follows a Gaussian process and the real market price of risk process, $\lambda_t$, is deterministic, the optimal investment strategy is given by

$$x_t^* = (\sigma_{Pt}^\top)^{-1}\left[\frac{1}{\gamma}\Lambda_t + \left(1 - \frac{1}{\gamma}\right)(\sigma_{Bt} + \sigma_{\Pi t})\right], \tag{35}$$

where $(\sigma_{Bt} + \sigma_{\Pi t})$ is the volatility vector of the nominal price of a real bond which pays a continuous real coupon according to

$$k(s) = E_t^{\bar Q^s}[c_s^*] = (B_t^s)^{-1}\frac{w_t^*}{Q_t}\,\varepsilon_1^{1/\gamma} e^{-\frac{\beta}{\gamma}(s-t)} q_t^s, \qquad 0 \le t \le s < T, \tag{36}$$

and has a terminal lump sum real payment at time $T$ of

$$k(T) = E_t^{\bar Q^T}[w_T^*] = (B_t^T)^{-1}\frac{w_t^*}{Q_t}\,\varepsilon_2^{1/\gamma} e^{-\frac{\beta}{\gamma}(T-t)} q_t^T. \tag{37}$$

Proof: First note that the last equalities in (36) and (37) follow by the definition of the relevant forward martingale measure, as in (34), and by the characterization of optimal consumption and terminal wealth in (27). Now, as a key observation, it follows from the characterization of $q_t^s$ in (32) and Itô's lemma that

$$\sigma_{qt}^s = \left(1 - \frac{1}{\gamma}\right)\sigma_{Bt}^s, \tag{38}$$

where $\sigma_{Bt}^s$ is the volatility vector of the real zero-coupon bond price involved in (32). The real price of a coupon bond paying a continuous real coupon $k(s)$, $t \le s < T$, and with a terminal lump sum real payment, $k(T)$, at time $T$ is in general given by

$$B_t = \int_t^T k(s) B_t^s\,ds + k(T) B_t^T.$$

Moreover, using Lemma 1 in the appendix, the volatility vector of such a coupon bond is given by

$$\sigma_{Bt} = \frac{\int_t^T k(s) B_t^s \sigma_{Bt}^s\,ds + k(T) B_t^T \sigma_{Bt}^T}{\int_t^T k(s) B_t^s\,ds + k(T) B_t^T}. \tag{39}$$

By inserting $k(s)$ and $k(T)$ as described in (36) and (37) as well as the relationship in (38), and by comparison with the characterization of the volatility vector $\sigma_{Qt}$ in (21), it is seen that

$$\sigma_{Qt} = \left(1 - \frac{1}{\gamma}\right)\sigma_{Bt}.$$

The corollary follows by inserting this relationship in (23) and observing that $(\sigma_{Bt} + \sigma_{\Pi t})$ is the volatility vector of the nominal price of the relevant real bond, $\Pi_t B_t$. □

Corollary 1 provides an explicit expression for the optimal investment strategy under inflation and with possibly non-Markovian, multi-factor dynamics of interest rates. The optimal portfolio allocates a fraction of wealth $(1/\gamma)$ into the speculative portfolio and a fraction of wealth $(1 - 1/\gamma)$ into the suggested real coupon bond, which hedges against changes in the investment opportunity set. In the case of no intermediate consumption, the relevant bond is a real zero-coupon bond which pays off at the terminal date, and the corollary generalizes the insights of Brennan and Xia (2000) and Sørensen (1999) into a setting with inflation uncertainty. In the case of intermediate consumption, the corollary generalizes results in Munk and Sørensen (1999, 2004) into a setting with inflation uncertainty.

9.4 Examples

9.4.1 Example 1 (Non-Markovian term structure dynamics)

In this example, we will consider an economy where the dynamics of the term structure of nominal interest rates is given by a $k$-factor model of the HJM-class introduced by Heath, Jarrow and Morton (1992). An HJM-framework is natural since it allows for perfect calibration of the model to initially observed nominal zero-coupon bond prices of different maturities. Nominal zero-coupon bond prices are related one-to-one to the term structure of forward rates by the relationship $D_t^\tau = \exp\left\{-\int_t^\tau f_t^s\,ds\right\}$, and the HJM-approach focuses on modeling the simultaneous dynamics of all points on the whole forward rate curve. For any $\tau$, the dynamics of the $\tau$-maturity instantaneous forward rate is assumed given by

$$f_t^\tau = f_0^\tau + \int_0^t \alpha(s, \tau)\,ds + \int_0^t \sigma_f(s, \tau)^\top\,dz_s, \tag{40}$$

where $\sigma_f(\cdot, \tau)$ is an $\mathbb R^k$-valued process while $f_0^\tau$ denotes the $\tau$-maturity forward rate observed initially at time 0. As a no-arbitrage drift restriction, Heath, Jarrow and Morton (1992) have shown that

$$\alpha(t, \tau) = \sigma_f(t, \tau)^\top\left(\Lambda_t + \int_t^\tau \sigma_f(t, u)\,du\right)$$

must be satisfied. This implies that one only has to specify the initial term structure of forward rates and the volatility structure, $\sigma_f(t, \tau)$, when modeling term structure dynamics in an HJM-framework. The nominal short interest rate is given by $R_t = f_t^t$ and evolves, therefore, according to the equation

$$R_t = f_0^t + \int_0^t \alpha(s, t)\,ds + \int_0^t \sigma_f(s, t)^\top\,dz_s. \tag{41}$$
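The drift restriction above fixes $\alpha(t, \tau)$ once the forward rate volatility and the market price of risk are given. The sketch below evaluates it by the midpoint rule for a user-supplied volatility function. The exponentially decaying one-factor volatility used in the usage example is an assumption chosen only because the integral then has a simple closed form; it is not the three-factor specification of the chapter.

```python
import math


def hjm_drift(sigma_f, Lambda, t, tau, n=2000):
    """No-arbitrage forward rate drift alpha(t, tau).

    sigma_f(t, u) returns the volatility vector (a list) of the u-maturity
    forward rate at time t; Lambda is the market price of risk vector at t.
    """
    k = len(Lambda)
    h = (tau - t) / n
    integral = [0.0] * k
    for i in range(n):
        u = t + (i + 0.5) * h
        vol = sigma_f(t, u)
        for j in range(k):
            integral[j] += vol[j] * h
    vol_tau = sigma_f(t, tau)
    return sum(vol_tau[j] * (Lambda[j] + integral[j]) for j in range(k))


if __name__ == "__main__":
    sigma, kappa = 0.01, 0.3   # hypothetical volatility level and decay rate
    sigma_f = lambda t, u: [sigma * math.exp(-kappa * (u - t))]
    print(hjm_drift(sigma_f, [0.2], t=1.0, tau=6.0))
```

For this volatility choice the result can be compared with $\sigma e^{-\kappa(\tau - t)}\left(\Lambda + \sigma(1 - e^{-\kappa(\tau - t)})/\kappa\right)$.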

In the following, we will assume that $\sigma_f(\cdot, \tau)$ is a deterministic vector process, which implies that the nominal short interest rate, $R_t$, is a Gaussian process. Furthermore, we will assume that the nominal market price of risk process, $\Lambda_t$, is a deterministic vector process. It is important to point out that in this kind of Gaussian HJM-model, the short rate process is not necessarily Markovian, although the general HJM-framework also encompasses all known Markovian term structure models as special cases (see footnote 7). We will consider a specific HJM three-factor numerical example below which exhibits non-Markovian Gaussian interest rate dynamics; this problem is thus not solvable by a direct dynamic programming solution approach as in the tradition of Merton (1971, 1973b).

In the absence of inflation, Corollary 1 applies. In particular, the relevant portfolio for hedging changes in investment opportunities can be characterized as a bond with coupon payments that match the forward-expected consumption pattern. We will now consider a specialized case of inflation dynamics of the form

$$d\Pi_t = \Pi_t\left[\pi\,dt + \sigma_\Pi^\top\,dz_t\right], \tag{42}$$

where the expected inflation rate, $\pi$, and the volatility vector, $\sigma_\Pi$, are constant. This process is known as a geometric Brownian motion, and $(\Pi_s/\Pi_t)$ is log-normally distributed. The real short interest rate is given by $r_t = R_t - \pi + \Lambda_t^\top\sigma_\Pi$, and the real market price of risk vector is given by $\lambda_t = \Lambda_t - \sigma_\Pi$. Hence, the real interest rate is a Gaussian process and the real market price of risk vector process is deterministic. In this case, Corollary 1 applies and the optimal investment strategy can be implemented by investing in a portfolio of the nominal savings account, the nominal speculative portfolio, and a real bond with coupons that in real terms match the forward-expected consumption pattern in order to hedge changes in investment opportunities.

Footnote 7. In fact, the short rate is only Markovian if $\sigma_f(t, \tau)$ can be separated as $\sigma_f(t, \tau) = G(t)H(\tau)$, where $H$ is a real-valued continuously differentiable function that never changes sign and $G$ is an $\mathbb R^k$-valued continuously differentiable function, cf. Carverhill (1994).

Moreover, for concrete calculations of the relevant real coupons, as expressed in equations (36) and (37) in Corollary 1, one can use that

$$\begin{aligned}
g(t, s) &= \operatorname{Var}_t\left[-\int_t^s r_u\,du - \frac12\int_t^s\|\lambda_u\|^2\,du - \int_t^s\lambda_u^\top\,dz_u\right] \\
&= \int_t^s\|\lambda_u\|^2\,du + \int_t^s\left\|\int_u^s\sigma_f(u, \tau)\,d\tau\right\|^2 du + 2\int_t^s\lambda_u^\top\int_u^s\sigma_f(u, \tau)\,d\tau\,du.
\end{aligned} \tag{43}$$
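The three terms in (43) are the expansion of $\int_t^s\left\|\lambda_u + \int_u^s\sigma_f(u, \tau)\,d\tau\right\|^2 du$, which suggests a direct numerical evaluation. The sketch below computes $g(t, s)$ by nested midpoint rules for user-supplied deterministic functions `lam` and `sigma_f`. The one-factor exponential volatility and the constant market price of risk in the usage example are illustrative assumptions only.

```python
import math


def g_variance(lam, sigma_f, t, s, n=400):
    """Evaluate (43) by nested midpoint rules.

    lam(u) returns the real market price of risk vector at time u;
    sigma_f(u, tau) returns the forward rate volatility vector.
    """
    k = len(lam(t))
    h = (s - t) / n
    term_lambda = term_rate = term_cross = 0.0
    for i in range(n):
        u = t + (i + 0.5) * h
        # Inner integral X(u) = int_u^s sigma_f(u, tau) dtau
        h_inner = (s - u) / n
        X = [0.0] * k
        for m in range(n):
            tau = u + (m + 0.5) * h_inner
            vol = sigma_f(u, tau)
            for j in range(k):
                X[j] += vol[j] * h_inner
        lam_u = lam(u)
        term_lambda += sum(l * l for l in lam_u) * h
        term_rate += sum(x * x for x in X) * h
        term_cross += 2.0 * sum(l * x for l, x in zip(lam_u, X)) * h
    return term_lambda + term_rate + term_cross


if __name__ == "__main__":
    sigma, kappa = 0.01, 0.3   # hypothetical one-factor volatility parameters
    sigma_f = lambda u, tau: [sigma * math.exp(-kappa * (tau - u))]
    lam = lambda u: [0.3]
    print(g_variance(lam, sigma_f, 0.0, 5.0))
```

With $g(t, s)$ in hand, $q_t^s$ follows from (32) once the real zero-coupon bond price $B_t^s$ is known, and the coupons in (36) and (37) can then be tabulated date by date.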

9.4.2 Implementation of hedge by nominal bonds

In the following, we will consider how to implement the hedge against changes in investment opportunities in the absence of real bonds. In particular, we will demonstrate that a conceptually similar hedge portfolio can be implemented using nominal coupon bonds that match the forward-expected nominal consumption pattern. At this point, it is important to point out that the relevant time $s$ forward martingale measure based on nominal valuation, $Q^s$, differs from the time $s$ forward martingale measure based on real valuation, $\bar Q^s$, as defined in (34). The different forward martingale measures are connected through the following relations to the subjective probability measure $P$,

$$E_t\left[\frac{m_s}{m_t}X\right] = B_t^s\, E_t^{\bar Q^s}[X] \qquad\text{and}\qquad E_t\left[\frac{M_s}{M_t}X\right] = D_t^s\, E_t^{Q^s}[X], \tag{44}$$

which again must be satisfied for all sufficiently well-behaved random variables $X$. Since $C_s^* = \Pi_s c_s^*$ and $W_T^* = \Pi_T w_T^*$, and by manipulation of the relations in (44), it is seen that

$$D_t^s\, E_t^{Q^s}[C_s^*] = \Pi_t B_t^s\, E_t^{\bar Q^s}[c_s^*] \qquad\text{and}\qquad D_t^T\, E_t^{Q^T}[W_T^*] = \Pi_t B_t^T\, E_t^{\bar Q^T}[w_T^*], \tag{45}$$

where the expressions on the left-hand and right-hand sides of the equations are simply different ways of expressing the present nominal value of the future consumption rate at time $s$ and terminal wealth at time $T$, respectively. Using that the nominal pricing kernel, $M_t$, and the real pricing kernel, $m_t$, are both log-normally distributed under the specialized inflation dynamics, one can establish the following connection between nominal and real zero-coupon bond prices:

$$D_t^s = E_t\left[\frac{M_s}{M_t}\right] = E_t\left[\frac{m_s}{m_t}\right]\Pi_t\Psi(t, s) = \Pi_t B_t^s\Psi(t, s), \tag{46}$$

where

$$\Psi(t, s) = \exp\left\{-\pi(s - t) + \Lambda^\top\sigma_\Pi(s - t) + \int_t^s\sigma_\Pi^\top\int_u^s\sigma_f(u, \tau)\,d\tau\,du\right\}.$$

In particular, since $\Psi(t, s)$ is a deterministic function it follows by Itô's lemma that real and nominal zero-coupon bond volatilities satisfy

$$\sigma_{Dt}^s = \sigma_\Pi + \sigma_{Bt}^s. \tag{47}$$

Consider now a coupon bond paying a continuous nominal coupon at a rate $K(s) = E_t^{Q^s}[C_s^*]$ and with a terminal lump sum nominal payment $K(T) = E_t^{Q^T}[W_T^*]$. The nominal price of the bond is

$$D_t = \int_t^T K(s) D_t^s\,ds + K(T) D_t^T$$

and, by again using Lemma 1 in the appendix, the volatility vector of such a coupon bond can be characterized by

$$\sigma_{Dt} = \frac{\int_t^T K(s) D_t^s \sigma_{Dt}^s\,ds + K(T) D_t^T \sigma_{Dt}^T}{\int_t^T K(s) D_t^s\,ds + K(T) D_t^T}. \tag{48}$$

As inferred from the definition of $K(s)$ and $K(T)$, the definition of $k(s)$ and $k(T)$ in Corollary 1, and (45), we have $D_t^s K(s) = \Pi_t B_t^s k(s)$, $t \le s \le T$. Inserting this observation and the relation in (47) into (48), it is seen that $\sigma_{Dt} = \sigma_\Pi + \sigma_{Bt}$, where $\sigma_{Bt}$ is given in (39). Hence, the suggested nominal bond can be used to implement the hedge against changes in the investment opportunity set given in Corollary 1.

9.4.3 Example 2 (Stochastic volatility and excess returns on stocks)

In this example, we will consider an investor who can invest in a single stock (a stock index) and an option on the stock. There are two basic sources of randomness in the economy and hence, since the investor can trade in two securities, markets are complete. The model of the economy is based on the stochastic volatility option pricing model of Heston (1993). The notation used in the example is similar to the notation used in Heston (1993), but after having described the formal model below, we point out how the model fits exactly into the general description of asset price dynamics in Section 9.2.

The dynamics of the stock price (cum-dividends), $S_t$, and the option price, $C_t$, are assumed to be described by

$$\frac{dS_t}{S_t} = (R + \lambda_s v_t)\, dt + \sqrt{v_t}\, d\hat z_{1t} \tag{49}$$

$$\frac{dC_t}{C_t} = \left( R + \frac{\partial C}{\partial S} \lambda_s v_t + \frac{\partial C}{\partial v} \sigma \lambda_v v_t \right) dt + \frac{\partial C}{\partial S} \sqrt{v_t}\, d\hat z_{1t} + \frac{\partial C}{\partial v} \sigma \sqrt{v_t}\, d\hat z_{2t} \tag{50}$$

and

$$dv_t = \kappa(\theta - v_t)\, dt + \sigma \sqrt{v_t}\, d\hat z_{2t} \tag{51}$$

where $\hat z_{1t} = z_{1t}$ and $\hat z_{2t} = \rho z_{1t} + \sqrt{1-\rho^2}\, z_{2t}$ are Brownian motions with $\mathrm{Cov}(d\hat z_{1t}, d\hat z_{2t}) = \rho\, dt$. Using a standard no-arbitrage approach, Heston (1993) shows how to price options in the above economy. Prices on, e.g., a European call option $C(S,v,t)$ must thus as usual satisfy a PDE with appropriate boundary conditions. Heston (1993), pp. 330-331 and his appendix, demonstrates that the solution is of the following form:

$$C(S,v,t) = S P_1 - K e^{-r(T-t)} P_2 \tag{52}$$

where (see equation (18) in Heston (1993))

$$P_j = \frac{1}{2} + \frac{1}{\pi} \int_0^\infty \mathrm{Re}\left[ \frac{e^{-i\varphi \ln[K]} f_j(x,v,T;\varphi)}{i\varphi} \right] d\varphi, \quad j = 1, 2 \tag{53}$$

and where $x = \ln S$ and $f_j(\cdot;\varphi)$ denotes characteristic functions that are obtained as solutions to PDEs with terminal condition $e^{i\varphi x}$. Since the coefficients in the particular PDEs are affine and the terminal conditions are exponential-affine, the solutions for $f_j$, $j = 1, 2$, are exponential-affine and of the form in equation (17) in Heston (1993). In addition, it is straightforward to obtain closed-form expressions similar to (52) for the partial derivatives $\partial C/\partial S$ and $\partial C/\partial v$ that enter the description of option price dynamics in (50); the specific expression for option price dynamics in (50) is implied by Ito's lemma. The above model of stock price and option price dynamics is a special case of the asset price dynamics in (3). The dynamics of asset prices in (49) and (50) are thus encompassed as a special case of (3)

where:

$$P_t = (S_t, C_t)^\top, \quad \Lambda_t = \sqrt{v_t} \begin{pmatrix} \lambda_s \\ (1-\rho^2)^{-1/2} (\lambda_v - \rho\lambda_s) \end{pmatrix}, \quad \text{and} \quad \sigma_{Pt} = \sqrt{v_t} \begin{pmatrix} 1 & 0 \\ \frac{\partial C}{\partial S} + \rho\sigma\frac{\partial C}{\partial v} & \sqrt{1-\rho^2}\,\sigma \frac{\partial C}{\partial v} \end{pmatrix}.$$

We will consider the intertemporal portfolio choice of a power utility investor in a setting with a constant rate of inflation, $\pi_t = \pi$, and without inflation uncertainty, i.e. $\sigma_{\Pi t} = 0$. In this economy, the real rate is thus given by the constant $r = R - \pi$, and nominal risk premia and real risk premia must coincide, $\lambda_t = \Lambda_t$. The optimal investment strategy of the investor is described in Theorem 1 and the portfolio solution depends critically on the processes $Q_t$ and $q_t^s$, as defined in (19) and (20). It is possible to determine the specific optimal investment strategy by using the results on the analogy to term structure models. In particular, the process $r_t^{(\gamma)}$ defined in (30) is in the present context

$$r_t^{(\gamma)} = \left(1 - \frac{1}{\gamma}\right) r + \frac{1}{2\gamma}\left(1 - \frac{1}{\gamma}\right) \frac{\lambda_s^2 + \lambda_v^2 - 2\rho\lambda_s\lambda_v}{1-\rho^2}\, v_t = k_0 + k_1 v_t = k_0 + v_t^* \tag{54}$$

where the first equality defines the constants $k_0$ and $k_1$, and the second equality defines the proportional volatility process $v^* = k_1 v$. In the following, we assume that $\gamma > 1$ so that, e.g., $k_1$ is positive.[8] By an application of Ito's lemma, it follows that the dynamics of the proportional volatility process are described by

$$dv_t^* = \kappa^*(\theta^* - v_t^*)\,dt + \sigma^*\sqrt{v_t^*}\, d\hat z_{2t}^{(\gamma)} \tag{55}$$

where

$$\kappa^* = \kappa + \left(1 - \frac{1}{\gamma}\right)\sigma\lambda_v, \quad \theta^* = k_1\theta\,\frac{\kappa}{\kappa^*}, \quad \sigma^* = \sqrt{k_1}\,\sigma,$$

and

$$\hat z_{2t}^{(\gamma)} = \rho z_{1t}^{(\gamma)} + \sqrt{1-\rho^2}\, z_{2t}^{(\gamma)} = \hat z_{2t} + \left(1 - \frac{1}{\gamma}\right)\lambda_v \int_0^t \sqrt{v_u}\, du.$$

[8] The logarithmic utility case, $\gamma = 1$, is described by the general results in Section 9.2. For example, the optimal investment strategy for a log-investor is given by (16) where the hedge term is absent.

In particular, $\hat z_{2t}^{(\gamma)}$ is a standard Brownian motion under the probability measure $Q^{(\gamma)}$, as defined in (29). Using the results on the analogy to term structure models, the relevant calculations with respect to $q_t^s$ are now similar to the evaluations of zero-coupon bond prices in a term structure model where the dynamics of the short interest rate are given by (54) and (55). This term structure model is known as the extended CIR model; cf. Pearson and Sun (1994). By analogy to the derivations in Pearson and Sun (1994), we thus have

$$q_t^s = E_t^{Q^{(\gamma)}}\left[ e^{-\int_t^s r_u^{(\gamma)} du} \right] = e^{-a^{(\gamma)}(s-t) - b^{(\gamma)}(s-t)\, r_t^{(\gamma)}} \tag{56}$$

where

$$a^{(\gamma)}(\tau) = k_0\tau - \frac{2\kappa^*\theta^*}{\sigma^{*2}} \log\left( \frac{2\gamma e^{(\gamma+\kappa^*)\tau/2}}{(\gamma+\kappa^*)(e^{\gamma\tau}-1) + 2\gamma} \right),$$

$$b^{(\gamma)}(\tau) = \frac{2(e^{\gamma\tau}-1)}{(\gamma+\kappa^*)(e^{\gamma\tau}-1) + 2\gamma},$$

$$\gamma = \sqrt{\kappa^{*2} + 2\sigma^{*2}}.$$

Note that on the right-hand sides of these three expressions the symbol $\gamma$ denotes the constant defined in the last line, not the risk-aversion coefficient that appears in the superscript $(\gamma)$.
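The coefficient functions above can be sanity-checked numerically. The following is a minimal sketch, not the authors' code: it evaluates $a^{(\gamma)}$ and $b^{(\gamma)}$ and compares finite-difference derivatives with the standard CIR relations $b' = 1 - \kappa^* b - \tfrac12\sigma^{*2} b^2$ and $a' = k_0 + \kappa^*\theta^* b$, which follow from differentiating the closed forms. The function names, the variable `g` (used for the constant $\sqrt{\kappa^{*2}+2\sigma^{*2}}$ to avoid the clash with risk aversion), and all parameter values are assumptions made for illustration only.

```python
import math


def cir_constant(kappa_star, sigma_star):
    """The constant sqrt(kappa*^2 + 2 sigma*^2); called g here (hypothetical name)."""
    return math.sqrt(kappa_star**2 + 2.0 * sigma_star**2)


def b_coef(tau, kappa_star, sigma_star):
    """b(tau) = 2(e^{g tau} - 1) / ((g + kappa*)(e^{g tau} - 1) + 2g)."""
    g = cir_constant(kappa_star, sigma_star)
    e = math.expm1(g * tau)  # e^{g tau} - 1, accurate for small tau
    return 2.0 * e / ((g + kappa_star) * e + 2.0 * g)


def a_coef(tau, k0, kappa_star, theta_star, sigma_star):
    """a(tau) = k0 tau - (2 kappa* theta* / sigma*^2) log(...); needs sigma* > 0."""
    g = cir_constant(kappa_star, sigma_star)
    e = math.expm1(g * tau)
    log_term = (math.log(2.0 * g) + 0.5 * (g + kappa_star) * tau
                - math.log((g + kappa_star) * e + 2.0 * g))
    return k0 * tau - (2.0 * kappa_star * theta_star / sigma_star**2) * log_term


def central_diff(f, x, step=1e-5):
    """Symmetric finite-difference approximation of f'(x)."""
    return (f(x + step) - f(x - step)) / (2.0 * step)


# Illustrative (assumed) parameter values, not taken from the text.
K0, KAPPA_STAR, THETA_STAR, SIGMA_STAR = 0.01, 0.42, 0.004, 0.057

if __name__ == "__main__":
    for tau in (0.5, 5.0, 15.0):
        b = b_coef(tau, KAPPA_STAR, SIGMA_STAR)
        db = central_diff(lambda t: b_coef(t, KAPPA_STAR, SIGMA_STAR), tau)
        riccati = 1.0 - KAPPA_STAR * b - 0.5 * SIGMA_STAR**2 * b**2
        print(f"tau={tau:5.1f}  b={b:.6f}  b'={db:.6f}  Riccati rhs={riccati:.6f}")
```

As a design note, `math.expm1` is used instead of `math.exp(x) - 1` so that $b^{(\gamma)}(\tau) \approx \tau$ is reproduced accurately for short horizons. For long horizons $b^{(\gamma)}(\tau)$ approaches $2/(g + \kappa^*)$.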

The optimal investment strategy is given in (23) in Theorem 1. In the present context, we assume that $\sigma_{\Pi t} = 0$ and the optimal investment strategy in (23) reduces to

$$x_t^* = \frac{1}{\gamma} (\sigma_{Pt}^\top)^{-1}\Lambda_t + (\sigma_{Pt}^\top)^{-1}\sigma_{Qt} \tag{57}$$

where the hedge term is given by (using (21), (56), and (51))

$$(\sigma_{Pt}^\top)^{-1}\sigma_{Qt} = -h(v_t,t,s)\, k_1 \sigma \sqrt{v_t}\, (\sigma_{Pt}^\top)^{-1} \begin{pmatrix} \rho \\ \sqrt{1-\rho^2} \end{pmatrix} \tag{58}$$

with

$$h(v_t,t,s) = \frac{\varepsilon_1^{1/\gamma}\int_t^T e^{-\frac{\beta}{\gamma}(s-t)} q_t^s\, b^{(\gamma)}(s-t)\, ds + \varepsilon_2^{1/\gamma} e^{-\frac{\beta}{\gamma}(T-t)} q_t^T\, b^{(\gamma)}(T-t)}{\varepsilon_1^{1/\gamma}\int_t^T e^{-\frac{\beta}{\gamma}(s-t)} q_t^s\, ds + \varepsilon_2^{1/\gamma} e^{-\frac{\beta}{\gamma}(T-t)} q_t^T}. \tag{59}$$

In the special case where the investor has utility from terminal wealth only ($\varepsilon_1 = 0$, $\varepsilon_2 = 1$), the expression for the optimal investment strategy simplifies. In this case, the optimal investment strategy is given by (57) and (58), but the function $h(v_t,t,s)$ in (59) reduces to a function of time only given by

$$h(v_t,t,s) = b^{(\gamma)}(T-t). \tag{60}$$

Also, in the special case where the relevant hedging instrument (the option with dynamics described in (50)) has zero sensitivity with respect to the underlying stock price, the relevant hedge strategy simplifies. In this case, $\partial C/\partial S = 0$, and the hedge term (58) in the optimal investment strategy can be written as

$$(\sigma_{Pt}^\top)^{-1}\sigma_{Qt} = -h(v_t,t,s)\, k_1 \left(\frac{\partial C}{\partial v}\right)^{-1} \begin{pmatrix} 0 \\ 1 \end{pmatrix}. \tag{61}$$

Hence, in the special case where $\partial C/\partial S = 0$, the optimal hedge portfolio only involves taking a position in the relevant option strategy. Finally, in the numerical calibration below, we also consider the special case where the stock and the volatility process are perfectly negatively correlated ($\rho = -1$). The optimal investment strategy can in this case be obtained as a limiting case of (57) (where one must set $\lambda_v = -\lambda_s$), or by similar explicit derivation. In particular, implementation of the optimal strategy in this one-dimensional case only requires investing in a single asset, the stock. The optimal investment strategy thus describes the optimal stock position which is given by

$$x_t^* = \frac{1}{\gamma}\lambda_s + h(v_t,t,s)\, k_1 \sigma \tag{62}$$

where now $k_1 = \frac{1}{2\gamma}\left(1 - \frac{1}{\gamma}\right)\lambda_s^2$, and $h(v_t,t,s)$ is described in (59).
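The reduction of the hedge term from (58) to (61) when $\partial C/\partial S = 0$ can be checked with a small linear solve. The sketch below is an illustration under assumed inputs rather than part of the original analysis: it builds $\sigma_{Pt}$ as given above, solves $\sigma_{Pt}^\top x = \sigma_{Qt}$ by Cramer's rule, and confirms that the stock component vanishes while the option component equals $-h\,k_1/(\partial C/\partial v)$. The function names and the numerical values of `h`, `k1` and `dC_dv` are hypothetical.

```python
import math


def solve_2x2(m, rhs):
    """Solve the 2x2 system m @ x = rhs by Cramer's rule."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [(rhs[0] * d - b * rhs[1]) / det, (a * rhs[1] - c * rhs[0]) / det]


def hedge_term(h, k1, sigma, v, rho, dC_dS, dC_dv):
    """Hedge portfolio (sigma_P^T)^{-1} sigma_Q from (58), as [stock, option]."""
    sv = math.sqrt(v)
    root = math.sqrt(1.0 - rho**2)
    # Rows of sigma_P: stock, then option (see the matrix defined in the text).
    sigma_p = [[sv, 0.0],
               [sv * (dC_dS + rho * sigma * dC_dv), sv * root * sigma * dC_dv]]
    sigma_p_t = [[sigma_p[0][0], sigma_p[1][0]],
                 [sigma_p[0][1], sigma_p[1][1]]]
    scale = -h * k1 * sigma * sv
    sigma_q = [scale * rho, scale * root]
    return solve_2x2(sigma_p_t, sigma_q)


if __name__ == "__main__":
    # Assumed illustrative inputs: h, k1 and dC/dv are not taken from the text.
    stock, option = hedge_term(h=2.0, k1=0.08, sigma=0.20, v=0.04,
                               rho=-0.60, dC_dS=0.0, dC_dv=1.5)
    print(f"stock hedge weight:  {stock:.6f}")
    print(f"option hedge weight: {option:.6f}  (compare -h*k1/dC_dv = {-2.0 * 0.08 / 1.5:.6f})")
```

With a nonzero `dC_dS` the same function returns a nonzero stock component, which is why the straddle in the numerical example is struck so that its price is locally insensitive to the stock.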

9.4.4 Numerical results

This subsection presents numerical asset allocation results based on the optimal investment strategies derived above for the Heston (1993) model. We assume a constant rate of inflation of $\pi = 0.02$ and a constant nominal interest rate of $R = 0.04$ (and thus a constant real interest rate of $r = 0.02$). Moreover, the parameters of the basic Heston (1993) stochastic volatility model are chosen close to empirical estimates obtained for this specific model based on US data; see, e.g., Andersen, Benzoni and Lund (2002), Table IV and footnote 10. In particular, the parameters of the stochastic volatility process in (51) are set so that $\kappa = 0.50$, $\theta = 0.04$, and $\sigma = 0.20$.[9] The current variance rate $v_t$ is chosen such that the current stock volatility is 0.20; i.e. $\sqrt{v_t} = 0.20$ and, hence, $v_t = 0.04$. Finally, the price on stock price risk is $\lambda_s = 0.80$. This implies that the expected excess return on stocks is $\lambda_s v_t = 0.80 \cdot 0.04 = 3.2\%$ at the current volatility level. The price on volatility risk is set at $\lambda_v = \rho\lambda_s$, which in our examples implies that there is no speculative demand for bearing volatility risk.

[9] In fact, Andersen, Benzoni and Lund (2002) present a higher estimate of the mean-reversion parameter, $\kappa = 3.2508$. We have chosen to use a slower rate of mean-reversion mainly to illustrate the possibility of longer horizon asset allocation effects in the model. Thus, using a parameter value of $\kappa = 3.2508$ would almost eliminate the 5 year, 15 year, and 35 year higher stock allocations in Table 1 and Table 2.

It is a well-known fact that the correlation between stock prices and stock price volatility is usually estimated to be negative; in fact, this phenomenon is often referred to as the "leverage effect." Below we present optimal asset allocation choices for two values of the correlation coefficient between the stock price and the volatility. First we consider the case of perfect negative correlation, $\rho = -1$. Then we consider the case where the correlation is set at $\rho = -0.60$.[10]

[10] This specific parameter value matches the estimate of $\rho = -0.5877$ presented by Andersen, Benzoni and Lund (2002).

In Table 1, we have tabulated the optimal stock proportion for investors with different degrees of constant relative risk aversion and time horizons in the case of perfectly negative correlation between the stock and the volatility process ($\rho = -1$). In this case, the investor only needs to consider how much to invest in stocks in order to implement the optimal investment strategy, and the residual is invested in the bank account or, equivalently, in bonds (since interest rates are assumed constant). The optimal stock proportions in Table 1 are obtained by inserting the relevant parameter values in the expression for the optimal investment strategy in (62). The results indicate that investors with longer investment horizons should optimally invest a higher fraction of wealth in stocks. This result is similar to results presented by Wachter (2002) in a similar setting where the excess return of stocks follows a mean-reverting process which is perfectly negatively correlated with the stock price. While the excess return in the present context (i.e. $\lambda_s v_t$) follows a CIR square-root process, Wachter (2002) assumes that the excess return follows an Ornstein-Uhlenbeck process. Moreover, Wachter (2002) assumes constant volatilities. However, conceptually similar to the insight of Wachter (2002), the higher stock proportions for long horizon investors are due to the mean-reversion in stock prices that the negative correlation between excess returns and stock price movement induces. Similar to the complete market analysis in Wachter (2002), the asset allocation results with utility from intermediate consumption in Table 1 are basically obtained as a "duration" weighted average over similar strategies for investors with utility from terminal wealth only, as formalized in our setting by the expression for $h(v_t,t,s)$ in (59).

Table 1: Stock investment in perfect negative correlation case (ρ = −1)

Panel A: Utility from terminal wealth only

                        Relative Risk Aversion
Investment Horizon    γ=1      γ=2      γ=4      γ=8
one month            80.0%    40.1%    20.1%    10.1%
5 year               80.0%    43.3%    22.7%    11.6%
15 year              80.0%    43.8%    23.1%    11.9%
25 year              80.0%    43.8%    23.1%    11.9%

Panel B: Utility from consumption and terminal wealth

                        Relative Risk Aversion
Investment Horizon    γ=1      γ=2      γ=4      γ=8
one month            80.0%    40.1%    20.1%    10.1%
5 year               80.0%    42.4%    21.8%    11.1%
15 year              80.0%    43.1%    22.5%    11.5%
25 year              80.0%    43.3%    22.7%    11.6%
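As an illustration of how the Panel A entries arise, the following sketch evaluates (62) with $h = b^{(\gamma)}(T-t)$ from (60), using the parameter values quoted in this subsection and $\lambda_v = -\lambda_s$ for $\rho = -1$. It is a reconstruction under those stated inputs, not the authors' code; the function names are hypothetical, and the constant $\sqrt{\kappa^{*2} + 2\sigma^{*2}}$ is called `g` to keep it apart from the risk-aversion coefficient.

```python
import math

# Parameter values quoted in Section 9.4.4.
KAPPA, SIGMA = 0.50, 0.20   # mean reversion and volatility parameter in (51)
LAMBDA_S = 0.80             # price of stock price risk
LAMBDA_V = -LAMBDA_S        # rho = -1, so lambda_v = rho * lambda_s


def b_gamma(tau, risk_aversion):
    """Return (b(tau), k1) for the perfectly negatively correlated case."""
    k1 = (1.0 / (2.0 * risk_aversion)) * (1.0 - 1.0 / risk_aversion) * LAMBDA_S**2
    kappa_star = KAPPA + (1.0 - 1.0 / risk_aversion) * SIGMA * LAMBDA_V
    sigma_star_sq = k1 * SIGMA**2  # (sigma*)^2 = k1 * sigma^2
    g = math.sqrt(kappa_star**2 + 2.0 * sigma_star_sq)
    e = math.expm1(g * tau)
    return 2.0 * e / ((g + kappa_star) * e + 2.0 * g), k1


def stock_weight(horizon_years, risk_aversion):
    """Optimal stock proportion from (62) with terminal-wealth utility, h = b(T - t)."""
    b, k1 = b_gamma(horizon_years, risk_aversion)
    return LAMBDA_S / risk_aversion + b * k1 * SIGMA


if __name__ == "__main__":
    horizons = [("one month", 1.0 / 12.0), ("5 year", 5.0), ("15 year", 15.0), ("25 year", 25.0)]
    for label, horizon in horizons:
        row = "  ".join(f"{100.0 * stock_weight(horizon, ra):5.1f}%" for ra in (1, 2, 4, 8))
        print(f"{label:>10}  {row}")
```

For the log investor ($\gamma = 1$) the constant $k_1$ is zero, so the hedge term drops out and the weight is $\lambda_s = 80\%$ at every horizon. Panel B would additionally require the weights $q_t^s$ in (59) and is not attempted here.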

In Table 2, we have tabulated similar optimal investment strategies for investors with different degrees of constant relative risk aversion and time horizons in the case of less than perfectly negative correlation between the stock and the volatility process ($\rho = -0.60$). In this case, we also allow the investor to invest in a derivative on the volatility/excess return state variable, and the optimal asset allocations are in this case obtained by inserting the relevant parameter values in the expression for the optimal investment strategy in (57). In particular, in our numerical example the investor is allowed to invest in a straddle written on the stock and priced according to the Heston (1993) expressions.[11] The straddle expires in one month and the exercise price in the straddle position is set at 1.0075 times the current stock price such that the straddle is currently insensitive to changes in the underlying stock price (i.e. $\partial C/\partial S = 0$). The straddle is thus designed to have maximal correlation with, and be a relevant instrument to hedge changes in, the volatility/excess return state variable $v_t$ in the economy considered.

[11] A straddle is a combination of a bought call option and a bought put option written on the stock and with the same exercise prices and maturities. The price on the straddle is obtained using the expression for the call option price in (52) and the put-call parity.

Table 2: Asset allocation choices under stochastic volatility/excess returns

Panel A: Utility from terminal wealth only

                 Relative Risk Aversion
               γ=1      γ=2      γ=4      γ=8
Stock         80.0%    40.0%    20.0%    10.0%
Bank/Bonds    20.0%    60.3%    80.2%    90.2%
Straddle       0.0%    −0.3%    −0.2%    −0.2%

Panel B: Utility from consumption and terminal wealth

                 Relative Risk Aversion
               γ=1      γ=2      γ=4      γ=8
Stock         80.0%    40.0%    20.0%    10.0%
Bank/Bonds    20.0%    60.3%    80.2%    90.1%
Straddle       0.0%    −0.3%    −0.2%    −0.1%

The results tabulated in Table 2 are obtained for investors having an investment horizon of 15 years. However, the results vary little with the horizon since the optimal stock allocation is the same for all investment horizons. Thus, only the allocation into the straddle position reflects potential horizon effects (which are also reflected in the residually determined bank account or, equivalently, bond position). For all investment horizons, the proportion of wealth invested in the straddle in order to hedge changes in volatility/excess return risk is numerically small. In the case of perfectly negative correlation in Table 1, this hedge was reflected in the higher stock proportions for longer-term investors. The straddle is intuitively a more powerful instrument and, thus, the similar hedge can be accomplished using relatively small straddle positions. Also note that the correlation between the stock and the straddle (which is designed to have perfect positive correlation with $v_t$) is negative in the example. Therefore, while the hedge in Table 1 is accomplished by investing an additional proportion of wealth in stocks, the hedge position in Table 2 involves a short straddle position.

9.5 Extensions

9.5.1 HARA utility

Let us describe how the general result for CRRA utility functions in the previous section can be generalized to HARA utility functions of the form

$$U_1(c,t) = \varepsilon_1 e^{-\beta t}\, \frac{(c - \bar c(t))^{1-\gamma}}{1-\gamma}, \qquad U_2(w) = \varepsilon_2 e^{-\beta T}\, \frac{(w - \bar w)^{1-\gamma}}{1-\gamma}, \tag{63}$$

where $\bar c(t)$ and $\bar w$ are non-negative, non-stochastic real numbers that can be interpreted as the subsistence time $t$ real consumption and terminal wealth level, respectively. Defining $\hat c_t = c_t - \bar c(t)$ and $\hat w = w - \bar w$, we can reformulate the static utility maximization problem (11)–(12) for the HARA investor as

$$\sup_{(\hat c, \hat w)} E\left[ \int_0^T e^{-\beta t}\, \frac{(\hat c_t)^{1-\gamma}}{1-\gamma}\, dt + e^{-\beta T}\, \frac{(\hat w)^{1-\gamma}}{1-\gamma} \right], \tag{64}$$

$$\text{s.t.} \quad E\left[ \int_0^T m_t \hat c_t\, dt + m_T \hat w \right] \le w_0 - L_0, \tag{65}$$

where

$$L_t = \frac{1}{m_t}\, E_t\left[ \int_t^T m_s \bar c(s)\, ds + m_T \bar w \right]$$

denotes the costs of meeting the future subsistence consumption and terminal wealth level. Of course, if $w_0 < L_0$, the problem has no solution. If $w_0 > L_0$, the problem is mathematically equivalent to the problem of a CRRA investor. We find that the optimal consumption process is

$$c_t^* = \bar c(t) + \varepsilon_1^{1/\gamma}\, \frac{w_t^* - L_t}{Q_t},$$

while the optimal wealth process is

$$w_t^* = L_t + e^{-\frac{\beta}{\gamma} t}\, Q_t\, m_t^{-1/\gamma}\, \frac{w_0 - L_0}{Q_0},$$

which is obtained by the portfolio

$$x_t^* = \left(1 - \frac{L_t}{w_t^*}\right) (\sigma_{Pt}^\top)^{-1} \left( \frac{1}{\gamma}\lambda_t + \sigma_{Qt} \right) + (\sigma_{Pt}^\top)^{-1} \sigma_{\Pi t}.$$

The value function becomes

$$V_t = \frac{1}{1-\gamma}\, Q_t^\gamma\, (w_t^* - L_t)^{1-\gamma}.$$

9.5.2 Habit formation

The preferences applied above and in most papers on portfolio and consumption choice are additively time separable. In particular, the utility of consumption at one point in time is independent of the consumption level at all other dates. However, it is probably more realistic that individuals develop habits for consumption so that the utility of the consumption level at one day is decreasing in some average of past consumption levels, cf., e.g., Browning (1991). Several papers have shown that the introduction of habit formation can resolve several of the "puzzles" of asset pricing models with a representative agent having time separable utility; see, e.g., Constantinides (1990) and Campbell and Cochrane (1999). A particularly tractable case is that of power-linear habit utility, i.e. the utility of a consumption stream $(c_t)_{t\in[0,T]}$ is of the form

$$\int_0^T e^{-\beta t}\, \frac{(c_t - h_t)^{1-\gamma}}{1-\gamma}\, dt,$$

where $h_t$ is the habit level defined by

$$h_t = h_0 e^{-\beta t} + \alpha \int_0^t e^{-\beta(t-s)} c_s\, ds,$$

i.e. an exponentially weighted average of past consumption. Schroder and Skiadas (2002) demonstrate that the solution of a maximization problem with linear habit utility can be expressed in terms of the solution to a problem with standard time additive utility. Munk (2002) applies this procedure to derive the optimal strategies for an investor with power-linear habit utility in a complete market, but does not explicitly incorporate inflation risk. Adapting that result to our setting with inflation risk, we obtain the optimal strategies given below. The solution will be stated in terms of the processes $F = (F_t)$ and $G = (G_t)$ defined by

$$G_t = E_t\left[ \int_t^T e^{-\frac{\delta}{\gamma}(s-t)} \left(\frac{m_s}{m_t}\right)^{1-\frac{1}{\gamma}} (1 + \alpha F_s)^{1-\frac{1}{\gamma}}\, ds \right],$$

$$F_t = E_t\left[ \int_t^T e^{-(\beta-\alpha)(s-t)}\, \frac{m_s}{m_t}\, ds \right] = \int_t^T e^{-(\beta-\alpha)(s-t)} B_t^s\, ds,$$

where $B_t^s$ is the real price at time $t$ of a zero-coupon real bond paying one consumption unit at time $s$. We can interpret $F_t$ as the real price of a real bond paying a continuous coupon that is exponentially declining over time. Then $h_t F_t$ is the cost of ensuring that future consumption exactly equals the habit level since with $c_s = h_s$ for all $s \ge t$, we have $h_s = e^{-(\beta-\alpha)(s-t)} h_t$. If we write the dynamics of the zero-coupon bond prices as

$$dB_t^s = B_t^s \left[ \left( r_t + (\sigma_t^s)^\top \lambda_t \right) dt + (\sigma_t^s)^\top dz_t \right],$$

the dynamics of $F_t$ becomes

$$dF_t = -1\, dt + F_t \left[ \left( r_t + \sigma_{Ft}^\top \lambda_t \right) dt + \sigma_{Ft}^\top dz_t \right],$$

where

$$\sigma_{Ft} \equiv \frac{\int_t^T e^{-(\beta-\alpha)(s-t)} B_t^s \sigma_t^s\, ds}{\int_t^T e^{-(\beta-\alpha)(s-t)} B_t^s\, ds}.$$

We write the dynamics of $G_t$ as

$$dG_t = G_t \left[ \mu_{Gt}\, dt + \sigma_{Gt}^\top dz_t \right].$$

The optimal consumption process $c^* = (c_t^*)$ is given by

$$c_t^* = h_t^* + (1 + \alpha F_t)^{-1/\gamma}\, \frac{w_t^* - h_t^* F_t}{G_t}.$$

The indirect utility is

$$V_t = \frac{1}{1-\gamma}\, G_t^\gamma\, (w_t^* - h_t^* F_t)^{1-\gamma}.$$
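The interpretation of $h_t F_t$ as the cost of consuming exactly at the habit level rests on the relation $h_s = e^{-(\beta-\alpha)(s-t)} h_t$, which follows because differentiating the definition of $h_t$ gives $dh_t/dt = \alpha c_t - \beta h_t$. The sketch below checks this numerically and evaluates $F_t$ in closed form under the additional assumption of a constant real interest rate $r$, so that $B_t^s = e^{-r(s-t)}$. That flat-rate assumption, the function names, and all parameter values are illustrative and not part of the original text.

```python
import math


def habit_when_consuming_habit(h0, alpha, beta, horizon, steps=20000):
    """Euler scheme for dh/dt = alpha*c - beta*h along the path c = h."""
    dt = horizon / steps
    h = h0
    for _ in range(steps):
        h += (alpha * h - beta * h) * dt
    return h


def coupon_bond_factor(alpha, beta, r, tau):
    """F_t = int_0^tau e^{-(beta - alpha + r) u} du, assuming a flat real rate r."""
    k = beta - alpha + r
    return tau if k == 0.0 else -math.expm1(-k * tau) / k


def coupon_bond_factor_numeric(alpha, beta, r, tau, steps=20000):
    """Midpoint-rule version of the same integral, for comparison."""
    du = tau / steps
    return sum(math.exp(-(beta - alpha + r) * (i + 0.5) * du) for i in range(steps)) * du


if __name__ == "__main__":
    # Assumed illustrative values.
    h0, alpha, beta, r, tau = 1.0, 0.2, 0.5, 0.02, 10.0
    print("Euler habit level :", habit_when_consuming_habit(h0, alpha, beta, tau))
    print("Closed form       :", h0 * math.exp(-(beta - alpha) * tau))
    print("F_t (closed form) :", coupon_bond_factor(alpha, beta, r, tau))
    print("F_t (quadrature)  :", coupon_bond_factor_numeric(alpha, beta, r, tau))
```

Under these assumptions $h_t F_t$ is simply the present value of the declining consumption stream $h_s$, $s \in [t, T]$.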

Finally, the optimal investment strategy is given by the vector

$$x_t^* = \frac{1}{\gamma}\left(1 - \frac{h_t^* F_t}{w_t^*}\right) (\sigma_{Pt}^\top)^{-1} \lambda_t + \left(1 - \frac{h_t^* F_t}{w_t^*}\right) (\sigma_{Pt}^\top)^{-1} \sigma_{Gt} + \frac{h_t^* F_t}{w_t^*}\, (\sigma_{Pt}^\top)^{-1} \sigma_{Ft} + (\sigma_{Pt}^\top)^{-1} \sigma_{\Pi t}.$$

Here $h_t^*$ and $w_t^*$ are the habit level and the real wealth induced by the optimal consumption and investment strategy.

9.5.3 Labor income

So far we have assumed that the only income in the life of the investor is given by the return on her financial investments. Since labor income is the predominant source of income for most individuals, it is extremely important for consumption and investment decisions to take into account the current level of labor income, the drift and riskiness of the income stream, and its correlation with financial asset returns. The introduction of labor income does not dramatically complicate the analysis as long as (1) the labor income stream is spanned by the traded financial assets, i.e. has no other risk components, and (2) the investor is able to borrow using future income as implicit collateral. Suppose for example that the individual receives an exogenously given labor income at the rate $Y_t$, where

$$dY_t = Y_t \left[ \mu_{Yt}\, dt + \sigma_{Yt}^\top dz_t \right],$$

so that the nominal income over the short period $[t, t + dt]$ is $Y_t\, dt$. Since the income stream is fully hedgeable, it can be valued as any financial asset. The time $t$ real value of the income stream $(Y_s)_{s\in[t,T]}$ is therefore

$$l_t = E_t\left[ \int_t^T \frac{m_s}{m_t}\, \frac{Y_s}{\Pi_s}\, ds \right].$$

In this situation, we can think of the agent "selling" his future income at the financial market in exchange for the payment $l_t$ so that he has a total real wealth of $w_t + l_t$ to use for consumption and investments. Consequently, the optimal consumption rate of a CRRA investor in Eq. (22) must be adjusted to

$$c_t^* = \varepsilon_1^{1/\gamma}\, \frac{w_t^* + l_t}{Q_t}.$$

The individual will invest in a financial portfolio such that the riskiness of the total position of financial investments and labor income is similar to the riskiness of the optimal financial portfolio in the absence of labor income. Denoting the percentage volatility of $l_t$ by $\sigma_{lt}$, we arrive at the optimal portfolio

$$x_t^* = \left(1 + \frac{l_t}{w_t^*}\right) (\sigma_{Pt}^\top)^{-1} \left( \frac{1}{\gamma}\lambda_t + \sigma_{Qt} \right) - \frac{l_t}{w_t^*}\, (\sigma_{Pt}^\top)^{-1} \sigma_{lt} + (\sigma_{Pt}^\top)^{-1} \sigma_{\Pi t},$$

which generalizes (23). While the results above certainly provide some intuition on the effects of labor income, the underlying assumptions on the income process are probably not realistic. The labor income of most individuals is not fully hedgeable in the financial markets and, due to moral hazard and adverse selection problems, it may be impossible to borrow against future income so that the individual faces portfolio constraints. However, the optimal consumption and investment choice problem in settings allowing undiversifiable income risk and portfolio constraints can only be completely solved using numerical methods for optimal control. Key papers addressing the implications of labor income on consumption and portfolio choice are Bodie, Merton and Samuelson (1992), Cuoco (1997), Duffie, Fleming, Soner and Zariphopoulou (1997), Koo (1998), and Munk (2000).

9.5.4 Allowing for undiversifiable inflation risk

Brennan and Xia (2002) solve the consumption and investment choice problem of a CRRA investor in a very concrete setting with particularly simple stochastic processes for a stock index, interest rates, the consumer price level, and the expected inflation rate. In particular, they allow for the case where the consumer price level has an undiversifiable risk component so that the market is incomplete. Nevertheless, they obtain closed-form solutions for the optimal strategies both with and without intermediate consumption. This contrasts with the analysis of Liu (1999), who is only able to find closed-form optimal strategies with intermediate consumption if the financial market is complete. A natural question is: Does the extension to undiversifiable inflation risk depend crucially on their specialized setting, or can such a risk component generally be included without significantly complicating the analysis? To investigate this issue, we generalize the complete markets model of Sections 2 and 3 by adding a term with a new Brownian motion to the dynamics of the consumer price level. To be more precise, we replace (1) by

$$d\Pi_t = \Pi_t \left[ \pi_t\, dt + \sigma_{\Pi t}^\top dz_t + \hat\sigma_{\Pi t}\, d\hat z_t \right], \tag{66}$$

where $\hat z$ is a one-dimensional standard Brownian motion independent of $z$. Due to the unhedgeable component of the inflation process, the individual can no longer completely control his real wealth process by appropriate behavior. Hence we can no longer think of the investor choosing the real consumption process and the real terminal wealth as in the formulation (11)–(12), but we have to consider the nominal version (8)–(9) and distinguish between the risks that the individual controls and those he cannot control. From (66) we can write

$$\Pi_s = \Pi_t \exp\left\{ \int_t^s \pi_u\, du - \frac{1}{2}\int_t^s \|\sigma_{\Pi u}\|^2\, du + \int_t^s \sigma_{\Pi u}^\top dz_u \right\} \times \exp\left\{ -\frac{1}{2}\int_t^s \hat\sigma_{\Pi u}^2\, du + \int_t^s \hat\sigma_{\Pi u}\, d\hat z_u \right\} \equiv \Pi_t\, \eta(t,s)\, \hat\eta(t,s). \tag{67}$$

Assuming that $(\sigma_{\Pi t})$ is adapted to the filtration generated by $z$ and $(\hat\sigma_{\Pi t})$ is adapted to the filtration generated by $\hat z$, the random variables $\eta(t,s)$ and $\hat\eta(t,s)$ will be independent. There may be a market price of risk, $\hat\lambda$, associated with the new source of risk, $\hat z$, so that the real pricing kernel satisfies

$$m_s = m_t \exp\left\{ -\int_t^s r_u\, du - \frac{1}{2}\int_t^s \|\lambda_u\|^2\, du - \int_t^s \lambda_u^\top dz_u \right\} \times \exp\left\{ -\frac{1}{2}\int_t^s \hat\lambda_u^2\, du - \int_t^s \hat\lambda_u\, d\hat z_u \right\} \equiv m_t\, \zeta(t,s)\, \hat\zeta(t,s). \tag{68}$$

Since the investor can only vary his nominal consumption rate and terminal wealth level in the space of random variables measurable with respect to the $z$-filtration, we have to be careful when deriving the first-order conditions for the utility maximization problem (8)–(9).

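Applying (67) and (68), we can write the Lagrangian as

$$\mathcal{L} = E\Bigg[ \varepsilon_1 \int_0^T e^{-\beta t}\, \frac{1}{1-\gamma} \left( \frac{C_t}{\Pi_0\, \eta(0,t)\, \hat\eta(0,t)} \right)^{1-\gamma} dt + \varepsilon_2 e^{-\beta T}\, \frac{1}{1-\gamma} \left( \frac{W_T}{\Pi_0\, \eta(0,T)\, \hat\eta(0,T)} \right)^{1-\gamma} - \frac{\psi}{\Pi_0} \left\{ \int_0^T \frac{\zeta(0,t)\hat\zeta(0,t)}{\eta(0,t)\hat\eta(0,t)}\, C_t\, dt + \frac{\zeta(0,T)\hat\zeta(0,T)}{\eta(0,T)\hat\eta(0,T)}\, W_T - W_0 \right\} \Bigg].$$

We cannot maximize this expectation with respect to the filtration generated by both $z$ and $\hat z$ by a state-by-state maximization, since the individual can only control the states in the filtration generated by $z$. By independence, however, we can rewrite the Lagrangian as

$$\mathcal{L} = E\Bigg[ \frac{\varepsilon_1}{1-\gamma} \int_0^T e^{-\beta t}\, C_t^{1-\gamma}\, \Pi_0^{\gamma-1}\, \eta(0,t)^{\gamma-1}\, E[\hat\eta(0,t)^{\gamma-1}]\, dt + \frac{\varepsilon_2}{1-\gamma}\, e^{-\beta T}\, W_T^{1-\gamma}\, \Pi_0^{\gamma-1}\, \eta(0,T)^{\gamma-1}\, E[\hat\eta(0,T)^{\gamma-1}] - \frac{\psi}{\Pi_0} \left\{ \int_0^T \frac{\zeta(0,t)}{\eta(0,t)}\, E\left[\frac{\hat\zeta(0,t)}{\hat\eta(0,t)}\right] C_t\, dt + \frac{\zeta(0,T)}{\eta(0,T)}\, E\left[\frac{\hat\zeta(0,T)}{\hat\eta(0,T)}\right] W_T - W_0 \right\} \Bigg],$$

where the outer expectation is now with respect to the uncertainty that the individual can control so that we can maximize state-by-state as is usually done. Doing that, the first-order conditions become

$$C_t = \frac{\varepsilon_1^{1/\gamma}}{\psi^{1/\gamma}}\, \Pi_0\, e^{-\frac{\beta}{\gamma} t}\, \eta(0,t)\, \zeta(0,t)^{-1/\gamma}\, E[\hat\eta(0,t)^{\gamma-1}]^{1/\gamma}\, E\left[\frac{\hat\zeta(0,t)}{\hat\eta(0,t)}\right]^{-1/\gamma},$$

$$W_T = \frac{\varepsilon_2^{1/\gamma}}{\psi^{1/\gamma}}\, \Pi_0\, e^{-\frac{\beta}{\gamma} T}\, \eta(0,T)\, \zeta(0,T)^{-1/\gamma}\, E[\hat\eta(0,T)^{\gamma-1}]^{1/\gamma}\, E\left[\frac{\hat\zeta(0,T)}{\hat\eta(0,T)}\right]^{-1/\gamma}.$$

Substituting into the budget constraint, we find that

$$\psi^{-1/\gamma} = W_0 / (\Pi_0 \hat Q_0),$$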

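where we have defined

$$\hat Q_t = \varepsilon_1^{1/\gamma} \int_t^T e^{-\frac{\beta}{\gamma}(s-t)}\, E_t\left[\zeta(t,s)^{1-\frac{1}{\gamma}}\right] E_t\left[\frac{\hat\zeta(t,s)}{\hat\eta(t,s)}\right] \Gamma(s)\, ds + \varepsilon_2^{1/\gamma}\, e^{-\frac{\beta}{\gamma}(T-t)}\, E_t\left[\zeta(t,T)^{1-\frac{1}{\gamma}}\right] E_t\left[\frac{\hat\zeta(t,T)}{\hat\eta(t,T)}\right] \Gamma(T)$$

with

$$\Gamma(s) = E\left[\hat\eta(0,s)^{\gamma-1}\right]^{1/\gamma}\, E\left[\frac{\hat\zeta(0,s)}{\hat\eta(0,s)}\right]^{-1/\gamma}, \quad s \in [0,T].$$

The optimal nominal wealth process becomes

$$W_t^* = \frac{W_0}{\hat Q_0}\, e^{-\frac{\beta}{\gamma} t}\, \eta(0,t)\, \zeta(0,t)^{-1/\gamma}\, \hat Q_t. \tag{69}$$

Define

$$q_t^s = E_t\left[\zeta(t,s)^{1-\frac{1}{\gamma}}\right].$$

Since this expectation only involves uncertainty induced by $z$, the dynamics will be of the form

$$dq_t^s = q_t^s \left[ \mu_{qt}^s\, dt + (\sigma_{qt}^s)^\top dz_t \right].$$

Similarly define

$$\hat q_t^s = E_t\left[\frac{\hat\zeta(t,s)}{\hat\eta(t,s)}\right],$$

which only involves uncertainty induced by $\hat z$ so the dynamics will take the form

$$d\hat q_t^s = \hat q_t^s \left[ \hat\mu_{qt}^s\, dt + \hat\sigma_{qt}^s\, d\hat z_t \right].$$

Consequently, the dynamics of $\hat Q_t$ is

$$d\hat Q_t = \hat Q_t \left[ \dots\, dt + \sigma_{Qt}^\top dz_t + \hat\sigma_{Qt}\, d\hat z_t \right],$$

where

$$\sigma_{Qt} = \frac{\varepsilon_1^{1/\gamma} \int_t^T e^{-\frac{\beta}{\gamma}(s-t)}\, \Gamma(s)\, q_t^s \hat q_t^s\, \sigma_{qt}^s\, ds + \varepsilon_2^{1/\gamma}\, e^{-\frac{\beta}{\gamma}(T-t)}\, \Gamma(T)\, q_t^T \hat q_t^T\, \sigma_{qt}^T}{\varepsilon_1^{1/\gamma} \int_t^T e^{-\frac{\beta}{\gamma}(s-t)}\, \Gamma(s)\, q_t^s \hat q_t^s\, ds + \varepsilon_2^{1/\gamma}\, e^{-\frac{\beta}{\gamma}(T-t)}\, \Gamma(T)\, q_t^T \hat q_t^T}, \tag{70}$$

$$\hat\sigma_{Qt} = \frac{\varepsilon_1^{1/\gamma} \int_t^T e^{-\frac{\beta}{\gamma}(s-t)}\, \Gamma(s)\, q_t^s \hat q_t^s\, \hat\sigma_{qt}^s\, ds + \varepsilon_2^{1/\gamma}\, e^{-\frac{\beta}{\gamma}(T-t)}\, \Gamma(T)\, q_t^T \hat q_t^T\, \hat\sigma_{qt}^T}{\varepsilon_1^{1/\gamma} \int_t^T e^{-\frac{\beta}{\gamma}(s-t)}\, \Gamma(s)\, q_t^s \hat q_t^s\, ds + \varepsilon_2^{1/\gamma}\, e^{-\frac{\beta}{\gamma}(T-t)}\, \Gamma(T)\, q_t^T \hat q_t^T}. \tag{71}$$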

Applying Itô's Lemma we find that the dynamics of the optimal wealth process is

$$dW_t^* = \dots\, dt + W_t^* \left[ \frac{d\eta(0,t)}{\eta(0,t)} - \frac{1}{\gamma}\, \frac{d\zeta(0,t)}{\zeta(0,t)} + \frac{d\hat Q_t}{\hat Q_t} \right] = \dots\, dt + W_t^* \left( \frac{1}{\gamma}\lambda_t + \sigma_{Qt} + \sigma_{\Pi t} \right)^\top dz_t + W_t^*\, \hat\sigma_{Qt}\, d\hat z_t. \tag{72}$$

Comparing with the nominal wealth process (7) for a given investment strategy, we see that the optimal choice of consumption and terminal wealth can only be financed with a portfolio of traded securities if $\hat\sigma_{Qt}$ is identically equal to zero. In that case, the portfolio is given by (23), but with $\sigma_{Qt}$ defined in (70), and we have obtained a generalized version of Theorem 1 which encompasses the Brennan and Xia analysis as a very special case. The term $\hat\sigma_{Qt}$ will be zero whenever $\hat q_t^s$ is deterministic for each $s$, i.e. whenever there is no uncertainty about how the expectations

$$E_t\left[\frac{\hat\zeta(t,s)}{\hat\eta(t,s)}\right] = E_t\left[ \exp\left\{ -\frac{1}{2}\int_t^s \left( \hat\lambda_u^2 - \hat\sigma_{\Pi u}^2 \right) du - \int_t^s \left( \hat\lambda_u + \hat\sigma_{\Pi u} \right) d\hat z_u \right\} \right]$$

are to be updated over time. This is satisfied when $\hat\lambda_u$ and $\hat\sigma_{\Pi u}$ are both deterministic functions of time. In the model of Brennan and Xia (2002), it is therefore the assumptions of constant $\hat\lambda$ and $\hat\sigma_\Pi$ (in their notation $-\phi_u$ and $\xi_u$, respectively) that are crucial for obtaining a closed-form solution for the optimal consumption and portfolio strategies in the incomplete market setting, while the other assumptions on the dynamics of rates and prices are not needed in this respect.

9.6 Concluding remarks

In this chapter, we have derived optimal consumption and investment strategies of a CRRA investor in a complete capital market setting, and surveyed related and recent literature on optimal consumption and investment strategies. Our analysis has stressed the risks individuals want to hedge, and how to implement optimal real consumption strategies by investing in nominal securities. In line with results in Munk and Sørensen (2004), we have thus shown that a CRRA investor faced with Gaussian interest rate uncertainty will optimally hedge changes in future interest rates by investing in a real coupon bond with real payments that match the forward expected real consumption pattern. Furthermore, this result has been extended to a case under specialized inflation uncertainty where the same investment strategy applies, but using similar nominal bonds in implementing the optimal hedge strategy. In addition, several extensions of the general modeling framework have been given and discussed, including HARA utility, habit formation, labor income, and non-diversifiable inflation risk.

Appendix: A Leibnitz-type rule for stochastic processes

Lemma 1. Let $Z_t^s$ be a family of stochastic processes so that for each fixed $s \in [0,T]$

$$dZ_t^s = \mu_t^s\, dt + \sigma_t^s\, dz_t, \quad 0 \le t \le s,$$

where $\sigma_t^s$ satisfies

(a) $\int_0^T (\sigma_t^s)^2\, dt < \infty$ for all $s \in [0,T]$,
(b) $\int_0^T \left( \int_t^T \sigma_t^s\, ds \right)^2 dt < \infty$

almost surely. Let $Y_t$ be defined by

$$Y_t = \int_t^T Z_t^s\, ds.$$

Then the dynamics of $Y_t$ are given by

$$dY_t = \left( \int_t^T \mu_t^s\, ds - Z_t^t \right) dt + \left( \int_t^T \sigma_t^s\, ds \right) dz_t.$$
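Proof: The proof is an application of the generalized Fubini-type rule for stochastic processes stated and applied in the Appendix of Heath, Jarrow and Morton (1992). Let $t_0 \le t_1$, then since

$$Z_{t_1}^s = Z_{t_0}^s + \int_{t_0}^{t_1} \mu_t^s\, dt + \int_{t_0}^{t_1} \sigma_t^s\, dz_t, \tag{73}$$

we have

$$\begin{aligned}
Y_{t_1} &= \int_{t_1}^T Z_{t_0}^s\, ds + \int_{t_1}^T \int_{t_0}^{t_1} \mu_t^s\, dt\, ds + \int_{t_1}^T \int_{t_0}^{t_1} \sigma_t^s\, dz_t\, ds \\
&= \int_{t_1}^T Z_{t_0}^s\, ds + \int_{t_0}^{t_1} \int_{t_1}^T \mu_t^s\, ds\, dt + \int_{t_0}^{t_1} \int_{t_1}^T \sigma_t^s\, ds\, dz_t \\
&= Y_{t_0} + \int_{t_0}^{t_1} \int_t^T \mu_t^s\, ds\, dt + \int_{t_0}^{t_1} \int_t^T \sigma_t^s\, ds\, dz_t - \int_{t_0}^{t_1} Z_{t_0}^s\, ds - \int_{t_0}^{t_1} \int_t^{t_1} \mu_t^s\, ds\, dt - \int_{t_0}^{t_1} \int_t^{t_1} \sigma_t^s\, ds\, dz_t \\
&= Y_{t_0} + \int_{t_0}^{t_1} \int_t^T \mu_t^s\, ds\, dt + \int_{t_0}^{t_1} \int_t^T \sigma_t^s\, ds\, dz_t - \int_{t_0}^{t_1} Z_{t_0}^s\, ds - \int_{t_0}^{t_1} \int_{t_0}^{s} \mu_t^s\, dt\, ds - \int_{t_0}^{t_1} \int_{t_0}^{s} \sigma_t^s\, dz_t\, ds \\
&= Y_{t_0} + \int_{t_0}^{t_1} \int_t^T \mu_t^s\, ds\, dt + \int_{t_0}^{t_1} \int_t^T \sigma_t^s\, ds\, dz_t - \int_{t_0}^{t_1} \left( Z_{t_0}^s + \int_{t_0}^{s} \mu_t^s\, dt + \int_{t_0}^{s} \sigma_t^s\, dz_t \right) ds \\
&= Y_{t_0} + \int_{t_0}^{t_1} \int_t^T \mu_t^s\, ds\, dt + \int_{t_0}^{t_1} \int_t^T \sigma_t^s\, ds\, dz_t - \int_{t_0}^{t_1} Z_t^t\, dt
\end{aligned}$$

where the Fubini rule is used in the second and fourth equality while the first equality follows by inserting (73) in the definition of $Y_t$ and, also, the last equality follows by using (73) and the fact that

$$\int_{t_0}^{t_1} Z_t^t\, dt = \int_{t_0}^{t_1} Z_s^s\, ds;$$

the other equalities follow by pure manipulation of the involved expressions. The claim is now established. □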
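In the deterministic special case $\sigma_t^s = 0$, Lemma 1 reduces to the classical Leibniz rule and can be checked numerically. The sketch below is an illustration under assumptions that are not in the text: it takes $Z_t^s = e^{-r(s-t)}$ (a zero-coupon bond price under a constant rate $r$), so that $\mu_t^s = r Z_t^s$ and $Z_t^t = 1$, and compares a finite-difference derivative of $Y_t$ with the drift $\int_t^T \mu_t^s\, ds - Z_t^t$ given by the lemma. All function names and parameter values are hypothetical.

```python
import math


def zero_coupon(t, s, r):
    """Z_t^s = exp(-r (s - t)); assumed deterministic example with constant rate r."""
    return math.exp(-r * (s - t))


def annuity_value(t, horizon_end, r, steps=20000):
    """Y_t = int_t^T Z_t^s ds, evaluated with the midpoint rule."""
    ds = (horizon_end - t) / steps
    return sum(zero_coupon(t, t + (i + 0.5) * ds, r) for i in range(steps)) * ds


def leibniz_drift(t, horizon_end, r, steps=20000):
    """Drift from Lemma 1 with sigma = 0: int_t^T mu_t^s ds - Z_t^t, mu_t^s = r Z_t^s."""
    ds = (horizon_end - t) / steps
    integral = sum(r * zero_coupon(t, t + (i + 0.5) * ds, r) for i in range(steps)) * ds
    return integral - zero_coupon(t, t, r)


def finite_difference_drift(t, horizon_end, r, step=1e-4):
    """Symmetric difference quotient of Y_t with respect to t."""
    up = annuity_value(t + step, horizon_end, r)
    down = annuity_value(t - step, horizon_end, r)
    return (up - down) / (2.0 * step)


if __name__ == "__main__":
    t, horizon_end, r = 2.0, 10.0, 0.03  # assumed illustrative values
    print("Lemma 1 drift      :", leibniz_drift(t, horizon_end, r))
    print("Finite difference  :", finite_difference_drift(t, horizon_end, r))
    print("Closed form        :", -math.exp(-r * (horizon_end - t)))
```

The $-Z_t^t\, dt$ term is the same boundary contribution that produces the $-1\, dt$ term in the dynamics of $F_t$ in Section 9.5.2.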

References:

Amin, K. I., and Jarrow, R. A. (1992) “Pricing Options on Risky Assets in a Stochastic Interest Rate Economy.” Mathematical Finance 2(4): 217–237.
Andersen, T. G., Benzoni, L., and Lund, J. (2002) “An Empirical Investigation of Continuous-Time Models for Equity Returns.” Journal of Finance 57(3): 1239–1284.
Barberis, N. (2000) “Investing for the Long Run when Returns are Predictable.” The Journal of Finance 55: 225–264.
Bodie, Z., Merton, R. C., and Samuelson, W. F. (1992) “Labor Supply Flexibility and Portfolio Choice in a Life Cycle Model.” Journal of Economic Dynamics and Control 16: 427–449.
Brace, A., and Musiela, M. (1994) “A Multifactor Gauss Markov Implementation of Heath, Jarrow, and Morton.” Mathematical Finance 4(3): 259–283.
Breeden, D. T. (1979) “An Intertemporal Asset Pricing Model with Stochastic Consumption and Investment Opportunities.” Journal of Financial Economics 7: 265–296.
Brennan, M. J., and Xia, Y. (2000) “Stochastic Interest Rates and the Bond-Stock Mix.” European Finance Review 4(2): 197–210.
Brennan, M. J., and Xia, Y. (2002) “Dynamic Asset Allocation under Inflation.” The Journal of Finance 57(3): 1201–1238.
Browning, M. (1991) “A Simple Nonadditive Preference Structure for Models of Household Behavior over Time.” Journal of Political Economy 99(3): 607–637.
Campbell, J. Y., and Cochrane, J. H. (1999) “By Force of Habit: A Consumption-Based Explanation of Aggregate Stock Market Behavior.” Journal of Political Economy 107: 205–251.
Campbell, J. Y., and Viceira, L. M. (2001) “Who Should Buy Long-Term Bonds?” American Economic Review 91(1): 99–127.
Canner, N., Mankiw, N. G., and Weil, D. N. (1997) “An Asset Allocation Puzzle.” American Economic Review 87(1): 181–191.
Carverhill, A. (1994) “When is the Short Rate Markovian?” Mathematical Finance 4(4): 305–312.
Chacko, G., and Viceira, L. M. (2005) “Dynamic Consumption and Portfolio Choice with Stochastic Volatility in Incomplete Markets.” Review of Financial Studies 18(4): 1369–1402.
Cochrane, J. H. (2001) Asset Pricing. Princeton University Press.
Constantinides, G. M. (1990) “Habit Formation: A Resolution of the Equity Premium Puzzle.” Journal of Political Economy 98: 519–543.
Cox, J. C., and Huang, C.-F. (1989) “Optimal Consumption and Portfolio Policies when Asset Prices Follow a Diffusion Process.” Journal of Economic Theory 49: 33–83.
Cox, J. C., and Huang, C.-F. (1991) “A Variational Problem Arising in Financial Economics.” Journal of Mathematical Economics 20: 465–487.
Cox, J. C., Ingersoll, J. E. Jr., and Ross, S. A. (1985) “A Theory of the Term Structure of Interest Rates.” Econometrica 53(2): 385–407.
Cuoco, D. (1997) “Optimal Consumption and Equilibrium Prices with Portfolio Constraints and Stochastic Income.” Journal of Economic Theory 71(1): 33–73.
Cuoco, D., and Liu, H. (2000) “Optimal Consumption of a Divisible Durable Good.” Journal of Economic Dynamics and Control 24(4): 561–613.
Cvitanić, J., and Karatzas, I. (1992) “Convex Duality in Constrained Portfolio Optimization.” The Annals of Applied Probability 2(4): 767–818.
Damgaard, A., Fuglsbjerg, B., and Munk, C. (2003) “Optimal Consumption and Investment Strategies with a Perishable and an Indivisible Durable Consumption Good.” Journal of Economic Dynamics and Control 28(2): 209–253.
Duffie, D. (2001) Dynamic Asset Pricing Theory (Third ed.). Princeton University Press.
Duffie, D., Fleming, W., Soner, H. M., and Zariphopoulou, T. (1997) “Hedging in Incomplete Markets with HARA Utility.” Journal of Economic Dynamics and Control 21(4–5): 753–782.
Duffie, D., and Kan, R. (1996) “A Yield-Factor Model of Interest Rates.” Mathematical Finance 6(4): 379–406.
Geman, H. (1989) The Importance of the Forward Neutral Probability in a Stochastic Approach of Interest Rates. Working paper, ESSEC.
Grasselli, M. (2000) HJB Equations with Stochastic Interest Rates and HARA Utility Functions. Working paper, CREST, Malakoff Cedex, France.
Heston, S. L.
(1993) “A Closed-Form Solution for Options with Stochastic Volatility with Applications to Bond and Currency Options.” Review of Financial Studies 6(2): 327–343. 313

Claus Munk, Carsten Sørensen Heath, D., Jarrow, R., and Morton, A. (1992) “Bond Pricing and the Term Structure of Interest Rates: A New Methodology for Contingent Claims Valuation.” Econometrica 60(1): 77–105. Jamshidian, F. (1987) Pricing of Contingent Claims in the One Factor Term Structure Model. Working paper, Merrill Lynch Capital Markets. Karatzas, I., Lehoczky, J. P., and Shreve, S. E. (1987) “Optimal Portfolio and Consumption Decisions for a “Small Investor” on a Finite Horizon.” SIAM Journal on Control and Optimization 25(6): 1557– 1586. Karatzas, I., and Shreve, S. E. (1998) Methods of Mathematical Finance, Volume 39 of Applications of Mathematics. New York: Springer-Verlag. Kim, T. S., and Omberg, E. (1996) “Dynamic Nonmyopic Portfolio Behavior.” The Review of Financial Studies 9(1): 141–161. Koo, H.K. (1998) “Consumption and Portfolio Selection with Labor Income: A Continuous Time Approach.” Mathematical Finance 8(1): 49–65. Leippold, M., and Wu, L. (2000) Quadratic Term Structure Models. Working paper, University of St. Gallen and Fordham University. Liu, J. (1999) Portfolio Selection in Stochastic Environments. Working paper, Stanford University. Liu, J., and Pan, J. (2003) “Dynamic Derivative Strategies.” Journal of Financial Economics 49(3): 401–430. Merton, R. C. (1969) “Lifetime Portfolio Selection Under Uncertainty: The Continuous-Time Case.” Review of Economics and Statistics 51: 247–257. Reprinted as Chapter 4 in Merton (1992). Merton, R. C. (1971) “Optimum Consumption and Portfolio Rules in a Continuous-Time Model.” Journal of Economic Theory 3: 373–413. Erratum: Merton (1973a) Reprinted as Chapter 5 in Merton (1992). Merton, R. C. (1973a) “Erratum.” Journal of Economic Theory 6: 213–214. Merton, R. C. (1973b) “An Intertemporal Capital Asset Pricing Model.” Econometrica 41(5): 867–887. Reprinted in an extended form as Chapter 15 in Merton (1992). Merton, R. C. (1992). Basil Blackwell Inc. 314

Continuous-Time Finance.

Padstow, UK:

Optimal Consumption and Investment Strategies Munk, C. (2000) “Optimal Consumption-Investment Policies with Undiversiﬁable Income Risk and Liquidity Constraints.” Journal of Economic Dynamics and Control 24(9): 1315–1343. Munk, C. (2002) Portfolio and Consumption Choice with Stochastic Investment Opportunities and Habit Formation in Preferences. Working paper, University of Southern Denmark. Munk, C., and Sørensen, C. (1999) Optimal Investment Strategies with a Heath-Jarrow-Morton Term Structure of Interest Rates. Working paper, University of Southern Denmark at Odense and Copenhagen Business School. Munk, C., and Sørensen, C. (2004) “Optimal Consumption and Investment Strategies with Stochastic Interest Rates.” Journal of Banking and Finance 28(8): 1987–2013. Munk, C., Sørensen, C., and Vinther, T. N. (2004) “Dynamic Asset Allocation Under Mean-Reverting Returns, Stochastic Interest Rates and Inﬂation Uncertainty: Are Popular Recommendations Consistent with Rational Behavior?” International Review of Economics and Finance 13(2): 141–166. Ocone, D. L., and Karatzas, I. (1991) “A Generalized Clark Representation Formula, with Application to Optimal Portfolios.” Stochastics and Stochastic Reports 34: 187–220. Pearson, N. D., and Sun, T.-S. (1994) “Exploiting the Conditional Density in Estimating the Term Structure: An Application to the Cox, Ingersoll, and Ross Model.” Journal of Finance 49(4): 1279-1304. Pliska, S. R. (1986) “A Stochastic Calculus Model of Continuous Trading: Optimal Portfolios.” Mathematics of Operations Research 11(2): 371–382. Samuelson, P. A. (1969) “Lifetime Portfolio Selection by Dynamic Stochastic Programming.” Review of Economics and Statistics 51: 239–246. Schroder, M., and Skiadas, C. (2002) “An Isomorphism between Asset Pricing Models with and without Linear Habit Formation.” Review of Financial Studies 15(4): 1189–1221. Sørensen, C. 
(1999) “Dynamic Asset Allocation and Fixed Income Management.” Journal of Financial and Quantitative Analysis 34: 513–531. Vasicek, O. (1977) “An Equilibrium Characterization of the Term Structure.” Journal of Financial Economics 5: 177–188. 315

Claus Munk, Carsten Sørensen Wachter, J. A. (2002) “Portfolio and Consumption Decisions under Mean-Reverting Returns: An Exact Solution for Complete Markets.” Journal of Financial and Quantitative Analysis 37(1): 63–91. Xia, Y. (2001) Long Term Bond Markets and Investor Welfare. Working paper, The Wharton School.

316

Chapter 10 Differential Systems in Finance and Life Insurance

Mogens Steffensen
Department of Applied Mathematics and Statistics, Institute for Mathematical Sciences, University of Copenhagen

10.1 Introduction

The mathematics of finance and the mathematics of life insurance have always intersected. Life insurance contracts specify an exchange of streams of payments between the insurance company and the contract holder. These payment streams may cover the lifetime of the contract holder. Therefore, the time value of money is crucial for any measurement of payments due in the past as well as in the future. Life insurance companies never put their money under the pillow, and accumulation and distribution of capital gains have always been part of the insurance business. With respect to the future, appropriate discounting of contractual obligations improves the estimates of liabilities. Financial contracts specify an exchange of streams of payments as well. However, while the life insurance payment stream is partly linked to the state of health of the insured, the financial payment stream is linked to the 'state of health' of an enterprise. That could be the stream of dividends distributed to the owners of the enterprise or the stream of claims contingent on the price of the enterprise paid to the holder of a so-called derivative. The discipline of personal finance is particularly closely linked to life insurance. Decisions on, e.g., consumption, investment, retirement, and insurance coverage are among the most substantial lifetime financial decisions of an individual.

Valuation of payment streams is probably the most important discipline in the intersection between finance and life insurance. Various valuation dogmas are in play here. The principle of no arbitrage and the market efficiency assumption are taken as given in the majority of modern academic approaches to valuation of financial contracts. Life insurance contract valuation typically relies on independence, or at least asymptotic independence, between insured lives. Then the law of large numbers ensures that reasonable estimates can be found if the portfolio of insurance contracts is sufficiently large. Both dogmas reduce the valuation problem to being primarily a matter of calculating conditional expected values.

Conditional expected values can be approached by several different techniques. Monte Carlo simulation, for instance, exploits the property that conditional expected values can be approximated by empirical means. Sometimes, however, one can go at least part of the way by explicit calculations, for example, when a series of auxiliary models with explicit expected values converges towards the real model in such a way that the series of explicit expected values converges to the desired quantity. A different route can be taken when the underlying stochastic system is Markovian, i.e., if given the present state, the future is independent of the past. Then solutions to certain systems of deterministic differential equations can often be proved to characterize the conditional expected values. This is the route taken to various valuation problems and optimization problems in finance and life insurance in this exposition. Here, we just state the differential equations, but do not discuss possible numerical solutions to them.

Valuation is performed by calculation of conditional expected values. However, the claim to be valued may contain decision processes, in which case the valuation problem is extended to a matter of calculating extrema of conditional expected values. The extrema are taken over the set of admissible decision processes. Extrema of conditional expected values can also be characterized by differential equations, albeit more involved ones. Decision problems that are not part of a valuation problem are also relevant and are studied here. We solve both a problem of minimizing expected quadratic disutility and a problem of maximizing expected power utility. In both cases, we state differential equations characterizing the solutions. Actually, from a technical point of view, valuation under decision making and utility optimization basically differ only in that the first measures streams of payments and the second measures streams of utility of payments. Even from a qualitative point of view the disciplines are closely related, e.g. in the valuation approach called utility indifference pricing, which we shall not deal with here, though.

The models used in this article combine the geometric Brownian motion modelling of financial assets with the finite state Markov chain modelling of the state of a life insurance policy. However, the finite state Markov chain model appears in finance in other connections than life insurance. Therefore the stated differential equations apply to other fields of finance. One example is reduced form modelling of credit risk where the 'state of health', or in this connection creditworthiness, of an enterprise can be modelled by a Markov chain. Another example is valuation of innovative enterprise pipelines. Many types of innovative projects may be modelled by a finite state Markov chain. In e.g. drug development, the drug candidate can be in different states (phases) and certain milestone payments are connected to certain states of the drug candidate.

The list of discoverers in the field of Markov processes and systems of partial differential equations is awe-inspiring: Feller, Kolmogorov and Dynkin are the fathers of the connection between Markov processes and mathematical analysis. After them, contributions by Feynman, Kac, Davis, Bensoussan and Lions, among others, are relevant in the context of this article. However, we concentrate on a few references on more recent applications related to the material of this article and enclose a sectionwise outline.

Section 2: Thiele wrote down in 1875 an ordinary differential equation for the reserve of a life insurance contract. Interestingly, Thiele was actually also the first to model the Brownian motion mathematically in connection with his studies of time series, see Thiele (1880). Thiele's work on reserves in life insurance was generalized by Hoem (1969) and further by Norberg (1991). The Nobel Prize-winning work by Black and Scholes (1973) and Merton (1973) gave new insight into the pricing of claims contingent on underlying financial processes. The theory of option pricing has since then turned into one of the larger industries of applied mathematics worldwide. Shortly after, applications to insurance products with contingent claims were suggested by Brennan and Schwartz (1976). The first hybrid between Thiele's and Black and Scholes' differential equations appeared in Aase and Persson (1994). Differential equations for the reserve that connect Hoem (1969) with Aase and Persson (1994) appeared in Steffensen (2000). We state and derive the differential equations of Thiele, Black and Scholes and a particular hybrid equation.

Section 3: Applications to more general life insurance products are based on the notions of surplus and dividend distribution. These were studied by Norberg (1999, 2001), who also valued future dividends by systems of ordinary differential equations. Steffensen (2006b) approached the dividend valuation problem by solving systems of partial differential equations conforming with a particular specification of the underlying financial market. We state the partial differential equation studied in Steffensen (2006b), including a particular case with a semi-explicit solution.

Section 4: Contingent claims with early exercise options are connected to the theory of optimal stopping and variational inequalities. Grosen and Jørgensen (2000) realized the connection to surrender options in life insurance. In Steffensen (2002), the connection was generalized to general intervention options and the Markov chain model for the insurance policy. We state and prove the variational inequality for the price of a contingent claim and state the corresponding system for an insurance contract with a surrender option.

Section 5: Optimal arrangement of payment streams in life insurance was first based on the linear regulator. We refer the reader to Fleming and Rishel (1975) for the linear regulator and Cairns (2000) for an overview of its applications to life insurance. The linear regulator was combined with the Markov chain model of an insurance contract in Steffensen (2006a). We state and prove the Bellman equation for the linear regulator, and state the Bellman equation derived in Steffensen (2006a), including an indication of the solution.

Section 6: The more conventional approach to decision making in finance is based on utility optimization, see Korn (1997) and Merton (1990). Merton (1990) approached decision problems in personal finance and introduced uncertainty of lifetimes. A connection to the Markov chain model of an insurance contract was suggested in Steffensen (2004). In Nielsen (2004) a related problem is solved. We state the Bellman equations for the decision problems solved by Merton (1990) and Steffensen (2004), including an indication of the solution. Both Steffensen and Nielsen approach the decision problem of the life insurance company. The methodology used also applies to related decision problems of the policy holder, though. These problems are studied by Kraft and Steffensen (2006), who generalize original results by Richard (1975).

10.2 The differential equations of Thiele and Black-Scholes

10.2.1 Thiele's differential equation

In this section we state and derive the differential equation for the so-called reserves connected to a life insurance contract with deterministic payments. We give a proof for the differential equation that corresponds to the proofs that will appear in the rest of the article. We end the section by considering the stochastic differential equation for the reserve with application to unit-link life insurance. See Hoem (1969) and Norberg (1991) for differential equations for the reserve.

We consider an insurance policy issued at time 0 and terminating at a fixed finite time $n$. There is a finite set of states of the policy, $\mathcal{J} = \{0, \ldots, J\}$. Let $Z(t)$ denote the state of the policy at time $t \in [0, n]$ and let $Z$ be an RCLL process (right-continuous, left limits). By convention, 0 is the initial state, i.e. $Z(0) = 0$. Then also the associated $J$-dimensional counting process $N = (N^k)_{k \in \mathcal{J}}$ is an RCLL process, where $N^k$ counts the number of transitions into state $k$, i.e.

$$N^k(t) = \#\{s \mid s \in (0, t],\ Z(s-) \neq k,\ Z(s) = k\}.$$

The history of the policy up to and including time $t$ is represented by the sigma-algebra $\mathcal{F}^Z(t) = \sigma\{Z(s), s \in [0, t]\}$. The development of the policy is given by the filtration $\mathbf{F}^Z = (\mathcal{F}^Z(t))_{t \in [0, n]}$.

Let $B(t)$ denote the total amount of contractual benefits less premiums payable during the time interval $[0, t]$. We assume that it develops in accordance with the dynamics

$$dB(t) = dB^{Z(t)}(t) + \sum_{k: k \neq Z(t-)} b^{Z(t-)k}(t)\, dN^k(t). \tag{1}$$

Here, $B^j$ is a deterministic and sufficiently regular function specifying payments due during sojourns in state $j$, and $b^{jk}$ is a deterministic and sufficiently regular function specifying payments due upon transition from state $j$ to state $k$. We assume that each $B^j$ decomposes into an absolutely continuous part and a discrete part, i.e.

$$dB^j(t) = b^j(t)\, dt + \Delta B^j(t). \tag{2}$$

Here, $\Delta B^j(t) = B^j(t) - B^j(t-)$, when different from 0, is a jump representing a lump sum payable at time $t$ if the policy is then in state $j$. The set of time points with jumps in $(B^j)_{j \in \mathcal{J}}$ is $\mathcal{D} = \{t_0, t_1, \ldots, t_q\}$ where $0 = t_0 < t_1 < \ldots < t_q = n$.

We assume that $Z$ is a time-continuous Markov process on the state space $\mathcal{J}$. Furthermore, we assume that there exist deterministic and sufficiently regular functions $\mu^{jk}(t)$ such that $N^k$ admits the stochastic intensity process $(\mu^{Z(t-)k}(t))_{t \in [0, n]}$, i.e.

$$M^k(t) = N^k(t) - \int_0^t \mu^{Z(s)k}(s)\, ds$$

constitutes an $\mathbf{F}^Z$-martingale.

[Figure 1: Disability model with mortality, disability, and possible recovery. States: 0 active, 1 disabled, 2 dead.]

Figure 1 illustrates the disability model used to describe a policy on a single life, with payments depending on the state of health of the insured.

We assume that the investment portfolio earns return on investment at a constant interest rate $r$. We use the notation $\int_s^t = \int_{(s,t]}$ throughout and introduce the short-hand notation

$$\int_t^s r = \int_t^s r(\tau)\, d\tau = r(s - t).$$

Throughout we use subscript for partial differentiation, e.g. $V_t^j(t) = \frac{\partial}{\partial t} V^j(t)$.

The insurer needs an estimate of the future obligations stipulated in the contract. The usual approach to such a quantity is to think of the insurer having issued a large number of similar contracts with payment streams linked to independent lives. The law of large numbers then leaves the insurer with a liability per insured that tends to the expected present value of future payments, given the past history of the policy, as the number of policy holders tends to infinity. We say that the valuation technique is based on diversification of risk. The conditional expected present value is called the reserve and appears on the liability side of the insurer's balance sheet. By the Markov assumption the reserve is given by

$$V^{Z(t)}(t) = E\left[\int_t^n e^{-\int_t^s r}\, dB(s) \,\Big|\, Z(t)\right]. \tag{3}$$

We introduce the differential operator $A$, the rate of payments $\beta$, and the updating sum $R$,

$$\begin{aligned}
AV^j(t) &= \sum_{k: k \neq j} \mu^{jk}(t) \left(V^k(t) - V^j(t)\right), \\
\beta^j(t) &= b^j(t) + \sum_{k: k \neq j} \mu^{jk}(t)\, b^{jk}(t), \\
R^j(t) &= \Delta B^j(t) + V^j(t) - V^j(t-).
\end{aligned}$$

We can now present the first differential equation, generally referred to as Thiele's differential equation.

Proposition 1. The statewise reserve defined in (3) is characterized by the following deterministic system of backward ordinary differential equations,

$$0 = V_t^j(t) + AV^j(t) + \beta^j(t) - rV^j(t), \quad t \notin \mathcal{D}, \tag{4a}$$

$$0 = R^j(t), \quad t \in \mathcal{D}, \tag{4b}$$

$$0 = V^j(n). \tag{4c}$$
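The system in Proposition 1 lends itself to a simple numerical illustration. The following is a minimal sketch, not part of the original exposition, that integrates (4a) backwards in time with a classical fourth-order Runge-Kutta scheme for the three-state disability model of Figure 1. It assumes that the only lump sums fall at time n, so that (4b) and (4c) give the starting value V^j(n-) = ΔB^j(n). The function names `thiele_rhs` and `solve_thiele`, the intensities, and the payment coefficients are all hypothetical choices made for illustration.

```python
import math

def thiele_rhs(t, V, r, mu, b, bjk):
    """Return V_t^j from (4a): r*V^j - b^j - sum_k mu^jk * (b^jk + V^k - V^j)."""
    states = range(len(V))
    return [
        r * V[j] - b(j, t)
        - sum(mu(j, k, t) * (bjk(j, k, t) + V[k] - V[j]) for k in states if k != j)
        for j in states
    ]

def solve_thiele(n, r, mu, b, bjk, lump_at_n, steps=2000):
    """Integrate (4a) backwards from V^j(n-) = lump_at_n[j] to t = 0 (classical RK4)."""
    h = n / steps
    V = list(lump_at_n)  # by (4b) and (4c): V^j(n-) = Delta B^j(n)
    for i in range(steps):
        t = n - i * h
        k1 = thiele_rhs(t, V, r, mu, b, bjk)
        k2 = thiele_rhs(t - h / 2, [v - h / 2 * k for v, k in zip(V, k1)], r, mu, b, bjk)
        k3 = thiele_rhs(t - h / 2, [v - h / 2 * k for v, k in zip(V, k2)], r, mu, b, bjk)
        k4 = thiele_rhs(t - h, [v - h * k for v, k in zip(V, k3)], r, mu, b, bjk)
        V = [v - h / 6 * (a + 2 * c + 2 * d + e) for v, a, c, d, e in zip(V, k1, k2, k3, k4)]
    return V

# Hypothetical disability model: 0 = active, 1 = disabled, 2 = dead.
INTENSITY = {(0, 1): 0.02, (1, 0): 0.05, (0, 2): 0.01, (1, 2): 0.03}

def mu(j, k, t):
    """Constant transition intensities (an illustrative assumption)."""
    return INTENSITY.get((j, k), 0.0)

def b(j, t):
    """Premium rate 0.1 while active (a negative benefit), annuity rate 1 while disabled."""
    return {0: -0.1, 1: 1.0}.get(j, 0.0)

def bjk(j, k, t):
    """Death sum of 5 payable upon transition into state 2."""
    return 5.0 if k == 2 else 0.0

reserves = solve_thiele(n=20.0, r=0.03, mu=mu, b=b, bjk=bjk, lump_at_n=[0.0, 0.0, 0.0])
print("V^0(0), V^1(0), V^2(0) =", reserves)
```

With only the mortality intensity switched on and a unit lump sum in state 0 at time n, the same routine reproduces the closed-form pure endowment value exp(-(r + μ)n), which is a convenient sanity check on the sign conventions.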

In most expositions on the subject, (4a) is written as

$$V_t^j(t) = rV^j(t) - b^j(t) - \sum_{k: k \neq j} \mu^{jk}(t)\, R^{jk}(t),$$

with the so-called sum at risk $R^{jk}(t)$ defined by

$$R^{jk}(t) = b^{jk}(t) + V^k(t) - V^j(t).$$

In the succeeding sections, however, it turns out to be convenient to work with the differential operator abbreviation. We choose to do this already at this stage in order to communicate the cross-sectional similarities.

There are several roads leading to (4). We present a proof that shows that any function solving the differential equation (4) actually equals the reserve defined in (3). Such a result shows that (4) is a sufficient condition on $V$ in the sense that the differential equation characterizes the reserve uniquely. Take an arbitrary function $H^j(t)$ solving (4) and consider the process $H^{Z(t)}(t)$. For this process the following line of equalities holds,

$$\begin{aligned}
H^{Z(t)}(t) &= -\int_t^n d\left(e^{-\int_t^s r} H^{Z(s)}(s)\right) \\
&= -\int_t^n e^{-\int_t^s r} \left(-rH^{Z(s)}(s)\, ds + dH^{Z(s)}(s)\right) \\
&= \int_t^n e^{-\int_t^s r} \Big(dB(s) - \sum_{k: k \neq Z(s-)} R_H^{Z(s-)k}(s)\, dM^k(s)\Big) \\
&\quad - \int_t^n e^{-\int_t^s r} \left(H_s^{Z(s)}(s) + AH^{Z(s)}(s) + \beta^{Z(s)}(s) - rH^{Z(s)}(s)\right) ds \\
&\quad - \sum_{s \in (t, n] \cap \mathcal{D}} e^{-\int_t^s r} R_H^{Z(s)}(s) \\
&= \int_t^n e^{-\int_t^s r} \Big(dB(s) - \sum_{k: k \neq Z(s-)} R_H^{Z(s-)k}(s)\, dM^k(s)\Big).
\end{aligned} \tag{5}$$

The first equality uses the terminal condition (4c), and the last uses (4a) and (4b), by which the $ds$-integral and the sum over $\mathcal{D}$ vanish.

Here $R_H^j$ and $R_H^{jk}$ are defined as $R^j$ and $R^{jk}$ with $V$ replaced by $H$. Now, taking conditional expectation on both sides and assuming sufficient integrability, the integral with respect to the martingale vanishes. This leaves us with the conclusion that any solution to (4) equals the reserve,

$$H^{Z(t)}(t) = V^{Z(t)}(t).$$

We end this section by stating the dynamics of the reserve. Using (4) we get the following,

$$\begin{aligned}
dV^{Z(t)}(t) = {} & rV^{Z(t)}(t)\, dt - dB^{Z(t)}(t) - \sum_{k: k \neq Z(t)} \mu^{Z(t)k}(t)\, R^{Z(t)k}(t)\, dt \\
& + \sum_{k: k \neq Z(t-)} \left(V^k(t) - V^{Z(t-)}(t)\right) dN^k(t),
\end{aligned} \tag{6}$$

which is a backward stochastic differential equation. The word backward refers to the fact that the solution is fixed by the terminal condition (4c), i.e. $V^{Z(n)}(n) = 0$. Usually this terminal condition is rewritten by (4b) into $V^j(n-) := \Delta B^j(n)$ where $\Delta B^j(n)$ is a fixed terminal payment. However, one can turn things upside down by taking this terminal condition to be the defining relation of $\Delta B^{Z(n)}(n)$ in terms of $V^{Z(n)}(n-)$, i.e. $\Delta B^{Z(n)}(n) := V^{Z(n)}(n-)$ with $V^{Z(n)}(n-)$ given by (6). Then the terminal condition $V^{Z(n)}(n) = 0$ is fulfilled by construction. We then just need an initial condition on $V$ to consider it as a forward stochastic differential equation. Here, one should take the so-called equivalence relation, $V^0(0-) = 0$, as initial condition. Hereafter, $V^k(t)$ can be taken to be anything and plays the role of initial condition at time $t$ on $V$, given that the policy jumps into state $k$. See also Kraft and Steffensen (2006), who study in further detail the transformation from a backward to a forward differential equation.

The type of life insurance where terminal payments are linked to the development of the policy is, generally speaking, known as unit-link life insurance. The construction described above is indeed a kind of unit-link life insurance with no guarantee in the sense that there are no predefined bounds on $\Delta B^{Z(n)}(n)$. The simplest implementation is obtained by putting $V^k(t) = V^{Z(t-)}(t)$ so that

$$dV^{Z(t)}(t) = rV^{Z(t)}(t)\, dt - dB^{Z(t)}(t) - \sum_{k: k \neq Z(t)} \mu^{Z(t)k}(t)\, b^{Z(t)k}(t)\, dt. \tag{7}$$

This means that the reserve is maintained upon transition and the risk sum $R^{jk}(t)$ reduces to the transition payment $b^{jk}(t)$. Then the reserve is really nothing but an account from which the infinitesimal benefits less premiums $dB^{Z(t)}(t)$ are paid and from which the so-called natural risk premium rate

$$\sum_{k: k \neq Z(t)} \mu^{Z(t)k}(t)\, b^{Z(t)k}(t)$$

is withdrawn to cover the benefits $b^{Z(t)k}(t)$, $k \neq Z(t)$.

10.2.2 Black-Scholes differential equation

In this section we state and prove the differential equation for the value of a financial contract with payments linked to a stock index. See Black and Scholes (1973) and Merton (1973) for the original contributions.

We consider a financial contract issued at time 0 and terminating at a fixed finite time $n$. The payoff from the financial contract is linked to the value of a stock index. Let $X(t)$ denote the stock index at time $t \in [0, n]$. The history of the stock index up to and including time $t$ is represented by the sigma-algebra $\mathcal{F}^X(t) = \sigma\{X(s), s \in [0, t]\}$. The development of the stock index is formalized by the filtration $\mathbf{F}^X = (\mathcal{F}^X(t))_{t \in [0, n]}$.

Let $B(t)$ denote the total amount of contractual payments during the time interval $[0, t]$. We assume that it develops in accordance with the dynamics

$$dB(t) = b(t, X(t))\, dt + \Delta B(t, X(t)), \tag{8}$$

where $b(t, x)$ and $\Delta B(t, x)$ are deterministic and sufficiently regular functions specifying payments if the stock value is $x$ at time $t$. The decomposition of $B$ into an absolutely continuous part and a discrete part conforms with (2). Again, we denote the set of time points with jumps in $B$ by $\mathcal{D} = \{t_0, t_1, \ldots, t_q\}$ where $0 = t_0 < t_1 < \ldots < t_q = n$.

The most classical example of a contractual payment function is the European call option given by the following specification of payment coefficients,

$$b(t, x) = 0, \qquad \Delta B(t, x) = 0, \ t < n, \qquad \Delta B(n, x) = \max(x - K, 0), \tag{9}$$

for some constant $K$.

We assume that $X$ is a time-continuous Markov process on $\mathbb{R}_+$ with continuous paths. Furthermore, we assume that the dynamics of $X$ are given by the stochastic differential equation

$$dX(t) = \alpha X(t)\, dt + \sigma X(t)\, dW(t), \qquad X(0) = x_0,$$

where $W$ is a Wiener process, and $\alpha$ and $\sigma$ are constants. We assume that one may invest in $X$ but, at the same time, a riskfree investment opportunity is available. The riskfree investment opportunity earns return on investment at a constant interest rate $r$, corresponding to the investment portfolio underlying the insurance portfolio in the previous section.

The issuer of the financial contract wishes to calculate the value of the future payments in the contract. The idea of so-called derivative pricing is that the contract value should prevent the contract from imposing arbitrage possibilities, i.e. riskfree capital gains beyond the return rate $r$. The pioneers of modern financial mathematics realized that, in certain financial markets like the one given here, this idea is sufficient to produce the unique value of the financial contract. This contract value equals the conditional expected value,

$$V(t, X(t)) = E^Q\left[\int_t^n e^{-\int_t^s r}\, dB(s) \,\Big|\, X(t)\right], \tag{10}$$

where

$$dX(t) = rX(t)\, dt + \sigma X(t)\, dW^Q(t),$$

with $W^Q$ being a Wiener process under the measure $Q$. The measure $Q$ is called a martingale measure because the discounted stock index $e^{-rt} X(t)$ is a martingale under this measure. This construction ensures that the price preventing arbitrage possibilities can be represented in the form (10). Thus, it is actually just a probability theoretical tool for representation.

We introduce the differential operator $A$, the rate of payments $\beta$, and the updating sum $R$,

$$\begin{aligned}
AV(t, x) &= V_x(t, x)\, rx + \tfrac{1}{2} V_{xx}(t, x)\, \sigma^2 x^2, \\
\beta(t, x) &= b(t, x), \\
R(t, x) &= \Delta B(t, x) + V(t, x) - V(t-, x).
\end{aligned}$$

We can now present the second differential equation.

Proposition 2. The contract value given by (10) is characterized by the following deterministic backward partial differential equation,

$$0 = V_t(t, x) + AV(t, x) + \beta(t, x) - rV(t, x), \quad t \notin \mathcal{D}, \tag{11a}$$

$$0 = R(t, x), \quad t \in \mathcal{D}, \tag{11b}$$

$$0 = V(n, x). \tag{11c}$$
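Proposition 2 can be checked numerically for the European call option (9). The sketch below is an illustration added here rather than the author's method. It evaluates the classical Black-Scholes formula, approximates the conditional expectation (10) by an empirical mean over simulated values of X(n) under Q, and estimates the residual of (11a) by central finite differences. The parameter values and the helper names `bs_call`, `mc_call` and `pde_residual` are assumptions made for the example.

```python
import math
import random

def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def bs_call(t, x, n, K, r, sigma):
    """Black-Scholes formula for V(t, x), t < n, with terminal payment max(x - K, 0)."""
    tau = n - t
    d1 = (math.log(x / K) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return x * norm_cdf(d1) - K * math.exp(-r * tau) * norm_cdf(d2)

def mc_call(t, x, n, K, r, sigma, paths=200_000, seed=0):
    """Empirical mean of the discounted payoff in (10), sampling X(n) exactly under Q."""
    rng = random.Random(seed)
    tau = n - t
    drift = (r - 0.5 * sigma ** 2) * tau
    vol = sigma * math.sqrt(tau)
    total = 0.0
    for _ in range(paths):
        x_n = x * math.exp(drift + vol * rng.gauss(0.0, 1.0))
        total += max(x_n - K, 0.0)
    return math.exp(-r * tau) * total / paths

def pde_residual(V, t, x, r, sigma, dt=1e-4, dx=1e-2):
    """Central-difference estimate of V_t + r x V_x + 0.5 sigma^2 x^2 V_xx - r V."""
    v_t = (V(t + dt, x) - V(t - dt, x)) / (2 * dt)
    v_x = (V(t, x + dx) - V(t, x - dx)) / (2 * dx)
    v_xx = (V(t, x + dx) - 2 * V(t, x) + V(t, x - dx)) / dx ** 2
    return v_t + r * x * v_x + 0.5 * sigma ** 2 * x ** 2 * v_xx - r * V(t, x)

# Illustrative parameter values (assumptions, not taken from the text).
n, K, r, sigma = 1.0, 100.0, 0.05, 0.2
price = lambda t, x: bs_call(t, x, n, K, r, sigma)

print("closed form :", price(0.0, 100.0))
print("Monte Carlo :", mc_call(0.0, 100.0, n, K, r, sigma))
print("PDE residual:", pde_residual(price, 0.5, 110.0, r, sigma))
```

The Monte Carlo estimate carries sampling error of order one over the square root of the number of paths, so it agrees with the closed form only approximately, whereas the finite-difference residual of (11a) should be close to zero up to discretization and rounding error.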

The usual situation in financial expositions is that there are no payments until termination, in which case (11) reduces to

$$0 = V_t(t, x) + AV(t, x) - rV(t, x), \qquad V(n-, x) = \Delta B(n, x),$$

generally referred to as the Black-Scholes equation. For the European call option given by (9), the terminal condition is given by $V(n-, x) = \max(x - K, 0)$. In this case, the system has an explicit solution that is known as the Black-Scholes formula. This can be found in almost any textbook on derivative pricing.

As in the previous section we prove that the differential equation is a sufficient condition on the contract value in the sense that any function solving (11) indeed equals the contract value given by (10). Take an arbitrary function $H$ solving (11) and consider the process $H(t, X(t))$. For this process the following line of equalities holds,

$$\begin{aligned}
H(t, X(t)) &= -\int_t^n d\left(e^{-\int_t^s r} H(s, X(s))\right) \\
&= -\int_t^n e^{-\int_t^s r} \left(-rH(s, X(s))\, ds + dH(s, X(s))\right) \\
&= \int_t^n e^{-\int_t^s r} \left(dB(s) - H_x(s, X(s))\, \sigma X(s)\, dW^Q(s)\right) \\
&\quad - \int_t^n e^{-\int_t^s r} \left(H_s(s, X(s)) + AH(s, X(s)) + \beta(s, X(s)) - rH(s, X(s))\right) ds \\
&\quad - \sum_{s \in (t, n] \cap \mathcal{D}} e^{-\int_t^s r} R_H(s, X(s)) \\
&= \int_t^n e^{-\int_t^s r} \left(dB(s) - H_x(s, X(s))\, \sigma X(s)\, dW^Q(s)\right).
\end{aligned} \tag{12}$$

Now, taking conditional expectation on both sides and assuming sufficient integrability, the integral with respect to the martingale vanishes. This leaves us with

$$H(t, X(t)) = V(t, X(t)).$$

Thus, any function solving (11) equals the contract value, and the differential equation is then a sufficient condition to characterize the contract value.

10.2.3 A hybrid equation

In this section we state the differential equation for the reserve connected to a life insurance contract with payments linked to a stock index. We end the section by considering a stochastic differential equation for the reserve with applications to unit-link life insurance. See Brennan and Schwartz (1976), Aase and Persson (1994), and Steffensen (2000) for the original ideas and the general hybrid equations, respectively.

As in section 10.2.1, we consider an insurance policy issued at time 0 and terminating at a fixed finite time $n$ with a payment stream given by (1). However, instead of letting each $B^j$ and each $b^{jk}$ be deterministic functions of time, we introduce dependence on the stock index as formalized in section 10.2.2. We assume that the accumulated payment process develops in accordance with the dynamics

$$dB(t) = dB^{Z(t)}(t, X(t)) + \sum_{k: k \neq Z(t-)} b^{Z(t-)k}(t, X(t))\, dN^k(t), \tag{13}$$

where

$$dB^j(t, x) = b^j(t, x)\, dt + \Delta B^j(t, x),$$

with sufficiently regular functions $b^{jk}(t, x)$, $b^j(t, x)$, and $\Delta B^j(t, x)$.

As in the previous sections, we are interested in valuation of the future payments in the payment process. The question is now how we should integrate the two approaches to risk pricing presented there. In section 10.2.1, we assumed insured risk to obey the law of large numbers and based the risk valuation on diversification. This left us with a conditional expected present value under the objective probability measure. In section 10.2.2, we based the risk valuation on the no-arbitrage paradigm of derivative pricing. This left us with a conditional expected present value under an artificial measure $Q$ called the martingale measure. Which measure should we now use for valuation of integrated insurance and financial risk in the payment process (13)?

The prevention of arbitrage possibilities is not sufficient to get a unique martingale measure. Instead, this idea leaves us with an infinite set of martingale measures. Of these measures, some can be said to play more important roles than others. Probably the most important role is played by the product measure that combines the objective measure of insurance risk with the martingale measure of financial risk. We denote, with a slight abuse of notation, also this product measure by $Q$. This particular martingale measure appears both in several so-called quadratic hedging approaches and in the theory of asymptotic arbitrage. Typically, this measure is applied for valuation of integrated financial and insurance risk. Here, we simply take this measure as given and proceed.

It should be mentioned that the differential equation below holds for a much larger class of martingale measures in the following sense: Instead of valuing insurance risk under the objective measure, one could change this measure and still have a martingale measure.
However, changing the measure of insurance risk is just a matter of changing the transition intensities for Z. So changing the intensities in the 329

Mogens Steﬀensen formulas below corresponds to picking out an alternative martingale measure to the product measure described in the previous paragraph. We can now deﬁne the reserve by n Z(t) Q − ts r V (t, X (t)) = E e dB (s) Z (t) , X (t) . (14) t

Note here that we choose the term reserve for the hybrid (14) of the reserve given in (3) and the contract value given in (10). This reflects that the reserve (14) typically appears on the liability side of an insurance company's balance sheet. We introduce the differential operator $\mathcal{A}$, the payment rate $\beta$ and the updating sum $R$,

\begin{align}
\mathcal{A}V^j(t,x) &= \sum_{k:\,k \neq j} \mu^{jk}(t)\left( V^k(t,x) - V^j(t,x) \right) + V_x^j(t,x)\, r x + \frac{1}{2} V_{xx}^j(t,x)\, \sigma^2 x^2, \tag{15a} \\
\beta^j(t,x) &= b^j(t,x) + \sum_{k:\,k \neq j} \mu^{jk}(t)\, b^{jk}(t,x), \tag{15b} \\
R^j(t,x) &= \Delta B^j(t,x) + V^j(t,x) - V^j(t-,x). \tag{15c}
\end{align}

We can now present the third differential equation.

Proposition 3. The reserve given by (14) is characterized by the following deterministic system of backward partial differential equations,

\begin{align}
0 &= V_t^j(t,x) + \mathcal{A}V^j(t,x) + \beta^j(t,x) - r V^j(t,x), \quad t \notin \mathcal{D}, \tag{16a} \\
0 &= R^j(t,x), \quad t \in \mathcal{D}, \tag{16b} \\
0 &= V^j(n,x). \tag{16c}
\end{align}

We shall not go through the derivation of the differential equation characterizing the reserve. The recipe and the calculations can be copied from the previous section, but they become messier as the valuation problem expands. It is worthwhile, however, to realize that the differential equation (16) is a true generalization of both (4) and (11). The specialization of (16) into (4) comes from erasing all stock index dependence. The specialization into (11) comes from erasing all state dependence and all payments triggered by transitions of $Z$.
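The two specializations can be written out directly from (15) and (16). The following is a sketch obtained only by applying the two "erasing" operations described above; equations (4) and (11) themselves lie outside this passage, so their exact form is assumed to agree with what is shown here. Only the equations for $t \notin \mathcal{D}$ are displayed.

```latex
% (i) No stock index dependence: V^j, b^j, b^{jk} depend on t only,
%     so V_x^j = V_{xx}^j = 0 and (16a) becomes a system of ODEs in t.
\[
0 = V_t^j(t) + \sum_{k:\,k \neq j} \mu^{jk}(t)\left( b^{jk}(t) + V^k(t) - V^j(t) \right)
    + b^j(t) - r V^j(t), \qquad t \notin \mathcal{D}.
\]

% (ii) No state dependence and no transition payments: drop the index j,
%      the sum over k, and all b^{jk}; (16a) becomes a single PDE in (t, x).
\[
0 = V_t(t,x) + r x\, V_x(t,x) + \tfrac{1}{2}\sigma^2 x^2\, V_{xx}(t,x)
    + b(t,x) - r V(t,x), \qquad t \notin \mathcal{D}.
\]
```

In both cases the conditions (16b) and (16c) carry over unchanged, with the index $j$ or the argument $x$ dropped as appropriate.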

We end this section by studying the special insurance contract introduced at the end of section 10.2.1 in the presence of stock index dependence. The backward stochastic differential equation corresponding to (6), describing the dynamics of the reserve, turns into

\begin{align}
dV^{Z(t)}(t, X(t)) ={}& \left( r V^{Z(t)}(t, X(t)) + (\alpha - r)\, V_x^{Z(t)}(t, X(t))\, X(t) \right) dt \notag \\
& + V_x^{Z(t)}(t, X(t))\, \sigma X(t)\, dW(t) - dB^{Z(t)}(t, X(t)) \notag \\
& - \sum_{k:\,k \neq Z(t)} \mu^{Z(t)k}(t)\, R^{Z(t)k}(t, X(t))\, dt \notag \\
& + \sum_{k:\,k \neq Z(t-)} \left( V^k(t, X(t)) - V^{Z(t-)}(t, X(t)) \right) dN^k(t), \tag{17}
\end{align}

with $R^{jk}(t,x) = b^{jk}(t,x) + V^k(t,x) - V^j(t,x)$. As in section 10.2.1 we let $\Delta B^{Z(n)}(n) := V^{Z(n)}(n-)$ be the defining relation implying that the terminal condition $V^{Z(n)}(n) = 0$ is fulfilled by construction. Furthermore, we assume that from the reserve a proportion $\pi(t)$ is invested in the stock index at time $t$. Then, letting $h$ denote the number of stock indices held at time $t$ and noting that $\pi(t)\, V^{Z(t)}(t, X(t)) = h(t)\, X(t)$, we have that

\[
V_x^{Z(t)}(t, X(t)) = h(t) = \pi(t)\, \frac{V^{Z(t)}(t, X(t))}{X(t)}.
\]

Plugging this relation into (17) gives us a general version of (6). We write down here the special case coming from $V^k(t) = V^{Z(t-)}(t)$, corresponding to (7),

\begin{align*}
dV^{Z(t)}(t, X(t)) ={}& \left( r + \pi(t)(\alpha - r) \right) V^{Z(t)}(t, X(t))\, dt + \sigma \pi(t)\, V^{Z(t)}(t, X(t))\, dW(t) \\
& - dB^{Z(t)}(t, X(t)) - \sum_{k:\,k \neq Z(t)} \mu^{Z(t)k}(t)\, b^{Z(t)k}(t, X(t))\, dt.
\end{align*}

Now, this is an investment account with the proportion $\pi$ invested in the stock index and with a flow of payments corresponding to (7), except for the possibility of stock index dependence in all payments.

10.3 Surplus and dividends

10.3.1 The dynamics of the surplus

In this section we introduce the notion of surplus, which measures the excess of assets over liabilities. We also introduce the notion of dividends, which allows the insured to participate in the performance of the insurance contract. For the succeeding sections, only the process of dividends and the derived dynamics of the surplus are important. See Norberg (1999, 2001) and Steffensen (2006b) for detailed studies of the notions of surplus and dividends.

Life insurance contracts are typically long-term contracts with time horizons up to half a century or more. Calculation of reserves is based on assumptions on interest rates and transition intensities until termination. Two difficulties arise in this connection. First, these are quantities that are difficult to predict even on a shorter-term basis. Second, the policy holder may be interested in participating in returns on risky assets rather than risk-free assets. At the end of section 10.2.3 we gave one approach to the second difficulty: Let the terminal lump sum payment be defined by the terminal value of the reserve. Then the prospective expected value given by (14) can be calculated retrospectively. The unit-linked insurance without a guarantee is hereby constructed.

For various reasons, however, only a few life insurance contracts were constructed like that in the past. Instead the insurer makes a first prudent guess on the future interest rates and transition intensities in order to be able to put up a reserve, knowing quite well that realized returns and transitions will differ. This first guess on interest rates and transition intensities, here denoted by $(r^*, \mu^*)$, is called the first order basis, and gives rise to the first order reserve, $V^*$. The set of payments $B$ settled under the first order basis is called the first order payments or the guaranteed payments. However, the insurer and the policy holder agree that the realized returns and transitions should be reflected in the realized payment stream.
For this reason the insurer adds to the first order payments a dividend payment stream. We denote this payment stream by $D$ and assume that its structure corresponds to the structure of $B$, i.e.

\begin{align}
dD(t) &= dD^{Z(t)}(t) + \sum_{k:\,k \neq Z(t-)} \delta^{Z(t-)k}(t)\, dN^k(t), \tag{18} \\
dD^j(t) &= \delta^j(t)\, dt + \Delta D^j(t). \notag
\end{align}

Here, however, the coefficients of $D$, $\delta^{jk}(t)$, $\delta^j(t)$, and $\Delta D^j(t)$, are not assumed to be deterministic. In contrast, the dividends should reflect realized returns and transitions relative to the first order basis assumptions.

One can now categorize basically all types of life and pension insurance by their specification of $D$. Such a specification includes possible constraints on $D$, the way $D$ is settled, and the way in which $D$ materializes into payments for the policy holder or others. We shall not give a thorough exposition of the various types of life insurance existing but just give a few hints as to what we mean by categorization. When dividends are constrained to be to the benefit of the policy holder, i.e. $D$ is positive and increasing, one speaks of participating or with-profit life insurance. In so-called pension funding there is no such constraint. There, however, the insured himself is often not affected by dividends. Instead, an employer pays or receives dividends. No matter whether dividends affect the insured or his employer, the dividends do not necessarily materialize into cash payments. The insurer may convert them into adjustments to first order payments. Such a conversion is then agreed upon in the contract. In participating life insurance, this adjustment of first order payments is called bonus.

We could continue the categorization of life insurance contracts but we stop here. For all types of contracts, however, the question remains: How should dividends reflect the realized returns and transitions? A natural measure of realized performance is the surplus given by the excess of assets over liabilities.
Assuming that payments are invested in a portfolio with value process $Y$ and that liabilities are measured by the first order reserve, we get the surplus

\[
X(t) = \int_{0-}^{t} \frac{Y(t)}{Y(s)}\, d\bigl( -(B + D)(s) \bigr) - V^{Z(t)*}(t),
\]

where the first part is the total payments in the past accumulated with capital gains from investing in $Y$. Note that $X$ in this section is defined as the surplus, in contrast to the previous section where $X$ was the stock index. We now assume that a proportion of $Y$ given by $\pi(t, X(t))\, X(t) / \bigl( X(t) + V^{Z(t)*}(t) \bigr)$ is invested in a risky asset modelled as in section 10.2.2. Then the dynamics of $Y$ are given by

\[
dY(t) = r Y(t)\, dt + \sigma\, \frac{\pi(t, X(t))\, X(t)}{X(t) + V^{Z(t)*}(t)}\, Y(t)\, dW^Q(t).
\]

Note that we choose to specify the dynamics of $Y$ directly in terms of $W^Q$, the Wiener process under the valuation measure. Deriving the dynamics of $X$, using these dynamics for $Y$, one arrives, after a number of rearrangements and abbreviations, at

\begin{align}
dX(t) &= r X(t)\, dt + \pi(t, X(t))\, \sigma X(t)\, dW^Q(t) + d(C - D)(t), \tag{19} \\
X(0) &= x_0, \notag
\end{align}

where $C$ is a surplus contribution process with a structure corresponding to the structure of $B$ and $D$, i.e.

\begin{align}
dC(t) &= dC^{Z(t)}(t) + \sum_{k:\,k \neq Z(t-)} c^{Z(t-)k}(t)\, dN^k(t), \tag{20} \\
dC^j(t) &= c^j(t)\, dt + \Delta C^j(t). \notag
\end{align}

The dynamics of $X$ show that $\pi$ is actually the proportion of the surplus invested in the risky asset. This is the reason for starting out with the proportion $\pi(t, X(t))\, X(t) / \bigl( X(t) + V^{Z(t)*}(t) \bigr)$. The elements $c^{jk}$, $c^j$, and $\Delta C^j$ of $C$ are deterministic functions. They are, of course, important for a closer study of the elements of the surplus. However, they are not crucial for derivation and comprehension of the formulas in what follows. See also Norberg and Steffensen (2005) for more on surplus dynamics in the case where payments, and not only capital gains, are diffusion processes.

Having introduced the surplus above as a performance measure, a natural next step is to link the dividend payments directly to the surplus, i.e.

\begin{align*}
\delta^j(t) &= \delta^j(t, X(t)), \\
\delta^{jk}(t) &= \delta^{jk}(t, X(t)), \\
\Delta D^j(t) &= \Delta D^j(t, X(t)),
\end{align*}

where we, with a slight misuse of notation, use the same notation for the dividend payments and their functional dependence on $(t, X(t))$. This formalization of dividends would certainly be a way of getting realized returns (in $Y$) and transitions (in $N$) reflected in the dividend payments.
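To make the feedback between surplus and dividends in (19) concrete, the following minimal sketch simulates one surplus path. It rests on simplifying assumptions that are not in the text: a single state (no jumps of $Z$), no lump sums, constant $r$, $\sigma$, $\pi$ and contribution rate $c$, and an Euler-Maruyama discretization. The function name `simulate_surplus` and all parameter values are hypothetical.

```python
import math
import random


def simulate_surplus(x0, r, sigma, pi, c, dividend_rate, horizon, steps, seed=0):
    """Euler-Maruyama path of dX = (r X + c - delta(t, X)) dt + pi sigma X dW^Q.

    Assumption (not from the source): one state, no lump sums, constant
    r, sigma, pi and c. `dividend_rate` is a callable delta(t, x).
    """
    rng = random.Random(seed)
    dt = horizon / steps
    x = x0
    path = [x]
    for i in range(steps):
        t = i * dt
        dw = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment over dt
        drift = r * x + c - dividend_rate(t, x)
        x = x + drift * dt + pi * sigma * x * dw
        path.append(x)
    return path


if __name__ == "__main__":
    # Hypothetical dividend rule: a fixed rate plus a share of the surplus.
    path = simulate_surplus(
        x0=10.0, r=0.03, sigma=0.2, pi=0.5, c=1.0,
        dividend_rate=lambda t, x: 0.5 + 0.1 * x,
        horizon=5.0, steps=5000,
    )
    print(f"surplus after 5 years: {path[-1]:.4f}")
```

With `sigma=0` the recursion reduces to an Euler scheme for a linear ordinary differential equation, which gives a simple way to check the discretization against the closed-form solution.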

We could have introduced other performance measures than the surplus defined above. However, other well-founded performance measures would typically also follow the dynamics given by (19) with appropriate definition of the coefficients in $C$. The formulas derived below would then still hold true. Thus, in this respect, the story about first order quantities and surplus can be seen as just one example of the state process $X$ underlying the dividend payments.

10.3.2 The differential equation for the market reserve

In this section, we state the differential equation for the reserves connected to a life insurance contract with dividend payments linked to the surplus. This formalizes most practical life insurance contracts where dividends are linked to the performance of the insurance contract. Furthermore, for the special case of dividends that are linear in the surplus, we separate variables of the reserves. Thereby one system of partial differential equations is reduced to two systems of ordinary differential equations. See Steffensen (2006b) for further studies on partial differential equations for valuation of surplus-linked dividends.

The insurer is interested in valuation of the total future liabilities. We introduce as reserve the expected present value of future total payments given the past history of the policy. The expectation is taken under the product measure $Q$ introduced in section 10.2.3. Since future payments depend on $(Z(t), X(t))$ only and $(Z(t), X(t))$ is a Markov process, the reserve is given by

\[
V^{Z(t)}(t, X(t)) = E^Q\!\left[ \int_t^n e^{-\int_t^s r}\, d(B + D)(s) \,\middle|\, Z(t), X(t) \right]. \tag{21}
\]

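We introduce the differential operator $\mathcal{A}$, the payment rate $\beta$ and the updating sum $R$,

\begin{align}
\mathcal{A}V^j(t,x) ={}& \sum_{k:\,k \neq j} \mu^{jk}(t)\left( V^k\bigl(t, x + c^{jk}(t) - \delta^{jk}(t,x)\bigr) - V^j(t,x) \right) \notag \\
& + V_x^j(t,x)\left( r x + c^j(t) - \delta^j(t,x) \right) + \frac{1}{2} V_{xx}^j(t,x)\, \pi^2(t,x)\, \sigma^2 x^2, \tag{22a} \\
\beta^j(t,x) ={}& b^j(t) + \delta^j(t,x) + \sum_{k:\,k \neq j} \mu^{jk}(t)\left( b^{jk}(t) + \delta^{jk}(t,x) \right), \tag{22b} \\
R^j(t,x) ={}& \Delta B^j(t) + \Delta D^j(t,x) + V^j\bigl(t, x + \Delta C^j(t) - \Delta D^j(t,x)\bigr) - V^j(t-,x). \tag{22c}
\end{align}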

We are now ready to present the fourth differential equation.

Proposition 4. The reserve given by (21) is characterized by the following deterministic system of backward partial differential equations,

\begin{align}
0 &= V_t^j(t,x) + \mathcal{A}V^j(t,x) + \beta^j(t,x) - r V^j(t,x), \quad t \notin \mathcal{D}, \tag{23a} \\
0 &= R^j(t,x), \quad t \in \mathcal{D}, \tag{23b} \\
0 &= V^j(n,x). \tag{23c}
\end{align}

As in section 10.2.3, we shall not go through the derivation of the differential equation. The calculations are even messier than those leading to the system (16), but the basic ingredients remain the same. However, we explain how (23) generalizes (16) in several respects.

First, compare the differential operators (15a) and (22a). In (15a), the change in the reserve corresponding to a transition from $j$ to $k$ is reflected in the difference $V^k(t,x) - V^j(t,x)$. In this section, a state transition also affects the variable $X$ such that after a jump from $j$ to $k$ at time $t$,

\[
X(t) = X(t-) + c^{jk}(t) - \delta^{jk}(t, X(t-)).
\]

In (22a), this is seen in the change in the reserve by an updating of the variable $x$ accordingly. A similar difference appears between (15c) and (22c). In (15c) the state process $X$ is not affected by a lump sum payment at a deterministic point in time. This leads to a change in the reserve of $V^j(t,x) - V^j(t-,x)$. In this section, a lump sum payment at time $t$ yields

\[
X(t) = X(t-) + \Delta C^j(t) - \Delta D^j(t, X(t-)).
\]

This is then seen in (22c) by an updating of the variable $x$ accordingly.

Second, in (15a) the coefficient on $V_x^j(t,x)$, $rx$, stems from the systematic return rate on investment $rX(t)$. In this section, the systematic rate of increments of $X$, given sojourn in state $j$, equals $rX(t) + c^j(t) - \delta^j(t, X(t))$. This is then reflected in the coefficient on $V_x^j(t,x)$, $rx + c^j(t) - \delta^j(t,x)$.

Finally, we have in this section allowed for a certain proportional investment of the surplus in the risky asset. The volatility term $\pi(t, X(t))\, \sigma X(t)\, dW(t)$ leads to a different coefficient on $V_{xx}^j(t,x)$ in (22a) than in (15a).

Apart from the difference between the differential operators, the systems (23) and (16) are almost identical. In this section, we have added the two payment streams $B$ and $D$, of which only $D$ is linked to $X$. In section 10.2.3, the payment stream $B$ was linked to what $X$ represented there. This is reflected in the corresponding replacement of payments in (15b) and (15c), such that (22b) and (22c) appear.

So far we have just presented the differential equation characterizing the reserve. We have not discussed which functional dependence of dividends on $X$ might be relevant. For such a discussion we need to know the insurer's and the policy holder's agreement on reflection of performance in dividends. In practice, dividends are always increasing in $X$. Then a good performance is shared between the two parties by the insurer paying back part of the surplus as positive dividends. A bad performance is shared between the two parties by the insurer collecting part of the deficit as negative dividends. Since there may be constraints on $D$, e.g., $D$ increasing, these qualitative statements are not necessarily strict, though.

There are only a few examples of a functional dependence that allow for more explicit calculations. Luckily, the most important one allows us to take an important step further. We end this section by specifying a particular functional dependence of dividends on $X$ that allows for more explicit calculations of the reserve. We introduce dividends that are linear in the surplus in the sense that

\begin{align*}
\delta^j(t) &= p^j(t) + q^j(t)\, X(t), \\
\delta^{jk}(t) &= p^{jk}(t) + q^{jk}(t)\, X(t), \\
\Delta D^j(t) &= \Delta P^j(t) + \Delta Q^j(t)\, X(t),
\end{align*}

where $p^j$, $p^{jk}$, $\Delta P^j$, $q^j$, $q^{jk}$, and $\Delta Q^j$ are positive deterministic functions. It is an easy exercise to plug these dividends into the system (23). The next step is then to suggest a useful separation of variables in $V$. Linearity of dividends inspires a guess of the form

\[
V^j(t,x) = f^j(t) + g^j(t)\, x.
\]

Plugging this guess and its derivatives into (23) and collecting all terms including and excluding $x$, respectively, gives us systems of ordinary differential equations for $f$ and $g$. We leave it to the reader to verify that the differential equations covering $f$ and $g$ are similar in structure to (4). This makes further studies, interpretations, and representations possible. In this exposition, we merely note the separation of variables of the reserve function for linear dividends. This separation reduces the system (23) of partial differential equations to two systems of ordinary differential equations characterizing $f$ and $g$. See Steffensen (2006b) for all details.
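The collection of terms can be sketched as follows for $t \notin \mathcal{D}$. This is a worked version of the exercise, obtained by inserting the linear dividends and the guess $V^j(t,x) = f^j(t) + g^j(t)\,x$ into (23a) with (22a) and (22b). It is not taken from the source, and the jump conditions at $t \in \mathcal{D}$ following from (23b) are left out.

```latex
% Since V_{xx}^j = 0, the investment proportion \pi drops out. After a
% transition, V^k(t, x + c^{jk} - \delta^{jk}) = f^k + g^k((1 - q^{jk})x + c^{jk} - p^{jk}).

% Terms proportional to x:
\[
0 = g_t^j(t) - q^j(t)\, g^j(t) + q^j(t)
    + \sum_{k:\,k \neq j} \mu^{jk}(t)\left( q^{jk}(t) + \bigl(1 - q^{jk}(t)\bigr) g^k(t) - g^j(t) \right).
\]

% Terms free of x:
\[
0 = f_t^j(t) - r f^j(t) + b^j(t) + p^j(t) + \bigl( c^j(t) - p^j(t) \bigr) g^j(t)
    + \sum_{k:\,k \neq j} \mu^{jk}(t)\left( b^{jk}(t) + p^{jk}(t)
      + \bigl( c^{jk}(t) - p^{jk}(t) \bigr) g^k(t) + f^k(t) - f^j(t) \right).
\]

% Terminal conditions from (23c):
\[
f^j(n) = 0, \qquad g^j(n) = 0.
\]

% Single-state illustration (no transitions, no lump sums):
\[
g_t(t) = q(t)\bigl( g(t) - 1 \bigr), \quad g(n) = 0
\quad\Longrightarrow\quad
g(t) = 1 - e^{-\int_t^n q}.
\]
```

In the single-state case, $g(t)$ can be read as the fraction of one unit of current surplus that is paid out as dividends before time $n$. The system for $g$ does not involve $f$, so it can be solved first and then inserted into the system for $f$.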

10.4 Intervention

10.4.1 Optimal stopping and early exercise options

In this section, we state and prove the differential equation for the value of a financial contract with payments linked to a stock index and with an early exercise option. The proof shows that the differential equation is sufficient for a characterization of the contract value.

In section 10.2.2, we studied the price of a financial contract where the payment rates and lump sum payments at deterministic points in time were linked to a stock index. Typically, there is the additional feature to such a contract that the contract holder can, at any point in time $t$ until termination, close the contract. He then receives a payoff that depends on the stock value upon closure. This feature is known as the premature or early exercise option, since it gives the contract holder the opportunity to convert future payments into an immediate premature payment.

Recall the payment stream (8) in section 10.2.2. Now assume that, given exercise at time $t$, all future payments are converted into one exercise payment, due at time $t$, and denoted by

\[
\Phi(t) = \Phi(t, X(t)),
\]

where we, with a slight misuse of notation, use $\Phi$ for both the process and its sufficiently regular functional dependence on $(t, X(t))$. We are now interested in calculating the value of the contract. It is possible to give an arbitrage argument for the unique contract value,

\[
V(t, X(t)) = \sup_{\tau \in [t,n]} E^Q\!\left[ \int_t^\tau e^{-\int_t^s r}\, dB(s) + e^{-\int_t^\tau r}\, \Phi(\tau) \,\middle|\, X(t) \right]. \tag{24}
\]

The decision not to exercise prematurely is included in the supremum in (24) by specifying

\[
\Phi(n) = 0 \tag{25}
\]

and by representing the decision not to exercise prematurely by $\tau = n$. Assume that $X$ is modelled as in section 10.2.2 and the market available is as in section 10.2.2. One cannot immediately see from the results in the previous sections how the differential equation from there can be generalized to the situation in this section. For a fixed $\tau$, the valuation problem is the same as in section 10.2.2 with $n$ replaced by $\tau$, but how does the supremum affect the results? Does there still exist a deterministic differential equation characterizing the contract value? We define the differential operator $\mathcal{A}$, the rate of payments $\beta$, and the sum $R$ as in section 10.2.2, and introduce furthermore the sum $\varrho$ by

\[
\varrho(t,x) = \Delta B(t,x) + \Phi(t,x) - V(t-,x).
\]

We can now present the fifth differential equation.

Proposition 5. The contract value given by (24) is characterized by the following deterministic backward partial variational inequality,

\begin{align}
0 &\geq V_t(t,x) + \mathcal{A}V(t,x) + \beta(t,x) - r V(t,x), \quad t \notin \mathcal{D}, \tag{26a} \\
0 &\geq \Phi(t,x) - V(t,x), \quad t \notin \mathcal{D}, \tag{26b} \\
0 &= \left[ V_t(t,x) + \mathcal{A}V(t,x) + \beta(t,x) - r V(t,x) \right] \times \left[ V(t,x) - \Phi(t,x) \right], \quad t \notin \mathcal{D}, \tag{26c} \\
0 &\geq R(t,x), \quad t \in \mathcal{D}, \tag{26d} \\
0 &\geq \varrho(t,x), \quad t \in \mathcal{D}, \tag{26e} \\
0 &= R(t,x)\, \varrho(t,x), \quad t \in \mathcal{D}, \tag{26f} \\
0 &= V(n,x). \tag{26g}
\end{align}

This system should be compared with (11). First, (11a) is replaced by (26a)-(26c). The equation in (11a) turns into an inequality in (26a). An additional inequality (26b) states that the contract value is never smaller than the exercise payoff. This is reasonable, since one of the possible exercise strategies is to exercise immediately and this would give an immediate exercise payoff. The equality (26c) is the mathematical version of the following statement: At any point in the state space $(t,x)$, at least one of the inequalities in (26a) and (26b) must be an equality. Second, (11b) is replaced by (26d)-(26f). The equation in (11b) turns into an inequality in (26d). An additional inequality (26e) states that the contract value just before a time point in $\mathcal{D}$ is at least the lump sum plus the exercise payoff falling due. The equality (26f) states that at least one of the inequalities in (26d) and (26e) must be an equality. Note that (26d)-(26f) can easily be written as

\[
V(t-,x) = \Delta B(t,x) + \max\bigl( V(t,x), \Phi(t,x) \bigr), \quad t \in \mathcal{D}, \tag{27}
\]

while there is no such abbreviation available for (26a)-(26c). However, we choose the version (26d)-(26f) to illustrate the symmetry with (26a)-(26c). The usual situation in financial expositions is that there are no payments until exercise or termination, whichever comes first. In that case $\beta(t,x)$ disappears from (26a)-(26c) and (26d)-(26g) reduce to

\[
V(n-,x) = \Delta B(n,x)
\]

since both $V(n,x)$ and $\Phi(n,x)$ are zero. With this specification, (26) is the variational inequality characterizing the value of a so-called American option.

By the variational inequality (26) one can divide the state space into two regions, possibly intersecting. In the first region, (26a) and (26d) are equalities. This region consists of the states where the optimal stopping strategy for the contract holder is not to stop. In this region, the contract value follows a differential equation as if there were no exercise option. In the second region, (26b) and (26e) are equalities. This region consists of the states where the optimal stopping strategy for the contract holder is to stop. Thus, in this region the value of the contract equals the exercise payoff.

It is possible to show that (26) is a necessary condition on the contract value. However, instead we go directly to verifying that (26) is also a sufficient condition. The proof starts out in the same way as the verification argument in section 10.2.2. Take an arbitrary function $H$ solving (26) and consider the process $H(t, X(t))$. Then we can write, by replacing $n$ by $\tau$ in (12),

\begin{align*}
H(t, X(t)) ={}& e^{-\int_t^\tau r}\, H(\tau, X(\tau)) \\
& + \int_t^\tau e^{-\int_t^s r} \left( dB(s) - H_x(s, X(s))\, \sigma X(s)\, dW^Q(s) \right) \\
& - \int_t^\tau e^{-\int_t^s r} \left( H_s(s, X(s)) + \mathcal{A}H(s, X(s)) + \beta(s, X(s)) - r H(s, X(s)) \right) ds \\
& - \sum_{s \in (t,\tau] \cap \mathcal{D}} e^{-\int_t^s r}\, R^H(s, X(s)).
\end{align*}

Now consider an arbitrary stopping time $\tau$. For this stopping time, we know from (26a), (26b) and (26d) that

\[
H(t, X(t)) \geq \int_t^\tau e^{-\int_t^s r}\, dB(s) + e^{-\int_t^\tau r}\, \Phi(\tau) - \int_t^\tau e^{-\int_t^s r}\, H_x(s, X(s))\, \sigma X(s)\, dW^Q(s).
\]

First, taking conditional expectation, given $X(t)$, on both sides and then taking supremum over $\tau$ gives that

\[
H(t, X(t)) \geq \sup_{\tau \in [t,n]} E^Q\!\left[ \int_t^\tau e^{-\int_t^s r}\, dB(s) + e^{-\int_t^\tau r}\, \Phi(\tau) \,\middle|\, X(t) \right]. \tag{28}
\]

Now consider instead the stopping time defined by

\[
\tau^* = \inf \left\{ s \in [t,n] : H(s, X(s)) = \Phi(s, X(s)) \right\}.
\]

This stopping time is indeed well-defined since, from (25) and (26g), $H(n, X(n)) = \Phi(n, X(n)) = 0$, so that $\tau^*$ occurs no later than $n$. We now know from (26c) and (26f) that

\begin{align*}
0 &= H_s(s, X(s)) + \mathcal{A}H(s, X(s)) + \beta(s, X(s)) - r H(s, X(s)), \quad s \in [t, \tau^*], \\
0 &= R^H(s, X(s)), \quad s \in [t, \tau^*] \cap \mathcal{D},
\end{align*}

such that

\[
H(t, X(t)) = \int_t^{\tau^*} e^{-\int_t^s r}\, dB(s) + e^{-\int_t^{\tau^*} r}\, \Phi(\tau^*) - \int_t^{\tau^*} e^{-\int_t^s r}\, H_x(s, X(s))\, \sigma X(s)\, dW^Q(s).
\]

Taking conditional expectation, given $X(t)$, on both sides and then comparing both sides for all possible stopping times yields the inequality

\[
H(t, X(t)) \leq \sup_{\tau \in [t,n]} E^Q\!\left[ \int_t^\tau e^{-\int_t^s r}\, dB(s) + e^{-\int_t^\tau r}\, \Phi(\tau) \,\middle|\, X(t) \right]. \tag{29}
\]

By (28) and (29), we conclude that

\[
H(t, X(t)) = V(t, X(t)).
\]

Thus, any function solving (26) characterizes the contract value. Note that the proof also produces the optimal exercise strategy. The contract holder should exercise according to the stopping time $\tau^*$. However, in order to know when to exercise, one must be able to calculate the value. Only rarely does the variational inequality (26) have an explicit solution. However, several numerical procedures have been developed for this purpose. One may, e.g., use Monte Carlo techniques, general partial differential equation approximations, or certain specific approximations developed for specific functions $\Phi$.
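As a small numerical illustration of the American option case, the sketch below prices a put on a recombining binomial lattice. A lattice is not one of the procedures named above; it is swapped in here because its backward step, taking the maximum of the continuation value and the exercise payoff at every node, is a discrete counterpart of the two regions described by (26a)-(26c). The function name `lattice_put`, the Cox-Ross-Rubinstein parametrization and all parameter values are assumptions made for the example.

```python
import math


def lattice_put(s0, strike, r, sigma, maturity, steps=500, early_exercise=True):
    """Put value on a Cox-Ross-Rubinstein binomial lattice (illustrative only).

    With early_exercise=True each node takes max(continuation, payoff),
    mirroring "at least one of (26a), (26b) holds with equality".
    """
    dt = maturity / steps
    u = math.exp(sigma * math.sqrt(dt))       # up factor
    d = 1.0 / u                               # down factor
    disc = math.exp(-r * dt)                  # one-step discount factor
    q = (math.exp(r * dt) - d) / (u - d)      # risk-neutral up probability

    # Terminal lump sum: the put payoff at maturity, node j = number of up moves.
    values = [max(strike - s0 * u ** (2 * j - steps), 0.0) for j in range(steps + 1)]

    # Backward induction through the lattice.
    for i in range(steps - 1, -1, -1):
        for j in range(i + 1):
            continuation = disc * (q * values[j + 1] + (1.0 - q) * values[j])
            if early_exercise:
                payoff = max(strike - s0 * u ** (2 * j - i), 0.0)
                values[j] = max(continuation, payoff)
            else:
                values[j] = continuation
    return values[0]


if __name__ == "__main__":
    american = lattice_put(100.0, 100.0, 0.05, 0.2, 1.0)
    european = lattice_put(100.0, 100.0, 0.05, 0.2, 1.0, early_exercise=False)
    print(f"American put: {american:.4f}, European put: {european:.4f}")
```

The difference between the two values is the price of the early exercise option. The nodes where the exercise payoff wins the maximum approximate the stopping region, and the first time the lattice path enters it plays the role of $\tau^*$.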

10.4.2 Intervention options in life and pension insurance

In this section, we state the differential equation for the reserve of a life insurance contract with dividends linked to the surplus and with a surrender option. Furthermore we comment on the generalization to general intervention options. See Grosen and Jørgensen (2000) and Steffensen (2002) for results on surrender options and general intervention options.

As in the previous section, the holder of a life insurance contract can typically also terminate his policy prematurely. The act of terminating a life insurance policy is called surrender, and the exercise option is in this context called a surrender option. We consider the insurance contract described in section 10.3, i.e. a contract with the total accumulated payments given by $B + D$. Assume now that the contract holder can terminate his policy at any point in time. Given that he does so at time $t$, he receives the surrender value

\[
\Phi(t) = \Phi^{Z(t)}(t, X(t)),
\]

for a sufficiently regular function $\Phi^j(t,x)$. Here, we take $X$ to be the surplus process introduced in section 10.3. We are now interested in calculating the value of future payments specified in the policy. We consider the reserve,

\[
V^{Z(t)}(t, X(t)) = \sup_{\tau \in [t,n]} E^Q\!\left[ \int_t^\tau e^{-\int_t^s r}\, d(B + D)(s) + e^{-\int_t^\tau r}\, \Phi(\tau) \,\middle|\, Z(t), X(t) \right], \tag{30}
\]

where $Q$ is the product measure described in section 10.2.3. As in the previous section, one cannot immediately see how the differential equation (23) generalizes to this situation. The results in the previous section indicate, however, that the differential equation can be replaced by a variational inequality. We define the differential operator $\mathcal{A}$, the payment rate $\beta$, and the updating sum $R$ as in section 10.3, and introduce furthermore the sum $\varrho^j$ by

\[
\varrho^j(t,x) = \Delta B^j(t,x) + \Phi^j(t,x) - V^j(t-,x).
\]

We can now present the sixth differential equation.

Proposition 6. The reserve given by (30) is characterized by the following deterministic system of backward partial variational inequalities,

\begin{align*}
0 &\geq V_t^j(t,x) + \mathcal{A}V^j(t,x) + \beta^j(t,x) - r V^j(t,x), \quad t \notin \mathcal{D}, \\
0 &\geq \Phi^j(t,x) - V^j(t,x), \quad t \notin \mathcal{D}, \\
0 &= \left[ V_t^j(t,x) + \mathcal{A}V^j(t,x) + \beta^j(t,x) - r V^j(t,x) \right] \times \left[ V^j(t,x) - \Phi^j(t,x) \right], \quad t \notin \mathcal{D}, \\
0 &\geq R^j(t,x), \quad t \in \mathcal{D}, \\
0 &\geq \varrho^j(t,x), \quad t \in \mathcal{D}, \\
0 &= R^j(t,x)\, \varrho^j(t,x), \quad t \in \mathcal{D}, \\
0 &= V^j(n,x).
\end{align*}

This differential equation can be compared with (23) in the same way as (26) was compared with (11). Its verification goes in the same way as the verification of (26), although it becomes somewhat more involved. We shall not go through this here. As in the previous section, one can now divide the state space into two regions, possibly intersecting. In the first region, the reserve follows a differential equation as if surrender were not possible. This region consists of states from which immediate surrender is suboptimal. In the second region, the reserve equals the surrender value, and this region consists of the states where immediate surrender is optimal. In practice, the surrender value is often given by the first order reserve defined in section 10.3, in the sense that

\[
\Phi^j(t,x) = V^{j*}(t),
\]

and is, thus, not surplus dependent.

The title of this section is Intervention. So far, we have only dealt with stopping of a financial contract in the previous subsection, and stopping of an insurance contract in this subsection. In practice, the insurance policy holder typically holds other options that in some respects are similar in nature to the surrender option but in other respects not. The most important one is the free policy option that allows the policy holder to stop all premium payments but continue the contract in a so-called free policy state. Exercising a free policy option leads to a reduction of the first order benefits that were settled under the assumption of full premium payment. Thus, exercising a free policy option does not stop the insurance policy, which continues under free policy conditions, but stops only the premium payments. Therefore, one should speak of intervention in, rather than stopping of, the insurance policy. Of course, stopping is a special example of intervention.

For a stopping or surrender option, there is always only one control act, namely the act of stopping, since thereafter the contract has expired. Given that the policy has been converted into a free policy, the policy holder may still hold a surrender option. Thus, introducing interventions, the policy holder may choose between different series of interventions. This feature produces technical challenges in the verification of a variational inequality characterizing the reserve. However, the basic structure of the resulting variational inequality remains the same. See Steffensen (2002) for all details.

10.5 Quadratic optimization

10.5.1 Portfolio quadratic optimization of dividends

In this section, we state and prove the differential equation for a value function of an optimization problem where preferences over surplus and dividends are specified by a quadratic disutility function. We speak of the value process as a disutility reserve. The surplus introduced in section 10.3 is here approximated by a considerably simpler process. We also indicate the solution to the differential equation and the optimal dividend strategy. The control problem studied in this section is known as the linear regulator. See Fleming and Rishel (1975) for the linear regulator in general and Cairns (2000) for its applications to life insurance.

In section 10.3, we introduced the notion of surplus. The surplus accumulates a stochastic process of surplus contributions $C$ and capital gains from investment in a Black-Scholes market. From the surplus, redistributions to the policy holders are withdrawn in terms of dividends. We modelled the process of dividends similarly to the underlying payment process $B$ (and the process of surplus contributions $C$). In (23) a deterministic differential equation for the reserve was presented where the coefficients in the dividend process are linked to the surplus. We concluded section 10.3 by proposing dividends to be affine in the surplus. This led to a reserve that is affine in the surplus.

Thus, section 10.3 dealt with valuation of certain dividend plans. The question that we did not address was whether, or rather when, surplus-linked dividends, or dividends affine in the surplus for that matter, are particularly attractive. Questions of that kind appear in the discipline of optimization rather than valuation.

We approximate the surplus by a diffusion process on the basis of the following list of adaptations:

• We assume that the surplus is invested in the risk-free asset exclusively.
• We approximate the process of surplus contributions by a Brownian motion with volatility $\rho$ and drift $c$.
• We assume that accumulated dividends are absolutely continuous and paid out by the rate $\delta$.

These adaptations give us the following surplus dynamics,

\begin{align*}
dX(t) &= r X(t)\, dt + d(C - D)(t), \\
X(0) &= x_0,
\end{align*}

where

\begin{align*}
dC(t) &= c(t)\, dt + \rho(t)\, dW(t), \\
dD(t) &= \delta(t)\, dt.
\end{align*}

We are now interested in deciding on a dividend rate $\delta$ that we prefer over other dividend rates according to some preference criterion. For this purpose, we introduce a process of accumulated disutilities $U$ that is absolutely continuous with disutility rate $u(t, \delta(t), X(t))$, i.e.

\[
dU(t) = u(t, \delta(t), X(t))\, dt.
\]

We now introduce a certain quadratic disutility criterion,

\[
u(t, \delta, x) = p(t)\, (\delta - a(t))^2 + q(t)\, x^2. \tag{31}
\]

This criterion punishes quadratic deviations of the present dividend rate from a dividend target rate $a$ and deviations of the surplus from 0. Such a disutility criterion reflects a trade-off between policy holders preferring stability of dividends, relative to $a$, over non-stability, and the insurance company preferring stability of the surplus relative to 0. The preference over the surplus could be driven by regulatory rules stating that earned surplus contributions should be redistributed upon earning in some sense. The deterministic functions $p$ and $q$ give weights to these preference formalizations.

At time $t$, the future disutilities are measured by their conditional expectation. We define the disutility reserve as the infimum of all such conditional expectations over all admissible dividend payment streams, i.e.,

\[
V(t, X(t)) = \inf_D E\!\left[ \int_t^n dU(s) \,\middle|\, X(t) \right]. \tag{32}
\]

Except for the infimum over $D$, note the similarity with, e.g., (14). The primary difference is that, instead of measuring an expected (present) value of payment rates $\delta$, we now measure an expected disutility function of payment rates, $p(t)(\delta(t) - a(t))^2$. To this we add an expected disutility function of the position of the surplus, $q(t)X(t)^2$. We introduce the differential operator $A$ and the rate of disutilities $\beta$,

$$AV(t, x) = V_x(t, x)\,(rx + c(t) - \delta) + \tfrac{1}{2} V_{xx}(t, x)\,\rho^2,$$
$$\beta(t, x) = u(t, \delta, x).$$

We are ready to present the seventh differential equation, which is a so-called Bellman equation.

Proposition 7. The disutility reserve given by (32) is characterized by the following Bellman equation,

$$0 = V_t(t, x) + \inf_{\delta}\,[AV(t, x) + \beta(t, x)], \tag{33a}$$
$$0 = V(n, x). \tag{33b}$$

An appendix to this differential equation is the specification of the optimal dividend stream, i.e., the dividend stream that actually attains the infimum in the disutility reserve (32). This optimal dividend stream, specified by the optimal rate $\delta^*$, is simply the argument of the infimum in (33a), i.e.,

$$\delta^* = \arg\inf_{\delta}\,[AV(t, x) + \beta(t, x)]. \tag{34}$$
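As a worked step, the infimum in (34) can be evaluated explicitly for the quadratic criterion (31). The short derivation below is a sketch under the assumptions that $p(t) > 0$ and that $V$ is differentiable in $x$; neither assumption is spelled out in the text at this point.

```latex
% Only two terms of AV + \beta depend on \delta:
%   -V_x(t,x)\,\delta  \quad\text{and}\quad  p(t)\,(\delta - a(t))^2 .
\frac{\partial}{\partial \delta}
  \Big[ -V_x(t,x)\,\delta + p(t)\,(\delta - a(t))^2 \Big]
  = -V_x(t,x) + 2\,p(t)\,(\delta - a(t)) = 0
\quad\Longrightarrow\quad
\delta^*(t,x) = a(t) + \frac{V_x(t,x)}{2\,p(t)} .
% The second derivative in \delta is 2p(t) > 0, so this is a minimum.
```

With a value function that is quadratic in $x$, $V_x$ is linear in $x$, so this first-order condition already indicates why the optimal dividend rate turns out to be affine in the surplus.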

It is worthwhile to comment on the connection between (33) and e.g. the variational inequality (26). In (26a)-(26b) and in (26d)-(26e), we had two inequalities, corresponding to two different actions, stopping and not stopping. From (26c) and (26f) one of the inequalities must be an equality. The structure of (33a) is the same in the sense that (33a) represents an infinite set of inequalities, one for each possible dividend rate. However, one of the inequalities must hold with equality. Since for each dividend rate the disutility reserve is described by the same partial differential equation, we can write this in the very compact way (33a). This compact way actually corresponds to the compact writing of (26d)-(26f) in (27).

We now go to the verification of (10.33) being a sufficient condition for characterization of the disutility reserve. We start out in the same way as in section 10.2. Given a function $H(t, x)$ solving (10.33) and an arbitrary dividend strategy $\delta$, we can write, using the terminal condition $H(n, x) = 0$ in the first equality,

$$\begin{aligned}
H(t, X(t)) &= -\int_t^n dH(s, X(s)) \\
&= \int_t^n \big( dU(s) - H_x(s, X(s))\,\rho(s)\,dW(s) \big) \\
&\quad - \int_t^n \big( H_s(s, X(s)) + AH(s, X(s)) + \beta(s, X(s)) \big)\,ds.
\end{aligned} \tag{35}$$

Note that, given sufficient integrability, we could now, by taking conditional expectation on both sides of (35), conclude the following: If the disutility reserve were defined for an exogenously given dividend payment stream, then (10.33) would characterize the disutility reserve with this stream plugged in and without the infimum over $\delta$. This result is obtained by the methodology used in section 10.2.

We now argue how the extremum in (32) simply leads to the extremum in (33a). First, consider an arbitrary strategy $\delta$. For this strategy we know, by (33a), that

$$0 \le H_t(t, X(t)) + AH(t, X(t)) + \beta(t, X(t)),$$

such that, by (35),

$$H(t, X(t)) \le \int_t^n \big( dU(s) - H_x(s, X(s))\,\rho(s)\,dW(s) \big).$$

Now, assuming sufficient integrability, taking conditional expectation on both sides and then taking the infimum over $D$ gives us the inequality

$$H(t, X(t)) \le \inf_{D} E\left[\int_t^n dU(s) \,\Big|\, X(t)\right]. \tag{36}$$

Second, for the specific strategy

$$\delta^* = \arg\inf_{\delta}\,[AH(t, X(t)) + \beta(t, X(t))],$$

we know from (33a) that $H_t(t, X(t)) + AH(t, X(t)) + \beta(t, X(t)) = 0$. Inserting this in (35) yields

$$H(t, X(t)) = \int_t^n \big( dU(s) - H_x(s, X(s))\,\rho(s)\,dW(s) \big).$$

Now, taking conditional expectation on both sides, and noting that the expected disutility under $\delta^*$ cannot be smaller than the infimum over all dividend strategies, yields the inequality

$$H(t, X(t)) \ge \inf_{D} E\left[\int_t^n dU(s) \,\Big|\, X(t)\right]. \tag{37}$$

That $H(t, X(t)) = V(t, X(t))$ now follows from (36) and (37). We shall not go into the methodology of solving (33a), but just state that it actually has a solution in explicit form. The solution is

$$V(t, X(t)) = f(t)\,(X(t) - g(t))^2 + h(t),$$

which is just a certain parametrization of a second order polynomial function in $X(t)$. The functions $f$, $g$, and $h$ are deterministic functions solving certain differential equations. We choose this parametrization in order to write the optimal dividend rate as

$$\delta^*(t, x) = a(t) + \frac{f(t)}{p(t)}\,(x - g(t)),$$

which leads to the following interpretation. First, the dividend rate contains the target rate $a$, taking into consideration the preferences over present dividends. Second, the preferences over the present and future surplus are hidden in an adjustment to this control. This adjustment controls $X$ towards $g$, which can be considered the optimal position for $X$ at time $t$. This adjustment happens with the force $f/p$, which weighs the future preferences over $X$ through $f$ against the present preferences over $\delta$ through $p$. The functions $a$, $p$, and $q$ appear in the differential equations for $f$, $g$, and $h$.
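The text does not state the differential equations for $f$, $g$, and $h$. As a rough illustration only, substituting the quadratic form and $\delta^*$ into (33a) and collecting the terms in $(x - g)^2$ suggests a Riccati equation $f'(t) = f(t)^2/p - 2rf(t) - q$ with $f(n) = 0$; this equation is our own derivation, not a formula quoted from the source. The sketch below integrates it backward with an explicit Euler scheme for constant, hypothetical coefficients, and evaluates the affine dividend rule. The function names and all parameter values are assumptions made for the example.

```python
import math


def solve_f(r, p, q, n, steps):
    """Integrate f' = f^2/p - 2*r*f - q backward from f(n) = 0 (explicit Euler).

    The ODE is an assumption derived by matching (x - g)^2 terms in (33a);
    r, p, q are taken constant purely for illustration.
    Returns the list [f(0), f(dt), ..., f(n)].
    """
    dt = n / steps
    f = [0.0] * (steps + 1)
    for i in range(steps, 0, -1):
        slope = f[i] ** 2 / p - 2.0 * r * f[i] - q
        f[i - 1] = f[i] - dt * slope  # step backward in time
    return f


def steady_state_f(r, p, q):
    """Positive root of f^2/p - 2*r*f - q = 0, the long-horizon limit of f."""
    return p * (r + math.sqrt(r * r + q / p))


def optimal_dividend_rate(a, f_t, p, x, g_t):
    """Affine rule delta* = a + (f/p) * (x - g) from the text."""
    return a + f_t / p * (x - g_t)


if __name__ == "__main__":
    f = solve_f(r=0.03, p=1.0, q=0.5, n=20.0, steps=2000)
    print("f(0) =", f[0], " steady state =", steady_state_f(0.03, 1.0, 0.5))
    print("delta* =", optimal_dividend_rate(a=1.0, f_t=f[0], p=1.0, x=2.0, g_t=0.0))
```

Under these assumptions the force $f/p$ vanishes at maturity and grows with the remaining time horizon, which matches the interpretation of $f$ as carrying the future preferences over the surplus.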

The optimal dividend rate is affine in $X$. So we can conclude that if we redistribute according to the specifications in this section, then it makes sense to work with affine dividend strategies. In general, disutility rates that are functions of the dividend rates and the surplus always lead to optimal dividend rates that are linked to the surplus. This is a consequence of the Markov property. Thus, it does make sense in general to work with the system (10.23).

10.5.2 Statewise quadratic optimization of dividends

In this section, we state the differential equation for the disutility reserve of an optimization problem where preferences over surplus and dividends are specified by a quadratic disutility function. The surplus is modelled as in section 10.3. We also indicate the solution to the differential equation and the optimal dividend strategy. See Steffensen (2006a) for the generalization of the linear regulator to Markov chain driven payments.

In the previous section, we approximated the surplus by a diffusion process and controlled it by an absolutely continuous dividend process $D$. We now take the step back to the original surplus process with dynamics given by (19). Again, however, we skip investment in the risky asset such that (19) reduces to

$$dX(t) = rX(t)\,dt + d(C - D)(t), \qquad X(0) = x_0,$$

where $C$ and $D$ are the contribution and dividend processes given in (20) and (18), respectively. As in the previous section, we now introduce a process $U$ of accumulated disutilities. However, due to the structure of $C$ and $D$, we allow for lump sum disutilities at the discontinuities of $C$ and $D$. Thus, inheriting the structure of the payment processes, $U$ is taken to have the dynamics

$$dU(t) = dU^{Z(t)}(t, \delta(t), \Delta D(t), X(t)) + \sum_{k:\,k \neq Z(t-)} u^{Z(t-)k}\big(t, \delta^k(t), X(t)\big)\,dN^k(t),$$
$$dU^j(t, \delta, \Delta D, x) = u^j(t, \delta, x)\,dt + \Delta U^j(t, \Delta D, x).$$

Inspired by the quadratic disutility functions introduced in the previous section, we form the coefficients in the process of accumulated disutilities $U$ accordingly, i.e.

$$u^j(t, \delta, x) = p^j(t)\big(\delta - a^j(t)\big)^2 + q^j(t)\,x^2,$$
$$u^{jk}(t, \delta^k, x) = p^{jk}(t)\big(\delta^k - a^{jk}(t)\big)^2 + q^{jk}(t)\,x^2,$$
$$\Delta U^j(t, \Delta D, x) = \Delta P^j(t)\big(\Delta D - \Delta A^j(t)\big)^2 + \Delta Q^j(t)\,x^2.$$

These coefficients should be compared with (31). First, there are now three coefficients corresponding to disutility rates, lump sum disutilities upon transitions of $Z$, and lump sum disutilities at deterministic points in time. Second, for each type of dividend payment, we allow the target to be state dependent. Third, the weights on disutility of dividend deviations against disutility of surplus deviations are also allowed to be state dependent. The idea is now, with the generalized process of accumulated disutilities, to solve the corresponding optimization problem associated with the disutility reserve

$$V^{Z(t)}(t, X(t)) = \inf_{D} E\left[\int_t^n dU(s) \,\Big|\, Z(t), X(t)\right]. \tag{38}$$

We introduce the differential operator $A$, the disutility rate $\beta$ and the updating sum $R$,

$$AV^j(t, x) = \sum_{k:\,k \neq j} \mu^{jk}(t)\Big[V^k\big(t, x + c^{jk}(t) - \delta^k\big) - V^j(t, x)\Big] + V_x^j(t, x)\big(rx + c^j(t) - \delta\big),$$
$$\beta^j(t, x) = p^j(t)\big(\delta - a^j(t)\big)^2 + q^j(t)\,x^2 + \sum_{k:\,k \neq j} \mu^{jk}(t)\Big[p^{jk}(t)\big(\delta^k - a^{jk}(t)\big)^2 + q^{jk}(t)\big(x + c^{jk}(t) - \delta^k\big)^2\Big],$$
$$R^j(t, x) = \Delta P^j(t)\big(\Delta D - \Delta A^j(t)\big)^2 + \Delta Q^j(t)\,x^2 + V^j\big(t, x + \Delta C^j(t) - \Delta D\big) - V^j(t-, x).$$

We are now ready to present the eighth differential equation, which is a generalized version of the Bellman equation (10.33).

Proposition 8. The disutility reserve given by (38) is characterized by the following Bellman equation,

$$0 = V_t^j(t, x) + \inf_{\delta, \delta^k}\big[AV^j(t, x) + \beta^j(t, x)\big], \quad t \notin D, \tag{39a}$$
$$0 = \inf_{\Delta D} R^j(t, x), \quad t \in D, \tag{39b}$$
$$0 = V^j(n, x). \tag{39c}$$

The methodology needed for verification of (39) as a sufficient condition for characterization of the disutility reserve is the same as in the previous section. However, the state dependence makes the calculations somewhat more involved.

In the previous section, we proposed an appropriately parametrized second order polynomial function as solution to the Bellman equation. It is very convenient that this simple structure is inherited by the solution to (39). The only generalization of the proposed solution is that the coefficient functions $f$, $g$, and $h$ should be state dependent, i.e.

$$V^{Z(t)}(t, X(t)) = f^{Z(t)}(t)\big(X(t) - g^{Z(t)}(t)\big)^2 + h^{Z(t)}(t).$$

Now, it is possible to derive systems of ordinary differential equations for $f$, $g$, and $h$ that can be solved numerically. The optimal dividend payments, given that the policy is in state $j$ at time $t$, are given by

$$\delta^{*j}(t, x) = a^j(t) + \frac{f^j(t)}{p^j(t)}\big(x - g^j(t)\big),$$
$$\begin{aligned}
\delta^{*jk}(t, x) &= \frac{p^{jk}(t)}{p^{jk}(t) + q^{jk}(t) + f^k(t)}\,a^{jk}(t) \\
&\quad + \frac{q^{jk}(t)}{p^{jk}(t) + q^{jk}(t) + f^k(t)}\,\big(x + c^{jk}(t)\big) \\
&\quad + \frac{f^k(t)}{p^{jk}(t) + q^{jk}(t) + f^k(t)}\,\big(x + c^{jk}(t) - g^k(t)\big).
\end{aligned}$$
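The formula for $\delta^{*jk}$ is a weighted average with weights proportional to $p^{jk}$, $q^{jk}$ and $f^k$. The sketch below computes the three weights and the resulting lump sum for a single transition, and compares it with the quadratic objective it is meant to minimize. The function names and the numerical inputs are hypothetical, and the objective omits the term $h^k(t)$, which does not depend on the dividend.

```python
def transition_dividend(p, q, f_k, a, x, c, g_k):
    """Optimal lump sum dividend upon a transition j -> k.

    p, q  : weights p^{jk}(t), q^{jk}(t)
    f_k   : f^k(t), the quadratic coefficient of the reserve in state k
    a     : dividend target a^{jk}(t)
    x, c  : surplus before the transition and lump sum contribution c^{jk}(t)
    g_k   : g^k(t), the preferred surplus position in state k
    Returns (dividend, (w_target, w_surplus, w_future)).
    """
    total = p + q + f_k
    weights = (p / total, q / total, f_k / total)
    dividend = (weights[0] * a
                + weights[1] * (x + c)
                + weights[2] * (x + c - g_k))
    return dividend, weights


def transition_objective(d, p, q, f_k, a, x, c, g_k):
    """Quadratic cost of paying d upon transition (h^k omitted, constant in d)."""
    surplus_after = x + c - d
    return (p * (d - a) ** 2
            + q * surplus_after ** 2
            + f_k * (surplus_after - g_k) ** 2)


if __name__ == "__main__":
    args = dict(p=2.0, q=1.0, f_k=1.0, a=3.0, x=10.0, c=2.0, g_k=4.0)
    d, w = transition_dividend(**args)
    print("weights:", w, "dividend:", d)
    print("cost at optimum:", transition_objective(d, **args))
```

Perturbing the returned dividend in either direction raises the cost, which is a quick numerical check that the weighted-average formula is the minimizer of the transition term.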

The optimal lump sum dividend payment on $D$, $\Delta D^{*j}(t)$, follows a formula similar in structure to the formula for $\delta^{*jk}(t)$. Due to the parametrization of the second order polynomial solution, the following interpretations of $\delta^{*j}(t)$ and $\delta^{*jk}(t)$, respectively, apply. The optimal dividend rate should be interpreted in the same way as in the previous section. The rate is given by the target rate and an adjustment that takes care of future preferences over $X$. The adjustment moves $X$ towards its optimal position at time $t$, $g^j(t)$, with the force $f^j(t)/p^j(t)$.

Now consider the optimal lump sum payment upon transition. This is actually a weighted average of three quantities corresponding to three considerations. First, a dividend payment equal to its target is preferred with the first weight $p^{jk}(t) / \big(p^{jk}(t) + q^{jk}(t) + f^k(t)\big)$. Second, a payment pushing $X(t)$ towards its target 0 is preferred with the second weight $q^{jk}(t) / \big(p^{jk}(t) + q^{jk}(t) + f^k(t)\big)$. Third, the consideration of the position of $X$ in the future leads to an adjustment that brings $X$ close to its optimal position after the transition, $g^k(t)$, by a force equal to the third weight, $f^k(t) / \big(p^{jk}(t) + q^{jk}(t) + f^k(t)\big)$. A similar interpretation applies for the lump sum payment at deterministic points in time.

10.6 Utility optimization

10.6.1 Merton’s optimization problem

In this section, we state the differential equation for a value function, here called the utility reserve, of an optimization problem where preferences over surplus and dividends are specified by a power utility function. The surplus introduced in section 10.3 is here approximated by a considerably simpler process. We also indicate the solution to the differential equation and the optimal dividend strategy. See Korn (1997) and Merton (1990) for original contributions.

In section 10.5.1, we approximated the surplus introduced in 10.3. This led to a portfolio version of the quadratic optimization problem of a life insurance company. Here again, we formulate the redistribution problem as a control problem. However, we now add a decision variable. We do not assume that surplus is invested in the riskfree asset only. Instead, we consider the proportion invested in risky assets as a decision variable. Here, we start out by approximating the surplus introduced in section 10.3 on the basis of the following list of adaptations:

• We assume that the process of contributions to the surplus is absolutely continuous and accumulates at the rate $c$.
• We assume that accumulated dividends are absolutely continuous and paid out at the rate $\delta$.

This gives us the following surplus dynamics,

$$dX(t) = \big(r + \pi(t)(\alpha - r)\big)X(t)\,dt + \pi(t)\,\sigma X(t)\,dW(t) + d(C - D)(t), \qquad X(0) = x_0,$$

where

$$dC(t) = c(t)\,dt, \qquad dD(t) = \delta(t)\,dt.$$

We present the problem here and the solution below in terms of the insurance company’s surplus distribution problem. However, the personal financial problem of optimal investment-consumption is the same. There the contribution process is interpreted as the personal income process and the dividend process is the consumption process. So the results below should also be read with that application in mind.

As in section 10.5, we introduce a preference criterion to decide on a dividend rate $\delta$ and an investment proportion $\pi$. We introduce a process of accumulated utilities $U$ and a power utility rate $u(t, \delta(t))$, i.e. for $\gamma < 1$,

$$dU(t) = u(t, \delta(t))\,dt, \qquad u(t, \delta) = \frac{1}{\gamma}\,a(t)^{1-\gamma}\,\delta^{\gamma}. \tag{40}$$

The criterion (40) rewards high dividend rates without consideration of the surplus. The deterministic function $a$ weighs the utility of dividends over time. Without further specifications such a problem has no solution, since it would be optimal to pay out infinite dividend rates. However, adding the constraint that the terminal surplus must be non-negative, the problem makes sense. We now measure the future utilities by the utility reserve,

$$V(t, X(t)) = \sup_{\pi, D} E\left[\int_t^n dU(s) \,\Big|\, X(t)\right]. \tag{41}$$

Note the similarity with e.g. (14) where we now, instead of measuring an expected (present) value of the payment rates $\delta$, measure an expected utility function of the payment rates. To this we have added the supremum, which leaves us with an optimization problem.

Now, we introduce the differential operator $A$ and the rate of utilities $\beta$,

$$AV(t, x) = V_x(t, x)\big((r + \pi(\alpha - r))x + c(t) - \delta\big) + \tfrac{1}{2} V_{xx}(t, x)\,\pi^2 \sigma^2 x^2,$$
$$\beta(t) = \frac{1}{\gamma}\,a(t)^{1-\gamma}\,\delta^{\gamma}.$$

We are now ready to present the ninth differential equation, which is a Bellman equation.

Proposition 9. The utility reserve given by (41) is characterized by the following Bellman equation,

$$0 = V_t(t, x) + \sup_{\delta, \pi}\,[AV(t, x) + \beta(t)], \tag{42a}$$
$$0 = V(n, x). \tag{42b}$$

The optimal dividend stream and the optimal investment strategy, specified by the optimal rate $\delta^*$ and the optimal proportion $\pi^*$, are simply the arguments of the supremum in (42), i.e.,

$$\delta^* = \arg\sup_{\delta}\,[AV(t, x) + \beta(t)],$$
$$\pi^* = \arg\sup_{\pi}\,[AV(t, x) + \beta(t)].$$

The verification of (42) as a sufficient condition characterizing the utility reserve goes in exactly the same way as in section 10.5. The only difference is that all inequalities are turned around since we are now solving a maximization problem instead of a minimization problem. As we did in section 10.5, we can separate the variables of the solution. In this case, the solution is given by

$$V(t, X(t)) = \frac{1}{\gamma}\,f(t)^{1-\gamma}\,\big(X(t) + g(t)\big)^{\gamma}.$$

With this parametrization of the solution, both $f$ and $g$ have solutions that can be interpreted as present values. There exists an artificial rate $r^*$ that depends on all parameters in the model, such that

$$f(t) = \int_t^n e^{-\int_t^s r^*}\,a(s)\,ds, \qquad g(t) = \int_t^n e^{-\int_t^s r}\,c(s)\,ds.$$

The function $f$ is a present value that says something about the value of investing and smoothing out the surplus over the residual time to maturity. The time weights in the function $a$ appear in $f$. The function $g$ is the present value of future contributions to the surplus. Judging from its appearance in the utility reserve, the insurance company could activate all future surplus contributions and account for them in the surplus. The optimal controls become

$$\delta^*(t, x) = \frac{a(t)}{f(t)}\,\big(x + g(t)\big),$$
$$\pi^*(t, x)\,x = \frac{1}{1-\gamma}\,\frac{\alpha - r}{\sigma^2}\,\big(x + g(t)\big).$$
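To make the controls concrete, the sketch below evaluates them for constant coefficients. The text only says that an artificial rate $r^*$ exists; the closed form used here, $r^* = -\frac{\gamma}{1-\gamma}\big(r + \frac{(\alpha - r)^2}{2\sigma^2(1-\gamma)}\big)$, is what we obtain by inserting the separated solution into (42a), and should be read as our assumption rather than as a formula from the source. All function names and parameter values are hypothetical.

```python
import math


def annuity(rate, tau):
    """Present value of a unit payment stream over tau years at a constant rate."""
    if abs(rate) < 1e-12:
        return tau
    return (1.0 - math.exp(-rate * tau)) / rate


def artificial_rate(r, alpha, sigma, gamma):
    """Assumed closed form of r* for constant coefficients (own derivation)."""
    theta_sq = ((alpha - r) / sigma) ** 2
    return -gamma / (1.0 - gamma) * (r + theta_sq / (2.0 * (1.0 - gamma)))


def merton_controls(t, x, n, r, alpha, sigma, gamma, a, c):
    """Return (dividend rate, amount in stocks) at time t < n and surplus x.

    a and c are constant utility weight and contribution rate (illustrative).
    """
    f = a * annuity(artificial_rate(r, alpha, sigma, gamma), n - t)
    g = c * annuity(r, n - t)
    dividend_rate = a / f * (x + g)
    stock_amount = (alpha - r) / ((1.0 - gamma) * sigma ** 2) * (x + g)
    return dividend_rate, stock_amount


if __name__ == "__main__":
    params = dict(n=10.0, r=0.03, alpha=0.07, sigma=0.2, gamma=-1.0, a=1.0, c=0.5)
    for t in (0.0, 5.0, 9.0):
        print(t, merton_controls(t, x=1.0, **params))
```

The amount held in stocks is a fixed multiple of $x + g(t)$, while the dividend fraction $a/f$ rises as the residual horizon shrinks, in line with the interpretation given in the text.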

These strategies are easy to interpret. One should pay out, at any point in time, a fraction $a(t)/f(t)$ of $X(t) + g(t)$ that weighs the future preferences over dividends through $f$ against the present preferences over dividends through $a$. The amount optimally invested in stocks, $\pi^*(t, X(t))\,X(t)$, is a proportion of the surplus plus activated future surplus contributions. In the personal finance consumption-investment problem, the function $g$ is the present value of future income, which is also called human wealth. Note that for $t \to n$, $f \to 0$, such that the proportion of $X$ paid out as dividends tends to infinity. The consequence is that the optimally controlled surplus ends at 0.

10.6.2 Statewise power utility optimization of dividends

In this section, we state the differential equation for the utility reserve of an optimization problem where preferences over surplus and dividends are specified by a power utility function. The surplus is modelled as in section 10.3. We also indicate the solution to the differential equation and the optimal dividend strategy. See Steffensen (2004) for the generalization of Merton’s optimization problem to Markov chain driven payments.

In the previous section, we approximated the surplus by modelling both contributions and dividends as absolutely continuous processes. We take a step back to the original surplus process with dynamics given by (19), and model the surplus as in (19) with the exception that we still take $C$ to be approximated by a deterministic function, i.e.,

$$dC(t) = c(t)\,dt + \Delta C(t),$$

not allowing for state dependence in the surplus contribution. We return to this detail at the very end. The full return to (19) with $C$ given by (20) is not immediately possible. The process of accumulated utilities is, on the other hand, given by

$$dU(t) = dU^{Z(t)}(t, \delta(t), \Delta D(t)) + \sum_{k:\,k \neq Z(t-)} u^{Z(t-)k}\big(t, \delta^k(t)\big)\,dN^k(t),$$
$$dU^j(t, \delta, \Delta D) = u^j(t, \delta)\,dt + \Delta U^j(t, \Delta D).$$

We generalize the power utility function to state-dependent utility functions in the sense that

$$u^j(t, \delta) = \frac{1}{\gamma}\,a^j(t)^{1-\gamma}\,\delta^{\gamma},$$
$$u^{jk}(t, \delta^k) = \frac{1}{\gamma}\,a^{jk}(t)^{1-\gamma}\,\big(\delta^k\big)^{\gamma},$$
$$\Delta U^j(t, \Delta D) = \frac{1}{\gamma}\,\Delta A^j(t)^{1-\gamma}\,(\Delta D)^{\gamma}.$$

These coefficients should be compared with (40). Due to the structure of the dividend payments, there is now one coefficient for each type of dividend payment. Furthermore, we allow the coefficient functions to depend on the state of $Z$. We now introduce the utility reserve

$$V^{Z(t)}(t, X(t)) = \sup_{D, \pi} E\left[\int_t^n dU(s) \,\Big|\, Z(t), X(t)\right]. \tag{43}$$

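We introduce the differential operator $A$, the utility rate $\beta$ and the updating sum $R$,

$$\begin{aligned}
AV^j(t, x) &= \sum_{k:\,k \neq j} \mu^{jk}(t)\Big[V^k\big(t, x - \delta^k\big) - V^j(t, x)\Big] \\
&\quad + V_x^j(t, x)\big((r + \pi(\alpha - r))x + c(t) - \delta\big) + \tfrac{1}{2} V_{xx}^j(t, x)\,\pi^2 \sigma^2 x^2,
\end{aligned}$$
$$\beta^j(t) = \frac{1}{\gamma}\,a^j(t)^{1-\gamma}\,\delta^{\gamma} + \frac{1}{\gamma} \sum_{k:\,k \neq j} \mu^{jk}(t)\,a^{jk}(t)^{1-\gamma}\,\big(\delta^k\big)^{\gamma},$$
$$R^j(t, x) = \frac{1}{\gamma}\,\Delta A^j(t)^{1-\gamma}\,(\Delta D)^{\gamma} + V^j\big(t, x + \Delta C(t) - \Delta D\big) - V^j(t-, x).$$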

We are now ready to present the tenth, and final, differential equation, which is a generalized version of the Bellman equation (42).

Proposition 10. The utility reserve given by (43) is characterized by the following Bellman equation,

$$0 = V_t^j(t, x) + \sup_{\pi, \delta, \delta^k}\big[AV^j(t, x) + \beta^j(t)\big], \quad t \notin D, \tag{44a}$$
$$0 = \sup_{\Delta D} R^j(t, x), \quad t \in D, \tag{44b}$$
$$0 = V^j(n, x). \tag{44c}$$

The methodology needed for verifying (44) as a sufficient condition for the characterization of the utility reserve is the same as in section 10.5. In section 10.6.1, we separated variables of the utility reserve. In section 10.5, introducing state dependence led to a separation of variables such that the parts depending on time became state dependent as well. The question is whether this trick works here again. Indeed,

$$V^{Z(t)}(t, X(t)) = \frac{1}{\gamma}\,f^{Z(t)}(t)^{1-\gamma}\,\big(X(t) + g(t)\big)^{\gamma}. \tag{45}$$

In the previous section, the function $f$ could be interpreted as an artificial present value of the stream of coefficients $a$. Here again, the resulting differential equation for $f$ leads to similar possibilities for interpretations. However, the conclusion becomes somewhat involved and is not pursued further here. On the other hand, we still have that

$$g(t) = \int_t^n e^{-\int_t^s r}\,dC(s),$$

and the insurance company can again activate all future deterministic surplus contributions and account for them in the surplus. The optimal amount invested in stocks is still given by

$$\pi^*(t, x)\,x = \frac{1}{1-\gamma}\,\frac{\alpha - r}{\sigma^2}\,\big(x + g(t)\big),$$

whereas the optimal dividend payments are formalized by

$$\delta^{*j}(t, x) = \frac{a^j(t)}{f^j(t)}\,\big(x + g(t)\big),$$
$$\delta^{*jk}(t, x) = \frac{a^{jk}(t)}{a^{jk}(t) + f^k(t)}\,\big(x + g(t)\big),$$
$$\Delta D^{*j}(t, x) = \frac{\Delta A^j(t)}{\Delta A^j(t) + f^j(t)}\,\big(x + g(t)\big).$$
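The lump sum formulas split the amount $x + g(t)$ between an immediate payment and the reserve carried into the new state. The sketch below evaluates the transition formula and checks it numerically against the objective it should maximize, namely the utility of the lump sum plus the utility reserve (45) in the new state evaluated at the remaining surplus. The function names, the value of $\gamma$ and the inputs are hypothetical and chosen only for illustration.

```python
def power_transition_dividend(a_jk, f_k, wealth):
    """Lump sum a^{jk} / (a^{jk} + f^k) * (x + g) paid upon a transition j -> k."""
    return a_jk / (a_jk + f_k) * wealth


def power_transition_objective(d, a_jk, f_k, wealth, gamma):
    """Utility of paying d now plus the reserve in state k on wealth - d.

    Requires 0 < d < wealth and gamma < 1, gamma != 0.
    """
    now = a_jk ** (1.0 - gamma) * d ** gamma
    later = f_k ** (1.0 - gamma) * (wealth - d) ** gamma
    return (now + later) / gamma


if __name__ == "__main__":
    a_jk, f_k, wealth, gamma = 1.0, 3.0, 2.0, 0.5
    d = power_transition_dividend(a_jk, f_k, wealth)
    print("dividend:", d)
    for trial in (d - 0.05, d, d + 0.05):
        print(trial, power_transition_objective(trial, a_jk, f_k, wealth, gamma))
```

In this example the objective is largest at the formula's value and falls off on both sides, illustrating the trade-off between present preferences in the numerator and present plus future preferences in the denominator.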

Again, we can interpret the optimal fraction of surplus in the optimal dividend rate as a trade-off between present considerations in $a$ and future considerations in $f$. The same interpretation applies for the optimal lump sum dividends. The numerator concerns the present preferences, while the denominator concerns the future preferences, including the present. For all considerations, the state dependence of $f$ is reflected in the state dependent optimal dividend payments. In section 10.6.1, we ended up with $X(n) = 0$ due to infinite dividend proportions of the surplus as we get closer to maturity. In this case, the same conclusion is a consequence of the terminal condition $f^j(n) = 0$.

We end with a comment on the assumption that the contribution process is not state dependent, in contrast to section 10.5.2. In the case of power utility and state dependent contributions, the value function (45) is not correct. The problem is that the market is incomplete, since the $Z$-risk is not given any price by the market, in contrast to $W$-risk. Therefore, there exists no unique arbitrage free price of the future contributions if they depend on $Z$. In order to obtain a complete market situation, one has to add decision variables corresponding to insurance against $Z$-risk. This is exactly what happens in Kraft and Steffensen (2006). There, however, the viewpoint is not that of the insurance company but that of the policy holder, and therefore Kraft and Steffensen (2006) is a generalization of the personal finance consumption-investment problem, which incorporates $Z$-risk and insurance decisions.

References

Aase, K. K., and Persson, S.-A. (1994) “Pricing of unit-linked life insurance policies.” Scandinavian Actuarial Journal: 26–52.
Black, F., and Scholes, M. (1973) “The pricing of options and corporate liabilities.” Journal of Political Economy 81: 637–654.
Brennan, M. J., and Schwartz, E. S. (1976) “The pricing of equity-linked life insurance policies with an asset value guarantee.” Journal of Financial Economics 3: 195–213.
Cairns, A. J. G. (2000) “Some notes on the dynamics and optimal control of stochastic pension fund models in continuous time.” ASTIN Bulletin 30(1): 19–55.
Fleming, W. H., and Rishel, R. W. (1975) Deterministic and Stochastic Optimal Control. Springer-Verlag.
Grosen, A., and Jørgensen, P. L. (2000) “Fair valuation of life insurance liabilities: The impact of interest rate guarantees, surrender options, and bonus policies.” Insurance: Mathematics and Economics 26: 37–57.
Hald, A. (1981) “T. N. Thiele’s contributions to statistics.” International Statistical Review 49: 1–20.
Hoem, J. M. (1969) “Markov chain models in life insurance.” Blätter der Deutschen Gesellschaft für Versicherungsmathematik 9: 91–107.
Korn, R. (1997) Optimal Portfolios. World Scientific.
Kraft, H., and Steffensen, M. (2006) Optimal Consumption and Insurance: A Continuous-Time Markov Chain Approach. Technical report, Laboratory of Actuarial Mathematics, University of Copenhagen.
Merton, R. C. (1973) “Theory of rational option pricing.” Bell Journal of Economics and Management Science 4: 141–183.
Merton, R. C. (1990) Continuous-time Finance. Blackwell.
Nielsen, P. H. (2005) “Optimal bonus strategies in life insurance: The Markov chain interest rate case.” Scandinavian Actuarial Journal 2: 81–102.
Norberg, R. (1991) “Reserves in life and pension insurance.” Scandinavian Actuarial Journal: 3–24.
Norberg, R. (1999) “A theory of bonus in life insurance.” Finance and Stochastics 3(4): 373–390.
Norberg, R. (2001) “On bonus and bonus prognoses in life insurance.” Scandinavian Actuarial Journal 2: 126–147.
Norberg, R., and Steffensen, M. (2005) “What is the time value of a stream of investment?” Journal of Applied Probability 42(3): 861–866.
Richard, S. F. (1975) “Optimal consumption, portfolio and life insurance rules for an uncertain lived individual in a continuous time model.” Journal of Financial Economics 2: 187–203.
Steffensen, M. (2000) “A no arbitrage approach to Thiele’s differential equation.” Insurance: Mathematics and Economics 27: 201–214.
Steffensen, M. (2002) “Intervention options in life insurance.” Insurance: Mathematics and Economics 31: 71–85.
Steffensen, M. (2004) “On Merton’s problem for life insurers.” ASTIN Bulletin 34(1): 5–25.
Steffensen, M. (2006a) “Quadratic optimization of life and pension insurance payments.” ASTIN Bulletin 36(1): 245–267.
Steffensen, M. (2006b) “Surplus-linked life insurance.” Scandinavian Actuarial Journal 2006(1): 1–22.
Thiele, T. N. (1880) Sur la compensation de quelques erreurs quasi-systématiques par la méthode des moindres carrés. Reitzel, Copenhagen. See also Hald (1981).

Chapter 11 Uncertain Technological Change and Capital Mobility

Paul A. de Hek Netherlands Bureau for Economic Policy Analysis (CPB)

11.1 Introduction

Although much work has been done in the field of stochastic endogenous growth models (see e.g. King and Rebelo 1988; King, Plosser and Rebelo 1988; Obstfeld 1994; Hopenhayn and Muniagurria 1996; Wälde 1999), there are only a few analyses of the influence of uncertainty on (the distribution of) the long-run growth rate. Previous work on economic growth under uncertainty has focused on issues like the existence of a limiting distribution for capital and consumption, but has not tried to understand how the distribution of productivity shocks affects growth in the long run.

In an important empirical study, Ramey and Ramey (1995) find evidence that economic growth and the volatility of economic fluctuations are negatively linked. This negative relationship is primarily due to the volatility of the innovations to growth (i.e., of unpredictable changes in the growth rate). This latter measure corresponds closely to the notion of uncertainty. At face value, this result seems to contradict that of Kormendi and Meguire (1985), who find that the standard deviation of output growth has a significant positive effect on growth. However, Ramey and Ramey (1995, p. 1145) argue that in the regressions of Kormendi and Meguire, the positive effect of the standard deviation may be capturing the effect of predictable movements in growth. In that way, both results are consistent: volatility of the innovations seems to have a negative effect, while volatility in the predicted variable has a positive effect on growth.¹ Recent studies by Martin and Rogers (2000) and Imbs (2004) largely confirm the result that countries with higher volatility grow (conditionally) at a lower rate.

Investments in research and development (R&D) or, more generally, investments in the creation of knowledge are the driving force behind the advancement of technology. More investments will generally lead to a higher rate of technological change and, consequently, to higher economic growth. However, the return to these investments is not known in advance, that is, the productivity of knowledge creation is uncertain. This creates a link between uncertainty and (long-run) growth. In the present study, uncertainty derives from randomness in the productivity of R&D. In general, one part of uncertainty is due to individual, firm-specific (idiosyncratic) uncertainty, while the other part arises from economy-wide (common) shocks, which have the same impact on all firms. Here, the analysis will focus on common shocks², such as technology and policy shocks. The objective of this study is, then, to find out the nature (positive or negative) of the link between growth and aggregate uncertainty and to identify the main factors that determine this nature.

Concerning the theoretical literature on this topic, both De Hek (1999) and Jones, Manuelli and Stacchetti (2005) show that the relationship between volatility in macroeconomic productivity and mean growth can be either positive or negative. The curvature of the utility function is identified as a key parameter that determines the sign of the relationship. In a recent paper, Blackburn and Pelloni (2004) investigate the relationship between growth and volatility in learning-by-doing economies. They find that the correlation between long-term growth and short-term volatility depends on the source of stochastic fluctuations and the functioning of the labor market. As regards the former, long-run growth is negatively related to the volatility of (non-neutral) nominal shocks, but positively related to the volatility of real shocks.

The present analysis uses a model of endogenous technological change where sustained growth stems from intentional investments in R&D by profit-maximizing, risk-averse firms. Physical capital is assumed to be fully mobile, while labor is assumed to be immobile. Uncertainty derives from the productivity of investments in R&D. The main result of this analysis is that the relationship between long-run growth and uncertainty (about the productivity of knowledge creation) depends on two main factors: increasing or non-increasing returns to scale in knowledge creation, and a high or low value of the elasticity of intertemporal substitution (of a firm’s profits). Empirical studies on the returns to scale in knowledge creation (“non-increasing”) and the value of the elasticity of intertemporal substitution (“higher than the critical value”) indicate a negative relationship between long-run growth and uncertainty regarding the productivity of knowledge creation. Hence, this study identifies a new factor, the returns to scale in the research sector, which influences the growth-uncertainty relationship. Moreover, while Jones, Manuelli and Stacchetti (2005) quantitatively find a positive relationship between growth and uncertainty³, the present analysis establishes a verifiable critical value of the elasticity of intertemporal substitution, implying a negative relationship between growth and uncertainty that is consistent with the empirical evidence cited above.

1 See also Guiso and Parigi (1999) and Aizenman and Marion (1999), who find a negative relationship between volatility and (private) investment.
2 Schankerman (2001) finds that idiosyncratic shocks do not account for much (approximately 25%) of the variation in investment decisions. Nearly 75% of the micro-variance is due to heterogeneity in micro-level responses to aggregate (common) shocks.
3 De Hek (1999) makes no prediction concerning the most likely nature of the relationship.

11.2 Framework of the model

The model that will be developed in this section is based on the models of endogenous technological change of Romer (1990) and Aghion and Howitt (1998, Ch. 3). The main difference from these models is that in the present model, instead of having a separate research sector, research is undertaken by the intermediate-good producers. Research by a firm enhances the firm’s own state of the technology (and has a positive external effect on the other firms’ states of the technology). This setting allows us to find the effect of higher uncertainty in the productivity of investments in R&D on the growth rate through the optimal choices, concerning capital and (skilled) labor, of the intermediate-good producers.

11.2.1 Technology

The consumption-capital good in the economy, final output $Y$, is produced according to

$$Y_t = L_t^{1-\beta} \int_0^1 A_{it}\,x_{it}^{\beta}\,di, \tag{1}$$

where $x_{it}$ is the quantity of intermediate (or capital⁴) good $i$, $L_t$ is the quantity of labor employed to produce final output and $A_{it}$ is an index for the technology or knowledge in firm (or sector) $i$. At each date, the representative final-output firm decides how much of each intermediate good it rents from the producers of those goods. Maximization of its profits implies that the price (or rental rate) $p_{it}$ of intermediate good $i$ is given by

$$p_{it} = \beta L_t^{1-\beta} A_{it}\,x_{it}^{\beta-1}, \quad \forall i \in [0, 1]. \tag{2}$$

The wage rate $w_{L,t}$ of (skilled) labor used in the final-output sector is equal to its marginal product,

$$w_{L,t} = (1-\beta) L_t^{-\beta} \int_0^1 A_{it}\,x_{it}^{\beta}\,di. \tag{3}$$

Each intermediate good is produced by a firm that has an infinitely valid patent on that design (or can in some other way effectively prevent other competitors from entering the market, without affecting the profit maximization). Due to this monopoly power, an intermediate firm can devote resources, i.e., labor, to research and development (R&D), which enhances the state of the technology of that firm. A higher state of the technology might be seen as an improvement of the quality of the firm's product and implies higher profits. The intermediate sector uses labor to conduct research. Labor or human capital⁵ in sector $i$ is denoted by $h_{it}$. Average or total human capital used to conduct research is then given by

$$H_t = \int_0^1 h_{jt}\, dj.$$

The total labor force in the economy is fixed and set to 1, i.e., $L_t + H_t = 1$ for all $t$.

4 Intermediate goods and capital (goods) are used interchangeably throughout the study.

5 In this study, the amount of human capital used in sector $i$, $h_{it}$, is defined to be the amount of labor used in sector $i$, $l_{Ait}$, times the (constant) skill level, $h$. Normalizing $h$ to 1 implies that $h_{it} = l_{Ait}$.

To produce intermediate goods at the rate $x_{it}$, the firm in sector $i$ requires the use of $A_{it} x_{it}$ units of capital. We assume throughout this study full international capital mobility, while labor (human capital) is assumed to be immobile. Thus the interest rate $r$ is exogenously given and is equal to the international interest rate. The per-period profit of an intermediate-good producer is therefore given by

$$\pi_{it} = p_{it} x_{it} - r A_{it} x_{it} - w_{H,t} h_{it}, \tag{4}$$

where $w_{H,t}$ is the wage rate of human capital. Suppose that technology or knowledge evolves according to

$$A_{i,t+1} = \left(1 + \eta_{t+1} h_{it}^{\gamma} H_t^{\theta}\right) A_{it}, \tag{5}$$

where $\gamma > 0$ is a returns-to-scale parameter, $\theta > 0$ a parameter controlling the spill-over effect of average (or total) human capital,

$$H_t = \int_0^1 h_{jt}\, dj,$$

and $\eta$ a random variable representing the productivity of human capital in the accumulation of knowledge. In every period, $\eta$ may take any value on some interval $I$. As a result, the return to research is uncertain. The probability distribution of the return is, however, known and fixed. More formally, assume that the sequence of shocks $\{\eta_t\}$ satisfies: $\{\eta_t\}$ is a sequence of independently and identically distributed (i.i.d.) random variables with probability distribution $\mu$ and support $I = [\underline{\eta}, \overline{\eta}]$, $\overline{\eta} > \underline{\eta} > 0$.

Clearly, more (less) uncertainty is associated with higher (lower) variability. (For a formal definition of variability see Rothschild and Stiglitz, 1970.) To determine the effect of changing the variability on the expectation of a function of the random variable, the following result by Rothschild and Stiglitz (1971) is very useful: Given that $Y$ is more variable than $X$, $Ef(X) > (\geq)\ Ef(Y)$ if $f$ is strictly (weakly) concave, while $Eg(X) < (\leq)\ Eg(Y)$ if $g$ is strictly (weakly) convex. Therefore, to determine the effect of increasing (or decreasing) variability on $Ef(X)$, it is sufficient to find out whether $f(\cdot)$ is strictly concave or strictly convex. E.g., if $f(X)$ is strictly concave, increasing the variability of $X$ leads to a decrease in the expectation of $f(X)$.

One line of reasoning suggests that, since all firms are owned by the consumers (possibly represented by the representative consumer), the utility functions of the consumers should determine how firms behave. That is, firms should make their choices to maximize the expected utility (of consumption) of the owners of the firm. However, according to a second line of reasoning, if the owners delegate the management of the firm to a manager, one could argue that the manager does not know the utility functions of the owners of the firm. Suppose, for example, that the ownership shares held in the firms can be traded among the consumers, either nationally or internationally. Then, if consumers (foreign or domestic) differ with respect to their utility functions, the managers of the firms will not know which (kind of) consumers own their firm. In that case, it seems natural for the manager to maximize the expected discounted stream of profits or, to incorporate possible risk aversion, the expected discounted stream of the utility of profits.⁶ Adopting the second line of reasoning,⁷ the intertemporal expected profit maximization problem of an intermediate-good producer is given by:

$$\max\; E \sum_{t=0}^{\infty} \delta^{t}\, \frac{\pi_{it}^{1-\sigma_f} - 1}{1 - \sigma_f} \tag{6}$$

$$\text{s.t.}\quad A_{i,t+1} = \left(1 + \eta_{t+1} h_{it}^{\gamma} H_t^{\theta}\right) A_{it},$$

where $E$ is the expectation operator, $\delta \equiv 1/(1+r)$ the discount factor, with $r$ representing the interest rate, and $\pi_{it}$ is given by equation (4). The parameter $\sigma_f \in [0, \infty)$ reflects both a measure of risk aversion and the reciprocal of the elasticity of intertemporal substitution. Notice that this set-up includes the 'standard case' of risk neutrality (and an infinite elasticity of intertemporal substitution), which occurs if $\sigma_f = 0$. Notice that utility is not well-defined if profits are nonpositive in any period.
As shown in Appendix B, profits are positive (negative) if and only if $h_t < (>)\ \beta/(1+\beta)$. This implies that, regardless of its utility function, an intermediate-good producer will never employ as much labor as (or more than) the critical level, since this will yield negative (or zero) profits independent of the shocks.⁸

6 In the literature on the theory of the firm under uncertainty, the assumption that the firm maximizes the expected utility of profits is widely used. See e.g. Sandmo (1971) and Viaene and Zilcha (1998).

7 A short exposition of the first line of reasoning is given in Appendix E.

Returns on investment in R&D are uncertain. In each period, the impact of research on each firm's stock of knowledge is randomly determined. Since $\eta$ is assumed to be independent of $i$, this specification of the uncertainty implies that the shocks are economy-wide, i.e., the same for each firm. Therefore, the riskiness of the investments in R&D is the result of changes in the economic climate, e.g., induced by technology or policy shocks.

In maximizing the expected discounted stream of profits, the firm knows the demand for its product as given by equation (2). Therefore, replacing $p_{it}$ in the maximization problem with the right-hand side of equation (2) and differentiating with respect to the two choice variables $x_{it}$ and $h_{it}$ leads to the first-order conditions. These two conditions can be written as

$$x_{it} = \left(\frac{r}{\beta^{2}}\right)^{\frac{1}{\beta-1}} L_t, \tag{7}$$

$$\pi_{it}^{-\sigma_f}\, w_{H,t} = E\left[\delta\, \pi_{i,t+1}^{-\sigma_f}\, \eta_{t+1}\, \gamma h_{it}^{\gamma-1} H_t^{\theta} A_{it}\, \beta(1-\beta) L_{t+1}^{1-\beta} x_{i,t+1}^{\beta}\right]. \tag{8}$$
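As an informal check of condition (7), the following minimal sketch evaluates the static part of the profit in (4), with the inverse demand (2) substituted in, and confirms that the closed-form $x_{it}$ is a local maximizer. All parameter values are hypothetical and chosen only for illustration; the function names are not from the text.

```python
# Hypothetical parameter values, chosen only for illustration.
BETA = 1.0 / 3.0   # output elasticity of intermediate goods
R = 0.05           # exogenous international interest rate
L = 0.9            # labor employed in final output
A = 1.0            # technology index of the firm


def static_profit(x):
    """Revenue minus capital cost, using inverse demand (2): p = beta * L^(1-beta) * A * x^(beta-1)."""
    revenue = BETA * L ** (1.0 - BETA) * A * x ** BETA
    return revenue - R * A * x


def x_from_condition_7():
    """Closed form of equation (7): x = (r / beta^2)^(1/(beta-1)) * L."""
    return (R / BETA ** 2) ** (1.0 / (BETA - 1.0)) * L


x_star = x_from_condition_7()
# At the optimum, r*A*x = beta^2 * L^(1-beta) * A * x^beta, so profit before wages is beta*(1-beta)*Y.
output_share = BETA * (1.0 - BETA) * A * x_star ** BETA * L ** (1.0 - BETA)
print(x_star, static_profit(x_star), output_share)
```

The last line also illustrates why the term $\beta(1-\beta)L_{t+1}^{1-\beta}x_{i,t+1}^{\beta}$ appears in (8): it is the derivative of this pre-wage profit with respect to the technology index.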

It is assumed that the transversality condition, as given in Appendix A, holds. Moreover, profits are assumed to be positive (see the appendix for the associated restriction on the optimal level of human capital). Let $A_t$ denote the average productivity parameter across all firms at date $t$:

$$A_t \equiv \int_0^1 A_{it}\, di.$$

Because each sector $i$ uses $A_{it} x_{it}$ units of capital, the total capital stock (measured in forgone consumption) is equal to

$$K_t \equiv \int_0^1 A_{it} x_{it}\, di.$$

According to equation (7), all firms produce the same amount at any given time: $x_{it} = x_t = K_t/A_t$ for all $i$. Next, suppose that initially at $t = 0$ every firm has the same productivity, that is, $A_{i0} = A_0$ for all $i$, which implies that $A_{it} = A_t$ for all $i$. Then equation (8) allows us to have $h_{it} = h_t$ for all $i$, which, in turn, implies that $H_t = h_t$. As a result, the aggregate technology (1) can now be expressed in the simpler form

$$Y_t = A_t x_t^{\beta} L_t^{1-\beta}. \tag{9}$$

8 Another drawback resulting from this specific utility function concerns the fact that, at zero profit, marginal utility is infinite. However, as explained in the text, this situation will not arise. On the contrary, profits will grow larger and larger over time (see equation (12)), implying that this feature of the utility function is not driving any of the results. The apparent advantage of this specific utility function is that it produces an analytical solution.

11.2.2 Preferences

Assume that consumers behave as if they maximize their expected value of lifetime utility. Consumers are heterogeneous in the sense that they differ in their time preference, $\rho_j$, and their elasticity of intertemporal substitution, $\sigma_j$. The objective of agent $j$, then, is to select consumption and savings to maximize the expected value of his lifetime utility:

$$\max\; E \sum_{t=0}^{\infty} \left(\frac{1}{1+\rho_j}\right)^{t} \frac{c_{j,t}^{1-\sigma_j} - 1}{1 - \sigma_j} \tag{10}$$

$$\text{s.t.}\quad b_{j,t+1} = (1+r) b_{j,t} + w_{L,t} L_{j,t} + w_{H,t} h_{j,t} + s_j \pi_t - c_{j,t},$$

where $c_{j,t}$ is consumption and $b_{j,t}$ represents assets. The agent's sources of income are interest on his stock of assets $r b_{j,t}$, wage income $w_{L,t} L_{j,t} + w_{H,t} h_{j,t}$ and his share $s_j$ of profits $\pi_t = \beta Y_t - r K_t - w_t h_t$. Maximization with respect to consumption and savings implies that the optimal path of consumption follows the Euler equation,

$$c_{j,t}^{-\sigma_j} = E\left[c_{j,t+1}^{-\sigma_j}\right] \frac{1+r}{1+\rho_j}. \tag{11}$$

The associated transversality condition, which is assumed to be satisfied, is given in Appendix A.

11.2.3 Equilibrium

In equilibrium, the wage rate in the intermediate sector should equal the wage rate in the final-output sector, i.e.,

$$w_{H,t} = (1-\beta) L_t^{-\beta} A_t x_t^{\beta}.$$

Furthermore, as the total amount of labor present in the economy is normalized to 1, the time allocation restriction reads $L_t + h_t = 1$. Due to the presence of shocks, the notion of balanced growth needs adjustment. Therefore, instead of a constant growth rate, the analysis here focuses on a constant expected growth rate. On this balanced expected-growth path (BEGP), the levels of the intermediate goods and labor are constant. This implies that the per-period profit grows with the technology, that is,

$$\pi_{t+1} = \pi_t \left(1 + \eta_{t+1} h_t^{\gamma+\theta}\right). \tag{12}$$

Incorporating these considerations in equation (8) leads to the BEGP research condition,

$$E\left[\frac{\beta\gamma (1-h) h^{\gamma+\theta-1}}{1+r} \cdot \frac{\eta_{t+1}}{\left(1 + \eta_{t+1} h^{\gamma+\theta}\right)^{\sigma_f}}\right] = 1. \tag{13}$$

The left-hand side of this equation gives the ratio of the return to an additional unit of skilled labor over the cost of an additional unit of skilled labor, on the BEGP. Given the probability measure of $\eta$, the intermediate producers choose the optimal amount of time spent on research, $h$, according to the above equation, which determines the rate of technological change,

$$g_{A,t} = 1 + \eta_{t+1} h_t^{\gamma+\theta}. \tag{14}$$
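To make the notion of a constant expected growth rate concrete, the sketch below simulates the growth factor in (14) for a fixed research time $h$. The uniform distribution of the shock, its support, and the value of $h$ are assumptions made purely for illustration; the text only requires i.i.d. shocks on a positive interval, and $h$ is here imposed rather than solved from (13).

```python
import random

GAMMA, THETA = 0.5, 0.05      # returns-to-scale and spill-over parameters (illustrative)
H = 0.1                       # constant research time on the BEGP (assumed, not solved for)
ETA_LOW, ETA_HIGH = 0.6, 1.1  # hypothetical support of the shock


def simulate_growth_factors(periods, seed=0):
    """Draw i.i.d. shocks and return g_A = 1 + eta * h^(gamma+theta) for each period."""
    rng = random.Random(seed)
    scale = H ** (GAMMA + THETA)
    return [1.0 + rng.uniform(ETA_LOW, ETA_HIGH) * scale for _ in range(periods)]


factors = simulate_growth_factors(20000)
sample_mean = sum(factors) / len(factors)
# With a uniform shock the expected growth factor is 1 + E[eta] * h^(gamma+theta).
expected = 1.0 + 0.5 * (ETA_LOW + ETA_HIGH) * H ** (GAMMA + THETA)
print(sample_mean, expected)
```

Realized growth fluctuates from period to period, but its sample average settles near the constant expected value, which is the sense in which the path is "balanced".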

On the BEGP, $x$, $L$ and $h$ are determined by the three conditions given by equations (7) and (13) and the time allocation restriction. Additionally, it is assumed that the solution to this set of equations also satisfies the transversality condition associated with the optimization problem. See Appendix A for the exact condition. Since the inputs $x$ and $L$ in the production function are constant along the BEGP, the growth rate of output is equal to the rate of technological change:

$$g_{Y,t} = g_{A,t}. \tag{15}$$

11.3 The effect of uncertainty on growth

11.3.1 The growth rate of output

The effect of higher volatility of the shock $\eta$ on the optimal choice of $h$ depends on the functional form of the BEGP research condition regarding the shock $\eta$ and the variable $h$. The first step in finding the effect of more uncertainty on the growth rate of output is to determine the effect of a higher volatility of $\eta$ on the left-hand side of equation (13), which will be denoted by $E(\Phi)$. It turns out (see Proposition 1 below) that $\Phi$ is a concave function of $\eta$, implying that a higher volatility of $\eta$ has a negative effect on the expectation of $\Phi$. Second, the effect of a smaller $E(\Phi)$ on the equilibrium value of $h$ depends on the functional form of $E(\Phi)$ as a function of $h$. If $\gamma + \theta \leq 1$, it is easy to see that $E(\Phi)$ is a decreasing function of $h$, as depicted in Figure 1. A higher volatility, which decreases $E(\Phi)$ as a function of $h$, then leads to a smaller level of research.

Figure 1: Equilibrium research condition, with $\gamma + \theta \leq 1$. The figure is based on equation (13) where the expectation is approximated with a second-order Taylor series expansion (see Appendix D). The parameter values are: $\beta = 1/3$, $\gamma = 0.5$, $\theta = 0.05$, $\rho = 0.05$, $\sigma_f = 1.25$, $E[\eta] = 0.85$, $\sigma_\eta^2 = 0.01$.

On the other hand, if $\gamma + \theta > 1$, $E(\Phi)$ as a function of $h$ is hump-shaped. This implies that there are two equilibrium values of $h$, a "low research level equilibrium" and a "high research level equilibrium" (that is, if the maximum of $E(\Phi)$ is higher than 1). See Figure 2 for an example of this situation. There will actually be more time spent on research due to more uncertainty if the economy is in the low level equilibrium, as opposed to less research time in the high level equilibrium.

Figure 2: Equilibrium research condition, with $\gamma + \theta > 1$. The figure is based on equation (13) where the expectation is approximated with a second-order Taylor series expansion (see Appendix D). The parameter values are: $\beta = 1/3$, $\gamma = 1.1$, $\theta = 0.05$, $\rho = 0.05$, $\sigma_f = 1.25$, $E[\eta] = 6$, $\sigma_\eta^2 = 0.01$.

What is the effect of a change in the time spent on research on the growth rate of the economy? A reduction in the time spent on research, for example, implies that the expectation of $g_A$ decreases, which, in turn, implies that the growth rate of the economy, $g$, will be smaller on average. More formally, consider the two probability measures $\mu$ and $\mu^{+}$, where $\mu^{+}$ is more uncertain than $\mu$, that is, it has the same mean but a higher volatility. Then the average growth rate under $\mu^{+}$ is smaller than the average growth rate under $\mu$ for almost any sequence of realizations of $\eta$; i.e., it occurs almost surely. The effect of uncertainty on the time spent on research and the average long-run growth rate is summarized in the next proposition.

Proposition 1. Let

$$0 < \sigma_f < 2\, \frac{1 + g_A(\overline{\eta})}{g_A(\overline{\eta})} - 1.$$

(A) If $\gamma + \theta \leq 1$, then more uncertainty leads to (i) less time spent on research and (ii) a smaller growth rate of output on average.

(B) If $\gamma + \theta > 1$, there may exist two equilibria. Then more uncertainty leads to (i) more (less) time spent on research and (ii) a higher (smaller) growth rate of output on average if the economy is in the low (high) research level equilibrium.

Proof: See Appendix C. □
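Case (A) of the proposition can be illustrated numerically. The sketch below solves the research condition (13) for $h$ by bisection under a symmetric two-point distribution for $\eta$, once with a small and once with a large mean-preserving spread. The two-point distribution, the spreads, and the interest rate $r = 0.05$ are assumptions made for illustration (the Figure 1 caption does not report $r$); the remaining values follow Figure 1.

```python
BETA, GAMMA, THETA = 1.0 / 3.0, 0.5, 0.05   # Figure 1 values
SIGMA_F = 1.25                               # Figure 1 value
R = 0.05                                     # assumed interest rate (not stated in the caption)
ETA_MEAN = 0.85                              # mean shock, as in Figure 1


def expected_phi(h, spread):
    """Left-hand side of (13) for eta in {mean - spread, mean + spread}, each with probability 1/2."""
    m = BETA * GAMMA / (1.0 + R)
    power = GAMMA + THETA
    total = 0.0
    for eta in (ETA_MEAN - spread, ETA_MEAN + spread):
        total += 0.5 * eta / (1.0 + eta * h ** power) ** SIGMA_F
    return m * (1.0 - h) * h ** (power - 1.0) * total


def solve_research_time(spread, lo=1e-9, hi=1.0 - 1e-9):
    """Bisection on E(Phi) = 1; valid here because E(Phi) is decreasing in h when gamma + theta <= 1."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if expected_phi(mid, spread) > 1.0:
            lo = mid   # return still exceeds cost, so the root lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)


h_low_risk = solve_research_time(spread=0.1)
h_high_risk = solve_research_time(spread=0.5)
print(h_low_risk, h_high_risk)
```

Under these assumptions the larger spread yields a smaller equilibrium $h$, in line with part (A)(i). The bisection would not be appropriate for case (B), where the condition can have two roots.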

The effect of uncertainty on the path of final output is as follows. For example, in case (A) of Proposition 1, more uncertainty leads to less labor used in research and therefore to more labor used in the production of final output. Equation (7), then, shows that the amount of every capital good increases. This implies, by equation (1), that final output increases initially. However, since the growth rate of output has fallen, at some point in time the new path of final output will lie below the initial path. Thus, in the long run, final output is negatively influenced by uncertainty (that is, in case (A) of Proposition 1).

In the previous analysis, the negative effect of uncertainty on output growth could be shown under two restrictions. The first restriction puts an upper bound on $\sigma_f$, the reciprocal of the elasticity of intertemporal substitution (of the profits of the intermediate-good producers) as well as a measure of risk aversion. Even if $g_A$ under the best shock is as high as 20%, the restriction requires $\sigma_f$ to be less than 11. This means that this restriction will certainly be satisfied if the firms act as if they were close to risk neutral. (However, if firms behave in a strictly risk-neutral manner, uncertainty will have no effect on the time spent on research and, hence, on the expected growth rate.) Moreover, even if the firms have similar attitudes towards risk and intertemporal substitution as households, estimates of $\sigma_f$ (and hence of $\sigma$) usually indicate that its value is roughly between 1 and 7 (see e.g. Gertner, 1993; Metrick, 1995; Beetsma and Schotman, 2001; Vissing-Jørgensen, 2002; Guvenen, 2006).⁹ The second restriction is that there are no increasing returns to R&D; i.e., $\gamma + \theta \leq 1$. The presence of constant or decreasing returns seems a fairly realistic assumption, which is confirmed by recent empirical evidence.
For example, Dinopoulos and Thompson (1996, 2000) estimate versions of Romer's model of endogenous technological change (Romer, 1990) and find positive, but decreasing, returns to R&D. Similar results are found in Hall, Griliches and Hausman (1986), Kortum (1993) and Thompson (1996). The intuition behind the finding that the nature of the effect of uncertainty on the time spent on research (positive or negative) depends on the parameter $\sigma$ draws on the fact that this parameter represents both risk aversion and the elasticity of intertemporal substitution. The fact that firms are risk averse implies that higher uncertainty reduces the return on investment (in skilled labor) in terms of utility. This affects the amount of investment positively or negatively depending on the relative strengths of the income and substitution effects. A relatively small $\sigma$, for example, implies that the substitution effect dominates the income effect, inducing a positive effect on investment (i.e., the time spent on research).

9 On the contrary, very high values of $\sigma$ are found by Hall (1988) and implied by evidence provided by the equity premium puzzle (see e.g. Campbell et al., 1997).

11.3.2 The growth rate of consumption

Due to the international capital market, the growth rates of output and consumption differ. Although individual consumption levels and growth rates differ across consumers, as denoted by the subscript $j$, the nature of the effect (positive or negative) does not depend on these differences. As a result, we suppress the subscripts in the following analysis. To determine the effect of uncertainty on the long-run growth rate of consumption, we insert $c_{t+1} = g_{c,t+1} c_t$, where $g_{c,t+1}$ is the gross growth rate of consumption (defined analogously to $g_{A,t}$ in equation (14)), into the Euler equation (11) to get

$$c_t^{-\sigma} = E\left[g_{c,t+1}^{-\sigma}\, c_t^{-\sigma}\right] \frac{1+r}{1+\rho}, \tag{16}$$

which implies that

$$E\left[g_c^{-\sigma}\right] = \frac{1+\rho}{1+r}. \tag{17}$$

Using a second-order Taylor series expansion around $E[g_c]$, $E[g_c^{-\sigma}]$ can be approximated by

$$E\left[g_c^{-\sigma}\right] \approx E[g_c]^{-\sigma} + \frac{1}{2}\sigma(\sigma+1)\, E[g_c]^{-(\sigma+2)}\, \mathrm{var}(g_c), \tag{18}$$

where $\mathrm{var}(g_c)$ is the variance of the growth rate of consumption. Since consumption depends on the income of the consumers, which in turn depends on the state of the technology, consumption depends on the shock $\eta$. This implies that a higher variability of the shock leads to more variable consumption and, hence, to a more variable growth rate of consumption. Thus, more uncertainty regarding the shock implies a higher $\mathrm{var}(g_c)$. Since, according to equation (17), $E[g_c^{-\sigma}]$ is constant, equation (18) shows that an increase in $\mathrm{var}(g_c)$ will be accompanied by an increase in $E[g_c]$. Therefore, from this analysis we may conclude that, due to more uncertainty, the growth rate of consumption will on average be higher.
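The step from (17) and (18) to a higher average growth rate can be checked with a small numerical sketch. It solves the approximation (18) for $E[g_c]$, holding its left-hand side at the constant required by (17), for several values of the variance. The values of $\sigma$, $\rho$ and $r$ are hypothetical, and $g_c$ is treated as a gross growth rate.

```python
SIGMA = 2.0   # hypothetical curvature parameter
RHO = 0.03    # hypothetical time preference rate
R = 0.05      # hypothetical interest rate


def approx_lhs(mean_g, var_g):
    """Right-hand side of (18): second-order approximation of E[g^(-sigma)]."""
    return (mean_g ** (-SIGMA)
            + 0.5 * SIGMA * (SIGMA + 1.0) * mean_g ** (-(SIGMA + 2.0)) * var_g)


def mean_growth(var_g, lo=0.5, hi=5.0):
    """Bisection for the E[g_c] that keeps (18) equal to (1 + rho)/(1 + r), as (17) requires."""
    target = (1.0 + RHO) / (1.0 + R)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if approx_lhs(mid, var_g) > target:
            lo = mid   # the approximation is decreasing in mean_g, so move right
        else:
            hi = mid
    return 0.5 * (lo + hi)


for variance in (0.0, 0.001, 0.01):
    print(variance, mean_growth(variance))
```

With zero variance the solution reduces to $((1+r)/(1+\rho))^{1/\sigma}$, and it rises as the variance increases, which is the precautionary-saving effect in numerical form.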

This result, that more uncertainty implies a higher average growth rate of consumption, is driven by the consumers' precautionary saving motive. Due to this motive, consumers save more in more uncertain circumstances in order to insure themselves against 'bad shocks'. Naturally, these higher savings lead to a higher growth rate. The technical reason for the existence of a precautionary saving motive is the fact that the marginal utility is convex. This convexity implies that the negative consequence, in terms of utility, of a bad shock dominates the positive consequence of a similar (in size) good shock.

11.4 Conclusion

The analysis in this study shows that unpredictable variations in economic productivity may have a positive or negative effect on the average growth rate of output. This confirms the results of earlier papers on this subject. However, this analysis adds two new elements. First, physical capital is assumed to be fully mobile, allowing capital to flow freely between economies. This is in contrast with the earlier closed-economy models. Second, the theoretical ambiguity result is not solely determined by the value of the elasticity of intertemporal substitution (of consumption), as is the case in the earlier analyses, but depends on two factors. That is, the relationship between unpredictable variations (uncertainty) in economic productivity and economic growth depends on whether returns to scale in knowledge creation are increasing or non-increasing and whether the elasticity of intertemporal substitution (of profits) is higher or lower than some critical value. Both factors have been studied in the empirical literature. First, empirical studies on the returns to scale in knowledge creation (R&D) indicate that these returns are decreasing. Second, based on empirical analyses of the elasticity of intertemporal substitution and given the critical value as implied by the rate of technological change (under the best possible shock), it is most likely that the value of the elasticity of intertemporal substitution is higher than the critical value. Together these two results imply that unpredictable variations in economic productivity have a negative effect on the average long-run growth rate.

Appendix A: Transversality conditions

The transversality condition of the intermediate-good producer's optimization problem is given by

$$\lim_{t\to\infty} E\, \delta^{t}\, \pi_t^{-\sigma_f} A_{t+1} = 0. \tag{19}$$

The transversality condition of the consumer's optimization problem is given by

$$\lim_{t\to\infty} E \left(\frac{1}{1+\rho}\right)^{t} c_t^{-\sigma}\, b_{t+1} = 0. \tag{20}$$

Appendix B: Restriction for "π > 0"

The one-period profit of an intermediate-good producer can be written as

$$\pi_t = \beta L_t^{1-\beta} A_t x_t^{\beta} - \beta^{2} L_t^{1-\beta} A_t x_t^{\beta} - (1-\beta) L_t^{-\beta} A_t x_t^{\beta} h_t = \beta(1-\beta) Y_t - (1-\beta) Y_t \frac{h_t}{L_t}.$$

This equation implies that $\pi_t > 0$ iff $h_t < \beta L_t = \beta(1 - h_t)$. Hence, profit $\pi_t$ is positive if and only if $h_t < \beta/(1+\beta)$. If $\beta = 1/3$, this implies that $h_t < 1/4$.

Appendix C: Proof of Proposition 1

This proof consists of proving the two steps taken in the text prior to the proposition. First, we have to prove that $G(\eta) \equiv \eta/(1 + \eta h^{\gamma+\theta})^{\sigma_f}$ is a concave function of $\eta$. Let us write $G(\eta) = \eta/(1 + b\eta)^{\sigma_f}$, with $b = h^{\gamma+\theta}$. Differentiating $G(\cdot)$ with respect to $\eta$ shows that $\partial G/\partial \eta = (1 + (1-\sigma_f) b\eta)/(1 + b\eta)^{1+\sigma_f}$. Differentiating again with respect to $\eta$ yields

$$\frac{\partial^{2} G(\eta)}{\partial \eta^{2}} = \frac{(1 + b\eta)(1-\sigma_f) b - (1 + (1-\sigma_f) b\eta)(1+\sigma_f) b}{(1 + b\eta)^{2+\sigma_f}}.$$

From this, we can conclude that $\partial^{2} G/\partial \eta^{2} < 0$ if and only if $1 + \sigma_f < 2(1 + b\eta)/(b\eta)$. Hence, $G(\eta)$ is (strictly) concave for all $\eta \in [\underline{\eta}, \overline{\eta}]$ iff

$$1 + \sigma_f < \min_{\eta \in [\underline{\eta}, \overline{\eta}]} 2\, \frac{1 + b\eta}{b\eta} = 2\, \frac{1 + g_A(\overline{\eta})}{g_A(\overline{\eta})},$$

since $b = h^{\gamma+\theta}$. Hence, by Lemma 1, a higher volatility of $\eta$ decreases $E(\Phi)$.

Define the function $F$ as follows: $F(h) = m\eta(1-h)h^{\gamma+\theta-1}/(1 + \eta h^{\gamma+\theta})^{\sigma_f}$, with $m = \beta\gamma/(1+r)$. If $\gamma + \theta \leq 1$, it is evident that $F(h)$ is decreasing in $h$. As a result, $E(\Phi)$ is decreasing in $h$. If $\gamma + \theta > 1$, numerical simulations indicate that $F(h)$ is hump-shaped. Thus, if the maximum of the function is high enough there exist two equilibria.

The first step implies that a higher volatility of $\eta$ decreases $E(\Phi)$. The second step implies that depending on whether $\gamma + \theta \leq 1$ or $\gamma + \theta > 1$, $E(\Phi)$ is decreasing in $h$ for all $h \in [0, 1]$ or hump-shaped. For example, in the first case, $h$ has to fall in order to keep $E(\Phi)$ equal to 1.

Appendix D: Taylor series approximation

Using the second-order Taylor series expansion around the mean of $\eta$, written here as $\hat{\eta}$, implies that

$$E\left[\frac{\eta}{(1 + \eta h^{\gamma+\theta})^{\sigma_f}}\right] \approx \frac{\hat{\eta}}{(1 + \hat{\eta} h^{\gamma+\theta})^{\sigma_f}} + \frac{1}{2}\left(\frac{(\sigma_f + 1)\sigma_f\, \hat{\eta}\, h^{2(\gamma+\theta)}}{(1 + \hat{\eta} h^{\gamma+\theta})^{\sigma_f+2}} - \frac{2\sigma_f h^{\gamma+\theta}}{(1 + \hat{\eta} h^{\gamma+\theta})^{\sigma_f+1}}\right)\sigma_\eta^{2},$$

where $\sigma_\eta^{2}$ represents the variance of $\eta$. Hence, this yields an approximation of the expectation in the BEGP research condition (13), which is used to draw the graphs in Figures 1 and 2.
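The quality of this approximation can be gauged with a minimal sketch that compares it with an exact expectation. The exact benchmark assumes a symmetric two-point distribution with the same mean and variance, and the research time $h = 0.1$ is an arbitrary illustrative value; neither is taken from the text. The other values follow Figure 1.

```python
GAMMA, THETA, SIGMA_F = 0.5, 0.05, 1.25   # Figure 1 values
H = 0.1                                   # illustrative research time (assumed)
ETA_MEAN, ETA_STD = 0.85, 0.1             # mean from Figure 1; std = sqrt(0.01)


def g(eta):
    """G(eta) = eta / (1 + eta * h^(gamma+theta))^sigma_f, the stochastic part of (13)."""
    return eta / (1.0 + eta * H ** (GAMMA + THETA)) ** SIGMA_F


def taylor_expectation(mean, variance):
    """Second-order approximation of E[G(eta)] from Appendix D."""
    b = H ** (GAMMA + THETA)
    base = 1.0 + mean * b
    curvature = ((SIGMA_F + 1.0) * SIGMA_F * mean * b ** 2 / base ** (SIGMA_F + 2.0)
                 - 2.0 * SIGMA_F * b / base ** (SIGMA_F + 1.0))
    return mean / base ** SIGMA_F + 0.5 * curvature * variance


# Exact expectation under a symmetric two-point distribution with the same mean and variance.
exact = 0.5 * (g(ETA_MEAN - ETA_STD) + g(ETA_MEAN + ETA_STD))
approx = taylor_expectation(ETA_MEAN, ETA_STD ** 2)
print(exact, approx, g(ETA_MEAN))
```

Both the exact and the approximate expectation fall below $G$ evaluated at the mean, reflecting the concavity established in Appendix C.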


Appendix E: Alternative model

This version of the model follows the line of reasoning that, since firms are owned by the representative consumer, the utility function of the representative consumer determines how firms behave.¹⁰ Hence, we assume here that all consumers are the same. As there are infinitely many firms, a single firm has no effect on the profit of the representative consumer. We therefore let the representative consumer make all the choices (and, hence, internalize the external effect of skilled labor). This implies that the representative consumer solves the optimization problem:

$$\max_{x,h,c}\; E \sum_{t=0}^{\infty} \left(\frac{1}{1+\rho}\right)^{t} \frac{c_t^{1-\sigma} - 1}{1-\sigma} \tag{21}$$

$$\text{s.t.}\quad b_{t+1} = (1+r) b_t + w_t + \pi_t - c_t, \qquad A_{t+1} = \left(1 + \eta_{t+1} h_t^{\gamma+\theta}\right) A_t.$$

Inserting the expression for $\pi_t$ ($= \pi_{it}$) into the budget restriction, the restriction becomes

$$b_{t+1} = (1+r) b_t + Y_t - r A_t x_t - c_t. \tag{22}$$

The first-order condition with respect to $x_t$ implies that

$$r = \beta L_t^{1-\beta} A_t x_t^{\beta-1}. \tag{23}$$

The first-order condition with respect to $h_t$ can be written as

$$-c_t^{-\sigma}(1-\beta)(1-h_t)^{-\beta} A_t x_t^{\beta} + \frac{1}{1+\rho} E\left[c_{t+1}^{-\sigma}(\gamma+\theta)\eta_{t+1} h_t^{\gamma+\theta-1}\left((1-h_{t+1})^{1-\beta} A_t x_{t+1}^{\beta} - r A_t x_{t+1}\right)\right] = 0.$$

Using both first-order conditions and the fact that on a BEGP $x_t = x_{t+1}$ and $h_t = h_{t+1}$, the 'alternative' BEGP research condition reads

$$E\left[\frac{(\gamma+\theta)(1-h) h^{\gamma+\theta-1}}{1+\rho} \cdot \frac{\eta_{t+1}}{(c_{t+1}/c_t)^{\sigma}}\right] = 1. \tag{24}$$

10 In a way, this is similar to Wälde (1999), where firms indirectly maximize the expected utility of the representative consumer, i.e., firms are only engaged in static maximization.

If we compare this equation with the BEGP research condition as given by equation (13) in the text, there are three differences. First, instead of $\gamma$ we have here $\gamma + \theta$, reflecting the fact that the representative consumer internalizes the externalities between the firms. Second, $r$ is replaced by $\rho$, since the consumer discounts time with the time preference, while the firms discount time with the interest rate. Third and most importantly, $1 + \eta_{t+1} h_t^{\gamma+\theta}$, the growth rate of technology, is replaced by $c_{t+1}/c_t$, the growth rate of consumption. While the first two differences do not affect the qualitative effect of uncertainty on growth, the third difference makes it hard if not impossible to determine the effect of uncertainty on growth in general, since we cannot solve for the growth rate of consumption,¹¹ except when the interest rate is exactly that value at which saving equals investment. In the latter case, the growth rate of consumption exactly equals the rate of technological change, and equation (24) is qualitatively similar to equation (13), yielding the same result concerning the effect of uncertainty on growth as stated in Proposition 1.

Acknowledgements: I thank Jean-Marie Viaene for helpful discussions. Financial support from the Netherlands Organisation for Scientific Research (NWO) is gratefully acknowledged.

11 Actually, one does not necessarily have to solve for the growth rate of consumption completely. E.g., if the growth rate is a linear function of the shock, the effect of uncertainty is similar to that in the model in the text.

References:

Aghion, P., and Howitt, P. (1998) Endogenous Growth Theory. Cambridge, MA and London, England: The MIT Press.
Aizenman, J., and Marion, N. (1999) “Volatility and Investment: Interpreting Evidence from Developing Countries.” Economica 66 (262): 1157–79.
Beetsma, R.M.W.J., and Schotman, P.C. (2001) “Measuring risk attitudes in a natural experiment: data from the television game show lingo.” The Economic Journal 111: 821–48.
Blackburn, K., and Pelloni, A. (2004) “On the Relationship between Growth and Volatility.” Economics Letters 83: 123–127.
Campbell, J.Y., Lo, A.W., and MacKinlay, A.C. (1997) The Econometrics of Financial Markets. Princeton, NJ: Princeton University Press.
De Hek, P.A. (1999) “On Endogenous Growth under Uncertainty.” International Economic Review 40: 727–44.
Dinopoulos, E., and Thompson, P. (1996) “A Contribution to the Empirics of Endogenous Growth.” Eastern Economic Journal 22: 389–400.
Dinopoulos, E., and Thompson, P. (2000) “Endogenous Growth in a Cross-Section of Countries.” Journal of International Economics 51: 335–62.
Gertner, R. (1993) “Game shows and economic behavior: Risk taking on ‘card sharks’.” Quarterly Journal of Economics 108: 507–21.
Guiso, L., and Parigi, G. (1999) “Investment and Demand Uncertainty.” Quarterly Journal of Economics 114: 185–227.
Guvenen, M.F. (2006) “Reconciling Conflicting Evidence on the Elasticity of Intertemporal Substitution: A Macroeconomic Perspective.” Journal of Monetary Economics, forthcoming.
Hall, R.E. (1988) “Intertemporal Substitution in Consumption.” Journal of Political Economy 96: 339–57.
Hall, B., Griliches, Z., and Hausman, J. (1986) “Patents and R&D. Is there a Lag?” International Economic Review 27: 265–283.
Hopenhayn, H., and Muniagurria, M. (1996) “Policy Variability and Economic Growth.” Review of Economic Studies 63: 611–625.
Imbs, J. (2004) Growth and Volatility. Mimeo, London Business School.
Jones, L.E., Manuelli, R.E., Siu, H.E., and Stacchetti, E. (2005) “Fluctuations in Convex Models of Endogenous Growth I: Growth Effects.” Review of Economic Dynamics 8: 780–804.
King, R.G., Plosser, C., and Rebelo, S.T. (1988) “Production, Growth and Business Cycles, II: New Directions.” Journal of Monetary Economics 21: 309–341.
King, R.G., and Rebelo, S.T. (1988) Business Cycles with Endogenous Growth. Unpublished Paper, University of Rochester.
Kormendi, R.L., and Meguire, P.G. (1985) “Macroeconomic Determinants of Growth: Cross-Country Evidence.” Journal of Monetary Economics 16: 141–163.
Kortum, S. (1993) “Equilibrium R&D and the Patent-R&D Ratio: U.S. Evidence.” American Economic Review Papers and Proceedings 83: 450–457.
Lucas, R.E., Jr. (1988) “On the Mechanics of Economic Development.” Journal of Monetary Economics 22: 3–42.
Metrick, A. (1995) “A natural experiment in ‘jeopardy’.” American Economic Review 85: 240–53.
Obstfeld, M. (1994) “Risk-Taking, Global Diversification and Growth.” American Economic Review 84: 1310–29.
Ramey, G., and Ramey, V.A. (1995) “Cross-Country Evidence on the Link between Volatility and Growth.” American Economic Review 85: 1138–1151.
Romer, P.M. (1986) “Increasing Returns and Long-Run Growth.” Journal of Political Economy 94: 1002–1037.
Romer, P.M. (1990) “Endogenous Technological Change.” Journal of Political Economy 98: S71–S102.
Rothschild, M., and Stiglitz, J.E. (1970) “Increasing Risk I: A Definition.” Journal of Economic Theory 2: 225–243.
Rothschild, M., and Stiglitz, J.E. (1971) “Increasing Risk II: Its Economic Consequences.” Journal of Economic Theory 3: 66–84.
Sandmo, A. (1971) “On the Theory of the Competitive Firm under Price Uncertainty.” American Economic Review 61: 65–73.
Thompson, P. (1996) “Technological Opportunity and the Growth of Knowledge.” Journal of Evolutionary Economics 6: 77–97.
Viaene, J.-M., and Zilcha, I. (1998) “The Behavior of Competitive Exporting Firms under Multiple Uncertainty.” International Economic Review 39: 591–609.
Vissing-Jørgensen, A. (2002) “Limited Asset Market Participation and the Elasticity of Intertemporal Substitution.” Journal of Political Economy 110: 825–53.
Wälde, K. (1999) “A Model of Creative Destruction with Undiversifiable Risk and Optimizing Households.” Economic Journal 109: C156–C171.

Chapter 12 Stochastic Control, Non-Depletion of Renewable Resources, and Intertemporal Substitution

Nils Chr. Framstad
The Financial Supervisory Authority of Norway¹

12.1 Introduction

It is well known that if the economic discount rate uniformly exceeds the relative growth rate of a resource (measured in physical terms if price is constant, or more generally in value of extracting it all), then, assuming zero costs, a profit maximizer will want to do just that: instantly deplete the resource completely. Thus, from a conservationist point of view, a high discount rate is undesirable, since it represents less value of savings for future times and may lead to the extinction of populations and entire species and the irrecoverable loss of natural resources.

There is considerable literature (see e.g., Alvarez 2001 and the references therein) on the effect of uncertainty in such expected profit maximizer models, mainly where uncertainty is modeled by Brownian motion. Pindyck (1984) concludes that in an equilibrium model with bounded maximal extraction rate, one will extract at a rate which is either zero or the maximum possible, but decreasing as a function of the volatility. The bang-bang property indicates that if any extraction rate is allowed, then the optimal strategy should be characterized as a reflection, i.e., harvesting precisely as much as necessary in order to prevent the population from exceeding a given threshold. In such a setting, Alvarez (2001) confirms rigorously that increasing Brownian uncertainty will increase the threshold, and lead one to wait for a higher population before harvesting (i.e. the opposite effect of the discounting term). However, as pointed out in Framstad (2003), the choice of Brownian noise is crucial, as introducing qualitatively different zero-mean noises (namely jump uncertainty) may in fact lead to downwards reflection at populations lower than in the deterministic case. Having established that zero-mean uncertainty may actually lead to harvesting at a lower level, that paper nevertheless shows that, just as in the deterministic case, it is not optimal for an expected profit maximizer with an infinite time horizon to deplete the resource completely as long as the resource's expected relative growth at zero exceeds the discount rate.

Unlike the aforementioned works, this paper does not attempt to find any optimal solution; we shall show that the same criterion will imply that complete depletion cannot be optimal under a far more general class of preferences. It turns out that in this respect, linear utility still is the “worst case” among the risk averse, a property which is not a priori obvious, as consumption now is certain while future consumption is not.

We assume a setting where a single agent completely and cost free controls the irreversible extraction of the resource and possesses the relevant information on population size and the stochastic evolution law of the relative growth rate at zero. The model is assumed to be an Ito process with semimartingale driving noise. The reader who is not fully familiar with the mathematics behind this may think of the model as growth which at zero is locally approximately geometric, with a relative growth rate $b$ distorted by zero-mean noise; the noise will actually cancel out from the model, so we only need it to be sufficiently well behaved.

1 This work does not reflect the views of the Financial Supervisory Authority of Norway.

12.2 The preferences

With preferences represented by a direct utility function of present consumption rate, there might not be any substitute to consumption at a given time (i.e., a non-degenerate time interval). An obvious example is if direct utility is $-\infty$ at zero. It is however objectionable to assume that the agent has to consume at each and every second. In a more realistic setting, formalized by Hindy, Huang and Kreps (1992) in the deterministic case, and by Hindy and Huang (1992) under uncertainty, a positive portion consumed should keep the agent satisfied for some time, and consumption at two close points in time should be considered close substitutes not only when considering the value function (indirect utility), but also when considering the running direct utility rate. This means that the agent does not necessarily have to consume at all times; a “gulp” consumed now will not only increase direct utility rate at this particular moment, but also in the future. It is not unreasonable to guess that this could lead to earlier harvesting than in the case without intertemporal substitution. So just like in the jump uncertainty case, we may want to ask if the old criterion for non-depletion prevails: is it sufficient that the harvested population has a relative growth rate exceeding the economic discount rate? This paper sets out to show that this criterion seems quite robust.

To introduce intertemporal substitution, we shall assume that the current consumption rate only indirectly affects direct utility. Although current consumption rate does not enter as an argument directly [Cf. Hindy and Huang (1992), section 5], the case where direct utility depends only on current consumption rate may be obtained as a limiting case. As a simplification, we shall assume the agent's direct utility to depend on a single nonnegative process $C$ which represents not consumption itself, but a decayed transformation of the past consumption path (also frequently referred to as “durability”).

12.2.1 Exponential decay

Since we assume $C$ to be one-dimensional, it will turn out from quite reasonable mathematical assumptions that past consumption decays exponentially as time goes by. The justification is as follows: Assume consumption at two stopping times $t_1 \ge t_0$. We will assume that the rate at which past consumption is “forgotten” is $F$, so that we have
\[
C(t) = \begin{cases} C(t_0^+)\cdot F(t,t_0) & \text{for } t\in(t_0,t_1),\\ C(t_1^+)\cdot F(t,t_1) & \text{for } t > t_1. \end{cases} \tag{1}
\]
Now assume that there is continuity in the sense that if a “zero amount” is consumed at $t_1$, then $C$ is continuous at $t_1$ and both formulae may be applied. Hence we will require, for $t > t_1$, that
\[
C(t) = F(t,t_1)\,C(t_1) = F(t,t_1)\,F(t_1,t_0)\,C(t_0) \quad\text{but also}\quad C(t) = F(t,t_0)\,C(t_0). \tag{2}
\]

Since $F$ is supposed to represent decay, the following assumptions are natural:
\[
\frac{\partial F}{\partial t}(t,\bar t) \le 0 \quad\text{and}\quad F(t,t) = 1 \ \text{everywhere.} \tag{3}
\]
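The step from the composition rule (2) and the conditions (3) to an exponential form can be made explicit. The following is only a sketch, under the added assumption (not stated in the text) that $F$ is continuously differentiable in its first argument; the symbol $\delta(t)$ is introduced here as shorthand for minus the derivative in (3) evaluated on the diagonal.

```latex
% From (2): F(t,t_0) = F(t,t_1) F(t_1,t_0) for t > t_1 >= t_0.
% Differentiate in t, then let t_1 increase to t.
\begin{align*}
\frac{\partial F}{\partial t}(t,t_0)
  &= \frac{\partial F}{\partial t}(t,t_1)\,F(t_1,t_0)
  \;\xrightarrow[\;t_1 \uparrow t\;]{}\; -\delta(t)\,F(t,t_0),
  \qquad
  \delta(t) := -\frac{\partial F}{\partial t}(t,\bar t)\Big|_{\bar t = t} \;\ge\; 0
  \ \text{ by (3)},\\[4pt]
F(t_0,t_0) &= 1 \ \text{ by (3)}
  \quad\Longrightarrow\quad
  F(t,t_0) = \exp\Big\{-\int_{t_0}^{t} \delta(s)\,ds\Big\}.
\end{align*}
```

If, in addition, $F(t,\bar t)$ depends on its arguments only through the difference $t - \bar t$, then $\delta$ is constant and $F(t,\bar t) = e^{-\delta (t-\bar t)}$.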

Conditions (2)–(3) grant that $F(t,\bar t)$ represents exponential decay with respect to the difference $(t - \bar t)$. Now exponential decay has the property that one mouthful consumed today increases utility at all future times, regardless of how distant, and one may object to this property. It will however turn out that long durability of consumption is undesirable from a conservationist point of view, in the sense that assuming infinite memory is a harder test for the non-depletion criterion we want to prove. On the other hand, it poses little technical difficulty to allow the decay rate $\delta$ (which enters in (4)–(5) below) to be time-dependent, and we will do so, but (for simplicity) we shall assume it to be non-random. We will frequently need the multiplicative decay factor
\[
\Delta := \exp\Big\{-\int_0^{\cdot} \delta\,dt\Big\}, \tag{4}
\]
where $\delta(t)$ is continuous at 0 and locally bounded.

12.2.2 The process C and direct utility

Having justified the exponential decay form, we shall assume that, for $H$ being the cumulative harvest process, $C$ obeys
\[
dC = -\delta C\,dt + k\,dH. \tag{5}
\]
We shall argue that we can take $k = 1$ (constant), but first we notice that if we put $k = \delta$ and let $\delta \to \infty$, we recover the classical nonintertemporal case $dC = dH$. Furthermore, we can interpret $k$ as a price and technology parameter; imagine that extracting 1 unit from the resource yields $k$ units to consume (the units need not be the same, for example in case $k$ reflects price). We note that Theorem 1 will be trivial unless $k > 0$. It is natural to allow this parameter to vary over time, but we leave it to the reader to see that we can incorporate such time-dependence into the drift term $b$ of the process $X$ to be introduced in (7) below. That way, or by considering $C/k$ instead of $C$, we see that we can (and will) assume
\[
k = 1. \tag{6}
\]
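For a harvest path made up of discrete gulps, (5) with $k = 1$ and a constant decay rate can be solved in closed form: each gulp simply decays by the factor in (4) from the moment it is consumed. The following Python sketch compares that closed form with a crude Euler discretization of (5). The function names, the restriction to a constant $\delta$ and the representation of $H$ as a list of (time, amount) pairs are assumptions made for this illustration, not part of the original text.

```python
import math

def decay_factor(t, delta):
    """Delta(t) = exp(-delta * t) for a constant decay rate delta, cf. (4)."""
    return math.exp(-delta * t)

def stock_closed_form(t, gulps, delta, c0=0.0):
    """C(t) under (5) with k = 1 when harvesting happens only in discrete gulps.

    gulps: list of (time, amount) pairs; a gulp at time s counts for t > s.
    """
    total = c0
    for s, h in gulps:
        if s < t:
            total += h / decay_factor(s, delta)
    return decay_factor(t, delta) * total

def stock_euler(t, gulps, delta, c0=0.0, n_steps=200_000):
    """Same quantity via an explicit Euler scheme for dC = -delta*C dt + dH."""
    dt = t / n_steps
    pending = sorted(gulps)
    c, idx = c0, 0
    for step in range(n_steps):
        now = step * dt
        # add every gulp whose time has been reached at the start of this step
        while idx < len(pending) and pending[idx][0] <= now:
            c += pending[idx][1]
            idx += 1
        c -= delta * c * dt  # exponential forgetting between gulps
    return c
```

With gulps of 0.9 at time 0 and 0.1 at time 1 and $\delta = 0.5$, both functions should agree on $C(2)$ up to the discretization error of the Euler scheme.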

We then assume that the direct utility rate at time $t$ is $U(t, C(t))$, where $U$ is continuous near $t = 0$ and $C^2$ near $C = 0$. We denote the derivatives in the second argument by primes, and for technical reasons assume that they are sufficiently regular for small nonzero values in $C$. The prototypical direct utility rate would be the discounted form $\exp\{-\rho t\}\,u(C(t))$, but our generalization represents no obstacle and allows for $t$-dependence in $u$ as well. Hence we shall stick with the $U$ notation through Theorem 1 below until Corollary 1, where a more explicit calculation will turn out convenient.

12.3 The optimal control problem

Consider an agent who wants to maximize expected total utility from harvesting from a population $X$, obeying an Ito stochastic differential equation of the form
\[
dX(t) = X(t^-)\cdot\Big( b(t,X)\,dt + \sigma\,dM(t) + \int \eta_z\,\tilde\Pi(dt,dz) \Big) - dH(t), \qquad X(0^+) = x, \tag{7}
\]
where we assume the following mathematical detail: $M$ is a continuous martingale, and $\tilde\Pi = \Pi - \pi$ is a measure-valued pure jump martingale composed from an integer-valued random measure $\Pi$ (assumed right continuous with left limits) and its continuous compensator $\pi$. Here, $\sigma$ and $\eta$ are stochastic functions, and we assume ad hoc uniqueness and (local) existence of solutions of (7), and furthermore that $b$ is continuous at $(0,0)$, while both $\sigma\,d[M,M]$ and $\eta_z\,d\pi$ are sufficiently integrable. The process $H$ is our control, the total amount extracted up to and including time $t$; of course, the harvested process $X$ should remain nonnegative, so we assume that $X$ does not jump past 0 by itself (i.e. we assume $\eta_z \ge -1$) and that we cannot harvest more than the present population, i.e. we restrict the set of admissible $H$ by imposing
\[
H(t^+) - H(t^-) \le X(t) \quad\text{for all } t. \tag{8}
\]
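To get a feeling for (7), the unharvested population ($H = 0$) can be simulated with a simple Euler scheme. The sketch below is only one possible specialization, under assumptions that are not in the text: constant coefficients, $M$ a standard Brownian motion, and a single jump size arriving at a constant rate, with at most one jump per time step. All function and parameter names are hypothetical. Because both noise terms have zero mean, the sample mean of $X(T)$ should be close to $x(1 + b\,dt)^n \approx x e^{bT}$, in line with the remark in the introduction that the noise cancels out.

```python
import math
import random

def simulate_population(x, b, sigma, eta, jump_rate, horizon, n_steps, rng):
    """One Euler path of (7) with H = 0 and constant coefficients.

    Assumptions of this sketch: M is a standard Brownian motion, the jump
    measure has a single jump size eta >= -1 arriving at rate jump_rate,
    and each step sees at most one jump (Bernoulli approximation).
    """
    dt = horizon / n_steps
    for _ in range(n_steps):
        dM = rng.gauss(0.0, math.sqrt(dt))                 # Brownian increment
        dN = 1.0 if rng.random() < jump_rate * dt else 0.0  # jump indicator
        compensated_jump = eta * (dN - jump_rate * dt)      # zero-mean jump term
        x *= 1.0 + b * dt + sigma * dM + compensated_jump
    return x

def mean_population(n_paths=20_000, seed=0, **kwargs):
    """Monte Carlo estimate of E[X(horizon)] over n_paths independent paths."""
    rng = random.Random(seed)
    total = sum(simulate_population(rng=rng, **kwargs) for _ in range(n_paths))
    return total / n_paths
```

The mean depends on the drift $b$ only; changing `sigma`, `eta` or `jump_rate` alters the spread of the paths but not their expected growth.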

For given values of $X(0)$ and $C(0)$, define the performance up to time $T$ to be
\[
J_T(H) := E\Big[\int_0^T U(t, C(t))\,dt\Big], \tag{9}
\]
and suppose the objective function to be maximized over $H$ to be $J_{\bar T}$, where $\bar T$ is a fixed deterministic time horizon, finite or infinite.

12.3.1 Optimality criteria

Now the usual idea of optimality would be to try to find a control maximizing $J_{\bar T}$. There are however other optimality criteria designed either for refinement or to ensure existence. We want to treat both the quite weak “sporadically catching up” (SCU) and the strong “overtaking” (OT) optimality criteria, i.e., by definition, an “optimal” $H^*$ should satisfy, respectively, for all admissible $H$:
\[
\text{SCU:}\quad \limsup_{T \nearrow \bar T}\,\big(J_T(H^*) - J_T(H)\big) \ge 0, \tag{10}
\]
\[
\text{OT:}\quad J_T(H^*) - J_T(H) \ge 0 \ \text{ for all large enough } T < \bar T. \tag{11}
\]
For some intuition on the concepts, think of $\bar T = \infty$ as merely a mathematically convenient approximation to a very long time horizon, and assume that there are multiple controls which accumulate arbitrarily high utility at large times. For example, assume that we have three controls $H_1, H_2, H_3$ with $J_T(H_1) = T$, $J_T(H_2) = 2(T - 1/(2T+1))$, and $J_T(H_3) = 2(T - 2/(2T+1))\sin T$. Then $J_\infty(H_1) = J_\infty(H_2) = +\infty$. However, $H_2$ overtakes the two others and $H_3$ still sporadically catches the two others, as $\limsup\,(J_T(H_3) - J_T(H_2)) = \lim\,(-2/(2T+1)) = 0$ (the limit being taken along times with $\sin T = 1$) and $\limsup\,(J_T(H_3) - J_T(H_1)) = \infty$. This reasoning exhibits OT-optimality as stronger than both “ordinary” optimality and SCU-optimality, but no relation between the two latter, although SCU-optimality is quite weak if $\limsup_T J_T < \infty$ for all $H$.

For a more thorough treatment of different optimality concepts, see e.g. Seierstad and Sydsæter (1987).
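The three-control example can be checked numerically. The sketch below evaluates the differences appearing in (10) and (11) on a finite grid, which can of course only suggest, not prove, the limiting behaviour; the function names and the grid are choices made for this illustration.

```python
import math

def j1(T):
    return T

def j2(T):
    return 2.0 * (T - 1.0 / (2.0 * T + 1.0))

def j3(T):
    return 2.0 * (T - 2.0 / (2.0 * T + 1.0)) * math.sin(T)

def overtakes(ja, jb, grid):
    """OT check (11) on a finite grid: ja(T) - jb(T) >= 0 at every grid point."""
    return all(ja(T) - jb(T) >= 0.0 for T in grid)

def sup_gap(ja, jb, grid):
    """Crude proxy for the lim sup in (10): largest gap ja - jb over the grid."""
    return max(ja(T) - jb(T) for T in grid)

# "Large" horizons T in [10, 1010) and the times at which sin T = 1.
grid = [10.0 + 0.01 * i for i in range(100_000)]
peaks = [math.pi / 2 + 2 * math.pi * n for n in range(2, 150)]

h2_overtakes_both = overtakes(j2, j1, grid) and overtakes(j2, j3, grid)
h3_gap_to_h2_at_peaks = [j3(T) - j2(T) for T in peaks]  # small, negative, -> 0
h3_gap_to_h1 = sup_gap(j3, j1, grid)                    # grows with the horizon
```

On this grid `h2_overtakes_both` is true, the gaps in `h3_gap_to_h2_at_peaks` shrink towards zero from below, and `h3_gap_to_h1` is of the order of the largest horizon considered.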

12.3.2 Comparing only a few strategies

A priori, the agent should be permitted to choose among a possibly quite large class of non-anticipating left continuous non-decreasing (cumulative) harvesting processes $H$ satisfying (8), and as pointed out by Hindy and Huang, it should be possible to consume both in a continuous way and with discrete gulps. However, the purpose of this paper is not to prove optimality, but to prove non-optimality of a given strategy, namely immediate total depletion, denoted by $H = \bar H_0$. For this purpose, it suffices to find one strategy which is better. Specifically, we shall consider strategies where nearly all of the initial population $X(0)$ is harvested immediately, and after a short time $\tilde t$, the rest is harvested. Therefore, we can without loss of generality assume $C(0) = 0$: we are not allowed to extract a negative amount, but we can imagine doing so, only to immediately reverse the operation and also extract most of the rest. This fictitious operation will not affect the processes nor the running utility rate except at the single time zero, hence not the problem. We schematize:

• We assume $H(0) = 0$ and $C(0) + X(0) = x_0$; without loss of generality, we assume $C(0) = 0$ and $X(0) = x_0$.
• We harvest $H(0^+) = x_0 - x$ and “start $X$ at” $X(0^+) = x$ (cf. (7)), where $x$ is assumed small. Thus $C(0^+) = x_0 - x$.
• We let $X$ evolve according to (7) until time $\tilde t$, when we harvest the rest, namely $X(\tilde t)$.
• This strategy will be denoted $\bar H_{\tilde t,x}$. Observe that $\bar H_{0,x} = \bar H_{\tilde t,0} =: \bar H_0$.

It will turn out that $\tilde t x \mapsto J_T(\bar H_{\tilde t,x}) - J_T(\bar H_0)$ has a first-order term.

12.4 Non-optimality of immediate total depletion

Let us first define a function $K$ which will help us to state the result in a very general form:
\[
K(T) := (\bar b + \bar\delta)\int_0^T \Delta(t)\cdot U'(t,\Delta(t)x_0)\,dt - U'(0,x_0), \tag{12}
\]
where
\[
\bar b := b(0,0) \tag{13}
\]
is the relative drift rate at the limit $(t,X) \to (0,0)$, and
\[
\bar\delta := \delta(0) \tag{14}
\]
is the relative decay rate of the consumption at time 0. We then have the following:

Theorem 1. Assume $x_0 > 0$, and that the above conditions hold.
• If $K(T_n) > 0$ for some sequence $T_n \nearrow \bar T$, then $\bar H_0$ is not overtaking-optimal.
• If $K$ is positive and bounded away from 0 on some nonempty interval $(T, \bar T]$, then $\bar H_0$ is not sporadically catching up-optimal.
• If in addition to the second bullet point $J_{\bar T}(\bar H_0) < \infty$ holds, then $\bar H_0$ is not optimal in the ordinary sense.

Proof: With the above described strategy $\bar H_{\tilde t,x}$, we can calculate $C$ easily:
\[
C(t) = \begin{cases} \Delta(t)(x_0 - x) & \text{if } t\in(0,\tilde t],\\ \Delta(t)\big[(x_0 - x) + X(\tilde t)/\Delta(\tilde t)\big] & \text{if } t > \tilde t. \end{cases}
\]
Then we have for each $T < \bar T$,
\[
J_T(\bar H_{\tilde t,x}) = \int_0^{\tilde t} U\big(t,\Delta(t)(x_0-x)\big)\,dt + E\Big[\int_{\tilde t}^{T} U\big(t,\Delta(t)[(x_0-x)+X(\tilde t)/\Delta(\tilde t)]\big)\,dt\Big].
\]

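Observe that $|J_T(\bar H_0)| < \infty$, and consider the difference $D(\tilde t,x) := J_T(\bar H_{\tilde t,x}) - J_T(\bar H_0)$:
\[
\begin{aligned}
D(\tilde t,x) ={}& \int_0^{\tilde t}\Big[U\big(t,\Delta(t)(x_0-x)\big) - U\big(t,\Delta(t)x_0\big)\Big]\,dt\\
&+ E\Big[\int_{\tilde t}^{T}\Big[U\big(t,\Delta(t)[(x_0-x)+X(\tilde t)/\Delta(\tilde t)]\big) - U\big(t,\Delta(t)x_0\big)\Big]\,dt\Big].
\end{aligned}
\]
By the mean value theorem, there is some $t_1 \in (0,\tilde t)$ and some $x_1 \in (0,x)$ so that the first line of the right hand side is equal to
\[
-x\,\tilde t\,\Delta(t_1)\,U'\big(t_1,\Delta(t_1)(x_0 - x_1)\big).
\]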

As for the second line, fix $t$ and write the expectation argument as $f(Y(\tilde t))$ where $Y := X/\Delta$; note that $(dY)/Y = \delta\,dt + (dX)/X$. By the Ito formula,
\[
\begin{aligned}
df(Y) ={}& \big(b(\tilde t, Y\Delta) + \delta\big)\,Y f'(Y)\,d\tilde t + \tfrac12 \sigma^2 Y^2 f''(Y)\,d[M,M]\\
&+ \int \big[f((1+\eta_z)Y) - f(Y) - \eta_z Y f'(Y)\big]\,d\pi + [\text{martingale terms}].
\end{aligned} \tag{15}
\]
Noting that in general, $f(Y(t_1))$ will depend on $t$ (but that $f(Y(0)) = 0$ for all $t$), and taking expectation, we get that for some $t_2 \in (0,\tilde t)$,
\[
\begin{aligned}
D(\tilde t,x) ={}& -x\,\tilde t\,\Delta(t_1)\,U'\big(t_1,\Delta(t_1)(x_0-x_1)\big)\\
&+ E\Big[\tilde t\,Y(t_2)\big(\delta(t_2) + b(t_2,X(t_2))\big)\int_{\tilde t}^{T} f'(Y(t_2))\,dt\\
&\qquad + \int_{\tilde t}^{T}\!\!\int_0^{\tilde t} [\text{higher than first order terms in } x]\,d[M,M]\,dt\\
&\qquad + \int_{\tilde t}^{T}\!\!\int_0^{\tilde t} [\text{higher than first order terms in } x]\,d\pi\,dt\Big].
\end{aligned} \tag{16}
\]
So
\[
\lim_{\tilde t x\to 0} \frac{D(\tilde t,x)}{\tilde t x} = -U'(0,x_0) + (\bar\delta + \bar b)\int_0^{T} E f'(Y(0))\,dt.
\]
Substituting for $f'(Y(0)) = \Delta(t)\,U'(t,\Delta(t)x_0)$, we see that the right hand side is precisely $K(T)$. Now let $T$ grow through some sequence (OT) or all sequences (SCU), and the conclusion follows. □

A more recognizable form of the criterion follows immediately:

Corollary 1. Assume $\bar T = \infty$, $\delta = \bar\delta$ (constant) and that for all $(t,c) \in [0,\infty)\times(0,x_0]$, the utility function $U(t,c)$ coincides with $e^{-\rho t}u(c)$ for some nondecreasing and concave $u$, with constant $\rho > -\bar\delta$ (implying $J_\infty(\bar H_0) < \infty$). If $\bar b > \rho$, then $K(T)$ is bounded away from 0 for all $T$ large enough and thus $\bar H_0$ is neither SCU-optimal nor optimal in the ordinary sense.

Proof: Consider $K$. Observe that the coefficient in front of the integral is positive, and that by concavity, $U'(t,\Delta(t)x_0) \ge e^{-\rho t}u'(x_0)$. Substitute this and calculate explicitly the resulting estimate (a lower bound for $K$), and note that it is increasing at infinity. □

Note that the conditions of infinite horizon and constant discount rate go hand in hand here: finite horizon $\bar T$ corresponds to shifting $\rho$ to $+\infty$ at $\bar T$, so a corresponding result would be less powerful for $\bar T < \infty$. By proceeding as in the proof, we can nevertheless obtain conditions for non-depletion for this and other cases of time-dependent $\rho$. We skip the details.

Corollary 1 suggests that linear utility is “worst case” among the concaves; this is however no proof, as we have only compared within a narrow class of (usually sub-optimal) strategies. But even an expected profit maximizer will not want to deplete the population immediately if growth at 0 exceeds the discount rate, just as in the case where $C$ equals the extraction rate itself. Rather than using concavity, we may sometimes want a stronger result for a given utility function:

Example. (CRRA utility) Assume that we are in the setting of Corollary 1, with $u$ being CRRA, i.e.
\[
u(c) = \frac{c^{1-\theta} - 1}{1-\theta} \quad\text{for some } \theta \ge 0.
\]
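In the discounted setting of Corollary 1, $K(T)$ from (12) is easy to evaluate numerically. The following Python sketch does so for the CRRA family, once by a midpoint rule and once using the elementary antiderivative of the integrand, which in this case reduces to $x_0^{-\theta} e^{-(\rho+\delta(1-\theta))t}$. Function names and parameter values are illustrative assumptions, and the constant decay rate is written `delta`.

```python
import math

def crra_marginal(c, theta):
    """u'(c) = c**(-theta) for the CRRA family."""
    return c ** (-theta)

def k_numeric(T, x0, b_bar, delta, rho, theta, n=200_000):
    """K(T) from (12) with U(t, c) = exp(-rho t) u(c), by the midpoint rule."""
    dt = T / n
    integral = 0.0
    for i in range(n):
        t = (i + 0.5) * dt
        decay = math.exp(-delta * t)  # Delta(t) for a constant decay rate
        integral += decay * math.exp(-rho * t) * crra_marginal(decay * x0, theta) * dt
    return (b_bar + delta) * integral - crra_marginal(x0, theta)

def k_closed_form(T, x0, b_bar, delta, rho, theta):
    """Same quantity, integrating exp(-a t) with a = rho + delta * (1 - theta)."""
    a = rho + delta * (1.0 - theta)
    integral = T if a == 0.0 else (1.0 - math.exp(-a * T)) / a
    return x0 ** (-theta) * ((b_bar + delta) * integral - 1.0)
```

For instance, `k_closed_form(400.0, 2.0, 0.05, 0.1, 0.03, 0.5)` uses $\bar b = 0.05 > \rho = 0.03$ and returns a positive value, in line with Corollary 1; raising `theta` above $1 + \rho/\delta$ makes the exponent $a$ negative, and $K(T)$ then grows without bound in $T$.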

Calculating $K(\infty)$ explicitly, we first find that $K(\infty) = +\infty$ in the case $\rho + \delta\cdot(1-\theta) \le 0$; otherwise, we have
\[
x_0^{\theta}\,\big(\rho + \delta\cdot(1-\theta)\big)\,K(\infty) = \bar b + \delta\theta - \rho.
\]
We get an improved condition compared to Corollary 1: immediate depletion is non-optimal even in all three senses if either
\[
\rho \le \delta\cdot(\theta - 1) \quad\text{or}\quad \rho < \bar b + \delta\theta,
\]
and we note that for the most interesting parameter ranges $\bar b + \delta > 0$, the latter holds whenever the former does. In particular, it will always hold for large enough $\delta$, i.e. low enough degree of intertemporal substitution, cf. in particular the limiting case where $k = \delta \to \infty$. Thus our class of preferences gives for CRRA utility the entire spectrum of conditions from none at all in the classical nonlinear case ($k = \delta \to \infty$, $\theta > 0$; cf. the discussion at the beginning of section 12.2), to $\rho < \bar b$ for no decay in direct utility ($\delta = 0$). We see that $\delta$ and $\theta$ enter our criterion only as a product $\delta\theta$, i.e. symmetrically. An economic interpretation could be that intertemporal decay plays a similar role as risk aversion represented by the Arrow-Pratt index: no decay or zero risk aversion gives the classical $\rho < \bar b$ criterion, while high decay or high risk aversion indicates a more conservative resource extraction strategy. We remark though that this interpretation is not sufficiently supported by logic: again, we have only compared immediate depletion with non-optimal strategies.

12.5 Concluding remarks

It is no surprise that intertemporal direct utility can be “worse” (from a conservationist point of view) than the classical case, as seen by our example: In the extreme, direct utility functions which yield $-\infty$ at zero will immediately prohibit immediate depletion in the classical case, whereas an initial “gulp” will grant the agent finite running utility in our setup. Nevertheless, we have shown that the well-known discount-rate criterion for non-depletion seems fairly robust to these kinds of preferences and to non-Markovian noise. Arguably, there is a weakness that the model has no financial market; the harvested amount must immediately be converted into consumption. There might be a price ratio, but there is no way of investing in this model. Finally, we mention that our results do admit several improvements in the technical conditions. For example, the only need for $U$ is to be able to deal with stochastic differentials. The Meyer-Ito formula allows for $c \mapsto U$ to merely be a difference between convex functions, and the second-order terms still vanish in the limit transition. Maybe more importantly, we can, under the appropriate conditions, allow the coefficients to depend on other stochastic parameters than $X$, and in this way cover more complex systems.

Acknowledgements: Main research was carried out at the Department of Mathematics, University of Oslo. The author gratefully acknowledges financial support from the Research Council of Norway.

References:

Alvarez, L. H. R. (2001) “Singular Stochastic Control, Linear Diffusions, and Optimal Stopping: a Class of Solvable Problems.” SIAM Journal on Control and Optimization 39: 1697–1710.
Framstad, N. C. (2003) “Optimal Harvesting of a Jump Diffusion Population and the Effect of Jump Uncertainty.” SIAM Journal on Control and Optimization 42: 1451–1465.
Hindy, A., and Huang, C.-F. (1992) “Intertemporal Preferences for Uncertain Consumption: a Continuous Time Approach.” Econometrica 60: 781–801.
Hindy, A., Huang, C.-F., and Kreps, D. (1992) “On Intertemporal Preferences in Continuous Time: the Case of Certainty.” Journal of Mathematical Economics 21: 401–440.
Pindyck, R. S. (1984) “Uncertainty in the Theory of Renewable Resource Markets.” Review of Economic Studies 51: 289–303.
Seierstad, A., and Sydsæter, K. (1987) Optimal Control Theory with Economic Applications. North-Holland Publishing Co., Amsterdam.

Chapter 13 Capital Accumulation in a Growth Model with Creative Destruction

Klaus Wälde University of Würzburg, Würzburg, Germany

13.1 Introduction

Aghion and Howitt (1992) have presented a very influential model of endogenous growth. Long-run growth results from R&D for improved intermediate goods where each new vintage of intermediate goods yields a higher total factor productivity. While the original presentation of the model did not take capital accumulation into consideration, various more recent contributions (Aghion and Howitt, 1998; Howitt and Aghion, 1998; Howitt, 1999) combined capital accumulation with R&D. These contributions share the feature of risk-neutral agents. Wälde (1999a) has shown that introducing risk-averse households into the Aghion and Howitt (1992) model substantially alters equilibrium properties. Three out of four market failures disappear and a new market failure resulting from a complementarity in financing R&D is identified. It is the objective of the present paper to show that extending Aghion and Howitt's (1998), Howitt and Aghion's (1998) and Howitt's (1999) models to risk-averse households also considerably broadens the range of phenomena to which their model can be applied. Such an extension allows us to understand not only long-run growth but also short-run fluctuations. Aghion and Howitt's basic setup therefore implies much richer predictions once the assumption of risk-neutrality is relaxed.

The next section presents a model that contains the central features of Aghion and Howitt's setup, notably an R&D sector whose probability of success (arrival rate) depends on the amount of resources allocated to R&D. In addition, capital accumulation and the consumption and investment decision of risk-averse households are modeled explicitly. The economy we present produces one good that can be used for consumption, for capital accumulation and as an input for risky R&D. This good employs capital and labor. Risk-averse households can use their savings for financing capital accumulation and R&D. As this investment decision is based on (expected) returns, the amount of resources allocated to capital accumulation will be high when returns to capital accumulation are high relative to expected returns to R&D. With high capital returns, capital accumulation will be fast. When returns to capital accumulation have fallen (due to decreasing returns to capital), capital accumulation will be slower, just as on the saddle path of a standard Ramsey growth model. When capital returns are sufficiently low, research for new technologies will be financed. Once research is successful, a new technology is available and returns to capital accumulation will again be high. The discrete increase in total factor productivity due to new technologies combined with gradual capital accumulation allows us to understand how short-run fluctuations and long-run growth are jointly determined.1

Two features of the model that differ from Aghion and Howitt's setup are worth emphasizing: First, gradual capital accumulation can be studied only with risk-averse households. Hence without risk-aversion, short-run fluctuations (of the type presented here) cannot be understood. Second, some features of Aghion and Howitt's setup, which are not essential for the argument we want to make here, are not taken into consideration. Most importantly, the present model does not have any imperfect competition features. Modeling an economy that is perfectly competitive in all sectors (and therefore has no monopolist in the intermediate good sector) makes the model very tractable. Incentives for R&D are nevertheless present in a decentralized economy, as the outcome of R&D is assumed to consist not only in a blueprint but also in a prototype of the new units of production. As will become clear below, the qualitative properties of the present model should be identical to the qualitative properties of a model with a monopolist in the intermediate goods sector.

1 Some equilibrium properties of the present model resemble the findings of Bental and Peled (1996) and Matsuyama (1999), who use a discrete-time framework.

13.2 Framework of the model

13.2.1 Technologies

Technological progress is labor augmenting and embodied in capital. A capital good $K_j$ of vintage $j$ allows workers to produce with a labor productivity of $A^j$. Hence, a more modern vintage $j+1$ implies a labor productivity that is $A$ times higher than labor productivity of vintage $j$. The production function corresponding to this capital good reads
\[
Y_j = K_j^{\alpha}\,\big(A^j L_j\big)^{1-\alpha}. \tag{1}
\]
The amount of labor allocated to this capital good is denoted by $L_j$, and $0 < \alpha < 1$ is the output elasticity of capital. The sum of labor employment $L_j$ per vintage equals aggregate constant labor supply $L$, $\sum_{j=0}^{q} L_j = L$, where $q$ is the most advanced vintage currently available. Independently of which vintage is used, the same type of output is produced. Aggregate production therefore equals
\[
Y = \sum_{j=0}^{q} Y_j. \tag{2}
\]
Aggregate output is used for producing consumption goods $C$, investment goods $I$ and it is used as an input $R$ for doing R&D,
\[
C + I + R = Y. \tag{3}
\]
The objective of R&D is to develop capital goods that yield a higher labor productivity than existing capital goods. R&D is an uncertain activity which is modeled by the Poisson process $q$. The probability per unit of time $dt$ of successful R&D is given by $\lambda\,dt$, where $\lambda$ is the arrival rate of the process $q$. This arrival rate is an increasing function of the amount of resources $R$ used for R&D,
\[
\lambda = R/D(q). \tag{4}
\]
The parameter $D(q)$, a fundamental of the model, captures differences in sector input requirements between R&D and the other sectors. It is an increasing function of the currently most advanced vintage $q$, as will be discussed later. It will basically be used to remove the well-known scale effect (Backus, Kehoe and Kehoe, 1992; Jones, 1995; Segerstrom, 1998; Young, 1998; Howitt, 1999) in the present model.

When R&D is successful, a first prototype of a production unit that yields a labor productivity of $A^{q+1}$ becomes available. Let the size of this first machine be given by $\kappa_{q+1}$.2 It might appear unusual that research actually leads to a first production unit. Usually, output of successful research is modeled as a blueprint. It should not be too difficult to imagine, however, that at the end of some research project, engineers have actually developed a first machine that implies this higher labor productivity. With this assumption, there are incentives to finance R&D in a decentralized economy, even though all sectors produce under perfect competition: Those who have financed R&D obtain the production unit $\kappa_{q+1}$, whose capital rewards balance R&D costs. Hence, no profits by a monopolist are required.3 As a second effect of successful R&D, the economy can accumulate capital that yields this higher labor productivity. This is a positive externality.4

Each vintage of capital is subject to depreciation at the constant rate $\delta$. If more investment is allocated to vintage $j$ than capital is lost due to depreciation, the capital stock of this vintage increases in a deterministic way,
\[
dK_j = (I_j - \delta K_j)\,dt, \qquad j = 0 \ldots q. \tag{5}
\]
When research is successful, the capital stock of the next vintage $q+1$ increases discretely by the size $\kappa_{q+1}$ of the first new machine of vintage $q+1$,
\[
dK_{q+1} = \kappa_{q+1}\,dq. \tag{6}
\]
Afterwards, (5) would apply to vintages $j = 0 \ldots q+1$.5

2 The size can differ for different vintages and we will later assume that $\kappa_q$ increases in $q$.
3 With a monopolist and capital, agents could hold capital and shares in the monopolist. This would require asset pricing, which would make the model intractable when transitional dynamics are to be analyzed.
4 There is an interesting link to the Coase theorem as it was recently amended by Dixit and Olson (2000): When bundling a collective good (the new technology) with a private good (the new machine), the collective good will be provided.
5 Formally, this equation is a stochastic differential equation driven by the Poisson process $q$. The increment $dq$ of this process can either be 0 or 1. Successful R&D means $dq = 1$. For an introduction, cf., e.g., Dixit and Pindyck (1994).

Before describing households in this economy, we now derive some straightforward equilibrium considerations that both simplify the presentation of the production side and, more importantly, the derivation of the budget constraint of households in the next section. Allowing labor to be mobile across vintages $j = 0 \ldots q$ such that wage rates equalize, the total output of the economy can be represented by a simple Cobb-Douglas production function (cf., Appendix A)
\[
Y = K^{\alpha} L^{1-\alpha}. \tag{7}
\]
Vintage specific capital stocks have been aggregated to an aggregate capital index $K$,
\[
K = K_0 + BK_1 + \ldots + B^q K_q = \sum_{j=0}^{q} B^j K_j, \qquad B = A^{\frac{1-\alpha}{\alpha}}. \tag{8}
\]
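The aggregation behind (7) and (8) can be verified numerically. The sketch below assumes the labor allocation $L_j = L\,B^j K_j / K$, which is what equalizing the marginal product of labor across the vintage technologies (1) gives; this allocation rule is derived here for illustration and is not quoted from Appendix A. Function names and parameter values are hypothetical.

```python
def vintage_output(K_j, L_j, j, A, alpha):
    """Output of vintage j, equation (1): K_j^alpha * (A^j * L_j)^(1 - alpha)."""
    return K_j ** alpha * (A ** j * L_j) ** (1.0 - alpha)

def capital_index(stocks, A, alpha):
    """Quality-adjusted capital index K from (8), with B = A^((1 - alpha)/alpha)."""
    B = A ** ((1.0 - alpha) / alpha)
    return sum(B ** j * K_j for j, K_j in enumerate(stocks))

def wage_equalizing_labor(stocks, L, A, alpha):
    """Labor per vintage that equalizes marginal products: L_j = L * B^j K_j / K."""
    B = A ** ((1.0 - alpha) / alpha)
    K = capital_index(stocks, A, alpha)
    return [L * B ** j * K_j / K for j, K_j in enumerate(stocks)]

def aggregate_output(stocks, L, A, alpha):
    """Sum of vintage outputs (2) under the wage-equalizing labor allocation."""
    labor = wage_equalizing_labor(stocks, L, A, alpha)
    return sum(vintage_output(K_j, L_j, j, A, alpha)
               for j, (K_j, L_j) in enumerate(zip(stocks, labor)))
```

With, say, three vintages `stocks = [3.0, 2.0, 0.5]`, `L = 10.0`, `A = 1.2` and `alpha = 0.3`, the value of `aggregate_output(...)` should coincide with `capital_index(...) ** alpha * L ** (1 - alpha)` up to rounding.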

This index can be considered to be a quality-adjusted measure of the aggregate capital stock, where $B^j$ captures the quality of capital of vintage $j$. The value marginal productivity of a vintage $j$ is then given by
\[
w_j^K = p_c\,\frac{\partial Y}{\partial K}\,B^j, \tag{9}
\]
where $p_c$ is the price of the consumption good. The evolution of this aggregate capital index $K$ follows from (5) and (6). Given that the price of an investment good does not depend on where this investment good is used, that depreciation is the same for all investment goods and given that value marginal productivities (9) are highest for the most advanced vintage, investment takes place only in the currently most advanced vintage $q$,
\[
I_j = \begin{cases} 0 & \forall j < q,\\ I & j = q. \end{cases}
\]
Hence,
\[
\begin{aligned}
dK &= \big(-\delta K_0 - B\delta K_1 - \ldots - B^{q-1}\delta K_{q-1} + B^q\,[I_q - \delta K_q]\big)\,dt + B^{q+1}\kappa_{q+1}\,dq\\
&= (B^q I - \delta K)\,dt + B^{q+1}\kappa_{q+1}\,dq.
\end{aligned} \tag{10}
\]
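As a consistency check on (10), one can update the vintage stocks individually by (5) and (6) and compare the resulting index (8) with a direct update of $K$. The sketch uses a single small time step and treats investment, the depreciation rate and the prototype size as given numbers; all names are assumptions made for this illustration.

```python
def capital_index(stocks, B):
    """K = sum_j B^j K_j, equation (8); stocks[j] is the stock of vintage j."""
    return sum(B ** j * K_j for j, K_j in enumerate(stocks))

def step_vintages(stocks, investment, depreciation, dt, success, prototype):
    """Advance vintage stocks by (5)-(6) over a short interval dt.

    All investment goes to the newest vintage; if `success` is True a new
    vintage of size `prototype` is appended (dq = 1).
    """
    q = len(stocks) - 1
    updated = [K_j + ((investment if j == q else 0.0) - depreciation * K_j) * dt
               for j, K_j in enumerate(stocks)]
    if success:
        updated.append(prototype)
    return updated

def step_index(K, q, B, investment, depreciation, dt, success, prototype):
    """Advance the index directly by (10)."""
    K_next = K + (B ** q * investment - depreciation * K) * dt
    if success:
        K_next += B ** (q + 1) * prototype
    return K_next
```

Both routes give the same index because (5) is linear in the stocks: depreciation scales every term of (8) alike, while investment and the prototype enter with the weights $B^q$ and $B^{q+1}$.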

Concerning prices in this economy, technologies presented above imply
\[
p_Y = p_c = p_I = p_R. \tag{11}
\]
Good $Y$ will be chosen as numeraire. Prices $p_Y$, $p_c$, $p_I$, $p_R$ will therefore be constant throughout the paper; we will nevertheless use them at various places (and not normalize to unity) as this makes some derivations more transparent. As long as investment is positive, the price $v_q$ of an installed unit of the most recent vintage of capital equals the price of an investment good, $v_q = p_I$. As different vintages are perfect substitutes in production (8), prices of different vintages are linked to each other by
\[
p_I = v_q = B^{q-j} v_j, \qquad \forall j = 0 \ldots q. \tag{12}
\]
Further, the price $p_K$ of one efficiency unit of capital (which corresponds to one unit of capital of vintage 0) is a decreasing function of the most advanced vintage $q$,
\[
p_K = B^{-q} p_I. \tag{13}
\]
This also reflects the term $B^q$ in the capital accumulation equation (10). The pricing relationship (12) reveals a creative destruction mechanism in the model, despite the absence of aggressive competition between firms (as e.g., in the original Aghion and Howitt model where the intermediate firm is always a monopolist). When a new vintage is found, i.e., when $q$ increases by one, the price of older vintages relative to the consumption good falls, as by (12) and (11) $v_j/p_c = B^{-(q-j)}$. Capital owners therefore experience a certain reduction in their real wealth.

13.2.2 Households

There is a discrete finite number of households in this economy. Each household is sufficiently small to neglect the effects of own behavior on aggregate variables. Households maximize expected utility $U(t)$ given by the sum of instantaneous utility $u(.)$ resulting from consumption flows $c(\tau)$, discounted at the time preference rate $\rho$,
\[
U(t) = E\int_t^{\infty} e^{-\rho[\tau - t]}\,u(c(\tau))\,d\tau, \tag{14}
\]
where the instantaneous utility function $u(.)$ is characterized by constant relative risk aversion,
\[
u(c(\tau)) = \frac{c(\tau)^{1-\sigma} - 1}{1-\sigma}. \tag{15}
\]

For saving purposes, households can buy capital and finance R&D. When they buy capital, their real wealth a increases in a deterministic and continuous way. This increase depends on the difference between capital plus labor income ra + w and the sum of expenditure i for R&D and expenditure $p_c c$ for consumption. When financing R&D, i.e., when i is positive, successful research changes their wealth in a discrete way. A household receives a share of the value of the successful research project equal to the share it has contributed to financing this project. When total investment into research is given by J, the household receives the share i/J.⁶ The value of the successful research project depends on the price $v_{q+1}$ of the capital good and the "size" $\kappa_{q+1}$ of the prototype. In summary, the budget constraint (16) is a stochastic differential equation, where the deterministic part (.)dt stems from buying capital and the stochastic part (.)dq captures the effects of financing R&D. As in (6), when R&D is successful, the increment dq of the Poisson process q underlying R&D equals unity; otherwise, dq = 0. A negative effect of successful research stems from the devaluation of capital, as discussed in relation to the pricing equation (12). As the relative price (13) of an efficiency unit of capital falls when a new vintage is discovered, households experience a loss in the value of their assets relative to the consumption good price. The share of assets that is "lost" due to this devaluation is denoted by s. Hence,

\[ da = \left[ ra + \frac{w - i}{p_Y} - c \right] dt + \left[ \kappa_{q+1} \frac{i}{J} - sa \right] dq, \tag{16} \]

where

\[ s = \frac{B-1}{B} \tag{17} \]

and the interest rate is given by

\[ r = B^q \frac{\partial Y}{\partial K} - \delta. \tag{18} \]
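The two parts of the budget constraint (16) can be written as two small functions. This is a minimal sketch; the function names and the numerical values used to try it out are illustrative assumptions and do not come from the paper.

```python
def wealth_drift(a: float, r: float, w: float, i: float, c: float,
                 p_Y: float = 1.0) -> float:
    """Deterministic part of (16): change in real wealth per unit of time."""
    return r * a + (w - i) / p_Y - c


def wealth_jump(a: float, i: float, J: float, kappa_next: float,
                B: float) -> float:
    """Stochastic part of (16): change in real wealth when dq = 1."""
    s = (B - 1.0) / B  # share of assets lost through devaluation, eq. (17)
    return kappa_next * i / J - s * a
```

For example, with the hypothetical values a = 10, r = 0.05, w = 1, c = 1.2 and p_Y = 1, setting i = 0.3 (all net savings) gives a drift of zero, while a successful project with J = 3, κ = 2 and B = 1.25 changes wealth by 2 · 0.1 − 0.2 · 10 = −1.8.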

This budget constraint is formally derived in Appendix B.

13.3 Solving the model

This section shows that the economy can be analyzed almost as easily as a standard textbook growth model. All optimality and equilibrium conditions will be expressed in terms of aggregate consumption C and the capital stock K. The behavior of the economy is summarized in the next section as an almost standard phase diagram.

6 This sharing rule introduces an externality in this economy. Individuals tend to invest too much, as shown (in a different setup) in Wälde (1999a).

13.3.1 Investment decisions of households

Households maximize utility (14) subject to the budget constraint (16) by choosing investment i into R&D and the consumption level c. Optimal investment follows a bang-bang investment rule saying that either no savings are used for R&D at all or all savings are used for R&D. Formally (cf. Appendix C or Wälde, 1999b),

\[ \frac{i}{p_Y} = \begin{Bmatrix} 0 \\ ra + w/p_Y - c \end{Bmatrix} \iff r - \rho - \lambda \left[ 1 - (1-s)\Omega \right] \begin{Bmatrix} > \\ = \end{Bmatrix} 0, \tag{19} \]

where

\[ \Omega = \frac{u'(\tilde{c})}{u'(c)} \tag{20} \]

is the ratio of marginal utility of consumption under the new technology to marginal utility of consumption under the current technology. In this paper, a tilde (~) denotes the value of a variable immediately after successful R&D. This rule says that R&D is not financed (i = 0) when the expression on the right-hand side of the equivalence is positive, i.e., when returns r to capital accumulation are sufficiently high. With low returns such that this expression is zero (as shown in Wälde (1999b) and, as we will see, it cannot be negative in equilibrium), all savings net of capital depreciation will be used for financing R&D.

This bang-bang result might be surprising but it is extremely useful for keeping the model tractable. It is the consequence of three sufficient (not necessarily necessary) conditions: (i) there is a representative consumer, so distributional aspects are not taken into consideration here, (ii) the R&D sector operates under constant returns to scale, a standard assumption that allows us to model perfect competition, and (iii) the result $\kappa_{q+1}$ of a successful research project is independent of the amount of resources allocated to R&D. More or less investment into R&D has only an impact on the probability of success, not on its outcome $\kappa_{q+1}$.⁷

7 This bang-bang property can also be found in central planner solutions of economies of this type (Wälde, 2001). A technical condition is the continuous-time setup. Discrete-time models would have an interior solution (Wälde, 1998, ch. 8).

It is important to note at this point that allocating all savings to R&D implies that wealth of households remains constant (as long as R&D is not successful). This directly follows from inserting $i/p_Y = ra + w/p_Y - c$ into the budget constraint (16) of households. When wealth of households is constant, aggregate wealth, i.e., the capital stock, needs to be constant as well. Hence, when all savings are allocated to R&D, there is still some investment in new equipment such that depreciation is just balanced. Looking at the expression for the interest rate (18) shows that this is no contradiction to the allocation of all savings to R&D. Savings are net savings, i.e., gross savings

\[ B^q \frac{\partial Y}{\partial K} a + w/p_Y - c \]

minus δa, losses due to depreciation. Hence, gross savings are used for keeping wealth (and thereby the capital stock) constant and for financing R&D,

\[ B^q \frac{\partial Y}{\partial K} a + w/p_Y - c = \delta a + i/p_Y. \]
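The bang-bang rule (19) can be sketched as a function that returns real R&D investment i/p_Y. The sketch treats the arrival rate λ and the marginal utility ratio Ω as given numbers, although both are determined in equilibrium; the function name, the tolerance and all parameter values are illustrative assumptions.

```python
def rd_investment(r: float, rho: float, lam: float, s: float, omega: float,
                  a: float, w: float, c: float, p_Y: float = 1.0,
                  tol: float = 1e-12) -> float:
    """Real R&D investment i/p_Y implied by the investment rule (19)."""
    gap = r - rho - lam * (1.0 - (1.0 - s) * omega)
    if gap > tol:
        # returns to capital are sufficiently high: no R&D is financed
        return 0.0
    if abs(gap) <= tol:
        # indifference: all savings net of depreciation go to R&D
        return r * a + w / p_Y - c
    # the text argues that this case does not occur in equilibrium
    raise ValueError("r - rho - lam*[1 - (1 - s)*omega] < 0 is ruled out")
```

In the second case the returned value equals net savings, so the drift of wealth in (16) is zero, which is the constancy of wealth noted in the text.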

From an intuitive point of view, this rule can most easily be understood by looking at the Keynes-Ramsey rule that would hold in an economy where households allocate savings both to R&D and capital accumulation, i.e., where an interior solution for investments into R&D exists. It reads (cf. Appendix D)

\[ -\frac{du'(c)}{u'(c)} = \left[ r - \rho - \lambda \left[ 1 - [1-s]\Omega \right] \right] dt + [1 - \Omega] \, dq. \tag{21} \]

The deterministic part of this rule is identical to the investment rule. It says that consumption grows as long as the interest rate r is sufficiently high. When the interest rate is too low, no further accumulation of assets takes place. This is a well-known relationship from standard growth models, and it helps to understand the above investment rule for the case where no interior solution for investment into R&D exists. As long as the interest rate is sufficiently high, only capital accumulation takes place and consumption rises. When the interest rate has fallen to $\rho + \lambda[1 - [1-s]\Omega]$, no further assets are accumulated and consumption is constant. Hence all savings go to financing R&D.

13.3.2 The regimes of the economy

We now exploit the implications of the investment rule. When no R&D is undertaken, the economy finds itself in a period of deterministic changes, the deterministic regime. When R&D is undertaken, the economy finds itself in a stochastic regime.

Deterministic regime

When all savings are allocated to capital accumulation, no research takes place and no uncertainty is present in the economy. In those "deterministic times", consumption follows the standard Keynes-Ramsey rule,

\[ -\frac{u''(C)}{u'(C)} \dot{C} = r - \rho, \tag{22} \]

where C is aggregate consumption. Capital accumulation is then also deterministic and reads from (10) and (3)

\[ \dot{K} = B^q (Y - C) - \delta K. \tag{23} \]

Stochastic regime

By contrast, when the interest rate is sufficiently low such that the investment rule (19) advises to allocate all savings to R&D, the economy finds itself on what could be called the R&D line. This line follows from the investment rule (19) and reads (cf. Appendix E)

\[ B^q \frac{\partial Y}{\partial K} - \delta - \rho = \frac{Y - B^{-q}\delta K - C}{D(q)} \left[ 1 - \frac{D(q)}{B\kappa_{q+1}} \right]. \tag{24} \]

This line gives combinations of the aggregate capital stock and consumption where the economy is in the stochastic regime. As follows from the discussion of (19), individual consumption, individual wealth and therefore aggregate consumption and the aggregate capital stock are constant on this line. The economy is therefore in a transitory stationary equilibrium. At some point, however, a new technology will be found and individuals adjust their saving plans. The associated jump in consumption is given by (cf. Appendix E)

\[ u'(C) = \frac{\kappa_{q+1}}{D(q)} u'(\tilde{C}). \tag{25} \]

13.4 Cycles and growth

This section shows how a phase diagram can be used to illustrate the equilibrium path of the economy. It also presents selected properties of time paths as predicted by the model.

13.4.1 The equilibrium path

Studying a standard phase diagram would be cumbersome, as the phase diagram "grows" with each vintage. More formally, zero-motion lines are a function of the most advanced vintage q and they shift outward when q rises. We will therefore present a phase diagram where variables have been transformed according to

\[ K = \hat{K} A^{q/\alpha}, \qquad C = \hat{C} A^{q}. \tag{26} \]

With these new productivity-adjusted variables $\hat{K}$ and $\hat{C}$, zero-motion lines are vintage-independent.

Zero-motion lines

The phase diagram consists of zero-motion lines and, in addition to standard phase diagrams, of an R&D line. Productivity-adjusted capital and consumption follow (cf. Appendix F)

\[ \frac{d}{dt}\hat{K} = \hat{Y} - \hat{C} - \delta\hat{K}, \tag{27} \]

\[ \frac{d}{dt}\hat{C} = \frac{\hat{r} - \rho}{\sigma} \hat{C}. \tag{28} \]

The zero-motion lines for consumption and capital are then

\[ \hat{C} = \hat{Y} - \delta\hat{K}, \qquad \hat{r} = \rho, \]

where

\[ \hat{Y} = \hat{K}^{\alpha} L^{1-\alpha}, \qquad \hat{r} = \alpha \left( L/\hat{K} \right)^{1-\alpha} - \delta. \tag{29} \]
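The laws of motion (27) and (28) and the steady state at which both zero-motion lines intersect can be sketched as follows. The closed-form steady state follows from setting r̂ = ρ in (29); the explicit Euler step, the function names and all parameter values are illustrative assumptions rather than part of the paper.

```python
def y_hat(k: float, L: float, alpha: float) -> float:
    """Productivity-adjusted output from (29)."""
    return k ** alpha * L ** (1.0 - alpha)


def r_hat(k: float, L: float, alpha: float, delta: float) -> float:
    """Productivity-adjusted interest rate from (29)."""
    return alpha * (L / k) ** (1.0 - alpha) - delta


def steady_state(L: float, alpha: float, delta: float, rho: float):
    """Intersection of the zero-motion lines: r_hat = rho and C = Y - delta*K."""
    k = L * (alpha / (rho + delta)) ** (1.0 / (1.0 - alpha))
    c = y_hat(k, L, alpha) - delta * k
    return k, c


def euler_step(k: float, c: float, dt: float, L: float, alpha: float,
               delta: float, rho: float, sigma: float):
    """One explicit Euler step of the laws of motion (27) and (28)."""
    dk = y_hat(k, L, alpha) - c - delta * k
    dc = (r_hat(k, L, alpha, delta) - rho) / sigma * c
    return k + dt * dk, c + dt * dc
```

Starting an Euler step at the steady state leaves both variables unchanged up to rounding, which is a quick consistency check on the two zero-motion lines.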

The R&D line and jumps in consumption and capital

Transforming the R&D line (24) yields (cf. Appendix F)

\[ \hat{C} = \hat{Y} - \delta\hat{K} - \phi \, \frac{\hat{r} - \rho}{1 - \dfrac{\phi}{A^{1/\alpha}\hat{\kappa}_0}}. \tag{30} \]

For this transformation, we assumed

\[ D(q) = \phi A^{q} \qquad \text{and} \qquad \kappa_{q+1} = A^{q+1}\hat{\kappa}_0, \tag{31} \]

where φ is a positive parameter that reflects relative productivity of investment vs. R&D. This derivation also assumed

\[ 1 - \frac{D(q)}{B\kappa_{q+1}} = 1 - \frac{\phi}{A^{1/\alpha}\hat{\kappa}_0} > 0. \tag{32} \]

Both the parameter φ and this parameter restriction will be discussed when drawing the phase diagram. The first assumption in (31) is by now standard in models of economic growth. It implies by the resource constraint (3) of the economy that more resources are required to increase labor productivity with a "probability" λ from q to q + 1 than with the same λ from q − 1 to q. When q machines, each providing higher productivity than the previous one, have already been developed, it is harder to find new and more productive ones.⁸ Without this assumption, the economy would be characterized by the scale effect: the larger the economy, the faster it grows (which is empirically disputed, cf., e.g., Jones, 1995 or Backus, Kehoe and Kehoe, 1992). This scale effect has been removed in many ways (Segerstrom, 1998; Young, 1998; Howitt, 1999), of which the approach chosen here (close to Segerstrom, 1998) appears to be the simplest one from a modelling perspective. The second assumption implies that the size of new machines is such that total factor productivity of this new machine (compare the technology (1)) is A times higher than total factor productivity with the previous vintage. Both assumptions together yield a productivity-adjusted R&D line (30) that is an invariant line in the productivity-adjusted phase diagram.

The consumption jump condition (25) reads with (15) and (31)

\[ C^{-\sigma} = \frac{A\hat{\kappa}_0}{\phi} \tilde{C}^{-\sigma} \iff \tilde{C} = \left( \frac{A\hat{\kappa}_0}{\phi} \right)^{1/\sigma} C \]

for actual consumption and

\[ \hat{\tilde{C}} = A^{-1} \left( \frac{A\hat{\kappa}_0}{\phi} \right)^{1/\sigma} \hat{C} \tag{33} \]

for productivity-adjusted consumption. The capital stock increases due to successful research according to (10) by $\tilde{K} - K = B^{q+1}\kappa_{q+1}$. Productivity-adjusted capital changes following (26) and (31) are then given by

\[ \hat{\tilde{K}} = \hat{K}/A^{1/\alpha} + \hat{\kappa}_0. \tag{34} \]

8 Segerstrom (2001) provides convincing data on R&D expenditures by Intel that support this (and his) view.
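The jump conditions (33) and (34) map a point on the R&D line to the starting point of the next cycle. A minimal sketch follows, with hypothetical function names and illustrative parameter values; it also reports the ratio of actual consumption after and before the jump, which can be above or below one.

```python
def jump(k_hat: float, c_hat: float, A: float, alpha: float, sigma: float,
         phi: float, kappa0: float):
    """Productivity-adjusted capital and consumption right after successful R&D."""
    k_new = k_hat / A ** (1.0 / alpha) + kappa0                    # eq. (34)
    c_new = (A * kappa0 / phi) ** (1.0 / sigma) * c_hat / A        # eq. (33)
    return k_new, c_new


def actual_consumption_ratio(A: float, sigma: float, phi: float,
                             kappa0: float) -> float:
    """Ratio of actual consumption after the jump to consumption before it."""
    return (A * kappa0 / phi) ** (1.0 / sigma)
```

With A = 1.5, α = 0.3, κ̂₀ = 1 and φ = 2, restriction (32) holds while Aκ̂₀/φ = 0.75, so actual consumption falls on impact; with φ = 1 it rises.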

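[Figure 1: phase diagram with productivity-adjusted capital $\hat{K}$ on the horizontal axis and productivity-adjusted consumption $\hat{C}$ on the vertical axis, showing the zero-motion lines $d\hat{C}/dt = 0$ and $d\hat{K}/dt = 0$, the R&D line, and the equilibrium path EP − EP starting at $(\hat{K}_0, \hat{C}_0)$.]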

Figure 1: Long-run growth and short-run cycles.

Equilibrium

Let us now plot the phase diagram that will help to understand what an equilibrium in this economy is. Figure 1 plots $\hat{K}$ on the horizontal and $\hat{C}$ on the vertical axis. Zero-motion lines have the usual shape and laws of motion indicated by arrows are identical to standard Ramsey growth models as well. The R&D line is upward sloping and crosses the steady state. The slope of the R&D line crucially depends on φ, the parameter that captures relative productivity of the R&D sector vs. the investment good sector. A high φ means high productivity in the investment goods sector relative to the research sector (compare (31) and (4) with the resource constraint (3)). The higher φ, the further the R&D line (30) moves to the right. Ceteris paribus, this means longer capital accumulation before R&D starts.

The R&D line lies below the zero-motion line for capital because of the parameter restriction (32). If $1 - \phi/(A^{1/\alpha}\hat{\kappa}_0) = 0$, the R&D line would coincide with the zero-motion line for consumption. This can most easily be seen from the expression for the R&D line in (24). If $1 - \phi/(A^{1/\alpha}\hat{\kappa}_0) < 0$, the R&D line would lie above the zero-motion line for capital.⁹

9 In the present paper, we restrict attention to the case in (32). When the R&D line coincides with the zero-motion line for consumption, no resources are left for R&D when the R&D line is hit, and new technologies would never be discovered. The implications of an R&D line lying above the zero-motion line for capital are still to be worked out.

Equations (27), (28), (30), (33) and (34) jointly determine the evolution of productivity-adjusted capital and consumption in this economy. An equilibrium is a path EP − EP as drawn in the phase diagram, starting at a point $(\hat{K}_0, \hat{C}_0)$, following laws of motion (27) and (28), ending on the R&D line (30), jumping according to (33) and (34) to $(\hat{\tilde{K}}, \hat{\tilde{C}})$, and ending up, after having followed again laws of motion (27) and (28), at $(\hat{K}_0, \hat{C}_0)$.¹⁰

10 In equilibrium, the interest rate is always larger than or equal to $\rho + \lambda[1 - (1-s)\Omega]$, as argued in (19). The interest rate would be smaller than this expression only if the economy were below the R&D line.

13.4.2 Properties of the equilibrium path

This section will present properties of the equilibrium path of this economy. It studies both the long-run and the short-run predictions of the model. While it would be extremely interesting to calibrate this model and derive quantitative predictions, this is left for future work.¹¹

11 As the objective of a theoretical model is to present an argument as easily as possible, certain predictions, especially on cyclical and counter-cyclical behavior, are extreme. In work in progress (in a discrete-time version of the present model), the author shows that these extreme predictions can be weakened, which makes the discrete-time version more suitable for calibration.

Short-run fluctuations

This economy is characterized by long-run growth with short-run fluctuations. The evolution of the economy can nicely be summarized using the above phase diagram. The subsequent discussion refers to actual quantities (like K and C rather than productivity-adjusted variables $\hat{K}$ and $\hat{C}$), assuming the economy is in equilibrium.

Let the economy start with a capital stock $K_0$ and let it choose a consumption level such that it is on the equilibrium path EP − EP. As returns to capital are sufficiently high, no one wants to finance research for new technologies. The economy therefore accumulates more capital of the currently most advanced vintage and approaches the R&D line. Consumption rises and returns to capital fall.

After some finite length of time, the economy hits the R&D line (at the upper EP). Investors realize that capital rewards have fallen so much that they are now indifferent between accumulating capital and financing research for a new technology. Resources that were used an instant before for producing new capital equipment are now used for searching for a better type of capital. As long as research is not successful, the economy remains on the R&D line at this point EP. Some new capital goods continue to be produced, just to compensate depreciation. Hence, the aggregate capital stock is constant.

Once a new technology is found, the economy is hit by an endogenous technology shock. Its capital stock increases in a discrete way by the size of the new machine $\kappa_{q+1}$, as shown in (10), and consumption jumps according to the consumption jump condition (25). The capital stock K unambiguously increases, consumption might rise or fall. After these discrete changes in aggregate capital and consumption, the economy starts accumulating capital again in a smooth way. It now accumulates capital of the new vintage. With this new vintage, the consumption level is on average A times higher than one vintage before. This increase in labor productivity implies positive long-run growth. Moving up the equilibrium path towards the R&D line implies non-constant growth rates.

Short-run properties of this model are presented in the following figures. Aggregate consumption plotted in the upper figure rises over time until the economy hits the R&D line at $t_{R\&D}$. From then on, research is undertaken and consumption is constant. At some point in time $t^*$, research is successful. The length between $t_{R\&D}$ and $t^*$ is indeterminate, while the expected length is given by $\lambda^{-1}$. Consumption rises or falls after successful research. Inserting assumptions (31) into (25) yields

\[ u'(C) = \frac{A\hat{\kappa}_0}{\phi} u'(\tilde{C}). \tag{35} \]

This implies

\[ \tilde{C} \gtrless C \iff \frac{A\hat{\kappa}_0}{\phi} \gtrless 1. \tag{36} \]

By the assumption (32) we made in deriving the R&D line, $A^{1/\alpha}\hat{\kappa}_0/\phi > 1$. The ratio $A\hat{\kappa}_0/\phi$ can nevertheless be larger or smaller than unity, given that A is larger than unity by definition and 0 < α < 1. The increase in consumption from a given point of one cycle to the same point of the next cycle is known. Denoting the consumption level on the R&D line by $C^*$, consumption on the R&D line in the next cycle is A times higher at $AC^*$. This immediately follows from the transformation (26) and the fact that, in equilibrium, productivity-corrected consumption $\hat{C}$ is at the same level independently of the currently most advanced vintage q.

[Figure 2, upper panel: consumption C against time t, with the levels $C^*$ and $AC^*$ and the dates $t_{R\&D}$ and $t^*$ marked. Lower panel: investment I and R&D expenditure R against time t, with the same dates marked.]

Figure 2: Time series of consumption, investment and R&D expenditure.

Output and capital follow qualitatively identical paths to consumption. In contrast to consumption, output and capital definitely increase after successful R&D. Output also increases from one cycle to the next one by the factor A. The physical capital stock increases by $A^{1/\alpha}$, which also follows from (26) and vintage-independent $\hat{K}$. Measuring the capital stock in terms of the consumption good, however, shows that it increases by A as well (cf. next section on long-run growth). The growth rate of output relative to capital is given by the standard expression $\dot{Y}/Y = \alpha\dot{K}/K$. The growth rate of output relative to consumption depends on whether consumption drops or rises after successful R&D. When it drops, consumption grows faster than output (as at the end of a cycle, both have increased by the same factor A). If consumption rises more than output, output grows faster than consumption.

Investment decreases over time, as does the interest rate, while resources R are allocated to R&D only at the end of a cycle. This is shown in the lower part of the figure. Resources R allocated to R&D in the stochastic regime are lower than resources used for investment I an instant before R&D starts: aggregate output Y does not jump when the economy hits the R&D line, simply because the capital stock does not jump at this point. As consumption remains constant as well, the amount of resources for investment and R&D in the stochastic regime is just as high as an instant before the economy hits the R&D line. This follows from the resource constraint (3). As investment equals depreciation, not all resources that were used for investment go into R&D. Both quantities increase by the factor A from one cycle to the other. The prediction about the timing of R&D is extreme and will empirically probably not hold. The more general prediction of the model is that R&D investment is larger when returns to capital accumulation are low.¹²

Long-run growth

The model satisfies all of Kaldor's stylized facts (cf., e.g., Barro and Sala-i-Martin, 1995) which are relevant for the present model.

(i) Per capita output grows at a constant rate: Output at some fixed point of a cycle q can be written with (26) as $Y = A^{q}\hat{K}^{\alpha}L^{1-\alpha}$, where the aggregate production function (7) and the transformation (26) were used. As the labor force L is constant and productivity-adjusted capital $\hat{K}$ is the same at some fixed point (take, e.g., the capital stock on the R&D line) of any cycle, output per capita increases by A from one cycle to the other.
(ii) Physical capital per worker grows over time: This directly follows from $K/L = A^{q/\alpha}\hat{K}/L$, which uses the argument just made.

(iii) The rate of return to capital is nearly constant: The interest rate was computed for the R&D line in (29). As it is a function of the productivity-adjusted capital stock $\hat{K}$ only, it does not display any long-run trend.

(iv) The ratio of physical capital to output is nearly constant: This stylized fact is the least obvious to see in the present model. Physical capital is measured as the value of all capital in an economy, deflated in an appropriate way. Here, the value of capital is given by its price $p_K$ per efficiency unit times the measure of the aggregate stock K. Using (8) and (13) yields $p_K K = B^{-q} p_I K$. Dividing by the value of output, $p_c Y$, yields

\[ \frac{B^{-q} p_I K}{p_c Y} = \frac{B^{-q} p_I A^{q/\alpha}\hat{K}}{p_c A^{q}\hat{K}^{\alpha}L^{1-\alpha}} = \frac{\hat{K}}{\hat{K}^{\alpha}L^{1-\alpha}}, \]

where we used $B^{-q}A^{q/\alpha} = A^{q}$ and $p_I = p_c$ from (11). Hence, capital per output is constant.

(v) The shares of labor and physical capital in national income are nearly constant: This directly follows from a Cobb-Douglas production function.

12 In the discrete-time version of the model mentioned in a footnote above, R&D takes place all of the time. The discrete-time version is not as tractable as the version presented here, however.

13.5 Conclusions

The economy we have analyzed is characterized by short-run fluctuations and long-run growth. Both short-run fluctuations and long-run growth are caused by endogenous technology shocks. Technology shocks are endogenous, i.e., the point in time when a shock occurs depends on decisions made by agents in this economy, as the economy offers two saving technologies. Households accumulate capital when returns to capital accumulation are sufficiently high. Capital accumulation implies decreasing capital returns and, at some point, households put their savings into R&D activities when capital returns are low. When R&D is successful, a new technology is available, i.e., a technology shock occurs, and returns to capital accumulation are high again. This result follows from allowing households to be risk-averse. While capital accumulation and uncertain R&D have been studied in the literature, these results were so far not available, as risk-averse households were not taken into consideration. The present paper has shown that including this feature considerably broadens the range of phenomena to which models of creative destruction and long-run growth can be applied.

Acknowledgements: I thank Bettina Büttner, seminar participants at CES Munich, Louvain-la-Neuve, the University of Amsterdam and the Federal Reserve Bank of Minnesota for useful discussions and comments and especially Pat Kehoe, Tim Kehoe and Paul Segerstrom for helpful suggestions and stimulating discussions. Olaf Posch and Benjamin Weigert provided excellent research assistance.

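Appendix A: A vintage capital structure

Vintage-specific technologies are given by

\[ Y_0 = K_0^{\alpha} (A^{0} L_0)^{1-\alpha}, \quad Y_1 = K_1^{\alpha} (A^{1} L_1)^{1-\alpha}, \quad \ldots, \quad Y_q = K_q^{\alpha} (A^{q} L_q)^{1-\alpha}. \]

Labour mobility implies equality of wages for all vintages j, $w_j = w_0 \; \forall j$. The wage rate of vintage j is given by

\[ w_j = p_c (1-\alpha) \left( \frac{K_j}{A^{j} L_j} \right)^{\alpha} A^{j}. \]

The wage rate of vintage 0 is

\[ w_0 = p_c (1-\alpha) \left( \frac{K_0}{A^{0} L_0} \right)^{\alpha} A^{0}. \]

Equality of wages for vintages 0 and 1 implies labor allocation to vintage 1 relative to vintage 0 of

\[ w_0 = w_1 \iff \left( \frac{K_0}{A^{0} L_0} \right)^{\alpha} A^{0} = \left( \frac{K_1}{A^{1} L_1} \right)^{\alpha} A^{1} \iff \frac{K_0}{A^{0} L_0} = A^{\frac{1}{\alpha}} \frac{K_1}{A^{1} L_1} \iff L_1 = A^{\frac{1}{\alpha}} \frac{A^{0}}{A^{1}} \frac{K_1}{K_0} L_0 = A^{\frac{1-\alpha}{\alpha}} \frac{K_1}{K_0} L_0. \tag{37} \]

Undertaking the same steps for vintage j yields

\[ w_0 = w_j \iff \left( \frac{K_0}{A^{0} L_0} \right)^{\alpha} A^{0} = \left( \frac{K_j}{A^{j} L_j} \right)^{\alpha} A^{j} \iff \frac{K_0}{A^{0} L_0} = A^{\frac{j}{\alpha}} \frac{K_j}{A^{j} L_j} \iff L_j = A^{\frac{j}{\alpha}} \frac{A^{0}}{A^{j}} \frac{K_j}{K_0} L_0 = A^{j\frac{1-\alpha}{\alpha}} \frac{K_j}{K_0} L_0. \tag{38} \]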

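Inserting into the labor market clearing condition $\sum_{j=0}^{q} L_j = L$ yields

\[ L_0 + A^{\frac{1-\alpha}{\alpha}} \frac{K_1}{K_0} L_0 + \ldots + A^{q\frac{1-\alpha}{\alpha}} \frac{K_q}{K_0} L_0 = L \iff \frac{L_0}{K_0} \left( K_0 + A^{\frac{1-\alpha}{\alpha}} K_1 + \ldots + A^{q\frac{1-\alpha}{\alpha}} K_q \right) = L \iff L_0 = \frac{K_0}{K} L, \tag{39} \]

where

\[ K = K_0 + A^{\frac{1-\alpha}{\alpha}} K_1 + \ldots + A^{q\frac{1-\alpha}{\alpha}} K_q \equiv K_0 + B K_1 + \ldots + B^{q} K_q. \]

Inserting (39) in (37) gives labor allocation to vintage 1,

\[ L_1 = \frac{A^{\frac{1-\alpha}{\alpha}} K_1}{K} L, \]

and inserting (39) in (38) gives labor allocation to vintage j,

\[ L_j = \frac{A^{j\frac{1-\alpha}{\alpha}} K_j}{K} L. \]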

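Now aggregate over outputs. Output of vintage 0 is

\[ Y_0 = K_0^{\alpha} (A^{0} L_0)^{1-\alpha} = K_0^{\alpha} \left( \frac{K_0 L}{K} \right)^{1-\alpha} = K_0 \left( \frac{L}{K} \right)^{1-\alpha}, \]

where we used (39). Outputs of the other vintages are

\[ Y_1 = K_1^{\alpha} (A^{1} L_1)^{1-\alpha} = K_1^{\alpha} \left( \frac{A^{\frac{1}{\alpha}} K_1 L}{K} \right)^{1-\alpha} = K_1 \left( A^{\frac{1}{\alpha}} \frac{L}{K} \right)^{1-\alpha}, \]

\[ \vdots \]

\[ Y_j = K_j^{\alpha} (A^{j} L_j)^{1-\alpha} = K_j^{\alpha} \left( \frac{A^{\frac{j}{\alpha}} K_j L}{K} \right)^{1-\alpha} = K_j \left( A^{\frac{j}{\alpha}} \frac{L}{K} \right)^{1-\alpha}. \]

Total output is then given by

\[ Y = Y_0 + Y_1 + \ldots + Y_q = \left( K_0 + K_1 A^{\frac{1-\alpha}{\alpha}} + \ldots + K_q A^{q\frac{1-\alpha}{\alpha}} \right) \left( \frac{L}{K} \right)^{1-\alpha} = K^{\alpha} L^{1-\alpha}. \]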
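The aggregation result of this appendix can be checked numerically. The sketch below allocates labor across vintages as derived above and compares the sum of vintage outputs with $K^{\alpha}L^{1-\alpha}$; the function names and the vintage stocks in the usage note are illustrative assumptions.

```python
def allocate_labor(K_vintages, L, A, alpha):
    """Efficiency-unit capital stock K and labor L_j per vintage, as in (38) and (39)."""
    B = A ** ((1.0 - alpha) / alpha)
    K_eff = sum(B ** j * K_j for j, K_j in enumerate(K_vintages))
    labor = [B ** j * K_j / K_eff * L for j, K_j in enumerate(K_vintages)]
    return K_eff, labor


def total_output(K_vintages, labor, A, alpha):
    """Sum of vintage outputs Y_j = K_j**alpha * (A**j * L_j)**(1 - alpha)."""
    return sum(
        K_j ** alpha * (A ** j * L_j) ** (1.0 - alpha)
        for j, (K_j, L_j) in enumerate(zip(K_vintages, labor))
    )


def wage_index(K_j, L_j, j, A, alpha):
    """Wage of vintage j up to the common factor p_c * (1 - alpha)."""
    return (K_j / (A ** j * L_j)) ** alpha * A ** j
```

For hypothetical stocks K = (4, 2, 1) with L = 10, A = 1.2 and α = 0.3, the labor shares sum to L, the wage index is the same for every vintage, and total output coincides with the aggregate Cobb-Douglas expression.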

Appendix B: The budget constraint

Real wealth a of households is given by the sum of the number $k_j$ of units of capital of vintage j held by the household times their real price $v_j/p_Y$,

\[ a = \sum_{j=0}^{q+1} k_j \frac{v_j}{p_Y}. \tag{40} \]

For reasons that will become clear in a moment, the sum extends from 0 to q + 1, though the most advanced vintage is vintage q and households therefore cannot own any capital of vintage q + 1, $k_{q+1} = 0$. Households trade only capital goods of the most recent vintage. The allocation of older capital goods is fixed (in equilibrium, households would be indifferent about trading old capital goods). Capital held by households therefore follows for old vintages j

\[ dk_j = -\delta k_j \, dt, \qquad \forall j < q, \tag{41} \]

for the most recent one

\[ dk_q = \left[ \frac{\sum_{j=0}^{q+1} w_j^K k_j + w - i - p_c c}{v_q} - \delta k_q \right] dt, \tag{42} \]

and for the next vintage q + 1

\[ dk_{q+1} = \kappa_{q+1} \frac{i}{J} \, dq. \tag{43} \]

The capital stock $k_q$ in (42) of a household increases in a deterministic fashion when the difference between actual income and spending,

\[ \sum_{j=0}^{q+1} w_j^K k_j + w - i - p_c c, \]

divided by the price $v_q$ of an installed or the price $p_I$ of a new unit of capital exceeds losses $\delta k_q$ of capital due to depreciation. Capital income $\sum_{j=0}^{q+1} w_j^K k_j$ of households is given by factor rewards $w_j^K$ for capital (value marginal productivities) times the amount of capital $k_j$, summed up over all vintages.

Equation (43) shows that in the case of a successful R&D project, i.e., when dq = 1, the household obtains the share i/J, i.e., depending on its investment i relative to total investment J into the successful project, of total payoffs $\kappa_{q+1}$. A successful research project therefore increases the capital stock of vintage q + 1 held by the household from 0 to $\kappa_{q+1} i/J$. After that, equation (42) applies to vintage q + 1.

The price of a vintage j in terms of the numeraire good is given by (12) with (11). Hence, letting vintage prices evolve in all generality as

\[ d\frac{v_j}{p_Y} = \alpha_j \frac{v_j}{p_Y} \, dt + \beta_s \frac{v_j}{p_Y} \, dq, \tag{44} \]

we know that the deterministic change of the real price $v_j/p_Y$ must be zero, $\alpha_j = 0 \; \forall j = 0 \ldots q$. When research is successful, the price of a unit of a given vintage j in terms of the numeraire good drops as then, by (12) and (11), $\tilde{v}_j/\tilde{p}_Y = B^{j-(q+1)}$.¹³ Hence, as $d(v_j/p_Y) = \tilde{v}_j/\tilde{p}_Y - v_j/p_Y$, we have $d(v_j/p_Y) = B^{j-(q+1)} - B^{j-q}$. As a consequence and with (44),

\[ \beta_s = \frac{d(v_j/p_Y)}{v_j/p_Y} = \frac{B^{j-(q+1)} - B^{j-q}}{B^{j-q}} = \frac{1-B}{B} < 0, \]

which is identical for all vintages j ≤ q. Real vintage prices (44) therefore evolve according to

\[ d\frac{v_j}{p_Y} = -\frac{B-1}{B} \frac{v_j}{p_Y} \, dq \qquad \forall j < q. \tag{45} \]
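The claim that the proportional price drop is the same for every existing vintage can be verified with a few lines of code. The sketch assumes, as implied by (12) with (11), that the real price of vintage j is $B^{j-q}$ when q is the most advanced vintage; function names and the value of B are illustrative.

```python
def real_vintage_price(j: int, q: int, B: float) -> float:
    """Real price v_j / p_Y of vintage j when q is the frontier vintage."""
    return B ** (j - q)


def devaluation_rate(j: int, q: int, B: float) -> float:
    """Proportional change of the real price of vintage j when q rises to q + 1."""
    before = real_vintage_price(j, q, B)
    after = real_vintage_price(j, q + 1, B)
    return (after - before) / before
```

With B = 1.25, every vintage j ≤ q loses the share (B − 1)/B = 0.2 of its real value, independently of j and q.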

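This equation reflects the devaluation of old vintages relative to the numeraire good when a new vintage has been developed. This is the source of the creative destruction mechanism in the present model.

We can now derive the budget constraint by computing the differential

\[ da = \sum_{j=0}^{q+1} d\left( k_j \frac{v_j}{p_Y} \right). \]

For all vintages 0 ≤ j < q, we obtain with (41) and (45) and using Ito's Lemma

\[ d\left( \frac{v_j}{p_Y} k_j \right) = -\frac{v_j}{p_Y} \delta k_j \, dt + \left[ \left( \frac{v_j}{p_Y} - \frac{B-1}{B} \frac{v_j}{p_Y} \right) k_j - \frac{v_j}{p_Y} k_j \right] dq = -\delta \frac{v_j}{p_Y} k_j \, dt - \frac{B-1}{B} \frac{v_j}{p_Y} k_j \, dq \qquad \forall j = 0 \ldots q-1. \]

13 A tilde (~) denotes the value of a quantity immediately after successful research.

For the currently most advanced vintage q, we use (42) and (45) to obtain

\[ d\left( \frac{v_q}{p_Y} k_q \right) = \frac{v_q}{p_Y} \left[ \frac{\sum_{j=0}^{q+1} w_j^K k_j + w - i - p_c c}{v_q} - \delta k_q \right] dt + \left[ \left( \frac{v_q}{p_Y} - \frac{B-1}{B} \frac{v_q}{p_Y} \right) k_q - \frac{v_q}{p_Y} k_q \right] dq \]

\[ = \left[ \sum_{j=0}^{q+1} \frac{w_j^K}{p_Y} k_j + \frac{w-i}{p_Y} - c - \delta \frac{v_q}{p_Y} k_q \right] dt - \frac{B-1}{B} \frac{v_q}{p_Y} k_q \, dq. \]

For the next vintage q + 1 to come, from (43) and with a real price $\tilde{v}_{q+1}/\tilde{p}_Y$ for the prototype after successful R&D, i.e., only when the good $\kappa_{q+1}$ exists,

\[ d\left( \frac{v_{q+1}}{p_Y} k_{q+1} \right) = \frac{\tilde{v}_{q+1}}{\tilde{p}_Y} \kappa_{q+1} \frac{i}{J} \, dq = \kappa_{q+1} \frac{i}{J} \, dq. \tag{46} \]

The real price equals unity, $\tilde{v}_{q+1}/\tilde{p}_Y = 1$ from (12). Hence, $\kappa_{q+1}$ stands for the number of consumption goods that can be exchanged for the prototype. This is in accordance with the definition of real wealth in (40), which also is the number of consumption goods that can be exchanged for a. Summarizing, we obtain¹⁴

\[ da = \sum_{j=0}^{q+1} d\left( k_j \frac{v_j}{p_Y} \right) \]

\[ = \sum_{j=0}^{q-1} \left[ -\delta \frac{v_j}{p_Y} k_j \, dt - \frac{B-1}{B} \frac{v_j}{p_Y} k_j \, dq \right] + \left[ \sum_{j=0}^{q+1} \frac{w_j^K}{p_Y} k_j + \frac{w-i}{p_Y} - c - \delta \frac{v_q}{p_Y} k_q \right] dt - \frac{B-1}{B} \frac{v_q}{p_Y} k_q \, dq + \kappa_{q+1} \frac{i}{J} \, dq \]

\[ = \sum_{j=0}^{q} \left[ -\delta \frac{v_j}{p_Y} k_j \, dt - \frac{B-1}{B} \frac{v_j}{p_Y} k_j \, dq \right] + \left[ \sum_{j=0}^{q+1} \frac{w_j^K}{p_Y} k_j + \frac{w-i}{p_Y} - c \right] dt + \kappa_{q+1} \frac{i}{J} \, dq \]

\[ = \left[ \frac{p_c}{p_Y} \frac{\partial Y}{\partial K} \sum_{j=0}^{q+1} B^{j} k_j - \delta a + \frac{w-i}{p_Y} - c \right] dt + \left[ \kappa_{q+1} \frac{i}{J} - \frac{B-1}{B} a \right] dq, \]

where the last equality used (9).

14 Here we need assets a to equal the sum over all vintages including the not-yet-existing one q + 1, as we need to include the development of $\kappa_{q+1}$ in (46).

As (12) tells us $p_I B^{j} = B^{q} v_j$ and $p_c = p_I$ by (11), we can replace $B^{j}$ by $B^{j} = B^{q} v_j / p_c$ and obtain

\[ da = \left[ B^{q} \frac{\partial Y}{\partial K} \sum_{j=0}^{q+1} \frac{v_j}{p_c} k_j - \delta a + \frac{w-i}{p_Y} - c \right] dt + \left[ \kappa_{q+1} \frac{i}{J} - \frac{B-1}{B} a \right] dq = \left[ ra + \frac{w-i}{p_Y} - c \right] dt + \left[ \kappa_{q+1} \frac{i}{J} - sa \right] dq, \]

where the interest rate r and s stand for

\[ r = B^{q} \frac{\partial Y}{\partial K} - \delta, \qquad s = \frac{B-1}{B}. \]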

Appendix C: The Bellman equation, the investment rule and the consumption jump condition

The Bellman equation is (cf., e.g., Dixit and Pindyck, 1994)

\[ \rho V(a, q) = \max_{c,\, i} \left\{ u(c) + V_a(a, q) \left[ ra + \frac{w-i}{p_Y} - c \right] + \lambda \left[ V(\tilde{a}, q+1) - V(a, q) \right] \right\} \tag{47} \]

with

\[ \tilde{a} = (1-s) a + \kappa_{q+1} \frac{i}{J}. \tag{48} \]

The first order condition for consumption is

\[ u'(c) = V_a(a, q). \tag{49} \]

The derivative with respect to real investment $i/p_Y$ in R&D is

\[ \frac{d\{.\}}{d(i/p_Y)} = -V_a(a, q) + \lambda V_{\tilde{a}}(\tilde{a}, q+1) \kappa_{q+1} \frac{1}{J/p_Y} = -V_a(a, q) + V_{\tilde{a}}(\tilde{a}, q+1) \kappa_{q+1} \frac{1}{D(q)}. \tag{50} \]

As R&D is undertaken under perfect competition, total investment J into R&D equals total production costs $p_R R$ of R&D firms. Using (4), we obtain $J = p_R R = p_R \lambda D(q)$, which has been used for the last equality in (50). The derivative (50) is in perfect analogy to expression (12) in Wälde (1999b) (with κq+1 ≡ and $D(q) \equiv p_I/b$ though). Hence, the investment rule can be taken from there, taking the slightly different budget constraint (16) into account.

The consumption jump condition is in analogy to Wälde (1999b) as well: As households on the R&D line are indifferent between financing R&D and accumulating capital, the derivative (50) with respect to investment i in R&D is zero. Inserting the first order condition for consumption (49) into (50) yields

\[ u'(c) = \frac{\kappa_{q+1}}{D(q)} u'(\tilde{c}). \tag{51} \]

Replacing individual consumption by aggregate consumption (apply the inverse function of u'(.) before) yields the aggregate consumption jump condition (25).

Appendix D: The Keynes-Ramsey rule

The marginal value of a unit of wealth $V_a(a, q)$ is a function of both assets a and of the technological level q. Applying the appropriate version of Ito's Lemma (Wälde, 1999b, appendix 1), the differential of the marginal value reads

\[ dV_a(a, q) = V_{aa}(a, q) \left[ ra + \frac{w-i}{p_Y} - c \right] dt + \left[ V_{\tilde{a}}(\tilde{a}, q+1) - V_a(a, q) \right] dq, \tag{52} \]

where $\tilde{a}$ is as in (48). It is important to note that Ito's Lemma is applied to the partial derivative of the function V(.) with respect to the first argument. This means that the jump term $V_{\tilde{a}}(\tilde{a}, q+1) - V_a(a, q)$ is a difference between partial derivatives with respect to first arguments and not a difference between partial derivatives with respect to a.

Klaus W¨alde The partial derivative of the maximized Bellman equation using the envelope theorem, i.e., assuming interior solutions such that derivatives with respect to control variables are zero, reads (this is derived with more intermediate steps in W¨alde 1999b, app. 3) w−i − c + Va (a, q) r ρVa (a, q) = Vaa (a, q) ra + pY a, q + 1) − Va (a, q)] +λ [Va (˜ w−i − c + Va (a, q) r = Vaa (a, q) ra + pY a, q + 1) − Va (a, q)] +λ [(1 − s) Va˜ (˜ where the last equality used (48). With Va ≡ Va (a, q), Vaa ≡ Vaa (a, q) a, q + 1) and rearranging this reads and Va˜ ≡ Va˜ (˜ w−i −c . [ρ − r + λ] Va − λ (1 − s) Va˜ = Vaa ra + pY Replacing

w−i −c Vaa ra + pY

in (52) by this expression gives dVa = [(ρ − r + λ) Va − (1 − s) λVa˜ ] dt + [Va˜ − Va ] dq ⇔ Va˜ dVa Va˜ dt + = ρ − r + λ 1 − (1 − s) − 1 dq. (53) Va Va Va Using the ﬁrst order condition for consumption (49), we can express the diﬀerential of Va in (53) as dVa = du (c) .

(54)

Dividing (54) by (49) yields du (c) dVa . = Va u (c) The Keynes-Ramsey rule therefore reads with (53) Va˜ du (c) Va˜ = r − ρ − λ 1 − (1 − s) − − 1 dq dt − u (c) Va Va = [r − ρ − λ [1 − [1 − s] Ω]] dt + [1 − Ω] dq. 418

(55)

Appendix E: Deriving the R&D line

We start from the investment rule (19) and first derive an expression for $(1-s)\Omega$. From the definition of $\Omega$ in (20) and the consumption jump condition (25),

\[ \Omega = \frac{u'(\tilde c)}{u'(c)} = \frac{D(q)}{\kappa_{q+1}}. \tag{56} \]

Hence, from (17),

\[ (1-s)\Omega = \frac{D(q)}{B\kappa_{q+1}}. \]

Then, we use the resource constraint (3) to express the arrival rate (4) on the R&D line, where $I = B^{-q}\delta K$, as

\[ \lambda = \frac{Y - B^{-q}\delta K - C}{D(q)}. \]

The fact that investment in capital just balances depreciation follows from looking at the budget constraint of households and the interest rate, as discussed after presenting the investment rule (19). Finally, using these two equations plus the expression for the interest rate (18), we obtain a rewritten expression for the investment rule (19) that is a function of the aggregate capital stock and aggregate consumption only,

\[ B^q\frac{\partial Y}{\partial K} - \delta - \rho > \frac{Y - B^{-q}\delta K - C}{D(q)}\left[1 - \frac{D(q)}{B\kappa_{q+1}}\right]. \tag{57} \]

This is (24) in the main text, where the inequality sign was replaced by the equality sign. The inequality sign is important to check whether investment in capital accumulation takes place above or below the R&D line. As this expression shows, capital is accumulated when consumption $C$ is sufficiently high. The phase-diagram analysis will show that this implies that capital accumulation takes place above the R&D line.

Appendix F: Transformation in section 13.4.1

This section derives the phase diagram in the transformed variables $\hat K$ and $\hat C$, where $K = \hat K A^{q/\alpha}$ and $C = \hat C A^q$ as in (26). Starting from (23), the transformed capital stock $\hat K$ follows (remember the definition $B = A^{(1-\alpha)/\alpha}$ in (8))

\[ \frac{d}{dt}\left(A^{q/\alpha}\hat K\right) = B^q\left(A^q\hat Y - A^q\hat C\right) - \delta\hat K A^{q/\alpha} \iff \frac{d}{dt}\hat K = \hat Y - \hat C - \delta\hat K, \]

where

\[ \hat Y = \hat K^{\alpha}L^{1-\alpha}. \tag{58} \]

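With (22), consumption follows

\[ -\frac{u''(\hat C A^q)}{u'(\hat C A^q)}\frac{d(A^q\hat C)}{dt} = B^q\frac{\partial(A^q\hat Y)}{\partial(\hat K A^{q/\alpha})} - \delta - \rho \iff -\frac{u''(\hat C A^q)}{u'(\hat C A^q)}\frac{d(A^q\hat C)}{dt} = \frac{\partial\hat Y}{\partial\hat K} - \delta - \rho, \]

where we used the definition of the interest rate in (18). For our CES utility function, the LHS simplifies and one gets

\[ \frac{d\hat C/dt}{\hat C} = \frac{\partial\hat Y/\partial\hat K - \delta - \rho}{\sigma}. \]

Let us now derive the transformed R&D line. Replace actual consumption and capital levels by productivity-adjusted levels as in (26),

\[ B^q\alpha\left(\frac{L}{\hat K A^{q/\alpha}}\right)^{1-\alpha} - \delta - \rho > \frac{A^q\hat Y - B^{-q}\delta A^{q/\alpha}\hat K - A^q\hat C}{D(q)}\left[1 - \frac{D(q)}{B\kappa_{q+1}}\right] \]
\[ \iff \alpha\left(\frac{L}{\hat K}\right)^{1-\alpha} - \delta - \rho > \frac{\hat Y - \delta\hat K - \hat C}{A^{-q}D(q)}\left[1 - \frac{D(q)}{B\kappa_{q+1}}\right], \tag{59} \]

where we used $B = A^{(1-\alpha)/\alpha}$ from (8). This expression shows us that the R&D line is vintage independent (and therefore does not move in the phase diagram) if

\[ D(q) = \phi A^q \quad\text{and}\quad \kappa_{q+1} = A^{q+1}\hat\kappa_0, \]

where $\hat\kappa_0$ is the productivity-adjusted size of prototype 0. Assuming

\[ 1 - \frac{D(q)}{B\kappa_{q+1}} = 1 - \frac{\phi A^q}{BA^{q+1}\hat\kappa_0} = 1 - \frac{\phi}{A^{1/\alpha}\hat\kappa_0} > 0, \tag{60} \]

which is (32) in the main text, inserting these assumptions (which have an intuitive economic meaning given in the main text), defining

\[ \hat r = \alpha\left(L/\hat K\right)^{1-\alpha} - \delta \]

as in (29) and solving for consumption yields

\[ \phi\,\frac{\hat r - \rho}{1 - \frac{\phi}{A^{1/\alpha}\hat\kappa_0}} > \hat Y - \delta\hat K - \hat C \iff \hat C > \hat Y - \delta\hat K - \phi\,\frac{\hat r - \rho}{1 - \frac{\phi}{A^{1/\alpha}\hat\kappa_0}}. \]

The capital stock increases due to successful research according to (10) by

\[ \tilde K - K = B^{q+1}\kappa_{q+1}. \]

Productivity-adjusted capital changes following (26) and (31) are then given by

\[ A^{\frac{q+1}{\alpha}}\hat{\tilde K} - A^{\frac{q}{\alpha}}\hat K = B^{q+1}A^{q+1}\hat\kappa_0 = A^{\frac{q+1}{\alpha}}\hat\kappa_0 \iff \hat{\tilde K} = \hat K/A^{1/\alpha} + \hat\kappa_0. \]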
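The vintage-independence claim can be checked numerically. The following is a minimal sketch, not part of the original derivation: it evaluates both sides of (57) in untransformed levels for several vintages $q$ while holding the productivity-adjusted levels fixed. The parameter values, the function name `rd_line_gap`, and the level production function $Y = K^{\alpha}L^{1-\alpha}$ (inferred from the substitution used to reach (59)) are assumptions made for illustration only.

```python
import math

# Illustrative parameter values (assumptions, not from the text).
A, ALPHA, L = 1.5, 0.3, 1.0
DELTA, RHO = 0.05, 0.03
PHI, KAPPA0 = 0.5, 1.0
B = A ** ((1 - ALPHA) / ALPHA)  # definition (8)


def rd_line_gap(q, k_hat, c_hat):
    """Left side minus right side of (57), evaluated in untransformed levels."""
    K = k_hat * A ** (q / ALPHA)          # K = K_hat * A^(q/alpha), as in (26)
    C = c_hat * A ** q                    # C = C_hat * A^q, as in (26)
    Y = K ** ALPHA * L ** (1 - ALPHA)     # assumed Cobb-Douglas output in levels
    D = PHI * A ** q                      # assumed D(q) = phi * A^q
    kappa_next = A ** (q + 1) * KAPPA0    # assumed kappa_{q+1} = A^(q+1) * kappa_hat_0
    lhs = B ** q * ALPHA * (L / K) ** (1 - ALPHA) - DELTA - RHO
    rhs = (Y - B ** (-q) * DELTA * K - C) / D * (1 - D / (B * kappa_next))
    return lhs - rhs


# Same productivity-adjusted point, different vintages q.
gaps = [rd_line_gap(q, k_hat=2.0, c_hat=0.8) for q in range(6)]
print(gaps)
```

Under these assumptions the gap between the two sides should be the same for every $q$ up to floating-point error, which is what it means for the R&D line not to move in the transformed phase diagram.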

References:

Aghion, P., and Howitt, P. (1992) “A Model of Growth through Creative Destruction.” Econometrica 60: 323–351.
Aghion, P., and Howitt, P. (1998) Endogenous Growth Theory. Cambridge, MA, MIT Press.
Backus, D. K., Kehoe, P. J., and Kehoe, T. J. (1992) “In Search of Scale Effects in Trade and Growth.” Journal of Economic Theory 58: 377–409.
Barro, R. J., and Sala-i-Martin, X. (1995) Economic Growth. New York, McGraw-Hill.
Bental, B., and Peled, D. (1996) “The Accumulation of Wealth and the Cyclical Generation of new Technologies: A Search Theoretic Approach.” International Economic Review 37: 687–718.
Cripps, M. W., Keller, G., and Rady, S. (2002) Strategic Experimentation: The Case of Poisson Bandits. CESifo Working Paper No. 737.
Dixit, A. K., and Olson, M. (2000) “Does Voluntary Participation Undermine the Coase Theorem?” Journal of Public Economics 76: 309–335.
Dixit, A. K., and Pindyck, R. S. (1994) Investment Under Uncertainty. Princeton University Press.
Howitt, P. (1999) “Steady Endogenous Growth with Population and R&D Inputs Growing.” Journal of Political Economy 107: 715–730.
Howitt, P., and Aghion, P. (1998) “Capital Accumulation and Innovation as Complementary Factors in Long-Run Growth.” Journal of Economic Growth 3: 111–130.
Jones, C. I. (1995) “R&D-Based Models of Economic Growth.” Journal of Political Economy 103: 759–84.
Matsuyama, K. (1999) “Growing through Cycles.” Econometrica 67: 335–347.
Segerstrom, P. S. (1998) “Endogenous Growth without Scale Effects.” American Economic Review 88: 1290–1310.
Segerstrom, P. S. (2001) Intel Economics. Stockholm School of Economics, mimeo. August 2001.
Wälde, K. (1999a) “A Model of Creative Destruction with Undiversifiable Risk and Optimising Households.” Economic Journal 109: C156–C171.
Wälde, K. (1999b) “Optimal Saving under Poisson Uncertainty.” Journal of Economic Theory 87: 194–217.
Wälde, K. (2002) “The Economic Determinants of Technology Shocks in a Real Business Cycle Model.” Journal of Economic Dynamics and Control 27: 1–28.
Young, A. (1998) “Growth without Scale Effects.” Journal of Political Economy 106: 41–63.

Chapter 14 Employment Cycles in a Growth Model with Creative Destruction

Tapio Palokangas University of Helsinki and HECER

14.1

Introduction

The purpose of this chapter is to construct a model that explains economic growth with fluctuations in output and employment. This study is closely related to theories of endogenous growth and real business cycles (RBC). Aghion and Howitt (1992) show that the introduction of jump processes into general equilibrium models leads to endogenous business cycles. In their original model, however, there is a perfect labor market, no real capital, and households are risk neutral. Aghion and Howitt (1998) incorporate capital accumulation, and Wälde (1999) risk-averse households, into the same model. Despite these generalizations, it is still typical for these models that the economy generates output and employment cycles only outside the balanced-growth path. In this chapter, we construct a model which generates such cycles on the balanced-growth path, but in which there are constant equilibrium levels for the labor-capital ratio and the productivity-adjusted wages. In the long run, the economy is expected to follow a balanced-growth path that satisfies Kaldor's stylized facts as follows:1

1 Cf. Barro and Sala-i-Martin (1995), p. 5. The original reference is Kaldor (1963).

1. Output per physical labor in production grows over time.
2. Capital per physical labor grows over time.
3. The rate of return to capital is constant.
4. The proportion of output to capital is constant.
5. The share of labor in national income is nearly constant.

Around this balanced-growth path, the economy generates cycles. There is substantial empirical evidence that technology shocks are contractionary on impact. Gali (1999) and Basu et al. (2004) document for the U.S. and other G7 economies a negative correlation between technology shocks, identified under different assumptions, and several measures of labor and other inputs. Marchetti (2005) confirms the same result with panel data on Italian manufacturing firms. These authors, however, interpret the finding as evidence in favour of sticky nominal prices, as follows. In the wake of a technology expansion, nominal rigidities prevent prices from falling and thus aggregate demand does not increase. Therefore, firms produce the same output with a smaller volume of inputs, which have become more productive. In this chapter, an alternative explanation is constructed on the basis of sticky real wages.

Introducing endogenous shocks into an RBC model, Wälde (2002) showed that the ‘laissez faire’ economy and the social planner generate different outcomes. In his model, however, the economy is characterized by ‘bang-bang’ development: because R&D is subject to constant returns to scale and the same good is used in both R&D and capital accumulation, the firms either do R&D or invest in real capital, but not both at the same time. Here it is assumed that, because the firms also learn from each other, technological change in a single firm is a function of the R&D inputs of all firms in the economy. This means that firms invest in R&D and real capital simultaneously and the economy remains in a stationary state, although there are endogenous technological shocks.
This study constructs a model where the economy adjusts to productivity shocks through employment, while wages evolve in proportion to the productivity of labor. The major causes of non-competitive real wages in macroeconomic models are efficiency wages and union-employer bargaining. It is assumed that workers in R&D earn efficiency wages, i.e., their productivity depends on their expected relative wage, while workers in production belong to a labor union which sets wages for its members. In the production of goods, the marginal product of labor is falling for a given capital stock. Hence, there are profits to be bargained over. In the R&D sector, the marginal product of labor is constant and there are no profits, but a worker's efficiency depends on his expected relative wage. The interplay of union and efficiency wages generates involuntary unemployment, but relative wages are stable over a cycle.

The remainder of this chapter is organized as follows. Technological change is specified in section 14.2, and R&D and capital accumulation in 14.3. Sections 14.4 and 14.5 introduce capitalists and wage settlement into the model. Growth and cycles are examined in sections 14.6 and 14.7.

14.2

Technology

There is a fixed number $m$ of workers who supply labor and consume all their income, and a fixed number $n$ of capitalists who invest in capital and R&D projects.2 Each capitalist owns and fully controls one firm. All firms produce the same consumption good, which is chosen as the numeraire. In addition, each firm produces a capital good which is specific to the firm itself.3 The productivity of labor is unity in R&D and $a$ in production. Learning by investment in capital by any firm contributes to a stock of knowledge which is common to all firms. This spillover of knowledge increases the productivity of labor in production, $a$.

Each capitalist $j$ accumulates firm-specific capital $K_j$, produces goods $Y_j$ from labor $L_j$ and firm-specific capital $K_j$, and does R&D with labor $Z_j$. Capitalist $j$ produces the consumption good and the firm-specific capital good otherwise by the same technology, but total factor productivity in the production of the consumption good is subject to technological change. Thus, capitalist $j$ accumulates firm-specific capital $K_j$ by the amount $I_j$ and converts the rest $Y_j - I_j$ of output $Y_j$ into a consumption good in proportion $A^{\gamma_j}$, where $A > 1$ is a constant and $\gamma_j$ is the serial number of technology.4

2 I have to separate between workers and capitalists, for simplicity. If workers possessed any capital and were therefore also owners of firms, then it would be very difficult to model wage bargaining in section 14.5.
3 If capital were freely tradable and not firm-specific, then the capitalist's budget constraint (2) should be modeled as $C_j + W_jL_j + vZ_j + I_j = A^{\gamma_j}Y_j$. In that case, the capitalist's propensity to consume, (17), would not be constant and there would be no solution for dynamic programming that characterizes the capitalist's behavior in section 14.4.
4 If the production of capital goods were subject to technological change as well, then the rate of return to capital would increase with time and Kaldor's third stylized fact of growth [cf. section 14.1] would not hold.

Capitalist $j$ produces its output $Y_j$ from labor $L_j$ and capital $K_j$ through a twice-differentiable production function with constant returns to scale:

\[ Y_j = F(aL_j, K_j) = f(l_j)K_j, \quad l_j \doteq aL_j/K_j, \quad f' > 0, \quad f'' < 0. \tag{1} \]

Capitalist $j$ pays the wage $W_j$ per worker in production and all capitalists pay the same wage $v$ per worker in R&D. Capitalist $j$'s budget constraint is given by

\[ C_j + W_jL_j + vZ_j = A^{\gamma_j}(Y_j - I_j), \tag{2} \]

where $C_j$ is consumption, $Y_j$ output, $W_jL_j$ and $vZ_j$ wages in production and R&D, and $A^{\gamma_j}(Y_j - I_j)$ the supply of the consumption good. It is assumed that the capitalists have the following rational expectations on the working of the labor market institutions in the economy:

Assumption 1. Each capitalist $j$ expects that the wage $W_j$ for its workers in production will increase in proportion to the total productivity of these workers, $aA^{\gamma_j}$. This implies that

\[ W_j = w_j a A^{\gamma_j}, \tag{3} \]

where the productivity-adjusted wage $w_j$ is exogenous for capitalist $j$.

Each capitalist $j$ can increase the probability of technological change for it by investing in R&D. In the advent of technological change, the conversion ratio between consumption and investment for the capitalist increases from $A^{\gamma_j}$ to $A^{\gamma_j+1}$. This will generate cycles. The improvement of technology for capitalist $j$ depends on the capitalist's own R&D, $Z_j$. In a small period of time $dt$, the probability that R&D leads to development of a new technology is given by $(\lambda\log Z_j)dt$, while the probability that R&D remains without success is given by $1 - (\lambda\log Z_j)dt$, where $\lambda$ is research workers' productivity:5

\[ dq_j = \begin{cases} 1 & \text{with probability } (\lambda\log Z_j)\,dt, \\ 0 & \text{with probability } 1 - (\lambda\log Z_j)\,dt, \end{cases} \tag{4} \]

where $q_j$ is the Poisson process resulting from capitalist $j$'s investment in R&D and $dq_j$ is the increment of this process.

5 The logarithmic specification is chosen for analytical convenience. With some complication, the results could be generalized to the case in which the probability $(\lambda\log Z_j)dt$ is replaced by $(\lambda/\nu)\log Z_j^{\nu}\,dt$, and $\nu \in (0,\infty)$ is a constant.
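The arrival process in (4) can be illustrated with a small simulation. This is a sketch under stated assumptions rather than anything taken from the chapter: the parameter values, the function names `arrival_rate` and `count_innovations`, and the discretization by Bernoulli draws per step $dt$ are all illustrative. The hazard $\lambda\log Z_j$ is only a valid probability rate when $Z_j > 1$, which the sketch enforces.

```python
import math
import random


def arrival_rate(lam, z):
    """Hazard rate lam * log(z) from (4); requires z > 1 so the rate is positive."""
    if z <= 1.0:
        raise ValueError("z must exceed 1 for a positive arrival rate")
    return lam * math.log(z)


def count_innovations(lam, z, horizon, dt, rng):
    """Count jumps dq_j = 1 over [0, horizon] using one Bernoulli draw per step dt."""
    p = arrival_rate(lam, z) * dt  # success probability within one small step
    steps = int(round(horizon / dt))
    return sum(1 for _ in range(steps) if rng.random() < p)


rng = random.Random(0)  # fixed seed so the run is reproducible
n_jumps = count_innovations(lam=0.5, z=math.exp(2.0), horizon=2000.0, dt=0.01, rng=rng)
print(n_jumps, "innovations; expected about", 0.5 * 2.0 * 2000.0)
```

With these hypothetical values the hazard is one innovation per unit of time, so the simulated count over the horizon should be close to the horizon length.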

14.3

R&D and capital accumulation

Assume, for simplicity, that capital is a stock of goods that does not depreciate. Investment per unit of time $dt$ then equals deterministic capital accumulation, $I_j\,dt = dK_j^d$. Solving for $I_j$ from (2) and noting (1) and (3) yields

\[ dK_j^d = I_j\,dt = \left\{[f(l_j) - w_jl_j]K_j - A^{-\gamma_j}(C_j + vZ_j)\right\}dt. \tag{5} \]

R&D is directed at developing new production units. Assume that after a successful development of new technology, a constant share $\phi$ of the previous vintage can be upgraded, which therefore has the higher productivity.6 The remaining share $1 - \phi$ of the capital stock becomes obsolete. After successfully completing an R&D project, the capital stock is then given by

\[ \tilde K_j = \phi K_j, \quad 0 < \phi < 1. \tag{6} \]

Given this definition, the entire capital stock belongs to the same vintage. Noting (5), capital accumulation for capitalist $j$ is given by

\[ dK_j = dK_j^d + (\tilde K_j - K_j)\,dq_j = I_j\,dt + (\tilde K_j - K_j)\,dq_j = \left\{[f(l_j) - w_jl_j]K_j - A^{-\gamma_j}(C_j + vZ_j)\right\}dt + (\tilde K_j - K_j)\,dq_j. \tag{7} \]

This is a stochastic differential equation where uncertainty results from a Poisson process $q_j$. During a small period of time $dt$, the capital stock of vintage $\gamma_j$ increases deterministically by investment in capital accumulation. With a successful R&D project, $dq_j = 1$, the capital stock jumps by $\tilde K_j - K_j$ and the level of productivity in the consumption-goods sector rises by the factor $A$. When no investment in R&D takes place or when R&D fails, the increment $dq_j$ is zero, the level of productivity does not change and there is no jump in the capital stock $K_j$.

Lucas (1988) shows that endogenous growth models produce constant steady-state growth only under “knife-edge” parameter assumptions. The “knife-edge” assumption of this model is the following:

Assumption 2. There is no trend for unemployment.

The economy would converge to full employment if the aggregate capital stock grew faster than the productivity of labor, and unemployment would increase indefinitely if it grew slower [cf. section 14.5]. Assumption 2 means that the capital stock and the productivity of labor must in the long run grow at the same rate.

6 This idea is from Wälde (2002).
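The step from the budget constraint (2) to the investment expression in (5) can be checked with arithmetic. The sketch below is illustrative only: the functional form $f(l) = \sqrt{l}$, the numbers, and the function names are assumptions introduced here, not part of the model's specification.

```python
import math


def f(l):
    """Assumed intensive-form production function (illustrative): f(l) = sqrt(l)."""
    return math.sqrt(l)


def investment_from_budget(C, W, L, v, Z, Y, A, gamma):
    """Solve the budget constraint (2) for I_j."""
    return Y - (C + W * L + v * Z) / A ** gamma


def investment_eq5(C, w, l, K, v, Z, A, gamma):
    """Investment per unit of time as written in (5)."""
    return (f(l) - w * l) * K - (C + v * Z) / A ** gamma


# Hypothetical numbers chosen only to exercise the identity.
a, A, gamma = 2.0, 1.2, 3
K, L, w = 10.0, 4.0, 0.3
C, v, Z = 1.5, 0.9, 2.0
l = a * L / K            # labor-capital ratio from (1)
Y = f(l) * K             # output from (1)
W = w * a * A ** gamma   # wage from (3)

I_budget = investment_from_budget(C, W, L, v, Z, Y, A, gamma)
I_eq5 = investment_eq5(C, w, l, K, v, Z, A, gamma)
print(I_budget, I_eq5)
```

The two numbers should agree because $A^{-\gamma_j}W_jL_j = w_j a L_j = w_j l_j K_j$, which is the substitution that turns (2) into (5).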

14.4

Capitalists

Capitalist $j$ maximizes its expected utility over time by choosing its streams of consumption $C_j$, the labor-capital ratio in production, $l_j$, and investment in R&D, $Z_j$, subject to the production function (1), capital accumulation (7) and the stochastic process (4), given the wage $v$ and the productivity-adjusted wage $w_j$. Let $\rho > 0$ be the constant rate of time preference and $1/(1-\sigma)$ the constant rate of risk aversion. The value of the optimal program starting at time $t$ is then

\[ \Gamma(K_j, Z, W_j, v, \gamma_j) = \max_{C_j,\,Z_j,\,l_j} E\int_t^{\infty} e^{-\rho(\tau - t)}C_j^{\sigma}\,d\tau \quad \text{s.t. (11) and (7)}, \tag{8} \]

where $E$ is the expectations operator. Because the capitalist is a risk averter, $0 < \sigma < 1$ holds. Let $\tilde K_j$ and $\tilde\Gamma = \Gamma(\tilde K_j, \tilde W_j, v, \gamma_j + 1)$ be the values of $K_j$ and $\Gamma$ after successfully completing an R&D project. Denoting $\Gamma_K \doteq \partial\Gamma/\partial K_j$ and noting (4) and (7), the Bellman equation of the optimal program of capitalist $j$ is as follows:7

\[ \rho\,\Gamma(K_j, Z, W_j, v, \gamma_j) = \max_{C_j,\,Z_j,\,l_j} \Phi(C_j, Z_j, l_j, K_j, Z, W_j, v, \gamma_j), \tag{9} \]

where

\[ \Phi(C_j, Z_j, l_j, K_j, Z, W_j, v, \gamma_j) \doteq C_j^{\sigma} + (\lambda\log Z_j)[\tilde\Gamma - \Gamma] + \Gamma_K I_j \]
\[ = C_j^{\sigma} + (\lambda\log Z_j)\left[\Gamma(\tilde K_j, Z, W_j, v, \gamma_j + 1) - \Gamma(K_j, Z, W_j, v, \gamma_j)\right] + \left\{[f(l_j) - w_jl_j]K_j - A^{-\gamma_j}(C_j + vZ_j)\right\}\Gamma_K(K_j, W_j, v, \gamma_j). \tag{10} \]

The first-order conditions associated with the optimal program of capitalist $j$ are the following. First, maximizing (10) by the labor-capital ratio $l_j$ yields

\[ w_j = f'(l_j), \quad \Pi_j \doteq \pi(l_j)K_jA^{\gamma_j}, \quad \pi(l_j) \doteq f(l_j) - f'(l_j)l_j, \tag{11} \]

where $\Pi_j$ is capitalist $j$'s income (= profits). Second, maximizing (10) by consumption $C_j$ yields

\[ \sigma C_j^{\sigma - 1} = \Gamma_K A^{-\gamma_j}. \tag{12} \]

Finally, maximizing (10) by R&D $Z_j$ yields

\[ \partial\Phi/\partial Z_j = (\tilde\Gamma - \Gamma)\lambda/Z_j - v\Gamma_K A^{-\gamma_j} = 0. \tag{13} \]

7 Cf. Dixit and Pindyck (1994).

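Because the productivity-adjusted wage $w_j$ is exogenous for capitalist $j$, given (11), its optimal labor-capital ratio $l_j = (f')^{-1}(w_j)$ can be considered as constant in optimization. To solve the dynamic program, assume first that the capitalist's consumption expenditure $C_j$ is a fixed share $c_j \in (0,1)$ of its income $\Pi_j$, and second that the value function $\Gamma$ is in fixed proportion $(c_jr_j)^{-1}$ to the instantaneous utility $C_j^{\sigma}$, where $c_j$ and $r_j$ are constants. From these, (6), (11) and (12) it follows that

\[ C_j = c_j\Pi_j = c_j\pi(l_j)K_jA^{\gamma_j}, \quad \partial C_j/\partial K_j = c_j\pi(l_j)A^{\gamma_j}, \quad \Gamma = C_j^{\sigma}/(c_jr_j), \]
\[ \Gamma_K = \frac{1}{c_jr_j}\frac{\partial C_j^{\sigma}}{\partial K_j} = \frac{\sigma C_j^{\sigma-1}}{c_jr_j}\frac{\partial C_j}{\partial K_j} = \frac{\Gamma_K}{c_jr_jA^{\gamma_j}}\frac{\partial C_j}{\partial K_j} = \frac{\Gamma_K}{r_j}\pi(l_j), \]
\[ r_j = \pi(l_j), \quad \tilde C_j/C_j = A\tilde K_j/K_j = \phi A, \quad \tilde\Gamma/\Gamma = (\tilde C_j/C_j)^{\sigma} = (\phi A)^{\sigma}, \]
\[ K_j\Gamma_K/\Gamma = K_j\sigma C_j^{\sigma-1}A^{\gamma_j}/\Gamma = K_j\sigma c_jr_jA^{\gamma_j}/C_j = K_j\sigma c_j\pi(l_j)A^{\gamma_j}/C_j = \sigma. \tag{14} \]

Assume that a technological change leads to an increase in welfare, $\tilde\Gamma > \Gamma$, since otherwise there would be no incentive to do R&D. Given (14), there is then a constant

\[ \theta \doteq \tilde\Gamma/\Gamma - 1 = (\phi A)^{\sigma} - 1 > 0. \tag{15} \]

From (13), (14) and (15) it follows that

\[ vA^{-\gamma_j} = \frac{\tilde\Gamma - \Gamma}{\Gamma_K}\frac{\lambda}{Z_j} = \frac{\theta\Gamma}{\Gamma_K}\frac{\lambda}{Z_j} = \frac{\theta\lambda K_j}{\sigma Z_j}. \]

Given this equation, (14) and (15) imply

\[ vZ_j\Gamma_K/\Gamma = \theta\lambda A^{\gamma_j}, \quad Z_j = [\theta\lambda/(\sigma v)]K_jA^{\gamma_j}. \tag{16} \]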
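The guess-and-verify argument behind (14) to (16) can be checked numerically. The following sketch assumes illustrative parameter values (none are from the chapter) and hypothetical variable names; it builds the conjectured value function from (14), computes $Z_j$ from (16), and evaluates the residuals of the first-order conditions (12) and (13).

```python
import math

# Illustrative parameters (assumptions, not taken from the chapter).
sigma, lam, v = 0.5, 0.8, 1.1
A, phi = 1.6, 0.9          # chosen so that (phi * A) ** sigma > 1
K, gamma = 5.0, 2
c, pi_l = 0.4, 0.25        # consumption share c_j and profit rate pi(l_j)

theta = (phi * A) ** sigma - 1.0                 # (15)
C = c * pi_l * K * A ** gamma                    # (14): C_j = c_j pi(l_j) K_j A^gamma_j
Gamma = C ** sigma / (c * pi_l)                  # (14) with r_j = pi(l_j)
Gamma_K = sigma * Gamma / K                      # from K_j Gamma_K / Gamma = sigma
Gamma_tilde = (phi * A) ** sigma * Gamma         # value after a successful project
Z = theta * lam / (sigma * v) * K * A ** gamma   # (16)

# Residuals of the first-order conditions; both should vanish.
foc_consumption = sigma * C ** (sigma - 1.0) - Gamma_K / A ** gamma     # (12)
foc_rd = (Gamma_tilde - Gamma) * lam / Z - v * Gamma_K / A ** gamma      # (13)
print(foc_consumption, foc_rd)
```

Both residuals should be zero up to rounding, which indicates that the conjectured form of $\Gamma$ is consistent with the optimality conditions for any admissible parameter choice of this kind.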

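Inserting (10), (11), (14), (15) and (16) into (9) and (10) yields

\[ \rho = \Phi/\Gamma = C_j^{\sigma}/\Gamma + (\tilde\Gamma/\Gamma - 1)\lambda\log Z_j + \left\{\pi(l_j)K_j - A^{-\gamma_j}(C_j + vZ_j)\right\}\Gamma_K/\Gamma \]
\[ = C_j^{\sigma}/\Gamma + \theta\lambda\log Z_j + \left\{\pi(l_j)K_j - A^{-\gamma_j}(C_j + vZ_j)\right\}\Gamma_K/\Gamma \]
\[ = c_jr_j + \theta\lambda\log Z_j + \left[(1 - c_j)\pi(l_j)K_j - vZ_jA^{-\gamma_j}\right]\Gamma_K/\Gamma \]
\[ = c_jr_j + \theta\lambda\log Z_j + (1 - c_j)\pi(l_j)\sigma - vZ_jA^{-\gamma_j}\Gamma_K/\Gamma \]
\[ = c_j\pi(l_j) + \theta\lambda\log Z_j + (1 - c_j)\pi(l_j)\sigma - \theta\lambda \]
\[ = [(1 - \sigma)c_j + \sigma]\pi(l_j) + \theta\lambda\log Z_j - \theta\lambda. \]

Solving for $c_j$ yields the propensity to consume for capitalist $j$:

\[ c_j = \frac{\rho + \theta\lambda(1 - \log Z_j)}{(1 - \sigma)\pi(l_j)} - \frac{\sigma}{1 - \sigma}. \tag{17} \]

14.5

Wage settlement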

Workers employed by capitalist $j$ in production are organized in labor union $j$. In a bargain over the wage $W_j$, capitalist $j$ attempts to maximize its profit $\Pi_j$, while union $j$ attempts to maximize its members' income $W_jL_j$. I construct the reference income for the parties of bargaining as follows.8 Let the wage and employment before dispute be $W_j^0$ and $L_j^0$, respectively. It is assumed that, due to labor market legislation, a fixed proportion $\beta \in (0,1)$ of the employed workers $L_j^0$ cannot go on strike (e.g. protection work) and they get the previous wage $W_j^0$ during a strike. Noting (1), this implies that during a strike the labor force is $\beta L_j^0$ and the employer earns in terms of consumption

\[ \Pi_j^0 = A^{\gamma_j}F(a\beta L_j^0, K_j) - W_j^0\beta L_j^0 = \left[f(\beta l_j^0) - w_j^0\beta l_j^0\right]K_jA^{\gamma_j}, \quad \text{with } l_j^0 \doteq aL_j^0/K_j. \tag{18} \]

Thus, the reference income is zero for the union and (18) for the capitalist. It is assumed that both parties in bargaining take the capital stock $K_j$, the level of productivity, $aA^{\gamma_j}$, the previous productivity-adjusted wage $w_j^0$ and the previous labor-capital ratio $l_j^0$ as given.9 The Generalized Nash Product of the asymmetric bargaining between the union and the capitalist is

\[ \Lambda_j \doteq (W_jL_j)^{\alpha}\left(\Pi_j - \Pi_j^0\right)^{1-\alpha}, \tag{19} \]

where the constant $\alpha \in (0,1)$ is the union's relative bargaining power. Given (1), (11) and (18), the product (19) takes the form

\[ \Lambda_j(W_j, K_j, \alpha) \doteq (W_jL_j)^{\alpha}\left[\Pi_j - \Pi_j^0\right]^{1-\alpha} \]
\[ = \left[l_jf'(l_j)\right]^{\alpha}\left[\pi(l_j) - f(\beta l_j^0) + w_j^0\beta l_j^0\right]^{1-\alpha}K_jA^{\gamma_j} \]
\[ = \left[l_jf'(l_j)\right]^{\alpha}\left[\pi(l_j) - f(\beta l_j^0) + f'(l_j^0)\beta l_j^0\right]^{1-\alpha}K_jA^{\gamma_j}. \tag{20} \]

The outcome of the bargaining is obtained through maximizing the Generalized Nash Product (20), given the capital stock $K_j$ and the level of productivity $A^{\gamma_j}$. Because there is a one-to-one correspondence between $W_j$ and $l_j$ through (3) and (11), the wage $W_j$ is replaced by the labor-capital ratio $l_j$ as the instrument of this maximization. Hence, there must be

\[ l_j(l_j^0) \doteq \arg\max_{l_j}\Lambda_j = \arg\max_{l_j}\left[l_jf'(l_j)\right]^{\alpha}\left[\pi(l_j) - f(\beta l_j^0) + f'(l_j^0)\beta l_j^0\right]^{1-\alpha}. \tag{21} \]

In equilibrium, the labor-capital ratio is constant over time, $l_j^0 = l_j$. This, (21) and the symmetry throughout all $j$ imply

\[ l_j = l = \text{constant}. \tag{22} \]

8 The same assumption is used in Palokangas (2005). Some papers assume that the expected wage outside the firm is the union's reference point, but this is not quite in line with the microfoundations of the alternating offers game. Binmore, Rubinstein and Wolinsky (1986, pp. 177, 185-6) state that the reference income should not be identified with the outside option point. Rather, despite the availability of these options, it remains appropriate to identify the reference income with the income streams accruing to the parties in the course of the dispute. For example, if the dispute involves a strike, these income streams are the employee's income from temporary work, union strike funds, and similar sources, while the employer's income might derive from temporary arrangements that keep the business running.
9 If these parties also took the effect of the wage $W_j$ through capital accumulation into account, then the union's (capitalist's) target would be the expected value of the stream of wages (profits). Because in our model the capital stock follows a cycle, the mathematical solutions for such expected values would be very difficult to obtain.

Assume that a research worker's productivity $\lambda$ is an increasing function of his wage $v$ relative to the expected wage after losing one's job, $\omega$, as follows:10

\[ \lambda = (v/\omega - 1)^{\zeta}, \quad 0 < \zeta < 1, \tag{23} \]

where the elasticity $\zeta$ of productivity with respect to extra wage is a constant. An R&D firm chooses the wage for its employees, $v$, to minimize its unit cost of research, $v/\lambda$, given the expected wage after losing one's job, $\omega$. Noting (23), this yields the equilibrium conditions

\[ v = \omega/(1 - \zeta), \quad \lambda = [\zeta/(1 - \zeta)]^{\zeta} = \text{constant}. \tag{24} \]

10 This assumption is a modification of the efficiency wage model presented in Solow (1979), Summers (1988) and Van Schaik and De Groot (1998).

Let $m$ be the number of workers. If the number of capitalists, $n$, is large, the expected wage in the economy after losing one's job, $\omega$, is then equal to the sum of the wages $W_j$ weighted by the probabilities of being employed, $L_j/m$, for all capitalists $j$, plus the wage in R&D, $v$, times the probability of being employed in R&D, $\sum_j Z_j/m$:11

\[ \omega = \sum_{j=1}^{n}W_j\frac{L_j}{m} + v\sum_{j=1}^{n}\frac{Z_j}{m} = \frac{1}{m}\sum_{j=1}^{n}W_jL_j + \frac{v}{m}\sum_{j=1}^{n}Z_j. \tag{25} \]

The macroeconomic real variables are as follows:

\[ C \doteq \sum_{j=1}^{n}C_j, \quad I \doteq \sum_{j=1}^{n}I_j, \quad Z \doteq \sum_{j=1}^{n}Z_j, \quad K \doteq \sum_{j=1}^{n}K_j, \quad L \doteq \sum_{j=1}^{n}L_j, \quad Y \doteq \sum_{j=1}^{n}Y_j. \tag{26} \]

From (11), (16), (22), (24), (25) and (26) it follows that

\[ vZ = \frac{\theta}{\sigma}\lambda\sum_{j=1}^{n}K_jA^{\gamma_j}, \quad \frac{Z_j}{Z} = \frac{K_jA^{\gamma_j}}{\sum_{k=1}^{n}K_kA^{\gamma_k}}, \tag{27} \]

\[ (1 - \zeta)\frac{m}{Z} = \frac{\omega m}{vZ} = \frac{1}{vZ}\sum_{j=1}^{n}W_jL_j + \frac{1}{Z}\sum_{j=1}^{n}Z_j = \frac{f'(l)\,l}{vZ}\sum_{j=1}^{n}K_jA^{\gamma_j} + 1 = \frac{\sigma}{\theta\lambda}f'(l)\,l + 1, \]
\[ Z = (1 - \zeta)\,m\left[\frac{\sigma}{\theta\lambda}f'(l)\,l + 1\right]^{-1} = \text{constant}. \tag{28} \]
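The efficiency-wage conditions in (24) follow from minimizing the unit cost $v/\lambda$ with $\lambda$ given by (23). A minimal numerical sketch, assuming illustrative values for $\omega$ and $\zeta$ and hypothetical function names, compares the closed form with a coarse grid search:

```python
import math


def unit_cost(v, omega, zeta):
    """Cost per unit of research productivity, v / lambda, with lambda from (23)."""
    return v / (v / omega - 1.0) ** zeta


def optimal_wage(omega, zeta):
    """Closed-form minimizer stated in (24)."""
    return omega / (1.0 - zeta)


omega, zeta = 1.0, 0.4     # illustrative values
v_star = optimal_wage(omega, zeta)
lam_star = (v_star / omega - 1.0) ** zeta

# Coarse grid search over wages above omega as an independent check.
grid = [omega * (1.0 + 0.001 * k) for k in range(1, 5001)]
v_grid = min(grid, key=lambda v: unit_cost(v, omega, zeta))
print(v_star, v_grid, lam_star)
```

The grid minimizer should sit within one grid step of $\omega/(1-\zeta)$, and the implied productivity should match $[\zeta/(1-\zeta)]^{\zeta}$, which depends on $\zeta$ alone and is therefore constant.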

14.6

(28)

Economic growth

From (1), (11), (22) and (26) it follows that

\[ W_j = aA^{\gamma_j}f'(l), \quad Y = f(l)K, \quad L = lK/a, \quad [Y_j - A^{-\gamma_j}W_jL_j]/K_j = f(l) - lf'(l) = \pi. \tag{29} \]

These results can be rephrased as:

11 This specification is based on the simplification that the unemployed are supported by the employed workers in production. It would be a minor modification with the same results to extend the model as follows. (i) There are unemployment benefits. (ii) In line with Summers (1988), each unemployed worker obtains benefits in fixed proportion to e.g. the expected wage $\omega$. (iii) The benefits are financed by a proportional labor income tax.

Proposition 1. The output-capital ratio $Y/K = f(l)$ and the rate of return paid to capital, $\pi(l)$, are constants. A worker's wage in production, $W_j$, grows in fixed proportion to the total productivity of labor in production, $aA^{\gamma_j}$, for each capitalist $j$.

Defining the serial number $\gamma$ of macroeconomic technology so that

\[ A^{\gamma}(Y - I) = \sum_{j=1}^{n}A^{\gamma_j}(Y_j - I_j), \tag{30} \]

and aggregating throughout all $j$, the equation (2) becomes

\[ C + \sum_{j=1}^{n}W_jL_j + vZ = A^{\gamma}(Y - I), \]

where $C$ is the capitalists' total consumption, $\sum_{j=1}^{n}W_jL_j + vZ$ the workers' total consumption (= all wages paid in the economy) and $A^{\gamma}(Y - I)$ aggregate production of the consumption good. Noting (4), (27) and (30), the average growth rate of aggregate productivity $A^{\gamma}$ is as follows:12

\[ E\left[\log A^{\gamma+1} - \log A^{\gamma}\right] = \sum_{j=1}^{n}\frac{\partial[A^{\gamma}(Y - I)]}{\partial[A^{\gamma_j}(Y_j - I_j)]}\,\frac{A^{\gamma_j}(Y_j - I_j)}{A^{\gamma}(Y - I)}\,E\left[\log A^{\gamma_j+1} - \log A^{\gamma_j}\right] \]
\[ = \sum_{j=1}^{n}\frac{A^{\gamma_j}(Y_j - I_j)}{A^{\gamma}(Y - I)}\,E\left[\log A^{\gamma_j+1} - \log A^{\gamma_j}\right] = \lambda\sum_{j=1}^{n}\frac{A^{\gamma_j}(Y_j - I_j)}{A^{\gamma}(Y - I)}\log Z_j \]
\[ = \lambda\sum_{j=1}^{n}\frac{A^{\gamma_j}(Y_j - I_j)}{A^{\gamma}(Y - I)}\log\left[\frac{Z}{n}\,\frac{nK_jA^{\gamma_j}}{\sum_{k=1}^{n}K_kA^{\gamma_k}}\right] \]
\[ = \lambda\left[\log\frac{Z}{n} + \sum_{j=1}^{n}\frac{A^{\gamma_j}(Y_j - I_j)}{A^{\gamma}(Y - I)}\log\frac{nK_jA^{\gamma_j}}{\sum_{k=1}^{n}K_kA^{\gamma_k}}\right], \tag{31} \]

where $E$ is the expectations operator. Now assume that the number of capitalists, $n$, is high enough. Because the terms

\[ \lim_{n\to\infty}\frac{A^{\gamma_j}(Y_j - I_j)}{A^{\gamma}(Y - I)} \]

approach zero for all $j$, but the terms

\[ \frac{nK_jA^{\gamma_j}}{\sum_{k=1}^{n}K_kA^{\gamma_k}} \]

are bounded and close to one for all $j$ when $n \to \infty$, the equation (31) can be written in the form

\[ \lim_{n\to\infty}E\left[\log A^{\gamma+1} - \log A^{\gamma}\right] = \lambda\log(Z/n). \]

12 For this, see Aghion and Howitt (1998), p. 59.

This result can be rephrased as follows:

Proposition 2. The level of productivity in the consumption-good sector, $A^{\gamma}$, grows on average at the constant rate $\lambda\log(Z/n)$.

A technological change that increases capitalist $j$'s productivity from $A^{\gamma_j}$ to $A^{\gamma_j+1}$ and decreases its capital stock from $K_j$ to $\tilde K_j = \phi K_j$ follows the Poisson process $q_j$ with (4). This means that given (30), proposition 2 and the properties of the Poisson processes, a technological change that increases macroeconomic productivity from $A^{\gamma}$ to $A^{\gamma+1}$ and decreases the total capital stock from $K$ to $\tilde K = \phi K$ follows the Poisson process $q$ with

\[ dq = \begin{cases} 1 & \text{with probability } \lambda\log(Z/n)\,dt, \\ 0 & \text{with probability } 1 - \lambda\log(Z/n)\,dt, \end{cases} \tag{32} \]

where $dq$ is the increment of the process $q$. Noting (1), (11), (22), (26), (27) and (28), the ratio of total labor income $\sum_{j=1}^{n}(W_jL_j + vZ_j)$ to national income $\sum_{j=1}^{n}A^{\gamma_j}Y_j$ can be written as follows:

\[ \frac{\sum_{j=1}^{n}(W_jL_j + vZ_j)}{\sum_{j=1}^{n}A^{\gamma_j}Y_j} = \frac{\sum_{j=1}^{n}f'(l_j)l_jK_jA^{\gamma_j} + vZ}{\sum_{j=1}^{n}f(l_j)K_jA^{\gamma_j}} = \frac{\sum_{j=1}^{n}f'(l_j)l_jK_jA^{\gamma_j} + \frac{\theta}{\sigma}\lambda\sum_{j=1}^{n}K_jA^{\gamma_j}}{\sum_{j=1}^{n}f(l_j)K_jA^{\gamma_j}} = \frac{f'(l)\,l + \theta\lambda/\sigma}{f(l)}, \]

which is a constant. This result can be rephrased as follows:

Proposition 3. The share of labor in national income is constant.
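Proposition 3 can be illustrated by aggregating over a heterogeneous cross-section of firms. The sketch below is an illustration under assumptions: the production function $f(l) = l/(1+l)$, the parameter values, the list `firms`, and all variable names are hypothetical. It uses the wage from (29) and R&D employment from (16), and compares the aggregate labor share with the closed form $(f'(l)\,l + \theta\lambda/\sigma)/f(l)$.

```python
import math


def f(l):
    """Assumed intensive-form production function (illustrative): f(l) = l / (1 + l)."""
    return l / (1.0 + l)


def f_prime(l):
    """Derivative of the assumed f."""
    return 1.0 / (1.0 + l) ** 2


# Illustrative parameters and a hypothetical cross-section of firms.
A, a, l = 1.3, 2.0, 1.5
theta, lam, sigma, v = 0.2, 0.7, 0.5, 1.0
firms = [(4.0, 0), (2.5, 1), (7.0, 3), (1.0, 2)]   # (K_j, gamma_j)

labor_income = 0.0
national_income = 0.0
for K_j, gamma_j in firms:
    L_j = l * K_j / a                                          # from l = a L_j / K_j
    W_j = a * A ** gamma_j * f_prime(l)                        # wage, (29)
    Z_j = theta * lam / (sigma * v) * K_j * A ** gamma_j       # R&D labor, (16)
    labor_income += W_j * L_j + v * Z_j
    national_income += A ** gamma_j * f(l) * K_j

share = labor_income / national_income
closed_form = (f_prime(l) * l + theta * lam / sigma) / f(l)
print(share, closed_form)
```

Changing the capital stocks or technology indices in `firms` should leave `share` unchanged, since every firm's labor income and output scale with the same term $K_jA^{\gamma_j}$.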

14.7

Cycles

Noting (11) and (29), unemployment is expressed as

\[ U = m - L - Z = m - lK/a - Z, \tag{33} \]

where $m$ is the number of workers, $L$ employment in production and $Z$ labor devoted to R&D. Because there is no trend for unemployment $U$ by assumption 2, there is no trend for $K/a$ either and

\[ \frac{\dot a}{a} = E\left[\frac{1}{K}\frac{dK}{dt}\right], \tag{34} \]

where E is the expectations operator. There is a theoretical possibility that total employment L + Z hits the total supply of labor, m, which would excessively complicate the dynamics of the model. To eliminate this, the number of workers, m, is assumed to be large. Noting (33) and (28), one obtains that the economy never attains full employment, if U > 0 and K(t) 1−ζ =