FINANCIAL DERIVATIVE AND ENERGY MARKET VALUATION: Theory and Implementation in Matlab
Michael Mastro U.S. Naval Research Lab Washington, DC
A JOHN WILEY & SONS, INC., PUBLICATION
Copyright 2013 by John Wiley & Sons, Inc. All rights reserved. Published by John Wiley & Sons, Inc., Hoboken, New Jersey. Published simultaneously in Canada. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permission. Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages. For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002. 
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com. MATLAB is a trademark of The MathWorks, Inc. and is used with permission. The MathWorks does not warrant the accuracy of the text or exercises in this book. This book’s use or discussion of MATLAB software or related products does not constitute endorsement or sponsorship by The MathWorks of a particular pedagogical approach or particular use of the MATLAB software. Library of Congress Cataloging-in-Publication Data: Mastro, Michael A., 1975– Financial derivative and energy market valuation : theory and implementation in Matlab / Michael Mastro. p. cm. Includes bibliographical references and index. ISBN 978-1-118-48771-6 (cloth) 1. Derivative securities. 2. Energy derivatives. 3. MATLAB. I. Title. HG6024.A3M3774 2012 332.64’57–dc23 2012031825 Printed in the United States of America 10 9 8 7 6 5 4 3 2 1
CONTENTS
Preface
1 Financial Models
2 Jump Models
3 Options
4 Binomial Trees
5 Trinomial Trees
6 Finite Difference Methods
7 Kalman Filter
8 Futures and Forwards
9 Nonlinear and Non-Gaussian Kalman Filter
10 Short-Term Deviation/Long-Term Equilibrium Model
11 Futures and Forwards Options
12 Fourier Transform
13 Fundamentals of Characteristic Functions
14 Application of Characteristic Functions
15 Levy Processes
16 Fourier-Based Option Analysis
17 Fundamentals of Stochastic Finance
18 Affine Jump-Diffusion Processes
Index
PREFACE
Energy markets and their associated financial derivatives are characterized by sudden jumps, mean reversion, and stochastic volatility. These aspects necessitate sophisticated models to properly describe even a subset of these traits. Moreover, the implementation of these models itself requires advanced numerical methods. This book establishes the fundamental mathematics and builds up all necessary statistical, quantitative, and financial theories. A number of theoretical topics are expanded, including the Fourier transform, moment generating functions, characteristic functions, and finite and infinite activity Levy processes such as the alpha stable, tempered stable, gamma, variance gamma, inverse Gaussian, and normal inverse Gaussian processes. Applied mathematics such as the fast Fourier transform and the fractional fast Fourier transform are developed and used to generate statistical distributions and to price options. On the basis of this knowledge, state-of-the-art quantitative financial models are developed without the need to refer to external sources. Seminal works are derived and implemented, including the Black–Scholes, Black, Ornstein–Uhlenbeck, Merton Gaussian jump diffusion, Kou double exponential jump diffusion, and Heston stochastic volatility models. Nevertheless, these models cannot capture the true behavior of the energy markets. The influential two-factor stochastic convenience yield model and the short-term long-term model are derived and implemented. It is shown that adding jumps to these models can be done in a rather ad hoc manner; however, a thorough discussion of the affine transform formalism is presented. This provides an elegant framework to add jumps to the two-factor models or to develop similar jump-diffusion models. To fit these models and display their predictive power, particular focus is placed on developing and utilizing the Kalman filter.
For linear models with Gaussian noise, the Kalman filter finds an optimal recursive solution with very little computational burden. For the nonlinear and non-Gaussian models developed in this book, we build up and exploit the extended, Gauss–Hermite, unscented, Monte Carlo, and particle Kalman filters. As suggested by the title, a major vein of this book is the implementation of these models. Availability of working code of modern financial models is limited, and the implementation is typically only discussed in an abstract sense. This book details the necessary steps for implementation and displays the working code. The Matlab environment was selected because it is broadly available and is simple to port to other popular environments including C++ and C#. In addition, Matlab provides refined graphing routines that allow our code to focus on the relevant quantitative and financial concepts.
Michael Mastro
1
Financial Models
1.1. INTRODUCTION

The movement of financial assets and products generally displays some type of expected return, even over a short period. This expected return trends at a predictable rate that may be positive, indicating growth; negative, indicating a decline; or zero. Additionally, there are random movements that are individually unpredictable; however, the general distribution of these fluctuations is predictable based on historical movements. The common approach to modeling randomness is to assume a single- or multi-component Gaussian process. The generalized format to describe a time-dependent stochastic process is

$$dS_t = \alpha(S, t)\,dt + \sigma(S, t)\,dW_t,$$

where the drift $\alpha$ and volatility $\sigma$ are functions of time $t$ and asset price $S$, and $W_t$ is a Wiener process. If the drift $\alpha(S, t) = \mu$ and volatility $\sigma$ are constants, then the process

$$dS_t = \mu\,dt + \sigma\,dW_t$$

is known as arithmetic Brownian motion. This process by itself states that the stock price $S$ will increase (or decrease) without bound at a rate that is not dependent on the current stock price. Clearly, this does not describe the typical behavior for an asset, but modified versions of arithmetic Brownian motion are useful in finance and are revisited later in the text.

1.2. GEOMETRIC BROWNIAN MOTION

A more appropriate description of a stock price process is that the movements in the stock are proportional to the value of the stock. A specific description is that the overall drift, $\alpha(S, t) = \mu S_t$, is the product of an expected return $\mu$ and the current asset price $S_t$. Adding a stochastic movement that is also proportional to the current price level gives

$$dS_t = \mu S\,dt + \sigma S\,dW_t,$$
which is the well-known geometric Brownian motion process. A crude discrete approximation of the stochastic differential equation for geometric Brownian motion, given by

$$\frac{\Delta S_t}{S_t} = \mu\,\Delta t + \sigma\,\Delta W_t,$$

is only valid over short time intervals. This form does highlight that the percentage change in the stock price $\Delta S/S$ over a short time interval is normally distributed with mean $\mu\,\Delta t$ and standard deviation $\sigma\sqrt{\Delta t}$, where $\mu$ is the drift and $\sigma$ is the volatility. The shorthand for a normal distribution is

$$\frac{\Delta S}{S} \sim N\!\left(\mu\,\Delta t,\ \sigma\sqrt{\Delta t}\right).$$

The variance of this stochastic return is proportional to the time interval, $\operatorname{var}(\Delta S/S) = \sigma^2\,\Delta t$ (Hull, 2006). One benefit of geometric Brownian motion is that negative asset prices are not possible because any price change is proportional to the current price. Bankruptcy could drive an asset price down to but not past the natural absorbing barrier at zero (Chance, 1994).

The discrete approximation of the geometric Brownian motion stochastic equation is composed of a trend (or expectation) term $E(dS/S) = \mu\,dt$ and an uncertainty (or deviation) term. The uncertainty term is given by the Wiener increment

$$dW_t = \varepsilon_t\sqrt{dt}, \quad \text{with } E(dW_t) = 0,$$

where $\varepsilon_t$ is a standard normal random variable. It turns out that the variance of $dW_t$ is equal to the time interval $dt$. The variance of the Wiener increment is found by evaluating

$$\operatorname{var}(dW_t) = E\big[(dW_t - E(dW_t))^2\big] = E(dW_t^2) - E(dW_t)^2 = E(dW_t^2).$$

The expected value of $(dW_t)^2$ is, by definition, $E[(dW_t)^2] = E[(\varepsilon_t\sqrt{dt})^2]$ (Chance, 2005). Pulling the $dt$ factor out of the expectation gives

$$E(dW_t^2) = E(\varepsilon_t^2\,dt) = E(\varepsilon_t^2)\,dt.$$

Evaluating the $E(\varepsilon_t^2)$ term requires the computational formula for the variance:

$$\operatorname{var}(\varepsilon_t) = E\big[(\varepsilon_t - E(\varepsilon_t))^2\big] = E\big[\varepsilon_t^2 - 2\varepsilon_t E(\varepsilon_t) + E(\varepsilon_t)^2\big]$$

$$\operatorname{var}(\varepsilon_t) = E(\varepsilon_t^2) - 2E(\varepsilon_t)E(\varepsilon_t) + E(\varepsilon_t)^2 = E(\varepsilon_t^2) - 2E(\varepsilon_t)^2 + E(\varepsilon_t)^2$$

$$\operatorname{var}(\varepsilon_t) = E(\varepsilon_t^2) - E(\varepsilon_t)^2$$

$$\rightarrow\ E(\varepsilon_t^2) = \operatorname{var}(\varepsilon_t) + E(\varepsilon_t)^2.$$
The variance of a standard normal variable is one, $\operatorname{var}(\varepsilon_t) = 1$. The expected value of a standard normal variable $E(\varepsilon_t)$ and the square of the expected value $E(\varepsilon_t)^2$ are zero. Therefore, $E(\varepsilon_t^2) = \operatorname{var}(\varepsilon_t) + E(\varepsilon_t)^2 = 1 + 0$; that is, the expected value of a squared standard normal variable is one. The value for the expectation of the Wiener increment squared is given by

$$E(dW_t^2) = E(\varepsilon_t^2)\,dt = dt.$$

This important result states that the square of a Wiener increment equals the time interval, $dW_t^2 = dt$. In other words, the Wiener increment is unpredictable but its square is predictable.

The percentage price change $dS_t/S_t$ is normally distributed because the stochastic differential equation, written as

$$\frac{dS_t}{S_t} = \mu\,dt + \sigma\,dW_t,$$

is a linear transformation of the normally distributed variable $dW_t$. The relative return $\frac{S_0 + dS_t}{S_0} = 1 + \frac{dS_t}{S_0} = \frac{S_t}{S_0}$ over a time period $T$ is the product of the intervening price changes as displayed by

$$\frac{S_T}{S_0} = \frac{S_1}{S_0}\,\frac{S_2}{S_1}\cdots\frac{S_{t-1}}{S_{t-2}}\,\frac{S_T}{S_{t-1}},$$

where each increment in relative return $S_i/S_{i-1}$ is capable of being further subdivided in time. A logarithm of the product converts the product series into a summation series as given by

$$\ln\!\left(\frac{S_T}{S_0}\right) = \ln\!\left(\frac{S_1}{S_0}\right) + \ln\!\left(\frac{S_2}{S_1}\right) + \cdots + \ln\!\left(\frac{S_{t-1}}{S_{t-2}}\right) + \ln\!\left(\frac{S_T}{S_{t-1}}\right).$$

The central limit theorem states that the summation of a large number of identically distributed and independent random variables, each with finite mean and variance, will be approximately normally distributed (Rice, 1995).
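The two results of this section, that the Wiener increment has zero mean with $E(dW_t^2) = dt$ and that the logarithm turns a product of relative returns into a sum, can be checked numerically. The following is a minimal sketch rather than code from the book; the parameter values and variable names are illustrative assumptions.

```matlab
% Sketch: numerical check of E(dW) = 0 and E(dW^2) = dt (illustrative values)
dt    = 1/252;              % one trading day expressed in years (assumed)
N     = 1e6;                % number of simulated increments
eps_t = randn(N, 1);        % standard normal draws
dW    = eps_t * sqrt(dt);   % Wiener increments dW = eps * sqrt(dt)

meanDW   = mean(dW);        % sample estimate of E(dW), close to 0
meanDWsq = mean(dW.^2);     % sample estimate of E(dW^2), close to dt
fprintf('E(dW)   ~ %.2e\n', meanDW);
fprintf('E(dW^2) ~ %.6f  (dt = %.6f)\n', meanDWsq, dt);

% Product of relative returns versus sum of log returns over one year
mu = 0.10; sigma = 0.30; nSteps = 252;                      % assumed values
relRet = 1 + mu*dt + sigma*sqrt(dt)*randn(nSteps, 1);       % S_i / S_(i-1)
logOfProduct  = log(prod(relRet));                          % ln(S_T / S_0)
sumOfLogs     = sum(log(relRet));                           % sum of ln(S_i / S_(i-1))
fprintf('ln(prod) = %.6f, sum(ln) = %.6f\n', logOfProduct, sumOfLogs);
```

The sample averages fluctuate from run to run because no random seed is fixed; the agreement between the logarithm of the product and the sum of logarithms holds to rounding error.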
1.2.1. Lognormal Stochastic Differential Equation

The insight of the previous paragraph provides a motivation to recast the geometric Brownian motion

$$dS = \mu S\,dt + \sigma S\,dW$$

as a lognormal diffusion stochastic differential equation. The alteration is accomplished by using the function $G = \ln S$ and its derivatives

$$\frac{\partial G}{\partial S} = \frac{1}{S}, \qquad \frac{\partial^2 G}{\partial S^2} = -\frac{1}{S^2}, \qquad \frac{\partial G}{\partial t} = 0.$$

Application of Ito's lemma gives

$$dG = \frac{\partial G}{\partial S}\,dS + \frac{\partial G}{\partial t}\,dt + \frac{1}{2}\frac{\partial^2 G}{\partial S^2}(dS)^2 + \underbrace{\frac{1}{2}\frac{\partial^2 G}{\partial t^2}(dt)^2}_{\text{infinitesimally small}\ \to\ 0}$$

$$dG = \frac{1}{S}\,dS - \frac{1}{2}\frac{1}{S^2}(dS)^2 + 0$$

$$dG = \frac{1}{S}(\mu S\,dt + \sigma S\,dW) - \frac{1}{2}\frac{1}{S^2}(\mu S\,dt + \sigma S\,dW)^2$$

$$dG = (\mu\,dt + \sigma\,dW) - \frac{1}{2}\frac{1}{S^2}\Big(\sigma^2 S^2 \underbrace{dW^2}_{dW^2 = dt} + \underbrace{2\mu\sigma S^2\,dt\,dW}_{\text{infinitesimally small}\ \to\ 0} + \underbrace{\mu^2 S^2\,dt^2}_{\text{infinitesimally small}\ \to\ 0}\Big)$$

$$dG = (\mu\,dt + \sigma\,dW) - \frac{1}{2}\sigma^2\,dt$$

$$dG = \underbrace{\Big(\mu - \frac{1}{2}\sigma^2\Big)}_{=\,\eta}\,dt + \sigma\,dW,$$

where we have introduced a lognormal return drift factor, $\eta$, which is the continuously compounded return. A solution to the log return stochastic differential equation is found by integrating

$$\int_0^t dG_u = \int_0^t \eta\,du + \int_0^t \sigma\,dW_u.$$

The deterministic drift term can be integrated in the same way as an ordinary differential equation. The $\sigma$ coefficient is taken as a time-invariant constant, which greatly simplifies the stochastic integral to

$$\int_0^t \sigma\,dW_u = \sigma(W_t - W_0) = \sigma(W_t - 0),$$
where the random term is zero at the time origin by definition. The integral of $dG_u$ is $G_t - G_0$; therefore,

$$G_t - G_0 = \ln S_t - \ln S_0 = \ln\!\left(\frac{S_t}{S_0}\right) = \eta t + \sigma W_t.$$

Hence the logarithm of the price change follows a drift $\eta = \mu - \frac{1}{2}\sigma^2$ and has a Gaussian distribution with mean $\eta t$ and variance $\sigma^2 t$. Compactly, this is expressed as

$$\ln S_t - \ln S_0 = \ln\!\left(\frac{S_t}{S_0}\right) \sim \phi\big(\eta t,\ \sigma\sqrt{t}\big),$$

$$\ln S_t \sim \phi\big(\ln S_0 + \eta t,\ \sigma\sqrt{t}\big),$$

where $\phi(m, \sigma_{SD})$ denotes a normal distribution with mean $m$ and standard deviation $\sigma_{SD}$. It follows that the continuously compounded return is found from

$$\eta = \frac{1}{t}\ln\!\left(\frac{S_t}{S_0}\right).$$

The continuously compounded return is normally distributed with a mean or expected value of $E(\eta) = \mu - \frac{\sigma^2}{2}$ and a standard deviation of $\frac{\sigma}{\sqrt{t}}$ (Hull, 2006). Formally, this is written as

$$\eta \sim \phi\!\left(\mu - \frac{\sigma^2}{2},\ \frac{\sigma}{\sqrt{t}}\right),$$

which implies that the return is more certain when examined over a longer time period. Recalling that $S = e^G$ allows an expression for the evolution of $S_t$ as

$$S_t = S_0\,e^{\eta t + \sigma W_t}.$$

The stock price $S_t$ has a lognormal distribution with a variance given by

$$\operatorname{var}(S_t) = S_0^2\,e^{2\mu t}\big(e^{\sigma^2 t} - 1\big).$$
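As a rough illustration of the solution $S_t = S_0 e^{\eta t + \sigma W_t}$, terminal prices can be sampled directly and the log return compared with the stated mean $\eta t$ and standard deviation $\sigma\sqrt{t}$. This is a sketch under assumed parameter values, not the book's implementation; all variable names are hypothetical.

```matlab
% Sketch: sample S_t = S0*exp(eta*t + sigma*W_t) and inspect the log return
S0 = 100; mu = 0.10; sigma = 0.30;    % assumed illustrative parameters
t  = 2;                               % horizon in years
nPaths = 2e5;                         % number of sampled terminal prices

eta = mu - 0.5*sigma^2;               % continuously compounded drift
Wt  = sqrt(t) * randn(nPaths, 1);     % W_t is normal with mean 0, variance t
St  = S0 * exp(eta*t + sigma*Wt);     % terminal prices, always positive

logRet     = log(St / S0);            % ln(S_t / S0)
sampleMean = mean(logRet);            % compare with eta*t
sampleSD   = std(logRet);             % compare with sigma*sqrt(t)
sdOfEta    = std(logRet / t);         % compare with sigma/sqrt(t)

fprintf('mean: %.4f vs eta*t         = %.4f\n', sampleMean, eta*t);
fprintf('SD  : %.4f vs sigma*sqrt(t) = %.4f\n', sampleSD, sigma*sqrt(t));
fprintf('SD of realized eta: %.4f vs sigma/sqrt(t) = %.4f\n', sdOfEta, sigma/sqrt(t));
```

Increasing `t` shrinks `sdOfEta`, which is the sense in which the return becomes more certain over a longer period.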
1.3. EXPECTED VALUE, VARIANCE, AND MOMENTS OF LOGNORMAL DISTRIBUTION

We will provide a brief proof of the lognormal distribution as discussed by Hull (2006). The logarithm of the asset price, $G = \ln S$, has a normal distribution, $\phi(m, \sigma_{SD})$. The previous derivation showed that the mean of the distribution is dependent on the logarithm of the starting stock price and a product of the continuously compounded rate and the time period of the analysis, $m = (\ln S_0 + \eta t)$; the standard deviation is a product of the annualized volatility and the square root of the time period of analysis, $\sigma_{SD} = \sigma\sqrt{t}$. The related probability densities for $G = \ln(S)$ and for $S$ are

$$h(G) = \frac{1}{\sqrt{2\pi}\,\sigma_{SD}}\,e^{-\frac{(G - m)^2}{2\sigma_{SD}^2}}, \qquad h(S) = \frac{1}{\sqrt{2\pi}\,\sigma_{SD}\,S}\,e^{-\frac{(\ln(S) - m)^2}{2\sigma_{SD}^2}}.$$
The nth raw moment for a probability distribution $h(S)$ is given by the integral

$$\mu'_n = E(S^n) = \int_0^\infty S^n\,h(S)\,dS,$$

where $n$ signifies the nth moment. The nth moment after inserting the exponential of $G$, $e^G = S$, is

$$\mu'_n = E\big((e^G)^n\big) = \int_{-\infty}^{\infty} e^{nG}\,\frac{1}{\sqrt{2\pi}\,\sigma_{SD}}\,e^{-\frac{(G - m)^2}{2\sigma_{SD}^2}}\,dG$$

$$\mu'_n = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}\,\sigma_{SD}}\,e^{\frac{2\sigma_{SD}^2 nG - (G^2 - 2Gm + m^2)}{2\sigma_{SD}^2}}\,dG$$

$$\mu'_n = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}\,\sigma_{SD}}\,e^{-\frac{(G - m - n\sigma_{SD}^2)^2}{2\sigma_{SD}^2}}\,e^{\frac{2mn\sigma_{SD}^2 + n^2\sigma_{SD}^4}{2\sigma_{SD}^2}}\,dG$$

$$\mu'_n = e^{mn + n^2\sigma_{SD}^2/2}\ \underbrace{\int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}\,\sigma_{SD}}\,e^{-\frac{\big(G - (m + n\sigma_{SD}^2)\big)^2}{2\sigma_{SD}^2}}\,dG}_{\text{integral of a normal density with mean } m + n\sigma_{SD}^2\ =\ 1}$$

$$\mu'_n = e^{mn + n^2\sigma_{SD}^2/2}.$$
Therefore, the raw moments (taken about 0) are (Weisstein, 2011)

$$\mu'_1 = e^{m + \sigma_{SD}^2/2} = \text{expected value}$$

$$\mu'_2 = e^{2(m + \sigma_{SD}^2)}$$

$$\mu'_3 = e^{3m + 9\sigma_{SD}^2/2}$$

$$\mu'_4 = e^{4(m + 2\sigma_{SD}^2)}.$$

A little more mathematics with the first raw moment, $\mu'_1 = e^{m + \sigma_{SD}^2/2}$, generates the expected value of the stock price. Substituting $m = (\ln S_0 + \eta t)$ and $\sigma_{SD} = \sigma\sqrt{t}$ yields

$$\mu'_1 = E(S_t) = e^{(\ln S_0 + \eta t) + \frac{1}{2}\sigma^2 t} = S_0\,e^{\eta t + \frac{1}{2}\sigma^2 t}$$

$$E(S_t) = S_0\,e^{(\mu - \frac{1}{2}\sigma^2)t + \frac{1}{2}\sigma^2 t}$$

$$E(S_t) = S_0\,e^{\mu t}.$$
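The closed form $\mu'_n = e^{mn + n^2\sigma_{SD}^2/2}$ can be compared against a direct numerical integration of $e^{nG}h(G)$. The sketch below uses the trapezoidal rule on a wide grid in $G$; the parameter values, grid limits, and variable names are assumptions chosen for illustration only.

```matlab
% Sketch: raw moments of the lognormal by numerical integration in G = ln(S)
S0 = 50; mu = 0.08; sigma = 0.25; t = 1;     % assumed illustrative parameters
eta = mu - 0.5*sigma^2;
m   = log(S0) + eta*t;                       % mean of G
sSD = sigma*sqrt(t);                         % standard deviation of G

% Grid wide enough that the truncated tails are negligible
G  = linspace(m - 12*sSD, m + 12*sSD, 20001);
hG = exp(-(G - m).^2 / (2*sSD^2)) / (sqrt(2*pi)*sSD);   % normal density h(G)

rawNumeric = zeros(1, 4);
rawClosed  = zeros(1, 4);
for n = 1:4
    rawNumeric(n) = trapz(G, exp(n*G) .* hG);           % integral of e^(nG) h(G)
    rawClosed(n)  = exp(m*n + 0.5*n^2*sSD^2);           % e^(mn + n^2 sSD^2 / 2)
end

expectedS = S0*exp(mu*t);                    % E(S_t) = S0 e^(mu t)
disp([rawNumeric; rawClosed]);
fprintf('first raw moment = %.6f, S0*exp(mu*t) = %.6f\n', rawClosed(1), expectedS);
```

The first row of the displayed matrix holds the numerical integrals and the second row the closed-form values; the first raw moment reproduces $S_0 e^{\mu t}$.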
An alternate derivation of the expected value of the lognormal stock price that does not invoke the first raw moment, and that may be more intuitive, is presented below.

1.3.1. Lognormal Distribution by Expectation

As discussed by Chance (2005), the solution to the stochastic differential equation $\frac{dS_t}{S_t} = \mu\,dt + \sigma\,dW_t$ was developed as $S_t = S_0\,e^{\eta t + \sigma W_t}$. Taking the expectation of the expression for the evolution of $S_t$ gives

$$E[S_t] = E\big[S_0\,e^{\eta t + \sigma W_t}\big],$$

which can be simplified by moving the constant factors out of the expectation to give

$$E[S_t] = S_0\,e^{\eta t}\,E\big[e^{\sigma W_t}\big].$$

The Wiener process $W_t$ follows a normal distribution with a mean of zero and a standard deviation of $\sqrt{t}$, as written by

$$f(W_t) = \frac{1}{\sqrt{2\pi t}}\,e^{-W_t^2/2t}.$$

In general, the expected value of a random variable is the integral of the product of the variable and its probability density function, $E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$; for $g(X)$, an arbitrary function of $X$, the expected value is the integral of the inner product as given by $E[g(X)] = \int_{-\infty}^{\infty} g(x) f(x)\,dx$. Therefore, the expected value of the exponential of the Wiener process is

$$E\big[e^{\sigma W_t}\big] = \int_{-\infty}^{\infty} e^{\sigma W_t} f(W_t)\,dW_t = \int_{-\infty}^{\infty} e^{\sigma W_t}\,\frac{1}{\sqrt{2\pi t}}\,e^{-W_t^2/2t}\,dW_t.$$

This expression can be placed into a more useful form by completing the square in the exponent:

$$E\big[e^{\sigma W_t}\big] = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi t}}\,e^{\frac{2t\sigma W_t - W_t^2}{2t}}\,dW_t = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi t}}\,e^{\frac{2t\sigma W_t - W_t^2 - \sigma^2 t^2}{2t} + \frac{\sigma^2 t}{2}}\,dW_t$$

$$E\big[e^{\sigma W_t}\big] = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi t}}\,e^{-\frac{(W_t - \sigma t)^2}{2t} + \frac{\sigma^2 t}{2}}\,dW_t = e^{\frac{\sigma^2 t}{2}}\ \underbrace{\int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi t}}\,e^{-\frac{1}{2}\left(\frac{W_t - \sigma t}{\sqrt{t}}\right)^2}\,dW_t}_{\text{integral of the density } \phi(\sigma t,\ \sqrt{t})\ =\ 1}$$

$$E\big[e^{\sigma W_t}\big] = e^{\frac{\sigma^2 t}{2}}.$$

The integral was eliminated by manipulating the expression into the form of a probability density function. The integral of a probability density function is intrinsically equal to one. Relying on the relation that the continuously compounded return $\eta$ is normally distributed with a mean or expected value of $E(\eta) = \mu - \frac{\sigma^2}{2}$ allows the expected stock price to be written as

$$E(S_t) = S_0\,e^{(\eta + \frac{1}{2}\sigma^2)t}$$

$$E(S_t) = S_0\,e^{\mu t}.$$
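The identity $E[e^{\sigma W_t}] = e^{\sigma^2 t/2}$ obtained by completing the square can also be approximated by a simple Monte Carlo average. The following is a hedged sketch with assumed parameter values and hypothetical variable names; the estimate varies from run to run because no seed is fixed.

```matlab
% Sketch: Monte Carlo estimate of E[exp(sigma*W_t)] versus exp(sigma^2*t/2)
S0 = 100; mu = 0.10; sigma = 0.30; t = 1.5;   % assumed illustrative parameters
N  = 1e6;                                     % number of draws

Wt = sqrt(t) * randn(N, 1);                   % W_t is normal, mean 0, SD sqrt(t)
mcValue    = mean(exp(sigma*Wt));             % sample estimate of E[e^(sigma W_t)]
closedForm = exp(0.5*sigma^2*t);              % result of completing the square

% Expected stock price: S0 * e^(eta t) * E[e^(sigma W_t)] = S0 * e^(mu t)
eta        = mu - 0.5*sigma^2;
mcPrice    = S0 * exp(eta*t) * mcValue;
exactPrice = S0 * exp(mu*t);

fprintf('E[exp(sigma*W)] ~ %.5f, closed form %.5f\n', mcValue, closedForm);
fprintf('E[S_t]          ~ %.4f, S0*exp(mu*t) %.4f\n', mcPrice, exactPrice);
```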
1.3.2. Moments and Variance of Lognormal Distribution

Next, two related approaches are given to find the variance of $S$. The central moment $\mu_n$ taken about the expected value $\mu'_1$ is

$$\mu_n = \big\langle (S - \langle S\rangle)^n \big\rangle = \int (S - \mu'_1)^n\,h(S)\,dS.$$

A more specific form of this equation is commonly used to express the variance of a process. The variance of $S$ is given as

$$\operatorname{var}(S) = \mu_2 = \big\langle (S - \langle S\rangle)^2 \big\rangle$$

$$\operatorname{var}(S) = E(S^2) - [E(S)]^2.$$

The term $E(S^2)$ is the raw moment $\mu'_2 = e^{2(m + \sigma_{SD}^2)}$ and the square of the expected asset price is

$$[E(S)]^2 = (\mu'_1)^2 = \Big(e^{m + \sigma_{SD}^2/2}\Big)^2 = e^{2m + \sigma_{SD}^2}.$$

Substituting these terms along with $m = (\ln S_0 + \eta t)$ and $\sigma_{SD} = \sigma\sqrt{t}$ yields the variance of the lognormal stock price

$$\operatorname{var}(S) = -(\mu'_1)^2 + \mu'_2 = e^{2(m + \sigma_{SD}^2)} - e^{(2m + \sigma_{SD}^2)}$$

$$\operatorname{var}(S) = e^{2m + \sigma_{SD}^2}\big(e^{\sigma_{SD}^2} - 1\big)$$

$$\operatorname{var}(S) = e^{2(\ln S_0 + \eta t) + \sigma^2 t}\big(e^{\sigma^2 t} - 1\big)$$

$$\operatorname{var}(S_t) = S_0^2\,e^{2\mu t}\big(e^{\sigma^2 t} - 1\big).$$
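The variance formula can be cross-checked in two ways: algebraically, from the raw moments $\mu'_2 - (\mu'_1)^2$, and empirically, from the sample variance of simulated lognormal prices. The sketch below does both under assumed parameter values; the names are hypothetical and the sample variance changes from run to run.

```matlab
% Sketch: variance of the lognormal stock price from moments and from samples
S0 = 100; mu = 0.10; sigma = 0.25; t = 1;     % assumed illustrative parameters
eta = mu - 0.5*sigma^2;
m   = log(S0) + eta*t;                        % mean of ln(S_t)
sSD = sigma*sqrt(t);                          % standard deviation of ln(S_t)

raw1 = exp(m + 0.5*sSD^2);                    % first raw moment, E(S)
raw2 = exp(2*(m + sSD^2));                    % second raw moment, E(S^2)
varFromMoments = raw2 - raw1^2;               % var(S) = E(S^2) - [E(S)]^2
varClosedForm  = S0^2 * exp(2*mu*t) * (exp(sigma^2*t) - 1);

nDraws = 5e5;
S = exp(m + sSD*randn(nDraws, 1));            % lognormal price samples
varSample = var(S);                           % empirical variance

fprintf('var from raw moments : %.4f\n', varFromMoments);
fprintf('var closed form      : %.4f\n', varClosedForm);
fprintf('var from simulation  : %.4f\n', varSample);
```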
Alternatively, using a binomial transform, not derived here, the central moments can be expressed in terms of the raw moments as given by

$$\mu_1 = 0$$

$$\mu_2 = -(\mu'_1)^2 + \mu'_2 = e^{2m + \sigma_{SD}^2}\big(e^{\sigma_{SD}^2} - 1\big)$$

$$\mu_3 = 2(\mu'_1)^3 - 3\mu'_1\mu'_2 + \mu'_3 = e^{3m + 3\sigma_{SD}^2/2}\big(e^{\sigma_{SD}^2} - 1\big)^2\big(e^{\sigma_{SD}^2} + 2\big)$$

$$\mu_4 = -3(\mu'_1)^4 + 6(\mu'_1)^2\mu'_2 - 4\mu'_1\mu'_3 + \mu'_4 = e^{4m + 2\sigma_{SD}^2}\big(e^{\sigma_{SD}^2} - 1\big)^2\big(e^{4\sigma_{SD}^2} + 2e^{3\sigma_{SD}^2} + 3e^{2\sigma_{SD}^2} - 3\big),$$

where the second central moment $\mu_2$ yields the variance relative to the mean (Papoulis, 1984). Similarly, the third and fourth central moments provide a construct for the skewness and kurtosis, respectively.

1.3.3. Lognormal Distribution by Candidate Solution

Neftci (2000) provides an alternate approach to solve the stochastic differential equation

$$\frac{dS_u}{S_u} = \mu\,dt + \sigma\,dW_u.$$

Again, the Riemann and stochastic integrals on the right side of the differential equation are solved to give

$$\int_0^t \frac{dS_u}{S_u} = \mu t + \sigma W_t.$$
Often, in the financial literature a solution is not available, so a candidate is proposed and back-checked in the original differential equation. For example, the candidate given by

$$S_t = S_0\,e^{(\mu - \frac{1}{2}\sigma^2)t + \sigma W_t}$$

is a strong solution in that $W_t$ is given exogenously and the error process is considered as another given in the equation. At this point, Ito's lemma is employed to validate that the candidate solution satisfies the stochastic differential equation and the integral equation. The value $S$ is a function $f$ of $t$ and $W$ with partial derivatives given by

$$\frac{\partial f}{\partial t} = \Big(\mu - \frac{1}{2}\sigma^2\Big)S_t, \qquad \frac{\partial f}{\partial W} = \sigma S_t, \qquad \frac{\partial^2 f}{\partial W^2} = \sigma^2 S_t.$$

Application of Ito's lemma

$$dS_t = \frac{\partial f}{\partial t}\,dt + \frac{\partial f}{\partial W}\,dW + \frac{1}{2}\frac{\partial^2 f}{\partial W^2}\,dW^2,$$

with $dW^2 = dt$ gives

$$dS_t = S_t\Big[\Big(\mu - \frac{1}{2}\sigma^2\Big)dt + \sigma\,dW_t + \frac{1}{2}\sigma^2\,dt\Big]$$

$$dS_t = S_t[\mu\,dt + \sigma\,dW],$$

where the original stochastic differential equation is recovered. To provide some clarity to this section, the important equations for geometric Brownian motion are given in Table 1.1.

The Black–Scholes option model is based on geometric Brownian motion, $dS_t = \mu S_t\,dt + \sigma S_t\,dz$, and assumes a constant volatility, $\sigma$, or an effective volatility over the life of the contract. In the limiting case, a constant volatility is the square root of a constant variance, $\sigma = \sqrt{\operatorname{var}}$. For nonconstant variance, the effective volatility is found as the square root of the time-weighted mean of the variance (squared volatility) over time:

$$\sigma = \sqrt{\frac{T_1\operatorname{var}_1 + T_2\operatorname{var}_2 + T_3\operatorname{var}_3 + \cdots}{T_{total}}} = \sqrt{\frac{T_1\sigma_1^2 + T_2\sigma_2^2 + T_3\sigma_3^2 + \cdots}{T_{total}}},$$

where $T_i$ is the time length of the ith period.
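A short worked example of the effective volatility may help. The period lengths and per-period volatilities below are hypothetical numbers chosen only to illustrate the arithmetic.

```matlab
% Sketch: effective volatility from piecewise-constant volatility
T   = [0.25 0.50 0.25];      % period lengths in years (hypothetical)
sig = [0.20 0.30 0.40];      % volatility in each period (hypothetical)

Ttotal   = sum(T);                               % total contract life
sigmaEff = sqrt(sum(T .* sig.^2) / Ttotal);      % sqrt of time-weighted variance
sigmaAvg = sum(T .* sig) / Ttotal;               % time-weighted average volatility

fprintf('effective volatility        = %.4f\n', sigmaEff);
fprintf('time-weighted mean of sigma = %.4f\n', sigmaAvg);
```

With these inputs the time-weighted variance is 0.095, so the effective volatility is $\sqrt{0.095} \approx 0.3082$, slightly above the time-weighted average volatility of 0.30 because the averaging is done on variances rather than on volatilities.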
TABLE 1.1 Summary of Parameters for Geometric Brownian Motion

| Expression | Description |
|---|---|
| $\frac{dS_t}{S_t} = \mu\,dt + \sigma\,dW_t$ | Stochastic stock process for geometric Brownian motion |
| $\mu$ | Drift |
| $SD\!\left(\frac{dS_t}{S_t}\right) = \sigma\sqrt{t}$ | Standard deviation of percentage change |
| $d(\ln S_t) = \left(\mu - \frac{\sigma^2}{2}\right)dt + \sigma\,dW_t$ | Transformation to log price process via Ito's lemma |
| $E(S_t) = S_0\,e^{\mu t}$ | Expected value of $S_t$ |
| $\eta = \mu - \sigma^2/2$, $\ \eta = \frac{1}{T}\ln\!\left(\frac{S_T}{S_0}\right)$ | Continuously compounded return over a period of time length $T$ |
| $u_i = \ln\!\left(\frac{S_i + D}{S_{i-1}}\right)$ | Log return over time period $T_i - T_{i-1}$, e.g., daily return with potentially a dividend $D$ |
| $\sigma = \frac{\sigma_{SD}}{\sqrt{t}}$ | Annualized volatility, $\sigma$, is the standard deviation of the asset's logarithmic returns in a year, e.g., $\sigma_{SD}$ is the standard deviation of daily logarithmic returns → 252 trading days/yr → $t = 1/252$ |
| $\sigma_{SD} = \sigma\sqrt{t}$ | Standard deviation over a time period $t$ calculated from annualized volatility |
| $\operatorname{var} = (\sigma_{SD})^2 = \sigma^2 t$ | Variance over a time period $t$ calculated from annualized volatility |
| $\sigma_{SD} = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(u_i - \bar{u})^2} = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}u_i^2 - \frac{1}{n(n-1)}\Big(\sum_{i=1}^{n}u_i\Big)^2}$ | General formula to calculate standard deviation from a data series, e.g., daily log return data. Matlab provides a built-in function std for this calculation |
1.4. FITTING GEOMETRIC BROWNIAN MOTION

The code to simulate and analyze geometric Brownian motion is provided below as the function GBMfit(S). The function accepts only one parameter S, which is the daily price of an asset assumed to follow geometric Brownian motion. Figure 1.1 is the graphical output of the function where the jagged line is the adjusted close price for Exxon Mobil stock.

FIGURE 1.1 Price process of Exxon Mobil adjusted close stock price fitted to a geometric Brownian motion with volatility σ = 0.27 and return μ = 0.096. [Plot of Price versus Time, 2003 to 2011; legend: Expected, +1 Standard deviation, Price path, –1 Standard deviation.]

The continuously compounded return $\eta$ is found by calculating the logarithm of the daily price change and then normalizing for the daily time period. The daily standard deviation $\sigma_{SD} = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(u_i - \bar{u})^2}$ is calculated by the Matlab std function. The approach in GBMfit(S) to calculate standard deviation is

$$\sigma_{SD} = \operatorname{std}[\ln(S_t) - \ln(S_{t-1})].$$

The annualized volatility is found as the daily standard deviation divided by the square root of the time period measured in years, for example, $t = 1/252$ for 1 day,

$$\sigma = \sigma_{SD}/\sqrt{t}.$$
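The fitting steps can be sketched as follows. This is not the author's GBMfit listing; it is a minimal illustration that generates a synthetic daily price series with assumed parameters (`muTrue`, `sigmaTrue`, and the other names are hypothetical) and then recovers the annualized volatility and the expected return using the relations of Table 1.1, in particular $\mu = \eta + \sigma^2/2$.

```matlab
% Sketch: estimate sigma and mu from daily prices assumed to follow GBM
dt = 1/252;  nDays = 252*8;                    % about 8 years of trading days
muTrue = 0.10;  sigmaTrue = 0.27;  S0 = 40;    % assumed parameters for the test series

% Synthetic daily price series (stand-in for real adjusted close data)
z = randn(nDays, 1);
S = S0 * cumprod([1; exp((muTrue - 0.5*sigmaTrue^2)*dt + sigmaTrue*sqrt(dt)*z)]);

u        = diff(log(S));           % daily log returns ln(S_t) - ln(S_(t-1))
sigmaSD  = std(u);                 % daily standard deviation
sigmaHat = sigmaSD / sqrt(dt);     % annualized volatility
etaHat   = mean(u) / dt;           % annualized continuously compounded return
muHat    = etaHat + 0.5*sigmaHat^2;% expected return, mu = eta + sigma^2/2

fprintf('sigma estimate = %.4f (true %.2f)\n', sigmaHat, sigmaTrue);
fprintf('mu estimate    = %.4f (true %.2f)\n', muHat, muTrue);
```

In this sketch the volatility estimate is typically close to the true value, while the drift estimate is much noisier: its standard error is roughly $\sigma/\sqrt{T}$, which for 8 years of data is of the same order as $\mu$ itself.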
Now the expected return per year can be determined from the mean of the continuously compounded return as given by, after normalization for the daily time period,

$$\operatorname{mean}\!\left[\ln\!\left(\frac{S_t}{S_{t-1}}\right)\right] = \operatorname{mean}[\ln(S_t) - \ln(S_{t-1})] = \eta = \mu - \sigma^2/2.$$

The expected return $\mu$ demanded by investors depends on the continuously compounded return $\eta$ and the volatility risk of the stock, $\mu = \eta + \sigma^2/2$. For the Exxon Mobil data from 2003 to mid-2010, a volatility $\sigma = 0.2718$ and an expected return $\mu = 0.096$ are estimated. The solid central line of the expected value of the stock process $E(S_t) = S_0\,e^{\mu t}$ is also called the mean future stock price.

1.5. MEAN PRICE SIMULATION

When the function GBMfit is called without an argument, the function will self-simulate a price process on the basis of internally given parameters and then analyze this self-generated price process. The price path is formed by iteratively stepping through

$$S_{t+\Delta t} = S_t\,e^{(\mu - \frac{1}{2}\sigma^2)\Delta t + \sigma\sqrt{\Delta t}\,N(0,1)},$$

where $N(0, 1)$ is a random draw from a standard normal distribution. The function GBMexpected is used to generate several thousand simulated price paths. To speed up the execution, some Matlab vectorization and built-in functions are used. Specifically, the iterative step is accomplished via the built-in cumprod function, which calculates a cumulative product in time from an array of exponential drift and random movements. An alternate approach would be to use the built-in cumsum function for the logarithmic sum of the drift and random movements. Analyzing the price simulation formula shows that the median stock price after one time step would be

$$\text{median } S_{t+1} = S_t\,e^{(\mu - \frac{1}{2}\sigma^2)\Delta t},$$

that is, half the random movements in the asset price will fall above and half below the median value point. Contrast this to our earlier derivation of the expected asset price given as $E(S_t) = S_0\,e^{\mu t}$. What is different is that the geometric Brownian motion is not symmetric, and one feature of the lognormal distribution is that the one-step mean future stock price is higher than the one-step median future stock price. For example, a symmetric random movement of ±0.1 is not symmetric in the exponential, $e^{\pm 0.1} = 0.905/1.105$. Figure 1.2 clearly shows that the mean or expected price process will be higher as a consequence of the long tail in the stock price distribution toward higher asset prices. This repeated simulation is a Monte Carlo analysis in its most basic form.
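The vectorized simulation described in this section can be sketched with cumprod, with cumsum as the equivalent logarithmic form. This is an illustrative sketch, not the book's GBMexpected function; the parameter values and variable names are assumptions.

```matlab
% Sketch: Monte Carlo GBM paths via cumprod, comparing mean and median
S0 = 50; mu = 0.10; sigma = 0.30;     % assumed illustrative parameters
dt = 1/252; nSteps = 252; nPaths = 5000;
T  = nSteps*dt;                       % one-year horizon

drift = (mu - 0.5*sigma^2)*dt;                        % deterministic log drift per step
shock = sigma*sqrt(dt)*randn(nSteps, nPaths);         % random log movement per step

paths    = S0 * cumprod(exp(drift + shock), 1);       % each column is one price path
pathsAlt = S0 * exp(cumsum(drift + shock, 1));        % same paths via a log sum

ST       = paths(end, :);                             % final prices
meanST   = mean(ST);                                  % compare with S0*exp(mu*T)
medianST = median(ST);                                % compare with S0*exp((mu - sigma^2/2)*T)
expectedMean   = S0*exp(mu*T);
expectedMedian = S0*exp((mu - 0.5*sigma^2)*T);

fprintf('mean   : %.3f (theory %.3f)\n', meanST, expectedMean);
fprintf('median : %.3f (theory %.3f)\n', medianST, expectedMedian);
% plot((1:nSteps)*dt, paths(:, 1:20));   % uncomment to view a few paths
```

The simulated mean sits above the simulated median, reflecting the long upper tail of the lognormal distribution.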
FIGURE 1.2 (a) Simulated geometric Brownian motion daily price process. The mean of all the simulated price paths closely matches the calculated expected value. (b) Examining the lognormal distribution of prices shows that the high price tail shifts the mean or expected value above the peak in the distribution. [Panel (a): Asset price versus Time (Years); legend: $E[S_t] = S_0 e^{\mu t}$, $E[S_t]$ + 1 Stand. dev., $E[S_t]$ – 1 Stand. dev., Mean sim. price, $S_t = S_0 e^{(\mu - 0.5\sigma^2)t + \sigma N(0,1)t^{0.5}}$. Panel (b): Frequency versus Final price, with $E[S_T]$ marked.]

1.6. MEAN REVERSION MODELS

The first description of an ordinary mean reversion process was given by Uhlenbeck and Ornstein (1930). The Ornstein–Uhlenbeck process is the continuous-time analog of the discrete-time AR(1) process. The behavior and economic principle of commodities, interest rates, and foreign exchange rates are well described by reversion to a mean. The microeconomic viewpoint is that the long-term marginal production cost of a commodity, such as oil, determines the long-run cost (Dias, 2004). Bessembinder et al. (1995) show that significant mean reversion is observable for prices of agricultural and oil commodities. An alternate viewpoint, which reaches the same conclusion, is that a cartel will target a consistent price level. This target point may vary but the underlying profit level targets and political motivation tend not to change in the short term (Laughton and Jacoby, 1995). Pindyck and Rubinfeld (1991) examined over one hundred years of oil price data and found a slow mean reversion, but a Dickey–Fuller unit root test rejected a simple random walk process. Baker et al. (1998) found that a mean reversion model was more consistent with the interrelationship between oil price spot and futures data. Specifically, the spot price data is more volatile than the futures price data, as is predicted by a mean reversion model. For reference, a random walk model predicts equal volatility in futures and spot data. Additionally, Baker et al. (1998) showed that low spot prices tend to associate with futures prices increasing toward the long-run equilibrium, that is, in contango; and a high spot price tends to associate with futures prices decreasing toward the long-run equilibrium, that is, backwardation.

Unlike the geometric Brownian motion process, an arithmetic Brownian motion can have negative stochastic movements, which can result in a negative asset price. Cox et al. (1985) developed a square root model that effectively prevents random movements below zero as given by

$$dr_t = \kappa(\theta - r_t)\,dt + \sigma\sqrt{r_t}\,dz.$$

A useful implementation for modeling commodity prices occurs by examining the logarithm of price, $x = \ln(S)$. In this form, a negative spot price is prevented because even a negative value of the log price $x$ corresponds to a positive spot price $S = e^x$. Several mean reverting forms have been proposed in the literature to model commodity prices. One example is a geometric mean reverting price process, which is also referred to as the Dixit and Pindyck (1994) model, as given by

$$dS = \lambda S(\mu - S)\,dt + \sigma S\,dz,$$
where $\lambda$ is the reversion rate and $\mu$ is the long-run mean price. If the price $S$ is higher than the mean price $\mu$, then the negative $(\mu - S)$ factor pulls the price level down at a rate determined by the reversion rate $\lambda$. A large $(\mu - S)$ difference implies a faster rate of reversion. Reversion due to a small $(\mu - S)$ difference may be difficult to differentiate from the stochastic variation generated by the $\sigma S\,dz$ term. This model displays lognormal diffusion similar to a non-mean reverting geometric Brownian motion model; however, the variance increases with time only until a stabilization level is reached.

1.7. SOLVING THE ORNSTEIN–UHLENBECK PROCESS

Generally, the arithmetic Ornstein–Uhlenbeck process as given by

$$dx_t = \lambda(\mu - x_t)\,dt + \sigma\,dW_t$$

is pulled toward an equilibrium level $\mu$ at a rate $\lambda$, and $\sigma$ is the volatility or average magnitude, per square root time, of the random fluctuations that are modeled as Brownian motion. Integration of the deterministic term gives the expected value as

$$dx = \lambda(\mu - x)\,dt$$

$$u = (\mu - x), \qquad du = -dx$$

$$\frac{-du}{u} = \lambda\,dt$$

$$\ln|u| = -\lambda t + C$$

$$u = \pm e^{C}e^{-\lambda t} = A\,e^{-\lambda t}$$

$$u_0 = A\,e^{-\lambda \cdot 0}\ \rightarrow\ A = u_0 = (\mu - x_0)$$

$$\mu - x = (\mu - x_0)\,e^{-\lambda t}$$

$$E[x(t)] = \mu + (x_0 - \mu)\,e^{-\lambda t}$$

$$E[x(t)] = x_0\,e^{-\lambda t} + \mu(1 - e^{-\lambda t}).$$

Thus, the expected value is approaching the long-term equilibrium price at a rate proportional to the present displacement from the equilibrium price. Determining the stochastic integral of the arithmetic Ornstein–Uhlenbeck process requires a variation of parameters procedure to define a new function as

$$f(x_t, t) = x_t\,e^{\lambda t},$$

with the derivative found via Ito's lemma

$$df(x_t, t) = \lambda x_t\,e^{\lambda t}\,dt + e^{\lambda t}\,dx_t$$

$$df(x_t, t) = \lambda x_t\,e^{\lambda t}\,dt + e^{\lambda t}\lambda(\mu - x_t)\,dt + e^{\lambda t}\sigma\,dW$$

$$df(x_t, t) = e^{\lambda t}\lambda\mu\,dt + e^{\lambda t}\sigma\,dW.$$

Integration gives

$$f(x_t, t) = x_t\,e^{\lambda t} = x_0 + \int_0^t e^{\lambda s}\lambda\mu\,ds + \int_0^t e^{\lambda s}\sigma\,dW_s$$

$$x_t = x_0\,e^{-\lambda t} + \mu(1 - e^{-\lambda t}) + \int_0^t e^{\lambda(s - t)}\sigma\,dW_s,$$

where the mean is the first two terms as given by

$$E[x_t] = x_0\,e^{-\lambda t} + \mu(1 - e^{-\lambda t}).$$

The variance is found from the integral of the stochastic process, using the Ito isometry to replace the expectation of the squared stochastic integral with an ordinary integral of the squared integrand:

$$\operatorname{var}(x_t) = E\big[(x_t - E[x_t])^2\big] = E\!\left[\left(\int_0^t e^{\lambda(s - t)}\sigma\,dW_s\right)^{\!2}\right]$$

$$\operatorname{var}(x_t) = \sigma^2 e^{-2\lambda t}\int_0^t e^{2\lambda s}\,ds = \frac{\sigma^2}{2\lambda}\,e^{-2\lambda t}\big(e^{2\lambda t} - e^{0}\big)$$

$$\operatorname{var}(x_t) = \frac{\sigma^2}{2\lambda}\big(1 - e^{-2\lambda t}\big).$$

Therefore the long-term (stationary) variance and standard deviation are

$$\operatorname{var}(x_t) = \frac{\sigma^2}{2\lambda}, \qquad SD(x_t) = \sqrt{\frac{\sigma^2}{2\lambda}}.$$
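The closed-form mean and variance can be compared with a direct discretization of the stochastic differential equation. The sketch below uses a simple Euler–Maruyama scheme, which is an approximation introduced here for illustration and not a method taken from the text; the parameters and variable names are assumptions.

```matlab
% Sketch: Euler-Maruyama simulation of dx = lambda*(mu - x)*dt + sigma*dW
lambda = 2; mu = 2; sigma = 0.5; x0 = 5;   % assumed illustrative parameters
T = 1; nSteps = 500; nPaths = 20000;
dt = T/nSteps;

x = x0 * ones(1, nPaths);                  % all paths start at x0
for k = 1:nSteps
    dW = sqrt(dt) * randn(1, nPaths);      % Wiener increments
    x  = x + lambda*(mu - x)*dt + sigma*dW;
end

meanSim = mean(x);                         % simulated E[x_T]
varSim  = var(x);                          % simulated var(x_T)

meanClosed = x0*exp(-lambda*T) + mu*(1 - exp(-lambda*T));
varClosed  = sigma^2/(2*lambda) * (1 - exp(-2*lambda*T));
varLong    = sigma^2/(2*lambda);           % stationary variance

fprintf('mean: sim %.4f, closed form %.4f\n', meanSim, meanClosed);
fprintf('var : sim %.5f, closed form %.5f (long-term %.5f)\n', varSim, varClosed, varLong);
```

The small residual differences combine Monte Carlo noise with the discretization bias of the Euler scheme, which shrinks as `dt` is reduced.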
1.8. SIMULATING THE ORNSTEIN–UHLENBECK PROCESS

On the basis of the previous derivation, a simulation is given as the sum of the mean and the stochastic fluctuations for an asset price as
$$S_t = S_0 e^{-\lambda t} + \mu(1 - e^{-\lambda t}) + \sqrt{\frac{\sigma^2}{2\lambda}\left(1 - e^{-2\lambda t}\right)}\;N(0, 1),$$
where the time interval t can be arbitrarily large or small as this is an exact solution to the Ornstein–Uhlenbeck process. In this form, random movements are generated by multiplying the magnitude of the standard deviation with a random sampling from the standard normal distribution N(0, 1).
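One way to turn the exact solution into a path is to apply it recursively over each time step, using the previous simulated price as the starting value. The sketch below is an assumption-laden illustration of that recursion, not the book's MRpath function; the parameter values and the names `b`, `sd`, and `z` are hypothetical.

```matlab
% Sketch: exact step-by-step simulation of an Ornstein-Uhlenbeck path.
% All numerical values are assumptions chosen for illustration.
lambda = 3;  mu = 2;  sigma = 1.5;  S0 = 5;
steps  = 201;  T = 5;                     % number of points, years
time   = linspace(0, T, steps);
dt     = time(2) - time(1);

b  = exp(-lambda*dt);                     % one-step decay factor
sd = sigma*sqrt((1 - b^2)/(2*lambda));    % one-step standard deviation
z  = randn(1, steps-1);                   % standard normal draws

S = zeros(1, steps);
S(1) = S0;
for i = 2:steps
    % exact transition from S(i-1) to S(i) over a step dt
    S(i) = S(i-1)*b + mu*(1 - b) + sd*z(i-1);
end

% unconditional mean and standard deviation seen from time zero
Expected = S0*exp(-lambda*time) + mu*(1 - exp(-lambda*time));
SDt      = sigma*sqrt((1 - exp(-2*lambda*time))/(2*lambda));
```

Because each step uses the exact conditional distribution, the step size does not introduce discretization error; it only sets how finely the path is sampled.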
[Figure 1.3: plot of Price (vertical axis, 1 to 5.5) versus Time (horizontal axis, 0 to 5) with curves labeled Expected, +1 Standard deviation, Price path, and −1 Standard deviation.]
FIGURE 1.3 Ornstein–Uhlenbeck process initiated at a price of 5 displays a strong tendency to revert to the equilibrium price of 2.
The function MRpath consists of a simulation followed by a series of calibration approaches. Focusing first on the simulation, MRpath calculates a vector of the expected (mean) price path as well as a vector of random movements. The summation of the expected and stochastic movements is displayed as the jagged line in Figure 1.3. Starting from an initial price of 5, the asset price is pulled toward an equilibrium price μ at a rate determined by the mean reversion parameter λ. The confidence interval is shown by the two lines displaced by ±1 standard deviation from the expected price path. An interesting effect of the Ornstein–Uhlenbeck process is that the variance of the process initially grows but then tends to a constant long-term variance given by $\mathrm{var}_{\text{long-term}}(x_t) = \sigma^2/(2\lambda)$.

1.9. CALIBRATING THE ORNSTEIN–UHLENBECK PROCESS

There are many techniques to regress or fit data, and here we focus on two effective techniques, namely least squares fitting and maximum likelihood fitting. The function MRpath calibrates the simulated path with function calls to WeightedLeastSquaresOU and weightedML. The former is a linear least squares fit that has been modified to put the solution of the Ornstein–Uhlenbeck process into a workable form.

1.10. LEAST SQUARES FITTING

The function WeightedLeastSquaresOU defaults to a linear unweighted least squares fit when the StdDev parameter is null (or blank). This regression finds a best fit line through a set of data points. The linear least squares procedure fits a straight line, y = a + bx + ε, with intercept a, slope b, and an error term ε, to a set of data by minimizing the sum of squared error residuals. Squared error residuals are continuously differentiable, whereas absolute error residuals are not differentiable at zero. One characteristic of squared error residuals is that a few outliers may have more influence than the majority of the data points (Weisstein, 2011). The general least squares procedure finds a set of parameters that minimizes the squared vertical offsets of the data from the best fit line, plane, etc. The method is quite flexible as it can be applied to any linear combination of basis functions $X_k(x)$, including sines, cosines, or polynomials, as given by
$$y(x) = \sum_{k=1}^{K} a_k X_k(x).$$
This approach fits a function with an arbitrarily large number of linear parameters $a_i$ by minimizing the squared deviation
$$\varepsilon^2 = \sum_i \left[y_i - f(x_i, a_1, a_2, \ldots, a_n)\right]^2.$$
A minimum of a convex function exists where the first derivative with respect to each parameter is zero,
$$\frac{\partial(\varepsilon^2)}{\partial a_i} = 0.$$
The simplest application, and the form we are interested in, is a linear fit to f(a, b) = a + bx. Deviations from this line are a function of the intercept a and slope b as given by
$$\varepsilon^2(a, b) = \sum_{i=1}^{n}\left[y_i - (a + bx_i)\right]^2,$$
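with respective minima found at
$$\frac{\partial(\varepsilon^2)}{\partial a} = -2\sum_{i=1}^{n}\left[y_i - (a + bx_i)\right] = 0$$
$$\frac{\partial(\varepsilon^2)}{\partial b} = -2\sum_{i=1}^{n}\left[y_i - (a + bx_i)\right]x_i = 0.$$
This leads to two coupled equations given by
$$\sum_{i=1}^{n} y_i = na + b\sum_{i=1}^{n} x_i$$
$$\sum_{i=1}^{n} y_i x_i = a\sum_{i=1}^{n} x_i + b\sum_{i=1}^{n} x_i^2,$$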
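which can be solved by substitution. Equivalently, the linear least squares can be expressed as matrices (Weisstein, 2011) as
$$\begin{bmatrix} \sum_{i=1}^{n} y_i \\ \sum_{i=1}^{n} y_i x_i \end{bmatrix} = \begin{bmatrix} n & \sum_{i=1}^{n} x_i \\ \sum_{i=1}^{n} x_i & \sum_{i=1}^{n} x_i^2 \end{bmatrix}\begin{bmatrix} a \\ b \end{bmatrix},$$
where the parameters a and b can be solved by the Matlab backslash operator or by a standard matrix inversion as
$$\begin{bmatrix} a \\ b \end{bmatrix} = \frac{1}{n\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}\begin{bmatrix} \sum_{i=1}^{n} y_i\sum_{i=1}^{n} x_i^2 - \sum_{i=1}^{n} x_i\sum_{i=1}^{n} x_i y_i \\ n\sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i\sum_{i=1}^{n} y_i \end{bmatrix}.$$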
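This provides a direct solution to linear unweighted least squares. Applying this regression procedure to the Ornstein–Uhlenbeck solution
$$S_i = S_{i-1}e^{-\lambda\delta t} + \mu(1 - e^{-\lambda\delta t}) + \sigma\sqrt{\frac{1 - e^{-2\lambda\delta t}}{2\lambda}}\;N(0, 1)$$
requires viewing this equation with the present observation $S_i$ as the y-data and the previous observation $S_{i-1}$ as the x-data for a time change δt, as given by $S_i = a + bS_{i-1} + \varepsilon$. Following van den Berg (2007), the model variables are directly substituted as
$$\text{slope} = b = e^{-\lambda\delta t}$$
$$\text{intercept} = a = \mu(1 - e^{-\lambda\delta t})$$
$$\mathrm{SD} = \sigma\sqrt{\frac{1 - e^{-2\lambda\delta t}}{2\lambda}},$$
which can be inverted to give
$$\lambda = -\frac{\ln b}{\delta t}$$
$$\mu = \frac{a}{1 - e^{-\lambda\delta t}} = \frac{a}{1 - b}$$
$$\sigma = \mathrm{SD}\sqrt{\frac{2\lambda}{1 - e^{-2\lambda\delta t}}} = \mathrm{SD}\sqrt{\frac{-2\ln b}{\delta t\,(1 - b^2)}}.$$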
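The regression and the inversion above can be sketched in a few lines. The following is a minimal illustration under assumed parameter values, using the backslash operator on a design matrix rather than the book's WeightedLeastSquaresOU function; all variable names (`A`, `coef`, `lambdaHat`, and so on) are hypothetical.

```matlab
% Sketch: unweighted least squares calibration of an OU path.
% Parameters and sample size are illustrative assumptions.
lambda = 3;  mu = 2;  sigma = 1.5;  dt = 0.025;  n = 5000;
b0  = exp(-lambda*dt);                          % true slope
sd0 = sigma*sqrt((1 - b0^2)/(2*lambda));        % true one-step SD

S = zeros(1, n+1);
S(1) = mu;                                      % start at equilibrium
for i = 2:n+1
    S(i) = S(i-1)*b0 + mu*(1 - b0) + sd0*randn;
end

x = S(1:end-1)';                                % previous observation
y = S(2:end)';                                  % present observation
A = [ones(n,1) x];                              % columns: intercept, slope
coef = A \ y;                                   % least squares solution
a = coef(1);  b = coef(2);
SD = sqrt(sum((y - A*coef).^2)/(n - 2));        % residual standard deviation

% invert the slope, intercept, and SD to the OU parameters
lambdaHat = -log(b)/dt;
muHat     = a/(1 - b);
sigmaHat  = SD*sqrt(-2*log(b)/(dt*(1 - b^2)));
fprintf('lambda %6.2f  mu %6.2f  sigma %6.2f\n', lambdaHat, muHat, sigmaHat);
```

With a long sample the estimates of μ and σ tend to settle quickly, while λ usually remains the noisiest of the three because it depends on how far the slope b sits below one.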
To add flexibility, we will derive a weighted least squares approach where the weight $w_i$ is inversely proportional to the square of the standard deviation (or measurement error) of each data point $x_i$ by $w_i = 1/\sigma_i^2$. The weighted general least squares merit function is
$$\varepsilon^2 = \sum_{i=1}^{n}\left[\frac{y_i - \sum_{k=1}^{K} a_k X_k(x_i)}{\sigma_i}\right]^2.$$
The weights and the measurement error are often unknown and simply set to unity to recover the unweighted least squares formula. Here, we present the weighted merit function fit to a line with an intercept a and a slope b as
$$\varepsilon^2(a, b) = \sum_{i=1}^{n}\left[\frac{y_i - (a + bx_i)}{\sigma_i}\right]^2.$$
If the errors are normally distributed, that is, follow a Gaussian distribution, then this approach will replicate the maximum likelihood estimate (MLE; Press et al., 1989). The maximum likelihood approach will be derived in a slightly different manner in Section 1.11. The maximum likelihood approach can be applied to a known distribution of any type, for example, Gaussian or exponential. The least squares approach is powerful in that it provides a good estimate in most cases even if no information is available as to the size or distribution type of the measurement error. The minimum of the weighted convex merit function is expressed as
$$\frac{\partial(\varepsilon^2)}{\partial a} = -2\sum_{i=1}^{n}\frac{y_i - (a + bx_i)}{\sigma_i^2} = 0$$
$$\frac{\partial(\varepsilon^2)}{\partial b} = -2\sum_{i=1}^{n}\frac{x_i\left[y_i - (a + bx_i)\right]}{\sigma_i^2} = 0.$$
This again gives two coupled equations. To simplify the algebra in the code, several convenient sums will be used as given by
$$S = \sum_{i=1}^{n}\frac{1}{\sigma_i^2}$$
$$S_x = \sum_{i=1}^{n}\frac{x_i}{\sigma_i^2} = \sum_{i=1}^{n}\frac{S_{i-1}}{\sigma_i^2} \qquad S_y = \sum_{i=1}^{n}\frac{y_i}{\sigma_i^2} = \sum_{i=1}^{n}\frac{S_i}{\sigma_i^2}$$
$$S_{xx} = \sum_{i=1}^{n}\frac{x_i^2}{\sigma_i^2} = \sum_{i=1}^{n}\frac{S_{i-1}^2}{\sigma_i^2} \qquad S_{yy} = \sum_{i=1}^{n}\frac{y_i^2}{\sigma_i^2} = \sum_{i=1}^{n}\frac{S_i^2}{\sigma_i^2}$$
$$S_{xy} = \sum_{i=1}^{n}\frac{x_i y_i}{\sigma_i^2} = \sum_{i=1}^{n}\frac{S_i S_{i-1}}{\sigma_i^2}.$$
Substituting these sums into the two coupled equations gives
$$S_y = aS + bS_x$$
$$S_{xy} = aS_x + bS_{xx}.$$
The two coupled equations can be solved for the two unknowns, intercept a and slope b, as well as the standard deviation as
$$\text{slope} = b = \frac{SS_{xy} - S_xS_y}{SS_{xx} - S_x^2}$$
$$\text{intercept} = a = \frac{S_{xx}S_y - S_xS_{xy}}{SS_{xx} - S_x^2}$$
$$\mathrm{SD} = \sqrt{\frac{SS_{yy} - S_y^2 - b\,(SS_{xy} - S_xS_y)}{S(S - 2)}}.$$
These unknowns allow direct calculation of the parameters in the Ornstein–Uhlenbeck model. The function WeightedLeastSquaresOU is called with a vector of asset prices, a constant time delta parameter, and an optional vector of measurement errors. If the last parameter is absent or every element is equal to a single value, then the function runs as an unweighted least squares fit. Otherwise, the weights correspond to the confidence or importance of the data. In the call made by MRpath, the earlier data are given more weight because they coincide with data that are far from the equilibrium level. This was done to improve the fit to λ, which determines the reversion to the mean. The price movements far from equilibrium are dominated by λ. Near equilibrium, the true reversion rate is obscured by noise (via sigma) in the data. A graphical representation of the weighted and unweighted least squares best fit lines is displayed in Figure 1.4 along with the current versus previous price data. The data near the equilibrium level of 2 are clustered at the bottom left-hand corner of Figure 1.4. The price points far from equilibrium are those located up and to the right in Figure 1.4. The text output of MRpath gives numerical values for the underlying parameters of the various best fit lines to the $S_t$ versus $S_{t-1}$ data:

              mu    lambda   sigma   Slope   Intercept   Standard Deviation
True         2.00    3.00    1.50    0.93      0.14            0.61
Standard LS  2.12    8.52    2.01    0.81      0.41            0.29
Weight LS    2.39    3.45    1.83    0.92      0.20            0.28
Standard ML  2.12    8.52    2.00
Weight ML    2.39    3.45    1.83
[Figure 1.4: plot of Price, S_t (vertical axis, 1 to 5) versus Previous price, S_t−1 (horizontal axis, 1 to 5) with series labeled Data, True line, Unweighted LS fit, and Weighted LS fit.]
FIGURE 1.4 Graphical output of function MRpath depicting the unweighted and weighted least squares best fit line to the current price versus previous price data.
The least squares procedures provide fairly good fits to the data. A large volatility and a short time step δt (in contrast to the example by van den Berg (2007)) were selected to make the data noisier and thus more challenging to fit. Nevertheless, in this limited example, the weighted approach does provide a better fit to the mean reversion parameter λ.
1.11. MAXIMUM LIKELIHOOD

The previous section sought to minimize the mean square error under the assumption that the expected variation of the observed data is best modeled as a Gaussian distribution. From another viewpoint, the minimization of the mean square error provides an estimate that maximizes the likelihood of the observed data. This approach can be generalized to find the maximum likelihood of any particular distribution chosen to fit the data. Usually, the distribution type, for example, Gaussian, Bernoulli, or Poisson, is known, but one or more of the parameters describing the distribution are not known (Weisstein, 2011). The conditional density function for n sequential data points $x_i$ with a normal distribution for any mean μ and standard deviation σ is
$$f(x_1, \ldots, x_n \mid \mu, \sigma) = \prod_{i=1}^{n}\frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{(x_i - \mu)^2}{2\sigma^2}} = \frac{(2\pi)^{-n/2}}{\sigma^n}\exp\!\left(-\frac{\sum_{i=1}^{n}(x_i - \mu)^2}{2\sigma^2}\right).$$
The logarithm of the likelihood is more convenient for estimation as it is expressed as a summation rather than a multiplication. The log-likelihood function L is given as
$$L = \ln f = -\frac{n}{2}\ln(2\pi) - n\ln(\sigma) - \frac{\sum_{i=1}^{n}(x_i - \mu)^2}{2\sigma^2}.$$
The maximum of a concave function is found when the first derivative with respect to the dependent parameters is zero,
$$\frac{\partial L}{\partial\mu} = 0 = \frac{\sum_{i=1}^{n}(x_i - \mu)}{\sigma^2} \;\rightarrow\; \sum_{i=1}^{n}(x_i - \mu) = 0.$$
Rearranging this relation finds the mean μ that makes the function f the most likely,
$$\mu = \frac{\sum_{i=1}^{n} x_i}{n},$$
which clearly is the expression for an average. Similarly,
$$\frac{\partial L}{\partial\sigma} = -\frac{n}{\sigma} + \frac{\sum_{i=1}^{n}(x_i - \mu)^2}{\sigma^3} = 0$$
recovers the usual formula for standard deviation,
$$\sigma = \sqrt{\frac{\sum_{i=1}^{n}(x_i - \mu)^2}{n}}.$$
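The two closed-form estimates can be checked numerically by confirming that the log-likelihood decreases when either parameter is nudged away from its estimate. The sketch below uses a small hypothetical sample; the data values and the helper name `logL` are assumptions for illustration.

```matlab
% Sketch: Gaussian maximum likelihood estimates on a hypothetical sample.
x = [2.1 1.9 2.4 2.0 1.6 2.3 1.8 2.2];   % made-up observations
n = numel(x);

muHat  = sum(x)/n;                        % ML estimate of the mean
sigHat = sqrt(sum((x - muHat).^2)/n);     % ML estimate divides by n, not n-1

% log-likelihood of the sample for a trial mean m and standard deviation s
logL = @(m, s) -n/2*log(2*pi) - n*log(s) - sum((x - m).^2)/(2*s^2);

L0 = logL(muHat, sigHat);
% any small perturbation of either parameter should lower the likelihood
dropMu  = L0 - max(logL(muHat + 0.01, sigHat), logL(muHat - 0.01, sigHat));
dropSig = L0 - max(logL(muHat, sigHat + 0.01), logL(muHat, sigHat - 0.01));
fprintf('mu %6.4f  sigma %6.4f  logL %8.4f\n', muHat, sigHat, L0);
```

Note that the maximum likelihood standard deviation uses a divisor of n, so it is slightly smaller than the sample standard deviation returned by `std(x)`, which divides by n − 1.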
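Using the format mentioned above for the Ornstein–Uhlenbeck model,
$$S_i = S_{i-1}e^{-\lambda\delta t} + \mu(1 - e^{-\lambda\delta t}) + \sigma\sqrt{\frac{1 - e^{-2\lambda\delta t}}{2\lambda}}\;N(0, 1),$$
the conditional probability density $f_i$ of observation $S_i$, given the previous observation $S_{i-1}$ after a time step δt, is
$$f(S_i \mid S_{i-1}, \mu, \hat{\sigma}, \lambda) = \frac{1}{\sqrt{2\pi\hat{\sigma}^2}}\exp\!\left(-\frac{\Big(S_i - \underbrace{\left[S_{i-1}e^{-\lambda\delta t} + \mu(1 - e^{-\lambda\delta t})\right]}_{\text{mean}}\Big)^2}{2\hat{\sigma}^2}\right),$$
where
$$\hat{\sigma} = \mathrm{SD} = \sigma\sqrt{\frac{1 - e^{-2\lambda\delta t}}{2\lambda}}$$
was previously referred to as SD. In our initial derivation $\hat{\sigma}$ was assumed to be constant. Following van den Berg (2007), given n + 1 observations $\{S_0, \ldots, S_n\}$, the log-likelihood function is
$$L = \sum_{i=1}^{n}\ln f(S_i \mid S_{i-1}, \mu, \hat{\sigma}, \lambda)$$
$$L = -\frac{n}{2}\ln(2\pi) - n\ln(\hat{\sigma}) - \frac{1}{2\hat{\sigma}^2}\sum_{i=1}^{n}\left(S_i - S_{i-1}e^{-\lambda\delta t} - \mu(1 - e^{-\lambda\delta t})\right)^2.$$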
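The parameters $\hat{\sigma}$ and μ that maximize the likelihood function are found by setting the partial derivatives of the log-likelihood function equal to zero by
$$\frac{\partial L(\mu, \hat{\sigma}, \lambda)}{\partial\mu} = \frac{1 - e^{-\lambda\delta t}}{\hat{\sigma}^2}\sum_{i=1}^{n}\left(S_i - S_{i-1}e^{-\lambda\delta t} - \mu(1 - e^{-\lambda\delta t})\right) = 0$$
$$\rightarrow\; \mu = \frac{\sum_{i=1}^{n}\left(S_i - S_{i-1}e^{-\lambda\delta t}\right)}{n(1 - e^{-\lambda\delta t})}$$
$$\frac{\partial L(\mu, \hat{\sigma}, \lambda)}{\partial\hat{\sigma}} = -\frac{n}{\hat{\sigma}} + \frac{1}{\hat{\sigma}^3}\sum_{i=1}^{n}\left(S_i - S_{i-1}e^{-\lambda\delta t} - \mu(1 - e^{-\lambda\delta t})\right)^2 = 0$$
$$\rightarrow\; \hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}\left(S_i - S_{i-1}e^{-\lambda\delta t} - \mu(1 - e^{-\lambda\delta t})\right)^2.$$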
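Finding an optimal choice for the mean reversion parameter λ is simpler after a slight rearrangement of the log-likelihood function to isolate the exponential functions by
$$L = -\frac{n}{2}\ln(2\pi) - n\ln(\hat{\sigma}) - \frac{1}{2\hat{\sigma}^2}\sum_{i=1}^{n}\left((S_i - \mu) - e^{-\lambda\delta t}(S_{i-1} - \mu)\right)^2$$
$$L = -\frac{n}{2}\ln(2\pi) - n\ln(\hat{\sigma}) - \frac{1}{2\hat{\sigma}^2}\sum_{i=1}^{n}\left[(S_i - \mu)^2 - 2e^{-\lambda\delta t}(S_i - \mu)(S_{i-1} - \mu) + e^{-2\lambda\delta t}(S_{i-1} - \mu)^2\right].$$
The partial derivative of the log-likelihood function with respect to λ is set to zero to find the optimal mean reversion parameter λ by
$$\frac{\partial L(\mu, \hat{\sigma}, \lambda)}{\partial\lambda} = -\frac{\delta t\,e^{-\lambda\delta t}}{\hat{\sigma}^2}\sum_{i=1}^{n}\left[(S_i - \mu)(S_{i-1} - \mu) - e^{-\lambda\delta t}(S_{i-1} - \mu)^2\right] = 0$$
$$\rightarrow\; \lambda = -\frac{1}{\delta t}\ln\!\left(\frac{\sum_{i=1}^{n}(S_i - \mu)(S_{i-1} - \mu)}{\sum_{i=1}^{n}(S_{i-1} - \mu)^2}\right).$$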
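A compact way to see these conditions at work is to compute the estimates on a short series and confirm that the log-likelihood falls when any one parameter is perturbed. The sketch below uses a made-up price series and time step; the closed-form expression for μ in terms of sums is the unweighted version of the one used in the appendix function weightedML, and the names `logL`, `L0`, and `b` are hypothetical.

```matlab
% Sketch: unweighted maximum likelihood estimates for an OU series.
% The price series and time step are hypothetical illustration values.
S  = [5.00 4.62 4.41 4.05 3.90 3.55 3.48 3.20 3.05 2.96 ...
      2.70 2.74 2.55 2.49 2.38];
dt = 0.1;
x = S(1:end-1);  y = S(2:end);  n = numel(y);

Sx = sum(x);  Sy = sum(y);  Sxx = sum(x.^2);  Sxy = sum(x.*y);

% mu from the two coupled conditions, then lambda and sigma-hat
mu     = (Sy*Sxx - Sx*Sxy) / (n*(Sxx - Sxy) - (Sx^2 - Sx*Sy));
b      = sum((y - mu).*(x - mu)) / sum((x - mu).^2);   % equals exp(-lambda*dt)
lambda = -log(b)/dt;
sigHat = sqrt(sum((y - x*b - mu*(1 - b)).^2)/n);

% conditional log-likelihood for trial parameters (m, l, s)
logL = @(m, l, s) -n/2*log(2*pi) - n*log(s) ...
       - sum((y - x*exp(-l*dt) - m*(1 - exp(-l*dt))).^2)/(2*s^2);

L0 = logL(mu, lambda, sigHat);
fprintf('mu %6.3f  lambda %6.3f  sigma-hat %6.3f  logL %7.3f\n', ...
        mu, lambda, sigHat, L0);
```

The ratio `b` must lie between zero and one for the logarithm to give a positive reversion rate; on data without visible mean reversion this need not hold, and the estimate should then be treated with caution.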
In the previous section on least squares fitting, it was shown that individually weighting data points can improve the fit in some situations. It is thus beneficial to add a similar capability to our maximum likelihood approach. The basic idea is to rederive the equations of this section with $\hat{\sigma}_i$ inside the summation or product. This approach takes some liberties with the underlying concept of the Gaussian distribution; however, this exercise is interesting when the numerical values of the weighted maximum likelihood and weighted least squares are compared. Briefly, the log-likelihood function is
$$L = \sum_{i=1}^{n}\ln f(S_i \mid S_{i-1}, \mu, \hat{\sigma}_i, \lambda)$$
$$L = -\frac{n}{2}\ln(2\pi) - \sum_{i=1}^{n}\ln(\hat{\sigma}_i) - \sum_{i=1}^{n}\frac{1}{2\hat{\sigma}_i^2}\left(S_i - S_{i-1}e^{-\lambda\delta t} - \mu(1 - e^{-\lambda\delta t})\right)^2$$
and the optimal parameters are
$$\mu = \frac{\sum_{i=1}^{n}\left(S_i - S_{i-1}e^{-\lambda\delta t}\right)/\hat{\sigma}_i^2}{(1 - e^{-\lambda\delta t})\sum_{i=1}^{n} 1/\hat{\sigma}_i^2}$$
$$\hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}\left(S_i - S_{i-1}e^{-\lambda\delta t} - \mu(1 - e^{-\lambda\delta t})\right)^2$$
$$\lambda = -\frac{1}{\delta t}\ln\!\left(\frac{\sum_{i=1}^{n}(S_i - \mu)(S_{i-1} - \mu)/\hat{\sigma}_i^2}{\sum_{i=1}^{n}(S_{i-1} - \mu)^2/\hat{\sigma}_i^2}\right).$$
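The equation just derived for $\hat{\sigma}$ is dependent on both μ and λ. Fortunately, the two coupled equations for μ and λ are only dependent on each other. Therefore, substituting one equation into the other allows either μ or λ to be solved in closed form, after which the remaining parameter follows. Once μ and λ are known then $\hat{\sigma}$ can be solved directly. Again, to simplify the algebra in the code, several convenient sums will be used as given by
$$S = \sum_{i=1}^{n}\frac{1}{\sigma_i^2}$$
$$S_x = \sum_{i=1}^{n}\frac{x_i}{\sigma_i^2} = \sum_{i=1}^{n}\frac{S_{i-1}}{\sigma_i^2} \qquad S_y = \sum_{i=1}^{n}\frac{y_i}{\sigma_i^2} = \sum_{i=1}^{n}\frac{S_i}{\sigma_i^2}$$
$$S_{xx} = \sum_{i=1}^{n}\frac{x_i^2}{\sigma_i^2} = \sum_{i=1}^{n}\frac{S_{i-1}^2}{\sigma_i^2} \qquad S_{yy} = \sum_{i=1}^{n}\frac{y_i^2}{\sigma_i^2} = \sum_{i=1}^{n}\frac{S_i^2}{\sigma_i^2}$$
$$S_{xy} = \sum_{i=1}^{n}\frac{x_i y_i}{\sigma_i^2} = \sum_{i=1}^{n}\frac{S_i S_{i-1}}{\sigma_i^2}.$$
Substituting these sums into the two coupled equations gives
$$\mu = \frac{S_yS_{xx} - S_xS_{xy}}{S(S_{xx} - S_{xy}) - (S_x^2 - S_xS_y)}$$
$$\lambda = -\frac{1}{\delta t}\ln\!\left(\frac{S_{xy} - \mu S_x - \mu S_y + S\mu^2}{S_{xx} - 2\mu S_x + S\mu^2}\right).$$
The equation for the standard deviation is
$$\hat{\sigma}^2 = \frac{S_{yy} - 2e^{-\lambda\delta t}S_{xy} + e^{-2\lambda\delta t}S_{xx} - 2\mu(1 - e^{-\lambda\delta t})(S_y - e^{-\lambda\delta t}S_x) + S\mu^2(1 - e^{-\lambda\delta t})^2}{S}.$$
As discussed earlier, a least squares estimate of normally distributed errors will replicate the MLE. This similarity was seen in the output of the function MRpath discussed previously. The log-likelihood form is quite flexible and will be used again later in this book in conjunction with the Kalman filter.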
SUMMARY

This chapter provided a derivation and application overview of the equations underlying geometric and arithmetic Brownian motion as well as the related mean reversion models. These models are readily applied to the pricing of financial derivatives and real options. A major vein of this book is the addition of jump processes or stochastic volatility to the drift-diffusion models of this chapter.

REFERENCES

Baker, M.P., Mayfield, E.S., Parsons, J.E. (1998) Alternative Models of Uncertain Commodity Prices for Use with Modern Asset Pricing, Energy Journal 19, 115.
van den Berg, M.A. (2007) Calibrating the Ornstein-Uhlenbeck Model, White Paper, sitmo.com.
Bessembinder, H., Coughenour, J.F., Seguin, P.J., Smoller, M.M. (1995) Mean Reversion in Equilibrium Asset Prices: Evidence from the Futures Term Structure, Journal of Finance 50, 361.
Chance, D. (1994) The ABCs of Geometric Brownian Motion, Derivatives Quarterly 1, 41.
Chance, D. (2005) Mathematical Probability Theory and Finance: Connecting the Dots, Journal of Financial Education 31, 1.
Cox, J.C., Ingersoll, J.E., Ross, S.A. (1985) A Theory of the Term Structure of Interest Rates, Econometrica 53, 385.
Dias, M.A.G. (2004) Valuation of Exploration & Production Assets: An Overview of Real Options Models, Journal of Petroleum Science and Engineering 44, 93.
Dixit, A.K., Pindyck, R.S. (1994) Investment under Uncertainty, Princeton University Press.
Hull, J. (2006) Options, Futures, and Other Derivatives, Prentice Hall.
Laughton, D.G., Jacoby, H.D. (1995) The Effects of Reversion on Commodity Projects of Different Length, Real Options in Capital Investments: Models, Strategies, and Applications, Trigeorgis, L. (ed.), Praeger Publisher, p 185.
Neftci, S.N. (2000) An Introduction to the Mathematics of Financial Derivatives, Academic Press Advanced Finance.
Papoulis, A. (1984) Probability, Random Variables, and Stochastic Processes, McGraw-Hill.
Pindyck, R.S., Rubinfeld, D.L. (1991) Econometric Models and Economic Forecasts, McGraw-Hill, Inc.
Press, W.H., Teukolsky, S.A., Vetterling, W.T., Flannery, B.P. (1989) Numerical Recipes: The Art of Scientific Computing, Cambridge University Press.
Rice, J. (1995) Mathematical Statistics and Data Analysis, Duxbury Press.
Uhlenbeck, G.E., Ornstein, L.S. (1930) Theory of Brownian Motion, Physical Review 36.
Weisstein, E.W. (2011) Raw Moments, MathWorld, Wolfram Research, wolfram.com.
APPENDIX

Code for GBMfit

function GBMfit(S)
%GBMfit.m Geometric Brownian motion fit of actual data (S)
%or self-simulation (no input).
%Calculates drift mu and volatility from ln(St)-ln(St-1) data.
TimeDelta = 1/252; %assume daily prices but could add as an input to function
if (nargin == 1) %stock price data input
    %or read directly: S = load('XOMprice.dat');
    steps = length(S);
    TimeLength = TimeDelta*steps;
    time = linspace(0,TimeLength,steps);
end
if (nargin == 0) %self-simulation
    mu = 0.1 %percentage drift
    Dsig = 0.27 %annualized volatility
    Szero = 30; %initial price
    steps = 7.5*252; %assume 7.5 years
    TimeLength = TimeDelta*steps;
    time = linspace(0,TimeLength,steps); %years
    sigma = Dsig*sqrt(TimeDelta).*randn(1,steps); %vector of random movements
    etadt = (mu-0.5*Dsig^2)*TimeDelta;
    S = zeros(1,steps);
    S(1) = Szero;
    for i = 2:steps
        S(i) = S(i-1)*exp(etadt+sigma(i));
    end
end
%logarithmic returns log(St/S0) are normally distributed
LogDelta = log(S(2:end))-log(S(1:end-1)); %=log(S(i)/S(i-1))
StanDev = std(LogDelta); %variance of the log increments is vol^2*t
EstVol = StanDev/sqrt(TimeDelta) %estimated annualized volatility
%log increments of GBM are normal with
%mean(LogDelta)=(mu-0.5sig^2)t, so mu = mean/t + 0.5sig^2
EstMu = mean(LogDelta)/TimeDelta + 0.5*EstVol^2
Expected = S(1).*exp(EstMu.*time); %E[St]=S0exp(mu*t)
SD = Expected.*sqrt(exp(EstVol^2.*time)-1);
%SD[St]=S0exp(mu*t)[exp(vol^2*t)-1]^0.5
time = time+2003; %start in year 2003
figure
plot(time,Expected,'-',...
    time,Expected+SD,'--',...
    time,S,':',...
    time,Expected-SD,'-.'), grid on
xlabel('Time'); ylabel('Price')
legend('Expected','+1 Standard Deviation','Price Path',...
    '-1 Standard Deviation','location','NorthWest');
if (nargin == 1) %stock price data input
    title('Geometric Brownian Motion fit to XOM');
end
if (nargin == 0) %self-simulated
    title('Geometric Brownian Motion of Simulated Price Process');
end
xlim([2003 2011])
end
Code for GBMexpected

function GBMexpected()
%GBMexpected.m simulates GBM (no input)
%of several (vectorized) price paths.
%Goal is to show that the expected value of GBM
%E[St]=S0exp(mu*t) is equal to the average of many price paths
%generated by St=S0*exp[(mu-0.5sig^2)t+sig*sqrt(t)*N(0,1)].
%Normally generated random movements N(0,1) are symmetric but
%exp(N(0,1)) movements are asymmetric.
TimeDelta = 1/252; %assume daily prices
steps = 7.5*252; %assume 7.5 years
TimeLength = TimeDelta*steps;
time = linspace(0,TimeLength,steps); %years
loop = 5000; %number of simulations
mu = 0.1; %percentage drift
Dsig = 0.1; %annualized volatility
Szero = 30; %initial price
etadt = (mu-0.5*Dsig^2)*TimeDelta; %result of Ito
Expected = Szero.*exp(mu.*time); %E[St]=S0exp(mu*t)
SD = Expected.*sqrt(exp(Dsig^2.*time)-1); %standard deviation
subplot(1,2,1)
plot(time,Expected,'-','color','blue','LineWidth',5); hold on;
plot(time,Expected+SD,'color','black');
plot(time,Expected-SD,'color',[0.5 0 0]);
%matrix of random movements, one row per simulated path
sigma = Dsig*sqrt(TimeDelta).*randn(loop,steps);
S = Szero*cumprod(exp(etadt+sigma),2);
%a cumsum of etadt+sigma followed by S=exp(...)
%should be faster since addition is generally faster than mult.
aveS = mean(S);
plot(time,aveS,':','color','magenta','LineWidth',5);
iteration = 1:50:500; %plot only a few price series
plot(time,S(iteration,:),':','color','cyan','LineWidth',0.1);
axis tight
xlabel('Time [Years]'); ylabel('Asset Price')
legend('E[S_t]=S_0e^{\mu t}','E[S_t] + 1 Stand. Dev.',...
    'E[S_t] - 1 Stand. Dev.','Mean Sim. Price',...
    'S_t=S_0e^{(\mu-0.5\sigma^2)t+\sigmaN(0,1)t^{0.5}}',...
    'location','NorthWest');
title('Geometric Brownian Motion Simulated Price Process');
hold off;
Send = S(:,steps);
minS = min(Send);
maxS = max(Send);
subplot(1,2,2)
inc = minS:0.2:maxS; %define bin centers
hist(Send,inc);
xlabel('Final Price'); ylabel('Frequency');
title('Histogram of Final Simulated Price');
text(Expected(end),loop/180,'\downarrow E[S_T]');
axis tight
end
Code for MRpath

function MRpath()
%MRpath simulates path of mean reversion process
%then uses external function calls for unweighted
%and weighted least squares fits as well as unweighted
%and weighted maximum likelihood fits.
%The rapidly reverting data has fewer data points
%so one can emphasize certain data to,
%in theory, provide a more accurate fit to lambda.
steps = 201;
TimeLength = 5; %years
lambda = 3 %mean reversion rate
mu = 2 %long-term mean
Dsig = 1.5
Szero = 5; %initial price
time = linspace(0,TimeLength,steps);
TimeDelta = time(2)-time(1)
Expected = Szero.*exp(-lambda.*time)+mu.*(1-exp(-lambda*time));
%SD is constant but calculated as vector
SD = Dsig.*sqrt((1-exp(-2.*lambda.*TimeDelta*ones(1,steps)))./(2.*lambda));
weightSD = SD.*linspace(0.1,1.9,steps);
%Place more weight (importance) on earlier points that are
%rapidly changing to hopefully better fit the mean
%reversion rate lambda
S = Expected + SD.*randn(1,steps);
x = S(1:end-1);
y = S(2:end);
figure
plot(time,Expected,'-',...
    time,Expected+SD,'--',...
    time,S,':',...
    time,Expected-SD,'-.'), grid on
xlabel('Time'); ylabel('Price')
legend('Expected','+1 Standard Deviation',...
    'Price Path','-1 Standard Deviation');
title('Mean Reversion Process');
TrueSlope = exp(-lambda*TimeDelta);
TrueIntercept = mu*(1-TrueSlope);
trueSD = Dsig.*sqrt((1-exp(-2.*lambda.*time(steps)))./(2.*lambda));
TrueLine = polyval([TrueSlope TrueIntercept],x);
fprintf(1,'\t\t\t\t mu \t lambda \t sigma \t\t Slope');
fprintf(1,'\t\t Intercept \t Standard Deviation \n');
fprintf(1,...
    'True \t\t %6.2f \t %6.2f \t %6.2f \t %6.2f \t %6.2f \t %6.2f \n',...
    mu, lambda, Dsig, TrueSlope, TrueIntercept, trueSD);
%%%Calculate by unweighted least squares (equal weights assumed)
%outputs are ordered mu, lambda, sigma, slope, intercept, SD
[LSMu, LSLambda, LSSigma, LSslope, LSintercept, LSstanDev]...
    = WeightedLeastSquaresOU(S,TimeDelta);
fprintf(1,...
    'Standard LS\t %6.2f \t %6.2f \t %6.2f \t %6.2f \t %6.2f \t %6.2f \n',...
    LSMu, LSLambda, LSSigma, LSslope, LSintercept, LSstanDev)
UnweightedLSline = polyval([LSslope, LSintercept],x);
%%%Calculate by weighted least squares
[wLSMu, wLSLambda, wLSSigma, wLSslope, wLSintercept, wLSstanDev]...
    = WeightedLeastSquaresOU(S,TimeDelta,weightSD(2:end));
fprintf(1,...
    'Weight LS\t %6.2f \t %6.2f \t %6.2f \t %6.2f \t %6.2f \t %6.2f \n',...
    wLSMu, wLSLambda, wLSSigma, wLSslope, wLSintercept, wLSstanDev);
WeightedlsLine = polyval([wLSslope, wLSintercept],x);
%%%Calculate by unweighted maximum likelihood
[MLmu, MLlambda, MLsigma] = weightedML(S,TimeDelta);
fprintf(1,'Standard ML\t %6.2f \t %6.2f \t %6.2f \n',...
    MLmu, MLlambda, MLsigma)
%%%Calculate by weighted maximum likelihood
[wMLmu, wMLlambda, wMLsigma] = weightedML(S,TimeDelta,weightSD(2:end));
fprintf(1,'Weight ML\t %6.2f \t %6.2f \t %6.2f \n',...
    wMLmu, wMLlambda, wMLsigma)
figure
plot(x,y,'o',x,TrueLine,'-',x,UnweightedLSline,'--',x,WeightedlsLine,'.-')
xlabel('Previous Price, S_t_-_1'); ylabel('Price, S_t')
legend('Data','True Line','Unweighted LS Fit',...
    'Weighted LS Fit','location','NorthWest');
%Or use one of several Matlab fitting functions
%Matlabfit = polyfit(x,y,1);
%fprintf(1,'Matlab calc. %6.2f Slope \t %6.2f Intercept \n',Matlabfit);
%MatlabfitLine = polyval(Matlabfit,x);
end
Code for WeightedLeastSquaresOU

function [mu, lambda, sigma, CalcSlope, CalcIntercept, StdDev]...
    = WeightedLeastSquaresOU(S,delta,sigma)
% WeightedLeastSquaresOU performs a weighted or unweighted
% least squares fit to an Ornstein-Uhlenbeck process.
% Derivation in Press, Flannery, Teukolsky, Vetterling,
% Numerical Recipes
% as well as an unweighted estimation procedure
% by M.A. van den Berg available at www.sitmo.com
x = S(1:end-1);
y = S(2:end);
n = length(y);
if nargin < 3
    sigma = ones(1,n); %unit weights give the unweighted fit
end
%weighted sums; note that S is reused here as the sum of the weights
S   = sum(1./sigma.^2);
Sx  = sum(x./sigma.^2);
Sy  = sum(y./sigma.^2);
Sxx = sum(x.^2./sigma.^2);
Sxy = sum(x.*y./sigma.^2);
Syy = sum(y.^2./sigma.^2);
CalcSlope = (S.*Sxy-Sx.*Sy)/(S.*Sxx-Sx.^2);
CalcIntercept = (Sxx.*Sy-Sx.*Sxy)/(S.*Sxx-Sx.^2);
StdDev = sqrt((S*Syy - Sy.^2 - (CalcSlope.*(S.*Sxy-Sx.*Sy)))...
    /(S*(S-2)));
%%Unweighted
%=sqrt((n*Syy - Sy.^2 - (CalcSlope.*(n.*Sxy-Sx.*Sy)))/(n*(n-2)));
lambda = -log(CalcSlope)/delta;
mu = CalcIntercept/(1-CalcSlope);
sigma = StdDev*sqrt(-2*log(CalcSlope)/(delta*(1-CalcSlope^2)));
end
Code for weightedML

function [mu, lambda, sigma] = weightedML(S,delta,sigma)
%weightedML performs a maximum likelihood estimation of
%the parameters in an Ornstein-Uhlenbeck process.
%Also see detailed descriptions of the
%unweighted estimation
%by M.A. van den Berg available at www.sitmo.com
%and Weisstein, Eric W. MathWorld, Wolfram Research,
%http://mathworld.wolfram.com
x = S(1:end-1);
y = S(2:end);
n = length(y);
if nargin < 3
    sigma = ones(1,n); %unit weights give the unweighted estimate
end
%weighted sums; note that S is reused here as the sum of the weights
S   = sum(1./sigma.^2);
Sx  = sum(x./sigma.^2);
Sy  = sum(y./sigma.^2);
Sxx = sum(x.^2./sigma.^2);
Sxy = sum(x.*y./sigma.^2);
Syy = sum(y.^2./sigma.^2);
%In derivation, two equations and two unknowns are available
%for mu and lambda.
%Sigma is directly solvable once mu and lambda are calculated
mu = (Sy*Sxx - Sx*Sxy)/(S*(Sxx - Sxy) - (Sx^2 - Sx*Sy));
lambda = -log((Sxy - mu*Sx - mu*Sy + S*mu^2)/...
    (Sxx - 2*mu*Sx + S*mu^2))/delta;
a = exp(-lambda*delta);
sigmah2 = (Syy - 2*a*Sxy + a^2*Sxx - 2*mu*(1-a)*(Sy - a*Sx)...
    + S*mu^2*(1-a)^2)/S;
sigma = sqrt(sigmah2*2*lambda/(1-a^2));
end
2
Jump Models
2.1. INTRODUCTION The previous chapter introduced models that are based on Brownian motion of asset prices. Deriving, simulating, and estimating models are much simpler when the stochastic behavior is described by Gaussian distributions. Unfortunately, the natural behavior of markets tends to display a greater number of large jumps than predicted by a Gaussian distribution. The fat tails in the distribution can be modeled by a few techniques. The approach within this chapter is an extension of the earlier Gaussian models by adding a Poisson jump process. The Poisson distribution determines the frequency of jumps within a certain time period. Several choices are available for the jump-amplitude distribution. A one-size jump is the simplest model but is quite insightful for describing the behavior of assets that are vulnerable to large negative movements such as defaults. The uniform jump model randomly selects an amplitude from a uniform range of possible jumps. This large but bounded jump size range logically implies that there is some unfeasible level for an asset price to cross. The following section derives the lognormal diffusion and log-uniform jump model and presents a least-squares method and a multinomial maximum likelihood estimation procedure. The chapter concludes with a discussion of other popular jump size distribution models.
Financial Derivative and Energy Market Valuation: Theory and Implementation in Matlab, First Edition. Michael Mastro. © 2013 John Wiley & Sons, Inc. Published 2013 by John Wiley & Sons, Inc.

2.2. JUMP-DIFFUSION MODEL

The form of a geometric jump-diffusion stochastic differential equation is
$$dS_t = S_t\left[\mu_d\,dt + \sigma_d\,dW_t + J(Q)\,dN_t\right],$$
where the subscript d was added to the drift $\mu_d$ and diffusion $\sigma_d$ components to avoid potential confusion with a normally distributed jump introduced later. The amplitude J(Q) represents the size of the random jumps and J(Q) > −1, else $dS_t$ could bring the asset value to less than zero. The one-dimensional Poisson process $N_t$ is discontinuous with a constant jump rate λ. The simulated frequency of jumps over a time period dt is determined by $dN_t$, which has a discrete Poisson distribution given as
$$p_k(\lambda\,dt) = \text{Probability}[dN_t = k] = \frac{e^{-\lambda\,dt}(\lambda\,dt)^k}{k!}.$$
If the jump frequency λ is small, then the probability of one jump over a short time period dt can be approximated by
$$\text{Probability}[dN_t = 1] \approx \lambda\,dt.$$
With the same assumptions, the probability of k ≥ 2 jumps scales as $(\lambda\,dt)^k$ and can, in general, be neglected. To a precision of order dt, the Poisson distribution therefore reduces to a zero-one law for the jumps of $dN_t$. Simulating the zero-one process defaults to zero jumps, whereas one jump occurs when a uniform random number in [0,1] is found to be less than λdt. The size of the jump is selected to be one constant value J or a certain random process J(Q). The jump amplitude is independent of the arrival of jumps, which itself is determined by $dN_t$. The space-time jump process $J(Q)\,dN_t$ has a mean of $E[J(Q)]\lambda\,dt$ and a variance of $E[J^2(Q)]\lambda\,dt$ (Hanson and Westman, 2002). Building on the Ito calculus described in Chapter 1, the geometric stochastic differential equation for $dS_t/S_t$ can be transformed into a log return $d(\ln S_t)$, with a log-return drift of $\mu_{ld} = \mu_d - \sigma_d^2/2$, to give
$$d(\ln S_t) = \mu_{ld}\,dt + \sigma_d\,dW_t + \ln[1 + J(Q)]\,dN_t.$$
The ln[1 + J(Q)] mark process term is the Ito calculus shift of the log-return jump amplitude. To see the origin of the log-return jump amplitude, one can consider the stochastic differential equation with jumps but without drift or diffusion terms as given by
$$dS_t = S_t[J(Q)\,dN_t].$$
After the first jump, the new value of the asset price would be
$$S_t + \text{Jump} = S_t + S_tJ(Q) = S_t[1 + J(Q)].$$
Clearly, a value of J(Q) < −1 would yield the unreasonable result of a negative asset price. Applying the Ito calculus to transform this equation into a logarithmic state G = ln(S) gives
$$\text{Jump}[G] = \{\ln[S_t + S_tJ(Q)] - \ln S_t\}\,dN_t = \ln\!\left[\frac{S_t + S_tJ(Q)}{S_t}\right]dN_t = \ln[1 + J(Q)]\,dN_t.$$
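The zero-one approximation and the multiplicative effect of a constant jump can be illustrated with a pure-jump path that has no drift or diffusion. The sketch below is a minimal example under assumed values; the jump rate, the jump size `J`, and the names `pk`, `dN`, and `logJump` are hypothetical.

```matlab
% Sketch: zero-one jump arrivals with a constant jump amplitude J.
% All parameter values are assumptions chosen for illustration.
lambdaJ = 2;           % expected jumps per year
dt      = 1/252;       % daily time step in years
steps   = 252*10;      % ten years of daily steps
J       = -0.10;       % constant jump amplitude, must exceed -1

% exact Poisson probability of k jumps in one step
pk = @(k) exp(-lambdaJ*dt).*(lambdaJ*dt).^k./factorial(k);
fprintf('P[0] = %8.6f  P[1] = %8.6f  lambda*dt = %8.6f\n', ...
        pk(0), pk(1), lambdaJ*dt);

% zero-one law: a jump occurs when a uniform draw falls below lambda*dt
u  = rand(1, steps);
dN = (u < lambdaJ*dt);

% each jump adds ln(1+J) to the log price, so the price stays positive
logJump = log(1 + J).*dN;
S = 30*exp(cumsum(logJump));
fprintf('jumps simulated: %d, final price: %6.2f\n', sum(dN), S(end));
```

Working in the log state makes each jump a multiplication of the price by (1 + J), which is why any amplitude above −1 keeps the simulated price positive.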
The solution of the log-return stochastic differential equation,
$$d(\ln S_t) = dG_t = \mu_{ld}\,dt + \sigma_d\,dW_t + \ln[1 + J(Q)]\,dN_t,$$
consists of the sum of a Riemann integral, a Wiener integral, and a Poisson jump integral (Hanson, 2007) as
$$\ln S_t = G_t = \ln S_0 + \int_0^t \mu_{ld}\,ds + \sigma_d W_t + \int_0^t \ln[1 + J(Q)]\,dN_s.$$
Taking the exponential throughout and converting the exponential of the sum of the jumps to a product gives
$$S_t = S_0\exp\!\left(\int_0^t \mu_{ld}\,ds + \sigma_d W_t + \int_0^t \ln[1 + J(Q)]\,dN_s\right)$$
$$S_t = S_0\exp\!\left(\int_0^t \mu_{ld}\,ds + \sigma_d W_t\right)\prod_{k=1}^{N_t}\left[1 + J(T_k^-)\right],$$
where a prejump state is specified for the kth jump time $T_k^-$ to preserve the positive property of the state. The forward approximation can be specified recursively for time steps $\Delta t_i$ as
$$G_{i+1} = G_i + \mu_{ld}\,\Delta t_i + \sigma_d\,\Delta W_i + \ln[1 + J(Q)]\,\Delta N_i,$$
with Wiener increments $\Delta W_i$ and Poisson increments $\Delta N_i$. The type of jump distribution is selected to augment the diffusion $\sigma_d$ component. As mentioned above, most financial asset returns have a distribution that cannot be properly fitted by a Gaussian distribution. Specifically, the typical leptokurtic distribution displays a sharper central peak in probability near the mean and fatter tails for extreme movements when compared to a Gaussian distribution. This increase in probability for large positive and negative movements is quantified by the kurtosis factor. One choice is to model the extreme jumps as a uniform distribution along $[Q_a, Q_b]$ where $Q_a < 0 < Q_b$. In other words, the jump size is equally likely to be any point between the most negative movement $Q_a$ and the most positive movement $Q_b$. This includes movements near the mean (or zero), but all the jumps are relatively infrequent. Therefore, an infrequent small jump will be insignificant compared to the frequent small movement predicted by the Gaussian ($\sigma_d$) distribution. The time-independent density of the log-uniform jump-amplitude mark Q is
$$\phi_Q(q) = \phi^{(u)}(q; Q_a, Q_b) = \frac{U(q; Q_a, Q_b)}{Q_b - Q_a},$$
38
JUMP MODELS
where the unit step function $U$ is normalized by the absolute difference between $Q_a$ and $Q_b$ (Hanson and Zhu, 2004). The mark $Q$ has a mean of

$$\mu_j = E[Q] = \frac{1}{Q_b - Q_a}\int_{Q_a}^{Q_b} q\,dq = \frac{Q_b + Q_a}{2},$$

and a variance of

$$\sigma_j^2 = \mathrm{var}[Q] = \frac{1}{Q_b - Q_a}\int_{Q_a}^{Q_b}\left(q - \frac{Q_b + Q_a}{2}\right)^2 dq = \frac{(Q_b - Q_a)^2}{12}.$$
Inverting the log relation $Q = \ln[1 + J(Q)]$ for the log-uniform jump-amplitude mark $Q$ gives the jump amplitude $J(Q) = e^Q - 1$, with a mean of (Hanson, 2007)

$$E[J(Q)] = \frac{1}{Q_b - Q_a}\int_{Q_a}^{Q_b}(e^q - 1)\,dq = \frac{e^{Q_b} - e^{Q_a}}{Q_b - Q_a} - 1.$$
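The closed-form moments of the uniform mark can be confirmed by direct quadrature. This is a minimal sketch, assuming illustrative bounds $Q_a = -0.14$ and $Q_b = 0.15$ and a trapezoidal grid chosen only for the check; the variable names are hypothetical.

```matlab
% Check the closed-form mean, variance, and E[J(Q)] of the uniform mark.
Qa = -0.14;  Qb = 0.15;                  % illustrative bounds, Qa < 0 < Qb
q    = linspace(Qa, Qb, 20001);          % quadrature grid
phiQ = ones(size(q))/(Qb - Qa);          % uniform density on [Qa, Qb]

muJ_num  = trapz(q, q.*phiQ);                   % numerical E[Q]
varJ_num = trapz(q, (q - muJ_num).^2.*phiQ);    % numerical var[Q]
EJ_num   = trapz(q, (exp(q) - 1).*phiQ);        % numerical E[J(Q)]

muJ  = (Qb + Qa)/2;                             % closed forms from the text
varJ = (Qb - Qa)^2/12;
EJ   = (exp(Qb) - exp(Qa))/(Qb - Qa) - 1;

fprintf('mean %.6f / %.6f\n', muJ_num, muJ);
fprintf('var  %.6f / %.6f\n', varJ_num, varJ);
fprintf('E[J] %.6f / %.6f\n', EJ_num, EJ);
```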
The graphical output of the jump-diffusion function SimJD() is presented below. The SimJD() function simulates a normal drift-diffusion process augmented with two different jump process types. Figure 2.1a shows a process with a constant jump size relative to the current asset price. The jump component is broken out as the dashed line in Figure 2.1. The one-size jump is not exactly constant in absolute terms, since its size is proportional to the prejump asset price. Defining the jumps relative to the current asset price prevents a jump to a negative asset price. Confining the jumps to one size may seem restrictive, but certain financial assets are dominated by one type of price movement. For example, stock or bond prices tend to have sudden price drops, such as those caused by bankruptcies, more often than large positive movements. This is observable as a long negative tail or negative skew in the probability distribution. Practically, asset holders are more concerned with properly modeling large negative movements that can lead to financial ruin. In contrast, commodity markets, particularly the electricity spot markets, are characterized by large positive jumps in price due to sudden supply disruptions.

[Figure 2.1: two panels plotting price $S_t$ against time in years; each panel shows the expected J/D process, the simulated J/D process, and the jump component.]
FIGURE 2.1 Output of SimJD shows a drift-diffusion process with (a) one-size jumps and (b) uniformly distributed jumps. The mean or expected price process will be higher than the median price as a consequence of the properties of a lognormal stock price process with a distribution tail at higher asset prices.

Figure 2.1b depicts a drift-diffusion process augmented with a jump process randomly selected over a uniform finite interval [a,b]. The jump component is displayed as the lower dashed line. For the majority of simulations, the time step can be set sufficiently small ($\Delta t \to 0^+$) such that the probability of two or more jumps can be neglected. In the zero- or one-jump form, the probability of zero jumps ($1 - \lambda\Delta t$) is much larger than the probability of one jump ($\lambda\Delta t$). For example, a jump frequency parameter $\lambda = 0.5$ would predict on average five jumps per 10 years. To aid the comparison, the size of the one-jump process is set to the average of the uniform jump process. In addition, the probability of a jump ($\lambda\Delta t$) is the same in both processes. The code repeatedly steps through
$$S_{i+1} = S_i \exp\left[\left(\mu_d - \sigma_d^2/2\right)\Delta t + \sigma_d\sqrt{\Delta t}\,N(0,1) + Q_i\,dN_i\right],$$
where $N(0,1)$ is a random draw from a standard normal distribution. Jump $Q_i$ does not take place if a draw from a uniform distribution falls in $[0, 1 - \lambda\Delta t)$, such that $dN_i = 0$. The jump size for a uniform distribution is a uniform random number over the finite interval [q1, q2], where q1 < 0 < q2, and can be calculated by Q = q1 + (q2 − q1)*rand. The mean jump amplitude of this process is found from

$$\nu_{mean} = E[J(Q)] = \frac{e^{q_2} - e^{q_1}}{q_2 - q_1} - 1.$$
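The recursion above can be sketched in vectorized form. This is not the book's SimJD() listing; it is a minimal illustration under assumed parameter values, with hypothetical variable names, that simulates many paths with uniform log-jumps and compares the sample mean of the terminal price with $S_0 e^{(\mu_d + \lambda\nu_{mean})T}$, the expectation implied by the zero-one jump form to first order in the time step.

```matlab
% Vectorized sketch of the zero-one jump-diffusion recursion (illustrative values).
rng(0);
S0 = 50;  mu = 0.11;  vol = 0.25;        % initial price, drift, volatility
lambda = 5;  q1 = -0.14;  q2 = 0.15;     % jump rate per year and log-jump bounds
dt = 1/252;  nSteps = 252;  nPaths = 20000;

Z  = randn(nPaths, nSteps);                    % N(0,1) diffusion draws
dN = rand(nPaths, nSteps) < lambda*dt;         % zero-one jump indicator
Q  = q1 + (q2 - q1)*rand(nPaths, nSteps);      % uniform log-jump sizes

logRet = (mu - 0.5*vol^2)*dt + vol*sqrt(dt)*Z + Q.*dN;
S = S0*exp(cumsum(logRet, 2));                 % each row is one simulated path

nuMean   = (exp(q2) - exp(q1))/(q2 - q1) - 1;  % mean jump amplitude
expected = S0*exp((mu + lambda*nuMean)*nSteps*dt);
fprintf('sample mean S_T %.2f, expected %.2f\n', mean(S(:, end)), expected);
```

Generating all paths at once trades memory for speed; a loop over time steps, as in the recursion, gives the same result path by path.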
As mentioned above, the one-jump size is set to this average of the uniform jump distribution by the code logNuP1 = log(nuMean + 1). The expectation or mean of a drift-diffusion process was derived earlier (by completing the square) to be $E[S_t] = S_0 e^{\mu t}$, while the median stock price after one time step would be

$$\mathrm{median}\;S_{t+1} = S_t\,e^{\left(\mu - \frac{1}{2}\sigma^2\right)\Delta t};$$
that is, half the random movements in the asset price will fall above the median value and half below it. The expected return $\mu$ demanded by investors depends on the continuously compounded return $\eta$ and the volatility risk of the stock, $\mu = \eta + \sigma^2/2$. The jump process contributes an additional factor $e^{\nu\lambda\,dt}$ to the expectation. This change in expectation can be confirmed by adding a loop or Matlab vector into the code to generate several thousand instances of the jump-diffusion process. The expectation is shown as a dotted line in Figure 2.1.

2.3. PROBABILITY FUNCTIONS

At this point, it is necessary to define the probability functions invoked throughout this chapter. The probability density function (PDF) of a random variable $x$ drawn from a Gaussian distribution of mean $\mu$ and variance $\sigma^2$ is given by

$$\phi(x; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}}\,e^{-(x-\mu)^2/2\sigma^2} = \frac{1}{\sigma}\,\phi^{(n)}\!\left(\frac{x-\mu}{\sigma}\right),$$

where we have rearranged the variables to use the normalized PDF defined by

$$\phi^{(n)}(x) = \frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}x^2}.$$
The probability $\Phi^{(n)}(x)$ of finding a random variable drawn from a normal ($\mu = 0$, $\sigma^2 = 1$) Gaussian distribution in the interval $(-\infty, x)$ is the normalized cumulative distribution function (CDF), which is given by

$$\Phi^{(n)}(x) = \int_{-\infty}^{x}\phi^{(n)}(s)\,ds = \int_{-\infty}^{x}\frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}s^2}\,ds = \frac{1}{2}\left[1 + \mathrm{erf}\left(\frac{x}{\sqrt{2}}\right)\right].$$
The error function erf is a built-in function in Matlab that can also be calculated using various approximations (Abramowitz and Stegun, 1972). The CDF for a nonzero mean and nonunity variance follows from the normalized CDF as

$$\Phi(x; \mu, \sigma^2) = \Phi^{(n)}\!\left(\frac{x-\mu}{\sigma}\right) = \frac{1}{2}\left[1 + \mathrm{erf}\left(\frac{x-\mu}{\sigma\sqrt{2}}\right)\right].$$
For convenience, an overloaded CDF is defined for the cumulative distribution over $[x, y]$ as

$$\Phi(x, y; \mu, \sigma^2) = \int_x^y \phi(z; \mu, \sigma^2)\,dz = \int_x^y \frac{1}{\sqrt{2\pi\sigma^2}}\,e^{-\frac{(z-\mu)^2}{2\sigma^2}}\,dz.$$
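Numerical integration of this function is unnecessary. Rather, it is more convenient to calculate the cumulative distribution over a finite interval $[x, y]$ as a difference of two CDFs, upper limit minus lower limit, by

$$\Phi(x, y; \mu, \sigma^2) = \Phi(y; \mu, \sigma^2) - \Phi(x; \mu, \sigma^2) = \Phi^{(n)}\!\left(\frac{y-\mu}{\sigma}\right) - \Phi^{(n)}\!\left(\frac{x-\mu}{\sigma}\right).$$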
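The probability functions of this section map directly onto anonymous functions built on the base Matlab erf. The sketch below uses hypothetical helper names (phiN, PhiN, phi, Phi, PhiInt) and illustrative values, and checks the interval CDF against a trapezoidal integration of the PDF.

```matlab
% Gaussian PDF/CDF helpers built on erf (helper names are illustrative).
phiN   = @(x) exp(-0.5*x.^2)/sqrt(2*pi);                    % normalized PDF
PhiN   = @(x) 0.5*(1 + erf(x/sqrt(2)));                     % normalized CDF
phi    = @(x, mu, s2) phiN((x - mu)./sqrt(s2))./sqrt(s2);   % PDF, mean mu, variance s2
Phi    = @(x, mu, s2) PhiN((x - mu)./sqrt(s2));             % CDF, mean mu, variance s2
PhiInt = @(x, y, mu, s2) Phi(y, mu, s2) - Phi(x, mu, s2);   % probability mass on [x, y]

mu = 0.02;  s2 = 0.04;  a = -0.1;  b = 0.3;   % illustrative values
z       = linspace(a, b, 20001);
numeric = trapz(z, phi(z, mu, s2));           % brute-force integral of the PDF
closed  = PhiInt(a, b, mu, s2);               % difference of two CDFs
fprintf('numeric %.8f, closed form %.8f\n', numeric, closed);
```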
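The behavior of the log-return process can be described as a sum of the lognormal drift-diffusion density and the log-uniform jump density. As such, the probability density of the log-return process with a random number of jumps $k$ is

$$\phi_{d\ln S_t}(x) = p_0(\lambda\,dt)\,\phi(x; \mu_{ld}\,dt, \sigma_d^2\,dt) + \sum_{k=1}^{\infty} p_k(\lambda\,dt)\,\frac{\Phi(x - kQ_b,\, x - kQ_a;\, \mu_{ld}\,dt,\, \sigma_d^2\,dt)}{k(Q_b - Q_a)},$$

where the first term is the zero-jump contribution and the last two arguments of $\phi$ and $\Phi$ are the diffusion mean and variance.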
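Following the above definition, the cumulative distribution over $[x - kQ_b,\, x - kQ_a]$ can be evaluated as

$$\Phi(x - kQ_b,\, x - kQ_a;\, \mu_{ld}\,dt,\, \sigma_d^2\,dt) = \Phi^{(n)}\!\left(\frac{x - kQ_a - \mu_{ld}\,dt}{\sigma_d\sqrt{dt}}\right) - \Phi^{(n)}\!\left(\frac{x - kQ_b - \mu_{ld}\,dt}{\sigma_d\sqrt{dt}}\right).$$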
The density of the lognormal-diffusion, log-uniform jump-amplitude model is approximated in the zero- or one-jump form as

$$\phi_{d\ln S_t}(x) = (1 - \lambda\,dt)\,\phi(x; \mu_{ld}\,dt, \sigma_d^2\,dt) + \frac{\lambda\,dt}{Q_b - Q_a}\,\Phi(x - Q_b,\, x - Q_a;\, \mu_{ld}\,dt,\, \sigma_d^2\,dt).$$
The first term on the right is the PDF for a Gaussian drift-diffusion process without jumps and can be evaluated directly or through the normalized PDF by

$$\phi(x; \mu_{ld}\,dt, \sigma_d^2\,dt) = \frac{1}{\sigma_d\sqrt{dt}}\,\phi^{(n)}\!\left(\frac{x - (\mu_d - \sigma_d^2/2)\,dt}{\sigma_d\sqrt{dt}}\right),$$

where $\mu_{ld} = \mu_d - \sigma_d^2/2$. The density functions provide a mechanism to fit the parameters of a jump-diffusion process. For time steps that are not very small, it is advisable (Hanson and Zhu, 2004) to keep the second-order approximation term in the jump component.
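The zero- or one-jump density can be written compactly with the erf-based normal CDF. The sketch below uses assumed annualized parameter values and hypothetical names; it checks that the density integrates to one and that its mean equals $\mu_{ld}\,dt + \mu_j\lambda\,dt$.

```matlab
% Zero-or-one-jump log-return density with uniform jumps (illustrative values).
PhiN = @(x) 0.5*(1 + erf(x/sqrt(2)));          % normalized CDF
phiN = @(x) exp(-0.5*x.^2)/sqrt(2*pi);         % normalized PDF

mu_ld = 0.08;  sig_d = 0.25;  lambda = 5;      % assumed annualized parameters
Qa = -0.14;  Qb = 0.15;  dt = 1/252;
m = mu_ld*dt;  s = sig_d*sqrt(dt);             % diffusion mean and std over one step

diffPart = @(x) phiN((x - m)/s)/s;                                        % no jump
jumpPart = @(x) (PhiN((x - Qa - m)/s) - PhiN((x - Qb - m)/s))/(Qb - Qa);  % one jump
jdPdf    = @(x) (1 - lambda*dt)*diffPart(x) + lambda*dt*jumpPart(x);

x     = linspace(-0.4, 0.4, 40001);
total = trapz(x, jdPdf(x));                    % should be close to 1
meanX = trapz(x, x.*jdPdf(x));                 % should match mu_ld*dt + mu_j*lambda*dt
fprintf('integral %.8f, mean %.3e\n', total, meanX);
```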
2.4. LEAST-SQUARES ESTIMATION

The estimation approach presented below is based on fitting the parameters of a jump-diffusion process to the experimental PDF. The number of data points outside a fixed range of returns is tallied to give the experimental jump frequency, and this number can be divided by the total number of returns to give a probability density normalized to unity. A simple yet fairly accurate approach would be to directly search across the entire parameter space to minimize a least-squares function. As discussed below, a number of numerical tools should be used in combination to improve the estimation. The first improvement is to employ a weighted least squares or $\chi^2$ fit,

$$\chi^2 = \sum_i w_i\left(f_i^{jd} - f_i^{ex}\right)^2,$$

where $w_i$ is the weight of the $i$th bin and $f_i^{ex}$ is the frequency of the experimental data in the $i$th bin. The theoretical frequency $f_i^{jd}$ for the jump-diffusion process is the integral of the PDF over the bin range $[x_i, x_{i+1}]$, given by

$$f_i^{jd} = N\int_{x_i}^{x_{i+1}} \phi^{jd}(x)\,dx.$$
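Hanson and Westman (2002) discuss a weighting scheme to account for the inherent error based on the number density of the data. Treating each bin count as binomial with success probability $f_i^{jd}/N$, the bin frequency variance is

$$\sigma^2_{f_i^{jd}} = \mathrm{var}\left[f_i^{jd}\right] = N\,\frac{f_i^{jd}}{N}\left(1 - \frac{f_i^{jd}}{N}\right) = \left(1 - \frac{f_i^{jd}}{N}\right)f_i^{jd}.$$

The weights are typically normalized to sum to unity by

$$w_i = \frac{1/\sigma^2_{f_i^{jd}}}{\sum_{k=1}^{N(bins)} 1/\sigma^2_{f_k^{jd}}}.$$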
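The weighting scheme can be sketched in a few lines. The bin counts below are hypothetical, and the binomial form of the bin variance is an assumption of this sketch; the weights are the normalized inverse variances, so sparsely populated tail bins receive more weight than the central bins.

```matlab
% Weighted least-squares objective with inverse-variance weights (hypothetical data).
N   = 1000;                           % number of log-returns (illustrative)
fJd = [20 130 700 130 20];            % hypothetical theoretical bin counts
fEx = [25 120 710 128 17];            % hypothetical observed bin counts

binVar = fJd.*(1 - fJd/N);            % assumed binomial variance of each bin count
w      = (1./binVar)/sum(1./binVar);  % weights normalized to sum to unity
chi2   = sum(w.*(fJd - fEx).^2);      % weighted least-squares objective
fprintf('weights: %s\nchi2 = %.4f\n', mat2str(w, 3), chi2);
```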
Synowiec (2008) suggests sorting the empirical log-return data with a range $[x_{min}, x_{max}]$ of length $n$ into $k$ bins,

$$k = \frac{(x_{max} - x_{min})\,n^{1/3}}{2.64\,(q_{0.75} - q_{0.25})} + 1,$$
where $q_\alpha$ is the $\alpha$-quantile of the data.

The second improvement is reducing the parameter space to $\{Q_a, Q_b, \lambda\Delta t\}$. This is accomplished by using the mean $M_1$ and variance $M_2$ of the log-return data to eliminate $\mu_{ld}$ and $\sigma_d^2$ through the constraints

$$M_1 = \mu_{ld}\,\Delta t + \mu_j\lambda\Delta t,$$

which was derived above for the expected drift of the log-return process, and

$$M_2 = \sigma_d^2\,\Delta t + (\sigma_j^2 + \mu_j^2)\lambda\Delta t,$$

where the time step $\Delta t$ is known. Solving for $\mu_{ld}$ and $\sigma_d^2$ as

$$\mu_{ld} = \frac{M_1 - \mu_j\lambda\Delta t}{\Delta t} \quad\text{and}\quad \sigma_d^2 = \frac{M_2 - (\sigma_j^2 + \mu_j^2)\lambda\Delta t}{\Delta t}$$

shows that they are not independent variables. Rather, they are dependent on $M_1$, $M_2$, $\lambda$, and $Q_a$, $Q_b$. The last two, $Q_a$ and $Q_b$, are related to the mean jump by

$$\mu_j = \frac{Q_b + Q_a}{2}$$

and to the jump variance by

$$\sigma_j^2 = \frac{(Q_b - Q_a)^2}{12}.$$
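The reduction of the parameter space amounts to a small mapping from a trial set $\{\lambda, Q_a, Q_b\}$ and the sample moments to the diffusion parameters. The sketch below uses hypothetical sample moments and illustrative jump parameters; the positivity check on $\sigma_d^2$ is a suggestion of this sketch rather than a step stated in the text.

```matlab
% Eliminate mu_ld and sig_d^2 using the first two sample moments (hypothetical inputs).
dt = 1/252;  lambda = 5;  Qa = -0.14;  Qb = 0.15;   % trial jump parameters
M1 = 4.0e-4;  M2 = 4.0e-4;                          % hypothetical sample mean and variance

muJ    = (Qb + Qa)/2;                               % mean of the uniform mark
sigJ2  = (Qb - Qa)^2/12;                            % variance of the uniform mark
mu_ld  = (M1 - muJ*lambda*dt)/dt;                   % implied log-return drift
sig_d2 = (M2 - (sigJ2 + muJ^2)*lambda*dt)/dt;       % implied diffusion variance

if sig_d2 <= 0
    % A trial set that leaves no variance for the diffusion is infeasible.
    warning('trial jump parameters exceed the sample variance');
end
fprintf('mu_ld = %.4f, sig_d = %.4f\n', mu_ld, sqrt(max(sig_d2, 0)));
```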
The function ModelJumpDiffusion(S) takes in a set of asset closing prices S, or simulates a set of price data for a null input, and then converts the price data to a vector of log-returns. The heart of the program is a call to the Matlab function fminsearch, which employs the LikeEval function to fit the parameter set $\{\lambda, Q_a, Q_b\}$. The function has several options to value the difference between the binned empirical distribution and the theoretical density distribution calculated from the $\{\lambda, Q_a, Q_b\}$ parameter set. The first option to be discussed is the $\chi^2$ analysis based on an unweighted or weighted least squares. The known parameters are $\{M_1, M_2, \Delta t\}$.

For nearly all optimization routines, the calculation time and the (possibly local) optimal value of the estimated parameters depend on the quality of the initial guess. The Matlab function fminsearch is quite robust because it uses a simplex search algorithm (Lagarias et al., 1998); however, nearly all optimization routines perform best when started as close as possible to the optimal values. As such, it is beneficial to preestimate all the variables. As discussed above, $\{M_1, M_2, \Delta t\}$ are known, but $\{\lambda, Q_a, Q_b\}$ are not known without $\mu_{ld}$ and $\sigma_d^2$. Clewlow et al. (2001) suggest defining any outlier as a jump and using the remaining data to estimate a standard Gaussian drift-diffusion process, and they applied this procedure recursively to arrive at an estimation. In the ModelJumpDiffusion(S) function, this approach is only used as a preestimation. The number of jumps (or outliers) over the time period gives an estimate of the jump frequency $\lambda$. A typical simulation is shown in Figure 2.2.
[Figure 2.2: price $S_t$ against time in years (2003 to 2010), showing the expected J/D process and the data.]
FIGURE 2.2 Simulation of a drift-diffusion process with uniform jumps.
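The outlier-based preestimation can be sketched as a single pass that flags any log-return beyond three standard deviations as a jump. The data below are synthetic and every name is hypothetical; this is an illustration of the idea, not the book's ModelJumpDiffusion code.

```matlab
% One-pass outlier preestimation on synthetic log-returns (illustrative only).
rng(0);
dt = 1/252;  n = 2520;                          % ten years of daily returns
r  = 0.0003 + 0.015*randn(1, n);                % synthetic diffusion log-returns
jumpIdx = randperm(n, 40);                      % 40 hypothetical jump days
r(jumpIdx) = r(jumpIdx) + 0.12*sign(randn(1, 40));

isJump  = abs(r - mean(r)) > 3*std(r);          % flag moves beyond 3 standard deviations
core    = r(~isJump);                           % returns treated as pure diffusion
lambda0 = sum(isJump)/(n*dt);                   % jumps per year
vol0    = std(core)/sqrt(dt);                   % annualized diffusion volatility
mu0     = mean(core)/dt + 0.5*vol0^2;           % annualized drift
q1_0    = min(r(isJump));                       % crude lower jump bound
q2_0    = max(r(isJump));                       % crude upper jump bound
fprintf('lambda0 %.2f, vol0 %.3f, q1 %.3f, q2 %.3f\n', lambda0, vol0, q1_0, q2_0);
```

These values would serve only as the starting point handed to the optimizer; applying the flagging step repeatedly, as Clewlow et al. do, refines the split between jumps and diffusion.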
[Figure 2.3: (a) frequency and (b) log frequency against log return, each showing the data and the J/D fit.]
FIGURE 2.3 The theoretical distributions calculated from the estimated parameter set versus the empirical binned distribution displayed in (a) linear frequency (JD self-simulation) or (b) log frequency (with least-squares fit).
Outlined above is a two-step process to estimate the jump-diffusion process. The first step involved a brute force classification of jumps as any movement beyond ±3 standard deviations. The drift-diffusion parameters were derived from the remaining data within ±3 standard deviations. These estimated parameters were used as an initial guess to minimize the least-squares objective with Matlab's fminsearch function. The graphs in Figure 2.3 suggest that a proper fit was obtained; however, it is desirable to have a more concrete metric to assess the fit. The ModelJumpDiffusion function also generates the following data. Despite the stochastic nature and limited amount of data, the model parameters are in an acceptable range.
              mu     vol    lambda   q1      q2
Simulated     0.11   0.25   5.00     −0.14   0.15
Estimated     0.19   0.33   4.13     −0.14   0.16
Calculated    0.13   0.24   9.00     −0.12   0.14
2.5. BASIC MOMENTS

A useful approach to evaluate the effectiveness of the fit for peaked and heavy-tailed distributions is to compare the skew and kurtosis values from the data and the theoretical fit. A statistical calculation can extract the skew and kurtosis from the log-return data, while direct analytical formulae are available to calculate the theoretical skew and kurtosis. The four basic moments defined by Hanson et al. (2004) are the first raw moment, and the second, third, and fourth central moments, given by

$$M_1^{jd} = \mu_1^{raw} = E[\Delta\ln S_t] = \mu_{ld}\,\Delta t + \mu_j\lambda\Delta t,$$

$$M_2^{jd} = \mu_2^{central} = E\left[\left(\Delta\ln S_t - M_1^{jd}\right)^2\right] = \mathrm{var}[\Delta\ln S_t] = \sigma_d^2\,\Delta t + \left(\sigma_j^2 + \mu_j^2\right)\lambda\Delta t,$$

$$M_3^{jd} = \mu_3^{central} = E\left[\left(\Delta\ln S_t - M_1^{jd}\right)^3\right] = \left(3\sigma_j^2 + \mu_j^2\right)\mu_j\lambda\Delta t,$$

$$M_4^{jd} = \mu_4^{central} = E\left[\left(\Delta\ln S_t - M_1^{jd}\right)^4\right] = \left(\mu_j^4 + 1.8\sigma_j^4 + 6\mu_j^2\sigma_j^2\right)\lambda\Delta t + 3\left[\sigma_d^2 + \lambda\left(\sigma_j^2 + \mu_j^2\right)\right]^2\Delta t^2.$$

The skew and kurtosis can be directly calculated from the second, third, and fourth central moments as

$$\mathrm{skew} = \frac{M_3^{jd}}{\left(M_2^{jd}\right)^{1.5}}, \qquad \mathrm{kurtosis} = \frac{M_4^{jd}}{\left(M_2^{jd}\right)^2}.$$
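The theoretical skew and kurtosis follow directly from the four moments. The sketch below uses an illustrative parameter set with hypothetical names; the jump terms in the third and fourth moments are $\lambda\Delta t$ times the third and fourth raw moments of the uniform mark, which the sketch also computes by quadrature as a cross-check.

```matlab
% Theoretical skew and kurtosis for uniform jumps (illustrative parameters).
dt = 1/252;  mu_ld = 0.08;  sig_d = 0.25;  lambda = 5;
Qa = -0.14;  Qb = 0.15;

muJ = (Qb + Qa)/2;  sJ2 = (Qb - Qa)^2/12;       % mean and variance of the mark
M1 = mu_ld*dt + muJ*lambda*dt;
M2 = sig_d^2*dt + (sJ2 + muJ^2)*lambda*dt;
M3 = (3*sJ2 + muJ^2)*muJ*lambda*dt;
M4 = (muJ^4 + 1.8*sJ2^2 + 6*muJ^2*sJ2)*lambda*dt ...
     + 3*(sig_d^2 + lambda*(sJ2 + muJ^2))^2*dt^2;

skewJd = M3/M2^1.5;
kurtJd = M4/M2^2;

% Cross-check the jump terms against raw moments of the uniform mark.
q   = linspace(Qa, Qb, 20001);
EQ3 = trapz(q, q.^3)/(Qb - Qa);
EQ4 = trapz(q, q.^4)/(Qb - Qa);
fprintf('skew %.4f, kurtosis %.3f\n', skewJd, kurtJd);
```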
The text output for this calculation shows some variability but again can be considered reasonable given the random nature of a single simulation.

              Skew    Kurtosis
Data          0.921   12.26
Estimated     0.572   16.87
2.6. THE KOLMOGOROV-SMIRNOV TEST

Several goodness-of-fit statistics can be used to compare data to a known distribution. The Kolmogorov–Smirnov test is quite flexible, as it can compare two unknown distributions. The test finds the largest difference $D_n$ in the CDF of two distributions. For our purposes, the test compares the binned empirical CDF to the theoretical jump-diffusion CDF to give a maximum vertical distance as

$$D_n = \sup_x\left|\Phi^{JD}(x) - \Phi^{data}(x)\right|,$$

where $\sup_x$ is the supremum of the set of distances. From this maximum vertical distance, a scaled distance is given as

$$d = \sqrt{n}\,D_n,$$
[Figure 2.4: (a) cumulative distribution against log return, showing $\Phi^{JD}$, $\Phi^{Data}$, and the $\pm d_{\alpha=0.05}$ band; (b) significance $\alpha$ against the scaled maximum CDF error $d = n^{0.5}D_n$, marking the 5% critical level and the JD significance level.]
FIGURE 2.4 (a) CDF of the theoretical jump-diffusion model and the binned CDF of the log-return data. The two dashed lines at $\pm d_\alpha$ represent the boundary of the critical values of the maximum absolute difference at a level of significance of $\alpha = 5\%$. The CDF should be rejected if the empirical CDF falls outside $\pm d_{\alpha=0.05}$. (b) The Kolmogorov–Smirnov significance curve, marking the significance level required to match the maximum absolute difference between the empirical and theoretical CDF.
where $n$ is the sample size. The function GoodFit calculates the distance $D_n = 0.0087$ and scaled distance $d_n = 0.3767$. Graphically, this implies that the largest separation between the theoretical and empirical CDF is $D_n = 0.0087$. In this particular situation, the fit is so close to the theoretical data that the separation in the two lines cannot be distinguished in Figure 2.4. Regardless, the maximum absolute difference $D_n$ is a useful metric to evaluate a fit or to compare distinct fitting procedures or models.

The second part of the Kolmogorov–Smirnov statistic is to calculate the fraction of data points expected to fall within a certain band around the theoretical cumulative distribution, assuming this distribution is correct. In Figure 2.4, the two dashed lines define the $\alpha = 5\%$ critical level. If any data points fall outside this band, the theoretical CDF is rejected at the $\alpha = 5\%$ significance level. The $\alpha = 5\%$ significance level is the value typically used to invalidate a model. The empirical CDF clearly does not cross this $\alpha = 5\%$ critical significance level; thus the model can be accepted as a proper fit to the data. Formally, what is calculated is the probability $Q(d)$ that the scaled distance between the two distributions $[F^{theo}(x), F^{data}(x)]$ is larger than a critical distance $d_\alpha$,

$$Q(d_\alpha) = \Pr\left[\sqrt{n}\,\max\left|F^{theo}(x) - F^{data}(x)\right| > d_\alpha\right] = \alpha.$$

The significance level $\alpha$ is calculated from the critical distance $d_\alpha$ by

$$Q(d_\alpha) = -2\sum_{k=1}^{\infty}(-1)^k e^{-2k^2 d_\alpha^2}.$$
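The significance series is straightforward to evaluate, and its inverse can be found with a one-dimensional root search. The sketch below truncates the series at 100 terms and uses the base Matlab function fzero; the sample size is an assumed illustrative value and the names Qks, dCrit, and Dcrit are hypothetical.

```matlab
% Kolmogorov-Smirnov significance and critical distance (series truncated at 100 terms).
k   = 1:100;
Qks = @(d) -2*sum((-1).^k.*exp(-2*k.^2*d^2));   % significance for a scaled distance d

n     = 1890;                                   % sample size (illustrative)
dCrit = fzero(@(d) Qks(d) - 0.05, [0.5 3]);     % critical scaled distance at alpha = 5%
Dcrit = dCrit/sqrt(n);                          % corresponding unscaled distance

fprintf('critical d = %.4f, critical D = %.4f\n', dCrit, Dcrit);
fprintf('significance at d = 0.6419: %.4f\n', Qks(0.6419));
```

The truncated series converges quickly for scaled distances of the size met in practice; for very small $d$ many more terms would be needed.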
The trend line for the significance level $\alpha$ as a function of critical distance $d_\alpha$ is shown in Figure 2.4. Since this equation cannot be inverted, a parameter search is required to determine the critical level $d_\alpha$ for a given significance level $\alpha$. With modern computing, this calculation is trivial, but several decades ago, tables were used to determine the critical values. One handy approximation for a 5% significance level is

$$D_{\alpha=0.05} = \frac{1.36}{\sqrt{n}}.$$

For this particular fit, $D_\alpha = 0.0313$, which is much larger than $D_n = 0.0087$; therefore, the model is deemed a proper fit to the data. An interesting exercise is to find the level of significance that produces a critical value that matches the maximum absolute difference between the empirical and theoretical CDF. For this data, an extremely high significance of 0.9997 is needed.

Adjusted daily closing prices for Exxon Mobil (XOM) were input into the ModelJumpDiffusion function. Figure 2.5 shows the actual price process as well as the expected or mean of this process based on the estimated parameters. Overlaying the expected process starting in the year 2003 is somewhat misleading since it is based on future data. The actual fitting procedure is based on minimizing the squared distance between the binned log-return frequency data and the theoretical density function. These two lines are graphically shown to be reasonably close in Figure 2.6.

[Figure 2.5: price $S_t$ against time in years (2003 to 2010), showing the expected J/D process and the data.]
FIGURE 2.5 XOM-adjusted close price data with expected mean based on parameters fit to a lognormal diffusion log-uniform jump-amplitude model.

[Figure 2.6: (a) frequency and (b) log frequency against log return, each showing the data and the J/D fit.]
FIGURE 2.6 Linear (a, JD of asset data) and logarithmic (b, with least-squares fit) empirical binned log-return XOM frequency data and theoretical density function calculated based on a least-squares estimation of the jump-diffusion model.

The first line of output data from the program is the preestimated parameters calculated by the procedure of Clewlow et al. (2001). The preestimated data was fed as an initial guess into the Matlab fminsearch function. The simplex minimization of a least-squares objective function generated the second line of data given below.
              mu     vol    lambda   q1      q2
Estimated     0.11   0.27   2.87     −0.15   0.16
Calculated    0.30   0.18   26.71    −0.07   0.06

              Skew     Kurtosis
Data          0.037    14.09
Estimated     −0.703   8.54
Kolmogorov–Smirnov
Dn = 0.0149
Scaled dn = 0.6419
Significance = 0.8045
D-alpha = 0.0317

The computation of skew and kurtosis is sensitive to jumps and other outliers in the data. Here the skew is positive for the data but negative for the theoretical distribution obtained by minimizing the least-squares objective. The Kolmogorov–Smirnov test showed that $D_n < D_{\alpha=0.05}$, so this estimation of the jump-diffusion model can be considered proper for the log-return data series.

2.7. MULTINOMIAL ESTIMATION

An alternate fitting procedure is made by substituting a multinomial maximum likelihood estimation for the least-squares estimation used in the LikeEval objective function (Hanson et al., 2004). A multinomial distribution is the generalization of a binomial distribution to more than two outcomes. The log-likelihood (LH) multinomial function is given by

$$LH(x) = \ln\left[\Phi^{data}(k)\right] = \sum_{b=1}^{n_{bins}}\left[k_b\ln f_b^{jd}(x) - \ln(k_b!) - k_b\ln(n_s)\right] + \ln(n_s!),$$

where $x$ is the vector of unknown jump-diffusion parameters, $n_s$ is the number of samples, $n_{bins}$ is the number of bins, $f_b^{jd}$ is the theoretical frequency of bin $b$, and the empirical data count in bin $b$ is $k_b$ or, in our case, $f_b^{data}$. Neglecting the constant values, the negative of the log-likelihood multinomial function is found as

$$-LH(x) = -\sum_{b=1}^{n_{bins}} f_b^{data}\ln\left[f_b^{jd}(x)\right].$$

[Figure 2.7: (a) frequency and (b) log frequency against log return, each showing the data and the J/D fit.]
FIGURE 2.7 Linear (a, JD of asset data) and logarithmic (b, with multinomial estimation) empirical binned log-return XOM frequency data and theoretical density function calculated based on a multinomial maximum likelihood valuation of the jump-diffusion model.
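The negative log-likelihood objective reduces to a single line once the binned frequencies are available. The sketch below uses hypothetical bin counts and a guard against the logarithm of zero (an addition of this sketch); it illustrates that, for a fixed total count, the objective is smallest when the theoretical frequencies equal the observed ones.

```matlab
% Negative multinomial log-likelihood, constants dropped (hypothetical bin counts).
fData = [17 128 710 120 25];                  % observed bin counts, total 1000
fJd   = [20 130 700 130 20];                  % theoretical bin counts, total 1000

% max(..., realmin) guards against log(0) in empty theoretical bins.
negLH = @(fTheo) -sum(fData.*log(max(fTheo, realmin)));

valModel = negLH(fJd);                        % objective for the trial model
valData  = negLH(fData);                      % lower bound: model equals the data
fprintf('-LH(model) = %.3f, -LH(data) = %.3f\n', valModel, valData);
```

In a fit, the handle would be wrapped so that the theoretical frequencies are recomputed from the trial parameter vector at every call of the optimizer.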
The ModelJumpDiffusion function can be set to employ a log-likelihood multinomial objective function by setting the LSflag to 2 (or any number except for 1). The approach was used to examine the XOM log-return data that was used above. Linear- and log-frequency plots (Fig. 2.7) generated by the ModelJumpDiffusion function show that the tails of the empirical distribution match the fitted distribution well. The output of ModelJumpDiffusion for a multinomial maximum likelihood valuation is given below.

              mu     vol    lambda   q1      q2
Estimated     0.11   0.27   2.87     −0.15   0.16
Calculated    0.11   0.21   5.57     −0.13   0.13

              Skew    Kurtosis
Data          0.037   14.09
Estimated     0.024   21.57
Kolmogorov–Smirnov
Dn = 0.0271
Scaled dn = 1.1619
Significance = 0.1344
D-alpha = 0.0317

There is some discussion in the literature as to the best objective function to minimize to fit a drift-diffusion process with uniform jumps. Estimation procedures are best when adjusted to a particular data type.
2.8. ALTERNATE JUMP MODELS

A few alternate jump distributions have been suggested in the literature to be better for certain data sets. A subset of other important jump distributions is outlined in this section. The underlying motivation has to do with matching the shape of the distribution tail to the jumps in the data as well as the ease of translating the jump-diffusion process into models of asset, futures, and option prices. The best choice should always be judged on a case-by-case basis.

2.8.1. Normal Model

The normal model generates $Q$ with a normal density given by

$$\phi_Q(q) = \phi(q; \mu_j, \sigma_j^2) = \frac{1}{\sqrt{2\pi\sigma_j^2}}\,e^{-\frac{(q-\mu_j)^2}{2\sigma_j^2}},$$

with mean $\mu_j$ and variance $\sigma_j^2$. Merton's work was a seminal exploration into pricing derivatives based on a jump-diffusion model of the underlying asset. Merton augmented the lognormal drift diffusion with a lognormal jump distribution to account for the fat tails observed in asset prices, which cause a volatility smile in the Black–Scholes option price model. The log-return probability density for the jump-diffusion model with lognormal jump amplitude is

$$\phi_{d\ln S_t}(x) = p_0(\lambda\,dt)\,\phi(x; \mu_{ld}\,dt, \sigma_d^2\,dt) + \sum_{k=1}^{\infty} p_k(\lambda\,dt)\,\phi(x; \mu_{ld}\,dt + \mu_j k,\, \sigma_d^2\,dt + \sigma_j^2 k),$$

in which the first term is the zero-jump contribution,
where $p_k(\lambda\,dt) = P(dN_t = k)$ (Synowiec, 2008). Hanson and Zhu (2004) advocate a second-order approximation to improve the estimation, as the second-order approximation contributed 23% relative to the first. The second-order approximation to a $[x_1, x_2]$ bin probability distribution is

$$\Phi_{d\ln S_t}(x_1, x_2) = \frac{\sum_{k=0}^{2} p_k(\lambda\,dt)\,\Phi(x_1, x_2;\, \mu_{ld}\,dt + \mu_j k,\, \sigma_d^2\,dt + \sigma_j^2 k)}{\sum_{j=0}^{2} p_j(\lambda\,dt)}.$$
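The second-order bin probability for normally distributed jumps can be sketched with truncated and renormalized Poisson weights. The parameter values and names below are illustrative assumptions; the checks confirm that the probabilities over adjoining bins add up and that the whole real line carries unit mass.

```matlab
% Second-order (k = 0, 1, 2) bin probability for normal jumps (illustrative values).
PhiN = @(x) 0.5*(1 + erf(x/sqrt(2)));             % normalized CDF
dt = 1/252;  mu_ld = 0.08;  sig_d = 0.25;
lambda = 5;  muJ = -0.01;  sigJ = 0.05;

k  = 0:2;
pk = exp(-lambda*dt)*(lambda*dt).^k./factorial(k); % Poisson weights for 0, 1, 2 jumps
pk = pk/sum(pk);                                   % renormalize the truncated weights
m  = mu_ld*dt + muJ*k;                             % component means
s  = sqrt(sig_d^2*dt + sigJ^2*k);                  % component standard deviations

% Probability mass of the log-return in the bin [x1, x2].
binProb = @(x1, x2) sum(pk.*(PhiN((x2 - m)./s) - PhiN((x1 - m)./s)));

fprintf('P[-0.01 < x < 0.01] = %.4f\n', binProb(-0.01, 0.01));
```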
The four basic moments as listed by Hanson and Zhu (2004) for a jump-diffusion model with lognormal jump amplitude are given by

$$M_1^{njd} = \mu_1^{raw} = E[\Delta\ln S_t] = \mu_{ld}\,\Delta t + \mu_j\lambda\Delta t,$$

$$M_2^{njd} = \mu_2^{central} = E\left[\left(\Delta\ln S_t - M_1^{njd}\right)^2\right] = \mathrm{var}[\Delta\ln S_t] = \sigma_d^2\,\Delta t + (\sigma_j^2 + \mu_j^2)\lambda\Delta t,$$

$$M_3^{njd} = \mu_3^{central} = E\left[\left(\Delta\ln S_t - M_1^{njd}\right)^3\right] = (3\sigma_j^2 + \mu_j^2)\mu_j\lambda\Delta t,$$

$$M_4^{njd} = \mu_4^{central} = E\left[\left(\Delta\ln S_t - M_1^{njd}\right)^4\right] = (\mu_j^4 + 3\sigma_j^4 + 6\mu_j^2\sigma_j^2)\lambda\Delta t + 3\left[\sigma_d^2 + \lambda(\sigma_j^2 + \mu_j^2)\right]^2\Delta t^2.$$

2.8.2. Double Exponential Model

Kou (2002) modeled the high central peak and the wider overall dispersion of returns with a jump component drawn from a Laplace double exponential distribution. Kou showed that the simplicity of the model allowed analytical tractability for path-dependent options and interest rate derivatives. This analytical tractability mainly arose from the memoryless property of the double exponential distribution. The jump-amplitude mark Q of the double exponential model is

φQ(x) = p1 η1 e^(η1 x) I{x [...]

    [...] R)
        S=S'; % flip S=(column) to S'=(rows)
      end
      LnS=log(S);
    end
    if (nargin == 0), % self-simulation
      Szero=50;
      mu=0.11;  % drift
      vol=0.25; % volatility
      musig2=mu-0.5*vol^2
      lambda=5; % rate of jumps per year = intensity of Poisson process
      q1=-0.14; q2=0.15; % max jump down, up
      nuMean=(exp(q2)-exp(q1))/(q2-q1) - 1
      % nuMean = average jump size measured relative to previous stock price
      % logNuP1=log(nuMean+1) % drift of ln(jumps)
      Years=7.5; steps = Years*252
      lambdadt=lambda*dt; TimeLength=dt*steps;
      time = linspace(0,TimeLength,steps); % years
      S(1)=Szero; %LnS(1)=log(Szero);
      rand('state', 0); randn('state', 0);
      UniDist=rand(1,steps);
      % Hanson suggests using center to avoid end bias in distribution
      % jumpleft=(1-lambda*dt)/2;jumpright=(1-jumpleft);
      for i=2:steps % Calculate simulated price process
        if (lambdadt>UniDist(i))
          % if ((UniDist(i)>=jumpleft)&&(UniDist(i) [...]

[...]

$$C_{bc} = e^{-r(T-t)}\left[1 - \Phi(-d_2)\right],$$

where

$$d_2 = \frac{\ln\left(\frac{S_t}{K}\right) + \left(r - \frac{\sigma^2}{2}\right)(T-t)}{\sigma\sqrt{T-t}}.$$
Since $\Phi(d_2) = 1 - \Phi(-d_2)$, the price of the binary cash-or-nothing call is

$$C_{bc}(S_t, t) = Q e^{-r(T-t)}\,\Phi(d_2).$$

For brevity, other binary options are listed only by their Black–Scholes style valuation equations. The price of an asset-or-nothing binary call is

$$C_{ba}(S_t, t) = S e^{-q(T-t)}\,\Phi(d_1).$$

The price of a cash-or-nothing binary put is

$$P_{bc}(S_t, t) = Q e^{-r(T-t)}\,\Phi(-d_2).$$

The price of an asset-or-nothing binary put is

$$P_{ba}(S_t, t) = S e^{-q(T-t)}\,\Phi(-d_1).$$

Binary option prices are highly dependent on the volatility skew, which makes them a useful tool for examining it.
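The four binary option formulas can be evaluated together and checked against each other. The sketch below assumes a continuous dividend yield $q$ in the drift of $d_1$ and $d_2$ (with $q = 0$ it reduces to the $d_2$ given in the text), takes $d_1 = d_2 + \sigma\sqrt{T-t}$, and uses illustrative inputs with hypothetical variable names.

```matlab
% Binary option prices in Black-Scholes form (illustrative inputs).
PhiN = @(x) 0.5*(1 + erf(x/sqrt(2)));      % normalized CDF
S = 100;  K = 105;  r = 0.03;  q = 0.01;   % spot, strike, rate, assumed dividend yield
sigma = 0.2;  tau = 0.5;  Qcash = 10;      % volatility, time to expiry T - t, cash payout

d1 = (log(S/K) + (r - q + 0.5*sigma^2)*tau)/(sigma*sqrt(tau));
d2 = d1 - sigma*sqrt(tau);

Cbc = Qcash*exp(-r*tau)*PhiN(d2);          % cash-or-nothing call
Pbc = Qcash*exp(-r*tau)*PhiN(-d2);         % cash-or-nothing put
Cba = S*exp(-q*tau)*PhiN(d1);              % asset-or-nothing call
Pba = S*exp(-q*tau)*PhiN(-d1);             % asset-or-nothing put
fprintf('Cbc %.4f, Pbc %.4f, Cba %.4f, Pba %.4f\n', Cbc, Pbc, Cba, Pba);
```

A cash-or-nothing call and put together pay the cash amount with certainty, so their prices sum to the discounted payout; the asset-or-nothing pair sums to the dividend-discounted spot in the same way.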
3.9. MERTON JUMP-DIFFUSION OPTION PRICE

Shortly after the development of the Black–Scholes option valuation formula, Merton (1976) developed a jump-diffusion model to account for the excess kurtosis and negative skewness observed in log-return data (Matsuda, 2004). This jump process is appended to the Black–Scholes geometric stochastic differential equation to give

$$dS_t = S_t\left[(\alpha - \lambda k)\,dt + \sigma\,dW_t + dJ_t\right].$$

The asset price change is described by a continuous Wiener process augmented with a Poisson-driven jump process. Merton argued that the rare and random arrival of information shocked the price process into noncontinuous jumps. The percentage change in asset price when a jump occurs within a short time period is described by

$$dJ_t = d\left(\sum_{n=1}^{N_t}(y_n - 1)\right),$$

where $dJ_t > -1$; otherwise a large negative jump in $dS_t$ could create a nonsensical negative asset value. Matsuda (2004) shows that a jump changes the asset price from $S_t$ to $y_t S_t$. To avoid an unreasonable negative asset value, the absolute price jump size $y_t$ must be greater than zero, that is, $y_t > 0$. Therefore, the jump causes a percentage change in asset price of

$$\%\text{change} = \frac{dS_t}{S_t} = \frac{y_t S_t - S_t}{S_t} = y_t - 1 = J_t.$$
Given that the relative price jump size is $J_t = y_t - 1$ and that the absolute price jump size must be greater than zero, $y_t > 0$, it follows that $J_t > -1$. Merton assumes that the absolute price jump size $y_t$ has a lognormal distribution. In other words, the logarithm of the absolute price jump size is normally distributed,

$$\ln y_t = \ln\left(\frac{y_t S_t}{S_t}\right) = Y_t \sim N(\mu_j, \sigma_j^2),$$

which implies $J_t = e^{Y_t} - 1$. The jump process $J_t$ is actually a compound Poisson process such that the size of the $n$th jump of $J$ is equal to $y_n - 1 = e^{Y_n} - 1$. By the definition of a lognormal distribution, the expected value is $E[y_t] = e^{\mu_j + \frac{1}{2}\sigma_j^2}$ and the variance is $E[(y_t - E[y_t])^2] = e^{2\mu_j + \sigma_j^2}\left(e^{\sigma_j^2} - 1\right)$, or, equivalently (Matsuda, 2004),

$$y_t \sim \mathrm{LogNormal}\left(e^{\mu_j + \frac{1}{2}\sigma_j^2},\; e^{2\mu_j + \sigma_j^2}\left(e^{\sigma_j^2} - 1\right)\right).$$
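The relative price jump size, $J_t = y_t - 1$, has a lognormal distribution with the same variance, but the expected value is shifted by 1, $k \equiv e^{\mu_j + \frac{1}{2}\sigma_j^2} - 1$; thus

$$J_t = y_t - 1 \sim \mathrm{LogNormal}\left(k,\; e^{2\mu_j + \sigma_j^2}\left(e^{\sigma_j^2} - 1\right)\right), \qquad k \equiv e^{\mu_j + \frac{1}{2}\sigma_j^2} - 1.$$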
The one-dimensional Poisson process $dN_t$ is discontinuous with a constant jump rate $\lambda$. The simulated frequency of jumps over a time period $dt$ is determined by $dN_t$, which has a discrete Poisson distribution given as

$$p_n(\lambda\,dt) = \mathrm{Probability}[dN_t = n] = \frac{e^{-\lambda\,dt}(\lambda\,dt)^n}{n!}.$$
The rate $\lambda$ defines the expected number of jumps per unit time, and since the relative price jump size, $J_t = y_t - 1$, has an expected value of $k = e^{\mu_j + \frac{1}{2}\sigma_j^2} - 1$, the expected relative price change of the jump component is $E[(y_t - 1)\,dN_t] = \lambda k\,dt$. This jump-component drift is compensated by reducing the drift term in the stochastic differential equation. The geometric stochastic differential equation

$$dS_t = S_t\left[(\alpha - \lambda k)\,dt + \sigma\,dW_t + (y_t - 1)\,dN_t\right]$$

is transformed into a log return $G = \ln S_t$ by applying Ito's formula for a jump-diffusion process (Cont, 2004) to give

$$dG = d\ln S_t = \left[(\alpha - \lambda k) - \frac{1}{2}\sigma^2\right]dt + \sigma\,dW_t + \ln y_t\,dN_t.$$

The stochastic integration (Matsuda, 2004) of this jump-diffusion process is

$$\ln S_t = \ln S_0 + \left(\alpha - \lambda k - \frac{1}{2}\sigma^2\right)t + \sigma W_t + \sum_{n=1}^{N_t}\ln y_n$$

or, equivalently, with $\ln y_n = Y_n$,

$$S_t = S_0\exp\left[\left(\alpha - \lambda k - \frac{1}{2}\sigma^2\right)t + \sigma W_t + \sum_{n=1}^{N_t}\ln y_n\right].$$

Given that $\prod_{n=1}^{N_t} y_n = \exp\left(\sum_{n=1}^{N_t}\ln y_n\right)$, a similar form for the jump-diffusion process is

$$S_t = S_0\exp\left[\left(\alpha - \lambda k - \frac{1}{2}\sigma^2\right)t + \sigma W_t\right]\prod_{n=1}^{N_t} y_n.$$
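The log return for any time period is given by

$$x_t = \ln\left(\frac{S_t}{S_0}\right) = \left(\alpha - \lambda k - \frac{1}{2}\sigma^2\right)t + \sigma W_t + \sum_{n=1}^{N_t} Y_n,$$

where the compound Poisson jumps add negative skewness and excess kurtosis to the distribution (Matsuda, 2004). The probability density of the Merton jump-diffusion process is the summation

$$\phi(x_t) = \sum_{n=0}^{\infty}\frac{e^{-\lambda t}(\lambda t)^n}{n!}\,N\!\left(x_t;\; \left(\alpha - \lambda k - \frac{1}{2}\sigma^2\right)t + n\mu_j,\; \sigma^2 t + n\sigma_j^2\right),$$

where the first term in the summation is the probability of $n$ jumps and the second term is the normal density given by

$$N\!\left(x_t;\; \left(\alpha - \lambda k - \frac{\sigma^2}{2}\right)t + n\mu_j,\; \sigma^2 t + n\sigma_j^2\right) = \frac{1}{\sqrt{2\pi(\sigma^2 t + n\sigma_j^2)}}\exp\left\{-\frac{\left[x_t - \left(\left(\alpha - \lambda k - \frac{\sigma^2}{2}\right)t + n\mu_j\right)\right]^2}{2(\sigma^2 t + n\sigma_j^2)}\right\}.$$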
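The Merton density is a Poisson-weighted mixture of normal densities and can be evaluated by truncating the sum. The sketch below uses illustrative parameter values, a truncation at 30 jumps, and hypothetical names; it checks that the density integrates to one and that its mean equals $(\alpha - \lambda k - \sigma^2/2)t + \lambda t\mu_j$.

```matlab
% Merton log-return density as a truncated Poisson mixture (illustrative values).
alpha = 0.10;  sigma = 0.2;  lambda = 1;
muJ = -0.1;  sigJ = 0.15;  t = 0.25;

k    = exp(muJ + 0.5*sigJ^2) - 1;                    % expected relative jump size
nMax = 30;  n = (0:nMax)';                           % truncate the infinite sum
pn   = exp(-lambda*t)*(lambda*t).^n./factorial(n);   % Poisson probability of n jumps
m    = (alpha - lambda*k - 0.5*sigma^2)*t + n*muJ;   % mean given n jumps
v    = sigma^2*t + n*sigJ^2;                         % variance given n jumps

x    = linspace(-3, 3, 6001);
pdfX = zeros(size(x));
for j = 1:numel(n)
    pdfX = pdfX + pn(j)*exp(-(x - m(j)).^2/(2*v(j)))/sqrt(2*pi*v(j));
end

total = trapz(x, pdfX);                              % should be close to 1
meanX = trapz(x, x.*pdfX);                           % mean log return
fprintf('integral %.6f, mean %.6f\n', total, meanX);
```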
3.10. MERTON OPTION IN A HEDGED PORTFOLIO
It is well known that an effective method to value options in the Black–Scholes framework is to create an arbitrage-free portfolio consisting of an option and its underlying asset and to compare the portfolio's return to that of a risk-free asset, for example, a US Treasury bill. The same approach can be used to value an option where the asset follows a process described by Merton's jump-diffusion model. A portfolio $P$ with one long position on an option $V(S, t)$ and a short position in $\Delta$ shares of the underlying asset has a value described by $P_t = V(S_t, t) - \Delta S_t$. The change in the portfolio value over a short time period is given by $dP_t = dV(S_t, t) - \Delta\,dS_t$. The differential price process of the asset, $dS_t$, is available,

$$dS_t = S_t\left[(\alpha - \lambda k)\,dt + \sigma\,dW_t + (y_t - 1)\,dN_t\right];$$

however, developing the price process of the option based on the underlying asset, $dV(S_t, t)$, requires Ito's lemma.
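As discussed by Matsuda (2004), extending the standard Ito's lemma for a drift-diffusion process to a finite-activity jump-diffusion process merely requires the addition of the term $[f(X_{t-} + \Delta X_t) - f(X_{t-})]$. Generally, a function $f(t, X_t)$ based on a finite-activity process $X$, with a differential price process $dX_t = b_t\,dt + \sigma_t\,dW_t + dJ_t$, is described by

$$df = \left(\frac{\partial f}{\partial t} + b_t\frac{\partial f}{\partial x} + \frac{1}{2}\sigma_t^2\frac{\partial^2 f}{\partial x^2}\right)dt + \sigma_t\frac{\partial f}{\partial x}\,dW_t + \left[f(X_{t-} + \Delta X_t) - f(X_{t-})\right].$$

Specific to our needs, Ito's lemma gives the option value in differential form as

$$dV(S_t, t) = \left[\frac{\partial V}{\partial t} + (\alpha - \lambda k)S_t\frac{\partial V}{\partial S_t} + \frac{1}{2}\sigma^2 S_t^2\frac{\partial^2 V}{\partial S_t^2}\right]dt + \sigma S_t\frac{\partial V}{\partial S_t}\,dW_t + \left[V(y_t S_t, t) - V(S_t, t)\right]dN_t.$$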
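Substituting for $dS_t$ and $dV(S_t, t)$ gives the change in the portfolio value over a short time period as

$$dP_t = \left[\frac{\partial V}{\partial t} + (\alpha - \lambda k)S_t\frac{\partial V}{\partial S_t} + \frac{1}{2}\sigma^2 S_t^2\frac{\partial^2 V}{\partial S_t^2}\right]dt + \sigma S_t\frac{\partial V}{\partial S_t}\,dW_t + \left[V(y_t S_t, t) - V(S_t, t)\right]dN_t - \Delta\left\{S_t\left[(\alpha - \lambda k)\,dt + \sigma\,dW_t + (y_t - 1)\,dN_t\right]\right\}.$$

Grouping like terms gives

$$dP_t = \left[\frac{\partial V}{\partial t} + (\alpha - \lambda k)S_t\frac{\partial V}{\partial S_t} + \frac{1}{2}\sigma^2 S_t^2\frac{\partial^2 V}{\partial S_t^2} - \Delta S_t(\alpha - \lambda k)\right]dt + \left(\sigma S_t\frac{\partial V}{\partial S_t} - \Delta\,\sigma S_t\right)dW_t + \left[V(y_t S_t, t) - V(S_t, t) - \Delta S_t(y_t - 1)\right]dN_t.$$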
The usual approach in the Black–Scholes derivation is to eliminate the stochastic component dWt to eliminate risk by nullifying the two terms in ⎛
∂V = ∂S
t
⎞
⎟ ⎜ ⎜σt St ∂V − σt St ⎟ . ⎝ ⎠ ∂S t
The jump-diffusion model has the added variation from the Poisson jump process, $dN_t \neq 0$, that remains when the number of shares of the asset is set as $\Delta = \partial V/\partial S_t$; that is, the market is incomplete as no choice of $\Delta$ creates a riskless portfolio. Merton assumed that jumps are uncorrelated with the marketplace, that is, the jump component represents nonsystematic risk. Therefore, it is possible to eliminate the jump risk by diversification and assume the mean return of the portfolio is the risk-free rate, $E[dP_t] = rP_t\,dt$. Creating a (continuously) hedged portfolio by setting the number of shares of the asset as $\Delta = \partial V/\partial S_t$ also eliminates the common drift terms
$$(\alpha-\lambda k)S_t\frac{\partial V}{\partial S_t} - \Delta S_t(\alpha-\lambda k).$$
Examining the expected return gives
$$rP_t\,dt = E[dP_t] = E\left[\left(\frac{\partial V}{\partial t} + \frac{1}{2}\sigma_t^2S_t^2\frac{\partial^2 V}{\partial S_t^2}\right)dt + \left(V(y_tS_t,t) - V(S_t,t) - \frac{\partial V}{\partial S_t}S_t(y_t-1)\right)dN_t\right].$$
Substituting the portfolio value $P_t = V(S_t,t) - \Delta S_t = V(S_t,t) - \frac{\partial V}{\partial S_t}S_t$, and since the expectation is only necessary over the jump component, the expected return is given as
$$r\left(V(S_t,t) - \frac{\partial V}{\partial S_t}S_t\right)dt = \left(\frac{\partial V}{\partial t} + \frac{1}{2}\sigma_t^2S_t^2\frac{\partial^2 V}{\partial S_t^2}\right)dt + E\left[V(y_tS_t,t) - V(S_t,t) - \frac{\partial V}{\partial S_t}S_t(y_t-1)\right]E[dN_t].$$
The expected jump rate is by definition $E[dN_t] = \lambda\,dt$. Eliminating the common $dt$ terms gives
$$r\left(V(S_t,t) - \frac{\partial V}{\partial S_t}S_t\right) = \frac{\partial V}{\partial t} + \frac{1}{2}\sigma_t^2S_t^2\frac{\partial^2 V}{\partial S_t^2} + \lambda E\left[V(y_tS_t,t) - V(S_t,t) - \frac{\partial V}{\partial S_t}S_t(y_t-1)\right].$$
Rearranging the terms gives Merton's PDE as
$$0 = \frac{\partial V}{\partial t} + \frac{1}{2}\sigma_t^2S_t^2\frac{\partial^2 V}{\partial S_t^2} + rS_t\frac{\partial V}{\partial S_t} - rV + \lambda E[V(y_tS_t,t) - V(S_t,t)] - \lambda E\left[\frac{\partial V}{\partial S_t}S_t(y_t-1)\right].$$
To ease the derivation of the option price from the PDE, Merton purposely assumed a price jump size $y_t$ that has a lognormal distribution, that is, the logarithm of the absolute price jump size is normally distributed, $\ln y_t = Y_t \sim N(\mu_j,\sigma_j^2)$. It follows that the expected relative jump size in the last term of the PDE is $E[y_t - 1] = e^{\mu_j+\frac{1}{2}\sigma_j^2} - 1 = k$. Merton (1976) did prove that this PDE solves for the option price; however, deriving the option price is more elegant when employing an equivalent martingale measure.
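The expected relative jump size under the lognormal assumption can be checked numerically. The following is a minimal sketch, not part of the original scripts; the variable names (pdfY, kNumeric, kExact) are illustrative, and the jump parameters are those quoted in Figure 3.1.

```matlab
% Check E[y - 1] = exp(muJ + 0.5*sigmaJ^2) - 1 for a lognormal jump size y = exp(Y)
muJ = 0.1; sigmaJ = 0.1;               % jump parameters from Figure 3.1
% normal density of the log jump size Y = ln(y)
pdfY = @(Y) exp(-(Y-muJ).^2/(2*sigmaJ^2))/(sigmaJ*sqrt(2*pi));
% integrate (e^Y - 1) against the density over +/- 10 standard deviations
kNumeric = quadgk(@(Y) (exp(Y)-1).*pdfY(Y), ...
    muJ-10*sigmaJ, muJ+10*sigmaJ, 'RelTol', 1e-10);
kExact = exp(muJ + 0.5*sigmaJ^2) - 1;  % closed-form mean relative jump size k
disp([kNumeric kExact]);
```

The integration range is truncated at ten standard deviations, where the neglected tail mass is far below the quadrature tolerance.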
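3.11. MARTINGALE DERIVATION OF MERTON OPTION PRICE
Earlier, the evolution of asset prices in Merton's model was shown to be
$$S_t = S_0\exp\left[\left(\alpha - \lambda k - \frac{1}{2}\sigma_d^2\right)t + \sigma_dW_t + \sum_{n=1}^{N_t}Y_n\right].$$
In a risk-neutral world $Q$, the rate of growth must be equivalent to the risk-free rate less the dividend yield $q$. Substituting this relation and $k \equiv e^{\mu_j+\frac{1}{2}\sigma_j^2} - 1$ gives
$$S_t = S_0\exp\left[\left(r - q - \lambda k - \frac{1}{2}\sigma_d^2\right)t + \sigma_dW_t + \sum_{n=1}^{N_t}Y_n\right].$$
For $n = N_t$, the jump process by itself is described by $n\mu_j + \sqrt{n\sigma_j^2/t}\,W_t$, and when convolved with the drift and diffusion terms gives
$$S_t = S_0\exp\left[\left(r - q - \lambda\left(e^{\mu_j+\frac{1}{2}\sigma_j^2} - 1\right) - \frac{1}{2}\sigma_d^2\right)t + n\mu_j + \sqrt{\sigma_d^2 + \frac{n\sigma_j^2}{t}}\,W_t\right],$$
where the last term is defined as an effective (or Merton) volatility $\sigma_M = \sqrt{\sigma_d^2 + n\sigma_j^2/t}$. Substituting the equivalent $\sigma_d^2 = \sigma_M^2 - n\sigma_j^2/t$ gives
$$S_t = S_0\exp\left[\left(r - q - \lambda\left(e^{\mu_j+\frac{1}{2}\sigma_j^2} - 1\right) + \frac{n\mu_j}{t} - \frac{1}{2}\left(\sigma_M^2 - \frac{n\sigma_j^2}{t}\right)\right)t + \sigma_MW_t\right]$$
$$S_t = S_0\exp\left[\left(r - q - \lambda\left(e^{\mu_j+\frac{1}{2}\sigma_j^2} - 1\right) + \frac{2n\mu_j + n\sigma_j^2}{2t} - \frac{1}{2}\sigma_M^2\right)t + \sigma_MW_t\right].$$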
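The standard value of a call option is given by $C = e^{-rT}E^T[\max(S_T - K, 0)]$. Summing the value of 0, 1, or several jumps gives
$$C = e^{-rT}\sum_{n=0}^{\infty}\frac{e^{-\lambda T}(\lambda T)^n}{n!}E^T[\max(S_T^n - K, 0)].$$
A lognormal asset price shifted by $n$ jumps is related to the Black–Scholes equation by
$$E^T[\max(S_T^n - K, 0)] = e^{r_nT}C_n^{BS}(S,K,T,r_n,\sigma_M^2),$$
and the drift of the $n$th jump is
$$r_n = r - \lambda\left(e^{\mu_j+\frac{1}{2}\sigma_j^2} - 1\right) + \frac{2n\mu_j + n\sigma_j^2}{2T}.$$
Substituting gives
$$C = e^{-rT}\sum_{n=0}^{\infty}\frac{e^{-\lambda T}(\lambda T)^n}{n!}\,e^{\left[r - \lambda\left(e^{\mu_j+\frac{1}{2}\sigma_j^2} - 1\right) + \frac{2n\mu_j+n\sigma_j^2}{2T}\right]T}\,C_n^{BS}(S,K,T,r_n,\sigma_M^2)$$
$$= e^{-\lambda e^{\mu_j+\frac{1}{2}\sigma_j^2}T}\sum_{n=0}^{\infty}\frac{\left(\lambda e^{\mu_j+\frac{1}{2}\sigma_j^2}T\right)^n}{n!}\,C_n^{BS}\left(S,K,T,\;r - \lambda\left(e^{\mu_j+\frac{1}{2}\sigma_j^2} - 1\right) + \frac{2n\mu_j+n\sigma_j^2}{2T},\;\sigma_M^2\right).$$
The effective probability of jumps is $\bar{\lambda} = \lambda(1+k) = \lambda e^{\mu_j+\frac{1}{2}\sigma_j^2}$. Merton's jump-diffusion model has a series approximation for the price of a European vanilla call. The option price is based on the first few terms composed of the Black–Scholes price adjusted for the number of jumps $n$ as given by
$$C = e^{-\bar{\lambda}T}\sum_{n=0}^{\infty}\frac{(\bar{\lambda}T)^n}{n!}\,C_n^{BS}(S,K,T,r_n,\sigma_{nM})$$
$$\sigma_{nM} = \sqrt{\sigma_{BS}^2 + \frac{n\sigma_j^2}{T}}$$
$$\bar{\lambda} = \lambda(1+k) = \lambda e^{\mu_j+\frac{1}{2}\sigma_j^2}$$
$$r_n = r - \lambda k + \frac{n\ln(1+k)}{T} = r - \lambda\left(e^{\mu_j+\frac{1}{2}\sigma_j^2} - 1\right) + \frac{n\left(\mu_j+\frac{1}{2}\sigma_j^2\right)}{T},$$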
which can be truncated for some reasonable number of jumps. The script MertonScript calculates a series of call prices via the Black–Scholes equation in the function BlackScholesCall and the Merton series approximation in the function MertonSeriesCall. The output is displayed in Figure 3.1. For illustration, the diffusion volatility in the Merton model is assumed to be the same as that in the Black–Scholes model. The addition of a jump component logically increases the value of the call option when the current asset price is near the strike price. Deep out-of-the-money or in-the-money call options collapse to zero value or their intrinsic value, respectively, regardless of the model.

FIGURE 3.1 Call price as a function of the time-zero asset price $S_0$ calculated by the Black–Scholes equation and the Merton jump-diffusion series approximation (K = 100, T = 1; Black–Scholes: r = 0.05, σ_BS = 0.2; Merton: μ_Jump = 0.1, σ_Jump = 0.1, λ = 0.5).

The importance of the Merton model is that the instantaneous nature of the jumps can create a steep short-term smile. The smile tends to flatten at a longer time horizon as the diffusion component, with a variance growing as $\sigma^2T$, is more important than a small number of Poisson-generated jumps. Increasing the expected jump frequency controlled via $\lambda$ will increase the steepness of the smile across all time horizons. In contrast, stochastic volatility models can only create a steep short-term smile when the volatility of volatility is large (Derman, 2008).
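The series approximation can be sketched in a few vectorized lines. This is a minimal, self-contained illustration and not the appendix code: the helper names Ncdf and bsCall, the truncation at 20 jumps, and the single evaluation at $S_0 = K = 100$ are assumptions made here; the remaining parameters are those quoted in Figure 3.1, with no dividends.

```matlab
% Merton series call price versus Black-Scholes (at the money, no dividends)
Ncdf = @(z) 0.5*erfc(-z/sqrt(2));      % standard normal CDF without toolboxes
bsCall = @(S,K,T,r,vol) ...
    S.*Ncdf((log(S./K)+(r+0.5*vol.^2).*T)./(vol.*sqrt(T))) ...
    - K.*exp(-r.*T).*Ncdf((log(S./K)+(r-0.5*vol.^2).*T)./(vol.*sqrt(T)));
S = 100; K = 100; T = 1; r = 0.05; volBS = 0.2;
muJ = 0.1; sigmaJ = 0.1; lambda = 0.5;
k = exp(muJ + 0.5*sigmaJ^2) - 1;       % mean relative jump size
lambdaBar = lambda*(1 + k);            % effective jump intensity
n = 0:20;                              % number of jumps kept in the series
volN = sqrt(volBS^2 + n*sigmaJ^2/T);   % volatility conditional on n jumps
rN = r - lambda*k + n*log(1 + k)/T;    % drift conditional on n jumps
w = exp(-lambdaBar*T)*(lambdaBar*T).^n./factorial(n);  % Poisson weights
cMerton = sum(w.*bsCall(S,K,T,rN,volN));
cBS = bsCall(S,K,T,r,volBS);
disp([cMerton cBS]);
```

Because the Poisson weights decay factorially, a handful of terms already captures almost all of the probability mass for the jump intensity used here.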
3.12. HESTON'S STOCHASTIC VOLATILITY MODEL
The Heston (1993) model is similar to the Black–Scholes model except the variance is stochastic and follows a Cox–Ingersoll–Ross (CIR) mean-reversion process. The Heston model dynamics are summarized by three equations
$$dS_t = \mu S_t\,dt + \sqrt{v_t}\,S_t\,dW_1$$
$$dv_t = \kappa(\eta - v_t)\,dt + \sigma\sqrt{v_t}\,dW_2$$
$$E[dW_1\,dW_2] = \rho\,dt,$$
where $\kappa$ is the variance reversion rate, $\eta$ is the mean-reversion or long-term variance level, $\sigma$ is the volatility of variance, the two Wiener processes are correlated by a factor $\rho$, and $v_0$ is the initial variance.
The call price is given by the expectation
$$c = e^{-r\tau}E[\max(S_T - K, 0)] = e^{-r\tau}E[(S_T - K)^+],$$
where $\tau = T - t$. Following Rouah (2011), an alternate form for the call price is
$$c = e^{-r\tau}E^Q[S_T\mathbf{1}_{S_T>K}] - e^{-r\tau}KE^Q[\mathbf{1}_{S_T>K}],$$
where $\mathbf{1}$ is the indicator function and both expectations are taken under the risk-neutral measure $Q$. The expectation is removed by rewriting the call price as
$$c = e^{x_t}P_1^P(x,v,\tau) - e^{-r\tau}KP_2^Q(x,v,\tau),$$
where it is shown below that both $P(x,v,\tau)$ terms are the probabilities of the call finishing in the money; however, $P_1^P(x,v,\tau)$ is defined under the measure $P$, in which the asset is the numeraire, and $P_2^Q(x,v,\tau)$ is defined under the risk-neutral measure $Q$, in which the riskless bond is the numeraire. The link between both $P(x,v,\tau)$ terms is evident by employing a Radon–Nikodym derivative
$$Z_t = \frac{S_t/S_T}{B_t/B_T} = \frac{S_t/S_T}{e^{\int_0^t r\,du}/e^{\int_0^T r\,du}}.$$
Under constant interest rates, the Radon–Nikodym derivative is given by
$$Z_t = \frac{S_t/S_T}{e^{rt}/e^{rT}} = \frac{S_t/S_T}{e^{r(t-T)}} = \frac{S_t/S_T}{e^{-r\tau}}.$$
Moving the discount rate (as bond prices) into the expectation in the first term in the call price gives
$$e^{-r\tau}E^Q[S_T\mathbf{1}_{S_T>K}] = \frac{e^{rt}}{e^{rT}}E^Q[S_T\mathbf{1}_{S_T>K}] = E^Q\left[\frac{B_t}{B_T}S_T\mathbf{1}_{S_T>K}\right].$$
Transforming the expectation from the risk-neutral measure $Q$ to the measure $P$ via the Radon–Nikodym derivative gives
$$e^{-r\tau}E^Q[S_T\mathbf{1}_{S_T>K}] = E^P\left[\frac{B_t}{B_T}S_T\mathbf{1}_{S_T>K}Z_t\right] = E^P\left[\frac{B_t}{B_T}S_T\mathbf{1}_{S_T>K}\frac{S_t/S_T}{B_t/B_T}\right] = E^P[S_t\mathbf{1}_{S_T>K}] = S_tE^P[\mathbf{1}_{S_T>K}],$$
where the time $t$ asset price, $S_t = e^{x_t}$, can be moved outside the expectation.
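3.13. PDE FOR HESTON PROBABILITIES
In Chapter 6, the Heston PDE for the value of a derivative on an underlying asset $S$ is derived as
$$\frac{\partial V}{\partial t} + \frac{1}{2}vS^2\frac{\partial^2 V}{\partial S^2} + \frac{1}{2}\sigma^2v\frac{\partial^2 V}{\partial v^2} + \sigma v\rho S\frac{\partial^2 V}{\partial v\,\partial S} - rV + rS\frac{\partial V}{\partial S} + \{\kappa(\eta - v) - \lambda(S,v,t)\}\frac{\partial V}{\partial v} = 0.$$
A similar expression for the Heston PDE for a log price, $x = \ln S$, and the linear market price of risk, $\lambda(S,v,t) = \lambda v$, is
$$\frac{\partial V}{\partial t} + \frac{1}{2}v\frac{\partial^2 V}{\partial x^2} + \left(r - \frac{1}{2}v\right)\frac{\partial V}{\partial x} + \frac{1}{2}\sigma^2v\frac{\partial^2 V}{\partial v^2} + \sigma v\rho\frac{\partial^2 V}{\partial v\,\partial x} - rV + \{\kappa(\eta - v) - \lambda v\}\frac{\partial V}{\partial v} = 0.$$
If the Heston call option is defined relative to $\tau = T - t$, the first time-dependent term is negative, and the call option PDE is written as
$$-\frac{\partial c}{\partial\tau} + \frac{1}{2}v\frac{\partial^2 c}{\partial x^2} + \left(r - \frac{1}{2}v\right)\frac{\partial c}{\partial x} + \frac{1}{2}\sigma^2v\frac{\partial^2 c}{\partial v^2} + \sigma v\rho\frac{\partial^2 c}{\partial v\,\partial x} - rc + [\kappa(\eta - v) - \lambda v]\frac{\partial c}{\partial v} = 0.$$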
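The PDE is independent of the terms of the call contract; therefore, the Heston probabilities $P_1^P(x,v,\tau)$ and $P_2^Q(x,v,\tau)$ both independently follow the PDE (Rouah, 2011). This is evident by examining the call option equation $c = S_tP_1^P(x,v,\tau) - e^{-r\tau}KP_2^Q(x,v,\tau)$, which is equal to $c = P_1^P$ when $K = 0$ and $S = 1$ and to $c = -P_2^Q$ when $K = 1$, $S = 0$, and $r = 0$. In order to insert the isolated probabilities into the call option PDE, it is necessary to evaluate the derivatives of the call option equation as (Rouah, 2011)
$$\frac{\partial c}{\partial\tau} = e^x\frac{\partial P_1^P}{\partial\tau} + re^{-r\tau}KP_2^Q - e^{-r\tau}K\frac{\partial P_2^Q}{\partial\tau} = e^x\left\{\frac{\partial P_1^P}{\partial\tau}\right\} - e^{-r\tau}K\left[-rP_2^Q + \frac{\partial P_2^Q}{\partial\tau}\right]$$
$$\frac{\partial c}{\partial x} = e^xP_1^P + e^x\frac{\partial P_1^P}{\partial x} - e^{-r\tau}K\frac{\partial P_2^Q}{\partial x} = e^x\left\{P_1^P + \frac{\partial P_1^P}{\partial x}\right\} - e^{-r\tau}K\left[\frac{\partial P_2^Q}{\partial x}\right]$$
$$\frac{\partial^2 c}{\partial x^2} = e^xP_1^P + e^x\frac{\partial P_1^P}{\partial x} + e^x\frac{\partial P_1^P}{\partial x} + e^x\frac{\partial^2 P_1^P}{\partial x^2} - e^{-r\tau}K\frac{\partial^2 P_2^Q}{\partial x^2} = e^x\left\{P_1^P + 2\frac{\partial P_1^P}{\partial x} + \frac{\partial^2 P_1^P}{\partial x^2}\right\} - e^{-r\tau}K\left[\frac{\partial^2 P_2^Q}{\partial x^2}\right]$$
and
$$\frac{\partial c}{\partial v} = e^x\left\{\frac{\partial P_1^P}{\partial v}\right\} - e^{-r\tau}K\left[\frac{\partial P_2^Q}{\partial v}\right]$$
$$\frac{\partial^2 c}{\partial v^2} = e^x\left\{\frac{\partial^2 P_1^P}{\partial v^2}\right\} - e^{-r\tau}K\left[\frac{\partial^2 P_2^Q}{\partial v^2}\right]$$
$$\frac{\partial^2 c}{\partial v\,\partial x} = e^x\frac{\partial P_1^P}{\partial v} + e^x\frac{\partial^2 P_1^P}{\partial v\,\partial x} - e^{-r\tau}K\frac{\partial^2 P_2^Q}{\partial v\,\partial x} = e^x\left\{\frac{\partial P_1^P}{\partial v} + \frac{\partial^2 P_1^P}{\partial v\,\partial x}\right\} - e^{-r\tau}K\left[\frac{\partial^2 P_2^Q}{\partial v\,\partial x}\right],$$
where the $P_1^P$ terms are in curly brackets, $\{\}$, and the $P_2^Q$ terms are in square brackets, $[\,]$. Substituting only the $P_1^P$ terms and removing the common $e^x$ terms (or equivalently setting $K = 0$ and $S = 1$) in the call option PDE gives
$$-\left\{\frac{\partial P_1^P}{\partial\tau}\right\} + \frac{1}{2}v\left\{P_1^P + 2\frac{\partial P_1^P}{\partial x} + \frac{\partial^2 P_1^P}{\partial x^2}\right\} + \left(r - \frac{1}{2}v\right)\left\{P_1^P + \frac{\partial P_1^P}{\partial x}\right\} + \frac{1}{2}\sigma^2v\left\{\frac{\partial^2 P_1^P}{\partial v^2}\right\} + \sigma v\rho\left\{\frac{\partial P_1^P}{\partial v} + \frac{\partial^2 P_1^P}{\partial v\,\partial x}\right\} - r\{P_1^P\} + (\kappa(\eta - v) - \lambda v)\left\{\frac{\partial P_1^P}{\partial v}\right\} = 0.$$
The $P_1^P$ PDE simplifies to
$$-\frac{\partial P_1^P}{\partial\tau} + \frac{1}{2}v\frac{\partial^2 P_1^P}{\partial x^2} + \left(r + \frac{1}{2}v\right)\frac{\partial P_1^P}{\partial x} + \frac{1}{2}\sigma^2v\frac{\partial^2 P_1^P}{\partial v^2} + \sigma v\rho\frac{\partial^2 P_1^P}{\partial v\,\partial x} + [\sigma v\rho + \kappa(\eta - v) - \lambda v]\frac{\partial P_1^P}{\partial v} = 0.$$
Similarly, substituting only the $P_2^Q$ terms and removing the common $-e^{-r\tau}K$ terms (or equivalently setting $r = 0$, $K = -1$, and $S = 0$) in the call option PDE gives
$$rP_2^Q - \frac{\partial P_2^Q}{\partial\tau} + \frac{1}{2}v\frac{\partial^2 P_2^Q}{\partial x^2} + \left(r - \frac{1}{2}v\right)\frac{\partial P_2^Q}{\partial x} + \frac{1}{2}\sigma^2v\frac{\partial^2 P_2^Q}{\partial v^2} + \sigma v\rho\frac{\partial^2 P_2^Q}{\partial v\,\partial x} - rP_2^Q + [\kappa(\eta - v) - \lambda v]\frac{\partial P_2^Q}{\partial v} = 0.$$
The $P_2^Q$ PDE simplifies to
$$-\frac{\partial P_2^Q}{\partial\tau} + \frac{1}{2}v\frac{\partial^2 P_2^Q}{\partial x^2} + \left(r - \frac{1}{2}v\right)\frac{\partial P_2^Q}{\partial x} + \frac{1}{2}\sigma^2v\frac{\partial^2 P_2^Q}{\partial v^2} + \sigma v\rho\frac{\partial^2 P_2^Q}{\partial v\,\partial x} + [\kappa(\eta - v) - \lambda v]\frac{\partial P_2^Q}{\partial v} = 0.$$
The two equations can be summarized as
$$-\frac{\partial P_j}{\partial\tau} + \frac{1}{2}v\frac{\partial^2 P_j}{\partial x^2} + (r + u_jv)\frac{\partial P_j}{\partial x} + \frac{1}{2}\sigma^2v\frac{\partial^2 P_j}{\partial v^2} + \sigma v\rho\frac{\partial^2 P_j}{\partial v\,\partial x} + (a - b_jv)\frac{\partial P_j}{\partial v} = 0,$$
$$u_1 = \tfrac{1}{2},\quad u_2 = -\tfrac{1}{2},\quad a = \kappa\eta,\quad b_1 = \kappa + \lambda - \sigma\rho,\quad b_2 = \kappa + \lambda.$$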
3.14. CHARACTERISTIC FUNCTIONS OF THE HESTON PROBABILITIES
For the remainder of the derivation, it is helpful to extract a few definitions from Chapter 13. The Fourier transform of the real space probability density function, $\phi(y)$, is the characteristic function, $\varphi(\omega)$, as given by
$$\varphi(\omega) = F\{\phi(y)\} = \int_{-\infty}^{\infty}\phi(y)e^{i\omega y}\,dy.$$
The probability $\Phi(y)$ of finding a random variable in the interval $(-\infty, y)$ drawn from an arbitrary distribution is the cumulative distribution function as given by
$$\Phi(y) = \int_{-\infty}^{y}\phi(s)\,ds.$$
The probability density function is the first derivative of the cumulative distribution function
$$\phi(y) = \Phi'(y) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\varphi(\omega)e^{-i\omega y}\,d\omega.$$
The complementary cumulative distribution function, $\Phi^c(y) = 1 - \Phi(y)$, is given by
$$\Phi^c(y) = \frac{1}{2} + \frac{1}{\pi}\int_0^{\infty}\mathrm{Im}\left[\frac{e^{-i\omega y}\varphi(\omega)}{\omega}\right]d\omega = \frac{1}{2} + \frac{1}{\pi}\int_0^{\infty}\mathrm{Re}\left[\frac{e^{-i\omega y}\varphi(\omega)}{i\omega}\right]d\omega.$$
It is clear that the Heston probabilities, $P_j = \Phi^c(y = \ln K) = \Pr(x > \ln K)$, are in the form of the complementary cumulative distribution function
$$P_j(\ln K) = \frac{1}{2} + \frac{1}{\pi}\int_0^{\infty}\mathrm{Re}\left[\frac{e^{-i\omega\ln K}\varphi_j(\omega)}{i\omega}\right]d\omega.$$
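The inversion formula for the complementary distribution function can be checked against a case with a known answer. The sketch below is an illustration, not part of the original text's code: it uses the standard normal characteristic function $e^{-\omega^2/2}$ and compares the Fourier inversion with the closed-form tail probability computed from erfc. The evaluation point $y = 0.5$ and the variable names are arbitrary choices.

```matlab
% Complementary CDF of a standard normal recovered from its characteristic function
phi = @(w) exp(-0.5*w.^2);             % characteristic function of N(0,1)
y = 0.5;                               % evaluation point (arbitrary)
integrand = @(w) real(exp(-1i*w*y).*phi(w)./(1i*w));
PhiC = 0.5 + quadgk(integrand, 0, Inf, 'RelTol', 1e-10, 'AbsTol', 1e-12)/pi;
PhiCexact = 0.5*erfc(y/sqrt(2));       % 1 - Phi(y) in closed form
disp([PhiC PhiCexact]);
```

The integrand has a finite limit as $\omega \to 0$, and quadgk does not evaluate the endpoints, so the apparent singularity at the origin causes no difficulty.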
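The two probabilities have the ($t = T$, $\tau = 0$) terminal conditions $P_j(\tau = 0) = \mathbf{1}_{x>\ln K}$, and the corresponding characteristic functions satisfy $\varphi_j(\tau = 0;\omega) = e^{i\omega x}$. By the Feynman–Kac theorem, the characteristic functions follow PDEs analogous to the PDEs for the probability functions as given by
$$-\frac{\partial\varphi_j}{\partial\tau} + \frac{1}{2}v\frac{\partial^2\varphi_j}{\partial x^2} + (r + u_jv)\frac{\partial\varphi_j}{\partial x} + \frac{1}{2}\sigma^2v\frac{\partial^2\varphi_j}{\partial v^2} + \sigma v\rho\frac{\partial^2\varphi_j}{\partial v\,\partial x} + (a - b_jv)\frac{\partial\varphi_j}{\partial v} = 0,$$
$$u_1 = \tfrac{1}{2},\quad u_2 = -\tfrac{1}{2},\quad a = \kappa\eta,\quad b_1 = \kappa + \lambda - \sigma\rho,\quad b_2 = \kappa + \lambda.$$
Heston (1993) assumed an affine form of the characteristic function given by
$$\varphi_j(x,v;\omega) = \exp\{C_j(\tau;\omega) + vD_j(\tau;\omega) + i\omega x\}.$$
As shown by Rouah (2011), producing the differential equations for the $C$ and $D$ coefficients requires calculation of the derivatives of the characteristic function as given by
$$\frac{\partial\varphi_j}{\partial\tau} = \left(\frac{\partial C_j}{\partial\tau} + v\frac{\partial D_j}{\partial\tau}\right)\varphi_j,\qquad \frac{\partial\varphi_j}{\partial x} = i\omega\varphi_j,\qquad \frac{\partial^2\varphi_j}{\partial x^2} = -\omega^2\varphi_j,$$
$$\frac{\partial\varphi_j}{\partial v} = D_j\varphi_j,\qquad \frac{\partial^2\varphi_j}{\partial v^2} = D_j^2\varphi_j,\qquad \frac{\partial^2\varphi_j}{\partial v\,\partial x} = i\omega D_j\varphi_j.$$
Substituting into the characteristic function PDE and removing the common $\varphi_j$ term gives
$$-\left(\frac{\partial C_j}{\partial\tau} + v\frac{\partial D_j}{\partial\tau}\right) + (r + u_jv)[i\omega] - \frac{1}{2}v[\omega^2] + \sigma v\rho[i\omega D_j] + \frac{1}{2}\sigma^2v[D_j^2] + (a - b_jv)[D_j] = 0.$$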
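Separating the terms with and without the variance $v$ gives
$$\left(-\frac{\partial D_j}{\partial\tau} + u_ji\omega - \frac{1}{2}\omega^2 + \sigma\rho i\omega D_j + \frac{1}{2}\sigma^2D_j^2 - b_jD_j\right)v + \left(-\frac{\partial C_j}{\partial\tau} + ri\omega + aD_j\right) = 0,$$
which can be split into two differential equations
$$\frac{\partial D_j}{\partial\tau} = u_ji\omega - \frac{1}{2}\omega^2 + (\sigma\rho i\omega - b_j)D_j + \frac{1}{2}\sigma^2D_j^2$$
$$\frac{\partial C_j}{\partial\tau} = ri\omega + aD_j.$$
The latter is a simple ordinary differential equation. The former is a Riccati-type equation in the form (Gatheral, 2002)
$$\frac{\partial D_j}{\partial\tau} = \underbrace{u_ji\omega - \frac{1}{2}\omega^2}_{P_j} - \underbrace{(b_j - \sigma\rho i\omega)}_{Q_j}D_j + \underbrace{\frac{1}{2}\sigma^2}_{R}D_j^2.$$
The transform $D_j = -\frac{1}{R}\frac{w'}{w}$ has a corresponding second-order ordinary differential equation (Rouah, 2011)
$$w'' + Q_jw' + P_jRw = 0.$$
The auxiliary equation has the quadratic roots
$$\alpha = \frac{-Q_j + \sqrt{Q_j^2 - 4P_jR}}{2} = \frac{-Q_j + d_j}{2},\qquad \beta = \frac{-Q_j - \sqrt{Q_j^2 - 4P_jR}}{2} = \frac{-Q_j - d_j}{2},$$
where the complex root $d$ has two values that are equal in magnitude but opposite in sign,
$$d_j = \alpha - \beta = \sqrt{Q_j^2 - 4P_jR} = \sqrt{(\sigma\rho i\omega - b_j)^2 - \sigma^2(2u_ji\omega - \omega^2)}.$$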
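The second-order ODE $w'' + Q_jw' + P_jRw = 0$ has a solution of $w = Ae^{\alpha\tau} + Be^{\beta\tau}$. Substituting into the transform provides
$$D_j = -\frac{1}{R}\frac{w'}{w} = -\frac{1}{R}\frac{\alpha Ae^{\alpha\tau} + \beta Be^{\beta\tau}}{Ae^{\alpha\tau} + Be^{\beta\tau}} = -\frac{1}{R}\frac{(A/B)\alpha e^{\alpha\tau} + \beta e^{\beta\tau}}{(A/B)e^{\alpha\tau} + e^{\beta\tau}}.$$
The initial condition $D_j(\tau = 0) = 0$ requires the time-zero numerator to be $\alpha A + \beta B = 0$. Rearranging as a ratio of $A/B = -\beta/\alpha$ gives $D_j$ in the form of
$$D_j = -\frac{1}{R}\frac{\beta\{-e^{\alpha\tau} + e^{\beta\tau}\}}{(-\beta/\alpha)e^{\alpha\tau} + e^{\beta\tau}}.$$
Defining and substituting the factor
$$g_j = \frac{\beta}{\alpha} = \frac{Q_j + d_j}{Q_j - d_j} = \frac{-\sigma\rho i\omega + b_j + d_j}{-\sigma\rho i\omega + b_j - d_j}$$
into $D_j$ gives
$$D_j = -\frac{1}{R}\frac{\beta\{-e^{\alpha\tau} + e^{\beta\tau}\}}{(-g_j)e^{\alpha\tau} + e^{\beta\tau}}.$$
Dividing the numerator and denominator by $e^{\beta\tau}$ expresses $D_j$ as
$$D_j = -\frac{1}{R}\frac{\beta\{1 - e^{(\alpha-\beta)\tau}\}}{\{1 - g_je^{(\alpha-\beta)\tau}\}} = -\frac{1}{R}\frac{\beta\{1 - e^{d_j\tau}\}}{\{1 - g_je^{d_j\tau}\}} = \frac{Q_j + d_j}{2R}\frac{\{1 - e^{d_j\tau}\}}{\{1 - g_je^{d_j\tau}\}}.$$
Substitution for $Q_j$ and $R$ reduces the solution to the Riccati equation to
$$D_j = \frac{b_j - \rho\sigma i\omega + d_j}{\sigma^2}\left(\frac{1 - e^{d_j\tau}}{1 - g_je^{d_j\tau}}\right).$$
Once $D_j$ is available, the ODE $\frac{\partial C_j}{\partial\tau} = ri\omega + aD_j$ can be integrated (Rouah, 2011) by
$$C_j = \int_0^{\tau}(ri\omega)\,dy + a\frac{Q_j + d_j}{2R}\int_0^{\tau}\frac{\{1 - e^{d_jy}\}}{\{1 - g_je^{d_jy}\}}\,dy$$
$$C_j = ri\omega\tau + a\frac{Q_j + d_j}{2R}\left[\frac{1 - g_j}{d_jg_j}\ln(g_je^{d_jy} - 1) + y\right]_0^{\tau},$$
which is after substitution
$$C_j = ri\omega\tau + \frac{\kappa\eta}{\sigma^2}\left[(b_j - \sigma\rho i\omega + d_j)\tau - 2\ln\left(\frac{1 - g_je^{d_j\tau}}{1 - g_j}\right)\right].$$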
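A quick numerical check that the closed-form $D_j$ satisfies the Riccati equation is a useful guard against sign errors. The sketch below is illustrative only: the frequency $\omega = 1.3$, the time $\tau = 0.7$, the choice $j = 1$, and the step size h are arbitrary assumptions, and the model parameters are those quoted for Figure 3.2. It compares a central finite difference of $D_j$ with the right-hand side $P_j - Q_jD_j + RD_j^2$.

```matlab
% Verify dD/dtau = P - Q*D + R*D^2 for the closed-form Riccati solution
kappa = 1; sigma = 0.5; lambda = 0.05; rho = -0.8;
w = 1.3; tau = 0.7; j = 1;             % arbitrary test point
u = [0.5 -0.5];
b = [kappa + lambda - rho*sigma, kappa + lambda];
P = u(j)*1i*w - 0.5*w^2;
Q = b(j) - sigma*rho*1i*w;
R = 0.5*sigma^2;
d = sqrt(Q^2 - 4*P*R);
g = (Q + d)/(Q - d);
D = @(t) (Q + d)/(2*R)*(1 - exp(d*t))./(1 - g*exp(d*t));
h = 1e-5;                              % finite-difference step
lhs = (D(tau + h) - D(tau - h))/(2*h); % numerical dD/dtau
rhs = P - Q*D(tau) + R*D(tau)^2;       % Riccati right-hand side
residual = abs(lhs - rhs);
disp(residual);
```

The same check with $j = 2$ exercises the second set of coefficients.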
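In summary, the call price $c = e^{x_t}P_1^P(x,v,\tau) - e^{-r\tau}KP_2^Q(x,v,\tau)$ has Heston probabilities that require a numerical integration of
$$P_j(\ln K) = \frac{1}{2} + \frac{1}{\pi}\int_0^{\infty}\mathrm{Re}\left[\frac{e^{-i\omega\ln K}\varphi_j(\omega)}{i\omega}\right]d\omega,$$
with $\varphi_j(x,v;\omega) = \exp\{C_j(\tau;\omega) + vD_j(\tau;\omega) + i\omega x\}$. Albrecher et al. (2006) suggested an alternate Heston formalism by multiplying the numerator and denominator of $D_j$ by $e^{-d_j\tau}/g_j$ to give
$$D_j = \frac{b_j - \rho\sigma i\omega - d_j}{\sigma^2}\left(\frac{1 - e^{-d_j\tau}}{1 - \frac{1}{g_j}e^{-d_j\tau}}\right)$$
and
$$C_j = ri\omega\tau + \frac{\kappa\eta}{\sigma^2}\left[(b_j - \sigma\rho i\omega - d_j)\tau - 2\ln\left(\frac{1 - \frac{1}{g_j}e^{-d_j\tau}}{1 - \frac{1}{g_j}}\right)\right].$$
Lord and Kahl (2010) showed that the original formalism can display discontinuities when the principal branch of the complex logarithm is used, whereas the formalism of Albrecher et al. avoids them.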
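The two Heston probabilities and the resulting call price can be sketched with a standard quadrature routine. This is a minimal illustration under stated assumptions, not the book's HestonFourier function: it uses the alternate (exponentially decaying) form of $C_j$ and $D_j$, Matlab's quadgk on $[0,\infty)$, and the parameters quoted for Figure 3.2 with $S = K = 50$, $r = 0.03$, and $\tau = 1$; the helper names Qf, df, gi, Ef are hypothetical.

```matlab
% Heston call price from the probabilities P1 and P2
kappa = 1; eta = 0.3; sigma = 0.5; rho = -0.8; lambda = 0.05; v0 = 0.4;
r = 0.03; S = 50; K = 50; tau = 1;
u = [0.5 -0.5];
b = [kappa + lambda - rho*sigma, kappa + lambda];
a = kappa*eta; x = log(S);
P = zeros(1,2);
for j = 1:2
    Qf = @(w) b(j) - rho*sigma*1i*w;
    df = @(w) sqrt(Qf(w).^2 - sigma^2*(2*u(j)*1i*w - w.^2));
    gi = @(w) (Qf(w) - df(w))./(Qf(w) + df(w));    % 1/g_j
    Ef = @(w) exp(-df(w)*tau);                     % decaying exponential
    Df = @(w) (Qf(w) - df(w))/sigma^2.*(1 - Ef(w))./(1 - gi(w).*Ef(w));
    Cf = @(w) r*1i*w*tau + a/sigma^2*((Qf(w) - df(w))*tau ...
        - 2*log((1 - gi(w).*Ef(w))./(1 - gi(w))));
    cf = @(w) exp(Cf(w) + v0*Df(w) + 1i*w*x);      % characteristic function
    integrand = @(w) real(exp(-1i*w*log(K)).*cf(w)./(1i*w));
    P(j) = 0.5 + quadgk(integrand, 0, Inf)/pi;
end
call = S*P(1) - K*exp(-r*tau)*P(2);
disp([P call]);
```

A finite upper limit with a fixed-step rule would also work; the semi-infinite quadgk call is used here only to avoid choosing a truncation point.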
3.15. DECOUPLED GREEN FUNCTION APPROACH TO THE HESTON MODEL
This section develops a transform integrated in complex Fourier space where the solution to the Heston PDE is decomposed into an at-maturity payoff function that does not depend on the volatility and a separate Green function (Lewis, 2000). The framework of this approach is easily extended to the numerical calculation of the Greeks (Shaw, 2011). The Heston PDE for the value of a derivative on an underlying asset $S$ was derived as
$$\frac{\partial V}{\partial t} + \frac{1}{2}vS^2\frac{\partial^2 V}{\partial S^2} + \frac{1}{2}\sigma^2v\frac{\partial^2 V}{\partial v^2} + \sigma v\rho S\frac{\partial^2 V}{\partial v\,\partial S} - rV + rS\frac{\partial V}{\partial S} + \underbrace{\{\kappa(\eta - v) - \lambda v\}}_{b(v)}\frac{\partial V}{\partial v} = 0.$$
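For convenience, define $a(v) = \sigma\sqrt{v_t}$ and $b(v) = \kappa(\eta - v) - \lambda v$. The transform equations
$$\tau = T - t$$
$$x = \ln(S) + r\tau = \ln(Se^{r\tau}) = \ln(F_T)$$
$$V = W(x,v,\tau)e^{-r\tau}$$
produce the PDE given by
$$\frac{1}{2}v\left(\frac{\partial^2 W}{\partial x^2} - \frac{\partial W}{\partial x}\right) + \rho\sqrt{v}\,a(v)\frac{\partial^2 W}{\partial x\,\partial v} + \frac{1}{2}a^2(v)\frac{\partial^2 W}{\partial v^2} + b(v)\frac{\partial W}{\partial v} = \frac{\partial W}{\partial\tau}.$$
Shaw (2011) made use of the Fourier transform and inverse Fourier transform pair
$$W(x;v,\tau) = \frac{1}{2\pi}\int_{-\infty}^{\infty}e^{-i\omega x}\tilde{W}(\omega;v,\tau)\,d\omega$$
$$\tilde{W}(\omega;v,\tau) = \int_{-\infty}^{\infty}e^{i\omega x}W(x;v,\tau)\,dx$$
to analyze the Heston PDE in Fourier space. Applying $\frac{\partial}{\partial x}W \rightarrow -i\omega\tilde{W}$ to the PDE for $W$ gives
$$-\frac{1}{2}v(\omega^2\tilde{W} - i\omega\tilde{W}) + \left[-i\omega\rho\sqrt{v}\,a(v) + b(v)\right]\frac{\partial\tilde{W}}{\partial v} + \frac{1}{2}a^2(v)\frac{\partial^2\tilde{W}}{\partial v^2} = \frac{\partial\tilde{W}}{\partial\tau}.$$
3.16. FOURIER SPACE TERMINAL PAYOFF
The terminal condition of the Fourier transform is
$$\tilde{W}(\omega;\tau = 0) = \int_{-\infty}^{\infty}e^{i\omega x}W(x;v,\tau = 0)\,dx = \int_{-\infty}^{\infty}e^{i\omega x}V(x,v,\tau = 0)\underbrace{e^{r\cdot 0}}_{=1}\,dx.$$
The payoff of a European vanilla call option is $V(x,v,\tau = 0) = \max(e^x - K, 0)$, thus
$$\tilde{W}_c(\omega;\tau = 0) = \int_{-\infty}^{\infty}e^{i\omega x}\max(e^x - K, 0)\,dx = \int_{\ln K}^{\infty}\left(e^{(1+i\omega)x} - Ke^{i\omega x}\right)dx = \left[\frac{e^{(1+i\omega)x}}{1 + i\omega} - K\frac{e^{i\omega x}}{i\omega}\right]_{\ln K}^{\infty}.$$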
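A contour of integration where $\mathrm{Im}(\omega) > 1$ allows the integral to be solved as
$$\tilde{W}_c(\omega;\tau = 0) = \left(0 - \frac{K^{(1+i\omega)}}{1 + i\omega}\right) - K\left(0 - \frac{K^{i\omega}}{i\omega}\right) = -\frac{K^{(1+i\omega)}}{1 + i\omega} + \frac{K^{(1+i\omega)}}{i\omega} = K^{(1+i\omega)}\frac{-i\omega + 1 + i\omega}{i\omega(1 + i\omega)},$$
and the Fourier transform of the payoff of a vanilla call option reduces to
$$\tilde{W}_c(\omega;\tau = 0) = \frac{K^{(1+i\omega)}}{i\omega - \omega^2}.$$
A similar approach with a contour of integration of $\mathrm{Im}(\omega) < 0$ gives the Fourier transform of a vanilla put option payoff as
$$\tilde{W}_p(\omega;\tau = 0) = \frac{K^{(1+i\omega)}}{i\omega - \omega^2}.$$
The Fourier transform of a binary call option payoff such that $\mathrm{Im}(\omega) > 0$ is
$$\tilde{W}_{bc}(\omega;\tau = 0) = \frac{-K^{i\omega}}{i\omega}.$$
The Fourier transform of a binary put option payoff such that $\mathrm{Im}(\omega) < 0$ is
$$\tilde{W}_{bp}(\omega;\tau = 0) = \frac{K^{i\omega}}{i\omega}.$$
3.17. GREEN FUNCTION FOR HESTON
The solution $V$ of the option or Greek is composed of the Green function, $\tilde{G}$, and a (volatility independent) time $T$ payoff function. Both components are defined in Fourier space as shown in the solution
$$V(x;v,\tau) = e^{-r\tau}\frac{1}{2\pi}\int_{i\mathrm{Im}(\omega)-\infty}^{i\mathrm{Im}(\omega)+\infty}e^{-i\omega x}\underbrace{\tilde{W}(\omega;\tau = 0)}_{\text{FT-payoff}}\tilde{G}(\omega;v,\tau)\,d\omega,$$
subject to the terminal condition $\tilde{G}(\omega;v,\tau = 0) = 1$. The function $\tilde{G}$ is the fundamental transform, given that $\tilde{G}$ is a solution to the PDE (similarly derived for $\tilde{W}$)
$$\frac{\partial\tilde{G}}{\partial\tau} = \frac{1}{2}a^2(v)\frac{\partial^2\tilde{G}}{\partial v^2} - \frac{1}{2}v(\omega^2 - i\omega)\tilde{G} + \left[-i\omega\rho\sqrt{v}\,a(v) + b(v)\right]\frac{\partial\tilde{G}}{\partial v}.$$
Substituting for the $b$ and $a$ coefficients gives the PDE as
$$\frac{\partial\tilde{G}}{\partial\tau} = \frac{1}{2}\sigma^2v\frac{\partial^2\tilde{G}}{\partial v^2} - \frac{1}{2}v(\omega^2 - i\omega)\tilde{G} + [\kappa(\eta - v) - \lambda v - i\omega\rho\sigma v]\frac{\partial\tilde{G}}{\partial v}.$$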
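The Green function is assumed to have an affine form
$$\tilde{G}(\omega;v,\tau) = e^{C(\tau,\omega)+vD(\tau,\omega)}.$$
Since the terminal condition $\tilde{G}(\omega;v,\tau = 0) = 1$ is known, for consistency, $C(\tau = 0,\omega) = 0$ and $D(\tau = 0,\omega) = 0$. Substituting the derivatives
$$\frac{\partial\tilde{G}}{\partial\tau} = \left(\frac{\partial C}{\partial\tau} + v\frac{\partial D}{\partial\tau}\right)\tilde{G},\qquad \frac{\partial\tilde{G}}{\partial v} = D\tilde{G},\qquad \frac{\partial^2\tilde{G}}{\partial v^2} = D^2\tilde{G}$$
into the PDE and canceling the common $\tilde{G}$ term gives
$$\frac{\partial C}{\partial\tau} + v\frac{\partial D}{\partial\tau} = \frac{1}{2}\sigma^2vD^2 - \frac{1}{2}v(\omega^2 - i\omega) + [\kappa(\eta - v) - \lambda v - i\omega\rho\sigma v]D.$$
In this differential equation, the terms independent of $v$ are
$$\frac{\partial C}{\partial\tau} = \kappa\eta D$$
and dependent on $v$ are
$$\frac{\partial D}{\partial\tau} = \frac{1}{2}\sigma^2D^2 - \frac{1}{2}(\omega^2 - i\omega) - (\kappa + \lambda + i\omega\rho\sigma)D.$$
The solution can be found in a manner similar to Heston's derivation in the previous section. The final expressions are
$$D = \frac{\kappa + \lambda + \rho\sigma i\omega + d}{\sigma^2}\left(\frac{1 - e^{d\tau}}{1 - ge^{d\tau}}\right)$$
and
$$C = \frac{\kappa\eta}{\sigma^2}\left[(\kappa + \lambda + \sigma\rho i\omega + d)\tau - 2\ln\left(\frac{1 - ge^{d\tau}}{1 - g}\right)\right],$$
where
$$g = \frac{\sigma\rho i\omega + \kappa + \lambda + d}{\sigma\rho i\omega + \kappa + \lambda - d}$$
and
$$d = \sqrt{\sigma^2(\omega^2 - i\omega) + (\sigma\rho i\omega + \kappa + \lambda)^2}.$$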
The function HestonFourier calculates the European call price by integrating the equation
$$V(x;v,\tau) = \frac{1}{2\pi}e^{-r\tau}\int_{i\mathrm{Im}(\omega)-\infty}^{i\mathrm{Im}(\omega)+\infty}e^{-i\omega x}\underbrace{\frac{K^{(1+i\omega)}}{i\omega - \omega^2}}_{\tilde{W}_c(\omega;\tau=0)}\underbrace{e^{C(\tau,\omega)+vD(\tau,\omega)}}_{\tilde{G}(\omega;v,\tau)}\,d\omega$$
via Matlab's adaptive Simpson quadrature function, quad (or Matlab's adaptive Lobatto quadrature function, quadl). The analytical expression for the parameter $C$ can display discontinuities that lead to nonsensical results. Various analytical corrections are available (Kahl and Jäckel, 2005; Lord and Kahl, 2010); a stable albeit slower approach is to perform an additional integration to calculate the value of $C$ as given by $C = \kappa\eta\int_0^{\tau}D\,dy$. The results of the Fourier inversion of the Heston model for the European call price are displayed in Figure 3.2 as a function of time to expiration and initial asset price.

FIGURE 3.2 European call price as a function of time to expiration and initial asset price calculated using a Fourier inversion of the fundamental transform Heston model (strike K = 50, κ = 1, σ = 0.5, λ = 0.05, η = 0.3, ρ = −0.8, v0 = 0.4).

3.18. HESTON GREEKS
The calculation of the Greeks requires a simple modification of the expression for $V$, which is composed of the Green function, $\tilde{G}$, and a (volatility independent) time $T$ payoff function,
$$V(x;v,\tau) = \frac{1}{2\pi}e^{-r\tau}\int_{i\mathrm{Im}(\omega)-\infty}^{i\mathrm{Im}(\omega)+\infty}e^{-i\omega\ln(Se^{r\tau})}\tilde{W}(\omega;\tau = 0)\tilde{G}(\omega;v,\tau)\,d\omega,$$
where $x = \ln(Se^{r\tau})$ and both components are defined in Fourier space. Delta is the first derivative with respect to underlying asset price $S$. Because $S$ enters only through $e^{-i\omega\ln(Se^{r\tau})}$, the derivative multiplies the integrand of $V$ by $-i\omega/S$,
$$\text{Delta} = \frac{\partial V(x;v,\tau)}{\partial S} = \frac{1}{2\pi}e^{-r\tau}\int_{i\mathrm{Im}(\omega)-\infty}^{i\mathrm{Im}(\omega)+\infty}\frac{-i\omega}{S}e^{-i\omega\ln(Se^{r\tau})}\tilde{W}(\omega;\tau = 0)\tilde{G}(\omega;v,\tau)\,d\omega.$$
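Similarly, Gamma is the second derivative with respect to the underlying asset price. Differentiating $S^{-i\omega}$ twice gives $(-i\omega)(-i\omega - 1)S^{-i\omega-2}$, so the integrand of $V$ is multiplied by $(i\omega - \omega^2)/S^2$,
$$\text{Gamma} = \frac{\partial^2 V(x;v,\tau)}{\partial S^2} = \frac{1}{2\pi}e^{-r\tau}\int_{i\mathrm{Im}(\omega)-\infty}^{i\mathrm{Im}(\omega)+\infty}\frac{i\omega - \omega^2}{S^2}e^{-i\omega\ln(Se^{r\tau})}\tilde{W}(\omega;\tau = 0)\tilde{G}(\omega;v,\tau)\,d\omega.$$
Vega is the first derivative with respect to underlying asset price variance $v$ (Lin, 2008). The expression for Vega is
$$\text{Vega} = \frac{\partial V(x;v,\tau)}{\partial v} = \frac{1}{2\pi}e^{-r\tau}\int_{i\mathrm{Im}(\omega)-\infty}^{i\mathrm{Im}(\omega)+\infty}e^{-i\omega\ln(Se^{r\tau})}\tilde{W}(\omega;\tau = 0)\frac{\partial}{\partial v}\{\tilde{G}(\omega;v,\tau)\}\,d\omega,$$
and it must first be expanded since variance is a component of the fundamental transform $\tilde{G}$. Since
$$\frac{\partial}{\partial v}\tilde{G}(\omega;v,\tau) = \frac{\partial}{\partial v}\{e^{C(\tau,\omega)+vD(\tau,\omega)}\} = D(\tau,\omega)\tilde{G}(\omega;v,\tau),$$
Vega is given by
$$\text{Vega} = \frac{1}{2\pi}e^{-r\tau}\int_{i\mathrm{Im}(\omega)-\infty}^{i\mathrm{Im}(\omega)+\infty}e^{-i\omega\ln(Se^{r\tau})}\tilde{W}(\omega;\tau = 0)\{D(\tau,\omega)\tilde{G}(\omega;v,\tau)\}\,d\omega.$$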
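The price, Delta, and Gamma integrals can be sketched together and cross-checked against finite differences of the price. This is an illustrative sketch and not the HestonFourier listing: it assumes the contour $\mathrm{Im}(\omega) = 1.5$, uses quadgk on the half line together with the conjugate symmetry of the integrand, writes $C$ and $D$ in the algebraically equivalent decaying-exponential form to avoid overflow, and takes the parameters quoted for Figure 3.2 with $S_0 = 50$, $r = 0.03$, $\tau = 1$. The names core, price, delta, gam, and the step h are hypothetical.

```matlab
% Heston call price, Delta, and Gamma from the fundamental transform
kappa = 1; sigma = 0.5; lambda = 0.05; eta = 0.3; rho = -0.8; v0 = 0.4;
r = 0.03; K = 50; tau = 1; S0 = 50;
cIm = 1.5;                             % contour offset, Im(w) > 1 for a call
Q  = @(w) kappa + lambda + 1i*w*rho*sigma;
d  = @(w) sqrt(sigma^2*(w.^2 - 1i*w) + Q(w).^2);
gi = @(w) (Q(w) - d(w))./(Q(w) + d(w));            % 1/g
E  = @(w) exp(-d(w)*tau);
D  = @(w) (Q(w) - d(w))/sigma^2.*(1 - E(w))./(1 - gi(w).*E(w));
C  = @(w) kappa*eta/sigma^2*((Q(w) - d(w))*tau ...
    - 2*log((1 - gi(w).*E(w))./(1 - gi(w))));
G  = @(w) exp(C(w) + v0*D(w));                     % fundamental transform
payoff = @(w) K.^(1 + 1i*w)./(1i*w - w.^2);        % call payoff transform
core = @(w,S) exp(-1i*w*(log(S) + r*tau)).*payoff(w).*G(w);
pre = exp(-r*tau)/pi;                  % 1/(2*pi) doubled by symmetry
z = @(uu) uu + 1i*cIm;                 % points on the shifted contour
price = @(S) pre*quadgk(@(uu) real(core(z(uu),S)), 0, Inf, 'RelTol', 1e-8);
delta = @(S) pre*quadgk(@(uu) real((-1i*z(uu)/S).*core(z(uu),S)), ...
    0, Inf, 'RelTol', 1e-8);
gam = @(S) pre*quadgk(@(uu) real(((1i*z(uu) - z(uu).^2)/S^2).*core(z(uu),S)), ...
    0, Inf, 'RelTol', 1e-8);
priceS0 = price(S0); deltaS0 = delta(S0); gammaS0 = gam(S0);
h = 0.5;                               % finite-difference step in S
deltaFD = (price(S0 + h) - price(S0 - h))/(2*h);
gammaFD = (price(S0 + h) - 2*priceS0 + price(S0 - h))/h^2;
disp([priceS0 deltaS0 deltaFD gammaS0 gammaFD]);
```

Vega follows the same pattern with the extra factor $D(\tau,\omega)$ inside the integrand.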
Figure 3.3 displays the European call price and the option delta as calculated by Fourier inversion of the Heston model. An advantage of the Heston model is its ability to replicate the volatility skew present in market option data as implied from the Black–Scholes equation. This skew arises from the negative correlation, typically $-1 < \rho < -0.7$, between the volatility and the asset price. Figure 3.4 displays a three-dimensional plot of call price in the Heston model as a function of time to expiration and strike price as well as a two-dimensional slice at a time to expiration of 0.1 years. The implied volatility is the volatility that produces a call price in the Black–Scholes model (for a fixed strike price, asset price, and interest rate) equal to the (fixed) Heston model call price. The implied volatility cannot be calculated analytically because the Black–Scholes model cannot be inverted in closed form. Nevertheless, the Matlab function fminsearch rapidly finds the implied volatility by minimizing the squared difference between the Black–Scholes model price (that is found by varying the volatility) and the Heston model call price. For a fixed initial asset price $S_0$, Figure 3.4 shows that implied volatility is much greater at a low strike price, that is, for call options deep in the money. A large smirk is observed for a large negative correlation between asset price and volatility. A correlation near zero will flatten the smirk but the implied volatility will still be slightly higher at a low strike price. For positive correlation, the implied volatility will be higher for options with a high strike price. Still, stochastic volatility models can only create a steep short-term smile when the volatility of volatility is large (Derman, 2008). Another practical feature of the Heston model is the ability for the implied volatility smirk to flatten and decrease for options contracted for a longer expiration time.

FIGURE 3.3 (Top) Call price and (bottom) option Delta as a function of initial variance and initial asset price calculated via Fourier inversion of the Heston model (strike = 50, T = 1, κ = 1, σ = 0.5, λ = 0.05, η = 0.3, ρ = −0.8).
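The implied volatility search described above can be sketched as follows. This is a minimal illustration rather than the book's routine: the target price is generated from the Black–Scholes formula at a known volatility of 0.75 as a stand-in for a Heston model price, and the helper names Ncdf, bsCall, obj, the starting guess, and the tolerances are assumptions made here.

```matlab
% Implied volatility by minimizing the squared pricing error with fminsearch
Ncdf = @(z) 0.5*erfc(-z/sqrt(2));      % standard normal CDF without toolboxes
bsCall = @(S,K,T,r,vol) ...
    S.*Ncdf((log(S./K)+(r+0.5*vol.^2).*T)./(vol.*sqrt(T))) ...
    - K.*exp(-r.*T).*Ncdf((log(S./K)+(r-0.5*vol.^2).*T)./(vol.*sqrt(T)));
S = 50; K = 40; T = 0.1; r = 0.03;
target = bsCall(S,K,T,r,0.75);         % stand-in for a Heston model call price
obj = @(vol) (bsCall(S,K,T,r,vol) - target).^2;   % squared pricing error
opts = optimset('TolX', 1e-10, 'TolFun', 1e-14);
ivol = fminsearch(obj, 0.3, opts);     % start the search at 30% volatility
disp(ivol);
```

Because the Black–Scholes call price increases monotonically with volatility, a bracketing root finder such as fzero applied to the unsquared difference is a reasonable alternative.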
SUMMARY
This chapter provides an introduction to the risk-neutral valuation of options based on a diffusion process in the Black–Scholes model as well as a jump-diffusion process in the Merton model and stochastic volatility process in the Heston model. The Black–Scholes model employs two invocations of the cumulative distribution function. One, $N(d_1)$, is the hedge ratio for a call, that is, the rate of change in call value for a change in the underlying asset. The other, $N(d_2)$, is the risk-adjusted probability that the option will finish in the money. Alternatively, $N(d_1)$ is the probability of finishing in the money for a risk-neutral martingale with a stock as the numeraire. Similarly, $N(d_2)$ is the probability of finishing in the money for a risk-neutral martingale with a riskless bond as the numeraire. A well-known issue with the Black–Scholes model is the appearance of a smirk or a smile in the implied option volatility derived from option prices. This curvature became more severe following the Black Monday crash on Monday, October 19, 1987, when the Dow Jones Industrial Average dropped 508 points (22.61%) to 1738.74. The smirk or smile is explained by a greater appreciation for large jumps, particularly the fear of large downward movements that could result in financial ruin. One approach to account for large movements is to append a Poisson jump process to the diffusion process in the Black–Scholes model. Merton's use of a lognormal jump process allowed the valuation of options via a series approximation. The other major alternative is the family of stochastic volatility models in which the volatility is itself driven by a second stochastic model. For example, it is conceptually appealing that during periods of rapid arrival of information, the frequency of trading will increase. From the viewpoint of a trader, trading time will appear to dilate, whereas from the viewpoint of real time, the volatility of an asset will increase.

FIGURE 3.4 (Top) Call price from Fourier inversion of the Heston model and (bottom left) corresponding volatility implied from a Black–Scholes equivalent call price for a fixed strike price, initial asset price, and risk-free rate. (Bottom right) The Heston model replicates the implied volatility smirk often observed in market data. Left panels are plotted against time to expiration and strike price; right panels are slices at T = 0.1 years (S0 = 50, κ = 1, σ = 0.5, λ = 0.05, η = 0.3, ρ = −0.8, v0 = 0.4).
REFERENCES
Albrecher, H., Mayer, P., Schoutens, W., Tistaert, J. (2006) The Little Heston Trap, Working Paper, Australian Academy of Sciences.
Black, F., Scholes, M. (1973) The Pricing of Options and Corporate Liabilities, Journal of Political Economy 81, 637.
Cont, R., Tankov, P. (2004) Financial Modelling With Jump Processes, Chapman & Hall/CRC.
Derman, E. (2008) Jump-Diffusion Models of the Smile, Lecture Notes, Columbia University.
Dineen, S. (2005) Probability Theory in Finance: A Mathematical Guide to the Black-Scholes Formula, American Mathematical Society: Graduate Studies in Mathematics, 70.
Gatheral, J. (2002) Stochastic Volatility and Local Volatility, Case Study Lecture, Courant Institute of Mathematical Sciences.
Heston, S. (1993) A Closed-Form Solution for Options with Stochastic Volatility, with Applications to Bond and Currency Options, Review of Financial Studies 6(2), 327.
Hull, J. (2005) Options, Futures and Other Derivatives, Prentice Hall.
Kahl, C., Jäckel, P. (2005) Not-So-Complex Logarithms in the Heston Model, Wilmott, 94.
Kishimoto, M. (2008) On the Black-Scholes Equation: Various Derivations, Stanford University.
Leger, S. (2006) Known Closed Formulas and Their Proofs, New York University.
Lewis, A.L. (2000) Option Valuation Under Stochastic Volatility, Finance Press.
Lin, S. (2008) Financial Difference Schemes for Heston Model, University of Oxford, Department of Mathematics and Computational Finance.
Lord, R., Kahl, C. (2010) Complex Logarithms in Heston-Like Models, Mathematical Finance 20, 671.
Matsuda, K. (2004) Introduction to the Merton Jump Diffusion Model, White Paper, City University of New York.
Merton, R.C. (1976) Option Pricing when Underlying Stock Returns are Discontinuous, Journal of Financial Economics 3, 125.
Ross, S. (1976) The Arbitrage Theory of Capital Asset Pricing, Journal of Economic Theory 13(3), 341.
Rouah, F.D. (2011) Derivation of the Heston Model, www.frouah.com.
Sharpe, W.F. (1964) Capital Asset Prices: A Theory of Market Equilibrium Under Conditions of Risk, Journal of Finance 19(3), 425.
Shaw, W. (2011) Stochastic Volatility: Models of Heston Type, Lecture Notes, King's College London.
APPENDIX
Code: MertonScript
% MertonScript shows increase in call value of Merton
% with jumps vs. Black-Scholes without jumps
% underlying Black-Scholes volatility assumed equal
T=1; Ttext=['T = ' num2str(T)];
r=0.05; rtext=['r = ' num2str(r)];
K=100; Ktext=['K = ' num2str(K)];
%S0=100; Stext=['S_0 = ' num2str(S0)];
%%% Black-Scholes Parameters %%%
volBS = 0.2; Voltext=['\sigma_{BS} = ' num2str(volBS)];
d=0; % dividend yield
%%% Merton Series %%%
muJ = 0.1; sigmaJ = 0.1; lambda = 0.5;
mmuJtext=['\mu_{Jump} = ' num2str(muJ)];
msigmaJtext=['\sigma_{Jump} = ' num2str(sigmaJ)];
lambdatext=['\lambda = ' num2str(lambda)];
Srange=1:180;
cBS=zeros(size(Srange)); cM=zeros(size(Srange)); % preallocate
for S=Srange
    cBS(S) = BlackScholesCall (K,S,T,volBS,r,d);
    cM(S) = MertonSeriesCall (K,S,T,volBS,...
        r, d, muJ, sigmaJ, lambda);
end
plot (Srange, cM, Srange, cBS,'--')
axis tight
legend ('Merton','Black-Scholes','Location','NorthWest')
ylabel('Call Price')
xlabel('S_0')
bsStr(1)= {'Black-Scholes'};
bsStr(2)= {rtext};
bsStr(3)= {Voltext};
text(125, 20, bsStr)
genStr(1)= {Ktext};
genStr(2)= {Ttext};
% genStr(3)= {Stext};
text(90, 60, genStr);
JumpStr(1)= {'Merton '};
JumpStr(2)= {mmuJtext};
JumpStr(3)= {msigmaJtext};
JumpStr(4)= {lambdatext};
text(50, 20, JumpStr);
title ('Merton Log-Normal Jump Process vs. Black-Scholes')

Code: MertonSeriesCall
function c = MertonSeriesCall(K,S,T,volBS,...
    r, d, muJ, sigmaJ, lambda)
% MertonSeriesCall based on Series Approximation where
% n is the number of jumps in one time period
% lambda is the intensity parameter = mean number of jumps
% in one time period
% k=exp(muJ+0.5*sigmaJ^2)-1 = mean relative asset jump size
N=5; % Max Number of Possible jumps in one time period
lambdaBar=lambda*exp(muJ+0.5*sigmaJ^2);
c=0;
% Can write following more compactly using
% exp(muJ+0.5*sigmaJ^2)= k+1
for n=0:N
    % effective variance adds n*sigmaJ^2/T to the diffusion variance
    volM = sqrt(volBS^2+n*sigmaJ^2/T);
    rn = r - lambda*(exp(muJ+0.5*sigmaJ^2)-1)...
        + n*(muJ+0.5*sigmaJ^2)/T;
    c = c + ((lambdaBar*T)^n / factorial(n))...
        * BlackScholesCall (K,S,T,volM,rn,0);
end
c = c*exp(-lambdaBar*T);
end
Code: HestonFourier
function [ c0 ] = HestonFourier (S0,K0,r,t,T,...
    kappa,lambda,sigma,rho,eta,v0)
% HestonFourier performs Fourier inversion of Heston model
% The option (or Greek) value is decoupled into a volatility
% independent payoff function in Fourier Space
% (e.g., European Call: W(tau=0)=K.^(1+i*w)./(i*w-w.^2))
% and a Green Function, G=exp(C+vol*D), which is Fundamental
% Transform if G(tau=0)=1 and G(w,vol,tau) satisfies
% Heston style PDE.
% Requires Numerical Integration of Solution
% C=exp(-r*tau)/2pi * integral
% HestonFourier will generate various plots comparing
% call price, Greeks, and implied volatility
% vs. time to expiration, strike price, initial volatility
% and asset price
disp ('Heston Stochastic Volatility Option Calculation')
disp ('via Fourier inversion approach')
% Defaults: the k-th input is missing when nargin < k
if (nargin < 11), v0 = 0.01; end
if (nargin < 10), eta = 0.3; end
if (nargin < 9), rho = -0.8; end
% usually stock price and volatility negative correlation,
% which leads to BS implied volatility smirk
if (nargin < 8), sigma = 0.5; end
if (nargin < 7), lambda =0.05; end
if (nargin < 6), kappa = 1; end
if (nargin < 5), T =1; end
if (nargin < 4), t =0; end
if (nargin < 3), r = 0.03; end
if (nargin < 2), K0 = 50; end
if (nargin < 1), S0 = 50; end
tau=T-t;
prefactor=(1/(2*pi))*exp(-r*tau);
range=200 %+/- integration limits
% number of data points: strike, stock price and/or time
MaxCount=10
%%%%%%%%%%%%%%%%%%%%
OffSetImag=1.5; % offset contour of integration
% call option: Fourier Transform Payoff=K0
jmid=round(MaxCount/2);
str1=[' \kappa=' num2str(kappa)];
str2=[' \sigma=' num2str(sigma)];
str3=[' \lambda=' num2str(lambda)];
str4=[' \eta=' num2str(eta)];
str5=[' \rho=' num2str(rho)];
str6=[' v_0=' num2str(v0)];
textstr = [str1 str2 str3 str4 str5];
textstrplusv0 = [str1 str2 str3 str4 str5 str6];
disp ('Price vs. Asset Price and vs. Volatility');
disp('j k S(j,k) v(j,k)');
for j=1:MaxCount
  for k=1:MaxCount
    S(j,k)=(2*K0/MaxCount)*j;
    v(j,k)=0.2*k;
    disp([j k S(j,k) v(j,k)]);
    x=log(S(j,k))+r*tau;
    cInt=@(w) (exp(-i*w*x)...
        .*(K0.^(1+i*w)./(i*w-w.^2))...
        .*G(w,kappa,lambda,sigma, rho, eta, tau,v(j,k)));
    c(j,k)=real(prefactor*...
        quad(cInt,-range+2i,range+2i));
    deltaInt=@(w) ((-i.*w./S(j,k)).*exp(-i*w*x)...
        .*(K0.^(1+i*w)./(i*w-w.^2))...
        .*G(w,kappa,lambda,sigma, rho, eta, tau,v(j,k)));
    Delta(j,k)=real(prefactor*...
        quad(deltaInt,-range+2i,range+2i));
%   gammaInt=@(w) ((-w.^2/S(j,k)^2).*exp(-i*w*x)...
%       .*(K0.^(1+i*w)./(i*w-w.^2))...
%       .*G(w,kappa,lambda,sigma, rho, eta, tau,v(j,k)));
%   Gamma(j,k)=real(prefactor*quad(gammaInt,-range+2i,
%       range+2i));
%   vegaInt=@(w) ((-w.^2/S(j,k)^2).*exp(-i*w*x)...
%       .*(K0.^(1+i*w)./(i*w-w.^2))...
%       .*VV(w,kappa,lambda,sigma, rho, eta, tau,v(j,k)));
%   Vega(j,k)=real(prefactor*quad(vegaInt,-range+2i,
%       range+2i));
  end
end
figure;
subplot (2,1,1); surf(S,v,c);
title(['Heston Inversion: Strike = ' num2str(K0) ', T = ' num2str(T)])
zlabel ('Call Value'); xlabel ('Asset Price');
ylabel ('Variance'); axis tight
subplot (2,1,2); surf(S,v,Delta);
title(textstr); axis tight
zlabel ('Delta'); xlabel ('Asset Price'); ylabel ('Variance');
%figure; subplot (2,1,1); surf(S,v,Gamma);
%title(['Fourier Inversion of Heston Model: K = ' num2str(K0)])
%zlabel ('Gamma'); xlabel ('Asset Price'); ylabel
%('Variance'); axis tight
%subplot (2,1,2);surf(S,v,Vega); title(textstr)
%zlabel ('Vega'); xlabel ('Asset Price'); ylabel ('Variance');
%axis tight
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
c0=c(jmid,jmid); % call price returned from function
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% call option value vs. initial asset price and vs. time
clear tau
tauk=linspace(0.1,5,MaxCount);
disp ('Price vs. Initial Asset Price and vs. Time');
disp('j k S(j,k) tau(j,k)');
for j=1:MaxCount
  for k=1:MaxCount
    S(j,k)=(2*K0/MaxCount)*j;
    tau(j,k)=tauk(k);
    x=log(S(j,k))+r*tau(j,k);
    disp([j k S(j,k) tau(j,k)]);
    cInt=@(w) (exp(-i.*w.*x)...
        .*(K0.^(1+i.*w)./(i.*w-w.^2))...
        .*G(w,kappa,lambda,sigma, rho, eta, tau(j,k),v0));
    prefactor=(1/(2*pi))*exp(-r*tau(j,k));
    c(j,k)=real(prefactor*...
        quadl(cInt,-range+2i,range+2i));
  end
end
figure; surf(S,tau,c );
title(['Heston: Strike K = ' num2str(K0) ' ' textstrplusv0])
zlabel ('Call Value'); xlabel ('Asset Price');
ylabel ('T_{expiration}'); axis tight
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Calculate and plot Heston call option value and
% Black-Scholes Implied volatility
% vs. strike price and vs. time
% Negative correlation (rho) btw stock price and volatility in
% Heston model generates a skew in the implied volatility
clear tau
tauk=linspace(0.1,5,MaxCount);
disp ('Calculate Call Price vs. Strike and vs. Time');
disp('j k K(j,k) tau(j,k)');
for j=1:MaxCount
  for k=1:MaxCount
    K(j,k)=(2*K0/MaxCount)*j;
    tau(j,k)=tauk(k);
    x=log(S0)+r*tau(j,k);
    disp([j k K(j,k) tau(j,k)]);
    cInt=@(w) (exp(-i.*w.*x)...
        .*(K(j,k).^(1+i.*w)./(i.*w-w.^2))...
        .*G(w,kappa,lambda,sigma, rho, eta, tau(j,k),v0));
    prefactor=(1/(2*pi))*exp(-r*tau(j,k));
    c(j,k)=real(prefactor*...
        quadl(cInt,-range+2i,range+2i));
  end
end
figure
subplot (2,2,1); surf(K,tau,c );
title(['Heston: S_0 = ' num2str(S0)])
zlabel ('Call Value'); xlabel ('Strike Price');
ylabel ('T_{expiration}'); axis tight
disp ('Calculate Implied Volatility');
% Find implied volatility with Matlab's fminsearch via
% function BSdiff, which compares Black-Scholes price
% (at BS implied volatility) to Heston call price
volguess=v0+sigma;
d=0;
for j=1:MaxCount
  for k=1:MaxCount
    % fixed parameters
    param = [c(j,k) K(j,k) S0 tau(j,k) r d];
    ImpVol(j,k) =...
        fminsearch(@(vol) BSdiff(vol,param),volguess);
    volguess=ImpVol(j,k); % speeds up next fminsearch
  end
end
subplot (2,2,3); surf(K,tau,ImpVol );
title(textstrplusv0);
zlabel ('Implied Volatility'); xlabel ('Strike Price');
ylabel ('T_{expiration}'); axis tight
% Take a slice at first time point
subplot (2,2,2); plot(K(:,1), c(:,1) )
ylabel ('Call Value'); xlabel ('Strike Price');
title(['T = ' num2str(tau(1,1)) ' Years']); axis tight
subplot (2,2,4); plot(K(:,1), ImpVol(:,1) )
ylabel ('Implied Volatility'); xlabel ('Strike Price');
title(['T = ' num2str(tau(1,1)) ' Years']); axis tight
end
%%%%%%%%%%%%%%%% End HestonFourier Function%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function [Gval D] = G(w,kappa,lambda,sigma, rho, eta , tau,v)
% G is Integrand for call price, Delta, Gamma equations
d=sqrt( (w.^2-i*w)*(sigma^2) + (kappa+lambda+i*sigma*rho*w).^2);
g=(kappa+lambda+ i.*sigma.*rho.*w +d)./...
    (kappa+lambda+ i.*sigma.*rho.*w -d);
D=( (kappa+lambda+ i.*sigma.*rho.*w+d).*(1-exp(d.*tau)) ) ./...
    ( sigma.^2 .* (1-g.*exp(d.*tau)) );
% C=(kappa.*eta./sigma.^2)...
%     .* ( (kappa+lambda+i.*sigma.*w + d).*tau...
%     -2.*log( (1-g.*exp(d.*tau))/(1-g)));
% Analytical Equation for C can be unstable, therefore use
% Brute Force Integration -> C=kappa*eta* integral(D dt)
maxcounter=50;
Dsum=0;
for l=1:maxcounter
    counter=(l-1)/(maxcounter-1); % fraction of tau, from 0 to 1
    Dsum= Dsum+ (( (kappa+lambda+ i.*sigma.*rho.*w+d)...
        .*(1-exp(d.*tau.*counter)) ) ./...
        ( sigma.^2 .* (1-g.*exp(d.*tau.*counter)) ));
end
% Riemann sum over [0,tau]: the step size is tau/(maxcounter-1)
C=kappa*eta*tau*Dsum/(maxcounter-1);
Gval=exp(C + D*v);
end
function VegaVal = VV(w,kappa,lambda,sigma, rho, eta , tau,v)
% VegaVal Provides the Integrand for integration in
% equation for Vega. Expression has extra D in
% VegaVal=D.*exp(C + D*v); as compared to equations
% for call price, Delta, Gamma
% Easier to use separate function
[Gval D] = G(w,kappa,lambda,sigma, rho, eta , tau,v);
VegaVal=D.*Gval;
end

function diff2 = BSdiff(vol,param)
% BSdiff returns squared difference of Black-Scholes price
% for input volatility to input call price (from Heston
% model in this application)
c=param(1);
K=param(2);
S=param(3);
T=param(4);
r=param(5);
d=param(6);
diff=c-BlackScholesCall (K,S,T,vol,r,d);
diff2=diff^2;
end
Code: BlackScholesCall
function [c PI1 PI2] = BlackScholesCall (K,S,T,vol,r,d);
% BlackScholesCall uses classic semi-analytical equation
d1 = ( log(S./K)+ (r-d+vol.^2/2).*T)./ (vol.*sqrt(T));
d2 = d1-vol.*sqrt(T);
PI1 = myNormCDF(d1);
% PI1=CDF(d1) = probability of finishing in the money for
% risk neutral Martingale measure with Stock as Numeraire
PI2 = myNormCDF(d2);
% PI2=CDF(d2) = probability of finishing in the money for
% risk neutral Martingale measure with riskless Bond Numeraire
c = exp(-d*T).*S.*myNormCDF(d1)-exp(-r*T).*K.*myNormCDF(d2);
end

Code: BlackScholesPut
function p = BlackScholesPut (K,S,T,vol,r,d);
d1 = ( log(S./K)+ (r-d+vol.^2/2).*T)./ (vol.*sqrt(T));
d2 = d1-vol.*sqrt(T);
p=exp(-r*T).*K.*myNormCDF(-d2)-exp(-d*T).*S.*myNormCDF(-d1);
end

Code: myNormCDF
function ncdf = myNormCDF (x)
% ncdf= 0.5*(1+erf(x/sqrt(2)));
ncdf = 0.5*erfc(-x/sqrt(2));
end
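A quick way to check the call and put listings against each other is put-call parity, c - p = S*exp(-d*T) - K*exp(-r*T). The sketch below repeats the formulas inline so that it runs on its own; the parameter values and the name normCDF are assumptions chosen for illustration.

```matlab
% Same erfc-based normal CDF as myNormCDF, written as an anonymous function.
normCDF = @(x) 0.5*erfc(-x/sqrt(2));
K = 50; S = 52; T = 0.75; vol = 0.25; r = 0.03; d = 0.01;   % assumed inputs
d1 = (log(S/K) + (r - d + vol^2/2)*T)/(vol*sqrt(T));
d2 = d1 - vol*sqrt(T);
c = exp(-d*T)*S*normCDF(d1)  - exp(-r*T)*K*normCDF(d2);     % call
p = exp(-r*T)*K*normCDF(-d2) - exp(-d*T)*S*normCDF(-d1);    % put
% Parity: the difference should vanish up to rounding error.
parityGap = (c - p) - (S*exp(-d*T) - K*exp(-r*T));
fprintf('call %.4f, put %.4f, parity gap %.2e\n', c, p, parityGap);
```

The erfc form is kept because it is more accurate than 0.5*(1+erf(.)) far in the lower tail, where the latter loses digits to cancellation.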
4
Binomial Trees
4.1. INTRODUCTION
The binomial tree approach simulates a random walk of an asset as a series of up or down movements. The up or down movements are proportional to the volatility of the asset. The value of a European option on the asset is evaluated at the expiration and this option value is propagated back through the branches of the tree to the initial-time root node. For small time steps, the asset values of the tree replicate the log-normal distribution and thus an option price will converge to the Black–Scholes model price. An advantage of the tree approach is that an early exercise premium, for example, of an American call, can be calculated at each node. In addition, volatility is known to vary with time and this effect can be readily embedded into the tree. The time-dependent volatilities can be calibrated in a manner that is consistent with implied volatilities derived from the market listed price of options for several expirations.
4.2. RISK-NEUTRAL VALUATION
The structure of a binomial tree is such that an asset at an initial price $S_0$ can only move up to $S_0 u$ or down to $S_0 d$. Cox et al. (1979) selected $u = 1/d$ to force the nodes to recombine, for example, $(Sd)u = (Sd)(1/d) = S$. The probability of an up movement is $p$ and the probability of a down movement is $(1 - p)$. The binomial lattice approximates the risk-neutral stochastic process $dS = rS\,dt + \sigma S\,dW$, with a drift equal to the risk-free rate. The log-normal distribution has an expected value
$$E\left[\frac{S_{t+dt}}{S_t}\right] = e^{r\,dt},$$
Financial Derivative and Energy Market Valuation: Theory and Implementation in Matlab, First Edition. Michael Mastro. 2013 John Wiley & Sons, Inc. Published 2013 by John Wiley & Sons, Inc.
and variance of
$$\mathrm{var}\left[\frac{S_{t+dt}}{S_t}\right] = e^{2r\,dt}\left(e^{\sigma^2 dt} - 1\right).$$
When valuing derivatives, the approach is to price derivatives as part of a portfolio that includes the underlying asset. In a complete market, a perfectly hedged portfolio must return the risk-free rate; otherwise arbitrage is available. The average return of the up and down movement in the tree must equal the risk-free rate,
$$E(S_{t+dt}) = S_0 e^{r\,dt} = pS_0 u + (1 - p)S_0 d,$$
where $dt$ is the time step. Removing the common $S_0$ term shows that the probability of an up movement is
$$p = \frac{a - d}{u - d} = \frac{e^{r\,dt} - d}{u - d},$$
and the probability of a down movement is
$$1 - p = \frac{u - a}{u - d} = \frac{u - e^{r\,dt}}{u - d},$$
where $a = e^{r\,dt}$ is the growth factor for a nondividend paying stock. For an asset with a continuous dividend yield $y$, the cost of carry $b = r - y$ is less; hence, the growth factor is also reduced, $a = e^{b\,dt} = e^{(r-y)dt}$. If the underlying asset is a futures contract, then the cost of carry is zero and the growth factor is equal to one, $a = e^{0\,dt} = 1$.
4.3. DELTA HEDGE PORTFOLIO
As mentioned in the previous section, the usual approach to value a derivative is as part of a portfolio that includes the underlying security. Consider a long portfolio in $\Delta$ shares of the underlying asset and a short call option, $f$, derived from the asset. The cost to form this portfolio is $\Delta S_0 - f$. A delta-neutral portfolio exists when the value of the portfolio after an up movement equals the value of the portfolio after a down movement,
$$\Delta S_0 u - f_u = \Delta S_0 d - f_d,$$
which gives the delta-neutral parameter as
$$\Delta = \frac{f_u - f_d}{S_0 u - S_0 d}.$$
The present time-zero value of the portfolio after one time step is found by discounting at the risk-free rate. For example, the present value after an up movement is $e^{-r\,dt}(\Delta S_0 u - f_u)$, which, in a complete market, must equal the initial cost to form the portfolio of $\Delta S_0 - f$; hence,
$$f = \Delta S_0 \left(1 - e^{-r\,dt} u\right) + e^{-r\,dt} f_u.$$
Inserting the delta-neutral parameter gives the initial value of the derivative as
$$f = e^{-r\,dt}\left[p f_u + (1 - p) f_d\right],$$
where the probability of an up movement is $p = (e^{r\,dt} - d)/(u - d)$, and the probability of a down movement is $1 - p = (u - e^{r\,dt})/(u - d)$ (Hull 2005).
4.4. VARIANCE MATCHING
The variance of a variable $X$ with mean $\mu$ is
$$\mathrm{var}(X) = E[(X - \mu)^2] = E[X^2 - 2\mu X + \mu^2] = E[X^2] - 2\mu E[X] + \mu^2 = E[X^2] - \mu^2,$$
$$\mathrm{var}(X) = E[X^2] - E[X]^2.$$
As described by Brandimarte (2006), the variance of the asset is matched to the variance of the tree by
$$S_t^2 e^{2r\,dt}\left(e^{\sigma^2 dt} - 1\right) = \mathrm{var}(S_{t+dt}) = \underbrace{S_t^2 p u^2 + S_t^2 (1 - p) d^2}_{E[S_{t+dt}^2]} - \underbrace{S_t^2 e^{2r\,dt}}_{E[S_{t+dt}]^2}.$$
Reducing the common terms gives
$$e^{2r\,dt + \sigma^2 dt} = p u^2 + (1 - p) d^2.$$
Substituting the probability of an up movement, $p = \frac{e^{r\,dt} - d}{u - d}$, and the probability of a down movement, $1 - p = \frac{u - e^{r\,dt}}{u - d}$, into the variance equation gives
$$e^{2r\,dt + \sigma^2 dt} = \frac{e^{r\,dt} - d}{u - d}\, u^2 + \frac{u - e^{r\,dt}}{u - d}\, d^2,$$
$$e^{2r\,dt + \sigma^2 dt} = \frac{e^{r\,dt} u^2 - \overbrace{d u^2}^{=u} + \overbrace{u d^2}^{=d} - e^{r\,dt} d^2}{u - d},$$
$$e^{2r\,dt + \sigma^2 dt} = \frac{e^{r\,dt}(u^2 - d^2) - (u - d)}{u - d},$$
$$e^{2r\,dt + \sigma^2 dt} = e^{r\,dt}(u + d) - 1.$$
Using $d = 1/u$ to remove $d$ and multiplying by a common $u$ creates a quadratic equation (Brandimarte 2006) given by
$$0 = u^2 e^{r\,dt} - u\left(e^{2r\,dt + \sigma^2 dt} + 1\right) + e^{r\,dt},$$
with a quadratic root of
$$u = \frac{\left(e^{2r\,dt + \sigma^2 dt} + 1\right) + \sqrt{\left(e^{2r\,dt + \sigma^2 dt} + 1\right)^2 - 4 e^{r\,dt} e^{r\,dt}}}{2 e^{r\,dt}},$$
$$u = \frac{1}{2} e^{-r\,dt}\left[e^{2r\,dt + \sigma^2 dt} + 1 + \sqrt{\left(e^{2r\,dt + \sigma^2 dt} + 1\right)^2 - 4 e^{r\,dt} e^{r\,dt}}\right].$$
The Taylor series expansion of the exponential function is
$$e^x = \sum_{n=0}^{\infty} \frac{x^n}{n!} = 1 + x + \frac{x^2}{2!} + \cdots.$$
Using the first two terms of the Taylor series to approximate the exponentials in the quadratic root gives
$$u = \frac{1}{2}\left[1 + 2r\,dt + \sigma^2 dt + 1 + \sqrt{\left(1 + 2r\,dt + \sigma^2 dt + 1\right)^2 - 4(1 + 2r\,dt)}\right](1 - r\,dt).$$
Terms of $(dt)^2$ are assumed to be negligibly small; thus,
$$u = \frac{1}{2}\left[2 + 2r\,dt + \sigma^2 dt + \sqrt{4\sigma^2 dt}\right](1 - r\,dt),$$
$$u = \left[1 + r\,dt + \frac{1}{2}\sigma^2 dt + \sqrt{\sigma^2 dt}\right](1 - r\,dt).$$
Ignoring $dt$ terms raised to a power greater than unity gives
$$u = 1 + r\,dt + \frac{1}{2}\sigma^2 dt + \sqrt{\sigma^2 dt} - r\,dt.$$
Further simplification shows that this root,
$$u = 1 + \sqrt{\sigma^2 dt} + \frac{1}{2}\sigma^2 dt,$$
is a second-order Taylor series expansion of $u = e^{\sigma\sqrt{dt}}$. In summary, the constraints of the Cox et al. (1979) tree methodology are
$$E(S_T) = S_0 e^{(r-y)dt} = pS_0 u + (1 - p)S_0 d,$$
$$u = d^{-1},$$
$$u = e^{\sigma\sqrt{dt}},$$
$$d = e^{-\sigma\sqrt{dt}},$$
$$p = \frac{e^{(r-y)dt} - d}{u - d}.$$
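The constraints can be checked numerically. The sketch below, with assumed parameter values, computes the CRR step sizes and the up probability, then compares the first two moments of the one-step tree return with the log-normal targets. The mean is matched exactly by construction; the variance is matched only up to terms of order $dt^2$, which is what the Taylor-series argument above implies.

```matlab
% Assumed inputs for illustration (seven steps over one year).
r = 0.05; y = 0; sigma = 0.1; T = 1; N = 7;
dt = T/N;
u = exp(sigma*sqrt(dt)); d = 1/u;       % CRR up and down factors
a = exp((r - y)*dt);                    % growth factor
p = (a - d)/(u - d);                    % risk-neutral up probability
treeMean = p*u + (1 - p)*d;             % E[S_{t+dt}/S_t] on the tree
treeVar  = p*u^2 + (1 - p)*d^2 - treeMean^2;
lognVar  = exp(2*(r - y)*dt)*(exp(sigma^2*dt) - 1);   % log-normal target
fprintf('u = %.4f, d = %.4f, p = %.4f\n', u, d, p);
fprintf('mean gap %.2e, variance gap %.2e (dt^2 = %.2e)\n', ...
    treeMean - a, treeVar - lognVar, dt^2);
```

Increasing N shrinks the variance gap roughly in proportion to dt^2, which is one way to see why the tree price approaches the continuous-time price.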
The size of the up and down movements and, thus, the spacing between the nodes is a measure of the future stock volatility. The Cox, Ross, Rubinstein (CRR) tree with an infinite number of infinitely small time steps represents the continuous evolution of the asset in a risk-neutral measure with constant volatility over time (Derman et al., 1996). The binomial tree initiates at a singular node, representing the current asset price at time zero. The risk-neutral states of the asset price are repeatedly constructed by movements up to $Su = Se^{\sigma\sqrt{dt}}$ and down to $Sd = Se^{-\sigma\sqrt{dt}}$. The time step is selected such that the final layer of the nodes coincides with the expiration time of the option contract. In the case of a European option, the value at each node in the final layer is calculated as $f = \max[0, (S - K)]$. These final node values are stepped backward in time by the equation $f = e^{-r\,dt}[p f_u + (1 - p) f_d]$ until the time-zero value of the option is calculated. In the continuous limit, a European option price calculated via the CRR tree converges to the price given by the Black–Scholes equation. The Matlab Financial Toolbox has functions to display the tree generated by the particular calculation, but these functions are not in the standard Matlab distribution. Where needed, the functions described in this chapter include a brief graphing procedure to generate a graphical tree.
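The backward induction just described fits in a few lines. The following is a minimal sketch, not the book's binomial function: it holds one layer of option values in a column vector (highest node first) and steps it back to the root. The inputs follow the caption of Figure 4.1, with S0 = 50 taken from the root node of that figure; the variable names are illustrative.

```matlab
% Inputs as in the caption of Figure 4.1 (S0 = 50 read from its root node).
S0 = 50; K = 50; r = 0.05; y = 0; sigma = 0.1; T = 1;
Nlist = [7 200];                       % a coarse tree and a finer one
cTree = zeros(size(Nlist));
for m = 1:numel(Nlist)
    N = Nlist(m);
    dt = T/N;
    u = exp(sigma*sqrt(dt)); d = 1/u;
    p = (exp((r - y)*dt) - d)/(u - d);
    j = (0:N)';                        % number of down moves at expiration
    f = max(S0*u.^(N - j).*d.^j - K, 0);    % call payoff, highest node first
    for n = N:-1:1                     % step back one layer at a time
        f = exp(-r*dt)*(p*f(1:n) + (1 - p)*f(2:n+1));
    end
    cTree(m) = f;                      % f has collapsed to the root value
end
% Black-Scholes reference value
normCDF = @(x) 0.5*erfc(-x/sqrt(2));
d1 = (log(S0/K) + (r - y + sigma^2/2)*T)/(sigma*sqrt(T));
cBS = exp(-y*T)*S0*normCDF(d1) - exp(-r*T)*K*normCDF(d1 - sigma*sqrt(T));
fprintf('7 steps: %.4f, 200 steps: %.4f, Black-Scholes: %.4f\n', cTree, cBS);
```

With seven steps the root value rounds to the 3.45 shown in Figure 4.1, and the 200-step value lies close to the Black–Scholes price, illustrating the convergence claimed in the text.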
4.5. RECURSIVE BINOMIAL TREE
A recursive implementation is provided by the functions binomial and binomialgraph. Recursion provides a fairly elegant method to emphasize the interdependency of the nodes moving forward and backward in time. This recursive approach is certainly not the fastest technique to traverse through the tree. A faster vector-based approach is discussed later in the chapter. The recursive binomial tree calculation of the price of a European call option on a nondividend paying stock is shown in Figure 4.1.
[Figure 4.1: tree plot titled "Stock call binomial tree: yield = 0"; horizontal axis: Time (years); vertical axis: Asset and call value; root node: asset price 50.00, call value 3.45.]
FIGURE 4.1 European call price on a nondividend (y = 0) paying stock calculated via a seven-step binomial tree for a strike price K = 50, volatility σ = 0.1, an expiration of 1 year, and a risk-free rate r = 0.05. At each node in the tree, the upper value is the asset price and the lower value is the corresponding call value.
It is easy to show that the value of an American option on a stock potentially adds a premium for early exercise at earlier nodes as
$$f = \max\left[e^{-r\,dt}\left(p f_u + (1 - p) f_d\right),\ (S - K)\right].$$
A yield greater than zero is necessary to create an early exercise premium for a call option on a stock. The binomial function can be easily modified to show that
the value of an American put exceeds the value of a European put. The difference in early exercise premium of an American put option (vs an American call option) is attributable to the lower limit of zero on the asset price.
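The put comparison can be sketched with the same layer-by-layer induction, replacing the payoff by max(K - S, 0) and adding the early exercise test at every node. This is an illustrative sketch under assumed inputs (the parameters of Figure 4.1 with 200 steps), not the book's binomial function.

```matlab
% Assumed inputs: Figure 4.1 parameters, but with a finer 200-step tree.
S0 = 50; K = 50; r = 0.05; sigma = 0.1; T = 1; N = 200;
dt = T/N;
u = exp(sigma*sqrt(dt)); d = 1/u;
p = (exp(r*dt) - d)/(u - d);
disc = exp(-r*dt);
j = (0:N)';
S = S0*u.^(N - j).*d.^j;               % terminal prices, highest node first
pEur = max(K - S, 0);                  % European put payoff
pAm  = pEur;                           % American put starts from the same payoff
for n = N:-1:1
    S = S(1:n)/u;                      % asset prices one layer earlier
    pEur = disc*(p*pEur(1:n) + (1 - p)*pEur(2:n+1));
    % American: hold (continuation value) or exercise now (K - S)
    pAm  = max(disc*(p*pAm(1:n) + (1 - p)*pAm(2:n+1)), K - S);
end
premium = pAm - pEur;                  % early exercise premium at the root
fprintf('European put %.4f, American put %.4f, premium %.4f\n', ...
    pEur, pAm, premium);
```

Deep in the money, the put holder can collect at most K, so waiting only delays interest on the strike; that is where the max picks the exercise value.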
4.6. FUTURES OPTION TREE
As a futures contract does not require an initial cash investment or exchange, the expected growth rate should be zero. In other words, the calculation requires a cost of carry of zero, $b = 0$, or equivalently a yield equal to the risk-free rate. For an initial futures price of $F_0$, the expected value after one time step is
$$E(F_t) = F_0 \underbrace{e^{0\,dt}}_{=1} = pF_0 u + (1 - p)F_0 d,$$
thus the probability of an up movement is
$$p = \frac{a - d}{u - d} = \frac{1 - d}{u - d},$$
where the growth factor is equal to one, $a = e^{0\,dt} = 1$. The value of each node at the expiration level is calculated as $f = \max[0, (S - K)]$. Like an option on an asset, the futures option is discounted at the risk-free rate to obtain the present value of the option. The value of a European option on a futures contract is stepped backward in time by the following equation:
$$f = e^{-r\,dt}\left[p f_u + (1 - p) f_d\right].$$
The value of an American option on a futures contract adds a possible premium for early exercise at earlier time steps as
$$f = \max\left[e^{-r\,dt}\left(p f_u + (1 - p) f_d\right),\ (S - K)\right].$$
A binomial tree used to calculate the price of a call option on a futures contract is depicted in Figure 4.2.
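The only change relative to the stock tree is the growth factor a = 1 in the up probability; discounting still uses the risk-free rate. A minimal sketch with the parameters quoted for Figure 4.2 follows (F0 = 50 is read from the root node of that figure; the variable names are illustrative):

```matlab
% Parameters as in the caption of Figure 4.2; F0 = 50 from its root node.
F0 = 50; K = 50; r = 0.05; sigma = 0.1; T = 1; N = 7;
dt = T/N;
u = exp(sigma*sqrt(dt)); d = 1/u;
p = (1 - d)/(u - d);                   % growth factor a = 1 for a futures contract
disc = exp(-r*dt);                     % option values are still discounted at r
j = (0:N)';
F = F0*u.^(N - j).*d.^j;               % terminal futures prices, highest first
fEur = max(F - K, 0);
fAm  = fEur;
for n = N:-1:1
    F = F(1:n)/u;                      % futures prices one layer earlier
    fEur = disc*(p*fEur(1:n) + (1 - p)*fEur(2:n+1));
    fAm  = max(disc*(p*fAm(1:n) + (1 - p)*fAm(2:n+1)), F - K);
end
fprintf('p = %.4f, European %.4f, American %.4f\n', p, fEur, fAm);
```

Because the futures price has no drift while the option value is discounted, early exercise of a deep in-the-money call can be optimal, so the American value is at least the European value.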
[Figure 4.2: tree plot titled "Futures call binomial tree: b = 0"; horizontal axis: Time (years); vertical axis: Asset and call value; root node: futures price 50.00, call value 1.99.]
FIGURE 4.2 American call price on a futures contract calculated via a seven-step binomial tree. The calculation assumes a cost of carry of zero (y = r), a strike price K = 50, volatility σ = 0.1, an expiration of 1 year, and a risk-free rate r = 0.05. At each node in the tree, the upper value is the asset price and the lower value is the corresponding call value.
4.7. MEMORY AND CPU IMPROVEMENTS
The repeated function calls of the previous recursive binomial tree are without doubt not the optimal implementation of a binomial tree. Modern versions of Matlab have reduced a number of its supposed CPU bottlenecks, but a high number of function calls is inherently nonoptimal. The hope was that the recursive nature would prove insightful. The simplest code to produce an N-time-step binomial tree stores the asset and option prices within an N × N array. The memory requirements rapidly increase with the number of steps. A large number of steps is typically needed for an accurate representation of the asset and corresponding option price. Matlab has highly optimized array handling routines, particularly when vectorization is employed, for example, a(1:5) = b(1:5). Nevertheless, handling large multidimensional arrays requires significant CPU cycles in any mathematical or programming language. The recombinant nature of the binomial tree allows the entire tree to be represented by a vector of length 2N + 1. This greatly reduces the memory and CPU requirements. The key insight is that an N-time-step binomial tree only has 2N + 1 possible asset prices (Brandimarte 2006). For example, the central trunk of the CRR tree always equals the initial asset price $S_0$. The symmetric relation $u = 1/d$ of exponential up and down movements gives $u(dS_0) = S_0$. Similarly, the first down level $S_d = dS_0$ can also be reached by one up and two down movements, $S_d = u[d(dS_0)]$, and so on. This vector approach is implemented in the function VectorBinomial.
4.8. SMILE AND SMIRK
The underlying concept of the Black–Scholes equation is that asset prices have a constant volatility regardless of time and asset price. This erroneous assumption is most visible when the volatility implied in the Black–Scholes equation is viewed for European options as a function of time until maturity and strike price. Figure 4.3 displays the volatility implied from XEO S&P 100 Index European Options. The data are from out of the money put options with strike prices below the $567 asset price as well as out of the money call options with strike prices above the $567 current asset price. The negative relationship between out of the money option volatilities and strike price is often referred to as a smirk. The time dependence is weaker, but its most significant aspect is that out of the money put options with low strike prices increase in implied volatility as the option approaches expiration. The value added to the price and, thus, the implied volatility of out of the money put options relative to at the money call or put options reveals two issues.
The first issue is that asset prices do not follow a log-normal distribution and additional features such as stochastic volatility or jump processes, especially negative jumps, are necessary to properly model true asset movements. The second issue, which is related to the first, is that equity investors fear sharp downward movements that can lead to financial ruin and pay a premium to hedge against them. For psychological reasons and because of modern quantitative analysis, the smirk has been more visible since the 1987 market crash. Foreign exchange markets tend to show a positive skew at strike prices above the market price as well as the negative skew below the current market price as just discussed. This combination of negative and positive skew creates a smile. Various points in this book detail incorporating jumps into the model of asset prices. Jumps as well as stochastic volatility have the drawbacks of increased complexity and lack of completeness, that is, the inability to hedge options with the underlying asset (Dupire 1994). Arbitrage pricing and hedging (with the underlying asset) is unavailable when the model has nontradable sources of risk such as jumps, stochastic volatility, or transaction costs.
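The implied volatility discussed in this section is simply the volatility that makes the Black–Scholes price equal to a quoted option price. Because the call price increases monotonically with volatility, a bisection search is enough. The sketch below uses a synthetic "quote" generated at a known volatility rather than market data; the rate, strike, maturity, and bracket limits are assumptions.

```matlab
% Black-Scholes call on a nondividend asset, written as an anonymous function.
normCDF = @(x) 0.5*erfc(-x/sqrt(2));
bsCall = @(K,S,T,vol,r) ...
    S.*normCDF((log(S./K) + (r + vol.^2/2).*T)./(vol.*sqrt(T))) ...
    - exp(-r.*T).*K.*normCDF((log(S./K) + (r - vol.^2/2).*T)./(vol.*sqrt(T)));

S = 567; K = 500; T = 0.5; r = 0.01;   % assumed inputs; S as quoted in the text
trueVol = 0.3;                         % volatility used to build the synthetic quote
target = bsCall(K,S,T,trueVol,r);      % stands in for a market price

lo = 1e-4; hi = 3;                     % bracket for the volatility (assumption)
for it = 1:60
    mid = 0.5*(lo + hi);
    if bsCall(K,S,T,mid,r) > target
        hi = mid;                      % model price too high: lower the volatility
    else
        lo = mid;
    end
end
impVol = 0.5*(lo + hi);
fprintf('implied volatility %.6f (built with %.6f)\n', impVol, trueVol);
```

Repeating the search over a grid of strikes and expirations with real quotes would produce a surface of the kind plotted in Figure 4.3.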
[Figure 4.3: surface plot titled "XEO S&P100 out of the money option data"; axes: Strike, Expiration (years), and Volatility estimated.]
FIGURE 4.3 Implied volatility of XEO S&P 100 Index European Options as of December 27, 2010.
4.9. IMPLIED LOCAL VOLATILITY
Figure 4.3 in effect extracts a deterministic equation for implied volatility as a function of time to expiration and strike price. The implied volatility is not the expected local volatility at a point in time at a particular asset price. Rather, it is the implied constant future local volatility that matches the Black–Scholes option price to the current market option price. Derman et al. (1995) provide an analogy to the quoted yield to maturity of a fixed income security. The yield of a Treasury bond is the implied constant forward discount rate that matches the current market bond price to the value of the bond's coupon and principal payments. The forward rate is the rate of interest at a particular point in time. The forward rate curve (in time) is extracted from a series of liquid Treasury bond prices and usually does not equal the Treasury bond yield. Similarly, the local implied volatility of an asset is the volatility only at a future point in time as extracted from a series of vanilla options. Derman and Kani (1994) developed an approach to map the implied local volatility surface to a binomial tree as determined by the implied constant volatility function. The implied constant volatility surface can be interpolated from
market-implied option volatilities. Derman and Kani's original presentation used a time-independent implied volatility function with a negative skew relative to asset prices. The general idea of the Derman and Kani technique is that the local volatility on the path up to a node in the tree and, thus, the option value in the tree is set so that the tree option value equals the calculated option value. The calculated option value is preferably computed by a separate CRR tree with a constant volatility implied from the option market value. The contribution of the nodes at a point in time is determined by the earlier time asset movements and local volatilities. This information is encapsulated in the node's Arrow–Debreu price, $\lambda_n^i$, which is the price of an option that pays one unit of payoff if the stock price attains the value of $S_n^i$ at state $i$ and time level $n$ (Hardle and Mysickova, 2008). The Arrow–Debreu price is the discounted expected value of the payoff and is calculated by
$$\lambda_1^1 = 1,$$
$$\lambda_{n+1}^1 = e^{-r\Delta t}\left[\lambda_n^1 p_n^1\right],$$
$$\lambda_{n+1}^{i+1} = e^{-r\Delta t}\left[\lambda_n^{i+1} p_n^{i+1} + \lambda_n^i \left(1 - p_n^i\right)\right],$$
$$\lambda_{n+1}^{n+1} = e^{-r\Delta t}\left[\lambda_n^n \left(1 - p_n^n\right)\right].$$
The initial node is selected as one (i.e., not zero) to coincide with Matlab's numbering scheme. In this indexing, $i = 1$ is the highest node of a level; an up movement keeps the index and a down movement raises it by one. Employing the Arrow–Debreu price was not necessary for the derivation of the constant volatility CRR binomial tree. In the implied binomial tree (IBT), the Arrow–Debreu price allows direct valuation of a call option with time duration $\tau$ based on moneyness $(S - K)$ and Arrow–Debreu prices at the nodes at time level $\tau = n\Delta t$, as given by
$$c(K, n\Delta t) = \sum_{i=0}^{n} \lambda_{n+1}^{i+1} \max\left(S_{n+1}^{i+1} - K, 0\right).$$
Similarly, the put option on the IBT is given by
$$p(K, n\Delta t) = \sum_{i=0}^{n} \lambda_{n+1}^{i+1} \max\left(K - S_{n+1}^{i+1}, 0\right).$$
The risk-neutral condition for the forward price at any node is
$$F_n^i = p_n^i S_{n+1}^{i} + \left(1 - p_n^i\right) S_{n+1}^{i+1}.$$
Rearranging to isolate the upward transition probability gives
$$p_n^i = \frac{F_n^i - S_{n+1}^{i+1}}{S_{n+1}^{i} - S_{n+1}^{i+1}}.$$
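The Arrow–Debreu recursion is easiest to see on a constant-volatility CRR tree, where the transition probability is the same at every node. The sketch below, with assumed inputs, builds the prices forward in time using the same indexing as the equations (row 1 is the highest node of each level) and then checks two properties: the prices at the final level sum to the discount factor, and a call valued by the Arrow–Debreu sum equals the value from backward induction. The interest rate and strike are assumptions; sigma = 0.3 with five steps over two years gives the node spacing seen in Figure 4.4.

```matlab
% Assumed inputs; r and K are illustrative choices.
S0 = 100; sigma = 0.3; r = 0.02; T = 2; N = 5; K = 100;
dt = T/N;
u = exp(sigma*sqrt(dt)); d = 1/u;
p = (exp(r*dt) - d)/(u - d);           % constant across a CRR tree
disc = exp(-r*dt);

% lambda(i,n): Arrow-Debreu price of node i (1 = highest) at time level n.
lambda = zeros(N+1, N+1);
lambda(1,1) = 1;
for n = 1:N
    lambda(1,n+1) = disc*lambda(1,n)*p;                  % top node: up from the top
    for idx = 1:n-1                                      % interior nodes
        lambda(idx+1,n+1) = disc*(lambda(idx+1,n)*p + lambda(idx,n)*(1 - p));
    end
    lambda(n+1,n+1) = disc*lambda(n,n)*(1 - p);          % bottom node: down from the bottom
end

idx = (1:N+1)';
Slast = S0*u.^(N - (idx - 1)).*d.^(idx - 1);   % asset prices at the final level
cAD = sum(lambda(:,N+1).*max(Slast - K, 0));   % call from Arrow-Debreu prices

f = max(Slast - K, 0);                          % same call by backward induction
for n = N:-1:1
    f = disc*(p*f(1:n) + (1 - p)*f(2:n+1));
end
fprintf('sum of AD prices %.6f, exp(-rT) %.6f\n', sum(lambda(:,N+1)), exp(-r*T));
fprintf('call via AD prices %.6f, via backward induction %.6f\n', cAD, f);
```

In the implied tree the same recursion is used, except that each node carries its own probability p(i,n) determined from the forward and the fitted child nodes.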
Restating the earlier discussion, a call option $c$ priced via a CRR tree, Black–Scholes, etc. with a known constant volatility must (within numerical error) equate to a call option price priced via the nodes in the IBT. The value of a call option priced with a CRR tree is $c(K, \tau_{n+1} = n\Delta t)$, where a time to expiration $\tau_{n+1}$ is at the $n + 1$ time level in the CRR or IBT tree. The value of the call option on the IBT is found from
$$c(K, n\Delta t) = \sum_{i=0}^{n} \lambda_{n+1}^{i+1} \max\left(S_{n+1}^{i+1} - K, 0\right).$$
Moving back by one time step gives
$$c(K, n\Delta t) = e^{-r\Delta t}\left[\begin{aligned} &\lambda_n^1 p_n^1 \max\left(S_{n+1}^1 - K, 0\right) \\ &+ \sum_{j=2}^{n} \left(\lambda_n^j p_n^j + \lambda_n^{j-1}\left(1 - p_n^{j-1}\right)\right) \max\left(S_{n+1}^{j} - K, 0\right) \\ &+ \lambda_n^n \left(1 - p_n^n\right) \max\left(S_{n+1}^{n+1} - K, 0\right) \end{aligned}\right].$$
Setting the strike price equal to the $i$th node at the $n$th step, $K = S_n^i$, allows the out of the money nodes less than the strike price to be disregarded,
$$c(K, n\Delta t) = e^{-r\Delta t}\left[\lambda_n^1 p_n^1 \left(S_{n+1}^1 - K\right) + \sum_{j=2}^{i} \left(\lambda_n^j p_n^j + \lambda_n^{j-1}\left(1 - p_n^{j-1}\right)\right)\left(S_{n+1}^{j} - K\right)\right].$$
Shifting the discount factor and separating the $i$th terms in the summation gives
$$c(K, n\Delta t)\, e^{r\Delta t} = \left[\begin{aligned} &\lambda_n^1 p_n^1 \left(S_{n+1}^1 - K\right) \\ &+ \sum_{j=2}^{i-1} \left(\lambda_n^j p_n^j + \lambda_n^{j-1}\left(1 - p_n^{j-1}\right)\right)\left(S_{n+1}^{j} - K\right) \\ &+ \left(\lambda_n^i p_n^i + \lambda_n^{i-1}\left(1 - p_n^{i-1}\right)\right)\left(S_{n+1}^{i} - K\right) \end{aligned}\right].$$
Splitting the summation gives
$$c(K, n\Delta t)\, e^{r\Delta t} = \left[\begin{aligned} &\lambda_n^1 p_n^1 \left(S_{n+1}^1 - K\right) \\ &+ \sum_{j=2}^{i-1} \lambda_n^{j-1}\left(1 - p_n^{j-1}\right)\left(S_{n+1}^{j} - K\right) \\ &+ \sum_{j=2}^{i-1} \lambda_n^j p_n^j \left(S_{n+1}^{j} - K\right) \\ &+ \left(\lambda_n^i p_n^i + \lambda_n^{i-1}\left(1 - p_n^{i-1}\right)\right)\left(S_{n+1}^{i} - K\right) \end{aligned}\right].$$
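A shift of the summation by
$$\sum_{j=2}^{i-1} \lambda_n^{j-1}\left(1 - p_n^{j-1}\right)\left(S_{n+1}^{j} - K\right) = \sum_{j=1}^{i-2} \lambda_n^{j}\left(1 - p_n^{j}\right)\left(S_{n+1}^{j+1} - K\right)$$
gives
$$c(K, n\Delta t)\, e^{r\Delta t} = \left[\begin{aligned} &\sum_{j=1}^{i-2} \lambda_n^{j}\left(1 - p_n^{j}\right)\left(S_{n+1}^{j+1} - K\right) + \lambda_n^{i-1}\left(1 - p_n^{i-1}\right)\left(S_{n+1}^{i} - K\right) \\ &+ \sum_{j=2}^{i-1} \lambda_n^j p_n^j \left(S_{n+1}^{j} - K\right) + \lambda_n^1 p_n^1 \left(S_{n+1}^1 - K\right) \\ &+ \lambda_n^i p_n^i \left(S_{n+1}^{i} - K\right) \end{aligned}\right].$$
This allows a reduction to the form
$$c(K, n\Delta t)\, e^{r\Delta t} = \sum_{j=1}^{i-1} \lambda_n^{j}\left(1 - p_n^{j}\right)\left(S_{n+1}^{j+1} - K\right) + \sum_{j=1}^{i-1} \lambda_n^j p_n^j \left(S_{n+1}^{j} - K\right) + \lambda_n^i p_n^i \left(S_{n+1}^{i} - K\right).$$
There are unknown future values that can be eliminated with the risk-neutral condition
$$F_n^i = p_n^i S_{n+1}^{i} + \left(1 - p_n^i\right) S_{n+1}^{i+1},$$
to give
$$c(K, n\Delta t)\, e^{r\Delta t} = \sum_{j=1}^{i-1} \lambda_n^{j}\left(F_n^j - K_{local}\right) + \lambda_n^i p_n^i \left(S_{n+1}^{i} - K_{local}\right),$$
where $\Sigma = \sum_{j=1}^{i-1} \lambda_n^{j}\left(F_n^j - K_{local}\right)$ is the summation term and $K_{local} = S_n^i$ is added to emphasize that a new strike price is used for each node in a level. As $p_n^i = \left(F_n^i - S_{n+1}^{i+1}\right)/\left(S_{n+1}^{i} - S_{n+1}^{i+1}\right)$,
$$c\left(S_n^i, n\Delta t\right) e^{r\Delta t} = \Sigma + \lambda_n^i\, \frac{F_n^i - S_{n+1}^{i+1}}{S_{n+1}^{i} - S_{n+1}^{i+1}} \left(S_{n+1}^{i} - S_n^i\right),$$
$$\left[c\left(S_n^i, n\Delta t\right) e^{r\Delta t} - \Sigma\right]\left(S_{n+1}^{i} - S_{n+1}^{i+1}\right) = \lambda_n^i \left(F_n^i - S_{n+1}^{i+1}\right)\left(S_{n+1}^{i} - S_n^i\right),$$
which reduces to the equation used to iterate up the nodes at one time level,
$$S_{n+1}^{i} = \frac{S_{n+1}^{i+1}\left[c\left(S_n^i, n\Delta t\right) e^{r\Delta t} - \Sigma\right] - \lambda_n^i S_n^i \left(F_n^i - S_{n+1}^{i+1}\right)}{\left[c\left(S_n^i, n\Delta t\right) e^{r\Delta t} - \Sigma\right] - \lambda_n^i \left(F_n^i - S_{n+1}^{i+1}\right)},$$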
starting at the center node. For an odd-numbered level, for example, 1, 3, 5, . . . , the central node equals the initial node, $S_1^1 = S_3^2 = S_5^3 = \cdots$, as in the CRR tree. The even levels have two nodes that symmetrically (in log space) straddle the center of the tree. Derman and Kani (1994) maintained the CRR logarithmic spacing by employing the relation $S_{n+1}^{i+1} = \left(S_n^i\right)^2 / S_{n+1}^{i}$, such that the upper node is
$$S_{n+1}^{i} = \frac{S_n^i\left[c\left(S_n^i, n\Delta t\right) e^{r\Delta t} + \lambda_n^i S_n^i - \Sigma\right]}{\lambda_n^i F_n^i - c\left(S_n^i, n\Delta t\right) e^{r\Delta t} + \Sigma}.$$
Once the upper central node is calculated, the lower central node is found via the logarithmic spacing equation, $S_{n+1}^{i+1} = \left(S_1^1\right)^2 / S_{n+1}^{i}$. The lower half of the tree is calculated in a similar manner by equating the CRR tree put price to the IBT put value. The equation used to iterate down the nodes at one time level is
$$S_{n+1}^{i} = \frac{S_{n+1}^{i-1}\left[p\left(S_n^{i-1}, n\Delta t\right) e^{r\Delta t} - \Sigma\right] + \lambda_n^{i-1} S_n^{i-1}\left(F_n^{i-1} - S_{n+1}^{i-1}\right)}{\left[p\left(S_n^{i-1}, n\Delta t\right) e^{r\Delta t} - \Sigma\right] + \lambda_n^{i-1}\left(F_n^{i-1} - S_{n+1}^{i-1}\right)},$$
again starting at the lowest central node. Figure 4.4 displays the IBT calculated for constant volatility by the function DermanKani(0.3), where a volatility of 0.3 is the input parameter. When the volatility is constant, the Derman and Kani procedure replicates a CRR tree.
[Figure 4.4: implied binomial tree plot; each node is labeled with its asset price S, Arrow–Debreu price AD, and transition probability p (p = 0.47 at every node shown); horizontal axis: Time (years); vertical axis: Asset value.]
FIGURE 4.4 Implied binomial tree calculated with a constant volatility of 0.3. The characteristic logarithmic distribution is evident.
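The statement that the procedure reproduces a CRR tree under constant volatility can be checked by hand for the first step. At the root there are no nodes above the strike, so the summation term is zero and the Arrow–Debreu price is one. Feeding the one-step CRR call price struck at the root into the upper-central-node formula should return the CRR up node. The following is a sketch under assumed inputs (r and dt are illustrative); it is not the book's DermanKani function.

```matlab
% Assumed inputs for a single time step from the root.
S0 = 100; sigma = 0.3; r = 0.02; dt = 0.4;
u = exp(sigma*sqrt(dt)); d = 1/u;
p = (exp(r*dt) - d)/(u - d);           % CRR transition probability
F = S0*exp(r*dt);                      % forward price of the root node
lambda = 1;                            % Arrow-Debreu price of the root
SigmaTerm = 0;                         % no nodes above the root
c = exp(-r*dt)*p*(S0*u - S0);          % one-step CRR call struck at K = S0

% Upper central node, then the lower one from the logarithmic spacing.
Sup = S0*(c*exp(r*dt) + lambda*S0 - SigmaTerm) ...
        /(lambda*F - c*exp(r*dt) + SigmaTerm);
Sdown = S0^2/Sup;
pImplied = (F - Sdown)/(Sup - Sdown);  % implied transition probability
fprintf('implied up node %.4f vs CRR %.4f\n', Sup, S0*u);
fprintf('implied p %.4f vs CRR %.4f\n', pImplied, p);
```

If the call price were instead taken from a market quote with a different implied volatility, the same two lines would move the nodes away from the CRR positions, which is how the skew enters the tree.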
[Figure 4.5: implied binomial tree plot starting from S = 567.1; each node is labeled with its asset price S, Arrow–Debreu price AD, and transition probability p, which now varies from node to node; horizontal axis: Time (years); vertical axis: Asset value.]
FIGURE 4.5 Implied binomial tree matched to XEO S&P 100 Index European Options as of December 27, 2010.
The transition probability within any binomial tree should be between 0 and 1. Large and abrupt changes in the implied volatility with price or time can cause a calculated asset price to violate $F_n^i > S_{n+1}^{i+1} > F_n^{i+1}$. When a violation occurs, Derman and Kani suggest overriding the calculated asset price by maintaining the logarithmic spacing. If this approach fails, Hardle and Mysickova (2008) suggest using the average $S_{n+1}^{i+1} = (F_n^i + F_n^{i+1})/2$.

The IBT calculated by the function DermanKani() for XEO S&P 100 Index European Options is displayed in Figure 4.5. Some numerical stability issues are present in the IBT approach, and a number of alternate procedures have been developed since the initial work of Derman and Kani (1994). The value of knowing the local volatilities is that prices of American and exotic options can be calculated using the IBT that are consistent with European market prices.
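The override rule can be illustrated with a few lines of Matlab. This is only a minimal sketch and not the DermanKani function itself: the variable names Fn and Snext and all numerical values are assumptions made for the illustration, with prices stored from highest to lowest as in the appendix code.

```matlab
% Forward prices at time level n, highest first (hypothetical values)
Fn    = [121.5; 100.5; 83.1];
% Candidate node prices at level n+1; the third entry is deliberately bad
Snext = [146.2; 110.0; 101.0; 68.4];

% Interior node k at level n+1 must lie between its two parent forwards:
% Fn(k-1) > Snext(k) > Fn(k)
for k = 2:numel(Snext)-1
    if ~(Fn(k-1) > Snext(k) && Snext(k) > Fn(k))
        % Fallback: average of the neighbouring forwards
        % (the suggestion attributed to Hardle and Mysickova, 2008)
        Snext(k) = (Fn(k-1) + Fn(k))/2;
    end
end
disp(Snext.')
```

Only the violating node is replaced, so nodes that already satisfy the inequality keep their calibrated values.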
SUMMARY

The binomial tree is a numerically efficient and conceptually undemanding technique to value options. For a European option, the option value is calculated at expiration and this value is repeatedly stepped back in time to the present. A major advantage of the binomial tree approach is the computational ease of adding the early exercise premium present in American options. For small time steps, the asset values of the tree replicate the log-normal distribution and, thus, an option price will converge to the Black–Scholes price.

REFERENCES

Brandimarte, P. (2006) Numerical Methods in Finance: A MATLAB-Based Introduction, Wiley-Interscience.
Cox, J.C., Ross, S.A., Rubinstein, M. (1979) Option Pricing: A Simplified Approach, Journal of Financial Economics 7, 229.
Derman, E., Kani, I. (1994) Volatility Smile and its Implied Tree, Goldman Sachs Quantitative Strategies Research Notes.
Derman, E., Kani, I., Chriss, N. (1996) Implied Trinomial Trees of the Volatility Smile, Goldman Sachs Quantitative Strategies Research Notes.
Derman, E., Kani, I., Zou, J.Z. (1995) The Local Volatility Surface: Unlocking the Information in Index Option Prices, Goldman Sachs Quantitative Strategies Research Notes.
Dupire, B. (January 1994) Pricing with a Smile, Risk Magazine.
Hardle, W.K., Mysickova, A. (2008) Numerics of Implied Binomial Trees, Applied Quantitative Finance 209.
Hull, J. (2005) Options, Futures and Other Derivatives, Prentice Hall.
APPENDIX

Code: Binomial

function OptionValue=binomial(S0,K,r,sigma,T,Steps)
% binomial calculates call price via recursive binomialgraph
% A futures option requires y=r thus b=0;
% An American option requires adding a premium
% for early exercise
AmericanFlag=0; %1=American, other=European
FuturesFlag=0; %1=Futures
if nargin < 6 % no inputs supplied: use default parameters
    S0=50; K=50; r=0.05; sigma=0.1; T=1; Steps=7;
end
% yield is set for supplied and default inputs alike
if (FuturesFlag==1)
    y=r; %Futures cost of carry b=0=r-y -> set y=r -> a=1
else
    y=0; %yield % dividend or convenience yield;
end
dt=T/Steps;
u=exp(sigma*sqrt(dt))
d=exp(-sigma*sqrt(dt))
a=exp((r-y)*dt) %Growth Factor
p=(a-d)/(u-d)
%OneMinP=1-p
t0=0;
figure
OptionValue = binomialgraph(t0,S0,u,d,Steps,dt,K,p,r,...
    AmericanFlag,FuturesFlag)
xlabel('Time (Years)'); ylabel('Asset and Call Value');
if (FuturesFlag==1)
    title(['Futures Call Binomial Tree: b = 0'])
else
    title(['Stock Call Binomial Tree: Yield = ',num2str(y)])
end
% axis tight
hold off
end
Code: binomialgraph

function OptionValue = binomialgraph(t,S,u,d,Steps,dt,...
    K,p,r,AmericanFlag,FuturesFlag)
% binomialgraph recursively steps through tree
% Start at expiration where OptionValue=max(0,(S-K))
% and recursively work back to present time t0
% and price S0
% The Matlab Financial Toolbox has several built-in commands
% to calculate financial derivatives via a tree and to view
% graphical tree. Since not a standard feature the following
% code will graph the tree using basic Matlab plotting
% commands
% The recursive function is useful to generate graph and
% recursion is intuitive to working option price backwards
% However, to simply calculate price without graph
% a 'lattice' tree converted to a vector should be faster
if (t [...]

[...] 2 node straddle center of tree
        midtop = (t+1)/2;
        K=S(midtop,t);
        if (ConstVolFlag==1)
            vol=ConstVol;
        else
            vol=max(0.1,...
                griddata(Tvector,Kvector,VolVector,time,K,'linear'))
        end
        CallFlag=1; EuropeanFlag=1; %European Call
        Call= VectorBinomial(S0,K,r,vol,time,N,d,CallFlag,EuropeanFlag);
        NodeSum=sum(Arrow(1:midtop-1,t)...
            .*(erdt.*S(1:midtop-1,t)-S(midtop,t)));
        F(midtop,t)=S(midtop,t)*erdt;
        S(midtop,t+1)=(S(midtop,t)...
            *(erdt*Call+Arrow(midtop,t)*S(midtop,t)-NodeSum))...
            / (Arrow(midtop,t)*F(midtop,t)-erdt*Call+NodeSum);
        midbot= midtop+1;
        S(midbot,t+1)=S0^2/S(midtop,t+1);
    else
        midtop = ceil((t+1)/2); %set bottom node for call loop
        midbot=midtop; %set upper limit for put loop
        S(midtop,t+1)=S0; %center trunk of tree
    end
    %%%Top of tree, Call option Nodes in the money%%%%%%%%
    for j=midtop-1:-1:1
        K=S(j,t);
        if (ConstVolFlag==1)
            vol=ConstVol;
        else
            vol=max(0.1,...
                griddata(Tvector,Kvector,VolVector,time,K,'linear'));
        end
        CallFlag=1; EuropeanFlag=1; %European Call
        Call = VectorBinomial(S0,K,r,vol,time,N,d,...
            CallFlag,EuropeanFlag);
        NodeSum=sum(Arrow(1:j-1,t).*(erdt.*S(1:j-1,t)-S(j,t)));
        F(j,t)=S(j,t)*erdt;
        S(j,t+1)=(S(j+1,t+1)*(erdt*Call-NodeSum)...
            -Arrow(j,t)*S(j,t)*(F(j,t)-S(j+1,t+1)))...
            /(erdt*Call-NodeSum-Arrow(j,t)*(F(j,t)-S(j+1,t+1)));
        if ( (S(j,t+1) [...] exp(maxvol*sqrt(dt))*F(j,t) )
            %Over-ride with Log spacing for stability
            %S(j,t+1)=S(j,t)^2/S(j+1,t+1);
            S(j,t+1)=exp(vol*sqrt(dt))*S(j,t);%More Stable
        end
    end%%% End Top Loop
    %%%Bottom of tree, Put option Nodes in the money%%%%%%%%
    for j=(midbot+1):1:(t+1)
        K=S(j-1,t);
        if (ConstVolFlag==1)
            vol=ConstVol;
        else
            vol=max(0.1,...
                griddata(Tvector,Kvector,VolVector,time,K,'linear'));
        end
        CallFlag=0; EuropeanFlag=1; %European Put
        Put = VectorBinomial(S0,K,r,vol,time,N,d,...
            CallFlag,EuropeanFlag);
        NodeSum=sum(Arrow(j:t,t).*(S(j-1,t)-erdt.*S(j:t,t)));
        F(j-1,t)=S(j-1,t)*erdt;
        S(j,t+1)=(S(j-1,t+1)*(erdt*Put-NodeSum)...
            +Arrow(j-1,t)*S(j-1,t)*(F(j-1,t)-S(j-1,t+1)))...
            /(erdt*Put-NodeSum+Arrow(j-1,t)*(F(j-1,t)-S(j-1,t+1)));
        if ( (S(j,t+1)>F(j-1,t)))%...
            %|| S(j,t+1) [...]

    % [...] Calculate all probabilities at this level
    Prob(1:t,t)=(F(1:t,t)-S(2:t+1,t+1))./(S(1:t,t+1)-S(2:t+1,t+1));
    %Calculate Top and Bottom Arrow Debreu nodes for next time step
    Arrow(1,t+1)=Arrow(1,t)*Prob(1,t)/erdt; %up
    Arrow(t+1,t+1)=Arrow(t,t)*(1-Prob(t,t))/erdt; %down
    %Vectorize -> Calculate rest of Arrow Debreu at next time step
    Arrow(2:t,t+1)=(Arrow(1:t-1,t).*(1-Prob(1:t-1,t))...
        +Arrow(2:t,t).*Prob(2:t,t))/erdt;
end %%%% End Derman Kani Calculation Time loop
figure
%
for t=1:Steps
    for i=1:t
        clear x y
        x(:,1)=[(t-1)*dt t*dt];%+dt
        x(:,2)=x(:,1);
        y(:,1)=[S(i,t) S(i,t+1)];
        y(:,2)=[S(i,t) S(i+1,t+1)];
        plot(x,y)
        txstr1(1) = {num2str(S(i,t),'S=%.1f')};
        txstr1(2) = {num2str(Arrow(i,t),'AD=%.2f')};
        txstr1(3) = {num2str(Prob(i,t),'p=%.2f')};
        %txstr1(4) = {num2str(F(i,t),'%.2f')};
        text((t-1)*dt,S(i,t),txstr1);
        hold on
    end
end
t=Steps+1;
for i=1:t % Expiration nodes
    txstr2(1) = {num2str(S(i,t),'S=%.1f')};
    txstr2(2) = {num2str(Arrow(i,t),'AD=%.2f')};
    text((t-1)*dt*0.95,S(i,t),txstr2);
    hold on
end
axis tight
xlabel('Time (Years)'); ylabel('Asset Value');
hold off
end % DermanKani function
Code: InvertOptionPrice

function Vol = InvertOptionPrice(r,S,d,T,K,P,CallFlag)
VolGuess=0.2;
options=[];%=optimset('Display','iter');
if (CallFlag==1) %Call
    [Estimates]=fminsearch(@callError,VolGuess,...
        options,K,S,T,r,d,P);
else %(CallFlag==0) %Put
    [Estimates]=fminsearch(@putError,VolGuess,...
        options,K,S,T,r,d,P);
end
Vol=max(0.001,Estimates);
end

function error = callError(vol,K,S0,T,r,d,P)
N=50;
CallFlag=1; EuropeanFlag=1;
c= VectorBinomial(S0,K,r,vol,T,N,d,...
    CallFlag,EuropeanFlag);
error=(c-P)^2;% minimize the squared error
end

function error = putError(vol,K,S0,T,r,d,P)
N=50;
CallFlag=0; EuropeanFlag=1;
p= VectorBinomial(S0,K,r,vol,T,N,d,...
    CallFlag,EuropeanFlag);
error=(p-P)^2;% minimize the squared error
end
5
Trinomial Trees
5.1. INTRODUCTION

The trinomial tree is a natural extension of the binomial tree in which the extra node adds an extra degree of freedom. The derivation of the basic trinomial tree is based on moment matching as well as a stability relation to preclude negative probabilities. The trinomial tree has a clear connection to a triangular slice of the rectangular explicit finite difference grid.

5.2. TRINOMIAL TREE DERIVATION

The geometric Brownian motion stochastic differential equation

$$dS_t = \mu S_t\,dt + \sigma S_t\,dW$$

can be transformed via $X = \ln S$ and Ito's lemma into the normally distributed

$$dX = \underbrace{\left(\mu - \tfrac{1}{2}\sigma^2\right)}_{=\eta}\,dt + \sigma\,dW,$$

where $\eta$ is the continuously compounded return drift factor. Similarly, the risk-neutral evolution of the price change

$$dS = (r - q)S\,dt + \sigma S\,dW,$$

with a dividend yield $q$, can be transformed into the logarithm of the price change with a log-normal drift of $\nu = r - \tfrac{1}{2}\sigma^2 - q$. A derivative $f$ based on the log price of the asset follows the differential equation

$$\frac{\partial f}{\partial t} + \underbrace{\left(r - q - \tfrac{1}{2}\sigma^2\right)}_{=\nu}\frac{\partial f}{\partial X} + \frac{1}{2}\sigma^2\frac{\partial^2 f}{\partial X^2} = rf.$$
In order to describe the price process with a trinomial tree, it is necessary to first review the statistical distributions of the log-price process. The variance (the second central moment about the mean $\nu t$) is $\sigma^2 t$. Compactly, the log-normal price change is expressed as

$$\ln S_t - \ln S_0 = X_t - X_0 = \delta X \sim \phi\left(\nu t,\ \sigma\sqrt{t}\right),$$

where $\phi(M, \sigma_{sd})$ denotes a normal distribution with mean $M$ and standard deviation $\sigma_{sd}$. By the intrinsic properties of the normal distribution, the $n$th raw moment is found by integrating

$$m'_n = \frac{1}{\sigma_{sd}\sqrt{2\pi}} \int_{-\infty}^{\infty} x^n\, e^{-\frac{(x-M)^2}{2\sigma_{sd}^2}}\,dx.$$

Thus, the first raw moment is the mean $m'_1 = E[X] = \nu t$, and the second raw moment is $m'_2 = E[X^2] = \nu^2 t^2 + \sigma^2 t$.

At any node in a trinomial tree, three potential nodes can be reached at the next time step $\delta t$. The three nonnegative probabilities $p_u$, $p_d$, and $p_m$ of reaching the next up, down, or middle node, respectively, must sum to unity, $p_u + p_d + p_m = 1$. As $X$ is the logarithm of the asset price, the value after an up movement is $X^u_{i+1} = X_i + \delta x$, a down movement is $X^d_{i+1} = X_i - \delta x$, and the middle node is defined as no change, $X^m_{i+1} = X_i$. Matching the moments of the distribution to the moments of the nodes gives
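$$\mu'_1 = E[\delta X] = \nu\,\delta t = p_u\,\delta x - p_d\,\delta x + p_m \cdot 0$$

$$\mu'_2 = E[(\delta X)^2] = \nu^2 (\delta t)^2 + \sigma^2\,\delta t = p_u (\delta x)^2 + p_d (\delta x)^2 + p_m \cdot 0.$$

Solving for the probabilities gives

$$p_u = \frac{1}{2}\left[\frac{\nu^2 (\delta t)^2 + \sigma^2\,\delta t}{(\delta x)^2} + \frac{\nu\,\delta t}{\delta x}\right]$$

$$p_d = \frac{1}{2}\left[\frac{\nu^2 (\delta t)^2 + \sigma^2\,\delta t}{(\delta x)^2} - \frac{\nu\,\delta t}{\delta x}\right]$$

$$p_m = 1 - p_u - p_d = 1 - \frac{\nu^2 (\delta t)^2 + \sigma^2\,\delta t}{(\delta x)^2}.$$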
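The moment matching can be checked numerically. The following is a minimal sketch under assumed inputs (the rate, yield, volatility, and time step are illustrative values, and the spacing $\delta x = \sigma\sqrt{3\,\delta t}$ is just one admissible choice); it evaluates the three probabilities and recomputes the first two moments implied by the branches.

```matlab
r = 0.05; q = 0.0; sigma = 0.3;     % assumed market inputs
dt = 0.01;                          % assumed time step
dx = sigma*sqrt(3*dt);              % one admissible log-price spacing
nu = r - q - 0.5*sigma^2;           % risk-neutral drift of the log price

m2 = nu^2*dt^2 + sigma^2*dt;        % target second raw moment
pu = 0.5*(m2/dx^2 + nu*dt/dx);
pd = 0.5*(m2/dx^2 - nu*dt/dx);
pm = 1 - pu - pd;

% Moments implied by the up (+dx), middle (0), and down (-dx) branches
meanTree   = pu*dx + pm*0 + pd*(-dx);
secondTree = pu*dx^2 + pm*0 + pd*(-dx)^2;
fprintf('pu=%.4f pm=%.4f pd=%.4f\n', pu, pm, pd);
```

With these inputs the probabilities stay close to 1/6, 2/3, and 1/6 because the drift term is small relative to the diffusion term over one step.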
The trinomial tree has nodes $(j, i)$ that are located in time at $t = i\,\delta t$ with a log-price spacing of $X = j\,\delta x$, where $j = -j_{\max}, \ldots, 0, \ldots, j_{\max}$. The change in the log price $\delta x$ represents an additional degree of freedom. Often, the relation $\delta x = \sqrt{3\,\delta t}$ is used to approximate the normal distribution. Other acceptable relations used in the literature are $\delta x = \sqrt{2\,\delta t}$, $\delta x = \sqrt{\delta t}$, or $\delta x = \sigma\sqrt{3\,\delta t}$. Too large a price change will introduce errors as well as potentially negative probabilities.

Calculation of the trinomial tree can proceed in a two-dimensional matrix similar to a finite difference grid. Retaining the entire grid is useful for analysis such as calculating the Greeks. Earlier, it was shown that a vector approach can reduce the memory and CPU time necessary to compute the option value via the binomial lattice. The option values in the binomial tree are propagated between even and odd values in the vector. A similar approach can be applied to a trinomial lattice as discussed by Brandimarte (2006). An entire $N$ step trinomial tree of asset values can be contained in a vector of length $2N + 1$. Starting at the initial node, each new time step replicates the previous time step plus one higher and one lower node. The final time step has $2N + 1$ nodes. The option values are then propagated back in time. As the even/odd approach is not available, a temporary earlier time level vector is used to iterate the option value to time zero. The function VectorTrinomial applies this approach and has default parameters similar to the VectorBinomial function.

A comparison of the VectorTrinomial and VectorBinomial calculations is made by the function TreeRace. The results are observable in Figures 5.1–5.3. One striking aspect is the oscillatory behavior of the binomial tree. On the other hand, the binomial tree is known to be faster, as is shown here. Nevertheless, the trinomial tree allows an additional degree of freedom, which is useful for calibration, and the lack of oscillation in the asset nodes is important for certain options such as barrier options.
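The vector approach described above can be sketched in a few lines. This is not the book's VectorTrinomial function; it is a minimal illustration with assumed inputs that prices a European put on a $2N + 1$ node vector and compares the result with the Black–Scholes value, using erfc for the normal distribution function so that no toolbox is needed.

```matlab
S0 = 100; K = 100; r = 0.05; q = 0; sigma = 0.3; T = 1; N = 200; % assumed inputs
dt = T/N;
dx = sigma*sqrt(3*dt);
nu = r - q - 0.5*sigma^2;
m2 = nu^2*dt^2 + sigma^2*dt;
pu = 0.5*(m2/dx^2 + nu*dt/dx);
pd = 0.5*(m2/dx^2 - nu*dt/dx);
pm = 1 - pu - pd;
disc = exp(-r*dt);

% Asset prices at expiration: 2N+1 nodes, highest first
j  = (N:-1:-N).';
ST = S0*exp(j*dx);
f  = max(K - ST, 0);                % European put payoff

% Step back in time; time level i holds 2i+1 nodes.
% Node k at level i branches to nodes k (up), k+1 (middle), k+2 (down).
for i = N-1:-1:0
    n = 2*i + 1;
    fEarlier = disc*(pu*f(1:n) + pm*f(2:n+1) + pd*f(3:n+2));
    f = fEarlier;                   % temporary earlier-level vector
end
putTree = f;

% Black-Scholes reference value
Ncdf = @(x) 0.5*erfc(-x/sqrt(2));
d1 = (log(S0/K) + (r - q + 0.5*sigma^2)*T)/(sigma*sqrt(T));
d2 = d1 - sigma*sqrt(T);
putBS = K*exp(-r*T)*Ncdf(-d2) - S0*exp(-q*T)*Ncdf(-d1);
fprintf('Trinomial put = %.4f, Black-Scholes put = %.4f\n', putTree, putBS);
```

An American put would follow from the same loop by taking the maximum of fEarlier and the intrinsic value at each level, which requires the asset prices of that level as well.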
5.3. CALIBRATING THE TRINOMIAL TREE

Hull and White (1994) developed a robust two-stage procedure for matching current market forward rates to a one-factor interest rate model of the form

$$dR = [\theta(t) - aR]\,dt + \sigma\,dW.$$

Hull (2005) showed that this procedure is also applicable to a mean-reverting model

$$d(\ln S) = [\theta(t) - a \ln S]\,dt + \sigma\,dW$$

that is appropriate for describing commodity prices. This model was previously employed by Schwartz (1997) to model crude oil dynamics by

$$\frac{dS}{S} = \kappa[\mu - \ln S]\,dt + \sigma\,dW.$$
FIGURE 5.1 Comparison of European call calculation via a binomial tree to a trinomial tree. The option values calculated by the binomial tree and the trinomial tree are displayed in the middle graph with their difference displayed as a logarithm in the top graph. The bottom graph displays the calculation time versus the number of steps.
FIGURE 5.2 Comparison of European put calculations via a binomial tree versus a trinomial tree.
FIGURE 5.3 American put comparison of binomial tree to a trinomial tree calculation. Oscillations in the value of the binomial tree are severe for a small number of steps. As expected, the trinomial tree requires more calculation time as compared to a binomial tree for a given number of steps.
Applying Ito's lemma to the logarithm of the spot price, $X_t = \ln S_t$, gives the Ornstein–Uhlenbeck process as

$$dX_t = \kappa(\alpha - X_t)\,dt + \sigma\,dW_t,$$

where the mean log price is $\alpha = \mu - \frac{\sigma^2}{2\kappa}$. Matching the notation of Hull and Schwartz shows that

$$\theta(t) = \kappa\alpha, \qquad a = \kappa.$$

The logarithm of the spot price, $X_t = \ln S_t$, is pulled toward an equilibrium level at a rate $\kappa$, and $\sigma$ is the volatility, or average magnitude per square-root time, of the random fluctuations that are modeled as Brownian motions. The expected value or mean is given by

$$E[X_t] = X_0 e^{-\kappa t} + \alpha(1 - e^{-\kappa t}).$$

If $\alpha = 0$, then for small $t$

$$E[X_t] = X_0 e^{-\kappa t} \approx X_0(1 - \kappa t)$$

$$E[X_t] - X_0 = E[dX_t] \approx -\kappa t X_0.$$
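The variance is found from the integral of the stochastic process by

$$\mathrm{var}(X_t) = E[(X_t - E[X_t])^2] = \sigma^2 e^{-2\kappa t}\, E\left[\left(\int_0^t e^{\kappa s}\,dW_s\right)^2\right] = \sigma^2 e^{-2\kappa t}\,\frac{e^{2\kappa t} - e^{0}}{2\kappa}$$

$$\mathrm{var}(X_t) = \frac{\sigma^2}{2\kappa}\left(1 - e^{-2\kappa t}\right) \approx \sigma^2 t.$$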
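A short numerical sketch shows how quickly the small-time approximation $\sigma^2 t$ departs from the exact variance. The parameter values are the least squares estimates quoted in the caption of Figure 5.5; the starting log price and the horizons are assumptions chosen only for illustration.

```matlab
alpha = 3.952; kappa = 0.3007; sigma = 0.0664;  % values quoted in Figure 5.5
X0 = log(90.7);                 % assumed starting log price
t  = [1/252 1/12 1 5];          % assumed horizons in years

meanX     = X0*exp(-kappa*t) + alpha*(1 - exp(-kappa*t));
varX      = sigma^2/(2*kappa)*(1 - exp(-2*kappa*t));
varApprox = sigma^2*t;          % small-t approximation

relErr = (varApprox - varX)./varX;
disp([t; meanX; varX; varApprox; relErr].')
```

The approximation is adequate over a single short tree step but overstates the variance substantially at multi-year horizons, where the exact variance saturates at $\sigma^2/(2\kappa)$.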
5.4. HULL–WHITE CALIBRATION STEP ONE

The first stage of the Hull–White procedure is to construct a tree for a process that is initially zero and follows

$$dX = -aX\,dt + \sigma\,dW.$$

As $X$ is the logarithm of the spot deviation, the process is symmetric around zero and is normally distributed. The trinomial tree has nodes $(j, i)$ that are located in time at $t = i\,\delta t$ with a log-price spacing of $X = j\,\delta x$, with $j = -j_{\max}, \ldots, 0, \ldots, j_{\max}$. Thus, the node value can be represented as $X(j, i) = j\,\delta x$, because $X(j_{mid}, i) = 0$ for all $i$. Hull and White (1994) related the change in the log price to the time step by $\delta x = \sigma\sqrt{3\,\delta t}$ to approximate the normal distribution. The probabilities $(p_u, p_d, p_m)$ of reaching the next up, down, or middle node must sum to unity. The expected value of the logarithmic process is

$$E(dX) = -aX\,\delta t = MX, \qquad a = \kappa,$$

where $-1 < M = -a\,\delta t < 0$, as $a$ and $\delta t$ are assumed to be small (Backus and Zin, 2003). Because the levels of $X$ in the tree are equally spaced and symmetric about zero, the value of any node in the tree can be expressed as $X_j = j \cdot \delta x$. Therefore, the expected value at node $j$ (and any time step $i$) can also be written as

$$E(dX_j) = M(j\,\delta x), \qquad M = -a\,\delta t.$$

The variance of the logarithmic process is defined for convenience as $\mathrm{var}(dX) = V \approx \sigma^2\,\delta t$. At the interior nodes, the expected mean is matched to the up, middle, and down nodes at the next time step by

$$E[dX] = M(j\,\delta x) = (\delta x)p_u + 0 \cdot p_m - (\delta x)p_d = \delta x(p_u - p_d),$$
and similarly the variance is matched by

$$V = \mathrm{var}(dX) = E[(dX)^2] - [E(dX)]^2$$

$$\sigma^2\,\delta t \approx V = (\delta x)^2 p_u + (0)^2 p_m + (-\delta x)^2 p_d - (M(j \cdot \delta x))^2.$$

Consider the $j = 0$ central branch of the tree, where the probabilities of an up or down movement are equal, $p_0 = p^u_{j=0} = p^d_{j=0}$, and $M(j_{mid} \cdot \delta x) = 0$. Thus,

$$V \approx \sigma^2\,\delta t = 2p_0(\delta x)^2$$

$$p_0 = \frac{V}{2(\delta x)^2} = \frac{\sigma^2\,\delta t}{2(\delta x)^2}.$$

Hull and White (1994) select $\delta x = \sigma\sqrt{3\,\delta t}$ to give $p_0 = p^u_{j=0} = p^d_{j=0} = 1/6$ and $p^m_{j=0} = 1 - p^u_{j=0} - p^d_{j=0} = 2/3$. Backus and Zin (2003) deduced that the selection of $\delta x = \sigma\sqrt{3\,\delta t}$ arose to match the kurtosis of the normal distribution

$$m_4 = E[(dX - E[dX])^4] = (\delta x)^4 p^u_{j=0} + (0)^4 p^m_{j=0} + (-\delta x)^4 p^d_{j=0}.$$

The fourth moment of a normal distribution is $m_4 = 3V^2$. Ignoring the mean reversion shows

$$m_4 = (\delta x)^4 p_0 + (\delta x)^4 p_0 = 3V^2.$$

Substituting $p_0 = V/(2(\delta x)^2)$ and $V \approx \sigma^2\,\delta t$ gives $\delta x = \sigma\sqrt{3\,\delta t}$.

For a general interior node, two equations are first solved for the two unknowns $p_u$ and $p_d$. The expected value equation gives the relation $p_u = Mj + p_d$. Substituting into the variance equation gives

$$V = (\delta x)^2 \underbrace{(Mj + p_d)}_{=p_u} + (-\delta x)^2 p_d - (M(j \cdot \delta x))^2$$

$$p_d = \frac{V}{2(\delta x)^2} - \frac{1}{2}Mj + \frac{1}{2}(Mj)^2.$$

It was shown that

$$\frac{V}{2(\delta x)^2} = p_0 = \frac{1}{6},$$

which gives the probabilities for the interior nodes as

$$p_u = \frac{1}{6} + \frac{1}{2}\left[Mj + (Mj)^2\right] = p_0 + \frac{1}{2}\left[Mj + (Mj)^2\right]$$

$$p_m = \frac{2}{3} - (Mj)^2 = (1 - 2p_0) - (Mj)^2$$

$$p_d = \frac{1}{6} + \frac{1}{2}\left[-Mj + (Mj)^2\right] = p_0 + \frac{1}{2}\left[-Mj + (Mj)^2\right].$$
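The interior-node probabilities can be verified against the two matching conditions. The sketch below uses assumed values for $a$, $\sigma$, and $\delta t$ and a handful of interior levels; it confirms that the branches reproduce the mean $M j\,\delta x$ and the variance $V$ at every level.

```matlab
a = 0.3; sigma = 0.07; dt = 0.1;    % assumed parameters
M  = -a*dt;                         % mean reversion factor per step
V  = sigma^2*dt;                    % variance per step
dx = sigma*sqrt(3*dt);              % Hull-White spacing
p0 = V/(2*dx^2);                    % equals 1/6 for this spacing

j  = (-3:3).';                      % a few interior levels
Mj = M*j;
pu = p0 + 0.5*( Mj + Mj.^2);
pm = 1 - 2*p0 - Mj.^2;
pd = p0 + 0.5*(-Mj + Mj.^2);

% Moments implied by the up (+dx), middle (0), and down (-dx) branches
meanTree = dx*(pu - pd);
varTree  = dx^2*(pu + pd) - meanTree.^2;
disp([j pu pm pd])
```

At $j = 0$ the probabilities reduce to 1/6, 2/3, and 1/6, and the asymmetry between $p_u$ and $p_d$ grows with $|j|$ as the pull toward zero strengthens.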
5.5. REVERSION AT EDGE OF TREE

Hull and White (1994) set a fixed upper and lower node limit to mimic the mean reversion process. Branches emanating from a node at the upper limit proceed at the same level, down one level, and down two levels. Conversely, branches emanating from a node at the lower limit proceed at the same level, up one level, and up two levels. To simplify the notation, the same probability notations, $p_u$, $p_m$, and $p_d$, are used for branches originating from an interior node, a top node, or a bottom node. At the top edge, the expected mean is matched to the up, middle, and down nodes at the next time step by

$$E[dX] = M(j \cdot \delta x) = 0 \cdot p_u + (-\delta x)p_m + (-2\,\delta x)p_d,$$

and the variance is matched by

$$\mathrm{var}(dX) = V = (0)^2 p_u + (-\delta x)^2 p_m + (-2\,\delta x)^2 p_d - (M(j \cdot \delta x))^2.$$

Solving gives the probabilities at the top edge as

$$p_u = \frac{7}{6} + \frac{1}{2}\left[3Mj + (Mj)^2\right] = 1 + p_0 + \frac{1}{2}\left[3Mj + (Mj)^2\right]$$

$$p_m = -\frac{1}{3} - 2Mj - (Mj)^2 = -2p_0 - 2Mj - (Mj)^2$$

$$p_d = \frac{1}{6} + \frac{1}{2}\left[Mj + (Mj)^2\right] = p_0 + \frac{1}{2}\left[Mj + (Mj)^2\right].$$

The probabilities at the bottom edge are

$$p_u = \frac{1}{6} + \frac{1}{2}\left[-Mj + (Mj)^2\right] = p_0 + \frac{1}{2}\left[-Mj + (Mj)^2\right]$$

$$p_m = -\frac{1}{3} + 2Mj - (Mj)^2 = -2p_0 + 2Mj - (Mj)^2$$

$$p_d = \frac{7}{6} + \frac{1}{2}\left[-3Mj + (Mj)^2\right] = 1 + p_0 + \frac{1}{2}\left[-3Mj + (Mj)^2\right].$$

The up, middle, and down transition probabilities for the top edge, interior levels, and bottom edge are displayed in Figure 5.4. The probabilities are a function of the $jM$ product. The top and bottom edges of the tree are defined by selecting $j_{\max}M$ and, thus, $j_{\max}$. All branches in the Hull–White tree must have probabilities between zero and one. Examining the probabilities in the interior levels shows that $p_m$ can be negative for a large $jM$ product. The threshold point for positive probabilities in the interior is found by

$$p_m = \frac{2}{3} - (Mj)^2 = 0 \Rightarrow Mj = \pm 0.8165.$$
FIGURE 5.4 Up, middle, and down transition probabilities at the top edge, interior levels, and bottom edge of the Hull–White mean reversion trinomial tree, plotted against $-jM$. The vertical lines define the Hull–White maximum node $\pm j_{\max}M = 1 - \sqrt{1 - 2p_0} = 0.1835$ at the top or bottom edge level. The minimum point prevents negative $p_m$ at the top or bottom edge. A larger $j_{\max}$ risks a negative $p_m$ at an interior node when $|j_{Interior}M| \geq \sqrt{1 - 2p_0} = 0.8165$.
Similarly, at the top or bottom edge, $p_m$ can be negative for a small or large $jM$ product, as shown by the edge threshold

$$p_m = -\frac{1}{3} - 2Mj - (Mj)^2 = 0$$

$$\pm Mj = 1 \pm \sqrt{1 - 2p_0} = 1 \pm \sqrt{1 - 2 \cdot \tfrac{1}{6}} = 0.1835,\ 1.8165.$$
Thus, if an edge node has the product $|Mj| < 0.1835$, then a negative probability will arise in $p_m$. As discussed by Backus and Zin (2003), the selection of the maximum tree size $j_{\max} = 0.1835/(-M)$ by Hull and White (1994) presumably followed this logic. Hull (2005) states that turning the tree as soon as possible at $j_{\max} = 0.1835/(-M)$, and not closer to $j = -0.8165/M$, is computationally more efficient.

In summary, step 1 constructs a trinomial tree for a process that is initially zero and follows $dX = -aX\,dt + \sigma\,dW$. The tree consists of nodes $(j, i)$ that are located in time at $t = i\,\delta t$ with a log-price spacing of $X = j\,\delta x$. The step size of the log-price change is set to $\delta x = \sigma\sqrt{3\,\delta t}$ to approximate the normal distribution. At the top and bottom edges, the branches point back into the tree to simulate mean reversion behavior. The up, middle, and down transition probabilities are defined to match the expectation and variance of the log-price process. The location of the top and bottom edges of the tree is selected to preclude negative probabilities.
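The choice of $j_{\max}$ can be illustrated numerically. In this minimal sketch the mean reversion rate and time step are assumed values, and $j_{\max}$ is taken as the smallest integer above $0.1835/(-M)$; the edge probabilities are evaluated at $\pm j_{\max}$ and, for contrast, the middle probability is evaluated for a tree turned one level too early.

```matlab
a = 0.3; dt = 0.1;                  % assumed mean reversion rate and time step
M  = -a*dt;
p0 = 1/6;
jmax = ceil((1 - sqrt(1 - 2*p0))/(-M));  % smallest integer above 0.1835/(-M)

% Top edge (j = +jmax): same level, down one, down two
Mj   = M*jmax;
pTop = [1 + p0 + 0.5*(3*Mj + Mj^2), -2*p0 - 2*Mj - Mj^2, p0 + 0.5*(Mj + Mj^2)];

% Bottom edge (j = -jmax): up two, up one, same level
Mj   = -M*jmax;
pBot = [p0 + 0.5*(-Mj + Mj^2), -2*p0 + 2*Mj - Mj^2, 1 + p0 + 0.5*(-3*Mj + Mj^2)];

% Largest interior level keeps a positive middle probability
pmInterior = 1 - 2*p0 - (M*(jmax-1))^2;

% Turning the tree one level too early makes the edge pm negative
Mj = M*(jmax-1);
pmTooEarly = -2*p0 - 2*Mj - Mj^2;

fprintf('jmax = %d, edge pm = %.4f, early-turn pm = %.4f\n', ...
    jmax, pTop(2), pmTooEarly);
```

The top and bottom probability vectors are mirror images of each other, as expected from the symmetry of the step 1 tree about zero.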
5.6. HULL–WHITE CALIBRATION STEP TWO

Step 2 constructs a tree for the mean-reverting model

$$d(\ln S) = [\theta(t) - a \ln S]\,dt + \sigma\,dW$$

by adding the $\theta(t)$ parameter to the step 1 tree based on $dX = -aX\,dt + \sigma\,dW$. The process for $\ln S$ has the same framework, probabilities, etc., except that the values at each time level are shifted to match the market futures price at the corresponding time. The futures price is equal to the expected value of the commodity price in a risk-neutral world, $F_t = E^Q[S(t)]$. After one time step, $E^Q[S(t) = e^{X(t)}]$ equals the market futures price $F_1$ as given by

$$p_u e^{\delta x + \Theta_1} + p_m e^{0 + \Theta_1} + p_d e^{-\delta x + \Theta_1} = F_{1\cdot\delta t},$$

where $X_0$ is not displayed because it is equal to zero by the definition of the step 1 calibration (Hull, 2005). In this equation, the time-varying correction term $\Theta(t = i\,\delta t)$ can be solved for directly. Upper case theta is used to emphasize that $\Theta_t$ is the discrete term used in the tree, in contrast to the analytical $\theta(t)$. To generalize the process, each time layer has a set of nodes that must be displaced by $\Theta_i$ such that

$$\sum_j Q_{i,j} S_{i,j} = F_{0,i\cdot\delta t}\,P_{0,i\cdot\delta t}$$

$$\sum_j Q_{i,j}\,e^{X_{i,j} + \Theta_i} = F_{0,i\cdot\delta t}\,P_{0,i\cdot\delta t},$$

where $P_{0,i\cdot\delta t}$ is the time zero value of a pure discount bond maturing at time $t = i\,\delta t$. The state price $Q_{i,j}$ is the value at time zero of a security that pays one unit of cash if node $(i, j)$ is reached (Clewlow and Strickland, 1999),

$$Q_{i+1,j} = \sum_{j'} Q_{i,j'}\,p_{j',j}\,P_{i\delta t,(i+1)\delta t},$$

where $p_{j',j}$ is the probability of moving from node $(i, j')$ to node $(i+1, j)$ and $P_{i\delta t,(i+1)\delta t}$ is the time $t = i\,\delta t$ value of a pure discount bond maturing at time $t = (i + 1)\delta t$. Solving directly for the offset at each time layer gives

$$\Theta_i = \ln\left[\frac{F_{0,i\cdot\delta t}\,P_{0,i\cdot\delta t}}{\sum_j Q_{i,j}\,e^{X_{i,j}}}\right].$$

If the interest rate is assumed to be constant, the simplified equation for the probability of reaching a particular node (without discounting) is

$$Q_{i+1,j} = \sum_{j'} Q_{i,j'}\,p_{j',j},$$

and

$$\Theta_i = \ln\left[\frac{F_{0,i\cdot\delta t}}{\sum_j Q_{i,j}\,e^{X_{i,j}}}\right].$$
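The two calibration steps can be combined in a compact sketch for the constant interest rate case. This is not the book's HullWhiteTrinomial function: the model parameters, the spot price, and the futures curve F below are hypothetical values chosen for illustration, and Q holds the undiscounted probabilities of reaching each node.

```matlab
a = 0.3; sigma = 0.07; dt = 0.25; N = 8;   % assumed model inputs
S0 = 90;                                   % hypothetical spot price
F  = 90 + 5*sin((1:N)*dt);                 % hypothetical futures curve F(0,i*dt)

M    = -a*dt;
dx   = sigma*sqrt(3*dt);
p0   = 1/6;
jmax = ceil((1 - sqrt(1 - 2*p0))/(-M));
j    = (-jmax:jmax).';                     % node levels; row k = j + jmax + 1
nJ   = numel(j);

Q = zeros(nJ, N+1);                        % undiscounted state prices
Q(jmax+1, 1) = 1;                          % start at the central node
Theta = zeros(1, N+1);
Theta(1) = log(S0);

for i = 1:N
    for k = 1:nJ
        if Q(k,i) == 0, continue; end      % node not reachable yet
        Mj = M*j(k);
        if j(k) == jmax                    % top edge: same, down one, down two
            p   = [1 + p0 + 0.5*(3*Mj + Mj^2), -2*p0 - 2*Mj - Mj^2, ...
                   p0 + 0.5*(Mj + Mj^2)];
            tgt = [k, k-1, k-2];
        elseif j(k) == -jmax               % bottom edge: up two, up one, same
            p   = [p0 + 0.5*(-Mj + Mj^2), -2*p0 + 2*Mj - Mj^2, ...
                   1 + p0 + 0.5*(-3*Mj + Mj^2)];
            tgt = [k+2, k+1, k];
        else                               % interior: up, middle, down
            p   = [p0 + 0.5*(Mj + Mj^2), 1 - 2*p0 - Mj^2, ...
                   p0 + 0.5*(-Mj + Mj^2)];
            tgt = [k+1, k, k-1];
        end
        Q(tgt, i+1) = Q(tgt, i+1) + Q(k,i)*p.';
    end
    % Shift the time layer so that the expected spot equals the futures price
    Theta(i+1) = log(F(i) / sum(Q(:,i+1).*exp(j*dx)));
end

S = exp(bsxfun(@plus, j*dx, Theta));       % spot price at node (j,i)
check = sum(Q(:,2:end).*S(:,2:end), 1);    % tree-implied futures prices
disp([F; check])
```

Because each layer is shifted independently, the tree reproduces the input curve to machine precision whatever its shape; only the node probabilities depend on the mean reversion rate.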
A volatility term is required in these equations. As discussed earlier, implied volatility (via a tree, etc.) from option prices is a more precise representation of the market's opinion on future movements of the asset. To simplify the code, the function Trinomial calls a function LeastSquaresOU to extract volatility, drift, and mean reversion from the historic WTI front futures contract over approximately the past 30 years, as seen in Figure 5.5. Subsequently, the function Trinomial calls the function HullWhiteTrinomial to produce the WTI futures curve as of January 11, 2011, as displayed in Figure 5.6.

FIGURE 5.5 WTI front futures contract. A least squares fit of an Ornstein–Uhlenbeck process to the log future price set found the parameter set: α = 3.952, κ = 0.3007, and σ = 0.0664.

FIGURE 5.6 WTI futures contract curve used to fit the Hull–White mean reversion trinomial tree.

The humped nature of the curve is rather challenging to fit, as the contract price initially increases with contract maturity, peaks, and then slowly decreases for long maturity contracts. The fit by the Hull–White procedure of the WTI futures contract curve to the trinomial tree is displayed in Figure 5.7. The general shape of the futures curve is replicated in the spot trinomial tree. The spot prices and transition probabilities can now be applied to value vanilla as well as exotic options.

FIGURE 5.7 Trinomial tree depicting the evolution of a mean reversion process conditional on the WTI futures curve. The time-varying correction term was calibrated directly from the futures price curve and thus an analytical expression for the θ(t) parameter was not needed to construct the trinomial tree or value options.

For example, the function HullWhiteTrinomial calculates the Arrow–Debreu state price, $Q$, at each node in the tree. The European spot call value can be directly calculated by the summation over all nodes in the expiration time step

$$c = \sum_j Q_j \max(0, S_j - K).$$

Equivalently, a European option can be calculated using the transition probabilities. The spot call option value is calculated at expiration as

$$f_j = \max(0, S_j - K),$$
and this value is propagated back to the inception of the tree. The option value is discounted at the risk-free rate to obtain the present value of the option. The value of a European spot option is stepped backward in time by the equation

$$f_{j,i-1} = e^{-r\,\delta t}\left(p_u f_u + p_m f_m + p_d f_d\right).$$

The value of an American call option on the spot price adds a possible premium for early exercise at earlier time steps as

$$f_{j,i-1} = \max\left(S_{j,i-1} - K,\ e^{-r\,\delta t}\left(p_u f_u + p_m f_m + p_d f_d\right)\right).$$

The spot option prices calculated with a strike price at the money by HullWhiteTrinomial using a 50 time step tree are

EurCall = 4.5214; EurPut = 1.1813; AmerCall = 6.4360; AmerPut = 1.5679.
The American option has a premium for the possibility of early exercise.

5.7. SPOT PRICE STOCHASTIC DIFFERENTIAL EQUATION

In this section, an expression for the spot price stochastic differential equation is derived. Step 2 in the Hull–White calibration constructed a tree for the mean-reverting model

$$d(\ln S) = [\theta(t) - a \ln S]\,dt + \sigma\,dW$$

by adding $\Theta(t = i\,\delta t)$, the discrete version of the $\theta(t)$ parameter, to each time step in the step 1 tree constructed from $dX = -aX\,dt + \sigma\,dW$. The Hull–White procedure to match the asset prices in the tree to the market futures curve approximates the parameter $\theta(t)$ with the discrete correction term $\Theta(t = i\,\delta t)$. Because the spot price at time $t$ equals the value of a forward contract as $t$ approaches the expiration time $T$ (Clewlow and Strickland, 1999),

$$\ln S(t) = \ln F(t, t) = \ln F(0, t) - \frac{1}{2}\int_0^t \sigma^2 e^{-2\kappa(t-u)}\,du + \int_0^t \sigma e^{-\kappa(t-u)}\,dW(u).$$

Differentiating gives

$$\frac{dS(t)}{S(t)} = \left[\frac{\partial \ln F(0, t)}{\partial t} + \kappa\int_0^t \sigma^2 e^{-2\kappa(t-u)}\,du - \kappa\int_0^t \sigma e^{-\kappa(t-u)}\,dW(u)\right]dt + \sigma\,dW.$$
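Rearranging the spot price equation gives

$$\int_0^t \sigma e^{-\kappa(t-u)}\,dW(u) = \ln\frac{S(t)}{F(0, t)} + \frac{1}{2}\int_0^t \sigma^2 e^{-2\kappa(t-u)}\,du.$$

Substituting gives

$$\frac{dS(t)}{S(t)} = \left[\frac{\partial \ln F(0, t)}{\partial t} + \kappa\int_0^t \sigma^2 e^{-2\kappa(t-u)}\,du - \kappa\left(\ln\frac{S(t)}{F(0, t)} + \frac{1}{2}\int_0^t \sigma^2 e^{-2\kappa(t-u)}\,du\right)\right]dt + \sigma\,dW$$

$$\frac{dS(t)}{S(t)} = \left[\frac{\partial \ln F(0, t)}{\partial t} + \frac{1}{2}\kappa\sigma^2\underbrace{\int_0^t e^{-2\kappa(t-u)}\,du}_{=\frac{1}{2\kappa}(1 - e^{-2\kappa t})} - \kappa\ln\frac{S(t)}{F(0, t)}\right]dt + \sigma\,dW.$$

The final expression for the spot price stochastic differential equation is

$$\frac{dS(t)}{S(t)} = \left[\frac{\partial \ln F(0, t)}{\partial t} + \frac{\sigma^2}{4}\left(1 - e^{-2\kappa t}\right) - \kappa\left[\ln S(t) - \ln F(0, t)\right]\right]dt + \sigma\,dW.$$

Thus, in the Schwartz (1997) model

$$\frac{dS}{S} = \kappa[\mu - \ln S]\,dt + \sigma\,dW,$$

the long-term risk adjusted drift is given by

$$\mu = \frac{1}{\kappa}\frac{\partial \ln F(0, t)}{\partial t} + \frac{1}{\kappa}\frac{\sigma^2}{4}\left(1 - e^{-2\kappa t}\right) + \ln F(0, t).$$

Using the logarithm of the spot price, $X_t = \ln S_t$, in the Hull–White tree model

$$dX_t = [\theta(t) - aX_t]\,dt + \sigma\,dW,$$

Ito's lemma adds an additional $-\frac{1}{2}\sigma^2$ term to give

$$\theta(t) = \frac{\partial \ln F(0, t)}{\partial t} + \frac{\sigma^2}{4}\left(1 - e^{-2\kappa t}\right) - \frac{1}{2}\sigma^2 + \kappa \ln F(0, t).$$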
5.8. TREE-BASED FUTURES OPTIONS UNDER MEAN REVERSION

In this section, the expression for the futures price as a function of current spot price and time to expiration is applied to the Hull–White trinomial tree to allow valuation of European and American options on futures contracts. The Ornstein–Uhlenbeck process is restated under the equivalent martingale measure as

$$dX_t = \kappa(\alpha^* - X_t)\,dt + \sigma\,dW^*,$$

where $\alpha^* = \alpha - \lambda$ is adjusted by the market price of risk $\lambda$ (Schwartz, 1997). The logarithm of the spot price under the equivalent martingale measure is distributed with a mean of

$$E[X_t] = X_0 e^{-\kappa t} + \alpha^*(1 - e^{-\kappa t}),$$

and variance of

$$\mathrm{var}(X_t) = \frac{\sigma^2}{2\kappa}\left(1 - e^{-2\kappa t}\right).$$

The forward price is found from

$$F_{0,T} = E[S_T] = \exp\left(E_0[X_T] + \frac{1}{2}\mathrm{var}(X_T)\right)$$

$$F_{0,T} = E[S_T] = \exp\left(\ln S_0\,e^{-\kappa T} + \alpha^*\left(1 - e^{-\kappa T}\right) + \frac{\sigma^2}{4\kappa}\left(1 - e^{-2\kappa T}\right)\right).$$

The initial investment into a forward contract is zero, thus the expected change must be zero in a risk-neutral world. Clewlow and Strickland (1999) reason that the volatility of the forward prices must have a negative exponential form to obtain a Markovian spot price process. Thus, the stochastic evolution of the energy forward curve is given by

$$\frac{dF(t, T)}{F(t, T)} = \sigma e^{-\kappa(T-t)}\,dW.$$

Jaillet (2003) shows that this equation can be reached by the derivative of

$$\ln F_{t,T} = X_t\,e^{-\kappa(T-t)} + \alpha^*\left(1 - e^{-\kappa(T-t)}\right) + \frac{\sigma^2}{4\kappa}\left(1 - e^{-2\kappa(T-t)}\right), \qquad X_t = \ln S_t,$$

to give

$$d\ln F_{t,T} = \kappa e^{-\kappa(T-t)}X_t\,dt - \alpha^*\kappa e^{-\kappa(T-t)}\,dt - \frac{\sigma^2}{2}e^{-2\kappa(T-t)}\,dt + e^{-\kappa(T-t)}\,dX_t.$$

Substituting $dX_t = \kappa(\alpha^* - X_t)\,dt + \sigma\,dW^*$ gives

$$d\ln F_{t,T} = \kappa e^{-\kappa(T-t)}X_t\,dt - \alpha^*\kappa e^{-\kappa(T-t)}\,dt - \frac{\sigma^2}{2}e^{-2\kappa(T-t)}\,dt + e^{-\kappa(T-t)}\left(\kappa(\alpha^* - X_t)\,dt + \sigma\,dW^*\right)$$

$$d\ln F_{t,T} = \underbrace{e^{-\kappa(T-t)}\sigma\,dW^*}_{=\,dF/F}\ \underbrace{-\ \frac{\sigma^2}{2}e^{-2\kappa(T-t)}\,dt}_{=\,-\frac{1}{2}(dF/F)^2}.$$

The integration can be expressed as

$$\ln F(t, T) - \ln F(0, T) = -\frac{1}{2}\int_0^t \sigma^2 e^{-2\kappa(T-u)}\,du + \int_0^t \sigma e^{-\kappa(T-u)}\,dW(u),$$

or

$$F(t, T) = F(0, T)\exp\left(-\frac{1}{2}\int_0^t \sigma^2 e^{-2\kappa(T-u)}\,du + \int_0^t \sigma e^{-\kappa(T-u)}\,dW(u)\right).$$

As the spot price at time $t$ will equal the value of a forward contract as $t$ approaches the expiration time $T$ (Clewlow and Strickland, 1999), setting $T = t$ gives

$$\ln S(t) = \ln F(t, t) = \ln F(0, t) - \frac{1}{2}\int_0^t \sigma^2 e^{-2\kappa(t-u)}\,du + \int_0^t \sigma e^{-\kappa(t-u)}\,dW(u).$$

Rewriting gives

$$\int_0^t \sigma e^{-\kappa(t-u)}\,dW(u) = \ln\frac{S(t)}{F(0, t)} + \frac{1}{2}\int_0^t \sigma^2 e^{-2\kappa(t-u)}\,du.$$

Evaluating the deterministic integral gives the expression relating the stochastic movements of the spot price to the current time $t$ forward price,

$$\int_0^t \sigma e^{-\kappa t}e^{\kappa u}\,dW(u) = \ln\frac{S(t)}{F(0, t)} + \frac{\sigma^2}{4\kappa}\left[1 - e^{-2\kappa t}\right].$$
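Returning to the forward price equation at time $t$ for a contract at a future time $T$,

$$\ln F(t,T) - \ln F(0,T) = -\frac{1}{2}\int_0^t \sigma^2 e^{-2\kappa(T-u)}\,du + \int_0^t \sigma e^{-\kappa(T-u)}\,dW(u) = -\frac{\sigma^2}{4\kappa}e^{-2\kappa T}\left[e^{2\kappa t} - 1\right] + \int_0^t \sigma e^{-\kappa T}e^{\kappa u}\,dW(u),$$

and multiplying the last term by $e^{-\kappa t}/e^{-\kappa t}$ gives

$$\ln F(t,T) - \ln F(0,T) = -\frac{\sigma^2}{4\kappa}e^{-2\kappa T}\left[e^{2\kappa t} - 1\right] + \frac{e^{-\kappa T}}{e^{-\kappa t}}\int_0^t \sigma e^{-\kappa(t-u)}\,dW(u).$$

The integral in the last term is the just-derived relation for the spot price relative to the current forward price. Thus,

$$F(t,T) = F(0,T)\left(\frac{S(t)}{F(0,t)}\right)^{e^{-\kappa(T-t)}}\exp\left[-\frac{\sigma^2}{4\kappa}e^{-\kappa T}\left(e^{2\kappa t} - 1\right)\left(e^{-\kappa T} - e^{-\kappa t}\right)\right],$$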
which shows that the forward price curve at a future time T can be determined from the initial forward curve up to time t and the asset price at time t (Clewlow and Strickland, 1999).

The futures option prices are calculated by the function HullWhiteTrinomial. The present value at time zero of an option that expires at 2.6 years is given below. The futures options are based on a futures contract with maturity at 3.5 years.

EurCall = 4.5214; EurPut = 1.1813; EurCallFuture = 3.7433; EurPutFuture = 0.7827;
AmerCall = 6.4360; AmerPut = 1.5679; AmerCallFuture = 4.7400; AmerPutFuture = 1.1334.

At each node in the spot tree, a corresponding value is calculated for the futures value at 3.5 years. The futures maturity is outside the time range of the tree, but this is not important because the futures value is calculated by an analytical equation. In contrast, the option expiration date must be within the time range of the tree.
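The mapping from a spot price on a tree node to the corresponding futures price can be sketched as follows. This is a minimal illustration of the closed-form relation of Clewlow and Strickland (1999), not the HullWhiteTrinomial code itself; the helper name `fwdFromSpot` and all numerical inputs are assumptions.

```matlab
% Hypothetical helper: F(t,T) from the spot price S(t) and the initial
% forward curve values F(0,T) and F(0,t).
fwdFromSpot = @(St,F0T,F0t,t,T,kappa,sigma) ...
    F0T .* (St./F0t).^exp(-kappa*(T-t)) ...
    .* exp(-sigma^2/(4*kappa) .* exp(-kappa*T) ...
    .* (exp(2*kappa*t) - 1) .* (exp(-kappa*T) - exp(-kappa*t)));

% Illustrative (assumed) inputs
kappa = 1.2; sigma = 0.4;
t = 2.6;  T = 3.5;        % option expiration and futures maturity
F0t = 52; F0T = 51;       % initial forward curve at t and T
St  = [40 50 60];         % spot prices on three tree nodes at time t

FtT = fwdFromSpot(St,F0T,F0t,t,T,kappa,sigma);

% Consistency checks: at t = T the futures price equals the spot price,
% and at t = 0 (where S(0) = F(0,0)) the initial curve is recovered.
FatMaturity = fwdFromSpot(55,F0T,F0T,T,T,kappa,sigma);
FatZero     = fwdFromSpot(50,F0T,50,0,T,kappa,sigma);
disp([St' FtT'])
```

Because the exponent $e^{-\kappa(T-t)}$ is less than one, the futures price responds less than proportionally to a move in the spot price, which is why the futures options above are cheaper than the matching spot options.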
5.9. ANALYTICAL EUROPEAN FUTURES OPTION UNDER MEAN REVERSION

In this section, semianalytical expressions for a European option on the spot and a European option on the futures contract are derived. The derivation in the previous section of an analytical expression for the futures price evolution is extended to the valuation of European spot and futures options using Black's framework. The spot price, or equivalently the price of the futures contract maturing at time $t$, is

$$\ln S(t) = \ln F(t,t) = \ln F(0,t) - \frac{1}{2}\int_0^t \sigma^2 e^{-2\kappa(t-u)}\,du + \int_0^t \sigma e^{-\kappa(t-u)}\,dW(u).$$

Thus, the logarithm of the spot price is normally distributed,

$$\ln S(T) \sim N\left(\ln F(0,T) - \frac{\sigma^2}{4\kappa}\left(1 - e^{-2\kappa T}\right),\ \frac{\sigma^2}{2\kappa}\left(1 - e^{-2\kappa T}\right)\right).$$
The normal distribution allows the European spot option to be evaluated in the Black-Scholes framework. A European option with time $T$ expiration on the spot price of the commodity is

$$c = E^Q\left([F(T,T) - K]^+\right)$$

$$c(0,S(0);T,K) = B(0,T)\left[F(0,T)N(d_1) - KN(d_2)\right]$$

$$p(0,S(0);T,K) = B(0,T)\left[KN(-d_2) - F(0,T)N(-d_1)\right]$$

$$d_1 = \frac{\ln\left(F(0,T)/K\right)}{s} + \frac{1}{2}s, \qquad d_2 = d_1 - s$$

$$s^2 = \mathrm{var}^Q[\ln F(T,T)] = \frac{\sigma^2}{2\kappa}\left[1 - e^{-2\kappa T}\right].$$

This is equivalent to a European call option with expiration $T$ on a futures contract with maturity $T$. Usually, the option contract will expire at a time $t_{option}$ that precedes the futures contract maturity $T_{maturity}$ (Jaillet et al., 2004). For this case ($t_{inception} < t_{option} < T_{maturity}$), the modification to Black's model gives the time $t_{inception} = 0$ value of an option with time $t_{option}$ expiration on a futures contract with maturity $T = T_{maturity}$ as

$$c = E^Q\left([F(t_{option},T) - K]^+\right)$$

$$c(0,S(0);T,K) = e^{-rt_{option}}\left[F(0,T)N(d_1) - KN(d_2)\right]$$

$$p(0,S(0);T,K) = e^{-rt_{option}}\left[KN(-d_2) - F(0,T)N(-d_1)\right]$$

$$d_1 = \frac{\ln\left(F(0,T)/K\right)}{s(t_{option},T)} + \frac{1}{2}s(t_{option},T), \qquad d_2 = d_1 - s(t_{option},T)$$

$$s^2(t_{option},T) = \mathrm{var}^Q[\ln F(t_{option},T)] = \frac{\sigma^2}{2\kappa}e^{-2\kappa(T-t_{option})}\left[1 - e^{-2\kappa t_{option}}\right].$$
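The modified Black formulas above can be sketched in a few lines. This is a minimal illustration under assumed inputs and is not the BlackOptionMeanRev function itself; the names `Ncdf`, `tOpt`, and `Tmat` are hypothetical, and the normal distribution function is built from `erfc` so that no toolbox is needed.

```matlab
% European call and put on a futures contract under mean reversion.
% All inputs are assumed, illustrative values.
F0T   = 50;     % current futures price F(0,T)
K     = 50;     % strike
r     = 0.05;   % constant risk-free rate
sigma = 0.4;    % spot volatility
kappa = 1.2;    % mean-reversion speed
tOpt  = 2.6;    % option expiration
Tmat  = 3.5;    % futures maturity (tOpt <= Tmat)

Ncdf = @(x) 0.5*erfc(-x/sqrt(2));   % standard normal CDF

% Integrated volatility of ln F(tOpt,Tmat)
s  = sqrt(sigma^2/(2*kappa) * exp(-2*kappa*(Tmat-tOpt)) ...
          * (1 - exp(-2*kappa*tOpt)));
d1 = log(F0T/K)/s + 0.5*s;
d2 = d1 - s;

callVal = exp(-r*tOpt) * (F0T*Ncdf(d1) - K*Ncdf(d2));
putVal  = exp(-r*tOpt) * (K*Ncdf(-d2) - F0T*Ncdf(-d1));
fprintf('Call = %f; Put = %f\n', callVal, putVal);
```

Setting `tOpt` equal to `Tmat` removes the factor $e^{-2\kappa(T-t_{option})}$ and recovers the spot option case, so one routine can serve both calculations.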
This procedure is equivalent to the spot model given above when $t_{option} = T_{maturity}$; thus, the same calculation for a spot option ($t_{option} = T_{maturity}$) or a futures option ($t_{option} < T_{maturity}$) can be made with the function BlackOptionMeanRev. The function Trinomial calls BlackOptionMeanRev to generate the output

BlackCall = 4.4915; BlackPut = 1.1645; BlackCallFuture = 3.7519; BlackPutFuture = 0.7576,

which is similar to the trinomial tree calculation from the previous section, given again for reference:

EurCall = 4.5214; EurPut = 1.1813; EurCallFuture = 3.7433; EurPutFuture = 0.7827.

The entry function trinomial(1) contains the steps parameter, which was set to 50 for the trinomial tree calculations discussed previously. Increasing the steps parameter decreases the asset and time step size, which improves the European option valuation accuracy relative to Black's model.

SUMMARY

The tree representation is a robust technique to price derivatives that can be evaluated moving backward in time. This feature is particularly useful for the evaluation of American options that do not have an analytical solution. As a comparison, a Monte Carlo simulation is not well suited to track the historical value of a derivative. Furthermore, a mean-reverting trinomial tree can be constructed to be consistent with the futures price curve.

REFERENCES

Aydin, N.S. (2010) Pricing Power Derivatives: Electricity Swing Options, Thesis, University of Ulm, Department of Mathematics and Economics.
Backus, D., Zin, S. (2003) Technical Note on Hull and White, New York University, School of Business.
Brandimarte, P. (2006) Numerical Methods in Finance: a MATLAB-Based Introduction, Wiley.
Clewlow, L., Strickland, C. (1999) Valuing Energy Options in a One Factor Model Fitted to Forward Prices, White Paper.
Hull, J. (2005) Options, Futures and Other Derivatives, Prentice Hall.
Hull, J., White, A. (1994) Numerical Procedures for Implementing Term Structure Models I: Single Factor Models, Journal of Derivatives, Fall.
Jaillet, P., Ronn, E., Tompaidis, S. (2004) Valuation of Commodity-Based Swing Options, Management Science 50, 909.
Schwartz, E. (1997) The Stochastic Behavior of Commodity Prices: Implications for Valuation and Hedging, Journal of Finance 52, 923.
APPENDIX

Code: TreeRace
function TreeRace()
%TreeRace compares Trinomial v. Binomial Tree calcs.
S0=50;
K=50;
r=0.1;
sigma=0.5;
T=1;
d=0; %dividend yield
CallFlag=0;
EuropeanFlag=0;
trials=16;
TriValues=zeros(1,trials);
TriTime=zeros(1,trials);
for n=1:trials;
    Ntrial(n)=(n+2)^2;
    tic
    TriValues(n)=VectorTrinomial(S0,K,r,sigma,T,Ntrial(n),d,...
        CallFlag,EuropeanFlag);
    TriTime(n)=toc;
end
for n=1:trials;
    tic
    BiValues(n)=VectorBinomial(S0,K,r,sigma,T,Ntrial(n),d,...
        CallFlag,EuropeanFlag);
    BiTime(n)=toc;
end
figure
subplot(3,1,1)
semilogy(Ntrial,abs(TriValues-BiValues))
title('Binomial vs. Trinomial Tree Comparison');
ylabel('log(\DeltaV)');
axis tight
subplot(3,1,2)
plot(Ntrial,TriValues,Ntrial,BiValues)
ylabel('Value');
if ((CallFlag==1) && (EuropeanFlag==1))
    title('European Call');
elseif ((CallFlag==1) && (EuropeanFlag==0))
    title('American Call');
elseif ((CallFlag==0) && (EuropeanFlag==0))
    title('American Put');
else
    title('European Put');
end
axis tight
subplot(3,1,3)
plot(Ntrial,TriTime,Ntrial,BiTime)
legend('Trinomial','Binomial','location','NorthWest');
xlabel('Steps');
ylabel('Calculation Time');
end
Code: VectorBinomial
function Val=VectorBinomial(S0,K,r,sigma,T,N,d,...
    CallFlag,EuropeanFlag)
%VectorBinomial computes option price via binomial tree
%for Call (CallFlag=1) or Put (CallFlag=0)
%for European (EuropeanFlag=1) or American (EuropeanFlag=0)
%Vector of asset (S) and option (V) prices greatly reduces
%CPU and Memory Constraints compared to Array for large
%number of steps (N)
%An N step binomial tree has 2N+1 nodes with only
%N distinct asset prices possible
if (nargin == 0), %Check for Data Input
    S0=50;
    K=50;
    r=0.1;
    sigma=0.5;
    T=1;
    N=100;
    d=0;
    CallFlag=1;
    EuropeanFlag=0;
end
dt=T/N;
a=exp((r-d)*dt); %d is still the dividend yield here
u=exp(sigma*sqrt(dt)); %standard dev = annual vol. x sqrt(dt)
d=exp(-sigma*sqrt(dt)); %d is reused from here on as the down factor
p=(a-d)/(u-d);
S=zeros(2*N+1,1);
V=zeros(2*N+1,1);
DiscUpProb=exp(-r*dt)*p;
DiscDownProb=exp(-r*dt)*(1-p);
%Initial Asset price and center trunk of tree
S(N+1)=S0;
downpow=[N:-1:1]; %Vectorize down movements
uppow=[1:N]; %Vectorize up movements
%S(1) is lowest price S0*d*d*d*d...
S(1:N)=d.^downpow*S0;
%S(2N+1) is highest price S0*u*u*u...
S(N+2:2*N+1)=u.^uppow*S0;
%Node 1,3,...2N+1 are expiration nodes
%Value at expiration is excess over Strike Price K
if (CallFlag==1)
    V(1:2:2*N+1)=max(0,S(1:2:2*N+1)-K);
else
    V(1:2:2*N+1)=max(0,K-S(1:2:2*N+1));
end
if (EuropeanFlag==1)
    %Propagate European Call or Put
    for Step=1:N
        V(Step+1:2:(2*N+1-Step))...
            =DiscUpProb*V(Step+2:2:(2*N+2-Step))...
            +DiscDownProb*V(Step:2:(2*N-Step));
    end
elseif (CallFlag==1)
    %American Call
    for Step=1:N
        V(Step+1:2:(2*N+1-Step))...
            =max(S(Step+1:2:(2*N+1-Step))-K,...
            (DiscUpProb*V(Step+2:2:(2*N+2-Step))...
            +DiscDownProb*V(Step:2:(2*N-Step))));
    end
else
    %American Put
    for Step=1:N
        V(Step+1:2:(2*N+1-Step))...
            =max(K-S(Step+1:2:(2*N+1-Step)),...
            (DiscUpProb*V(Step+2:2:(2*N+2-Step))...
            +DiscDownProb*V(Step:2:(2*N-Step))));
    end
end
Val=V(N+1);
end
Code: VectorTrinomial
function V=VectorTrinomial(S0,K,r,sigma,T,N,d,...
    CallFlag,EuropeanFlag)
%VectorTrinomial computes option price via trinomial tree
%for Call (CallFlag=1) or Put (CallFlag=0)
%for European (EuropeanFlag=1) or American (EuropeanFlag=0)
%Vector of asset (S) and option (V) prices greatly reduces
%CPU and Memory Constraints compared to Array for large
%number of steps (N)
%Each new time step in the tree is identical to the
%previous time except one higher and one lower node is added
%N step trinomial tree has 2N+1 nodes at expiration
clc
close all
if (nargin == 0), %Check for Data Input
    S0=50;
    K=50;
    r=0.1;
    sigma=0.5;
    T=1;
    N=100; %assumed default (matches VectorBinomial); N is needed below
    d=0; %dividend yield
    CallFlag=0;
    EuropeanFlag=1;
end
X0=log(S0);
dt=T/N;
dx=sigma*sqrt(3*dt);
nu=r-0.5*sigma^2-d;
X=zeros(2*N+1,1); %Asset
V=zeros(2*N+1,1); %Option
j=-N:1:N;
X(j+N+1)=X0-j*dx; %X(1) is the highest log price
S=exp(X);
if (CallFlag==1)
    V=max(S-K,0);
else
    V=max(K-S,0);
end
%Only use probabilities when discounting option value to
%give present value at previous time step
DiscPu=exp(-r*dt)*0.5*( (nu^2*dt^2+sigma^2*dt)/dx^2 + nu*dt/dx);
DiscPd=exp(-r*dt)*0.5*( (nu^2*dt^2+sigma^2*dt)/dx^2 - nu*dt/dx);
DiscPm=exp(-r*dt)*(1-( (nu^2*dt^2+sigma^2*dt)/dx^2));
if (EuropeanFlag==1)
    for edge=0:N-1
        up=(1:1: 2*N+1 -2*edge -2);
        %Vup=pu*V(up);
        %mid=(up+1); Vmid=pm*V(mid);
        %down=(up+2); Vdown=pd*V(down);
        %Vprev=Vup+Vmid+Vdown
        Vprev=DiscPu*V(up)+DiscPm*V(up+1)+DiscPd*V(up+2);
        V=Vprev;
    end
elseif (CallFlag==1)
    %American Call
    for edge=0:N-1
        up=(1:1: 2*N+1 -2*edge -2);
        current=(edge+2 :1: 2*N+1-1-edge);
        Vprev=DiscPu*V(up)+DiscPm*V(up+1)+DiscPd*V(up+2);
        V=max(S(current)-K,Vprev);
    end
else
    %American Put
    for edge=0:N-1
        up=(1:1: 2*N+1 -2*edge -2);
        current=(edge+2 :1: 2*N+1-1-edge);
        Vprev=DiscPu*V(up)+DiscPm*V(up+1)+DiscPd*V(up+2);
        V=max(K-S(current),Vprev);
    end
end
end
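The branching probabilities used in VectorTrinomial can be checked on their own. The short script below is a sketch that recomputes the undiscounted probabilities with the defaults of the listing (the step count `N = 50` is an assumed value) and confirms that they sum to one and reproduce the drift and variance of the log price over one time step.

```matlab
% Undiscounted trinomial probabilities for the log-price process.
r = 0.1; d = 0; sigma = 0.5; T = 1;
N = 50;                          % assumed number of time steps
dt = T/N;
dx = sigma*sqrt(3*dt);           % log-price spacing used in the listing
nu = r - 0.5*sigma^2 - d;        % drift of the log price

common = (nu^2*dt^2 + sigma^2*dt)/dx^2;
pu = 0.5*(common + nu*dt/dx);    % up move of +dx
pd = 0.5*(common - nu*dt/dx);    % down move of -dx
pm = 1 - common;                 % no move

stepMean = (pu - pd)*dx;               % should equal nu*dt
stepVar  = (pu + pd)*dx^2 - stepMean^2; % should equal sigma^2*dt
fprintf('pu = %f, pm = %f, pd = %f\n', pu, pm, pd);
```

With `dx = sigma*sqrt(3*dt)` the middle probability stays close to 2/3, which keeps all three probabilities positive for small time steps.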
Code: ProbPlot function ProbPlot () %ProbPlot examines the probabilities in Hull-White %Mean Reverting Tree. Hull-White select the maximum tree %size and thus the location of the edge to avoid
%a negative pm in the interior at abs(jM)>0.815 %But a negative probability at the top or bottom %edge can exist if abs(jM)0) puTop(j)=7/6+(jrealˆ2*Mˆ2+3*jreal*M)/2; pmTop(j)=-1/3-jrealˆ2*Mˆ2-2*jreal*M; pdTop(j)=1/6+(jrealˆ2*Mˆ2+jreal*M)/2; else puTop(j)=NaN; pmTop(j)=NaN;pdTop(j)=NaN; end %bottom if (jreal1) X(j-1,i+1)=X(j,i)+deltaX; end X(j,i+1)=X(j,i); if (j>jtotal) X(j+1,i+1)=X(j,i)-1*deltaX; end jreal=jmid-j; if (j==1) %top pu(j,i)=7/6+(jrealˆ2*Mˆ2+3*jreal*M)/2; pm(j,i)=-1/3-jrealˆ2*Mˆ2-2*jreal*M;
pd(j,i)=1/6+(jrealˆ2*Mˆ2+jreal*M)/2; k=1; kmatrix(j,i)=k; %Simplify Amer Option elseif (j==jtotal) %bottom pu(j,i)=1/6+(jrealˆ2*Mˆ2-jreal*M)/2; pm(j,i)=-1/3-jrealˆ2*Mˆ2+2*jreal*M; pd(j,i)=7/6+(jrealˆ2*Mˆ2-3*jreal*M)/2; k=-1; kmatrix(j,i)=k; %Simplify Amer Option else pu(j,i)=1/6+(jrealˆ2*Mˆ2+jreal*M)/2; pm(j,i)=2/3-jrealˆ2*Mˆ2; pd(j,i)=1/6+(jrealˆ2*Mˆ2-jreal*M)/2; k=0; kmatrix(j,i)=k; %Simplify Amer Option end %Special case: Qs not discounted since const interest rates Q(j-1+k,i+1)=Q(j-1+k,i+1)+pu(j,i)*Q(j,i); Q(j+k,i+1)=Q(j+k,i+1)+pm(j,i)*Q(j,i); Q(j+1+k,i+1)=Q(j+1+k,i+1)+pd(j,i)*Q(j,i); %Arrow-Debreu Prices discounted at the riskfree %rate to the present value Arrow(j-1+k,i+1)=Arrow(j-1+k,i+1)... +pu(j,i)*Arrow(j,i)*exp(-r*dt); Arrow(j+k,i+1)=Arrow(j+k,i+1)... +pm(j,i)*Arrow(j,i)*exp(-r*dt); Arrow(j+1+k,i+1)=Arrow(j+1+k,i+1)... +pd(j,i)*Arrow(j,i)*exp(-r*dt); end% j loop if (i>1) F(i) = interp1(Tvector,Fvector,(i-1)*dt); theta(i)=log(F(i)./ sum((Q(jrange,i)).*exp (X(jrange,i)))); else F(1) = interp1(Tvector,Fvector,(i-1)*dt, 'linear','extrap'); theta(1)=log(min(F(1),Fvector(1))); end S(jrange,i)=exp(theta(i)+ X(jrange,i)); end figure plot (Tvector,Fvector,0,F(1),'+','MarkerSize',10); xlabel('Time [Years]'); ylabel('Crude Oil Futures Price'); title('WTI Futures as of January 2011');
if (Stepsi
j i
Using wxi(k) = xi(k+1)
aii aii
j 1 is overrelation. It is not possible to predict the optimal damping parameter but the range of w = 1.2 − 1.5 is usually satisfactory. 6.12. GAUSS–SEIDEL TECHNIQUE The Gauss–Seidel technique is usually introduced in mathematics texts before description of the SOR technique. Nevertheless, the Gauss–Seidel technique can be considered as a special case of the SOR technique. A relaxation term of w = 1 reduces the SOR method to the Gauss–Seidel method as given by
$$x_i^{(k+1)} = \frac{1}{a_{ii}}\left(r_i - \underbrace{\sum_{j<i} a_{ij}x_j^{(k+1)}}_{\text{just calculated}} - \sum_{j>i} a_{ij}x_j^{(k)}\right).$$
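The Gauss–Seidel update, and its SOR generalization through the relaxation parameter w, can be sketched on a small linear system. The matrix, right-hand side, tolerance, and iteration limit below are illustrative assumptions; setting `w = 1` gives Gauss–Seidel, while a value such as 1.2 gives over-relaxation.

```matlab
% Gauss-Seidel / SOR iteration for A*x = rvec on a small test system.
A    = [4 -1 0; -1 4 -1; 0 -1 4];   % diagonally dominant (assumed example)
rvec = [2; 4; 10];
w    = 1.0;          % w = 1: Gauss-Seidel; 1 < w < 2: over-relaxation
n    = numel(rvec);
x    = zeros(n,1);   % initial guess
tol  = 1e-10;
maxIter = 200;

for k = 1:maxIter
    xOld = x;
    for i = 1:n
        others = [1:i-1, i+1:n];
        % x(1:i-1) already holds the values just calculated in sweep k+1
        xGS  = (rvec(i) - A(i,others)*x(others)) / A(i,i);
        x(i) = (1-w)*x(i) + w*xGS;   % relaxed update
    end
    if norm(x - xOld) < tol
        break
    end
end
fprintf('Converged in %d sweeps\n', k);
disp(x')
```

Because each component is overwritten as soon as it is computed, only one solution vector is stored, which is the main practical difference from the Jacobi iteration.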
j 101 Smax = 100; end vol = 0.2; end r = 0.03; end T = 5; end K = 50; end S0 = 50; end
dS=Smax/(M-1) % equivalent to M=Smax/dS + 1
%Smax= (M-1)*dS
% S=0,dS,2*dS,...,(M-1)*dS
S=0:dS:Smax;
% simple step size calibration of tstepApprox
tstepApprox = dS^2/sqrt(vol)/40
Napprox = round (T/tstepApprox+1);
t = linspace(0,T,Napprox);
N = length(t)
dt = t(2)-t(1)
c = zeros(M,N); %C = zeros(M,N);
maxSK = max(S-K,0);
c(:,N) = maxSK; %C(:,N) = maxSK; %time T expiration
p = zeros(M,N); %P = zeros(M,N);
maxKS = max(K-S,0);
p(:,N) = maxKS; %P(:,N) = maxKS; %time T expiration
j = [2:M-1]; % Stock Price Interior Nodes
jAll=[1:M]-1; % use for S=dS*jAll
alphaCoeff = 0.25*dt*(vol^2*jAll.^2 -(r-q)*jAll);
betaCoeff = -0.5*dt*(vol^2*jAll.^2+r);
gammaCoeff = 0.25*dt*(vol^2*jAll.^2 +(r-q)*jAll);
%%%%%%%
iAll=[1:N]; % use for t=(iAll-1)*dt
%tau= T-t =(N-1)*dt - (iAll-1)*dt = (N-iAll)*dt
c(1,iAll) = 0; % OTM
% dividend discount uses time to expiration tau, like the strike term
c(M,iAll) = Smax*exp(-q*(N-iAll)*dt)...
    -K*exp(-r*(N-iAll)*dt); %ITM
p(M,iAll) = 0; % OTM
p(1,iAll) = K*exp(-r*(N-iAll)*dt); %ITM %Smin=0
% CN technique only calculates interior nodes.
% We throw out the first and last a,b,c Coefficient
CoeffMat1=diag(-alphaCoeff(3:M-1),-1)+...
    diag(1-betaCoeff(2:M-1))+diag(-gammaCoeff(2:M-2),1);
CoeffMat2=diag(alphaCoeff(3:M-1),-1)+...
    diag(1+betaCoeff(2:M-1))+diag(gammaCoeff(2:M-2),1);
% As just mentioned, CN technique only calculates
% interior nodes. The top interior node needs information
% from the top edge node and the bottom interior node needs
% information from the bottom edge node. Since we can't
% extend the coefficient matrix, the edge nodes
% aCoeff(2) and cCoeff(M-1) have to be added in
cEdge=zeros(M-2,1);
pEdge=zeros(M-2,1);
% Matrix Inversion
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
tic
invCoeffMat1=inv(CoeffMat1);
for i=N-1:-1:1
    cEdge(end)=(c(M,i)+c(M,i+1))*gammaCoeff(M-1); %cEdge(1)=0;
    c(j,i)= (invCoeffMat1 *(CoeffMat2*c(j,i+1) + cEdge));
    pEdge(1)=(p(1,i)+p(1,i+1))*alphaCoeff(2); %pEdge(end)=0;
    p(j,i) = (invCoeffMat1 *(CoeffMat2*p(j,i+1) + pEdge));
end
MatrixInversionTime=toc;
fprintf (1,'\nMatrix Inversion: Time = %e',...
    MatrixInversionTime);
EuroPut = interp1 (S, p(:,1), S0);
EuroCall = interp1 (S, c(:,1), S0);
fprintf (1,'\nEuropean Put = %f; European Call = %f\n',...
    EuroPut, EuroCall);
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Matrix LU
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
tic
[L,U]=lu(CoeffMat1);
for i=N-1:-1:1
    cEdge(end)=(c(M,i)+c(M,i+1))*gammaCoeff(M-1); % cEdge(1)=0;
    c(j,i)= (U \ (L \(CoeffMat2*c(j,i+1) + cEdge)));
    pEdge(1)=(p(1,i)+p(1,i+1))*alphaCoeff(2); % pEdge(end)=0;
    p(j,i) = (U \ (L \(CoeffMat2*p(j,i+1) + pEdge)));
end
MatrixLUtime=toc;
fprintf (1,'\nMatrix LU: Time = %e',...
    MatrixLUtime)
EuroPut = interp1 (S, p(:,1), S0);
EuroCall = interp1 (S, c(:,1), S0);
fprintf (1,'\nEuropean Put = %f; European Call = %f\n',...
    EuroPut, EuroCall);
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Matrix Division
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
tic
for i=N-1:-1:1
    cEdge(end)=(c(M,i)+c(M,i+1))*gammaCoeff(M-1); % cEdge(1)=0;
    c(j,i)= (CoeffMat1 \ (CoeffMat2*c(j,i+1) + cEdge));
    pEdge(1)=(p(1,i)+p(1,i+1))*alphaCoeff(2); % pEdge(end)=0;
    p(j,i) = (CoeffMat1 \ (CoeffMat2*p(j,i+1) + pEdge));
end
MatrixLeftTime=toc;
fprintf (1,'\nMatrix Division: Time = %e', MatrixLeftTime);
EuroPut = interp1 (S, p(:,1), S0);
EuroCall = interp1 (S, c(:,1), S0);
fprintf (1,'\nEuropean Put = %f, European Call = %f\n',...
    EuroPut, EuroCall);
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Gaussian Elimination
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
c = zeros(M,N);
c(:,N) = maxSK;
p = zeros(M,N);
p(:,N) = maxKS; %time T expiration
c(1,iAll) = 0; % OTM
c(M,iAll) = Smax*exp(-q*(N-iAll)*dt)...
    -K*exp(-r*(N-iAll)*dt); %ITM
p(M,iAll) = 0; % OTM
p(1,iAll) = K*exp(-r*(N-iAll)*dt); %ITM %Smin=0
cEdge=zeros(M-2,1);
pEdge=zeros(M-2,1);
% place coefficients into form to use standard
% tridiagonal matrix algorithm of Thomas
%aCoeff(j)c(j-1)+bCoeff(j)c(j)+cCoeff(j)c(j+1)=r(j)
aCoeff=-alphaCoeff;
bCoeff=1-betaCoeff;
cCoeff=-gammaCoeff;
tic
for i=N-1:-1:1
    cEdge(end)=(c(M,i)+c(M,i+1))*gammaCoeff(M-1); % cEdge(1)=0;
    rc(j)= (CoeffMat2*c(j,i+1) + cEdge) ;
    pEdge(1)=(p(1,i)+p(1,i+1))*alphaCoeff(2); % pEdge(end)=0;
    rp(j) = (CoeffMat2*p(j,i+1) + pEdge);
    %% modify coefficients and right hand side
    cCoeffHat(2)=cCoeff(2)/bCoeff(2);
    rcHat(2)=rc(2)/bCoeff(2);
    rpHat(2)=rp(2)/bCoeff(2);
    for m=3:1:M-1
        cCoeffHat(m)=cCoeff(m)/...
            (bCoeff(m)-cCoeffHat(m-1)*aCoeff(m));
        rcHat(m)=(rc(m)-rcHat(m-1)*aCoeff(m)) /...
            (bCoeff(m) - cCoeffHat(m-1)*aCoeff(m));
        rpHat(m)=(rp(m)-rpHat(m-1)*aCoeff(m)) /...
            (bCoeff(m) - cCoeffHat(m-1)*aCoeff(m));
    end
    %% back substitution from the top interior node
    c(M-1,i)=rcHat(M-1);
    p(M-1,i)=rpHat(M-1);
    for m=M-2:-1:2
        c(m,i)=rcHat(m)-cCoeffHat(m)*c(m+1,i);
        p(m,i)=rpHat(m)-cCoeffHat(m)*p(m+1,i);
    end
end
GaussElimTime=toc;
fprintf (1,'\nGaussian Elimination: Time = %e', GaussElimTime);
EuroPut = interp1 (S, p(:,1), S0);
EuroCall = interp1 (S, c(:,1), S0);
fprintf (1,'\nEuropean Put = %f, European Call = %f\n',...
    EuroPut, EuroCall);
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Analytical Black-Scholes
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
tic
bsCall = BlackScholesCall (K,S0,T,vol,r,0);
bsPut = BlackScholesPut (K,S0,T,vol,r,0);
bsTime=toc;
fprintf (1,'\nAnalytical Black Scholes: Time = %e', bsTime);
fprintf (1,'\nEuropean Put = %f; European Call = %f\n',...
    bsPut, bsCall);
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Option to plot numerical solution to PDE
% as a function of time and underlying asset price
PlotFlag=1;
if (PlotFlag)
    Smatrix=repmat(S',1,N);
    tmatrix=repmat(t,M,1);
    figure
    surf (Smatrix,tmatrix,c)
    shading interp
    xlabel ('Asset Price');
    ylabel ('Time [Years]');
    zlabel ('Call Option Price');
    title ('Crank-Nicolson European Call')
    figure
    surf (Smatrix,tmatrix,p)
    alpha(.7)
    shading interp
    title ('Crank-Nicolson European Put')
    xlabel ('Asset Price');
    ylabel ('Time [Years]');
    zlabel ('Put Option Price');
end
end
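The tridiagonal (Thomas) elimination used in the Gaussian Elimination section of the listing above can be isolated in a few lines. The script below is a sketch on an assumed 6-by-6 test system, not an excerpt of the pricing code; the vectors `a`, `b`, and `c` are hypothetical stand-ins for the sub-, main, and super-diagonals.

```matlab
% Thomas algorithm for a(m)*x(m-1) + b(m)*x(m) + c(m)*x(m+1) = rhs(m).
n   = 6;
a   = -1*ones(n,1);    % sub-diagonal   (a(1) is not used)
b   =  4*ones(n,1);    % main diagonal
c   = -1*ones(n,1);    % super-diagonal (c(n) is not used)
rhs = (1:n)';

cHat = zeros(n,1);
rHat = zeros(n,1);
x    = zeros(n,1);

% Forward sweep: eliminate the sub-diagonal
cHat(1) = c(1)/b(1);
rHat(1) = rhs(1)/b(1);
for m = 2:n
    denom   = b(m) - cHat(m-1)*a(m);
    cHat(m) = c(m)/denom;
    rHat(m) = (rhs(m) - rHat(m-1)*a(m))/denom;
end

% Back substitution
x(n) = rHat(n);
for m = n-1:-1:1
    x(m) = rHat(m) - cHat(m)*x(m+1);
end

% Compare with the full-matrix solution
Afull = diag(a(2:n),-1) + diag(b) + diag(c(1:n-1),1);
residual = norm(Afull*x - rhs);
fprintf('Residual = %e\n', residual);
```

The work grows linearly with the number of nodes, whereas a dense inverse grows cubically, which is why the elimination is attractive on fine asset-price grids.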
Code: CrankNicAmerPDE
function [ EuroCall AmerCall EuroPut AmerPut ] =...
    CrankNicAmerPDE( S0, K, T, r, vol, Smax, M, q)
% CrankNicAmerPDE employs Successive over relaxation
% (SOR) on a PDE grid to calculate price
% of American call and put options. PlotFlag Option
% turns on plotting of option value as a function
% of time and underlying asset price
% Option Delta and Gamma can be easily extracted from
% grid. A 2D plot for Delta and Gamma at time zero
% is generated as well as a 3D plot for Delta
% and Gamma vs time and asset price
disp ('Crank Nicolson PDE: American Techniques') if if if if if if if if
(nargin (nargin (nargin (nargin (nargin (nargin (nargin (nargin
< < < < < < <
M>101 Smax = 100; end vol = 0.2; end r = 0.03; end T = 5; end K = 50; end S0 = 50; end
dS=Smax/(M-1) % equivalent to M=Smax/dS + 1
%Smax= (M-1)*dS
% S=0,dS,2*dS,...,(M-1)*dS
S=0:dS:Smax;
% simple step size calibration of tstepApprox
tstepApprox = dS^2/sqrt(vol)/200
Napprox = round (T/tstepApprox+1);
t = linspace(0,T,Napprox);
N = length(t)
dt = t(2)-t(1)
maxSK = (max(S-K,0))';
maxKS = (max(K-S,0))';
j = [2:M-1]; % Stock Price Interior Nodes
jAll=[1:M]'-1; % use for S=dS*jAll
alphaCoeff = 0.25*dt*(vol^2*jAll.^2 -(r-q)*jAll);
betaCoeff = -0.5*dt*(vol^2*jAll.^2+r);
gammaCoeff = 0.25*dt*(vol^2*jAll.^2 +(r-q)*jAll);
% Same Boundary Conditions for American
iAll=[1:N]; % use for t=(iAll-1)*dt
%tau= T-t =(N-1)*dt - (iAll-1)*dt = (N-iAll)*dt
%%%%%%%%%%%%%%%%%%%%%%%%%%
% CN technique only calculates interior nodes.
% We ignore the first and last a,b,c Coefficient
CoeffMat1=diag(-alphaCoeff(3:M-1),-1)+...
    diag(1-betaCoeff(2:M-1))+diag(-gammaCoeff(2:M-2),1);
CoeffMat2=diag(alphaCoeff(3:M-1),-1)+...
    diag(1+betaCoeff(2:M-1))+diag(gammaCoeff(2:M-2),1);
% As just mentioned, CN technique only calculates
% interior nodes. The top interior node needs information
% from the top edge node and the bottom interior node needs
% information from the bottom edge node. Since we can't
% extend the coefficient matrix, the edge nodes
% aCoeff(2) and cCoeff(M-1) have to be added in
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Matrix Division
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
c = zeros(M,N);
c(:,N) = maxSK;
p = zeros(M,N);
p(:,N) = maxKS; %time T expiration
c(1,iAll) = 0; % OTM
% dividend discount uses time to expiration tau, like the strike term
c(M,iAll) = Smax*exp(-q*(N-iAll)*dt)...
    -K*exp(-r*(N-iAll)*dt); %ITM
p(M,iAll) = 0; % OTM
p(1,iAll) = K*exp(-r*(N-iAll)*dt); %ITM %Smin=0
cEdge=zeros(M-2,1);
pEdge=zeros(M-2,1);
tic
for i=N-1:-1:1
    cEdge(end)=(c(M,i)+c(M,i+1))*gammaCoeff(M-1); % cEdge(1)=0;
    c(j,i)= (CoeffMat1 \ (CoeffMat2*c(j,i+1) + cEdge));
    pEdge(1)=(p(1,i)+p(1,i+1))*alphaCoeff(2); % pEdge(end)=0;
    p(j,i) = (CoeffMat1 \ (CoeffMat2*p(j,i+1) + pEdge));
end
MatrixLeftTime=toc;
fprintf(1,'\nMatrix Division: Time = %e', MatrixLeftTime);
EuroPut = interp1 (S, p(:,1), S0);
EuroCall = interp1 (S, c(:,1), S0);
fprintf (1,'\nEuropean Put = %f, European Call = %f\n\n',...
    EuroPut, EuroCall);
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% SOR: American
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Direct Calculation of American Put (P) and Call (C)
% Successive Over Relaxation Technique is required as
% each node is repeatedly checked for early payout
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
C = zeros(M,N);
C(:,N) = maxSK;
P = zeros(M,N);
P(:,N) = maxKS; %time T expiration
C(1,iAll) = 0; % OTM
C(M,iAll) = Smax*exp(-q*(N-iAll)*dt)...
    -K*exp(-r*(N-iAll)*dt); %ITM
P(M,iAll) = 0; % OTM
P(1,iAll) = K;%K*exp(-r*(N-iAll)*dt); %ITM %Smin=0
Cchange = zeros(M,1);
Pchange = zeros(M,1);
Cold = zeros(M,1);
Pold=zeros(M,1);
rC=zeros(M,1);
rP=zeros(M,1);
CEdge=zeros(M-2,1);
PEdge=zeros(M-2,1);
tol = 0.0000001;
w=1.5;
maxcount=40;
tic
jNodes=[1:M];
for i=N-1:-1:1
    %%%% American Call Calculation
    C(j,i)=C(j,i+1);
    CEdge(end)=(C(M,i+1))*gammaCoeff(M-1); %CEdge(1)=0;
    rC(j)= ((CoeffMat2*C(j,i+1) + CEdge));
    CchangeNorm=1;
    counter =0;
    while ((CchangeNorm > tol) && (counter < maxcount))
        for m=2:M-1
            Cold(m)=C(m,i);
            C(m,i)=max(maxSK(m),...
                (C(m,i)+(w.*(1./(1-betaCoeff(m))).*...
                (rC(m) + alphaCoeff(m).*C((m-1),i)...
                - (1-betaCoeff(m)).*C(m,i)...
                + gammaCoeff(m).*C(m+1,i)))));
        end
        Cchange(j) = C(j,i) - Cold(j);
        CchangeNorm=(norm(Cchange));
        counter = counter +1;
    end
    %[i counter CchangeNorm ]
    %%%% American Put Calculation
    P(j,i)=P(j,i+1);
    PEdge(1)=(P(1,i+1))*alphaCoeff(2); %pEdge(end)=0;
    rP(j) = ((CoeffMat2*P(j,i+1) + PEdge));
    PchangeNorm=1;
    counter =0;
    while ((PchangeNorm > tol) && (counter < maxcount))
        for m=2:M-1
            Pold(m)=P(m,i);
            P(m,i)=max(maxKS(m),...
                (P(m,i)+(w.*(1./(1-betaCoeff(m))).*...
                (rP(m) + alphaCoeff(m).*P((m-1),i)...
                - (1-betaCoeff(m)).*P(m,i)...
                + gammaCoeff(m).*P(m+1,i)))));
        end
        Pchange(j) = P(j,i) - Pold(j);
        PchangeNorm=(norm(Pchange));
        counter = counter +1;
    end
    % [i counter PchangeNorm ]
end
sorTime=toc;
fprintf('\nSuccessive Over Relaxation: Time = %e', sorTime);
AmerPut = interp1 (S, P(:,1), S0);
AmerCall = interp1 (S, C(:,1), S0);
fprintf (1,'\nAmerican Put = %f, American Call = %f\n\n',...
    AmerPut, AmerCall);
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Option Greeks
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
DeltaCallAmer=(C(3:end,1)-C(1:end-2,1))/(2*dS);
DeltaPutAmer=(P(3:end,1)-P(1:end-2,1))/(2*dS);
DeltaCallEur=(c(3:end,1)-c(1:end-2,1))/(2*dS);
DeltaPutEur=(p(3:end,1)-p(1:end-2,1))/(2*dS);
GammaCallAmer=(C(3:end,1)-2*C(2:end-1,1)+C(1:end-2,1))/(dS^2);
GammaPutAmer=(P(3:end,1)-2*P(2:end-1,1)+P(1:end-2,1))/(dS^2);
GammaCallEur=(c(3:end,1)-2*c(2:end-1,1)+c(1:end-2,1))/(dS^2);
GammaPutEur=(p(3:end,1)-2*p(2:end-1,1)+p(1:end-2,1))/(dS^2);
DeltaCallAmerAll=(C(3:end,:)-C(1:end-2,:))/(2*dS);
GammaCallAmerAll=(C(3:end,:)-2*C(2:end-1,:)+C(1:end-2,:))/(dS^2);
DeltaPutAmerAll=(P(3:end,:)-P(1:end-2,:))/(2*dS);
GammaPutAmerAll=(P(3:end,:)-2*P(2:end-1,:)+P(1:end-2,:))/(dS^2);
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Option to plot numerical solution to PDE
% as a function of time and underlying asset price
PlotFlag=1;
if (PlotFlag)
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    % compare American and European call option price
    % evolution over time
    figure
    Smatrix=repmat(S',1,N);
    tmatrix=repmat(t,M,1);
    surf (Smatrix,tmatrix,C)
    hold on
    surf (Smatrix,tmatrix,c)
    hold off
    shading interp
    xlabel ('Asset Price');
    ylabel ('Time [Years]');
    zlabel ('Call Option Price');
    title ('Crank-Nicolson: European = American Call')
    % compare American and European put option price
    % evolution over time
    figure
    surf (Smatrix,tmatrix,P)
    hold on
    surf (Smatrix,tmatrix,p)
    hold off
    alpha(.7)
    shading interp
    title ('Crank-Nicolson: American and European Put')
    xlabel ('Asset Price');
    ylabel ('Time [Years]');
    zlabel ('Put Option Price');
    figure
    % compare american and european call and put option delta
    subplot (2,2,1);
    plot (S(2:(end-1))/S0,DeltaCallAmer,'',...
        S(2:(end-1))/S0,DeltaCallEur,'--');
    title (' Time zero: 5 years until Expiration')
    xlabel ('Asset Price / S_0');
    ylabel ('Call \Delta = \deltac/\deltaS');
    subplot (2,2,3);
    plot (S(2:(end-1))/S0,DeltaPutAmer,'',...
        S(2:(end-1))/S0,DeltaPutEur,'--');
    xlabel ('Asset Price / S_0');
    ylabel ('Put \Delta = \deltap/\deltaS');
    % compare american and european call and put option gamma
    subplot (2,2,2);
    plot (S(2:(end-1))/S0,GammaCallAmer,'',...
        S(2:(end-1))/S0,GammaCallEur,'--');
    xlabel ('Asset Price / S_0');
    ylabel ('Call \Gamma = \delta^2c/\deltaS^2');
    subplot (2,2,4);
    plot (S(2:(end-1))/S0,GammaPutAmer,'',...
        S(2:(end-1))/S0,GammaPutEur,'--');
    xlabel ('Asset Price / S_0');
    ylabel ('Put \Gamma = \delta^2p/\deltaS^2');
    legend('American','European')
    figure
    % plot call delta vs time and vs asset price
    subplot (2,2,1)
    Smatrix=repmat((S(2:end-1))',1,N)/ S0;
    tmatrix=repmat(t,M-2,1);
    colormap pink
    contourf (Smatrix,tmatrix,DeltaCallAmerAll)
    xlabel ('Asset Price / S_0');
    ylabel ('Time [Years]');
    title ('American Call \Delta = \deltaC/\deltaS');
    % plot put delta vs time and vs asset price
    subplot (2,2,3)
    colormap pink
    contourf (Smatrix,tmatrix,DeltaPutAmerAll)
    xlabel ('Asset Price / S_0');
    ylabel ('Time [Years]');
    title ('American Put \Delta = \deltaP/\deltaS');
    % near expiration (last few time points) explosive increase
    % in gamma obscures plot
    % plot call gamma vs time and vs asset price
    subplot (2,2,2)
    contour (Smatrix,tmatrix,GammaCallAmerAll,50)
    xlabel ('Asset Price / S_0');
    ylabel ('Time [Years]');
    title ('American Call \Gamma = \delta^2C/\deltaS^2');
    % plot put gamma vs time and vs asset price
    subplot (2,2,4)
    contour (Smatrix,tmatrix,GammaPutAmerAll,50)
    xlabel ('Asset Price / S_0');
    ylabel ('Time [Years]');
    title ('American Put \Gamma = \delta^2P/\deltaS^2');
end
end
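The Delta and Gamma extraction in the listing above relies on central differences across neighboring asset-price nodes. The following sketch applies the same stencils to a smooth stand-in for one column of option values; the quadratic test function is an assumption chosen because its exact derivatives are known, so the stencils can be verified directly.

```matlab
% Central-difference Delta and Gamma on a uniform asset-price grid.
dS = 0.5;
S  = (0:dS:100)';          % grid nodes
K  = 50;
V  = (S - K).^2/100;       % stand-in for one time column of the grid

% Interior nodes only: the stencils need one neighbor on each side
Delta = (V(3:end) - V(1:end-2))/(2*dS);
Gamma = (V(3:end) - 2*V(2:end-1) + V(1:end-2))/dS^2;
Sint  = S(2:end-1);

% Exact derivatives of the test function
DeltaExact = (Sint - K)/50;
GammaExact = 0.02*ones(size(Sint));
fprintf('Max Delta error = %e\n', max(abs(Delta - DeltaExact)));
fprintf('Max Gamma error = %e\n', max(abs(Gamma - GammaExact)));
```

Both stencils are second-order accurate in `dS`, and the resulting vectors are two entries shorter than the grid, which is why the listing plots the Greeks against `S(2:(end-1))`.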
Code: ExplicitHeston function [c0 C0] = ExplicitHeston(S0,K0,r,dt,T,... kappa,lambda,sigma,rho,eta,v0) % ExplicitHeston: explicit finite difference % of Heston Model. Two-dimensional grid of coupled % stochastic volatility and asset price % Time step backwards from known call option value % at maturity to time zero. Explicit FD requires fine % time step grid. Easier to let code calculate dt. if if if if if if if if if if if
(nargin (nargin (nargin (nargin (nargin (nargin (nargin (nargin (nargin (nargin (nargin
< < < < < < < < < <
(S_0)eˆrˆt\uparrow',... 'HorizontalAlignment','right') text(T(midT),F(midT)-1,'F_0E[S_T]'}; text(tau(end*.7),Ftau1(round(end*0.1)),str1, 'HorizontalAlignment','right') str2(1) = {'Normal Backwardation'}; str2(2) = {'F_{\tau} % Sequential scalar % Filtering allows division instead of inversion % % % % % % % % % %
% 2='Schmidt-SR' only used if zero correlation between
% measurements. Improves inversion by Doubling the Precision
% of ill-conditioned Innovation Covariance Matrix
% Highly correlated measurements precisely measured, i.e., with very
% small variance terms on the diagonal of the Measurement Noise
% Covariance matrix, create a situation where all terms in the
% Innovation Covariance Matrix are similar. This leads to
% ill-conditioned matrices during the inversion step.
% 3='Carraro-S/SR' Propagates the squared covariance matrix
% itself through the filter, thus losing the increase in
% numerical precision
% Forces the calculation of a positive semi-definite covariance matrix
global Obs ; % use of global avoids passing large arrays
global TimeDelta; % generally global variables are bad programming
global TimeMat;
global LnS CY MeasErr flag;
r=0.06; % Constant interest Rates -> Futures=Forward Contract
alpha=parameter(1);
kappa=parameter(2);
sig1=parameter(3);
lambda=parameter(4);
sig2=parameter(5); rho=parameter(6); mu=parameter(7); Rvar=parameter(8:11); % Variance terms on diagonal of R=diag(Rvar); % Measurement Noise Covariance Ncontracts=length(TimeMat); % lnF=lnS + CY*(1-exp(-kappa*dt))/kappa + D % Measurement Equation % y[4x1] = H[4x4] * x[2x1] + D[4x1] % relate spot price to forwards at different maturities H=[ones(Ncontracts,1),(-(1-exp(-kappa*TimeMat))/kappa)]; D=((r+0.5*sig2ˆ2/kappaˆ2-alpha+sig2*lambda/kappa... -sig1*sig2*rho/kappa)*TimeMat)... +(0.25*sig2ˆ2*(1-exp(-2*kappa*TimeMat))/kappaˆ3)... +((alpha-sig2*lambda/kappa+sig1*sig2*rho/kappa-sig2ˆ2/ kappaˆ2)... *(1-exp(-kappa*TimeMat)))/kappa; % lnS(t)=lnS(t-1)-CYdt+(mu-0.5*sig1*sig1)dt % CY(t)=CY(t-1)*exp(-kappa*dt)+alpha(1-exp(-kappa*dt) % Transition Equation % x[2x1] = M[2x2] * x[2X1] + C[2x1] + w w~N(0,Q) M=[1, -TimeDelta ; 0, exp(-kappa*TimeDelta) ]; %M=2x2 Matrix C=[TimeDelta*(mu-0.5*sig1ˆ2); alpha*(1-exp(-kappa* TimeDelta))]; Qv1= sig1*sig1*TimeDelta; Qv2= sig2*sig2*( (1-exp(-2.*kappa.*TimeDelta)) ./ (2. *kappa)); Qc= rho*sig1*sig2*( (1-exp(-kappa.*TimeDelta)) ./ kappa); Q=[Qv1, Qc; Qc, Qv2]; nsamples=length(Obs(:,1)); nseries=length(Obs(1,:)); LikeSum=0.5*nsamples*log(2*pi); x=[Obs(1,1)/H(1,1); 0];%inv(H)*Obs(1,:); %Initialize Covariance to Unconditional Variance of State Process P=[Qv1 0; 0 Qv2]; I2=eye(2); I4=eye(4); switch flag case {'1','scalar'} % flag='scalar';
nmeasurements=length(Obs(1,:)); for i=2:nsamples % update state and covariance predictions as a vector x = M*x+C; %x(k|k-1) State Prediction x=4x1 Matrix P = M*P*M'+ Q; %P(k|k-1) Covariance Prediction % Individually filter each futures contract for j=1:nmeasurements %step through 4 measurements / time step zpred=H(j,:)*x-D(j); %scalar MR(j)=Obs(i,j)-zpred; %scalar HP=H(j,:)*P; %HP[1x2]=H(j)[1x2] P[2x2] V(j)=[[HP]*H(j,:)'+R(j,j)]; %[1x4] invV(j)=1/V(j); %scalar-> do not need invert matrix K=(HP)'*invV(j); x=x+K*MR(j); P=P-K*(HP); end % put into form for maximum likelihood calculation dInvV=diag(invV); % DiagonalV=diag(V); DetDiagonalV=det(DiagonalV) ProdDiag=prod(V); % determinant of diagonal matrix is product elements LikeSum=LikeSum+0.5*log(ProdDiag)+0.5*MR*dInvV*MR'; LnS(i)=x(1); CY(i)=x(2); MeasErr(i,:)=MR; end case {'2','S-SR','Schmidt-SR'} % flag='Schmidt-SR'; S=(chol(P))'; %S(0|0) Q1=(chol(Q))'; R1=(chol(R))'; for i=2:nsamples x = M*x+C ; %x(k|k-1)=M*x(k-1|k-1) A1=[S'*M';Q1']'; %[2x4] matrix: % A1 is rectangular and not Triangular % convert A1 into upper triangle S such that % A1*A1'=S*T*T'*S'= S*S'= M*P*M'+ Q [T S]=qr(A1',0); %T*S=A1' and S=T'*A1' %Matlab function qr factors A1 into upper %triangular matrix S % and orthogonal matrix T that is discarded % qr could be coded with a givens or householder function
    S=S'; %lower triangle
    %A1*A1'=S*S'= M*P*M'+ Q
    %triangular S can be propagated through filter
    A2=[S'*H';R1']'; %(k|k-1)
    V=A2*A2';
    B=chol(V); %upper triangle % V=B'*B
    invB=inv(B);
    invV=invB*invB'; %=inv(B)*inv(B') %=inv(V)
    detV=det(B)^2; %det(V)=det(B)*det(B')=det(B)*det(B)
    K=S*S'*H'*invB*invB'; %Kalman Gain
    MR = Obs(i,:)' - H*x-D; %Innovation Residual
    %x(k|k) State Correction based on observation res.
    x = x + K*(MR) ;
    gamma=inv(I4+invV*R); % %scalar
    S=(I2-K*gamma*H)*S; %S(k|k) %scalar
    %Never necessary to calculate Covariance P =S*S'
    LikeSum=LikeSum+0.5*log(detV)+0.5*MR'*invV*MR;
    LnS(i)=x(1); CY(i)=x(2); MeasErr(i,:)=MR;
end
case {'3','Carraro-S/SR','S/SR'} % flag='Carraro-S/SR';
% Carraro Squaring/Square root Kalman filter
% The product of A(j)*A(j)'=P j=1,2,3 should always be pos
% semi-def
Q1=(chol(Q))'; R1=(chol(R))'; %size(R1)
for i=2:nsamples
    S=(chol(P))'; %S(0|0) or %S(k|k)
    x = M*x+C ; %x(k|k-1)=M*x(k-1|k-1)+C
    A1=[S'*M';Q1']'; %A1=[M*S, Q1] so that A1*A1'=M*P*M'+Q
    P=A1*A1' ; %P(k|k-1)
    S=(chol(P))'; %S(k|k-1)
    A2=[S'*H';R1']';
    V=A2*A2'; %Residual (Innovation) Covariance
    invV=inv(V);
    K = P*H' * invV ; %P*H' * inv(V) %Kalman Gain
    MR = Obs(i,:)' - H*x-D; %Measurement (Innovation) Residual
    % x(k|k) State Correction based on observation res.
    x = x + K*(MR) ;
    A3=[S'*(I2-K*H)'; R1'*K']'; %A3=[(I-K*H)*S, K*R1]
    P=A3*A3'; %P(k|k)=(I-K*H)*P*(I-K*H)'+K*R*K'
    LikeSum=LikeSum+0.5*log(det(V))+0.5*MR'*invV*MR; %+1e-10;
    LnS(i)=x(1); CY(i)=x(2); MeasErr(i,:)=MR;
end
otherwise % default to {'0','Normal-Joseph'} % flag='Normal-Joseph';
for i=2:nsamples
    x = M*x+C ; % x(k|k-1) State Prediction x=2x1 Matrix
    P = M*P*M'+ Q ; % P(k|k-1) Covariance Prediction
    MR = Obs(i,:)' - H*x-D ; %Measurement (Innovation) Residual
    V = H*P*H' + R; % Residual (Innovation) Covariance
    invV=inv(V);
    K = P*H' * invV; % P*H' * inv(V) %Kalman Gain
    % x(k|k) State Correction based on observation res.
    x = x + K*(MR) ;
    % P(k|k) Covariance Correction
    % -> use More Symmetric Joseph Covariance Correction
    P=(I2-K*H)*P*(I2-K*H)'+K*R*K';
    LikeSum=LikeSum+0.5*log(det(V))+0.5*MR'*invV*MR; %+1e-10;
    LnS(i)=x(1); CY(i)=x(2); MeasErr(i,:)=MR;
end
end
for i = 1:11 % brute force technique to avoid negative terms
if (parameter(i)

0. Examination also shows that, as the dimension $n_x$ of the state vector increases, the displacement of the points also increases. This dimensional scaling can be offset by setting $\kappa = 3 - n_x$; however, for $n_x \ge 3$, the zeroth weight becomes zero or negative, that is,

$$w_0 = \frac{3 - n_x}{3},$$

which can produce a covariance matrix that is not positive semidefinite. The $\kappa$ parameter can be used to avoid deleterious behavior generated by a severe nonlinearity in the state and measurement equations. A stable Kalman filtration relies on a change in the output of a function for a change in the input. Sigma points located in close proximity can yield identical output and unstable operation of the filter.
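The effect of the choice $\kappa = 3 - n_x$ on the weights can be tabulated directly. The short sketch below assumes the standard unscented transform weights $w_0 = \kappa/(n_x + \kappa)$ and $w_i = 1/(2(n_x + \kappa))$; the variable names are illustrative only.

```matlab
% Zeroth sigma-point weight w0 = kappa/(nx + kappa) for kappa = 3 - nx
% (assumed standard unscented transform weights)
nx    = 1:5;                      % state dimensions to examine
kappa = 3 - nx;                   % scaling choice discussed in the text
w0    = kappa./(nx + kappa);      % equals (3 - nx)/3
wi    = 1./(2*(nx + kappa));      % weight of each of the 2*nx remaining points
disp([nx; w0; wi]);               % w0 is zero at nx = 3 and negative beyond
```

The weights still sum to one for every $n_x$, but a negative $w_0$ means the weighted covariance sum is no longer guaranteed to be positive semidefinite.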
9.10. SCALED UNSCENTED TRANSFORM

The scaled unscented transform was developed to add more flexibility in the placement of the sigma points by

$$x_i' = x_0 + \alpha (x_i - x_0)$$

where the sigma point scaling parameter $\alpha$ (with $0 \le \alpha \le 1$) is typically small, for example, $10^{-3}$, to avoid sampling nonlocal effects from severe nonlinearities (van der Merwe et al., 2000). The $2n_x + 1$ sigma points and weights are scaled by a factor

$$\lambda = \alpha^2 (n_x + \kappa) - n_x$$

to give

$$X_i = \left[\hat{x},\ \hat{x} + \sqrt{(n_x + \lambda)P^{xx}},\ \hat{x} - \sqrt{(n_x + \lambda)P^{xx}}\right],$$

$$w_0^{(m)} = \frac{\lambda}{n_x + \lambda}, \qquad w_0^{(c)} = \frac{\lambda}{n_x + \lambda} + (1 - \alpha^2 + \beta), \qquad w_i^{(m)} = w_i^{(c)} = \frac{1}{2(n_x + \lambda)}, \quad i = 1, \ldots, 2n_x,$$

where the superscript (c) represents covariance and (m) represents mean. The nonnegative scaling parameter $\beta$ is used to reduce the error in the kurtosis, with $\beta = 2$ optimal for a Gaussian prior.

9.11. UNSCENTED TRANSFORM KALMAN FILTER OF BLACK–SCHOLES MODEL

The algorithm developed by Julier and Uhlmann (1997, 2004) for the UKF follows the same steps as the Gauss–Hermite filter, with the sigma points and their weights taking the place of the quadrature points and weights. The function NonLinearKalman used above to call the extended Kalman filtration of the Black–Scholes model in the function EKFbs can also be used to call the unscented transform filter in the function UTbs. Figure 9.5 shows that the unscented transform is able to find the hidden volatility as a function of time based on a time series of stock index prices and at-the-money call option prices of one month in duration.
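The sigma points and weights of the scaled unscented transform can be generated in a few lines. The following is a minimal sketch rather than the UTbs code; the mean, covariance, and scaling parameters are assumed values, and the matrix square root is taken as a Cholesky factor.

```matlab
% Scaled unscented transform: sigma points and weights (illustrative sketch)
nx    = 2;                           % state dimension (assumed)
alpha = 1e-3; beta = 2; kappa = 0;   % assumed scaling parameters
xhat  = [1; 2];                      % assumed mean
P     = [1.0 0.3; 0.3 0.5];          % assumed covariance

lambda = alpha^2*(nx + kappa) - nx;
L = chol((nx + lambda)*P)';          % lower-triangular matrix square root
X = [xhat, repmat(xhat,1,nx) + L, repmat(xhat,1,nx) - L];   % nx x (2nx+1)

wm = [lambda/(nx + lambda), repmat(1/(2*(nx + lambda)), 1, 2*nx)];  % mean weights
wc = wm;
wc(1) = wm(1) + (1 - alpha^2 + beta);                               % covariance weights

% Recover the mean and covariance from the weighted sigma points
xbar = X*wm';
dX   = X - repmat(xbar, 1, 2*nx + 1);
Pbar = dX*diag(wc)*dX';
```

For an identity mapping, xbar and Pbar reproduce xhat and P; passing the columns of X through a nonlinear function before forming the weighted sums gives the transformed mean and covariance.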
9.12. UNSCENTED TRANSFORM KALMAN FILTER OF HESTON MODEL

The value of the unscented transform is more apparent for extracting the hidden stochastic volatility of the Heston model. The Heston (1993) model is similar to the Black–Scholes model except that the variance is stochastic and follows a Cox–Ingersoll–Ross (CIR) mean-reversion process. The Heston model dynamics are summarized by three equations, namely,

$$dS_t = (r + W)S_t\,dt + \sqrt{v_t}\,S_t\,dW_1,$$
$$dv_t = \kappa(\theta - v_t)\,dt + \sigma\sqrt{v_t}\,dW_2,$$
$$E[dW_1\,dW_2] = \rho\,dt$$

where $\kappa$ is the variance reversion rate, $\theta$ is the mean-reversion or long-term variance level, $\sigma$ is the volatility of variance, the factor $\rho$ correlates the two Wiener processes, $v_0$ is the initial variance, and $W$ is the market price of diffusion risk.
The function NonLinearKalmanSV calls the unscented transform filter in the function UTHeston. A modification of Javaheri et al. (2003) gives the state transition equation as

$$v_k = x_k = x_{k-1} + \left[\kappa\theta - \rho\sigma(r + w x_{k-1}) - \left(\kappa - \tfrac{1}{2}\rho\sigma\right)x_{k-1}\right]\Delta t + \rho\sigma \ln\frac{S_k}{S_{k-1}}$$

with state variance error

$$Q = \sigma^2(1 - \rho^2)x_{k-1}\Delta t.$$

The measurement equation for the asset price is

$$y_k = \ln(S_{k+1}) = \ln(S_k) + \{r + w x_{k-1}\}\Delta t - \tfrac{1}{2}x_{k-1}\Delta t$$

with measurement variance error $R = x_{k-1}\Delta t$.

FIGURE 9.5 (Output of UTbs; panels show index price, call price, and volatility versus time.) (Middle) Unscented transform filter (based on the Black–Scholes model) fit and prediction of the one-month at-the-money call option given the (top) previous month stock index price and call option price. (Bottom) The volatility as measured from the local stock index price, extracted from an inversion of the Black–Scholes model, and predicted by the Kalman filter.

Li (2012) showed that the unscented transform filtration is greatly improved by incorporating call price information. Under the risk-neutral measure, the Heston model can be represented by

$$dS_t = rS_t\,dt + \sqrt{v_t}\,S_t\,dW_1^Q,$$
$$dv_t = [\kappa(\theta - v_t) - V v_t]\,dt + \sigma\sqrt{v_t}\,dW_2^Q,$$
$$E[dW_1\,dW_2] = \rho\,dt$$

where $V$ is the market price of volatility risk. A simple analytical formula is not available for option prices described by the Heston model; however, the option price can be calculated efficiently by the fast Fourier transform (FFT) of Carr and Madan (1999). The algorithms necessary for this procedure, including FFToption, are described in the Fourier-Based Option Analysis chapter. One requirement is a separate function containing the Heston characteristic function. As described in Li (2012), the characteristic function for the Heston model can be represented as

$$\phi_{Heston}(\omega; x = V_{t-\tau}) = E^Q[e^{i\omega R_t}] = e^{i\omega r t - A(\omega,\tau) - B(\omega,\tau)V_{t-\tau}}$$

where

$$A(\omega,\tau) = \frac{\kappa\theta}{\sigma^2}\left[2\ln\left(1 - \frac{(\gamma - \kappa^*)(1 - e^{-\gamma\tau})}{2\gamma}\right) + (\gamma - \kappa^*)\tau\right],$$

$$B(\omega,\tau) = \frac{2\varphi(\omega)(1 - e^{-\gamma\tau})}{2\gamma - (\gamma - \kappa^*)(1 - e^{-\gamma\tau})},$$

and

$$\gamma = \sqrt{(\kappa^*)^2 + 2\sigma^2\varphi(\omega)}, \qquad \varphi(\omega) = \tfrac{1}{2}(i\omega + \omega^2), \qquad \kappa^* = \kappa + V - i\omega\rho\sigma.$$
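The $A$/$B$ form of the characteristic function translates almost line by line into anonymous functions. The sketch below is not the book's characteristic-function file; the parameter values are assumptions, and the horizon in the leading term is taken to be $\tau$. Two checks follow from the definition: the function equals 1 at $\omega = 0$, and at $\omega = -i$ it returns the expected gross return $e^{r\tau}$.

```matlab
% Heston characteristic function in the A/B form (illustrative sketch)
kappa = 2.0; theta = 0.04; sigma = 0.3; rho = -0.7;  % assumed parameters
V   = 0.0;      % market price of volatility risk (assumed)
r   = 0.02;     % risk-free rate (assumed)
tau = 1/12;     % one-month horizon in years (assumed)
v   = 0.04;     % current variance V_{t-tau} (assumed)

phiw  = @(w) 0.5*(1i*w + w.^2);
kstar = @(w) kappa + V - 1i*w*rho*sigma;
gam   = @(w) sqrt(kstar(w).^2 + 2*sigma^2*phiw(w));
Afun  = @(w) (kappa*theta/sigma^2) * ...
        (2*log(1 - (gam(w) - kstar(w)).*(1 - exp(-gam(w)*tau))./(2*gam(w))) ...
         + (gam(w) - kstar(w))*tau);
Bfun  = @(w) 2*phiw(w).*(1 - exp(-gam(w)*tau)) ./ ...
        (2*gam(w) - (gam(w) - kstar(w)).*(1 - exp(-gam(w)*tau)));
cfHeston = @(w) exp(1i*w*r*tau - Afun(w) - Bfun(w)*v);

cf0 = cfHeston(0);      % should equal 1
cfM = cfHeston(-1i);    % E[exp(R)], should equal exp(r*tau)
```

A function handle such as cfHeston is the kind of object an FFT-based pricer would evaluate on its frequency grid.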
The FFT algorithm places the inherent restriction $\Delta k\,\Delta\omega = \frac{2\pi}{N}$ such that a fine spacing in frequency leads to a coarse spacing in log-strike price points, $\Delta k = \frac{2\pi}{\Delta\omega}\frac{1}{N}$. This coarse spacing in log-strike price results in a large number of call options calculated at strike prices, $K = S_0\exp(k)$, near zero or several times the current asset price $S_0$, which are irrelevant. As such, interpolation error may occur for a coarse grid spacing. One choice is to increase the number of calculation points, but at the cost of increased computational burden. Chourdakis (2005; 2008) implemented an option valuation procedure based on the fractional FFT (FrFFT), which allows a variation in this log-strike spacing. Specifically, the fractional FFT can decrease the fraction $A = \frac{\Delta k\,\Delta\omega}{2\pi}$, which is fixed at $A = \frac{1}{N}$ for the traditional FFT. A decrease in $A$ allows a finer spacing in the log-strike price and consequently a narrower range of log strikes. The array of log-strike points is still symmetric about $k = 0$. The function CalcHestonCall signals the function FFToption to use the fractional FFT algorithm. The output of the function NonLinearKalmanSV employing the unscented transform filter in the function UTHeston is displayed in Figure 9.6.

FIGURE 9.6 Unscented transform filtration based on the Heston stochastic volatility model for a time series of (middle) stock index prices and (top) one-month at-the-money call prices. (Bottom) Black–Scholes implied variance and unscented transform prediction of variance.
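The trade-off between the two grids is easy to see numerically. The sketch below uses assumed values for the grid size, frequency spacing, and asset price, and it only builds the log-strike grids; it does not price any options or call FFToption.

```matlab
% Log-strike grids implied by the FFT and fractional FFT (illustrative values)
N      = 4096;          % number of grid points (assumed)
dOmega = 0.25;          % frequency spacing (assumed)
S0     = 1000;          % current asset price (assumed)

% Traditional FFT: dk*dOmega = 2*pi/N, so A = 1/N
dkFFT = 2*pi/(N*dOmega);
kFFT  = (-N/2:N/2-1)*dkFFT;        % log-strike grid centered on k = 0
KFFT  = S0*exp(kFFT);

% Fractional FFT: A = dk*dOmega/(2*pi) is chosen freely, here below 1/N
A     = 1/(8*N);
dkFr  = 2*pi*A/dOmega;
kFr   = (-N/2:N/2-1)*dkFr;
KFr   = S0*exp(kFr);

fprintf('FFT:   dk = %.5f, strikes from %.3g to %.3g\n', dkFFT, KFFT(1), KFFT(end));
fprintf('FrFFT: dk = %.5f, strikes from %.3g to %.3g\n', dkFr,  KFr(1),  KFr(end));
```

With the same number of points, the smaller $A$ concentrates the grid around the current asset price instead of spending most points on strikes near zero or far above $S_0$.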
9.13. SCALED UNSCENTED TRANSFORM KALMAN FILTER

A more sophisticated implementation of the scaled unscented transform is to filter a concatenated vector of the original state and noise vectors. The augmented vector of the state vector $x_{k-1}$, state noise vector $v_{k-1}$, and observation noise vector $w_{k-1}$ is

$$x_{k-1}^a = \left[x_{k-1}^T \quad v_{k-1}^T \quad w_{k-1}^T\right]^T$$

with a length given by a summation of individual lengths, that is, $n_a = n_x + n_v + n_w$. The expected value is initialized as

$$\hat{x}_0^a = E[x^a] = \left[\hat{x}_0^T \quad 0 \quad 0\right]^T$$

and the covariance is initialized as

$$P_0^a = E[(x_0^a - \hat{x}_0^a)(x_0^a - \hat{x}_0^a)^T] = \begin{bmatrix} P_0 & 0 & 0 \\ 0 & Q & 0 \\ 0 & 0 & R \end{bmatrix}.$$

The scaled unscented transform contains a predict stage and an update stage:

Predict

Sigma points calculation, $[n_a \times (2n_a + 1)]$:

$$X_{k-1}^a = \left[\hat{x}_{k-1}^a \quad \hat{x}_{k-1}^a + \sqrt{(n_a + \lambda)P_{k-1}^a} \quad \hat{x}_{k-1}^a - \sqrt{(n_a + \lambda)P_{k-1}^a}\right] = \begin{bmatrix} X_{k-1}^x \\ X_{k-1}^v \\ X_{k-1}^w \end{bmatrix}$$

State prediction:

$$X_{k|k-1}^x = m(X_{k-1}^x, X_{k-1}^v), \qquad \hat{x}_{k|k-1} = \sum_{i=0}^{2n_a} w_i^{(m)} X_{i,k|k-1}^x$$

Covariance prediction:

$$P_{k|k-1} = \sum_{i=0}^{2n_a} w_i^{(c)} (X_{i,k|k-1}^x - \hat{x}_{k|k-1})(X_{i,k|k-1}^x - \hat{x}_{k|k-1})^T$$

Measurement prediction:

$$Y_{k|k-1} = h(X_{k|k-1}^x, X_{k-1}^w), \qquad \hat{y}_{k|k-1} = \sum_{i=0}^{2n_a} w_i^{(m)} Y_{i,k|k-1}$$

Update

Measurement covariance:

$$P_{yy} = \sum_{i=0}^{2n_a} w_i^{(c)} (Y_{i,k|k-1} - \hat{y}_{k|k-1})(Y_{i,k|k-1} - \hat{y}_{k|k-1})^T$$

Cross-covariance:

$$P_{xy} = \sum_{i=0}^{2n_a} w_i^{(c)} (X_{i,k|k-1}^x - \hat{x}_{k|k-1})(Y_{i,k|k-1} - \hat{y}_{k|k-1})^T$$

Kalman gain: $K_k = P_{xy}(P_{yy})^{-1}$

State correction: $\hat{x}_k = \hat{x}_{k|k-1} + K_k(y_k - \hat{y}_{k|k-1})$

Covariance correction: $P_k = P_{k|k-1} - K_k P_{yy} K_k^T$
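The predict and update stages can be traced on a scalar example. The sketch below assumes a linear state and measurement equation so that the result can be compared with the ordinary Kalman filter; the model coefficients, noise variances, and scaling parameters are illustrative choices and are not taken from UTbs or UTHeston.

```matlab
% One predict/update step of the augmented scaled unscented filter (sketch)
a = 0.9; c = 2.0;                 % assumed linear model coefficients
m = @(x, v) a*x + v;              % state equation with state noise v
h = @(x, w) c*x + w;              % measurement equation with observation noise w
Q = 0.1; R = 0.2;                 % assumed noise variances
xhat = 1.0; P = 0.5;              % assumed prior mean and variance
y = 2.3;                          % assumed observation

alpha = 1; beta = 2; kappa = 0;   % assumed scaling parameters
na = 3;                           % na = nx + nv + nw
lambda = alpha^2*(na + kappa) - na;

xa = [xhat; 0; 0];                % augmented mean
Pa = blkdiag(P, Q, R);            % augmented covariance
L  = chol((na + lambda)*Pa)';
Xa = [xa, repmat(xa,1,na) + L, repmat(xa,1,na) - L];   % na x (2na+1)
wm = [lambda/(na + lambda), repmat(1/(2*(na + lambda)), 1, 2*na)];
wc = wm;  wc(1) = wm(1) + (1 - alpha^2 + beta);

% Predict
Xx    = m(Xa(1,:), Xa(2,:));      % propagated state sigma points
xpred = Xx*wm';
Ppred = ((Xx - xpred).^2)*wc';
Y     = h(Xx, Xa(3,:));           % predicted measurements
ypred = Y*wm';

% Update
Pyy  = ((Y - ypred).^2)*wc';
Pxy  = ((Xx - xpred).*(Y - ypred))*wc';
K    = Pxy/Pyy;
xnew = xpred + K*(y - ypred);
Pnew = Ppred - K*Pyy*K';
```

For this linear model the unscented step matches the linear Kalman filter to rounding error; replacing m and h with nonlinear handles exercises the same code path.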
9.14. MONTE CARLO NUMERICAL INTEGRATION

The integrals arising in the Kalman filter can be numerically solved by a Monte Carlo approximation to the PDF by

$$p(x) = N(x; \hat{x}, P^{xx}) \cong \sum_{i=1}^{N_s} w^{(i)}\delta(x - x^{(i)})$$

where $N_s$ random support points are drawn from $N(x; \hat{x}, P^{xx})$ with identical weights of $w^{(i)} = 1/N_s$ (Haug 2005). Therefore, the empirical estimate of the posterior can be approximated as

$$p(x_{0:k}|y_{1:k}) \cong \frac{1}{N_s}\sum_{i=1}^{N_s}\delta_{x_{0:k}^{(i)}}(dx_{0:k})$$

by drawing random samples $x_{0:k}^{(i)}$ from the posterior distribution (van der Merwe et al., 2000). The expected value of a general function $g(x)$ is estimated as

$$E[g(x)] = \int g(x)p(x)\,dx \cong \int g(x)\sum_{i=1}^{N_s} w^{(i)}\delta(x - x^{(i)})\,dx = \frac{1}{N_s}\sum_{i=1}^{N_s} g(x^{(i)})$$

and, when applied to the filtering problem, it is

$$E[g_k(x_{0:k})] = \int g_k(x_{0:k})\,p(x_{0:k}|y_{1:k})\,dx_{0:k} \cong \frac{1}{N_s}\sum_{i=1}^{N_s} g_k(x_{0:k}^{(i)}).$$

The support points of the Monte Carlo simulation perform a similar role as the unscented sigma points or the Gauss–Hermite quadrature points.
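A short example makes the equal-weight estimate concrete. The sketch below draws support points from a bivariate normal and averages a test function whose expectation is known in closed form; the mean, covariance, and test function are assumed for illustration.

```matlab
% Monte Carlo estimate of E[g(x)] for x ~ N(xhat, Pxx) (illustrative sketch)
Ns   = 200000;                    % number of support points (assumed)
xhat = [0.5; -1.0];               % assumed mean
Pxx  = [1.0 0.4; 0.4 2.0];        % assumed covariance
g    = @(x) x(1,:).^2 + x(2,:);   % example function with a known expectation

L  = chol(Pxx)';                                  % Pxx = L*L'
xs = repmat(xhat, 1, Ns) + L*randn(2, Ns);        % support points x^(i)
Eg = mean(g(xs));                                 % (1/Ns)*sum of g(x^(i))

EgExact = Pxx(1,1) + xhat(1)^2 + xhat(2);         % E[x1^2] + E[x2]
fprintf('Monte Carlo: %.4f, exact: %.4f\n', Eg, EgExact);
```

The sampling error of the estimate shrinks in proportion to $1/\sqrt{N_s}$, which is the cost that the sigma-point and quadrature rules avoid for smooth, low-dimensional problems.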
9.15. NONLINEAR MONTE CARLO KALMAN FILTER WITH ADDITIVE NOISE

With this basic understanding of Monte Carlo integration, the nonlinear MCKF with additive noise can be outlined as follows (Haug 2005). The expected value is initialized as $\hat{x}_{0|0}$, and the covariance is initialized as $P_{0|0}^{xx}$.

Predict

Generate prior samples: $x_{k-1|k-1}^{(i)} \sim N(x_{k-1}; \hat{x}_{k-1|k-1}, P_{k-1|k-1}^{xx})$

State prediction:

$$\hat{x}_{k|k-1} = \frac{1}{N_s}\sum_{i=1}^{N_s} m(x_{k-1|k-1}^{(i)})$$

State covariance prediction:

$$P_{k|k-1} = Q + \frac{1}{N_s}\sum_{i=1}^{N_s}\left(m(x_{k-1|k-1}^{(i)}) - \hat{x}_{k|k-1}\right)\left(m(x_{k-1|k-1}^{(i)}) - \hat{x}_{k|k-1}\right)^T$$

Generate predictive samples: $x_{k|k-1}^{(i)} \sim N(x_k; \hat{x}_{k|k-1}, P_{k|k-1}^{xx})$

Measurement prediction:

$$\hat{y}_{k|k-1} = \frac{1}{N_s}\sum_{i=1}^{N_s} h(x_{k|k-1}^{(i)})$$

Update

Measurement covariance:

$$P_{yy} = R + \frac{1}{N_s}\sum_{i=1}^{N_s}\left(h(x_{k|k-1}^{(i)}) - \hat{y}_{k|k-1}\right)\left(h(x_{k|k-1}^{(i)}) - \hat{y}_{k|k-1}\right)^T$$

Cross-covariance:

$$P_{xy} = \frac{1}{N_s}\sum_{i=1}^{N_s}\left(x_{k|k-1}^{(i)} - \hat{x}_{k|k-1}\right)\left(h(x_{k|k-1}^{(i)}) - \hat{y}_{k|k-1}\right)^T$$

Kalman gain: $K_k = P_{xy}(P_{yy})^{-1}$

State correction: $\hat{x}_k = \hat{x}_{k|k-1} + K_k(y_k - \hat{y}_{k|k-1})$

Covariance correction: $P_k = P_{k|k-1} - K_k P_{yy} K_k^T$

As the estimated variance is proportional to $1/N_s$, the recursive application of Monte Carlo integrations in the MCKF will quickly lead to an increase in error and divergence of the filter (Haug 2005). An additional limitation of the MCKF is the requirement that the samples are generated from a known distribution.
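One pass through the MCKF outline can be written for a scalar state as follows. This is a minimal sketch with assumed model functions, noise variances, and sample count; a linear model is used only so the result can be compared against the exact Kalman filter values.

```matlab
% One step of the Monte Carlo Kalman filter with additive noise (sketch)
m = @(x) 0.9*x;  h = @(x) 2.0*x;     % assumed model functions
Q = 0.1; R = 0.2;                    % assumed additive noise variances
xhat = 1.0; P = 0.5; y = 2.3;        % assumed prior and observation
Ns = 100000;                         % number of samples (assumed)

% Predict
xs    = xhat + sqrt(P)*randn(1, Ns);         % prior samples
xpred = mean(m(xs));
Ppred = Q + mean((m(xs) - xpred).^2);
xp    = xpred + sqrt(Ppred)*randn(1, Ns);    % predictive samples
ypred = mean(h(xp));

% Update
Pyy  = R + mean((h(xp) - ypred).^2);
Pxy  = mean((xp - xpred).*(h(xp) - ypred));
K    = Pxy/Pyy;
xnew = xpred + K*(y - ypred);
Pnew = Ppred - K*Pyy*K';
```

Each of the sample means carries Monte Carlo error, and that error is fed back into the next step through xnew and Pnew, which is the accumulation the text warns about.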
9.16. IMPORTANCE SAMPLING

An analytical expression for the posterior distribution is typically unavailable; thus samples cannot be generated for the expected value estimation

$$E[g_k(x_{0:k})] = \int g_k(x_{0:k})\,p(x_{0:k}|y_{1:k})\,dx_{0:k} \cong \frac{1}{N_s}\sum_{i=1}^{N_s} g_k(x_{0:k}^{(i)}).$$

The solution is to sample from a simple distribution referred to as the importance density $q(x_k|y_{1:k})$, which is rescaled to $p(x_k|y_{1:k})$ at each $x_k$ by the importance weight

$$w(x_k) = \frac{p(x_k|y_{1:k})}{q(x_k|y_{1:k})}.$$

Substitution gives the expected value as

$$E[g_k(x_k)] = \frac{\int g_k(x_k)\,w(x_k)\,q(x_k|y_{1:k})\,dx_k}{\int w(x_k)\,q(x_k|y_{1:k})\,dx_k}.$$

The expected value estimation for a discrete set of $N_s$ sampled particles is

$$E[g_k(x_k)] = \frac{\frac{1}{N_s}\sum_{i=1}^{N_s} g_k(x_k^{(i)})\,w_k^{(i)}(x_k^{(i)})}{\frac{1}{N_s}\sum_{i=1}^{N_s} w_k^{(i)}(x_k^{(i)})} = \sum_{i=1}^{N_s} g_k(x_k^{(i)})\,\tilde{w}_k^{(i)}(x_k^{(i)})$$

where the normalized importance weight is given by

$$\tilde{w}_k^{(i)}(x_k^{(i)}) = \frac{w_k^{(i)}(x_k^{(i)})}{\sum_{i=1}^{N_s} w_k^{(i)}(x_k^{(i)})}.$$
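The normalized-weight estimator can be tried on a case where the answer is known. In the sketch below the target density is a narrow Gaussian and the importance density is a wider Gaussian; both are assumptions chosen only so that the estimate can be checked against the true mean of 1.

```matlab
% Self-normalized importance sampling estimate of E_p[x] (illustrative sketch)
npdf = @(x, mu, s) exp(-0.5*((x - mu)/s).^2)/(s*sqrt(2*pi));  % Gaussian density
Ns = 100000;                             % number of particles (assumed)

xs = 2*randn(1, Ns);                     % samples from q = N(0, 2^2) (assumed)
w  = npdf(xs, 1, 0.5)./npdf(xs, 0, 2);   % importance weights p/q, p = N(1, 0.5^2)
wn = w/sum(w);                           % normalized importance weights
Eg = sum(xs.*wn);                        % estimate of E_p[x]; exact value is 1
```

Because the weights are normalized, the target density only needs to be known up to a constant factor, which is the situation that arises for the filtering posterior.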
For implementation in a filter, it is beneficial to write the importance weight in a recursive manner as

$$w(x_k) = \frac{p(x_k|y_{1:k})}{q(x_k|y_{1:k})} = \frac{c\,p(y_k|x_k)\,p(x_k|y_{1:k-1})}{q(x_k|y_{1:k})}$$

where $p(x_k|y_{1:k})$ is the posterior probability density, $p(y_k|x_k)$ is the likelihood of observation $y_k$ given state $x_k$, $c$ is a normalizing constant, and $p(x_k|y_{1:k-1})$ is the prior probability density. The prior probability for all observations up to $y_{k-1}$, that is, $y_{1:k-1}$, where the states are not known exactly, is given by the predictive conditional density

$$p(x_k|y_{1:k-1}) = \int p(x_k|x_{1:k-1}, y_{1:k-1})\,p(x_{k-1}|y_{1:k-1})\,dx_{k-1}$$

and the importance density is similarly expanded as

$$q(x_k|y_{1:k}) = \int q(x_k|x_{1:k-1}, y_{1:k})\,q(x_{k-1}|y_{1:k-1})\,dx_{k-1}.$$

Inserting gives the importance weight as

$$w(x_k) = \frac{p(x_k|y_{1:k})}{q(x_k|y_{1:k})} = \frac{c\,p(y_k|x_k)\int p(x_k|x_{1:k-1}, y_{1:k-1})\,p(x_{k-1}|y_{1:k-1})\,dx_{k-1}}{\int q(x_k|x_{1:k-1}, y_{1:k})\,q(x_{k-1}|y_{1:k-1})\,dx_{k-1}}$$

where $p(x_{k-1}|y_{1:k-1})$ is the previous posterior PDF.
9.17. SEQUENTIAL IMPORTANCE SAMPLING

Typically, in the filtering problem, a set of particles approximates the posterior PDF up to time $t_{k-1}$ as

$$p(x_{k-1}|y_{1:k-1}) \approx \sum_{i=1}^{N_s} w_{k-1}^{(i)}\,\delta(x_{k-1} - x_{k-1|k-1}^{(i)}).$$

When the particles are drawn from the importance density $q(x_{k-1}|y_{1:k-1})$, the weights are defined as

$$w_{k-1}^{(i)} = \frac{p(x_{k-1|k-1}^{(i)})}{q(x_{k-1|k-1}^{(i)})}.$$

As discussed by Haug (2005), in sequential importance sampling, a random measure of $[x_{k-1|k-1}^{(i)}, w_{k-1}^{(i)}]$ is available to approximate $p(x_{k-1}|y_{1:k-1})$. The definition of the posterior PDF up to time $t_{k-1}$ gives the weight update equation of each particle as

$$w_k^{(i)} \propto p(y_k|x_{k|k-1}^{(i)})\,\frac{p(x_{k|k-1}^{(i)}|x_{k-1|k-1}^{(i)}, y_{1:k-1})}{q(x_{k|k-1}^{(i)}|x_{k-1|k-1}^{(i)}, y_{1:k-1})}\,\underbrace{\left\{\frac{p(x_{k-1|k-1}^{(i)})}{q(x_{k-1|k-1}^{(i)})}\right\}}_{w_{k-1}^{(i)}}.$$

The discrete state space equation gives $x_{k|k-1}^{(i)} = m(x_{k-1|k-1}^{(i)}, v_{k-1}^{(i)})$. The weight update process provides a mechanism to sequentially update the importance weights as

$$w_k^{(i)} = w_{k-1}^{(i)}\,p(y_k|x_{k|k-1}^{(i)})\,\frac{p(x_{k|k-1}^{(i)}|x_{k-1|k-1}^{(i)})}{q(x_{k|k-1}^{(i)}|x_{k-1|k-1}^{(i)})}.$$

The recursively updated weights give the posterior filtered PDF as

$$p(x_k|y_{1:k}) \approx \sum_{i=1}^{N_s} w_k^{(i)}\,\delta(x_k - x_{k|k}^{(i)}).$$

Repeated recursion of the discrete state space equation $x_{k|k-1}^{(i)} = m(x_{k-1|k-1}^{(i)}, v_{k-1}^{(i)})$ leads to particle dispersion, and the variance of $x_k$ increases without bound. The increase in variance causes a degeneracy, where one of the normalized importance weights tends to 1. The remaining weights approach zero and are statistically insignificant (van der Merwe et al., 2000). The effective sample size can be estimated by

$$\hat{N}_{eff} = \frac{1}{\sum_{i=1}^{N_s}\{w_k^{(i)}\}^2}.$$
Resampling techniques are used to redistribute the particles from low-probability to high-probability points. Resampling is not necessary at every iteration and optionally can be applied only when $\hat{N}_{eff}$ falls below a chosen threshold, which is at most $N_s$.

9.18. INVERSE TRANSFORM RESAMPLING

The inverse transform method is based on the relation that a random variable defined by $x = F^{-1}(u)$ has the continuous distribution function $F$, where $u$ is a uniformly distributed random variable (Ross 1989). The cumulative distribution function

$$F(x) = P(z \le x) = \int_{-\infty}^{x} p(z)\,dz$$

has a discrete approximation

$$F(x) = \int_{-\infty}^{x}\sum_{i=1}^{N_s} w^{(i)}\,\delta(z - z^{(i)})\,dz.$$

Taking $x^{(j)}$ to be the nearest discrete point still below $x$, the discrete approximation is available as

$$F(x) \cong F(x^{(j)}) = \sum_{i=1}^{j} w^{(i)}.$$

The inverse transform resampling is based on drawing

$$u^{(i)} \sim U(0, 1), \qquad i = 1, \ldots, N_s$$

and interpolating a value of $x^{(i)} = F^{-1}(u^{(i)})$. All $x^{(i)}$ in the sample set are equally probable, that is, $\Pr(x^{(i)} = x) = \frac{1}{N_s}$, and the weights of the resampled particle set are $\tilde{w}^{(i)} = \frac{1}{N_s}$.
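The effective sample size and the inverse transform step can both be illustrated on a small weighted particle set. The particles and weights below are assumed values, and the inverse of the discrete distribution function is taken as the first particle whose cumulative weight reaches the uniform draw.

```matlab
% Inverse transform resampling of a weighted particle set (illustrative sketch)
x  = [-1.0, 0.0, 0.5, 2.0];      % particles (assumed)
w  = [0.1, 0.2, 0.6, 0.1];       % normalized importance weights (assumed)
Ns = numel(x);
Neff = 1/sum(w.^2);              % effective sample size

F = cumsum(w);                   % discrete CDF: F(x^(j)) = sum of w^(i), i <= j
F(end) = 1;                      % guard against round-off in the last entry
u = rand(1, Ns);                 % u^(i) ~ U(0,1)
idx = zeros(1, Ns);
for i = 1:Ns
    idx(i) = find(F >= u(i), 1, 'first');   % smallest j with F(x^(j)) >= u^(i)
end
xr = x(idx);                     % resampled particles
wr = ones(1, Ns)/Ns;             % equal weights after resampling
```

Particles with large weights tend to be selected several times and particles with small weights tend to drop out, so the resampled set carries the same distribution with uniform weights.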
9.19. BOOTSTRAP PARTICLE FILTER

The bootstrap particle filter uses the assumption that the importance density is equal to the prior density. This implies that

$$q(x_{k|k-1}^{(i)}|x_{k-1|k-1}^{(i)}) = p(x_{k|k-1}^{(i)}|x_{k-1|k-1}^{(i)})$$

and the sequential update to the importance weights reduces to

$$w_k^{(i)} = w_{k-1}^{(i)}\,p(y_k|x_{k|k-1}^{(i)}).$$
FIGURE 9.7 Particle filter estimation of variance in the Heston model. The input data is the set of stock index prices and corresponding one-month at-the-money call options. The plots are overlaid with (top panel) the range of particles and (bottom panel) the individual particles.
9.20. PARTICLE FILTER OF THE HESTON MODEL

The function NonLinearKalmanSV calls the particle filter in the function PFHeston. Figure 9.7 displays the variance predicted within the Heston model. At each time step, the weight or significance of each particle is evaluated. Before moving to the next step, the particles are resampled, whereby the significant particles are multiplied and the insignificant particles are eliminated. A poor initial estimate of the state and covariance can lead to immediate instability; otherwise, the particle filter is fairly robust.
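The weight, estimate, and resample cycle described above can be sketched on a toy model. The example below is not PFHeston: it assumes a scalar linear state equation with Gaussian noise and a direct noisy observation, and it resamples at every step by the inverse transform method.

```matlab
% Bootstrap particle filter on a toy scalar model (illustrative sketch,
% not the PFHeston function)
a = 0.9; Q = 0.1; R = 0.2;       % assumed model: x_k = a*x_{k-1} + v, y_k = x_k + w
T = 50; Ns = 500;                % time steps and particle count (assumed)

% Simulate a state path and noisy observations
xtrue = zeros(1, T); y = zeros(1, T);
xtrue(1) = 1; y(1) = xtrue(1) + sqrt(R)*randn;
for k = 2:T
    xtrue(k) = a*xtrue(k-1) + sqrt(Q)*randn;
    y(k) = xtrue(k) + sqrt(R)*randn;
end

xp = 1 + sqrt(0.5)*randn(1, Ns); % initial particles (assumed prior)
xest = zeros(1, T);
for k = 1:T
    if k > 1
        xp = a*xp + sqrt(Q)*randn(1, Ns);   % draw from the prior (importance) density
    end
    w = exp(-0.5*(y(k) - xp).^2/R);         % likelihood p(y_k|x_k) up to a constant
    w = w/sum(w);                           % normalized weights
    xest(k) = sum(w.*xp);                   % filtered state estimate
    F = cumsum(w); F(end) = 1;              % inverse transform resampling
    u = rand(1, Ns);
    idx = zeros(1, Ns);
    for i = 1:Ns
        idx(i) = find(F >= u(i), 1, 'first');
    end
    xp = xp(idx);                           % equal-weight particle set
end
rmse = sqrt(mean((xest - xtrue).^2));
```

Because the particles are resampled to equal weights at every step, the previous weight drops out of the update and only the likelihood is needed.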
SUMMARY

Several variants of the Kalman filter have been developed to address systems that are nonlinear and possibly present non-Gaussian noise. The UKF provides a reasonable description of nonlinear functions for a relatively low computational burden. The particle filter presents a better representation of the distribution, particularly for non-Gaussian noise, although the efficiency only approaches that of the UKF for a high number of dimensions in the state. The particle filter relies on a distribution of particles located at statistically significant points. Several resampling algorithms are available to recursively redistribute the particles. Furthermore, more advanced algorithms are available, such as the unscented particle filter, that forecast the likely future location for the particles.

REFERENCES

Arulampalam, M., Maskell, S., Gordon, N., Clapp, T. (2002) A Tutorial on Particle Filters for Online Nonlinear/Non-Gaussian Bayesian Tracking, IEEE Transactions on Signal Processing 50, 174.
Carr, P.P., Madan, D.B. (1999) Option Valuation Using the Fast Fourier Transform, Journal of Computational Finance 2(4), 61–7.
Chourdakis, K. (2005) Option pricing using the fractional FFT, Journal of Computational Finance 8(2), 1–18.
Chourdakis, K. (2008) Financial Engineering: A Brief Introduction Using the Matlab System.
Haug, A.J. (2005) A Tutorial on Bayesian Estimation and Tracking Techniques Applicable to Nonlinear and Non-Gaussian Process, MITRE Technical Report MTR 05W0000004.
Heston, S. (1993) A Closed-Form Solution for Options with Stochastic Volatility, with Applications to Bond and Currency Options, Review of Financial Studies 6(2), 327.
Ito, K., Xiong, K. (2000) Gaussian Filters for Nonlinear Filtering Problems, IEEE Trans. Automatic Control 45, 910.
Jäckel, P. (2005) A Note on Multivariate Gauss-Hermite Quadrature, www.jaeckel.org.
Javaheri, A., Lautier, D., Galli, A. (2003) Filtering in Finance, Wilmott Magazine.
Julier, S.J., Uhlmann, J.K. (1997) A New Extension of the Kalman Filter to Nonlinear Systems, Proceedings of AeroSense: The 11th International Symposium on Aerospace/Defense Sensing, Simulation and Controls 182.
Julier, S.J., Uhlmann, J.K. (2004) Unscented Filtering and Nonlinear Estimation, Proceedings of the IEEE 92, 401.
Li, J. (2012) An Unscented Kalman Smoother for Volatility Extraction: Evidence from Stock Prices and Options, Computational Statistics and Data Analysis.
Liao, L. (2004) Option Pricing Using Bayes Filters, Technical Report, University of Washington.
Ross, S.M. (1989) Introduction to Probability Models, Academic Press.
van der Merwe, R., de Freitas, N., Doucet, A., Wan, E.A. (2000) The Unscented Particle Filter, Technical Report CUED/F-INFENG/TR 380, Cambridge University, Engineering Department.
APPENDIX

Code for MultiVarGaussian

function [ ] = MultiVarGaussian( )
% MultiVarGaussian Plots 2D Gaussian Distribution and
% Gauss-Hermite Quadrature Points. A stochastic decoupling
% technique based on Cholesky Decomposition and SVD is used to
% eliminate correlation
% Matrix points are rotated via Cholesky Decomposition:
% [R D] = chol(Sigma); L=R'; xnew=L*x;
% or SVD: [U,S,V] = svd(Sigma); xnew=U*sqrt(S)*x;
rho = 0.6
sig= [1 rho ;rho 1]
mu = [0;0]
MaxRange=3; MaxNum=40
M=linspace(-MaxRange,MaxRange,MaxNum);
N=linspace(-MaxRange,MaxRange,MaxNum);
[X,Y] = meshgrid(M,N);
for m = 1:MaxNum
    for n=1:MaxNum
        x=[M(m); N(n)];
        densX(m,n) = MultiPDF( x, mu, sig );
    end
end
% Calculate Gauss-Hermite Weights and Quadrature Points
NumP=5
[xp, wp ] = HermiteWeightAndRoots (NumP, 0)
% Sort Quadrature Points
[xp ind] = sort(xp);
% [Xp,Yp] = meshgrid(xp,xp);
% Sort Weights based on Quadrature Points
tempwp=wp;
for k =1:length(wp)
    wp(k)=tempwp(ind(k));
end
W=wp*wp'; % 2D grid of weights
% Cholesky Decomposition
[R D] = chol (sig)
L=R'
figure
subplot (1,2,1)
contour(X,Y,densX,40)
hold on
for p1=1:NumP
    for p2=1:NumP
        x=[xp(p1) xp(p2)];
        xRot = L*x';
        plot(xRot(1),xRot(2),'o','MarkerSize',2+20*(W(p1,p2)))
        hold on
    end
end
axis ([-MaxRange-1, MaxRange+1,-MaxRange-1,MaxRange+1])
title('Cholesky Stochastic Decoupling')
txtstr=['\rho =' num2str(rho)];
text(2,-2,txtstr);
hold off
[U,S,V] = svd(sig)
A=U*sqrt(S)
subplot (1,2,2)
contour(X,Y,densX,40)
hold on
for p1=1:NumP
    for p2=1:NumP
        x=[xp(p1) xp(p2)];
        xRot = A*x';
        plot(xRot(1),xRot(2),'o','MarkerSize',2+20*(W(p1,p2)))
        hold on
    end
end
axis ([-MaxRange-1, MaxRange+1,-MaxRange-1,MaxRange+1])
title('SVD Stochastic Decoupling')
text(2,-2,txtstr)
hold off
end

function [ Dens ] = MultiPDF( x, mu, sig )
% Multivariate Probability Density Function
k=rank (sig);
Dens = ( exp(-0.5*(x-mu)'*inv(sig)*(x-mu) ) )/...
    ( (2*pi)^(k/2)*sqrt(det(sig)));
end
Code for BlackScholesCall

function [c PI1 PI2] = BlackScholesCall (K,S,T,vol,r,d);
% BlackScholesCall uses classic semi-analytical equation
d1 = ( log(S./K)+ (r-d+vol^2/2).*T)./ (vol.*sqrt(T));
d2 = d1-vol*sqrt(T);
PI1 = myNormCDF(d1);
% PI1=CDF(d1) = probability of finishing in the money for
% risk neutral Martingale measure with Stock as Numeraire
PI2 = myNormCDF(d2);
% PI2=CDF(d2) = probability of finishing in the money for
% risk neutral Martingale measure with riskless Bond Numeraire
c = exp(-d*T).*S.*myNormCDF(d1)-exp(-r*T).*K.*myNormCDF(d2);
end
Code for HermiteWeightAndRoots

function [rootn, wn, Hcoef] =...
    HermiteWeightAndRoots (nend, OutputFlag )
% HermiteWeightAndRoots calculates Coefficients of H_n
% The roots are calculated for each H_n polynomial. The Roots
% serve as abscissa points for Gauss-Hermite Quadrature
% Corresponding Gauss-Hermite Quadrature weights at each root
% are also calculated.
if (nargin

0 -> b1 μ1 μ2
where the form of Kou has been used for positive and negative $x$. The function $I_{\{\cdot\}}$ indicates a value of 1 if the condition is true and zero if the condition is false; $0 < p_1 < 1$ is the probability of an upward jump, and $p_2 = 1 - p_1$ is the probability of a downward jump; the one-sided means are $\mu_1 = \frac{1}{\eta_1}$ and $\mu_2 = \frac{1}{\eta_2}$. The expected value of the jump amplitude is

$$E(Y) = E(\log(y)) = \frac{p_1}{\eta_1} - \frac{p_2}{\eta_2}$$

and the variance is

$$\mathrm{Var}(Y) = p_1 p_2\left(\frac{1}{\eta_1} + \frac{1}{\eta_2}\right)^2 + \frac{p_1}{\eta_1^2} + \frac{p_2}{\eta_2^2}.$$

The expected value of the absolute jump amplitude, $y = e^Y$, is

$$E(e^Y) = E(y) = \frac{p_1\eta_1}{\eta_1 - 1} + \frac{p_2\eta_2}{\eta_2 + 1}.$$

The characteristic function for a jump diffusion process is given by

$$E[e^{i\omega X_t}] = E\left[\exp\left\{i\omega\left(\mu t + \sigma W_t + \sum_{n=1}^{N_t} Y_n\right)\right\}\right] = \exp(i\omega\mu t)\,E[\exp(i\omega\sigma W_t)]\,E\left[\exp\left(i\omega\sum_{n=1}^{N_t} Y_n\right)\right].$$

As $W_t \sim \mathrm{Normal}(0, t)$ and $N_t \sim \mathrm{Poisson}(\lambda t)$ (Sonono 2011),

$$E[e^{i\omega X_t}] = \exp(i\omega\mu t)\exp\left(-\tfrac{1}{2}\omega^2\sigma^2 t\right)\exp\left(\lambda t\,E[e^{i\omega Y} - 1]\right).$$

Or, equivalently, the Levy triplet $(\mu, \sigma^2, \upsilon)$ uniquely defines the characteristic function by the Levy–Khintchine formula

$$\phi(\omega) = \exp\left[i\mu\omega - \tfrac{1}{2}\sigma^2\omega^2 + \int_{-\infty}^{\infty}\left(e^{i\omega x} - 1\right)\upsilon(dx)\right]$$

for finite X. Substituting the Levy density

υ = λφ(x) = p1 η1 e^(−η1 x) I{x≥0} + p2 η2 e^(η2 x) I{x

0 and $b = 1/v > 0$ with $v$ defined as the volatility of the time change. Over a time length $t$, the increment $X_{t+s}^{VG} - X_s^{VG}$ follows a $VG(\sigma\sqrt{t}, v/t, t\theta)$ law (Schoutens 2003). Under the real-world measure P, the variance gamma process characteristic function is

$$E[e^{i\omega X_t^{VG}}] = \phi(\omega; \sigma\sqrt{t}, v/t, t\theta) = \{\phi(\omega; \sigma, v, \theta)\}^t = \left(1 - iv\omega\theta + \tfrac{1}{2}\sigma^2 v\omega^2\right)^{-t/v}.$$
It is more convenient to express the characteristic function in the exponential form as

$$\phi(\omega; \sigma\sqrt{t}, v/t, t\theta) = \exp\left\{-\frac{t}{v}\ln\left(1 - iv\omega\theta + \tfrac{1}{2}\sigma^2 v\omega^2\right)\right\}.$$

Under the real-world measure, there is no drift component to the characteristic exponent, that is, $\psi_d(\omega) = 0$. The characteristic exponent of the variance gamma process has a stochastic component, per unit time, given by $\psi_s(\omega) = -\frac{1}{v}\ln\left(1 - iv\omega\theta + \tfrac{1}{2}\sigma^2 v\omega^2\right)$ under the real-world measure. The risk-neutral drift is found as the risk-free rate compensated for the stochastic variability as given by

$$\mu_{rn-d}(\omega) = r - \psi_s(-i) = r + \frac{1}{v}\ln\left(1 - v\theta - \tfrac{1}{2}\sigma^2 v\right).$$

The risk-neutral characteristic function of the variance gamma process is thus given by

$$\phi_{rn}(\omega; \sigma\sqrt{t}, v/t, t\theta) = \exp\left\{i\omega t\left[r + \frac{1}{v}\ln\left(1 - v\theta - \tfrac{1}{2}\sigma^2 v\right)\right] - \frac{t}{v}\ln\left(1 - iv\omega\theta + \tfrac{1}{2}\sigma^2 v\omega^2\right)\right\}.$$
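The role of the risk-neutral drift can be verified numerically: evaluating the risk-neutral characteristic function at $\omega = -i$ gives $E[e^{X_t}]$, which must equal $e^{rt}$. The sketch below assumes illustrative parameter values and takes the characteristic exponent per unit time, so that the horizon $t$ multiplies both the drift and the stochastic term.

```matlab
% Risk-neutral variance gamma characteristic function (illustrative sketch)
sigma = 0.2; v = 0.3; theta = -0.1;   % assumed VG parameters
r = 0.03; t = 0.5;                    % assumed risk-free rate and horizon

% Stochastic part of the characteristic exponent, per unit time
psiS = @(w) -(1/v)*log(1 - 1i*v*w*theta + 0.5*sigma^2*v*w.^2);
muRN = r - psiS(-1i);                        % risk-neutral drift
cfRN = @(w) exp(1i*w*t*muRN + t*psiS(w));    % risk-neutral characteristic function

martingale = cfRN(-1i);     % E[exp(X_t)], should equal exp(r*t)
unity      = cfRN(0);       % should equal 1
```

The check requires $1 - v\theta - \tfrac{1}{2}\sigma^2 v > 0$ so that the logarithm in the drift is real, which holds for the values assumed here.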
The alternate viewpoint is that the variance gamma process is a Levy process that is the difference of two independent gamma processes, that is, $X_t^{VG} = G_t^1 - G_t^2$. The characteristic function of the CGM variance gamma process can be written as

$$\phi_{VG}(\omega; C, G, M)^t = \left(\frac{GM}{GM + (M - G)i\omega + \omega^2}\right)^{Ct} = \exp\left\{Ct\ln\left(\frac{GM}{GM + (M - G)i\omega + \omega^2}\right)\right\}$$

under the real-world measure. The characteristic exponent drift component is zero, that is, $\psi_d(\omega) = 0$, and the stochastic component is given by

$$\psi_s(\omega) = C\ln\left(\frac{GM}{GM + (M - G)i\omega + \omega^2}\right).$$

Equating the expression for the martingale measure $r = \psi(-i)$ gives the risk-neutral drift as

$$\mu_{rn-d}(\omega) = r - \psi_s(-i) = r - C\ln\left(\frac{GM}{GM + (M - G) - 1}\right).$$

The risk-neutral characteristic function of the variance gamma process is thus given by

$$\phi_{rn-VG}(\omega; C, G, M)^t = \exp\left\{i\omega t\left[r - C\ln\left(\frac{GM}{GM + (M - G) - 1}\right)\right] + Ct\ln\left(\frac{GM}{GM + (M - G)i\omega + \omega^2}\right)\right\}.$$

15.10. NON-LEVY PROCESSES

One limitation of all Levy processes is the inherent time-homogeneous property, which can hamper the simultaneous calibration of a set of derivatives with a range of maturities. Specifically, valuations of derivatives on certain assets imply a time dependency in their volatility, which is better described by the time-dependent stochastic volatility model. The Heston (1993) model is similar to the Black–Scholes model except that the variance is stochastic and follows a CIR mean-reversion process. The Heston model dynamics are summarized by three equations

$$dS_t = \mu S_t\,dt + \sqrt{v_t}\,S_t\,dW_1,$$
$$dv_t = \kappa(\eta - v_t)\,dt + \sigma\sqrt{v_t}\,dW_2,$$
$$E[dW_1\,dW_2] = \rho\,dt$$

where $\kappa$ is the variance reversion rate, $\eta$ is the mean-reversion or long-term variance level, $\sigma$ is the volatility of variance, and the two Wiener processes are correlated by a factor $\rho$. For $x = \ln S_t$, $\tau = T - t$, and initial variance $v_0$ (Rouah, 2011), the characteristic function for the Heston model is given by

$$\phi_{Heston}(\omega; x) = \exp\left\{i\omega x + ri\omega\tau + \frac{\kappa\eta}{\sigma^2}\left[(\kappa - \sigma\rho i\omega + d)\tau - 2\ln\left(\frac{1 - ge^{d\tau}}{1 - g}\right)\right] + \frac{v_0}{\sigma^2}(\kappa - \sigma\rho i\omega + d)\left(\frac{1 - e^{d\tau}}{1 - ge^{d\tau}}\right)\right\}$$

where

$$d = \sqrt{(\sigma\rho i\omega - \kappa)^2 - \sigma^2(-i\omega - \omega^2)}, \qquad g = \frac{\kappa - \sigma\rho i\omega + d}{\kappa - \sigma\rho i\omega - d}.$$
SUMMARY

A Levy process is a cadlag stochastic process with stationary, independent increments, so its distribution is infinitely divisible into independent and identically distributed random variables. Levy processes can be subdivided into finite and infinite activity Levy processes.

Several useful finite activity Levy processes, including the models of Kou and Merton, are generated by affixing a jump component to Brownian-type motion. The infinite activity Levy processes, including the variance gamma and the NIG models, can display an infinite number of jumps within any finite time interval. These two classes of Levy models are readily extended to several areas of finance including option pricing.

REFERENCES

Applebaum, D. (2004) Levy Processes - From Probability to Finance and Quantum Groups, Notices of the American Mathematical Society 51(11), 1336.
Borak, S., Härdle, W., Weron, R. (2004) Stable Distributions, in Statistical Tools for Finance and Insurance, Cizek, P., Härdle, W., Weron R. (eds.), Springer.
Carr, P., Geman, H., Madan, D., Yor, M. (2002) The Fine Structure of Asset Returns: An Empirical Investigation, Journal of Business 75, 305.
Carr, P., Wu, L. (2003) The Finite Moment Log Stable Process and Option Pricing, Journal of Finance 58(2), 753.
Cont, R., Tankov, P. (2004) Financial Modeling with Jump Processes, Chapman & Hall/CRC Financial Mathematics Series.
Deville, D. (2008) On Levy Processes for Option Pricing: Numerical Methods and Calibration to Index Options, Università Politecnica delle Marche.
Devroye, L. (1986) Non-Uniform Random Variate Generation, Springer.
Gatheral, J. (2010) Jump-Diffusion Models, in Encyclopedia of Quantitative Finance, Wiley.
Grimmet, G., Stirzaker, D. (2001) Probability and Random Processes, Oxford Press.
Heston, S. (1993) A Closed-Form Solution for Options with Stochastic Volatility, with Applications to Bond and Currency Options, Review of Financial Studies 6(2), 327.
Janicki, A., Weron, A. (1993) Simulation and Chaotic Behavior of Alpha-Stable Stochastic Processes, CRC Press.
Kim, Y., Rachev, S.T., Chung, D.M., Bianchi, M. (2008) A Modified Tempered Stable Distribution with Volatility Clustering, in New Developments in Financial Modeling, Soares, J.O., Pina, J.P., Lopes, M.C. (eds.), Cambridge Scholars Publishing, 344.
Kitchen, C. (2009) Applications of the Normal Inverse Gaussian (NIG) Processes in Mathematical Finance, University of Calgary.
Kou, S. G. (2002) A Jump Diffusion Model for Option Pricing, Management Science 48, 1086.
Krichene, N. (2005) Subordinated Levy Processes and Applications to Crude Oil Options, IMF Working Paper WP/05/174.
Madan, D.B., Carr, P., Chang, E.C. (1998) The Variance Gamma Process and Option Pricing, European Finance Review 2, 79.
Matsuda, K. (2004) Introduction to the Merton Jump Diffusion Model, White Paper, City University of New York.
Merton, R.C. (1976) Option Pricing When Underlying Stock Returns are Discontinuous, Journal of Financial Economics 3, 125.
Nolan, J.P. (1997) Numerical Calculation of Stable Densities and Distribution Functions, Communications in Statistics - Stochastic Models 13, 759.
Papapantoleon, A. (2008) An Introduction to Levy Processes with Applications in Finance, Lecture Notes, Institute of Mathematics, TU Berlin.
Robbertse, J. (2006) On the Modeling of Asset Returns and the Calibration of European Option Pricing Models, Department of Mathematical Statistics, University of Johannesburg.
Rouah, F.D. (2011) Derivation of the Heston Model, www.frouah.com.
Samorodnitsky, G., Taqqu, M.S. (1994) Stable Non-Gaussian Processes: Stochastic Models with Infinite Variance, Chapman and Hall.
Schoutens, W. (2003) Levy Processes in Finance, Wiley.
Sonono, M.E. (2011) Calibration of Financial Models, Dissertation, University of Stellenbosch.
Todorov, V., Tauchen, G. (2006) Simulation Methods for Levy-Driven CARMA Stochastic Volatility Models, Journal of Business & Economic Statistics 24, 455.
Weron, R. (1996) On the Chambers-Mallows-Stuck Method for Simulating Skewed Stable Random Variables, Statistics and Probability Letters 28, 165; R. Weron (1996) Correction to: On the Chambers-Mallows-Stuck Method for Simulating Skewed Stable Random Variables, Research Report HSC/96/1.
Zolotarev, V.M. (1986) One-Dimensional Stable Distributions, American Mathematical Society.
APPENDIX

Code: VarianceGammaGen
function [ dX, dG ] = VarianceGammaGen( sigma, v, theta, dt )
% VarianceGammaGen calculates dX of a Variance Gamma
% subordinated Brownian motion
if (nargin < 4), dt =1; end
if (nargin < 3), theta =1; end
if (nargin < 2), v =1; end
if (nargin < 1), sigma =1; end
a=1/v; b=1/v;
dG=GammaRand (a.*dt,b); % subordinate Gamma time process
dX=theta.*dG+sigma*sqrt(dG).*randn; %subordinated drift diffusion
end
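As a hedged restatement of what VarianceGammaGen computes for one time step, the increment can be written as follows. This assumes that GammaRand(a,b) draws a gamma variate with shape a and rate b; the body of GammaRand is not reproduced in this listing, so that parameterization is an assumption, chosen because it makes the gamma clock run at unit mean rate.

```latex
\begin{aligned}
\Delta G &\sim \mathrm{Gamma}\!\left(\text{shape}=\frac{\Delta t}{\nu},\ \text{rate}=\frac{1}{\nu}\right),
&\quad E[\Delta G] &= \Delta t, \quad \mathrm{Var}[\Delta G] = \nu\,\Delta t,\\
\Delta X &= \theta\,\Delta G + \sigma\sqrt{\Delta G}\;Z,
&\quad Z &\sim N(0,1).
\end{aligned}
```

Under that assumption the code variable v plays the role of the variance rate of the gamma time change, theta is the drift of the Brownian motion in gamma time, and sigma is its volatility.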
Code: VarianceGammaScript
% VarianceGammaScript simulates a log asset price
% and thus the asset price path by a repeated call
% to the function VarianceGammaGen
dt=0.01;
T=0:dt:1;
sigma=0.1
v=0.1
theta=0.1
dtvector=ones(1,length(T))*dt;
for j = 1:length(T)
    [ dX(j), dG(j) ] = VarianceGammaGen( sigma, v, theta, dt );
end
G=cumsum(dG);
X=cumsum(dX);
figure
subplot (4,1,1)
plot (T, G)
title ('Gamma Process')
ylabel ('G')
xlabel ('Time')
subplot (4,1,2)
hist(dX,round(length(T)/4)) % round to an integer bin count
ylabel ('Frequency (dX)')
xlabel ('dX')
subplot (4,1,3)
plot (T, X)
title ('Variance Gamma log (Price) ')
ylabel ('X')
xlabel ('Time')
axis tight
eX=exp(X);
subplot (4,1,4)
plot (T,eX)
ylabel ('e^X')
title ('Variance Gamma Asset Price ')
xlabel ('Time')
axis tight
figure
stem3(T,G,X)
hold on
plot3(T,G,X,'gr')
axis tight
xlabel ('Real Time')
ylabel ('Gamma Time')
zlabel ('X')
title ('Variance Gamma Process')
% Generate same VG process in CGM parameters
% convert v,theta, sigma to C,G,M
Cp=1/v
Gp=1/(sqrt(0.25*theta^2*v^2+0.5*sigma^2*v) - 0.5*theta*v)
Mp=1/(sqrt(0.25*theta^2*v^2+0.5*sigma^2*v) + 0.5*theta*v)
%Up and down gamma process: Gu, Gd
au=Cp; ad=Cp; bu=Mp; bd=Gp;
%Difference of up and down gamma process: X=Gu-Gd
for j = 1:length(T)
    dGu(j)=GammaRand (au.*dt,bu); % up Gamma process
    dGd(j)=GammaRand (ad.*dt,bd); % down Gamma process
end
Gu=cumsum(dGu);
Gd=cumsum(dGd);
VGdiff=Gu-Gd;
figure
plot (T, VGdiff, T, X, '--')
legend('CGM','\sigma, v, \theta', 'location','NorthWest')
ylabel ('Log Price')
xlabel ('Time')
title('Comparison of VG Notation')
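The parameter conversion used in the second half of VarianceGammaScript can be written out as a short worked formula. The sketch below restates the three assignments to Cp, Gp and Mp in the usual (C, G, M) notation; the interpretation of the second argument of GammaRand as a rate is an assumption, since the body of GammaRand is not shown here.

```latex
\begin{aligned}
C &= \frac{1}{\nu},\\[4pt]
G &= \left(\sqrt{\tfrac{1}{4}\theta^{2}\nu^{2} + \tfrac{1}{2}\sigma^{2}\nu}\;-\;\tfrac{1}{2}\theta\nu\right)^{-1},\\[4pt]
M &= \left(\sqrt{\tfrac{1}{4}\theta^{2}\nu^{2} + \tfrac{1}{2}\sigma^{2}\nu}\;+\;\tfrac{1}{2}\theta\nu\right)^{-1},\\[4pt]
X_t &= G^{u}_t - G^{d}_t,\qquad
\Delta G^{u} \sim \mathrm{Gamma}(C\,\Delta t,\ M),\quad
\Delta G^{d} \sim \mathrm{Gamma}(C\,\Delta t,\ G).
\end{aligned}
```

The up process carries the rate M and the down process the rate G, which is why the script sets bu=Mp and bd=Gp. The two simulated paths in the comparison plot use independent random draws, so they agree in distribution rather than path by path.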
Code: GammaRand
function [ Gab ] = GammaRand ( a,b )
% GammaRand generates Gamma Random variables
% Pseudo-Code and description in Deville
% Levy Processes in Finance
% Requires a
PiVal = %g', Number, PIval )
end
function IntVal = IntFunction (w,S0,K,r,T,Model,Number,InPar)
% IntFunction is the function 'Integrated' by the PI function
% First, set phi to name of characteristic function
if (strcmp(Model,'Gaussian'))
    phi='phiBS';
    param.sigmaBS=InPar.volBS;
    param.mu=r-0.5*param.sigmaBS^2; %risk-neutral
    param.T=T;
elseif (strcmp(Model,'Merton'))
    phi='phiMerton';
    param.sigmaBS=InPar.volBS; sigmaBS=param.sigmaBS;
    param.muJ=InPar.muJ; muJ=param.muJ;
    param.sigmaJ=InPar.sigmaJ; sigmaJ=param.sigmaJ;
    param.lambda=InPar.lambda; lambda=param.lambda;
    % risk-neutral drift
    param.mu=r-0.5*sigmaBS^2 ...
        -lambda*(exp(muJ+0.5*sigmaJ^2)-1);
    param.T=T;
else
    disp('Improper Model')
end
k=log(K/S0); % dimensionless moneyness
% Similar to Black-Scholes, PI1 calculated with stock as
% numeraire and PI2 calculated with riskless bond as numeraire
if (Number == 1)
    IntVal=real(exp(-i*w*k).*feval(phi,w-i,param)...
        ./(i.*w.*feval(phi,-i,param)));
else % (Number == 2)
    IntVal=real(exp(-i*w*k).*feval(phi,w,param)...
        ./(i*w));
end
end
function c = MertonSeriesCall(K,S,T,volBS,...
    r, d, muJ, sigmaJ, lambda)
% MertonSeriesCall based on Series Approximation where
% n is the number of jumps in one time period; each term is
% weighted by the Poisson probability of n jumps
% lambda is the intensity parameter = mean number of jumps
% in one time period
%k=exp(muJ+0.5*sigmaJ^2)-1 = mean relative asset jump size
N=5; % Max Number of Possible jumps in one time period
lambdaBar=lambda*exp(muJ+0.5*sigmaJ^2);
c=0;
% Can write following more compactly using
% exp(muJ+0.5*sigmaJ^2)= k+1
for n=0:N
    % jump variance sigmaJ^2 (not sigmaJ) adds to the diffusion variance
    volM = sqrt(volBS^2+n*sigmaJ^2/T);
    rn = r - lambda*(exp(muJ+0.5*sigmaJ^2)-1)...
        + n*(muJ+0.5*sigmaJ^2)/T;
    c = c + ((lambdaBar*T)^n / factorial(n))...
        * BlackScholesCall (K,S,T,volM,rn,0);
end
c = c*exp(-lambdaBar*T);
end
Code: phiMerton
function [ phiVal ] = phiMerton (w,param)
% phiMerton is characteristic function for Merton's
% jump-diffusion model
muJ=param.muJ; sigmaJ=param.sigmaJ; lambda=param.lambda;
phiVal=exp(param.T.*(i.*param.mu.*w...
    -0.5*w.^2 .* param.sigmaBS.^2 ...
    +lambda.*(exp(i*w*muJ-0.5*sigmaJ^2.*w.^2)-1)));
end
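For reference, the expression evaluated by phiMerton can be written in closed form, together with the drift that the calling code stores in param.mu. The sketch below uses the symbols of the listing (sigma for param.sigmaBS, mu_J, sigma_J, lambda); the last line is a quick consistency check rather than part of the original listing.

```latex
\begin{aligned}
\phi_T(w) &= \exp\!\Big\{ T\Big[\, i\mu w - \tfrac{1}{2}\sigma^{2}w^{2}
   + \lambda\big(e^{\,i w \mu_J - \frac{1}{2}\sigma_J^{2}w^{2}} - 1\big)\Big]\Big\},\\[4pt]
\mu &= r - \tfrac{1}{2}\sigma^{2} - \lambda\big(e^{\,\mu_J + \frac{1}{2}\sigma_J^{2}} - 1\big),\\[4pt]
\phi_T(-i) &= \exp\!\Big\{ T\Big[\mu + \tfrac{1}{2}\sigma^{2}
   + \lambda\big(e^{\,\mu_J + \frac{1}{2}\sigma_J^{2}} - 1\big)\Big]\Big\} = e^{rT}.
\end{aligned}
```

The final identity is the martingale condition: with the risk-neutral drift, the expected gross return over T equals exp(rT). Setting lambda to zero reduces the expression to the Gaussian characteristic function used in phiBS.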
Code: phiBS
function [ phiVal ] = phiBS (w,param)
% phiBS is characteristic function for Gaussian
% log-return
phiVal=exp(param.T.*(i.*param.mu.*w...
    -0.5*w.^2 .* param.sigmaBS.^2));
end
Code: FFToptionScript
% FFToptionScript compares FFT vs. analytical option value
% for Black-Scholes model and Merton model
% Lastly, the fractional FFT (FrFFT) is used to decrease the
% k spacing to bunch the strike prices to more relevant
% values near the ATM option
T=1; r=0.05; S0=100;
Ttext=['T = ' num2str(T)]; rtext=['r = ' num2str(r)]; S0text=['S_0 = ' num2str(S0)];
%%% Black-Scholes Parameters %%%
volBS = 0.2;
Voltext=['\sigma_{BS} = ' num2str(volBS)];
d=0;
CharFunc='phiBS'
Param.T=T;
Param.sigmaBS=volBS;
Param.S0=S0;
Param.r=r;
Param.mu=r-0.5*Param.sigmaBS^2; %risk-neutral BS
[cBSfft, k ] = FFToption ( CharFunc, Param);
K=S0*exp(k); % K is normalized to S0 in FFT calc
% use K returned from FFT option as input into BS
cBSanalytical = BlackScholesCall(K,S0,T,volBS,r,d);
%cBSfftOTM = FFToptionOTM ( CharFunc, Param);
figure
plot (K, cBSanalytical, K, S0* cBSfft, '+',...
    K, S0-K,'--')
legend ('Analytical BS', 'FFT BS')
xlim ([0.5, 1.5*S0]); ylim([0.1 100]);
bsStr(1)= {'Black-Scholes'};
bsStr(2)= {rtext};
bsStr(3)= {Voltext};
text(0.8*S0, 0.6*S0, bsStr)
genStr(1)= {S0text};
genStr(2)= {Ttext};
text(0.2*S0, 0.2*S0, genStr);
%%% Merton Series %%%%%%%%%%%%%%%%%%%%%%%%%%%
% Assume Merton same underlying Black-Scholes
% volatility augmented with a log-normal jump
% process at a rate lambda
muJ = 0.1;
sigmaJ = 0.1;
lambda = 0.5;
mmuJtext=['\mu_{Jump} = ' num2str(muJ)];
msigmaJtext=['\sigma_{Jump} = ' num2str(sigmaJ)];
lambdatext=['\lambda = ' num2str(lambda)];
CharFunc='phiMerton'
Param.muJ=muJ;
Param.sigmaJ=sigmaJ;
Param.lambda=lambda;
Param.r=r;
% Merton risk-neutral drift
Param.mu=r-0.5*volBS^2 ...
    -lambda*(exp(muJ+0.5*sigmaJ^2)-1);
cMERTONanalytical = MertonSeriesCall (K,S0,...
    T,volBS,r, d, muJ, sigmaJ, lambda);
[cMERTONfft, k ] = FFToption ( CharFunc, Param );
% Carr and Madan suggest alternative algorithm for
% deep out of the money (OTM) options based on
% Time Value of OTM options
% cBSfftOTM = FFToptionOTM ( CharFunc, Param );
K=S0*exp(k); %K is normalized to S0 in FFT calc
figure
plot(K, cMERTONanalytical, K, S0*cMERTONfft,'+',...
    K,S0-K ,'--')
legend ('Merton Series Approx.', 'FFT Merton')
xlim ([0.5, 1.5*S0]); ylim([0.1 100]);
JumpStr(1)= {'Merton '};
JumpStr(2)= {mmuJtext};
JumpStr(3)= {msigmaJtext};
JumpStr(4)= {lambdatext};
text(0.8*S0, 0.6*S0, JumpStr)
genStr(1)= {S0text};
genStr(2)= {Ttext};
text(0.2*S0, 0.2*S0, genStr);
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Option Price via Fractional Fourier Transform
% Normally FFT spacing is defined dw*dk =2pi/N
% or dw*dk/2pi = 1/N
% Thus fixed N and dw (or maximum w = N*dw )
% also sets dk by dk=2pi/(N*dw)
% and maximum k by k(max) = N*dk
% -> Most k way too large or small
% FrFFT allows flexibility via the 'A Fraction'
% by dw*dk/2pi = A thus dk = A*2pi/dw
% and k spacing can be decreased
% In other words, max (and min k =- max k)
% are brought closer to the ATM and more option
% values are calculated at relevant strike prices
% Reset to the Black-Scholes characteristic function and drift,
% since CharFunc and Param.mu were overwritten in the Merton
% section above and this section compares against BlackScholesCall
CharFunc='phiBS';
Param.mu=r-0.5*volBS^2;
alpha=1.5;
wEnd=1000;
figure
krange=0; % zero for just ATM;
% Use larger number to look at range around ATM
% but this can be misleading as K spacing changes
nn=1:4
for n=nn;
    N=2^(n+6)
    ff=1:4;
    for f=ff
        A=1/(f*N); %set the 'A Fraction'
        [cBSFRfft k] =...
            FFToption(CharFunc,Param,alpha,wEnd,N,A);
        K=S0*exp(k); %K is normalized to S0 in FFT
        cBSanalytical = BlackScholesCall(K,S0,T,volBS,r,d);
        % Find ATM call when k~=0 ->K(ATM)=S0*exp(0)
        [minKval,Ind] = min(k.^2) ;
        Error=S0*cBSFRfft(Ind-krange:Ind+krange)-...
            cBSanalytical(Ind-krange:Ind+krange);
        ErrorSum(n,f)=sum(Error.^2);
        subplot (max(ff),max(nn),n+(f-1)*(max(nn)));
        plot (K, cBSanalytical, K, S0* cBSFRfft, '+',...
            K, S0-K,'--')
        ftext=['A=1/(' num2str(f) 'N) '];
        ntext=['N=' num2str(N)];
        nfText(1)={ftext};
        nfText(2)={ntext};
        text(1.1*S0, 0.8*S0, nfText);
        xlim ([0, 4*S0])
        ylim ([0 100])
    end
end
% Examine Error for change in N Spacing or A Fraction
% Plot should show that FrFFT does not change the
% accuracy of the option calculation at a certain K
% Accuracy is determined primarily by N and dw as well
% as alpha and w-max
figure
semilogy (nn+6, ErrorSum(:,1),'+',nn+6, ErrorSum(:,2),'--',...
    nn+6, ErrorSum(:,3), '-.' ,nn+6, ErrorSum(:,4), ':')
legend('A=1/N', 'A=1/2N', 'A=1/3N' , 'A=1/4N')
xlabel ('2^n')
ylabel ('Square Error for ATM Call')
title ('Compare BS-FFT to BS-Analytical')
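The grid relations described in the comments of the fractional FFT section of FFToptionScript can be summarized as a short worked formula. This is a sketch in the notation of those comments (dw, dk, N and the 'A Fraction'); the symbol f is the loop variable of the script, and the numerical illustration is hypothetical.

```latex
\begin{aligned}
\text{standard FFT:}\quad & \Delta w\,\Delta k = \frac{2\pi}{N}
  \;\;\Longrightarrow\;\; \Delta k = \frac{2\pi}{N\,\Delta w},\\[4pt]
\text{fractional FFT:}\quad & \Delta w\,\Delta k = 2\pi A
  \;\;\Longrightarrow\;\; \Delta k = \frac{2\pi A}{\Delta w},\\[4pt]
A = \frac{1}{fN}:\quad & \Delta k = \frac{1}{f}\cdot\frac{2\pi}{N\,\Delta w}.
\end{aligned}
```

With A = 1/N the fractional transform reproduces the standard spacing, and f = 2, 3, 4 shrinks the log-strike spacing by that factor at fixed N and dw. For example, if a given N and dw produced a standard spacing of 0.1 in k, then f = 4 would give 0.025, so four times as many strikes fall in any fixed window around the ATM point.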
Code: MertonSeriesCall
function c = MertonSeriesCall(K,S,T,volBS,...
    r, d, muJ, sigmaJ, lambda)
% MertonSeriesCall based on Series Approximation where
% n is the number of jumps in one time period; each term is
% weighted by the Poisson probability of n jumps
% lambda is the intensity parameter = mean number of jumps
% in one time period
%k=exp(muJ+0.5*sigmaJ^2)-1 = mean relative asset jump size
N=10; % Max Number of Possible jumps in one time period
lambdaBar=lambda*exp(muJ+0.5*sigmaJ^2);
c=0;
% Can write following more compactly using
% exp(muJ+0.5*sigmaJ^2)= k+1
for n=0:N
    % jump variance sigmaJ^2 (not sigmaJ) adds to the diffusion variance
    volM = sqrt(volBS^2+n*sigmaJ^2/T);
    rn = r - lambda*(exp(muJ+0.5*sigmaJ^2)-1)...
        + n*(muJ+0.5*sigmaJ^2)/T;
    c = c + ((lambdaBar*T)^n / factorial(n))...
        * BlackScholesCall (K,S,T,volM,rn,0);
end
c = c*exp(-lambdaBar*T);
end
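The series that MertonSeriesCall truncates at N terms can be written out as follows. This is a sketch of the standard Merton (1976) jump-diffusion series in the notation of the listing, with C_BS denoting the Black-Scholes call value computed by BlackScholesCall with a zero dividend yield.

```latex
\begin{aligned}
C &= \sum_{n=0}^{\infty} \frac{e^{-\bar\lambda T}\,(\bar\lambda T)^{n}}{n!}\;
     C_{BS}\big(S, K, T, \sigma_n, r_n\big),\\[4pt]
k &= e^{\,\mu_J + \frac{1}{2}\sigma_J^{2}} - 1, \qquad \bar\lambda = \lambda\,(1+k),\\[4pt]
\sigma_n^{2} &= \sigma^{2} + \frac{n\,\sigma_J^{2}}{T}, \qquad
r_n = r - \lambda k + \frac{n\,\ln(1+k)}{T}.
\end{aligned}
```

Each term is a Black-Scholes price conditional on exactly n jumps, weighted by a Poisson probability with intensity lambdaBar. The jump contribution to the conditional variance is n times the jump variance, so it is the square of sigmaJ that enters the volatility term. Because the weights decay factorially, a small truncation level is usually adequate when lambda*T is of order one.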
Code: FFToption
function [C,k]=FFToption(CharFunc, Param, alpha, wEnd , N)
% FFToption Performs Carr and Madan Call FFT Option
% Calculation. FFToption returns array of Call Prices
% and their corresponding strikes for a fixed S0.
% Calculation
if (nargin < 3), alpha=1.25; end
if (nargin